THE RATIONAL SPIRIT IN MODERN CONTINUUM MECHANICS
Essays and Papers Dedicated to the Memory of Clifford Ambrose Truesdell III
Edited by
CHI-SING MAN University of Kentucky, Lexington, U.S.A.
and
ROGER L. FOSDICK University of Minnesota, Minneapolis, U.S.A.
Reprinted from Journal of Elasticity: The Physical and Mathematical Science of Solids, Vols. 70, 71, 72 (2003)
KLUWER ACADEMIC PUBLISHERS NEW YORK, BOSTON, DORDRECHT, LONDON, MOSCOW
eBook ISBN: 1-4020-2308-1
Print ISBN: 1-4020-1828-2
©2005 Springer Science + Business Media, Inc.
Print ©2004 Kluwer Academic Publishers Dordrecht All rights reserved
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic, mechanical, recording, or otherwise, without written consent from the Publisher
Created in the United States of America
Visit Springer's eBookstore at: http://ebooks.kluweronline.com
and the Springer Global Website Online at: http://www.springeronline.com
Portrait by Joseph Sheppard
Table of Contents
Portrait by Joseph Sheppard
v
Foreword by Chi-Sing Man and Roger Fosdick
xi
Published Works of Clifford Ambrose Truesdell III
xiii
Serials Edited by Clifford Ambrose Truesdell III
xli
Eulogium by Roger Fosdick
xliii
Photograph: Bloomington, Indiana, 1959
xlv
BERNARD D. COLEMAN / Memories of Clifford Truesdell
1–13
ENRICO GIUSTI / Clifford Truesdell (1919–2000), Historian of Mathematics
15–22
WALTER NOLL / The Genesis of Truesdell’s Nonlinear Field Theories of Mechanics
23–30
JAMES SERRIN / An Appreciation of Clifford Truesdell
31–38
D. SPEISER / Clifford A. Truesdell’s Contributions to the Euler and the Bernoulli Edition
39–53
Photograph: Baltimore, Maryland, 1978
STUART S. ANTMAN / Invariant Dissipative Mechanisms for the Spatial Motion of Rods Suggested by Artificial Viscosity
55–64
MILLARD F. BEATTY / An Average-Stretch Full-Network Model for Rubber Elasticity
65–86
MICHELE BUONSANTI and GIANNI ROYER-CARFAGNI / From 3-D Nonlinear Elasticity Theory to 1-D Bars with Nonconvex Energy
87–100
GIOVANNI BURATTI, YONGZHONG HUO and INGO MÜLLER / Eshelby Tensor as a Tensor of Free Enthalpy
101–112
SANDRO CAPARRINI and FRANCO PASTRONE / E. Frola (1906–1962): An Attempt Towards an Axiomatic Theory of Elasticity
113–125
GIANFRANCO CAPRIZ and PAOLO MARIA MARIANO / Symmetries and Hamiltonian Formalism for Complex Materials
127–140
DONALD E. CARLSON, ELIOT FRIED and DANIEL A. TORTORELLI / Geometrically-based Consequences of Internal Constraints
141–149
YI-CHAO CHEN / Second Variation Condition and Quadratic Integral Inequalities with Higher Order Derivatives
151–167
ELENA CHERKAEV and ANDREJ CHERKAEV / Principal Compliance and Robust Optimal Design
169–196
JOHN C. CRISCIONE / Rivlin’s Representation Formula is Ill-Conceived for the Determination of Response Functions via Biaxial Testing
197–215
CESARE DAVINI and ROBERTO PARONI / Generalized Hessian and External Approximations in Variational Problems of Second Order
217–242
F. DELL’ISOLA, G. SCIARRA and R.C. BATRA / Static Deformations of a Linear Elastic Porous Body Filled with an Inviscid Fluid
243–264
GIANPIETRO DEL PIERO / A Class of Fit Regions and a Universe of Shapes for Continuum Mechanics
265–285
LUCA DESERI and DAVID R. OWEN / Toward a Field Theory for Elastic Bodies Undergoing Disarrangements
287–326
MARCELO EPSTEIN and IOAN BUCATARU / Continuous Distributions of Dislocations in Bodies with Microstructure
327–344
MARCELO EPSTEIN and MAREK ELŻANOWSKI / A Model of the Evolution of a Two-dimensional Defective Structure
345–355
J.L. ERICKSEN / On the Theory of Rotation Twins in Crystal Multilattices
357–373
MAURO FABRIZIO and MURROUGH GOLDEN / Minimum Free Energies for Materials with Finite Memory
375–397
ROGER FOSDICK and LEV TRUSKINOVSKY / About Clapeyron’s Theorem in Linear Elasticity
399–426
M. FOSS, W. HRUSA and V.J. MIZEL / The Lavrentiev Phenomenon in Nonlinear Elasticity
427–435
GIOVANNI P. GALDI / Steady Flow of a Navier–Stokes Fluid around a Rotating Obstacle
437–467
TIMOTHY J. HEALEY and ERROL L. MONTES-PIZARRO / Global Bifurcation in Nonlinear Elasticity with an Application to Barrelling States of Cylindrical Columns
469–494
MOJIA HUANG and CHI-SING MAN / Constitutive Relation of Elastic Polycrystal with Quadratic Texture Dependence
495–524
MASARU IKEHATA and GEN NAKAMURA / Reconstruction Formula for Identifying Cracks
525–538
R.J. KNOPS and PIERO VILLAGGIO / An Approximate Treatment of Blunt Body Impact
539–554
I-SHIH LIU / On the Transformation Property of the Deformation Gradient under a Change of Frame
555–562
KONSTANTIN A. LURIE / Some New Advances in the Theory of Dynamic Materials
563–573
GERARD A. MAUGIN / Pseudo-plasticity and Pseudo-inhomogeneity Effects in Materials Mechanics
575–597
A. IAN MURDOCH / On the Microscopic Interpretation of Stress and Couple Stress
599–625
PABLO V. NEGRÓN-MARRERO / The Hanging Rope of Minimum Elongation for a Nonlinear Stress–Strain Relation
627–649
MARIO PITTERI / On Certain Weak Phase Transformations in Multilattices
651–671
PAOLO PODIO-GUIDUGLI / A New Quasilinear Model for Plate Buckling
673–698
G. RODNAY and R. SEGEV / Cauchy’s Flux Theorem in Light of Geometric Integration Theory
699–719
U. SARAVANAN and K.R. RAJAGOPAL / A Comparison of the Response of Isotropic Inhomogeneous Elastic Cylindrical and Spherical Shells and Their Homogenized Counterparts
721–749
M. ŠILHAVÝ / On SO(n)-Invariant Rank 1 Convex Functions
751–762
K. WILMAŃSKI / On Thermodynamics of Nonlinear Poroelastic Materials
763–777
WAN-LEE YIN / Anisotropic Elasticity and Multi-Material Singularities
779–808
Foreword

Through his voluminous and influential writings, editorial activities, organizational leadership, intellectual acumen, and strong sense of history, Clifford Ambrose Truesdell III (1919–2000) was the main architect for the renaissance of rational continuum mechanics since the middle of the twentieth century. The present collection of 42 essays and research papers pays tribute to this man of mathematics, science, and natural philosophy as well as to his legacy.

The first five essays by B.D. Coleman, E. Giusti, W. Noll, J. Serrin, and D. Speiser were texts of addresses given by their authors at the Meeting in memory of Clifford Truesdell, which was held in Pisa in November 2000. In these essays the reader will find personal reminiscences of Clifford Truesdell the man and of some of his activities as scientist, author, editor, historian of exact sciences, and principal founding member of the Society for Natural Philosophy.

The bulk of the collection comprises 37 research papers which bear witness to the Truesdellian legacy. These papers cover a wide range of topics; what ties them together is the rational spirit. Clifford Truesdell, in his address upon receipt of a Birkhoff Prize in 1978, put the essence of modern continuum mechanics succinctly as “conceptual analysis, analysis not in the sense of the technical term but in the root meaning: logical criticism, dissection, and creative scrutiny.” It is in celebration of this spirit and this essence that these research papers are dedicated to the memory of their bearer, driving force, and main promoter for half a century. Most of these papers were presented at the Symposium on Recent Advances and New Directions in Mechanics, Continuum Thermodynamics, and Kinetic Theory – In Memory of Clifford A. Truesdell III, held in Blacksburg, Virginia, in June 2002; parts of two papers were delivered at the meeting Remembering Clifford Truesdell, held in Turin in November 2002; and the rest was written especially for the present collection.

The portrait, a photo of which serves as the frontispiece of this collection, adorns the Clifford A. Truesdell III Room of History of Science in the library of the Scuola Normale Superiore (Pisa, Italy), which was inaugurated in October 2003 and permanently houses Clifford Truesdell’s previously private collection of books, papers, and correspondence. We are grateful to Mrs. Charlotte Truesdell for helping us secure a digital file of this photo and for providing us with the list of published works of Clifford Truesdell.

CHI-SING MAN
University of Kentucky
Lexington

ROGER FOSDICK
University of Minnesota
Minneapolis
Published Works of Clifford Ambrose Truesdell III
The year of publication is omitted from the entry unless it differs from the year under which the entry is listed. Letters following a number indicate subsidiary separate publications, as follows:

P: Preliminary report or preprint,
A: Abstract, separately published or only published version,
C: Condensed or extracted version,
L: Lecture concerning part or all of the contents of main entry,
R: Reprint, entire,
RE: Reprint of an extract,
T: Translation, entire,
TC: Translation, condensed,
TE: Translation of an extract.
The list excludes some 600 reviews published between 1949 and 1971 in Mathematical Reviews, Applied Mechanics Reviews, Zentralblatt für Mathematik, Industrial Laboratories, and Mathematics of Computation but includes reviews published in other journals.

Note by the editors: This list and the list on p. 29 are slightly edited versions of those that we received from Mrs. C. Truesdell, to whom we are heartily grateful. In our editorial work we have added a few entries, updated several items, and made a small number of other minor corrections. To G.P. Galdi, K. Hutter, R.G. Muncaster, F. Pastrone, and D. Speiser, we are beholden for their help in tracking down article titles and numbers of journal volumes. In what follows, explanatory remarks set off by square brackets were made by Clifford Truesdell himself.

1943
1. (Co-author P. NEMÉNYI) A stress function for the membrane theory of shells of revolution, Proceedings of the National Academy of Sciences (U.S.A.) 29, 159–162.
Other publication in 1943: No. 3A1.
1944
2. ALONZO CHURCH, Introduction to Mathematical Logic, Part I, Notes by C.A. TRUESDELL, Annals of Mathematics Studies No. 13, Princeton, University Press, vi + 118 pp.
1945
3. The membrane theory of shells of revolution, Transactions of the American Mathematical Society 58, 96–166.
3A1. The differential equations of the membrane theory of shells of revolution, Bulletin of the American Mathematical Society 49 (1943), 863–864.
3A2. The membrane theory of shells of revolution, Bulletin of the American Mathematical Society 51, 225.
4. On a function which occurs in the theory of the structure of polymers, Annals of Mathematics 46, 144–157.
5. Generalizations of Euler’s summations of the series $\sum_{n=1}^{\infty} n^{-2m}$, $\sum_{n=0}^{\infty} (-)^{n} (2n+1)^{-2m-1}$, etc., Annals of Mathematics 46, 194–195.
Other publication in 1945: No. 12A1.
1946
6. (Co-author R.C. PRIM) On Linearized Axially Symmetric Flow of a Compressible Fluid, U.S. Naval Ordnance Laboratory Memorandum 8885, 16 December, 4 pp.
7. On Behrbohm and Pinl’s linearization of the equation of two-dimensional steady polytropic flow of a compressible fluid, Proceedings of the National Academy of Sciences (U.S.A.) 32, 289–293 = U.S. Naval Ordnance Laboratory Memorandum 8888, 18 December, 6 pp.
7A. On Behrbohm and Pinl’s linearization of the two dimensional steady flow of a compressible adiabatic fluid, Bulletin of the American Mathematical Society 53 (1947), 59.
Other publications in 1946: Nos. 8A and 12A2.
1947
8. On Sokolovsky’s “Momentless shells”, Transactions of the American Mathematical Society 61, 128–133.
8A. Same title, Bulletin of the American Mathematical Society 52 (1946), 240.
9. (Co-author R.N. SCHWARTZ) The Newtonian mechanics of continua, U.S. Naval Ordnance Laboratory Memorandum 9223, 18 July, 25 pp.
9A. (Co-author R. SCHWARTZ) On the Newtonian Mechanics of Continua, Bulletin of the American Mathematical Society 53, 1125.
10. A note on the Poisson–Charlier functions, Annals of Mathematical Statistics 18, 450–454.
11. Review of L. Brand’s “Vector and Tensor Analysis”, Science 106, 623.
Other publications in 1947: Nos. 7A, 12A3, 13P, 14P, 16P, 16A, 23P, 48P.
1948
12. An Essay toward a Unified Theory of Special Functions, based on the Functional Equation ∂F(z, α)/∂z = F(z, α + 1), Annals of Mathematics Studies No. 18, Princeton, Princeton University Press, iv + 182 pp.
12A1. On the functional equation ∂F(z, α)/∂z = F(z, α + 1), Bulletin of the American Mathematical Society 51 (1945), 883.
12A2. On a class of differential-difference equations, Bulletin of the American Mathematical Society 52 (1946), 823.
12A3. On the Functional Equation (∂/∂z)F(z, a) = F(z, a + 1), U.S. Naval Ordnance Laboratory Memorandum 8975, 17 February 1947, 13 pp. = Proceedings of the National Academy of Sciences (U.S.A.) 33 (1947), 82–93.
12A4. A unified theory of special functions, American Mathematical Monthly 56 (1949), 368.
12L. Une méthode nouvelle concernant les fonctions spéciales, pp. 53–72 of Three Lectures on Mathematics and Mechanics, U.S. Naval Research Laboratory Theoretical Mechanics Section Memorandum No. 3836-1, August 1, 1949.
13. On the total vorticity of motion of a continuous medium, Physical Review (2) 73, 510–512.
13P. The Transport of Vorticity, U.S. Naval Ordnance Laboratory Memorandum 9260, 11 August 1947, 7 pp.
14. On the transfer of energy in continuous media, Physical Review (2) 73, 513–515.
14P. The Energy Theorem for Newtonian Continua, U.S. Naval Ordnance Laboratory Memorandum 9224, 21 July 1947, 8 pp.
15. A New Definition of a Fluid, U.S. Naval Ordnance Laboratory Memorandum 9487, 5 January, 31 pp.
15A1. On the differential equations of slip flow, Proceedings of the National Academy of Sciences (U.S.A.) 34, 342–347.
15A2. On the differential equations for slip flow, Physical Review (2) 73, 1255.
16. On the reliability of the membrane theory of shells of revolution, Bulletin of the American Mathematical Society 54, 994–1008.
16P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9270, 14 August 1947, 15 pp.
16A. Same title, Bulletin of the American Mathematical Society 53 (1947), 1125.
17. Généralisation de la formule de Cauchy et des théorèmes de Helmholtz au mouvement d’un milieu continu quelconque, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 227, 757–759.
17L. Sur la cinématique des mouvements tourbillonaires, pp. 3–20, 73–74 of Three Lectures on Mathematics and Mechanics, U.S. Naval Research Laboratory Theoretical Mechanics Section Memorandum No. 3836-1, August 1, 1949.
18. Une formule pour le vecteur tourbillon d’un fluide visqueux élastique, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 227, 821–823.
18L. Des théorèmes tourbillonaires de la mécanique des fluides, pp. 21–37, 75–76 of Three Lectures on Mathematics and Mechanics, U.S. Naval Research Laboratory Theoretical Mechanics Section Memorandum No. 3836-1, August 1, 1949.
Other publications in 1948: Nos. 19P, 32P, 64P.
1949
19. The effect of viscosity on circulation, Journal of Meteorology 6, 61–62.
19P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9516, 27 January 1948, 6 pp.
19A. Same title, Physical Review (2) 76 (1949), 192–193.
20. Deux formes de la transformation de Green, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 229, 1199–1200.
Other publications in 1949: Nos. 12A4, 12L, 17L, 18L, 22P, 26P, 26L1, 26L2, 26L3A, 29P, 35P, 64A.
1950
21A. On finite strain of an elastic body, Bulletin of the American Mathematical Society 55, 1072.
22. Bernoulli’s theorem for viscous compressible fluids, Physical Review (2) 77, 535–536.
22P. Same title, U.S. Naval Research Laboratory Report 3558, October 12, 1949, iv + 3 pp.
22A. Bernoulli’s theorem for viscous fluids, Bulletin of the American Mathematical Society 56, 253.
23. (Co-author R. PRIM) A derivation of Zorawski’s criterion for permanent vectorlines, Proceedings of the American Mathematical Society 1, 32–34.
23P. (Co-author R.C. PRIM) Zorawski’s Kinematic Theorems, U.S. Naval Ordnance Laboratory Memorandum No. 9354, 20 September 1947, 4 pp.
24. On the effect of a current of ionized air upon the earth’s magnetic field, Journal of Geophysical Research 55, 247–260; 56 (1951), 134.
25. On the balance between deformation and rotation in the motion of a continuous medium, Journal of the Washington Academy of Sciences 40, 313–317.
26. A new definition of a fluid, I: The Stokesian fluid, Journal de Mathématiques Pures et Appliquées (9) 29, 215–244; 30 (1951), 156–158.
26P. Same title, pp. 351–364 of Proceedings of the 7th International Congress of Applied Mechanics (1948), Volume 2, 1949 = [with minor alterations] U.S. Naval Research Laboratory Report P-3457, April 26, 1949, iv + 11 pp.
26L1. Deformation: Elastic, plastic, and fluid masses, Research Reviews (U.S. Office of Naval Research), 15 April 1949, pp. 10–14.
26L2. Une définition nouvelle des fluides, pp. 38–52, 76 of Three Lectures on Mathematics and Mechanics, U.S. Naval Research Laboratory Theoretical Mechanics Section Memorandum No. 3836-1, August 1, 1949.
26L3A. Recent continuum theories of fluid dynamics, Physical Review (2) 75 (1949), 1293.
27. On the addition and multiplication theorems for special functions, Proceedings of the National Academy of Sciences (U.S.A.) 36, 752–755.
28. The effect of the compressibility of the earth on its magnetic field, Physical Review (2) 78, 823.
Other publications in 1950: Nos. 29A, 30A1, 30A2.
1951
29. A form of Green’s transformation, American Journal of Mathematics 73, 43–47.
29P. Same title, U.S. Naval Research Laboratory Report No. 3554, 11 October 1949, iii + 4 pp.
29A. Same title, Bulletin of the American Mathematical Society 56 (1950), 171.
30. Vorticity averages, Canadian Journal of Mathematics 3, 69–86.
30A1. On Poincaré’s analogy between vorticity and mass density, Bulletin of the American Mathematical Society 56 (1950), 347.
30A2. Vorticity averages, Physical Review (2) 79 (1950), 229.
31. Verallgemeinerung und Vereinheitlichung der Wirbelsätze ebener und rotationssymmetrischer Flüssigkeitsbewegungen, Zeitschrift für Angewandte Mathematik und Mechanik 31, 65–71.
31A. A new vorticity theorem, pp. 639–640 of Proceedings of the International Congress of Mathematicians, 1950, Volume 1, 1952.
32. On Ertel’s vorticity theorem, Zeitschrift für Angewandte Mathematik und Physik 2, 109–114.
32P. On Ertel’s Theorem of the Diffusion of Vorticity, U.S. Naval Ordnance Laboratory Memorandum No. 9528, 3 February 1948, 8 pp.
33. Caractérisation des champs vectoriels qui s’annulent sur une frontière fermée, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 232, 1277–1279.
34. Analogue tri-dimensionnel au théorème de M. Synge concernant les champs vectoriels qui s’annulent sur une frontière fermée, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 232, 1396–1397.
35. A new definition of a fluid, II: The Maxwellian fluid, Journal de Mathématiques Pures et Appliquées (9) 30, 111–155.
35P. Same title, U.S. Naval Research Laboratory Report No. 3553, September 20, 1949, viii + 36 pp.
36. Proof that Ertel’s vorticity theorem holds in average for any medium suffering no tangential acceleration on the boundary, Geofisica Pura e Applicata 19, 1–3.
37. On the equation of the bounding surface, Bulletin of the Technical University of Istanbul 3, 71–78.
38. On the velocity of sound in fluids, Journal of the Aeronautical Sciences 18, 501.
39. The analogy between irrotational gas flow and minimal surfaces, Journal of the Aeronautical Sciences 18, 502.
40A. Severe pure shear of an elastic body, Indiana Academy of Science Proceedings 61, 271.
41. Discussion of the paper by W.R. Osgood and J.A. Joseph, “On the general theory of thin shells”, Journal of Applied Mechanics 18, 231–232.
42. Review of J.L. Synge and R.A. Griffith’s “Principles of Mechanics”, 2nd edn, American Mathematical Monthly 57, 351–354.
Other publications in 1951: Nos. 24 (corrections), 26 (corrections), 54A1.
1952
43. The mechanical foundations of elasticity and fluid dynamics, Journal of Rational Mechanics and Analysis 1, 125–300; 2 (1953), 593–616; 3 (1954), 801.
43R. [corrected, with a preface, annotations, and appendices (1962)], pp. i–vxi, 1–186, 204–214 of Continuum Mechanics I, New York, Gordon & Breach, 1966.
44. A program of physical research in classical mechanics, Zeitschrift für Angewandte Mathematik und Physik 3, 79–95.
44R. [corrected and annotated] pp. 187–203, 215–218 of Continuum Mechanics I, New York, Gordon & Breach, 1966.
45. On the viscosity of fluids according to the kinetic theory, Zeitschrift für Physik 131, 273–289.
46. On curved shocks in steady plane flow of an ideal fluid, Journal of the Aeronautical Sciences 19, 826–828.
47. Longueur critique pour la propagation des ondes libres dans un fluide visqueux, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 235, 702–704.
48. Vorticity and the Thermodynamic State in a Gas Flow, Mémorial des Sciences Mathématiques No. 119, Paris, Gauthier-Villars, 56 pp.
48P. (Co-author R.C. PRIM) Vorticity and the Thermodynamic State in the Flow of an Inviscid Fluid, U.S. Naval Ordnance Laboratory Memorandum 9416, 12 November 1947, 14 pp.
49. Review of “Advances in Applied Mechanics”, Volume 2, Bulletin of the American Mathematical Society 58, 403–407.
50. Review of F.D. Murnaghan’s “Finite Deformation of an Elastic Solid”, Bulletin of the American Mathematical Society 58, 577–579.
50C. Same title, Science 115, 634.
51. Discussion of H.M. Trent’s paper, “An alternative formulation of the laws of mechanics”, Journal of Applied Mechanics 19, 569–570.
52. Discussion of R.A. Toupin’s paper, “A variational principle for the mesh-type analysis of a mechanical system”, Journal of Applied Mechanics 19, 574.
53. Review of W. Prager and P.G. Hodge’s “Theory of Perfectly Plastic Solids”, Bulletin of the American Mathematical Society 58, 674–677.
Other publications in 1952: Nos. 31A, 57P, 59C1, 59C1T, 59C2.
1953
54. Two measures of vorticity, Journal of Rational Mechanics and Analysis 2, 173–217.
54A1. A measure of vorticity, Bulletin of the American Mathematical Society 57 (1951), 138.
54A2. La velocità massima nel moto di Gromeka–Beltrami, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8) 13, 378–379.
54A3. A measure of vorticity, pp. 245–246 of Proceedings of the 8th International Congress on Theoretical and Applied Mechanics 1953, 1954.
55. Notes on the history of the general equations of fluid dynamics, American Mathematical Monthly 60, 445–458.
55R. Same title, Journal of the American Society of Naval Engineers 66 (1954), 97–108.
56. Generalization of a geometrical theorem of Euler, Commentarii Mathematici Helvetici 27, 233–234.
57. Precise theory of the absorption and dispersion of forced plane infinitesimal waves according to the Navier–Stokes equations, Journal of Rational Mechanics and Analysis 2, 643–742.
57P. Preliminary Report: Non-linear absorption and dispersion of plane ultrasonic waves in pure fluids, Journal of the Washington Academy of Sciences 42 (1952), 33–36.
58. The physical components of vectors and tensors, Zeitschrift für Angewandte Mathematik und Mechanik 33, 345–356; 34 (1954), 69–70.
59. Paul Felix Neményi, Journal of the Washington Academy of Sciences 43, 62–63.
59C1. Same title, Science (2) 116 (1952), 215–216.
59C1T. [inaccurate] Same title, Physikalische Blätter 7 (1952), 325–326.
59C2. Same title, Zeitschrift für Angewandte Mathematik und Physik 3 (1952), 400–401.
59C3. Same title, Zeitschrift für Angewandte Mathematik und Mechanik 33, 72.
60. Review of H.M. Westergaard’s “Theory of Elasticity and Plasticity”, Bulletin of the American Mathematical Society 59, 412–413.
61. Review of V.V. Novozhilov’s “Foundations of the Nonlinear Theory of Elasticity”, Bulletin of the American Mathematical Society 59, 467–473.
62. Václav Hlavatý, International Mathematical News No. 29/30, 2–3.
Other publications in 1953: Nos. 43 (corrections and additions), 65A.
1954
63. A new chapter in the theory of elastica, pp. 52–55 of Proceedings of the First Midwestern Conference on Solid Mechanics, 1953.
64. The Kinematics of Vorticity, Indiana University Science Series No. 19, xvii + 232 pp.
64P. Same title, U.S. Naval Ordnance Laboratory Memorandum 9591, 11 March 1948, 35 pp.
64A. Same title, Bulletin of the American Mathematical Society 55 (1949), 296 and 699.
65. Le pendule hydraulique, pp. 383–396 of Mémoires sur la Mécanique des Fluides offerts à M.D. Riabouchinsky à l’occasion de son Jubilé scientifique, Publications Scientifiques et Techniques du Ministère de l’Air, Paris.
65A. The hydraulic pendulum, Indiana Academy of Science Proceedings 63 (1953), 263.
66. Editor’s Introduction: Rational fluid mechanics, 1687–1765, pp. VII–CXXV of Leonhardi Euleri Opera Omnia, Series II, Volume 12, Zürich, Füssli.
67. The present status of the controversy regarding the bulk viscosity of fluids, Proceedings of the Royal Society (London) A 226, 59–65.
68. Mathematics, pp. 618–619 of The American Peoples Encyclopedia Yearbook for 1953.
69. Review of J. Pérès’ “Mécanique Générale”, Bulletin of the American Mathematical Society 60, 286.
70. Review of E.J. McShane, J.L. Kelley and F.J. Reno’s “Exterior Ballistics”, Scripta Mathematica 20, 172–174.
71. Review of A. Erdélyi, W. Magnus, F. Oberhettinger, and F. Tricomi’s “Higher Transcendental Functions”, Volumes I and II, American Mathematical Monthly 61, 576–578.
Other publications in 1954: Nos. 43 (corrections), 55R, 54A3 and 84C.
1955
72. Hypo-elasticity, Journal of Rational Mechanics and Analysis 4, 83–133, 1019–1020.
72L. L’ipoelasticità, Conferenze del Seminario di Matematica dell’Università di Bari No. 29, Bologna, Zanichelli, 1957, 16 pp. [The text was somewhat mangled by the editor.]
72R. [corrected] pp. 43–92 of Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach, 1965.
73. The simplest rate theory of pure elasticity, Communications on Pure and Applied Mathematics 8, 123–132.
73R. [corrected] pp. 32–41 of Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach, 1965.
74. Review of F.I. Frankl and E.A. Karpovich’s “Gas Dynamics of Thin Bodies”, Science 121, 163–164.
75. Some things you don’t know about mathematics and mechanics, Indiana Alumni Magazine 17, 2–5. [The title was supplied by the editor; C.T. would not have accepted it, had he been informed.]
76. IU Prof says pupils can’t write, either, Indianapolis Times, June 16. [The title was supplied by the editor.]
77. I. Editor’s Introduction: The first three sections of Euler’s treatise on fluid mechanics (1766). II. The theory of aerial sound, 1687–1788. III. Rational fluid mechanics, 1765–1788, pp. VII–CXVII of Leonhardi Euleri Opera Omnia, Series II, Volume 13, Zürich, Füssli.
Other publication in 1955: No. 80A.
1956
78. (Co-author E. IKENBERRY) On the pressures and the flux of energy in a gas according to Maxwell’s kinetic theory, I, Journal of Rational Mechanics and Analysis 5, 1–54.
79. On the pressures and the flux of energy in a gas according to Maxwell’s kinetic theory, II, Journal of Rational Mechanics and Analysis 5, 55–128.
79L1. La crise actuelle dans la théorie cinétique des gaz (1955), Journal de Mathématiques Pures et Appliquées (9) 37 (1958), 103–118.
79L1T. By B.H. Aleksanova and N.T. Pashchenko, Sovremennyi krizis v kineticheskoi teorii gazov, Mekhanika No. 4/62 (1960), 65–75.
79L2. Une solution exacte des équations de Maxwell (1955), Journal de Mathématiques Pures et Appliquées (9) 37 (1958), 119–133.
79L3. Congetture intorno ad un nuovo metodo di approssimazione asintotica (1961), Rendiconti di Matematica 23 (1964), 185–192.
80. Das ungelöste Hauptproblem der endlichen Elastizitätstheorie, Zeitschrift für Angewandte Mathematik und Mechanik 36, 97–103.
80A. Same title, Physikalische Verhandlungen 68 (1955), 129.
80T1. By G.. Danelidze: Nereshennaya glavnaya zadacha nelineinoi teorii uprugosti, Mekhanika No. 1/41 (1957), 67–74.
80T2. By C.T.: The main open problem in the finite theory of elasticity, pp. 102–108 of Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach, 1965.
81. Hypo-elastic shear, Journal of Applied Physics 27, 441–447.
81R. Pp. 93–100 of Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach, 1965.
82. Zur Geschichte des Begriffes “innerer Druck”, Physikalische Blätter 12, 315–326.
83. Experience, theory, and experiment, pp. 13–18 of Proceedings of the Sixth Hydraulics Conference, Bulletin 36, State University of Iowa Studies in Engineering.
84. Review of “Advances in Applied Mechanics”, Volume 3, Scripta Mathematica 22, 65–68.
84C. A comment on scientific writing, Science 120 (1954), 434.
85. Review of R. Dugas’ “La Mécanique au XVIIe Siècle”, Isis 47, 449–452.
86. Query No. 150 [Bounded magic], Isis 47, 59.
1957
87. (Co-author B. BERNSTEIN) The solution of linear differential equations by quadratures, Journal für die Reine und Angewandte Mathematik 197, 104–111.
88. Sulle basi della termomeccanica, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8) 22, 33–38, 158–166.
88T. By C.T.: On the foundations of mechanics and energetics, pp. 293–305 of Continuum Mechanics II: The Rational Mechanics of Materials, New York, Gordon & Breach, 1965.
89. Eulers Leistungen in der Mechanik, Enseignement Mathématique 3, 251–262.
90. General solution for the stresses in a curved membrane, Proceedings of the National Academy of Sciences (U.S.A.) 43, 1070–1072.
91. Review of “The Principal Works of Simon Stevin”, Volume 1, edited by E. Crone, E.J. Dijksterhuis, R.J. Forbes, M.G.J. Minnaert, A. Pannekoek, Physikalische Blätter 13, 578–579.
Other publications in 1957: Nos. 72L, 80T1, 95A.
1958
92. The new Bernoulli edition, Isis 49, 54–62.
93. Geometric interpretation for the reciprocal deformation tensors, Quarterly of Applied Mathematics 15, 434–435.
94. Recent advances in rational mechanics, Science 127, 729–739.
94R. [corrected] Essay VIII in No. 165 below.
95. Neuere Anschauungen über die Geschichte der allgemeinen Mechanik, Zeitschrift für Angewandte Mathematik und Mechanik 38, 148–157.
95A. Neuere Anschauungen über die Geschichte der Mechanik, Physikalische Verhandlungen 83 (1957), 50.
96. Neuere Entwicklungen in der klassischen statistischen Mechanik und in der kinetischen Gastheorie, ausgearbeitet von D. MORGENSTERN, Ergebnisse der exakten Naturwissenschaften 30, 286–343.
97. (Co-author J.L. ERICKSEN) Exact theory of stress and strain in rods and shells, Archive for Rational Mechanics and Analysis 1 (1957/8), 295–323.
97R. Pp. 307–323 of Continuum Mechanics II: The Rational Mechanics of Materials, New York, Gordon & Breach, 1965.
Other publications in 1958: Nos. 79L1, 79L2, 98P, 104P, 107P.
1959
98. The rational mechanics of materials – past, present, future, Applied Mechanics Reviews 12, 75–80.
98P. Same title, Mathematics Research Center, United States Army, The University of Wisconsin, Technical Summary Report No. 41, July 1958, 28 pp.
98R. [corrected and modified] pp. 225–236 of Applied Mechanics Surveys, Washington, Spartan Books, 1966.
99. Invariant and complete stress functions for general continua, Archive for Rational Mechanics and Analysis 4 (1959/60), 1–29.
100. 20 Lectures on the Elements of Fluid Mechanics, notes taken by R. Wells, Rheology Section, National Bureau of Standards, June 30–September 11, multiplied typescript, 131 pp.
101. Review of H. Rouse and S. Ince’s “History of Hydraulics”, Isis 50, 69–71.
102. Review of “Rheology, theory and applications”, edited by F. Eirich, Quarterly of Applied Mathematics 17, 221–222.
103. Query No. 158, “Physical Intuition”, Isis 50, 480.
Other publication in 1959: No. 110A.
1960
104. Intrinsic equations of spatial gas flow, Zeitschrift für Angewandte Mathematik und Mechanik 40, 9–14.
104P. Same title, Mathematics Research Center, United States Army, The University of Wisconsin, Technical Summary Report No. 33, July 1958, 13 pp.
105. (Co-author R.P. KANWAL) Electric current and fluid spin created by the passage of a magnetosonic wave, Archive for Rational Mechanics and Analysis 5, 432–439.
106. (Co-author B.D. COLEMAN) On the reciprocal relations of Onsager, Journal of Chemical Physics 33, 28–31.
107. (Co-author R. TOUPIN) The classical field theories, pp. 226–793 of Flügge’s Handbuch der Physik, Volume 3, Part 1, Berlin/Göttingen/Heidelberg, Springer-Verlag.
107P. [Chapter C only] Kinematics of singular surfaces and waves, Mathematics Research Center, United States Army, The University of Wisconsin, Technical Summary Report No. 43, October 1958, 89 pp.
108. (translation by Mathäi) Zu den Grundlagen der Mechanik und Thermodynamik, Physikalische Blätter 16, 512–517.
108T. [English original] Text of the Chairman’s Introduction to the Colloquium on the Foundations of Mechanics and Thermodynamics held at the U.S. National Bureau of Standards, Washington, October 21–23, 1959, Appendix to No. 153, 1966.
109. A program toward rediscovering the rational mechanics of the age of reason, Archive for History of Exact Sciences 1, 3–36.
109TE. By C.T.: La scienza del moto dai ‘Principia Mathematica Naturalis Philosophiae’ di Newton alla ‘Méchanique Analitique’ di Lagrange, Atti e Memorie della Accademia Nazionale di Scienze, Lettere ed Arti, Modena (6) 2, 3–32.
109R1. [corrected] Essay II in No. 165 below.
109R2. [of the foregoing] No. HS-76 in The Bobbs-Merrill Reprint Series in History of Science, 1972.
110. Modern theories of materials, Transactions of the Society of Rheology 4, 9–22.
110A. Same title, Rheology Bulletin 28, No. 3 (1959), p. 5.
111. The Rational Mechanics of Elastic or Flexible Bodies, 1638–1788, L. Euleri Opera Omnia, Series II, Volume 11, Part 2, Zürich, Füssli, 435 pp.
111A. Outline of the history of flexible or elastic bodies to 1788, Journal of the Acoustical Society of America 32, 1647–1656.
111L1. Origin of the theory of vibrating systems, Res Mechanica 21 (1987), 291–311. [This text was drawn by others from Truesdell’s notes for a lecture.]
112. [unsigned] Potentials (physics), pp. 539–542 of McGraw-Hill Encyclopedia of Science and Technology, Volume 10.
113. [unsigned] Unified field theories, pp. 200–201 of McGraw-Hill Encyclopedia of Science and Technology, Volume 14.
114. Review of “Die Deutsch–Russische Begegnung und Leonhard Euler”, edited by E. Winter, Isis 51, 115.
115. Review of Leonhard Euler’s “Vollständige Anleitung zur Algebra”, edited by J.E. Hofmann, Isis 51, 434.
116. Query No. 161, Approximate theories in early research, Isis 51, 207. [Answered by B.L. VAN DER WAERDEN on pp. 567–568.]
Other publication in 1960: No. 79L1T.
1961
117. Stages in the development of the concept of stress, pp. 556–564 of Problems of Continuum Mechanics [Muskhelisvili Anniversary Volume], Philadelphia, Society for Industrial and Applied Mathematics.
117T. Этапы развития понятия напряжения, pp. 439–447 of Проблемы механики сплошной среды, Moscow, Издательство Академии Наук СССР.
118. Exact theory of self-expanding piston rings, Ingenieur-Archiv 30, 77–87.
119. The Principles of Continuum Mechanics, Socony Mobil Oil Company Colloquium Lectures in Pure and Applied Science No. 5 (February, 1960), (x) + 371 + XVIII pp. Reprinted in 1963 and 1965.
120. General and exact theory of waves in finite elastic strain, Archive for Rational Mechanics and Analysis 8, 263–296.
120L. Second-order theory of wave propagation in isotropic elastic materials, pp. 187–199 of Proceedings of the International Conference on Second-order Effects, Haifa (1962), 1964.
120R1. [corrected] pp. 230–263 of Continuum Mechanics IV: Problems of Nonlinear Elasticity, New York, Gordon & Breach, 1965.
120R2. Ibid, [not repaginated] Memoir 1 in Wave Propagation in Dissipative Materials, a Reprint of Five Memoirs by B.D. COLEMAN, M.E. GURTIN, I. HERRERA R., and C. TRUESDELL, New York, Springer-Verlag, 1965.
121. Ergodic theory in classical statistical mechanics, pp. 21–56 of Rendiconti della Società Italiana di Fisica, XIV Corso = Ergodic Theory, ed. P. Caldirola, New York, Academic Press.
122. Review of M. Clagett’s “The Science of Mechanics in the Middle Ages”, Speculum 36, 119–121.
123. Review of “Critical Problems in the History of Science”, edited by M. Clagett, Manuscripta 5, 101–103.
124. Review of “Die Berliner und die Petersburger Akademie im Briefwechsel Leonhard Eulers, Teil I, Der Briefwechsel L. Eulers mit G.F. Müller, 1735–1767”, edited by A.P. Juškevič, E. Winter, and P. Hoffmann, Isis 52, 113–114.
125. Review of M. Dyck’s “Novalis and Mathematics: A Study of Friedrich von Hardenberg’s Fragments on Mathematics and its Relation to Magic, Music, Religion, Philosophy, Language and Literature”, Isis 52, 606–607.
1962
126. Mechanical basis of diffusion, Journal of Chemical Physics 37, 2336–2344.
126L. Una teoria meccanica della diffusione, pp. 161–168 of Celebrazioni Archimedee del Secolo XX (Siracusa, 1961), Volume 3.
127. Reactions of the history of mechanics upon modern research, pp. 35–47 of Proceedings of the Fourth U.S. National Congress of Applied Mechanics.
127A. Same title, Journal of Applied Mechanics 29, 225.
127R. [corrected] Essay VII in No. 165, below.
127T. [of the foregoing] by P. Zimmermann: Rückwirkungen der Geschichte der Mechanik auf die moderne Forschung, Humanismus und Technik 13 (1969), 1–25.
128. Solutio generalis et accurata problematum quamplurimorum de motu corporum elasticorum incomprimibilium in deformationibus valde magnis, Archive for Rational Mechanics and Analysis 11, 106–113; 12 (1963), 427–428; 28 (1968), 397–398.
129. (Co-author R.P. KANWAL) Fluid and magnetic distortion carried by magnetosonic waves, The Physics of Fluids 5, 368–369.
130. Review of “Die Berliner und die Petersburger Akademie im Briefwechsel Leonhard Eulers, Teil II, der Briefwechsel L. Eulers mit Nartov, Razumovskij, Schumacher, Teplov und der Petersburger Akademie, 1730–1763”, edited by A.P. Juškevič, E. Winter, P. Hoffmann, and Ju.Ch. Kopelevič, Isis 53, 411–413.
Other publication in 1962: No. 144P1.
1963
131. (Co-author R.A. TOUPIN) Static grounds for inequalities in finite strain of elastic materials, Archive for Rational Mechanics and Analysis 12, 1–33; 19 (1965), 407.
132. The meaning of Betti’s reciprocal theorem, Journal of Research of the National Bureau of Standards 67B, 85–86.
133. Remarks on hypo-elasticity, Journal of Research of the National Bureau of Standards 67B, 141–143.
134. Review of M. Jammer’s “Concepts of Mass in Classical and Modern Physics”, Isis 54, 290–291.
135. Review of D. Morgenstern and I. Szabò’s “Vorlesungen über theoretische Mechanik”, Bulletin of the American Mathematical Society 69, 330–332.
136. Query 170 – Portrait of George Green, Isis 54, 277.
Other publication in 1963: No. 128 (corrections).
1964
137. Second-order effects in the mechanics of materials, pp. 1–47 of Proceedings of the International Conference on Second-order Effects, Haifa (1962).
138. The natural time of a visco-elastic fluid: its significance and measurement, The Physics of Fluids 7, 1134–1142.
139. A theorem on the isotropy groups of a hyperelastic material, Proceedings of the National Academy of Sciences (U.S.A.) 52, 1081–1083.
140. Whence the law of moment of momentum?, pp. 588–612 of Mélanges Alexandre Koyré, Volume 1, Paris, Hermann.
140R. [corrected] Essay V of No. 165, below.
140TE. By C.T., with a different appendix: “Die Entwicklung des Drallsatzes”, Zeitschrift für Angewandte Mathematik und Mechanik 44, 149–158.
141. The modern spirit in applied mathematics, I.C.S.U. Review of World Science 6, 195–205.
142. Fluid mechanics before the Society for Natural Philosophy, Science 143, 382.
143. [Gratiae ob lauream honoris causa ab Academia Polytechnica Mediolanensi collatam], p. 40 of Cerimonie Celebrative del Centenario del Politecnico, 2–4 Aprile 1964, Milano.
Other publications in 1964: Nos. 79L3, 120L, 144P2.
1965
144. Rational mechanics of deformation and flow [Bingham Medal Address], pp. 3–30 of Proceedings of the 4th International Congress on Rheology (1963), Volume 2.
144P1. Il punto di vista invariantivo nella meccanica dei corpi continui, Rendiconti del Seminario Matematico e Fisico di Milano 32 (1962), 91–104.
144P2. Die Rationale Mechanik der Kontinua, Zeitschrift für Angewandte Mathematik und Mechanik 44 (1964), 341–347.
144P2T. By A.I. Vandiner: Рациональная механика сплошной среды, Механика No. 4/92 (1965), 103–111.
144RE. [with editorial changes in incorrect English] Buletinul Institutului Politehnic din Iasi (n.s.) 13 (17) (1967), 415–418; 14 (18) (1968), 131–136.
145. (Co-author W. NOLL) The Non-Linear Field Theories of Mechanics, Flügge’s Handbuch der Physik, Volume 3, Part 3, Berlin-Heidelberg-New York, Springer-Verlag, viii + 602 pp.
146. (Co-author B.D. COLEMAN) Homogeneous motions of incompressible materials, Zeitschrift für Angewandte Mathematik und Mechanik 45, 547–551.
147. Fluids of the second grade regarded as fluids of convected elasticity [with an appendix by C.-C. WANG], The Physics of Fluids 8, 1936–1938.
148. Twenty prefaces in Continuum Mechanics II: The Rational Mechanics of Materials, New York, Gordon & Breach.
149. Sixteen prefaces in Continuum Mechanics III: Foundations of Elasticity Theory, New York, Gordon & Breach.
150. Seventeen prefaces in Continuum Mechanics IV: Problems of Non-Linear Elasticity, New York, Gordon & Breach.
151. Preface to Wave Propagation in Dissipative Materials, a Reprint of Five Memoirs by B.D. COLEMAN, M.E. GURTIN, I. HERRERA R., and C. TRUESDELL, New York, Springer-Verlag, 1965.
Other publications in 1965: Nos. 72R, 73R, 80T2, 81R, 88T, 97R, 120R1, 120R2, 135 (corrections).
1966
152. Instabilities of isotropic perfectly elastic materials in simple shear, pp. 139–142 of Proceedings of the Eleventh International Congress of Applied Mechanics, Munich (1964).
153. Six Lectures on Modern Natural Philosophy, New York, Springer-Verlag, (viii) + 117 pp.
153T1. By Magdalena Staszel and Wojciech Zakrewski: Sześć Wykładów Nowoczesnej Filozofii Przyrody, Warsaw, Państwowe Wydawnictwo Naukowe, 1969, 143 pp.
153TE. [first three lectures] by I.T. Rabotnova: Главы из книги «Шесть лекций по современной натурфилософии», Механика No. 4/122 (1970), 99–136.
153RE. [Lecture 5], pp. 55–73 of A Taste of Science, ed. R.J. Tykodi, Westport, Connecticut, Technomic Publishing Co., 1975.
154. The Elements of Continuum Mechanics, New York, Springer-Verlag, [iv] + 279 pp. Corrected second printing, 1985.
154L1. Foundations of continuum mechanics, pp. 35–48 of Delaware Seminar in the Philosophy of Physics (1965), edited by M. Bunge, New York, Springer-Verlag, 1967.
154L2. Thermodynamics of deformation, pp. 101–112 of Non-Equilibrium Thermodynamics, Variational Techniques and Stability, Chicago, University of Chicago Press.
154L3. Thermodynamics of deformation, pp. 1–12 of Modern Developments in the Mechanics of Continua, New York, Academic Press.
154L4. The nonlinear field theories in mechanics (1966), pp. 19–215 of Topics in Nonlinear Physics, Berlin-Heidelberg-New York, Springer-Verlag, 1968.
154L5. La thermodynamique de la déformation, pp. 207–231 of Canadian Congress of Applied Mechanics (1967), Proceedings, Volume 3, 1968.
154L6. Classical and modern continuum theories, pp. 79–92 of Polymers in the Engineering Curriculum, Proceedings of the Third Buhl International Conference on Materials, Pittsburgh, October 28–29, 1968, 1971.
155. Existence of longitudinal waves, Journal of the Acoustical Society of America 40, 729–730.
156. Preface, pp. IVA–IVL, to the second edition of G.G. Stokes’s Mathematical and Physical Papers, New York, Johnson Reprint Co., Volume 1.
Other publications in 1966: Nos. 43R, 44R, 98R, 163L1.
1967
157. Reactions of late baroque mechanics to success, conjecture, error, and failure in Newton’s Principia, The Texas Quarterly, Autumn, 238–258.
157R1. [corrected] Essay III in No. 165, below.
157R2. [of the preceding], pp. 2–47 of Mechanics 1970, American Academy of Mechanics, 1970.
157R3. [of No. 157R1], pp. 192–232 of The Annus Mirabilis of Sir Isaac Newton, 1666–1966, edited by R. Palter, Cambridge, Massachusetts, M.I.T. Press, 1971.
157R4. [of No. 157R1] No. HS77 in The Bobbs-Merrill Reprint Series in History of Science, 1972.
158. Reply to the paper “Zum Begriff des Elastischen Körpers” by H. Ziegler and D. McVean, Zeitschrift für Angewandte Mathematik und Physik 18, 293.
159. Review of C.W. Kilmister and J.E. Reeve’s “Rational Mechanics”, American Mathematical Monthly 74, 748–749.
160. Review of “Manuscripta Euleriana Archivi Academiae Scientiarum URSS, Tomus 1, Descriptio Scientifica”, Isis 58, 271–273.
160C. Same title, Scripta Mathematica 28 (1968), 210–211.
161. Review of “Manuscripta Euleriana Archivi Academiae Scientiarum URSS, Tomus 2, Opera Mechanica”, Isis 58, 273–274 = Scripta Mathematica 28 (1968), 211–212.
162. Review of I.E. Farquhar’s “Ergodic Theory in Statistical Mechanics”, Quarterly of Applied Mathematics 24, 386.
Other publications in 1967: Nos. 144RE, 154L1, 165P.
1968
163. Thermodynamics for beginners, pp. 373–387 of Irreversible Aspects of Continuum Mechanics, Proceedings of the IUTAM Symposia Vienna, June 22–28, 1966, Wien/New York, Springer-Verlag.
163L1. Termodinamica per principianti, Atti e Memorie della Accademia Nazionale di Scienze, Lettere ed Arti, Modena (6) 8 (1966), 136–144.
163L2. Termodinamica elementare, Rendiconti del Seminario Matematico dell’Università e del Politecnico di Torino 27 (1967/68), 19–33.
163T. By I.T. Rabotnova: Термодинамика для начинающих, Механика No. 3/121 (1970), 116–128.
164. Sulle basi della termodinamica delle miscele, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8) 44, 381–383.
164T. By C.T.: On the foundations of the thermodynamics of mixtures, pp. 273–297 of Mechanics 1971, American Academy of Mechanics, 1973.
165. Essays in the History of Mechanics, New York, Springer-Verlag, (x)+384 pp.
165P. [Essay I only] Leonardo da Vinci, The Myths and the Reality, Johns Hopkins Magazine, Spring, 1967, 29–42. [The title was supplied by the editor without informing C.T., who would not have accepted it.]
165A. “Essays in the History of Mechanics”, Archives Internationales d’Histoire des Sciences 23 (1970), 177–178.
165T. By J.C. Navascues Howard and E.T. Perez-Relaño: Ensayos de Historia de la Mecánica, Madrid, Editorial Tecnos, 1975, 343 pp.
165TE. [Essay VI] by P. Zimmermann: Frühe kinetische Gastheorien, Humanismus und Technik 14 (1970), 1–29.
See also Nos. 94, 109, 127, 140, 157.
166. Comment on longitudinal waves, Journal of the Acoustical Society of America 43, 170.
167. Parole [di ringraziamento per il V Premio internazionale con medaglia d’oro “Prof. Modesto Panetti”], Atti della Accademia di Scienze di Torino 102, 21–24.
168. Preface, pp. III–V of Continuum Theory of Inhomogeneities in Simple Bodies, Berlin-Heidelberg-New York, Springer-Verlag.
Other publications in 1968: Nos. 128 (corrections), 144RE, 154L4, 154L5, 161 (second publication).
1969
169. Rational Thermodynamics. A Course of Lectures on Selected Topics, New York, McGraw-Hill, (x) + 208 pp.
169T1. By D.J. Fernandez Ferrer: Termodinámica Racional, Barcelona, Editorial Reverte 1973, X + 221 pp.
169T2. By M. Fichera Colautti: Termodinamica Razionale, with an appendix, Contributi del Centro Linceo di Scienze Matematiche e Applicazioni No. 20, Roma, Accademia dei Lincei, 1976, 235 pp.
170. A precise upper limit for the correctness of the Navier–Stokes theory with respect to the kinetic theory, Journal of Statistical Physics 1, 313–318.
170T. By E.G. Berner: Точный верхний предел корректности теории Навье–Стокса с учетом кинетической теории, Механика No. 4/122 (1970), 137–142.
171. A teaching assistant remembers, Wall Street Journal, January 27, p. 14. [The title was supplied by the editor.]
Other publications in 1969: Nos. 127T, 153T1, 157T.
1970
172. De pressionibus negativis in sinu et in pariete regionis fluido viscoso moventi impletae schedula, Annali di Matematica Pura ed Applicata (4) 84, 213–224.
172A. Same title, Zentralblatt für Mathematik 207 (1971), 252–253.
173. Review of L. Suklje’s “Rheological Aspects of Soil Mechanics”, American Scientist 58, 210–211.
Other publications in 1970: Nos. 153TE, 163T, 165A, 165TE, 170T, 181C.
1971
174. Letter to the Editor, The Johns Hopkins Magazine, Spring, 3–4.
175. Review of O. Penrose’s “Foundations of Statistical Mechanics”, American Scientist 59, 638.
176. Review of R.M. Christensen’s “Theory of Viscoelasticity: An Introduction”, American Scientist 59, 615.
177. Review of T.L. Hankins’ “Jean d’Alembert: Science and the Enlightenment”, Centaurus 16, 56–59.
Other publications in 1971: Nos. 154L6, 157R3, 172A.
1972
178. Leonard Euler, supreme geometer (1707–1783), pp. 51–95 of Studies in Eighteenth Century Culture, Volume II, Irrationalism in the Eighteenth Century, Case Western Reserve University Press.
179. Review of G.A. Tokaty’s “A History and Philosophy of Fluidmechanics”, Nature 236, 84–85.
180. Review of C. Naux’s “Histoire des logarithmes de Neper à Euler, Tome II, La promotion des logarithmes au rang de valeur analytique”, Isis 63, 443–444.
Other publications in 1972: Nos. 109R2 and 157R4.
1973
181. Is there a philosophy of science? An essay review of “Induction and Intuition in Scientific Thought” by Peter Brian Medawar, Centaurus 17, 142–172.
181C. Review of Medawar’s “Induction and Intuition in Scientific Thought”, Die Naturwissenschaften 57 (1970), 314.
182. (Co-author C.-C. WANG) Introduction to Rational Elasticity, Leyden, Wolters–Noordhoff, xii + 566 pp.
183. Introduction à la Mécanique Rationnelle des Milieux Continus [translation by D. Euvrard of an unpublished English text], vii + 367 pp., Paris, Masson [published as of 1974].
184. Theoria de effectibus mechanicis caloris pridem ab illmo Sadi Carnoto verbis physicis promulgata nunc primum mathematice enucleata, Atti della Accademia di Scienze dell’Istituto di Bologna Classe di Scienze Fisiche (12) 10, 29–41.
184T. By B. Cimbleris, annotated: Carnot finalmente matematizado, Revista da Escola da Engenharia da Universidade Federal do Minas Gerais 2 (1974), 3–21.
185. The efficiency of a homogeneous heat engine, Journal of Mathematical and Physical Sciences (Madras) 7 [Milne-Thomson anniversary volume], 349–371; 9 (1975), 193–194.
185A. Sul rendimento delle macchine termiche omogenee, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8) 53, 549–553.
185T. By S. Benenti: Sul rendimento di una macchina termica omogenea, Rendiconti del Seminario Matematico dell’ Università e del Politecnico di Torino 31 (1971/3), 47–68 (1974).
186. Mathematical Aspects of the Kinetic Theory of Gases, Notas de Matemática Física, Volume III, Instituto de Matemática, Universidade Federal do Rio de Janeiro, (viii) + 246 pp.
187. The scholar’s workshop and tools, Centaurus 17, 1–10.
188. Review of I. Bernard Cohen’s “Introduction to Newton’s Principia”, Physics Today, April, p. 59.
189. Review of “Die Werke von Jakob Bernoulli, Band I”, Isis 64, 112–114.
190. Review of W. Flügge’s “Tensor Analysis and Continuum Mechanics”, American Scientist 61, 100.
191. Review of G.S. Gilmor’s “Coulomb and the Evolution of Physics and Engineering in Eighteenth-Century France”, Eighteenth-Century Studies 7 (1973/4), 213–225.
Other publications in 1973: Nos. 164T, 169T1, 203PT, 224P.
1974
192. The meaning of viscometry in fluid dynamics, Annual Review of Fluid Mechanics 6, 111–146.
193. (Co-author H. MOON) Interpretation of adscititious inequalities through the effects pure shear stress produces upon an isotropic elastic solid, Archive for Rational Mechanics and Analysis 55, 1–17.
194. Preface, pp. V–VI, The Foundations of Mechanics and Thermodynamics, Selected Papers by W. Noll, Berlin-Heidelberg-New York, Springer-Verlag.
195. A simple example of an initial-value problem with more than one solution, Istituto Lombardo, Accademia di Scienze e Lettere. Rendiconti, Classe di Scienze Matematiche e Naturali (A) 108, 301–304.
Other publications in 1974: Nos. 184T and 185T.
1975
196. Первоначальный курс рациональной механики сплошных сред, translation by R.V. Goldshtein and V.M. Entov of an unpublished English text, edited by P.A. Zhilin and A.I. Lurie, Moscow, Mir, 592 pp.
197. (Co-author H. MOON) Inequalities sufficient to ensure semi-invertibility of isotropic functions, Journal of Elasticity 5, 183–189.
197A. Same title, Zentralblatt für Mathematik 324 (1976), 513–514.
198. Early kinetic theories of gases, Archive for History of Exact Sciences 15, 1–66.
199. Les bases axiomatiques de la thermodynamique, Entropie 11, No. 63, 6–11; No. 64, 4–10; No. 65, 4–8. [The title is an unauthorized editorial replacement for the author’s “Trois conférences sur la structure conceptuelle de la thermodynamique, 1973”.]
200. Review of P. Costabel’s “Leibniz and Dynamics”, Historia Mathematica 2, 360–361.
201. Review of A.C. Eringen and E.S. Suhubi’s “Elastodynamics, Volume I, Finite Motions”, Journal of the Acoustical Society of America 58, 539–540.
Other publications in 1975: Nos. 153RE, 185 (corrections), 213A.
1976
202. History of classical mechanics, Die Naturwissenschaften 63, 53–62, 119–130.
202T. História da Mecânica clássica, Revista Brasileira de Ciências Mecânicas 4 (1982), No. 2, 3–17, and No. 3, 3–21.
203. The scholar, a species threatened by professions, Critical Inquiry 2, 631–648.
203A. Same title, Sociological Abstracts 26 (1978), 807.
203PT. By C. Wintzer and C.J. Scriba: Der Gelehrte: Eine durch die Professionen bedrohte Spezies, Humanismus und Technik 17 (1973), 113–127. [An entire page of text is accidentally omitted.]
203R. [corrected] Speculations in Science and Technology 3 (1980), 517–532.
204. Improved estimates for the efficiencies of irreversible heat engines, Annali di Matematica Pura ed Applicata (4) 108, 305–323.
205. Review of “The Mathematical Papers of Isaac Newton, Volume VI”, edited by D.T. Whiteside, American Scientist 64, 230.
206. Questioni vecchie e nuove di termodinamica razionale (Corso Linceo di 1973), published as an appendix (pp. 209–235) to No. 169T2.
207. Irreversible heat engines and the second law of thermodynamics, Letters in Heat and Mass Transfer 3, 267–289.
207L. Macchine termiche irreversibili e la seconda legge della termodinamica, pp. 297–307 of Problemi attuali di meccanica teorica e applicata, Atti del Convegno Internazionale a ricordo di Modesto Panetti, Torino.
208. Review of “L. Euleri Opera Omnia Series IVA, Volume 1”, edited by A. Juškevič, V. Smirnov, and W. Habicht, Eighteenth-Century Studies 9, 627–634 = [corrected] Archives Internationales d’Histoire des Sciences 27 (1977), 292–296.
209. (Co-authors G. ASTARITA and G.G. SARTI) Insegnamento della termodinamica nella Facoltà di Ingegneria con i metodi della termodinamica razionale, La Chimica e l’Industria 58, 204–206.
210. Review of K. Walters’ “Rheometry”; “Theoretical Rheology”, edited by J.F. Hutton, J.R.A. Pearson, and K. Walters; R.R. Huilgol’s “Continuum Mechanics of Viscoelastic Liquids”; and P. Chadwick’s “Continuum Mechanics”, American Scientist 64, 705–706.
1977
211. A First Course in Rational Continuum Mechanics, Part I: Fundamental Concepts, New York, Academic Press, xxiii + 280 pp.
212. (Co-author R. FOSDICK) Universal flows in the simplest theories of fluids, Annali della Scuola Normale Superiore di Pisa (IV) 2, 323–341.
212R. Pp. 330–348 of Volume 1 of Raccolta degli Scritti dedicati a Jean Leray, Scuola Normale Superiore, Pisa, 1978.
213. (Co-author S. BHARATHA) The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines, Rigorously Constructed Upon the Foundation Laid by S. Carnot and F. Reech, New York, Springer-Verlag, xxii + 154 pp.
213A. How to understand and teach the logical structure and the history of classical thermodynamics, pp. 577–586 of Volume 2 of Proceedings of the International Congress of Mathematicians, Vancouver, 1974, 1975.
214. Correction of two errors in the kinetic theory of gases which have been used to cast unfounded doubt upon the principle of material frame-indifference, Meccanica 11 (1976), 196–199.
215. Review of “Commentationes Mechanicae et Astronomicae, Commentationes ad Scientiam Navalem Pertinentes, Volumen Prius, Leonhardi Euleri Opera Omnia Series II, Volumen 20”, edited by W. Habicht, Centaurus 21, 76–77.
216. Review of “Lodovico Ferrari e Niccolò Tartaglia, Cartelli di Sfida Matematica”, edited by A. Masotti, Isis 68, 643–644.
1978
217. [Address upon receipt of a Birkhoff Prize, 1978], The Mathematical Intelligencer 1, 99–101, 193.
218. Review of “Die Berliner und die Petersburger Akademie der Wissenschaften im Briefwechsel Leonhard Eulers, Teil 3, Wissenschaftliche und Wissenschaftsorganisatorische Korrespondenzen 1726–1744”, edited by A.P. Juškevič, E. Winter, P. Hoffmann, I.N. Klado, and Ju.Ch. Kopelevič, Isis 69, 301–303.
219. Some challenges offered to analysis by rational thermomechanics, pp. 495–603 of Contemporary Developments in Continuum Mechanics and Partial Differential Equations (Proceedings of the International Conference on Continuum Mechanics and Partial Differential Equations, Rio de Janeiro, August 1977), edited by G.M. de LaPenha and L.A. Medeiros, Amsterdam, North-Holland.
1979
220. Absolute temperatures as a consequence of Carnot’s General Axiom, Archive for History of Exact Sciences 20, 357–380.
221. Schizzo concettuale della termodinamica per gli studiosi di meccanica, Bollettino della Unione Matematica Italiana (5) 16-A, 1–20.
222. Essay review of I. Szabò’s “Die Geschichte der Mechanischen Prinzipien”, Centaurus 23, 163–175.
223. [translation of an unpublished English text] Meccanica e Termomeccanica razionale (1974), pp. 33–52 of Volume IV of Enciclopedia del Secolo XX, Rome.
1980
224. (Co-author R.G. MUNCASTER) Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas, Treated as a Branch of Rational Mechanics, New York, Academic Press, xxvii + 593 pp.
225. The Tragicomical History of Thermodynamics, 1822–1854, New York, Springer-Verlag, XII + 372 pp.
225P. The Tragicomedy of Classical Thermodynamics (1971), International Centre for Mechanical Sciences, Udine, Courses and Lectures, No. 70, Wien and New York, Springer-Verlag, 41 pp., 1973. [Publication in this form was not authorized by C.T. and was contrary to his wishes.]
225RE. The disastrous effects of experiment upon the early development of thermodynamics, pp. 415–423 of Scientific Philosophy Today. Essays in Honor of Mario Bunge, Dordrecht, Reidel, 1981.
226. Sketch for a history of constitutive relations, pp. 1–27 of Proceedings of the 8th International Congress on Rheology, Volume 1.
227A. The nature and function of constitutive relations, pp. 9-41 through 9-44 of Volume 2 EPRI Workshop Proceedings: Basic Two-Phase Flow Modelling in Reactor Safety and Performance, Electric Power Research Institute, Palo Alto, March.
228. (Co-author J.F. BELL) §§3–6 of Physics of Music, pp. 666, 667 of Grove’s Dictionary of Music and Musicians, Volume 14. [Editorial revisions introduced a good many errors.]
229. Biographies [mangled by the editor and hence all but one unsigned] of D. Bernoulli, Chladni, Euler (co-author J.F. Bell), Helmholtz, Hooke, Lagrange, Lambert, and Sauveur, p. 628 of Volume 2, pp. 289, 290 of Volume 4, p. 292 of Volume 6, pp. 465, 466 and 686 of Volume 8, p. 361 and 397 of Volume 10, p. 524 of Volume 16 of Grove’s Dictionary of Music and Musicians.
230. Rapport sur le pli cachété No. 126, paquet présenté à l’Académie des Sciences dans le séance du 1er Octobre 1827, par M. Cauchy, et contenant le Mémoire “Sur l’équilibre et le mouvement intérieur d’un corps, solide considéré comme un système de molécules distinctes les unes des autres”, Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 291 (Vie Académique), 33–46.
230A. “Cauchy’s first attempt at molecular theory of elasticity”, Bollettino di Storia delle Scienze Matematiche 1 (1981), 135–143.
231. Proof that my work estimate implies the Clausius–Planck inequality, Accademia Nazionale dei Lincei, Rendiconti della Classe di Scienze Fisiche, Matematiche e Naturali (8) 68, 191–199.
Other publication in 1980: No. 203R.
1981
232. [Paroles de reconnaissance] Comptes Rendus Hebdomadaires des Séances de l’Académie des Sciences (Paris) 292 (Vie Académique), 45.
233. The role of mathematics in science as exemplified by the work of the Bernoullis and Euler, Verhandlungen der Naturforschenden Gesellschaft in Basel 91, 5–22.
234. [Translation of an unpublished text in English] Il calcolatore: rovina della scienza e minaccia per il genere umano, pp. 37–65 of La nuova ragione, Scienza e cultura nella società contemporanea, Bologna, Scientia/Il mulino.
234R. Notiziario della Unione Matematica Italiana 11 (1984), 52–80.
Other publication in 1981: No. 230A.
1982
235. The kinetic theory of gases, a challenge to analysts, pages 321–344 of Contributions to Analysis and Geometry (The Philip Hartman Symposium), edited by D.N. Clark, G. Pecelli, and R. Sacksteder, Johns Hopkins University Press.
236. Our debt to the French tradition; our search for structure today, Scientia 76, 63–77.
236T. Il nostro debito verso la tradizione francese: le “catastrofi” e l’attuale ricerca di struttura, ibid. 79–87. [This translation is faulty at some essential points.]
237. Perpetual motion consistent with classical thermodynamics, Atti della Accademia di Scienze di Torino 114 (1980), 433–436.
238. Fundamental mechanics in the Madrid Codices, pp. 309–324 of Leonardo e l’Età della Ragione, Milano, Scientia.
238T. [gravely defective] I primi principi di meccanica nei codici di Madrid, pp. 325–332 of ibid.
Other publication in 1982: No. 202T.
1983
239. Euler’s contribution to the theory of ships and mechanics, Centaurus 26, 323–335.
240. The influence of elasticity on analysis: The classic heritage, Bulletin of the American Mathematical Society (2) 9, 293–310.
1984
241. Preface to the reissue of Handbuch der Physik, Volume VIa, pp. V–VIII of each of the four parts.
242. Correction of some errors published in this journal, Journal of Non-Newtonian Fluid Mechanics 15, 249–251.
243. An Idiot’s Fugitive Essays on Science: Methods, Criticism, Training, Circumstances, New York, Springer-Verlag, XVII + 645 pp. Second printing, revised and augmented, 1987, xvii + 661 pp.
243C. Essay 33d with most of the quotations from Gulliver’s Travels omitted, pages vii–xxxix of L. Euler, Elements of Algebra, New York, Springer, 1984.
244. Rational Thermodynamics, A Course of Lectures on Selected Topics, with an appendix by C.-C. Wang, second edition, corrected and enlarged, to which are adjoined appendices by R.M. Bowen, G. Capriz, P.J. Chen, B.D. Coleman, C.M. Dafermos, W.A. Day, J.L. Ericksen, M. Feinberg, M.E. Gurtin, R. Lavine, I.-S. Liu, I. Müller, J.W. Nunziato, S.L. Passman, M. Pitteri, P. Podio-Guidugli, D.R. Owen, P.A.C. Raats, M. Šilhavý, C. Truesdell, E.K. Walsh, and W.O. Williams, New York, Springer-Verlag, xvii+578 pp.
244LA. [of Appendix 1A] Classical Thermodynamics is a Mathematical Science, pp. 40–53 of Proceedings of the International Conference on Nonlinear Mechanics, Shanghai, October 28–31, 1985, Beijing, Science Press, 1985.
245. A puzzle divided: English and Continental chairs following a unique design of the early eighteenth century, Furniture History (John Hayward Memorial) 20, 56–60 and plates 76–80.
Other publications in 1985: Nos. 154R, 244LA.
1986
246. A third line of argument in thermodynamics, pp. 79–83 of New Perspectives in Thermodynamics (workshop at the Institute for Mathematics and its Applications, University of Minnesota, June, 1983), New York, Springer-Verlag.
247. What did Gibbs and Carathéodory leave us about thermodynamics?, pp. 101–124 of New Perspectives in Thermodynamics (workshop at the Institute for Mathematics and its Applications, University of Minnesota, June, 1983), New York, Springer-Verlag.
248. Classical thermodynamics cleansed and cured, pp. 265–291 of Meeting on Finite Thermoelasticity, Contributi del centro Linceo interdisciplinare di Scienze Matematiche e loro applicazioni No. 76, Roma, Accademia Nazionale dei Lincei. Corrected reprint circulated in 1988.
249. Preface, p. V of The Breadth and Depth of Continuum Mechanics, A Collection of Papers Dedicated to J.L. Ericksen on His Sixtieth Birthday, edited by C.M. Dafermos, D.D. Joseph, and F.M. Leslie, Berlin, Springer-Verlag.
1987
250. Great Scientists of Old as Heretics in “The Scientific Method”, Charlottesville, University of Virginia Press, 96 pp.
251. Preface, pp. V–VI of Analysis and Thermodynamics, A Collection of Papers Dedicated to W. Noll on His Sixtieth Birthday, edited by B.D. Coleman, M. Feinberg, and J. Serrin, Berlin, Springer-Verlag.
Other publications in 1987: Nos. 111L1, 254A.
1988
252. Editorial, Archive for Rational Mechanics and Analysis 100 (1987/8), IX–XXII.
253. On the vorticity numbers of monotonous motions, Archive for Rational Mechanics and Analysis 104, 105–109.
254. Review of U. Bottazzini’s “The Higher Calculus: A History of Real and Complex Analysis from Euler to Weierstrass”, Archives Internationales d’Histoire des Sciences 38, 125–137.
254A. Same title, Bulletin of the American Mathematical Society (2) 17, 186–189 (1987).

1989
255. Preface, pp. V–VI of Analysis and Continuum Mechanics, A Collection of Papers Dedicated to J. Serrin on His Sixtieth Birthday, edited by S.S. Antman, H. Brezis, B.D. Coleman, M. Feinberg, J.A. Nohel, and W.P. Ziemer, Berlin, Springer-Verlag.
256. [Comment on the article by Charles J. Sykes on The Johns Hopkins University, September 6], Wall Street Journal, September 19.
257. Newton’s influence on the mechanics of the eighteenth century.
257T. (By K. Hutter) Newtons Einfluß auf die Mechanik des 18. Jahrhunderts, pp. 47–73 of Die Anfänge der Mechanik: Newtons Principia gedeutet aus ihrer Zeit und ihrer Wirkung auf die Physik, edited by K. Hutter, Berlin, Springer-Verlag.
258. Maria Gaetana Agnesi, Archive for History of Exact Sciences 40, 113–142; corrections and additions, 43 (1992), 385–386.

1991
259. Foreword, pp. vii–x of Edoardo Benvenuto, An Introduction to the History of Structural Mechanics, New York, Springer-Verlag (2 Volumes).
260. Letter to the Editor, Isis 82, 90.
261. A First Course in Rational Continuum Mechanics, Volume 1, Second Edition, corrected, revised, and augmented, Academic Press, xviii + 391 pp.

1992
262. Jacopo Riccati, un grande “Savant” del ’700: Vita, Studi, Carattere, pp. 1–25 of J. Riccati e la Cultura della Marca nel Settecento Europeo (Atti del Convegno Internazionale di Studio, Castelfranco Veneto, 5–6 Aprile 1990), edited by Gregorio Piaia and Maria Laura Soppelsa, Leo S. Olschki, Firenze.
263. Cauchy and the modern mechanics of continua (1989), Revue d’Histoire des Sciences 45, 5–24.
264. Sophie Germain: Fame earned by stubborn error, Bollettino di Storia delle Scienze Matematiche 11, 3–24.
265. Functionals in the modern mechanics of continua, Convegno Internazionale in Memoria di Vito Volterra (1990), Atti dei Convegni Lincei 92, 225–242.
266. (Co-author W. Noll) The Non-Linear Field Theories of Mechanics, Second Edition, Berlin-Heidelberg-New York, Springer-Verlag, X + 591 pp.
266T. (By Chen Zhaoxun) Fei xian xing chang zhi li xue li lun, National Institute for Compilation and Translation, Taipei, I+III+VI+742 pp., 2001.
1993
267. Mechanics, especially elasticity, in the correspondence of Jacob Bernoulli with Leibniz, pp. 13–26 of Der Briefwechsel von Jacob Bernoulli, edited by A. Weil, in Die gesammelten Werke der Mathematiker und Physiker der Familie Bernoulli, Basel, Birkhäuser.

1994
268. A modern exposition of classical thermodynamics, in: La Termodinamica e la Termocinetica nelle Scuole di Ingegneria, a ricordo del Prof. Cesare Codegone (Atti della giornata di studio tenutasi presso il Politecnico di Torino il 15 ottobre 1992), Atti della Accademia delle Scienze di Torino, Classe di Scienze Matematiche, Fisiche e Naturali 128, suppl. 2, 71–94.

1995
269. A che serve la storia delle scienze matematiche?, pp. 45–52 of Honoris Causa, Lezioni Dottorali di Insigniti di Laurea ad Honorem in Occasione del VI Centenario dell’Ateneo, Anno Accademico 1991/92, Ferrara, Università degli Studi di Ferrara.

1996
270. Jean-Baptiste-Marie Charles Meusnier de la Place (1754–1793): An historical note, Meccanica 31, 607–610.
271. The thirty-fifth anniversary of this Archive, by the Editor, Archive for History of Exact Sciences 50, 1–4.

2000
272. (Co-author K.R. Rajagopal) An Introduction to the Mechanics of Fluids, xiii + 277 pp., Birkhäuser, Boston.

2001
Other publication in 2001: No. 266T.

2004
273. (Co-author W. Noll) The Non-Linear Field Theories of Mechanics, Third Edition, edited by S.S. Antman, Berlin-Heidelberg-New York, Springer-Verlag, XXIX + 602 pp.
Serials Edited by Clifford Ambrose Truesdell III

I. Serials, as Founder or Co-founder
(Co-founder and co-editor T.Y. Thomas, later co-editor V. Hlavatý) Journal of Rational Mechanics and Analysis, Indiana University, 5 volumes, 1952–1956.
Archive for Rational Mechanics and Analysis, Berlin, Springer-Verlag, 1957–1989 (co-editor J. Serrin, 1967–1985).
Archive for History of Exact Sciences, Berlin, Springer-Verlag, 1960–1990.
Springer Tracts in Natural Philosophy, New York, Springer-Verlag, 1963–1966; co-editor, 1967–1978; editor, 1979–2000.

II. Other Serials
Editor, Reihe für Mechanik, Ergebnisse der Angewandten Mathematik, Berlin, Springer, 1957–1962, 3 volumes in all.
Member of the Editorial Committee, Rendiconti del Circolo Matematico di Palermo, 1971–2000.
Member of the International Editorial Committee, Meccanica, 1974–1994.
Member of the International Editorial Committee, Annali della Scuola Normale Superiore di Pisa, 1974–1999.
Member of the International Editorial Board, Il Nuovo Cimento B, 1979–1987.
Member of the Editorial Council, Bollettino di Storia delle Scienze Matematiche, Unione Matematica Italiana, 1979–2000.
Member of the Editorial Board, Speculations in Science and Technology, 1980–1987.
Member of the Editorial Board, Ganita-Bhārati, 1981–1993.
Member of the Editorial Board, Stability and Applied Analysis of Continuous Media, 1991–1993.
Eulogium
CLIFFORD AMBROSE TRUESDELL III
(b. February 18, 1919; d. January 14, 2000)

Clifford Ambrose Truesdell III died on January 14, 2000. This man of mathematics, science and natural philosophy focused his strong sense of history and his talents and taste for identifying major advances in rational mechanics to establish a renaissance in mechanics and materials research that has prospered since the middle of the 20th century. He contributed substance and spirit to the areas of continuum mechanics, thermodynamics and kinetic theory, challenged the establishment and its dogmatic thinking, and engaged the community of young researchers with a new and fundamental direction of inquiry which concentrated on foundations, structure and logical implication. His letters, his books, his essays, his university courses, his co-founding of the Journal for Rational Mechanics and Analysis, his founding of the Archive for Rational Mechanics and Analysis, the Archive for History of Exact Sciences, Springer Tracts in Natural Philosophy, the Society for Natural Philosophy and his support of the research of other scientists were exceptional. His joint encyclopedic articles, The Classical Field Theories in 1960 and The Non-Linear Field Theories of Mechanics in 1965, were masterful, erudite, comprehensive, and pioneering works of lasting value. Clifford Truesdell was a scholar of immense creativity, a linguist, a connoisseur of the arts and a historian unfettered by fashion. Throughout his life, he taught us to preserve scholarship, to question foundations and to follow a path of reason with principle, purpose and passion. His actions constantly provided a stimulus, an environment and a framework for scientific discovery. He made a profound contribution to our science.

ROGER FOSDICK
University of Minnesota
Minneapolis
[Photograph: Bloomington, Indiana, 1959]
Memories of Clifford Truesdell
BERNARD D. COLEMAN
Department of Mechanics and Materials Science, Rutgers University, 98 Brett Road, Piscataway, NJ 08854-8058, U.S.A.
Received 10 February 2003; in revised form 2 March 2003
Below is a shortened version of the text of a talk given at the Meeting in memory of Clifford Truesdell held in Pisa in November of 2000 and at the Symposium on Recent Advances and New Directions in Mechanics, Continuum Thermodynamics, and Kinetic Theory held in Blacksburg in June of 2002. Appended to that text is the Curriculum Vita of Professor Truesdell as he kept it up-to-date until October 1993, at which time, with his approval, I had it transcribed into its present format.

Clifford Truesdell and Thermodynamics

I consider myself to have been among the most fortunate of men: I have had a teacher and friend, indeed, more than a friend, in effect, an elder brother, who was the leading scholar in my science and who gave me encouragement, sound advice, and every type of help that I might need, even when I did not know that I needed it. Most important of all, he taught me that careful scholarship and the persistent search for insight and understanding are far more important than facile skill in the use of contemporary techniques for the solution of currently popular problems.

Clifford Ambrose Truesdell III was born in Los Angeles, February 18, 1919. In his 23rd and 24th years he received, from the California Institute of Technology, the B.S. Degree in Mathematics, the B.S. Degree in Physics, and the M.S. Degree in Mathematics, and, in addition, from Brown University, a Certificate in Mechanics. In his 25th year he received, from Princeton University, the Ph.D. Degree in Mathematics. In the course of his career he received numerous awards and prizes, among which are: the Euler medal of the USSR Academy of Sciences, which was received twice, in 1958 and 1983, the Bingham Medal of the Society of Rheology, the Panetti Prize and Gold Medal of the Accademia di Scienze di Torino, the Birkhoff Prize of the American Mathematical Society and the Society for Industrial and Applied Mathematics, and the Ordine del Cherubino of the Università di Pisa.
He received honorary doctorates from five universities and was awarded membership in twelve international academies of science; among them is the illustrious Italian Accademia Nazionale dei Lincei.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 1–13. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Although I was an undergraduate student at Indiana University from February 1948 to June 1951, and hence my stay in Bloomington, Indiana, did overlap, albeit partially, that of Clifford and Charlotte Truesdell, we first met years later, in the spring of 1958, at a scientific meeting in Lancaster, Pennsylvania, on the subject of rheology. The meeting was followed by an exchange of letters about thermodynamics, which was in turn followed by a one-week visit with Clifford and Charlotte Truesdell at their house in Bloomington in the winter of 1958 and not long after that by his two-week visit to the Mellon Institute. If you bear with me, I should like to tell you about events of that period from the point of view of one whose subsequent views of science, the arts, and life itself were completely changed by his interaction with Clifford Truesdell. In the summer of 1957 I left a position in the chemical industry to become a Senior Fellow of the Mellon Institute in Pittsburgh, and soon after my arrival I started to attend courses given by Walter Noll on continuum mechanics and related branches of mathematics. Before the academic year was over, we both went to the rheology meeting in Lancaster. The list of speakers for that meeting included Clifford Truesdell, Walter Noll, Jerry Ericksen, and Ronald Rivlin.
At a luncheon that was held there, Walter Noll and I were sitting at a table with several persons other than those just mentioned, and I expressed the view that although we were all told in school that thermodynamics is a closed subject whose general principles are known and pertain to only equilibrium states or to processes that stay so close to equilibrium that all departures from equilibrium are governed by linear constitutive relations, I could not believe that such is the case, and I felt that our knowledge of what the science of thermodynamics could be was in some way analogous to what the cultivators of mechanics knew about their subject at the time of the publication of Newton’s Principia and the early work of the Basel School. All within hearing, with the exception of Walter Noll, agreed with each other that I was wrong. Walter agreed with me and suggested that I read certain papers of Clifford Truesdell and that we talk more when we were back in Pittsburgh. We did talk more, much more, and I, with my eyes open wide with excitement, read whatever I could of Clifford Truesdell’s writings on thermodynamics. There were papers that carried forward Maxwell’s idea that a properly formulated theory of diffusion of mass in fluid mixtures should account, as Fick’s Law does not, for balance of linear and angular momentum. There were passages decrying the vagueness that rendered nearly empty, at least for mathematicians, the text-book versions of the Second Law of Thermodynamics. Among these was a footnote to a discussion of classical thermodynamics in his paper, The Mechanical Foundations of Elasticity and Fluid Dynamics, published in 1952 in Volume 1 of the Journal for Rational Mechanics and Analysis. The footnote urges the rational student “to cleave the stinging fog of pseudo-philosophical mysticism” hiding the mathematics behind a certain formulation of the Second Law. 
It was clear that he saw that thermodynamics, far from being a closed subject, was in a terrible state.
Today we know that he was then doing research that would supply the key to setting things straight. In that paper of 1952 there appears a preliminary version of what he, with great generosity, called the Clausius–Duhem inequality, and which appeared in its present form in The Classical Field Theories of Mechanics, by Clifford Truesdell and Richard Toupin, published in 1960 in the Handbuch der Physik, Vol. III:

$$\frac{d}{dt}H \;\ge\; -\int_{\partial P}\frac{q\cdot n}{\theta}\,da \;+\; \int_{P}\frac{r}{\theta}\,dm, \qquad H = \int_{P}\eta\,dm.$$

Here H is the total entropy of the part P, θ is the thermodynamical temperature, q is the inward directed heat flux, r is the supply of heat from external sources, and n is the outward directed unit normal vector. On reading those two works one sees that before 1960 it was clear to Clifford Truesdell that that inequality is the correct mathematical form of the second law of thermodynamics for the materials or systems such that the total entropy H is an integral over P of an entropy density η, and nearly all the thermodynamical systems that we consider in continuum physics have that property. The question that seemed open at the time was the following: How does one use the inequality? Is it a restriction on the process, or a relation to be obeyed by all processes? In the early 1960’s Walter Noll put to me the idea, as if it should be obvious to every one, that the inequality is a restriction on all processes that are admissible in the material of which the body is composed, and, because one defines each material by giving a set of constitutive relations, the Clausius–Duhem inequality, as it must hold for all processes compatible with those relations, becomes a restriction on constitutive relations. In another act of great generosity, Walter suggested that we develop the idea together. It took a while to sort the argument out and to present it in a way that would convince the wary. The paper was written while he and I were on sabbatical leave and were guests of Clifford Truesdell at Hopkins.
Shortly thereafter, I used this approach to the Clausius–Duhem inequality to render mathematical ideas I had been struggling to express for years about thermodynamical restrictions on materials with gradually fading memory. Many of the people in this room have done research on the Clausius–Duhem inequality, in the study of its implications for new classes of constitutive relations, in the study of its implications for the theory of the evolution of singular surfaces, or in the study of its logical relation to other mathematically precise statements of the second law. (Footnote: The remark was appropriate to both the meeting in Pisa and symposium in Blacksburg.) I am certain that I do not exaggerate when I say that every one of them feels a deep debt of gratitude to Clifford Truesdell for finding the tool to cleave the fog that once obscured the science of thermodynamics. But we have even greater debts to him. I should like to return to the time of that rheology meeting in Lancaster and elaborate on my own debt to him. A few months after that meeting, I received a letter from Clifford in which he said that he heard from Walter Noll that I had studied at
Yale, had read some of the works of J. Willard Gibbs, and knew some things about thermodynamics. He then asked if I could clarify some passages in Gibbs’ paper on The Equilibrium of Heterogeneous Substances. With Walter’s help I studied those passages for months and finally sent off a long letter in which Clifford’s questions were answered but issues were raised which for decades affected my work on the stability of thermodynamical systems. In his reply he invited me to visit him and Charlotte in Bloomington. Many years later, in the summer of 1993, I wrote to him about that period and with your indulgence I shall read from the letter. I know there are others here who could say similar things about our beloved friend, and when I have finished you will see what I mean about our greater debt to him. What I read now are three brief consecutive paragraphs out of a long letter. “My correspondence with you about Gibbs’ conditions for the stability of fluid phases occurred in this happy period. I visited your house in Bloomington for a week in, I believe, the winter of 58–59. That visit had a major influence on my view of what is important. As if struck by lightning, like the one Christians call Paul, I suddenly saw clearly something for which I was ready by instinct. In my case it was not a new religion, but a way to get out of a rut, by seeking to study the languages, the writings, the art, the customs, the lives, and the music and diversions of the ages in which our science originated and the works we admire were produced. Your example gave me the impetus to try to learn properly other languages. I became serious in my study of Italian. In later years I have tried to improve my French. (In my 62nd year, I started working on Attic and Homeric Greek, but was too old for such efforts.)” “The influence of our friendship on my intellectual development has been too great to describe in a few phrases couched in generalities. A brief summary is impossible.
Only examples will do, and there is space here for only one.” “That one concerns my behavior when I am writing something for publication. Invariably, upon completion of a passage I put down the pencil and ask myself: ‘What would Clifford say if he saw what I have just written?’ The subsequent imagined conversation is often such that I feel obliged to rewrite the passage.” Would that he were here now, to help us live up to the standards of scholarship and clarity that he set for us!
Curriculum Vita of Clifford Ambrose Truesdell III
Born in Los Angeles, California, February 18, 1919

Studies
European Travel and Private Study, 1936–1938.
California Institute of Technology: B.S. (Mathematics), 1941; B.S. (Physics), 1941; M.S. (Mathematics), 1942.
Brown University: Certificate in Mechanics, 1942.
Princeton University: Ph.D. (Mathematics), 1943.

Primary Employment
California Institute of Technology: Assistant in history, debating, and mathematics, 1940–1942.
Brown University: Assistant in Mechanics, 1942.
Princeton University: Instructor of Mathematics, 1942–1943.
University of Michigan: Instructor of Mathematics, 1943–1944.
Radiation Laboratory, Massachusetts Institute of Technology: Staff Member, 1944–1946.
U.S. Naval Ordnance Laboratory, White Oak, Maryland: Chief, Theoretical Mechanics Subdivision, 1946–1948.
U.S. Naval Research Laboratory, Washington, D.C.: Head, Theoretical Mechanics Section, 1948–1950.
Indiana University: Professor of Mathematics, 1950–1961.
Johns Hopkins University: Professor of Rational Mechanics, 1961–1989; Emeritus, 1989–

Part-time, Temporary, and Visiting Appointments
University of Maryland, College Park: Lecturer in Mathematics, 1946–1947, Assistant Professor of Mathematics, 1947–1949, Associate Professor of Mathematics, 1949–1950.
U.S. Naval Research Laboratory, Washington, D.C.: Consultant, 1951–1955.
Universität Marburg an der Lahn: Gastprofessor, 1957.
Mathematics Research Center, U.S. Army, University of Wisconsin, Madison: Member, 1958.
Mellon Institute, Pittsburgh: Visitor, 1959.
Socony-Mobil Research Laboratory, Dallas, Texas: Colloquium Lecturer, 1960.
U.S. National Bureau of Standards, Washington, D.C.: Consultant, 1950–1962.
University of California at Los Angeles: Special Lecturer, 1963.
Technische Universität Berlin–Charlottenburg: Gastprofessor, 1964.
University of Washington, Seattle: Walker-Ames Professor, 1964.
Australian Mathematical Society Summer Research Institute, Melbourne: Lecturer, 1965.
Syracuse University, New York: Distinguished Visiting Professor, 1965.
International School on Nonlinear Problems in Physics, München: Lecturer, 1966.
Università di Pisa: Visiting Lecturer, 1966, 1973–1975, 1978, 1980, 1982, 1985, 1987.
Sandia Corporation, Albuquerque, N.M.: Visitor, 1966.
Drexel Institute of Technology, Philadelphia: Seventy-Fifth Anniversary Lecturer, 1966–1967.
Accademia dei Lincei, Roma: Professore Linceo, 1970, 1973.
Universidade Federal do Rio de Janeiro: Lecturer for the Coordenação dos Programas de Pós-Graduação de Engenharia and the Instituto de Matemática, 1972.
Georgia Institute of Technology, Atlanta: Consultant, 1973–1974.
Scuola Normale Superiore, Pisa: Ospite Linceo, 1974.
Brookhaven National Laboratories, Long Island: Consultant (Advanced Codes Review Committee, U.S. Nuclear Regulatory Commission), 1975–1983.
University of Delaware: Bicentennial Scholar in Residence, 1976.
Instituto de Ingeniería Mecánica y Mecánica Teórica y Aplicada, Universidad Autónoma de Mexico, Mexico D.F.: Lecturer, 1977.
Università di Bologna: Professore Visitatore, 1978, 1987, 1988.
Université Catholique de Louvain: Visiting Professor, 1979.
Institut des Hautes Études Scientifiques, Bures-sur-Yvette: Visitor, 1981.
Cornell University: First Distinguished Visiting Professor of Theoretical and Applied Mechanics, 1982.
Università di Firenze, Scuola di Architettura: Professore a Contratto, 1985.
Scuola di Ingegneria Strutturale, Università di Roma “La Sapienza”: Visiting Professor, 1990.

Short Lecture Series and Named Single Lectures
University of Toronto, 1949.
Sorbonne, Paris, 1949, 1955.
State University of Iowa, 1956.
Indiana University, 1959.
Scuola Internazionale di Fisica, Varenna, 1960.
Università di Padova, 1961.
Università e Politecnico di Milano, 1961.
Midwest Mechanics Seminar Tour, 1962.
Academy of Sciences, Warsaw, 1963, 1964.
The Johns Hopkins University, 1965.
Gibson Lecturer in the History of Mathematics, University of Glasgow, 1965.
Distinguished Visiting Lecturer, Centennial of the University of Kentucky, 1965.
NSF Conference on Recent Developments in Continuum Mechanics, Virginia Polytechnic Institute, 1966, 1969.
Koerner Lecturer, Simon Fraser University, Burnaby, BC, 1969.
Distinguished Lecturer in Chemical Engineering, University of Rochester, 1970.
International Centre of Mechanical Sciences, Udine, 1971.
Centennial Lecturer in Engineering Mechanics, Virginia Polytechnic Institute, 1971.
Section de Transferts Thermiques, Centre de Recherches Nucléaires, Grenoble, 1973.
Bajer Lecture, Princeton University, 1975.
Durelli Lecture, Catholic University of America, 1977.
International Symposium on Continuum Mechanics and Partial Differential Equations, Universidade do Rio de Janeiro, 1977.
University of Chicago, 1979.
Thermofluids Lectures, Departments of Chemical and Mechanical Engineering, School of Mines, University of Arizona, Tucson, 1980.
Ritt Lectures, Department of Mathematics, Columbia University, 1982.
First MTU Lectures in Engineering Science, Michigan Technical University, Houghton, 1983.
St. Andrews University, Scotland, 1983.
Distinguished Scientist Lecture, Trinity University, San Antonio, TX, 1984.
Allen Lecture in Mathematical Sciences, Rensselaer Polytechnic Institute, 1985.
Page-Barbour Lectures, University of Virginia, Charlottesville, 1985.
Franklin Lecture, Auburn University, Auburn, AL, 1986.

Invited Single Lectures to Meetings and Symposia
American Mathematical Association (Baltimore, 1948).
American Physical Society (Charlottesville, 1949; State College, PA, 1953; San Diego, CA, 1971).
International Conference on Theoretical Fluid Mechanics, Harvard, 1950.
Conference on Elasticity, University of Maryland, 1952.
Sigma Xi (Indiana University, 1952; State University of Iowa, 1956; Illinois Institute of Technology, 1960; Georgia Institute of Technology (Monie A. Ferst Memorial Lecture), 1969; University of Tennessee, 1976; McGill University, 1976).
Symposium on Ultrasonic Absorption and Dispersion in Fluids, Brown University, 1952.
Discussion Meeting on the Second Viscosity of Fluids, The Royal Society, London, 1953.
First Midwestern Conference on Solid Mechanics, University of Illinois, 1953.
Symposium of the Office of Ordnance Research and the American Mathematical Society, Chicago, 1954.
American Society for Engineering Education, Urbana, 1954.
Gesellschaft für Angewandte Mathematik und Mechanik (General Lectures), Berlin, 1955; Hamburg, 1957.
Sixth Conference on Hydraulics (General Lecture), Iowa City, 1955.
Eulerfeier (Main Lecture), Basel, 1957.
Washington Philosophical Society, 1958.
Accademia Nazionale di Scienze, Lettere ed Arti (Inaugural Address), Modena, 1960.
Celebrazioni Archimedee, Siracusa, 1961.
I.U.T.A.M. Symposium on Second-order Effects in Elasticity, Plasticity, and Fluid Dynamics (General Lecture), Haifa, 1962.
Fourth U.S. National Congress of Applied Mechanics (General Lecture), Berkeley, 1962.
Summer Conference on Non-ideal Mechanical Behavior, Princeton, 1962.
American Society of Mechanical Engineers, Washington, 1962.
Symposium on Hemodynamics and Hydrodynamics, Baltimore, 1962.
International Congress of Rheology (Bingham Medal Address), Providence, 1963.
Society for Natural Philosophy at Pittsburgh, 1963; Notre Dame, 1971; Seattle, 1972; Pisa, 1974, 1978; Williamsburg (Fifteenth Anniversary Lecture), 1978; Rolla, 1980; Brown, 1983; Baltimore, 1987; Pittsburgh (Walter Noll retirement symposium), 1993.
Eleventh International Congress of Applied Mechanics, Munich, 1964.
Philosophy of Science Seminar, University of Delaware, 1965.
Convegno dei Meccanici Italiani, Modena, 1966.
I.U.T.A.M. Symposium on Irreversible Thermodynamics in Continuous Media, Vienna, 1966.
Commemoration of Newton’s Annus Mirabilis, Austin, 1966.
First Canadian National Congress of Applied Mechanics (General Lecture), Quebec, 1967.
Third Buhl International Conference on Materials, Mellon Institute, 1968.
Symposium on “The Interplay between Mathematics and Physics – The Rise of Mathematical Physics” at the University of Aarhus, 1970.
Southwest Graduate Research Conference, Houston, Texas, 1971.
Second Annual Meeting of the American Society for Eighteenth Century Studies, College Park, Maryland, 1971.
Banquet address, meeting of the History of Science Society and Society for the History of Technology, Washington, 1972.
Sectional address (History and Paedagogy), International Congress of Mathematicians, Vancouver, 1974.
Address at the Engineering Commencement, Tulane University, New Orleans, 1976.
Address on receipt of a Birkhoff Prize, Annual meeting of the American Mathematical Society and the Society for Industrial and Applied Mathematics, 1978.
Euromech Colloquium, Pisa, 1978.
Italo–American Co-operative Science Seminar, Venice, 1978.
Organizer’s address, Special Symposium on “Conceptual Analysis in Rational Thermomechanics”, Summer meeting of the American Mathematical Society, Providence, 1978.
Keynote address on Constitutive Relations, E.P.R.I. Workshop on Two-Phase Flow, Tampa, Florida, 1979.
Colloquium on Continuum Thermodynamics, Society of Engineering Science, Northwestern University, Evanston, 1979.
Celebration of the 75th anniversary of Scientia, Milano, 1980.
Plenary lecture, 8th International Congress of Rheology, Naples, 1980.
General Lecture, Society of Engineering Science, Atlanta, 1980.
Colloquium on the History of Mathematics, Winter meeting of the American Mathematical Society, San Francisco, 1981.
Keynote address, 11th Southeastern Conference on Theoretical and Applied Mechanics, Huntsville, Alabama, 1982.
Joint Session on the History of Mathematics, meetings of the American Mathematical Society and American Mathematical Association, Toronto, 1982.
Festakt Daniel Bernoulli (Main Lecture), Basel, 1982.
Leonardo e l’età della ragione (Congress organized by Scientia and the governing bodies of Milano and Lombardy), Milano, 1982.
25th British Theoretical Mechanics Colloquium, Manchester, 1983.
International Symposium, “The Codex Hammer in Context”, Walters Art Gallery, Baltimore, 1983.
Workshop on the Laws and Structure of Continuum Thermomechanics, Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, 1983.
Convegno sul tema “Termoelasticità finita”, Accademia Nazionale dei Lincei, Rome, 1985.
International Conference on Nonlinear Mechanics, Shanghai, 1985.
900th Anniversary celebrations, University of Bologna, 1987, 1988.
300 Years of Gravitation, University of Cambridge, England, 1987.
International Conference dedicated to the Tricentenary of the Publication of Newton’s Principia, U.S.S.R. Academy of Sciences, Moscow, 1987.
First Plenary Lecture, 4th National Congress of Theoretical and Applied Mechanics, Coimbra, Portugal, 1987.
Celebration of the 300th anniversary of Newton’s Principia, Technische Hochschule Darmstadt, 1988.
First Plenary Lecture, III International Workshop on Mathematical Aspects of Fluid and Plasma Dynamics, Salice Terme, Italy, 1988.
First Plenary Lecture, IX Congresso Nazionale dell’Associazione Italiana di Meccanica Teorica ed Applicata, Bari, 1988.
Imola Conference, Università degli Studi di Bologna, September 5–7, 1988.
Inaugural Charles E. Foster Lecture, School of Aerospace and Mechanical Engineering, University of Oklahoma, Norman, 1990.
Convegno Internazionale “I Riccati e la cultura della Marca nel Settecento Europeo”, Castelfranco Veneto, 1990.
First Rutgers Conference on Theoretical Mechanics: The Dynamics of Rods, August 24–27, 1990, Rutgers University, New Brunswick.
Convegno Internazionale in Memoria di Vito Volterra, Accademia Nazionale dei Lincei, October 8–11, 1990.

Editorial Positions
Co-founder and Co-editor, Journal of Rational Mechanics and Analysis, 1952–1956.
Editor or Co-editor, Leonhardi Euleri Opera Omnia, Series II, Vols. 10–13, 18–19, 1952–1971.
Co-editor, Handbuch der Physik, (Springer) Vols. 8/I, 8/II, 9 and 6a/1–6a/4, 1956–1974.
Founder and Editor, Archive for Rational Mechanics and Analysis, 1957–1967, Co-editor, 1967–1985; Editor, 1985–1989.
Editor, Reihe für Mechanik, Ergebnisse der Angewandten Mathematik, 1957–1962.
Founder and Editor, Archive for History of Exact Sciences, 1960–.
Founder and Editor, Springer Tracts in Natural Philosophy, 1962–1966; Co-editor, 1967–1978; Editor, 1979–.
Co-editor, Studies in the Foundations, Methodology and Philosophy of Science, 1966–1970.
Member of the Editorial Board, Rendiconti del Circolo Matematico di Palermo, 1971–.
Member of the Editorial Board, Annali della Scuola Normale Superiore, Pisa, 1974–.
Member of the Editorial Board, Meccanica, 1974–.
Member of the International Editorial Board, Il Nuovo Cimento B, 1979–1981; Il Nuovo Cimento D, 1982–1987.
Member of the Editorial Council, Bollettino di Storia delle Scienze Matematiche, Unione Matematica Italiana, 1979–.
Member of the Editorial Board, Speculations in Science and Technology, 1980–1987.
Member of the Editorial Board, Ganita-Bharati, 1981–.
Member of the Editorial Board, Stability and Applied Analysis of Continuous Media, 1991–.

Organizational Positions
U.S. Correspondent, International Mathematical News (Austria), 1952–1956.
Member, Committee on Applied Mathematics, U.S. National Research Council, 1954–1956.
Sponsor for Elasticity, American Society for Mechanical Engineers, 1956–1958.
General Chairman, Conference on the Foundations of Mechanics and Thermodynamics, National Bureau of Standards, 1959.
Member of organizing committee, International Conference on Rarefied Gas Dynamics, Berkeley, California, 1960.
Member of organizing committee, International Congress of Logic and the Philosophy of Science, Stanford, 1960.
Member of organizing committee, I.U.T.A.M. Conference on Second-order Effects in Elasticity, Plasticity, and Fluid Mechanics, Haifa, 1962.
Co-founder, Society for Natural Philosophy, 1963; Director, 1963–1984; Secretary, 1963–1965, 1970–1971, 1980–1981; Chairman, 1967–1968, 1983–1984; Member of the Program Committee, 1975–1976.
Co-chairman of the local committee and Chairman of the Round-Table Discussion, meetings of the Society for Natural Philosophy at Baltimore, 1963, Bressanone, 1965, Chairman of a Round-table Discussion at the meetings at Chicago, 1966; Cincinnati, 1970; Cincinnati, 1977; Madison, 1984.
Co-chairman of the local committee for the meeting at Baltimore, 1965.
Coordinator, C.I.M.E. Course on Nonlinear Continuum Theories, Bressanone, 1965.
Co-Chairman, First Joint Italian-American Cooperative Science Seminar, Udine, 1971.
Member of the Scientific Committee, Symposium on Problems of Plasticity, Polish Academy of Sciences, Warsaw, 1972.
Co-Chairman, Italian–American Cooperative Science Seminar, Udine, 1974.
Organizer of Special Symposium “Conceptual Analysis in Rational Thermomechanics”, Summer Meeting, American Mathematical Society, Providence, 1978.
Member of the Steering Committee, International Conference on Nonlinear Mechanics, Shanghai, 1985.

Honorary Doctorates

Dott.ing.h.c. in Mechanical Engineering (Fluid Mechanics and History of Science), Centenary of the Politecnico di Milano, 1965.
D.Sc. (Engineering), Tulane University, 1976.
Fil. D. h.c. (Physics), Uppsala University, 1979.
Dr. Phil. h.c. (Sciences), University of Basel, 1979.
Dott. mat. h.c. (Mathematics), University of Ferrara, 1992.

Memberships in National or International Academies of Science, etc.

Socio Onorario dell’Accademia Nazionale di Scienze, Lettere ed Arti, Modena, from 1960.
Membre Correspondent de l’Académie Internationale d’Histoire des Sciences, Paris, 1961–1968. Membre Effectif from 1968.
Membro Straniero dell’Istituto Lombardo Accademia di Scienze e Lettere, from 1968.
Socio Corrispondente Straniero dell’Istituto Veneto di Scienze, Lettere ed Arti, from 1969.
Accademico Corrispondente Straniero dell’Accademia delle Scienze dell’Istituto di Bologna, from 1971.
Socio Straniero dell’Accademia Nazionale dei Lincei, Rome, from 1972.
Membre Titulaire de l’Académie Internationale de Philosophie des Sciences, Bruxelles, from 1974.
Socio Straniero dell’Accademia delle Scienze, Torino, from 1978.
Membro Corrispondente, Academia Brasileira de Ciências, from 1981.
Honorary Foreign Member, Polish Society for Theoretical and Applied Mechanics, from 1985.
Membrum Ordinarium, Regia societas scientiarum Upsaliensis, from 1987.
Fellow, American Academy of Arts and Sciences, from 1991.

Awards, Prizes

California Institute of Technology, Institute Scholar and LaVerne Noyes Scholar, 1938–1941; Conger Peace Prize, 1940, 1941.
Fellow of the John Simon Guggenheim Memorial Foundation, 1957.
Euler medal of the USSR Academy of Sciences, 1958, 1983.
Senior Post-Doctoral Fellow, U.S. National Science Foundation, 1960–1961.
Bingham Medal of the Society of Rheology, 1963.
Gold Medal and International Prize “Modesto Panetti” (applied mechanics), Accademia di Scienze di Torino, 1967.
Birkhoff Prize (applied mathematics), American Mathematical Society and Society for Industrial and Applied Mathematics, 1978.
Ordine del Cherubino, University of Pisa, 1978.
Visiting Research Scholar, Japan Society for the Promotion of Science, Kyoto, 1980.
Senior U.S. Scientist Award (Humboldtpreis), West Germany, 1985.
Clifford Truesdell (1919–2000), Historian of Mathematics
ENRICO GIUSTI
Department of Mathematics, University of Florence, Italy
Received 26 August 2003
In many ways, the research performed by Clifford Truesdell on the history of mathematics can be summarized by the title of the first article, at the beginning of the first issue of the Archive for History of Exact Sciences: “A program towards rediscovering Rational Mechanics in the age of reason”. Two themes come together and will always recur in Truesdell’s research.

The first one is reason: in an age when the term “enlightenment” also took on negative meanings, Truesdell never stopped asserting, in decisive and clear language, the supremacy of reason as the only guide to human behavior. He saw the end of this “age of reason” in the French Revolution, the source, in his opinion, of all atrocities of modern times, from the lager to the gulag, from the nuclear threat to universal suffrage. Truesdell sees all subsequent historical events in an exclusively negative way. Although we cannot understand or agree with Truesdell on all of this, we see that his theory envisages reason as the only compass able to guide mankind in its daily choices, and we discover in his mathematical philosophy a model of that rational culture that sometimes seems increasingly remote from us and obsolete.

The second theme is rational mechanics. All of Truesdell’s historical work places at its center the birth and development of modern rational mechanics, a discipline well known to Italian scientists but somewhat foreign to the Anglo-Saxon tradition. Truesdell regards rational mechanics, first of all, as an out-and-out mathematical discipline, equally far from pure abstract speculation and from “big science” with its large-scale projects and few ideas:

Not only private, individual experimental researches were performed in the eighteenth century; there were also large, cooperative projects. As today, they cost more than real science, and they attracted administrators. But the effect of all this expense on what we now consider the achievement of the period was nil. The method used in the great researches was entirely mathematical, but the result was not what would now be called pure mathematics. Experience was the guide; experience, physical experience and the experience of accumulated previous theory. If we were to seek a word for what was done, it would not be physics and it would not be pure mathematics; least of all would it be applied mathematics: It would be rational mechanics.
(Essays in the History of Mechanics, Springer-Verlag, Berlin, 1968, p. 136.)
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 15–22. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Within rational mechanics, Truesdell studied especially the founders, “the immortal ones” in his own definition: Euler; Jacob, Johann and Daniel Bernoulli; Cauchy. This choice evidently reflects his personal orientation and taste; there are, however, reasons deeper than personal preference, and they originate in the way Truesdell conceived the study of the history of mathematics.

One of the key points of Truesdell’s perspective on the history of mathematics concerns competence. Truesdell did not indulge in sociological investigations or in the description of cultural circles and cultural institutions, except in cases where those aspects played a decisive role in the development of the research. His history was about contents rather than circumstances. According to Truesdell, the history of science has reason to exist only if it tells us about science; otherwise it will just exert a bad influence both on history and, especially, on science itself. For this branch of history, one must have detailed, deep and diverse knowledge: in the first place, knowledge of mathematics, based on an extensive study of the subject and preferably combined with some familiarity with research practice, and then historical and philological competence. All this knowledge can be acquired only through long study: the history (or the philosophy) of mathematics cannot be the occupation of retired mathematicians or of literati without any scientific background:

The philosophy of science, I believe, should not be the preserve of senile scientists and of teachers of philosophy who have themselves never so much as understood the contents of a textbook of theoretical physics, let alone done a bit of mathematical research or even enjoyed the confidence of a creating scientist.
(An Idiot’s Fugitive Essays on Science, Springer-Verlag, New York, 1984, p. vii.)

Since it is not a matter of describing an environment that looks the same to each observer, but rather a matter of analyzing and understanding different contributions of the most diverse depth and quality, the specificity and the vastness of the history of mathematics raise the urgent problem of relevance. Since he cannot describe and reproduce all existing written treatises and documentation on mathematics, the historian must make a choice and, at the same time, state the criteria on which he bases it. Truesdell has only one criterion: relevance. First of all, after examining the huge series of authors and literature, the historian will have to identify who and what made a concrete contribution to the development of the discipline, and then understand and describe the contents. This absolutely requires specific competence; from it comes, furthermore, the need to evaluate, to express judgements and to take a position:

In writings on the history of science today, as in all aspects of social intercourse, it becomes increasingly bad taste to call a spade a spade. In the particular application to the history of science, the compulsion to euphemy assumes the form of a solemn refusal to admit that there is such a thing as wrong in science. All that matters is how the scientists of one epoch thought and felt about nature, their own work, and the work of others. In particular, the beginner is enjoined above all not to take sides, like the positivist historians of the last century, and set past science into categories of true and false according as it does or does not agree with what is taught in school science of our own day. However admirable this philosophy may be in promoting peace and mutual love among historians of science, it disregards one aspect of science that is not altogether negligible, namely, that scientists seek the truth, not a truth. He who refuses to “take sides” in science in effect negates science itself by denying its one and common purpose. He reduces science to just one more social manifestation. In so doing, not only does he by implication belittle the great scientists of the past, but also he sins against history, for in his attempt at historical impartiality he destroys the object, namely science, the history of which he claims to write.
(Essays in the History of Mechanics, pp. 145–146.)

From this point of view, the mathematician, and consequently also the historian of mathematics, derives an undoubted advantage from the fact that mathematics, as opposed to other disciplines, applies objective criteria to distinguish true from false and the relevant from the secondary:

Now a mathematician has a matchless advantage over general scientists, historians, politicians, and exponents of other professions: He can be wrong. A fortiori, he can also be right. There are errors in EUCLID, and, to within a set certainly of measure zero on the ordinary human scale, what EUCLID proved to be true in ancient Greece is true even in the colossal, unprecedented, nucleospatial, totally welfared today. In the advance through the physical, social, historical, and other sciences, the demarcation between truth and falsehood grows vaguer, until in some areas truth can be rezoned as falsehood and falsehood enshrined into truth by consensus of “acknowledged experts and authorities” or even popular vote.
(Essays in the History of Mechanics, p. 140.)

According to Truesdell, to take a position is a duty, perhaps the main duty of a historian, who must distinguish relevance from fame and put each author in the right perspective. In his treatises about a period or an author, Truesdell never fails to take a clear position, and he goes so far that, in some cases, it looks as if he intentionally chooses topics that allow him to oppose established interpretations and evaluations, often accepted and repeated uncritically. His heroes, as we have explained, are Euler, the Bernoullis, Cauchy and, in history, Pierre Duhem. On the opposite side, he puts D’Alembert, Lagrange and Mach: the last two are considered responsible, by the way, for the misunderstandings and the forgeries present in the history of mechanics.

The first volume [of the Encyclopédie] carries as frontispiece a magnificent engraving of D’ALEMBERT, who, under the guise of authoritative reviews, filled its pages with shoddy hashes of antiquated science served up with a sauce of his own prejudices, advertisements for his researches, and attacks on his opponents.
(The rational mechanics of flexible or elastic bodies 1638–1788 (Leonhardi Euleri Opera Omnia II, XI (2)), p. 245n.)
LAGRANGE’s talent for algebra was undoubtedly great, but in respect to fundamental questions of analysis or mechanics his work does not attain the logical and conceptual standards of his great predecessors. Also, the proportion of nontrivial errors in LAGRANGE’s calculation is high compared with other major mathematicians’. This body of errors seems to have attracted little notice, so that Lagrange is generally given credit for having solved several problems on which his work is largely or totally wrong.

It is time for a reappraisal of the work of the French mathematicians, a reappraisal constructed, in defiance of the généralités from the obituaries and the descendents of the obituaries, upon critical study of the work done. I am confident such a reappraisal would much reduce the importance of D’ALEMBERT and LAGRANGE, would yield a more realistic view of LAPLACE, MONGE, and FOURIER, would raise CLAIRAUT and POISSON to their just level, and would reveal CAUCHY as the towering giant of his age and nation.

While LAGRANGE’s book is a good starting place, experience with it has led me to the following working hypotheses:
1. There was little new in the Méchanique Analitique; its contents derive from earlier papers of LAGRANGE himself or from works of EULER and other predecessors.
2. General principles or concepts of mechanics are misunderstood or neglected by LAGRANGE.
3. LAGRANGE’s histories usually give the right references but misrepresent or slight the contents.
(The rational mechanics of flexible or elastic bodies, p. 412n; Essays in the History of Mechanics, pp. 246–247.)

Not even Leonardo escapes his criticism. Commenting on the attribution to Leonardo of the laws governing free fall on inclined planes, Truesdell writes:

From many other examples we know that LEONARDO often proposed rules of simple proportion, and that usually they came out wrong. Here is one that is right. Did LEONARDO know it? Was not this, for him, just one of the hundred linear rules he proposed and never got around to trying out, the only difference being that this one concerns a problem later seen to have central importance, and this answer just turned out to be right? I fear so. We remember that in regard to free fall, LEONARDO proposed several linear laws, mutually contradictory, and in the single one that turned out to be correct, he may simply have forgotten to repeat his other, incorrect, statements. The facts before us are simple:
1. In physics, some relations are linear and some are not.
2. LEONARDO never proposed any relation other than a linear one.
3. LEONARDO did propose dozens of linear relations.
4. From 1 and 3, some of LEONARDO’s rules may be expected to come out right.
Counting only the cases when he was right and disregarding those when he was wrong may give a somewhat distorted estimate of his capacity as a natural philosopher.
(Essays in the History of Mechanics, p. 36.)
Once again, he does not fear the wrath of the defenders of “political correctness” when he writes an article entitled “Sophie Germain: fame earned by stubborn error”, in which he makes statements such as:

The above suffices to show that Sophie Germain was ignorant not only of elementary mechanics but also of the calculus of double integrals.
(Bollettino di Storia delle Scienze Matematiche, 11-2 (1991), p. 12.)

Of course, Truesdell does not blame the authors; they did all that they possibly could. Instead, he blames the historians, who cannot distinguish sumptuously dressed banalities from conclusive results, who keep repeating oft-repeated statements without thinking them through and who, by doing so, contribute to the formation of a historical treacle in which nobody can any longer tell a valid theorem from a futile exercise.

While a physicist writing on the history of physics usually tells us what he thinks the old scientists must have thought, and a historian tells us whom they knew and what books they read, Professor JAMMER tells us mainly what they said they did . . . . A history apparently was not intended, since, despite critical remarks here and there, Professor JAMMER seems to be content with quiet juxtaposition of conflicting opinions. We read what was thought about mass not only by NEWTON and HERTZ but also by ALOIS HÖFLER and CLÉMENTICH DE ENGELMEYER. Now if Professor JAMMER had found that HÖFLER and DE ENGELMEYER, although forgotten today, in fact did something important in mechanics, everyone should congratulate him on his success as a historian, but when he merely tells us what they thought – HÖFLER is quoted as saying, “The tonomonic quantity ‘dyne’ precedes logically the notion of ‘one gram mass’ ”, and DE ENGELMEYER “our daily experience prepares us much better for the comprehension of the notion of force than of mass . . .” – then we may well ask, who cares? Indeed, if it had been Huygens and Euler who had made the statement just quoted, a historian would do neither their memories nor his readers any service by perpetuating these flat vacuities.
(An Idiot’s Fugitive Essays on Science, pp. 170–171.)

Once the history of our culture was our common heritage, our pride and our lesson book for conduct both private and public. . . . For me, as long as I have tried to do research, the one and only school of method has been study, study and study of the masters. Today these simple truths are as obsolete currency as gold coin. . . . The beginner in history of science must be taught first of all what will make him, if he completes apprenticeship, different from and independent of historians and scientists alike. Mathematics cannot be defined now except as that which mathematicians do; for physics, we substitute the word “physicists”, and soon the history of science will be defined as that which historians of science do and will likewise live a Parkinsonian life, independent equally of science and of history. Just as books on political history are written now to be read by political historians alone, and works on mathematics to be read by none but professional mathematicians, soon we can expect that books on the history of science will be meaningless except to historians-of-science, dumb to scientists and to historians, serving only to produce more and more historians-of-science who are paid, if they can get jobs, to do nothing but indoctrinate more historians-of-science.

The history of science is different in kind from science itself and from ordinary history. The material of the history of science is compact: Being history, it necessarily concerns the past, and because in the past science was a tiny and select vocation, not the factory job it is today, there is little to be read. What little there is, includes the highest intellectual achievement of our culture as well as a part of its finest artistic creation. . . . The example set today by the professions of scientists and historians is the worst that historians of science could choose to follow. Indeed, the history of science needs to be cleared and established. Thereafter, it ought to be learned. Although only a handful of persons could ever acquire the eccentric conjunction of skills and knowledge necessary in him who would do sound research in the history of science, there are many who can and should learn the results of that research. History of science should be studied and learned by every scientist, every historian, every person who seeks any intellectual footing in the Western culture. The great need of history of science today is for teachers.
(An Idiot’s Fugitive Essays on Science, pp. 585–586; p. 589.)

These last two quotations bring us to another topic on which Truesdell repeatedly insisted, in particular in his Lectio doctoralis, delivered on the occasion of the conferral of the “laurea honoris causa” at the University of Ferrara: the importance of the history of mathematics and, more generally, of the history of science, not only for the culture of the average citizen and, in particular, for the culture of scientists, but also for the interaction between historical and scientific culture. Without the history of science, a scientist’s culture is an end in itself; without at least a basic knowledge of science, the history of mathematics is impossible.

All attempts to write a history of mathematics to be read by non-mathematicians turned out as disastrous failures: horrendous and stupid myths, the caricature or even the degradation of mathematics. Usually, the mathematician is represented as a sort of quack, rather than the way a great mathematician really is: a thinker, an organiser of ideas, a creator of new concepts, the bearer of the truth, the indefatigable worker in the fields of knowledge and beauty. In such works of vulgarization, mathematics itself is described as an arcane science, a spell, a crazy revelation. My discussion here refers not only to mathematics in its modern and specific sense, but also to mathematical science. The history of mathematical science, although probably not very useful to individuals who are ignorant of any form of mathematics, should be understandable to a much vaster public nowadays, more than any common research paper. It should be interesting, perhaps even useful, to anybody with some knowledge of mathematics, not necessarily very deep and detailed, and to anybody with some conscience of the value of mathematical science. A treatise on the history of mathematical science should differ profoundly from a modern research essay.
(Translated from Honoris Causa. Lezioni dottorali di insigniti di laurea ad honorem in occasione del VI centenario dell’Ateneo. Academic Year 1991/1992. University of Ferrara, 1995, p. 49.)
One of the main functions it should fulfill is to help scientists understand some aspects of specific areas of mathematics about which they still don’t fully know. What’s more important, it helps them too. By satisfying their natural curiosity, typically present in everybody towards his or her own forefathers, it helps them indeed to get acquainted with their ancestors in spirit. As a consequence, they become able to put their efforts into perspective and, in the end, also able to give those efforts a more complete meaning. By seeing beyond one or two leaves of the great tree of mathematics, they can comprehend the whole structure, discern the subdivision of its branches and trace back its roots. When a scientist gets the chance to measure his own efforts by comparing them with the results of the immortals, he will feel perhaps less compelled to vie for supremacy over the pygmies of his level and will try, instead, to achieve one or two small goals, which will have a probability to survive the test of time. This is a moral advantage.
(Translated from Honoris Causa, pp. 49–50.)

Mathematical science offers examples of what human reason can achieve, of the safety that comes together with reason; by reaching a rational clearness, they put also in evidence the limits of reason and the unstable foundations on which all sciences are based. The methods that turned out to be successful and those that have been surpassed and failed, the paths that guided to the final destination and the blind ones, can be learned exclusively through the history of mathematical sciences. This is an old-style history, the history of men and of their actions; with the words of Savile: No other study is as suitable as history to guide the life of mankind. This is a social advantage.
(Translated from Honoris Causa, p. 50.)

Still, relevance and fame are two different concepts. Truesdell does not choose his heroes simply from among the recognized mathematicians. The scientists whom Truesdell raises to the level of heroes, and who thus take a curtain call on the stage of his investigations, are sometimes practically unknown, not only to the general public but also to historians of science. Particularly enlightening examples are John Herapath and John James Waterston. They were the first to make a substantial contribution to the kinetic theory of gases after the publication of Daniel Bernoulli’s Hydrodynamica; both sent their work to the Royal Society, which in both cases rejected it, giving rise to a series of disputes. Herapath, though he used the momentum instead of the kinetic energy of particles for his definition of temperature, had shown that the kinetic theory was suitable as a first explanation for phase transitions, diffusion, and the propagation of sound. In a series of letters in which Herapath was exhorted to abandon these speculative aspects and devote himself to experiments, the president of the Royal Society, Sir Humphry Davy, wrote, “having considered a good deal the subject of the supposed real zero, I have never been satisfied with any conclusions respecting it. I cannot see any necessary connexion between the capacity of bodies for heat, and the absolute quantity they contain; and temperature does not measure a quantity, but merely a property of heat.”

As for Waterston, in 1845, long before the work of Joule, Thompson and Krönig, he sent a note on kinetic theory to the Royal Society. This paper was rejected by the Society as well. On that occasion, one of the two referees objected that the principle that pressure depended on molecular collisions with the walls, a central hypothesis of Waterston’s paper, was “by no means a satisfactory basis for a mathematical theory”. The second referee just wrote that “the paper is nothing but nonsense, unfit even for reading before the Society”. The work in question was rediscovered by Lord Rayleigh in 1891 and published in 1893, ten years after Waterston’s death.

In the conflict between the genius and the Academy, Truesdell definitely sides with the former, and he draws two important conclusions from the events described above. In the first place, he praises an anarchical approach in academic research and organization. Truesdell himself depicts the academic world in his incomparable style, half serious and half ironical, in terms that, apart from some details, could very well fit the Italian situation:

Our academic life presents to the foreigner a lamentable scene of chaos. No-one knows who is on top. If in University 1 Professor A is a demigod, we have only to consult Professor B in University 2 to learn that in his department A would not qualify even as an assistant. True, A belongs to six national committees, has a million dollar grant from the Central Spy Bureau, and has published eight successful textbooks, but B, who points to A’s textbooks as models of nonsense, has written 216 research papers with twenty-three co-authors and also is consultant for four major corporations, assistant editor of five journals and second vice-president of a professional trade-union.
(An Idiot’s Fugitive Essays on Science, p. 399.)

The second conclusion Truesdell drew from the events in which Herapath and Waterston had been involved is more important. He thought that a system making use of anonymous referees favored a reactionary attitude and made it difficult for new ideas to emerge. In the journals he founded, the two Archives, this system was replaced by one in which the name of the communicator of a paper was published immediately beneath the name of the paper’s author. He thought that this change would successfully replace anonymous irresponsibility with individual and clear responsibility. Unfortunately, I fear that he was wrong: in the framework of modern organization, the problem is no longer the protection of a genius from the closed and reactionary attitude of the establishment, but rather the protection of journals from the huge amount of irrelevant material that threatens to flood them. From this point of view, a comparison between different opinions is preferable to leaving it all to the whim of just one communicator. After all, no system is perfect.
The Genesis of Truesdell’s Nonlinear Field Theories of Mechanics
WALTER NOLL
308 Field Club Ridge Road, Pittsburgh, PA 15238, U.S.A. E-mail: [email protected]
Received 14 March 2003
Clifford Truesdell was a singularity among all prominent scientist-scholars of the twentieth century. He believed that the pinnacle of civilization had been reached in the 18th century and that things have gone downhill ever since. He had no television set and no radio, and I doubt that he ever used a typewriter, let alone a computer. Many of the letters he sent me were written with a quill pen. (However, he did not reject such modern conveniences as flush toilets and air-conditioning.) He loved baroque music and did not care very much for what was composed later on. He owned very fine harpsichords and often invited masters of the instrument to play, often before a large audience, in his large home in Baltimore, which he called the “Palazzetto”. He collected art and antique furniture, mostly from the 18th century. He even often dressed in the manner of an 18th century gentleman. Most importantly, he admired the scientists of the 18th century and, above all, Leonhard Euler, whom he considered to be the greatest mathematician of all time. Actually, in the 18th century, mathematics and physics were not the separate specialities they are today, and the term “Natural Philosophy” was the term then used for the endeavor to understand nature by using mathematics as a conceptual tool. C.T. tried and succeeded to some extent in reviving this term. Clifford Truesdell was extremely prolific and he worked very hard most of the time. When he didn’t, he liked to eat well and drink good wine. This is perhaps one reason why he loved Italy so much, and spent extended periods of time there. Clifford Truesdell was not only an eminent scientist but also a superb scholar. He mastered Latin perfectly. He could not only read and understand the classical literature, which was written in Latin, but even wrote at least one paper in Latin. He was fluent not only in his native language, English, but also in French, German, and, above all, Italian. 
He wrote papers and gave lectures in all of these languages. All of these attributes are particularly rare for somebody born in Los Angeles, California.

This paper is the text, with minor revisions, of a lecture given by the author at the Meeting in memory of Clifford Truesdell, held in Pisa, Italy, in November 2000.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 23–30. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
In the early 1950s, when he was only in his early 30s, Clifford Truesdell had already established himself as perhaps the world’s best informed person in the field of continuum mechanics, which included not only the classical theories of fluid mechanics and elasticity but also newer attempts to mathematically describe the mechanical behavior of materials for which the classical theories were inadequate. In 1952, he published a 175-page account entitled “The Mechanical Foundations of Elasticity and Fluid Dynamics” in the first volume of the Journal of Rational Mechanics and Analysis.

Springer-Verlag, with headquarters in Heidelberg, Germany, has been an important publisher of scientific literature since the beginning of the 20th century. In the 1920s it published the Handbuch der Physik, a multi-volume encyclopedia of all that was then known in physics, mostly written in German. By the 1950s the Handbuch had become hopelessly obsolete, and Springer-Verlag decided to publish a new version, written mostly in English and called “The Encyclopedia of Physics”. Here is Clifford Truesdell’s own account, in a 1988 editorial in the Archive for Rational Mechanics and Analysis, of how he became involved in this enterprise:

In 1952 Siegfried Flügge and his wife Charlotte were organizing and carrying through the press of Springer-Verlag the vast, new Encyclopedia of Physics (Handbuch der Physik), an undertaking that was to last for decades. Joseph Meixner in that year had called Flügge’s attention to my article The mechanical foundations of elasticity and fluid dynamics . . . On October 1, 1952, Flügge invited me to compose for the Encyclopedia a more extensive article along the same lines but presenting also the elementary aspects of continuum mechanics, which my paper . . . had presumed to be already known by the reader.

In the second week of June 1953, Flügge spent some days with us in Bloomington, mainly discussing problems of getting suitable articles on fluid dynamics for the Encyclopedia. . . . He asked me to advise him, and by the end of his visit I had agreed to become co-editor of the two volumes under discussion.
It was C.T., in his capacity as co-editor, who invited James Serrin to write the basic article on Fluid Mechanics. C.T. himself decided to make two contributions dealing with the foundations of continuum mechanics, the first to be called “The Classical Field Theories of Mechanics” and the second “The Nonlinear Field Theories of Mechanics”. Here is what he wrote about this decision, in the preface to a 1962 reprint of his Mechanical Foundations: By September, 1953, I had agreed to write, in collaboration with others, a new exposition for Flügge’s Encyclopedia of Physics. It was to include everything in The Mechanical Foundations, supplemented by fuller development of the field equations and their properties in general; emphasis was to be put on principles of invariance and their representations; but the plan was much the same. As the work went into 1954, the researches of Rivlin and Ericksen, and of Noll made the underlying division into fluid and elastic phenomena, indicated by the very title The Mechanical Foundations of Elasticity and Fluid Dynamics no longer a natural one. It was decided to split the projected article in two parts, one on the general principles, and one on constitutive equations. The former part, with the collaboration of Mr. Toupin and Mr. Ericksen,
TRUESDELL’S NONLINEAR FIELD THEORIES OF MECHANICS
was completed and published in 1960 . . . . In over 600 pages it gives the full material sketches in Chapters II and III of The Mechanical Foundations. The second article, to be called “The Nonlinear Field Theories of Mechanics”, Mr. Noll and I are presently engaged in writing. The rush of fine new work has twice caused us to change our basic plan and rewrite almost everything. The volume of grand and enlightening discoveries since 1955 has made us set aside the attempt at complete exposition of older things. It seems better now to let The Mechanical Foundations stand as final for most of the material included in it, and to regard the new article not only as beginning from a deeper and sounder basis than could have been reached in 1949 but also as leaving behind it certain types of investigation that no longer seem fruitful, no matter what help they may have provided when new.
In 1951 I started employment as a “Wissenschaftlicher Assistant” (scientific assistant) at an Institute of Engineering Mechanics (Lehrstuhl für Technische Mechanik) at the Technical University of Berlin. In late 1952 I came across a leaflet from Indiana University offering “research assistantships” at the Graduate Institute of Mathematics and Mechanics. I applied and was accepted to start there in the Fall of 1953. After arriving, I found out that I had become a “graduate student”. At first I did not know what that meant, because German universities have nothing corresponding to it. But the situation was better than I expected, because no work was assigned to me, and I could pursue work towards a Ph.D. degree full time. I asked Clifford Truesdell to be my thesis advisor, because he was the local expert on mechanics and I thought that would help me after my return to Germany. At the time I did not know that C.T. had already become the global expert on mechanics. I also did not know at the time that I was C.T.’s first doctoral student. I was told later that other graduate students did not select him as a thesis advisor because they thought that he was too tough. It is true that he was tough and asked much of me, but he was also extremely kind and helpful, and, in the end, he had more influence on my professional career than any other person. I learned from him not only the basic principles and the important open questions of mechanics, but also how to write clear English. In addition, his wife Charlotte and he did very much to make me feel comfortable in Bloomington. They invited me to their home very frequently to share the gourmet dinners prepared by Charlotte. As a thesis topic, C.T. gave me a paper written by S. Zaremba in 1903 and asked me to make sense of it. I found that, in order to do so, a general principle was needed, which I called “The Principle of Isotropy of Space”. 
This principle also served to clarify many assertions in mechanics that had been obscure before (at least to me). C.T. had a wonderful sense of humor. One day he put up his son, then about 10 years old, to ask me: “Mr. Noll, please explain the Principle of Isotropy of Space to me.” Actually, I think I have found a good explanation only recently, about 40 years later. I had a very good background in the concepts of coordinate-free linear algebra and used it in my thesis. At the time, C.T. was uncomfortable with this approach and forced me to add a coordinate version to all formulas. However, he did not have a closed mind and later, in a letter to me dated August 4, 1958, he wrote:
I must also admit that the direct notations you use are better suited to fundamental questions than are indicial notations. Your present mathematical style is smoother and simpler than that in your thesis.
After I received my Ph.D. in September 1954 and returned to Germany, C.T. and I stayed in contact by mail, and I saw him again when he gave a lecture in Berlin in June 1955 (“Das ungelöste Hauptproblem der endlichen Elastizitätstheorie”). The term of my position in Berlin ran out in the Fall of 1955 and I needed a new job. On C.T.’s recommendation I was offered an Associate Professorship at Carnegie Institute of Technology (now Carnegie Mellon University), where I moved in the Fall of 1956 after having spent a year at the University of Southern California. Here is an excerpt from a letter he wrote to me on February 14, 1956, after I accepted the position at Carnegie Tech: I think you deserve your position at Carnegie and hope you will like it. If I might offer a word of advice, it would be to maintain the level set by your two papers in the Journal. Nowadays so many people are publishing like rabbits that volumes of papers make little impression. Many young people publish too much when they find out how easy it is to do something others have not done, especially something a senior colleague is too dull to see. It is of course necessary to avoid setting such high standards that one’s publication ceases entirely, as is the case with many well known savants.
Later, in December of the same year, he wrote me the following letter:
Dear Noll, Ericksen and I have long promised to write a second article, The nonlinear field theories of mechanics, to cover exact work on elastic, fluid, and plastic phenomena. Ericksen is far behind on the first article, and therefore we have not even outlined this second one. Although I have not yet consulted Ericksen, I feel sure that he would be relieved if you would consent to join us and do a major part of the second article. The order of authors would be according to the amount of work provided, and if you could do most of the work, you would be the senior author. You know both the old Mechanical Foundations and the new Classical Field Theories thoroughly. Therefore you have as much background as anyone else, and you have made important contributions yourself. I have not been able to think of a good organization. All I can say is that I should prefer to cut down on the amount of space given to special proposals prior to 1948 and to emphasize general work such as you and Rivlin have done. Whether it is better to treat very general visco-elastic theories first or to give a definitive presentation of classical finite elasticity first is not clear to me. The article should be exhaustive, but for the work prior to 1948 I think we need only condense the Mechanical Foundations or change the emphasis, as I believe the list of sources cited there is virtually complete. The presentation should be concise, but the length is not critical. If this proposal is acceptable to you, I will put it up to Ericksen at once. Also, if you could start right away on the organization and suggest a distribution of material among the three authors it would help. I believe I can begin active work within two months, but I do not expect to be able to prosecute exclusively this one project in the near future.
Here is an excerpt from my answer, written on January 2, 1957: I shall be glad to accept your proposal and join you and Ericksen as co-author of “the nonlinear field theories of mechanics”. It will be a challenging task for me, but I shall do the best I can. I have thought a little bit about the organization. Here is very roughly what I would like to propose: The first section should contain general remarks concerning constitutive equations, their classification, invariance requirements, constraints, definitions of terms like “material”, “isotropic”, “aelotropic”, etc. I have been thinking about this subject for quite a while, and I plan to include my ideas on it in a paper on the axiomatic foundation of mechanics, which I plan to write in the near future. I think it is best to give then an account of finite elasticity and nonlinear fluid theory (Reiner–Rivlin). These are the simplest and best known nonlinear theories. Next, more general stress-strain relations (as those proposed by Rivlin and Ericksen) could be treated. Finally, constitutive equations involving stress rates (as those proposed by Oldroyd, Cotter and Rivlin, and in my thesis) and their special cases (hypo-elasticity) could be worked out, and the connection of these general theories with the simpler ones could be established. I would be willing to do the general remarks and also most of the treatment of the various constitutive equations in general. However, I would appreciate some help with the special cases and solutions. I would think, for instance, that Ericksen would have little trouble to present all special solutions valid for an arbitrary strain energy function in finite elasticity by taking his two papers (ZAMP 5: 466–489 and J. Math. Phys. 2: 126–128) as a basis. Also, it is probably easier for one of you to deal with finite deformations of shells, anisotropic elastic bodies, and similar material. I would think, too, that you would want to do the section on hypo-elasticity yourself. 
I am open to alternative and more detailed proposals concerning the organization as well as the distribution of the material to the authors, and I am looking forward to have your opinion.
The new task of working on the Nonlinear Field Theories of Mechanics (NLFT) changed my professional life forever, because it focused my attention on the foundations of continuum mechanics. C.T. was very good at reading, digesting, and summarizing existing literature. I am not. I cannot help rethinking and recreating whatever subject I wish to understand. This had perhaps the effect of improving the NLFT in the end, but it also slowed down the progress towards completion. Sometimes C.T. became very impatient with me, and with good reason. The low point was reached when C.T. wrote the following letter to me: Dear Walter, Upon my return from the West, I did not find any manuscript of our article. As I said in Pittsburgh, I feel now that you should withdraw from the article. This is the third period of half a year or more in which you have not sent me a line, and again the subject is slipping away from us. My having had to disturb you every few weeks or days for the past six months so as to get your assurance that a large section of manuscript would be ready in a few days has been very painful, the more so since
fruitless. I have put in this treatise the better part of all my work in the past five years, and as I prepare for another trip to Europe, I cannot drag this weight of responsibility any longer. I plan to finish the article by intensive work in January and February, and I request you to send me all material I gave you before I went to Europe last spring. Surely, you know that I feel this loss deeply. The parts written by you are far better than anything I could write, and your criticism and correction of the parts I wrote have been of the highest value. There is no one else who could do the job you are capable of doing, but you have not done it, and some sort of article must be published nevertheless. There is too much invested in it already.
Well, I pulled myself together and I helped to finish NLFT. During our collaboration we were in constant contact by mail or in person. I visited Bloomington in November 1958. C.T. came to Pittsburgh in June 1959 and gave lectures at the Mellon Institute (later a part of Carnegie Mellon University). In August 1961, C.T. left Indiana University and he and Charlotte moved to Baltimore where he joined The Johns Hopkins University with the title Professor of Rational Mechanics. In September 1961, C.T. visited Pittsburgh again. During the academic year 1962/63, C.T. arranged for me to come to Johns Hopkins University as a visiting professor.
Occasionally, C.T. and I had disagreements on terminology. Perhaps the most interesting concerned what we finally called “The Principle of Material Frame-Indifference”. Here are some excerpts from letters that we exchanged on this subject:
C.T. to W.N., August 1, 1958: I was just planning to write something mentioning your old “isotropy of space”. It seems to me the term “principle of objectivity” shares one of the undesirable features of the old term, namely, it is too vague. I was going to use “the principle of material indifference”. However, it would be bad for there to be three names running around for the same principle. If you find my suggestion appealing, let me know right away; otherwise, I will use your present name.
W.N. to C.T., August 5, 1958: I have switched from “isotropy of space” to “objectivity” because the former suggests only that there are no preferred directions in static space, which is much less than is implied by the principle of objectivity. If “objectivity” means independence of the observer, as I believe it does, then it seems to me that it is the correct term. One could remove some of the vagueness you mentioned by using “principle of objectivity of material properties” (or perhaps, “principle of material objectivity”). If you think that this is advisable, please insert “of material properties” after “principle of objectivity” in the table of contents (section 11), in line 2 of page 3, in the title of section 11 on page 20, and in line 15 from the bottom of page 21. I think that no additional changes are necessary because the other references to the “principle of objectivity” cannot lead to confusion.
C.T. to W.N., August 9, 1958 (handwritten): Dear Noll, I have to confess that my dislike of “objective” is subjective. Psychologists, sociologists, and all sorts of other charlatans and fakers use “objective” when the only sense I can find for what they claim is “nonsensical” or “thoughtless”. Thus I exclude, banish and ostracize “objective” as the opposite of “subjective” in my own usage, but of course there is no reason why this should influence you. Cordial regards, CT
C.T.’s objection did influence me and I am now glad it did. I suggested that we change indifference to frame-indifference. C.T. accepted that and I believe now that the term we finally used expresses the idea better than any of the ones used before. The entire task of writing the NLFT turned out more extensive than was expected, and C.T. called it the “monsterino”. (He called the earlier Classical Field Theories the “monster”.) The monsterino was finally published in 1965. There were many people who helped us with the work by reading parts of the manuscript and offering corrections, critique, and suggestions. They include B.D. Coleman, J. Ericksen, M.E. Gurtin, R. Toupin, K. Zoller, C.-C. Wang, D.C. Leigh, C.C. Hsiao, W. Jauzemis, and A.J.A. Morgan. On June 25, 1961 C.T. wrote to me: I now have an excellent secretary, so that the preparation of the manuscript should be easier.
That helped, too. Clifford Truesdell was very meticulous about attributions. I believe he bent over backwards in my favor when he described, in a footnote to the Introduction to NLFT, how the work was divided between us: Acknowledgment. This treatise, while it covers the entire domain indicated by its title, emphasizes the reorganization of classical mechanics by Noll and his associates. He laid down the outline followed here and wrote the first drafts of most sections in Chapters B, C, and E and of a few in Chapter D. Among the places where he has given new results not published elsewhere, shorter proofs, or major simplifications of older ideas may be mentioned (21 items). The larger part of the text was written by Truesdell, who also took the major share in searching the literature. While Noll revised many of the sections drafted by Truesdell, it is the latter who prepared the final text and must take responsibility for such oversights, crudities, and errors as may remain.
The Nonlinear Field Theories was reprinted in 1992. Here is an excerpt from the preface to the second edition: This volume is a second, corrected edition of The Nonlinear Field Theories of Mechanics, which first appeared as Volume III/3 of the Encyclopaedia of Physics, 1965. Its principal aims were to replace the conceptual, terminological, and notational chaos that existed in the literature of the field by at least a modicum of order and coherence, and second, to describe, or at least to summarize, everything that was both known and worth knowing in the field at the time. Inspecting the literature that has appeared
since then, we conclude that the first aim was achieved to some degree. Many of the concepts, terms, and notations we introduced have become more or less standard, and thus communication among researchers in the field has been eased. On the other hand, some ill-chosen terms are still current. Examples are the use of “configuration” and “deformation” for what we should have called, and now call, “placement” and “transplacement”, respectively. (To classify translations and rotations as deformations clashes too severely with the dictionary meaning of the latter.) We believe that the second aim was largely achieved also. We have found little published before 1965 that should have been included in the treatise but was not. However, a large amount of relevant literature has appeared since 1965, some of it important. As a result, were the treatise to be written today, it should be very different. On p. 12 of the Introduction we stated “. . . we have subordinated detail to importance and, above all, clarity and finality”. We believe now that finality is much more elusive than it seemed at the time. The General theory of material behavior presented in Chapter C, although still useful, can no longer be regarded as the final word. The Principle of Determinism for the Stress stated on p. 56 has only limited scope. It should be replaced by a more inclusive principle, using the concept of state rather than a history of infinite duration, as a basic ingredient. In fact, forcing the theory of materials of the rate type into the general framework of the treatise as is done on p. 95 must now be regarded as artificial at best, and unworkable in general. This difficulty was alluded to in footnote 1 on p. 98 and in the discussion of B. Bernstein’s concept of a material on p. 405. This major conceptual issue was first resolved in 1972 [“A New Mathematical Theory of Simple Materials” by W. Noll, Archive for Rational Mechanics and Analysis, Vol. 48, pp. 
1–50], and then only for simple materials. The new concept of material makes it possible, also, to include theories of plasticity in the general framework, and one can now do much more than “refer the reader to the standard treatises”, as we suggested on p. 11 of the Introduction.
About three years ago, Springer-Verlag sold the right to translate NLFT into Chinese. I recently received copies of the translation.¹
¹ After this paper was written, I was informed by Springer-Verlag that a second reprinting will appear in 2003.
An Appreciation of Clifford Truesdell
JAMES SERRIN
School of Mathematics, University of Minnesota, Minneapolis, MN 55455, U.S.A.
Received 9 June 2003
The following comments are a slightly emended version of a talk delivered at a meeting of the Society for Natural Philosophy in December 1985, together with additional remarks appended in May 2003.
This meeting of the Society for Natural Philosophy has been arranged as a special tribute to its principal founding member, Clifford Truesdell, in recognition of his many contributions to the Society since its beginning at The Johns Hopkins University in March 1963. In this group picture of the participants of the first meeting (see Figure 1) you will see Clifford Truesdell second from the left in the front row, and no doubt you will also recognize a number of other faces, all 22 years younger, and perhaps 22 years wiser, than today. In the intervening years the Society has sponsored 27 additional meetings, holding fast throughout its existence to the original form, to small size, no publications, and the encouragement of quality of research – primarily in rational continuum mechanics and its foundations, though with occasional forays into related mathematical disciplines. And outstanding menus, as well. Consistent maintenance of the original principles of the Society is an achievement due more to Clifford Truesdell than to any other single individual. It is impossible in a few short minutes to do justice to his remarkable and myriad accomplishments, in papers, monographs, memoirs, and books ranging from mathematical sciences, to rational mechanics, natural philosophy, and the history of science. Already in the forties there was clear evidence of this future in several unusual publications, perhaps not today known to everyone here.
31 C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 31–38. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
As a student at Indiana University I was privileged early to see the power of his thought in exceptional lectures which he offered in elasticity and kinetic theory – perfect in content and stately in development – thus, the appearance in 1952 of the seminal paper “The Mechanical Foundations of Elasticity and Fluid Dynamics” came to me not so much as a surprise but rather as a remarkable opportunity to learn more from a great master. Several years before, he had initiated research on the kinematics of vorticity, together with the interrelations of vorticity to thermodynamics. This culminated in his next work, the elegant monograph “The Kinematics of Vorticity”, published by the Indiana University Press but now out of print, and, following without pause, his great paper on the absorption and dispersion
Figure 1. First Meeting of the Society for Natural Philosophy, March 25–26, 1963. The front row left-to-right: L.M. Milne-Thomson, C. Truesdell, B. Coleman, J. Serrin, J.L. Ericksen, H. Markovitz, E. Sternberg, W. Noll, H. Grad, R. Toupin, R. Rivlin. Back row second from the right: Charlotte Truesdell.
of sound according to the viscosity formulae of Navier and Stokes. Altogether one of the most audacious first appearances of a new star in the firmament of mechanics! While time makes it impossible to dwell upon still another quiver in his bow, I must add that it was exactly following these works that he began the three celebrated prefaces for the Opera Omnia of Leonhard Euler – one each on fluid mechanics, acoustics, and elasticity. These established his fame as the leading modern interpreter of Eighteenth Century Mechanics, and of course brought to our attention, as nothing else, the contributions of Euler to our science. In 1960 Truesdell and Richard Toupin discovered the important entropy relations now known as the Clausius–Duhem inequality, specifically including in it the crucial external heat supply term. This development made possible the later theory of Coleman and Noll, providing thereby, for the first time, a rational and rigorous approach to the foundational aspect of constitutive restrictions. Moreover, as appeared in retrospect many years following, the radiation supply term provided a vital link between classical thermodynamics and modern theory. Truesdell’s work on the Clausius–Duhem inequality was severely criticized by those whose understanding of thermodynamics had been formed without benefit from study of the masters of the subject. The criticisms could not be fully answered, ironically, because the foundations of thermodynamics were themselves open to question. In any case, this neglected and nettle-choked thicket of physics became one of the principal areas of Truesdell’s later work: of course his detractors never dreamed that anyone would set foot in those Augean stables, nor of course did they have the courage to do so themselves. Nevertheless, the difficulties of thermodynamics were enough to require many further years of thought before the ideas were ripe for presentation, and other projects loomed immediately.
Two monumental volumes for the Handbuch der Physik appeared in the early sixties, “The Classical Field Theories” with Richard Toupin, and “Non-linear Field Theories of Mechanics” with Walter Noll. It is almost impossible to know how to approach these volumes. Their weight of scholarship and perception, their very completeness, daunt the mere mortal who opens them. Even so, the rich veins of ore were, individually and internationally, mined by the many. Here one may also recall the impress which Truesdell’s views had on the seven companion volumes of the Handbuch der Physik; for the most part he selected not only the general subject matter, but also the corresponding authors. Even now, thirty years later, many of these volumes are as fresh as ever – required reading for the acolyte. After 1964 thermodynamics and kinetic theory began to occupy Clifford more heavily than ever before, as if only the most monumental problems were now sufficient of challenge. In 1957 he had published a first paper on chemical mixtures, followed by a second paper in 1968 which inspired much further work on this most difficult of subjects. Then came in 1969 the polemical masterpiece “Rational Thermodynamics”, summarizing his thoughts to that time and forming a springboard (I almost wrote “launching pad” but rejected this) for future work.
Figure 2. Journal of Rational Mechanics and Analysis: Volume 1, Number 1.
If I may digress a moment, let me recall that this scientific activity was paralleled by continuing effort to re-establish both Rational Mechanics and the methods of Natural Philosophy as integral parts of physical science. Beyond the formation of the Society for Natural Philosophy, there is the founding of the Journal of Rational Mechanics and Analysis in 1952 (see Figure 2), its reappearance as the Archive for Rational Mechanics and Analysis in 1957, and the inauguration of the Archive for History of Exact Sciences in 1960, to name his major and most longstanding contributions in this direction. Galileo had said “Nature first made things in her own way, and then made human reason skillful enough to understand – but only by hard work – some parts of her secrets.” Clifford saw well that a combination of mathematical analysis and historical sense was the key to this understanding, and he emphasized always, as Dedekind put it, “What is provable should not be believed in science without proof.” In these endeavors his practice emulated his preachment to the Society at an earlier meeting. “The aim of rational mechanics is to provide a sound conceptual framework for description of nature as human senses perceive it and to create patterns of systematic inquiry and inference such as to order and interrelate the phenomena thereby conceived.” Deeper understanding was to come through the successive works of generations, together with our experience of materials themselves.
I would like to bring you the message that Clifford’s evangelism succeeded beyond his expectations, but must sadly report that the world is still peopled by pagans (and worse) – a theme which Clifford of course has voiced frequently in his scientific essays. In any case I herewith dedicate for his perusal two sentences culled at random from today’s scientific journalese – the first “If space has nine dimensions and matter is strings then the mysteries of the universe may soon come clear” (so help me God), and the second, “Among enhancement techniques the National Research Council will investigate are sleep learning, group cohesion technologies, accelerated intelligencing and parapsychological stress.” But one must put aside the ridiculous in the study of nature. Returning then to the question of classical thermodynamics, one observes from the outset that this is the sole discipline of classical physics where serious argument about fundamentals remains. Even while moguls of science grant entropy and energy all-pervasive influence, it is impossible to obtain agreement from these grandees on the two laws of nature from which these concepts derive. As recently as 1970 the subject was innocent of mathematical structure, and the lack of precision showed throughout. The introductory essay in “Rational Thermodynamics”, written in 1968, tells this story faultlessly. Truesdell attacked this miasmic swamp with vital energy, and brought others as well to the fray. The outcome is now clear; there is no longer doubt that energy and entropy can be set upon sure and credible foundations, vindicating Clifford’s intuition when he proposed the Clausius–Duhem inequality in 1960. His own part of this research, carried out from 1974–1984, included in particular a rigorous axiomatization of reversible thermodynamics. There is also in this work a corollary result, a beautiful pearl, which I think deserves to be better known than it is. 
This is his discovery that, for reversible systems, the law of thermodynamics can be stated without recourse to the mechanical equivalent of heat, indeed in such a way that the existence of this mechanical equivalent can be deduced, much as the existence of absolute temperature can be inferred from the second law. Generalizations of this to arbitrary thermal systems have been given by later writers, leading me to predict that ultimately this method of viewing the First Law will become a new paradigm of physics. Books issue in continuing stream from “Il Palazzetto”: “Introduction to Rational Elasticity” (with Wang), “The Tragicomical History of Thermodynamics, 1822–1854”, “A First Course in Rational Mechanics”, “An Idiot’s Fugitive Essays on Science”, “Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas” (with Muncaster), the last his magnum opus on a subject which has been with him for 40 years, and even papers on antique furniture and the routes to Hell.
Note added, March 2003. This statement needs some clarification: The context should be restricted to phenomenological thermodynamics, in which heat and work are the primitive elements, and where also there are given differential forms defining the heat and work associated with the processes in question. Beyond this, there are systems where one may credibly define energy and entropy, though such definitions may restrict the set of available processes of the systems.
What
more proof is needed for the converse of the famous epigram of Leonardo da Vinci: “Quiet water becomes stagnant. Iron rusts from disuse. So doth inactivity sap the vigor of the mind.” Clifford Truesdell inspired a generation to study Natural Philosophy, and persuaded the learned that Athena is not dead. He has repeatedly and deservedly been honored for his many accomplishments. There is the Bingham Medal of the Society of Rheology which he received in 1963, the Modesto Panetti Gold Medal and Prize from the Turin Academy of Sciences in 1967, and the George David Birkhoff Award for Applied Mathematics from the American Mathematical Society in 1978, as well as Foreign Membership in the Accademia Nazionale dei Lincei. I dare say that the Society for Natural Philosophy as well would wish to present formally to him its highest constitutional award: the great regard in which he is held, the high esteem he is owed, and our continuing best wishes for the furtherance of the scientific program in rational mechanics which he initiated at the beginning of his career. The following additional words were appended in May, 2003, drawn from a lecture given in November, 2000, at Pisa at the Meeting in memory of Clifford Truesdell. Generous in praise, criticism, judgement and friendship, none bestowed lightly: ordinary words do not suffice for Clifford Truesdell. He had great vision accompanied by exceptional accomplishments, at the same time supremely gifted, prodigiously learned and with almost overwhelming energy. He lived life not just to the full, but to overflowing in every activity he undertook: his prodigal lifestyle, his many volumes of correspondence, his historical research and essays, his many books, both scientific and philosophical, and his exceptional editorial activities. He was lavish in his help for young researchers and was happy with the success of those who followed in the directions he pioneered. 
He chose his friends well, and from the group around him and around the Archive some of the greatest work of mechanics and thermomechanics in the last century resulted. Theodore Roosevelt’s old fashioned words apply still to Clifford and to his goals, “Far better it is to dare mighty things, to win glorious triumphs, even though checkered by failure, than to take rank with these poor spirits who neither enjoy nor suffer much, because they live in the gray twilight that knows neither victory nor defeat.” All the same, it is necessary to add that Clifford was not without the faults of the gods – perhaps he was aware of this – even allowing this aspect of his character to be a full part of his personality. Clifford could become upset on occasion, not always fairly; there were exasperating times when he became all too human. Let me mention one moment engraved on my memory. On one occasion when my wife Barbara and I were visiting Clifford and Charlotte, after the dinner, while he and I were standing at the top of the magnificent divided stairway at the Palazzetto, the subject of absolute temperature in classical thermodynamics came up. After all at that time we were both violently infected by the thermodynamic
plague. He would not accept my view of the “hotness manifold”, which perhaps unduly boldly I had ventured to discuss. While always expressed with politeness, the differing views clearly upset him – he was finally led to utter the ultimate put down: “The problem with you, dear James, is that you never understood Bernard Coleman.” Of course, that is no doubt true, . . . .
Let me add something about the Archive (Archive for Rational Mechanics and Analysis). I will ramble a bit, because the main aspect of the Archive – what it is and what it has meant – is familiar. In the years between 1946 and 1960 there was a wonderful period of optimistic activity in the United States, based on the successful ending of the Great War against Germany and Japan, and also on the need to supply many of the requirements of life which had not been met since the beginning of the Great Depression in 1930. Thus a period of optimism was just beginning and American mathematics was coming to maturity. Before that time, American higher mathematics had been concentrated almost entirely in the universities of the East Coast – Harvard, MIT, Yale, Columbia and Princeton; and otherwise only at the University of Chicago and at Berkeley. There had been only 500 members of the American Mathematical Society, when now there are 10,000. A year’s collection of mathematical reviews was less than half what we now receive in one month. This small world was to change dramatically after the war, with the coming of European mathematicians, mostly German, who had escaped from Hitler’s tyranny or from the devastation of Europe. They were welcomed in America, to aid in the war as scientists, or as part of the generosity towards the world which Americans in those days so strongly showed. It was my good fortune to go to Indiana University at just this time, to study with a newly formed but impressive faculty, including among others Max Zorn, Eberhard Hopf and David Gilbarg.
Clifford Truesdell arrived in Bloomington in 1950, one year before I left, brought there to solidify the study of continuum mechanics. The students were completely in awe of him, and indeed his lectures were amazing tours de force, which those who knew him later can easily imagine. That was the setting. Now let me turn specifically to the Archive, at first in Clifford’s words. “A mathematical journal of a different kind had to arise. It was only a question of persons, place, and time. In those circumstances, T.Y. Thomas, the Head of the Department, asked me to join him in founding a journal to serve the then growing fields of mathematical continuum mechanics and the analysis of nonlinear partial differential equations.” “Beyond the scope and editorial policy, there was some discussion about the unusual title ‘Rational Mechanics’, but in the end it was adopted because Newton had introduced it in his Principia and had not only exemplified it but defined it.” Even the form of the cover of the new journal occupied his mind – for what it is worth, my small contribution in this direction was to choose the color, red instead
of blue, to be used as the contrasting ink on the cover. Thus the JOURNAL of Rational Mechanics and Analysis was born, complete with Latin inscription. Number 1 of Volume 1 of the Journal of Rational Mechanics and Analysis appeared in January, 1952. Clifford left Indiana in 1957 for Johns Hopkins, and the old Journal became the new Journal of Mathematics and Mechanics (since that time renamed again as the Indiana University Mathematics Journal). At the same time, a new periodical emerged from the ashes, the ARCHIVE for Rational Mechanics and Analysis. This time by the way, there was no problem in choosing a color for the cover, the publisher Springer-Verlag being inextricably committed to yellow, and yellow only. Both journals have of course flourished and are still here, a tribute to the far-seeing eye and organizational ability of Truesdell. The early years of the Archive saw a great revival of mechanics as a rational doctrine, with the contributions of Stuart Antman, Bernard Coleman, Jerald Ericksen, Roger Fosdick, Morton Gurtin, Daniel Joseph, Victor Mizel, Walter Noll, David Owen, Richard Toupin, as well as the celebrated analysts, Antonio Ambrosetti, Haïm Brezis, Constantine Dafermos, P.L. Lions, J.B. McLeod, Paul Rabinowitz, and others. The Archive had become necessary for every fine scientific library. It was thus fitting and natural that when Clifford Truesdell retired from the editorship in 1989, Stuart Antman was chosen to follow in the same tradition. The Archive requires special qualities of mind of its main editors – taste in subject and style, judgement of scientific merit, dedication to an ideal, sheer plodding effort – together with the ability to see a union between mathematical analysis and the physical or rational world. These are talents which Stu had in abundance – and, during the time I was co-editor of the Archive, talents which I could draw on from Clifford Truesdell when required. 
The focus of the Archive has turned gradually in recent years, in part due to changed research directions in rational mechanics and analysis, a change which affected both contributors and Members of the Editorial Board as well as the response of the chief Editors. Clifford viewed mathematics and mechanics as a type of art, as part of our living culture. Here are the words of a famous Finnish composer – which seem equally relevant to the life of mathematics: “It is my belief that art is great if, at some moment, it catches ‘a glimpse of eternity through the window of time’ – if the experience is one which we might call ‘the oceanic feeling’. This to my mind, is the only true justification for all art. All else is of secondary importance.” (Einojuhani Rautavaara)
Clifford A. Truesdell’s Contributions to the Euler and the Bernoulli Edition
D. SPEISER
Université Catholique de Louvain, Louvain-la-Neuve, Belgium
Present address: Bromhübelweg 5, CH - 4144 Arlesheim, Switzerland.
Received 2 March 2003
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 39–53. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
Allow me to recall that the young scientist, who had been formed at Caltech and had won acclaim through a series of articles on special subjects, soon acquired extraordinary fame through his comprehensive Handbuch articles, the second of which was written with Walter Noll, where they reformulated the foundations of the mechanics of continua. The mechanics of continua, which before had appeared in all textbooks as a conglomerate, even as a sandhill of sometimes isolated subjects between which there was no connection, logical or mathematical, became now connected and unified by a powerful system of axioms in the way Hilbert had postulated at the beginning of the century. This unification of the entire field of mechanics became possible, among other things, through the sharp distinction between dynamical principles and constitutive equations. The latter formulate the special properties of the materials only – something that cannot be deduced in classical mechanics, but must be left to quantum mechanics. In classical mechanics such a property must be formulated by an additional hypothesis. Noll and Truesdell then showed that this distinction could be traced back to Cauchy and from Cauchy to Euler and to Jacob Bernoulli. And thereby we have now stepped onto the domain of the history of science.
But here we must now pause for a moment. While it is obvious that having worked out a systematic organization of mechanics is indeed an incomparable preparation for analyzing, ordering and understanding also the historical discoveries and the various processes of the development of science, one cannot stress enough, on the other hand, that science and history are two radically different endeavors of the human spirit. The essence of science lies in its property of being systematic since science ultimately always wishes to grasp the laws of nature, which it strives to uncover and to formulate in the simplest and most transparent form. But human history, and thus
also the history of science, is the complete opposite of this: it is totally unsystematic, always complex and never simple nor transparent. So, for writing the history of science two different, indeed totally opposite, endeavors must simultaneously be at work in the same man. Thus from the same man two almost irreconcilable gifts are requested – gifts from his intellect as well as from his heart. This confrontation, one might say “clash”, of the endeavor to systematize and to extract the universally valid from the documents which the historian finds before him, with the aim to determine the conditions under which this, always unique, discovery was made, under very special circumstances and by one distinct individual different from all others, and then to interpret its significance for the development of science, is the character of the history of science. It is its very essence, even its unique prerogative and also its characteristic charm. Beyond and above all of the interesting facts which we learn and insights which we gain about the progress that we are allowed to observe and learn to appreciate, it is precisely this constant confrontation which fascinates the reader of Truesdell’s articles and books, and there perhaps more so than for any other historian of science I know. No wonder then that the interest of such a scientist-historian is directed especially towards the great systematizers and unifiers, those who at the end of previous long developments could lay the definite foundations of a whole field, like Newton, Euler, Cauchy, and Maxwell, or prepare new ways like Hilbert did for quantum mechanics. But as we shall also see, Truesdell had a special admiration also for Jacob Bernoulli. Here I shall restrict myself almost exclusively to what Truesdell did for the Euler Edition, which was the cause of my first contacts with him, and later, which is a very different story, for the Bernoulli Edition.
2. Truesdell’s Introductions to Euler’s Works on Hydrodynamics
Truesdell was invited in the fifties by the mathematician Andreas Speiser, then General Editor of Euler’s Opera Omnia, to edit volumes 12 and 13, series II, which contain Euler’s writings on hydrodynamics. The great systematizer Truesdell was, I need remind no one, cut out for that job much more than Speiser could have known. Indeed, Truesdell’s introductions to the two volumes II.12 and 13 offer much more than what their titles promise and what Speiser possibly could have hoped for. He traced the subject back to its origins, Archimedes for the hydrostatic and Torricelli for the hydrodynamical part, and then presented the development of these branches of science in detail, from the second half of the 17th century throughout the whole 18th century and beyond. Truesdell presented this scientific development, where many strands lived for a long time their separate lives. Later they became more and more intertwined and now form whole domains, but in the unplanned and rationally never fully explicable way that history, which does not care much for its future historians, always proceeds. And eventually he showed how these separate domains came together and why they had to be formulated by a
new mathematical language – a language the very creation of which was stimulated largely by this development. In these two introductions we can see how the systematic understanding of a science from today’s point of view not only helps to unravel the various strands, but is indeed also indispensable for showing what exactly each strand and each domain has contributed, and thus for doing justice to each of them. Here it is not so much the experimental versus the theoretical progress that must be balanced; for, as Truesdell observed, because of the sudden enormous progress of mathematics during this period, theory was then mostly far ahead. Incidentally, this is perhaps the secret reason why this period was always, and still is today, so much neglected by the historians of physics. Rather, the difficulty is to attribute proper credit not only to the stimuli that come from new, now suddenly accessible, problems of mechanics on the one hand but also on the other to the impulses due to the analytical and, to a minor degree, geometric discoveries made around the middle of the century. First we have Newton’s heritage, deposited in the second book of the Principia. It is well known that precisely here are some of the book’s most difficult, even obscure passages. Truesdell’s summary of it and his evaluations are extremely succinct; they fill merely two pages. Even so they are a very valuable help to the reader. This holds true especially for his careful separation of what Newton had derived by starting with the corpuscular view from those results that he had derived using a continuum view and, finally, from those results for which he used both kinds of assumptions. The only regret the reader feels is that this whole section is so short. And then we find Truesdell’s examination of Daniel Bernoulli’s Hydrodynamica.
This book contains, albeit in splendid isolation from the rest, the celebrated excursion into the kinetic theory of gases, as we say today, which Truesdell places in the proper historical perspective. Here, Bernoulli was over a century ahead of his time. But then in this book hydrostatics is unified with hydraulics. This was the first of the two big unifications of theories during this period. It was achieved through the celebrated Bernoulli equation, which, as Truesdell carefully explains, was by no means written by Bernoulli in the form in which we know it today. Indeed, not less than four major discoveries were needed before its transparent form, due to Euler, became possible. The first of the four discoveries just mentioned was Johann I Bernoulli’s introduction of Newton’s concept of force into hydrodynamics, if only in a one-dimensional setting. It was Truesdell who rediscovered this fact, acknowledged already by Euler, thus clearing Johann from the reproach of having plagiarized his own son. Later Szabó confirmed Truesdell’s vindication in even stronger terms. And Truesdell underlined that the roots of Euler’s hydrodynamics were to be found in Johann’s rather than in Daniel’s work. The second important discovery was Euler’s deeply penetrating understanding of what we call today an “inertial system” and, in particular, the fact that only in such a frame can one expect the laws of nature to have a simple form, a form that
one can guess through mathematical or physical imagination. Truesdell calls this paper E 177 “a research, that will change the whole face of mechanics”. This is the moment to mention the recent book by the Italian mathematician and historian Giulio Maltese in Rome: La storia di “F = ma”. This book deals with the development of particle mechanics from Newton to Euler and it is perhaps the first one to profit fully from Truesdell’s work. The third important discovery was Euler’s creation of the concept of “inner pressure”, as opposed to Stevin’s “outer pressure”. This new concept was the key to his equations. The fourth important step lies on the mathematical side of this development. It was the formulation of a field description due to d’Alembert, a discovery that again was first pointed out by Truesdell. And on the mathematical side, one can, of course, directly observe the greatest step due to the mathematics of the 18th century: the systematic development of partial differential equations, due again largely to Euler and to d’Alembert in their work on hydrodynamics. The interplay between these two lines of research was masterfully pursued and presented by Truesdell. If we think how loath scientists usually are to recognize how much their own field owes to the progress of other areas, Truesdell’s presentation is doubly remarkable and welcome. Thus the ingredients were together for the equations of hydrodynamics, published in 1752 by Euler in his paper E 258. Thereby, hydrodynamics and aerodynamics were unified into one dynamical theory, which can be applied equally to either of them by selecting a special constitutive equation. These equations mark more than only a period in the history of science: they represent the first field theory in physics! The Bernoulli equation appears now in the form we know it today, namely as an integral of the equations of motion, valid only under certain circumstances.
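For readers who want to see in what sense the Bernoulli equation is "an integral of the equations of motion", the following sketch may help. It uses modern vector notation, which is neither Euler's own nor quoted from Truesdell's commentary, and it assumes one particular set of "certain circumstances": steady flow, constant density $\rho$, and a body force derived from a potential $\Phi$. The symbols $\mathbf{v}$, $p$, $\rho$ and $\Phi$ are introduced here for illustration only.

```latex
% Euler's equations for a perfect fluid: balance of momentum and of mass
\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v}
   = -\frac{1}{\rho}\,\nabla p - \nabla\Phi ,
\qquad
\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\,\mathbf{v}) = 0 .

% Vector identity for the convective term
(\mathbf{v}\cdot\nabla)\,\mathbf{v}
   = \nabla\!\left(\tfrac{1}{2}\,|\mathbf{v}|^{2}\right)
     - \mathbf{v}\times(\nabla\times\mathbf{v}) .

% Assumed circumstances: steady flow (\partial_t \mathbf{v} = 0), constant \rho
\nabla\!\left(\tfrac{1}{2}\,|\mathbf{v}|^{2} + \frac{p}{\rho} + \Phi\right)
   = \mathbf{v}\times(\nabla\times\mathbf{v}) .

% Taking the scalar product with \mathbf{v} removes the right-hand side
\mathbf{v}\cdot\nabla\!\left(\tfrac{1}{2}\,|\mathbf{v}|^{2} + \frac{p}{\rho} + \Phi\right) = 0
\quad\Longrightarrow\quad
\tfrac{1}{2}\,|\mathbf{v}|^{2} + \frac{p}{\rho} + \Phi
   = \text{const along each streamline}.
```

If in addition the flow is irrotational, the right-hand side of the third relation vanishes identically and the constant is the same throughout the fluid; dropping any of the assumptions changes the form of the integral.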
To this culmination point in the history of mechanics there corresponds a culmination in Truesdell’s narratives: his commentaries on Euler’s two comprehensive presentations of hydrodynamics. The first of these consisted of three papers written in the 1750’s in French, and the second was written in the 1760’s in Latin. The latter is part of Euler’s great project, conceived in 1734, to present the entire science of mechanics in 6 volumes. The first of Euler’s presentations reflects the moment of the discovery when it was still fresh in his mind, and the second, a more polished one, shows the ambition to present everything as clearly as possible, for beginners. I mention here only Truesdell’s comments on Euler’s paper E 331: “On the motion of fluids arising from different degrees of heat”, written in 1764. This paper belongs to the prehistory of meteorology. Euler pursued a line taken up 150 years later by Milankovitch, but Truesdell pointed out, and again he was here the first to do this, what this paper brought to thermodynamics. But he by no means restricted his account to the leading figures of history; even if they do receive the lion’s share, others are not overlooked. Thus, the work of Simon Stevin, one of those who are always underestimated, eloquently received its due. Truesdell drew the reader’s attention also to Jacob Hermann, Jacob Bernoulli’s most important disciple next to his brother Johann. It will be an important task of the Bernoulli Edition to scrutinize Hermann’s works for important contributions. In the introduction to the second volume one finds Truesdell’s account of the development of acoustics. Besides Euler’s, two names stand out here: Daniel Bernoulli and Lagrange. I prefer to mention Bernoulli’s work later in connection with the introduction to the history of elasticity. Of Lagrange’s work, Truesdell gave a detailed account of his correspondence with Euler as well as of Lagrange’s own published papers: he wrote that “the velocity-potential theorem and the impulse theorem are first rate creative works and LAGRANGE’s greatest discoveries in fluid dynamics.” For organizing this vast amount of material, as one can appreciate, an intimate acquaintance with the sources is not enough. What was essential here was a deep insight into the science of mechanics itself – an insight that could be gained only from its most modern formulation. Such an understanding was essential to provide a systematic and powerful enough reference system for organizing adequately this enormous body of material. It is the combination of both of these qualities, which Truesdell possessed to the highest degree, which lends relief to his presentation and gives the narrative, at its supreme moments, a dramatic quality. But let us return to Euler’s equations and their two great presentations. From Euler these equations came to Cauchy, who, with the help of his new creation – the stress tensor – laid the foundation of the theory of elasticity in its definite form. During the same period the field idea, through Euler’s “Lettres à une princesse”, reached Faraday, who quoted them frequently in his journal, and from Cauchy and Faraday the concept of a field came to Maxwell.
One may fairly say that either of Euler’s great works, especially the Latin one, if supplemented with some of Truesdell’s comments which place it in a modern perspective, is, or alas, as I must rather say, would still today be the best introduction to hydrodynamics at the level of high school and, even more so, at the level of university teaching. The presentation of the idea of what a law of nature is, how it works and what use is made of it under special circumstances, including technical applications, has hardly ever been surpassed. This presentation is never clouded by mere formalisms nor by going into the details of technological applications which, more often than not, just obscure the basic idea. By Euler and by Truesdell one is led from one essential point to the next with only the absolutely necessary excursions until one reaches the summit.
3. Truesdell’s Introduction to Euler’s Work on Elasticity
“The rational mechanics of flexible or elastic bodies 1638–1788”, as the title implies, reviews its subject from Galileo’s Discorsi to Lagrange’s Méchanique Analitique. But in fact it opens with a survey of the prehistory of the whole field in early Greek antiquity and also deals with the Middle Ages. Here we find such
Truesdellian pearls as the following first sentence: “Duhem’s great historical studies showed that the apparent darkness of mediaeval physics is but darkness of our knowledge of it.” Indeed this was a call to fill an immense gap in science. I will mention here only one glaring hole: our ignorance of mediaeval technology. How, exactly, did the mediaeval architects and engineers proceed with the construction of their enormous cathedrals, especially of the towers, which surpass in height, refinement and daring everything that the Greeks and even the Romans had achieved? Truesdell’s clarion call ought to remind all historians of science as well as of the arts that they should direct their attention much more to the Middle Ages than they have done so far. He mentions explicitly the elusive Jordanus de Nemore and notes: “The only writing of value on deformable bodies that I have been able to see is the fourth book of JORDAN DE NEMORE’s Theory of Weight (13th century), and remarkable it is, Western in spirit, ambitious beyond anything in the Greek or Arab tradition. The seventeen propositions on fluid flow, resistance, fracture and elasticity are all original.” And the next section contains an evaluation of Leonardo’s achievements. The whole account fills the separate volume II, 11b of 400 pages that contains the introduction to the two volumes II, 10 and 11! An immense achievement: theories, experiments, simple historical facts, etc. are not only enumerated but thoroughly “digested”; that is, their scientific content is explained and imbedded into the historical development. To satisfy both requirements makes, of course, heavy demands on the writer. Add to this that the account is based not only on the published writings, but also on an enormous number of letters, for instance, on the quite complex epistolary exchanges between the Bernoullis and Euler, which Truesdell searched through. From these we can get an idea of the Herculean labor that he has gone through.
And besides theories, we also find experiments discussed – experiments done, among others, by Musschenbroek, Giordano Riccati and Chladni, and the account reaches far into technology and engineering. Since the field of elasticity has a more complex structure than hydrodynamics, its history presents more isolated strands and subdomains. Thus, a lucid arrangement is here much more difficult to attain. Here I must restrict myself to a small part of Truesdell’s accounts of elasticity proper and the notion of flexibility. I shall begin with the latter. While the modern theories of flexibility began with Galileo’s and Mersenne’s investigations of the string, it was only the investigation of static problems like the hanging cord and the suspension bridge that led science to ask for mechanical explanations. In 1712, Brook Taylor computed the fundamental frequency of the string. While Daniel Bernoulli, as we learn from Truesdell, had a bit later the theory of the overtones in his hand, he did not publish it. Instead he studied the double-, the triple-, the multiple-pendulum, and asked the following question, characteristic of his whole research: under what condition is the oscillation stationary? His answer was: a pendulum with n masses has exactly n stationary oscillations, and he computed up to n = 5 its n frequencies as a function of the various masses and lengths.
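The statement that a pendulum with n masses has exactly n stationary oscillations can be checked numerically in a simplified case. The sketch below is a modern illustration, not a reconstruction of Bernoulli's own computation: it assumes equal masses on equal massless links and small swings, whereas Bernoulli treated general masses and lengths. The function names, the parameters `g` and `l`, and the Sturm-sequence bisection used to find the eigenvalues are all choices made here for the example.

```python
import math


def pendulum_matrix(n):
    """Return (diag, off) of the symmetric tridiagonal matrix A for n equal
    masses on n equal massless links, linearized for small swings.

    With horizontal displacements x_0..x_{n-1} (top to bottom) the equations
    of motion read  x'' = -(g/l) * A x.  The link above mass i carries the
    weight of the (n - i) masses hanging from it.
    """
    diag = [float(2 * (n - i) - 1) for i in range(n)]   # tension above + tension below
    off = [-float(n - i - 1) for i in range(n - 1)]     # minus the tension of the shared link
    return diag, off


def count_below(diag, off, x):
    """Number of eigenvalues of the tridiagonal matrix that are < x
    (Sturm-sequence count: negative pivots of A - x*I)."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        e2 = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - x - e2 / q
        if q == 0.0:
            q = 1e-300          # nudge an exactly zero pivot to avoid dividing by zero
        if q < 0.0:
            count += 1
    return count


def eigenvalues(diag, off):
    """All eigenvalues in ascending order, each located by bisection."""
    # Gershgorin-type bound: every eigenvalue lies below this value.
    upper = max(diag) + 2.0 * max((abs(e) for e in off), default=0.0) + 1.0
    values = []
    for k in range(len(diag)):
        lo, hi = 0.0, upper     # A is positive definite, so 0 is a lower bound
        for _ in range(200):
            mid = 0.5 * (lo + hi)
            if count_below(diag, off, mid) > k:
                hi = mid
            else:
                lo = mid
        values.append(0.5 * (lo + hi))
    return values


def mode_frequencies(n, g=9.81, l=1.0):
    """Angular frequencies (rad/s) of the n stationary oscillations."""
    diag, off = pendulum_matrix(n)
    return [math.sqrt(lam * g / l) for lam in eigenvalues(diag, off)]


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, [round(w, 4) for w in mode_frequencies(n)])
```

Under these assumptions the double pendulum has squared frequencies (g/l)(2 ± √2), which the routine reproduces, and for each n up to 5 it returns n distinct frequencies. Unequal masses or lengths would change the entries of the matrix but not the count of modes.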
Turning next to the hanging cord, by going to the limit he proved that it has infinitely many stationary oscillations given by the zeros of a new transcendental function, today denoted as the Bessel function $J_0(x)$. A little later Johann I Bernoulli formulated the problem of the motion of several particles attached to a string. Truesdell saw that in this field too he was the first to use Newton’s equation of motion. Then d’Alembert, who had studied Daniel Bernoulli’s two papers, published in the Traité de Dynamique in 1743 a partial differential equation for the hanging cord, the first partial differential equation in mechanics. Truesdell called this discovery “a turning point in the whole history of mechanics”. Then, in 1746, in his paper “Recherches sur la courbe que forme une corde tendue mise en vibration”, d’Alembert presented the differential equation of the string, this time together with the solution, and a short time later Euler published the solution found in a different way. With these discoveries began the last of the big scientific polemics of the 18th century. Truesdell was highly critical of the polemic itself, which he called “deplorable”. He commented that it “confirms the principle that ever the greatest quantity of paper is smeared over with the dullest matter.” But this “great quantity” was searched through by him with great care, and the questions at issue were explained by him incisively. In this triangular struggle both d’Alembert and Euler maintained that only the partial differential equation would yield all solutions. Their quarrel concerned the class of functions that are admissible as solutions. While d’Alembert tenaciously, but erroneously, maintained that only the functions today called analytic can serve as solutions of a mechanical problem, Euler admitted a much larger class, the class of piecewise smooth functions.
Hereby he achieved in Truesdell’s words “the greatest advance of scientific methodology in the whole century”, because it contradicted the Leibniz postulate that in mechanics functions must be analytic, a postulate which, according to Truesdell, had not been contradicted by anyone, not even by Newton. Bernoulli, however, claimed that the trigonometric series would equally well yield all solutions. As he had overlooked the arbitrary phases he was wrong, as we know today. The approach of his adversaries carried the day, but 250 years later we can see that Bernoulli had in fact solved the first finite and infinite eigenvalue problems, which occupy now, after the past 75 years, the center of quantum mechanics. Thus are the meanderings of the developments of science! All these problems, however, were merely 1-dimensional ones. Only one 2-dimensional problem from the field of flexibility was solved during this period: Euler discovered in 1759 the equation of the drum. The elaboration of the theory of elastic bodies moved during the same period on very different lines, especially with respect to the mathematics involved. Truesdell traced its modern development back to Beeckman and to Galileo. In the Discorsi, Galileo had asked: what is the proportion between the minimal weight $P_l$ needed to break a beam simply by elongation and the minimal weight $P_t$ needed to break it transversally when one of its ends is clamped into a wall? Galileo’s prediction was $P_l : P_t = (b : a)\,N$, where a and b are the length and breadth of the beam, respectively, and N is a numerical factor. By assuming a special model he then could compute N = 1/2. Mariotte, who tried to confirm Galileo’s predictions experimentally, found N = 1/4 rather than 1/2, and he wrote his results to Leibniz. Leibniz suggested in 1684 that one should take into account also the energy needed to bend the beam before it breaks. Using Hooke’s law he found N = 1/3. According to Truesdell this was the first computation which took account of the dilation of fibres. A few years later in 1687 Jacob Bernoulli asked Leibniz, in a letter, for explanations concerning his new calculus and in the same letter he included the results of his own experiments which in certain cases clearly disproved Hooke’s law. Leibniz, because of his absence on a trip, replied three years later. He suggested that Bernoulli should determine the exact form of a beam that is bent by a weight. Bernoulli at once set out to work on this problem. And indeed by using the principle of the balance of moments of forces and his own “golden theorem” – his formula for the radius of curvature – he found the solution in the form of an integral which contained an arbitrary function depending upon the constitutive relation between the stress and the strain. Here, in 1692, Bernoulli recognized the distinction between dynamical principle and constitutive equation, for his experiments had convinced him that there was no universal law valid for all materials. In the simplest case, namely when Hooke’s law is assumed, the solution is given by the famous elliptic integral, discovered and discussed by him in this paper. Of his theory of the bent beam Truesdell wrote (p.
96): “the deepest and most difficult problem yet to be solved in mechanics, is his alone.” Truesdell pursued Bernoulli’s later investigations on this problem, especially on the location of the neutral fibre. It ended, as Antoine Parent showed, with a failure. Truesdell remarked: “To the ironies and disappointments which filled [his] life must be added that while he originated or assembled all the apparatus sufficient to put [his final equation] on firm ground, he failed to do so, failed because his attempt was on too grand a scale.” In fact the problem of the neutral fibre seems still today not to be solved in full generality. Truesdell’s evaluation of Jacob Bernoulli’s achievements is: “In our epoch for study, 1638–1788, but one other, Euler, is to build himself a like monument in our subject.” On the other hand, it is characteristic of Truesdell that he devoted a full section to Parent, whom he rescues from near oblivion by showing that “Parent was the first to apply statical principles correctly to the tensions of the fibres of a beam, and that he recognized the existence of shearing stress.” These are no small merits, indeed! The next step in this development was taken by Daniel Bernoulli. He knew that Euler was working on a book on variational calculus, and suggested to him a minimum principle for the potential elastic energy stored in a curved beam. Euler immediately worked out its consequences which he annexed as the first appendix to his Methodus Inveniendi Lineas Curvas, where he derived a multitude of new results. Truesdell commented also on Euler’s further work in this field, e.g., his discovery of the shear force and Coulomb’s discovery of the shear stress. In this field all correctly solved problems were, again, 1-dimensional. Even the problem of the oscillations of a massive plate was missed by the second Jacob Bernoulli, if only, as D.O. Mathùna showed, by a hair’s breadth! Truesdell was here a bit severe, for the definitive solution was given by Lagrange only more than 30 years later. Especially impressive is Truesdell’s “modern evaluation”, which fills the last 10 pages of the book. He divides the task into three parts: the evaluation of Analysis, Geometry and Mechanics. Who else could have dared to evaluate three basically so different histories? Even a careful and advised reader, I suspect, will discover in these few pages, here and there, something that he believed he had well understood but of which he had in fact failed to grasp a fundamental aspect. I repeat here the first sentences of the first summary. The triumphant lines show Truesdell’s rightful pride in his own beloved science: Rational Mechanics. He wrote: “Prior to 1730, researches on continuum mechanics applied mathematical techniques already developed in other subjects, notably in geometry and in the mechanics of point masses. Starting with the research on vibrating systems by DANIEL BERNOULLI and EULER, the situation was completely inverted.
From then on until the end of the century, continuum mechanics gave rise to all the major new problems of analysis.” On the two last pages of the book Truesdell asked why the foundations to a complete theory of elasticity escaped this period and writes: “Neither physical intuition nor experiment was what was needed here; rather, as both E ULER and C HLADNI said, it was want of differential geometry that blocked the way to theories of deformable surfaces and solids”. And after mentioning that Euler had introduced all elements of the strain tensor in a paper on hydrodynamics, he notes on the very last page: “In surveying all these brilliant individual achievements . . . , we are driven to ask why, when Euler had succeeded in 1752 in creating a general theory of perfect fluids . . . , nevertheless after many more years he failed to reach a general theory of elasticity.” His answer was: “To succeed in hydrodynamics, the only hope lay in abandoning a one-dimensional approach. But for elastic or flexible bodies onedimensional theories led to one triumph after another. It was the brilliant successes of the special theories that blocked the way to the general theory, for nothing is harder to surmount than a corpus of true but too special knowledge.” I could give here only an insufficient account of this monumental work of Truesdell: the history of the theory of elasticity is now, probably due to him, the best charted and the best investigated domain of the history of physics. And I remain convinced that the three introductions to which I referred are the best guide to a deeper understanding and further study of the history of classical mechanics and indeed of the history of science; every time I open one of them I find again something new and interesting that had escaped me.
4. The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines. Rigorously Constructed upon the Foundation Laid by S. Carnot and F. Reech

Before turning to my second main topic, Truesdell’s work for the Bernoulli Edition, I would like to mention his book with S. Bharatha. Even if it lies somewhat apart from the other works of this account, it brings to the fore several characteristics of Truesdell which seem to me fundamental for his thinking as well as significant. I have quoted the full title of the book, since this book brings together, like no other of his books that I know, science, history and, not surprisingly, conceptual logic. And in no other book of those which I know is Truesdell so preoccupied with teaching. Not that I would recommend the book as a textbook for students, but I recommend it highly to all those who teach physics.

The aim of the book is to construct a rigorous foundation of classical thermodynamics based on the idea of the Carnot cycle. “Rigorous” refers here not only to mathematical rigor, but also, and in fact even more so, to conceptual rigor – to a clear and adequate introduction and a sharp definition of all concepts that will be used in the equations as well as a precise reference as to how they are to be measured, i.e., how they are connected with experiment. The very first sentence of the preface makes this clear and it is, at the same time, a “critique” in the sense of Kant of the possibility of writing the history of science. He writes: “I do not think it possible to write the history of a science until that science itself shall have been understood, thanks to a clear, explicit, and decent logical structure.” I have hardly found in Truesdell’s work positive references to philosophy, and probably he would be surprised to be referred to as a philosopher; yet what he notes here and in the rest of the section is as important a contribution to the philosophy of science and to the philosophy of history of science as I have ever heard. Perhaps Truesdell, were he here, would react to this compliment with a little smile.

On the other hand, the aim which he pursued as a historian is expressed by his dedication of the book “as an expression of respectful gratitude for the legacy of the great French thermodynamicists CARNOT, REECH, DUHEM”. This dedication incidentally refutes the accusation, which I have sometimes heard, that he had an anti-French bias. The results are presented in three sections: Calorimetry, Carnot’s General Axiom, and Universal Efficiency of Ordinary Carnot Cycles. I believe that the progress of science consists in establishing connections between various phenomena, between phenomena and their measurements, and it includes a process called the formation of theories, i.e., constructing connections between different restricted theories by erecting greater and even more comprehensive theories. The significance of the book then lies perhaps in the first place in the fact that no other book of those that I have consulted connects thermodynamics so cogently and intimately to classical mechanics. But the book has another distinction too. It is a book written especially for the teacher, I dare say even for the teacher who must speak to beginners. In other
words, the book has also a pedagogical aim. If it is not directly a textbook, this is only because the authors wished to prove that their approach is powerful enough for coming to grips with all situations that the practical applications demand. Hence the careful analytical generalizations to cases where functions that are only piecewise smooth are needed, etc. But for the explanation of the thermodynamical principles themselves, these technical details are not necessary and can easily be suppressed by the teacher. But what the teacher can learn and teach above all is not so much the mathematical rigor, but the conceptual rigor of the theory or of any theory for that matter, and the importance of a careful introduction and explanation of all concepts. The importance, for instance, of the eternal question that looms over the beginning of all introductory courses on mechanics: “what exactly is now the force, professor”? When I think of my own lecturing, my greatest regret is that I concentrated too little and too late on the careful introduction of all concepts used in physics, and that I spent in my lectures too little time on their discussion. It is with the help of sentences that we prescribe the setting of a reproducible experiment – the concepts connect the experiments with the mathematical formalisms. Another fundamental point made clear in this book is that all theories are always valid only with respect to a certain domain of the variables and under certain restrictions. It is the neglect of these caveats which makes possible only pseudophilosophical and pseudoscientific generalizations. Here teachers can learn much that will prevent a certain boastful offer of their merchandise and at the same time make the understanding of what they present easier since it is focussed. 
I learned myself, for example, that the teacher must immediately at the beginning of each course enumerate explicitly all restrictions under which the predictions of the theory only are valid. Over 20 years ago I had invited Truesdell to Louvain-la-Neuve for giving a series of lectures. In the first one he outlined the content of this book in one hour, overestimating, of course, his audience, which was oriented mainly towards quantum mechanics and its applications. And then he changed to other subjects. But he presented me with a copy of the book, and when I had read it, I regretted deeply that not the whole series of lectures was directed to this one topic, but I did not dare say it to him. But later, at a lunch I told him, that I was particularly impressed by one special topic, namely his treatment of the anomaly of water between 0 and 4 degrees, of which I never had seen an adequate presentation. He then said approvingly, “I can tell you that this subject was my special goal for writing this booklet”, and then taking his glass he invited me to call him henceforth “Clifford”, which from here on I shall also do in this discourse! But during his stay in Louvain-la-Neuve there was one other topic towards which many of our conversations were directed again and again. This was the Bernoulli Edition, to which I shall now turn.
5. Truesdell’s Contribution to the Restart of the Bernoulli Edition

This other topic which occupied Clifford and me was precisely the new beginning of the Bernoulli Edition, and here I must now go back a few years. Clifford had been involved with the Euler Edition, as I mentioned already, through the mathematician Andreas Speiser, an uncle of mine, with whom I had close contacts. My uncle was extremely proud of this acquisition and spoke to me often enthusiastically about Clifford. He gave me separata of the two introductions to the hydrodynamical works, and my uncle’s enthusiasm caught on also with me. I made Clifford’s acquaintance in 1957 on the occasion of Euler’s 250th anniversary, where at my uncle’s invitation he was the main speaker at the official university ceremony. A few years later he wrote me a complimentary letter about my own introduction to Euler’s works in the domain of physical optics. Meanwhile I had begun to read his introductions, so that when J.O. Fleckenstein asked me to succeed Hans Straub as the editor of the works of Daniel Bernoulli, I said “Yes, but . . .”. Namely, I stated as a condition that I should be paid the equivalent of a half-time assistant. It so happened that a few months earlier a young student had asked me if she could write a Ph.D. thesis under my direction. I wished to accept her, for she had definitely “une tête bien organisée”, but no post seemed free. Hence my proposal to Fleckenstein. I succeeded in persuading the student to work on the history of science, although at first she found this a puzzling and somewhat dubious proposition, and Fleckenstein arranged the financial side. Today, some twenty-seven years later, this young student, Patricia Radelet-de Grave, is professor at the Université Catholique de Louvain, where she teaches the history of science. She has now succeeded me as editor of the Bernoulli Edition while an Italian Ph.D. student of hers served as the Edition’s secretary.
Studying Clifford’s introductions, I had become convinced that he was the best possible guide and counsellor for launching the whole enterprise again. In 1975, I was in New York when I received a call from Clifford who inquired about what was going on in the Bernoulli Edition. I gave him the little information I had, but only later did I find out that he had not been terribly excited by my answers. Nevertheless, later, as I shall mention, he accepted an invitation to be the editor of Daniel Bernoulli’s work on hydrodynamics. To my surprise he wrote to me that he was not satisfied with his earlier work in the Euler Edition. His, as I noted before, was the only critical voice about these introductions that I ever heard. In 1980, when I succeeded Fleckenstein as the Editor of the whole Edition, he encouraged me to ask André Weil to become an editor of the works of Jacob Bernoulli. Weil accepted very kindly first the volume Analysis and then later also the volume Differential Geometry. Also from Weil I learned much on the art of editing, and I wish to state here too that I cannot remember one difficult moment with him. A bit later Weil introduced me to Herman Goldstine, who edited the volume on Variational Calculus containing works of Jacob and of Johann. Meanwhile Mrs. Radelet and I produced plans for editing the works of all Bernoullis; so far there had not been any plans nor any reliable estimates of the work at all, the old ones being all much too low. Precisely all these questions and many more, including the choice of the typography and the design of the volumes, were then discussed with Clifford in Louvain-la-Neuve. He took a detailed interest in all problems and Mme. Radelet and I learned much from him.

At last we could publish the plans, which were printed in 1982 in an illustrated brochure, which also contained a presentation of the Bernoulli family and the importance of each member. The plans consisted of (i) a presentation of the whole project, including what had already been achieved and the distribution of the works into volumes and (ii) a determination of our priorities: first to complete what was begun, i.e., Jacob I and Daniel Bernoulli and three volumes with the letters of Joh. I B.; this first stage is now approaching its completion. Only then, in a second stage, to complete all works of Joh. I and the “minor” B.’s; this stage is now opened with the deposition of Vol. 8 of the works of Joh. I by P. Villaggio from Pisa. A third stage was foreseen for the letters; for their publication a project was worked out by F. Nagel and myself. Our plans also contained (iii) a list with the names of all editors of the first stage. At first, when I wrote to Clifford about this brochure, he did not seem terribly impressed, but when I sent him two copies he sent me his enthusiastic congratulations together with a list of friends and colleagues to whom he invited me to send a copy as well. These plans, with the appointment of editors for the first stage, were the basis of the 1982 restart of the Bernoulli Edition. The same year, on the occasion of the bicentenary of Daniel Bernoulli’s death, the Curatorium, under its president the historian A. Gasser, organized a Symposium.
The main speaker was Clifford, who in the “Alte Aula” between the portraits of Daniel Bernoulli and Euler gave a speech about the research of both on the theory of oscillations, evaluating the strengths and the weaknesses of both of them. This was the beginning of a series of exchanges concerning the progress of the Edition, and especially of my requests of Clifford’s opinion on various questions. At the Symposium we presented the first new volume edited by L. Bouckaert and B.L. van der Waerden. It received, besides several favorable reviews, one that put the Edition at its start into serious trouble. Again Clifford came to our rescue and refuted in a letter to the Swiss National Science Foundation (SNScF) all points, with the exception of one, of the review. His letter especially restored, as I was told, the confidence of the SNScF. Incidentally, the author of the review later graciously apologized to Mrs. Radelet. Of course, we had all very much hoped that Clifford might deposit the two volumes containing Daniel Bernoulli’s work on hydrodynamics. This was not to be. As all know, an enormous load of work kept him busy beyond his forces, and his health eventually failed him, although he was in the best of care. He had to resign from his engagement, and he advised me to invite the Russian Academician
Gleb Mikhailov to take over. I am happy to report that recently we began the printing of the first of the two volumes! But, thanks to his friend André Weil, Clifford, nevertheless, became a Bernoulli Editor! Indeed Weil had advised me to edit completely all letters of Jacob Bernoulli’s correspondence with Leibniz and I had unhesitatingly accepted his advice, unaware of the existence of an agreement between the Leibniz and the Bernoulli Edition, which had left the editing of the letters with Leibniz to its sister edition in Hannover; these letters were, of course, the most interesting ones. But when I explained the situation to our colleagues of the Leibniz Edition, they very generously accepted our plans, provided we would not undertake a “critical edition”. During his work on the edition Weil persuaded Clifford to write an introduction to the parts of the correspondence that dealt with questions of mechanics. It is there that Leibniz drew Bernoulli’s attention to the problem of the curved arc. The result of Weil’s invitation was again the appearance of a very penetrating introduction. Thus the Edition is fortunate that Clifford’s name will remain connected to it, and especially as an editor of Jacob Bernoulli, for whom he had done so much. But even without this turn of events, after 30 years of work for the Bernoulli Edition, I can state firmly that no one has done more for making the new beginning of the Edition in 1982 possible than Clifford Truesdell.
6. Truesdell the Artist and the Man

My report on Truesdell’s work for the Euler and the Bernoulli Edition must end here, although I could go on at length. But there remains a question: neither Clifford’s scientific expertise nor his penetration into history alone can explain the full fascination which his works exert on their reader. We know that all scientific theories are even at best only approximations to the observed world and the same, but even more so, holds for the reconstruction of the historical path on which they were founded. Clifford, more than most historians, was aware of this. All too often we must be satisfied with guesses. Thus, like the scientific theories, the historical reconstruction to some extent always remains a construct. So, what then produces the great satisfaction that we experience when we read Truesdell’s works? Here, for discovering the answer we must, I think, turn to another field: we must enter another dimension – the realm of beauty. The man, who was so much attached to all arts, music, painting, old books, etc., was himself also an artist. He has composed his books, in the double sense of this word. As much as his search for scientific precision in all details would allow it, his books are beautifully constructed!

This brings me necessarily to Clifford the private man. Everyone who had the intense pleasure to be received in the Palazzetto knows what I mean: the carefully chosen objects of the collection, their carefully thought out presentation, and especially their owners’ passionate interests in all arts and also in the history of the arts. Here too one could experience the truly enlightening comments which their
guests received. One could watch how they stimulated through their interest in all crafts the artisans of Baltimore who made the Palazzetto into what it became. As you realize, I slipped now, almost unconsciously, into the plural: the treasures of the Palazzetto were offered to its guests by a couple! And so this was more than only their home! Can we imagine Clifford’s tremendous outpourings without the constant and intense help of Charlotte? Can we imagine this without her painstaking proofreading of his books, her corrections, her meticulous improvements of the last details, her conscientious organization of the Archive as well as the classification of Clifford’s correspondence in a private archive? Of course we cannot. Just as the Palazzetto’s hospitality was the work of both, the Palazzetto’s soul was Clifford and Charlotte. And Charlotte made sure that in spite of his harsh afflictions Clifford could spend his last years there in dignity. For this, all of Clifford’s friends will always be in her debt, and they will remain grateful to Charlotte and Clifford.

Acknowledgements

It is a pleasure to thank Professor G. Capriz for the invitation to contribute to the Pisa Meeting in memory of Clifford Truesdell and Professors Chi-Sing Man and R. Fosdick for their careful editing of this article. Also, I would like to thank Professors L.A. Radicati di Brozolo and, especially, P. Villaggio for many interesting conversations and my wife for linguistic advice.

References

1. G. Maltese, La Storia di “F = ma”. La seconda legge del moto nel XVIII secolo. Olschki, Firenze (1992). After the Meeting in memory of Clifford Truesdell in Pisa, there appeared a 2nd volume by Giulio Maltese: Da “F = ma” alle leggi cardinali del moto. Hoepli, Milano (2001). This is a worthy successor of the first volume: one must hope that many readers and especially students will profit from it. It is written in the spirit of Truesdell’s pioneering work and it is a monument to it.
2. C.A. Truesdell, Leonhardi Euleri Commentationes Mechanicae, Ser. Secunda, Vol. XII, pp. IX–CXXV; Ser. Secunda, Vol. XIII, pp. IX–CV; Ser. Secunda, Vol. X et XI Sect. Secunda. Orell Füssli, Turici (1954; 1955; 1960).
3. C. Truesdell and S. Bharatha, The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines. Rigorously Constructed upon the Foundation Laid by S. Carnot and F. Reech. Springer, New York (1977).
4. A. Weil (ed.), Der Briefwechsel von Jacob Bernoulli, with contributions by C. Truesdell and F. Nagel. In: D. Speiser (general ed.), The Collected Scientific Papers of the Mathematicians and Physicists of the Bernoulli Family. Birkhäuser, Basel (1993).
Baltimore, Maryland, 1978
Invariant Dissipative Mechanisms for the Spatial Motion of Rods Suggested by Artificial Viscosity

STUART S. ANTMAN
Department of Mathematics, Institute for Physical Science and Technology, and Institute for Systems Research, University of Maryland, College Park, MD 20742-4015, U.S.A.

Received 10 September 2002; in revised form 16 January 2003

Abstract. The introduction of artificial viscosity into the partial differential equations of mechanics is often useful for both analytic and numerical studies. The traditional forms of artificial viscosity, originally designed to treat problems for fluids, when applied to problems for solids often lead to equations describing material properties that are not invariant under rigid motions. Consequently, for rapidly rotating bodies, artificial viscosity could produce serious errors. In this paper it is shown how to introduce artificial viscosity in a properly invariant way, and that the resulting systems have a rich and attractive structure, which beckons analysis.

Mathematics Subject Classifications (2000): 35L65, 65M99, 74K10.

Key words: artificial viscosity, invariance under rigid motions, frame-indifference, spatial deformations of rods, hyperbolic conservation laws.
In memoriam Clifford Truesdell
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 55–64. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

A general version of a conservation law with one spatial variable $s$ is a system of partial differential equations of the form

\[ u_t = g(u, s)_s + h(u, s) \tag{1.1} \]

where $u$ is an $n$-tuple of unknown scalar functions of $s$ and $t$; $g$ and $h$ are given functions with values in $\mathbb{R}^n$; and derivatives are denoted by subscripts. Suppose that $g$ is differentiable and that $h$ is continuous. In this case, this system is hyperbolic if the matrix $\partial g/\partial u$ of partial derivatives of the components of $g$ with respect to the components of $u$ is positive-definite. As is well known, such nonlinear hyperbolic conservation laws admit shocks. In both analytic and numerical studies of (1.1), a central role has been played by modifying it by appending to its right-hand side an artificial viscosity term $[D(u, s) \cdot u_s]_s$ where $D(u)$ is a small positive-definite matrix (often taken to be constant and diagonal. See [4]. Here $D \cdot u_s$ denotes the
image of the $n$-tuple $u_s$ under the matrix $D$.) For problems of continuum physics, the effect of $D$ is to modify the material properties characterized by $g$ and $h$. When we introduce such modifications into conservation laws from physics, it behooves us to determine if the modified system has physical significance. If not, the modification might introduce serious analytic or numerical errors. (This happens when some standard numerical methods for the treatment of shocks are applied to the equations for the planar motion of elastic rods [3].)

When (1.1) is a set of equations of continuum mechanics, there are three sources of difficulty:

(i) Vectorial geometry is suppressed because the unknown $u$ is just an $n$-tuple of scalars, typically components of vectors with respect to some moving basis with the basis carrying most of the geometric information.

(ii) The system typically includes both momentum equations and compatibility equations, the latter expressing the equality of mixed $s$ and $t$ derivatives. It is necessary to determine what physical meaning, if any, inheres in adding dissipation to such compatibility equations. (Slemrod [5] first treated this question for a scalar equation, in which the issue of invariance under rigid motions does not arise.)

(iii) The form of the artificial viscosity $D \cdot u_s$ is inspired by the viscosity term for 1-dimensional gas dynamics and more generally for that in the Navier–Stokes equations. These equations are typically given in a spatial (Eulerian) formulation. This form of the dissipation is not preserved in a material (Lagrangian) formulation, typically used for problems of solid mechanics. Indeed, we shall see that in the material formulation, the artificial viscosity $D \cdot u_s$ corresponds to constitutive equations that are not invariant under rigid motions.

The purpose of this paper is to resolve these difficulties, producing properly invariant dissipative mechanisms suggested by this artificial viscosity.
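As a purely illustrative aside, the effect of an artificial-viscosity term of the form $[D\,u_s]_s$ can be seen on the simplest scalar instance of (1.1). The sketch below is not taken from this paper: it assumes $n = 1$, $g(u) = -u^2/2$ (Burgers' equation), $h = 0$, a constant scalar $D$, and a plain explicit central-difference scheme. The function name `viscous_burgers` and all parameter values are hypothetical choices made for the example.

```python
import numpy as np

def viscous_burgers(n_cells=200, D=0.01, dt=5e-4, t_end=0.5):
    """Explicit central-difference sketch of u_t = g(u)_s + (D u_s)_s, g(u) = -u**2/2.

    Assumed setup (not from the paper): step data with u = 1 on the left and
    u = 0 on the right, end values held fixed, constant scalar viscosity D.
    """
    s = np.linspace(0.0, 1.0, n_cells + 1)
    ds = s[1] - s[0]
    u = np.where(s < 0.25, 1.0, 0.0)
    g = lambda q: -0.5 * q**2
    for _ in range(int(round(t_end / dt))):
        flux = (g(u[2:]) - g(u[:-2])) / (2.0 * ds)           # g(u)_s, central difference
        visc = D * (u[2:] - 2.0 * u[1:-1] + u[:-2]) / ds**2   # (D u_s)_s with constant D
        u[1:-1] = u[1:-1] + dt * (flux + visc)                # interior update only
    return s, u

s_grid, u_final = viscous_burgers()
```

With these particular values each update is a convex combination of neighbouring values, so the computed profile should stay between the initial bounds 0 and 1, and the smeared shock should travel at roughly the Rankine–Hugoniot speed 1/2. A smaller `D` on the same grid would eventually violate that condition and produce oscillations.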
We shall see that such dissipative terms have a rich and attractive structure.

Notation. We employ Gibbs notation for vectors and tensors: Vectors, which are elements of Euclidean 3-space $\mathbb{E}^3$, and vector-valued functions are denoted by lower-case, italic, bold-face symbols. The dot and cross products of (vectors) $\boldsymbol{u}$ and $\boldsymbol{v}$ are denoted $\boldsymbol{u} \cdot \boldsymbol{v}$ and $\boldsymbol{u} \times \boldsymbol{v}$. The value of tensor $\boldsymbol{A}$ at vector $\boldsymbol{v}$ is denoted $\boldsymbol{A} \cdot \boldsymbol{v}$ (in place of the more usual $\boldsymbol{A}\boldsymbol{v}$). Twice-repeated lower-case Latin indices except for the independent variables $s$ and $t$ are summed from 1 to 3 and twice-repeated lower-case Greek indices are summed from 1 to 2. Triples $(u_1, u_2, u_3)$ of components of any vector $\boldsymbol{u}$ with respect to a certain nonconstant right-handed orthonormal basis $\{\boldsymbol{d}_k\}$ are denoted by the corresponding lower-case, sans-serif, bold-face symbol $\mathsf{u}$. In view of the orthonormality of this basis, we set $\mathsf{u} \cdot \mathsf{v} := u_i v_i \; (= \boldsymbol{u} \cdot \boldsymbol{v})$, $|\mathsf{u}| = \sqrt{u_k u_k} \; (= |\boldsymbol{u}|)$, $\mathsf{u} \times \mathsf{v} := (u_2 v_3 - u_3 v_2,\; u_3 v_1 - u_1 v_3,\; u_1 v_2 - u_2 v_1)$ (so that the first component of $\mathsf{u} \times \mathsf{v}$ is $(\boldsymbol{u} \times \boldsymbol{v}) \cdot \boldsymbol{d}_1$, etc.). The matrix of a tensor $\boldsymbol{A}$ with respect to this basis is denoted $\mathsf{A}$. Its action on a triple $\mathsf{u}$ is denoted $\mathsf{A} \cdot \mathsf{u}$. The dot product and norm for other $n$-tuples are treated analogously. The (Gâteaux) differential of $u \mapsto f(u)$ at $v$ in the direction $h$ is $(d/dt) f(v + th)|_{t=0}$. When it is linear in $h$, we denote this differential by $(\partial f/\partial u)(v) \cdot h$ or $f_u(v) \cdot h$. We occasionally denote the function $u \mapsto f(u)$ by $f(\cdot)$. The partial derivative of a function $f$ with respect to a scalar argument $t$ is denoted by either $f_t$ or $\partial_t f$. The operator $\partial_t$ is assumed to apply only to the term immediately following it. We shall always use notation like $\partial_t$ for a total derivative, i.e., a derivative of a composite function. Obvious analogs of these notations will also be used.

2. Formulation of the Governing Equations

We briefly outline the formulation of geometrically exact equations governing the motion in space of a rod that can suffer flexure, extension, torsion, and shear. We follow [1, Chapter 8], which should be consulted for interpretations and for the proofs of all our assertions. The motion of a rod is defined here by three vector-valued functions

\[ [0, 1] \times \mathbb{R} \ni (s, t) \mapsto \boldsymbol{r}(s, t),\ \boldsymbol{d}_1(s, t),\ \boldsymbol{d}_2(s, t) \in \mathbb{E}^3 \tag{2.1} \]
with $\{\boldsymbol{d}_1(s, t), \boldsymbol{d}_2(s, t)\}$ orthonormal. The function $\boldsymbol{r}(\cdot, t)$ may be interpreted as the configuration at time $t$ of the curve of centroids of a slender 3-dimensional body. The vectors $\boldsymbol{d}_1(s, t)$ and $\boldsymbol{d}_2(s, t)$ may be interpreted as characterizing the orientation of the material section at $s$ at time $t$. In particular, $\boldsymbol{d}_1(s, t)$ and $\boldsymbol{d}_2(s, t)$ may be regarded as characterizing the configurations at time $t$ of a pair of orthogonal material lines of the section $s$. We assume that $s$ is the arc-length parameter of the reference configuration of $\boldsymbol{r}$ and we scale the length so that $0 \le s \le 1$. We set

\[ \boldsymbol{d}_3 := \boldsymbol{d}_1 \times \boldsymbol{d}_2. \tag{2.2} \]
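Since $\{\boldsymbol{d}_k(s, t)\}$ is a right-handed orthonormal basis for $\mathbb{E}^3$ for each $(s, t)$, there are vector-valued functions $\boldsymbol{u}$ and $\boldsymbol{w}$ such that

\[ \partial_s \boldsymbol{d}_k = \boldsymbol{u} \times \boldsymbol{d}_k, \qquad \partial_t \boldsymbol{d}_k = \boldsymbol{w} \times \boldsymbol{d}_k. \tag{2.3} \]

Since the basis $\{\boldsymbol{d}_k\}$ is natural for the intrinsic description of deformation, we decompose relevant vector-valued functions with respect to it:

\[ \boldsymbol{v} := \boldsymbol{r}_s = v_k \boldsymbol{d}_k, \qquad \boldsymbol{p} := \boldsymbol{r}_t = p_k \boldsymbol{d}_k, \qquad \boldsymbol{u} = u_k \boldsymbol{d}_k, \qquad \boldsymbol{w} = w_k \boldsymbol{d}_k. \tag{2.4} \]

The equality of mixed partial derivatives of $\boldsymbol{r}$ and of the $\boldsymbol{d}_k$ implies that

\[ \boldsymbol{p}_s = \boldsymbol{v}_t = (\partial_t v_k)\boldsymbol{d}_k + \boldsymbol{w} \times \boldsymbol{v}, \qquad \boldsymbol{w}_s = \boldsymbol{u}_t + \boldsymbol{u} \times \boldsymbol{w} = (\partial_t u_k)\boldsymbol{d}_k. \tag{2.5} \]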
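The text refers to [1] for proofs. As a reading aid only, the second identity in (2.5) can be checked with the following short derivation, a sketch that uses nothing beyond (2.3), (2.4), and the Jacobi identity for the cross product.

```latex
% Differentiate the two relations in (2.3) and compare mixed derivatives.
\begin{align*}
\partial_t \partial_s \boldsymbol{d}_k
  &= \boldsymbol{u}_t \times \boldsymbol{d}_k
   + \boldsymbol{u} \times (\boldsymbol{w} \times \boldsymbol{d}_k), \\
\partial_s \partial_t \boldsymbol{d}_k
  &= \boldsymbol{w}_s \times \boldsymbol{d}_k
   + \boldsymbol{w} \times (\boldsymbol{u} \times \boldsymbol{d}_k).
\end{align*}
% Equate them and use a x (b x c) - b x (a x c) = (a x b) x c (Jacobi identity):
\begin{align*}
(\boldsymbol{w}_s - \boldsymbol{u}_t) \times \boldsymbol{d}_k
  &= \boldsymbol{u} \times (\boldsymbol{w} \times \boldsymbol{d}_k)
   - \boldsymbol{w} \times (\boldsymbol{u} \times \boldsymbol{d}_k)
   = (\boldsymbol{u} \times \boldsymbol{w}) \times \boldsymbol{d}_k ,
   \qquad k = 1, 2, 3, \\
\text{so that}\quad
\boldsymbol{w}_s &= \boldsymbol{u}_t + \boldsymbol{u} \times \boldsymbol{w}.
\end{align*}
% Finally, u = u_k d_k and (2.3) give u_t = (\partial_t u_k) d_k + w x u, hence
\begin{equation*}
\boldsymbol{w}_s
  = (\partial_t u_k)\boldsymbol{d}_k
  + \boldsymbol{w} \times \boldsymbol{u}
  + \boldsymbol{u} \times \boldsymbol{w}
  = (\partial_t u_k)\boldsymbol{d}_k .
\end{equation*}
```

The first identity in (2.5) follows in the same way from $\boldsymbol{r}_{st} = \boldsymbol{r}_{ts}$ and $\boldsymbol{v} = v_k \boldsymbol{d}_k$.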
We set

\[ \mathsf{u} := (u_1, u_2, u_3), \qquad \mathsf{v} := (v_1, v_2, v_3), \qquad \mathsf{p} := (p_1, p_2, p_3), \qquad \mathsf{w} := (w_1, w_2, w_3). \tag{2.6} \]

$\mathsf{u}$ and $\mathsf{v}$ are the strain variables corresponding to the motion (2.1). (The strains $u_1$ and $u_2$ measure flexure, $u_3$ measures torsion, $v_1$ and $v_2$ measure shear, and $v_3$ measures dilatation.)
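To illustrate why the component triples in (2.6) are natural strain variables, the following sketch checks numerically, at a single material point, that they are unchanged by a superposed rigid rotation $Q$ (under which $\boldsymbol{r}_s \mapsto Q\boldsymbol{r}_s$ and $\boldsymbol{d}_k \mapsto Q\boldsymbol{d}_k$), whereas the components of $\boldsymbol{r}_s$ relative to a fixed basis do change. The helper names `random_rotation` and `strain_components` and the sample data are hypothetical. The recovery formula $\boldsymbol{u} = \tfrac12 \boldsymbol{d}_k \times \partial_s \boldsymbol{d}_k$ is a consequence of (2.3) that is used here only for the check.

```python
import numpy as np

def random_rotation(rng):
    """Proper orthogonal 3x3 matrix from a QR factorization (illustrative helper)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]          # flip one column so that det(q) = +1
    return q

def strain_components(r_s, d, d_s):
    """Return the triples (u_k), (v_k) from r_s, directors d[k], and derivatives d_s[k]."""
    u_vec = 0.5 * np.sum(np.cross(d, d_s), axis=0)   # u = (1/2) d_k x (d_k)_s
    return d @ u_vec, d @ r_s                         # u_k = u . d_k,  v_k = r_s . d_k

rng = np.random.default_rng(0)
d = random_rotation(rng)                 # rows serve as d_1, d_2, d_3
u_ref = np.array([0.3, -0.2, 0.5])       # chosen components u_k
v_ref = np.array([0.1, 0.0, 1.2])        # chosen components v_k
u_vec, r_s = u_ref @ d, v_ref @ d        # vectors u = u_k d_k and r_s = v_k d_k
d_s = np.cross(u_vec, d)                 # (2.3): (d_k)_s = u x d_k, row by row

Q = random_rotation(rng)                 # superposed rigid rotation
u_rot, v_rot = strain_components(Q @ r_s, d @ Q.T, d_s @ Q.T)
```

Here `u_rot` and `v_rot` should reproduce `u_ref` and `v_ref` up to rounding, while `Q @ r_s` differs from `r_s`. This is the elementary fact behind constitutive equations written in terms of $\mathsf{u}$ and $\mathsf{v}$.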
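In the configuration at time $t$, the resultant contact force and contact couple exerted by the material of $(s, 1]$ on the material of $[0, s]$ (for $0 < s \le 1$) are respectively denoted $\boldsymbol{n}(s, t)$ and $\boldsymbol{m}(s, t)$. Provided that there are no body forces or couples, the equations of motion have the form

\begin{align}
\boldsymbol{n}_s + \boldsymbol{f} &= \rho A\, \boldsymbol{r}_{tt}, \tag{2.7} \\
\boldsymbol{m}_s + \boldsymbol{r}_s \times \boldsymbol{n} + \boldsymbol{l} &= \partial_t(\rho J_{pq} w_q \boldsymbol{d}_p) =: \partial_t(\rho \boldsymbol{J} \cdot \boldsymbol{w}) \tag{2.8}
\end{align}

where $(\rho A)(s)$ is the prescribed positive mass density per reference length at $s$, the $(\rho J_{\gamma\delta})(s)$, $\gamma, \delta = 1, 2$, are the prescribed components of the positive-definite symmetric $2 \times 2$ matrix of mass-moments of inertia of the section $s$. The positive-definite symmetric $3 \times 3$ matrix $\rho \mathsf{J} := (\rho J_{pq})$ is defined by $\rho J_{\gamma 3} = \rho J_{3\gamma} = 0$, $\rho J_{33} = \rho J_{\gamma\gamma}$, and $\rho \boldsymbol{J} := \rho J_{pq} \boldsymbol{d}_p \boldsymbol{d}_q$. Let

\[ m_k := \boldsymbol{m} \cdot \boldsymbol{d}_k, \quad \mathsf{m} := (m_1, m_2, m_3), \qquad n_k := \boldsymbol{n} \cdot \boldsymbol{d}_k, \quad \mathsf{n} := (n_1, n_2, n_3). \tag{2.9} \]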
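$m_1$ and $m_2$ are the bending couples, $m_3$ is the twisting couple, $n_1$ and $n_2$ are the shear forces, and $\boldsymbol{n} \cdot \boldsymbol{r}_s / |\boldsymbol{r}_s|$ is the tension. The rod is elastic if there are constitutive functions

\[ (\mathsf{u}, \mathsf{v}, s) \mapsto \hat{\mathsf{m}}(\mathsf{u}, \mathsf{v}, s),\ \hat{\mathsf{n}}(\mathsf{u}, \mathsf{v}, s) \tag{2.10a} \]

such that

\[ \mathsf{m}(s, t) = \hat{\mathsf{m}}(\mathsf{u}(s, t), \mathsf{v}(s, t), s), \quad \text{etc.} \tag{2.10b} \]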
This form of the constitutive equations ensures that the material behavior is invariant under rigid motions. For any function $[0, 1] \times \mathbb{R} \ni (s, t) \mapsto z(s, t)$ we define $z^{s,t}(\sigma, \tau) := z(s - \sigma, t - \tau)$ for all $\sigma$ such that $s - \sigma \in [0, 1]$ and for all τ t. The general constitutive equations for a rod whose response at $(s, t)$ depends nonlocally on other material points (sections) of the rod and depends upon the past history have the form

\[ \mathsf{m}(s, t) = \hat{\mathsf{m}}(\mathsf{u}^{s,t}, \mathsf{v}^{s,t}, s), \quad \text{etc.} \tag{2.11} \]
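These very general constitutive equations are likewise invariant under rigid motions.

We now recast our governing partial differential equations as a vectorial system of first order in the time derivative. Equations (2.3)–(2.5), (2.10), (2.11) imply that

$$\partial_t\mathbf{d}_k = \mathbf{w}\times\mathbf{d}_k, \tag{2.12a}$$

$$\mathbf{u}_t = \mathbf{w}_s - \mathbf{u}\times\mathbf{w}, \tag{2.12b}$$

$$\mathbf{v}_t = \mathbf{p}_s, \tag{2.12c}$$

$$\partial_t(\rho\mathbf{J}\cdot\mathbf{w}) = \partial_s(\hat{m}_k\mathbf{d}_k) + \mathbf{v}\times\hat{n}_k\mathbf{d}_k, \tag{2.12d}$$

$$\rho A\,\mathbf{p}_t = \partial_s(\hat{n}_k\mathbf{d}_k) \tag{2.12e}$$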
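where the arguments of $\hat{m}_k$ and $\hat{n}_k$ are

$$\mathbf{u}\cdot\mathbf{d}_l, \quad \mathbf{v}\cdot\mathbf{d}_l, \quad \mathbf{w}_s\cdot\mathbf{d}_l, \quad \mathbf{p}_s\cdot\mathbf{d}_l - (\mathbf{w}\times\mathbf{v})\cdot\mathbf{d}_l, \quad s. \tag{2.12f}$$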
Note that the ordinary differential equation (2.12a) preserves the dot products $\mathbf{d}_k\cdot\mathbf{d}_l$ and therefore ensures that $\{\mathbf{d}_k(s, t)\}$ is an orthonormal basis for all $s, t$ if $\{\mathbf{d}_k(s, 0)\}$ is an orthonormal basis for all $s$. By the continuation theory for ordinary differential equations, the linearity of (2.12a) and the constancy of $\mathbf{d}_1\cdot\mathbf{d}_1, \ldots$ imply that the solutions of initial-value problems for (2.12a) are defined for all $t$. The system (2.12) is hyperbolic if $(\hat{\mathsf{m}}, \hat{\mathsf{n}})$ satisfies the monotonicity condition that

$$\text{the matrix } \frac{\partial(\hat{\mathsf{m}}, \hat{\mathsf{n}})}{\partial(\mathsf{u}, \mathsf{v})} \text{ is positive-definite.} \tag{2.13}$$
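The preservation of the dot products under (2.12a) rests on the identity $(\mathbf{w}\times\mathbf{d}_k)\cdot\mathbf{d}_l + \mathbf{d}_k\cdot(\mathbf{w}\times\mathbf{d}_l) = 0$. The following minimal Python sketch checks this identity numerically; the function names and the sample vectors are illustrative choices, not part of the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def dot_product_rates(w, directors):
    """Time rates of d_k . d_l when each d_k obeys d_k' = w x d_k, cf. (2.12a)."""
    rates = [cross(w, d) for d in directors]
    return [[dot(rates[k], directors[l]) + dot(directors[k], rates[l])
             for l in range(3)] for k in range(3)]


# Arbitrary sample data (hypothetical values); the identity holds for any vectors.
w = (0.3, -1.2, 2.5)
directors = [(1.0, 0.2, -0.4), (0.0, 1.5, 0.7), (-0.6, 0.1, 2.0)]
max_rate = max(abs(r) for row in dot_product_rates(w, directors) for r in row)
```

Because every rate vanishes up to rounding, a triple that is orthonormal at the initial time remains orthonormal, which is the property used in the text.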
If we take the componential version of (2.12) with respect to the basis $\{\mathbf{d}_k\}$, we can uncouple the equation (2.12a) for the $\mathbf{d}_k$ from the remaining equations: Let $e_{klm}$ denote the alternating symbol. Then (2.12a) has the componential form

$$\partial_t\mathbf{d}_k = e_{kij}\,w_j\,\mathbf{d}_i, \tag{2.14}$$
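and system (2.12b–f) is equivalent to

$$\partial_t u_i = \partial_s w_i - e_{ijk} w_j u_k, \tag{2.15a}$$

$$\partial_t v_i = \partial_s p_i + e_{ijk}(u_j p_k - w_j v_k), \tag{2.15b}$$

$$\partial_t(\rho J_{ij} w_j) = \partial_s\hat{m}_i + e_{ijk}(u_j\hat{m}_k + v_j\hat{n}_k - w_j\,\rho J_{kq} w_q), \tag{2.15c}$$

$$\partial_t(\rho A\,p_i) = \partial_s\hat{n}_i + e_{ijk}(u_j\hat{n}_k - w_j\,\rho A\,p_k), \tag{2.15d}$$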
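where the arguments of $\hat{m}_k$ and $\hat{n}_k$ are

$$\mathsf{u}, \quad \mathsf{v}, \quad \mathsf{w}_s - \mathsf{w}\times\mathsf{u}\ (= \mathsf{u}_t), \quad \mathsf{p}_s + \mathsf{u}\times\mathsf{p} - \mathsf{w}\times\mathsf{v}\ (= \mathsf{v}_t), \quad s. \tag{2.15e}$$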
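The passage between index and vector notation in these systems uses only the fact that $e_{ijk} a_j b_k$ is the $i$th component of $\mathsf{a}\times\mathsf{b}$. A small Python sketch of that correspondence follows; the function names and sample triples are hypothetical, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def alternating(i, j, k):
    """Alternating symbol e_ijk for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def cross(a, b):
    """Cross product of two triples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cross_by_symbol(a, b):
    """Components e_ijk a_j b_k, summed over j and k."""
    return tuple(sum(alternating(i, j, k) * a[j] * b[k]
                     for j in range(3) for k in range(3))
                 for i in range(3))


def strain_rate_components(ws, w, u):
    """Right-hand side of (2.15a): d_s w_i - e_ijk w_j u_k."""
    wxu = cross_by_symbol(w, u)
    return tuple(ws[i] - wxu[i] for i in range(3))


# Sample integer triples (illustrative only) so that the comparison is exact.
w, u, ws = (1, 2, 3), (4, 5, 6), (1, 1, 1)
same = cross_by_symbol(w, u) == cross(w, u)
```

With this identification, (2.15a) reads $\mathsf{u}_t = \mathsf{w}_s - \mathsf{w}\times\mathsf{u}$, and the other component equations translate in the same way.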
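We can write this system in the compact form

$$\mathsf{u}_t = \mathsf{w}_s - \mathsf{w}\times\mathsf{u}, \tag{2.16a}$$

$$\mathsf{v}_t = \mathsf{p}_s + \mathsf{u}\times\mathsf{p} - \mathsf{w}\times\mathsf{v}, \tag{2.16b}$$

$$\partial_t(\rho\mathsf{J}\cdot\mathsf{w}) \equiv \rho\mathsf{J}\cdot\mathsf{w}_t = \partial_s\hat{\mathsf{m}} + \mathsf{u}\times\hat{\mathsf{m}} + \mathsf{v}\times\hat{\mathsf{n}} - \mathsf{w}\times(\rho\mathsf{J}\cdot\mathsf{w}), \tag{2.16c}$$

$$\partial_t(\rho A\,\mathsf{p}) \equiv \rho A\,\mathsf{p}_t = \partial_s\hat{\mathsf{n}} + \mathsf{u}\times\hat{\mathsf{n}} - \mathsf{w}\times(\rho A\,\mathsf{p}). \tag{2.16d}$$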
(Even though (2.16) is independent of the dk , boundary conditions, which we do not study, may not be.) The systems (2.12) and (2.16) are in a general conservation form. It is important to note that our original system (2.3)–(2.5), (2.10), (2.11) of governing equations, the first-order vectorial system (2.12), and the first-order componential system (2.16) are each equivalent.
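3. Artificial Viscosity

We now modify system (2.16), which we identify with (1.1), by adding an artificial viscosity $D\cdot u_s$, where, for simplicity, we take $D$ to be a constant positive-definite diagonal matrix. In particular, let $\mathsf{U}, \mathsf{V}, \mathsf{W}, \mathsf{P}$ be constant positive-definite diagonal $3\times 3$ matrices. Then the modification of (2.16) with artificial viscosity has the form

$$\mathsf{u}_t = \mathsf{w}_s - \mathsf{w}\times\mathsf{u} + \mathsf{U}\cdot\mathsf{u}_{ss}, \tag{3.1a}$$

$$\mathsf{v}_t = \mathsf{p}_s + \mathsf{u}\times\mathsf{p} - \mathsf{w}\times\mathsf{v} + \mathsf{V}\cdot\mathsf{v}_{ss}, \tag{3.1b}$$

$$\rho\mathsf{J}\cdot\mathsf{w}_t = \partial_s\hat{\mathsf{m}} + \mathsf{u}\times\hat{\mathsf{m}} + \mathsf{v}\times\hat{\mathsf{n}} - \mathsf{w}\times(\rho\mathsf{J}\cdot\mathsf{w}) + \mathsf{W}\cdot(\rho\mathsf{J}\cdot\mathsf{w}_s)_s, \tag{3.1c}$$

$$\rho A\,\mathsf{p}_t = \partial_s\hat{\mathsf{n}} + \mathsf{u}\times\hat{\mathsf{n}} - \mathsf{w}\times(\rho A\,\mathsf{p}) + \mathsf{P}\cdot(\rho A\,\mathsf{p}_s)_s. \tag{3.1d}$$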
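The link between such added second-derivative terms and numerical schemes can be seen on a scalar caricature $y_t = g(y)_s + D\,y_{ss}$ of (3.1). This is not the rod system itself, and the grid, flux, and function names below are assumptions made for illustration. The sketch checks the standard observation that one Lax–Friedrichs step coincides with a centered-difference step for the inviscid law plus a viscosity term with $D = \Delta s^2/(2\Delta t)$.

```python
import math


def centered_plus_viscosity(y, flux, dt, ds, visc):
    """One explicit step of y_t = flux(y)_s + visc * y_ss on a periodic grid."""
    n = len(y)
    out = []
    for j in range(n):
        yl, yr = y[(j - 1) % n], y[(j + 1) % n]
        transport = (flux(yr) - flux(yl)) / (2.0 * ds)
        diffusion = visc * (yr - 2.0 * y[j] + yl) / ds ** 2
        out.append(y[j] + dt * (transport + diffusion))
    return out


def lax_friedrichs(y, flux, dt, ds):
    """One Lax-Friedrichs step for y_t = flux(y)_s on a periodic grid."""
    n = len(y)
    out = []
    for j in range(n):
        yl, yr = y[(j - 1) % n], y[(j + 1) % n]
        out.append(0.5 * (yl + yr) + dt * (flux(yr) - flux(yl)) / (2.0 * ds))
    return out


# Hypothetical test data: a smooth periodic profile and a quadratic flux.
n = 64
ds, dt = 1.0 / n, 0.005
y0 = [math.sin(2.0 * math.pi * j * ds) for j in range(n)]
flux = lambda y: 0.5 * y * y

step_lf = lax_friedrichs(y0, flux, dt, ds)
step_av = centered_plus_viscosity(y0, flux, dt, ds, visc=ds ** 2 / (2.0 * dt))
max_gap = max(abs(a - b) for a, b in zip(step_lf, step_av))
```

The two updates agree to rounding error, so the scheme behaves like the original law with an artificial viscosity added; in the rod problem the question studied here is whether the analogous added terms respect invariance under rigid motions.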
Suppose we were to decompose $\mathbf{u}$ and $\mathbf{v}$ with respect to $\{\mathbf{d}_k\}$ as above, but decompose $\mathbf{w}$ and $\mathbf{p}$ with respect to a Cartesian basis $\{\mathbf{i}_k\}$. It is a straightforward exercise to show that the modification of the resulting system with artificial viscosity as in (3.1) is not equivalent to (3.1) (and therefore these two systems could have solutions with very different properties). This is a portent of some of the difficulties we must overcome.

Modification of the momentum equations. We would like the artificial viscosity terms in (3.1c,d) to represent a material dissipation, which would regularize the behavior of solutions. Thus they should modify the constitutive functions in these two equations. If these modified constitutive functions are to be invariant under rigid motions, then the remarks surrounding (2.11) imply that $\mathsf{W}\cdot\rho\mathsf{J}\cdot\mathsf{w}_s$ and $\mathsf{P}\cdot\mathsf{p}_s$ should depend solely (although possibly nonlocally in space and time) on $\mathsf{u}, \mathsf{v}$. It follows from (2.5), which gives the actual kinematical relations, that these viscosities lack the requisite form. It is also clear from (2.11) how to rectify this deficiency. A particularly simple way to do this is to drop $\mathsf{W}\cdot\rho\mathsf{J}\cdot\mathsf{w}_s$ and $\mathsf{P}\cdot\mathsf{p}_s$ from the right-hand sides of (3.1c) and (3.1d) and to replace $(\hat{\mathsf{m}}, \hat{\mathsf{n}})$ of (2.10b) with $(\hat{\mathsf{m}} + \mathsf{M}\cdot\mathsf{u}_t,\ \hat{\mathsf{n}} + \mathsf{N}\cdot\mathsf{v}_t)$ where $\mathsf{M}, \mathsf{N}$ are constant positive-definite diagonal $3\times 3$ matrices. Then in place of (3.1c,d) we obtain

$$\rho\mathsf{J}\cdot\mathsf{w}_t = \partial_s\hat{\mathsf{m}} + \mathsf{M}\cdot\mathsf{u}_{st} + \mathsf{u}\times(\hat{\mathsf{m}} + \mathsf{M}\cdot\mathsf{u}_t) + \mathsf{v}\times(\hat{\mathsf{n}} + \mathsf{N}\cdot\mathsf{v}_t) - \mathsf{w}\times(\rho\mathsf{J}\cdot\mathsf{w}), \tag{3.2}$$

$$\rho A\,\mathsf{p}_t = \partial_s\hat{\mathsf{n}} + \mathsf{N}\cdot\mathsf{v}_{st} + \mathsf{u}\times(\hat{\mathsf{n}} + \mathsf{N}\cdot\mathsf{v}_t) - \mathsf{w}\times(\rho A\,\mathsf{p}). \tag{3.3}$$
These equations are equivalent to the following modifications of (2.12d,e):

$$(\rho\mathbf{J}\cdot\mathbf{w})_t = \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + M_{kl}\,\partial_t u_l]\mathbf{d}_k\}_s + \mathbf{v}\times[\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\,\partial_t v_l]\mathbf{d}_k, \tag{3.4}$$

$$\rho A\,\mathbf{p}_t = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\,\partial_t v_l]\mathbf{d}_k\}_s. \tag{3.5}$$

Modification of the compatibility equations. We now examine modifications like (3.1a,b) of the compatibility equations (2.16a,b). Since such modifications are critical for numerical methods, we cannot avoid studying them by simply setting $\mathsf{U} = \mathsf{O} = \mathsf{V}$, as we would do in the analysis of the differential equations for viscoelastic rods of strain-rate type. We first have to frame a notion of invariance for modified compatibility equations, which come from purely kinematic considerations and have no intrinsic material significance. We define an invariant system with artificial viscosity to be a system of equations with single time-derivatives on the left-hand side and with each equation containing a dissipative term such that it is equivalent to the system consisting of momentum equations (2.7), (2.8) and constitutive equations of the invariant form (2.11). Thus to study this issue, we have to reconstitute the governing system of equations of motion in their traditional form involving second time-derivatives from a suitable modification of (2.16) involving first time-derivatives. In the process of constructing an invariant modification of (2.16a,b) we show that (3.1a,b) are not invariant.

Rather than modifying (3.1a,b), it is more convenient to modify (2.12a–c). Let $\mathbf{U}$ and $\mathbf{V}$ be tensors whose matrices with respect to the basis $\{\mathbf{d}_k\}$ are $\mathsf{U}$ and $\mathsf{V}$. We first consider a modification of (2.12c) in the form

$$\mathbf{v}_t = \mathbf{p}_s + [(\rho A)^{-1}\mathbf{V}\cdot\mathbf{v}_s]_s + [(\rho A)^{-1}\boldsymbol{\eta}]_s \tag{3.6}$$
where $\boldsymbol{\eta}$ is a function at our disposal to make this equation invariant. Since this equation sets a $t$-derivative equal to an $s$-derivative on a simply-connected domain, there is a vector-valued potential, which is convenient not only to denote as $\mathbf{r}$ but also to treat as $\mathbf{r}$, such that

$$\mathbf{r}_t = \mathbf{p} + (\rho A)^{-1}\mathbf{V}\cdot\mathbf{v}_s + (\rho A)^{-1}\boldsymbol{\eta}, \qquad \mathbf{r}_s = \mathbf{v}. \tag{3.7}$$
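The treatment of the analogous modification of (2.12b) is a little trickier: We seek a $\boldsymbol{\xi}$ so that

$$\mathbf{u}_t = [\mathbf{w} + (\rho\mathbf{J})^{-1}\cdot(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})]_s - \mathbf{u}\times[\mathbf{w} + (\rho\mathbf{J})^{-1}\cdot(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})] \tag{3.8}$$

is invariant. Although this equation does not have the form of a $t$-derivative equaling an $s$-derivative, it does have the form of $(2.5)_2$. We therefore conclude that there is a triple of vectors, which is convenient to denote as $\{\mathbf{d}_k\}$, such that

$$\partial_t\mathbf{d}_k = [\mathbf{w} + (\rho\mathbf{J})^{-1}\cdot(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})]\times\mathbf{d}_k, \qquad \partial_s\mathbf{d}_k = \mathbf{u}\times\mathbf{d}_k. \tag{3.9}$$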
Since each of these equations conserves the dot products $\{\mathbf{d}_k\cdot\mathbf{d}_l\}$, we see that these $\{\mathbf{d}_k\}$ are orthonormal if they are orthonormal at the initial time. Note that $(3.9)_1$ is a modification of (2.12a). We replace $\mathbf{p}$ in the modified momentum equation (3.5) with its expression coming from (3.7) and we now identify the $\mathbf{d}_k$ appearing there with the new vectors satisfying (3.9). We obtain

$$\rho A\,\mathbf{r}_{tt} = (\mathbf{V}\cdot\mathbf{v}_s)_t + \boldsymbol{\eta}_t + \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + N_{kl}\,\partial_t v_l]\mathbf{d}_k\}_s. \tag{3.10}$$
For this equation to have the requisite invariance, the first two terms on the right-hand side must have the form $[n^+_k\mathbf{d}_k]_s$ where the $n^+_k$ depend (possibly nonlocally) on $(\mathsf{u}, \mathsf{v})$. (The first term on the right-hand side lacks this form because the time-differentiation of the base vectors given by $(3.9)_1$ introduces $\mathbf{w}$-terms not of this form.) We make a particularly simple choice of $n^+_k$ by choosing $\boldsymbol{\eta}$ so that

$$(\mathbf{V}\cdot\mathbf{v}_s)_t + \boldsymbol{\eta}_t = (V_{kl}\,\partial_t v_l\,\mathbf{d}_k)_s. \tag{3.11}$$
The choice (3.11) gives (3.10) an invariant form:

$$\rho A\,\mathbf{r}_{tt} = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\,\partial_t v_l]\mathbf{d}_k\}_s \tag{3.12}$$

(where we use (3.9) to compute the derivatives of the $\mathbf{d}_k$). We differentiate (3.6) with respect to $t$ and insert (3.11) into the resulting equation to get

$$\mathbf{v}_{tt} = \mathbf{p}_{st} + [(\rho A)^{-1}(V_{kl}\,\partial_t v_l\,\mathbf{d}_k)_s]_s \tag{3.13}$$

where again we use (3.9) to compute the derivatives of the $\mathbf{d}_k$. The principal part of the partial differential operator acting on $\mathbf{v}$ in this equation is

$$\mathbf{v}_{tt} - \mathbf{V}\cdot[(\rho A)^{-1}\mathbf{v}_{st}]_s. \tag{3.14}$$
It is just a vectorial version of the heat operator on $\mathbf{v}_t$. It is responsible for the dissipativity and it gives this equation a parabolic character and gives the entire system a parabolic-hyperbolic character; see Zheng [6]. Note that the $\mathbf{d}_k$ in (3.13) depend upon $\boldsymbol{\xi}$, which has not yet been fixed.

We now make (3.8) invariant by the same process by which we made (3.6) invariant. Since (3.8) comes from (2.12b) by replacing $\mathbf{w}$ with $\mathbf{w} + (\rho\mathbf{J})^{-1}\cdot(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})$, we make this replacement in (3.4), use the modified constitutive functions from (3.12), and interpret the $\mathbf{d}_k$ in (3.4) as the new $\mathbf{d}_k$:

$$(\rho\mathbf{J}\cdot\mathbf{w})_t = (\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})_t + \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + M_{kl}\,\partial_t u_l]\mathbf{d}_k\}_s + \mathbf{v}\times[\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\,\partial_t v_l]\mathbf{d}_k. \tag{3.15}$$
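To put this into invariant form, we simply choose $\boldsymbol{\xi}$ so that

$$(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})_t = [U_{kl}\,\partial_t u_l\,\mathbf{d}_k]_s. \tag{3.16}$$

Thus (3.15) becomes the invariant

$$(\rho\mathbf{J}\cdot\mathbf{w})_t = \{[\hat{m}_k(\mathsf{u}, \mathsf{v}) + (M_{kl} + U_{kl})\,\partial_t u_l]\mathbf{d}_k\}_s + \mathbf{v}\times[\hat{n}_k(\mathsf{u}, \mathsf{v}) + (N_{kl} + V_{kl})\,\partial_t v_l]\mathbf{d}_k, \tag{3.17}$$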
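where once again we use (3.9) to compute the derivatives of the $\mathbf{d}_k$. Equation (3.16) may be regarded as a linear ordinary differential equation for $\boldsymbol{\xi}$:

$$\boldsymbol{\xi}_t = U_{kl}[(\partial_t u_l\,\mathbf{d}_k)_s - (\partial_s u_l\,\mathbf{d}_k)_t] \equiv U_{kl}\{\partial_t u_l\,\mathbf{u} - \partial_s u_l\,[\mathbf{w} + (\rho\mathbf{J})^{-1}\cdot(\mathbf{U}\cdot\mathbf{u}_s + \boldsymbol{\xi})]\}\times\mathbf{d}_k. \tag{3.18}$$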
(Had we replaced $\boldsymbol{\xi}$ by $\rho\mathbf{J}\cdot\boldsymbol{\xi}$ in (3.8), then the left-hand side of (3.18) would be $(\rho\mathbf{J}\cdot\boldsymbol{\xi})_t \equiv \rho J_{kl}(\xi_l\mathbf{d}_k)_t$, and (3.18) would not be linear in the components of $\boldsymbol{\xi}$ because the $t$-derivative of $\mathbf{d}_k$ in this expression would introduce another factor involving the $\xi_l$.) We regard the modified compatibility equations (3.8) and (3.18) as a system for $\mathbf{u}$ and $\boldsymbol{\xi}$. We cannot eliminate $\boldsymbol{\xi}$ from (3.8) as we did $\boldsymbol{\eta}$ from (3.6) in getting (3.13). (We may regard $\boldsymbol{\xi}$ as an internal variable.) Nevertheless, if we differentiate (3.8) with respect to $t$ and then substitute (3.16) into the resulting equation where possible, we obtain an equation for which the principal part of the partial differential operator acting on $\mathbf{u}$ is

$$\mathbf{u}_{tt} - (\rho\mathbf{J})^{-1}\cdot\mathbf{U}\cdot\mathbf{u}_{sst}, \tag{3.19}$$
which has the same character as (3.14).

4. Comments

It is clear that we could have replaced all the positive-definite diagonal matrices appearing in the above development with positive-definite symmetric matrices. (Numerical methods typically do not require even this sophistication.) Indeed, we could introduce artificial viscosity through nonlinear constitutive laws, which would just give another source of the quasilinearity of the governing system. In short, there is no unique way to introduce invariant versions of artificial viscosity.

The following development shows that if we make a slight change in our procedure, then the alternative invariant versions of the momentum equations (3.4), (3.5) have higher derivatives with respect to $s$ corresponding to constitutive functions depending on $(\mathsf{u}_s, \mathsf{v}_s)$ also. The resulting theories are of strain-gradient type.

Consider the pair (2.12c,e) of equations. Suppose that the modification of (2.12e) is taken to have the form

$$\rho A\,\mathbf{p}_t = [\hat{n}_k(\mathsf{u}, \mathsf{v})\,\mathbf{d}_k]_s + \mathbf{P}\cdot(\rho A\,\mathbf{p})_{ss} + \boldsymbol{\zeta} \tag{4.1}$$
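where the components of $\mathbf{P}$ with respect to the basis $\{\mathbf{d}_k\}$ form a constant positive-definite diagonal matrix and where $\boldsymbol{\zeta}$ is at our disposal to make (4.1) invariant. We retain the modification (3.6) of the compatibility equation and replace the two visible $\mathbf{p}$'s in (4.1) with the expression coming from $(3.7)_1$. We choose $\boldsymbol{\eta}_t$ to control the term $(\mathbf{V}\cdot\mathbf{v}_s)_t$ by again adopting (3.11), thus converting (4.1) to

$$\rho A\,\mathbf{r}_{tt} = \{[\hat{n}_k(\mathsf{u}, \mathsf{v}) + V_{kl}\,\partial_t v_l]\mathbf{d}_k\}_s + \mathbf{P}\cdot[\rho A\,\mathbf{r}_t - \mathbf{V}\cdot\mathbf{v}_s - \boldsymbol{\eta}]_{ss} + \boldsymbol{\zeta} \tag{4.2}$$

where

$$\mathbf{V}(s, t)\cdot\mathbf{v}_s(s, t) + \boldsymbol{\eta}(s, t) = \mathbf{V}(s, 0)\cdot\mathbf{v}_s(s, 0) + \boldsymbol{\eta}(s, 0) + \partial_s\int_0^t V_{kl}\,\partial_t v_l(s, \tau)\,\mathbf{d}_k(s, \tau)\,d\tau \tag{4.3}$$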
by (3.11). Since (3.11) only involves the $t$-derivative of $\boldsymbol{\eta}$, we choose the sum of the first two initial terms on the right-hand side of (4.3) to vanish, whence

$$\mathbf{V}(s, t)\cdot\mathbf{v}_s(s, t) + \boldsymbol{\eta}(s, t) = \partial_s[V_{kl}\,v_l(s, t)\,\mathbf{d}_k(s, t)] - \partial_s\int_0^t V_{kl}\,v_l(s, \tau)\,\partial_t\mathbf{d}_k(s, \tau)\,d\tau. \tag{4.4}$$
We next choose $\boldsymbol{\zeta}$ to ensure that the resulting version of (4.1) is invariant, without annihilating the expression involving $\mathbf{P}$. Omitting details, we find that this equation involves four $s$-derivatives of $\mathbf{r}$.

The modifications associated with the imposition of artificial viscosity in hyperbolic conservation laws support numerical methods like the Lax–Friedrichs scheme and the upwind schemes [4]. As mentioned above, standard versions of these methods, which do not correspond to invariant constitutive functions, can produce serious errors. The introduction of higher derivatives with respect to $s$ represents, in the language of hyperbolic conservation laws, a dispersive regularization, versions of which support the Lax–Wendroff and Beam–Warming schemes [4]. The treatment of (4.1) shows that the invariant introduction of artificial viscosity may also introduce some dispersive effects.

Slemrod [5] first associated the artificial viscosity in the equation of mass conservation for 1-dimensional gas dynamics with capillarity. In material coordinates, the equation for conservation of mass becomes a compatibility equation and capillarity corresponds to strain-gradient effects for solid mechanics.

It seems likely that the introduction of (properly invariant) artificial viscosity into compatibility equations gives additional regularity to the solutions. Thus the analysis of the regularized system should have advantages that more than compensate for the additional complexity.

Acknowledgment

The work reported here was supported in part by grants from the NSF and ARO.

References

1. S.S. Antman, Nonlinear Problems of Elasticity. Springer, New York (1995).
2. S.S. Antman, Physically unacceptable viscous stresses. Z. Angew. Math. Phys. 49 (1998) 980–988.
3. S.S. Antman and J.-G. Liu, Errors in the numerical treatment of hyperbolic conservation laws caused by lack of invariance, in preparation.
4. R.J. LeVeque, Numerical Methods for Conservation Laws. Birkhäuser, Basel (1992).
5. M. Slemrod, Dynamics of first order phase transitions. In: M.E. Gurtin (ed.), Phase Transitions and Material Instabilities in Solids. Academic Press, New York (1984) pp. 163–203.
6. S. Zheng, Nonlinear Parabolic Equations and Hyperbolic-Parabolic Coupled Systems. Longman, New York (1995).
An Average-Stretch Full-Network Model for Rubber Elasticity

MILLARD F. BEATTY
Department of Engineering Mechanics, University of Nebraska-Lincoln, P.O. Box 910215, Lexington, KY 40591-0215, U.S.A.

Received 24 April 2002; in revised form 14 March 2003

Abstract. Two constitutive models that are based on the classical non-Gaussian, Kuhn–Grün probability distribution function are reviewed. It is shown that all chains of a network cell structure comprised of a finite number of identical chains in an affine deformation referred to principal axes may have the same invariant stretch, if and only if the chains are oriented initially along any of eight directions forming the diagonals of a unit cube. The 4-chain tetrahedral and the 8-chain cubic cell structures are familiar admissible models having this property. An easy derivation of the constitutive equation for the Wu and van der Giessen full-network model of initially identical chains arbitrarily oriented in the undeformed state is presented. The constitutive equations for the neo-Hookean model, the 3-chain model, and the equivalent 4- and 8-chain models are then derived from the Wu and van der Giessen equation. The squared chain stretch of an arbitrarily directed chain averaged over a unit sphere surrounding all chains radiating from a cross-link junction as its center is determined. An average-stretch, full-network constitutive equation is then derived by approximation of the Wu and van der Giessen equation. This result, though more general in that no special chain cell morphology is introduced, is the same as the constitutive equation for the 4- and 8-chain models. Some concluding remarks on extensions to amended models are presented.

Mathematics Subject Classifications (2000): 74B20, 82D60.

Key words: rubber elasticity, non-Gaussian chains, constitutive equations, full-network models.
Dedicated in memory of my friend and esteemed teacher, Clifford Ambrose Truesdell III
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 65–86. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

Various models of rubber elasticity are based on the non-Gaussian statistical characterization of a network of randomly oriented, perfectly flexible molecular chains that occupy their most probable configurations in the natural, undeformed state. These models use as a measure of deformation the change in length of the end-to-end vector r between molecular cross-links; and an affine deformation assumption relates the chain stretch to the macroscopic stretch of the continuum. The configurational entropy of a chain, and hence its strain energy, is determined by specification of a probability distribution function P(r) that derives from non-Gaussian statistical theory, the distribution P(r) being a rough measure of the range of variation among the very many possible configurations of a chain [1]. Some early studies of such distribution functions for freely jointed chains are described by Treloar [2], the most widely used and simplest of which is due to Kuhn and Grün [3]. The question then focuses on how to model a network of a great many such chains to obtain a continuum theory that characterizes the mechanical response of isotropic rubberlike materials in finite extension. Consequently, a number of specific network models have been proposed, including the James–Guth 3-chain [4], Flory–Rehner 4-chain [5], Arruda–Boyce 8-chain [6], and the Treloar [7], Treloar–Riding [8], and the Wu–van der Giessen [9] full-network models. The work by Treloar's group focuses on simple uniaxial [7] and biaxial [8] deformations, whereas the Wu and van der Giessen [9, 10] result admits general three-dimensional deformation states. All of these non-Gaussian network models use the Kuhn–Grün [3] probability distribution function, which is essentially a first order approximation developed from Rayleigh's exact Fourier integral representation [11]. In an effort to adjust for inaccuracies introduced through approximations in the Kuhn–Grün function, Jernigan and Flory [12] proposed an amended form of the distribution function. This idea was explored recently by Zúñiga and Beatty [13] in a study of several amended models of rubberlike materials. Details on these models and several additional references citing various phenomenological models, including the review article [14], may be found there. The Arruda–Boyce model, however, has proved to be the most successful in that it is mathematically simpler than others, it compares most favorably with experiments on the mechanical response of various elastomers under diverse loading conditions, and it requires determination of only two well-defined material constants.
Nevertheless, in comparison with experimental data, none of these theoretical models predicts fully accurate material response for all deformations studied, the greatest variance generally occurring for equibiaxial extension.

Here we study the Wu and van der Giessen [9, 10] full-network model. Their major result is a formidable, three-dimensional integral type of constitutive equation for an incompressible and isotropic, hyperelastic rubberlike material whose microstructure is characterized by a full-network of initially identical chains randomly oriented in the undeformed state. In applications, however, they admit that their rule requires time-intensive numerical computation, so no specific analytical results have been obtained. With their result in hand, we seek a simpler but general constitutive equation for a uniform full-network microstructure. First, we show that the constitutive equations for the classical neo-Hookean (Gaussian network) model [2], the James–Guth 3-chain model [4], and the Arruda–Boyce 8-chain model [6], or the equivalent Wang–Guth 4-chain model [11], may be derived from the general Wu and van der Giessen equation. It is then shown that the squared chain stretch of an arbitrarily directed chain averaged over a unit sphere is a certain function of the first principal invariant $I_1(\mathbf{B})$ of the Cauchy–Green deformation tensor $\mathbf{B}$. With the aid of this result, a general average-stretch, full-network constitutive equation valid in every reference frame is obtained by approximation of the Wu and van der Giessen principal stress-stretch equation. The reduced equation, though more general in that no specific chain cell morphology is assumed, has precisely the same form as the constitutive equation for the equivalent 4- and 8-chain models. The same average-stretch procedure may be applied to full-network models characterized by certain amended distribution functions [13], all of which are thus approximated by the same formal constitutive equation, but each having a different isotropic elastic response function. It is shown elsewhere [15] that parallel results may be derived for the back stress tensor in amorphous glassy polymers.

2. Work of Deformation

We begin with a sketch of some relations for uniform non-Gaussian networks of perfectly flexible chains whose end points occupy their most probable positions in the reference configuration [1–3, 11, 12]. The non-Gaussian statistical treatment of a single, freely jointed molecular chain model accounts for the finite extensibility of the end-to-end chain vector length $r$ up to its ultimate, fully extended chain length $r_L \equiv Nl$, where $N$ is the number of rigid links, each of length $l$. In its undeformed state, the mean end-to-end chain vector length is given by $r_0 = \sqrt{N}\,l$ [1, 2].* Hence, the fully extended, chain locking stretch is defined by $\lambda_L \equiv r_L/r_0 = \sqrt{N}$. It is useful to define the relative chain stretch $\lambda_r$ as the ratio of the current chain vector length $r = r_{\text{chain}}$ to its fully extended length $r_L$. In terms of the chain stretch $\lambda_{\text{chain}} \equiv r_{\text{chain}}/r_0$, we then have

$$\lambda_r = \frac{r_{\text{chain}}}{r_L} = \frac{\lambda_{\text{chain}}}{\sqrt{N}}; \tag{2.1}$$

and hence $\lambda_r$ varies from $N^{-1/2}$ in the undeformed state to the value 1 in the deformed, fully extended state: $N^{-1/2} \le \lambda_r \le 1$, with $\lambda_r \to 0$ as $N \to \infty$.

* Recent analysis [16] by computational simulations has shown that as a consequence of annealing, that is, finding the mechanically equilibrated states of network chains, each of several unimodular networks has a mean undeformed end-to-end vector length that is roughly 10–20% smaller than the classical root mean square (rms) value $r_0 = \sqrt{N}\,l$ [1, 2]. This revelation, however, is of no concern in our current study of non-Gaussian networks for which we retain the classical rms value.

** This is essentially the first order approximation in a series representation of the Rayleigh distribution function discussed later in Section 7. See [1, Chapter VIII] for these and other details on the Kuhn–Grün function for freely jointed chains.

*** An alternative exact derivation of the chain tension relation $f = (1/Nl)\,dw(\lambda_r)/d\lambda_r = (k\Theta/l)\beta$, concluded without proceeding by way of the entropy function, is provided by Weiner [17, p. 244–247] based on the stress ensemble viewpoint. See also [2, pp. 108–109] and the subsequent discussion there on the non-Gaussian network theory.

The entropy $s \equiv k\ln P(r)$ for a single randomly oriented, freely jointed molecular chain is defined in terms of a probability distribution function $P(r)$ depending on $r$, where $k$ is the universal Boltzmann constant [1, 2]. Kuhn and Grün [3] derived an approximate** non-Gaussian expression for $P(r)$ that yields the following configurational entropy*** for a single, randomly oriented chain,

$$s = k\left[c - N\left(\lambda_r\beta + \ln\frac{\beta}{\sinh\beta}\right)\right], \tag{2.2}$$
wherein $c$ is a constant and

$$\beta \equiv L^{-1}(\lambda_r) \tag{2.3}$$

is the inverse of the Langevin function $L(\beta)$. Therefore,

$$\lambda_r = L(\beta) \equiv \coth\beta - \frac{1}{\beta}, \tag{2.4}$$
where we recall the relative chain stretch (2.1).

The work of deformation, the strain energy per chain, is given by $w = -\Theta s$, in which $\Theta$ is the absolute temperature. Hence, for the Kuhn–Grün chain entropy function (2.2), the strain energy per chain is determined by

$$w(\lambda_r) = k\Theta N\left(\lambda_r\beta + \ln\frac{\beta}{\sinh\beta}\right) - c^*, \tag{2.5}$$

$c^*$ being a constant chosen so that the energy vanishes in the undeformed state. The relative chain stretch in an affine deformation of an arbitrarily directed chain is readily related to the macroscopic principal stretches of the continuum, which is considered incompressible. The total strain energy for a full-network model, however, will depend on the orientation, distribution, and concentration of all of the chains in the bulk material. We shall return to this farther on. For simplicity, however, this complication may be removed by the introduction of specific chain cell structures.

To characterize these structures, let $n$ denote the chain density, the number of freely jointed chains per unit volume of the bulk material. Suppose that the microstructure consists of an assembly of certain unit cells of $p$ chains initially oriented in $p$ distinct directions emanating from a cross-link and each having a different relative chain stretch $\lambda_{pr}$. Assuming that the chain density $n_p$ for the $p$th directed set of chains is the same for each direction, we have $n_p = n/p$; and the distribution of the network of chains is called homogeneous. Therefore, among $n$ chains per unit volume of a homogeneous distribution, the contribution to the total strain energy from all chains in the $p$th direction is $(n/p)\,w(\lambda_{pr})$. Thus, each of the $p$ distinct chains of the cell contributes an amount of energy $w_p = w(\lambda_{pr})$ to the total strain energy $W$, which is given by

$$W = \frac{n}{p}\sum_{j=1}^{p} w_j. \tag{2.6}$$
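Relations (2.3)–(2.6) are straightforward to evaluate numerically. The Python sketch below is one possible implementation under stated assumptions: the inverse Langevin function is found by bisection (a choice made here, not a method taken from the paper), energies are reported in units of $k\Theta$, the constant $c^*$ is omitted, and all function names are illustrative.

```python
import math


def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta of (2.4)."""
    if abs(beta) < 1e-2:
        # Leading Taylor terms avoid the cancellation in coth(beta) - 1/beta.
        return beta / 3.0 - beta ** 3 / 45.0 + 2.0 * beta ** 5 / 945.0
    return 1.0 / math.tanh(beta) - 1.0 / beta


def inverse_langevin(lam_r):
    """beta = L^{-1}(lam_r) of (2.3) for 0 <= lam_r < 1, found by bisection."""
    if not 0.0 <= lam_r < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while langevin(hi) < lam_r:  # enlarge the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < lam_r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chain_energy(lam_r, n_links):
    """w / (k * Theta) from (2.5), with the additive constant c* omitted."""
    beta = inverse_langevin(lam_r)
    if beta < 1e-4:
        log_term = -beta ** 2 / 6.0  # ln(beta / sinh(beta)) for small beta
    else:
        # ln(beta / sinh(beta)), arranged so that sinh never overflows.
        log_term = math.log(2.0 * beta) - beta - math.log1p(-math.exp(-2.0 * beta))
    return n_links * (lam_r * beta + log_term)


def cell_energy(rel_stretches, n_links, chain_density=1.0):
    """W / (k * Theta) from (2.6) for a cell of p = len(rel_stretches) chains."""
    p = len(rel_stretches)
    return chain_density / p * sum(chain_energy(x, n_links) for x in rel_stretches)
```

A useful consistency check is that the numerical derivative of `chain_energy` with respect to `lam_r` equals `n_links * inverse_langevin(lam_r)`, which is the chain tension relation $dw/d\lambda_r = k\Theta N\beta$ in the present units.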
Clearly, it is not necessary to include oppositely directed chains of a symmetric cell structure; these have the same stretch and contribute the same energy, so $p$ may be replaced by $p/2$. If all chains in the cell should have the same relative stretch $\lambda_{jr} = \lambda_r$, then $\sum_{j=1}^{p} w_j = p\,w(\lambda_r)$. In this case, the total strain energy per unit volume for a homogeneous network of non-Gaussian chains is provided by

$$W(\lambda_r) = n\,w(\lambda_r). \tag{2.7}$$
We shall see momentarily that the 3-chain and 8-chain cell structures, for example, are respectively characterized by (2.6) and (2.7).

We recall that the Gaussian (neo-Hookean) theory assumes that a molecular chain adopts a tight configuration with end-to-end separation $r$ that is small compared to its fully extended length (i.e., for $\lambda_{\text{chain}} \ll \sqrt{N}$). Therefore, the results for this model are limited to moderate stretches for which the approximation of (2.3) is given by

$$\beta \approx 3\lambda_r. \tag{2.8}$$
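The quality of the approximation (2.8) can be gauged directly from (2.4), without inverting the Langevin function: since $L(\beta) = \beta/3 - \beta^3/45 + \cdots$, the relative error of $3\lambda_r$ as an estimate of $\beta$ is about $\beta^2/15$ for small $\beta$. The short Python sketch below, with hypothetical function names and sample values, tabulates that error.

```python
import math


def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta of (2.4), for beta > 0."""
    return 1.0 / math.tanh(beta) - 1.0 / beta


def gaussian_relative_error(beta):
    """Relative error of the estimate 3 * lam_r for beta, where lam_r = L(beta)."""
    lam_r = langevin(beta)
    return (beta - 3.0 * lam_r) / beta


# Sample values of beta (illustrative): (beta, lam_r, relative error of 3 * lam_r).
table = [(b, langevin(b), gaussian_relative_error(b)) for b in (0.1, 0.5, 1.0, 2.0, 5.0)]
```

The error is well below one percent while $\lambda_r$ is a few hundredths, and it exceeds fifty percent once $\lambda_r$ reaches about 0.8, which is why the Gaussian theory is limited to moderate stretches.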
It follows that the non-Gaussian, Kuhn–Grün energy function (2.5) reduces to the Gaussian strain energy function per chain. The variance encountered between predictions of the Gaussian theory and experiments at moderate strains, therefore, very likely will not be significantly diminished by any non-Gaussian network model based on the Kuhn–Grün distribution function, as remarked by Treloar and Riding [8].

The constitutive equations for several models of interest are sketched below. First, however, let us note that the strain energy function for an incompressible, isotropic hyperelastic material is a symmetric function $W = \hat{W}(\lambda_1, \lambda_2, \lambda_3)$ of the principal stretches $\lambda_j$, subject to the incompressibility constraint $\lambda_1\lambda_2\lambda_3 = 1$. Then the stress-stretch equations for the principal Cauchy stress components $T_j$ are provided by

$$T_j = -p + \lambda_j W_j, \quad j = 1, 2, 3 \quad \text{(no sum)}, \tag{2.9}$$
where $p$ is an arbitrary pressure and $W_j \equiv \partial\hat{W}/\partial\lambda_j$.

3. The James–Guth 3-Chain Model

The James–Guth 3-chain model [4] considers a homogeneous full-network of $n$ chains per unit volume oriented along the three mutually orthogonal principal directions of deformation, all having the same initial (rms) chain vector length $r_0 = \sqrt{N_3}\,l$. Here $N_3$ denotes the number of rigid links, each of length $l$, in a molecular chain between cross-links of the network. In an affine deformation the current chain vector length in the $j$th direction is then defined by $r_{j\,\text{chain}} \equiv \lambda_j r_0$, where $\lambda_j$ denotes the macroscopic principal stretch of the continuum along the $j$th principal axis. The corresponding $j$th relative chain stretch $\lambda_{jr}$, in accordance with (2.1), is defined by

$$\lambda_{jr} \equiv \frac{\lambda_{j\,\text{chain}}}{\lambda_L} = \frac{\lambda_j}{\sqrt{N_3}}, \quad j = 1, 2, 3, \tag{3.1}$$

in which $\lambda_{j\,\text{chain}} \equiv r_{j\,\text{chain}}/r_0 = \lambda_j$ defines the $j$th current chain stretch and $\lambda_L$ is the chain locking stretch. Since the 6-chain cell structure is symmetric and the distribution of chains is homogeneous, the chain density $n_j$ for the $j$th set of orthogonal chains is the same for each set: $n_j = n/3$, though $w_j$ is not. Hence, from (2.5) and (2.6) in which $p = 3$, the total strain energy $W = \hat{W}(\lambda_1, \lambda_2, \lambda_3)$ per unit volume for the James–Guth 3-chain network model may be written as

$$W = \frac{nk\Theta N_3}{3}\sum_{j=1}^{3}\left(\beta_j\lambda_{jr} + \ln\frac{\beta_j}{\sinh\beta_j}\right) - c_3, \tag{3.2}$$
in which the constant $c_3$ is chosen so that the strain energy vanishes in the undeformed state; and, by (2.3), $\beta_j \equiv L^{-1}(\lambda_{jr})$. Observing that $\lambda_j\,\partial W/\partial\lambda_j = \lambda_{jr}\,\partial W/\partial\lambda_{jr}$ and using (3.2) in (2.9), we obtain the constitutive equation for the James–Guth 3-chain model in the principal reference system:

$$T_j = -p + \frac{\mu_0 N_3}{3}\,\beta_j\lambda_{jr}, \quad j = 1, 2, 3 \quad \text{(no sum)}, \tag{3.3}$$

wherein $\mu_0$ is the shear modulus in the undeformed state:

$$\mu_0 \equiv nk\Theta. \tag{3.4}$$
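As an illustration of (3.3), consider a uniaxial extension with $\lambda_1 = \lambda$ and $\lambda_2 = \lambda_3 = \lambda^{-1/2}$, and with the lateral stresses set to zero so that the pressure $p$ is eliminated. This loading case, the bisection-based inverse Langevin routine, and the function names in the Python sketch below are assumptions made for the example rather than material from the paper.

```python
import math


def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta."""
    if abs(beta) < 1e-2:
        # Leading Taylor terms avoid the cancellation in coth(beta) - 1/beta.
        return beta / 3.0 - beta ** 3 / 45.0 + 2.0 * beta ** 5 / 945.0
    return 1.0 / math.tanh(beta) - 1.0 / beta


def inverse_langevin(lam_r):
    """Inverse Langevin function on [0, 1), computed by bisection."""
    if not 0.0 <= lam_r < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while langevin(hi) < lam_r:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < lam_r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def three_chain_uniaxial_stress(stretch, n_links, mu0=1.0):
    """Axial Cauchy stress T_1 from (3.3) with T_2 = T_3 = 0 (incompressible)."""
    lateral = stretch ** -0.5
    locking = math.sqrt(n_links)  # chain locking stretch, lambda_L = sqrt(N_3)

    def response(lam_j):
        lam_jr = lam_j / locking  # relative chain stretch, cf. (3.1)
        return mu0 * n_links / 3.0 * inverse_langevin(lam_jr) * lam_jr

    # Subtracting the lateral equation from the axial one eliminates p.
    return response(stretch) - response(lateral)
```

For a very large number of links the result approaches the neo-Hookean value $\mu_0(\lambda^2 - \lambda^{-1})$, while for small $N_3$ the stress rises steeply as $\lambda$ approaches the locking stretch $\sqrt{N_3}$.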
The non-Gaussian stress-stretch relations (3.3) are determined by two parameters: the shear modulus $\mu_0$, and the number of links $N_3$ in a single chain of the 3-chain network model. The latter controls the stiffening behavior of rubber materials at large strains and determines the ultimate extensibility of the network.

With the aid of the infinite series expansion [2, 7] for $L^{-1}(\lambda_{jr})$ and use of the Cayley–Hamilton theorem for which $\mathbf{B}^3 = I_1\mathbf{B}^2 - I_2\mathbf{B} + \mathbf{1}$ for an incompressible material, noting that the left Cauchy–Green deformation tensor $\mathbf{B} = \text{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$ in the principal reference system, we see that (3.3) may be cast in the familiar invariant tensorial form

$$\mathbf{T} = -p\mathbf{1} + \aleph_1(I_1, I_2; N_3)\,\mathbf{B} + \aleph_2(I_1, I_2; N_3)\,\mathbf{B}^2, \tag{3.5}$$

where the elastic response functions $\aleph_\alpha(I_1, I_2; N_3)$ are defined by the infinite series

$$\begin{aligned}
\aleph_1(I_1, I_2; N_3) &= \mu_0\left[1 - \frac{99}{175N_3^2}\,I_2 + \frac{513}{875N_3^3}\,(1 - I_1I_2) + \frac{42039}{67375N_3^4}\,(I_1 - I_1^2I_2 + I_2^2) + \cdots\right],\\
\aleph_2(I_1, I_2; N_3) &= \mu_0\left[\frac{3}{5N_3} + \frac{99}{175N_3^2}\,I_1 + \frac{513}{875N_3^3}\,(I_1^2 - I_2) + \frac{42039}{67375N_3^4}\,(I_1^3 - 2I_1I_2 + 1) + \cdots\right].
\end{aligned} \tag{3.6}$$
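The rational coefficients in (3.6) are one third of the coefficients in the power series of the inverse Langevin function. The Python sketch below recovers them in exact arithmetic by reverting the Taylor series $L(\beta) = \beta/3 - \beta^3/45 + 2\beta^5/945 - \beta^7/4725 + 2\beta^9/93555 - \cdots$ with a fixed-point iteration. The iteration, the truncation order, and the function names are choices made for this illustration, not the route taken in [2, 7].

```python
from fractions import Fraction

ORDER = 9  # highest power of x retained in every truncated series

# Taylor coefficients of L(beta) beyond the linear term beta/3.
HIGHER = {3: Fraction(-1, 45), 5: Fraction(2, 945),
          7: Fraction(-1, 4725), 9: Fraction(2, 93555)}


def multiply(a, b):
    """Product of two coefficient lists, truncated at x**ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > ORDER:
                break
            out[i + j] += ai * bj
    return out


def power(a, n):
    """n-th power of a truncated series."""
    out = [Fraction(1)] + [Fraction(0)] * ORDER
    for _ in range(n):
        out = multiply(out, a)
    return out


def inverse_langevin_series():
    """Coefficients b_m with L^{-1}(x) = sum of b_m x**m, through x**ORDER."""
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (ORDER - 1)
    beta = [3 * c for c in x]  # first guess: beta = 3x
    for _ in range(5):
        # Fixed point of beta = 3 * (x - sum of c_n * beta**n); each pass
        # fixes at least one further odd-order coefficient.
        rhs = list(x)
        for n, c in HIGHER.items():
            pn = power(beta, n)
            rhs = [r - c * q for r, q in zip(rhs, pn)]
        beta = [3 * r for r in rhs]
    return beta


def stress_series_coefficients():
    """Coefficients multiplying B, B^2, B^3, ... in mu_0 * N_3^(-m) form."""
    return [b / 3 for b in inverse_langevin_series()[1::2]]
```

The returned list begins 1, 3/5, 99/175, 513/875, 42039/67375. These are the multipliers of $\mathbf{B}, \mathbf{B}^2, \ldots, \mathbf{B}^5$ before the Cayley–Hamilton reduction that produces the invariant combinations in (3.6).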
A similar series of terms has been absorbed in the arbitrary pressure term which is simply rewritten as $p$. It is evident that, except for its invariant tensorial representation in the form (3.5), there is little significant advantage gained over the principal axis representation (3.3), except possibly for finite element applications. By (2.8), for small to moderate stretches $\beta_j \approx 3\lambda_{jr}$, and (3.3) reduces easily to the neo-Hookean constitutive model [2]:

$$\mathbf{T} = -p\mathbf{1} + \mu_0\mathbf{B}. \tag{3.7}$$
4. The Arruda–Boyce 8-Chain Model

We next develop the constitutive equation for the Arruda–Boyce 8-chain model. We begin with a different viewpoint and thereby obtain a new auxiliary result. Let us consider a single, perfectly flexible and freely jointed molecular chain whose undeformed (rms) chain vector length is $r_0 = l\sqrt{N}$ and whose chain vector $\mathbf{r}_0 = (X, Y, Z)$ in the reference configuration is directed along the line $X = Y = Z$ through $O$ in a rectangular Cartesian frame $\psi = \{O; \mathbf{I}_k\}$. Then, in the undeformed network, the chain vector has length $r_0 = X\sqrt{3}$ and its direction in $\psi$ is $\mathbf{m} = (1, 1, 1)/\sqrt{3}$. (See Figure 1(a).) Now suppose that the network is subjected to an affine deformation in which $\psi$ coincides with the local principal axes. The corresponding principal stretches are denoted by $\lambda_j$, and the squared stretch of a chain initially oriented in an arbitrary referential direction $\mathbf{m}$ is determined by

$$\lambda_{\rm chain}^2 = \mathbf{m}\cdot\mathbf{C}\mathbf{m} = \sum_{k=1}^{3} m_k^2\lambda_k^2, \tag{4.1}$$
where $\mathbf{C}$ is the right Cauchy–Green deformation tensor. Hence, for our special single chain model, (4.1) yields the chain stretch

$$\lambda_{\rm chain} = \frac{r_{\rm chain}}{r_0} = \sqrt{\frac{I_1}{3}}, \tag{4.2}$$

in which $I_1 \equiv \lambda_1^2 + \lambda_2^2 + \lambda_3^2$. We recall that $I_1$ is the first principal invariant of the Cauchy–Green deformation tensor $\mathbf{C}$ or $\mathbf{B}$, each being equal to $\mathrm{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$ in its corresponding principal reference system. The same result holds for a chain whose end point is initially situated along any of the three similar lines $-X = Y = Z$; $-X = -Y = Z$; and $X = -Y = Z$. Hence, (4.2) is a necessary condition in order that a chain may be initially oriented along any of the aforementioned eight directions radiating from $O$ in the principal referential frame $\psi$.

Let us consider the converse question. What are the orientations of all chains whose chain stretch is given by (4.2)? Consider any chain $\mathcal{C}$ whose end points in the principal frame $\psi$ are at $O$ and $\mathbf{r}_0 = r_0 m_k\mathbf{I}_k$ initially, where $m_k$ are its direction cosines. In its deformed state in $\psi$, the squared stretch of the chain $\mathcal{C}$ with direction $\mathbf{m}$ is determined by (4.1), that is, by

$$\lambda_{\rm chain}^2 = \lambda_1^2 m_1^2 + \lambda_2^2 m_2^2 + \lambda_3^2 m_3^2. \tag{4.3}$$
Figure 1. Various chain cell models having the same chain stretch. (a) 1-chain structure, (b) 4-chain tetrahedral structure, (c) 8-chain cubic structure, (d) 4-chain semi-octahedral and 8-chain octahedral structures.
Now suppose that the same network is subjected to an affine deformation with the same values of the principal stretches as before, but with $\lambda_1$ and $\lambda_2$ interchanged. The chain $\mathcal{C}$ now has the squared stretch

$$\lambda_{\rm chain}^2 = \lambda_2^2 m_1^2 + \lambda_1^2 m_2^2 + \lambda_3^2 m_3^2 \tag{4.4}$$

for the same initial vector $\mathbf{r}_0$ and the same values of the stretches $\lambda_k$. Of course, the interchange of stretches does not alter (4.2); so, if the chain stretch in both (4.3) and (4.4) is the invariant chain stretch (4.2), then

$$(\lambda_1^2 - \lambda_2^2)(m_1^2 - m_2^2) = 0 \tag{4.5}$$

must hold for arbitrary principal stretches $\lambda_k$ that satisfy the incompressibility condition. It follows that $m_1^2 = m_2^2$ must hold initially for the chain $\mathcal{C}$. Clearly,
a relation similar to (4.5) holds when the same network is again subjected to a deformation with the same values of the principal stretches as before, but with $\lambda_1$ and $\lambda_3$, or $\lambda_2$ and $\lambda_3$, interchanged. Consequently, $m_1^2 = m_2^2 = m_3^2 = 1/3$ must hold initially for the chain vector $\mathbf{r}_0$. It thus follows that the only chain orientations for which (4.2) holds for all deformations with arbitrary principal stretches $\lambda_k$ are chains situated along straight lines through the origin and defined by $X^2 = Y^2 = Z^2$ in $\psi$, that is, the four lines described earlier and directed along the diagonals of a unit cube. We thus have the following auxiliary result.

The chain stretch $\lambda_{\rm chain}$ in an affine deformation with local principal stretches $\lambda_k$ has the invariant form (4.2) if and only if the chain is oriented initially in any of the eight directions from $O$ along the diagonals of a unit cube.

As a consequence, we have only a few possible network cell structures having a finite number of chain orientations for which (4.2) holds. These include the 1-chain structure, the 4-chain tetrahedral structure, and the 8-chain cubic structure shown in Figure 1, in which the origin $O$ in Figures 1(b) and (c) is at the center of the cube. The 4-chain semi-octahedral and 8-chain octahedral structures in Figure 1(d) are identical to the cubic structure. The 4-chain and 8-chain models, therefore, are important special members of the larger geometrical class of uniform polyhedra networks having $\nu$ chains, all with initial vector length $r_0$. In the deformed state, a uniform polyhedron chain structure is distorted with varying degrees of relative chain stretch among its many chains. The 4- and 8-chain network models, however, are unique among these. They are the only uniform polyhedra chain morphologies all of whose chains in the principal reference system have the same chain stretch (4.2), and hence the same strain energy per chain. Therefore, these geometrically similar models are said to be isomorphic.
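The auxiliary result can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not part of the original development: it evaluates (4.1) for the eight cube-diagonal directions and for one off-diagonal direction, using hypothetical helper names (`chain_stretch_sq`, `cube_diagonals`) and an arbitrarily chosen incompressible set of stretches.

```python
import itertools
import math


def chain_stretch_sq(m, stretches):
    """Squared chain stretch (4.1) for direction cosines m in the principal frame."""
    return sum(mk * mk * lk * lk for mk, lk in zip(m, stretches))


def cube_diagonals():
    """The eight unit vectors (+-1, +-1, +-1)/sqrt(3)."""
    s = 1.0 / math.sqrt(3.0)
    return [tuple(sign * s for sign in signs)
            for signs in itertools.product((1, -1), repeat=3)]


lam1, lam2 = 1.8, 0.9
stretches = (lam1, lam2, 1.0 / (lam1 * lam2))   # incompressible deformation
i1 = sum(l * l for l in stretches)

diagonal_values = [chain_stretch_sq(m, stretches) for m in cube_diagonals()]
axis_value = chain_stretch_sq((1.0, 0.0, 0.0), stretches)

print("I1/3              :", i1 / 3.0)
print("diagonal chains   :", diagonal_values)
print("chain along I_1   :", axis_value)
```

Every diagonal chain returns $I_1/3$, in agreement with (4.2), whereas the chain along a principal axis returns $\lambda_1^2$, which differs from $I_1/3$ unless the stretches happen to coincide.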
Equation (4.2) is the same as the Arruda–Boyce equation (16) in [6] obtained for a network model based on eight chains of undeformed vector length $r_0 = l\sqrt{N_8}$ linked at the center of a cube and extending to its eight corners, $N_8$ denoting the number of chain links of the 8-chain model. Henceforward, in regard to this model, we shall write $N = N_8$ and refer to our subsequent constitutive equation as the Arruda–Boyce 8-chain model. With (4.2), the relative chain stretch (2.1) is given by

$$\lambda_r = \frac{\lambda_{\rm chain}}{\lambda_L} = \sqrt{\frac{I_1}{3N_8}}, \tag{4.6}$$
[Footnote: This is viewed somewhat differently by Yeoh and Fleming [18]. According to them, the constitutive coincidence of the 4- and 8-chain models arises because the 8-chain model is isomorphic to a body centered cubic lattice comprised of two tetrahedral diamond sublattices; and the 4-chain model is isomorphic to a diamond lattice.]

[Footnote: Wu and van der Giessen [9] observed from simple extension data that $N_3 \approx 3N_8$; so, experimentally determined values for $N_3$ and $N_8$ for the same physical parameter $N$ may vary considerably. See also the remarks in [13].]
where $\lambda_L = \sqrt{N_8}$, the fully extended chain stretch. All eight chains of the symmetric 8-chain network, in an affine deformation of its cubic structure in the principal frame $\psi$, have the same relative chain stretch (4.6). We shall assume that the network chain density $n$ is distributed uniformly among these eight (or by symmetry, four) chain directions. Hence, by (2.5) and (2.7), the total strain energy per unit volume for the Arruda–Boyce 8-chain network model is

$$W = \mu_0 N_8\left[\beta\lambda_r + \ln\left(\frac{\beta}{\sinh\beta}\right)\right] - c_8; \tag{4.7}$$

where $c_8$ is a convenient constant, $\beta$ is defined by (2.3), and we recall (3.4). Therefore, the strain energy for the Arruda–Boyce model is simply a function of the principal invariant $I_1$ alone. Hence, substitution of (4.7) into (2.9) yields the constitutive equation for the Arruda–Boyce 8-chain network model [6, 19]:

$$\mathbf{T} = -p\mathbf{1} + \aleph(I_1)\,\mathbf{B}, \tag{4.8}$$
where $\mathbf{B} = \mathrm{diag}[\lambda_1^2, \lambda_2^2, \lambda_3^2]$ in the principal reference system of the deformed state and the material response function $\aleph(I_1)$ is defined by

$$\aleph(I_1) \equiv \frac{\mu_0\beta}{3\lambda_r}, \tag{4.9}$$
with $\beta = L^{-1}(\lambda_r)$ and $\lambda_r$ given by (4.6). We see that the invariant 8-chain rule (4.8) with the single elastic response function (4.9) is far simpler than the corresponding 3-chain result in (3.5). For small to moderate values of the relative chain stretch (4.6) for which (2.8) holds, (4.9) yields $\aleph(I_1) = \mu_0$, a constant; and (4.8) reduces to the neo-Hookean (Gaussian network) relation (3.7), the same moderate principal stretch relation obtained from the 3-chain model.

[Footnote: The result (4.8) in terms of principal stretches was first reported without demonstration by Wang and Guth [11, 13] for the distinct Flory–Rehner 4-chain tetrahedral model [5]. Arruda and Boyce [6] subsequently derived the same principal stretch result for the cubic 8-chain cell model and studied it extensively in experiments on a variety of rubber materials. The constitutive equation (4.8), however, holds in every reference system, a useful property noted in another context in [19]. See also [20].]

5. The Wu and van der Giessen Full-Network Model

The development of special network models, such as the 3-chain and 8-chain cell structures, avoids the more difficult consideration of random chain orientations studied by Wu and van der Giessen [9, 10]. A much simpler construction of their major result is presented next, and some basic applications follow.

Let us consider a network cell of randomly oriented molecular chains radiating from an arbitrarily chosen cross-link junction $O$. The other end point of each chain is similarly connected with other randomly oriented network chains. For a homogeneous network, every network chain has the same initial, unstretched end-to-end chain vector length $r_0 = l\sqrt{N}$. Consequently, each chain emanating from $O$ has its end point on the surface of a sphere $S$ of radius $r_0$ and volume $v = 4\pi r_0^3/3$. The volume averaged value of the strain energy per chain for the uniform network of chains enclosed within $S$ is defined by

$$\langle w\rangle \equiv \frac{1}{v}\int_S w(\lambda_r)\,dv, \tag{5.1}$$

in which we recall the Kuhn–Grün strain energy (2.5) for a single randomly oriented molecular chain. The end point of a typical chain initially at $(X, Y, Z)$ in a referential Cartesian frame $\varphi = \{O; \mathbf{I}_k\}$ has the spherical coordinates $(r_0, \theta_0, \phi_0)$ with $\theta_0 \in [0, \pi]$, $\phi_0 \in [0, 2\pi]$. Hence, the total strain energy $W = n\langle w\rangle$, per unit referential volume of a uniform, full-network of chain density $n$, by (5.1), is given by

$$W = \frac{n}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} w(\lambda_r)\sin\theta_0\,d\theta_0\,d\phi_0. \tag{5.2}$$

This is equivalent to integrating (5.1) over a unit sphere. Though here derived differently, (5.2) is the same constitutive equation obtained by Wu and van der Giessen [9] for an initially homogeneous distribution of a large number of randomly oriented chains, an equation introduced earlier by Treloar and Riding [8]. The initial end-to-end chain vector $\mathbf{r}_0 = r_0\mathbf{m}$ in $\varphi$ has the direction cosines

$$\{m_i(\theta_0, \phi_0)\} = (\sin\theta_0\cos\phi_0,\ \sin\theta_0\sin\phi_0,\ \cos\theta_0); \tag{5.3}$$
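and the relative stretch $\lambda_r = \lambda_r(\theta_0, \phi_0; \lambda_{kr})$ of a randomly oriented chain, defined by (2.1), is determined by

$$\lambda_r^2 = \lambda_r^2(\theta_0, \phi_0; \lambda_{kr}) = \sum_{k=1}^{3} m_k^2\lambda_{kr}^2. \tag{5.4}$$

The relative principal stretches $\lambda_{kr}$ are defined by

$$\lambda_{kr} = \frac{\lambda_k}{\sqrt{N}}, \tag{5.5}$$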
as in (3.1). These are independent of the chain orientation angles $\theta_0$, $\phi_0$. Thus, by (2.9) and (5.2), the Wu and van der Giessen constitutive equation for the Cauchy stress may be concisely written as

$$T_k = -p + \aleph_k\lambda_k^2, \qquad k = 1, 2, 3 \quad \text{(no sum)}, \tag{5.6}$$

in which the elastic response functions $\aleph_k \equiv \aleph_k(\lambda_1, \lambda_2, \lambda_3)$ are defined by

$$\aleph_k = \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi} \mu_0\,\frac{\beta(\lambda_r)}{\lambda_r}\,m_k^2\,\sin\theta_0\,d\theta_0\,d\phi_0, \qquad k = 1, 2, 3, \tag{5.7}$$

wherein we recall (5.4) and the shear modulus $\mu_0$ is given by (3.4).
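To give a feel for what evaluating (5.7) involves, here is a rough numerical sketch. It is an illustration under assumptions, not the procedure of [9]: the inverse Langevin function is obtained by plain bisection, the double integral is approximated by a midpoint rule on a $\theta_0$, $\phi_0$ grid, and the names `langevin`, `inverse_langevin`, and `full_network_response` as well as the grid sizes are hypothetical choices.

```python
import math


def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta."""
    if abs(beta) < 1e-4:
        return beta / 3.0 - beta**3 / 45.0   # small-argument series
    return 1.0 / math.tanh(beta) - 1.0 / beta


def inverse_langevin(x, tol=1e-12):
    """Solve L(beta) = x for 0 <= x < 1 by bisection."""
    if not 0.0 <= x < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    if x == 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while langevin(hi) < x:                  # expand the bracket
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)


def full_network_response(stretches, N, mu0=1.0, n_theta=60, n_phi=120):
    """Midpoint-rule estimate of the three response functions in (5.7)."""
    lam_r = [l / math.sqrt(N) for l in stretches]          # (5.5)
    dth, dph = math.pi / n_theta, 2.0 * math.pi / n_phi
    acc = [0.0, 0.0, 0.0]
    for i in range(n_theta):
        th = (i + 0.5) * dth
        st, ct = math.sin(th), math.cos(th)
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            m = (st * math.cos(ph), st * math.sin(ph), ct)  # (5.3)
            x = math.sqrt(sum((mk * lk)**2 for mk, lk in zip(m, lam_r)))  # (5.4)
            g = inverse_langevin(x) / x
            weight = st * dth * dph
            for k in range(3):
                acc[k] += g * m[k]**2 * weight
    return [mu0 * a / (4.0 * math.pi) for a in acc]


# With a very large N the chains stay in the Gaussian range.
print(full_network_response((1.2, 1.0, 1.0 / 1.2), N=1.0e4))
```

For very large $N$ the three values are close to $\mu_0$, while for small $N$ and large stretches they grow as chains approach full extension. Each integrand evaluation requires an inverse Langevin solve, which hints at why direct use of (5.6) is numerically expensive.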
One might expect that the full-network model (5.6) should provide better predictions of observed experimental data than other approximate network models. But Wu and van der Giessen [9] do not find significantly better predictions than those demonstrated by the 8-chain model, which they attribute to factors other than its superiority. In applications, however, the formidable constitutive equation (5.6) must be solved by time-intensive numerical methods [9]. To get around this computational difficulty, Wu and van der Giessen introduce an ad hoc phenomenological constitutive equation consisting of an additive mixture of 3-chain and 8-chain constitutive components with coefficients chosen to best fit their general constitutive equation. They thus use this somewhat simpler composite phenomenological model in numerical applications in which $N$ has the same value for both contributions. We recall, however, that for the separate models experiments exhibit distinct values for $N_3$ and $N_8$. While their composite mixture rule is computationally simpler and easier to use in comparison of model results with test data, it replaces an amorphous structure with specific chain cell morphologies and it offers no significant analytical simplicity. (See also [13].) Consequently, we seek a general, but approximate constitutive equation that may be derived directly from (5.6). First, however, we shall show that three familiar special cases may be readily derived from this full-network equation.

5.1. THE GAUSSIAN NETWORK MODEL

The Gaussian network is characterized by the moderate stretch approximation (2.8). It thus follows that the response functions (5.7) are constants:

$$\aleph_k = \frac{3\mu_0}{4\pi}\,I_k, \qquad k = 1, 2, 3, \tag{5.8}$$

where

$$I_k \equiv \int_0^{2\pi}\!\!\int_0^{\pi} m_k^2\,\sin\theta_0\,d\theta_0\,d\phi_0. \tag{5.9}$$
With the aid of (5.3) in (5.9), we obtain $I_1 = I_2 = I_3 = 4\pi/3$; and hence each $\aleph_k = \mu_0$, a constant. Therefore, the full-network model (5.6) yields the familiar constitutive equation (3.7) for the neo-Hookean, Gaussian network model.

5.2. THE JAMES–GUTH 3-CHAIN MODEL

The principal stress components for the James–Guth 3-chain model depend only on the homogeneous distribution of chains with density $n_k = n/3$ in each of the corresponding three orthogonal principal directions, the chains in opposite directions being equivalent. Therefore, in (5.7) we replace $n$ with $n/3$; that is, in view of (3.4), $\mu_0$ is replaced with $\mu_0/3$. In addition, because each principal direction corresponds to a different relative chain stretch, for the $k$th principal direction
we replace $\lambda_r$ with $\lambda_{kr}$ in accordance with (3.1); and each of the three directed contributions must be taken into account. Because the chains are oriented initially along the three principal directions $\mathbf{I}_j$, the three chain directors $\mathbf{m}_j = m_{jk}\mathbf{I}_k$ are the constant vectors $\mathbf{m}_1 = (1, 0, 0)$, $\mathbf{m}_2 = (0, 1, 0)$, and $\mathbf{m}_3 = (0, 0, 1)$, in accord with (5.3). Consequently, (5.7) is now written as

$$\aleph_k = \frac{\mu_0}{12\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\left[\sum_{j=1}^{3}\beta_j\,\frac{m_{jk}^2}{\lambda_{jr}}\right]\sin\theta_0\,d\theta_0\,d\phi_0, \qquad k = 1, 2, 3, \tag{5.10}$$
in which $\beta_j \equiv L^{-1}(\lambda_{jr})$. Noting that $m_{jk}$ is the $k$th component of the $j$th principal chain director, we see for $k = 1$ that $\sum_{j=1}^{3}\beta_j m_{j1}^2/\lambda_{jr} = \beta_1/\lambda_{1r}$, for example. It is then easily seen that (5.10) yields

$$\aleph_k = \frac{\mu_0}{3}\,\frac{\beta_k}{\lambda_{kr}}, \qquad k = 1, 2, 3. \tag{5.11}$$
Recalling (3.1), we have $\lambda_k^2 = N_3\lambda_{kr}^2$, and hence with (5.11) we find that the constitutive equation (5.6) reduces to (3.3) for the non-Gaussian, James–Guth 3-chain model.

5.3. THE ARRUDA–BOYCE 8-CHAIN MODEL

The Arruda–Boyce model consists of a homogeneous distribution of identical chains with density $n_k = n/8$ in each of the eight chain directions oriented along the diagonals of a cubic cell, so $\mu_0$ is replaced with $\mu_0/8$. (Because chains in opposite directions are equivalent, actually we need only consider distributions of chains along the four diagonal directions.) All chains have the same squared direction cosines $m_k^2$; and all experience the same relative chain stretch in an affine deformation in the principal frame aligned with the edges of the referential cube. Therefore, it turns out that the effect of our accounting for these eight identical contributions in the manner described previously for the 3-chain model is simply equivalent to our considering a homogeneous distribution of single chains with chain density $n$ and having the direction triple

$$\{m_k\} = \frac{(1, 1, 1)}{\sqrt{3}}. \tag{5.12}$$

Then, with (5.5), we see that (5.4) yields $\lambda_r^2 = (1/(3N_8))(\lambda_1^2 + \lambda_2^2 + \lambda_3^2)$, i.e. the same relative stretch given in (4.6). Therefore, $\beta(\lambda_r)/\lambda_r$ is now independent of the coordinates $\theta_0$, $\phi_0$; and hence with the aid of (5.12) the response functions in (5.7) may be written as

$$\aleph_k = \frac{\mu_0\beta}{3\lambda_r}. \tag{5.13}$$
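The reduced response functions (5.11) and (5.13) are simple enough to evaluate directly. As an illustrative sketch only, the code below computes the uniaxial Cauchy stress for both models, assuming an incompressible simple extension with $\lambda_1 = \lambda$, $\lambda_2 = \lambda_3 = \lambda^{-1/2}$ and a traction-free lateral surface so that $p$ is eliminated through $T_2 = 0$. The function names and the bisection solver for $L^{-1}$ are hypothetical conveniences, not taken from the paper.

```python
import math


def langevin(beta):
    """Langevin function L(beta) = coth(beta) - 1/beta."""
    if abs(beta) < 1e-4:
        return beta / 3.0 - beta**3 / 45.0
    return 1.0 / math.tanh(beta) - 1.0 / beta


def inverse_langevin(x, tol=1e-12):
    """Solve L(beta) = x for 0 <= x < 1 by bisection."""
    if not 0.0 <= x < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    if x == 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while langevin(hi) < x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * max(1.0, hi):
            break
    return 0.5 * (lo + hi)


def uniaxial_stress_8chain(lam, n8, mu0=1.0):
    """T1 from (5.6) with the single response function (5.13)."""
    i1 = lam**2 + 2.0 / lam
    lam_r = math.sqrt(i1 / (3.0 * n8))                 # (4.6)
    aleph = mu0 * inverse_langevin(lam_r) / (3.0 * lam_r)
    return aleph * (lam**2 - 1.0 / lam)


def uniaxial_stress_3chain(lam, n3, mu0=1.0):
    """T1 from (5.6) with the response functions (5.11)."""
    lat = lam**-0.5                                    # lateral stretch
    root = math.sqrt(n3)
    b1 = inverse_langevin(lam / root)
    b2 = inverse_langevin(lat / root)
    return mu0 * root / 3.0 * (lam * b1 - lat * b2)


for lam in (1.5, 2.5, 3.5):
    print(lam,
          uniaxial_stress_8chain(lam, n8=8.0),
          uniaxial_stress_3chain(lam, n3=24.0),
          lam**2 - 1.0 / lam)                          # neo-Hookean, mu0 = 1
```

The sample values $N_8 = 8$ and $N_3 = 24$ are arbitrary illustrations. Both models approach the neo-Hookean stress for small stretches and stiffen as the relevant relative chain stretch approaches unity; the solver raises an error if a stretch beyond locking is requested.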
We see from (5.13) that all three functions $\aleph_k = \aleph(I_1)$ coincide with (4.9); and (5.6) thus reduces to (4.8) for the Arruda–Boyce 8-chain model [6]. A parallel argument holds for the distinct Flory–Rehner 4-chain model [5] shown in Figure 1(b) and again leads to (4.8), first recorded in terms of principal stretches and without proof by Wang and Guth [11].

6. An Average-Stretch Non-Gaussian Full-Network Model

Let us return to our homogeneous full-network unit of randomly oriented molecular chains of density $n$ radiating from an arbitrarily chosen cross-link junction at the origin of a principal reference frame $\varphi = \{O; \mathbf{I}_k\}$. Each chain diverging from $O$ has its end point on the surface of a sphere $S$ of radius $r_0$. In terms of spherical coordinates, the end point $\mathbf{r}_0 = r_0\mathbf{m}$ of a representative chain has the initial direction cosines given by (5.3) in $\varphi$. When the continuum is subjected to a deformation with local principal stretches $\lambda_k$ in $\varphi$, the same end point in an affine deformation has the end-to-end chain vector length $r_{\rm chain}$; and its corresponding squared chain stretch (4.1) may be written as

$$\lambda_{\rm chain}^2 = \frac{r_{\rm chain}^2}{r_0^2} = \sin^2\theta_0\,(\lambda_1^2\cos^2\phi_0 + \lambda_2^2\sin^2\phi_0) + \lambda_3^2\cos^2\theta_0. \tag{6.1}$$
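The orientation dependence in (6.1) is easy to probe numerically. The sketch below is an illustration only, with hypothetical names (`chain_stretch_sq`, `sphere_average`) and an arbitrary quadrature resolution: it averages the right-hand side of (6.1) over the unit sphere with a midpoint rule and compares the result with the squared diagonal-chain stretch $I_1/3$ of (4.2).

```python
import math


def chain_stretch_sq(theta, phi, stretches):
    """Right-hand side of (6.1) for orientation angles (theta, phi)."""
    l1, l2, l3 = stretches
    return (math.sin(theta)**2
            * (l1**2 * math.cos(phi)**2 + l2**2 * math.sin(phi)**2)
            + l3**2 * math.cos(theta)**2)


def sphere_average(f, n_theta=200, n_phi=200):
    """Midpoint-rule average of f(theta, phi) over the unit sphere."""
    dth, dph = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * dth
        weight = math.sin(th) * dth * dph      # solid-angle element
        for j in range(n_phi):
            total += f(th, (j + 0.5) * dph) * weight
    return total / (4.0 * math.pi)


lam1, lam2 = 2.0, 0.7
stretches = (lam1, lam2, 1.0 / (lam1 * lam2))  # incompressible deformation
i1 = sum(l * l for l in stretches)

avg = sphere_average(lambda th, ph: chain_stretch_sq(th, ph, stretches))
print("sphere average of squared stretch:", avg)
print("I1 / 3                           :", i1 / 3.0)
```

Up to quadrature error the two numbers agree, although individual orientations give squared stretches ranging between the smallest and largest $\lambda_k^2$.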
Notice that upon forming the ratio $\lambda_r^2 = \lambda_{\rm chain}^2/N$, the result (6.1) is the same as (5.4) in which we recall (5.3) for a chain having an arbitrary orientation in $\varphi$. The volume averaged value of the squared stretch of an arbitrarily directed chain within a sphere $S$ of radius $r_0$ centered at a cross-link junction is defined by

$$\hat\lambda_{\rm chain}^2 = \frac{1}{v}\int_S \lambda_{\rm chain}^2\,dv. \tag{6.2}$$

In terms of spherical coordinates, (6.2) becomes

$$\hat\lambda_{\rm chain}^2 = \frac{1}{4\pi}\int_0^{2\pi}\!\!\int_0^{\pi}\lambda_{\rm chain}^2\,\sin\theta_0\,d\theta_0\,d\phi_0, \tag{6.3}$$

which is equivalent to averaging the squared stretch over a unit sphere at $O$. Hence, for the squared stretch (6.1) of a typical arbitrarily directed chain in the full-network structure surrounding the central junction at $O$, we find eventually from (6.3) the following simple, invariant relation for the averaged chain stretch:

$$\hat\lambda_{\rm chain} = \sqrt{\frac{I_1}{3}}, \tag{6.4}$$

in which $I_1 = \mathrm{tr}\,\mathbf{B} = \lambda_1^2 + \lambda_2^2 + \lambda_3^2$. The average end-to-end chain vector length is thus defined by $\hat r_{\rm chain} \equiv \hat\lambda_{\rm chain}\,r_0$, where $r_0 = l\sqrt{N}$, as usual. The locking chain length is $r_L = Nl$, and hence the limiting chain stretch is $\lambda_L = r_L/r_0 = \sqrt{N}$. Therefore, the mean relative chain stretch $\hat\lambda_r$ is defined by

$$\hat\lambda_r \equiv \frac{\hat r_{\rm chain}}{r_L} = \frac{\hat\lambda_{\rm chain}}{\sqrt{N}} = \sqrt{\frac{I_1}{3N}}. \tag{6.5}$$

The italicized auxiliary result below (4.5) shows that a single network chain has the relative chain stretch (6.5) when and only when that chain is specifically oriented along a diagonal of a unit cube. The 4- and 8-chain network models are the only uniform polyhedra chain morphologies all of whose chains in the principal reference system have the same relative chain stretch (4.2). The mean result (6.5), however, is more general in that no specific chain cell morphology is introduced; it holds in the mean for any single randomly oriented chain. In the affine deformation, the stretch of some chains in $S$ clearly will exceed the average value (6.4), while that of others will be smaller, the nature of the macroscopic deformation being captured by the invariant $I_1$.

[Footnote: Note added in proof: I have discovered recently that Kearsley [24] apparently was the first to have derived this rule. In addition, he shows [24] that the square of the stretch ratio of a material area element averaged over all orientations of its normal vector is equal to $I_2/3$. A simpler proof of this result follows readily from the relation $\alpha^2 = I_3\,\mathbf{C}^{-1}\mathbf{m}\cdot\mathbf{m}$, given in [25, Equation (29.2)], in which $\alpha$ denotes the areal stretch ratio $\alpha = da/dA$ of the deformed material area element $da(\mathbf{x})$ to its undeformed element $dA(\mathbf{X})$ whose outward directed unit normal vector at $\mathbf{X}$ is $\mathbf{m}$. The average of $\alpha^2$ over a unit sphere with radius vector $\mathbf{m}$ is defined by $\hat\alpha^2 = (1/(4\pi))\int_0^{2\pi}\!\int_0^{\pi}\alpha^2\sin\theta_0\,d\theta_0\,d\phi_0$, for all orientations $\mathbf{m}$ given by (5.3) in the principal frame of $\mathbf{C}$. It is evident that $\mathbf{C}^{-1}\mathbf{m}\cdot\mathbf{m}$ may be read from the right-hand side of (6.1) in which the $\lambda_k^2$ are replaced by $\lambda_k^{-2}$, the principal values of $\mathbf{C}^{-1}$. The subsequent integration for $\hat\alpha^2$, therefore, yields a rule similar to (6.4). In consequence, we have $\hat\alpha^2 = I_3(\mathbf{C})\,I_1(\mathbf{C}^{-1})/3 = I_2(\mathbf{C})/3$, which is Kearsley's result.]

Recalling that (5.9) with (5.3) yields the constant value $I_k = 4\pi/3$ and introducing the mean value (6.5) in (5.7) for the full-network model, we obtain for the elastic response functions $\aleph_k$ the approximate values $\langle\aleph_k\rangle \equiv \langle\aleph_k(\lambda_1, \lambda_2, \lambda_3)\rangle$ defined by the single relation

$$\langle\aleph_k\rangle \equiv \frac{\mu_0}{3}\,\frac{\beta(\hat\lambda_r)}{\hat\lambda_r} = \left.\frac{\mu_0\,\beta(\lambda_r)}{3\lambda_r}\right|_{\lambda_r = \hat\lambda_r}. \tag{6.6}$$

With this estimate in hand, the general constitutive equation (5.6) for our average-stretch, full-network model for rubber elasticity is nicely approximated by the invariant relation

$$\mathbf{T} = -p\mathbf{1} + \mu(I_1)\,\mathbf{B}, \tag{6.7}$$

in which, with (6.5), the shear response function $\mu(I_1) \equiv \langle\aleph_k(\lambda_1, \lambda_2, \lambda_3)\rangle$ is given by

$$\mu(I_1) = \frac{\mu_0\,\beta(\hat\lambda_r)}{3\hat\lambda_r}. \tag{6.8}$$

It is remarkable that equation (6.7) is precisely the same as the constitutive equation (4.8) for the Arruda–Boyce 8-chain model. The fundamental difference is that the result holds more generally for an average-stretch, full-network model of $n$ randomly oriented chains per unit volume; it thus characterizes the amorphous molecular structure of rubberlike materials. Therefore, it is not necessary to emphasize the heuristic 8-chain cell structure in reference to their result. Equation (6.7) with (6.8) is simply the Arruda–Boyce constitutive equation for an average-stretch, full-network of arbitrarily oriented molecular chains.

Alternatively, we may begin with the Kuhn–Grün configurational entropy (2.2) for a single, freely jointed and randomly oriented molecular chain in which we introduce the mean relative stretch (6.5) to obtain the mean entropy $\hat s = s(\hat\lambda_r)$, per chain. The mean strain energy per chain $\langle w\rangle$ is then given by the negative of $\hat s$ multiplied by the absolute temperature, in accord with (2.5); and (2.7) provides the mean total strain energy $\langle W\rangle = n\langle w\rangle$ for a homogeneous full-network of $n$ such chains per unit volume. We thus obtain with (6.5) a mean strain energy per unit volume $\langle W\rangle = W(I_1)$, namely,

$$W(I_1) = n\,w(\hat\lambda_r) \tag{6.9}$$
depending on only the first principal invariant of $\mathbf{B}$; and, with the aid of (2.9), the general constitutive equation for the incompressible elastomer is given by (6.7) with (6.8).

7. A General Non-Gaussian Network Constitutive Equation

The average-stretch procedure may be applied to amended non-Gaussian molecular network models based on the Wang and Guth [11] series developments of Rayleigh's exact, but formidable integral formulation of non-Gaussian chain distributions [1, p. 314]. Their relation for $N \gg 1$ and $r$ comparable to $N$ characterizes highly elastic materials and leads to a probability distribution function [11, equation (2.20)] that depends on only the fractional extension ratio; namely,

$$P_N(\lambda_r) = \frac{1}{4\pi N l^3}\left(\frac{2}{\pi N}\right)^{1/2}\left[\frac{\sinh\beta}{\beta\exp(\lambda_r\beta)}\right]^{N} \times \left[\frac{\beta}{\lambda_r}\left(1 - \lambda_r^2 - \frac{2\lambda_r}{\beta}\right)^{-1/2}\left(1 + \frac{q(\lambda_r; N)}{N} + \cdots\right)\right], \tag{7.1}$$

in which $\lambda_r$ and $\beta$ are defined in (2.1) and (2.3), and $q(\lambda_r; N)$ is a certain function of $\lambda_r$ and the chain parameter $N$. This development is valid for all $\lambda_r \in [N^{-1/2}, 1]$. We are mainly interested in $\ln P_N(\lambda_r)$. We thus see that the first square bracket in (7.1) gives the Kuhn–Grün distribution that leads to the configurational entropy (2.2) and the strain energy function (2.5). This term together with the first term within the second square bracket in (7.1) yields the amended Kuhn–Grün function introduced by Jernigan and Flory [12], and recently studied by Zúñiga and Beatty [13]. The remaining terms become increasingly negligible for sufficiently large $N$. In any such series formulation, however, we shall have some strain energy function $w(\lambda_r)$, per chain, that depends only on the relative chain stretch; and for all such models the constitutive equation for a uniform full-network of randomly oriented chains is given by (5.2). But use of the general distribution in (5.2) further
complicates an already difficult equation. On the other hand, our average-stretch, full-network model for randomly oriented chains considerably simplifies the general distribution function in that, by (7.1), we always have $P_N(\hat\lambda_r) = P(I_1)$; and for a homogeneous network of chain density $n$, we may write the mean strain energy per unit volume as $\langle W\rangle = -nk\ln P(I_1) = W(I_1)$. The most general form of the constitutive equation for every model characterized by (7.1) evaluated for $\hat\lambda_r$ is thus given by (6.7) in which the general shear response function, in accordance with (2.9), is determined by

$$\mu(I_1) = \frac{1}{3N\hat\lambda_r}\,\frac{\partial W(\hat\lambda_r)}{\partial\hat\lambda_r}, \tag{7.2}$$
where we recall (6.5). It is seen from (6.3) that as $\lambda_{\rm chain}^2 \to N$, the greatest value of the squared chain stretch, $\hat\lambda_{\rm chain}^2 \to N$ also; and hence $\hat\lambda_r \to 1$. Consequently, from (6.5), in any given affine deformation the first principal invariant $I_1 \to I_m$, its greatest possible value determined by the material constant $N$; that is,

$$I_m \equiv 3N \tag{7.3}$$

is a material constant reflecting the locking stretch $\lambda_L = \sqrt{N}$ of any randomly oriented molecular chain and is thus named the network locking constant. It follows that for an average-stretch model the first principal invariant $I_1$ in every affine deformation is bounded by the network locking constant: $3 \le I_1 < I_m$. The general response function (7.2) with $\hat\lambda_r = \sqrt{I_1/I_m}$ thus involves the two physical constants $\mu_0$ and $I_m$.

The constitutive equation (6.7) associated with (7.2) is valid for all deformations of rubberlike materials; it is an equation for which general solutions of many boundary-value problems are well known, and for which specific knowledge of the response function itself is not essential. Therefore, our average-stretch, full-network model for homogeneous non-Gaussian networks of randomly oriented molecular chains is especially useful in the general formulation and solution of a great variety of practical problems in finite elasticity. We find that the simplest of these average-stretch, non-Gaussian full-network models is described by the Arruda–Boyce constitutive equation (6.7) with shear response function (6.8).

In principle, specifically, the network locking constant may be determined (actually only estimated) by a simple uniaxial experiment with limiting uniaxial stretch $\lambda_{\rm lim}$ at which network chain locking will occur. We thus have

$$I_m = 3N = \lambda_{\rm lim}^2 + \frac{2}{\lambda_{\rm lim}}. \tag{7.4}$$

It follows, as described in [13], that the limiting value $\lambda_{\rm lim}$ of the greater principal stretch of the continuum in any other deformation is given by the equation $I_1 = I_m$. The limiting stretch (tension or compression) in an equibiaxial stretch, for example, is given by the two positive roots of the equation $2\lambda_{\rm lim}^6 - I_m\lambda_{\rm lim}^4 + 1 = 0$. Clearly, the limiting stretch $\lambda_{\rm lim}$ of the continuum in an affine deformation is not equal to the molecular chain locking stretch $\lambda_L$.
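The limiting stretches implied by $I_1 = I_m$ are roots of simple scalar equations and can be found by bisection. The following sketch is an illustration only, with hypothetical function names and an arbitrary sample value of $N$. It solves (7.4) for simple extension and compression, and the sextic quoted above for equibiaxial stretch, using brackets on either side of $\lambda = 1$, where $I_1$ takes its minimum value 3.

```python
import math


def bisect(f, lo, hi, tol=1e-13, max_iter=200):
    """Root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    if flo * f(hi) > 0.0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def uniaxial_limits(i_m):
    """Compression and tension roots of lam**2 + 2/lam = I_m, from (7.4)."""
    if i_m <= 3.0:
        raise ValueError("I_m must exceed 3")
    f = lambda lam: lam**2 + 2.0 / lam - i_m
    return bisect(f, 2.0 / i_m, 1.0), bisect(f, 1.0, math.sqrt(i_m))


def equibiaxial_limits(i_m):
    """The two positive roots of 2*lam**6 - I_m*lam**4 + 1 = 0."""
    if i_m <= 3.0:
        raise ValueError("I_m must exceed 3")
    g = lambda lam: 2.0 * lam**6 - i_m * lam**4 + 1.0
    return bisect(g, i_m**-0.25, 1.0), bisect(g, 1.0, math.sqrt(0.5 * i_m))


N = 25.0                      # sample value, chosen only for illustration
I_m = 3.0 * N                 # network locking constant (7.3)
print("chain locking stretch :", math.sqrt(N))
print("uniaxial limits       :", uniaxial_limits(I_m))
print("equibiaxial limits    :", equibiaxial_limits(I_m))
```

For this sample $N$ the limiting uniaxial stretch of the continuum is noticeably larger than the chain locking stretch $\sqrt{N}$, which illustrates the closing remark above.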
8. Conclusion

We have shown that among all uniform polyhedra cell structures, the 4- and 8-chain models share the unique property that all of their chains in a principal reference system have the same stretch. Consequently, they are described by the same constitutive equation first derived by Arruda and Boyce [6]. These are special models within the class of uniform non-Gaussian network models, a general model for which is provided by the Wu and van der Giessen [9] constitutive equation, which is essentially based on the Treloar–Riding [8] general energy integral for a homogeneous full-network of non-Gaussian chains. All are based on the Kuhn–Grün distribution function [2, 3]. We have shown that the constitutive equations for the classical neo-Hookean, the 3-chain, and the 8-chain (hence also the 4-chain) models may be derived from the Wu and van der Giessen full-network model. It is also shown that the volume averaged, squared stretch of an arbitrarily directed chain is the same squared stretch that characterizes the isomorphic 4- and 8-chain cell models; the difference, however, is that no specific chain cell morphology is required. With this result in hand, an average-stretch constitutive equation is obtained by approximation from the Wu and van der Giessen equation for uniform networks. We find that this reduced equation is precisely the same as the Arruda–Boyce [6] constitutive equation for the 8-chain model; and a similar equation with a different shear response function may be derived for other models based on amended forms of the Kuhn–Grün function. The same average-stretch, full-network approximation is applied directly to obtain from the Kuhn–Grün function an approximate total strain energy function for a uniform full-network model that leads again to the Arruda–Boyce constitutive equation.
It is known that experimental data on homogeneous deformations of a variety of rubberlike materials stand in very good overall agreement with results based on the Arruda–Boyce constitutive equation [6, 9]; the greatest variance of the model in comparison with all known data arises in equibiaxial deformation. All data reported so far, however, focus on homogeneous deformations for which principal axes are readily identified. It would be most useful and interesting to expand the comparison of additional experimental data on the same materials with theoretical predictions based on the Arruda–Boyce constitutive equation, represented in (6.7) and (6.8), for the torsion of bars, the twisting and inflation of tubes, and the bending of rubber rods, for example.

9. Endnote: Remarks on Some Related Unpublished Work

A reviewer has pointed out that some parallel results derivable from (5.2) have been reported in unpublished work by Puso [21]. Puso remarks that the strain energy functions for both the James–Guth and Arruda–Boyce models may be obtained from the Wu and van der Giessen (actually the Treloar–Riding) equation (5.2) as Gauss point approximations with six and eight points, respectively. In fact, no approximations appear necessary. It is customary to assume a priori for these
models that the chains are equally distributed along the three principal axes or along the four diagonal lines of a cube with edges along principal axes; and, of course, oppositely directed chains have the same stretch and are assumed equivalent. Thus, for the 3-chain case, replacing $w(\lambda_r)$ with each directed chain density contribution $w_k = w(\lambda_k)$ in summing over the unit sphere in (5.2) (opposite chains having the same stretch) and noting that $(1/(4\pi))\int_0^{2\pi}\!\int_0^{\pi}\sin\theta_0\,d\theta_0\,d\phi_0 = 1$, we see that the total strain energy (5.2) reduces to (2.6) (with $p = 3$), which thus leads to (3.2) for the James–Guth model. Similarly, for the 8-chain model, replacing $w(\lambda_r)$ with each $w_k = w(\tilde\lambda_r)$ and $\tilde\lambda_r = \sqrt{I_1/3N}$, the total energy (5.2) simplifies to (2.7), which thus yields (4.7) for the Arruda–Boyce model. There is no mention of identical results for the Flory–Rehner tetrahedral model [5] whose four distinct chains are similarly oriented along four diagonal lines of a cube, as shown in Figure 1(b), all of which have the same relative chain stretch $\tilde\lambda_r = \sqrt{I_1/3N}$, as shown earlier. Plainly, the corresponding Cauchy stress for each model must then be obtained separately by application of (2.9), though Puso prefers the second Piola–Kirchhoff stress representation. I think this is essentially the idea perceived by Puso for derivation of the aforementioned constitutive equations from the Treloar–Riding total energy relation (5.2). A leading objective in [21] is to obtain a certain series approximation of (5.2) to provide a simplified constitutive equation in terms of stretch invariants that demonstrates improved accuracy over either of the special chain models and which can be used in finite element formulations of boundary value problems. The model thereby avoids integration of the Langevin function following use of (2.5) in (5.2).
First, the inverse Langevin function in our current notation is approximated by $\beta \cong 3\lambda_r/(1 - \lambda_r^3)$, an empirical estimate that exhibits very good graphical comparison with $\beta = L^{-1}(\lambda_r)$ up to $\lambda_r = 0.8$, after which it increases faster than the exact value. This estimate is introduced in the force relation given in our footnote ∗∗, page 3, which is then integrated with respect to $\lambda_r$ to obtain the strain energy $w(\lambda_r)$, per chain, as a sum of certain logarithm and inverse tangent functions. The result is then used in (5.2), still not integrable. The next step in the constitutive formulation for a chain with stretch $\lambda_{\rm chain}$ in an arbitrary direction consists of a Taylor series expansion of $w(\lambda_{\rm chain}) = \tilde w(\lambda_r)$ about the chain stretch $\lambda_{\rm chain} = \sqrt{I_1/3}$, or the relative chain stretch $\lambda_r = \sqrt{I_1/3N}$ of the 8-chain model, an interesting idea. In essence, though not stated in [21], it is assumed that the stretch in an arbitrary direction varies only slightly from its mean value in (6.4). As may be expected, the zero order term necessarily, by the assumed construction, leads to the constitutive equation for the 8-chain model. By use of Puso's series estimate, the previously non-integrable constitutive relation (5.2) in terms of the inverse Langevin function is much simplified to an algebraic constitutive equation having the general tensorial representation $\mathbf{T} = \mathbf{F}\mathbf{S}\mathbf{F}^T = -p\mathbf{1} + \aleph_1(I_1, I_2)\,\mathbf{B} + \aleph_2(I_1)\,\mathbf{B}^2$, based on the form of $\mathbf{S}$ given in [21], the accuracy of which I have not confirmed. Like all other non-Gaussian and related phenomenological models, this one fails to capture effects observed in equibiaxial extension tests, though it does better than others in comparison with uniaxial data. Puso's series model subsequently is modified to include effects due to neighboring chain entanglement interactions. The adjusted model shows very good comparison with equibiaxial data for natural gum rubber by James et al. [22], and improved though still imperfect and stiffer response in comparison with similar equibiaxial data by Treloar [23].

[Footnote: I find, however, that Puso's integral result for $w(\lambda_r)$, which is to be used in the subsequent Taylor series expansion of $w(\lambda_r)$, contains an error repeated in subsequent equations related to it. I have not confirmed all of the subsequent details. The consistent appearance of this error and others, including incorrect equations for $\lambda_{\rm chain}^2$ and for the second Piola–Kirchhoff stress tensor everywhere, while troublesome, are probably typographical slips. I note, for example, that Puso reports the correct value (6.4) obtained from the integral (6.3) that appears in the first order Taylor series term of the total energy function.]

Although my construction exhibits some similarities of result, it differs fundamentally in its procedure, objectives, and simplicity. All of the familiar constitutive equations for the Cauchy stress, including the classical neo-Hookean equation, are readily deduced directly from the Wu and van der Giessen representation (5.7). My direct average-stretch approximation of the Wu and van der Giessen equation (5.7) shows that the Arruda–Boyce constitutive equation is equivalent to an average-stretch, full-network model. Moreover, it is shown that similar constitutive equations of the type (6.7) with different shear response functions hold for other models based on amended forms of the Kuhn–Grün function. None of these results have been reported elsewhere.

Finally, let us recall that the Cauchy stress for a general incompressible, isotropic hyperelastic material may be written as

$$\mathbf{T} = -p\mathbf{1} + 2\,\frac{\partial W}{\partial I_1}\,\mathbf{B} - 2\,\frac{\partial W}{\partial I_2}\,\mathbf{B}^{-1}, \tag{9.1}$$
in which the strain energy function $W = W(I_1, I_2)$. Now suppose that the relative chain stretch in an arbitrary direction of a full-network of randomly oriented, perfectly flexible molecular chains does not stray very far from its mean value (6.5). We may then obtain from the Treloar–Riding energy functional (5.2) the following approximate total strain energy function
\[ W(I_1) = n\,w(\hat\lambda_r) \tag{9.2} \]
in which $w(\hat\lambda_r)$ is the approximate strain energy per chain. For all such models, $W = W(I_1)$, a function of $I_1$ alone; and (9.1) then has the reduced general form (6.7) with shear response function (7.2). This is precisely the result obtained differently in (6.9) based on the mean Kuhn–Grün configurational entropy per chain. Indeed, when $w(\hat\lambda_r)$ is given by the Kuhn–Grün chain energy function (2.5), it readily follows that $dw(\hat\lambda_r)/d\hat\lambda_r = Nk\,\beta(\hat\lambda_r)$; and in this case, it is easily seen from (9.2) that
\[ 2\frac{\partial W}{\partial I_1} = 2n\,\frac{\partial w(\hat\lambda_r)}{\partial(\hat\lambda_r^2)}\,\frac{\partial \hat\lambda_r^2}{\partial I_1} = \frac{\mu_0\,\beta(\hat\lambda_r)}{3\hat\lambda_r} \equiv \mu(I_1). \tag{9.3} \]
Therefore, as mentioned before, (9.1) reduces precisely to the average-stretch, full-network result (6.7), i.e. the Arruda–Boyce constitutive equation for the isomorphic 4- and 8-chain models. Of course, the result (9.3) is more general in that no specific chain cell morphology is assumed. The estimate in (9.2), and hence my result (6.9), essentially corresponds to the lowest order approximation in Puso's series expansion of $w(\lambda_r)$ about the squared relative chain stretch of the 8-chain model. The introduction in (9.3) of Puso's aforementioned empirical estimate now yields the following approximate shear response function for an average-stretch, full-network constitutive model:
\[ \hat\mu(I_1) = \frac{\mu_0}{1 - \hat\lambda_r^3}. \tag{9.4} \]
We recall that $\hat\lambda_r \in [0, 1]$, and hence the finite chain extensibility effect is evident in (9.4). Indeed, recalling the network locking constant (7.3) in (6.5), we see that the reduced shear response function (9.4), based on Puso's estimate, is now given explicitly in terms of the physical constants $\mu_0$ and $I_m$:
\[ \hat\mu(I_1) = \frac{\mu_0}{1 - (I_1/I_m)^{3/2}}. \tag{9.5} \]
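The quality of the empirical estimate behind (9.4) and (9.5) can be checked numerically. The following is a minimal Python sketch, not taken from [21]: it inverts the Langevin function $\mathcal{L}(x) = \coth x - 1/x$ by bisection and compares the result with the estimate $3\lambda_r/(1 - \lambda_r^3)$. The function names, the bisection bracket and the tolerance are illustrative assumptions.

```python
import math


def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x (leading series term near x = 0)."""
    if abs(x) < 1e-4:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x


def inverse_langevin(y, tol=1e-12):
    """Solve L(x) = y for 0 <= y < 1 by bisection; L is increasing for x > 0."""
    if not 0.0 <= y < 1.0:
        raise ValueError("relative chain stretch must lie in [0, 1)")
    lo, hi = 0.0, 1.0
    while langevin(hi) < y:          # enlarge the bracket until it holds the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if langevin(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def puso_beta(lam):
    """Empirical estimate beta ~ 3*lam / (1 - lam**3) quoted in the text."""
    return 3.0 * lam / (1.0 - lam ** 3)


def shear_response(I1, mu0, Im):
    """Reduced shear response function (9.5); requires 3 <= I1 < Im."""
    if not 3.0 <= I1 < Im:
        raise ValueError("I1 must satisfy 3 <= I1 < Im")
    return mu0 / (1.0 - (I1 / Im) ** 1.5)


if __name__ == "__main__":
    for k in range(1, 9):
        lam = 0.1 * k
        exact = inverse_langevin(lam)
        print(f"lam={lam:.1f}  exact={exact:.4f}  estimate={puso_beta(lam):.4f}")
```

In this sketch the two columns stay within a few percent of each other for $\lambda_r \le 0.8$, and `shear_response` grows without bound as $I_1$ approaches $I_m$, which is the locking behavior that (9.5) is meant to express.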
Clearly, the first principal invariant $I_1$ in every affine deformation of this material is bounded by the network locking constant: $I_1 \in [3, I_m]$. The general constitutive equation (9.1) for the average-stretch model thus simplifies to
\[ \mathbf{T} = -p\mathbf{1} + \mu_0\left[1 - \left(\frac{I_1}{I_m}\right)^{3/2}\right]^{-1}\mathbf{B}. \tag{9.6} \]
Though here obtained differently and expressed in altogether different terms, this simple result is Puso's first order constitutive equation [21, equation (1.2.27)].
Acknowledgment
I thank two anonymous reviewers for helpful comments and for pointing out the additional references [19] and [21] previously unknown to me.
References
1. P.J. Flory, Statistical Mechanics of Chain Molecules. Hanser Publishers, New York (1988) Chapter 1.
2. L.R.G. Treloar, The Physics of Rubber Elasticity, 3rd edn. Clarendon Press, Oxford (1975).
3. W. Kuhn and F. Grün, Beziehungen zwischen elastischen Konstanten und Dehnungsdoppelbrechung hochelastischer Stoffe. Kolloid-Z. 101 (1942) 248–271.
4. H.M. James and E. Guth, Theory of the elastic properties of rubber. J. Chem. Phys. 10 (1943) 455–481.
5. P.J. Flory and J. Rehner, Statistical mechanics of cross-linked polymer networks: I. Rubber elasticity. J. Chem. Phys. 11 (1943) 512–520.
6. E.M. Arruda and M.C. Boyce, A three-dimensional constitutive model for the large stretch behavior of rubber elastic materials. J. Mech. Phys. Solids 41 (1993) 389–412.
7. L.R.G. Treloar, The photoelastic properties of short-chain molecular networks. Trans. Faraday Soc. 50 (1954) 881–896.
8. L.R.G. Treloar and G. Riding, A non-Gaussian theory for rubber in biaxial strain. I Mechanical properties. Proc. Roy. Soc. London A 369 (1979) 261–280.
9. P.D. Wu and E. van der Giessen, On improved network models for rubber elasticity and their application to orientation hardening in glassy polymers. J. Mech. Phys. Solids 41 (1993) 427–456.
10. P.D. Wu and E. van der Giessen, On improved 3-D non-Gaussian network models for rubber elasticity. Mech. Res. Comm. 19 (1992) 427–433.
11. M.C. Wang and E. Guth, Statistical theory of networks of non-Gaussian flexible chains. J. Chem. Phys. 20 (1952) 1144–1157.
12. R.L. Jernigan and P.J. Flory, Distribution functions for chain molecules. J. Chem. Phys. 50 (1969) 4185–4200.
13. A.E. Zúñiga and M.F. Beatty, Constitutive equations for amended non-Gaussian network models of rubber elasticity. Internat. J. Engrg. Sci. 40 (2002) 2265–2294.
14. M.C. Boyce and E.M. Arruda, Constitutive models of rubber elasticity: A review. Rubber Chem. Technol. 73 (2000) 504–523.
15. M.F. Beatty, Constitutive equations for the back stress in amorphous glassy polymers. Math. Mech. Solids, to appear.
16. P.R. von Lockette and E.M. Arruda, Computational annealing of simulated unimodal and bimodal networks. Comput. Theoret. Polymer Sci. 11 (2001) 415–428.
17. J.H. Weiner, Statistical Mechanics of Elasticity. Wiley, New York (1983).
18. O.H. Yeoh and P.D. Fleming, Constitutive modeling of the large strain time-dependent behavior of elastomers. J. Polymer Sci. B: Polymer Phys. 35 (1997) 1919–1931.
19. S. Socrate and M.C. Boyce, Micromechanics of toughened polycarbonate. J. Mech. Phys. Solids 48 (2000) 233–273.
20. A.E. Zúñiga and M.F. Beatty, A new phenomenological model for stress softening in elastomers. Z. Angew. Math. Phys. 53 (2002) 794–814.
21. M.A. Puso, Mechanistic constitutive models for rubber elasticity and viscoelasticity. Doctoral dissertation, University of California, Davis (1994) 124 pages.
22. A.G. James, A. Green and G.M. Simpson, Strain energy functions of rubber: I. Characterization of gum vulcanizates. J. Appl. Polymer Sci. 19 (1975) 2033–2058.
23. L.R.G. Treloar, Stress–strain data for vulcanized rubber under various types of deformation. Trans. Faraday Soc. 40 (1944) 59–70.
24. E.A. Kearsley, Strain invariants expressed as average stretches. J. Rheology 33 (1989) 757–760.
25. C. Truesdell and R. Toupin, The Classical Field Theories. Flügge's Handbuch der Physik, Vol. III/1. Springer, Berlin (1960).
From 3-D Nonlinear Elasticity Theory to 1-D Bars with Nonconvex Energy
MICHELE BUONSANTI¹ and GIANNI ROYER-CARFAGNI²,*
¹ Department Mechanics and Materials, University Mediterranea, Reggio Calabria, I-89026 Reggio Calabria, Italy
² Department of Civil-Environmental Engineering and Architecture, University of Parma, Parco Area delle Scienze 181/a, I-43100 Parma, Italy. E-mail: [email protected]
Received 22 July 2002; in revised form 7 May 2003
Abstract. This paper represents a first attempt to derive one-dimensional models with non-convex strain energy starting from “genuine” three-dimensional, nonlinear, compressible, elasticity theory. Following the usual method of obtaining beam theories, we show here for a constrained kinematics appropriate for long cylinders governed by a polyconvex, objective, stored energy function, that the bar model originally proposed by Ericksen [3] is obtainable but enriched by an additional term in the strain gradient. This term, characteristic of nonsimple grade-2 materials, penalizes interfacial energies and makes single-interface two-phase solutions preferred. The resulting model has been proposed by a number of authors to describe the phenomenon of necking and cold drawing in polymeric fibers and, here, we discuss its suitability to interpret also the elastic-plastic behavior of metallic tensile bars under monotone loading.
Mathematics Subject Classifications (2000): 74A45, 74B20, 74G55, 74G65, 74N20.
Key words: nonconvex energy, nonlinear elasticity, polyconvexity, nonsimple materials, necking, polymer, cold-drawing, plasticity.
* Corresponding author.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 87–100. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
Different-in-type material responses, such as ductile [1] or brittle [2], can be interpreted with a variational approach. In one dimension, simple models with nonconvex stored energy, of which Ericksen's [3] is perhaps the most cited example, represent an extension to solids of the Van der Waals idea, which applies to a surprisingly wide range of materials. Theories of this kind, as more deeply considered by Dunn and Fosdick [4], allow for stress- and deformation-induced phase transitions and they predict discontinuous strain fields in reasonable agreement with experimental observations. The basic difficulty in 1-D models of the kind given in [3, 4] is that phase-rearrangements are equienergetic and, consequently, minimizing strain fields are in general highly non-unique. In elastic fluids, where a similar problem occurs when the energy is assumed to be a nonconvex function of the density, several
important results concerning the interfacial energy between phases were obtained by Cahn and Hilliard [5] by introducing a further dependence of the energy upon the density gradient. A number of authors have attempted to solve the congenital indeterminacy for solids by similarly changing the energy functional. For example, Carr et al. [6, 7] proposed a grade-2 model for one-dimensional bars, in which the energy is nonconvex in the strain and quadratic in the strain gradient. Their proposal is a particular case of a more general theory earlier advanced by Coleman for thin polymeric fibers [8]. Such models represent the natural extension of the Cahn and Hilliard idea and, in substance, predict that configurations with least energy are the two-phase single-interface solutions. Consequently, once the average elongation of the bar is known, the minimizing strain field is uniquely determined modulo reversal.
The aim of this paper is to discuss the soundness of models of this kind apart from any agreement with experimental evidence and to exhibit their consistency with 3-D nonlinear elasticity theory (respecting in particular classical requirements, such as objectivity). The method is similar to the traditional approach to beam theory using a three-dimensional parent theory, i.e., one that assumes a restricted kinematics such as is common to the Bernoulli–Navier or Timoshenko hypotheses [9]. Here, it is shown that for particular choices of polyconvex objective strain energies, the corresponding minimization problem, in a restricted class of kinematically constrained deformation fields, naturally leads to the aforementioned 1-D strain-gradient models. This idea has already been pursued by Coleman and Newman, who showed that the 1-D model in [8] for polymeric fibers can be obtained starting from an ad hoc incompressible three-dimensional elasticity parent theory [10, 11].
The main difference between their work and ours consists in the choice of the elastic potential for the 3-D parent theory. Here, we do not consider an incompressible material and, indeed, it is the material reaction against volume changes that yields the nonconvex (though polyconvex) character of the elastic potential; this is maintained in the one-dimensional reduction. Whether the nonconvexity of the 1-D reduced model is a strict consequence of the assumed kinematical constraints, or is a general property independent of any simplification, remains an open question at this stage, not even discussed in [10, 11]. We conjecture that this conclusion holds true even under weaker kinematical restrictions and for more general elastic potentials, but the consequent analytical complications call for a numerical approach, which will be considered in a further work.
In any case, despite drastic simplifications, the present analysis shows that bar models with nonconvex energy may be naturally deduced from the most classical theories for three-dimensional compressible elastic bodies. The resulting one-dimensional stored energy function, nonconvex in the axial strain and quadratic in the strain gradient, coincides with the earlier proposals of Carr et al. [6, 7] and Coleman and Newman [8, 10–12]. The second-order term, characteristic of nonsimple grade-2 materials, penalizes interfaces between heterogeneous phases since they must be separated by a transition zone. This is characteristic of the well-known phenomenon of necking in polymers and their cold drawing for increasing average elongation. Moreover, we recognize that also the behavior of other materials, in particular the yielding of metallic bars, can be interpreted using the same model; this bears on the original idea of Müller and Villaggio [1].
2. The Variational Problem under Kinematical Constraints
Let us consider a straight bar with constant cross section and length $L$. A reference orthogonal system $(x, y, z)$, with associated unit vectors $\mathbf{i}, \mathbf{j}, \mathbf{k}$, is introduced so that the $z$-axis is parallel to the bar longitudinal axis and passes through the centroid of its cross section. Let $B = \Omega \times (0, L) \subset \mathbb{R}^3$ denote the undistorted natural configuration of the body, where $\Omega \subset \mathbb{R}^2$ is the domain representative of the cross section. A deformation is defined through the mapping $\mathbf{y}(\mathbf{x}): B \to \mathbb{R}^3$ which satisfies the usual hypotheses, i.e., regularity, injectivity, $\det\nabla\mathbf{y} > 0$. The bar is made of a simple hyperelastic material defined through the strain potential $W(\nabla\mathbf{y}): \mathrm{Lin}^+ \to \mathbb{R}$, supposed to comply with the material objectivity requirements and with the well-known polyconvexity condition of Ball [13]. Thus, $W$ takes the form
\[ W(\nabla\mathbf{y}) = g(\nabla\mathbf{y}, \operatorname{adj}\nabla\mathbf{y}, \det\nabla\mathbf{y}), \tag{2.1} \]
with $g(\cdot, \cdot, \cdot)$ convex in each term. Such a requirement is crucial to demonstrate existence theorems in nonlinear elasticity theory [13], since it implies the lower-semicontinuity of the energy functional. Indeed, lower-semicontinuity is assured by the less-restrictive quasi-convexity condition of Morrey. But this weaker hypothesis guarantees existence in general only if growth conditions for $W$ are assumed [13] that prohibit any singular behavior, such as the physical requirement
\[ \det\nabla\mathbf{y} \to 0^+ \;\Rightarrow\; W(\nabla\mathbf{y}) \to +\infty. \tag{2.2} \]
The particular class of stored energy functions $W$ here considered, usually referred to as the Blatz–Ko potential, is given by
\[ W(\nabla\mathbf{y}) = a\left[\,|\nabla\mathbf{y}|^2 + \varphi(\det\nabla\mathbf{y})\right], \tag{2.3} \]
where $a > 0$ is a constant and $\varphi$ is a smooth convex function such that $\varphi(\det\nabla\mathbf{y}) \to +\infty$ as $\det\nabla\mathbf{y} \to 0^+$ or $\det\nabla\mathbf{y} \to +\infty$. Clearly, $W$ of (2.3) is polyconvex and (2.2) is satisfied. It should be noticed, however, that $W$ is nonconvex in $\nabla\mathbf{y}$. Indeed, any hypothesis of convexity would imply serious physical inconsistencies, such as either failure of material objectivity and of (2.2) or unacceptable monotonicity of forces [14]. For this class of stored energy functions, the Piola–Kirchhoff stress, defined by $\mathbf{S} \equiv W_{\nabla\mathbf{y}}$, has the form
\[ \mathbf{S} = a\left[\,2\nabla\mathbf{y} + \varphi'(\det\nabla\mathbf{y})\,(\det\nabla\mathbf{y})\,\nabla\mathbf{y}^{-T}\right]. \tag{2.4} \]
The assumed condition that the undistorted reference configuration is natural, i.e., $\mathbf{S}|_{\nabla\mathbf{y}=\mathbf{I}} = \mathbf{0}$, implies the further requirement
\[ \varphi'(1) = -2. \tag{2.5} \]
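The step from (2.3) to (2.4) and (2.5) rests on two standard identities for the derivative of the squared norm and of the determinant. A short worked version is sketched below; $\mathbf{F}$ is a shorthand for $\nabla\mathbf{y}$ introduced here only for brevity and is not notation used in the paper.

```latex
\begin{aligned}
\frac{\partial |\mathbf{F}|^2}{\partial \mathbf{F}} &= 2\mathbf{F},
\qquad
\frac{\partial \det\mathbf{F}}{\partial \mathbf{F}} = (\det\mathbf{F})\,\mathbf{F}^{-T},\\[4pt]
\mathbf{S} = \frac{\partial W}{\partial \mathbf{F}}
 &= a\left[\,2\mathbf{F} + \varphi'(\det\mathbf{F})\,(\det\mathbf{F})\,\mathbf{F}^{-T}\right],\\[4pt]
\mathbf{S}\big|_{\mathbf{F}=\mathbf{I}}
 &= a\,\bigl[\,2 + \varphi'(1)\,\bigr]\,\mathbf{I} = \mathbf{0}
 \iff \varphi'(1) = -2 .
\end{aligned}
```

At $\mathbf{F} = \mathbf{I}$ the determinant factor equals one, so the normalization $\varphi'(1) = -2$ does not depend on whether that factor is written out.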
Fosdick and Royer-Carfagni [15] have recently considered potentials of this kind to exhibit possible material instabilities.
Let us consider a particular kinematics for $B$, defined through a class of deformations where a particle at $\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ is mapped to
\[ \mathbf{y} = \left[(1+\nu)x - \nu x\,\partial_z f(x,y,z)\right]\mathbf{i} + \left[(1+\nu)y - \nu y\,\partial_z f(x,y,z)\right]\mathbf{j} + f(x,y,z)\,\mathbf{k}, \tag{2.6} \]
where $f: B \to \mathbb{R}$ is a function to be determined. Notice that when the bar is uniformly stretched longitudinally so that $f_{,z} = \text{const} = 1 + e$, it contracts laterally in the ratio $(1 - \nu e) : 1$, so that $\nu$ may be referred to as the coefficient of lateral contraction. When the bar is extended in a hard-loading device, from (2.3) and (2.6) its energy takes the form
\[ \begin{aligned} E[f] = \int_B a\,\Big\{ & 2(1+\nu-\nu f_{,z})^2 - 2\nu(1+\nu-\nu f_{,z})\,\partial_z(x f_{,x} + y f_{,y}) + f_{,x}^2 + f_{,y}^2 + f_{,z}^2 \\ & + \nu^2(x^2+y^2)\left(f_{,zx}^2 + f_{,zy}^2 + f_{,zz}^2\right) \\ & + \varphi\Big( f_{,z}(1+\nu-\nu f_{,z})^2 + \nu(1+\nu-\nu f_{,z}) \left[ f_{,zz}(x f_{,x} + y f_{,y}) - f_{,z}\,\partial_z(x f_{,x} + y f_{,y}) \right] \Big) \Big\}\, dV. \end{aligned} \tag{2.7} \]
The associated variational problem is the following:
\[ \min_{f \in \mathcal{A}} E[f], \tag{2.8} \]
where the class $\mathcal{A}$ of admissible functions $f$ is characterized by the conditions
\[ f(x, y, 0) = 0, \qquad f(x, y, L) = \beta L. \tag{2.9} \]
The parameter $\beta$ defines the average stretch.
3. The Particular Case of Null Lateral Contraction
In order to analyze the variational problem (2.7)–(2.9), it is helpful to consider first a preliminary case when the coefficient of lateral contraction $\nu$ is equal to zero. Under this condition, the energy (2.7) reduces to
\[ E_0[f] = \int_B a\left[2 + f_{,x}^2 + f_{,y}^2 + f_{,z}^2 + \varphi(f_{,z})\right] dV. \tag{3.1} \]
For boundary conditions of the type (2.9), it can be easily seen that any minimizing field $f^*$ must enjoy the property $f^*_{,x} = f^*_{,y} = 0$, i.e., there is no warping of the bar cross sections. Moreover $f^*(x, y, \cdot) \equiv w^*(\cdot)$ $\forall (x, y)$, where $w^*: (0, L) \subset \mathbb{R} \to \mathbb{R}$ is a solution of the auxiliary one-dimensional problem
\[ \min_{w \in \hat{\mathcal{A}}} E_0[w], \qquad \hat{\mathcal{A}} = \{\, w \mid w(0) = 0,\ w(L) = \beta L \,\}, \tag{3.2} \]
with
\[ E_0[w] = \int_0^L \psi(w'(z))\, dz, \qquad \psi(w'(z)) = aA\left[2 + (w'(z))^2 + \varphi(w'(z))\right]. \tag{3.3} \]
Here, $A$ is the bar cross-sectional area. In fact, for any $f(x, y, z): B \subset \mathbb{R}^3 \to \mathbb{R}$, let $w_f(z): (0, L) \subset \mathbb{R} \to \mathbb{R}$ represent its restriction to the $z$-axis. Clearly $E_0[w_f] \le E_0[f]$, where the equal sign holds iff $f_{,x} = f_{,y} = 0$. On the other hand, if $w^*$ solves (3.2), then $E_0[w^*] \le E_0[w_f]$. Let us then define $f^*(x, y, z)$ as the extension of $w^*(z)$ such that $f^*(x, y, \cdot) \equiv w^*(\cdot)$ $\forall (x, y)$, that is $f^*_{,x} = f^*_{,y} = 0$. We can therefore write
\[ E_0[f^*] = E_0[w^*] \le E_0[w_f] \le E_0[f], \qquad \forall f \in \mathcal{A}. \tag{3.4} \]
In words, the terms $f_{,x}$ and $f_{,y}$ in (3.1) produce the coupling in the response of fibers parallel to the $z$ axis, which must deform to the same extent at any cross section in order to minimize the energy. This finding is important if the function $\psi(\cdot)$ in (3.3) is not convex. In this situation, it has been clear since Ericksen's analysis [3] that, for $\beta$ in a given range, two-phase solutions are highly non-unique, since phase-rearrangements are energetically equivalent. If the longitudinal fibers of the bar behaved independently of one another, phase rearrangements would be completely arbitrary for each one of them. On the other hand, the effect of the coupling terms $f_{,x}$ and $f_{,y}$ in (3.1) implies the organization of phases in layers at right angles to the bar axis. In conclusion, the phase-rearrangement indeterminacy still persists in the case $\nu = 0$, but only longitudinally and not transversally.
However, a substantial remark should be made at this point. If $\varphi(\cdot)$ is convex, as is assumed in Section 2 to assure polyconvexity of the stored energy, from (3.3) also $\psi(\cdot)$ will be convex. Consequently, it is well known that for any $\beta$ the unique solution of (3.2) is the trivial one, where the longitudinal strain is uniform in the whole bar. Evidently, the assumption $\nu = 0$ is too drastic and rules out a wide range of interesting cases. In other words, necking due to lateral contraction should play a crucial role.
4. The Bernoulli–Navier Case
Let us now consider for (2.7)–(2.9) a further kinematical hypothesis, similar to that of Bernoulli–Navier for beams, i.e., the deformation preserves the flatness of transverse planes normal to the cylinder axis. This is equivalent to assuming that $f(x, y, z)$ in (2.6) does not depend upon $x$ and $y$ but, for convenience, we do not change notation and simply denote with $f = f(z)$ the new corresponding function and with $f'(z)$ its derivative. A restricted kinematics of this kind was earlier considered by Coleman and Newman [10, 11] in their derivation of the one-dimensional theory [8] as an approximation for the response of a three-dimensional long fiber made of polymeric incompressible elastic material. The total energy for $B$ becomes
\[ E[f] = \int_B a\left[ 2(1+\nu-\nu f')^2 + f'^2 + \nu^2 f''^2 (x^2 + y^2) + \varphi\!\left(f'(1+\nu-\nu f')^2\right) \right] dV, \tag{4.1} \]
which can be integrated in the cross section, to give
\[ E[f] = aA \int_0^L \left[ \Phi(f') + \tfrac{1}{2}\kappa f''^2 \right] dz. \tag{4.2} \]
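The two ingredients of (4.1), namely $\det\nabla\mathbf{y} = f'(1+\nu-\nu f')^2$ and $|\nabla\mathbf{y}|^2 = 2(1+\nu-\nu f')^2 + f'^2 + \nu^2 f''^2(x^2+y^2)$, can be checked numerically. The sketch below is only an illustration: the axial map `f`, the sample point and the value of `NU` are hypothetical choices, and the deformation gradient of (2.6) is approximated by central differences.

```python
NU = 0.1  # coefficient of lateral contraction (illustrative value)


def f(z):
    """Hypothetical axial map f(z); chosen only to have a nonzero f''."""
    return z + 0.05 * z * z


def f1(z):
    """First derivative f'(z) of the hypothetical map."""
    return 1.0 + 0.1 * z


def f2(z):
    """Second derivative f''(z) of the hypothetical map."""
    return 0.1


def deform(x, y, z):
    """Deformation (2.6) when f depends on z only."""
    s = 1.0 + NU - NU * f1(z)
    return (s * x, s * y, f(z))


def deformation_gradient(x, y, z, h=1e-5):
    """Central-difference approximation of grad y at (x, y, z)."""
    p = (x, y, z)
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        fwd, bwd = list(p), list(p)
        fwd[j] += h
        bwd[j] -= h
        yf, yb = deform(*fwd), deform(*bwd)
        for i in range(3):
            grad[i][j] = (yf[i] - yb[i]) / (2.0 * h)
    return grad


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def frobenius_sq(m):
    """Squared Frobenius norm |m|^2."""
    return sum(v * v for row in m for v in row)


def closed_form(x, y, z):
    """Determinant and squared norm as they enter the integrand of (4.1)."""
    s = 1.0 + NU - NU * f1(z)
    det = f1(z) * s * s
    norm_sq = 2.0 * s * s + f1(z) ** 2 + NU ** 2 * f2(z) ** 2 * (x * x + y * y)
    return det, norm_sq
```

Comparing `det3(deformation_gradient(...))` and `frobenius_sq(deformation_gradient(...))` with `closed_form(...)` at any interior point reproduces the two expressions above to rounding error; integrating the term in $x^2 + y^2$ over the cross section is what produces the polar moment in the gradient coefficient.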
Here, the function $\Phi: \mathbb{R} \to \mathbb{R}$, from (4.1), takes the form
\[ \Phi(t) = 2(1+\nu-\nu t)^2 + t^2 + \varphi\!\left(t(1+\nu-\nu t)^2\right), \tag{4.3} \]
while the constant $\kappa$ is equal to
\[ \kappa = \frac{2\nu^2 I_0}{A}, \tag{4.4} \]
having denoted with $I_0$ the second order polar moment of inertia of the bar cross-sectional area. What should be noted in (4.3) is that, despite the fact that $\varphi$ is assumed to be convex, the cubic $c(t) = t(1+\nu-\nu t)^2$ is not. Consequently, $\Phi(t)$ may be nonconvex. To illustrate this aspect, let us consider the following example. Let $\varphi$ be the convex function defined by
\[ \varphi(c) = 2\,\frac{c^{10} - (19 + 10^{-8})\,c + 10^{-12}\,c^{-10}}{9 + 10^{-11} + 10^{-8}}, \qquad c > 0. \tag{4.5} \]
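The properties claimed for this example can be probed with a few finite differences. The following Python sketch assumes $\nu = 0.1$, the value used in the example, writes `Phi` for the one-dimensional density of (4.3), and uses hypothetical helper names `d1` and `d2` for the numerical derivatives; it is a check under these assumptions, not part of the original analysis.

```python
NU = 0.1
DENOM = 9.0 + 1e-11 + 1e-8


def phi(c):
    """Convex volumetric function (4.5), defined for c > 0."""
    return 2.0 * (c ** 10 - (19.0 + 1e-8) * c + 1e-12 * c ** -10) / DENOM


def c_of_t(t):
    """Cubic c(t) = t (1 + nu - nu t)^2, i.e. det(grad y) for uniform stretch t."""
    return t * (1.0 + NU - NU * t) ** 2


def Phi(t):
    """One-dimensional stored energy density (4.3)."""
    return 2.0 * (1.0 + NU - NU * t) ** 2 + t * t + phi(c_of_t(t))


def d1(g, t, h=1e-5):
    """Central-difference first derivative."""
    return (g(t + h) - g(t - h)) / (2.0 * h)


def d2(g, t, h=1e-4):
    """Central-difference second derivative."""
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / (h * h)


if __name__ == "__main__":
    print("phi'(1)      ~", d1(phi, 1.0))          # normalization (2.5)
    print("Phi'(1)      ~", d1(Phi, 1.0))          # stationary at t = 1
    print("Phi''(1)     ~", d2(Phi, 1.0))          # positive: local minimum
    print("Phi''(11/3)  ~", d2(Phi, 11.0 / 3.0))   # negative: nonconvex
```

The point $t = 11/3$ is where $c'(t) = (1+\nu-\nu t)(1+\nu-3\nu t)$ first vanishes for $\nu = 0.1$; there the second derivative of `Phi` reduces to $4\nu^2 + 2 + \varphi'(c)\,c''(t)$ with $c'' < 0$, which is why the density loses convexity even though $\varphi$ itself is convex.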
For $\nu = 0.1$, $c(t) = t(1+\nu-\nu t)^2$ is a nonconvex cubic and the composed function $\varphi(c(t))$ is also nonconvex. The graphs corresponding to $c(t)$, $\varphi(c)$ and $\varphi(c(t))$ are juxtaposed in Figure 1. As a result, the graph of $\Phi(t)$, appearing in (4.2) and defined in (4.3), is of the type represented in Figure 2. Notice that condition (2.5) assures that $\Phi(t)$ has a minimum at $t = 1$, so that the undistorted configuration is natural.
Figure 1. (a) Graph of $\varphi(c)$; (b) graph of $c(t)$; (c) graph of $\varphi(c(t))$.
Figure 2. Graph of $\Phi(t)$, defined in (4.3).
In conclusion, it has been shown that classical nonlinear elasticity theory may directly offer, under appropriate kinematical assumptions, 1-D models where the stored energy is a nonconvex function of the axial strain. However, the example at hand contains the natural appearance of a second-order term in the resulting energy (4.2), which plays the important role of penalizing interfaces between heterogeneous material phases.
5. The Resulting One-Dimensional Model
Setting $u := f'$, we find that the problem under consideration becomes the following:
PROBLEM $P_{1\text{-}D}$. Minimize
\[ \Pi[u] = aA \int_0^L \left[ \Phi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2 \right] dz, \tag{5.1} \]
over all $u \in W^{1,2}(0, L)$, $u > 0$, consistent with the constraint
\[ \int_0^L u(z)\, dz = \beta L. \tag{5.2} \]
This model is well studied in the literature. It was proposed by Coleman [8] and Carr et al. [6, 7], and later developed by Coleman and Newman [10–12], as a generalization of the original idea by Ericksen [3]. Here, for the sake of completeness, we briefly recall the main results.
Under mild smoothness hypotheses, direct methods in the calculus of variations [16] show existence of at least one minimizer, and that any such minimizer solves the corresponding Euler–Lagrange equations. Now, let $\alpha_1$ and $\alpha_2$ be defined by the Maxwell conditions
\[ \Phi(\alpha_2) - \Phi(\alpha_1) = \sigma_0(\alpha_2 - \alpha_1) \quad\text{with}\quad \sigma_0 = \Phi'(\alpha_1) = \Phi'(\alpha_2), \tag{5.3} \]
i.e., the line $\Phi(\alpha_1) + \sigma_0(t - \alpha_1)$ supports the graph of $\Phi(t)$ from below. It is easy to show that when $\beta \le \alpha_1$ or $\beta \ge \alpha_2$ the solution of $P_{1\text{-}D}$ is the trivial one, $u^*(z) \equiv \beta$ (uniform strain, i.e., $f^*(z) = \beta z$). In fact, such $u^*(z)$ minimizes $\Pi[u]$ of (5.1) for $\kappa = 0$ [3] and, consequently,
\[ \Pi[u^*] = aA \int_0^L \Phi(u^*(z))\, dz \le aA \int_0^L \Phi(u(z))\, dz \le aA \int_0^L \left[ \Phi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2 \right] dz = \Pi[u], \qquad \forall u. \tag{5.4} \]
The most interesting case is when $\alpha_1 < \beta < \alpha_2$, whose characterization is given by the following theorem, contained in [6, 7].
THEOREM (Carr, Gurtin and Slemrod). Assume that $\Phi$ is a smooth nonconvex function of the form shown in Figure 2 and, more precisely, $\Phi$ is of class $C^5(0, L)$, $\Phi'' > 0$ on $(0, a_1) \cup (a_2, \infty)$ and $\Phi'' < 0$ on $(a_1, a_2)$, $\Phi'(0^+) = -\infty$ and $\Phi'(+\infty) = +\infty$. Then, for any $\beta \in (\alpha_1, \alpha_2)$:
(i) when $\kappa > 0$ is small enough, problem $P_{1\text{-}D}$ has a unique (modulo reversal) solution $u^*_\kappa(z)$;
(ii) $u^*_\kappa(z)$ is strictly monotone;
(iii) as $\kappa \to 0$, $u^*_\kappa(z)$ approaches the single interface solution $u^*(z)$, defined as (modulo its reversal)
\[ u^*(z) = \begin{cases} \alpha_1, & 0 \le z \le l, \\ \alpha_2, & l < z \le L, \end{cases} \tag{5.5} \]
with $l = L(\alpha_2 - \beta)/(\alpha_2 - \alpha_1)$.
In other words, for infinitesimal $\kappa$ the higher order term causes the two-phase solutions with least energy to become the single-interface solutions. For the discussion of this very important case, we report an elegant argument by Alberti [17], providing an elementary characterization of the transition zone between heterogeneous phases. To this aim, it is convenient to consider a problem equivalent to $P_{1\text{-}D}$, i.e.,
\[ \min \int_0^L \left[ \bar\Phi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2 \right] dz, \qquad \int_0^L u(z)\, dz = \beta L, \tag{5.6} \]
where
\[ \bar\Phi(u(z)) = \Phi(u(z)) - \left[ \Phi(\alpha_1) + \sigma_0\,(u(z) - \alpha_1) \right], \tag{5.7} \]
and $\sigma_0$ has been defined in (5.3). With this choice, $\bar\Phi(\cdot)$ is non-negative and presents only two absolute minima at $\alpha_1$ and $\alpha_2$, with $\bar\Phi(\alpha_1) = \bar\Phi(\alpha_2) = 0$. It is well known that minimizers of (5.6) coincide with solutions of $P_{1\text{-}D}$ since any linear functional is a null-Lagrangian for $\Pi[u]$ in (5.1).
Now, case (iii) of the aforementioned theorem suggests that when $\beta \in (\alpha_1, \alpha_2)$, the material tends to be separated into two different phases, but their interface is not sharp because of the gradient term in (5.6). The two phases are connected by a transition zone, which produces an effect in the bar equivalent to a distortion. Under De Saint Venant's principle, the effects of such a distortion should be negligible at distances that are large when compared with the diameter of the bar cross section. Consequently, if the length of the bar is much larger than the diameter of its cross section and $\beta$ is far from $\alpha_1$ and $\alpha_2$, recalling (5.5) the transition zone is certainly distant from the bar extremities. The following optimal profile problem may therefore be considered for the determination of the shape of the transition zone.
PROBLEM $P_{OP}$ (Optimal Profile). Find $u: \mathbb{R} \to \mathbb{R}$ such that
\[ \begin{cases} \displaystyle aA \int_{-\infty}^{+\infty} \left[ \bar\Phi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2 \right] dz = \min, \\[6pt] \displaystyle \lim_{z \to -\infty} u(z) = \alpha_1 \ \text{ and } \ \lim_{z \to +\infty} u(z) = \alpha_2. \end{cases} \tag{5.8} \]
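The Maxwell conditions (5.3), the shifted density (5.7) and the interface position in (5.5) lend themselves to a small numerical illustration. The Python sketch below makes several assumptions that are not in the paper: the density `energy` is a hypothetical quartic double well (not the function of (4.3)), the Maxwell line is located as the widest edge of the lower convex hull of sampled points, and all function names are illustrative.

```python
def energy(u):
    """Hypothetical double-well density; an assumption, not the paper's (4.3)."""
    return (u - 1.0) ** 2 * (u - 2.0) ** 2 + 0.3 * u


def lower_hull(points):
    """Lower convex hull of points sorted by abscissa (monotone-chain scan)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0.0:      # strict left turn: hull[-1] stays on the hull
                break
            hull.pop()
        hull.append(p)
    return hull


def maxwell_points(g, lo, hi, n=2000):
    """Return (alpha1, alpha2, sigma0) from the widest edge of the lower hull."""
    pts = []
    for i in range(n + 1):
        u = lo + (hi - lo) * i / n
        pts.append((u, g(u)))
    hull = lower_hull(pts)
    (a1, g1), (a2, g2) = max(zip(hull, hull[1:]),
                             key=lambda edge: edge[1][0] - edge[0][0])
    return a1, a2, (g2 - g1) / (a2 - a1)


def shifted_energy(g, a1, sigma0):
    """Density minus its Maxwell line, as in (5.7)."""
    g1 = g(a1)
    return lambda u: g(u) - (g1 + sigma0 * (u - a1))


def phase1_length(L, beta, a1, a2):
    """Length l of the alpha1-phase in the single-interface solution (5.5)."""
    return L * (a2 - beta) / (a2 - a1)
```

For the hypothetical well above, `maxwell_points(energy, 0.5, 2.5)` gives Maxwell points close to 1 and 2 with slope close to 0.3, the shifted density vanishes at both points and is non-negative in between, and `phase1_length` returns the split for which the two uniform phases average to $\beta$. The hull construction is one convenient way to find the common tangent on sampled data; solving (5.3) directly with a root finder would serve equally well.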
Following the argument proposed by Alberti [17] for the $\Gamma$-convergence of Cahn–Hilliard models, the inequality $r^2 + s^2 \ge 2r \cdot s$ with $r = (\tfrac{1}{2}\kappa)^{1/2} u'$ and $s := \sqrt{\bar\Phi(u(z))}$ is applied to (5.8)$_1$ to give
\[ \begin{aligned} aA \int_{-\infty}^{+\infty} \left[ \bar\Phi(u(z)) + \tfrac{1}{2}\kappa\,(u'(z))^2 \right] dz &\ge 2aA \int_{-\infty}^{+\infty} \sqrt{\tfrac{1}{2}\kappa\,\bar\Phi(u(z))}\; u'(z)\, dz \\ &= 2aA \int_{\alpha_1}^{\alpha_2} \sqrt{\tfrac{1}{2}\kappa\,\bar\Phi(u)}\; du = \gamma. \end{aligned} \tag{5.9} \]
The quantity
\[ \gamma = 2aA \int_{\alpha_1}^{\alpha_2} \sqrt{\tfrac{1}{2}\kappa\,\bar\Phi(u)}\; du, \tag{5.10} \]
represents the energy necessary to produce an interface between the two phases and depends only upon the shape of $\bar\Phi$ and, consequently, of $\Phi$. Noticing that $r^2 + s^2 = 2r \cdot s$ if and only if $r = s$, we find from (5.9) and (5.10) that the lower bound $\gamma$ for the integral in (5.8)$_1$ is attained provided $u$ satisfies the differential equation
\[ \sqrt{\frac{\kappa}{2}}\; u'(z) = \sqrt{\bar\Phi(u(z))}, \tag{5.11} \]
with boundary conditions (5.8)$_2$. Notice that $u$ is monotone and $u' \to 0$ as $u$ approaches the value $\alpha_1$ or $\alpha_2$.
To illustrate the characteristics of the transition zone in one example, suppose that $\Phi$ takes the form
\[ \Phi(u) = b_1(u - \alpha_1) + b_2 + \frac{1}{2\zeta^2}\,\kappa\, \inf\left\{ (u - \alpha_1)^2, (u - \alpha_2)^2 \right\} \tag{5.12} \]
for some constants $b_1$, $b_2$ and $\zeta$. Rigorously speaking, expression (5.12) does not follow immediately from (4.3), but since what is important is the nonconvex character of $\Phi$ in a neighborhood of $(\alpha_1, \alpha_2)$, it is assumed that there exist particular choices of $\varphi(\cdot)$ and $\nu$ such that (4.3) is well approximated by (5.12) in such a neighborhood. From (5.3) and (5.7), $\bar\Phi$ becomes
\[ \bar\Phi(u) = \frac{1}{2\zeta^2}\,\kappa\, \inf\left\{ (u - \alpha_1)^2, (u - \alpha_2)^2 \right\}, \tag{5.13} \]
and the differential equation (5.11) has the form
\[ u' = \begin{cases} \dfrac{|u - \alpha_1|}{\zeta} & \text{for } \alpha_1 \le u < \dfrac{\alpha_1 + \alpha_2}{2}, \\[10pt] \dfrac{|u - \alpha_2|}{\zeta} & \text{for } \dfrac{\alpha_1 + \alpha_2}{2} \le u \le \alpha_2. \end{cases} \tag{5.14} \]
By symmetry, the boundary conditions (5.8)$_2$ may be replaced by requiring $u(0) = (\alpha_1 + \alpha_2)/2$, so that a simple integration gives the following solution $u^*(z)$:
\[ u^*(z) = \begin{cases} \dfrac{1}{2}(\alpha_2 - \alpha_1)\exp\!\left(\dfrac{z}{\zeta}\right) + \alpha_1 & \text{for } z < 0, \\[10pt] \dfrac{1}{2}(\alpha_1 - \alpha_2)\exp\!\left(-\dfrac{z}{\zeta}\right) + \alpha_2 & \text{for } z \ge 0. \end{cases} \tag{5.15} \]
It should be noted that $\zeta$ represents a characteristic length for the transition zone. Moreover, from (5.10) the energy $\gamma$ necessary to produce the transition (interface energy) is
\[ \gamma = aA\,\frac{\kappa}{\zeta}\left( \frac{\alpha_2 - \alpha_1}{2} \right)^2, \tag{5.16} \]
which depends also upon $\kappa$.
Recalling (2.6) and the notation $u = f'$, we see that the bar undergoes a lateral contraction equal to $\nu(u - 1)$. Consequently, for $\beta \in (\alpha_1, \alpha_2)$, the shape of the deformed bar would appear as represented in Figure 3. The lateral contraction of the bar is $\nu(\alpha_1 - 1)$ or $\nu(\alpha_2 - 1)$ for the portions on the left-hand side or on the right-hand side of the transition zone respectively. Provided that $\zeta \ll L$, the length of the phase-1 portion is approximately
\[ l \cong \frac{\alpha_2 - \beta}{\alpha_2 - \alpha_1}\, L. \tag{5.17} \]
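The closed-form interface energy (5.16) can be cross-checked against a direct quadrature of the integrand in (5.8) along the profile (5.15). The Python sketch below is an illustration under stated assumptions: the numerical values of $\alpha_1$, $\alpha_2$, $\kappa$ and $\zeta$ in the usage note are arbitrary, the prefactor $aA$ defaults to one, the infinite line is truncated at a multiple of $\zeta$, and all function names are hypothetical.

```python
import math


def profile(z, a1, a2, zeta):
    """Transition profile (5.15), centered so that u(0) = (a1 + a2) / 2."""
    if z < 0.0:
        return 0.5 * (a2 - a1) * math.exp(z / zeta) + a1
    return 0.5 * (a1 - a2) * math.exp(-z / zeta) + a2


def profile_slope(z, a1, a2, zeta):
    """Derivative u'(z) of the profile (5.15)."""
    return 0.5 * (a2 - a1) * math.exp(-abs(z) / zeta) / zeta


def shifted_density(u, a1, a2, kappa, zeta):
    """Piecewise-quadratic shifted density (5.13)."""
    return kappa * min((u - a1) ** 2, (u - a2) ** 2) / (2.0 * zeta ** 2)


def interface_energy(a1, a2, kappa, zeta, aA=1.0):
    """Closed-form interface energy (5.16)."""
    return aA * (kappa / zeta) * (0.5 * (a2 - a1)) ** 2


def integrated_energy(a1, a2, kappa, zeta, aA=1.0, half_width=20.0, n=40000):
    """Trapezoidal quadrature of the integrand of (5.8) on a truncated line."""
    Z = half_width * zeta
    h = 2.0 * Z / n
    total = 0.0
    for i in range(n + 1):
        z = -Z + i * h
        u = profile(z, a1, a2, zeta)
        up = profile_slope(z, a1, a2, zeta)
        value = shifted_density(u, a1, a2, kappa, zeta) + 0.5 * kappa * up * up
        total += value * (0.5 if i in (0, n) else 1.0)
    return aA * total * h
```

With, say, `a1 = 1.0`, `a2 = 2.0`, `kappa = 0.04` and `zeta = 0.2`, the quadrature agrees with `interface_energy` to the accuracy of the trapezoidal rule, which is consistent with the profile attaining the lower bound $\gamma$ of (5.9). The two contributions to the integrand are equal pointwise along the profile, as required by the equality case $r = s$.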
The quasi-static equilibrium states of the bar, as the average stretch $\beta$ is gradually increased starting from the undistorted state $\beta = 1$, are as follows.
(i) For $1 < \beta \le \alpha_1$, the bar is uniformly stretched. The axial dilatation is $(\beta - 1)$ and the lateral contraction is $\nu(\beta - 1)$.
(ii) For $\alpha_1 < \beta < \alpha_2$, the deformation of the bar is as in Figure 3. As $\beta$ increases, one phase evolves at the expense of the second phase, gradually invading the whole bar.
(iii) For $\beta \ge \alpha_2$, the bar dilatation becomes again uniform and equal to $(\beta - 1)$ as in case (i).
Figure 3. Predicted necking in the transition zone between heterogeneous phases.
In practice, the energy consumed in producing the necking is interpreted as a surface energy between heterogeneous phases, assuring uniqueness (modulo reversal) of minimizers also in stage (ii). As $\beta$ is increased the neck moves along the bar, changing the ratio between the lengths of the regions occupied by heterogeneous phases in order to accommodate the average elongation $\beta$. This particular motion is usually referred to as drawing [8].
6. Comparison with Experiments and Conclusions
The model defined by (4.2) is a particular case of a general theory expressly conceived by Coleman [8] to describe the cold drawing in polymeric fibers such as nylon. In the undistorted condition, nylon is composed of long chainlike molecules that are oriented at random. Under a tensile force increasing from zero such filaments, tangled together, do not stretch much at first, but when a certain critical load is reached, the long molecule chains suddenly rearrange themselves, so that they become parallel to the filament axis. At this instant, a sharp constriction forms in the filament profile, in general nucleated at one of the extremities where the fiber is clamped by the loading device. A moment later, the contracted portion is seen to increase its length and the “wave front” moves along the specimen towards the opposite end (drawing), due to the successive rearrangement of the molecules in the neighboring portions. Eventually, the filament shape looks as represented in Figure 4, where the correspondence with the behavior of Figure 3, predicted by the model, is evident. After the entire filament has contracted, all the molecules have been rearranged and, if the fiber is further pulled, the longitudinal strain (and the consequent lateral contraction) increases uniformly throughout the bar [18].
Figure 4. Constriction forming on thin filaments of nylon after pulling (from [18, p. 300]).

It is perhaps less well known that such behavior, despite being caused by a completely different mechanism at the microstructural level (dislocation movements), is also analogous to the manner in which a bar of mild steel elongates. Metals have the characteristic property of possessing a well-defined “yield point”, at which they start to stretch permanently. At the macroscopic level, the load vs. displacement curve is characterized by a pseudo-horizontal plastic plateau, but careful experiments [19] reveal that plastic strain does not progress uniformly throughout the bar. When one element or layer yields, it strains a few percent almost instantly and then nearly stops, while yielding is transferred to the neighboring portions.

In both examples, a microstructural rearrangement (molecular alignment or plastic slip) produces a strain jump with the same characteristics as a phase transition, which occurs in 1-D elastic bars with nonconvex stored energy. Indeed, Müller and Villaggio [1] recognized that Ericksen’s model [3] was suitable also for an elastic-plastic body. However, if the segments of the bar could yield independently of one another, like the rings of a chain, the process would be highly chaotic, in agreement with the non-uniqueness inherent in Ericksen’s model. On the other hand, experiments on metals show an orderly deformation similar to that of polymers. A possible explanation stems from the observation that the yield point of metals is greatly influenced by any stress concentration. There is a wealth of experimental evidence [19] that localized yielding produces a condition equivalent to a stress concentration at the boundary of the yielded portion, which can greatly influence [20] the behavior of the neighboring parts (nonlocal effect). The yielding of the first portion triggers a chain reaction that spreads plastic deformation through the specimen, similar to the cold drawing of polymers. For the model of Section 5, this nonlocal effect is due to the constriction shown in Figure 3 that, while connecting heterogeneous phases, produces a condition somehow equivalent to a stress concentration. The energy consumed by the formation of such a transition plays the role of an interface energy.

In the class of models discussed here, the nonlocal effect is represented by the quadratic term in the strain gradient which appears in the energy functional (4.2). A number of authors have proposed adding a strain-gradient dependence in order to regularize Ericksen’s problem and to resolve the congenital equivalence of phase rearrangements. In this paper we have emphasized that, for a specified restricted kinematics, classical 3-D nonlinear elasticity theory naturally yields a 1-D formulation with the aforementioned characteristics.
In particular, in our derivation the nonconvex strain dependence of the energy, which is what allows for phase transformations, is inevitably associated with the appearance of a convex term in the strain gradient, thus providing the nonlocal effect necessary to reproduce the phenomena of necking and drawing.

Acknowledgements

We sincerely thank Professors G. Del Piero and R. Fosdick for their helpful comments during the preparation of this work. Partial support of the Italian MURST under grant CoFin “Modelli matematici per la scienza dei materiali” is gratefully acknowledged.

References

1. I. Müller and P. Villaggio, A model for an elastic plastic body. Arch. Rational Mech. Anal. 65 (1977) 25–56.
2. L. Truskinovsky, Fracture as a phase transition. In: R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials, CIMNE, Barcelona (1996) pp. 322–332.
3. J.L. Ericksen, Equilibrium of bars. J. Elasticity 5 (1975) 191–201.
4. J.E. Dunn and R.L. Fosdick, The morphology and stability of material phases. Arch. Rational Mech. Anal. 74 (1980) 1–99.
5. J.W. Cahn and J.E. Hilliard, Free energy of a non uniform system. I. Interfacial free energy. J. Chem. Phys. 28 (1958) 258–267.
6. J. Carr, M. Gurtin, and M. Slemrod, One-dimensional structured phase transformations under prescribed loads. J. Elasticity 15 (1985) 133–142.
7. J. Carr, M. Gurtin, and M. Slemrod, Structured phase transition on a finite interval. Arch. Rational Mech. Anal. 86 (1984) 317–351.
8. B. Coleman, Necking and drawing in polymeric fibers under tension. Arch. Rational Mech. Anal. 83 (1983) 115–137.
9. P. Podio-Guidugli and M. Lembo, Internal constraints, reactive stresses and the Timoshenko beam theory. J. Elasticity 65 (2002) 131–148.
10. B. Coleman and D.C. Newman, Constitutive relations for elastic materials susceptible to drawing. In: V.K. Stokes and D. Krajcinovic (eds), Constitutive Modeling for Nontraditional Materials. ASME, New York (1987).
11. B. Coleman and D.C. Newman, On the rheology of cold drawing. I. Elastic materials. J. Polymer Sci. Part B Polymer Phys. 26 (1988) 1801–1822.
12. B. Coleman and D.C. Newman, Mechanics of neck formation in the cold drawing of elastic films. Polymer Engrg. Sci. 30 (1990) 1299–1302.
13. J. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 100 (1977) 337–403.
14. J.E. Marsden and T.J.R. Hughes, Mathematical Foundations of Elasticity. Dover, Mineola, NY (1994).
15. R. Fosdick and G. Royer-Carfagni, Multiple natural states for an elastic isotropic material with polyconvex stored energy. J. Elasticity 60 (2000) 223–231.
16. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, New York (1989).
17. G. Alberti, Variational models for phase transitions. An approach via Γ-convergence. In: Summer School on Differential Equations and Calculus of Variations, Pisa (16–28 September 1996) Lecture Notes.
18. A. Nadai, Theory of Flow and Fracture of Solids, Vol. I. McGraw-Hill, New York (1950).
19. M. Froli and G. Royer-Carfagni, Discontinuous deformation of tensile steel bars: Experimental results. J. Engrg. Mech. ASCE 125 (1999) 1243–1250.
20. M. Froli and G. Royer-Carfagni, A mechanical model for the elastic-plastic behavior of metallic bars. Internat. J. Solids Struct. 37 (2000) 3901–3918.
Eshelby Tensor as a Tensor of Free Enthalpy

GIOVANNI BURATTI¹, YONGZHONG HUO² and INGO MÜLLER³
¹ Dipartimento di Ingegneria Strutturale, Università di Pisa, Pisa, Italy
² Department of Mechanics, Fudan University, Shanghai, PR China
³ Thermodynamik, Technische Universität Berlin, 10623 Berlin, Germany

Received 18 September 2002; in revised form 16 September 2003

Abstract. The balance equations of mass, momentum, energy and entropy at a phase boundary imply phase boundary conditions which determine the position of the boundary as a function of temperature. This is true when either the phase boundary is sharp or when it occurs through a transition zone, albeit the latter case seems to require strongly symmetric geometry.

Mathematics Subject Classifications (2000): 74A15, 74A50.

Key words: phase transitions, Eshelby tensor, free enthalpy.
Dedicated to C.A. Truesdell, who taught us rational thinking
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 101–112. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Eshelby Tensor as a Tensor of Free Enthalpy

Viewed macroscopically, phase boundaries may sometimes be sharp boundaries between uniform phases. Such is the case in a liquid–vapour or a solid–liquid phase boundary; some solid–solid phase boundaries are of that type too, notably for martensitic transformations. In such cases the equations of balance of thermodynamics assume the form of jump conditions on a singular surface, the phase boundary. In addition to the balance equations, coherency of the phases requires kinematic compatibility conditions which relate the jumps of the velocity components across the phase boundary to the mass rate of the phase transition. Thus, one obtains an entropy inequality on the phase boundary which implies that the mass rate of the phase transition is proportional to the discontinuity of the normal component of the Eshelby tensor. This idea goes back to the procedure of linear irreversible thermodynamics (e.g., see [1]), by which the thermodynamic forces and fluxes, whose product forms the entropy inequality, are proportional. This idea has been adapted in [2, 3] to the motion of a phase boundary. In particular, in equilibrium the normal component of the Eshelby tensor is therefore continuous. This equilibrium condition generalizes the thermodynamic condition of phase equilibrium by which the specific free enthalpy – or Gibbs free energy – is continuous. In simple cases the condition gives rise to the “common tangent construction”, which in thermodynamics is a familiar way to determine the load and deformations of phase equilibria as functions of temperature.

The usefulness of the continuity of the Eshelby tensor as a universal tool is strongly qualified by the need to know the generally nonuniform stress and strain fields adjacent to the phase boundary. The determination of such fields requires numerical calculations and is therefore restricted to specific cases.

More often than sharp phase boundaries we observe a transition zone between the phases, in particular between two solid phases. Such a zone is narrow but macroscopically noticeable. And while we cannot employ singular surface analysis in such cases, we may integrate over the surface of the zone and thus compare stresses and strains in the regions of the body adjacent to the transition zone. In simple cases of high symmetry this procedure can lead to a variant of the “common tangent construction”.

In recent years phase transitions in solids have become important, since structural elements of machines are subjected to high temperatures and strong tensile forces which can affect the size and shape of inclusions. After the pioneering work of Eshelby [4] for elastic inclusions, the thermodynamic character of the Eshelby tensor was first recognized by Heidug and Lehner [5]. More recent references are the works by Truskinovsky [6], Gurtin [7] and Liu [8]. Under most circumstances the exploitation of the phase boundary conditions requires extensive numerical calculations, and such calculations have been made by Schmidt [9] and R. Müller [10], who have calculated the shapes of inclusions in solids.

The present work emphasizes the similarity of the Eshelby tensor with the free enthalpy, or Gibbs free energy, of thermodynamics. After the derivation of the general form of the phase equilibrium conditions, we specialize to a liquid–vapour transition and to a solid–solid transition under shear.
We also discuss the case when the transition does not occur on a sharp interface but in a narrow but smooth zone. The mathematical formulation of the coherency of the phases makes use of the displacement field, i.e., the position of material points relative to their positions at an initial time, or in a reference configuration. And the displacement gradient affects the Eshelby tensor. This fact has given rise in the literature to the ideas that the proper environment of the Eshelby tensor is a “configurational space”; see [11, 12], who have gone so far as to call for a “configurational physics”. Similar ideas about “configurational balances” have been proposed by Podio-Guidugli [13]. We see no need for all these and we firmly place the Eshelby tensor in “physical space”, where it belongs in our opinion, along with such classical subjects as stress, strain and free enthalpy. To keep arguments simple we have omitted to endow the phase boundary with properties of its own like surface stress or surface energy. The formulae needed for the consideration of such properties are available in [14] or [15].
Figure 1. Sharp phase boundary.
2. Equations of Balance

Figure 1 shows a cross section of a sharp phase boundary that separates the phases (+) and (−) and has the unit normal $n_i$ pointing from (−) into (+). The velocity of the boundary is $V_i = V n_i$ and the velocities of the phases on either side are $v_i^{\pm}$. The conservation laws of mass, momentum and energy on the phase boundary read

\[
\begin{aligned}
&[\rho(v_i - V_i)]\,n_i = 0,\\
&[\rho v_j(v_i - V_i) - t_{ji}]\,n_i = 0,\\
&\Bigl[\rho\bigl(\varepsilon + \tfrac{1}{2}v^2\bigr)(v_i - V_i) - t_{ji}v_j + q_i\Bigr]\,n_i = 0.
\end{aligned}
\tag{2.1}
\]

They state that the fluxes of mass, momentum and energy in and out of the surface are equal; the convective fluxes are relative to the moving surface. Equations (2.1) employ the canonical letters for the physical quantities. Thus $\rho$ is the mass density, $v_i$ the velocity, $t_{ji}$ the stress, $\varepsilon$ the specific internal energy, and $q_i$ the heat flux. Square brackets denote the difference of the bracketed quantity on the two sides such that $[c] = c^+ - c^-$ holds.

The balance of entropy states that the efflux of entropy from the surface is bigger than the influx. Thus, if we suppose that the efflux is on the (+)-side and the influx is on the (−)-side, the balance reads

\[
\Bigl[\rho\eta(v_i - V_i) + \frac{q_i}{T}\Bigr]\,n_i = \sigma \ge 0;
\tag{2.2}
\]

$\eta$ denotes the specific entropy, $T$ the absolute temperature and $\sigma$ is the surface density of entropy production. It is reasonable to assume that the temperature is continuous. Therefore $[q_i]n_i$ may be eliminated between (2.1)$_3$ and (2.2). With $\psi = \varepsilon - T\eta$ as the specific free energy the result reads

\[
\Bigl[\rho\bigl(\psi + \tfrac{1}{2}v^2\bigr)(v_\perp - V) - t_{ji}v_j n_i\Bigr] = -T\sigma \le 0,
\tag{2.3}
\]

where $v_\perp$ is short for $v_i n_i$. We define the arithmetic mean value $\langle c\rangle = \tfrac{1}{2}(c^+ + c^-)$ and use (2.1)$_{1,2}$ as well as the identity

\[
[ab] = [a]\langle b\rangle + \langle a\rangle[b]
\tag{2.4}
\]

to rewrite (2.3) in the form

\[
[\psi]\,\rho(v_\perp - V) - \langle t_{ji}\rangle n_i\,[v_j] \le 0.
\tag{2.5}
\]
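The step from (2.3) to (2.5) can be spelled out as follows. This is a sketch of the intermediate algebra; the abbreviation $\dot m = \rho(v_\perp - V)$ is introduced here only for convenience and is not part of the original notation.

\[
\begin{aligned}
&[\dot m] = 0 \ \text{by (2.1)}_1, \qquad \dot m\,[v_j] = [t_{ji}]\,n_i \ \text{by (2.1)}_2,\\
&\bigl[\tfrac{1}{2}v^2\bigr] = \langle v_j\rangle[v_j], \qquad [t_{ji}v_j]\,n_i = [t_{ji}]\,n_i\,\langle v_j\rangle + \langle t_{ji}\rangle n_i\,[v_j] \ \text{by (2.4)},\\
&\Bigl[\rho\bigl(\psi + \tfrac{1}{2}v^2\bigr)(v_\perp - V) - t_{ji}v_j n_i\Bigr]
= \dot m\,[\psi] + \dot m\,\langle v_j\rangle[v_j] - \dot m\,[v_j]\langle v_j\rangle - \langle t_{ji}\rangle n_i\,[v_j]
= \dot m\,[\psi] - \langle t_{ji}\rangle n_i\,[v_j].
\end{aligned}
\]

The kinetic-energy contribution thus cancels against part of the stress power, and only the jump of the free energy and the power of the mean stress vector remain.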
In words this means that the difference between efflux and influx of the free energy is smaller than or equal to the power of the mean stress vector $\langle t_{ji}\rangle n_i$ on the relative velocity $[v_j]$. An alternative form of (2.5) results if we decompose $[v_j]$ into its normal and tangential components according to

\[
[v_j] = [v_\perp]\,n_j + \sum_{\alpha=1}^{2}[v_\alpha]\,\tau_j^\alpha,
\tag{2.6}
\]

where $\tau_j^\alpha$ are two orthonormal tangent vectors on the phase boundary such that $\tau_j^\alpha\tau_j^\beta = \delta_{\alpha\beta}$ holds; $v_\alpha = v_j\tau_j^\alpha$ are the corresponding tangential components of $v_j$. Thus we obtain from (2.5)

\[
\Bigl[\psi - \frac{\langle t_{ji}\rangle n_i n_j}{\rho}\Bigr]\rho(v_\perp - V) - \sum_{\alpha=1}^{2}\langle t_{ji}\rangle n_i\,\tau_j^\alpha\,[v_\alpha] \le 0,
\tag{2.7}
\]

so that the power of the mean stress vector is split into three parts due to the normal and tangential components of $\langle t_{ji}\rangle n_i$.

3. Kinematic Compatibility

Coherency of the phases implies that the displacement vector $u_i$ is continuous across the phase boundary, i.e.,

\[
[u_j] = 0.
\tag{3.1}
\]
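(The displacement field $u_i(x_j, t)$ refers to a material particle which at time $t$ is at the position $x_j$, while at some prior time $t_0$ it was at $x_j^0$; we have $u_i(x_j, t) = x_i(x_j^0, t) - x_i^0$.) The gradient $\partial u_j/\partial x_i$ and the time derivative $\partial u_j/\partial t$ may suffer jumps but, because of (3.1), the jump of the gradient cannot have a tangential direction and the jump of the time derivative is related to the normal speed $V$ of the phase boundary. We have

\[
\Bigl[\frac{\partial u_j}{\partial x_i}\Bigr] = a_j n_i
\quad\text{and}\quad
\Bigl[\frac{\partial u_j}{\partial t}\Bigr] = -V\Bigl[\frac{\partial u_j}{\partial x_i}\Bigr]n_i.
\tag{3.2}
\]

These conditions, where $a_i$ is an arbitrary vector on the phase boundary, are called kinematic compatibility conditions (e.g., see [16, p. 503 ff]). The velocity is related to the derivatives of the displacement by

\[
v_i = \frac{\partial u_i}{\partial t} + \frac{\partial u_i}{\partial x_j}v_j,
\quad\text{hence}\quad
v_j = \Bigl(\delta_{ij} - \frac{\partial u_j}{\partial x_i}\Bigr)^{-1}\frac{\partial u_i}{\partial t}.
\tag{3.3}
\]

We recall the decomposition (2.6) of $[v_j]$ into normal and tangential components and use (2.1)$_1$, (2.4), (3.2) and (3.3) to determine these components in terms of the mass rate of the phase transition

\[
\begin{aligned}
[v_\perp] &= \Bigl[\frac{1}{\rho}\Bigr]\rho(v_\perp - V),\\
[v_\alpha] &= -\tau_k^\alpha\Bigl[\underbrace{\Bigl\langle\delta_{kj} - \frac{\partial u_k}{\partial x_j}\Bigr\rangle^{-1}\Bigl\langle\frac{1}{\rho}\Bigr\rangle\Bigl(\delta_{ji} - \frac{\partial u_j}{\partial x_i}\Bigr)}_{B_{ki}}\Bigr]n_i\,\rho(v_\perp - V).
\end{aligned}
\tag{3.4}
\]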
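A possible route to (3.4)$_2$ is sketched below. It is a reconstruction rather than a derivation quoted from the text: the abbreviations $M_{kj} = \delta_{kj} - \partial u_k/\partial x_j$, $w_j = v_j - V n_j$ and $\dot m = \rho(v_\perp - V)$ are introduced only for this sketch, and the mean values $\langle\cdot\rangle$ are assumed to enter as in the identity (2.4).

\[
\begin{aligned}
&M_{kj}v_j = \frac{\partial u_k}{\partial t}\ \text{by (3.3)}
\;\Longrightarrow\;
M_{kj}w_j = \frac{\partial u_k}{\partial t} - V n_k + V\frac{\partial u_k}{\partial x_j}n_j,\\
&[M_{kj}w_j] = \Bigl[\frac{\partial u_k}{\partial t}\Bigr] + V\Bigl[\frac{\partial u_k}{\partial x_j}\Bigr]n_j = 0\ \text{by (3.2)},\\
&\langle M_{kj}\rangle[w_j] = -[M_{kj}]\langle w_j\rangle = a_k\,n_j\langle w_j\rangle = a_k\Bigl\langle\frac{1}{\rho}\Bigr\rangle\dot m\ \text{by (2.4), (3.2), (2.1)}_1,\\
&[v_j] = [w_j] = -\langle M\rangle^{-1}_{jk}\Bigl\langle\frac{1}{\rho}\Bigr\rangle[M_{ki}]\,n_i\,\dot m,
\qquad\text{since } a_k = \Bigl[\frac{\partial u_k}{\partial x_i}\Bigr]n_i = -[M_{ki}]\,n_i.
\end{aligned}
\]

Contracting the last line with $\tau_j^\alpha$ and relabelling the summation indices gives the tangential jumps in the form (3.4)$_2$. The normal component (3.4)$_1$ follows directly from (2.1)$_1$, because $n_j w_j = \dot m/\rho$ on either side of the boundary.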
By (3.4) the jumps of all three components are proportional to the mass rate $\rho(v_\perp - V)$ of the phase transition. For the normal component the factor of proportionality is $[1/\rho]$, while for the tangential components the factors are complex expressions in terms of the densities and displacement gradients of the phases. In the sequel we simplify by introducing the abbreviation $B_{ki}$ as indicated in (3.4)$_2$.

4. Entropy Inequality, Speed of Phase Boundary, Equilibrium

We eliminate $[v_j]$ between (3.4) and (2.5) and obtain an alternative form of the entropy inequality or free energy inequality, viz.

\[
\Bigl[\underbrace{\Bigl(\psi - \frac{\langle t_{kn}\rangle n_k n_n}{\rho}\Bigr)\delta_{li} + \sum_{\alpha=1}^{2}\langle t_{jl}\rangle\,\tau_j^\alpha\tau_k^\alpha B_{ki}}_{\text{Eshelby tensor } g_{li}}\Bigr]n_l n_i\,\rho(v_\perp - V) \le 0.
\tag{4.1}
\]
The bracketed quantity defines the Eshelby tensor; here we denote it by $g_{ik}$ in order to emphasize its kinship with the free enthalpy or Gibbs free energy, which we shall exhibit later and which is usually denoted by $g$. Thus (4.1) may be written in abbreviated form as

\[
[g_{li}]\,n_l n_i\,\rho(v_\perp - V) \le 0.
\tag{4.2}
\]

The left-hand side of (4.2) may be interpreted as a product of a thermodynamic force $[g_{li}]n_l n_i$, the discontinuity of the normal component of the Eshelby tensor, and a thermodynamic flux $\rho(v_\perp - V)$, the mass rate of the phase transition. The inequality (4.2) is satisfied by setting

\[
\rho(v_\perp - V) = N\,[g_{ik}]\,n_i n_k,
\tag{4.3}
\]

where $N$ is a negative factor of proportionality. Thus the mass rate of the phase transition is proportional to the jump of the normal component of the Eshelby tensor. In particular, in equilibrium, when the phase transition comes to a standstill, we must have

\[
[g_{ik}]\big|_E\,n_i n_k = 0.
\tag{4.4}
\]

The index $E$ refers to equilibrium, where $v_\perp = V$ holds. In that case (2.1)$_2$ implies

\[
[t_{ji}]\big|_E\,n_i = 0,
\quad\text{hence}\quad
\langle t_{ji}\rangle\big|_E\,n_i = t_{ji}n_i,
\tag{4.5}
\]

so that the stress vector $t_{ji}n_i$ is continuous. We obtain from (4.4) and (4.1)

\[
[g_{ik}]\big|_E\,n_i n_k = \Bigl[\Bigl(\psi - \frac{\langle t_{kn}\rangle n_k n_n}{\rho}\Bigr)\delta_{li} + \sum_{\alpha=1}^{2}\langle t_{jl}\rangle\,\tau_j^\alpha\tau_k^\alpha B_{ki}\Bigr]_E n_l n_i = 0,
\tag{4.6}
\]

where $B_{ki}$ is defined in (3.4)$_2$. We shall refer to this equation as the Eshelby condition for phase equilibrium. Given the continuity of the temperature across the phase boundary, the continuity of the stress vector, and the continuity of the normal component of the Eshelby tensor, we can determine the position of the phase boundary as a function of temperature. We proceed to study two special cases.

5. Liquid–Vapour Phase Transition

We envisage a situation as shown in Figure 2(a) with liquid and vapour of a fixed mass $m$ and in a fixed volume $V$. And we ask for $V^{(-)}(V, T)$, the volume of the liquid in equilibrium as a function of the total volume $V$ and temperature $T$.
Figure 2. Common tangent for liquid–vapour transition. Panels (a) and (b).
In equilibrium the stress of both liquid and vapour reduces to an isotropic pressure $p$, i.e., in (4.6) we have $t_{ji} = -p\,\delta_{ij}$. Therefore the stress vector $t_{jl}n_l$ has no tangential component and (4.6) reads

\[
\Bigl[\psi + \frac{p}{\rho}\Bigr] = 0
\quad\text{or}\quad
p = -\frac{[\psi]}{[1/\rho]},
\tag{5.1}
\]

since, by (4.5), the pressure is continuous in this case. Equation (5.1), which represents the Eshelby condition in the present case, expresses the continuity of the specific free enthalpy or Gibbs free energy. We recall from basic thermodynamics that $p = -\partial\psi/\partial(1/\rho)$ holds in the liquid and the vapour. Therefore it follows from (5.1)$_2$ that

\[
\frac{\partial\psi^+}{\partial(1/\rho)} = \frac{\partial\psi^-}{\partial(1/\rho)} = \frac{[\psi]}{[1/\rho]}
\tag{5.2}
\]

holds, which implies the well-known “common tangent construction”. This is a graphical method for determining the densities $\rho^{\pm}(T)$ as functions of temperature. These densities or, equivalently, the specific volumes $1/\rho^{\pm}(T)$ result as the abscissae of the points of contact of the tangent common to the two free energy functions $\psi^-(1/\rho, T)$ and $\psi^+(1/\rho, T)$; see Figure 2. The pressure needed to maintain the volume $V$ is given by the negative of the slope of the tangent; it depends on the temperature. We are thus able to reach our objective and determine $V^{(-)}$ as a function of $T$, given $V$. With $m$ being the mass of the fluid we obtain

\[
V^{(-)}(V, T) = V\,\frac{m/V - \rho^+(T)}{\rho^-(T) - \rho^+(T)}.
\tag{5.3}
\]

Of course, this solution did require the knowledge of $\psi^{(+)}(1/\rho, T)$ and $\psi^{(-)}(1/\rho, T)$. A well-known example for which this knowledge is available is the van der Waals gas, where $\psi^{\pm}$ are given by the two convex parts of the free energy function.
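For the van der Waals example the common tangent construction (5.1), (5.2) and the volume formula (5.3) can be evaluated numerically. The following is a minimal sketch under stated assumptions: it works in reduced (critical-point) units, where the pressure is p = 8T/(3v − 1) − 3/v² with v = 1/ρ, it uses the free energy only up to an additive function of T, and it is intended for reduced temperatures below 1 for which the liquid spinodal pressure is positive. All function names are hypothetical, and plain bisection is a choice made here for robustness, not a method from the text.

```python
import math

def p_vdw(v, T):
    """Reduced van der Waals pressure as a function of specific volume v."""
    return 8.0 * T / (3.0 * v - 1.0) - 3.0 / v**2

def psi_vdw(v, T):
    """Reduced free energy up to a function of T, so that p = -d(psi)/dv."""
    return -(8.0 * T / 3.0) * math.log(3.0 * v - 1.0) - 3.0 / v

def bisect(f, a, b, iters=100):
    """Bisection for a root of f in [a, b]; f(a) and f(b) must differ in sign."""
    fa = f(a)
    for _ in range(iters):
        c = 0.5 * (a + b)
        fc = f(c)
        if (fa > 0) == (fc > 0):
            a, fa = c, fc
        else:
            b = c
    return 0.5 * (a + b)

def coexistence(T):
    """Return (p, v_liquid, v_vapour) satisfying (5.1): equal pressure and
    equal free enthalpy psi + p*v in the two phases."""
    # Spinodal volumes, where dp/dv = 0, bound the two convex branches.
    spin = lambda v: 4.0 * T * v**3 - (3.0 * v - 1.0) ** 2
    v_s1 = bisect(spin, 1.0 / 3.0 + 1e-12, 1.0)
    v_s2 = bisect(spin, 1.0, 10.0)
    p_lo = max(p_vdw(v_s1, T), 0.0) + 1e-9
    p_hi = p_vdw(v_s2, T) - 1e-9

    def branches(p):
        v_l = bisect(lambda v: p_vdw(v, T) - p, 1.0 / 3.0 + 1e-9, v_s1)
        v_g = bisect(lambda v: p_vdw(v, T) - p, v_s2, (8.0 * T / p + 1.0) / 3.0 + 1.0)
        return v_l, v_g

    def enthalpy_jump(p):
        v_l, v_g = branches(p)
        return (psi_vdw(v_l, T) + p * v_l) - (psi_vdw(v_g, T) + p * v_g)

    p = bisect(enthalpy_jump, p_lo, p_hi)
    v_l, v_g = branches(p)
    return p, v_l, v_g

def liquid_volume(m, V, rho_minus, rho_plus):
    """Volume of the liquid phase according to (5.3)."""
    return V * (m / V - rho_plus) / (rho_minus - rho_plus)
```

At the reduced temperature T = 0.9, `coexistence(0.9)` gives a saturation pressure of about 0.647; passing the densities `1/v_l` and `1/v_g` to `liquid_volume` then yields the liquid volume for a given mass and container volume.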
6. Solid–Solid Phase Transition under Shear

We consider a situation as shown in Figure 3(a) where the upper plane of an elastic body is displaced by $u_i(H) = (U, 0, 0)$. The transformation strain between the phases (+) and (−) is given by

\[
\varepsilon_{ij}^{(+)} - \varepsilon_{ij}^{(-)} = \bar\varepsilon_{ij} =
\begin{bmatrix}
0 & \bar\varepsilon & 0\\
\bar\varepsilon & 0 & 0\\
0 & 0 & 0
\end{bmatrix}.
\tag{6.1}
\]

We ask for $X(U, T)$, the position of the phase boundary in equilibrium as a function of temperature, if $U$ is given.
Figure 3. Common tangent construction for solid–solid transition in shear. Panels (a) and (b).
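We assume that both phases are linearly elastic isotropic bodies with equal Lamé coefficients $\mu$ and $\lambda$. Therefore the stresses $t_{ij}$ in terms of the strains $\varepsilon_{ij} = \tfrac{1}{2}(\partial u_i/\partial x_j + \partial u_j/\partial x_i)$ are given by

\[
t_{ij}^{(+)} = 2\mu\bigl(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\bigr) + \lambda\,\varepsilon_{ll}^{(+)}\delta_{ij},
\qquad
t_{ij}^{(-)} = 2\mu\,\varepsilon_{ij}^{(-)} + \lambda\,\varepsilon_{ll}^{(-)}\delta_{ij}.
\tag{6.2}
\]

In both phases the equilibrium condition $\partial t_{ij}/\partial x_j = 0$ must be satisfied. It follows that only the 1-components of the displacement fields are nonzero. They read

\[
\begin{aligned}
u_1^{(+)} &= \Bigl(\frac{U}{H} + 2\bar\varepsilon\,\frac{X}{H}\Bigr)(x_2 - H) + U,\\
u_1^{(-)} &= \Bigl(\frac{U}{H} + 2\bar\varepsilon\Bigl(\frac{X}{H} - 1\Bigr)\Bigr)x_2.
\end{aligned}
\tag{6.3}
\]

These fields satisfy the boundary values $u_1^{(+)}(H) = U$, $u_1^{(-)}(0) = 0$ and the jump conditions (3.1) and (4.5). It remains to exploit the condition (4.6) on the continuity of the Eshelby tensor. Obviously from (6.3) we have

\[
\begin{aligned}
\varepsilon_{12}^{(+)} = \varepsilon_{21}^{(+)} &= \frac{1}{2H}\bigl(U + 2\bar\varepsilon X\bigr),\\
\varepsilon_{12}^{(-)} = \varepsilon_{21}^{(-)} &= \frac{1}{2H}\bigl(U + 2\bar\varepsilon(X - H)\bigr),\\
t_{12}^{(\pm)} = t_{21}^{(\pm)} &= \frac{\mu}{H}\bigl(U + 2\bar\varepsilon(X - H)\bigr),
\end{aligned}
\tag{6.4}
\]

and $\rho^{(+)} = \rho^{(-)}$ holds, since $\varepsilon_{ll} = 0$. Without loss of generality we set $\tau_j^1 = (1, 0, 0)$ and $\tau_j^2 = (0, 0, 1)$ on the boundary. Therefore (4.6) reads $[\psi + t_{12}B_{12}] = 0$ or, with (cf. (3.4)) $[B_{12}] = -(1/\rho)[\partial u_1/\partial x_2]$,

\[
\Bigl[\rho\psi - t_{12}\frac{\partial u_1}{\partial x_2}\Bigr] = 0
\quad\text{or, by (6.1), (6.3),}\quad
t_{12} = \frac{[\rho\psi]}{[2\varepsilon_{12}]}.
\tag{6.5}
\]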
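Equation (6.5)$_1$ or, equivalently, $[\rho\psi - t_{12}\,2\varepsilon_{12}] = 0$, represents the Eshelby condition in this case. Clearly it may be expressed as the continuity of the free enthalpy density appropriate to shear loading. We recall from linear elasticity that the free energy densities of the phases are given by

\[
\begin{aligned}
\rho\psi^{(+)} &= \rho f^{(+)}(T) + \mu\bigl(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\bigr)\bigl(\varepsilon_{ij}^{(+)} - \bar\varepsilon_{ij}\bigr) + \frac{\lambda}{2}\bigl(\varepsilon_{ll}^{(+)}\bigr)^2
\quad\text{and}\\
\rho\psi^{(-)} &= \rho f^{(-)}(T) + \mu\,\varepsilon_{ij}^{(-)}\varepsilon_{ij}^{(-)} + \frac{\lambda}{2}\bigl(\varepsilon_{ll}^{(-)}\bigr)^2,
\end{aligned}
\tag{6.6}
\]

so that $t_{12} = \partial(\rho\psi)/\partial(2\varepsilon_{12})$ holds; $f^{(\pm)}(T)$ are the specific non-elastic, thermal parts of the free energy density. Therefore (6.5) may be written in the form

\[
\frac{\partial(\rho\psi^+)}{\partial(2\varepsilon_{12})} = \frac{\partial(\rho\psi^-)}{\partial(2\varepsilon_{12})} = \frac{[\rho\psi]}{[2\varepsilon_{12}]},
\tag{6.7}
\]

which implies the common tangent construction for the determination of $\varepsilon_{12}^{\pm}(T)$ and $t_{12}(T)$; see Figure 3. We are then able to calculate the position $X$ of the interface from (6.4)$_1$ as

\[
X(U, T) = \frac{2H\,\varepsilon_{12}^{(+)}(T) - U}{2\bigl(\varepsilon_{12}^{(+)}(T) - \varepsilon_{12}^{(-)}(T)\bigr)}.
\tag{6.8}
\]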
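For this example the common tangent can be worked out in closed form, which makes (6.8) easy to evaluate. The sketch below rests on a step not written out in the text: inserting (6.6) into (6.5)₂ and using the stress continuity ε12(+) − ε̄ = ε12(−) from (6.4), the elastic parts cancel and the shear stress becomes t12 = ρ(f(+) − f(−))/(2ε̄). The function name and the parameter `rho_df`, standing for ρ(f(+)(T) − f(−)(T)), are hypothetical.

```python
def shear_equilibrium(U, H, eps_bar, mu, rho_df):
    """Equilibrium of the sheared two-phase layer of Section 6.

    U       : prescribed displacement of the upper plane
    H       : height of the layer
    eps_bar : transformation shear strain, see (6.1)
    mu      : shear modulus (equal in both phases)
    rho_df  : rho * (f_plus(T) - f_minus(T)), the thermal free energy jump

    Returns (t12, eps12_minus, eps12_plus, X). The caller should check that
    0 <= X <= H; otherwise the body is in a single phase and the two-phase
    solution does not apply.
    """
    # Common tangent (6.5)_2 with (6.6): the elastic energies cancel.
    t12 = rho_df / (2.0 * eps_bar)
    # Hooke's law (6.2) in each phase.
    eps12_minus = t12 / (2.0 * mu)
    eps12_plus = eps12_minus + eps_bar
    # Interface position from (6.8).
    X = (2.0 * H * eps12_plus - U) / (2.0 * (eps12_plus - eps12_minus))
    return t12, eps12_minus, eps12_plus, X
```

Since `rho_df` depends on temperature, repeated calls with the same `U` trace how the interface position moves with T; the returned stress can be cross-checked against (6.4)₃.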
Given $U$, the position changes with temperature.

7. Extrapolation to Transition Zones Rather than Sharp Boundaries

It is clear that simple and clear-cut considerations like those in Sections 5 and 6 require uniform fields of stress and strain up to the phase boundary, whose position may then be calculated as shown. This does not work very often. Usually the fields are non-uniform and not known analytically. Also, neither the position nor the shape of the phase boundary is known. In such cases extensive numerical calculations are needed to obtain solutions inside the phases that are coherent at the boundary and satisfy the jump conditions of thermodynamics. Such calculations have been made (e.g., see [9, 10]), and they have been used to determine the size and shape of precipitates in alloys. Actually such investigations are close in scope to the motivation of Eshelby’s original paper [4], in which the Eshelby tensor first appeared.

Not wanting to enter such complex numerical studies, we shall rest content to point out, in this section, certain simple cases in which stress and strain are not uniform and yet definite simple answers can be found like those of the previous sections. Specifically we shall consider circumstances in which the non-uniformity of the stress and strain fields is confined to a narrow range, the transition zone between two phases. Typical examples are shown in Figure 4: a cylindrical rod under tension undergoing a phase change that is accompanied by a change of cross section, and a rod under compression where the lateral expansion is constrained by a rigid pipe. In both cases there is a transition zone of non-uniform stress but outside that zone it is reasonable to assume uniformity.

Figure 4. Tensile and compressive rods.

We apply the equations of balance of mass, momentum, energy and entropy to material volumes whose surfaces $\partial\Omega$ are indicated by the dashed lines in Figure 4, so that the cross-sections lie outside the transition zones. The surfaces move with the bodies and the mantle parts of $\partial\Omega$ contribute nothing except possibly heating. Therefore the balance equations read

\[
\begin{aligned}
&\{\rho v_i n_i A\} = 0,\\
&\{\rho v_j v_i n_i A - t_{ji}n_i A\} = 0,\\
&\Bigl\{\rho\bigl(\varepsilon + \tfrac{1}{2}v^2\bigr)v_i n_i A - t_{ji}n_i v_j A\Bigr\} + \int_{\partial\Omega} q_i n_i\,dA = 0,\\
&\{\rho\eta\,v_i n_i A\} + \int_{\partial\Omega}\frac{q_i n_i}{T}\,dA \ge 0.
\end{aligned}
\tag{7.1}
\]
The brackets denote differences between the two cross-sections, such that $\{c\} = c^+ - c^-$. Note that the cross-sections may be different in area $A$. In the case of the tensile rod density and stress vanish on the mantle, while in the case of the compressed rod the contributions of stress on the mantle cancel each other because of symmetry. We assume that $T$ is uniform and eliminate the heating between (7.1)$_{3,4}$. Thus we obtain

\[
\Bigl\{\rho\bigl(\psi + \tfrac{1}{2}v^2\bigr)v_i n_i A - t_{ji}n_i v_j A\Bigr\} \le 0
\tag{7.2}
\]

or, by (2.4), which holds for the curly brackets as well as the square ones,

\[
\{\psi\}\,\rho v_i n_i A - \langle t_{ji}n_i A\rangle\{v_j\} \le 0.
\tag{7.3}
\]

In both of our examples $v_j$ is normal to the cross-sections and, by (7.1)$_1$, we may therefore write (with $v_i n_i = v_\perp$)

\[
\{v_j\} = \Bigl\{\frac{1}{\rho A}\Bigr\}n_j\,\rho v_\perp A
\tag{7.4}
\]

so that (7.3) assumes the form

\[
\Bigl\{\psi - \frac{\langle t_{ji}n_j n_i A\rangle}{\rho A}\Bigr\}\rho v_\perp A \le 0.
\tag{7.5}
\]
As before, in Section 4, we argue from (7.5) that the mass rate $\rho v_\perp A$ of the transition is proportional to its factor in (7.5) with a negative factor of proportionality so as to satisfy the inequality. Thus with $t_{ij}n_i n_j A$ as the tensile or compressive force $P$ we have

\[
\rho v_\perp A = N\Bigl\{\psi - \frac{P}{\rho A}\Bigr\}
\tag{7.6}
\]

and in phase equilibrium

\[
\Bigl\{\psi - \frac{P/A}{\rho}\Bigr\} = 0,
\tag{7.7}
\]

since, by (7.1)$_2$, $P$ is the same on the two cross-sections. The phase equilibrium condition (7.7) is formally identical to the liquid–vapour condition, cf. (5.1)$_1$. Note, however, that when $A^+$ and $A^-$ are unequal, as in the tensile rod, it is not the stress $P/A$ which is equal in the two phases but the load $P$. In such a case there is an interesting alternative form of (7.7) which results from multiplying numerator and denominator by the lengths $L^+$ and $L^-$, see Figure 4. In this way with $\rho A L = m$ we obtain

\[
\Bigl\{\psi - P\,\frac{L}{m}\Bigr\} = 0
\quad\text{or}\quad
P = \frac{\{\psi\}}{\{l\}},
\tag{7.8}
\]

where $l^{\pm} = L^{\pm}/m^{\pm}$ are the specific lengths of the phases. Along with $P = (\partial\psi/\partial l)|_+ = (\partial\psi/\partial l)|_-$ this gives rise to a common tangent construction for the free energies $\psi^{\pm}(l, T)$. Such a construction was described by Huo and Müller [17] in connection with a shape memory rod from a minimization of the free energy.

Acknowledgements

Giovanni Buratti and Yongzhong Huo gratefully acknowledge the support of their funding institutions. G. Buratti was a junior researcher in the TMR project “Phase Transformations in Crystalline Solids” and Y. Huo has an Alexander von Humboldt scholarship.

References

1. S.R. de Groot and P. Mazur, Anwendung der Thermodynamik irreversibler Prozesse. Bibliographisches Institut, Mannheim (1974).
2. R. Abeyaratne and J.K. Knowles, On the driving traction acting on a surface of strain discontinuity in a continuum. J. Mech. Phys. Solids 38 (1990) 345–360.
3. E. Fried, Energy release, friction and supplemental relations at phase interfaces. Continuum Mech. Thermodyn. 7 (1995) 111–121.
4. J.D. Eshelby, The elastic energy momentum tensor. J. Elasticity 5 (1975) 321–335.
5. W. Heidug and F.K. Lehner, Thermodynamics of coherent phase transformations in nonhydrostatically stressed solids. Pure Appl. Geophys. 123 (1985) 91–98.
6. L.M. Truskinovsky, Dynamics of non-equilibrium phase boundaries in a heat-conducting nonlinearly elastic medium. J. Appl. Math. Mech. PMM USSR 51 (1987) 777–784.
7. M.E. Gurtin, The dynamics of solid–solid phase transitions – 1. Coherent transitions. Arch. Rational Mech. Anal. 123 (1993) 305–335.
8. I-Shih Liu, On interface equilibrium and inclusion problems. Continuum Mech. Thermodyn. 4 (1992) 177–188.
9. I. Schmidt, Gleichgewichtsmorphologien elastischer Einschlüsse. Dissertation TU Darmstadt. Shaker Verlag (1997).
10. R. Müller, 3D-Simulation der Mikrostrukturentwicklung in Zwei-Phasen-Materialien. Dissertation TU Darmstadt (2001).
11. G.A. Maugin, Material forces: Concepts and applications. ASME Appl. Mech. Rev. 48 (1995) 213–245.
12. R. Kienzler and G.A. Maugin, Configurational Mechanics of Materials. CISM Internat. Centre for Mechanical Sciences, Courses and Lectures 427 (1999).
13. P. Podio-Guidugli, Configurational balances via variational arguments. Interfaces Free Boundaries 3 (2001) 1–13.
14. I. Müller, Thermodynamics. Pitman, Boston (1985).
15. I. Müller, Eshelby tensor and phase equilibrium. Theor. Appl. Mech. 25 (1999) 77–89.
16. C. Truesdell and R. Toupin, The classical field theories. In: Handbuch der Physik, Vol. III/1. Springer, Heidelberg (1960) pp. 226–793.
17. Y. Huo and I. Müller, Thermodynamics of pseudoelasticity – an analytical approach. Acta Mechanica 99 (1993) 1–19.
E. Frola (1906–1962): an Attempt Towards an Axiomatic Theory of Elasticity

SANDRO CAPARRINI and FRANCO PASTRONE
Dipartimento di Matematica, Università di Torino, Via Carlo Alberto 10, 10123 Torino, Italy.

Received 19 September 2002; in revised form 16 October 2003

Abstract. In a few papers published in the 1940s, the Italian mathematician Eugenio Frola (1906–1962) proposed an axiomatic formulation of the theory of linear elasticity as an autonomous branch of mathematics, and he sketched a program for the logical foundations of that mathematical theory. He gave only some general principles and a few hints, without obtaining any definite axiomatic structure in the sense of the fundamental work of W. Noll and C. Truesdell a few years later. Nevertheless, Frola’s attempt at finding rigorous axiomatic foundations of classical elasticity must be acknowledged as a very first step in the right direction.

Mathematics Subject Classifications (2000): 74-03, 74Axx, 74B05, 74B20.

Key words: history of elasticity, axiomatic theory.
To Clifford Truesdell, Master of Science and Life
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 113–125. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

As is well known, at the beginning of the twentieth century the theory of elasticity was widely developed in Italy, to the point that Felix Klein claimed that “elasticity was the national question of Italians”. In 1935, in the annual report of the Italian Society for the Progress in Science (SIPS), it was remarked that, in a rather unpleasant period for Mechanics in Italy, two fields were still flourishing: Hydrodynamics and Elasticity. Eugenio Frola was part of the scientific group working in linear elasticity and mechanics of structures, and he showed a marked interest in generalizing, or trying to generalize, any of the many problems studied by himself or by his colleagues in different fields, even if he dealt mainly with the theory of elasticity. He was variously attracted by problems in engineering, structural mechanics, vibrations, physics, functional analysis, PDE, integral equations, and economics. His efforts were always directed not only to solving particular problems but also to finding more general formulations which could satisfy, even partially, his expectation of pointing out the rigorous logical structure that ought to underlie every theory.
S. CAPARRINI AND F. PASTRONE
In 1940 Frola wrote two papers, in which he sought to develop a systematic approach to nonlinear elasticity, starting from a linear model. In Section 3 we examine this attempt, first discussed in an article [1] that appeared in Il Saggiatore, a journal devoted to the dissemination of science among cultivated people, then in a more technical paper published in the Proceedings of the Academy of Sciences of Torino [2], where he went deeper into the question. In Section 4 we turn our analysis to a paper which appeared in 1948 in Atti del Centro Studi Metodologici, a largely unknown scientific journal, where he went through the principles of the theory of elasticity and tried to state them in a rigorous way, to point out the basic ideas that were applied, consciously or unconsciously, in the treatises and papers on this subject. Indeed he realized that there was a need for a logical, rigorous mathematical formulation of the theory of elasticity, both linear and nonlinear, and he would have liked to pursue this task, but his diverging interests and his ill health stopped his researches. In this paper our goal is to present and comment on the ideas of Frola on this specific topic, which was disregarded by most of his contemporaries, but fundamental to any modern theory of elasticity, or, more generally, of continua.
2. Sketch of a Biography and Comments on Previous Works
Eugenio Frola was born in Montanaro, a small village at the foot of the Alps, near Torino, on September 28, 1906. He obtained two degrees, one in Civil Engineering from the Politecnico of Torino in 1929, and the other in Mathematics from the University of Torino in 1933. Frola was a student of G. Albenga and G. Fubini, the first working in Mechanics, the second in Analysis with a taste for the applications to elasticity and mechanics (for a brief sketch of their activities see [3]). He became Assistant to Fubini and to F.G. Tricomi, an outstanding mathematician who worked in Analysis and is famous for the PDE bearing his name (see again [3]). After a few years Frola obtained a position as Professor at the Politecnico of Torino. On February 24, 1940, he was elected Member of the Academy of Sciences of Torino, one of the oldest Italian academies, founded in 1759 by J.L. Lagrange among others. Between 1932 and 1955 Frola published 37 papers, mostly in the Proceedings of the Academy of Sciences of Torino and the Accademia dei Lincei. Thereafter he suffered from a fatal illness, which lasted several years. He turned his mind to Buddhism and translated a couple of Buddhist books, the second one being published after his death, which occurred on May 6, 1962. His severe health problems diverted him from research and he did not complete his program to build logical foundations for the theory of elasticity, which prevented him from obtaining the results and the recognition he deserved. According to the philosopher of science L. Geymonat, who edited in 1964 a book of Frola’s collected papers [4], Frola’s contributions to epistemology were highly significant, even more so than would appear from his dense but few and schematic essays on
E. FROLA: AN ATTEMPT TOWARDS AN AXIOMATIC THEORY OF ELASTICITY
foundations and methodology (see the introduction of the volume [4, pp. 7–33]). A brief analysis of the works of Frola can be found in [5]. Frola’s work is marked by some characteristic features that distinguish his approach from that of the other Italian mechanicians of the same period. In his research he always tries to generalize known or new results, relaxing some hypotheses or putting in evidence analogies with other disciplines. In the first period (1932–1940) his researches were devoted to satisfying these needs: for instance, Frola [6] tried to extend to dynamics the Colonnetti and Castigliano versions of Betti’s reciprocity theorem, even though it was D. Graffi who scored success in 1939 and 1963 (see [3, pp. 414–417]). It is worth noting that in the paper [7] published in the Acta Pontificia in 1938, the abstract is in Latin as requested: “Auctor ostendit theorema Colonnetti circa systemata elasto-plastica nihil aliud esse nisi hypotheses illa fundamentalis de deformationis omnimodae congruentia. Docet etiam rectam quamdam elementariam rationem, qua opus nos est algorithmis minimizantibus uti”. Hence we can see that the use of Latin in scientific papers did not disappear completely, even in the XXth century, and the revival of this language in a couple of papers by C. Truesdell signified the continuation of a use inherited from great scientists of the past. In the same period Frola applied to elasticity techniques of quantum mechanics [8], namely: he used the Dirac delta function and infinite dimensional vector spaces to obtain the equation of elastodynamics from a microscopical approach. In other researches he made use of linear functionals to obtain integral equations as field equations of elasticity, where the influence of Volterra, who had been in Torino from 1893 to 1900, is evident.
3. A First Step: Linear and Nonlinear Theory of Elasticity
In 1940 two papers of different style but with the same objective appeared [1, 2]. In these papers Frola explored the mathematical foundations of linear elasticity as this theory was established in his time. His purpose was to make clear which parts of linear elasticity would remain unchanged and which parts had to be modified if the nonlinear theory were to be developed. Clearly this is not a trivial problem. By now we understand that it would be much more correct to develop an exact theory first and then linearize in a mathematically rigorous way, but this was not natural at all in those years, when elasticity meant linear elasticity for almost everyone (and even nowadays one can find such an identification in textbooks of physics). The first paper was published in the first issue of the journal Il Saggiatore, the purpose of which was to disseminate the more advanced and up-to-date results in the sciences among scientists themselves, with papers that would avoid technical complications but set out the main ideas. On the editorial board there were, among others, F.G. Tricomi (whom we have already mentioned in the preceding section) and G.C. Wick, an outstanding physicist. In this paper Frola explained his ideas in an almost colloquial way and, before tackling the main question, he gave a simple sketch of what he thought were the main differences between elasticity
and plasticity, by means of simple ideal experiments as follows: if small rods of steel and lead are strained and their deformations during loading and unloading are measured, the steel rod will recover the initial configuration, while the lead rod will “remember” the deformations undergone. Here Frola uses a rigorous language: “If we want to make use of the language of modern analysis, we would say that [the displacements in the lead rod] are no more a function of the force, but a functional depending on the force, which is a function of time, a continuous functional in the sense that small variations of the law with which the force has changed in time cause small variations in the displacements” [1, p. 75]. The influence of the works of Volterra, Pincherle and Amaldi on functionals is clear, but the idea of applying such techniques to plasticity, in this context, is due to Frola. Returning to elasticity, Frola realizes that elastic bodies can be subjected to the phenomena of buckling when the load exceeds some critical values and it is not contradictory to the assumption of elastic material, but it is a nonlinear effect: “Nevertheless, even after the abrupt variation in the behavior [the buckling of an elastic bar, authors’ remark], the body continues to be elastic (no more in the linear sense) [. . .]” [1, p. 78]. In fact, Frola was interested in the problem of buckling of structures not only for its many different applications, but also for a deeper analysis of the theory of elasticity. While his contemporaries devoted their attention to many detailed and particular problems, which led to engineering and structural mechanics (see the comment by Truesdell and Noll in the final remarks below), he spent his efforts at a crucial point, which was not a feature only of buckling of rods, but a basic problem of the theory of elasticity: the correct formulation of a nonlinear theory. 
He was aware of the paper of Trefftz on buckling [9], as we will show later, but his interest was differently orientated and he was not satisfied with the way this problem had been treated. Frola concludes that, for such bodies under such circumstances, the principle of superposition of the effects is no longer valid. On the other hand, at that time the experimentalists assumed that in elastic solids this principle held and, in order to be consistent with this experimental hypothesis, the linear model must be assumed. Finally, to justify buckling effects the nonlinear model must be used, and the main assumption to be rejected is the principle of superposition of the effects, and experiments must take into account this new fact. Then Frola distilled what he took to be the essence of linear elasticity, as summarized in two axioms: 1. The principle of invariance of the response in a sequence of subbodies, each one included in the preceding one, down to a body point: in the words of Frola, “discesa dal globale al locale”, which means descent from global to local. In other words, we can claim that an elastic body consists of elastic subbodies and, in the limit, of elastic points. 2. The existence of a local elastic energy function as a positive definite quadratic form in the first derivatives of the displacement (components): “l’energia potenziale elastica locale sia in particolare una forma quadratica, definita positiva, nelle sole derivate prime dello spostamento, invariante per moti rigidi infinites-
imi [. . .]”, where the term “invariante” must be referred to the energy function (ibidem, p. 79). In fact, Axiom 2 is partially a consequence of Axiom 1: if we assume the existence of a global elastic potential energy, because of Axiom 1 we can obtain it as the “sum” of the elastic potentials of the parts which, we imagine, compose the body, in the limit even when these parts are elastic points. Then we can define a local function (the local elastic potential energy), which depends on local properties of the displacement, and obtain the total energy by integration over the body. Since Frola was mainly interested in static problems, he added the unnecessary assumption that the elastic potential be positive definite and used the virtual work principle (which is a more general assumption in mechanics) to build up an equilibrium theory for linearly elastic solids. He ends his paper with the following proposal: “Le basi e i fondamenti delle teorie non lineari, che permettono una visione più ampia dei fenomeni elastici generali, potranno formare argomento di un ulteriore esame” [The basis and the foundations of nonlinear theories, which allow a broader view of general elastic phenomena, can be in the future a subject of further investigations] [1, p. 80]. The next step appeared the same year. In a longer and more technical paper [2], Frola proves that buckling phenomena cannot be included in the linear theory of elasticity and the difficulty lies in the principle of superposition, which he calls here the first postulate. In fact, the same remark was already made in [1], but now the approach is more detailed and mathematically consistent. Again Frola begins from a critical analysis of the postulates of the “ordinaria” (in our language it means linear) theory of elasticity, remarking that the very basic postulate is the superposition principle. 
It is a phenomenological principle, from which both a global and a local theory ensue: in the first case it is expressed by means of integral field equations, and in the second case by differential equations. In a global theory this principle means that the stress–strain relations are expressed with linear functionals and that the local equations thus obtained are not necessarily linear differential equations. On the other hand, Hooke’s law is a physical assumption typical of linear theories, requiring the algebraic linearity of the stress–strain relations. But, as Frola clearly proves, the two assumptions are not equivalent: neither global linearity implies Hooke’s law, nor does the vice versa hold obviously. It follows that the first postulate can be saved, but must be generalized: a global principle of superposition is required. A second postulate is added: “I quadrati delle derivate prime degli spostamenti siano trascurabili rispetto alle derivate stesse” [the squares of the first derivatives of the displacements must be negligible with respect to the derivatives themselves] [2, p. 535]. A third postulate concerns the loads: in modern language it states that only dead loads are taken into consideration. At this point, almost incidentally, Frola remarks that, if we admit the existence of a strain energy function, the matrix of the coefficients in Hooke’s law must be symmetric and vice versa. This is a
consequence of Betti’s reciprocal theorem, as was clearly proved by C. Truesdell [10] in the more general context of finite elasticity. (Truesdell proved that Betti’s reciprocal theorem is a necessary and sufficient condition for an elastic material to be hyperelastic.) Finally, Frola rephrases the axioms listed in the previous paper, which are necessary to obtain a linear theory. In modern language they can be summarized in the following way: (i) the body is linearly hyperelastic, (ii) the displacement gradient is “small”, and (iii) the external loads are dead loads. Returning to the problem of buckling of elastic structures, Frola proves that the first and third postulates are not affected by this anomalous behavior, while the second postulate fails: experience shows, for instance, in the well-known case of the elastica, that we can have small deformations but large first derivatives. Hence Frola proposes to modify the strain–displacement relations as follows:
\[
\varepsilon_{xx} = e_{xx} + \tfrac{1}{2} e_{xx}^2 + \tfrac{1}{8} e_{xy}^2 + \tfrac{1}{8} e_{zx}^2 + \tfrac{1}{2} e_{xy}\, r - \tfrac{1}{2} e_{zx}\, q + \tfrac{1}{2} q^2 + \tfrac{1}{2} r^2, \tag{1}
\]
and analogous expressions for the other components of the strain, where $e_{xx}, e_{yy}, e_{zz}, e_{xy}, e_{yz}, e_{zx}$ are the classical linear strain components in a Cartesian coordinate system $(x, y, z)$, $(u, v, w)$ the displacement components and $(p, q, r)$ the components of the local rotation. In fact, according to Frola’s notations, the $\varepsilon_{xx}, \varepsilon_{yy}, \ldots$ are the components of what we now call the Lagrangian strain tensor and the $e_{xx}, e_{xy}, \ldots, p, q, r$ can be written in terms of the displacement gradient $\mathbf{H}$, namely as: $e_{xx} = E_{xx}$, $e_{xy} = 2E_{xy}, \ldots$, $p = -W_{yx}$, where $E_{xx}, \ldots, W_{xy}, \ldots$ are the components of $\mathbf{E} = \mathrm{Sym}\,\mathbf{H}$, $\mathbf{W} = \mathrm{Skw}\,\mathbf{H}$. Assuming that the squares of the linear strain components and the mixed terms are negligible, while the squares of the rotations are admitted, one can obtain the nonlinear relations:
\[
\varepsilon_{xx} = e_{xx} + \tfrac{1}{2}(q^2 + r^2), \qquad \varepsilon_{yz} = e_{yz} - qr, \quad \ldots. \tag{2}
\]
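The algebra behind (1) and (2) can be checked numerically. The sketch below is not taken from Frola's paper: it assumes that the shears are the engineering shears (for example $e_{xy} = \partial u/\partial y + \partial v/\partial x$, in line with $e_{xy} = 2E_{xy}$) and that the rotations are half the curl components, $q = \tfrac{1}{2}(\partial u/\partial z - \partial w/\partial x)$ and $r = \tfrac{1}{2}(\partial v/\partial x - \partial u/\partial y)$. Under those assumptions, expression (1) should reproduce the exact Lagrangian component $\partial u/\partial x + \tfrac{1}{2}[(\partial u/\partial x)^2 + (\partial v/\partial x)^2 + (\partial w/\partial x)^2]$, while (2) should differ from it only by higher-order terms when the strains are small and the rotations moderate. The function names and sample values are illustrative.

```python
def displacement_gradient(S, p, q, r):
    """Build H = S + W from a symmetric part S (3x3 nested list) and rotations.

    Assumed convention: H[i][j] = d(u_i)/d(x_j), with
    p = (w_y - v_z)/2, q = (u_z - w_x)/2, r = (v_x - u_y)/2.
    """
    W = [[0.0, -r, q],
         [r, 0.0, -p],
         [-q, p, 0.0]]
    return [[S[i][j] + W[i][j] for j in range(3)] for i in range(3)]


def strain_xx(H):
    """Return (exact, eq1, eq2) for the xx component of the strain."""
    (ux, uy, uz), (vx, vy, vz), (wx, wy, wz) = H
    # Linear strains (engineering shears) and local rotations.
    exx = ux
    exy = uy + vx
    ezx = uz + wx
    q = 0.5 * (uz - wx)
    r = 0.5 * (vx - uy)
    # Exact Lagrangian (Green) strain component.
    exact = ux + 0.5 * (ux**2 + vx**2 + wx**2)
    # Expression (1): the same quantity written with strains and rotations.
    eq1 = (exx + exx**2 / 2 + exy**2 / 8 + ezx**2 / 8
           + exy * r / 2 - ezx * q / 2 + q**2 / 2 + r**2 / 2)
    # Expression (2): only the squares of the rotations are retained.
    eq2 = exx + (q**2 + r**2) / 2
    return exact, eq1, eq2


# Sample state: strains of order 1e-4, rotations of order 1e-2.
S = [[1e-4, 1e-4, -5e-5],
     [1e-4, -3e-4, 2.5e-5],
     [-5e-5, 2.5e-5, 2e-4]]
H = displacement_gradient(S, p=0.01, q=-0.02, r=0.015)
exact, eq1, eq2 = strain_xx(H)
print("error of (1):", abs(eq1 - exact))
print("error of (2):", abs(eq2 - exact))
```

In this sample the error of (1) is at rounding level, while the error of (2) is dominated by the mixed products of a strain and a rotation, which is what the moderate-rotation ordering leads one to expect.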
By using the first axiom the local elastic potential energy density takes the form:
\[
W = W_0 + (\lambda + \mu)\,\theta\,(p^2 + q^2 + r^2) + \tfrac{1}{2}(\lambda + \mu)(p^2 + q^2 + r^2)^2 - \mu\,[e_{xx} p^2 + e_{yy} q^2 + e_{zz} r^2 + e_{xy}\, pq + e_{yz}\, qr + e_{zx}\, pr], \tag{3}
\]
where W0 is the quadratic elastic strain energy density used in the linear theory and θ = (exx + eyy + ezz ) the ordinary cubic dilatation. In a few final lines Frola states a variational principle by means of the virtual work theorem and refers to a forthcoming paper with applications and local field equations given explicitly. But the paper never appeared. Let us remark that this critical review of the foundations of elasticity was carried out by Frola in a totally independent way. Analogous researches were carried out at the same time only in the former Soviet Union and were revealed to western scientists through the treatise by Valentin Valentinovich Novozhilov (1910–1987)
when its English version appeared in 1953 [11]. The subject was then definitively cleared up in a general context by Truesdell and Toupin [12]. The different “weights” attributed to displacement gradients and local rotations became a common feature after the works of Naghdi and others [13, 14], where the concepts of small displacements and moderate rotations were introduced. This modern terminology was not used in [2], but Frola explicitly said that the deformation components $e_{xx}, e_{yy}, e_{zz}$ are of the first order and the squares of the local rotation components $p^2, q^2, r^2$ must be of the same order. In other words, we do not deal with infinitesimal deformations and finite rotations, as misunderstood by many people, but with small rotations, of a different order of infinitesimality with respect to the deformations. The definition of Frola in [2] fits exactly with the definition given by Naghdi and Vongsarnpigoon [14, Section 4.3, p. 282]: “Given [a small strain] $\mathbf{E} = O(\varepsilon_0)$, a proper orthogonal tensor $\mathbf{R}$ is said to be a moderate rotation with respect to $\varepsilon_0$ if for any unit vector $\mathbf{v}$, the vector $\boldsymbol{\beta}$ defined in (4.29) [$\boldsymbol{\beta} = \mathbf{R}\mathbf{v} - \mathbf{v}$] satisfies: $\boldsymbol{\beta} = O(\varepsilon_0^{1/2})$ as $\varepsilon_0 \to 0$” [14, p. 282]. Only one person noticed this new idea of Frola. The reason is that the nonlinear theory of elasticity was introduced independently in the 1930’s by A. Signorini in Italy and by F.D. Murnaghan in the USA. The approach of Signorini (and Murnaghan, but he was not known in Italy) is completely different from Frola’s. Signorini did not pay much attention to the logical foundations of his theory. He put his efforts into developing techniques useful in solving some problems he had in mind, which could not be treated by the linear model. Unfortunately Signorini chose the “dreadful notation” [C. Truesdell, verbatim] of the homographies, imposed in Italy by C. Burali-Forti and T. Boggio, instead of tensor calculus.
This awkward formalism and the fact that the papers were written in Italian made the work of Signorini almost unreadable outside Italy. Only C. Truesdell took the trouble to scrutinize it thoroughly, so that many of Signorini’s results, clarified and cleansed, appeared in the treatise of Truesdell and Toupin [12], with the credits he deserved. The only one who noticed Frola’s results was Placido Cicala (1910–1996), a professor of civil engineering at the Politecnico of Torino. Cicala had just written a paper [15] in which he proposed to modify the classical equations of equilibrium of linear elasticity in order to allow critical and postcritical behavior in loaded structures, i.e., again the buckling problem. As pointed out by Frola [2, p. 532], this approach can lead to a family of contradictory theories “in quanto partenti da risultati acquisiti dalla teoria ordinaria dell’elasticitá che deve d’altra parte nel corso della ricerca essere negata nei suoi postulati” [because it is based on results obtained from the linear theory of elasticity, which, on the other hand, must be denied in its postulates during the same research]. Frola means that authors like Cicala modified the linear theory by adding ad hoc terms in the strain–displacement relations and in the constitutive relations relative to some specific problem, and in this way they contradicted, without noticing it, the postulates of linear elasticity they were still using. Cicala [16], in a one-page footnote of a paper devoted to the nonlinear theory, rejected the above-mentioned critical remark of Frola, and tried
to prove with a simple numerical example that Frola’s deductions were wrong. Incidentally, Cicala provided evidence that Frola was aware of the paper of Trefftz [9]. In a footnote on page 95, Cicala remarks: “(1) . . . This note represents an appreciable development of the theory whose foundations are stated by Trefftz in his master paper . . .” and, on page 97, we find another footnote: “This [variational] procedure is used by Trefftz in the paper quoted above. Frola in his future notes, will adopt, as he announces [annuncia], such method”. The Italian verb “annuncia” used in this remark means that probably Frola mentioned this paper to Cicala in some private discussion, because he never wrote such a sentence. Cicala deals with a rigid rotation around the z-axis of angle α, which produces the displacement:
\[
u = x(\cos\alpha - 1) - y\sin\alpha, \qquad v = y(\cos\alpha - 1) + x\sin\alpha, \qquad w = 0. \tag{4}
\]
Assuming the approximations required by Frola, Cicala obtains the explicit form for the deformation component $\varepsilon_{xx}$:
\[
\varepsilon_{xx} = \frac{\partial u}{\partial x} + \frac{1}{8}\left(\frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}\right)^2 + \frac{1}{8}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)^2, \tag{5}
\]
which is equal to the expression (2). By substitution of (4) in (5) Cicala finds the following value for the first deformation component:
\[
\varepsilon_{xx} = -\tfrac{1}{2}(\cos\alpha - 1)^2 \tag{6}
\]
and since “the rotation α can be any, it can even result = 2” (indeed, as Frola will remark: = −2). And this is in contradiction with the hypothesis of small deformations. In fact, as Frola himself wrote in [17], Cicala confused “infinitesimal” and “small” quantities: “the theory [. . .] is obviously a limit theory which intends to represent phenomena not encompassed in the classical [= linear] theory, but it does not pretend to be either a theory of finite displacements or to interpret any elastic phenomenon”. Frola tried to explain that, when we deal with “infinitesimal” quantities, they need not be of the same order, and, in any case, the example is misleading. The same rotation inserted in the formula (3) of Cicala’s paper will produce the strain $e_{xx} = \cos\alpha - 1$ and Cicala claims that this result is correct, because it is of the same order as the deformation. It is easy for Frola to argue that $e_{xx}$ is of order $\alpha^2$, while $\varepsilon_{xx}$ is of order $\alpha^4$; hence the rotations are of a different order of “smallness” with respect to the pure deformations, as postulated. Finally, one must not confuse “small” with “finite”: “the theory, . . . , does not pretend to be a theory of finite displacements . . . . All the first derivatives of the displacement components must be considered small, . . . not necessarily of the
first order” [17, p. 259]. Moreover the criticism of Cicala does not apply to this model “because it acts on a field (finite displacements) which is not included in my theory” (ibidem, p. 260). It has already been remarked that Frola’s point of view is consistent with the theory of small deformations and moderate rotations as developed by Naghdi and coworkers [13, 14], who surely did not know the papers of Frola. Then, having been provoked, Frola proved that the counterexample produced by Cicala could be easily adapted to the classical linear elasticity; moreover, he claimed that his own theory was not rigorous, but sufficient to interpret some physical facts, and that he was not dealing with a finite displacement theory. Moreover, Frola showed that in Cicala’s paper [15] there were some “weak points”, not to say mistakes and “inesattezze alquanto pericolose” [quite dangerous inaccuracies], as, for example, the confusion of the modulus of a derivative with the derivative of the modulus, the use of wrong rules of approximation, and finally, the statement that “if $a$ is very small, $\sqrt[8]{a}$ is very small too” [15, p. 213], which is misleading. In fact, “if $a$ goes to zero, so does $\sqrt[8]{a}$, but if $a = 10^{-4}$ (really small), $\sqrt[8]{a} = 10^{-1/2} \approx 0.315$, which I would not say to be very small” [17, p. 262]. Cicala did not reply and the debate passed unnoticed in the Italian scientific community.
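The two numerical points of this exchange are easy to reproduce. The sketch below is an illustration, not code or notation from either author: it substitutes the rigid rotation (4) into Cicala's expression (5) and evaluates the eighth root to which Frola objects. The function names are hypothetical. For the rotation, the substitution gives $\varepsilon_{xx} = \cos\alpha - 1 + \tfrac{1}{2}\sin^2\alpha = -\tfrac{1}{2}(\cos\alpha - 1)^2$, which equals $-2$ at $\alpha = \pi$, the value Frola quotes, and behaves like $-\alpha^4/8$ for small $\alpha$, whereas the linear strain $e_{xx} = \cos\alpha - 1$ behaves like $-\alpha^2/2$.

```python
import math


def linear_strain_xx(alpha):
    """Linear strain e_xx = du/dx for the rigid rotation (4)."""
    return math.cos(alpha) - 1.0


def moderate_rotation_strain_xx(alpha):
    """Expression (5) evaluated on the displacement field (4)."""
    du_dx = math.cos(alpha) - 1.0
    du_dy = -math.sin(alpha)
    dv_dx = math.sin(alpha)
    du_dz = 0.0  # the field (4) does not depend on z
    dw_dx = 0.0  # w vanishes identically
    return du_dx + (du_dz - dw_dx) ** 2 / 8 + (dv_dx - du_dy) ** 2 / 8


def eighth_root(a):
    """Eighth root of a non-negative number."""
    return a ** 0.125


# Orders of magnitude for a small and for a half-turn rotation.
for alpha in (0.01, 0.1, math.pi):
    print(alpha, linear_strain_xx(alpha), moderate_rotation_strain_xx(alpha))

# Frola's remark on "very small" quantities.
print(eighth_root(1e-4))
```

For $a = 10^{-4}$ the eighth root evaluates to about 0.316, in line with the figure Frola gives, so a quantity can be "really small" while its eighth root is not.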
4. The Methodology of Mathematics and the Last Paper (1948)
At the end of World War II, Frola turned his mind to new perspectives: he was mainly, if not exclusively, interested in problems related to the foundations and methodology of mathematics. This change must surely be connected with the establishment of the “Centro Studi Metodologici di Torino”, whose founders were prominent scientists working at the University and at the Politecnico of Torino: mathematicians, physicists, chemists, economists, etc. It was a smaller version of the Vienna Circle: a private association of scholars, linked by a common interest in logical foundations of science. Frola was very active, from the beginning, even if he was interested almost only in mathematics and its applications and did not seem to be much influenced by the Austrian neopositivism. On the history of the Centro the reader is referred to a very complete article by Giacardi and Roero [19]. Included in the above-mentioned book [4] (edited by Geymonat and published in 1964) are Frola’s main papers of this period, all concerned with mathematics and its relationship with other sciences, including engineering, physics and economics. This last subject is discussed in detail in his most celebrated methodological paper, written with the economist Leoni [20]. Here Frola strongly denies that it would be possible to apply mathematics to economics in the same way that it is applied to physics. He shows that other mathematical techniques are needed, most of them not even known to the economists of that period. Here we have no intention of going into the details of this paper, which contains many interesting critical comments and suggestions, but it is worth noting that at the beginning criticism is levelled at those who blindly use mathematics without caution: “Writing down a system of
mathematical relations is useless [. . .] and meaningless, if theorems of existence of the solutions are not proved and techniques of approximations are not determinated [. . .]. This solution will be considered satisfactory, and the system of mathematical relations efficient, if it [. . .] will provide us a good accuracy” [20, p. 88]. At the end of the same paper Frola attacks again the improper use of mathematics and quotes a book by P.A. Samuelson, Foundations of Economic Analysis (Cambridge, 1948), where, on page 10, a system of n equations involving some not well defined “functional relationships” in n unknowns and m parameters is considered. Samuelson deduces that any of the unknowns can be expressed as a function of the parameters. After some sharp considerations, Frola concludes: “[. . .] we cannot but regard this approach as a free and easy use of mathematics. One could find countless examples of such uninhibited use of mathematics in economics, which are not considered here [. . .]” [20, p. 109]. Pursuing his new interest in the philosophy of mathematics, in 1948 Frola wrote [18] a paper on the logical foundations of the theory of elasticity, where he does not enter into the details of an axiomatic structure, but simply tries to fix the fundamental concepts of the theory. It is sort of a first draught, which was never followed by a more complete work. Its relevance is due to the fact that almost certainly it is the first attempt, together with the papers of 1940–1942, to formalize some basic axioms of the theory of elasticity. Frola makes a distinction amongst the characteristics of the axioms, dividing them into three families. A similar distinction in different groups of hypotheses can be found in the general and classical works of W. Noll and C. Truesdell, even if, obviously, the classification of Frola is simpler, more schematic, and even naive. In Frola’s scheme, there are three groups of axioms, each one containing three hypotheses. 
The first group contains “geometrical” assumptions:
1. An elastic body is a domain in a Euclidean 3-dimensional space (what we can call a “placement” of the body).
2. A vector field, representing the deformations, is defined over this domain.
3. This vector field must satisfy suitable smoothness conditions.
The second group contains the “mechanical” assumptions:
4. The loads are described by vectors, with the same properties of forces in rational mechanics.
5. “On the existence of tensions”: internal tractions as surface forces are defined by the assumption that a surface force acts on the boundary of any subbody and represents the action of the remaining part of the body over the given subbody.
6. The nature of loads and tensions must be defined in a precise, mathematical way, including hypotheses analogous to those required in point 3.
It is clear that such a scheme presumes a more general theoretical framework, where the forces (i.e., volume and surface forces) must be introduced according
to some general theory of rational mechanics. The third group is called “physical hypotheses” and concerns more specifically linear elasticity:
7. Linearity: there holds the superposition principle, as introduced in [1].
8. The displacements are small, in the usual sense.
9. “[D]iscesa all’infinitesimo”: all the preceding assumptions are valid for any subbody of the body, down to the smallest part of the body, namely a point, taken as the limit of a sequence of shrinking domains ordered by inclusion.
This final assumption, of a nature analogous to assumption 5, neither has the support of experimental evidence nor represents any physical experience, but is the result of a “metaphysical attitude” necessary to apply the macroscopic continuous model, which allows the derivation of differential equations as field equations. In his final remark, Frola expresses his dissatisfaction with this inadequate construction. He realizes that it is not the only possible approach, and the mathematical language used here is not even the best mathematical language one can imagine. Surely there are gaps to be filled, and he hints vaguely that he would continue his researches in this direction. But Frola soon turned his mind away from elasticity and mathematics. He wrote a few papers about methodology of mathematics, including the aforementioned one on economics, but he never continued his interesting program on axiomatization of the theory of elasticity and became wholly immersed in his studies on Buddhism. In conclusion Frola’s ideas, even if his construction has not been completed, remain exemplary for their consistency and coherence, and they stimulate further work to search for rigorous logical foundations of theories, for physically meaningful hypotheses, and for correct applications in different fields. The last lines of this paper [18, p.
14] are a sketch of a program: “Rimane quindi allettante, e dal punto di vista critico e da quello applicativo della scienza delle costruzioni, il tentare altre vie per la costruzione di nuovi complessi teorici destinati a descrivere scientificamente il fenomeno elastico”. [It remains hence appealing, both from critical and applicative points of view in materials science, to try different ways to build up new theoretical structures apt to describe scientifically the elastic phenomenon.] Frola failed, but this task would be successfully fulfilled a few years later by W. Noll, C. Truesdell and others, in complete generality, not only for elastic material, but also for a broad class of materials, within the framework of Rational Continuum Mechanics.
5. Final Remarks
Even if Frola did not obtain the results he pursued, he felt the need of constitutive equations for the nonlinear theory of elasticity, and by extension for other
theories based on a mathematical model. In this sense he realized beforehand the significance of one of the basic points in Rational Continuum Mechanics. The constitutive equations, correctly formulated, are fundamental in providing a rigorous framework and the logical foundations of a physical-mathematical theory. They are based on different sets of axioms, as Frola explicitly said, and his classification is very similar, even if much sketchier, to the classification used nowadays. The historical impact of Frola’s papers is slight, because they were few, incomplete and unknown, but it is surprising that he, isolated and pathetic figure of a scientist, working outside the main stream of interest of his time, tried to find the logical structure of the theory of elasticity, both linear and nonlinear. The main interests in Italian mechanics between the 1930’s and the 1950’s were surely not to build up general theories of continua, but to solve many particular problems, applying analytical tools, often complicated and sophisticated, with the goal of proving existence, uniqueness, and regularity of the solutions of various differential equations. In other countries different models were developed, mainly linear, to study phenomena such as plasticity, dynamics of fluids, structural mechanics, etc., and again many particular problems were solved, many results were found, sometimes interesting, but “most of which have later turned out to be unnecessary in the cases they are justified. Knowledge of the true principles of the general theory seems to have diminished except in Italy, where it was kept alive by the teaching and writing of Signorini” [21, p. 9]. Nevertheless the relevant work done by Signorini, one of the pioneers, with Murnaghan, on the nonlinear theory of elasticity, was ignored until C. Truesdell read his difficult papers (written in Italian) and noticed their significance. Frola did not follow the lead of Signorini. 
His approach was completely different: his goal was to find the logical structure at the basis of a theory in applied mathematics and, even if he solved some particular problems in elasticity, he was aware that the theory of elasticity, as well as other theories, had weak foundations and it was necessary to reconstruct them firmly. There are no important theorems in his papers, but we would like to point out that Frola tried to go deeper into the logical foundations of linear and nonlinear elasticity, while his contemporaries were not interested in it. He was not able to set the theory of elasticity on a satisfactory basis, but we must recognize that his was one of the very first attempts in this direction.
Acknowledgements

The first author was supported by the project MIUR-COFIN 2000 “Storia delle scienze matematiche”. The second author was supported by the project MIUR-COFIN 2002 “Mathematical Models for Material Science” and, partially, by GNFM-CNR. The authors are indebted to Chi-Sing Man for his careful reading of the manuscript and helpful suggestions, which greatly improved the content of this paper.
References

1. E. Frola, La teoria dell’elasticitá. Il Saggiatore 1 (1940) 74–80.
2. E. Frola, Sull’elasticitá non globalmente lineare. Principi e fondamenti delle teorie. Atti Accad. Sci. Torino 75 (1940) 531–540.
3. F. Pastrone, Fisica matematica e meccanica razionale. In: S. Di Sieno, A. Guerraggio and P. Nastasi (eds), La Matematica Italiana dopo l’Unitá. Gli Anni tra le Due Guerre Mondiali. Marcos y Marcos, Milano (1998) pp. 381–504.
4. E. Frola, Scritti Metodologici, L. Geymonat (ed.). Giappichelli, Torino (1964).
5. L. Geymonat, Eugenio Frola. Atti Accad. Sci. Torino 151 (1961/62) 986–997.
6. E. Frola, Su di una generalizzazione dinamica del teorema di Betti diversa da quella di Lord Rayleigh. Rend. Accad. Lincei s. VI XXV (1937) 586–589.
7. E. Frola, Intorno al teorema di Colonnetti sui sistemi elasto-plastici. Acta Pont. Acad. Sci. 2(7) (1938) 61–71.
8. E. Frola, Il problema di Cauchy in grande e le equazioni alle derivate parziali lineari a coefficienti costanti. Rend. Accad. Lincei s. VI XXVII (1938) 518–524.
9. E. Trefftz, Ueber die Ableitung der Stabilitätskriterien des elastischen Gleichgewichtes aus der Elastizitätstheorie endlicher Deformationen. In: Verh. 3. Internat. Kongr. Techn. Mech. 3 (1931) 44–50.
10. C.A. Truesdell, The meaning of Betti’s reciprocal theorem. J. Research of N.B.S. 67B (1963) 85–86.
11. V.V. Novozhilov, Foundations of the Nonlinear Theory of Elasticity. Gostekhizdat, Moscow (1948) (English transl. by F. Bagemihl, H. Komm and W. Seidel, Graylock, Rochester, NY (1953)).
12. C.A. Truesdell and R.A. Toupin, The classical field theories. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/1. Springer, Berlin (1960) pp. 226–793.
13. J. Casey and P.M. Naghdi, Physically nonlinear and related approximate theories of elasticity, and their invariance properties. Arch. Rational Mech. Anal. 76 (1981) 355–390.
14. P.M. Naghdi and L. Vongsarnpigoon, Small strain accompanied by moderate rotation. Arch. Rational Mech. Anal. 80 (1982) 263–294.
15. P. Cicala, Sulla stabilitá dell’equilibrio elastico. Atti Accad. Sci. Torino 75 (1940) 185–222.
16. P. Cicala, Sulla teoria non lineare dell’elasticitá. Atti Accad. Sci. Torino 76 (1940) 94–104.
17. E. Frola, Su alcune questioni di elasticitá non lineare. Atti Accad. Sci. Torino 77 (1942) 258–262.
18. E. Frola, Sui fondamenti logici della teoria dell’elasticitá. Atti del Centro di Studi Metodologici I (1948) 12–14.
19. L. Giacardi and C.S. Roero, L’ereditá del Centro di Studi Metodologici di Torino. In: Quaderni di Storia dell’Universitá di Torino, Vol. 2. (1998) pp. 289–356.
20. E. Frola and B. Leoni, Possibilitá di applicazione delle matematiche alle discipline economiche. Il Politico 20 (1955) 190–210.
21. C. Truesdell and W. Noll, The nonlinear field theories of mechanics. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin (1965) pp. 1–602.
Symmetries and Hamiltonian Formalism for Complex Materials

GIANFRANCO CAPRIZ (1) and PAOLO MARIA MARIANO (2)

(1) Dipartimento di Matematica, Università di Pisa, via Buonarroti 2, I-56127 Pisa, Italy. E-mail: [email protected]
(2) Dipartimento di Ingegneria Strutturale e Geotecnica, Università di Roma “La Sapienza”, via Eudossiana 18, I-00184 Roma, Italy. E-mail: [email protected]

Received 25 March 2003; in revised form 6 October 2003

Abstract. Preliminary results toward the analysis of the Hamiltonian structure of multifield theories describing complex materials are reported: we invoke the invariance under the action of a general Lie group of the balance of substructural interactions. Poisson brackets are also introduced in the material representation to account for general material substructures. A Hamilton–Jacobi equation suitable for multifield models is presented. Finally, a spatial version of all these topics is discussed without making use of the notion of paragon setting.

Mathematics Subject Classifications (2000): 74A30, 74A35, 74B20.

Key words: complex materials, microstructures, elasticity, multifield theories.
In memory of Clifford Ambrose Truesdell, our teacher and mentor
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 127–140. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Lagrangian and Hamiltonian Descriptions of Elastic Bodies with Substructure

In standard continuum mechanics, each material element of a body is “collapsed” into the place occupied by its centre of mass; let $X$ be that place in the reference placement; the set of all $X$ is taken to be a fit region $B_0$ of the three-dimensional Euclidean space $E$. Sometimes such a simplistic model of physical reality is insufficient; then, to render the picture adequate, the material element must be portrayed as a system, and at least some coarse-grained descriptor $\nu$ (an order parameter) enters the picture. Here, as in [1] (see [2–5] for other details and additional results), we take $\nu$ as an element of some differentiable manifold $M$, and presume that physical circumstances impose a single choice of metric and of connection for $M$. We also assume that the region occupied by the body in the current placement is obtained through a sufficiently smooth mapping $\tilde{x} : B_0 \to E$, so that the current place of a material element at $X$ in $B_0$ is given by $x = \tilde{x}(X)$, and $B = \tilde{x}(B_0)$ is also fit. We denote as usual with $F$ the placement gradient. We presume also that another sufficiently smooth mapping $\tilde{\nu} : B_0 \to M$ shows the value of the order parameter at $X$, namely $\nu = \tilde{\nu}(X)$.

A motion is a pair of time-parametrized families $\tilde{x}_t(X) = \tilde{x}(X, t)$ and $\tilde{\nu}_t(X) = \tilde{\nu}(X, t)$, twice differentiable with respect to time. Rates in the material representation are indicated with $\dot{x}(X, t)$ and $\dot{\nu}(X, t)$, and we will write $\dot{x}$ and $\dot{\nu}$ for brevity. We restrict here our attention to bodies for which a Lagrangian density $L$ exists, so that the total Lagrangian $\mathcal{L}_{B_0}$ of the body is given by

$$\mathcal{L}_{B_0} = \int_{B_0} L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu)\, d(\mathrm{vol}), \tag{1}$$
the gradient $\nabla\nu$ being based on the mandatory connection. We presume that $L$ be of the form

$$L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu) = \tfrac{1}{2}\rho_0 |\dot{x}|^2 + \rho_0\,\chi(\nu, \dot{\nu}) - \rho_0\, e(X, F, \nu, \nabla\nu) - \rho_0\, w(x, \nu), \tag{2}$$
where $\rho_0$ is the referential mass density (conserved during the motion), $\chi$ the kinetic co-energy (see [1, p. 19]) associated with the substructure, $e$ the elastic energy density and $w$ the density of the potential of external actions, all per unit mass. Below we use the notation $b = -\partial_x w$ for the density of standard external actions and $\beta = -\partial_\nu w$ for the substructural ones. The kinetic energy density $\rho_0\kappa(\nu, \dot{\nu})$ pertaining to the substructure is the partial Legendre transform of $\chi$ with respect to $\dot{\nu}$. If $L$ is sufficiently smooth, we may apply standard procedures to derive Euler–Lagrange equations for the functional $\mathcal{L}_{B_0}$:

$$\dot{\overline{\partial_{\dot{x}} L}} = \partial_x L - \mathrm{Div}\,\partial_F L, \tag{3}$$
$$\dot{\overline{\partial_{\dot{\nu}} L}} = \partial_\nu L - \mathrm{Div}\,\partial_{\nabla\nu} L, \tag{4}$$

where Div is the divergence calculated with respect to $X$, i.e. $\mathrm{Div} = \mathrm{tr}\,\nabla$. Put

$$H = \dot{x}\cdot\partial_{\dot{x}} L + \dot{\nu}\cdot\partial_{\dot{\nu}} L - L. \tag{5}$$
The pair $(\tilde{\nu}, \nabla\tilde{\nu})$ collects the peculiar elements of the tangent mapping $T\tilde{\nu} : B_0 \times \mathrm{Vec} \to TM$, where Vec is the translation space over $E$, and we identify $B_0 \times \mathrm{Vec}$ with $TB_0$. More specifically, we have $T\tilde{\nu}(X) : T_X B_0 \to T_\nu M$. Such elements cannot be separated invariantly unless $M$ is endowed with a parallelism (and one wants also to have a physically significant parallelism). A similar remark holds also for each element $(\nu, \dot{\nu})$ of $TM$. The Lagrangian density is then defined on $B_0 \times E \times \mathrm{Vec} \times \mathrm{Hom}(\mathrm{Vec}, \mathrm{Vec}) \times TM \times \mathrm{Hom}(TB_0, TM)$ (satisfying some compatibility conditions of possibly various nature, depending on the substructure), with $\mathrm{Hom}(A, B)$ the set of linear transformations between $A$ and $B$.
Clearly, $H$ is the density of the total energy. In fact, since $\partial_{\dot{\nu}} L = \rho_0\,\partial_{\dot{\nu}}\chi$, the term $\dot{\nu}\cdot\partial_{\dot{\nu}}\chi - \chi$ in (5) coincides with the substructural kinetic energy density $\kappa(\nu, \dot{\nu})$ (hence the presence of $\chi$ rather than $\kappa$ in the expression of $L$), so that

$$H = \tfrac{1}{2}\rho_0 |\dot{x}|^2 + \rho_0\,\kappa(\nu, \dot{\nu}) + \rho_0\, e(X, F, \nu, \nabla\nu) + \rho_0\, w(x, \nu), \tag{6}$$

as asserted. The balance of energy can be expressed in terms of $H$ as follows:

$$\dot{H} - \mathrm{Div}(\dot{x}P + \dot{\nu}S) = 0, \tag{7}$$

where $P$ and $S$ are respectively the Piola–Kirchhoff stress and the referential microstress

$$P = -\partial_F L, \qquad S = -\partial_{\nabla\nu} L. \tag{8}$$
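The direct computation behind (7) can be sketched as follows. This is only an outline under stated assumptions: $\nu$ is handled in local coordinates on $M$, $L$ has no explicit time dependence, and $\dot{x}P$, $\dot{\nu}S$ in (7) are read as $P^{T}\dot{x}$, $S^{T}\dot{\nu}$. The steps use $\dot{F} = \nabla\dot{x}$, the Euler–Lagrange equations (3), (4) and the definitions (8).

```latex
% Sketch of the computation leading to (7); assumptions as stated in the lead-in.
\begin{align*}
\dot{H}
 &= \dot{\overline{\dot{x}\cdot\partial_{\dot{x}}L + \dot{\nu}\cdot\partial_{\dot{\nu}}L}} - \dot{L} \\
 &= \dot{x}\cdot\bigl(\dot{\overline{\partial_{\dot{x}}L}} - \partial_x L\bigr)
  + \dot{\nu}\cdot\bigl(\dot{\overline{\partial_{\dot{\nu}}L}} - \partial_\nu L\bigr)
  - \partial_F L\cdot\nabla\dot{x} - \partial_{\nabla\nu}L\cdot\nabla\dot{\nu} \\
 &= -\,\dot{x}\cdot\operatorname{Div}\partial_F L - \partial_F L\cdot\nabla\dot{x}
    -\,\dot{\nu}\cdot\operatorname{Div}\partial_{\nabla\nu}L - \partial_{\nabla\nu}L\cdot\nabla\dot{\nu} \\
 &= -\operatorname{Div}\bigl((\partial_F L)^{T}\dot{x} + (\partial_{\nabla\nu}L)^{T}\dot{\nu}\bigr)
  = \operatorname{Div}\bigl(P^{T}\dot{x} + S^{T}\dot{\nu}\bigr).
\end{align*}
```

The second line cancels the terms $\partial_{\dot{x}}L\cdot\ddot{x}$ and $\partial_{\dot{\nu}}L\cdot\ddot{\nu}$ against the corresponding terms of $\dot{L}$; the third line is where (3) and (4) enter.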
That (7) is true follows from direct computation. Notice that, in view of our hypothesis on the existence of a unique, physically significant connection for $M$, concrete meaning can be assured for the microtractions $Sn$, which represent interactions between neighboring material elements. As already remarked on various occasions in [1] (see, e.g., pp. 26, 27), in general a properly covariant separation between ‘self-actions’ ($-\partial_\nu L$) and ‘microstresses’ ($-\partial_{\nabla\nu} L$) does not apply.

Equations (3) and (4) lead us to an appropriate version of Noether’s theorem (see [4, p. 29]); here we follow the program of [6, p. 284]. We consider some virtual motion of our system, by assigning two one-parameter families $f^{i}_{s_i}$, $i = 1, 2$, of sufficiently smooth point-valued diffeomorphisms, acting respectively on $B_0$ and $E$, and a Lie group $G$ of transformations of $M$. We indicate with a prime the derivative with respect to the relevant $s$.

1. At each $s_1$, $f^{1}_{s_1}$ acts on $B_0$ so that $X \mapsto f^{1}_{s_1}(X) \in E$, and is isochoric (no virtual change of density), i.e., $\mathrm{Div}\, f^{1\prime}_{s_1} = 0$; $f^{1}_{0}$ is the identity. We put $f^{1\prime}_{0}(X) = w$.
2. At each $s_2$, $f^{2}_{s_2}$ is a diffeomorphism that transforms $E$ into itself. We assume that $f^{2}_{0}$ is the identity and put $f^{2\prime}_{0}(x) = v$.
3. A Lie group $G$, containing $SO(3)$, acts on $M$. Let $\xi$ be an element of the Lie algebra of $G$. The infinitesimal generator of its action on $\nu \in M$ is indicated with $\xi_M(\nu)$ (see [7, p. 256]); $\nu_g$ is the value of $\nu$ after the action of $g \in G$. If we consider a one-parameter trajectory $s_3 \mapsto g_{s_3} \in G$ such that $g_0$ is the identity, we have also $s_3 \mapsto \nu_{g_{s_3}}$ and $\xi_M(\nu) = (d/ds_3)\,\nu_{g_{s_3}}|_{s_3=0}$. When $G$ coincides with the special orthogonal group $SO(3)$, we identify $\xi_M(\nu)$ with $A\dot{q}$, where $\dot{q}$ is the characteristic vector of a rotational rigid velocity and $A$ a linear operator mapping vectors into elements of the tangent space of $M$; namely, if $\nu_q$ is the value of the order parameter measured by an observer after a rotation $q$, then $A = (d\nu_q/dq)|_{q=0}$.

We do not identify a priori the substructural kinetic co-energy $\chi$ with its Legendre transform $\kappa$ (thus assuming for both the traditional quadratic form), to encompass cases in which such a distinction is necessary to capture prominent physical phenomena. In fact, leaving exotic cases aside for a while, even when $\kappa(\nu, \dot{\nu})$ is a quadratic form in $\dot{\nu}$ with a $\nu$-dependent coefficient in $\mathrm{Sym}^{+}(T_\nu M, T_\nu M)$, a priori $\chi$ differs from $\kappa$ by an addendum of the type $\lambda_\alpha \dot{\nu}^{\alpha}$, with $\lambda(\nu) \in T^{*}_\nu M$. Although such an addendum is commonly neglected, it becomes prominent in some cases, e.g., when the substructural kinetics has rotational character, i.e., when the anholonomic constraint $\dot{\nu} = Am$ holds, with $m$ an arbitrary vector. Such a circumstance occurs in magnetostrictive solids, for which $m$ is the magnetization, and one obtains from (4) the standard Gilbert equation (see [13] for the relevant calculations).

Henceforth, to simplify notation, we use $f^{1}$, $f^{2}$ and $\nu_g$ to indicate $f^{1}_{s_1}(X)$, $f^{2}_{s_2}(x)$, $\nu_{g_{s_3}}(X)$, and write $|_0$ for $|_{s_1=0,\,s_2=0,\,s_3=0}$. Moreover, grad indicates the gradient with respect to $x$. We say that $L$ is invariant with respect to the $f^{i}_{s_i}$’s and $G$ when

$$L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu) = L\bigl(f^{1}, f^{2}, (\mathrm{grad}\, f^{2})\dot{x}, (\mathrm{grad}\, f^{2})F(\nabla f^{1})^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^{1})^{-1}\bigr). \tag{9}$$
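Let us define

$$\mathcal{Q} = \partial_{\dot{x}} L\cdot(v - Fw) + \partial_{\dot{\nu}} L\cdot(\xi_M(\nu) - (\nabla\nu)w), \tag{10}$$
$$\mathcal{F} = Lw + (\partial_F L)^{T}(v - Fw) + (\partial_{\nabla\nu} L)^{T}(\xi_M(\nu) - (\nabla\nu)w), \tag{11}$$

where $v$, $w$ and $\xi_M(\nu)$ are as mentioned in items 1, 2, 3.

THEOREM 1 (Noether-like theorem for complex materials). If the Lagrangian density $L$ is invariant under $f^{1}_{s_1}$, $f^{2}_{s_2}$ and $G$, then

$$\dot{\mathcal{Q}} + \mathrm{Div}\,\mathcal{F} = 0. \tag{12}$$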
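Proof. To prove the theorem, as a first step we note that (9) implies

$$\frac{d}{ds_1} L\bigl(f^{1}, f^{2}, (\mathrm{grad}\, f^{2})\dot{x}, (\mathrm{grad}\, f^{2})F(\nabla f^{1})^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^{1})^{-1}\bigr)\Big|_0 = 0, \tag{13}$$
$$\frac{d}{ds_2} L\bigl(f^{1}, f^{2}, (\mathrm{grad}\, f^{2})\dot{x}, (\mathrm{grad}\, f^{2})F(\nabla f^{1})^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^{1})^{-1}\bigr)\Big|_0 = 0, \tag{14}$$
$$\frac{d}{ds_3} L\bigl(f^{1}, f^{2}, (\mathrm{grad}\, f^{2})\dot{x}, (\mathrm{grad}\, f^{2})F(\nabla f^{1})^{-1}, \nu_g, \dot{\nu}_g, (\nabla\nu_g)(\nabla f^{1})^{-1}\bigr)\Big|_0 = 0, \tag{15}$$

which lead to

$$\partial_X L\cdot w - \partial_F L\cdot(F\nabla w) - \partial_{\nabla\nu} L\cdot(\nabla\nu\,\nabla w) = 0, \tag{16}$$
$$\partial_x L\cdot v + \partial_{\dot{x}} L\cdot((\mathrm{grad}\, v)\dot{x}) + \partial_F L\cdot((\mathrm{grad}\, v)F) = 0, \tag{17}$$
$$\partial_\nu L\cdot\xi_M(\nu) + \partial_{\dot{\nu}} L\cdot\dot{\overline{\xi_M(\nu)}} + \partial_{\nabla\nu} L\cdot\nabla\xi_M(\nu) = 0, \tag{18}$$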
as a consequence of the properties listed under items 1–3 above. Then, we calculate the time rate of the scalar $\mathcal{Q}$ and the divergence of the vector $\mathcal{F}$ and, by using equations (3) and (4) and identifying $s_3$ with $t$, we recognize that

$$\dot{\mathcal{Q}} + \mathrm{Div}\,\mathcal{F} = \frac{d}{ds_1}L\Big|_0 + \frac{d}{ds_2}L\Big|_0 + \frac{d}{ds_3}L\Big|_0, \tag{19}$$

which proves the theorem. □
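As a hedged illustration of Theorem 1 (not part of the original argument), consider the simplest symmetry: a rigid translation of the ambient space, with $v$ a constant vector, $w = 0$ and $\xi_M(\nu) = 0$. Invariance then requires $L$ to be independent of $x$, i.e. $b = 0$. The sketch below evaluates (10)–(12) in this special case only; $\mathcal{Q}$ and $\mathcal{F}$ denote the scalar and the vector of (10) and (11), and the form (2) of $L$ is assumed.

```latex
% Assumptions: v constant, w = 0, \xi_M(\nu) = 0, \partial_x L = 0 (no body force),
% and L of the form (2), so that \partial_{\dot{x}} L = \rho_0 \dot{x} and \partial_F L = -P.
\begin{align*}
\mathcal{Q} &= \partial_{\dot{x}} L \cdot v = \rho_0\,\dot{x}\cdot v, \\
\mathcal{F} &= (\partial_F L)^{T} v = -P^{T} v, \\
\dot{\mathcal{Q}} + \operatorname{Div}\mathcal{F}
  &= \rho_0\,\ddot{x}\cdot v - \operatorname{Div}(P^{T} v)
   = (\rho_0\,\ddot{x} - \operatorname{Div} P)\cdot v = 0 .
\end{align*}
```

Since $v$ is arbitrary, the conservation law (12) reduces here to $\rho_0\ddot{x} = \mathrm{Div}\,P$, the balance of momentum without body forces.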
REMARK 1. As a first special case, we require that $f^{2}_{s_2}$ alone acts on $L$, leaving $v$ arbitrary. By using (17) we obtain from (12)

$$\Bigl(\frac{\partial}{\partial t}\partial_{\dot{x}} L - \partial_x L + \mathrm{Div}\,\partial_F L\Bigr)\cdot v = 0, \tag{20}$$

i.e.,

$$\rho_0\ddot{x} = \rho_0 b + \mathrm{Div}\,P, \tag{21}$$

which is the standard equation of balance of momentum.

REMARK 2. With $G$ arbitrary, we consider its action alone on $L$; by using (18), with the identification $s_3 = t$, we obtain from (12) that

$$\Bigl(\frac{\partial}{\partial t}\partial_{\dot{\nu}} L - \partial_\nu L + \mathrm{Div}\,\partial_{\nabla\nu} L\Bigr)\cdot\xi_M(\nu) = 0 \tag{22}$$

or

$$\rho_0\bigl(\dot{\overline{\partial_{\dot{\nu}}\chi}} - \partial_\nu\chi\bigr) + z - \rho_0\beta - \mathrm{Div}\,S = 0. \tag{23}$$

$z = -\rho_0\,\partial_\nu e$ is called self-force in the terminology of [1]. This result assures the covariance of the balance of substructural interactions. When $G$ coincides with $SO(3)$, the co-vector in the parentheses in (22), namely the term multiplying $\xi_M(\nu)$, must be an element of the null space of $A^{T}$ (see [1–4] for details of this special case).

REMARK 3. As a second special choice, let $f^{2}_{s_2}$ be such that $v = \dot{q}\times(x - x_0)$ (with $\dot{q}$ a rigid rotational velocity, depending on time only, and $x_0$ a fixed point in space) and $G = SO(3)$; $\dot{q}\times$ is an element of its Lie algebra, thus $\xi_M(\nu) = A\dot{q}$. If $L$ is independent of $x$ and we assume that only $f^{2}_{s_2}$ and $G$ (in the form just defined) act on $L$, we have

$$\mathrm{skw}(\partial_F L\, F^{T}) = \mathsf{e}\bigl(A^{T}\partial_\nu L + (\nabla A^{T})^{t}\,\partial_{\nabla\nu} L\bigr), \tag{24}$$

where $\mathsf{e}$ is Ricci’s alternating tensor and $\mathrm{skw}(\cdot)$ extracts the skew-symmetric part of its argument.
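The step from (20) to (21) in Remark 1 can be made explicit by inserting the assumed form (2) of $L$ into the Euler–Lagrange equation (3). The short sketch below only collects the partial derivatives involved; it uses $b = -\partial_x w$, $P = -\partial_F L$ from (8), and the fact that $\rho_0$ is conserved during the motion.

```latex
% Partial derivatives of the Lagrangian density (2) entering (3).
\begin{align*}
\partial_{\dot{x}} L &= \rho_0\,\dot{x}, &
\partial_x L &= -\rho_0\,\partial_x w = \rho_0\, b, &
\partial_F L &= -\rho_0\,\partial_F e = -P, \\
\dot{\overline{\partial_{\dot{x}} L}} &= \partial_x L - \operatorname{Div}\partial_F L
 & &\Longrightarrow &
\rho_0\,\ddot{x} &= \rho_0\, b + \operatorname{Div} P .
\end{align*}
```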
REMARK 4. If we require that $f^{1}_{s_1}$ alone acts on $L$, with $w$ arbitrary (but satisfying the condition in item 1), by using (16) we obtain from (12) that

$$\dot{\overline{(F^{T}\partial_{\dot{x}} L + \nabla\nu^{T}\partial_{\dot{\nu}} L)}} - \mathrm{Div}\Bigl(\mathbb{P} - \bigl(\tfrac{1}{2}\rho_0|\dot{x}|^2 + \rho_0\,\chi(\nu, \dot{\nu})\bigr) I\Bigr) - \partial_X L = 0, \tag{25}$$

where $\mathbb{P} = \rho_0 e I - F^{T}P - \nabla\nu^{T} * S$ is the modified Eshelby tensor for continua with substructure (see [4] for a similar result in a non-conservative setting, where the elastic potential $e$ is substituted by the free energy). $I$ is the second-order unit tensor and, in writing the explicit expression of $\mathbb{P}$ above, we find it convenient to introduce the product $*$ defined by $(\nabla\nu^{T} * S)n\cdot u = Sn\cdot(\nabla\nu)u$ for any pair of vectors $n$ and $u$.

REMARK 5. Let us assume as special choices that $f^{1}_{s_1}$ is such that $w = \dot{q}\times(X - X_0)$ (with $\dot{q}$ a rigid rotational velocity, and $X_0$ a fixed point in space) and that $G = SO(3)$, $\dot{q}\times$ being an element of its Lie algebra, thus $\xi_M(\nu) = A\dot{q}$. If the material is homogeneous, and we assume that $f^{1}$ and $G$ alone (in the form just defined) act on $L$, we have

$$\mathrm{skw}\bigl(F^{T}\partial_F L + (\nabla\nu)^{T} * \partial_{\nabla\nu} L\bigr) = 0.$$

REMARK 6. The action of $f^{1}_{s_1}$ can be interpreted as a special virtual mutation of a possibly existing smooth distribution of inhomogeneities throughout the body, in the sense of [8]. In other words, we may say that (25) is the balance of interactions arising when the body mutates its inhomogeneous structure. This interpretation has also been suggested in [9] in a non-conservative setting.
1.1. HAMILTON EQUATIONS

Define $p$ and $\mu$, respectively the canonical momentum and the canonical substructural momentum, by

$$p = \partial_{\dot{x}} L, \qquad \mu = \partial_{\dot{\nu}} L. \tag{26}$$

The Hamiltonian density $H$,

$$H(X, x, p, F, \nu, \mu, \nabla\nu) = p\cdot\dot{x} + \mu\cdot\dot{\nu} - L(X, x, \dot{x}, F, \nu, \dot{\nu}, \nabla\nu), \tag{27}$$

has partial derivatives with respect to its entries; some of them are the opposite of the corresponding derivatives of $L$, so that (3), (4) can also be written respectively as

$$\dot{p} = -\partial_x H + \mathrm{Div}\,\partial_F H, \qquad \dot{x} = \partial_p H; \tag{28}$$
$$\dot{\mu} = -\partial_\nu H + \mathrm{Div}\,\partial_{\nabla\nu} H, \qquad \dot{\nu} = \partial_\mu H. \tag{29}$$
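To see how (28) reproduces the balance of momentum, one can carry out the Legendre transform explicitly for the macroscopic part of the assumed Lagrangian density (2). The following is a sketch only: the substructural contribution is carried along as $\rho_0\kappa = \mu\cdot\dot{\nu} - \rho_0\chi$ without being resolved for $\dot{\nu}$, and $b = -\partial_x w$, $P = \rho_0\,\partial_F e$ are used as in (8) and (21).

```latex
% Macroscopic Legendre transform for L of the form (2); substructural terms are not resolved.
\begin{align*}
p &= \partial_{\dot{x}} L = \rho_0\,\dot{x}
   \quad\Longrightarrow\quad \dot{x} = p/\rho_0, \\
H &= \frac{|p|^{2}}{2\rho_0} + \rho_0\,\kappa(\nu,\dot{\nu})
     + \rho_0\, e(X, F, \nu, \nabla\nu) + \rho_0\, w(x, \nu), \\
\partial_p H &= p/\rho_0 = \dot{x}, \\
-\partial_x H + \operatorname{Div}\partial_F H
  &= -\rho_0\,\partial_x w + \operatorname{Div}(\rho_0\,\partial_F e)
   = \rho_0\, b + \operatorname{Div} P = \dot{p} .
\end{align*}
```

The last line is (21) written in terms of the canonical momentum $p = \rho_0\dot{x}$.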
2. Canonical Poisson Brackets in Multifield Theories

We now consider a general boundary value problem where the following boundary conditions are associated with (28) and (29):

$$x(X) = \bar{x} \quad \text{on } \partial^{(x)}B_0, \tag{30}$$
$$\partial_F H\, n = t \quad \text{on } \partial^{(t)}B_0, \tag{31}$$
$$\nu(X) = \bar{\nu} \quad \text{on } \partial^{(\nu)}B_0, \tag{32}$$
$$\partial_{\nabla\nu} H\, n = \mathfrak{t} \quad \text{on } \partial^{(\mathfrak{t})}B_0; \tag{33}$$

$\bar{x}$, $t$, $\bar{\nu}$ and $\mathfrak{t}$ are prescribed on the relevant parts $\partial^{(\cdot)}B_0$ of the boundary, with $\mathrm{Cl}(\partial B_0) = \mathrm{Cl}(\partial^{(x)}B_0 \cup \partial^{(t)}B_0)$, $\partial^{(x)}B_0 \cap \partial^{(t)}B_0 = \emptyset$, and $\mathrm{Cl}(\partial B_0) = \mathrm{Cl}(\partial^{(\nu)}B_0 \cup \partial^{(\mathfrak{t})}B_0)$, $\partial^{(\nu)}B_0 \cap \partial^{(\mathfrak{t})}B_0 = \emptyset$, where Cl indicates closure and $n$ is the outward unit normal to $\partial B_0$ at all points in which it is well defined. Again, as below (8), we argue that physical significance can be attributed to (33) in view of our hypotheses on $M$. It should be clear that those hypotheses need not apply to all substructures; e.g., in “homogenized” theories of liquids containing gas bubbles (with the gas fraction as order parameter) no such microstress could be expected to have such physical substance; there the interactions are at least weakly nonlocal, and no significant connection seems to exist then for $M$. We assume that there exist two surface densities $U(x)$ and $\mathfrak{U}(\nu)$ such that

$$t = \rho_0\,\partial_x U, \qquad \mathfrak{t} = \rho_0\,\partial_\nu\mathfrak{U}, \tag{34}$$

where $U$ and $\mathfrak{U}$ play here the rôle of surface potentials. Then the Hamiltonian $\mathcal{H}$ of the whole body is given by

$$\mathcal{H}(x, p, \nu, \mu) = \int_{B_0} H(X, x, p, \nu, \mu)\, d(\mathrm{vol}) - \int_{\partial^{(t)}B_0} \bigl(U(x) - \mathfrak{U}(\nu)\bigr)\, d(\mathrm{area}). \tag{35}$$
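Notice that we write $H(X, x, p, \nu, \mu)$ instead of $H(X, x, p, F, \nu, \mu, \nabla\nu)$ because below we consider directly variational derivatives.

THEOREM 2. The canonical Hamilton equation

$$\dot{F} = \{F, \mathcal{H}\} \tag{36}$$

is equivalent to the Hamiltonian system of balance equations (28), (29) for a continuum with substructure, where $F$ is any functional of the type $\int_{B_0} f(X, x, p, \nu, \mu)$, with $f$ a sufficiently smooth scalar density, and the Poisson bracket $\{\cdot,\cdot\}$ for a complex material is given by

$$\begin{aligned}
\{F, \mathcal{H}\} ={}& \int_{B_0}\Bigl(\frac{\delta f}{\delta x}\cdot\frac{\delta\mathcal{H}}{\delta p} - \frac{\delta f}{\delta p}\cdot\frac{\delta\mathcal{H}}{\delta x}\Bigr)\, d(\mathrm{vol})
 + \int_{\partial^{(t)}B_0}\Bigl(\frac{\delta f}{\delta x}\Big|_{\partial^{(t)}B_0}\cdot\frac{\delta\mathcal{H}}{\delta p} - \frac{\delta f}{\delta p}\cdot\frac{\delta\mathcal{H}}{\delta x}\Big|_{\partial^{(t)}B_0}\Bigr)\, d(\mathrm{area}) \\
 &+ \int_{B_0}\Bigl(\frac{\delta f}{\delta\nu}\cdot\frac{\delta\mathcal{H}}{\delta\mu} - \frac{\delta f}{\delta\mu}\cdot\frac{\delta\mathcal{H}}{\delta\nu}\Bigr)\, d(\mathrm{vol})
 + \int_{\partial^{(\mathfrak{t})}B_0}\Bigl(\frac{\delta f}{\delta\nu}\Big|_{\partial^{(\mathfrak{t})}B_0}\cdot\frac{\delta\mathcal{H}}{\delta\mu} - \frac{\delta f}{\delta\mu}\cdot\frac{\delta\mathcal{H}}{\delta\nu}\Big|_{\partial^{(\mathfrak{t})}B_0}\Bigr)\, d(\mathrm{area}),
\end{aligned} \tag{37}$$

where the variational derivative $\delta\mathcal{H}/\delta x$ is obtained by fixing $p$ and allowing $x$ to vary; an analogous meaning holds for the variational derivative with respect to the order parameter.

The proof can be developed by direct calculation. Clearly, $\{\cdot,\cdot\}$ is bilinear and skew-symmetric, and one can check easily that it satisfies Jacobi’s identity. We note that

$$\begin{aligned}
\{F, \mathcal{H}\} ={}& \int_{B_0}\Bigl(\frac{\delta f}{\delta x}\cdot\frac{\partial H}{\partial p} - \frac{\delta f}{\delta p}\cdot(\partial_x H - \mathrm{Div}\,\partial_F H)\Bigr)\, d(\mathrm{vol}) \\
 &+ \int_{\partial^{(t)}B_0}\Bigl(\frac{\delta f}{\delta x}\Big|_{\partial^{(t)}B_0}\cdot\frac{\partial H}{\partial p} - \frac{\delta f}{\delta p}\cdot(\partial_x U - \partial_F H\, n)\Big|_{\partial^{(t)}B_0}\Bigr)\, d(\mathrm{area}) \\
 &+ \int_{B_0}\Bigl(\frac{\delta f}{\delta\nu}\cdot\frac{\partial H}{\partial\mu} - \frac{\delta f}{\delta\mu}\cdot(\partial_\nu H - \mathrm{Div}\,\partial_{\nabla\nu} H)\Bigr)\, d(\mathrm{vol}) \\
 &+ \int_{\partial^{(\mathfrak{t})}B_0}\Bigl(\frac{\delta f}{\delta\nu}\Big|_{\partial^{(\mathfrak{t})}B_0}\cdot\frac{\partial H}{\partial\mu} - \frac{\delta f}{\delta\mu}\cdot(\partial_\nu\mathfrak{U} - \partial_{\nabla\nu} H\, n)\Big|_{\partial^{(\mathfrak{t})}B_0}\Bigr)\, d(\mathrm{area}),
\end{aligned} \tag{38}$$

and, in terms of functional partial derivatives,

$$\dot{F} = \int_{B_0}\Bigl(\frac{\delta f}{\delta x}\cdot\dot{x} + \frac{\delta f}{\delta p}\cdot\dot{p} + \frac{\delta f}{\delta\nu}\cdot\dot{\nu} + \frac{\delta f}{\delta\mu}\cdot\dot{\mu}\Bigr)\, d(\mathrm{vol}) + \int_{\partial^{(t)}B_0}\frac{\delta f}{\delta x}\Big|_{\partial^{(t)}B_0}\cdot\dot{x}\, d(\mathrm{area}) + \int_{\partial^{(\mathfrak{t})}B_0}\frac{\delta f}{\delta\nu}\Big|_{\partial^{(\mathfrak{t})}B_0}\cdot\dot{\nu}\, d(\mathrm{area}). \tag{39}$$

By identifying analogous terms in (38) and (39), we obtain both the Hamiltonian system (28), (29) and the boundary conditions (30)–(33). When we put $F = \mathcal{H}$, (36) coincides with the equation of conservation of energy. We have, in fact,

$$\dot{\mathcal{H}} = \{\mathcal{H}, \mathcal{H}\} = 0. \tag{40}$$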
Geometrical properties of the Poisson brackets for direct models of rods, plates and complex fluids have been discussed in [10, 11]. See relevant remarks in [10].
3. A Formal Approach toward a Hamilton–Jacobi Theory with Gradient Effects

Let $h$ be a smooth diffeomorphism

$$h : (X, x, p, F, \nu, \mu, \nabla\nu) \longmapsto (X, x_*, p_*, F_*, \nu_*, \mu_*, \nabla\nu_*). \tag{41}$$

The transformation $h$ generates a new Hamiltonian density

$$H_*(X, x_*, p_*, F_*, \nu_*, \mu_*, \nabla\nu_*), \tag{42}$$

with corresponding Lagrangian density

$$L_* = p_*\cdot\dot{x}_* + \mu_*\cdot\dot{\nu}_* - H_*. \tag{43}$$

If $h$ were such that $H_* = 0$, then an immediate integration of the system (28), (29) could be achieved. To this aim we choose $h$ such that the integral of the difference $L - L_*$ between two instants, say $t_1$ and $t_2$, is equal to the increment of a generating function $S$ of the type $S = S(t, X, x, p_*, \nu, \mu_*)$, i.e.,

$$\int_{t_1}^{t_2}(L - L_*)\, d\tau = S|_{t=t_2} - S|_{t=t_1}, \tag{44}$$

so that $L - L_*$ is the time derivative of $S$. Then, from (44) we would have

$$(p\cdot\dot{x} + \mu\cdot\dot{\nu} - H) - (p_*\cdot\dot{x}_* + \mu_*\cdot\dot{\nu}_* - H_*) = \dot{S} = \partial_t S + \partial_x S\cdot\dot{x} + \partial_{p_*} S\cdot\dot{p}_* + \partial_\nu S\cdot\dot{\nu} + \partial_{\mu_*} S\cdot\dot{\mu}_*, \tag{45}$$

and hence

$$p = \partial_x S, \qquad \mu = \partial_\nu S, \tag{46}$$
$$x_* - x_0 = \partial_{p_*} S, \qquad \nu = \partial_{\mu_*} S, \tag{47}$$
$$\partial_t S + H = H_*. \tag{48}$$

To obtain (47) one makes use of the fact that $\delta\int_{t_1}^{t_2}(p\cdot(x - x_0) + \mu\cdot\nu) = 0$ for variations vanishing at $t_1$ and $t_2$ (in the sense that $\delta(\mu\cdot\nu)|_{t=t_1,t_2} = 0$ and $\delta(p\cdot(x - x_0))|_{t=t_1,t_2} = 0$), so that $p\cdot\dot{x} = \dot{p}\cdot(x - x_0)$ and $\mu\cdot\dot{\nu} = \dot{\mu}\cdot\nu$. A necessary and sufficient condition to assure that $H_* = 0$ is

$$\partial_t S + H(X, x, \partial_x S, F, \nu, \partial_\nu S, \nabla\nu) = 0, \tag{49}$$

which is a Hamilton–Jacobi-like equation. Since $H_* = 0$, $p_*$ and $\mu_*$ are constant in time, and the time derivative of $S$ reduces to

$$\dot{S} = \partial_t S + \partial_x S\cdot\dot{x} + \partial_\nu S\cdot\dot{\nu} = -H + p\cdot\dot{x} + \mu\cdot\dot{\nu} = L. \tag{50}$$

The relation (50) allows us to determine $S$ to within a constant, namely

$$S = \int L\, dt + \mathrm{const}. \tag{51}$$
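The role of the generating function in (46)–(49) can be illustrated on a finite-dimensional analogue. The example below is an assumption made purely for illustration and is not taken from the paper: a single free particle of hypothetical mass $m$, with $H = |p|^{2}/(2m)$ and no order parameter, for which the classical complete integral of the Hamilton–Jacobi equation is known in closed form.

```latex
% Finite-dimensional analogue (illustrative assumption): free particle of mass m.
\begin{align*}
S(t, x, p_*) &= p_*\cdot x - \frac{|p_*|^{2}}{2m}\, t, \\
p &= \partial_x S = p_*, \\
\partial_t S + H(\partial_x S) &= -\frac{|p_*|^{2}}{2m} + \frac{|p_*|^{2}}{2m} = 0, \\
\partial_{p_*} S &= x - \frac{p_*}{m}\, t = \text{const}
  \quad\Longrightarrow\quad x(t) = x(0) + \frac{p_*}{m}\, t .
\end{align*}
```

Here the new momentum $p_*$ is constant, as stated after (49), and the relation obtained by differentiating $S$ with respect to $p_*$ plays the part of (47): it integrates the motion directly.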
4. The Spatial Form

Circumstances in which the notion of reference placement is wanting, as in the case of fluids or granular flows, make the choice between a material and a spatial representation more than a matter of form (see, e.g., [12] for standard bodies). Here, having in mind the study of complex fluids, we provide a spatial variational derivation of the balance equations free of any concept of reference place or paragon setting and without even formal recourse to an inverse motion. So, in the present section $x \in B$ is just a point in space. The notation $u = \hat{u}(x, t)$ is used for the velocity field over $B$. The order parameter is now $\nu = \tilde{\nu}(x, t)$ (with some abuse of notation) and we indicate with $\upsilon = \hat{\upsilon}(x, t)$ its rate in the present placement. The symmetric tensor $g$ is the spatial metric characterizing the present state of the body; it plays a prominent rôle because in this case the counterpart of (2) is a Lagrangian density of the form

$$L(x, u, g, \nu, \upsilon, \mathrm{grad}\,\nu) = \tfrac{1}{2}\rho|u|^2 + \rho\,\chi(\nu, \upsilon) - \rho\, e(g, \nu, \mathrm{grad}\,\nu) - \rho\, w(x, \nu), \tag{52}$$

with some slight abuse of notation. We then find balance equations as conditions verifying the relation

$$\hat{\delta}\int_0^{\bar{t}} d\tau\int_B L(x, u, g, \nu, \upsilon, \mathrm{grad}\,\nu)\, d(\mathrm{vol}) = 0, \tag{53}$$

where $\hat{\delta}$ denotes the total variation (we use the adjective “total”, perhaps inappropriately, in the sense which is sometimes accepted in elementary treatises when speaking of total time derivatives, as will be clear in the developments below). To define the variation of the relevant fields, we make use of $f^{2}$ introduced in item 2 of Section 1 and identify $\delta x$ with $v$. We consider a special (though wide) subclass of possible vector fields $x \mapsto v(x)$ characterized by the circumstance that they are purely deformative; in other words, we choose $v$ such that $\mathrm{skw}\,\mathrm{grad}\, v = 0$. We then define

$$\hat{\delta}g = \frac{d}{ds_2}\, f^{2*}_{s_2}\, g\,\Big|_{s_2=0} = \mathsf{L}_v g = 2\,\mathrm{sym}\,\mathrm{grad}\, v = 2\,\mathrm{grad}\, v, \tag{54}$$

where $f^{2*}_{s_2}$ means pull-back and $\mathsf{L}_v$ is thus the autonomous Lie derivative following the flow $v$. In an analogous way, we put

$$\hat{\delta}\nu = \delta\nu + (\mathrm{grad}\,\nu)v, \tag{55}$$
$$\widehat{\mathrm{grad}\,\delta\nu} = \mathrm{grad}\,\hat{\delta}\nu + (\mathrm{grad}\,\nu)\,\mathrm{grad}\, v. \tag{56}$$
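The chain of equalities in (54) can be checked in components. The sketch below assumes Cartesian coordinates in which the spatial metric has the constant components $g_{ij} = \delta_{ij}$, so that indices may be written in the lower position; this coordinate choice is an assumption made here for the illustration and is not stated in the text.

```latex
% Lie derivative of the metric along v, in Cartesian coordinates with g_{ij} = \delta_{ij}.
\begin{align*}
(\mathsf{L}_v g)_{ij}
 &= v^{k}\,\partial_k g_{ij} + g_{kj}\,\partial_i v^{k} + g_{ik}\,\partial_j v^{k} \\
 &= \partial_i v_j + \partial_j v_i
  = 2\,(\operatorname{sym}\operatorname{grad} v)_{ij}, \\
\operatorname{skw}\operatorname{grad} v = 0
 &\;\Longrightarrow\; (\mathsf{L}_v g)_{ij} = 2\,(\operatorname{grad} v)_{ij} .
\end{align*}
```

The first term vanishes because the metric components are constant; the last implication is where the restriction to purely deformative fields $v$ is used.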
With the words “paragon setting” we refer to an ideal model of paragon for the material element
and the body, say, e.g., an ideal crystal or any other choice that physical circumstances may suggest.
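As an intermediate step we notice that

$$\hat{\delta}\int_B e(g, \nu, \mathrm{grad}\,\nu)\, d(\mathrm{vol}) = \int_B \hat{\delta}e\, d(\mathrm{vol}) = \int_B \bigl(2\,\partial_g e\cdot\mathrm{grad}\, v + \partial_\nu e\cdot\hat{\delta}\nu + \partial_{\mathrm{grad}\,\nu} e\cdot(\mathrm{grad}\,\hat{\delta}\nu + (\mathrm{grad}\,\nu)\,\mathrm{grad}\, v)\bigr)\, d(\mathrm{vol}). \tag{57}$$

By developing the variation of (53), making use of (54)–(57) and Gauss’ theorem, we recognize that appropriate balances in the bulk are

$$\dot{\overline{\partial_u L}} - \partial_x L + \mathrm{div}\bigl(2\,\partial_g L - (\mathrm{grad}\,\nu)^{T}\partial_{\mathrm{grad}\,\nu} L\bigr) = 0, \tag{58}$$
$$\dot{\overline{\partial_\upsilon L}} - \partial_\nu L + \mathrm{div}(\partial_{\mathrm{grad}\,\nu} L) = 0. \tag{59}$$

The Cauchy stress $T$ is then given by

$$T = -2\,\partial_g L - (\mathrm{grad}\,\nu)^{T} S_a, \tag{60}$$

where the actual microstress $S_a$ is defined by

$$S_a = -\partial_{\mathrm{grad}\,\nu} L. \tag{61}$$

In the case of simple bodies, (60) reduces to the well-known Doyle–Ericksen formula.

REMARK 7. A requirement of invariance of $e$ under the action of $SO(3)$ implies that

$$\mathrm{skw}(2\,\partial_g L) = \mathsf{e}\bigl(A^{T} z_a + (\mathrm{grad}\, A^{T})^{t} S_a\bigr), \tag{62}$$

where $z_a = -\rho\,\partial_\nu e$ is the actual self-force and $\mathsf{e}$ Ricci’s alternating tensor.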
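The reduction of (60) to the Doyle–Ericksen formula for simple bodies can be sketched as follows. The assumptions are that the order parameter is absent, so that $L$ in (52) loses its $\nu$-dependent terms and $S_a = 0$, and that the partial derivative $\partial_g L$ is taken at fixed $\rho$, as the way (60) is written suggests.

```latex
% Simple body: no order parameter, S_a = 0; \partial_g taken at fixed \rho (assumption).
\begin{align*}
L(x, u, g) &= \tfrac{1}{2}\rho\,|u|^{2} - \rho\, e(g) - \rho\, w(x), \\
\partial_g L &= -\rho\,\partial_g e, \\
T &= -2\,\partial_g L = 2\rho\,\frac{\partial e}{\partial g} .
\end{align*}
```

The last line is the Doyle–Ericksen expression of the Cauchy stress as the derivative of the energy density with respect to the spatial metric.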
4.1. SPATIAL HAMILTON EQUATIONS

To find appropriate spatial Hamilton equations, we follow the pattern of Section 1.1. To this end we define the spatial canonical standard and substructural momenta ($\bar{p}$ and $\bar{\mu}$, respectively) through

$$\bar{p} = \partial_u L, \qquad \bar{\mu} = \partial_\upsilon L. \tag{63}$$

Consequently, the spatial Hamiltonian density is given by

$$H(x, \bar{p}, g, \nu, \bar{\mu}, \mathrm{grad}\,\nu) = \bar{p}\cdot u + \bar{\mu}\cdot\upsilon - L(x, u, g, \nu, \upsilon, \mathrm{grad}\,\nu) \tag{64}$$

(with some slight abuse of notation) and has partial derivatives with respect to its entries. By evaluating the variation of $H$, taking into account (54) and (56), and comparing the result with the variation of $L$, after making use of the balances (58) and (59), we obtain the spatial form of the Hamilton equations:

$$\dot{\bar{p}} = -\partial_x H + \mathrm{div}\bigl(2\,\partial_g H - (\mathrm{grad}\,\nu)^{T}\partial_{\mathrm{grad}\,\nu} H\bigr), \qquad u = \partial_{\bar{p}} H; \tag{65}$$
$$\dot{\bar{\mu}} = -\partial_\nu H + \mathrm{div}(\partial_{\mathrm{grad}\,\nu} H), \qquad \upsilon = \partial_{\bar{\mu}} H. \tag{66}$$
4.2. SPATIAL HAMILTON–JACOBI FORM

We may obtain the spatial counterpart of (49) by considering a smooth diffeomorphism

$$\bar{h} : (x, \bar{p}, g, \nu, \bar{\mu}, \mathrm{grad}\,\nu) \longmapsto (x_*, \bar{p}_*, g_*, \nu_*, \bar{\mu}_*, \mathrm{grad}\,\nu_*), \tag{67}$$

which generates a new Hamiltonian density

$$H_*(x_*, \bar{p}_*, g_*, \nu_*, \bar{\mu}_*, \mathrm{grad}\,\nu_*). \tag{68}$$

Now, we may use a generating function $S = S(t, x, \bar{p}_*, \nu, \bar{\mu}_*)$ and, following the same procedure as in Section 3, we find that a necessary and sufficient condition to assure that $H_* = 0$ is

$$\partial_t S + H(x, \partial_x S, g, \nu, \partial_\nu S, \mathrm{grad}\,\nu) = 0. \tag{69}$$

4.3. A SPATIAL FORM OF POISSON BRACKETS

For the spatial Hamiltonian in equations (65), (66), taking into account (54)–(56), we define a new variational derivative $\delta\mathcal{H}/\delta x$ through the relation

$$\frac{\delta\mathcal{H}}{\delta x}(x, \bar{p}, \nu, \bar{\mu})\cdot v = \bigl(-\partial_x H + \mathrm{div}(2\,\partial_g H - (\mathrm{grad}\,\nu)^{T}\partial_{\mathrm{grad}\,\nu} H)\bigr)\cdot v, \tag{70}$$

holding $\bar{p}$ fixed and allowing $x$ to vary, for any $v$ of the kind used in (54)–(56). Consider a boundary value problem of the type

$$\bigl(2\,\partial_g H - (\mathrm{grad}\,\nu)^{T}\partial_{\mathrm{grad}\,\nu} H\bigr)n = \partial_x\bar{u}(x), \qquad (\partial_{\mathrm{grad}\,\nu} H)\,n = \partial_\nu\mathfrak{u}(\nu), \qquad \text{on } \partial B \tag{71}$$

(where $\bar{u}(x)$ and $\mathfrak{u}(\nu)$ are the counterparts of the surface potentials $U(x)$ and $\mathfrak{U}(\nu)$). The total Hamiltonian is now given by $\mathcal{H}(x, \bar{p}, \nu, \bar{\mu}) = \int_B H$ (with some slight abuse of notation) and we list only the entries $(x, \bar{p}, \nu, \bar{\mu})$ because we consider the variational derivative (65) below. We consider also arbitrary functionals $F$ of the type $\int_B f(x, \bar{p}, \nu, \bar{\mu})$, with $f$ a sufficiently smooth scalar density.
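THEOREM 3. The canonical Hamilton equation

$$\dot{F} = \{F, \mathcal{H}\}_a \tag{72}$$

is equivalent to the Hamiltonian system of balance equations (65), (66), with

$$\begin{aligned}
\{F, \mathcal{H}\}_a ={}& \int_B\Bigl(\frac{\delta f}{\delta x}\cdot\frac{\delta\mathcal{H}}{\delta\bar{p}} - \frac{\delta f}{\delta\bar{p}}\cdot\frac{\delta\mathcal{H}}{\delta x}\Bigr)\, d(\mathrm{vol})
 + \int_{\partial B}\Bigl(\frac{\delta f}{\delta x}\Big|_{\partial B}\cdot\frac{\delta\mathcal{H}}{\delta\bar{p}} - \frac{\delta f}{\delta\bar{p}}\cdot\frac{\delta\mathcal{H}}{\delta x}\Big|_{\partial B}\Bigr)\, d(\mathrm{area}) \\
 &+ \int_B\Bigl(\frac{\delta f}{\delta\nu}\cdot\frac{\delta\mathcal{H}}{\delta\bar{\mu}} - \frac{\delta f}{\delta\bar{\mu}}\cdot\frac{\delta\mathcal{H}}{\delta\nu}\Bigr)\, d(\mathrm{vol})
 + \int_{\partial B}\Bigl(\frac{\delta f}{\delta\nu}\Big|_{\partial B}\cdot\frac{\delta\mathcal{H}}{\delta\bar{\mu}} - \frac{\delta f}{\delta\bar{\mu}}\cdot\frac{\delta\mathcal{H}}{\delta\nu}\Big|_{\partial B}\Bigr)\, d(\mathrm{area}),
\end{aligned} \tag{73}$$

where $\{\cdot,\cdot\}_a$ is bilinear, skew-symmetric and satisfies Jacobi’s identity.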
5. Final Remarks

To illustrate possible uses of Theorem 2, we list below some special cases. Analogous results accrue from Theorem 3.

REMARK 8. If we choose $f = p\cdot v$, with $v$ an arbitrary vector, equation (28a) and the boundary condition (31) follow immediately from (36).

REMARK 9. Let $f = \mu\cdot\xi_M(\nu)$; then from (36) we get (29a) and the boundary condition (33).

REMARK 10. Let $f$ be of the form

$$f = p\cdot(\dot{q}\times(x - x_0)) + \mu\cdot A\dot{q}, \tag{74}$$

with $\dot{q}$ arbitrary as in previous sections. Consider also, for the sake of simplicity, the absence of external bulk interactions (the ones accounted for by $w(x, \nu)$). By using (28) and (29), we obtain from (36)

$$\mathsf{e}(\partial_F H\, F^{T}) = A^{T}\partial_\nu H + (\nabla A^{T})^{t}\,\partial_{\nabla\nu} H. \tag{75}$$
These remarks are the Hamiltonian counterparts of Remarks 1–3. Of course, Poisson parentheses not only allow one to write in a concise form balance equations, but generate articulated geometric structures over the infinite-dimensional manifold of mappings showing placements and order parameters, and properties of these structures depend also strictly on the geometric properties of M.
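Remark 8 can be unpacked in a few lines. The sketch below assumes that $v$ may be taken as an arbitrary smooth field independent of $x$, $p$, $\nu$ and $\mu$ (so that the integrals localize), and it displays only the bulk terms of (38) and (39); the boundary integral over the traction part of the boundary is treated in the same way and must vanish separately.

```latex
% Remark 8 in detail: f = p . v, bulk terms of (38) and (39) only.
\begin{align*}
f = p\cdot v \;&\Longrightarrow\;
  \frac{\delta f}{\delta x} = 0,\quad
  \frac{\delta f}{\delta p} = v,\quad
  \frac{\delta f}{\delta \nu} = 0,\quad
  \frac{\delta f}{\delta \mu} = 0, \\
\dot{F} &= \int_{B_0} v\cdot\dot{p}\; d(\mathrm{vol}), \\
\{F,\mathcal{H}\}\big|_{\text{bulk}}
  &= -\int_{B_0} v\cdot\bigl(\partial_x H - \operatorname{Div}\partial_F H\bigr)\, d(\mathrm{vol}), \\
\dot{F} = \{F,\mathcal{H}\}\ \text{for all } v
  \;&\Longrightarrow\;
  \dot{p} = -\partial_x H + \operatorname{Div}\partial_F H .
\end{align*}
```

The last line is the first equation in (28); Remark 9 follows the same pattern with $f = \mu\cdot\xi_M(\nu)$ and the second pair of terms in (38).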
Acknowledgements

This paper is an extended version of the first part of a communication of P.M.M. delivered at the Symposium honoring the memory of Clifford Ambrose Truesdell III, which was held in conjunction with the 14th US National Congress of Theoretical and Applied Mechanics, Blacksburg, June 2002. P.M.M. acknowledges gratefully the support of the U.S. National Science Foundation (through a conference grant to C.-S. Man). We also thank Reuven Segev for valuable discussions. The support of the Italian National Group of Mathematical Physics (INDAM-GNFM) is acknowledged.

References

1. G. Capriz, Continua with Microstructure. Springer, Berlin (1989).
2. G. Capriz, Continua with substructure. Phys. Mesomech. 3 (2000) 5–14, 37–50.
3. G. Capriz and P.M. Mariano, Balance at a junction among coherent interfaces in materials with substructure. In: G. Capriz and P.M. Mariano (eds), Advances in Multifield Theories of Materials with Substructure. Birkhäuser, Basel (2003).
4. P.M. Mariano, Multifield theories in mechanics of solids. Adv. Appl. Mech. 38 (2001) 1–93.
5. R. Segev, A geometrical framework for the statics of materials with microstructure. Math. Models Methods Appl. Sci. 4 (1994) 871–897.
6. J.E. Marsden and T.J.R. Hughes, Mathematical Foundations of Elasticity. Prentice-Hall, Englewood Cliffs, NJ (1983).
7. R. Abraham and J.E. Marsden, Foundations of Mechanics. Benjamin/Cummings Publishing (1978).
8. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
9. M. Epstein, The Eshelby tensor and the theory of continuous distributions of inhomogeneities. Mech. Res. Comm. 29 (2002) 501–506.
10. J.C. Simo, J.E. Marsden, and P.S. Krishnaprasad, The Hamiltonian structure of nonlinear elasticity: The material and convective representation of solids, rods and plates. Arch. Rational Mech. Anal. 104 (1988) 125–183.
11. H. Cendra, J.E. Marsden, and T.S. Ratiu, Cocycles, compatibility and Poisson brackets for complex fluids. In: G. Capriz and P.M. Mariano (eds), Advances in Multifield Theories of Materials with Substructure. Birkhäuser, Basel (2003).
12. G. Capriz, Spatial variational principles in continuum mechanics. Arch. Rational Mech. Anal. 85 (1984) 99–109.
13. M. Brocato and G. Capriz, Spin fluids and hyperfluids. Theoret. Appl. Mech. 28/29 (2002) 39–53.
Geometrically-based Consequences of Internal Constraints

DONALD E. CARLSON (1), ELIOT FRIED (1,2) and DANIEL A. TORTORELLI (1,3)

(1) Department of Theoretical and Applied Mechanics, University of Illinois at Urbana-Champaign, 104 South Wright Street, Urbana, IL 61801-2935, USA. E-mail: [email protected]
(2) Department of Mechanical Engineering, Washington University in St. Louis, St. Louis, MO 63130-4862, USA
(3) Department of Mechanical and Industrial Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801-2935, USA

Received 23 October 2002; in revised form 19 June 2003

Abstract. When a body is subject to simple internal constraints, the deformation gradient must belong to a certain manifold. This is in contrast to the situation in the unconstrained case, where the deformation gradient is an element of the open subset of second-order tensors with positive determinant. Commonly, following Truesdell and Noll [1], modern treatments of constrained theories start with an a priori additive decomposition of the stress into reactive and active components, with the reactive component assumed to be powerless in all motions that satisfy the constraints and the active component given by a constitutive equation. Here, we obtain this same decomposition automatically by making a purely geometrical and general direct sum decomposition of the space of all second-order tensors in terms of the normal and tangent spaces of the constraint manifold. As an example, our approach is used to recover the familiar theory of constrained hyperelasticity.

Mathematics Subject Classifications (2000): 74A20, 74B20.

Key words: continuum mechanics, internal constraints, constitutive theory, hyperelasticity.
Dedicated to the memory of Clifford A. Truesdell
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 141–149. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
Most contemporary works in constrained theories of continuum mechanics follow the approach of Truesdell and Noll [1], wherein the stress is decomposed a priori into reactive and active terms with the reactive stress assumed to be powerless in all motions consistent with the constraints and the active stress given by a constitutive equation. [Footnote: See Carlson and Tortorelli [2] for a fuller account of other work in this area. To that account the more recent work of Casey and Krishnaswamy [3, 4] must be added. As in earlier work of Casey [5] cited by Carlson and Tortorelli [2], these considerations are based on the behavior on the constraint manifold of associated unconstrained materials, an approach fundamentally different from ours.] The approach of Truesdell and Noll was motivated by the Ericksen
and Rivlin [6] treatment of constrained hyperelasticity, which is based on the requirement that the constitutive equations for the stress and internal energy satisfy balance of energy in all motions consistent with the constraints. The main feature of the Ericksen–Rivlin hyperelastic development is that the stress is automatically decomposed into the sum of two terms. One term has zero power in any motion meeting the constraints and is determined by the constraints to within scalar multipliers; it is natural to think of this term as being present to maintain the constraints and to call it the reactive stress. The other term is, roughly speaking, the gradient of the internal-energy density with respect to the strain, and it is called the active stress. Carlson and Tortorelli [2] replaced the Lagrange multiplier formalism of the Ericksen–Rivlin approach with an elementary geometrical argument – essentially, the assertion that, if a vector a is orthogonal to every vector b that is orthogonal to some vector c, then a is parallel to c – used in the Truesdell–Noll method for determining the form of the reactive stress. It is widely accepted that many of the advances in modern continuum mechanics rest in large part on the clear separation of kinematics, basic laws of balance and growth, and constitutive equations that characterizes the subject. Where do internal constraints fit into this hierarchy? While internal constraints do delimit aspects of material response, they apply to broad classes of materials; for instance, the constraint of incompressibility applies equally well to both hyperelastic solids and viscous fluids. Hence, we view internal constraints as being more basic than constitutive equations. It is natural then to attempt to ascertain the implications of the kinematical nature of internal constraints. 
Motivated by this point of view, Anderson, Carlson and Fried [8] used a modified version of the geometrical argument of Carlson and Tortorelli [2] to deal with the constraints of incompressibility and microstructural inextensibility present in their theory of nematic elastomers. They started with a purely geometrical direct sum decomposition of the relevant fields based on the normal and tangent spaces of the constraint manifold to obtain automatically the decompositions of the deformational stress, orientational stress, and internal orientational body-force density into active and reactive components – without the use of any balance laws or constitutive assumptions. In this paper, we present this improved approach in the simpler context of isothermal continuum mechanics. We also take this opportunity to treat multiple constraints. In Section 2, we consider the case where the deformation gradient is restricted by n independent constraints. Thus, the deformation gradient is constrained to belong to a certain manifold in contrast to being an arbitrary element of the open subset of second-order tensors with positive determinant as in the unconstrained case. Next, we use the projection theorem to effect a unique orthogonal decomposition of the space of all second-order tensors in terms of the normal and tangent spaces of the constraint manifold. O’Reilly and Srinivasa [7] take an analogous view in their treatment of constrained discrete
mechanical systems.
In the absence of thermal contributions, the general thermomechanical principles of energy balance and entropy growth combine to yield a free-energy inequality, which may be simplified by means of the power identity. These considerations are developed in Section 3. In Section 4, the orthogonal decomposition of Section 2 is applied to the stress tensor. We find that, for motions consistent with the constraints, the normal component is automatically powerless and only the tangential component enters into the free-energy inequality. Consequently, the tangential component is called the active stress, and one would expect to write a constitutive equation for it. On the other hand, the normal component, termed the reactive stress, is determined by the constraints to within scalar multipliers that we take to be constitutively indeterminate. Thus, our approach to internal constraints has the same level of generality as that of Truesdell and Noll [1] and provides exactly the same results. However, our decomposition of the stress, rather than being a priori, is dictated by the geometry of the constraint manifold. In Section 5, as an application of the general theory, we make elastic constitutive assumptions for the free energy and the stress and require that the free-energy inequality be satisfied for all motions consistent with the constraints to recover the theory of constrained hyperelasticity; and, in this sense, the present paper replaces the paper of Carlson and Tortorelli [2]. Finally, in Section 6, we show that when the principle of material frame-indifference is invoked in constrained hyperelasticity, the active and reactive stresses individually satisfy local balance of moment of momentum. Throughout, we use the notations of modern continuum mechanics; see, e.g., the text of Gurtin [9]. 2. The Geometry of the Constraint Manifold We use a referential formulation. 
Accordingly, the body is identified with the region of space $B$ that it occupies in a fixed reference configuration. We write $y$ for the motion of the body and
$$F = \operatorname{Grad} y, \tag{2.1}$$
with $\det F > 0$, for the deformation gradient. We consider the case where the motion of the body is restricted by $n$ simple constraints; i.e., the deformation gradient is required to meet
$$\hat\gamma_i(F) = 0, \qquad i = 1, \dots, n, \tag{2.2}$$
where the constraint functions $\hat\gamma_i : \mathrm{Lin}^+ \to \mathbb{R}$ are suitably smooth and independent in the sense that the set $\{\operatorname{Grad}\hat\gamma_i(F),\ i = 1, \dots, n\}$ is linearly independent at each $F$ belonging to $\mathrm{Lin}^+$. [Footnote: At this level of generality, it must be required that $n < 9$. However, once the principle of material frame-indifference is imposed (cf. the developments of Section 6), the constraint functions $\hat\gamma_i$ are seen to depend on $F$ only through the symmetric tensor $F^{\top}F$. Consequently, we must, in fact, have $n < 6$.] In other words, the deformation gradient must belong to the constraint manifold
$$\mathrm{Con} := \{F \in \mathrm{Lin}^+ : \hat\gamma_i(F) = 0,\ i = 1, \dots, n\}. \tag{2.3}$$
Of great use to us will be the normal space to Con at $F$,
$$\mathrm{Norm}(F) := \operatorname{Lsp}\{\operatorname{Grad}\hat\gamma_i(F),\ i = 1, \dots, n\}, \tag{2.4}$$
and its orthogonal complement in Lin,
$$\mathrm{Norm}(F)^{\perp} = \{A \in \mathrm{Lin} : A\cdot B = 0,\ \forall B \in \mathrm{Norm}(F)\} = \{A \in \mathrm{Lin} : A\cdot\operatorname{Grad}\hat\gamma_i(F) = 0,\ i = 1, \dots, n\} =: \mathrm{Tan}(F), \tag{2.5}$$
which is the tangent space to Con at $F$.
Of course, the constraint equations (2.2) must hold for all time, and time differentiation yields
$$\operatorname{Grad}\hat\gamma_i(F)\cdot\dot F = 0, \qquad i = 1, \dots, n, \tag{2.6}$$
which, in view of (2.5), is equivalent to
$$\dot F \in \mathrm{Tan}(F). \tag{2.7}$$
If the body actually occupies the reference configuration at some reference time (so that $\hat\gamma_i(I) = 0$, $i = 1, \dots, n$), then (2.6) implies (2.2) (see Carlson and Tortorelli [2]); hence, in this case, (2.7) is equivalent to (2.2).
By the projection theorem, Lin admits the direct sum decomposition
$$\mathrm{Lin} = \mathrm{Norm}(F) \oplus \mathrm{Tan}(F); \tag{2.8}$$
i.e., each $A \in \mathrm{Lin}$ can be written uniquely as
$$A = A_{\perp} + A_{\parallel}, \qquad A_{\perp} \in \mathrm{Norm}(F), \quad A_{\parallel} \in \mathrm{Tan}(F). \tag{2.9}$$
In view of (2.5), (2.7), and (2.9),
$$A_{\perp}\cdot\dot F = 0, \qquad A\cdot\dot F = A_{\parallel}\cdot\dot F. \tag{2.10}$$
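The decomposition (2.8)–(2.10) can be tried out numerically. The following is a minimal sketch, not part of the original paper: it assumes two particular constraints, incompressibility ($\det F - 1 = 0$) and inextensibility along a unit vector $e$ ($|Fe|^2 - 1 = 0$), and all function names are hypothetical. It builds an orthonormal basis of Norm(F) by Gram–Schmidt, splits an arbitrary tensor into normal and tangential parts, and evaluates the two relations in (2.10).

```python
# Illustrative sketch only; the constraints and all names are assumptions.

def inner(A, B):
    """Inner product A.B = tr(A^T B) of two 3x3 tensors."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def axpy(alpha, X, Y):
    """Return alpha*X + Y for 3x3 tensors."""
    return [[alpha * X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]

def grad_det(F):
    """Gradient of det F with respect to F: the cofactor matrix det(F) F^{-T}."""
    G = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            i1, i2, j1, j2 = (i + 1) % 3, (i + 2) % 3, (j + 1) % 3, (j + 2) % 3
            G[i][j] = F[i1][j1] * F[i2][j2] - F[i1][j2] * F[i2][j1]
    return G

def grad_inextensible(F, e):
    """Gradient of |F e|^2 - 1 with respect to F: the tensor 2 (F e) (x) e."""
    Fe = [sum(F[i][k] * e[k] for k in range(3)) for i in range(3)]
    return [[2.0 * Fe[i] * e[j] for j in range(3)] for i in range(3)]

def orthonormal_basis(grads):
    """Gram-Schmidt on the constraint gradients: orthonormal basis of Norm(F)."""
    basis = []
    for G in grads:
        for N in basis:
            G = axpy(-inner(G, N), N, G)
        norm = inner(G, G) ** 0.5
        basis.append([[g / norm for g in row] for row in G])
    return basis

def split(A, basis):
    """Return (A_perp, A_par) with A_perp in Norm(F), A_par in Tan(F), cf. (2.9)."""
    A_perp = [[0.0] * 3 for _ in range(3)]
    for N in basis:
        A_perp = axpy(inner(A, N), N, A_perp)
    return A_perp, axpy(-1.0, A_perp, A)

# Simple shear lies on the constraint manifold: det F = 1 and |F e1| = 1.
F = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
e1 = [1.0, 0.0, 0.0]
grads = [grad_det(F), grad_inextensible(F, e1)]
basis = orthonormal_basis(grads)

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]   # arbitrary tensor
B = [[0.3, -1.0, 2.0], [0.7, 0.1, -0.4], [1.5, 0.2, 0.9]]  # arbitrary tensor
A_perp, A_par = split(A, basis)
_, F_dot = split(B, basis)  # tangential part of B stands in for an admissible F-dot

print(inner(A_perp, F_dot))                   # first relation in (2.10), up to rounding
print(inner(A, F_dot) - inner(A_par, F_dot))  # second relation in (2.10), up to rounding
```

Both printed numbers should vanish up to rounding error. When the tensor being split is the stress, the normal part is what Section 4 calls the reactive stress.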
[Footnote: Our usage of the subscripts $\perp$ and $\parallel$ here is exactly opposite to that used by Anderson, Carlson and Fried [8].]
3. Free-energy Inequality
We restrict attention to processes in which the temperature is independent of position and time; in this case, the principles of energy balance and entropy growth (in the form of the Clausius–Duhem inequality), or the first and second laws of thermodynamics, combine to yield a free-energy inequality. On using $P$ to denote an arbitrary regular part of $B$ with boundary $\partial P$ and unit outward normal field $n$, this free-energy inequality requires that
$$\dot{\overline{\int_P \rho\left(\psi + \tfrac{1}{2}|v|^2\right) dv}} \le \int_{\partial P} Sn\cdot v\, da + \int_P \rho b\cdot v\, dv \tag{3.1}$$
for each instant and for all parts. Here, $\rho$ is the referential mass density, $v$ is the velocity field, $\psi$ is the free energy per unit mass in the reference configuration, $S$ is the first Piola–Kirchhoff stress tensor, $b$ is the body force per unit mass in the reference configuration, and the superposed dot indicates time differentiation.
Next, we recall that an easy consequence of the principles of mass balance and momentum balance is the power identity, which asserts that
$$\int_{\partial P} Sn\cdot v\, da + \int_P \rho b\cdot v\, dv = \int_P S\cdot\dot F\, dv + \dot{\overline{\int_P \tfrac{1}{2}\rho|v|^2\, dv}} \tag{3.2}$$
for each instant and all parts. Equations (3.1) and (3.2) imply that
$$\dot{\overline{\int_P \rho\psi\, dv}} \le \int_P S\cdot\dot F\, dv \tag{3.3}$$
for each instant and all parts. The local equivalent of (3.3) is
$$\rho\dot\psi \le S\cdot\dot F, \tag{3.4}$$
and it is this inequality on which our subsequent considerations of hyperelasticity are based.
4. Active and Reactive Stresses
On employing the decomposition (2.10) in the particular case when $A$ is identified with the first Piola–Kirchhoff stress $S$, it follows from the power identity (3.2) that only the component $S_{\parallel}$ expends nonzero power over a constrained motion, and we refer to $S_{\parallel}$ as the active component of the stress and write
$$S_{\parallel} = S^{a}. \tag{4.1}$$
On the other hand, the component $S_{\perp}$ is powerless in a constrained motion, and we refer to $S_{\perp}$ as the reactive component of the stress and write
$$S_{\perp} = S^{r}. \tag{4.2}$$
Finally, since $S^{r}$ belongs to $\mathrm{Norm}(F)$, it follows from (2.4) that there exist scalar fields $\lambda_1, \dots, \lambda_n$ that we take to be constitutively indeterminate such that
$$S^{r} = \sum_{i=1}^{n} \lambda_i \operatorname{Grad}\hat\gamma_i(F). \tag{4.3}$$
Thus, we have shown that, when a body is internally constrained by simple constraints of the form (2.2), the geometry of the constraint manifold dictates that the stress is automatically decomposed into the sum of two components: a powerless component $S^{r}$ that is determined to within scalar multipliers by (4.3); and a component $S^{a}$ that does expend power and consequently appears in the free-energy inequality. We emphasize that this result is independent of any constitutive considerations other than the "simple" nature of the constraints; in particular, the body need not be elastic.
A noteworthy feature of our approach is that, in view of (4.1), (4.2), and (2.9),
$$S^{a}\cdot S^{r} = 0. \tag{4.4}$$
This automatic normalization is important, because the presence of the constitutively indeterminate multipliers in $S^{r}$ (see (4.3)) means that the response function for any component of $S^{a}$ not orthogonal to $S^{r}$ could not be measured.
5. Constrained Hyperelasticity
In the constrained case, it follows from (2.10) and (4.1) that the local free-energy inequality (3.4) reduces to
$$\rho\dot\psi \le S^{a}\cdot\dot F. \tag{5.1}$$
For hyperelasticity, we make the constitutive assumptions that
$$\psi = \hat\psi(F), \qquad \hat\psi : \mathrm{Con} \to \mathbb{R}, \tag{5.2}$$
and
$$S^{a} = \hat S^{a}(F), \qquad \hat S^{a} : \mathrm{Con} \to \mathrm{Tan}(F). \tag{5.3}$$
Now, with $\hat\psi$ assumed to be smooth,
$$\dot\psi = \operatorname{Grad}\hat\psi(F)\cdot\dot F; \tag{5.4}$$
so the local free-energy inequality becomes
$$\big(\hat S^{a}(F) - \rho\operatorname{Grad}\hat\psi(F)\big)\cdot\dot F \ge 0. \tag{5.5}$$
In the spirit of Green [10, 11], Ericksen and Rivlin [6], and Coleman and Noll [12], we require that our constitutive equations be restricted such that the local free-energy inequality (5.5) is always satisfied. To make this precise, we say that a constrained hyperelastic process consists of:
(i) a motion $y$ consistent with the constraint equations (2.2);
(ii) scalar fields $\lambda_1, \dots, \lambda_n$;
(iii) a free-energy field $\psi$ given in terms of $y$ by constitutive equation (5.2);
(iv) an active stress field $S^{a}$ given in terms of $y$ by constitutive equation (5.3);
(v) a reactive stress field $S^{r}$ given in terms of $y$ and $\lambda_1, \dots, \lambda_n$ through (4.3); and
(vi) a body force field $b$ determined in terms of the above fields through local balance of momentum.
Then, we insist that the local free-energy inequality (5.5) be satisfied for every constrained hyperelastic process. At least locally, it is possible to choose a constrained hyperelastic process such that, at any given position and time, $F$ and $\dot F$ take on arbitrary values in Con and $\mathrm{Tan}(F)$, respectively. Since both $\hat S^{a}(F)$ and $\operatorname{Grad}\hat\psi(F)$ belong to $\mathrm{Tan}(F)$, we conclude that
$$\hat S^{a}(F) = \rho\operatorname{Grad}\hat\psi(F). \tag{5.6}$$
In (5.4)–(5.6), $\operatorname{Grad}\hat\psi(F)$ represents the tangential gradient of $\hat\psi$ at $F$. When the response function $\hat\psi$ admits a smooth extension off the constraint manifold to an open subset of $\mathrm{Lin}^+$, then
$$\operatorname{Grad}\hat\psi(F) = \Big(\mathbb{I} - \sum_{i=1}^{n} N_i \otimes N_i\Big)\operatorname{Grad}\hat\psi(F), \tag{5.7}$$
where, on the right-hand side, $\operatorname{Grad}\hat\psi(F)$ is the ordinary gradient of the extension, the fourth-order tensor $\mathbb{I}$ is the identity operator on Lin, $\{N_i,\ i = 1, \dots, n\}$ is an orthonormal basis for the linear subspace $\mathrm{Norm}(F)$, and $A \otimes B$ is the fourth-order tensor defined such that $(A \otimes B)C = (B\cdot C)A$ for any second-order tensor $C$.
6. Material Frame-indifference and Moment-of-momentum Balance
An interesting feature of hyperelasticity in the unconstrained case is that the principle of balance of moment-of-momentum need not be taken as an axiom; rather it appears as a theorem in the theory primarily as a consequence of the principle of material frame-indifference. In this section, we show that this is the case also in the constrained theory as developed above.
As noted in the introduction, internal constraints do delimit aspects of material response. Thus, the kinematical restrictions embodied in (2.2) are subject to the principle of material frame-indifference:
$$\hat\gamma_i(QF) = \hat\gamma_i(F), \qquad i = 1, \dots, n, \quad \forall (Q, F) \in \mathrm{Orth}^+ \times \mathrm{Lin}^+. \tag{6.1}$$
A standard consequence of (6.1) is that for each $i$
$$\hat\gamma_i(F) = \bar\gamma_i(C), \qquad \bar\gamma_i : \mathrm{Psym} \to \mathbb{R}, \tag{6.2}$$
where
$$C = F^{\top}F \tag{6.3}$$
is the right Cauchy–Green deformation tensor.
By (6.2) and (6.3),
$$\operatorname{Grad}\hat\gamma_i(F) = 2F\operatorname{Grad}\bar\gamma_i(C), \tag{6.4}$$
and (4.3) becomes
$$S^{r} = \sum_{i=1}^{n} \lambda_i F\operatorname{Grad}\bar\gamma_i(C) \tag{6.5}$$
in terms of the reduced constraint functions $\bar\gamma_i$, where the factor of 2 has been absorbed into the constitutively indeterminate multipliers. Since each $\operatorname{Grad}\bar\gamma_i(C)$ is symmetric, an immediate consequence of (6.5) is that
$$S^{r}F^{\top} = F(S^{r})^{\top}, \tag{6.6}$$
which is the local form of balance of moment-of-momentum for the reactive stress.
Similarly, material frame-indifference requires that the constitutive equation (5.2) for the free-energy density reduce to
$$\psi = \bar\psi(C). \tag{6.7}$$
Here, of course, the domain of $\bar\psi$ is the reduced constraint manifold
$$\mathrm{Con}(C) := \{C \in \mathrm{Psym} : \bar\gamma_i(C) = 0,\ i = 1, \dots, n\}. \tag{6.8}$$
In terms of $\bar\psi$, (5.6) becomes
$$S^{a} = \bar S^{a}(C) = 2\rho F\operatorname{Grad}\bar\psi(C), \tag{6.9}$$
where Grad now denotes the tangential gradient with respect to the reduced constraint manifold (6.8). Furthermore, it follows from (6.9) that
$$S^{a}F^{\top} = F(S^{a})^{\top}, \tag{6.10}$$
which is the local form of moment-of-momentum balance for the active stress.
Acknowledgements
This work was supported in part by the National Science Foundation under Grant CMS96-10286.
References
1. C. Truesdell and W. Noll, The non-linear field theories of mechanics. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer-Verlag, Berlin (1965).
2. D.E. Carlson and D.A. Tortorelli, On hyperelasticity with internal constraints. J. Elasticity 42 (1996) 91–98.
3. J. Casey and S. Krishnaswamy, On constrained thermoelastic materials. In: R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials. CIMNE, Barcelona (1996) pp. 359–371.
4. J. Casey and S. Krishnaswamy, A characterization of internally constrained thermoelastic materials. Mathematics and Mechanics of Solids 3 (1998) 71–89.
5. J. Casey, A treatment of internally constrained materials. Trans. ASME J. Appl. Mech. 62 (1995) 542–544.
6. J.L. Ericksen and R.S. Rivlin, Large elastic deformations of homogeneous anisotropic materials. J. Rational Mech. Anal. 3 (1954) 281–301.
7. O.M. O'Reilly and A.R. Srinivasa, On a decomposition of generalized constraint forces. Proc. Roy. Soc. London A 457 (2001) 1307–1313.
8. D.R. Anderson, D.E. Carlson and E. Fried, A continuum-mechanical theory for nematic elastomers. J. Elasticity 56 (1999) 33–58.
9. M.E. Gurtin, An Introduction to Continuum Mechanics. Academic Press, New York (1981).
10. G. Green, On the laws of reflection and refraction of light at the common surface of two noncrystallized media. Trans. Cambridge Philos. Soc. 7 (1839) 245–269.
11. G. Green, On the propagation of light in crystallized media. Trans. Cambridge Philos. Soc. 7 (1841) 113–120.
12. B.D. Coleman and W. Noll, The thermodynamics of elastic materials with heat conduction and viscosity. Arch. Rational Mech. Anal. 13 (1963) 167–178.
Second Variation Condition and Quadratic Integral Inequalities with Higher Order Derivatives
YI-CHAO CHEN
Department of Mechanical Engineering, University of Houston, Houston, TX 77204, U.S.A. E-mail: [email protected]
Received 1 October 2002; in revised form 13 June 2003
Abstract. The positivity of quadratic integrals involving variable coefficients and derivatives of any order is studied. The result is determined by the solution of an initial value problem for a system of first order nonlinear differential equations. The system is identified as the matrix Riccati differential equation in control theory. A complete conclusion is reached by considering the cases when the solution is bounded and when the solution is unbounded.
Mathematics Subject Classifications (2000): 34A12, 34A34, 49K15, 49N10, 74B20.
Key words: stability, calculus of variations, Riccati equation.
Dedicated to Professor C. Truesdell with deepest admiration and appreciation
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 151–167. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
In the analysis of mechanics, the need often arises to determine whether certain quadratic integral inequalities hold. For example, in a recent work, Chen and Haughton [1] study the stability of inflation and stretch of thick-walled incompressible elastic cylinders. An energy stability criterion is used, which leads to a variational problem. The second variation condition reads
$$\int \big\{\nabla\mathbf{u}\cdot W_{\mathbf{FF}}[\nabla\mathbf{u}] + W_{\mathbf{F}}\cdot\nabla(\nabla\mathbf{u}\,\mathbf{F}^{-1}\mathbf{u})\big\}\, dV \ge 0, \tag{1}$$
where the integral is taken over the cylindrical reference configuration, $\mathbf{F}$ is the deformation gradient, $W(\mathbf{F})$ the strain energy function, and $\mathbf{u}$ the variation function. By incompressibility and a Fourier analysis, inequality (1) is found to be equivalent to the following quadratic integral inequality
$$\int_a^b \mathbf{U}(r)\cdot\mathbf{A}(r)\mathbf{U}(r)\, dr \ge 0,$$
where $r$ is the radial coordinate, $a$ and $b$ the inner and outer radii, respectively, of the cylinder, $\mathbf{A}(r)$ a $3 \times 3$ symmetric matrix whose components are smooth functions of $r$, and $\mathbf{U}(r) = (u(r), u'(r), u''(r))^{T}$ is a column vector consisting of an arbitrary smooth scalar function $u(r)$ and its first and second order derivatives. The above inequality was solved in [1] and is a special case of the problem studied here.
In this work, we develop a general method to determine the positivity of quadratic integrals involving derivatives of arbitrary order. Let $[a, b]$ be an interval in $\mathbb{R}$, and $n$ be a positive integer. We define the space of admissible functions $\mathcal{U}$ by
$$\mathcal{U} \equiv C^{n}([a, b]; \mathbb{R}).$$
Consider the following quadratic integral inequality
$$\int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\,\frac{d^{i}u(x)}{dx^{i}}\,\frac{d^{j}u(x)}{dx^{j}}\, dx \ge 0, \tag{2}$$
where $a_{ij} \in C^{n}([a, b]; \mathbb{R})$, $i, j = 0, 1, \dots, n$, satisfy
$$a_{ij} = a_{ji}. \tag{3}$$
The objective of this work is to determine whether inequality (2) holds for all $u \in \mathcal{U}$. It is noted that (2) is the second variation condition for minimizing an integral involving higher order derivatives of the competing function. The case where $n = 1$ has been treated by many authors, and has become a part of the classical theory of the calculus of variations. See, for example, Hestenes [2] and Sagan [3]. The focus of this work is on the cases where (2) contains derivatives of $u(x)$ of any order. While the conditions to be derived in this paper pertain to (2) as it stands, they can be made appropriate for the strict inequality version of (2).
It is also noted that (2) can be recast in a form which is related to a constrained minimization problem studied in control theory. Indeed, by writing $u_i(x) \equiv d^{i}u(x)/dx^{i}$, the left-hand side of (2) can be rewritten as
$$\int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\,u_i(x)\,u_j(x)\, dx. \tag{4}$$
A problem in control theory is to find $u_n(x)$ that minimizes (4) subject to the differential equation constraints
$$\frac{du_i(x)}{dx} = u_{i+1}(x), \qquad i = 0, 1, \dots, n - 1.$$
Variations of this latter problem, often for constant coefficients $a_{ij}$ with $a_{in} = 0$, $i = 0, \dots, n - 1$, have been studied in control theory (for example, see [4–6]), where $u_n(x)$ is called the control function. An important development is the associated matrix Riccati equation whose solution determines the optimal control function. The solution of the Riccati equation has been studied extensively [7–10].
For the unconstrained inequality (2) itself, some sufficient conditions and necessary conditions can be derived by elementary arguments. Let $\mathbf{A}(x)$ be the $(n+1) \times (n+1)$ matrix function with $a_{ij}(x)$ being its elements. An obvious sufficient condition for (2) to hold for all $u \in \mathcal{U}$ is that $\mathbf{A}(x)$ be pointwise positive semi-definite, that is,
$$\mathbf{v}\cdot\mathbf{A}(x)\mathbf{v} \ge 0 \qquad \forall\, \mathbf{v} \in \mathbb{R}^{n+1},\ x \in [a, b]. \tag{5}$$
This condition, although very simple, is in most cases too strong to be practically useful. Condition (5) is a pointwise algebraic inequality. A condition of this kind is desirable as it usually allows a simple verification. However, a necessary and sufficient condition for (2) to hold is in general not pointwise.
When $\mathbf{A}$ is a constant matrix, one can derive various algebraic necessary conditions by taking special forms of $u(x)$ in (2) and carrying out the integration. For example, by choosing $u(x) \equiv e^{kx}$, $k \in \mathbb{R}$, we find that (2) implies that
$$\mathbf{v}\cdot\mathbf{A}\mathbf{v} \ge 0 \qquad \forall\, \mathbf{v} = (1, k, k^{2}, \dots, k^{n})^{T}.$$
This scheme, however, may not lead to a useful result when $\mathbf{A}$ is not constant. Even when $\mathbf{A}$ is constant, the necessary conditions obtained this way may well be much weaker than (2) itself.
The difficulty in finding necessary and sufficient conditions lies in the fact that $\mathcal{U}$ is an infinite-dimensional function space. This problem is dealt with in the present work by relating the integral in (2) to a system of nonlinear ordinary differential equations, which is identified as the Riccati equation in control theory. The analysis presented here provides workers in mechanics with direct access to the solution of (2) without referring to the constrained minimization problem studied in control theory, and without the usual restrictions on $a_{ij}$ that are particular to control theory. In addition, we present a detailed treatment of the case where the solution of the Riccati equation becomes unbounded.
Besides its theoretical value, the method presented in this work offers a great numerical advantage in dealing with the integral inequality (2). While no numerical method has been developed to solve such an inequality directly, numerous ODE solvers are available.
Once the coefficients $a_{ij}$ are given, the corresponding Riccati equation can always be solved numerically, and the properties of the solution provide a definite conclusion on whether (2) holds for all $u \in \mathcal{U}$, as demonstrated in [1] where the stability of inflation of elastic cylinders has been determined for the first time.
In the next section, we state some well-known results in calculus of variations regarding the integral in (2). Section 3 introduces the system of differential equations upon which the solution method is based. In Section 4, a necessary and sufficient condition is derived when the solution of the differential equations is bounded. In Section 5, the case where the solution is unbounded is analyzed. An illustrative example is given in the concluding Section 6.
2. Preliminaries
The quadratic inequality (2), being the second variation condition of some minimization problem, forms a minimization problem itself. The first variation condition of this minimization problem reads
$$\int_a^b \sum_{i,j=0}^{n} a_{ij}(x)\,\frac{d^{i}u(x)}{dx^{i}}\,\frac{d^{j}v(x)}{dx^{j}}\, dx = 0. \tag{6}$$
If $u(x)$ is a minimizing function of the integral in (2), it must satisfy (6) for any $v(x)$ in a class of variation functions. If both $u(x)$ and $v(x)$ are smooth (say, of class $C^{2n}$), one can integrate (6) by parts $n$ times to obtain
$$\begin{aligned}
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^{i}u}{dx^{i}}\frac{d^{j}v}{dx^{j}}\, dx
&= \left[\sum_{k=1}^{n}\sum_{j=k}^{n}\sum_{i=0}^{n} (-1)^{k-1}\frac{d^{k-1}}{dx^{k-1}}\Big(a_{ij}\frac{d^{i}u}{dx^{i}}\Big)\frac{d^{j-k}v}{dx^{j-k}}\right]_a^b
+ \int_a^b \sum_{i,j=0}^{n} (-1)^{j}\frac{d^{j}}{dx^{j}}\Big(a_{ij}\frac{d^{i}u}{dx^{i}}\Big) v\, dx \\
&= \left[\sum_{l=0}^{n-1}\sum_{k=1}^{n-l}\sum_{i=0}^{n} (-1)^{k-1}\frac{d^{k-1}}{dx^{k-1}}\Big(a_{i,k+l}\frac{d^{i}u}{dx^{i}}\Big)\frac{d^{l}v}{dx^{l}}\right]_a^b
+ \int_a^b \sum_{i,j=0}^{n} (-1)^{j}\frac{d^{j}}{dx^{j}}\Big(a_{ij}\frac{d^{i}u}{dx^{i}}\Big) v\, dx \\
&= \left[\sum_{l=1}^{n}\sum_{k=0}^{n-l}\sum_{i=0}^{n} (-1)^{k}\frac{d^{k}}{dx^{k}}\Big(a_{i,k+l}\frac{d^{i}u}{dx^{i}}\Big)\frac{d^{l-1}v}{dx^{l-1}}\right]_a^b
+ \int_a^b \sum_{i,j=0}^{n} (-1)^{j}\frac{d^{j}}{dx^{j}}\Big(a_{ij}\frac{d^{i}u}{dx^{i}}\Big) v\, dx \\
&= 0.
\end{aligned}$$
Here a change of variables $j = k + l$ for the summation indices has been used. By a standard argument [2, 3] in calculus of variations, one can derive the following Euler–Lagrange equation which must be satisfied by a minimizing function $u(x)$ at the points where $u(x)$ is of $C^{2n}$:
$$\sum_{i,j=0}^{n} (-1)^{j}\frac{d^{j}}{dx^{j}}\Big(a_{ij}\frac{d^{i}u}{dx^{i}}\Big) = 0. \tag{7}$$
Furthermore, at a point where $u(x)$ is not of $C^{2n}$, the Weierstrass–Erdmann corner condition must hold, which states that
$$\sum_{k=0}^{n-l}\sum_{i=0}^{n} (-1)^{k}\frac{d^{k}}{dx^{k}}\Big(a_{i,k+l}\frac{d^{i}u}{dx^{i}}\Big), \qquad l = 1, \dots, n,$$
must be continuous.
A well-known pointwise necessary condition for (2) to hold is the Legendre condition
$$a_{nn}(x) \ge 0 \qquad \forall x \in [a, b]. \tag{8}$$
Indeed, if $a_{nn}(x_0) < 0$ for some $x_0 \in (a, b)$, one can construct an admissible function $u(x)$ that has a non-empty support contained in a neighborhood of $x_0$, and that is so oscillatory that (2) is violated for this $u(x)$. The conclusion that (8) must hold at the end points follows from the continuity of $a_{nn}$. In this work, we assume that the following strengthened Legendre condition holds:
$$a_{nn}(x) > 0 \qquad \forall x \in [a, b]. \tag{9}$$
3. A System of Differential Equations
Of central importance to the solution of (2) is the solution of a system of first order differential equations. Via this system of equations, the left-hand side of (2) can be written in a form whose positiveness can be determined unambiguously by the solution of the system. To be determined is an $n \times n$ matrix function $\mathbf{Y}(x)$, whose elements $y_{ij}(x)$, $i, j = 1, \dots, n$, satisfy the following system of ordinary differential equations and initial conditions
$$\frac{dy_{ij}}{dx} = a_{i-1,j-1} - y_{i,j-1} - y_{i-1,j} - \frac{(a_{i-1,n} - y_{in})(a_{j-1,n} - y_{jn})}{a_{nn}}, \qquad i, j = 1, \dots, n, \tag{10}$$
$$y_{ij}(a) = 0. \tag{11}$$
The notation in (10) is so chosen that
$$y_{i0} = y_{0i} = 0, \qquad i = 1, \dots, n. \tag{12}$$
It follows from (3) that $\mathbf{Y}$ is symmetric:
$$y_{ij} = y_{ji}. \tag{13}$$
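As a rough illustration of how (10)–(13) can be handled with a standard ODE scheme, the sketch below integrates the system with a classical fourth-order Runge–Kutta method written in plain Python. It is only a sketch under stated assumptions: the function names, the fixed step count and the test coefficients are hypothetical choices, not taken from the paper. For $n = 1$ with $a_{00} = a_{11} = 1$ and $a_{01} = 0$, equation (10) reduces to $y' = 1 - y^2$, whose solution with $y(0) = 0$ is $\tanh x$, which gives a closed form to compare against.

```python
import math

def riccati_rhs(x, Y, a, n):
    """Right-hand side of (10); row and column 0 of Y stay zero, as in (12)."""
    A = a(x)  # (n+1) x (n+1) coefficients a_ij(x), indices 0..n
    dY = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            dY[i][j] = (A[i - 1][j - 1] - Y[i][j - 1] - Y[i - 1][j]
                        - (A[i - 1][n] - Y[i][n]) * (A[j - 1][n] - Y[j][n]) / A[n][n])
    return dY

def shifted(Y, K, h):
    """Return Y + h*K entrywise."""
    return [[y + h * k for y, k in zip(ry, rk)] for ry, rk in zip(Y, K)]

def solve_riccati(a, n, x0, x1, steps=2000):
    """Fourth-order Runge-Kutta from the initial condition (11), Y(x0) = 0."""
    h = (x1 - x0) / steps
    Y = [[0.0] * (n + 1) for _ in range(n + 1)]
    for s in range(steps):
        x = x0 + s * h
        k1 = riccati_rhs(x, Y, a, n)
        k2 = riccati_rhs(x + h / 2, shifted(Y, k1, h / 2), a, n)
        k3 = riccati_rhs(x + h / 2, shifted(Y, k2, h / 2), a, n)
        k4 = riccati_rhs(x + h, shifted(Y, k3, h), a, n)
        # Y <- Y + (h/6) (k1 + 2 k2 + 2 k3 + k4), accumulated term by term
        for K, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            Y = shifted(Y, K, w * h / 6)
    return Y

# n = 1 with a_00 = a_11 = 1 and a_01 = 0: (10) reduces to y' = 1 - y^2.
Y = solve_riccati(lambda x: [[1.0, 0.0], [0.0, 1.0]], 1, 0.0, 1.0)
print(Y[1][1], math.tanh(1.0))
```

The matrix is stored with an extra zero row and column so that the index convention (12) can be used verbatim in the right-hand side. A fixed-step scheme gives no warning when the solution blows up, so an adaptive library solver would be the safer choice near a singular point.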
[Footnote: The classical Legendre condition pertains to (2) for $n = 1$.]
The initial value problem (10), (11) for a system of differential equations is introduced to solve the integral inequality (2). Through this system of differential equations, the left-hand side of (2) can be integrated by parts in such a way that the resulting integrand is the square of a function which can be made to vanish for a certain choice of the admissible function. Whether (2) holds is then determined by the boundary terms, which are further related to the solutions of this initial value problem.
The system of differential equations (10) is identified as the matrix Riccati equation in control theory where the coefficients $a_{ij}$ are often taken to be constant with $a_{in} = 0$, $i = 0, \dots, n - 1$. The solution of the Riccati equation has been widely studied [7–10].
The general theory for the solution of the initial value problem (10) and (11) is well developed. See, for example, Ince [11]. It can be shown that under condition (9), a Lipschitz condition is satisfied in a neighborhood of the initial point. A unique continuous solution of (10) and (11) then exists in the neighborhood. This neighborhood, however, may or may not cover the entire interval $[a, b]$. The solution may become unbounded as $x$ approaches some $c \le b$. It will be shown below that in this latter case inequality (2) is violated for some admissible function.
The system of differential equations (10) is nonlinear. It is known in control theory, as shown below, that this system can be related to a $2n$th order linear ordinary differential equation, which is identified as a generalization of the classical Jacobi equation. The behavior of the solution to this equation is well understood. It is found that if the solution of (10) and (11) becomes unbounded at a point in $(a, b]$, then the $2n$th order equation has a solution that vanishes at that point together with its first $n - 1$ derivatives.
In the next section, we first consider the case where the solution of the initial value problem is bounded in $[a, b]$.
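4. Bounded Solution
When the solution of (10) and (11) is bounded on $[a, b]$, whether (2) holds for all $u \in \mathcal{U}$ is determined completely by the values of the solution and its derivatives at $b$. This will be proved by using the following lemma, which will also be utilized for further development.
LEMMA 1. If the initial value problem (10) and (11) has a solution $\mathbf{Y}(x)$ that is of $C^{1}$ on $[a, c]$ for some $c \in (a, b]$, then for any $u \in \mathcal{U}$ the following equality holds
$$\int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}}\, dx
= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(c)\frac{d^{i}u}{dx^{i}}(c)\frac{d^{j}u}{dx^{j}}(c)
+ \int_a^c a_{nn}\left(\frac{d^{n}u}{dx^{n}} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^{i}u}{dx^{i}}\right)^{2} dx. \tag{14}$$
Proof. Let the solution $y_{ij}(x)$ with the required properties be given. By (3), (12), (10), (11) and (9), we find that
$$\begin{aligned}
\int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}}\, dx
&= \int_a^c \left\{\sum_{i,j=0}^{n-1} a_{ij}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}} + \sum_{i=0}^{n-1} 2a_{in}\frac{d^{i}u}{dx^{i}}\frac{d^{n}u}{dx^{n}} + a_{nn}\frac{d^{n}u}{dx^{n}}\frac{d^{n}u}{dx^{n}}\right\} dx \\
&= \int_a^c \Bigg\{\sum_{i,j=0}^{n-1}\left[\left(a_{ij} - y_{i+1,j} - y_{i,j+1} - \frac{(a_{in} - y_{i+1,n})(a_{jn} - y_{j+1,n})}{a_{nn}}\right)\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}} + 2y_{i+1,j+1}\frac{d^{i}u}{dx^{i}}\frac{d^{j+1}u}{dx^{j+1}}\right] \\
&\qquad\quad + a_{nn}\frac{d^{n}u}{dx^{n}}\frac{d^{n}u}{dx^{n}} + 2\sum_{i=0}^{n-1}(a_{in} - y_{i+1,n})\frac{d^{i}u}{dx^{i}}\frac{d^{n}u}{dx^{n}} + \sum_{i,j=0}^{n-1}\frac{(a_{in} - y_{i+1,n})(a_{jn} - y_{j+1,n})}{a_{nn}}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}}\Bigg\}\, dx \\
&= \int_a^c \left\{\sum_{i,j=0}^{n-1}\left[\frac{dy_{i+1,j+1}}{dx}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}} + 2y_{i+1,j+1}\frac{d^{i}u}{dx^{i}}\frac{d^{j+1}u}{dx^{j+1}}\right] + a_{nn}\left(\frac{d^{n}u}{dx^{n}} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^{i}u}{dx^{i}}\right)^{2}\right\} dx \\
&= \int_a^c \left\{\frac{d}{dx}\left[\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^{i}u}{dx^{i}}\frac{d^{j}u}{dx^{j}}\right] + a_{nn}\left(\frac{d^{n}u}{dx^{n}} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^{i}u}{dx^{i}}\right)^{2}\right\} dx \\
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(c)\frac{d^{i}u}{dx^{i}}(c)\frac{d^{j}u}{dx^{j}}(c) + \int_a^c a_{nn}\left(\frac{d^{n}u}{dx^{n}} + \sum_{i=0}^{n-1}\frac{a_{in} - y_{i+1,n}}{a_{nn}}\frac{d^{i}u}{dx^{i}}\right)^{2} dx. \qquad\Box
\end{aligned}$$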
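Identity (14) can be checked numerically in the simplest case. The sketch below is an illustration under assumptions that are not in the paper: it takes $n = 1$, $a_{00} = a_{11} = 1$, $a_{01} = 0$ on $[0, c]$, for which (10) and (11) give $y_{11}(x) = \tanh x$, picks $u(x) = \cos x$ as an arbitrary smooth test function, and evaluates both sides of (14) with a composite Simpson rule. The helper name `simpson` is hypothetical.

```python
import math

def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

c = 1.0
u = math.cos                      # arbitrary smooth test function
du = lambda x: -math.sin(x)       # its derivative
y = math.tanh                     # y_11 for a_00 = a_11 = 1, a_01 = 0, y_11(0) = 0

# Left-hand side of (14): integral of a_00 u^2 + 2 a_01 u u' + a_11 u'^2.
lhs = simpson(lambda x: u(x) ** 2 + du(x) ** 2, 0.0, c)

# Right-hand side of (14): boundary term plus integral of a_11 (u' + (a_01 - y_11)/a_11 u)^2.
rhs = y(c) * u(c) ** 2 + simpson(lambda x: (du(x) - y(x) * u(x)) ** 2, 0.0, c)

print(lhs, rhs)
```

The two printed values should agree to quadrature accuracy. The agreement does not depend on the choice of $u$, which is the point of the lemma.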
Lemma 1 and inequality (9) readily render the following sufficient condition for (2) to hold. THEOREM 1. If the initial value problem (10) and (11) has a solution Y(x) that is of C 1 on [a, b], and if the matrix Y(b) is positive semi-definite, then inequality (2) holds for all u ∈ U. In Theorem 1, the condition that a C 1 solution of (10) and (11) exists on [a, b] is crucial. For a particular problem, this condition may not be satisfied, as demonstrated by the example in Section 6. This case will be treated in the next section. In the next theorem, we shall show that when the above-mentioned condition is satisfied, the remaining conditions in Theorem 1 are actually necessary for (2) to hold.
THEOREM 2. If the initial value problem (10) and (11) has a solution $\mathbf{Y}(x)$ that is of class $C^1$ on $[a, b]$, and if the matrix $\mathbf{Y}(b)$ is not positive semi-definite, then inequality (2) does not hold for some $u \in U$.

Proof. By the given conditions, there are $v_i \in \mathbb{R}$, $i = 1, \ldots, n$, such that
\[
\sum_{i,j=1}^{n} y_{ij}(b)\,v_i v_j < 0. \tag{15}
\]
Consider the following initial value problem for $u(x)$:
\[
\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\,\frac{d^i u}{dx^i} = 0, \tag{16}
\]
\[
\frac{d^i u}{dx^i}(b) = v_{i+1}, \qquad i = 0, 1, \ldots, n-1. \tag{17}
\]
By the theory of linear ordinary differential equations, this initial value problem has a unique solution $u(x)$ on $[a, b]$. For this $u(x)$, we have, with the help of (14),
\[
\begin{aligned}
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(b)\frac{d^i u}{dx^i}(b)\frac{d^j u}{dx^j}(b)
 + \int_a^b a_{nn}\left[\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\right]^2 dx \\
&= \sum_{i,j=0}^{n-1} y_{i+1,j+1}(b)\,v_{i+1}v_{j+1} \\
&< 0.
\end{aligned}
\]
The last two steps follow from (17), (16) and (15). $\Box$
5. Unbounded Solution

In this section we consider the case where the solution of the initial value problem (10) and (11) becomes unbounded as $x$ approaches some $c \in (a, b]$. Theorems 1 and 2 are inapplicable in this case. We shall show that it is possible to construct a function $u \in U$ for which inequality (2) is violated. To this end, we first relate the system of first order nonlinear differential equations (10) to a sequence of linear differential equations.

Let a solution $\mathbf{Y}(x)$ of (10) and (11) that is of class $C^1$ on $[a, c)$ be given. Consider the following $n$th order linear differential equation for $u$:
\[
\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\,\frac{d^i u}{dx^i} = 0. \tag{18}
\]
It is noted that equation (18) is identical with (16), and that a solution $u(x)$ of (18) makes the last integral in (14) vanish. By the theory of linear ordinary differential equations, all solutions of (18) form an $n$-dimensional linear space. In this work, we shall assume that these solutions are of class $C^{2n}$.

LEMMA 2. A solution $u(x)$ of (18) satisfies
\[
\frac{d}{dx}\left(\sum_{i=0}^{n-1} y_{i+1,j}\frac{d^i u}{dx^i}\right)
= \sum_{i=0}^{n} a_{i,j-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,j-1}\frac{d^i u}{dx^i},
\qquad j = 1, \ldots, n. \tag{19}
\]
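Proof. By using (10), (18), (12), (13) and (3), we find that
\[
\begin{aligned}
\frac{d}{dx}\left(\sum_{i=0}^{n-1} y_{i+1,j}\frac{d^i u}{dx^i}\right)
&= \sum_{i=0}^{n-1} y_{i+1,j}\frac{d^{i+1} u}{dx^{i+1}}
 + \sum_{i=0}^{n-1}\left[a_{i,j-1} - y_{i+1,j-1} - y_{ij}
 - \frac{(a_{in}-y_{i+1,n})(a_{j-1,n}-y_{jn})}{a_{nn}}\right]\frac{d^i u}{dx^i} \\
&= \sum_{i=1}^{n} y_{ij}\frac{d^i u}{dx^i}
 + \sum_{i=0}^{n-1}\left(a_{i,j-1} - y_{i+1,j-1} - y_{ij}\right)\frac{d^i u}{dx^i}
 + \left(a_{j-1,n} - y_{jn}\right)\frac{d^n u}{dx^n} \\
&= \sum_{i=0}^{n-1}\left(a_{i,j-1} - y_{i+1,j-1}\right)\frac{d^i u}{dx^i} + a_{j-1,n}\frac{d^n u}{dx^n} \\
&= \sum_{i=0}^{n} a_{i,j-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,j-1}\frac{d^i u}{dx^i}. \qquad\Box
\end{aligned}
\]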
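PROPOSITION 1. A solution $u(x)$ of (18) satisfies
\[
\sum_{j=0}^{l} (-1)^j \frac{d^j}{dx^j}\left(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\right)
- \sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i} = 0,
\qquad l = 0, 1, \ldots, n. \tag{20}
\]

Proof. Equation (18) can be rewritten as
\[
\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n}\frac{d^i u}{dx^i} = 0. \tag{21}
\]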
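Taking the derivative of (21) $l$ times and using (19) repeatedly, we find that
\[
\begin{aligned}
&\frac{d^l}{dx^l}\left(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n}\frac{d^i u}{dx^i}\right) \\
&\quad= \frac{d^l}{dx^l}\left(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i}\right)
 - \frac{d^{l-1}}{dx^{l-1}}\left(\sum_{i=0}^{n} a_{i,n-1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-1}\frac{d^i u}{dx^i}\right) \\
&\quad= \frac{d^l}{dx^l}\left(\sum_{i=0}^{n} a_{in}\frac{d^i u}{dx^i}\right)
 - \frac{d^{l-1}}{dx^{l-1}}\left(\sum_{i=0}^{n} a_{i,n-1}\frac{d^i u}{dx^i}\right)
 + \frac{d^{l-2}}{dx^{l-2}}\left(\sum_{i=0}^{n} a_{i,n-2}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-2}\frac{d^i u}{dx^i}\right) \\
&\quad= \cdots \\
&\quad= \sum_{j=2}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\left(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\right)
 + (-1)^{l-1}\frac{d}{dx}\left(\sum_{i=0}^{n} a_{i,n-l+1}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-l+1}\frac{d^i u}{dx^i}\right) \\
&\quad= \sum_{j=1}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\left(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\right)
 + (-1)^{l}\left(\sum_{i=0}^{n} a_{i,n-l}\frac{d^i u}{dx^i} - \sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i}\right) \\
&\quad= \sum_{j=0}^{l} (-1)^{l-j}\frac{d^j}{dx^j}\left(\sum_{i=0}^{n} a_{i,n-l+j}\frac{d^i u}{dx^i}\right)
 + (-1)^{l+1}\sum_{i=0}^{n-1} y_{i+1,n-l}\frac{d^i u}{dx^i} \\
&\quad= 0. \qquad\Box
\end{aligned}
\]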
An important consequence of Proposition 1, obtained by taking $l = n$ in equation (20) and using (12), is

COROLLARY 1. A solution $u(x)$ of (18) satisfies
\[
\sum_{i,j=0}^{n} (-1)^j \frac{d^j}{dx^j}\left(a_{ij}\frac{d^i u}{dx^i}\right) = 0. \tag{22}
\]

Equation (22) is found to be a generalization of the classical Jacobi equation, which was originally derived for the variational problem involving the first order derivative of the admissible function. It is also noted that (22) is identical with the Euler–Lagrange equation (7) since the functional in (2) is quadratic. The theory for such an equation is well developed. Among other things, since $a_{ij} \in C^n([a, b], \mathbb{R})$, all solutions of (22), and therefore all solutions of (18), are bounded on $[a, b]$.
PROPOSITION 2. If the solution $\mathbf{Y}(x)$ of (10) and (11) is of class $C^1$ on $[a, c)$, and is unbounded at $x = c$, then there exists a nontrivial solution $\hat{u}(x)$ of (18) on $[a, c]$ that satisfies
\[
\frac{d^i \hat{u}}{dx^i}(c) = 0, \qquad i = 0, 1, \ldots, n-1. \tag{23}
\]

Proof. The first $n$ equations of (20) can be rewritten as a vector equation
\[
\mathbf{Y}(x)\,\mathbf{w}(x) = \mathbf{g}(x), \tag{24}
\]
where the $n \times n$ matrix function $\mathbf{Y}(x)$ is the solution of (10) and (11) under consideration, and $\mathbf{w}(x)$ and $\mathbf{g}(x)$ are $n$-dimensional vector functions whose components are given, respectively, by
\[
w_i(x) \equiv \frac{d^{i-1} u(x)}{dx^{i-1}}, \qquad i = 1, \ldots, n, \tag{25}
\]
\[
g_i(x) \equiv \sum_{j=0}^{n-i} (-1)^j \frac{d^j}{dx^j}\left[\sum_{k=0}^{n} a_{k,i+j}(x)\frac{d^k u(x)}{dx^k}\right], \qquad i = 1, \ldots, n. \tag{26}
\]
Equation (18) has $n$ linearly independent solutions $u^{(1)}(x), u^{(2)}(x), \ldots, u^{(n)}(x)$ on $[a, c)$. By Proposition 1, these solutions also satisfy (20), and therefore (24). Let $\mathbf{w}^{(1)}(x), \ldots, \mathbf{w}^{(n)}(x)$ and $\mathbf{g}^{(1)}(x), \ldots, \mathbf{g}^{(n)}(x)$ be the corresponding vector functions defined through (25) and (26), respectively. Then each pair of functions $\mathbf{w}^{(l)}(x)$ and $\mathbf{g}^{(l)}(x)$, $l = 1, \ldots, n$, satisfies (24). Since $\mathbf{Y}(x)$ is unbounded at $x = c$, at least one eigenvalue of $\mathbf{Y}(x)$ is unbounded at $x = c$. It then follows from (24) that the vectors $\mathbf{w}^{(l)}(c)$, $l = 1, \ldots, n$, are all orthogonal to the associated eigenvector, and therefore are linearly dependent. Hence, there exist constants $c_1, c_2, \ldots, c_n$, not all zero, such that
\[
\sum_{l=1}^{n} c_l\,\mathbf{w}^{(l)}(c) = \mathbf{0}.
\]
Now define
\[
\hat{u}(x) \equiv \sum_{l=1}^{n} c_l\,u^{(l)}(x). \tag{27}
\]
This function satisfies (18) and (23). Moreover, $\hat{u}(x)$ is nontrivial since the functions $u^{(l)}$, $l = 1, \ldots, n$, are linearly independent on $[a, c)$. $\Box$

It is noted that Proposition 2 is related to the notion of conjugate point in the classical theory of calculus of variations [2, 3]. For the case $n = 1$, the point $x = c$ is said to be conjugate to $x = a$ if the Jacobi equation has a nontrivial solution that vanishes at these two points.
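A small worked case may help fix ideas. It is an illustration only, not part of the original argument, and it assumes that for $n = 1$ the problem (10) and (11) reduces to the scalar Riccati equation shown. Take $a = 0$, $a_{00} = -1$, $a_{01} = 0$, $a_{11} = 1$:
\[
\frac{dy_{11}}{dx} = -1 - y_{11}^2, \quad y_{11}(0) = 0
\quad\Longrightarrow\quad y_{11}(x) = -\tan x, \ \text{unbounded at } c = \tfrac{\pi}{2}.
\]
Equation (18) becomes $u' + u\tan x = 0$, whose solutions are $u = C\cos x$; the choice $\hat{u} = \cos x$ is nontrivial and satisfies $\hat{u}(c) = 0$, in agreement with (23). Equation (22) becomes
\[
a_{00}\,u - \frac{d}{dx}\left(a_{11}\frac{du}{dx}\right) = -u - u'' = 0 ,
\]
that is, the Jacobi equation $u'' + u = 0$, which $\cos x$ indeed satisfies. One also checks directly that
\[
\int_0^{\pi/2}\left(\hat{u}'^2 - \hat{u}^2\right) dx = \int_0^{\pi/2}\left(\sin^2 x - \cos^2 x\right) dx = 0 .
\]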
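PROPOSITION 3. If the solution $\mathbf{Y}(x)$ of (10) and (11) is of class $C^1$ on $[a, c)$, and is unbounded at $x = c$, then there exists a nontrivial solution $\hat{u}(x)$ of (18) on $[a, c]$ such that
\[
\int_a^c \sum_{i,j=0}^{n} a_{ij}\frac{d^i \hat{u}}{dx^i}\frac{d^j \hat{u}}{dx^j}\,dx = 0.
\]

Proof. Let $\hat{u}(x)$ be given by (27). By (14), (18), (25), and (24), we find that
\[
\begin{aligned}
\int_a^c \sum_{i,j=0}^{n} a_{ij}(x)\frac{d^i \hat{u}(x)}{dx^i}\frac{d^j \hat{u}(x)}{dx^j}\,dx
&= \lim_{x\to c}\sum_{i,j=0}^{n-1} y_{i+1,j+1}(x)\frac{d^i \hat{u}(x)}{dx^i}\frac{d^j \hat{u}(x)}{dx^j}
 + \int_a^c a_{nn}(x)\left[\frac{d^n \hat{u}(x)}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}(x)-y_{i+1,n}(x)}{a_{nn}(x)}\frac{d^i \hat{u}(x)}{dx^i}\right]^2 dx \\
&= \lim_{x\to c}\sum_{i,j=0}^{n-1} y_{i+1,j+1}(x)\frac{d^i \hat{u}(x)}{dx^i}\frac{d^j \hat{u}(x)}{dx^j} \\
&= \lim_{x\to c}\mathbf{Y}(x)\hat{\mathbf{w}}(x)\cdot\hat{\mathbf{w}}(x) \\
&= \hat{\mathbf{g}}(c)\cdot\hat{\mathbf{w}}(c),
\end{aligned}
\]
where $\hat{\mathbf{w}}(x)$ and $\hat{\mathbf{g}}(x)$ are defined by (25) and (26) through $\hat{u}(x)$. The desired conclusion then follows from Proposition 2, since $\hat{\mathbf{w}}(c) = \mathbf{0}$ by (23). $\Box$

We are now in a position to prove the main result of this section.

THEOREM 3. If the solution $\mathbf{Y}(x)$ of (10) and (11) is of class $C^1$ on $[a, c)$, and is unbounded at $x = c$ for some $c < b$, then inequality (2) does not hold for some $u \in U$.

Proof. Let $\hat{u}(x)$ be given by (27). Define
\[
\tilde{u}(x) \equiv
\begin{cases}
\hat{u}(x) & \text{for } a \le x \le c, \\
0 & \text{for } c < x \le b.
\end{cases}
\]
It follows from Proposition 3 that
\[
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i \tilde{u}}{dx^i}\frac{d^j \tilde{u}}{dx^j}\,dx = 0.
\]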
Suppose that (2) holds for all $u \in U$. Then we would have
\[
\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i \tilde{u}}{dx^i}\frac{d^j \tilde{u}}{dx^j}\,dx
\le \int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
\qquad \forall u \in U.
\]
This implies that $\tilde{u}$ is a minimizing function of the quadratic integral in (2), and therefore must satisfy, in addition to the Euler–Lagrange equation (7), the Weierstrass–Erdmann corner condition. This latter condition asserts that the expressions
\[
\sum_{j=0}^{n-k}\sum_{i=0}^{n} (-1)^j \frac{d^j}{dx^j}\left(a_{i,j+k}\frac{d^i \tilde{u}}{dx^i}\right), \qquad k = 1, \ldots, n,
\]
must be continuous at $x = c$. By the definition of $\tilde{u}(x)$, we would then have
\[
\left.\sum_{j=0}^{n-k}\sum_{i=0}^{n} (-1)^j \frac{d^j}{dx^j}\left(a_{i,j+k}\frac{d^i \hat{u}}{dx^i}\right)\right|_{x=c} = 0, \qquad k = 1, \ldots, n. \tag{28}
\]
By Corollary 1, the function $\hat{u}$ is a solution of the $2n$th order linear differential equation (22). The initial conditions (23) and (28) then imply that $\hat{u}(x)$ is identically zero on $[a, c]$, which is a contradiction. $\Box$

To complete the solution of the integral inequality (2), it remains to analyze the case where the solution $\mathbf{Y}(x)$ of (10) and (11) is of class $C^1$ on $[a, b)$, and is unbounded at the end point $x = b$. We shall show, in the following theorem, that a function $u \in U$ can be constructed for which (2) is violated.

THEOREM 4. If the solution $\mathbf{Y}(x)$ of (10) and (11) is of class $C^1$ on $[a, b)$, and is unbounded at $x = b$, then inequality (2) does not hold for some $u \in U$.

Proof. Let $\hat{u}(x)$ be given by (27) in Proposition 2, with $c$ therein being replaced by $b$. By Propositions 1 and 2, and Corollary 1, this $\hat{u}$ satisfies (18), (20), (22) and (23), again with $c$ being replaced by $b$. Define functions $\hat{g}_i(x)$, $i = 1, \ldots, n$, through (26) and $\hat{u}(x)$. The values $\hat{g}_i(b)$, $i = 1, \ldots, n$, cannot all be zero, because otherwise $\hat{u}(x)$ would be identically zero, by virtue of (22) and (23). Now define
\[
g(x) \equiv \sum_{i=0}^{n-1}\frac{\hat{g}_{i+1}(b)}{i!}(x-b)^i
\]
and
\[
u(x) \equiv \hat{u}(x) - \epsilon\,g(x), \tag{29}
\]
where $\epsilon$ is a small positive number. Obviously $u \in U$ and
\[
\frac{d^i g}{dx^i}(b) = \hat{g}_{i+1}(b), \qquad i = 0, 1, \ldots, n-1. \tag{30}
\]
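Furthermore, by (14), (29), (18), (20), (26), (23) and (30), we find that
\[
\begin{aligned}
&\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}\,dx
 - \epsilon^2\int_a^b \sum_{i,j=0}^{n} a_{ij}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}\,dx \\
&\quad= \lim_{x\to b}\left\{\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 + \int_a^x a_{nn}\left[\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i}\right]^2 dx\right\} \\
&\qquad - \epsilon^2\lim_{x\to b}\left\{\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}
 + \int_a^x a_{nn}\left[\frac{d^n g}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\frac{d^i g}{dx^i}\right]^2 dx\right\} \\
&\quad= \lim_{x\to b}\left(\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i u}{dx^i}\frac{d^j u}{dx^j}
 - \epsilon^2\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i g}{dx^i}\frac{d^j g}{dx^j}\right) \\
&\quad= \lim_{x\to b}\sum_{i,j=0}^{n-1} y_{i+1,j+1}\frac{d^i \hat{u}}{dx^i}\left(\frac{d^j \hat{u}}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\right) \\
&\quad= \lim_{x\to b}\sum_{j=0}^{n-1}\left[\sum_{k=0}^{n-j-1} (-1)^k\frac{d^k}{dx^k}\left(\sum_{i=0}^{n} a_{i,j+k+1}\frac{d^i \hat{u}}{dx^i}\right)\right]\left(\frac{d^j \hat{u}}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\right) \\
&\quad= \lim_{x\to b}\sum_{j=0}^{n-1}\hat{g}_{j+1}\left(\frac{d^j \hat{u}}{dx^j} - 2\epsilon\frac{d^j g}{dx^j}\right) \\
&\quad= \sum_{j=0}^{n-1}\hat{g}_{j+1}(b)\left(-2\epsilon\,\hat{g}_{j+1}(b)\right) \\
&\quad= -2\epsilon\,\bigl|\hat{\mathbf{g}}(b)\bigr|^2 .
\end{aligned}
\]
Here the two integrals over $[a, x]$ cancel, because (18) and (29) give $\frac{d^n u}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\frac{d^i u}{dx^i} = -\epsilon\Bigl(\frac{d^n g}{dx^n} + \sum_{i=0}^{n-1}\frac{a_{in}-y_{i+1,n}}{a_{nn}}\frac{d^i g}{dx^i}\Bigr)$. Since $\hat{\mathbf{g}}(b)$ is nonzero, the desired conclusion follows for sufficiently small $\epsilon$. $\Box$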
In summary, Theorems 1, 2, 3 and 4 provide a definite conclusion on the positivity of the quadratic integral in (2). The conclusion hinges upon the solution $\mathbf{Y}(x)$ of the initial value problem (10) and (11). For a practical problem, once the coefficients $a_{ij}(x)$ are given, this initial value problem can be solved numerically. If the solution $\mathbf{Y}(x)$ becomes unbounded at some $c \le b$, the inequality (2) does not hold for some $u \in U$. On the other hand, if the solution is bounded on $[a, b]$, whether (2) holds for all $u \in U$ depends on whether $\mathbf{Y}(x)$ is positive semi-definite at $x = b$.
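The numerical procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes that (10) has the component form appearing in the proof of Lemma 1, namely $dy_{i+1,j+1}/dx = a_{ij} - y_{i+1,j} - y_{i,j+1} - (a_{in}-y_{i+1,n})(a_{jn}-y_{j+1,n})/a_{nn}$ with $y_{i0} = y_{0j} = 0$ and $\mathbf{Y}(a) = \mathbf{0}$. The function names, the fixed-step Runge–Kutta scheme, the blow-up threshold and the restriction of the definiteness test to $n \le 2$ are all hypothetical choices made for brevity.

```python
import math

def riccati_rhs(x, Y, coeff, n):
    """Assumed component form of (10); Y[p][q] stores y_{p+1,q+1}."""
    a = coeff(x)  # (n+1) x (n+1) symmetric table of a_ij(x)
    dY = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            y_i1_j = Y[i][j - 1] if j >= 1 else 0.0  # y_{i+1,j}, zero when j = 0
            y_i_j1 = Y[i - 1][j] if i >= 1 else 0.0  # y_{i,j+1}, zero when i = 0
            dY[i][j] = (a[i][j] - y_i1_j - y_i_j1
                        - (a[i][n] - Y[i][n - 1]) * (a[j][n] - Y[j][n - 1]) / a[n][n])
    return dY

def shifted(Y, K, h):
    """Return Y + h*K entry by entry."""
    return [[y + h * k for y, k in zip(ry, rk)] for ry, rk in zip(Y, K)]

def integrate(coeff, n, a, b, steps=2000, blowup=1e6):
    """Classical RK4 from Y(a) = 0. Returns (Y, x_reached, bounded)."""
    h = (b - a) / steps
    Y = [[0.0] * n for _ in range(n)]
    x = a
    for _ in range(steps):
        k1 = riccati_rhs(x, Y, coeff, n)
        k2 = riccati_rhs(x + h / 2, shifted(Y, k1, h / 2), coeff, n)
        k3 = riccati_rhs(x + h / 2, shifted(Y, k2, h / 2), coeff, n)
        k4 = riccati_rhs(x + h, shifted(Y, k3, h), coeff, n)
        Y = [[Y[p][q] + h / 6 * (k1[p][q] + 2 * k2[p][q] + 2 * k3[p][q] + k4[p][q])
              for q in range(n)] for p in range(n)]
        x += h
        # Heuristic blow-up detection: stop once any entry is huge or not finite.
        if any(not math.isfinite(v) or abs(v) > blowup for row in Y for v in row):
            return Y, x, False
    return Y, x, True

def is_psd(Y, tol=1e-9):
    """Positive semi-definiteness via principal minors (n <= 2 only)."""
    n = len(Y)
    if n == 1:
        return Y[0][0] >= -tol
    if n == 2:
        det = Y[0][0] * Y[1][1] - Y[0][1] * Y[1][0]
        return Y[0][0] >= -tol and Y[1][1] >= -tol and det >= -tol
    raise NotImplementedError("use an eigenvalue routine for n > 2")

def inequality_holds(coeff, n, a, b):
    """Decision rule of Theorems 1-4: bounded on [a, b] and Y(b) >= 0."""
    Y, _, bounded = integrate(coeff, n, a, b)
    return bounded and is_psd(Y)

if __name__ == "__main__":
    toy = lambda x: [[-1.0, 0.0], [0.0, 1.0]]  # n = 1: a00 = -1, a01 = 0, a11 = 1
    print(inequality_holds(toy, 1, 0.0, 1.0))  # Y(1) = -tan(1) < 0
    print(integrate(toy, 1, 0.0, 2.0)[1:])     # stops near x = pi/2
```

For the toy coefficients the exact solution is $y_{11} = -\tan x$, so the sketch reports a bounded but negative $\mathbf{Y}(1)$ on $[0, 1]$ and a blow-up near $x = \pi/2$ on $[0, 2]$. A production code would use an adaptive stiff integrator and a proper eigenvalue test instead of the fixed step and threshold used here.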
6. An Example

As an illustrative example, we consider the inequality
\[
\int_0^1 \left[k^4 u^2 + 2k^4 x\,u u' + k^2(1-kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\right] dx \ge 0, \tag{31}
\]
where $k$ is a constant, and a prime denotes the derivative with respect to $x$. This is inequality (2) with
\[
n = 2, \quad a = 0, \quad b = 1, \quad a_{00} = k^4, \quad a_{01} = k^4 x, \quad a_{02} = 0, \quad a_{11} = k^2(1-kx)^2, \quad a_{12} = 3k^2 x, \quad a_{22} = 1.
\]
Not only do the theorems in the previous sections decide whether (31) holds for all $u \in U$, but the proofs of the theorems also provide a direct justification of the conclusion, as shown below. The initial value problem (10) and (11) takes the form
\[
\begin{cases}
y_{11}' = k^4 - y_{12}^2, \\
y_{22}' = k^2(1-kx)^2 - 2y_{12} - \left(3k^2 x - y_{22}\right)^2, \\
y_{12}' = k^4 x - y_{11} + y_{12}\left(3k^2 x - y_{22}\right), \\
y_{11}(0) = y_{22}(0) = y_{12}(0) = 0.
\end{cases}
\]
The solution is found to be
\[
y_{11} = k^4 x, \qquad y_{22} = \frac{k^2 x(1-2kx)}{1-kx}, \qquad y_{12} = 0. \tag{32}
\]
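The closed-form solution can be cross-checked numerically. The sketch below is an illustration only: it takes $y_{11} = k^4 x$, which is what the first and third equations of the system require when $y_{12} = 0$, and compares central-difference derivatives of the closed form with the right-hand sides of the three Riccati equations. The function names and the step size are hypothetical choices.

```python
def closed_form(k, x):
    """Candidate solution (y11, y22, y12) of the example's Riccati system."""
    return (k**4 * x,
            k**2 * x * (1 - 2 * k * x) / (1 - k * x),
            0.0)

def rhs(k, x, y11, y22, y12):
    """Right-hand sides of the three equations for y11', y22', y12'."""
    d11 = k**4 - y12**2
    d22 = k**2 * (1 - k * x)**2 - 2 * y12 - (3 * k**2 * x - y22)**2
    d12 = k**4 * x - y11 + y12 * (3 * k**2 * x - y22)
    return d11, d22, d12

def max_residual(k, xs, h=1e-5):
    """Largest |d/dx(closed form) - rhs| over the sample points xs."""
    worst = 0.0
    for x in xs:
        plus, minus = closed_form(k, x + h), closed_form(k, x - h)
        deriv = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
        exact = rhs(k, x, *closed_form(k, x))
        worst = max(worst, max(abs(d - e) for d, e in zip(deriv, exact)))
    return worst

if __name__ == "__main__":
    xs = [0.1 * i for i in range(1, 10)]
    print(max_residual(0.4, xs))         # small: the closed form satisfies the system
    print(closed_form(0.4, 1.0)[1])      # y22(1) > 0 for k = 0.4
    print(closed_form(0.75, 1.0)[1])     # y22(1) < 0 for k = 0.75
    print(closed_form(2.0, 0.4999)[1])   # large magnitude near the pole x = 1/k
```

The residual stays at the level of finite-difference error for sample values such as $k = 0.4$, while evaluating $y_{22}$ close to $x = 1/k$ for $k \ge 1$ shows the unbounded growth discussed next.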
The solution is bounded on $[0, 1]$ if $k \in (-\infty, 1)$. In this case, we have
\[
\mathbf{Y}(1) = \begin{pmatrix} k^4 & 0 \\ 0 & k^2(1-2k)/(1-k) \end{pmatrix}.
\]
By Theorems 1 and 2, inequality (31) holds when $k \in (-\infty, 1/2]$, and does not hold when $k \in (1/2, 1)$. Indeed, when $k \in (-\infty, 1/2]$, we have, by integrating by parts, that
\[
\begin{aligned}
&\int_0^1 \left[k^4 u^2 + 2k^4 x\,u u' + k^2(1-kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\right] dx \\
&\quad= k^4 u^2(1) + \frac{k^2 - 2k^3}{1-k}\,u'^2(1) + \int_0^1 \left(u'' + \frac{2k^2 x - k^3 x^2}{1-kx}\,u'\right)^2 dx \\
&\quad\ge 0.
\end{aligned}
\]
On the other hand, when $k \in (1/2, 1)$, we choose
\[
u = \exp\left(\tfrac{1}{2}k^2 - k + kx - \tfrac{1}{2}k^2 x^2\right) - 1,
\]
and arrive at
\[
\begin{aligned}
&\int_0^1 \left[k^4 u^2 + 2k^4 x\,u u' + k^2(1-kx)^2 u'^2 + 6k^2 x\,u' u'' + u''^2\right] dx \\
&\quad= \left[k^4\left(x - 3kx^2 + 2k^2 x^3\right)\exp\left(2\left(\tfrac{1}{2}k^2 - k + kx - \tfrac{1}{2}k^2 x^2\right)\right)\right]_0^1 \\
&\quad= k^4(1-k)(1-2k) < 0.
\end{aligned}
\]
Finally, when $k \in [1, \infty)$, the solution (32) is unbounded at $x = 1/k$. Theorems 3 and 4 assert that inequality (31) does not hold for some $u \in U$. Here we demonstrate it for $k = 1$. Taking in the left-hand side of (31),
\[
u = -\sqrt{e}\,(4 - 3x) + \exp\left(x - \tfrac{1}{2}x^2\right),
\]
we find that
\[
\begin{aligned}
&\int_0^1 \left[u^2 + 2x\,u u' + (1-x)^2 u'^2 + 6x\,u' u'' + u''^2\right] dx \\
&\quad= \int_0^1 \Big[e\left(25 - 66x + 36x^2\right) - \sqrt{e}\left(2 + 14x + 4x^2 - 6x^3\right)\exp\left(x - \tfrac{1}{2}x^2\right) \\
&\qquad\qquad + \left(2 - 2x - 4x^2 + 10x^3 - 4x^4\right)\exp^2\left(x - \tfrac{1}{2}x^2\right)\Big]\,dx \\
&\quad= \Big[e\left(25x - 33x^2 + 12x^3\right) - \sqrt{e}\left(2x + 6x^2\right)\exp\left(x - \tfrac{1}{2}x^2\right)
 + \left(2x - 3x^2 + 2x^3\right)\exp^2\left(x - \tfrac{1}{2}x^2\right)\Big]_0^1 \\
&\quad= -3e < 0.
\end{aligned}
\]

Acknowledgements

The author wishes to thank Professor A. Mielke for his critiques and comments on an earlier version of this work. The support of ONR grant 99PR08596 and of the Texas Institute for Intelligent Bio-Nano Materials and Structures for Aerospace Vehicles, NASA NCC-1-02038 is acknowledged.

References
1. Y.C. Chen and D. Haughton, Stability and bifurcation of inflation of elastic cylinders. Proc. Roy. Soc. London A 459 (2003) 137–156.
2. M.R. Hestenes, Calculus of Variations and Optimal Control Theory. Wiley (1966).
3. H. Sagan, Introduction to the Calculus of Variations. Dover Publications (1992).
4. R.E. Kalman, Contribution to the theory of optimal control. Bol. Soc. Mat. Mexicana 5 (1960) 102–119.
5. M. Andjelic, On a matrix Riccati equation of cooperative control. Internat. J. Control 23 (1976) 427–432.
6. B.D. Anderson and J. Moore, Optimal Control Linear Quadratic Methods. Prentice-Hall (1989).
7. W.T. Reid, Riccati Differential Equations. Academic Press (1965).
8. M. Razzaghi, Solution of the matrix Riccati equation in optimal control. Inform. Sci. 16 (1978) 61–73.
9. L. Jodar and E. Navarro, Closed analytical solution of Riccati type matrix differential equations. Indian J. Pure Appl. Math. 23 (1992) 185–187.
10. J. Nazarzadeh, M. Razzaghi and K.Y. Nikravesh, Solution of the matrix Riccati equation for the linear quadratic control problems. Math. Comput. Modelling 27 (1998) 51–55.
11. E.L. Ince, Ordinary Differential Equations. Dover Publications (1956).
Principal Compliance and Robust Optimal Design

ELENA CHERKAEV and ANDREJ CHERKAEV
Department of Mathematics, University of Utah, U.S.A. E-mail: [email protected], [email protected]

Received 1 November 2002; in revised form 17 September 2003

Abstract. The paper addresses a problem of robust optimal design of elastic structures when the loading is unknown and only an integral constraint for the loading is given. We propose to minimize the principal compliance of the domain equal to the maximum of the stored energy over all admissible loadings. The principal compliance is the maximal compliance under the extreme, worst possible loading. The robust optimal design is formulated as a min–max problem for the energy stored in the structure. The maximum of the energy is chosen over the constrained class of loadings, while the minimum is taken over the design parameters. It is shown that the problem for the extreme loading can be reduced to an elasticity problem with mixed nonlinear boundary conditions; the last problem may have multiple solutions. The optimization with respect to the designed structure takes into account the possible multiplicity of extreme loadings and divides resources (reinforced material) to equally resist all of them. Continuous change of the loading constraint causes bifurcation of the solution of the optimization problem. It is shown that an invariance of the constraints under a symmetry transformation leads to a symmetry of the optimal design. Examples of optimal design are investigated; symmetries and bifurcations of the solutions are revealed.

Mathematics Subject Classifications (2000): 35B27, 35J50, 35P15, 49K20, 65K10, 74P05.

Key words: structural design, robustness, bifurcation, Steklov eigenvalues, minimax, constrained optimization.
This paper is dedicated to the memory of Professor Clifford Truesdell.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 169–196. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

A typical structural optimization problem asks for a material layout in the stiffest design. The stiffness is defined as an elastic energy of a domain loaded by external boundary forces (loading). If the loading is fixed and known, an optimal structure adapts itself to resist the loading. However, the optimal designs are usually unstable to variations of the forces. This instability is a direct result of optimization: to best resist the given loading, all the resistivity of the structure is concentrated against a certain direction, thus decreasing its ability to sustain loadings in other directions [7, 8, 20]. For example, consider a problem of optimal design of a structure of a cube of maximal stiffness made from an elastic material and void; assume that the cube is supported on its lower side and loaded by a homogeneous vertical force
on its upper side. It is easy to demonstrate, that the optimal structure is a periodic array of unconnected infinitely thin cylindrical rods. Obviously, this design does not resist any other but the vertical loading. The instability to variations of the loading is not a defect of an optimization procedure – the structure does exactly what it is asked to do; it is a defect of the modeling. In order to find a more stable robust solution, one needs to optimize a more general robust stiffness-like functional that characterizes an elastic body loaded by unspecified (or partly unspecified) forces on its boundary, as it happens with most engineering constructions. To avoid this vulnerability of the optimally designed structures to variations of loading, we propose to minimize the principal compliance of the domain equal to the maximum of the stored energy over all admissible loadings. The principal compliance is the maximal compliance under the extreme, worst possible loading. We formulate the robust optimal design problem as a min–max problem for the energy stored in the domain, where the inner maximum is taken over the set of admissible loadings and the minimum is chosen over the design parameters characterizing the structure. This formulation corresponds to physical situations when biological materials are created and engineering constructions are designed to withstand loadings that are not known in advance. This approach to the structural optimization was discussed in our papers [9, 12] and (for the finite-dimensional model) in the papers [18, 19]. Various aspects of the optimal design against partly unknown loadings were studied in [1, 5, 8, 21, 25– 27, 31, 32, 37], see also references therein. In some cases, the minimax design problem, where the designed structure is chosen to minimize maximal compliance of the domain, can be formulated as minimization of the largest eigenvalue of an operator. 
The minimization of dominant eigenvalues was considered in a setting of the inverse conductivity problem in [11, 13]. The multiplicity of optimal design that we find in the minimax loading-versus-design problem is similar to multiplicity of stationary solutions investigated in the engineering problems of the optimal design against buckling [14, 34] and vibration [30, 28, 33, 22]. The structure of this paper is as follows. In Section 2, we introduce an integral quantity of an elastic domain, the principal compliance, equal to the response of the domain to the worst (extremal) boundary loading from the given class of loadings; this quantity is a basic integral characteristic of the domain similar to the capacity, the eigenfrequency, or the volume. The principal compliance is a solution of a variational problem, which can be reduced to an eigenvalue problem or to a bifurcation problem. Examples of various constraints for admissible loadings and resulting variational problems are considered in Section 3. Particularly, the variational problem for the principal compliance with a quadratically constrained class of loadings is reduced to the Steklov eigenvalue problem. The principal compliance of the domain in this case is a reciprocal of the principal Steklov eigenvalue. We also consider the constraints of the Lp norm, p > 1, of the loading and inhomogeneous constraints and show that the Lp norm constraints result in a nonlinear boundary
value problem. The $L_1$-norm constraint on the loading leads to a variational problem whose solution is not classical but a distribution: the optimal loading turns out to be a $\delta$-function or, physically speaking, a concentrated loading (if such a loading does not lead to infinite energy). Section 4 considers robust structural optimization which is formulated as a problem of minimization of the principal compliance. The optimal design takes into account the multiplicity of stationary solutions for extreme (most dangerous) loadings; typically, the optimal structure equally resists several extreme loadings. The set of the extreme loadings depends on the constraints of the problem. Continuous change of the constraints leads to modification of the set of extreme loadings; the optimal structure changes in response. This corresponds to bifurcation of the solution of the optimization problem. Another characteristic feature of the optimization problem is the symmetry of its solution. We show that the invariance of the set of the constraints for the admissible loadings, together with the corresponding symmetry of the domain, leads to the symmetry of the optimally designed structure. Section 5 contains two examples of problems of structural design for uncertain loadings. One example is provided by the problem of designing the optimally supported beam loaded by an unknown loading with fixed mean value. The second example is a problem of determining the optimal structure of a composite strip loaded by a force which deviates from the normal in an unknown direction. The force is assumed to have a prescribed normal component and an additional component which is arbitrarily directed and is unknown.

2. The Principal Compliance of a Domain

2.1. PROBLEM, EQUATIONS, CONSTRAINTS

2.1.1. Equations

Consider a domain $\Omega$ with the boundary $\partial\Omega = \partial_0 \cup \partial$ filled with a linear anisotropic elastic material, loaded on its boundary component $\partial$ by a force $f$, and fixed on the boundary component $\partial_0$. The elastic equilibrium of such a body is described by a system (see, for instance, [35]):
\[
\nabla\cdot\sigma = 0 \ \text{in } \Omega, \qquad \sigma = C : \epsilon, \qquad \sigma = \sigma^T, \qquad \epsilon(w) = \tfrac{1}{2}\left(\nabla w + (\nabla w)^T\right). \tag{1}
\]
Here $C = C(x)$ is the fourth-order stiffness tensor of an anisotropic inhomogeneous material, $w = w(x)$ is the displacement vector, $\epsilon$ is the strain tensor, $\sigma$ is the stress tensor, and $(:)$ represents contraction of two indices. Thus,
\[
\epsilon : \sigma = \sum_{i,j}\epsilon_{ij}\sigma_{ji}, \qquad (C : \epsilon)_{ij} = \sum_{k,l} C_{ijkl}\,\epsilon_{lk}.
\]
Equation (1) is supplemented with the boundary conditions
\[
\sigma\cdot n = f \ \text{on } \partial, \qquad w = 0 \ \text{on } \partial_0, \tag{2}
\]
where $n$ is the normal to the boundary $\partial$. These equations are the first variation conditions of the variational problem
\[
J(C, f) = -\min_{w:\,w|_{\partial_0}=0}\left[\int_\Omega W(C, \epsilon(w))\,dx - \int_{\partial} w\cdot f\,ds\right]
= \max_{w:\,w|_{\partial_0}=0}\left[\int_{\partial} w\cdot f\,ds - \int_\Omega W(C, \epsilon(w))\,dx\right], \tag{3}
\]
where $W$ is the density of the elastic energy:
\[
W(C, \epsilon(w)) = \tfrac{1}{2}\,\epsilon : \sigma = \tfrac{1}{2}\,\epsilon : C : \epsilon. \tag{4}
\]
The nonnegative functional $J$ is called the compliance of the domain; (3) states that it is maximal at equilibrium. At equilibrium, the energy stored in the body equals the work of the applied external forces $f$,
\[
J_0(C, f) = \tfrac{1}{2}\int_{\partial} w\cdot f\,ds = \int_\Omega W(C, \epsilon(w))\,dx. \tag{5}
\]
Simultaneously with the elasticity problem, we also consider the closely related problem of the bending of a Kirchhoff plate (see, for example, [35]). The equilibrium of the plate is described by the fourth order equation
\[
\nabla\nabla : C_{pl} : \nabla\nabla w = f \ \text{in } \Omega \tag{6}
\]
with homogeneous boundary conditions
\[
w = 0 \ \text{on } \partial\Omega, \qquad \frac{\partial w}{\partial n} = 0 \ \text{on } \partial\Omega, \tag{7}
\]
corresponding to a clamped plate, or
\[
w = 0 \ \text{on } \partial\Omega, \qquad n^T\left(C_{pl} : \nabla\nabla w\right)n = 0 \ \text{on } \partial\Omega, \tag{8}
\]
for a simply supported plate. Here, $w$ is the deflection orthogonal to the plane of the plate, $C_{pl}$ is the fourth-order tensor of bending stiffness of the elastic material, $\nabla\nabla w$ is the Hessian of $w$, and $f$ is the external loading. Notice that the force $f$ enters the equation as a right-hand-side term. The equation for the plate deflection corresponds to maximization of the functional
\[
J_{pl}(C, f) = -\int_\Omega\left[\tfrac{1}{2}\,\nabla\nabla w : C_{pl} : \nabla\nabla w - wf\right] dx. \tag{9}
\]
The results that we develop further in this paper apply to both the elasticity (1) and the bending problem (6); therefore, we will drop the subscript in $J_{pl}(C, f)$,
and keep the notation $J(C, f)$ for both compliance functionals. Where this does not cause confusion, we use the same notation $w$ to denote both the displacement in the elasticity problem (1) and the deflection in the bending problem (6), even though the first one is a vector function, whereas the second one is a scalar function.

2.1.2. Admissible Loadings

Let $\mathcal{F}$ be a set of admissible loadings $f$. The elastic energy over a finite domain is assumed to be finite. We consider integral constraints to describe the set of loadings $\mathcal{F}$:
\[
\mathcal{F} = \left\{ f : \int_{D_f}\phi(f)\,ds = 1 \right\}, \qquad
D_f = \begin{cases} \partial, & \text{for problem (1)}, \\ \Omega, & \text{for problem (6)}. \end{cases} \tag{10}
\]
Here $D_f$ is the domain of application of the forces: in the elasticity problem (1), $D_f$ coincides with the part $\partial$ of the boundary, whereas for the bending plate problem (6), $D_f$ is the domain $\Omega$ or a part of it. We assume that $\phi$ is a convex function of $f$, with the derivative $\psi: \mathbb{R}^3 \to \mathbb{R}^3$,
\[
\psi(f) = \frac{\partial\phi}{\partial f} = \left(\frac{\partial\phi}{\partial f_1}, \frac{\partial\phi}{\partial f_2}, \frac{\partial\phi}{\partial f_3}\right),
\]
which has an inverse $\rho = \psi^{-1}$.

2.1.3. Principal Compliance

We define the principal compliance of an elastic domain in a class of loadings as the compliance in the worst possible loading scenario.

DEFINITION. The principal compliance $\Lambda$ of the domain is
\[
\Lambda = \max_{f\in\mathcal{F}} J(C, f). \tag{11}
\]
The loadings that correspond to the principal compliance are extreme or the most dangerous loadings; we denote them as $f_D$:
\[
\Lambda(C) = J(C, f_D) \ge J(C, f) \qquad \forall f \in \mathcal{F}. \tag{12}
\]
The most dangerous loadings exist if the set $\mathcal{F}$ is closed and convex, see [15].

2.2. CALCULATION OF THE PRINCIPAL COMPLIANCE

The concept of the principal compliance is useful if there are efficient algorithms for computing the extreme loadings. We show here that the problem of computation of the principal compliance and the extreme loadings can be formulated as a boundary value problem.
Consider problem (11) and assume that the loadings are constrained as in (10). The augmented functional $J_A$ for the problem is
\[
J_A = J(C, f) - \mu\left(\int_{D_f}\phi(f)\,ds - 1\right),
\]
where $\mu$ is the Lagrange multiplier. Clearly, $\max_{f\in\mathcal{F}} J = \max_f J_A$. Variation of the augmented functional with respect to $f$ gives the optimality condition for the extreme loading(s):
\[
\delta_f J_A = \int_{D_f}\frac{\partial}{\partial f}\bigl(-f\cdot w + \mu\,\phi(f)\bigr)\,\delta f = 0,
\]
or, since $\delta f$ is arbitrary,
\[
w - \mu\frac{\partial\phi}{\partial f} = 0 \ \text{on } D_f.
\]
Solving for the extreme loading(s) $f_D = f$, we arrive at the condition
\[
f_D = \rho\left(\frac{w}{\mu}\right), \tag{13}
\]
which links the loading $f_D$ to the displacement $w$ at the same boundary point for the elasticity problem (1) or at the same point in the domain for the bending problem. Condition (13) together with the first boundary condition in (2) allows us to exclude $f$ from the boundary conditions, leading to the boundary value problem for the displacement $w$. We arrive at:

THEOREM 1. The principal compliance of the elasticity problem (1), (2) with the constraints for the class of loadings (10) equals
\[
\Lambda = \frac{1}{2}\int_{\partial} w\cdot\rho\left(\frac{w}{\mu}\right) ds, \tag{14}
\]
where $w$ satisfies the elasticity equations (1) in $\Omega$ with the boundary conditions
\[
\sigma\cdot n = \rho\left(\frac{1}{\mu}\,w\right) \ \text{on } \partial, \qquad w = 0 \ \text{on } \partial_0. \tag{15}
\]
The Lagrange multiplier $\mu$ is determined from the integral condition
\[
\int_{\partial}\phi\left(\rho\left(\frac{w}{\mu}\right)\right) ds = 1, \tag{16}
\]
where the function $\rho(\cdot)$ is the inverse of $\psi = \partial\phi/\partial f$.
Indeed, the displacement $w$, whose energy is the principal compliance, satisfies the elasticity equations (1) in $\Omega$ with the boundary conditions obtained from (2) and (13). The first condition in (15) relates the normal stress at a point on the boundary $\partial$ to the displacement at this point. The boundary value problem (1), (15), (16) allows us to compute $w$ and $\mu$, $f_D$, and $\Lambda$.

For the bending problem (6), the calculation is similar. The principal compliance is the maximum of the functional (9) over all loadings bounded by the constraint (10); its value is the following.

THEOREM 2. The principal compliance for the bending problem (6)–(8) with the constraint for the class of loadings (10) is
\[
\Lambda = \frac{1}{2}\int_\Omega w\,\rho\left(\frac{w}{\mu}\right) dx, \tag{17}
\]
where $w$ satisfies the equation
\[
\nabla\nabla : C_{pl} : \nabla\nabla w = \rho\left(\frac{w}{\mu}\right) \tag{18}
\]
together with the corresponding homogeneous boundary conditions (7) or (8). The function $\rho(\cdot)$ is the inverse of $\psi = \partial\phi/\partial f$. The Lagrange multiplier $\mu$ is determined from
\[
\int_\Omega\phi\left(\rho\left(\frac{w}{\mu}\right)\right) ds = 1. \tag{19}
\]
Indeed, the extreme loading $f$ is related to the displacement $w$ by the scalar relation $w = \mu\,\phi'(f)$ or $f = \rho(w/\mu)$, and the plate equilibrium is described by equation (18).

3. Examples of Constraints

3.1. HOMOGENEOUS QUADRATIC CONSTRAINT

Assume that the constraint (10) restricts a weighted $L_2$ norm of $f$:
\[
\frac{1}{2}\int_{\partial} f^T\Psi f\,ds = 1 \quad\text{or}\quad \phi(f) = \frac{1}{2}f^T\Psi f, \tag{20}
\]
where $\Psi(s)$ is a symmetric, positive matrix. In this case, $\rho$ is a linear mapping: $\rho(f) = \Psi^{-1}f$, and the first of the boundary conditions (15) for the extremal loading becomes linear:
\[
\frac{1}{\mu}\Psi^{-1}w - \sigma\cdot n = 0 \ \text{on } \partial. \tag{21}
\]
The optimality condition states that $w$ and $\sigma\cdot n$ are proportional to each other everywhere on the boundary $\partial$ with the same tensor of proportionality $\mu\Psi$.
REMARK 1. The stationary condition (21) allows for the following physical interpretation: the boundary $\partial$ is equipped with distributed springs with negative stiffness. The forces in them are proportional but opposite to the forces in conventional linear springs.

The elasticity equations (1) with boundary conditions (21) form a linear eigenvalue problem that has a nonzero solution $w$ only if $1/\mu$ is one of its discrete eigenvalues. The eigenvalue $1/\mu$ relates the displacement on the boundary and the normal stress. Like all eigenvalue problems, the problem (1), (21) represents the Euler–Lagrange equations of a variational problem:
\[
\frac{1}{\mu} = \min_{w:\,w|_{\partial_0}=0}\frac{\int_\Omega\epsilon(w) : C : \epsilon(w)\,dx}{\int_{\partial} w\cdot\Psi^{-1}w\,ds}
\]
or
\[
\int_\Omega\epsilon(w) : C : \epsilon(w)\,dx - \frac{1}{\mu}\int_{\partial} w\cdot\Psi^{-1}w\,ds \ \to\ \min_{w:\,w|_{\partial_0}=0}. \tag{22}
\]
The eigenvalue problem that contains the eigenvalue in the boundary condition is a Steklov eigenvalue problem, and $\mu$ is the reciprocal of the Steklov eigenvalue, see [4]. The eigenfunctions are normalized by condition (20). Using (20) and (21) in the form $w = \mu\Psi f$, we observe that the second term in (22) is equal to $\mu$, and therefore $\mu = \Lambda$. The Steklov problem has infinitely many real positive eigenvalues (see [4, 23]), but the principal compliance of the domain corresponds to the dominant eigenvalue, $\Lambda = \mu_{\max}$. The dominant eigenfunction is not necessarily unique; we will demonstrate below that the existence of many stationary solutions is typical for the problems of minimization of the principal compliance with respect to the structure. The dominant eigenfunctions are the extreme loadings. The results are formulated as

THEOREM 3. If the $L_2$-norm of admissible loadings is bounded, the principal compliance $\Lambda$ is a solution of the eigenvalue problem:
\[
\nabla\cdot\sigma = 0 \ \text{in } \Omega, \qquad w = \Lambda\Psi\,\sigma\cdot n \ \text{on } \partial. \tag{23}
\]
$\Lambda$ is the reciprocal of the principal eigenvalue $1/\mu$ of the problem (1), (21).

REMARK 2. The spectrum of the problem (1), (21) has one condensation point, zero. Positive eigenvalues $\mu_k$ tend to zero but never reach it. This implies that the dual problem of minimal compliance does not have a solution: the compliance can be made arbitrarily small by choosing a fast alternating loading.

REMARK 3. The problem becomes isomorphic to the problem of the principal eigenfrequency of the domain, if the kinetic energy (and the inertia) are concentrated on the boundary: $T = \delta(x - x_b)\,ww$, where $x_b \in \partial$.
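A finite-dimensional analogue may make the statement that the principal compliance equals the dominant eigenvalue more concrete. The sketch below is not from the paper: it replaces the elastic body by a small spring system with a symmetric positive-definite compliance matrix `S` (the inverse of a stiffness matrix), takes the weight matrix in the quadratic constraint to be the identity, and uses plain power iteration. All names and the sample matrix are hypothetical.

```python
import math

def matvec(A, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def compliance(S, f):
    """Discrete compliance J(f) = max_w (w.f - 0.5 w.K w) = 0.5 f.S f, with S = K^{-1}."""
    return 0.5 * sum(x * y for x, y in zip(f, matvec(S, f)))

def principal_compliance(S, iters=200):
    """Dominant eigenpair of S by power iteration.

    Returns (Lambda, f_D) with f_D scaled so that 0.5 * f_D.f_D = 1,
    the discrete counterpart of the quadratic constraint with identity weight.
    """
    f = [1.0] * len(S)
    for _ in range(iters):
        g = matvec(S, f)
        norm = math.sqrt(sum(x * x for x in g))
        f = [x / norm for x in g]
    lam = sum(x * y for x, y in zip(f, matvec(S, f)))  # Rayleigh quotient, |f| = 1
    f_extreme = [math.sqrt(2.0) * x for x in f]        # enforce 0.5 * f.f = 1
    return lam, f_extreme

if __name__ == "__main__":
    # Two unit springs in series, fixed at one end: K = [[2, -1], [-1, 1]].
    S = [[1.0, 1.0], [1.0, 2.0]]  # inverse of K
    lam, f_d = principal_compliance(S)
    print(lam, compliance(S, f_d))  # the two numbers agree
    print(compliance(S, [math.sqrt(2.0), 0.0]))  # another admissible load, smaller
```

For this sample matrix the dominant eigenvalue is $(3 + \sqrt{5})/2$, and the compliance of the normalized dominant eigenvector equals that eigenvalue, mirroring the relation $\Lambda = \mu_{\max}$ above; any other load satisfying the same constraint stores less energy.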
In the bending problem (6), the analogy between the principal compliance and the principal eigenfrequency of vibrations is complete. The equilibrium (18) of the optimally loaded plate coincides with the equation for the magnitude of the deflection of the oscillating plate,
$$\nabla\nabla : C_{pl} : \nabla\nabla w = \frac{1}{\Lambda}\, w.$$
3.2. L1-NORM CONSTRAINT
Consider the L1-norm constraint for the class of admissible loadings, which assumes that the mean value of the loading's magnitude is fixed:
$$\int_{\partial} |f|\,ds = \int_{\partial} \sqrt{f\cdot f}\,ds = 1. \tag{24}$$
From an engineering viewpoint, this case is probably the most interesting one: it models the situation in which the total weight applied to the structure is known but the distribution of the loading over the boundary is uncertain. In this case, the functional of the variational problem grows linearly as |f| → ∞, which leads to a significantly different analysis. The straightforward variational technique does not provide the correct answer. Indeed, the variation with respect to f returns the vector condition
$$\delta f:\quad w - \mu\,\frac{1}{\sqrt{f\cdot f}}\,f = 0 \quad\text{on } \partial,$$
which says that
$$|w| = \text{constant} \quad\text{and}\quad w \parallel f \quad\text{on } \partial.$$
The last condition, together with the condition σ · n = f (see (2)), allows us to exclude f and end up with a pair of conditions on w:
$$(\sigma\cdot n)\times w = 0, \qquad |w| = \text{constant} \quad\text{on } \partial.$$
Generally, these conditions cannot be satisfied if the ∂-component of the boundary is adjacent to the component ∂0 where w = 0, since w is continuous. This contradiction shows that the naive variational method does not apply.
REMARK 4. The appearance of discontinuous solutions in variational problems of linear growth is well known [36]. The famous classical example is the existence of a non-smooth solution in the minimal surface problem.
To resolve the contradiction, we need to assume that the optimal loading f is a distribution. Indeed, the distribution does not have to satisfy the Euler equations
of the variational problem because this equation was derived under the assumption that the optimal solution f is finite and smooth.
Dealing with distributions in the L1-constrained set of loadings may cause difficulties because the distributions δ(x − x0) may or may not correspond to a finite energy of the elastic system, as is stated in the Sobolev embedding theorem, see, for example, [24]. For the compliance of the bending plate (9), the energy of the concentrated loading and the Green's function of the corresponding operator are finite. We illustrate this case below by considering a one-dimensional example of a beam; the concentrated loadings of the type δ(x − x0) are acceptable because the corresponding energy stored in the elastic beam is finite. However, the linear elasticity problem does not allow a concentrated loading because the corresponding energy is infinite; the Green's function g(x, y) has a singularity, g(x, x) = ∞. In this case, the restriction on the class of admissible f can be slightly tightened. We may assume, for example, that the force is piecewise constant within small domains of area ε. Alternatively, we may constrain the $L_{1+\varepsilon}$-norm of the loading,
$$\int_{\partial} |f|^{1+\varepsilon}\,ds = 1, \tag{25}$$
where ε > 0 is a fixed parameter. This loading can be supported by a linear elastic material, although the displacement w can grow indefinitely when ε → 0. The analysis of this case leads to the optimality condition
$$f = \left(\frac{|w|}{\mu}\right)^{1/\varepsilon} \frac{w}{|w|},$$
which shows that the magnitude of an optimal loading either stays arbitrarily close to zero or is very large (of the order of 1/ε). The integral constraint (25) guarantees that the measure of the set of large values of f(s) goes to zero when ε → 0.
With this warning, we proceed with the formal analysis of the problem with the L1 constraint, assuming that either the limit exists or that ε can be chosen arbitrarily close to zero to preserve the qualitative properties of the solution. The extremal loading is concentrated at several points,
$$f = \sum_i c_i\,\xi_i\,\delta(x - x_i),$$
where {x_i} is the set of points where the (concentrated) loading is applied, x_i ∈ ∂; the vectors $\xi_i = (\xi_i^{(1)}, \xi_i^{(2)}, \xi_i^{(3)})$, $|\xi_i| = 1$, are the directions of the concentrated loadings; and c_i are their intensities. Due to (24), the c_i belong to the simplex
$$\Big\{ c_i :\ \sum_i c_i = 1,\ c_i \ge 0 \Big\}. \tag{26}$$
Further, we show that the extreme loading is always applied to a single point. The displacements $w_k = w(x_k)$ are
$$w_k = \sum_i g(x_k, x_i)\,c_i\,\xi_i,$$
where $g(x_k, x_i)$ is the Green's function, which relates the δ-function loading at the point $x_i$ to the generated displacement w at the point $x_k$. The compliance becomes
$$J = \sum_i \sum_k c_i c_k\, \xi_i^T g(x_i, x_k)\, \xi_k.$$
The principal compliance corresponds to the maximum of J with respect to $c_i$, $\xi_i$ and the points $x_i$. As a function of $c_i$, J is a nonnegative quadratic form, because the work J is always nonnegative. Therefore, J is a convex function of $c_i$ and its maximum is reached in a corner of the simplex (26): the maximum $J_c$ of J corresponds to a single concentrated loading $c_1 = 1$, $c_2 = \dots = c_p = 0$. Next, we maximize this maximum $J_c$ with respect to the direction $\xi_1 = (\xi_1^{(1)}, \xi_1^{(2)}, \xi_1^{(3)})$ of the single applied loading. The resulting compliance $J_{\xi,c}$ is equal to the maximal eigenvalue $\lambda^g_{\max}(x_1)$ of the Green's function $g(x_1, x_1)$ at the point $x = x_1$:
$$J_{\xi,c} = \max_{\xi_1}\ \xi_1^T g(x_1, x_1)\,\xi_1 = \lambda^g_{\max}(x_1).$$
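The corner-of-the-simplex argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the Green's matrix `G` is a randomly generated symmetric positive semidefinite matrix (a hypothetical stand-in, not computed from any elasticity problem), and the names `compliance`, `bound` and `best_point` are introduced here for convenience. It samples intensities on the simplex (26) with random unit directions and compares the sampled compliance with the largest eigenvalue of a diagonal block g(x_i, x_i).

```python
import numpy as np

rng = np.random.default_rng(0)
p, d = 4, 3  # hypothetical: 4 candidate points, 3 displacement components

# Hypothetical Green's matrix: a symmetric positive semidefinite matrix
# made of p x p blocks g(x_i, x_k), each of size d x d.
A = rng.standard_normal((p * d, p * d))
G = A @ A.T


def compliance(c, xi):
    """J = sum_i sum_k c_i c_k xi_i^T g(x_i, x_k) xi_k."""
    u = (c[:, None] * xi).reshape(-1)  # stacked vectors c_i * xi_i
    return u @ G @ u


# Largest eigenvalue of each diagonal block g(x_i, x_i).
block_eigs = [np.linalg.eigvalsh(G[i * d:(i + 1) * d, i * d:(i + 1) * d])[-1]
              for i in range(p)]
bound = max(block_eigs)
best_point = int(np.argmax(block_eigs))

# Random intensities on the simplex and random unit directions.
sampled_max = 0.0
for _ in range(2000):
    c = rng.dirichlet(np.ones(p))
    xi = rng.standard_normal((p, d))
    xi /= np.linalg.norm(xi, axis=1, keepdims=True)
    sampled_max = max(sampled_max, compliance(c, xi))

# A single load at best_point along the top eigenvector of its block.
blk = G[best_point * d:(best_point + 1) * d, best_point * d:(best_point + 1) * d]
top_dir = np.linalg.eigh(blk)[1][:, -1]
c_vertex = np.zeros(p)
c_vertex[best_point] = 1.0
vertex_value = compliance(c_vertex, np.tile(top_dir, (p, 1)))
```

In this sketch no sampled loading exceeds `bound`, while the single concentrated load reaches it, which is the discrete counterpart of the statement above.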
This implies that the applied loading f(x) must be parallel to the displacement w(x). Finally, we choose the point x1 ∈ ∂ of application of the extreme concentrated loading and obtain the principal compliance Λ. Summarizing, we obtain
THEOREM 4. The L1-principal compliance is
$$\Lambda = \max_{x\in\partial} \lambda^g_{\max}(x),$$
where $\lambda^g_{\max}(x)$ is the maximal eigenvalue of the 3 × 3 tensor Green's function g(x, x) of the problem (1) at the point x ∈ ∂.
We stress that the point x1 may not be unique, although the extreme loading is always concentrated at one point. For example, there may be two symmetric extreme loadings if Ω is a symmetric domain. An example in Section 5.1 below shows that there are several equally dangerous loadings in an optimal solution: $\lambda^g_{\max}(x_1) = \dots = \lambda^g_{\max}(x_q)$; the number q depends on the structure.
3.3. OTHER SPECIAL CASES
3.3.1. Constrained Lp-norm of the Loading
If the constraint is imposed on the Lp-norm of the loading, i.e.,
$$\frac{1}{p}\int_{\partial} |f|^p\,ds = 1, \qquad p > 1,$$
the problem has the form (1) but the boundary conditions (21) are replaced by
$$\sigma\cdot n = \eta(w), \qquad \eta(w) = \left(\frac{|w|}{\mu}\right)^{1/(p-1)} \frac{w}{|w|} \tag{27}$$
and the normalization (16) for µ becomes
$$\mu = \left(\frac{1}{p}\int_{\partial} |w|^q\,ds\right)^{1/q} \quad\text{with}\quad \frac{1}{q} + \frac{1}{p} = 1. \tag{28}$$
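The pair (27), (28) can be sanity-checked on a discretized boundary. The sketch below is an illustration only: the displacement samples `w` are random and held fixed (in the actual problem w depends on f through the elasticity equations), and the uniform mesh, the exponent p = 3 and all names are assumptions made here. It verifies that the loading built from (27) with µ from (28) satisfies the Lp constraint and does at least as much work against this fixed w as randomly sampled admissible loadings.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3.0
q = p / (p - 1.0)                 # conjugate exponent, 1/p + 1/q = 1
n = 200
ds = 1.0 / n                      # hypothetical uniform boundary mesh
w = rng.standard_normal((n, 2))   # hypothetical fixed boundary displacement samples

absw = np.linalg.norm(w, axis=1)
mu = (np.sum(absw**q) * ds / p) ** (1.0 / q)                             # normalization (28)
f_opt = (absw / mu)[:, None] ** (1.0 / (p - 1.0)) * w / absw[:, None]   # relation (27)


def constraint(f):
    """Discrete version of (1/p) * integral of |f|^p over the boundary."""
    return np.sum(np.linalg.norm(f, axis=1) ** p) * ds / p


def work(f):
    """Discrete version of the boundary integral of f . w."""
    return np.sum(f * w) * ds


# Random loadings rescaled so that they satisfy the same constraint.
best_random = -np.inf
for _ in range(500):
    f = rng.standard_normal((n, 2))
    f /= constraint(f) ** (1.0 / p)
    best_random = max(best_random, work(f))
```

For this fixed w the work of `f_opt` equals p·µ, which is the Hölder bound for loadings satisfying the constraint; the random loadings stay below it.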
In this case, the relation between the stress and displacement is nonlinear. Again, the multiplicity of stationary solutions that satisfy (27), (28) is expected; this time the solutions correspond to bifurcation points instead of spectrum points. The physical interpretation is similar to the one given in Remark 1, but the springs attached to the boundary ∂ are nonlinear.
3.3.2. Nonhomogeneous Constraint
Let the loading f consist of some known component $f^0$ and an unknown deviation with a constrained Lp-norm:
$$\|f^0 - f\|_{L_p} \le 1. \tag{29}$$
Applying the previous variational analysis, we conclude that an extremal loading can be found from the elasticity problem with an inhomogeneous mixed boundary condition:
$$\sigma\cdot n = f^0 + \eta(w) \quad\text{on } \partial.$$
Since the boundary condition is inhomogeneous, w = 0 is not a solution. Still, the problem may have several stationary solutions. An example of this constraint is discussed later in Section 5.2.
4. Robust Optimal Design
4.1. MULTIPLICITY OF EXTREME LOADINGS
Consider an optimal design problem: find a layout of elastic materials over the domain Ω that minimizes the principal compliance Λ. Such a structure (stiffness C(x)) corresponds to a solution of the extremal problem
$$P_{\min\max} = \min_{C\in\mathcal{C}} \Lambda(C), \tag{30}$$
where $\mathcal{C}$ is a class of admissible layouts. We rewrite the problem using the definition of Λ(C):
$$P_{\min\max} = \min_{C\in\mathcal{C}}\ \max_{f\in\mathcal{F}} J(C, f), \tag{31}$$
where the compliance J = J(C, f) is defined in (3). Minimization over w in (3) is performed first, so that w satisfies the elasticity equations. The two possible orders of the remaining extremal operations, the minimum over C and the maximum over f, correspond to two physically different situations. Minimax problem (31) is a problem of optimization
of the material layout when the applied loading is unknown, while in the maximin problem
$$P_{\max\min} = \max_{f\in\mathcal{F}}\ \min_{C\in\mathcal{C}} J(C, f) \tag{32}$$
the loading is chosen to maximize the stored energy and is known to the designer, so the design resists this particular loading. If J is a saddle-point functional, the solutions to these two problems coincide, and $P_{\max\min} = P_{\min\max}$. Saddle-point solutions are typical for ‘weak’ control, as we will demonstrate below.
The general case $P_{\max\min} < P_{\min\max}$ corresponds to a situation when several loadings are ‘equally dangerous.’ The stiffness of the structure $C_{opt}$ should be fairly distributed to resist equally well each of these extreme loadings, leading to the condition
$$J(C_{opt}, f_i) = J(C_{opt}, f_j)$$
for every pair of loadings $f_i$, $f_j$ in the set of extreme loadings. Generally, the set of stationary loadings may consist of any number of elements. They can be found from the following equations, see [16]. Consider a design $C_{opt}$ and the functional $J(C_{opt}, f)$. The extremal loadings that solve the variational problem
$$\frac{\delta}{\delta f} J(C_{opt}, f) = 0, \qquad \frac{\delta^2}{\delta f^2} J(C_{opt}, f) \le 0$$
are denoted by $\hat f_i$, i = 1, . . . , p, where p ≤ ∞; we assume that there are p stationary loadings that can become extreme. The optimized principal compliance $P_{\min\max}$ is determined from the problem
$$\min_{C}\ \max_{\nu_i \ge 0} \Big( P_{\min\max} + \sum_i \nu_i J(C_{opt}, \hat f_i) \Big), \tag{33}$$
where $\nu_i \ge 0$ are the Lagrange multipliers due to the constraints
$$J(C_{opt}, \hat f_i) - P_{\min\max} \le 0, \qquad \sum_i \nu_i = 1.$$
Optimal design Copt is found from the following conditions that reformulate the minimax problem as the problem of minimization of a sum of energies corresponding to extreme loadings.
THEOREM 5. The optimal principal compliance $P_{\min\max}$ equals
$$P_{\min\max} = \min_{C\in\mathcal{C}}\ \max_{\{\nu_i\}:\ \nu_i > 0}\ \sum_{i=1}^{q} \nu_i J(C, \hat f_i), \qquad \sum_i \nu_i = 1, \tag{34}$$
where q is the number of active extreme loadings. The nonzero Lagrange multipliers correspond to the equalities
$$J_0 = J(C_{opt}, \hat f_i), \quad i = 1, \dots, q \quad\Rightarrow\quad \nu_i > 0,$$
and the multipliers equal zero if the stationary loading leads to a smaller value of the functional, i.e.,
$$J_0 > J(C_{opt}, \hat f_k), \quad k = q+1, \dots, p \quad\Rightarrow\quad \nu_k = 0.$$
These last conditions should be checked in the optimization procedure; that is, while minimizing $J_0$ we check whether the value of the functional for the next loading $\hat f_{q+1}$ (not the most dangerous one) is still less than $J_0$. When this inequality becomes an equality, the set of extreme loadings should be enlarged to include $\hat f_{q+1}$, and the corresponding Lagrange multiplier $\nu_{q+1}$ becomes positive.
The multiplicity of equally dangerous loadings closely resembles the multiplicity of optimal solutions in the well-studied problem of maximization of the minimal eigenfrequency. The multiplicity of optimal eigenvalues in that problem was first observed in a pioneering paper of Olhoff and Rasmussen [30]; then it was investigated in [33, 14, 34].
REMARK 5. The optimization problem (34) also admits a probabilistic interpretation. Namely, assume that the optimal loading is a random variable which takes q stationary values with probabilities ν1, . . . , νq. Then the sum $\sum_i \nu_i J(C, \hat f_i)$ in (34) is the expectation of the energy. The optimal design minimizes the expectation of the energy, while the loading chooses the probabilities ν1, . . . , νq to maximize it.
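A toy computation may help to see the gap between the maximin and minimax values and the role of the weights in (34). Everything in the sketch below is a hypothetical model chosen for illustration, not taken from the paper: a unit stiffness budget is split between two springs, k1 = t and k2 = 1 − t, and loading i acts on spring i alone, so that J_i = 1/k_i. The weighted sum is evaluated in the order "minimum over the design, then maximum over the weights", which gives the same value as the order in (34) for this convex toy problem.

```python
import numpy as np

# Hypothetical toy model: J_1 = 1/t and J_2 = 1/(1 - t) for a design t in (0, 1).
t = np.linspace(0.001, 0.999, 999)           # admissible designs (grid)
J = np.vstack([1.0 / t, 1.0 / (1.0 - t)])    # J[i, j] = J(C_j, f_i)

p_minmax = J.max(axis=0).min()   # design chosen first, worst loading second
p_maxmin = J.min(axis=1).max()   # loading known to the designer
t_opt = t[J.max(axis=0).argmin()]

# Weighted-sum form in the spirit of (34): for each weight nu_1 the design
# minimizes nu_1 * J_1 + (1 - nu_1) * J_2; the loading then picks the weights.
nu1 = np.linspace(0.0, 1.0, 101)
weighted_min = np.array([(a * J[0] + (1.0 - a) * J[1]).min() for a in nu1])
p_weighted = weighted_min.max()
nu_opt = nu1[weighted_min.argmax()]
```

In this toy model the optimal design is t = 1/2, both loadings are equally dangerous, the weights are equal, and the maximin value is strictly smaller than the minimax value.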
4.2. SYMMETRIES
Symmetries are typical for designs that minimize the principal compliance. Namely, if the domain and the class of loadings are invariant under a symmetry transformation (translation, reflection, or rotation), then the set of extreme loadings and the optimal design are invariant under this transformation as well. We state the following:
THEOREM 6. If the domain Ω, the boundary component ∂, and the set $\mathcal{F}$ of admissible loadings are invariant under a symmetry transformation R, i.e.,
$$\Omega = R\,\Omega, \qquad \partial = R\,\partial, \qquad\text{and}\qquad \mathcal{F} = R\,\mathcal{F},$$
Figure 1. The force could be applied at arbitrary points along the elastically supported beam. The mean value of the magnitude of the force is constrained.
then the set of extreme loadings and the optimal materials’ layout C are invariant under this transformation, i.e., the set of extreme loadings is mapped onto itself by R and
$$C = R\,C. \tag{35}$$
Indeed, applying the above consideration we can see that if f0 is an extreme loading, then Rf0 is also an extreme loading. The compliance of the structure should be the same for both loadings, which implies invariance of the design parameters with respect to the transformation R. In particular, when the loaded domain is rotationally symmetric, and the loading can be applied from any direction, the optimal layout is axisymmetric.
REMARK 6. Notice the symmetry of many natural ‘designs’ that are perfected by evolution: the rotationally symmetric shape of trees allows them to sustain wind from all directions; our natural “protective shell”, the skull, provides the best protection for the brain against hits from any direction.
The conditions of the theorem do not require the symmetry of the extreme loading, only the possibility of applying a loading symmetric to any given one. In contrast, the design must be symmetric.
5. Examples of Optimal Designs
The following examples highlight the discussed multiplicity of extreme loadings and bifurcation of the optimal solution.
5.1. OPTIMAL DESIGN OF A SUPPORTED BEAM
5.1.1. Formulation
Consider a homogeneous elastic beam of unit length simply supported at both ends, elastically supported from below by a distributed system of elastic vertical springs with the specific stiffness q(x) ≥ 0, and loaded by a distributed nonnegative force f(x) ≥ 0. The elastic equilibrium of the displacement w is described by a one-dimensional version of (6):
$$(E w'')'' + q w = f, \qquad w(0) = w(1) = 0, \qquad w''(0) = w''(1) = 0, \tag{36}$$
where E is Young’s modulus. The compliance is equal to
$$J = \int_0^1 \Big( f w - \frac{E}{2}(w'')^2 - \frac{q}{2} w^2 \Big)\,dx, \tag{37}$$
where w is a solution of (36). Assume that the mean value of the magnitude of the loading (L1-norm constraint) is equal to one, and the integral stiffness of the supporting springs is constrained by a constant κ:
$$\mathcal{F} = \Big\{ f \in H^{-1}(0,1):\ \int_0^1 f\,dx = 1 \Big\}, \qquad \mathcal{Q} = \Big\{ q \in H^{-1}(0,1):\ \int_0^1 q\,dx = \kappa \Big\}.$$
The optimal design problem of minimization of the principal compliance by distributing the springs’ stiffness becomes
$$P_{\min\max} = \min_{q\in\mathcal{Q}}\ \max_{f\in\mathcal{F}} J.$$
Applying the above analysis, we conclude:
1. The domain, the class of loadings, and the boundary conditions are invariant under the reflection x → 1 − x; therefore the design (the springs’ stiffness) is symmetric with respect to the center of the beam, see Section 4.2: q(x) = q(1 − x).
2. Necessary conditions in Section 3.2 show that the extreme loading is a delta-function $f(x) = \delta(x - x_j)$ applied at one of the points $\{x_1, x_2, \dots, x_p\}$, where
$$w'(x_j) = 0, \qquad w''(x_j) \le 0. \tag{38}$$
The extreme loading may be applied to different points symmetric with respect to the center of the beam; the resulting stiffness must be equal.
3. The stiffness of an optimal spring is a distribution
$$q(x) = \sum_i \alpha_i\,\delta(x - y_i), \qquad \sum_i \alpha_i = \kappa, \qquad \alpha_i \ge 0.$$
Indeed, the assumption that q(x) satisfies variational stationary conditions leads to a contradiction similar to the contradiction discussed in Section 3.2. In particular, the optimal positions of the springs satisfy the necessary conditions (38), and therefore the set of reinforcement points coincides with the set $\{x_1, x_2, \dots, x_p\}$. The number p of the critical points depends on the relative stiffness of the springs κ/E.
Accounting for the loading and springs being concentrated, we reformulate the problem (37) for the optimal principal compliance:
$$P_{\min\max} = \min_{(\alpha_1,\dots,\alpha_p)}\ \max_{x_k} \Big\{ \sum_{i=1}^{p} \Big( \delta_{ik} w_k - \frac{\alpha_i}{2} w_i^2 \Big) - \int_0^1 \frac{E}{2}(w'')^2\,dx \Big\}, \tag{39}$$
where $\delta_{ik}$ is the Kronecker delta. The response of a supported beam can be characterized by a function
$$v(x) = \max_{\zeta\in(0,1)} g(\zeta, x), \tag{40}$$
where g is the Green’s function of the boundary value problem (36): g(ζ, x) is the displacement w(ζ) at the point ζ corresponding to a delta-function loading applied at the point x, f(ζ) = δ(ζ − x), and v(x) is the maximal displacement under the concentrated force applied at the point x.
Figure 2 shows the response v(x) of the beam supported by two symmetric springs. The family of thin curves shows the displacements $w_k(x)$ under several concentrated loadings applied at different points along the beam. The thick curve shows the maximal displacement, v(x). Notice that the point of application of the concentrated force is generally different from the point of maximum of the displacement curve; see the caption to Figure 2. However, the optimal springs are located at the points $x^i_{opt}$, i = 1, 2, of the maximum of v(x), and the extreme loading is the one applied at one of the same points, $f_D = \delta(x - x^1_{opt})$ or $f_D = \delta(x - x^2_{opt})$.
The numerical results demonstrate the following: if the springs are weak, κ/E ≤ κ1, they are concentrated in the center of the beam. We are dealing with the saddle-point case: the most dangerous loading is a concentrated loading applied also at the center. The maximal displacement v(x) is a unimodal function of the position of the loading, with the maximum in the center (v′(1/2) = 0, v″(1/2) < 0). There
Figure 2. Thin curves: The displacement functions generated by concentrated loadings applied at various points along the beam. The thick curve: maximal displacement v(x) generated by a force applied at x ∈ (0, 1) as a function of the position of the force. The displacement corresponding to the force applied at x = 0.15, has a maximum at x = 0.25. Figure shows the responses of the beam optimally reinforced by two symmetric springs.
Figure 3. Maximal displacement v(x) as a function of the position of the applied loading: (a) corresponds to a saddle-point case, κ/E < κ1: the function v(x) is unimodal, and the optimal spring and the extreme loading are both located in the middle of the beam; (b) shows v(x) corresponding to κ/E in the interval κ1 < κ/E < κ2 when the strong spring is located in the center of the beam: the maximal displacement v(x) is not unimodal, and the design is not optimal; (c) corresponds to κ/E in the same interval κ1 < κ/E < κ2: the maximal displacement v(x) is shown for an optimally designed beam which is supported by two symmetric springs.
is only one solution for the optimal applied force and the optimal position of the spring:
$$f(x) = \delta\Big(x - \frac{1}{2}\Big), \qquad q(x) = \kappa\,\delta\Big(x - \frac{1}{2}\Big).$$
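The behaviour of v(x) for a single central spring can be reproduced with a small finite-difference model of (36). The sketch below is an illustration under assumptions made here, not the authors' computation: a uniform grid, the squared second-difference matrix as a discretization of the fourth derivative for simply supported ends, and the helper name `max_displacement` are all hypothetical choices. Column k of the discrete Green's matrix is the displacement under a unit point load at grid node k, and v is its columnwise maximum, as in (40).

```python
import numpy as np


def max_displacement(kappa, E=1.0, n=199):
    """Grid x, v(x) = max_zeta g(zeta, x), and the index of the node x = 1/2,
    for a beam with a single spring kappa * delta(x - 1/2)."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    # Second difference with w = 0 at both ends; squaring it also imposes w'' = 0 there.
    D2 = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1)) / h**2
    K = E * (D2 @ D2)
    mid = n // 2                 # node at x = 1/2 when n is odd
    K[mid, mid] += kappa / h     # concentrated spring as a discrete delta
    G = np.linalg.inv(K) / h     # column k: response to a unit point load at x[k]
    return x, G.max(axis=0), mid


x, v_weak, mid = max_displacement(kappa=0.0)
_, v_strong, _ = max_displacement(kappa=1.0e4)
```

Without the spring, v(x) peaks at the center with a value close to 1/(48E), the classical midspan deflection of a simply supported beam under a central unit load. With a very stiff central spring the largest values of v(x) move away from the center, so a central loading is no longer the most dangerous one.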
Figure 3(a) shows v(x) for the beam supported by a weak spring in the center of the beam. One can see that v(x) is unimodal. If the spring becomes stronger, κ1 < κ/E ≤ κ2, but is still located in the center, the maximum of v(x) corresponds to a noncentral applied force. The equally dangerous loadings could be applied at two symmetric eccentric points. The maximum displacement v(x), shown in Figure 3(b), is not a unimodal function of the position of the moving applied force; the design is not optimal. The optimal design for this case (Figure 3(c)) corresponds to two equally stiff springs located symmetrically with respect to the center; the design experiences a bifurcation at the critical value κ/E = κ1. An optimally supported beam is shown in Figure 3(c), where two strong springs are located symmetrically with respect to the center of the beam. The maximal displacement curve becomes unimodal again, with a large interval of almost constant values in the middle. The next bifurcation occurs when κ further increases, at the point κ/E = κ2. Three springs appear after the next bifurcation. The number of optimal supporting points increases and tends to infinity when the springs are much stronger than the beam, κ/E ≫ 1. The optimality conditions
$$w'(x_i) = 0, \qquad w(x_i)\big|_{f=f_i} = \text{constant}(i),$$
give the optimal position of the supporting springs $x_i$ and a requirement on their stiffnesses $\alpha_i$.
5.2. COMPOSITE STRIP WITH CONSTRAINED DEVIATION OF THE LOADING
This example shows the design of an optimal structure for the worst possible loading. Consider an infinite strip Ω = {−∞ < x < ∞, −1 ≤ y ≤ 1}, made from a two-component elastic composite with arbitrary structure but with fixed fractions $m_A$ and $m_B = 1 - m_A$ of the isotropic components. The stiffness of the composite C(x, y) is an anisotropic elasticity tensor; it is assumed that the stiffness can vary only along the strip, C = constant(y). Assume that the upper boundary is loaded by some unknown but uniform loading f,
$$\sigma(x, 1)\cdot N = f \qquad \forall x,$$
where N = (0, 1) is the normal vector. The loading f consists of the fixed component f0 = (0, 1) directed along the normal and a variable component (deviation) $(f_N, f_T)$; the magnitude of the deviation is constrained:
$$f = (f_0 + f_N)N + f_T T, \qquad f_N^2 + f_T^2 = \gamma^2. \tag{41}$$
Here T = (1, 0) is the tangent vector and γ is the intensity of the deviation. The constraint (41) can be rewritten as
$$f = (f_0 + \gamma\cos\theta)N + (\gamma\sin\theta)T \quad\text{for } y = 1,$$
Figure 4. An infinite composite strip loaded by a force f that could deviate from the normal direction. If the norm γ of the deviation is smaller than a critical value γ1 , the optimal composite is a laminate with layers directed across the strip. If γ is greater than γ1 , the optimal composite is second rank laminate with layers oriented along directions φ and −φ.
where θ is the angle of inclination of the deviation of the loading; see Figure 4. The lower boundary of the strip is assumed to be loaded by a symmetrically deviated force
$$f_- = -f = -(f_0 + \gamma\cos\theta)N + (\gamma\sin(-\theta))T \quad\text{for } y = -1.$$
The symmetry of the loadings results in the horizontal strain being zero,
$$\varepsilon_{xx}(x, y) = 0, \qquad -1 \le y \le 1, \tag{42}$$
so that the strain tensor has only two nonzero components, the vertical one and the shear one. The stiffness of the composite C(x) is an anisotropic tensor that is assumed to vary only along the x coordinate. We consider the problem of optimization of the principal compliance of the described domain.
5.2.1. Design Parameters
Applying the symmetry theorem, we conclude that:
1. The elastic properties of the optimally designed structure do not vary along the strip, since the design is invariant to the translation x → x + χ. Together with the assumption that the material properties do not vary with the thickness, this leads to the conclusion that the elastic properties are uniform: the tensor C is constant in x and y. This implies that the stress field σ is constant inside an optimal strip and
$$\sigma_{yy} = 1 + \gamma\cos\theta, \qquad \sigma_{xy} = \gamma\sin\theta. \tag{43}$$
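2. The material in the optimal strip is orthotropic with main axes directed along the x and y axes since the design is invariant to the reflection x → −x:
$$C : \begin{pmatrix} 0 & -\varepsilon_{xy} \\ -\varepsilon_{xy} & \varepsilon_{yy} \end{pmatrix} = C : \begin{pmatrix} 0 & \varepsilon_{xy} \\ \varepsilon_{xy} & \varepsilon_{yy} \end{pmatrix}.$$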
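This implies orthotropy with the main axes codirected along the x, y axes. For the following calculations, we introduce an orthonormal ($a_i : a_j = \delta_{ij}$) tensor basis
$$a_1 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad a_2 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad a_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{44}$$
In this basis, the stress tensor σ,
$$\sigma = \begin{pmatrix} \sigma_2 & \sigma_3 \\ \sigma_3 & \sigma_1 \end{pmatrix},$$
is represented as a vector
$$\sigma = \sigma_1 a_1 + \sigma_2 a_2 + \sqrt{2}\,\sigma_3 a_3.$$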
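The compliance tensor S and stiffness tensor $C = S^{-1}$ are presented as matrices with the components $\{S_{ij}\}$ and $\{C_{ij}\}$; their orthotropy implies the representation
$$S = \begin{pmatrix} S_{11} & S_{12} & 0 \\ S_{12} & S_{22} & 0 \\ 0 & 0 & S_{33} \end{pmatrix}$$
and a similar one for C.
5.2.2. The Optimization Problem
The energy of an orthotropic material is computed either as a function of stresses and compliance tensor $S = \{S_{ij}\}$ (stress energy):
$$\Pi_\sigma(S, \sigma) = \frac{1}{2}\big(S_{11}\sigma_1^2 + S_{22}\sigma_2^2 + 2S_{12}\sigma_1\sigma_2 + 2S_{33}\sigma_3^2\big), \tag{45}$$
or as a function of strain and stiffness tensor $C = \{C_{ij}\}$,
$$\Pi_\varepsilon(C, \varepsilon) = \frac{1}{2}\big(C_{11}\varepsilon_1^2 + C_{22}\varepsilon_2^2 + 2C_{12}\varepsilon_1\varepsilon_2 + 2C_{33}\varepsilon_3^2\big). \tag{46}$$
Recall (see (43)) that two components $\sigma_1 = \sigma_{yy}$ and $\sigma_3 = \sigma_{xy}$ of the stress field σ are known, and the strain in the xx direction is zero, (42): $\varepsilon_2 = S_{12}\sigma_1 + S_{22}\sigma_2 = 0$; therefore, $\sigma_2$ can be excluded. The elastic energy (46) becomes
$$\Pi_\varepsilon(C, \varepsilon) = \frac{1}{2}\big(C_{11}\varepsilon_1^2 + 2C_{33}\varepsilon_3^2\big)$$
or, in terms of stress (see (45)),
$$\Pi_\sigma(S, \sigma) = \frac{1}{2}\Big(\Big(S_{11} - \frac{S_{12}^2}{S_{22}}\Big)\sigma_1^2 + 2S_{33}\sigma_3^2\Big).$$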
The problem of robust optimal design becomes
$$P_{strip} = \min_{C \in G_m\text{-closure}}\ \max_{f\in\mathcal{F}} \Pi_\sigma(S, \sigma), \tag{47}$$
where $G_m$-closure is the set of all possible effective compliance tensors of a microstructure formed from the two given materials with the compliance tensors $S_A$ and $S_B$, taken in the proportions $m_A$ and $m_B = 1 - m_A$, respectively, see [8, 27]. We reformulate the problem using a sum of weighted energies, where the minimized functional is taken as a sum of the energies due to the extreme loadings.
5.2.3. Laminates of Third Rank: Symmetry
The description of the strongest structures that minimize the sum of the energies due to several loadings is known (see the original papers [2, 3, 17] and the books [8, 29]); the best structures in 2D are so-called “laminates of the third rank” shown in Figure 5. In 3D, they are the sixth-rank laminates [17]. Structural optimization based on third-rank composites was effectively developed for the multi-loading case in [6, 10, 25]. The effective compliance tensor $S = C^{-1}$ of a third-rank composite (the symmetric fourth-order tensor of elasticity) has the representation
$$S = S_A + m_B\big[(S_B - S_A)^{-1} + m_A N\big]^{-1}, \tag{48}$$
where $S_A$ is the compliance of an enveloping (reinforcing) material, $S_B$ is the compliance of the material in the nucleus, and N is the matrix of structural parameters that depends on the structure of the composite, see [8, 29],
$$N = E_A \sum_{i=1}^{3} \alpha_i P(\phi_i), \qquad \sum_{i=1}^{3} \alpha_i = 1, \qquad \alpha_i \ge 0.$$
Here $E_A$ is the Young’s modulus of the A-material, the angles $\phi_i$ define the directions of laminates (directions of reinforcement), and P is a tensor product of four directional vectors $z_i = (\cos\phi_i, \sin\phi_i)$:
Figure 5. The schematic picture of the composite of the third rank.
$$P(\phi_i) = z_i \otimes z_i \otimes z_i \otimes z_i. \tag{49}$$
Here $\alpha_i$ is the corresponding relative thickness of the reinforcing layer in the ith direction.
The above-mentioned symmetry of an optimal composite requires the orthotropy of the optimal structure. Since the original materials are isotropic, the structure is orthotropic if the matrix N is orthotropic. This can be achieved by setting
$$\phi_2 = -\phi_3 = \phi, \qquad \alpha_2 = \alpha_3 = \alpha.$$
Generally, the optimal strip is reinforced by three layers of strong material; one layer (with relative volume fraction 1 − 2α) is directed in the y-direction and two other layers (with equal relative volume fractions α) are symmetrically inclined by the angles ±φ. In addition, the structure may degenerate into a single layer (when α = 0) or two symmetric layers (when α = 1/2) with angles φ and −φ. Because of this symmetry, the matrix N for an optimal composite becomes
$$N = (1 - 2\alpha)P(0) + \alpha P(\phi) + \alpha P(-\phi). \tag{50}$$
Let us compute the compliance of a third-rank composite in the basis (44). The compliance $S_A$ of an isotropic material A is given by the matrix
$$S_A = \frac{1 + \nu_A}{E_A} \begin{pmatrix} 1 - \nu_A & -\nu_A & 0 \\ -\nu_A & 1 - \nu_A & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
and similarly for the material B. To compute the effective compliance of a third-rank laminate, we first represent the matrix P(φ) of (49) in the basis (44),
$$P(\phi) = \begin{pmatrix} \cos^4\phi & \sin^2\phi\cos^2\phi & \sqrt{2}\sin\phi\cos^3\phi \\ \sin^2\phi\cos^2\phi & \sin^4\phi & \sqrt{2}\sin^3\phi\cos\phi \\ \sqrt{2}\sin\phi\cos^3\phi & \sqrt{2}\sin^3\phi\cos\phi & 2\sin^2\phi\cos^2\phi \end{pmatrix},$$
and obtain from (50)
$$N = \begin{pmatrix} 1 - 2\alpha + 2\alpha\cos^4\phi & 2\alpha\sin^2\phi\cos^2\phi & 0 \\ 2\alpha\sin^2\phi\cos^2\phi & 2\alpha\sin^4\phi & 0 \\ 0 & 0 & 4\alpha\sin^2\phi\cos^2\phi \end{pmatrix}.$$
The matrix N is the variable part of the compliance matrix (see (48)); it depends on only two scalar parameters, φ and α. The structural optimization problem (47) finally becomes an algebraic problem
$$J_{strip} = \min_{\phi, \alpha}\ \max_{\theta}\ \Pi_\sigma\big(S(\phi, \alpha), \sigma(\theta)\big); \tag{51}$$
the expressions for the quantities involved are described above. The angle θ is the angle of deviation of the loading from the normal, and φ and α are structural parameters.
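The algebraic problem (51) is small enough to explore by brute force. The sketch below is a rough illustration under assumptions made here and is not the authors' code: it uses the plane-strain form of the isotropic compliance and the matrix N exactly as written above, scales N by E_A as in the definition following (48), takes the parameter values quoted in Section 5.2.5 as defaults, and replaces the optimization by a grid search; the function names are hypothetical. It reports no numerical results.

```python
import numpy as np


def s_iso(E, nu):
    """Isotropic compliance in the basis (44), in the form quoted in the text."""
    return (1.0 + nu) / E * np.array([[1.0 - nu, -nu, 0.0],
                                      [-nu, 1.0 - nu, 0.0],
                                      [0.0, 0.0, 1.0]])


def n_matrix(phi, alpha):
    """Structural matrix (50) written out in the basis (44)."""
    s2c2 = np.sin(phi)**2 * np.cos(phi)**2
    return np.array([[1 - 2*alpha + 2*alpha*np.cos(phi)**4, 2*alpha*s2c2, 0.0],
                     [2*alpha*s2c2, 2*alpha*np.sin(phi)**4, 0.0],
                     [0.0, 0.0, 4*alpha*s2c2]])


def s_eff(phi, alpha, EA=1.0, EB=5.0, nu=0.3, mA=0.2):
    """Effective compliance (48); N is scaled by E_A (assumption, see lead-in)."""
    SA, SB = s_iso(EA, nu), s_iso(EB, nu)
    inner = np.linalg.inv(SB - SA) + mA * EA * n_matrix(phi, alpha)
    return SA + (1.0 - mA) * np.linalg.inv(inner)


def energy(S, gamma, theta, f0=1.0):
    """Stress energy with sigma_2 eliminated through the condition eps_2 = 0."""
    s1, s3 = f0 + gamma * np.cos(theta), gamma * np.sin(theta)
    return 0.5 * ((S[0, 0] - S[0, 1]**2 / S[1, 1]) * s1**2 + 2.0 * S[2, 2] * s3**2)


def minimax(gamma, n_phi=91, n_alpha=11, n_theta=181):
    """Grid search for (51): returns (value, phi, alpha, worst theta)."""
    thetas = np.linspace(0.0, np.pi, n_theta)
    best = (np.inf, None, None, None)
    for phi in np.linspace(0.0, np.pi / 2, n_phi):
        for alpha in np.linspace(0.0, 0.5, n_alpha):
            vals = energy(s_eff(phi, alpha), gamma, thetas)
            k = int(np.argmax(vals))
            if vals[k] < best[0]:
                best = (vals[k], phi, alpha, thetas[k])
    return best
```

Calling, for instance, `minimax(0.2)` and `minimax(0.6)` returns the grid estimates of the optimal compliance together with the minimizing (φ, α) and the worst deviation angle θ; the grid resolution limits the accuracy of any critical values of γ read off from such a scan.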
5.2.4. Second Rank Structure is Optimal
Although in the general case of minimization of a sum of energies corresponding to multiple loadings the third-rank laminates are optimal, here the optimal structures are second-rank, not third-rank, laminates. To prove this statement we must find the derivative of the energy with respect to α in the algebraic minimization problem (51), and demonstrate that it does not become zero; this would give the optimal value of α on the boundary of the constraint. However, we skip this bulky calculation and give a physical argument supported by results of numerical optimization.
Because of the absence of a displacement in the x-direction, there is no need to reinforce this direction. Moreover, the stress in the composite does not change if a layer with infinite stiffness oriented along the x-axis is added to the composition. If this infinitely stiff layer is counted, then the structure would be reinforced by three layers of stiff material. Since the stiffness of a structure with an infinitely stiff layer is not smaller than the stiffness of a structure without such a layer, the optimality of the second-rank laminates follows.
This conclusion is supported by results of numerical optimization, which gives $\alpha_{opt} = 1/2$ for all settings. Physically, this means that the optimal structure is reinforced either by single laminates oriented across the strip (the case when φ = 0) or by a second-rank laminate with two symmetric reinforcement directions φ and −φ, see Figure 4. This degeneration of the third-rank laminates can be explained by the special geometry of the strip and the loading, which do not allow for any strain $\varepsilon_{xx}$ along the strip, and by the assumed independence of the design of the y-coordinate.
The formulas for the effective properties of a symmetric second-rank composite are simplified: they are still given by the expression (48), but the structural matrix N is
$$N = \frac{1}{2}\big(P(\phi) + P(-\phi)\big)$$
instead of (50); in the basis (44) it has the form
$$N = \begin{pmatrix} \cos^4\phi & \sin^2\phi\cos^2\phi & 0 \\ \sin^2\phi\cos^2\phi & \sin^4\phi & 0 \\ 0 & 0 & 2\sin^2\phi\cos^2\phi \end{pmatrix}.$$
We notice that the symmetry in this example efficiently reduces the dimension of the computational problem, but the general method works with or without symmetry.
5.2.5. Numerical Example
For the first example, the following values of parameters were chosen:
$$m_A = 1 - m_B = 0.2, \qquad \nu_A = \nu_B = 0.3, \qquad E_A = 1, \qquad E_B = 5, \qquad f_0 = 1.$$
Figure 6. Bifurcation diagram shows (1) the angle of deviation $\hat\theta(\gamma)$ of the extreme, most dangerous loading and (2) the angle $\hat\phi(\gamma)$ of optimal reinforcement of the second rank laminated composite. Notice that the bifurcation parameter γ has different critical values for the deviation of the loading θ and for the angle of reinforcement φ.
The relative magnitude γ of the variable part of the loading is the parameter of the problem; the angle θ of the optimal deviation of the extreme loading and the structural parameters α and φ are determined from the solution of the min–max optimization problem. We detect three regimes:
1. When γ < γ0 = 0.31, the extreme loading is vertical, $\theta_{opt} = 0$, and the optimal structure is a laminate with vertical layers directed across the strip, $\phi_{opt} = 0$, see Figure 6.
2. At the critical value γ0 of the parameter γ, the direction of the extreme deviation undergoes a bifurcation, $\theta_{opt} = \pm\hat\theta(\gamma)$, shown by curve 1 in Figure 6. But for γ < γ1 = 0.46, the optimal structure remains the same: a laminate with layers directed across the strip, $\phi_{opt} = 0$ (curve 2 in Figure 6).
3. When the magnitude γ further increases, γ ≥ γ1, the optimal structure bifurcates as well; it becomes a second-rank matrix laminate with the angle $\phi_{opt} = \pm\hat\phi(\gamma)$ (curve 2 in Figure 6).
Although the problem has two solutions for the extreme loading, the dependence of the compliance on the parameters φ and θ is a saddle-point surface, as is shown in Figure 7. Indeed, the problem is reformulated (relaxed) accounting for non-uniqueness of the loading and for the symmetry in the design.
The following examples demonstrate the dependence of the optimal solution on the ratio of Young’s moduli for the materials in the composite. Figure 8 shows the bifurcation diagrams for different ratios of Young’s moduli. Qualitatively, the picture remains the same, but the critical values of the bifurcation parameter γ are different: the larger the ratio, the smaller the critical values γ0 and γ1 at which the bifurcations occur. The interval (γ0, γ1) decreases with an increase of the ratio of Young’s moduli.
Figure 7. Energy stored in the composite is a saddle point function of the angle of deviation of the loading θ and of the direction of reinforcement φ.
Figure 8. Bifurcation diagram for different ratios of Young’s moduli of the materials in the composite ranging from 1:2 to 1:25. (a) Bifurcation of the angle θ̂(γ) of deviation of the extreme loading from the normal. (b) Bifurcation of the angle φ̂(γ) of direction of the optimal reinforcement for the second-rank laminated composite.
5.3. DISCUSSION The principal compliance is a basic characteristic of an elastic body which depends only on the shape of the domain and on the stiffness of the material. By the proper normalization of using and C, this quantity is reduced to the dimensionless parameter λ: λ=
, C
and can be treated as a basic integral characteristic of the filled domain along with such properties as main eigenfrequency, the capacity, etc. The optimal design aimed to decrease the principal compliance is a minimax problem; typically, the problem does not have a saddle point and the optimal design provides equal minimal compliance for several extreme loadings. Symmetries and relaxation bring the problem to a saddle-point type. Depending on the type of constraints, the extreme loading can be a principal eigenfunction of an eigenvalue problem, a concentrated loading, or a solution of a bifurcation problem.
Acknowledgement
The authors acknowledge the support from NSF and ARO.
Rivlin’s Representation Formula is Ill-Conceived for the Determination of Response Functions via Biaxial Testing
JOHN C. CRISCIONE
Department of Biomedical Engineering, Texas A&M University, College Station, TX 77843-3120, U.S.A. E-mail: [email protected]
Received 6 August 2002; in revised form 14 March 2003
Abstract. The experimental determination of a strain energy function W for a rubber specimen must address departures from an elastic ideal in a rational fashion. Herein, such a rational experimental method is developed for biaxial stretching experiments and applied to rubber data in the literature. It is shown that Rivlin’s representation formula is experimentally ill-conceived because experimental error is magnified to the extent that error obscures trends in the response function plots. Upon developing direct tensor expressions for the response function calculations, we show that Rivlin’s representation formula (or any such constitutive law that has high covariance amongst the response terms) magnifies experimental error greatly. By “high covariance”, we mean the inner product amongst the response terms in the constitutive law is nearly equal to the maximum possible value – i.e., the product of their magnitudes. Moreover, we show that the second partials of W with respect to I1 and I2 should approach infinity as the strain decreases. Using an alternate set of invariants with minimal covariance (i.e., a null inner product amongst the response terms), a W for rubber can be determined forthwith.
Mathematics Subject Classifications (2000): 74-05, 74B20.
Key words: finite elasticity, elastomer, biaxial testing.
This work is dedicated to the memory of Clifford A. Truesdell, whom I have only known through his writings. As is evident from his publications and our own, Professor Truesdell was a man of great reason, and our rational investigations of mechanics have benefited immensely from his devotion.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 197–215. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
By defining strain energy functions of the form W(I1, I2) where I1 and I2 are respectively the first and the second principal invariants of C, the right Cauchy–Green deformation tensor, it is possible to find exact solutions to some nonlinear boundary and initial value problems in mechanics of incompressible materials with behavior that is isotropic and hyperelastic. This approach was pioneered by Rivlin
(e.g., [1]). Moreover, many of these boundary value problems can be solved analytically to provide universal solutions – solutions that are valid regardless of the specific form of W . Nevertheless, to use finite elasticity theory in practice or to verify results in solid-state physical chemistry (e.g., statistical mechanics of polymer chains), it is necessary to determine a W for a real material. Experimental finite elasticity, however, is in its infancy and is in need of much attention. In part, this is due to the complexity of the task. Yet, even for the most basic case, the results of experiments have been inconclusive and often contentious. Specifically, let us consider the biaxial stretching of a rubber sheet – a statically determinate test on an isotropic elastomer with minimal hysteresis and with nearly incompressible behavior. Here, the response functions (i.e., ∂W/∂I1 and ∂W/∂I2 ) can be calculated directly from the biaxial stretch data. Because of a magnification of experimental error in much of the deformation range of rubber, however, the functional form of W (I1 , I2 ) remains elusive in the sense that experiments cannot determine it in a definitive manner. Ambiguity in determining W (I1 , I2 ) is, incidentally, used by Rivlin as a justification for not reporting a W in his seminal paper with Saunders [2]. Rivlin and Sawyers [3] state: “. . . was not explicitly presented in the paper, since it was felt that other expressions for W could fit the experimental results equally well and it seemed invidious to select this one for special mention.” Some of the various forms of W (I1 , I2 ) for elastomers in the literature [2, 4–9] are displayed in Table I. Additionally, one cannot solve for the I1 and I2 response functions for uniaxial tests (i.e., uniaxial stretch and equibiaxial stretch) unless one assumes a functional form for W a priori. 
Whereby, calculations of ∂W/∂I1 and ∂W/∂I2 for uniaxial tests are only as valid as that which is assumed for the functional form of W(I1, I2) in the first place. As a notorious case in point (see the discussion in [10]), a Mooney plot only yields C10 and C01 if indeed the test piece behaves like a Mooney material (entry 1 in Table I). Herein, we show that this indeterminacy for uniaxial tests and this magnification of experimental error in determining W(I1, I2) are due to significant covariance amongst the response terms in the constitutive law for Cauchy stress or true stress t. All constitutive theories with such covariance, moreover, will magnify the experimental error that is inherent in tests on real materials. To understand why, let us first define the covariance amongst tensors in an explicit fashion. Toward this end, let the covariance ratio between second order tensors A1 and A2 be defined as
\[ R_C(\mathbf{A}_1, \mathbf{A}_2) = \frac{\operatorname{abs}(\mathbf{A}_1 : \mathbf{A}_2)}{|\mathbf{A}_1|\,|\mathbf{A}_2|}, \tag{1.1} \]
The experimental determination of constitutive laws for elastic solids that undergo large
deformations.
Table I.
W = C10 (I1 − 3) + C01 (I2 − 3)    Mooney [4] (1940)
W = C10 (I1 − 3) + C01 (I2 − 3) + C02 (I2 − 3)²    Rivlin et al. [2] (1951)
W = C10 (I1 − 3) + C01 ln(I2/3)    Gent et al. [5] (1958)
W = C10 ∫ exp(c1 (I1 − 3)²) dI1 + C01 ln(I2/3)    Hart-Smith [6] (1967)
W = C10 (I1 − 3) + C10 (I2 − 3) + C02 ln((I2 − 3 + c2)/c2)    Alexander [7] (1968)
W = C10 ∫ exp(c1 (I1 − 3)²) dI1 + C01 ln(I2/3)    Alexander [7] (1968)
W = a0 (I1 − 3) − a1 (I1 − 3)^(−1) + a2 (I1 − 3)^(−3/2) (I2 − 3) + A0 (I2 − 3) − A1 ln(I2 − 3)    Obata et al. [8] (1970)
W = C10 (I1 − 3) + C20 (I1 − 3)² + C30 (I1 − 3)³ + C01 (I2 − 3) + C11 (I1 − 3)(I2 − 3)    James et al. [9] (1975)
The Cij, ci, Ai, and ai are all constants. Most of the constants have different symbols in the corresponding reference. The notation of Obata et al. [8] is preserved because W was not given and it was necessary to integrate their response functions [8, equation 15]. Rivlin and Saunders [2] did not report the W corresponding to that above, yet they draw such a W response in their response function plots.
where A1 : A2 = tr(A1ᵀ A2) is the inner product of A1 and A2, and |A1|, for example, is the magnitude of A1, or √(A1 : A1). It follows (from the Cauchy–Schwarz inequality) that: (1) RC(A1, A2) ∈ [0, 1]; (2) RC(A1, A2) = 1 iff A1 and A2 are colinear (i.e., there exist two nonzero scalars a1 and a2 such that a1A1 + a2A2 is the null tensor); and (3) RC(A1, A2) = 0 iff A1 and A2 are mutually orthogonal (i.e., their inner product A1 : A2 vanishes). Terminology such as “A1 and A2 are orthogonal” has to be avoided because it is conventional to say “A1 and A2 are orthogonal tensors” when A1ᵀ = A1⁻¹ and A2ᵀ = A2⁻¹ rather than when A1 : A2 = 0. Hence, the covariance is high if RC(A1, A2) is near 1 and low if near 0. Let us consider a constitutive law of the form t = −qI + α1A1 + α2A2 wherein q is an indeterminate pressure, α1 and α2 are scalar response functions and A1 and A2 are symmetric, deviatoric and kinematic tensors. Upon separately contracting A1 and A2 onto t, we obtain two equations that can be solved for response functions α1 and α2 in terms of stress and strain measurements. As shown in the Appendix, error in stress measurements will propagate through response function calculations, and the error will be magnified by the factor (1 − RC(A1, A2)²)^(−1/2). Hence, if the covariance is high then error will be magnified greatly (with an infinite magnification of error if RC(A1, A2) = 1). For an incompressible material with W = W(I1, I2), the constitutive law for t is expressed by Rivlin’s representation formula, which is
Figure 1. Covariance ratio RC of the I1 and I2 response terms for the biaxial stretch of an incompressible sheet. λ1 and λ2 are the in-plane stretch ratios and the data points shown correspond to the biaxial stretch tests of Rivlin and Saunders [2]. The light blue and the black data are respectively the constant I2 and constant I1 testing protocols. RC is maximal and 1 iff the response terms are colinear. RC is zero iff the response terms are mutually orthogonal. For much of the stretch domain of rubber, covariance amongst response terms is significant in the sense that RC is close to unity.
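The covariance ratio plotted in Figure 1 can be sketched numerically. The Python snippet below is a minimal illustration, not the author's code: it assumes an incompressible sheet (so that the third principal stretch is 1/(λ1 λ2)), represents the coaxial tensors B and B⁻¹ by their principal values, and evaluates RC of (1.1) for their deviatoric parts together with the magnification factor (1 − RC²)^(−1/2). All function names are hypothetical.

```python
import math


def dev(principal):
    """Deviatoric part of a tensor given by its three principal values."""
    mean = sum(principal) / 3.0
    return [p - mean for p in principal]


def covariance_ratio(a, b):
    """R_C of (1.1) for two coaxial tensors given by principal values."""
    inner = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("R_C is undefined when either tensor vanishes")
    return min(1.0, abs(inner) / (norm_a * norm_b))  # clamp round-off


def rc_rivlin(lam1, lam2):
    """R_C(dev(B), dev(B^-1)) for biaxial stretch of an incompressible sheet."""
    lam3 = 1.0 / (lam1 * lam2)
    b = [lam1 ** 2, lam2 ** 2, lam3 ** 2]
    b_inv = [1.0 / v for v in b]
    return covariance_ratio(dev(b), dev(b_inv))


def error_magnification(rc):
    """Factor (1 - R_C^2)^(-1/2); infinite when the terms are colinear."""
    gap = 1.0 - rc * rc
    return float("inf") if gap <= 0.0 else 1.0 / math.sqrt(gap)


# Pure shear, uniaxial extension, and equibiaxial stretch at lam1 = 2.
for lam1, lam2 in ((2.0, 1.0), (2.0, 2.0 ** -0.5), (2.0, 2.0)):
    rc = rc_rivlin(lam1, lam2)
    print(lam1, lam2, rc, error_magnification(rc))
```

The reference configuration λ1 = λ2 = 1 is excluded because both deviators vanish there. In floating point the uniaxial and equibiaxial cases give a ratio that is 1 only to within round-off, so the printed magnification is either infinite or extremely large.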
\[ \mathbf{t} = -p\mathbf{I} + 2\frac{\partial W}{\partial I_1}\mathbf{B} - 2\frac{\partial W}{\partial I_2}\mathbf{B}^{-1}, \tag{1.2} \]
where B is the left Cauchy–Green deformation tensor. It follows from the results in the Appendix that error in the response function calculations will be magnified by the factor (1 − RC(dev(B), dev(B⁻¹))²)^(−1/2), wherein dev(B), for example, denotes the deviatoric part of B. Figure 1 displays RC(dev(B), dev(B⁻¹)) for biaxial stretching tests. Note that it is nearly 1 for moderate strain and it is equal to 1 for uniaxial tests. The magnification of error is large and even infinite in much of the strain domain of rubber. In Section 2, we provide a detailed analysis of the experimental error in assuming that rubber is elastic, and in Section 3, we develop a method of analysis that rationally adjusts for this error. In this rational experimental method, we fit the data with a stress tW(λ1, λ2) that is continuous and hyperelastic-like (i.e., tW satisfies the necessary conditions for an isotropic hyperelastic material being stretched biaxially). To facilitate direct comparison, we use the same tW to generate
response function plots for Rivlin’s representation formula and for a novel representation formula [11]. In so doing, we show that the experimental determination of W(I1, I2) from biaxial stretch tests is ill-conceived. The representation formula of Criscione et al. [11], on the other hand, is well-posed. This latter formula has minimal covariance (i.e., null inner products amongst the response terms), and the form of W for rubber can be determined forthwith from biaxial stretch tests. Furthermore, we show that the second partials of W with respect to I1 and/or I2 should approach infinity as the strain vanishes. Following Mooney [4], most elasticians assume the complete opposite – i.e., they consider rubber to be such that ∂²W/∂I1², ∂²W/∂I2², and ∂²W/∂I1∂I2 vanish. As a simple example of a W with singular second partials, consider W = µ|E|² + γ|E|³ where |E| is the magnitude of the Green strain tensor and µ and γ are positive constants. Such a material law has smooth, monotonic behavior that recapitulates linear elasticity when |E| is small. Nevertheless, it is easy to show that ∂²W/∂I1², ∂²W/∂I2², and ∂²W/∂I1∂I2 go to infinity as |E| vanishes. (Indeed, |E|² = (I1² − 2I1 − 2I2 + 3)/4, so the cubic term γ|E|³ contributes second partials that scale as 1/|E|.) One cannot rule out the existence of cubic dependence on strain magnitude in W. Based on the analysis herein (Section 4) and the experiments of Obata et al. [8] (in particular, note that the slope of ∂W/∂I1 vs. I2 becomes steeper as I1 and I2 approach 3), one should, in fact, expect singular second partials for W(I1, I2) when I1 and I2 approach 3.
2. Experimental Error in Biaxial Tests on Rubber
Ideally, a test specimen that is composed of an elastic solid should behave such that there is one stress response for each state of deformation. In practice, however, and even with measurement error notwithstanding, stress measurements display variation when one particular configuration is retested multiple times (provided that the force transducers have a good enough resolution). If one is investigating the inelastic behavior of metals, polymers, etc., then this departure from elasticity is of primary interest. Nevertheless, many solids have a range of deformation wherein their behavior is predominately hyperelastic in the sense that the work done on the specimen is mostly recoverable. For such materials, a hyperelasticity framework may be useful for predicting material behavior and for investigating the physical origin of the mechanical behavior. Yet, if an elasticity law is to be determined from tests on rubber, then it is imperative that departures from an elastic ideal be considered as experimental error. A direct relationship between irreproducibility of stress data and experimental error is undeniable when one assumes there to be only one stress state for every strain state. This fact is neglected, however, in almost all experimental reports on rubber. To the knowledge of this author, only Treloar [10] and Jones and Treloar [12] address this type of experimental error. This type of error is an “error of definition” (e.g., see Beers [13]) as opposed to an “error of measurement”. It is error nevertheless, and like measurement error, it represents an amount by which we are uncertain of the elastic stresses in
rubber. Rather than simply report the error as the resolution of the transducer, an experimentalist should retest the same values of λ1 and λ2 in a multitude of ways (e.g., loading and unloading). The variance amongst the stress measurements is the square of the error in assuming that the true stress is the average of these stress measurements at this particular strain. Retests of data points are rarely (if ever) reported in the rubber literature for error analysis purposes; however, such retests often arise unintentionally. For example, consider the biaxial stretching data of Rivlin and Saunders [2], wherein there are the data λ1 = 2.3, λ2 = 1.91, t1 = 21.5 kg/cm², and t2 = 16.5 kg/cm² in their Table I (constant I1 protocols), but in their Table II (constant I2 protocols) they report λ1 = 2.3, λ2 = 1.92, t1 = 21.6 kg/cm², and t2 = 16.0 kg/cm². These two data points have the same λ1, but λ2 of the latter is greater. One would expect t2 of the latter to be greater as well, yet in fact, it is lesser. This difference (0.5 kg/cm²) cannot be attributed to measurement error (less than 0.1 kg/cm²). Hence, one must suspect that the experimental error associated with Rivlin and Saunders’ data is predominately that which is due to the inelasticity of their specimen rather than that which is due to the resolution of their transducers (calibrated helical springs). From the hysteresis loop reported by Rivlin and Saunders it should be evident that their rubber specimen has a small amount of inelasticity. As a fair quantification of their experimental error, let us assume that the loading curve and unloading curve are bounds on the stress variance in such a manner that these curves are one standard deviation on either side of an expected mean curve. Whereby, half of the difference of the loading and unloading stresses vs. stretch would be the experimental error in uniaxial stress vs. stretch.
Upon noticing that the hysteresis loop is wider at larger strains, a first approximation may be that the experimental error is proportional to the stress measurement. In particular, we estimate that the error in assuming elasticity is 2% of the stress measurement. A detailed error analysis for tests on rubber remains wanting. However, in lieu of such, all error bars in the figures herein are calculated assuming that the error in knowing an elastic stress value (at a particular configuration) is 2% of the measured stress value. Rivlin and Saunders tried to minimize the effects of hysteresis; yet for the two overlapping data points mentioned above, the t2 data differ more than 2% from their mean. Also recall that the difference is in the wrong direction. (To be fair, there are two other overlapping data sets in these data tables, yet the set with the greatest variance is given here. To justify so doing, note that the variance displayed by two data points will be typically less than the full variance displayed by multiple data since it is relatively rare to have both points significantly deviate to opposite sides of a true mean – i.e., that obtained from many data.)
In order to estimate the propagation of error in the response function calculations for each data point in the biaxial stretch tests of Rivlin and Saunders [2], use their equations for calculating the response functions with rearrangement as follows:
\[ \frac{\partial W}{\partial I_1} = \frac{\lambda_1^2}{2(\lambda_1^2-\lambda_2^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_1 - \frac{\lambda_2^2}{2(\lambda_1^2-\lambda_2^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_2, \tag{2.1a} \]
\[ \frac{\partial W}{\partial I_2} = \frac{1}{2(\lambda_2^2-\lambda_1^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_1 - \frac{1}{2(\lambda_2^2-\lambda_1^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,t_2, \tag{2.1b} \]
where t1 and t2 are the in-plane principal stresses that correspond respectively with the in-plane principal stretches λ1 and λ2. Assuming the error in t (at particular values of λ1 and λ2) is 2% of the measured stress, the error in t1, for example, is Δt1 = ±2t1/100. The errors Δt1 and Δt2 are unrelated, and as such, the error propagates as the square root of the sum of each term squared,
\[ \Delta\frac{\partial W}{\partial I_1} = \pm\left[\left(\frac{\lambda_1^2}{(\lambda_1^2-\lambda_2^2)(\lambda_1^2-\lambda_1^{-2}\lambda_2^{-2})}\,\frac{t_1}{100}\right)^2 + \left(\frac{\lambda_2^2}{(\lambda_1^2-\lambda_2^2)(\lambda_2^2-\lambda_1^{-2}\lambda_2^{-2})}\,\frac{t_2}{100}\right)^2\right]^{1/2}. \tag{2.2} \]
Error propagation in the I2 response function calculation is obtained in a similar manner.
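The calculation in (2.1a), (2.1b), and (2.2) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the 2% relative error is the estimate adopted in the text, and the helper that generates Mooney stresses is an assumption introduced purely to check that the formulas recover known constants. The formulas are singular at equibiaxial stretch (λ1 = λ2), so that case must be avoided.

```python
import math


def _squares(lam1, lam2):
    """Squared principal stretches; the third follows from incompressibility."""
    l1, l2 = lam1 ** 2, lam2 ** 2
    return l1, l2, 1.0 / (l1 * l2)


def response_functions(lam1, lam2, t1, t2):
    """(dW/dI1, dW/dI2) from in-plane stresses, following (2.1a) and (2.1b)."""
    l1, l2, l3 = _squares(lam1, lam2)
    w1 = (l1 * t1 / (2.0 * (l1 - l2) * (l1 - l3))
          - l2 * t2 / (2.0 * (l1 - l2) * (l2 - l3)))
    w2 = (t1 / (2.0 * (l2 - l1) * (l1 - l3))
          - t2 / (2.0 * (l2 - l1) * (l2 - l3)))
    return w1, w2


def response_errors(lam1, lam2, t1, t2, rel_err=0.02):
    """Propagated errors in (dW/dI1, dW/dI2) for independent stress errors."""
    l1, l2, l3 = _squares(lam1, lam2)
    dt1, dt2 = rel_err * abs(t1), rel_err * abs(t2)
    dw1 = math.hypot(l1 * dt1 / (2.0 * (l1 - l2) * (l1 - l3)),
                     l2 * dt2 / (2.0 * (l1 - l2) * (l2 - l3)))
    dw2 = math.hypot(dt1 / (2.0 * (l2 - l1) * (l1 - l3)),
                     dt2 / (2.0 * (l2 - l1) * (l2 - l3)))
    return dw1, dw2


def mooney_stresses(lam1, lam2, c10, c01):
    """Synthetic biaxial stresses of a Mooney material (check data only)."""
    l1, l2, l3 = _squares(lam1, lam2)
    return (2.0 * (l1 - l3) * (c10 + c01 * l2),
            2.0 * (l2 - l3) * (c10 + c01 * l1))


# Round trip on synthetic data: the assumed constants should be recovered.
t1, t2 = mooney_stresses(2.0, 1.5, c10=1.0, c01=0.1)
print(response_functions(2.0, 1.5, t1, t2))

# One of the Rivlin and Saunders data points quoted in the text (kg/cm^2).
print(response_functions(2.3, 1.91, 21.5, 16.5))
print(response_errors(2.3, 1.91, 21.5, 16.5))
```

With rel_err = 0.02 the first returned error reduces to the right-hand side of (2.2), because the factor 2 in Δt1 = ±2t1/100 cancels the factor 2 in the denominators of (2.1a).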
Figure 2. I1 and I2 response function plots of the biaxial stretch data of Rivlin and Saunders [2] with error bars obtained by assuming that the experimental error in the stress data is 2% of the stress measurement. There are more points in this plot than in Figure 6 of Rivlin and Saunders because all of the data in their Tables I and II are plotted here. The symbols correspond to those used by Rivlin and Saunders and they also correspond to those in Figure 1 herein. The magnification of error is greatest near equibiaxial stretch where, as evident in Figure 1, the covariance is greatest.
Figure 2 displays the response function plots of the biaxial stretch data in [2] with error bars included. The error bars are, in general, small for the data in the vicinity of pure shear. Not surprisingly, the data near pure shear are in regions with the least amount of covariance (see Figure 1). Data in the red zone of Figure 1 have substantial error bars – even at high values of I1 and I2. Moreover, Rivlin and Saunders intentionally did not test regions with a higher magnification of experimental error. In the moderate strain range (i.e., extensions below 100%), error bars would be much larger, approaching infinity as the strain decreases.
3. Rational Experimental Method
As shown in the prior section, biaxial stretch data on rubber specimens will display departures from the elastic ideal. So too will the data depart from isotropic and hyperelastic ideals. To address this error in a rational fashion, we assume that a stress measurement tM at the strain state given by λ1 and λ2 is
\[ \mathbf{t}_M = \mathbf{t}_W + \Delta\mathbf{t}, \tag{3.1} \]
where tW is a continuous function of the strain that satisfies the assumptions of isotropy and hyperelasticity. The error Δt represents the amount by which measurements depart from the isotropic, hyperelastic ideal. For a biaxial test with a specimen being stretched λ1 and λ2 in the associated directions e1 and e2, let tW be written as
\[ \mathbf{t}_W = t_{W1}(\lambda_1, \lambda_2)\,\mathbf{e}_1 \otimes \mathbf{e}_1 + t_{W2}(\lambda_1, \lambda_2)\,\mathbf{e}_2 \otimes \mathbf{e}_2, \tag{3.2} \]
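where tW1 and tW2 are scalar functions of λ1 and λ2 and the sheet surface (with normal e3) is traction free. In order for the stress response to be hyperelastic and isotropic with respect to the reference configuration, the following constraints should be evident:
\[ t_{W1}(\lambda_1, \lambda_2) = t_{W2}(\lambda_2, \lambda_1), \tag{3.3a} \]
\[ t_{W2}(\lambda_1, \lambda_1^{-1/2}) = 0 \quad \forall\,\lambda_1 \ge 1, \tag{3.3b} \]
\[ t_{W1}(\lambda_2^{-1/2}, \lambda_2) = 0 \quad \forall\,\lambda_2 \ge 1, \tag{3.3c} \]
\[ \frac{\partial(\lambda_1 t_{W1})}{\partial\lambda_2} = \frac{\partial(\lambda_2 t_{W2})}{\partial\lambda_1}. \tag{3.3d} \]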
We sought a continuous, best fit for tW by defining 9 nodes and 4 elements in the (λ1, λ2) plane. The nodes have (λ1, λ2) values as follows: (1, 1); (1, 1.75); (1, 2.5); (1.75, 1); (1.75, 1.75); (1.75, 2.5); (2.5, 1); (2.5, 1.75); and (2.5, 2.5). The domains of the 4 elements are: (1) λ1 < 1.75 and λ2 < 1.75; (2) λ1 ≥ 1.75 and λ2 < 1.75; (3) λ1 < 1.75 and λ2 ≥ 1.75; and (4) λ1 ≥ 1.75 and λ2 ≥ 1.75. Constraint (3.3d) allows the definition of a potential ω(λ1, λ2) with λ1 tW1 = ∂ω/∂λ1 and λ2 tW2 = ∂ω/∂λ2. Hence, to find a best fit for tW we sought an ω with
bicubic Hermite interpolation within each element (C¹ smoothness across element boundaries). Let there be n measurements of t1, t2, λ1, and λ2. With i = 1, 2, . . . , n, let $t_{M1}^{[i]}$ and $t_{M2}^{[i]}$ be the stress measurements of t1 and t2 at stretches $\lambda_1^{[i]}$ and $\lambda_2^{[i]}$. To determine the nodal degrees of freedom, we minimized the following error function:
\[ \sum_{i=1}^{n}\left[\left(\lambda_1^{[i]} t_{M1}^{[i]} - \left.\frac{\partial\omega}{\partial\lambda_1}\right|_{\lambda_1=\lambda_1^{[i]},\,\lambda_2=\lambda_2^{[i]}}\right)^2 + \left(\lambda_2^{[i]} t_{M2}^{[i]} - \left.\frac{\partial\omega}{\partial\lambda_2}\right|_{\lambda_1=\lambda_1^{[i]},\,\lambda_2=\lambda_2^{[i]}}\right)^2\right], \tag{3.4} \]
subject to nodal constraints required by (3.3a). Since (3.3a) and (3.4) only constrain derivatives of ω, we enforced ω(1, 1) = 0 in order to obtain a solution. In total, there are 19 independent degrees of freedom to be determined. Conditions (3.3b), (3.3c) were not enforced during the fit (except at λ1 = λ2 = 1), yet the degrees of freedom determined by minimizing (3.4) were such that (3.3b), (3.3c) were nearly but not exactly satisfied. Let $t_{W1}^{*}$ be our initial data fit that does not satisfy (3.3c). To satisfy (3.3c) exactly, we added the following function to $t_{W1}^{*}(\lambda_1, \lambda_2)$:
\[ \zeta_1(\lambda_1, \lambda_2) = \begin{cases} 0, & \lambda_1 \ge 1, \\[4pt] -\dfrac{(\lambda_1 - 1)^2}{(\lambda_2^{-1/2} - 1)^2}\, t_{W1}^{*}(\lambda_2^{-1/2}, \lambda_2), & \lambda_1 < 1. \end{cases} \tag{3.5} \]
In this work, we do not consider deformations with in-plane compression (i.e., $\lambda_1 < \lambda_2^{-1/2}$ when λ2 ≥ 1), and hence the denominator of the fraction in (3.5) is greater than or equal to the numerator. Since $\zeta_1 = -t_{W1}^{*}(\lambda_2^{-1/2}, \lambda_2)$ when λ2 > 0 and $\lambda_1 = \lambda_2^{-1/2}$, the augmented data adjustment, $t_{W1}$ ($= t_{W1}^{*} + \zeta_1$), will vanish and thus satisfy (3.3c). A likewise method (yet with λ1 and λ2 interchanged) was used to enforce (3.3b). Although not continuous at (1, 1) in general, ζ1 is continuous in this case. To understand why, consider a deformation with λ1 = 1 − αε and λ2 = 1 + ε in the vicinity of the reference configuration (i.e., ε ≪ 1) with ε positive. (Since we do not consider deformations with in-plane compression, negative ε would necessitate λ1 > 1 and ζ1 = 0.) Since we do not consider deformation with in-plane compression, α has its maximal value of 1/2 when the deformation is uniaxial extension in the e2 direction. When α ≤ 0 then λ1 ≥ 1 and ζ1 = 0. When α ∈ (0, 1/2], we have
\[ |\zeta_1| = \left|-\alpha^2 \lambda_2 \left(\sqrt{\lambda_2} + 1\right)^2 t_{W1}^{*}\right| \le 2\,|t_{W1}^{*}| \tag{3.6} \]
for sufficiently small ε > 0. The continuity of ζ1 at (1, 1) follows from the smoothness of $t_{W1}^{*}$ and the fact that $t_{W1}^{*}$ vanishes at λ1 = λ2 = 1. Rivlin and Saunders’ biaxial data is too sparse for our fitting method because data is lacking for small and moderate strains. Jones and Treloar’s data, as tabulated in [14], is better with n = 99 and uniform coverage of the (λ1, λ2) plane. Figure 3
Figure 3. Fit of tW to biaxial stretch data in [14]. The data for both t1 and t2 are on this plot with the λ1 t1 values shown directly and with the λ2 t2 values shown at the point with λ1 and λ2 transposed (see text). The fit is smooth, monotonic, and it satisfies the isotropy and hyperelasticity assumptions. A segment at each data point bridges the data and fit values. The two circles highlight regions (with transposed λ1 and λ2 ) where the data violate hyperelasticity and thus depart from the fit (see text).
plots the data and our fit. Only the λ1 tW1 surface is shown because (3.3a) requires a mirror symmetry such that λ2 tW2 is obtained when λ1 and λ2 are interchanged (i.e., reflection about a plane given by the condition λ1 = λ2). The $\lambda_1^{[i]} t_{M1}^{[i]}$ data points are shown directly. The $\lambda_2^{[i]} t_{M2}^{[i]}$ data points are displayed by interchanging λ1 and λ2. Note that our fit is smooth, monotonic and representative of the test data. There are systematic departures – two of which are circled. Such systematic departures also occur when the number of degrees of freedom of the fit is increased (34 independent degrees of freedom obtained from 16 nodes and 9 elements) because these departures violate the hyperelasticity constraint. To see this, combine constraints (3.3a) and (3.3d) to obtain
\[ \left.\frac{\partial}{\partial\lambda_2}\Big[\lambda_1 t_{W1}(\lambda_1, \lambda_2)\Big]\right|_{\lambda_1=\lambda_1^{*},\,\lambda_2=\lambda_2^{*}} = \left.\frac{\partial}{\partial\lambda_2}\Big[\lambda_1 t_{W1}(\lambda_1, \lambda_2)\Big]\right|_{\lambda_1=\lambda_2^{*},\,\lambda_2=\lambda_1^{*}}, \tag{3.7} \]
where $\lambda_1^{*}$ and $\lambda_2^{*}$ are arbitrary constants. As highlighted on the left side of Figure 3, the surface is flat such that the left side of (3.7) is near zero. Yet in the region highlighted on the right (i.e., the region with λ1 and λ2 transposed), the data have a nonzero slope. Our fit must depart from the data because it models the stress in a hyperelastic material – one that satisfies (3.7).
Figure 4. Biaxial stretching trajectories with I1 or I2 held constant at 3.5, 4, 5, 6, and 7. Also shown are the equibiax line and the uniaxial stretch curves. The trajectories with I1 constant have a radius of curvature (on the equibiax line) with a center toward the origin. The trajectories with I2 constant, on the other hand, have a radius of curvature (on the equibiax line) with a center away from the origin.
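Trajectories of the kind shown in Figure 4 can be generated with a short Python sketch. It assumes an incompressible sheet, so that I1 = λ1² + λ2² + λ1⁻²λ2⁻² and I2 = λ1⁻² + λ2⁻² + λ1²λ2², and solves the resulting quadratic in λ2² for a prescribed λ1. The function names are hypothetical, and the choice of the larger root (which makes λ2 at least as large as the thickness stretch) is an assumption of this sketch.

```python
import math


def invariants(lam1, lam2):
    """(I1, I2) for biaxial stretch of an incompressible sheet."""
    l1, l2 = lam1 ** 2, lam2 ** 2
    l3 = 1.0 / (l1 * l2)
    return l1 + l2 + l3, 1.0 / l1 + 1.0 / l2 + 1.0 / l3


def lam2_at_constant_i1(lam1, i1):
    """In-plane stretch lam2 that keeps I1 fixed for a given lam1."""
    b = i1 - lam1 ** 2                 # equals lam2^2 + lam3^2
    disc = b * b - 4.0 / lam1 ** 2     # discriminant of x^2 - b x + 1/lam1^2
    if b <= 0.0 or disc < 0.0:
        raise ValueError("this lam1 is not reachable at the given I1")
    return math.sqrt((b + math.sqrt(disc)) / 2.0)


def lam2_at_constant_i2(lam1, i2):
    """In-plane stretch lam2 that keeps I2 fixed for a given lam1."""
    a = lam1 ** 2
    b = i2 - 1.0 / a                   # equals 1/lam2^2 + lam1^2 lam2^2
    disc = b * b - 4.0 * a             # discriminant of a y^2 - b y + 1
    if b <= 0.0 or disc < 0.0:
        raise ValueError("this lam1 is not reachable at the given I2")
    return math.sqrt((b + math.sqrt(disc)) / (2.0 * a))


# A few points on the I1 = 5 and I2 = 5 trajectories.
for lam1 in (1.2, 1.5):
    print(lam1, lam2_at_constant_i1(lam1, 5.0), lam2_at_constant_i2(lam1, 5.0))
```

Both quadratics have two positive roots whose product is λ1⁻², so the two roots are λ2² and the squared thickness stretch; selecting the other root simply relabels these two directions.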
4. W(I1, I2) Has Singular Second Partials

The prior section described how to obtain a stress field tW for rubber that is a continuous function of (λ1, λ2) and satisfies the assumptions of isotropy and hyperelasticity. With this data adjustment, we can now attempt to find W(I1, I2). Toward this end, consider the ideal testing trajectories in Figure 4 which separately hold I1 or I2 at the values: 3.5, 4, 5, 6, and 7. Note that all deformations with I1 ≤ 7 have λ1 and λ2 within the domain of the fit displayed in Figure 3. Hence, subsequent plots with the I1 axis truncated at 7 are within the stretch range used to determine tW – i.e., we are interpolating, not extrapolating the Jones and Treloar data in [14]. The response function plots in Figure 5 are calculated from λ1, λ2, and tW(λ1, λ2) for the aforementioned I1 and I2 trajectories. Note that highly nonlinear behavior is evident for small and moderate strain despite the fact that tW is smooth, monotonic,
J.C. CRISCIONE
and thus linearly elastic in the small strain limit. Singular second partials of W are evident as I1 and I2 approach 3. Since the error bounds are large and even infinite in the red zone of Figure 1, W(I1, I2) cannot be found from response function plots – i.e., error obscures any trends in the plot.

5. Determining W(K2, K3) Is Well-Posed

Magnification of error is not problematic for all phenomenological theories of rubber elasticity. In fact, Criscione et al. [11] developed an approach using natural strain ln V which minimizes the covariance amongst response terms. For rubberlike materials, [11] reports W = W(K2, K3) where K2 = |dev(ln V)| and K3 = piud(ln V). Whereby, the constitutive law becomes

\[
\mathbf{t} = -q\mathbf{I} + \frac{\partial W}{\partial K_2}\,\operatorname{udev}(\ln\mathbf{V}) - \frac{3\sqrt{1-K_3^2}}{K_2}\,\frac{\partial W}{\partial K_3}\,\operatorname{cudev}(\ln\mathbf{V}).
\tag{5.1}
\]

Although the operators udev(·), piud(·), and cudev(·) are not used in [11], they are defined in the Appendix. With simple substitution, this formulation is consistent. Note that K2 is the magnitude of the distortion strain dev(ln V), and note that the K2 response term is colinear with the distortion strain. As for K3 ∈ [−1, 1], it is the mode-of-distortion (1 for uniaxial extension, 0 for pure shear, −1 for uniaxial contraction), and its response term is orthogonal to the distortion strain. Although K2 appears in the denominator, an important result from [11] is that ∂W/∂K3 must vanish as order $K_2^3$ when K2 goes to zero. To solve for the K2 and K3 response functions, respectively contract udev(ln V) and cudev(ln V) onto (5.1) to obtain

\[
\frac{\partial W}{\partial K_2} = \operatorname{udev}(\ln\mathbf{V}) : \mathbf{t},
\tag{5.2a}
\]
\[
\frac{\partial W}{\partial K_3} = -\frac{K_2\,\operatorname{cudev}(\ln\mathbf{V})}{3\sqrt{1-K_3^2}} : \mathbf{t}.
\tag{5.2b}
\]

The orthogonal nature of the response terms (i.e., udev(ln V) : I = 0, cudev(ln V) : I = 0, udev(ln V) : cudev(ln V) = 0) makes isolation of the response functions easy. Moreover, |udev(ln V)| = 1 and |cudev(ln V)| = 1. With an approach similar to that in the Appendix, it follows from (5.2a) that error in calculating ∂W/∂K2 is on the same order as the root-mean-squared error of the principal stresses. Throughout the entire deformation range of rubber, thus, ∂W/∂K2 can be evaluated without magnification of experimental error.

Footnotes:
We are liberal with our use of W to generally represent strain energy functions, and W(K2, K3) is meant to imply that W can be expressed in terms of K2 and K3. It does not indicate that W depends on K2 and K3 in exactly the same fashion that W(I1, I2) depends on I1 and I2.
dev(ln V) is referred to as distortion strain because the spherical part of ln V solely depends on dilatation whereas its deviatoric part does not depend on dilatation whatsoever.
Figure 5. I1 and I2 response function plots for the trajectories in Figure 4 with the stress given by tW (i.e., the surface fit in Figure 3). Error bars are not shown, yet the error bounds are larger than those in Figure 2 and go to infinity near the ends of each curve.
As discussed in [11], it is appropriate that the ∂W/∂K3 calculation be sensitive to error for uniaxial, axis-symmetric deformations (i.e., $K_3^2 = 1$) because the K3 response term must vanish in order to satisfy symmetry. In particular, note from (5.1) that dev(t) is colinear to dev(ln V) for uniaxial deformations – a necessary
Figure 6. Functional dependence of W on K2 and K3. W and its derivatives are in units of MPa. The top panels show how ∂W/∂K2 (a) and W (b) depend on K3. The width of the line spans the upper and lower error bounds. A functional form, given by W = g(K2) + K3 h(K2), is appropriate. Fitting a line to the W vs. K3 relation at multiple values of K2 yields the intercept g(K2) and slope h(K2), which are plotted in panels (c) and (d), respectively. As required by [11], note that g(K2) goes to zero as order $K_2^2$ whereas h(K2) vanishes faster.
condition for an axis-symmetric deformation superimposed on an isotropic body. Because of this, the K3 response function has an infinite magnification of error for uniaxial tests. Yet, it is a simple experimental task to measure how ∂W/∂K2 depends on K3 because the K2 response function is measurable, nevertheless, for uniaxial deformation and all neighboring configurations. In contrast, neither ∂W/∂I1 nor ∂W/∂I2 can be measured for uniaxial tests because their response terms are colinear – the maximum of covariance. Using our data adjustment (i.e., tW in Section 3) to calculate ∂W/∂K2 , Figure 6(a) plots ∂W/∂K2 as a function of K3 when K2 is held constant at 0.25, 0.5, 0.75, 1.0, and 1.25. The thickness of each curve is determined by the error propagation such that the upper and lower edges are maximal and minimal bounds, respectively. With tW being continuous, we numerically integrate ∂W/∂K2 (with dK2 = 0.01 and using the trapezoidal rule) at fixed values of K3 . In so doing for many K3 values, Figure 6(b) plots W as a function of K3 when K2 is held constant. Note the functional form of W is nearly W = g(K2 ) + K3 h(K2 ), i.e., linear in K3 as suggested by [11]. When K2 is held constant, a linear regression of W vs. K3 has an intercept equal to g(K2 ) and a slope equal to h(K2 ). With such a regression
method, we plot g(K2 ) and h(K2 ) in Figures 6(c) and (d), respectively. Finding W (K2 , K3 ) is forthright and accurate.
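The integrate-then-regress procedure just described can be sketched numerically. The snippet below is only an illustration under stated assumptions: the response function `dW_dK2` is a hypothetical stand-in derived from an assumed W = MU·K2² + C·K3·K2³ with made-up constants `MU` and `C`; it is not the fitted tW of Section 3. Only the step dK2 = 0.01, the trapezoidal rule, and the linear regression of W against K3 are taken from the text.

```python
import numpy as np

MU, C = 0.4, 0.05  # hypothetical constants (MPa), chosen only for illustration


def dW_dK2(K2, K3):
    """Assumed K2 response function for W = MU*K2**2 + C*K3*K2**3."""
    return 2.0 * MU * K2 + 3.0 * C * K3 * K2**2


def integrate_W(K2_max, K3, dK2=0.01):
    """Trapezoidal integration of dW/dK2 from 0 to K2_max at fixed K3."""
    n = int(round(K2_max / dK2))
    K2 = np.linspace(0.0, K2_max, n + 1)
    f = dW_dK2(K2, K3)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(K2)))


def fit_g_h(K2_value, K3_values):
    """Linear regression of W vs. K3 at fixed K2; returns (intercept g, slope h)."""
    W = np.array([integrate_W(K2_value, K3) for K3 in K3_values])
    slope, intercept = np.polyfit(K3_values, W, 1)
    return float(intercept), float(slope)


K3_grid = np.linspace(-1.0, 1.0, 21)
for K2_value in (0.25, 0.5, 0.75, 1.0, 1.25):
    g, h = fit_g_h(K2_value, K3_grid)
    print(f"K2 = {K2_value:4.2f}   g = {g:.4f}   h = {h:.4f}")
```

With real data, `dW_dK2` would be replaced by values computed from the fitted stress through (5.2a), and the spread of the regression residuals would indicate how well the linear-in-K3 form holds.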
6. Conclusions

It is shown that a representation formula for t in rubber is experimentally ill-conceived when there is significant covariance amongst the response terms. This is so because experimental error, inherent in tests on rubber, is unacceptably magnified when response functions are calculated from biaxial test data. This result is important because it explains why the most widely used phenomenological theory for rubber elasticity (i.e., that of Rivlin) is the most intractable experimentally. In contrast, the phenomenological theories of Criscione et al. [11] and Ogden [15] contain response terms that are mutually orthogonal (the absolute minimum of covariance), and they are experimentally tractable. Although not yet applied to data, the approach of Laine et al. [16] is sure to be experimentally tractable since the response terms are mutually orthogonal. Furthermore, the analytical example in Section 1, the data in [8], and the results of Section 4 all show that W should have singular second partials with respect to I1 and I2. In contrast, the functional form of W(K2, K3) is simple. As shown in Section 5, one should expect a W(K2, K3) that is linear in K3 and smooth and monotonic in K2. The behavior of the second partials is an important matter because they appear in the equilibrium equations. To their credit, Rivlin and Saunders [2] recognized that experimental error is magnified unacceptably for moderate strain. Subsequently, they restricted their biaxial tests to large strain with I1 and I2 at 5 or above. By quantifying the magnification of error in terms of covariance, we show more precisely why W(I1, I2) cannot be determined in the moderate strain domain and in some domains of high strain (i.e., the red region in Figure 1). Since high covariance is inherent in Rivlin's representation formula itself, it is doubtful that tests such as torsion of a cylinder would be able to determine the functional form of W(I1, I2).
Yet, this is an open question – only biaxial stretching is considered in detail here. As for the statistical theory of rubber, many consider W (I1 , I2 ) to be useful or at least thermodynamically convenient (e.g., see Treloar’s book [10]). However, the covariance amongst the I1 and I2 response terms is most pronounced in the moderate strain range where entropy terms (rather than internal energy or crystallization) are most likely to predominate. In other words, entropically based formulations of W are potentially valid for a strain range in which W (I1 , I2 ) cannot be found experimentally. This is a compelling problem for a statistical theory. How unique or useful is a W (I1 , I2 ) formulation that cannot be independently verified with biaxial testing? We conclude that some large variations in W (I1 , I2 ) will perturb the stress only minimally because we have shown the reverse to be true – i.e., small variations in stress values can give rise to large variations in W (I1 , I2 ).
Section 2 focuses on the inelastic behavior of rubber in order to accurately quantify the experimental error in assuming elastic behavior. Albeit important for assessing error, this inelastic behavior is small in comparison to the total stress response. It is worthwhile to note that the stress is predominantly (say 98%) dependent on strain alone when the variation in stress for a given strain is 2%. We do not report a specific functional form of W(K2, K3) for rubber herein because at present there is not an acceptable data set in the literature to do so. Although the data in [14] are useful for showing that W(K2, K3) is experimentally well-posed, there are systematic departures in our fit for tW (Figure 3) that arise because the data violate hyperelasticity. One possible explanation for this is that the specimens may have been held (and stress relaxed) at the uniaxial stretch state before they were stretched in the cross direction. Moreover, isotropy cannot be verified because only half of the stretch domain is tested. More experimental work needs to be done. An ideal testing protocol would randomly test and retest a large set of stretch values that cover the biaxial stretch domain. Upon randomizing the stretches and stretch-rates during the test, the resulting best-fit tW (see Section 3) would be the best guess for the stress if the strain-rate and deformation-history were not known. The standard deviation of the data from the tW fit would be the error in assuming that the stress in rubber is that which is given by a strain energy function alone. Regardless of what experimental method is used, a departure from an isotropic, hyperelastic ideal introduces error that must be addressed in a rational fashion when trying to determine W.

Acknowledgement

The Texas Engineering Experiment Station provided financial support for this investigation.
Appendix

This Appendix shows that there is an inherent magnification of experimental error when there is significant covariance amongst response terms in t. Classically, t for incompressible materials with isotropic elastic behavior is written as:

\[
\mathbf{t} = -p\mathbf{I} + \alpha_1\mathbf{B} - \alpha_2\mathbf{B}^{-1},
\tag{A.1}
\]

where p is an arbitrary scalar, α1 and α2 are scalar response functions, I is the identity tensor, and $\mathbf{B} = \mathbf{F}\mathbf{F}^{T}$ is the left Cauchy–Green deformation tensor where F is the local deformation gradient tensor. To develop general equations for calculation of α1 and α2, let $q = p - \alpha_1\operatorname{tr}(\mathbf{B})/3 + \alpha_2\operatorname{tr}(\mathbf{B}^{-1})/3$. Whereby,

\[
\mathbf{t} = -q\mathbf{I} + \alpha_1\operatorname{dev}(\mathbf{B}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}),
\tag{A.2}
\]

where dev(·) denotes the deviatoric part of the argument.
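Two independent equations for α1 and α2 are obtained by taking the inner product of (A.2) with dev(B) for one and with dev(B⁻¹) for the other. In particular,

\[
\mathbf{t}:\operatorname{dev}(\mathbf{B}) = \alpha_1\operatorname{dev}(\mathbf{B}):\operatorname{dev}(\mathbf{B}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}):\operatorname{dev}(\mathbf{B}),
\tag{A.3a}
\]
\[
\mathbf{t}:\operatorname{dev}(\mathbf{B}^{-1}) = \alpha_1\operatorname{dev}(\mathbf{B}):\operatorname{dev}(\mathbf{B}^{-1}) - \alpha_2\operatorname{dev}(\mathbf{B}^{-1}):\operatorname{dev}(\mathbf{B}^{-1}).
\tag{A.3b}
\]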
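These equations are easily solved, yet to express the solution in a more useful manner, let us introduce the operator udev(·) which denotes the unit deviator of its argument. The unit deviator of B, for example, is dev(B) divided by its magnitude, or equivalently, $\operatorname{udev}(\mathbf{B}) = |\operatorname{dev}(\mathbf{B})|^{-1}\operatorname{dev}(\mathbf{B})$. Upon solving (A.3) and rearranging terms, we obtain

\[
\alpha_1 = \frac{\operatorname{udev}(\mathbf{B}) - \bigl(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1})\bigr)\operatorname{udev}(\mathbf{B}^{-1})}{|\operatorname{dev}(\mathbf{B})|\,\bigl(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\bigr)} : \mathbf{t},
\tag{A.4a}
\]
\[
\alpha_2 = -\frac{\operatorname{udev}(\mathbf{B}^{-1}) - \bigl(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1})\bigr)\operatorname{udev}(\mathbf{B})}{|\operatorname{dev}(\mathbf{B}^{-1})|\,\bigl(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\bigr)} : \mathbf{t}.
\tag{A.4b}
\]

Although udev(B) and udev(B⁻¹) must have unit magnitude, combinations of them do not. In fact, the numerators in (A.4a), (A.4b) have a magnitude of $\sqrt{1 - (\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1}))^2}$. Note further that $(\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1}))^2$ is equal to $R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2$, whereby α1 becomes

\[
\alpha_1 = \frac{\operatorname{udev}\bigl(\operatorname{udev}(\mathbf{B}) - (\operatorname{udev}(\mathbf{B}):\operatorname{udev}(\mathbf{B}^{-1}))\operatorname{udev}(\mathbf{B}^{-1})\bigr)}{|\operatorname{dev}(\mathbf{B})|\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}} : \mathbf{t}.
\tag{A.5}
\]

Now the tensor in the numerator always has unit magnitude, and hence the sum of the squares of its principal values is unity (i.e., $\xi_1^2 + \xi_2^2 + \xi_3^2 = 1$ where $\xi_i$ are the principal values of the tensor in the numerator). As necessary for isotropy, B and t are coaxial. It follows that any combination of B and B⁻¹ is coaxial to t. Consequently, (A.5) becomes

\[
\alpha_1 = \frac{\xi_1 t_1 + \xi_2 t_2 + \xi_3 t_3}{|\operatorname{dev}(\mathbf{B})|\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}},
\tag{A.6}
\]

where the $t_i$ are the corresponding principal values of t. Assuming that the error in knowing each of the principal values (i.e., $t_1$, $t_2$, and $t_3$) of t is uncorrelated to the other principal values, the error bounds of α1 are

\[
\Delta\alpha_1 = \pm\frac{\sqrt{\xi_1^2\,\Delta t_1^2 + \xi_2^2\,\Delta t_2^2 + \xi_3^2\,\Delta t_3^2}}{|\operatorname{dev}(\mathbf{B})|\sqrt{1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2}},
\tag{A.7}
\]

where $\Delta t_1^2$, for example, is the variance in $t_1$. If the $t_i$ have a similar variance (i.e., $\Delta t_1^2 = \Delta t_2^2 = \Delta t_3^2 = \Delta t_0^2$), then the variance of α1 (i.e., $\Delta\alpha_1^2$) is

\[
\Delta\alpha_1^2 = \frac{\Delta t_0^2}{|\operatorname{dev}(\mathbf{B})|^2\,\bigl(1 - R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))^2\bigr)}.
\tag{A.8}
\]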
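As a rough numerical illustration of (A.8), the sketch below evaluates the factor 1/(|dev(B)| √(1 − c²)), with c = udev(B) : udev(B⁻¹), for incompressible biaxial stretches. The function names and the sample stretch values are assumptions made for this example only; the text states that c² equals R_C², and the code relies on nothing more than that.

```python
import numpy as np


def dev(A):
    """Deviatoric part of a 3x3 tensor."""
    return A - np.trace(A) / 3.0 * np.eye(3)


def udev(A):
    """Unit deviator: dev(A) divided by its (Frobenius) magnitude."""
    d = dev(A)
    return d / np.linalg.norm(d)


def magnification(lam1, lam2):
    """Return (c, factor) for an incompressible biaxial stretch (lam1, lam2).

    c is udev(B):udev(B^-1); factor is 1/(|dev B| sqrt(1 - c^2)), i.e. the ratio of
    the standard deviation of alpha_1 to that of the principal stresses in (A.8).
    """
    lam3 = 1.0 / (lam1 * lam2)
    B = np.diag([lam1**2, lam2**2, lam3**2])
    Binv = np.linalg.inv(B)
    c = float(np.sum(udev(B) * udev(Binv)))
    denom = np.linalg.norm(dev(B)) * np.sqrt(max(1.0 - c**2, 0.0))
    factor = np.inf if denom == 0.0 else 1.0 / denom
    return c, factor


# Approach the uniaxial state lam2 = lam1**-0.5 from a pure-shear-like state lam2 = 1.
lam1 = 2.0
for lam2 in (1.0, 0.9, 0.8, 0.75, lam1**-0.5):
    c, factor = magnification(lam1, lam2)
    print(f"lam2 = {lam2:.4f}   c = {c:+.6f}   factor = {factor:.3e}")
```

In this sketch the factor grows without bound as the stretch state approaches uniaxial extension, where dev(B) and dev(B⁻¹) become colinear.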
It should be evident that error in t will be greatly magnified when $R_C(\operatorname{dev}(\mathbf{B}),\operatorname{dev}(\mathbf{B}^{-1}))$ is near 1. To generalize for all isotropic, incompressible, elastic materials, note that one does not have to use B and B⁻¹ as the kinematic tensors in t. In particular, we may write

\[
\mathbf{t} = -q\mathbf{I} + \alpha_1\mathbf{A}_1 + \alpha_2\mathbf{A}_2,
\tag{A.9}
\]

wherein α1 and α2 are scalar response functions and A1 and A2 are deviatoric tensors that are linearly independent combinations of dev(B) and dev(B⁻¹). With an approach similar to the above, it follows that the error in the calculation of α1 and α2 will grow as $R_C(\mathbf{A}_1,\mathbf{A}_2)$ approaches unity. In order to avoid magnification of experimental error, hence, an ideal choice for A1 and A2 would be combinations of dev(B) and dev(B⁻¹) that are mutually orthogonal so that $R_C(\mathbf{A}_1,\mathbf{A}_2)$ vanishes. Moreover, if these combinations were normalized such that |A1| = |A2| = 1 then contraction of A1 and A2 onto (A.9) would separately yield

\[
\alpha_1 = \mathbf{t}:\mathbf{A}_1, \qquad \alpha_2 = \mathbf{t}:\mathbf{A}_2.
\tag{A.10}
\]

In so doing, the error in the response function calculations would be on the same order as that of t itself. The response functions and the stress would have the same units and the same variance.

REMARK. Such an orthogonal tensor basis with normalized magnitudes can be defined rather easily (see [17] for verification of all statements in this remark). Toward this end, let D be a linear combination of B and B⁻¹. udev(D) will be a combination of dev(B) and dev(B⁻¹) that has unit magnitude. As for a tensor that is orthogonal to udev(D), use the complementary unit deviator of D, cudev(D), as given by

\[
\operatorname{cudev}(\mathbf{D}) = \frac{\sqrt{6}\,\mathbf{I} + 3\operatorname{piud}(\mathbf{D})\operatorname{udev}(\mathbf{D}) - 3\sqrt{6}\,\operatorname{udev}(\mathbf{D})^2}{3\sqrt{1 - \operatorname{piud}(\mathbf{D})^2}},
\tag{A.11}
\]

where $\operatorname{piud}(\mathbf{D}) = 3\sqrt{6}\,\det(\operatorname{udev}(\mathbf{D}))$ is the principal invariant of the unit deviator of D. Since tr(udev(D)) is zero and tr(udev(D)²) is unity, udev(D) only has one principal invariant, det(udev(D)), which happens to be bounded by $\pm(3\sqrt{6})^{-1}$. Hence, piud(D) ∈ [−1, 1], and it is such that piud(D) = −1 iff D is like the strain of uniaxial contraction and piud(D) = 1 iff D is like the strain of uniaxial extension. With generous use of the Cayley–Hamilton equation for udev(D), it is possible to show that cudev(D) : cudev(D) = 1 and cudev(D) : udev(D) = 0. Being deviatoric, it should be evident that udev(D) : I = 0 and cudev(D) : I = 0.

Footnote:
For uniaxial, isochoric contraction, strain tensors have one negative principal value and two positive ones that are equal. For uniaxial, isochoric extension, strain tensors have one positive principal value and two negative ones that are equal.
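The statements in the Remark can be spot-checked numerically. The sketch below implements udev, piud, and cudev as written in (A.11), verifies the stated orthonormality for one randomly generated symmetric D, and then recovers α1 and α2 by the contractions in (A.10). The random tensor and the values of `alpha1`, `alpha2`, and `q` are hypothetical test inputs, not data from the paper.

```python
import numpy as np

SQRT6 = np.sqrt(6.0)


def udev(D):
    """Unit deviator of a symmetric 3x3 tensor."""
    d = D - np.trace(D) / 3.0 * np.eye(3)
    return d / np.linalg.norm(d)


def piud(D):
    """Principal invariant of the unit deviator: 3*sqrt(6)*det(udev(D))."""
    return 3.0 * SQRT6 * np.linalg.det(udev(D))


def cudev(D):
    """Complementary unit deviator per (A.11); undefined when piud(D)**2 == 1."""
    U, k = udev(D), piud(D)
    num = SQRT6 * np.eye(3) + 3.0 * k * U - 3.0 * SQRT6 * (U @ U)
    return num / (3.0 * np.sqrt(1.0 - k**2))


def ddot(A, B):
    """Double contraction A:B of two 3x3 tensors."""
    return float(np.sum(A * B))


rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))
D = 0.5 * (M + M.T)               # hypothetical symmetric kinematic tensor
A1, A2 = udev(D), cudev(D)

alpha1, alpha2, q = 1.7, -0.3, 0.9  # hypothetical response values and pressure-like scalar
t = -q * np.eye(3) + alpha1 * A1 + alpha2 * A2   # stress assembled as in (A.9)

print("A2:A2 =", ddot(A2, A2), "  A1:A2 =", ddot(A1, A2), "  tr(A2) =", np.trace(A2))
print("recovered alpha1 =", ddot(t, A1), "  recovered alpha2 =", ddot(t, A2))
```

Because the basis is orthonormal and deviatoric, the arbitrary scalar q drops out of both contractions, which is the point of (A.10).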
References

1. R.S. Rivlin, Large elastic deformations of isotropic materials: IV. Further developments of the general theory. Phil. Trans. Roy. Soc. A 241 (1948) 379–397.
2. R.S. Rivlin and D.W. Saunders, Large elastic deformations of isotropic materials: VII. Experiments on the deformation of rubber. Phil. Trans. Roy. Soc. A 243 (1951) 251–288.
3. R.S. Rivlin and K.N. Sawyers, The strain-energy function for elastomers. Trans. Soc. Rheol. 20 (1976) 545–557.
4. M. Mooney, A theory of large elastic deformation. J. Appl. Phys. 11 (1940) 582–592.
5. A.N. Gent and A.G. Thomas, Forms of the stored (strain) energy function for vulcanized rubber. J. Polym. Sci. 28 (1958) 625–628.
6. L.J. Hart-Smith, Elasticity parameters for finite deformations of rubber-like materials. Z. Angew. Math. Phys. 17 (1967) 608–626.
7. H. Alexander, A constitutive relation for rubber-like materials. Internat. J. Engrg. Sci. 6 (1968) 549–563.
8. Y. Obata, S. Kawabata and H. Kawai, Mechanical properties of natural rubber vulcanizates in finite deformation. J. Polymer Sci. A 2(8) (1970) 903–919.
9. A.G. James, A. Green and G.M. Simpson, Strain energy function of rubber. I. Characterization of gum vulcanization. J. Appl. Polymer Sci. 19 (1975) 2033–2058.
10. L.R.G. Treloar, The Physics of Rubber Elasticity. Clarendon Press, Oxford (1975) p. 225.
11. J.C. Criscione, J.D. Humphrey, A.S. Douglas and W.C. Hunter, An invariant basis for natural strain which yields orthogonal stress response terms in isotropic hyperelasticity. J. Mech. Phys. S. 48 (2000) 2445–2465.
12. D.F. Jones and L.R.G. Treloar, The properties of rubber in pure homogeneous strain. J. Phys. D: Appl. Phys. 8 (1975) 1285–1304.
13. Y. Beers, Introduction to the Theory of Error. Addison-Wesley, Reading, MA (1957).
14. D.W. Haines and W.D. Wilson, Strain-energy density function for rubber-like materials. J. Mech. Phys. S. 27 (1979) 345–360.
15. R.W. Ogden, Nonlinear Elastic Deformations. Halsted Press, New York (1984).
16. E. Laine, C. Vallee and D. Fortune, Nonlinear isotropic constitutive laws: Choice of the three invariants, convex potentials and constitutive inequalities. Internat. J. Engrg. Sci. 37 (1999) 1927–1941.
17. J.C. Criscione, Direct tensor expression for natural strain which yields a fast, accurate approximation. Internat. J. Comp. Struct. 80 (2002) 1895–1905.
Generalized Hessian and External Approximations in Variational Problems of Second Order

CESARE DAVINI and ROBERTO PARONI
Dipartimento di Ingegneria Civile, Università degli Studi di Udine, Via delle Scienze, 208-33100 Udine, Italy. E-mail: {cesare.davini:roberto.paroni}@dic.uniud.it

Received 18 January 2002; in revised form 13 August 2002

Abstract. We introduce a suitable notion of generalized Hessian and show that it can be used to construct approximations by means of piecewise linear functions to the solutions of variational problems of second order. An important guideline of our argument is taken from the theory of Γ-convergence. The convergence of the method is proved for integral functionals whose integrand is convex in the Hessian and satisfies standard growth conditions.

Mathematics Subject Classifications (2000): 65N12, 65N30, 46N10, 74K20, 74S05.

Key words: numerical methods, non-conforming approximations, Γ-convergence, anisotropic plates.
To the memory of Clifford Truesdell, with gratitude.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 217–242. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The approximation of second, or higher order, variational problems by standard conforming methods is quite cumbersome, since it requires the continuity of the first, or higher order, derivatives across the mesh elements. It is therefore preferable to use alternative methods that with various strategies provide external approximations to the solution, that is, approximations within spaces of functions that are less regular than what would be required by the variational problem. Several families of these methods have been thoroughly studied in the past decades and a well-established theory has been developed. In non-conforming finite elements the regularity requirements across the mesh elements are relaxed and an external approximation is obtained by simply considering bases of non-conforming functions. The possible jumps and discontinuities at element interfaces resulting from nonconformity are completely ignored and just the sum of the contributions over the mesh elements is taken into account. The convergence to the solution is then assured for the so-called consistent finite elements [7]. The mixed methods instead provide external approximations by enlarging the list of primal variables and adding suitable constraints as side conditions. By the introduction of Lagrange multipliers the minimum problem is
then changed into a saddle point problem. Obviously to solve the latter problem a peculiar choice of algorithms is needed [4, 17].

Here we follow a different approach that does not make use of Lagrange multipliers nor ignores the discontinuities across the mesh elements. We consider variational problems of second order in a two-dimensional bounded domain, and give an approximation scheme using spaces of piecewise linear functions defined for a chosen sequence of triangulations of the domain. Precisely, we introduce a notion of generalized Hessian, based on a discrete Green's formula, and show that it endows the union of these discrete spaces with a sequential topology that makes it dense, in an appropriate sense, in the function space in which the given variational problem is defined. This is established under a mild assumption on the triangulations. Then, for a generic functional of integral type whose integrand is convex with respect to the Hessian and satisfies a standard quadratic growth condition, we construct a sequence of functionals defined on these discrete spaces and prove that it Γ-converges to the given functional. Moreover, when the integrand is strictly convex, the minimizers converge to the minimizer of the original problem. So this approach provides an approximation technique. All this generalizes ideas discussed by Davini [10, 11] and Davini and Pitacco [12, 13]. Credit must also be given to an early paper by Glowinski [15] that probably did not receive the attention it deserves.

Applications and the crucial issue of estimating the convergence rate are not considered in the present paper (see however the related paper by Davini and Pitacco [13] where the rate of convergence for the biharmonic problem is studied within the general framework of the mixed method). Our attention is rather focused on general aspects of the method and, particularly, on its connections with the Γ-convergence of functionals. Although it is customarily used for different scopes, it seems to us that the framework of Γ-convergence lends itself quite naturally to approximation purposes. In particular, as in this case, it may lead in a direct way to the introduction of sequences of unconstrained minimum problems for functionals defined in non-conforming spaces and whose minimizers provide external approximations to the solution. Our results are not confined to the quadratic functionals and cover a fairly broad class of problems.

It is worth recalling that various authors have proposed for the quadratic case approximation techniques that turn out to be similar to ours, although they have been worked out in a different perspective. We mention, among others, the works by Bhattacharyya et al. [3, 16], who treated the case of linear anisotropic plates within the scheme of the mixed methods, and those by Angelillo et al. [1, 2], who adapted the argument of Davini and Pitacco [13] for the loading problem of plane anisotropic linear elasticity.

2. Discretization of the Domain

Let $\Omega \subset \mathbb{R}^2$ be an open bounded domain with smooth boundary. Let $T_h := \{T_j\}_{j=1,\dots,P_h}$, with h taking values in some countable set $\mathcal{H}$ of real numbers, be a sequence of triangulations of $\Omega$ regular in the sense of Ciarlet [6], i.e., such that the ratio between

\[
\rho_h = \inf_j \sup_S \{\operatorname{diam}(S) : S \text{ is a disk contained in } T_j\}
\quad\text{and}\quad
h := \sup_j \{\operatorname{diam} T_j\}
\]

is bounded away from zero by a constant independent of h. We denote by $x_i$ the vertices of the triangles $T_j$ and call them the nodes of the mesh. We indicate by $\mathcal{P}_h := \{1, 2, \dots, P_h\}$ and $\mathcal{N}_h := \{1, 2, \dots, N_h\}$ the sets of values taken by the indexes of the triangles and the mesh nodes, respectively. We shall call $T_h$ the primal mesh. Denoting

\[
\Omega_h := \Bigl(\bigcup_{T_j \in T_h} \overline{T}_j\Bigr)^{\circ},
\]

we require that $\Omega_h$ invades $\Omega$ from inside. Following Davini and Pitacco [12, 13], for each $h \in \mathcal{H}$ we also introduce a dual mesh $\hat{T}_h := \{\hat{T}_i\}_{i=1,\dots,N_h}$ consisting of disjoint open polygonal domains, each containing just one primal node, as shown in Figure 1 where the dual elements are drawn with dashed lines. We assume that the sequence of dual meshes is also regular and that

\[
\Omega_h = \Bigl(\bigcup_{\hat{T}_j \in \hat{T}_h} \overline{\hat{T}}_j\Bigr)^{\circ}.
\]

Let $X_h$ be the space of functions which are affine on each $T_j$ and continuous on $\Omega_h$ (briefly, the polyhedral functions over $T_h$), and let $X_{0h} \subset X_h$ denote the set of functions that vanish on $\partial\Omega_h$. We regard $X_{0h}$ as a subspace of $H^1_0(\Omega)$ by extending the functions to zero in $\Omega \setminus \Omega_h$.

Figure 1. $\Omega$, $\Omega_h$, the primal and the dual mesh.
Let $\hat\varphi_i$ be the polyhedral splines in $X_h$ defined by the condition that $\hat\varphi_i(x_j) = \delta_{ij}$ for $i, j = 1, \dots, N_h$. We assume that

\[
|\hat{T}_i| = \int_{\Omega_h} \hat\varphi_i \, dx = \tfrac{1}{3}\,|\operatorname{supp}(\hat\varphi_i)|
\tag{1}
\]

and call it Assumption (H0). Note that it is always possible to construct a mesh with this property, e.g., by taking the nodes of the dual mesh to be the center of mass and the middle points of the sides of the triangles of the primal mesh.

In what follows, in order to keep the notation simple, we sometimes avoid labeling by h certain quantities that are mesh dependent, such as, for instance, the mesh nodes or the mesh elements, assuming that that dependence is clear from the context. Also, it shall be useful to distinguish between the internal nodes, which are those that do not belong to $\partial\Omega_h$ and whose indices take value in the set $\mathcal{I}_h \subset \mathcal{N}_h$, and the boundary nodes which are those sitting on it.

DEFINITION 1. We shall say that the sequence of partitions considered has the Property (Γ) if the fourth-order tensor

\[
G_h^{(k)} := \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \otimes (x_j - x_k) \otimes (x_j - x_k)
\]

satisfies the following inequality

\[
\limsup_{h} \, \sup_{k \in \mathcal{I}_h} \bigl\| G_h^{(k)} \bigr\| \le 1,
\]

where $\|\cdot\|$ denotes the sup-norm, i.e.,

\[
\bigl\| G_h^{(k)} \bigr\| := \sup_{H \ne 0} \frac{|G_h^{(k)} H|}{|H|},
\]

H ranging over the space of second order tensors.

A similar condition was required by Glowinski [15]. Angelillo et al. [1, 2] considered instead meshes with the following property (Property (AFF)):

\[
\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j)\cdot(x - x_j)\, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = 0 \quad \forall H
\]

for every node $x_k$. In what follows we prove a couple of lemmas implying that if the mesh satisfies Property (AFF) at the internal nodes, then it has the Property (Γ). As we shall see our assumption that $\Omega_h$ invades $\Omega$ is sufficient to control also the contribution coming from the boundary nodes.

REMARK 1. The fourth-order tensor $G_h^{(k)}$ maps second-order tensors into symmetric tensors. To deduce this it suffices to show that $\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx$ is a symmetric second-order tensor. Note that if $x_j$ and $x_k$ are not the nodes of the same triangle then the integral considered is equal to zero. So, let us fix attention to any node $x_k$ and adopt local labels $t = 1, 2, \dots, t_k$ to denote the nodes all around it and the respective primal triangles. Also, choose counterclockwise ordering and indicate by $n_t$ the unit normals to the sides joining the central node $x_k$ to $x_t^{(k)}$. Let $T_{t-1}^{(k)}$ and $T_t^{(k)}$ be the triangles having the side $(x_k, x_t^{(k)})$, with $x_j \equiv x_t^{(k)}$, in common. The situation is represented in Figure 2.

Figure 2. Two triangles of the primal mesh.

After denoting with $\hat\varphi_{t-1}^{(k)}$ and $\hat\varphi_t^{(k)}$ the restriction of $\hat\varphi_j$ on $T_{t-1}^{(k)}$ and $T_t^{(k)}$, respectively, we may write

\[
\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
= \int_{T_{t-1}^{(k)}} \nabla\hat\varphi_{t-1}^{(k)} \otimes \nabla\hat\varphi_k \, dx
+ \int_{T_t^{(k)}} \nabla\hat\varphi_t^{(k)} \otimes \nabla\hat\varphi_k \, dx,
\]

from which, by applying Green formula, it follows that

\[
\int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
= \int_{(x_k,\,x_{t+1}^{(k)})} \nabla\hat\varphi_t^{(k)} \otimes n_{t+1}\, \hat\varphi_k \, d\tau
- \int_{(x_k,\,x_t^{(k)})} [|\nabla\hat\varphi_t^{(k)}|] \otimes n_t\, \hat\varphi_k \, d\tau
- \int_{(x_k,\,x_{t-1}^{(k)})} \nabla\hat\varphi_{t-1}^{(k)} \otimes n_{t-1}\, \hat\varphi_k \, d\tau.
\]

Now, the gradients $\nabla\hat\varphi_t^{(k)}$ and $\nabla\hat\varphi_{t-1}^{(k)}$ have the direction of $n_{t+1}$ and $n_{t-1}$, respectively. So, the integrands of the first and third integrals are symmetric dyads. On the other hand, by Hadamard lemma, $[|\nabla\hat\varphi_t^{(k)}|] = [|\hat\varphi_{t,n}^{(k)}|]\, n_t$ and the second integral also takes on symmetric values.

The following simple lemma shall be useful in what follows.

LEMMA 1. Let $x_k$ be an internal node. Then, for all H, the identity

\[
G_h^{(k)} H = \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
\tag{2}
\]

holds true.
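Proof. We notice that, if W is skew-symmetric, equation (2) is satisfied because $G_h^{(k)} W = 0$ by the definition. Hence, without loss in generality, let H be any symmetric second-order tensor. Then, from

\[
G_h^{(k)} H = \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} H(x_j - x_k)\cdot(x_j - x_k) \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx,
\]

and by taking into account that $\sum_{j=1}^{N_h} \nabla\hat\varphi_j = 0$, since $\sum_{j=1}^{N_h} \hat\varphi_j = 1$, and that $\sum_{j=1}^{N_h} x_j \hat\varphi_j(x) = x$, we find

\[
\begin{aligned}
G_h^{(k)} H
&= \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} (H x_j \cdot x_j - 2 H x_j \cdot x_k) \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \\
&= \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
+ \frac{1}{|\hat{T}_k|} \int_{\Omega_h} \Bigl(\nabla \sum_{j=1}^{N_h} x_j \hat\varphi_j\Bigr)^{T} H x_k \otimes \nabla\hat\varphi_k \, dx \\
&= \frac{-1}{2|\hat{T}_k|} \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
+ \frac{1}{|\hat{T}_k|}\, H x_k \otimes \int_{\Omega_h} \nabla\hat\varphi_k \, dx.
\end{aligned}
\]

By a Green formula the last integral on the right hand side vanishes because $\hat\varphi_k$ vanishes at the boundary of $\operatorname{supp}\hat\varphi_k$ when $x_k$ is an internal node. Thus, equation (2) holds true. □

The next lemma shows that Property (Γ) is more general than Property (AFF).

LEMMA 2. An internal node $x_k$ owns the property (AFF) if and only if $G_h^{(k)} = I$, where I is the fourth-order identity tensor.

Proof. Without loss in generality it suffices to consider a generic symmetric tensor H. Then,

\[
\begin{aligned}
\sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
&= \int_{\Omega_h} \Bigl(\nabla \sum_{j=1}^{N_h} x_j \hat\varphi_j\Bigr)^{T} H x \otimes \nabla\hat\varphi_k \, dx
= \int_{\Omega_h} H x \otimes \nabla\hat\varphi_k \, dx \\
&= -\int_{\Omega_h} \nabla(H x)\, \hat\varphi_k \, dx
= -H \int_{\Omega_h} \hat\varphi_k \, dx,
\end{aligned}
\]

where we have applied Green's formula and taken into account that $x_k$ is an internal node so that $\hat\varphi_k$ vanishes on the boundary of $\operatorname{supp}\hat\varphi_k$. By Assumption (H0) we then obtain

\[
\sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = -H\,|\hat{T}_k|.
\tag{3}
\]

Moreover we have

\[
\begin{aligned}
\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j)\cdot(x - x_j)\, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
&= \int_{\Omega_h} H x \cdot x \; \nabla\Bigl(\sum_{j=1}^{N_h} \hat\varphi_j\Bigr) \otimes \nabla\hat\varphi_k \, dx
- 2 \sum_{j=1}^{N_h} \int_{\Omega_h} H x \cdot x_j \, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx \\
&\quad + \sum_{j=1}^{N_h} H x_j \cdot x_j \int_{\Omega_h} \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx
\end{aligned}
\]

and by Lemma 1, equation (3) and noticing again that $\sum_{j=1}^{N_h} \hat\varphi_j = 1$, we obtain

\[
\sum_{j=1}^{N_h} \int_{\Omega_h} H(x - x_j)\cdot(x - x_j)\, \nabla\hat\varphi_j \otimes \nabla\hat\varphi_k \, dx = 2|\hat{T}_k|\,\bigl(H - G_h^{(k)} H\bigr).
\]

From this identity it then follows that

\[
G_h^{(k)} H = H \quad \forall H
\]

if and only if Property (AFF) applies. □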
As a consequence of Lemma 2 we deduce that PROPERTY () holds whenever the chosen triangulation forms hexagons or half hexagons formed by triplets of equal isosceles triangles, see Figure 3, since it was proved by Angelillo et al. [1, 2] that in this case PROPERTY (AFF) holds. We now look at the case in which the primal mesh is generated by a rectangular grid of nodes, see Figure 4. In this case, after tedious calculations similar to those done in Remark 1, we deduce (with the notation shown in Figure 4) that $G_h^{(k)} = \gamma I$, where
\[
\gamma := \frac{3(1+\alpha\beta)}{2+2\alpha\beta+\alpha+\beta}
\]
and where $\alpha := a_1/a_2$ and $\beta := b_1/b_2$. Hence PROPERTY () holds provided $\gamma \le 1$, that is, for $(\{\alpha\le 1\}\cap\{\beta\ge 1\})\cup(\{\beta\le 1\}\cap\{\alpha\ge 1\})$, cf. Figure 4. Note that PROPERTY (AFF) holds only for either $\alpha = 1$ or $\beta = 1$.
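The condition on $\gamma$ can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names `gamma` and `gamma_at_most_one` are hypothetical, and the sign test relies on the elementary rearrangement $\gamma \le 1 \iff 1+\alpha\beta-\alpha-\beta \le 0 \iff (1-\alpha)(1-\beta)\le 0$, valid for positive $\alpha$, $\beta$.

```python
from fractions import Fraction


def gamma(alpha, beta):
    """gamma = 3(1 + alpha*beta) / (2 + 2*alpha*beta + alpha + beta), in exact arithmetic."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    return 3 * (1 + alpha * beta) / (2 + 2 * alpha * beta + alpha + beta)


def gamma_at_most_one(alpha, beta):
    """Equivalent sign test (assumes alpha, beta > 0): gamma <= 1 iff (1 - alpha)(1 - beta) <= 0."""
    return (1 - Fraction(alpha)) * (1 - Fraction(beta)) <= 0


if __name__ == "__main__":
    # Illustrative aspect ratios only; they are not taken from the paper.
    for a, b in [(1, 3), (Fraction(1, 2), 2), (2, 2), (Fraction(1, 2), Fraction(1, 3))]:
        print(a, b, gamma(a, b), gamma_at_most_one(a, b))
```

In this sketch $\gamma = 1$ exactly when one of the two ratios equals 1, which matches the remark that PROPERTY (AFF) holds only for $\alpha = 1$ or $\beta = 1$.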
Figure 3. Partition for which PROPERTY (AFF) holds.
Figure 4. α := a1 /a2 , β := b1 /b2 .
3. Generalized Hessian

In what follows we are interested in dealing with functions in $X_{0h}$ and shall imagine them extended to zero in the whole of $\mathbb{R}^2$. Let $\hat v\in X_{0h}$. Then, its second distributional derivatives are defined by $\langle \hat v_{,\alpha\beta},\psi\rangle = -\int_{\Omega_h}\hat v_{,\alpha}\,\psi_{,\beta}\,dx$, for all functions $\psi\in C_0^\infty(\mathbb{R}^2)$. Hence the Hessian, $D^2\hat v$, is a symmetric linear operator from $C_0^\infty(\mathbb{R}^2)$ into the space of two by two real matrices defined by
\[
\langle D^2\hat v,\psi\rangle = -\int_{\Omega_h}\nabla\hat v\otimes\nabla\psi\,dx.
\]
By density we can extend this operator to the Sobolev space $H^1(\Omega_h)$. We shall still denote this extension by $D^2\hat v$. It follows that
\[
\langle D^2\hat v,\hat\psi\rangle = -\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat\psi\,dx, \qquad \hat\psi\in X_h.
\]
Noticing that every function $\hat\psi\in X_h$ can be written as $\hat\psi(x) = \sum_{j=1}^{N_h}\hat\psi(x_j)\hat\varphi_j(x)$, we can write
\[
\langle D^2\hat v,\hat\psi\rangle = -\sum_{j=1}^{N_h}\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat\varphi_j\,dx\;\hat\psi(x_j)
 = \sum_{j=1}^{N_h}H_h\hat v(x_j)\,\hat\psi(x_j)\,|T_j| \tag{4}
\]
for every $\hat v\in X_{0h}$ and every $\hat\psi\in X_h$, where we have set
\[
H_h\hat v(x_j) := -\frac{1}{|T_j|}\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat\varphi_j\,dx. \tag{5}
\]
Given $\hat v\in X_{0h}$, we define the tensor valued function
\[
H_h\hat v := \sum_{j=1}^{N_h}H_h\hat v(x_j)\,\chi_{T_j}, \tag{6}
\]
where $\chi_{T_j}$ denotes the characteristic function of $T_j$. We will call $H_h\hat v$ the generalized Hessian of $\hat v$. Furthermore, given a continuous function $f$ we define
\[
c_h f(x) := \sum_{j=1}^{N_h}f(x_j)\,\chi_{T_j}(x), \qquad
r_h f(x) := \sum_{j=1}^{N_h}f(x_j)\,\hat\varphi_j(x). \tag{7}
\]
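Definitions (5)–(7) can be illustrated with a one-dimensional analogue. The sketch below is an assumption-laden simplification, not the two-dimensional construction of the paper: the nodes lie on a line, the "dual cell" $T_j$ is taken between the midpoints of the two intervals meeting at $x_j$ (so that $\int\hat\varphi_j\,dx = |T_j|$, mirroring ASSUMPTION (HO)), and the names `generalized_hessian_1d` and `c_h` are hypothetical. In this setting $-\int \hat v'\hat\varphi_j'\,dx$ reduces to the jump of $\hat v'$ at $x_j$.

```python
def generalized_hessian_1d(nodes, values):
    """1-D analogue of (5): H_h v(x_j) = -(1/|T_j|) * integral of v' * phi_j'.

    nodes  : strictly increasing coordinates x_0 < ... < x_N
    values : nodal values of the piecewise-linear function v
    Returns the values at the internal nodes x_1, ..., x_{N-1}.
    """
    hess = []
    for j in range(1, len(nodes) - 1):
        h_left = nodes[j] - nodes[j - 1]
        h_right = nodes[j + 1] - nodes[j]
        slope_left = (values[j] - values[j - 1]) / h_left
        slope_right = (values[j + 1] - values[j]) / h_right
        dual_cell = 0.5 * (h_left + h_right)  # |T_j|, midpoint-to-midpoint
        # -int v' phi_j' dx equals the jump of v' across x_j
        hess.append((slope_right - slope_left) / dual_cell)
    return hess


def c_h(nodes, nodal_values, x):
    """1-D analogue of c_h in (7): the nodal value of the dual cell containing x."""
    j = min(range(len(nodes)), key=lambda i: abs(nodes[i] - x))
    return nodal_values[j]


if __name__ == "__main__":
    # Non-uniform nodes chosen for illustration only.
    xs = [0.0, 0.1, 0.3, 0.6, 1.0]
    print(generalized_hessian_1d(xs, [x * x for x in xs]))
```

For nodal values sampled from $x^2$ the sketch returns values equal to 2 up to rounding at every internal node, even on a non-uniform grid, and it returns 0 for affine data.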
Obviously, if $\hat v\in X_h$, then $r_h\hat v = \hat v$. For every $\hat v\in X_{0h}$ and $\hat\psi\in X_h$, then, from equation (4) we deduce that
\[
\langle D^2\hat v,\hat\psi\rangle = \int_{\Omega_h}H_h\hat v\;c_h\hat\psi\,dx. \tag{8}
\]
For later use let us also denote by
\[
\mathring{H}_h\hat v := \sum_{j\in\mathcal{I}_h}H_h\hat v(x_j)\,\chi_{T_j} \tag{9}
\]
the simple function that coincides with the generalized Hessian of $\hat v$ on the internal dual elements and vanishes outside. Note that, when $\hat\psi\in X_{0h}$, equation (8) becomes
\[
\langle D^2\hat v,\hat\psi\rangle = \sum_{j\in\mathcal{I}_h}H_h\hat v(x_j)\,\hat\psi(x_j)\,|T_j| = \int_{\Omega_h}\mathring{H}_h\hat v\;c_h\hat\psi\,dx. \tag{10}
\]
REMARK 2. From what we have shown in Remark 1 it follows that the generalized Hessian is a symmetric tensor valued function for every $\hat v\in X_{0h}$. Its trace turns out to coincide with the generalized Laplacian defined by Davini in [11]. It is worth noticing an interesting interpretation that can be given to the generalized Hessian when the boundary of $T_j$ intersects at the midpoints the edges of the primal triangles that concur at $x_j$. From calculations similar to those made in Remark 1 we get in fact
\[
H_h\hat v(x_j) = \frac{1}{|T_j|}\sum_{t=1}^{t_j}\int_{(x_j,\,x_t^{(j)})}[|\hat v_{,n}|]\,n_t\otimes n_t\,\hat\varphi_j\,ds
 = \frac{1}{|T_j|}\sum_{t=1}^{t_j}\frac{1}{2}\,\bigl|x_j - x_t^{(j)}\bigr|\,[|\hat v_{,n}|]\,n_t\otimes n_t, \tag{11}
\]
where $[|\hat v_{,n}|]$ denotes the jump of the directional derivative of $\hat v$ in the direction normal to the primal side $(x_j, x_t^{(j)})$. It is easy to see that $X_h$ is contained in the space of functions with special bounded Hessian, $SBH(\Omega)$, cf. [5], and that the distributional Hessian of $\hat v\in X_h$ is a Radon measure of the form
\[
D^2\hat v = [|\hat v_{,n}|]\,n\otimes n\;d\mathcal{H}^1\llcorner S(\nabla\hat v), \tag{12}
\]
where $\mathcal{H}^1\llcorner S(\nabla\hat v)$ denotes the one-dimensional Hausdorff measure restricted to the set where the gradient of $\hat v$ is discontinuous. By recalling that the boundary of $T_j$ meets the sides of the primal mesh at the midpoints, we get
\[
D^2\hat v(T_j) = \sum_{t=1}^{t_j}\frac{1}{2}\,\bigl|x_j - x_t^{(j)}\bigr|\,[|\hat v_{,n}|]\,n_t\otimes n_t, \tag{13}
\]
when this measure is calculated on $T_j$. Therefore, by the second equality in (11) it follows that
\[
H_h\hat v(x_j) = \frac{D^2\hat v(T_j)}{|T_j|}, \tag{14}
\]
which motivates us to regard the generalized Hessian of a polyhedral function at $x_j$ as the mean value of the Hessian over $T_j$.

4. Some Properties of a Sequential Topology in $\bigcup_{h\in\mathcal{H}}X_{0h}$

The set $\bigcup_{h\in\mathcal{H}}X_{0h}$ is obviously dense in $H_0^1(\Omega)$, with respect to the $H^1$ norm. The notion of generalized Hessian introduced in Section 3 can be used in order to endow the space $\bigcup_{h\in\mathcal{H}}X_{0h}$ with a sequential topology that makes it dense in $H^2(\Omega)\cap H_0^1(\Omega)$, or $H_0^2(\Omega)$, in an appropriate sense, and allows us to define converging discretization methods that provide external approximations to the solutions of variational problems involving the Hessian. This generalizes ideas discussed by Davini [10, 11] and Davini and Pitacco [12, 13]. It is routine to show that:

LEMMA 3. There exist constants $c$ and $C$, which do not depend on $h$, such that
\[
c\,\|c_h\hat v\|^2_{L^2(\Omega)} \le \|\hat v\|^2_{L^2(\Omega)} \le C\,\|c_h\hat v\|^2_{L^2(\Omega)} \qquad \forall\hat v\in X_h.
\]
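Proof. Let $\hat v\in X_h$ and $T_j$ be one of the primal triangles. Then,
\[
\hat v(x) = \sum_{t_1\in\mathcal{N}_h(j)}\hat v(x_{t_1})\,\hat\varphi_{t_1}(x) \qquad\text{for } x\in T_j,
\]
where the sum extends to the nodes that are vertices of $T_j$ and $\mathcal{N}_h(j)\subset\mathcal{N}_h$ denotes the set of values taken by their respective indices. Hence, we have
\[
\int_{T_j}|\hat v|^2\,dx = \sum_{t_1,t_2\in\mathcal{N}_h(j)}\hat v(x_{t_1})\int_{T_j}\hat\varphi_{t_1}\hat\varphi_{t_2}\,dx\;\hat v(x_{t_2}).
\]
So, the matrix $\bigl(\int_{T_j}\hat\varphi_{t_1}\hat\varphi_{t_2}\,dx\bigr)$ is strictly positive. One can use an affine change of coordinates $x\mapsto x'$ to transform $T_j$ into a normalized triangle, obtaining that
\[
\int_{T_j}\hat\varphi_{t_1}\hat\varphi_{t_2}\,dx = |T_j|\,K_{t_1t_2}, \qquad t_1,t_2\in\mathcal{N}_h(j),
\]
with $K_{t_1t_2}$ the entries of a $3\times 3$ symmetric matrix which is also strictly positive and independent of $j$. It follows that there are positive constants $\lambda$ and $\Lambda$ such that
\[
\lambda\,|T_j|\sum_{t_1\in\mathcal{N}_h(j)}\hat v(x_{t_1})^2 \le \int_{T_j}|\hat v|^2\,dx \le \Lambda\,|T_j|\sum_{t_1\in\mathcal{N}_h(j)}\hat v(x_{t_1})^2.
\]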
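The local bound above can be made concrete. The sketch below assumes the standard values for linear (P1) shape functions on a triangle, $\int_T\varphi_a\varphi_b\,dx = |T|(1+\delta_{ab})/12$, which the text does not state explicitly; under that assumption the eigenvalues of $K$ are $1/12$ (double) and $1/3$, so one admissible choice is $\lambda = 1/12$ and $\Lambda = 1/3$. The names `local_l2_norm_sq` and `check_bounds` are hypothetical.

```python
import random

# Reference P1 mass matrix (assumed standard form): int_T phi_a phi_b dx = |T| * K[a][b].
K = [[2 / 12, 1 / 12, 1 / 12],
     [1 / 12, 2 / 12, 1 / 12],
     [1 / 12, 1 / 12, 2 / 12]]
LAMBDA_MIN, LAMBDA_MAX = 1 / 12, 1 / 3  # smallest and largest eigenvalue of K


def local_l2_norm_sq(area, nodal):
    """Integral of |v|^2 over a triangle of the given area, v affine with these vertex values."""
    return area * sum(nodal[a] * K[a][b] * nodal[b] for a in range(3) for b in range(3))


def check_bounds(trials=1000, seed=0):
    """Randomly test lambda*|T|*sum(v_i^2) <= int_T |v|^2 <= Lambda*|T|*sum(v_i^2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        area = rng.uniform(0.1, 5.0)
        nodal = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        s = area * sum(x * x for x in nodal)
        val = local_l2_norm_sq(area, nodal)
        if not (LAMBDA_MIN * s - 1e-12 <= val <= LAMBDA_MAX * s + 1e-12):
            return False
    return True


if __name__ == "__main__":
    print(check_bounds())
```

The upper bound is attained by constant nodal values and the lower bound by nodal values summing to zero, since $v^{T}Kv = \tfrac{1}{12}\bigl(\sum_i v_i^2 + (\sum_i v_i)^2\bigr)$ for this $K$.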
By summing up over the primal triangles and reorganizing the sums in the first and last term suitably, it follows that
\[
\lambda\sum_{i\in\mathcal{N}_h}\hat v(x_i)^2\sum_{j\in P_h(i)}|T_j| \le \int_{\Omega_h}|\hat v|^2\,dx \le \Lambda\sum_{i\in\mathcal{N}_h}\hat v(x_i)^2\sum_{j\in P_h(i)}|T_j|,
\]
where $P_h(i)\subset P_h$ is the subset of index values relative to the primal triangles that have $x_i$ as a vertex. By observing that $\sum_{j\in P_h(i)}|T_j| = |\operatorname{supp}\hat\varphi_i|$ and recalling ASSUMPTION (HO), we find that
\[
3\lambda\sum_{i\in\mathcal{N}_h}\hat v(x_i)^2\,|T_i| \le \int_{\Omega_h}|\hat v|^2\,dx \le 3\Lambda\sum_{i\in\mathcal{N}_h}\hat v(x_i)^2\,|T_i|,
\]
which is our thesis. □

LEMMA 4. There exists a constant $c$, independent of $h$, such that
\[
\|\nabla\hat v\|_{L^2(\Omega)} \le c\,\|\mathring{H}_h\hat v\|_{L^2(\Omega)} \qquad \forall\hat v\in X_{0h}.
\]
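Proof. This is a straightforward consequence of equation (10) and Lemma 3. In fact, by applying the Cauchy–Schwarz inequality to equation (10), with $\hat\psi$ equal to $\hat v$, and the Poincaré inequality successively, we deduce that
\[
\Bigl|\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat v\,dx\Bigr| \le \|\mathring{H}_h\hat v\|_{L^2(\Omega)}\,\|c_h\hat v\|_{L^2(\Omega)} \le c\,\|\mathring{H}_h\hat v\|_{L^2(\Omega)}\,\|\nabla\hat v\|_{L^2(\Omega)}
\]
for some constant $c$. But
\[
\|\nabla\hat v\|^2_{L^2(\Omega)} = I\cdot\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat v\,dx \le \sqrt{2}\,\Bigl|\int_{\Omega_h}\nabla\hat v\otimes\nabla\hat v\,dx\Bigr| \le \sqrt{2}\,c\,\|\mathring{H}_h\hat v\|_{L^2(\Omega)}\,\|\nabla\hat v\|_{L^2(\Omega)}. \tag{15}
\]
Hence, the lemma is proved. □

We now prove a compactness theorem.

THEOREM 1. Let $\hat v_h\in X_{0h}$ and $v\in L^2(\Omega)$. If
\[
\hat v_h\to v \quad\text{in } L^2(\Omega)
\qquad\text{and}\qquad
\sup_h\|\mathring{H}_h\hat v_h\|_{L^2(\Omega)} < +\infty,
\]
then $v\in H^2(\Omega)\cap H_0^1(\Omega)$,
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
\mathring{H}_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega).
\]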
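Proof. Since $\sup_h\|\mathring{H}_h\hat v_h\|_{L^2(\Omega)} < +\infty$, from Lemma 4 we deduce that
\[
\sup_h\|\nabla\hat v_h\|_{L^2(\Omega)} < +\infty,
\]
and hence $\hat v_h\rightharpoonup v$ in $H^1(\Omega)$. But $\hat v_h\in H_0^1(\Omega)$ and hence also $v\in H_0^1(\Omega)$. Let $f\in C_0^\infty(\Omega)$. Then, for $h$ small enough $\operatorname{supp}f\subset\Omega_h$ and we have that
\[
\begin{aligned}
\int_\Omega\nabla\hat v_h\otimes\nabla r_h f\,dx
&= \sum_{j\in\mathcal{I}_h}\int_\Omega\nabla\hat v_h\otimes\nabla\hat\varphi_j\,dx\;f(x_j)
 = -\sum_{j\in\mathcal{I}_h}|T_j|\,H_h\hat v_h(x_j)\,f(x_j) \\
&= -\int_\Omega\mathring{H}_h\hat v_h\;c_h f\,dx
 = -\int_\Omega\mathring{H}_h\hat v_h\,f\,dx - \int_\Omega\mathring{H}_h\hat v_h\,(c_h f - f)\,dx
\end{aligned}
\]
and hence
\[
\int_\Omega\nabla\hat v_h\otimes\nabla f\,dx = -\int_\Omega\mathring{H}_h\hat v_h\,f\,dx - \int_\Omega\mathring{H}_h\hat v_h\,(c_h f - f)\,dx + \int_\Omega\nabla\hat v_h\otimes\nabla(f - r_h f)\,dx. \tag{16}
\]
Let $\varepsilon > 0$. By using the fact that $c_h f\to f$ in $L^2(\Omega)$ and $r_h f\to f$ in $H^1(\Omega)$, we find that for $h$ sufficiently small we have
\[
\Bigl|\int_\Omega\mathring{H}_h\hat v_h\,(c_h f - f)\,dx\Bigr| + \Bigl|\int_\Omega\nabla\hat v_h\otimes\nabla(f - r_h f)\,dx\Bigr| \le \varepsilon\,\|f\|_{L^2(\Omega)}.
\]
From equation (16) we find
\[
\Bigl|\int_\Omega\nabla\hat v_h\otimes\nabla f\,dx\Bigr| \le c\,\|f\|_{L^2(\Omega)}
\]
and passing to the limit we deduce
\[
\bigl|\langle D^2 v, f\rangle\bigr| = \Bigl|\int_\Omega\nabla v\otimes\nabla f\,dx\Bigr| \le c\,\|f\|_{L^2(\Omega)}.
\]
Thus the Hessian is a bounded linear operator on $L^2(\Omega)$. By the Riesz representation theorem there exists a function $V\in(L^2(\Omega))^{2\times 2}$ such that
\[
\langle D^2 v, f\rangle = \int_\Omega V f\,dx.
\]
As is usually done we shall denote the function $V$ by $\nabla^2 v$. Now, passing to the limit in equation (16) we deduce
\[
\int_\Omega\nabla^2 v\,f\,dx = -\int_\Omega\nabla v\otimes\nabla f\,dx = \lim_h\int_\Omega\mathring{H}_h\hat v_h\,f\,dx
\]
for every $f\in C_0^\infty(\Omega)$, which implies that
\[
\mathring{H}_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega).
\]
We now finish the proof by showing that $\nabla\hat v_h$ converges strongly in $L^2(\Omega)$ to $\nabla v$. To this end, from equation (8) we find
\[
\int_\Omega\nabla\hat v_h\otimes\nabla\hat v_h\,dx = -\int_\Omega\mathring{H}_h\hat v_h\,\hat v_h\,dx + \int_\Omega\mathring{H}_h\hat v_h\,(\hat v_h - c_h\hat v_h)\,dx.
\]
Hence,
\[
\lim_h\int_\Omega\nabla\hat v_h\otimes\nabla\hat v_h\,dx = -\int_\Omega\nabla^2 v\,v\,dx = \int_\Omega\nabla v\otimes\nabla v\,dx, \tag{17}
\]
as $\mathring{H}_h\hat v_h\rightharpoonup\nabla^2 v$, $\hat v_h\to v$ and $(\hat v_h - c_h\hat v_h)\to 0$ in $L^2(\Omega)$. By calculating the trace of the terms on the two sides of (17) it follows that
\[
\lim_h\|\nabla\hat v_h\|^2_{L^2(\Omega)} = \|\nabla v\|^2_{L^2(\Omega)}.
\]
Therefore,
\[
\nabla\hat v_h\to\nabla v \quad\text{in } L^2(\Omega)
\]
since $\nabla\hat v_h\rightharpoonup\nabla v$ in $L^2(\Omega)$. □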
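Assuming PROPERTY () we now deduce a density result.

THEOREM 2. Assume that PROPERTY () holds. For every $v\in H^2(\Omega)\cap H_0^1(\Omega)$ there exists a sequence $\{\hat v_h\}\subset X_{0h}$ such that
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
\mathring{H}_h\hat v_h\to\nabla^2 v \quad\text{in } L^2(\Omega).
\]
Proof. We start by considering $v\in C^\infty(\Omega)\cap H^2(\Omega)\cap H_0^1(\Omega)$ and let $\hat v_h := r_h v$ be its nodal interpolation defined in equation (7). Obviously, $\hat v_h\to v$ in $H^1(\Omega)$. We now compute
\[
H_h\hat v_h(x_j) = -\frac{1}{|T_j|}\int_{\Omega_h}\nabla\hat v_h\otimes\nabla\hat\varphi_j\,dx
 = -\frac{1}{|T_j|}\sum_{k=1}^{N_h}\int_{\Omega_h}\nabla\hat\varphi_k\otimes\nabla\hat\varphi_j\,dx\;v(x_k) \tag{18}
\]
at any internal node $x_j$. Since the function $v$ is smooth we can write
\[
v(x_k) = v(x_j) + \nabla v(x_j)\cdot(x_k - x_j) + \frac{1}{2}\nabla^2 v(x_j)(x_k - x_j)\cdot(x_k - x_j) + o\bigl(|x_k - x_j|^2\bigr).
\]
Then, taking into account that
\[
\Bigl(\sum_{k=1}^{N_h}\int_{\Omega_h}\nabla\hat\varphi_k\otimes\nabla\hat\varphi_j\,dx\Bigr)v(x_j) = 0, \qquad j\in\mathcal{I}_h, \tag{19}
\]
because $\sum_{k=1}^{N_h}\hat\varphi_k = 1$, and
\[
\begin{aligned}
\sum_{k=1}^{N_h}\int_{\Omega_h}\nabla v(x_j)\cdot(x_k - x_j)\,\nabla\hat\varphi_k\otimes\nabla\hat\varphi_j\,dx
&= \int_{\Omega_h}\nabla\Bigl[\nabla v(x_j)\cdot\sum_{k=1}^{N_h}(x_k - x_j)\hat\varphi_k\Bigr]\otimes\nabla\hat\varphi_j\,dx \\
&= \int_{\Omega_h}\nabla\bigl[\nabla v(x_j)\cdot(x - x_j)\bigr]\otimes\nabla\hat\varphi_j\,dx \\
&= \int_{\Omega_h}\nabla v(x_j)\otimes\nabla\hat\varphi_j\,dx = 0, \qquad j\in\mathcal{I}_h,
\end{aligned} \tag{20}
\]
we deduce
\[
\begin{aligned}
H_h\hat v_h(x_j)
&= -\frac{1}{2|T_j|}\sum_{k=1}^{N_h}\int_{\Omega_h}\nabla\hat\varphi_k\otimes\nabla\hat\varphi_j\,dx\;\nabla^2 v(x_j)(x_k - x_j)\cdot(x_k - x_j)
 - \frac{1}{|T_j|}\sum_{k=1}^{N_h}\int_{\Omega_h}\nabla\hat\varphi_k\otimes\nabla\hat\varphi_j\,dx\;o(h^2) \\
&= G_h^{(j)}\nabla^2 v(x_j) + o(1), \qquad j\in\mathcal{I}_h,
\end{aligned} \tag{21}
\]
where we have taken into account that $x_k$ and $x_j$ must belong to the same triangle in order to appear in the expressions above. In equation (21) $o(1)$ indicates a quantity that tends to 0 with $h$, uniformly in $j$. Hence, by applying PROPERTY (),
\[
\begin{aligned}
\limsup_h\|\mathring{H}_h\hat v_h\|^2_{L^2(\Omega)}
&= \limsup_h\sum_{j\in\mathcal{I}_h}|H_h\hat v_h(x_j)|^2\,|T_j|
 \le \limsup_h\sum_{j\in\mathcal{I}_h}\|G_h^{(j)}\|^2\,|\nabla^2 v(x_j)|^2\,|T_j| \\
&\le \limsup_h\,\sup_{j\in\mathcal{I}_h}\|G_h^{(j)}\|^2\sum_{j\in\mathcal{I}_h}|\nabla^2 v(x_j)|^2\,|T_j|
 \le \limsup_h\sum_{j=1}^{N_h}|\nabla^2 v(x_j)|^2\,|T_j|.
\end{aligned}
\]
It follows that
\[
\limsup_h\|\mathring{H}_h\hat v_h\|^2_{L^2(\Omega)} \le \|\nabla^2 v\|^2_{L^2(\Omega)}. \tag{22}
\]
For a subsequence, not relabeled, we have that $\sup_h\|\mathring{H}_h\hat v_h\|^2_{L^2(\Omega)} < +\infty$, and hence from Theorem 1 we deduce that
\[
\mathring{H}_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega),
\]
while from equation (22) we find that the convergence is indeed strong in $L^2(\Omega)$.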
We now consider the general case, i.e., $v\in H^2(\Omega)\cap H_0^1(\Omega)$. Let $w_k\in C^\infty(\Omega)\cap H^2(\Omega)\cap H_0^1(\Omega)$, $k\in\mathbb{N}$, be such that $w_k\to v$ in $H^2(\Omega)$. From the case discussed above we deduce that for every $k$ there exists an $h = h(k)$ such that if we let $\hat v_h := r_{h(k)}w_k$ we have
\[
\|\hat v_h - w_k\|^2_{H^1(\Omega)} \le \frac{1}{k}, \qquad
\|\mathring{H}_h\hat v_h - \nabla^2 w_k\|^2_{L^2(\Omega)} \le \frac{1}{k}.
\]
Now, letting $k$ go to infinity we see that $\hat v_h$ is the sequence we were looking for. □

The following important variants of the previous theorems apply if we take into account the generalized Hessian up to the boundary nodes.

THEOREM 3. Let $\hat v_h\in X_{0h}$ and $v\in L^2(\Omega)$. If
\[
\hat v_h\to v \quad\text{in } L^2(\Omega)
\qquad\text{and}\qquad
\sup_h\|H_h\hat v_h\|_{L^2(\Omega)} < +\infty,
\]
then $v\in H_0^2(\Omega)$,
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
H_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega).
\]
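Proof. Since $\sup_h\|H_h\hat v_h\|_{L^2(\Omega)} < +\infty$ implies that $\sup_h\|\mathring{H}_h\hat v_h\|_{L^2(\Omega)} < +\infty$, the conclusions of Theorem 1 hold. Moreover, obviously, $\mathring{H}_h\hat v_h\rightharpoonup\nabla^2 v$ in $L^2(\Omega)$ implies that $H_h\hat v_h\rightharpoonup\nabla^2 v$ in $L^2(\Omega)$. Thence we have only to prove that
\[
v_{,n} = 0 \quad\text{on } \partial\Omega.
\]
To see this, we can replicate the argument of Davini [11, Lemma 3]. Let $f\in C^\infty(\Omega)$ and observe by the Green formula that
\[
\int_{\partial\Omega}v_{,n}\,f\,ds = \int_\Omega\Delta v\,f\,dx + \int_\Omega\nabla v\cdot\nabla f\,dx. \tag{23}
\]
Proceeding as in Theorem 1, we have that
\[
\begin{aligned}
\int_\Omega\nabla v\cdot\nabla f\,dx
&= \int_\Omega\nabla\hat v_h\cdot\nabla r_h f\,dx + \int_\Omega\nabla\hat v_h\cdot\nabla(f - r_h f)\,dx + \int_\Omega\nabla(v - \hat v_h)\cdot\nabla f\,dx \\
&= \int_\Omega\nabla\hat v_h\cdot\nabla r_h f\,dx + o(1),
\end{aligned} \tag{24}
\]
where we have used the fact that $r_h f\to f$ and $\hat v_h\to v$ in $H^1(\Omega)$. By taking account of equation (24) in equation (23) and observing that, by taking the trace in equation (4) (cf. also equation (8)),
\[
\int_\Omega\nabla\hat v_h\cdot\nabla r_h f\,dx = -\sum_{j=1}^{N_h}\operatorname{trace}H_h\hat v_h(x_j)\,f(x_j)\,|T_j|,
\]
it follows that
\[
\int_{\partial\Omega}v_{,n}\,f\,ds = \int_\Omega\Delta v\,f\,dx - \sum_{j=1}^{N_h}\operatorname{trace}H_h\hat v_h(x_j)\,f(x_j)\,|T_j| + o(1).
\]
Hence, by using obvious inequalities and passing to the limit we get
\[
\Bigl|\int_{\partial\Omega}v_{,n}\,f\,ds\Bigr| \le \Bigl(2^{1/2}\sup_h\|H_h\hat v_h\|_{L^2(\Omega)} + \|\Delta v\|_{L^2(\Omega)}\Bigr)\|f\|_{L^2(\Omega)}.
\]
Then, by density,
\[
\Bigl|\int_{\partial\Omega}v_{,n}\,f\,ds\Bigr| \le \Bigl(2^{1/2}\sup_h\|H_h\hat v_h\|_{L^2(\Omega)} + \|\Delta v\|_{L^2(\Omega)}\Bigr)\|f\|_{L^2(\Omega)} \tag{25}
\]
for all $f\in H^1(\Omega)$. This implies that $v_{,n} = 0$ on $\partial\Omega$ because, for every $f\in H^1(\Omega)$, it is always possible to construct a sequence $\{f_k\}$ such that $f - f_k\in H_0^1(\Omega)$ and $f_k\to 0$ in $L^2(\Omega)$. Thus, $v\in H_0^2(\Omega)$. □

THEOREM 4. Assume that PROPERTY () holds. For every $v\in H_0^2(\Omega)$ there exists a sequence $\{\hat v_h\}\subset X_{0h}$ such that
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
H_h\hat v_h\to\nabla^2 v \quad\text{in } L^2(\Omega).
\]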
Proof. We start by considering $v\in C_0^\infty(\Omega)$ and proceed as in Theorem 2. Let $\hat v_h := r_h v$. Obviously, $\hat v_h\to v$ in $H^1(\Omega)$ and the analysis that leads to equation (21) continues to hold for the internal nodes. On the other hand, for $h$ small enough $v(x_j) = \nabla v(x_j) = 0$ at the boundary nodes. Thus, equations (19) and (20) hold true for every node. It follows that equation (21) applies to every node as well and thence we have that
\[
\limsup_h\|H_h\hat v_h\|^2_{L^2(\Omega)} \le \|\nabla^2 v\|^2_{L^2(\Omega)}. \tag{26}
\]
One of the implications of Theorem 1 is that $H_h\hat v_h\rightharpoonup\nabla^2 v$ in $L^2(\Omega)$, as we have already noticed. Thus, we conclude from (26) that the convergence is indeed strong. By repeating the argument of Theorem 2, the thesis follows from the density of $C_0^\infty(\Omega)$ in $H_0^2(\Omega)$. □
5. External Approximations of Quadratic Functionals: the Equilibrium Problem for Anisotropic Elastic Plates

We use the results of Sections 3 and 4 to prove the convergence of a direct nonconforming approximation scheme for quadratic variational problems involving the Hessian. Namely, we adopt the format of $\Gamma$-convergence theory in order to prove that a suitable sequence of discrete functionals $\{F_h\}$ defined in the spaces $X_{0h}$ $\Gamma$-converges in an appropriate topology to the functional $F$ that describes the problem we wish to approximate. In particular, since the limit functional we shall consider has a unique minimizer and the discrete functionals are equicoercive, one of the key properties of $\Gamma$-convergence stated in Theorems 7.8 and 7.24 of [9] applies:
\[
\min F(v) = \lim_h\,\min F_h(v) \tag{27}
\]
and
\[
u_h\to u, \tag{28}
\]
$u$ and $u_h$ being the minimizers of $F$ and $F_h$, respectively. Therefore, the $u_h$ can be used in order to approximate $u$. To compute them, a sequence of discrete unconstrained minimum problems for the functionals $F_h$ has to be solved, and we can use standard techniques to do it. It is fair to say that the customary perspective of $\Gamma$-convergence is reversed here, since the theory is used for validating approximation schemes rather than for getting a characterization of the limit problem, as is more common in other types of applications.

Let us consider the equilibrium problem for an elastic homogeneous anisotropic plate under transverse loads
\[
\begin{cases}
C_{\alpha\beta\gamma\delta}\,u_{,\gamma\delta\alpha\beta} = f & \text{in } \Omega\subset\mathbb{R}^2,\\
B_0 u = 0 & \text{on } \partial\Omega,\\
B_1 u = 0 & \text{on } \partial\Omega,
\end{cases} \tag{29}
\]
where $f$ describes the applied loads, $C_{\alpha\beta\gamma\delta}$ are the components of the elasticity tensor $C$ of the plate and $B_0$ and $B_1$ are suitable boundary operators. We assume that $C$ has both the minor and major symmetries
\[
C_{\alpha\beta\gamma\delta} = C_{\beta\alpha\gamma\delta} = C_{\alpha\beta\delta\gamma} = C_{\gamma\delta\alpha\beta} \tag{30}
\]
and is strictly positive, that is,
\[
CK\cdot K \ge m\,|K|^2 \qquad \forall K\in\mathrm{Sym} \tag{31}
\]
for some positive real number $m > 0$. Sym stands for the space of two by two symmetric matrices. For our purposes below we also assume that $f\in H^{-1}(\Omega)$. As to the boundary conditions, we focus on two cases:
\[
u = u_{,n} = 0 \quad\text{on } \partial\Omega \qquad\text{(clamping conditions)} \tag{32}
\]
and
\[
u = C_{\alpha\beta\gamma\delta}\,u_{,\gamma\delta}\,n_\alpha n_\beta = 0 \quad\text{on } \partial\Omega \qquad\text{(simple support conditions).} \tag{33}
\]
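To apply the techniques of $\Gamma$-convergence it is convenient to study the variational formulation of problem (29). Let us define
\[
F(v) := \int_\Omega C\nabla^2 v\cdot\nabla^2 v\,dx - 2\langle f, v\rangle,
\]
having denoted by $\langle\cdot,\cdot\rangle$ the duality pairing between $H^{-1}(\Omega)$ and $H_0^1(\Omega)$. Then, the solution of problem (29) can be found by minimizing $F$ among all $v\in H_E^2(\Omega)$, with $H_E^2(\Omega) = H^2(\Omega)\cap H_0^1(\Omega)$ or $H_E^2(\Omega) = H_0^2(\Omega)$ in the simple support and clamping cases, respectively (cf. the limit spaces in Theorems 1 and 3). We shall extend in fact $F$ to $L^2(\Omega)$ by letting it take the value $+\infty$ in $L^2(\Omega)\setminus H_E^2(\Omega)$, that is, we set
\[
F(v) :=
\begin{cases}
\displaystyle\int_\Omega C\nabla^2 v\cdot\nabla^2 v\,dx - 2\langle f, v\rangle & \text{if } v\in H_E^2(\Omega),\\[1ex]
+\infty & \text{if } v\in L^2(\Omega)\setminus H_E^2(\Omega).
\end{cases} \tag{34}
\]
Accordingly, we introduce the sequences of discrete functionals
\[
F_h(v) :=
\begin{cases}
\displaystyle\int_\Omega CH_h v\cdot H_h v\,dx - 2\langle f, v\rangle & \text{if } v\in X_{0h},\\[1ex]
+\infty & \text{if } v\in L^2(\Omega)\setminus X_{0h},
\end{cases} \tag{35}
\]
in the clamping case, and
\[
F_h(v) :=
\begin{cases}
\displaystyle\int_\Omega C\mathring{H}_h v\cdot\mathring{H}_h v\,dx - 2\langle f, v\rangle & \text{if } v\in X_{0h},\\[1ex]
+\infty & \text{if } v\in L^2(\Omega)\setminus X_{0h},
\end{cases} \tag{36}
\]
in the simple support case. The next theorem follows from the theory discussed so far.

THEOREM 5. Let $C$ be a strictly positive symmetric fourth-order tensor and let $f\in H^{-1}(\Omega)$. Assume that PROPERTY () holds. Let $F$ be defined as in (34) and $F_h$ as in (35) or (36) in the clamping and simple support case, respectively. Then, $F_h$ $\Gamma$-converges to the functional $F$ with respect to the $L^2(\Omega)$ topology.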
Proof. The two cases can be treated alike, so let us consider the clamping case for illustration. Since $L^2(\Omega)$ is a metric space we can use the sequential characterization of $\Gamma$-convergence: "$\Gamma(L^2)$-$\lim_h F_h = F$ if and only if the following conditions are satisfied:

(i) $\forall v\in L^2(\Omega)$ and $\forall\{v_h\}\subset L^2(\Omega)$ with $v_h\to v$ in $L^2(\Omega)$: $F(v)\le\liminf_h F_h(v_h)$,

(ii) $\forall v\in L^2(\Omega)$, $\exists\{v_h\}\subset L^2(\Omega)$ with $v_h\to v$ in $L^2(\Omega)$: $F(v)\ge\limsup_h F_h(v_h)$."

We refer to the book of Dal Maso [9] for more details. Let us call the two requirements the lim-inf inequality and the recovery sequence condition, respectively.

We start by proving the lim-inf inequality. Let $v_h$ be a sequence in $L^2(\Omega)$ converging to $v$ in the $L^2$ norm. If $\liminf_h F_h(v_h) = +\infty$ there is nothing to prove. Hence, suppose $\liminf_h F_h(v_h) < +\infty$. By passing to a subsequence, if necessary, we have that $\sup_h|F_h(v_h)| < +\infty$ and hence $\hat v_h := v_h\in X_{0h}$. Moreover, using simple inequalities and Lemma 4 we find
\[
F_h(\hat v_h) \ge m\,\|H_h\hat v_h\|^2_{L^2(\Omega)} - c\,\|\nabla\hat v_h\|_{L^2(\Omega)} \ge c_1\|\nabla\hat v_h\|^2_{L^2(\Omega)} - c\,\|\nabla\hat v_h\|_{L^2(\Omega)} \tag{37}
\]
with $c_1$ a positive constant. From these inequalities we deduce that $\sup_h\|\nabla\hat v_h\|_{L^2(\Omega)} < +\infty$ and that $\sup_h\|H_h\hat v_h\|_{L^2(\Omega)} < +\infty$. Then, by Theorem 3, we obtain that $v\in H_0^2(\Omega)$,
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
H_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega).
\]
Since
\[
0 \le C(H_h\hat v_h - \nabla^2 v)\cdot(H_h\hat v_h - \nabla^2 v) = CH_h\hat v_h\cdot H_h\hat v_h - 2\,CH_h\hat v_h\cdot\nabla^2 v + C\nabla^2 v\cdot\nabla^2 v,
\]
it follows that
\[
CH_h\hat v_h\cdot H_h\hat v_h \ge 2\,CH_h\hat v_h\cdot\nabla^2 v - C\nabla^2 v\cdot\nabla^2 v,
\]
and using this inequality we find
\[
\liminf_h F_h(\hat v_h) = \liminf_h\int_\Omega CH_h\hat v_h\cdot H_h\hat v_h\,dx - 2\lim_h\langle f,\hat v_h\rangle \ge \int_\Omega C\nabla^2 v\cdot\nabla^2 v\,dx - 2\langle f, v\rangle = F(v).
\]
The existence of a recovery sequence is an immediate consequence of Theorem 4, and thus we have finished the proof. □
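To make the use of the discrete functionals concrete, here is a minimal sketch of a one-dimensional analogue of the simple support case (36). It is an illustration under stated assumptions and not the two-dimensional scheme of the paper: the domain is the unit interval with a uniform grid, $C = 1$, the load $f$ is constant and lumped at the nodes, and the discrete Hessian at an internal node is the second difference (the one-dimensional counterpart of (5) with midpoint dual cells). All function names are hypothetical. The stationarity condition of $F_h(v) = h\sum_j (Dv)_j^2 - 2h\sum_j f\,v_j$ is $D(Dv) = f$, where $D$ is the second-difference operator with zero boundary values, so the minimizer is obtained from two tridiagonal solves.

```python
def solve_second_difference(rhs, h):
    """Solve (v[j-1] - 2*v[j] + v[j+1]) / h**2 = rhs[j] with zero end values (Thomas algorithm)."""
    n = len(rhs)
    diag = [-2.0] * n                      # main diagonal; both off-diagonals equal 1
    d = [r * h * h for r in rhs]
    for i in range(1, n):                  # forward elimination
        m = 1.0 / diag[i - 1]
        diag[i] -= m
        d[i] -= m * d[i - 1]
    v = [0.0] * n
    v[-1] = d[-1] / diag[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        v[i] = (d[i] - v[i + 1]) / diag[i]
    return v


def discrete_minimizer(load, n_cells):
    """Internal nodal values of the minimizer of the 1-D discrete functional."""
    h = 1.0 / n_cells
    hessian = solve_second_difference([load] * (n_cells - 1), h)  # discrete Hessian of the minimizer
    return solve_second_difference(hessian, h)


def discrete_energy(v, load, n_cells):
    """F_h(v) = h * sum_j (second difference)^2 - 2 * h * load * sum_j v_j."""
    h = 1.0 / n_cells
    full = [0.0] + list(v) + [0.0]
    hess = [(full[j - 1] - 2.0 * full[j] + full[j + 1]) / h ** 2 for j in range(1, n_cells)]
    return h * sum(H * H for H in hess) - 2.0 * h * load * sum(v)


if __name__ == "__main__":
    n = 32
    u_h = discrete_minimizer(1.0, n)
    # Exact minimizer of int (v'')^2 - 2 v on (0,1) with v = v'' = 0 at the ends.
    exact = [((j / n) ** 4 - 2 * (j / n) ** 3 + j / n) / 24 for j in range(1, n)]
    print(max(abs(a - b) for a, b in zip(u_h, exact)))
```

In this simplified setting the discrete minimizers approach the exact solution as the grid is refined, which is the behaviour that (27) and (28) assert for the two-dimensional functionals.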
REMARK 3. It is possible to extend the analysis and allow the datum $f$ to be in $(H_E^2(\Omega))^*$, the dual of $H_E^2(\Omega)$, rather than just in $H^{-1}(\Omega)$. Indeed, by an easy application of the Lax–Milgram lemma, for every $f\in(H_E^2(\Omega))^*$ there exist an $f^0\in L^2(\Omega)$, an $f^1\in L^2(\Omega;\mathbb{R}^2)$ and an $f^2\in L^2(\Omega;\mathbb{R}^{2\times 2})$ such that
\[
\langle f, v\rangle = \int_\Omega\bigl(f^0 v + f^1\cdot\nabla v + f^2\cdot\nabla^2 v\bigr)\,dx
\]
for every $v\in H_E^2(\Omega)$. Here, $\langle\cdot,\cdot\rangle$ denotes the duality pairing between $H_E^2(\Omega)$ and its dual $(H_E^2(\Omega))^*$. It is then natural to define
\[
\langle f,\hat v\rangle_h := \int_\Omega\bigl(f^0\hat v + f^1\cdot\nabla\hat v + f^2\cdot H_h\hat v\bigr)\,dx
\]
for every $\hat v\in X_{0h}$. Note that if $\hat v_h\rightharpoonup v$ in $H^1(\Omega)$ and $H_h\hat v_h\rightharpoonup\nabla^2 v$ in $L^2(\Omega)$ then $\langle f,\hat v_h\rangle_h\to\langle f, v\rangle$. Thence for $f\in(H_E^2(\Omega))^*$ the previous theorem still holds provided we replace $\langle f, v\rangle$ with $\langle f, v\rangle_h$ in the definition of $F_h$, i.e., in equations (35) and (36).
6. External Approximations of General Convex Functionals

The results of Section 5 extend to more general functionals. Giving symbols the same meaning they had in the previous sections, here we consider functionals of the form
\[
F(v) := \int_\Omega W(x, v, \nabla v, \nabla^2 v)\,dx - 2\langle f, v\rangle \tag{38}
\]
to be minimized in $H_E^2(\Omega)$. We assume that $W:\Omega\times\mathbb{R}\times\mathbb{R}^2\times\mathbb{R}^{2\times 2}\to[0,+\infty)$ is a function satisfying the following requirements:

(H1) $W$ is a Carathéodory function;
(H2) $W(x, s, \xi, \cdot)$ is convex for a.e. $(x, s, \xi)\in\Omega\times\mathbb{R}\times\mathbb{R}^2$;
(H3) there exist two positive constants $c$, $C$ such that $c\,|\Theta|^2\le W(x, s, \xi, \Theta)\le C\,(1 + |\Theta|^2)$ for a.e. $(x, s, \xi)\in\Omega\times\mathbb{R}\times\mathbb{R}^2$.

To stay with a mechanical interpretation, this class of problems encompasses the equilibrium of nonhomogeneous and nonlinearly elastic plates, including the case where the plate is supported by a Winkler foundation, for instance. Since the continuity of $W$ in $x$ is not required, the theory is applicable to materials with inclusions of different materials.

The goal of this section is to study the approximation of this kind of functional in the spaces $X_{0h}$ by using the generalized notion of the Hessian introduced above. As before, we imagine $F$ extended to $L^2(\Omega)$ by defining it equal to $+\infty$ in $L^2(\Omega)\setminus H_E^2(\Omega)$ and introduce the sequences of discrete functionals
\[
F_h(v) :=
\begin{cases}
\displaystyle\sum_{j=1}^{N_h}W\bigl(x_j, v(x_j), \nabla_h v(x_j), H_h v(x_j)\bigr)\,|T_j| - 2\langle f, v\rangle & \text{if } v\in X_{0h},\\[1ex]
+\infty & \text{if } v\in L^2(\Omega)\setminus X_{0h}
\end{cases} \tag{39}
\]
in the clamping case, or
\[
F_h(v) :=
\begin{cases}
\displaystyle\sum_{j=1}^{N_h}W\bigl(x_j, v(x_j), \nabla_h v(x_j), \mathring{H}_h v(x_j)\bigr)\,|T_j| - 2\langle f, v\rangle & \text{if } v\in X_{0h},\\[1ex]
+\infty & \text{if } v\in L^2(\Omega)\setminus X_{0h}
\end{cases} \tag{40}
\]
in the simple support case. In formulae (39) and (40) we have set
\[
\nabla_h v(x_j) := \frac{1}{|T_j|}\int_{T_j}\nabla v\,dx.
\]
THEOREM 6. Assume that (H1)–(H3) and PROPERTY () hold. Let $f\in H^{-1}(\Omega)$, and let $F$ and $F_h$ be the functionals defined in (38) and (39), or (40), according to the studied case. Then $F_h$ $\Gamma$-converges to the functional $F$, with respect to the $L^2(\Omega)$ topology.

To prove the theorem above we shall use the following well-known result:

LEMMA 5 (see [8]). Let $\Omega$ be a bounded open set of $\mathbb{R}^n$. Let $g:\Omega\times\mathbb{R}^m\times\mathbb{R}^N\to[0,+\infty]$ be a Carathéodory function. Let
\[
G(u, \xi) := \int_\Omega g\bigl(x, u(x), \xi(x)\bigr)\,dx.
\]
Assume that $g(x, u, \cdot)$ is convex, that $u_k\to u$ in $L^2(\Omega)$ and that $\xi_k\rightharpoonup\xi$ in $L^2(\Omega)$; then
\[
\liminf_k G(u_k, \xi_k)\ge G(u, \xi).
\]
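Proof of Theorem 6. Again the proof is similar for the two kinds of boundary conditions, so let us consider the clamping case. By using equation (7) and defining
\[
c_h\mathrm{id}(x) := \sum_{j=1}^{N_h}x_j\,\chi_{T_j}(x), \qquad
\nabla_h v(x) := \sum_{j=1}^{N_h}\nabla_h v(x_j)\,\chi_{T_j}(x)
\]
we can write
\[
F_h(v) =
\begin{cases}
\displaystyle\int_\Omega W(c_h\mathrm{id}, c_h v, \nabla_h v, H_h v)\,dx - 2\langle f, v\rangle & \text{if } v\in X_{0h},\\[1ex]
+\infty & \text{otherwise,}
\end{cases}
\]
where $\mathrm{id}:\mathbb{R}^2\to\mathbb{R}^2$ is the identity function.

We first prove the lim-inf inequality. Let $v_h$ be a sequence in $L^2(\Omega)$ converging to $v$ in the $L^2$ norm. If $\liminf_h F_h(v_h) = +\infty$ there is nothing to prove. Hence, suppose $\liminf_h F_h(v_h) < +\infty$. By using (H3) and proceeding as in Theorem 5 we deduce that $\hat v_h := v_h\in X_{0h}$, $v\in H_0^2(\Omega)$,
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
H_h\hat v_h\rightharpoonup\nabla^2 v \quad\text{in } L^2(\Omega).
\]
By the Scorza–Dragoni theorem (cf. [14]) we have that for every $\varepsilon > 0$ there exists a compact subset $K_\varepsilon$ of $\Omega$ such that $|\Omega\setminus K_\varepsilon| < \varepsilon$ and $W$ restricted to $K_\varepsilon\times\mathbb{R}\times\mathbb{R}^2\times\mathbb{R}^{2\times 2}$ is continuous. Now, observing that $c_h\mathrm{id}\to\mathrm{id}$ in $L^\infty(\Omega)$, $c_h\hat v_h\to v$ in $L^2(\Omega)$, and $\nabla_h\hat v_h\to\nabla v$ in $L^2(\Omega)$ we have that $u_h := (c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h)\to u := (\mathrm{id}, v, \nabla v)$ in $L^2(\Omega)$, hence by applying Lemma 5 we find
\[
\liminf_h F_h(\hat v_h)\ge\liminf_h\int_\Omega\chi_{K_\varepsilon}(x)\,W(u_h, H_h\hat v_h)\,dx - 2\lim_h\langle f,\hat v_h\rangle
 \ge\int_\Omega\chi_{K_\varepsilon}(x)\,W(u, \nabla^2 v)\,dx - 2\langle f, v\rangle.
\]
Letting $\varepsilon$ go to zero and applying Fatou's lemma we obtain
\[
\liminf_h F_h(\hat v_h)\ge\int_\Omega W(u, \nabla^2 v)\,dx - 2\langle f, v\rangle = \int_\Omega W(x, v, \nabla v, \nabla^2 v)\,dx - 2\langle f, v\rangle = F(v).
\]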
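We now prove the existence of a recovery sequence. If $v$ does not belong to $H_0^2(\Omega)$ there is nothing to prove. So let us suppose that $v\in H_0^2(\Omega)$. By Theorem 4 there exists a sequence $\hat v_h\in X_{0h}$ such that
\[
\hat v_h\to v \quad\text{in } H^1(\Omega),
\qquad\text{and}\qquad
H_h\hat v_h\to\nabla^2 v \quad\text{in } L^2(\Omega).
\]
It immediately follows that $c_h\hat v_h\to v$ in $L^2(\Omega)$, and $\nabla_h\hat v_h\to\nabla v$ in $L^2(\Omega)$. Moreover by passing to a subsequence, if necessary, we may also suppose that $c_h\hat v_h$, $\nabla_h\hat v_h$ and $H_h\hat v_h$ converge almost everywhere to $v$, $\nabla v$ and $\nabla^2 v$, respectively. Since
\[
\lim_h\langle f,\hat v_h\rangle = \langle f, v\rangle
\]
it suffices to prove that
\[
\limsup_h\int_\Omega W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\,dx\le\int_\Omega W(x, v, \nabla v, \nabla^2 v)\,dx. \tag{41}
\]
Observing that the integrand below is nonnegative, by using (H3) and Fatou's lemma we have
\[
\begin{aligned}
\liminf_h\int_\Omega\bigl[C(1 + |H_h\hat v_h|^2) - W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\bigr]\,dx
&\ge\liminf_h\int_{K_\varepsilon}\bigl[C(1 + |H_h\hat v_h|^2) - W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\bigr]\,dx \\
&= \int_{K_\varepsilon}\bigl[C(1 + |\nabla^2 v|^2) - W(x, v, \nabla v, \nabla^2 v)\bigr]\,dx,
\end{aligned}
\]
where $K_\varepsilon$ is the compact subset of $\Omega$ defined in the first part of the proof. Letting $\varepsilon$ go to zero and applying Fatou's lemma we find
\[
\liminf_h\int_\Omega\bigl[C(1 + |H_h\hat v_h|^2) - W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\bigr]\,dx
 \ge\int_\Omega\bigl[C(1 + |\nabla^2 v|^2) - W(x, v, \nabla v, \nabla^2 v)\bigr]\,dx. \tag{42}
\]
On the other hand we also have
\[
\begin{aligned}
\liminf_h\int_\Omega\bigl[C(1 + |H_h\hat v_h|^2) - W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\bigr]\,dx
&\le\limsup_h\int_\Omega C(1 + |H_h\hat v_h|^2)\,dx + \liminf_h\int_\Omega -W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\,dx \\
&= \int_\Omega C(1 + |\nabla^2 v|^2)\,dx - \limsup_h\int_\Omega W(c_h\mathrm{id}, c_h\hat v_h, \nabla_h\hat v_h, H_h\hat v_h)\,dx.
\end{aligned} \tag{43}
\]
Combining equations (42) and (43) we obtain equation (41), and thus we have completed the proof. □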
7. Convergence of the Minimizers

As pointed out at the beginning of Section 5, the functionals we have considered have properties that assure the convergence of the minimizers of $F_h$ to the minimizers of the functional $F$ in the $L^2$ topology, by general properties of $\Gamma$-convergence. Here, however, that convergence is stronger. The considerations made in this short section apply to the clamping case as well as to the simple support case. For simplicity we discuss only the former.

Let $\hat u_h$ be a minimizer of the discrete functional $F_h$ defined by either one of equations (35) and (39). Then, due to the equicoercivity of the functional $F_h$, to Lemma 4 and to Poincaré's inequality we have that the sequence $\hat u_h$ is bounded in $H^1(\Omega)$, and thence it has a weakly convergent subsequence (not relabeled) converging to a function $u\in H^1(\Omega)$ which is a minimizer of $F$ by equation (27). Indeed from Theorem 1 we deduce that
\[
\hat u_h\to u \quad\text{in } H^1(\Omega)
\qquad\text{and}\qquad
H_h\hat u_h\rightharpoonup\nabla^2 u \quad\text{in } L^2(\Omega). \tag{44}
\]
The next theorem shows that the generalized Hessian of $\hat u_h$ also converges strongly in $L^2(\Omega)$ to the Hessian of $u$. By equation (27) this is obviously true for the quadratic case, but a similar result also holds for the minimizers of the functional defined by equation (39) provided the potential $W$ is strictly convex in the last variable.

THEOREM 7. Assume the hypotheses of Theorem 6 hold. Moreover, suppose there exist a continuous function $\widetilde W:\Omega\times\mathbb{R}\times\mathbb{R}^2\times\mathbb{R}^{2\times 2}\to\mathbb{R}^{2\times 2}$ and a strictly positive constant $\gamma > 0$ such that
\[
W(x, s, \xi, \Theta_2)\ge W(x, s, \xi, \Theta_1) + \widetilde W(x, s, \xi, \Theta_1)\cdot(\Theta_2 - \Theta_1) + \gamma\,|\Theta_2 - \Theta_1|^2, \tag{45}
\]
for a.e. $(x, s, \xi, \Theta_1, \Theta_2)\in\Omega\times\mathbb{R}\times\mathbb{R}^2\times\mathbb{R}^{2\times 2}\times\mathbb{R}^{2\times 2}$. Then if $\hat u_h$ is the minimizer of the functional $F_h$ defined in equation (39) we have that
\[
\hat u_h\to u \quad\text{in } H^1(\Omega)
\qquad\text{and}\qquad
H_h\hat u_h\to\nabla^2 u \quad\text{in } L^2(\Omega),
\]
where $u$ is the unique minimizer of the functional $F$ defined in equation (38).

The proof of this theorem is standard; we include it here just for the reader's convenience.

Proof. Because of (44) we just have to prove that $H_h\hat u_h\to\nabla^2 u$ in $L^2(\Omega)$. From assumption (45) it follows that
\[
F_h(\hat u_h)\ge\int_\Omega\bigl[W(c_h\mathrm{id}, \hat u_h, \nabla_h\hat u_h, \nabla^2 u) + \gamma\,|H_h\hat u_h - \nabla^2 u|^2 + \widetilde W(c_h\mathrm{id}, \hat u_h, \nabla_h\hat u_h, \nabla^2 u)\cdot(H_h\hat u_h - \nabla^2 u)\bigr]\,dx - 2\langle f,\hat u_h\rangle.
\]
From the convexity of $W$ in the last variable and hypothesis (H3), we deduce (cf. [8]) that $|\widetilde W(x, s, \xi, \Theta)|\le C(1 + |\Theta|)$ for a.e. $(x, s, \xi)\in\Omega\times\mathbb{R}\times\mathbb{R}^2$, and from Lebesgue's convergence theorem we find that $\widetilde W(c_h\mathrm{id}, \hat u_h, \nabla_h\hat u_h, \nabla^2 u)\to\widetilde W(x, u, \nabla u, \nabla^2 u)$ in $L^2(\Omega)$. Hence passing to the limit in the inequality above and using the fact that $F_h(\hat u_h)\to F(u)$ we deduce that
\[
F(u)\ge F(u) + \gamma\,\limsup_h\int_\Omega|H_h\hat u_h - \nabla^2 u|^2\,dx,
\]
which concludes the proof. □
Acknowledgements

A substantial part of this work was completed while C.D. was a visiting professor at the University of Kentucky and R.P. was a visiting fellow at the University of Oxford, within the European TMR project "Phase Transitions in Crystalline Solids." The authors thank Prof. C.-S. Man and Prof. J. M. Ball for providing them with an appropriate context in which to carry it out.
References

1. M. Angelillo, A. Fortunato and F. Fraternali, The lumped stress method and the discrete-continuum approximation, Part I: Theory. Internat. J. Solids Struct. 39 (2002) 6211–6240.
2. M. Angelillo, A. Fortunato and F. Fraternali, The lumped stress method and the discrete-continuum approximation, Part II: Applications. Ibid.
3. S. Balasundaram and P.K. Bhattacharyya, A mixed finite element method for fourth order partial differential equations. Z. angew. Math. Mech. 66 (1986) 489–499.
4. F. Brezzi and M. Fortin, Mixed and Hybrid Finite Element Methods. Springer, New York (1991).
5. M. Carriero, A. Leaci and F. Tomarelli, Special bounded Hessian and elastic–plastic plate. Rend. Accad. Naz. Sci. XL Mem. Mat. 16 (1992) 223–258.
6. P.G. Ciarlet, The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam (1978).
7. P.G. Ciarlet, Basic error estimates for elliptic problems. In: P.G. Ciarlet and J.L. Lions (eds), Handbook of Numerical Analysis. North-Holland, Amsterdam (1991).
8. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, New York (1989).
9. G. Dal Maso, An Introduction to Γ-Convergence. Birkhäuser, Boston (1993).
10. C. Davini, Note on a parameter lumping in the vibrations of elastic beams. Rend. Ist. Matematica Università di Trieste 28 (1996) 83–99.
11. C. Davini, Γ-convergence of external approximations in boundary value problems involving the bi-Laplacian. J. Comput. Appl. Math. 140 (2002) 185–208. (2000) (to appear).
12. C. Davini and I. Pitacco, Relaxed notions of curvature and a lumped strain method for elastic plates. SIAM J. Numer. Anal. 35 (1998) 677–691.
13. C. Davini and I. Pitacco, An unconstrained mixed method for the biharmonic problem. SIAM J. Numer. Anal. 38 (2000) 820–836.
14. I. Ekeland and G. Temam, Convex Analysis and Variational Problems. North-Holland, Amsterdam (1976).
15. R. Glowinski, Approximations externes, par éléments finis de Lagrange d’ordre un et deux, du problème de Dirichlet pour l’operateur biharmonique. Méthode iterative de résolution des problèmes approches. In: J.J.H. Miller (ed.), Topics in Numerical Analysis. Academic Press, New York (1973) 123–171.
16. N. Nataraj, P.K. Bhattacharyya, S. Balasundaram and S. Gopalsamy, On a mixed-hybrid finite element method for anisotropic plate bending problems. Internat. J. Numer. Methods Engrg. 39 (1996) 4063–4089.
17. J.E. Roberts and J.-M. Thomas, Mixed and hybrid methods. In: P.G. Ciarlet and J.L. Lions (eds), Handbook of Numerical Analysis. North-Holland, Amsterdam (1991).
Static Deformations of a Linear Elastic Porous Body Filled with an Inviscid Fluid

F. DELL’ISOLA¹, G. SCIARRA² and R.C. BATRA³

¹ Dip. Ingegneria Strutturale e Geotecnica, Università degli Studi di Roma “La Sapienza”, via Eudossiana 18, 00184 Roma, Italy. E-mail: [email protected]
² Dip. Ingegneria Chimica, dei Materiali, delle Materie Prime e Metallurgia, Università degli Studi di Roma “La Sapienza”, via Eudossiana 18, 00184 Roma, Italy. E-mail: [email protected]
³ Department of Engineering Science and Mechanics, M/C 0219, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, U.S.A. E-mail: [email protected]

Received 20 September 2002; in revised form 19 September 2003

Abstract. We study infinitesimal deformations of a porous linear elastic body saturated with an inviscid fluid and subjected to conservative surface tractions. The gradient of the mass density of the solid phase is also taken as an independent kinematic variable and the corresponding higher-order stresses are considered. Balance laws and constitutive relations for finite deformations are reduced to those for infinitesimal deformations, and expressions for partial surface tractions acting on the solid and the fluid phases are derived. A boundary-value problem for a long hollow porous solid cylinder filled with an ideal fluid is solved, and the stability of the stressed reference configuration with respect to variations in the values of the coefficient coupling deformations of the two phases is investigated. An example of the problem studied is a cylindrical cavity leached out in salt formations for storing hydrocarbons.

Mathematics Subject Classifications (2000): 74F10, 74F20.

Key words: solid–fluid mixture, conservative tractions, principle of virtual power, partial tractions, fluid-filled cylindrical cavity, stability analysis.
R.C. Batra dedicates this work with deep respect and admiration to Professor C.A. Truesdell, a superb teacher and an excellent friend.
1. Introduction Simple models of a mechanical system comprised of a deformable porous solid matrix filled with a compressible fluid have been developed by Fillunger [22], Biot [11], Truesdell [40] and Müller [30]. In these works a spatial point is simultaneously occupied by all constituents. This is readily comprehensible for gaseous mixtures [30] and fluid solutions [41]. Observations on fluid saturated solids have shown higher values of fluid percolation through pores of the solid matrix than that predicted by the aforestated 243 C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 243–264. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
F. DELL’ISOLA ET AL.
models (see, e.g., [13] or [33]). The increase of percolation is possibly due not only to the higher externally applied pressure but also to the opening of pores in the vicinity of the boundary (see [17]). This provides a justification for endowing the theory of mixtures with the volume fraction concept (see, e.g., [18]); this additional scalar parameter can describe the nonstandard dilatation effects (see, e.g., [38]) as follows. Once the mixture is modelled from a microscopic point of view and the reference configuration of the solid matrix is required to be periodic, those dilatations of pores which do not involve global deformations of cells are captured by the volume fraction.
In this paper we consider a binary mixture involving a second gradient solid and a perfect fluid; we refer the reader to [23, 24, 28] for the relationship between micro-structural and second gradient theories. Our approach is close to that of the volume fraction concept as the latter reduces to the former once a suitable constraint among the enlarged set of state parameters is assumed [34]. For example, for an incompressible solid constituent, it is easy to see [25] how a mixture model endowed with the volume fraction concept transforms into a binary mixture whose solid constituent has second gradient constitutive relations.
After formulating a general problem, we study deformations of a porous hollow linear elastic cylinder filled with a perfect fluid. In particular, by assuming that the internal energy density can be split into a part involving the first gradient of the displacements and a part involving second-order gradients of the displacements, we perform a parametric analysis of the density profiles of the solid matrix with respect to a suitable energetic coupling coefficient between the solid and the fluid. We limit our analysis to the case when the external tractions applied on both constituents are conservative and can be derived from a potential.
We also discuss stability of the prestressed reference configuration of the hollow cylinder with respect to perturbations of the aforementioned energy coupling coefficient; an energetic criterion is proposed for this analysis. The distance in the space of mixture configurations is described in terms of the total energy, which equals the sum of the mixture deformation energy and the potential of surface tractions. Batra et al. [5–9] have analyzed numerically finite transient thermomechanical deformations of a homogeneous body with the Cauchy stress and higher-order stresses depending upon gradients of deformation.
2. Formulation of the Problem
Material particles of the fluid and the solid are identified respectively by their position vectors $\mathbf{X}^{(f)}$ and $\mathbf{X}^{(s)}$ in fixed reference configurations $\Omega_{f0}$ and $\Omega_{s0}$. We presume that, at any time $t$, particles of both constituents occupy the same position $\mathbf{x}$ in the present configuration $\Omega$. The velocity $\mathbf{v}^{(\alpha)}$ ($\alpha = f, s$) of the material particle $\mathbf{X}^{(\alpha)}$ is defined by
\[
\mathbf{v}^{(\alpha)} = \frac{d^{(\alpha)}\mathbf{u}^{(\alpha)}(\mathbf{X}^{(\alpha)}, t)}{dt}, \tag{1}
\]
STATIC DEFORMATIONS OF A LINEAR ELASTIC POROUS BODY
where $d^{(\alpha)}/dt$ denotes the material time derivative following the motion of $\mathbf{X}^{(\alpha)}$ and $\mathbf{u}^{(\alpha)}$ is the displacement of the $\alpha$th constituent from its reference configuration. Let $\rho^{(f)}$ and $\rho^{(s)}$ denote, respectively, the apparent mass densities of the fluid and the solid; then the mass density $\rho$ of the mixture equals $\rho^{(f)} + \rho^{(s)}$. The mean or barycentric velocity $\mathbf{v}$ of the mixture is defined by
\[
\rho\mathbf{v} = \rho^{(f)}\mathbf{v}^{(f)} + \rho^{(s)}\mathbf{v}^{(s)}. \tag{2}
\]
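As a small numerical illustration of equation (2), the sketch below computes the mixture density, the mass fractions and the barycentric velocity. It assumes the usual definition of the mass fraction as the ratio of apparent density to mixture density; the function name `mixture_state` and the velocity values are hypothetical, while the two apparent densities are the reference values listed in Table I.

```python
def mixture_state(rho_f, rho_s, v_f, v_s):
    """Return (rho, xi_f, xi_s, v) for a binary solid-fluid mixture.

    rho  = rho_f + rho_s                      (mixture density)
    xi_a = rho_a / rho                        (mass fractions, assumed definition)
    v    from rho * v = rho_f*v_f + rho_s*v_s (equation (2))
    """
    rho = rho_f + rho_s
    xi_f = rho_f / rho
    xi_s = rho_s / rho
    # Component-wise mass-weighted average of the two constituent velocities.
    v = tuple((rho_f * a + rho_s * b) / rho for a, b in zip(v_f, v_s))
    return rho, xi_f, xi_s, v


# Apparent densities from Table I (kg/m^3); velocities are illustrative only.
rho, xi_f, xi_s, v = mixture_state(39.0, 1794.5, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(rho, xi_f, xi_s, v)
```

Because the fluid carries only a small share of the mass, the barycentric velocity in this example stays close to the velocity of the solid.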
Details of the theory of mixtures are given in [13, 19, 27, 29–31, 33, 40].
2.1. BALANCE LAWS
We presume that there is no interconversion of mass between the solid and the fluid. In the spatial description of motion, the balance of mass for each constituent is given by
\[
\frac{d^{(\alpha)}\rho^{(\alpha)}}{dt} + \rho^{(\alpha)}\operatorname{div}\mathbf{v}^{(\alpha)} = 0, \tag{3}
\]
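where $\operatorname{div}\mathbf{v} = \operatorname{tr}(\operatorname{grad}\mathbf{v})$, and grad denotes derivatives with respect to coordinates in the present configuration. We use the principle of virtual power to derive the balance of linear momentum and the boundary conditions for each constituent. That is, we postulate that
\[
\begin{aligned}
&\int_{\Omega}\left(\mathbf{m}^{(s)}\cdot\bar{\mathbf{v}}^{(s)} + \mathbf{m}^{(f)}\cdot\bar{\mathbf{v}}^{(f)} + \mathbf{T}^{(s)}\cdot\nabla\bar{\mathbf{v}}^{(s)} - p^{(f)}\operatorname{div}\bar{\mathbf{v}}^{(f)} + \boldsymbol{\Sigma}^{(s)}\cdot\nabla\nabla\bar{\mathbf{v}}^{(s)}\right)dV \\
&\quad = \int_{\Omega}\left(\mathbf{b}^{(s)}\cdot\bar{\mathbf{v}}^{(s)} + \mathbf{b}^{(f)}\cdot\bar{\mathbf{v}}^{(f)}\right)dV + \int_{\partial\Omega}\left(\mathbf{t}^{(s)}\cdot\bar{\mathbf{v}}^{(s)} + \mathbf{t}^{(f)}\cdot\bar{\mathbf{v}}^{(f)} + \boldsymbol{\tau}^{(s)}\cdot\frac{\partial\bar{\mathbf{v}}^{(s)}}{\partial n}\right)dA.
\end{aligned} \tag{4}
\]
Here $\mathbf{m}^{(\alpha)}$ is the bulk solid-fluid interaction force, $\mathbf{T}^{(s)}$ the partial Cauchy stress in the solid, $p^{(f)}$ the hydrostatic pressure in the fluid (we assume that the fluid is ideal; therefore the partial Cauchy stress in it is spherical), $\boldsymbol{\Sigma}^{(s)}$ the second-order stress in the solid, $\nabla$ the gradient operator with respect to coordinates in the present configuration, $\mathbf{b}^{(\alpha)}$ the density of partial body forces, $\mathbf{t}^{(\alpha)}$ the partial surface tractions, $\boldsymbol{\tau}^{(s)}$ the traction corresponding to the second-order stress tensor in the solid, $\bar{\mathbf{v}}^{(s)}$ the virtual velocity in the solid that vanishes on the part of the boundary of the solid where essential boundary conditions are prescribed, $\bar{\mathbf{v}}^{(f)}$ the virtual velocity in the fluid that vanishes on the part of the boundary of the fluid where essential boundary conditions are specified, $\mathbf{a}\cdot\mathbf{b}$ the inner product between tensors $\mathbf{a}$ and $\mathbf{b}$ of the same order, and $\partial\mathbf{v}^{(s)}/\partial n$ is the directional derivative of $\mathbf{v}^{(s)}$ along the outward unit normal $\mathbf{n}$ to the boundary $\partial\Omega$ of $\Omega$. (For a boundary-value problem with field equations involving derivatives of order $2m$, boundary conditions involving derivatives of order at most $m-1$ are called essential; others are called natural.) The effect of inertia forces is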
included in the density of body forces. The physical meaning of $\boldsymbol{\Sigma}^{(s)}$ and $\boldsymbol{\tau}^{(s)}$ can be described in a way similar to that done in different contexts in [20, 23]. We note that the external action $\boldsymbol{\tau}^{(s)}$ can be regarded as the sum of two different contributions: the first is a doubly normal double force, i.e., an external areal action which works on the rate of opening, $(\nabla\mathbf{v}^{(s)}\cdot\mathbf{n}\otimes\mathbf{n})$, along the outward unit normal $\mathbf{n}$ of pores on the boundary; the other is a tangential couple working on the vorticity of the apparent velocity of the solid. This nomenclature is due to Germain [23]. The areal action is also considered in the Cosserat model for granular materials (see, e.g., [21]) and in the present problem vanishes. However, the doubly normal double force plays an important role in the dilatancy phenomenon studied here. A motivation for considering higher-order stresses in the solid will be provided below. We note that capillary type forces in the fluid, modelled in the literature by second-order stresses (see, e.g., [14–16, 26]), have been neglected here to keep the analysis tractable. Whereas we have included in equation (4) the internal supplies $\mathbf{m}^{(f)}$ and $\mathbf{m}^{(s)}$ of the linear momentum, the internal supplies of the moment of momentum have been neglected. This is consistent with the assumption that the stress in the fluid is a hydrostatic pressure. The symmetry of $\mathbf{T}^{(s)}$ follows from equation (4) by setting $\bar{\mathbf{v}}^{(f)} = \bar{\mathbf{v}}^{(s)} =$ velocity field of a rigid body motion, that is, from the objectivity of the left-hand side of (4), which also implies that the sum of $\mathbf{m}^{(f)}$ and $\mathbf{m}^{(s)}$ equals zero.
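By using the divergence theorem and exploiting the fact that equation (4) must hold for all virtual velocities vanishing on the part of the boundary where essential boundary conditions are given, we obtain the following set of field equations and boundary conditions:
\begin{align}
\operatorname{div}\left(\mathbf{T}^{(s)} - \operatorname{div}\boldsymbol{\Sigma}^{(s)}\right) - \mathbf{m}^{(s)} + \mathbf{b}^{(s)} &= \mathbf{0}, && \text{in } \Omega, \tag{5}\\
-\nabla p^{(f)} - \mathbf{m}^{(f)} + \mathbf{b}^{(f)} &= \mathbf{0}, && \text{in } \Omega, \tag{6}\\
\mathbf{m}^{(s)} + \mathbf{m}^{(f)} &= \mathbf{0}, && \text{in } \Omega, \tag{7}\\
\left(\mathbf{T}^{(s)} - \operatorname{div}\boldsymbol{\Sigma}^{(s)}\right)\mathbf{n} - \operatorname{div}_s\left(\boldsymbol{\Sigma}^{(s)}\mathbf{n}\right) &= \mathbf{t}^{(s)}, && \text{on } \partial_1\Omega, \tag{8}\\
\left(\boldsymbol{\Sigma}^{(s)}\mathbf{n}\right)\mathbf{n} &= \boldsymbol{\tau}^{(s)}, && \text{on } \partial_1\Omega, \tag{9}\\
\mathbf{v}^{(s)} &= \hat{\mathbf{v}}^{(s)}, && \text{on } \partial_2\Omega, \tag{10}\\
-p^{(f)}\mathbf{n} &= \mathbf{t}^{(f)}, && \text{on } \partial_3\Omega, \tag{11}\\
\mathbf{v}^{(f)}\cdot\mathbf{n} &= \hat{v}^{(f)}, && \text{on } \partial_4\Omega. \tag{12}
\end{align}
Here $\operatorname{div}_s$ is the surface divergence on $\partial\Omega$, and $\partial_1\Omega$ and $\partial_2\Omega$ are complementary parts of the boundary $\partial\Omega$ of $\Omega$, where natural and essential boundary conditions, respectively, are prescribed for the solid; a similar interpretation holds for $\partial_3\Omega$ and $\partial_4\Omega$ for the fluid. In these equations $\mathbf{a}\otimes\mathbf{b}$ is the tensor product between the $n$th order tensor $\mathbf{a}$ and the $m$th order tensor $\mathbf{b}$, defined as $(\mathbf{a}\otimes\mathbf{b})\mathbf{c} = (\mathbf{b}\cdot\mathbf{c})\mathbf{a}$ for every $m$th order tensor $\mathbf{c}$; $\mathbf{A}\cdot\mathbf{B} = \operatorname{tr}(\mathbf{A}\mathbf{B}^{T})$ for tensors $\mathbf{A}$ and $\mathbf{B}$.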
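2.2. CONSTITUTIVE RELATIONS
The balance laws (5) and (6) are to be supplemented by constitutive relations; we express these in terms of the internal energy. We presume that the mixture is at a uniform temperature, the constituents are deformed quasistatically so that their kinetic energy can be neglected, and no energy is dissipated. The continuum model we use in order to describe deformations of a solid matrix saturated with a fluid must account for the strain energy associated with the gradient of the mass density of the solid. At the macroscopic level, the dependence of the strain energy on the gradient of deformation describes the effects of the opening of neighboring pores on the pore cluster which is modelled as a solid material point. We assume that the internal energy density can be split into two parts: a part that depends upon the “local” deformation of the solid and the fluid particles and another part that depends upon a “nonlocal” measure of deformation of the solid particles; the latter is taken to be proportional to $|\nabla\rho^{(s)}|^2$. Thus we write the balance of energy as
\[
\begin{aligned}
&\frac{d}{dt}\int_{\Omega}\rho\left(\Psi\left(\rho^{(f)}, \mathbf{F}^{(s)}, \mathbf{X}^{(s)}\right) + \frac{\lambda_s}{2\rho}\left|\nabla\rho^{(s)}\right|^2\right)dV \\
&\quad = \int_{\Omega}\left(\mathbf{b}^{(s)}\cdot\mathbf{v}^{(s)} + \mathbf{b}^{(f)}\cdot\mathbf{v}^{(f)}\right)dV + \int_{\partial\Omega}\left(\mathbf{t}^{(s)}\cdot\mathbf{v}^{(s)} + \mathbf{t}^{(f)}\cdot\mathbf{v}^{(f)} + \boldsymbol{\tau}^{(s)}\cdot\frac{\partial\mathbf{v}^{(s)}}{\partial n}\right)dA,
\end{aligned} \tag{13}
\]
which because of (4) implies that
\[
\begin{aligned}
&\frac{d}{dt}\int_{\Omega}\rho\left(\Psi\left(\rho^{(f)}, \mathbf{F}^{(s)}, \mathbf{X}^{(s)}\right) + \frac{\lambda_s}{2\rho}\left|\nabla\rho^{(s)}\right|^2\right)dV \\
&\quad = \int_{\Omega}\left(\mathbf{m}^{(s)}\cdot\mathbf{v}^{(s)} + \mathbf{m}^{(f)}\cdot\mathbf{v}^{(f)} + \mathbf{T}^{(s)}\cdot\nabla\mathbf{v}^{(s)} - p^{(f)}\operatorname{div}\mathbf{v}^{(f)} + \boldsymbol{\Sigma}^{(s)}\cdot\nabla\nabla\mathbf{v}^{(s)}\right)dV.
\end{aligned} \tag{14}
\]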
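Here $\mathbf{F}^{(s)} = \operatorname{Grad}\mathbf{x} = \partial\mathbf{x}/\partial\mathbf{X}$ is the deformation gradient for the solid, $\lambda_s > 0$ is a material parameter with units of N m$^6$/kg$^2$, and $d/dt$ signifies the material time derivative following the mean motion of the mixture. By using the Reynolds transport theorem the left-hand side of equation (14) can be represented as a linear functional of the velocity field of the two constituents. Thus the following constitutive equations for the partial Cauchy stresses $\mathbf{T}^{(a)}$, the solid-fluid interaction forces $\mathbf{m}^{(a)}$, $a = s, f$, and the second-order stress $\boldsymbol{\Sigma}^{(s)}$ associated with the solid constituent must hold:
\begin{align}
\mathbf{T}^{(s)} &= \rho\frac{\partial\Psi}{\partial\mathbf{F}^{(s)}}\mathbf{F}^{(s)T} - \lambda_s\left[\frac{f^{ss}}{2}\left(1 + \xi^{(f)}\right)\mathbf{I} + \nabla\rho^{(s)}\otimes\nabla\rho^{(s)}\right], \tag{15}\\
p^{(f)} &= \rho\rho^{(f)}\frac{\partial\Psi}{\partial\rho^{(f)}} - \frac{\lambda_s}{2}f^{ss}\xi^{(f)}, \tag{16}\\
\boldsymbol{\Sigma}^{(s)} &= -\lambda_s\rho^{(s)}\,\mathbf{I}\otimes\nabla\rho^{(s)}, \tag{17}\\
\mathbf{m}^{(s)} = -\mathbf{m}^{(f)} &= -\rho\left[\xi^{(f)}\left(\nabla\mathbf{F}^{(s)}\right)^{T}\frac{\partial\Psi}{\partial\mathbf{F}^{(s)}} + \xi^{(f)}\mathbf{F}^{(s)-T}\frac{\partial\Psi}{\partial\mathbf{X}^{(s)}} - \xi^{(s)}\frac{\partial\Psi}{\partial\rho^{(f)}}\nabla\rho^{(f)} + \frac{\lambda_s}{2\rho}\nabla\left(\xi^{(f)}f^{ss}\right)\right], \tag{18}
\end{align}
where
\[
f^{ss} = \nabla\rho^{(s)}\cdot\nabla\rho^{(s)}, \tag{19}
\]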
and $\xi^{(f)}$ is the mass fraction of the fluid phase. Note that the partial Cauchy stress tensor $\mathbf{T}^{(s)}$ is symmetric and $\partial\Psi/\partial\mathbf{F}^{(s)}$ equals the partial first Piola–Kirchhoff stress tensor of the solid constituent. Equations (15)–(18) are derived in [34], where Germain’s [23] arguments are used to obtain constitutive relations for a second gradient porous matrix filled with an ideal fluid. Equation (13) does not include all features of a general second gradient linear elastic matrix; only density gradients have been assumed to affect the internal potential energy, and contributions of other components of the third order tensor $\operatorname{Grad}\mathbf{F}^{(s)}$ have not been considered. Furthermore, we only analyze static deformations of the mixture. Thus Darcy-type drag forces are not modeled.
2.3. SPLITTING OF EXTERNAL SURFACE TRACTIONS INTO PARTIAL TRACTIONS
We consider problems for which $\mathbf{b}^{(s)} = \mathbf{b}^{(f)} = \mathbf{0}$, i.e., there are only external surface tractions. In a physical problem, total surface tractions are prescribed either on a part or on all of the boundary of the region $\Omega$. Here we require that these tractions be assigned on all of the boundary of the mixture in the current configuration. However, the solution of the boundary-value problem defined by equations (5)–(12) requires that the partial surface tractions be specified. In order to find the partial tractions we assume the existence of a potential function $\psi^{ext}$ such that
\[
\int_{\partial\Omega}\left(\mathbf{t}^{(s)}\cdot\mathbf{v}^{(s)} + \mathbf{t}^{(f)}\cdot\mathbf{v}^{(f)} + \boldsymbol{\tau}^{(s)}\cdot\frac{\partial\mathbf{v}^{(s)}}{\partial n}\right)dA = \frac{d}{dt}\int_{\Omega}\psi^{ext}\left(\mathbf{x}, \rho^{(s)}, \rho^{(f)}, \nabla\rho^{(s)}\right)dV. \tag{20}
\]
The external surface tractions for which equation (20) holds are conservative. It is readily apparent that not all conservative surface tractions are characterized by equation (20). Here we consider only those surface tractions which satisfy equation (20) and for which $\psi^{ext}$ depends upon deformations of the solid only through $\rho^{(s)}$ and $\nabla\rho^{(s)}$. Equation (20) is dictated by the intended application of studying static deformations of an annular cylindrical porous region filled with an inviscid fluid.
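Requiring that equation (20) hold for all choices of the velocity field, we obtain
\begin{align}
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}} - \operatorname{div}\left(\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\right) &= C^{s}, \quad \text{in } \Omega, \tag{21}\\
\frac{\partial\psi^{ext}}{\partial\rho^{(f)}} &= C^{f}, \quad \text{in } \Omega, \tag{22}\\
\mathbf{t}^{(s)} &= \left[-\frac{\partial\psi^{ext}}{\partial\rho^{(s)}}\rho^{(s)} + \xi^{(s)}\psi^{ext} - \frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\cdot\nabla\rho^{(s)} + \operatorname{div}\left(\rho^{(s)}\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\right)\right]\mathbf{n} \notag\\
&\quad + \left[\rho^{(s)}\left(\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\cdot\mathbf{n}\right)\operatorname{tr}\left(\nabla^{s}\mathbf{n}\right) - \left(\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\cdot\mathbf{n}\right)\frac{\partial\rho^{(s)}}{\partial n}\right]\mathbf{n} \notag\\
&\quad + \rho^{(s)}\nabla^{s}\left(\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\cdot\mathbf{n}\right), \quad \text{on } \partial\Omega, \tag{23}\\
\mathbf{t}^{(f)} &= \left(-\frac{\partial\psi^{ext}}{\partial\rho^{(f)}}\rho^{(f)} + \xi^{(f)}\psi^{ext}\right)\mathbf{n}, \quad \text{on } \partial\Omega, \tag{24}\\
\boldsymbol{\tau}^{(s)} &= -\left(\rho^{(s)}\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\cdot\mathbf{n}\right)\mathbf{n}, \quad \text{on } \partial\Omega, \tag{25}
\end{align}
where $\nabla^{s}$ is the surface gradient on $\partial\Omega$. Equations (21) and (22) with $C^{s}$ and $C^{f}$ as constants are necessary conditions for the existence of a $\psi^{ext}$ for which $\mathbf{b}^{(s)} = \mathbf{b}^{(f)} = \mathbf{0}$. Once the constitutive relations for external actions are specified, which in our model is equivalent to specifying $\psi^{ext}$, the partial surface tractions can be expressed in terms of the total surface tractions; see [33] for details. The consideration of conservative external tractions limits the space of admissible surface tractions. The often used assumption characterizing the partial surface tractions in terms of the volume fraction of the constituents and the total surface tractions cannot be deduced from our work, because no state parameter in addition to the solid and the fluid placement maps has been introduced. For a nonpolar medium whose response does not depend upon second-order displacement gradients, Batra [4] showed that surface tractions cannot depend upon $\partial\mathbf{u}/\partial n$, where $\mathbf{u}$ is the displacement of a point.
Results of this section can be summarized as follows: (i) a variational principle describing static deformations of a solid–fluid mixture is formulated, (ii) the Euler–Lagrange equations are deduced from the principle, and are recognized as the balance laws and the constitutive relations for the solid–fluid mixture, (iii) the external and internal actions are specified by requiring that they be conservative.
3. Solution of a Boundary-Value Problem
We analyze, within a linearized second gradient theory, static infinitesimal deformations of a long hollow porous cylinder filled with an inviscid fluid and with the inner and the outer surfaces subjected to uniform external pressures $p_1^{ext}$ and $p_2^{ext}$,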
respectively. We assume that the pressure on the inner and the outer surfaces of ext ext and p02 , respectively, and the cylinder, in the reference configuration, equals p01 postulate that 1 1 1 T0(s) · H(s) + γ 0f ρ (f) + C H(s) · H(s) − H(s) · T0(s)H(s)T = ρ0 2 4 1 − H(s) T0(s) + H(s) · H(s) T0(s) − T0(s)H(s) + T0(s)H(s)T − H(s)T T0(s) 8 1 ff (f) 2 (f) sf (s) ρ (26) + γ + ρ K ·H , 2 where H(s) := ∇u(s),
$\Delta\rho^{(f)} = \rho^{(f)} - \rho^{0(f)}$,
(27)
ρ 0(f) is the mass density of the fluid in the reference configuration, T0(s) a symmetric tensor representing the partial stress in the solid in the reference configuration, C is the classical elasticity tensor for the solid constituent mapping symmetric second order tensors into symmetric second order tensors, γ 0f , γ ff and Ksf = KsfT are material parameters. Terms involving the second order tensor T0(s) and the scalar γ (0)f in equation (26) represent contributions to the internal potential energy by the prestress in the solid and the fluid constituents. Since a fluid can be in equilibrium only if it is confined, it is necessary to consider a pre-stressed reference configuration. Equation (26) is the most general one can have to get linear constitutive relations for a pre-stressed solid-fluid mixture. Equations (15)–(18) imply that terms of order one in H(s) and ρ (f) yield zeroth order terms for the solid stress tensor and the fluid pressure, and the second-order terms in H(s) and ρ (f) provide the first-order terms for the solid stress tensor, the fluid pressure, and the bulk internal action m(s) . The second-order tensor Ksf accounts for the interaction between the solid and the fluid phases because of deformations of pores; Ksf is not necessarily a spherical tensor even when the solid and the fluid constituents are isotropic. It is because pores need not be of uniform shapes and sizes. The coupling coefficient Ksf can be explained in terms of the pore dilatancy as follows: the dilatation of pores induced by the injection of the solid or the fluid into a suitable elementary reference volume either deforms the solid constituent or changes the mass density of the fluid or both. (s) Denoting the first term on the right-hand side of equation (15) by T and using (26) we obtain T
(s)
ρ (f) 0(s) 1 T − ξ0(s) tr H(s) T0(s) + T0(s)H(s)T + H(s)T0(s)T ρ0 2 1 (s) 0(s) (28) + W T − T0(s)W(s) + C E(s) + ρ (f) Ksf , 2
= T0(s) +
where $\mathbf{E}^{(s)} := (\mathbf{H}^{(s)} + \mathbf{H}^{(s)T})/2$ and $\mathbf{W}^{(s)} := (\mathbf{H}^{(s)} - \mathbf{H}^{(s)T})/2$. If each component of the initial stress, $\mathbf{T}^{0(s)}$, is much less than the smallest value of an elastic modulus of the solid phase, then terms such as $\mathbf{T}^{0(s)}\mathbf{H}^{(s)T}$ can be neglected in equation (28). We assume the following form for the elasticity tensor applied to $\mathbf{E}^{(s)}$:
\[
\mathbb{C}\,\mathbf{E}^{(s)} = \lambda\operatorname{tr}\left(\mathbf{E}^{(s)}\right)\mathbf{I} + 2\mu\mathbf{E}^{(s)}, \tag{29}
\]
where $\lambda$ and $\mu$ are the Lamé constants for the solid. One can deduce expressions for $p^{(f)}$ and $\mathbf{m}^{(s)}$ by substituting for $\Psi$ from (26) into (16) and (18).
We assume that deformations of the mixture are plane strain, and in the plane of deformation they are independent of the angular position. That is, in cylindrical coordinates the two in-plane physical components $u_r$ and $u_\theta$ of the displacement are functions of the radial coordinate $r$ only and $u_z = 0$; we call such a deformation field radially symmetric. This assumption is reasonable because the body and the boundary conditions are radially symmetric. Henceforth we use cylindrical coordinates with orthonormal vectors $\mathbf{e}_r$ and $\mathbf{e}_\theta$ at a point in the radial and the circumferential directions, respectively, and work in terms of physical components of stresses. In order to find boundary conditions on the solid and the fluid phases, we consider external tractions given by the following expression for $\psi^{ext}$:
\[
\psi^{ext} = C^{s}\rho^{(s)} + C^{f}\rho^{(f)} + \mathbf{C}^{int}(r, \theta)\cdot\nabla\rho^{(s)}\,\Delta\rho^{(s)} + p_0 + p_1 r. \tag{30}
\]
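The compatibility of the potential (30) with the necessary conditions (21) and (22) can be checked in a few lines. The steps below are a sketch under stated assumptions: the reference density $\rho^{0(s)}$ is uniform, so that $\nabla\Delta\rho^{(s)} = \nabla\rho^{(s)}$ with $\Delta\rho^{(s)} = \rho^{(s)} - \rho^{0(s)}$; $\mathbf{C}^{int}$ is solenoidal; and $p_0$ and $p_1$ do not depend on the densities.

```latex
\begin{align*}
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}}
  &= C^{s} + \mathbf{C}^{int}\cdot\nabla\rho^{(s)},
\qquad
\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}
  = \Delta\rho^{(s)}\,\mathbf{C}^{int},\\
\operatorname{div}\left(\Delta\rho^{(s)}\,\mathbf{C}^{int}\right)
  &= \mathbf{C}^{int}\cdot\nabla\rho^{(s)}
   + \Delta\rho^{(s)}\operatorname{div}\mathbf{C}^{int}
   = \mathbf{C}^{int}\cdot\nabla\rho^{(s)},\\
\frac{\partial\psi^{ext}}{\partial\rho^{(s)}}
  - \operatorname{div}\left(\frac{\partial\psi^{ext}}{\partial\nabla\rho^{(s)}}\right)
  &= C^{s},
\qquad
\frac{\partial\psi^{ext}}{\partial\rho^{(f)}} = C^{f}.
\end{align*}
```

The term $\mathbf{C}^{int}\cdot\nabla\rho^{(s)}$ cancels only because $\operatorname{div}\mathbf{C}^{int} = 0$, which is why the solenoidal property matters here.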
Here $\Delta\rho^{(s)} = \rho^{(s)} - \rho^{0(s)}$, $\rho^{0(s)}$ is the density of the solid phase in the reference configuration, and $\mathbf{C}^{int}(r, \theta)$ is a solenoidal vector field whose radial component is independent of $r$ and $\theta$. It is easy to verify that the aforementioned assumptions on $\mathbf{C}^{int}(r, \theta)$ suffice to satisfy equation (21). Bearing in mind the hypothesis of radially symmetric deformations for both phases, we have
\[
\mathbf{C}^{int}(r, \theta)\cdot\nabla\rho^{(s)}\,\Delta\rho^{(s)} = \mathbf{C}^{int}(r, \theta)\cdot\left(\rho^{(s)\prime}\,\mathbf{e}_r\right)\Delta\rho^{(s)} = C^{int}\rho^{(s)\prime}\,\Delta\rho^{(s)}, \tag{31}
\]
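where $C^{int}$ is the constant radial component of $\mathbf{C}^{int}(r, \theta)$ and a prime indicates differentiation with respect to $r$. Substitution from (30) and (31) into equations (23)–(25) gives
\begin{align}
\mathbf{t}^{(s)} &= \left(\xi^{(s)}\psi^{ext} - C^{s}\rho^{(s)} - C^{int}\rho^{(s)\prime}\,\Delta\rho^{(s)} + \frac{C^{int}}{r}\rho^{(s)}\Delta\rho^{(s)}\right)\mathbf{n}, \tag{32}\\
\mathbf{t}^{(f)} &= \left(\xi^{(f)}\psi^{ext} - C^{f}\rho^{(f)}\right)\mathbf{n}, \tag{33}\\
\boldsymbol{\tau}^{(s)} &= -\left(\rho^{(s)}\Delta\rho^{(s)}C^{int}\delta\right)\mathbf{n}, \tag{34}
\end{align}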
where δ = +1 on the external surface and δ = −1 on the internal surface. Note that values of C s , C f and C int depend upon the shape of the bounding surface, constituents of the mixture, and on the interaction between the mixture and the medium surrounding it. For example, at an impermeable wall, C int = 0 because externally applied tractions of type (34) vanish; as a matter of fact, the absence of fluid flux implies that the fluid belonging to the cluster of pores near the boundary
rests at a uniform pressure. In other words the external world cannot access pores within the body and dilate them. Because of our interest in studying infinitesimal deformations under the action of higher-order tractions (or double forces without moments), we retain terms in (32)–(34) that are bilinear in $\rho^{(s)}$ and $\Delta\rho^{(s)}$; the remaining terms are either linear in $\Delta\rho^{(s)}$ and $\Delta\rho^{(f)}$ or independent of $\Delta\rho^{(s)}$ and $\Delta\rho^{(f)}$. We thus get the following linearized constitutive characterizations of external tractions and double forces:
\begin{align}
\mathbf{t}^{(s)} &= \left[-\xi^{0(f)}\xi^{0(s)}\rho^{0}\left(C^{(s)} - C^{(f)}\right) + \xi^{0(s)}\bar{p}_0\right]\mathbf{n} \notag\\
&\quad + \left[\left(-\left(\xi^{0(f)}\right)^2\left(C^{(s)} - C^{(f)}\right) + \xi^{0(f)}\frac{\bar{p}_0}{\rho^{0}} + \rho^{0(s)}\frac{C^{int}}{r}\right)\Delta\rho^{(s)}\right. \notag\\
&\qquad \left. + \left(-\left(\xi^{0(s)}\right)^2\left(C^{(s)} - C^{(f)}\right) - \xi^{0(s)}\frac{\bar{p}_0}{\rho^{0}}\right)\Delta\rho^{(f)} + \xi^{0(s)}\left(\tilde{p}_0 + r p_1\right)\right]\mathbf{n}, \tag{35}\\
\mathbf{t}^{(f)} &= \left[\xi^{0(f)}\xi^{0(s)}\rho^{0}\left(C^{(s)} - C^{(f)}\right) + \xi^{0(f)}\bar{p}_0\right]\mathbf{n} \notag\\
&\quad + \left[\left(\left(\xi^{0(s)}\right)^2\left(C^{(s)} - C^{(f)}\right) + \xi^{0(s)}\frac{\bar{p}_0}{\rho^{0}}\right)\Delta\rho^{(f)}\right. \notag\\
&\qquad \left. + \left(\left(\xi^{0(f)}\right)^2\left(C^{(s)} - C^{(f)}\right) - \xi^{0(f)}\frac{\bar{p}_0}{\rho^{0}}\right)\Delta\rho^{(s)} + \xi^{0(f)}\left(\tilde{p}_0 + r p_1\right)\right]\mathbf{n}, \tag{36}\\
\boldsymbol{\tau}^{(s)} &= -\left(\rho^{0(s)}C^{int}\Delta\rho^{(s)}\delta\right)\mathbf{n}. \tag{37}
\end{align}
Here $\xi^{0(f)}$ and $\xi^{0(s)}$ are mass fractions of both phases in the initial configuration, $\rho^{0}$ is the apparent density of the mixture in this configuration, $\bar{p}_0$ is the reference value of $p_0$ and $p_1$ equals its infinitesimal variation. Note that terms in the first brackets in the representation formulas of $\mathbf{t}^{(s)}$ and $\mathbf{t}^{(f)}$ describe tractions applied on the boundary of the mixture in the reference configuration, and terms in the second brackets are infinitesimal increments of tractions. Similarly equation (37) states that the initial double force vanishes but its infinitesimal increment is nonzero. (We have assumed that the small parameter defining the linearization procedure is the same for the solid and the fluid kinematical descriptors. Thus small deformations of the solid matrix are associated with small variations of the fluid apparent density.)
Let $R_1$ and $R_2$ denote the inner and the outer radii of the porous cylinder in the reference configuration. It is clear from equation (28) that $\mathbf{T}^{0(s)}$ is the partial stress in the solid in the reference configuration. Boundary conditions for the solid in the reference configuration are
\begin{align}
\mathbf{T}^{0(s)}(R_1)\,\mathbf{n} = \mathbf{t}^{(s)}(R_1) &= -d_{01}^{(s)}p_{01}^{ext}\,\mathbf{n}, \tag{38}\\
\mathbf{T}^{0(s)}(R_2)\,\mathbf{n} = \mathbf{t}^{(s)}(R_2) &= -d_{02}^{(s)}p_{02}^{ext}\,\mathbf{n}, \tag{39}
\end{align}
where $d_{01}^{(s)}$ and $d_{02}^{(s)}$ denote fractions of the externally applied pressures, $p_{01}^{ext}$ and $p_{02}^{ext}$, carried by the solid phase on the inner and the outer surfaces of the porous
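cylindrical body. Through these equations a representation formula for both areal fractions splitting external pressure into partial tractions is obtained. Let the state of stress, $\mathbf{T}^{0(s)}$ and $p_0^{(f)}$, be
\[
\begin{aligned}
\mathbf{T}^{0(s)} &= \left(a_0 - \frac{b_0}{r^2}\right)\mathbf{e}_r\otimes\mathbf{e}_r + \left(a_0 + \frac{b_0}{r^2}\right)\mathbf{e}_\theta\otimes\mathbf{e}_\theta,\\
p_0^{(f)} &:= \gamma^{0f}\rho^{0(f)} = \text{const} = \left(1 - d_{01}^{(s)}\right)p_{01}^{ext} = \left(1 - d_{02}^{(s)}\right)p_{02}^{ext},
\end{aligned} \tag{40}
\]
then
\[
a_0 = -\frac{R_2^2\,d_{02}^{(s)}p_{02}^{ext} - R_1^2\,d_{01}^{(s)}p_{01}^{ext}}{R_2^2 - R_1^2}, \qquad
b_0 = -\frac{R_1^2R_2^2\left(d_{02}^{(s)}p_{02}^{ext} - d_{01}^{(s)}p_{01}^{ext}\right)}{R_2^2 - R_1^2}. \tag{41}
\]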
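The constants in (41) can be checked numerically against the boundary conditions (38) and (39). The sketch below is illustrative only: the function names are hypothetical, `P1` and `P2` stand for the products $d_{01}^{(s)}p_{01}^{ext}$ and $d_{02}^{(s)}p_{02}^{ext}$, and the two pressure values are assumed numbers rather than data from the paper; the radii are those of Table I.

```python
def lame_prestress(R1, R2, P1, P2):
    """Constants a0, b0 of equation (41).

    P1, P2: pressures carried by the solid on the inner and outer surfaces
    (the products d01*p01_ext and d02*p02_ext in the text).
    """
    denom = R2**2 - R1**2
    a0 = -(R2**2 * P2 - R1**2 * P1) / denom
    b0 = -(R1**2 * R2**2) * (P2 - P1) / denom
    return a0, b0


def radial_stress(r, a0, b0):
    """Radial component of the prestress in equation (40)."""
    return a0 - b0 / r**2


def hoop_stress(r, a0, b0):
    """Circumferential component of the prestress in equation (40)."""
    return a0 + b0 / r**2


# Radii from Table I (m); the pressures (Pa) are hypothetical.
R1, R2 = 2.0, 20.0
a0, b0 = lame_prestress(R1, R2, 10e6, 15e6)
print(radial_stress(R1, a0, b0), radial_stress(R2, a0, b0))
```

The radial stress returned at each radius equals minus the pressure carried by the solid there, and equal pressures on both surfaces give `b0 = 0`, i.e. a hydrostatic prestress.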
It is easily verified (see, e.g., [32]) that the pre-stress (40) is admissible. Equa(s) (s) ext ext , d02 , p01 and p02 cannot be independently prescribed. tion (40)2 implies that d01 (s) (s) ext ext must equal p02 , For example, if d01 = d02 for a reference configuration, then p01 and the stress state in the solid phase must also be that of a hydrostatic pressure, as (s) need not equal 1 because b0 = 0 (see equation (41)). At an impermeable wall, d01 a part of external tractions applied to the impermeable wall may be carried by the fluid. Equations for the determination of infinitesimal displacements in the radial and in the circumferential directions and for ρ (f) are 2 3 3 3 0(s)2 ur + ur − 2 ur + 3 ur − 4 ur −λs ρ r r r r b 0 + 2ξ 0(f) a0 − 2 + λ + 2µ ur r 1 b0 2b0 0(f) + 2ξ a0 + 2 + 2 + λ + 2µ ur r r r 1 b 2b0 0 0(f) + 2 −2ξ a0 + 2 − 2 − (λ + 2µ) ur r r r 1 b0 sf 0(s) 0(f) + a0 − 2 + Krr − ξ γ ρ (f) ρ0 r (f) 1 sf sf + Krr − Kθθ (42) ρ = 0, r 1 a0 a0 1 + µ r 2 − b0 uθ + 3 + µ r 2 − b0 uθ 2 r 2 r 2 1 2 (f) a0 2 sf (f) + µ r − b0 uθ + Krθ ρ − 4 ρ + = 0, (43) r 2 r
2ρ 0(f) γ 0f + ρ 0 ρ 0(f) γ ff ρ (f) 1 0(s) 0(f) 0f = c + ρ ρ γ ur + ur r 1 0 0(f) sf sf sf 1 Krr ur + Krθ uθ − uθ + Kθθ ur −ρ ρ r r b b 1 0 0 ur . − ρ 0(f) a0 − 2 ur − ρ 0(f) a0 + 2 r r r
(44)
In equation (44), $c$ is a constant of integration. Equation (44) can be solved for $\Delta\rho^{(f)}$ and the result substituted into equations (42) and (43) to obtain two coupled ordinary differential equations for $u_r$ and $u_\theta$. These equations belong to Heun’s family of equations (see [3]); their solution has four poles, one at $r = 0$, one at $r = \infty$, and locations of the other two poles depend upon the coefficient of $u_\theta''$ in equation (43). In order for a pole to be within the hollow cylinder the following inequalities must hold:
\[
R_1 < \sqrt{\frac{b_0}{a_0/2 + \mu}} < R_2. \tag{45}
\]
Thus depending upon the shear modulus of the solid phase, the pre-stress, and the fraction of externally applied pressures carried by the solid constituent, the solution may blow up at a point within the annular cylinder.
3.1. A SIMPLIFIED PROBLEM
Henceforth we consider the case of $b_0 = 0$; therefore $r = 0$ is a triple pole and there is no pole within the hollow cylinder. The stress in the solid phase in the reference configuration is a hydrostatic pressure. Thus $d_{02}^{(s)}p_{02}^{ext} = d_{01}^{(s)}p_{01}^{ext} = -a_0$, i.e., $-a_0$ equals the reference value of the hydrostatic pressure in the solid. Moreover $\bar{p}_0$ equals the negative of the external pressure applied on both the external and the internal surfaces of the hollow cylinder:
\[
p_{01}^{ext} = p_{02}^{ext} = -\bar{p}_0. \tag{46}
\]
We also assume that the second order tensor $\mathbf{K}^{sf}$ is spherical: $\mathbf{K}^{sf} = K^{sf}\mathbf{I}$. Thus the strain-energy density provides both the dependence of the hydrostatic component of the solid partial stress tensor on $\Delta\rho^{(f)}$, and that of the hydrostatic pressure acting in the fluid on $\Delta\rho^{(s)}$. An increment $\Delta\rho^{(f)}$ in the apparent mass density of the fluid does not induce nonzero deviatoric stresses in the solid constituent.
3.1.1. Influence of the Coupling Coefficient $K^{sf}$ on Density Profiles
With the aforestated assumptions and the existence of a first integral of equation (42), equations (42)–(44) can be reduced to the following two uncoupled ordinary differential equations:
\begin{align}
-\lambda_s\rho^{0(s)2}\left(U_r'' + \frac{1}{r}U_r'\right) + \left[2\mu + \lambda + 2\xi^{0(f)}a_0 - \frac{\left(\xi^{0(s)}\gamma^{0(f)} - a_0/\rho^{0} - K^{sf}\right)^2}{2\gamma^{0(f)}/\rho^{0} + \gamma^{ff}}\right]U_r &= s, \tag{47}\\
U_\theta' + \frac{2}{r}U_\theta &= 0, \tag{48}
\end{align}
where $s$ is an integration constant to be determined by boundary conditions, and $U_r = u_r' + (1/r)u_r = \operatorname{tr}\mathbf{H}^{(s)}$ and $U_\theta = u_\theta' - (1/r)u_\theta$ are dimensionless quantities. In particular, $U_r$ is related to the increment of the solid apparent density by
\[
\Delta\rho^{(s)} = -\rho^{0(s)}U_r. \tag{49}
\]
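Equation (48) is of first order in $U_\theta$ and can be integrated in closed form. The steps below are a sketch added for illustration; the symbol $c$ for the first integration constant is introduced here and does not appear in the source.

```latex
\begin{align*}
U_\theta' + \frac{2}{r}U_\theta = 0
  &\;\Longleftrightarrow\; \frac{1}{r^2}\frac{d}{dr}\left(r^2 U_\theta\right) = 0
  \;\Longrightarrow\; U_\theta = \frac{c}{r^2},\\
u_\theta' - \frac{1}{r}u_\theta = \frac{c}{r^2}
  &\;\Longleftrightarrow\; r\,\frac{d}{dr}\left(\frac{u_\theta}{r}\right) = \frac{c}{r^2}
  \;\Longrightarrow\; u_\theta = c_\theta\, r - \frac{c}{2r}.
\end{align*}
```

When $c = 0$ the second term drops out and the circumferential displacement is linear in $r$.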
According to the previous assumptions we can also specify boundary conditions arising from equations (9)–(12) and equations (35)–(37). Recalling the structure of balance laws (5) and (6), these conditions prescribe external radial and circumferential tractions for the solid constituent acting on the inner and the outer surfaces, the external pressure for the fluid constituent acting on the inner (or the outer) surface, and the double forces only for the solid constituent acting on disjoint parts of the boundary. The differential equation for the variable $U_r$ is a Bessel equation, and that for $U_\theta$ is the classical Euler equation. Let us first consider equation (48) and boundary conditions corresponding to shear external tractions for the solid constituent. As these tractions are null, see equation (35), we get
\[
U_\theta = 0 \quad\Rightarrow\quad u_\theta = c_\theta\, r, \tag{50}
\]
$c_\theta$ being a constant of integration. Thus the only admissible infinitesimal displacement in the circumferential direction is a rigid body rotation. In order to solve equation (47), we recall that a typical Bessel equation is
\[
x^2y''(x) + xy'(x) + \left(x^2 - \upsilon^2\right)y(x) = 0,
\]
and a typical modified Bessel equation is
\[
x^2y''(x) + xy'(x) - \left(x^2 + \upsilon^2\right)y(x) = 0.
\]
It is evident that equation (47) for $s = 0$ is either a classical or a modified Bessel equation, according to the sign of the coefficient of $U_r$. Therefore, two different solutions for the increment of the solid apparent density can be obtained. Let the coefficient of $U_r$ in equation (47) be defined by
\[
q := 2\mu + \lambda + 2\xi^{0(f)}a_0 - \frac{\left(\xi^{0(s)}\gamma^{0(f)} - a_0/\rho^{0} - K^{sf}\right)^2}{2\gamma^{0(f)}/\rho^{0} + \gamma^{ff}}, \tag{51}
\]
then, through a change of variable, equation (47) for $s = 0$ becomes
\[
\frac{d^2U_r(\xi)}{d\xi^2} + \frac{1}{\xi}\frac{dU_r(\xi)}{d\xi} - \operatorname{sign}(q)\,U_r(\xi) = 0, \qquad \xi = +\sqrt{\frac{|q|}{\lambda_s\rho^{0(s)2}}}\; r. \tag{52}
\]
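The change of type in equation (52) can be illustrated numerically. The sketch below is not the authors' computation: it evaluates the order-zero functions $I_0$ and $J_0$ from their power series using the standard library only, picks the one matching the sign of $q$, and checks by finite differences that it satisfies equation (52). All function names are hypothetical, and only the solution that stays finite at $\xi = 0$ is sketched; the second independent solution of each type is omitted.

```python
import math


def bessel_series(x, modified, terms=60):
    """Order-zero Bessel function from its power series.

    modified=True  -> I0(x) = sum_k (x^2/4)^k / (k!)^2
    modified=False -> J0(x) = sum_k (-1)^k (x^2/4)^k / (k!)^2
    Adequate for moderate |x|; not meant for large arguments.
    """
    quarter_x2 = 0.25 * x * x
    term, total = 1.0, 1.0
    for k in range(1, terms):
        factor = quarter_x2 / (k * k)
        term *= factor if modified else -factor
        total += term
    return total


def radial_scale(q, lam_s, rho0_s):
    """Ratio xi / r from equation (52): sqrt(|q| / (lam_s * rho0_s^2))."""
    return math.sqrt(abs(q) / (lam_s * rho0_s**2))


def regular_solution(r, q, lam_s, rho0_s):
    """I0(xi) when q > 0 (modified Bessel), J0(xi) when q < 0 (classical)."""
    xi = radial_scale(q, lam_s, rho0_s) * r
    return bessel_series(xi, modified=(q > 0))


def residual(xi, sign_q, h=1e-3):
    """Central-difference residual of U'' + U'/xi - sign(q) U at xi."""
    f = lambda s: bessel_series(s, modified=(sign_q > 0))
    d1 = (f(xi + h) - f(xi - h)) / (2.0 * h)
    d2 = (f(xi + h) - 2.0 * f(xi) + f(xi - h)) / h**2
    return d2 + d1 / xi - sign_q * f(xi)


print(residual(1.5, +1), residual(1.5, -1))
```

For $q > 0$ the function grows monotonically with $\xi$, whereas for $q < 0$ it oscillates.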
Depending upon the sign of $q$, equation (52) is either a Bessel or a modified Bessel equation. If $q > 0$ then the solution of equation (52) is given by a linear combination of modified Bessel functions $I_0(\xi)$ and $K_0(\xi)$. However, if $q < 0$ then the solution of equation (52) is given by a linear combination of classical Bessel functions $J_0(\xi)$ and $Y_0(\xi)$. The subscript zero indicates that these functions are solutions of the Bessel equation with $\upsilon = 0$. Note that the standard nomenclature for Bessel equations has been adopted. Expressions for the classical Bessel functions $J_0(\xi)$ and $Y_0(\xi)$ and the modified Bessel functions $I_0(\xi)$ and $K_0(\xi)$ can be found in a book, e.g., [1].
Starting from a vanishing coupling coefficient we conduct a parametric study of the solution of equation (47) when $|K^{sf}|$ is monotonically increased. This is essentially necessitated by our inability to identify the suitable range of values of $K^{sf}$. Even though we know that it describes the effects of pore dilatancy, there are no experimental data available to quantify this effect. We commence from the solution of the homogeneous equation associated with equation (47). Two different ranges of values of the coupling coefficient can be determined. Let
\[
K_{1,2}^{sf} := \xi^{0(s)}\gamma^{0(f)} - \frac{a_0}{\rho^{0}} \mp \sqrt{\left(2\mu + \lambda + 2\xi^{0(f)}a_0\right)\left(\frac{2\gamma^{0(f)}}{\rho^{0}} + \gamma^{ff}\right)} \tag{53}
\]
be the values of $K^{sf}$ for which $q$ vanishes. When $K^{sf} \in (K_1^{sf}, K_2^{sf})$ the solution of the homogeneous equation is given by a linear combination of the modified Bessel functions, as $\operatorname{sign}(q) = 1$. For $K^{sf} \in (-\infty, K_1^{sf}) \cup (K_2^{sf}, \infty)$ the solution of the aforementioned equation is a linear combination of the classical Bessel functions, as $\operatorname{sign}(q) = -1$. Once equation (52) has been solved, the solution of equation (47) is obtained by simply adding a suitable constant to it.
The following figures depict the $\rho^{(s)}$ profiles for a salt matrix filled with brine; values of constitutive and geometric parameters and surface tractions applied on the boundary are listed in Table I. Values of constitutive parameters are deduced from the test data on salt rock and brine [37]. Note that values for the constitutive coefficients $\lambda_s$ and $C^{int}$ that describe second gradient effects are arbitrarily chosen since no experimental data are available for their determination. However, they are reasonable and can describe the pore-opening effect near the boundary of the mixture. The variation of $\rho^{(s)}$ in the boundary layer due to second gradient is about 1% of its reference value (i.e., its value due to the first gradient model only). The plot in Figure 1 is representative of solutions in the range $(K_1^{sf}, K_2^{sf})$ and that in Figure 2 corresponds to solutions in the range $(-\infty, K_1^{sf}) \cup (K_2^{sf}, \infty)$. In Figure 1 the typical behavior of fields exhibiting boundary layers is shown; in our model these boundary layers are driven by the
Table I. Values of parameters used in the computation of results. $E$ and $\nu$ are Young’s modulus and Poisson’s ratio of the solid matrix; $\hat{\rho}^{0(s)}$ and $\hat{\rho}^{0(f)}$ are the densities of the solid and the fluid constituent in the reference configuration, $\nu^{0(s)}$ and $\nu^{0(f)}$ their volume fractions. In the reference configuration the mixture is saturated.
Constitutive parameters: $E = 200$ MPa; $\nu = 0.33$; $\lambda_s = 200$ N m$^4$/kg$^2$; $\gamma^{ff} = 1.64 \cdot 10^{6}$ N m$^4$/kg$^2$; $C^{(s)} = C^{(f)}$; $C^{int} = 1$ N m$^3$/kg$^2$.
Geometric and referential state properties: $\hat{\rho}^{0(s)} = 1850$ kg/m$^3$; $\hat{\rho}^{0(f)} = 1300$ kg/m$^3$; $\nu^{0(s)} = 0.97$; $\nu^{0(f)} = 1 - \nu^{0(s)} = 0.03$; $\rho^{0(s)} = \hat{\rho}^{0(s)}\nu^{0(s)} = 1794.5$ kg/m$^3$; $\rho^{0(f)} = \hat{\rho}^{0(f)}\nu^{0(f)} = 39$ kg/m$^3$; $R_1 = 2$ m, $R_2 = 20$ m.
Tractions: $p_{01}^{ext} = 20$ MPa; $p_{02}^{ext} = p_{01}^{ext}$; $\tilde{p}_0 = -2.21$ MPa; $p_1 = 0.1 \cdot 10^{6}$ N/m$^3$.
Figure 1. Qualitative $\rho^{(s)}$-profiles for $K^{sf} \in (K^{sf}_1, K^{sf}_2)$.
applied double forces. Close to the boundary of the solid matrix their effect is either a dilatation or a compaction induced by the applied fluid pressure. In Figure 2 the solution shows wide oscillations due to the change of type occurring in the Bessel equation (52). This is usually an indication of instability and motivates the following analysis.

Figure 2. Qualitative $\rho^{(s)}$-profiles for $K^{sf} \in (-\infty, K^{sf}_1) \cup (K^{sf}_2, \infty)$. Panel (a) corresponds to $1.75 < K < 1.81$, and panel (b) to $1.83 < K < 1.9$, where $K = K^{sf}/\sqrt{\mu\,(2\gamma^{0f}/\rho^{0} + \gamma^{ff})}$.

3.1.2. Stability of the Stressed Reference Configuration

We now investigate the stability of the reference configuration with respect to changes in the coupling coefficient $K^{sf}$. Recalling our earlier remarks on the admissible values of $K^{sf}$, we now delineate the range of its values which ensures the uniqueness of the solution of the elastic problem. Said differently, our goal is to characterize the values of the coupling coefficient $K^{sf}$ which ensure the structural stability of the partial differential equations defined on the space of state parameters $u^{(s)}$ and $\rho^{(f)}$ and describing deformations of the mixture. One could also investigate the stability of the prestressed configuration with respect to the value of the prestress. According to the criterion stated, for example, by Arnold [2] or Thom [39], we discuss the possibility that, for two solutions corresponding to sufficiently close values of $K^{sf}$, a homeomorphism on the space of mixture states exists transforming one solution into the other. The norm induced by the energy inner product is used to define the neighborhood of an element in this space. In order that the partial differential equations of our problem fulfill the condition of topological equivalence (see [2]), it is assumed that a solution of equations (47) and (48) describes available transformations of a reference configuration. This provides an admissible criterion for the stability analysis; many other choices are possible.

The following physically meaningful energetic criterion can be used to study the stability. It requires that, for the reference equilibrium configuration to be stable, the total energy given by the functional

$$E_{tot}\left(\rho^{(f)}, F^{(s)}\right) = \int_{\Omega} \left[ \rho\,\Psi\left(\rho^{(f)}, F^{(s)}\right) + \frac{\lambda_s}{2}\left|\nabla\rho^{(s)}\right|^2 - \psi^{ext}\left(x, \rho^{(s)}, \rho^{(f)}, \nabla\rho^{(s)}\right) \right] dV \qquad (54)$$

be minimum in the reference configuration. Here essential boundary conditions are assumed to be prescribed on the entire boundary. In particular, we prove that the reference configuration is stable when the coupling coefficient lies in a suitable subset of the open interval $(K^{sf}_1, K^{sf}_2)$. The mathematical reasoning parallels that
given, for example, in [12]. That is, for the reference configuration to be stable, the second functional derivative of Etot evaluated in the reference configuration must be positive definite. We shall prove that, when is prescribed by equation (26), this is equivalent to requiring that the following spectral problem has positive eigenvalues only. ⎧ 1 1 r2 ⎪ ⎪ 2µ + λ + 2ξ 0(f) a0 − α r,rr + r,r − 2 1 + ⎪ ⎪ ⎪ r r λs ρ 0(s) ⎪ ⎪ ⎪ ⎪ 0(f)2 0(s) 0f ⎪ ⎪ (ξ γ − a0 /ρ 0 − K sf )2 ⎨ +ρ r = 0, α − ρ 0(f) (2ξ 0(f) γ 0f + ρ 0(f) γ ff ) (55) ⎪ ⎪ 0(f) 0(s) 0f 0 sf ⎪ ⎪ ρ (ξ γ − a0 /ρ − K ) ⎪ ⎪ r = 0, − ⎪ ⎪ α − ρ 0(f) (2ξ 0(f) γ 0f + ρ 0(f) γ ff ) ⎪ ⎪ ⎪ ⎩ in ; (2µ + a0 − α)θ = 0, ⎧ 1 2 ⎪ ⎪ λ − 2ξ 0(s) − 1 a0 + ρ 0(s) C int − α div vs ⎪ ⎪ ⎪ r ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ + ρ 0(f) ξ 0(s)γ 0f − a0 − K sf div v ⎪ f ⎪ ⎪ ρ0 ⎪ ⎪ ⎪ ⎪ ⎪ a0 1 ⎪ 0(s)2 ⎪ ⎪ (vs )r,r − λs ρ +2 µ + r,r + r = 0, ⎪ ⎨ 2 r (56) 1 δ ⎪ ⎪ (2µ + a0 − α) (vs )θ,r + (vs )θ = 0, ⎪ ⎪ ⎪ 2 r ⎪ ⎪ ⎪ ⎪ ⎪ 0(f) 0(s) 0f a0 ⎪ ⎪ ξ γ − 0 − K sf div vs ρ ⎪ ⎪ ρ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ + ρ 0(f) 2ξ 0(f) γ 0f + ρ 0(f) γ ff − α div vf = 0, ⎪ ⎪ ⎪ ⎩ on ∂. λs r − C int div vs = 0, Here α denotes a generic eigenvalue, and quantities r , θ and are defined by 1 r := ∇(div vs ) · er = (vs )r,r + (vs )r , r ,r 1 1 (vs )θ,r + (vs )θ , (57) θ := div(skw (∇vs )) · eθ = 2 r ,r := ∇(div vf ) · er , in terms of the virtual velocity fields vs and vf . This equivalence can be proved by following the procedure adopted by Seppecher [36]. In a virtual motion the second time derivative of Etot evaluated in the reference configuration is required to equal the integral over of a suitable quadratic form multiplied by α. Consequently its sign depends on the sign of α. In order to get conditions for a positive definite
second derivative of $E_{tot}$ by means of a suitable eigenvalue problem, the proper quadratic form is

$$\left.\frac{d^2 E_{tot}}{dt^2}\right|_{(\rho^{0(f)},\,\rho^{0(s)},\,\bar p_0)} = \alpha \int_{\Omega} \left[ (\operatorname{div} v_s)^2 + \operatorname{skw}(\nabla v_s)\cdot\operatorname{skw}(\nabla v_s) + (\operatorname{div} v_f)^2 \right] dV. \qquad (58)$$

It is clear that if $\alpha$ is positive then the left-hand side of equation (58) is positive and the reference configuration is stable. Cumbersome calculations involved in the derivation of equations (55) and (56) from equation (58) have been omitted. We simply note that the divergence theorem applied to the right-hand side of equation (58) implies that field equations (55) and boundary conditions (56) depend on the eigenvalue $\alpha$. Field equation (55)$_1$ is a Bessel equation. Therefore, according to the sign of

$$Q := \frac{1}{\lambda_s \rho^{0(s)2}} \left[ 2\mu + \lambda + 2\xi^{0(f)} a_0 - \alpha + \frac{\rho^{0(f)2}\left(\xi^{0(s)}\gamma^{0f} - a_0/\rho^{0} - K^{sf}\right)^2}{\alpha - \rho^{0(f)}\left(2\xi^{0(f)}\gamma^{0f} + \rho^{0(f)}\gamma^{ff}\right)} \right], \qquad (59)$$

its solution is a linear combination of classical Bessel functions or of modified Bessel functions. In particular, if $Q < 0$ the solution of equation (55)$_1$ is a linear combination of $J_1(\sqrt{-Q}\,r)$ and $Y_1(\sqrt{-Q}\,r)$; conversely, if $Q > 0$ it is a linear combination of $I_1(\sqrt{Q}\,r)$ and $K_1(\sqrt{Q}\,r)$. In order to determine the range of $K^{sf}$ for which eigenvalues are positive, we first determine the range of eigenvalues corresponding to positive or negative values of $Q$. When $\alpha$ is negative, we restrict our discussion to the case

$$\alpha < \underbrace{\rho^{0(f)}\left(2\xi^{0(f)}\gamma^{0f} + \rho^{0(f)}\gamma^{ff}\right)}_{>0}. \qquad (60)$$

Consequently we get

$$Q > 0 \;\Rightarrow\; \alpha < \alpha_1,\ \alpha > \alpha_2, \qquad Q < 0 \;\Rightarrow\; \alpha_1 < \alpha < \alpha_2, \qquad (61)$$

where $\alpha_1$ and $\alpha_2$ are roots of the equation $Q = 0$. Consider the case when $\alpha < \alpha_1$; it is easy to check that, for values of the constitutive coefficients considered in Table I, a characteristic root always exists in this range of eigenvalues. Therefore solutions of equations (55) are linear combinations of modified Bessel functions.
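The roots $\alpha_1$ and $\alpha_2$ of $Q = 0$ can be made explicit by clearing the denominator in (59), which leaves a quadratic in $\alpha$. The sketch below does this with three shorthand quantities that are not part of the paper's notation: `A` for $2\mu + \lambda + 2\xi^{0(f)}a_0$, `B` for $\rho^{0(f)}(2\xi^{0(f)}\gamma^{0f} + \rho^{0(f)}\gamma^{ff})$ and `c` for $\rho^{0(f)2}(\xi^{0(s)}\gamma^{0f} - a_0/\rho^{0} - K^{sf})^2$. The numerical values are purely illustrative and are not those of Table I.

```python
import math


def scaled_q(alpha, A, B, c):
    """lambda_s * rho_s^2 * Q of equation (59), written with the shorthand A, B, c."""
    return A - alpha + c / (alpha - B)


def q_roots(A, B, c):
    """Roots alpha_1 <= alpha_2 of Q = 0.

    Multiplying scaled_q by (alpha - B) gives
        alpha^2 - (A + B) * alpha + (A * B - c) = 0,
    whose discriminant (A - B)^2 + 4c is non-negative because c >= 0.
    """
    root = math.sqrt((A - B) ** 2 + 4.0 * c)
    return 0.5 * (A + B - root), 0.5 * (A + B + root)


# Hypothetical, dimensionless values chosen only to exercise the formulas.
A, B, c = 5.0, 2.0, 1.0
alpha1, alpha2 = q_roots(A, B, c)
print("alpha1 =", alpha1, "alpha2 =", alpha2)
print("sign of Q below alpha1:", math.copysign(1.0, scaled_q(alpha1 - 0.5, A, B, c)))
```

Under restriction (60), i.e. for $\alpha < B$ in this shorthand, the sketch reproduces the sign pattern of (61): `scaled_q` is positive below `alpha1` and negative between `alpha1` and `B`.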
Figure 3. The solution of the equation $\varphi(\alpha, K^{sf}) = 0$ for $\alpha < \alpha_1$, where $a = \alpha/(\lambda_s \rho^{0(s)2})$.
Assume that $\alpha \neq 2\mu + a_0$, i.e., require equation (55)$_3$ to be satisfied by a rigid body rotation. Then equations (55) imply that

$$\begin{aligned}
\nabla(\operatorname{div} v_s)\cdot e_r &= C_1 I_1(\sqrt{Q}\,r) + C_2 K_1(\sqrt{Q}\,r), \qquad \operatorname{div}(\operatorname{skw}(\nabla v_s))\cdot e_\theta = 0,\\
\operatorname{div} v_s &= \frac{C_1}{\sqrt{Q}}\, I_0(\sqrt{Q}\,r) - \frac{C_2}{\sqrt{Q}}\, K_0(\sqrt{Q}\,r) + C_3,\\
(v_s)_r &= \frac{C_1}{Q}\, I_1(\sqrt{Q}\,r) + \frac{C_2}{Q}\, K_1(\sqrt{Q}\,r) + \frac{C_3}{2}\, r + \frac{C_4}{r},
\end{aligned} \qquad (62)$$

where the left-hand sides of the first two relations are the quantities defined in (57) and $C_i$ ($i = 1, \ldots, 4$) are constants of integration. Substituting from (62) into the homogeneous boundary conditions (56) and requiring that the determinant of the resulting linear system vanish, one gets the equation which determines the admissible eigenvalues of the system for $\alpha < \alpha_1$. These solutions are functions of the coupling coefficient $K^{sf}$. Let $\varphi$ be the determinant of the linear system obtained from equations (56), regarded as a function of $\alpha$ and $K^{sf}$. In Figure 3 the curve $\varphi(\alpha, K^{sf}) = 0$ is exhibited. The gray area in the figure indicates the region of positive eigenvalues larger than $\alpha_1$. For values of the coupling coefficient in the range $(K^{sf}_1, K^{sf}_2)$ an eigenvalue smaller than $\alpha_1$ always exists. Consequently, the loss of stability of the equilibrium configuration occurs when such an eigenvalue becomes negative. A numerical simulation shows that the stability is guaranteed for values of the coupling coefficient in a suitable open subset $(K^{sf}_{1s}, K^{sf}_{2s})$ of $(K^{sf}_1, K^{sf}_2)$. Thus solutions of equations (47) and (48) are meaningful only for coupling coefficients in this open interval.
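The modified Bessel part of the solution can be checked numerically. The sketch below is only an illustration under stated assumptions: it evaluates $I_0$ and $I_1$ from their power series (the helper `bessel_i` is hypothetical, not a library call), then uses central finite differences to test that $y(r) = I_1(\sqrt{Q}\,r)$ satisfies the modified Bessel equation of order one, $y'' + y'/r - (1/r^2 + Q)\,y = 0$, and that $\frac{d}{dr}\left[I_0(\sqrt{Q}\,r)/\sqrt{Q}\right] = I_1(\sqrt{Q}\,r)$. The values of `Q` and `r` are arbitrary.

```python
import math


def bessel_i(n, x, terms=40):
    """Modified Bessel function I_n(x), integer n >= 0, from its power series."""
    return sum(
        (x / 2.0) ** (2 * k + n) / (math.factorial(k) * math.factorial(k + n))
        for k in range(terms)
    )


def bessel_residual(Q, r, h=1e-4):
    """Finite-difference residual of y'' + y'/r - (1/r^2 + Q) y for y = I_1(sqrt(Q) r)."""
    s = math.sqrt(Q)
    y = lambda t: bessel_i(1, s * t)
    d1 = (y(r + h) - y(r - h)) / (2.0 * h)
    d2 = (y(r + h) - 2.0 * y(r) + y(r - h)) / h**2
    return d2 + d1 / r - (1.0 / r**2 + Q) * y(r)


def derivative_gap(Q, r, h=1e-5):
    """Difference between d/dr [I_0(sqrt(Q) r) / sqrt(Q)] and I_1(sqrt(Q) r)."""
    s = math.sqrt(Q)
    lhs = (bessel_i(0, s * (r + h)) - bessel_i(0, s * (r - h))) / (2.0 * h * s)
    return lhs - bessel_i(1, s * r)


Q, r = 2.0, 1.5  # arbitrary illustrative values with Q > 0
print("ODE residual:", bessel_residual(Q, r))
print("derivative identity gap:", derivative_gap(Q, r))
```

Both printed numbers should be close to zero, limited only by the finite-difference error.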
Figure 4. Dependence of $K^{sf}_{1s}$ and $K^{sf}_{2s}$ upon $\lambda_s$.
These results indicate that the stability-instability transition does not involve a change in the macroscopic deformation profile for the hollow cylinder. In other words, the loss of stability and the wide oscillations occurring when the coupling coefficient belongs to the open set $(-\infty, K^{sf}_1) \cup (K^{sf}_2, \infty)$ are not correlated. This could be due to the stability criterion used here. A nonlinear analysis and/or a different choice of the energetic functional may provide a different stability limit.

It is interesting to note that the stability limits $K^{sf}_{1s}$ and $K^{sf}_{2s}$ are not strongly affected by second gradient effects. An increase in $\lambda_s$ does not induce a noticeable change in these limits (see Figure 4). However, estimates of the stability limits provided by the first gradient theory (i.e. $\lambda_s = 0$) cannot be used when second gradient effects are present, and the length of the stable region progressively decreases as second gradient effects become more relevant (see Figure 4).

4. Concluding Remarks

We have studied infinitesimal static deformations of a long hollow porous isotropic elastic cylinder initially saturated with a perfect fluid and subjected to a hydrostatic pressure on the inner and the outer surfaces. The cylinder in the reference configuration is stressed. Equations governing deformations of the solid and the fluid that are linear in displacement gradients and infinitesimal changes in the apparent mass densities of the solid and the fluid have been derived; these are Heun's equations and may have singular points in the interior of the hollow cylinder. Constitutive
relations for the solid have been assumed to depend upon the gradients of the apparent mass density of the solid. Deformations of the solid and the fluid are coupled through the scalar coefficient $K^{sf}$ multiplying gradients of the apparent mass density of the solid. When the initial stress state in the solid is that of uniform pressure, equations governing the radial and the circumferential components of displacement are uncoupled; the former is a nonhomogeneous Bessel equation and the latter an Euler equation. These equations are solved for a somewhat arbitrarily chosen set of material and geometric parameters; these correspond to a salt matrix filled with brine. The stability of the reference configuration with respect to changes in the values of the coupling coefficient $K^{sf}$ has been scrutinized. It is found that the reference configuration is stable for values of the coupling coefficient in a suitable open set. The computed profiles of the mass density of the solid phase exhibit an oscillatory behavior and also a boundary layer near the outer surface. The change in the apparent mass density of the solid due to second-gradient effects is about 1% of its value in the absence of second-gradient effects and depends upon the value assigned to the coupling coefficient $K^{sf}$.
References

1. G.E. Andrews, R. Askey and R. Roy, Special Functions. In: Encyclopedia of Mathematics and Its Applications. Cambridge Univ. Press, Cambridge (1999).
2. V.I. Arnold, Geometrical Methods in the Theory of Ordinary Differential Equations, translated by J. Szucs; English translation edited by M. Levi. Springer, New York (1983).
3. F.M. Arscott, Heun's equation. In: A. Ronveaux (ed.), Heun's Differential Equations. Oxford Univ. Press, Oxford (1995) Part A.
4. R.C. Batra, On nonclassical boundary conditions. Arch. Rational Mech. Anal. 48 (1972) 163–191.
5. R.C. Batra, Thermodynamics of non-simple elastic materials. J. Elasticity 6 (1976) 451–456.
6. R.C. Batra, The initiation and growth of, and the interaction among adiabatic shear bands in simple and dipolar materials. Internat. J. Plasticity 3 (1987) 75–89.
7. R.C. Batra and L. Chen, Shear band spacing in gradient-dependent thermoviscoplastic materials. Comput. Mech. 23 (1999) 8–19.
8. R.C. Batra and J. Hwang, Dynamic shear band development in dipolar thermoviscoplastic materials. Comput. Mech. 12 (1994) 354–369.
9. R.C. Batra and C.H. Kim, Adiabatic shear banding in elastic-viscoplastic nonpolar and dipolar materials. Internat. J. Plasticity 6 (1990) 127–141.
10. P. Bérest, J. Bergues, B. Brouard, J.G. Durup and B. Guerber, A salt cavern abandonment test. Internat. J. Rock Mech. Min. 38 (2001) 357–368.
11. M.A. Biot, General theory of three-dimensional consolidation. J. Appl. Phys. 12 (1941) 155–164.
12. P. Blanchard and E. Bruning, Variational Methods in Mathematical Physics (a Unified Approach). Springer, Heidelberg (1992).
13. R.M. Bowen, Theory of mixtures. In: Continuum Physics, Vol. III (1976) pp. 2–127.
14. J.W. Cahn and J.E. Hilliard, Free energy of a non-uniform system. J. Chem. Phys. 31 (1959) 688–699.
15. P. Casal, La théorie du second gradient et la capillarité. C. R. Acad. Sci. Paris Sér. A 274 (1972) 1571–1574.
16. P. Casal and H. Gouin, Équations du mouvement des fluides thermocapillaires. C. R. Acad. Sci. Paris Sér. II 306 (1988) 99–104.
17. P. Cosenza, M. Ghoreychi, B. Bazargan-Sabet and G. de Marsily, In situ rock salt permeability measurement for long term safety assessment of storage. Internat. J. Rock Mech. Min. 36 (1999) 509–526.
18. R. de Boer, Theory of Porous Media. Springer, Berlin (2000).
19. F. dell'Isola, M. Guarascio and K. Hutter, A variational approach for the deformation of a saturated porous solid. A second gradient theory extending Terzaghi's effective stress principle. Archive Appl. Mech. 70 (2000) 323–337.
20. F. dell'Isola and P. Seppecher, Edge contact forces and quasibalanced power. Meccanica 32 (1997) 33–52.
21. W. Ehlers, Toward finite theories of liquid-saturated elasto-plastic porous media. Internat. J. Plasticity 7 (1991) 433–475.
22. P. Fillunger, Erdbaumechanik. Selbstverlag des Verfassers, Wien (1936).
23. P. Germain, La méthode des puissances virtuelles en mécanique des milieux continus. J. Mécanique 12(2) (1973) 235–274.
24. P. Germain, The method of virtual power in continuum mechanics. Part 2: Microstructure. SIAM J. Appl. Math. 25(3) (1973) 556–575.
25. M.A. Goodman and S.C. Cowin, A continuum theory for granular materials. Arch. Rational Mech. Anal. 44 (1972) 249–266.
26. H. Gouin, Tension superficielle dynamique et effet Marangoni pour les interfaces liquide-vapeur en théorie de la capillarité interne. C. R. Acad. Sci. Paris Sér. II 303(1) (1986).
27. S. Krishnaswamy and R.C. Batra, A thermomechanical theory of solid-fluid mixtures. Math. Mech. Solids 2 (1997) 143–151.
28. G.A. Maugin, The method of virtual power in continuum mechanics: application to coupled fields. Acta Mech. 35 (1980) 1–70.
29. L.W. Morland, A simple constitutive theory for a fluid saturated porous solid. J. Geoph. Res. 77 (1972) 890–900.
30. I. Müller, A thermodynamic theory of mixtures of fluids. Arch. Rational Mech. Anal. 28 (1968) 1–39.
31. I. Müller, Thermodynamics. Pitman, Boston (1985).
32. N.I. Muskhelishvili, Some Basic Problems of the Mathematical Theory of Elasticity. Noordhoff, Groningen (1953).
33. K.R. Rajagopal and L. Tao, Mechanics of Mixtures. World Scientific, Singapore (1995).
34. G. Sciarra, F. dell'Isola and K. Hutter, A solid-fluid mixture model allowing for solid dilatation under external pressure. Continuum Mech. Thermodyn. 13 (2001) 287–306.
35. G. Sciarra, K. Hutter and G.A. Maugin, A variational approach to a micro-structured theory of solid-fluid mixtures, in preparation.
36. P. Seppecher, Equilibrium of a Cahn–Hilliard fluid on a wall: Influence of the wetting properties of the fluid upon the stability of a thin liquid film. European J. Mech. B Fluids 12(1) (1993) 69–84.
37. S.M.R.I. (Solution Mining Research Institute), Technical class guidelines for safety assessment of salt caverns, Fall Meeting, Rome, Italy (1998).
38. B. Svendsen and K. Hutter, On the thermodynamics of a mixture of isotropic materials with constraints. Internat. J. Engrg. Sci. 33 (1995) 2021–2054.
39. R. Thom, Stabilité structurelle et morphogénèse: Essai d'une Théorie Générale des Modèles. Benjamin, New York (1972).
40. C.A. Truesdell, Sulle basi della termomeccanica. Lincei Rend. Sc. Fis. Mat. Nat. XXII (Gennaio 1957).
41. C.A. Truesdell, Thermodynamics of diffusion, Lecture 5. In: C.A. Truesdell (ed.), Rational Thermodynamics. Springer, Berlin (1984) pp. 216–219.
A Class of Fit Regions and a Universe of Shapes for Continuum Mechanics

GIANPIETRO DEL PIERO
Dipartimento di Ingegneria, Università di Ferrara, Via Saragat 1, 44100 Ferrara, Italy

Received 12 September 2002; in revised form 3 March 2003

Abstract. A new class of fit regions is proposed as an alternative to those available in the literature, and specifically to the class defined by Noll and Virga in their paper [12]. An advantage of the proposed class is that of being based mostly on topological concepts rather than on less familiar concepts from geometric measure theory. A distinction is introduced between fit regions and shapes of continuous bodies. The latter are defined as equivalence classes of fit regions, made of regions all with the same interior and with the same closure. In the final part of the paper the axioms for a universe of bodies, formulated by Noll and incorporated in Truesdell's book [15], are re-discussed and partially re-formulated.

Mathematics Subject Classifications: 73A05.

Key words: foundations of continuum mechanics, fit regions, universes of bodies.
Dedicated to the memory of Clifford A. Truesdell, teacher and friend
1. Introduction

In his textbook on Rational Continuum Mechanics, Clifford Truesdell writes: “Mechanics rests upon three substructures: a universe of bodies, a geometry with its kinematics, and a theory of forces. These substructures provide the concepts mechanics is to connect.” [15, p. 6]. In this paper I deal with the first of these substructures. A universe of bodies, or material universe, is a collection of continuous bodies. Mathematically it is defined as a pair (Ω, ≺), where Ω is a set and ≺ is a relation on Ω, subject to a number of axioms.** The elements of Ω are the bodies, and if two bodies A and B satisfy the relation A ≺ B we say that A is a part of B.‡

* Presently on leave at the Centro Linceo Interdisciplinare “Beniamino Segre”, Accademia Nazionale dei Lincei, Rome, Italy.
** The axioms define on (Ω, ≺) the structure of a Boolean algebra, see [10, 15].
‡ The concept of a material universe was introduced by Noll in 1959 [8]. Formal definitions are given in [9–11, 15]. Sometimes, a material universe is identified with a single continuous body, and its elements are identified with the subbodies of the given body, see, e.g., [2, 3, 7, 11, 13, 14].
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 265–285. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The bodies of Continuum Mechanics are deformable. They may take different shapes. A change of shape is called a transplacement, and the collection of all shapes which can be taken by all bodies in a universe of bodies is a universe of shapes.* Usually, shapes are identified with regions of a Euclidean space. The idea that shapes should belong to some “suitable class of nice subsets of Euclidean spaces” appears in [11]. In the same paper, two examples of “suitable classes” are provided:
– the set of all regularly closed subsets,
– the set of all regularly open subsets.**
Purely mathematical examples are also given in [10].

The concept of a nice subset was made precise by Noll and Virga in their paper [12]. In it, nice subsets were called fit regions, and their properties were fixed as follows:
(F1) The set of all fit subregions of a given fit region should satisfy the axioms of a material universe.
(F2) The class of fit regions should be invariant under transplacements, which should include adjustments to fit regions of smooth diffeomorphisms from one Euclidean space to another.
(F3) Each region should have a surface-like boundary for which a form of the integral-gradient (Gauss–Green) theorem should be valid.
The same authors added:
(F4) It is also desirable that the class of fit regions include all that can possibly be imagined by an engineer but exclude those that can be dreamt up only by an ingenious mathematician.

The class of regularly open sets and that of regularly closed sets do not satisfy the requirement (F3). In fact, the integral-gradient formula is usually established within the class of sets with finite perimeter, see, e.g., [5, Section 5.8]. The class of open sets with finite perimeter was proposed as a class of fit regions by Banfi and Fabrizio [2].‡ As remarked by Noll and Virga in [12], this class does not satisfy the requirement (F1). Later, Gurtin et al. added the condition that the sets be d-regular.‡‡ The class of d-regular open sets with finite perimeter satisfies all requirements but, in Noll and Virga's opinion, it is “unnecessarily large”.¶

* [15, Section 2.1]. In papers preceding Truesdell's book, there is some confusion between bodies and regions occupied by bodies.
** Denote by int A the interior and by clo A the closure of a set A. Then A is regularly open if A = int clo A, and is regularly closed if A = clo int A.
‡ See also [13].
‡‡ Gurtin et al. [7]. A set is d-regular, or normalized, if it coincides with the set of its density points, see Section 2 below.
¶ In [3], Degiovanni et al. restricted this class to bounded sets. In [14], Šilhavý further restricted this class to sets with negligible boundary, see condition (NV4) below.

In the spirit of their statement (F4), they select a more restricted class, which also has the
advantage of involving a reduced amount of measure theoretic concepts. This is the class of all subsets of a Euclidean space which are
(NV1) bounded,
(NV2) regularly open,
(NV3) with finite perimeter,
(NV4) with negligible boundary.*
Regions with these properties will be called NV-regions.

In the present paper I define a new class of fit regions, which I call D-regions. In it, the only measure theoretic concept involved is that of the Hausdorff measure of a set. Indeed, the boundary of every D-region is required to have a finite (N − 1)-dimensional Hausdorff measure. This implies that D-regions are both with finite perimeter and with negligible boundary. For D-regions, properties (F2) and (F3) are proved in Section 2.** More precisely, the invariance property (F2) is proved for bi-Lipschitz homeomorphisms, a class of mappings larger, and physically more realistic, than the class of smooth diffeomorphisms considered in [12].‡ D-regions need not be either open or closed, but those which are open are NV-regions. This is proved in Section 3, where it is also shown by an example that open D-regions form a proper subset of the set of all NV-regions. Referring to the statement (F4), the new class of fit regions can be considered as a further step in the process of eliminating regions which may be “dreamt up by an ingenious mathematician” but never “imagined by an engineer”.

A new definition of a shape of a body is given in Section 4. While shapes are usually identified with fit regions of Euclidean spaces, here they are defined as equivalence classes of D-regions, called D-shapes.‡‡ Each class is made of regions having the same interior and the same closure, so that regions within the same class only differ by the portion of boundary included in the region. Accordingly, a D-shape may be represented by an open region, by a closed region, or by a region which is neither open nor closed. This may be helpful in facing many situations encountered in problems of mechanics.¶ Also, at least in my opinion, the proposed definition meets more directly some requirements suggested by physics. For example, as shown in Section 4, it provides a more satisfactory definition of the partition of a shape, a definition which cannot be given in terms of open sets or of closed sets alone.

* I.e., the boundary has zero N-dimensional Lebesgue measure.
** If, as in the present paper, shapes are not identified with fit regions, then (F1) applies to shapes rather than to individual fit regions.
‡ Indeed, bi-Lipschitz homeomorphisms need not be differentiable at every interior point of the region, and cases of transplacements which are not differentiable on singular surfaces or on other sets with zero measure are frequently met in problems of elasticity. Invariance under bi-Lipschitz homeomorphisms was first assumed in [2].
‡‡ To my knowledge, the only reference to equivalence classes of regions as an alternative to individual regions was made by Šilhavý in [13].
¶ For example, open sets are more appropriate to the study of boundary value problems of elasticity, while closed sets are more convenient when a specific material structure is prescribed to the boundary or to a part of it.

The axioms of a material universe mentioned in (F1) are discussed in Section 5. In the presentation given in [15] there are six axioms. The first three are:
(A1) A ≺ A,
(A2) A ≺ B and B ≺ A ⇒ A = B,
(A3) A ≺ B and B ≺ C ⇒ A ≺ C.
They state that the relation ≺ is reflexive, antisymmetric, and transitive. These are the defining properties of a partial ordering. It is then assumed that Ω contains two elements ∅, ∞, called the null body and the universal body, such that

∅ ≺ A ≺ ∞   ∀A ∈ Ω.   (1.1)

A body C is said to be an envelope of A and B if both A and B are parts of C, and is said to be a common part to A and B if C is a part of both A and B. The minimum envelope of A and B, if it exists, is noted A ∨ B and is called the join of A and B, and the maximum common part of A and B, if it exists, is noted A ∧ B and is called the meet of A and B. The fourth axiom postulates the existence of a unique exterior A^e for each body A:
(A4) for each A ∈ Ω there is a unique A^e ∈ Ω such that A ∧ A^e = ∅ and A ∨ A^e = ∞,
the fifth axiom requires that all bodies disjoint from a body A be parts of the exterior of A:
(A5) A ∧ B = ∅ ⇒ B ≺ A^e,
and the last axiom postulates the existence of the meet for every pair of bodies:
(A6) the meet A ∧ B exists for every A, B ∈ Ω.

There are some differences between the postulates listed above and those in [10, 11].* In [10], axiom (A1) is replaced by the assumption of the existence of the relation ≺, and in [11] the same axiom is replaced by the existence of the elements ∅, ∞, an assumption made in [10, 15] without giving it the status of an axiom.**

* In fact, there are two lists of postulates in [10], one in the Appendix, to which I refer here, and one in Section 8, which coincides with the list given in [11].
** In [10, 15], ∅ and ∞ are considered as improper bodies, to be included in order to perform a sort of completion of Ω.
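The six axioms (A1)–(A6) can be tested by brute force on a small finite model. The sketch below is an illustration only, not part of the paper: it assumes bodies are modelled as Python frozensets with "is a part of" read as set inclusion, and all function names are hypothetical. The power set of a three-point set passes every axiom (it is a Boolean algebra), while a three-element chain fails (A4).

```python
from itertools import combinations


def power_set(points):
    """All subsets of `points`, as frozensets."""
    pts = sorted(points)
    return [frozenset(c) for k in range(len(pts) + 1) for c in combinations(pts, k)]


def is_part(a, b):
    """The relation 'a is a part of b', modelled here by set inclusion."""
    return a <= b


def meet(bodies, a, b):
    """Maximum common part of a and b within `bodies`, or None if it does not exist."""
    common = [c for c in bodies if is_part(c, a) and is_part(c, b)]
    tops = [c for c in common if all(is_part(d, c) for d in common)]
    return tops[0] if tops else None


def join(bodies, a, b):
    """Minimum envelope of a and b within `bodies`, or None if it does not exist."""
    envelopes = [c for c in bodies if is_part(a, c) and is_part(b, c)]
    bottoms = [c for c in envelopes if all(is_part(c, d) for d in envelopes)]
    return bottoms[0] if bottoms else None


def exteriors(bodies, a, null, universal):
    """All candidates e with meet(a, e) = null body and join(a, e) = universal body."""
    return [e for e in bodies
            if meet(bodies, a, e) == null and join(bodies, a, e) == universal]


def check_axioms(bodies, null, universal):
    """Return a dict telling which of (A1)-(A6) hold for the given finite model."""
    a1 = all(is_part(a, a) for a in bodies)
    a2 = all(a == b for a in bodies for b in bodies
             if is_part(a, b) and is_part(b, a))
    a3 = all(is_part(a, c) for a in bodies for b in bodies for c in bodies
             if is_part(a, b) and is_part(b, c))
    a4 = all(len(exteriors(bodies, a, null, universal)) == 1 for a in bodies)
    a5 = a4 and all(is_part(b, exteriors(bodies, a, null, universal)[0])
                    for a in bodies for b in bodies
                    if meet(bodies, a, b) == null)
    a6 = all(meet(bodies, a, b) is not None for a in bodies for b in bodies)
    return {"A1": a1, "A2": a2, "A3": a3, "A4": a4, "A5": a5, "A6": a6}


BOOLEAN = check_axioms(power_set({1, 2, 3}), frozenset(), frozenset({1, 2, 3}))
CHAIN = check_axioms([frozenset(), frozenset({1}), frozenset({1, 2})],
                     frozenset(), frozenset({1, 2}))
print("power set:", BOOLEAN)
print("chain:    ", CHAIN)
```

In the chain, the middle element has no exterior, which is why the partial ordering alone, axioms (A1)–(A3), is not enough to obtain a material universe.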
While the introduction of ∅ is essential, that of ∞ can be avoided. This consideration, together with the discrepancies in the lists of axioms mentioned above, induced me to re-consider the whole matter. The result is the new list of axioms given in Section 5. In it, axioms (A1)–(A3) are eliminated, simply by declaring from the outset that the relation ≺ is a partial ordering. About the remaining axioms, I found it convenient to replace assumption (A6) on the existence of the meet by that of the existence of the join, and to substitute axiom (A5) with a separation postulate asserting that a body disjoint from two other bodies is also disjoint from their join. Moreover axiom (A4), which is meaningless if the universal body ∞ is left out of the picture, is replaced by a partition postulate assuming the existence of the complementary body. By adding the assumption of existence of the null body, I obtain four axioms which I denote by (B1)–(B4). I prove that counterparts of the statements (A4) and (A5), involving complementary sets instead of exteriors, can be deduced from these axioms. Finally, Section 6 is devoted to checking that the universe of shapes constructed in Section 4 satisfies all axioms (B1)–(B4) of a universe of bodies, as required by (F1). 2. A Class of Fit Regions Consider the class of all measurable subsets of RN with the following properties: (i) (ii) (iii) (iv)
is bounded, int = int clo , clo = clo int , H N−1 (bdy ) < +∞.
Here int, clo, bdy denote the topological interior, closure and boundary, respectively, and H N−1 denotes the (N − 1)-dimensional Hausdorff measure. A set with the above properties will be called a D-region of RN . In particular, property (ii) states that the interior of is regularly open, and (iii) states that the closure of is regularly closed. Thus, a D-region need be neither open nor closed. An illustration of the meaning of (ii) and (iii) is given in Figure 1. It is shown there that, due to (ii), a D-region cannot have missing lines or points and that, due to (iii), it cannot have isolated lines or points. The interiors, closures and boundaries of D-regions have the remarkable properties proved in the next propositions. PROPOSITION 2.1. If is a D-region, then bdy int = bdy = bdy clo .
(2.1)
Proof. By property (iii), bdy(int ) = clo(int )\ int(int ) = clo \ int = bdy ,
(2.2)
Figure 1. A region of the plane, and the regions int , clo , int clo , clo int . Thick lines are included in the region, thin lines are not included.
and, by property (ii), bdy(clo ) = clo(clo )\ int(clo ) = clo \ int = bdy .
(2.3) 2 PROPOSITION 2.2. The interior and the closure of a D-region are D-regions. Proof. In a metric space, is bounded if and only if int and clo are bounded [4, Chapter 5]. For clo , property (ii) is trivially satisfied and (iii) is proved by observing that, if is a D-region, clo(clo ) = clo = clo int = clo(int clo ) = clo int(clo ).
(2.4)
Similarly, for int property (iii) is trivially satisfied and (ii) is proved by interchanging “int” and “clo” in the preceding equations. Finally, (iv) follows from (2.1). 2 PROPOSITION 2.3. If and are D-regions, then int ⊂ int
⇔
clo ⊂ clo .
(2.5)
Proof. If int ⊂ int , then clo = clo int ⊂ clo int = clo . The inverse implication is proved by interchanging “int” and “clo”.
(2.6) 2
An immediate corollary is that two D-regions have the same interior if and only if they have the same closure: int = int
⇔
clo = clo .
(2.7)
The proof of the next proposition is based on the following properties of arbitrary subsets A, B of RN:

clo A ∪ clo B = clo(A ∪ B),   int A ∪ int B ⊂ int(A ∪ B),   bdy A ∪ bdy B ⊃ bdy(A ∪ B),
clo A ∩ clo B ⊃ clo(A ∩ B),   int A ∩ int B = int(A ∩ B),   bdy A ∪ bdy B ⊃ bdy(A ∩ B),   (2.8)
clo A \ int B ⊃ clo(A\B),   int A \ clo B = int(A\B),   bdy A ∪ bdy B ⊃ bdy(A\B).
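The nine relations (2.8) hold in any topological space, so they can be checked exhaustively on a small finite one. The sketch below is an illustration under stated assumptions: the three-point space and its list of open sets are hypothetical choices made only for this check, and the relations are numbered row by row as (2.8)_1 to (2.8)_9.

```python
from itertools import combinations

# A hypothetical finite topological space: three points and a family of open
# sets that is closed under unions and intersections.
POINTS = frozenset({0, 1, 2})
OPEN_SETS = [frozenset(s) for s in ([], [0], [2], [0, 2], [0, 1], [0, 1, 2])]


def interior(a):
    """Union of all open sets contained in a."""
    result = frozenset()
    for u in OPEN_SETS:
        if u <= a:
            result |= u
    return result


def closure(a):
    """Complement of the interior of the complement."""
    return POINTS - interior(POINTS - a)


def boundary(a):
    return closure(a) - interior(a)


def all_subsets():
    pts = sorted(POINTS)
    return [frozenset(c) for k in range(len(pts) + 1) for c in combinations(pts, k)]


def check_2_8(a, b):
    """Truth values of (2.8)_1 ... (2.8)_9 for the pair (a, b)."""
    return [
        closure(a) | closure(b) == closure(a | b),
        interior(a) | interior(b) <= interior(a | b),
        boundary(a) | boundary(b) >= boundary(a | b),
        closure(a) & closure(b) >= closure(a & b),
        interior(a) & interior(b) == interior(a & b),
        boundary(a) | boundary(b) >= boundary(a & b),
        closure(a) - interior(b) >= closure(a - b),
        interior(a) - closure(b) == interior(a - b),
        boundary(a) | boundary(b) >= boundary(a - b),
    ]


ALL_HOLD = all(all(check_2_8(a, b)) for a in all_subsets() for b in all_subsets())
print("all nine relations hold for every pair of subsets:", ALL_HOLD)
```

A finite check of this kind proves nothing about RN, but it is a quick way to catch a mis-stated inclusion before using it in a proof.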
PROPOSITION 2.4. If and are D-regions, then clo( ∪ ), int( ∩ ) and int(\ ) are D-regions. Proof. For clo( ∪ ), properties (i) and (ii) are trivially verified. Moreover, by (2.8)1 and (2.8)2 , clo(clo( ∪ )) = clo( ∪ ) = clo ∪ clo = clo int ∪ clo int = clo(int ∪ int ) ⊂ clo int( ∪ ) (2.9) ⊂ clo int(clo( ∪ )). Because the inverse inclusion is true for any set, we obtain (iii). Finally, from (2.8)1 , (2.8)3 and (2.1) we have bdy clo( ∪ ) = bdy(clo ∪ clo ) ⊂ bdy clo ∪ bdy clo = bdy ∪ bdy ,
(2.10)
and this implies H N−1 (bdy clo( ∪ )) H N−1 (bdy ) + H N−1 (bdy ) < +∞. (2.11) For int( ∩ ) properties (i) and (iii) are immediate, (ii) follows from (2.9) after interchanging clo, ∪, ⊂ with int, ∩, ⊃, respectively, and (iv) is proved by using (2.8)5 , (2.8)6 and (2.1): bdy int( ∩ ) = bdy(int ∩ int ) ⊂ bdy int ∪ bdy int = bdy ∪ bdy .
(2.12)
Finally, for int(\ ) conditions (i) and (iii) are trivially satisfied. Moreover, by (2.8)8 and (2.8)7 , int(int(\ )) = int(\ ) = int \ clo = int clo \ clo int = int(clo \ int ) ⊃ int clo(\ ) ⊃ int clo(int(\ )),
(2.13)
and because the inverse inclusion is true in general we obtain (ii). Finally, by (2.8)8 , (2.8)9 and (2.1), bdy(int(\ )) = bdy(int \ clo ) ⊂ bdy int ∪ bdy clo = bdy ∪ bdy ,
(2.14) 2
and (iv) follows.
Next, I prove that D-regions have the properties (F2), (F3), so that they form a class of fit regions. About the first property, I prove that a bi-Lipschitz homeomorphism maps D-regions onto D-regions. I recall that a homeomorphism is a bijective continuous mapping with a continuous inverse, and that a mapping f : RN → RN is bi-Lipschitz if there are positive constants c, m such that

c |x − y| ≤ |f(x) − f(y)| ≤ m |x − y|   ∀x, y ∈ RN.   (2.15)
G. DEL PIERO
I also recall that if f is a homeomorphism, then
\[
f(\operatorname{int} A) = \operatorname{int} f(A), \qquad f(\operatorname{clo} A) = \operatorname{clo} f(A), \qquad f(\operatorname{bdy} A) = \operatorname{bdy} f(A) \tag{2.16}
\]
for all regions A in R^N.

PROPOSITION 2.5. Let f : R^N → R^N be a bi-Lipschitz homeomorphism. Then a subset Ω of R^N is a D-region if and only if f(Ω) is a D-region.

Proof. Let Ω be a D-region. Then Ω bounded and f continuous in clo Ω implies f(Ω) bounded by the Weierstrass theorem. Thus f(Ω) satisfies the first property of a D-region. The second property is proved using (2.16) and the property (ii) for Ω:
\[
\operatorname{int}\operatorname{clo} f(\Omega) = \operatorname{int} f(\operatorname{clo}\Omega) = f(\operatorname{int}\operatorname{clo}\Omega) = f(\operatorname{int}\Omega) = \operatorname{int} f(\Omega), \tag{2.17}
\]
and property (iii) follows after interchanging "int" and "clo". Finally, property (iv) holds because H^{N−1}(bdy f(Ω)) = H^{N−1}(f(bdy Ω)) by (2.16)3 and
\[
\mathcal{H}^{N-1}(f(\operatorname{bdy}\Omega)) \le m^{N-1}\,\mathcal{H}^{N-1}(\operatorname{bdy}\Omega), \tag{2.18}
\]
by the Lipschitz continuity of f, see [5, Section 2.4.1, Theorem 1]. Thus, f(Ω) is a D-region whenever Ω is a D-region. The inverse implication is proved by interchanging Ω and f with f(Ω) and f^{−1}. □

I now turn to the proof that the integral-gradient formula
\[
\int_{\Omega} \nabla f(x)\,dx = \int_{\operatorname{eby}\Omega} f(x) \otimes n(x)\,d\mathcal{H}^{N-1} \tag{2.19}
\]
holds for all D-regions Ω of R^N and for all continuous functions f : R^N → R^N whose gradient ∇f is locally summable. Here eby Ω denotes the essential boundary of Ω and n(x) is the measure theoretic outward normal at x. Given a region A in R^N, the set of all density points for A is the set dns A made of all x in R^N such that
\[
\lim_{r \to 0} \frac{|B(x, r) \cap A|}{|B(x, r)|} = 1, \tag{2.20}
\]
where B(x, r) is the open ball of R^N centered at x and with radius r, and |·| denotes the N-dimensional Lebesgue measure. A point of rarefaction for A is a point for which the above limit is zero, and the essential boundary eby A of A is the set of all points which are neither points of density nor points of rarefaction for A. The inclusions
\[
\operatorname{int} A \subset \operatorname{dns} A \subset \operatorname{clo} A, \qquad \operatorname{eby} A \subset \operatorname{bdy} A, \tag{2.21}
\]
hold for every subset A of R^N.

(Footnote: For the basic definitions and results from geometric measure theory, I refer to the book [16] by Vol'pert and Hudjaev.)
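The ratio in (2.20) can be evaluated explicitly in dimension N = 1, where balls are intervals. The sketch below is only an illustration: the function name `density_ratio` and the representation of A as a list of disjoint intervals are assumptions, not notation from the text.

```python
def density_ratio(intervals, x, r):
    """|B(x, r) ∩ A| / |B(x, r)| for N = 1, with A a union of disjoint intervals (a, b)."""
    lo, hi = x - r, x + r
    covered = 0.0
    for a, b in intervals:
        # Length of the overlap between (a, b) and (lo, hi); zero if they do not meet.
        covered += max(0.0, min(b, hi) - max(a, lo))
    return covered / (2.0 * r)


A = [(0.0, 1.0)]
for x in (0.5, 0.0, -1.0):
    print(x, [density_ratio(A, x, r) for r in (0.1, 0.01, 0.001)])
```

For A = [0, 1] the ratio tends to 1 at x = 0.5 (a point of density), stays at 1/2 at x = 0 (neither density nor rarefaction, hence a point of the essential boundary), and is 0 at x = −1 (a point of rarefaction).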
A subset A of R^N is a set with finite perimeter if the (N − 1)-dimensional Hausdorff measure of the essential boundary is finite. From (2.21)2, we have
\[
\mathcal{H}^{N-1}(\operatorname{eby} A) \le \mathcal{H}^{N-1}(\operatorname{bdy} A), \tag{2.22}
\]
so that every set whose boundary has finite (N − 1)-dimensional Hausdorff measure is a set with finite perimeter. In particular, from the property (iv) of D-regions it follows that every D-region is a set with finite perimeter. For sets with finite perimeter, formula (2.19) holds with Ω replaced by dns Ω in the first integral [16, p. 198, Theorem 1]. Hence, formula (2.19) holds as it is, whenever Ω and dns Ω differ by a set of Lebesgue measure zero:
\[
|(\Omega \setminus \operatorname{dns}\Omega) \cup (\operatorname{dns}\Omega \setminus \Omega)| = 0. \tag{2.23}
\]
By the general inclusions (2.21)1 and int Ω ⊂ Ω ⊂ clo Ω, both (Ω\dns Ω) and (dns Ω\Ω) are included in (clo Ω\int Ω) = bdy Ω. Then the Lebesgue measure of their union is not greater than |bdy Ω|. For a D-region, the property (iv) implies |bdy Ω| = 0. Consequently, (2.23) holds for all D-regions, and this proves the following

PROPOSITION 2.6. Let Ω be a D-region of R^N. Then the integral-gradient formula (2.19) holds for any f : R^N → R^N continuous and with ∇f ∈ L^1_loc.

REMARK. There are D-regions for which inequality (2.22) is strict. An example is the region D(x, y, d, l) defined in the following section. Thus, eby Ω cannot be replaced by bdy Ω in (2.19).
3. A Comparison with Noll and Virga's Class

If we identify the N-dimensional Euclidean space with R^N and if we consider all open D-regions of R^N, we see that they have all properties (NV1)–(NV4) in Section 1. Thus, all open D-regions are NV-regions. Here I wish to show that there are NV-regions which are not D-regions. For a fixed positive number l and for each d in (0, l) consider the following subset of R^2
\[
D(x, y, d, l) := \bigcup_{h=1}^{\infty} \bigcup_{p=1}^{2^{h-1}} \Big\{ B(x + x_{h,k},\, y + y_h,\, r_h) \;\Big|\; k = 2p - 1,\ x_{h,k} = \frac{k\,l}{2^h},\ y_h = \frac{d}{2^h},\ r_h = \frac{d}{4^h} \Big\}, \tag{3.1}
\]
where B(x, y, r) is the open ball centered at (x, y) with radius r. This set is both a NV-region and a D-region. Its essential boundary is the union of the boundaries of all balls which form the set, and the boundary is the union of the essential boundary and of the segment (x, x + l) × {y}. Thus,
\[
\mathcal{H}^{N-1}(\operatorname{eby} D(x, y, d, l)) = \sum_{h=1}^{\infty} 2^{h-1}\, 2\pi r_h = \sum_{h=1}^{\infty} 2^{h}\, 4^{-h}\, \pi d = \pi d, \tag{3.2}
\]
and
\[
\mathcal{H}^{N-1}(\operatorname{bdy} D(x, y, d, l)) = \pi d + l. \tag{3.3}
\]

(Footnote to the definition of a set with finite perimeter: This definition is equivalent to standard definitions given, for example, in [5] or in [16]. The equivalence follows from Proposition 3.62 in the book [1] by Ambrosio et al.)
For each positive integer p, consider the set D_p := D(0, y_p, d_p, l), with
\[
y_1 = 0, \qquad y_{p+1} = \sum_{q=1}^{p} d_q, \qquad d_q = 2^{-q}\, l, \tag{3.4}
\]
and take
\[
\Omega := \bigcup_{p=1}^{\infty} D_p. \tag{3.5}
\]
The region Ω is shown in Figure 2. It is bounded because it is included in the square (0, l) × (0, l), and it is regularly open because it is the countable union of pairwise disjoint open balls. Moreover, its perimeter is equal to
\[
\mathcal{H}^{N-1}(\operatorname{eby}\Omega) = \sum_{q=1}^{\infty} \pi d_q = \pi l, \tag{3.6}
\]
and the boundary has zero Lebesgue measure because it is a countable union of sets with zero measure. Therefore, Ω is a NV-region. But it is not a D-region, because
\[
\mathcal{H}^{N-1}(\operatorname{bdy}\Omega) = \sum_{q=1}^{\infty} (\pi d_q + l) = +\infty. \tag{3.7}
\]

Figure 2. A region in R^2 which is a NV-region but not a D-region.
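The sums in (3.2), (3.6) and (3.7) are elementary, and a short numerical sketch makes the contrast between the finite perimeter and the infinite boundary measure visible. The function names and the sample values l = 1, d = 0.5 are illustrative assumptions only.

```python
import math


def eby_length_D(d, terms=60):
    """Partial sum of (3.2): at level h there are 2^(h-1) circles of radius d / 4^h."""
    return sum(2 ** (h - 1) * 2 * math.pi * d / 4 ** h for h in range(1, terms + 1))


def perimeter_partial(l, n):
    """Partial sum of (3.6) with d_q = 2^(-q) l: sum of pi * d_q for q = 1..n."""
    return sum(math.pi * l * 2.0 ** (-q) for q in range(1, n + 1))


def boundary_partial(l, n):
    """Partial sum of (3.7): sum of (pi * d_q + l) for q = 1..n."""
    return sum(math.pi * l * 2.0 ** (-q) + l for q in range(1, n + 1))


l, d = 1.0, 0.5
print(eby_length_D(d), math.pi * d)        # the two values agree up to rounding
print(perimeter_partial(l, 60), math.pi * l)
for n in (10, 100, 1000):
    print(n, boundary_partial(l, n))       # grows like n * l, without bound
```

The first two partial sums settle at πd and πl, while the third keeps growing because each D_p contributes a segment of the same length l to the boundary.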
4. D-Shapes

In the set of all D-regions of R^N consider the equivalence relation
\[
\Omega \sim \Pi \quad \Leftrightarrow \quad \operatorname{int}\Omega = \operatorname{int}\Pi, \tag{4.1}
\]
stating that two D-regions are equivalent if they have the same interior. Notice that, in view of (2.7), two D-regions are equivalent if and only if they have the same closure. If Ω is a D-region, then int Ω and clo Ω are D-regions by Proposition 2.2. Moreover, they all have the same interior, because int Ω = int(int Ω) is trivially satisfied and int Ω = int(clo Ω) is property (ii) in the definition of a D-region. Therefore, Ω, int Ω and clo Ω are equivalent.

Given a D-region Ω, denote by Ω̃ the equivalence class containing Ω. Then not only int Ω and clo Ω belong to Ω̃ but also, by (4.1) and (2.7), they are the unique open region and the unique closed region in Ω̃, respectively. They will be called the open representative and the closed representative of Ω̃. The notations
\[
\Omega \overset{\circ}{\in} \tilde\Omega, \qquad \Omega \mathrel{\bar{\in}} \tilde\Omega \tag{4.2}
\]
will be used to denote that Ω is the open representative and that Ω is the closed representative of Ω̃, respectively. An equivalence class of D-regions will be called a D-shape, and the set of all D-shapes in R^N will be denoted by S_N. On S_N, consider the partial ordering
\[
\tilde\Omega \prec \tilde\Pi \quad \Leftrightarrow \quad \operatorname{int}\Omega \subset \operatorname{int}\Pi \qquad \forall\, \Omega \in \tilde\Omega,\ \forall\, \Pi \in \tilde\Pi. \tag{4.3}
\]
We say that Ω̃ is a part of Π̃ if Ω̃ ≺ Π̃. Notice that, by Proposition 2.3, Ω̃ ≺ Π̃ if and only if clo Ω ⊂ clo Π for all Ω in Ω̃ and for all Π in Π̃. There is in S_N a D-shape, denoted by ∅̃, such that
\[
\tilde\emptyset \prec \tilde\Omega \qquad \forall\, \tilde\Omega \in S_N. \tag{4.4}
\]
This is the equivalence class containing the empty set ∅. Because the empty set is at the same time open and closed, we have int ∅ = ∅ = clo ∅, and because the only set whose closure is the empty set is the empty set itself, the class ∅̃ consists of the single element ∅.

(Footnote: In the terminology introduced in Truesdell's book [15], bodies are sets in some abstract topological space, and shapes are regions in the Euclidean point space which can be occupied by a body [15, pp. 16, 86]. Shapes were called places by Newton [15, p. 33].)

A basic property of the equivalence relation ∼ is that it is preserved under bi-Lipschitz homeomorphisms.

PROPOSITION 4.1. Let f : R^N → R^N be a bi-Lipschitz homeomorphism, and let Ω, Π be D-regions in R^N. Then Ω ∼ Π if and only if f(Ω) ∼ f(Π).

Proof. If Ω and Π are D-regions, then f(Ω) and f(Π) are D-regions by Proposition 2.5. Moreover, Ω ∼ Π implies int Ω = int Π, and from (2.16)1 we have
\[
\operatorname{int} f(\Omega) = f(\operatorname{int}\Omega) = f(\operatorname{int}\Pi) = \operatorname{int} f(\Pi) \tag{4.5}
\]
and therefore f(Ω) ∼ f(Π). The proof that f(Ω) ∼ f(Π) implies Ω ∼ Π is similar. □
The preceding proposition states that the image f(Ω̃) of the equivalence class Ω̃ is the equivalence class containing f(Ω). Thus, the image of a D-shape under a bi-Lipschitz homeomorphism is a D-shape.

Let Ω̃, Π̃ be D-shapes, and let Ω, Π be D-regions in Ω̃, Π̃, respectively. Then clo(Ω ∪ Π), int(Ω ∩ Π) and int(Ω\Π) are D-regions by Proposition 2.4. I prove below that these regions are determined by the equivalence classes Ω̃, Π̃ and not by their specific representatives Ω, Π. In other words, they do not change if Ω, Π are replaced by equivalent regions.

PROPOSITION 4.2. Let Ω̃, Π̃ be D-shapes, let Ω, Ω_1 be D-regions in Ω̃, and let Π, Π_1 be D-regions in Π̃. Then
\[
\operatorname{clo}(\Omega \cup \Pi) = \operatorname{clo}(\Omega_1 \cup \Pi_1), \qquad \operatorname{int}(\Omega \cap \Pi) = \operatorname{int}(\Omega_1 \cap \Pi_1), \qquad \operatorname{int}(\Omega \setminus \Pi) = \operatorname{int}(\Omega_1 \setminus \Pi_1). \tag{4.6}
\]
Proof. By (2.7), Ω ∼ Ω_1 and Π ∼ Π_1 implies clo Ω = clo Ω_1 and clo Π = clo Π_1. Then by (2.8)1,
\[
\operatorname{clo}(\Omega \cup \Pi) = \operatorname{clo}\Omega \cup \operatorname{clo}\Pi = \operatorname{clo}\Omega_1 \cup \operatorname{clo}\Pi_1 = \operatorname{clo}(\Omega_1 \cup \Pi_1). \tag{4.7}
\]
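This proves the first equality in (4.6). The two remaining equalities are proved in a similar way. □

Because clo(Ω ∪ Π) is a closed D-region, it can be taken as the closed representative of a D-shape. By the proposition just proved, that shape depends on the D-shapes Ω̃, Π̃, and not on their specific elements. The D-shape Ω̃ ∨ Π̃ defined by
\[
\operatorname{clo}(\Omega \cup \Pi) \mathrel{\bar{\in}} \tilde\Omega \vee \tilde\Pi, \qquad \Omega \in \tilde\Omega,\ \Pi \in \tilde\Pi, \tag{4.8}
\]
will be called the join of Ω̃ and Π̃. Similarly, int(Ω ∩ Π) and int(Ω\Π) can be taken as the open representatives of a D-shape. The D-shapes
\[
\operatorname{int}(\Omega \cap \Pi) \overset{\circ}{\in} \tilde\Omega \wedge \tilde\Pi, \qquad \operatorname{int}(\Omega \setminus \Pi) \overset{\circ}{\in} \tilde\Omega \setminus \tilde\Pi, \qquad \Omega \in \tilde\Omega,\ \Pi \in \tilde\Pi, \tag{4.9}
\]
define the meet and the difference of Ω̃ and Π̃, respectively. From the inclusions
\[
\operatorname{clo}\Omega \subset \operatorname{clo}(\Omega \cup \Pi), \qquad \operatorname{int}(\Omega \cap \Pi) \subset \operatorname{int}\Omega, \qquad \operatorname{int}(\Omega \setminus \Pi) \subset \operatorname{int}\Omega, \tag{4.10}
\]
it follows that
\[
\tilde\Omega \wedge \tilde\Pi \prec \tilde\Omega, \qquad \tilde\Omega \prec \tilde\Omega \vee \tilde\Pi, \qquad \tilde\Omega \setminus \tilde\Pi \prec \tilde\Omega, \tag{4.11}
\]
for all pairs of D-shapes Ω̃, Π̃. If Π̃ is a part of Ω̃, the difference Ω̃\Π̃ defines the complementary part of Π̃ in Ω̃. I prove below that Ω̃\Π̃ is the unique D-shape Λ̃ with the properties
\[
\tilde\Lambda \vee \tilde\Pi = \tilde\Omega, \qquad \tilde\Lambda \wedge \tilde\Pi = \tilde\emptyset. \tag{4.12}
\]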
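PROPOSITION 4.3. Let Ω̃, Π̃ and Λ̃ be D-shapes, and let Π̃ ≺ Ω̃. Then equations (4.12) hold if and only if Λ̃ = Ω̃\Π̃.

Proof. Let Ω, Π, Λ be D-regions in Ω̃, Π̃, Λ̃, respectively. Then equations (4.12) mean that
\[
\operatorname{clo}(\Lambda \cup \Pi) = \operatorname{clo}\Omega, \qquad \operatorname{int}(\Lambda \cap \Pi) = \emptyset. \tag{4.13}
\]
The only if part of the proposition consists in proving that the above equalities imply
\[
\operatorname{int}(\Omega \setminus \Pi) = \operatorname{int}\Lambda. \tag{4.14}
\]
Using the relations (2.8), (4.13)1, the identity (A ∪ B)\B = A\B, and the fact that Ω and Λ are D-regions, we get
\[
\begin{aligned}
\operatorname{int}(\Omega \setminus \Pi) &= \operatorname{int}\Omega \setminus \operatorname{clo}\Pi \\
&= \operatorname{int}\operatorname{clo}\Omega \setminus \operatorname{clo}\Pi \\
&= \operatorname{int}\operatorname{clo}(\Lambda \cup \Pi) \setminus \operatorname{clo}\Pi \\
&= \operatorname{int}(\operatorname{clo}(\Lambda \cup \Pi) \setminus \operatorname{clo}\Pi) \\
&= \operatorname{int}((\operatorname{clo}\Lambda \cup \operatorname{clo}\Pi) \setminus \operatorname{clo}\Pi) \\
&= \operatorname{int}(\operatorname{clo}\Lambda \setminus \operatorname{clo}\Pi) \\
&= \operatorname{int}\operatorname{clo}\Lambda \setminus \operatorname{clo}\Pi \\
&= \operatorname{int}\Lambda \setminus \operatorname{clo}\Pi.
\end{aligned} \tag{4.15}
\]
But
\[
\emptyset = \operatorname{int}\Lambda \cap \operatorname{int}\Pi = \operatorname{int}\Lambda \cap \operatorname{clo}\operatorname{int}\Pi = \operatorname{int}\Lambda \cap \operatorname{clo}\Pi. \tag{4.16}
\]
Indeed, the first equality follows from (4.13)2, the second is due to the fact that A ∩ clo B = ∅ for every pair of open sets A, B with A ∩ B = ∅, and the third follows from property (iii) of D-regions. Then (4.16) implies int Λ\clo Π = int Λ, and (4.14) follows from (4.15).

To prove the if part of the proposition, assume that (4.14) holds. Then
\[
\begin{aligned}
\operatorname{clo}(\Lambda \cup \Pi) &= \operatorname{clo}\Lambda \cup \operatorname{clo}\Pi \\
&= \operatorname{clo}\operatorname{int}\Lambda \cup \operatorname{clo}\Pi \\
&= \operatorname{clo}\operatorname{int}(\Omega \setminus \Pi) \cup \operatorname{clo}\Pi \\
&= \operatorname{clo}(\operatorname{int}\Omega \setminus \operatorname{clo}\Pi) \cup \operatorname{clo}\Pi \\
&= \operatorname{clo}((\operatorname{int}\Omega \setminus \operatorname{clo}\Pi) \cup \operatorname{clo}\Pi) \\
&= \operatorname{clo}(\operatorname{int}\Omega \cup \operatorname{clo}\Pi) \\
&= \operatorname{clo}\operatorname{int}\Omega \cup \operatorname{clo}\Pi \\
&= \operatorname{clo}\Omega \cup \operatorname{clo}\Pi,
\end{aligned} \tag{4.17}
\]
and the last set is equal to clo Ω because clo Π ⊂ clo Ω by the assumption Π̃ ≺ Ω̃. This proves equation (4.13)1. Equation (4.13)2 follows from
\[
\begin{aligned}
\operatorname{int}(\Lambda \cap \Pi) &= \operatorname{int}\Lambda \cap \operatorname{int}\Pi \\
&= \operatorname{int}(\Omega \setminus \Pi) \cap \operatorname{int}\Pi \\
&= (\operatorname{int}\Omega \setminus \operatorname{clo}\Pi) \cap \operatorname{int}\Pi \\
&= (\operatorname{int}\Omega \cap \operatorname{int}\Pi) \setminus \operatorname{clo}\Pi \\
&= \operatorname{int}\Pi \setminus \operatorname{clo}\Pi.
\end{aligned} \tag{4.18}
\]
The last equality follows from the assumption Π̃ ≺ Ω̃ which implies int Π ⊂ int Ω, and the preceding equality follows from the identity (A\B) ∩ C = (A ∩ C)\B. Then we have int(Λ ∩ Π) = int Π\clo Π = ∅. □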
We say that two D-shapes Π̃, Λ̃ form a partition of Ω̃ if
\[
\tilde\Omega = \tilde\Pi \vee \tilde\Lambda \qquad \text{and} \qquad \tilde\Pi \wedge \tilde\Lambda = \tilde\emptyset. \tag{4.19}
\]
This definition makes precise the idea of subdividing a D-shape into two subshapes. Indeed, Proposition 4.3 tells us that a partition of Ω̃ is obtained simply by taking any part of Ω̃ and its complementary part in Ω̃.

To provide examples of join and meet, consider the plane R^2 and select a system of Cartesian coordinates (x, y). Take the regions
\[
\Omega := \{(x, y) \in \mathbb{R}^2 \mid -l < x \le 0,\ 0 \le y \le l\}, \tag{4.20}
\]
\[
\Pi := \{(x, y) \in \mathbb{R}^2 \mid 0 \le x < l,\ 0 \le y \le l\}. \tag{4.21}
\]
Both are squares of side l, neither open nor closed, and they are both D-regions. Their union is the rectangle
\[
\Omega \cup \Pi = \{(x, y) \in \mathbb{R}^2 \mid -l < x < l,\ 0 \le y \le l\}, \tag{4.22}
\]
their intersection is the segment
\[
\Omega \cap \Pi = \{(x, y) \in \mathbb{R}^2 \mid x = 0,\ 0 \le y \le l\}, \tag{4.23}
\]
and the difference Ω\Π is the square
\[
\Omega \setminus \Pi = \{(x, y) \in \mathbb{R}^2 \mid -l < x < 0,\ 0 \le y \le l\}. \tag{4.24}
\]
If Ω̃ and Π̃ are the equivalence classes of Ω, Π, their join is the set of all rectangles with the same sides as Ω ∪ Π, irrespective of which part of the sides is included in the region. Their meet is the equivalence class including the interior of Ω ∩ Π, that is, the empty class ∅̃, and their difference Ω̃\Π̃ coincides with Ω̃. Notice that, by (4.19), Ω̃ ∧ Π̃ = ∅̃ implies that Ω̃ and Π̃ form a partition of Ω̃ ∨ Π̃.
5. Universes of Bodies

A universe of bodies is a pair (𝒰, ≺), with 𝒰 a set and ≺ a partial ordering on 𝒰. The elements A, B, ... of 𝒰 are called bodies, and A ≺ B is to be read as A is a part of B. The pair (𝒰, ≺) is subject to four axioms. The first is the existence of the null body

(B1) There is an element ∅ of 𝒰 such that ∅ ≺ A for all A ∈ 𝒰,

and the second is the existence of the minimum envelope

(B2) For every A, B in 𝒰, there is a C ∈ 𝒰 such that: (i) A and B are parts of C, (ii) if A and B are parts of D then C is a part of D.

Note that this axiom consists of two statements: (i) postulates the existence of an envelope for every pair of elements of 𝒰, and (ii) postulates the existence of a minimum envelope. The uniqueness of the null body and of the minimum envelope are easy to prove. Indeed, if ∅ and ∅′ are null bodies then ∅ ≺ ∅′ and ∅′ ≺ ∅, and this is possible only if ∅ = ∅′. Similarly, if both C and C′ are minimum envelopes of A and B, then A and B are parts of both C and C′ by (i). Then by (ii) C ≺ C′ and C′ ≺ C, and therefore C = C′. The minimum envelope of A and B is called the join of A and B and is denoted by A ∨ B. The following properties are direct consequences of the definition:
\[
A \vee A = A, \qquad A \vee \emptyset = A, \qquad A \vee B = B \vee A, \tag{5.1}
\]
\[
A \prec B \ \Rightarrow\ (A \vee C) \prec (B \vee C) \quad \text{for all } C \text{ in } \mathcal{U}, \tag{5.2}
\]
\[
(A \vee B) \vee C = A \vee (B \vee C). \tag{5.3}
\]
Two bodies A, B are separate if their only common part is the null body:
\[
A \text{ and } B \text{ separate} \quad \Leftrightarrow \quad (C \prec A \text{ and } C \prec B \ \Rightarrow\ C = \emptyset). \tag{5.4}
\]
Notice that
\[
A \text{ and } B \text{ separate and } C \prec A \quad \Rightarrow \quad C \text{ and } B \text{ separate}. \tag{5.5}
\]
Indeed, if C ≺ A then (D ≺ C and D ≺ B) implies (D ≺ A and D ≺ B), and this implies D = ∅ if A and B are separate. The third axiom is a separation postulate

(B3) If A and C are separate and if B and C are separate, then A ∨ B and C are separate,

and the last axiom is a partition postulate

(B4) If A ≺ C, there is a part A^C of C such that (i) A and A^C are separate, (ii) A ∨ A^C = C.

A^C is called the complementary part of A in C. The following properties are easily proved:
\[
C^C = \emptyset, \tag{5.6}
\]
\[
A \prec C \ \Rightarrow\ (A^C)^C = A, \tag{5.7}
\]
\[
A \prec B \prec C \ \Rightarrow\ A \text{ and } B^C \text{ are separate}. \tag{5.8}
\]
In particular, to prove (5.8) it is sufficient to observe that, by (5.5), B and B^C separate and A ≺ B implies A and B^C separate.

For every body C and for every part A of C, the complementary part of A in C is unique. To see this, assume that there are two complementary parts of A, A_1^C and A_2^C. Then A_3^C := A_1^C ∨ A_2^C is a complementary part of A as well. Indeed, A and A_3^C are separate by the separation postulate, and A ∨ A_3^C = A ∨ (A_1^C ∨ A_2^C) = (A ∨ A_1^C) ∨ A_2^C = C ∨ A_2^C = C. By definition, A_1^C is a part of A_3^C. Let A^{CC} be a complementary part of A_1^C in A_3^C. Then A_1^C and A^{CC} are separate. Moreover, A^{CC} and A are separate, because A_3^C and A are separate and A^{CC} is a part of A_3^C. Then, A^{CC} and C = A_1^C ∨ A are separate by the separation postulate. On the other hand, A^{CC} is a part of A_3^C which is a part of C. So, we have at the same time that A^{CC} is a part of C and that A^{CC} and C are separate. This is possible only if A^{CC} is the null body. Then A_3^C = A_1^C ∨ A^{CC} = A_1^C. In the same way it can be proved that A_3^C = A_2^C. Thus, A_1^C = A_2^C.

An example of a pair (𝒰, ≺) for which the join does not exist is given in [15, p. 9]. A pair for which the join exists but the separation postulate does not hold is obtained by taking as 𝒰 the set of all closed intervals of the real line, and as ≺ the inclusion in the sense of set theory. Indeed, the join of A = [0, 1] and B = [4, 5] is A ∨ B = [0, 5], and we see that the interval [2, 3] is separate from A and B but not from A ∨ B.
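The closed-interval counterexample can be replayed in a few lines. This is a minimal sketch under stated assumptions: intervals are represented as pairs (a, b) with a ≤ b, `None` stands for the null body, and the helper names `join`, `common_part` and `separate` are hypothetical.

```python
def join(p, q):
    """Minimum envelope among closed intervals: the smallest closed interval containing both."""
    if p is None:
        return q
    if q is None:
        return p
    return (min(p[0], q[0]), max(p[1], q[1]))


def common_part(p, q):
    """Largest closed interval contained in both p and q, or None if only the null body is."""
    if p is None or q is None:
        return None
    lo, hi = max(p[0], q[0]), min(p[1], q[1])
    return (lo, hi) if lo <= hi else None


def separate(p, q):
    """Two bodies are separate if their only common part is the null body, cf. (5.4)."""
    return common_part(p, q) is None


A, B, C = (0, 1), (4, 5), (2, 3)
print(join(A, B))                                   # (0, 5)
print(separate(A, C), separate(B, C))               # True True
print(separate(join(A, B), C))                      # False
```

The last line shows the failure of (B3): [2, 3] is separate from both A and B, yet it is a part of their join [0, 5].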
A pair (𝒰, ≺) for which the axioms (B1)–(B3) are satisfied but the complementary part does not exist is obtained by taking as 𝒰 the set of all closed subsets of the real line and as ≺ again the set inclusion. Then if we take C = [0, 2] and A = [0, 1] we have that A ≺ C but the complementary part A^C does not exist. In particular, A^C is not (1, 2] because this is not a closed set, and A^C is not [1, 2] because [0, 1] and [1, 2] have as common part the singleton {1}, which belongs to 𝒰 and is different from the null set.

There are two remarkable consequences of axioms (B3) and (B4). The first is an inverse of the implication (5.8). It can be regarded as a counterpart to axiom (A5) in the Introduction, in the absence of a universal body.

PROPOSITION 5.1. If A and B are separate parts of C, then A is a part of B^C.

Proof. Because B and B^C are separate, B and A separate implies B and A ∨ B^C separate by the separation postulate. Moreover, by (5.3), (A ∨ B^C) ∨ B = A ∨ (B^C ∨ B) = A ∨ C = C. Then A ∨ B^C = B^C by the uniqueness of the complementary part of B, and therefore A is a part of B^C. □

The second consequence of axioms (B3) and (B4) is the following:

PROPOSITION 5.2. Let A, B be parts of C. Then
\[
A \prec B \quad \Leftrightarrow \quad B^C \prec A^C. \tag{5.9}
\]
Proof. If A ≺ B, then A and B^C are separate by (5.8), and if A and B^C are separate then B^C is a part of A^C by Proposition 5.1. Conversely, if B^C ≺ A^C then B^C and (A^C)^C = A are separate by (5.8). Then, A is a part of (B^C)^C = B by Proposition 5.1. □

The meet of A and B is defined by
\[
A \wedge B := (A^C \vee B^C)^C, \tag{5.10}
\]
where C is any envelope of A and B. For this definition to be meaningful, it is necessary to prove that the right-hand side of (5.10) does not depend on C. This is done in Proposition 5.4 below, after proving the following preliminary result.

LEMMA 5.3. Let A ≺ C ≺ D, and let A^C and A^D be the complementary parts of A in C and D, respectively. Then A^D = A^C ∨ C^D.

Proof. By (5.2), A^C ≺ C implies A^C ∨ C^D ≺ C ∨ C^D = D. Moreover, by (5.3),
\[
A \vee (A^C \vee C^D) = (A \vee A^C) \vee C^D = C \vee C^D = D. \tag{5.11}
\]
It remains to prove that A and A^C ∨ C^D are separate. In fact, A and A^C are separate by the definition of the complementary part, and A and C^D are separate because A is a part of C and C and C^D are separate. Then A and A^C ∨ C^D are separate by the separation postulate. □
PROPOSITION 5.4. Let A and B be parts of both C and D. Then (A^C ∨ B^C)^C = (A^D ∨ B^D)^D.

Proof. Assume first that C ≺ D. Then, by the preceding lemma and by (5.3),
\[
A^D \vee B^D = (A^C \vee C^D) \vee (B^C \vee C^D) = (A^C \vee B^C) \vee C^D, \tag{5.12}
\]
and, again by the lemma,
\[
(A^C \vee B^C) \vee C^D = ((A^C \vee B^C)^C)^D. \tag{5.13}
\]
Then,
\[
(A^C \vee B^C)^C = ((A^C \vee B^C) \vee C^D)^D = (A^D \vee B^D)^D. \tag{5.14}
\]
If C is not a part of D, take an envelope E of C and D. Then (5.14) holds both for C and E and for D and E. Combining the two equalities we get (5.14) for C and D. □

The following consequences of the definition (5.10) are easy to prove.
\[
A \wedge B \prec A, \qquad A \wedge B = B \wedge A, \tag{5.15}
\]
\[
D \prec A \text{ and } D \prec B \ \Rightarrow\ D \prec A \wedge B. \tag{5.16}
\]
The last statement characterizes the meet A ∧ B as the maximum common part of A and B. From (5.10) it also follows:

COROLLARY 5.5. A and B are separate if and only if A ∧ B = ∅.

Proof. If A and B are separate, then A ∧ B = ∅ follows from (5.4) and (5.15). Conversely, if A ∧ B = ∅ then by (5.16) the only common part of A and B is D = ∅. Therefore, A and B are separate. □

Other consequences of the definition (5.10) are:
\[
(A \wedge B) \wedge C = A \wedge (B \wedge C), \tag{5.17}
\]
\[
(A \vee B) \wedge C = (A \wedge C) \vee (B \wedge C), \tag{5.18}
\]
\[
(A \wedge B) \vee C = (A \vee C) \wedge (B \vee C). \tag{5.19}
\]
Proof of (5.17). Take an envelope D of A, B, C. Then by (5.3) and (5.10),
\[
\begin{aligned}
(A \wedge B) \wedge C &= (A^D \vee B^D)^D \wedge C = ((A^D \vee B^D) \vee C^D)^D \\
&= (A^D \vee (B^D \vee C^D))^D = A \wedge (B^D \vee C^D)^D = A \wedge (B \wedge C).
\end{aligned} \tag{5.20}
\]
□

Proof of (5.18). Assume first that
\[
A \prec C, \qquad B \wedge C = \emptyset, \tag{5.21}
\]
and set D := A ∨ B ∨ C. It is easy to check that
\[
(A \vee B)^D = A^C, \qquad A^D = A^C \vee B, \qquad C^D = B. \tag{5.22}
\]
Then by the definition (5.10)
\[
(A \vee B) \wedge C = ((A \vee B)^D \vee C^D)^D = (A^C \vee B)^D = (A^D)^D = A. \tag{5.23}
\]
On the other hand,
\[
(A \wedge C) \vee (B \wedge C) = A \vee \emptyset = A, \tag{5.24}
\]
and (5.18) follows in the special case (5.21). Now consider arbitrary bodies A, B, C, and set
\[
H := A \wedge C, \qquad K := B \wedge C. \tag{5.25}
\]
Then
\[
A \vee B = (H \vee H^A) \vee (K \vee K^B) = (H \vee K) \vee (H^A \vee K^B), \tag{5.26}
\]
with (H ∨ K) ≺ C and (H^A ∨ K^B) ∧ C = ∅. Then (5.23) holds with A, B replaced by (H ∨ K), (H^A ∨ K^B):
\[
((H \vee K) \vee (H^A \vee K^B)) \wedge C = H \vee K, \tag{5.27}
\]
and (5.18) follows from (5.25) and (5.26). □

Proof of (5.19). Let D be an envelope of A, B, C. Then from the definition (5.10) we have
\[
(A \wedge B) \vee C = (A^D \vee B^D)^D \vee C = ((A^D \vee B^D) \wedge C^D)^D, \tag{5.28}
\]
\[
(A \vee C) \wedge (B \vee C) = (A^D \wedge C^D)^D \wedge (B^D \wedge C^D)^D = ((A^D \wedge C^D) \vee (B^D \wedge C^D))^D. \tag{5.29}
\]
The two right-hand sides are equal by (5.18). Then (5.19) follows. □
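A simple model in which all four axioms hold is the set of all subsets of a finite set, ordered by inclusion: the null body is the empty set, the join is the union, separate bodies are disjoint sets, and the complementary part is the set difference. The sketch below uses this model, which is an illustrative assumption and not the universe of D-shapes, to check that the meet defined by (5.10) does not depend on the envelope (Proposition 5.4) and that (5.18) and (5.19) hold. All function names are hypothetical.

```python
from itertools import combinations


def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def join(a, b):
    """Minimum envelope of a and b in the power-set model."""
    return a | b


def complement(a, c):
    """Complementary part of the part a of c in the power-set model."""
    assert a <= c
    return c - a


def meet(a, b, c):
    """Definition (5.10): (A^C v B^C)^C for an envelope c of a and b."""
    return complement(join(complement(a, c), complement(b, c)), c)


def check(universe):
    bodies = subsets(universe)
    for a in bodies:
        for b in bodies:
            envelopes = [c for c in bodies if a <= c and b <= c]
            # Proposition 5.4: every envelope gives the same meet (here, the intersection).
            if {meet(a, b, c) for c in envelopes} != {a & b}:
                return False
            for c in bodies:
                d = a | b | c  # a common envelope of a, b, c
                # (5.18): (A v B) ^ C = (A ^ C) v (B ^ C)
                if meet(join(a, b), c, d) != join(meet(a, c, d), meet(b, c, d)):
                    return False
                # (5.19): (A ^ B) v C = (A v C) ^ (B v C)
                if join(meet(a, b, d), c) != meet(join(a, c), join(b, c), d):
                    return False
    return True


print(check(frozenset({1, 2, 3})))
```

The check is exhaustive only for the small set chosen here; it illustrates the statements and is no substitute for the proofs above.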
6. A Universe of Shapes

A universe of shapes is a pair (𝒰, ≺), where 𝒰 is a set of shapes, ≺ is a partial ordering on 𝒰, and 𝒰 and ≺ satisfy the axioms (B1)–(B4) of a universe of bodies stated in the preceding section. Here I take as 𝒰 the set S_N of all D-shapes in R^N defined in Section 4, as ≺ I take the relation (4.3), and I show that with this choice the axioms (B1)–(B4) are satisfied. I recall that each element Ω̃ of S_N is an equivalence class of D-regions with respect to the equivalence relation (4.1).

Axiom (B1) on the existence of a null shape is satisfied by the shape ∅̃ defined in Section 4. On the contrary, S_N does not include a universal shape, i.e., a D-shape Ω̃_∞ with the property Ω̃ ≺ Ω̃_∞ for all Ω̃ in S_N.

(Footnote: An alternative choice for a universe of shapes is to remove the requirement of boundedness from the definition of a D-region [6]. In this case, Ω̃_∞ is identified with the singleton {R^N}. Dealing with unbounded regions would require replacing the condition (iv) of area-boundedness of the boundary with a condition of local area-boundedness: for every D-region Ω and for every ball B of R^N, the (N − 1)-dimensional Hausdorff measure of (bdy Ω) ∩ B is finite.)

Axiom (B2) on the existence of the join is satisfied by the definition (4.8). To check whether the separation postulate (B3) is satisfied notice that, according to the definition given in Section 5, two D-shapes Ω̃, Π̃ are separate if, for any other D-shape Λ̃,
\[
\operatorname{int}\Lambda \subset \operatorname{int}\Omega \ \text{ and } \ \operatorname{int}\Lambda \subset \operatorname{int}\Pi \quad \Rightarrow \quad \Lambda = \emptyset \tag{6.1}
\]
for any representatives Ω, Π, Λ of the classes Ω̃, Π̃, Λ̃. It is easy to see that this condition is verified if and only if
\[
\operatorname{int}\Omega \cap \operatorname{int}\Pi = \emptyset. \tag{6.2}
\]
Then axiom (B3) is satisfied if
\[
(\operatorname{int}\Omega \cap \operatorname{int}\Lambda = \emptyset \ \text{ and } \ \operatorname{int}\Pi \cap \operatorname{int}\Lambda = \emptyset) \quad \Rightarrow \quad \operatorname{int}\operatorname{clo}(\Omega \cup \Pi) \cap \operatorname{int}\Lambda = \emptyset, \tag{6.3}
\]
and this implication is proved by the following chain of equalities
\[
\begin{aligned}
\operatorname{int}\operatorname{clo}(\Omega \cup \Pi) \cap \operatorname{int}\Lambda &= \operatorname{int}(\operatorname{clo}(\Omega \cup \Pi) \cap \operatorname{int}\Lambda) \\
&= \operatorname{int}((\operatorname{clo}\Omega \cup \operatorname{clo}\Pi) \cap \operatorname{int}\Lambda) \\
&= \operatorname{int}((\operatorname{clo}\Omega \cap \operatorname{int}\Lambda) \cup (\operatorname{clo}\Pi \cap \operatorname{int}\Lambda)) \\
&= \operatorname{int}((\operatorname{clo}\operatorname{int}\Omega \cap \operatorname{int}\Lambda) \cup (\operatorname{clo}\operatorname{int}\Pi \cap \operatorname{int}\Lambda)) = \emptyset.
\end{aligned} \tag{6.4}
\]
In it, the first two equalities follow from (2.8)5 and (2.8)1, the third is due to the set identity (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C), the fourth comes from the property clo Ω = clo int Ω of D-regions, and the last follows from the left-hand side of (6.3), because (clo A) ∩ B = ∅ for any pair A, B of open sets with A ∩ B = ∅.

Finally, it is proved in Proposition 4.3 that axiom (B4) on the existence of the complementary part of any part Π̃ of Ω̃ is satisfied by the D-shape Ω̃\Π̃ defined in Section 4.

Acknowledgements

I thank the anonymous reviewers for precious suggestions and comments. This research has been supported by the Programma Cofinanziato 2000 "Modelli Matematici per la Scienza dei Materiali" of the Italian Ministry for University and Scientific Research.
References

[1] L. Ambrosio, N. Fusco and D. Pallara, Functions of Bounded Variation and Free Discontinuity Problems. Oxford Science Publications, Oxford (2000).
[2] C. Banfi and M. Fabrizio, Sul concetto di sottocorpo nella meccanica dei continui. Rend. Accad. Naz. Lincei 66 (1979) 136–142.
[3] M. Degiovanni, A. Marzocchi and A. Musesti, Cauchy fluxes associated with tensor fields having divergence measure. Arch. Rational Mech. Anal. 147 (1999) 197–223.
[4] J. Dixmier, General Topology, Springer Undergraduate Texts in Mathematics. Springer, New York (1984).
[5] L.C. Evans and R.F. Gariepy, Measure Theory and Fine Properties of Functions. CRC Press, Boca Raton, FL (1992).
[6] M.E. Gurtin, Private communication, Blacksburg (June 2002).
[7] M.E. Gurtin, W.O. Williams and W.P. Ziemer, Geometric measure theory and the axioms of continuum thermodynamics. Arch. Rational Mech. Anal. 92 (1986) 1–22.
[8] W. Noll, La mécanique classique, basée sur un axiome d'objectivité. In: La Méthode Axiomatique dans les Mécaniques Classiques et Nouvelles, Colloque International, Paris, 1959. Gauthier-Villars, Paris (1963).
[9] W. Noll, The foundations of mechanics. In: Non-linear Continuum Theories, CIME Lectures, 1965. Cremonese, Roma (1966) pp. 159–200.
[10] W. Noll, Lectures on the foundations of continuum mechanics and thermodynamics. Arch. Rational Mech. Anal. 52 (1973) 62–92.
[11] W. Noll, Continuum Mechanics and geometric integration theory. In: F.W. Lawvere and S.H. Schanuel (eds), Categories in Continuum Physics, Buffalo, 1982, Springer Lecture Notes in Mathematics, Vol. 1174. Springer, Berlin (1986) pp. 17–29.
[12] W. Noll and E.G. Virga, Fit regions and functions of bounded variation. Arch. Rational Mech. Anal. 102 (1988) 1–21.
[13] M. Šilhavý, The existence of the flux vector and the divergence theorem for general Cauchy fluxes. Arch. Rational Mech. Anal. 90 (1985) 195–212.
[14] M. Šilhavý, Cauchy's stress theorem and tensor fields with divergences in L^p. Arch. Rational Mech. Anal. 116 (1991) 223–255.
[15] C.A. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1, 2nd edn. Academic Press, Boston (1991).
[16] A.I. Vol'pert and S.I. Hudjaev, Analysis in Classes of Discontinuous Functions and Equations of Mathematical Physics. Nijhoff, Dordrecht (1985).
Toward a Field Theory for Elastic Bodies Undergoing Disarrangements

LUCA DESERI (1) and DAVID R. OWEN (2)
(1) Dipartimento di Ingegneria, Università di Ferrara, 44100 Ferrara, Italy.
(2) Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.

Received 26 September 2002; in revised form 19 June 2003

Abstract. Structured deformations are used to refine the basic ingredients of continuum field theories and to derive a system of field equations for elastic bodies undergoing submacroscopically smooth geometrical changes as well as submacroscopically non-smooth geometrical changes (disarrangements). The constitutive assumptions employed in this derivation permit the body to store energy as well as to dissipate energy in smooth dynamical processes. Only one non-classical field G, the deformation without disarrangements, appears in the field equations, and a consistency relation based on a decomposition of the Piola–Kirchhoff stress circumvents the use of additional balance laws or phenomenological evolution laws to restrict G. The field equations are applied to an elastic body whose free energy depends only upon the volume fraction for the structured deformation. Existence is established of two universal phases, a spherical phase and an elongated phase, whose volume fractions are (1 − γ0)^3 and (1 − γ0) respectively, with γ0 := (√5 − 1)/2 the "golden mean".

Mathematics Subject Classifications (2002): 74A, 74B20, 74M25, 74H99, 76N99, 80A17.

Key words: structured deformations, multiscale, slips, voids, field equations, elasticity, dissipation.
1. Introduction

The vast scope of elasticity as a continuum field theory includes the description at the macrolevel of the dynamical evolution of bodies that undergo large deformations, that respond to smooth changes in geometry by storing mechanical energy, and that experience internal dissipation in isothermal motions only during non-smooth macroscopic changes in geometry such as shock waves. The research described in this paper represents the first step in a program to employ structured deformations of continua to obtain a field theory capable of describing, in the context of dynamics and large isothermal deformations, the evolution of bodies that (i) undergo smooth deformations at the macroscopic length scale, that (ii) can experience piecewise smooth deformations at submacroscopic length scales, and that (iii) can not only store energy but can also dissipate energy during such multiscale geometrical changes.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 287–326. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The main goal of the present study is the derivation of the following field relations governing the smooth vector field χ that describes the macroscopic changes in geometry of a body and the smooth tensor field G that describes the contribution at the macrolevel of only the smooth part of the submacroscopic piecewise smooth geometrical changes experienced by the body. If we put M := ∇χ − G and K := (∇χ)−1 G, the desired relations are (10.1)–(10.5), which we record in the following detailed form: ˜ ˜ t), G(X, t)) + DG (M(X, t), G(X, t))) divX (DM (M(X, ¨ t), + bref (X, t) = ρref (X)χ(X, ˜ t), G(X, t))(K(X, t)−T − I ) DG (M(X, ˜ t), G(X, t))K(X, t)−T = 0, + DM (M(X, ˜ t), G(X, t))M(X, t)T ) sk(DG (M(X, ˜ t), G(X, t)) G(X, t)T ) = 0, + sk(DM (M(X, ˜ ˙ DG (M(X, t), G(X, t)) · M(X, t) ˙ ˜ t), G(X, t)) · G(X, t) 0, +DM (M(X, det(G(X, t) + M(X, t)) det G(X, t) > m(t) > 0.
(1.1) (1.2) (1.3) (1.4) (1.5)
˜ Here, (M, G) → (M, G) is the response function that gives the Helmholtz free ˜ energy density (M(X, t), G(X, t)) at each point X in the reference configuration ˜ and DG ˜ denote its partial derivatives, bref is the body and each time t, DM force in the reference configuration, ρref is the mass density in the reference configuration, m(t) is a positive number depending on time, I is the identity tensor, sk denotes the skew part of a tensor, and superposed dots denote differentiation with respect to time. The balance of linear momentum (1.1), the “consistency relation” (1.2), and the frame-indifference relation (1.3) amount to 12 scalar equations that restrict the unknown fields χ and G representing 12 scalar fields in all. The Piola– Kirchhoff stress field S in the reference configuration is related constitutively to the fields G and M by the stress relation ˜ ˜ t), G(X, t)) + DG (M(X, t), G(X, t)), S(X, t) = DM (M(X,
(1.6)
and the “mixed power” inequality (1.4) guarantees that the internal dissipation is non-negative on each dynamical process of the body. Finally, the inequality (1.5) guarantees that no interpenetration of matter occurs submacroscopically [1]. In addition to the frame-indifference relation (1.3), the free energy response function ˜ is required to be frame-indifferent in the sense described in Section 9. There we show that these two conditions of frame-indifference imply that the law of balance of angular momentum is satisfied and, hence, need not be imposed directly. The theory of structured deformations [1, 2] shows that the tensor field M = ∇χ − G describes the contributions at the macrolevel of “disarrangements,” i.e., of the non-smooth part of piecewise smooth submacroscopic geometrical changes, and we use the term “elasticity with disarrangements” to distinguish the nascent
FIELD THEORY FOR ELASTIC BODIES
289
field theory described in (1.1)–(1.5) from that embodied in the now standard field theory of non-linear elasticity: ¨ t), divX (D(∇χ(X, t))) + bref (X, t) = ρref (X)χ(X, det ∇χ(X, t) > 0
(1.7) (1.8)
along with the stress relation S(X, t) = D(∇χ(X, t)).
(1.9)
We note that the balance of angular momentum need not be imposed explicitly if the response function F ↦ Ψ(F) is required to be frame-indifferent. We describe in this paper how the multiscale geometry embodied in structured deformations affords not only the decomposition
$$ \nabla\chi = G + M \tag{1.10} $$
of the macroscopic deformation gradient ∇χ into a part M due to disarrangements and a part G without disarrangements, but also the decomposition
$$ (\det K)\,S = S_{\backslash} + S_{d}, \tag{1.11} $$
with S\ := (det K)SK⁻ᵀ the stress without disarrangements and Sd := (det K)S − S\ the stress due to disarrangements. The decomposition of stress (1.11) is the basis for the consistency relation (3.6) which, in turn, yields the field equation (1.2), once constitutive assumptions are laid down. The two decompositions (1.10) and (1.11) are central to the “top-down” nature of our methodology, in which standard macroscopic fields, such as the stress power S · ∇χ̇ and the volume density of moments due to contact forces sk(S(∇χ)ᵀ), are refined and enriched by substitution of the decompositions (1.10) and (1.11) for the factors S and ∇χ:
$$ (\det K)\,S\cdot\nabla\dot{\chi} = S_{\backslash}\cdot\dot{G} + S_{d}\cdot\dot{M} + S_{\backslash}\cdot\dot{M} + S_{d}\cdot\dot{G}, \tag{1.12} $$
$$ (\det K)\,\operatorname{sk}(SF^{T}) = \operatorname{sk}(S_{\backslash}G^{T}) + \operatorname{sk}(S_{d}M^{T}) + \operatorname{sk}(S_{\backslash}M^{T}) + \operatorname{sk}(S_{d}G^{T}). \tag{1.13} $$
We utilize in the sequel an “identification relation” (2.2)₂ for the field G, an identification relation (2.4) for M, and one for div S\ (relation (A.3) in the Appendix), all of which are provided by the theory of structured deformations. These relations (i) describe G, M and div S\ as limits of geometrical or statical quantities calculated in terms of the piecewise smooth, injective deformations that approximate a structured deformation, (ii) justify the attributes “without disarrangements” and “due to disarrangements”, and (iii) provide unambiguous interpretations for the terms in the decompositions (1.12) and (1.13).
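As a brief worked check of (1.12) and (1.13), one can substitute (1.10) and (1.11) directly and expand by bilinearity. The sketch below assumes only that F stands for ∇χ, so that Ḟ = Ġ + Ṁ, and that sk is linear.

```latex
\begin{align*}
(\det K)\,S\cdot\nabla\dot{\chi}
  &= (S_{\backslash}+S_{d})\cdot(\dot{G}+\dot{M}) \\
  &= S_{\backslash}\cdot\dot{G} + S_{d}\cdot\dot{M}
   + S_{\backslash}\cdot\dot{M} + S_{d}\cdot\dot{G}, \\[4pt]
(\det K)\,\operatorname{sk}(SF^{T})
  &= \operatorname{sk}\bigl((S_{\backslash}+S_{d})(G+M)^{T}\bigr) \\
  &= \operatorname{sk}(S_{\backslash}G^{T}) + \operatorname{sk}(S_{d}M^{T})
   + \operatorname{sk}(S_{\backslash}M^{T}) + \operatorname{sk}(S_{d}G^{T}).
\end{align*}
```

In each expansion the first two terms pair like quantities, and the last two mix a factor without disarrangements with a factor due to disarrangements.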
290
LUCA DESERI AND DAVID R. OWEN
This methodology is supplemented by factorizations of the type
$$ (\chi, G) = (\chi, \nabla\chi)\circ(i, K) \tag{1.14} $$
in which the pair (χ, ∇χ) represents only classical geometrical changes and (i, K) represents purely submacroscopic geometrical changes. This factorization permits us to introduce a “virgin configuration,” macroscopically identical to the reference configuration, and to interpret the stress without disarrangements S\ as a stress in the virgin configuration. In the case of invertible structured deformations, the virgin configuration also can serve as an intermediate configuration for theories based on classical deformations. We are able with these tools to scrutinize and refine principal ingredients in continuum field theories, namely,
· geometry
· kinematics
· forces and moments
· power
· dissipation
· constitutive relations
· material symmetry
· material frame indifference
· material uniformity,
and to arrive at the field relations (1.1)–(1.5) for an elastic body undergoing disarrangements.
The new relations derived here incorporate the effects of submacroscopic disarrangements, such as slips, separations, the formation of voids, and the switching or reorientation of submacroscopic units. They also cover submacroscopically smooth geometrical changes, such as the distortion of atomic lattices and of molecular networks at length scales large enough to justify the use of smooth fields to extrapolate the discrete geometrical changes of the lattice or network. However, relations (1.1)–(1.5) do not directly incorporate the effect of jumps in the gradients of approximating piecewise smooth deformations (“gradient disarrangements”), so that fine mixtures of phases are not captured. Moreover, the effects of time-like disarrangements, in which changes in position occur at very short time scales, are not incorporated, and macroscopic disarrangements such as fracture, shear bands, shock waves, and acceleration waves are formally excluded by our assumption that χ and G are smooth.
Nevertheless, the inclusion of macroscopic disarrangements can be accomplished in a manner analogous to that used in the field theory based on (1.7)–(1.9). In addition, time-like disarrangements and gradient disarrangements are amenable to treatment via the concepts of “structured motions” [3, 4] and of “second-order structured deformations” [5] that go beyond the geometry and the kinematics in this paper. We note also that couple stresses and other multipolar entities, temperature variations, electromagnetic fields, and chemical reactions are left out of the present development.
It is evident from the constitutive expression Ψ̃(M(X, t), G(X, t)) for the volume density of the Helmholtz free energy that our theory permits energy to be stored by means of both smooth and non-smooth submacroscopic geometrical changes. For example, contributions to the energy both from the distortion of a crystalline lattice between slip bands and from the relative translations of parts of the crystalline lattice across slip bands can be included here, because the former is captured by G(X, t) and the latter by M(X, t). Similarly, the macroscopic stretching
produced when a polymer network deforms can be identified through ∇χ(X, t) = G(X, t) + M(X, t) while the submacroscopic reorientations of attached nematic particles can be described by G(X, t), so that Ψ̃(M(X, t), G(X, t)) can reflect the energetic contributions of both. Our development includes the possibility that Ψ̃ depends upon the material point X explicitly, and we discuss the concepts of material uniformity and homogeneity in Section 13.
The response function (M, G) ↦ Ψ̃(M, G) may or may not be obtained by means of a process of homogenization or relaxation from an “initial” response function describing the energy stored in piecewise smooth approximating deformations. In fact, it has been shown that such a relaxation procedure, starting from a standard form of the initial energy, does lead to response functions of the type (M, G) ↦ Ψ̃(M, G) [6], but also that the specific dependence on M and G obtained by such a relaxation may exclude response functions already found to be useful in applications ([3, 7–9]). We expect that response functions Ψ̃ identified in a variety of ways will play a role in applying the field relations (1.1)–(1.5) to specific bodies.
In the present paper we describe a class of response functions Ψ̃ restricted only by considerations of material frame indifference, material symmetry, or material uniformity as explained in Sections 9, 12, and 13. This choice provides a broad view of elasticity with disarrangements, but does not for the moment provide insights into specific solid or fluid bodies encountered in the laboratory. However, we do include a specific example, that of an “energetically nearsighted elastic body,” in which the constitutive relation for the free energy takes the form
$$ (M,G) \mapsto \tilde{\Psi}(M,G) = \bar{\psi}\!\left(\frac{\det G}{\det(G+M)}\right). \tag{1.15} $$
We note by (1.5) that det G/ det(G + M) = det K takes values in (0, 1], and we can interpret 1 − det K as the “void fraction” created by the purely submacroscopic factor (i, K) in (1.14).
Similarly, we call det K the “volume fraction” associated with (i, K). We show that the consistency relation (1.2), rewritten in terms of the variables ∇χ and K, and the constitutive assumption (1.15) imply that such an elastic body can arise in two non-trivial “universal” phases: a spherical phase, in which det K = (1 − γ₀)³, and an elongated phase, in which det K = 1 − γ₀, where
$$ \gamma_{0} := \frac{\sqrt{5}-1}{2} $$
is the golden mean. The stress relation (1.6) for the spherical phase reduces to that of an ideal gas, so that the stress in the current configuration is a hydrostatic pressure that depends linearly on the density in the current configuration. For the elongated phase, a uniaxial stress in the direction of submacroscopic elongation is superposed on such a hydrostatic pressure, as an outcome of the stress relation (1.6).
We turn now to further introductory remarks on the nature of the field relations (1.1)–(1.5). Suppose that the response function (M, G) ↦ Ψ̃(M, G) is chosen to satisfy the condition
$$ D_{M}\tilde{\Psi}(0,G) = 0, \tag{1.16} $$
for all tensors G with det G > 0. If we consider a smooth deformation χ and put G := ∇χ, then the tensor field M = ∇χ − G is identically zero and (1.2)–(1.4) are satisfied identically, the last with its inequality sign replaced by “=”. Consequently a classical motion satisfies (1.1)–(1.4) if and only if
$$ \operatorname{div}_{X}\bigl(D_{G}\tilde{\Psi}(0,\nabla\chi(X,t))\bigr) + b_{\mathrm{ref}}(X,t) = \rho_{\mathrm{ref}}(X)\,\ddot{\chi}(X,t), \tag{1.17} $$
which is equivalent to (1.7), the balance of linear momentum for a non-linearly elastic body. The inequality (1.5) and the relation G = ∇χ yield (1.8), and the stress relation (1.6) for an elastic body undergoing disarrangements reduces, by virtue of (1.16), to
$$ S(X,t) = D_{G}\tilde{\Psi}(0,\nabla\chi(X,t)), \tag{1.18} $$
a relation equivalent to (1.9). Therefore, relation (1.16) implies that classical motions that satisfy the new relations (1.1)–(1.6) also satisfy the field relations (1.7)–(1.9) of non-linear elasticity. We note also that, in the example of energetically nearsighted elastic bodies treated in Section 14, the condition (1.16) is satisfied only in exceptional cases.
The statical quantities underlying the relations (1.1)–(1.6) all can be expressed in terms of the classical measure of stress S, and the only balance law directly imposed is the classical balance of linear momentum. This observation provides a point of contrast between elasticity with disarrangements and theories of “structured continua,” in which additional geometrical fields are accompanied by additional statical quantities and additional balance laws [10, 11]. Moreover, the presence of the non-classical geometrical field G in the present theory does not require that we impose constitutively an evolution law expressing Ġ in terms of other geometrical and statical quantities, as is the case in theories of materials with “internal variables.” Instead, the decomposition (1.11) for the stress leads to the consistency relation (1.2) that restricts G and M = ∇χ − G in dynamical processes for the body. We note further that the field relations (1.1)–(1.5) are not obtained by imposing balance laws and constitutive relations at a submacroscopic length scale followed by an averaging procedure that leads to corresponding relations at the macrolevel. This fact distinguishes the present theory from multiscale approaches employed in the field of micromechanics that use homogenization or other systematic schemes of averaging.
The present study does not address initial-boundary value problems for the field relations (1.1)–(1.5).
Nevertheless, we expect that the problem of existence and uniqueness locally in time of smooth solutions χ, G satisfying initial conditions on χ, χ̇, and G and boundary conditions on χ can be attacked by means of the energy methods described in Chapter III of the monograph [12]. Our expectation is based on preliminary calculations for one-dimensional versions of (1.1)–(1.5), expressed in the equivalent form (10.23)–(10.27). Key issues in confirming our expectation are the local solvability of the consistency relation (10.24) at the initial
time and satisfaction of the mixed power inequality (10.26) with strict inequality at the initial time.
2. Structured Deformations
Specification of a structured deformation from a region A in a Euclidean space E with translation space V includes the specification of two fields g: A → E and G: A → Lin V called the macroscopic deformation and the deformation without disarrangements, respectively. In the present study we assume that the fields g and G are smooth, although discontinuities are permitted in the piecewise-smooth approximating deformations fn introduced below (2.1), as well as in ∇fn. This assumption excludes slip and separation at the macroscopic level while permitting such discontinuities at submacroscopic levels. To avoid some technical issues, precise smoothness assumptions on these fields and on the region A will not be specified here, but sufficient smoothness requirements on the fields can be inferred from the context. Other than smoothness requirements, the only conditions imposed on the fields g and G are the injectivity of g and the existence of a positive number m such that the inequalities
$$ m < \det G(X) \le \det\nabla g(X) \tag{2.1} $$
hold for all X ∈ A. The Approximation Theorem for structured deformations [1] assures that there is a sequence n ↦ fn of piecewise smooth, injective deformations defined on A (a determining sequence) such that
$$ g = \lim_{n\to\infty} f_{n}, \qquad G = \lim_{n\to\infty} \nabla f_{n}, \tag{2.2} $$
with the limits taken in the sense of essentially uniform convergence (i.e., L^∞ convergence). The spatial derivatives ∇fn are taken in the classical sense, and the limit G of derivatives in (2.2)₂ need not equal the corresponding derivative ∇g of the macroscopic deformation (nor need G even be the gradient of some deformation). Specific quantitative information about the difference M := ∇g − G is provided in the first subsection below and justifies the terminology deformation due to disarrangements for M.
2.1. DECOMPOSITIONS AND IDENTIFICATION RELATIONS
The additive decomposition
$$ \nabla g = G + M \tag{2.3} $$
for the macroscopic deformation gradient is given deeper significance by means of the following limit relation [2] for M:
$$ M(X) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\, \frac{1}{\operatorname{vol}B(X;\delta)} \int_{\Gamma(f_{n})\cap B(X;\delta)} [f_{n}](Y)\otimes\nu(Y)\,dA_{Y}. \tag{2.4} $$
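The limit relation (2.4) can be evaluated numerically for a standard illustration that is not taken from this paper: a simple shear g(X) = X + a X₂ e₁ approximated by rigid slabs of thickness 1/n that slip over one another, so that fn jumps by (a/n) e₁ across each plane X₂ = k/n with unit normal e₂. Away from the slip planes ∇fn = I, hence G = I and M = ∇g − G = a e₁ ⊗ e₂. The sketch below evaluates the (1,2) component of the surface integral in (2.4) for a fixed radius δ; the function name, its parameters, and the sample values are hypothetical.

```python
import math

def disarrangement_shear(a, n, delta, x2=0.0):
    """(1,2) component of the averaged jump integral in (2.4) for the slab example.

    The ball B(X; delta) is centered at height x2; each slip plane X2 = k/n
    meets the ball in a disk, on which the jump of f_n is (a/n) e1 and the
    unit normal is e2.
    """
    jump = a / n
    k_min = math.ceil((x2 - delta) * n)
    k_max = math.floor((x2 + delta) * n)
    total = 0.0
    for k in range(k_min, k_max + 1):
        h = k / n - x2                                   # signed distance of plane from center
        disk_area = math.pi * max(delta**2 - h**2, 0.0)  # area of the plane inside the ball
        total += jump * disk_area                        # contribution of this slip plane
    ball_volume = 4.0 / 3.0 * math.pi * delta**3
    return total / ball_volume

# Increasing n (finer slabs) at fixed delta should approach the shear amount a.
for n in (10, 100, 10000):
    print(n, disarrangement_shear(a=0.5, n=n, delta=0.3))
```

For this example the inner limit n → ∞ already equals a for every δ, so the outer limit δ → 0 is trivial; in general both limits matter.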
In this relation, n ↦ fn is an arbitrary sequence of piecewise smooth deformations that satisfies the limit relations (2.2). The symbol B(X; δ) denotes the ball of radius δ > 0 centered at a point X in A, and Γ(fn) ⊂ E, [fn](Y) ∈ V, and ν(Y) ∈ V denote, respectively, the jump set of the piecewise smooth deformation fn, the jump of fn at a point Y ∈ Γ(fn), and the unit normal to the jump set Γ(fn) at the point Y. The integrand in (2.4) is the tensor product of [fn](Y) and ν(Y), both vectors in V. The precise interpretations now available for G and M = ∇g − G permit us to understand and interpret various features of structured deformations in the following subsections.
2.2. FACTORIZATIONS, VIRGIN CONFIGURATIONS, AND INTERMEDIATE CONFIGURATIONS
The definition of composition of two structured deformations [1] is provided in the formula:
$$ (\tilde{g},\tilde{G})\circ(g,G) := \bigl(\tilde{g}\circ g,\ (\tilde{G}\circ g)\,G\bigr). \tag{2.5} $$
Here, the symbol “∘” on the left-hand side denotes the composition of two structured deformations, while on the right-hand side it denotes the composition of two functions. In addition, (G̃ ∘ g)G denotes the pointwise composition of the two tensor fields G̃ ∘ g and G. This formula provides the following factorizations for a structured deformation (g, G):
$$ (g,G) = (g,\nabla g)\circ(i,K), \tag{2.6} $$
$$ (g,G) = (i,\tilde{H})\circ(g,\nabla g), \tag{2.7} $$
where i(X) := X for all X ∈ A, K := (∇g)⁻¹G and H̃ := (G ∘ g⁻¹)((∇g)⁻¹ ∘ g⁻¹). The first factorization (2.6) represents the given structured deformation as the classical deformation (g, ∇g) following a “purely submacroscopic” structured deformation (i, K) that accomplishes all of the disarrangements associated with (g, G). Analogously, the second factorization (2.7) represents (g, G) as the same classical deformation followed by the purely submacroscopic structured deformation (i, H̃). We emphasize that all of the factors in the above representations are deformations of the entire body. The factorization (2.6) provides a distinction between the body before and after it undergoes the purely submacroscopic deformation (i, K), a distinction that permits us to distinguish between the reference configuration, from which the classical deformation (g, ∇g) proceeds, and the virgin configuration, from which both (i, K) and (g, G) proceed. Similarly, we may distinguish by means of (2.7) between the deformed configuration without disarrangements, attained from the virgin configuration via the classical deformation (g, ∇g) alone, and the deformed configuration, attained from the deformed configuration without disarrangements via the purely submacroscopic deformation (i, H̃). Of course, all of the configurations mentioned are global configurations of the body.
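The factorizations (2.6) and (2.7) can be checked numerically in the simplest setting, a homogeneous structured deformation with g(X) = FX and constant G, for which the composition rule (2.5) reduces to matrix products in each slot and H̃ reduces to GF⁻¹. The following sketch makes that homogeneity assumption; the helper names and the sample matrices are hypothetical.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    d = det(A)
    # entry (i, j) of the inverse is cofactor (j, i) divided by det A
    return [[(A[(j + 1) % 3][(i + 1) % 3] * A[(j + 2) % 3][(i + 2) % 3]
              - A[(j + 1) % 3][(i + 2) % 3] * A[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def compose(outer, inner):
    # (2.5) for homogeneous pairs: (F2, G2) o (F1, G1) = (F2 F1, G2 G1)
    return (matmul(outer[0], inner[0]), matmul(outer[1], inner[1]))

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F = [[1.0, 0.5, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 1.0]]  # macroscopic gradient, det F = 1.2
G = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # det G = 1.0 <= det F, as in (2.1)

K = matmul(inv(F), G)   # K := (grad g)^{-1} G
H = matmul(G, inv(F))   # H~ when g and G are homogeneous

first = compose((F, F), (I, K))    # (g, grad g) o (i, K), cf. (2.6)
second = compose((I, H), (F, F))   # (i, H~) o (g, grad g), cf. (2.7)

print(max_diff(first[1], G), max_diff(second[1], G))  # both should be at round-off level
print(det(K), det(H), det(G) / det(F))                # three expressions for one number
```

The last line prints three expressions that should agree up to round-off; their common value is the ratio det G / det ∇g for this sample.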
We note that the inequality (2.1) implies the relations
$$ 0 < \det K = \det\tilde{H} \le 1 \tag{2.8} $$
and permits us to call det K = det H̃ = det G/ det ∇g the volume fraction associated with the given structured deformation. The case det K < 1 reflects creation of voids through the purely submacroscopic deformations (i, K) and (i, H̃). Of particular interest in applications such as crystalline plasticity are invertible structured deformations (g, G), i.e., structured deformations for which the volume fraction equals 1. The term “invertible” is appropriate, because the pair (g⁻¹, G⁻¹ ∘ g⁻¹) then is itself a structured deformation that is a two-sided inverse for (g, G) with respect to the composition in (2.5) and with (i, I) playing the role of the identity structured deformation (here Iv = v for all v ∈ V). In this case, the purely submacroscopic factor (i, K) also is an invertible structured deformation with inverse (i, K)⁻¹ = (i, K⁻¹), and we have the following factorization
$$ (g,\nabla g) = (g,G)\circ(i,K)^{-1} \tag{2.9} $$
of the classical deformation (g, ∇g). For the structured deformation (g, G), the purely submacroscopic deformation (i, K) carries the virgin configuration into the reference configuration; its inverse (i, K)⁻¹ therefore carries the reference configuration into the virgin configuration. Consequently, the virgin configuration for the invertible structured deformation (g, G) plays the role of a (global) intermediate configuration for the classical deformation (g, ∇g). Local intermediate configurations play an important role in descriptions of single and polycrystalline materials and of polymers (see [13, 14] and references cited therein).
2.3. MOTIONS VIA FAMILIES OF STRUCTURED DEFORMATIONS; SPACE-LIKE DISARRANGEMENTS
The most immediate way of capturing the possibility that a body evolves in time while undergoing structured deformations at each instant is to consider a given positive number T and a pair of smooth mappings χ: A × (0, T) → E and G: A × (0, T) → Lin V such that the pair (χ(·, t), G(·, t)) is a structured deformation for each t ∈ (0, T). When the Approximation Theorem and the identification relation in Section 2.1 are invoked at each time t, the relations (2.2) and (2.4) become:
$$ \chi(\cdot,t) = \lim_{n\to\infty}\chi_{n}(\cdot,t), \qquad G(\cdot,t) = \lim_{n\to\infty}\nabla\chi_{n}(\cdot,t) \tag{2.10} $$
and
$$ M(X,t) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\, \frac{1}{\operatorname{vol}B(X;\delta)} \int_{\Gamma(\chi_{n}(\cdot,t))\cap B(X;\delta)} [\chi_{n}(\cdot,t)](Y)\otimes\nu(Y)\,dA_{Y}. \tag{2.11} $$
In this context, the disarrangements associated with the approximating motions χn that are captured in the tensor field M: A × (0, T) → Lin V are space-like,
so that time-like jumps in χn do not affect the fields associated with the family t ↦ (χ(·, t), G(·, t)) of structured deformations. The more complete treatment of “structured motions” described in [3], Part 2, introduces not only a deformation without disarrangements G but also a velocity without disarrangements χ̇\ that permit both space-like and time-like jumps to be captured in two analogues of the identification relation (2.11). We choose here to follow the more immediate route, bodies evolving through time-parameterized families of structured deformations, and our theory of elasticity with disarrangements more accurately can be entitled elasticity with space-like disarrangements. The inequality (2.1) becomes in the case of time-parameterized families of structured deformations:
$$ 0 < m(t) < \det G(X,t) \le \det\nabla\chi(X,t). \tag{2.12} $$
3. Contact and Body Forces
3.1. DECOMPOSITIONS
Earlier studies of balance laws for bodies undergoing structured deformations ([3], Part 2, Section 1, and [15]) showed that the classical law of balance of forces in the reference configuration is equivalent to a “refined balance law” that may be written as:
$$ \operatorname{div}(SK^{*}) + \operatorname{div}\bigl((\det K)S - SK^{*}\bigr) - S\nabla(\det K) + (\det K)\,b_{\mathrm{ref}} = 0. \tag{3.1} $$
Here, S: A × (0, T) → Lin V is the Piola–Kirchhoff stress field, K := (∇χ)⁻¹G, bref is the body force per unit volume in the reference configuration, and A* := (det A)A⁻ᵀ for all invertible A ∈ Lin V. Moreover, the decomposition (A.1) and the identification relations (A.2), (A.3) derived in earlier studies ([3, 15]) and recorded in the Appendix, permit us to call
$$ S_{\backslash} := SK^{*} = (\det K)\,SK^{-T} \tag{3.2} $$
the stress without disarrangements, div(SK*) the volume density of contact forces without disarrangements, and div((det K)S − SK*) − S∇(det K) the volume density of contact forces due to disarrangements. We call
$$ S_{d} := S\bigl[(\det K)I - K^{*}\bigr] \tag{3.3} $$
the stress due to disarrangements. The availability through structured deformations of both a virgin configuration and a reference configuration permits one to view (3.1) as balance of forces in the virgin configuration, differing from the reference configuration by a purely submacroscopic deformation as described in Section 2.2. Of course, the scalar field det K may be thought of as the volume fraction associated with the given time-parameterized family of structured deformations.
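A short calculation indicates why (3.1) carries the same content as the classical balance of forces. The sketch assumes the product rule div(φS) = φ div S + S∇φ for a smooth scalar field φ, a convention that matches the form of the term S∇(det K) in (3.1), and it uses det K > 0.

```latex
\begin{align*}
\operatorname{div}(SK^{*}) + \operatorname{div}\bigl((\det K)S - SK^{*}\bigr)
  &= \operatorname{div}\bigl((\det K)S\bigr) \\
  &= (\det K)\operatorname{div}S + S\nabla(\det K), \\[4pt]
\text{so that (3.1) becomes}\qquad
(\det K)\bigl(\operatorname{div}S + b_{\mathrm{ref}}\bigr) &= 0
  \quad\Longleftrightarrow\quad \operatorname{div}S + b_{\mathrm{ref}} = 0 .
\end{align*}
```

The two divergence terms in (3.1) therefore split one classical contact-force density into a part without disarrangements and a part due to disarrangements, without changing their sum.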
The considerations above lead us not only to the decomposition
$$ F = G + M \tag{3.4} $$
of the macroscopic deformation gradient F = ∇χ: A × (0, T) → Lin V but also, upon adding relations (3.2) and (3.3), to the decomposition of the stress:
$$ (\det K)\,S = S_{\backslash} + S_{d}. \tag{3.5} $$
The stress tensor (det K)S is an analogue of the “weighted Cauchy tensor” (det F)T discussed in [16], and equations (3.1) and (3.5) show that it is this weighted measure of stress that readily decomposes into a part without disarrangements plus a part due to disarrangements.
3.2. CONSISTENCY RELATION
If we use the defining relation (3.2) for the stress without disarrangements S\ to eliminate the Piola–Kirchhoff stress S from the decomposition (3.5), we obtain a consistency relation between the stresses due to and without disarrangements:
$$ S_{\backslash}K^{T} = S_{\backslash} + S_{d}. \tag{3.6} $$
Roughly speaking, there is less freedom in the decomposition (3.5) of the weighted stress (det K)S into parts with and without disarrangements than in the decomposition (3.4) of the macroscopic deformation gradient into parts with and without disarrangements. Accordingly, we refer to (3.6) as the consistency relation. It will provide, through the constitutive assumptions for S\ and Sd made in Section 7, the restriction (1.2) on the dynamical processes that can occur in a given elastic body. An equivalent form of the consistency relation,
$$ S_{\backslash}M^{T} + S_{d}G^{T} + S_{d}M^{T} = 0, \tag{3.7} $$
follows from (3.6), after substitution of GᵀF⁻ᵀ for Kᵀ, and implies that
$$ \operatorname{sk}(S_{\backslash}M^{T}) + \operatorname{sk}(S_{d}G^{T}) + \operatorname{sk}(S_{d}M^{T}) = 0, \tag{3.8} $$
where sk A := (A − Aᵀ)/2 denotes the skew part of A ∈ Lin V. This relation plays a role in the analysis of moment densities in Section 5.
4. Power Expended; Balance Laws
We postulate that in a family of structured deformations (χ, G) the power expended at time t ∈ (0, T) on a subbody 𝒮 ⊂ A by its exterior is given by the classical formula
$$ P(\mathcal{S},t) = \int_{\operatorname{bdy}\mathcal{S}} S(X,t)\nu(X)\cdot\dot{\chi}(X,t)\,dA_{X} + \int_{\mathcal{S}} b_{*}(X,t)\cdot\dot{\chi}(X,t)\,dV_{X}. \tag{4.1} $$
Here, beyond the quantities S and χ̇ introduced in Sections 2 and 3, ν(X) denotes the outward unit normal at the point X ∈ bdy 𝒮, and b∗ := bref − ρref χ̈ is the total body force, with bref the body force in the reference configuration and ρref the mass density in the reference configuration. Our use of the classical formula (4.1) for the power allows us to preserve much of the structure of the standard field theory of non-linear elasticity. (See [4] for a derivation of balance laws that arise from a non-classical formula for the power.) A standard argument [17, 18] based on invariance of the power expended under superposed rigid motions yields the classical laws of balance of linear and angular momentum:
$$ \operatorname{div}S + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot{\chi}, \tag{4.2} $$
$$ \operatorname{sk}(SF^{T}) = 0. \tag{4.3} $$
The definition (4.1) of the power expended and the balance law (4.2) yield by means of the divergence theorem and a standard product rule the following reduced expression for the power expended
$$ P(\mathcal{S},t) = \int_{\mathcal{S}} S(X,t)\cdot\nabla\dot{\chi}(X,t)\,dV_{X}. \tag{4.4} $$
We note that the formula ∇χ̇ = (∇χ)˙ = Ḟ and the two basic decompositions (3.4) and (3.5) permit us to decompose (det K)S · ∇χ̇, the density of stress power in the virgin configuration, in the following manner:
$$ (\det K)\,S\cdot\nabla\dot{\chi} = S_{\backslash}\cdot\dot{G} + S_{d}\cdot\dot{M} + S_{\backslash}\cdot\dot{M} + S_{d}\cdot\dot{G}. \tag{4.5} $$
The contribution S\ · Ġ + Sd · Ṁ to the stress power involves pairing like quantities (a stress without disarrangements and a rate of deformation without disarrangements, or corresponding quantities due to disarrangements). Because the contribution S\ · Ṁ + Sd · Ġ mixes factors with and without disarrangements, we refer to it as the mixed (stress) power. In a similar way, we may decompose the volume density of moments in the balance law (4.3):
$$ (\det K)\,\operatorname{sk}(SF^{T}) = \operatorname{sk}(S_{\backslash}G^{T}) + \operatorname{sk}(S_{d}M^{T}) + \operatorname{sk}(S_{\backslash}M^{T}) + \operatorname{sk}(S_{d}G^{T}), \tag{4.6} $$
and individual terms on the right-hand side may be interpreted as particular moment densities, as described in the next section. By the consistency relation (3.8), the last three moment densities sk(Sd Mᵀ), sk(S\ Mᵀ), and sk(Sd Gᵀ) appearing on the right-hand side of (4.6) must add to zero. By the balance of angular momentum (4.3), by (4.6), and by (3.8), the first moment density sk(S\ Gᵀ) on the right-hand side of (4.6) must vanish.
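The algebraic relations among S, S\, and Sd can be confirmed numerically at a single point. The sketch below builds S\ and Sd from (3.2) and (3.3) for sample tensors and reports the residuals of (3.6), of (3.7), and of the identity sk(S\ Gᵀ) = (det K) sk(SFᵀ) that underlies the last statement above. The helper names and the sample values of S, F, and G are hypothetical, and S is deliberately not chosen to satisfy (4.3).

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def add(A, B, s=1.0):
    # returns A + s * B
    return [[A[i][j] + s * B[i][j] for j in range(3)] for i in range(3)]

def scale(c, A):
    return [[c * A[i][j] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    d = det(A)
    # entry (i, j) of the inverse is cofactor (j, i) divided by det A
    return [[(A[(j + 1) % 3][(i + 1) % 3] * A[(j + 2) % 3][(i + 2) % 3]
              - A[(j + 1) % 3][(i + 2) % 3] * A[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def sk(A):
    return scale(0.5, add(A, transpose(A), -1.0))

def norm(A):
    return max(abs(A[i][j]) for i in range(3) for j in range(3))

def residuals(S, F, G):
    K = matmul(inv(F), G)                        # K := F^{-1} G
    dK = det(K)
    K_star = scale(dK, transpose(inv(K)))        # K* = (det K) K^{-T}
    S_wo = matmul(S, K_star)                     # stress without disarrangements, (3.2)
    S_d = add(scale(dK, S), S_wo, -1.0)          # stress due to disarrangements, (3.3)
    M = add(F, G, -1.0)                          # M = F - G, (3.4)
    Gt, Mt = transpose(G), transpose(M)
    r36 = add(matmul(S_wo, transpose(K)), add(S_wo, S_d), -1.0)
    r37 = add(add(matmul(S_wo, Mt), matmul(S_d, Gt)), matmul(S_d, Mt))
    r_sk = add(sk(matmul(S_wo, Gt)), scale(dK, sk(matmul(S, transpose(F)))), -1.0)
    return norm(r36), norm(r37), norm(r_sk)

S = [[1.0, 2.0, 0.0], [0.5, -1.0, 0.3], [0.0, 0.4, 2.0]]
F = [[1.0, 0.5, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 1.0]]
G = [[1.0, 0.2, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(residuals(S, F, G))   # three residuals, each expected at round-off level
```

Because the three residuals vanish for an arbitrary S, these relations are purely kinematical consequences of the definitions; constitutive content enters only when S\ and Sd are given by response functions.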
5. Offset Moments
The volume densities of moments sk(S\ Mᵀ), sk(Sd Mᵀ), sk(Sd Gᵀ) arose in Section 4 through the formula (4.6). In this section we identify the terms sk(S\ Mᵀ) and sk(Sd Mᵀ) as volume densities of “offset moments.” The significance of the moment density sk(Sd Mᵀ) is revealed by the following identification relation:
$$ \operatorname{sk}\bigl(S_{d}(X,t)M^{T}(X,t)\bigr) = \lim_{\delta\to 0}\,\lim_{n\to\infty}\, \frac{\displaystyle\int_{\Gamma(\chi_{n}(\cdot,t))\cap B(X;\delta)} S_{d}(X,t)\nu(Y)\times[\chi_{n}(\cdot,t)](Y)\,dA_{Y}}{\operatorname{vol}B(X;\delta)}, \tag{5.1} $$
which we verify below. The vector Sd(X, t)ν(Y) is the traction due to disarrangements at the point Y on a disarrangement site in the virgin configuration, computed using the stress due to disarrangements at the center X of the ball. The vector product Sd(X, t)ν(Y) × [χn(·, t)](Y) is (minus) the moment per unit area produced by that traction acting against the offset [χn(·, t)](Y) caused by disarrangements. An elementary instance of such moments would arise if a deck of cards, in equilibrium under a system of loads, is shifted near the middle card without changing either the shape of the individual cards or the applied loads. The moment arising from the change in geometry of the deck corresponds to the moment calculated on the right-hand side of (5.1). Consequently, we call sk(Sd Mᵀ) a volume density of offset moments. Of course, replacing Sd by S\ in (5.1) permits us also to call sk(S\ Mᵀ) a volume density of offset moments. The identification relation (5.1) follows immediately if we substitute the right-hand side of the identification relation (2.11) into the left-hand side of (5.1) and if we identify the skew tensor sk(Sd(X, t)ν(Y) ⊗ [χn(·, t)](Y)) with its axial vector.
Because G measures the deformation away from disarrangement sites, we interpret the moment density sk(Sd Gᵀ) in (4.6), arising even in motions involving disarrangements, as an analogue of the moment density sk(SFᵀ) arising in the classical balance law. (Recall that in Section 4 we pointed out that the analogous density sk(S\ Gᵀ) vanishes, because it is a scalar multiple of sk(SFᵀ).) According to (3.8), the offset moment densities sk(Sd Mᵀ), sk(S\ Mᵀ), and the moment density sk(Sd Gᵀ) in (4.6) must add to zero. Actually, we show in Section 9 that material frame-indifference implies that sk(Sd Mᵀ) vanishes and, therefore, that sk(S\ Mᵀ) and sk(Sd Gᵀ) also must add to zero.
6. Dynamical Processes, Constitutive Classes, and the Dissipation Inequality
A dynamical process is specified here by giving a motion χ, the deformation without disarrangements G, the stress field S, the volume density ψ of the Helmholtz free energy in the reference configuration, and the mass density ρref in the reference configuration. Of course, the stresses without and due to disarrangements S\ and Sd are determined by the Piola–Kirchhoff stress S, the motion χ, and
the deformation without disarrangements G through the relations (3.2) and (3.3), and the body force bref also is determined by fields in our list from the balance of linear momentum (4.2). (We may guarantee that the balance of angular momentum (4.3) is satisfied on every dynamical process by imposing the condition sk(SFᵀ) = 0, but we refrain from doing so pending the discussion of frame-indifference in Section 9.) We will omit ρref in the list above for the sake of conciseness.
The concept of a constitutive class is central to the specification of the particular material that is to be considered. Here, following Gurtin [19], a constitutive class C simply is a collection of dynamical processes. A particular choice of constitutive class limits the dynamical processes that are to be considered. In practice, a constitutive class is specified by giving a list of response functions: the constitutive class is the collection of those dynamical processes that satisfy the relations on the fields χ, G, S, ψ provided by the response functions. Of course, these relations may include inequalities as well as equations.
Another limitation on dynamical processes is provided by the second law of thermodynamics which, in the present context of isothermal processes, is the dissipation inequality:
$$ \dot{\psi}(X,t) \le S(X,t)\cdot\nabla\dot{\chi}(X,t), \tag{6.1} $$
asserting that the rate of change of the density of the Helmholtz free energy does not exceed the stress power. We denote by D the collection of all dynamical processes χ, G, S, ψ that satisfy the dissipation inequality. The dissipation inequality is imposed by means of the requirement
$$ C \subset D. \tag{6.2} $$
In other words, every dynamical process for the given material must obey (6.1). The dissipation inequality may be used to impose restrictions on the response functions that specify a constitutive class C, as first described in the context of the Clausius–Duhem inequality by Coleman and Noll [20] and now widely followed in continuum thermodynamics. According to this procedure, one seeks necessary and sufficient conditions on the response functions that specify C in order that C ⊂ D. We indicate in the next section that, when the free energy and stresses depend only upon F = ∇χ and G, the restrictions obtained from the procedure of Coleman and Noll include the vanishing of the internal dissipation S(X, t) · ∇χ̇(X, t) − ψ̇(X, t) on dynamical processes in C. We shall maintain the premise that it is useful to identify and study constitutive classes that admit internal dissipation on a non-trivial class of dynamical processes. Consequently, instead of following the procedure of Coleman and Noll, we are led in the next section to impose sufficient conditions on the constitutive class in order that it be included in D. To do so, we specify a particular constitutive class Ed and show directly the inclusion Ed ⊂ D. Although all of the fields in our description of a dynamical process are smooth, the present approach echoes the standard use of the second law of thermodynamics to limit the
class of non-smooth processes that can occur in the presence of a shock wave (see, for example, [21]). In our context, the non-smoothness occurs at a submacroscopic level and is made explicit only through the piecewise-smooth motions χn that arise in the Approximation Theorem. In spite of the present choice not to pursue the procedure of Coleman and Noll, the constitutive class C obtained via that procedure merits detailed study, because it admits the possibility that internal dissipation arises via small jumps between points on a constitutive manifold determined by the consistency relation. (See [7, 8] for elementary examples.)
7. A Constitutive Class for Elastic Bodies Undergoing Disarrangements
The constitutive data that we employ initially for the specification of an elastic body undergoing disarrangements are the smooth response functions (F, G) ↦ Ψ(F, G), (F, G) ↦ S\(F, G), and (F, G) ↦ Sd(F, G) for the free energy, stress without disarrangements, and stress due to disarrangements, all defined on pairs of invertible tensors (F, G) satisfying the inequalities
$$ 0 < \det G \le \det F. \tag{7.1} $$
An equivalent description of these response functions entails the specification of the mappings (M, G) ↦ Ψ̃(M, G) := Ψ(M + G, G), (M, G) ↦ S̃\(M, G) := S\(M + G, G), and (M, G) ↦ S̃d(M, G) := Sd(M + G, G) defined on pairs of tensors (M, G) satisfying
$$ 0 < \det G \le \det(M+G). \tag{7.2} $$
For future reference, we record here the relations
$$ D_{M}\tilde{\Psi}(M,G) = D_{F}\Psi(M+G,G), \tag{7.3} $$
$$ D_{G}\tilde{\Psi}(M,G) = D_{F}\Psi(M+G,G) + D_{G}\Psi(M+G,G). \tag{7.4} $$
We allow the free energy response function also to depend upon the material point $X$ at which the free energy is to be computed, but we delay until Section 9 making explicit this dependence on $X$ in the symbols $\Psi(F,G)$ and $\tilde\Psi(M,G)$. The functions $\tilde\Psi$, $\tilde S_\backslash$, $\tilde S_d$ now permit us to define the class $C$ of dynamical processes satisfying the constitutive relations
\[ \psi(X,t) = \tilde\Psi(M(X,t), G(X,t)), \tag{7.5} \]
\[ S_\backslash(X,t) = \tilde S_\backslash(M(X,t), G(X,t)), \tag{7.6} \]
and
\[ S_d(X,t) = \tilde S_d(M(X,t), G(X,t)) \tag{7.7} \]
LUCA DESERI AND DAVID R. OWEN
for all $X$, $t$. We now indicate how the requirement (6.2) imposed via the procedure of Coleman and Noll leads to a constitutive class in which no internal dissipation occurs. (To simplify the relations below, we omit the argument $(X,t)$ throughout.) We multiply both sides of the dissipation inequality (6.1) by $\det K$, use the constitutive relations (7.5)–(7.7) and the formula (4.5) for the stress power in the virgin configuration, and we conclude that the internal dissipation
\[ (\det K)(S\cdot\nabla\dot\chi - \dot\psi) = \bigl(\tilde S_\backslash(M,G) + \tilde S_d(M,G) - (\det K)D_M\tilde\Psi(M,G)\bigr)\cdot\dot M + \bigl(\tilde S_\backslash(M,G) + \tilde S_d(M,G) - (\det K)D_G\tilde\Psi(M,G)\bigr)\cdot\dot G \tag{7.8} \]
is not negative on each dynamical process in $C$. In spite of the restrictions that the consistency relation (3.6) together with the constitutive relations (7.6), (7.7) place on $\dot M$ and $\dot G$, we may reverse any dynamical process in $C$ with respect to its time-evolution and obtain another dynamical process in $C$. Consequently, $\dot M$ and $\dot G$ may be replaced by $-\dot M$ and $-\dot G$ in (7.8), leaving all other quantities unchanged. Therefore, the internal dissipation as given in (7.8) must vanish for every dynamical process in $C$, and the dissipation inequality (6.1) must be satisfied as an equality.

In order to obtain a theory that admits internal dissipation, we consider now a collection of dynamical processes different from $C$. The constitutive class that we now specify is suggested by comparing the formula (4.5) for the stress power in the virgin configuration with the formula for $(\det K)\dot\psi$ obtained by differentiating both sides of (7.5) with respect to $t$:
\[ (\det K)\,S\cdot\nabla\dot\chi = S_d\cdot\dot M + S_\backslash\cdot\dot G + S_d\cdot\dot G + S_\backslash\cdot\dot M, \tag{7.9} \]
\[ (\det K)\,\dot\psi = (\det K)D_M\tilde\Psi(M,G)\cdot\dot M + (\det K)D_G\tilde\Psi(M,G)\cdot\dot G. \tag{7.10} \]
Our goal of specifying a material that can both store energy and dissipate energy in smooth processes can be achieved first by choosing some terms on the right-hand side of (7.9) to be set equal to the entire right-hand side of (7.10), thereby specifying the amount of work done on each time interval that will be stored by the body. In order to satisfy the dissipation inequality (6.1), the remaining terms on the right-hand side of (7.9) must be assumed to be non-negative. Accordingly, given a response function $\tilde\Psi$ with domain $\{(M,G) \mid 0 < \det G \le \det(G+M)\}$, we consider the collection $E_d$ of dynamical processes $(\chi, G, S, \psi)$ satisfying the constitutive relations
\[ \psi(X,t) = \tilde\Psi(M(X,t), G(X,t)), \tag{7.11} \]
\[ S_d(X,t) = (\det K(X,t))\,D_M\tilde\Psi(M(X,t), G(X,t)), \tag{7.12} \]
\[ S_\backslash(X,t) = (\det K(X,t))\,D_G\tilde\Psi(M(X,t), G(X,t)), \tag{7.13} \]
and the mixed power inequality
\[ S_\backslash(X,t)\cdot\dot M(X,t) + S_d(X,t)\cdot\dot G(X,t) \ge 0 \tag{7.14} \]
for all $X$, $t$.
In making these choices we appeal to the idea that forces separated from a site of geometrical changes are unlikely to be able to maintain a metastable geometrical configuration at that site and, therefore, should be capable of contributing to dissipation. Thus, we take into account the separation of the points of application of the contact forces due to disarrangements (produced by $S_d$) from the sites where the geometrical changes without disarrangements (that contribute to $\dot G$) occur. Analogous considerations can be made for the other term $S_\backslash\cdot\dot M$ in the mixed power. On the contrary, for the “pure” term $S_d\cdot\dot M$ the proximity of the points of application of the contact forces due to disarrangements to the sites where changes in the disarrangements occur enables the maintenance of metastability, and so also for the other “pure” term $S_\backslash\cdot\dot G$. The constitutive assumptions (7.12) and (7.13) embody the non-dissipative character of the terms $S_d\cdot\dot M$ and $S_\backslash\cdot\dot G$ in the stress power.

An important conclusion that can be drawn from the definition of the constitutive class $E_d$ is that the dissipation inequality is satisfied for every dynamical process in $E_d$, i.e., $E_d \subset D$. Indeed, the constitutive relations (7.11)–(7.13), the mixed power inequality (7.14), and relations (3.4) and (3.5) tell us that
\[ (\det K)\dot\psi = (\det K)D_M\tilde\Psi\cdot\dot M + (\det K)D_G\tilde\Psi\cdot\dot G \le (\det K)D_M\tilde\Psi\cdot\dot M + (\det K)D_G\tilde\Psi\cdot\dot G + (\det K)D_G\tilde\Psi\cdot\dot M + (\det K)D_M\tilde\Psi\cdot\dot G = (\det K)\,S\cdot\dot F, \]
which is equivalent to the dissipation inequality (6.1). It is also significant that the consistency relation (3.6), through the constitutive relations (7.12) and (7.13), imposes a restriction on dynamical processes in $E_d$: for every dynamical process $(\chi, G, S, \psi)$ in $E_d$ there holds
\[ D_G\tilde\Psi(M(X,t), G(X,t))\,K(X,t)^T = D_G\tilde\Psi(M(X,t), G(X,t)) + D_M\tilde\Psi(M(X,t), G(X,t)) \tag{7.15} \]
for all $(X,t)$. Consequently, the pairs $(M(X,t), G(X,t))$ available through dynamical processes in $E_d$ lie in a submanifold of $\mathrm{Lin}\,V \times \mathrm{Lin}\,V$. In particular, for each $(X,t)$, the pairs $(\dot M(X,t), \dot G(X,t))$ of time-derivatives available through dynamical processes in $E_d$ lie in the tangent space of the submanifold at $(M(X,t), G(X,t))$ and, hence, cannot be arbitrary elements of $\mathrm{Lin}\,V \times \mathrm{Lin}\,V$. Similarly, the mixed power inequality (7.14) imposes a restriction on the quantities $M$, $G$, $\dot M$ and $\dot G$, or, equivalently, on $F$, $G$, $\dot F$ and $\dot G$, that can arise for dynamical processes in the constitutive class $E_d$, and we shall discuss some of these restrictions in Section 8. Finally, for every classical dynamical process $(\chi, \nabla\chi, S, \psi)$ in $E_d$, the consistency relation (7.15), and the fact that $K = I$ when $M = F - G = 0$, yield for all $X$, $t$:
\[ D_M\tilde\Psi(0, \nabla\chi(X,t)) = 0, \tag{7.16} \]
and, equivalently, by (7.3),
\[ D_F\Psi(\nabla\chi(X,t), \nabla\chi(X,t)) = 0, \tag{7.17} \]
a restriction on the classical dynamical processes for the given elastic body.
Our theory thus implies that a given choice of free-energy response function $\tilde\Psi$ restricts the dynamical processes available to a body through relations (7.11)–(7.15), and that choice also restricts the classical dynamical processes through relation (7.16). In contrast, our theory restricts the choice of $\tilde\Psi$ itself only through the condition of frame-indifference (9.1).

8. Internal Dissipation

The internal dissipation in the reference configuration for a dynamical process $(\chi, G, S, \psi)$ in $E_d$ is defined to be the excess of the stress-power over the rate of change of free energy: $S\cdot\nabla\dot\chi - \dot\psi = S\cdot\dot F - \dot\psi$. Because the dissipation inequality (6.1) is satisfied for every dynamical process in $E_d$, the internal dissipation is non-negative, and we consider from now on
\[ \Upsilon := (\det K)(S\cdot\dot F - \dot\psi) \ge 0, \tag{8.1} \]
the internal dissipation in the virgin configuration. It follows immediately from (4.5), (7.12), and (7.13) that the internal dissipation in the virgin configuration equals the mixed stress power:
\[ \Upsilon = S_\backslash\cdot\dot M + S_d\cdot\dot G = (\det K)\bigl(D_G\tilde\Psi\cdot\dot M + D_M\tilde\Psi\cdot\dot G\bigr) = (\det K)\bigl[(D_F\Psi + D_G\Psi)\cdot\dot F - D_G\Psi\cdot\dot G\bigr] \ge 0 \tag{8.2} \]
for each dynamical process $(\chi, G, S, \psi)$ in the constitutive class $E_d$. Our aim in this section is to relate the internal dissipation to familiar quantities in the literature by investigating the relative contributions of the two terms $S_\backslash\cdot\dot M$ and $S_d\cdot\dot G$ in (8.2). An equivalent rewriting of (8.1) yields the relations
\[ S\cdot\dot F = (\det K)^{-1}\bigl(S_\backslash\cdot\dot G + S_d\cdot\dot M + S_\backslash\cdot\dot M + S_d\cdot\dot G\bigr) = D_G\tilde\Psi\cdot\dot G + D_M\tilde\Psi\cdot\dot M + (\det K)^{-1}\Upsilon = \dot\psi + (\det K)^{-1}\Upsilon, \tag{8.3} \]
a decomposition of the stress-power in the reference configuration into a non-dissipative part $\dot\psi$ and a dissipative part $(\det K)^{-1}\Upsilon \ge 0$. Thus, by (8.2), the dissipative part $(\det K)^{-1}\Upsilon$ of the stress-power equals the mixed stress-power in the reference configuration. Moreover, for classical dynamical processes $(\chi, \nabla\chi, S, \psi)$ in the constitutive class $E_d$, the internal dissipation vanishes, because $\dot M = S_d = 0$.

For a given stress $S$ and for given deformation rates $\dot M$ and $\dot G$, the relative magnitudes of the terms $S_\backslash\cdot\dot M$ and $S_d\cdot\dot G$ can be altered by adjusting $K = F^{-1}G$, because of the formulas $S_\backslash = SK^*$ and $S_d = (\det K)S - SK^*$. In particular, for $K$ close to the identity $I$, we have $S_\backslash = S + O(K - I)$ and $S_d = O(K - I)$, and we expect that the term $S_\backslash\cdot\dot M$ dominates the term $S_d\cdot\dot G$ in the expression (8.2) for $\Upsilon$
as $K$ tends to the identity $I$. (The symbol $O(K - I)$ denotes a tensor whose norm is bounded above by a constant times the norm of $K - I$.) In order to understand this idea in more depth, it is enlightening to express the internal dissipation $\Upsilon$ in terms of the Cauchy stress $T$, the macroscopic deformation $F$ and its time-derivative $\dot F$, and the deformation without disarrangements $G$ and its derivative $\dot G$. In doing so, we employ (8.2) along with the formulas $S_\backslash = SK^*$ and $S_d = (\det K)S - SK^*$, and we find it convenient to suppress the arguments $X$, $t$, and $\chi(X,t)$ for the sake of simplicity of notation. We record the result here, omitting its routine derivation:
\[ (\det F)^{-1}\Upsilon = TH^*\cdot(\dot F F^{-1} - \dot G G^{-1}) + TH^*\cdot\dot G G^{-1}(H - I)^2, \tag{8.4} \]
where $H := GF^{-1}$ is the referential version of the tensor field $\tilde H$ appearing in Section 2.2 and, as usual, $H^* = (\det H)H^{-T}$. We note that the expression $TH^*\cdot\dot G G^{-1}(H - I)^2$ on the right-hand side of (8.4) is quadratic in $H - I$, while the first term $TH^*\cdot(\dot F F^{-1} - \dot G G^{-1})$ equals $T\cdot(\dot F F^{-1} - \dot G G^{-1})$ plus a term linear in $H - I$. In other words, we conclude from (8.4) that
\[ (\det F)^{-1}\Upsilon = TH^*\cdot(\dot F F^{-1} - \dot G G^{-1}) + O((H - I)^2) = T\cdot(\dot F F^{-1} - \dot G G^{-1}) + O(H - I). \tag{8.5} \]
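Since the derivation of (8.4) is omitted, a quick numerical cross-check may be helpful. The sketch below is only an illustration under stated assumptions: it takes $K^* = (\det K)K^{-T}$ (by analogy with $H^*$), uses the relation $T = (\det F)^{-1}SF^T$ between the Cauchy and Piola–Kirchhoff stresses, and evaluates both sides of (8.4) for randomly chosen tensors. All function and variable names are hypothetical helpers, not part of the theory.

```python
import random

def matmul(A, B):
    """Product of two 3x3 tensors stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    """Inverse via the cofactor (adjugate) formula."""
    d = det(A)
    cof = [[A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(3)] for i in range(3)]

def inner(A, B):
    """Tensor inner product A . B = tr(A B^T)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def random_tensor(rng, size):
    return [[rng.uniform(-size, size) for _ in range(3)] for _ in range(3)]

def dissipation_both_ways(F, G, F_dot, G_dot, T):
    """Return (det F)^{-1} * Upsilon computed from (8.2) and from (8.4)."""
    K = matmul(inv(F), G)
    K_star = scale(det(K), transpose(inv(K)))        # assumed: K* = (det K) K^{-T}
    S = scale(det(F), matmul(T, transpose(inv(F))))  # from T = (det F)^{-1} S F^T
    S_wo = matmul(S, K_star)                         # S_\ = S K*
    S_d = sub(scale(det(K), S), S_wo)                # S_d = (det K) S - S K*
    M_dot = sub(F_dot, G_dot)
    lhs = (inner(S_wo, M_dot) + inner(S_d, G_dot)) / det(F)

    H = matmul(G, inv(F))
    TH_star = matmul(T, scale(det(H), transpose(inv(H))))
    L_G = matmul(G_dot, inv(G))
    L_M = sub(matmul(F_dot, inv(F)), L_G)
    HmI = sub(H, IDENTITY)
    rhs = inner(TH_star, L_M) + inner(TH_star, matmul(L_G, matmul(HmI, HmI)))
    return lhs, rhs

rng = random.Random(0)
F = add(IDENTITY, random_tensor(rng, 0.2))   # diagonally dominant, hence invertible
G = add(IDENTITY, random_tensor(rng, 0.2))
lhs, rhs = dissipation_both_ways(F, G, random_tensor(rng, 1.0),
                                 random_tensor(rng, 1.0), random_tensor(rng, 1.0))
print(abs(lhs - rhs))
```

The identity is purely algebraic, so the check does not require the inequality $\det G \le \det F$; the two values should agree up to floating-point rounding.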
In order to relate the last formula for the internal dissipation to more familiar quantities, we note that the fields $L_G := \dot G G^{-1}$ and $L_M := \dot F F^{-1} - \dot G G^{-1}$ appeared in the study [13] of multiple slip in single crystals as the relative rate of deformation without disarrangements and the relative rate of deformation due to disarrangements, respectively. (In [13], the term “slip” replaced “disarrangement” because of the particular context of that study.) Moreover, the factorization (2.7) implies that the tensor field $T_\backslash := TH^*$ in (8.4) and (8.5) is analogous to $S_\backslash = SK^*$ and may be called the stress in the current configuration without disarrangements, a configuration macroscopically identical to the current configuration but containing none of the disarrangements associated with $\chi$ and $G$. Accordingly, $T_\backslash$ represents a stress without disarrangements. (In view of (3.3), the tensor field $T_d := (\det H)T - T_\backslash$ is the analogue of $S_d$ and represents a stress due to disarrangements.) Therefore, (8.5) may now be recast in the form
\[ (\det F)^{-1}\Upsilon = T_\backslash\cdot L_M + O((H - I)^2) = T\cdot L_M + O(H - I). \tag{8.6} \]
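The step from $T_\backslash\cdot L_M$ to $T\cdot L_M$ in (8.5) and (8.6) rests on the expansion of $H^* = (\det H)H^{-T}$ about the identity. A short sketch, writing $E := H - I$ (a symbol introduced here only for illustration) and assuming $|E|$ small:

```latex
\begin{align*}
\det H &= \det(I + E) = 1 + \operatorname{tr}E + O(|E|^2), \\
H^{-T} &= (I + E)^{-T} = I - E^{T} + O(|E|^2), \\
T_\backslash = T H^{*} &= T + T\bigl((\operatorname{tr}E)\,I - E^{T}\bigr) + O(|E|^2).
\end{align*}
```

Hence $T_\backslash\cdot L_M$ and $T\cdot L_M$ differ by a term linear in $H - I$, in agreement with the two approximation orders stated in (8.6).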
The tensor $H - I$ measures the disarrangements from the current configuration without disarrangements to the current configuration, and the decomposition (8.6) tells us that the quantities $T\cdot L_M$ and $T_\backslash\cdot L_M$ provide approximations to the internal dissipation to within, respectively, linear and quadratic terms in the disarrangements from the current configuration without disarrangements. This result places in perspective with respect to the present theory the frequent identification of the internal dissipation as an expression of the form $T\cdot L_M$ (sometimes called “plastic power” in phenomenological theories of plasticity).
9. Material Frame-Indifference

We consider here the transformation properties of the kinematical quantities associated with dynamical processes under changes of observer. These transformation properties can be obtained by replacing the motion $\chi$, and the approximating motions $\chi_n$ from the Approximation Theorem, by $(X,t) \mapsto r(\chi(X,t), t)$ and $(X,t) \mapsto r(\chi_n(X,t), t)$, where $r$ denotes a rigid motion $(X,t) \mapsto x_0(t) + Q(t)(X - X_0)$ with $Q(t)$ a proper orthogonal tensor. From this observation and the fact that $G = \lim_{n\to\infty}\nabla\chi_n$, we obtain the transformation rules
\[ F \mapsto QF, \qquad G \mapsto QG, \qquad M \mapsto QM, \qquad K \mapsto K, \]
\[ \dot F \mapsto Q\dot F + \dot Q F, \qquad \dot G \mapsto Q\dot G + \dot Q G, \qquad \dot M \mapsto Q\dot M + \dot Q M. \]
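The rules for $M$, $K$, and the time-derivatives follow from those for $F$ and $G$. A brief sketch, assuming the relations $M = F - G$ and $K = F^{-1}G$ used elsewhere in the text, together with $Q^TQ = I$:

```latex
\begin{align*}
M = F - G &\;\longmapsto\; QF - QG = QM, \\
K = F^{-1}G &\;\longmapsto\; (QF)^{-1}(QG) = F^{-1}Q^{T}QG = K, \\
\dot F &\;\longmapsto\; (QF)^{\cdot} = Q\dot F + \dot Q F .
\end{align*}
```

The rules for $\dot G$ and $\dot M$ are obtained in the same way as the rule for $\dot F$.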
In the present context of an elastic body undergoing disarrangements, we say that the response function $\tilde\Psi$ is frame-indifferent if, for all proper orthogonal tensors $Q$ and pairs $(M,G)$ with $0 < \det G \le \det(M+G)$, there holds
\[ \tilde\Psi(QM, QG) = \tilde\Psi(M, G), \tag{9.1} \]
or, equivalently,
\[ \Psi(QF, QG) = \Psi(F, G) \tag{9.2} \]
for all proper orthogonal tensors $Q$ and pairs $(F,G)$ with $0 < \det G \le \det F$. A useful characterization of this condition follows from the polar decompositions $F = R_F U_F$ and $G = R_G U_G$. Indeed, we may put $Q := R_G^T$ in (9.1) or (9.2), or $Q := R_F^T$ in (9.1), and use the relations $R_F^T = U_F^{-1}F^T$, $R_G^T = U_G^{-1}G^T$ to obtain for all pairs $(M,G)$, with $0 < \det G \le \det(M+G)$, the representations
\[ \tilde\Psi(M,G) = \tilde\Psi(U_G^{-1}G^T M,\, U_G) = \overset{\leftrightarrow}{\tilde\Psi}(G^T M,\, C_G), \tag{9.3} \]
\[ \Psi(F,G) = \Psi(U_G^{-1}G^T F,\, U_G) = \overset{\leftrightarrow}{\Psi}(G^T F,\, C_G), \tag{9.4} \]
\[ \Psi(F,G) = \Psi(U_F,\, U_F^{-1}F^T G) = \overset{\leftarrow}{\Psi}(C_F,\, F^T G), \tag{9.5} \]
where $C_F := F^T F$ and $C_G := G^T G$ are the right Cauchy–Green tensors for $F$ and $G$, respectively. Each one of these representations is both a necessary and a sufficient condition for the frame-indifference of the response functions $\tilde\Psi$ and $\Psi$ in the context of elastic bodies undergoing disarrangements.

A second characterization of the frame-indifference of the response function $\tilde\Psi$ follows by imposing (9.1) on smooth, time-parameterized families $t \mapsto Q(t)$
and $t \mapsto (M(t), G(t))$ and differentiating both sides of (9.1) with respect to $t$ to conclude that
\[ D_M\tilde\Psi(QM, QG)\cdot[\dot Q M + Q\dot M] + D_G\tilde\Psi(QM, QG)\cdot[\dot Q G + Q\dot G] = D_M\tilde\Psi(M, G)\cdot\dot M + D_G\tilde\Psi(M, G)\cdot\dot G. \tag{9.6} \]
Because the restriction (9.1) applies throughout the domain of $\tilde\Psi$, we may vary $\dot Q$, $\dot G$, and $\dot M$ independently (subject to the constraints $\mathrm{sym}(\dot Q Q^T) = 0$ and $0 < \det G \le \det(M+G)$) to conclude from the smoothness of $\tilde\Psi$ that
\[ D_M\tilde\Psi(QM, QG) = Q\,D_M\tilde\Psi(M, G), \tag{9.7} \]
\[ D_G\tilde\Psi(QM, QG) = Q\,D_G\tilde\Psi(M, G), \tag{9.8} \]
and
\[ \mathrm{sk}\bigl(D_M\tilde\Psi(M, G)M^T + D_G\tilde\Psi(M, G)G^T\bigr) = 0 \tag{9.9} \]
for all proper orthogonal tensors $Q$ and pairs $(M,G)$ with $0 < \det G \le \det(M+G)$. It is easy to verify that relations (9.7)–(9.9) imply that the response function $\tilde\Psi$ is frame-indifferent.

It is crucial to distinguish between, on the one hand, the smooth time-parameterized families $t \mapsto (M(t), G(t))$ used in establishing (9.6) and, on the other hand, the families $t \mapsto (M(X,t), G(X,t)) = (\nabla\chi(X,t) - G(X,t), G(X,t))$ arising from dynamical processes in the constitutive class $E_d$. In particular, the time derivatives of the former pairs can be varied arbitrarily (when $\det G(t) < \det(M(t) + G(t))$), while those of the latter pairs cannot, as we observed near the end of Section 7.

We say that the mixed power $S_\backslash\cdot\dot M + S_d\cdot\dot G = (\det K)D_G\tilde\Psi(M,G)\cdot\dot M + (\det K)D_M\tilde\Psi(M,G)\cdot\dot G$ is frame-indifferent if, for all smooth, time-parameterized families $t \mapsto Q(t)$ and for all families $t \mapsto (M(X,t), G(X,t))$ arising from dynamical processes in $E_d$, there holds
\[ (\det K)D_G\tilde\Psi(M,G)\cdot\dot M + (\det K)D_M\tilde\Psi(M,G)\cdot\dot G = (\det K)D_G\tilde\Psi(QM,QG)\cdot(QM)^{\cdot} + (\det K)D_M\tilde\Psi(QM,QG)\cdot(QG)^{\cdot}. \tag{9.10} \]
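Conditions (9.1) and (9.9) can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the energy `psi_tilde` is a hypothetical function chosen only because it depends on $(M,G)$ through $G^TM$ and $C_G = G^TG$, as in the representation (9.3); derivatives are approximated by central differences, and every name in the code is an illustrative helper.

```python
import math

def matmul(A, B):
    """Product of two 3x3 tensors stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def inner(A, B):
    """Tensor inner product A . B = tr(A B^T)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def psi_tilde(M, G):
    """Hypothetical energy depending on (M, G) only through G^T M and G^T G."""
    A = matmul(transpose(G), M)
    C = matmul(transpose(G), G)
    return inner(A, A) + 0.5 * inner(C, C) + A[0][0] + A[1][1] + A[2][2]

def gradient(f, A, h=1e-6):
    """Central-difference gradient of a scalar function of a 3x3 tensor."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in A]
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2.0 * h)
    return grad

# Sample pair (M, G) with 0 < det G <= det(M + G).
M = [[0.10, 0.02, 0.00], [0.00, 0.05, 0.03], [0.01, 0.00, 0.08]]
G = [[1.00, 0.10, 0.00], [0.00, 0.90, 0.20], [0.10, 0.00, 1.10]]

# (9.1): the energy is unchanged by a rotation Q (here about the third axis).
c, s = math.cos(0.7), math.sin(0.7)
Q = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
invariance_gap = abs(psi_tilde(matmul(Q, M), matmul(Q, G)) - psi_tilde(M, G))

# (9.9): the skew part of D_M psi_tilde M^T + D_G psi_tilde G^T vanishes.
D_M = gradient(lambda A: psi_tilde(A, G), M)
D_G = gradient(lambda B: psi_tilde(M, B), G)
W = [[x + y for x, y in zip(r1, r2)]
     for r1, r2 in zip(matmul(D_M, transpose(M)), matmul(D_G, transpose(G)))]
skew_gap = max(abs(W[i][j] - W[j][i]) / 2.0 for i in range(3) for j in range(3))
print(invariance_gap, skew_gap)
```

Both gaps should be zero up to rounding and finite-difference error. Note that this checks only the restriction on the response function; the condition (9.11) on the mixed power is a restriction on dynamical processes and is not tested here.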
This condition amounts to the assertion that the mixed power is invariant under superpositions of rigid motions on dynamical processes. We now show that, given the frame-indifference of the response function $\tilde\Psi$, the mixed power is frame-indifferent if and only if
\[ \mathrm{sk}\bigl(D_G\tilde\Psi(M, G)M^T + D_M\tilde\Psi(M, G)G^T\bigr) = 0 \tag{9.11} \]
for all pairs $(M,G)$ arising from dynamical processes in $E_d$. In fact, expansion of the derivatives on the right-hand side of (9.10) tells us that (9.10) is equivalent to the relation
\[ \bigl(D_G\tilde\Psi(M,G) - Q^T D_G\tilde\Psi(QM,QG)\bigr)\cdot\dot M + \bigl(D_M\tilde\Psi(M,G) - Q^T D_M\tilde\Psi(QM,QG)\bigr)\cdot\dot G = \bigl(D_G\tilde\Psi(QM,QG)M^T + D_M\tilde\Psi(QM,QG)G^T\bigr)\cdot\dot Q. \]
Given the frame-indifference of $\tilde\Psi$, we conclude from (9.7) and (9.8) that the previous relation is equivalent to
\[ 0 = \bigl(D_G\tilde\Psi(M,G)M^T + D_M\tilde\Psi(M,G)G^T\bigr)\cdot Q^T\dot Q, \]
and the relation $\mathrm{sym}(Q^T\dot Q) = 0$ along with the arbitrariness of $t \mapsto Q(t)$ provides the asserted characterization of frame-indifference of the mixed power.

Our main result on material frame-indifference is a generalization of a result of Noll [22] in classical elasticity: if both the free energy response function and the mixed power are frame-indifferent, then the balance of angular momentum (4.3) is satisfied for all dynamical processes in $E_d$. (Noll actually showed that the frame-indifference of the free energy response is equivalent to the law of balance of angular momentum in the classical context.) Indeed, if we add (9.11) to (9.9) we conclude that
\[ \mathrm{sk}\bigl(D_M\tilde\Psi(M, G)F^T + D_G\tilde\Psi(M, G)F^T\bigr) = 0 \tag{9.12} \]
for all pairs $(M,G)$ arising from dynamical processes in $E_d$. For each dynamical process, the constitutive relations (7.12), (7.13), and the decomposition (3.5) may be applied to yield (4.3), the law of balance of angular momentum. This main result permits us to impose the law of balance of angular momentum indirectly by requiring that both the free energy response function and the mixed power be frame-indifferent, a requirement that we impose from now on through the relations (9.1) and (9.11). In these considerations, it is important to remember that (9.11) is a restriction on dynamical processes, while (9.1) is a restriction on the free energy response function. Moreover, our main result and the imposition of (9.1) and (9.11) permit us to omit the law of balance of angular momentum among the field equations that we provide in the next section. We note from relations (7.12) and (7.13) that (9.11) implies
\[ \mathrm{sk}(S_\backslash M^T + S_d G^T) = 0 \tag{9.13} \]
on all dynamical processes in the constitutive class $E_d$, and relation (3.8) then implies also that
\[ \mathrm{sk}(S_d M^T) = 0. \tag{9.14} \]
We conclude from frame-indifference as realized in (9.1) and (9.11) that:
(i) $\mathrm{sk}(SF^T) = 0$,
(ii) $\mathrm{sk}(S_d M^T) = 0$,
(iii) $\mathrm{sk}(S_\backslash G^T) = 0$,
(iv) $\mathrm{sk}(S_\backslash M^T + S_d G^T) = 0$.
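Item (i) is the balance of angular momentum obtained by adding (9.9) and (9.11). The step can be sketched as follows, assuming that the decomposition (3.5) reads $(\det K)S = S_\backslash + S_d$, so that $S = D_M\tilde\Psi + D_G\tilde\Psi$ by (7.12) and (7.13), and using $F = M + G$ from (3.4):

```latex
\begin{align*}
0 &= \operatorname{sk}\bigl(D_M\tilde\Psi\,M^T + D_G\tilde\Psi\,G^T\bigr)
   + \operatorname{sk}\bigl(D_G\tilde\Psi\,M^T + D_M\tilde\Psi\,G^T\bigr) \\
  &= \operatorname{sk}\bigl((D_M\tilde\Psi + D_G\tilde\Psi)(M + G)^T\bigr)
   = \operatorname{sk}\bigl((D_M\tilde\Psi + D_G\tilde\Psi)F^T\bigr)
   = \operatorname{sk}(S F^T).
\end{align*}
```

Items (ii) and (iv) restate (9.14) and (9.13), and item (iii) then follows by subtracting (ii) and (iv) from $(\det K)\,\mathrm{sk}(SF^T) = \mathrm{sk}\bigl((S_\backslash + S_d)(M + G)^T\bigr) = 0$.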
Thus, each of the “pure” moment densities $\mathrm{sk}(S_d M^T)$ and $\mathrm{sk}(S_\backslash G^T)$ is self-equilibrated, while the mixed moment densities $\mathrm{sk}(S_\backslash M^T)$ and $\mathrm{sk}(S_d G^T)$ add to zero. Moreover, by (ii) and by the identification relation (5.1), the traction due to disarrangements $S_d(X,t)\nu(Y)$ and the geometrical offset $[\chi_n](Y,t)$ are collinear on average.

It is useful to record the forms that relations (9.1) and (9.11) assume when the response function $(M,G) \mapsto \tilde\Psi(M,G)$ is replaced by $(F,G) \mapsto \Psi(F,G) = \tilde\Psi(F - G, G)$. Of course, (9.1) is replaced by (9.2), a restriction on the response function $\Psi$. In view of (3.4), (7.5), and (7.11), the relation (9.11) is equivalent to
\[ \mathrm{sk}\bigl(D_F\Psi(F, G)F^T + D_G\Psi(F, G)(F^T - G^T)\bigr) = 0, \tag{9.15} \]
a restriction on dynamical processes. Henceforth, when using $F$ and $G$ as the arguments of the free energy, we assume that (9.2) and (9.15) are satisfied, the first throughout the domain of $\Psi$ and the second on all dynamical processes in $E_d$.

10. Field Relations

Our analysis in Sections 7–9 has led us to the specification of one response function $(M,G) \mapsto \tilde\Psi(M,G)$ satisfying relations (9.1) and (9.11), the former throughout the domain of $\tilde\Psi$ and the latter on all dynamical processes in $E_d$. Given the body force field $b_{\mathrm{ref}}$ in the reference configuration, the remaining relations employed in deriving the field relations are restrictions on dynamical processes: the balance of linear momentum (4.2), the constitutive relations (7.11)–(7.13), the mixed power inequality (7.14), and the consistency relation (7.15). As demonstrated in Section 9, the law of balance of angular momentum is a consequence of the assumptions of frame-indifference (9.1) and (9.11). We now are in a position to record and derive from (9.1), (9.11), (4.2), and (7.11)–(7.15) the field relations for an elastic body undergoing disarrangements:
\[ \mathrm{div}(D_M\tilde\Psi + D_G\tilde\Psi) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{10.1} \]
\[ D_G\tilde\Psi\,(K^{-T} - I) + D_M\tilde\Psi\,K^{-T} = 0, \tag{10.2} \]
\[ \mathrm{sk}(D_G\tilde\Psi\,M^T + D_M\tilde\Psi\,G^T) = 0, \tag{10.3} \]
\[ D_G\tilde\Psi\cdot\dot M + D_M\tilde\Psi\cdot\dot G \ge 0, \tag{10.4} \]
\[ \det(G + M) \ge \det G > m > 0, \tag{10.5} \]
where, as in (3.4), $\nabla\chi = G + M$, and, as in (2.12), $m$ is a positive number depending upon $t$ alone. (These are relations (1.1)–(1.5), with arguments omitted
for the sake of conciseness.) The law of balance of linear momentum (10.1), the consistency relation (10.2), and the frame-indifference of the mixed power (10.3) amount to 3 + 6 + 3 = 12 scalar equations for the unknowns $\chi$ and $G$, having a total of 12 scalar components. (That the consistency relation amounts to only 6 scalar equations follows from the fact that both sides of the original consistency relation (3.6), when multiplied by $F^T$, are symmetric tensors.) The inequalities (10.4) and (10.5) further restrict the dynamical processes satisfying (10.1), (10.2), and (10.3). We emphasize that, given the body force field $b_{\mathrm{ref}}$, the field relations are restrictions on dynamical processes, while the relation (9.1) is a restriction on the response function $\tilde\Psi$. We must also keep in mind that $\tilde\Psi$ and its derivatives depend not only upon $M(X,t)$ and $G(X,t)$ but may also depend upon the material point $X$ itself, and we make this dependence explicit when needed for clarity. For example, the first term on the left-hand side of the equation of balance of linear momentum (10.1) is the field
\[ (X,t) \mapsto \mathrm{div}_X\bigl[D_M\tilde\Psi(\nabla\chi(X,t) - G(X,t),\, G(X,t),\, X) + D_G\tilde\Psi(\nabla\chi(X,t) - G(X,t),\, G(X,t),\, X)\bigr]. \tag{10.6} \]
The field relations follow readily from (9.1), (9.11), (4.2), (7.11)–(7.15), and (2.12). The law of balance of linear momentum is a consequence of its counterpart (4.2), of the constitutive relations (7.12) and (7.13) for the stresses without and with disarrangements, and of the decomposition (3.5); the consistency relation (10.2) is the relation (7.15), rewritten with trivial algebraic changes; (10.3) is (9.11), and the mixed power inequality (10.4) is (7.14) with $S_\backslash$ and $S_d$ replaced by the expressions in the formulas (7.12) and (7.13). Moreover, by the decomposition (3.5) and the constitutive relations (7.12), (7.13), one has the stress relation
\[ S(X,t) = D_M\tilde\Psi(M(X,t), G(X,t)) + D_G\tilde\Psi(M(X,t), G(X,t)) \tag{10.7} \]
valid for all dynamical processes in $E_d$.

When the motion $\chi$ and $G$ determine a classical motion, i.e., $G = \nabla\chi$, then the relations $M = 0$, $K = I$, and $\det K = 1$ tell us that the balance of linear momentum (10.1), the consistency relation (10.2), and the inequality (10.5) become
\[ \mathrm{div}\,D_G\tilde\Psi(0, \nabla\chi) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{10.8} \]
\[ D_M\tilde\Psi(0, \nabla\chi) = 0, \tag{10.9} \]
\[ \det\nabla\chi > m > 0. \tag{10.10} \]
The remaining relations (10.3) and (10.4) are satisfied identically in view of (10.9).

In some applications, it is easier to use the field relations when they are expressed in terms of the response function $(F,G) \mapsto \Psi(F,G) = \tilde\Psi(F - G, G)$. In
this case, the response function $\Psi$ is assumed to satisfy (9.2), and the field relations (10.1)–(10.5) become
\[ \mathrm{div}(2D_F\Psi + D_G\Psi) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{10.11} \]
\[ D_F\Psi\,(2K^{-T} - I) + D_G\Psi\,(K^{-T} - I) = 0, \tag{10.12} \]
\[ \mathrm{sk}\bigl((D_F\Psi + D_G\Psi)F^T - D_G\Psi\,G^T\bigr) = 0, \tag{10.13} \]
\[ (D_F\Psi + D_G\Psi)\cdot\dot F - D_G\Psi\cdot\dot G \ge 0, \tag{10.14} \]
\[ \det F \ge \det G > m > 0, \tag{10.15} \]
respectively, while the stress relation (10.7) becomes
\[ S(X,t) = 2D_F\Psi(F(X,t), G(X,t)) + D_G\Psi(F(X,t), G(X,t)). \tag{10.16} \]
Corresponding to the expression (10.6), the first term on the left-hand side of the equation of balance of linear momentum (10.11) is the field
\[ (X,t) \mapsto \mathrm{div}_X\bigl[2D_F\Psi(\nabla\chi(X,t),\, G(X,t),\, X) + D_G\Psi(\nabla\chi(X,t),\, G(X,t),\, X)\bigr]. \tag{10.17} \]
It is convenient to record for future use the following formulas for the stresses with and without disarrangements in terms of $\Psi$ (omitting $X$ and $t$ for the sake of brevity):
\[ S_\backslash = (\det K)\bigl(D_F\Psi(F, G) + D_G\Psi(F, G)\bigr), \tag{10.18} \]
\[ S_d = (\det K)\,D_F\Psi(F, G). \tag{10.19} \]
In view of the significance of the purely submacroscopic factor $(i, K)$ in (2.6), some applications become more accessible if one employs the response function
\[ (F, K) \mapsto \bar\Psi(F, K) := \Psi(F, FK) \tag{10.20} \]
with domain the set of pairs $(F, K)$ satisfying $0 < \det F$ and $0 < \det K \le 1$. The relations
\[ D_F\Psi(F, FK) = D_F\bar\Psi(F, K) - F^{-T}D_K\bar\Psi(F, K)K^T \tag{10.21} \]
and
\[ D_G\Psi(F, FK) = F^{-T}D_K\bar\Psi(F, K) \tag{10.22} \]
permit us to express the field relations in terms of this choice of variables:
\[ \mathrm{div}\bigl[2D_F\bar\Psi + F^{-T}D_K\bar\Psi\,(I - 2K^T)\bigr] + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{10.23} \]
\[ D_F\bar\Psi\,(2K^{-T} - I) + F^{-T}D_K\bar\Psi\,K^T\bigl(\{K^{-T}\}^2 - 3K^{-T} + I\bigr) = 0, \tag{10.24} \]
\[ \mathrm{sk}\bigl(D_F\bar\Psi\,F^T + F^{-T}D_K\bar\Psi\,(I - 2K^T)F^T\bigr) = 0, \tag{10.25} \]
\[ \bigl(D_F\bar\Psi + F^{-T}D_K\bar\Psi\,(I - 2K^T)\bigr)\cdot\dot F - D_K\bar\Psi\cdot\dot K \ge 0, \tag{10.26} \]
\[ 1 \ge \det K > 0. \tag{10.27} \]
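The change-of-variables formulas (10.21) and (10.22), on which (10.23)–(10.26) rest, can be checked by finite differences. The following sketch is an illustration only: the quadratic energy `psi` is a hypothetical stand-in for the free energy response function, `psi_bar` denotes the function defined in (10.20), and all helper names are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two 3x3 tensors stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def inv(A):
    """Inverse via the cofactor (adjugate) formula."""
    d = det(A)
    cof = [[A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def inner(A, B):
    """Tensor inner product A . B = tr(A B^T)."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def psi(F, G):
    """Hypothetical smooth free energy, chosen only for illustration."""
    return 0.5 * inner(F, F) + inner(F, G) + inner(G, G)

def psi_bar(F, K):
    """Response function in the variables (F, K), as in (10.20)."""
    return psi(F, matmul(F, K))

def gradient(f, A, h=1e-6):
    """Central-difference gradient of a scalar function of a 3x3 tensor."""
    grad = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            plus = [row[:] for row in A]
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus) - f(minus)) / (2.0 * h)
    return grad

def max_gap(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# Sample pair (F, K) with det F > 0 and 0 < det K <= 1.
F = [[1.10, 0.10, 0.00], [0.00, 0.90, 0.20], [0.10, 0.00, 1.00]]
K = [[0.90, 0.05, 0.00], [0.00, 0.95, 0.10], [0.02, 0.00, 0.90]]
G = matmul(F, K)

D_F_psi = gradient(lambda A: psi(A, G), F)      # D_F Psi at (F, FK)
D_G_psi = gradient(lambda B: psi(F, B), G)      # D_G Psi at (F, FK)
D_F_bar = gradient(lambda A: psi_bar(A, K), F)
D_K_bar = gradient(lambda B: psi_bar(F, B), K)

rhs_22 = matmul(transpose(inv(F)), D_K_bar)     # right-hand side of (10.22)
correction = matmul(rhs_22, transpose(K))
rhs_21 = [[D_F_bar[i][j] - correction[i][j] for j in range(3)]
          for i in range(3)]                    # right-hand side of (10.21)

gap_21 = max_gap(D_F_psi, rhs_21)
gap_22 = max_gap(D_G_psi, rhs_22)
print(gap_21, gap_22)
```

Both gaps should be small (limited only by rounding and by the finite-difference step), whatever smooth energy is substituted for the illustrative choice above.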
In addition, the stress relation (10.16) becomes
\[ S(X,t) = 2D_F\bar\Psi(F(X,t), K(X,t)) + F^{-T}(X,t)\,D_K\bar\Psi(F(X,t), K(X,t))\,(I - 2K^T(X,t)). \tag{10.28} \]
It also is convenient to record for future use the following counterparts of (10.18) and (10.19):
\[ S_\backslash = (\det K)\bigl[D_F\bar\Psi(F, K) + F^{-T}D_K\bar\Psi(F, K)(I - K^T)\bigr], \tag{10.29} \]
\[ S_d = (\det K)\bigl[D_F\bar\Psi(F, K) - F^{-T}D_K\bar\Psi(F, K)K^T\bigr]. \tag{10.30} \]
11. Submacroscopic Stability

Suppose that, for a given tensor $F_0$, there is a tensor $G_0$ (with $0 < \det G_0 \le \det F_0$) that provides a local minimum for $G \mapsto \Psi(F_0, G)$, the Helmholtz free energy at the given macroscopic deformation gradient $F_0$. In this case, we say that the pair $(F_0, G_0)$ is submacroscopically stable. If the pair $(F_0, G_0)$ is submacroscopically stable and, in addition, $\det G_0 < \det F_0$, then $D_G\Psi(F_0, G_0) = 0$. By (7.12), (7.13), (7.3), (7.4), and (3.5), there holds
\[ S_\backslash(X,t) = S_d(X,t) = \tfrac{1}{2}(\det K(X,t))\,S(X,t) \tag{11.1} \]
for all dynamical processes in $E_d$ and pairs $(X,t)$ satisfying $F(X,t) = F_0$ and $G(X,t) = G_0$. By the first formula in (8.2), we may conclude that dynamical processes in $E_d$ through submacroscopically stable pairs $(F_0, G_0)$ with $\det G_0 < \det F_0$ proceed with internal dissipation $\Upsilon$ given by
\[ \Upsilon(X,t) = \tfrac{1}{2}(\det K(X,t))\,S(X,t)\cdot\dot F(X,t), \tag{11.2} \]
one half the stress-power in the virgin configuration. The stress-power in such submacroscopically stable processes thus is partitioned equally between energy stored and dissipated. This result on equipartition of the stress-power does not apply to classical dynamical processes, because the relation $F(X,t) = G(X,t)$ for classical processes means that the strict inequality $\det G_0 < \det F_0$, assumed in the derivation of (11.2), would be violated.

We shall provide now information about arbitrary dynamical processes that encounter a submacroscopically stable pair $(F_0, G_0)$ at $(X,t)$. To this end, we observe from (2.12) that, for a submacroscopically stable pair $(F_0, G_0)$, the tensor $G_0$ is a solution of the constrained minimization problem: minimize $G \mapsto \Psi(F_0, G)$ subject to the constraint $\det G \le \det F_0$. The Kuhn–Tucker theorem ([23], p. 314) implies that, corresponding to the given solution $G_0$, there is a number $\lambda \ge 0$ for which
\[ D_G\Psi(F_0, G_0) + \lambda(\det G_0)G_0^{-T} = 0. \tag{11.3} \]
Again, by (11.3), (7.12), (7.13), (7.3), (7.4), and (3.5), we have
\[ -\lambda(\det G_0)(\det K(X,t))\,G_0^{-T} = (\det K(X,t))\,D_G\Psi(F_0, G_0) = S_\backslash(X,t) - S_d(X,t) = (\det K(X,t))\,S(X,t)\bigl(2K^{-T}(X,t) - I\bigr), \tag{11.4} \]
for a dynamical process and a pair $(X,t)$ satisfying $F(X,t) = F_0$ and $G(X,t) = G_0$. Because the Cauchy stress and the Piola–Kirchhoff stress are related by $T = (\det F)^{-1}SF^T$, relation (11.4) implies the formula
\[ T(X,t)\,(2I - H_0^T) = -\lambda I, \tag{11.5} \]
with $H_0 := G_0F_0^{-1}$, corresponding to the field $\tilde H$ defined below (2.7). For the special case when the submacroscopically stable pair $(F_0, G_0)$ corresponds to a classical deformation, i.e., $F_0 = G_0$, (11.5) reduces to the relation
\[ T(X,t) = -\lambda I. \tag{11.6} \]
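The two middle equalities in (11.4) can be traced as follows. This is a sketch under the assumptions that $K^* = (\det K)K^{-T}$ (by analogy with $H^*$) and that the formulas $S_\backslash = SK^*$ and $S_d = (\det K)S - SK^*$ of Section 8 apply; the first line uses (7.12), (7.13), (7.3), (7.4), and (11.3):

```latex
\begin{align*}
S_\backslash - S_d
 &= (\det K)\bigl(D_G\tilde\Psi - D_M\tilde\Psi\bigr)
  = (\det K)\,D_G\Psi(F_0, G_0)
  = -\lambda(\det G_0)(\det K)\,G_0^{-T}, \\
S_\backslash - S_d
 &= SK^{*} - \bigl((\det K)S - SK^{*}\bigr)
  = 2(\det K)\,SK^{-T} - (\det K)\,S
  = (\det K)\,S\,\bigl(2K^{-T} - I\bigr).
\end{align*}
```

For the classical case, $H_0 = I$ gives $2I - H_0^T = I$ in (11.5), which is how (11.6) arises.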
In other words, a submacroscopically stable pair $(F_0, F_0)$ can arise in a classical dynamical process in $E_d$ only if the corresponding Cauchy stress is a hydrostatic pressure. For general submacroscopically stable pairs $(F_0, G_0)$ encountered in dynamical processes, the relation (11.5) may be written in the equivalent form
\[ T_\backslash - T_d = -\lambda H_0^* \tag{11.7} \]
with $T_\backslash$ and $T_d$ the stresses without and due to disarrangements with respect to the current configuration defined in Section 8. This result applies even in cases where $\det G_0 < \det F_0$. An equivalent form of (11.7) is obtained from (11.4) and reads
\[ S_\backslash - S_d = -\lambda(\det K_0)\,G_0^*, \tag{11.8} \]
where $(F_0, G_0)$ is a submacroscopically stable pair and $K_0 = F_0^{-1}G_0$.

The results in the previous paragraphs show that submacroscopic stability leads to hydrostatic states of stress for classical dynamical processes in $E_d$, but not necessarily for non-classical dynamical processes in $E_d$, and describe the states of stress encountered at arbitrary submacroscopically stable pairs. In the elastostatics of crystals, the conclusion that equilibrium leads to hydrostatic states of stress has been verified in several contexts ([24–27]). We note that the conclusions of Ericksen [24], of Chipot and Kinderlehrer [25], and of Fonseca and Parry [26] were based in part on the symmetries of the crystals they considered, whereas the results of the previous paragraphs are independent of the notion of material symmetry, a concept that we study in the next section. For isotropic elastic bodies, Mizel’s results on the energetics of fractured states [28] foreshadow our results on submacroscopic stability.
12. Material Symmetry

For each point $X_0$ in the region $A$ undergoing a dynamical process, we consider the transformation properties of the kinematical quantities at that point under a change of virgin configuration determined by a given unimodular tensor $H_0$. These transformation properties can be obtained by replacing the time-parameterized family of structured deformations $(\chi, G)$ by the composition
\[ (X,t) \mapsto \bigl((\chi, G)\circ(\xi_{H_0}, H_0)\bigr)(X,t) = \bigl(\chi(\xi_{H_0}(X,t), t),\; G(\xi_{H_0}(X,t), t)H_0\bigr), \tag{12.1} \]
where $\xi_{H_0}$ denotes the homogeneous, time-independent deformation $(X,t) \mapsto X_0 + H_0(X - X_0)$. From this observation we obtain the following transformation rules
\[ F \mapsto FH_0, \qquad G \mapsto GH_0, \qquad M \mapsto MH_0, \qquad K \mapsto H_0^{-1}KH_0 \tag{12.2} \]
under change of virgin configuration. In this display, if a quantity on the left is evaluated at $(X,t)$, the corresponding quantity on the right is evaluated at $(X_0 + H_0(X - X_0), t)$. We say that $H_0$ is a symmetry transformation at $X_0$ with respect to changes of virgin configuration for the elastic body undergoing disarrangements if the response function $(F,G) \mapsto \Psi(F, G, X_0)$ satisfies
\[ \Psi(FH_0, GH_0, X_0) = \Psi(F, G, X_0) \tag{12.3} \]
for all $(F,G)$ with $0 < \det G \le \det F$ or, equivalently, if the response function $(M,G) \mapsto \tilde\Psi(M, G, X_0)$ satisfies
\[ \tilde\Psi(MH_0, GH_0, X_0) = \tilde\Psi(M, G, X_0) \tag{12.4} \]
for all $(M,G)$ with $0 < \det G \le \det(M+G)$. As in elasticity without disarrangements, the symmetry transformations at $X_0$ form a group $\mathcal{G}^{\mathrm{virgin}}_{X_0}$.

In the special case when $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ is the proper orthogonal group, it is easy to obtain necessary and sufficient conditions that (12.3) holds for all $H_0 \in \mathcal{G}^{\mathrm{virgin}}_{X_0}$. Indeed, we can choose $H_0$ to be $R_F^T$ or $R_G^T$, from the polar decompositions $F = V_FR_F$ and $G = V_GR_G$, to obtain
\[ \Psi(F, G, X_0) = \Psi(V_FR_FR_F^T,\, GR_F^T,\, X_0) = \Psi(V_FR_FR_F^T,\, GF^TV_F^{-1},\, X_0) = \Psi(B_F^{1/2},\, GF^TB_F^{-1/2},\, X_0) = \hat\Psi(B_F,\, GF^T,\, X_0), \tag{12.5} \]
or, alternatively,
\[ \Psi(F, G, X_0) = \check\Psi(FG^T,\, B_G,\, X_0). \tag{12.6} \]
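That a representation of the form (12.5) suffices for invariance under every proper orthogonal $H_0 = Q$ can be seen directly. As a short sketch, with $B_F = FF^T$ and using only $QQ^T = I$ and the transformation rules (12.2):

```latex
\begin{align*}
B_{FQ} &= (FQ)(FQ)^T = F\,QQ^T F^T = B_F, \\
(GQ)(FQ)^T &= G\,QQ^T F^T = GF^T,
\end{align*}
```

so that $\hat\Psi(B_{FQ}, (GQ)(FQ)^T, X_0) = \hat\Psi(B_F, GF^T, X_0)$, which is (12.3) with $H_0 = Q$. The same computation with the roles of $F$ and $G$ interchanged applies to (12.6).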
Here, $B_F = FF^T$ and $B_G = GG^T$ are the left Cauchy–Green tensors for $F$ and $G$. The existence of a function $\hat\Psi$ such that (12.5) holds for all $(F,G)$ with $0 < \det G \le \det F$ (or, equivalently, of $\check\Psi$ such that (12.6) holds for all such pairs) is a necessary and sufficient condition that $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ be the proper orthogonal group. Similarly, the existence of a function $(M,G) \mapsto \Psi^{\#}(M, G, X_0)$ such that
\[ \tilde\Psi(M, G, X_0) = \Psi^{\#}(MG^T,\, B_G,\, X_0) \tag{12.7} \]
for all $(M,G)$ with $0 < \det G \le \det(M+G)$ is both necessary and sufficient for $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ to be the proper orthogonal group.

In the case when $\mathcal{G}^{\mathrm{virgin}}_{X_0}$ is the proper unimodular group, we may put $H_0 = (\det F)^{1/3}F^{-1}$ or $H_0 = (\det G)^{1/3}G^{-1}$ to conclude that the Helmholtz free energy can be expressed as a function of the pair $(\det F, H)$ or, equivalently, in terms of $(\det G, H)$ with, as usual, $H = GF^{-1}$. In Section 14, we will consider the special case where the Helmholtz free energy depends on the volume fraction $\det K = \det G/\det F = \det H$ alone.

Alternatively, a notion of material symmetry may be formulated in terms of invariance of response to changes in reference configuration. For each point $X_0$ in the region $A$ undergoing a dynamical process, we consider the transformation properties of the kinematical quantities at that point obtained first by factoring $(\chi, G)$ via the notion of composition introduced in (2.5),
\[ (\chi, G) = (\chi, \nabla\chi)\circ(\pi, \nabla\chi^{-1}G). \tag{12.8} \]
Here, $\pi(X,t) = X$ for all $X$ and $t$. The factor $(\chi, \nabla\chi)$ on the right-hand side of (12.8) is a family of classical deformations, while $(\pi, \nabla\chi^{-1}G)$ involves only purely submacroscopic deformations, because $\pi$ leaves each point fixed. We next replace the expression on the right-hand side of (12.8) by
\[ \bigl((\chi, \nabla\chi)\circ(\xi_{H_0}, H_0)\bigr)\circ(\pi, \nabla\chi^{-1}G), \tag{12.9} \]
where, as above, $\xi_{H_0}$ denotes the homogeneous, time-independent deformation $(X,t) \mapsto X_0 + H_0(X - X_0)$. This replacement leaves the purely submacroscopic factor $(\pi, \nabla\chi^{-1}G)$ unchanged and changes only the classical factor $(\chi, \nabla\chi)$. From this replacement we obtain the following transformation rules
\[ F \mapsto FH_0, \qquad G \mapsto (FH_0F^{-1})G, \qquad M \mapsto (FH_0F^{-1})M, \qquad K \mapsto K \tag{12.10} \]
under change of reference configuration. In this display, if a quantity on the left is evaluated at (X, t), the quantity on the right is evaluated at (X0 + H0 (X − X0 ), t). For a pair (H0 , K0 ), with 0 < det K0 1 = det H0 , we say that H0 is a symmetry transformation at X0 for K0 with respect to changes of reference configuration for the elastic body undergoing disarrangements if the response function F → (F, F K0 , X0 ) satisfies (F H0 , F H0 K0 , X0 ) = (F, F K0 , X0 )
(12.11)
LUCA DESERI AND DAVID R. OWEN
for all tensors $F$ with $0 < \det F$. Equivalently, we may use the definition in Section 10, $\bar\Psi(F, K, X_0) := \Psi(F, F K, X_0)$ for all pairs $(F, K)$ with $0 < \det F$ and $0 < \det K \le 1$, to write (12.11) in the simpler form

$$\bar\Psi(F H_0, K_0, X_0) = \bar\Psi(F, K_0, X_0) \tag{12.12}$$

for all tensors $F$ with $0 < \det F$. We denote by $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ the group formed by the symmetry transformations at $X_0$ for $K_0$. The symmetry group $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ defined through (12.11) or (12.12) corresponds to the usual symmetry group of an elastic body undergoing only classical deformations, because the influence of disarrangements is removed by fixing the value of $K = F^{-1} G$ at $K_0$.

We remark that there is a notion of invariance dual to (12.12):

$$\bar\Psi(F, K P_0, X_0) = \bar\Psi(F, K, X_0) \tag{12.13}$$

for all tensors $F$ and $K$ with $0 < \det F$ and $0 < \det K \le 1$, with $P_0$ a given unimodular tensor. This invariance arises by replacing the right-hand side of the factorization (12.8) by

$$(\chi, \nabla\chi) \circ ((\pi, \nabla\chi^{-1} G) \circ (\pi, P_0)). \tag{12.14}$$

The purely submacroscopic factor $(\pi, P_0)$ alters the given one $(\pi, \nabla\chi^{-1} G)$ without changing the classical factor $(\chi, \nabla\chi)$. The resulting symmetry group $\mathcal{G}^{\mathrm{submac}}_{X_0}$ resembles a notion introduced by Šilhavý and Kratochvíl [29], in the context of Noll’s new theory of simple materials [30], and adapted by Bertram [31] to formulate and solve problems in the plasticity of materials undergoing large deformations. Moreover, $\mathcal{G}^{\mathrm{submac}}_{X_0}$ is obtained by means of non-classical changes $(\pi, P_0)$ in configuration, as distinct from the groups $\mathcal{G}^{\mathrm{ref}}_{X_0,K_0}$ and $\mathcal{G}^{\mathrm{virgin}}_{X_0}$, obtained via the classical changes $(\xi_{H_0}, H_0)$.

A case that merits further study consists of the assumption that, during a dynamical process of an elastic body undergoing disarrangements, there holds

$$K(X, t) \in \mathcal{G}^{\mathrm{submac}}_{X} \tag{12.15}$$
for all $(X, t)$. It is easy to show that

$$D_F\bar\Psi(F, K(X, t), X) = D_F\bar\Psi(F, I, X) \tag{12.16}$$

and

$$D_K\bar\Psi(F, K(X, t), X)\, K(X, t)^T = D_K\bar\Psi(F, I, X) \tag{12.17}$$

for all $F$ with $\det F > 0$ and for all dynamical processes satisfying (12.15). These relations should prove to be useful, because they restrict substantially the manner in which the field $(X, t) \mapsto K(X, t)$ can appear in the field relations (10.23)–(10.26). In other words, the field relations simplify significantly when the disarrangements embodied in $K$ correspond to submacroscopic symmetries of the elastic body.
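The step from (12.13) and (12.15) to (12.16) and (12.17) can be sketched as follows. This is only a formal outline: it assumes that the response function of (12.12)–(12.13) is written $\bar\Psi$, that it is differentiable in $K$ at $K = I$, and that $A$ denotes an arbitrary tensor increment (a symbol introduced here for illustration).

```latex
% (12.13), valid for every P_0 in the submacroscopic symmetry group at X:
\bar\Psi(F, K P_0, X) = \bar\Psi(F, K, X).
% Choose K = I and P_0 = K(X,t), admissible by (12.15):
\bar\Psi(F, K(X,t), X) = \bar\Psi(F, I, X)
\;\Longrightarrow\;
D_F\bar\Psi(F, K(X,t), X) = D_F\bar\Psi(F, I, X).
% Differentiate (12.13) with respect to K in the direction A:
D_K\bar\Psi(F, K P_0, X)\cdot (A P_0) = D_K\bar\Psi(F, K, X)\cdot A
\;\Longrightarrow\;
D_K\bar\Psi(F, K P_0, X)\, P_0^{T} = D_K\bar\Psi(F, K, X).
% Again set K = I and P_0 = K(X,t):
D_K\bar\Psi(F, K(X,t), X)\, K(X,t)^{T} = D_K\bar\Psi(F, I, X).
```

The second implication uses the identity $B \cdot (A P_0) = (B P_0^T) \cdot A$ for the inner product of tensors.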
In the next section, we study invariance properties of the response function $\bar\Psi$ in (12.12) under simultaneous changes in $K_0$ and in $X_0$.

13. Material Uniformity

We now have evidence that the factorization

$$(\chi, G) = (\chi, \nabla\chi) \circ (\pi, \nabla\chi^{-1} G), \tag{13.1}$$

with $(X, t) \mapsto \pi(X, t) := X$ the trivial motion of the body, provides a useful way to distinguish between, on the one hand, the virgin configuration from which the purely submacroscopic family $(\pi, \nabla\chi^{-1} G)$ proceeds while introducing all of the disarrangements associated with the given family $(\chi, G)$, and, on the other hand, the classical reference configuration, a macroscopically time-independent configuration from which the classical motion $(\chi, \nabla\chi)$ proceeds without introducing further disarrangements. With a view toward capturing the influence of the purely submacroscopic factor $(\pi, \nabla\chi^{-1} G)$ on the material response, for each $(X, t)$ we put as usual $K(X, t) = \nabla\chi(X, t)^{-1} G(X, t)$ and consider the mapping

$$F \mapsto \Psi(F, F K(X, t), X) = \bar\Psi(F, K(X, t), X), \tag{13.2}$$

the classical free-energy response induced at $(X, t)$ by the purely submacroscopic motion $(\pi, \nabla\chi^{-1} G)$. We may interpret this response as that of an elastic material element to local deformations in which the disarrangements are frozen at their values for the material point $X$ at time $t$. Because $\nabla\chi^{-1} G$ may vary with material point and time, the classical response will vary with the pair $(X, t)$. However, the response function $\bar\Psi$ also depends upon $X$, and it may happen for a given time $t_0$ that the dependence of $\bar\Psi$ on $X$ compensates for the dependence of $K(X, t_0)$ on $X$, i.e., for all $X, Y \in A$ and for every tensor $F$ with $\det F > 0$:

$$\bar\Psi(F, K(X, t_0); X) = \bar\Psi(F, K(Y, t_0); Y). \tag{13.3}$$

The condition (13.3) embodies the idea of a materially uniform elastic body as described by Truesdell and Noll [32], Noll [33]. (In particular, see relation (27.4) of [32], in which the dependence of response on material point is compensated by the choice of local configuration.) Moreover, the mapping $X \mapsto K(X, t_0)$ corresponds to their concept of a uniform reference for the body at time $t_0$. It is possible that one of the uniform references $X \mapsto K(X, t_0) = \nabla\chi(X, t_0)^{-1} G(X, t_0)$ for the body at time $t_0$ is the gradient of a mapping on $A$, and this embodies the notion of a materially uniform, homogeneous elastic body (Truesdell and Noll [32], Noll [33]). If none of the uniform references at time $t_0$ is a gradient, the induced classical response is described as that of a materially uniform, inhomogeneous elastic body.

These considerations lead us to define the collection $\mathcal{E}_d^{\mathrm{unif}}$ of dynamical processes in $\mathcal{E}_d$ that satisfy the material uniformity condition

$$\bar\Psi(F, K(X, t); X) = \bar\Psi(F, K(Y, t); Y) \tag{13.4}$$
for all $X, Y \in A$, for every tensor $F$ with $\det F > 0$, and for every time $t$. The dynamical processes in $\mathcal{E}_d^{\mathrm{unif}}$ have the property that, for every $t$, $K(\cdot, t)$ is a uniform reference for the body. The material uniformity condition then may be viewed as a restriction on the field $(X, t) \mapsto K(X, t) = \nabla\chi(X, t)^{-1} G(X, t)$, to be appended to the field relations for elastic bodies undergoing disarrangements derived in Section 10. Determining or characterizing explicitly the collection $\mathcal{E}_d^{\mathrm{unif}}$ would entail characterizing the solutions of the field relations, augmented by the material uniformity condition.

If we impose the requirement that the collection $\mathcal{E}_d^{\mathrm{unif}}$ be non-empty, then the material uniformity condition (13.4) places restrictions on the response function $\bar\Psi$. To make this observation more apparent, we consider a given classical dynamical process $(\chi, \nabla\chi, S, \psi)$ in $\mathcal{E}_d$, and we ask how $\bar\Psi$ would be restricted by requiring that this classical dynamical process be in $\mathcal{E}_d^{\mathrm{unif}}$. An immediate answer follows from the fact that, for every classical dynamical process in $\mathcal{E}_d$, $K(X, t) = I$ for all $X$, $t$. Therefore, the material uniformity condition applied to the given classical dynamical process in $\mathcal{E}_d$ becomes the condition

$$\bar\Psi(F, I, X) = \bar\Psi(F, I, Y) \tag{13.5}$$

for all $X, Y \in A$ and for every tensor $F$ with $\det F > 0$. Evidently, this relation restricts the constitutive function $\bar\Psi$ used to define $\mathcal{E}_d$.

The material uniformity condition (13.4) may be differentiated with respect to $F$ to obtain the relation

$$D_F\bar\Psi(F, K(X, t), X) = D_F\bar\Psi(F, K(Y, t), Y)$$

valid for all $F$, $X$, $Y$, and $t$. In the relation (10.28) between the stress and free energy, the material uniformity condition does not seem to imply a corresponding uniformity condition on the response functions that determine any of the stresses $S$, $S_\backslash$, and $S_d$. Of course, one can directly impose in place of (13.4), or in addition to it, a material uniformity condition on one or more of the stress responses, for example on the response function from (10.28) for the Piola–Kirchhoff stress $S$:

$$2 D_F\bar\Psi(F, K(X, t), X) + F^{-T} D_K\bar\Psi(F, K(X, t), X)(I - 2K^T(X, t)) = 2 D_F\bar\Psi(F, K(Y, t), Y) + F^{-T} D_K\bar\Psi(F, K(Y, t), Y)(I - 2K^T(Y, t))$$

for all $F$, $X$, $Y$, and $t$. Although this choice would provide a different restriction on the response function $\bar\Psi$, it can be interpreted and studied along the lines outlined for (13.4).

14. Energetically Nearsighted Elastic Bodies

The field relations obtained in Section 10, together with special properties of a body such as material uniformity and material symmetry, provide the setting for understanding the scope and range of applicability of elasticity with disarrangements. In this section we take a preliminary step in this direction by considering
elastic bodies that are “energetically nearsighted” in the sense that only the purely submacroscopic factor $(\pi, (\nabla\chi)^{-1} G) = (\pi, K)$ in the factorization (13.1) affects the free energy. Thus, submacroscopic slips or formation of voids would permit the body to change its free energy, while the classical deformations embodied in the factor $(\chi, \nabla\chi)$ would not, and we consider now elastic bodies for which the free energy response $\bar\Psi$ in (10.20) does not depend upon the macroscopic deformation $F$ and, therefore, satisfies $D_F\bar\Psi(F, K, X) = 0$ for all triples $F$, $K$, $X$ in the domain of $\bar\Psi$. Accordingly, the field relations (10.23)–(10.27) and the stress relation (10.28) take the form

$$\operatorname{div}\bigl(F^{-T} D\bar\Psi(K)(I - 2K^T)\bigr) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{14.1}$$

$$D\bar\Psi(K)\, K^T \bigl(\{K^{-T}\}^2 - 3K^{-T} + I\bigr) = 0, \tag{14.2}$$

$$\operatorname{sk}\bigl(F^{-T} D\bar\Psi(K)(I - 2K^T) F^T\bigr) = 0, \tag{14.3}$$

$$\bigl(F^{-T} D\bar\Psi(K)(I - 2K^T)\bigr) \cdot \dot F - D\bar\Psi(K) \cdot \dot K \ge 0, \tag{14.4}$$

$$1 \ge \det K > 0, \tag{14.5}$$

$$S = F^{-T} D\bar\Psi(K)(I - 2K^T), \tag{14.6}$$

where we have written $D\bar\Psi$ in place of $D_K\bar\Psi$ to simplify notation. In some of the considerations below, it is helpful to use the field $H := G F^{-1} = F K F^{-1}$ associated with the factorization (2.7), and we note the relation

$$H^{-T} = F^{-T} K^{-T} F^T = (F^T)^{-1} K^{-T} F^T. \tag{14.7}$$
14.1. UNIVERSAL PHASES AND THE GOLDEN MEAN

The consistency relation in the form (14.2) provides a restriction on the field $K$, and the form of this restriction depends in general upon the response function $\bar\Psi$. We observe, however, that there are solutions $K$ of the consistency relation that do not depend upon $\bar\Psi$, because the expression $\{K^{-T}\}^2 - 3K^{-T} + I$ occurs multiplicatively in (14.2). Consequently, each tensor $K$ with $0 < \det K \le 1$ for which $K^{-T}$ is a solution of the quadratic tensor equation

$$X^2 - 3X + I = 0, \qquad X \in \operatorname{Lin} V, \tag{14.8}$$

determines a solution of the consistency relation (14.2). It is easy to see that $K^{-T}$ is a solution of the quadratic equation (14.8) if and only if $K$ itself is a solution. In turn, this is equivalent to the assertion that the tensor $H = F K F^{-1}$ is a solution of (14.8). We call solutions $K$ (with $0 < \det K \le 1$) of the consistency relation universal if they are solutions of (14.8), because they do not depend upon the free energy response function $\bar\Psi$ of the nearsighted elastic body. We also refer to the corresponding tensor $H = F K F^{-1}$ as a universal solution of the consistency relation. A necessary condition for $H$ to be universal is the inclusion of the spectrum of $H$ in the solution set $\{(3 + \sqrt{5})/2, (3 - \sqrt{5})/2\} = \{2 + \gamma_0, 1 - \gamma_0\}$ of the scalar
quadratic equation $x^2 - 3x + 1 = 0$. Here, $\gamma_0 := (\sqrt{5} - 1)/2 \approx 0.618$ is the “golden mean,” the positive number satisfying the relation $1/x = x/(1 - x)$. By elementary linear algebra, $H$ is universal if and only if it is diagonalizable over the reals with diagonal entries given up to permutations by one of the two triples: $(1 - \gamma_0, 1 - \gamma_0, 1 - \gamma_0)$, $(1 - \gamma_0, 1 - \gamma_0, 2 + \gamma_0)$. (The possibilities $(1 - \gamma_0, 2 + \gamma_0, 2 + \gamma_0)$ and $(2 + \gamma_0, 2 + \gamma_0, 2 + \gamma_0)$ are ruled out by the restriction $\det H \in (0, 1]$.) Of course, the first triple $(1 - \gamma_0, 1 - \gamma_0, 1 - \gamma_0)$ determines the tensor

$$H_{\mathrm{sph}} := (1 - \gamma_0) I \tag{14.9}$$

that, in turn, determines the purely submacroscopic structured deformation $(i, (1 - \gamma_0) I)$ to follow a classical deformation $(\chi(\cdot, t), \nabla\chi(\cdot, t))$. A piecewise smooth approximation $h_n$ for $(i, (1 - \gamma_0) I)$ takes a body in its current configuration without disarrangements, partitioned into congruent cubic cells of side $1/n$, and replaces each cell by one with the same center but now of side $(1 - \gamma_0)/n$. The simultaneous shrinking of each cell creates voids, and the resulting structured deformation has volume fraction $\det H_{\mathrm{sph}} = \det K_{\mathrm{sph}} = (1 - \gamma_0)^3 \approx 0.056$. This change of submacroscopic geometry determines the (universal) spherical phase of the energetically nearsighted elastic body.

For each choice of basis $(d_{(1)}, d_{(2)}, d_{(3)})$, with corresponding reciprocal basis $(d^{(1)}, d^{(2)}, d^{(3)})$ satisfying $d_{(i)} \cdot d^{(j)} = \delta_i^j$, the second triple $(1 - \gamma_0, 1 - \gamma_0, 2 + \gamma_0)$ of diagonal entries determines a tensor

$$H_{\mathrm{long}} := (1 - \gamma_0)\, d_{(1)} \otimes d^{(1)} + (1 - \gamma_0)\, d_{(2)} \otimes d^{(2)} + (2 + \gamma_0)\, d_{(3)} \otimes d^{(3)} = H_{\mathrm{sph}} + (1 + 2\gamma_0)\, d_{(3)} \otimes d^{(3)} \tag{14.10}$$

that, in turn, determines a purely submacroscopic deformation $(i, H_{\mathrm{long}})$ to follow a classical deformation $(\chi(\cdot, t), \nabla\chi(\cdot, t))$ that we refer to as the (universal) elongated phase of the body. A piecewise smooth approximation $h_n$ for $(i, H_{\mathrm{long}})$ takes each of the basic cells of side $1/n$, with its edges now parallel to $d_{(1)}$, $d_{(2)}$, $d_{(3)}$, respectively, and stretches the “$d_{(3)}$” edge to the length $(2 + \gamma_0)/n \approx 2.618/n$ while shrinking the other two edges to the length $(1 - \gamma_0)/n \approx 0.382/n$. The simultaneous elongation of each cell creates voids, and the resulting structured deformation $(i, H_{\mathrm{long}})$ has volume fraction

$$\det H_{\mathrm{long}} = \det K_{\mathrm{long}} = (1 - \gamma_0)^2 (2 + \gamma_0) = (1 - \gamma_0) \approx 0.382. \tag{14.11}$$
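The arithmetic behind the two universal phases can be checked numerically. The following is a minimal sketch in plain Python, not part of the original paper: it assembles the elongated-phase tensor of (14.10) from an arbitrarily chosen oblique basis (the basis values and all function names are assumptions made for illustration) and evaluates the residual of the quadratic equation (14.8) together with the volume fraction of (14.11).

```python
import math

# Golden mean as defined in the text: gamma0 = (sqrt(5) - 1) / 2.
GAMMA0 = (math.sqrt(5.0) - 1.0) / 2.0
LAM_SMALL = 1.0 - GAMMA0  # equals (3 - sqrt(5)) / 2
LAM_LARGE = 2.0 + GAMMA0  # equals (3 + sqrt(5)) / 2


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(m):
    # Determinant as the triple product of the rows.
    return dot(m[0], cross(m[1], m[2]))


def reciprocal_basis(d1, d2, d3):
    """Return (d^1, d^2, d^3) with d_(i) . d^(j) = delta_i^j."""
    vol = dot(d1, cross(d2, d3))
    return ([c / vol for c in cross(d2, d3)],
            [c / vol for c in cross(d3, d1)],
            [c / vol for c in cross(d1, d2)])


def h_long(basis):
    """Assemble the tensor of (14.10) from a (possibly oblique) basis."""
    recip = reciprocal_basis(*basis)
    eigs = (LAM_SMALL, LAM_SMALL, LAM_LARGE)
    return [[sum(eigs[k] * basis[k][i] * recip[k][j] for k in range(3))
             for j in range(3)] for i in range(3)]


def quadratic_residual(h):
    """Largest absolute entry of H^2 - 3H + I, cf. (14.8)."""
    h2 = matmul(h, h)
    return max(abs(h2[i][j] - 3.0 * h[i][j] + (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))


# Hypothetical non-orthogonal basis, chosen only for illustration.
BASIS = ([1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, 0.25, 1.0])
H = h_long(BASIS)

print(quadratic_residual(H))   # expected to be at round-off level
print(det3(H), 1.0 - GAMMA0)   # the two values should agree, cf. (14.11)
print((1.0 - GAMMA0) ** 3)     # volume fraction of the spherical phase
```

Because the basis is oblique, the resulting tensor is not symmetric, yet it still satisfies (14.8); this is consistent with the remark in the text that the basis need not be orthogonal in general.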
We have used the fact that $(1 - \gamma_0)$ and $(2 + \gamma_0)$ are reciprocals in the last calculation. Moreover, after elongation, the cells may have to be translated slightly in order to avoid interpenetration of neighboring cells, because the piecewise smooth approximations $h_n$ are required to be injective. While the universal solution $H_{\mathrm{sph}}$ does not vary in space and time, the universal solution $H_{\mathrm{long}}$ may vary through dependence of the dyad $d_{(3)} \otimes d^{(3)}$ on position and time. In addition, the basis $(d_{(1)}, d_{(2)}, d_{(3)})$ need not be orthogonal, so that the approximating deformations $h_n$ map unit cubes into possibly non-rectangular parallelepipeds. For a particular
class of energetically nearsighted elastic bodies, the basis $(d_{(1)}, d_{(2)}, d_{(3)})$ must be orthonormal, as we demonstrate below, and we have $d^{(i)} = d_{(i)}$ for $i = 1, 2, 3$.

14.2. FIELD RELATIONS FOR A CLASS OF NEARSIGHTED ELASTIC BODIES

We now specialize the discussion above to the case

$$\bar\Psi(F, K) = \bar\psi(\det K) = \bar\psi(\det H) \tag{14.12}$$

in which only the volume fraction $f := \det K = \det H \in (0, 1]$ produced by the purely submacroscopic deformations $(i, H)$ and $(i, K)$ affects the free energy. We shall restrict our attention to universal phases of the energetically nearsighted elastic material under consideration, so that the consistency relation need not be considered further, and the formula

$$D\bar\Psi(K) = f\, \bar\psi'(f)\, K^{-T} \tag{14.13}$$

along with the fact that $f \in \{1 - \gamma_0, (1 - \gamma_0)^3\}$ is a constant for each phase yields after some computations the following forms of the remaining field relations (14.4), (14.3), and (14.1), as well as the stress relation (14.6):

$$f\, \bar\psi'(f) \operatorname{div}\bigl((H^{-1} - 2I) F^{-T}\bigr) + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{14.14}$$

$$\operatorname{sk} H = 0, \tag{14.15}$$

$$\bar\psi'(f)(H^{-1} - 2I) \cdot \dot F F^{-1} - \bar\psi'(f) \operatorname{tr}(\dot H H^{-1}) \ge 0, \tag{14.16}$$

$$S = f\, \bar\psi'(f)(H^{-1} - 2I) F^{-T}. \tag{14.17}$$
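Part of the computation behind (14.13) and (14.17) can be spelled out. The sketch below assumes the notation $D\bar\Psi$ for the derivative of the response function with respect to $K$ and a prime for the derivative of $\bar\psi$; it uses only the standard formula for the derivative of the determinant together with (14.6) and (14.7).

```latex
% Derivative of the determinant and the chain rule give (14.13):
D_K(\det K) = (\det K)\, K^{-T}
\;\Longrightarrow\;
D\bar\Psi(K) = \bar\psi'(f)\, f\, K^{-T}, \qquad f = \det K.
% Substitute into the stress relation (14.6):
S = F^{-T}\, f\,\bar\psi'(f)\, K^{-T} (I - 2K^{T})
  = f\,\bar\psi'(f)\, \bigl(F^{-T} K^{-T} - 2F^{-T}\bigr).
% By (14.7), F^{-T} K^{-T} = H^{-T} F^{-T}, so that
S = f\,\bar\psi'(f)\, (H^{-T} - 2I)\, F^{-T}.
```

This agrees with (14.17) once $H^{-T}$ is replaced by $H^{-1}$, which is permitted by the symmetry of $H$ expressed in (14.15).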
We have used the frame-indifference of the mixed power (14.15) to replace $H^{-T}$ by $H^{-1}$ in the balance law (14.14) and in the mixed power inequality (14.16), and we have assumed that $\bar\psi'(f) \ne 0$ for $f \in \{1 - \gamma_0, (1 - \gamma_0)^3\}$. The symmetry of $H$ implies that we may write

$$H_{\mathrm{long}} = H_{\mathrm{sph}} + (1 + 2\gamma_0)\, d \otimes d, \tag{14.18}$$

with $d := d_{(3)} = d^{(3)}$. An easy calculation provides the formulas

$$H_{\mathrm{sph}}^{-1} = (2 + \gamma_0) I, \qquad H_{\mathrm{long}}^{-1} = (2 + \gamma_0) I - (1 + 2\gamma_0)\, d \otimes d, \tag{14.19}$$

and, using the constancy of $H_{\mathrm{sph}}$, we obtain the specific forms for the balance of linear momentum, the mixed power inequality, and the stress relation in the spherical phase

$$\gamma_0 (1 - \gamma_0)^3\, \bar\psi'((1 - \gamma_0)^3) \operatorname{div} F^{-T} + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{14.20}$$

$$\bar\psi'((1 - \gamma_0)^3) \operatorname{tr}(\dot F F^{-1}) \ge 0, \tag{14.21}$$
$$S = \gamma_0 (1 - \gamma_0)^3\, \bar\psi'((1 - \gamma_0)^3)\, F^{-T}. \tag{14.22}$$
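The inverse formulas (14.19) and the factor $\gamma_0$ appearing in (14.20)–(14.22) can be verified with a few lines of plain Python. This is an illustrative sketch only: the unit vector `D` and the helper names are assumptions introduced here, not quantities from the paper.

```python
import math

GAMMA0 = (math.sqrt(5.0) - 1.0) / 2.0


def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def outer(u, v):
    return [[u[i] * v[j] for j in range(3)] for i in range(3)]


def combine(a, m, b, n):
    """Return the tensor a*m + b*n for 3x3 nested lists m and n."""
    return [[a * m[i][j] + b * n[i][j] for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# Hypothetical unit vector d, for illustration only (|d| = 1).
D = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
I3 = identity()
P = outer(D, D)

H_SPH = combine(1.0 - GAMMA0, I3, 0.0, P)                          # (14.9)
H_LONG = combine(1.0 - GAMMA0, I3, 1.0 + 2.0 * GAMMA0, P)          # (14.18)
H_SPH_INV = combine(2.0 + GAMMA0, I3, 0.0, P)                      # (14.19)
H_LONG_INV = combine(2.0 + GAMMA0, I3, -(1.0 + 2.0 * GAMMA0), P)   # (14.19)

# Both products should reproduce the identity up to round-off.
ERR_SPH = max_abs_diff(matmul(H_SPH, H_SPH_INV), I3)
ERR_LONG = max_abs_diff(matmul(H_LONG, H_LONG_INV), I3)

# Factor multiplying F^{-T} in (14.17): H^{-1} - 2I.
BRACKET_SPH = combine(1.0, H_SPH_INV, -2.0, I3)    # should equal gamma0 * I
BRACKET_LONG = combine(1.0, H_LONG_INV, -2.0, I3)  # gamma0*I - (1+2*gamma0) d(x)d

print(ERR_SPH, ERR_LONG)
print(BRACKET_SPH[0][0], GAMMA0)
```

The spherical bracket reduces to $\gamma_0 I$, which accounts for the coefficient $\gamma_0 (1 - \gamma_0)^3$ in (14.20) and (14.22).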
The stress relation (14.22) implies that the Cauchy stress in the spherical phase is given by

$$T = S F^T / \det F = \rho\, S F^T / (\rho \det F) = C_0\, \rho\, I, \tag{14.23}$$

where $C_0 := \gamma_0 (1 - \gamma_0)^3\, \bar\psi'((1 - \gamma_0)^3)/\rho_{\mathrm{ref}}$ has the same sign as $\bar\psi'((1 - \gamma_0)^3)$ and $\rho$ denotes the density in the current configuration. Thus, in the spherical phase, the energetically nearsighted elastic body experiences a hydrostatic stress that is proportional to the density in the current configuration. If $\bar\psi'((1 - \gamma_0)^3) < 0$, then the stress is a hydrostatic pressure, again proportional to the density, as in the case of an ideal gas. Thus, if $\bar\psi'((1 - \gamma_0)^3) < 0$, the equation of state of the energetically nearsighted elastic body in the spherical phase is that of an ideal gas undergoing isothermal dynamical processes. Of course, the balance of linear momentum then takes the standard form in the current configuration for gas dynamics:

$$C_0 \operatorname{grad} \rho + b = \rho\, \dot v, \tag{14.24}$$

where $\rho$ and $b$ now denote the density and body force in the current configuration. However, the mixed power inequality now requires that

$$\operatorname{div} v \le 0, \tag{14.25}$$
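which tells us that, when $\bar\psi'((1 - \gamma_0)^3) < 0$, the spherical phase can arise only when the elastic material is not expanding.

By employing relations (14.18), (14.19), and the formulas

$$\dot H_{\mathrm{long}} = (1 + 2\gamma_0)(\dot d \otimes d + d \otimes \dot d), \qquad d \cdot d = 1,$$

we obtain in a similar way specific forms for the balance of linear momentum, the mixed power inequality, and the stress relation in the elongated phase

$$(1 - \gamma_0)\, \bar\psi'((1 - \gamma_0)) \operatorname{div}\bigl[(\gamma_0 I - (1 + 2\gamma_0)\, d \otimes d) F^{-T}\bigr] + b_{\mathrm{ref}} = \rho_{\mathrm{ref}}\,\ddot\chi, \tag{14.26}$$

$$(1 - \gamma_0)\, \bar\psi'((1 - \gamma_0)) \bigl[\gamma_0 I - (1 + 2\gamma_0)\, d \otimes d\bigr] \cdot \dot F F^{-1} \ge 0, \tag{14.27}$$

$$S = (1 - \gamma_0)\, \bar\psi'((1 - \gamma_0)) \bigl[\gamma_0 I - (1 + 2\gamma_0)\, d \otimes d\bigr] F^{-T}. \tag{14.28}$$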
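The bracketed tensor in (14.26)–(14.28) can be rescaled by $\gamma_0$ using the identity $1 + 2\gamma_0 = \gamma_0 (3 + \gamma_0)$, which follows from $\gamma_0^2 = 1 - \gamma_0$. The sketch below, in plain Python, evaluates the scalar factor in (14.27) for a velocity gradient $L = \dot F F^{-1}$ in both forms. The function names and the sample values of `L` and `d` are hypothetical, and the sign interpretation in the comments assumes the "$\ge 0$" reading of the mixed power inequality.

```python
import math

GAMMA0 = (math.sqrt(5.0) - 1.0) / 2.0


def trace(L):
    return L[0][0] + L[1][1] + L[2][2]


def stretch_along(L, d):
    """Return d . (L d), the stretching in the direction d."""
    return sum(d[i] * L[i][j] * d[j] for i in range(3) for j in range(3))


def power_factor(L, d):
    """[gamma0*I - (1 + 2*gamma0) d(x)d] . L, the bracket in (14.27)."""
    return GAMMA0 * trace(L) - (1.0 + 2.0 * GAMMA0) * stretch_along(L, d)


def power_factor_rescaled(L, d):
    """Same quantity written as gamma0 * (tr L - (3 + gamma0) d.(L d))."""
    return GAMMA0 * (trace(L) - (3.0 + GAMMA0) * stretch_along(L, d))


# Hypothetical data: uniaxial extension along d, so tr L > 0.
D = [0.0, 0.0, 1.0]
L_UNIAXIAL = [[0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]]

A = power_factor(L_UNIAXIAL, D)
B = power_factor_rescaled(L_UNIAXIAL, D)
print(A, B)  # both forms agree; the value is negative for this choice

# With a negative derivative of the free energy, (14.27) read as ">= 0"
# requires this factor to be non-positive, which holds here even though
# the motion is expanding (tr L > 0).
```

The rescaled form makes visible the combination $\operatorname{tr} L - (3 + \gamma_0)\, d \cdot L d$, in which the stretching along $d$ is weighted against the rate of volume change.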
(In relations (14.26)–(14.28), the symbol $d$ denotes the field introduced in (14.18) referred to the reference configuration.) The stress relation (14.28) implies that the Cauchy stress in the elongated phase is given by

$$T = C_1\, \rho\, [I - (3 + \gamma_0)\, d \otimes d] \tag{14.29}$$
with $C_1 := \gamma_0 (1 - \gamma_0)\, \bar\psi'((1 - \gamma_0))/\rho_{\mathrm{ref}}$ having the same sign as $\bar\psi'((1 - \gamma_0))$, and the relations (14.24) and (14.25) become in the elongated phase:

$$C_1 \operatorname{grad} \rho - C_1 (3 + \gamma_0) \operatorname{div}[\rho\, d \otimes d] + b = \rho\, \dot v \tag{14.30}$$

and

$$\operatorname{div} v \le (3 + \gamma_0)(\operatorname{grad} v)\, d \cdot d, \tag{14.31}$$

the latter when $\bar\psi'((1 - \gamma_0)) < 0$. We conclude that the elongated phase can persist when $\operatorname{div} v$ is positive, as long as the stretching $(\operatorname{grad} v)\, d \cdot d$ in the direction of $d$ is at least $\operatorname{div} v/(3 + \gamma_0)$. Consequently, when $\bar\psi'((1 - \gamma_0)) < 0$, the material in the elongated phase may expand or contract as long as the direction of submacroscopic elongation $d$ is strongly aligned with directions of stretching in the macroscopic motion.

Appendix: Decomposition of Flux Densities

We record here for the sake of completeness the principal relations obtained in [3, 15], concerning the decomposition of flux densities arising through a structured deformation, with $g$ the macroscopic deformation and $G$ the deformation without disarrangements. If $w: A \to V$ is a smooth vector field and $K = (\nabla g)^{-1} G$, then the identity
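$$(\det K) \operatorname{div} w = \operatorname{div}(K^{*T} w) + \operatorname{div}((\det K) w - K^{*T} w) - w \cdot \nabla(\det K) \tag{A.1}$$

is an immediate consequence of the product rule $\operatorname{div}(\varphi w) = \nabla\varphi \cdot w + \varphi \operatorname{div} w$, where $\varphi$ is an arbitrary smooth scalar field. By an appropriate choice of determining sequence $n \mapsto h_n$ for the purely submacroscopic structured deformation $(i, K)$, one can derive the following identification relations for $(\det K) \operatorname{div} w$ and $\operatorname{div}(K^{*T} w)$ at each point $X \in A$:

$$(\det K) \operatorname{div} w|_X = \lim_{r \to 0} \lim_{n \to \infty} r^{-3} \sum_{C \in \mathcal{C}_n} \int_{h_n(\operatorname{bdy}(C_r(X) \cap C))} w(Y) \cdot \nu(Y)\, dA_Y, \tag{A.2}$$

$$\operatorname{div}(K^{*T} w)|_X = \lim_{r \to 0} \lim_{n \to \infty} r^{-3} \sum_{C \in \mathcal{C}_n} \int_{h_n(\operatorname{bdy}(C_r(X)) \cap C)} w(Y) \cdot \nu(Y)\, dA_Y. \tag{A.3}$$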
In (A.2) and (A.3), for each positive integer $n$, $\mathcal{C}_n$ is a collection of closed cubes $C$ that cover the region $A$ and whose faces together include the jump-sites of $h_n$
and of $\nabla h_n$, and $C_r(X)$ is a cube centered at $X$ of side $r$ whose faces intersect the jump-sites of all the functions $h_n$ and $\nabla h_n$ in a set of area zero. The surface integral in (A.2) is taken over the image under $h_n$ of all the faces of the parallelepiped $C_r(X) \cap C$, so that the sum in (A.2) represents the total flux of $w$ across the image of the faces of $C_r(X)$ and across the image of the faces of cubes $C$ in $\mathcal{C}_n$ containing the jump-sites of $h_n$ and $\nabla h_n$ inside $C_r(X)$. Therefore, the limit on the right-hand side of (A.2) and, hence, the left-hand side $(\det K) \operatorname{div} w|_X$, represents the volume density of the total flux of $w$. The surface integral in (A.3) is taken instead over the image under $h_n$ of only those faces of the parallelepiped $C_r(X) \cap C$ that belong to the boundary of $C_r(X)$ and not to the images of faces of cubes $C$ in $\mathcal{C}_n$ containing the jump-sites of $h_n$ and $\nabla h_n$ inside $C_r(X)$. Therefore, the limit on the right-hand side of (A.3) and, hence, the left-hand side $\operatorname{div}(K^{*T} w)|_X$, represents the volume density of the flux of $w$ without disarrangements. Consequently, the remaining terms $(\operatorname{div}((\det K) w - K^{*T} w) - w \cdot \nabla(\det K))|_X$ on the right-hand side of (A.1) represent the volume density of the flux of $w$ due to disarrangements. The identification relation (A.3) also permits us to call the vector field

$$w_\backslash := K^{*T} w \tag{A.4}$$

the portion of $w$ without disarrangements. Moreover, from (A.2) the divergence of the vector field

$$w_d := (\det K) w - K^{*T} w \tag{A.5}$$

together with the term $-w \cdot \nabla(\det K)$, account for all the volume density of flux due to disarrangements, and we call $w_d$ the portion of $w$ due to disarrangements. By (A.5) we may write

$$(\det K) w = w_\backslash + w_d \tag{A.6}$$

as an additive decomposition of the vector field $(\det K) w$ into the portion of $w$ without disarrangements and the portion of $w$ due to disarrangements. Relations (A.4)–(A.6) yield the consistency relation

$$K^T w_\backslash = w_\backslash + w_d. \tag{A.7}$$
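Returning to the identity (A.1), the step described there as an immediate consequence of the product rule can be written out as a short worked derivation. The only choice made is $\varphi = \det K$ in the product rule; nothing beyond the symbols already used in this Appendix is assumed.

```latex
% Product rule with \varphi = \det K:
\operatorname{div}\bigl((\det K)\, w\bigr)
  = \nabla(\det K)\cdot w + (\det K)\operatorname{div} w .
% Solve for (\det K) div w:
(\det K)\operatorname{div} w
  = \operatorname{div}\bigl((\det K)\, w\bigr) - w\cdot\nabla(\det K) .
% Add and subtract div(K^{*T} w), using linearity of the divergence:
(\det K)\operatorname{div} w
  = \operatorname{div}(K^{*T} w)
  + \operatorname{div}\bigl((\det K)\, w - K^{*T} w\bigr)
  - w\cdot\nabla(\det K) .
```

The last line is (A.1); its first term is the divergence of the field in (A.4) and its second term is the divergence of the field in (A.5).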
We note that for each fixed vector $a \in V$, we may set $w = S^T a$ in (A.1)–(A.7) and, in view of the arbitrariness of $a$, recover the relations (3.2), (3.3), (3.5) and (3.6).

Acknowledgements

We thank Amit Acharya, Morton Gurtin, William Hrusa, and the referees for valuable comments and discussions related to this research. We also acknowledge the
support of the US National Science Foundation, Division of Mathematical Sciences, Award #0102477, and the support of the Italian Ministry of University and Scientific Research through the Grant “Cofin 2000: Modelli Matematici per la Scienza dei Materiali”, coordinated by P. Podio-Guidugli. We thank the Department of Mathematics of the University of Kentucky at Lexington and C.-S. Man for the valuable support offered to L. Deseri as Visiting Professor in the Spring Semester, 2002.
References

1. G. Del Piero and D.R. Owen, Structured deformations of continua. Arch. Rational Mech. Anal. 124 (1993) 99–155.
2. G. Del Piero and D.R. Owen, Integral-gradient formulae for structured deformations. Arch. Rational Mech. Anal. 131 (1995) 121–138.
3. G. Del Piero and D.R. Owen, Structured Deformations. Quaderni dell’ Istituto Nazionale di Alta Matematica, Gruppo Nazionale di Fisica Matematica No. 58 (2000).
4. D.R. Owen, Twin balance laws for bodies undergoing structured motions. In: P. Podio-Guidugli and M. Brocato (eds), Rational Continua, Classical and New. Springer-Verlag, New York (2002); Research Report No. 01-CNA-005, February 2001, Center for Nonlinear Analysis, Department of Mathematical Sciences, Carnegie Mellon University.
5. D.R. Owen and R. Paroni, Second order structured deformations. Arch. Rational Mech. Anal. 155 (2000) 215–235.
6. R. Choksi and I. Fonseca, Bulk and interfacial energy densities for structured deformations of continua. Arch. Rational Mech. Anal. 138 (1997) 37–103.
7. R. Choksi, G. Del Piero, I. Fonseca and D.R. Owen, Structured deformations as energy minimizers in models of fracture and hysteresis. Mathematics and Mechanics of Solids 4 (1999) 321–356.
8. L. Deseri and D.R. Owen, Energetics of two-level shears and hardening of single crystals. Mathematics and Mechanics of Solids 7 (2002) 113–147.
9. G. Del Piero, The energy of a one-dimensional structured deformation. Mathematics and Mechanics of Solids 6 (2001) 387–408.
10. G. Capriz, Continua with Microstructure. Springer Tracts in Natural Philosophy 35. Springer-Verlag, New York (1989).
11. A. Eringen, Microcontinuum Field Theories, I. Foundations and Solids. Springer-Verlag, New York (1999).
12. M. Renardy, W. Hrusa and J.A. Nohel, Mathematical Problems in Viscoelasticity. Pitman Monographs and Surveys in Pure and Applied Mathematics 35. Longman Scientific and Technical (1987).
13. L. Deseri and D.R. Owen, Invertible structured deformations and the geometry of multiple slip in single crystals. Internat. J. Plasticity 18 (2002) 833–849.
14. M. Boyce, G. Weber and D. Parks, On the kinematics of finite strain plasticity. J. Mechanics and Physics of Solids 37 (1989) 647–665.
15. D.R. Owen, Structured deformations and the refinements of balance laws induced by microslip. Internat. J. Plasticity 14 (1998) 289–299.
16. P. Haupt, Continuum Mechanics and Theory of Materials. Springer-Verlag, Berlin (2000).
17. W. Noll, La mécanique classique, basée sur un axiome d’objectivité. In: La Méthode Axiomatique dans les Mécaniques Classiques et Nouvelles (Colloque International, Paris, 1959). Gauthier-Villars, Paris (1963) pp. 47–56.
18. A.E. Green and R.S. Rivlin, On Cauchy’s equations of motion. J. Appl. Math. Phys. 15 (1964) 290–292.
19. M.E. Gurtin, An Introduction to Continuum Mechanics. Academic Press, New York (1981).
20. B.D. Coleman and W. Noll, The thermodynamics of elastic materials with heat conduction and viscosity. Arch. Rational Mech. Anal. 13 (1963) 167–178.
21. C.M. Dafermos, Quasilinear hyperbolic systems with involutions. Arch. Rational Mech. Anal. 94 (1986) 373–389.
22. W. Noll, On the continuity of the solid and fluid states. J. Rational Mech. Anal. 4 (1955) 3–81.
23. D. Luenberger, Linear and Nonlinear Programming, 2nd edn. Addison-Wesley, Reading, MA (1989).
24. J.L. Ericksen, Loading devices and stability of equilibrium. In: Nonlinear Elasticity. Academic Press (1973) pp. 161–173.
25. M. Chipot and D. Kinderlehrer, Equilibrium configurations of crystals. Arch. Rational Mech. Anal. 103 (1988) 237–277.
26. I. Fonseca and G. Parry, Equilibrium configurations of defective crystals. Arch. Rational Mech. Anal. 120 (1992) 245–283.
27. C. Davini and G. Parry, On defect-preserving deformations in crystals. Internat. J. Plasticity 5 (1989) 337–369.
28. V. Mizel, On the ubiquity of fracture in non-linear elasticity. J. Elasticity 52 (1999) 257–266.
29. M. Šilhavý and J. Kratochvíl, A theory of inelastic behavior of materials, Part I. Arch. Rational Mech. Anal. 65 (1977) 97–129; Part II. Arch. Rational Mech. Anal. 65 (1977) 131–152.
30. W. Noll, A new mathematical theory of simple materials. Arch. Rational Mech. Anal. 48 (1972) 1–50.
31. A. Bertram, An alternative approach to finite plasticity based on material isomorphisms. Internat. J. Plasticity 14 (1999) 353–374.
32. C. Truesdell and W. Noll, The Non-Linear Field Theories of Mechanics, 2nd edn. Springer-Verlag, Berlin (1992).
33. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
Continuous Distributions of Dislocations in Bodies with Microstructure

MARCELO EPSTEIN (1) and IOAN BUCATARU (2)

(1) Department of Mechanical and Manufacturing Engineering, The University of Calgary, Calgary, AB T2N 1N4, Canada.
(2) Faculty of Mathematics, “Al.I.Cuza” University, Iasi, 6600, Romania.

Received 16 August 2002; in revised form 13 June 2003

Abstract. A material body with smoothly distributed microstructure can be seen geometrically as a fibre bundle. Within this very general framework, we show that a theory of continuous distributions of dislocations can be formulated and specialized to particular applications, both old and new.

Mathematics Subject Classifications (2000): 74E05, 74M25, 53C10, 53B05, 55R10.

Key words: inhomogeneity, differential geometry, G-structures, fibre bundles, Eshelby stress.
Dedicated to the memory of Clifford Ambrose Truesdell III.
1. Introduction

The modern theory of continuous distributions of dislocations in simple bodies can be traced back to the pioneering work of Kondo [13] and his collaborators in Japan, and the works of Bilby [1], Kroener [14] and others in Europe. Within the context of mainstream Continuum Mechanics, the seminal articles of Noll [18] and Wang [19] paved the way for a formalism that is amenable to generalization in a variety of directions. Some of these generalizations have been presented elsewhere [7, 15, 16] and they comprise theories of second-grade materials as well as generalized Cosserat bodies. These are instances of bodies with some kind of internal structure (or microstructure) along the lines of the classical ideas of the Cosserat brothers [3] and of the more modern developments by numerous authors (Truesdell, Ericksen, Toupin, Mindlin, Green and Naghdi, Eringen, etc.). A book by Capriz [2] points towards more general theories and their possible interpretations and applications. The basic common feature underlying all these theories is that the microstructure (be it of a granular nature, or of an orientational origin such as in liquid crystals, or arising from any other physical motivation) is eventually represented by a smoothed-out apparatus (such as Cartan’s repère mobile). The obvious differential-geometric object corresponding exactly to this conceptual model is a fibre bundle. The nature of the fibre bundle depends naturally on the particular application at hand. The most intuitively obvious fibre bundle, general enough to subsume the Cosserats’ initial idea and many of its generalizations, is the frame bundle of an ordinary body. In this paper, however, we choose to leave the nature of the typical fibre unspecified and attempt to answer the question: is it possible to develop a fully fledged theory of inhomogeneities before such specification is made? We answer this question in the affirmative and then proceed to show how old and new theories can be derived as particular cases of the general formulation.

Footnote: In [15] and [16] there is a first attempt at a general theory of the type presented here, but the behavior at the fibre level is still of the local type.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 327–344. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

2. The Body Bundle and Its Configurations

2.1. FIBRE BUNDLES

The simplest instance of a fibre bundle (and one that is convenient to bear in mind) is a product bundle or trivial bundle. It consists of the Cartesian product $M = B \times F$ of a base manifold $B$ and a fibre manifold $F$. This product is, of course, itself a manifold whose dimension is the sum of the dimensions of $B$ and $F$. It is endowed with two natural differentiable projection maps such that, given a point $p \in M$ consisting of the ordered pair $(b, f)$ (with $b \in B$ and $f \in F$), the first projection ($\mathrm{pr}_1$) renders $b$, and the second $f$. In extending this idea to the general notion of a fibre bundle, the second projection is lost. Thus, a fibre bundle consists of a triple $(M, B, \pi)$, such that the projection

$$\pi: M \to B \tag{2.1}$$
is a differentiable map (a surjective submersion) with the following property: there exists a manifold $F$, called the typical fibre, such that every point $b \in B$ has an open neighbourhood $U$ and a diffeomorphism $f_U: \pi^{-1}(U) \to U \times F$ making the following diagram commutative:

[Commutative diagram: $f_U$ maps $\pi^{-1}(U)$ to $U \times F$, with $\pi$ and $\mathrm{pr}_1$ projecting onto $U$.]
We express this fact by saying that the fibre bundle is locally trivial, since it is neighbourhood-wise diffeomorphic to a product bundle. We note that on the non-empty intersection of any two trivialization neighbourhoods $U$ and $V$, the transition map $f_V \circ f_U^{-1}: F \to F$ (restricted to each point of the intersection) is a diffeomorphism of the typical fibre. More restrictively, it is possible to require that the
CONTINUOUS DISTRIBUTIONS OF DISLOCATIONS
transition maps belong to a subgroup G of the group of diffeomorphisms of the typical fibre. When this is done, the group G is called the structural group of the fibre bundle. The transition maps are required to depend differentiably on the points of the intersection U ∩ V.

2.2. BUNDLE CONFIGURATIONS

In the physical picture, we shall refer to the fibre bundle M as the body bundle. The base manifold B represents the macromedium, namely, an ordinary three-dimensional body upon which the microstructure is later superimposed. As such, for this particular use in continuum mechanics, the base manifold is assumed to be trivial, in the sense that it can be covered by an atlas consisting of just one chart. We note in passing that this requirement is not essential for the development of most of the conceptual framework of classical Continuum Mechanics. Rather, it is imposed to represent the intuitive notion that the body must manifest itself in toto in physical Euclidean space. Be that as it may, we shall adopt the standard assumption.

The microstructure will be represented by the typical fibre F, whose nature is left undefined at this point beyond the fact that it is an m-dimensional differentiable manifold. In particular, the typical fibre need not have a priori the property of being trivial, nor is it necessary that the total body bundle M be globally trivializable. We will, however, assume this last property for the sake of consistency. It is important to distinguish between a trivial bundle (namely, a bundle that is a given product of two manifolds) and a globally trivializable one (namely, a bundle that is globally diffeomorphic to a trivial one). The difference resides in the fact that the former has a particular singled-out trivialization, while the latter does not.

A configuration of a body bundle is, by definition, a global trivialization κ given in terms of a fibre-consistent embedding

$$\kappa : M \to E^3 \times F, \tag{2.2}$$

where $E^3$ stands for a three-dimensional Euclidean space, which we may identify with $\mathbb{R}^3$. By fibre-consistency we simply mean that the following diagram is commutative, that is, $\mathrm{pr}_1 \circ \kappa = \chi \circ \pi$,
where χ is an ordinary configuration of the macromedium B. Such a fibre-consistent embedding is also called a fibre-bundle morphism. In the rest of this paper we will take the liberty of using this κ/χ notation freely. Namely: the character χ (possibly with some subscripts) will always denote the configuration of the
M. EPSTEIN AND I. BUCATARU
macromedium induced by the body-bundle configuration denoted by κ (with the same subscripts).

REMARK 2.1 For some particular theories, it may be desirable to further limit the allowable configurations. Thus, for example, in a theory of second-grade materials the body bundle is the principal bundle of frames of B and, in contradistinction with the Cosserat (anholonomic) case, we may only allow embeddings which are lifts of ordinary configurations of B.

Let $X^I, Y^A$ (I = 1, 2, 3; A = 1, ..., m) and $x^i, y^a$ (i = 1, 2, 3; a = 1, ..., m) be local coordinate systems in the body bundle and in the spatial product $E^3 \times F$, respectively. Then, a configuration of the body bundle is represented locally by the 3 + m smooth functions

$$x^i = x^i(X^I); \qquad y^a = y^a(X^I, Y^A). \tag{2.3}$$
Equations (2.3) can also be regarded as representing a deformation from the reference configuration implied by the assignment of the bundle chart $(X^I, Y^A)$, into a spatial configuration expressed in the coordinate system $(x^i, y^a)$. Note that the Jacobian of the transformation (2.3) must have maximal rank in the sense that

$$\operatorname{rank}\left[\frac{\partial x^i}{\partial X^I}\right] = 3, \qquad \operatorname{rank}\left[\frac{\partial y^a}{\partial Y^A}\right] = m. \tag{2.4}$$

For each fixed point X ∈ B, equation (2.3)b must be a diffeomorphism of the typical fibre F belonging to the structural group G of the bundle. Consequently, this equation is tantamount to a smooth map

$$g : B \to G, \qquad X \mapsto g(X). \tag{2.5}$$

Accordingly, denoting by $L_g$ the left action of G on F, equation (2.3)b can also be understood as

$$y^a = L_{g(X)}(Y^A). \tag{2.6}$$
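As a check on the rank conditions (2.4), the Jacobian of (2.3) can be written in block form; the only fact used is that the functions $x^i$ do not depend on $Y^A$. The linear example at the end is an illustrative assumption (typical fibre $\mathbb{R}^m$ with structural group $GL(m,\mathbb{R})$ and hypothetical coefficients $g^a{}_A$), not a case singled out at this point of the text.

```latex
\[
\frac{\partial(x^i, y^a)}{\partial(X^I, Y^A)}
=
\begin{pmatrix}
\dfrac{\partial x^i}{\partial X^I} & 0 \\[6pt]
\dfrac{\partial y^a}{\partial X^I} & \dfrac{\partial y^a}{\partial Y^A}
\end{pmatrix},
\qquad
\det\frac{\partial(x^i, y^a)}{\partial(X^I, Y^A)}
=
\det\!\left[\frac{\partial x^i}{\partial X^I}\right]
\det\!\left[\frac{\partial y^a}{\partial Y^A}\right].
\]
% Hence (2.4) holds if and only if the full (3+m) x (3+m) Jacobian is non-singular.
% Illustrative assumption: F = R^m, G = GL(m,R), acting linearly on the fibre.
\[
y^a = g^a{}_A(X)\, Y^A
\quad\Longrightarrow\quad
\frac{\partial y^a}{\partial Y^A} = g^a{}_A(X),
\qquad
\operatorname{rank}\left[\frac{\partial y^a}{\partial Y^A}\right] = m
\iff g(X) \in GL(m,\mathbb{R}).
\]
```

In this example the map (2.5) is $X \mapsto [g^a{}_A(X)]$ and the left action in (2.6) is matrix multiplication.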
In closing this section, we remark that whereas the physical space $E^3$ is assumed to have an intrinsic physical meaning and a distinguished metric structure, the typical fibre does not, in principle, possess either, unless and until a concrete physical context is established. From now on, we adopt the convention that the indices H, I, J, K, L, h, i, j, k, l vary within the range 1, 2, 3, while the indices A, B, C, D, E, a, b, c, d, e vary within the range 1, ..., m.
3. The Material Response

3.1. GENERALITIES

We should like to confine our attention to materials whose mechanical behavior is in some sense local. At this point, however, it is not necessary to require any particular locality of response insofar as the fibres (or microstructure carriers) are concerned. We will only demand that the response functional be localized in terms of its dependence on the points of the base manifold (the macromedium). In particular, we wish to consider the case in which the material response is of the first grade, namely, it involves only the local values of the first derivative of the configuration with respect to the base-manifold coordinates. One has to ascertain, however, that such presumed first-grade behavior can be established intrinsically, independently of any particular coordinate system. We will now show that this is indeed the case and that the entity that characterizes the independent argument of the constitutive functional is a well-defined geometric object that we call a fibre jet.

3.2. FIBRE JETS

We start by defining an equivalence relation $\sim_{0,X}$ within the set of all possible configurations of the body bundle M. Two configurations, κ and λ, are said to be $\sim_{0,X}$-equivalent at a point X ∈ B if they take exactly the same values for each point of the fibre $\pi^{-1}(X)$. Using coordinates we may, therefore, write:

$$\kappa \sim_{0,X} \lambda \iff \kappa(X^I, Y^A) = \lambda(X^I, Y^A) \quad \forall\, Y^A \in \pi^{-1}(X). \tag{3.7}$$
More explicitly, denoting with a “hat” the coordinates corresponding to the λ-configuration, the equivalence of κ and λ at a point X ∈ B implies that $x^i(X) = \hat{x}^i(X)$ and $y^a(X^I, Y^A) = \hat{y}^a(X^I, Y^A)$. The second equality is actually an identity of the functions $y^a$ and $\hat{y}^a$ over the whole range of values of the fibre coordinates $Y^A$ for the given (fixed) $X^I$.

REMARK 3.1 If we adopt κ as a reference configuration, then for any other configuration λ which is $\sim_{0,X}$-equivalent to κ, the deformation of the fibre at X, as given by equation (2.5), corresponds to the value g = e, where e is the neutral element of the structural group G.

The equivalence relation just defined partitions the class of all configurations into equivalence classes. Each equivalence class will be called a zeroth fibre jet at X and will be denoted as $J_{0,X}$. In a similar way, we can define higher-order fibre jets. In particular, the first-order fibre jets are obtained from the equivalence relation $\sim_{1,X}$ defined by the conditions:

$$\kappa \sim_{1,X} \lambda \iff \kappa_*(X^I, Y^A) = \lambda_*(X^I, Y^A) \quad \forall\, Y^A \in \pi^{-1}(X), \tag{3.8}$$
where the asterisk subscript indicates the tangent (or gradient) map. The maps $\kappa_*(X^I, Y^A)$ and $\lambda_*(X^I, Y^A)$ are linear maps defined on the tangent space $T_{(X^I, Y^A)}M$
for every $Y^A \in \pi^{-1}(X)$. The equivalence condition (3.8) may also be written as $\kappa_*(X^I, \cdot) = \lambda_*(X^I, \cdot)$. Obviously, the following implication holds:

$$\kappa \sim_{1,X} \lambda \implies \kappa \sim_{0,X} \lambda. \tag{3.9}$$
In any coordinate system the $\sim_{1,X}$-equivalence of κ and λ at a point X boils down to $x^i(X^I) = \hat{x}^i(X^I)$, $y^a(X^I, Y^A) = \hat{y}^a(X^I, Y^A)$, $x^i_{,J}(X^I) = \hat{x}^i_{,J}(X^I)$ and $y^a_{,J}(X^I, Y^A) = \hat{y}^a_{,J}(X^I, Y^A)$, $\forall\, Y^A \in \pi^{-1}(X)$, with commas denoting partial derivatives. It is important to note that these conditions are independent of the coordinate chart since the derivatives along the fibres are automatically identical. Each equivalence class of the relation $\sim_{1,X}$ is called a first-order fibre jet (or a first fibre jet) and is denoted by $J_{1,X}$. The first fibre jet at X corresponding to a configuration κ is denoted as $J_{1,X}\kappa$.

REMARK 3.2 In local coordinates, we can identify a first fibre jet at $X^I$ with a triple

$$\big(x^i_{,J}(X^I),\; y^a(X^I, \cdot),\; y^a_{,J}(X^I, \cdot)\big). \tag{3.10}$$

Here, $x^i_{,J}(X^I)$ is simply an ordinary frame of B at the point X, and $y^a(X^I, \cdot)$ is a diffeomorphism of F belonging to the structural group G. To reveal the nature of the third entry, $y^a_{,J}(X^I, \cdot)$, we may provisionally adopt as a reference configuration one belonging to the given equivalence class, in which case, according to remark 3.1, the second entry can be regarded as the identity e. It follows then that the third entry, being the derivative map of (2.5) evaluated at the identity, can be identified with an element of the set of linear transformations $L(\mathbb{R}^3, \mathfrak{g})$, where $\mathfrak{g}$ is the Lie algebra of G.
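To make Remark 3.2 concrete, the triple (3.10) can be written out for a linear fibre action. This is a sketch under an assumption not made in the text: the typical fibre is taken to be $\mathbb{R}^m$, the structural group $GL(m,\mathbb{R})$, and the fibre deformation $y^a = g^a{}_A(X)\,Y^A$, with $g^a{}_A$ a hypothetical matrix-valued field.

```latex
\[
\bigl(x^i_{,J}(X),\; y^a(X,\cdot),\; y^a_{,J}(X,\cdot)\bigr)
=
\bigl(x^i_{,J}(X),\; g^a{}_A(X)\,Y^A,\; g^a{}_{A,J}(X)\,Y^A\bigr).
\]
% If the reference configuration is chosen within the given equivalence class,
% then g(X) = I at the point considered, and the third entry is the linear map
\[
\mathbb{R}^3 \ni V^J \;\longmapsto\; g^a{}_{A,J}(X)\,V^J \;\in\; \mathfrak{gl}(m,\mathbb{R}),
\]
% i.e. an element of L(R^3, g) with g = gl(m,R), the Lie algebra of GL(m,R).
```

In this special case each of the three components of the third entry is simply an m × m matrix.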
The composition of fibre jets (of the same order) is defined by taking any two representatives of the respective equivalence classes, effecting their composition and then adopting the jet of the composition as the composition of the jets, namely

$$J_{1,\chi(X)}\lambda \circ J_{1,X}\kappa = J_{1,X}(\lambda \circ \kappa). \tag{3.11}$$
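In coordinates, the composition rule (3.11) is an application of the chain rule. In the sketch below the second configuration λ is written, for illustration only, as $z^p = z^p(x^i)$, $u^c = u^c(x^i, y^a)$; the symbols z and u are notation introduced here and do not appear in the text.

```latex
% kappa:  x^i = x^i(X^I),  y^a = y^a(X^I, Y^A)
% lambda: z^p = z^p(x^i),  u^c = u^c(x^i, y^a)      (notation assumed for this sketch)
\[
\begin{aligned}
\frac{\partial z^p}{\partial X^J}
  &= \frac{\partial z^p}{\partial x^i}\,\frac{\partial x^i}{\partial X^J},\\[4pt]
(u \circ \kappa)^c(X,\cdot)
  &= u^c\bigl(x(X),\, y(X,\cdot)\bigr),\\[4pt]
\frac{\partial (u \circ \kappa)^c}{\partial X^J}
  &= \frac{\partial u^c}{\partial x^i}\,\frac{\partial x^i}{\partial X^J}
   + \frac{\partial u^c}{\partial y^a}\,\frac{\partial y^a}{\partial X^J}.
\end{aligned}
\]
```

The right-hand sides involve only the entries of the two triples of the form (3.10), together with derivatives of $u^c$ along the fibre, which are determined by the second entry of the jet of λ.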
It is not difficult to prove that this definition is independent of the particular representatives chosen. From a more physical point of view, we may say that a first fibre jet of a deformation represents the deformation of a “first-order neighbourhood” of a point and its fibre. This terminology, although lacking in mathematical rigor, will be used frequently in an attempt to trigger the right physical intuition.

3.3. FIRST-GRADE RESPONSE

We will now attempt to provide a more rigorous characterization of our introductory remarks as to the material response. For specificity, we will speak of a (time
independent) energy functional W, so that one may say that we are confining our attention to hyperelastic behavior. Although this is not, strictly speaking, a necessary requirement for the developments that follow, we adopt it here so as to concentrate on the more intricate geometric aspects of the theory. The energy functional W at any time t is assumed to be given as the integral over the macromedium B of an energy-density functional w whose independent argument is the first fibre jet of the configuration. In any given reference configuration, the volume element for integration is given by the underlying Cartesian volume element in $E^3$. We may thus write:

$$W = \int_{\chi_0(B)} w(J_{1,X}\kappa;\, X)\, dV(X). \tag{3.12}$$
We note that the density w is still a functional as far as its dependence on the functions $y^a(X^I, \cdot)$ and their X-derivatives is concerned. In other words, the value of w depends on the values of the functions $x^i$ and their derivatives at $X^I$, but also on the entire functions $y^a(X^I, \cdot)$ and their derivatives as functions of the running coordinate $Y^A$. The behavior is, therefore, local only insofar as its dependence on the deformation of the macromedium, but it may be global in terms of its dependence on the deformation of the micromedium. To emphasize this fact, we sometimes will write the energy density w more explicitly (in coordinates) as:

$$w = w\big(x^i_{,J}(X^I),\; y^a(X^I, \cdot),\; y^a_{,J}(X^I, \cdot);\; X^I\big). \tag{3.13}$$

Note that the dependence on $x^i(X^I)$ has been eliminated, since the energy density is assumed to be invariant under space translations. Further exploitations of the principle of frame indifference are not pursued in this paper, since they are not essential to the description of continuous distributions of dislocations at the fundamental level. Notice also that we have intentionally specified a dependence of the energy density functional on the material point X to allow for the fact that the material properties may change from point to point of B or that, even if the material is the same, inhomogeneities may be present. In fact, the nature of this dependence is a major issue in establishing a consistent theory of continuous distributions of dislocations.

As a prolegomenon to the theory, we record here the way in which the energy density changes upon a change of reference configuration. Let a change of reference configuration (i.e., a change of trivialization of M) be given in terms of a fixed deformation $\hat\kappa_0$ of the reference configuration $\kappa_0$:

$$\hat\kappa_0 : \kappa_0(M) \to E^3 \times F. \tag{3.14}$$

The energy density functional $\hat{w}$ to be integrated over the new domain $\hat\chi_0(B)$ is related to w by:

$$\hat{w}\big(J_{1,\hat\chi_0(X)}\hat\kappa;\; \hat\chi_0(X)\big) = \big[\det \hat\chi_{0*}(X)\big]^{-1}\, w\big(J_{1,\hat\chi_0(X)}\hat\kappa \circ J_{1,X}\hat\kappa_0;\; X\big), \tag{3.15}$$
where the leading factor is the inverse of the determinant of the gradient $\hat\chi_{0*}(X)$, and $\hat\kappa$ represents arbitrary deformations measured from the new reference configuration. In coordinates, we may write:

$$\hat{w}\left(\frac{\partial x^i}{\partial \hat{X}^I},\; y^a,\; \frac{\partial y^a}{\partial \hat{X}^I};\; \hat{X}^I\right) = \left[\det\left(\frac{\partial \hat{X}^I}{\partial X^J}\right)\right]^{-1} w\left(\frac{\partial x^i}{\partial \hat{X}^I}\frac{\partial \hat{X}^I}{\partial X^J},\; y^a,\; \frac{\partial y^a}{\partial \hat{X}^I}\frac{\partial \hat{X}^I}{\partial X^J} + \frac{\partial y^a}{\partial \hat{Y}^A}\frac{\partial \hat{Y}^A}{\partial X^J};\; X^J\right). \tag{3.16}$$

Naturally, the independent variables of the functions appearing on either side are different. Thus, for example, the functions $y^a$ on the left-hand side are $y^a(\hat{X}^I, \hat{Y}^A)$, whereas the functions $y^a$ on the right-hand side are to be understood (by composition) as $y^a = y^a(X^J, Y^B) = y^a(\hat{X}^I(X^J), \hat{Y}^A(X^J, Y^B))$, and so on. We remark once again that the value of the coordinate expression at a point of B is invariant under a change of representative of the first fibre-jet.

3.4. THE MATERIAL SYMMETRY GROUP

A change of reference configuration that maps a point X of the macromedium B to itself may happen to have the property that it leaves the material response at X unchanged. Since in order to check whether or not this is the case one has to make use of the transformation equation (3.15) (or its coordinate version (3.16)), and since this equation is sensitive only to the first fibre jet at X, we need only consider the fibre jets of changes of configuration (that preserve the point X). By analogy with the case of simple materials, we define a material symmetry H at the point X ∈ B relative to the reference configuration $\kappa_0$ of M as a first fibre-jet of a change of configuration that preserves X and such that the equation:

$$w(J_{1,X}\kappa;\, X) = w((J_{1,X}\kappa) \circ H;\, X) \tag{3.17}$$
is satisfied identically for all deformations κ. Note that the determinant of the gradient of the macromedium deformation does not appear since, in accordance with the theory of unstructured continua, we assume that only volume preserving local transformations can be physically meaningful symmetries.

A material symmetry at a point X ∈ B is given by a triple $(G, L_g, L)$. Here G is an element of the special linear group $SL(3, \mathbb{R})$ and has the meaning of a symmetry of the macromedium at X, $L_g$ is the left translation induced by an element g of the structural group of the bundle (written $\mathcal{G}$ in this subsection, to distinguish it from the matrix G) that acts on the fibre $\pi^{-1}(X)$, and $L(X, \cdot)$ represents a mixed symmetry of micro- and macro-structures. We remark here also that $L(X, \cdot)$ can be regarded as an element of $L(\mathbb{R}^3, \mathfrak{g})$. This follows from a consideration similar to that embodied in remark 3.2 taking into account that all tangent spaces to a Lie group are canonically isomorphic via the adjoint map.

The collection of all symmetries at X forms a group $H_X$ whose group operation is represented by the composition of fibre jets. More explicitly, the material symmetry group $H_X$ is a subgroup of the semidirect product $GL(3, \mathbb{R}) \times \mathcal{G} \times L(\mathbb{R}^3, \mathfrak{g})$, where the multiplication law is given by:

$$(G, L_g, L)(G', L_{g'}, L') = (GG',\; L_{gg'},\; LG' + (L_g)_* L'), \tag{3.18}$$

where $(L_g)_*$ is the automorphism of the Lie algebra $\mathfrak{g}$ induced by the left translation $L_g$. The neutral element of this group is given by $(I_3, e, 0)$, where e is the neutral element of the structural group $\mathcal{G}$. The inverse of an element $(G, L_g, L)$ is given by $(G, L_g, L)^{-1} = (G^{-1},\; L_{g^{-1}},\; -(L_{g^{-1}})_* L G^{-1})$, as can be checked by substitution into (3.18).

We observe that every fibre jet determines uniquely by projection an ordinary jet at the base manifold. More precisely, let $J_{1,X}\kappa$ be a fibre jet. The induced ordinary first jet at X is given by $\pi(J_{1,X}\kappa) = j_{1,X}(\chi)$, where χ is the ordinary configuration of B associated with the bundle configuration κ, and where j denotes an ordinary jet. The collection $G_X = \pi(H_X)$ obtained by taking the projection of each and every symmetry $H \in H_X$ is a subgroup of $GL(3, \mathbb{R})$ called the induced symmetry group of the macromedium at X. An important subgroup of $H_X$ is the group $SH_X$ obtained by considering only those symmetries stemming from changes of configuration whose zeroth fibre jet at X is the identity. It follows that $SH_X$ is a normal subgroup of $H_X$. The quotient group $H_X / SH_X$ carries the physical meaning of global symmetries of the isolated fibre manifold itself. Using the above notations, the normal subgroup can be expressed as $SH_X = \{(G, e, L)\}$. Then it is easy to see that the quotient group $H_X / SH_X$ is isomorphic to a subgroup of the structural group $\mathcal{G}$ of the fibre bundle. The symmetry groups of one and the same material point relative to two different reference configurations are related by conjugation through the first fibre jet of the change of reference configuration.

3.5. MATERIAL ISOMORPHISMS AND UNIFORMITY

The energy density functional w varies from point to point of the macromedium B, as explicitly indicated in equations (3.12) or (3.13) by the dependence of w on the last argument, X. It is, therefore, legitimate to ask the question: are two points, $X_1$ and $X_2$, of B made of the same material?
A necessary and sufficient condition for this to be the case is, most certainly, the existence of some reference configuration in which the identity $w(J_{1,X_1}\kappa;\, X_1) = w(J_{1,X_2}\kappa;\, X_2)$ is satisfied for all $J_{1,X_1}\kappa = J_{1,X_2}\kappa$. By this last equation we mean, abusing the notation, that we are comparing fibre jets at two different points by means of the parallelism induced by the trivialization implied in the choice of reference configuration. In other words, we say that the two points are made of the same material if, in some reference configuration, they have exactly the same response to the “same” deformations. The naive, but intuitively clear, definition just introduced already points at the fact that, even if all the points of B happen to be made of the same material, there might not exist a common global reference configuration for which the identity just quoted is satisfied for every pair of points. Thus, we can intuitively distinguish
between the concept of uniformity (“all points are made of the same material”) and the idea of homogeneity (“there exists a reference configuration in which the energy density w is independent of position”). An intermediate situation of local homogeneity is also possible (“for each point of B there exists a reference configuration for which the energy density is independent of position in a neighbourhood of the point”). These ideas, which are at the heart of Noll and Wang’s treatment of the theory of continuous distributions of dislocations in simple bodies, will now be extended to materials with general microstructure.

We say that two points $X_1$ and $X_2$ of a macromedium B are materially isomorphic (read: “made of the same material”) if there exists a body-bundle morphism $\kappa_{1,2}$ such that $\chi_{1,2}(X_1) = X_2$ and

$$w\big(J_{1,X_2}\kappa \circ J_{1,X_1}\kappa_{1,2};\; X_1\big) = \det\big(\chi_{1,2*}(X_1)\big)\; w\big(J_{1,X_2}\kappa;\; X_2\big), \tag{3.19}$$
for all jets $J_{1,X_2}\kappa$. We will use the notation $P(X_1, X_2) = J_{1,X_1}\kappa_{1,2}$. Physically, this jet represents a “transplant operation” which achieves a perfect graft, as far as the mechanical response is concerned. What has been done is to cut out a first-order neighbourhood of the point $X_1$, including its fibre $\pi^{-1}(X_1)$, deform it according to the map $P(X_1, X_2)$, and implant it in the place of a similar neighbourhood of $X_2$ and its fibre. The identity above expresses the fact that the graft has been successful, and this can only happen if the materials are the same!

If we make the source and target points of a material isomorphism coincide, we obtain a material automorphism, namely, we recover the concept of a material symmetry at a point. Let H be a material symmetry at the point $X_1 \in B$ and let $P(X_1, X_2)$ be a material isomorphism between $X_1$ and another point $X_2$. Then the composition of jets $P(X_1, X_2) \circ H \circ P^{-1}(X_1, X_2)$ is a material symmetry at $X_2$. It is not difficult to show that all material symmetries at $X_2$ can be obtained in this way and, consequently, that the symmetry groups at two materially isomorphic points are conjugate to each other, the conjugation being effected by means of any material isomorphism.

A body M → B with microstructure is said to be materially uniform if all the points of B are pairwise materially isomorphic. Since material isomorphism is clearly an equivalence relation, an alternative definition of material uniformity consists of establishing the material isomorphisms of all points of B with a fixed point $X_0 \in B$. For clarity, one may conceive of this archetypal point $X_0$ as placed outside the body. Naturally, this archetype consists of a point carrying the typical fibre and a first-order neighbourhood of both. Once the archetype is chosen, the field of transplants becomes a function of one variable alone, namely: $P(X) = P(X_0, X)$.
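The statement made earlier in this subsection, that $P(X_1, X_2) \circ H \circ P^{-1}(X_1, X_2)$ is a material symmetry at $X_2$, can be verified directly from (3.17) and (3.19). The following sketch abbreviates $P = P(X_1, X_2)$, writes J for an arbitrary first fibre jet at $X_2$, and reads the scalar factor in (3.19) as the determinant of $\chi_{1,2*}(X_1)$, denoted d; this reading and these abbreviations are assumptions of the sketch.

```latex
\[
\begin{aligned}
d\; w\bigl(J \circ P \circ H \circ P^{-1};\, X_2\bigr)
  &= w\bigl(J \circ P \circ H \circ P^{-1} \circ P;\, X_1\bigr)
     && \text{by (3.19)}\\
  &= w\bigl((J \circ P) \circ H;\, X_1\bigr) \\
  &= w\bigl(J \circ P;\, X_1\bigr)
     && \text{by (3.17) at } X_1\\
  &= d\; w\bigl(J;\, X_2\bigr)
     && \text{by (3.19)}.
\end{aligned}
\]
% Since d is nonzero, w(J o P o H o P^{-1}; X_2) = w(J; X_2) for every jet J at X_2.
```

Since d is nonzero, the two ends of the chain give condition (3.17) at $X_2$ for the conjugated jet.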
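Let $(X^\mu)$ be the coordinates of the archetypal point $X_0$ and $(Y^\alpha)$ the fibre coordinates along the typical fibre. Then a uniformity field can be written as $P(X) = (P^I_\mu(X),\; Y^A(X, \cdot),\; R^A_\mu(X, \cdot))$. Here $Y^A(X, \cdot)$ and $R^A_\mu(X, \cdot)$ are functions of $Y^\alpha$. In terms of the archetypal energy density functional $\bar{w}$ per unit volume of the archetype the uniformity condition can be written as

$$w(J_{1,X}\kappa;\, X) = \big[\det P(X)\big]^{-1}\, \bar{w}\big(J_{1,X}\kappa \circ P(X)\big), \tag{3.20}$$

where the determinant is understood as that of the macroscopic part $P^I_\mu(X)$ of the uniformity field.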
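Conditions (3.19) and (3.20) are consistent with one another if the transplant between two points is composed through the archetype, $P(X_1, X_2) = P(X_2) \circ P(X_1)^{-1}$. The sketch below assumes that the scalar factors in (3.19) and (3.20) are the determinants of the macroscopic parts of the corresponding jets, denoted here by $\det P(X)$; J is an arbitrary first fibre jet at $X_2$.

```latex
\[
\begin{aligned}
w\bigl(J \circ P(X_2) \circ P(X_1)^{-1};\, X_1\bigr)
  &= \bigl[\det P(X_1)\bigr]^{-1}\,
     \bar{w}\bigl(J \circ P(X_2) \circ P(X_1)^{-1} \circ P(X_1)\bigr)
     && \text{by (3.20) at } X_1\\
  &= \bigl[\det P(X_1)\bigr]^{-1}\,\bar{w}\bigl(J \circ P(X_2)\bigr)\\
  &= \frac{\det P(X_2)}{\det P(X_1)}\; w\bigl(J;\, X_2\bigr)
     && \text{by (3.20) at } X_2 .
\end{aligned}
\]
% The ratio det P(X_2) / det P(X_1) is the determinant of the macroscopic part of
% P(X_2) o P(X_1)^{-1}, so this is (3.19) with P(X_1, X_2) = P(X_2) o P(X_1)^{-1}.
```

Under these assumptions, a uniformity field in the sense of (3.20) therefore yields material isomorphisms in the sense of (3.19) between every pair of points.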
A body M is said to be smoothly uniform if the uniformity field (or “field of implants”) P (X) is smooth. Although in some important practical applications it may be the singularities of this field that matter the most, we will henceforth assume that the body is smoothly uniform.
3.6. THE MATERIAL LIE-GROUPOID

We abandon for the moment the mechanical motivation and turn our attention to the following interesting geometrical object. For a given fibre bundle M over the base manifold B we consider a pair of points $X_1$ and $X_2$ of B and we construct all possible fibre jets of fibre-bundle morphisms that map $X_1$ to $X_2$. In other words, we consider the collection of all possible transplants (regardless of the material response). We now form the union of all these collections of jets for all pairs of points in B. The result is an object J(B, M) endowed with two “projection” maps. Indeed, given any element of J(B, M), namely a fibre jet between some points $X_1$ and $X_2$, the first projection, α: J(B, M) → B, points at the source $X_1$, while the second projection, β: J(B, M) → B, points to the target point $X_2$.

In addition to being endowed with two projections, the object J(B, M) enjoys other properties. Firstly, the subset $J_X(B, M) = \{J \in J(B, M) \mid \alpha(J) = \beta(J) = X\}$ is a group. Secondly, if $J_1, J_2 \in J(B, M)$ and $\beta(J_1) = \alpha(J_2)$, then the composition $J_2 \circ J_1$ belongs to J(B, M). Finally, if J ∈ J(B, M) then also $J^{-1} \in J(B, M)$. A set with these properties is called a groupoid, and we will call J(B, M) the fibre jet groupoid associated with the fibre bundle M. It is not difficult to see that all the groups $J_X(B, M)$ are conjugate to each other. Any one of them can be rightly called the structural group of the groupoid. When, as in the case J(B, M), the projections α and β are smooth functions, we have a Lie groupoid. Notice that the projections are maps onto B, so it is appropriate to say that B (and not M) is the base manifold of the groupoid J(B, M).
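The conjugacy of the groups $J_X(B, M)$ asserted above follows from the three groupoid properties just listed. As a short sketch, fix any J ∈ J(B, M) with $\alpha(J) = X_1$ and $\beta(J) = X_2$; the name $c_J$ for the induced map is introduced here for convenience and is not part of the text.

```latex
\[
c_J : J_{X_1}(B, M) \to J_{X_2}(B, M), \qquad c_J(K) = J \circ K \circ J^{-1}.
\]
% c_J(K) has source and target X_2, so it lies in J_{X_2}(B, M).
\[
c_J(K_1 \circ K_2) = J \circ K_1 \circ J^{-1} \circ J \circ K_2 \circ J^{-1}
                   = c_J(K_1) \circ c_J(K_2),
\qquad
(c_J)^{-1} = c_{J^{-1}}.
\]
```

Thus $c_J$ is a group isomorphism, which in general depends on the choice of the jet J joining the two points.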
If we reintroduce the temporarily abandoned material picture and consider, for a given uniform body bundle M with energy density w, the set of all those fibre jets representing material isomorphisms, we obtain a subset Jw (B, M) of J(B, M) enjoying precisely all the properties required by the definition of a groupoid. We say that Jw (B, M) is a subgroupoid of J(B, M). This material subgroupoid has as its structural group the material symmetry group of any of the points of B. If we adopt the idea of an archetypal point outside the body, we may conveniently say that the structural group of the material subgroupoid is the material symmetry group H0 of the archetype. By the assumed smoothness of the uniformity field, Jw (B, M) is a Lie groupoid, which we shall call the material Lie groupoid of M. There exists a parallel, almost equivalent, picture that is worth revealing. A first fibre-jet J1,X κ of a local trivialization κ can be said to define a fibre frame at X ∈ B. With this picture in mind, if we consider the collection of all fibre frames at all points of B, we can construct a fibre bundle F (B, M) over B, with projection τ: F (B, M) → B. The typical fibre of this bundle consists of all possible changes
of fibre frames and is, therefore, a group. It can be shown that F(B, M) is actually a principal bundle, which we call the principal bundle of fibre frames of M. It is worthwhile repeating that the base manifold of this bundle is B, not M. If we now reconsider the notion of the material archetype, we realize that each uniformity implant P(X) (or, rather, its inverse) is nothing but a fibre frame at X. A uniformity field is then just a section of the principal bundle of fibre frames. The set of all possible implants from a given archetype, consistent with a given constitutive law, forms a subbundle of F(B, M). This subbundle is itself a principal bundle, whose structural group is the material symmetry group $H_0$ of the archetype. In this way, we have obtained a generalization of the concept of a G-structure, that we call a fibre G-structure.

3.7. HOMOGENEOUS BODIES WITH MICROSTRUCTURE

The uniformity concept for a body with microstructure has been introduced and studied in Section 3.5. In the same section we introduced in general terms the ideas of homogeneity and local homogeneity for a body bundle. We will now make these ideas more precise. We say that a body bundle M is homogeneous with respect to a given fibre frame of an archetypal point $X_0$ if it admits a global deformation κ such that the fibre jet $J_{1,X}\kappa^{-1} = P(X)$ is a uniformity field. If this is the case we say that the associated fibre G-structure is integrable. In other words, integrability is equivalent to the existence of a section P of the fibre G-structure that is the fibre jet $J_{1,X}\kappa^{-1}$ of a global deformation κ. In local coordinates we have that the body bundle is homogeneous if there exists a uniformity field $P(X) = (P^I_\mu(X),\; Y^A(X, y^\alpha),\; R^A_\mu(X, y^\alpha))$ which, by a global change of reference configuration, can be brought to the trivial field $\{\delta^I_\mu, e, 0\}$.

The first component can be achieved if the inverse matrix of $P^I_\mu(X)$ is derivable from three scalar functions $x^\mu(X^I)$ by $(P^{-1})^\mu_I = \partial x^\mu / \partial X^I$. This will be the case if the equality of mixed partial derivatives is satisfied, namely:

$$(P^{-1})^\mu_{I,J} = (P^{-1})^\mu_{J,I}.$$

The second entry can always be achieved by just inverting the function $Y^A(X, y^\alpha)$ for each X thus obtaining a function $y^\alpha(X, Y^A)$. If the first condition is satisfied, we have determined the change of configuration $x^\mu = x^\mu(X^I)$ and $y^\alpha = y^\alpha(X^I, Y^A)$. We still have to check whether the third entry, for this particular change of configuration, vanishes. By the law of composition of first fibre jets, this will be the case if

$$\frac{\partial y^\alpha}{\partial Y^A}\, R^A_\mu\, (P^{-1})^\mu_I + \frac{\partial y^\alpha}{\partial X^I} = 0.$$

REMARK 3.3 We note that if the symmetry group is discrete, then the test just described is necessary and sufficient for (local) homogeneity. In the case of a continuous symmetry group, however, the degree of freedom afforded by the continuity has to be taken into consideration, each group leading to different necessary and sufficient criteria. In other words, given a particular uniformity field the test
is sufficient for homogeneity, but the body may still be homogeneous if the test is violated, since there may exist another independent uniformity field that satisfies it.

In more geometrical language, we can say that a uniformity field for a body-bundle induces three different parallelisms: (i) the ordinary material parallelism of Noll, as determined by the field of matrices $P^I_\mu$. On an appropriate coordinate patch the corresponding Christoffel symbols will be curvature-free, since the parallelism is obviously distant; (ii) a distant parallelism between the fibres of the body-bundle M induced by the functions $Y^A(X, y^\alpha)$, two points at two different fibres $\pi^{-1}(X_1)$ and $\pi^{-1}(X_2)$ being in correspondence if they have the same values of the coordinates $y^\alpha$; and (iii) a curve dependent parallelism on M generated by the horizontal distribution spanned by the vectors:

$$H_\mu = P^I_\mu(X)\,\frac{\partial}{\partial X^I} + R^A_\mu(X, y^\alpha)\,\frac{\partial}{\partial Y^A}. \tag{3.21}$$

The test of local homogeneity described above is equivalent to the following geometric conditions: (i) the torsion of the first connection vanishes; (ii) the horizontal distribution spanned by $H_\mu$ is involutive, thus giving rise to a distant fibre parallelism; and (iii) this distant parallelism coincides with the one induced by $Y^A(X, y^\alpha)$.

It is remarkable that a necessary condition for the above homogeneity criteria to be satisfied can be expressed in terms of an ordinary linear connection on M, namely, an object that lives in the principal bundle of frames of M, or in its associated tangent bundle TM. Indeed, consider the field of frames of M given by the 3 + m linearly independent vector fields $H_\mu$ and $V_\alpha = (\partial Y^A / \partial y^\alpha)(\partial / \partial Y^A)$. Then, a necessary condition for local homogeneity is the vanishing of the following three inhomogeneity tensors:

$$\begin{cases} P^I_\mu \left( \dfrac{\partial (P^{-1})^\mu_J}{\partial X^K} - \dfrac{\partial (P^{-1})^\mu_K}{\partial X^J} \right); \\[8pt] \dfrac{\partial Y^A}{\partial y^\alpha} \left( \dfrac{\partial R^\alpha_J}{\partial X^K} - \dfrac{\partial R^\alpha_K}{\partial X^J} \right); \\[8pt] \dfrac{\partial Y^A}{\partial y^\alpha} \left( \dfrac{\partial R^\alpha_J}{\partial Y^B} - \dfrac{\partial^2 y^\alpha}{\partial Y^B\, \partial X^J} \right), \end{cases} \tag{3.22}$$

where we have set $R^\alpha_I = -(\partial y^\alpha / \partial Y^A)\, R^A_\mu\, (P^{-1})^\mu_I$. The above mentioned tensors are nothing but the torsion components of the complete parallelism (linear connection) D on M induced by the field of frames on M given by $H_\mu$ and $V_\alpha$ or, in matrix form:

$$F(X, Y) = \begin{pmatrix} (P^{-1})^\mu_I(X) & 0 \\[4pt] R^\alpha_I(X, Y) & \dfrac{\partial y^\alpha}{\partial Y^A}(X, Y) \end{pmatrix}. \tag{3.23}$$
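As an illustration of the first of the inhomogeneity tensors (3.22), consider a uniformity field whose macroscopic part is given below. The field itself, the constant c, and the symbol $T^I{}_{JK}$ for the first tensor are assumptions made for this sketch only.

```latex
\[
\bigl[(P^{-1})^{\mu}{}_{I}\bigr] =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & c\,X^1 & 1 \end{pmatrix},
\qquad
\bigl[P^{I}{}_{\mu}\bigr] =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -c\,X^1 & 1 \end{pmatrix}.
\]
% First tensor of (3.22), with rows labelled by the upper index:
\[
T^{I}{}_{JK} = P^{I}{}_{\mu}
\left( \frac{\partial (P^{-1})^{\mu}{}_{J}}{\partial X^K}
     - \frac{\partial (P^{-1})^{\mu}{}_{K}}{\partial X^J} \right).
\]
% The only non-constant entry is (P^{-1})^3_2 = c X^1, and P^I_3 = delta^I_3, so
\[
T^{3}{}_{21} = P^{3}{}_{3}\,\frac{\partial (c\,X^1)}{\partial X^1} = c,
\qquad
T^{3}{}_{12} = -c,
\]
% and all other components vanish.
```

For c ≠ 0 the equality of mixed partial derivatives fails, so no functions $x^\mu(X^I)$ with $(P^{-1})^\mu_I = \partial x^\mu / \partial X^I$ exist for this particular uniformity field. A constant torsion component of this type can be read as a uniform density of screw dislocations along the third coordinate direction.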
The complete parallelism on M induced by this field of frames determines a unique linear connection D on M for which the frame $\{H_\mu, V_\alpha\}$ is covariantly constant. This means that $D_Z H_\mu = D_Z V_\alpha = 0$ for all vector fields Z on M. The linear connection D is curvature free. If we express the torsion $T(Z, W) = D_Z W - D_W Z - [Z, W]$ with respect to the frame $\{H_\mu, V_\alpha\}$, then there are only three nonzero components. These components are the inhomogeneity tensors (3.22).

REMARK 3.4 The field of frames (3.23) has some important features: it is projectable and preserves the vertical vectors with respect to the projection π. The first tensor (3.22) measures the inhomogeneity of the macrostructure. This tensor is the torsion of the complete parallelism (linear connection) ∇ on B induced by the frame $(P^{-1})^\mu_I(X)$. It is easy to see that the linear connection ∇ on B is the projection of the linear connection D on M.

4. Additional Topics

4.1. ON STRESSES AND CONFIGURATIONAL FORCES

Since our purpose has been merely to reveal the geometrical material structure underlying a distribution of dislocations or other forms of inhomogeneity in a body with general microstructure, we have not discussed the field equations that govern the possible motion and material evolution of such bodies. Our intention in this subsection is to touch upon this issue briefly so as to prepare the ground for future work on the applications of the general theory to particular cases. We start by recalling once again that the energy density function w appearing in equation (3.13) is in fact a functional as far as its dependence on the fibre deformation $y^a(X, \cdot)$ and its material gradient $y^a_{,I}(X, \cdot)$ are concerned. At this level of generality one may, therefore, define the microstresses as functional derivatives (such as Gâteaux or Fréchet derivatives) of w with respect to the functions $y^a(X, \cdot)$ and $y^a_{,I}(X, \cdot)$.
In particular cases, the functional w may eventually turn out to be expressed in terms of an ordinary function of a finite number of parameters. An important example of this type is discussed in the next subsection. In such cases, then, the microstresses will boil down to a finite number of “hyperstress tensors” defined at each point of the macromedium B. Another case of practical importance is a situation in which the functional can be represented as an integral over the fibre, via an appropriately defined volume element arising from physical considerations. In these cases, it appears that the microstresses will be reducible to a number of tensor fields defined over the fibres themselves. In all cases, not necessarily confined to the two examples just mentioned, the relevant form of the equations of motion can be gathered from an exercise consisting of postulating a form of the kinetic energy and then demanding that the corresponding Lagrangian attain an extremal value. An issue directly related to that of microstresses is that of Eshelby-like stresses for material with microstructure. Born from the classical article of Eshelby [11] on the force on an isolated singularity in an elastic material, the notions of Eshelby
CONTINUOUS DISTRIBUTIONS OF DISLOCATIONS
stresses and configurational forces have acquired in recent years considerable impetus. The conceptual reach of the original idea has been extended to a large variety of physical situations, including phase transitions and material growth, some of which are discussed at length in [17] and [12]. It is a fundamental feature of the theory of continuous distributions of inhomogeneities that, once the nature of the uniformity maps has been established for a particular theory on the basis of the kinematic variables involved, the correct expression for the corresponding configurational stresses arises canonically as the static dual in, say, the free energy expression (see [9, 10, 4]). Thus, the kinematics alone, by determining the appropriate uniformity maps, determines the form of the Eshelby stresses. This feature is particularly useful in the case of materials with microstructure. Within the generality maintained thus far in this article, the Eshelby microstresses would be measured, indeed, by means of functional derivatives of the right-hand side of equation (3.20) with respect to the uniformity field P(X), namely, with respect to the triple $P(X) = (P^I_\mu(X),\, Y^A(X,\cdot),\, R^A_\mu(X,\cdot))$. In any particular theory, these configurational microstresses will have exactly the same nature as their Newtonian counterparts. The evolution of the material structure, namely, the time evolution of the corresponding groupoid within the class of conjugate groupoids, will then be subjected to a number of formal restrictions. For the case in which the microstructure describes a second gradient behaviour, these restrictions have been studied in detail elsewhere [5, 6], but a general treatment is lacking.

4.2. BODIES WITH LINEAR MICROSTRUCTURE

In this section we shall show how the theory applies for the particular case when the fibre bundle (M, π, B) is the tangent bundle (TB, π, B) of a material body B.
As the typical fibre is a linear space and the structural group is the general linear group Gl(3, R), we say that the microstructure behaves linearly. We consider a material body B, that is, a three-dimensional manifold that can be covered with just one chart, and we denote by (TB, π, B) its tangent bundle. Then a configuration of the body bundle is given by a map

$$\kappa: TB \to E^3 \times \mathbb{R}^3 \tag{4.24}$$

that preserves the fibre structure, or, equivalently, by the six smooth functions:

$$\begin{cases} x^i = x^i(X^I), & \operatorname{rank}\left(\dfrac{\partial x^i}{\partial X^I}\right) = 3, \\[2mm] y^i = H^i_I(X^I)\,Y^I, & \operatorname{rank}(H^i_I) = 3. \end{cases} \tag{4.25}$$

It is important to remark that, in this case, the function $y^i$ describing the fibre deformation at a point X ∈ B is completely defined by the matrix $H^i_I(X)$. This means that the energy density functional w effectively has become an ordinary function of a finite number of variables. In order to study the mechanical behavior
of a body with microstructure, we have defined in Section 3.2 the first order fibre jet of a configuration. In our case, when a configuration κ has the form given by (4.24) or (4.25), the fibre jet is given by

$$J_{1,X} = \left(x^i_{,J}(X),\; H^i_J(X),\; H^i_{J,K}(X)\right). \tag{4.26}$$

Here $H^i_J(X)Y^J$ and $H^i_{J,K}(X)Y^J$ have to be seen as linear maps. Consequently, the material response at each point X ∈ B of the body bundle is given by an ordinary function:

$$w = w\left(x^i_{,J}(X),\; H^i_J(X),\; H^i_{J,K}(X);\; X^I\right) \tag{4.27}$$
(compare with [7, 8]). A material symmetry H of a point X ∈ B is a triple $(G^I_J, K^I_J, L^I_{JM})$ such that

$$w\left(x^i_{,J},\, H^i_J,\, H^i_{J,M}\right) = w\left(x^i_{,I}G^I_J,\; H^i_I K^I_J,\; H^i_{I,N}K^I_J G^N_M + H^i_I K^I_N L^N_{JM}\right). \tag{4.28}$$

As the structural group of the tangent bundle (TB, π, B) is Gl(3, R), and as the Lie algebra of this group is the algebra of matrices $M_3(\mathbb{R})$, then $L(\mathbb{R}^3, M_3(\mathbb{R}))$ can be identified with $L_2(3)$, the set of all bilinear maps from $\mathbb{R}^3 \times \mathbb{R}^3$ to $\mathbb{R}^3$. The material symmetry group is then a subgroup of the semidirect product $Gl(3, \mathbb{R}) \times Gl(3, \mathbb{R}) \times L_2(3)$. The uniformity of the body bundle B reduces to the existence of a globally defined field

$$P(X) = \left(P^I_\alpha(X),\; Q^I_\alpha(X),\; R^I_{\alpha\beta}(X)\right). \tag{4.29}$$

For the general case, a body with microstructure is homogeneous if the three inhomogeneity tensors (3.22) induced by a uniformity field vanish. Due to the particular form (4.29) of the uniformity field, the second inhomogeneity tensor vanishes if the third one does. Consequently, we may conclude that a body with linear microstructure is homogeneous if there exists a field of uniformities (4.29) with zero inhomogeneity tensors:

$$\begin{cases} P^I_\alpha\left(\dfrac{\partial (P^{-1})^\alpha_J}{\partial X^K} - \dfrac{\partial (P^{-1})^\alpha_K}{\partial X^J}\right); \\[3mm] Q^I_\alpha\left(R^\alpha_{LJ} - \dfrac{\partial (Q^{-1})^\alpha_L}{\partial X^J}\right). \end{cases} \tag{4.30}$$

Here $R^\alpha_{IJ}$ is the third entry of the inverse of the uniformity field (4.29) and is given by:

$$R^\alpha_{IJ} = -(Q^{-1})^\alpha_L\, R^L_{\beta\gamma}\, (Q^{-1})^\beta_I\, (P^{-1})^\gamma_J. \tag{4.31}$$

The vanishing of the two tensors (4.30) implies the existence of a diffeomorphism $\kappa: (X^I, Y^I) \in TB \to (x^\alpha(X),\, (Q^{-1})^\alpha_I(X)\,Y^I) \in E^3 \times \mathbb{R}^3$ such that

$$(P^{-1})^\alpha_I = \frac{\partial x^\alpha}{\partial X^I} \quad\text{and}\quad R^\alpha_{IJ} = \frac{\partial (Q^{-1})^\alpha_I}{\partial X^J}. \tag{4.32}$$
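The first condition in (4.30) lends itself to a quick numerical check. The following is a minimal two-dimensional sketch under assumptions that are not part of the text: the map behind `jacobian_field` and the angle used in `rotation_field` are hypothetical examples, indices are zero-based, and derivatives are approximated by central differences. Because $P^I_\alpha$ is nonsingular, the first tensor vanishes exactly when the bracket $\partial_K (P^{-1})^\alpha_J - \partial_J (P^{-1})^\alpha_K$ does, so only the bracket is evaluated.

```python
import math

def jacobian_field(X):
    """(P^-1)^alpha_J taken as the gradient of a hypothetical map
    x^1 = X1 + 0.1*X2**2, x^2 = X2 + 0.2*X1*X2 (an integrable frame)."""
    X1, X2 = X
    return [[1.0, 0.2 * X2],
            [0.2 * X2, 1.0 + 0.2 * X1]]

def rotation_field(X):
    """(P^-1)^alpha_J taken as a rotation by the hypothetical angle theta = X1,
    which is not the gradient of any map."""
    th = X[0]
    return [[math.cos(th), math.sin(th)],
            [-math.sin(th), math.cos(th)]]

def bracket(Pinv, X, h=1e-5):
    """B[alpha][J][K] = d(P^-1)^alpha_J/dX^K - d(P^-1)^alpha_K/dX^J,
    with derivatives approximated by central differences."""
    def d(alpha, J, K):
        Xp, Xm = list(X), list(X)
        Xp[K] += h
        Xm[K] -= h
        return (Pinv(Xp)[alpha][J] - Pinv(Xm)[alpha][J]) / (2 * h)
    return [[[d(a, J, K) - d(a, K, J) for K in range(2)]
             for J in range(2)] for a in range(2)]

def max_abs(B):
    return max(abs(v) for a in B for row in a for v in row)

point = [0.3, -0.7]
integrable = max_abs(bracket(jacobian_field, point))      # numerically zero
non_integrable = max_abs(bracket(rotation_field, point))  # clearly nonzero
print(integrable, non_integrable)
```

The gradient field passes the test because mixed second derivatives commute, which is the content of the first relation in (4.32); the rotation field fails it, so no coordinate map can induce that frame.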
It is important to note here that, just as in the general case, the two inhomogeneity tensors (4.30) are the torsion components of a linear connection D on TB. This linear connection is the complete parallelism induced by the following field of frames on TB:

$$F(X,Y) = \begin{pmatrix} (P^{-1})^\alpha_I(X) & 0 \\ R^\alpha_{JI}(X)\,Y^J & (Q^{-1})^\alpha_I(X) \end{pmatrix}. \tag{4.33}$$

In [7], the homogeneity of a second order simple material body reduces to the vanishing of two inhomogeneity tensors. These tensors are exactly the same as the inhomogeneity tensors (4.30), but they were obtained in a different way. The first inhomogeneity tensor is the torsion of the complete parallelism of the material body B induced by the frame $(P^{-1})^\alpha_I$, while the second is the difference of the two linear connections induced by the fields $(Q^{-1})^\alpha_I$ and $R^\alpha_{IJ}$. The material Lie groupoid consists of all triples

$$P(\hat{X}, X) = \left(P^{\hat{I}}_I(\hat{X}, X),\; Q^{\hat{I}}_I(\hat{X}, X),\; R^{\hat{I}}_{IJ}(\hat{X}, X)\right) \tag{4.34}$$

of material isomorphisms from X to $\hat{X}$ that are compatible with the energy density functional w. A fibre frame, as defined in Section 3.6, can be seen now as a second order frame (4.29) on B or as a special frame (4.33) on TB. Then the induced fibre G-structure is isomorphic to a second order G-structure on B or a G-structure on TB. The homogeneity of the body bundle is equivalent to the integrability of the second order G-structure (see [8]) or to the integrability of the G-structure on TB, and each of these conditions is equivalent to the vanishing of the tensors (4.30).

Acknowledgements

This work has been partially supported by the Natural Sciences and Engineering Research Council of Canada. The second author (I.B.) would like to thank Dr. Epstein for his support during the visit at the University of Calgary. The first author (M.E.) gratefully acknowledges an illuminating discussion with Professor Gianfranco Capriz on the possible applications of the theory to higher-grade liquid crystals and nematic elastomers.

References

1. B.A. Bilby, Continuous distributions of dislocations. In: Progress in Solid Mechanics, Vol. 1. North-Holland, Amsterdam (1960) pp. 329–398.
2. G. Capriz, Continua with Microstructure. Springer (1989).
3. E. Cosserat and F. Cosserat, Théorie des Corps Déformables. Hermann, Paris (1909).
4. M. Epstein, Eshelby-like tensors in thermoelasticity. In: W. Muschik and G.A. Maugin (eds), Nonlinear Thermomechanical Processes in Continua, Vol. 61. TUB-Dokumentation, Berlin (1992) pp. 147–159.
5. M. Epstein, On the anelastic evolution of second-grade materials. Extracta Mathematicae 14 (1999) 157–161.
6. M. Epstein, Towards a complete second-order evolution law. Math. Mech. Solids 4 (1999) 251–266.
7. M. Epstein and M. de León, Homogeneity conditions for generalized Cosserat media. J. Elasticity 43 (1996) 189–201.
8. M. Epstein and M. de León, Geometrical theory of uniform Cosserat media. J. Geom. Physics 26 (1998) 127–170.
9. M. Epstein and G.A. Maugin, Sur le tenseur de moment matériel d'Eshelby en élasticité non linéaire. C. R. Acad. Sci. Paris 310/II (1990) 675–768.
10. M. Epstein and G.A. Maugin, The energy-momentum tensor and material uniformity in finite elasticity. Acta Mechanica 83 (1990) 127–133.
11. J.D. Eshelby, The force on an elastic singularity. Philos. Trans. Roy. Soc. London A 244 (1951) 87–112.
12. M.E. Gurtin, Configurational Forces as Basic Concepts of Continuum Physics. Springer, Berlin (2000).
13. K. Kondo, Geometry of elastic deformation and incompatibility. In: Memoirs of the Unifying Study of the Basic Problems in Engineering Science by Means of Geometry. Tokyo Gakujutsu Bunken Fukyu-Kai (1955).
14. E. Kroener, Allgemeine Kontinuumstheorie der Versetzungen und Eigenspannungen. Arch. Rational Mech. Anal. 4 (1960) 273–334.
15. M. de León, A geometrical description of media with microstructure: Uniformity and homogeneity. In: Geometry, Continua and Microstructure, Collection Travaux en Cours 60. Hermann, Paris (1999) pp. 11–20.
16. M. de León and M. Epstein, Geometric characterization of the homogeneity of continua with microstructure. Extracta Mathematicae 11 (1996) 1116–1126.
17. G.A. Maugin, Material Inhomogeneities in Elasticity. Chapman and Hall, London (1993).
18. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
19. C.C. Wang, On the geometric structure of simple bodies: A mathematical foundation for the theory of continuous distributions of dislocations. Arch. Rational Mech. Anal. 27 (1967) 33–94.
A Model of the Evolution of a Two-dimensional Defective Structure

MARCELO EPSTEIN¹ and MAREK ELŻANOWSKI²

¹ Department of Mechanical and Manufacturing Engineering, The University of Calgary, Calgary, AB, Canada. E-mail: [email protected]
² Department of Mathematical Sciences, Portland State University, Portland, OR, U.S.A. E-mail: [email protected]

Received 28 June 2002

Abstract. A model of the anelastic evolution law of a two-dimensional defective solid crystal body is proposed. Assuming that the material body is made of triclinic crystals and that the evolution process does not alter the basic material symmetry group, we postulate that the evolution is driven by the present state of the density of the distribution of defects. We show that a linear relation between the inhomogeneity velocity gradient and the torsion tensor is rich enough to model such phenomena as relaxation of defects and dislocation pile-up.

Mathematics Subject Classifications (2000): 74E05, 53C10, 53B05.

Key words: defects, evolution, inhomogeneity, anelasticity.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 345–355. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The theory of continuous distributions of dislocations in its various formulations always results in a mathematical description of distributions of inhomogeneities in terms of differential-geometric objects. An open question, however, is the formulation of constitutive laws that govern the possible time evolution of such geometric structures so as to represent a variety of important physical phenomena involving the massive motion of defects. The driving force behind these phenomena can perhaps be best explained in terms of configurational forces such as those represented by the Eshelby tensor. Nevertheless, it is quite possible to conceive of an evolutionary process that is driven by the dislocation pattern itself in its natural tendency to eliminate residual stresses or, even if these stresses are absent, to achieve a defect-free structure over time. These processes can be enhanced, for example, by raising the temperature of the body so as to increase the probability of atoms overcoming potential barriers. On the other hand, a dislocation pattern may lead in the opposite direction, in the sense that a dislocation pile-up may arise naturally out of an initially smooth distribution of defects. These typically nonlinear phenomena are in need of a general theoretical framework consistent with the differential-geometric apparatus mentioned above. The purpose of this paper is to show how a relatively simple model, valid for solids endowed with only a discrete material symmetry group and already possibly devoid of residual stress, can explain, among other phenomena, the appearance of dislocation pile-ups. The proposed evolution law consists of assuming nothing more than a linear relation between the inhomogeneity velocity gradient and the instantaneous value of the torsion of the (unique) material connection. That such a simple law can account for nonlinear phenomena is an encouraging sign of the power of the theory of continuous distributions of inhomogeneities, which is just beginning to be fully tapped.

A possible extension of the theory would include the modelling of the release of residual stresses present in an isotropic solid. In this case, the dislocation density can be completely characterized by the curvature tensor of an appropriately defined Riemannian connection. The theory would be necessarily more involved than the one presented in this paper not only because the curvature tensor is of higher order than a torsion, but also because the evolution would involve a coupling with the solution of the equilibrium boundary-value problem at each instant. It is mainly for reasons of simplicity that we have limited the presentation to the solid crystal case. Conspicuously absent from this treatment is the important issue of thermodynamic restrictions on the form of the evolution law. For the case of a stress-driven evolution, and within different phenomenological frameworks, such restrictions have been considered, among others, by [7–9].
2. Uniformity

Let B denote an open, possibly unbounded, region in $\mathbb{R}^3$. We shall view it as a deformable continuum in a reference configuration. A deformation of the body B is an embedding $\chi: B \to \mathbb{R}^3$. Its tangent map evaluated at the material point X ∈ B is called the deformation gradient at X, and it will be denoted by F(X). In fact, due to the canonical identification of a tangent space of $\mathbb{R}^3$ with the Euclidean vector space $E^3$ we recognize the deformation gradient as an automorphism of $E^3$, and drop the explicit dependence of F on the material point X. In pure elasticity the density of the stored energy per unit reference volume is given by a function W(F; X) where, as mentioned earlier, F is the gradient of the deformation from the reference configuration to the current configuration evaluated at X. Adopting a three-dimensional vector space V as a reference crystal (an archetype material point) we say that the body B is materially uniform whenever there exist smoothly distributed (throughout the body) uniformity maps P(X) from the reference crystal V to the tangent space of the reference configuration at X, and a real-valued function $\bar{W}$, such that

$$W(F; X) = \bar{W}(F P(X)) \tag{1}$$

for all deformation gradients F and for each material point X [11]; see also [4]. Given a basis $E_\alpha$ (α = 1, 2, 3) in the reference crystal V and a (right-handed)
coordinate system $e_I$ (I = 1, 2, 3) in $\mathbb{R}^3$ the mappings P(X) induce in the reference configuration a field of bases

$$f_\beta(X) \equiv P^I_\beta(X)\,e_I, \tag{2}$$

called a uniform reference. The uniform frame at X is related to the uniform frame at Y by the linear isomorphism

$$P(X; Y) \equiv P(X)\,P^{-1}(Y), \tag{3}$$
called a material isomorphism from Y to X. Note that the choice of the basis $E_\alpha$ in the reference crystal, although arbitrary, has no effect on the choice of maps P(X; Y). A uniform reference (a moving frame) $f_\beta$ is not, in general, induced by any coordinate system on the body B even if considered only in some neighborhood of a material point. However, if for every material point X there exists such a coordinate neighborhood (albeit different at different points) the body is called locally homogeneous [12, 13]. By an appropriate change of reference configuration, the uniformity maps P(X) can then be chosen as independent of X in each such neighborhood. This in turn implies that the parallelism induced on B by such a material reference $f_\beta$ is locally trivial. The material connection associated with such a parallelism is torsion-free, where a material connection of the mathematical theory of inhomogeneities is a connection generated by any (homogeneous or not) uniform reference [10]. Note that any material connection is locally integrable, i.e., its curvature tensor vanishes locally, as uniform references are induced from the reference crystal by the smoothly distributed (throughout the body) mappings P(X).

For a solid crystal point the material symmetry group is finite. In particular, the triclinic crystal is a solid crystal with the trivial symmetry group (there are no symmetries other than the identity I; one may also allow −I to be a symmetry of a triclinic solid [11]). A material body made of solid crystals has a unique material connection. This is in contrast with the case when the material symmetry group is continuous, e.g., in an isotropic solid. In this paper we shall only consider uniform material bodies made of triclinic crystals and such that there exists a global reference configuration in which all material isomorphisms P(X; Y) are proper rotations, i.e., the uniform reference corresponds to contorted aeolotropy [10] or, equivalently, a state of constant strain [5]. This can be realized if, for example, there exists a global stress-free reference, and the reference crystal is assumed stress-free. Other states of stress are also possible. Indeed, one can show that in a 2-dimensional solid crystal body the state of stress compatible with a state of constant strain is hydrostatic [5]. In other words, if the body is in a state of constant strain, and if a (right-handed) orthonormal basis $e_I$ (I = 1, 2, 3) defines a Cartesian coordinate system on $\mathbb{R}^3$, then

$$f_\beta(Z) = Q^I_\beta(Z)\,e_I, \tag{4}$$
where all $Q^I_\beta(Z)$ are proper orthogonal tensors. The Christoffel symbols of the second kind of the unique (constant strain) material connection are given in the Cartesian coordinate system by

$$\Gamma^I_{KJ}(Z) = -Q^I_{\alpha,J}(Z)\,Q^\alpha_K(Z) \tag{5}$$

where a comma indicates partial differentiation. When the body is locally homogeneous, and the rotations $Q^I_\beta(Z)$ are locally material point independent, the Christoffel symbols of the material connection vanish.

3. Evolution Law

Consider a uniform solid crystal body. In the realm of pure elasticity the given uniform reference remains unchanged. In other words, there are no processes of elastic deformations which may change the existing structure. However, anelastic processes usually involve mechanisms which modify the distribution of material inhomogeneities. This can be modelled by allowing the uniform reference to change in time. As the uniform reference $f_\alpha$ evolves, and assuming that the evolution does not alter the symmetry group, its time derivative yields

$$\dot{f}_\beta = \dot{P}^I_\beta\, e_I = \dot{P}^I_\beta\,(P^{-1})^\gamma_I\, f_\gamma = L^\gamma_\beta\, f_\gamma, \tag{6}$$

as implied by relation (2). Here, $L^\gamma_\beta$ represent the components of the inhomogeneity velocity gradient [6]

$$L \equiv P^{-1}\dot{P}, \tag{7}$$

which measures the temporal rate of change of uniform references pulled back to the reference crystal. Note that for the triclinic crystal body in a state of constant strain

$$L^\gamma_\beta = \dot{Q}^I_\beta\, Q^\gamma_I \tag{8}$$

are components of a skew-symmetric matrix, as implied by (4). Given a particular uniform reference of an arbitrary uniform material body the torsion

$$T^I_{KJ} \equiv \Gamma^I_{KJ} - \Gamma^I_{JK} \tag{9}$$
of the induced material connection is an indicator of whether or not the body is homogeneous. Indeed, if the torsion vanishes the induced material parallelism is trivial and the body is homogeneous. On the other hand, if the torsion of a particular material connection does not vanish the corresponding uniform reference is not integrable. The body may still be homogeneous as there may exist another uniform reference, obtained by the action of the material symmetry group, inducing a flat material connection. In the triclinic crystal case, however, as we pointed out earlier, the material connection is unique. The torsion of such a material connection is not
only an indicator of inhomogeneity but it may be considered a true measure of the density of the distribution of inhomogeneities. We, therefore, postulate that, regardless of the state of stress, the distribution (density) of inhomogeneities is the driving force behind the intrinsic anelastic evolution of these inhomogeneities. According to this idea, we suggest an evolution law of the form:

$$\dot{P}(X, t) = f(T(X, t), P(X, t)) \tag{10}$$

where T is the torsion tensor of the instantaneous intrinsic material connection (as generated by the current uniformity maps P), and where f is assumed not to depend explicitly on X because of the assumed uniformity of the evolving body. (We emphasize that, in principle, the function f may depend also on other parameters, such as temperature, stress, etc. Our aim is to exhibit the richness of the theory even under the assumption of a "self-driven" evolution. Note also that (10), although particularly appealing in the triclinic crystal case, may as well be applicable in other situations, with possibly extra equivariance conditions.)

Formulating an evolution law is a difficult constitutive modelling process. However, for such a law to describe a true evolution it must satisfy the principle of covariance [6]. That is, it must be independent of any particular reference configuration chosen. If $\lambda: \mathbb{R}^3 \to \mathbb{R}^3$ is a diffeomorphism representing a change of reference configuration and H denotes its gradient at a material point, the corresponding uniformity maps R and P are related by

$$R = HP. \tag{11}$$

As we want our evolution law to describe a particular physical situation in a manner independent of the reference configuration and, since λ is time independent, we have:

$$\dot{R} = H\dot{P}. \tag{12}$$

This implies that

$$f(HTH^{-1}H^{-1}, HP) = Hf(T, P) \tag{13}$$

for all nonsingular tensors H. Note that as the torsion T is a vector-valued two-form the notation $HTH^{-1}H^{-1}$ is a shorthand for the pull-back transformation whose coordinate representation takes the form

$$T^I_{JK} = (H^{-1})^I_A\, T^A_{BC}\, H^B_J\, H^C_K. \tag{14}$$

In particular, let us select (with some abuse of notation) $H = P^{-1}$ and define

$$f_v(T) \equiv f(P^{-1}TPP, I). \tag{15}$$

Hence,

$$\dot{P} = Pf_v(T) = Pf(T_v, I), \tag{16}$$

where

$$T_v \equiv P^{-1}TPP \tag{17}$$

can be recognized as the density of the distribution of inhomogeneities (torsion tensor) seen from the perspective of the reference crystal. The evolution equation (16) can now be rewritten in terms of the inhomogeneity velocity gradient as follows:

$$L(P) = f_v(T). \tag{18}$$

It is not difficult to see that this form of the evolution law is completely invariant. In particular, we may restrict the form of the evolution law by supposing a linear relation such that

$$L(P) = CT_v, \tag{19}$$

where C is a fifth order tensor of material constants. In other words the evolution law is given in component form by

$$(P^{-1})^\alpha_I\,\dot{P}^I_\beta = C^\alpha{}_{\beta\rho}{}^{\sigma\lambda}\,(P^{-1})^\rho_M\, P^N_\sigma\, P^K_\lambda\, T^M_{NK}. \tag{20}$$
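The invariance claimed for the linear law can be illustrated numerically. The sketch below is not taken from the paper: it works in two dimensions, draws hypothetical random matrices for P and H (with H constant), stores a tensor $A^i{}_j$ as `A[i][j]`, and checks that the pulled-back torsion $T_v = P^{-1}TPP$ is the same whether it is computed in the original reference or after the change of reference R = HP, under which the torsion becomes $HTH^{-1}H^{-1}$ as in the argument of (13).

```python
import random

N = 2  # two dimensions keep the example small

def inv2(A):
    """Inverse of a 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def pull_back(A, T):
    """Components (A^-1)^r_M T^M_NK A^N_s A^K_l, i.e. the shorthand A^-1 T A A."""
    Ai = inv2(A)
    return [[[sum(Ai[r][M] * T[M][n][k] * A[n][s] * A[k][l]
                  for M in range(N) for n in range(N) for k in range(N))
              for l in range(N)] for s in range(N)] for r in range(N)]

random.seed(0)

def rand_matrix():
    # identity plus a small perturbation, so the matrix stays invertible
    return [[(1.0 if i == j else 0.0) + 0.3 * random.uniform(-1, 1)
             for j in range(N)] for i in range(N)]

P, H = rand_matrix(), rand_matrix()

# a torsion-like array, skew in its two lower indices
T = [[[0.0] * N for _ in range(N)] for _ in range(N)]
for M in range(N):
    T[M][0][1] = random.uniform(-1, 1)
    T[M][1][0] = -T[M][0][1]

T_new = pull_back(inv2(H), T)   # H T H^-1 H^-1: torsion in the new reference
R = matmul(H, P)                # R = H P, relation (11)

Tv_old = pull_back(P, T)        # P^-1 T P P, definition (17)
Tv_new = pull_back(R, T_new)    # the same construction in the new reference

diff = max(abs(Tv_old[r][s][l] - Tv_new[r][s][l])
           for r in range(N) for s in range(N) for l in range(N))
print(diff)  # of the order of rounding error
```

Since H is time independent, $R^{-1}\dot{R} = P^{-1}\dot{P}$ as well, so both sides of (19) are unaffected by the change of reference.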
According to the principle of actual evolution [6] a process described by such an evolution law is truly evolutive only if the inhomogeneity velocity gradient L is outside of the Lie algebra of the material symmetry group of the reference crystal. In the case of a material body made of triclinic crystals, when the material symmetry group is finite, this principle implies that every non-trivial evolution, i.e., $L^\gamma_\beta \neq 0$, represents a true evolution.

4. The Two-Dimensional Case

For the sake of specificity and to illustrate the range of phenomena within the scope of this approach, we consider now a class of problems for which the uniform reference is independent at all times of, say, the third Cartesian coordinate. In doing so, we render the evolution problem two-dimensional and gain the added computational simplicity afforded by the explicit representation of the rotation by means of a single angular parameter. Adopting an orthonormal basis in V and a Cartesian coordinate system x, y, z in the fixed reference configuration, the assumption that at all times t and at all points the uniform reference represents a state of constant strain results in the following matrix representation of the uniformity maps P:

$$[P] = \begin{pmatrix} \cos\theta(x,y,t) & \sin\theta(x,y,t) & 0 \\ -\sin\theta(x,y,t) & \cos\theta(x,y,t) & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{21}$$

where θ = θ(x, y, t) measures, say, the counterclockwise rotation between the x-axis and the vector $f_1$. The non-vanishing Christoffel symbols of the second kind of the induced material connection $\Gamma^I_{KJ}$ can now be calculated directly from (5) as

$$\Gamma^1_{21} = -\Gamma^2_{11} = \theta_{,x}, \tag{22}$$

$$\Gamma^1_{22} = -\Gamma^2_{12} = \theta_{,y}, \tag{23}$$

whence the non-vanishing torsion components are:

$$T^1_{12} = -T^1_{21} = -\theta_{,x}, \tag{24}$$

$$T^2_{12} = -T^2_{21} = -\theta_{,y}. \tag{25}$$

Similarly, the non-vanishing components of the inhomogeneity velocity gradient at the reference crystal are

$$L_{12} = -L_{21} = \theta_{,t}. \tag{26}$$
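Relations (22)–(25) can be checked numerically from (5) and (9). The sketch below rests on assumptions that are not stated in the text: the rotation field `theta` is a hypothetical example, $f_1 = (\cos\theta, \sin\theta)$ is taken as the first frame vector, $Q^\alpha_K$ in (5) is read as the inverse (transpose) of $Q^I_\alpha$, indices are zero-based, and derivatives are central differences.

```python
import math

def theta(x, y):
    """Hypothetical smooth rotation field, chosen only for illustration."""
    return 0.4 * math.sin(x) + 0.25 * x * y

def frame(x, y):
    """Q[I][alpha] = Q^I_alpha: column alpha holds the Cartesian components
    of f_alpha, with f_1 = (cos theta, sin theta)."""
    c, s = math.cos(theta(x, y)), math.sin(theta(x, y))
    return [[c, -s],
            [s, c]]

def christoffel(x, y, h=1e-6):
    """Gamma[I][K][J] = -Q^I_{alpha,J} (Q^-1)^alpha_K, following (5)."""
    Q = frame(x, y)
    Qinv = [[Q[0][0], Q[1][0]],
            [Q[0][1], Q[1][1]]]  # the transpose is the inverse of a rotation
    dQ = []
    for J in range(2):
        dx, dy = (h, 0.0) if J == 0 else (0.0, h)
        Qp, Qm = frame(x + dx, y + dy), frame(x - dx, y - dy)
        dQ.append([[(Qp[I][a] - Qm[I][a]) / (2 * h) for a in range(2)]
                   for I in range(2)])
    return [[[-sum(dQ[J][I][a] * Qinv[a][K] for a in range(2))
              for J in range(2)] for K in range(2)] for I in range(2)]

x0, y0, h = 0.7, -0.2, 1e-6
G = christoffel(x0, y0)
# T[I][K][J] = Gamma^I_KJ - Gamma^I_JK, as in (9)
T = [[[G[I][K][J] - G[I][J][K] for J in range(2)] for K in range(2)]
     for I in range(2)]

theta_x = (theta(x0 + h, y0) - theta(x0 - h, y0)) / (2 * h)
theta_y = (theta(x0, y0 + h) - theta(x0, y0 - h)) / (2 * h)

print(G[0][1][0], theta_x)    # Gamma^1_21 against theta_x, as in (22)
print(G[0][1][1], theta_y)    # Gamma^1_22 against theta_y, as in (23)
print(T[0][0][1], -theta_x)   # T^1_12 against -theta_x, as in (24)
print(T[1][0][1], -theta_y)   # T^2_12 against -theta_y, as in (25)
```

With this reading of the conventions, each printed pair agrees to within the finite-difference error.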
The most general evolution law (20) results (after some calculation effort) in the single quasi-linear partial differential equation

$$\theta_{,t} + (a\cos\theta - b\sin\theta)\,\theta_{,x} + (a\sin\theta + b\cos\theta)\,\theta_{,y} = 0, \tag{27}$$

where a and b are, respectively, the material constants $2C^1{}_{21}{}^{12}$ and $2C^1{}_{22}{}^{12}$. These are the only two material constants left due to the skew-symmetry of the torsion tensor and the form of the uniformity maps (21). We may further simplify the form of the evolution equation (27) by writing it as a single nonlinear balance law for the new variable β

$$\beta_{,t} + c(\sin\beta)_{,x} - c(\cos\beta)_{,y} = 0, \tag{28}$$

where $\beta \equiv \theta + \theta_0$, $c = \sqrt{a^2 + b^2}$, and where $\theta_0$ is such that $\tan\theta_0 = b/a$. The characteristic strips [2] of this equation are solutions of the following system of ordinary differential equations:

$$\frac{dt}{ds} = 1, \tag{29}$$

$$\frac{dx}{ds} = c\cos\beta, \tag{30}$$

$$\frac{dy}{ds} = -c\sin\beta, \tag{31}$$

$$\frac{d\beta}{ds} = 0, \tag{32}$$

$$\frac{d\beta_{,t}}{ds} = -c\,\beta_{,t}\,[\beta_{,x}\sin\beta + \beta_{,y}\cos\beta], \tag{33}$$

$$\frac{d\beta_{,x}}{ds} = -c\,\beta_{,x}\,[\beta_{,x}\sin\beta + \beta_{,y}\cos\beta], \tag{34}$$

$$\frac{d\beta_{,y}}{ds} = -c\,\beta_{,y}\,[\beta_{,x}\sin\beta + \beta_{,y}\cos\beta]. \tag{35}$$
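The step from (27) to (28) rests on two amplitude-phase identities, $a\cos\theta - b\sin\theta = c\cos(\theta + \theta_0)$ and $a\sin\theta + b\cos\theta = c\sin(\theta + \theta_0)$, which hold when $c = \sqrt{a^2 + b^2}$ and $\theta_0$ is taken on the branch with $c\cos\theta_0 = a$, $c\sin\theta_0 = b$. The short sketch below, with hypothetical function names and randomly drawn constants, compares the coefficients of the two equations; the use of `atan2` to select the branch is an assumption of this example.

```python
import math
import random

def coefficients_27(a, b, th):
    """Coefficients of theta_x and theta_y in equation (27)."""
    return (a * math.cos(th) - b * math.sin(th),
            a * math.sin(th) + b * math.cos(th))

def coefficients_28(a, b, th):
    """Coefficients of beta_x and beta_y after expanding the derivatives in (28),
    with beta = theta + theta0, c = sqrt(a^2 + b^2), tan(theta0) = b/a."""
    c = math.hypot(a, b)
    theta0 = math.atan2(b, a)  # branch with c*cos(theta0) = a, c*sin(theta0) = b
    beta = th + theta0
    return c * math.cos(beta), c * math.sin(beta)

random.seed(1)
worst = 0.0
for _ in range(1000):
    a, b = random.uniform(-2, 2), random.uniform(-2, 2)
    th = random.uniform(-math.pi, math.pi)
    p, q = coefficients_27(a, b, th)
    r, s = coefficients_28(a, b, th)
    worst = max(worst, abs(p - r), abs(q - s))
print(worst)  # of the order of rounding error
```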
As is well known, the quasi-linearity of the single partial differential equation has several important consequences. Firstly, for given initial conditions x(0), y(0), t(0) and β(x(0), y(0), 0), the first four equations can be solved independently from the last three. A line x(s), y(s), t(s) thus obtained is called a characteristic curve or simply a characteristic. Equation (32) implies that β is constant along each characteristic. Moreover, the parameter s, according to (29), can be identified with time t, except for an arbitrary additive constant. Finally, the constancy of β implies that along a characteristic the right-hand sides of equations (30) and (31) are constant, and therefore that the characteristics are actually straight lines. The values of the material constants, together with the initial condition, determine whether the characteristics will tend to converge (intersect) or diverge. In the former case, we will observe the creation of dislocation pile-ups, while the latter is a representation of the tendency of the dislocations to dissipate after the passage of a long enough time.

Indeed, the general Cauchy problem for such a balance law has, as is well known [1], no smooth global solution even for smooth compactly supported initial conditions. A solution stays temporarily smooth but eventually develops singularities. The blow-up of a smooth solution, which in the context of our model we identify with a dislocation pile-up, occurs when the spatial gradient of β becomes unbounded. In a one-dimensional case, given any particular initial distribution of inhomogeneities, it is rather elementary to determine, as shown in [3], such propagation characteristics as the blow-up time, the speed of propagation (Rankine–Hugoniot condition), and the propagation condition for the amplitude of the pile-up.
Moreover, looking at the Rankine–Hugoniot condition for the evolution equation (28), whether planar or one-dimensional, it is easy to realize a possibility of the occurrence of a stationary pile-up, i.e., a singular pattern of inhomogeneities which will not propagate.
5. Examples

For the sake of being even more specific and to better illustrate each of the above mentioned types of evolutions, let us restrict further our analysis to the one-dimensional case by assuming that the uniform references depend only on one Cartesian coordinate, say y. This renders the evolution equation particularly simple, namely:

$$\beta_{,t} + c\,\beta_{,y}\sin\beta = 0. \tag{36}$$
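Before any singularity forms, equation (36) can be solved pointwise by following characteristics. The sketch below is an illustration only: the function name, the bisection bracket and the sample profile β₀(y) = arctan y with c = √2 are assumptions of this example. It uses the fact, read off directly from (36), that the value β₀(y₀) is carried unchanged along the straight line y = y₀ + c sin β₀(y₀) t, and it checks the result by evaluating the residual of (36) with central differences.

```python
import math

def solve_by_characteristics(beta0, c, y, t, lo=-50.0, hi=50.0, tol=1e-12):
    """Value beta(y, t) of the solution of beta_t + c*sin(beta)*beta_y = 0.

    Each characteristic is the straight line y = y0 + c*sin(beta0(y0))*t and
    carries the constant value beta0(y0). The foot y0 is found by bisection,
    which assumes that g(y0) = y0 + c*sin(beta0(y0))*t - y is increasing,
    i.e. that no characteristics have crossed yet, and that the foot lies
    inside the bracket [lo, hi]."""
    def g(y0):
        return y0 + c * math.sin(beta0(y0)) * t - y
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return beta0(0.5 * (lo + hi))

c = math.hypot(1.0, 1.0)  # hypothetical choice a = b = 1

def residual(y, t, h=1e-4):
    """Left-hand side of (36) evaluated with central differences."""
    b = solve_by_characteristics(math.atan, c, y, t)
    b_t = (solve_by_characteristics(math.atan, c, y, t + h)
           - solve_by_characteristics(math.atan, c, y, t - h)) / (2 * h)
    b_y = (solve_by_characteristics(math.atan, c, y + h, t)
           - solve_by_characteristics(math.atan, c, y - h, t)) / (2 * h)
    return b_t + c * math.sin(b) * b_y

res = residual(0.3, 0.5)
print(res)  # close to zero
```

For a profile whose characteristics do cross, the bisection is only meaningful for times below the breaking time, after which the foot of the characteristic is no longer unique.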
The general Cauchy problem for such a balance law has, as is well known, no smooth global solution even for smooth compactly supported initial conditions. A solution stays temporarily smooth but eventually develops singularities. The blow-up of a smooth solution, which in the context of our model we identify with a dislocation pile-up, occurs when $\beta_{,y}$ becomes unbounded. It is easy to show by integrating along characteristics that this is possible provided

$$c\,\beta_0'(y)\cos\beta_0(y) < 0 \tag{37}$$

at some y ∈ $\mathbb{R}$, where $\beta_0(y) \equiv \beta(y, 0)$, a prime denotes differentiation with respect to y, and where $k(y) \equiv c\cos\beta_0(y)$ is obviously constant along the characteristics. The actual breaking of a continuous solution will be observed at the critical time

$$t_c \equiv \min_y \frac{-1}{c\,\beta_0'(y)\cos\beta_0(y)}. \tag{38}$$
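Formula (38) is straightforward to evaluate on a grid. The sketch below is illustrative only: the function name, the scanning interval and the grid size are assumptions of this example, and c is taken as $\sqrt{a^2 + b^2}$ with a = b = 1. It evaluates the minimum for the two profiles β₀(y) = −arctan y and β₀(y) = arctan y, skipping points where condition (37) fails.

```python
import math

def critical_time(beta0, dbeta0, c, y_min=-10.0, y_max=10.0, n=20001):
    """Approximate t_c = min over y of -1 / (c * beta0'(y) * cos(beta0(y)))
    by scanning a uniform grid. Points where the denominator is not negative
    violate (37) and are skipped. Returns math.inf if no point qualifies."""
    best = math.inf
    for i in range(n):
        y = y_min + (y_max - y_min) * i / (n - 1)
        denom = c * dbeta0(y) * math.cos(beta0(y))
        if denom < 0.0:
            best = min(best, -1.0 / denom)
    return best

c = math.hypot(1.0, 1.0)  # a = b = 1

steepening = critical_time(lambda y: -math.atan(y),
                           lambda y: -1.0 / (1.0 + y * y), c)
spreading = critical_time(lambda y: math.atan(y),
                          lambda y: 1.0 / (1.0 + y * y), c)
print(steepening, spreading)
```

For the decreasing profile the minimum is attained at y = 0 and equals 1/c = √2/2, while for the increasing profile condition (37) is never met and the function returns infinity. A grid scan only approximates the minimum in general; here the minimiser happens to be a grid point.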
Such a singularity, once developed, will propagate, as implied by the Rankine–Hugoniot condition, with the speed

$$v = c\,\frac{[\cos\beta]}{[\beta]} \tag{39}$$

along the shock-curve $y = \Sigma(s)$, where $(d/ds)\Sigma(s) = v(y(s), s)$. The evolution of the amplitude [β] of such a shock is given by the propagation condition

$$\widetilde{[\beta]} = c\left(\frac{[\cos\beta]}{[\beta]}\,[\beta_{,y}] + [\beta_{,y}\sin\beta]\right), \tag{40}$$

where $[f(\beta)] \equiv f(\beta^+) - f(\beta^-)$ denotes the jump of the quantity f across the shock-curve Σ, and where $\widetilde{[\beta]}$ indicates differentiation along Σ. Using the method of singular surfaces the propagation of such a singularity can be further analyzed by developing the infinite system of iterated compatibility conditions and solving it numerically.

To show the relation between the form of the initial condition and the choice of the material constants a and b we briefly discuss here some one-dimensional evolution initial-value problems.

(i) Suppose that a = b = 1 and let $\beta_0(y) = \arctan y$. As $\beta_{0,y} > 0$ the condition (37) is never satisfied, proving that no pile-up of dislocations will ever occur. A simple analysis of characteristics shows, in fact, that the solution θ(y) tends asymptotically to −π/4 at every y ∈ $\mathbb{R}$.

(ii) Let $\beta_0(y) = -\arctan y$ and let us keep the same material constants. This initial condition, in contrast to the previous one, will develop, as easily attested by (37), into a shock. In fact, investigating the arrangement of characteristics and calculating the critical blow-up time (38) one arrives at the conclusion that the two shocks travelling in opposite directions (one front-shock and one back-shock) will develop at the same time $t_c = \sqrt{2}/2$.

(iii) Suppose a = b = 1 and select a symmetric (about y = 0) initial condition, e.g., $\beta_0(y) = (\pi/2)\,\mathrm{sech}\,y$. An elementary analysis of characteristics shows that this solution will blow up in finite time into a front shock. Changing the material constants to a = −b = −1, but keeping the initial condition
unchanged, will make very little difference. Indeed, rewriting the evolution equation for the new material constants as β,t + (√2/2) β,y cos β = 0, one can easily conclude that the new solution also blows up in finite time. However, a different part of the initial condition now contributes to the pile-up, slowing down its occurrence and propagation considerably.

(iv) As the last example we consider the spherically symmetric planar problem. In other words, we seek a solution to the evolution equation (28) that is invariant at all times t ≥ 0 with respect to rotations about the origin. Rewriting equation (28) in the polar coordinates (ϱ, ψ) we obtain
$$\beta_{,t} + c\,\beta_{,\varrho}\sin(\beta-\psi) - \frac{c}{\varrho}\,\beta_{,\psi}\cos(\beta-\psi) = 0, \tag{41}$$
where β = β(ϱ, ψ, t). The solution β is truly rotationally invariant provided
$$(\beta-\psi)_{,\psi} = 0. \tag{42}$$
Hence,
$$\beta(\varrho,\psi,t) = \psi + F(\varrho,t), \tag{43}$$
where
$$F_{,t} + c\,F_{,\varrho}\sin F - \frac{c}{\varrho}\cos F = 0. \tag{44}$$
What we have now is a one-dimensional balance law with a source. The characteristic curves are no longer straight lines and the solution F is no longer constant along characteristics. The initial value problem is well posed only locally in time. As in the case of a conservation law, the solution of (44) generally stays smooth only up to some critical time at which a singularity develops. Moreover, the source term may even cause the singular solution to become unbounded in finite time and, if dissipative enough, it may altogether prevent the breaking of some relatively weak waves. Note also that the source term of (44) plays a prominent role close to the origin while it is negligible very far away from the center. Indeed, the proximity of defects increases the density of defects which, in turn, as expected, influences their evolution in a more significant way.

Acknowledgements

This work has been supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). The work was done in part when the second author was visiting the University of Calgary in September–December 2001. The financial support for this visit was provided by NSERC, the Department of Mathematics and Statistics of the University of Calgary, and Portland State University.
On the Theory of Rotation Twins in Crystal Multilattices

J.L. ERICKSEN
5378 Buckskin Bob Rd., Florence, OR 97439, U.S.A.

Received 1 May 2002; in revised form 1 October 2002

Abstract. Rotation twins form a subset of twins in crystals which at least closely resemble many of the twins that are observed. My purpose is to characterize all solutions of this kind for twinning equations in the X-ray theory of crystals. An analysis of a common kind of growth twins in staurolite is presented.

Mathematics Subject Classifications (2000): 74E15, 82D25.

Key words: twinning theory, continuum theory of crystals.
Dedicated to the memory of Clifford Truesdell
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 357–373. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The definition of rotation twins to be studied here is that given by Barrett and Massalski [1, p. 406], “Crystals are rotation twins if a two-, three-, four- or sixfold rotation about a twinning axis produces the orientation of the other. The rotation axis lies either in the twinning plane or normal to it and is not a symmetry element of the individual crystals.” Of course, use of the adjective “rotation” is reasonably interpreted as implying that not all configurations called twins are rotation twins. Except for the two-fold possibility, which applies to almost all mechanical twins, all examples known to me occur in growth twins. Here, my purpose is to describe all solutions of the twinning equations in my [2] X-ray theory for such twins. I [3, 4] have described how these equations are a bit different from some others used in studies of mechanical twins, which cannot reasonably be applied to growth twins. Also, I [5] have verified that my equations describe several different kinds of growth twins that are well-established in quartz, excepting those for which the two sides represent crystallographically inequivalent surfaces.

It should be noted that workers have been unable to agree on a general definition of twins. In the words of one expert, Cahn [6, Section 1.1], one of the best
known proposals, attributed to Friedel, describes a “true twin” as involving a pair of configurations such that “. . . the two crystals can be brought into one congruent configuration by reflection in a lattice plane of low indices, or by a rotation through 60◦ , 90◦ , 120◦ or 180◦ about a lattice row of low indices.” The decision as to how low the indices must be seems to be left to the individual, but practice favors using single digits. Generally, Cahn seems to take this view fairly seriously, although he describes what he considers to be some rare exceptions. Other workers call other kinds of exceptions twins. Structural considerations motivated Hartman [7] to exclude existence of three-, four- and six-fold rotations in his definition of twins. Given that all measurements are subject to some error, there is no way to confirm experimentally that the rotation involved in one of these examples is exactly 90◦ , for example. Particularly in minerals, various kinds of complicated intergrowths occur, and most workers are not happy to call all of these twins. I note that Barrett and Massalski do not mention that the rotation axes should be restricted to the kind of crystallographic directions mentioned by Cahn, and I will not assume this. However, according to my theory, it turns out that, in most cases, these axes are parallel to some (parallel) rows of atoms or to normals of some (parallel) crystallographic planes, and I will note kinds of cases where one of these conditions must hold. I won’t expend more ink in trying to describe all of the different views on what should be meant by a twin. For present purposes, I will regard an intergrowth as a twin if it is described by some nontrivial solution of my twinning equations. 2. Background A crystal n-lattice, pictured as filling all of space, consists of n geometrically identical lattices translated relative to each other, with lattice vectors ea and their duals ea , the reciprocal lattice vectors. 
Physically, the atoms in any one of the lattices are identical, but atoms in different lattices can be the same or different. To describe the relative translations occurring when n > 1, we use shift vectors $p_i$, i = 1, …, n − 1. Pick one atomic position in each of the lattices and take one as a base point. Then the shifts are position vectors of the others relative to the base point. For a given configuration, there are infinitely many ways of choosing the vectors, $(e_a, p_i)$ and $(\bar e_a, \bar p_i)$ being two possibilities if the first is and the second satisfies
$$\bar e_a = m_a^b e_b \iff \bar e^a = (m^{-1})_b^a e^b, \tag{2.1}$$
where $m = \|m_a^b\|$ is a unimodular matrix of integers, and
$$\bar p_i = \alpha_i^j p_j + l_i^a e_a, \qquad l_i^a \in \mathbb{Z}, \tag{2.2}$$
the matrices $\alpha = \|\alpha_i^j\|$ being discussed in some detail by Pitteri [8] and Ericksen [9]. Briefly, they describe interchanging identical atoms. Here, the detailed descriptions of them are not important. In dealing with matrices, my convention
is that the lower index labels rows. The transformations (2.1) and (2.2) form an infinite discrete group, with the group product indicated by
$$\{\bar m, \bar\alpha, \bar l\}\cdot\{m, \alpha, l\} \overset{\mathrm{def}}{=} \{\bar m m,\ \bar\alpha\alpha,\ \bar\alpha l + \bar l m\}. \tag{2.3}$$
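The product (2.3) can be checked against successive application of (2.1) and (2.2). The following is a minimal sketch, assuming that matrices are stored as lists of rows (so that the lower index labels rows, as in the text) and that the rows of `e` and `p` hold the vectors $e_a$ and $p_i$; the numerical data and the helper names `compose` and `act` are hypothetical and serve only as an illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def compose(second, first):
    """Group product (2.3): {mb, ab, lb}.{m, a, l} = {mb m, ab a, ab l + lb m}."""
    mb, ab, lb = second
    m, a, l = first
    return (matmul(mb, m), matmul(ab, a),
            matadd(matmul(ab, l), matmul(lb, m)))


def act(g, e, p):
    """Apply (2.1)-(2.2): new e_a = m_a^b e_b, new p_i = alpha_i^j p_j + l_i^a e_a."""
    m, alpha, l = g
    return matmul(m, e), matadd(matmul(alpha, p), matmul(l, e))


# Hypothetical 2-lattice data (a single shift), integer entries for exact checks.
E = [[2, 0, 0], [0, 3, 0], [1, 1, 5]]
P = [[1, 1, 2]]
G1 = ([[1, 1, 0], [0, 1, 0], [0, 0, 1]], [[-1]], [[1, 0, 2]])
G2 = ([[0, 1, 0], [-1, 0, 0], [0, 0, 1]], [[1]], [[0, -1, 1]])
G3 = ([[1, 0, 0], [0, -1, 0], [0, 0, -1]], [[-1]], [[3, 1, 0]])
```

Applying `G1` and then `G2` to `(E, P)` gives the same vectors as applying `compose(G2, G1)` once, which is the content of (2.3).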
For each configuration of an n-lattice with n > 1, we have four finite groups that are relevant here, the lattice group
$$L(e_a, p_i) = \{m, \alpha, l \mid m_a^b e_b = Qe_a,\ \alpha_i^j p_j + l_i^a e_a = Qp_i,\ Q \in O(3)\}, \tag{2.4}$$
the skeletal lattice group
$$L(e_a) = \{m \mid m_a^b e_b = Qe_a,\ Q \in O(3)\}, \tag{2.5}$$
the point group
$$P(e_a, p_i) = \{Q \mid Qe_a = m_a^b e_b,\ Qp_i = \alpha_i^j p_j + l_i^a e_a,\ \{m, \alpha, l\} \in L(e_a, p_i)\}, \tag{2.6}$$
and the skeletal point group
$$P(e_a) = \{Q \mid Qe_a = m_a^b e_b,\ m \in L(e_a)\}. \tag{2.7}$$
Often, the latter is called the holohedral point group or holohedry. For a Bravais lattice (1-lattice), the lattice and point groups are just the skeletal groups.

While workers seem unable to agree on a general definition of twins, they are generally considered to involve jump discontinuities in lattice vectors and/or shifts although, frequently, the latter are not considered explicitly. Normally, at least tacitly, these are considered as they occur in crystals at constant temperature that are unstressed, or in which the stress is a constant hydrostatic pressure, with $e_a$, $e^a$ and $p_i$ piecewise constant. So, assume this and the fact that the aforementioned Barrett–Massalski description of rotation twins presumes this. In this context, the twinning equations I [2] proposed are of the form
$$\bar e^a = (1 - n\otimes a)e^a = Q(m^{-1})_b^a e^b \iff \bar e_a = (1 - n\otimes a)^{-T} e_a = Qm_a^b e_b, \tag{2.8}$$
and
$$\bar p_i = Q(\alpha_i^j p_j + l_i^a e_a), \qquad Q \in O(3). \tag{2.9}$$
Here, (ea , pi ) and (¯ea , p¯ i ) are values of these vectors on the two sides of the discontinuity surface, n being its unit normal, m, α and l being some choice of the matrices referred to in (2.1) and (2.2), a being some vector not having a definite physical interpretation. One finds it by solving (2.8), when possible, which depends on the nature of data available for the other variables. Here, the description of rotation twins gives some information about Q, that it is a rotation with axis parallel or perpendicular to n, angles of rotation being those commonly encountered in various
studies of crystals. There are solutions of (2.8) with Q a rotation with axis making a different angle with n. Our task is to characterize all choices of the other variables in (2.8) and (2.9) for the indicated possibilities of Q and n. In this setting, (2.9) is almost trivial. For given pi , we could satisfy it by taking p¯ i = Qpi , for example. Of course, in trying to match a solution to some observations, one should match the observed shifts, when data on these is available. In addition, ea and pi should satisfy some equilibrium equations, which I [2] have described, but are not used explicitly here. Whether they can be satisfied by reasonably stable configurations for any one of the kinds of configurations discussed depends on the nature of the particular constitutive equations considered. It follows from (2.8) that det(1 − n ⊗ a) = ±1,
(2.10)
which distinguishes two kinds of possibilities. The upper sign gives S-twins with
$$a\cdot n = 0 \;\Rightarrow\; (1 - n\otimes a)^{-T} = 1 + a\otimes n \tag{2.11}$$
and the lower gives O-twins with
$$a\cdot n = 2 \;\Rightarrow\; (1 - n\otimes a)^{-T} = 1 - a\otimes n. \tag{2.12}$$
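The two cases in (2.10) to (2.12) rest on the elementary identity det(1 − n ⊗ a) = 1 − a · n together with the stated inverse transposes. A small exact check, given here only as a sketch with hypothetical vectors (a rational unit normal `N_HAT` and two choices of a), might read:

```python
from fractions import Fraction as Fr


def outer(u, v):
    """Tensor product u (x) v as a 3x3 matrix."""
    return [[ui * vj for vj in v] for ui in u]


def eye():
    return [[Fr(int(i == j)) for j in range(3)] for i in range(3)]


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def check_twin_identities(n, a):
    """Return (det F, ok) for F = 1 - n (x) a; ok tests (2.11) or (2.12)."""
    F = sub(eye(), outer(n, a))
    s = sum(x * y for x, y in zip(a, n))
    if s == 0:      # S-twin: claimed F^{-T} = 1 + a (x) n
        claimed = add(eye(), outer(a, n))
    elif s == 2:    # O-twin: claimed F^{-T} = 1 - a (x) n
        claimed = sub(eye(), outer(a, n))
    else:
        raise ValueError("a . n must be 0 or 2")
    return det3(F), matmul(transpose(F), claimed) == eye()


# Hypothetical data: a rational unit normal and amplitudes with a.n = 0 and a.n = 2.
N_HAT = (Fr(3, 5), Fr(4, 5), Fr(0))
A_S = (Fr(4), Fr(-3), Fr(7))
A_O = tuple(x + 2 * y for x, y in zip(A_S, N_HAT))
```

Exact rational arithmetic is used so that the comparison with the identity matrix involves no rounding tolerance.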
Here, the S and O refer to the fact that $e_a$ and $\bar e_a$ or, equivalently, $e^a$ and $\bar e^a$ have the same or opposite orientations, respectively. As will become clear, rotation twins of both kinds or at least configurations closely resembling them are encountered, in practice. For some deformation twins, S-twin solutions of (2.8) are used, with a interpreted as the amplitude of a simple shearing deformation. The X-ray theory does not require this interpretation. Note that, if (2.8) is satisfied for some values of Q and m, it is also satisfied if we simply replace these by −Q and −m, leaving the remaining arguments unchanged. However, it can be that (2.9) is satisfied by one of these and not the other, for equivalent sets of shifts. In considering rotation twins, I interpret the description as implying that we should assume that
$$Q = R \in SO(3), \tag{2.13}$$
from which it follows that
$$\det m = 1 \ \text{for S-twins}, \qquad \det m = -1 \ \text{for O-twins}. \tag{2.14}$$
Obviously, it is easy to remove the restriction (2.13). Henceforth, I write R in place of Q whenever these are considered as rotations and denote by $R_v^\psi$ the rotation through angle ψ with axis in the direction of v. There are solutions of (2.8) and (2.9) sometimes called fake twins, because the discontinuities involved are not visible in typical X-ray observations. One kind consists of the lattice invariant shears:
$$\text{S-twins with } Q = 1. \tag{2.15}$$
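Since the rotations $R_v^\psi$ are used throughout what follows, a concrete representation may help. The sketch below builds $R_v^\psi$ from the Rodrigues formula (a standard choice, assumed here with the right-handed sign convention) and finds the smallest N with $R^N = 1$; the function names `rotation` and `order` are hypothetical.

```python
import math


def rotation(axis, psi):
    """Matrix of the rotation through angle psi about `axis` (Rodrigues formula)."""
    norm = math.sqrt(sum(c * c for c in axis))
    x, y, z = (c / norm for c in axis)
    c, s = math.cos(psi), math.sin(psi)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def is_identity(A, tol=1e-9):
    return all(abs(A[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(3) for j in range(3))


def order(R, max_order=12):
    """Smallest N >= 1 with R^N = 1 (within tolerance), or None if none is found."""
    power = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for n in range(1, max_order + 1):
        power = matmul(power, R)
        if is_identity(power):
            return n
    return None
```

For the two-, three-, four- and six-fold rotations named in the Barrett and Massalski description, `order(rotation(v, 2 * math.pi / N))` returns N for any nonzero axis v.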
Sometimes, workers compose some of these with certain other solutions to adjust values of a or n. The other kind involves solutions with
$$\{m, \alpha, l\} \in L(e_a, p_i), \tag{2.16}$$
perhaps combined with a suitable lattice invariant shear. It is easy to show that, for S-twins of this kind, a = 0 and $Q^T$ is the corresponding element of $P(e_a, p_i)$. For O-twins of this kind, a = 2n and $-Q^T R_n^\pi$ is the corresponding element of $P(e_a, p_i)$. One can also compose these with another solution to get alternative descriptions of the same twin, with different values of Q, m, etc. For n-lattices with n > 1, having suitable symmetry, one can have somewhat similar but nontrivial solutions not of this kind, for example,
$$\text{S-twins with } m \in L(e_a),\quad m \notin L(e_a, p_i) \;\Rightarrow\; a = 0, \tag{2.17}$$
n then being arbitrary. This applies to at least some twins called penetration twins. I call them penetration twins, not implying that all things called this fit (2.17). I am still looking for other possibilities for describing these. According to my [5] analyses of Brazil and Dauphiné twins in α-quartz, they are of this kind. Observations of Dauphiné twins indicate that the discontinuity surfaces are rather random surfaces. The isometry can be taken as a certain 180° rotation. Theoretically, one could pick planes such that these fit the description of rotation twins and workers have done this in considering occurrence of these in some Japan twins in quartz, for example. The surfaces associated with Brazil twins are different, usually of zigzag form, involving pieces of various kinds of crystallographic planes. I know of no theoretical reason for this difference. The isometry involved can be taken as a central inversion. There is also the analog of (2.17)
$$\text{O-twins with } m \in L(e_a),\quad m \notin L(e_a, p_i) \;\Rightarrow\; a = 2n \;\Rightarrow\; 1 - n\otimes a = -R_n^\pi. \tag{2.18}$$
One could also call these penetration twins, but I am not yet sure how consistent this is with practice, although we will see an example that is. So, for the present, I call them exceptional O-twins. According to my [5] analysis of Friedel twins in α-quartz, (2.18) applies to them and they fit the description of rotation twins. The isometry can be taken as a 90° rotation, with axis perpendicular to n. Theoretically, the interfaces are orthogonal planes, and I do not know of any observations that are inconsistent with this.

Those who have a little familiarity with twinning studies have encountered examples of the rotation twins of the two-fold kind, said to be of types I and II. Analyses of these are presented by Pitteri [10, 11]. These are S-twins, so a · n = 0 and, for both, $m^2 = 1$, det m = 1. This implies that m is of the form
$$m_\alpha^\beta = -\delta_\alpha^\beta + u_\alpha v^\beta, \qquad u_\alpha v^\alpha = 2, \tag{2.19}$$
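where $u_\alpha$ and $v^\alpha$ can be taken as integers. For both, the isometries are 180° rotations. Solving (2.8) gives
$$\text{type I:}\quad Q = R_n^\pi, \qquad n\otimes a = u\otimes\left(\frac{2u}{|u|^2} - v\right), \tag{2.20}$$
and
$$\text{type II:}\quad Q = R_a^\pi, \qquad n\otimes a = \left(u - \frac{2v}{|v|^2}\right)\otimes v, \tag{2.21}$$
where
$$u = u_a e^a, \qquad v = v^a e_a. \tag{2.22}$$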
As is known from studies of deformation twins, it follows from (2.20) that, for type I twins, n is rational, meaning that there is a vector in this direction with integer components relative to the basis $e^a$. Such directions are normal to crystallographic planes. For type II twins, it is also known that (2.22) implies that a is rational in a different sense, here meaning that there is a vector in this direction with integer components relative to the basis $e_a$. Rows of atoms have such directions. Of course, these are also the directions of the axes of rotation. Most mechanical twins are of type I. Like the lattice invariant shears, these solutions are available for any set of lattice vectors. Pitteri and Zanzotto [12] show that these are the only S-twins with this property.

As is discussed in slightly different ways by Adeleke [13] and Ericksen [14], one subset of rotation twins has been characterized completely,
$$\text{rotation S-twins with } Rn = n, \quad a \neq 0. \tag{2.23}$$
From the analyses of these, it follows that n, the axis of rotation, must be parallel to a normal of some crystallographic planes and m must be similar to Q. Adeleke [13] describes all S-twin solutions of (2.8) with a ≠ 0, making our task relatively easy. In one respect, possibilities with a = 0 are somewhat like those mentioned above for Dauphiné twins, since a value of Q included in any point group satisfies $Q^N = 1$ for N = 1, 2, 3, 4 or 6. Thus, take Q = R ≠ 1 to be a rotation of this kind and you can pick n to be parallel to its axis, so you then have a description of all rotation S-twins such that Rn = n. For any set of lattice vectors, $R \in P(e_a)$ implies that the axis of R is normal to some crystallographic planes. If you are not familiar with this, it is easily seen by taking the inner product of the equation in (2.7) with the axis, so this property extends to these penetration twins, assuming we are not dealing with the trivial fake twins. Thus, we can consider these possibilities to be characterized.

3. S-twins with $R^N = 1$, N = 3, 4 or 6

Here, N is considered to be the smallest integer such that $R^N = 1$. The analogous assumption is made elsewhere. From the above remarks, the only S-twins that
need to be considered are those for which the axis of R is normal to n. The cases indicated in the heading all have the property that the rotations do not take a plane with normal n to itself. Such cases are special cases of the solutions covered by Adeleke [13], in his subcase 3.3.2.1, so it is only necessary to specialize his results. Let i, j, k be an orthonormal basis with
$$n = i, \qquad Rk = k, \qquad R = R_k^\psi, \tag{3.1}$$
for any of the angles ψ fitting the powers listed in the heading. There are some restrictions on m, namely
$$\text{(a)}\quad m \text{ must have 1 as an eigenvalue} \;\Rightarrow\; m_b^a x^b = x^a, \tag{3.2}$$
where the $x^a$ are not all zero, and
$$\text{(b)}\quad \text{for some real numbers } y^a,\ x^a,\ y^a \text{ and } m_b^a y^b \text{ are linearly independent.} \tag{3.3}$$
In the following, one can use any such values of m and $y^a$. Translating Adeleke’s notation to mine, one gets that $e^a$ is obtained by solving
$$e^a\cdot j = y^a, \qquad e^a\cdot R^T j = m_b^a y^b, \qquad e^a\cdot k = x^a, \tag{3.4}$$
from which
$$k = x^a e_a. \tag{3.5}$$
If the eigenspace corresponding to 1 is one-dimensional, the $x^a$ are, to within a scalar factor, integers, so k is parallel to some rows of atoms. Obviously, this is the case if 1 is an isolated eigenvalue. The other logical possibility is that the eigenvalues are 1, 1, 1. Then, one can use the Jordan Canonical Theorem on matrices to show that, if m ≠ 1 and if m does not correspond to a lattice invariant shear, trivial possibilities, this eigenspace is one-dimensional. Adeleke [13] uses this theorem repeatedly. Using the results above, I calculate that
$$e^a = \left(\csc\psi\, m_b^a y^b - \cot\psi\, y^a\right) i + y^a j + x^a k. \tag{3.6}$$
I won’t belabor the elementary calculations giving $e_a$. Then, Adeleke’s calculations give
$$a = \left(m_b^a e^b\cdot i\right) Re_a - i. \tag{3.7}$$
This generalizes solutions I [14] obtained for the special case Ra = a. Among the solutions considered here, these and only these have m similar to R. Thus, $R^N = 1 \Rightarrow m^N = 1$ just for this special case. By using (3.5) to calculate $R_k^\psi (m^{-1})_b^a e^b$ and rearranging terms, I get an alternative to (3.7) as
$$a = \left[\csc\psi\left(m_b^a + (m^{-1})_b^a\right) y^b - 2\cot\psi\, y^a\right] e_a. \tag{3.8}$$
Except for differences of notation, most of these calculations are included as special cases of those given by Adeleke [13], but (3.8) is not.

4. S-twins with $R^2 = 1$

There are obvious possibilities of this kind with the axis of R perpendicular to n, the type II twins described by (2.21), but these do not quite include all such solutions, which are covered by Adeleke’s [13] subcase 3.3.4.3. For all, it is necessary that
$$m \text{ have } 1, -1, -1 \text{ as its eigenvalues.} \tag{4.1}$$
If $m^2 = 1$, one gets type II twins. If not, proceed as follows. Of course, by hypothesis, one will have, for some orthonormal basis i, j, k,
$$Ri = i, \qquad Rj = -j, \qquad Rk = -k, \qquad n = k. \tag{4.2}$$
Introducing a pair of eigenvectors of m, we have
$$m_b^a x^b = x^a, \qquad m_b^a y^b = -y^a. \tag{4.3}$$
One can take $e^a$ as any linearly independent vectors such that
$$e^a\cdot i = x^a, \qquad e^a\cdot j = y^a. \tag{4.4}$$
So,
$$e^a = x^a i + y^a j + z^a k, \tag{4.5}$$
where, except for the requirement of linear independence, the numbers $z^a$ are arbitrary. Adeleke’s description of a is
$$a = m_a^b\,(e^a\cdot n)\,e_b - n = m_a^b z^a e_b - k. \tag{4.6}$$
The values of m involved here have the property that
$$m = m' m'', \qquad (m'')^2 = 1, \tag{4.7}$$
where $m'$ is a value of m corresponding to a lattice invariant shear. One way to see this is to use Adeleke’s observation that, by a similarity transformation using matrices of integers with determinant one, m can be reduced to the form
$$m = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & r & -1 \end{Vmatrix}, \tag{4.8}$$
where the entries are, of course, integers. (He and I use different conventions, making his matrices transposes of mine.) Said differently, given an m with the properties described above, one can find lattice vectors relative to which it reduces
to the form (4.8). If r = 0, $m^2 = 1$, so assume that r ≠ 0. It is then easy to verify that (4.7) is satisfied by
$$m' = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ rp & -r & 1 \end{Vmatrix}, \qquad m'' = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & 0 & -1 \end{Vmatrix}, \tag{4.9}$$
that $m'$ does represent a lattice invariant shear and that $(m'')^2 = 1$.

Pitteri [11] first produced an example of an S-twin solution of (2.8) with $R^2 = 1$, $m^2 \neq 1$. For this example, Zanzotto [15] showed that it is really a type II twin, in disguise. Here, we can come to the same conclusion, by similar reasoning. To do so, note that we have solutions of (2.8), which I put in a form more like that used by Zanzotto,
$$Rm_a^b e_b = (1 + a\otimes n)e_a = \bar e_a. \tag{4.10}$$
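The factorization (4.7) with the matrices (4.8) and (4.9) involves only integer arithmetic and is easy to confirm. The sketch below does so for arbitrary integers p, q, r; the function name `factors` is hypothetical, and the test that $m' - 1$ squares to zero is used here as a simple stand-in for "$m'$ is a shear".

```python
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def factors(p, q, r):
    """Return (m, m1, m2): m as in (4.8), and the two factors m', m'' of (4.9)."""
    m = [[1, 0, 0], [p, -1, 0], [q, r, -1]]
    m1 = [[1, 0, 0], [0, 1, 0], [r * p, -r, 1]]
    m2 = [[1, 0, 0], [p, -1, 0], [q, 0, -1]]
    return m, m1, m2


def check(p, q, r):
    """Return the four facts used in the text as booleans."""
    m, m1, m2 = factors(p, q, r)
    shear = sub(m1, IDENTITY)
    return {
        "m = m' m''": matmul(m1, m2) == m,
        "m''^2 = 1": matmul(m2, m2) == IDENTITY,
        "(m' - 1)^2 = 0": matmul(shear, shear) == ZERO,
        "m^2 = 1": matmul(m, m) == IDENTITY,
    }
```

With r ≠ 0 the first three entries are true and the last is false; with r = 0 the last is true as well, which is the type II case.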
With (4.7), this is equivalent to
$$R(m'')_a^b e_b = (1 + a\otimes n)\left(m'^{-1}\right)_a^b e_b = \bar{\bar e}_a = \left(m'^{-1}\right)_a^b \bar e_b, \tag{4.11}$$
$\bar e_a$ and $\bar{\bar e}_a$ being equally acceptable choices of lattice vectors on the second side. Assuming the lattice vectors used correspond to (4.8), we use (4.5) to calculate that $e^3 = z^3 k = z^3 n$ and, with this, that
$$\left(m'^{-1}\right)_a^b e_b = (1 + b\otimes n)e_a, \tag{4.12}$$
$$b\cdot n = 0, \tag{4.13}$$
where
$$b = rz^3(e_2 - pe_1). \tag{4.14}$$
Then, (4.11) becomes
$$R(m'')_a^b e_b = (1 + c\otimes n)e_a = \bar{\bar e}_a, \qquad c = a + b. \tag{4.15}$$
This is a solution of (2.8) with $m^2 = 1$, det m = 1, $m = m''$, axis of R perpendicular to n, implying that it is a type II twin, so
$$Rc = c. \tag{4.16}$$
The isometry is then the same as it is for the type II twin. In the example presented by Pitteri [11], Zanzotto [15] reasoned from this that the apparently different solution is physically equivalent, and I agree. Thus, all of the cases considered in this section are really type II twins. Actually, what appeared to be exceptions are compound twins, meaning that they can also be described as type I twins.
5. O-twins with Rn = n

Having disposed of the S-twins, we now consider the type of O-twins described in the heading. So, we are to satisfy
$$R_n^\psi (m^{-1})_b^a e^b = (1 - n\otimes a)e^a, \qquad a\cdot n = 2, \tag{5.1}$$
for any of the angles ψ associated with rotation twins. Operate on both sides with $-R_n^\pi$, replace m by $m' = -m$ and note that
$$-R_n^\pi(1 - n\otimes a) = (1 - 2n\otimes n)(1 - n\otimes a) = 1 - n\otimes\bar a, \tag{5.2}$$
where
$$\bar a = 2n - a \;\Rightarrow\; \bar a\cdot n = 0. \tag{5.3}$$
Also,
$$R_n^\pi R_n^\psi = R_n^{\psi+\pi}. \tag{5.4}$$
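The step from (5.1) to an S-twin rests on the identities (5.2) and (5.3), which can be confirmed with exact arithmetic. The following sketch uses a hypothetical rational unit normal and a hypothetical amplitude with a · n = 2; the function name `to_s_twin` is likewise an assumption made for illustration.

```python
from fractions import Fraction as Fr


def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]


def eye():
    return [[Fr(int(i == j)) for j in range(3)] for i in range(3)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def to_s_twin(n, a):
    """Return (lhs, rhs, a_bar) with lhs = -R_n^pi (1 - n (x) a), rhs = 1 - n (x) a_bar."""
    minus_half_turn = sub(eye(), [[2 * x for x in row] for row in outer(n, n)])
    a_bar = tuple(2 * ni - ai for ni, ai in zip(n, a))      # (5.3)
    lhs = matmul(minus_half_turn, sub(eye(), outer(n, a)))  # left side of (5.2)
    rhs = sub(eye(), outer(n, a_bar))                       # right side of (5.2)
    return lhs, rhs, a_bar


# Hypothetical data: |n| = 1 and a . n = 2.
N_HAT = (Fr(2, 3), Fr(-1, 3), Fr(2, 3))
A_O = (Fr(7, 3), Fr(4, 3), Fr(4, 3))
```

For these data the two sides of (5.2) coincide exactly and the transformed amplitude is orthogonal to n, so the transformed solution is of S type.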
Thus, (5.1) gets transformed to the S-twin solution
$$R_n^{\psi'}\left(m'^{-1}\right)_b^a e^b = (1 - n\otimes\bar a)e^a, \qquad \psi' = \psi + \pi, \quad m' = -m \tag{5.5}$$
and, obviously, we can transform (5.5) in a similar way to get (5.1). There are exceptional cases. For one,
$$\psi = \pi \;\Rightarrow\; \psi' = 2\pi \;\Rightarrow\; R_n^{\psi'} = 1, \tag{5.6}$$
(5.5) then describing a lattice invariant shear. One is then combining the trivial $-R_n^\pi = 1 - 2n\otimes n$ with a lattice invariant shear, to get a solution which is rather trivial, but perhaps not useless. Another exceptional case occurs if
$$m \in L(e_a) \iff m' \in L(e_a), \tag{5.7}$$
(5.5) then describing a penetration twin or a fake twin. Either way, $\bar a = 0$. Then, $R_n^{\psi'} \in P(e_a)$, so the axis n is normal to some crystallographic planes. For the remaining solutions, if ψ is one of the angles associated with rotation twins, ψ + π is another, so one can take any of the S-twin solutions discussed in Section 3 and transform it to get an O-twin solution. From the remarks after (2.23), it follows that, for the twins considered in this section, the axis of rotation is parallel to the normal of some crystallographic planes.

6. O-twins with axis of R perpendicular to n

For these, we can use the fact that any rotation can be written as a product of two 180° rotations with axes perpendicular to that of the rotation and one of these axes
can be chosen at will. If u is a unit vector such that u · n = 0, there is then a vector v such that
$$R_u^\psi = R_n^\pi R_v^\pi, \qquad v\cdot u = 0, \qquad |v| = 1, \qquad 2(v\cdot n)^2 = 1 + \cos\psi. \tag{6.1}$$
Given an O-twin solution of the form
$$R_u^\psi (m^{-1})_b^a e^b = (1 - n\otimes a)e^a = \bar e^a, \qquad a\cdot n = 2, \tag{6.2}$$
we can transform it as we did (5.1) to get the S-twin solution
$$R_v^\pi\left(m'^{-1}\right)_b^a e^b = (1 - n\otimes\bar a)e^a = \tilde e^a, \qquad \tilde e^a = -R_n^\pi\bar e^a, \quad m' = -m, \tag{6.3}$$
with $\bar a$ again given by (5.3). One exceptional case occurs when (6.3) describes a penetration twin, so that
$$m \in L(e_a) \iff m' \in L(e_a), \qquad \bar a = 0. \tag{6.4}$$
It then follows that $m'$ is similar to $R_v^\pi$, so
$$(m')^2 = 1 \;\Rightarrow\; m^2 = 1. \tag{6.5}$$
In these cases, it is not necessary that u be parallel to a row of atoms or to a normal of a crystallographic plane. My [5] analysis of Friedel twins in quartz, which is consistent with Zanzotto’s [16, 17], shows that they provide an example fitting (6.4) and (6.5), with ψ = π/2. Another exceptional case occurs when
$$v\cdot n = 0 \;\Rightarrow\; R_u^\psi = R_{v\wedge n}^\pi. \tag{6.6}$$
Then, (6.3) describes a rotation S-twin of the kind discussed in Section 4, essentially a type II twin. Using (2.21), one can transform it to get solutions of (5.1) rather explicitly. In this case, u need not be parallel to a normal of crystallographic planes or to rows of atoms. This covers the nontrivial possibilities with $m^2 = 1$, so I now assume that $m^2 \neq 1$. If v is parallel to n, (6.1) gives $R_u^\psi = 1$, which is of no interest. So, we can assume that v is neither parallel nor perpendicular to n. Then, $R_v^\pi$ does not map a plane with normal n to itself, and we do have
$$R_v^\pi u = -u, \qquad u\cdot n = 0. \tag{6.7}$$
Such solutions are characterized by Adeleke [13], in his subcase 3.3.2.2. Restrictions on $m'$ not already mentioned are that
$$m' \text{ must have } -1 \text{ as an eigenvalue} \;\Rightarrow\; (m')_a^b k^a = -k^b \ \text{ and } \ (m')_a^b l_b = -l_a, \tag{6.8}$$
for some numbers $k^a$ and $l_a$, not all in either set being zero. Also,
$$\text{there must be numbers } y^a \text{ such that } k^a,\ y^a \text{ and } (m')_b^a y^b \text{ are linearly independent.} \tag{6.9}$$
Of course, $m'$ can be replaced by $m'^{-1}$ in (6.8). If I calculate correctly, such $y^a$ always exist for any particular $m'$ of the kind considered, but the linear independence fails for some values of these. For calculations to follow, one can use any values of $m'$ and $y^a$ that are consistent with (6.8) and (6.9). Take u as a unit vector and use the orthonormal basis u, n, w = u ∧ n. Then, for some angle ϕ with sin 2ϕ ≠ 0,
$$v = \cos\phi\, n + \sin\phi\, w. \tag{6.10}$$
Then, (6.1)$_4$ can be replaced by
$$\phi = \pm\frac{\psi}{2}. \tag{6.11}$$
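The factorization (6.1) and the half-angle relation (6.11) can be confirmed numerically. In the sketch below the basis u, n, w = u ∧ n is taken to be the standard basis and $R_u^\psi$ is built from the Rodrigues formula with the right-handed sign convention; both are assumptions of the example, and with them the product $R_n^\pi R_v^\pi$ reproduces $R_u^\psi$ for the sign ϕ = −ψ/2, while the constraint in (6.1)$_4$ holds for either sign.

```python
import math


def rotation(axis, psi):
    """Rodrigues formula for the rotation through psi about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(psi), math.sin(psi)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def half_turn(t):
    """Matrix 2 t (x) t - 1 of the 180-degree rotation about the unit vector t."""
    return [[2 * t[i] * t[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def factorization_error(psi, sign):
    """Return (max entry error of R_u^psi - R_n^pi R_v^pi, residual of (6.1)_4)."""
    u, n, w = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)   # w = u ^ n
    phi = sign * psi / 2                                          # (6.11)
    v = tuple(math.cos(phi) * ni + math.sin(phi) * wi for ni, wi in zip(n, w))
    product = matmul(half_turn(n), half_turn(v))
    target = rotation(u, psi)
    err = max(abs(product[i][j] - target[i][j])
              for i in range(3) for j in range(3))
    v_dot_n = sum(x * y for x, y in zip(v, n))
    return err, 2 * v_dot_n ** 2 - (1 + math.cos(psi))
```

Reversing the orientation of the basis or of the rotation convention swaps the roles of the two signs in (6.11), which is presumably why the text leaves the sign open.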
Adeleke’s prescription for $e^a$ gives them as solutions of
$$e^a\cdot w = y^a, \qquad e^a\cdot R_v^\pi w = (m')_b^a y^b, \qquad e^a\cdot u = k^a, \tag{6.12}$$
which gives
$$e^a = k^a u + \left(\csc 2\phi\,(m')_b^a y^b + \cot 2\phi\, y^a\right) n + y^a w. \tag{6.13}$$
For $\bar a$, his prescription gives
$$\bar a = \left((m')_b^a e^b\cdot n\right) R_v^\pi e_a - n. \tag{6.14}$$
Of course, one must transform this to get the corresponding solutions of (6.1), using $a = -\bar a + 2n$. From (6.12)$_1$, it follows that the axis of $R_u^\psi$ is given by
$$u = k^a e_a. \tag{6.15}$$
Now, starting with the combination $R_v^\pi\left(m'^{-1}\right)_b^a e^b$, use (6.7) and (6.10) to get
$$R_v^\pi n = \cos 2\phi\, n + \sin 2\phi\, w, \tag{6.16}$$
$$R_v^\pi w = \sin 2\phi\, n - \cos 2\phi\, w \tag{6.17}$$
to evaluate this. One finds that it can be put in the form $(1 - n\otimes\bar a)e^a$, with
$$\bar a = \csc 2\phi\left((m')_b^a - \left(m'^{-1}\right)_b^a\right) y^b e_a, \tag{6.18}$$
as an alternative to (6.14). With $l = l_a e^a \Rightarrow l_a = l\cdot e_a$, it follows from (6.8) and (6.18) that
$$l\cdot\bar a = 0. \tag{6.19}$$
Using (6.13), one gets
$$l = k^a l_a u + \sec\phi\, l_a y^a(-\sin\phi\, n + \cos\phi\, w) \;\Rightarrow\; l\cdot v = 0, \tag{6.20}$$
where (6.10) is used. With the possibility v · n = 0 excluded, a¯ and v cannot be parallel, so these determine the direction of a plane with normal l. When the
eigenspace of $m'$ corresponding to −1 is one-dimensional, $k^a$ and $l_a$ are proportional to integers, implying that u is parallel to some rows of atoms and l is normal to some crystallographic planes. Either −1 is an isolated eigenvalue, in which case the eigenspace is one-dimensional, or the eigenvalues are 1, −1, −1. In the latter case, one can reduce $m'$ to the form (4.8). It is then easy to show that if $m^2 \neq 1$, the eigenspace corresponding to −1 is one-dimensional, and we have already covered possibilities with $m^2 = 1$.

As is clear from (6.18), there are exceptional cases when $(m')^2 = 1$. One possibility is that one has type I or type II twins, so sin 2ϕ = 0, $\bar a$ as given by (6.18) then being indeterminate. The other possibility, mentioned earlier, is that
$$\bar a = 0 \;\Rightarrow\; m' \in L(e_a), \qquad R_v^\pi \in P(e_a). \tag{6.21}$$
With this, we have all solutions of the twinning equations describing rotation twins.

7. An example

In part, this is an attempt to warn those unfamiliar with research on minerals of some of the pitfalls. Experimentally, it is not always easy to distinguish between various twin laws which seem to be quite different. In the words of some experts, Donnay et al. [18], “Whenever the crystal lattice or one of its multiple lattices possesses pseudo-symmetry, the crystal may twin (twinning by pseudo-merohedry or by reticular pseudo-merohedry). If the pseudo-symmetry is pronounced and sufficiently high, several twin laws may lead to nearly identical orientations of the twinned individual. The resulting twins have been called ‘neighboring twins’ (macles voisines, Friedel, 1926). Because the relative orientation of one of the twinned individuals with respect to the other is known to morphologists only to within the limits of error of optical goniometry, the identification of neighboring twins may be a difficult problem, as is well illustrated by cryolite, staurolite, harmotome, morvenite, etc.” These workers developed a rather sensitive technique for distinguishing between such possibilities. As an example, they considered four likely possibilities for describing one kind of twin in staurolite. These involve four rotations with different axes. Two have 120° angles, the angles for the other two being 90° and 180°. According to their observations, that with the 180° angle is the best fit among the four.

I have looked at several references on staurolite and plan to look at more, since I have found them rather confusing and incomplete, partly because of my ignorance and meager experience with minerals. In one of the more recent references, the book by Klein and Hurlbut [19, pp. 104, 105 and 438, 439], various information is presented, including the chemical composition, from which it is clear that this is a complicated multilattice, and that the composition is somewhat different in different specimens. This is also the case in various other minerals. It is described as monoclinic with a β angle of 90°, being pseudo-orthorhombic. As I interpret this and other writings, the skeletal lattice is orthorhombic, but the configuration of
shifts reduces the symmetry to monoclinic. The space group is described as C2/m. No details concerning shifts are given. Although writers tend not to say so, the general practice seems to be to use as lattice vectors a mutually orthogonal set, with magnitudes given by these writers as

$$a = |e_1| = 7.83\ \text{Å}, \quad b = |e_2| = 16.62\ \text{Å}, \quad c = |e_3| = 5.65\ \text{Å}. \tag{7.1}$$
Other estimates I have seen of these numbers are not very different. As they describe them, two common kinds occur, both being called penetration twins, and both being involved in crosses commonly found in this material,

“(1) with twin plane {0 3 1} in which the individuals cross at nearly 90° (Figure 13.24b);
(2) with twin plane {2 3 1} in which they cross at nearly 60° (Figure 13.24c).” (7.2)

I will only present an analysis of the first kind. Also, I note that Donnay et al. [18] do not consider the twin plane {0 3 1} as one of the more likely possibilities. Here, {a b c} represents the direction $a e^1 + b e^2 + c e^3$ or a crystallographic equivalent. For the particular direction noted, replace curly brackets by parentheses. Concerning the first kind, in a paper written almost two decades earlier, Hartman [7] complains that too many workers ignore experimental work by Friedel [20], then quite old, which he accepts. This is also endorsed by Cahn [6], a major expert. Briefly, Friedel found that the isometry is better described by

$$R^{\pi/2}_{e_1}. \tag{7.3}$$

From such remarks, I became doubtful that the interface is exactly a {0 3 1} plane. Similarly, Klein and Hurlbut [19] ignore the work mentioned above, by Donnay et al. For the second kind, they found that this isometry is better described as a 180° rotation about [3 1 3], the direction $3e_1 + e_2 + 3e_3$. One could replace this by a crystallographic equivalent, the set of these being described by ⟨3 1 3⟩. Various writers just give one direction, assuming you know that the equivalents can be used and, here, I follow suit.

I decided to try analyzing the first kind as a rotation twin, using (7.3). This axis is perpendicular to the normal to the aforementioned (0 3 1) plane or an equivalent. I assume that the interface is a plane with normal perpendicular to the axis, but not necessarily this one. A priori, it might be an S-twin or an O-twin, these being growth twins. I tried both possibilities, concluding that only the latter is appropriate. I shall sketch my analysis of the latter, using the analysis in Section 6. We are given that

$$u = \frac{e_1}{a} = a e^1, \quad \psi = \frac{\pi}{2}. \tag{7.4}$$

With the lattice vectors orthogonal, $e_2$ and $e_3$ must be in the plane determined by n and w, where the vectors are defined as in Section 6, giving

$$b e^2 = \cos\chi\, n + \sin\chi\, w, \quad c e^3 = -\sin\chi\, n + \cos\chi\, w, \tag{7.5}$$
where χ is some angle, unknown since no particular value of n is assumed. Of course, these values of $e_a$ are to be consistent with (6.13), and I reject solutions possible only for isolated values of b/c. This gives equations which can be solved by elementary methods, to get two solutions. For m, one gets the values

$$m_1 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{Vmatrix}, \quad m_2 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix} \in L(e_a). \tag{7.6}$$

For the first, one gets

$$(n \otimes a)_1 = (b e^2 + c e^3) \otimes \left( \frac{e_2}{b} + \frac{e_3}{c} \right) \;\Rightarrow\; a_1 = 2 n_1 \tag{7.7}$$

and, for the second

$$(n \otimes a)_2 = (b e^2 - c e^3) \otimes \left( \frac{e_2}{b} - \frac{e_3}{c} \right) \;\Rightarrow\; a_2 = 2 n_2. \tag{7.8}$$

The two directions of n are orthogonal, these solutions being rather similar to the solutions Zanzotto [16, 17] and I [5] got by different reasoning for the Friedel twins in quartz. The latter form 90° crosses, and one can use much the same reasoning to construct solutions for such crosses in staurolite. For (7.7), say, the direction of n is given by

$$\frac{b}{c}\, e^2 + e^3, \quad \frac{b}{c} \cong 2.94 \tag{7.9}$$

for the approximate values of b and c given in (7.1). In the usual jargon, this is an irrational direction. In dealing with these, it is a common practice to give a rational direction, using fairly small integers and, here, a likely choice is a (0 3 1) plane, referred to above as a twin plane. While $m_1$ and $m_2$ are not in $L(e_a, p_i)$,

$$m_3 = m_1 m_2 = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{Vmatrix} \in L(e_a, p_i) \;\Rightarrow\; m_1 = m_3 m_2, \tag{7.10}$$

from the claim that these are monoclinic crystals. Here, my interpretation of the literature is that the 180° rotation included in the point group is $R^{\pi}_{e_1}$. So, the two solutions are symmetry related. As far as I can tell, this seems to be a satisfactory description of these twins.

What is the basis for calling these penetration twins, particularly by those like Klein and Hurlbut who associate a definite twin plane with them? According to Cahn [6, p. 388], the interfaces of penetration twins are crystallographically irregular, and interfaces are not always parallel to twin planes, when these exist. I have not yet found reports of observations of details of these interfaces, so I might be missing something. From looking at sketches of the 90° crosses and eyeing
one specimen, the interfaces at least resemble perpendicular planes. By looking at sketches of the “nearly 60◦ ” crosses, the reader will be as able as I am to judge whether the interfaces are planes and if these are likely to be rotation twins. Other kinds of intergrowths more or less like these, not always found in crosses, are among the things called penetration twins. Sketches of these staurolite crosses and other twinned configurations in this and other materials are presented by various writers, for example, Klein and Hurlbut [19, pp. 103–106] and Dana [21, pp. 181– 194]. With (7.6), for the same values of m1 and m2 , one can get solutions for what I call penetration S-twins without restrictions on n, or for what I call exceptional O-twins with different values of n, but these involve different isometries. There is a little mystery which I seem to have resolved. For the second kind of twin described in (7.2), Cahn [6, p. 373] says nothing about the twin plane mentioned there, but does mention a twin law “. . . with a three fold twin axis [1 0 1], . . . ”. I expected to find this among the four likely prospects considered by Donnay et al., but did not. At first, I thought that this could be due to some misprint. Then, I found that Hartman [7, p. 234] noted that different workers use different values of c, one based on “. . . the X-ray unit cell. . . ”, the other on “. . . the morphological description. . . ”, the latter being twice the former. Assuming Cahn but not the others used the latter, this corresponds to the [1 0 2] possibility considered by Donnay et al. As was mentioned earlier, this is not the one they consider the most likely, which is a two fold rotation with twin axis [3 1 3]. However, this seems to give a likely explanation of the indicated difference. Of the two, Cahn’s paper was published a bit earlier so, at the time, he might not have learned of the results of the other writers. 
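The arithmetic behind (7.1), (7.6), (7.9) and (7.10) can be checked numerically. The sketch below is only an illustration and not part of the original analysis: it assumes Cartesian axes along the orthogonal lattice vectors, takes the reciprocal vectors to be $e^a = e_a/|e_a|^2$, and uses hypothetical variable names. It computes b/c, confirms the matrix relations in (7.10), checks that the two normals from (7.7) and (7.8) are orthogonal, and estimates the angle between the irrational normal of (7.9) and the normal to the rational (0 3 1) plane.

```python
import numpy as np

# Lattice parameters from (7.1), in angstroms.
a, b, c = 7.83, 16.62, 5.65

# Assumption: Cartesian axes along the orthogonal lattice vectors e_1, e_2, e_3,
# so the reciprocal vectors are e^a = e_a / |e_a|^2.
e_rec = np.diag([1 / a, 1 / b, 1 / c])  # rows are e^1, e^2, e^3

# Matrices from (7.6) and their product from (7.10).
m1 = np.diag([1, 1, -1])
m2 = np.diag([1, -1, 1])
m3 = m1 @ m2


def unit(v):
    """Return v scaled to unit length."""
    return v / np.linalg.norm(v)


# Normal directions for the two solutions (7.7) and (7.8).
n1 = unit(b * e_rec[1] + c * e_rec[2])
n2 = unit(b * e_rec[1] - c * e_rec[2])

# Normal to the rational (0 3 1) plane: 0 e^1 + 3 e^2 + 1 e^3.
n_031 = unit(3 * e_rec[1] + e_rec[2])

ratio = b / c
misfit_deg = np.degrees(np.arccos(np.clip(n1 @ n_031, -1.0, 1.0)))

if __name__ == "__main__":
    print(f"b/c = {ratio:.2f}")
    print("m3 diagonal:", np.diag(m3))
    print(f"n1 . n2 = {n1 @ n2:.1e}")
    print(f"angle between n1 and the (0 3 1) normal: {misfit_deg:.2f} degrees")
```

Under these assumptions the irrational normal and the (0 3 1) normal differ by a fraction of a degree, which is consistent with the remark that (0 3 1) is a likely rational stand-in for the direction in (7.9).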
I will not recommend a solution for these twins without demonstrating that it is adequate to describe the “nearly 60°” crosses, something I have not yet explored. I do think it desirable to try to analyze more of the many growth twins, to better understand how well my twinning equations apply to them, and to determine how useful they might be in helping to resolve ambiguities mentioned at the beginning of this Section.

References

1. C.S. Barrett and T.B. Massalski, The Structure of Metals, 3rd edn. McGraw-Hill, New York (1966).
2. J.L. Ericksen, Equilibrium theory for X-ray observations. Arch. Rational Mech. Anal. 139 (1997) 181–200.
3. J.L. Ericksen, On correlating two theories of twinning. Arch. Rational Mech. Anal. 151 (2000) 261–289.
4. J.L. Ericksen, Twinning analyses in the X-ray theory. Internat. J. Solids Structures 38 (2001) 967–995.
5. J.L. Ericksen, On the theory of growth twins in quartz. Math. Mech. Solids 6 (2001) 359–386.
6. R.W. Cahn, Twinned crystals. Adv. in Phys. Quart. Suppl. Phil. Mag. 3 (1954) 363–445.
7. P. Hartman, On the morphology of growth twins. Zeits. Krist. 107 (1956) 225–237.
8. M. Pitteri, On (ν + 1) lattices. J. Elasticity 15 (1985) 3–25.
9. J.L. Ericksen, On groups occurring in the theory of crystal multi-lattices. Arch. Rational Mech. Anal. 148 (1999) 145–178.
10. M. Pitteri, On the kinematics of mechanical twinning in crystals. Arch. Rational Mech. Anal. 88 (1985) 25–58.
11. M. Pitteri, On type-2 twins in crystals. Internat. J. Plasticity 2 (1986) 99–106.
12. M. Pitteri and G. Zanzotto, Beyond space groups: The arithmetic symmetry of deformable multilattices. Acta Cryst. A 54 (1998) 359–373.
13. S. Adeleke, On matrix equations of twinning in crystals. Math. Mech. Solids 5 (2000) 395–415.
14. J.L. Ericksen, Some surface defects in unstressed thermoelastic solids. Arch. Rational Mech. Anal. 88 (1985) 337–345.
15. G. Zanzotto, Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some experimental results. Mechanical twinning and growth twinning, Nota II. Atti Accad. Naz. Lincei 82 (1988) 725–741, 743–756.
16. G. Zanzotto, Geobarothermometric properties of growth twins and mathematical analyses of quartz for a broad range of temperatures and pressures. Phys. Chem. Minerals 16 (1989) 783–789.
17. G. Zanzotto, Thermoelastic stability of multiple growth twins in quartz and general barothermometric implications. J. Elasticity 23 (1990) 253–287.
18. G. Donnay, J.D.H. Donnay and V.J. Hurst, Precession goniometry to identify neighboring twins. Acta Cryst. 8 (1955) 507–509.
19. C. Klein and C.S. Hurlbut, Jr., Manual of Mineralogy (after James D. Dana), 21st edn, revised. Wiley, New York (1999).
20. G. Friedel, Sur les macles de la staurotide. Bull. Soc. Franç. Minéral. 45 (1922) 8–15.
21. B.W. Dana, A Textbook on Mineralogy with an Extended Treatise on Crystallography (revised and enlarged by W.E. Ford). Wiley, New York (1932).
Minimum Free Energies for Materials with Finite Memory

MAURO FABRIZIO¹ and MURROUGH GOLDEN²

¹ Dipartimento di Matematica, Università degli Studi di Bologna, Piazza di Porta S. Donato 5, 40127 Bologna, Italia. E-mail: [email protected]
² School of Mathematical Sciences, Dublin Institute of Technology, Dublin, Ireland. E-mail: [email protected]

Received: 22 May 2002; in revised form 16 October 2003

Abstract. Finite memory viscoelastic materials are of interest because (a) they are not necessarily experimentally distinguishable from materials with infinite memory; and (b) the assumption of infinite memory can, in certain contexts, lead to results that run counter to physical intuition. An example of this – the quasi-static viscoelastic membrane in a frictional medium – is discussed. It is shown that, for a finite memory material, the singularity structure of the Fourier transform of the relaxation function derivative is quite different from the infinite memory case in the sense that it is an entire function with all its singularities being essential singularities at infinity. The formula for the minimum free energy [1] is still valid in this case. In contrast to the work function, this quantity, and all other functions of the minimal state, depend only on the values of the history over the period when the relaxation function derivative is nonzero. The factorization required to determine the form of the minimum free energy can be carried out explicitly for simple step-function choices of the relaxation function derivative. The two simplest cases are fully worked through and explicit formulae are given for all relevant quantities.

Mathematics Subject Classifications (2000): 74A15, 74D05, 30E20.

Key words: linear viscoelasticity, thermodynamics, minimum free energy, finite memory.
To Clifford A. Truesdell, Founder of Rational Thermodynamics
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 375–397. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

General expressions have been given recently for the minimum free energy of linear viscoelastic materials under isothermal conditions in the scalar case [1] and in the full tensorial case [2]. An alternative derivation for the full tensorial case was given in [3] together with expressions for a family of other free energies that are functions of state in the sense of Noll [4] in the case of scalar, discrete spectrum (relaxation function given by a sum of exponentials) response. The case of a compressible fluid is considered in [5]. The aim of the present work is to derive an expression for the minimum free energy corresponding to a relaxation function with the special property that its
derivative is nonzero over only a finite interval of time. It will be seen that there are special features associated with the analytic behaviour of the frequency space representation of such relaxation functions which render this a non-trivial extension, with unique features, of the general treatments referred to above. This property of finite memory is of interest in the first instance because finite and infinite memories are not necessarily experimentally distinguishable; also, the assumption of infinite memory can lead to paradoxical results for many problems. The scalar case is dealt with in this work. In Section 2, various relationships required in later sections are presented. The problem of a viscoelastic membrane in a frictional medium is discussed in Section 3 in order to illustrate that results running counter to physical intuition emerge from the assumption of infinite memory. In fact, a result is quoted which shows that while (time) exponential decay in the displacement occurs in the elastic problem, this is not so in the viscoelastic problem if the viscoelastic function does not decay exponentially. One would expect that any viscoelastic function would simply enhance the elastic exponential decay because of the dissipative effects associated with viscoelasticity. In Section 4, it is shown that the singularity structure of the Fourier transform of the relaxation function derivative is quite different from the infinite memory case in that it is an entire function with essential singularities of exponential type at infinity, rather than poles and branch points generally in the finite complex plane, as is the case for infinite memory materials. The latter may of course include essential singularities at infinity, though these have been excluded for simplicity from earlier work [1–3]. In Section 5, an explicit expression for the minimum free energy is derived.
This is very similar to the developments in [1, 2], though with the difference that relative strain histories are used, which leads to certain simplifications in frequency space – associated with better convergence at infinity. The main reason why the derivation is presented in some detail is to ascertain that the very different singularity structure in the finite memory quantities does not affect the final expression. An important result is proved at the end of this section, namely that the minimum free energy (and other functions of the minimal state) depends only on that portion of past history for which the memory is nonzero, while the work function depends on all past times. In Section 6 the crucial factorization required to determine the minimum free energy is discussed for specific (step-function) examples of finite memory materials, while explicit forms of the minimum free energy, and the corresponding work function, are given in Section 7 for two choices of finite memory relaxation functions.
2. Basic Relationships

We consider a linear viscoelastic solid, subject to stress in such a way that there is only one nonzero component of stress T(t) and strain E(t) related by

$$\begin{aligned} T(t) &= G_0 E(t) + \int_0^\infty G'(s) E^t(s)\, ds \\ &= G_\infty E(t) + \int_0^\infty G'(s) E_r^t(s)\, ds, \\ E^t(s) &= E(t - s), \quad s \in R, \qquad E_r^t(s) = E^t(s) - E(t), \end{aligned} \tag{2.1}$$

where $E^t \in L^1(R^+) \cap L^2(R^+) \cap C^1(R^+)$ and $G' \in L^1(R^+) \cap L^2(R^+)$ using the notation here and below: $R$ is the set of reals, $R^+$ the positive reals, and $R^{++}$ the strictly positive reals; similarly $R^-$, $R^{--}$ are the negative and strictly negative reals. The relative history $E_r^t$ will be used extensively later. The relaxation function

$$G(s) = G_0 + \int_0^s G'(u)\, du \tag{2.2}$$

is then well-defined along with $G_\infty = \lim_{s\to\infty} G(s)$. We take

$$G_\infty > 0 \tag{2.3}$$

so that the body is a solid. Let $\Omega$ be the complex ω plane and

$$\Omega^+ = \{\omega \in \Omega \mid \operatorname{Im}(\omega) \in R^+\}, \quad \Omega^{(+)} = \{\omega \in \Omega \mid \operatorname{Im}(\omega) \in R^{++}\}. \tag{2.4}$$
These define the upper half-plane including and excluding the real axis, respectively. Similarly, $\Omega^-$, $\Omega^{(-)}$ are the lower half-planes including and excluding the real axis, respectively.

A viscoelastic state is defined in general by the current value of strain and the history $(E(t), E^t)$. The concept of a minimal state, defined in [3] (see also [6–8, 2]) can be expressed as follows: two viscoelastic states $(E_1(t), E_1^t)$, $(E_2(t), E_2^t)$ are in the same minimal state if

$$E_1(t) = E_2(t); \quad \int_0^\infty G'(s + \tau)\,[E_1^t(s) - E_2^t(s)]\, ds = 0 \quad \forall \tau \ge 0. \tag{2.5}$$

We refer to any functional of $(E(t), E^t)$ which gives the same value for any $(E_1(t), E_1^t)$ and $(E_2(t), E_2^t)$ obeying (2.5), as a function of the minimal state.

For any $f \in L^2(R)$, we denote its Fourier transform by

$$f_F(\omega) = \int_{-\infty}^{\infty} f(\xi) e^{-i\omega\xi}\, d\xi, \quad f_F \in L^2(R). \tag{2.6}$$

If f is a real-valued function – which will be the case for all functions of interest here, in the time domain – then

$$\bar f_F(\omega) = f_F(-\omega) \tag{2.7}$$

where the bar denotes complex conjugate. Note that this notation differs from that in [1].
We have

$$\begin{aligned} f_F(\omega) &= f_+(\omega) + f_-(\omega), \\ f_+(\omega) &= \int_0^\infty f(\xi) e^{-i\omega\xi}\, d\xi, \\ f_-(\omega) &= \int_{-\infty}^0 f(\xi) e^{-i\omega\xi}\, d\xi, \quad f_\pm \in L^2(R), \end{aligned} \tag{2.8}$$

where $f_+$ is analytic in $\Omega^{(-)}$ since it is the Fourier transform of a function that is zero on $R^{--}$. For the cases of interest in the present work, we also assume that it is analytic on $R$ and thus on $\Omega^-$. Similarly, $f_-$ is analytic on $\Omega^+$. What we mean by extending the analyticity to $R$ for, say, $f_+$ is that there is a constant ε > 0 such that there are no singularities for Im(ω) < ε. This of course amounts to assuming that f(ξ) decays exponentially as |ξ| → ∞. However, it will sometimes be possible to weaken this assumption by taking continuous limits, once final results have been obtained, for example, by extending branch points at which no discontinuity or infinity occurs up to the real axis. In general, the integral definitions of $f_\pm$ are convergent only over part of $\Omega$. On the remainder of the complex plane (where the singularities lie), they can be defined by analytic continuation, except at the singularities. If an explicit expression can be obtained for the integral, this is a trivial procedure. Using the inverse transform to express f in terms of $f_F$, we obtain

$$\begin{aligned} f_+(\omega) &= -\frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f_F(\omega')}{\omega' - \omega^-}\, d\omega', \\ f_-(\omega) &= \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f_F(\omega')}{\omega' - \omega^+}\, d\omega', \\ \omega^\pm &= \lim_{\alpha \to 0^+} (\omega \pm i\alpha). \end{aligned} \tag{2.9}$$

Thus, we move ω slightly into the half-plane of analyticity of $f_\pm$ respectively to achieve convergence in the time integration. This also ensures that the integrals on the right-hand side of (2.9) have a well-defined meaning. The limit is taken after the integration is carried out.

Functions on $R$ which vanish identically on $R^{--}$ are defined as functions on $R^+$. For such quantities, $f_F = f_c - i f_s$, where $f_c$, $f_s$ are the Fourier cosine and sine transforms

$$\begin{aligned} f_c(\omega) &= \int_0^\infty f(\xi) \cos \omega\xi\, d\xi = f_c(-\omega), \\ f_s(\omega) &= \int_0^\infty f(\xi) \sin \omega\xi\, d\xi = -f_s(-\omega). \end{aligned} \tag{2.10}$$

Thus

$$G'_F(\omega) = \int_0^\infty G'(s) e^{-i\omega s}\, ds = G'_c(\omega) - i G'_s(\omega). \tag{2.11}$$
Properties of $G'_s(\omega)$ include [9]

$$G'_s(\omega) \le 0 \quad \forall \omega \in R^{++}, \qquad G'_s(-\omega) = -G'_s(\omega) \quad \forall \omega \in R, \tag{2.12}$$

the first relation being a consequence of the second law of thermodynamics and the second being a particular case of (2.10). It follows that $G'_s(0) = 0$. Actually, it was proved in [9] that $G'_s(\omega) < 0$ for positive ω, based on a restricted form of the second law. It will be shown in Section 6 that this does not hold for at least two finite memory examples. We also have [9]

$$G_\infty - G_0 = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{G'_s(\omega)}{\omega}\, d\omega < 0 \tag{2.13}$$

so that $G'_s(\omega)/\omega \in L^1(R)$. It follows from (2.3) and (2.13) that $G_0$ is positive definite.

The function $G'_F(\omega)$ is analytic on $\Omega^{(-)}$. As noted after (2.8), this is a consequence of the fact that $G'$ vanishes on $R^{--}$ which is essentially the requirement of Causality [10]. It is assumed that $G'_F(\omega)$ is analytic on $R$ and therefore on $\Omega^-$. The quantity $G'_F(-\omega) = \overline{G'_F}(\omega)$ is analytic in $\Omega^+$, a mirror image, in the real axis, of the singularity structure of $G'_F(\omega)$. Thus, $G'_s(\omega)$ has singularities in both $\Omega^{(+)}$ and $\Omega^{(-)}$ which are mirror images of one another. Similarly, its zeros will be mirror images of one another. The singularity structure of

$$H(\omega) = -\omega G'_s(\omega) = H(-\omega) \ge 0 \quad \forall \omega \in R \tag{2.14}$$

will be of central interest. We have

$$H(\omega) = H_1(\omega^2) \tag{2.15}$$

which is a consequence of the analyticity of H(ω) on the real axis. It follows that H(ω) goes to zero at least quadratically at the origin. Note that H(ω) is positive semi-definite. The possibility of it vanishing for nonzero frequencies is not excluded, following the remarks after (2.12). It will be required in later developments that H(ω) can be written in the form

$$H(\omega) = H_+(\omega) H_-(\omega), \tag{2.16}$$

where $H_+(\omega)$ has no singularities or zeros in $\Omega^{(-)}$ and is thus analytic in $\Omega^-$. Similarly, $H_-(\omega)$ is analytic in $\Omega^+$ with no zeros in $\Omega^{(+)}$. By considering the inverse sine transform of $G'_s(\omega)$ ([9], for example) one can show that for the infinite memory case (where $G'_F$ is analytic at infinity; we shall see that this implies that it has no finite memory component)

$$H_\infty = \lim_{|\omega| \to \infty} H(\omega) = -\lim_{\omega \to \infty} \omega G'_s(\omega) = -G'(0) \ge 0. \tag{2.17}$$

The sign of $G'(0)$ has been deduced by various authors from thermodynamic constraints in the general three-dimensional case [11, 12, 9]. We assume for present
purposes that $G'(0)$ is nonzero so that $H_\infty$ is a finite, positive number. Then $H(\omega) \in R^{++}$ ∀ω ∈ R, ω ≠ 0. The result corresponding to (2.17) for the finite memory case is discussed in Section 4.

There is a non-uniqueness in the factorization (2.16) up to a constant phase factor. We eliminate this by taking

$$H_\pm(\omega) = H_\mp(-\omega) = \overline{H_\mp}(\omega), \qquad H(\omega) = |H_\pm(\omega)|^2. \tag{2.18}$$

There is still an arbitrariness of sign in the sense that $-H_\pm(\omega)$ are also acceptable choices. It should be remarked that there is the possibility of a further, deeper non-uniqueness deriving from the fact that, since H(ω) is not positive definite for nonzero ω, the condition for unique factorization is not met [2]. The following sufficient conditions are also stated in [2] for the full tensorial case: $G'(0) < 0$ which has been assumed; $G(\cdot) - G_\infty$ integrable; and $G''$ integrable. This last condition does not hold for the examples in Section 6, and since such non-uniqueness in fact occurs for one of these examples, it may therefore be concluded that the condition that $G''$ be integrable is also necessary.

Consider now the strain history $E^t$. Define

$$E^t_+(\omega) = \int_0^\infty E^t(s) e^{-i\omega s}\, ds, \quad E^t_+ \in L^2(R). \tag{2.19}$$
It is analytic in $\Omega^{(-)}$, a property which will be assumed to extend to $\Omega^-$. It is defined in all or part of $\Omega^{(+)}$ by analytic continuation. We also require the Fourier transform of the relative history,

$$E^t_{r+}(\omega) = E^t_+(\omega) - E(t) \int_0^\infty e^{-i\omega s}\, ds = E^t_+(\omega) - \frac{E(t)}{i\omega^-}, \tag{2.20}$$

where $\omega^-$ is defined as in (2.9), the limit being taken, as noted, after any integration involving the quantity $(\omega^-)^{-1}$ has been carried out. Under a similar assumption to that for $E^t_+$, we may conclude that $E^t_{r+}$ is analytic in $\Omega^-$. Similarly, we define (if $E^t(s)$, $s \in R^-$, belongs to $L^2(R^-)$)

$$\begin{aligned} E^t_-(\omega) &= \int_{-\infty}^0 E^t(s) e^{-i\omega s}\, ds, \quad E^t_- \in L^2(R), \\ E^t_{r-}(\omega) &= \int_{-\infty}^0 E^t_r(s) e^{-i\omega s}\, ds = E^t_-(\omega) + \frac{E(t)}{i\omega^+}, \end{aligned} \tag{2.21}$$

both of which are analytic in $\Omega^+$. Analyticity at infinity is assumed for $E^t_+(\omega)$ though not for $E^t_-(\omega)$. Note that

$$\frac{dE^t_+(\omega)}{dt} = -i\omega E^t_+(\omega) + E(t) \tag{2.22}$$

yielding that

$$\frac{dE^t_{r+}(\omega)}{dt} = -i\omega E^t_{r+}(\omega) - \frac{\dot E(t)}{i\omega^-}. \tag{2.23}$$
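The step from (2.22) to (2.23) can be spelled out using only (2.20); the following is a routine check of the stated result rather than anything beyond the text.

$$\begin{aligned} \frac{d}{dt} E^t_{r+}(\omega) &= \frac{d}{dt} E^t_+(\omega) - \frac{\dot E(t)}{i\omega^-} \\ &= -i\omega E^t_+(\omega) + E(t) - \frac{\dot E(t)}{i\omega^-} \\ &= -i\omega \left[ E^t_{r+}(\omega) + \frac{E(t)}{i\omega^-} \right] + E(t) - \frac{\dot E(t)}{i\omega^-} \\ &= -i\omega E^t_{r+}(\omega) - \frac{\dot E(t)}{i\omega^-}, \end{aligned}$$

where the last line uses the fact that $-i\omega\, E(t)/(i\omega^-)$ tends to $-E(t)$ in the limit $\alpha \to 0^+$ and so cancels the term $E(t)$.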
Note also that T(t), given by (2.1), is independent of $E^t(s)$, $s \in R^{--}$, which allows us to extend $G'$ to $R$ as, for example, an odd function. Using this fact and Plancherel's theorem for the Fourier transform [13, 14] gives that [1, 15, 16]

$$\begin{aligned} T(t) &= G_0 E(t) - \frac{1}{\pi i} \int_{-\infty}^{\infty} G'_s(\omega) E^t_+(\omega)\, d\omega \\ &= G_\infty E(t) - \frac{1}{\pi i} \int_{-\infty}^{\infty} G'_s(\omega) E^t_{r+}(\omega)\, d\omega, \end{aligned} \tag{2.24}$$

where (2.13) has been used in writing (2.24)₂.

3. Viscoelastic Membrane

In this section we recall a result obtained in [17], in order to show that an infinite memory, represented by a relaxation function $G' \in L^1(R^+)$, places very strong restrictions on the asymptotic behaviour of solutions of viscoelastic boundary and initial value problems. To the extent that these restrictions run counter to physical intuition, the results derived provide a motivation for considering finite memory materials.

Let us consider a viscoelastic membrane occupying the region B with boundary ∂B, represented by the system

$$u_{tt} = G_0 \Delta u + \int_0^\infty G'(s)\, \Delta u^t(s)\, ds - a u_t \tag{3.1}$$

with boundary and initial conditions given by

$$u|_{\partial B} = 0, \quad u(x, 0) = u_0(x), \quad u_t(x, 0) = u_1(x), \tag{3.2}$$

where Δ is the Laplacian operator and the constant a ≥ 0 denotes the coefficient of a viscous force. Let us consider the problem (3.1), (3.2) on the domain Q = B × (0, ∞). We suppose a > 0, $G' \in L^1(R) \cap L^2(R)$, $u_0 \in H^1(B)$, $u_1 \in L^2(B)$.

DEFINITION 3.1. A function $u \in L^2(R^+, H_0^1(B)) \cap H^1(R^+, L^2(B))$ is said to be a weak solution of the problem (3.1), (3.2) with data $u_0 \in H^1(B)$, $u_1 \in L^2(B)$, if u satisfies the integral equation

$$\begin{aligned} &\int_Q u_t(x, t) \varphi_t(x, t)\, dx\, dt + \int_B u_1(x) \varphi(x, 0)\, dx \\ &\quad = \int_Q \left\{ G_0 \nabla u(x, t) \cdot \nabla \varphi(x, t) + \int_0^\infty G'(s) \nabla u^t(x, s) \cdot \nabla \varphi(x, t)\, ds + a u_t(x, t) \varphi(x, t) \right\} dx\, dt \end{aligned} \tag{3.3}$$

for all $\varphi \in L^2(R^+, H_0^1(B)) \cap H^1(R^+, L^2(B))$.
A function $g \in L^1(R^+)$ is said to decay exponentially if there exists a positive β such that

$$\int_0^\infty e^{\beta t} |g(t)|\, dt < \infty. \tag{3.4}$$
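Two simple kernels illustrate definition (3.4); both are chosen here purely for illustration and are not taken from [17]. For $g(t) = e^{-\gamma t}$ with γ > 0,

$$\int_0^\infty e^{\beta t} |g(t)|\, dt = \frac{1}{\gamma - \beta} < \infty \quad \text{for any } 0 < \beta < \gamma,$$

so g decays exponentially. For $g(t) = (1 + t)^{-2}$, one has $g \in L^1(R^+)$ with $\int_0^\infty g(t)\, dt = 1$, but $e^{\beta t}(1 + t)^{-2} \to \infty$ as $t \to \infty$ for every β > 0, so the integral in (3.4) diverges and g does not decay exponentially in this sense.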
We say that a weak solution u decays exponentially if $E^{1/2}(t)$ decays exponentially, where the energy E is given by

$$E(t) = \int_B \left( u_t^2(x, t) + [\nabla u(x, t)]^2 \right) dx.$$

One of the main results contained in [17] is the following:

THEOREM 3.1. If $G'$ does not decay exponentially, then the weak solution of problem (3.1), (3.2) does not decay exponentially.

When the relaxation function satisfies $G' = 0$, and a > 0, then the solutions of the system (3.1), (3.2) exhibit exponential decay. In contrast, Theorem 3.1 gives that when a > 0 and $G' \neq 0$ does not decay exponentially, then the solutions do not exhibit exponential decay, even though the memory term represents a dissipative effect. In other words, the decay to zero of the solutions of (3.1), (3.2) for t → ∞ cannot be faster than the decay to zero of the kernel $G'$ for s → ∞. Of course this constraint does not apply if the memory is finite. Roughly speaking, when $G'$ is nonzero on an infinite interval, then the memory term not only provides a dissipative effect, but also a braking effect on the decay to zero of the solutions. For many problems, these two effects are physically contradictory. In such cases it is more suitable to use a kernel $G'$ which is nonzero only in a finite interval.

4. Finite Memory

We now explore the case where

$$G'(t) = 0, \quad t > d > 0, \tag{4.1}$$

so that (2.11) becomes

$$G'_F(\omega) = \int_0^d G'(s) e^{-i\omega s}\, ds \tag{4.2}$$

and the inverse relationship is

$$G'(s) = \frac{1}{2\pi} \int_{-\infty}^{\infty} G'_F(\omega) e^{i\omega s}\, d\omega. \tag{4.3}$$

We have the same behaviour if the memory is finite, i.e., there exists a $d \in R^{++}$ such that $G'(s) = 0$ for any s ≥ d.
Before proving an important result, it is relevant to note that the function $e^{iz}$ has an essential singularity at infinity. This is manifested as an exponential divergence as Im(z) → −∞ and as an exponential decay as Im(z) → ∞. We shall, for brevity, and in the context of this paper, refer to $e^{iz}$ as analytic/convergent on the upper half-plane, meaning analytic on the finite part of the plane and exponentially convergent to zero as Im(z) → ∞. The function $e^{-i\omega b}$ on $\Omega$, where b > 0, diverges exponentially in $\Omega^{(+)}$ as Im(ω) → +∞. Similarly, $e^{i\omega b}$, where b > 0, diverges exponentially in $\Omega^{(-)}$ as Im(ω) → −∞. We refer to these as exponential divergences of order b in the respective half-planes. The following is now proved:

PROPOSITION 4.1. Relation (4.1) is true if and only if $G'_F$ has only essential singularities at infinity with exponential divergences in $\Omega^{(+)}$ of order d and perhaps others of lower order.

Proof. By transforming the integration variable in (4.2), we obtain

$$G'_F(\omega) = e^{-i\omega d} g(\omega), \qquad g(\omega) = \int_{-d}^0 G'(s + d) e^{-i\omega s}\, ds. \tag{4.4}$$

It follows from its definition that g(ω) is analytic on $\Omega^{(+)}$ and (4.4)₁ gives that the only singularities of $G'_F$ in $\Omega^{(+)}$ are exponential divergences at +∞ due to its factor $e^{-i\omega d}$. Since g(ω) may contain factors $e^{i\omega c}$, c > 0, we see that while the dominant divergence is of order d, there may be others of order d − c, 0 < c < d. Note also that

$$g(\omega) = e^{i\omega d} G'_F(\omega) \tag{4.5}$$

has only exponential divergences in $\Omega^{(-)}$ of order d and possibly others of lower order. To prove the converse, we combine (4.3) and (4.4) to obtain

$$G'(s) = \frac{1}{2\pi} \int_{-\infty}^{\infty} g(\omega) e^{i\omega(s - d)}\, d\omega. \tag{4.6}$$

If s > d, the contour can be closed in $\Omega^{(+)}$ with the contribution from the infinite portion exponentially attenuated. The analyticity of g(ω) in $\Omega^{(+)}$ ensures that the result is zero. □

REMARK 4.1. The first part of Proposition 4.1 can be seen by the following, perhaps more intuitive argument. From the fact that $G'_F$, given by (4.2), is defined by an integral of finite range, we see that it is an entire function on the complex frequency plane; and therefore, if not a polynomial, must have an essential singularity at infinity [18]. The dominant singular behaviour can be deduced without difficulty from (4.2). It is worth noting that the properties of Fourier transforms of functions that are nonzero only on finite regions are of interest also in signal processing applications [19].
An immediate consequence of Proposition 4.1 is that the assumption that $G'_F$ is analytic at infinity, which was made in earlier work, must now be dropped. We have

$$\lim_{\omega \to \infty} \omega G'_F(\omega) = -i G'(0) \left[ 1 - \lim_{\omega \to \infty} e^{-i\omega d} \right], \tag{4.7}$$

where the limit is taken along the real axis or any axis parallel to the real axis. This is of course finite but not well-defined. Also

$$\lim_{\operatorname{Im}(\omega) \to -\infty} \omega G'_F(\omega) = -i G'(0). \tag{4.8}$$

Furthermore,

$$\lim_{\operatorname{Im}(\omega) \to \infty} \omega g(\omega) = i G'(d^-). \tag{4.9}$$

The quantities of central interest are H, given by (2.14), and its factors $H_\pm$ defined by (2.16). The function H has exponential divergences as Im(ω) approaches infinity in both $\Omega^{(+)}$ and $\Omega^{(-)}$; the factor $H_+$ has exponential divergences only in $\Omega^{(+)}$ and $H_-$ only in $\Omega^{(-)}$. Along the real axis or any axis parallel to it,

$$\lim_{\omega \to \infty} H(\omega) = -G'(0) \left[ 1 - \lim_{\omega \to \infty} \cos \omega d \right] \tag{4.10}$$

by virtue of (4.7). This limit does not exist. There is however no resultant infinity. We have

$$\lim_{\operatorname{Im}(\omega) \to \mp\infty} H_\pm(\omega) = h_\infty = \{-G'(0)\}^{1/2}. \tag{4.11}$$
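As a concrete illustration of (4.7), (4.8) and (4.10), consider the simplest step-function memory, $G'(s) = -c$ for $0 \le s \le d$ and zero otherwise, with c > 0. This particular kernel and the function names below are assumptions made for the sketch; the examples worked through in Section 6 may be parametrized differently. For this kernel, (4.2) gives $G'_F(\omega) = -c\,(1 - e^{-i\omega d})/(i\omega)$, so that $H(\omega) = c\,(1 - \cos \omega d)$, which is non-negative, oscillates without a limit along the real axis, and vanishes at $\omega = 2\pi n/d$.

```python
import numpy as np


def gprime_F_numeric(omega, c=1.0, d=1.0, n=20001):
    """Trapezoidal approximation of (4.2) for the assumed kernel G'(s) = -c on [0, d]."""
    s = np.linspace(0.0, d, n)
    integrand = -c * np.exp(-1j * omega * s)
    ds = s[1] - s[0]
    return np.sum(0.5 * (integrand[:-1] + integrand[1:])) * ds


def gprime_F_closed(omega, c=1.0, d=1.0):
    """Closed form of (4.2) for the same step-function kernel."""
    return -c * (1.0 - np.exp(-1j * omega * d)) / (1j * omega)


def H(omega, c=1.0, d=1.0):
    """H(omega) = -omega * G'_s(omega) for real omega, with G'_F = G'_c - i G'_s as in (2.11)."""
    G_s = -np.imag(gprime_F_closed(omega, c, d))
    return -omega * G_s


if __name__ == "__main__":
    for w in (1.0, np.pi, 2 * np.pi, 7.5):
        err = abs(gprime_F_numeric(w) - gprime_F_closed(w))
        print(f"omega = {w:.4f}  H = {H(w):.6f}  quadrature error = {err:.1e}")
    # Far down the lower half-plane, omega * G'_F approaches -i G'(0) = i c, cf. (4.8).
    w = -50j
    print(w * gprime_F_closed(w))
```

The zeros of H at nonzero real frequencies in this sketch are an instance of the situation noted after (2.12), where the strict inequality $G'_s(\omega) < 0$ fails for finite memory examples.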
The method of factorization given in [1] is not useful in this case. Special techniques must be employed, as outlined in Section 6.

5. An Expression for the Minimum Free Energy

Let ψ(t) be a free energy functional for the system under consideration. Then the Clausius–Duhem inequality, adapted to the isothermal case:

$$T(t) \dot E(t) - \dot\psi(t) = D(t) \ge 0 \tag{5.1}$$

requires, by virtue of standard arguments [20, 21], that

$$T(t) = \frac{\partial \psi(t)}{\partial E(t)} \tag{5.2}$$

provided that ψ(t) has certain differentiability properties. The quantity D(t) is the internal dissipation function. Denoting by $\psi_s(t)$ the free energy for static histories equal to E(t) for all past times, then

$$\psi_s(t) = \phi(t) = \frac{1}{2} G_\infty E^2(t) \tag{5.3}$$
or the elastic stored energy. The second law requires that for all histories [20, 21] ψ(t) ψs (t),
(5.4)
where equality occurs by definition for static histories. It follows that $\psi(t)$ is nonnegative. We take (5.1)–(5.4) to be the defining properties of a free energy [9].

The derivation of the form of the minimum free energy as presented in [1], and other relevant formulae, will be sketched here, with some changes and simplifications, in order to clarify that the essential singularities at infinity in $H_\pm$ do not invalidate the argument. Let us consider the work function, which is also the maximum free energy if the state is defined as $(E(t), E^t)$ [22]:
\[ W(t) = \psi_M(t) = \int_{-\infty}^{t} T(s)\dot E(s)\,ds = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty} H(\omega)\,|E^t_{r+}(\omega)|^2\,d\omega. \tag{5.5} \]
The second relation is derived in [15, 1]. We wish to establish an expression for the minimum free energy at a specified state, which is given by the maximum recoverable work from that state [12, 9]. It can be shown that this quantity is a function of the minimal state [3]. It will be assumed that the strain tends to zero at large times [2], though the eventual optimal continuation will not have this property. The limit of $W(u)$ as $u \to +\infty$ gives [1]
\[ \psi^t = \psi^t_M(\infty) = \int_{-\infty}^{\infty} T(s)\dot E(s)\,ds = \frac{1}{2\pi}\int_{-\infty}^{\infty} H(\omega)\,|E^t_{+}(\omega) + E^t_{-}(\omega)|^2\,d\omega = \frac{1}{2\pi}\int_{-\infty}^{\infty} H(\omega)\,|E^t_{r+}(\omega) + E^t_{r-}(\omega)|^2\,d\omega. \tag{5.6} \]
The last form follows from the previous one on using (2.20) and (2.21) and the fact that $H(\omega)$ vanishes quadratically at the origin. It has been allowed that $\psi^t_M(\infty)$ may depend on $t$, in anticipation of the fact that the eventual optimal continuation will be dependent on the current time.

We assume that $E^t_{r+}(\omega)$ is given and wish to find the choice of $E^t_{r-}(\omega)$ which maximizes the recoverable work
\[ W_r = -\int_{t}^{\infty} T(t')\dot E(t')\,dt' = \int_{-\infty}^{t} T(t')\dot E(t')\,dt' - \int_{-\infty}^{\infty} T(t')\dot E(t')\,dt' = \psi_M(t) - \psi^t. \tag{5.7} \]
Now, $\psi_M(t)$ is given, so we seek the choice of $E^t_{r-}(\omega)$ which minimizes $\psi^t$. Let $E^t_m(\omega)$ (not the same notation as in [1]) be that choice. It will be assumed that it (and other continuations) may have essential singularities at infinity, similar to those of $G_F$. Thus, we will use the term analytic/convergent with respect to it. If we replace it by $E^t_m(\omega) + k(\omega)$, where $k(\omega)$ is arbitrary apart from the facts that $\bar k(\omega) = k(-\omega)$, that it vanishes at least as strongly as $\omega^{-1}$ at large frequencies in $\Omega^+$, and that it is analytic/convergent in $\Omega^+$ (any choice of $E^t_{r-}$ with possible essential singularities similar to those of $G_F$ must have these properties), the resulting integral must not be smaller. It is easy to show that this is assured if
\[ \int_{-\infty}^{\infty} H(\omega)\,\mathrm{Re}\bigl\{ k(-\omega)\bigl(E^t_{r+}(\omega) + E^t_m(\omega)\bigr) \bigr\}\,d\omega = 0. \tag{5.8} \]
Let us impose the equivalent condition that
\[ \int_{-\infty}^{\infty} H(\omega)k(-\omega)\bigl[E^t_{r+}(\omega) + E^t_m(\omega)\bigr]\,d\omega = \int_{-\infty}^{\infty} H_+(\omega)k(-\omega)\bigl[H_-(\omega)E^t_{r+}(\omega) + H_-(\omega)E^t_m(\omega)\bigr]\,d\omega = 0. \tag{5.9} \]
Note that $E^t_{r+}$ is analytic and $H_+(\omega)$, $k(-\omega)$ are analytic/convergent in $\Omega^-$, while $E^t_m(\omega)$ and $H_-(\omega)$ are analytic/convergent in $\Omega^+$. Let
\[ P^t(\omega) = H_-(\omega)E^t_{r+}(\omega) = p^t_-(\omega) - p^t_+(\omega), \tag{5.10} \]
where [23]
\[ p^t(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{P^t(\omega')}{\omega' - z}\,d\omega' \tag{5.11} \]
and $p^t_-(\omega)$ is the limit of $p^t(z)$ on the real axis from above. It is analytic in $\Omega^{(+)}$. Also, $p^t_+(\omega)$ is the limit from below and is analytic in $\Omega^{(-)}$. This inverted notational convention is adopted to retain conformity with other notation introduced earlier. The function $P^t(\omega)$ is analytic on the real axis. Closing the contour on $\Omega^{(+)}$ (where $H_-(\omega)$ is analytic/convergent), we pick up the singularities of $E^t_{r+}$, none of which are on $\mathbb{R}$ or at infinity. Thus, we see that $p^t_\pm$ are analytic on $\mathbb{R}$. We have $\bar p^t_\pm(\omega) = p^t_\pm(-\omega)$, $\omega \in \mathbb{R}$. The definitions of these functions may be extended to $\Omega$ by analytic continuation, as discussed before (2.9).

The product $H_+(\omega)k(-\omega)p^t_+(\omega)$ vanishes as $\omega^{-2}$ for large $\omega$ in $\Omega^{(-)}$, since $k(-\omega)$ and $p^t_+(\omega)$ each vanish as $\omega^{-1}$. Therefore the integral of this function over the real axis can be extended to an infinite contour on $\Omega^{(-)}$ without altering its value. The result is zero because of the convergence of the integrand on $\Omega^{(-)}$. Therefore (5.9) becomes
\[ \int_{-\infty}^{\infty} H_+(\omega)k(-\omega)\bigl[p^t_-(\omega) + H_-(\omega)E^t_m(\omega)\bigr]\,d\omega = 0. \tag{5.12} \]
This will be true for arbitrary $k(-\omega)$ only if the expression in brackets is a function that is analytic or analytic/convergent in $\Omega^{(-)}$, vanishing at infinity. However, $E^t_m(\omega)$ must be analytic/convergent in $\Omega^+$. Remembering that $p^t_-(\omega)$ is analytic and $H_-(\omega)$ analytic/convergent in $\Omega^+$, we see that the expression in brackets must be analytic/convergent in both the upper and lower half-planes and on the real axis. Thus, it is analytic/convergent over the entire complex plane and thus analytic over the finite part. Now $p^t_-(\omega)$ vanishes as $\omega^{-1}$ at infinity (at least in $\Omega^+$), as also must $E^t_m(\omega)$ if the strain function is to be finite at $s = 0$. Therefore, the function is analytic everywhere, zero at infinity and so must vanish everywhere by Liouville's theorem. Thus
\[ E^t_m(\omega) = -\frac{p^t_-(\omega)}{H_-(\omega)} = -\frac{1}{2\pi i}\,\frac{1}{H_-(\omega)}\int_{-\infty}^{\infty} \frac{H_-(\omega')E^t_{r+}(\omega')}{\omega' - \omega^+}\,d\omega'. \tag{5.13} \]
Observe that the pole at the origin due to $H_-(\omega)$ in the denominator must be shifted to $\Omega^{(-)}$, i.e., $[H_-(\omega)]^{-1}$ behaves as $(\omega^+)^{-1}$ near the origin. Substituting (5.13) into (5.6) and using (5.10), we see that the minimum value of $\psi^t$ is given by
\[ \psi^t_m = \frac{1}{2\pi}\int_{-\infty}^{\infty} |p^t_+(\omega)|^2\,d\omega. \tag{5.14} \]
Also, from (5.5)
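\[ \psi_M(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty} |p^t_+(\omega) - p^t_-(\omega)|^2\,d\omega = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty} \bigl[\,|p^t_+(\omega)|^2 + |p^t_-(\omega)|^2\,\bigr]\,d\omega \tag{5.15} \]
since the cross-term
\[ p^t_-(-\omega)p^t_+(\omega) + p^t_-(\omega)p^t_+(-\omega) \tag{5.16} \]
consists of terms that are analytic in $\Omega^{(-)}$ and $\Omega^{(+)}$ respectively and which vanish as $\omega^{-2}$ on the infinite boundary of these domains. By closing the contour on the half-plane over which a given term is analytic, one obtains zero.

The minimum free energy $\psi_m(t)$ is equal to the quantity $W_r$ evaluated for $\psi^t = \psi^t_m$, giving
\[ \psi_m(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty} |p^t_-(\omega)|^2\,d\omega \le \psi_M(t). \tag{5.17} \]
We can write (5.17) in the form
\[ \psi_m(t) = \phi(t) + \frac{1}{2\pi}\int_{-\infty}^{\infty} H(\omega)\,|E^t_m(\omega)|^2\,d\omega. \tag{5.18} \]
Relation (4.1) can be shown to be obeyed by $\psi_m(t)$, using the relation
\[ \frac{\partial p^t_-(\omega)}{\partial E(t)} = -\frac{H_-(\omega)}{i\omega}, \tag{5.19} \]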
together with equation (2.24)$_2$ and carrying out certain contour integrals on the appropriate half-planes. Also, relations (5.3) and (5.4) follow from (5.18), on observing [1] that, for a static history, $E^t_m(\omega)$ vanishes. Lastly, we must show that
\[ D_m(t) = T(t)\dot E(t) - \dot\psi_m(t) \tag{5.20} \]
is non-negative. From (5.7)$_3$ and (5.14) we see that
\[ D_m(t) = \frac{d}{dt}\psi^t_m = \frac{1}{2\pi}\frac{d}{dt}\int_{-\infty}^{\infty} |p^t_+(\omega)|^2\,d\omega. \tag{5.21} \]
One finds that
\[ D_m(t) = |K(t)|^2, \tag{5.22} \]
where $K(t)$ is a real number given by
\[ K(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} H_-(\omega)E^t_{r+}(\omega)\,d\omega \tag{5.23} \]
on using the first of the relationships
\[ \frac{d}{dt}p^t_+(\omega) = -i\omega\,p^t_+(\omega) - K(t), \qquad \frac{d}{dt}p^t_-(\omega) = -i\omega\,p^t_-(\omega) - K(t) - \frac{H_-(\omega)\dot E(t)}{i\omega}, \tag{5.24} \]
which are derived by using (2.23) in (5.11). Also required are the relationships
\[ \lim_{|\omega|\to\infty} \omega\,p^t_\pm(\omega) = iK(t), \qquad \frac{1}{2\pi}\int_{-\infty}^{\infty} p^t_\pm(-\omega)\,d\omega = \mp\frac{1}{2}K(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} p^t_\pm(\omega)\,d\omega. \tag{5.25} \]
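The step from (5.21) to (5.22) can be sketched as follows. This is only an outline, assuming that the time derivative may be taken under the integral sign and that the integrals of $p^t_+$ over the real axis are understood in the sense in which (5.25) holds; it uses nothing beyond (5.24)$_1$, (5.25)$_2$, the reality of $K(t)$ and the property $\bar p^t_+(\omega) = p^t_+(-\omega)$ on the real axis.

\[ \frac{d}{dt}|p^t_+(\omega)|^2 = \bigl(-i\omega\,p^t_+ - K\bigr)\bar p^t_+ + p^t_+\bigl(i\omega\,\bar p^t_+ - K\bigr) = -K(t)\bigl[p^t_+(\omega) + p^t_+(-\omega)\bigr], \]
\[ D_m(t) = -K(t)\,\frac{1}{2\pi}\int_{-\infty}^{\infty}\bigl[p^t_+(\omega) + p^t_+(-\omega)\bigr]\,d\omega = -K(t)\Bigl[-\tfrac{1}{2}K(t) - \tfrac{1}{2}K(t)\Bigr] = K^2(t) \ge 0. \]

The terms proportional to $i\omega$ cancel pointwise, so only the two integrals given by (5.25)$_2$ with the upper sign remain.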
By steps similar to those in [1], it can also be shown with the aid of (5.13) and (5.25) that the optimal relative continuation $E^t_m(s)$ does not tend to zero as $s \to 0$. Relation (4.11) is required for this demonstration. Also, $E^t_m(s) + E(t)$ does not tend to zero as $s \to \infty$.

We see from (2.8) and (2.9) that if
\[ Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty} P^t(\omega)e^{i\omega s}\,d\omega, \tag{5.26} \]
where $P^t$ is defined by (5.10), then
\[ p^t_+(\omega) = -\int_{0}^{\infty} Y^t(s)e^{-i\omega s}\,ds, \qquad p^t_-(\omega) = \int_{-\infty}^{0} Y^t(s)e^{-i\omega s}\,ds. \tag{5.27} \]
From (2.14) and (4.4) we have that, for $\omega \in \mathbb{R}$,
\[ H(\omega) = \frac{\omega}{2i}\bigl[g(\omega)e^{-i\omega d} - \bar g(\omega)e^{i\omega d}\bigr] \tag{5.28} \]
so that
\[ H(\omega) \sim \frac{\omega}{2i}\,g(\omega)e^{-i\omega d} \quad (\mathrm{Im}(\omega)\to+\infty), \qquad H(\omega) \sim -\frac{\omega}{2i}\,\bar g(\omega)e^{i\omega d} \quad (\mathrm{Im}(\omega)\to-\infty), \tag{5.29} \]
where $\bar g$ indicates the complex conjugate of the function but not the argument. It follows from (4.9) that the dominant behaviour of the factors is
\[ H_+(\omega) \sim A e^{-i\omega d} \quad (\mathrm{Im}(\omega)\to+\infty), \qquad H_-(\omega) \sim \bar A e^{i\omega d} \quad (\mathrm{Im}(\omega)\to-\infty), \tag{5.30} \]
where $A$ is a constant. One can deduce from (5.30) that $Y^t(s)$, given by (5.26), vanishes for $s < -d$ by closing the contour on $\Omega^{(-)}$, so that (5.27)$_2$ becomes
\[ p^t_-(\omega) = \int_{-d}^{0} Y^t(s)e^{-i\omega s}\,ds. \tag{5.31} \]
Finally, in this section, we prove the following result.

PROPOSITION 5.1. For a material with finite memory of duration $d$, the minimum free energy $\psi_m$ depends only on that part of the history for which $G'$ is nonzero, i.e., $E^t_r(s)$, $0 \le s \le d$; while $\psi_M$ may depend on the entire history of strain.

Proof. We have from (5.26) that
\[ Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty} H_-(\omega)\int_{0}^{\infty} E^t_r(u)e^{i\omega(s-u)}\,du\,d\omega. \tag{5.32} \]
It follows from (5.30)$_2$ that
\[ \int_{-\infty}^{\infty} H_-(\omega)e^{i\omega(s-u)}\,d\omega = 0 \quad \forall\, s + d < u \tag{5.33} \]
so that
\[ Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty} H_-(\omega)\int_{0}^{s+d} E^t_r(u)e^{i\omega(s-u)}\,du\,d\omega. \tag{5.34} \]
It is clear now that $p^t_-(\omega)$, given by (5.31), depends only on $E^t_r(s)$, $0 \le s \le d$. However, $p^t_+(\omega)$, given by (5.27)$_1$, may depend on the entire history of strain. The result follows from (5.15) and (5.17). □
A consequence of Proposition 5.1 is that a time domain representation of the minimum free energy, given in [24], reduces to the form
\[ \psi_m(t) = \phi(t) + \frac{1}{2}\int_{0}^{d}\!\!\int_{0}^{d} E^t_r(s_1)\,\frac{\partial^2}{\partial s_1\,\partial s_2}G(s_1, s_2)\,E^t_r(s_2)\,ds_1\,ds_2 \tag{5.35} \]
rather than this expression with infinite integrations as in the general case. An expression for $G(s_1, s_2)$ can be given in terms of time domain representations of the factors $H_\pm$ [24].

REMARK 5.1. It is interesting to consider Proposition 5.1 against a more general background. From (2.5), we see that, for a finite memory material, the condition that two viscoelastic states are in the same minimal state depends only on the values of the histories in the time interval $[0, d]$. In particular, the quantity $p^t_-$ is a function of the minimal state [2]. A function of the minimal state can depend only on the history in the interval $[0, d]$, since values outside of this interval can be varied arbitrarily without altering the minimal state. Thus $p^t_-$ and $\psi_m$ must have this property, as shown by Proposition 5.1.
6. Factorization of H(ω)

Let us now address the problem of factorization of $H(\omega)$ as given by (2.16) in the finite memory case, for specified forms of the relaxation function. Consider the ansatz
\[ H_+(\omega) = e^{-i\omega d/2}\{H(\omega)\}^{1/2}, \qquad H_-(\omega) = e^{i\omega d/2}\{H(\omega)\}^{1/2}, \qquad \omega \in \mathbb{R}, \tag{6.1} \]
where $\{H(\omega)\}^{1/2}$ is assumed to be analytic at all finite points in $\Omega$. We note that $H$ vanishes at $\omega = 0$, where it has a quadratic zero that does not produce a branch point. It is assumed that any other zero of $H$ in $\Omega$ is of even power type. The quantity $H_+$ has an exponential divergence of order $d$ as $\mathrm{Im}(\omega) \to +\infty$ and is analytic/convergent in $\Omega^{(-)}$. Similarly, $H_-$ has an exponential divergence of order $d$ as $\mathrm{Im}(\omega) \to -\infty$ and is analytic/convergent in $\Omega^{(+)}$. These are consistent with (5.30).

Let us now look at specific cases. Consider first the choice
\[ G'(t) = \begin{cases} -K_0, & 0 \le t < d, \\ 0, & t \ge d, \end{cases} \tag{6.2} \]
where
\[ K_0 = \frac{G_0 - G_\infty}{d} > 0. \tag{6.3} \]
Then
\[ G_F(\omega) = \frac{iK_0}{\omega}\bigl(1 - e^{-i\omega d}\bigr), \qquad H(\omega) = K_0(1 - \cos\omega d) = 2K_0\sin^2\frac{\omega d}{2}, \tag{6.4} \]
which has zeros at $\omega d = 2n\pi$ for all integer values of $n$ and is thus not positive definite for nonzero $\omega$. Also
\[ \{H(\omega)\}^{1/2} = \sqrt{2K_0}\,\sin\frac{\omega d}{2} \tag{6.5} \]
so that, from (6.1),
\[ H_+(\omega) = \sqrt{\frac{K_0}{2}}\bigl(1 - e^{-i\omega d}\bigr), \qquad H_-(\omega) = \sqrt{\frac{K_0}{2}}\bigl(1 - e^{i\omega d}\bigr), \tag{6.6} \]
where a factor $i$ has been omitted, to obtain agreement with (2.18). Beyond this example, relation (6.1) would seem to have limited applicability. Consider next the case
\[ G'(t) = \begin{cases} -K_0, & 0 \le t < d/2, \\ -K_1, & d/2 \le t < d, \\ 0, & t \ge d, \end{cases} \qquad 0 < K_1 < K_0. \tag{6.7} \]
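The single-step factorization (6.4)–(6.6) can be checked numerically. The following is a minimal sketch, not part of the original paper: the values of `K0` and `d` and the frequency grid are arbitrary sample choices, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical sample values (not taken from the paper).
K0, d = 1.5, 2.0


def H(w):
    """H(omega) for the single-step kernel, first form in (6.4)."""
    return K0 * (1.0 - np.cos(w * d))


def H_plus(w):
    """Factor H_+ of (6.6)."""
    return np.sqrt(K0 / 2.0) * (1.0 - np.exp(-1j * w * d))


def H_minus(w):
    """Factor H_- of (6.6)."""
    return np.sqrt(K0 / 2.0) * (1.0 - np.exp(1j * w * d))


omega = np.linspace(-20.0, 20.0, 4001)

# The product H_+ H_- should reproduce H on the real axis.
product_error = np.max(np.abs(H_plus(omega) * H_minus(omega) - H(omega)))

# The two expressions for H given in (6.4) should agree.
half_angle_error = np.max(
    np.abs(H(omega) - 2.0 * K0 * np.sin(omega * d / 2.0) ** 2)
)

# H should vanish wherever omega * d = 2 * n * pi.
zeros = 2.0 * np.pi * np.arange(-3, 4) / d
max_H_at_zeros = np.max(np.abs(H(zeros)))

print(product_error, half_angle_error, max_H_at_zeros)
```

All three printed quantities should be at the level of rounding error, which is consistent with $H_+H_- = H$ on the real axis and with the zeros of $H$ at $\omega d = 2n\pi$.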
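We have that
\[ \frac{K_0 + K_1}{2} = \frac{G_0 - G_\infty}{d} \tag{6.8} \]
and
\[ G_F(\omega) = \frac{i}{\omega}\bigl[K_0 - (K_0 - K_1)e^{-i\omega d/2} - K_1 e^{-i\omega d}\bigr]. \tag{6.9} \]
It follows that
\[ H(\omega) = K_0 - (K_0 - K_1)\cos\frac{\omega d}{2} - K_1\cos(\omega d) \ge 0 \quad \forall\,\omega \in \mathbb{R}, \tag{6.10} \]
which again vanishes for discrete nonzero values of $\omega$. We look for factors of the form
\[ H_+(\omega) = b_0 + b_1 e^{-i\omega d/2} + b_2 e^{-i\omega d}, \qquad H_-(\omega) = b_0 + b_1 e^{i\omega d/2} + b_2 e^{i\omega d}, \tag{6.11} \]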
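where $b_0$, $b_1$, $b_2$ are real. Using (2.16) and comparing coefficients gives the conditions
\[ b_0^2 + b_1^2 + b_2^2 = K_0, \qquad b_1(b_0 + b_2) = \frac{K_1 - K_0}{2}, \qquad b_0 b_2 = -\frac{K_1}{2}, \tag{6.12} \]
with solution
\[ b_1 = \epsilon_1\sqrt{\frac{K_0 - K_1}{2}}, \quad \epsilon_1 = \pm 1; \qquad b_0 = \frac{1}{2}\Bigl[-b_1 + \epsilon_2\sqrt{b_1^2 + 2K_1}\Bigr], \quad b_2 = \frac{1}{2}\Bigl[-b_1 - \epsilon_2\sqrt{b_1^2 + 2K_1}\Bigr], \quad \epsilon_2 = \pm 1. \tag{6.13} \]
Observe that $b_0 + b_1 + b_2 = 0$, as required for $H_\pm$ to have zeros at $\omega = 0$. If we choose
\[ K_0 = 2K_1 \tag{6.14} \]
then these reduce to
\[ b_1 = \epsilon_1\frac{\sqrt{K_0}}{2}, \qquad b_0 = \frac{\sqrt{K_0}}{4}\bigl(-\epsilon_1 + \epsilon_2\sqrt{5}\bigr), \qquad b_2 = \frac{\sqrt{K_0}}{4}\bigl(-\epsilon_1 - \epsilon_2\sqrt{5}\bigr). \tag{6.15} \]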
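As a check on (6.12)–(6.15), the coefficients can be evaluated numerically and the product $H_+H_-$ compared with (6.10). The sketch below is illustrative only: the sample values (chosen so that $K_0 = 2K_1$ as in (6.14)), the frequency grid and the helper names `two_step_coefficients` and `residuals_6_12` are assumptions made for this example.

```python
import numpy as np


def two_step_coefficients(K0, K1, eps1=1, eps2=1):
    """Coefficients (b0, b1, b2) from (6.13) for the sign choices eps1, eps2."""
    b1 = eps1 * np.sqrt((K0 - K1) / 2.0)
    root = np.sqrt(b1 ** 2 + 2.0 * K1)
    b0 = 0.5 * (-b1 + eps2 * root)
    b2 = 0.5 * (-b1 - eps2 * root)
    return b0, b1, b2


def residuals_6_12(K0, K1, b0, b1, b2):
    """Left-hand sides minus right-hand sides of the three conditions (6.12)."""
    return np.array([
        b0 ** 2 + b1 ** 2 + b2 ** 2 - K0,
        b1 * (b0 + b2) - (K1 - K0) / 2.0,
        b0 * b2 + K1 / 2.0,
    ])


# Hypothetical sample values with K0 = 2 * K1, as in (6.14).
K0, K1, d = 2.0, 1.0, 1.0
omega = np.linspace(-30.0, 30.0, 6001)
H = K0 - (K0 - K1) * np.cos(omega * d / 2.0) - K1 * np.cos(omega * d)  # (6.10)

max_residual = 0.0
max_factor_error = 0.0
max_sum = 0.0
for eps2 in (1, -1):
    b0, b1, b2 = two_step_coefficients(K0, K1, eps1=1, eps2=eps2)
    res = residuals_6_12(K0, K1, b0, b1, b2)
    max_residual = max(max_residual, float(np.max(np.abs(res))))
    H_plus = b0 + b1 * np.exp(-1j * omega * d / 2.0) + b2 * np.exp(-1j * omega * d)
    H_minus = np.conj(H_plus)  # (6.11): real coefficients, real omega
    max_factor_error = max(
        max_factor_error, float(np.max(np.abs(H_plus * H_minus - H)))
    )
    max_sum = max(max_sum, abs(b0 + b1 + b2))

# Closed form (6.15) for eps1 = eps2 = 1.
coeffs = np.array(two_step_coefficients(K0, K1, eps1=1, eps2=1))
expected_615 = np.sqrt(K0) * np.array(
    [(-1.0 + np.sqrt(5.0)) / 4.0, 0.5, (-1.0 - np.sqrt(5.0)) / 4.0]
)

print(max_residual, max_factor_error, max_sum)
```

Both choices of `eps2` satisfy (6.12) and reproduce $H$, which illustrates that the conditions alone do not single out one factorization.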
Since $H_\pm$ are arbitrary up to a constant real phase factor, we choose $\epsilon_1 = 1$. However, there remain two possible solutions, corresponding to an interchange of $b_0$ and $b_2$. A choice will be made between these two possibilities below. This is the lack of uniqueness in the factorization referred to after (2.18).

We finish this section with a few observations on the case of more general step-function forms of $G'$. For $n$ steps of equal time gaps, (6.11) and (6.12) are replaced by
\[ H_\pm(\omega) = \sum_{r=0}^{n} b_r e^{\mp i r\omega d/n} \tag{6.16} \]
and
\[ \sum_{l=0}^{n-r} b_l b_{l+r} = a_r, \quad r = 0, 1, \ldots, n; \qquad a_0 = K_0; \quad a_r = \frac{K_r - K_{r-1}}{2}, \quad r = 1, 2, \ldots, n-1; \quad a_n = -\frac{K_{n-1}}{2}, \tag{6.17} \]
where the quantities $K_r$ are obvious generalizations of $K_0$ and $K_1$ in (6.7). In the case of three steps, we have four equations for $b_0$, $b_1$, $b_2$, $b_3$. Adding the two middle relations, one obtains equations of the same form as (6.12) for $b_0$, $b_1 + b_2$ and $b_3$, but in terms of $b_1 b_2$, which is unknown. Subtracting the two middle equations and squaring, one obtains a quadratic equation for $b_1 b_2$ with real roots. Thus there are two solutions for each solution of (6.12). This special procedure becomes more cumbersome for a larger number of steps.

In the general case, the simplest systematic procedure would seem to be to start with the last relation of (6.17)$_1$ ($r = n$), solving for $b_n$, $b_{n-1}$, ... in terms of $b_0$ and $b_1$. This procedure is complicated by the double occurrence of variables beyond a certain point. The penultimate equation for $b_1$ in terms of $b_0$ is a polynomial equation of degree $n - 1$. The final substitution into the first equation ($r = 0$ in (6.17)$_1$), which can be replaced by $\sum_{l=0}^{n} b_l = 0$, yields in general a nonpolynomial equation for $b_0$.

7. Explicit Forms of the Minimum Free Energy

For the relaxation function derivative given by (6.2) we have from (2.1) that the stress has the form
\[ T(t) = G_\infty E(t) - K_0\int_{0}^{d} E^t_r(s)\,ds = G_\infty E(t) - K_0\int_{t-d}^{t} \bigl(E(u) - E(t)\bigr)\,du \tag{7.1} \]
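and the work function given by (5.5) becomes, after some algebra,
\[ W(t) = \psi_M(t) = \phi(t) + \frac{K_0}{2}\int_{0}^{d} [E^t_r(s)]^2\,ds + \frac{K_0}{2}\int_{0}^{\infty} [E^t_r(s+d) - E^t_r(s)]^2\,ds. \tag{7.2} \]
Similarly, for $G'$ given by (6.7),
\[ T(t) = G_\infty E(t) - K_0\int_{0}^{d/2} E^t_r(s)\,ds - K_1\int_{d/2}^{d} E^t_r(s)\,ds = G_\infty E(t) - K_0\int_{t-d/2}^{t} \bigl(E(u) - E(t)\bigr)\,du - K_1\int_{t-d}^{t-d/2} \bigl(E(u) - E(t)\bigr)\,du \tag{7.3} \]
and the work function has the form
\[ \begin{aligned} W(t) = \psi_M(t) &= \phi(t) + \frac{K_0}{2}\int_{0}^{d/2} [E^t_r(s)]^2\,ds + \frac{K_1}{2}\int_{d/2}^{d} [E^t_r(s)]^2\,ds \\ &\quad + \frac{K_0 - K_1}{2}\int_{0}^{\infty} \Bigl[E^t_r(s) - E^t_r\Bigl(s + \frac{d}{2}\Bigr)\Bigr]^2 ds + \frac{K_1}{2}\int_{0}^{\infty} [E^t_r(s) - E^t_r(s+d)]^2\,ds \\ &= \phi(t) + K_0\int_{0}^{\infty} [E^t_r(s)]^2\,ds + (K_1 - K_0)\int_{0}^{\infty} E^t_r(s)E^t_r\Bigl(s + \frac{d}{2}\Bigr)\,ds - K_1\int_{0}^{\infty} E^t_r(s)E^t_r(s+d)\,ds. \end{aligned} \tag{7.4} \]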
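The equality of the two forms of the work function in (7.4) can be confirmed numerically for a sample history. The sketch below is not from the paper: the relative history $E^t_r(s) = s e^{-s}$, the parameter values, the truncation of the infinite integrals at `S` and the hand-written `trapezoid` helper are all assumptions chosen for illustration. Both expressions are evaluated without the common term $\phi(t)$.

```python
import numpy as np


def trapezoid(y, h):
    """Composite trapezoidal rule for samples y on a uniform grid of spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))


# Hypothetical sample values (not taken from the paper).
K0, K1, d = 2.0, 1.0, 1.0
h = 1.0e-3                        # grid spacing
S = 40.0                          # truncation point for the infinite integrals
n_half = int(round(d / 2.0 / h))  # grid index of s = d/2
n_full = int(round(d / h))        # grid index of s = d
N = int(round(S / h))             # grid index of s = S

s = h * np.arange(N + n_full + 1)
E = s * np.exp(-s)                # assumed relative history E_r^t(s), with E_r^t(0) = 0

E0 = E[: N + 1]                          # E_r^t(s)
E_half = E[n_half : n_half + N + 1]      # E_r^t(s + d/2)
E_full = E[n_full : n_full + N + 1]      # E_r^t(s + d)

# First form of (7.4), without phi(t).
first_form = (
    K0 / 2.0 * trapezoid(E[: n_half + 1] ** 2, h)
    + K1 / 2.0 * trapezoid(E[n_half : n_full + 1] ** 2, h)
    + (K0 - K1) / 2.0 * trapezoid((E0 - E_half) ** 2, h)
    + K1 / 2.0 * trapezoid((E0 - E_full) ** 2, h)
)

# Second form of (7.4), without phi(t).
second_form = (
    K0 * trapezoid(E0 ** 2, h)
    + (K1 - K0) * trapezoid(E0 * E_half, h)
    - K1 * trapezoid(E0 * E_full, h)
)

# Closed-form value of the second form for this particular history,
# using int_0^inf s (s + a) exp(-2 s) ds = (1 + a) / 4.
exact = K0 / 4.0 + (K1 - K0) * np.exp(-0.5) * 1.5 / 4.0 - K1 * np.exp(-1.0) * 2.0 / 4.0

print(first_form, second_form, exact)
```

For this history the two forms agree to rounding error, and both match the closed-form value to within the discretization error of the trapezoidal rule.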
We now write down the form of the minimum free energy and associated quantities for the explicit factorizations considered in Section 6. Consider first (6.2), where the factors are given by (6.6). The quantity $Y^t$, defined by (5.26), has the form
\[ Y^t(s) = \frac{1}{2\pi}\sqrt{\frac{K_0}{2}}\int_{-\infty}^{\infty} \bigl(1 - e^{i\omega d}\bigr)E^t_{r+}(\omega)e^{i\omega s}\,d\omega = \sqrt{\frac{K_0}{2}}\,\bigl[E^t_c(s) - E^t_c(s+d)\bigr], \tag{7.5} \]
where $E^t_c$ is the function $E^t_r$ with $E^t_c(s) = 0$, $s < 0$, so that
\[ p^t_-(\omega) = -\sqrt{\frac{K_0}{2}}\int_{-d}^{0} E^t_c(s+d)e^{-i\omega s}\,ds = -\sqrt{\frac{K_0}{2}}\,e^{i\omega d}\int_{0}^{d} E^t_c(u)e^{-i\omega u}\,du \tag{7.6} \]
and
\[ p^t_+(\omega) = -\sqrt{\frac{K_0}{2}}\int_{0}^{\infty} \bigl[E^t_c(s) - E^t_c(s+d)\bigr]e^{-i\omega s}\,ds. \tag{7.7} \]
From Plancherel's theorem and (5.17) we see that
\[ \psi_m(t) = \phi(t) + \frac{K_0}{2}\int_{0}^{d} [E^t_c(s)]^2\,ds. \tag{7.8} \]
Also, (5.15) gives an expression in agreement with (7.2). For the second case, (6.7), we have
\[ Y^t(s) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \bigl(b_0 + b_1 e^{i\omega d/2} + b_2 e^{i\omega d}\bigr)E^t_{r+}(\omega)e^{i\omega s}\,d\omega = b_0 E^t_c(s) + b_1 E^t_c\Bigl(s + \frac{d}{2}\Bigr) + b_2 E^t_c(s+d) \tag{7.9} \]
so that
\[ p^t_-(\omega) = b_1\int_{-d/2}^{0} E^t_c\Bigl(s + \frac{d}{2}\Bigr)e^{-i\omega s}\,ds + b_2\int_{-d}^{0} E^t_c(s+d)e^{-i\omega s}\,ds \tag{7.10} \]
and
\[ p^t_+(\omega) = -\int_{0}^{\infty} \Bigl[b_0 E^t_c(s) + b_1 E^t_c\Bigl(s + \frac{d}{2}\Bigr) + b_2 E^t_c(s+d)\Bigr]e^{-i\omega s}\,ds. \tag{7.11} \]
Thus
\[ \psi_m(t) = \phi(t) + b_1^2\int_{0}^{d/2} [E^t_c(s)]^2\,ds + b_2^2\int_{0}^{d} [E^t_c(s)]^2\,ds + 2b_1 b_2\int_{0}^{d/2} E^t_c(s)E^t_c\Bigl(s + \frac{d}{2}\Bigr)\,ds \tag{7.12} \]
after changing variables in the integrations. Formula (7.4) can also be reproduced from (5.15) with the aid of the conditions (6.12). It is easier to use (5.15)$_1$ for this purpose. The vanishing of the cross-terms is readily checked directly from (7.10) and (7.11). Relation (7.12) can be written in the form
\[ \psi_m(t) = \phi(t) + a_1\int_{0}^{d/2} [E^t_c(s)]^2\,ds + a_2\int_{d/2}^{d} [E^t_c(s)]^2\,ds - b_1 b_2\int_{0}^{d/2} \Bigl[E^t_c(s) - E^t_c\Bigl(s + \frac{d}{2}\Bigr)\Bigr]^2 ds, \]
\[ a_1 = b_0^2 - b_1 b_2 = \frac{K_0}{2}, \qquad a_2 = b_2^2 + b_1 b_2 = \frac{K_1}{2}, \tag{7.13} \]
where (6.12) has been used. We see therefore that the largest choice of $b_2$ minimizes $\psi_m$ and thus choose $\epsilon_2 = -1$ in (6.13). The results of this section are consistent with Proposition 5.1.
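The equivalence of (7.12) and (7.13), and the effect of the sign choice $\epsilon_2$, can be illustrated numerically. The following sketch is hypothetical: the history $E^t_c(s) = s e^{-s}$ on $[0, d]$, the parameter values and the helper names are assumptions made for this example, and the term $\phi(t)$ is omitted from both expressions.

```python
import numpy as np


def trapezoid(y, h):
    """Composite trapezoidal rule for samples y on a uniform grid of spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))


# Hypothetical sample values (not taken from the paper).
K0, K1, d = 2.0, 1.0, 1.0
h = 1.0e-3
n_half = int(round(d / 2.0 / h))   # grid index of s = d/2
n_full = int(round(d / h))         # grid index of s = d

s = h * np.arange(n_full + 1)
E = s * np.exp(-s)                 # assumed history E_c^t(s) on [0, d]
E_first = E[: n_half + 1]          # E_c^t(s),       0 <= s <= d/2
E_second = E[n_half:]              # E_c^t(s + d/2), 0 <= s <= d/2


def coefficients(eps2):
    """Coefficients (b0, b1, b2) of (6.13) with eps1 = 1."""
    b1 = np.sqrt((K0 - K1) / 2.0)
    root = np.sqrt(b1 ** 2 + 2.0 * K1)
    return 0.5 * (-b1 + eps2 * root), b1, 0.5 * (-b1 - eps2 * root)


def psi_m_minus_phi(eps2):
    """Return psi_m - phi evaluated from (7.12) and from (7.13)."""
    b0, b1, b2 = coefficients(eps2)
    I_first = trapezoid(E_first ** 2, h)              # integral over [0, d/2]
    I_second = trapezoid(E_second ** 2, h)            # integral over [d/2, d]
    I_cross = trapezoid(E_first * E_second, h)
    I_diff = trapezoid((E_first - E_second) ** 2, h)
    form_712 = b1 ** 2 * I_first + b2 ** 2 * (I_first + I_second) + 2.0 * b1 * b2 * I_cross
    form_713 = K0 / 2.0 * I_first + K1 / 2.0 * I_second - b1 * b2 * I_diff
    return form_712, form_713


minus_712, minus_713 = psi_m_minus_phi(-1)   # eps2 = -1
plus_712, plus_713 = psi_m_minus_phi(+1)     # eps2 = +1

print(minus_712, minus_713, plus_712, plus_713)
```

With $\epsilon_1 = 1$, the choice $\epsilon_2 = -1$ gives the larger $b_2$ and the smaller value of $\psi_m - \phi$ for this history, in line with the selection made in the text.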
For the general case of $n$ steps, where the coefficients $b_l$, $l = 1, 2, \ldots$, obey (6.17), the generalization of (7.10)–(7.12) is straightforward.

Finally, we consider the question of minimal states. For $G'$ given by (6.2) or (6.7), the conditions (2.5) reduce to
\[ E^t_1(s) = E^t_2(s), \qquad 0 \le s \le d, \tag{7.14} \]
which will be obeyed by a large class of histories. Strictly, this relation need apply only almost everywhere in $0 < s \le d$.

References

1. J.M. Golden, Free energies in the frequency domain: the scalar case. Quart. Appl. Math. 58 (2000) 127–150.
2. L. Deseri, G. Gentili and J.M. Golden, An explicit formula for the minimum free energy in linear viscoelasticity. J. Elasticity 54 (1999) 141–185.
3. M. Fabrizio and J.M. Golden, Maximum and minimum free energies for a linear viscoelastic material. Quart. Appl. Math. 60 (2002) 341–381.
4. W. Noll, A new mathematical theory of simple materials. Arch. Rational Mech. Anal. 48 (1972) 1–50.
5. M. Fabrizio, G. Gentili and J.M. Golden, The minimum free energy for a class of compressible viscoelastic fluids. Adv. in Differential Equations 7 (2002) 319–342.
6. G. Del Piero and L. Deseri, On the concepts of state and free energy in linear viscoelasticity. Arch. Rational Mech. Anal. 138 (1997) 1–35.
7. G. Del Piero and L. Deseri, On the analytic expression of the free energy in linear viscoelasticity. J. Elasticity 43 (1996) 247–278.
8. D. Graffi and M. Fabrizio, Sulla nozione di stato materiali viscoelastici di tipo 'rate'. Atti Accad. Naz. Lincei 83 (1990) 201–208.
9. M. Fabrizio and A. Morro, Mathematical Problems in Linear Viscoelasticity. SIAM, Philadelphia, PA (1992).
10. J.M. Golden and G.A.C. Graham, Boundary Value Problems in Linear Viscoelasticity. Springer, Berlin (1988).
11. M.E. Gurtin and I. Herrera, On dissipation inequalities and linear viscoelasticity. Quart. Appl. Math. 23 (1965) 235–245.
12. W.A. Day, Thermodynamics based on a work axiom. Arch. Rational Mech. Anal. 31 (1968) 1–34.
13. E.C. Titchmarsh, Introduction to the Theory of Fourier Integrals. Clarendon Press, Oxford (1937).
14. I.N. Sneddon, The Use of Integral Transforms. McGraw-Hill, New York (1972).
15. M. Fabrizio, C. Giorgi and A. Morro, Free energies and dissipation properties for systems with memory. Arch. Rational Mech. Anal. 125 (1994) 341–373.
16. M. Fabrizio, Existence and uniqueness results for viscoelastic materials. In: G.A.C. Graham and J.R. Walton (eds), Crack and Contact Problems for Viscoelastic Bodies. Springer, Vienna (1995).
17. M. Fabrizio and S. Polidoro, On the exponential decay for differential systems with memory. Applicable Analysis 81 (2002) 1245–1266.
18. E.T. Whittaker and G.N. Watson, A Course of Modern Analysis. Cambridge Univ. Press, Cambridge (1963).
19. J. Ramanathan, Methods of Applied Fourier Analysis. Birkhäuser, Boston (1998).
20. B.D. Coleman, Thermodynamics of materials with memory. Arch. Rational Mech. Anal. 17 (1964) 1–46.
21. B.D. Coleman and V.J. Mizel, A general theory of dissipation in materials with memory. Arch. Rational Mech. Anal. 27 (1967) 255–274.
22. A. Morro and M. Vianello, Minimal and maximal free energy for materials with memory. Boll. Un. Mat. Ital. 4A (1990) 45–55.
23. N.I. Muskhelishvili, Singular Integral Equations. Noordhoff, Groningen (1953).
24. L. Deseri, G. Gentili and J.M. Golden, Free energies and Saint-Venant's principle in linear viscoelasticity, submitted for publication.
About Clapeyron's Theorem in Linear Elasticity

ROGER FOSDICK and LEV TRUSKINOVSKY
Department of Aerospace Engineering and Mechanics, University of Minnesota, Minneapolis, MN 55455, U.S.A.

Received: 6 August 2002; in revised form: 17 March 2003

Abstract. We examine some elementary interpretations of the classical theorem of Clapeyron in linear elasticity theory. As we show, a straightforward application of this theorem in the purely mechanical setting leads to an apparent paradox which can be resolved by referring either to dynamics or to thermodynamics. These richer theories play an essential part in understanding the physical significance of this theorem.

Mathematics Subject Classifications (2000): 74A15, 74B05, 80A17

Key words: elasticity, Clapeyron, dissipation, viscoelasticity, thermoelasticity.

In remembrance of Clifford Truesdell and his scientific program of enlightenment.
1. Introduction

According to Love [11, p. 173], "The potential energy of deformation of a body, which is in equilibrium under given load, is equal to half the work done by the external forces, acting through the displacements from the unstressed state to the state of equilibrium." This is now commonly known as Clapeyron's theorem in linear elasticity theory. In particular, this theorem, taken literally, implies that the elastic stored energy accounts for only half of the energy spent to load the body; the remaining half of the work done to the body by the external forces is unaccounted for and is lost somewhere in achieving the equilibrium state. It is particularly striking that this apparent paradox is reached within the framework of the purely conservative linear theory of elasticity. Alternatively, however, within elastostatics the common characterization of the work done to reach equilibrium is conceptually ambiguous and a different interpretation may be required.

[Footnote] The National Science Foundation Grant No. DMS-0102841 is gratefully acknowledged for their support of this research.

[Footnote] In 1852, Lamé [9] published his volume, Leçons sur la théorie mathématique de l'élasticité des corps solides, in which he devoted his seventh lecture to what he termed Clapeyron's Theorem. (See [13, pp. 565 and 578], for relevant remarks.) Earlier, Lamé and Clapeyron [10] had noted this result in a joint memoir of 1833. Although Emile Clapeyron [3], himself, first published on this theorem in 1858, in a résumé of an original memoir that apparently was never published, it is argued by Todhunter and Pearson [14, p. 419], that the "result of the memoir of 1833 was due entirely to Clapeyron, for Lamé in his Leçons, of 1852, . . . terms it Clapeyron's Theorem, and Clapeyron here speaks of it as he would do only if it were entirely due to himself."

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 399–426. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

To illustrate the above concerns, let us first recall that in the linear theory of elasticity the total strain energy of a body that occupies the region $\Omega \subset \mathbb{R}^3$ and supports a, generally, dynamical displacement field $\mathbf{u} = \mathbf{u}(\mathbf{x}, t)$ and strain field $\mathbf{e} \equiv (\nabla\mathbf{u} + (\nabla\mathbf{u})^T)/2 = \mathbf{e}(\mathbf{x}, t)$ relative to its undistorted state at time $t = 0$ is defined by
\[ U[\mathbf{e}](t) \equiv \frac{1}{2}\int_{\Omega} \rho\,\mathsf{C}[\mathbf{e}]\cdot\mathbf{e}\,dv. \tag{1.1} \]
Here, $\rho$ is the mass density of the body and $\mathsf{C}$ is the positive definite and completely symmetric elasticity tensor. Further, the work done during the interval of time $(0, t)$ due to an applied boundary traction field $\mathbf{t}^* = \mathbf{t}^*(\mathbf{x}, t)$ and body force field $\mathbf{b}^* = \mathbf{b}^*(\mathbf{x}, t)$ over the displacement $\mathbf{u}(\mathbf{x}, t)$ is given by
\[ W[\mathbf{u}](t) = \int_{0}^{t}\Bigl(\int_{\partial\Omega} \mathbf{t}^*\cdot\dot{\mathbf{u}}\,da + \int_{\Omega} \mathbf{b}^*\cdot\dot{\mathbf{u}}\,dv\Bigr)\,dt. \tag{1.2} \]
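The corresponding stress field in $\Omega$ at time $t$, $\mathbf{T} = \mathbf{T}(\mathbf{x}, t)$, satisfies the generalized Hooke's law $\mathbf{T} = \rho\,\mathsf{C}[\mathbf{e}]$ and is symmetric. Throughout this paper we shall assume, for convenience, that the body is homogeneous, so that $\rho$ and $\mathsf{C}$ are constant.

If the loads $\mathbf{t}^*$ and $\mathbf{b}^*$ are 'dead', i.e., independent of time, so that $\mathbf{t}^* = \bar{\mathbf{t}}(\mathbf{x})$ and $\mathbf{b}^* = \bar{\mathbf{b}}(\mathbf{x})$, then for a body that is undistorted at time $t = 0$, (1.2) may be integrated to yield
\[ W[\mathbf{u}](t)\big|_{(\mathbf{t}^*, \mathbf{b}^*) = (\bar{\mathbf{t}}, \bar{\mathbf{b}})} = \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\mathbf{u}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\mathbf{u}\,dv \equiv \mathcal{W}[\mathbf{u}](t). \tag{1.3} \]
This 'dead load work' represents the "work done by the external forces" to which Love referred in his quote concerning equilibrium reproduced in the first line of this introduction, above. Of course, in this case the loads are equilibrated so that
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\,da + \int_{\Omega} \bar{\mathbf{b}}\,dv = \mathbf{0}, \qquad \int_{\partial\Omega} \mathbf{x}\times\bar{\mathbf{t}}\,da + \int_{\Omega} \mathbf{x}\times\bar{\mathbf{b}}\,dv = \mathbf{0} \tag{1.4} \]
and $\mathbf{u}$ is an equilibrium displacement field, say $\bar{\mathbf{u}}(\mathbf{x})$; the corresponding 'dead load work' is then
\[ \mathcal{W}[\bar{\mathbf{u}}] \equiv \mathcal{W}[\mathbf{u}](t)\big|_{\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})}. \tag{1.5} \]
Suppose that the displacement field $\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})$ corresponds to an equilibrium state with strain $\bar{\mathbf{e}}(\mathbf{x})$ and stress $\bar{\mathbf{T}}(\mathbf{x})$ satisfying $\bar{\mathbf{T}} = \rho\,\mathsf{C}[\bar{\mathbf{e}}]$ and
\[ \mathrm{div}\,\bar{\mathbf{T}} + \bar{\mathbf{b}} = \mathbf{0} \ \text{ in } \Omega, \qquad \bar{\mathbf{T}}\mathbf{n} = \bar{\mathbf{t}} \ \text{ on } \partial\Omega, \tag{1.6} \]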
where $\mathbf{n}$ is the outer unit normal to $\Omega$ on $\partial\Omega$. Without loss of generality, we may eliminate the possibility of an added infinitesimal rigid displacement field in $\bar{\mathbf{u}}(\mathbf{x})$ and render $\bar{\mathbf{u}}(\mathbf{x})$ unique by imposing the normalization conditions
\[ \int_{\Omega} \rho\,\bar{\mathbf{u}}\,dv = \mathbf{0}, \qquad \int_{\Omega} \rho\,\mathbf{x}\times\bar{\mathbf{u}}\,dv = \mathbf{0}, \tag{1.7} \]
where the mass density $\rho$ is included only for later convenience. Then, according to the usual derivation of Clapeyron's theorem, we see, starting with (1.3) and (1.5) and using (1.6), generalized Hooke's law, the symmetry of $\bar{\mathbf{T}}$ and the divergence theorem, that
\[ \frac{1}{2}\mathcal{W}[\bar{\mathbf{u}}] = \frac{1}{2}\Bigl(\int_{\partial\Omega} \bar{\mathbf{T}}\mathbf{n}\cdot\bar{\mathbf{u}}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\bar{\mathbf{u}}\,dv\Bigr) = U[\bar{\mathbf{e}}]. \tag{1.8} \]
Literally following Love's statement of Clapeyron's theorem, one may infer that elastostatics alone accounts for only half of the work that is expended to reach equilibrium; the coefficient one-half is a result of the linearity of the theory. In the remainder of this paper, we continue within the linear framework and consider, respectively, in Sections 2, 3 and 4, the richer dynamical theories of elasticity, viscoelasticity and thermoelasticity in order to shed light on this seemingly paradoxical and incomplete conclusion.

SYNOPSIS
In Section 2, we argue that within ideal elasticity theory the quantity $\mathcal{W}[\bar{\mathbf{u}}]$ of Clapeyron's theorem does not reasonably represent the work done by the external forces to reach an elastostatic equilibrium state $\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})$. We then investigate 'fast' versus 'slow' time dependent loading conditions and conclude that within the assumptions of elastostatics $\mathcal{W}[\bar{\mathbf{u}}]/2$ is a better representative of the work expended to reach equilibrium. In Sections 3 and 4, we amend ideal elasticity theory so as to include the mechanisms of viscous and thermal dissipation, respectively. Then dead loading becomes compatible with the notion of achieving equilibrium and we conclude that the quantity $\mathcal{W}[\bar{\mathbf{u}}]$ of Clapeyron's theorem does adequately represent the corresponding work done by the external applied forces. Here we find that half of $\mathcal{W}[\bar{\mathbf{u}}]$ becomes stored in the body in the form of equilibrium strain energy and the remaining half is dissipated either through the action of viscous dissipation or heat transfer. In Section 5, we offer some conclusions.

In elastostatics, there is, of course, no time dependence and formally the work done by the loads $\bar{\mathbf{t}}(\mathbf{x})$ and $\bar{\mathbf{b}}(\mathbf{x})$ to reach the equilibrium displacement $\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})$ from an undistorted state commonly is calculated by using (1.3) and (1.5), as was done in (1.8). As noted earlier, for purely equilibrium theory this may not properly represent the 'work done to reach equilibrium' because this tacitly assumes that the loads are 'dead' and applied over time and, therefore, impulsive. For an ideal elastic body this circumstance is not compatible with the notion of reaching equilibrium, as we shall see in Section 2 and related Appendices A and B.
In Appendices A and B, we show some example calculations to further illustrate the claims of Section 2. It should be noted that throughout the main body of this paper we assume, for convenience, that the traction field is specified on the complete boundary of $\Omega$. However, in the elementary examples of these appendices we prefer to hold one part of the boundary fixed and specify the traction on the complementary part for all time. While these boundary conditions clearly are not consistent with (1.4)$_1$ and (2.4)$_1$, nevertheless they are normal and allowable; moreover, they do not compromise the main purpose of illustrating the difference between dead and retarded loading.

2. Elastodynamics

Here, we shall first consider the consequences of 'dead' loading within elastodynamics regarding work and energy and then show how equilibrium theory is best accounted for by introducing a retarded system of loads.

2.1. 'DEAD' LOADING

Suppose that for all time $t > 0$ the body is 'dead' loaded with the same loads as in the static situation described above, so that $\mathbf{t}^* = \bar{\mathbf{t}}(\mathbf{x})$ and $\mathbf{b}^* = \bar{\mathbf{b}}(\mathbf{x})$ in (1.2). On the boundary of $\Omega$ we set
\[ \mathbf{T}\mathbf{n} = \bar{\mathbf{t}} \ \text{ on } \partial\Omega, \quad \forall t > 0, \tag{2.1} \]
and initially the body is at rest and undistorted so that
\[ \mathbf{u}(\mathbf{x}, 0) = \dot{\mathbf{u}}(\mathbf{x}, 0) = \mathbf{0} \ \text{ in } \Omega. \tag{2.2} \]
The dynamical equation is
\[ \mathrm{div}\,\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} \ \text{ in } \Omega, \quad \forall t > 0, \tag{2.3} \]
and we recall that $\bar{\mathbf{t}} = \bar{\mathbf{t}}(\mathbf{x})$ and $\bar{\mathbf{b}} = \bar{\mathbf{b}}(\mathbf{x})$ are supposed to be balanced in the sense of (1.4). Of course, $\mathbf{u}$, $\mathbf{e}$ and $\mathbf{T}$ are related through the strain-displacement and stress–strain equations of Section 1. Under these conditions, it readily follows, from (2.1), (2.3) and the symmetry of $\mathbf{T}$, that the linear and angular momentum are conserved. Thus, by integration in time and use of (2.2), it is clear that the resulting motion naturally satisfies the normalization
\[ \int_{\Omega} \rho\,\mathbf{u}\,dv = \mathbf{0}, \qquad \int_{\Omega} \rho\,\mathbf{x}\times\mathbf{u}\,dv = \mathbf{0} \quad \forall t \ge 0. \tag{2.4} \]
In addition, by forming the inner product of (2.3) with $\dot{\mathbf{u}}$, integrating over $\Omega$ and using the symmetry of $\mathsf{C}$ together with (2.1) and (1.1), we readily reach the classical power theorem
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\,dv = \frac{d}{dt}U[\mathbf{e}](t) + \frac{d}{dt}K[\dot{\mathbf{u}}](t), \tag{2.5} \]
where
\[ K[\dot{\mathbf{u}}](t) \equiv \frac{1}{2}\int_{\Omega} \rho\,|\dot{\mathbf{u}}|^2\,dv \tag{2.6} \]
is the kinetic energy of the body. Then, by integrating (2.5) in time and using (2.2) and (1.3) we obtain the standard balance of mechanical energy
\[ \mathcal{W}[\mathbf{u}](t) = U[\mathbf{e}](t) + K[\dot{\mathbf{u}}](t), \tag{2.7} \]
which is supposed to be valid for all time $t \ge 0$.

Now, as a first elementary observation, let us assume that $\Omega$ may, at some time $t = \bar t$ during the motion, instantaneously support the equilibrium displacement field in the sense that $\mathbf{u}(\mathbf{x}, \bar t) = \bar{\mathbf{u}}(\mathbf{x})$; it may be that $\dot{\mathbf{u}}(\cdot, \bar t) \ne \mathbf{0}$ so $\Omega$ is not coincidently at rest. Be that as it may, nevertheless, (2.4) will be met at $t = \bar t$ because of (1.7) and, in addition, because $\mathcal{W}[\mathbf{u}](\bar t) = \mathcal{W}[\bar{\mathbf{u}}]$ and $U[\mathbf{e}](\bar t) = U[\bar{\mathbf{e}}]$, we see from (2.7) that
\[ \mathcal{W}[\bar{\mathbf{u}}] = U[\bar{\mathbf{e}}] + K[\dot{\mathbf{u}}](\bar t). \tag{2.8} \]
Then, recalling (1.8), we may conclude that half the work done during the time interval $(0, \bar t)$ is stored in the body as strain energy and the remaining half satisfies
\[ \frac{1}{2}\mathcal{W}[\bar{\mathbf{u}}] = K[\dot{\mathbf{u}}](\bar t); \tag{2.9} \]
it has been spent to produce the instantaneous kinetic energy of the body. Accordingly, under the present circumstances, it is this kinetic energy that must be spontaneously extracted from $\Omega$ if the body is to be arrested in the equilibrium state $\mathbf{u}(\mathbf{x}, \bar t) = \bar{\mathbf{u}}(\mathbf{x})$. However, there is no mechanism in this conservative ideal elastic system to do so!

Let us now consider an alternative description of how the work may be channeled into strain energy and kinetic energy based upon time-averaging of the corresponding energies. The assumption here is that there is a time $t = t^* > 0$, perhaps one among many, at which the body instantaneously is at rest, i.e., $\dot{\mathbf{u}}(\mathbf{x}, t^*) \equiv \mathbf{0}$ in $\Omega$.

[Footnote] According to [11, p. 123] (see also [13, p. 537, art. 988]), in 1839 Poncelet [12] was the first to note that "a load suddenly applied may cause a strain twice as great as that produced by a gradual application of the same load." While this observation of Poncelet, which also contains an interesting factor of 2, appeared contemporaneously with the original and later announcements of Clapeyron's theorem, there appears to have been no recognition of a possible relationship between the claims of either authors.

[Footnote] While, generally, there may not be such a time, in the case of periodic motion there is a countable set of such times; a specific one-dimensional example is discussed later in Appendix A. Notice, though, that according to (2.4)$_1$ the average of $\dot{\mathbf{u}}(\mathbf{x}, t)$ over $\Omega$ is always zero.

To describe the average motion, we introduce the time-average displacement field as
\[ \langle\mathbf{u}\rangle(\mathbf{x}) \equiv \frac{1}{t^*}\int_{0}^{t^*} \mathbf{u}(\mathbf{x}, t)\,dt, \tag{2.10} \]
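with the time-average strain $\langle\mathbf{e}\rangle(\mathbf{x})$ and stress $\langle\mathbf{T}\rangle(\mathbf{x})$ fields defined analogously. Then, it readily follows that
\[ \langle\mathbf{e}\rangle = \frac{1}{2}\bigl(\nabla\langle\mathbf{u}\rangle + (\nabla\langle\mathbf{u}\rangle)^T\bigr), \qquad \langle\mathbf{T}\rangle = \rho\,\mathsf{C}[\langle\mathbf{e}\rangle]. \tag{2.11} \]
Moreover, because of (2.2), by time-averaging (2.3) and (2.1) we find
\[ \mathrm{div}\,\langle\mathbf{T}\rangle + \bar{\mathbf{b}} = \mathbf{0} \ \text{ in } \Omega, \qquad \langle\mathbf{T}\rangle\mathbf{n} = \bar{\mathbf{t}} \ \text{ on } \partial\Omega. \tag{2.12} \]
Also, note that the time-average displacement field $\langle\mathbf{u}\rangle(\mathbf{x})$ satisfies the normalization (2.4) and recall that the loads $\bar{\mathbf{t}} = \bar{\mathbf{t}}(\mathbf{x})$ and $\bar{\mathbf{b}} = \bar{\mathbf{b}}(\mathbf{x})$ are balanced in the sense of (1.4). Thus, because of uniqueness and the fact that $\langle\mathbf{u}\rangle(\mathbf{x})$ and $\bar{\mathbf{u}}(\mathbf{x})$ solve the same equilibrium boundary-value problem, we may conclude that
\[ \langle\mathbf{u}\rangle(\mathbf{x}) = \bar{\mathbf{u}}(\mathbf{x}), \qquad \langle\mathbf{e}\rangle(\mathbf{x}) = \bar{\mathbf{e}}(\mathbf{x}), \qquad \langle\mathbf{T}\rangle(\mathbf{x}) = \bar{\mathbf{T}}(\mathbf{x}) \ \text{ in } \Omega. \tag{2.13} \]
Now, by time-averaging (2.7), using (1.3), recalling the notation established in (2.10) and applying (2.13), we easily have
\[ \langle\mathcal{W}[\mathbf{u}]\rangle = \mathcal{W}[\langle\mathbf{u}\rangle] = \mathcal{W}[\bar{\mathbf{u}}] = \langle U[\mathbf{e}]\rangle + \langle K[\dot{\mathbf{u}}]\rangle. \tag{2.14} \]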
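In particular, the average of the 'dead load work', $\langle\mathcal{W}[\mathbf{u}]\rangle$, is equal to the quantity $\mathcal{W}[\bar{\mathbf{u}}]$ of Clapeyron's theorem in (1.8), and our immediate aim is to determine how this average work expended is divided up between the average strain energy $\langle U[\mathbf{e}]\rangle$ and the average kinetic energy $\langle K[\dot{\mathbf{u}}]\rangle$ of the body. To do so, we first introduce the difference displacement field
\[ \mathbf{u}'(\mathbf{x}, t) \equiv \mathbf{u}(\mathbf{x}, t) - \bar{\mathbf{u}}(\mathbf{x}), \tag{2.15} \]
with $\mathbf{e}'(\mathbf{x}, t)$ and $\mathbf{T}'(\mathbf{x}, t)$ defined analogously, and observe, using (1.1), the symmetry of $\mathsf{C}$ and (1.8), that
\[ \begin{aligned} U[\mathbf{e}](t) &= \frac{1}{2}\int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}} + \mathbf{e}']\cdot(\bar{\mathbf{e}} + \mathbf{e}')\,dv \\ &= U[\bar{\mathbf{e}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\mathbf{e}'\,dv + U[\mathbf{e}'](t) \\ &= \frac{1}{2}\mathcal{W}[\bar{\mathbf{u}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\mathbf{e}'\,dv + U[\mathbf{e}'](t). \end{aligned} \]
Then, by time-averaging we have
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}\mathcal{W}[\bar{\mathbf{u}}] + \int_{\Omega} \rho\,\mathsf{C}[\bar{\mathbf{e}}]\cdot\langle\mathbf{e}'\rangle\,dv + \langle U[\mathbf{e}']\rangle. \]
However, because $\mathbf{e}'(\mathbf{x}, t) = \mathbf{e}(\mathbf{x}, t) - \bar{\mathbf{e}}(\mathbf{x})$ we see from (2.13) that $\langle\mathbf{e}'\rangle(\mathbf{x}) = \langle\mathbf{e}\rangle(\mathbf{x}) - \bar{\mathbf{e}}(\mathbf{x}) = \mathbf{0}$ and, consequently, we reach
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}\mathcal{W}[\bar{\mathbf{u}}] + \langle U[\mathbf{e}']\rangle. \tag{2.16} \]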
ABOUT CLAPEYRON’S THEOREM
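Now, to determine $\langle U[\mathbf{e}']\rangle$, it is convenient to observe, using (2.1)–(2.3), (2.15) and the relationships $\mathbf{T}' = \rho\,\mathbf{C}[\mathbf{e}']$ and $\mathbf{e}' = (\nabla\mathbf{u}' + (\nabla\mathbf{u}')^{T})/2$, that
\[
\begin{aligned}
&\operatorname{div}\mathbf{T}' = \rho\,\ddot{\mathbf{u}}' && \text{in } \Omega,\ \forall t > 0, \\
&\mathbf{T}'\mathbf{n} = \mathbf{0} && \text{on } \partial\Omega,\ \forall t > 0; \\
&\mathbf{u}'(\mathbf{x},0) = -\bar{\mathbf{u}}(\mathbf{x}), \quad \dot{\mathbf{u}}'(\mathbf{x},0) = \mathbf{0} && \text{in } \Omega.
\end{aligned}
\tag{2.17}
\]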
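Then, with the definition (1.1), the symmetry of $\mathbf{C}$, (2.17) and the aid of the divergence theorem we find that
\[
\begin{aligned}
U[\mathbf{e}'](t) &= \frac{1}{2}\int_{\Omega} \mathbf{T}'\cdot\nabla\mathbf{u}'\,dv \\
&= \frac{1}{2}\int_{\partial\Omega} \mathbf{T}'\mathbf{n}\cdot\mathbf{u}'\,da - \frac{1}{2}\int_{\Omega} \rho\,\ddot{\mathbf{u}}'\cdot\mathbf{u}'\,dv \\
&= -\frac{1}{2}\int_{\Omega} \rho\,\ddot{\mathbf{u}}'\cdot\mathbf{u}'\,dv \\
&= -\frac{1}{2}\int_{\Omega} \Bigl(\rho\,\frac{d}{dt}(\dot{\mathbf{u}}'\cdot\mathbf{u}') - \rho\,|\dot{\mathbf{u}}|^{2}\Bigr)\,dv,
\end{aligned}
\]
the last equation of which uses the fact that (2.15) implies $\dot{\mathbf{u}}'(\mathbf{x},t) = \dot{\mathbf{u}}(\mathbf{x},t)$. Now, by time-averaging, recalling that $\dot{\mathbf{u}}(\mathbf{x},0) = \dot{\mathbf{u}}(\mathbf{x},t^{*}) = \mathbf{0}$ and using (2.6), we obtain
\[ \langle U[\mathbf{e}']\rangle = \langle K[\dot{\mathbf{u}}]\rangle, \tag{2.18} \]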
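which is a well-known result concerning the equipartition between kinetic and potential energies. Thus, by substituting (2.18) into (2.16) and then using (2.14) we conclude that
\[ \langle U[\mathbf{e}']\rangle = \langle K[\dot{\mathbf{u}}]\rangle = \frac{1}{4}W[\bar{\mathbf{u}}] \tag{2.19} \]
and, again using (2.16), we see that
\[ \langle U[\mathbf{e}]\rangle = \frac{1}{2}W[\bar{\mathbf{u}}] + \frac{1}{4}W[\bar{\mathbf{u}}] = \frac{3}{4}W[\bar{\mathbf{u}}]. \tag{2.20} \]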
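The split in (2.19) and (2.20) can be checked on the simplest periodic analogue. The sketch below is not the continuum problem: it assumes a single mass-spring oscillator, m x'' + k x = F with x(0) = x'(0) = 0, and averages the work, strain energy and kinetic energy over one period. The function name and the parameter values are hypothetical.

```python
import math

def period_averages(k=3.0, m=2.0, force=5.0, samples=20000):
    """Averages of W, U, K over one period, each divided by W_bar = force**2 / k.

    Assumed analogue: m*x'' + k*x = force, x(0) = x'(0) = 0, whose solution is
    x(t) = (force/k) * (1 - cos(omega*t)).
    """
    omega = math.sqrt(k / m)
    period = 2.0 * math.pi / omega
    amp = force / k                  # equilibrium displacement
    w_bar = force * amp              # counterpart of W[u_bar]
    w_sum = u_sum = k_sum = 0.0
    for i in range(samples):
        t = (i + 0.5) * period / samples          # midpoint rule in time
        x = amp * (1.0 - math.cos(omega * t))
        v = amp * omega * math.sin(omega * t)
        w_sum += force * x                         # dead-load work up to time t
        u_sum += 0.5 * k * x * x                   # strain energy
        k_sum += 0.5 * m * v * v                   # kinetic energy
    return w_sum / samples / w_bar, u_sum / samples / w_bar, k_sum / samples / w_bar

w_avg, u_avg, k_avg = period_averages()
print(w_avg, u_avg, k_avg)
```

Under these assumptions the three ratios should come out close to 1, 3/4 and 1/4, independently of the chosen stiffness, mass and load.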
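To show that this result is independent of the assumption of periodicity, let us introduce the complete set of orthonormal eigenfunctions and eigenvalues, $\{\bar{\mathbf{u}}_i(\mathbf{x}), \omega_i,\ i = 1, 2, \ldots\}$, which satisfy (1.7) and
\[ \operatorname{div}(\mathbf{C}[\nabla\bar{\mathbf{u}}_i]) + \rho\,\omega_i^{2}\,\bar{\mathbf{u}}_i = \mathbf{0} \ \text{ in } \Omega, \qquad (\mathbf{C}[\nabla\bar{\mathbf{u}}_i])\mathbf{n} = \mathbf{0} \ \text{ on } \partial\Omega, \]
and expand the solution $\mathbf{u}(\mathbf{x},t)$ of (2.1)–(2.3) in the form
\[ \mathbf{u}(\mathbf{x},t) = \sum_{i=1}^{\infty} \bar{\mathbf{u}}_i(\mathbf{x})\,g_i(t). \tag{2.21} \]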
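Then, we readily find that $g_i(t) = A_i(1 - \cos\omega_i t)$, $i = 1, 2, \ldots$, and it is possible to determine the constants $A_i$ as Fourier coefficients so that this series represents a weak solution of (2.1)–(2.3) in the sense that $\nabla\mathbf{u}(\cdot,t) \in L^{2}(\Omega)$ for all $t > 0$. Furthermore, it is straightforward to show that the infinite time-average of the displacement field,
\[ \langle\mathbf{u}\rangle_{\infty}(\mathbf{x}) \equiv \lim_{T\to\infty} \frac{1}{T}\int_{0}^{T} \mathbf{u}(\mathbf{x},t)\,dt, \tag{2.22} \]
satisfies $\langle\mathbf{u}\rangle_{\infty}(\mathbf{x}) = \bar{\mathbf{u}}(\mathbf{x})$ and that conclusions similar to those highlighted in the previous paragraph continue to hold for the relationships between the infinite time-averages of the work, strain energy and kinetic energy, i.e.,
\[ \langle W[\mathbf{u}]\rangle_{\infty} = W[\langle\mathbf{u}\rangle_{\infty}] = W[\bar{\mathbf{u}}] = \langle U[\mathbf{e}]\rangle_{\infty} + \langle K[\dot{\mathbf{u}}]\rangle_{\infty} \tag{2.23a} \]
with
\[ \langle U[\mathbf{e}]\rangle_{\infty} = \frac{3}{4}W[\bar{\mathbf{u}}], \qquad \langle K[\dot{\mathbf{u}}]\rangle_{\infty} = \frac{1}{4}W[\bar{\mathbf{u}}]. \tag{2.23b} \]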
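To see how the infinite time-averages in (2.23) behave when the motion is not periodic, one can take two modes with incommensurate frequencies. The sketch below is an illustration under stated assumptions, not the paper's computation: each mode is taken to respond as $g_i(t) = A_i(1 - \cos\omega_i t)$, modal energies are assumed to add (orthogonality), and the frequencies and modal work shares are hypothetical. The window averages are evaluated in closed form.

```python
import math

def window_averages(modes, T):
    """Average strain and kinetic energy over (0, T), divided by the total W_bar.

    modes: list of (omega_i, w_i), where w_i is the assumed share of the
    equilibrium work W_bar carried by mode i, so that
    U_i(t) = (w_i/2) * (1 - cos(omega_i t))**2 and K_i(t) = (w_i/2) * sin(omega_i t)**2.
    """
    total = sum(w for _, w in modes)
    u_avg = k_avg = 0.0
    for omega, w in modes:
        s1 = math.sin(omega * T) / (omega * T)            # window average of cos(omega t)
        s2 = math.sin(2 * omega * T) / (2 * omega * T)    # window average of cos(2 omega t)
        u_avg += 0.5 * w * (1.5 - 2.0 * s1 + 0.5 * s2)
        k_avg += 0.5 * w * (0.5 - 0.5 * s2)
    return u_avg / total, k_avg / total

# Two incommensurate frequencies: the combined motion never repeats exactly.
modes = [(1.0, 2.0), (math.sqrt(2.0), 1.0)]
results = {T: window_averages(modes, T) for T in (10.0, 1e3, 1e5)}
for T, (u_avg, k_avg) in results.items():
    print(T, u_avg, k_avg)
```

As the window T grows, the two ratios should drift toward 3/4 and 1/4, with a deviation that decays like 1/T.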
Based upon the above analyses, we conclude that when an elastic body is set in motion with a ‘dead’ loading system from an initially undistorted rest state, then, with suitable interpretation, the average work that is supplied to the body by the ‘dead’ loading is equal to the equilibrium work of Clapeyron’s theorem. On the average, three quarters of this work appears as strain energy (half due to the equilibrium strain energy as predicted from Clapeyron’s theorem and a quarter due to the strain energy of the deformation relative to this equilibrium), and the remaining quarter is, on the average, transformed into kinetic energy. To illustrate the general conclusions reached above, we consider, in Appendix A, a specific one-dimensional elastodynamic problem with ‘dead’ loading.

2.2. ‘SLOW’ LOADING

When an ideal elastic body is ‘dead’ loaded from an undistorted, rest state with an otherwise equilibrium system of loads, the loading is impulsively applied. Consequently, from the dynamical considerations of Section 2.1, the body never reaches equilibrium but, rather, rings by constantly redistributing kinetic and strain energy between its elements. Indeed, the work done to the body at any time $t > 0$ due to the external loading is given by (1.3), but the body is never coincidently at rest and in a state of equilibrium. On the other hand, we expect that if an equilibrium system of loads is achieved sufficiently slowly in time then even an ideal elastic body should distort through a sequence of near equilibrium states and eventually reach a nearly static equilibrium configuration. In this case, the work done to the body at any time $t$ due to the external loading may be calculated using (1.2), but the calculation is no longer trivial because now $\mathbf{t}^{*}$ and $\mathbf{b}^{*}$ are not ‘dead’ but rather depend on time. For dissipationless, ideal elastic bodies it is intuitively clear that
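the work expended to reach equilibrium should be related to the latter rather than the former calculation. To gain some general perspective, suppose that the loading system, $\mathbf{t}^{*}$ on $\partial\Omega$ and $\mathbf{b}^{*}$ in $\Omega$, is such that
\[ \mathbf{t}^{*} = \mathbf{t}^{*}(\mathbf{x},t) = \begin{cases} \dfrac{t}{t_{\infty}}\,\bar{\mathbf{t}}(\mathbf{x}), & t \in (0, t_{\infty}), \\[1ex] \bar{\mathbf{t}}(\mathbf{x}), & t \ge t_{\infty}, \end{cases} \tag{2.24} \]
and
\[ \mathbf{b}^{*} = \mathbf{b}^{*}(\mathbf{x},t) = \begin{cases} \dfrac{t}{t_{\infty}}\,\bar{\mathbf{b}}(\mathbf{x}), & t \in (0, t_{\infty}), \\[1ex] \bar{\mathbf{b}}(\mathbf{x}), & t \ge t_{\infty}, \end{cases} \tag{2.25} \]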
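where $t_{\infty}$ is a sufficiently large time constant so that the loads may be considered to be slowly applied. Then, at least for $t \in (0, t_{\infty})$, the displacement field $\mathbf{u} = \mathbf{u}(\mathbf{x},t) = t\,\bar{\mathbf{u}}(\mathbf{x})/t_{\infty}$ and the corresponding strain and stress fields, $\mathbf{e} = \mathbf{e}(\mathbf{x},t) = t\,\bar{\mathbf{e}}(\mathbf{x})/t_{\infty}$ and $\mathbf{T} = \mathbf{T}(\mathbf{x},t) = t\,\bar{\mathbf{T}}(\mathbf{x})/t_{\infty}$, from the strain-displacement and stress–strain relations of Section 1, will satisfy the dynamical equation
\[ \operatorname{div}\mathbf{T} + \mathbf{b}^{*} = \rho\,\ddot{\mathbf{u}} \quad \text{in } \Omega,\ t \in (0, t_{\infty}), \]
together with the boundary condition
\[ \mathbf{T}\mathbf{n} = \mathbf{t}^{*} \quad \text{on } \partial\Omega,\ t \in (0, t_{\infty}) \]
and initial conditions
\[ \mathbf{u}(\mathbf{x},0) = \mathbf{0}, \qquad \dot{\mathbf{u}}(\mathbf{x},0) = \frac{1}{t_{\infty}}\,\bar{\mathbf{u}}(\mathbf{x}) \quad \text{in } \Omega. \]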
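Clearly, for sufficiently large time constant $t_{\infty}$ not only is the applied loading ‘slow’, but the initial state of $\Omega$ is undistorted and ‘nearly’ at rest. Further, at time $t = t_{\infty}$ the body achieves the equilibrium displacement field $\bar{\mathbf{u}}(\mathbf{x})$ with, again, ‘nearly’ zero velocity. Moreover, according to (1.2), the work done to $\Omega$ up to time $t = t_{\infty}$ is
\[
\begin{aligned}
W[\mathbf{u}](t_{\infty}) &= \int_{0}^{t_{\infty}} \Bigl( \int_{\partial\Omega} \frac{t}{t_{\infty}}\,\bar{\mathbf{t}}\cdot\frac{1}{t_{\infty}}\,\bar{\mathbf{u}}\,da + \int_{\Omega} \frac{t}{t_{\infty}}\,\bar{\mathbf{b}}\cdot\frac{1}{t_{\infty}}\,\bar{\mathbf{u}}\,dv \Bigr)\,dt \\
&= \frac{1}{2}\Bigl( \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\bar{\mathbf{u}}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\bar{\mathbf{u}}\,dv \Bigr),
\end{aligned}
\]
so that with (1.8)$_1$ we have
\[ W[\mathbf{u}](t_{\infty}) = \frac{1}{2}W[\bar{\mathbf{u}}]. \]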
In addition, according to (2.6), the kinetic energy of $\Omega$ at any time $t \in [0, t_{\infty})$ is ‘nearly’ zero and equal to its initial value because
\[ K[\dot{\mathbf{u}}](t) = \frac{1}{2t_{\infty}^{2}} \int_{\Omega} \rho\,|\bar{\mathbf{u}}|^{2}\,dv, \quad \forall t \in [0, t_{\infty}). \]
Finally, according to (1.1), the strain energy of $\Omega$ at time $t = t_{\infty}$ is given by $U[\mathbf{e}](t_{\infty}) = U[\bar{\mathbf{e}}]$, and so with (1.8)$_2$ we may conclude that
\[ W[\mathbf{u}](t_{\infty}) = U[\mathbf{e}](t_{\infty}). \]
Thus, for sufficiently large time constant $t_{\infty}$, the body is ‘nearly’ at rest in equilibrium at time $t = t_{\infty}$ and the work done to $\Omega$ to achieve this ‘near’ equilibrium state is half that which is supplied, according to Love’s interpretation of Clapeyron’s theorem; in fact, this reasoning shows that the work done is equal to the strain energy at time $t = t_{\infty}$ and, to within a certain degree of approximation, the paradox of Clapeyron’s theorem may be considered resolved.

Of course, the body is not exactly at rest in equilibrium at time $t = t_{\infty}$ and while it was initially undistorted, it was not initially at rest. For large time constant $t_{\infty}$, according to the given initial conditions there is a small kinetic energy imparted to $\Omega$ at time $t = 0$ and this same kinetic energy must be extracted from $\Omega$ at time $t = t_{\infty}$ in order for $\Omega$ to strictly remain in equilibrium for all time $t > t_{\infty}$. In general, for the loading conditions of (2.24) and (2.25) it readily follows from an application of the power theorem that for all time $t > t_{\infty}$ we must have
\[ K[\dot{\mathbf{u}}](t) + U[\mathbf{e} - \bar{\mathbf{e}}](t) = K[\dot{\mathbf{u}}](t_{\infty}) = \frac{1}{2t_{\infty}^{2}} \int_{\Omega} \rho\,|\bar{\mathbf{u}}|^{2}\,dv, \]
where, of course, the right-hand side also represents the kinetic energy that is added to the system at $t = 0$ due to the ‘nearly’ stationary initial condition. Thus, given an $\epsilon > 0$, for sufficiently large time constant $t_{\infty}$, both the kinetic energy of $\Omega$ and the strain energy of $\Omega$ for the difference strain $\mathbf{e}(\mathbf{x},t) - \bar{\mathbf{e}}(\mathbf{x})$ must remain within an $\epsilon$-neighborhood of zero for all $t \ge t_{\infty}$.
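The role of the time constant can be illustrated on a one-degree-of-freedom analogue. The sketch below departs from the setting above in one respect, which is an assumption of the example: the oscillator m x'' + k x = F t/t_inf starts exactly at rest, so its response is x(t) = (F/k)(t/t_inf - sin(omega t)/(omega t_inf)) rather than a pure ramp. The closed-form energies at t = t_inf, divided by F^2/k, then depend only on the hypothetical parameter phi = omega * t_inf.

```python
import math

def ramp_energies(phi):
    """Work, strain energy and kinetic energy at t = t_inf, divided by W_bar = F**2 / k.

    Assumed analogue: m*x'' + k*x = F*t/t_inf with x(0) = x'(0) = 0 and
    phi = omega * t_inf, omega = sqrt(k/m).
    """
    s, c = math.sin(phi), math.cos(phi)
    work = 0.5 - s / phi + (1.0 - c) / phi**2      # integral of load times velocity
    strain = 0.5 * (1.0 - s / phi) ** 2            # (1/2) k x(t_inf)^2
    kinetic = (1.0 - c) ** 2 / (2.0 * phi**2)      # (1/2) m x'(t_inf)^2
    return work, strain, kinetic

for phi in (1.0, 10.0, 100.0, 1e4):
    w, u, k = ramp_energies(phi)
    print(f"omega*t_inf = {phi:8.0f}:  W = {w:.5f}  U = {u:.5f}  K = {k:.2e}")
```

In this analogue the work and strain energy both approach one half of F^2/k as phi grows, while the leftover kinetic energy falls off like 1/phi^2, which mirrors the 1/t_inf^2 scaling of the residual kinetic energy in the text.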
To illustrate these ideas, we consider, in Appendix B, a one-parameter family of one-dimensional elastodynamic problems for a bar of finite length, wherein the applied loading depends on a slowness parameter $\alpha$. Our aim in this appendix is to exhibit how the retarded nature of the applied loading affects the dynamical behavior and its relationship to the notion of equilibrium.

3. Viscoelasticity

As earlier, we again suppose that the body is initially at rest and undistorted and that it is ‘dead’ loaded as in the static situation of Section 1. Now, to introduce an
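elementary form of mechanical dissipation, we consider a viscoelastic body whose constitutive relation is of the Kelvin–Voigt form
\[ \mathbf{T} = \rho\,\mathbf{C}[\mathbf{e}] + \mathbf{D}[\dot{\mathbf{e}}], \tag{3.1} \]
where $\mathbf{D}$ is a positive definite, completely symmetric (constant) viscosity tensor. The dynamical equation and the boundary and initial conditions are the same as those in (2.1)–(2.3), i.e.,
\[
\begin{aligned}
&\operatorname{div}\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} && \text{in } \Omega,\ \forall t > 0, \\
&\mathbf{T}\mathbf{n} = \bar{\mathbf{t}} && \text{on } \partial\Omega,\ \forall t > 0, \\
&\mathbf{u}(\mathbf{x},0) = \dot{\mathbf{u}}(\mathbf{x},0) = \mathbf{0} && \text{in } \Omega.
\end{aligned}
\tag{3.2}
\]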
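In the usual way, it follows from (3.2) and (2.6) that the classical power theorem holds, i.e.,
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\,dv = \int_{\Omega} \mathbf{T}\cdot\dot{\mathbf{e}}\,dv + \frac{d}{dt}K[\dot{\mathbf{u}}](t), \tag{3.3} \]
which, with the use of (3.1), (3.2)$_3$, (1.1), (1.3) and integration in time, results in
\[ W[\mathbf{u}](t) = U[\mathbf{e}](t) + K[\dot{\mathbf{u}}](t) + \mathcal{D}(t), \tag{3.4} \]
where $\mathcal{D}(t)$ denotes the dissipation function
\[ \mathcal{D}(t) \equiv \int_{0}^{t}\!\int_{\Omega} \mathbf{D}[\dot{\mathbf{e}}]\cdot\dot{\mathbf{e}}\,dv\,dt \ge 0, \tag{3.5} \]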
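for all $t \ge 0$. Because of the dissipative character of viscosity and the special nature of the loading, in that $\bar{\mathbf{t}}(\mathbf{x})$ and $\bar{\mathbf{b}}(\mathbf{x})$ are balanced and correspond with the equilibrium displacement field $\bar{\mathbf{u}}(\mathbf{x})$ of Section 1, it is natural to expect that the solution of the problem outlined above will have the ‘asymptotic property’ $\mathbf{u}(\mathbf{x},t) \to \bar{\mathbf{u}}(\mathbf{x})$ as $t \to \infty$. Supposing this is the case, we find from (3.4), (1.1) and (2.6) that in the limit as $t \to \infty$
\[ W[\bar{\mathbf{u}}] = U[\bar{\mathbf{e}}] + \mathcal{D}_{\infty}, \]
where
\[ \mathcal{D}_{\infty} \equiv \mathcal{D}(\infty) = \int_{0}^{\infty}\!\int_{\Omega} \mathbf{D}[\dot{\mathbf{e}}]\cdot\dot{\mathbf{e}}\,dv\,dt \ge 0. \tag{3.6} \]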
Dafermos [5] and Andrews and Ball [1] have studied the questions of existence and asymptotic stability for general one-dimensional Kelvin–Voigt viscoelasticity theory. With certain smoothness hypotheses, the conclusions in [1, 5] guarantee that the solution to the problem with ‘dead’ loading and zero initial data asymptotically and strongly approaches the equilibrium state which corresponds to the same ‘dead’ loads.
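The limiting balance between work, strain energy and dissipation stated around (3.6) can be explored numerically on a one-degree-of-freedom analogue of the Kelvin–Voigt body. The sketch below assumes a damped oscillator m x'' + c x' + k x = F released from rest, with the accumulated dissipation (the time integral of c x'^2) carried as a third state variable and integrated by a classical fourth-order Runge–Kutta scheme. The function name and all parameter values are hypothetical.

```python
def kelvin_voigt_oscillator(m=1.0, c=0.4, k=4.0, force=2.0, dt=0.005, t_end=100.0):
    """Return (W, U, K, D) at t_end, each divided by W_bar = force**2 / k."""

    def rhs(state):
        x, v, _ = state
        # Position rate, acceleration, and dissipation rate c * v**2.
        return (v, (force - c * v - k * x) / m, c * v * v)

    def step(state, slope, h):
        return tuple(s + h * r for s, r in zip(state, slope))

    state = (0.0, 0.0, 0.0)          # displacement, velocity, dissipated energy
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state)
        k2 = rhs(step(state, k1, 0.5 * dt))
        k3 = rhs(step(state, k2, 0.5 * dt))
        k4 = rhs(step(state, k3, dt))
        slope = tuple((a + 2 * b + 2 * g + d) / 6.0 for a, b, g, d in zip(k1, k2, k3, k4))
        state = step(state, slope, dt)

    x, v, dissipated = state
    w_bar = force * force / k
    return (force * x / w_bar, 0.5 * k * x * x / w_bar,
            0.5 * m * v * v / w_bar, dissipated / w_bar)

work, strain, kinetic, dissipated = kelvin_voigt_oscillator()
print(work, strain, kinetic, dissipated)
```

With these assumed values the motion has essentially died out by the final time, and the four ratios should be close to 1, 1/2, 0 and 1/2: half of the dead-load work is stored and half is dissipated.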
Moreover, by using Clapeyron’s theorem (1.8) we may then conclude that half the work done to reach equilibrium is stored as strain energy and the remaining half is given by
\[ \frac{1}{2}W[\bar{\mathbf{u}}] = \mathcal{D}_{\infty}, \]
which is consumed during the dynamical process through viscous dissipation. To summarize, when viscous dissipation is present and the ‘asymptotic property’ holds then ‘dead’ loading and equilibrium are, indeed, compatible. Moreover, in practical terms the paradox reached within elasticity theory from Love’s interpretation of Clapeyron’s theorem may be resolved by appropriately accounting for the dissipative action of viscoelastic behavior.

4. Thermoelasticity

Within the linear theory of thermoelasticity, when a body is subject to a displacement field $\mathbf{u} = \mathbf{u}(\mathbf{x},t)$ relative to its undistorted, rest state and coincidently the absolute temperature is changed from its constant reference (room) temperature $\theta_0$ to the field $\theta = \theta(\mathbf{x},t)$, the Helmholtz free energy per unit mass $\psi = \psi(\mathbf{x},t)$ is determined by the constitutive equation
\[ \psi = \hat{\psi}(\theta,\mathbf{e}) = \frac{1}{2}\,\mathbf{C}[\mathbf{e}]\cdot\mathbf{e} - (\theta - \theta_0)\,\mathbf{M}\cdot\mathbf{e} - c\,\theta\ln\frac{\theta}{\theta_0}, \tag{4.1} \]
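normalized so that $\hat{\psi}(\theta_0,\mathbf{0}) = 0$. Here, $\mathbf{M}$ is the positive definite, symmetric thermal expansion tensor and $c > 0$ is the specific heat at constant deformation, both representing prescribed thermomechanical material properties and herein assumed to be constant. The symmetric stress tensor field $\mathbf{T} = \mathbf{T}(\mathbf{x},t)$ and the entropy field per unit mass $\eta = \eta(\mathbf{x},t)$ are then determined by the Gibbs relations
\[ \mathbf{T} = \hat{\mathbf{T}}(\theta,\mathbf{e}) = \rho\,\frac{\partial\hat{\psi}(\theta,\mathbf{e})}{\partial\mathbf{e}} = \rho\bigl(\mathbf{C}[\mathbf{e}] - (\theta - \theta_0)\,\mathbf{M}\bigr) \tag{4.2} \]
and
\[ \eta = \hat{\eta}(\theta,\mathbf{e}) = -\frac{\partial\hat{\psi}(\theta,\mathbf{e})}{\partial\theta} = \mathbf{M}\cdot\mathbf{e} + c\Bigl(\ln\frac{\theta}{\theta_0} + 1\Bigr), \tag{4.3} \]
respectively. The total Helmholtz free energy of the body is given by
\[ \Psi[\theta,\mathbf{e}](t) \equiv \int_{\Omega} \rho\,\hat{\psi}(\theta(\mathbf{x},t),\mathbf{e}(\mathbf{x},t))\,dv. \tag{4.4} \]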
In [4] or [8, p. 99], for example, a non-essential quadratic approximation for $\theta$ near $\theta_0$ is used in place of the last term in (4.1).

If we now assume, analogous to Sections 1–3, that the ‘dead’ loads $\bar{\mathbf{t}}(\mathbf{x})$ on $\partial\Omega$ and $\bar{\mathbf{b}}(\mathbf{x})$ in $\Omega$ are balanced in the sense of (1.4) and that $\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})$ is a corresponding equilibrium displacement field at the uniform temperature $\theta = \theta_0$ then, as in Section 1, we readily see, from an argument totally analogous to that given in (1.8) for Clapeyron’s theorem and (1.1), that
\[ \Psi[\theta_0,\bar{\mathbf{e}}] = \frac{1}{2}W[\bar{\mathbf{u}}]; \tag{4.5} \]
i.e., half of the work done to reach equilibrium is stored in the body as Helmholtz free energy. Here, we have used the normalization (1.7) which guarantees uniqueness and eliminates any possible additive infinitesimal rigid field.

Before proceeding with a more detailed continuum thermodynamic analysis for non-isothermal processes, we first give an elementary thermodynamic explanation for the isothermal case $\theta(\mathbf{x},t) = \theta_0$. Thus, for a finite material body the first law and the second law, in the form of the Clausius–Planck inequality, may be written as
\[ \dot{E}(t) + \dot{K}(t) = P(t) + Q(t), \qquad \dot{H}(t) \ge \frac{Q(t)}{\theta_0} \qquad \forall t > 0, \tag{A} \]
where $E$, $K$, $P$, $Q$ and $H$ denote the internal energy, kinetic energy, mechanical power supply (positive for influx and negative for efflux), heat supply rate (positive for absorption and negative for emission) and entropy for the body, respectively. Now, introducing the Helmholtz free energy of the body, $F(t) \equiv E(t) - \theta_0 H(t)$, we may write (A)$_1$ in the form
\[ P(t) = \dot{F}(t) + \theta_0\dot{H}(t) + \dot{K}(t) - Q(t). \tag{B} \]
Then, supposing the body reaches an equilibrium state at some time $t_0 \in (0,\infty]$, we see from (B) that the total work done to the body over the time interval $(0,t_0)$ is given by
\[ W \equiv \int_{0}^{t_0} P(t)\,dt = \Delta F + \mathcal{D}, \tag{C} \]
where
\[ \mathcal{D} \equiv \theta_0\,\Delta H - \int_{0}^{t_0} Q(t)\,dt \ge 0 \tag{D} \]
represents the total energy dissipated by the body during the (isothermal) process of reaching equilibrium. We are assured that this dissipated energy is non-negative because of (A)$_2$ and the isothermal condition. Now, by naturally interpreting $W$ as the work $W[\bar{\mathbf{u}}]$ to reach equilibrium and $\Delta F$ as the equilibrium free energy $\Psi[\theta_0,\bar{\mathbf{e}}]$, both noted in (4.5), we see from (4.5), (C) and (D) that
\[ \frac{1}{2}W = \Delta F \qquad \text{and} \qquad \frac{1}{2}W = \mathcal{D}. \tag{E} \]
Clearly, half of the work that is supplied to reach equilibrium is dissipated and the paradox of Clapeyron’s theorem is resolved.

Now, rather than assume that the temperature field of the body is spatially uniform and constant in time, let us suppose that the body is initially at rest in its undistorted state at the constant temperature $\theta_0$ and that for all time $t > 0$ it is subject to a balanced ‘dead’ loading system, as is presumed in the equilibrium situation which led to (4.5) above. In addition, for convenience we suppose that the body is subject to null heat radiation to or from the external environment and that the boundary temperature is fixed at $\theta_0$ for all time $t > 0$. Explicitly, the boundary and initial conditions that we consider are
\[ \mathbf{T}\mathbf{n} = \bar{\mathbf{t}}, \qquad \theta = \theta_0 \quad \text{on } \partial\Omega,\ \forall t > 0, \tag{4.6} \]
and
\[ \mathbf{u}(\mathbf{x},0) = \dot{\mathbf{u}}(\mathbf{x},0) = \mathbf{0}, \qquad \theta(\mathbf{x},0) = \theta_0 \quad \text{in } \Omega, \tag{4.7} \]
respectively. The dynamical governing equations have the form
\[ \operatorname{div}\mathbf{T} + \bar{\mathbf{b}} = \rho\,\ddot{\mathbf{u}} \quad \text{in } \Omega,\ \forall t > 0, \tag{4.8} \]
and
\[ -\operatorname{div}\mathbf{q} + \mathbf{T}\cdot\dot{\mathbf{e}} = \rho\,\dot{\varepsilon} \quad \text{in } \Omega,\ \forall t > 0. \tag{4.9} \]
Here, $\varepsilon = \varepsilon(\mathbf{x},t)$ is the internal energy field per unit mass, which is related to the Helmholtz free energy, temperature and entropy through $\varepsilon = \psi + \theta\eta$, and $\mathbf{q} = \mathbf{q}(\mathbf{x},t)$ is the heat flux vector field. Also, we note for later reference that with (4.1)–(4.3), we may write (4.9) in the alternative form
\[ -\operatorname{div}\mathbf{q} = \rho\,\theta\,\dot{\eta} \quad \text{in } \Omega,\ \forall t > 0. \tag{4.10} \]
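Now, with the aid of (4.8), (4.6) and (2.6) we again have the power theorem (3.3). Moreover, following a standard line of reasoning which uses (3.3) with (4.9) and an application of the divergence theorem, we recover the global form of the balance of energy:
\[ \int_{\partial\Omega} \bar{\mathbf{t}}\cdot\dot{\mathbf{u}}\,da + \int_{\Omega} \bar{\mathbf{b}}\cdot\dot{\mathbf{u}}\,dv = \frac{d}{dt}E[\theta,\mathbf{e}](t) + \frac{d}{dt}K[\dot{\mathbf{u}}](t) - Q(t). \tag{4.11} \]
Here, $E[\theta,\mathbf{e}](t)$, the total internal energy of the body at time $t$, may be written conveniently as
\[
\begin{aligned}
E[\theta,\mathbf{e}](t) &\equiv \int_{\Omega} \rho\,\hat{\varepsilon}(\theta(\mathbf{x},t),\mathbf{e}(\mathbf{x},t))\,dv \\
&= \Psi_{\theta_0}[\theta,\mathbf{e}](t) + \theta_0\int_{\Omega} \rho\,\hat{\eta}(\theta(\mathbf{x},t),\mathbf{e}(\mathbf{x},t))\,dv,
\end{aligned}
\tag{4.12}
\]
where
\[ \Psi_{\theta_0}[\theta,\mathbf{e}](t) \equiv \int_{\Omega} \rho\bigl(\hat{\varepsilon}(\theta,\mathbf{e}) - \theta_0\,\hat{\eta}(\theta,\mathbf{e})\bigr)\,dv \tag{4.13} \]
is the total Helmholtz semi-free energy of the body based upon the boundary temperature $\theta_0$ and $Q(t)$ is the total heat rate of the body at time $t$, which, here, is determined solely by boundary conduction, i.e.,
\[ Q(t) \equiv \int_{\partial\Omega} -\mathbf{q}\cdot\mathbf{n}\,da. \tag{4.14} \]
Because $\mathbf{n}$ denotes the outer unit normal to $\partial\Omega$, we note that $Q(t) > 0$ $(< 0)$ corresponds to a rate of heat supply to (loss from) $\Omega$. Thus, by integration of (4.11) in time and use of the initial conditions (4.7), and formulae (4.1), (4.4), (1.2) and (1.3), we arrive at
\[ W[\mathbf{u}](t) = \Psi_{\theta_0}[\theta,\mathbf{e}](t) + K[\dot{\mathbf{u}}](t) + \mathcal{D}(t), \tag{4.15} \]
where $\mathcal{D}(t)$ denotes the dissipation function
\[
\begin{aligned}
\mathcal{D}(t) &\equiv \theta_0\int_{\Omega} \rho\bigl(\hat{\eta}(\theta,\mathbf{e}) - \hat{\eta}(\theta_0,\mathbf{0})\bigr)\,dv - \int_{0}^{t} Q(\tau)\,d\tau \\
&= \int_{0}^{t} \Bigl(\theta_0\,\frac{d}{dt}H[\theta,\mathbf{e}](t) - Q(t)\Bigr)\,dt \ge 0,
\end{aligned}
\tag{4.16}
\]
for all $t \ge 0$ and where $H[\theta,\mathbf{e}](t)$, the total entropy of the body in the state of temperature $\theta(\mathbf{x},t)$ and strain $\mathbf{e}(\mathbf{x},t)$, is defined by
\[ H[\theta,\mathbf{e}](t) \equiv \int_{\Omega} \rho\,\hat{\eta}(\theta(\mathbf{x},t),\mathbf{e}(\mathbf{x},t))\,dv. \tag{4.17} \]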
Observe that the right-hand side of $\mathcal{D}(t)$ in (4.16) contains an expression as integrand which, in the absence of radiation and when the body is immersed in an environment of constant temperature $\theta_0$, is non-negative due to the second law of thermodynamics in the form of the Clausius–Planck inequality. Of course, in this circumstance the Clausius–Planck inequality is implied by the Clausius–Duhem inequality.

Because of the dissipative nature of heat conduction and the fact that the mechanical loading $\bar{\mathbf{t}}(\mathbf{x})$ and $\bar{\mathbf{b}}(\mathbf{x})$ and the thermal loading conditions (4.6)$_2$ and (4.7)$_3$ are associated with the equilibrium state $\mathbf{u} = \bar{\mathbf{u}}(\mathbf{x})$ and $\theta = \theta_0$, it is natural to expect, based on physical considerations, that any possible thermodynamic process, generated according to (4.6)–(4.9), will stabilize in the sense that $\mathbf{u}(\mathbf{x},t) \to \bar{\mathbf{u}}(\mathbf{x})$ and $\theta(\mathbf{x},t) \to \theta_0$ as $t \to \infty$. Provided this asymptotic behavior is, indeed, the case, we may conclude, from (4.13), (4.15), (4.16) and the fact that $\Psi_{\theta_0}[\theta,\mathbf{e}](t) \to \Psi[\theta_0,\bar{\mathbf{e}}]$, that
\[ W[\bar{\mathbf{u}}] = \Psi[\theta_0,\bar{\mathbf{e}}] + \mathcal{D}_{\infty} \tag{4.18} \]
in the limit $t \to \infty$, where
\[
\begin{aligned}
\mathcal{D}_{\infty} \equiv \mathcal{D}(\infty) &= \theta_0\bigl(H[\theta_0,\bar{\mathbf{e}}] - H[\theta_0,\mathbf{0}]\bigr) - \int_{0}^{\infty} Q(\tau)\,d\tau \\
&= \int_{0}^{\infty} \Bigl(\theta_0\,\frac{d}{dt}H[\theta,\mathbf{e}](t) - Q(t)\Bigr)\,dt \ge 0.
\end{aligned}
\tag{4.19}
\]

See the work on the stability of material phases by Dunn and Fosdick [7, p. 41]. Duhem [6] introduced a similar quantity denoted by him “l’énergie balistique” in his studies on the stability of equilibrium states. Truesdell [15], in his Historical Introit on pp. 39–40, gives a brief account of Duhem’s ballistic energy and its first appearances in the more modern researches of the 1960s. Today, the term “ballistic free energy” often is used to denote the sum of the total kinetic energy, the Helmholtz semi-free energy and the total potential energy of the applied forces for the body, for certain special processes as, for example, in [2, Section 3.3]. Its main feature is that it is non-negative on these processes and this fact emphasizes its importance in stability analyses.
Thus, with (4.5) and (4.18) we see that half the work done to reach equilibrium is stored as Helmholtz free energy and the remaining half is given by
\[ \frac{1}{2}W[\bar{\mathbf{u}}] = \mathcal{D}_{\infty}. \tag{4.20} \]

Of course, from an analytical point of view this will depend upon the constitutive structure for the law of heat conduction which, for classical linear theory, may be taken as Fourier’s law (4.22). In the present context, this problem has yet to be studied. While Dafermos [4] has provided an analysis of the issues of existence and asymptotic stability for the completely linear theory of thermoelasticity, the initial-boundary value problem under consideration here is weakly nonlinear, due to thermal expansion, and slightly different. In its one-dimensional form the fields $u(x,t)$ and $\theta(x,t)$ are sought for $x \in (0,L)$ and for all $t > 0$ such that the dynamical and constitutive equations (4.2), (4.3), (4.8), (4.9) and (4.22) hold subject to null body force and appropriate boundary and initial conditions. Specifically, the governing equations are
\[ \sigma_x(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, \]
with $\sigma(x,t) = E\,u_x(x,t) - \rho\,m\,(\theta(x,t) - \theta_0)$, and
\[ k\,\theta_{xx}(x,t) = \rho\bigl(m\,\theta(x,t)\,\dot{u}_x(x,t) + c\,\dot{\theta}(x,t)\bigr) \quad \forall x \in (0,L),\ \forall t > 0, \]
subject to the following boundary and initial conditions:
\[
\begin{aligned}
&u(0,t) = 0, \quad \sigma(L,t) = \bar{\sigma} = \text{const}, \quad \theta(0,t) = \theta(L,t) = \theta_0 && \forall t > 0, \\
&u(x,0) = \dot{u}(x,0) = 0, \quad \theta(x,0) = \theta_0 && \forall x \in (0,L).
\end{aligned}
\]
The material constants $\rho$, $k$, $m$, $c$ and $E$ are positive. In the completely linear theory, the nonlinear term $\theta\,\dot{u}_x$ in the third equation above is linearized and replaced by $\theta_0\,\dot{u}_x$. For the system so linearized and within the more general three-dimensional setting, Dafermos has shown that the solution asymptotically and strongly approaches the equilibrium state of uniform temperature in the sense that
\[ (u,\ e \equiv u_x,\ \sigma)(x,t) \to (\bar{u},\ \bar{e},\ \bar{\sigma})(x) = \Bigl(\frac{1}{E}\,\bar{\sigma}\,x,\ \frac{1}{E}\,\bar{\sigma},\ \bar{\sigma}\Bigr), \qquad \theta(x,t) \to \theta_0 \]
as $t \to \infty$.
Following classical considerations, we may interpret the first term in the definition (4.19)$_1$ of $\mathcal{D}_{\infty}$, i.e., the term that involves the total entropy difference, as that part of the change of the total internal energy that is stored in the distorted equilibrium state of the body in the ‘primitive form of heat’ and that is unavailable to do mechanical work at the temperature $\theta = \theta_0$. This is historically referred to as the ‘bound’ part. Of course, the total Helmholtz free energy $\Psi[\theta_0,\bar{\mathbf{e}}]$ represents the remaining part of the total internal energy, and it is available. According to the definition (4.14), the second term in $\mathcal{D}_{\infty}$, in (4.19)$_1$, represents the total heat exchange for the body due to the process of conduction (i.e., ‘transfer’) through its boundary during the thermodynamic process.

Finally, to clearly identify (4.19) as an expression for the dissipated energy due to the internal heat transfer, we first note that with (4.14), (4.17), the divergence theorem, (4.10) and (4.6)$_2$ we may re-write $\mathcal{D}_{\infty}$ as
\[
\begin{aligned}
\mathcal{D}_{\infty} &= \int_{0}^{\infty}\!\int_{\Omega} (\rho\,\theta_0\,\dot{\eta} + \operatorname{div}\mathbf{q})\,dv\,dt \\
&= \int_{0}^{\infty}\!\int_{\Omega} \Bigl(1 - \frac{\theta_0}{\theta}\Bigr)\operatorname{div}\mathbf{q}\,dv\,dt \\
&= \int_{0}^{\infty}\!\int_{\Omega} \theta_0\Bigl(-\frac{\mathbf{q}\cdot\nabla\theta}{\theta^{2}}\Bigr)\,dv\,dt.
\end{aligned}
\tag{4.21}
\]
Then, as is standard within the linear theory of thermoelasticity, if we assume Fourier’s law of heat conduction, i.e.,
\[ \mathbf{q} = -\mathbf{K}\nabla\theta, \tag{4.22} \]
where $\mathbf{K}$ is the positive definite, symmetric heat conductivity tensor, we see that
\[ \mathcal{D}_{\infty} = \int_{0}^{\infty}\!\int_{\Omega} \theta_0\,\frac{(\mathbf{K}\nabla\theta)\cdot\nabla\theta}{\theta^{2}}\,dv\,dt \ge 0. \tag{4.23} \]
Accordingly, in the case of continuum thermoelasticity the expression (4.23) gives an explicit representation for the total dissipated energy that was identified as $\mathcal{D}$ in our previous more elementary discussion (see (D)). Through (4.20), it accounts for the remaining half of the work that is done to reach equilibrium and provides a thermodynamics-based response to the paradox posed in Section 1.

5. Discussion

In this communication we have revisited a well-known classical theorem in linear elastostatics due to Emile Clapeyron and offered several interpretations of an apparent paradox associated with the ‘mysterious’ unaccountability of part of the work done by the loading device to reach equilibrium. Our considerations reveal that this
theorem may be viewed in a purely statical framework as a mechanical statement concerning work and elastic strain energy as did Love [11], and that is where the paradox appears, or it can be viewed more generally as a thermodynamical statement concerning the work and the Helmholtz free energy, in which case no paradox emerges. We consider the ‘thermodynamic’ version of Clapeyron’s theorem, as noted in (4.5), to be the most reasonable one; the issue does not appear to have been addressed previously in the literature.

Within elastostatics, the purely mechanical statement of Clapeyron’s theorem is ambiguous because only equilibrium ideas are used to deduce it and, therefore, the definition of ‘work’ is somewhat subjective. In practice, an elastic body adjusts to the application of a loading gradually and part of the associated work is transformed during this process into an energy of ‘ringing’ relative to some average configuration. This ‘ringing’ may be sizable or negligible depending upon the rate at which the ultimate load is attained. Coincidently, this energy is being removed from the system by the unavoidable action of dissipation and the body tends to an equilibrium state. If, in a particular setting, the process of reaching equilibrium is considered instantaneous relative to the time-scale defined by the physical problem, then the classical theorem applies and the unaccounted work should be considered lost through dissipation. In this case, one can suppose that there is a fast time-scale in the problem and that the associated generation of high frequency vibrations can be considered, from the slower time-scale point of view, to be an effective dissipative action.

We note that circumstances in which some energy may be either ‘lost’ or ‘acquired’ are not unknown within the setting of a purely conservative elastic system. For example, when considering steady state solutions of linear elastodynamic problems, one characteristically neglects short transient periods in determining the corresponding steady states from prescribed initial conditions. One of the energetic consequences of such a neglect of the transient phase of the process is the necessity to apply so-called radiation conditions in order to determine a unique steady state configuration. Another example originates in nonlinear elastodynamics where the energy is not conserved due to the unavoidable generation of the ‘invisible’ high frequency vibrations inside the transition layer of shock waves.

If the Helmholtz free energy is used instead of the elastic strain energy and the problem is viewed as thermodynamical from the very beginning, the paradox does not surface. The reason is that in this case the system no longer is considered to be energetically closed and the ‘macro-mechanical’ degrees of freedom are not the only ones present in the system. More specifically, in this case, the adjustment of the body to the applied dead loads involves the activation of the ‘micro-mechanical’ degrees of freedom not accounted for by the purely mechanical macro-description. The channeling of the macroscopic energy towards these microscopic degrees of freedom is then viewed at the macro-level as the dissipation. The beauty of a continuum thermodynamical description is that these degrees of freedom need not be described explicitly.
Acknowledgements

We wish to thank Chi-Sing Man, J.J. Marigo and W. Warner for helpful comments. L.T. also acknowledges, with appreciation, discussions with I. Müller, P. Podio-Guidugli, J. Rice and K. Wilmanski. We gratefully acknowledge E. Petersen for supplying the calculations and figures related to Appendix B.

Appendix A. 1D Example: ‘Dead’ Loading

To exemplify the general conclusions reached in Section 2.1 concerning the dynamical implications of ‘dead’ loading, consider the specific one-dimensional elastodynamic problem of determining the displacement field $u(x,t)$ for $x \in (0,L)$ and for all time $t > 0$ such that
\[ E\,u_{xx}(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, \tag{A.1} \]
subject to the following boundary and initial conditions:
\[ u(0,t) = 0, \qquad \sigma(L,t) = \bar{\sigma} = \text{const} \quad \forall t > 0, \tag{A.2} \]
\[ u(x,0) = \dot{u}(x,0) = 0 \quad \forall x \in (0,L). \tag{A.3} \]
Here, $E > 0$ is the (constant) Young’s modulus and $\sigma(x,t) \equiv E\,u_x(x,t)$ denotes the stress. It is straightforward to show that the solution of (A.1)–(A.3) is periodic in time with period $T = 4L/c$, where $c \equiv \sqrt{E/\rho}$ is the characteristic wave speed, and that in the $(x,t)$-plane the strain and velocity fields, $e(x,t) \equiv u_x(x,t)$ and $v(x,t) \equiv \dot{u}(x,t)$, are piecewise constant and of the form shown in Figure 1. Moreover, in this one-dimensional setting (2.7) again holds, i.e.,
\[ W[u](t) = U[e](t) + K[v](t) \quad \forall t \ge 0, \tag{A.4} \]
where
\[ W[u](t) \equiv \bar{\sigma}\,u(L,t), \qquad U[e](t) \equiv \int_{0}^{L} \frac{1}{2}E\,e^{2}\,dx, \qquad K[v](t) \equiv \int_{0}^{L} \frac{1}{2}\rho\,v^{2}\,dx. \tag{A.5} \]
Thus, from the solution shown in Figure 1 we may readily construct the periodic forms of $W[u](t)$, $U[e](t)$ and $K[v](t)$ and they are illustrated in Figure 2.

Now, to analyze these results it is helpful to first note that the unique equilibrium displacement $\bar{u}(x)$, strain $\bar{e}(x)$ and stress $\bar{\sigma}(x)$ fields which correspond to the boundary conditions
\[ \bar{u}(0) = 0, \qquad \bar{\sigma}(L) = \bar{\sigma} \]
are given by $\bar{u}(x) = (\bar{\sigma}/E)\,x$, $\bar{e}(x) = \bar{\sigma}/E$ and $\bar{\sigma}(x) = \bar{\sigma}$ for $x \in (0,L)$. In this case, Clapeyron’s theorem implies that
\[ \frac{1}{2}W[\bar{u}] = U[\bar{e}] \tag{A.6} \]
Figure 1. Summary of the solution of (A.1)–(A.3) in the (x, t)-plane.
Figure 2. The total work W [u](t), strain energy U [e](t) and kinetic energy K[v](t) during one period of motion.
and we easily calculate, using (A.5), that
\[ W[\bar{u}] = \frac{\bar{\sigma}^{2}}{E}\,L, \qquad U[\bar{e}] = \frac{1}{2}\,\frac{\bar{\sigma}^{2}}{E}\,L. \tag{A.7} \]
Notice from Figure 1 that at the discrete times $t = \bar{t} \in \{L/c, 3L/c, \ldots\}$ the displacement and strain fields coincide with those of the equilibrium state, $u(x,\bar{t}) = \bar{u}(x)$ and $e(x,\bar{t}) = \bar{e}(x)$. Thus, from (A.7)$_1$ and Figure 2 we see that
\[ W[u](\bar{t}) = W[\bar{u}], \qquad U[e](\bar{t}) = K[v](\bar{t}) = \frac{1}{2}W[\bar{u}]. \tag{A.8} \]
This verifies (2.9) and explicitly shows that at those times when the dynamical displacement field coincides with the equilibrium displacement field, half the work done is stored as strain energy and the remaining half appears as kinetic energy. In passing, we note from Figure 1 that at time $t = 2L/c$ (and periodically thereafter) the body is at rest and it is distorted with a strain field that is double what it is in equilibrium. Moreover, from Figure 2 we see that at this time there is a total ‘work-energy balance’ in the sense that $W[u](2L/c) = U[e](2L/c)$. This is a reflection of Poncelet’s observation noted earlier in the first footnote of Section 2.1.

Observe, from Figure 1, that $v(x,t^{*}) = 0$ $\forall x \in (0,L)$ and for every $t^{*} \in \{2L/c, 4L/c, \ldots\}$. Thus, by time-averaging (A.4) over any interval $(0,t^{*})$ and using a notation analogous to (2.10) it is clear that
\[ \langle W[u]\rangle = W[\langle u\rangle] = \langle U[e]\rangle + \langle K[v]\rangle, \tag{A.9} \]
where, according to Figure 2 and (A.7)$_1$, we easily calculate
\[ \langle W[u]\rangle = W[\bar{u}], \qquad \langle U[e]\rangle = \frac{3}{4}W[\bar{u}], \qquad \langle K[v]\rangle = \frac{1}{4}W[\bar{u}], \tag{A.10} \]
in agreement with results more generally obtained in Section 2.1. In addition, from the periodic extension of Figure 2 and the value of $W[\bar{u}]$ in (A.7), we readily see that the infinite time-average, constructed analogous to (2.22) for this one-dimensional example, satisfies the general conditions recorded in (2.23), i.e., $\langle W[u]\rangle_{\infty} = \langle U[e]\rangle_{\infty} + \langle K[v]\rangle_{\infty}$, where
\[ \langle W[u]\rangle_{\infty} = W[\bar{u}], \qquad \langle U[e]\rangle_{\infty} = \frac{3}{4}W[\bar{u}], \qquad \langle K[v]\rangle_{\infty} = \frac{1}{4}W[\bar{u}]. \]
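The statements read off Figures 1 and 2 can be cross-checked with the modal expansion (2.21) specialized to the bar. The sketch below works in units chosen for the example, $L = E = \rho = \bar{\sigma} = 1$, so that $c = 1$, $\bar{u}(x) = x$ and $W[\bar{u}] = 1$. It assumes the fixed-free eigenfunctions $\sin(k_n x)$ with $k_n = (2n-1)\pi/2$ and the Fourier coefficients $A_n = 2(-1)^{n+1}/k_n^{2}$ of $\bar{u}$. The function name and the truncation level are hypothetical.

```python
import math

def bar_checks(terms=5000):
    """Truncated modal sums for the dead-loaded bar in units L = E = rho = sigma_bar = 1.

    Returns (u(L, 2L/c), u(L/2, L/c), mean kinetic energy / W_bar), using
    u(x, t) = sum_n A_n sin(k_n x) (1 - cos(k_n t)).
    """
    tip_at_2 = mid_at_1 = mean_kinetic = 0.0
    for n in range(1, terms + 1):
        kn = (2 * n - 1) * math.pi / 2.0       # wavenumber; omega_n = kn because c = 1
        an = 2.0 * (-1) ** (n + 1) / kn**2     # Fourier coefficient of u_bar(x) = x
        tip_at_2 += an * math.sin(kn * 1.0) * (1.0 - math.cos(kn * 2.0))
        mid_at_1 += an * math.sin(kn * 0.5) * (1.0 - math.cos(kn * 1.0))
        # Time average of (1/2) * rho * integral of v^2: (1/2) * (L/2) * (1/2) * (A_n * omega_n)^2
        mean_kinetic += 0.125 * (an * kn) ** 2
    return tip_at_2, mid_at_1, mean_kinetic

tip_at_2, mid_at_1, mean_kinetic = bar_checks()
print(tip_at_2, mid_at_1, mean_kinetic)
```

Up to truncation error the three numbers should approach 2, 1/2 and 1/4: the end displacement at $t = 2L/c$ is twice its equilibrium value (Poncelet's factor of 2), the displacement at $t = L/c$ matches $\bar{u}$, and the mean kinetic energy is a quarter of $W[\bar{u}]$ as in (A.10).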
Appendix B. 1D Example: Retarded Loading

In order to exhibit more precisely how the solution of an elastodynamics problem may depend on the slowness of the applied loading, we consider another one-dimensional elastodynamic problem of determining $u(x,t)$ for $x \in (0,L)$ and for all time $t > 0$ such that
\[ E\,u_{xx}(x,t) = \rho\,\ddot{u}(x,t) \quad \forall x \in (0,L),\ \forall t > 0, \tag{B.1} \]
subject to the following boundary and initial conditions:
\[ u(0,t) = 0, \qquad \sigma(L,t) = (1 - e^{-\alpha t})\,\bar{\sigma} \quad \forall t > 0, \tag{B.2} \]
\[ u(x,0) = \dot{u}(x,0) = 0 \quad \forall x \in (0,L). \tag{B.3} \]
Here, $\alpha > 0$ represents a ‘slowness’ load parameter which governs the length of time it takes the applied end load to essentially reach the constant value $\bar{\sigma}$. For sufficiently large $\alpha$, the loading in (B.2) is nearly impulsive and this problem then reduces to that of Appendix A. As $\alpha$ is reduced the loading becomes more retarded and the solution is expected to show less of a dynamic structure. Of course, analogous to (2.7) the mechanical energy balance again holds, so that
\[ W[u](t) = U[e](t) + K[v](t) \quad \forall t \ge 0, \tag{B.4} \]
where the work done on the body up to time $t$ is now determined by
\[ W[u](t) = \int_{0}^{t} \sigma(L,\tau)\,\dot{u}(L,\tau)\,d\tau, \tag{B.5} \]
and where the corresponding strain energy, $U[e](t)$, and corresponding kinetic energy, $K[v](t)$, are as defined in (A.5).

One of the major questions concerning the solution of the dynamical problem stated above is how the work, strain energy and kinetic energy vary with time relative to the strain energy that would be stored in the same elastic bar in equilibrium under the constant end load $\bar{\sigma}$, i.e., $U[\bar{e}]$ of (A.7)$_2$. In Figures 3–5 we show the normalized work, $W[u](t)/U[\bar{e}]$, normalized strain energy, $U[e](t)/U[\bar{e}]$, and normalized kinetic energy, $K[v](t)/U[\bar{e}]$, as functions of time computed numerically for this problem for a range of slowness load parameters $\alpha$ between $\alpha = 10^{4}\ \text{sec}^{-1}$ and $\alpha = 10^{6}\ \text{sec}^{-1}$. These figures are based on material constants for an aluminum alloy with $E = 76.1 \times 10^{9}$ Pa and $\rho = 2710\ \text{kg/m}^{3}$, for a bar of length $L = 5 \times 10^{-3}$ m, and for a load constant $\bar{\sigma} = 10^{7}$ Pa. The time axis of these figures is measured in ‘time steps’ with the final time step of 129760 corresponding to $1200 \times 10^{-6}$ sec. One can see that the impulsive-like nature of the loading for large $\alpha$ results in wildly irregular behavior which is sustained over an infinite time. On the contrary, for relatively small $\alpha$ equilibrium appears to be achieved quickly in time with nearly constant limiting values $W[u](t)/U[\bar{e}] \approx 1$, $U[e](t)/U[\bar{e}] \approx 1$ and $K[v](t)/U[\bar{e}] \approx 0$. We conclude that the quantity $W[\bar{u}]$ in (A.7)$_1$, while it has units of work and shows up in Clapeyron’s theorem as exhibited in (A.6), does not represent the work done to reach equilibrium; reasoning based on the computed limiting behavior leads to the conclusion that only half of this value is expended to reach equilibrium and, then, it is manifested totally in the form of strain energy.
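For orientation, the characteristic scales implied by the quoted constants can be worked out directly. The sketch below uses only the values stated above together with the formulas $c = \sqrt{E/\rho}$, $T = 4L/c$ from Appendix A and $U[\bar{e}]$ from (A.7)$_2$, which is an energy per unit cross-sectional area. The variable names are hypothetical, and the product of $\alpha$ with the period is offered only as a rough measure of how impulsive each loading is.

```python
import math

# Material and geometric constants quoted for Figures 3-5.
E = 76.1e9          # Young's modulus, Pa
rho = 2710.0        # mass density, kg/m^3
L = 5e-3            # bar length, m
sigma_bar = 1e7     # final end load, Pa

c = math.sqrt(E / rho)                  # characteristic wave speed, m/s
period = 4.0 * L / c                    # period of the dead-load solution, s
u_eq = 0.5 * sigma_bar**2 * L / E       # U[e_bar] of (A.7)_2, J per m^2 of cross-section

print(f"wave speed c      = {c:.1f} m/s")
print(f"period 4L/c       = {period:.3e} s")
print(f"U[e_bar] per area = {u_eq:.3f} J/m^2")

# Load rise rate alpha compared with the natural period of the bar.
ratios = {alpha: alpha * period for alpha in (1e4, 1e5, 1e6)}
for alpha, ratio in ratios.items():
    print(f"alpha = {alpha:.0e} 1/s  ->  alpha * period = {ratio:.3f}")
```

A small product means the load rises over many periods of the bar (slow loading), while a product of order one or larger means the load is applied within roughly one period.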
Figure 3. Normalized work $W[u](t)/U[\bar e]$ as a function of time for various slowness load values α.
Figure 4. Normalized strain energy $U[e](t)/U[\bar e]$ as a function of time for various slowness load values α.
Figure 5. Normalized kinetic energy $K[v](t)/U[\bar e]$ as a function of time for various slowness load values α.
Because there are three decades of variation of the slowness load parameter α shown in Figures 3–5, there is much highly oscillatory, rapid time-behavior that is not resolved in these figures. Therefore, in Figures 6–10, we take $\alpha = 10^5\ \mathrm{sec}^{-1}$ and show a more detailed solution of (B.1)–(B.3). The material constants E and ρ, bar length L and load constant $\bar\sigma$ are the same as noted above, but the time steps for the time-axis are now such that the final time step of 12800 corresponds to $120 \times 10^{-6}$ sec. In Figure 6, we see that the strain field $e(x, t)$ is highly irregular in time at the fixed end x = 0 where information from the time-dependent loading at the end x = L is reflected back into the bar. The length-axis of this figure is measured in ‘length steps’ with the final length step of 100 corresponding to $5 \times 10^{-3}$ m, which is the length of the bar. In Figures 7 and 8, we show the normalized total work done $W[u](t)/U[\bar e]$ and the normalized kinetic energy $K[v](t)/U[\bar e]$ as functions of time. These correspond to the $\alpha = 10^5\ \mathrm{sec}^{-1}$ cross sections of Figures 3 and 5, respectively, for the initial time interval (0, 12800) as noted in these figures. The normalized strain energy $U[e](t)/U[\bar e]$ is not shown, but behaves similarly to Figure 7. Notice the orders of magnitude reduction of the energy scale used in exhibiting the kinetic energy in Figure 8. In Figures 9 and 10, we show the ratios $U[e](t)/W[u](t)$ and $K[v](t)/W[u](t)$ as functions of time in order to illustrate that it takes only a few ‘rings’ to almost completely eliminate the total kinetic energy in the bar. Of course, a small motion remains in the bar for all time no matter how small the slowness parameter α > 0.
Figure 6. Strain $e(x, t)$ as a function of axial position and time for $\alpha = 10^5\ \mathrm{sec}^{-1}$.
Figure 7. $W[u](t)/U[\bar e]$ vs. t: $\alpha = 10^5\ \mathrm{sec}^{-1}$.
Figure 8. $K[v](t)/U[\bar e]$ vs. t: $\alpha = 10^5\ \mathrm{sec}^{-1}$.
Figure 9. $U[e](t)/W[u](t)$ vs. t: $\alpha = 10^5\ \mathrm{sec}^{-1}$.
Figure 10. $K[v](t)/W[u](t)$ vs. t: $\alpha = 10^5\ \mathrm{sec}^{-1}$.
References
1. G. Andrews and J.M. Ball, Asymptotic behaviour and changes of phase in one-dimensional nonlinear viscoelasticity. J. Differential Equations 44 (1982) 306–341.
2. J.M. Ball, Some open problems in elasticity. In: Geometry, Mechanics and Dynamics, eds. P. Newton, P. Holmes and A. Weinstein, Springer, New York (2002) pp. 3–59.
3. E. Clapeyron, Mémoire sur le travail des forces élastiques dans un corps solide élastique déformé par l’action de forces extérieures. Comptes Rendus Acad. Sci. Paris XLVI (1858) 208–212.
4. C.M. Dafermos, On the existence and the asymptotic stability of solutions to the equations of linear thermoelasticity. Arch. Rational Mech. Anal. 29 (1968) 241–271.
5. C.M. Dafermos, The mixed initial-boundary value problem for the equations of nonlinear one-dimensional viscoelasticity. J. Differential Equations 6 (1969) 71–86.
6. P. Duhem, Traité d’Energétique ou de Thermodynamique Générale. Gauthier-Villars, Paris (1911).
7. J.E. Dunn and R. Fosdick, The morphology and stability of material phases. Arch. Rational Mech. Anal. 74 (1980) 1–99.
8. D. Iesan and A. Scalia, Thermoelastic Deformations. Kluwer Academic, Dordrecht (1996).
9. G. Lamé, Leçons sur la Théorie Mathématique de l’Élasticité des Corps Solides. Paris (1852).
10. G. Lamé and E. Clapeyron, Mémoire sur l’équilibre intérieur des corps solides homogènes. Mém. Divers Savants IV (1833) 465–562.
11. A.E.H. Love, A Treatise on the Mathematical Theory of Elasticity, 4th edn. Cambridge (1927).
12. J.V. Poncelet, Introduction à la Mécanique Industrielle, Physique et Expérimentale. Paris (1839).
13. I. Todhunter and K. Pearson, A History of the Theory of Elasticity and of the Strength of Materials from Galilei to the Present Time, Vol. I, Galilei to Saint-Venant. Cambridge (1886) pp. 1639–1850.
14. I. Todhunter and K. Pearson, A History of the Theory of Elasticity and of the Strength of Materials from Galilei to the Present Time, Vol. II, Saint-Venant to Lord Kelvin, Part I. Cambridge (1893).
15. C. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1984).
The Lavrentiev Phenomenon in Nonlinear Elasticity

M. FOSS¹, W. HRUSA² and V.J. MIZEL²
¹ Kansas State University, Manhattan, KS, U.S.A.
² Carnegie Mellon University, Pittsburgh, PA, U.S.A.

Received 14 October 2002; in revised form 30 October 2003

Abstract. In 1985 J.M. Ball and V.J. Mizel raised the question of whether there exist nonlinearly elastic materials possessing a physically natural stored energy density, i.e., one which is independent of an observer’s coordinate frame (objective) and is invariant under the group of orthogonal linear transformations of space (isotropic), as well as physically reasonable boundary value problems for such materials such that the infimum of the total stored energy for those continuous deformations of the material meeting the boundary condition (admissible deformations) which belong to a Sobolev space $W^{1,p_2}$ for some $p_2 > 1$ is strictly greater than its infimum for those admissible continuous deformations belonging to some Sobolev space $W^{1,p_1}$, $p_1 < p_2$, despite the density of $W^{1,p_2}$ in $W^{1,p_1}$. The question was motivated by M. Lavrentiev’s demonstration in 1926 of the presence of such a gap for a 1-dimensional variational boundary value problem on a bounded interval whose smooth integrand satisfied the conditions of Tonelli’s existence theorem (as well as the development of improved versions in the 1980’s). The present article describes a positive response to the question raised in 1985. Namely, we provide examples of nonlinearly elastic materials in 2 dimensions and physically reasonable boundary value problems for these materials in which a positive gap exists between the infimum of the total stored energy over admissible continuous deformations belonging to a Sobolev space $W^{1,p_2}$ and its infimum over admissible continuous deformations belonging to a Sobolev space $W^{1,p_1}$, with $p_1 < p_2$. The physical and computational significance of such results is also discussed.

Mathematics Subject Classifications (2000): 49J, 49K, 74B, 74G.

Key words: Lavrentiev phenomenon, nonlinear elasticity, singular minimizers.
Dedicated to the memory of Clifford Truesdell, a teacher, good friend, and inspiration to several generations of researchers in continuum physics.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 427–435. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

During the course of an investigation in the early 1980’s studying the variational approach to the analysis of hyperelastic materials, J.M. Ball and V.J. Mizel discovered that even unidimensional problems in the calculus of variations are not as elementary as many classical results suggest. For example, there are two-point boundary value problems involving a positive smooth integrand which is convex in the derivative variable for which the absolutely continuous minimizer is not
a Lipschitz function but instead possesses a derivative which is only in $L^p$ for some finite p values and thus is essentially unbounded [4]. Furthermore Ball and Mizel became aware through Cesari’s book [6] of an even more surprising one-dimensional phenomenon (to be described below) which they thereafter called the Lavrentiev phenomenon in honor of its discoverer. These developments affected the degree of confidence Ball and Mizel had in the invariable success of variational methods in determining the actual energy deformation of an elastic body under physically appropriate boundary conditions and external forces [4].

The goal of the present article is to present a successful resolution of the question raised in [4] as to whether the Lavrentiev phenomenon can arise in realistic problems of multidimensional hyperelasticity. We summarize the results obtained in a longer article, which appeared in the Archive for Rational Mechanics and Analysis [7], and we include some additional discussion of the physical significance of those results.

In order to clarify what is meant by the Lavrentiev phenomenon in one-dimensional variational problems let us consider a functional of the form

$$J[y] = \int_a^b f(x, y(x), y'(x))\,dx, \tag{1}$$

where y is an absolutely continuous function defined on the interval [a, b] which is subject to boundary constraints y(a) = A, y(b) = B and smoothness conditions. We adopt the following notation.

$y \in W^{1,p}(a, b) \Leftrightarrow y$ is absolutely continuous with $y' \in L^p(a, b)$;
$\mathcal{A}_p = \{y \in W^{1,p}(a, b) \mid y(a) = A,\ y(b) = B\}$.

Thus, for example, with [a, b] = [0, 1] and A = 0, B = 1, the function $y(x) = x^{\beta}$, $\beta \in (0, 1)$, satisfies $y \in W^{1,p}(0, 1)$ if and only if $p \in [1, 1/(1 - \beta))$. For J as above we put

$$i(p) = \inf\{J[y] \mid y \in \mathcal{A}_p\}, \qquad p \in [1, \infty],$$

so that $p_1 \le p_2 \Rightarrow i(p_1) \le i(p_2)$. Then the Lavrentiev phenomenon is said to occur if J is such that $i(p_1) < i(p_2)$ for some $p_1 < p_2$. For convenience we denote this phenomenon by . There is no general theory ensuring the existence of a $W^{1,\infty}$ (i.e., Lipschitz) minimizer for such problems; however in the 1920’s the following result (incorporating an idea due to Nagumo) was demonstrated by Tonelli.

THEOREM. If there exists $\varphi: [0, \infty) \to \mathbb{R}$ such that $\varphi(t)/t \to \infty$ as $t \to \infty$ and $f \in C^2$ satisfies $f(x, y, z) \ge \varphi(|z|)$, $\forall x \in [a, b]$, $(y, z) \in \mathbb{R}^2$ and $f_{zz} \ge 0$ then there exists $u \in \mathcal{A}_1$ such that $J[u] \le J[v]$, $\forall v \in \mathcal{A}_1$.

To illustrate the phenomenon we examine the following example due to Heinricher and Mizel [8]. Let $f_0(x, y, z) = (y^2 - x)^2 z^6$ and consider the problem of minimizing

$$J_0[y] = \int_0^1 f_0(x, y(x), y'(x))\,dx,$$
where y is an absolutely continuous function defined on the interval [0, 1] subject to the constraints y(0) = 0, y(1) = 1. Note that $f_0$ satisfies all conditions apart from the superlinear growth condition in Tonelli’s theorem. We put

$$i_0(p) = \inf\{J_0[y] \mid y \in W^{1,p}(0, 1),\ y(0) = 0,\ y(1) = 1\}, \qquad p \in [1, \infty].$$

Now positivity of $f_0$ ensures that since $u(x) = x^{1/2}$ satisfies $J_0[u] = 0$ one has $i_0(p) = 0$ $\forall p \in [1, 2)$. However it was shown using ideas of Noether that for all $p \in [2, 5/2)$ the function $v(x) = x^{3/5}$ minimizes $J_0$ whence $i_0(p) = (1/6)(3/5)^6$ for this range of p values. Thus we have the phenomenon for this problem:

$$i_0(p_1) = 0 < i_0(p_2) = (1/6)(3/5)^6 \qquad \forall p_1 < 2,\ p_2 \in [2, 5/2).$$

Furthermore if the exponent 6 in $f_0$ is replaced by any β > 6 then for any sequence $\{y_n\} \subset \mathcal{A}_2$ such that $y_n(x)$ converges pointwise to $x^{1/2}$, $J_0[y_n] \to +\infty$.

Now it is possible by a simple modification of $f_0$ to construct an integrand f which satisfies all conditions of Tonelli’s theorem as well as $f_{zz} > 0$ and yet exhibits Lavrentiev’s phenomenon. Namely, put for ε > 0, $f_\varepsilon(x, y, z) = (y^2 - x)^2 z^6 + \varepsilon(1 + z^2)^{5/6}$ and set

$$J_\varepsilon[y] = \int_0^1 f_\varepsilon(x, y(x), y'(x))\,dx. \tag{2}$$
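The two values of $J_0$ quoted above can be checked by hand, since for a power trial function y(x) = x^b the integrand of $J_0$ is a finite sum of powers of x. The sketch below does this in exact rational arithmetic. The helper names `j0_power` and `sobolev_limit`, and the convention of returning `None` for a divergent integral, are choices made here for illustration.

```python
from fractions import Fraction as Fr

def j0_power(b):
    """Exact value of J0[x**b] on (0, 1) for f0 = (y^2 - x)^2 (y')^6.

    Returns None when the integral diverges (the integrand is nonnegative,
    so divergence means +infinity).
    """
    b = Fr(b)
    # (y^2 - x)^2 = x^(4b) - 2 x^(2b+1) + x^2 and (y')^6 = b^6 x^(6b-6).
    coeffs = {}
    for c, q in ((1, 4 * b), (-2, 2 * b + 1), (1, Fr(2))):
        e = q + 6 * b - 6                 # exponent after multiplying by (y')^6
        coeffs[e] = coeffs.get(e, 0) + c  # merge like powers before integrating
    total = Fr(0)
    for e, c in coeffs.items():
        if c == 0:
            continue
        if e <= -1:
            return None                   # non-integrable singularity at x = 0
        total += c / (e + 1)              # integral of c x^e over (0, 1)
    return b**6 * total

def sobolev_limit(b):
    """x**b with 0 < b < 1 lies in W^{1,p}(0,1) exactly for p < 1/(1-b)."""
    return 1 / (1 - Fr(b))

print(j0_power(Fr(1, 2)), sobolev_limit(Fr(1, 2)))   # u(x) = x^(1/2)
print(j0_power(Fr(3, 5)), sobolev_limit(Fr(3, 5)))   # v(x) = x^(3/5)
```

The first call gives 0 and the second gives (1/6)(3/5)^6, with Sobolev thresholds 2 and 5/2, which matches the ranges of p stated in the text.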
It is not difficult to verify that for sufficiently small ε one has

$$i_\varepsilon(p_1) < (1/6)(3/5)^6 < i_\varepsilon(p_2) \qquad \forall p_1 < 2,\ p_2 \in [2, 5/2),$$

whereby the phenomenon holds for $J_\varepsilon$, as claimed.

One might be tempted to say that $\{y \in W^{1,6}(0, 1): y(0) = 0,\ y(1) = 1\}$ is the “natural” domain on which to minimize $J_0$ (or $J_\varepsilon$). However, in this case there is no minimizer. In physical problems arising in nonlinear elasticity the situation is similar. If one takes the domain of the energy functional based on a Sobolev space having the property that all deformations have finite energy, then there may not be a minimizer – unless the stored energy function is of a very special type having the same upper and lower growth rates.

2. 2-Dimensional Hyperelasticity Examples

As indicated earlier, we will describe certain boundary value problems involving a 2-dimensional body consisting of an elastic material with a physically natural stored energy function, for which the Lavrentiev phenomenon occurs. It should be noted that there were previous results in nonlinear elasticity where such a gap phenomenon was shown [3, 1] but all such examples involved discontinuous deformations exhibiting cavities of nonzero surface area to which no energetic cost was assigned. In view of the uncertain physical status of those examples, we regard it as important that our examples involve only continuous deformations so that no cavities arise.

Now the presence of the Lavrentiev phenomenon in our boundary value problems for a physically natural elastic material implies that the infimal stored energy for continuous deformations belonging to a Sobolev space of exponent $p_2$ for some
$p_2 > 1$, is strictly greater than the minimal stored energy for continuous deformations belonging to the larger Sobolev space of exponent $p_1 = 1$. Consequently any deformation minimizing the stored energy in the space with exponent $p_2$ would be a metastable equilibrium – one possessing milder gradient singularities than those occurring in the stable equilibrium deformation belonging to the space with exponent $p_1 = 1$. Therefore any occasion in which the material deformation transforms from the metastable equilibrium associated with exponent $p_2$ to the stable equilibrium associated with exponent $p_1$ would produce material points with heightened gradient singularities. Possibly such points could provide loci for the initiation of (as opposed to the presence of) fracture.

The computational significance of such results is that standard computational schemes might suggest that the more regular metastable equilibrium state is actually the stable equilibrium state, i.e., absolute minimizer. Hence such computations could lead to misleadingly optimistic design specifications for elastic structures. Furthermore, in analogy to the one-dimensional result cited on page 4, when a sector of the unit disc is deformed into a sector whose central angle is less than 3/4 as large then for any sequence $\{u^{(n)}\}$ lying in the Sobolev space of exponent $p_2$ (which does not contain the stable equilibrium deformation) and converging to that equilibrium deformation pointwise one has $J[u^{(n)}] \to +\infty$, where $J[u]$ denotes the total stored energy associated with the displacement field u.

It will be convenient to introduce the following notation for the description of our examples. $\mathrm{Lin}\,\mathbb{R}^2$ denotes the set of linear transformations on $\mathbb{R}^2$, while $\mathrm{Lin}^+\mathbb{R}^2$ denotes the subset consisting of linear transformations with positive determinant. $\mathrm{Orth}\,\mathbb{R}^2$ denotes the orthogonal group of linear transformations on $\mathbb{R}^2$, with $\mathrm{Orth}^+\mathbb{R}^2 = \mathrm{Orth}\,\mathbb{R}^2 \cap \mathrm{Lin}^+\mathbb{R}^2$ denoting the proper orthogonal group of linear transformations on $\mathbb{R}^2$.
For $F \in \mathrm{Lin}\,\mathbb{R}^2$, $\|F\|^2 = \operatorname{tr} F^T F$. For $x \in \mathbb{R}^2 \setminus \{0\}$,

$$r(x) = \sqrt{x_1^2 + x_2^2}, \qquad \theta(x) = \arctan(x_2/x_1).$$

Finally, for $\beta \in (0, 2\pi)$

$$\Omega_\beta = \{x \mid r(x) < 1,\ \theta(x) \in (0, \beta)\},$$

with the following notation for portions of its boundary:

$\Gamma_{1,\beta} = \{x \mid r(x) \in [0, 1],\ \theta(x) = \beta\}$,
$\Gamma_{2,\beta} = \{x \mid r(x) \in [0, 1],\ \theta(x) = 0\}$,
$\Gamma_{3,\beta} = \{x \mid r(x) = 1,\ \theta(x) \in [0, \beta]\}$.

We begin with the following two-dimensional variational problem involving a positive integrand $W_0(\nabla u)$ where $W_0$ has several physically natural features (to be described below) but is not a physically appropriate stored energy integrand for a nonlinearly elastic material since $W_0(F)$ does not become unbounded as $\det F \to 0+$, i.e., as the material undergoes very high compression:

$$\text{Minimize } J_0[u] = \int_{\Omega_\beta} W_0(\nabla u)\,dx \quad \text{for } u: \Omega_\beta \to \Omega_\alpha,\ \alpha \in (0, \beta), \tag{P}$$
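subject to

(i) $r(u(x)) = 1$, $\theta(u(x)) = \dfrac{\alpha}{\beta}\,\theta(x)$, $\forall x \in \Gamma_{3,\beta}$;
(ii) $u(\Gamma_{1,\beta}) = \Gamma_{1,\alpha}$;
(iii) $u(\Gamma_{2,\beta}) = \Gamma_{2,\alpha}$;
(iv) $u(0) = 0$;
(v) $\displaystyle\int_{\Omega_\beta} \det \nabla u(x)\,dx \le \int_{u(\Omega_\beta)} du$. (BC)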
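(The first four conditions are (generalized) Dirichlet type conditions while the fifth condition guarantees that if $\nabla u \in \mathrm{Lin}^+\mathbb{R}^2$ almost everywhere then u is injective almost everywhere.) Here

$$W_0(F) = (\|F\|^2 - 2\det F)^4 = \big((F_{11} - F_{22})^2 + (F_{21} + F_{12})^2\big)^4$$

and for each $p \in [1, \infty]$ we put

$$\mathscr{A}_p = W^{1,p}(\Omega_\beta; \mathbb{R}^2) \cap C(\Omega_\beta; \mathbb{R}^2) = \mathscr{A}_1 \cap W^{1,p}(\Omega_\beta; \mathbb{R}^2),$$

restricting the admissible mappings u for (P) to be those which are elements of

$$\mathcal{A}_p := \{u \in \mathscr{A}_p \mid \nabla u \in \mathrm{Lin}^+\mathbb{R}^2 \text{ a.e. and } u \text{ satisfies (BC)}\}.$$

For example, for each $p \in [1, \infty]$ the mappings $u: \Omega_\beta \to \Omega_\alpha$ given by

$$u(x) = r(x)^{\delta} \begin{pmatrix} \cos\big(\tfrac{\alpha}{\beta}\theta(x)\big) \\ \sin\big(\tfrac{\alpha}{\beta}\theta(x)\big) \end{pmatrix}$$

belong to $\mathcal{A}_p$ for appropriate choices of δ > 0. We note the following properties of $W_0$:

• $W_0 \in C^\infty(\mathrm{Lin}\,\mathbb{R}^2; \mathbb{R})$; $W_0$ is convex on $\mathrm{Lin}\,\mathbb{R}^2$; $W_0(F) \ge W_0(I) = 0$;
• $W_0$ is materially homogeneous, i.e., not x-dependent;
• $W_0$ is objective and isotropic, i.e., $W_0(QF) = W_0(F) = W_0(FQ)$ for each $Q \in \mathrm{Orth}^+\mathbb{R}^2$.

Next for each $p \in [1, \infty]$ put $i_0(p) = \inf\{J_0[u] \mid u \in \mathcal{A}_p\}$ and for each $\alpha, \beta \in (0, 2\pi)$ with α < β put $p^*_{\beta,\alpha} = 2\beta/(\beta - \alpha)$. We now give the value of $i_0(p)$ for all $p \in [1, \infty] \setminus \{p^*_{\beta,\alpha}\}$.

THEOREM 1. For $\alpha, \beta \in (0, 2\pi)$ with α < β define $u_{am}: \Omega_\beta \to \Omega_\alpha$ as follows

$$u_{am}(x) = r(x)^{\gamma} \begin{pmatrix} \cos(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) \end{pmatrix}, \quad \text{where } \gamma := \frac{\alpha}{\beta} < 1,$$

so that

$$\nabla u_{am}(x) = \gamma\, r(x)^{\gamma - 1} \begin{pmatrix} \cos(\gamma\theta(x)) & -\sin(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) & \cos(\gamma\theta(x)) \end{pmatrix}, \qquad x \in \Omega_\beta.$$

Using the definition of $W_0$ it is clear that

$$W_0(\nabla u_{am}(x)) = 0 \quad \text{for all } x \in \Omega_\beta. \tag{∗}$$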
Now it is easy to verify that $u_{am} \in \mathcal{A}_p$ if and only if $p \in [1, p^*_{\beta,\alpha}) = [1, 2/(1 - \gamma))$. It therefore follows by (∗) that

$$\inf_{\mathcal{A}_p} J_0 = J_0[u_{am}] = 0 \quad \text{for each } p \in [1, p^*_{\beta,\alpha}), \tag{∗∗}$$

whence $u_{am}$ is an “absolute” minimizer for $J_0$ on each $\mathcal{A}_p$, $1 \le p < 2/(1 - \gamma)$. To summarize we may write

$$i_0(p) = 0 \quad \text{for all } p \in [1, p^*_{\beta,\alpha}). \tag{3}$$
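A quick numerical sanity check of (∗) and of the algebraic identity behind $W_0$ is sketched below. It assumes $\|F\|^2 = \operatorname{tr} F^T F$ and uses the matrix form of the gradient of u_am stated in Theorem 1. The function names and the sample value of γ are illustrative.

```python
import math
import random

def W0(F):
    """W0(F) = (|F|^2 - 2 det F)^4 for a 2x2 matrix F = ((F11, F12), (F21, F22))."""
    (a, b), (c, d) = F
    norm_sq = a * a + b * b + c * c + d * d
    return (norm_sq - 2.0 * (a * d - b * c)) ** 4

def W0_alt(F):
    """Equivalent form ((F11 - F22)^2 + (F21 + F12)^2)^4."""
    (a, b), (c, d) = F
    return ((a - d) ** 2 + (c + b) ** 2) ** 4

def grad_u_am(r, theta, gamma):
    """gamma * r^(gamma-1) times a rotation through the angle gamma*theta."""
    s = gamma * r ** (gamma - 1.0)
    co, si = math.cos(gamma * theta), math.sin(gamma * theta)
    return ((s * co, -s * si), (s * si, s * co))

# The two expressions for W0 agree on random matrices.
random.seed(0)
for _ in range(5):
    F = tuple(tuple(random.uniform(-2, 2) for _ in range(2)) for _ in range(2))
    assert math.isclose(W0(F), W0_alt(F), rel_tol=1e-9, abs_tol=1e-12)

gamma = 0.5  # sample value of alpha/beta
worst = max(W0(grad_u_am(r, th, gamma))
            for r in (0.1, 0.5, 0.9) for th in (0.2, 1.0, 2.0))
print("max W0(grad u_am) on samples:", worst)
print("integrability threshold 2/(1 - gamma):", 2 / (1 - gamma))
```

The gradient is a positive multiple of a rotation, so its first form vanishes up to rounding. Its norm behaves like r^(γ−1), which is p-integrable over the sector exactly when p(1 − γ) < 2.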
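THEOREM 2 (Case 1). If γ = α/β < 3/4 then $J_0$ possesses a “pseudo” minimizer $u_{pm} \in \mathcal{A}_p$ for each $p \in [2/(1 - \gamma), 14/(1 + \gamma))$. Namely,

$$u_{pm}(x) = r(x)^{(6 - \gamma)/7} \begin{pmatrix} \cos(\gamma\theta(x)) \\ \sin(\gamma\theta(x)) \end{pmatrix},$$

so that

$$\nabla u_{pm}(x) = r(x)^{-(1 + \gamma)/7} \begin{pmatrix} \frac{6 - \gamma}{7}\cos(\gamma\theta(x)) & -\gamma\sin(\gamma\theta(x)) \\ \frac{6 - \gamma}{7}\sin(\gamma\theta(x)) & \gamma\cos(\gamma\theta(x)) \end{pmatrix}$$

and

$$\det \nabla u_{pm}(x) = \frac{\gamma(6 - \gamma)}{7}\, r(x)^{-2(1 + \gamma)/7} > 0, \qquad x \in \Omega_\beta.$$

Now $u_{pm}$ is a solution to the Euler–Lagrange system for $J_0$ and $u_{pm} \in \mathcal{A}_p$ for all $p \in [2/(1 - \gamma), 14/(1 + \gamma))$, whereas $u_{am} \notin \mathcal{A}_p$ for any $p \ge 2/(1 - \gamma)$, so one has

$$\inf_{\mathcal{A}_p} J_0 = J_0[u_{pm}] = \beta\left[\frac{8}{7}\left(\frac{3}{4} - \gamma\right)\right]^{7} > 0 \quad \text{for all } p \in \left[\frac{2}{1 - \gamma}, \frac{14}{1 + \gamma}\right), \tag{#}$$

which justifies the description of $u_{pm}$ as a “pseudo” minimizer for $p \in [2/(1 - \gamma), 14/(1 + \gamma))$. Moreover the relation (#) also holds for all $p \in [14/(1 + \gamma), \infty]$ although $u_{pm} \notin \mathcal{A}_p$ for these p values.

Case 2. If $\beta > \alpha \ge \frac{3}{4}\beta$ so that 3/4 < γ < 1 then for all $p \in [2/(1 - \gamma), \infty]$ one has the relation

$$\inf_{\mathcal{A}_p} J_0 = J_0[u_{am}] = 0 \tag{##}$$

although $u_{am} \notin \mathcal{A}_p$ for these p values.

In view of Theorems 1 and 2 we see that the Lavrentiev phenomenon does hold for $J_0$ when α/β = γ < 3/4:

$$i_0(p_1) = 0 < i_0(p_2) = \beta\left[\frac{8}{7}\left(\frac{3}{4} - \gamma\right)\right]^{7}, \quad \text{for all } p_1 < \frac{2}{1 - \gamma},\ p_2 > \frac{2}{1 - \gamma}, \tag{}$$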
whereas the Lavrentiev phenomenon does not occur when γ > 3/4:

$$\gamma > \frac{3}{4} \;\Rightarrow\; i_0(p) = 0 \quad \text{for all } p \in [1, \infty].$$
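The constant in (#) can be recovered by a short computation. The following derivation is a sketch that uses only the stated form of the gradient of u_pm and the invariance $W_0(QF) = W_0(F) = W_0(FQ)$. It assumes the gradient matrix in Theorem 2 is written with respect to polar directions on the right, which does not affect $W_0$ by isotropy. The symbols a and R(φ) are introduced here for brevity.

```latex
% With a := (6-\gamma)/7 and R(\phi) the rotation through the angle \phi:
\nabla u_{pm} = r^{-(1+\gamma)/7}\, R(\gamma\theta)\,\mathrm{diag}(a,\gamma)
\;\Longrightarrow\;
W_0(\nabla u_{pm}) = (a-\gamma)^{8}\, r^{-8(1+\gamma)/7}
 = \Big[\tfrac{8}{7}\big(\tfrac34-\gamma\big)\Big]^{8} r^{-8(1+\gamma)/7}.
% Integrating over the sector \Omega_\beta (area element r\,dr\,d\theta):
J_0[u_{pm}]
 = \beta \Big[\tfrac{8}{7}\big(\tfrac34-\gamma\big)\Big]^{8}
   \int_0^1 r^{\,1-8(1+\gamma)/7}\,dr
 = \beta \Big[\tfrac{8}{7}\big(\tfrac34-\gamma\big)\Big]^{8}
   \frac{7}{8\big(\tfrac34-\gamma\big)}
 = \beta \Big[\tfrac{8}{7}\big(\tfrac34-\gamma\big)\Big]^{7}.
```

The radial integral converges exactly when 8(1 + γ)/7 < 2, that is when γ < 3/4, which is the threshold separating the two cases of Theorem 2.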
We note that the validity of (#) and (##) when γ > 3/4 is demonstrated by constructing sequences $\{u^{(n)}\} \subset \mathcal{A}_\infty$ such that $u^{(n)}$ converges weakly to $u_{pm}$ (respectively, to $u_{am}$) for the given values of p.

Next we quote a fundamental existence theorem due to Ball [2].

THEOREM 3. Suppose $W: \mathrm{Lin}^+\mathbb{R}^2 \to \mathbb{R}$ satisfies

(i) W is polyconvex: i.e., there is a continuous jointly convex function $g: \mathrm{Lin}\,\mathbb{R}^2 \times (0, \infty) \to \mathbb{R}$ such that $W(F) = g(F, \det F)$, $\forall F \in \mathrm{Lin}^+\mathbb{R}^2$;
(ii) There are $p_0 \ge 2$ and $K_1, K_2 > 0$ such that $g(F, \lambda) \ge K_1\|F\|^{p_0} - K_2$, $\forall (F, \lambda) \in T = \mathrm{Lin}^+\mathbb{R}^2 \times (0, \infty)$;
(iii) $g(F, \lambda) \to +\infty$ as $(F, \lambda) \to \partial T$, in particular as $\det F \to 0+$ and as $\det F \to +\infty$ (by convention $W(F) = +\infty$ for $F \in \mathrm{Lin}\,\mathbb{R}^2 \setminus \mathrm{Lin}^+\mathbb{R}^2$).

Then for any $\beta \in (0, \frac{4}{3}\pi)$ and the stored energy functional $J: W^{1,1}(\Omega_\beta; \mathbb{R}^2) \to (-\infty, \infty]$ defined by

$$J[u] = \int_{\Omega_\beta} W(\nabla u)\,dx,$$
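there exists $u_{am} \in \mathcal{A}_{p_0}$ such that $J[u_{am}] = \inf\{J[u] \mid u \in \mathcal{A}_1\} = \inf\{J[u] \mid u \in \mathcal{A}_{p_0}\}$ provided $J[u] < \infty$ for at least one $u \in \mathcal{A}_{p_0}$.

We now adapt our previous result to the elasticity context. Given $K \in [2, 4)$, let $P_K(F) = K(\det F)^{-1} + 3^{(2 - K)/2}(1 + \|F\|^2)^{K/2}$ and define $W_{\varepsilon,K}: \mathrm{Lin}^+\mathbb{R}^2 \to [0, \infty)$ as follows:

$$W_{\varepsilon,K}(F) = \big(\|F\|^2 - 2\det F\big)^4 + \varepsilon\Big[K(\det F)^{-1} + 3^{(2 - K)/2}(1 + \|F\|^2)^{K/2}\Big] = W_0(F) + \varepsilon P_K(F).$$

Consider the following variational problem for given $p \in [1, \infty]$:

$$J_{\varepsilon,K}[u] = \int_{\Omega_\beta} W_{\varepsilon,K}(\nabla u)\,dx \to \inf \quad \text{for } u \in \mathcal{A}_p. \tag{P$_{\varepsilon,K}$}$$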
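We note the following properties of $W_{\varepsilon,K}$ for each ε > 0, K ≥ 2 (by convention $W_{\varepsilon,K}(F) = +\infty$ for $F \in \mathrm{Lin}\,\mathbb{R}^2 \setminus \mathrm{Lin}^+\mathbb{R}^2$):

• $W_{\varepsilon,K} \in C^\infty(\mathrm{Lin}^+\mathbb{R}^2; \mathbb{R})$; $W_{\varepsilon,K}(F) \gtrsim \|F\|^K$; $W_{\varepsilon,K}(F) \ge W_{\varepsilon,K}(I) > 0$;
• $W_{\varepsilon,K}$ is materially homogeneous, i.e., x-independent;
• $W_{\varepsilon,K}$ is objective and isotropic;
• $W_{\varepsilon,K}$ is polyconvex;
• $W_{\varepsilon,K}(F) \to +\infty$ as $F \to \partial T = \partial\,\mathrm{Lin}^+\mathbb{R}^2 \times (0, \infty)$.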
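The role of the factor 3^((2−K)/2) in the penalty term can be seen numerically: it places the minimum of P_K at the identity, where P_K(I) = K + 3. The sketch below samples random matrices with positive determinant. The small parameter is written `eps` on the assumption that the stored energy has the form W0 + eps·P_K, and the sampling scheme and function names are illustrative choices.

```python
import math
import random

def P(F, K):
    """P_K(F) = K/det F + 3^((2-K)/2) * (1 + |F|^2)^(K/2), for det F > 0."""
    (a, b), (c, d) = F
    det = a * d - b * c
    norm_sq = a * a + b * b + c * c + d * d
    return K / det + 3.0 ** ((2.0 - K) / 2.0) * (1.0 + norm_sq) ** (K / 2.0)

def W(F, K, eps):
    """Assumed form W = W0 + eps * P_K with W0(F) = (|F|^2 - 2 det F)^4."""
    (a, b), (c, d) = F
    w0 = ((a - d) ** 2 + (c + b) ** 2) ** 4
    return w0 + eps * P(F, K)

I2 = ((1.0, 0.0), (0.0, 1.0))
K, eps = 3.0, 0.01   # sample values, chosen here for illustration

random.seed(1)
samples = []
while len(samples) < 1000:
    F = tuple(tuple(random.uniform(-3, 3) for _ in range(2)) for _ in range(2))
    if F[0][0] * F[1][1] - F[0][1] * F[1][0] > 1e-3:   # keep det F > 0
        samples.append(F)

gap = min(W(F, K, eps) - W(I2, K, eps) for F in samples)
print("P_K(I) =", P(I2, K), " K + 3 =", K + 3)
print("min over samples of W(F) - W(I):", gap)
```

The sampled gap stays nonnegative, in line with the inequality W(F) ≥ W(I) listed above. One way to see it: since |F|² ≥ 2 det F, P_K(F) is bounded below by a convex function of det F alone whose derivative vanishes at det F = 1.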
Thus $W_{\varepsilon,K}$ possesses those properties which are typical for physically natural stored energy densities and in addition satisfies the conditions of Theorem 3 (Ball’s existence theorem). We may now state our main result writing $i_{\varepsilon,K}$ for the obvious infimum as a function of p.

THEOREM 4. Given $\beta \in (0, \frac{4}{3}\pi)$ and $\alpha \in (0, \frac{3}{4}\beta)$ there are for each $K \in [2, 2/(1 - \gamma))$ numbers $p^*_{\beta,\alpha} = 2/(1 - \gamma)$ and $\varepsilon_{\beta,\alpha} = \varepsilon_{\beta,\alpha}(K) > 0$ such that if $\varepsilon < \varepsilon_{\beta,\alpha}$ then the Lavrentiev phenomenon is present in the following sense:

$$0 < i_{\varepsilon,K}(p_1) = \inf\{J_{\varepsilon,K}(u) \mid u \in \mathcal{A}_{p_1}\} < i_{\varepsilon,K}(p_2) \tag{,K}$$

whenever $p_1 < 2/(1 - \gamma) < p_2$. It can be shown that under the constraints we have given the minimizers $u^{\varepsilon,K}_{am}$ for (P$_{\varepsilon,K}$) above, whose existence is guaranteed by Ball’s theorem (Theorem 3), are continuous mappings, thus avoiding the cavitation issue referred to on page 3. For proofs and additional discussion see [7].

REMARK 1. Although our boundary condition (i) completely prescribes the displacement on $\Gamma_{3,\beta}$, conditions (ii) and (iii) only partially prescribe the displacement on $\Gamma_{1,\beta}$ and $\Gamma_{2,\beta}$. We do not know if the Lavrentiev phenomenon can occur for problems in which the displacement is completely prescribed on the entire boundary (i.e., for standard Dirichlet type boundary conditions).

REMARK 2. In order to give some insight into the special properties of $W_0$ it is useful to introduce complex notation. If we write $z = x_1 + ix_2$, $f = u_1 + iu_2 = re^{i\theta(u)}$ then $W_0(\nabla u) = (4|\bar\partial f|^2)^4$ and the Euler–Lagrange system becomes

$$\partial\big(|\bar\partial f|^6\, \bar\partial f\big) = 0, \quad \text{where} \quad \partial = \frac{1}{2}\left(\frac{\partial}{\partial x_1} - i\frac{\partial}{\partial x_2}\right) \quad \text{and} \quad \bar\partial = \frac{1}{2}\left(\frac{\partial}{\partial x_1} + i\frac{\partial}{\partial x_2}\right).$$

This allows one to explicitly construct solutions of the Euler–Lagrange equations. Convexity arguments can be used to prove that these solutions are minimizers. Similar results apply to elastic materials associated with $W_0(F) = (\|F\|^2 - 2\det F)^q$, for each exponent q > 1.

REMARK 3. We have not succeeded in demonstrating that there are inherently three-dimensional physically natural boundary problems in which the Lavrentiev phenomenon occurs for some physically natural elastic material.
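As an illustration of Remark 2, the exponent (6 − γ)/7 of the pseudo-minimizer can be recovered from the complex form of the Euler–Lagrange system. The sketch below tries the ansatz f = r^a e^{iγθ} with an unknown real exponent a. The ansatz, the symbol a and the constant C are introduced here for illustration and are not taken from the source.

```latex
% Ansatz: f = r^{a} e^{i\gamma\theta} = z^{(a+\gamma)/2}\,\bar z^{(a-\gamma)/2},
% with z = r e^{i\theta}.
\bar\partial f = \tfrac{a-\gamma}{2}\, z^{(a+\gamma)/2}\,\bar z^{(a-\gamma)/2-1},
\qquad
|\bar\partial f| = \tfrac{|a-\gamma|}{2}\, r^{a-1}.
% Using r^{2} = z\bar z:
|\bar\partial f|^{6}\,\bar\partial f
 = C\, z^{\,3(a-1)+(a+\gamma)/2}\;\bar z^{\,3(a-1)+(a-\gamma)/2-1},
\qquad
C = \Big(\tfrac{|a-\gamma|}{2}\Big)^{6}\,\tfrac{a-\gamma}{2}.
% \partial annihilates this expression exactly when C = 0
% or the power of z vanishes:
a = \gamma
\quad\text{or}\quad
3(a-1)+\tfrac{a+\gamma}{2} = 0 \iff a = \tfrac{6-\gamma}{7}.
```

The first root gives f = z^γ, the absolute minimizer of Theorem 1, and the second gives the pseudo-minimizer of Theorem 2.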
Acknowledgements

Mizel wishes to express his appreciation to T.J. Healey for stimulating comments after his talk. Others contributing earlier stimulating comments to the research are acknowledged in [7]. In addition, the authors wish to express appreciation to three anonymous referees for their helpful comments. Finally, Mizel wishes to express his appreciation to the U.S. National Science Foundation for its partial support of his research under Grant 0072816 and Foss expresses his appreciation to the U.S. National Science Foundation for partial support of his research under its VIGRE program.

References
1. G. Alberti and P. Majer, Gap phenomenon for some autonomous functionals. J. Convex Anal. 1 (1994) 31–45.
2. J.M. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 63 (1977) 337–403.
3. J.M. Ball, Discontinuous equilibrium solutions and cavitation in nonlinear elasticity. Philos. Trans. Roy. Soc. London Ser. A 305 (1982) 557–611.
4. J.M. Ball and V.J. Mizel, One-dimensional variational problems whose minimizers do not satisfy the Euler–Lagrange equation. Arch. Rational Mech. Anal. 90 (1985) 325–388.
5. M. Belloni, Interpretation of Lavrentiev phenomenon by relaxation: The higher order case. Trans. Amer. Math. Soc. 347 (1995) 2011–2023.
6. L. Cesari, Optimization Theory and Applications. Springer, New York (1983).
7. M. Foss, W.J. Hrusa and V.J. Mizel, The Lavrentiev gap phenomenon in nonlinear elasticity. Arch. Rational Mech. Anal. 167 (2003) 337–365.
8. A.C. Heinricher and V.J. Mizel, The Lavrentiev phenomenon for invariant variational problems. Arch. Rational Mech. Anal. 102 (1988) 57–93.
Steady Flow of a Navier–Stokes Fluid around a Rotating Obstacle

GIOVANNI P. GALDI
Department of Mechanical Engineering, University of Pittsburgh, Pittsburgh 15261 PA, U.S.A.

Received 7 October 2002; in revised form 7 January 2003

Abstract. Let B be a body immersed in a Navier–Stokes liquid L that fills the whole space. Assume that B rotates with prescribed constant angular velocity ω. We show that if the magnitude of ω is not “too large”, there exists one and only one corresponding steady motion of L such that the velocity field v(x) and its gradient grad v(x) decay like $|x|^{-1}$ and $|x|^{-2}$, respectively. Moreover, the pressure field p(x) and its gradient grad p(x) decay like $|x|^{-2}$ and $|x|^{-3}$, respectively. These solutions are “physically reasonable” in the sense of Finn. In particular, they are unique and satisfy the energy equation. This result is relevant to several applications, including sedimentation of heavy particles in a viscous liquid.

Mathematics Subject Classifications (2000): 35Q30, 76N10, 76D07.

Key words: rotating obstacle, Navier–Stokes, steady state, asymptotic behavior.
To Clifford Truesdell, in memoriam
Work partially supported by NSF grant DMS-0103970.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 437–467. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The steady motion of a liquid past a rigid body, B, translating with a constant velocity is among the oldest and most fundamental questions in theoretical and applied fluid dynamics [24]. In fact, the first significant contributions to the subject date back to the work of Stokes [35], Kirchhoff [21], and Thomson (Lord Kelvin) and Tait [37]. In view of its complexity, a systematic and rigorous mathematical study of the problem for a Navier–Stokes liquid, L, was initiated only much later, through the fundamental work of Oseen [31], Odqvist [30], and Leray [25, 26], and only a few decades ago was it further deepened and, under certain aspects, completed, as a result of the efforts of several mathematicians including Ladyzhenskaya [22], Fujita [10], Finn [9] and Babenko [3]; see also [12, 15]. The main achievement of these works is the proof of existence of steady solutions that exhibit all the main features expected from a physical point of view.
In particular, they are unique for small data, satisfy the global energy balance (energy equation) and show a wake behind the body, that is, they are “physically reasonable” in the sense of Finn [9]. Furthermore, they are stable and attainable from rest for sufficiently small data. It is important to emphasize that all the above properties can be secured only through the knowledge of the asymptotic behavior of the solutions at large distances. Moreover, they hold under the crucial assumption that the motion of B is purely translatory (no spin).

Recently, the present author has started a mathematical analysis of sedimentation of rigid bodies in a Navier–Stokes liquid (see [14] and the references cited therein). This problem, which is at the foundation of several engineering applications like manufacturing of short-fiber composites [2], separation of macromolecules by electrophoresis [36], flow-induced microstructures [20], and blood flow problems [32], consists in studying the existence, stability and attainability of terminal states that are eventually achieved (as time goes to infinity) by a rigid body of negative buoyancy that is dropped from rest in a Navier–Stokes liquid. (Similar problems are also of great interest for visco-elastic non-Newtonian fluids [38]; see [14].) Here, by “terminal state” we mean a state of motion where the body moves with constant translational and angular velocities with respect to an inertial frame, while the flow of the liquid, as observed from a frame attached to the body, is steady [39]. A significant result of D. Serre shows that, for B of arbitrary shape and mass and for L of arbitrary density and viscosity, the set of terminal states is not empty [33]. However, solutions obtained by Serre are “weak”, in the sense that their corresponding velocity field v vanishes at infinity a priori only in a generalized sense (specifically, the average of |v| over the unit sphere vanishes at large spatial distances) and, consequently, it is not known if they are “physically reasonable”.

Recently, the present author has shown that, for these solutions, v and the corresponding pressure p tend to zero at large distances uniformly pointwise [14]. However, this result is not enough to furnish the validity of the basic physical properties mentioned above, which require for v and p an order of decay with the distance r from B like $r^{-1}$ and $r^{-2}$, respectively [4]; see also [5, 18, 19] and the references therein. We wish to emphasize that even the proof of the uniform pointwise convergence, which in the absence of rotation is obtained quite straightforwardly [12, Theorem IX.6.1], requires a substantial effort if B is allowed to spin; see Section 4.2.2 in [14].

In order to understand why the problem becomes difficult if B is rotating versus translating, we recall that the method typically employed in the study of the asymptotic structure of a steady solution in an exterior domain [12, 27, 8] relies upon the proof of existence and of appropriate estimates of solutions to the linearized problem, in conjunction with a suitable fixed point argument. In turn, this proof is typically achieved by showing appropriate estimates of the fundamental solution for the relevant linear operator. Now, if B is only translating, the linearized operator is the well-known Oseen operator, $L_T$, which is obtained from the (second order)
Stokes operator by adding a lower (first) order term in the velocity field v, with constant coefficients. If, on the other hand, the body is rotating with angular velocity ω, the corresponding linearized operator, $L_{TR}$, also includes the first order term ω × x · grad v, with x a generic point in the region occupied by L; see equation (2.1). This term has two undesired features, related to its coefficient ω × x. The first is that this coefficient depends on x, and the other, more important, is that it becomes unbounded at large distances from B. It should be added that the fundamental solution for the operator $L_{TR}$ is known [5], but due to its very complicated form, any reasonable attempt to furnish appropriate estimates appears to be unwieldy and extremely difficult. Moreover, other methods, like the Fourier transform in conjunction with the theory of multipliers, that have been successfully employed in the case of the operator $L_T$ [12, Chapters IX and X], seem to fail in the case of the operator $L_{TR}$ or, at least, do not seem to provide valuable information.

The present paper is devoted to existence, uniqueness and asymptotic behavior of steady solutions to the Navier–Stokes equation in the exterior of a rotating body. In particular, we show that, if the angular velocity ω of the body is not “too large”, a unique solution exists, whose velocity field v decays to zero as $|x|^{-1}$, and grad v decays as $|x|^{-2}$. Moreover, the corresponding pressure field p and grad p behave as $|x|^{-2}$ and $|x|^{-3}$, respectively. It is interesting to observe that these are exactly the same asymptotics as for the linear Stokes problem [11, Theorem V.3.2], obtained by setting ω = 0 and by disregarding the nonlinear terms in the relevant Navier–Stokes equations (see (2.1)). By a standard argument, it follows that our solutions satisfy the energy equation (see (2.3)).
From the work of Borchers [4], it also follows that they are nonlinearly asymptotically stable in the sense of Liapunov, in suitable norms.

The method we use to show the above results is based on obtaining the estimates for solutions to the linear problem as the limit, as time goes to infinity, of analogous estimates proved for solutions of the corresponding initial value problem. Actually, by means of a suitable transformation of coordinates, the latter becomes an initial value problem for the heat equation. So, ultimately, the estimates for solutions to the steady linearized problem are reduced to finding the same estimates, uniformly in time, for solutions of an initial value problem for the heat equation. This is done in a relatively simple way, because the fundamental solution of the heat equation is much simpler to handle than the fundamental solution of the operator $L_{TR}$.

Since the main mathematical difficulty comes from rotation, for simplicity of argument, in the present paper we have assumed that the body just rotates, without translating. However, the method we use is quite flexible and can be extended to cover more general cases. This will be the object of future work.

Finally, it should be emphasized that, even though our analysis was motivated by the problem of sedimentation, it is, of course, of independent interest and can be applied to other significant physical problems, like the evaluation of torques and forces on B; see [16, 17] and the references cited therein (see also Section 6).
The paper is organized as follows. In Section 2 we formulate the problem and state the main result. Section 3 is devoted to the study of a suitable linear problem in the whole space. Using the results of Section 3, in Section 4 we show existence, uniqueness and corresponding estimates for solutions to a linearized problem in exterior domains. The results of Section 4 are employed in Section 5, where the proof of the main result is presented. We end the paper with a final Section 6, which includes, among other things, further possible applications of our result.

2. Formulation of the Problem and Main Result

We begin by introducing some notation. $\mathbb{R}^3$ is the Euclidean 3-dimensional space and $\{e_1, e_2, e_3\}$ is the associated canonical basis. For $a > 0$, $x \in \mathbb{R}^3$, we set
$$B_a(x) = \{y \in \mathbb{R}^3 : |y - x| < a\}, \qquad B^a(x) = \{y \in \mathbb{R}^3 : |y - x| > a\}.$$
If x = 0, we shall simply write $B_a$ and $B^a$, respectively. If A is a domain of $\mathbb{R}^3$, we denote by δ(A) its diameter. Moreover, we set $A_a = A \cap B_a$ and $A^a = A \cap B^a$. If f is a scalar, vector or tensor function defined in A and k is a nonnegative integer, we set
$$[|f|]_k = \operatorname*{ess\,sup}_{x \in A} \big|(|x|^k + 1) f(x)\big|,$$
where $|\cdot|$ denotes absolute value or modulus, depending on whether f is a scalar, vector or tensor field. If $A'$ is a subdomain of A we shall write
$$[|f|]_{k,A'} = \operatorname*{ess\,sup}_{x \in A'} \big|(|x|^k + 1) f(x)\big|.$$
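As a minimal numerical sketch of the weighted norm just defined (it is not part of the original paper), the following Python snippet approximates the essential supremum by a maximum over finitely many sample points. The function name `weighted_sup_norm`, the test field `sample_field` and the choice of sample points are illustrative assumptions, and a finite sample can only give a lower bound for the true norm.

```python
import math

def weighted_sup_norm(f, points, k):
    """Sampled approximation of [|f|]_k = ess sup (|x|^k + 1) |f(x)|.

    f      : callable mapping a point (x1, x2, x3) to a scalar value
    points : iterable of 3-tuples (a hypothetical sample of the domain A)
    k      : nonnegative integer weight exponent
    The maximum over a finite sample is only a lower bound for the ess sup.
    """
    best = 0.0
    for x in points:
        r = math.sqrt(sum(c * c for c in x))
        best = max(best, (r ** k + 1.0) * abs(f(x)))
    return best

# Illustrative scalar field decaying like |x|^{-2}: f(x) = 1 / (|x|^2 + 1).
def sample_field(x):
    return 1.0 / (sum(c * c for c in x) + 1.0)

# Sample points along a ray, out to |x| = 1000.
ray = [(0.1 * i, 0.0, 0.0) for i in range(10001)]

norm_k2 = weighted_sup_norm(sample_field, ray, 2)  # stays bounded
norm_k3 = weighted_sup_norm(sample_field, ray, 3)  # grows with the sample radius
```

In this toy case the value for k = 2 stays close to 1 however far the sample extends, whereas the value for k = 3 keeps growing with the sample radius. This is the sense in which finiteness of $[|f|]_k$ encodes decay of order $|x|^{-k}$.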
$L^q(A)$, $W^{m,q}(A)$, $W_0^{m,q}(A)$, $W_{loc}^{m,q}(\bar{A})$, $m \ge 0$, $1 < q \le \infty$, denote the usual Lebesgue and Sobolev spaces [1]. Norms in $L^q(A)$ and $W^{m,q}(A)$ are denoted by $\|\cdot\|_{q,A}$, $\|\cdot\|_{m,q,A}$. Unless confusion arises, in the above norms we shall drop the subscript "A". The trace space on ∂A for functions from $W^{m,q}(A)$ will be denoted by $W^{m-1/q,q}(\partial A)$ and its norm by $\|\cdot\|_{m-1/q,q,\partial A}$. By $D^{k,q}(A)$, $k \ge 1$, $1 < q < \infty$, we indicate the homogeneous Sobolev space of order (k, q) on A [11, 34], that is, the class of functions u that are (Lebesgue) locally integrable in A and with $D^\beta u \in L^q(A)$, $|\beta| = k$. Finally, given a Banach space X and an open real interval (a, b), we denote by $W^{m,q}(a, b; X)$ the linear space of (equivalence classes of) functions $f : (a, b) \to X$ whose X-norm is in $W^{m,q}(a, b)$. Typically, we shall use the symbol c to denote a constant whose numerical value or dependence on parameters is not essential to our aims. In such a case, c may have several different values in a single computation. For example, we may have, in the same line, $2c \le c$.

Footnote: Let X be any space of real functions. As a rule, we shall use the same symbol X to denote the corresponding space of vector and tensor-valued functions.

In this paper we shall study the steady-state motions of a viscous fluid around a rotating obstacle. Specifically, let B be a rigid body that uniformly rotates, with constant angular velocity ω, in a viscous liquid L filling the entire space. We assume that L is described by the Navier–Stokes model, and that the motion of L as seen from a frame S attached to B is steady. Then, the relevant nondimensional equations, written with respect to S, are given by (see, e.g., [14])
$$\begin{aligned}
&\left.\begin{aligned}
Re\,(v \cdot \operatorname{grad} v - \mu \times x \cdot \operatorname{grad} v + \mu \times v) &= \Delta v - \operatorname{grad} p\\
\operatorname{div} v &= 0
\end{aligned}\right\} \quad \text{in } \Omega,\\
&\lim_{|x| \to \infty} v(x) = 0,\\
&v(x) = \mu \times x, \quad x \in \partial\Omega.
\end{aligned} \tag{2.1}$$
Here $Re = |\omega| d^2/\nu$ is the appropriate Reynolds number, d is the diameter of B, ν is the kinematic viscosity of L, $\mu = \omega/|\omega|$, and Ω (the exterior of B) is the domain occupied by L. The main goal of this paper is to prove the following existence and uniqueness theorem for problem (2.1).

THEOREM 2.1. Let Ω be of class $C^2$, and let $R > \delta(B)$, $q > 1$. Then, there exists a constant $Re_0 > 0$ depending only on Ω, R and q, such that if $Re < Re_0$, problem (2.1) admits one and only one solution v, p satisfying
$$\begin{aligned}
&\|v\|_{2,q,\Omega_R} + \|D^2 v\|_2 + [|v|]_1 + [|\operatorname{grad} v|]_2 < \infty,\\
&[|p|]_2 + [|\operatorname{grad} p|]_{3,\Omega^R} < \infty.
\end{aligned} \tag{2.2}$$
Moreover, $v, p \in C^\infty(\Omega)$.

REMARK 2.1. (i) For the sake of simplicity, we are assuming that the body force b acting on the fluid is zero. However, the result of Theorem 2.1 can be easily extended to cover the case $b \ne 0$. For example, as is clear from the proof that we shall give, one can show that if b = div F, with $F = \{F_{ij}\}$ a second-order tensor field satisfying the assumptions (i)–(iii) of Theorem 4.1, there exists one and only one corresponding solution v, p in the class (2.2), provided Re is suitably small. In addition, the solution satisfies the following estimate
$$\|v\|_{2,q,\Omega_R} + \|D^2 v\|_2 + [|v|]_1 + [|\operatorname{grad} v|]_2 + [|p|]_2 + [|\operatorname{grad} p|]_{3,\Omega^R} \le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + 1\big).$$
The differentiability of this solution will depend, of course, on the degree of smoothness of F. If, in particular, $F \in C^\infty(\Omega)$, then $v, p \in C^\infty(\Omega)$.

(ii) Using the spatial asymptotic properties of solutions of Theorem 2.1, one can easily show that they satisfy the energy equation:
$$\int_\Omega D(v) : D(v) = \mu \cdot \int_{\partial\Omega} x \times T(v, p) \cdot n, \tag{2.3}$$
where D(v) is the stretching tensor, T = 2D + pI, I is the identity tensor and n is the unit outer normal to ∂Ω. These solutions are (clearly) also unique, and their velocity field decays at large distances as $|x|^{-1}$. Therefore, they are physically reasonable in the sense of Finn [9].

(iii) From the work of Borchers [4] it follows that solutions of Theorem 2.1 are nonlinearly stable in the sense of Liapounov, for sufficiently small Reynolds number. In particular, every dynamical perturbation that is initially in $L^2(\Omega)$ decays to zero in suitable norms as $t \to \infty$.

Footnote: We adopt the summation convention over repeated indices.
Footnote: Unless confusion may arise, we shall omit in the integrals the infinitesimal volume or surface of integration.

The proof of Theorem 2.1 will be achieved through several steps, obtained in the following two sections.

3. A Linear Problem in $\mathbb{R}^3$

LEMMA 3.1. Let $F = \{F_{ij}\}$ be a second-order tensor field in $\mathbb{R}^3$ such that
$$[|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 < \infty.$$
Moreover, let
$$f \in W_0^{1,q}(B_\rho), \quad \text{some } \rho > 0 \text{ and } q > 3.$$
Then, the problem
$$\Delta u = \partial_j \partial_i F_{ij} + \operatorname{div} f \quad \text{in } \mathbb{R}^3 \tag{3.4}$$
has one and only one solution such that
$$\begin{aligned}
[|u|]_2 &\le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + \|f\|_{q,B_\rho}\big),\\
[|\operatorname{grad} u|]_3 &\le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_i \partial_j F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\big),\\
\|D^2 u\|_{s,\mathbb{R}^3} &\le c\,\big([|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho}\big), \quad \text{all } s \in (1, q],
\end{aligned} \tag{3.5}$$
where c is a positive constant.

Proof. Set $G = \partial_i F_{ij} e_j + f$. Then, by assumption, G and div G belong to $L^s(\mathbb{R}^3)$ for all $s \in [1, q]$. Therefore, from well-known results, it follows that there exists one and only one solution u to (3.4) such that
$$\begin{aligned}
\|u\|_{s_1,\mathbb{R}^3} + \|\operatorname{grad} u\|_{s,\mathbb{R}^3} &\le c\,\big([|\partial_i F_{ij} e_j|]_3 + \|f\|_{q,B_\rho}\big),\\
\|D^2 u\|_{s,\mathbb{R}^3} &\le c\,\big([|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho}\big)
\end{aligned} \tag{3.6}$$
for all $s \in (1, q]$, $s_1 > 3/2$. The second of these estimates is just the third inequality in (3.5). The Sobolev embedding theorem along with (3.6) implies that u and grad u are essentially bounded on the ball $B_R$ of $\mathbb{R}^3$ of arbitrary finite radius R > 0, and that the following inequality holds
$$\|u\|_{\infty,B_R} + \|\operatorname{grad} u\|_{\infty,B_R} \le C_R\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_i \partial_j F_{ij}|]_4 + \|\operatorname{div} f\|_{q,B_\rho} + \|f\|_{q,B_\rho}\big). \tag{3.7}$$
Furthermore, again from well-known results and from (3.6), we have that u admits the following representation for all $x \in \mathbb{R}^3$:
$$u(x) = \int_{\mathbb{R}^3} E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy + \int_{\mathbb{R}^3} E(x - y)\operatorname{div} f(y)\,dy \equiv u_1(x) + u_2(x), \tag{3.8}$$
where E(ξ) is the fundamental solution to Laplace's equation in dimension three. We recall that
$$|D^\alpha E(\xi)| \le c\,|\xi|^{-1-|\alpha|}, \quad \text{for all } |\alpha| \ge 0 \text{ and for } \xi \ne 0. \tag{3.9}$$
Integrating by parts, we obtain
$$u_2(x) = -\int_{\mathbb{R}^3} \partial_i E(x - y) f_i(y)\,dy. \tag{3.10}$$
Since
$$x \in B^{2\rho},\ y \in B_\rho \implies |x - y| \ge \tfrac12 |x|, \tag{3.11}$$
using (3.9) we get
$$|x|^2 |u_2(x)| \le c\,\|f\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.12}$$
Differentiating (3.10) once, we obtain
$$\partial_k u_2(x) = -\int_{\mathbb{R}^3} \partial_k \partial_i E(x - y) f_i(y)\,dy.$$
Using in this equation (3.11) and (3.9), we recover
$$|x|^3 |\operatorname{grad} u_2(x)| \le c\,\|f\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.13}$$
We next estimate the first integral in (3.8). Taking into account the asymptotic properties of E and of $F_{ij}$, it is easy to see that, for every fixed x, we can perform integration by parts to get
$$u_1(x) = -\int_{\mathbb{R}^3} \partial_j E(x - y)\,\partial_i F_{ij}(y)\,dy, \quad \text{for all } x \in \mathbb{R}^3. \tag{3.14}$$

Footnote: In fact, we can show that from (3.6) we have that u and grad u are essentially bounded in the whole of $\mathbb{R}^3$, but this is irrelevant for the rest of the proof.
Footnote: Notice that $\partial_j \partial_i F_{ij} \in L^q(\mathbb{R}^3)$ for all $q \ge 1$.
Since F satisfies (i) and (ii), from Lemma 2.5 of [29], it then follows that
$$|x|^2 |u_1(x)| \le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3\big), \quad |x| > 1.$$
This estimate, together with (3.12) and (3.7), in turn proves the first inequality in (3.5). In order to show the second inequality, we observe that
$$\partial_k u_1(x) = \int_{\mathbb{R}^3} \partial_k E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy. \tag{3.15}$$
To estimate the integral on the right-hand side of (3.15), we set |x| = R > 2 and split $\mathbb{R}^3$ as $B_{R/2} \cup B^{R/2}$, and denote the corresponding contributions of the integral over the two regions by $I_1$ and $I_2$, respectively. We also set, for simplicity,
$$N_0 = [|F|]_2, \quad N_1 = [|\partial_i F_{ij} e_j|]_3, \quad N_2 = [|\partial_j \partial_i F_{ij}|]_4.$$
By a double integration by parts, we have
$$I_1 = \int_{B_{R/2}} \partial_i \partial_j \partial_k E(x - y) F_{ij}(y)\,dy - \int_{\partial B_{R/2}} \partial_j \partial_k E(x - y) F_{ij}(y) n_i(y)\,d\sigma_y + \int_{\partial B_{R/2}} \partial_k E(x - y)\,\partial_i F_{ij}(y) n_j(y)\,d\sigma_y.$$
Taking into account that
$$y \in B_{R/2} \implies |x - y| \ge \tfrac12 R, \qquad y \in \partial B_{R/2} \implies |y| = \tfrac12 R, \tag{3.16}$$
by (3.9) and the assumptions (i) and (ii), we find
$$|I_1| \le c\left(\frac{N_0}{R^4}\int_{B_{R/2}} \frac{dy}{1 + |y|^2} + \frac{N_0}{R^3} + \frac{N_1}{R^3}\right) \le c\,\frac{N_0 + N_1}{R^3}. \tag{3.17}$$
Next, since
$$y \in B^{R/2} \implies \begin{cases} |y| \ge R/2,\\ |\partial_j \partial_i F_{ij}(y)| \le c N_2/|y|^4 \end{cases} \tag{3.18}$$
by assumption (iii) we have
$$|I_2| = \left|\int_{B^{R/2}} \partial_k E(x - y)\,\partial_j \partial_i F_{ij}(y)\,dy\right| \le c\,\frac{N_2}{R^2}\int_{\mathbb{R}^3} \frac{dy}{|x - y|^2 |y|^2} \le c\,\frac{N_2}{R^3}, \tag{3.19}$$
where, in the last inequality, we have used a classical estimate on weakly singular integrals (see, e.g., [11, Lemma II.7.2]). From (3.17) and (3.19) we then conclude
$$|x|^3 |\operatorname{grad} u(x)| \le c\,(N_0 + N_1 + N_2), \quad |x| > 2,$$
which together with (3.13) and (3.7) proves the second estimate in (3.5). The proof of the lemma is thus accomplished. □
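LEMMA 3.2. Let F and f be tensor and vector fields, respectively, satisfying the assumptions of Lemma 3.1. Then, the problem
$$\left.\begin{aligned}
\Delta u + Re\,(\mu \times x \cdot \operatorname{grad} u - \mu \times u) &= \operatorname{grad} \phi + \partial_i F_{ij} e_j + f\\
\operatorname{div} u &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3 \tag{3.20}$$
has one and only one solution such that
$$\begin{aligned}
&u \in W_{loc}^{2,2}(\mathbb{R}^3) \cap D^{2,2}(\mathbb{R}^3) \cap D^{1,2}(\mathbb{R}^3) \cap L^6(\mathbb{R}^3) \cap L^\infty(\mathbb{R}^3),\\
&\phi \in W^{1,r}(\mathbb{R}^3) \cap D^{2,s}(\mathbb{R}^3), \quad \text{all } q \ge s > 1,\ r > 3/2,\\
&[|\phi|]_2 + [|\operatorname{grad} \phi|]_3 < \infty.
\end{aligned} \tag{3.21}$$
Moreover, the following estimate holds
$$\|u\|_\infty + \|u\|_6 + \|\operatorname{grad} u\|_2 + \|D^2 u\|_2 + [|\phi|]_2 + [|\operatorname{grad} \phi|]_3 + \|D^2 \phi\|_s \le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\big),$$
where c is a positive constant depending only on q and s.

Proof. The existence of the solution u satisfying the stated properties can be found in [18] and [14]. Moreover, again by the work of these authors, we have that, in particular, the corresponding pressure φ belongs to $L^6(\mathbb{R}^3)$. We now apply the operator "div" to both sides of (3.20)$_1$. Since
$$\operatorname{div}(-\mu \times x \cdot \operatorname{grad} u + \mu \times u) = -\mu \times x \cdot \operatorname{grad}(\operatorname{div} u), \tag{3.22}$$
we find
$$\Delta \phi = \partial_j \partial_i F_{ij} + \operatorname{div} f.$$
Thus, the properties of φ follow from Lemma 3.1 and from a classical uniqueness theorem in the Lebesgue class $L^q$ for the Poisson equation in the whole space. □

LEMMA 3.3. Let $G(x, t) = \{G_{ij}(x, t)\}$ be a second-order tensor field in $\mathbb{R}^3 \times (0, \infty)$ such that
$$\operatorname*{ess\,sup}_{t \ge 0}\big([|G|]_2 + [|\partial_i G_{ij} e_j|]_3\big) < \infty.$$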
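Moreover, let g be a function of bounded support contained in $B_\rho$, for some ρ > 0, and such that
$$g \in L^\infty(0, \infty; L^q(B_\rho)), \quad \text{for some } q > 3.$$
Then, the Cauchy problem
$$\left.\begin{aligned}
\frac{\partial w}{\partial t} &= \Delta w + \partial_i G_{ij} e_j + g\\
w(x, 0) &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3, \tag{3.23}$$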
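has one and only one solution such that
$$w \in W^{1,2}(0, T; L^2(\mathbb{R}^3)) \cap L^2(0, T; W^{2,2}(\mathbb{R}^3)), \quad \text{all } T > 0. \tag{3.24}$$
Furthermore, this solution satisfies the following estimate
$$\operatorname*{ess\,sup}_{t \ge 0}\big([|w|]_1 + [|\operatorname{grad} w|]_2\big) \le c\,\operatorname*{ess\,sup}_{t \ge 0}\big([|G|]_2 + [|\partial_i G_{ij} e_j|]_3 + \|g\|_{q,B_\rho}\big). \tag{3.25}$$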
Proof. The existence of a unique solution in the class (3.24) is well-known; see, e.g., [23]. In order to show the other properties of w, we shall make use of the volume heat potential representation:
$$w_j(x, t) = \int_0^t \int_{\mathbb{R}^3} H(x - y, s)\,\big[\partial_i G_{ij}(y, t - s) + g_j(y, t - s)\big]\,dy\,ds \equiv w_j^{(1)} + w_j^{(2)}, \tag{3.26}$$
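where
$$H(z, s) = \frac{1}{(4\pi s)^{3/2}} \exp\left\{-\frac{|z|^2}{4s}\right\}, \quad s, |z| > 0.$$
In the sequel, we shall employ many times the following elementary inequality:
$$s^{-k} \exp\left\{-\frac{|z|^2}{4s}\right\} \le \frac{c}{|z|^{2k}}, \quad k \ge 0, \tag{3.27}$$
where c is a positive constant independent of z and s. Let us first consider the function $w^{(2)}$. Using (3.27) in conjunction with the Hölder inequality, we find for any $r, p \in [1, q]$ and $t \ge 1$
$$\begin{aligned}
|w^{(2)}| &\le c\left\{\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{r,B_\rho} \int_0^1 s^{-3/2}\left(\int_{B_\rho} e^{-r'|x-y|^2/4s}\,dy\right)^{1/r'} ds + \operatorname*{ess\,sup}_{t \ge 0}\|g\|_{p,B_\rho} \int_1^t s^{-3/2}\left(\int_{B_\rho} e^{-p'|x-y|^2/4s}\,dy\right)^{1/p'} ds\right\}\\
&\equiv c\left(\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{r,B_\rho}\,I_1 + \operatorname*{ess\,sup}_{t \ge 0}\|g\|_{p,B_\rho}\,I_2\right)\\
&\le c\,\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{q,B_\rho}\,(I_1 + I_2),
\end{aligned} \tag{3.28}$$
where $r'$ and $p'$ are the conjugate exponents to r and p, respectively. Without loss, we shall assume throughout ρ > 1. Noticing that
$$x \in B^{2\rho},\ y \in B_\rho \implies |x - y| \ge \tfrac12 |x|, \tag{3.29}$$
with the help of (3.27), for any $\beta \in (0, 1]$ we find
$$I_1 \le c \int_0^1 s^{-1+\beta}\left(\int_{B_\rho} \frac{dy}{|x - y|^{(1+2\beta)r'}}\right)^{1/r'} ds \le \frac{c}{|x|^{1+2\beta}} \le \frac{c}{|x|}, \quad |x| \ge 2\rho. \tag{3.30}$$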
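The elementary inequality (3.27), used repeatedly in this proof, can be checked numerically. The following is a minimal sketch under our own assumptions (it is not taken from the paper): maximizing $s^{-k} e^{-a/4s}$ over s > 0 with $a = |z|^2$ gives the maximizer s = a/(4k), so that $c = (4k/e)^k$ is the smallest admissible constant. The names `best_constant` and `lhs` and the test grids are hypothetical.

```python
import math

def best_constant(k):
    """Smallest c with s**(-k) * exp(-|z|**2 / (4 s)) <= c / |z|**(2 k).

    Maximizing s**(-k) * exp(-a / (4 s)) over s > 0 (with a = |z|**2) gives
    the maximizer s = a / (4 k) and the value (4 k / (e a))**k for k > 0.
    """
    if k == 0:
        return 1.0
    return (4.0 * k / math.e) ** k

def lhs(s, z_norm, k):
    """Left-hand side of (3.27)."""
    return s ** (-k) * math.exp(-z_norm ** 2 / (4.0 * s))

# Brute-force check on a logarithmic grid of s and |z| (illustrative values).
k_values = [0.0, 0.5, 1.5, 2.5]
grid = [10.0 ** (e / 4.0) for e in range(-12, 13)]
worst_ratio = 0.0
for k in k_values:
    c = best_constant(k)
    for s in grid:
        for z in grid:
            ratio = lhs(s, z, k) * z ** (2 * k) / c
            worst_ratio = max(worst_ratio, ratio)
```

On this grid the ratio of the left-hand side to $c/|z|^{2k}$ should never exceed 1. The estimates above apply the inequality with k = 1/2 + β, after splitting off the integrable factor $s^{-1+\beta}$.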
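Furthermore, using again (3.27) and (3.29), we obtain
$$I_2 \le c \int_1^t s^{-3/2}\left(\int_{B_\rho} e^{-p'|x|^2/16s}\,dy\right)^{1/p'} ds \le c \int_1^t s^{-3/2} e^{-|x|^2/16s}\,ds \le \frac{c}{|x|}, \quad |x| \ge 2\rho. \tag{3.31}$$
From (3.28)–(3.31) we then deduce
$$|x|\,|w^{(2)}(x, t)| \le c\,\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{q,B_\rho}, \quad |x| \ge 2\rho. \tag{3.32}$$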
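We next show an estimate valid for all $|x| \ge 0$. We have
$$I_1 \le c \int_0^1 s^{-3/2}\left(\int_{B_{4\rho}(x)} e^{-r'|x-y|^2/4s}\,dy\right)^{1/r'} ds \le c \int_0^1 s^{-3/2}\left(\int_0^{4\rho} \sigma^2 e^{-r'\sigma^2/4s}\,d\sigma\right)^{1/r'} ds \le c \int_0^1 s^{-3(1-1/r')/2}\,ds.$$
Thus, choosing r > 3/2, we conclude
$$I_1 \le c, \quad |x| \ge 0. \tag{3.33}$$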
In a completely analogous fashion, we find
$$I_2 \le c \int_1^t s^{-3(1-1/p')/2}\,ds,$$
and so, choosing p < 3/2, we also have
$$I_2 \le c, \quad |x| \ge 0. \tag{3.34}$$
If $t \le 1$, $|w^{(2)}(x, t)|$ is bounded by $I_1$, and so, collecting (3.32)–(3.34), we obtain
$$[|w^{(2)}|]_1 \le c\,\operatorname*{ess\,sup}_{t \ge 0}\|g\|_q. \tag{3.35}$$
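Our next step is to estimate the first spatial derivative of $w^{(2)}$. Taking the partial derivative of $w^{(2)}$ with respect to $x_k$ and proceeding as before, we find for any $r, p \in [1, q]$ and $t \ge 1$
$$\begin{aligned}
|\partial_k w^{(2)}| &\le c\left\{\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{r,B_\rho} \int_0^1 s^{-5/2}\left(\int_{B_\rho} |x - y|^{r'} e^{-r'|x-y|^2/4s}\,dy\right)^{1/r'} ds + \operatorname*{ess\,sup}_{t \ge 0}\|g\|_{p,B_\rho} \int_1^t s^{-5/2}\left(\int_{B_\rho} |x - y|^{p'} e^{-p'|x-y|^2/4s}\,dy\right)^{1/p'} ds\right\}\\
&\equiv c\left(\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{r,B_\rho}\,I_3 + \operatorname*{ess\,sup}_{t \ge 0}\|g\|_{p,B_\rho}\,I_4\right)\\
&\le c\,\operatorname*{ess\,sup}_{t \ge 0}\|g\|_{q,B_\rho}\,(I_3 + I_4).
\end{aligned} \tag{3.36}$$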
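With the help of (3.27) and (3.29), for any $\beta \in (0, 1]$, we find
$$I_3 \le c \int_0^1 s^{-1+\beta}\left(\int_{B_\rho} \frac{dy}{|x - y|^{(2+2\beta)r'}}\right)^{1/r'} ds \le \frac{c}{|x|^{2+2\beta}} \le \frac{c}{|x|^2}, \quad |x| \ge 2\rho. \tag{3.37}$$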
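Furthermore, using again (3.29), and observing that
$$y \in B_\rho \implies |x - y| \le \tfrac32 |x|, \quad |x| \ge 2\rho,$$
we obtain
$$I_4 \le c \int_1^t s^{-5/2}\left(\int_{B_\rho} |x - y|^{p'} e^{-p'|x-y|^2/4s}\,dy\right)^{1/p'} ds \le c\,|x| \int_1^t s^{-5/2} e^{-|x|^2/4s}\,ds \le c\,|x|^{-2}, \quad |x| \ge 2\rho. \tag{3.38}$$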
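We next show an estimate valid for all $|x| \ge 0$. Choosing r = q, we have
$$I_3 \le c \int_0^1 s^{-5/2}\left(\int_{B_{4\rho}(x)} |x - y|^{q'} e^{-q'|x-y|^2/4s}\,dy\right)^{1/q'} ds \le c \int_0^1 s^{-5/2}\left(\int_0^{4\rho} \sigma^{2+q'} e^{-q'\sigma^2/4s}\,d\sigma\right)^{1/q'} ds \le c \int_0^1 s^{-2+3/(2q')}\,ds.$$
Thus, since q > 3, we conclude
$$I_3 \le c, \quad |x| \ge 0. \tag{3.39}$$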
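In a completely analogous way, we find
$$I_4 \le c \int_1^t s^{-2+3/(2p')}\,ds,$$
and so, choosing p < 3, we also have
$$I_4 \le c, \quad |x| \ge 0. \tag{3.40}$$
If $t \le 1$, $|\partial_k w^{(2)}(x, t)|$ is bounded by $I_3$ and so, from (3.36)–(3.40), we deduce
$$[|\operatorname{grad} w^{(2)}|]_2 \le c\,\operatorname*{ess\,sup}_{t \ge 0}\|g\|_q. \tag{3.41}$$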
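It remains to estimate the integral $w^{(1)}$ in (3.26). For simplicity, we introduce the following notation:
$$N_0 = \operatorname*{ess\,sup}_{t \ge 0}\,[|G|]_2, \qquad N_1 = \operatorname*{ess\,sup}_{t \ge 0}\,[|\partial_i G_{ij} e_j|]_3.$$
As shown in the estimates for $w^{(2)}(x, t)$, we may take, without loss, $t \ge 1$. Integrating by parts, and using the assumption (i) for G, we find, for any $\alpha \in [0, 2]$,
$$\begin{aligned}
|w_j^{(1)}| &= \left|\frac{2}{(4\pi)^{3/2}} \int_0^t s^{-5/2} \int_{\mathbb{R}^3} (x_i - y_i)\,e^{-|x-y|^2/4s}\,G_{ij}(y, t - s)\,dy\,ds\right|\\
&\le c N_0\left(\int_0^1 s^{-5/2} \int_{\mathbb{R}^3} \frac{|x - y|\,e^{-|x-y|^2/4s}}{|y|^\alpha}\,dy\,ds + \int_1^t s^{-5/2} \int_{\mathbb{R}^3} \frac{|x - y|\,e^{-|x-y|^2/4s}}{|y|^2}\,dy\,ds\right)\\
&\equiv c N_0\,(I_1 + I_2).
\end{aligned} \tag{3.42}$$
Using (3.27), for all $\beta \in (0, 1]$ we find
$$I_1 \le c \int_0^1 s^{-1+\beta}\,ds \int_{\mathbb{R}^3} \frac{dy}{|x - y|^{2+2\beta} |y|^\alpha},$$
and so, by a classical estimate (see, e.g., [11, Lemma II.7.2]), we obtain
$$I_1 \le c\,|x|^{-2\beta-\alpha+1}.$$
Therefore, choosing β = 1 − α/2, α < 2, we conclude
$$I_1 \le c\,|x|^{-1}, \quad |x| > 0. \tag{3.43}$$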
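In order to estimate $I_2$, we notice that
$$I_2 \le c \int_{\mathbb{R}^3} \frac{|x - y|}{|y|^2}\left(\int_1^t \frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\right) dy,$$
and so, performing in the time integral the change of variable $\eta = |x - y|^2/4s$, it follows that
$$I_2 \le c \int_{\mathbb{R}^3} \frac{dy}{|x - y|^2 |y|^2} \int_0^\infty \eta^{1/2} e^{-\eta}\,d\eta \le c \int_{\mathbb{R}^3} \frac{dy}{|x - y|^2 |y|^2}.$$
Employing again Lemma II.7.2 in [11], we obtain
$$I_2 \le c\,|x|^{-1}, \quad |x| > 0. \tag{3.44}$$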
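The change of variable $\eta = |x - y|^2/4s$ used for $I_2$ can be sanity-checked numerically. The sketch below is an illustration under our own assumptions, not the author's computation. Writing d = |x − y| and extending the time integral to (0, ∞), the substitution gives $\int_0^\infty d\,s^{-5/2} e^{-d^2/4s}\,ds = 8\,\Gamma(3/2)/d^2 = 4\sqrt{\pi}/d^2$, which is the source of the factor $|x - y|^{-2}$. The function names, the truncation interval and the step count are hypothetical choices.

```python
import math

def time_integral(d, n_steps=4000, u_min=-40.0, u_max=40.0):
    """Midpoint-rule value of  int_0^inf  d * s**(-5/2) * exp(-d**2/(4 s)) ds.

    The substitution s = exp(u) turns the integrand into
    d * exp(-1.5 u) * exp(-d**2 / (4 exp(u))), which decays rapidly
    at both ends of the (truncated) u-interval.
    """
    h = (u_max - u_min) / n_steps
    total = 0.0
    for i in range(n_steps):
        u = u_min + (i + 0.5) * h
        s = math.exp(u)
        total += d * math.exp(-1.5 * u) * math.exp(-d * d / (4.0 * s))
    return total * h

def closed_form(d):
    """8 * Gamma(3/2) / d**2 = 4 sqrt(pi) / d**2, from eta = d**2 / (4 s)."""
    return 8.0 * math.gamma(1.5) / d ** 2
```

For instance, `time_integral(1.0)` can be compared with `closed_form(1.0)`. Since the integrand is positive, the integral over (1, t) appearing in $I_2$ is bounded by this full-line value uniformly in t.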
Equations (3.43) and (3.44) imply
$$|w^{(1)}(x, t)| \le c\,|x|^{-1}, \quad |x| > 0. \tag{3.45}$$
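We wish now to show that $w^{(1)}$ is uniformly bounded for all x. Applying the Hölder inequality in the first integral in (3.42), we obtain
$$|w^{(1)}| \le c\left\{\operatorname*{ess\,sup}_{t \ge 0}\|G\|_r \int_0^1 s^{-5/2}\left(\int_{\mathbb{R}^3} |x - y|^{r'} e^{-r'|x-y|^2/4s}\,dy\right)^{1/r'} ds + \operatorname*{ess\,sup}_{t \ge 0}\|G\|_q \int_1^t s^{-5/2}\left(\int_{\mathbb{R}^3} |x - y|^{q'} e^{-q'|x-y|^2/4s}\,dy\right)^{1/q'} ds\right\}$$
for any $3/2 < r, q \le \infty$. Since
$$\operatorname*{ess\,sup}_{t \ge 0}\|G\|_p \le c N_0, \quad \text{all } p > 3/2,$$
the preceding inequality implies
$$|w^{(1)}| \le c N_0\left\{\int_0^1 s^{-5/2}\left(\int_{\mathbb{R}^3} |x - y|^{r'} e^{-r'|x-y|^2/4s}\,dy\right)^{1/r'} ds + \int_1^t s^{-5/2}\left(\int_{\mathbb{R}^3} |x - y|^{q'} e^{-q'|x-y|^2/4s}\,dy\right)^{1/q'} ds\right\} \equiv c N_0\,(I_3 + I_4).$$
Performing the change of variable $\sigma = |x - y|/\sqrt{4s}$, we find
$$I_3 \le c \int_0^1 s^{-2+3/(2r')}\,ds,$$
and so, choosing 3/2 < r < 3, we get $I_3 \le c$. By the same token, we obtain
$$I_4 \le c \int_1^t s^{-2+3/(2q')}\,ds,$$
and so, choosing this time q > 3, we get $I_4 \le c$. As a consequence, we deduce
$$|w^{(1)}(x, t)| \le c N_0, \quad |x| \ge 0. \tag{3.46}$$
From (3.45) and (3.46) we conclude
$$[|w^{(1)}|]_1 \le c N_0. \tag{3.47}$$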
It remains to estimate the spatial derivatives of $w^{(1)}$. To this end, we notice that from (3.26) we have
$$\partial_k w_j^{(1)} = \frac{2}{(4\pi)^{3/2}} \int_0^t s^{-5/2} \int_{\mathbb{R}^3} (x_k - y_k)\,e^{-|x-y|^2/4s}\,\partial_i G_{ij}(y, t - s)\,dy\,ds. \tag{3.48}$$
In order to achieve our goal, we set |x| = R > 2 and, as in the proof of Lemma 3.1, split again $\mathbb{R}^3$ as $B_{R/2} \cup B^{R/2}$. The contributions to the integral in (3.48) over the two subdomains will be denoted by $I_1(t)$ and $I_2(t)$, respectively. Moreover, for each $I_i(t)$, we split the interval [0, t] into the two intervals [0, 1] and [1, t], and denote the corresponding integrals by $I_i(0, 1)$ and $I_i(1, t)$, i = 1, 2, according to whether we are integrating over [0, 1] or [1, t]. For instance, we have
$$\begin{aligned}
I_1(t) &\equiv \frac{2}{(4\pi)^{3/2}} \int_0^t s^{-5/2} \int_{B_{R/2}} (x_k - y_k)\,e^{-|x-y|^2/4s}\,\partial_i G_{ij}(y, t - s)\,dy\,ds\\
&= \frac{2}{(4\pi)^{3/2}} \int_0^1 s^{-5/2} \int_{B_{R/2}} (x_k - y_k)\,e^{-|x-y|^2/4s}\,\partial_i G_{ij}(y, t - s)\,dy\,ds\\
&\quad + \frac{2}{(4\pi)^{3/2}} \int_1^t s^{-5/2} \int_{B_{R/2}} (x_k - y_k)\,e^{-|x-y|^2/4s}\,\partial_i G_{ij}(y, t - s)\,dy\,ds\\
&\equiv I_1(0, 1) + I_1(1, t),
\end{aligned}$$
etc. Integrating by parts, we find
$$\begin{aligned}
\int_{B_{R/2}} (x_k - y_k)\,e^{-|x-y|^2/4s}\,\partial_i G_{ij}\,dy &= \int_{B_{R/2}} \left(\delta_{ik} + \frac{(x_k - y_k)(x_i - y_i)}{s}\right) e^{-|x-y|^2/4s}\,G_{ij}\,dy - \int_{\partial B_{R/2}} e^{-|x-y|^2/4s}\,G_{ij} n_i\,d\sigma_y\\
&\equiv i_1(s) + i_2(s) + i_3(s).
\end{aligned}$$
Using the assumption (i) on G, the first condition in (3.16) and (3.27), for any $\varepsilon \in (0, 1]$ we have
$$\begin{aligned}
\int_0^1 s^{-5/2} |i_1(s)|\,ds &\le c N_0 \int_0^1 s^{-1+\varepsilon} \int_{B_{R/2}} \frac{e^{-|x-y|^2/4s}}{s^{3/2+\varepsilon}\,(|y|^2 + 1)}\,dy\,ds\\
&\le c N_0 \int_0^1 \frac{ds}{s^{1-\varepsilon}} \int_{B_{R/2}} \frac{dy}{|x - y|^{3+2\varepsilon}\,(|y|^2 + 1)}\\
&\le c\,\frac{N_0}{R^3} \int_{B_{R/2}} \frac{dy}{|y|^2 + 1} \le c\,\frac{N_0}{R^2}.
\end{aligned} \tag{3.49}$$
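By a similar argument, we find
$$\begin{aligned}
\int_0^1 s^{-5/2} |i_2(s)|\,ds &\le c N_0 \int_0^1 s^{-1+\varepsilon} \int_{B_{R/2}} \frac{|x - y|^2\,e^{-|x-y|^2/4s}}{s^{5/2+\varepsilon}\,(|y|^2 + 1)}\,dy\,ds\\
&\le c N_0 \int_0^1 \frac{ds}{s^{1-\varepsilon}} \int_{B_{R/2}} \frac{dy}{|x - y|^{3+2\varepsilon}\,(|y|^2 + 1)} \le c\,\frac{N_0}{R^2},
\end{aligned} \tag{3.50}$$
and, using this time also the second condition in (3.16),
$$\begin{aligned}
\int_0^1 s^{-5/2} |i_3(s)|\,ds &\le c\,\frac{N_0}{R^2} \int_0^1 s^{-1+\varepsilon} \int_{\partial B_{R/2}} \frac{|x - y|\,e^{-|x-y|^2/4s}}{s^{3/2+\varepsilon}}\,d\sigma_y\,ds\\
&\le c\,\frac{N_0}{R^2} \int_0^1 \frac{ds}{s^{1-\varepsilon}} \int_{\partial B_{R/2}} \frac{|x - y|}{|x - y|^{3+2\varepsilon}}\,d\sigma_y \le c\,\frac{N_0}{R^2}.
\end{aligned} \tag{3.51}$$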
From (3.49)–(3.51) we then conclude
$$|I_1(0, 1)| \le c\,\frac{N_0}{|x|^2}, \quad |x| > 2. \tag{3.52}$$
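We shall next estimate $I_1(1, t)$. Using assumption (i) on G, we obtain
$$\int_1^t s^{-5/2} |i_1(s)|\,ds \le c N_0 \int_{B_{R/2}} \frac{1}{|y|^2 + 1}\left(\int_1^t \frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\right) dy,$$
and so, setting $\eta = |x - y|^2/4s$, and using the first condition in (3.16), it follows that
$$\int_1^t s^{-5/2} |i_1(s)|\,ds \le c N_0 \int_{B_{R/2}} \frac{1}{(|y|^2 + 1)\,|x - y|^3}\left(\int_0^\infty \eta^{1/2} e^{-\eta}\,d\eta\right) dy \le c\,\frac{N_0}{R^3} \int_{B_{R/2}} \frac{dy}{|y|^2 + 1} \le c\,\frac{N_0}{R^2}. \tag{3.53}$$
Likewise, we find
$$\begin{aligned}
\int_1^t s^{-5/2} |i_2(s)|\,ds &\le c N_0 \int_{B_{R/2}} \frac{|x - y|^2}{|y|^2 + 1}\left(\int_1^t \frac{e^{-|x-y|^2/4s}}{s^{7/2}}\,ds\right) dy\\
&\le c N_0 \int_{B_{R/2}} \frac{1}{(|y|^2 + 1)\,|x - y|^3}\left(\int_0^\infty \eta^{1/2} e^{-\eta}\,d\eta\right) dy\\
&\le c\,\frac{N_0}{R^3} \int_{B_{R/2}} \frac{dy}{|y|^2 + 1} \le c\,\frac{N_0}{R^2}.
\end{aligned} \tag{3.54}$$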
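Moreover, using this time the second condition in (3.16), we obtain
$$\int_1^t s^{-5/2} |i_3(s)|\,ds \le c\,\frac{N_0}{R^2} \int_{\partial B_{R/2}} |x - y|\left(\int_1^t \frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\right) d\sigma_y \le c\,\frac{N_0}{R^2} \int_{\partial B_{R/2}} \frac{d\sigma_y}{|x - y|^2} \int_0^\infty \eta^{1/2} e^{-\eta}\,d\eta \le c\,\frac{N_0}{R^2}. \tag{3.55}$$
Collecting (3.52)–(3.55), we thus conclude
$$|I_1(t)| \le c\,\frac{N_0}{|x|^2}, \quad |x| > 2. \tag{3.56}$$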
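We now estimate $I_2(t) = I_2(0, 1) + I_2(1, t)$. Using assumption (ii) on G, and recalling (3.18) and (3.27), for any $\varepsilon \in (0, 1)$ we find
$$\begin{aligned}
|I_2(0, 1)| &\le c\,\frac{N_1}{R^{1+2\varepsilon}} \int_0^1 s^{-1+\varepsilon} \int_{B^{R/2}} \frac{|x - y|\,e^{-|x-y|^2/4s}}{s^{3/2+\varepsilon}\,|y|^{2-2\varepsilon}}\,dy\,ds\\
&\le c\,\frac{N_1}{R^{1+2\varepsilon}} \int_0^1 s^{-1+\varepsilon}\,ds \int_{B^{R/2}} \frac{dy}{|x - y|^{2+2\varepsilon}\,|y|^{2-2\varepsilon}}\\
&\le c\,\frac{N_1}{R^{2+2\varepsilon}} \le c\,\frac{N_1}{R^2},
\end{aligned} \tag{3.57}$$
where, in the last step, we have used Lemma II.7 of [11]. Moreover, again by (3.18) and this lemma, it follows that
$$|I_2(1, t)| \le c\,\frac{N_1}{R} \int_{B^{R/2}} \frac{|x - y|}{|y|^2}\left(\int_1^t \frac{e^{-|x-y|^2/4s}}{s^{5/2}}\,ds\right) dy \le c\,\frac{N_1}{R} \int_{\mathbb{R}^3} \frac{dy}{|x - y|^2 |y|^2} \int_0^\infty \eta^{1/2} e^{-\eta}\,d\eta \le c\,\frac{N_1}{R^2}. \tag{3.58}$$
Therefore, from (3.57), (3.58) we conclude
$$|I_2(t)| \le c\,\frac{N_1}{|x|^2}, \quad |x| > 2. \tag{3.59}$$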
Recalling that $\partial_k w^{(1)}(x, t) = I_1(t) + I_2(t)$, from (3.56) and (3.59) we conclude
$$|\operatorname{grad} w^{(1)}(x, t)| \le c\,\frac{N_0 + N_1}{|x|^2}, \quad |x| > 2. \tag{3.60}$$
Finally, we wish to show an estimate for grad $w^{(1)}$ holding for all |x|. However, the proof of this estimate is completely similar to the analogous estimate we proved for $w^{(1)}$. The reason is that both $w^{(1)}$ and grad $w^{(1)}$ are expressed as the convolution of a spatial derivative of the kernel H with a function $\mathcal{G}$ (say) which belongs to $L^q(\mathbb{R}^3)$, for all q > 3/2. (In fact, in view of assumption (ii), the function $\mathcal{G}$ associated to grad $w^{(1)}$ belongs to $L^q(\mathbb{R}^3)$ for all q > 1.) We may thus prove
$$|\operatorname{grad} w^{(1)}(x, t)| \le c N_1, \quad |x| \ge 0,$$
which, in turn, along with (3.60), allows us to conclude
$$[|\operatorname{grad} w^{(1)}|]_2 \le c\,(N_0 + N_1).$$
The proof of the lemma is then completed. □
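LEMMA 3.4. Let F and f satisfy the assumptions of Lemma 3.1, and let φ be the pressure field in Lemma 3.2. Then the Cauchy problem
$$\left.\begin{aligned}
\frac{\partial v}{\partial t} - Re\,(\mu \times x \cdot \operatorname{grad} v - \mu \times v) &= \Delta v - \operatorname{grad} \phi - \partial_i F_{ij} e_j + f\\
v(x, 0) &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3, \tag{3.61}$$
has one and only one solution such that
$$\begin{aligned}
&v \in W^{1,2}(0, T; L^2(B_R)) \cap L^2(0, T; W^{2,2}(\mathbb{R}^3)), \quad \text{all } R, T > 0,\\
&\operatorname*{ess\,sup}_{t \ge 0}\big([|v|]_1 + [|\operatorname{grad} v|]_2\big) < \infty.
\end{aligned} \tag{3.62}$$
Moreover, the solution satisfies the following estimate
$$\operatorname*{ess\,sup}_{t \ge 0}\big([|v|]_1 + [|\operatorname{grad} v|]_2\big) \le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\big), \tag{3.63}$$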
where c depends only on q.

Proof. Let Q = Q(t), $t \ge 0$, be the uniquely determined family of proper orthogonal transformations parameterized by time, such that ("T" denotes transpose)
$$Q^T(t) \cdot \dot{Q}(t) \cdot a = Re\; a \times \mu \quad \text{for all } a \in \mathbb{R}^3,\ t \ge 0, \qquad Q(0) = I. \tag{3.64}$$
It is well known that the tensor field Q(t) is found by solving the following initial-value problem
$$\left.\begin{aligned}
\dot{Q} &= Re\,Q \cdot M,\\
Q(0) &= I,
\end{aligned}\right\} \qquad M(\mu) = \begin{bmatrix} 0 & -\mu_3 & \mu_2\\ \mu_3 & 0 & -\mu_1\\ -\mu_2 & \mu_1 & 0 \end{bmatrix}.$$
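As a hedged illustration (not part of the original proof), the initial-value problem for Q can be solved in closed form when μ is a unit vector: Rodrigues' formula gives $Q(t) = I + \sin\theta\, M + (1 - \cos\theta)\, M^2$ with θ = Re t. The Python sketch below builds this matrix and checks, for illustrative data chosen by us, that it is orthogonal, that it preserves Euclidean length, and that it satisfies $\dot{Q} = Re\, Q \cdot M$ up to finite-difference error. All function names and numerical values are assumptions made for the example.

```python
import math

def skew(mu):
    """Matrix M(mu) as displayed above (rows listed top to bottom)."""
    m1, m2, m3 = mu
    return [[0.0, -m3, m2],
            [m3, 0.0, -m1],
            [-m2, m1, 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(mu, re, t):
    """Q(t) = I + sin(theta) M + (1 - cos(theta)) M^2, theta = Re * t.

    For a unit vector mu this is exp(theta * M) (Rodrigues' formula), which
    solves dQ/dt = Re * Q . M with Q(0) = I.
    """
    m = skew(mu)
    m2 = matmul(m, m)
    theta = re * t
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * m[i][j] + c * m2[i][j]
             for j in range(3)] for i in range(3)]

def apply(q, x):
    return [sum(q[i][j] * x[j] for j in range(3)) for i in range(3)]

def norm(x):
    return math.sqrt(sum(c * c for c in x))

# Illustrative data (assumptions): a unit axis, a Reynolds number and a time.
mu = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)
re, t = 0.7, 1.3
q = rotation(mu, re, t)

# Orthogonality: Q^T Q = I, hence |Q x| = |x| for every x.
qt = [[q[j][i] for j in range(3)] for i in range(3)]
qtq = matmul(qt, q)
x = [1.0, -2.0, 0.5]
y = apply(q, x)

# Central finite difference of Q against Re * Q . M.
h = 1e-6
qp, qm = rotation(mu, re, t + h), rotation(mu, re, t - h)
dq = [[(qp[i][j] - qm[i][j]) / (2 * h) for j in range(3)] for i in range(3)]
rhs = [[re * v for v in row] for row in matmul(q, skew(mu))]
```

The length-preserving property checked here is what allows weighted estimates in the rotated variable to be carried back to the original one without changing the weights.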
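We next introduce a new set of coordinates y related to x by
$$y = Q(t) \cdot x. \tag{3.65}$$
Also, set
$$w(y, t) = Q(t) \cdot v(Q^T(t) \cdot y, t). \tag{3.66}$$
Using (3.66), (3.64), and the identity $\dot{Q}^T(t) \cdot Q(t) = -Q^T(t) \cdot \dot{Q}(t)$, we find
$$\begin{aligned}
\frac{\partial w}{\partial t} &= Q(t) \cdot \left[\frac{\partial v}{\partial t} + \big(\dot{Q}^T(t) \cdot Q(t) \cdot x\big) \cdot \operatorname{grad} v + Q^T(t) \cdot \dot{Q}(t) \cdot v\right]_x\\
&= Q(t) \cdot \left[\frac{\partial v}{\partial t} - Re\,\big(\mu \times x \cdot \operatorname{grad} v - \mu \times v\big)\right]_x
\end{aligned} \tag{3.67}$$
and
$$\Delta_y w = Q(t) \cdot \Delta_x v. \tag{3.68}$$
Therefore, the Cauchy problem (3.61) can be equivalently rewritten as follows
$$\left.\begin{aligned}
\frac{\partial w}{\partial t} &= \Delta w + \partial_i G_{ij} e_j + g\\
w(y, 0) &= 0,
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3, \tag{3.69}$$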
where the second-order tensor field G and the vector field g are given by
$$G(y, t) = Q(t) \cdot F(Q^T(t) \cdot y) \cdot Q(t)^T + \phi(Q^T(t) \cdot y)\,I, \qquad g(y, t) = Q(t) \cdot f(Q^T(t) \cdot y).$$
Clearly, |g(y, t)| = |f(x)|. Moreover,
$$|G(y, t)| \le c\,(|F(x)| + |\phi(x)|), \qquad \sum_{j=1}^3 |\partial_i G_{ij}(y, t)| \le c\left(\sum_{j=1}^3 |\partial_i^x F_{ij}(x)| + |\operatorname{grad}_x \phi(x)|\right).$$
Therefore, recalling the properties of φ shown in Lemma 3.2, we deduce that the fields G and g satisfy the assumptions of Lemma 3.3. Since we have also
$$|w(y, t)| = |v(x, t)|, \qquad |\operatorname{grad}_y w(y, t)| = |\operatorname{grad}_x v(x, t)|$$
and |y| = |x|, the proof of the lemma follows from Lemma 3.2 and Lemma 3.3. □

LEMMA 3.5. The solution v to the Cauchy problem (3.61) given in Lemma 3.4 tends, as $t \to \infty$, to the solution u of the steady problem (3.20) given in Lemma 3.2. Specifically,
$$\lim_{t \to \infty} \|v(t) - u\|_{q,\mathbb{R}^3} = 0 \quad \text{for all } q > 6, \qquad \lim_{t \to \infty} \|\operatorname{grad}(v(t) - u)\|_{6,\mathbb{R}^3} = 0.$$
Proof. Set $U(y, t) = Q(t) \cdot u(Q^T(t) \cdot y)$, where Q(t) and y are defined in (3.64) and (3.65), respectively, and W(y, t) = w(y, t) − U(y, t). Arguing as in the proof of Lemma 3.4 (see (3.67) and (3.68)), and taking into account (3.20), (3.61) and (3.64)$_2$, we find that W(y, t) satisfies the following Cauchy problem
$$\left.\begin{aligned}
\frac{\partial W}{\partial t} &= \Delta W\\
W(y, 0) &= u(y).
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3,$$
Since, obviously,
$$|W| = |v - u|, \qquad |\operatorname{grad}_y W| = |\operatorname{grad}_x (v - u)|, \tag{3.70}$$
from (3.22) and (3.62), we have
$$W \in L^\infty(0, \infty; L^\infty(\mathbb{R}^3)) \cap L^\infty(0, \infty; D^{1,2}(\mathbb{R}^3)).$$
Thus, using these properties along with the asymptotic properties in space of the kernel H, in conjunction with the classical Green's identity for the heat equation, we obtain that W admits the following representation:
$$W(y, t) = \int_{\mathbb{R}^3} H(y - z, t)\,u(z)\,dz. \tag{3.71}$$
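Therefore, from Young's theorem on convolutions we get
$$\|W\|_{q,\mathbb{R}^3} \le c\,t^{-\frac32(1/6 - 1/q)}\,\|u\|_6, \quad q > 6, \qquad \|\operatorname{grad} W\|_{6,\mathbb{R}^3} \le c\,t^{-1/2}\,\|u\|_6, \quad t > 0.$$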
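The decay rate in the first of these two estimates can be traced to the scaling of the heat kernel. As a minimal sketch under our own assumptions (not taken from the paper), Young's inequality $\|H(t) * u\|_q \le \|H(t)\|_r \|u\|_6$ requires 1/q + 1 = 1/r + 1/6, and the $L^r(\mathbb{R}^3)$ norm of H(·, t) scales like $t^{-\frac32(1 - 1/r)}$. The helper names below are hypothetical, and the closed-form norm follows from a Gaussian integral.

```python
import math
from fractions import Fraction

def young_exponent(q, base=6):
    """Exponent r with 1/q + 1 = 1/r + 1/base (Young's inequality)."""
    return 1 / (1 + Fraction(1, q) - Fraction(1, base))

def decay_rate(q, base=6):
    """Rate a in ||W(t)||_q <= c t**(-a) ||u||_base: a = (3/2)(1/base - 1/q)."""
    return Fraction(3, 2) * (Fraction(1, base) - Fraction(1, q))

def heat_kernel_norm(t, r):
    """Closed-form L^r(R^3) norm of H(., t) = (4 pi t)**(-3/2) exp(-|z|^2/4t)."""
    return (4 * math.pi * t) ** -1.5 * (4 * math.pi * t / r) ** (1.5 / r)
```

For example, with q = 12 the Young exponent is r = 12/11 and the rate is 1/8, so doubling t multiplies the kernel norm by $2^{-1/8}$.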
The proof then follows from the two preceding estimates for W and from (3.70). □

We are now in a position to show the main result of this section.

THEOREM 3.1. Let F and f satisfy the assumptions of Lemma 3.1. Then the problem (3.20) has at least one solution u, φ such that
$$\begin{aligned}
&u \in W_{loc}^{2,2}(\mathbb{R}^3) \cap D^{2,2}(\mathbb{R}^3) \cap D^{1,2}(\mathbb{R}^3) \cap L^6(\mathbb{R}^3) \cap L^\infty(\mathbb{R}^3),\\
&[|u|]_1 + [|\operatorname{grad} u|]_2 < \infty,\\
&\phi \in W^{1,r}(\mathbb{R}^3) \cap D^{2,s}(\mathbb{R}^3), \quad \text{all } q \ge s > 1,\ r > 3/2,\\
&[|\phi|]_2 + [|\operatorname{grad} \phi|]_3 < \infty.
\end{aligned} \tag{3.72}$$
Moreover, the following estimate holds
$$\|D^2 u\|_2 + [|u|]_1 + [|\operatorname{grad} u|]_2 + [|\phi|]_2 + [|\operatorname{grad} \phi|]_3 + \|D^2 \phi\|_s \le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\big). \tag{3.73}$$
Finally, if $u_1, \phi_1$ is another solution to (3.20) with
$$u_1 \in W_{loc}^{2,2}(\mathbb{R}^3) \cap D^{1,2}(\mathbb{R}^3) \cap L^6(\mathbb{R}^3), \qquad \phi_1 \in W_{loc}^{1,2}(\mathbb{R}^3),$$
we have $u \equiv u_1$, $\phi \equiv \phi_1 + \text{const}$.

Proof. In view of Lemma 3.1, for the existence proof we only have to show that the solution u satisfies, in addition, the property $[|u|]_1 + [|\operatorname{grad} u|]_2 < \infty$, together with the estimate given in (3.73). To this end, let $v(x, t_n)$ be the solution to the Cauchy problem (3.61) given in Lemma 3.4, evaluated along an increasing sequence of times $\{t_n\}$ with $t_n \to \infty$. By Lemma 3.5, $v(x, t_n)$ and grad $v(x, t_n)$ converge strongly to u(x) and grad u(x), in $L^q$, q > 6, and $L^6$, respectively. Therefore, we can select a subsequence, again denoted by $\{t_n\}$, along which $v(x, t_n)$ and grad $v(x, t_n)$ converge pointwise to u(x) and grad u(x), for almost all $x \in \mathbb{R}^3$. By the triangle inequality and by (3.63) we then find
$$\begin{aligned}
|u(x)|(|x| + 1) + |\operatorname{grad} u(x)|(|x|^2 + 1) &\le |v(x, t_n)|(|x| + 1) + |\operatorname{grad} v(x, t_n)|(|x|^2 + 1)\\
&\quad + |v(x, t_n) - u(x)|(|x| + 1) + |\operatorname{grad}(v(x, t_n) - u(x))|(|x|^2 + 1)\\
&\le c\,\big([|F|]_2 + [|\partial_i F_{ij} e_j|]_3 + [|\partial_j \partial_i F_{ij}|]_4 + \|f\|_{q,B_\rho} + \|\operatorname{div} f\|_{q,B_\rho}\big)\\
&\quad + |\operatorname{grad}(v(x, t_n) - u(x))|(|x|^2 + 1) + |v(x, t_n) - u(x)|(|x| + 1).
\end{aligned}$$
Passing to the limit $n \to \infty$ in this latter inequality furnishes the desired result. In order to show the uniqueness part, setting $U = u - u_1$, $\Phi = \phi - \phi_1$, we have that U and Φ satisfy the following problem
$$\left.\begin{aligned}
\Delta U + \mu \times x \cdot \operatorname{grad} U - \mu \times U &= \operatorname{grad} \Phi,\\
\operatorname{div} U &= 0
\end{aligned}\right\} \quad \text{in } \mathbb{R}^3. \tag{3.74}$$
By classical results on elliptic regularity, we have that U and Φ are of class $C^\infty$. Operating with "curl" on both sides of (3.74)$_1$, we find
$$\Delta W + \mu \times x \cdot \operatorname{grad} W + \operatorname{grad} U \cdot \mu + \mu \cdot \operatorname{grad} U = 0, \tag{3.75}$$
where W = curl U. Let $\psi_R = \psi_R(|x|)$, R > 0, be a real, nonnegative and nonincreasing function of |x| such that $\psi_R(|x|) = 1$ for |x| < R, $\psi_R(|x|) = 0$ for |x| > 2R and $|\Delta \psi_R(|x|)| \le M/R^2$, for a constant M independent of R and x. Dot-multiplying (3.75) by $\psi_R W$, integrating by parts over $\mathbb{R}^3$ and observing that
$$\operatorname{grad} \psi_R(|x|) \cdot \mu \times x = 0, \quad \text{for all } x \in \mathbb{R}^3, \tag{3.76}$$
we find
$$\int_{\mathbb{R}^3} \psi_R |\operatorname{grad} W|^2 = \frac12 \int_{\mathbb{R}^3} \Delta \psi_R |W|^2 + \int_{\mathbb{R}^3} \psi_R\,(\mu \cdot \operatorname{grad} U + \operatorname{grad} U \cdot \mu) \cdot W.$$
Recalling that grad $U \in L^2(\mathbb{R}^3)$, letting $R \to \infty$ in this relation we thus deduce grad W ≡ grad curl $U \in L^2(\mathbb{R}^3)$. Since ΔU = − curl curl U, this implies
$$\Delta U \in L^2(\mathbb{R}^3). \tag{3.77}$$

Footnote: In the proof of uniqueness, the magnitude of Re is irrelevant. Therefore, for simplicity, we shall set Re = 1.
We now go back to (3.74). Since, by assumption, $U \in D^{1,2}(\mathbb{R}^3) \cap L^6(\mathbb{R}^3)$, we infer
$$U/|x| \in L^2(B^r), \quad \text{for all } r > 0; \tag{3.78}$$
see [11, Theorem II.5.1]. Plugging this information back into (3.74) and using (3.77), we then obtain
$$\operatorname{grad} \Phi/|x| \in L^2(B^r), \quad \text{for all } r > 0. \tag{3.79}$$
However, applying the operator "div" to both sides of (3.74)$_1$ and using (3.22) and (3.74)$_2$, we have that Φ is harmonic in the whole space. But grad Φ satisfies the asymptotic condition (3.79) and so, by well-known results, it follows that Φ = const. Equation (3.74) thus furnishes, in particular,
$$-\Delta U - \mu \times x \cdot \operatorname{grad} U + \mu \times U = 0 \quad \text{in } \mathbb{R}^3. \tag{3.80}$$
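We now multiply this equation by $\psi_R U$ and integrate again by parts on $\mathbb{R}^3$. Taking into account (3.76), we obtain
$$\int_{\mathbb{R}^3} \psi_R |\operatorname{grad} U|^2 = -\int_{\mathbb{R}^3} \Delta \psi_R |U|^2.$$
Using the properties of $\psi_R$ in this latter relation we obtain
$$\int_{\mathbb{R}^3} \psi_R |\operatorname{grad} U|^2 \le c \int_{B_{R,2R}} \frac{|U|^2}{|x|^2},$$
and so, letting $R \to \infty$, by (3.78) we conclude
$$\int_{\mathbb{R}^3} |\operatorname{grad} U|^2 = 0 \implies U(x) = \text{const}.$$
Since $U \in L^6(\mathbb{R}^3)$, this gives, in turn, $U \equiv 0$, and the proof of the theorem is completed. □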
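A cut-off function with the properties required in the uniqueness argument is easy to exhibit. The following sketch is our own illustrative construction, not the author's: it takes $\psi_R(r) = 1 - S(r/R - 1)$ on the shell R < r < 2R, with S the quintic "smoothstep" polynomial, which makes $\psi_R$ of class $C^2$. By scaling, $R^2 |\Delta \psi_R|$ is the same function of r/R for every R, which gives the bound $|\Delta \psi_R| \le M/R^2$ with M independent of R. Since $\psi_R$ is radial, its gradient is parallel to x, which is why (3.76) holds.

```python
def step_derivs(u):
    """First and second derivative of the quintic smoothstep S(u) on [0, 1]."""
    d1 = 30.0 * u * u * (1.0 - u) ** 2
    d2 = 60.0 * u * (1.0 - u) * (1.0 - 2.0 * u)
    return d1, d2

def psi_R(r, R):
    """Hypothetical C^2 cut-off: 1 for r <= R, 0 for r >= 2R."""
    t = r / R
    if t <= 1.0:
        return 1.0
    if t >= 2.0:
        return 0.0
    u = t - 1.0
    return 1.0 - (6.0 * u ** 5 - 15.0 * u ** 4 + 10.0 * u ** 3)

def laplacian_psi_R(r, R):
    """Radial Laplacian psi'' + (2/r) psi' in R^3."""
    t = r / R
    if t <= 1.0 or t >= 2.0:
        return 0.0
    d1, d2 = step_derivs(t - 1.0)
    # psi_R(r) = 1 - S(r/R - 1): each r-derivative contributes a factor 1/R.
    return (-d2 - 2.0 * d1 / t) / R ** 2

def sup_scaled_laplacian(R, n=2000):
    """max over the shell R <= r <= 2R of R^2 |Laplacian psi_R| (grid estimate)."""
    return max(abs(laplacian_psi_R(R * (1.0 + i / n), R)) for i in range(n + 1)) * R ** 2

M_small = sup_scaled_laplacian(3.0)
M_large = sup_scaled_laplacian(500.0)
```

The two values `M_small` and `M_large` should agree up to rounding, reflecting that the constant M does not depend on R.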
4. A Linear Problem in Exterior Domains

The objective of this section is to prove existence, uniqueness and corresponding estimates of solutions to the following exterior problem
$$\begin{aligned}
&\left.\begin{aligned}
\Delta v + Re\,(\mu \times x \cdot \operatorname{grad} v - \mu \times v) &= \operatorname{grad} p + \partial_i F_{ij} e_j\\
\operatorname{div} v &= 0
\end{aligned}\right\} \quad \text{in } \Omega,\\
&v = v_*, \quad x \in \partial\Omega,
\end{aligned} \tag{4.1}$$
where Ω is an exterior domain of class $C^2$. We begin by proving the following.

LEMMA 4.1. Let Ω be a locally Lipschitzian, exterior domain. Assume that the second-order tensor field $F = \{F_{ij}\}$ has components in $L^2(\Omega)$, and that $v_* \in W^{1/2,2}(\partial\Omega)$, with
$$\int_{\partial\Omega} v_* \cdot n = 0, \tag{4.2}$$
where n is the unit outer normal to ∂Ω. Then, problem (4.1) has at least one distributional solution v, p such that
$$v \in D^{1,2}(\Omega), \qquad p \in L^2_{loc}(\bar\Omega).$$

Proof. The proof of this result is quite standard and we shall sketch it here. First, we extend the boundary data to a solenoidal smooth function V in $W^{1,2}(\Omega)$ of bounded support (see [11, Chapter III]). Then we look for a solution to (4.1) of the form v = V + u, where u satisfies the following problem (in the sense of distributions)
$$\begin{aligned}
&\left.\begin{aligned}
\Delta u + Re\,(\mu \times x \cdot \operatorname{grad} u - \mu \times u) &= \operatorname{grad} p + \partial_i F_{ij} e_j + h\\
\operatorname{div} u &= 0
\end{aligned}\right\} \quad \text{in } \Omega,\\
&u = 0, \quad x \in \partial\Omega,
\end{aligned} \tag{4.3}$$
where h is a function of bounded support given by
$$h = -\Delta V - Re\,(\mu \times x \cdot \operatorname{grad} V - \mu \times V).$$
Dot-multiplying (4.3)$_1$ by u, integrating by parts over $\Omega_R$, letting $R \to \infty$ and formally assuming that the surface integrals over $\partial B_R$ go to zero, we obtain the following a priori estimate:
$$\int_\Omega |\operatorname{grad} u|^2 \le M,$$
where M depends only on Ω, F, Re and $v_*$. Using this bound and the classical Galerkin method, we can easily show the existence of a weak solution $u \in D^{1,2}(\Omega)$ to (4.3), with corresponding pressure field $p \in L^2_{loc}(\bar\Omega)$. For details, we refer the reader to [12, Chapter IX]. The proof of the lemma is completed. □
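We also have

LEMMA 4.2. Let Ω be an exterior domain of class $C^2$, and let F and $v_*$ satisfy the assumptions of Lemma 4.1. Assume, further, that $\partial_i F_{ij} e_j \in L^s(\Omega)$, $v_* \in W^{2-1/s,s}(\partial B)$, for all s > 1. Then, the solution v, p of Lemma 4.1 satisfies
$$v \in W^{2,s}(\Omega_{r_1}), \qquad p \in W^{1,s}(\Omega_{r_1}),$$
for all $r_1 > \delta(B)$. Moreover, the following estimate holds
$$\|v\|_{2,s,\Omega_{r_1}} + \|p\|_{1,s,\Omega_{r_1}} \le c\,\big(\|\partial_i F_{ij} e_j\|_{s,\Omega_r} + \|v_*\|_{2-1/s,s,\partial\Omega} + \|v\|_{s,\Omega_r} + \|p\|_{s,\Omega_r}\big)$$
for all $r > r_1$, with c depending only on Ω, $r_1$, r, s and B, whenever Re ∈ [0, B].

Proof. We may formally write (4.1) as a Stokes problem:
$$\begin{aligned}
&\left.\begin{aligned}
\Delta v &= \operatorname{grad} p + H\\
\operatorname{div} v &= 0
\end{aligned}\right\} \quad \text{in } \Omega,\\
&v = v_* \quad \text{at } \partial\Omega,
\end{aligned} \tag{4.4}$$
where
$$H = -Re\,(\mu \times x \cdot \operatorname{grad} v - \mu \times v) + \partial_i F_{ij} e_j.$$
By assumption, we have that $H \in L^2_{loc}(\bar\Omega)$. From [11, Theorem IV.5.1], it then follows that $v \in W^{2,2}_{loc}(\bar\Omega)$. By the Sobolev embedding theorem, we then have $v \in W^{1,6}_{loc}(\bar\Omega)$, which implies that $H \in L^6_{loc}(\bar\Omega)$. Again from [11, Theorem IV.5.1], we then infer $v \in W^{2,6}_{loc}(\bar\Omega)$ which, in turn, by the Sobolev embedding theorem, gives $v \in W^{1,\infty}_{loc}(\bar\Omega)$. Thus, we conclude, in particular, that $H \in L^s_{loc}(\bar\Omega)$ for all s > 1. We then use Theorem IV.5.1 in [11] to find that, on the one hand, $v \in W^{2,s}_{loc}(\bar\Omega)$ and, on the other hand, that
$$\|v\|_{2,s,\Omega_{r_1}} + \|\operatorname{grad} p\|_{2,s,\Omega_{r_1}} \le c\,\big(\|H\|_{s,\Omega_\rho} + \|v_*\|_{2-1/s,s,\partial\Omega} + \|v\|_{1,s,\Omega_\rho} + \|p\|_{s,\Omega_\rho}\big)$$
for arbitrary $\rho > r_1$. Recalling the form of H, we then obtain
$$\|v\|_{2,s,\Omega_{r_1}} + \|\operatorname{grad} p\|_{2,s,\Omega_{r_1}} \le c\,\big(\|\partial_i F_{ij} e_j\|_{s,\Omega_\rho} + \|v_*\|_{2-1/s,s,\partial\Omega} + (1 + Re)\|v\|_{1,s,\Omega_\rho} + \|p\|_{s,\Omega_\rho}\big). \tag{4.5}$$
Applying Theorem IV.5.3 of [11] to (4.4) we also find, in particular, for any r > ρ,
$$\|v\|_{1,s,\Omega_\rho} \le c\,\big(\|H\|_{-1,s,\Omega_r} + \|v_*\|_{1-1/s,s,\partial\Omega} + \|v\|_{s,\Omega_r} + \|p\|_{s,\Omega_r}\big). \tag{4.6}$$
Since
$$\|H\|_{-1,s,\Omega_r} \le c\,(1 + Re)\|v\|_{s,\Omega_r} + \|\partial_i F_{ij} e_j\|_{s,\Omega_r}, \tag{4.7}$$
the lemma follows from (4.5)–(4.7). □

The main result of this section is given in the following theorem.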
THEOREM 4.1. Let $\Omega$ be an exterior domain of class $C^2$ and let $F=\{F_{ij}\}$ be a second-order tensor field on $\Omega$ such that
\[
[|F|]_2+[|\partial_iF_{ij}\,e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4<\infty.
\]
Assume, further, $v^{*}\in W^{2-1/q,q}(\partial\Omega)$ for all $q>1$, satisfying (4.2). Then, problem (4.1) has one and only one solution $v,p$ verifying
\[
\begin{aligned}
&v\in W^{2,q}_{loc}(\overline{\Omega})\cap D^{2,2}(\Omega),\ \text{all } q\ge1, &&[|v|]_1+[|\operatorname{grad}v|]_2<\infty,\\
&p\in W^{1,q}_{loc}(\overline{\Omega}),\ \text{all } q\ge1, &&[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^R}<\infty,\ \text{all } R>\delta(B).
\end{aligned}
\tag{4.8}
\]
Moreover, the following estimate holds
\[
\begin{aligned}
\|v\|_{2,q,\Omega_R}+\|D^2v\|_2&+[|v|]_1+[|\operatorname{grad}v|]_2+[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^R}\\
&\le c\,\bigl([|F|]_2+[|\partial_iF_{ij}\,e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4+\|v^{*}\|_{2-1/q,q,\partial\Omega}\bigr),
\end{aligned}
\tag{4.9}
\]
where the constant $c$ depends only on $\Omega$, $q$, $R$ and $B$, whenever $\mathrm{Re}\in[0,B]$.

Proof. Uniqueness in the class defined by (4.8) is immediately obtained by a standard argument. Actually, consider (4.1) with $F\equiv v^{*}\equiv0$, and denote the resulting problem by (4.1)$_0$. Dot-multiplying the first equation in (4.1)$_0$ by $v$, integrating by parts over $\Omega_r$, $r>\delta(B)$, and using the conditions $\operatorname{div}v=0$ and $v|_{\partial\Omega}=0$, furnishes
\[
\int_{\Omega_r}|\operatorname{grad}v|^2=\int_{\partial B_r}\bigl(N\cdot\operatorname{grad}v\cdot v-p\,v\cdot N\bigr),\qquad N=x/|x|.
\]
Then uniqueness follows by letting $r\to\infty$ in this relation and by using the asymptotic properties of $v$ and $p$ given in (4.8).

We now show existence. Let $v$ be the solution to problem (4.1) determined in Lemma 4.1. By assumption and Lemma 4.2 we have
\[
v\in D^{1,2}(\Omega)\cap W^{2,q}_{loc}(\overline{\Omega}),\qquad p\in W^{1,q}_{loc}(\overline{\Omega}),\qquad\text{all } q\ge1.
\tag{4.10}
\]
Let $\phi=1-\psi_R$, $R>\delta(B)$, where $\psi_R$ is the "cut-off" function introduced in Theorem 3.1, and set
\[
u=\phi v+w,\qquad \varphi=\phi p,
\]
where $w$ satisfies the following problem
\[
\operatorname{div}w=-\operatorname{grad}\phi\cdot v\ \text{ in }\Omega_{2R},\qquad w\in W^{3,q}_0(\Omega_{2R})\ \text{ for all } q>1.
\tag{4.11}
\]
By virtue of (4.10) and well-known results on problem (4.11), e.g., [11, Section III.3], it follows that the field $w$ does exist. By a direct calculation that uses (4.1), we find that $u,\varphi$ satisfy problem (3.20) with $F$ replaced by $\phi F$ and
\[
f=-\Delta w-\mu\times x\cdot\operatorname{grad}w+\mu\times w-(\mu\times x\cdot\operatorname{grad}\phi)\,v-\Delta\phi\,v-2\operatorname{grad}\phi\cdot\operatorname{grad}v-p\operatorname{grad}\phi-\partial_i\phi\,F_{ij}\,e_j.
\]
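In view of (4.10) and of the assumptions on $F$ we find that $\phi F$ and $f$ satisfy the hypotheses of Theorem 3.1. Therefore, according to that theorem, there exists at least one solution $\bar u,\bar\varphi$ satisfying all conditions there stated. However, in view of (4.8), we have $u\in D^{1,2}(\mathbb{R}^3)$ and so, again by Theorem 3.1, we conclude $\bar u\equiv u$ and $\bar\varphi\equiv\varphi$ (possibly redefining $\varphi$ by the addition of a constant). In view of (4.10) and of the Sobolev embedding theorem, we then conclude that $v$ and $p$ satisfy (4.8).

It remains to show the validity of the estimate. To this end, using (4.10) and (4.11), we find
\[
\begin{aligned}
[|\phi F|]_2+[|\partial_i(\phi F_{ij})\,e_j|]_3+[|\partial_j\partial_i(\phi F_{ij})|]_3
&\le c\,\bigl([|F|]_2+[|\partial_iF_{ij}\,e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4\bigr),\\
\|f\|_q+\|\operatorname{div}f\|_q
&\le c\,\bigl(\|v\|_{2,q,\Omega_{2R}}+\|\operatorname{grad}p\|_{q,\Omega_{2R}}+[|F|]_2+[|\partial_iF_{ij}\,e_j|]_3\bigr),
\end{aligned}
\tag{4.12}
\]
where $c$ depends also on $B$. By Lemma 4.2, the second inequality delivers
\[
\|f\|_q+\|\operatorname{div}f\|_q\le c\,\bigl(\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}+[|F|]_2+[|\partial_iF_{ij}\,e_j|]_3\bigr),
\tag{4.13}
\]
where $\rho=3R$ (say) and $c$ is a constant depending only on $R$, $q$ and $B$. Taking into account the first inequality in (4.12) and (4.13), and using (3.73) we obtain, in particular,
\[
\|D^2v\|_{2,\Omega^{2R}}+[|v|]_{1,\Omega^{2R}}+[|\operatorname{grad}v|]_{2,\Omega^{2R}}+[|p|]_{2,\Omega^{2R}}+[|\operatorname{grad}p|]_{3,\Omega^{2R}}
\le c\,\bigl(N+\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\bigr),
\]
where
\[
N\equiv[|F|]_2+[|\partial_iF_{ij}\,e_j|]_3+[|\partial_j\partial_iF_{ij}|]_4.
\]
Combining this latter inequality and the inequality in Lemma 4.2 with $r_1=2R$, $r=\rho\equiv3R$, and using again the Sobolev embedding theorem, we conclude
\[
\begin{aligned}
\|v\|_{2,q,\Omega_{2R}}+\|D^2v\|_2+[|v|]_1+[|\operatorname{grad}v|]_2&+\|\operatorname{grad}p\|_{q,\Omega_{2R}}+[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^{2R}}\\
&\le c\,\bigl(N+\|v^{*}\|_{2-1/q,q,\partial\Omega}+\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\bigr),
\end{aligned}
\tag{4.14}
\]
with a constant $c$ depending only on $\Omega$, $R$, $q$ and $B$. To complete the proof of the theorem, it is enough to show that there is a constant $c$, again depending at most on $\Omega$, $R$, $q$ and $B$, such that
\[
\|v\|_{q,\Omega_\rho}+\|p\|_{q,\Omega_\rho}\le c\,\bigl(N+\|v^{*}\|_{2-1/q,q,\partial\Omega}\bigr).
\tag{4.15}
\]
Assume this inequality does not hold. Then, in view of the linearity of problem (4.1), we can find a sequence $\{F_n,v^{*}_n,\mathrm{Re}_n\}$, with $\mathrm{Re}_n\in[0,B]$, and a sequence of corresponding solutions $\{v_n,p_n\}$, such that
\[
\begin{aligned}
[|F_n|]_2+[|\partial_iF_{nij}\,e_j|]_3+[|\partial_j\partial_iF_{nij}|]_4+\|v^{*}_n\|_{2-1/q,q,\partial\Omega}&\le\frac1n,\\
\|v_n\|_{q,\Omega_\rho}+\|p_n\|_{q,\Omega_\rho}&=1.
\end{aligned}
\tag{4.16}
\]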
From (4.14), it follows that the sequence of solutions is bounded in the norm defined by the left-hand side of (4.14) and that, therefore, it converges, in a suitable topology, to a pair $\{v_0,p_0\}$ which belongs to the class defined by (4.8). Since, in particular,
\[
\|v_n\|_{1,q,\Omega_\rho}+\|p_n\|_{1,q,\Omega_\rho}\le M
\]
with $M$ independent of $n$, by Rellich's theorem and by the second equation in (4.16) we infer
\[
\|v_0\|_{q,\Omega_\rho}+\|p_0\|_{q,\Omega_\rho}=1.
\tag{4.17}
\]
Moreover, using (4.16), it is easy to show that $v_0,p_0$ is a solution of the following boundary-value problem
\[
\left.\begin{aligned}
\Delta v_0+\mathrm{Re}_0\,(\mu\times x\cdot\operatorname{grad}v_0-\mu\times v_0)&=\operatorname{grad}p_0\\
\operatorname{div}v_0&=0
\end{aligned}\right\}\ \text{in }\Omega,
\qquad v_0=0\ \text{at }\partial\Omega,
\tag{4.18}
\]
where $\mathrm{Re}_0=\lim_{n\to\infty}\mathrm{Re}_n$. However, $v_0,p_0$ satisfy (4.8) and, by the uniqueness property shown previously, we conclude $v_0=p_0=0$, contradicting (4.17). This proves (4.15), and the proof of the theorem is completed. □
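(Recall that $p_0(x)\to0$ as $|x|\to\infty$.)

5. Proof of Theorem 1.1

We are now in a position to give a proof of our main result. The proof of existence will be obtained by combining the results of Theorem 4.1 with a fixed point argument. To this end, for fixed $R>\delta(B)$ and $q>1$ we introduce the following space of functions:
\[
X_{R,q}=\bigl\{\phi\in W^{1,2}_{loc}(\Omega):\ \operatorname{div}\phi=0\ \text{in }\Omega,\ \ \|\phi\|_{2,q,\Omega_R}+\|D^2\phi\|_2+[|\phi|]_1+[|\operatorname{grad}\phi|]_2<\infty\bigr\}.
\]
Clearly, $X_{R,q}$ is a Banach space with the norm
\[
\|\phi\|_{X_{R,q}}\equiv\|\phi\|_{2,q,\Omega_R}+\|D^2\phi\|_2+[|\phi|]_1+[|\operatorname{grad}\phi|]_2.
\]
Let us consider the map
\[
M:\ \phi\in X_{R,q}\to v,
\]
where $v$ is a solution to the following problem
\[
\left.\begin{aligned}
\Delta v+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)&=\operatorname{grad}p+\partial_iF_{ij}\,e_j\\
\operatorname{div}v&=0
\end{aligned}\right\}\ \text{in }\Omega,
\qquad v=\mu\times x,\ x\in\partial\Omega,
\tag{5.1}
\]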
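where $F_{ij}=\mathrm{Re}\,\phi_i\phi_j$. Notice that, by virtue of the condition $\operatorname{div}\phi=0$, we have
\[
\partial_iF_{ij}\,e_j=\mathrm{Re}\,\phi\cdot\operatorname{grad}\phi,\qquad
\partial_i\partial_jF_{ij}=\mathrm{Re}\,\operatorname{grad}\phi\cdot(\operatorname{grad}\phi)^{\top}.
\]
Therefore, since $\phi\in X_{R,q}$, we deduce
\[
[|F|]_2+[|\partial_iF_{ij}\,e_j|]_3+[|\partial_i\partial_jF_{ij}|]_4\le c\,\mathrm{Re}\,\|\phi\|^2_{X_{R,q}}.
\]
So, by Theorem 4.1, we find, on the one hand, that $v$ is a uniquely determined element of $X_{R,q}$ and, on the other hand, that
\[
\|v\|_{X_{R,q}}+[|p|]_2+[|\operatorname{grad}p|]_{3,\Omega^R}\le c\,\bigl(\mathrm{Re}\,\|\phi\|^2_{X_{R,q}}+1\bigr).
\tag{5.2}
\]
Moreover, if $v_1=M(\phi_1)$ and $v_2=M(\phi_2)$, $\phi_1,\phi_2\in X_{R,q}$, setting $v=v_1-v_2$ and $\phi=\phi_1-\phi_2$ we deduce that $v$ satisfies the following problem
\[
\left.\begin{aligned}
\Delta v+\mathrm{Re}\,(\mu\times x\cdot\operatorname{grad}v-\mu\times v)&=\operatorname{grad}p+\mathrm{Re}\,\partial_i(\phi_i\phi_{1j}+\phi_{2i}\phi_j)\,e_j\\
\operatorname{div}v&=0
\end{aligned}\right\}\ \text{in }\Omega,
\qquad v=0\ \text{at }\partial\Omega,
\]
for some $p$. Thus, again by Theorem 4.1, it follows, in particular,
\[
\|v_1-v_2\|_{X_{R,q}}\le c\,\mathrm{Re}\,\bigl(\|\phi_1\|_{X_{R,q}}+\|\phi_2\|_{X_{R,q}}\bigr)\|\phi_1-\phi_2\|_{X_{R,q}}.
\tag{5.3}
\]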
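The smallness condition on Re implicit in (5.2) and (5.3) can be made explicit by a standard contraction-mapping computation. The sketch below is an illustration, not part of the original argument. It assumes that (5.2) reads $\|v\|\le c\,(\mathrm{Re}\,\|\phi\|^2+1)$, that the same constant $c$ serves in (5.2) and (5.3), and that $c$ has been fixed for Re in a bounded range $[0,B]$ as in Theorem 4.1. The ball $\mathcal{K}_\delta$ and the radius $\delta$ are names introduced here.

```latex
% Sketch: quantifying "sufficiently small Re" (same constant c assumed in (5.2) and (5.3)).
\begin{align*}
&\text{Ball: } \mathcal{K}_\delta=\{\phi\in X_{R,q}:\ \|\phi\|_{X_{R,q}}\le\delta\},
   \qquad \delta:=2c. \\
&\text{Self-map, by (5.2): }
   \|M(\phi)\|_{X_{R,q}}\le c\,(\mathrm{Re}\,\delta^2+1)=c\,(4c^2\,\mathrm{Re}+1)\le 2c=\delta
   \iff 4c^2\,\mathrm{Re}\le 1. \\
&\text{Contraction, by (5.3): }
   \|M(\phi_1)-M(\phi_2)\|_{X_{R,q}}
   \le 2\delta\,c\,\mathrm{Re}\,\|\phi_1-\phi_2\|_{X_{R,q}}
   = 4c^2\,\mathrm{Re}\,\|\phi_1-\phi_2\|_{X_{R,q}}.
\end{align*}
% Both requirements are met when Re < min{B, 1/(4c^2)}.
```

Under these assumptions $M$ maps the closed ball $\mathcal{K}_{2c}$ into itself and is a strict contraction there, so the Banach fixed point theorem applies.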
Inequalities (5.2) and (5.3) ensure that the map $M$ has a fixed point $v$ in $X_{R,q}$, for sufficiently small Re. Furthermore, in view of (5.2), the corresponding pressure $p$ satisfies the condition stated in the theorem. Finally, the pair $v,p$ is of class $C^\infty(\Omega)$, as a consequence of a boot-strap argument and of well-known regularity results for the Stokes problem (see, e.g., [11, Chapter IV]).

It remains to prove uniqueness. To this end, let $u=v_2-v_1$, $\varphi=p_2-p_1$, where $\{v_i,p_i\}$, $i=1,2$, are two smooth solutions to (2.1) in the class defined by (2.2) and corresponding to the same $\omega$. We then have
\[
\left.\begin{aligned}
\mathrm{Re}\,(u\cdot\operatorname{grad}u+u\cdot\operatorname{grad}v_1+v_1\cdot\operatorname{grad}u-\mu\times x\cdot\operatorname{grad}u+\mu\times u)&=\Delta u-\operatorname{grad}\varphi\\
\operatorname{div}u&=0
\end{aligned}\right\}\ \text{in }\Omega,
\]
\[
\lim_{|x|\to\infty}u(x)=0,\qquad u(x)=0\ \text{at }\partial\Omega.
\tag{5.4}
\]
Dot-multiplying the first equation in (5.4) by $u$, integrating by parts over $\Omega_R$ and taking into account the third and fourth equation in (5.4), we obtain (with $N=x/|x|$)
\[
\int_{\Omega_R}|\operatorname{grad}u|^2
=\int_{\partial B_R}\Bigl\{\frac{\partial u}{\partial n}\cdot u-\tfrac12\,\mathrm{Re}\,|u|^2\,(u+v_1)\cdot N-\varphi\,u\cdot N\Bigr\}
-\mathrm{Re}\int_{\Omega_R}u\cdot\operatorname{grad}v_1\cdot u.
\tag{5.5}
\]
We now observe that, in view of the asymptotic properties (2.2), the surface integral vanishes in the limit $R\to\infty$. Therefore, in this limit, from (5.5) we find
\[
\int_{\Omega}|\operatorname{grad}u|^2=-\mathrm{Re}\int_{\Omega}u\cdot\operatorname{grad}v_1\cdot u.
\]
However, $[|\operatorname{grad}v_1|]_2\le c$, and so the preceding equation delivers
\[
\int_{\Omega}|\operatorname{grad}u|^2\le c\,\mathrm{Re}\int_{\Omega}\frac{|u|^2}{|x|^2}.
\tag{5.6}
\]
Since [11, Section II.5],
\[
\int_{\Omega}\frac{|u|^2}{|x|^2}\le4\int_{\Omega}|\operatorname{grad}u|^2,
\]
from (5.6) we obtain $(1-4c\,\mathrm{Re})\int_{\Omega}|\operatorname{grad}u|^2\le0$. Hence, for sufficiently small Re, $\operatorname{grad}u=0$ in $\Omega$ and, by the fourth equation in (5.4), $u=0$, which completes the proof of the theorem. □

6. Conclusions

Consider a rigid body $B$ steadily rotating, with constant angular velocity $\omega$, in a Navier–Stokes liquid that fills the whole space exterior to $B$. The main achievement of this paper is that, if $B$ is of class $C^2$ and if $|\omega|$ is not "too large", the spatial asymptotics of the velocity $v$ and pressure $p$ of the liquid are completely and uniquely determined. Specifically, $v(x)$ and its gradient $\operatorname{grad}v(x)$ decay like $|x|^{-1}$ and $|x|^{-2}$, respectively, while the pressure field $p(x)$ and its gradient $\operatorname{grad}p(x)$ decay like $|x|^{-2}$ and $|x|^{-3}$, respectively.

This result is relevant in several respects. From a strictly theoretical point of view, it ensures existence of solutions satisfying basic physical requirements, at least for small data. In fact, these solutions are unique, satisfy the global energy balance and are nonlinearly stable in the sense of Liapunov. The result is also important in several applications, like particle sedimentation [14] and the calculation of net force and torque exerted by a viscous liquid on a rotating body at small and nonzero Reynolds number; see, e.g., [6, 16, 17], and references cited therein. Finally, the knowledge of the sharp asymptotic behaviour of solutions to elliptic systems in exterior domains is also fundamental in numerical computations, especially in evaluating the error made by approximating the infinite region of flow with a necessarily bounded domain. For problems of this type related to the Navier–Stokes equations see, e.g., [28, 7].
Acknowledgement I am indebted to Professor Christian Simader for helpful conversations about the proof of Lemma 3.1.
References

1. R. Adams, Sobolev Spaces. Academic Press, New York (1975).
2. A.S. Advani, Flow and Rheology in Polymer Composites Manufacturing. Elsevier, Amsterdam (1994).
3. K.I. Babenko, On stationary solutions of the problem of flow past a body of a viscous incompressible fluid. Mat. Sb. 91(133) (1973) 3–27; English transl.: Math. USSR-Sb. 20 (1973) 1–25.
4. W. Borchers, Zur Stabilität und Faktorisierungsmethode für die Navier–Stokes Gleichungen inkompressibler viskoser Flüssigkeiten. Habilitationsschrift, University of Paderborn (1992).
5. Z.-M. Chen and T. Miyakawa, Decay properties of weak solutions to a perturbed Navier–Stokes system in $\mathbb{R}^n$. Adv. Math. Sci. Appl. 7(2) (1997) 741–770.
6. R.G. Cox, The steady motion of a particle of arbitrary shape at small Reynolds numbers. J. Fluid Mech. 23 (1965) 625–643.
7. P. Deuring, On $H^2$-estimates of solutions to the Stokes system with an artificial boundary condition. J. Math. Fluid Mech. 4 (2002) 203–236.
8. P. Deuring and G.P. Galdi, On the asymptotic behavior of physically reasonable solutions to the stationary Navier–Stokes system in three-dimensional exterior domains with zero velocity at infinity. J. Math. Fluid Mech. 2(4) (2000) 353–364.
9. R. Finn, On the exterior stationary problem for the Navier–Stokes equations, and associated perturbation problems. Arch. Rational Mech. Anal. 19 (1965) 363–406.
10. H. Fujita, On the existence and regularity of the steady-state solutions of the Navier–Stokes equation. J. Fac. Sci. Univ. Tokyo (1A) 2 (1961) 59–102.
11. G.P. Galdi, An Introduction to the Mathematical Theory of the Navier–Stokes Equations: Linearized Steady Problems, revised edn. Springer Tracts Nat. Philos. 38. Springer-Verlag, New York (1998).
12. G.P. Galdi, An Introduction to the Mathematical Theory of the Navier–Stokes Equations: Nonlinear Steady Problems, revised edn. Springer Tracts Nat. Philos. 39. Springer-Verlag, New York (1998).
13. G.P. Galdi, Slow motion of a body in a viscous incompressible fluid with application to particle sedimentation. In: V.A. Solonnikov (ed.), Developments in Partial Differential Equations. Quaderni di Matematica della II Università di Napoli 2 (1998) 2–50.
14. G.P. Galdi, On the motion of a rigid body in a viscous liquid: A mathematical analysis with applications. In: S. Friedlander and D. Serre (eds), Handbook of Mathematical Fluid Mechanics. Elsevier Science (2002) pp. 653–791.
15. G.P. Galdi, J.G. Heywood and Y. Shibata, On the global existence and convergence to steady state of Navier–Stokes flow past an obstacle that is started from rest. Arch. Rational Mech. Anal. 138 (1997) 307–318.
16. G.P. Galdi and A. Vaidya, Translational steady fall of symmetric bodies in a Navier–Stokes liquid, with application to particle sedimentation. J. Math. Fluid Mech. 3(1) (2001) 183–211.
17. R.B. Guenther, R.T. Hudspeth and E.A. Thomann, Hydrodynamic forces on submerged rigid bodies – steady flow. J. Math. Fluid Mech. (2002), in press.
18. T. Hishida, An existence theorem for the Navier–Stokes flow in the exterior of a rotating obstacle. Arch. Rational Mech. Anal. 150 (1999) 307–348.
19. T. Hishida, The Stokes operator with rotation effect in exterior domains. Analysis (Munich) 19 (1999) 51–67.
20. D.D. Joseph, Flow induced microstructure in Newtonian and viscoelastic fluids. In: Proceedings of the Fifth World Congress of Chemical Engineering. Particle Technology Track. 6 (1996) 3–16.
21. G. Kirchhoff, Über die Bewegung eines Rotationskörpers in einer Flüssigkeit. J. Reine Ang. Math. 71 (1869) 237–281.
22. O.A. Ladyzhenskaya, Investigation of the Navier–Stokes equation for a stationary flow of an incompressible fluid. Uspekhi Mat. Nauk. 14(3) (1959) 75–97 (in Russian).
23. O.A. Ladyzhenskaya, N.N. Ural'ceva and V.A. Solonnikov, Linear and Quasilinear Equations of Parabolic Type. Transl. Math. Monographs 23. Amer. Math. Soc., Providence, RI (1968).
24. H. Lamb, Hydrodynamics. Cambridge Univ. Press (1932).
25. J. Leray, Etude de diverses équations intégrales non linéaires et de quelques problèmes que pose l'hydrodynamique. J. Math. Pures Appl. 12 (1933) 1–82.
26. J. Leray, Sur le mouvement d'un liquide visqueux emplissant l'espace. Acta Math. 63 (1934) 193–248.
27. S.A. Nazarov and K. Pileckas, On steady Stokes and Navier–Stokes problems with zero velocity at infinity in a three-dimensional exterior domain. J. Math. Kyoto Univ. 40(3) (2000) 475–492.
28. S.A. Nazarov and M. Specovius-Neugebauer, Approximation of exterior boundary value problems for the Stokes system. Asymptotic Anal. 14(3) (1997) 223–255.
29. A. Novotný and M. Padula, Note on decay of solutions of steady Navier–Stokes equations in 3-D exterior domains. Differential Integral Equations 8(7) (1995) 1833–1842.
30. F.K.G. Odqvist, Über die Randwertaufgaben der Hydrodynamik Zäher Flüssigkeiten. Math. Z. 32 (1930) 329–375.
31. C.W. Oseen, Neuere Methoden und Ergebnisse in der Hydrodynamik. Akad. Verlagsgesellschaft M.B.H., Leipzig (1927).
32. H. Schmid-Schonbein and R. Wells, Fluid drop-like transition of erythrocytes under shear. Science 165(3890) (1969) 288–291.
33. D. Serre, Chute libre d'un solide dans un fluide visqueux incompressible. Existence. Japan J. Appl. Math. 40(1) (1987) 99–110.
34. C.G. Simader and H. Sohr, The Dirichlet Problem for the Laplacian in Bounded and Unbounded Domains. Pitman Res. Notes Math. Ser. 360. Longman Sc. Tech. (1997).
35. G. Stokes, On the effect of internal friction of fluids on the motion of pendulums. Trans. Cambridge Phil. Soc. 9 (1851) 8–85.
36. B. Tinland, L. Meistermann and G. Weill, Simultaneous measurements of mobility, dispersion, and orientation of DNA during steady-field gel electrophoresis coupling a fluorescence recovery after photobleaching apparatus with a fluorescence detected linear dichroism setup. Phys. Rev. E 61(6) (2000) 6993–6998.
37. W. Thomson and P.G. Tait, Natural Philosophy, Vols. 1, 2. Cambridge Univ. Press (1879).
38. C. Truesdell and W. Noll, Handbuch der Physik, Vol. VIII/3. Springer-Verlag (1965).
39. H.F. Weinberger, On the steady fall of a body in a Navier–Stokes fluid. Proc. Sympos. Pure Math. 23 (1973) 421–440.
Global Bifurcation in Nonlinear Elasticity with an Application to Barrelling States of Cylindrical Columns

TIMOTHY J. HEALEY¹ and ERROL L. MONTES-PIZARRO²
¹ Center for Applied Mathematics and Department of Theoretical and Applied Mechanics, Cornell University, Ithaca, NY 14853, USA. E-mail: [email protected]
² Department of Mathematics and Physics, University of Puerto Rico, Cayey Campus, Cayey, PR 00736, Puerto Rico. E-mail: [email protected]

Received 18 October 2002; in revised form 12 May 2003

Abstract. We present rigorous local and global bifurcation results for a concrete example from 3-dimensional nonlinear elastostatics – the problem of barrelling of compressed cylindrical columns. We use standard tools of bifurcation theory for the local analysis, already producing results that are rare in our field. For the global part we employ the generalized degree designed by Healey and Simpson to overcome the specific difficulties of 3-dimensional nonlinear elasticity. Ours are the first global bifurcation results for a problem from 3-dimensional nonlinear elastostatics not governed by ordinary differential equations. Moreover, our approach to the barrelling problem provides a paradigm for the solution of a large class of problems in nonlinear elastostatics concerning bifurcation from a homogeneously deformed state.

Mathematics Subject Classifications (2000): 74B20, 74G25, 37G99.

Key words: local bifurcation, global bifurcation, barrelling, complementing condition, Green–Hadamard material, Blatz–Ko material.
We dedicate this work to the memory of Clifford Ambrose Truesdell III, whose scholarship, clarity of exposition and leadership inspired a generation.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 469–494. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

One of the main difficulties in applying degree-theoretic methods to problems of three-dimensional nonlinear elastostatics stems from the presence of traction boundary conditions. Indeed the latter correspond to nonlinear Neumann conditions, where the nonlinearity is in the dependence upon the deformation gradient. Even for scalar, second-order elliptic equations, the apparent inapplicability of the Leray–Schauder degree in this setting is well known [18] (cf. Chapter 10.1). In a recent work [14], a generalized degree was constructed to overcome this difficulty, and global-continuation results were obtained for a general class of boundary value
problems in nonlinear elasticity. The construction in [14] was inspired by an abstract degree proposed in [16] for a class of nonlinear Fredholm maps. However, the treatment in [14] accounts explicitly for mappings comprising nonlinear equations on both the domain and the boundary, which is common in problems from nonlinear continuum mechanics (cf. [7], where the degree developed in [14] was recently applied to a problem in water waves).

In [14] one starts from the unloaded, stress-free state, which is subsequently shown to be an element of a global continuum of solutions (in "load-displacement" space). In particular, for pure displacement problems, under physically reasonable hypotheses on the stored energy function and on the domain, one obtains unbounded branches of classical injective solutions, cf. [12, 13]. In the absence of a-priori bounds, this falls short of a general existence theorem for displacement problems. Nonetheless, the existence of solutions "in the large", i.e., "far" from the unloaded reference configuration, is established.

For problems admitting a trivial line of solutions, e.g., a family of homogeneous solutions parametrized by the magnitude of the loading, the existence results obtained in [12, 14, 13] are of little interest (the trivial solution branch itself is unbounded). Rather, bifurcating solutions from the trivial line are typically sought and characterized. The degree presented in [14] has all of the properties of the Leray–Schauder degree, including the capability of detecting global bifurcation in the sense of Rabinowitz [26].

The first step in a global bifurcation analysis is to verify a change in the degree as the bifurcation parameter crosses a singular point along the trivial line. As first observed by Krasnoselskii [17], this yields the existence of a local continuum of bifurcating solutions; Rabinowitz [26] later showed the global ramifications of this change in degree for operator equations in the form of a compact perturbation of the identity. In practice, the simplest and most common way to demonstrate a change in degree is to verify a certain transversality or "crossing condition", which also typically yields the existence of a local curve of bifurcating solutions, cf. [8]. In other words, a local analysis is the first step in performing a global one.

The literature in three-dimensional nonlinear elasticity is replete with examples in which the necessary conditions for bifurcation are worked out by (formally) linearizing the nonlinear problem about a trivial line of homogeneous solutions. The critical value or "buckling load" at which the linearization fails to be injective is then determined, e.g., cf. [5, 21, 22, 27, and references therein]. This is often referred to in the literature as the problem of "small on large". Given the maturity of the field of bifurcation theory, there is surprisingly little rigorous analysis in the literature on sufficient conditions for bifurcation in concrete examples from three-dimensional nonlinear elasticity. In fact, we know of only two such works – [25, 31] – where the existence of a local curve of bifurcating solutions is obtained.

In this work we choose a typical example from the literature for which only the necessary conditions for bifurcation are well understood, viz., the problem of barrelling of compressed cylindrical columns. This problem was studied by Simpson and Spector [28, 29], and by Davies [9, 10]. In those works, the existence of the
trivial line of homogeneous solutions is established, the critical load is obtained, and the stability (and instability) of solutions on the trivial line is determined. Although our ultimate goal here is to obtain global bifurcation results, quite a bit of the paper is devoted to a rigorous local bifurcation analysis. Here we benefit from the treatment in [31]. To the best of our knowledge, ours are the first global bifurcation results in a problem from three-dimensional elasticity not governed by ordinary differential equations. Moreover, our approach to the barrelling problem serves as a paradigm for the nonlinear analysis of a broad class of "small on large" problems in nonlinear elasticity.

The outline of the paper is as follows: In Section 2 we present the formulation of our problem. As in [9, 10, 28, 29], we assume that the two compressed ends are subjected to "sliding" conditions, which ensure that the trivial solution branch corresponds to a homogeneously deformed state. We impose strong ellipticity, which, among other things, plays a key role in our ability to reformulate the problem as that of finding certain periodic solutions for an infinite cylinder. In this way, we eliminate the presence of corners on the boundary of the domain, thus resolving otherwise difficult questions of regularity. In Section 3 we summarize the necessary conditions for bifurcation from the homogeneous state, and in Section 4 we present concrete examples verifying those conditions. In Section 5 we provide a local bifurcation analysis of our problem, ensuring the existence of a curve of nontrivial "barrelling" solutions. Finally, in Section 6 we obtain global bifurcation results using the degree presented in [14]. Following up on an observation made in [12], we take the opportunity here to simplify the construction of the degree via uniform spectral estimates, cf. Proposition 6.3.

As is the case for the continuation results in [14], the first general result here provides the existence of a global continuum of nontrivial solutions, characterized not only by the two usual Rabinowitz alternatives, but also by the possibility that the branch "terminates" due to a loss of local injectivity and/or a failure of the complementing condition. As in [13], we are able to eliminate the possibility of a terminated or bounded branch of solutions due to a loss of local injectivity for a large class of realistic materials. Interestingly, the presence of traction boundary conditions requires slightly stronger growth conditions here than those employed in [13] for pure displacement problems.
NOTATION

Throughout this work we presume that a fixed, right-handed, rectangular Cartesian frame of reference has been chosen for $E^3$, Euclidean 3-space; we employ the usual abuse of notation by associating points in $E^3$ and vectors (in the space of translations of $E^3$) with their coordinates and components, respectively, relative to the Cartesian frame, as elements of $\mathbb{R}^3$. Elements of $\mathbb{R}^3$, henceforth called vectors, are denoted by boldface, lowercase Latin letters such as $\mathbf{a}$, $\mathbf{x}$, etc.; $\mathbf{a}\cdot\mathbf{b}$ denotes the inner product of $\mathbf{a}$ and $\mathbf{b}$. Linear transformations of $\mathbb{R}^3$ into itself, also called (second order) tensors, are denoted by boldface, uppercase symbols like $\mathbf{A}$, $\mathbf{L}$, etc.; $\mathbf{I}$ denotes the identity. $\mathbf{A}^T$, $\mathbf{A}^{-1}$, $\operatorname{tr}\mathbf{A}$ and $\det\mathbf{A}$ denote the transpose, the inverse, the trace and the determinant, respectively, of $\mathbf{A}$.

Given two Banach spaces $X$ and $Y$, we denote the space of all bounded linear transformations of $X$ into $Y$ by $L(X,Y)$; $L(X)\equiv L(X,X)$. Uppercase symbols like $A$, $L$, etc., denote elements of $L(X,Y)$; $I\in L(X)$ denotes the identity. We write $A[x]$ for the value of $A\in L(X,Y)$ at $x\in X$. In particular, we consistently employ the latter notation when dealing with elements of $L(L(\mathbb{R}^3))$, which are called fourth order tensors: $\mathsf{C}[\mathbf{H}]$ denotes the value of $\mathsf{C}\in L(L(\mathbb{R}^3))$ at $\mathbf{H}\in L(\mathbb{R}^3)$. We also define
\[
\begin{aligned}
GL(X)&\equiv\{A\in L(X):\ A\text{ is bijective}\},\\
GL^{+}(\mathbb{R}^3)&\equiv\{\mathbf{A}\in GL(\mathbb{R}^3):\ \det\mathbf{A}>0\},\\
SO(3)&\equiv\{\mathbf{A}\in L(\mathbb{R}^3):\ \mathbf{A}^T=\mathbf{A}^{-1}\}\cap GL^{+}(\mathbb{R}^3),\\
\mathrm{Sym}(L(\mathbb{R}^3))&\equiv\{\mathsf{C}\in L(L(\mathbb{R}^3)):\ \mathbf{A}\cdot\mathsf{C}[\mathbf{B}]=\mathbf{B}\cdot\mathsf{C}[\mathbf{A}],\ \forall\mathbf{A},\mathbf{B}\in L(\mathbb{R}^3)\},
\end{aligned}
\]
where $\mathbf{E}\cdot\mathbf{F}\equiv\operatorname{tr}(\mathbf{E}^T\mathbf{F})$ for all $\mathbf{E}$ and $\mathbf{F}$, which is the standard inner product on $L(\mathbb{R}^3)$. Finally, for a function $\Phi(x_1,\dots,x_n)$ of $n$ variables we denote by $\Phi_{,i}$ the partial derivative of $\Phi$ with respect to its $i$th argument, and by $\Phi_{,ij}$ the mixed second order partial derivatives.

2. Formulation

Let $\Omega\subset\mathbb{R}^3$ denote the right circular (open) cylinder of height $L$ and radius $R$ given by
\[
\Omega=\{(x_1,x_2,x_3)\in\mathbb{R}^3:\ x_1^2+x_2^2<R^2,\ 0<x_3<L\},
\tag{2.1}
\]
with boundary $\partial\Omega=\partial\Omega_B\cup\partial\Omega_L\cup\partial\Omega_T$, where
\[
\begin{aligned}
\partial\Omega_B&=\{(x_1,x_2,x_3):\ x_1^2+x_2^2\le R^2,\ x_3=0\},\\
\partial\Omega_T&=\{(x_1,x_2,x_3):\ x_1^2+x_2^2\le R^2,\ x_3=L\},\\
\partial\Omega_L&=\{(x_1,x_2,x_3):\ x_1^2+x_2^2=R^2,\ x_3\in[0,L]\}.
\end{aligned}
\tag{2.2}
\]
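We consider a hyperelastic, homogeneous, and isotropic body occupying $\Omega$ in a stress-free reference configuration. Let $\mathbf{f}$ denote a deformation of $\Omega$, which is, by definition, a differentiable mapping $\mathbf{f}:\Omega\subset\mathbb{R}^3\to\mathbb{R}^3$, i.e., $\mathbf{f}(\mathbf{x})$ denotes the position in the deformed state of the material point occupying position $\mathbf{x}$ in the reference configuration $\Omega$; we require local injectivity, viz., $\mathbf{F}(\mathbf{x})\equiv\nabla\mathbf{f}(\mathbf{x})\in GL^{+}(\mathbb{R}^3)$ for each $\mathbf{x}\in\Omega$. Let $\mathbf{S}$ denote the (first) Piola–Kirchhoff stress tensor. We then study the following boundary value problem:
\[
\begin{aligned}
\nabla\cdot\mathbf{S}&=\mathbf{0}&&\text{in }\Omega, &&(2.3)\\
f_3&=0&&\text{on }\partial\Omega_B, &&(2.4)\\
f_3&=\lambda L&&\text{on }\partial\Omega_T, &&(2.5)\\
S_{13}=S_{23}&=0&&\text{on }\partial\Omega_B\cup\partial\Omega_T, &&(2.6)\\
\mathbf{S}\mathbf{n}&=\mathbf{0}&&\text{on }\partial\Omega_L. &&(2.7)
\end{aligned}
\]
Here $\mathbf{n}$ denotes the outward unit normal to $\partial\Omega_L$, and $\lambda\in(0,\infty)$ is a "loading" parameter. To avoid trivial nonuniqueness we also impose:
\[
\int_\Omega f_1=\int_\Omega f_2=0,
\tag{2.8}
\]
\[
\int_\Omega[f_{1,2}-f_{2,1}]=0.
\tag{2.9}
\]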
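By hyperelasticity we mean there exists a sufficiently smooth stored-energy function $W:GL^{+}(\mathbb{R}^3)\to\mathbb{R}$, such that
\[
\mathbf{S}(\mathbf{F})=\frac{dW(\mathbf{F})}{d\mathbf{F}}.
\tag{2.10}
\]
Since the reference configuration is assumed to be stress-free, we have
\[
\frac{dW(\mathbf{I})}{d\mathbf{F}}=\mathbf{0}.
\tag{2.11}
\]
We require $W$ to satisfy the principle of material objectivity (frame-indifference)
\[
W(\mathbf{Q}\mathbf{F})=W(\mathbf{F}),\qquad\forall\mathbf{Q}\in SO(3).
\tag{2.12}
\]
In addition, by isotropy we mean
\[
W(\mathbf{F}\mathbf{Q})=W(\mathbf{F}),\qquad\forall\mathbf{Q}\in SO(3).
\tag{2.13}
\]
Consequently, there exists a smooth function $\Phi:\mathbb{R}^2\times\mathbb{R}^{+}\to\mathbb{R}$, such that
\[
W(\mathbf{F})=\Phi\Bigl(\tfrac12\,\mathbf{F}\cdot\mathbf{F},\ \tfrac14\,\mathbf{F}\mathbf{F}^T\cdot\mathbf{F}\mathbf{F}^T,\ \det\mathbf{F}\Bigr).
\tag{2.14}
\]
The fourth order tensor $\mathsf{C}$ defined by
\[
\mathsf{C}(\mathbf{F})\equiv\frac{d^2W(\mathbf{F})}{d\mathbf{F}^2}
\tag{2.15}
\]
is called the elasticity tensor. Note that
\[
\mathsf{C}\in\mathrm{Sym}(L(\mathbb{R}^3))\equiv\{\mathsf{A}\in L(L(\mathbb{R}^3)):\ \mathbf{D}\cdot\mathsf{A}[\mathbf{B}]=\mathbf{B}\cdot\mathsf{A}[\mathbf{D}],\ \forall\mathbf{D},\mathbf{B}\in L(\mathbb{R}^3)\}.
\tag{2.16}
\]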
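Combining (2.10) with the representation (2.14) gives an explicit formula for the stress by the chain rule; this is the formula behind the component expressions (2.26) and (2.27). The computation is sketched below. The symbol $\Phi$ for the three-argument function in (2.14), with $\Phi_{,k}$ its partial derivative with respect to the $k$th argument, is a label used here for illustration.

```latex
% Sketch: stress from the isotropic representation (2.14).
% \Phi_{,k} is evaluated at ( (1/2) F.F , (1/4) F F^T . F F^T , det F ).
\begin{align*}
\frac{d}{d\mathbf{F}}\Bigl(\tfrac12\,\mathbf{F}\cdot\mathbf{F}\Bigr) &= \mathbf{F}, \\
\frac{d}{d\mathbf{F}}\Bigl(\tfrac14\,\mathbf{F}\mathbf{F}^{T}\cdot\mathbf{F}\mathbf{F}^{T}\Bigr)
   &= \mathbf{F}\mathbf{F}^{T}\mathbf{F}, \\
\frac{d}{d\mathbf{F}}\det\mathbf{F} &= (\det\mathbf{F})\,\mathbf{F}^{-T},
\end{align*}
\[
\mathbf{S}(\mathbf{F}) = \Phi_{,1}\,\mathbf{F}
   + \Phi_{,2}\,\mathbf{F}\mathbf{F}^{T}\mathbf{F}
   + \Phi_{,3}\,(\det\mathbf{F})\,\mathbf{F}^{-T}.
\]
```

The second derivative uses $\mathbf{F}\mathbf{F}^T\cdot\mathbf{F}\mathbf{F}^T=\operatorname{tr}\bigl((\mathbf{F}\mathbf{F}^T)^2\bigr)$, whose directional derivative along $\mathbf{H}$ is $4\,\mathbf{F}\mathbf{F}^T\mathbf{F}\cdot\mathbf{H}$.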
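We also make the following assumptions on $W$, which are physically reasonable and mathematically convenient:

H1. Growth conditions on $W$:
\[
\lim_{\det\mathbf{F}\to0^{+}}W(\mathbf{F})=+\infty\qquad\text{and}\qquad\lim_{\|\mathbf{F}\|\to+\infty}W(\mathbf{F})=+\infty.
\tag{2.17}
\]

H2. Smoothness:
\[
W\in C^5(GL^{+}(\mathbb{R}^3),\mathbb{R}).
\tag{2.18}
\]

H3. The restriction of $\mathsf{C}(\mathbf{I})$ to $\mathrm{Sym}(\mathbb{R}^3)$ is positive-definite:
\[
\mathbf{H}\cdot\mathsf{C}(\mathbf{I})[\mathbf{H}]>0,\qquad\forall\mathbf{H}\in\mathrm{Sym}(\mathbb{R}^3)\setminus\{\mathbf{0}\}.
\tag{2.19}
\]

H4. The elasticity tensor is strongly-elliptic, i.e., for every $\mathbf{F}\in GL^{+}(\mathbb{R}^3)$,
\[
\mathbf{a}\otimes\mathbf{b}\cdot\mathsf{C}(\mathbf{F})[\mathbf{a}\otimes\mathbf{b}]>0,\qquad\forall\mathbf{a},\mathbf{b}\in\mathbb{R}^3\setminus\{\mathbf{0}\}.
\tag{2.20}
\]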
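Next we discuss the complementing condition. Let $\mathbf{x}_0\in\partial\Omega_L$, $\mathbf{F}_0\in GL^{+}(\mathbb{R}^3)$, and assume that $\mathsf{C}_0\equiv\mathsf{C}(\mathbf{F}_0)$ is strongly elliptic. Consider the following linear problem:
\[
\begin{aligned}
\nabla\cdot\mathsf{C}_0[\nabla\mathbf{v}]&=\alpha^2\mathbf{v}&&\text{in }\mathcal{H},\\
\mathsf{C}_0[\nabla\mathbf{v}]\,\mathbf{n}_0&=\mathbf{0}&&\text{on }\partial\mathcal{H},
\end{aligned}
\tag{2.21}
\]
where $\mathbf{n}_0\equiv\mathbf{n}(\mathbf{x}_0)$ is the outward unit normal to $\partial\Omega_L$ at $\mathbf{x}_0$, and $\mathcal{H}$ is the halfspace $\mathcal{H}=\{\mathbf{x}\in\mathbb{R}^3:\ (\mathbf{x}-\mathbf{x}_0)\cdot\mathbf{n}_0<0\}$. We seek solutions of (2.21) of the form
\[
\mathbf{v}(\mathbf{x})=\mathbf{z}((\mathbf{x}-\mathbf{x}_0)\cdot\mathbf{n}_0)\exp(i(\mathbf{x}-\mathbf{x}_0)\cdot\boldsymbol{\xi}),
\tag{2.22}
\]
for all unit vectors $\boldsymbol{\xi}$ such that $\boldsymbol{\xi}\cdot\mathbf{n}_0=0$, where $\mathbf{z}\in C^\infty([0,\infty),\mathbb{C}^3)$. The pair $(\mathsf{C}_0,\mathbf{n}_0)$ is said to satisfy the complementing condition if, for all unit vectors $\boldsymbol{\xi}$ orthogonal to $\mathbf{n}_0$, the only bounded solution of (2.21) with $\alpha=0$, of the form (2.22), is $\mathbf{v}\equiv\mathbf{0}$ [2, 30]. If the same is true for all $\alpha\ne0$, then $(\mathsf{C}_0,\mathbf{n}_0)$ is said to satisfy Agmon's condition. If both the complementing condition and Agmon's condition are satisfied, then $(\mathsf{C}_0,\mathbf{n}_0)$ is said to satisfy the strong complementing condition. As discussed, e.g., in [14], the verification of these conditions is equivalent to the nonvanishing of a certain $3\times3$ determinant, denoted $d(\mathbf{F}_0,\mathbf{x}_0,\boldsymbol{\xi},\alpha)\ne0$, where $d$ is continuous in its four arguments (cf. also [1, 20]).

The following result is well known, cf. [10, 29, 32]:

PROPOSITION 2.1. Assume that $\Phi$ is twice continuously differentiable and that hypotheses (2.17) and (2.20) hold. Then, for each $\lambda\in(0,\infty)$, there exists a unique constant $\mu(\lambda)>0$, such that
\[
\mathbf{f}(\mathbf{x})=\mathbf{H}_\lambda\mathbf{x}\equiv
\begin{pmatrix}\mu^{1/2}&0&0\\0&\mu^{1/2}&0\\0&0&\lambda\end{pmatrix}\mathbf{x}
\tag{2.23}
\]
is a solution of (2.3)–(2.14). Moreover, $\mu\in C^1((0,\infty);\mathbb{R})$, and $\mu(1)=1$.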
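To see what determines $\mu(\lambda)$ in Proposition 2.1, one can evaluate the stress on the homogeneous state (2.23). The sketch below uses the chain-rule formula $\mathbf{S}=\Phi_{,1}\mathbf{F}+\Phi_{,2}\mathbf{F}\mathbf{F}^T\mathbf{F}+\Phi_{,3}(\det\mathbf{F})\mathbf{F}^{-T}$, which follows from (2.10) and (2.14) when the function in (2.14) is called $\Phi$ (a label used here). Reading the proposition as a statement about a single scalar equation is an interpretation offered for orientation; the cited references contain the actual proof.

```latex
% Sketch: the homogeneous state F = H_\lambda = diag(\mu^{1/2}, \mu^{1/2}, \lambda).
\begin{align*}
\mathbf{S}(\mathbf{H}_\lambda) &= \operatorname{diag}(S_{11},\,S_{11},\,S_{33}), \qquad
S_{11} = \mu^{1/2}\bigl(\Phi_{,1} + \mu\,\Phi_{,2} + \lambda\,\Phi_{,3}\bigr), \\
&\text{with } \Phi_{,k} \text{ evaluated at }
\bigl(\mu + \tfrac12\lambda^{2},\ \tfrac14(2\mu^{2} + \lambda^{4}),\ \mu\lambda\bigr).
\end{align*}
% S is constant and diagonal, so (2.3) and (2.6) hold automatically, and
% f_3 = \lambda x_3 gives (2.4)-(2.5). On the lateral surface n = (n_1, n_2, 0),
% so (2.7) reads S n = S_{11} n = 0, i.e.
\[
\Phi_{,1} + \mu\,\Phi_{,2} + \lambda\,\Phi_{,3} = 0 .
\]
```

At $\lambda=1$ the choice $\mu=1$ gives $\mathbf{H}_\lambda=\mathbf{I}$, and the equation reduces to the stress-free condition (2.11), consistent with $\mu(1)=1$.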
In the jargon of bifurcation theory, the solution of (2.3)–(2.14) given by (2.23) is called the trivial solution. We want to study the existence of branches of nontrivial solutions bifurcating from (2.23). In order to carry out a rigorous bifurcation analysis for our problem, it is convenient to rewrite it abstractly as G(λ, u) = 0, where u denotes the displacement from the trivial solution and G is a nonlinear operator between appropriate Banach spaces. Before doing this, we derive boundary conditions, in terms of the displacement field u, that are equivalent to the "sliding" boundary conditions (2.4)–(2.6).

PROPOSITION 2.2. If we write f(x) = Hλ x + u(x), then the boundary conditions (2.4) and (2.5) become

\[ u_3(x_1, x_2, 0) = u_3(x_1, x_2, L) = 0, \tag{2.24} \]

while the zero-shear conditions (2.6) are equivalent to

\[ u_{1,3}(x_1, x_2, 0) = u_{1,3}(x_1, x_2, L) = 0 \quad\text{and}\quad u_{2,3}(x_1, x_2, 0) = u_{2,3}(x_1, x_2, L) = 0, \tag{2.25} \]
for x12 + x22 R 2 . Proof. The first part leading to (2.24) is trivial. For the second part, first notice that u3 constant in ∂T ∪ ∂B , implies u3,1 = u3,2 = 0 in ∂T ∪ ∂B . Using this with (2.10) and (2.14), we get that S13 = ,1 + ,2 (µ1/2 + u1,1 )2 + u21,2 + u21,3 + (λ + u3,3 )2 u1,3 + ,2 (µ1/2 + u1,1 )u2,1 + u1,2 (µ1/2 + u2,2 ) + u1,3 u2,3 u2,3 (2.26) and
S23 = ,2 (µ1/2 + u1,1 )u2,1 + u1,2 (µ1/2 + u2,2 ) + u1,3 u2,3 u1,3 + ,1 + ,2 u22,1 + (µ1/2 + u2,2 )2 + u22,3 + (λ + u3,3 )2 u2,3 . (2.27)
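In ∂T ∪ ∂B we have the decomposition

\[ F = \begin{pmatrix} \mu^{1/2} + u_{1,1} & u_{1,2} & u_{1,3} \\ u_{2,1} & \mu^{1/2} + u_{2,2} & u_{2,3} \\ 0 & 0 & \lambda + u_{3,3} \end{pmatrix} = F_0 + a \otimes e_3, \]

where

\[ F_0 = \begin{pmatrix} \mu^{1/2} + u_{1,1} & u_{1,2} & 0 \\ u_{2,1} & \mu^{1/2} + u_{2,2} & 0 \\ 0 & 0 & \lambda \end{pmatrix} \tag{2.28} \]

and

\[ a = (u_{1,3},\, u_{2,3},\, u_{3,3}). \tag{2.29} \]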
Observe that det F > 0 ⇒ det F0 > 0. Now strong ellipticity at F0 implies that the mapping a → s(a) ≡ S(F0 + a ⊗ e3 )e3 is injective, cf. [4]. Of course s(a) =
(S1,3, S2,3, S3,3), and by (2.26) and (2.27) we see that s((0, 0, u3,3)) = (0, 0, S3,3) in ∂T ∪ ∂B. On the other hand, if S1,3 = S2,3 = 0 in ∂T ∪ ∂B, then s((u1,3, u2,3, u3,3)) = (0, 0, S3,3), and thus u1,3 = u2,3 = 0 in ∂T ∪ ∂B by the injectivity of s. □

In view of (2.24) and (2.25), we now formulate our nonlinear problem on the infinite cylinder

∞ ≡ {(x1, x2, x3) ∈ R³ : x1² + x2² < R², −∞ < x3 < ∞}. (2.30)

We then impose the following even–odd symmetry assumptions on the components of u:

\[ u_i(x_1, x_2, x_3) = u_i(x_1, x_2, -x_3) \quad \text{for } i = 1, 2, \tag{2.31} \]

and

\[ u_3(x_1, x_2, x_3) = -u_3(x_1, x_2, -x_3). \tag{2.32} \]
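We restrict ourselves to axisymmetric solutions, viz., we require that u satisfy

u(Qx) = Qu(x) ∀x ∈ ∞, (2.33)

where

\[ Q = \begin{pmatrix} \cos\omega & -\sin\omega & 0 \\ \sin\omega & \cos\omega & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \forall\,\omega \in \mathbb{R} \bmod 2\pi, \qquad\text{or}\qquad Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]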
Let us now define the following spaces of functions:

X ≡ {u ∈ C^{2,α}(∞, R³) : u is 2L-periodic in x3 and satisfies (2.31)–(2.33)}, (2.34)
Y0 ≡ {v ∈ C^{0,α}(∞, R³) : v is 2L-periodic in x3 and satisfies (2.31)–(2.33)}, (2.35)
Y1 ≡ {w ∈ C^{1,α}(∂∞, R³) : w is 2L-periodic in x3 and satisfies (2.31)–(2.33)}, (2.36)
Y ≡ Y0 × Y1, (2.37)

where

∂∞ = {(x1, x2, x3) ∈ R³ : x1² + x2² = R²}, (2.38)

and where C^{k,α} denotes the usual Hölder spaces of all k-times continuously differentiable functions whose kth-order partial derivatives are (locally) Hölder continuous with exponent α ∈ (0, 1]. We endow X and Y with the usual Hölder norms, rendering them Banach spaces:

‖·‖X ≡ ‖·‖2,α; , ‖·‖Y ≡ ‖·‖0,α; + ‖·‖1,α;∂L. (2.39)
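Next we define

U ≡ {(λ, u) ∈ (0, ∞) × X : det(Hλ + ∇u) > 0 in }, (2.40)

and

\[ G(\lambda, u) \equiv \big( C(H_\lambda + \nabla u)[\nabla^2 u],\; S(H_\lambda + \nabla u)\,n \big), \tag{2.41} \]

where G: U → Y. From (2.15), the componential form of the first term in (2.41) is given by

\[ \big( C(H_\lambda + \nabla u)[\nabla^2 u] \big)_i \equiv \frac{\partial^2 W}{\partial F_{ij}\,\partial F_{kl}}(H_\lambda + \nabla u)\, \frac{\partial^2 u_k}{\partial x_j\,\partial x_l}. \]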
Observe that all u ∈ X automatically satisfy (2.8) and (2.9). From Proposition 2.2, we then conclude:

PROPOSITION 2.3. Any solution of the operator equation

G(λ, u) = 0, (2.42)

with u restricted to , is a solution of the BVP (2.3)–(2.20), and vice versa. That is, any classical solution of the BVP (2.3)–(2.20), for each fixed λ, can be 2L-periodically extended on ∞ to produce a solution of (2.42).
3. Linearized Problem

The mapping G, as defined in the previous section, has the property

G(λ, 0) = 0 ∀λ ∈ (0, ∞). (3.1)
Moreover, the smoothness hypothesis H2 (cf. (2.18)) ensures that G: U → Y is of class C², cf., e.g., [35]. (For the purposes of this section, G of class C¹ is enough.) We are interested in nontrivial solutions of (2.42) bifurcating from the trivial solution u ≡ 0. A necessary condition for bifurcation is that the linearized problem,

\[ G_u(\lambda, 0)[h] = \big( \nabla\cdot C(H_\lambda)[\nabla h],\; C(H_\lambda)[\nabla h]\,n \big) = (0, 0), \tag{3.2} \]

admit nontrivial solutions h ∈ X. Here Gu(λ, u) denotes the Fréchet derivative of u → G(λ, u) at (λ, u) ∈ U. Fortunately, problem (3.2) has been studied previously, cf. [9, 10, 29]. Any h ∈ X satisfying conditions (2.33) can be written in the form

\[ h(x_1, x_2, x_3) = \begin{pmatrix} \phi(r, z)\,x_1 \\ \phi(r, z)\,x_2 \\ \%(r, z) \end{pmatrix}, \tag{3.3} \]

where r² = x1² + x2² and z = x3. If we let θ(r, z) = r²φ(r, z), a long and tedious, but otherwise elementary, computation shows that our linear problem reduces to
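that of finding a pair (θ(r, z), %(r, z)) of C^{2,α}((0, R) × (0, L)) functions satisfying the equations

\[ \tau_1 \Big( \frac{\theta_r}{r} \Big)_r + \frac{\beta_1 \theta_{zz}}{r} + N\,\%_{rz} = 0, \qquad \beta_1 (r\,\%_r)_r + r\tau_3\,\%_{zz} + N\theta_{rz} = 0 \quad \text{in } (0, R) \times (0, L), \tag{3.4} \]

together with the boundary conditions

\[ \%(r, 0) = \%(r, L) = 0, \qquad \theta_z(r, 0) = \theta_z(r, L) = 0 \quad \text{for } r \in (0, R], \tag{3.5} \]

and

\[ \beta_1 R\,\%_r(R, z) + t^{1/2}\beta_1\,\theta_z(R, z) = 0, \qquad \tau_1\theta_r(R, z) + XR\,\%_z(R, z) = \frac{2\beta_3\,\theta(R, z)}{R} \quad \text{for } z \in [0, L]. \tag{3.6} \]

The requirement u ∈ C^{2,α} yields

\[ \lim_{r \to 0^+} \frac{\theta(r, z)}{r} = 0. \tag{3.7} \]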
In general, β1 , β3 , τ1 , and τ3 are given by the expressions (3.8) βi ≡ ,1 + ,2 (ν12 + ν22 + ν32 − νi2 ) > 0, ∂t i > 0, with (3.9) τi ≡ ν1 ν2 ν3 νi−1 ∂νi ν 2 ,1 + νi4 ,2 , (3.10) ti (ν1 , ν2 , ν3 ) ≡ ,3 + i ν1 ν2 ν3 √ where the νi ’s are the eigenvalues of FFT (the principal stretches) and the ti ’s are the eigenvalues of the Cauchy stress tensor T(F) = (det F)−1 S(F)FT (the principal stresses). It can be shown [34] that inequalities (3.8) and (3.9) are a consequence of strong ellipticity H4, cf. (2.20). Here in (3.4)–(3.6), the expressions (3.8) and (3.9) are evaluated at the trivial solution (2.23), viz., ν1 = ν2 = µ1/2 and ν3 = λ, and µ (3.11) t ≡ 2, λ X ≡ µ1/2 λ,11 + (µ3/2 λ + µ1/2 λ3 ),12 + (µ1/2 λ2 + µ3/2 ),31 + (µ1/2 λ4 + µ5/2 ),23 µ3/2 λ3 ,22 + µ3/2 λ,33 + µ1/2 ,3 ,
(3.12)
and

\[ N \equiv X + t^{1/2}\beta_1. \tag{3.13} \]

The following is due to Simpson and Spector [29]:
LEMMA 3.1. Suppose that the quadratic polynomial

\[ p(e) \equiv \beta_1\tau_1 e^2 + (\tau_1\tau_3 + \beta_1^2 - N^2)\,e + \beta_1\tau_3 \tag{3.14} \]

has distinct negative real roots e1 and e2. Then any C² solution (θ, %) of (3.4), (3.5), and (3.7) can be written as a uniformly and absolutely convergent Fourier series of the form

\[ \theta(r, z) = \sum_{n=1}^{\infty} \theta_n(r)\cos(\rho_n z), \qquad \%(r, z) = \sum_{n=1}^{\infty} \%_n(r)\sin(\rho_n z), \tag{3.15} \]

where ρn = nπ/L. The coefficients θn(r) and %n(r) are C² on [0, R] and

\[ \theta_n(r) = a_n\theta_{n1}(r) + b_n\theta_{n2}(r), \quad \%_n(r) = a_n\%_{n1}(r) + b_n\%_{n2}(r), \quad \theta_{nj}(r) = r I_1(\rho_n r f_j), \quad \%_{nj}(r) = D_j I_0(\rho_n r f_j), \tag{3.16} \]

where f_j = √|e_j|, D_j = [β1 − f_j² τ1]/(N f_j), for j = 1, 2, and I_k, for k = 0, 1, are the modified Bessel functions.

To find a nontrivial solution of (3.2), we look for those values of λ ∈ (0, ∞) such that the boundary conditions (3.6) are fulfilled. This, in turn, is equivalent (for a given λ ∈ (0, ∞)) to checking for values of n ∈ N such that (θn(r) cos(ρn z), %n(r) sin(ρn z)) satisfies (3.6). Hence, we have from Lemma 3.1 that a necessary and sufficient condition for the existence of such a number n is that the system

\[ A_n \left[ a_n \begin{pmatrix} \theta_{n1} \\ \%_{n1} \end{pmatrix} + b_n \begin{pmatrix} \theta_{n2} \\ \%_{n2} \end{pmatrix} \right]_{r=R} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \tag{3.17} \]

has nontrivial solutions an, bn ∈ R. Here An is the differential operator

\[ A_n = \begin{pmatrix} \tau_1 \dfrac{d}{dr} - \dfrac{2\beta_3}{r} & \rho_n X r \\[2mm] -\rho_n t^{1/2}\beta_1 & \beta_1 r \dfrac{d}{dr} \end{pmatrix}, \tag{3.18} \]

where X, t, θnj, and %nj are as defined above, cf. (3.11), (3.12), and (3.16). Note that (3.17) is a linear system with unknowns (an, bn) that can be rewritten in matrix form as

\[ \left[ A_n \begin{pmatrix} \theta_{n1} \\ \%_{n1} \end{pmatrix} ;\; A_n \begin{pmatrix} \theta_{n2} \\ \%_{n2} \end{pmatrix} \right]_{r=R} \begin{pmatrix} a_n \\ b_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \tag{3.19} \]
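The assembly of the 2 × 2 system (3.17)–(3.19) can be sketched numerically. The following is a minimal sketch in Python, assuming the reading f_j = √|e_j| in (3.16) and using the identity d/dr[r I1(cr)] = c r I0(cr); the function names `bessel_i` and `characteristic_det`, and the variable name `psi` for the second unknown of (3.4), are illustrative choices and do not come from the paper. The overall normalization of the determinant may differ from the one used in [28, 29].

```python
import math


def bessel_i(k, x, terms=60):
    """Modified Bessel function I_k(x) for k = 0 or 1, via its power series."""
    total = 0.0
    for j in range(terms):
        total += (x / 2.0) ** (2 * j + k) / (math.factorial(j) * math.factorial(j + k))
    return total


def characteristic_det(beta1, beta3, tau1, tau3, N, X, t, rho_n, R):
    """Determinant of the matrix in (3.19), evaluated at r = R.

    Column j holds A_n applied to (theta_nj, psi_nj) from (3.16),
    with A_n the boundary operator (3.18).
    """
    # Roots e_1 > e_2 of the quadratic p(e) in (3.14).
    b = tau1 * tau3 + beta1 ** 2 - N ** 2
    disc = b * b - 4.0 * (beta1 * tau1) * (beta1 * tau3)
    if disc <= 0.0:
        raise ValueError("p(e) must have distinct real roots")
    roots = [(-b + s * math.sqrt(disc)) / (2.0 * beta1 * tau1) for s in (1.0, -1.0)]
    if max(roots) >= 0.0:
        raise ValueError("p(e) must have negative roots")

    columns = []
    for e in roots:
        f = math.sqrt(-e)
        D = (beta1 - f * f * tau1) / (N * f)
        x = rho_n * R * f
        i0, i1 = bessel_i(0, x), bessel_i(1, x)
        theta = R * i1                 # theta_nj(R) = R I_1(x)
        dtheta = x * i0                # d/dr [r I_1(rho f r)] at r = R
        psi = D * i0                   # second unknown at r = R
        dpsi = D * rho_n * f * i1      # its r-derivative at r = R
        row1 = tau1 * dtheta - 2.0 * beta3 * theta / R + rho_n * X * R * psi
        row2 = -rho_n * math.sqrt(t) * beta1 * theta + beta1 * R * dpsi
        columns.append((row1, row2))

    (a11, a21), (a12, a22) = columns
    return a11 * a22 - a12 * a21
```

A root λ of the resulting determinant, for some mode number n, signals a nontrivial solution (an, bn) of (3.19); scanning λ for a sign change is one possible way to locate such roots in practice.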
Finally, we conclude that Gu(λ∗, 0) ∈ L(X, Y) has a nontrivial kernel if and only if λ∗ is a root of the equation (hereafter referred to as the characteristic equation)

\[ h(n, \lambda) \equiv \det \left[ A_n \begin{pmatrix} \theta_{n1} \\ \%_{n1} \end{pmatrix} ;\; A_n \begin{pmatrix} \theta_{n2} \\ \%_{n2} \end{pmatrix} \right]_{r=R} = 0, \tag{3.20} \]

for some n ∈ N.

Next suppose that we have a root, λ∗ ∈ (0, ∞), of the characteristic equation (3.20), i.e., (3.2) admits a nontrivial solution at λ = λ∗. In order to perform a rigorous local analysis of bifurcation, one must also demonstrate that Gu(λ∗, 0) is a Fredholm operator (of index zero). This, in turn, depends upon both the ellipticity of the differential operator (cf. H4, (2.20)) and the satisfaction of the complementing condition at each point x ∈ ∂L, cf. (2.20)–(2.22). By virtue of hypothesis H3, cf. [14, (2.19) and Proposition 2.1], we know that the pair (C(Hλ), n(x)) satisfies the strong complementing condition for every x ∈ ∂L and for each λ sufficiently close to λ = 1. To obtain explicit conditions under which our linearized problem satisfies the complementing condition, we note that the principal parts of the equations (3.4) and the boundary conditions (3.6), with the coefficients evaluated at a point in ∂∞ (cf. (2.38)), are given by

\[ \left. \begin{aligned} \frac{\beta_1\theta_{zz}}{R} + \frac{\tau_1\theta_{rr}}{R} + N\,\%_{rz} &= 0 \\ \beta_1 R\,\%_{rr} + R\tau_3\,\%_{zz} + N\theta_{rz} &= 0 \end{aligned} \right\} \quad r > 0, \tag{3.21} \]

and

\[ \left. \begin{aligned} R\,\%_r(0, z) + t^{1/2}\theta_z(0, z) &= 0 \\ \tau_1\theta_r(0, z) + XR\,\%_z(0, z) &= 0 \end{aligned} \right\} \tag{3.22} \]

respectively. In view of (2.22), we need to check whether the system (3.21)–(3.22) has solutions of the form

\[ \theta(r, z) = w_1(r)\,e^{i\alpha z}, \qquad \%(r, z) = w_2(r)\,e^{i\alpha z}, \tag{3.23} \]

with w1 and w2 bounded, and α, z ∈ R. Substitution of (3.23) into (3.21)–(3.22) yields

\[ \left. \begin{aligned} \tau_1 w_1''(r) - \beta_1\alpha^2 w_1(r) &= -NR\alpha i\, w_2'(r) \\ \beta_1 R\, w_2''(r) - R\tau_3\alpha^2 w_2(r) &= -N\alpha i\, w_1'(r) \end{aligned} \right\} \quad r > 0, \tag{3.24} \]

together with

\[ R\,w_2'(0) + t^{1/2}\alpha i\, w_1(0) = 0 \quad\text{and}\quad \tau_1 w_1'(0) + XR\alpha i\, w_2(0) = 0. \tag{3.25} \]

Note that the system (3.24)–(3.25) has constant coefficients. Accordingly, we look for solutions of the form

\[ w_1(r) = A_1 e^{\xi r}, \qquad w_2(r) = A_2 e^{\xi r}, \tag{3.26} \]
which upon substitution into (3.24) yields

\[ \big( \tau_1\xi^2 - \beta_1\alpha^2 \big) A_1 + NR\alpha i\xi\, A_2 = 0, \qquad N\alpha i\xi\, A_1 + \big( \beta_1 R\xi^2 - R\tau_3\alpha^2 \big) A_2 = 0. \tag{3.27} \]

The linear system of equations (3.27) has nontrivial solutions iff

\[ \tau_1\beta_1 \Big( \frac{\xi}{\alpha} \Big)^4 - \big( \tau_1\tau_3 + \beta_1^2 - N^2 \big) \Big( \frac{\xi}{\alpha} \Big)^2 + \beta_1\tau_3 = 0. \tag{3.28} \]

Upon extracting the roots of (3.28), we conclude that w1 and w2 are solutions of (3.24) iff

\[ \xi = \pm\alpha\sqrt{-e_j}, \quad j = 1, 2, \tag{3.29} \]

where e1 and e2 are the roots of (3.14). Therefore, the general solution of (3.24), satisfying the boundedness condition, is given by

\[ w_1(r) = A e^{\alpha\sqrt{-e_1}\,r} + B e^{\alpha\sqrt{-e_2}\,r}, \qquad w_2(r) = -i \left[ \frac{A D_1}{R} e^{\alpha\sqrt{-e_1}\,r} + \frac{B D_2}{R} e^{\alpha\sqrt{-e_2}\,r} \right], \tag{3.30} \]

assuming, without loss of generality, that Re(α√(−e_j)) ≤ 0, j = 1, 2, cf. (3.29). Finally, from (3.30) and the boundary conditions (3.25), we arrive at

\[ \begin{pmatrix} t^{1/2} - D_1\sqrt{-e_1} & t^{1/2} - D_2\sqrt{-e_2} \\ \tau_1\sqrt{-e_1} + XD_1 & \tau_1\sqrt{-e_2} + XD_2 \end{pmatrix} \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{3.31} \]

which has nontrivial solutions when the determinant of the coefficient matrix is equal to zero. This yields:

PROPOSITION 3.2. The pair (C(Hλ), n(x)) satisfies the complementing condition at λ ∈ (0, ∞) if and only if

\[ g(\lambda) \equiv \big( t^{1/2} - D_1\sqrt{-e_1} \big) \big( \tau_1\sqrt{-e_2} + XD_2 \big) - \big( t^{1/2} - D_2\sqrt{-e_2} \big) \big( \tau_1\sqrt{-e_1} + XD_1 \big) \neq 0. \tag{3.32} \]
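The test in Proposition 3.2 is easy to evaluate once the coefficients are known. The sketch below is a minimal illustration in Python, assuming that p(e) in (3.14) has real negative roots, so that √(−e_j) = f_j > 0 and D_j is as in Lemma 3.1; the function name `complementing_g` and the numerical inputs in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def complementing_g(beta1, tau1, tau3, N, X, t):
    """Evaluate g from (3.32); the complementing condition holds iff g != 0.

    Assumes p(e) in (3.14) has real, negative roots e_1 >= e_2.
    """
    b = tau1 * tau3 + beta1 ** 2 - N ** 2
    disc = b * b - 4.0 * (beta1 * tau1) * (beta1 * tau3)
    if disc < 0.0:
        raise ValueError("complex roots of p(e) are not handled in this sketch")
    sq = math.sqrt(disc)
    e1 = (-b + sq) / (2.0 * beta1 * tau1)
    e2 = (-b - sq) / (2.0 * beta1 * tau1)
    if e1 >= 0.0:
        raise ValueError("roots of p(e) must be negative")

    s1, s2 = math.sqrt(-e1), math.sqrt(-e2)       # sqrt(-e_j)
    D1 = (beta1 - s1 * s1 * tau1) / (N * s1)      # D_j from Lemma 3.1
    D2 = (beta1 - s2 * s2 * tau1) / (N * s2)
    rt = math.sqrt(t)
    return (rt - D1 * s1) * (tau1 * s2 + X * D2) - (rt - D2 * s2) * (tau1 * s1 + X * D1)


# Hypothetical coefficient values, for illustration only.
g_value = complementing_g(beta1=1.0, tau1=3.0, tau3=4.0, N=2.0, X=1.0, t=1.0)
print("complementing condition holds:", abs(g_value) > 1e-12)
```

Because g is a determinant, choosing the opposite sign of √(−e_j) in a column only changes the sign of that column, so the zero set of g is unaffected by that choice.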
4. Examples

It is difficult to study both the characteristic equation (3.20) and the complementing condition inequality (3.32) for arbitrary hyperelastic homogeneous isotropic materials. Simpson and Spector studied the characteristic equation (3.20) for Green–Hadamard materials in [29] and for Blatz–Ko materials in [28]. In this section we introduce these materials as examples verifying the general conditions of the previous section. We will briefly review the results of Simpson and Spector for the
corresponding characteristic equations and consider the complementing condition inequality (3.32). We only consider values of λ in the interval [0, 1]. The constitutive stored-energy function for Green–Hadamard materials is given by W (F) =
b a F · F + (F · F)2 − FFT · FFT + (det F). 2 4
(4.1)
We assume, as in [29], that a > 0, b 0, (s (s)) 0, for all s ∈ (0, 1], (1) = −a − 2b.
and
(4.2)
As remarked in [29] we have that conditions (4.2) imply strong ellipticity for the elasticity tensor at every F ∈ GL+ (R3 ) satisfying det F 1. Condition (4.2)2 is used to prove that the energy becomes infinite as det F goes to zero and (4.2)3 is equivalent to the reference configuration being natural. For Green–Hadamard materials we have: τ1 = a + b(µ + λ2 ) + q, β1 = a + bµ, β3 = a + bλ2 , q = µλ2 (µλ), µ(1) = 1.
τ3 = a + 2bµ + tq, N = t 1/2 [bλ2 + q], X = t 1/2 [−a − bµ + bλ2 + q], t = µλ−2 ,
(4.3)
For this material the roots of (3.14) are given by

\[ e_1 = -1, \qquad e_2 = -\frac{a + tq + 2b\mu}{a + q + b(\mu + \lambda^2)}. \tag{4.4} \]
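A short consistency check of (4.4), sketched here directly from (3.14) and the expressions in (4.3) and using only the relation tλ² = µ, runs as follows.

```latex
\begin{align*}
\tau_1 - \beta_1 &= b\lambda^2 + q, \\
\tau_3 - \beta_1 &= b\mu + tq = t\,(b\lambda^2 + q), \\
N^2 &= t\,(b\lambda^2 + q)^2 = (\tau_1 - \beta_1)(\tau_3 - \beta_1), \\
p(-1) &= \beta_1\tau_1 - (\tau_1\tau_3 + \beta_1^2 - N^2) + \beta_1\tau_3
       = N^2 - (\tau_1 - \beta_1)(\tau_3 - \beta_1) = 0, \\
e_1 e_2 &= \frac{\beta_1\tau_3}{\beta_1\tau_1}
  \;\Longrightarrow\;
  e_2 = -\frac{\tau_3}{\tau_1} = -\frac{a + tq + 2b\mu}{a + q + b(\mu + \lambda^2)}.
\end{align*}
```

The last line uses the fact that the product of the roots of p equals the ratio of its constant and leading coefficients.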
It can be shown, see [29], that e2 (λ) < −1, for every λ ∈ (0, 1), and that dµ/dλ 0, and 1 µ(λ) for every λ ∈ (0, 1]. Hence, in particular, we have e1 not equal to e2 , and we can apply Lemma 3.1. Making use of (4.3), (4.4), and (3.16)3,4 , equation (3.20) reduces to h(n, λ) = v(ρf ) −
2(t − 1) 2 a + bλ2 4t 2 f f = 0, v(ρ) + (1 + t 2 ) (1 + t 2 ) a + bµ
(4.5)
where v(r) ≡ rI0 (r)/I1 (r), f2 = 1 +
(t − 1)(q + bλ2 ) , (q + bλ2 ) + (a + bµ)
See the previous sections for the notation.
(4.6)
and ρ = ρn R (recall that ρn = nπ/L). In [28] Simpson and Spector showed that for each n ∈ N there exists a λn ∈ (0, 1) such that (4.5) is satisfied. In other words, if W is given by (4.1) and (4.2), then for each n ∈ N there exists a λn ∈ (0, 1) such that (3.4)–(3.7) has a solution that is a linear combination of (θni, %ni) as given by (3.16). Hence, for those values of λ, our linearized problem (3.2) admits nontrivial solutions.

For Green–Hadamard materials, to determine the values of λ ∈ (0, 1) for which our linearized problem (3.2) fails to satisfy the complementing condition, we substitute (4.3) and (4.4) in the left-hand side of (3.32) and solve

\[ g(\lambda) \equiv \big[ a + q + b(\mu + \lambda^2) \big] \big[ (t + 1)^4 + 16t^2 e_2 \big] = 0 \tag{4.7} \]

for λ ∈ (0, 1).

PROPOSITION 4.1. For Green–Hadamard materials satisfying conditions (4.2), the complementing condition is always violated at least once, i.e., there exists λc ∈ (0, 1) such that the complementing condition fails at λ = λc.

Proof. It is easy to see that at λ = 1 we have e2 = −1 and t = 1, from which it follows that g(1) = 0. On the other hand, g′(1) > 0 and lim_{λ→0+} g(λ) = +∞. Hence, using the continuity of g, we conclude that there exists λc ∈ (0, 1) such that g(λc) = 0. □

The constitutive stored-energy function for Blatz–Ko materials (see [6]) is a special case of (4.1) corresponding to a = 1, b = 0, and (s) = (1/m) s^{−m}, with m > 0, i.e.,

\[ W(F) = \tfrac{1}{2}\, F \cdot F + \tfrac{1}{m} (\det F)^{-m}. \tag{4.8} \]

For this material the formulas (4.3) and (4.4) reduce to

\[ \begin{aligned} \mu &= \hat\mu(\lambda) = \lambda^{-m/(m+1)}, & \beta_1 &= \beta_3 = 1, \\ \tau_1 &= m + 2, & \tau_3 &= 1 + (m + 1)\lambda^{-(3m+2)/(m+1)}, \\ t &= \mu\lambda^{-2} = \lambda^{-(3m+2)/(m+1)}, & X &= m\lambda^{-(3m+2)/(2(m+1))}, \\ N &= (m + 1)\lambda^{-(3m+2)/(2(m+1))}, \end{aligned} \tag{4.9} \]

and

\[ e_1 = -1, \qquad e_2 = -\frac{1}{m + 2} \Big[ 1 + (m + 1)\lambda^{-(3m+2)/(m+1)} \Big]. \tag{4.10} \]
The characteristic equation (4.5) reduces to (cf. (4.6)), tm 1/2 1/2 −1/2 ) m+2− v(ρe2 ) = 0. h(n, λ) = −4t v(ρ) + 2(t − t e2
(4.10)
(4.11)
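The Blatz–Ko reductions can be checked numerically. The following minimal sketch in Python evaluates the coefficients (4.9), confirms that e1 = −1 and e2 = −τ3/τ1 (the value given by (4.4)) are roots of p(e) in (3.14), and locates by bisection the positive root of the cubic factor of the polynomial condition (4.12), from which the critical stretch follows via t = λ^{−(3m+2)/(m+1)}. The function names `blatz_ko_coefficients`, `p`, and `critical_stretch` are hypothetical and used only for this illustration.

```python
import math


def blatz_ko_coefficients(m, lam):
    """Coefficients (4.9) for a Blatz-Ko material at axial stretch lam."""
    t = lam ** (-(3.0 * m + 2.0) / (m + 1.0))
    return {
        "mu": lam ** (-m / (m + 1.0)),
        "beta1": 1.0,
        "beta3": 1.0,
        "tau1": m + 2.0,
        "tau3": 1.0 + (m + 1.0) * t,
        "t": t,
        "X": m * math.sqrt(t),
        "N": (m + 1.0) * math.sqrt(t),
    }


def p(e, c):
    """Quadratic polynomial (3.14) evaluated with the coefficient dict c."""
    middle = c["tau1"] * c["tau3"] + c["beta1"] ** 2 - c["N"] ** 2
    return c["beta1"] * c["tau1"] * e * e + middle * e + c["beta1"] * c["tau3"]


def critical_stretch(m, tol=1e-13):
    """Positive root t_c > 1 of the cubic in (4.12) and the matching lam_c."""
    def cubic(t):
        return (m + 2) * t ** 3 - (11 * m + 6) * t ** 2 - 5 * (m + 2) * t - (m + 2)

    lo, hi = 1.0, 2.0              # cubic(1) = -16 (m + 1) < 0
    while cubic(hi) < 0.0:         # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:      # plain bisection
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    t_c = 0.5 * (lo + hi)
    return t_c, t_c ** (-(m + 1.0) / (3.0 * m + 2.0))


c = blatz_ko_coefficients(m=2.0, lam=0.7)
e2 = -c["tau3"] / c["tau1"]
print("p(-1) =", p(-1.0, c), " p(e2) =", p(e2, c))
print("(t_c, lam_c) for m = 1:", critical_stretch(1.0))
```

Bisection is used here only because the cubic has a single sign change on (1, ∞); any standard root finder would serve equally well.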
Equation (4.11) is much easier to analyze than (4.5), but the analysis is not trivial. Simpson and Spector showed in [28] that for each n ∈ N there exists a unique
λn ∈ (0, 1) such that (4.11) is satisfied. Hence, there exists an infinite sequence (λn) ⊂ (0, 1) for which Gu(λn, 0)[h] = 0 has nontrivial solutions.

Condition (4.7) can be written as the following polynomial equation in t (cf. (4.9)):

\[ (t - 1) \big[ (m + 2)t^3 - (11m + 6)t^2 - 5(m + 2)t - (m + 2) \big] = 0. \tag{4.12} \]

The complementing condition fails at those values of λ ∈ (0, 1) for which t = λ^{−(3m+2)/(m+1)} is a root of (4.12). Using Descartes' rule of signs, we easily see that (4.12) has only one positive root (other than t = 1) and that it is greater than one. Hence, we have shown:**

PROPOSITION 4.2. For Blatz–Ko materials the complementing condition is violated at exactly one value, λc, of λ ∈ (0, 1), where t = λc^{−(3m+2)/(m+1)} is the only positive (greater than one) root of (m + 2)t³ − (11m + 6)t² − 5(m + 2)t − (m + 2).

The next theorem was proved in [28].

THEOREM 4.3. The infinite sequence (λn) ⊂ (0, 1) for which Gu(λn, 0)[h] = 0 has nontrivial solutions satisfies the following properties:‡
1. Each λn, n ∈ N, is a simple root of (4.11).
2. ∃N ∈ N such that (λn)_{n≥N} ⊂ (λc, 1).
3. lim_{n→∞} λn = λc.
4. dim ker Gu(λn, 0) = 1 for n < N.
5. 1 ≤ dim ker Gu(λn, 0) ≤ 2 for n ≥ N, i.e., at most two linear modes can occur simultaneously.

REMARK 4.4. For Blatz–Ko materials we also remark:
1. It is clear from the results of Simpson and Spector (see Section 5 and Figure 1 in [28, p. 111]) that for Blatz–Ko materials dn ≡ dim ker Gu(λn, 0) is generically equal to one.
2. It is interesting to note that there exist values of R and L, respectively the radius and the height of the cylinder, for which λc = λn0 for exactly one λn0 ∈ (λn). In other words, for cylinders with those dimensions the linearized problem Gu(λc, 0)[h] = 0 has a nontrivial solution.

The index corresponds to the enumeration of the linear modes.
** This result was also obtained in [10].
‡ Although this theorem was proved in [28], the authors of that paper do not discuss the issue of the violation of the complementing condition at λ = λc.

5. Local Bifurcation

Denoting the Fréchet derivative of G with respect to its second argument by Gu(λ, u) ≡ L(λ, u) ∈ L(X, Y), where

L(λ, u)[h] = (A(λ, u)[h], B(λ, u)[h]) ∈ Y0 × Y1 ∀h ∈ X, (5.1)
and assuming that λ ∈ (0, 1] is such that (3.32) holds, we get

\[ \|h\|_X \le C \big( \|A(\lambda, 0)[h]\|_{Y_0} + \|B(\lambda, 0)[h]\|_{Y_1} + \|h\|_{Y_0} \big) \le C \big( \|L(\lambda, 0)[h]\|_{Y} + \|h\|_{Y_0} \big), \tag{5.2} \]
for every h ∈ X, where C > 0 is independent of h. By a lemma of Peetre [23], we conclude from (5.2) that L(λ, 0) is a semi-Fredholm operator. A standard homotopy argument (see the proof of Proposition 3.1 in [14]) shows that in this case L(λ, 0) is a Fredholm operator of index zero. In fact the following theorem is true (see [30]):

THEOREM 5.1. If L(λ∗, 0) satisfies the strong ellipticity condition and the complementing condition for λ∗ ∈ (0, ∞), then L(λ∗, 0) is self-adjoint and Fredholm of index zero.

In our problem we cannot say in general for which values of λ ∈ (0, 1] the problem L(λ, 0)[h] = 0 has nontrivial solutions, or for which values of λ the complementing condition fails. But, as we saw in the previous section, we can verify the hypotheses of the following theorem for particular materials.

THEOREM 5.2 (Local Bifurcation Theorem). Consider G: U → Y and suppose that λ∗ ∈ (0, 1) is such that:
1. dim ker L(λ∗, 0) = 1.
2. λ∗ satisfies (3.32), i.e., the linearized problem L(λ∗, 0)[h] = 0 satisfies the complementing condition.
3. If ker L(λ∗, 0) = span{h∗} and M ≡ G_{u,λ}(λ∗, 0), then

\[ M h_* \notin \operatorname{range}(L(\lambda_*, 0)), \tag{5.3} \]

which is usually called "the strict crossing condition".
Then (λ∗, 0) is a bifurcation point of a local continuous branch of nontrivial solutions of G(λ, u) = 0.

Proof. In view of the Fredholm property (cf. Theorem 5.1), the proof of this theorem is well known, cf., e.g., [8, 3]. □

The condition (5.3) can be rewritten in a form which is easier to verify in the context of our problem, cf. [31].

LEMMA 5.3. Let us assume that λ∗ satisfies the hypotheses of Theorem 5.2. Then condition (5.3) is equivalent to

\[ \frac{d}{d\lambda} \int \nabla h_* \cdot C(H_\lambda)[\nabla h_*] \,\Big|_{\lambda = \lambda_*} \neq 0, \tag{5.4} \]

which in turn is equivalent to λ = λ∗ being a simple root of the characteristic equation, cf. (3.20):

\[ f(n, \lambda) = \det \left[ B_n \begin{pmatrix} \theta_{n1} \\ \%_{n1} \end{pmatrix} ;\; B_n \begin{pmatrix} \theta_{n2} \\ \%_{n2} \end{pmatrix} \right]_{r=R} = 0. \tag{5.5} \]
Proof. Consider the linear functional ψ: Y → R given by ψ(h, g) = − h∗ · h + ∂L h∗ · g. If (h, g) ∈ range(L(λ∗ , 0)), then ∃v ∈ Xs such that L(λ∗ , 0)[v] = (h, g), i.e. such that ∇ · C(Hλ )[∇v] = h in , and C(Hλ )[∇v]n = g on ∂L . Now, making use of (2.16), integration by parts, and that h∗ ∈ Xs we get ψ(h, g) = ψ ∇ · C(Hλ∗ )[∇v], C(Hλ∗ )[∇v]n h∗ · C(Hλ∗ )[∇v]n = − h∗ · ∇ · C(Hλ∗ )[∇v] + ∂L ∇h∗ · C(Hλ∗ )[∇v] = ∇v · C(Hλ∗ )[∇h∗ ] = v · C(Hλ∗ )[∇h∗ ]n = − v · (∇ · C(Hλ∗ )[∇h∗ ] +
∂L
= 0.
Using the above computation, the hypothesis that L(λ∗ , 0) is Fredhom of index zero with one dimensional kernel, and by self-adjointness, we conclude that kerψ = rangeL(λ∗ , 0). Therefore, Guλ (λ∗ , 0)[h∗ ] ∈ / rangeL(λ∗ , 0) if and only if / kerψ, i.e., if and only if Guλ (λ∗ , 0)[h∗ ] ∈ d − h∗ · ∇ · C(Hλ )[∇h∗ ] + h∗ · C(Hλ )[∇h∗ ]n = 0, dλ ∂L λ=λ∗ which is condition (5.4) after an integration by parts. For the other equivalence, let T u(i) λ (r, z) = (θni (r) cos(ρn z), %ni (r) sin(ρn z)) ,
for i = 1, 2, be two solutions of (3.4), (3.5), and (3.7), with θni (r) and %ni (r) given by (3.16)3,4 , continuously differentiable in λ and of mode number n of h∗ for each λ in a neighborhood of λ∗ . Suppose h∗ = c1 uλ(1)∗ + c2 u(2) λ∗ with c1 , c2 ∈ R (not both (1) (2) zero). If we define uλ = c1 uλ + c2 uλ , then d d ∇h∗ · C(Hλ )[∇h∗ ] λ=λ∗ = ∇uλ · C(Hλ )[∇uλ ] λ=λ∗ . dλ dλ Performing an integration by parts in the last integral and using (3.5) we get ∇uλ · C(Hλ)[∇uλ ] = − uλ · ∇ · C(Hλ )[∇uλ ] uλ · C(Hλ )[∇uλ ]n. + ∂L
Since by hypothesis ∇ · C(Hλ )[∇u(i) λ ] = 0, then d d ∇uλ · C(Hλ )[∇uλ ] λ=λ∗ = uλ · C(Hλ )[∇uλ ]n λ=λ∗ dλ dλ ∂L
L uλ · C(Hλ )[∇uλ ]n r=R
d dλ 0 λ=λ∗ d uλ · C(Hλ )[∇uλ ]n r=R λ=λ∗ = L dλ d = L c · T Aλ cT λ=λ∗ , dλ =
where
θn1 (λ) θn2 (λ) , and
= c = (c1 , c2 ), %n1 (λ) %n2 (λ) θn1 (λ) θn2 (λ) . ; An Aλ = An %n1 (λ) %n2 (λ) r=R
By construction we have that cT is a null vector of T Aλ∗ . But, T is invertible, which implies that cT is also a null vector of Aλ∗ . Note that T d dAλ T d c · T Aλ cT λ=λ∗ = c · Aλ + T c dλ dλ dλ λ=λ∗ dA λ T c = c · T , dλ λ=λ∗
which is nonzero if and only if (d/dλ) det Aλ|λ=λ∗ ≠ 0. Recall that dim ker L(λ, 0) = dim ker Aλ. Hence, by hypothesis (1) of Theorem 5.2, zero is a simple eigenvalue of T Aλ∗. But T is invertible; therefore zero is a simple eigenvalue of Aλ∗, i.e., λ∗ is a root of det Aλ, and we have shown that it is simple if and only if (5.4) holds. □

Combining Theorem 4.3 with Remark 4.4 we notice that, for Blatz–Ko materials, Theorem 5.2 implies the existence of an infinite sequence of branches of nontrivial solutions of G(λ, u) = 0 bifurcating from (λn, 0), where lim_{n→∞} λn = λc, and (λn) is enumerated by the number of the corresponding linear mode, cf. Proposition 4.2 and Theorem 4.3. A similar phenomenon was observed by Rabier and Oden in their study of bifurcation of steady-state motions of a spinning hyperelastic incompressible cylinder, cf. [25], and by Simpson and Spector in their study of buckling of a rectangular rod, cf. [31].

6. Global Bifurcation

In this section we demonstrate that the conditions of Section 5 ensuring local bifurcation also yield global bifurcation results for our problem

G(λ, u) = 0, (6.1)
cf. (2.41). As a first step, we define a set of admissible solutions appropriate for our analysis:

A = {(λ, u) ∈ (0, ∞) × X : Hλ + ∇u(x) ∈ GL+(R³) ∀x ∈ , and |d(Hλ + ∇u(x), x)| > 0 ∀x ∈ ∂L}, (6.2)

where "d" refers to the determinant involved in the definition of the complementing condition, cf. (3.32). We define O to be the maximal connected set in A containing the point (λ, u) ≡ (1, 0), i.e.,

O = comp{(1, 0)} in A. (6.3)

For each δ > 0, we also define

Oδ = {(λ, u) ∈ A : det(Hλ + ∇u(x)) > δ ∀x ∈ , and |d(Hλ + ∇u(x), x)| > δ ∀x ∈ ∂L}. (6.4)
Clearly Oδ ⊂ X is open, and Ōδ ⊂ O, for each δ > 0. Moreover, O = ⋃_{δ>0} Oδ, and thus O ⊂ (0, ∞) × X is open. We now state our main theorem.
THEOREM 6.1. Assume the hypotheses of Theorem 5.2. Let S ⊂ A denote the closure of the set of nontrivial solution pairs (λ, u) of (6.1). Let C ⊂ S denote the (connected) component of S containing the bifurcation point (λ∗, 0). Then at least one of the following holds:
1. C is unbounded in (0, ∞) × X.
2. (λ0, 0) ∈ C, where λ0 ∈ (0, ∞) and λ0 ≠ λ∗.
3. C ⊄ Oδ, for each δ > 0.

To prove Theorem 6.1, we first fix δ > 0. Observe that the smoothness assumptions of Section 2 ensure that G: Oδ → Y is of class C², cf. [35]. In particular, we denote the Fréchet derivative of G with respect to its second argument by Gu(λ, u) ≡ L(λ, u) ∈ L(X, Y), where (cf. (5.1))

L(λ, u)[h] = (A(λ, u)[h], B(λ, u)[h]) ∈ Y0 × Y1 ∀h ∈ X. (6.5)
A straightforward calculation shows that the principal parts of the linear operators A(λ, u) and B(λ, u) (for fixed (λ, u) ∈ Oδ) are given by

A(λ, u)[h] = C(Hλ + ∇u(x))[∇²h] + ··· in ,

and

B(λ, u)[h] = C(Hλ + ∇u(x))[∇h]n(x) + ··· on ∂L. (6.6)
We now summarize three crucial properties of the mapping G: Oδ → Y, which enable the construction of a degree having the capability of detecting global bifurcation. In what follows W ⊂ Oδ is open and bounded.
PROPOSITION 6.2. For each (λ, u) ∈ W, L(λ, u) ∈ L(X, Y) is a Fredholm operator of index zero, i.e., the (finite) dimension of the null space, N(L(λ, u)), is equal to the co-dimension of the range, R(L(λ, u)).

Proof. The Schauder estimates and a theorem of Peetre [23] imply that the dimension of N(L(λ, u)) is finite and that R(L(λ, u)) is closed. A homotopy argument, using the stability of the Fredholm index (on the connected set O), then yields the result, cf. [14, Proposition 4.3]. □

The next proposition requires some additional notation. For each (λ, u) ∈ Oδ, note that the linear operator A(λ, u) with domain

Zλ,u ≡ {h ∈ X : B(λ, u)[h] = 0} (6.7)

is closed in Y0.

PROPOSITION 6.3. For each (λ, u) ∈ W, there are positive constants ε, C1, C2, independent of λ, u, µ, and h, such that

\[ \|h\|_X \le C_1 \Big( |\mu|^{\alpha/2}\, \|(A(\lambda, u) - \mu)[h]\|_{Y_0} + |\mu|^{(1+\alpha)/2}\, \|B(\lambda, u)[h]\|_{Y_1} \Big), \tag{6.8} \]

for all h ∈ X, and for all µ ∈ C satisfying |arg(µ)| ≤ π/2 + ε and |µ| ≥ C2, where α ∈ (0, 1) is the Hölder exponent inherent in X and Y.

Proof. The main observation here is that for all (λ, u) ∈ W, the pair (C(Hλ + ∇u(x)), n(x)) satisfies the strong complementing condition at each x in ∂. This follows from the path connectedness of (λ, u) and (1, 0) in O and the fact that the pair (C(I), n(x)) satisfies Agmon's condition, cf. the proof of [14, Proposition 4.4]. From here, the uniform estimate (6.8) follows from Agmon's trick [1] in the Hölder-space setting [36, 15]. □

The next proposition is established in [14, Theorem 4.6].

PROPOSITION 6.4. The mapping G: Oδ → Y is proper, i.e., G⁻¹(K) ∩ D is compact for each bounded set D ⊂ Oδ and compact set K ⊂ Y.

We can now define a degree for u → G as in [14] (cf. also [11, 16]) as follows. Consider any subset W ⊂ Ōδ such that for any fixed value of λ ∈ (0, ∞), the set

Wλ = {u ∈ X : (λ, u) ∈ W} (6.9)

is open and bounded. With λ ∈ (0, ∞) fixed, we then consider equation (6.1), with u → G(λ, u) restricted to W̄λ, assuming that 0 ∉ G(λ, ∂Wλ) is a regular value. Proposition 6.3 ensures that the linear operator A(λ, u), with domain Zλ,u, has a finite number of positive eigenvalues, denoted ν(λ, u), counted by algebraic multiplicity. We then define the degree of G(λ, ·) in Wλ (with respect to 0) by

\[ \deg(G(\lambda, \cdot), W_\lambda, 0) = \sum_{u \in G_\lambda^{-1}(0) \cap W_\lambda} (-1)^{\nu(\lambda, u)}, \tag{6.10} \]
with the understanding that deg(G(λ, ·), Wλ, 0) = 0 if G_λ⁻¹(0) ∩ Wλ = ∅. To show the validity of (6.10) when 0 ∉ G(λ, ∂Wλ) is not a regular value, and to prove homotopy invariance, viz.,

deg(G(λ, ·), Wλ, 0) = const (6.11)

for all λ ∈ [λ1, λ2] whenever 0 ∉ G(λ, ∂Wλ) for all λ ∈ [λ1, λ2], requires the use of a generalization of Sard's theorem [24] applicable to C², proper Fredholm maps, the latter two properties of which are guaranteed by Propositions 6.2 and 6.4. The uniform estimate (6.8) of Proposition 6.3, together with the continuity of the mappings (λ, u) → A(λ, u) and (λ, u) → B(λ, u), yields eigenvalue-perturbation results ensuring the continuity of the index (λ, u) → (−1)^{ν(λ,u)}, which is employed in the proof of homotopy invariance. We refer the reader to [14, the appendix] for details. In addition to homotopy invariance, our degree has all of the usual properties of the Leray–Schauder degree, e.g., existence, additivity, etc.

Proof of Theorem 6.1. For each fixed δ > 0, we now argue as in [26], viz., if none of properties (1)–(3) hold, then by the separation theorem for compact sets, there is a bounded open set W ⊂ Ōδ such that C ⊂ W and S ∩ ∂W = ∅. Hence, (6.11) holds. Moreover, for ε > 0 sufficiently small, S ∩ {(λ, 0) : λ∗ − ε < λ < λ∗ + ε} = {(λ∗, 0)}. Let Ba(0) denote an open ball of radius a > 0 centered at 0 ∈ X. Let λ → ρ(λ) be continuous on [λ∗ − ε, λ∗ + ε] such that ρ(λ) > 0 on [λ∗ − ε, λ∗) ∪ (λ∗, λ∗ + ε] and ρ(λ∗) = 0. Define Mλ similarly to that in (6.9). We claim that

\[ \deg(G(\lambda_* - \varepsilon, \cdot), M_{\lambda_* - \varepsilon}, 0) \neq \deg(G(\lambda_* + \varepsilon, \cdot), M_{\lambda_* + \varepsilon}, 0). \tag{6.12} \]

To see this, consider the (parametrized) eigenvalue problem

A(λ, 0)[h] = µh in , (6.13a)
B(λ, 0)[h] = 0 on ∂L. (6.13b)

Note from hypothesis (2) of Theorem 5.2 that µ = 0 and h = h∗ when λ = λ∗. If we differentiate (6.13a) and (6.13b) with respect to λ, take the vector dot product of each with h, integrate the first over and the second over ∂L, subtract these equations, and then evaluate the result at µ = 0, h = h∗, λ = λ∗, we obtain the well-known result that the transversality condition (5.4) is equivalent to

\[ \frac{d\mu}{d\lambda} \Big|_{\lambda = \lambda_*} \neq 0. \tag{6.14} \]

This, in turn, ensures the "birth" or "death" of a simple, positive eigenvalue as λ crosses through λ∗, i.e., the integer ν(λ, 0) increases or decreases by one; (6.12) then follows directly from (6.10). Finally, since 0 ∉ G(λ, ∂(Wλ − M̄λ)) for all λ ∈ [λ∗ − ε, λ∗ + ε], we observe that

deg(G(λ, ·), Wλ − M̄λ, 0) = const on [λ∗ − ε, λ∗ + ε], (6.15)
where the constant is zero. To see this, choose W so that W ∩ (R × {0}) = [λ∗ − ε, λ∗ + ε] × {0}. Observe that there are no solutions of (6.1) in W − M for large enough λ. By additivity of the degree, we see that (6.12) and (6.15) contradict (6.11), which completes the proof of Theorem 6.1. □

If property (3) of Theorem 6.1 holds, then C ∩ ∂Oδ ≠ ∅ for each δ > 0. From (6.4) and (6.3) we then conclude that there is a sequence of solution points {(λj, uj)} ⊂ C such that at least one of the following occurs:

inf_{x∈} det(Hλ + ∇uj(x)) ↘ 0 as j → ∞, (6.16)

inf_{x∈∂L} |d(Hλ + ∇uj(x), x)| ↘ 0 as j → ∞. (6.17)
If we adopt more specific, physically reasonable constitutive hypotheses, we can follow ideas of [13] to show that (6.16) is not possible on bounded solution branches C. Specifically, we assume that W has the form W (F) = (F) + (det F) ∀F ∈ GL+ (R3 ),
(6.18)
where ∈ C 5 (GL+ (R3 )) ∩ C 2 (GL+ (R3 )) and ∈ C 5 (0, ∞) such that the following growth conditions hold: (η) → ∞ η (η) → −∞
as η 2 0, as η 2 0.
(6.19)
(The smoothness of on the closure of GL+(R³) can be relaxed slightly, cf. [13].) We now have the following generalization of Theorem 2.3 in [13]:

THEOREM 6.5. Let the hypotheses of Theorem 6.1 hold, and assume the constitutive hypothesis (6.18) with growth conditions (6.19). If the global solution branch C is bounded in (0, ∞) × X, then condition (6.16) is not possible, i.e., bounded solution branches are characterized by property (2) of Theorem 6.1 and/or property (6.17).

Proof. The proof is nearly identical to that given in [13], and we refer the reader to that work for the details. However, there is one point in the proof where a different argument is needed in the present context. Namely, we need that J(x) = det F(x) = det(Hλ + ∇u(x)) not vanish identically on . In [13] the placement boundary conditions rule out such behavior. Here we use the traction-free boundary conditions and (6.19)2, the latter of which is stronger than that required in [13], as follows. If we write the stored energy function W as a function of the principal stretches ν1, ν2, ν3 (cf. (3.8)–(3.10) above), viz., W (F) = (ν1 , ν2 , ν3 ),
(6.20)
then the traction-free boundary condition (2.7) and isotropy imply that the outward unit normal n is a principal direction, say, i = 1, with s1 ≡
∂ (ν1 , ν2 , ν3 ) = 0 on ∂. ∂ν1
(6.21)
Suppose that {(λj , uj )} ⊂ C is bounded such that Jj (x) = det(Hλ +∇uj (x))20 identically on . Now (6.18), (6.19)2 and (6.20) show that s1 →
Jj (Jj ) → −∞ as j → ∞, ν1 j
(6.22)
where we have used the fact that the sequence of principal stretches, (ν1j )j , is either 2 bounded or approaches zero. Obviously, (6.22) contradicts (6.21) on ∂L .
7. Concluding Remarks

Even with the additional hypotheses (6.18) and (6.19), Theorem 6.5 leaves open the possibility that a bifurcating branch could be bounded and "terminate" due to the breakdown of the complementing condition. That the complementing condition can fail along a solution branch is clear. In both of the specific constitutive examples presented in Section 4, the complementing condition is violated along the trivial solution branch. On the other hand, the trivial solution does not "terminate" at that location, perhaps suggesting that Theorem 6.5 is not sharp. That is, although our existence method fails at a point where the complementing condition is violated, we do not know whether the branch actually terminates. A physically reasonable way around this is to introduce a small additive second-gradient term in the model, which can be thought of as a model for the surface behavior, cf. [33, 37]. In particular, when the model is linear in the higher-gradient term, the complementing condition is always satisfied, cf. [19]. We plan to pursue these questions, in the context of global bifurcation problems, in future work.

Clearly our approach to the barrelling problem serves as a paradigm for the rigorous analysis of a large class of concrete problems concerning bifurcation from a trivial line of homogeneously deformed states, e.g., cf. [5, 22, 27, and references therein]. The essential ingredients are: (1) The problem has enough symmetry or "hidden" symmetry enabling a reformulation without boundary "corners", cf. Section 2. (2) The necessary conditions for bifurcation can be obtained for the linearized problem. (3) A crossing condition is verified, ensuring a change in degree. We believe that (1) can be carried out in most cases, although see [21] for an example where this step is unclear. Conditions (2) and (3) are difficult to carry out for general materials, but can be determined in the context of specific classes of materials.
Acknowledgements The work of T.J.H. was supported in part by the National Science Foundation through grant DMS-0072514, and the work of E.L.M-P by the University of Puerto Rico as matching funds to the National Institute of Health grant GM-63039-01. The authors thank Phoebus Rosakis and Pablo V. Negrón-Marrero for useful comments at various stages of this work.
References
1. S. Agmon, On the eigenfunctions and on the eigenvalues of general elliptic boundary value problems. Comm. Pure Appl. Math. 17 (1964) 35–92.
2. S. Agmon, A. Douglis and L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions II. Comm. Pure Appl. Math. 15 (1962) 119–147.
3. A. Ambrosetti and G. Prodi, A Primer of Nonlinear Analysis, Cambridge Studies in Advanced Mathematics 34. Cambridge Univ. Press, Cambridge, UK (1993).
4. J.M. Ball, Strict convexity, strong ellipticity, and regularity in the calculus of variations. Math. Proc. Cambridge Philos. Soc. 87 (1980) 501–513.
5. M.A. Biot, Mechanics of Incremental Deformation. Wiley, New York (1965).
6. P.J. Blatz and W.L. Ko, Applications of finite elasticity theory to the deformation of rubber materials. Trans. Soc. Rheology 6 (1962) 223–251.
7. A. Constantin and W. Strauss, Exact steady periodic water waves with vorticity. Preprint (2003).
8. M. Crandall and P.H. Rabinowitz, Bifurcation from simple eigenvalues. J. Funct. Anal. 8 (1971) 321–340.
9. P.J. Davies, Buckling and barrelling instabilities in finite elasticity. J. Elasticity 21 (1989) 147–192.
10. P.J. Davies, Buckling and barrelling instabilities of nonlinearly elastic columns. Quart. Appl. Math. 49(3) (1991) 407–426.
11. C.C. Fenske, Extensio gradus ad quasdam applicationes Fredholmii. Mitt. Math. Seminar Giessen 121 (1976) 65–70.
12. T.J. Healey, Global continuation in displacement problems of nonlinear elastostatics via the Leray–Schauder degree. Arch. Rational Mech. Anal. 152 (2000) 273–282.
13. T.J. Healey and P. Rosakis, Unbounded branches of classical injective solutions to the forced displacement problem in nonlinear elastostatics. J. Elasticity 49 (1997) 65–78.
14. T.J. Healey and H.C. Simpson, Global continuation in nonlinear elasticity. Arch. Rational Mech. Anal. 143 (1998) 1–28.
15. H. Kielhöfer, Existenz und Regularität von Lösungen semilinearer parabolischer Anfangs-Randwertprobleme. Math. Z. 142 (1975) 131–160.
16. H. Kielhöfer, Multiple eigenvalue bifurcation for Fredholm mappings. J. Reine Angew. Math. 358 (1985) 104–124.
17. M.A. Krasnosel’skii, Topological Methods in the Theory of Nonlinear Integral Equations. Pergamon Press, New York (1964).
18. O.A. Ladyzhenskaya and N.N. Ural’tseva, Linear and Quasilinear Elliptic Equations. Academic Press, New York (1968).
19. A. Mareno, Global continuation in higher-gradient three-dimensional nonlinear elasticity. PhD Thesis, Cornell University (2002).
20. A. Mielke and P. Sprenger, Quasiconvexity at the boundary and a simple variational formulation of Agmon’s condition. J. Elasticity 51 (1998) 23–41.
21. P.V. Negrón-Marrero and E.L. Montes-Pizarro, Axisymmetric deformations of buckling and barrelling type for cylinders under lateral compression – The linear problem. J. Elasticity 65 (2001) 61–86.
22. R.W. Ogden, Non-linear Elastic Deformations. Ellis Horwood, Chichester (1984).
23. P. Peetre, Another approach to elliptic boundary problems. Comm. Pure Appl. Math. 14 (1961) 711–731.
24. F. Quinn and A. Sard, Hausdorff conullity of critical images of Fredholm maps. Amer. J. Math. 94 (1972) 1101–1110.
25. P.J. Rabier and J.T. Oden, Bifurcation in Rotating Bodies, Recherches en Mathématiques Appliquées 11. Springer, New York (1989).
26. P.H. Rabinowitz, Some global results for nonlinear eigenvalue problems. J. Funct. Anal. 7 (1971) 487–513.
27. K.N. Sawyers, Material stability and bifurcation in finite elasticity. In: R.S. Rivlin (ed.), Finite Elasticity AMD, Vol. 27. ASME, Basel (1977).
28. H.C. Simpson and S.J. Spector, On barrelling for a special material in finite elasticity. Quart. Appl. Math. 42 (1984) 99–111.
29. H.C. Simpson and S.J. Spector, On barrelling instabilities in finite elasticity. J. Elasticity 14 (1984) 103–125.
30. H.C. Simpson and S.J. Spector, On the positivity of the second variation in finite elasticity. Arch. Rational Mech. Anal. 98 (1987) 1–30.
31. H.C. Simpson and S.J. Spector, On bifurcation in finite elasticity: Buckling of a rectangular block. Unpublished manuscript.
32. S.J. Spector, On the absence of bifurcation for elastic bars in uniaxial tension. Arch. Rational Mech. Anal. 85 (1984) 171–199.
33. N. Triantafyllidis and E. Aifantis, A gradient approach to localization of deformation I: Hyperelastic materials. J. Elasticity 16 (1986) 225–237.
34. C. Truesdell and W. Noll, The Nonlinear Field Theories of Mechanics. In: S. Flügge (ed.), Handbuch der Physik III/3. Springer, Berlin (1965).
35. T. Valent, Boundary Value Problems of Finite Elasticity. Springer, New York (1988).
36. W. von Wahl, Gebrochene Potenzen eines elliptischen Operators und parabolische Differentialgleichungen in Räumen hölderstetiger Funktionen. Nachr. Akad. Wiss. Göttingen II. Math. Phys. K1 11 (1972) 231–258.
37. C.H. Wu, Cohesive elasticity and surface phenomena. Quart. Appl. Math. 50 (1992) 73–103.
Constitutive Relation of Elastic Polycrystal with Quadratic Texture Dependence
MOJIA HUANG and CHI-SING MAN
Department of Mathematics, University of Kentucky, Lexington, KY 40506, USA. E-mail: [email protected]
Received 25 September 2002
Abstract. Herein we consider polycrystalline aggregates of cubic crystallites with arbitrary texture symmetry. We present a theory in which we keep track of the effects of crystallographic texture on elastic response up to terms quadratic in the texture coefficients. Under this theory, the Lamé constants pertaining to the isotropic part of the effective elasticity tensor of the polycrystal will generally depend on the texture. We also introduce two simple models, which we call HM-V and HM-R, by which we derive an explicit expression for the effective stiffness tensor and one for the effective compliance tensor. Each of these expressions contains a term quadratic in the texture coefficients and, in addition to three parameters given in terms of the single-crystal elastic constants, each carries an undetermined material coefficient. These two remaining coefficients can be determined by imposing the requirement that the expressions from models HM-V and HM-R be compatible to within terms linear in the texture coefficients.
Mathematics Subject Classifications (2000): 74B99, 74E10, 74E25, 74M25, 74Q15.
Key words: polycrystal, crystallographic texture, Lamé constants, HM-V and HM-R models.
May the rational spirit prevail!
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 495–524. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
A polycrystal is an aggregate of tiny crystallites separated by grain boundaries. The chemical composition and the arrangement of the constituting crystallites, which includes grain orientations and grain boundary structure, are the main factors that determine the effective stiffness tensor of the polycrystal. In this paper we restrict our discussion to polycrystals whose constituting crystallites are cubic crystals of the same chemical composition. To give a crude but quantitative description of grain orientations or “crystallographic texture”, the orientation distribution function w (or equivalently its associated orientation measure ℘ which, for each given Borel set A of orientations, specifies the probability that the grain located at a given point has its orientation in A) was introduced independently by Bunge [1] and by Roe [2] in the 1960s. Since then, efforts have been made to determine the effect of the orientation distribution function (ODF) on various material properties. In linear elasticity the ODF
was first introduced [3, 4] into the constitutive equation of orthorhombic aggregates of cubic crystallites through the Voigt model and orientational averaging. Under the Voigt model, the anisotropic part of the effective elasticity tensor C eff depends linearly on the anisotropic part of the ODF characterized by the texture coefficients. A few years ago Man [5] initiated a phenomenological approach in delineating the effects of crystallographic texture on the mechanical anisotropy of polycrystals. In this approach the ODF is treated on a par with stress and strain and is taken as a macroscopic variable in constitutive equations. General principles that govern constitutive equations (e.g., the principle of material frame-indifference, indifference to rotation of reference placement [6], etc.) and restrictions imposed by texture and crystal symmetries are then applied to obtain representation formulae that show explicitly the effects of texture on mechanical response. As a first example of applying this approach, Man [5] derived, for orthorhombic aggregates of cubic crystallites, a representation formula for the effective elasticity tensor C eff , which accounts for the effects of the ODF up to terms linear in the texture coefficients. Empirical experience has so far suggested that Man’s formula for the elasticity tensor would work well for materials such as aluminum, whose single crystal has weak anisotropy. On the other hand, there is also experimental evidence [7] which indicates that this simple formula is inadequate for strongly textured samples of copper, whose single crystal manifests much stronger elastic anisotropy than that of aluminum. With a view to applications involving strongly textured aggregates of crystallites which are themselves strongly anisotropic, here we seek a formula for C eff which accounts for the effects of crystallographic texture on elastic anisotropy up to terms quadratic in the texture coefficients. 
To this end, we shall follow a phenomenological approach. The stress and strain that enter into the constitutive equation of an elastic polycrystal are each a volume average of the corresponding field over many crystallites. In other words, they are the mean stress $\bar{T}$ and the mean strain $\bar{E}$ over some representative volume of the polycrystal. In polycrystalline sheet metals, we often find clustering of crystallite orientations around a relatively small number of specific orientations $R_\alpha$. This phenomenon motivates the definition of texture components in metallurgy. A texture component in the representative volume of a sample often includes many crystallites. If we denote the volume averages of the strain and stress over the crystallites included in the αth texture component by $E(R_\alpha)$ and $T(R_\alpha)$, respectively, it should not need a long stretch of the imagination to believe that there could be constitutive equations governing $E(R_\alpha)$ and $T(R_\alpha)$ or, equivalently, constitutive equations on the mean perturbation strain $E' = E(R_\alpha) - \bar{E}$ and the mean perturbation stress $T' = T(R_\alpha) - \bar{T}$. As the starting point of our present theoretical investigations, we postulate the existence of appropriate forms of such constitutive equations (see equations (59) and (148) below). After presenting some preliminaries in Sections 2–4, we discuss the constitutive assumption (59) in Section 5. Starting from this constitutive assumption or rather its linearization (60), we examine the special instance of cubic aggregates of cubic
crystallites in Section 6. There we obtain the somewhat surprising result that, when we consider the influence of crystallographic texture up to terms quadratic in the texture coefficients, the Lamé constants that pertain to the isotropic part of the effective elasticity tensor C eff will generally depend on the texture. From experience in working with formulae linear in the texture coefficients, one could hardly anticipate this finding. The formula for C eff in Section 6 contains numerous undetermined parameters. Thus, from the practical standpoint, the theory presented there is too general. In Section 7, we add simplifying ad hoc assumptions and obtain two simple models, which we call HM-V and HM-R, respectively, for polycrystalline aggregates of cubic crystallites with arbitrary texture symmetry. These models lead to a formula for the effective stiffness tensor C eff and one for the effective compliance tensor S eff , both of which contain terms quadratic in the texture coefficients (see equations (141) and (155)). Besides three polycrystal coefficients related to the single-crystal elastic constants, each of these formulae contains an undetermined material parameter, which we denote by ζ and η, respectively. It is easy to determine the particular values of ζ and η which guarantee compatibility of the two models up to terms linear in the texture coefficients. The formulae for these particular values are given in equations (166) and (167). We call the special version of HM-V and of HM-R with the particular value of ζ and of η model HM-Vc and HM-Rc , respectively. In Section 8, after giving some remarks on using numerical calculations and experimental corroboration to check the adequacy of formulae (141) and (155), we present a couple of examples, where the predictions of models HM-Vc and HM-Rc are compared with results of experimental measurements and/or computations based on the self-consistent method. 
In what follows we adopt the Schönflies notation for point groups and the Einstein summation convention for tensors. For two fourth-order tensors $\mathbb{A}$ and $\mathbb{B}$, we let $\mathbb{A} : \mathbb{B}$ denote the fourth-order tensor with components $\mathbb{A}_{ijmn}\mathbb{B}_{mnkl}$. When we regard fourth-order tensors as linear transformations on the space of second-order tensors, $\mathbb{A} : \mathbb{B}$ is simply the composition of the corresponding linear transformations. In this sense, we have $\mathbb{A} : \mathbb{A} = \mathbb{A}^2$.
2. Preliminaries
Henceforth we assume that a fixed spatial Cartesian coordinate system has been chosen. We consider polycrystalline aggregates of cubic crystallites of the same chemical composition. To describe the orientation of a crystallite, we pick as reference the configuration of a single crystal whose three four-fold axes of rotational symmetry coincide with the coordinate axes. The orientation of a crystallite in the polycrystalline aggregate is then specified by any one of the 24 rotations R which take the reference configuration to the given configuration of the crystallite. In much of our discussion below, we shall refrain from making any specific assumption on texture symmetry. On occasions where we refer to aggregates with
cubic or orthorhombic texture, we assume that the axes of the spatial coordinate system have been chosen to agree with the three four-fold axes and with the three two-fold axes of the O and D2 texture symmetry, respectively.
2.1. TENSOR BASIS FOR CUBIC ELASTICITY
For a fourth-order tensor A and a rotation Q, we let $Q^{\otimes 4}A$ denote the tensor with components
$$(Q^{\otimes 4}A)_{ijkl} = Q_{ip}Q_{jq}Q_{kr}Q_{ls}A_{pqrs}. \tag{1}$$
If A defines a physical property of a material point X in a given configuration, then $Q^{\otimes 4}A$ describes the same property of X after the configuration is rotated by Q. Let C(R) be the elasticity tensor of the crystallite with orientation R. Clearly
$$C(R) = R^{\otimes 4}C(I), \tag{2}$$
where I is the second-order identity tensor and C(I ) is the elasticity tensor of the reference cubic crystal. Let c11 , c12 , and c44 be the three independent components of C(I ) (in the Voigt notation), and let c = c11 − c12 − 2c44 .
(3)
Let $B^{(\alpha)}(I)$ ($\alpha = 1, 2, 3$) be [8] the fourth-order tensors with components
$$B^{(1)}_{ijkl}(I) = B^{(1)}_{ijkl} = \delta_{ij}\delta_{kl}, \qquad B^{(2)}_{ijkl}(I) = B^{(2)}_{ijkl} = \tfrac{1}{2}(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}), \qquad B^{(3)}_{ijkl}(I) = \sum_{\alpha=1}^{3}\delta_{i\alpha}\delta_{j\alpha}\delta_{k\alpha}\delta_{l\alpha}. \tag{4}$$
By direct computation, it is easy to verify that
$$C(I) = c_{12}B^{(1)} + 2c_{44}B^{(2)} + cB^{(3)}(I). \tag{5}$$
If we let R act on both sides of equation (5), we obtain the decomposition of the elasticity tensor C(R) in terms of the tensor basis $B^{(\alpha)}(R)$ as follows:
$$C(R) = c_{12}B^{(1)} + 2c_{44}B^{(2)} + cB^{(3)}(R), \tag{6}$$
where
$$B^{(\alpha)}(R) = B^{(\alpha)}(I) = B^{(\alpha)} \ \text{for } \alpha = 1, 2, \qquad B^{(3)}_{ijkl}(R) = \sum_{\alpha=1}^{3}R_{i\alpha}R_{j\alpha}R_{k\alpha}R_{l\alpha}. \tag{7}$$
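The decomposition (5)–(7) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original paper: the helper names (`tensor4`, `rotate4`, `cubic_stiffness`) and the elastic constants (illustrative values in GPa, roughly of the magnitude quoted for copper) are assumptions made for the example. It builds C(I) from the basis tensors, rotates it with the rule (1), and compares the result with the right-hand side of (6).

```python
import math
from itertools import product

IDX = range(3)

def delta(i, j):
    return 1.0 if i == j else 0.0

def tensor4(f):
    """Build a 3x3x3x3 nested list from a component function f(i, j, k, l)."""
    return [[[[f(i, j, k, l) for l in IDX] for k in IDX] for j in IDX] for i in IDX]

def rotate4(Q, A):
    """Equation (1): (Q^{x4} A)_ijkl = Q_ip Q_jq Q_kr Q_ls A_pqrs."""
    return tensor4(lambda i, j, k, l: sum(
        Q[i][p] * Q[j][q] * Q[k][r] * Q[l][s] * A[p][q][r][s]
        for p, q, r, s in product(IDX, repeat=4)))

def cubic_stiffness(c11, c12, c44, R):
    """Equations (6)-(7): C(R) = c12 B1 + 2 c44 B2 + c B3(R), with c from (3)."""
    c = c11 - c12 - 2.0 * c44
    return tensor4(lambda i, j, k, l:
                   c12 * delta(i, j) * delta(k, l)
                   + c44 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
                   + c * sum(R[i][a] * R[j][a] * R[k][a] * R[l][a] for a in IDX))

def max_abs_diff(A, B):
    return max(abs(A[i][j][k][l] - B[i][j][k][l])
               for i, j, k, l in product(IDX, repeat=4))

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in IDX) for j in IDX] for i in IDX]

def rot_z(t):
    return [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    return [[1.0, 0.0, 0.0], [0.0, math.cos(t), -math.sin(t)], [0.0, math.sin(t), math.cos(t)]]

IDENTITY = [[delta(i, j) for j in IDX] for i in IDX]
c11, c12, c44 = 168.4, 121.4, 75.4        # illustrative constants (assumption)
R = matmul(rot_z(0.3), rot_x(0.7))        # an arbitrary test rotation (assumption)

C_ref = cubic_stiffness(c11, c12, c44, IDENTITY)   # equation (5)
C_rot = cubic_stiffness(c11, c12, c44, R)          # equation (6)
residual = max_abs_diff(rotate4(R, C_ref), C_rot)  # should vanish by (2)
print(C_ref[0][0][0][0], C_ref[0][0][1][1], C_ref[1][2][1][2], residual)
```

The first three printed numbers are the Voigt components c11, c12 and c44 recovered from (5); the last is the residual of (2) against (6), which should be at round-off level.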
Note that
$$B^{(1)} = I \otimes I, \qquad B^{(2)} = \mathbb{I}, \tag{8}$$
where $\mathbb{I}$ is the identity operator on the space of second-order symmetric tensors, and that they constitute a basis in the space of isotropic fourth-order tensors with both the major and minor symmetries.
2.2. THE ORIENTATION DISTRIBUTION FUNCTION
Let w be the orientation distribution function (ODF) [1, 2, 5] pertaining to the given polycrystalline aggregate, and let $L^2(SO(3))$ be the space of square-integrable complex-valued functions defined on the rotation group SO(3). We assume that w is independent of the sampling location. For $w \in L^2(SO(3))$, we can expand it as an infinite series in terms of the Wigner D-functions:
$$w(R) = w_{\rm iso} + \sum_{l=1}^{\infty}\sum_{m=-l}^{l}\sum_{n=-l}^{l} c^{l}_{mn}D^{l}_{mn}(R), \qquad c^{l}_{mn} = (-1)^{m-n}\bigl(c^{l}_{\bar{m}\bar{n}}\bigr)^{*}, \tag{9}$$
where $w_{\rm iso} = 1/(8\pi^2)$ is the ODF for an isotropic aggregate, $c^{l}_{mn}$ ($l \geq 1$) are the texture coefficients, $z^{*}$ denotes the complex conjugate of the complex number z, and $\bar{n} = -n$. The texture coefficients $c^{l}_{mn}$ are related to Roe’s $W_{lmn}$ coefficients [2] by the formula
$$W_{lmn} = (-1)^{n-m}\sqrt{\frac{2}{2l+1}}\,c^{l}_{mn}. \tag{10}$$
Let $g = 8\pi^2 g_H$, where $g_H$ is the Haar measure on SO(3) with $g_H(SO(3)) = 1$. The Wigner D-functions $D^{l}_{mn}$, which constitute an orthogonal basis in $L^2(SO(3))$, satisfy
$$\langle D^{l}_{mn}, D^{l'}_{m'n'}\rangle \equiv \int_{SO(3)} D^{l}_{mn}(R)\,\bigl(D^{l'}_{m'n'}(R)\bigr)^{*}\,dg = \frac{8\pi^2}{2l+1}\,\delta_{ll'}\delta_{mm'}\delta_{nn'}. \tag{11}$$
When $R = R(\psi, \theta, \phi)$ is described by the Euler angles (here we use the convention adopted by Roe [2]), the Wigner D-functions assume the form [9]
$$D^{l}_{mn}(R) = d^{l}_{mn}(\theta)\,e^{-i(m\psi+n\phi)}, \tag{12}$$
$$d^{l}_{mn}(\theta) = \sqrt{\frac{(l+n)!\,(l-n)!}{(l+m)!\,(l-m)!}}\left(\sin\frac{\theta}{2}\right)^{n-m}\left(\cos\frac{\theta}{2}\right)^{n+m}P^{(n-m,\,n+m)}_{l-n}(\cos\theta), \tag{13}$$
with the Jacobi polynomial
$$P^{(r,s)}_{q}(x) = (q+r)!\,(q+s)!\sum_{k}\frac{((x-1)/2)^{q-k}\,((x+1)/2)^{k}}{k!\,(q+r-k)!\,(q-k)!\,(s+k)!}, \tag{14}$$
where the summation is over all integral values of k for which the arguments of the factorials in the denominator are non-negative. From (9) and (11), the texture coefficients are given by
$$c^{l}_{mn} = \frac{2l+1}{8\pi^2}\int_{SO(3)} w(R)\,\bigl(D^{l}_{mn}(R)\bigr)^{*}\,dg. \tag{15}$$
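Formulas (12)–(14) translate directly into code. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the prefactor in (13) is taken in its standard square-root form, and the indices are restricted to n ≥ |m| so that both exponents n − m and n + m in (13) are non-negative. The comparison values used in the checks (Legendre polynomials for m = n = 0) are standard facts, not results of the paper.

```python
import cmath
import math

def jacobi(q, r, s, x):
    """Jacobi polynomial P_q^{(r,s)}(x) evaluated by the finite sum (14)."""
    total = 0.0
    for k in range(q + 1):
        args = (k, q + r - k, q - k, s + k)
        if min(args) < 0:          # skip terms with a negative factorial argument
            continue
        denom = 1
        for a in args:
            denom *= math.factorial(a)
        total += ((x - 1) / 2) ** (q - k) * ((x + 1) / 2) ** k / denom
    return math.factorial(q + r) * math.factorial(q + s) * total

def small_d(l, m, n, theta):
    """d^l_mn(theta) from (13); this sketch assumes |m| <= n <= l."""
    assert abs(m) <= n <= l
    pref = math.sqrt(math.factorial(l + n) * math.factorial(l - n)
                     / (math.factorial(l + m) * math.factorial(l - m)))
    return (pref * math.sin(theta / 2) ** (n - m) * math.cos(theta / 2) ** (n + m)
            * jacobi(l - n, n - m, n + m, math.cos(theta)))

def wigner_D(l, m, n, psi, theta, phi):
    """D^l_mn(R) for Euler angles (psi, theta, phi), equation (12)."""
    return small_d(l, m, n, theta) * cmath.exp(-1j * (m * psi + n * phi))

theta = 0.8
x = math.cos(theta)
legendre_4 = (35 * x ** 4 - 30 * x ** 2 + 3) / 8   # standard Legendre polynomial P_4
print(small_d(4, 0, 0, theta), legendre_4)
```

For m = n = 0 the function d^l_00 reduces to the Legendre polynomial P_l(cos θ), which gives a quick sanity check of the sum in (14).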
The texture coefficients that we shall need in this paper can easily be measured by X-ray diffraction. For cubic crystallites, since $w(R) = w(RQ_{\rm cr})$ for all $Q_{\rm cr} \in O$, we have
$$c^{l}_{mn} = \frac{2l+1}{8\pi^2}\int_{SO(3)} w(R)\,\frac{1}{24}\sum_{Q_{\rm cr}\in O}\bigl(D^{l}_{mn}(RQ_{\rm cr})\bigr)^{*}\,dg. \tag{16}$$
By direct computations using a simple Maple program, it is easy to verify that
$$\frac{1}{24}\sum_{Q_{\rm cr}\in O}\bigl(D^{l}_{mn}(RQ_{\rm cr})\bigr)^{*} = 0, \qquad \text{when } l = 1, 2, 3, \tag{17}$$
for all $R \in SO(3)$. Hence, from (16) and (17), we obtain
$$c^{l}_{mn} = 0, \qquad \text{when } l = 1, 2, 3. \tag{18}$$
After a polycrystal with texture characterized by the ODF w undergoes a rotation Q, its texture is described by a new ODF
$$w_Q(R) = w(Q^{-1}R), \tag{19}$$
whose texture coefficients $\check{c}^{l}_{mn}$ can be obtained by the formula
$$\check{c}^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{sn}D^{l}_{sm}(Q^{-1}). \tag{20}$$
Similarly, when the reference orientation of the crystallites undergoes a rotation Q, the ODF of the polycrystal becomes w(RQ) with texture coefficients $\grave{c}^{l}_{mn}$ given by
$$\grave{c}^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{ms}D^{l}_{ns}(Q). \tag{21}$$
As an example of the applications of (20) and (21), the texture and the crystal symmetry of orthorhombic aggregates of cubic crystallites lead to the equations
$$c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{sn}D^{l}_{sm}(Q_{\rm tex}^{-1}), \quad \forall Q_{\rm tex} \in D_2, \qquad c^{l}_{mn} = \sum_{s=-l}^{l} c^{l}_{ms}D^{l}_{ns}(Q_{\rm cr}), \quad \forall Q_{\rm cr} \in O, \tag{22}$$
in which $D_2$ denotes the orthorhombic group of texture symmetry and O the octahedral group of cubic crystal symmetry. From (22), one [1, 2, 10] can easily derive the restrictions on $c^{l}_{mn}$, some of which are shown in Table I (e.g., $c^{l}_{mn} = 0$ whenever m is odd).
Table I. Some restrictions on texture coefficients $c^{l}_{mn}$ for orthorhombic aggregates of cubic crystallites (in the table k denotes an integer)
Conditions       m = 2k                         m ≠ 2k     n = 4k                         n ≠ 4k
$c^{l}_{mn}$ =   $(-1)^{l}c^{l}_{\bar{m}n}$     0          $(-1)^{l}c^{l}_{m\bar{n}}$     0
$c^{4}_{m4} = \frac{\sqrt{70}}{14}\,c^{4}_{m0}$
2.3. ISOTROPIC PART OF ELASTICITY TENSOR
In this paper we shall present a theory and two models on textured aggregates of cubic crystallites, under which the “isotropic part” of the effective elasticity tensor will generally depend on the crystallographic texture. To make our discussions precise, here we define what we mean by the isotropic part $C_{\rm iso}$ of an elasticity tensor C, and we give a formula for the computation of $C_{\rm iso}$. Proof of a general version of this formula and detailed discussions on the isotropic and anisotropic parts of general material tensors will be given elsewhere.
Let V be the translation space of the three-dimensional Euclidean space, and $V^{r}$ the r-fold tensor product $V \otimes V \otimes \cdots \otimes V$. Elasticity tensors belong to the subspace of $V^{4}$ whose members enjoy both the major and minor symmetries. We denote this subspace by $[[V^2]^2]$. Each rotation Q on V induces a linear transformation $Q^{\otimes 4}$ on $[[V^2]^2]$ as defined by (1). The map $Q \mapsto Q^{\otimes 4}$ defines [11] a linear representation of the rotation group SO(3) on $[[V^2]^2]$. By formally introducing the complexification $V_c$ of V (see [11, p. 105]), we shall henceforth regard this tensor representation as a complex representation. For simplicity, we shall suppress the subscript “c” and continue to write the complex representation as $Q \mapsto Q^{\otimes 4}|[[V^2]^2]$. The rotation group has a complete set of absolutely irreducible unitary representations $D_l$ (l = 0, 1, 2, . . .) of dimension 2l + 1. The representation $Q \mapsto Q^{\otimes 4}|[[V^2]^2]$, however, is reducible; it can be decomposed as a direct sum of subrepresentations as described by the formula
$$[[V^2]^2] = 2D_0 + 2D_2 + D_4. \tag{23}$$
This formula should be interpreted as follows: the 21-dimensional space $[[V^2]^2]$ is a direct sum of two 1-dimensional, two 5-dimensional, and one 9-dimensional subspaces, each of which is invariant under the action of $Q^{\otimes 4}$ for every rotation Q.
Moreover, the restrictions of $Q \mapsto Q^{\otimes 4}$ on each of the 1-dimensional, 5-dimensional, and 9-dimensional subspaces are subrepresentations equivalent to the irreducible representations $D_0$, $D_2$, and $D_4$, respectively. Decomposition formulae such as equation (23) above can be derived by computing the character of the tensor representation in question [12–14] or by other methods [15]. Tensors C which fall in the $D_4$ subspace of $[[V^2]^2]$ are harmonic. In other words, they are totally symmetric and traceless, i.e.,
$$C_{i_1 i_2 i_3 i_4} = C_{i_{\tau(1)} i_{\tau(2)} i_{\tau(3)} i_{\tau(4)}} \tag{24}$$
for any permutation τ of {1, 2, 3, 4} and
$$\mathrm{tr}_{j,k}\,C = 0 \tag{25}$$
for any pair of distinct indices j and k. A tensor $C \in [[V^2]^2]$ is isotropic if and only if it takes value in the direct sum of the two 1-dimensional subspaces invariant under $Q^{\otimes 4}$. These two subspaces are spanned by the tensors $I \otimes I$ and $\mathbb{I}$, respectively. We call the 2-dimensional subspace spanned by these two tensors the isotropic subspace of $[[V^2]^2]$. When we write an isotropic tensor in $[[V^2]^2]$ as the linear combination $\lambda I \otimes I + 2\mu\mathbb{I}$, the parameters λ and µ are called the Lamé constants. For an arbitrary $C \in [[V^2]^2]$, we write it as a direct sum of tensors in the rotationally invariant subspaces given in decomposition (23). We define the isotropic part of C to be that which falls in the isotropic subspace of $[[V^2]^2]$ under this decomposition. One recipe to compute the isotropic part of tensor C is by way of the integral formula
$$C_{\rm iso} = \int_{SO(3)} R^{\otimes 4}C\,w_{\rm iso}\,dg. \tag{26}$$
We call C − Ciso the anisotropic part of C. When Ciso depends on texture, the corresponding Lamé constants are functions of the ODF w. Naturally they should be isotropic functions of w. To make this connotation precise, we introduce the following: DEFINITION 2.1. A function f (·) of the ODF is isotropic if f (wQ ) = f (w) for each rotation Q and each ODF w.
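As a computational aside, the isotropic part can also be obtained without carrying out the integral (26). The sketch below relies on an assumption not spelled out in the text: the two full contractions $C_{iijj}$ and $C_{ijij}$ are linear rotational invariants, so they are unchanged by the averaging in (26) and therefore fix the two Lamé constants of $C_{\rm iso}$. The function name `isotropic_part` is hypothetical.

```python
from itertools import product

IDX = range(3)

def isotropic_part(C):
    """Return (lam, mu) with C_iso = lam I(x)I + 2 mu II, via two invariant contractions.

    For lam I(x)I + 2 mu II one has C_iijj = 9 lam + 6 mu and C_ijij = 3 lam + 12 mu.
    """
    a = sum(C[i][i][j][j] for i, j in product(IDX, repeat=2))
    b = sum(C[i][j][i][j] for i, j in product(IDX, repeat=2))
    lam = (2.0 * a - b) / 15.0
    mu = (3.0 * b - a) / 30.0
    return lam, mu

# B3(I) from equation (4): components equal 1 when i = j = k = l, else 0.
B3 = [[[[1.0 if i == j == k == l else 0.0 for l in IDX] for k in IDX]
       for j in IDX] for i in IDX]

lam_B3, mu_B3 = isotropic_part(B3)
print(lam_B3, 2.0 * mu_B3)
```

Applied to $B^{(3)}(I)$, the routine returns λ = 1/5 and 2µ = 2/5, i.e. the isotropic part is $\tfrac{1}{5}I \otimes I + \tfrac{2}{5}\mathbb{I}$.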
3. Theoretical Setting
Consider an ensemble of nominally identical polycrystals, each member of which is subjected to the same macroscopic deformation so that $\bar{T}$, the ensemble average of the Cauchy stress, and $\bar{E}$, the ensemble average of the infinitesimal strain, are independent of place x. We assume that $\bar{T}$ and $\bar{E}$ are equal to the volume average of the stress field and of the strain field in a representative polycrystal B, and we call them the mean stress and the mean strain of the polycrystal, respectively. The effective stiffness tensor $C^{\rm eff}$ of the polycrystal is defined by [16]
$$\bar{T} = C^{\rm eff}[\bar{E}]. \tag{27}$$
Let w be the ODF that characterizes the texture of the polycrystal in question, and let ℘ be the associated orientation measure. For each Borel subset A of SO(3), we have
$$\wp(A) = \int_{A} w(R)\,dg. \tag{28}$$
Let $T_m(A)$ and $E_m(A)$ be the volume average of the stress and strain field over the crystallites in B, the orientations of which lie in A. We assume that the set functions $T_m(\cdot)$ and $E_m(\cdot)$ are vector-valued measures. Clearly, $\bar{T} = T_m(SO(3))$ and $\bar{E} = E_m(SO(3))$. Let $T(\cdot)$ and $E(\cdot)$ be the Radon–Nikodym derivatives of $T_m$ and $E_m$ with respect to ℘, respectively. It follows that
$$\bar{T} = \int_{SO(3)} T(R)\,w(R)\,dg, \qquad \bar{E} = \int_{SO(3)} E(R)\,w(R)\,dg. \tag{29}$$
Roughly speaking, T(R) and E(R) are the (Euclidean) volume averages of the stress and strain fields over those crystallites in B whose orientations lie in an infinitesimal group volume around R. By abuse of language, we simply call T(R) and E(R) the mean stress and mean strain pertaining to the crystallites with orientation R. We define the mean stiffness tensor of the polycrystal to be
$$\bar{C} = \int_{SO(3)} C(R)\,w(R)\,dg, \tag{30}$$
and call
$$D(R) = C(R) - \bar{C}, \qquad E'(R) = E(R) - \bar{E} \tag{31}$$
the perturbation stiffness tensor and the mean perturbation strain of the crystallites with orientation R, respectively. It is clear from their definition that
$$\bar{D} = \int_{SO(3)} D(R)\,w(R)\,dg = 0, \tag{32}$$
$$\overline{E'} = \int_{SO(3)} E'(R)\,w(R)\,dg = 0. \tag{33}$$
From the preceding equations and the identity
$$T(R) = C(R)[E(R)] = (\bar{C} + D(R))[\bar{E} + E'(R)] = \bar{C}[\bar{E}] + D(R)[\bar{E}] + \bar{C}[E'(R)] + D(R)[E'(R)], \tag{34}$$
we deduce that
$$\bar{T} = \bar{C}[\bar{E}] + \overline{D(R)[E'(R)]}. \tag{35}$$
Combining equations (27) and (35), we have
$$C^{\rm eff}[\bar{E}] = \bar{C}[\bar{E}] + \overline{D(R)[E'(R)]}, \tag{36}$$
where
$$\overline{D(R)[E'(R)]} = \int_{SO(3)} D(R)[E'(R)]\,w(R)\,dg. \tag{37}$$
Our task at hand is to determine $C^{\rm eff}$ from equation (36).
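The structure of (34)–(35) is that of a mean of products: the two cross terms average to zero by (32) and (33), and only the correlation between the stiffness and strain perturbations survives. A scalar toy version makes this explicit. In the sketch below the "orientations" are three discrete classes with made-up weights, stiffnesses and strains; all numbers and names are assumptions chosen for illustration, not data from the paper.

```python
def weighted_mean(values, weights):
    """Discrete analogue of the orientational average: sum of value * weight."""
    return sum(v * w for v, w in zip(values, weights))

weights = [0.2, 0.5, 0.3]          # stand-in for w(R) dg; the weights sum to 1
stiffness = [100.0, 150.0, 220.0]  # stand-in for C(R)
strain = [1.2e-3, 1.0e-3, 0.7e-3]  # stand-in for E(R)

C_bar = weighted_mean(stiffness, weights)          # analogue of (30)
E_bar = weighted_mean(strain, weights)             # analogue of (29)
D = [c - C_bar for c in stiffness]                 # analogue of (31), first part
E_prime = [e - E_bar for e in strain]              # analogue of (31), second part

T_bar = weighted_mean([c * e for c, e in zip(stiffness, strain)], weights)
correlation = weighted_mean([d * ep for d, ep in zip(D, E_prime)], weights)
rhs = C_bar * E_bar + correlation                  # analogue of (35)
print(T_bar, rhs)
```

The two printed values agree to round-off. In this toy setting the effective stiffness would be `T_bar / E_bar`, which differs from `C_bar` exactly by the correlation term divided by `E_bar`.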
REMARK 3.1. The bulk modulus K of the polycrystal is defined by the equation
$$\mathrm{tr}\,\bar{T} = 3K\,\mathrm{tr}\,\bar{E}. \tag{38}$$
Under the present setting, we always have
$$K = \tfrac{1}{3}c_{11} + \tfrac{2}{3}c_{12} \tag{39}$$
for aggregates of cubic crystallites, irrespective of the texture. Indeed, from (7), we observe that
$$B^{(1)}_{iikl} = 3\delta_{kl}, \qquad B^{(2)}_{iikl} = \delta_{kl}, \qquad B^{(3)}_{iikl}(R) = \delta_{kl};$$
hence, from (6), we have
$$C_{iikl}(R) = 3c_{12}\delta_{kl} + 2c_{44}\delta_{kl} + c\,\delta_{kl} = (c_{11} + 2c_{12})\delta_{kl},$$
which leads to
$$D_{iikl}(R) = C_{iikl}(R) - \int_{SO(3)} C_{iikl}(R)\,w(R)\,dg = 0.$$
The preceding equation and equation (35) imply that
$$\bar{T}_{ii} = \bar{C}_{iikl}\bar{E}_{kl} = \left(\int_{SO(3)} C_{iikl}(R)\,w(R)\,dg\right)\bar{E}_{kl} = (c_{11} + 2c_{12})\delta_{kl}\bar{E}_{kl} = (c_{11} + 2c_{12})\bar{E}_{ii}. \tag{40}$$
Formula (39) then follows from a comparison of (38) with (40).
REMARK 3.2. When all the crystallites in the polycrystal have the same orientation $R_0$, we have $\bar{C} = C(R_0)$ and, by definition (31)$_1$, D = 0. Equation (36) then leads to the formula $C^{\rm eff} = C(R_0)$.
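The key step in Remark 3.1, namely that the contraction $C_{iikl}(R)$ does not depend on R, is easy to confirm numerically. The following sketch assumes illustrative elastic constants and an arbitrary test rotation; the function names are hypothetical.

```python
import math

IDX = range(3)

def delta(i, j):
    return 1.0 if i == j else 0.0

def cubic_component(c11, c12, c44, R, i, j, k, l):
    """C_ijkl(R) from equations (6)-(7)."""
    c = c11 - c12 - 2.0 * c44
    return (c12 * delta(i, j) * delta(k, l)
            + c44 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
            + c * sum(R[i][a] * R[j][a] * R[k][a] * R[l][a] for a in IDX))

def trace_contraction(c11, c12, c44, R):
    """The second-order tensor C_iikl(R), summed over i."""
    return [[sum(cubic_component(c11, c12, c44, R, i, i, k, l) for i in IDX)
             for l in IDX] for k in IDX]

def rotation(psi, theta):
    """Rotation about z by psi followed by rotation about x by theta (test rotation)."""
    cz, sz, cx, sx = math.cos(psi), math.sin(psi), math.cos(theta), math.sin(theta)
    Rz = [[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]]
    return [[sum(Rx[i][k] * Rz[k][j] for k in IDX) for j in IDX] for i in IDX]

c11, c12, c44 = 168.4, 121.4, 75.4      # illustrative constants (assumption)
contraction = trace_contraction(c11, c12, c44, rotation(0.4, 1.1))
K = (c11 + 2.0 * c12) / 3.0             # equation (39)
print(contraction[0][0], contraction[0][1], 3.0 * K)
```

The diagonal entries of the contraction equal $c_{11} + 2c_{12} = 3K$ and the off-diagonal entries vanish, whatever rotation is supplied.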
4. Orientational Averaging of Tensor Basis
Let A(·) be a tensor function defined on the rotation group SO(3). For simplicity, we introduce the following notation
$$\bar{A} = \int_{SO(3)} A(R)\,w(R)\,dg = \tilde{A} + \hat{A}, \tag{41}$$
where
$$\tilde{A} = \int_{SO(3)} A(R)\,w_{\rm iso}\,dg, \qquad \hat{A} = \int_{SO(3)} A(R)\,(w(R) - w_{\rm iso})\,dg. \tag{42}$$
Since $B^{(1)}(R) = I \otimes I$ and $B^{(2)}(R) = \mathbb{I}$ are constant tensor functions on SO(3), we have
$$\bar{B}^{(\alpha)} = \tilde{B}^{(\alpha)} = B^{(\alpha)}, \qquad \hat{B}^{(\alpha)} = 0, \qquad \alpha = 1, 2. \tag{43}$$
From equations (7)$_3$ and (42)$_1$, we obtain the identity
$$\tilde{B}^{(3)} = \tfrac{1}{5}\,I \otimes I + \tfrac{2}{5}\,\mathbb{I}. \tag{44}$$
To proceed further, let us write
$$\Phi \equiv \hat{B}^{(3)} \tag{45}$$
1133 = a2 ,
1122 = a3 ,
1123 = a5 − a8 ,
1113 = −a7 + 3a4 ,
1112 = −a6 + a9 ,
3323 = −4a5 ,
3313 = −4a4 ,
3312 = 2a6 ,
(46)
where a1 a3 a5 a7 a9
32π 2 4 5 4 c00 + Re(c20 ) , =− 105 2 √ 8π 2 4 4 c00 − 70Re(c40 = ) , 105 √ 8 5π 2 4 Im(c10 = ), 105 √ 8 35π 2 4 Re(c30 = ), 105 √ 8 70π 2 4 Im(c40 = ); 105
a2 a4 a6 a8
32π 2 4 5 4 c00 − Re(c20 ) , =− 105 2 √ 2 8 5π 4 Re(c10 = ), 105 √ 8 10π 2 4 (47) Im(c20 = ), 105 √ 8 35π 2 4 Im(c30 = ), 105
here Re(z) and Im(z) denote the real and imaginary parts of the complex number z, respectively. From the total symmetry of and from the traceless condition (cf. equations (24) and (25)), all the other components of the harmonic tensor can be obtained from those displayed above. The presence of texture symmety imposes restrictions on the texture coefficients, and the formulae for the components of will simplify accordingly. For 4 = instance, for orthorhombic aggregates, the texture coefficients are real, cm0 4 4 , and c = 0 for odd m (see Table I above). The independent non-trivial cm0 ¯ m0
components are then
$$\Phi_{2233} = -\frac{32\pi^2}{105}\left(c^{4}_{00} + \sqrt{\tfrac{5}{2}}\,c^{4}_{20}\right), \qquad \Phi_{1133} = -\frac{32\pi^2}{105}\left(c^{4}_{00} - \sqrt{\tfrac{5}{2}}\,c^{4}_{20}\right), \qquad \Phi_{1122} = \frac{8\pi^2}{105}\left(c^{4}_{00} - \sqrt{70}\,c^{4}_{40}\right). \tag{48}$$
Under the Voigt model, all the grains in the polycrystal are assumed to have a uniform strain field equal to the mean strain $\bar{E}$ of the polycrystal. Thus we have the mean stress of the crystallites with orientation R given by $T(R) = C(R)[\bar{E}]$, and the mean stress of the polycrystal given by $\bar{T} = \bar{C}[\bar{E}]$. It follows from (27) that, under the Voigt model, the effective stiffness tensor of the polycrystal is given by the mean stiffness tensor, i.e., $C^{\rm eff} = \bar{C}$. From equations (6) and (43)–(45), we obtain the following explicit formulae for polycrystalline aggregates of cubic crystallites with arbitrary texture symmetry:
$$\bar{C} = c_{12}B^{(1)} + 2c_{44}B^{(2)} + c\bigl(\tilde{B}^{(3)} + \hat{B}^{(3)}\bigr) = \lambda I \otimes I + 2\mu\mathbb{I} + c\,\Phi, \tag{49}$$
$$\lambda = c_{12} + \tfrac{1}{5}c = \tfrac{1}{5}c_{11} - \tfrac{2}{5}c_{44} + \tfrac{4}{5}c_{12}, \tag{50}$$
$$\mu = c_{44} + \tfrac{1}{5}c = \tfrac{1}{5}c_{11} + \tfrac{3}{5}c_{44} - \tfrac{1}{5}c_{12}. \tag{51}$$
For the special instance of orthorhombic aggregates, the preceding form of the mean stiffness tensor has long been available in the literature [3, 4]. Under the Reuss model, all grains in the polycrystal are assumed to have a uniform stress field equal to the mean stress $\bar{T}$ of the polycrystal. It follows that for the Reuss model the effective compliance tensor of the polycrystal is none other than the mean compliance tensor
$$\bar{S} = \int_{SO(3)} S(R)\,w(R)\,dg; \tag{52}$$
here $S(R) = R^{\otimes 4}S(I)$, where
$$S(I) = s_{12}B^{(1)} + 2s_{44}B^{(2)} + sB^{(3)}(I), \tag{53}$$
$$s = s_{11} - s_{12} - 2s_{44}, \tag{54}$$
with
$$s_{11} = \frac{c_{11} + c_{12}}{c_{11}^2 + c_{11}c_{12} - 2c_{12}^2}, \qquad s_{12} = \frac{-c_{12}}{c_{11}^2 + c_{11}c_{12} - 2c_{12}^2}, \qquad s_{44} = \frac{1}{4c_{44}}. \tag{55}$$
Similar to the computation of $\bar{C}$, we obtain the mean compliance tensor of the polycrystal
$$\bar{S} = \lambda_s I \otimes I + 2\mu_s\mathbb{I} + s\,\Phi, \tag{56}$$
$$\lambda_s = \tfrac{1}{5}s_{11} - \tfrac{2}{5}s_{44} + \tfrac{4}{5}s_{12}, \qquad \mu_s = \tfrac{1}{5}s_{11} + \tfrac{3}{5}s_{44} - \tfrac{1}{5}s_{12}. \tag{57}$$
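The compliance constants in (55) use the tensor (not engineering) convention for $s_{44}$, which is easy to get wrong. The sketch below checks, for illustrative elastic constants, that S(I) built from (53)–(55) composes with C(I) from (5) to give the identity $\mathbb{I}$ on symmetric tensors, and that the closed forms (57) agree with $s_{12} + s/5$ and $s_{44} + s/5$, the analogues of (50) and (51). The helper names and numerical values are assumptions made for the example.

```python
from itertools import product

IDX = range(3)

def delta(i, j):
    return 1.0 if i == j else 0.0

def cubic_tensor(a12, a44, a):
    """a12 B1 + 2 a44 B2 + a B3(I) in the reference orientation, cf. (5) and (53)."""
    return [[[[a12 * delta(i, j) * delta(k, l)
               + a44 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               + (a if i == j == k == l else 0.0)
               for l in IDX] for k in IDX] for j in IDX] for i in IDX]

def compose(A, B):
    """(A : B)_ijkl = A_ijmn B_mnkl."""
    return [[[[sum(A[i][j][m][n] * B[m][n][k][l] for m, n in product(IDX, repeat=2))
               for l in IDX] for k in IDX] for j in IDX] for i in IDX]

c11, c12, c44 = 168.4, 121.4, 75.4           # illustrative constants (assumption)
c = c11 - c12 - 2.0 * c44                     # equation (3)

den = c11 ** 2 + c11 * c12 - 2.0 * c12 ** 2   # common denominator in (55)
s11, s12, s44 = (c11 + c12) / den, -c12 / den, 1.0 / (4.0 * c44)
s = s11 - s12 - 2.0 * s44                     # equation (54)

SC = compose(cubic_tensor(s12, s44, s), cubic_tensor(c12, c44, c))
identity_error = max(
    abs(SC[i][j][k][l] - 0.5 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k)))
    for i, j, k, l in product(IDX, repeat=4))

lam_s, mu_s = s12 + s / 5.0, s44 + s / 5.0    # analogues of (50)-(51)
lam_s_57 = s11 / 5.0 - 2.0 * s44 / 5.0 + 4.0 * s12 / 5.0
mu_s_57 = s11 / 5.0 + 3.0 * s44 / 5.0 - s12 / 5.0
print(identity_error, lam_s - lam_s_57, mu_s - mu_s_57)
```

All three printed numbers should be at round-off level, which confirms that (53)–(55) define the inverse of the cubic stiffness tensor on symmetric second-order tensors.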
5. Constitutive Assumption on Mean Perturbation Strain
The Voigt model and the Reuss model, for which we have $C^{\rm eff} = \bar{C}$ and $S^{\rm eff} = \bar{S}$, respectively, are too simplistic and are based on dubious physical assumptions. As shown in (49) and (56), the anisotropic parts of both $\bar{C}$ and $\bar{S}$ are linear in the texture coefficients. Starting from the assumptions that $C^{\rm eff} = C^{\rm eff}(w)$ and that $C^{\rm eff}$ is indifferent to the rotation of reference placement [6], which with the principle of material frame-indifference leads to the constitutive restriction
$$C^{\rm eff}(w_Q) = Q^{\otimes 4}C^{\rm eff}(w) \tag{58}$$
for each rotation Q, Man [5, 14] derived for orthorhombic aggregates of cubic crystallites a formula for $C^{\rm eff}$ up to terms linear in the texture coefficients. Man’s formula is identical in form to (49) for $\bar{C}$, although the parameters λ, µ and c are all undetermined material constants; thus his formula should be interpreted in the same spirit as the classical representation formula with two Lamé constants in isotropic elasticity. Empirical experience has so far suggested that Man’s formula would work well for materials such as aluminum, whose single crystal has weak anisotropy. On the other hand, experimental evidence [7] also indicates that this simple formula is inadequate for strongly textured samples of copper, whose single crystal manifests much stronger elastic anisotropy than that of aluminum. With a view to applications involving strongly textured aggregates of strongly anisotropic crystallites, here we seek a formula for $C^{\rm eff}$ which delineates the effects of crystallographic texture on elastic anisotropy up to terms quadratic in the texture coefficients.
Consider, in the given polycrystal, the collection of crystallites with orientations in an infinitesimal group volume around R. Recall that we let E(R) denote the mean strain (i.e., volume average of the strain field) in these crystallites, and we call $E' = E(R) - \bar{E}$ the mean perturbation strain pertaining to these crystallites (cf. Section 3 for a precise definition of the function E(·)). Our analysis below is based on the following physical assumption:
(#) The mean perturbation strain $E'$ is governed by a constitutive relation of the form
$$E' = E'(R, w, \bar{E}) \tag{59}$$
with $E'(R, w, 0) = 0$.
Since we are concerned with linear elasticity in this paper, we linearize the constitutive function (59) with respect to $\bar{E}$ and take as our starting point the constitutive relation
$$E' = H(R, w)[\bar{E}], \tag{60}$$
where H is a fourth-order tensor with minor symmetries. Similar to equation (58), we require H to satisfy, under any rotation Q of the polycrystal, the constraint that
$$H(QR, w_Q) = Q^{\otimes 4}H(R, w) \tag{61}$$
M. HUANG AND C.-S. MAN
for each $R \in SO(3)$. On the other hand, since we restrict our attention to aggregates of cubic crystallites, we have

$$H(R, w) = H(I, w) \tag{62}$$

for each $R \in O$, the group of cubic crystal symmetry. To proceed further, suppose the function $H(R, \cdot)$ is sufficiently smooth in a neighborhood of $w = w_{\mathrm{iso}}$ that we may use the Taylor formula

$$H(R, w) = H^{(0)}(R, w_{\mathrm{iso}}) + H^{(1)}(R)[w - w_{\mathrm{iso}}] + H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}] + o(\|w - w_{\mathrm{iso}}\|^2), \tag{63}$$

where $H^{(\beta)}(R)$ is $1/\beta!$ times the $\beta$th derivative of $H(R, \cdot)$ at $w = w_{\mathrm{iso}}$ and $\|\cdot\|$ denotes the $L^2$-norm. Henceforth, for simplicity, we shall suppress $w_{\mathrm{iso}}$ in $H^{(0)}(R, w_{\mathrm{iso}})$ and write it as $H^{(0)}(R)$. Clearly, for $\beta = 0, 1, 2$, each $H^{(\beta)}$ enjoys the minor symmetries, and they satisfy

$$\begin{aligned}
H^{(0)}(QR) &= Q^{\otimes 4} H^{(0)}(R), \\
H^{(1)}(QR)[w_Q - w_{\mathrm{iso}}] &= Q^{\otimes 4} H^{(1)}(R)[w - w_{\mathrm{iso}}], \\
H^{(2)}(QR)[w_Q - w_{\mathrm{iso}}, w_Q - w_{\mathrm{iso}}] &= Q^{\otimes 4} H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}]
\end{aligned} \tag{64}$$
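for all $Q, R \in SO(3)$. That $E'$ has to satisfy also the requirement $\overline{E'} = 0$ (cf. (33) in Section 3) imposes further restrictions on $H^{(\beta)}$, i.e.,

$$\int_{SO(3)} H^{(0)}(R)\, w_{\mathrm{iso}}\, dg = 0, \tag{65}$$

$$\int_{SO(3)} H^{(1)}(R)[w - w_{\mathrm{iso}}]\, w_{\mathrm{iso}}\, dg + \int_{SO(3)} H^{(0)}(R)\,(w(R) - w_{\mathrm{iso}})\, dg = 0, \tag{66}$$

$$\int_{SO(3)} H^{(2)}(R)[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}]\, w_{\mathrm{iso}}\, dg + \int_{SO(3)} H^{(1)}(R)[w - w_{\mathrm{iso}}]\,(w(R) - w_{\mathrm{iso}})\, dg = 0. \tag{67}$$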
From (64)$_1$, we observe that

$$H^{(0)}(R) = R^{\otimes 4} H^{(0)}(I), \qquad \forall R \in SO(3). \tag{68}$$

On the other hand, for $R \in O$, the symmetry group of the reference cubic crystallite, we have $H^{(0)}(R) = H^{(0)}(I)$ or

$$H^{(0)}(I) = R^{\otimes 4} H^{(0)}(I), \qquad \forall R \in O, \tag{69}$$

where we have appealed to equation (68). It follows that

$$H^{(0)}_{ijkl}(I) = \frac{1}{24} \sum_{R \in O} R_{ip} R_{jq} R_{kr} R_{ls}\, H^{(0)}_{pqrs}(I), \tag{70}$$
CONSTITUTIVE RELATION OF ELASTIC POLYCRYSTAL
from which we obtain the identity

$$H^{(0)}_{ijkl}(I) = H^{(0)}_{klij}(I) \tag{71}$$

by direct computations. In other words, $H^{(0)}(I)$ has the major symmetry. Because $H^{(0)}(I)$ enjoys the minor and major symmetries, equations (68) and (69) imply that we may write

$$H^{(0)}(R) = \sum_{\alpha=1}^{3} h_\alpha B^{(\alpha)}(R), \tag{72}$$

where $h_\alpha$ are some constants. From (43), (44) and (65), we know

$$\int_{SO(3)} \sum_{\alpha=1}^{3} h_\alpha B^{(\alpha)}(R)\, w_{\mathrm{iso}}\, dg = \left(h_1 + \frac{h_3}{5}\right) B^{(1)} + \left(h_2 + \frac{2h_3}{5}\right) B^{(2)} = 0. \tag{73}$$

Let $h_3 = \zeta$. From (73), we observe that

$$h_1 = -\frac{1}{5}\zeta, \qquad h_2 = -\frac{2}{5}\zeta. \tag{74}$$
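The step from (70) to (71)–(73) can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not part of the derivation: it assumes that $O$ is realized as the 24 signed permutation matrices of determinant $+1$, and that $B^{(1)} = I \otimes I$, $B^{(2)} = \mathbb{I}$ and $B^{(3)}(I) = \sum_i e_i \otimes e_i \otimes e_i \otimes e_i$; all function names are hypothetical. It averages a random tensor with minor symmetries over the group as in (70), measures the defect in the major symmetry (71) and in the three-term form (72), and recovers the coefficients $1/5$ and $2/5$ of (73) from the two linear invariants $T_{iijj}$ and $T_{ijij}$, which isotropic averaging preserves.

```python
from itertools import permutations, product
import random

IDX = list(product(range(3), repeat=4))


def cubic_rotations():
    """Assumed realization of O: signed permutation matrices with det = +1."""
    rots = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            R = [[signs[i] if perm[i] == j else 0 for j in range(3)] for i in range(3)]
            det = (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
                   - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
                   + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))
            if det == 1:
                rots.append(R)
    return rots


def random_minor_symmetric(rng):
    """Random tensor with H_ijkl = H_jikl = H_ijlk and no major symmetry imposed."""
    H = {}
    for (i, j, k, l) in IDX:
        key = (min(i, j), max(i, j), min(k, l), max(k, l))
        if key not in H:
            H[key] = rng.uniform(-1.0, 1.0)
        H[(i, j, k, l)] = H[key]
    return H


def group_average(H, rots):
    """Right-hand side of (70): (1/24) sum over R of R_ip R_jq R_kr R_ls H_pqrs."""
    avg = {}
    for (i, j, k, l) in IDX:
        total = 0.0
        for R in rots:
            for (p, q, r, s) in IDX:
                coeff = R[i][p] * R[j][q] * R[k][r] * R[l][s]
                if coeff:
                    total += coeff * H[(p, q, r, s)]
        avg[(i, j, k, l)] = total / len(rots)
    return avg


def delta(a, b):
    return 1.0 if a == b else 0.0


rots = cubic_rotations()
H0 = group_average(random_minor_symmetric(random.Random(0)), rots)

# Defect in the major symmetry (71).
major_defect = max(abs(H0[(i, j, k, l)] - H0[(k, l, i, j)]) for (i, j, k, l) in IDX)

# Read off h1, h2, h3 of (72) from three components, then test all 81 components.
h1 = H0[(0, 0, 1, 1)]
h2 = 2.0 * H0[(0, 1, 0, 1)]
h3 = H0[(0, 0, 0, 0)] - h1 - h2
fit_defect = max(
    abs(H0[(i, j, k, l)]
        - h1 * delta(i, j) * delta(k, l)
        - h2 * 0.5 * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
        - h3 * (1.0 if i == j == k == l else 0.0))
    for (i, j, k, l) in IDX)

# Isotropic average of B3(I) = a*B1 + b*B2: match the invariants T_iijj and T_ijij.
t1 = sum(1.0 for i in range(3) for j in range(3) if i == j)  # B3_iijj summed
t2 = t1                                                      # B3_ijij summed
a = (6.0 * t1 - 3.0 * t2) / 45.0   # solves 9a + 3b = t1, 3a + 6b = t2
b = (9.0 * t2 - 3.0 * t1) / 45.0

print(len(rots), major_defect, fit_defect, a, b)
```

Exact arithmetic is not needed here; the two defects are expected to vanish only up to floating-point rounding.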
Putting (74) into (72), we have

$$H^{(0)}(R) = \zeta \Phi(R), \tag{75}$$

$$\Phi(R) = -\frac{1}{5} B^{(1)} - \frac{2}{5} B^{(2)} + B^{(3)}(R). \tag{76}$$

For later use, we record here two more equations involving $\Phi$, namely:

$$\overline{\Phi(R)} = \int_{SO(3)} \Phi(R)\, w(R)\, dg = \Phi, \tag{77}$$

$$D(R) = c\,(\Phi(R) - \Phi). \tag{78}$$

Equation (77) follows immediately from (43)–(45) and (76), and equation (78) from (6), (31)$_1$, (49) and (76).

6. Lamé Constants with Quadratic Texture Dependence

In this section, we consider the special instance where the texture of the polycrystal also has cubic symmetry. This particular example serves to highlight a somewhat surprising result under our present theory: If we account for the effects of texture up to terms quadratic in the texture coefficients, then the Lamé constants of the isotropic part of $C^{\mathrm{eff}}$ will generally depend on the texture. For brevity, we shall sometimes write

$$H^{(\beta)}(R)[\underbrace{w - w_{\mathrm{iso}}, \ldots, w - w_{\mathrm{iso}}}_{\beta\text{-fold}}] \equiv H^{(\beta)}(R, w). \tag{79}$$
From (64), we observe that for $\beta = 1, 2$ and for each $Q \in O$, we have

$$H^{(\beta)}(Q, w) = Q^{\otimes 4} H^{(\beta)}(I, w), \tag{80}$$

because $w_Q = w$ when $Q$ belongs to the group of cubic texture symmetry. On the other hand, it follows from equations (62) and (63) that

$$H^{(\beta)}(Q, w) = H^{(\beta)}(I, w) \tag{81}$$

for each $Q \in O$. Hence we have

$$H^{(\beta)}(I, w) = Q^{\otimes 4} H^{(\beta)}(I, w) \tag{82}$$

for each $Q \in O$. Following the same argument as in the derivation of equation (72), we see that we may express $H^{(\beta)}(I, w)$ in terms of the tensor basis $B^{(\alpha)}(I)$:

$$H^{(\beta)}(I, w) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w)\, B^{(\alpha)}(I) \tag{83}$$

for $\beta = 1, 2$, where

$$h^{(1)}_\alpha(w) \equiv h^{(1)}_\alpha[w - w_{\mathrm{iso}}], \qquad h^{(2)}_\alpha(w) \equiv h^{(2)}_\alpha[w - w_{\mathrm{iso}}, w - w_{\mathrm{iso}}]. \tag{84}$$

Combining equations (64) and (83), we conclude that for each $R \in SO(3)$

$$H^{(\beta)}(R, w_R) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w)\, B^{(\alpha)}(R). \tag{85}$$

Replacing $w$ by $w_{R^T}$ in (85), we have

$$H^{(\beta)}(R, w) = \sum_{\alpha=1}^{3} h^{(\beta)}_\alpha(w_{R^T})\, B^{(\alpha)}(R). \tag{86}$$

By taking the ODF $w$ as a parameter, we treat

$$a_\alpha(R; w) \equiv h^{(1)}_\alpha[w_{R^T} - w_{\mathrm{iso}}], \qquad f_\alpha(R; w) \equiv h^{(2)}_\alpha[w_{R^T} - w_{\mathrm{iso}}, w_{R^T} - w_{\mathrm{iso}}]$$

as functions defined on the rotation group. Clearly,

$$\bar{a}_\alpha(w) = \int_{SO(3)} a_\alpha(R; w)\, w(R)\, dg, \qquad \bar{f}_\alpha(w) = \int_{SO(3)} f_\alpha(R; w)\, w_{\mathrm{iso}}\, dg \tag{87}$$

are functions of the ODF.

LEMMA 6.1. The parameters $\bar{a}_\alpha$ and $\bar{f}_\alpha$ are isotropic functions of the ODF, i.e., $\bar{a}_\alpha(w_Q) = \bar{a}_\alpha(w)$ and $\bar{f}_\alpha(w_Q) = \bar{f}_\alpha(w)$ for each rotation $Q$.
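Proof. For later convenience, we introduce the notation

$$A^{(\alpha)}_{lmn} \equiv h^{(1)}_\alpha[D^l_{mn}], \qquad F^{(\alpha)}_{lmnl'm'n'} \equiv h^{(2)}_\alpha[D^l_{mn}, D^{l'}_{m'n'}]. \tag{88}$$

By the linearity and bilinearity of $h^{(1)}_\alpha$ and $h^{(2)}_\alpha$, respectively, we have

$$a_\alpha(R; w) = h^{(1)}_\alpha[w_{R^T} - w_{\mathrm{iso}}] = \sum_{l,m,n} A^{(\alpha)}_{lmn}\, \check{c}^{\,l}_{mn} = \sum_{l,m,n} \sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\, c^l_{sn}\, D^l_{sm}(R), \tag{89}$$

$$\begin{aligned}
f_\alpha(R; w) &= h^{(2)}_\alpha[w_{R^T} - w_{\mathrm{iso}}, w_{R^T} - w_{\mathrm{iso}}] = \sum_{l,m,n} \sum_{l',m',n'} F^{(\alpha)}_{lmnl'm'n'}\, \check{c}^{\,l}_{mn}\, \check{c}^{\,l'}_{m'n'} \\
&= \sum_{l,m,n} \sum_{l',m',n'} \sum_{s=-l}^{l} \sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\, c^l_{sn}\, c^{l'}_{s'n'}\, D^{l'}_{s'm'}(R)\, D^l_{sm}(R),
\end{aligned} \tag{90}$$

where $l, l' \geq 1$ and we have made use of equation (20). Since we have

$$\int_{SO(3)} a_\alpha(R; w)\, w_{\mathrm{iso}}\, dg = \sum_{l,m,n} \sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\, c^l_{sn} \int_{SO(3)} D^l_{sm}(R)\, w_{\mathrm{iso}}\, dg = 0, \tag{91}$$

we obtain from equations (11), (42)$_1$, and (89) the formula

$$\begin{aligned}
\bar{a}_\alpha(w) &= \int_{SO(3)} a_\alpha(R; w)\,(w(R) - w_{\mathrm{iso}})\, dg \\
&= \sum_{l,m,n} \sum_{s=-l}^{l} \sum_{l',m',n'} A^{(\alpha)}_{lmn}\, c^l_{sn}\, c^{l'}_{m'n'} \int_{SO(3)} D^l_{sm}(R)\, D^{l'}_{m'n'}(R)\, dg \\
&= \sum_{l,m,n} \sum_{s=-l}^{l} \frac{8\pi^2}{2l+1}\, (-1)^{s+m} A^{(\alpha)}_{lmn}\, c^l_{sn}\, c^l_{\bar{s}\bar{m}},
\end{aligned} \tag{92}$$

where we have appealed to the property $d^l_{\bar{m}\bar{n}}(\theta) = (-1)^{m+n} d^l_{mn}(\theta)$. Similarly, from (90), we have

$$\bar{f}_\alpha(w) = \sum_{l,m,n} \sum_{s=-l}^{l} \sum_{n'=-l}^{l} \frac{(-1)^{s+m}}{2l+1}\, F^{(\alpha)}_{lmnl\bar{m}n'}\, c^l_{sn}\, c^l_{\bar{s}n'}. \tag{93}$$

Since $w_Q(R) = w(Q^T R)$, we deduce from (89) that

$$a_\alpha(R; w_Q) = \sum_{l,m,n} \sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\, c^l_{sn}\, D^l_{sm}(Q^T R). \tag{94}$$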
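From (20) and (92) we derive the identity

$$\begin{aligned}
\bar{a}_\alpha(w_Q) &= \int_{SO(3)} \sum_{l,m,n} \sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\, c^l_{sn}\, D^l_{sm}(Q^T R)\,(w(Q^T R) - w_{\mathrm{iso}})\, dg \\
&= \sum_{l,m,n} \sum_{s=-l}^{l} \sum_{l',m',n'} A^{(\alpha)}_{lmn}\, c^l_{sn}\, c^{l'}_{m'n'} \int_{SO(3)} D^l_{sm}(Q^T R)\, D^{l'}_{m'n'}(Q^T R)\, dg \\
&= \sum_{l,m,n} \sum_{s=-l}^{l} \sum_{l',m',n'} A^{(\alpha)}_{lmn}\, c^l_{sn}\, c^{l'}_{m'n'} \int_{SO(3)} D^l_{sm}(R)\, D^{l'}_{m'n'}(R)\, dg \\
&= \bar{a}_\alpha(w).
\end{aligned} \tag{95}$$

Similarly, we can show that

$$\bar{f}_\alpha(w_Q) = \sum_{l,m,n} \sum_{l',m',n'} \sum_{s=-l}^{l} \sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\, c^l_{sn}\, c^{l'}_{s'n'} \int_{SO(3)} D^{l'}_{s'm'}(Q^T R)\, D^l_{sm}(Q^T R)\, w_{\mathrm{iso}}\, dg = \bar{f}_\alpha(w). \tag{96}$$

$\Box$

Substituting (89) and (90) into equation (86) for $\beta = 1, 2$, respectively, we have

$$H^{(1)}(R, w) = \sum_{\alpha=1}^{3} \sum_{l,m,n} \sum_{s=-l}^{l} A^{(\alpha)}_{lmn}\, c^l_{sn}\, D^l_{sm}(R)\, B^{(\alpha)}(R), \tag{97}$$

$$H^{(2)}(R, w) = \sum_{\alpha=1}^{3} \sum_{l,m,n} \sum_{l',m',n'} \sum_{s=-l}^{l} \sum_{s'=-l'}^{l'} F^{(\alpha)}_{lmnl'm'n'}\, c^l_{sn}\, c^{l'}_{s'n'}\, D^{l'}_{s'm'}(R)\, D^l_{sm}(R)\, B^{(\alpha)}(R). \tag{98}$$

From (63) and (75), the perturbation strain $E'$ in (60) can be expressed as

$$E' = (\zeta\Phi + H^{(1)} + H^{(2)} + \cdots)[\bar{E}]. \tag{99}$$

We substitute (78) and (99) into the term $\overline{D[E']}$ in (35) to obtain

$$\overline{D[E']} = c\,\bigl(C^{(1)} + C^{(2)} + C^{(3)} + C^{(4)} + C^{(5)} + o(|c^l_{mn}|^2)\bigr)[\bar{E}], \tag{100}$$

where (cf. Section 1 for notation)

$$C^{(1)} = \zeta\, \overline{\Phi : \Phi}, \qquad C^{(2)} = -\zeta\, \Phi : \overline{\Phi}, \tag{101}$$

$$C^{(3)} = -\Phi : \overline{H^{(1)}}, \qquad C^{(4)} = \overline{\Phi : H^{(1)}}, \tag{102}$$

$$C^{(5)} = \int_{SO(3)} \Phi : H^{(2)}\, w_{\mathrm{iso}}\, dg. \tag{103}$$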
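Table II. Results of $B^{(\alpha)} : B^{(\beta)}$, with components $B^{(\alpha)}_{ijmn} B^{(\beta)}_{mnkl}$

            β = 1              β = 2             β = 3
α = 1       $3B^{(1)}_{ijkl}$   $B^{(1)}_{ijkl}$   $B^{(1)}_{ijkl}$
α = 2       $B^{(1)}_{ijkl}$    $B^{(2)}_{ijkl}$   $B^{(3)}_{ijkl}$
α = 3       $B^{(1)}_{ijkl}$    $B^{(3)}_{ijkl}$   $B^{(3)}_{ijkl}$

In order to compute the right-hand sides of equations (101)–(103), first we list the results of $B^{(\alpha)} : B^{(\beta)}$ in Table II. From (76) and Table II, we have

$$\Phi : \Phi = -\frac{3}{25} B^{(1)} + \frac{4}{25} B^{(2)} + \frac{1}{5} B^{(3)}(R). \tag{104}$$

From (43), (44), (101)$_1$ and (104), we obtain

$$C^{(1)} = \zeta \left( -\frac{2}{25}\, I \otimes I + \frac{6}{25}\, \mathbb{I} + \frac{1}{5}\, \Phi \right). \tag{105}$$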
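The coefficients in (104) and (105) follow mechanically from Table II, and the bookkeeping can be sketched in a few lines of Python. The representation below is our own illustrative choice: a tensor in the span of $B^{(1)}$, $B^{(2)}$, $B^{(3)}(R)$ (all at the same $R$) is stored as a triple of exact fractions, and the names `contract` and `in_phi_basis` are hypothetical. The sketch computes $\Phi : \Phi$, checks that $B^{(1)} : \Phi = 0$, and rewrites the result over $(I \otimes I, \mathbb{I}, \Phi)$ by substituting $B^{(3)} = \Phi + \frac{1}{5} B^{(1)} + \frac{2}{5} B^{(2)}$ from (76), which gives the coefficients appearing in (105).

```python
from fractions import Fraction as Fr

# Table II as coefficient triples over (B1, B2, B3(R)); valid for a single orientation R.
TABLE_II = {
    (1, 1): (3, 0, 0), (1, 2): (1, 0, 0), (1, 3): (1, 0, 0),
    (2, 1): (1, 0, 0), (2, 2): (0, 1, 0), (2, 3): (0, 0, 1),
    (3, 1): (1, 0, 0), (3, 2): (0, 0, 1), (3, 3): (0, 0, 1),
}


def contract(x, y):
    """Double contraction x : y of two tensors stored as coefficient triples."""
    out = [Fr(0), Fr(0), Fr(0)]
    for a in range(3):
        for b in range(3):
            for k, t in enumerate(TABLE_II[(a + 1, b + 1)]):
                out[k] += x[a] * y[b] * t
    return tuple(out)


def in_phi_basis(x):
    """Rewrite a triple over (B1, B2, B3) as a triple over (B1, B2, Phi), using (76)."""
    return (x[0] + x[2] / 5, x[1] + 2 * x[2] / 5, x[2])


phi = (Fr(-1, 5), Fr(-2, 5), Fr(1))                 # equation (76)
phi_phi = contract(phi, phi)                        # left-hand side of (104)
b1_phi = contract((Fr(1), Fr(0), Fr(0)), phi)       # B1 : Phi
phi_phi_iso = in_phi_basis(phi_phi)                 # bracket of (105)

print(phi_phi, b1_phi, phi_phi_iso)
```

Because orientation averaging is linear, averaging the triple for $\Phi : \Phi$ only replaces $B^{(3)}(R)$ by its average, which is why the same coefficients reappear in (105).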
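Since $\overline{\Phi} = \Phi$ (see equation (77)), we know

$$C^{(2)} = -\zeta\, \Phi : \Phi. \tag{106}$$

From (65), (66), (75) and (102)$_1$, we observe that $\overline{H^{(1)}} = -\overline{H^{(0)}} = -\zeta\Phi$ and

$$C^{(3)} = \zeta\, \Phi : \Phi. \tag{107}$$

With the help of Table II, we can recast (102)$_2$ and (103) as

$$C^{(4)} = -\frac{1}{5}(\bar{a}_2 + \bar{a}_3)\, I \otimes I - \frac{2}{5}\, \bar{a}_2\, \mathbb{I} + \int_{SO(3)} \left( a_2 + \frac{3}{5} a_3 \right) B^{(3)}(R)\, w(R)\, dg, \tag{108}$$

$$C^{(5)} = -\frac{1}{5}(\bar{f}_2 + \bar{f}_3)\, I \otimes I - \frac{2}{5}\, \bar{f}_2\, \mathbb{I} + \int_{SO(3)} \left( f_2 + \frac{3}{5} f_3 \right) B^{(3)}(R)\, w_{\mathrm{iso}}\, dg. \tag{109}$$

To proceed further, let us write

$$A^0 = \int_{SO(3)} \left( a_2 + \frac{3}{5} a_3 \right) B^{(3)}(R)\, w_{\mathrm{iso}}\, dg, \tag{110}$$

$$A^1 = \int_{SO(3)} \left( a_2 + \frac{3}{5} a_3 \right) B^{(3)}(R)\,(w - w_{\mathrm{iso}})\, dg, \tag{111}$$

$$F^0 = \int_{SO(3)} \left( f_2 + \frac{3}{5} f_3 \right) B^{(3)}(R)\, w_{\mathrm{iso}}\, dg. \tag{112}$$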
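Since $B^{(3)}(R)$ is a fourth-order tensor and $c^l_{mn} = 0$ for $1 \leq l \leq 3$ (see (18)), we observe from (89) and (110) that

$$A^0 = \int_{SO(3)} \sum_{m,n=-4}^{4} \sum_{s=-4}^{4} U_{4mn}\, c^4_{sn}\, D^4_{sm}(R)\, B^{(3)}(R)\, w_{\mathrm{iso}}\, dg, \tag{113}$$

where

$$U_{4mn} = A^{(2)}_{4mn} + \frac{3}{5} A^{(3)}_{4mn}. \tag{114}$$

Let

$$d_m = U_{4m0} + \frac{\sqrt{70}}{14}\, U_{4m4} + \frac{\sqrt{70}}{14}\, U_{4m\bar{4}}. \tag{115}$$

Using Table I, we have

$$A^0 = \int_{SO(3)} \sum_{m=-4}^{4} \sum_{s=-4}^{4} d_m\, c^4_{s0}\, D^4_{sm}(R)\, B^{(3)}(R)\, w_{\mathrm{iso}}\, dg. \tag{116}$$

It is easy to show [17] that

$$\int_{SO(3)} D^4_{sm}(R)\, B^{(3)}(R)\, w_{\mathrm{iso}}\, dg = 0 \quad \text{when } m = \pm 1, \pm 2, \pm 3, \tag{117}$$

and

$$\int_{SO(3)} D^4_{s4}(R)\, B^{(3)}(R)\, dg = \int_{SO(3)} D^4_{s\bar{4}}(R)\, B^{(3)}(R)\, dg = \frac{\sqrt{70}}{14} \int_{SO(3)} D^4_{s0}(R)\, B^{(3)}(R)\, dg. \tag{118}$$

Hence from (116), we have

$$A^0 = \xi_1 \int_{SO(3)} \sum_{s=-4}^{4} c^4_{s0}\, D^4_{s0}(R)\, B^{(3)}(R)\, w_{\mathrm{iso}}\, dg = \xi\, \Phi, \tag{119}$$

where

$$\xi_1 = d_0 + \frac{\sqrt{70}}{14}\, d_4 + \frac{\sqrt{70}}{14}\, d_{\bar{4}}, \quad \text{and} \quad \xi = \frac{7}{96\pi^2}\, \xi_1. \tag{120}$$

Now we consider the relations (111) and (112). Using the Clebsch–Gordan expansion [18], we see that

$$A^1 = \sum_{l',m',n'} \sum_{l,m,n} \sum_{s=-l}^{l} U_{lmn}\, c^l_{sn}\, c^{l'}_{m'n'}\, \Theta^{lsm}_{l'm'n'}, \tag{121}$$

$$F^0 = w_{\mathrm{iso}} \sum_{l',m',n'} \sum_{l,m,n} \sum_{s'=-l'}^{l'} \sum_{s=-l}^{l} V^{lmn}_{l'm'n'}\, c^l_{sn}\, c^{l'}_{s'n'}\, \Theta^{lsm}_{l's'm'}, \tag{122}$$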
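where

$$V^{lmn}_{l'm'n'} = F^{(2)}_{lmnl'm'n'} + \frac{3}{5} F^{(3)}_{lmnl'm'n'}, \tag{123}$$

$$\Theta^{lmn}_{l'm'n'} = \sum_{J=|l-l'|}^{l+l'} \sum_{M=-J}^{J} \sum_{N=-J}^{J} C^{JM}_{lml'm'}\, C^{JN}_{lnl'n'} \int_{SO(3)} D^J_{MN}(R)\, B^{(3)}(R)\, dg; \tag{124}$$

here $C^{JM}_{lml'm'}$ are the Clebsch–Gordan coefficients. Since $B^{(3)}(R)$ is a fourth-order tensor pertaining to a cubic crystal, we have

$$\int_{SO(3)} D^J_{MN}(R)\, B^{(3)}(R)\, dg = 0 \quad \text{unless } J \in \{0, 4\} \text{ and } N \in \{-4, 0, 4\}. \tag{125}$$

We can decompose $A^1$ and $F^0$ in (121) and (122) into

$$A^1 = K^{(0)} + K^{(4)}, \qquad F^0 = L^{(0)} + L^{(4)}, \tag{126}$$

where $K^{(0)}$ and $L^{(0)}$ are the partial sums in (121) and (122) with $J = 0$; $K^{(4)}$ and $L^{(4)}$ are the partial sums in (121) and (122) with $J = 4$. Note that the tensors $K^{(0)}$ and $L^{(0)}$ lie in the isotropic subspace of $[[V^2]^2]$, whereas $K^{(4)}$ and $L^{(4)}$ belong to the 9-dimensional anisotropic $D_4$ subspace and they are harmonic (i.e., totally symmetric and traceless).

When $J = 0$, we have $l = l'$ and $D^0_{00}(R) = 1$, which leads to

$$\Theta^{lsm}_{l'm'n'} = C^{00}_{lslm'}\, C^{00}_{lmln'} \int_{SO(3)} D^0_{00}(R)\, B^{(3)}(R)\, dg = \frac{\delta_{s\bar{m}'}\, \delta_{m\bar{n}'}}{2l+1}\, (-1)^{s+m}\, 8\pi^2 \left( \frac{1}{5} B^{(1)} + \frac{2}{5} B^{(2)} \right) \tag{127}$$

and

$$\Theta^{lsm}_{l's'm'} = \frac{\delta_{s\bar{s}'}\, \delta_{m\bar{m}'}}{2l+1}\, (-1)^{s+m}\, 8\pi^2 \left( \frac{1}{5} B^{(1)} + \frac{2}{5} B^{(2)} \right), \tag{128}$$

where we have appealed to the identity [18]

$$C^{00}_{lsl'm'} = \frac{\delta_{ll'}\, \delta_{s\bar{m}'}}{\sqrt{2l+1}}\, (-1)^{l-s}. \tag{129}$$

Putting (126)–(128) into (121) and (122), respectively, we obtain from (92), (93), (114) and (123) the formulae

$$K^{(0)} = \sum_{l,m,n} \sum_{s=-l}^{l} \frac{8\pi^2}{2l+1}\, (-1)^{s+m}\, U_{lmn}\, c^l_{sn}\, c^l_{\bar{s}\bar{m}} \left( \frac{1}{5}\, I \otimes I + \frac{2}{5}\, \mathbb{I} \right) = \left( \bar{a}_2 + \frac{3}{5} \bar{a}_3 \right) \left( \frac{1}{5}\, I \otimes I + \frac{2}{5}\, \mathbb{I} \right), \tag{130}$$

$$L^{(0)} = \left( \bar{f}_2 + \frac{3}{5} \bar{f}_3 \right) \left( \frac{1}{5}\, I \otimes I + \frac{2}{5}\, \mathbb{I} \right). \tag{131}$$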
Substitution of (110)–(112), (119), (126), (130), and (131) into (108) and (109) leads to

$$C^{(4)} = -\frac{2}{25}\, \bar{a}_3\, I \otimes I + \frac{6}{25}\, \bar{a}_3\, \mathbb{I} + \xi\, \Phi + K^{(4)}, \tag{132}$$

$$C^{(5)} = -\frac{2}{25}\, \bar{f}_3\, I \otimes I + \frac{6}{25}\, \bar{f}_3\, \mathbb{I} + L^{(4)}. \tag{133}$$

Finally, combining (100)–(107), (132) and (133) with (35), we obtain a formula for the effective elasticity tensor of cubic aggregates of cubic crystallites, which is correct up to terms quadratic in the texture coefficients:

$$C^{\mathrm{eff}} = \lambda^{\mathrm{eff}}\, I \otimes I + 2\mu^{\mathrm{eff}}\, \mathbb{I} + c^{\mathrm{eff}}\, \Phi + c\,(K^{(4)} + L^{(4)}), \tag{134}$$

where

$$\lambda^{\mathrm{eff}} = \lambda - \frac{2c}{25}\,(\zeta + \bar{a}_3 + \bar{f}_3), \qquad 2\mu^{\mathrm{eff}} = 2\mu + \frac{6c}{25}\,(\zeta + \bar{a}_3 + \bar{f}_3), \qquad c^{\mathrm{eff}} = c + \frac{c\zeta}{5} + c\xi. \tag{135}$$

In (134), $\lambda^{\mathrm{eff}}$ and $\mu^{\mathrm{eff}}$ are the effective Lamé constants of the polycrystal, which contain terms quadratic in the texture coefficients and are isotropic functions of the ODF $w$. The isotropic and anisotropic parts of $C^{\mathrm{eff}}$ are $\lambda^{\mathrm{eff}}\, I \otimes I + 2\mu^{\mathrm{eff}}\, \mathbb{I}$ and $c^{\mathrm{eff}}\, \Phi + c\,(K^{(4)} + L^{(4)})$, respectively. The term $c^{\mathrm{eff}}\, \Phi$ is linear in the texture coefficients, whereas $c\,(K^{(4)} + L^{(4)})$ is quadratic. Since the tensors $\Phi$, $K^{(4)}$ and $L^{(4)}$ are harmonic (cf. Section 2.3 above), we have $3K = 3\lambda^{\mathrm{eff}} + 2\mu^{\mathrm{eff}} = 3\lambda + 2\mu = c_{11} + 2c_{12}$, in agreement with (39).

7. HM-V Model, HM-R Model, and Compatibility

Expression (134) shows that under assumption (#) crystallographic texture will generally affect the isotropic part of the effective elasticity tensor of a textured polycrystal. The same expression, which pertains to the special instance of cubic aggregates, however, also reveals that the theory we presented is in a sense too general for practical purposes. In (134), for example, the myriad undetermined coefficients $A^{(\alpha)}_{lmn}$ and $F^{(\alpha)}_{lmnl'm'n'}$ make it impossible to apply this expression in practical computations. For this reason, in this section, we shall establish two simple models by adding some ad hoc simplifying assumptions.

7.1. HM-V MODEL

In our first model, which we call the HM-V model, we discard all the $o(\|w - w_{\mathrm{iso}}\|)$ terms in equation (63) and simply take

$$E' = \bigl( H^{(0)}(R) + H^{(1)}(R, w) \bigr)[\bar{E}] \tag{136}$$
as the constitutive relation for the mean perturbation strain. Under assumption (136), equation (67) becomes

$$\int_{SO(3)} H^{(1)}(R, w)\,(w(R) - w_{\mathrm{iso}})\, dg = 0. \tag{137}$$

In order that equation (137) be satisfied for all independent texture coefficients, the quantity $H^{(1)}(R, w)$ must be independent of $R$. Hence, we conclude from (66), (75) and (77) that

$$H^{(1)}(R, w) = -\zeta\, \Phi. \tag{138}$$

Substituting (75) and (138) into (136), we obtain the following formula for the perturbation strain under the HM-V model:

$$E' = \zeta\,(\Phi(R) - \Phi)[\bar{E}]. \tag{139}$$

REMARK 7.1. For later use, let us examine equation (139) when the cubic crystallites are almost isotropic, i.e., $c \approx 0$. Under such circumstances, we expect that $E(R)$ should be almost a constant function of $R$ and $E' \approx 0$ for all $R$. Since the expression $\Phi(R) - \Phi$ is independent of $c$, we conclude that the HM-V model would be unphysical unless $\zeta \approx 0$ when the crystallites are almost isotropic.

Now, from (77), (78), and (139), we have

$$\overline{D[E']} = c\zeta\,\bigl( \overline{\Phi : \Phi} - \Phi : \Phi \bigr)[\bar{E}]. \tag{140}$$

Substituting (49) and (140) into (35), we derive from (43)–(45) and (104) a simple formula for the effective stiffness tensor of polycrystalline aggregates of cubic crystallites as follows:

$$\begin{aligned}
C^{\mathrm{eff}} &= \lambda B^{(1)} + 2\mu B^{(2)} + c\,\Phi + c\zeta \left( -\frac{2}{25} B^{(1)} + \frac{6}{25} B^{(2)} + \frac{1}{5}\, \Phi - \Phi : \Phi \right) \\
&= \lambda^\circ\, I \otimes I + 2\mu^\circ\, \mathbb{I} + c^\circ\, \Phi + d^\circ\, \Phi : \Phi
\end{aligned} \tag{141}$$

with

$$\lambda^\circ = \lambda - \frac{2c\zeta}{25}, \qquad \mu^\circ = \mu + \frac{3c\zeta}{25}, \qquad c^\circ = c + \frac{c\zeta}{5}, \qquad d^\circ = -c\zeta, \tag{142}$$

where $c$, $\lambda$ and $\mu$ are given by (3), (50) and (51), respectively, and $\zeta$ is an undetermined material constant. Expression (141) for $C^{\mathrm{eff}}$ contains a term quadratic in the texture coefficients. It is applicable to aggregates of cubic crystallites with arbitrary texture symmetry, for which the tensor $\Phi$ is given by (46). The isotropic part of the tensor $C^{\mathrm{eff}}$ can be obtained from formula (26):

$$C^{\mathrm{eff}}_{\mathrm{iso}} = \int_{SO(3)} R^{\otimes 4}\, C^{\mathrm{eff}}\, w_{\mathrm{iso}}\, dg = \lambda^{\mathrm{eff}}\, I \otimes I + 2\mu^{\mathrm{eff}}\, \mathbb{I}, \tag{143}$$
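where

$$\lambda^{\mathrm{eff}} = \lambda^\circ - \frac{512\pi^4 d^\circ}{4725} \sum_{k=-4}^{4} |c^4_{k0}|^2, \qquad \mu^{\mathrm{eff}} = \mu^\circ + \frac{256\pi^4 d^\circ}{1575} \sum_{k=-4}^{4} |c^4_{k0}|^2. \tag{144}$$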
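Two simple consistency checks on (144) can be scripted. The Python sketch below is an illustration only: the function name `effective_lame` and the numerical values of $\lambda$, $\mu$, $c$, $\zeta$ are hypothetical, chosen merely to exercise the formulae. It first confirms that $3\lambda^{\mathrm{eff}} + 2\mu^{\mathrm{eff}}$ is unaffected by arbitrary coefficients $c^4_{k0}$, and then inserts the Cube-texture values quoted in (145), for which the isotropic part should fall back to the constants $\lambda$ and $\mu$ whatever the value of $\zeta$.

```python
import math
import random


def effective_lame(lam0, mu0, d0, c4_k0):
    """Equation (144); c4_k0 holds the nine coefficients c^4_{k0}, k = -4, ..., 4."""
    total = sum(abs(ck) ** 2 for ck in c4_k0)
    lam_eff = lam0 - 512.0 * math.pi ** 4 * d0 / 4725.0 * total
    mu_eff = mu0 + 256.0 * math.pi ** 4 * d0 / 1575.0 * total
    return lam_eff, mu_eff


# Hypothetical constants, for illustration only (not data from the paper).
lam, mu, c, zeta = 100.0, 50.0, -100.0, 0.6

# Equation (142).
lam0 = lam - 2.0 * c * zeta / 25.0
mu0 = mu + 3.0 * c * zeta / 25.0
d0 = -c * zeta

# Check 1: 3*lambda + 2*mu is unchanged for arbitrary (here random complex) coefficients.
rng = random.Random(1)
coeffs = [complex(rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01)) for _ in range(9)]
lam_e, mu_e = effective_lame(lam0, mu0, d0, coeffs)
bulk_defect = abs((3.0 * lam_e + 2.0 * mu_e) - (3.0 * lam + 2.0 * mu))

# Check 2: Cube texture, equation (145); entries are ordered k = -4, ..., 4.
c00 = 21.0 / (32.0 * math.pi ** 2)
c40 = math.sqrt(70.0) / 14.0 * c00
cube = [c40, 0.0, 0.0, 0.0, c00, 0.0, 0.0, 0.0, c40]
lam_cube, mu_cube = effective_lame(lam0, mu0, d0, cube)

print(bulk_defect, lam_cube - lam, mu_cube - mu)
```

The second check works because, for the Cube texture, the sum in (144) equals $189/(256\pi^4)$, so that the corrections $2d^\circ/25$ and $3d^\circ/25$ cancel the $\zeta$-dependent parts of $\lambda^\circ$ and $\mu^\circ$.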
The Lamé constants $\lambda^{\mathrm{eff}}$ and $\mu^{\mathrm{eff}}$ pertaining to the isotropic part of $C^{\mathrm{eff}}$ are isotropic functions of the ODF. In the limit when all the $c^4_{m0}$ coefficients vanish, the polycrystal exhibits elastic isotropy, and we have $\lambda^{\mathrm{eff}} = \lambda^\circ$, $\mu^{\mathrm{eff}} = \mu^\circ$ for the isotropic polycrystal.

When all the cubic crystallites in the polycrystal have the same orientation as the reference single crystal, the polycrystal is said to have the ideal Cube texture in the jargon of metallurgy. Under the theoretical setting of the present paper, a polycrystal with the ideal Cube texture is indistinguishable from a single crystal. By substituting into (141) the values of texture coefficients appropriate for the Cube texture, i.e.,

$$c^4_{00} = \frac{21}{32\pi^2}, \qquad c^4_{40} = c^4_{\bar{4}0} = \frac{\sqrt{70}}{14}\, c^4_{00}, \tag{145}$$

and all other $c^4_{m0} = 0$, it is straightforward to verify by direct computations that the effective elasticity tensor of the polycrystal reduces to

$$C^{\mathrm{eff}} = c_{12}\, I \otimes I + 2c_{44}\, \mathbb{I} + c\, B^{(3)}(I), \tag{146}$$

which is none other than the elasticity tensor of the reference single crystal (cf. equation (5) above). This result, while expected (see Remark 3.2), is still comforting, for it serves as a check on the correctness of the computations that lead to equation (141).

7.2. HM-R MODEL

By reversing the roles of stress and strain, we may follow the same procedure as the derivation of (141) to obtain a parallel expression for the effective compliance tensor $S^{\mathrm{eff}}$. For the polycrystalline aggregate, let

$$T' = T(R) - \bar{T} \tag{147}$$

be the mean perturbation stress of the crystallites with orientations in an infinitesimal group volume around $R$ (cf. Section 3 for a precise definition of the function $T(\cdot)$). Instead of assumption (#) on the mean perturbation strain $E'$ (see Section 5), now we postulate the constitutive relation

$$T' = \mathcal{T}(R, w, \bar{T}) \tag{148}$$
for the mean perturbation stress, with $\mathcal{T}(R, w, 0) = 0$. Next we linearize equation (148) with respect to $\bar{T}$ and write

$$T' = G(R, w)[\bar{T}], \tag{149}$$

and we express the constitutive function $G$ in the Taylor formula

$$G(R, w) = G^{(0)}(R) + G^{(1)}(R)[w - w_{\mathrm{iso}}] + o(\|w - w_{\mathrm{iso}}\|), \tag{150}$$

where the $G^{(\beta)}$ enjoy the minor symmetries. In our second model, which we call the HM-R model, we ignore all the $o(\|w - w_{\mathrm{iso}}\|)$ terms in equation (150) and take

$$T' = \bigl( G^{(0)}(R) + G^{(1)}(R)[w - w_{\mathrm{iso}}] \bigr)[\bar{T}] \tag{151}$$

as the constitutive equation for the mean perturbation stress. Parallel to what we obtain under the HM-V model (see equation (139)), we find that under the HM-R model equation (151) reduces to the form

$$T' = \eta\,(\Phi(R) - \Phi)[\bar{T}], \tag{152}$$

where $\eta$ is an undetermined material constant. Let $L(R) = S(R) - \bar{S}$ be the perturbation compliance tensor of crystallites with orientation $R$. Parallel to equation (78), we have

$$L(R) = s\,(\Phi(R) - \Phi). \tag{153}$$

The mean strain tensor $\bar{E}$ and the effective compliance tensor $S^{\mathrm{eff}}$ of the polycrystal are given by the equation (cf. equations (35) and (36))

$$\bar{E} = S^{\mathrm{eff}}[\bar{T}] = \bar{S}[\bar{T}] + \overline{L[T']}. \tag{154}$$

Substituting (56), (152) and (153) into (154), we obtain the following formula for the effective compliance tensor of the polycrystal:

$$S^{\mathrm{eff}} = \lambda^\circ_s\, I \otimes I + 2\mu^\circ_s\, \mathbb{I} + c^\circ_s\, \Phi + d^\circ_s\, \Phi : \Phi \tag{155}$$

with

$$\lambda^\circ_s = \lambda_s - \frac{2s\eta}{25}, \qquad \mu^\circ_s = \mu_s + \frac{3s\eta}{25}, \qquad c^\circ_s = s + \frac{s\eta}{5}, \qquad d^\circ_s = -s\eta, \tag{156}$$

where $\lambda_s$, $\mu_s$, and $s$ are given in (54) and (57). The isotropic part of the tensor $S^{\mathrm{eff}}$ is

$$S^{\mathrm{eff}}_{\mathrm{iso}} = \lambda^{\mathrm{eff}}_s\, I \otimes I + 2\mu^{\mathrm{eff}}_s\, \mathbb{I}, \tag{157}$$
where

$$\lambda^{\mathrm{eff}}_s = \lambda^\circ_s - \frac{512\pi^4 d^\circ_s}{4725} \sum_{k=-4}^{4} |c^4_{k0}|^2, \qquad \mu^{\mathrm{eff}}_s = \mu^\circ_s + \frac{256\pi^4 d^\circ_s}{1575} \sum_{k=-4}^{4} |c^4_{k0}|^2. \tag{158}$$

7.3. COMPATIBILITY. MODELS HM-V$_c$ AND HM-R$_c$

The HM-V and the HM-R models each carry one undetermined material constant, $\zeta$ and $\eta$, respectively, and the predictions from these two models need not be compatible. Nevertheless, we can use the requirement

$$C^{\mathrm{eff}} : S^{\mathrm{eff}} = \mathbb{I} \tag{159}$$

to determine $\zeta$ and $\eta$ so that formulae (141) and (155) agree with each other to within terms linear in the texture coefficients. To this end, we substitute (141) and (155) into (159) and obtain, with the help of Table II, the expansion

$$C^{\mathrm{eff}} : S^{\mathrm{eff}} = (3\lambda^\circ \lambda^\circ_s + 2\lambda^\circ \mu^\circ_s + 2\lambda^\circ_s \mu^\circ)\, I \otimes I + 4\mu^\circ_s \mu^\circ\, \mathbb{I} + (2\mu^\circ c^\circ_s + 2\mu^\circ_s c^\circ)\, \Phi + \cdots. \tag{160}$$

In order that (159) be satisfied to within terms linear in the texture coefficients, we impose in (160) the requirements that

$$3\lambda^\circ \lambda^\circ_s + 2\lambda^\circ \mu^\circ_s + 2\lambda^\circ_s \mu^\circ = 0, \qquad 4\mu^\circ_s \mu^\circ = 1, \qquad 2\mu^\circ c^\circ_s + 2\mu^\circ_s c^\circ = 0. \tag{161}$$

From (3), (50), (51), (54)–(57), we can get the relations

$$\lambda_s = -\frac{50\lambda\mu + 5\lambda c + 2c^2}{2(3\lambda + 2\mu)(10\mu + 3c)(5\mu - c)}, \qquad \mu_s = \frac{5(10\mu + c)}{4(10\mu + 3c)(5\mu - c)}, \qquad s = \frac{-25c}{2(5\mu - c)(10\mu + 3c)}. \tag{162}$$

Substituting (162) into (156) and then substituting (142) and (156) into (160), we arrive at the following equations:

$$-\frac{(-50\zeta\mu + 50\eta\mu - 25c + 6c\eta\zeta - 5c\zeta)\, c}{25(-5\mu + c)(10\mu + 3c)} = 0, \tag{163}$$

$$\frac{3(-50\zeta\mu + 50\eta\mu - 25c + 6c\eta\zeta - 5c\zeta)\, c}{100(-5\mu + c)(10\mu + 3c)} = 0, \tag{164}$$

$$\frac{(-50\zeta\mu - 25c + 25c\zeta + 30c\eta + 12c\eta\zeta + 50\eta\mu)\, c}{20(-5\mu + c)(10\mu + 3c)} = 0. \tag{165}$$
Since (163) is equivalent to (164), we need to solve only (164) and (165). We obtain two pairs of solutions $(\zeta, \eta)$. For one pair, we find $\zeta \to -10$ and $\eta \to -10$ as $c \to 0$. By Remark 7.1, this pair of solutions is deemed unphysical and is discarded. The pair of physical solutions is given by the formulae

$$\zeta = -5\, \frac{10\mu - \sqrt{2(5\mu - c)(10\mu + 3c)}}{6c - \sqrt{2(5\mu - c)(10\mu + 3c)}}, \tag{166}$$

$$\eta = \frac{5\bigl(10\mu - \sqrt{2(5\mu - c)(10\mu + 3c)}\bigr)}{2(-5\mu + 3c)}. \tag{167}$$

Note that for this pair, $\zeta \to 0$ and $\eta \to 0$ as $c \to 0$. With this choice of $\zeta$ and $\eta$, the constitutive equations as derived by the two models will agree with each other to terms linear in the texture coefficients. Henceforth we shall refer to (141) and (155) with $\zeta$ and $\eta$ given by (166) and (167) as predictions from models HM-V$_c$ and HM-R$_c$, respectively. The subscript "c" in HM-V$_c$ and HM-R$_c$ will remind us that these models are special versions of HM-V and HM-R, where the material constants $\zeta$ and $\eta$ are chosen to guarantee approximate compatibility between the two. At this point, however, we would like to add a cautionary remark: Whereas compatibility provides one easy means to put a numerical prediction on $\zeta$ and $\eta$ in (166) and (167), respectively, there is no a priori reason why models HM-V and HM-R should be compatible with each other. Hence, one should not attach too much significance to the formulae (166) and (167) for $\zeta$ and $\eta$.

8. Examples and Discussion

With equation (141) for $C^{\mathrm{eff}}$ and equation (155) for $S^{\mathrm{eff}}$ in hand, the first question to ask is whether these formulae from the simple models HM-V and HM-R are adequate. To shed some light on this question, one approach is to check these formulae through numerical computations. For definiteness, let us restrict our discussion here to formula (141) for $C^{\mathrm{eff}}$. A polycrystal can be taken as an inhomogeneous elastic body whose elasticity tensor is piecewise constant. For such a body, the classical boundary-value problems of elastostatics are well posed.

Given a set of single-crystal elastic constants and an arrangement of crystallites (including their orientations), a selected set of six boundary-value problems $P_i$ ($i = 1, 2, \ldots, 6$) can be solved by using the finite element method to get the corresponding mean stress $\bar{T}_i$ and mean strain $\bar{E}_i$, from which the effective elasticity tensor $C^{\mathrm{eff}}$ which maps $\bar{E}_i$ to $\bar{T}_i$ for each $i$ can be calculated. By equation (141), each component $C^{\mathrm{eff}}_{ijkl}$ of the effective elasticity tensor carries only one and the same undetermined coefficient $\zeta$. From the computed value of $C^{\mathrm{eff}}_{ijkl}$, a value of $\zeta$ can be determined. The values of $\zeta$ thus obtained from various components $C^{\mathrm{eff}}_{ijkl}$ can then be checked for agreement with each other. Such checks can be repeated for different textures and for different choices of boundary-value problems $P_i$.
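The consistency check just described can be illustrated in its simplest setting, the isotropic aggregate, where (142) gives $\mu^{\mathrm{eff}} = \mu + 3c\zeta/25$ and $\lambda^{\mathrm{eff}} = \lambda - 2c\zeta/25$. The Python sketch below is a hedged illustration, not the finite-element procedure itself: in place of finite-element output it uses the self-consistent values for copper quoted in Table III, and it assumes the relations $c = c_{11} - c_{12} - 2c_{44}$, $\lambda = c_{12} + c/5$, $\mu = c_{44} + c/5$, which are consistent with (146) and with the Voigt column of Table III. The function names are hypothetical.

```python
# Single-crystal constants of copper quoted in Example 8.1 (GPa).
c11, c12, c44 = 169.05, 121.93, 75.50

# Assumed relations for c, lambda, mu (consistent with (146) and Table III).
c = c11 - c12 - 2.0 * c44
lam_voigt = c12 + c / 5.0
mu_voigt = c44 + c / 5.0


def zeta_from_mu(mu_eff):
    """Invert mu_eff = mu + 3*c*zeta/25, i.e. (142) for an isotropic aggregate."""
    return 25.0 * (mu_eff - mu_voigt) / (3.0 * c)


def zeta_from_lambda(lam_eff):
    """Invert lam_eff = lam - 2*c*zeta/25, i.e. (142) for an isotropic aggregate."""
    return -25.0 * (lam_eff - lam_voigt) / (2.0 * c)


# Self-consistent values of Morris for isotropic copper, as listed in Table III (GPa).
lam_sc, mu_sc = 105.2, 48.7

zeta_mu = zeta_from_mu(mu_sc)
zeta_lam = zeta_from_lambda(lam_sc)
print(f"zeta from mu: {zeta_mu:.4f}, zeta from lambda: {zeta_lam:.4f}")
```

If a single constant $\zeta$ is adequate, the two estimates should agree to within the rounding of the input data; for textured samples the same inversion would be applied component by component to (141).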
Of course, another possible approach for checking the adequacy of formulae (141) and (155) is to seek their experimental corroboration. In this regard, however, one should note that crystallographic texture is only one of the microstructural factors, albeit an important factor, which influence the elastic response of an anisotropic polycrystal. In seeking experimental corroboration of (141) or (155), efforts should be focused on strongly textured samples for which the term quadratic in texture coefficients in these formulae has such a sizable effect that it could not be confused with the influence from other microstructures.

When the single-crystal elastic constants are known, formulae (141) and (155) each carry only one undetermined material constant. In metallurgical practice, however, because of the effects of alloying elements (even if they are in minute quantities), it is difficult to get accurate estimates of the elastic constants of the crystallites in a polycrystalline metal. Equations (141) and (155) are then more properly looked upon as formulae with four undetermined coefficients. On the other hand, sometimes we just need rough estimates of the elastic constants of a polycrystal. Then, coupling single-crystal elastic constants from handbooks with either model HM-V$_c$ or HM-R$_c$ may suffice.

In closing, we present two examples, in which the predictions of HM-V$_c$ and HM-R$_c$ are compared with results of experimental measurement and/or calculations by the self-consistent method [19].

EXAMPLE 8.1. We consider isotropic aggregates of copper. The single-crystal elastic constants are taken to be $c_{11} = 169.05$ GPa, $c_{12} = 121.93$ GPa, $c_{44} = 75.50$ GPa [19]. In Table III we list a comparison of the predicted values of $\lambda^{\mathrm{eff}}$ and $\mu^{\mathrm{eff}}$ from models HM-V$_c$, HM-R$_c$, the Voigt and the Reuss model, and Morris' calculations [19] by a self-consistent scheme. As reference, we include also the values of $\lambda^{\mathrm{eff}}$ and $\mu^{\mathrm{eff}}$ for isotropic aggregates, as inferred from ultrasound measurements [7] on a batch of C122 copper samples.

EXAMPLE 8.2. Morris [19] reported the values of stiffness components pertaining to an orthorhombic aggregate of α-Fe crystallites, as calculated by a self-consistent scheme. In his example, the values of the relevant texture coefficients are:

$$c^4_{00} = -0.02475209611, \qquad c^4_{20} = -0.001375676243, \qquad c^4_{40} = 0.003290910316,$$
Table III. Comparison of predicted and measured values of Lamé constants for isotropic aggregates of copper. Units are in GPa

Model                        Voigt     HM-V$_c$   HM-R$_c$   Reuss     Morris    Expt.
$\lambda^{\mathrm{eff}}$     101.16    106.14     106.14     110.89    105.2     106.5
$\mu^{\mathrm{eff}}$         54.72     47.24      47.24      40.12     48.7      47.35
Table IV. Stiffness components pertaining to an orthorhombic aggregate of iron crystallites. Units are in GPa

Model                          Voigt    HM-V$_c$   HM-R$_c$   Reuss    Morris
$C^{\mathrm{eff}}_{1111}$      295.3    286.2      286.2      276.6    286.8
$C^{\mathrm{eff}}_{2222}$      297.1    288.3      288.4      278.8    288.8
$C^{\mathrm{eff}}_{3333}$      311.7    305.7      305.8      296.7    305.5
$C^{\mathrm{eff}}_{2233}$      102.8    105.6      105.5      110.0    105.3
$C^{\mathrm{eff}}_{3311}$      104.6    107.7      107.7      112.2    107.3
$C^{\mathrm{eff}}_{1122}$      119.1    125.1      125.1      130.2    123.9
$C^{\mathrm{eff}}_{2323}$      77.8     70.2       70.3       64.6     71.8
$C^{\mathrm{eff}}_{3131}$      79.6     71.9       71.9       65.9     73.5
$C^{\mathrm{eff}}_{1212}$      94.1     87.4       87.4       79.7     88.4
and the single-crystal elastic constants are taken as c11 = 237 GPa, c12 = 141 GPa, c44 = 116 GPa. In Table IV, we list, in juxtaposition with the values of stiffness components reported by Morris, the corresponding values predicted by models HM-Vc and HM-Rc . We include also the predictions from the Voigt and the Reuss model in the same table as reference.
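The isotropic entries of Table III that depend only on the single-crystal constants (the Voigt, Reuss, HM-V$_c$ and HM-R$_c$ columns) can be recomputed from (142), (156), (162), (166) and (167). The Python sketch below is a minimal illustration under the same assumption as before, namely $c = c_{11} - c_{12} - 2c_{44}$, $\lambda = c_{12} + c/5$ and $\mu = c_{44} + c/5$; the helper `stiffness_from_compliance` is a hypothetical name for the standard inversion of an isotropic compliance tensor. It also evaluates the three compatibility residuals of (161), which should vanish up to rounding.

```python
import math

# Single-crystal constants of copper from Example 8.1 (GPa).
c11, c12, c44 = 169.05, 121.93, 75.50

# Assumed relations for c, lambda, mu (consistent with (146) and Table III).
c = c11 - c12 - 2.0 * c44
lam = c12 + c / 5.0
mu = c44 + c / 5.0

# Compliance-side constants, equation (162).
P = (10.0 * mu + 3.0 * c) * (5.0 * mu - c)
lam_s = -(50.0 * lam * mu + 5.0 * lam * c + 2.0 * c ** 2) / (2.0 * (3.0 * lam + 2.0 * mu) * P)
mu_s = 5.0 * (10.0 * mu + c) / (4.0 * P)
s = -25.0 * c / (2.0 * P)

# Physical pair of solutions, equations (166) and (167).
root = math.sqrt(2.0 * P)
zeta = -5.0 * (10.0 * mu - root) / (6.0 * c - root)
eta = 5.0 * (10.0 * mu - root) / (2.0 * (-5.0 * mu + 3.0 * c))

# Isotropic aggregate: all texture coefficients vanish, so (142) and (156) apply directly.
lam_v, mu_v, c_v = lam - 2.0 * c * zeta / 25.0, mu + 3.0 * c * zeta / 25.0, c + c * zeta / 5.0
lam_r, mu_r, c_r = lam_s - 2.0 * s * eta / 25.0, mu_s + 3.0 * s * eta / 25.0, s + s * eta / 5.0


def stiffness_from_compliance(lam_c, mu_c):
    """Lame constants of the inverse of the isotropic tensor lam_c I(x)I + 2 mu_c II."""
    mu_out = 1.0 / (4.0 * mu_c)
    three_k = 1.0 / (3.0 * lam_c + 2.0 * mu_c)   # equals 3*lambda + 2*mu of the inverse
    return (three_k - 2.0 * mu_out) / 3.0, mu_out


table = {
    "Voigt": (lam, mu),
    "HM-Vc": (lam_v, mu_v),
    "HM-Rc": stiffness_from_compliance(lam_r, mu_r),
    "Reuss": stiffness_from_compliance(lam_s, mu_s),
}

# Residuals of the three requirements in (161).
residuals = (
    3.0 * lam_v * lam_r + 2.0 * lam_v * mu_r + 2.0 * lam_r * mu_v,
    4.0 * mu_r * mu_v - 1.0,
    2.0 * mu_v * c_r + 2.0 * mu_r * c_v,
)

for name, (lam_eff, mu_eff) in table.items():
    print(f"{name:6s} lambda = {lam_eff:7.2f} GPa, mu = {mu_eff:6.2f} GPa")
print("zeta =", zeta, "eta =", eta, "residuals of (161):", residuals)
```

Under these assumptions the computed values agree with the corresponding columns of Table III to within about 0.01 GPa; the Morris and experimental columns are external data and are not reproduced by this calculation.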
Acknowledgements

The findings reported here were obtained in the course of work supported in part by a grant from the U.S. National Science Foundation (No. DMS-0103979), a DEPSCoR grant from AFOSR (No. F49620-02-1-0243), and an R&D Excellence grant from the Kentucky Science & Engineering Foundation (No. KSEF-148-50202-19).

References

1. H.J. Bunge, Texture Analysis in Materials Science: Mathematical Methods. Butterworths, London (1982).
2. R.J. Roe, Description of crystallite orientation in polycrystalline materials: III, General solution to pole figures. J. Appl. Phys. 36 (1965) 2024–2031.
3. P.R. Morris, Averaging fourth-rank tensors with weight functions. J. Appl. Phys. 40 (1969) 447–448.
4. C.M. Sayers, Ultrasonic velocities in anisotropic polycrystalline aggregates. J. Phys. D 15 (1982) 2157–2167.
5. C.-S. Man, On the constitutive equations of some weakly-textured materials. Arch. Rational Mech. Anal. 143 (1998) 77–103.
6. R. Paroni and C.-S. Man, Constitutive equations of elastic polycrystalline materials. Arch. Rational Mech. Anal. 150 (1999) 153–177.
7. C.-S. Man, X. Fan and K. Kawashima. In preparation.
8. R.A. Toupin and R.S. Rivlin, Dimensional changes in crystals caused by dislocations. J. Math. Phys. 1 (1960) 8–15.
9. L.C. Biedenharn and J.D. Louck, Angular Momentum in Quantum Physics. Cambridge Univ. Press, Cambridge (1984).
10. R.J. Roe, Inversion of pole figures for materials having cubic crystal symmetry. J. Appl. Phys. 37 (1966) 2069–2072.
11. W. Miller, Symmetry Groups and Their Applications. Academic Press, New York (1972).
12. L. Tisza, Zur Deutung der Spektren mehratomiger Moleküle. Z. Physik 82 (1933) 48–72.
13. H.A. Jahn, Note on the Bhagavantam–Suryanarayana method of enumerating the physical constants of crystals. Acta Cryst. 2 (1949) 30–33.
14. C.-S. Man, Material tensors of weakly-textured polycrystals. In: W. Chien et al. (eds), Proc. of the 3rd Internat. Conf. on Nonlinear Mechanics. Shanghai Univ. Press, Shanghai (1998) pp. 87–94.
15. Yu.I. Sirotin, Decomposition of material tensors into irreducible parts. Soviet Phys. Crystallogr. 19 (1975) 565–568.
16. M.J. Beran, T.A. Mason and B.L. Adams, Bounding elastic constants of an orthotropic polycrystal using measurements of the microstructure. J. Mech. Phys. Solids 44 (1996) 1543–1563.
17. M. Huang and C.-S. Man, Elastic stiffness and compliance of anisotropic aggregates of cubic crystallites. In: Q.-S. Zheng, M.-F. Fu and G.-Q. Song (eds), Mechanics and Its Applications in Civil Engineering (In Honor of Professor D.-P. Yang's 70th Anniversary). Tsinghua Univ. Press, Beijing (2002) pp. 107–116 (in Chinese).
18. D.A. Varshalovich, A.N. Moskalev and V.K. Khersonskii, Quantum Theory of Angular Momentum. World Scientific, Singapore (1988).
19. P.R. Morris, Elastic constants of polycrystals. Internat. J. Engrg. Sci. 8 (1970) 49–61.
Reconstruction Formula for Identifying Cracks

MASARU IKEHATA$^{1,\star}$ and GEN NAKAMURA$^{2,\star\star}$

$^1$ Department of Mathematics, Faculty of Engineering, Gunma University, Kiryu, 376-8515, Japan. E-mail: [email protected]
$^2$ Department of Mathematics, Graduate School of Sciences, Hokkaido University, Sapporo, 060-0810, Japan. E-mail: [email protected]

Received 23 July 2002; in revised form 3 July 2003

Abstract. We consider an inverse boundary value problem for identifying cracks in a conductive medium. By combining the probe method and an analysis for the behavior of the "reflected solution", we derive a reconstruction formula for identifying cracks from the Neumann to Dirichlet map. We give also some related results.

Mathematics Subject Classifications (2000): 35J05, 35J55, 35R30.

Key words: inverse boundary value problem, probe method, indicator function.

Dedicated to the memory of Clifford Truesdell, who led to a renaissance in rational mechanics through his teachings and research

$^\star$ Partially supported by Grant-in-Aid for Scientific Research (C)(2) (No. 13640152) of Japan Society for the Promotion of Science.
$^{\star\star}$ Partially supported by Grant-in-Aid for Scientific Research (B)(2) (No. 14340038) of Japan Society for the Promotion of Science.

1. Introduction

In this paper we give a reconstruction formula for identifying a crack in a homogeneous isotropic conductive medium in $\mathbb{R}^n$ ($n = 2$ or 3) by boundary measurements. More precisely, we take the Neumann to Dirichlet map as boundary measurements and apply the probe method to reconstruct the crack. The probe method was introduced by the first author in [8]. Therein he established a reconstruction formula for unknown inclusions in a conductive medium. It uses singular solutions and Runge's theorem to approximate the solutions. It should be pointed out that Isakov [10] was the first who gave a uniqueness theorem for identifying unknown inclusions in a conductive medium by using singular solutions and Runge's theorem. In order to apply the probe method, we developed an analysis characterizing the behavior of the "reflected solution" given later in Lemma 3.1. Our method works without change for multiple cracks. We will remark about other types of measurements at
525 C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 525–538. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
the boundary. These are given in terms of mixed type boundary conditions. For example, fixing Dirichlet data at a part of the boundary, we measure the corresponding Dirichlet data on the other part of the boundary for given Neumann data on the same part of the boundary. By switching Dirichlet data to Neumann data and vice versa in the preceding example, we obtain another type of boundary measurements. Our method can be generalized to treat a more general problem. For example, we will give an analogous reconstruction formula for identifying cracks in a homogeneous anisotropic elastic medium. Although the basic idea of the proof is the same, it requires much more heavy analysis for characterizing the behavior of the “reflected solution”. In order to present the idea of our method most efficiently, we give the details of argument for conductive media but only state the result for elastic media without proof. The details of proof for elastic media will be given elsewhere. For elastic media, the mixed type boundary condition is widely used in practical applications. Since we regard the conductivity equation as a simple analogue of the elasticity equation, we consider the mixed type boundary condition even for the conductivity equation. There are many related results: Bryan and Vogelius [5], Kress [12], Ben Abda et al. [2], Andrieux et al. [1], Ben Abda and Bui [3] and Brühl et al. [4]. Ben Abda et al. assume the nonvanishing of the stress intensity factor of a surface breaking crack in a two-dimensional medium and the nonvanishing of displacement jump across a two-dimensional crack in a plane and use the reciprocity gap principle to reconstruct the crack. Brühl et al. use the Kirsch method [11]. All other authors transform the problem to some optimization problems and use a Newton-type algorithm to solve the optimization problems. We end this section by defining several notations used in this paper. Let X be an open submanifold of a manifold Y . 
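Following Hörmander [7], for a space $F$ of distributions in $Y$, we define
$$\overline{F}(X) := \{f|_X ;\ f \in F\}, \qquad \dot{F}(\overline{X}) := \{f \in F ;\ \operatorname{supp} f \subset \overline{X}\},$$
where $f|_X$ is the restriction of $f$ to $X$. These notations will be used for some Sobolev spaces defined on $X$ and $\overline{X}$, and we assume sufficient regularity of $X$, $Y$ and the boundary $\partial X$ of $X$ for those definitions.

Let $\Omega \subset \mathbb{R}^n$ ($n = 2$ or $3$) be a bounded domain with $C^2$ boundary $\Gamma$. For $n = 2$ or $n = 3$, let $S \subset \Omega$ be a $C^2$ Jordan closed curve or closed connected surface and $\Sigma \subset S$ be an open curve or surface, respectively. When $n = 3$, we assume the boundary $\partial\Sigma$ of $\Sigma$ to be $C^2$. $\Sigma$ will be considered as a crack. We sometimes divide $\Gamma$ into two parts:
$$\Gamma = \overline{\Gamma_D} \cup \overline{\Gamma_N}, \qquad \Gamma_D \cap \Gamma_N = \emptyset,$$
where $\Gamma_D, \Gamma_N \subset \Gamma$ are open subsets. When $n = 3$ and $\Gamma_D \neq \emptyset$, $\Gamma_N \neq \emptyset$, we assume that the boundaries $\partial\Gamma_D$, $\partial\Gamma_N$ of $\Gamma_D$, $\Gamma_N$ are $C^2$. One of $\Gamma_D$, $\Gamma_N$ will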
RECONSTRUCTION FORMULA FOR IDENTIFYING CRACKS
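be considered as the place where we do the measurements. Note that we do not exclude the case $\Gamma_D = \emptyset$ or $\Gamma_N = \emptyset$. Let $\Omega_-$ be the open subset of $\Omega$ with boundary $S$ and $\Omega_+ := \Omega \setminus \overline{\Omega_-}$. The trace operator to $\Gamma$ is denoted by $\gamma$ and the trace operators from $\Omega_\pm$ to $S$ are denoted by $\gamma_\pm$, respectively. The unit normal $\nu$ at $\Gamma$ and at $S$ is directed into $\mathbb{R}^n \setminus \overline{\Omega}$ and into $\Omega_+$, respectively. The normal derivative is denoted by $\partial_\nu$, and the partial derivative with respect to the Cartesian variable $x_j$ is denoted by $\partial_j$ or $\partial_{x_j}$. Also, we use $C$ to denote a general positive constant in our estimates.

For example, by taking $X = \Sigma$, $k \in \mathbb{R}$ ($|k| \le 1$), we can define the Sobolev spaces $\overline{H}^k(\Sigma)$ and $\dot{H}^k(\overline{\Sigma})$, which are subspaces of the Sobolev space $H^k(S)$. These are the subspaces $\overline{H}^{\mathrm{loc}}_{(k)}(\Sigma)$ and $\dot{H}^{\mathrm{loc}}_{(k)}(\overline{\Sigma})$ of $H^{\mathrm{loc}}_{(k)}(S)$ in [7], respectively. Also, we denote by $\overline{H}^k(\Sigma)^*$ the dual space of $\overline{H}^k(\Sigma)$. As in $\overline{H}^k(\Sigma)^*$, the superscript $*$ will be used to denote the dual spaces of function spaces. For $\tfrac12 < s \le 1$, we define $H^s(\Omega \setminus \overline{\Sigma})$ by
$$H^s(\Omega \setminus \overline{\Sigma}) := \{u \in \mathcal{D}'(\Omega);\ u_\pm := u|_{\Omega_\pm} \in \overline{H}^s(\Omega_\pm);\ \gamma_+ u_+ - \gamma_- u_- = 0 \text{ on } S \setminus \overline{\Sigma}\}$$
with the norm
$$\|u\|_{H^s(\Omega \setminus \overline{\Sigma})} := \|u_+\|_{\overline{H}^s(\Omega_+)} + \|u_-\|_{\overline{H}^s(\Omega_-)}.$$
Eller [6] gives $H^s(\Omega \setminus \overline{\Sigma})$ in a different way, but both definitions are equivalent. The advantage of defining the Sobolev spaces $\overline{H}^k(\Sigma)$, $\dot{H}^k(\overline{\Sigma})$ as in [7] and $H^s(\Omega \setminus \overline{\Sigma})$ as above is to avoid defining another Sobolev space $\widetilde{H}^{1/2}(\Sigma)$, which is nothing but $\dot{H}^{1/2}(\overline{\Sigma})$; we automatically have $[u] := \gamma_+ u_+ - \gamma_- u_- \in \dot{H}^{1/2}(\overline{\Sigma})$ if $u \in H^1(\Omega \setminus \overline{\Sigma})$, and we can avoid heavy notation such as $(\widetilde{H}^{1/2}(\Sigma))^*$.

2. Crack in a Conductive Medium

It is natural to take current and voltage as the input and output data for the measurements, respectively. We consider the following two types of direct problems.

TYPE 1. For any given $g \in H_\#^{-1/2}(\Gamma) := \{g \in H^{-1/2}(\Gamma);\ \langle g, 1\rangle = 0\}$, find a solution $u \in H_\#^1(\Omega \setminus \overline{\Sigma}) := \{u \in H^1(\Omega \setminus \overline{\Sigma});\ \int_\Gamma u\, d\sigma = 0\}$ to
$$\begin{cases} \Delta u = 0 & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu u = 0 & \text{on } \Sigma \ \ (\text{i.e., } \gamma_+ \partial_\nu u = \gamma_- \partial_\nu u = 0 \text{ on } \Sigma), \\ \partial_\nu u = g & \text{on } \Gamma, \end{cases} \tag{DP$_1$}$$
where $\langle\,,\rangle$ is the pairing between $H^{-1/2}(\Gamma)$ and $H^{1/2}(\Gamma)$, and $d\sigma$ is the line or surface element for $n = 2$ or $3$, respectively.

TYPE 2. Let $\Gamma_D \neq \emptyset$. For any given $F \in L^2(\Omega \setminus \overline{\Sigma})$, $f \in \overline{H}^{1/2}(\Gamma_D)$, $g \in \overline{H}^{-1/2}(\Gamma_N)$, find a solution $u \in H^1(\Omega \setminus \overline{\Sigma})$ to
$$\begin{cases} \Delta u = F & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu u = 0 & \text{on } \Sigma, \\ u = f \ \text{on } \Gamma_D, \quad \partial_\nu u = g & \text{on } \Gamma_N. \end{cases} \tag{DP$_2$}$$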
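We have well-posedness for a slightly more general direct problem including the case $\Gamma_D = \emptyset$.

THEOREM 2.1. For any given $F \in L^2(\Omega \setminus \overline{\Sigma})$, $f \in \overline{H}^{1/2}(\Gamma_D)$, $g \in \overline{H}^{-1/2}(\Gamma_N)$, $p \in \overline{H}^{-1/2}(\Sigma)$, there exists a unique solution $u \in H^1(\Omega \setminus \overline{\Sigma})$ to
$$\begin{cases} \Delta u = F & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu u = p & \text{on } \Sigma, \\ u = f \ \text{on } \Gamma_D, \quad \partial_\nu u = g & \text{on } \Gamma_N, \end{cases} \tag{DP$_2'$}$$
and it satisfies
$$\|u\|_{H^1(\Omega \setminus \overline{\Sigma})} \le C\left( \|F\|_{L^2(\Omega \setminus \overline{\Sigma})} + \|f\|_{\overline{H}^{1/2}(\Gamma_D)} + \|g\|_{\overline{H}^{-1/2}(\Gamma_N)} + \|p\|_{\overline{H}^{-1/2}(\Sigma)} \right). \tag{2.1}$$
Here, if $\Gamma_D = \emptyset$, we assume
$$\int_{\Omega \setminus \overline{\Sigma}} F\, dx = \langle g, 1\rangle \tag{2.2}$$
and $\int_\Gamma u\, d\sigma = 0$, and $g \in \overline{H}^{-1/2}(\Gamma_N)$ has to be replaced by $g \in H^{-1/2}(\Gamma)$.

Proof. We give an outline of the proof. For the case $\Gamma_D = \emptyset$, we seek a solution $u \in H_\#^1(\Omega \setminus \overline{\Sigma}) := \{u \in H^1(\Omega \setminus \overline{\Sigma});\ \int_\Gamma u\, d\sigma = 0\}$ to the variational equation
$$\int_{\Omega \setminus \overline{\Sigma}} \nabla u \cdot \nabla v\, dx = -\int_{\Omega \setminus \overline{\Sigma}} F v\, dx + \int_\Gamma g v\, d\sigma - \int_\Sigma p[v]\, d\sigma \tag{2.3}$$
for any $v \in H_\#^1(\Omega \setminus \overline{\Sigma})$, where $\int_{\Omega \setminus \overline{\Sigma}} q\, dx := \int_{\Omega_+} q\, dx + \int_{\Omega_-} q\, dx$ and the integral $\int_\Gamma g v\, d\sigma$, for example, is understood as the pairing $\langle g, v\rangle$ between $g \in H^{-1/2}(\Gamma)$ and $v|_\Gamma \in H^{1/2}(\Gamma)$. Just as in [6], we can prove that solving this variational problem is equivalent to finding a solution $u \in H_\#^1(\Omega \setminus \overline{\Sigma})$ to (DP$_2'$), and that this variational problem admits a unique solution $u \in H_\#^1(\Omega \setminus \overline{\Sigma})$ with the estimate (2.1).

When $\Gamma_D \neq \emptyset$, we proceed as follows. By the definition of $\overline{H}^{1/2}(\Gamma_D)$, there is an extension $\tilde f \in H^{1/2}(\Gamma)$ of $f$ such that $\|\tilde f\|_{H^{1/2}(\Gamma)} \le \|f\|_{\overline{H}^{1/2}(\Gamma_D)}$. Let $u_0 \in H^1(\Omega)$ be the solution to
$$\Delta u_0 = 0 \ \text{in } \Omega, \qquad u_0 = \tilde f \ \text{on } \Gamma \tag{2.4}$$
with the estimates
$$\|u_0\|_{H^1(\Omega)} \le C\|\tilde f\|_{H^{1/2}(\Gamma)}, \tag{2.5}$$
$$\|\partial_\nu u_0\|_{\overline{H}^{-1/2}(\Gamma_N)} \le C\|\tilde f\|_{H^{1/2}(\Gamma)}. \tag{2.6}$$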
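Let $\chi \in C^\infty(\overline{\Omega})$, $\operatorname{supp}\chi \cap \overline{\Sigma} = \emptyset$, $\chi = 1$ near $\Gamma$, and $u_1 := u - \chi u_0$. Then (DP$_2'$) becomes
$$\begin{cases} \Delta u_1 = G & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu u_1 = p & \text{on } \Sigma, \\ u_1 = 0 \ \text{on } \Gamma_D, \quad \partial_\nu u_1 = h & \text{on } \Gamma_N, \end{cases} \tag{DP$_2''$}$$
where $G := F - (2\nabla\chi \cdot \nabla u_0 + u_0 \Delta\chi) \in H^1(\Omega \setminus \overline{\Sigma})^*$ and $h := g - \chi\partial_\nu u_0 \in \overline{H}^{-1/2}(\Gamma_N)$. Solving (DP$_2''$) for $u_1 \in H^1(\Omega \setminus \overline{\Sigma})$ is equivalent to finding a solution $u_1 \in H^1(\Omega \setminus \overline{\Sigma})$ to the variational equation
$$\int_{\Omega \setminus \overline{\Sigma}} \nabla u_1 \cdot \nabla v\, dx = -\int_{\Omega \setminus \overline{\Sigma}} G v\, dx + \int_{\Gamma_N} h v\, d\sigma - \int_\Sigma p[v]\, d\sigma \tag{2.7}$$
for any $v \in W := \{v \in H^1(\Omega \setminus \overline{\Sigma});\ v = 0 \text{ on } \Gamma_D\}$. Here, note that $\overline{H}^{-1/2}(\Gamma_N) = \dot{H}^{1/2}(\overline{\Gamma_N})^*$. To see that (2.7) has a unique solution $u_1 \in W$ with the estimate
$$\|u_1\|_{H^1(\Omega \setminus \overline{\Sigma})} \le C\left( \|G\|_{H^1(\Omega \setminus \overline{\Sigma})^*} + \|h\|_{\dot{H}^{1/2}(\overline{\Gamma_N})^*} + \|p\|_{\overline{H}^{-1/2}(\Sigma)} \right), \tag{2.8}$$
it is enough to prove the coercivity
$$a(v, v) := \int_{\Omega \setminus \overline{\Sigma}} |\nabla v|^2\, dx \ge C\|v\|^2_{H^1(\Omega \setminus \overline{\Sigma})}, \qquad v \in W, \tag{2.9}$$
and apply the Lax–Milgram theorem. Since $v \in W$, $a(v, v) = 0$ implies $v = 0$ because $v = 0$ on $\Gamma_D \neq \emptyset$, and $\alpha(v) := a(v, v)^{1/2}$ defines a norm in $W$. Hence, we have to prove that $\alpha(v)$ and $\|v\|_{H^1(\Omega \setminus \overline{\Sigma})}$ are equivalent norms in $W$. Clearly, it suffices to prove
$$\alpha(v)^2 \ge C\|v\|^2_{L^2(\Omega \setminus \overline{\Sigma})}, \qquad v \in W. \tag{2.10}$$
Suppose this is not true. Then there exist $v_n \in W$ ($n \in \mathbb{N}$) such that
$$\|v_n\|_{L^2(\Omega \setminus \overline{\Sigma})} = 1, \qquad \alpha(v_n) \to 0, \qquad n \to \infty. \tag{2.11}$$
By (2.11), $\{\|v_n\|_{H^1(\Omega \setminus \overline{\Sigma})}\}_{n=1}^\infty$ is bounded. So there exist a subsequence $\{v_{n(k)}\}_{k=1}^\infty$ of $\{v_n\}_{n=1}^\infty$ and $v \in W$ such that $v_{n(k)} \to v$ ($k \to \infty$) weakly in $W$ and
$$\alpha(v) \le \liminf_{k \to \infty} \alpha(v_{n(k)}). \tag{2.12}$$
Hence by (2.11), $\alpha(v) = 0$, which gives $v = 0$. Since the embedding $H^1(\Omega \setminus \overline{\Sigma}) \hookrightarrow L^2(\Omega \setminus \overline{\Sigma})$ is compact, we can take the above subsequence to satisfy $v_{n(k)} \to v$ ($k \to \infty$) strongly in $L^2(\Omega \setminus \overline{\Sigma})$. This contradicts $\|v_{n(k)}\|_{L^2(\Omega \setminus \overline{\Sigma})} = 1$. □

Next we define the Neumann to Dirichlet map $\Lambda_\Sigma$, which represents our boundary measurements.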
DEFINITION 2.2.
(i) For the direct problem of type 1, we define $\Lambda_\Sigma^{(1)} : H_\#^{-1/2}(\Gamma) \to H^{1/2}(\Gamma)$ by
$$\Lambda_\Sigma^{(1)} g = u|_\Gamma, \qquad g \in H_\#^{-1/2}(\Gamma), \tag{2.13}$$
where $u \in H_\#^1(\Omega \setminus \overline{\Sigma})$ is the solution to (DP$_1$).
(ii) For the direct problem of type 2, we fix $f \in \overline{H}^{1/2}(\Gamma_D)$ and define $\Lambda_\Sigma^{(2)} : \overline{H}^{-1/2}(\Gamma_N) \to \overline{H}^{1/2}(\Gamma_N)$ by
$$\Lambda_\Sigma^{(2)} g = u|_{\Gamma_N}, \qquad g \in \overline{H}^{-1/2}(\Gamma_N), \tag{2.14}$$
where $u \in H^1(\Omega \setminus \overline{\Sigma})$ is the solution to (DP$_2$).
(iii) For both types, let $\Lambda_\Sigma^{(j)}$ ($j = 1, 2$) be denoted by $\Lambda_\emptyset^{(j)}$ ($j = 1, 2$) when $\Sigma = \emptyset$. In this case, $H_\#^1(\Omega \setminus \overline{\Sigma})$ has to be replaced by $H_\#^1(\Omega) := \{u \in H^1(\Omega);\ \int_\Gamma u\, d\sigma = 0\}$ for type 1.

The formulation of our inverse problems is as follows.

INVERSE PROBLEMS (IP)$_j$ ($j = 1, 2$). For each $j$ ($j = 1, 2$), reconstruct $\Sigma$ from $\Lambda_\Sigma^{(j)}$.

We claim that for each $j$ ($j = 1, 2$) there is a reconstruction formula for identifying $\Sigma$ from $\Lambda_\Sigma^{(j)}$. We have adapted the probe method [8, 9] for this purpose. For simplicity, we consider only the case $n = 3$, because only obvious changes are required for the case $n = 2$.

DEFINITION 2.3.
(i) (Needle $\gamma$). Let $\gamma := \{\gamma(t) \in \overline{\Omega};\ 0 \le t \le 1\}$ be a non-selfintersecting continuous curve joining $\gamma(0), \gamma(1) \in \Gamma$ such that $\gamma(t) \in \Omega$ ($0 < t < 1$). We call $\gamma$ a needle.
(ii) (First hitting time $T(\gamma, \Sigma)$). We define $T(\gamma, \Sigma)$ by
$$T(\gamma, \Sigma) := \sup\{t;\ 0 < t < 1,\ \gamma(s) \notin \overline{\Sigma},\ 0 \le s < t\}. \tag{2.15}$$
We call $T(\gamma, \Sigma)$ the first hitting time. If $\gamma \cap \overline{\Sigma} \neq \emptyset$ and we consider $t$ as the time, $T(\gamma, \Sigma)$ is the time at which the needle $\gamma$ first hits $\overline{\Sigma}$.

REMARK 2.4. It is obvious that if we know $T(\gamma, \Sigma)$ for all possible needles, we can reconstruct $\Sigma$. So the inverse problems (IP)$_j$ ($j = 1, 2$) reduce to finding procedures to determine $T(\gamma, \Sigma)$ for any needle $\gamma$ from $\Lambda_\Sigma^{(j)}$ ($j = 1, 2$), respectively.
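The first hitting time of Definition 2.3 can be illustrated numerically when the crack is known. The following is only a minimal sketch under stated assumptions: the crack is taken to be a flat disc in the plane $x_3 = 0$, the needle is sampled on a uniform grid, and the names `first_hitting_time` and `straight_needle` are hypothetical helpers, not part of the method of the paper (which determines $T(\gamma, \Sigma)$ from boundary data, not from the crack itself).

```python
import numpy as np


def first_hitting_time(needle, radius=0.3, n_samples=2001):
    """Approximate T(gamma, Sigma) for a flat disc crack
    Sigma = {x3 = 0, x1^2 + x2^2 < radius^2} (a hypothetical test geometry).

    The needle is sampled at uniformly spaced parameters; a hit is a
    crossing of the plane x3 = 0 that lands inside the disc.
    """
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = np.array([needle(t) for t in ts], dtype=float)
    h = pts[:, 2]  # signed height above the crack plane
    for k in range(n_samples - 1):
        if h[k] == 0.0 or h[k] * h[k + 1] < 0.0:
            # Locate the crossing by linear interpolation between samples.
            lam = 0.0 if h[k] == 0.0 else h[k] / (h[k] - h[k + 1])
            p = pts[k] + lam * (pts[k + 1] - pts[k])
            if p[0] ** 2 + p[1] ** 2 < radius ** 2:
                return float(ts[k] + lam * (ts[k + 1] - ts[k]))
    return 1.0  # the sampled needle never meets the crack


def straight_needle(x1):
    """Vertical segment from (x1, 0, -1) to (x1, 0, 1)."""
    return lambda t: np.array([x1, 0.0, -1.0 + 2.0 * t])


T_hit = first_hitting_time(straight_needle(0.1))   # crosses the disc
T_miss = first_hitting_time(straight_needle(0.5))  # passes outside the disc
```

Collecting the points $\gamma(T(\gamma, \Sigma))$ over many such needles traces out the crack, which is the content of Remark 2.4.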
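DEFINITION 2.5.
(i) (Indicator function $I_1(t, \gamma)$). We define the indicator function $I_1(t, \gamma)$ by
$$I_1(t, \gamma) := \lim_{j \to \infty} \big\langle g_j, \big(\Lambda_\Sigma^{(1)} - \Lambda_\emptyset^{(1)}\big) g_j \big\rangle_1 ; \tag{2.16}$$
here $\langle\,,\rangle_1$ is the pairing between $\overline{H}^{-1/2}(\Gamma)$ and $\dot{H}^{1/2}(\Gamma)$, and $g_j := \partial_\nu v_j|_\Gamma$, where $v_j \in H^1(\Omega)$ ($j \in \mathbb{N}$) is defined as follows. The functions $v_j \in H^1(\Omega)$ ($j \in \mathbb{N}$) satisfy
$$\begin{cases} \Delta v_j = 0 & \text{in } \Omega, \\ v_j \to G(\cdot, \gamma(t)), \ j \to \infty & \text{in } H^1_{\mathrm{loc}}(\Omega \setminus \gamma_t), \end{cases} \tag{2.17}$$
where $\gamma_t := \{\gamma(s);\ 0 < s \le t\}$ and
$$G(x, x^0) = (4\pi|x - x^0|)^{-1}. \tag{2.18}$$
(ii) We define the indicator function $I_2(t, \gamma)$ by
$$I_2(t, \gamma) := \lim_{j \to \infty} \big\langle g_j, \big(\Lambda_\Sigma^{(2)} - \Lambda_\emptyset^{(2)}\big) g_j \big\rangle_2 ; \tag{2.19}$$
here $\langle\,,\rangle_2$ is the pairing between $\overline{H}^{-1/2}(\Gamma_N)$ and $\dot{H}^{1/2}(\overline{\Gamma_N})$, and $g_j := \partial_\nu v_j|_{\Gamma_N}$, where $v_j = v' + v_j'' \in H^1(\Omega)$ ($j \in \mathbb{N}$) and $v'$, $v_j''$ are defined as follows. The function $v' \in H^1(\Omega)$ is the solution to
$$\Delta v' = F \ \text{in } \Omega, \qquad v' = f \ \text{on } \Gamma_D, \qquad \partial_\nu v' = 0 \ \text{on } \Gamma_N, \tag{2.20}$$
and $v_j'' \in H^1(\Omega)$ ($j \in \mathbb{N}$) satisfy
$$\begin{cases} \Delta v_j'' = 0 & \text{in } \Omega, \\ \operatorname{supp}(v_j''|_\Gamma) \subset \overline{\Gamma_0}, \\ v_j'' \to G(\cdot, \gamma(t)), \ j \to \infty & \text{in } H^1_{\mathrm{loc}}(\Omega \setminus \gamma_t), \end{cases} \tag{2.21}$$
where $\Gamma_0$ is a fixed open subset of $\Gamma_N$.

REMARK 2.6. Note that the well-posedness of (2.20) follows from Theorem 2.1 as its special case when $\Sigma = \emptyset$. The existence of $v_j$ ($j \in \mathbb{N}$) and $v_j''$ ($j \in \mathbb{N}$) is due to the Runge approximation theorem given in the Appendix.

DEFINITION 2.7. (Detecting times $t_j(\gamma, \Sigma)$ ($j = 1, 2$)). For each $j$ ($j = 1, 2$), we define the detecting time $t_j(\gamma, \Sigma)$ by
$$t_j(\gamma, \Sigma) := \sup\Big\{0 < t < 1;\ \sup_{0 < s \le t} |I_j(s, \gamma)| < \infty\Big\}. \tag{2.22}$$
We claim the following, which will be proven in the next section.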
THEOREM 2.8. For each $j$ ($j = 1, 2$),
$$T(\gamma, \Sigma) = t_j(\gamma, \Sigma) \quad \text{if } \gamma \cap \overline{\Sigma} \neq \emptyset. \tag{2.23}$$

REMARK 2.9.
(i) As already remarked, Theorem 2.8 implies our reconstruction formulae.
(ii) We will prove this theorem by using the probe method [8, 9]. We have adapted this method to our crack identification problem.
(iii) The reconstruction formulae are summarized at the end of the next section.
(iv) It is straightforward to see that our arguments provide the same reconstruction formulae for identifying cracks inside a homogeneous conductive medium. This is the case when we replace the Laplacian by $\sum_{i,j=1}^n a_{ij}\partial_i\partial_j$ with $n = 2$ or $3$ and a positive symmetric constant matrix $(a_{ij})$.
(v) The numerical realization of our reconstruction will be discussed in forthcoming papers.
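In a computation, the blow-up in Definition 2.7 can only be observed on finitely many sampled values of the indicator function. The sketch below is an illustration under explicit assumptions, not the authors' numerical realization: the test "$< \infty$" is replaced by a user-chosen threshold, the function `detecting_time` is a hypothetical helper, and the sampled indicator is a synthetic model that blows up like $-1/(T - t)$.

```python
import numpy as np


def detecting_time(ts, indicator_values, threshold):
    """Largest sampled t such that |I(s)| stays below `threshold`
    for every sampled s <= t (a discrete surrogate for (2.22))."""
    ts = np.asarray(ts, dtype=float)
    vals = np.abs(np.asarray(indicator_values, dtype=float))
    unbounded = ~np.isfinite(vals) | (vals >= threshold)
    if not unbounded.any():
        return float(ts[-1])
    k = int(np.argmax(unbounded))  # first sample that violates the bound
    return float(ts[k - 1]) if k > 0 else 0.0


# Synthetic indicator (assumption): blows up as t approaches T_true.
T_true = 0.6
ts = np.linspace(0.01, 0.99, 99)
gap = T_true - ts
synthetic_I = np.full_like(ts, np.inf)
synthetic_I[gap > 0] = -1.0 / gap[gap > 0]

t_detect = detecting_time(ts, synthetic_I, threshold=1.0e3)
```

With a finer grid and a larger threshold the detected value moves closer to the true hitting time, which is the discrete counterpart of (2.23) for this synthetic model.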
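3. Proof of Theorem 2.8

Since the proof for type 1 is essentially the same as that for type 2, we provide a proof only for type 2 and comment on the modifications necessary for type 1. The rest of the proof concerns (2.23). Hereafter in the proof, we simply write $\Lambda_\Sigma = \Lambda_\Sigma^{(2)}$, $I(t, \gamma) = I_2(t, \gamma)$, $t(\gamma, \Sigma) = t_2(\gamma, \Sigma)$. Let $u_j \in H^1(\Omega \setminus \overline{\Sigma})$ be the solution to
$$\begin{cases} \Delta u_j = F & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu u_j = 0 & \text{on } \Sigma, \\ u_j = f \ \text{on } \Gamma_D, \quad \partial_\nu u_j = g_j & \text{on } \Gamma_N, \end{cases} \tag{3.1}$$
and $w_j := u_j - v_j \in H^1(\Omega \setminus \overline{\Sigma})$.

LEMMA 3.1 (Reflected solution $w$). If $\gamma_t \cap \overline{\Sigma} = \emptyset$, then $w_j \to w$ ($j \to \infty$) in $H^1(\Omega \setminus \overline{\Sigma})$ and $w \in H^1(\Omega \setminus \overline{\Sigma})$ is the solution to
$$\begin{cases} \Delta w = 0 & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu w = -\partial_\nu\big(v' + G(\cdot, \gamma(t))\big) & \text{on } \Sigma, \\ w = 0 \ \text{on } \Gamma_D, \quad \partial_\nu w = 0 & \text{on } \Gamma_N. \end{cases} \tag{3.2}$$

Proof. Let $\gamma_t \cap \overline{\Sigma} = \emptyset$. By (2.20), (2.21) and (3.1), $w_j \in H^1(\Omega \setminus \overline{\Sigma})$ satisfies
$$\begin{cases} \Delta w_j = 0 & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu w_j = -\partial_\nu v_j & \text{on } \Sigma, \\ w_j = 0 \ \text{on } \Gamma_D, \quad \partial_\nu w_j = 0 & \text{on } \Gamma_N. \end{cases} \tag{3.3}$$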
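By Theorem 2.1,
$$\|w_j - w_k\|_{H^1(\Omega \setminus \overline{\Sigma})} \le C\|\partial_\nu(v_j - v_k)\|_{\overline{H}^{-1/2}(\Sigma)} = C\|\partial_\nu(v_j'' - v_k'')\|_{\overline{H}^{-1/2}(\Sigma)}. \tag{3.4}$$
Here, if we take a bounded domain $D$ with $C^2$ boundary such that $\overline{\Sigma} \subset D$, then $v_j'' - v_k'' \in H^1(D)$ and $\Delta(v_j'' - v_k'') = 0$ in $D$ imply
$$\|\partial_\nu(v_j'' - v_k'')\|_{\overline{H}^{-1/2}(\Sigma)} \le C\|v_j'' - v_k''\|_{H^1(D)} \tag{3.5}$$
by the continuity of the trace (see [6, Lemma 2.9]). Hence, by (2.21), we have $\|w_j - w_k\|_{H^1(\Omega \setminus \overline{\Sigma})} \to 0$ ($j, k \to \infty$). □

LEMMA 3.2. If $\gamma_t \cap \overline{\Sigma} = \emptyset$, then we have
$$I(t, \gamma) = -\int_{\Omega \setminus \overline{\Sigma}} |\nabla w|^2\, dx + \int_{\Omega \setminus \overline{\Sigma}} F w\, dx + \int_{\Gamma_D} f\, \partial_\nu w\, d\sigma. \tag{3.6}$$

Proof. By (2.19) and Lemma 3.1, it is enough to prove
$$\langle g_j, (\Lambda_\Sigma - \Lambda_\emptyset) g_j \rangle = -\int_{\Omega \setminus \overline{\Sigma}} |\nabla w_j|^2\, dx + \int_{\Omega \setminus \overline{\Sigma}} F w_j\, dx + \int_{\Gamma_D} f\, \partial_\nu w_j\, d\sigma. \tag{3.7}$$
By the definition of $\Lambda_\Sigma$, $\Lambda_\emptyset$ and the Green formula (see [6, (2.5), (2.7)]),
$$\langle g_j, (\Lambda_\Sigma - \Lambda_\emptyset) g_j \rangle = \int_{\Sigma_\pm} u_j\, \partial_\nu v_j\, d\sigma + \int_{\Gamma_D} f\, \partial_\nu w_j\, d\sigma + \int_{\Omega \setminus \overline{\Sigma}} F w_j\, dx, \tag{3.8}$$
where $\int_{\Sigma_\pm} g\, d\sigma := \int_\Sigma \gamma_+ g\, d\sigma - \int_\Sigma \gamma_- g\, d\sigma$. Here, note that
$$\int_{\Sigma_\pm} u_j\, \partial_\nu v_j\, d\sigma = \int_\Sigma \big(\gamma_+ u_j [\partial_\nu v_j] + [u_j]\gamma_- \partial_\nu v_j\big)\, d\sigma = \int_\Sigma [u_j]\gamma_- \partial_\nu v_j\, d\sigma$$
and
$$\int_{\Sigma_\pm} u_j\, \partial_\nu v_j\, d\sigma = \int_{\Sigma_\pm} w_j\, \partial_\nu w_j\, d\sigma = -\int_{\Omega \setminus \overline{\Sigma}} |\nabla w_j|^2\, dx. \tag{3.9}$$
Hence, from (3.8) and (3.9), we have (3.7). □

For the behavior of $w = w(\cdot, \gamma(t))$ as $t \uparrow T(\gamma, \Sigma)$ if $\gamma(T(\gamma, \Sigma)) \in \Sigma$, we have the following lemma, which will be proven in the Appendix.

LEMMA 3.3. Assume $\gamma(T(\gamma, \Sigma)) \in \Sigma$.
(i) $\int_{\Omega \setminus \overline{\Sigma}} F w\, dx$ and $\int_{\Gamma_D} f\, \partial_\nu w\, d\sigma$ are bounded as $t \uparrow T(\gamma, \Sigma)$.
(ii) $\int_{\Omega \setminus \overline{\Sigma}} |\nabla w|^2\, dx \to \infty$ ($t \uparrow T(\gamma, \Sigma)$).

Now we prove (2.23). Since (2.23) is obvious when $\gamma \cap \overline{\Sigma} = \emptyset$, it suffices to prove
$$(0, T(\gamma, \Sigma)) = (0, t(\gamma, \Sigma)) \quad \text{when } \gamma \cap \overline{\Sigma} \neq \emptyset. \tag{3.10}$$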
Clearly, we have
$$(0, T(\gamma, \Sigma)) \subset (0, t(\gamma, \Sigma)). \tag{3.11}$$
Suppose there exists $t \in (0, t(\gamma, \Sigma))$ such that $t \ge T(\gamma, \Sigma)$. By the definition of $t(\gamma, \Sigma)$,
$$|I(s, \gamma)| \le C_0, \qquad 0 < s \le T(\gamma, \Sigma), \tag{3.12}$$
with $C_0 = \sup_{0 < s \le t} |I(s, \gamma)|$. But, by Lemma 3.3, this is impossible. Hence, $(0, t(\gamma, \Sigma)) \subset (0, T(\gamma, \Sigma))$ and we have (3.10).

Now we comment on the modifications necessary for problems of type 1. We have to change the definition of $w_j$ given just before Lemma 3.1 to $w_j := u_j - \big(v_j - \int_\Gamma v_j\, d\sigma\big) \in H_\#^1(\Omega \setminus \overline{\Sigma})$. Also, in (3.1) and Lemmas 3.1–3.3, we have to delete whatever we had for $\Gamma_D$ and replace $H^1(\Omega \setminus \overline{\Sigma})$ by $H_\#^1(\Omega \setminus \overline{\Sigma})$.

Next we summarize our reconstruction formulae for identifying $\Sigma$ from $\Lambda_\Sigma^{(1)}$ or $\Lambda_\Sigma^{(2)}$. The steps given below for the reconstruction pertain to the Neumann to Dirichlet map $\Lambda_\Sigma^{(2)}$. For $\Lambda_\Sigma^{(1)}$, Steps 2 and 3 have to be modified beyond just changing $I_2(t, \gamma)$ and $t_2(\gamma, \Sigma)$ to $I_1(t, \gamma)$ and $t_1(\gamma, \Sigma)$, respectively. The modified versions of Steps 2 and 3 are given as Steps 2$'$ and 3$'$, respectively.

Step 1. Consider a needle $\gamma = \{\gamma(t);\ 0 \le t \le 1\}$ and the domain $\Omega \setminus \gamma_t$ with $\gamma_t := \{\gamma(s);\ 0 < s \le t\}$.
Step 2. Take harmonic functions $v_j'' \in H^1(\Omega)$ ($j \in \mathbb{N}$) which approximate $G(x, \gamma(t)) = (4\pi|x - \gamma(t)|)^{-1}$. (See (2.21) for details.)
Step 3. Solve (2.20) for $v' \in H^1(\Omega)$ and compute $g_j = \partial_\nu(v' + v_j'')|_{\Gamma_N}$.
Step 4. Compute the indicator function $I_2(t, \gamma) = \lim_{j \to \infty} \langle g_j, (\Lambda_\Sigma - \Lambda_\emptyset) g_j \rangle$ for small $t$.
Step 5. Increase $t$ and search for the $t$ where $|I(t, \gamma)|$ blows up. Denote this $t$ by $t_2(\gamma, \Sigma)$. By Theorem 2.8, this gives the first hitting time $T(\gamma, \Sigma)$.
Step 6. Take many $\gamma$'s and repeat all the previous steps. Plot all the points $\gamma(T(\gamma, \Sigma))$ for these $\gamma$'s. Then these points generate the crack $\Sigma$.

Steps 2$'$ and 3$'$ are as follows:

Step 2$'$. Take harmonic functions $v_j \in H^1(\Omega)$ ($j \in \mathbb{N}$) which approximate $G(x, \gamma(t)) = (4\pi|x - \gamma(t)|)^{-1}$. (See (2.17) for details.)
Step 3$'$. Compute $g_j = \partial_\nu v_j|_\Gamma$.

4. Crack in an Elastic Medium

In this section, we give the corresponding result without proof for a homogeneous elastic medium with elasticity tensor $C = (C_{ijk\ell})$ satisfying the full symmetries:
$$C_{ijk\ell} = C_{jik\ell} = C_{k\ell ij}, \qquad 1 \le i, j, k, \ell \le 3, \tag{4.1}$$
and the strong convexity condition:
$$\sum_{i,j,k,\ell=1}^{3} C_{ijk\ell}\, \varepsilon_{ij}\varepsilon_{k\ell} \ge \delta \sum_{i,j=1}^{3} \varepsilon_{ij}^2 \tag{4.2}$$
for any symmetric matrix $(\varepsilon_{ij})$ and some $\delta > 0$. Then Theorem 2.8 also holds for this elastic medium with a crack $\Sigma$ and we have the same reconstruction formula for identifying $\Sigma$ from the Neumann to Dirichlet map. While the proofs are different from those for conductive media, the main idea of the proof remains the same as before. Hence, we refrain from giving the details of the whole argument, but for preciseness we present the definitions of the Neumann to Dirichlet maps $\Lambda_\Sigma^{(j)}$ ($j = 1, 2$) and the indicator functions $I_j(t, \gamma)$ ($j = 1, 2$). For simplicity, we continue using the same notation as before and we only consider the case $\Gamma_D \neq \emptyset$, with the displacement fixed on $\Gamma_D$. We define $\Lambda_\Sigma^{(2)}$ by
$$\Lambda_\Sigma^{(2)} g := u|_{\Gamma_N}, \tag{4.3}$$
where $u$ is the solution to the mixed type boundary value problem
$$\begin{cases} Lu = F & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_L u = 0 & \text{on } \Sigma, \\ u = f \ \text{on } \Gamma_D, \quad \partial_L u = g & \text{on } \Gamma_N, \end{cases} \tag{4.4}$$
with $3 \times 3$ matrices of operators $L$ and $\partial_L$ whose $(i, k)$ components are
$$\sum_{j,\ell=1}^{3} C_{ijk\ell}\, \partial_j\partial_\ell \tag{4.5}$$
and
$$\sum_{j,\ell=1}^{3} C_{ijk\ell}\, \nu_j\partial_\ell, \tag{4.6}$$
respectively. The indicator function $I_2(t, \gamma)$ is defined in the same way as (2.19) with the modification that the pairing $\langle\,,\rangle_2$ is between $(\overline{H}^{-1/2}(\Gamma_N))^3$ and $(\dot{H}^{1/2}(\overline{\Gamma_N}))^3$. The functions $g_j$ ($j \in \mathbb{N}$) are defined as before except that $\Delta$ and $\partial_\nu$ in (2.20) are replaced by $L$ and $\partial_L$, respectively. Also, we have to change the definition of $G(x, x^0)$ in (2.21) and (2.18) to
$$G(x, x^0) = E(x, x^0)\, b, \tag{4.7}$$
where $E(x, x^0)$ is the fundamental solution of $L$ and $b$ is any fixed nonzero constant vector.

Acknowledgements

The authors would like to thank Prof. Chi-Sing Man, who kindly corrected our English. Also, the second author would like to thank Prof. Kohji Ohtsuka, who taught him a lot about fracture mechanics.
Appendix

In this appendix we state Runge's theorem, which we use in the construction of $v_j$ ($j \in \mathbb{N}$), and we prove Lemma 3.3.

THEOREM A.1 (Runge's theorem). Let $U$ be an open subset of $\Omega$ such that $\overline{U} \subset \Omega$ and $\Omega \setminus \overline{U}$ is connected. Define the two spaces $X$, $Y$ of functions by
$$X := \{u|_U ;\ u \in H^1(\widetilde{U}),\ \Delta u = 0 \text{ in } \widetilde{U}\}, \tag{A.1}$$
$$Y := \{v|_U ;\ v \in H^1(\Omega),\ \Delta v = 0 \text{ in } \Omega,\ \operatorname{supp}(v|_\Gamma) \subset \overline{\Gamma_0}\}, \tag{A.2}$$
where $\widetilde{U}$ is an open subset of $\Omega$ depending on $u$ such that $\overline{U} \subset \widetilde{U} \subset \overline{\widetilde{U}} \subset \Omega$ and $\Gamma_0$ is a fixed open subset of $\Gamma_N$. Then, $Y$ is dense in $X$ with respect to the $H^1(U)$ norm.

Proof. The proof is given in [9]. □

Proof of Lemma 3.3. Let $x^0 = \gamma(t) \in \Omega \setminus \overline{\Sigma}$ and $a = \gamma(T(\gamma, \Sigma))$. Suppose $x^0 \sim a$ (i.e., $|x^0 - a| \ll 1$). Let $y = (y_1, y_2, y_3) = (y_1(x, x^0), y_2(x, x^0), y_3(x, x^0))$ be the boundary normal coordinates near the point $a$ such that
$$y(a) = 0, \qquad \frac{\partial y(x, x^0)}{\partial x}\Big|_{x = x^0} = I, \qquad \Omega_- = \{y_1 < 0\} \ \text{near } a, \tag{A.3}$$
where $I$ is the identity matrix. Let
$$A(x) := |J(x)|^{-1} J(x)\big({}^t J(x)\big), \qquad x(y(x, x^0), x^0) = x, \tag{A.4}$$
where $J(x) = \partial y(x, x^0)/\partial x$. Also, let
$$\tilde{A}(y) = A(x(y, x^0)), \qquad \tilde{u}(y) = u(x(y, x^0)), \qquad y^0 = y(x^0, x^0). \tag{A.5}$$
Then, it is easy to see
(i) $\tilde{A}(y) \in C^1$ near $y = 0$,
(ii) $|J|^{-1}\Delta = \nabla \cdot \tilde{A}\nabla$ near $y = 0$,
(iii) $\delta(x(y; x^0) - x^0) = \delta(y - y^0)$,
(iv) $\partial_\nu = \partial_{y_1}$.

To simplify our expressions, we introduce the following definition.

DEFINITION A.2. Let $X$ be a function space defined on an open subset of $\mathbb{R}^3$ and let $\{g(\cdot, x^0)\}$, $\{r(\cdot, x^0)\}$ be families of distributions defined on $X$, which depend on $x^0 \sim a$. Then, we write $g(\cdot, x^0) \sim r(\cdot, x^0)$ in $X$ if and only if $\{g(\cdot, x^0) - r(\cdot, x^0);\ x^0 \sim a\}$ is bounded in $X$.
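Let $V \subset \mathbb{R}^3$ be a small open neighborhood of $y = 0$ such that $V_\pm := V \cap \mathbb{R}^3_\pm$, with $\mathbb{R}^3_\pm := \{\pm y_1 > 0\}$, has $C^2$ boundary, and let $\beta_\pm$, $\beta_0$ be open subsets of the boundary $\partial V_\pm$ of $V_\pm$ such that
$$\partial V_\pm = \overline{\beta_\pm} \cup \overline{\beta_0}, \qquad \beta_\pm \cap \beta_0 = \emptyset, \qquad \beta_\pm \subset \mathbb{R}^3_\pm, \qquad \beta_0 \subset \{y_1 = 0\}. \tag{A.6}$$
By a direct computation, we have
$$\tilde{G}(y, y^0) := G(x(y, y^0), x(y^0, y^0)) \sim G(y, y^0) \quad \text{in } H^1(V_\pm), \tag{A.7}$$
where $y^0 \sim 0$ plays the role of $x^0$ in Definition A.2. Define $\tilde{w}^0_\pm(y, y^0)$ by
$$\tilde{w}^0_\pm(y, y^0) = \pm G\big(y, (\mp y^0_1, y^{0\prime})\big) \quad \text{in } \mathbb{R}^3_\pm \ \text{with } y^0_1 > 0, \tag{A.8}$$
where $y^0 = (y^0_1, y^0_2, y^0_3)$, $y^{0\prime} = (y^0_2, y^0_3)$. Then, by a direct computation, we have
$$\begin{cases} \Delta \tilde{w}^0_\pm = 0 & \text{in } \mathbb{R}^3_\pm, \\ \partial_{y_1} \tilde{w}^0_\pm = -\partial_{y_1} G(y, y^0) & \text{on } y_1 = 0. \end{cases} \tag{A.9}$$
Consider the solution $\tilde{Z}_\pm \in H^1(V_\pm)$ to
$$\begin{cases} \nabla \cdot \tilde{A}\nabla \tilde{Z}_\pm = -\nabla \cdot \big(\tilde{A}(y) - \tilde{A}(y^0)\big)\nabla\big(\tilde{w}^0_\pm + G\big) & \text{in } V_\pm, \\ \partial_{y_1} \tilde{Z}_\pm = 0 \ \text{on } \beta_0, \quad \tilde{Z}_\pm = 0 & \text{on } \beta_\pm. \end{cases} \tag{A.10}$$
Then, since $\big(\tilde{A}(y) - \tilde{A}(y^0)\big)\nabla\big(\tilde{w}^0_\pm + G\big) \sim 0$ in $L^2(V_\pm)$, we have
$$\tilde{Z}_\pm \sim 0 \quad \text{in } H^1(V_\pm). \tag{A.11}$$
Define $\tilde{w}_\pm$ by
$$\tilde{w}_\pm = \tilde{Z}_\pm + \tilde{w}^0_\pm - (\tilde{G} - G). \tag{A.12}$$
Then, by a direct computation, we have
$$\nabla \cdot \tilde{A}\nabla \tilde{w}_\pm = 0 \ \text{in } V_\pm, \qquad \partial_{y_1} \tilde{w}_\pm = -\partial_{y_1} \tilde{G} \ \text{on } y_1 = 0. \tag{A.13}$$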
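The reflection construction behind (A.8) and (A.9) is the classical method of images for a flat interface, and it can be checked numerically in the flat model case. The following is a minimal sketch under stated assumptions: only the half-space $\mathbb{R}^3_+$ with the Laplacian is treated (so the coordinate change and the matrix $\tilde{A}$ play no role), and the helper names `reflect`, `w0_plus` and `laplacian_fd` are hypothetical.

```python
import numpy as np


def G(y, y0):
    """Fundamental solution (2.18): G(y, y0) = 1 / (4 pi |y - y0|)."""
    d = np.asarray(y, dtype=float) - np.asarray(y0, dtype=float)
    return 1.0 / (4.0 * np.pi * np.linalg.norm(d))


def dG_dy1(y, y0):
    """Exact derivative of G(., y0) with respect to y1."""
    d = np.asarray(y, dtype=float) - np.asarray(y0, dtype=float)
    return -d[0] / (4.0 * np.pi * np.linalg.norm(d) ** 3)


def reflect(y0):
    """Mirror image of the source point across the plane y1 = 0."""
    return np.array([-y0[0], y0[1], y0[2]])


def w0_plus(y, y0):
    """Image-source function on the side y1 > 0, as in (A.8)."""
    return G(y, reflect(y0))


def laplacian_fd(func, y, h=1e-3):
    """Second-order central finite-difference Laplacian."""
    y = np.asarray(y, dtype=float)
    total = -6.0 * func(y)
    for i in range(3):
        e = np.zeros(3)
        e[i] = h
        total += func(y + e) + func(y - e)
    return total / h ** 2


y0 = np.array([0.2, 0.1, -0.3])        # source point with y0_1 > 0
on_plane = np.array([0.0, 0.4, 0.25])  # a point of the plane y1 = 0

# Second line of (A.9): d/dy1 (w0_plus + G) should vanish on y1 = 0.
neumann_residual = dG_dy1(on_plane, reflect(y0)) + dG_dy1(on_plane, y0)
# First line of (A.9): w0_plus should be harmonic in y1 > 0.
laplacian_value = laplacian_fd(lambda y: w0_plus(y, y0), [0.5, 0.4, 0.2])
```

Both residuals are zero up to rounding and discretization error, which is the flat-interface content of (A.9); the curved case is handled in the text by the correction $\tilde{Z}_\pm$.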
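Now, let $\zeta \in C_0^\infty(\Omega)$ satisfy $\zeta = 1$ in a small open neighborhood of $a$ and $\operatorname{supp}\zeta \cap S \subset \Sigma$, and define $w_0 \in H^1(\Omega \setminus \overline{\Sigma})$ by
$$w_0 = \zeta w_0', \qquad w_0' = \begin{cases} \tilde{w}_+(y(x, x^0), y^0) & \text{in } \Omega_+ \cap \operatorname{supp}\zeta, \\ \tilde{w}_-(y(x, x^0), y^0) & \text{in } \Omega_- \cap \operatorname{supp}\zeta. \end{cases} \tag{A.14}$$
Then, by (3.2) and (A.13), $w_1 := w - w_0$ satisfies
$$\begin{cases} \Delta w_1 = -2\nabla\zeta \cdot \nabla w_0' - (\Delta\zeta) w_0' & \text{in } \Omega \setminus \overline{\Sigma}, \\ \partial_\nu w_1 = -\partial_\nu v' + (\zeta - 1)\partial_\nu G - (\partial_\nu\zeta) w_0' & \text{on } \Sigma, \\ w_1 = 0 \ \text{on } \Gamma_D, \quad \partial_\nu w_1 = 0 & \text{on } \Gamma_N. \end{cases} \tag{A.15}$$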
Note that (A.15) is equivalent to the variational equation
$$\int_{\Omega \setminus \overline{\Sigma}} \nabla w_1 \cdot \nabla\varphi\, dx = \int_{\Omega \setminus \overline{\Sigma}} \{-2 w_0'(\nabla\zeta \cdot \nabla\varphi) + (\Delta\zeta) w_0' \varphi\}\, dx + \int_{\Sigma_\pm} \{-\partial_\nu v' + (\zeta - 1)\partial_\nu G - (\partial_\nu\zeta) w_0'\}\varphi\, d\sigma \tag{A.16}$$
for any $\varphi \in W$. Also, we have
$$\Big|\int_{\Sigma_\pm} (\partial_\nu\zeta) w_0' \varphi\, d\sigma\Big| = \Big|\int_{S_\pm} (\partial_\nu\zeta) w_0' \varphi\, d\sigma\Big| \le C\|\eta w_0'\|_{L^2(S)}\|\varphi\|_{L^2(S)} \le C\|\eta w_0'\|_{H^1(\Omega \setminus \overline{\Sigma})}\|\varphi\|_{H^1(\Omega \setminus \overline{\Sigma})}, \tag{A.17}$$
where $\eta \in C_0^\infty(\Omega)$, $\eta = 1$ on $\operatorname{supp}(\nabla\zeta)$ and $a \notin \operatorname{supp}\eta$. Hence the behavior of $w$ is controlled by that of $w_0$. Therefore, (i) and (ii) of Lemma 3.3 follow immediately from (A.7), (A.8) and (A.17). This completes the proof. □

References

1. S. Andrieux, A. Ben Abda and H.D. Bui, Reciprocity principle and crack identification. Inverse Problems 15 (1999) 59–65.
2. A. Ben Abda, H. Ben Ameur and M. Jaoua, Identification of 2D cracks by elastic boundary measurements. Inverse Problems 15 (1999) 67–77.
3. A. Ben Abda and H.D. Bui, Reciprocity principle and crack identification in transient thermal problems. J. Inverse Ill-Posed Problems 9 (2001) 1–6.
4. M. Brühl, M. Hanke and M. Pidcock, Crack detection using electrostatic measurements. Math. Model. Numer. Anal. 35 (2001) 595–605.
5. K. Bryan and M. Vogelius, A computational algorithm to determine crack locations from electrostatic boundary measurements. The case of multiple cracks. Internat. J. Engrg. Sci. 32 (1994) 579–603.
6. M. Eller, Identification of cracks in three-dimensional bodies by many boundary measurements. Inverse Problems 12 (1996) 395–408.
7. L. Hörmander, Linear Partial Differential Operators, 3rd revised edn. Springer, Berlin (1969).
8. M. Ikehata, Reconstruction of the shape of the inclusion by boundary measurements. Comm. Partial Differential Equations 23 (1998) 1459–1474.
9. M. Ikehata, Reconstruction of inclusion from boundary measurements. J. Inverse Ill-Posed Problems 10 (2002) 37–65.
10. V. Isakov, On uniqueness of recovery of a discontinuous conductivity coefficient. Comm. Pure Appl. Math. 14 (1988) 863–877.
11. A. Kirsch and S. Ritter, The linear sampling method for inverse scattering from an open arc. Inverse Problems 16 (2000) 89–105.
12. R. Kress, Inverse elastic scattering from a crack. Inverse Problems 12 (1996) 667–684.
An Approximate Treatment of Blunt Body Impact

R.J. KNOPS¹ and PIERO VILLAGGIO²
¹ Department of Mathematics, Heriot-Watt University, Edinburgh EH14 4AS, Scotland, UK
² Dipartimento di Ingegneria Strutturale, Università di Pisa, 56126 Pisa, Italy
Received 29 October 2002; in revised form 21 July 2003

Abstract. This paper considers a blunt body, modelled by an elastic-perfectly plastic one-dimensional bar, impacting normally against a rigid fixed target as indicated in Figure 1. When the impact velocity is small, the bar behaves elastically during the ensuing motion and rebounds with an equal and opposite velocity to that on impact. But for large impact velocity, part of the bar adjacent to the point of contact experiences permanent plastic deformation, reducing the rebound velocity. The illuminating theory developed by Taylor [10] analyzed the impact of a rigid-plastic bar. We extend this treatment by employing a semi-inverse procedure combined with energy conservation to additionally take into account elastic deformation.

Mathematics Subject Classifications (2000): 74M20, 74C99, 74H99.

Key words: impact, elastic-perfectly plastic blunt body.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 539–554. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

A slender bar of uniform cross-section $A$, uniform mass density $\rho$, and undeformed length $\ell$, is projected normally in the horizontal direction towards a rigid smooth vertical plane as shown in Figure 1. With respect to a Cartesian set of coordinate axes, the $x$-axis is directed along the bar and the origin is located at the point of initial impact. During the period immediately prior to striking the plane, the motion of the bar is assumed to be rigid and to possess a longitudinal velocity $v_1$. The bar behaves as a linear elastic-perfectly plastic material. Consequently, the deformation is linear elastic with Young's modulus $E$ until a limiting value $\sigma_p$ of the stress is attained, after which the material starts to yield at constant stress $\sigma_p$. Unloading, whenever it occurs, is always linear elastic according to classical plasticity theory ([5, p. 47]; see also [3]).

Figure 1.

When the impact speed $v_1$ is sufficiently small or the yield stress $\sigma_p$ is sufficiently large, the deformation after impact remains elastic throughout the bar. The longitudinal motion of the bar satisfies the standard linear wave equation and the displacement is described by d'Alembert's solution (see, e.g., [4, art. 281]). A different analysis, however, is required when either condition is not satisfied and parts of the bar become plastic. The wave equation ceases to be valid in these parts, and, moreover, the precise extent of the plastic region is unknown beforehand. Such complications partly explain why the problem has failed to receive a satisfactory treatment since its inception 150 years ago.

A notable contribution by Taylor [10] attempted to circumvent the difficulty by adopting a rigid-plastic stress–strain law. He proposed that immediately after impact the elastic and plastic regions are separated by a "shock front" that propagates into the bar. Behind the front the bar is at rest in plastic deformation. Taylor's approach was probably motivated by, and certainly elucidates, the experimentally observed characteristic shape of projectiles after they have been fired against armour or similar rigid targets. The shape has been confirmed by Whiffin [11] in experiments successively repeated since the original investigations by Boltzmann [1]. Nevertheless, Taylor's method is perhaps vulnerable to criticism for two reasons. The first is that a rigid-plastic stress–strain law excludes
any elastic deformation, the second is the assumption that the bar does not rebound but remains in contact with the rigid plane obstacle and progressively comes to rest as the plastic shock front propagates through the bar. Expressed otherwise, these criticisms query the assumption that the kinetic energy immediately prior to impact is totally converted into the work of plastic deformation. There is an obvious theoretical and technical interest in the nonelastic behavior of a bar that rebounds after impact and for which the initial kinetic energy is only partially recoverable from the elastic deformation. Equally important is the determination of the corresponding coefficient of restitution, which Stronge [8, art. 5.1] defines to be the ratio of the kinetic energies at the end of the rebound period to that on impact. He also remarks that energy can be dissipated either by elastic vibrations generated on impact, by viscosity, or by plastic deformation. The greatest energy is absorbed by the plastic deformation especially in ductile materials that include most metals. Intermediate between the extremes of a perfectly elastic body and Taylor’s rigid-plastic treatment, is another possibility. The body, immediately after impact, experiences everywhere an elastic deformation into which subsequently propagates a region of plastic deformation commencing from the end in contact with the rigid target. Provided the plastic deformation does not extend to the whole bar, the elastic strain energy is recoverable and is fully available to contribute to the energy of rebound. We here begin to explore this proposed intermediate mode of deformation. Certain simplifying assumptions are introduced, the most notable of which is the classic equivalent reduction of a continuous system to one possessing a single degree of freedom. The reduction, first proposed by Cox [2], was further considered by Saint-Venant and Flamant [7] and improved by Pöschl [6]. 
When applied to a partly plastically deformed bar, it enables energy conservation to be applied to determine the maximum compression as well as the coefficient of restitution. Furthermore, the periods of compression and recovery, during which the end of the bar remains in contact with the rigid obstacle, may also be calculated. These two phases of the deformation are of different duration in an elastic-plastic bar and neither is equal to the respective periods when the deformation is wholly elastic.

Section 2 sketches the reduction of the basic theory to one dimension and, by means of a semi-inverse procedure with a general function, examines the motion when the deformation is completely elastic. A condition is obtained for plastic deformation to occur. Section 3 examines a propagating plastic region and derives expressions for the respective strain and kinetic energies, again using a semi-inverse procedure. Energy conservation leads to the determination of the maximum compression of the bar and the time taken to achieve this, as well as the coefficient of restitution. A final section introduces a particular set of functions for the semi-inverse procedure in order to illustrate the results derived in Section 3, and to compare them with those obtained by Taylor and Whiffin.
2. The Reduced System: Perfectly Elastic Behavior

Let the time be denoted by $t$ and consider the bar of Section 1 at the instant $t = 0$ when one end $B$ is first in contact with the rigid plane obstacle. Immediately after impact, it is assumed that there is a brief "conversion" period $[0, t_0]$, where $t_0$ does not exceed $\ell/c_0$ and $c_0$ is the speed of propagation of longitudinal elastic waves. During this period, a disturbance penetrates into the bar causing the velocity to change from the rigid translational velocity $v_1$ of the bar immediately prior to impact.

Let $F(x_1, x_2, t)$ be the reactive force at the end $B$; $u(x, t)$ the longitudinal displacement; $e = \partial u/\partial x$ the corresponding strain; and $\sigma$ the axial stress. Differentiation with respect to the time variable is indicated by a superposed dot. In terms of these quantities and in an obvious notation, the principle of linear momentum gives
$$\int_0^{t_0}\!\!\int_A F(x_1, x_2, \eta)\, dS\, d\eta = \left[\int_0^\ell \rho A\, \dot{u}(x, t)\, dx\right]_0^{t_0}, \tag{2.1}$$
while the rate of work equation may be expressed as
$$\left[\frac{1}{2}\int_0^\ell \rho A\, \dot{u}^2(x, t)\, dx\right]_0^{t_0} + \int_0^{t_0}\!\!\int_0^\ell \sigma(x, \eta)\, \dot{e}(x, \eta)\, dx\, d\eta = 0, \tag{2.2}$$
in which we have taken $u(x, 0) = 0$ and $\sigma(\ell, t) = 0$. The pair of unknowns $F(x_1, x_2, t)$ and $u(x, t)$ are to be determined from (2.1) and (2.2). The solution, however, presents severe difficulties that are here overcome by the introduction of the following simplifying assumptions based upon an averaging procedure adopted by Cox [2] and subsequently developed by others, notably Pöschl [6]:

(a) The reactive force $F(x_1, x_2, t)$ remains finite during the interval $[0, t_0]$.
(b) The interval $[0, t_0]$ is small compared to the subsequent period of deformation.
(c) The velocity at the end of the small interval $[0, t_0]$ is spatially affine to the static compression of a heavy column supported at its base.
(d) The displacement of the bar is zero throughout the interval $[0, t_0]$.

Furthermore it is supposed that

(e) The displacement remains longitudinal for $t \ge t_0$.

Assumptions (a) and (b) imply that the left side of (2.1) can be neglected and consequently linear momentum is conserved [9, Section 24]:
$$v_1 = \frac{1}{\ell}\int_0^\ell \dot{u}(x, t_0)\, dx. \tag{2.3}$$
In accordance with Assumption (c), we represent the speed at $t = t_0$ by (cf. Szabó [9, p. 374]):
$$\dot{u}(x, t_0) = -V f(x), \tag{2.4}$$
where $V$ is a positive constant and $f(x)$ is given by
$$f(x) = \frac{x(2\ell - x)}{\ell^2}. \tag{2.5}$$
Note that the quadratic function $f(x)$ vanishes at $x = 0$, and becomes unity at $x = \ell$. Instead, however, of the explicit choice (2.5) for the function $f(x)$, it is convenient to retain a function that is differentiable but otherwise arbitrary apart from the properties:
$$f(0) = 0, \qquad f(\ell) = 1, \qquad f'(\ell) = 0, \qquad f'(x) \ge 0, \qquad f''(x) \le 0, \tag{2.6}$$
where a superposed prime indicates differentiation with respect to the variable $x$. It follows from (2.3) and (2.4) that
$$\kappa_1 V = -v_1, \tag{2.7}$$
where
$$\kappa_1 = \frac{1}{\ell}\int_0^\ell f(x)\, dx \tag{2.8}$$
is a reduction factor for the total mass $m = \rho A\ell$ of the bar due to the speed $V$.

At the end of the period $[0, t_0]$ the bar experiences a compressive deformation that achieves first maximum compression at the instant $t_1$ to be determined. The strain in the interval $[t_0, t_1]$ is calculated by the semi-inverse procedure in which the longitudinal displacement $u(x, t)$ is assumed to be of the separable form:
$$u(x, t) = -\omega(t) f(x), \qquad t \ge t_0, \tag{2.9}$$
where by continuity $\dot{\omega}(t_0) = V$, and by Assumption (d), $\omega(t_0) = 0$. Of course, insofar as the displacement (2.9) satisfies neither the equilibrium equations nor the equations of motion, it is to be regarded only as an approximation. This remark applies equally to the subsequent introduction in Section 3 of a similar expression (3.1) for the displacement.

The longitudinal velocity obviously becomes
$$\dot{u}(x, t) = -\dot{\omega}(t) f(x), \qquad t \ge t_0, \tag{2.10}$$
and the corresponding strain is
$$e(x, t) = \frac{\partial u(x, t)}{\partial x} = -\omega(t) f'(x), \qquad t \ge t_0. \tag{2.11}$$
For the remainder of this section it is supposed that when $t \ge t_0$ the deformation is entirely elastic so that the accompanying stress is given by
$$\sigma(x, t) = -E\omega(t) f'(x), \qquad t \ge t_0, \tag{2.12}$$
where $E$ is Young's modulus.
544
R.J. KNOPS AND P. VILLAGGIO
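The strain energy is expressed by

$$ U(t) = \frac{1}{2} \int_0^{\ell} E A e^2\,dx = \frac{1}{2} E A \omega^2(t) \int_0^{\ell} f'^2(x)\,dx = \frac{1}{2} \frac{mE}{\rho} \omega^2(t) \kappa_2, \quad t \ge t_0, \tag{2.13} $$

where

$$ \kappa_2 = \frac{1}{\ell} \int_0^{\ell} f'^2(x)\,dx. \tag{2.14} $$

Moreover, the kinetic energy is

$$ K(t) = \frac{1}{2} \int_0^{\ell} \rho A \dot{\omega}^2(t) f^2(x)\,dx = \frac{1}{2} m \dot{\omega}^2(t) \kappa_3, \quad t \ge t_0, \tag{2.15} $$

where

$$ \kappa_3 = \frac{1}{\ell} \int_0^{\ell} f^2(x)\,dx \tag{2.16} $$

is another reduction factor [9, Section 24]. Conservation of energy yields

$$ \frac{E m \kappa_2}{\rho} \omega^2(t) + m \kappa_3 \dot{\omega}^2(t) = \int_0^{\ell} \rho A \dot{u}^2(x, t_0)\,dx, \quad t \ge t_0. \tag{2.17} $$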
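Assumption (e) implies that no strain energy is created during the conversion period $[0, t_0]$, and accordingly this quantity vanishes at $t = t_0$. By virtue of (2.4) and (2.7), we obtain from (2.17) the relation

$$ \frac{E \kappa_2}{\rho \kappa_3} \omega^2(t) + \dot{\omega}^2(t) = \left(\frac{v_1}{\kappa_1}\right)^2, \quad t \ge t_0. \tag{2.18} $$

By hypothesis, we have the initial conditions:

$$ \omega(t_0) = 0, \tag{2.19} $$

$$ \dot{\omega}(t_0) = V = \frac{v_1}{\kappa_1}, \tag{2.20} $$

and the solution to the simple harmonic motion (2.18) becomes

$$ \omega(t) = \left(\frac{\rho \kappa_3}{E \kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1} \sin\left[\left(\frac{E \kappa_2}{\rho \kappa_3}\right)^{1/2} (t - t_0)\right]. \tag{2.21} $$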
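
The elastic estimate can be checked numerically. The following is a minimal sketch, assuming the Pöschl profile (2.5) and purely illustrative nondimensional data (unit length, modulus and density, which are not values from the paper); the helper names `simpson`, `reduction_factors` and `omega` are hypothetical. It evaluates the reduction factors (2.8), (2.14), (2.16) by quadrature and then the amplitude (2.21) a quarter period after $t_0$, where the sine reaches unity.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def reduction_factors(f, df, ell):
    """Return (kappa1, kappa2, kappa3) of (2.8), (2.14), (2.16) by quadrature."""
    k1 = simpson(f, 0.0, ell) / ell
    k2 = simpson(lambda x: df(x) ** 2, 0.0, ell) / ell
    k3 = simpson(lambda x: f(x) ** 2, 0.0, ell) / ell
    return k1, k2, k3

def omega(t, t0, v1, E, rho, k1, k2, k3):
    """Amplitude omega(t) of the harmonic solution (2.21)."""
    rate = math.sqrt(E * k2 / (rho * k3))
    return (v1 / k1) / rate * math.sin(rate * (t - t0))

# Assumed, purely illustrative nondimensional data (not taken from the paper).
ell, E, rho, v1, t0 = 1.0, 1.0, 1.0, 0.1, 0.0

def f(x):
    """Poschl profile (2.5)."""
    return x * (2.0 * ell - x) / ell ** 2

def df(x):
    """Derivative of the Poschl profile."""
    return 2.0 * (ell - x) / ell ** 2

k1, k2, k3 = reduction_factors(f, df, ell)
quarter_period = 0.5 * math.pi * math.sqrt(rho * k3 / (E * k2))
omega_max = omega(t0 + quarter_period, t0, v1, E, rho, k1, k2, k3)
sigma_max = E * omega_max * df(0.0)  # stress magnitude at x = 0 from (2.12)
```

For this profile the quadrature should return values close to 2/3, 4/(3ℓ²) and 8/15 for the three factors, so the sketch also serves as a check on the definitions.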
The maximum compression of the bar corresponds to the value of $\omega(t)$ given by

$$ \omega(t_1) = \left(\frac{\rho \kappa_3}{E \kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1}, \tag{2.22} $$

and occurs at the instant $t_1$, where $(t_0, t_1)$ is the compression period of length

$$ t_1 - t_0 = \frac{1}{2} \pi \left(\frac{\rho \kappa_3}{E \kappa_2}\right)^{1/2}. \tag{2.23} $$

We conclude from (2.6) and (2.12) that the maximum stress $\sigma_M$ occurs at the end B at the instant $t = t_1$ and is given by

$$ \sigma_M = E \omega(t_1) f'(0) = \left(\frac{E \rho \kappa_3}{\kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1} f'(0). \tag{2.24} $$

The assumption that linear isotropic elasticity adequately describes the material behavior is justified provided that $\sigma_M$ does not exceed the plastic compressive yield stress $\sigma_P$. Consequently, the expression (2.24) implies that the deformation ceases to be elastic whenever (after suitable adjustment of signs)

$$ \sigma_P \le \left(\frac{E \rho \kappa_3}{\kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1} f'(0). \tag{2.25} $$

Condition (2.25) holds when either the plastic compressive yield stress is sufficiently small or the impact velocity is sufficiently large and ensures that the bar becomes progressively plastic. Such regions of plastic deformation, however, can transmit only a stress $\sigma_P$. The remaining parts of the bar continue in elastic motion since the longitudinal stress nowhere exceeds the value $\sigma_P$. Elastic-plastic behavior is discussed in the next section.

3. The Reduced System: Elastic Plastic Behavior

In this section it is assumed that the plastic compressive yield stress satisfies a condition corresponding to (2.25) and the plastic deformation extends progressively along the bar during the compression subsequent to a period $[t_0, t_2]$ of elastic motion throughout the bar, and the brief conversion period $[0, t_0]$ defined in Section 2. In particular, $t_0$ is still supposed sufficiently small to justify condition (d) of Section 2, namely that $u(x, t_0)$ is zero. We seek to determine the instant $t_2$ at which plastic deformation first occurs and then proceed to construct individual expressions for the different contributions to the stored and kinetic energies. We deduce the length of the plastic region, and the instant $T$, when there is first instantaneous maximum compression, and calculate the coefficient of restitution.
Because the maximum compressive elastic stress always occurs at the end B in contact with the rigid target, plastic deformation penetrates into the bar from this point. It is supposed that a narrow transitional zone separates the regions of elastic and plastic deformation, and it is convenient to abstract the zone into a line discontinuity between the regions. We let $z(t)$ be the distance of the discontinuity from the end B measured in the deformed bar at time $t \in [t_2, T]$. In what follows, all deformations are assumed sufficiently small to justify neglect of second and higher order terms. We frequently, for example, confuse the distance $z(t)$ with the corresponding distance measured in the undeformed or reference configuration of the bar. For simplicity, it is initially supposed that the plastic region is reduced to rest, although subsequently we indicate an approximate form for the kinetic energy when the plastic region is in motion.

As just mentioned, during the period $[t_0, t_2]$ a compressive elastic motion occurs throughout the bar, and thereafter is confined to the length $[z(t), \ell]$. Accordingly, we represent the elastic displacement by the semi-inverse expression

$$ u(x, t) = -\Omega(t) f(x), \quad \begin{cases} 0 \le x \le \ell, & t_0 \le t \le t_2, \\ z(t) \le x \le \ell, & t_2 \le t \le T, \end{cases} \tag{3.1} $$

where $\Omega(t)$, not necessarily identical to the function $\omega(t)$ introduced in (2.9), is to be determined; and the function $f(x)$ satisfies conditions (2.6). The negative sign in (3.1) indicates contraction of the bar. The corresponding speed, strain and stress produced by the elastic displacement (3.1) on its region of definition are given respectively by

$$ \dot{u}(x, t) = -\dot{\Omega}(t) f(x), \tag{3.2} $$

$$ e(x, t) = -\Omega(t) f'(x), \tag{3.3} $$

$$ \sigma(x, t) = -E \Omega(t) f'(x), \tag{3.4} $$

where, as before, $E$ denotes Young's modulus. Assumption (d) of Section 2 implies the initial values

$$ \Omega(t_0) = 0, \quad \dot{\Omega}(t_0) = V = \frac{v_1}{\kappa_1}. \tag{3.5} $$
We conclude from (3.4) and (2.9) that the elastic compressive stress assumes its maximum at the end B ($x = 0$) and consequently plastic deformation commences from this point provided the condition corresponding to (2.25) is satisfied. We let $t_2$ be the instant at which the elastic stress at $x = 0$ achieves the value $-\sigma_P$ of the plastic compressive yield stress. It follows from equations corresponding to (2.21) that

$$ \sigma_P = E \Omega(t_2) f'(0) \tag{3.6} $$

$$ = \left(\frac{\rho E \kappa_3}{\kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1} f'(0) \sin\left[\left(\frac{E \kappa_2}{\rho \kappa_3}\right)^{1/2} (t_2 - t_0)\right], \tag{3.7} $$

and $t_2$ may be calculated from (3.7) in terms of $\sigma_P$. At a subsequent instant $t \in [t_2, T]$ the elastic plastic interface has moved to the point $z(t)$. The stress on the side adjacent to the interface in the elastic region is given by (3.4) and must equal the plastic compressive yield stress. Consequently, we have

$$ \sigma_P = E \Omega(t) f'(z(t)), \quad 0 \le z(t) \le \ell, \quad t \in [t_2, T], \tag{3.8} $$

where the reference and deformed positions are not distinguished in accordance with the linear approximation adopted here. On noting that $z(t_2) = 0$, we obtain from (3.8) the lower bound

$$ \Omega(t) \ge \frac{\sigma_P}{E f'(0)}, \quad t \in [t_2, T]. \tag{3.9} $$

Moreover, differentiation of (3.8) leads to

$$ \dot{\Omega}(t) = -\Omega(t) \frac{f''(z(t))}{f'(z(t))} \dot{z}(t), \quad t \in [t_2, T], \tag{3.10} $$

which by (3.8) may be written as

$$ \dot{\Omega}(t) = -\frac{\sigma_P}{E} \frac{f''(z(t))}{f'^2(z(t))} \dot{z}(t), \quad t \in [t_2, T]. \tag{3.11} $$

Let us also remark that as the plastic deformation progresses along the bar, the (true) stress distribution at the instant $t \in [t_2, T]$ is given by

$$ \sigma = -\sigma_P, \quad 0 \le x \le z(t), \tag{3.12} $$

$$ \sigma = -E \Omega(t) f'(x), \quad z(t) \le x \le \ell. \tag{3.13} $$
We continue the analysis by employing, as in Section 2, conservation of energy and observe that the total strain energy $U(t)$ of the bar at any time $t \in [t_2, T]$ consists of three components $U_1(t)$, $U_2(t)$, $U_3(t)$, due respectively to the elastic, elastic-plastic, and dissipative strain energies. We calculate separately each component. First, the elastic strain energy from (3.3), (3.4) and (3.8) becomes, to within the linear approximation,

$$ U_1(t) = \frac{1}{2} A E \Omega^2(t) \int_{z(t)}^{\ell} f'^2(x)\,dx = \frac{1}{2} A \frac{\sigma_P^2}{E} \left[f'(z(t))\right]^{-2} \int_{z(t)}^{\ell} f'^2(x)\,dx. \tag{3.14} $$

We suppose that a point $x$ undergoes a displacement $w(x)$ at constant volume during the plastic deformation so that the total compressive strain at a point in the plastic region is

$$ \varepsilon(x, t) = -\frac{\partial w}{\partial x}, \quad 0 \le x \le z. \tag{3.15} $$

Because no change in volume accompanies the plastic deformation the longitudinal compressive plastic strain must be balanced by a transverse dilatational plastic strain $\varepsilon(x, t)$, using the sign convention of (3.15). By definition, the cross-sectional area $A(x, t)$ at any point $x$ in the plastic region is then related to the uniform cross-section area $A$ of the undeformed bar by

$$ A(x, t) = A\,(1 + \varepsilon(x, t)), \quad 0 \le x \le z(t). $$

Note that, since in contraction $\varepsilon(x, t)$ is positive by definition (3.15), the plastically deformed area $A(x, t)$ is greater than $A$. By the standard theory of elastic-perfectly plastic materials [5], the total longitudinal compressive plastic strain $\varepsilon(x, t)$ contains a plastic part $e_P = -\sigma_P / E$ whose associated strain energy is given by

$$ U_2(t) = \frac{1}{2} A \int_0^{z(t)} \frac{(-\sigma_P)\, e_P}{[1 - \varepsilon(x)]}\,dx \tag{3.16} $$

$$ = \frac{1}{2} A \frac{\sigma_P^2}{E} \int_0^{z(t)} \frac{dx}{[1 - \varepsilon(x)]}, \tag{3.17} $$

where the nominal stress $\sigma_P / [1 - \varepsilon(x)]$ per unit area of $A$ is used. Evaluation of the integrand in (3.17) to first order leads to the expression

$$ U_2(t) = \frac{1}{2} A \frac{\sigma_P^2}{E} \int_0^{z(t)} [1 + \varepsilon(x)]\,dx, \tag{3.18} $$

which by (3.15) and the continuity of the displacement across the interface gives

$$ U_2(t) = \frac{1}{2} A \frac{\sigma_P^2}{E} \left[z(t) - \Omega(t) f(z(t))\right] \tag{3.19} $$

$$ = \frac{1}{2} A \frac{\sigma_P^2}{E} \left[z(t) - \frac{\sigma_P}{E} \frac{f(z(t))}{f'(z(t))}\right]. \tag{3.20} $$

On again using the nominal stress, the expression for the dissipative strain energy is given by

$$ U_3(t) = A \int_0^{z(t)} \frac{(-\sigma_P)[\varepsilon(x) - e_P]}{[1 - \varepsilon(x)]}\,dx = -A \sigma_P \int_0^{z(t)} \frac{\varepsilon(x)}{[1 - \varepsilon(x)]}\,dx - 2 U_2(t) = -2 \frac{E}{\sigma_P} \left(1 + \frac{\sigma_P}{E}\right) U_2(t) + A \sigma_P z(t), \tag{3.21} $$

where (3.16) and (3.18) are used. Under the assumption that no energy is destroyed across the interface, conservation of energy at the instant $T$ of first instantaneous maximum compression yields

$$ U(T) = K, \tag{3.22} $$

where

$$ K = \frac{1}{2} \rho A \ell \kappa_3 \left(\frac{v_1}{\kappa_1}\right)^2, \tag{3.23} $$
and $\kappa_1$ and $\kappa_3$ are defined in (2.8) and (2.16). Rearrangement of (3.22) after substitution from (3.14), (3.20) and (3.21) gives

$$ P(z(T)) = \frac{n \ell \kappa_3}{\kappa_1^2}, \tag{3.24} $$

where

$$ P(z(t)) = \left[f'(z(t))\right]^{-2} \int_{z(t)}^{\ell} f'^2(x)\,dx + (2 + q) \frac{f(z(t))}{f'(z(t))} - z(t), \tag{3.25} $$

and the nondimensional parameters $q$ and $n$ are defined by

$$ q = \frac{\sigma_P}{E}, \quad n = \frac{\rho E v_1^2}{\sigma_P^2}. \tag{3.26} $$
In the next section, we solve equation (3.24) for specific functions $f(x)$ and present some numerical results. The right side of (3.24) is not arbitrary. Let us recall that condition (2.25) for the onset of plastic deformation requires that

$$ \sigma_P \le \left(\frac{E \rho \kappa_3}{\kappa_2}\right)^{1/2} \frac{v_1}{\kappa_1} f'(0), \tag{3.27} $$

where $\kappa_2$ is defined by (2.14). In terms of the parameter $n$, the bound (3.27) becomes

$$ n \ge \frac{\kappa_1^2 \kappa_2}{\kappa_3 f'^2(0)}, \tag{3.28} $$

which implies that the right side of (3.24) must satisfy the lower bound

$$ \frac{\kappa_2 \ell}{f'^2(0)} \le \frac{n \kappa_3 \ell}{\kappa_1^2}. \tag{3.29} $$

The parameter $n$ enters prominently into the numerical calculations of Section 4 where it is compared to the nondimensional parameter $\rho v_1^2 / \sigma_P = nq$ introduced by Taylor [10]. The solution $z(T)$ to equation (3.24) subject to (3.28) may be used to determine the coefficient of restitution $e^*$, which according to Stronge [8] is defined to be

$$ e^* = 1 - \frac{U_3(T)}{K}, \tag{3.30} $$

in our notation. It is immediate from (3.22) that $0 \le e^* \le 1$, whereas on substituting from (3.20), (3.21) and (3.24) we have the explicit representation

$$ e^* = -\frac{q}{(2 + q)} + \frac{2 \kappa_1^2}{n \kappa_3 (2 + q)} \left[\frac{z(T)}{\ell} + (1 + q) \frac{1}{\ell} \frac{\int_{z(T)}^{\ell} f'^2(x)\,dx}{f'^2(z(T))}\right]. \tag{3.31} $$
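
Equation (3.24) generally has to be solved numerically for $z(T)$. The following is a minimal sketch of one way to do so, assuming the profile $f(x) = 1 - (1 - x/\ell)^\alpha$ used later in Section 4 (with $\alpha = 2$ the quadratic (2.5)), for which $P$ increases with $z$ so that bisection applies. The helper names and the sample values $n = 0.5$, $q = 0$ are illustrative choices, not part of the original analysis.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def profile(alpha, ell):
    """Assumed profile f(x) = 1 - (1 - x/ell)**alpha and its derivative."""
    def f(x):
        return 1.0 - (1.0 - x / ell) ** alpha
    def df(x):
        return (alpha / ell) * (1.0 - x / ell) ** (alpha - 1)
    return f, df

def P(z, f, df, ell, q):
    """Function P(z) defined in (3.25)."""
    elastic = simpson(lambda x: df(x) ** 2, z, ell) / df(z) ** 2
    return elastic + (2.0 + q) * f(z) / df(z) - z

def kappa_1_3(f, ell):
    """Reduction factors kappa1 (2.8) and kappa3 (2.16)."""
    k1 = simpson(f, 0.0, ell) / ell
    k3 = simpson(lambda x: f(x) ** 2, 0.0, ell) / ell
    return k1, k3

def solve_interface(n, q, f, df, ell, tol=1e-12):
    """Bisection for z(T) in (3.24); assumes P is increasing in z."""
    k1, k3 = kappa_1_3(f, ell)
    target = n * ell * k3 / k1 ** 2
    lo, hi = 0.0, ell * (1.0 - 1e-9)
    if P(lo, f, df, ell, q) > target:
        raise ValueError("n lies below the bound (3.28): no plastic zone forms")
    while hi - lo > tol * ell:
        mid = 0.5 * (lo + hi)
        if P(mid, f, df, ell, q) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def restitution(z, n, q, f, df, ell):
    """Coefficient of restitution from (3.31)."""
    k1, k3 = kappa_1_3(f, ell)
    tail = simpson(lambda x: df(x) ** 2, z, ell) / (ell * df(z) ** 2)
    bracket = z / ell + (1.0 + q) * tail
    return -q / (2.0 + q) + 2.0 * k1 ** 2 / ((2.0 + q) * n * k3) * bracket

ell = 1.0
f, df = profile(2, ell)
z_T = solve_interface(n=0.5, q=0.0, f=f, df=df, ell=ell)
e_star = restitution(z_T, 0.5, 0.0, f, df, ell)
```

With $q = 0$ and $\alpha = 2$ the result can be compared with the smaller root of $y^2 - 4.8y + 3 = 0$ for $y = 1 - z(T)/\ell$, which is what (3.24) reduces to for these values.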
Determination of the time $T$ of first maximum compression requires consideration of the energy conservation for the whole bar at some intermediate time $t \in [t_2, T]$ and the introduction of the corresponding kinetic energy. We distinguish between hard materials, for which both the elastically and plastically deformed parts of the bar experience motion, and soft materials for which, as assumed so far, the plastically deformed part is at rest. Consequently, for soft materials, motion at time $t \in [t_2, T]$ is confined to the elastically deformed region where the displacement is given by (3.1). The kinetic energy is

$$ K_1(t) = \frac{1}{2} \rho A \int_{z(t)}^{\ell} \dot{\Omega}^2(t) f^2(x)\,dx = \frac{1}{2} \rho A \ell q^2 \left[\frac{f''(z(t))}{f'^2(z(t))}\right]^2 \left[\frac{1}{\ell} \int_{z(t)}^{\ell} f^2(x)\,dx\right] \dot{z}^2(t), \quad t \in [t_2, T], \tag{3.32} $$

after appeal to (3.11). The potential energy of the whole bar continues to be given by $U(t)$ and, accordingly, energy conservation yields

$$ K_1(t) + U(t) = K, \tag{3.33} $$

which after substitution from (3.14), (3.20), (3.21), (3.32) becomes

$$ \frac{\rho \ell}{E} \left[\frac{f''(z(t))}{f'^2(z(t))}\right]^2 \left[\frac{1}{\ell} \int_{z(t)}^{\ell} f^2(x)\,dx\right] \dot{z}^2(t) = n \ell \kappa_3 \kappa_1^{-2} - P(z(t)). \tag{3.34} $$

We may integrate (3.34) to obtain

$$ \left(\frac{E}{\rho \ell}\right)^{1/2} (t - t_2) = \int_0^{z(t)} \frac{[f''(\zeta)/f'^2(\zeta)]\,\left[(1/\ell) \int_{\zeta}^{\ell} f^2(x)\,dx\right]^{1/2}}{[n \ell \kappa_3 \kappa_1^{-2} - P(\zeta)]^{1/2}}\,d\zeta, \tag{3.35} $$

which enables $z(t)$ to be determined. Insertion into (3.8) provides an expression for $\Omega(t)$. To calculate $T - t_2$, we substitute from the solution to (3.24) into (3.35). But $t_2$ is given by (3.7) and consequently the time $T - t_0$ is known. We omit details.

For a hard material, all parts of the bar continue in motion after plastic deformation has commenced. We adopt the simplifying assumption that the transverse inertial effects are absent in the region of plastic deformation, and further assume that the kinetic energy may adequately be represented by elastic kinetic energy derived from the displacement (3.1). The total kinetic energy $K_2(t)$ of the whole bar is consequently given by

$$ K_2(t) = \frac{1}{2} \rho A \int_0^{\ell} \dot{\Omega}^2(t) f^2(x)\,dx = \frac{1}{2} \rho A \kappa_3 q^2 \ell \left[\frac{f''(z(t))}{f'^2(z(t))}\right]^2 \dot{z}^2(t), \tag{3.36} $$

where (3.11) has been used. Conservation of energy becomes

$$ K_2(t) + U(t) = K, \tag{3.37} $$

which may be treated similarly to (3.33) to yield

$$ \left(\frac{E}{\rho \ell}\right)^{1/2} (t - t_2) = \kappa_3^{1/2} \int_0^{z(t)} \frac{[f''(\zeta)/f'^2(\zeta)]\,d\zeta}{[n \ell \kappa_3 \kappa_1^{-2} - P(\zeta)]^{1/2}}. \tag{3.38} $$
Again, the function $\Omega(t)$ and the total time $T - t_0$ may be determined as before. The evaluation of the integrals in (3.35) and (3.38) must in general be undertaken numerically. Let us observe, however, that by definition (2.16) we have

$$ \kappa_3 \ge \frac{1}{\ell} \int_z^{\ell} f^2(x)\,dx, $$

and accordingly the time for maximum compression in soft materials is less than that for hard materials.

4. The Pöschl Approximation. Numerical Evaluation

We illustrate the theory by selecting an explicit family of functions $f(x)$ and for one particular choice (the Pöschl approximation (2.5)) present a numerical analysis that allows comparison with the well-known corresponding results obtained by Taylor [10] in the rigid-plastic theory that for certain impact speeds predict those experimentally observed by Whiffin [11]. The functions considered are given by

$$ f(x) = 1 - \left(1 - \frac{x}{\ell}\right)^{\alpha}, \quad 0 \le x \le \ell, \tag{4.1} $$

where $\alpha$ ($> 1$) is a positive constant. Clearly, the functions (4.1) satisfy conditions (2.6) for each $\alpha$. In particular, when $\alpha = 2$, the function (4.1) becomes the Pöschl approximation (2.5) and because of its physical relevance this function is selected for numerical investigation. To obtain the equation satisfied by $z(T)$, we first substitute (4.1) into definitions (2.8), (2.14), and (2.16) to obtain

$$ \kappa_1 = \frac{\alpha}{\alpha + 1}, \quad \kappa_2 = \frac{\alpha^2}{\ell^2 (2\alpha - 1)}, \quad \kappa_3 = \frac{2\alpha^2}{(\alpha + 1)(2\alpha + 1)}, \tag{4.2} $$

and consequently the plastic yield stress by (2.25) must satisfy

$$ \sigma_P \le (\rho E)^{1/2} v_1 \left[\frac{2(\alpha + 1)(2\alpha - 1)}{2\alpha + 1}\right]^{1/2}. \tag{4.3} $$

As observed in Section 3, the material parameters in (3.24) appear only in the dimensionless combinations $q$ and $n$ defined by (3.26). Consequently, $n$ by (4.3) possesses the lower bound:

$$ n \ge \frac{2\alpha + 1}{2(\alpha + 1)(2\alpha - 1)}. \tag{4.4} $$
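
The closed forms (4.2) and the bound (4.4) are easy to cross-check. The sketch below, a hedged illustration rather than part of the original analysis, compares the closed-form reduction factors with direct quadrature of the definitions (2.8), (2.14), (2.16) for the family (4.1); the function names and the sample values $\alpha = 3$, $\ell = 2$ are assumptions made only for the example.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def kappas_closed_form(alpha, ell):
    """Reduction factors (4.2) for the family (4.1)."""
    k1 = alpha / (alpha + 1.0)
    k2 = alpha ** 2 / (ell ** 2 * (2.0 * alpha - 1.0))
    k3 = 2.0 * alpha ** 2 / ((alpha + 1.0) * (2.0 * alpha + 1.0))
    return k1, k2, k3

def kappas_by_quadrature(alpha, ell):
    """The same factors from the integral definitions (2.8), (2.14), (2.16)."""
    def f(x):
        return 1.0 - (1.0 - x / ell) ** alpha
    def df(x):
        return (alpha / ell) * (1.0 - x / ell) ** (alpha - 1.0)
    k1 = simpson(f, 0.0, ell) / ell
    k2 = simpson(lambda x: df(x) ** 2, 0.0, ell) / ell
    k3 = simpson(lambda x: f(x) ** 2, 0.0, ell) / ell
    return k1, k2, k3

def n_lower_bound(alpha):
    """Right side of (4.4)."""
    return (2.0 * alpha + 1.0) / (2.0 * (alpha + 1.0) * (2.0 * alpha - 1.0))

alpha, ell = 3.0, 2.0  # assumed illustrative values
closed = kappas_closed_form(alpha, ell)
numeric = kappas_by_quadrature(alpha, ell)
bound_poschl = n_lower_bound(2.0)  # equals 5/18 for the Poschl case
```

The bound can also be recovered from (3.28) by inserting the closed forms together with $f'(0) = \alpha/\ell$, which gives the same expression as (4.4).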
Observe that the functions (4.1) give

$$ \left[f'(z(T))\right]^{-2} \int_{z(T)}^{\ell} f'^2(x)\,dx = \frac{\ell\, y(T)}{2\alpha - 1}, \tag{4.5} $$

where $\ell y(T) = \ell - z(T)$ is the elastically deformed length of the bar at first maximum compression, and therefore (3.24) leads to:

$$ y^{(1 - \alpha)}(T)\,[2 + q] + y(T) \left[\frac{2(\alpha - 1)^2}{2\alpha - 1} - q\right] - \alpha \left[1 + \frac{2n(\alpha + 1)}{2\alpha + 1}\right] = 0. \tag{4.6} $$

The coefficient of restitution $e^*$ from (3.31) becomes

$$ e^* = -\frac{q}{(2 + q)} + \frac{(2\alpha + 1)}{n(\alpha + 1)(2 + q)} \left[1 - y(T) \frac{\{2(\alpha - 1) - q\}}{2\alpha - 1}\right]. \tag{4.7} $$

General estimates for the time taken to achieve first maximum compression do not simplify sufficiently to warrant separate display. For the Pöschl approximation, when $\alpha = 2$, the compressive yield stress from (4.3) must satisfy the bound

$$ \sigma_P \le 3 v_1 \left(\frac{2 \rho E}{5}\right)^{1/2}, \tag{4.8} $$

while the parameter $n$ from (4.4) must satisfy

$$ n \ge 0.278. \tag{4.9} $$
We consider the following numerical values for the density, impact speed and Young's modulus, which are comparable to those considered by Taylor [10] and Whiffin [11] for mild steel:

$$ \rho = 7.873 \cdot 10^{-3}\ \text{Kg.cm}^{-3}, \quad v_1 = 4.875 \cdot 10^{4}\ \text{cm.sec}^{-1}, \quad E = 2.07 \cdot 10^{9}\ \text{Kg.wt.cm}^{-2}. \tag{4.10} $$

From (4.8) we find that $\sigma_P \le 3.734 \cdot 10^{8}$ Kg.wt.cm$^{-2}$. On the other hand, Taylor stipulates that his analysis requires the compressive yield stress to be comparable to $\rho v_1^2$ which for the numerical values (4.10) equals $1.871 \cdot 10^{7}$. Nevertheless, Taylor expresses his results in terms of the nondimensional parameter $N = nq = \rho v_1^2 / \sigma_P$ (in our notation) and considers the explicit values $N = 0.5$, $1.633$, $3.2$, and $8.1$ for the impact speed $v_1 = 2.4689 \cdot 10^{4}$ cm.sec$^{-1}$ (= 810 ft.sec$^{-1}$). These values correspond to a compressive yield stress of respective values $9.598 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $2.939 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $1.5 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $5.925 \cdot 10^{5}$ Kg.wt.cm$^{-2}$. The values of $n$ are $1.078 \cdot 10^{2}$, $1.150 \cdot 10^{3}$, $4.417 \cdot 10^{3}$, and $2.83 \cdot 10^{4}$.

The proportional length $y(T) = 1 - z(T)/\ell$ of the bar that remains elastically deformed at first maximum compression is obtained from (4.6) and satisfies the quadratic equation:

$$ y^2(T) \left(1 - \frac{3q}{2}\right) - 3 y(T) \left(1 + \frac{6n}{5}\right) + 3 \left(1 + \frac{q}{2}\right) = 0. \tag{4.11} $$
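
The arithmetic behind these figures can be reproduced directly from the definitions. The sketch below takes the data (4.10) and Taylor's values of $N$ as quoted above, treats the quoted units as a consistent set exactly as the text does, and evaluates the bound (4.8), the product $\rho v_1^2$, and the yield stresses and values of $n$ implied by $N = \rho v_1^2/\sigma_P$ and (3.26). The function names are illustrative only.

```python
import math

# Data (4.10), in the units quoted in the text.
rho = 7.873e-3   # density
v1 = 4.875e4     # impact speed
E = 2.07e9       # Young's modulus

sigma_bound = 3.0 * v1 * math.sqrt(2.0 * rho * E / 5.0)  # right side of (4.8)
rho_v_squared = rho * v1 ** 2                            # Taylor's reference stress

def yield_stress_from_N(N, rho, v):
    """sigma_P implied by N = rho v^2 / sigma_P."""
    return rho * v ** 2 / N

def n_from_yield_stress(sigma_P, rho, E, v):
    """n = rho E v^2 / sigma_P^2, see (3.26)."""
    return rho * E * v ** 2 / sigma_P ** 2

v_taylor = 2.4689e4  # impact speed used with Taylor's values of N
taylor_rows = []
for N in (0.5, 1.633, 3.2, 8.1):
    sigma_P = yield_stress_from_N(N, rho, v_taylor)
    taylor_rows.append((N, sigma_P, n_from_yield_stress(sigma_P, rho, E, v_taylor)))

for N, sigma_P, n in taylor_rows:
    print(f"N = {N:5.3f}   sigma_P = {sigma_P:.3e}   n = {n:.3e}")
```

Run as written, the loop prints one line per value of $N$, which can be compared with the yield stresses and values of $n$ listed in the preceding paragraph.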
Table I.

n             N              σP (Kg.wt.cm^-2)   y(T)     z(T)/ℓ    e*
0.5           6.72 · 10^-2   2.783 · 10^8       0.739    0.261     0.846
0.75          8.23 · 10^-2   2.272 · 10^8       0.587    0.413     0.676
1             9.51 · 10^-2   1.968 · 10^8       0.491    0.509     0.561
5             2.126 · 10^-1  8.801 · 10^7       0.144    0.856     0.151
10            3.000 · 10^-1  6.223 · 10^7       0.077    0.923     0.079
2.766 · 10    5.000 · 10^-1  3.742 · 10^7       0.029    0.971     0.030
10^2          9.507 · 10^-1  1.968 · 10^7       0.008    0.992     0.008
2.950 · 10^2  1.633          1.146 · 10^7       0.003    0.997     0.003
8.051 · 10^2  2.968          6.936 · 10^6       0.001    0.999     0.001
10^3          3.006          6.223 · 10^6       0.0008   0.9992    0.000
But the ratio $q = \sigma_P / E$ for both soft and hard materials, and also for the values considered here, is negligibly small compared to one, and (4.11) reduces to the approximate equation:

$$ y^2(T) - 3 y(T) \left(1 + \frac{6n}{5}\right) + 3 = 0. \tag{4.12} $$

The coefficient of restitution from (4.7) is expressed by:

$$ e^* = -\frac{q}{2 + q} + \frac{5\,[3 - (2 - q)\,y(T)]}{9n(2 + q)}, \tag{4.13} $$

which upon neglecting $q$ assumes the simpler form:

$$ e^* = \frac{5\,[3 - 2 y(T)]}{18n}. \tag{4.14} $$
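
Equations (4.12) and (4.14) can be evaluated in a few lines. The sketch below is an illustration with hypothetical function names. It selects the root of (4.12) that lies in $(0, 1]$, writing it through the product of the roots (which equals 3) to avoid cancellation for large $n$, and then evaluates (4.14).

```python
import math

def elastic_fraction(n):
    """Smaller root y(T) of the quadratic (4.12); requires n >= 5/18."""
    b = 3.0 * (1.0 + 6.0 * n / 5.0)
    disc = math.sqrt(b * b - 12.0)
    # The two roots multiply to 3, so the smaller one is 3 divided by the larger one.
    return 6.0 / (b + disc)

def restitution(n):
    """Coefficient of restitution (4.14), with q neglected."""
    return 5.0 * (3.0 - 2.0 * elastic_fraction(n)) / (18.0 * n)

for n in (0.5, 0.75, 1.0, 5.0, 10.0):
    y = elastic_fraction(n)
    print(f"n = {n:5.2f}   y(T) = {y:.3f}   z(T)/l = {1.0 - y:.3f}   e* = {restitution(n):.3f}")
```

At the lower bound $n = 5/18$ of (4.9) the root is $y(T) = 1$, so that no plastic zone has formed, which agrees with the role of (4.4) as the onset condition.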
We calculate y(T ) from (4.12) for different values of n, and then determine e∗ from (4.14). The results for the data (4.10) are presented in Table I which also lists the respective values of z(T ), the compressive yield stress σP obtained from (3.26), and the parameter N. It is seen that the present investigation provides realistic values of y(T ) corresponding to values of the compressive yield stresses that for n 1 are approximately 10 times greater than those given by Taylor and Whiffin. The difference decreases to roughly 10% for n = 5 when the value of y(T ) is that measured by Whiffin at the same impact speed. We do not attempt to determine the total time to first maximum compression, which as already noted, requires the numerical evaluation of the integrals appearing in (3.35) and (3.38). Whiffin [11] treated mild steel specimens subject to longitudinal impact velocities v1 = 1.219 ·104 cm.sec−1 , 4.875 ·104 cm.sec−1 , and 7.62 ·104 cm.sec−1 , and from measurements of the plastically deformed and undeformed lengths used
Taylor's theory to show that the values of $N$ are 0.145, 2.698, and 6.290, respectively. He then determined the values of the compressive yield stress to be $8.061 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $6.936 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, $7.268 \cdot 10^{6}$ Kg.wt.cm$^{-2}$, which are of the same order of magnitude as those produced from other tests for mild steel. The most favorable comparison with those predicted by the present approach exhibits a difference of about 10%. The order of magnitude of the yield stress, however, derived here is the same as that for nickel–chrome steel whose density and Young's modulus are comparable with the values given by (4.10). Our analysis appears better suited to hard materials.

Both Taylor and Whiffin indicate a considerable shortening of the bar due to the plastic deformation. The method of this paper, however, supposes that such shortening is negligible. A possible explanation for the discrepancies between the theories is provided by our assumption of small plastic strains and the introduction of elastic strains into the energy balance equation.

Acknowledgements

This work was partly supported by the Italian Group for Mathematical Physics. The authors are grateful to the referees for constructive comments. One author (R.J.K.) wishes to thank the Leverhulme Trust for the award of an Emeritus Fellowship.

References

1. L. Boltzmann, Einige Experimente über den Stoss von Zylindern. Sitzungsberichte, Akad. Wiss. Wien Math. Naturwiss. Kl. 84 (1881) 1225.
2. H. Cox, On impacts on elastic beams. Trans. Cambridge Phil. Soc. 9 (1849) 73.
3. W. Goldsmith, Impact. E. Arnold, London (1960).
4. A.H.E. Love, A Treatise on the Mathematical Theory of Elasticity. Cambridge Univ. Press, Cambridge (1927).
5. J.B. Martin, Plasticity: Fundamentals and General Results. MIT Press, Cambridge, MA (1975).
6. T. Pöschl, Der Stoss. Handbuch der Physik, Vol. 6. Springer, Berlin (1928), Chapter 7.
7. B. Saint-Venant and A. Flamant, Détermination et représentation graphique des lois du choc longitudinal. C. R. Acad. Sci. Paris 47 (1883) 127, 214, 281, 314.
8. W.J. Stronge, Impact Mechanics. Cambridge Univ. Press, Cambridge (2000).
9. I. Szabó, Einführung in die Technische Mechanik. Springer, Berlin (1963).
10. G.I. Taylor, The use of flat-ended projectiles for determining dynamic yield stress. I. Theoretical consideration. Proc. Roy. Soc. London A 194 (1948) 289–299.
11. A.C. Whiffin, The use of flat-ended projectiles for determining dynamic yield stress. II. Tests of various metallic materials. Proc. Roy. Soc. London A 194 (1948) 300–322.
On the Transformation Property of the Deformation Gradient under a Change of Frame

I-SHIH LIU
Instituto de Matemática, Universidade Federal do Rio de Janeiro, Caixa Postal 68530, CEP 21945-970, Rio de Janeiro, Brazil.

Received 23 April 2002; in revised form 16 January 2003

Abstract. If the deformation gradients are denoted by $F$ and $F^*$ respectively before and after a change of frame, they are related by the transformation formula, $F^* = QF$, where $Q$ is the orthogonal transformation associated with the change of frame. Although it has been pointed out that this relation is valid “provided that the reference configuration be unaffected by the change of frame” (see p. 308 of [1]), this formula is found in most textbooks of Continuum Mechanics, and is used, without further justification, in deriving the condition of material frame-indifference, $H(QF) = QH(F)Q^T$ for the constitutive function $H$ of the stress tensor of an elastic body. In this note, we shall analyze the effect of change of frame on the transformation property of the deformation gradient, and show that the above transformation formula is not valid in general. However, we shall confirm the validity of the above well-known condition of material frame-indifference without the assumption that the reference configuration be unaffected by the change of frame.

Mathematics Subject Classifications (2000): 74A05, 74A20.

Key words: reference configuration, Euclidean transformation, Galilean objectivity, principle of material frame-indifference, simple materials.

In memory of Professor Clifford A. Truesdell
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 555–562. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Frame of Reference and Deformation

The event world or space-time $W$ of continuum mechanics [2] can be mapped onto the product space of a three-dimensional Euclidean space $E$ and the set of real numbers $\mathbb{R}$ through a one-to-one mapping, $\phi: W \to E \times \mathbb{R}$. Such a mapping is called a frame of reference. Let us denote by $W_t$ the totality of simultaneous events at the instant $t$, and $\phi_t$ the restriction of $\phi$ to $W_t$, so $\phi_t: W_t \to E$ associates the placement of an event with a location in the Euclidean space $E$.

A body $B$ is a set of material points, and we shall identify it, through a one-to-one mapping, with a region in $E$ relative to a frame of reference. Such an identification is called a configuration of the body. More specifically, we consider a placement of the body $B$ in $W_t$, say, $\tilde{\kappa}: B \to W_t$, then $\kappa = \phi_t \circ \tilde{\kappa}: B \to E$ is a configuration of $B$ relative to the frame of reference $\phi$ at the instant $t$. Given a particular configuration relative to a frame of reference,

$$ \kappa: B \to E, \quad \kappa(X) = \mathbf{X}, \tag{1.1} $$

called a reference configuration of $B$, a material point $X$ in the body $B$ can be identified with its position $\mathbf{X}$ in the region $B_\kappa$ occupied by the body in the reference configuration $\kappa$. A motion $\chi$ of $B$ can be expressed as a map,

$$ \chi: B \times \mathbb{R} \to E, \quad x = \chi(X, t), \tag{1.2} $$
where $\chi(\cdot, t): B \to E$ is the configuration of the body $B$ at time $t$. The region occupied by the body at time $t$ will be denoted by $B_t$. Given a reference configuration $\kappa$, the motion $\chi$ can also be expressed as

$$ \chi_\kappa: B_\kappa \times \mathbb{R} \to E, \quad x = \chi_\kappa(\mathbf{X}, t) = \chi(\kappa^{-1}(\mathbf{X}), t). \tag{1.3} $$

We call $\chi(X, t)$ the material description of the motion and $\chi_\kappa(\mathbf{X}, t)$ a referential description. The map $\chi_\kappa(\cdot, t) = \chi(\cdot, t) \circ \kappa^{-1}: B_\kappa \to B_t$ is called the deformation from $B_\kappa$ to $B_t$. The deformation gradient relative to $\kappa$, denoted by $F$, is defined as the gradient of $\chi_\kappa(\mathbf{X}, t)$ relative to $\mathbf{X}$, i.e., $F = \nabla_{\mathbf{X}} \chi_\kappa$.

For a given motion, the reference configuration $\kappa$ is often chosen as the configuration at some instant $t = t_0$, say, the initial position of the body in the motion, $\kappa = \chi(\cdot, t_0)$. However, the reference configuration need not be occupied by the body in the actual motion at any instant, in principle. It can be any convenient placement of the body at some instant of time in the frame of reference.

2. Transformation Property of Deformation Gradient

Let $\phi$ and $\phi^*$ be two frames of reference. We call $* = \phi^* \circ \phi^{-1}: E \times \mathbb{R} \to E \times \mathbb{R}$ a change of frame from $\phi$ to $\phi^*$. In general, the change of frame $*$, which maps $(x, t)$ to $(x^*, t^*)$, is a Euclidean transformation of the following form,

$$ x^* = Q(t)(x - x_0) + c(t), \quad t^* = t + a, \tag{2.1} $$

for some $a \in \mathbb{R}$, $x_0 \in E$, $c(t) \in E$, and $Q(t) \in O$, where $O$ is the group of orthogonal transformations on the translation space of $E$. In particular, $\phi_t^* \circ \phi_t^{-1}: E \to E$ is given by

$$ x^* = \phi_t^*(\phi_t^{-1}(x)) = Q(t)(x - x_0) + c(t). \tag{2.2} $$

Figure 1. Reference configurations $\kappa$ and $\kappa^*$ in the change of frame from $\phi$ to $\phi^*$.

Let $\tilde{\kappa}: B \to W_{t_0}$ be a reference placement of the body at some instant $t_0$, then (see Figure 1)

$$ \kappa = \phi_{t_0} \circ \tilde{\kappa} \quad \text{and} \quad \kappa^* = \phi_{t_0}^* \circ \tilde{\kappa} \tag{2.3} $$

are the two corresponding reference configurations of $B$ in the frames $\phi$ and $\phi^*$ at the same instant, and

$$ \mathbf{X} = \kappa(X), \quad \mathbf{X}^* = \kappa^*(X), \quad X \in B. $$

Let us denote by $\gamma = \kappa^* \circ \kappa^{-1}$ the change of reference configuration from $\kappa$ to $\kappa^*$ in the change of frame, then it follows from (2.3) that $\gamma = \phi_{t_0}^* \circ \phi_{t_0}^{-1}$ and by (2.2), we have

$$ \mathbf{X}^* = \gamma(\mathbf{X}) = K(\mathbf{X} - x_0) + c(t_0), \tag{2.4} $$

where $K = Q(t_0)$ is a constant orthogonal tensor. On the other hand, the motion in referential description relative to the change of frame is given by

$$ x = \chi_\kappa(\mathbf{X}, t), \quad x^* = \chi_{\kappa^*}^*(\mathbf{X}^*, t^*), $$

and from (2.2) we have

$$ \chi_{\kappa^*}^*(\mathbf{X}^*, t^*) = Q(t)(\chi_\kappa(\mathbf{X}, t) - x_0) + c(t). $$

Therefore we obtain for the deformation gradient in the frame $\phi^*$, i.e., $F^* = \nabla_{\mathbf{X}^*} \chi_{\kappa^*}^*$, by the chain rule,

$$ F^*(\mathbf{X}^*, t^*) = Q(t) F(\mathbf{X}, t) K^T, \quad \text{or simply,} \quad F^* = Q F K^T, \tag{2.5} $$
where $K^T$ denotes the transpose of $K$, and $K = Q(t_0)$, by (2.4), is a constant orthogonal tensor arising from the change of frame for the reference configuration. The transformation property (2.5) stands in contrast to the well-known formula $F^* = QF$, which is valid provided that the reference configuration is unaffected by the change of frame, so that $K$ reduces to the identity tensor.

From (2.1), since the orthogonal transformation $Q(t)$ in a Euclidean transformation is time-dependent, it is conceivable that in a change of frame, one may choose a reference configuration at some instant $t_0$, such that $Q(t_0) = 1$. Therefore, the assumption that “the reference configuration be unaffected by the change of frame” (see [1, p. 308]) can be justified. However, for an arbitrary Euclidean transformation, this is not always possible, for example, when $Q$ is time-independent. Transformation properties of some other kinematic quantities related to the deformation gradient are discussed in [3].

GALILEAN OBJECTIVITY OF DEFORMATION GRADIENT

A second order tensor quantity $S$ is called objective if, in a change of frame $*$, $S^* = Q S Q^T$. From (2.5), it follows that the deformation gradient $F$ is not an objective tensor quantity under Euclidean transformations because, in general, $K = Q(t_0) \ne Q(t)$. However, (2.5) also asserts that $F$ is objective under frame transformations with time-independent orthogonal tensor $Q$, since, in this case,

$$ K = Q \quad \text{and} \quad F^* = Q F Q^T. $$

In particular, we can say that the deformation gradient is an objective tensor quantity with respect to Galilean transformations, which form a subclass of Euclidean transformations (2.1), with

$$ Q(t) = Q, \quad c(t) = v_0 t + c_0. \tag{2.6} $$
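
The transformation rule (2.5) and its Galilean special case can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a homogeneous deformation $x = F\mathbf{X}$, a rotation about a fixed axis for $Q(t)$, and arbitrary sample values for $x_0$, $c(t)$, $t_0$ and $t$, none of which come from the paper. It builds the starred referential motion from (2.2) and (2.4), differentiates it with respect to $\mathbf{X}^*$ by central differences, and compares the result with $QFK^T$ and with $QF$.

```python
import math

def rot_z(theta):
    """Rotation by angle theta about the third axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# Assumed illustrative data (none of these values come from the paper).
F = [[1.2, 0.3, 0.0], [0.0, 0.9, 0.1], [0.0, 0.0, 1.1]]  # homogeneous F at time t
x0 = [0.5, -0.2, 0.1]
t0, t = 1.0, 2.5

def c_of(time):
    """Assumed translation history c(t)."""
    return [time, 2.0 * time, 0.0]

def starred_gradient(Q_of, h=1e-6):
    """Central-difference gradient of the starred motion with respect to X*."""
    Q, K = Q_of(t), Q_of(t0)
    def chi_star(X_star):
        # Invert (2.4): X = K^T (X* - c(t0)) + x0
        shifted = [u - v for u, v in zip(X_star, c_of(t0))]
        X = [a + b for a, b in zip(matvec(transpose(K), shifted), x0)]
        x = matvec(F, X)  # homogeneous deformation in the unstarred frame
        # Apply (2.2): x* = Q(t)(x - x0) + c(t)
        rel = [u - v for u, v in zip(x, x0)]
        return [a + b for a, b in zip(matvec(Q, rel), c_of(t))]
    base = [0.3, 0.4, 0.5]
    G = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(base), list(base)
        plus[j] += h
        minus[j] -= h
        fp, fm = chi_star(plus), chi_star(minus)
        for i in range(3):
            G[i][j] = (fp[i] - fm[i]) / (2.0 * h)
    return G, Q, K

# Time-dependent rotation: the gradient matches Q F K^T but not Q F.
G, Q, K = starred_gradient(lambda time: rot_z(0.7 * time))
err_QFKt = max_abs_diff(G, matmul(matmul(Q, F), transpose(K)))
err_QF = max_abs_diff(G, matmul(Q, F))

# Galilean case (2.6): constant Q, hence K = Q and the gradient matches Q F Q^T.
G_gal, Q_gal, _ = starred_gradient(lambda time: rot_z(0.4))
err_gal = max_abs_diff(G_gal, matmul(matmul(Q_gal, F), transpose(Q_gal)))
```

Because the assumed motion is affine, the central differences are exact up to rounding, so `err_QFKt` and `err_gal` should be negligible while `err_QF` remains of order one.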
This conclusion, therefore, modifies the classical result of the strict non-objectivity of the deformation gradient, from the transformation formula $F^* = QF$, based on the convenient, but oversimplified, assumption that the reference configuration be unaffected by the change of frame.

3. Principle of Material Frame-Indifference

The most important aspect of changes of frame lies in the formulation of the principle of material frame-indifference for constitutive functions. In what follows, we shall confirm the usual condition of material frame-indifference, without the usual assumption that the reference configuration be unaffected by the change of frame. For simplicity, we shall present it in the pure mechanical theory.

3.1. IN MATERIAL DESCRIPTION

Let $\phi$ be a frame of reference and $\chi$ be a motion. Let $T(X, t)$ be the value of the stress tensor at the material point $X$ and time $t$ in the frame $\phi$. We can write the constitutive relation in the following form,

$$ T(X, t) = \underset{Y \in B,\ 0 \le s < \infty}{\mathcal{F}_\phi} (\chi(Y, t - s), X), \quad X \in B, \tag{3.1} $$

where we have indicated the domain of the argument function $\chi$ beneath the functional symbol $\mathcal{F}$. We emphasize that the constitutive function depends on the choice of frame in general, so that we have also indicated the frame $\phi$ on $\mathcal{F}$ as a subscript. We remark that the stress tensor is an objective tensor quantity, i.e., relative to a change of frame $*$ given by (2.1), it has the following transformation property:

$$ T^*(X, t^*) = Q(t) T(X, t) Q(t)^T, \quad X \in B. \tag{3.2} $$

Since any intrinsic property of materials should be independent of frame of reference, it is required that for any objective quantity, the constitutive function must be invariant with respect to any change of frame. Mathematically, it can be stated in the following

Principle of material frame-indifference. The constitutive function of an objective quantity must be independent of the frame, i.e., $\mathcal{F}_\phi(\cdot) = \mathcal{F}_{\phi^*}(\cdot)$, for any frames of reference $\phi$ and $\phi^*$.

More specifically, from (3.2), the principle implies the following condition of material frame-indifference,

$$ \underset{Y \in B,\ 0 \le s < \infty}{\mathcal{F}} (\chi^*(Y, t^* - s), X) = Q(t) \underset{Y \in B,\ 0 \le s < \infty}{\mathcal{F}} (\chi(Y, t - s), X)\, Q(t)^T, \tag{3.3} $$

or simply $\mathcal{F}(\chi^*) = Q \mathcal{F}(\chi) Q^T$, for any change of frame $*$ given by (2.1). In this condition, we have written $\mathcal{F}$ for both $\mathcal{F}_\phi$ and $\mathcal{F}_{\phi^*}$, since, by the principle of material frame-indifference, they are the same function, and therefore, (3.3) is a restriction imposed on the constitutive function $\mathcal{F}$.
560
I-S. LIU
3.2. IN REFERENTIAL DESCRIPTION Let κ be a reference configuration of the body B in the frame φ, x = χ(X, t) = χκ (X, t),
X = κ(X),
X ∈ B.
(3.4)
In terms of referential description, we can rewrite the constitutive relation (3.1) relative to κ as T (X, t) =
(χκ (Y , t Fκ Y ∈Bκ ,0s<∞
− s), X),
X ∈ Bκ .
(3.5)
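From (3.4), $\mathcal{F}_\kappa$ is related to $\mathcal{F}$ by

$$\mathcal{F}_\kappa\big(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\big) = \mathcal{F}\big(\chi_\kappa(\mathbf{Y}, t - s), \kappa^{-1}(\mathbf{X})\big). \tag{3.6}$$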
Note that the constitutive function $\mathcal{F}_\kappa$ depends on the reference configuration κ in the frame φ. To express the condition of material frame-indifference in referential description, let φ∗ be another frame, and denote the corresponding reference configuration in this frame by κ∗ (see (2.3)). We have

$$\mathbf{x}^* = \chi^*(X, t^*) = \chi^*_{\kappa^*}(\mathbf{X}^*, t^*), \qquad \mathbf{X}^* = \kappa^*(X), \qquad X \in \mathcal{B}, \tag{3.7}$$
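and, similar to (3.6),

$$\mathcal{F}_{\kappa^*}\big(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\big) = \mathcal{F}\big(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \kappa^{*-1}(\mathbf{X}^*)\big). \tag{3.8}$$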
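The condition (3.3) then takes the form

$$\mathcal{F}_{\kappa^*}\big(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\big) = Q(t)\, \mathcal{F}_\kappa\big(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\big)\, Q(t)^T. \tag{3.9}$$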
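In this equation, the constitutive functions on the two sides are expressed in terms of the reference configuration in two different frames. However, from (3.8) and (3.6), we have

$$\begin{aligned}
\mathcal{F}_{\kappa^*}\big(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \mathbf{X}^*\big)
&= \mathcal{F}\big(\chi^*_{\kappa^*}(\mathbf{Y}^*, t^* - s), \kappa^{*-1}(\mathbf{X}^*)\big) \\
&= \mathcal{F}\big(\chi^*_{\kappa^*}(\kappa^* \circ \kappa^{-1}(\mathbf{Y}), t^* - s), \kappa^{*-1}(\kappa^* \circ \kappa^{-1}(\mathbf{X}))\big) \\
&= \mathcal{F}\big(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \kappa^{-1}(\mathbf{X})\big) \\
&= \mathcal{F}_\kappa\big(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \mathbf{X}\big),
\end{aligned}$$

where $\gamma = \kappa^* \circ \kappa^{-1}$ stands for the change of reference configuration due to the change of frame given by (2.4). Therefore, the condition of material frame-indifference relative to a reference configuration κ becomes

$$\underset{\mathbf{Y} \in \mathcal{B}_\kappa,\; 0 \le s < \infty}{\mathcal{F}_\kappa}\big(\chi^*_{\kappa^*}(\gamma(\mathbf{Y}), t^* - s), \mathbf{X}\big) = Q(t)\, \underset{\mathbf{Y} \in \mathcal{B}_\kappa,\; 0 \le s < \infty}{\mathcal{F}_\kappa}\big(\chi_\kappa(\mathbf{Y}, t - s), \mathbf{X}\big)\, Q(t)^T, \tag{3.10}$$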
or simply as

$$\mathcal{F}_\kappa(\chi^*_{\kappa^*} \circ \gamma) = Q\, \mathcal{F}_\kappa(\chi_\kappa)\, Q^T. \tag{3.11}$$
We emphasize that, in this condition, only the constitutive function relative to the reference configuration κ in the frame φ is involved and therefore (3.11) is a restriction on the constitutive function $\mathcal{F}_\kappa$. Note that we have

$$\mathbf{x} = \chi_\kappa(\mathbf{X}, t), \qquad \mathbf{x}^* = \chi^*_{\kappa^*}(\gamma(\mathbf{X}), t^*), \tag{3.12}$$
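for $\mathbf{X} \in \mathcal{B}_\kappa$, and by (2.2) they are related by

$$\chi^*_{\kappa^*}(\gamma(\mathbf{X}), t^*) = Q(t)\big(\chi_\kappa(\mathbf{X}, t) - \mathbf{x}_0\big) + \mathbf{c}(t). \tag{3.13}$$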
From (3.10) and (3.13), we conclude that although the reference configuration is frame-dependent, it does not affect the condition of material frame-indifference, (3.10) together with (3.13), as long as the condition is expressed in terms of the constitutive function relative to the reference configuration in one frame only.

4. Simple Material Bodies

For simple material bodies (see [1, Section 28]), the constitutive dependence on the motion is restricted to a local dependence on the deformation gradient only. The constitutive relation (3.5) can then be written as

$$T(\mathbf{X}, t) = \underset{0 \le s < \infty}{\mathcal{H}}\big(F(\mathbf{X}, t - s), \mathbf{X}\big), \tag{4.1}$$
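where $F = \nabla_{\mathbf{X}} \chi_\kappa$, and the condition (3.11) becomes

$$\mathcal{H}\big(\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma)\big) = Q\, \mathcal{H}(\nabla_{\mathbf{X}} \chi_\kappa)\, Q^T. \tag{4.2}$$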
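By the chain rule, we obtain the gradient

$$\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma) = (\nabla_{\mathbf{X}^*} \chi^*_{\kappa^*})(\nabla_{\mathbf{X}} \gamma) = F^* K,$$

where $F^* = \nabla_{\mathbf{X}^*} \chi^*_{\kappa^*}$ and $K = \nabla_{\mathbf{X}} \gamma$ from (2.4). Hence, we have

$$\nabla_{\mathbf{X}}(\chi^*_{\kappa^*} \circ \gamma) = (Q F K^T) K = Q F, \tag{4.3}$$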
by the use of the transformation formula $F^* = Q F K^T$ from (2.5). Note that the above relation can also be obtained directly from (3.13). Therefore, from (4.2) and (4.3), the condition of material frame-indifference (3.10), for simple material bodies, becomes

$$\underset{0 \le s < \infty}{\mathcal{H}}\big(Q(t - s) F(\mathbf{X}, t - s), \mathbf{X}\big) = Q(t)\, \underset{0 \le s < \infty}{\mathcal{H}}\big(F(\mathbf{X}, t - s), \mathbf{X}\big)\, Q(t)^T,$$
or simply

$$\mathcal{H}(Q F) = Q\, \mathcal{H}(F)\, Q^T \qquad \forall\, Q(t) \in \mathcal{O}. \tag{4.4}$$
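The algebra leading to (4.3) and (4.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that K is orthogonal (which the step (QFK^T)K = QF requires), and it uses a hypothetical elastic response H(F) = FF^T − I, chosen only because it is simple and satisfies (4.4). The matrices F, Q and K are arbitrary sample values.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def response(F):
    """Hypothetical elastic response H(F) = F F^T - I (an assumption for illustration)."""
    B = matmul(F, transpose(F))
    return [[B[i][j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol
               for i in range(len(a)) for j in range(len(a[0])))

# Sample deformation gradient and two orthogonal tensors (arbitrary values).
F = [[1.2, 0.3, 0.0], [0.1, 0.9, 0.2], [0.0, 0.4, 1.1]]
Q = rotation_z(0.7)    # change of frame
K = rotation_x(-1.3)   # change of reference configuration induced by the frame change

F_star = matmul(matmul(Q, F), transpose(K))   # transformation formula (2.5)
grad_composed = matmul(F_star, K)             # chain rule: gradient of the composed motion
QF = matmul(Q, F)

lhs = response(grad_composed)                              # H(F* K)
rhs = matmul(matmul(Q, response(F)), transpose(Q))         # Q H(F) Q^T

print(close(grad_composed, QF), close(lhs, rhs))
```

Both comparisons hold whatever orthogonal K is chosen, which is the numerical counterpart of the statement that K drops out of the condition.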
In other words, this well-known condition remains valid without the assumption that the reference configuration be unaffected by the change of frame.

FINAL REMARKS

In [4], the transformation property (2.5), $F^* = Q F K^T$, was derived (see equation (2.2.93)); however, it appeared as an isolated remark, and its consequences were not considered further in the book. The observer-dependent reference configuration was also considered in [5], in which the property (2.5) and its restrictions on the response functions for elastic solids were obtained in a manner different from the present paper.

Acknowledgement

The author acknowledges the partial support of CNPq-Brasil, through the Research Fellowship, Proc. 300135/83-1.

References

1. C. Truesdell and W. Noll, The Non-Linear Field Theories of Mechanics, S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin/Heidelberg (1965).
2. C. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1, 2nd edn. Academic Press, Boston (1991).
3. I-Shih Liu, Continuum Mechanics. Springer, Berlin/Heidelberg (2002).
4. R.W. Ogden, Non-Linear Elastic Deformations. Dover, Mineola, New York (1997).
5. A.I. Murdoch, On objectivity and material symmetry for simple elastic solids. J. Elasticity 60 (2000) 233–242.
Some New Advances in the Theory of Dynamic Materials

KONSTANTIN A. LURIE
Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609, U.S.A. E-mail: [email protected]

Received 25 September 2002; in revised form 29 August 2003

Abstract. Some recent advances in the theory of dynamic materials are listed in the paper. We discuss the technique used to determine the set of invariant characteristics of material mixtures in one spatial dimension and time, in the context of electrodynamics of moving dielectrics, versus the relevant results in traditional electrostatics. Some special features of dynamic materials demonstrated through a material design are advertised as well. Among them, we mention the possibility to eliminate the cut-off frequency in the waveguides with activated dielectric filling.

Mathematics Subject Classifications (2000): 78A48, 78M40, 78M30.

Key words: dynamic materials, cut-off frequency elimination.
To the Memory of Professor Clifford Truesdell, Teacher and Friend
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 563–573. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

Introduction

This paper is focused on special material formations termed dynamic materials (DM). DMs are defined [1–3] as composites assembled from conventional materials distributed on a microscale in space and time. When a low-frequency dynamic disturbance propagates through such an assemblage, it perceives the assemblage as a uniform medium with some “effective” properties mathematically detected through homogenization. A discussion of such properties, along with some special effects they produce in material design, is the central objective of this work.

DMs are encountered far more often in real life than one may expect at first glance. A TV screen on which a movie is projected represents a DM – a plane with reflection properties that vary rapidly in space and time. The human mechanism of vision implements a spatio-temporal averaging of a rapidly alternating pattern of picture waves, i.e., modulated scanning lines, and it thereby implements homogenization to reveal “a slow motion” carrying the information stored in a movie.

A similar example of a DM is given by a transmission line with variable linear inductance and capacitance. A discrete model represents the line as an array of LC-cells connected in series (Figure 1). Assume that each cell offers two possibilities: (L1, C1) – “material 1”, and (L2, C2) – “material 2”, turned on/off by a toggle switch. If the cells are densely distributed along the line, then, by a controlled switching, the linear inductance L and capacitance C may become almost arbitrary functions of the spatial coordinate z and time t. In particular, they may produce a periodic LC-laminate in a (z, t)-plane assembled from materials 1 and 2 (Figure 2).

Figure 1. A discrete version of a transmission line.

Figure 2. A moving (LC)-property pattern – an activated composite.

To this end, we create such a pattern at time t = 0, and bring it, as a whole, to a uniform motion with velocity V along the z-axis. This velocity should either be less than the phase velocities of waves in both materials, or exceed both of these velocities; we take these precautions in a DM to avoid the formation of shocks. It is essential that the motion is confined to the pattern alone: materials 1 and 2 themselves remain immovable relative to a laboratory observer. From this remark it becomes clear that some restrictions should be imposed on the microgeometry of a DM to avoid strong discontinuities in dynamic disturbances. We will term the relevant microstructures admissible; for such microstructures,
Figure 3. A pipeline construction.
conventional compatibility conditions of kinematic and dynamic type hold across the interfaces separating one material in the assemblage from another. After we apply homogenization to an admissible laminar construction, we reveal a uniform material with the effective properties depending on all of the parameters involved, such as the volume fractions of participating constituents, and the velocity V of a property pattern. This example represents what we term an activated spatio-temporal composite – one of the two major categories of DMs.

Another type of DM, called kinetic, involves the relative motion of the original constituents in a microstructure. An example is given by an air column in the form of a pipeline assembled from identical sections separated by toroidal chambers [4]. By manipulating compressions in the chambers, one may produce an individual velocity pattern within each section, and such patterns may vary, from section to section, both in magnitude and direction (Figure 3). Waves that are long compared to the length of a section will propagate along the pipeline as if it were a uniform medium with some effective density and compressibility.

The mathematical theory of DMs bears some resemblance to the conventional theory of composites built in space alone. This resemblance is, however, limited because there are features unique to dynamic formations that have no analogs in ordinary composites. One thing is universal: to maintain a spatio-temporal variability of properties in a material assemblage with a dynamic process developed in it, one should generally arrange a flow of energy and momentum between the material and its environment. In other words, a DM is a thermodynamically open system.

A clear idea of both common and special features of DMs versus the ordinary composites may be obtained from the comparison between electrodynamics of moving dielectrics and traditional electrostatics.
In both examples we start with two base tensor entities: the electric displacement D and the electric field E in electrostatics, and the skew-symmetric electromagnetic tensors f and F in electrodynamics. These entities are linked through the constitutive relations involving material tensors: a tensor e of dielectric constants and a tensor s of dielectric and
Figure 4. Spatial and spatio-temporal polycrystals.
magnetic constants, respectively. The base vectors (tensors) satisfy the relevant fundamental equations given by Maxwell’s theory. The main difference is that electrostatics is about purely spatial phenomena associated with the Euclidean group of rotations, while electrodynamics is about spatio-temporal phenomena associated with the Lorentz group that contains, along with Euclidean rotations, also pure Lorentz transforms as its elements. In particular, any dielectric which is isotropic in the conventional sense (i.e., with regard to Euclidean rotations) is at the same time anisotropic in space-time (with regard to Lorentz transforms), the only exception to this rule being vacuum. This difference is substantial: due to it, in electrostatics we have a variational principle of minimum stored energy, while in electrodynamics we only have a principle of stationarity of the action density. Accordingly, Euler’s equations are elliptic in electrostatics and hyperbolic in electrodynamics.

In spite of these differences, homogenization may detect the effective properties of composites in both scenarios. To illustrate, consider, as an example, the polycrystalline formation. The notion is common in electrostatics: to produce a polycrystal, we must have an originally anisotropic paternal material (a monocrystal), and intermix, in space, its fragments turned by different angles relative to a laboratory frame (Figure 4). In other words, a traditional Euclidean rotation is responsible for the difference in the material properties of individual grains. When the polycrystal is two-dimensional (i.e., it lies in an (x, y)-plane), then with homogenization the determinant of its material tensor e is preserved through the mixing [5]:

$$\lambda_1 \lambda_2 = \det e_{\mathrm{eff}} = \det e = \varepsilon_1 \varepsilon_2.$$

Here, $\varepsilon_1$ and $\varepsilon_2$ represent eigenvalues of the paternal material, whereas $\lambda_1$ and $\lambda_2$ denote eigenvalues of the effective tensor $e_{\mathrm{eff}}$.
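As a worked illustration of this determinant rule, consider the special case, assumed here for the example only, of a two-dimensional polycrystal whose effective tensor is isotropic, so that both effective eigenvalues equal a single value λ. Writing $\varepsilon_1$, $\varepsilon_2$ for the eigenvalues of the paternal monocrystal, the preserved determinant then fixes λ completely; the numerical values are hypothetical.

```latex
\lambda_1 = \lambda_2 = \lambda
\quad\Longrightarrow\quad
\lambda^2 = \varepsilon_1 \varepsilon_2
\quad\Longrightarrow\quad
\lambda = \sqrt{\varepsilon_1 \varepsilon_2},
\qquad\text{e.g.}\quad
\varepsilon_1 = 1,\ \varepsilon_2 = 4
\ \Longrightarrow\ \lambda = 2 .
```

In this example the effective constant is the geometric mean of the two principal values, so it lies strictly between them whenever the monocrystal is anisotropic.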
Figure 5. “Caterpillar” construction.
A similar result holds for the electrodynamics of moving dielectrics. To be specific, consider a kinetic laminate – a periodic array which consists of copies of one and the same isotropic dielectric, with eigenvalues $\varepsilon c$ and $1/(\mu c)$ for its material tensor s; these copies will be distributed along the z-axis, and each copy brought into material motion along the z-axis with individual velocity V. A discontinuous velocity pattern may be implemented through the use of the following feasible construction. Assume that we have a linear arrangement of caterpillars placed one after another along the z-axis (Figure 5). The tracks that are moved by caterpillars become electrically connected when they belong to the z-axis, and stay disconnected otherwise. The z-axis will then become occupied by material fragments moving each at its own axial velocity, and the electric current will flow along the z-axis through the assemblage of electrically connected tracks. With this construction, the electromagnetic field will be controlled directly by an appropriate specification of the velocity pattern.

Because every conventional dielectric is anisotropic in space-time ($\varepsilon c \neq 1/(\mu c)$), and because a material motion represents rotation in space-time by an imaginary angle $i\phi$, where $\tanh \phi = V/c$, we arrive at what may be termed a spatio-temporal polycrystal (Figure 4). This formation represents a DM – an isotropic dielectric – with the effective properties E, M found through homogenization. We obtain [2]

$$\frac{E}{M} = \det s_{\mathrm{eff}} = \det s = \frac{\varepsilon}{\mu},$$

in complete analogy with a similar electrostatic situation. To translate this result into the language of transmission lines, we may say that the effective wave impedance of a line assembled from moving parts with the same wave impedance is preserved through a spatio-temporal mixing. There is, however, a substantial difference between the static and dynamic scenarios.
Both identify the set of eigenvalues of the effective material tensors as hyperbolas in the relevant planes (Figure 6); these hyperbolas obviously pass through points related to the monocrystalline materials. However, in electrostatics, not all of the points on the hyperbola are attainable; only those points can be attained through actually assembled composites that belong to the segment of the hyperbola between the original material and the diagonal. This is understandable because an ordinary polycrystal cannot become more anisotropic than the original monocrystal. This follows basically from the minimum variational principle of electrostatics. On the contrary, in electrodynamics a spatio-temporal hyperbola is attainable at all points but one, namely, that point on the diagonal related to the vacuum [6]. This is because, to attain this point, it takes infinite energy for particles of nonzero proper
Figure 6. Effective properties of spatial and spatio-temporal polycrystals.
mass. We conclude that in electrostatics the minimum variational principle generates a hierarchy of materials with respect to mixing: an original monocrystalline material may create only those polycrystals that lie on the hyperbola closer to the diagonal. In contrast to this, in electrodynamics any material on the hyperbola may create, by forming polycrystals, any other material on it (except for the vacuum). In other words, electrostatics displays a paternalistic performance when it comes to mixing in space, whereas in electrodynamics, with a spatio-temporal mixing, we have no such performance. Clearly, the reason is that the minimum principle is not valid for the full Maxwell system.

These observations strongly affect the problem of determining the so-called G-closures, i.e., the sets of invariants of material tensors of all mixtures that may be produced as the assemblages of original material constituents. Again, we illustrate this through the comparison between electrodynamics and electrostatics. Figure 7 is related to electrostatics; it demonstrates the G-closure produced by a spatial mixing of two anisotropic dielectrics in 2D [7]. The original materials $(\varepsilon_{11}, \varepsilon_{12})$ and $(\varepsilon_{21}, \varepsilon_{22})$ generate, through making spatial polycrystals, their own hyperbolic segments. These segments represent a part of the boundary of the G-closure. In a transverse direction, this set is bounded by a diagonal at one end, and by a special curve at the other end; this curve passes through the points $(\varepsilon_{11}, \varepsilon_{12})$ and $(\varepsilon_{21}, \varepsilon_{22})$ related to the original materials and represents a rank one spatial laminate assembled from them. Within the layers, the eigenaxes of the original materials are oriented along and across the layers’ interface. All 2D-mixtures of two original materials fall, regardless of microgeometry, into the shaded domain bounded by the noted curves. In electrodynamics, where composites are assembled in space-time, the situation is different.
Any particular composite built in space-time from the original constituents may or may not allow the long waves to travel through it without shocks, damping, or amplification (the term “long” in this context means “long compared with the period of a microstructure”). When such travelling waves exist, we call a composite stable, otherwise we term it unstable.
Figure 7. G-closures of a binary set of spatial and spatio-temporal composites.
We will allow only stable composites to become elements of G-closures generated by the elements of the original set U. Stable formations should, of course, be admissible in the sense that they do not lead to shock waves. This requirement alone is, however, not enough for stability: a composite may not generate shock waves but at the same time be unable to transmit travelling waves. Such composites may be produced by a special mixing procedure consisting of two steps; below we describe this procedure for one-dimensional wave propagation.

We start with two isotropic dielectrics immovable in a laboratory frame and having positive material constants $\varepsilon_i$, $\mu_i$, $i = 1, 2$, $\varepsilon_1/\mu_1 \neq \varepsilon_2/\mu_2$. An activated rank one laminate assembled from them has the determinant of its effective material tensor $s_{\mathrm{eff}}$ defined by [2]

$$\det s_{\mathrm{eff}} = \frac{\langle y \det s \rangle}{\langle y \rangle}; \tag{1}$$

here $\langle \cdot \rangle = m_1 (\cdot)_1 + m_2 (\cdot)_2$, $m_1, m_2 \ge 0$, $m_1 + m_2 = 1$,

$$y = \big[\varepsilon\,(V^2 - a^2)\big]^{-1}, \qquad a^2 = \frac{1}{\varepsilon\mu}, \qquad \det s = \frac{\varepsilon}{\mu}, \tag{2}$$

and V denotes the velocity of the pattern – the slope of lines in Figure 2. By introducing

$$\kappa_i = \frac{m_i y_i}{\langle y \rangle}, \qquad i = 1, 2, \tag{3}$$
we observe that $\kappa_1 + \kappa_2 = 1$; as to the sign of $\kappa_i$, it is the same as that of $y_i/\langle y \rangle$. Assume that this sign is positive for both i = 1, 2; then $\det s_{\mathrm{eff}}$ is also positive as a convex combination (1) of $\det s_i$, and, consequently, the eigenvalues $Ec$, $1/(Mc)$ of $s_{\mathrm{eff}}$ have the same sign. This means that travelling waves are possible through the laminate, and this holds for all admissible values of $m_i$ and V. However, if the signs of $y_i/\langle y \rangle$ are opposite for i = 1 and i = 2, then the same holds for $\kappa_i$, and the combination $\det s_{\mathrm{eff}} = \kappa_1 \det s_1 + \kappa_2 \det s_2$ may be made negative by a suitable choice of $\kappa_1$; as a consequence, the values of E and M will have opposite signs, and travelling waves will not exist. For the relevant values of $\kappa_1$, the laminate will become unstable, but for other admissible values of $\kappa_1$ (i.e., $m_i$ and V) it will remain stable.

We will call a stable composite absolutely stable if it remains stable for all admissible values of its structural parameters. So, if both $\kappa_1$ and $\kappa_2$ are positive for all such values, an activated laminate is absolutely stable; otherwise it lacks absolute stability. Another example of an absolutely stable composite is given by a kinetic polycrystal produced by mixing different fragments of the same original dielectric in space and time. As seen from the above argument, a laminate fails to be absolutely stable if the signs of $\varepsilon_1$ and $\varepsilon_2$ are opposite; in other words, to violate absolute stability, we must have original materials with parameters $\varepsilon_i$, $\mu_i$ of opposite sign for different values of i, say, both parameters positive for i = 1, and both of them negative for i = 2. But a material with negative ε and µ can be created as an activated laminate, probably of the second rank, assembled from any two original dielectrics with ε, µ being all positive. This may be achieved [8] by a special choice of structural parameters in a laminate.
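The sign argument above can be tried out numerically. The sketch below is illustrative only: it assumes the reading of (1)–(3) in which y = 1/[ε(V² − a²)], a² = 1/(εµ) and det s = ε/µ, the function names are hypothetical, and the material constants, volume fractions and pattern velocity are arbitrary sample values in units where they are dimensionless.

```python
def det_s(eps, mu):
    """Determinant eps/mu of the material tensor of an isotropic dielectric at rest."""
    return eps / mu

def laminate_weights(materials, fractions, V):
    """Weights kappa_i = m_i * y_i / <y> of an activated rank one laminate.

    materials: list of (eps, mu) pairs; fractions: volume fractions m_i summing to 1;
    V: pattern velocity, assumed different from every phase velocity a_i.
    """
    ys = []
    for eps, mu in materials:
        a2 = 1.0 / (eps * mu)                   # squared phase velocity a^2
        ys.append(1.0 / (eps * (V ** 2 - a2)))  # assumed form of y
    y_mean = sum(m * y for m, y in zip(fractions, ys))
    return [m * y / y_mean for m, y in zip(fractions, ys)]

def det_s_eff(materials, fractions, V):
    """Effective determinant as the weighted combination of det s_i."""
    kappas = laminate_weights(materials, fractions, V)
    return sum(k * det_s(eps, mu) for k, (eps, mu) in zip(kappas, materials))

# Two positive materials, pattern slower than both phase velocities:
positive = [(1.0, 1.0), (2.0, 1.0)]
print(laminate_weights(positive, [0.5, 0.5], 0.5))  # approximately [0.4, 0.6]
print(det_s_eff(positive, [0.5, 0.5], 0.5))         # approximately 1.6

# One positive and one negative material, same phase velocities as above:
mixed = [(1.0, 1.0), (-2.0, -1.0)]
print(laminate_weights(mixed, [0.7, 0.3], 0.5))     # approximately [2.8, -1.8]
print(det_s_eff(mixed, [0.7, 0.3], 0.5))            # approximately -0.8
```

In the first case both weights are positive and the effective determinant lies between det s₁ = 1 and det s₂ = 2. In the second case the weights have opposite signs and the effective determinant becomes negative, so E and M would have opposite signs and no travelling waves would exist for that choice of parameters.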
The creation of such a “negative” material represents the first stage of the two-step procedure mentioned above. Having one material negative and another material positive, we will, at the second stage of this procedure, assemble from them a second rank laminate lacking absolute stability.

We now define a stable hyperbolic G-closure of an original set U of materials as a set of invariants of the effective tensors $s_{\mathrm{eff}}$ of all absolutely stable spatio-temporal mixtures generated by the elements of U. For one spatial variable and time, a stable G-closure of the set U of two original dielectrics $(\varepsilon_1, \mu_1)$ and $(\varepsilon_2, \mu_2)$ is given by a hyperbolic strip bounded by the hyperbolas $E/M = \varepsilon_1/\mu_1$ and $E/M = \varepsilon_2/\mu_2$, this strip involving branches belonging both to the first and the third quadrants of the coordinate plane; see Figure 7 [9]. We observe that the G-closure contains materials with both effective parameters negative. Not every two elements of a G-closure may serve as original constituents for building other elements; only those elements qualify that may produce an absolutely stable composite. Materials of opposite signs are known not to qualify, so the secondary elements (mixtures) may only be produced as composites made from original materials of the same sign. We also observe that there is no transverse bound for a G-closure (let alone
the diagonal). The reason is that there is no minimum energy principle, and the system is thermodynamically open.

In conclusion, I want to mention some of the special effects produced by DMs as they become elements of material design. When we create DMs, we control the geometry of the characteristics of the relevant hyperbolic equations. By a suitable mixing, we may direct all of the waves to travel in one and the same direction relative to a laboratory observer. We term such a phenomenon a coordinated wave propagation. Consider two coordinated material mixtures; in one of them, both waves travel from left to right (“a right material”), in the other, from right to left (“a left material”). Place a right material to the right of the origin z = 0 on the z-axis, and a left material to the left of it. This material combination will demonstrate a screening property [10]: an initial state will be split into waves travelling away from the origin, and never entering an extended region in between. Nor will this region ever be invaded by waves generated at the ends of the segment of the z-axis we consider, because the characteristics will divert such waves away from the required direction.

Another effect achieved through a material activation consists in the elimination of the cut-off frequency in waveguides. A waveguide filled with an appropriately activated laminate allows all waves much longer than the spatial period of lamination to propagate without damping through the waveguide, thus eliminating the cut-off frequency.
Some Recollections of Professor Clifford Truesdell

I am pleased to have this opportunity to share some personal reminiscences of Clifford. I met Clifford only twice, but both meetings left such a lasting impression upon me that I think he was one of the most remarkable and unique individuals I ever met. The first meeting took place in the late spring of 1988 when Clifford and Charlotte visited Russia, and the second occurred a little over a year later when they hosted me in their wonderful home in Baltimore.

Before I met Clifford in person, I knew much about his work and noticeably less about his personality. Of course I knew about a strong opposition among some high-ranking moguls in Soviet mechanics to what they called “Truesdellism”. This labelling of rational mechanics carries no negative flavor per se; I myself perceive it in a positive sense as a tribute to the man whose seminal work has added so much to our understanding of the roots of continuum mechanics. There are, however, individuals who take this name negatively because they associate with it “an unnecessary incursion of abstract mathematics into the field of mechanics”. This is their viewpoint, and if appropriately motivated it may become a basis for a legitimate opposition. The bad thing, however, is that time and again disproportionately fierce efforts are undertaken to suppress the ideas of rational mechanics at all costs, and to prevent it from reaching out to a broad audience. Such
efforts are beyond logic; I will refer only to two examples, one of which I witnessed personally.

The first occasion took place in 1975 when A First Course in Rational Continuum Mechanics was published in Moscow on the initiative of Grigorii Isaakovich Barenblatt. This name was taboo at the Nauka Editorial Board headed by Academician L.I. Sedov; after some extensive pressure, however, the book appeared with no mention of Barenblatt’s name in it; it was translated by R.V. Goldstein and V.M. Entov, and edited by P.A. Zhilin and A.I. Lurie. Clifford had taken an active part in the preparatory work through his intensive correspondence with the editors. Interestingly, the Russian version appeared several years before the book was published in English [11].

The second episode is related to one of the first original Russian texts on nonlinear elasticity, written by my father, A.I. Lurie. This book expounded and developed many of the ideas of rational mechanics, of which the author was not only an ardent supporter but also an active contributor. He and Clifford knew each other, and held each other in immense respect. The manuscript of the book was completed by the beginning of 1979 and submitted to Nauka Publishers for review. When the referee’s report was received in late April, the author was already gravely ill. Naturally, it was up to me to act in his name through the entire review business. The referee’s report shocked me: its language left no doubt that the goal was to destroy the book. From a long list of demands, I mention here only one: it required that an index notation be used rather than a direct tensor notation. To accept this was the same as to kill the book, because it required complete revision (and retyping) of the whole text, i.e., several long months of senseless work. The situation was both delicate and risky; after a discussion with colleagues, I declined this demand but accepted some others, less significant, but also time consuming.
Unfortunately, the delay produced by this circumstance did not allow enough time for the author to see his work published [12].

So it was with this prehistory that I met Clifford and Charlotte in Leningrad in the spring of 1988. There was a meeting at the Ioffe Institute, my workplace at that time, with a number of invited guests. At this Institute, there was a group of people doing classical mathematical physics, particularly special functions, integral equations, and the like. And it quite unexpectedly turned out that Clifford had contributed to this topic, too! Apparently, that occurred years before he turned all of his attention to the foundations of continuum mechanics.

While in Leningrad, we discussed many general topics. This was a time when perestroika was already gaining momentum, so we naturally discussed politics. But I think Clifford was more interested in personal observations: the town itself, its museums, and its beautiful suburbs. St. Petersburg is an architectural marvel, a place world famous for its magnificence and harmony. Passionate as he was, Clifford was eager to examine every bit of what he saw in museums, be it a fabric, furniture, or clocks. He literally knelt down in front of selected pieces to better feel the material and
to recognize the work. He evidently was pleased with his visit, as he recalled later when we met again.

I think Clifford was more than a prolific scholar. He really loved life, was passionate and sometimes sharp in his judgment, but he also was a philosopher. His interest in the history of science was not accidental: it originated from the same source as his interest in the foundations of mechanics; both revealed his striving to understand the very roots of things. In this sense, I think, he was of the same ilk as the people of the Renaissance who clearly realized their place in history.

References

1. I.I. Blekhman and K.A. Lurie, On dynamic materials. Proc. of the Russian Academy of Sciences (Doklady) 37 (2000) 182–185.
2. K.A. Lurie, The problem of effective parameters of a mixture of two isotropic dielectrics distributed in space-time and the conservation law for wave impedance in one-dimensional wave propagation. Proc. Roy. Soc. London A 454 (1998) 1767–1779.
3. K.A. Lurie, Control of the coefficients of linear hyperbolic equations via spatio-temporal composites. In: V. Berdichevsky, V. Jikov and G. Papanicolaou (eds), Homogenization. World Scientific, Singapore (1999) pp. 285–315.
4. B.P. Lavrov, Private Communication. Mekhanobr-Tekhnika, St. Petersburg, Russia (2002).
5. A.M. Dykhne, Conductivity of a two-dimensional two-phase system. Soviet Phys. JETP 32 (1971) 63–65.
6. K.A. Lurie, Bounds for the electromagnetic material properties of a spatio-temporal dielectric polycrystal with respect to one-dimensional wave propagation. Proc. Roy. Soc. London A 456 (2000) 1547–1557.
7. K.A. Lurie and A.V. Cherkaev, Effective characteristics of composite materials and the optimal design of structural elements. In: A. Cherkaev and R. Kohn (eds), Topics in the Mathematical Modelling of Composite Materials. Birkhäuser, Boston (1997) pp. 175–258.
8. K.A. Lurie and S.L. Weekes, Effective and averaged energy densities in one-dimensional wave propagation through spatio-temporal dielectric laminates with negative effective values of ε and µ. To appear in: R. Agarwal and D. O’Reagan (eds), Nonlinear Analysis and Applications. World Scientific, Singapore (2003).
9. K.A. Lurie, A stable spatio-temporal G-closure and Gm-closure of a set of isotropic dielectrics with respect to one-dimensional wave propagation. Submitted to Wave Motion.
10. K.A. Lurie, Effective properties of smart elastic laminates and the screening phenomenon. Internat. J. Solids Struct. 34 (1997) 1633–1643.
11. C.A. Truesdell III, A First Course in Rational Continuum Mechanics, Part I. Academic Press, New York (1977).
12. A.I. Lurie, Non-linear Theory of Elasticity (in Russian). Nauka, Moscow (1980). English translation: Non-linear Theory of Elasticity. North Holland, Amsterdam, New York, Oxford, Tokyo (1990).
Pseudo-plasticity and Pseudo-inhomogeneity Effects in Materials Mechanics

GERARD A. MAUGIN
Laboratoire de Modélisation en Mécanique, UMR CNRS 7607, Université Pierre et Marie Curie (Paris 6), Case 162, 4 place Jussieu, 5252 Paris, Cedex 05, France. E-mail: [email protected]

Received 29 July 2002; in revised form 27 August 2003

Abstract. It is shown that a large variety of physical effects such as continuously distributed defects, heat conduction, anelasticity (plasticity in finite-strains, growth), phase transitions and more generally shock-waves, can be viewed as pseudo-material inhomogeneities when continuum thermomechanics is completely projected onto the material manifold itself. Main ingredients in this approach are the notions of local structural rearrangements (Epstein and Maugin) and of its thermodynamical dual, the Eshelby material stress tensor. An outcome of this is the unification of the theories of inhomogeneity of Eshelby on the one hand, and of Kroener–Noll–Wang, on the other hand. The notion of configurational forces as understood nowadays in solid-state physics and engineering mechanics follows necessarily from these developments. They are driving forces acting on sets of material points that correspond to strongly localized fields and, in the limit, singularities, which are also viewed as pseudo-inhomogeneities. The second law of thermodynamics then is a constraint imposed on the time evolution of these pseudo-inhomogeneities (e.g., plastic evolution, volumetric growth, progress of a crack, advancement of a phase-transition front, etc.). This has very powerful implications in numerical schemes drawn directly on the material manifold (e.g., thermodynamically admissible volume-element scheme for the simulation of phase-transformation evolution).

Mathematics Subject Classifications (2000): 74A15, 74A45, 74J40, 74N20.

Key words: solids, thermodynamics, elasticity, dissipation, nonhomogeneous materials, fracture, phase transitions, singularities.
This contribution is dedicated to the memory of the late Clifford A. Truesdell, master and guide of our generation in the field theory of continuum mechanics.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 575–597. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

We start by giving the following two purely verbal definitions which encapsulate most of the subject of this paper.

DEFINITION 1. We call pseudo-plastic effects in continuum mechanics those mechanical effects – due to any physical property – which manifest themselves just like plasticity, through the notion of internal or eigenstrains and eigenstresses
(Eigenspannungen) in the language of Kroener [1]; see also [2]. Thermoelasticity, magnetoelasticity of nonuniformly magnetized ferromagnets, electroelasticity of ferroelectrics, etc., are examples of such pseudo-plastic effects (cf. [3, 4]).

DEFINITION 2. We call pseudo-inhomogeneity effects in continuum mechanics those mechanical effects – of any origin – which manifest themselves as so-called material forces in the material mechanics of materials, as developed by the author and co-workers since 1990 (see, e.g., reviews by the author [5, 6]).

The reason for this definition is that the force exerted on a true material inhomogeneity (a region of a material body where material properties vary with the material point or are different from those at other points outside the region) in a material displacement (caused by the field solution of the problem) is – through the inherent duality of continuum mechanics – the best characterization of the material inhomogeneity of a body. Forces acting on smooth distributions of dislocations (one kind of crystalline defect) and forces acting on macroscopic defects viewed as field singularities of certain dimensions on the material manifold, such as the forces driving macroscopic cracks or phase-transition fronts, are of this type.

The purpose of the present work is to present a unified view of these two classes of effects. This should not come as a surprise since, for instance, dislocations are one possible cause of eigenstresses. They also provide the ultimate microscopic mechanism at the basis of macroscopic plasticity. Also, it might seem reasonable that the best intrinsic way to describe eigenstrains is to observe what happens on the material manifold.
To reach our conclusion we shall combine two modern approaches to continuum mechanics: the thermomechanics of irreversible processes approached by means of the concept of internal variable of state [7], and the geometrical approach that considers local material rearrangements on the material manifold (a notion due to Epstein and Maugin [8], but leaning on Noll’s and Wang’s works [9, 10]; see also [11]) as the basic mechanisms of all our effects of interest. On the way, we shall uncover the unification of three of the most productive and creative lines of thought developed in continuum mechanics in the second half of the twentieth century, namely, (i) the finite-strain line with the concept of multiplicative decomposition of the deformation gradient, (ii) the geometrical line whose purpose, inspired by mathematical physics, was to capture anelastic effects via necessarily involved geometrical descriptions of the material manifold, and (iii) the configurational-force line which gave rise to the notion of material force (i.e., a covector on the material manifold) following the pioneering works of Peach and Koehler and Eshelby in the 1950s. Section 2 reviews these three lines in the form of a historical introït. The three great historical figures who thus emerge are J. Mandel (1904–1978), E. Kroener (1919–2000) and J.D. Eshelby (1916–1981).
2. Historical Preliminary

This section does not pretend to be an exhaustive historical account of the field, but only to indicate some of the most salient contributions and their approximate interrelations. We like to distinguish three lines of originally independent creative developments in continuum mechanics in the period 1950–2000 (Flow chart 1 based on [12, 13]). One purpose of this contribution is to show how these three lines were recently united in a grand scheme under the umbrella of thermomechanics and how the viewpoints of the main protagonists (Mandel, Kroener, Noll, Eshelby) find their best combined expression in this powerful unity.

A. Along the finite-deformation line (left column in Flow chart 1, Figure 1), following the natural notion of composition of maps in analysis, the main fruitful ingredient was the multiplicative decomposition of the deformation gradient into an elastic contribution and an anelastic one (neither of the two being separately integrable into a displacement), originally by the UK group of Bilby et al. [14] and Kroener and Seeger [15, 16]. This may have been anticipated by rheologists (Green and Tobolsky, 1940s) but for exactly integrable members of the decomposition. The geometrical line (central column in Flow chart 1) was connected with this initially. But the finite-strain theory of anelasticity stayed dormant until the late 1960s when it was revived by Lee [17] and co-workers. From our viewpoint, however, definite progress was made by Mandel [18] when he showed that what is now referred to as the “Mandel stress” [19], expressed in the so-called elastically released or “intermediate” configuration – between the material one and the actual one – is the driving force behind anelasticity. The introduction of the “intermediate” configuration is intimately – we should say, in duality – related to that of multiplicative decomposition of the finite deformation; cf. [20].
Sidoroff [21] has shown how the richness of the phenomenological description of finite-strain viscoelasticity is enhanced by the decomposition in multiple factors (more than two), introducing thus a series of “intermediate” configurations. So much for this line.

B. Along the geometrical line (central column in Flow chart 1), we find works by scientists who were greatly influenced by mathematical physics, particularly the geometrical theory of gravitation of A. Einstein known as the general theory of relativity. Kondo [22, 23] in Japan was the first to infuse such ideas in continuum mechanics. But the group of Bilby et al. in the UK and E. Kroener and A. Seeger in Germany soon took over this line. In particular, introducing the notion of incompatibility tensor [1] to describe mathematically the lack of unique determination of the elastic displacement in continuously dislocated bodies, Kroener [15] took a decisive step, as he could then relate the density of dislocations (one type of “elastic” defect) to the geometry of the material manifold (non-vanishing curvature). At this point ideas of T. Levi-Civita and E. Cartan on (geometrical) connections, torsion, and distant parallelism entered the scene. This was most forcefully implemented by Noll [9] (see also [10, 11]) in landmark papers. But these authors, fruitful and deep as their research was, did not really propose a relationship between a driving force and the geometrical background.

[Figure 1. Flow chart 1: three columns headed Finite Deformation Line, Geometrical Line and Configurational-force Line, each starting in the 1950s and tracing the contributions discussed in this section (multiplicative decomposition, Lee, Mandel and the Mandel stress; Kondo, Bilby et al., Kroener, Noll, Wang and gauge theory; Peach–Koehler, Eshelby, Rogula, Golebiewska, Herrmann, Kienzler), converging through the structural rearrangements of Epstein and G.A.M., $\mathrm{div}_R\,\mathbf{b} = \mathbf{b}:\boldsymbol{\Gamma}$ and $\mathrm{div}_R\,\mathbf{b} + \mathbf{f}^{inh} = \mathbf{0}$, on the Eshelby stress and mechanics on the material manifold.]

C. The third line (right column in Flow chart 1) is that initially developed by Peach and Koehler [24] and Eshelby [25], who established the expression of the driving force (not a Newtonian force acting per unit of matter) on a singularity line (dislocation line) and on a material inhomogeneity, respectively. The celebrated J-integral of fracture (force on a crack tip [26]) is also such a force. Eshelby found that this type of “force” is related to the divergence of a peculiar stress tensor, which he identified as the spatial part of what was known as the energy-momentum tensor in field theories [27]. This is now referred to as the Eshelby stress tensor in honor of this great scientist. However, late in the 1960s, Rogula [28] and the author [29], then relating to studies in general relativistic continuum mechanics, found it convenient to emphasize the duality between projections of the equations of continuum mechanics, whether in physical space or directly onto the material manifold. It seems that this viewpoint was exported to the USA by Golebiewska [30] in the late 1970s, who initiated a trend followed by Herrmann and his co-workers with efficient applications to the strength of materials of structural members [31]. The configurational-force line is also exposed in some detail in [32].

Epstein and Maugin [8] (also many subsequent papers by these authors; in particular, some synthesis works [5, 6]), working entirely in material space and exploiting the ideas of Noll but pursuing them to a logical end, combined lines C and B and got the final unifying result: the Eshelby material stress is indeed fed by all types of material inhomogeneities and field singularities (defects). This is shown by establishing the material balance law in which the Eshelby stress is the flux.
This is the fully material balance law missed by Noll and Wang, which represents equilibrium, or dynamics, among all types of inhomogeneities. This establishes the relationship between the geometrical and configurational-force lines. Furthermore it happens that the above-mentioned Mandel stress is none other than an easily identified part of the Eshelby stress. All this they achieved by exploiting the notion of inhomogeneity map, or material transplant (with a biophysical connotation) or, still, local structural rearrangement. This is shown in Section 4 below after an introduction to canonical balance laws in Section 3. A different line of thought was pursued by Gurtin [33–35] and some of his co-workers, with a special interest in interface phenomena.

3. Canonical Balance Laws

These are the fundamental balance laws of thermomechanics (momentum and energy) expressed intrinsically in terms of a good space-time parametrization. In a relativistic background this would be the conservation – or lack of conservation – of the canonical energy-momentum tensor [27], first spelled out in 1915 and 1918 by David Hilbert and Emmy Noether on a variational basis. Here we do not
appeal to any variational formulation as we consider the case of finitely deformable dissipative media which may conduct heat, a case of general interest. In modern continuum mechanics, we account for a variety of microscopic phenomena responsible for macroscopic dissipation through the notion of internal variables of state (review of this notion in [7]). The essential property of these variables is that they cannot be controlled directly by external stimuli, so that they expend power only in the bulk in the form of dissipation. We denote these variables collectively by α; their choice and tensorial nature depend on the physical acumen of the theoretician helped by the experimentalist who should uncover the most representative variables (e.g., density of dislocation, plastic strain, work-hardening variables, etc., as we know now). Such a thermodynamical framework is particularly powerful in the study of the anelastic behavior of solid-like materials. To formulate a correspondingly general theory it is remarkable that it is sufficient, to start with, to cast the theory of finite-strain thermoelasticity in a form where the free energy is also taken to be a function of the internal variable of state (by the very nature of these new variables being internal, we do not need to introduce new kinetic notions – inertial forces in the bulk or applied forces at the boundary). The only initial change is the introduction of α, in addition to the deformation gradient F and the absolute temperature θ, in the list of functional arguments of the free energy, e.g.,

$$W = \bar{W}(\mathbf{F}, \theta, \alpha; \mathbf{X}) \tag{3.1}$$

for an anisotropic, possibly anelastically inhomogeneous material in finite strains, whose basic behavior is elastic but which may present combined anelasticity. Here, W is the free energy density per unit volume in the global reference configuration $K_R$ of a material body B. The function $\bar{W}$ may depend on F only through another quantity such as an element of a multiplicative decomposition while α itself may contain another element of this decomposition (case of finite-strain elasto-plasticity and elasto-viscoplasticity). In addition, $\bar{W}$ is supposed to depend explicitly on the material point X, i.e., the material may be smoothly materially inhomogeneous (assuming the function sufficiently smooth in all of its arguments to allow for analytic manipulations). Equation (3.1) corresponds to a so-called first-order gradient theory with respect to the deformation (so-called simple materials in Noll’s classification) but not for internal variables. Higher-order gradients of both fields yielding scale effects are dealt with by Maugin and Trimarco [36] (also in [5, Section 5.8]) on a variational basis for the deformation and [37] for internal variables in the dissipative case.

The so-called laws of (thermodynamical) state, given by the partial derivatives of the function $\bar{W}$ in (3.1) with respect to its first three arguments, are:

$$\mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}}, \qquad S = -\frac{\partial \bar{W}}{\partial \theta}, \qquad A = -\frac{\partial \bar{W}}{\partial \alpha}. \tag{3.2}$$

These are, respectively, the first Piola–Kirchhoff stress, the entropy density (according to the axiom of local thermodynamical state [7]), and the thermodynamical
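force associated to α. The quantities F, θ and α are fields, thus depending on the material point X and the Newtonian time t. Let $\rho_0(\mathbf{X})$ be the matter density at $K_R$. Then at any regular material point in the body B, we have the following balance equations for mass, linear momentum, and energy [7]:

$$\left.\frac{\partial \rho_0}{\partial t}\right|_{X} = 0, \tag{3.3}$$

$$\left.\frac{\partial \mathbf{p}}{\partial t}\right|_{X} - \mathrm{div}_R\,\mathbf{T} = \mathbf{0}, \tag{3.4}$$

$$\left.\frac{\partial (K + E)}{\partial t}\right|_{X} - \nabla_R \cdot (\mathbf{T}\cdot\mathbf{v} - \mathbf{Q}) = 0. \tag{3.5}$$

These equations are presented here in the so-called Piola–Kirchhoff formulation, with an (X, t) space–time parametrization, but the components of equation (3.4) are still in physical space, so that it is not an intrinsic formulation. We remind the reader of the following definitions:

$$\mathbf{x} = \chi(\mathbf{X}, t), \qquad \mathbf{v} = \left.\frac{\partial \chi}{\partial t}\right|_{X}, \tag{3.6}$$

$$\mathbf{F} = \left.\frac{\partial \chi}{\partial \mathbf{X}}\right|_{t} = \nabla_R \chi, \tag{3.7}$$

$$\mathbf{p} = \rho_0 \mathbf{v}, \qquad K = \frac{1}{2}\rho_0 \mathbf{v}^2, \tag{3.8}$$

$$E = W + S\theta. \tag{3.9}$$

This last quantity is the internal energy per unit reference volume; v and p are called the physical velocity and linear momentum, respectively; Q is the material heat flux, i.e., the heat influx per unit material surface. Equations (3.3)–(3.5) are strict conservation laws (no source terms), because we assume, for the sake of simplicity, that there is neither an external body force acting nor an energy input per unit volume. In these conditions the entropy equation and the dissipation inequality read

$$\theta \left.\frac{\partial S}{\partial t}\right|_{X} + \nabla_R \cdot \mathbf{Q} = \Phi_{intr}, \qquad \Phi_{intr} := A\dot{\alpha}, \qquad \dot{\alpha} \equiv \left.\frac{\partial \alpha}{\partial t}\right|_{X}, \tag{3.10}$$

and

$$\sigma_B = \theta^{-1}\left(\Phi_{intr} - \mathbf{S}\cdot\nabla_R\theta\right) \geq 0, \qquad \mathbf{S} \equiv \frac{\mathbf{Q}}{\theta}, \tag{3.11}$$

with the continuity condition

$$\mathbf{Q}(\mathbf{F}, \theta, \alpha; \nabla_R\theta; \mathbf{X}) \to \mathbf{0} \quad \text{as} \quad \nabla_R\theta \to \mathbf{0}. \tag{3.12}$$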
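The entropy equation (3.10)$_1$ is quoted without its derivation. The following is a sketch of how it can be recovered from (3.3)–(3.5), (3.8), (3.9) and the laws of state (3.2), assuming smooth fields. Superposed dots stand for $\partial/\partial t$ at fixed X, and the double-dot contraction $\mathbf{T}:\dot{\mathbf{F}}$ is a notation introduced here for the full contraction of T with the rate of F; it is not used in the text.

```latex
% Sketch only: from the energy balance (3.5) to the entropy equation (3.10)_1.
% Assumptions: smooth fields; dots are time derivatives at fixed X;
% T : \dot F denotes the full contraction (notation introduced for this sketch).
\begin{align*}
\dot K &= \rho_0\,\mathbf{v}\cdot\dot{\mathbf{v}}
        = \mathbf{v}\cdot\dot{\mathbf{p}}
        = \mathbf{v}\cdot\mathrm{div}_R\mathbf{T}
        && \text{by (3.3), (3.8), (3.4)} \\
\dot W &= \mathbf{T}:\dot{\mathbf{F}} - S\,\dot\theta - A\,\dot\alpha
        && \text{by (3.1), (3.2) at fixed } \mathbf{X} \\
\dot E &= \dot W + \dot S\,\theta + S\,\dot\theta
        = \mathbf{T}:\dot{\mathbf{F}} + \theta\,\dot S - A\,\dot\alpha
        && \text{by (3.9)} \\
\nabla_R\cdot(\mathbf{T}\cdot\mathbf{v})
       &= \mathbf{v}\cdot\mathrm{div}_R\mathbf{T} + \mathbf{T}:\dot{\mathbf{F}}
        && \text{since } \dot{\mathbf{F}} = \nabla_R\mathbf{v} \\
0 &= \dot K + \dot E - \nabla_R\cdot(\mathbf{T}\cdot\mathbf{v}) + \nabla_R\cdot\mathbf{Q}
   = \theta\,\dot S + \nabla_R\cdot\mathbf{Q} - A\,\dot\alpha
        && \text{by (3.5)}
\end{align*}
```

The last line is (3.10)$_1$ with $\Phi_{intr} = A\dot{\alpha}$: the mechanical power terms cancel, and only the internal-variable term survives as a source.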
CANONICAL EQUATION OF LINEAR MOMENTUM
This is obtained by projecting canonically equation (3.4) onto the material manifold $M^3$ of points X constituting the body. In turn this is effected simply by
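applying F to the right to equation (3.4) and taking account of the functional dependence (3.1) and that of $\rho_0$. The now classical result is

$$\left.\frac{\partial \mathbf{P}}{\partial t}\right|_{X} - \left(\mathrm{div}_R\,\mathbf{b} + \mathbf{f}^{inh}\right) = \mathbf{f}^{th} + \mathbf{f}^{intr}, \tag{3.13}$$

where we have introduced the canonical momentum P (a co-vector on $M^3$), a density of “Lagrangian function” L with a superscript th indicating that this is evaluated with the free energy, the (fully material but mixed) Eshelby stress tensor b, and three material forces due respectively to true material inhomogeneities, thermal effects, and intrinsic dissipative effects represented by α:

$$\mathbf{P} = -\mathbf{p}\cdot\mathbf{F} = \rho_0\,\mathbf{C}\cdot\mathbf{V}, \tag{3.14}$$

$$L = L^{th} = K - W, \tag{3.15}$$

$$\mathbf{b} = -\left(L^{th}\,\mathbf{1}_R + \mathbf{T}\cdot\mathbf{F}\right), \tag{3.16}$$

$$\mathbf{f}^{inh} = \left.\frac{\partial L^{th}}{\partial \mathbf{X}}\right|_{expl}; \qquad \mathbf{f}^{th} = S\,\nabla_R\theta, \qquad \mathbf{f}^{intr} = A\,(\nabla_R\alpha)^T. \tag{3.17}$$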
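Since (3.13) is presented as "the now classical result", the intermediate steps may be worth recording. The sketch below works in components under assumptions that are not spelled out in the text: $p_i$ and $v^i$ are spatial components, $F^i_{\ K}$ and $T^K_{\ i}$ are two-point components, $(\mathbf{T}\cdot\mathbf{F})^K_{\ L} = T^K_{\ i}F^i_{\ L}$, $(\mathrm{div}_R\,\mathbf{b})_L = b^K_{\ L,K}$, a comma denotes $\partial/\partial X^K$, and A contracts fully with the gradient of α.

```latex
% Sketch only: projection of (3.4) onto the material manifold, giving (3.13).
% Index placement and contraction conventions are assumptions of this sketch.
\begin{align*}
0 &= F^i_{\ L}\left(\dot p_i - T^K_{\ i,K}\right)
   && \text{(3.4) contracted with } \mathbf{F} \\
F^i_{\ L}\,\dot p_i
  &= \partial_t\!\left(p_i F^i_{\ L}\right) - p_i\,v^i_{\ ,L}
   = -\dot P_L - K_{,L} + \tfrac12 \mathbf{v}^2 \rho_{0,L}
   && \text{by (3.14), (3.8), } \dot F^i_{\ L} = v^i_{\ ,L} \\
F^i_{\ L}\,T^K_{\ i,K}
  &= \left(T^K_{\ i}F^i_{\ L}\right)_{,K} - T^K_{\ i}\,F^i_{\ K,L}
   && \text{since } F^i_{\ L,K} = F^i_{\ K,L} \\
W_{,L}
  &= T^K_{\ i}\,F^i_{\ K,L} - S\,\theta_{,L} - A\,\alpha_{,L}
     + \left.\frac{\partial \bar W}{\partial X^L}\right|_{expl}
   && \text{by (3.1), (3.2)} \\
\dot P_L + \left[(K - W)\,\delta^K_{\ L} + T^K_{\ i}F^i_{\ L}\right]_{,K}
  &= \tfrac12 \mathbf{v}^2 \rho_{0,L}
     - \left.\frac{\partial \bar W}{\partial X^L}\right|_{expl}
     + S\,\theta_{,L} + A\,\alpha_{,L}
   && \text{combining the lines above}
\end{align*}
```

With $b^K_{\ L} = -\left(L^{th}\delta^K_{\ L} + T^K_{\ i}F^i_{\ L}\right)$ from (3.15) and (3.16), the bracket on the left is $-b^K_{\ L}$, so the last line is (3.13). The first two terms on the right are the explicit gradient of $L^{th} = K - W$, that is $\mathbf{f}^{inh}$ of (3.17)$_1$, and the remaining two are $\mathbf{f}^{th}$ and $\mathbf{f}^{intr}$.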
In equation (3.14), C is the Cauchy–Green finite strain and V is the material velocity (a contravector on the material manifold) defined by

$$\mathbf{C} = \mathbf{F}^T\cdot\mathbf{F}, \qquad \mathbf{V} = \left.\frac{\partial \chi^{-1}}{\partial t}\right|_{x} = -\mathbf{F}^{-1}\cdot\mathbf{v}. \tag{3.18}$$

In the first of equations (3.17), the explicit material gradient is computed by keeping the fields fixed, i.e.,

$$\mathbf{f}^{inh} = \frac{1}{2}(\nabla_R\rho_0)\,\mathbf{v}^2 - \left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{\mathbf{F},\,\theta,\,\alpha\ \text{fixed}}. \tag{3.19}$$

At all regular material points X equation (3.13) is a differential identity deduced from equation (3.4). An embryonic form of this equation for the case of statics in a purely hyperelastic homogeneous body in the absence of applied force may be found in [38] – we referred to this as Ericksen’s identity [5, pp. 76–77] (for other such “Ericksen–Noether identities” in other field theories see [39]). According to (3.19) the “force” $\mathbf{f}^{inh}$ indeed captures the explicit X-dependency and deserves its name of material force of inhomogeneity or, for short, inhomogeneity force. This is the first cause for the momentum equation (3.13) to be inhomogeneous (i.e., to have a source term) while the original momentum equation (3.4), in physical space, is a true conservation law. What is more surprising is that a spatially nonuniform state of temperature ($\nabla_R\theta \neq \mathbf{0}$) causes a similar effect, i.e., the material thermal force $\mathbf{f}^{th}$ acts just like a true material inhomogeneity in so far as the balance of canonical (material) momentum is concerned [40]. It seems that Bui [41] was the first to uncover such a thermal term while studying fracture, although in the small-strain framework and not in the material setting. Finally, any
internal variable of state α that has not reached a spatially uniform state at point X, $\nabla_R\alpha \neq \mathbf{0}$, has a similar effect in the equation of canonical momentum through the intrinsic material force $\mathbf{f}^{intr}$ [37]. We call such material forces material forces of quasi- or pseudo-inhomogeneity. Note that any additional variable put in the functional dependency of the free energy W will cause a similar effect. It is only in the pure materially homogeneous elastic case (W depending only on F) that the balance of canonical momentum is also a strict conservation law. For instance, in a spatially nonuniformly magnetized elastic material, with material magnetization density m per unit volume of the reference configuration, we shall have a material magnetic force with expression

$$\mathbf{f}^{magn} = \mathbf{H}^L\cdot(\nabla_R\mathbf{m})^T, \qquad \mathbf{H}^L = -\frac{\partial W}{\partial \mathbf{m}}, \tag{3.20}$$
where $\mathbf{H}^L$ is the so-called local magnetic field – in material form – of ferromagnetism [42]. Formulas (3.20) strictly apply to the case of soft ferromagnets only (no magnetic hysteresis). In a hard ferromagnet with magnetic ordering (micromagnetics) the contributions (3.20) will be replaced by

$$\mathbf{f}^{ferro} = \mathbf{H}^{eff}\cdot(\nabla_R\mathbf{m})^T, \qquad \mathbf{H}^{eff} = -\frac{\delta W^{tot}}{\delta \mathbf{m}}, \tag{3.20}$$
where δ/δm is an Euler–Lagrange functional derivative, and $W^{tot}$ is the total potential energy including elastic, magnetoelastic, exchange, magnetic-anisotropy, magnetic doublet, and demagnetizing energies [43]. The very expression (3.20)$_1$ accounts for the gyroscopic nature of the magnetic spin so that there is simultaneously no explicit contribution of magnetic spin to the volume kinetic energy K. In the presence of spin-lattice relaxation (Gilbert’s spin “viscosity” generalized to the deformable framework), there exists an additional material force due to this effect (see [43]).

One could be tempted to consider equations (3.13) and (3.5) as the canonical equations of momentum and energy. But the second of these, in its form (3.5), does not show much in common with (3.13). The reason is that, whatever we try (see below), the latter can never be transformed into a strict conservation law. It is therefore equation (3.5) which must be transformed in order to exhibit a structure similar to that of equation (3.13). For this it is sufficient to remember that (3.10)$_1$ is but a transformed form of the energy equation. Manipulating the first term, we can write the latter equation as [44, 45]

$$\left.\frac{\partial (S\theta)}{\partial t}\right|_{X} + \nabla_R\cdot\mathbf{Q} = \Phi_{th} + \Phi_{intr}, \qquad \Phi_{th} := S\left.\frac{\partial \theta}{\partial t}\right|_{X}. \tag{3.21}$$

The similarity between variables α and θ is thus enhanced due to the very analogous space and time structure of the right-hand sides of equations (3.13) and (3.21)$_1$; but while the second variable is governed by the heat equation, the first one has to be governed by a pure evolution equation subjected to the second
law of thermodynamics (non-negative dissipation). Equations (3.13) and (3.21)$_1$ now clearly appear as the spacelike and timelike components of a unique four-dimensional canonical balance of momentum and energy. The remarkable fact, however, is that the fourth (timelike) component of the four-dimensional canonical momentum that could be introduced is neither the free nor the internal energy density but the difference between the two. This is in agreement with the relativistic formulation of thermoelasticity (neither with true inhomogeneities nor with any pseudo-inhomogeneities of any kind) of Kijowski and Magli [46]. This hints at a true analytical mechanics of dissipative continua.

REMARK. (On Legendre–Fenchel transforms of the energy density). At regular material points equation (3.13) is an identity deduced for smooth fields from equation (3.4). As such we can arrange the contributions to the left- and right-hand sides at will. But, whatever we do, we cannot reformulate the material forces as exact time and space derivatives in the left-hand side, even by a clever redefinition of some quantities. There will always remain source terms. For instance, considering the case of quasi-statics (neglect of inertial quantities) in order to simplify the writing, equation (3.13) will read

$$\mathrm{div}_R\,\mathbf{b} + \mathbf{f}^{inh} + \mathbf{f}^{th} + \mathbf{f}^{intr} = \mathbf{0} \tag{3.22}$$

with

$$\mathbf{f}^{inh} = -\left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{expl}; \qquad \mathbf{f}^{th} = S\,\nabla_R\theta, \qquad \mathbf{f}^{intr} = A\,(\nabla_R\alpha)^T. \tag{3.23}$$
The expressions (3.23) go along with the fact that it is the free energy from which b is now defined:

$$\mathbf{b} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}. \tag{3.24}$$

We shall normally use a notational device to emphasize this fact; thus we shall write $\mathbf{b}_W$ for this b. Usually the function $\bar{W}$ is assumed to be concave in the variable θ. A typical Legendre–Fenchel transformation of this energy density which conserves the property of convexity and the degree of mathematical homogeneity is given by (no pure material inhomogeneities here)

$$E(\mathbf{F}, S, \alpha) + \left(-W(\mathbf{F}, \theta, \alpha)\right) = S\theta, \qquad \theta = \frac{\partial E}{\partial S} > 0, \tag{3.25}$$

of which the first expresses Young’s equality for conjugate functions in convex analysis at fixed F and α, and the second provides the positive temperature θ, indicating that the internal energy E is an ever increasing function of entropy.
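For illustrative purposes, consider the simple case of quasi-statics in the absence of internal variable α (materially homogeneous, purely thermoelastic body). Equation (3.22) takes on the form

$$\mathrm{div}_R\,\mathbf{b}_W + \mathbf{f}^{th}_{\theta} = \mathbf{0}, \qquad \mathbf{b}_W := W(\mathbf{F}, \theta)\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}, \qquad \mathbf{f}^{th}_{\theta} := S\,\nabla_R\theta. \tag{3.26}$$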
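But with the Legendre–Fenchel transform (3.25) this can as well be written as

$$\mathrm{div}_R\,\mathbf{b}_E + \mathbf{f}^{th}_{S} = \mathbf{0}, \qquad \mathbf{b}_E := E(\mathbf{F}, S)\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}, \qquad \mathbf{f}^{th}_{S} := -\theta\,\nabla_R S. \tag{3.27}$$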
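The equivalence of (3.26) and (3.27) can be checked in two lines. The sketch below assumes smooth fields and uses the standard property of the Legendre transform (3.25) that the stress T is the same whether computed from W at fixed θ or from E at fixed S.

```latex
% Sketch only: (3.27) follows from (3.26) and the transform (3.25).
% Assumption: T is the same in b_W and b_E (Legendre-transform property).
\begin{align*}
\mathbf{b}_E &= E\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}
              = \mathbf{b}_W + S\theta\,\mathbf{1}_R
              && \text{by (3.25)}_1 \\
\mathrm{div}_R\,\mathbf{b}_E
             &= \mathrm{div}_R\,\mathbf{b}_W + \nabla_R(S\theta)
              = \mathrm{div}_R\,\mathbf{b}_W + S\,\nabla_R\theta + \theta\,\nabla_R S \\
\mathrm{div}_R\,\mathbf{b}_E + \mathbf{f}^{th}_S
             &= \mathrm{div}_R\,\mathbf{b}_W + S\,\nabla_R\theta
              = \mathrm{div}_R\,\mathbf{b}_W + \mathbf{f}^{th}_\theta = \mathbf{0}
              && \text{by (3.26)}
\end{align*}
```

The source term merely moves between the Eshelby stress and the thermal force: under isothermal conditions $\mathbf{f}^{th}_\theta$ vanishes and $\mathbf{b}_W$ is divergence-free, while under isentropic conditions the same holds for $\mathbf{f}^{th}_S$ and $\mathbf{b}_E$.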
The notation with subscripts (W, θ) and (E, S) is clear and consistent. As in the rest of thermomechanics, the choice of exploiting either (3.26)$_1$ or (3.27)$_1$ depends on the thermodynamical situation at hand (i.e., isothermal situation or adiabatic conditions, or isentropic conditions). This choice becomes essential in treating problems involving singularities (crack extension) or transition layers such as phase-transition interfaces (essentially homothermal singular surfaces) or more classical shock waves (singular surfaces exhibiting a growth of entropy across, but often assumed to connect two material regions in adiabatic regime). The reason for this is that, while (3.26) and (3.27) or their more general form in dynamics and with real inhomogeneities and dissipative processes are mathematical identities at regular material points, they do provide the expression of the driving force (originally a material or configurational force) acting at such singular points (case of a crack) or at such surfaces via contour integrals or the jump equation associated to the balance law of material momentum, of which (3.26)$_1$ and (3.27)$_1$ are specialized equilibrium forms. The problem of selecting the appropriate energy density in the Eshelby stress tensor, and accordingly the expression of the material forces due to thermal and other dissipative effects, deserves special attention (this was remarked upon by Abeyaratne and Knowles [47] and the author [48]).

4. Local Structural Rearrangements and Material Transplants

4.1. TRUE MATERIAL INHOMOGENEITIES

In order to make ideas clear let us consider the case of quasi-statics in the absence of body force with an elastic energy density given by $W = \bar{W}(\mathbf{F}; \mathbf{X})$ per unit reference volume. In this case equations (3.4) and (3.22) reduce to

$$\mathrm{div}_R\,\mathbf{T} = \mathbf{0}, \qquad \mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}}, \tag{4.1}$$

and

$$\mathrm{div}_R\,\mathbf{b} + \mathbf{f}^{inh} = \mathbf{0}, \qquad \mathbf{f}^{inh} = -\left.\frac{\partial \bar{W}}{\partial \mathbf{X}}\right|_{expl}, \qquad \mathbf{b} = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}. \tag{4.2}$$
But following Epstein and Maugin [8], we consider (as a thought experiment) the case where the material inhomogeneity can be artificially removed at each material point X, by effecting a point-dependent change of reference configuration. The reference change is therefore local and generally not integrable over the whole body. Such a change is called a local structural rearrangement. This is in the line of Noll’s original idea of uniformity [9]. Let P(X) denote this reference change (note that P here is not to be mistaken for the canonical momentum which does
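not appear in this section), which brings a neighborhood of X into the so-called crystal reference. This is performed modulo the material symmetry [8, 49] so that when we account for the accompanying volume change $J_P = \det\mathbf{P}$, P combines multiplicatively to the right with F and, for energies, we can write

$$W = \bar{W}(\mathbf{F}; \mathbf{X}) = J_P^{-1}\,\hat{W}(\mathbf{F}\mathbf{P}(\mathbf{X})) = \tilde{W}(\mathbf{F}, \mathbf{P}). \tag{4.3}$$

Obviously we can compute the partial derivatives of the last mentioned function $\tilde{W}$, obtaining thus, as easily checked,

$$\mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}} = \frac{\partial \tilde{W}}{\partial \mathbf{F}}, \qquad \tilde{\mathbf{b}} = -\frac{\partial \tilde{W}}{\partial \mathbf{P}} = -(\mathbf{T}\cdot\mathbf{F} - W\,\mathbf{1}_R)\cdot\mathbf{P}^{-T}. \tag{4.4}$$

Accordingly,

$$\mathbf{b} \equiv \tilde{\mathbf{b}}\cdot\mathbf{P}^T = -\frac{\partial \tilde{W}}{\partial \mathbf{P}}\cdot\mathbf{P}^T \equiv W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}. \tag{4.5}$$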
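The step described as "easily checked" can be written out in components. The sketch below rests on index conventions assumed here rather than taken from the text: Greek indices refer to the crystal reference, $G^i_{\ \alpha} = F^i_{\ K}P^K_{\ \alpha}$ denotes the components of FP (the symbol G is introduced only for this sketch), $T^K_{\ i} = \partial\tilde{W}/\partial F^i_{\ K}$, and $b^L_{\ K} = W\delta^L_{\ K} - T^L_{\ i}F^i_{\ K}$.

```latex
% Sketch only: component check of (4.4) and (4.5).
% G = F P and all index placements are assumptions of this sketch.
\begin{align*}
\tilde W(\mathbf{F},\mathbf{P}) &= J_P^{-1}\,\hat W(\mathbf{G}),
  \qquad G^i_{\ \alpha} = F^i_{\ K}P^K_{\ \alpha}
  && \text{by (4.3)} \\
T^K_{\ i} = \frac{\partial \tilde W}{\partial F^i_{\ K}}
  &= J_P^{-1}\,\frac{\partial \hat W}{\partial G^i_{\ \alpha}}\,P^K_{\ \alpha}
  \;\Longrightarrow\;
  J_P^{-1}\,\frac{\partial \hat W}{\partial G^i_{\ \alpha}}
  = T^L_{\ i}\,(P^{-1})^\alpha_{\ L} \\
\frac{\partial J_P^{-1}}{\partial P^K_{\ \alpha}}
  &= -J_P^{-1}\,(P^{-1})^\alpha_{\ K}
  && \text{derivative of a determinant} \\
-\frac{\partial \tilde W}{\partial P^K_{\ \alpha}}
  &= W\,(P^{-1})^\alpha_{\ K} - T^L_{\ i}F^i_{\ K}\,(P^{-1})^\alpha_{\ L}
   = b^L_{\ K}\,(P^{-1})^\alpha_{\ L} \\
-\frac{\partial \tilde W}{\partial P^K_{\ \alpha}}\,P^M_{\ \alpha}
  &= b^L_{\ K}\,\delta^M_{\ L} = b^M_{\ K}
\end{align*}
```

The fourth line is the component form of (4.4)$_2$ and the fifth that of (4.5). The rearrangement P drops out of the final expression, which is why the last member of (4.5) no longer refers to it.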
This provides an elegant geometrical definition of the quasi-static Eshelby stress b (originally referred to as the energy-momentum or Maxwell stress by Eshelby) via the notion of local structural rearrangement, although the final expression in (4.5) no longer refers to this rearrangement. It is just the same as that given in the last of (4.2). Assuming that we just know (4.1)$_1$ at all regular points X, $J_F := \det\mathbf{F} > 0$, we can then compute the material divergence of b resulting in (4.2)$_{1-2}$. But we can also express the material co-vector $\mathbf{f}^{inh}$ through the operation

$$\left.\frac{\partial \tilde{W}(\mathbf{F}, \mathbf{P})}{\partial \mathbf{X}}\right|_{expl} = (\nabla_R\mathbf{P}) : \frac{\partial \tilde{W}}{\partial \mathbf{P}} = -(\nabla_R\mathbf{P}) : \mathbf{b}\cdot\mathbf{P}^{-T} = \mathbf{b} : \boldsymbol{\Gamma}, \tag{4.6}$$

where $\boldsymbol{\Gamma}$ is the (geometrical) connection based on the non-integrable mapping P; that is, in components (to avoid any misunderstanding):

$$\Gamma^A_{.BK} := -(P^{-1})^{\alpha}_{.B}\,P^A_{.\alpha,K}. \tag{4.7}$$

Therefore, equation (4.2)$_1$ also reads [8]

$$\mathrm{div}_R\,\mathbf{b} = \mathbf{b} : \boldsymbol{\Gamma}. \tag{4.8}$$
In some geometrical theories (Bilby, Kroener, Noll, Wang) of continuous distributions of dislocations, the connection $\boldsymbol{\Gamma}$ is directly related to the density of dislocations – the skew part of $\boldsymbol{\Gamma}$ is the torsion tensor and it is set equal to the skew tensor that represents the density of dislocations [5]. Accordingly, we can say that in such “continuously dislocated” elastic bodies, dislocations create a material force density which is responsible for the non-divergence-free nature of the Eshelby stress tensor. Dislocations, which originally are discrete defects, act thus as a materially distributed inhomogeneity force in agreement with equation (4.8). This is the equation that unifies, via (4.5), the two (geometrical and configurational) lines in Flow chart 1. We do not pursue further here this geometrical approach to
continuously distributed defects (see more on differential geometry, the notions of material uniformity and homogeneity, the role of material symmetry groups, crystallographic basis, transplants, G-structure, and G-covariance in [49]). For reasons to become clear, the mapping P may also be referred to as a material transplant [50]. We note that the unification represented by (4.8) is now followed by several authors [51–53]. In any case the notion of dislocation density happens here to be connected to that of local structural rearrangements. These are local and we cannot fit the reference crystal pieces together so that this description necessarily yields the notion of internal stresses, but the really new point is the relationship to the Eshelby stress tensor.

4.2. THE CASE OF DISSIPATIVE MEDIA WITH INTERNAL VARIABLES

The mental operation just performed to account for true material inhomogeneities can also be performed for thermal effects and those due to the presence of internal variables of state (whatever their peculiar tensorial character). Because internal variables are of a more general nature than temperature, we consider by way of example the case of a materially homogeneous anelastic material treated in quasi-statics (still to remain as simple as possible). Then equations (4.1) apply while equations (4.2) are replaced by

$$\mathrm{div}_R\,\mathbf{b} + \mathbf{f}^{intr} = \mathbf{0}; \qquad \mathbf{f}^{intr} = A\,(\nabla_R\alpha)^T, \qquad A = -\frac{\partial \bar{W}(\mathbf{F}, \alpha)}{\partial \alpha}. \tag{4.9}$$

Now let us envisage the following mental operation. Consider that it is possible, by the appropriate local change of reference P, to make the material appear as purely elastic at point X. This means that the new energy function at X will depend only on a finite strain and no other argument, becoming indeed a function $\hat{W}(\mathbf{F}\mathbf{P}(\alpha(\mathbf{X}, t)))$. But this is now per unit volume of the new local reference configuration so that, accounting for the volume change, we should write (compare with equation (4.3))

$$W = \bar{W}(\mathbf{F}, \alpha) = J_P^{-1}\,\hat{W}\left(\mathbf{F}\mathbf{P}(\alpha(\mathbf{X}, t))\right) = \tilde{W}(\mathbf{F}, \mathbf{P}(\alpha)). \tag{4.10}$$

The same reasoning as in the previous paragraph yields

$$\mathbf{T} = \frac{\partial \bar{W}}{\partial \mathbf{F}} = \frac{\partial \tilde{W}}{\partial \mathbf{F}}; \qquad \mathbf{b} = -\frac{\partial \tilde{W}}{\partial \mathbf{P}}\cdot\mathbf{P}^T = W\,\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F} = \tilde{\mathbf{b}}\cdot\mathbf{P}^T. \tag{4.11}$$

By introducing the local reference change or local rearrangement P(α(X)), we have in some way “subtracted” the anelastic behavior of the material at X. Now on account of the functional dependences (4.10) we can also evaluate the “force” A and obtain the following equality between the “thermodynamic” definition of A and a kind of “geometrical” definition (via P):

$$A = \mathbf{b}\cdot\mathbf{P}^{-T}\cdot\frac{\partial \mathbf{P}^T}{\partial \alpha}, \tag{4.12}$$
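where the free indices are on A and α, i.e., in components, (4.12) reads

$$A = b^K_{.L}\,(P^{-1})^{\gamma}_{.K}\,\frac{\partial P^L_{.\gamma}}{\partial \alpha}. \tag{4.12a}$$

Obviously, this is the same as

$$A = -\mathbf{b}\cdot\mathbf{P}\,\frac{\partial \mathbf{P}^{-1}}{\partial \alpha}. \tag{4.13}$$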
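The passage from (4.12a) to (4.13) uses only the derivative of the identity $\mathbf{P}\mathbf{P}^{-1} = \mathbf{1}$. A sketch in components follows, with the index placement of (4.12a) and the reading of (4.13) as a full contraction, which is an assumption made here.

```latex
% Sketch only: (4.12a) implies (4.13).
% The component reading of (4.13) in the last line is an assumption.
\begin{align*}
P^L_{.\delta}\,(P^{-1})^{\delta}_{.M} &= \delta^L_{.M}
  \;\Longrightarrow\;
  \frac{\partial P^L_{.\gamma}}{\partial\alpha}
  = -P^L_{.\delta}\,\frac{\partial (P^{-1})^{\delta}_{.M}}{\partial\alpha}\,P^M_{.\gamma} \\
A &= b^K_{.L}\,(P^{-1})^{\gamma}_{.K}\,\frac{\partial P^L_{.\gamma}}{\partial\alpha}
   = -b^K_{.L}\,P^L_{.\delta}\,
     \frac{\partial (P^{-1})^{\delta}_{.M}}{\partial\alpha}\,
     P^M_{.\gamma}\,(P^{-1})^{\gamma}_{.K} \\
  &= -b^K_{.L}\,P^L_{.\delta}\,\frac{\partial (P^{-1})^{\delta}_{.K}}{\partial\alpha}
\end{align*}
```

The last line is the component form of (4.13). It is the convenient form whenever the internal variable is naturally identified with $\mathbf{P}^{-1}$ rather than with P.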
4.3. THE CASE WHERE α IS AN ANELASTIC STRAIN Here we identify P with the inverse of the “anelastic” deformation gradient (in truth, not a gradient but a Pfaffian form) in a multiplicative decomposition of F as [17–19] F = Fe · Fp ,
(4.14)
where Fe is the elastic component and Fp is the anelastic one (in fact, the subscript p stands for plasticity). Hence P−1 ≡ Fp ,
FP = F · F−1 p ≡ Fe .
(4.15)
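The deformation $\mathbf{F}_p$ defines locally an elastically released configuration at material point $\mathbf{X}$. This is also called a (local) intermediate configuration $K_i$ [18, 20, 21]. Use can be made of formula (4.13) to find the thermodynamic force associated with $\alpha$ through the geometrical description. With $\alpha \equiv \mathbf{F}_p$, we immediately have

$$\mathbf{A} = -\mathbf{b}\cdot\mathbf{F}_p^{-T}. \tag{4.16}$$

Introducing the second Piola–Kirchhoff stress $\mathbf{S}$ and the “Mandel” stress $\mathbf{M}$ by

$$\mathbf{S} = \mathbf{T}\cdot\mathbf{F}^{-T}, \qquad \mathbf{M} = \mathbf{T}\cdot\mathbf{F} = \mathbf{S}\cdot\mathbf{F}^{T}\cdot\mathbf{F} = \mathbf{S}\cdot\mathbf{C}, \tag{4.17}$$

we obtain that

$$\mathbf{b} = W\mathbf{1}_R - \mathbf{M} \qquad\text{or}\qquad \mathbf{M} = W\mathbf{1}_R - \mathbf{b}. \tag{4.18}$$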
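
The algebra in (4.17) and (4.18) can be checked numerically. The following is only a sketch under stated assumptions: the compressible neo-Hookean energy, the parameter values `mu` and `lam`, and the function names are illustrative choices that do not come from the paper, and $\mathbf{T}$ is taken to be the transpose of the usual first Piola–Kirchhoff stress $\partial W/\partial\mathbf{F}$, which is the convention under which $\mathbf{S} = \mathbf{T}\cdot\mathbf{F}^{-T}$ is symmetric and $\mathbf{T}\cdot\mathbf{F} = \mathbf{S}\cdot\mathbf{C}$.

```python
import numpy as np

def neo_hookean_energy(F, mu=1.0, lam=1.0):
    """Illustrative compressible neo-Hookean energy W(F) (an assumption, not the paper's W)."""
    J = np.linalg.det(F)
    C = F.T @ F
    return 0.5 * mu * (np.trace(C) - 3.0) - mu * np.log(J) + 0.5 * lam * np.log(J) ** 2

def mandel_and_eshelby(F, mu=1.0, lam=1.0):
    """Return W, T, S, M, b for the illustrative energy above."""
    J = np.linalg.det(F)
    F_inv_T = np.linalg.inv(F).T
    # Standard first Piola-Kirchhoff stress dW/dF for this energy.
    dW_dF = mu * (F - F_inv_T) + lam * np.log(J) * F_inv_T
    T = dW_dF.T                      # assumed index convention for T (see lead-in)
    S = T @ F_inv_T                  # second Piola-Kirchhoff stress, S = T . F^{-T}
    M = T @ F                        # Mandel stress, M = T . F
    W = neo_hookean_energy(F, mu, lam)
    b = W * np.eye(3) - M            # quasi-static Eshelby stress, b = W 1_R - T . F
    return W, T, S, M, b

if __name__ == "__main__":
    F = np.array([[1.10, 0.20, 0.00],
                  [0.00, 0.90, 0.10],
                  [0.05, 0.00, 1.20]])
    W, T, S, M, b = mandel_and_eshelby(F)
    C = F.T @ F
    print("S symmetric:      ", np.allclose(S, S.T))
    print("M equals S.C:     ", np.allclose(M, S @ C))
    print("M equals W 1 - b: ", np.allclose(M, W * np.eye(3) - b))
```

The Mandel stress is not symmetric in general, so the check compares `M` with `S @ C` rather than testing symmetry of `M` itself.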
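As a consequence, equation (4.16) reads

$$\mathbf{A} = (\mathbf{M} - W\mathbf{1}_R)\cdot\mathbf{F}_p^{-T}. \tag{4.19}$$

Simultaneously, the intrinsic dissipation $\Phi^{\mathrm{intr}}$ takes on the following form:

$$\Phi^{\mathrm{intr}} = \mathbf{A}\,\dot{\alpha} = \mathrm{tr}\bigl\{(\mathbf{M} - W\mathbf{1}_R)\cdot\mathbf{F}_p^{-T}\cdot\dot{\mathbf{F}}_p^{T}\bigr\}, \tag{4.20}$$

or

$$\Phi^{\mathrm{intr}} = \mathrm{tr}\bigl\{(\mathbf{M} - W\mathbf{1}_R)\cdot(\mathbf{L}_R^{p})^{T}\bigr\}, \tag{4.21}$$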
where

$$\mathbf{L}_R^{p} = \dot{\mathbf{F}}_p\cdot\mathbf{F}_p^{-1} \tag{4.22}$$

is the “plastic finite-strain rate in the reference configuration $K_R$”. If the plastic deformation is assumed to be incompressible, then $\mathrm{tr}\,\mathbf{L}_R^{p} = 0$, and (4.21) reduces formally to

$$\Phi^{\mathrm{intr}} = \mathrm{tr}\bigl\{\mathbf{M}\cdot(\mathbf{L}_R^{p})^{T}\bigr\}. \tag{4.23}$$

In plain words this means that the Mandel stress is the driving force behind plasticity. The relation (4.23) can as well be expressed with geometrical objects pushed forward to the intermediate configuration $K_i$, a more usual formulation. The present argument clearly shows the unification of the three lines of thought of chart 1. Obviously, the Mandel stress is only one part of the Eshelby stress. The first relationship between finite-strain plasticity and the notion of Eshelby stress in an intermediate configuration was established by the author [54]; more on Eshelby stress and finite-strain elastoplasticity is to be found in other works [55, 56].

4.4. THE THERMOELASTIC CASE

Since the variables $\alpha$ and $\theta$ in preceding sections play a similar role in so far as materials mechanics is concerned, we may consider the case of finite-strain thermoelasticity by analogy with the anelastic case. That is, while in homogeneous thermoelastic materials the free energy is a priori a function $\bar{W}(\mathbf{F},\theta)$, we can think of a local rearrangement of matter at point $\mathbf{X}$ in such a way that after this local rearrangement the free energy depends only on one finite strain (just as in pure elasticity) and no other argument, so that it depends on temperature only through this rearrangement. To comply with the notation of other papers we call $\mathbf{P}(\theta(\mathbf{X},t)) = \mathbf{H}^{-1}(\theta)$ this local rearrangement so that, up to the notation, equation (4.10) delivers the following functional dependences:

$$W = \bar{W}(\mathbf{F},\theta) = J_{H}\,\hat{W}\bigl(\mathbf{F}\mathbf{H}^{-1}(\theta)\bigr) = \tilde{W}(\mathbf{F},\mathbf{H}^{-1}). \tag{4.24}$$

We copy directly equation (4.12) with the appropriate change in notation, obtaining thus a relationship between the thermodynamical definition of entropy and a geometrical-like definition via $\mathbf{H}$ (note that the $\mathbf{b}$ here is $\mathbf{b}_W$) [40]:

$$S = \mathbf{b}_W\cdot\mathbf{H}\cdot\frac{d\mathbf{H}^{-1}}{d\theta}, \tag{4.25}$$

with

$$\mathbf{b}_W = -\frac{\partial \tilde{W}}{\partial \mathbf{H}^{-1}}\cdot\mathbf{H}^{-T} = W\mathbf{1}_R - \mathbf{T}\cdot\mathbf{F}. \tag{4.26}$$

Consequently, we can write the material thermal force as

$$\mathbf{f}^{\mathrm{th}} = -\mathbf{b}\cdot\mathbf{H}^{-1}\cdot\frac{d\mathbf{H}}{d\theta}\,(\nabla_R\theta). \tag{4.27}$$
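
4.5. MAGNETOELASTICITY OF NONUNIFORMLY MAGNETIZED BODIES

In that case we replace $\theta$ or $\alpha$ by the material magnetization vector $\mathbf{m}$ per unit volume of the reference configuration. By analogy with the two previous cases we have thus

$$\mathbf{H}_L = \mathbf{b}\cdot\mathbf{P}\cdot\frac{\partial \mathbf{P}^{-1}}{\partial \mathbf{m}}, \qquad \mathbf{f}^{\mathrm{magn}} = \mathbf{b}\cdot\mathbf{P}\cdot\frac{\partial \mathbf{P}^{-1}}{\partial \mathbf{m}}\cdot(\nabla_R\mathbf{m})^{T}. \tag{4.28}$$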

5. Configurational Forces

Although it has become customary to refer to the above material forces (contributions in the balance of canonical momentum) as configurational forces, we prefer to reserve that name for those quantities that are deduced from the balance of canonical momentum by some operation such as integration over a singular region (and shrinking to a singular point if this is the case) or taking the jump across a singular manifold (this is also obtained by volume integration over a region overlapping the singular surface and then flattening this region onto the surface). In both cases the definition involves both an integration and a limiting procedure yielding a nonzero quantity by virtue of the singularity present. Accordingly, configurational forces are here associated with field singularities. The latter thus appear as pseudo-inhomogeneities in their own right. Such configurational forces acquire a true physical meaning only in so far as the power they expend in an irreversible motion of the singularity set is none other than a dissipation. They are clearly related primarily not to the dissipative behavior of the bulk material per se but to an irreversibility due to the time evolution of the volume of integration, e.g., during the irreversible progress of the crack tip into the material in the case of fracture. To arrive at a consistent “material force”–“energy change” formulation, one must integrate both canonical momentum and energy, a point that escapes the attention of many authors. The material may simultaneously be dissipative and smoothly materially inhomogeneous in the bulk. Then one must account for the sources of canonical momentum exhibited in previous sections.
Without dealing with this in detail but just to show the completeness and unification of concepts, we briefly remind the reader of the two cases of fracture (propagation of a singularity line viewed as a point in the plane) and propagation of a singularity surface (shock wave and phase-transition front viewed as a line in the plane).

5.1. FRACTURE

In the case of a macroscopic sharp straight through crack seen as the uniform limit of a family of regular rounded notches with radius going to zero, Dascalu and Maugin [57], starting from the appropriately reduced form of equations (3.13) and (3.5) at any regular material point $\mathbf{X}$, implemented the above mentioned procedure rigorously for a purely elastic homogeneous material. They showed that the global material force or configurational force $\mathbf{K}$ (we denote this force by $\mathbf{K}$, like Kraft in German, to avoid any confusion with the deformation gradient $\mathbf{F}$; $\mathbf{P}$ here again is the material momentum and not the structural rearrangement of Section 4) acting on the crack tip and the associated energy release rate $G$ are given by

$$\mathbf{K} = \int_{\partial B}\bigl(\mathbf{N}\cdot\mathbf{b} + \mathbf{P}(\bar{\mathbf{V}}\cdot\mathbf{N})\bigr)\,dS - \frac{\partial}{\partial t}\int_{B}\mathbf{P}\,dV \tag{5.1}$$

and

$$G = \int_{\partial B}\bigl(H(\bar{\mathbf{V}}\cdot\mathbf{N}) + \mathbf{N}\cdot(\mathbf{T}\cdot\mathbf{v})\bigr)\,dS - \frac{\partial}{\partial t}\int_{B}H\,dV, \tag{5.2}$$

where $H = E + K$ is the total energy (Hamiltonian) density based on the internal energy, $\bar{\mathbf{V}}$ is the material velocity of the crack tip, and $B$ is the regular region bounded by the inside border in the material (with unit outward normal $\mathbf{N}$) and the stress-free faces of the crack. The quantities $\mathbf{K}$ and $G$ are related by the dissipation relation

$$G = K_1\bar{V}_1 \geq 0, \tag{5.3}$$

where $\bar{V}_1$ and $K_1$ are components of $\bar{\mathbf{V}}$ and $\mathbf{K}$ in the direction of extension of the crack. The powerful result (5.3) holds in the limit as the volume $B$ shrinks uniformly to the crack tip. Since there are no thermal effects in equations (5.1)–(5.3), it does not matter whether the Hamiltonian $H$ is based on the internal or free energy and we do not distinguish between $\mathbf{b}_W$ and $\mathbf{b}_E$. However, when true and pseudo-inhomogeneities are present the starting point may be equation (3.13), which emphasizes the presence of these additional effects, and the following transformed expression of equation (3.5) obtained by accounting for (3.21)$_1$ and the Legendre–Fenchel transform (3.25), that is,

$$\frac{\partial H_W}{\partial t} - \nabla\cdot(\mathbf{T}\cdot\mathbf{v}) + \Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}} = 0, \tag{5.4}$$

where $H_W = W + K$ is the total energy (Hamiltonian) density based on the free energy. The volume integral and shrinking limit (if necessary) are now applied to equations (3.13) and (5.4). One obtains thus in place of equation (5.1)

$$\mathbf{K} = \int_{\partial B}\bigl(\mathbf{N}\cdot\mathbf{b}_W + \mathbf{P}(\bar{\mathbf{V}}\cdot\mathbf{N})\bigr)\,dS - \frac{\partial}{\partial t}\int_{B}\mathbf{P}\,dV - \int_{B}\bigl(\mathbf{f}^{\mathrm{inh}} + \mathbf{f}^{\mathrm{th}} + \mathbf{f}^{\mathrm{intr}}\bigr)\,dV, \tag{5.5}$$

where $\mathbf{b}_W$ is the dynamical Eshelby stress based on the free energy. In the same conditions, equation (5.2) is replaced by

$$G = \int_{\partial B}\bigl(H_W(\bar{\mathbf{V}}\cdot\mathbf{N}) + \mathbf{N}\cdot(\mathbf{T}\cdot\mathbf{v})\bigr)\,dS - \frac{\partial}{\partial t}\int_{B}H_W\,dV + \int_{B}\bigl(\Phi^{\mathrm{th}} + \Phi^{\mathrm{intr}}\bigr)\,dV. \tag{5.6}$$
Fortunately, the material force $\mathbf{f}^{\mathrm{inh}}$, which has no counterpart in (5.6), has no dissipation content so that (5.5) and (5.6) are again shown to be consistent. Indeed, accounting for the order of singularity of the fields $\alpha$ and $\theta$ at the crack tip, we have limit expressions at the crack tip such as [57]

$$\mathbf{f}^{\mathrm{th}}\cdot\bar{\mathbf{V}} \approx -\Phi^{\mathrm{th}}, \qquad \mathbf{f}^{\mathrm{intr}}\cdot\bar{\mathbf{V}} \approx -\Phi^{\mathrm{intr}}, \tag{5.7}$$
from which it follows by using the same argument as for the purely elastic case [57] that with the dual expressions (5.5)–(5.6) the result (5.3) still holds true.

5.2. SINGULAR SURFACES

In the case of propagating singular surfaces $\Sigma$ (with unit oriented normal $\mathbf{N}$) entirely described in the material framework, it is clear that the presence of such a surface breaks the translational invariance on the material manifold, since the material will in general have acquired different material properties on both sides of the surface. Accordingly, the central equation, the one that will deliver the driving force on the singular surface, is the jump relation associated with the regular bulk equation (3.13), because this is the equation that contains the “material force” generated by a material displacement of $\Sigma$ on the material manifold. This general problem was dealt with by the author [37, 58, 59] along this line of thought. For a general singular surface (however not equipped with its own mass and energy) of the shock wave type (characterized by a finite discontinuity in the physical velocity field $\mathbf{v}$ and possibly in the other fields $\theta$ and $\alpha$), one can establish by various means the following two equations that relate to the lack of conservation of pseudomomentum and entropy across $\Sigma$, although the jump equations associated with the physical momentum and energy (see equations (5.11)–(5.13) below) do not reveal source terms:

$$\mathbf{N}\cdot[\mathbf{b}_W + \bar{\mathbf{V}}\otimes\mathbf{P}] + \mathbf{f}_\Sigma = \mathbf{0} \tag{5.8}$$

and

$$\mathbf{N}\cdot\left[\bar{\mathbf{V}}S - \frac{\mathbf{Q}}{\theta}\right] + \sigma_\Sigma = 0, \tag{5.9}$$

with the constraint (second law of thermodynamics at $\Sigma$)

$$\sigma_\Sigma \geq 0. \tag{5.10}$$
Equations (5.8) and (5.9) can be viewed as uniform limits obtained at $\Sigma$ by shrinking (flattening) a volume overlapping $\Sigma$ (the so-called “pill-box” method). In that view the source term, e.g., $\mathbf{f}_\Sigma$ in (5.8), is the formal representation of the limit of the volume integral of the pseudo-inhomogeneity forces; because of the singularity of the surface, this integral does not converge towards zero in the limit. The same holds true of the surface entropy source $\sigma_\Sigma$, which is a phenomenological representation of the term obtained in the same limit procedure starting with a volume integral of the sources of entropy in equation (3.11). Again, the true inhomogeneity force $\mathbf{f}^{\mathrm{inh}}$, contrary to $\mathbf{f}^{\mathrm{th}}$ and $\mathbf{f}^{\mathrm{intr}}$, does not produce any entropy. Accordingly, the interesting relationship is the one that relates the unknown driving force $\mathbf{f}_\Sigma$ and the equally unknown (but non-negative) surface entropy source $\sigma_\Sigma$. If the theory is consistent, these cannot be entirely independent. The looked-for consistency condition in fact allows one to close the system of phenomenological equations at $\Sigma$ in compliance with the second law. As a matter of fact, accounting for the jump equations associated with mass, physical motion and energy, i.e., corresponding to the bulk equations (3.3)–(3.5), across $\Sigma$,

$$(\bar{\mathbf{V}}\cdot\mathbf{N})[\rho_0] = 0, \tag{5.11}$$

$$\mathbf{N}\cdot[\mathbf{T} + \bar{\mathbf{V}}\otimes\mathbf{p}] = \mathbf{0}, \tag{5.12}$$

$$\mathbf{N}\cdot[\bar{\mathbf{V}}H + \mathbf{T}\cdot\mathbf{v} - \mathbf{Q}] = 0, \tag{5.13}$$
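one can show in all generality [12] that we have the following relationship

$$\sigma_\Sigma = -\langle\theta^{-1}\rangle\left(\mathbf{f}_\Sigma + \frac{[S\mathbf{N}]}{\langle\theta^{-1}\rangle}\right)\cdot\bar{\mathbf{V}} + \mathbf{N}\cdot\langle\mathbf{Q}\rangle[\theta^{-1}] \geq 0, \tag{5.14}$$

where the symbolisms $[\cdot]$ and $\langle\cdot\rangle$ denote, respectively, the jump and mean value of the enclosed quantity at $\Sigma$ and we have

$$\mathbf{f}_\Sigma\cdot\bar{\mathbf{V}} = -\bigl[E(\bar{\mathbf{V}}\cdot\mathbf{N}) - \mathbf{N}\cdot\mathbf{T}\cdot\mathbf{F}\cdot\bar{\mathbf{V}}\bigr]. \tag{5.15}$$

For classical shock waves (in the so-called inconsistent theory where a dissipative interface across which entropy grows is supposed to connect two regions nonetheless in adiabatic regime!), one sets

$$\mathbf{f}_\Sigma = \mathbf{0}, \qquad \forall\,\bar{\mathbf{V}} \neq \mathbf{0}, \tag{5.16}$$

and there remains the trivial relation $\sigma_\Sigma = -[S](\bar{\mathbf{V}}\cdot\mathbf{N}) \geq 0$, which tells in which direction (with respect to $\mathbf{N}$) the wave front moves to guarantee an increase in entropy. Projected onto the unit normal $\mathbf{N}$, the first of (5.16) then is none other than the celebrated Hugoniot equation of shock-wave theory, i.e.,

$$\mathrm{Hugo}_{SW} := [E - \mathbf{N}\cdot\mathbf{T}\cdot\mathbf{F}\cdot\mathbf{N}] = 0. \tag{5.17}$$

For coherent phase-transition fronts for which

$$[\bar{\mathbf{V}}] = \mathbf{0}, \qquad [\theta] = 0 \tag{5.18}$$

across $\Sigma$, the above-given formula (5.14) reduces to

$$\sigma_\Sigma = \theta_\Sigma^{-1}\,\mathbf{f}_\Sigma\cdot\bar{\mathbf{V}} = \theta_\Sigma^{-1}\,f_\Sigma\,\bar{V}_N \geq 0, \tag{5.19}$$

where $\theta_\Sigma$ is the value of $\theta$ at $\Sigma$, $\bar{V}_N = \bar{\mathbf{V}}\cdot\mathbf{N}$ is the normal speed of $\Sigma$, and the scalar surface driving force $f_\Sigma$ is given by

$$f_\Sigma = -\mathrm{Hugo}_{PT}, \qquad \mathrm{Hugo}_{PT} := [W - \mathbf{N}\cdot\mathbf{T}\cdot\mathbf{F}\cdot\mathbf{N}] \tag{5.20}$$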
and is generally not zero (it is zero for the nondissipative Landau theory of phase transitions where the vanishing of $f_\Sigma$ is a mathematical statement akin, in the appropriate state space, to “Maxwell’s rule of equal areas” in the construction of the so-called Maxwell line). Another way to derive these relations at $\Sigma$ has been developed by the author [58, 59] by introducing the notion of a single scalar quantity, namely a generating function or Massieu thermodynamical potential $M$ from which both $f_\Sigma$ and $\sigma_\Sigma$ are consistently derived. Equations (5.17) and (5.20) illustrate perfectly the need for distinguishing between internal and free energies. They emphasize the use of one or the other depending on the thermal conditions of the considered process across the wave front. Accordingly, we may say that the study of the thermodynamics of shock waves is based on the vanishing of the jump of the normal component of the (quasi-static) Eshelby stress built on the internal energy, while the study of the propagation of phase-transition fronts is based essentially on the consideration of the value of the jump of the normal component of the (quasi-static) Eshelby stress built on the free energy [47, 58]. As a matter of fact, equation (5.19) is in the familiar form of the product of a thermodynamical force and a generalized velocity (here a true velocity). This hints at the fact that although originally defined in terms of usual fields (Piola–Kirchhoff stress, finite deformation, energy density, temperature), the configurational forces should be involved in kinetic relations (in a general way, relationships between the material velocity of the singularity set and the conjugate driving force), which should obey the second law of thermodynamics.
In this view shared by many works of Abeyaratne and Knowles [60, 61], the configurational forces are essentially secondary quantities that are exploited in criteria of progress of the singularity set (or “defects” or “localized inhomogeneities”) and not primary quantities on which the solution of an original boundary-value problem can be built. This is in contradiction with Gurtin’s point of view [35] where configurational forces seem (a priori only, this is our own remark) to exist independently of the physical world (e.g., the classical balance of physical momentum or its jump in the last studied case, or the bulk field equation for a more abstract dependent field).
6. Conclusion At the end of Section 3 we have already clearly unified the notions of true and pseudo-inhomogeneities by their parallel contributions to the balance of canonical momentum or its degenerate equilibrium form. The three lines of chart 1 are now unified through the dual notions of Eshelby stress and local structural rearrangement. The latter may be of other types than those exhibited here, e.g., phase transformations or material growth. They may be of a general deformation type (up to a rotation and the local material symmetry of the material), essentially of the shear type (plasticity), or of the isotropic dilatation type (thermoelasticity, growth unless these two have directional properties). In each case eigenstrains are involved, e.g., transformation strains in the case of phase transformation. In the
case of growth of materials of the physiological type (such as in bone remodelling or the mechanics of soft tissues), the local rearrangement was called “transplant”, emphasizing the local nature, for obvious “surgical” reasons [50]. In the case of the theory of inhomogeneity, it was called “inhomogeneity map” [8–10]. We note to conclude that other generalizations of the concepts presented globally in this contribution apply to media with higher order deformation gradients than in classical hyperelasticity (the so-called “weakly” nonlocal theory [5, 36]), to additional internal degrees of freedom [62], to dissipative internal variables exhibiting also a weak nonlocality [37], and to electromagnetic deformable bodies [43, 63].

Acknowledgements

The author benefits from a Max Planck Award for International Co-operation (2001–2005). He acknowledges his debt to the referees.

References

1. E. Kroener, Inneren Spannungen und der Inkompatibilitätstensor in der Elastizitätstheorie. Z. Angew. Phys. 7 (1958) 249–257.
2. V.L. Indenbom, Internal stress in crystals. In: B. Gruber (ed.), Theory of Crystal Defects, Proc. of Summer School, Hrazany, Czech, September 1964. Acad. Publ. House, Prague, and Academic Press, New York (1965) pp. 257–274.
3. M. Kleman, Dislocations, disclinations and magnetism. In: F.R.N. Nabarro (ed.), Dislocations in Solids, Vol. 5. North-Holland, Amsterdam (1980) pp. 100–215.
4. G.A. Maugin, Classical magnetoelasticity in ferromagnets with defects. In: H. Parkus (ed.), Electromagnetic Interactions in Elastic Solids, CISM Udine Course (1977). Springer, Vienna (1979) pp. 243–324.
5. G.A. Maugin, Material Inhomogeneities in Elasticity. Chapman and Hall, London (1993).
6. G.A. Maugin, Material forces: Concepts and applications. ASME Appl. Mech. Rev. 48 (1995) 213–245.
7. G.A. Maugin, Thermomechanics of Nonlinear Dissipative Behaviors. World Scientific, Singapore, and River Edge, NJ (1999).
8. M. Epstein and G.A. Maugin, The energy-momentum tensor and material uniformity in finite elasticity. Acta Mech. 83 (1990) 127–133.
9. W. Noll, Materially uniform simple bodies with inhomogeneities. Arch. Rational Mech. Anal. 27 (1967) 1–32.
10. C.C. Wang, On the geometric structure of simple bodies, or mathematical foundations for the theory of continuous distributions of dislocations. Arch. Rational Mech. Anal. 27 (1967) 33–94.
11. C.A. Truesdell and W. Noll, Nonlinear field theories of mechanics. In: S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
12. G.A. Maugin, Kröner–Eshelby approach to continuum mechanics with dislocations, material inhomogeneities and pseudo-inhomogeneities. In: B. Maruzewski (ed.), Proc. of Internat. Sympos. on Structured Media in Memory of E. Kröner, Poznan, Poland, September 2001. Poznan Univ. Press, Poland (2001) pp. 182–195.
13. G.A. Maugin, Geometry and thermomechanics of structural rearrangements: Ekkehart Kroener’s legacy, GAMM’2002, Kroener’s Lecture, Augsburg (2002). Z. Angew. Math. Mech. 83 (2002) 75–83.
14. B.A. Bilby, L.R.T. Lardner and A.N. Stroh, Continuum theory of dislocations and the theory of plasticity. In: Proc. of the Xth ICTAM, Brussels, 1956. Presses de l’Université de Bruxelles, Vol. 8 (1957) pp. 35–44.
15. E. Kroener, Kontinuumstheorie der Versetzungen und Eigenspannungen. Springer, Berlin (1958).
16. E. Kroener and A. Seeger, Nicht-lineare Elastizitätstheorie und Eigenspannungen. Arch. Rational Mech. Anal. 3 (1959) 97–119.
17. E.H. Lee, Elastic-plastic deformation at finite strain. ASME Trans. J. Appl. Mech. 36 (1969) 1–6.
18. J. Mandel, Plasticité et Viscoplasticité Classique, CISM Udine Course. Springer, Vienna (1971).
19. J. Lubliner, Plasticity Theory. Macmillan, New York (1990).
20. C. Teodosiu and F. Sidoroff, A theory of finite elastoplasticity in single crystals. Internat. J. Engrg. Sci. 14 (1976) 165–176.
21. F. Sidoroff, Variables internes en viscoélasticité et viscoplasticité. State Doctoral Thesis in Mathematics, Université Pierre et Marie Curie, Paris (1976).
22. K. Kondo, On the geometrical and physical foundations of the theory of yielding. In: Proc. of the 2nd Japanese National Congress of Applied Mechanics, Kyoto (1952) pp. 41–47.
23. K. Kondo, Non-Riemannian geometry of imperfect crystals from a macroscopic viewpoint. In: K. Kondo (ed.), RAAG Memoirs of the Unifying Study of Basic Problems in Engineering and Physical Sciences by Means of Geometry, Vol. 1. Gakujutsu Bunken Fukyukai, Tokyo (1955) pp. 459–480.
24. M.O. Peach and J.S. Koehler, The force exerted on dislocations and the stress field produced by them. Phys. Rev. II-80 (1950) 436–439.
25. J.D. Eshelby, The force on an elastic singularity. Phil. Trans. Roy. Soc. London A 244 (1951) 87–112.
26. J.R. Rice, Path-independent integral and the approximate analysis of strain concentrations by notches and cracks. Trans. ASME J. Appl. Mech. 33 (1968) 379–385.
27. L.D. Landau and E.M. Lifshitz, Theory of Fields. Mir, Moscow (1965).
28. D. Rogula, Forces in material space. Arch. Mech. 29 (1967) 705–715.
29. G.A. Maugin, Magnetized deformable media in general relativity. Ann. Inst. Henri Poincaré A 15 (1971) 275–302.
30. A. Golebiewska-Herrmann, On conservation laws of continuum mechanics. Internat. J. Solids Struct. 17 (1981) 1–9.
31. R. Kienzler and G. Herrmann, Mechanics in Material Space. Springer, Berlin (2000).
32. R. Kienzler and G.A. Maugin (eds), Configurational Mechanics of Materials. Springer, Vienna (2001).
33. M.E. Gurtin, The characterization of configurational forces. Arch. Rational Mech. Anal. 126 (1994) 387–394.
34. M.E. Gurtin, On the nature of configurational forces. Arch. Rational Mech. Anal. 131 (1995) 67–100.
35. M.E. Gurtin, Configurational Forces as Basic Concepts of Continuum Physics. Springer, Berlin (1999).
36. G.A. Maugin and C. Trimarco, Pseudo-momentum and material forces in nonlinear elasticity: Variational formulation and application to fracture. Acta Mech. 94 (1992) 1–28.
37. G.A. Maugin, Thermomechanics of inhomogeneous-heterogeneous systems: Application to the irreversible progress of two- and three-dimensional defects. ARI 50 (1997) 41–56.
38. J.L. Ericksen, Special topics in elastostatics. In: C.-S. Yih (ed.), Advances in Applied Mechanics, Vol. 17. Academic Press, New York (1977) pp. 189–244.
39. G.A. Maugin, On Ericksen–Noether identity and material balance laws in thermoelasticity and akin phenomena. In: R.C. Batra and M.F. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials (J.L. Ericksen’s 70th Anniversary Volume). C.I.M.N.E., Barcelona (1996) pp. 397–407.
40. M. Epstein and G.A. Maugin, Thermoelastic material forces: definition and geometric aspects. C. R. Acad. Sci. Paris II 320 (1995) 63–68.
41. H.D. Bui, Mécanique de la Rupture Fragile. Masson, Paris (1978).
42. G.A. Maugin, Continuum Mechanics of Electromagnetic Solids. North-Holland, Amsterdam (1988).
43. A. Fomèthe and G.A. Maugin, Material forces in thermoelastic ferromagnets. Cont. Mech. Thermodyn. 8 (1996) 275–292.
44. G.A. Maugin, On the universality of the thermomechanics of forces driving singular sets. Arch. Appl. Mech. 70 (2000) 31–45.
45. G.A. Maugin, Universality of the thermomechanics of forces driving singular sets in continuum mechanics. In: 20th ICTAM, Paper QG2. Chicago (August 2000).
46. J. Kijowski and G. Magli, Unconstrained Hamiltonian formulation of general relativity with thermo-elastic sources. Classical Quantum Grav. 15 (1998) 3891–3916.
47. R. Abeyaratne and J.K. Knowles, A note on the driving traction acting on a propagating interface: Adiabatic and non-adiabatic processes in a continuum. ASME Trans. J. Appl. Mech. 67 (2000) 829–831.
48. G.A. Maugin, Remarks on Eshelbian thermomechanics of materials. In: S. Cleja-Tigoiu and V. Tigoiu (eds), Proc. of the 5th Internat. Seminar on Geometry, Continua and Microstructure. Publ. House of Romanian Acad. Sciences, Bucharest (2001) pp. 159–166.
49. M. Epstein and G.A. Maugin, Notions of material uniformity and homogeneity. In: T. Tatsumi (ed.), Theoretical and Applied Mechanics, Proc. of ICTAM’96, Kyoto. Elsevier, Amsterdam (1997) pp. 201–215.
50. M. Epstein and G.A. Maugin, Thermomechanics of volumetric growth in uniform bodies. Internat. J. Plasticity 16 (2000) 51–978.
51. K. Ch. Le, Thermodynamically based constitutive equations for single crystals. In: G.A. Maugin (ed.), 1st Internat. Seminar on Geometry, Continua and Microstructure. Hermann, Paris (1999) pp. 87–97.
52. M.E. Gurtin and P. Cermelli, The characterization of geometrically necessary dislocations in finite plasticity. In: 20th ICTAM, Paper FG1. Chicago (August 2000).
53. P. Steinmann, Views on multiplicative elastoplasticity and the continuum theory of dislocations. Internat. J. Engrg. Sci. 34 (1996) 1717–1735.
54. G.A. Maugin, Eshelby stress in plasticity and fracture. Internat. J. Plasticity 10 (1994) 393–408.
55. M. Epstein and G.A. Maugin, On the geometrical material structure of anelasticity. Acta Mech. 115 (1995) 19–131.
56. S. Cleja-Tigoiu and G.A. Maugin, Eshelby’s stress tensors in finite elastoplasticity. Acta Mech. 139 (2000) 19–131.
57. C. Dascalu and G.A. Maugin, Forces matérielles et taux de restitution de l’énergie dans les corps élastiques homogènes avec défauts. C. R. Acad. Sci. Paris II 317 (1993) 1135–1140.
58. G.A. Maugin, On shock waves and phase-transition fronts in continua. ARI 50 (1998) 145–150.
59. G.A. Maugin, Thermomechanics of forces driving singular point sets. Arch. Mech. 50 (1998) 477–487.
60. R. Abeyaratne and J.K. Knowles, Driving traction acting on a surface of strain discontinuity in a continuum. J. Mech. Phys. Solids 38 (1990) 345–360.
61. R. Abeyaratne and J.K. Knowles, Kinetic relations and the propagation of phase boundaries in elastic solids. Arch. Rational Mech. Anal. 114 (1991) 119–154.
62. G.A. Maugin, On the structure of the theory of polar elasticity. Phil. Trans. Roy. Soc. London A 356 (1998) 1367–1395.
63. G.A. Maugin and C. Trimarco, Driving force on phase transition fronts in thermoelectroelastic crystals. Math. Mech. Solids 2 (1997) 199–214.

On the Microscopic Interpretation of Stress and Couple Stress

A. IAN MURDOCH
Department of Mathematics, University of Strathclyde, Livingstone Tower, 26 Richmond Street, Glasgow G1 1XH, U.K. E-mail: [email protected]

Received 18 September 2002; in revised form 25 February 2003

Abstract. Exact continuum forms of balance (for mass, linear momentum, and tensor-valued moment of momentum) are established as relations between weighted spatial averages of corpuscular quantities computed at any supra-molecular length scale. Explicit expressions for stress and generalised couple stress in terms of particle interactions are obtained using a theorem due to Noll, and their physical interpretation is discussed for a specific choice of weighting function. Remarks are made on other choices of weighting function, the interpretation of partial stress in mixture theory, a link between couple stress and inhomogeneity, and other forms of moment of momentum balance. Comparison is made with the statistical mechanical viewpoint pioneered by Irving and Kirkwood.

Mathematics Subject Classifications (2000): 70F, 74A.

Key words: stress, couple stress, microscopic interpretation, weighting function.

Dedicated to the memory of Clifford Truesdell

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 599–625. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

Modelling molecules as interacting point masses, Irving and Kirkwood [1] studied the molecular basis of the equations of hydrodynamics within the framework of classical statistical mechanics. Explicit expressions were obtained for the separate contributions to the stress tensor which derive from momentum transport and from interactions. The physical and geometrical interpretations of these contributions were clear and simple, but the formal manipulation of series expansions of Dirac δ distributions (central to the analysis) was not justified. Noll [2] showed how the same results could be obtained using a theorem whose proof requires only undergraduate-level multivariable calculus. Pitteri [3] extended Noll’s analysis to very general interactions, and later [4] discussed couple stress from the same, statistical mechanical, viewpoint.

The field values which appear in [1–4] are strictly local, both in space and time. However, as pointed out in [1, p. 821], in principle the values of fields in deterministic continuum mechanics should be identified with averages of local measurements made in oft-repeated experiments. Since any local measurement has associated scales of length and time, it was proposed in [1] that continuum field values be identified with those obtained after a further averaging in both space and time. Such additional averaging was not undertaken, however. Since the field values in [1] are strictly local ensemble averages, the foregoing proposal implicitly assumes that space-time averaging of an ensemble average should be equivalent to averaging oft-repeated space-time averages.

The foregoing motivates the formulation of continuum relations in which field values are directly identifiable with averages of molecular quantities computed at specific scales of length and time. These field values can then be related to measurements at these scales in individual experiments. Such a formulation was undertaken by Murdoch and Bedeaux [5], who employed weighting function methodology. The main purpose of this work is to review the nature of spatial averaging using weighting functions, to indicate the central role of the aforementioned theorem of Noll, and to highlight the somewhat subtle interpretation of the interaction stress and couple-stress tensors. The existence and explicit form of the couple stress tensor, together with the physical/geometrical interpretation of stress and couple stress, constitute the new aspects of this contribution.

In Section 2 relations expressing mass conservation, together with balances of linear momentum and (rank two tensor-valued) moment of momentum, are derived in terms of corpuscular quantities and general choice of weighting function. Noll’s theorem is used to establish the existence and explicit forms of the stress and couple-stress tensors. Choice of a simple scale-dependent weighting function is made in Section 3, and the physical and geometrical interpretation of all fields is discussed.
Other choices of weighting function are considered in Section 4, together with the interpretation of partial stress in mixtures, the link between couple stress and inhomogeneity, the difference between moment of momentum here derived and those usually postulated, and the statistical mechanical approach of [1].

2. Continuum Relations Derived on the Basis of Particle Mechanics

2.1. KINEMATICS AND MASS CONSERVATION

Consider a material system $\mathcal{M}$ of distinguishable molecules, modelled as a system of interacting point masses labelled $P_i$ ($i = 1, 2, \ldots, N$), whose masses, locations, and velocities at instant $t$ are denoted by $m_i$, $\mathbf{x}_i(t)$, and $\mathbf{v}_i(t)$, respectively. Local spatial averages of additive corpuscular quantities may be computed in terms of a weighting function. For example, the mass density appropriate to a choice $w$ of weighting function is

$$\rho_w(\mathbf{x},t) := \sum_{i=1}^{N} m_i\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.1}$$
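
Definition (2.1) can be evaluated directly for a small particle system. The sketch below assumes a Gaussian weighting function of width `eps`; this choice, the particle data, and the function names are illustrative assumptions, not the weighting function adopted in the paper. It also sums the density over a grid to confirm that a weighting function with unit integral returns the total mass.

```python
import numpy as np

def gaussian_weight(u, eps):
    """Smooth weighting function with unit integral and dimension (length)^-3.

    u has shape (..., 3); the Gaussian form is an assumed, illustrative choice.
    """
    r2 = np.sum(u ** 2, axis=-1)
    return np.exp(-r2 / (2.0 * eps ** 2)) / ((2.0 * np.pi) ** 1.5 * eps ** 3)

def mass_density(x, masses, positions, eps):
    """rho_w(x) = sum_i m_i w(x_i - x), evaluated at each row of x (shape (G, 3))."""
    x = np.atleast_2d(x)
    u = positions[None, :, :] - x[:, None, :]      # (G, N, 3) displacements x_i - x
    return gaussian_weight(u, eps) @ masses        # (G,)

def total_mass_on_grid(masses, positions, eps, half_width=4.0, n=41):
    """Riemann-sum approximation of the integral of rho_w over a cube."""
    axis = np.linspace(-half_width, half_width, n)
    h = axis[1] - axis[0]
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
    return float(np.sum(mass_density(grid, masses, positions, eps)) * h ** 3)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    masses = rng.uniform(1.0, 2.0, size=5)
    positions = rng.uniform(-0.5, 0.5, size=(5, 3))
    print("rho_w at origin:", mass_density(np.zeros(3), masses, positions, eps=0.5)[0])
    print("integrated mass:", total_mass_on_grid(masses, positions, eps=0.5))
    print("sum of masses:  ", masses.sum())
```

The grid half-width and spacing are chosen so that the Gaussian tails and the quadrature error are negligible for `eps = 0.5`; a narrower weighting function would need a finer grid.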
To make physical sense, w should assign greater contributions to the sum from molecules near the geometrical point x than from those far from x, and also have
ON THE MICROSCOPIC INTERPRETATION OF STRESS AND COUPLE STRESS
601
physical dimension (length)−3 . Further, if ρw is to be identified with mass density as employed in continuum mechanics then it should be differentiable both in space and time. However, from (2.1), any such regularity of ρw is inherited from that of w. Accordingly w is required to be of class C 1 on the space V of displacements in Euclidean space E. Additionally, the integral of ρw over E should yield the total mass of the system. If there is only one particle in M then necessarily w = 1. (2.2) V
This normalisation condition also (trivially) suffices to yield the property for any number of particles. At this stage no further restrictions will be imposed upon $w$. Although the physical interpretation of fields is crucially dependent upon the choice of $w$, in what follows the form of the relations these fields satisfy is independent of such choice.

Holding $\mathbf{x}$ fixed in (2.1),
\[ \frac{\partial \rho_w}{\partial t} = \sum_{i=1}^{N} m_i \nabla w \cdot \mathbf{v}_i = -\sum_{i=1}^{N} m_i \nabla_{\mathbf{x}} w \cdot \mathbf{v}_i = -\sum_{i=1}^{N} m_i\, \mathrm{div}\{\mathbf{v}_i w\} = -\mathrm{div}\, \mathbf{p}_w, \tag{2.3} \]
where
\[ \mathbf{p}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}) \tag{2.4} \]
denotes the momentum density appropriate to $w$. Here $\nabla w$ denotes the derivative of $w$ with respect to its argument $\mathbf{u}$ $(:= \mathbf{x}_i(t) - \mathbf{x})$, $\nabla_{\mathbf{x}} w$ denotes the gradient of $w$ regarded as a function of location $\mathbf{x}$, and in introducing the divergence it has been noted that $\mathbf{v}_i$ is independent of $\mathbf{x}$. Whenever $\rho_w \neq 0$, the corresponding velocity field
\[ \mathbf{v}_w := \frac{\mathbf{p}_w}{\rho_w}. \tag{2.5} \]
Thus from (2.3) and (2.5)
\[ \frac{\partial \rho_w}{\partial t} + \mathrm{div}\{\rho_w \mathbf{v}_w\} = 0. \tag{2.6} \]

2.2. LINEAR MOMENTUM BALANCE

Linear momentum balance is obtained by considering the motion of $P_i$ relative to an inertial frame. Such motion is governed by the equation
\[ \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij} + \mathbf{b}_i = \frac{d}{dt}\{m_i \mathbf{v}_i\}. \tag{2.7} \]
A.I. MURDOCH
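Here $\mathbf{f}_{ij}$ denotes the force exerted upon $P_i$ by $P_j$, $\mathbf{b}_i$ represents the resultant force on $P_i$ due to external agencies, and the sum is over all particles $P_j$ ($j \neq i$). Multiplication of each term by $w(\mathbf{x}_i(t) - \mathbf{x})$, followed by summation over all particles, yields
\[ \mathbf{f}_w + \mathbf{b}_w = \sum_{i=1}^{N} \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}), \tag{2.8} \]
where
\[ \mathbf{f}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}(t)\, w(\mathbf{x}_i(t) - \mathbf{x}) \tag{2.9} \]
and
\[ \mathbf{b}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \mathbf{b}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.10} \]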
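Since
\[ \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}) = \frac{\partial}{\partial t}\{m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} - (m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w \]
and
\[ (m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w = -(m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w = -\mathrm{div}\{(m_i \mathbf{v}_i \otimes \mathbf{v}_i) w\}, \]
the right-hand side of (2.8) becomes
\[ \frac{\partial}{\partial t}\left\{ \sum_{i=1}^{N} m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) \right\} + \mathrm{div}\, \mathbf{D}_w, \tag{2.11} \]
where
\[ \mathbf{D}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \mathbf{v}_i(t) \otimes \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.12} \]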
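Accordingly, noting definitions (2.4) and (2.5), substitution of (2.11) in (2.8) yields the continuum balance
\[ \mathbf{f}_w + \mathbf{b}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{v}_w\} + \mathrm{div}\, \mathbf{D}_w. \tag{2.13} \]
Writing
\[ \hat{\mathbf{v}}_i(t; \mathbf{x}) := \mathbf{v}_i(t) - \mathbf{v}_w(\mathbf{x}, t), \tag{2.14} \]
and noting that from (2.4) and (2.5)
\[ \sum_{i=1}^{N} m_i \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}) = \mathbf{p}_w - \rho_w \mathbf{v}_w = \mathbf{0}, \tag{2.15} \]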
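it follows that
\[ \mathbf{D}_w(\mathbf{x}, t) = \mathcal{D}_w(\mathbf{x}, t) + \rho_w \mathbf{v}_w \otimes \mathbf{v}_w, \tag{2.16} \]
where
\[ \mathcal{D}_w(\mathbf{x}, t) := \sum_{i=1}^{N} m_i \hat{\mathbf{v}}_i(t; \mathbf{x}) \otimes \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.17} \]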
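Decomposition (2.16) is a purely algebraic consequence of (2.15), so it can be checked numerically for any particle data. The following is a minimal sketch under stated assumptions: the Gaussian `w_gauss`, the random particle data, and every function name are hypothetical choices made here, not part of the derivation above.

```python
import math
import random


def w_gauss(u, sigma=1.0):
    """Illustrative smooth, normalised weighting function (a Gaussian)."""
    r2 = sum(c * c for c in u)
    return (2.0 * math.pi * sigma ** 2) ** -1.5 * math.exp(-r2 / (2.0 * sigma ** 2))


def outer(a, b):
    """Components of the simple tensor a (x) b."""
    return [[a[r] * b[c] for c in range(3)] for r in range(3)]


def fields_at(x, masses, positions, velocities):
    """Return rho_w, v_w, D_w of (2.12) and the thermal tensor of (2.17) at x."""
    weights = [w_gauss([p[k] - x[k] for k in range(3)]) for p in positions]
    rho = sum(m * wt for m, wt in zip(masses, weights))
    p_w = [sum(m * v[k] * wt for m, v, wt in zip(masses, velocities, weights))
           for k in range(3)]
    v_w = [pk / rho for pk in p_w]
    d_full = [[0.0] * 3 for _ in range(3)]
    d_thermal = [[0.0] * 3 for _ in range(3)]
    for m, v, wt in zip(masses, velocities, weights):
        v_hat = [v[k] - v_w[k] for k in range(3)]  # (2.14)
        vv, hh = outer(v, v), outer(v_hat, v_hat)
        for r in range(3):
            for c in range(3):
                d_full[r][c] += m * vv[r][c] * wt
                d_thermal[r][c] += m * hh[r][c] * wt
    return rho, v_w, d_full, d_thermal


def decomposition_residual(seed=0, n=50):
    """Largest component of D_w - (thermal tensor + rho_w v_w (x) v_w)."""
    rng = random.Random(seed)
    masses = [rng.uniform(1.0, 2.0) for _ in range(n)]
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    velocities = [[rng.gauss(0.3, 1.0) for _ in range(3)] for _ in range(n)]
    rho, v_w, d_full, d_thermal = fields_at([0.1, -0.2, 0.05], masses, positions, velocities)
    bulk = outer(v_w, v_w)
    return max(abs(d_full[r][c] - d_thermal[r][c] - rho * bulk[r][c])
               for r in range(3) for c in range(3))
```

The residual should vanish to rounding error whatever weighting function is used, since only the definitions of $\rho_w$ and $\mathbf{v}_w$ enter.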
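Using (2.16), balance (2.13) may be written as
\[ \begin{aligned} -\mathrm{div}\, \mathcal{D}_w + \mathbf{f}_w + \mathbf{b}_w &= \frac{\partial}{\partial t}\{\rho_w \mathbf{v}_w\} + \mathrm{div}\{\rho_w \mathbf{v}_w \otimes \mathbf{v}_w\} \\ &= \left\{ \frac{\partial \rho_w}{\partial t} + \mathrm{div}\{\rho_w \mathbf{v}_w\} \right\} \mathbf{v}_w + \rho_w \left\{ \frac{\partial \mathbf{v}_w}{\partial t} + (\nabla \mathbf{v}_w)\mathbf{v}_w \right\}. \end{aligned} \tag{2.18} \]
That is, invoking (2.6),
\[ -\mathrm{div}\, \mathcal{D}_w + \mathbf{f}_w + \mathbf{b}_w = \rho_w \mathbf{a}_w, \tag{2.19} \]
where the acceleration field
\[ \mathbf{a}_w := \frac{\partial \mathbf{v}_w}{\partial t} + (\nabla \mathbf{v}_w)\mathbf{v}_w. \tag{2.20} \]
Relation (2.19) is to be compared with the usual local form of momentum balance
\[ \mathrm{div}\, \mathbf{T} + \mathbf{b} = \rho \mathbf{a}. \tag{2.21} \]
Identifications
\[ \mathbf{b}_w \leftrightarrow \mathbf{b}, \qquad \rho_w \leftrightarrow \rho, \qquad \mathbf{a}_w \leftrightarrow \mathbf{a} \tag{2.22} \]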
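are straightforward. This motivates an attempt to express $\mathbf{f}_w$ as the divergence of a tensor field. The existence and explicit form of such a tensor field is a consequence of (see [2, 4, 5])

NOLL'S THEOREM. Let $\mathbf{g}$ denote a class $C^1$ tensor-valued function of any rank defined on $\mathcal{E} \times \mathcal{E}$ which satisfies, for any pair of points $\mathbf{x}$ and $\mathbf{y}$,
\[ \mathbf{g}(\mathbf{y}, \mathbf{x}) = -\mathbf{g}(\mathbf{x}, \mathbf{y}), \tag{2.23} \]
and such that for some positive number $\delta$ (here we identify $\mathcal{E}$ with $\mathbb{R}^3$ by selection of a Cartesian reference frame)
\[ \mathbf{g}(\mathbf{x}, \mathbf{y})\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta}, \qquad \nabla_{\mathbf{x}}\mathbf{g}(\mathbf{x}, \mathbf{y})\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta}, \qquad \text{and} \qquad \nabla_{\mathbf{y}}\mathbf{g}(\mathbf{x}, \mathbf{y})\|\mathbf{x}\|^{3+\delta}\|\mathbf{y}\|^{3+\delta} \tag{2.24} \]
are bounded in $\mathcal{E} \times \mathcal{E}$. Then
\[ \int_{\mathcal{E}} \mathbf{g}(\mathbf{x}, \mathbf{y})\, d\mathbf{y} = \mathrm{div}\left\{ -\frac{1}{2} \int_{\mathcal{V}} \int_0^1 \mathbf{g}(\mathbf{x} + \alpha\mathbf{u}, \mathbf{x} - (1 - \alpha)\mathbf{u}) \otimes \mathbf{u}\; d\alpha\, d\mathbf{u} \right\}. \tag{2.25} \]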
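The content of (2.25) can be explored in a one-dimensional analogue, in which the divergence reduces to an ordinary derivative and $\otimes\,\mathbf{u}$ to multiplication by $u$. The sketch below is only an illustration under stated assumptions: the scalar test function `g`, the quadrature parameters, and all names are hypothetical choices made here and do not appear in the text.

```python
import math


def g(x, y):
    """Hypothetical antisymmetric test function with rapid (Gaussian) decay."""
    return (x - y) * math.exp(-(x * x + y * y))


def lhs(x, half_width=8.0, n=1600):
    """Trapezoidal estimate of the integral of g(x, y) over y."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        value = g(x, -half_width + k * h)
        total += value if 0 < k < n else 0.5 * value
    return total * h


def flux(x, half_width=8.0, n_u=400, n_alpha=200):
    """1-D analogue of the bracketed term in (2.25), by midpoint rules:
    H(x) = -1/2 * int du int_0^1 g(x + a u, x - (1 - a) u) u da."""
    hu = 2.0 * half_width / n_u
    ha = 1.0 / n_alpha
    total = 0.0
    for p in range(n_u):
        u = -half_width + (p + 0.5) * hu
        inner = 0.0
        for q in range(n_alpha):
            a = (q + 0.5) * ha
            inner += g(x + a * u, x - (1.0 - a) * u)
        total += inner * ha * u
    return -0.5 * total * hu


def rhs(x, dx=1e-3):
    """Central-difference derivative of the flux, the 1-D 'divergence'."""
    return (flux(x + dx) - flux(x - dx)) / (2.0 * dx)
```

For this particular `g` the left-hand side is known in closed form, $x\sqrt{\pi}\,e^{-x^2}$, which gives an independent check on both quadratures.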
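To invoke this theorem in respect of $\mathbf{f}_w$ we define
\[ \mathbf{g}(\mathbf{x}, \mathbf{y}) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}), \tag{2.26} \]
and notice that use of normalisation (2.2) and (2.9) yields
\[ \int_{\mathcal{E}} \mathbf{g}(\mathbf{x}, \mathbf{y})\, d\mathbf{y} = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) \int_{\mathcal{E}} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} = \mathbf{f}_w(\mathbf{x}). \tag{2.27} \]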
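(Of course, in the foregoing time dependence has been omitted for brevity.) Accordingly, (2.25) enables $\mathbf{f}_w(\mathbf{x})$ to be expressed as the divergence of a tensor, defined explicitly in terms of interactions and $w$, provided (2.23) and (2.24) are satisfied. Now
\[ \begin{aligned} \mathbf{g}(\mathbf{x}, \mathbf{y}) &= \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}) \\ &= \sum_{j=1}^{N} \sum_{\substack{i=1 \\ i \neq j}}^{N} \mathbf{f}_{ji}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y}) \\ &= -\sum_{j=1}^{N} \sum_{\substack{i=1 \\ i \neq j}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y}) = -\mathbf{g}(\mathbf{y}, \mathbf{x}). \end{aligned} \]
Here the second equality is a consequence of re-labelling, and the third equality holds if Newton's third law holds for particle interactions, namely
\[ \mathbf{f}_{ji} = -\mathbf{f}_{ij}. \tag{2.28} \]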
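The antisymmetry argument above can be spot-checked numerically. In this sketch the Gaussian `w_gauss`, the randomly generated interactions, and the function names are all assumptions introduced for illustration; the only property taken from the text is pairwise balance (2.28).

```python
import math
import random


def w_gauss(u, sigma=1.0):
    """Illustrative smooth, normalised weighting function (a Gaussian)."""
    r2 = sum(c * c for c in u)
    return (2.0 * math.pi * sigma ** 2) ** -1.5 * math.exp(-r2 / (2.0 * sigma ** 2))


def random_pair_forces(n, rng):
    """Random interactions obeying Newton's third law, f[j][i] = -f[i][j]."""
    f = [[[0.0, 0.0, 0.0] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            fij = [rng.uniform(-1.0, 1.0) for _ in range(3)]
            f[i][j] = fij
            f[j][i] = [-c for c in fij]
    return f


def g(x, y, positions, f):
    """g(x, y) = sum over i != j of f_ij w(x_i - x) w(x_j - y), as in (2.26)."""
    n = len(positions)
    wx = [w_gauss([p[k] - x[k] for k in range(3)]) for p in positions]
    wy = [w_gauss([p[k] - y[k] for k in range(3)]) for p in positions]
    out = [0.0, 0.0, 0.0]
    for i in range(n):
        for j in range(n):
            if i != j:
                for k in range(3):
                    out[k] += f[i][j][k] * wx[i] * wy[j]
    return out


def antisymmetry_defect(seed=0, n=12):
    """Largest component of g(x, y) + g(y, x) for one random configuration."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    f = random_pair_forces(n, rng)
    x, y = [0.3, -0.2, 0.1], [-0.4, 0.5, 0.0]
    gxy, gyx = g(x, y, positions, f), g(y, x, positions, f)
    return max(abs(a + b) for a, b in zip(gxy, gyx))
```

If the sign flip in `random_pair_forces` is removed, the defect is generally nonzero, which reflects the role of (2.28) in the third equality.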
In respect of the boundedness conditions, observe that interactions $\mathbf{f}_{ij}$ are independent of $\mathbf{x}$ and $\mathbf{y}$. If each interaction $\mathbf{f}_{ij}$ is governed by a separation-dependent potential, $\phi_{ij}$ say, which is bounded below, and such that $\phi_{ij} \to +\infty$ as $P_i$ and $P_j$ get ever closer, then provided the total (kinetic plus potential) energy of $M$ is bounded it follows that the values of all interactions are bounded. (In particular, the foregoing holds for Lennard–Jones-type potentials: see, for example, [6, p. 251].) The boundedness criteria (2.24) are accordingly satisfied provided that, for some $\delta > 0$,
\[ w(\mathbf{u})\|\mathbf{u}\|^{3+\delta} \quad \text{and} \quad \nabla w(\mathbf{u})\|\mathbf{u}\|^{3+\delta} \quad \text{are bounded in } \mathcal{V}. \tag{2.29} \]
Condition (2.29) is thus a condition upon the choice of weighting function necessary for invocation of the theorem. Accordingly, for standard models of molecular interactions, and modulo restriction (2.29) on $w$, from (2.27), (2.26) and (2.25),
\[ \mathbf{f}_w = \mathrm{div}\, \mathbf{T}^-_w, \tag{2.30} \]
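where the interaction stress tensor
\[ \mathbf{T}^-_w(\mathbf{x}) := -\frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \int_{\mathcal{V}} \int_0^1 \mathbf{f}_{ij} \otimes \mathbf{u}\; w(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u})\, w(\mathbf{x}_j - \mathbf{x} + (1 - \alpha)\mathbf{u})\; d\alpha\, d\mathbf{u}. \tag{2.31} \]
From (2.30) the balance of linear momentum (2.19) takes its standard form (2.21), namely
\[ \mathrm{div}\, \mathbf{T}_w + \mathbf{b}_w = \rho_w \mathbf{a}_w, \tag{2.32} \]
where the stress tensor
\[ \mathbf{T}_w := \mathbf{T}^-_w - \mathcal{D}_w. \tag{2.33} \]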
REMARK 1. The existence of $\mathbf{T}^-_w$ imposes no restriction upon the range of interactions, but requires their pairwise balance (2.28). More general interactions have been considered: see [7, p. 299] and [3, p. 294]. A corresponding stress tensor was motivated for the interactions covered in [7] using different methodology from that here employed. Interactions discussed in [3] were decomposed into conservative and non-conservative contributions, and a stress tensor was associated with the former, using Noll's theorem: the non-conservative contribution was shown to be decomposable into two terms, one of which has an associated stress tensor and the other remains a spatial force density.

For large bodies, gravitational considerations make a similar decomposition of $\mathbf{f}_w$ desirable in order to make comparison with the usual continuum approach to self-gravitation. Writing $\mathbf{f}_{ij}$ as the sum of the gravitational attraction of $P_j$ upon $P_i$ together with the remaining non-gravitational interaction, $\mathbf{f}_w$ can be expressed as the sum of the divergence of a non-gravitational interaction stress tensor together with an internal gravitational interaction body force density, $\mathbf{b}_{\mathrm{grav}}$ say. For the simplest choice of $w$, with corresponding length scale $\varepsilon$ (see (3.2)), the value $\mathbf{b}_{\mathrm{grav}}(\mathbf{x}, t)$ is the resultant gravitational force at instant $t$ exerted by molecules of the body distant greater than $\varepsilon$ from $\mathbf{x}$ upon those distant less than $\varepsilon$ from $\mathbf{x}$, divided by $4\pi\varepsilon^3/3$. The non-gravitational interaction stress at $\mathbf{x}$, while possibly involving individual long-range molecular contributions, in general derives almost entirely from molecules close to $\mathbf{x}$ as a consequence of co-operative behaviour (see [7, Section 3.1, Remarks]).
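2.3. GENERALISED MOMENT OF MOMENTUM BALANCE

Tensorial pre-multiplication of each term in equation (2.7) by $(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{x})$, followed by summation over all $i = 1, \ldots, N$, yields
\[ \mathbf{c}_w + \mathbf{J}_w = \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}), \tag{2.34} \]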
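where
\[ \mathbf{c}_w(\mathbf{x}, t) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes \mathbf{f}_{ij}(t)\, w(\mathbf{x}_i(t) - \mathbf{x}), \tag{2.35} \]
and
\[ \mathbf{J}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes \mathbf{b}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.36} \]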
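Now
\[ \begin{aligned} (\mathbf{x}_i - \mathbf{x}) \otimes \frac{d}{dt}\{m_i \mathbf{v}_i\}\, w(\mathbf{x}_i - \mathbf{x}) = {} & \frac{\partial}{\partial t}\{(\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} - \mathbf{v}_i \otimes m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) \\ & - (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i\, (\nabla w \cdot \mathbf{v}_i). \end{aligned} \tag{2.37} \]
Defining the action of simple tensor $\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c}$ on any vector $\mathbf{v}$ by
\[ (\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c})\mathbf{v} := (\mathbf{c} \cdot \mathbf{v})\, \mathbf{a} \otimes \mathbf{b}, \tag{2.38} \]
the last term of (2.37) may be written as
\[ \begin{aligned} -((\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla w &= ((\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w \\ &= \mathrm{div}\{(\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x})\} + m_i \mathbf{v}_i \otimes \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}). \end{aligned} \tag{2.39} \]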
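The second equality is a consequence of the identity
\[ \mathrm{div}(\phi\, \mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c}) = \phi\{(\nabla\mathbf{a})\mathbf{c} \otimes \mathbf{b} + \mathbf{a} \otimes (\nabla\mathbf{b})\mathbf{c} + (\mathrm{div}\, \mathbf{c})(\mathbf{a} \otimes \mathbf{b})\} + (\mathbf{a} \otimes \mathbf{b} \otimes \mathbf{c})\nabla\phi, \tag{2.40} \]
with $\mathbf{a} := (\mathbf{x}_i - \mathbf{x})$, $\mathbf{b} := m_i \mathbf{v}_i$, $\mathbf{c} := \mathbf{v}_i$, and $\phi := w$. Here the definition of the divergence of a rank three tensor $\mathbf{M}$ ensures that the divergence theorem holds in the form
\[ \int_{\partial R} \mathbf{M}\mathbf{n} = \int_{R} \mathrm{div}\, \mathbf{M} \tag{2.41} \]
for a regular region $R$ having outward unit normal $\mathbf{n}$ on its boundary $\partial R$. (In Cartesian tensor notation, $(\mathrm{div}\, \mathbf{M})_{ij} = M_{ijk,k}$.) From (2.37) and (2.39), relation (2.34) may be written as
\[ \mathbf{c}_w + \mathbf{J}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{B}_w\} + \mathrm{div}\, \mathbf{M}_w, \tag{2.42} \]
where
\[ \rho_w(\mathbf{x}, t)\mathbf{B}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes m_i \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}) \tag{2.43} \]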
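and
\[ \mathbf{M}_w(\mathbf{x}, t) := \sum_{i=1}^{N} (\mathbf{x}_i(t) - \mathbf{x}) \otimes m_i \mathbf{v}_i(t) \otimes \mathbf{v}_i(t)\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{2.44} \]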
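Recalling that the non-interaction contribution $\mathcal{D}_w$ to stress involves thermal velocities and, with an eye on usual forms of balance having right-hand sides of form $\rho\dot{\boldsymbol{\Psi}}$ for some tensor field $\boldsymbol{\Psi}$, we write
\[ \mathcal{M}_w := \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \otimes \hat{\mathbf{v}}_i\, w(\mathbf{x}_i - \mathbf{x}). \tag{2.45} \]
Accordingly, from (2.44), (2.45), (2.14) and (2.43),
\[ \mathcal{M}_w = \mathbf{M}_w - \rho_w \mathbf{B}_w \otimes \mathbf{v}_w, \tag{2.46} \]
and (2.42) becomes
\[ -\mathrm{div}\, \mathcal{M}_w + \mathbf{c}_w + \mathbf{J}_w = \frac{\partial}{\partial t}\{\rho_w \mathbf{B}_w\} + \mathrm{div}\{\rho_w \mathbf{B}_w \otimes \mathbf{v}_w\}. \tag{2.47} \]
Since (2.40) may be written as
\[ \mathrm{div}\{(\mathbf{a} \otimes \mathbf{b}) \otimes \phi\mathbf{c}\} = (\nabla(\mathbf{a} \otimes \mathbf{b}))\phi\mathbf{c} + \mathrm{div}(\phi\mathbf{c})(\mathbf{a} \otimes \mathbf{b}) \tag{2.48} \]
and this result holds with $\mathbf{a} \otimes \mathbf{b}$ replaced by any second-rank tensor,
\[ \mathrm{div}\{\rho_w \mathbf{B}_w \otimes \mathbf{v}_w\} = \mathrm{div}\{\mathbf{B}_w \otimes \rho_w \mathbf{v}_w\} = (\nabla\mathbf{B}_w)\rho_w \mathbf{v}_w + (\mathrm{div}(\rho_w \mathbf{v}_w))\mathbf{B}_w. \tag{2.49} \]
Hence (2.47) may be written (using (2.6)) in the form
\[ -\mathrm{div}\, \mathcal{M}_w + \mathbf{c}_w + \mathbf{J}_w = \rho_w \dot{\mathbf{B}}_w, \tag{2.50} \]
where the material time derivative
\[ \dot{\mathbf{B}}_w := \frac{\partial}{\partial t}\{\mathbf{B}_w\} + (\nabla\mathbf{B}_w)\mathbf{v}_w. \tag{2.51} \]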
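It proves possible to write $\mathbf{c}_w$ as the divergence of a rank three tensor field via Noll's theorem. To this end we define
\[ \mathbf{G}(\mathbf{x}, \mathbf{y}) := \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \{(\mathbf{x}_i - \mathbf{x}) + (\mathbf{x}_j - \mathbf{y})\} \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x})\, w(\mathbf{x}_j - \mathbf{y}). \tag{2.52} \]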
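Consider
\[ \begin{aligned} \int_{\mathcal{E}} \mathbf{G}(\mathbf{x}, \mathbf{y})\, d\mathbf{y} = {} & \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) \int_{\mathcal{E}} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} \\ & + \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \left\{ \int_{\mathcal{E}} (\mathbf{x}_j - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} \right\} \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}). \end{aligned} \tag{2.53} \]
Changing to variable $\mathbf{u} := \mathbf{x}_j - \mathbf{y}$ yields
\[ \int_{\mathcal{E}} w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} = \int_{\mathcal{V}} w(\mathbf{u})\, d\mathbf{u} = 1, \tag{2.54} \]
using normalisation condition (2.2). Further,
\[ \int_{\mathcal{E}} (\mathbf{x}_j - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{y})\, d\mathbf{y} = \int_{\mathcal{V}} \mathbf{u}\, w(\mathbf{u})\, d\mathbf{u}. \tag{2.55} \]
Thus from (2.54), (2.55), (2.53) and (2.35),
\[ \int_{\mathcal{E}} \mathbf{G}(\mathbf{x}, \mathbf{y})\, d\mathbf{y} = \mathbf{c}_w \tag{2.56} \]
provided that weighting function $w$ satisfies
\[ \int_{\mathcal{V}} \mathbf{u}\, w(\mathbf{u})\, d\mathbf{u} = \mathbf{0}. \tag{2.57} \]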
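Notice that if $w$ is 'balanced', in the sense that
\[ w(-\mathbf{u}) = w(\mathbf{u}), \tag{2.58} \]
then (2.57) is satisfied. We also have
\[ \mathbf{G}(\mathbf{x}, \mathbf{y}) = \sum_{j=1}^{N} \sum_{\substack{i=1 \\ i \neq j}}^{N} \{(\mathbf{x}_j - \mathbf{x}) + (\mathbf{x}_i - \mathbf{y})\} \otimes \mathbf{f}_{ji}\, w(\mathbf{x}_j - \mathbf{x})\, w(\mathbf{x}_i - \mathbf{y}) = -\mathbf{G}(\mathbf{y}, \mathbf{x}) \tag{2.59} \]
on assuming interaction balance (2.28). The remaining boundedness condition necessary for application of Noll's theorem is equivalent to requiring that
\[ \mathbf{u}\, w(\mathbf{u})\|\mathbf{u}\|^{3+\delta} \quad \text{be bounded for } \mathbf{u} \in \mathcal{V}, \tag{2.60} \]
a further restriction upon the choice of $w$. For such weighting functions Noll's theorem enables us to write
\[ \mathbf{c}_w = \mathrm{div}\, \mathbf{C}^-_w \tag{2.61} \]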
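and express balance (2.50) in the form
\[ \mathrm{div}\, \mathbf{C}_w + \mathbf{J}_w = \rho_w \dot{\mathbf{B}}_w. \tag{2.62} \]
Here the generalised couple-stress tensor
\[ \mathbf{C}_w := \mathbf{C}^-_w - \mathcal{M}_w \tag{2.63} \]
with
\[ \begin{aligned} \mathbf{C}^-_w(\mathbf{x}) := -\frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \int_{\mathcal{V}} \int_0^1 & \{(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u}) + (\mathbf{x}_j - \mathbf{x} + (1 - \alpha)\mathbf{u})\} \\ & \otimes \mathbf{f}_{ij} \otimes \mathbf{u}\; w(\mathbf{x}_i - \mathbf{x} - \alpha\mathbf{u})\, w(\mathbf{x}_j - \mathbf{x} + (1 - \alpha)\mathbf{u})\; d\alpha\, d\mathbf{u}. \end{aligned} \tag{2.64} \]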
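REMARK 2. Moment of momentum balance corresponds to the skew part of relations (2.42) or (2.62). Denoting twice the skew part of any rank two tensor $\mathbf{A}$ by $\breve{\mathbf{A}}$, and noting
\[ (\mathbf{a} \otimes \mathbf{b})^{\breve{}} = \mathbf{a} \wedge \mathbf{b}, \tag{2.65} \]
where
\[ \mathbf{a} \wedge \mathbf{b} := \mathbf{a} \otimes \mathbf{b} - \mathbf{b} \otimes \mathbf{a}, \tag{2.66} \]
these relations become
\[ \breve{\mathbf{c}}_w + \breve{\mathbf{J}}_w = \frac{\partial}{\partial t}\{\rho_w \breve{\mathbf{B}}_w\} + \mathrm{div}\, \breve{\mathbf{M}}_w \tag{2.67} \]
and
\[ \mathrm{div}\, \breve{\mathbf{C}}_w + \breve{\mathbf{J}}_w = \rho_w \dot{\breve{\mathbf{B}}}_w. \tag{2.68} \]
Here $\breve{\mathbf{c}}_w$, $\breve{\mathbf{J}}_w$ and $\breve{\mathbf{B}}_w$ are given by definitions (2.35), (2.36) and (2.43) with $\otimes$ replaced by $\wedge$, $\breve{\mathbf{M}}_w$ is given by (2.44) with the first $\otimes$ replaced by $\wedge$, and $\breve{\mathbf{C}}_w$ is the difference of expressions (2.64) and (2.45) with the first $\otimes$ replaced by $\wedge$. Of course, relations (2.67) and (2.68) may be written in terms of the corresponding axial vectors. In so doing it should be noted that $-\mathbf{a} \times \mathbf{b}$ is the axial vector corresponding to $\mathbf{a} \wedge \mathbf{b}$.

3. A Simple Choice of Scale-Dependent Weighting Function and Corresponding Physical Interpretation of Field Values

3.1. A SIMPLE WEIGHTING FUNCTION

In Section 2 the usual forms of mass conservation (2.6) and linear momentum balance (2.32), together with a generalised moment of momentum balance (2.62),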
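were derived using any scalar-valued weighting function $w$ defined on $\mathcal{V}$ of class $C^1$, satisfying boundedness criteria (2.29) and (2.60), and relation (2.57). The only physical requirements so far introduced are that $w$ should have physical dimension (length)$^{-3}$ and be normalised in the sense (2.2). In the absence of any preferred direction for the system, it is natural to take
\[ w(\mathbf{u}) = w_\varepsilon(u), \quad \text{where } u := \|\mathbf{u}\|. \tag{3.1} \]
The simplest way of introducing a length-scale dependence is to choose
\[ w_\varepsilon(u) := \frac{3}{4\pi\varepsilon^3} \ \text{ if } u < \varepsilon \qquad \text{and} \qquad w_\varepsilon(u) := 0 \ \text{ if } u \geq \varepsilon. \tag{3.2} \]
Clearly $w_\varepsilon$ is normalised, and fields $\rho_w$, $\mathbf{p}_w$, $\mathbf{f}_w$, $\mathbf{b}_w$, $\mathbf{c}_w$, $\mathbf{J}_w$ and $\mathbf{B}_w$ have simple interpretations as local averages of molecular variables. For example (see (2.1)), $\rho_w(\mathbf{x}, t)$ represents the mass of those particles which at time $t$ reside within that sphere $S_\varepsilon(\mathbf{x})$ of radius $\varepsilon$ centred at $\mathbf{x}$ divided by the volume of this sphere. However, $w_\varepsilon$ is not continuous wherever $u = \varepsilon$. It is a simple matter to "mollify" $w_\varepsilon$ over an interval $(\varepsilon, \varepsilon + \delta)$ in such a way that $w$ is of arbitrary smoothness up to class $C^\infty$ everywhere, with $w$ constant on $0 \leq u \leq \varepsilon$ and zero for $u \geq \varepsilon + \delta$. Here $\delta\,(> 0)$ is arbitrarily small (see [5, p. 160]): for example, we could choose $\delta = 10^{-6}$ Å $= 10^{-16}$ m. In such case the physical interpretations of fields $\rho_w$, etc. are essentially indistinguishable from those delivered by choice (3.2). Such mollification involves monotone decreasing smooth functions wherever $\varepsilon \leq u \leq \varepsilon + \delta$. Nevertheless, such functions $w$ have bounded (although very large) derivative values on $(\varepsilon, \varepsilon + \delta)$. Of course, these derivatives vanish on $[0, \varepsilon] \cup [\varepsilon + \delta, \infty)$, and accordingly boundedness criteria (2.29) and (2.60) are satisfied.

3.2. INTERPRETATION OF FIELD VALUES

3.2.1. Values of $\rho_w$, $\mathbf{p}_w$, $\mathbf{f}_w$, $\mathbf{b}_w$, $\mathbf{c}_w$, $\mathbf{J}_w$ and $\rho_w\mathbf{B}_w$ are immediately seen to deliver values, at geometrical point $\mathbf{x}$ and time $t$, of sums of additive molecular quantities taken over those molecules which lie within $S_\varepsilon(\mathbf{x})$, divided by the volume $V_\varepsilon$ of this sphere. Modulo satisfaction of Newton's third law (2.28), the expressions for $\mathbf{f}_w$ and $\mathbf{c}_w$ can be reduced somewhat. Writing
\[ \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) = \sum_{\substack{P_i, P_k \in S_\varepsilon(\mathbf{x}) \\ i \neq k}} \frac{\mathbf{f}_{ik}}{V_\varepsilon} + \sum_{P_i \in S_\varepsilon(\mathbf{x})}\ \sum_{P_\ell \notin S_\varepsilon(\mathbf{x})} \frac{\mathbf{f}_{i\ell}}{V_\varepsilon}, \tag{3.3} \]
we note that the first sum is the same as
\[ \frac{1}{2V_\varepsilon} \sum_{\substack{P_i, P_k \in S_\varepsilon(\mathbf{x}) \\ i \neq k}} (\mathbf{f}_{ik} + \mathbf{f}_{ki}), \]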
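which vanishes by (2.28). Thus $\mathbf{f}_w(\mathbf{x}, t)$ is the resultant force of molecules outside $S_\varepsilon(\mathbf{x})$ upon those inside $S_\varepsilon(\mathbf{x})$, at time $t$, divided by $V_\varepsilon$. Similarly, again invoking (2.28) and adopting the book-keeping on the right-hand side of (3.3),
\[ \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{ij}\, w(\mathbf{x}_i - \mathbf{x}) = \frac{1}{2} \sum_{i \neq k} (\mathbf{x}_i - \mathbf{x}_k) \otimes \frac{\mathbf{f}_{ik}}{V_\varepsilon} + \sum_{i} \sum_{\ell} (\mathbf{x}_i - \mathbf{x}) \otimes \frac{\mathbf{f}_{i\ell}}{V_\varepsilon}. \tag{3.4} \]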
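The interpretation of $\mathbf{f}_w$ just obtained can be illustrated with the unmollified choice (3.2). The sketch below compares definition (2.9) with the resultant of outside-on-inside forces divided by the sphere volume. The random configuration, the randomly generated pairwise-balanced forces, and all function names are assumptions introduced here for illustration.

```python
import math
import random


def top_hat(u, eps):
    """w_eps of (3.2): 3 / (4 pi eps^3) inside the ball of radius eps, else 0."""
    r = math.sqrt(sum(c * c for c in u))
    return 3.0 / (4.0 * math.pi * eps ** 3) if r < eps else 0.0


def f_w_definition(x, positions, f, eps):
    """f_w(x) = sum_i sum_{j != i} f_ij w(x_i - x), as in (2.9)."""
    n = len(positions)
    out = [0.0, 0.0, 0.0]
    for i in range(n):
        wi = top_hat([positions[i][k] - x[k] for k in range(3)], eps)
        for j in range(n):
            if j != i:
                for k in range(3):
                    out[k] += f[i][j][k] * wi
    return out


def f_w_outside_on_inside(x, positions, f, eps):
    """Resultant force of particles outside S_eps(x) on those inside, over V_eps."""
    n = len(positions)
    volume = 4.0 * math.pi * eps ** 3 / 3.0
    inside = [top_hat([p[k] - x[k] for k in range(3)], eps) > 0.0 for p in positions]
    out = [0.0, 0.0, 0.0]
    for i in range(n):
        for j in range(n):
            if inside[i] and not inside[j]:
                for k in range(3):
                    out[k] += f[i][j][k] / volume
    return out


def demo(seed=3, n=30, eps=0.8):
    """Evaluate both expressions at the origin for one random configuration."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    f = [[[0.0, 0.0, 0.0] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            f[i][j] = [rng.uniform(-1.0, 1.0) for _ in range(3)]
            f[j][i] = [-c for c in f[i][j]]  # pairwise balance (2.28)
    x = [0.0, 0.0, 0.0]
    return f_w_definition(x, positions, f, eps), f_w_outside_on_inside(x, positions, f, eps)
```

The two results agree to rounding error because contributions from pairs with both particles inside the sphere cancel in (2.9).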
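If interactions are governed by separation-dependent pair potentials, then
\[ \mathbf{f}_{ik} = \alpha_{ik}(\mathbf{x}_i - \mathbf{x}_k), \tag{3.5} \]
where $\alpha_{ik}$ is a scalar-valued function of $\|\mathbf{x}_i - \mathbf{x}_k\|$. Accordingly, from (3.4) and (2.35),
\[ \mathbf{c}_w(\mathbf{x}, t) = \tilde{\mathbf{c}}_w(\mathbf{x}, t) + \hat{\mathbf{c}}_w(\mathbf{x}, t), \tag{3.6} \]
where $\tilde{\mathbf{c}}_w(\mathbf{x}, t)$ takes symmetric values and $\hat{\mathbf{c}}_w(\mathbf{x}, t)$ represents at instant $t$ the resultant tensor moment of forces about $\mathbf{x}$ exerted by molecules outside $S_\varepsilon(\mathbf{x})$ upon those inside $S_\varepsilon(\mathbf{x})$, divided by $V_\varepsilon$. Specifically,
\[ \tilde{\mathbf{c}}_w(\mathbf{x}, t) := \frac{1}{2} \sum_{\substack{P_i, P_k \in S_\varepsilon(\mathbf{x}) \\ i \neq k}} (\mathbf{x}_i - \mathbf{x}_k) \otimes \alpha_{ik}(\mathbf{x}_i - \mathbf{x}_k)\, w(\mathbf{x}_i - \mathbf{x}) \tag{3.7} \]
and
\[ \hat{\mathbf{c}}_w(\mathbf{x}, t) := \sum_{P_i \in S_\varepsilon(\mathbf{x})}\ \sum_{P_\ell \notin S_\varepsilon(\mathbf{x})} (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{f}_{i\ell}\, w(\mathbf{x}_i - \mathbf{x}). \tag{3.8} \]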
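3.2.2. The geometrical interpretations of $\mathbf{T}^-_w$ and $\mathbf{C}^-_w$ follow from a theorem which relates particular molecular interactions to the integrals of $\mathbf{T}^-_w\mathbf{n}$ and $\mathbf{C}^-_w\mathbf{n}$ over any subset $S$ of an oriented plane. Specifically, let $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ denote that oriented plane through point $\mathbf{x}_0$ with unit normal $\mathbf{n}$. Then $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ divides $\mathcal{E}$ into the two open subsets
\[ \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0) := \{\mathbf{z} \in \mathcal{E} : (\mathbf{z} - \mathbf{x}_0) \cdot \mathbf{n} > 0\} \quad \text{and} \quad \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0) := \{\mathbf{y} \in \mathcal{E} : (\mathbf{y} - \mathbf{x}_0) \cdot \mathbf{n} < 0\}. \tag{3.9} \]
More precisely,
\[ \mathcal{E} = \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0) \cup \Pi_{\mathbf{n}}(\mathbf{x}_0) \cup \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0). \tag{3.10} \]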
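THEOREM. If $S$ is a connected subset of $\Pi_{\mathbf{n}}(\mathbf{x}_0)$ then, for any function $\mathbf{g}$ as in Noll's theorem,
\[ \int_{D(S)} \mathbf{g}(\mathbf{y}, \mathbf{z})\, d\mathbf{y}\, d\mathbf{z} = \int_{S} \left\{ -\frac{1}{2} \int_{\mathcal{V}} \int_0^1 \mathbf{g}(\mathbf{x} + \alpha\mathbf{u}, \mathbf{x} - (1 - \alpha)\mathbf{u}) \otimes \mathbf{u}\; d\alpha\, d\mathbf{u} \right\} \mathbf{n}\; dS_{\mathbf{x}}, \tag{3.11} \]
where domain
\[ D(S) := \{(\mathbf{y}, \mathbf{z}) \in \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0) \times \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0) \text{ and } \ell(\mathbf{y}, \mathbf{z}) \text{ intersects } S\}, \tag{3.12} \]
and $\ell(\mathbf{y}, \mathbf{z})$ denotes the line through points $\mathbf{y}$ and $\mathbf{z}$. See [8] for a proof of this theorem. Choice (2.26) of $\mathbf{g}$ together with (2.31) yields from (3.11)
\[ \int_{S} \mathbf{T}^-_w \mathbf{n}\; dS = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij} F_{ij}(S), \tag{3.13} \]
where
\[ F_{ij}(S) := \int_{D(S)} w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}. \tag{3.14} \]
Similarly, choice (2.52) of $\mathbf{g}$ with (2.64) gives
\[ \int_{S} \mathbf{C}^-_w \mathbf{n}\; dS = \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{F}_{ij}(S) \otimes \mathbf{f}_{ij}, \tag{3.15} \]
where
\[ \mathbf{F}_{ij}(S) := \int_{D(S)} [(\mathbf{x}_i - \mathbf{y}) + (\mathbf{x}_j - \mathbf{z})]\, w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z}. \tag{3.16} \]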
REMARK 3. Before discussing the values of $F_{ij}(S)$ and $\mathbf{F}_{ij}(S)$ for the mollified version of $w_\varepsilon$ it is instructive to consider the formal limit of $w_\varepsilon$ as scale $\varepsilon$ tends to zero. This is provided by choosing the Dirac $\delta$ distribution in place of $w_\varepsilon$. Accordingly, from (3.14)
\[ F_{ij}(S) := \int_{D(S)} \delta(\mathbf{x}_i - \mathbf{y})\, \delta(\mathbf{x}_j - \mathbf{z})\, d\mathbf{y}\, d\mathbf{z} \tag{3.17} \]
takes the value 1 if $\ell(\mathbf{y}, \mathbf{z})$ passes through $S$, and is otherwise zero. In the same way, from (3.16) $\mathbf{F}_{ij}(S)$ is seen to be zero for all particle pairs. Accordingly, we have the simple result (see Figure 1)
\[ \int_{S} \mathbf{T}^-_w \mathbf{n}\; dS = \sum_{i} \sum_{j} \mathbf{f}_{ij}, \tag{3.18} \]
Figure 1.
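where sums are taken only over particle pairs for which $P_i \in \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0)$, $P_j \in \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0)$, and the line through $P_i$ and $P_j$ intersects $S$. Further, from (2.31) with $w = \delta$ we see the only contributions to $\mathbf{T}^-_w(\mathbf{x})$ derive from $\alpha$ and $\mathbf{u}$ values for which
\[ \mathbf{x}_i = \mathbf{x} + \alpha\mathbf{u} \quad \text{and} \quad \mathbf{x}_j = \mathbf{x} - (1 - \alpha)\mathbf{u}. \tag{3.19} \]
Thus
\[ \mathbf{x}_j - \mathbf{x}_i = -\mathbf{u} \tag{3.20} \]
and (2.31) takes form
\[ \mathbf{T}^-_w(\mathbf{x}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \mathbf{f}_{ij} \otimes (\mathbf{x}_j - \mathbf{x}_i)\, a_{ij}(\mathbf{x}) \tag{3.21} \]
for some scalar-valued functions $a_{ij}$. Accordingly, for interactions of form (3.5), $\mathbf{T}^-_w$ takes symmetric values. Irving and Kirkwood [1] first obtained the foregoing, strikingly simple, interpretation of $\mathbf{T}^-_w$ given in (3.18). Setting $w = \delta$ in (3.16) yields zero value for each $\mathbf{F}_{ij}(S)$. Thus from (2.64) the couple-stress vanishes for such choice.

More generally, in determining the values of $F_{ij}(S)$ and $\mathbf{F}_{ij}(S)$ for any particular pair of particles and choice of scale $\varepsilon$ embodied in $w_\varepsilon$ we note that: (i) the only nonzero contributions come from domains in which both $\|\mathbf{y} - \mathbf{x}_i\| < \varepsilon$ and $\|\mathbf{z} - \mathbf{x}_j\| < \varepsilon$, and in such case $w(\mathbf{x}_i - \mathbf{y})\, w(\mathbf{x}_j - \mathbf{z})$ takes the value $V_\varepsilon^{-2}$, (ii) the $\mathbf{y}$ domain lies in $\mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0)$ and the $\mathbf{z}$ domain in $\mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0)$, and (iii) line $\ell(\mathbf{y}, \mathbf{z})$ passes through $S$. Consequently there is no simple expression for either $F_{ij}(S)$ or $\mathbf{F}_{ij}(S)$, and a number of different cases must be taken into account.

Case 1 (See Figure 2). If $S_\varepsilon(\mathbf{x}_i) \subset \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0)$, $S_\varepsilon(\mathbf{x}_j) \subset \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0)$, and any line joining a point in sphere $S_\varepsilon(\mathbf{x}_i)$ to a point in sphere $S_\varepsilon(\mathbf{x}_j)$ passes through $S$, then
\[ F_{ij}(S) = 1 \quad \text{and} \quad \mathbf{F}_{ij}(S) = \mathbf{0}. \tag{3.22} \]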
Figure 2. Case 1: $F_{ij}(S) = 1$ and $\mathbf{F}_{ij}(S) = \mathbf{0}$.
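The latter result is a consequence of $\mathbf{x}_i$ and $\mathbf{x}_j$ being the centroids of, respectively, $S_\varepsilon(\mathbf{x}_i)$ and $S_\varepsilon(\mathbf{x}_j)$, and the spherical symmetry of $w_\varepsilon$. Accordingly the contribution to the interaction stress integral (3.13) is $\mathbf{f}_{ij}$ and to couple-stress integral (3.15) is zero.

Case 2 (See Figure 3(i) and (ii)). If $\mathbf{x}_i$ and $\mathbf{x}_j$ both lie within a distance $\varepsilon$ of $S$, and all lines joining points in $S_\varepsilon(\mathbf{x}_i)$ to points in $S_\varepsilon(\mathbf{x}_j)$ pass through $S$, then from (3.14)
\[ F_{ij}(S) = V_\varepsilon^{-2} \int_{S^-_\varepsilon(\mathbf{x}_i)} \int_{S^+_\varepsilon(\mathbf{x}_j)} d\mathbf{y}\, d\mathbf{z} = \frac{V_i^- V_j^+}{V_\varepsilon^2}, \tag{3.23} \]
where
\[ S^-_\varepsilon(\mathbf{x}_i) := S_\varepsilon(\mathbf{x}_i) \cap \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0) \quad \text{and} \quad S^+_\varepsilon(\mathbf{x}_j) := S_\varepsilon(\mathbf{x}_j) \cap \mathcal{E}^+_{\mathbf{n}}(\mathbf{x}_0), \tag{3.24} \]
with ($\ell = i$ or $j$)
\[ V_\ell^{\pm} := \mathrm{vol}(S^{\pm}_\varepsilon(\mathbf{x}_\ell)). \tag{3.25} \]
The net contribution of particles $P_i$ and $P_j$ to the integral in (3.13) is
\[ \begin{aligned} \mathbf{f}_{ij} F_{ij}(S) + \mathbf{f}_{ji} F_{ji}(S) &= \mathbf{f}_{ij}(F_{ij}(S) - F_{ji}(S)) \\ &= \left\{ \frac{V_i^- V_j^+ - V_j^- V_i^+}{V_\varepsilon^2} \right\} \mathbf{f}_{ij} \\ &= \left\{ 1 - \frac{V_i^+ + V_j^-}{V_\varepsilon} \right\} \mathbf{f}_{ij} \\ &= \left\{ \frac{V_i^- - V_j^-}{V_\varepsilon} \right\} \mathbf{f}_{ij} \quad \text{or} \quad \left\{ \frac{V_j^+ - V_i^+}{V_\varepsilon} \right\} \mathbf{f}_{ij}, \end{aligned} \tag{3.26} \]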
Figure 3. Case 2: $F_{ij}(S) = V_i^- V_j^+ / V_\varepsilon^2$ and $\mathbf{F}_{ij}(S) = \{(\mathbf{x}_i - \bar{\mathbf{y}}_i^-) + (\mathbf{x}_j - \bar{\mathbf{z}}_j^+)\} V_i^- V_j^+ / V_\varepsilon^2$.
on noting ($\ell = i$ or $j$)
\[ V_\ell^+ + V_\ell^- = V_\varepsilon. \tag{3.27} \]
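In Case 2 the volumes $V_\ell^{\pm}$ are those of the two parts of a ball cut by a plane, so (3.23) and (3.26) can be evaluated in closed form. The sketch below assumes, as Case 2 does, that every line joining the two spheres passes through $S$; the signed distances `h_i`, `h_j` of the centres from the plane and all function names are hypothetical conveniences introduced here.

```python
import math


def ball_volume(eps):
    """V_eps, the volume of a ball of radius eps."""
    return 4.0 * math.pi * eps ** 3 / 3.0


def volume_plus(h, eps):
    """Volume of the part of a ball of radius eps on the positive side of the
    plane, where h is the signed distance of the ball's centre from the plane."""
    if h >= eps:
        return ball_volume(eps)
    if h <= -eps:
        return 0.0
    cap_height = eps + h
    # Standard spherical-cap volume: pi * height^2 * (3 * radius - height) / 3.
    return math.pi * cap_height ** 2 * (3.0 * eps - cap_height) / 3.0


def volume_minus(h, eps):
    """Complementary volume, using (3.27)."""
    return ball_volume(eps) - volume_plus(h, eps)


def F(h_i, h_j, eps):
    """F_ij(S) = V_i^- V_j^+ / V_eps^2, as in (3.23)."""
    return volume_minus(h_i, eps) * volume_plus(h_j, eps) / ball_volume(eps) ** 2


def net_factor(h_i, h_j, eps):
    """Coefficient of f_ij in (3.26): F_ij(S) - F_ji(S)."""
    return F(h_i, h_j, eps) - F(h_j, h_i, eps)
```

With both centres more than `eps` from the plane on opposite sides, `net_factor` reduces to 1, which recovers the Case 1 contribution $\mathbf{f}_{ij}$.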
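Further, in this situation (3.16) yields
\[ \mathbf{F}_{ij}(S) = V_\varepsilon^{-2} \left\{ V_j^+ \int_{S^-_\varepsilon(\mathbf{x}_i)} (\mathbf{x}_i - \mathbf{y})\, d\mathbf{y} + V_i^- \int_{S^+_\varepsilon(\mathbf{x}_j)} (\mathbf{x}_j - \mathbf{z})\, d\mathbf{z} \right\}. \]
Hence
\[ \mathbf{F}_{ij}(S) = \frac{V_i^- V_j^+}{V_\varepsilon^2} \{(\mathbf{x}_i - \bar{\mathbf{y}}_i^-) + (\mathbf{x}_j - \bar{\mathbf{z}}_j^+)\}, \tag{3.28} \]
where $\bar{\mathbf{y}}_i^-$ and $\bar{\mathbf{z}}_j^+$ denote the centroids of regions $S^-_\varepsilon(\mathbf{x}_i)$ and $S^+_\varepsilon(\mathbf{x}_j)$, respectively. Since $\bar{\mathbf{y}}_i^-$ ($\bar{\mathbf{z}}_j^+$) lies on that line through $\mathbf{x}_i$ ($\mathbf{x}_j$) parallel to $\mathbf{n}$, by symmetry,
\[ \mathbf{F}_{ij}(S) \text{ is parallel to } \mathbf{n}. \tag{3.29} \]
The analogue of (3.26) for the net contribution to the integral in (3.15) from particles $P_i$ and $P_j$ is
\[ \mathbf{F}_{ij}(S) \otimes \mathbf{f}_{ij} + \mathbf{F}_{ji}(S) \otimes \mathbf{f}_{ji} = \{(\mathbf{x}_i - \bar{\mathbf{y}}_i^-)V_i^- - (\mathbf{x}_j - \bar{\mathbf{z}}_j^-)V_j^-\}\, V_\varepsilon^{-1} \otimes \mathbf{f}_{ij} \tag{3.30} \]
upon using (2.28) and (3.27), where $\bar{\mathbf{z}}_j^-$ denotes the centroid of $S^-_\varepsilon(\mathbf{x}_j)$. Notice that the foregoing results also hold when $S_\varepsilon(\mathbf{x}_i)$ and $S_\varepsilon(\mathbf{x}_j)$ intersect. Also, $F_{ij}(S)$ never vanishes, even when $\mathbf{x}_i$ and $\mathbf{x}_j$ lie on the same side of $S$: see (3.23) and Figure 3(ii). Further, the net contribution of $P_i$ and $P_j$, given by (3.26), is a nonzero multiple of $\mathbf{f}_{ij}$ provided that $\mathbf{x}_i$ and $\mathbf{x}_j$ are not equidistant from $S$.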
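Other cases. For particles $P_i$ and $P_j$ for which not all lines joining points in $S_\varepsilon(\mathbf{x}_i)$ to points in $S_\varepsilon(\mathbf{x}_j)$ intersect $S$, results are more complex. In the situation depicted in Figure 4(i),
\[ F_{ij}(S) = \frac{1}{V_\varepsilon^2} \int_{S_\varepsilon(\mathbf{x}_j)} \int_{R_i^-(\mathbf{z})} d\mathbf{y}\, d\mathbf{z}, \]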
Figure 4. Some other cases.
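where
\[ R_i^-(\mathbf{z}) := \{\mathbf{y} \in \mathcal{E}^-_{\mathbf{n}}(\mathbf{x}_0) \cap S_\varepsilon(\mathbf{x}_i) \text{ and } \ell(\mathbf{y}, \mathbf{z}) \text{ intersects } S\}. \]
Figure 4(ii) indicates a situation in which
\[ F_{ij}(S) = \frac{1}{V_\varepsilon^2} \int_{R^+(\mathbf{x}_j)} \int_{R_i^-(\mathbf{z})} d\mathbf{y}\, d\mathbf{z}, \]
where $R^+(\mathbf{x}_j)$ is the set of points $\mathbf{z} \in S_\varepsilon(\mathbf{x}_j)$ such that there exists a point $\mathbf{y} \in S_\varepsilon(\mathbf{x}_i)$ for which $\ell(\mathbf{y}, \mathbf{z})$ intersects $S$.

3.2.3. Fields $\mathcal{D}_w$ and $\mathcal{M}_w$ (see (2.17) and (2.45)) are to be identified with fluxes of momentum and generalised moment of momentum associated with molecular mass transport. To see this note that as a consequence of (2.15)
\[ \mathcal{D}_w(\mathbf{x}, t) = \sum_{i=1}^{N} m_i \mathbf{v}_i(t) \otimes \hat{\mathbf{v}}_i(t; \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}). \tag{3.31} \]
For any unit vector $\mathbf{n}$ it follows that, suppressing time dependence for brevity,
\[ \mathcal{D}_w(\mathbf{x})\mathbf{n} = \sum_{i=1}^{N} m_i \mathbf{v}_i \{(\mathbf{v}_i - \mathbf{v}(\mathbf{x})) \cdot \mathbf{n}\}\, w(\mathbf{x}_i - \mathbf{x}) \tag{3.32} \]
and
\[ \mathcal{M}_w(\mathbf{x})\mathbf{n} = \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i \{(\mathbf{v}_i - \mathbf{v}(\mathbf{x})) \cdot \mathbf{n}\}\, w(\mathbf{x}_i - \mathbf{x}). \tag{3.33} \]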
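Each particle $P_i$ within a distance $\varepsilon$ of $\mathbf{x}$ contributes to sums (3.32) and (3.33) a weighted multiple of $m_i \mathbf{v}_i$ and $(\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i$, respectively: the weighting factor, $\alpha_i$ say, is $(\mathbf{v}_i - \mathbf{v}(\mathbf{x})) \cdot \mathbf{n} / V_\varepsilon$. Such velocities $\mathbf{v}_i$, if constant, would involve $P_i$ crossing a plane surface, moving with velocity $\mathbf{v}(\mathbf{x})$, through $\mathbf{x}$ with unit normal $\mathbf{n}$ from $\mathcal{E}^-_{\mathbf{n}}(\mathbf{x})$ into $\mathcal{E}^+_{\mathbf{n}}(\mathbf{x})$ (see (3.9)) if $\alpha_i > 0$ and in the opposite sense if $\alpha_i < 0$. Notice also that the contribution to Cauchy stress $\mathbf{T}_w$ from the momentum flux (see (2.33)) is 'pressure-like' in that, for any vector $\mathbf{n}$,
\[ \mathcal{D}_w \mathbf{n} \cdot \mathbf{n} = \sum_{i=1}^{N} m_i (\hat{\mathbf{v}}_i \cdot \mathbf{n})^2\, w > 0. \tag{3.34} \]
(The only exception would be physically-unrealistic situations in which, for some $\mathbf{n}$, all particles have the same $\mathbf{n}$ component of velocity.) Quantity $\hat{\mathbf{v}}_i(t; \mathbf{x})$ approximates, for particles near $\mathbf{x}$, the thermal velocity
\[ \tilde{\mathbf{v}}_i(t) := \mathbf{v}_i(t) - \mathbf{v}_w(\mathbf{x}_i(t), t) \tag{3.35} \]
of $P_i$ corresponding to $w$. It is the kinetic energy associated with such thermal velocities that is, according to the kinetic theory of heat (see [9]), identified with heat energy. Specifically, at the scale embodied in the choice of $w$,
\[ \rho_w(\mathbf{x}, t)\, h_w(\mathbf{x}, t) := \sum_{i=1}^{N} \frac{1}{2} m_i \tilde{\mathbf{v}}_i^2(t)\, w(\mathbf{x}_i(t) - \mathbf{x}) \tag{3.36} \]
is the heat content density. Modulo the approximation
\[ \tilde{\mathbf{v}}_i(t) \simeq \hat{\mathbf{v}}_i(t; \mathbf{x}) \quad \text{if } \|\mathbf{x}_i(t) - \mathbf{x}\| < \varepsilon, \tag{3.37} \]
from (2.17) we have
\[ \mathrm{tr}\, \mathcal{D}_w = 2\rho h. \tag{3.38} \]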
Consider a moderately-rarefied gas macroscopically at rest in a container of volume $V$. In such case $\tilde{\mathbf{v}}_i = \mathbf{v}_i$ and $\mathbf{T}^-_w$ is negligible in comparison with $\mathcal{D}_w$: interactions occur only 'occasionally' via binary 'collisions'. In such case it is reasonable to expect $\mathcal{D}_w$ to be isotropic and constant, except for a region of thickness $2\varepsilon$ centred on the boundary of the container. Thus for some $P > 0$ (see (3.34))
\[ \mathcal{D}_w = P\mathbf{1}, \tag{3.39} \]
and integration of (3.38) over the container interior yields (modulo neglect of boundary inhomogeneity)
\[ 3PV = \sum_{i=1}^{N} m_i \mathbf{v}_i^2, \tag{3.40} \]
since,
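\[ \int_{\text{container}} 2\rho h = \int_{\mathcal{E}} 2\rho h = \sum_{i=1}^{N} m_i \mathbf{v}_i^2 \tag{3.41} \]
as a consequence of normalisation property (2.2). If each molecule has mass $m$ then
\[ \sum_{i=1}^{N} m_i \mathbf{v}_i^2 = N m \overline{v^2}, \tag{3.42} \]
where $\overline{v^2}$ is the mean square velocity of molecules. Thus, from (3.40), (3.41) and (3.42),
\[ PV = \frac{1}{3} N m \overline{v^2}. \tag{3.43} \]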
This is the ideal gas relation: the temperature $\theta$ in such context is given (see, for example, [10, Section 19.4]) by
\[ \theta = m\overline{v^2}/3k, \tag{3.44} \]
where $k$ denotes the Boltzmann constant.

4. Discussion

4.1. ALTERNATIVE CHOICES OF WEIGHTING FUNCTION

Averaging via weighting functions may be repeated, by defining the $w$-average, $f_w$, of a spatial field $f$ via
\[ f_w(\mathbf{x}) := \int_{\text{all space}} f(\mathbf{y})\, w(\mathbf{y} - \mathbf{x})\, d\mathbf{y}. \tag{4.1} \]
This accords with microscopic averages computed in Section 2 upon writing discrete (that is, purely microscopic) quantities in terms of distributions. For example, the microscopic mass density (at any given instant: time-dependence is suppressed)
\[ \rho_{\mathrm{mic}}(\mathbf{x}) := \sum_{i=1}^{N} m_i\, \delta(\mathbf{x}_i - \mathbf{x}), \tag{4.2} \]
where $\delta$ denotes the three-dimensional Dirac distribution. Clearly, from (4.1), (4.2) and (2.1),
\[ (\rho_{\mathrm{mic}})_w = \rho_w. \tag{4.3} \]
Upon repeating a $w$-average it is natural to compare $(f_w)_w$ with $f_w$. If one requires that repeated averaging yields nothing new, that is if
\[ (f_w)_w = f_w, \tag{4.4} \]
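then the form of $w$ may be determined (see [5, p. 161]). In unbounded domains the convolution format of (4.1) implies that the Fourier transform $\hat{w}(\mathbf{k})$ of $w$ should satisfy
\[ \hat{w}(\mathbf{k})^2 = \hat{w}(\mathbf{k}). \tag{4.5} \]
Thus $\hat{w}(\mathbf{k}) = 0$ or $1$ and the simplest (and most physical) choice is for a wavevector 'cut-off', say at $|\mathbf{k}| = \varepsilon^{-1}$ for some choice of the length scale $\varepsilon$. That is,
\[ \hat{w}(\mathbf{k}) = 1 \ \text{ if } |\mathbf{k}| < \varepsilon^{-1}, \qquad \hat{w}(\mathbf{k}) = 0 \ \text{ if } |\mathbf{k}| \geq \varepsilon^{-1}. \tag{4.6} \]
In such case it follows that
\[ w(\mathbf{d}) = \frac{1}{2\pi^2 d^3} \left\{ \sin\frac{d}{\varepsilon} - \frac{d}{\varepsilon}\cos\frac{d}{\varepsilon} \right\}, \tag{4.7} \]
where
\[ d := |\mathbf{d}|. \tag{4.8} \]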
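Expression (4.7) can be compared with a direct numerical inversion of the cut-off (4.6). The sketch below assumes the Fourier convention $w(\mathbf{d}) = (2\pi)^{-3}\int \hat{w}(\mathbf{k})\, e^{i\mathbf{k}\cdot\mathbf{d}}\, d\mathbf{k}$, for which the angular integration leaves the radial integral $(2\pi^2 d)^{-1}\int_0^{1/\varepsilon} k \sin(kd)\, dk$; that convention, the midpoint quadrature, and the function names are assumptions made here.

```python
import math


def w_cutoff(d, eps):
    """Closed form (4.7), as a function of the distance d > 0."""
    s = d / eps
    return (math.sin(s) - s * math.cos(s)) / (2.0 * math.pi ** 2 * d ** 3)


def w_cutoff_numeric(d, eps, n=4000):
    """Midpoint-rule estimate of (1 / (2 pi^2 d)) * int_0^{1/eps} k sin(k d) dk."""
    h = (1.0 / eps) / n
    total = 0.0
    for q in range(n):
        k = (q + 0.5) * h
        total += k * math.sin(k * d)
    return total * h / (2.0 * math.pi ** 2 * d)
```

Evaluating `w_cutoff` at, for example, $d = 5\varepsilon$ gives a negative value, so this choice of $w$ is not a non-negative weighting; as $d \to 0$ the value tends to $1/(6\pi^2\varepsilon^3)$.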
The analogue for a bounded rectangular region of dimensions $2L_1 \times 2L_2 \times 2L_3$ yields truncated (at wavelength $\varepsilon$) multiple Fourier series which are delivered by
\[ w(\mathbf{d}) := \frac{1}{8L_1L_2L_3} \prod_{i=1}^{3} \frac{\sin((N_i + 1/2)d_i)}{\sin(d_i/2)}, \tag{4.9} \]
where $N_i$ is the integral part of $2L_i/\varepsilon$ and $\mathbf{d} = (d_1, d_2, d_3)$. A consequence of using (scale-dependent) weighting functions of form (4.7) or (4.9) is that averaging at scale $\varepsilon_1$, followed by a further averaging at scale $\varepsilon_2$, yields the same result as merely averaging once at the larger of the two scales. Choice (4.7) does not satisfy the boundedness conditions (2.24) of Noll's theorem, and accordingly $\mathbf{f}_w$ and $\mathbf{c}_w$ do not appear to be expressible in divergence form. Thus, for this choice, the balances of linear and generalised moment of momentum remain in the forms (2.19) and (2.50). To be able to invoke Noll's theorem, choice (4.9) must be modified in the same manner as $w_\varepsilon$ given by (3.2). However, the interpretations analogous to those of (3.13) and (3.15) are no longer so transparent and simple.

4.2. PARTIAL STRESS IN MIXTURE THEORY

Mass conservation and linear momentum balance for any single constituent in a non-reacting mixture can be obtained using the methodology of Sections 2.1 and 2.2. For constituent $\alpha$ the analogue of (2.19) is
\[ -\mathrm{div}\, \mathcal{D}_\alpha + \mathbf{f}_{\alpha\alpha} + \sum_{\beta \neq \alpha} \mathbf{f}_{\alpha\beta} + \mathbf{b}_\alpha = \rho_\alpha \mathbf{a}_\alpha, \tag{4.10} \]
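where
\[ \mathbf{a}_\alpha := \frac{\partial \mathbf{v}_\alpha}{\partial t} + (\nabla\mathbf{v}_\alpha)\mathbf{v}_\alpha \tag{4.11} \]
denotes the $\alpha$ intrinsic acceleration field. Here $\mathbf{f}_{\alpha\alpha}$ denotes the body force density associated with $\alpha$–$\alpha$ interactions, and $\mathbf{f}_{\alpha\beta}$ represents the body force density which derives from the effect on constituent $\alpha$ molecules due to those of constituent $\beta$. The $\beta$ sum is over all constituents except $\alpha$, and the weighting function subscript has been suppressed. Field $\mathcal{D}_\alpha$ is given by (2.17) with sum taken only over $\alpha$ molecules and velocity $\mathbf{v}_\alpha$ in place of $\mathbf{v}_w$ in definition (2.14), and $\mathbf{b}_\alpha$ denotes the force density which derives from all influences outside the mixture. It is only possible to invoke Noll's theorem in respect of $\mathbf{f}_{\alpha\alpha}$, in precisely the manner of Section 2.2: separately, and in combination, this is impossible for $\mathbf{f}_{\alpha\beta}$ ($\beta \neq \alpha$). Accordingly there exists (with choice $w$ as in Section 2) an $\alpha$–$\alpha$ interaction stress tensor $\mathbf{T}^-_\alpha$ such that
\[ \mathbf{f}_{\alpha\alpha} = \mathrm{div}\, \mathbf{T}^-_\alpha. \tag{4.12} \]
Thus (4.10) takes the form
\[ \mathrm{div}\, \mathbf{T}_\alpha + \sum_{\beta \neq \alpha} \mathbf{f}_{\alpha\beta} + \mathbf{b}_\alpha = \rho_\alpha \mathbf{a}_\alpha, \tag{4.13} \]
where the $\alpha$ partial stress tensor
\[ \mathbf{T}_\alpha := \mathbf{T}^-_\alpha - \mathcal{D}_\alpha. \tag{4.14} \]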
This partial stress differs in its interpretation from that of Truesdell [11] and Bowen [12], who regarded $\mathbf{T}_\alpha\mathbf{n}$ as yielding the traction on an oriented surface $S$ with unit normal $\mathbf{n}$ due to the whole mixture on the 'positive' side of $S$ upon species $\alpha$ on its negative side. This latter interpretation gave rise to a paradox (see [13]). The interpretation here given (which resolves the paradox) was also derived via the corpuscular considerations of Murdoch and Morro [14, 15] using cellular averaging. The current use of weighting functions and Noll's Theorem makes precise this earlier approach.

4.3. COUPLE STRESS AND INHOMOGENEITY

Couple stress is taken into account when materials have microstructure, or are inhomogeneous: see, for example, [16, Section 98], and Lecture II of Truesdell [17]. The discussion here did not address microstructure, but there is a clear link with inhomogeneity. A measure of this, at the scale $\varepsilon$ associated with the weighting function, is the displacement $\mathbf{d}$ from the field point $\mathbf{x}$ of the mass centre of those molecules within a distance $\varepsilon$ of $\mathbf{x}$. (Here and hereafter subscripts $w$ will be omitted, for simplicity.) Specifically,
\[ \rho(\mathbf{x}, t)\, \mathbf{d}(\mathbf{x}, t) := \sum_{i=1}^{N} m_i (\mathbf{x}_i(t) - \mathbf{x})\, w(\mathbf{x}_i(t) - \mathbf{x}), \tag{4.15} \]
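when $w$ is chosen to be the mollified version of $w_\varepsilon$ given by (3.2). Differentiating with respect to $t$ (with $\mathbf{x}$ fixed) yields
\[ \begin{aligned} \frac{\partial}{\partial t}\{\rho\mathbf{d}\} &= \sum_{i=1}^{N} m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) + \sum_{i=1}^{N} m_i ((\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i)\nabla w \\ &= \rho\mathbf{v} - \sum_{i=1}^{N} m_i ((\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i)\nabla_{\mathbf{x}} w \\ &= \rho\mathbf{v} + \sum_{i=1}^{N} \mathrm{div}_{\mathbf{x}}\{m_i (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i\}\, w - \mathrm{div}_{\mathbf{x}}\left\{ \sum_{i=1}^{N} (\mathbf{x}_i - \mathbf{x}) \otimes m_i \mathbf{v}_i\, w \right\} \\ &= \rho\mathbf{v} - \sum_{i=1}^{N} m_i \mathbf{v}_i\, w(\mathbf{x}_i - \mathbf{x}) - \mathrm{div}\{\rho\mathbf{B}\}. \end{aligned} \tag{4.16} \]
Here use has been made of identities $\mathrm{div}(\phi\mathbf{A}) = \phi\, \mathrm{div}\, \mathbf{A} + \mathbf{A}\nabla\phi$, with $\phi = w$ and $\mathbf{A} = m_i (\mathbf{x}_i - \mathbf{x}) \otimes \mathbf{v}_i$, and $\mathrm{div}(\mathbf{a} \otimes \mathbf{b}) = (\nabla\mathbf{a})\mathbf{b} + (\mathrm{div}\, \mathbf{b})\mathbf{a}$ with $\mathbf{a} = (\mathbf{x}_i - \mathbf{x})$, $\mathbf{b} = m_i \mathbf{v}_i$, together with results $\nabla_{\mathbf{x}}\mathbf{a} = -\mathbf{1}$ and $\mathrm{div}\, m_i \mathbf{v}_i = 0$. Thus from (4.16), (2.4) and (2.5)
\[ \frac{\partial}{\partial t}\{\rho\mathbf{d}\} = -\mathrm{div}\{\rho\mathbf{B}\}. \tag{4.17} \]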
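Relation (4.17) can be checked by finite differences. The sketch below is an illustration under stated assumptions: it replaces the mollified $w_\varepsilon$ by a Gaussian `w_gauss` (any $C^1$ weighting function should do for this identity), moves each particle with constant velocity over a short time interval, and uses central differences of step `h`; all names and parameter values are hypothetical.

```python
import math
import random


def w_gauss(u, sigma=1.0):
    """Illustrative smooth, normalised weighting function (a Gaussian)."""
    r2 = sum(c * c for c in u)
    return (2.0 * math.pi * sigma ** 2) ** -1.5 * math.exp(-r2 / (2.0 * sigma ** 2))


def rho_d(x, masses, positions):
    """rho d = sum_i m_i (x_i - x) w(x_i - x), as in (4.15)."""
    out = [0.0, 0.0, 0.0]
    for m, p in zip(masses, positions):
        u = [p[k] - x[k] for k in range(3)]
        wt = w_gauss(u)
        for a in range(3):
            out[a] += m * u[a] * wt
    return out


def rho_B(x, masses, positions, velocities):
    """rho B = sum_i (x_i - x) (x) m_i v_i w(x_i - x), as in (2.43)."""
    out = [[0.0] * 3 for _ in range(3)]
    for m, p, v in zip(masses, positions, velocities):
        u = [p[k] - x[k] for k in range(3)]
        wt = w_gauss(u)
        for a in range(3):
            for b in range(3):
                out[a][b] += u[a] * m * v[b] * wt
    return out


def residual(seed=1, n=20, h=1e-4):
    """Largest component of d(rho d)/dt + div(rho B), by central differences."""
    rng = random.Random(seed)
    masses = [rng.uniform(1.0, 2.0) for _ in range(n)]
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    velocities = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]
    x = [0.2, -0.1, 0.3]

    def moved(dt):
        return [[p[k] + dt * v[k] for k in range(3)] for p, v in zip(positions, velocities)]

    later, earlier = rho_d(x, masses, moved(h)), rho_d(x, masses, moved(-h))
    d_dt = [(a - b) / (2.0 * h) for a, b in zip(later, earlier)]

    div = [0.0, 0.0, 0.0]  # (div M)_a = sum_b dM_ab/dx_b
    for b in range(3):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        b_plus = rho_B(xp, masses, positions, velocities)
        b_minus = rho_B(xm, masses, positions, velocities)
        for a in range(3):
            div[a] += (b_plus[a][b] - b_minus[a][b]) / (2.0 * h)
    return max(abs(d_dt[a] + div[a]) for a in range(3))
```

The residual is limited only by the finite-difference truncation error, which is of order `h` squared.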
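Relation (4.17) is a conservation law for the density of moment of mass, $\rho\mathbf{d}$. Balance (2.62), in which (generalised) couple stress appears, may be regarded as an evolution equation for $\mathbf{B}$, which is related to inhomogeneity via (4.17).

4.4. ON GENERALISED MOMENT OF MOMENTUM BALANCE

Tensorial pre-multiplication of linear momentum balance (2.32) by $(\mathbf{x} - \mathbf{x}_0)$, where $\mathbf{x}_0$ is an arbitrary, but fixed, point in the relevant inertial frame, yields
\[ (\mathbf{x} - \mathbf{x}_0) \otimes \mathrm{div}\, \mathbf{T} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b} = (\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{a}. \tag{4.18} \]
Equivalently, noting that (2.40) with $\phi = 1$ may be written as
\[ \mathrm{div}(\mathbf{a} \otimes (\mathbf{b} \otimes \mathbf{c})) = (\nabla\mathbf{a})(\mathbf{b} \otimes \mathbf{c})^T + \mathbf{a} \otimes \mathrm{div}(\mathbf{b} \otimes \mathbf{c}), \]
which clearly holds when simple tensor field $\mathbf{b} \otimes \mathbf{c}$ is replaced by any rank-two tensor field, (4.18) may be written as
\[ \mathrm{div}\{(\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\} - \mathbf{T}^T + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b} = \rho\, \{(\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{v}\}^{\boldsymbol{\cdot}} - \rho\mathbf{v} \otimes \mathbf{v}. \tag{4.19} \]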
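Integration over a domain $R_t$ whose boundary $\partial R_t$ has outward unit normal $\mathbf{n}$, and which deforms with the motion prescribed by $\mathbf{v}$, leads to
\[ \int_{\partial R_t} (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\mathbf{n} + \int_{R_t} \{-\mathbf{T}^T + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b} + \rho\mathbf{v} \otimes \mathbf{v}\} = \frac{d}{dt}\left\{ \int_{R_t} (\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{v} \right\}. \tag{4.20} \]
Here use has been made of divergence theorem (2.41) with $\mathbf{M} = (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}$ and Reynolds' transport theorem (see [18]). Twice the skew part of (4.20) is, noting the symmetry of $\rho\mathbf{v} \otimes \mathbf{v}$ and (2.66),
\[ \int_{\partial R_t} (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{T}\mathbf{n} + \int_{R_t} \{\mathbf{T} - \mathbf{T}^T + (\mathbf{x} - \mathbf{x}_0) \wedge \mathbf{b}\} = \frac{d}{dt}\left\{ \int_{R_t} (\mathbf{x} - \mathbf{x}_0) \wedge \rho\mathbf{v} \right\}. \tag{4.21} \]
It follows that the usual form of moment of momentum balance for non-polar media (see, for example, [11, Section 205]) holds if, and only if, $\mathbf{T}$ takes symmetric values.

REMARK 4. As discussed in Remark 3, from (2.31) $\mathbf{T}^-$ is symmetric for interactions of form (3.5) in the limiting case of scale $\varepsilon$ tending to zero, and hence symmetry of $\mathbf{T}$ follows from (2.33), since $\mathcal{D}$ is symmetric. However, for $\varepsilon > 0$ the complex nature of (2.31) does not lead to such a simple conclusion. Conventional, postulated forms of moment of momentum balance yield local forms which involve the skew part of $\mathbf{T}$ (and hence of $\mathbf{T}^-$): see, for example, [17, p. 24]. A generalised moment of momentum balance was also motivated in [7] which involved $\mathrm{sk}\, \mathbf{T}^-$: see equation (3.27) therein. Here, however, this is not the case: the derived balance (2.62) does not involve $\mathrm{sk}\, \mathbf{T}$.

To compare the foregoing with postulated forms of moment of momentum balance, note the integral form of balance (2.62) is (cf. (4.21))
\[ \int_{\partial R_t} \mathbf{C}\mathbf{n} + \int_{R_t} \mathbf{J} = \frac{d}{dt}\left\{ \int_{R_t} \rho\mathbf{B} \right\}. \tag{4.22} \]
Adding (4.20) and (4.22) gives
\[ \int_{\partial R_t} \{\mathbf{C}\mathbf{n} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{T}\mathbf{n}\} + \int_{R_t} \{-\mathbf{T}^T + \rho\mathbf{v} \otimes \mathbf{v} + \mathbf{J} + (\mathbf{x} - \mathbf{x}_0) \otimes \mathbf{b}\} = \frac{d}{dt} \int_{R_t} \{(\mathbf{x} - \mathbf{x}_0) \otimes \rho\mathbf{v} + \rho\mathbf{B}\}. \tag{4.23} \]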
Twice the skew part of (4.23) is (see the definitions following (2.68) and noting the symmetry of D: see (2.17)) − ˘ + (x − x0 ) ∧ Tn + Cn T − (T− )T + J˘ + (x − x0 ) ∧ b ∂Rt Rt d ˘ (x − x0 ) ∧ ρv + ρ B . = (4.24) dt Rt Usual postulated moment of momentum balances correspond to (4.24) with contribution T− − (T− )T absent. This motivates the search for a couple-stress tensor C such that ˘ − + T− − (T− )T . div C = div C
(4.25)
The author has, without success, studied the explicit expressions for the interaction contributions to the right-hand side of (4.25), its skew part, and $\operatorname{sk} T^{-}$, with the aim of obtaining a suitable analogue of $G$ in (2.52), and thence invoking Noll's theorem to establish the existence of such a tensor $C$. However, relation (3.33) of [7] delivers this result, modulo very weak assumptions on interactions (see I.1., I.2. and I.3.). In this respect it is necessary to note that time averaging can be omitted throughout [7] without affecting the veracity of (3.33), that $S_\epsilon(x)$ is an example of an '$\epsilon$-cell' centred at $x$, and that $a \wedge b$ employed therein is here written as $\tfrac{1}{2}\, a \wedge b$. It follows that postulated balances are not incorrect, but that the interpretation of couple stress therein is different. Further, Eringen [19] integrated local forms of balance relations for non-polar continua, assumed to hold in 'microvolumes', over so-called 'macrovolumes' to obtain a balance which involves both the divergence of macroscale couple stress and the skew part of the macroscale stress. Pitteri [4] obtained a similar result by deriving an energy balance and then assuming this to be invariant under change of observer. In this context the approach herein is characterised by a definition of (generalised) couple stress for which the relations governing evolution of momentum and (tensor-valued) moment of momentum are uncoupled.

REMARK 5. It is of interest to note the formal similarity between mass conservation (2.6) with momentum balance (2.32), and moment of mass conservation (4.17) with moment of momentum balance (2.62).
4.5. COMPARISON WITH STATISTICAL MECHANICS

The further averaging of ensemble averages in space and time advocated in [1] (see the second paragraph in Section 1) is instructive. If spatial averaging of fields is effected in the manner of (4.1), then, for interactions governed by separation-dependent pair-potentials, the spatial average of the strictly-local form of linear momentum balance derived in [1] yields a formally-identical relation in which the averaged stress tensor is symmetric. Such symmetry is preserved by subsequent time averaging (see [5, Section 7]). However, even for such simple conservative interactions, the possibility of asymmetric stress as a consequence of inhomogeneity is manifest in Section 3.2.2. Consequently the assumption in [1], that space-time averages of ensemble averages be identified with mean values of space-time averages in oft-repeated experiments, is drawn into question.

4.6. CONCLUDING REMARKS

4.6.1. Local measurement values are (local) averages both in space and time. Thus if field values are to be related to local measurements then the spatial averages here discussed should be subjected to a further temporal averaging. Such additional averaging was implemented in [5], and has been extended to time-dependent systems in [20, 21]. For brevity details have been omitted: revised interpretations of stress and couple stress are simple time averages of those given here.

4.6.2. Adequate continuum modelling of macromolecular systems often requires consideration of couple stresses and body couples: for example, nematic liquid crystalline phases. If each macromolecule is regarded as an assembly of interacting point masses, the balances here obtained remain valid. However, more detailed book-keeping, taking account of co-operative macromolecular behaviour (for example, local alignment of long, 'thin' molecules), is necessary before balances are obtained which resemble those usually postulated. An earlier study [22] addressed this issue, using cellular averaging. In this work it was shown that if, roughly speaking, macromolecules deform homogeneously, and neighbouring molecules deform in much the same way, then the full tensor-valued moment of momentum balance serves as an evolution equation for the tensor-valued measure of such affine deformation.

References
1. J.H. Irving and J.G. Kirkwood, The statistical mechanical theory of transport processes. IV. The equations of hydrodynamics. J. Chem. Phys. 18 (1950) 817–829.
2. W. Noll, Die Herleitung der Grundgleichungen der Thermomechanik der Kontinua aus der statistischen Mechanik. J. Rational Mech. Anal. 4 (1955) 627–646.
3. M. Pitteri, Continuum equations of balance in classical statistical mechanics. Arch. Rational Mech. Anal. 94 (1986) 291–305.
4. M. Pitteri, On a statistical-kinetic model for generalized continua. Arch. Rational Mech. Anal. 111 (1990) 99–120.
5. A.I. Murdoch and D. Bedeaux, Continuum equations of balance via weighted averages of microscopic quantities. Proc. Roy. Soc. London A 445 (1994) 157–179.
6. D.L. Goodstein, States of Matter. Dover, New York (1985).
7. A.I. Murdoch, A corpuscular approach to continuum mechanics: basic considerations. Arch. Rational Mech. Anal. 88 (1985) 291–321.
8. A.I. Murdoch, Elements of the continuum modelling of material behaviour and its relation to the fundamentally-discrete nature of matter. In: Lecture Notes, Centre of Excellence for Advanced Materials and Structures, IPPT, Polish Academy of Sciences, Warsaw (2002).
9. S.G. Brush, The Kind of Motion We Call Heat. North-Holland, Amsterdam/New York (1986).
10. H.C. Ohanian, Physics, Vol. 1. Norton, New York/London (1985).
11. C. Truesdell and R.A. Toupin, The Classical Field Theories, S. Flügge (ed.), Handbuch der Physik. Springer, Berlin (1960).
12. R.M. Bowen, Theory of Mixtures, A.C. Eringen (ed.), Continuum Physics, Vol. III. Academic Press, New York (1976).
13. M.E. Gurtin, M.L. Oliver and W.O. Williams, On the balance of forces for mixtures. Quart. Appl. Math. 30 (1973) 527–530.
14. A. Morro and A.I. Murdoch, Stress, body force, and momentum balance in mixture theory. Meccanica 21 (1986) 184–190.
15. A.I. Murdoch and A. Morro, On the continuum theory of mixtures: Motivation from discrete considerations. Internat. J. Engrg. Sci. 25 (1987) 9–25.
16. C. Truesdell and W. Noll, The Non-linear Field Theories of Mechanics, S. Flügge (ed.), Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
17. C. Truesdell, Six Lectures on Modern Natural Philosophy. Springer, Berlin (1966).
18. C. Truesdell, A First Course in Rational Continuum Mechanics, Vol. 1. Academic Press, New York (1977).
19. A.C. Eringen, Mechanics of micromorphic continua. In: E. Kröner (ed.), Mechanics of Generalized Continua. Springer, New York (1968).
20. A.I. Murdoch, On time-dependent material systems. Internat. J. Engrg. Sci. 38 (2000) 429–452.
21. A.I. Murdoch and S.M. Hassanizadeh, Macroscale balance relations for bulk, interfacial and common line systems in multiphase flows through porous media on the basis of molecular considerations. Internat. J. Multiphase Flow 28 (2002) 1091–1123.
22. A.I. Murdoch, On the relationship between balance relations for generalised continua and molecular behaviour. Internat. J. Engrg. Sci. 25 (1987) 883–914.
The Hanging Rope of Minimum Elongation for a Nonlinear Stress–Strain Relation

PABLO V. NEGRÓN-MARRERO
Department of Mathematics, University of Puerto Rico, Humacao, PR 00791-4300, U.S.A. E-mail: [email protected]

Received 23 September 2002

Abstract. We consider the problem of determining the shape that minimizes the elongation of a rope that hangs vertically under its own weight and an applied force, subject to either a constraint of fixed total mass or fixed total volume. The constitutive function for the rope is given by a nonlinear stress–strain relation and the mass–density function of the rope can be variable. For the case of fixed total mass we show that the problem can be explicitly solved in terms of the mass–density function, applied force, and constitutive function. In the special case where the mass–density function is constant, we show that the optimal cross-sectional area of the rope is the same as that for a linear stress–strain relation (Hooke's Law). For the fixed total volume problem, we use the implicit function theorem to show the existence of a branch of solutions depending on the parameter representing the acceleration of gravity. This local branch of solutions is extended globally using degree theoretic techniques.

Mathematics Subject Classifications (2000): 34B15, 74B20, 74G25.

Key words: string, mass–density, nonlinear stress–strain relation, implicit function theorem, compact map.
The contributions of Clifford Truesdell to the development of the field of rational mechanics were vast. I met Professor Truesdell only briefly at a meeting on Theoretical Mechanics at Rutgers University in 1990. However, I am in debt to him for the legacy of his teachings in rational mechanics. The following paper is dedicated to his memory.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 627–649. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The problem of the motion of a string under different types of boundary conditions and forces dates back to Euler [7] and Lagrange [14]. The problem we consider here is a variation of the catenary problem first proposed by Leonardo da Vinci. In particular, we consider the problem of a rope or string attached at one end, hanging vertically under its own weight, and subject to an applied force at the hanging end. (See Figure 1.) Instead of specifying a shape for the rope and determining its deformation, we consider the optimal control problem of determining the shape of the rope (given by its cross-sectional area function) that minimizes its elongation under the effect of its own weight and applied force, subject to either a constraint of fixed volume or fixed mass. This problem was treated in [20] for a linear stress–strain relation (Hooke's Law) and constant mass–density function. We generalize the results in [20] to materials that satisfy a general nonlinear stress–strain relation and with a variable mass–density function. The model for the string that we use is based on those in [3], to which we also refer for a detailed historical account of problems for strings.

Figure 1. Geometry of deformation. In the reference configuration there is no gravity and we either have the total volume fixed to V or the total mass fixed to M.

A problem related to the one treated here is that in which the rope, instead of hanging, is upside down and is thought of as a column. The question is, subject to the constraint of fixed volume: what shape should the column have in order to maximize its strength? This problem is equivalent to maximizing the first buckling mode of the system. We refer to [13, 5, 6] for further details on this problem.

In Section 2 we derive the equilibrium equations of the string and describe the constitutive assumptions on the material behavior. In Section 3 we characterize the problem of minimum elongation as one of the calculus of variations. For the problem of fixed total mass, we show in Section 4 that the corresponding Euler–Lagrange equations can be solved explicitly in terms of the mass–density function, applied force, and constitutive function (cf. (4.10)), and that this solution corresponds to a global weak minimizer of the total elongation functional. In the particular case where the mass–density function is constant, we find the surprising result that the corresponding cross-sectional area function is identical to the one found in [20] for the case of a linear stress–strain relation. The corresponding
minimum elongation of the string, however, depends on the material response (cf. (4.15)). In Section 5 we carry out the analysis for the constraint of fixed total volume. Since the problem is formulated as one for a functional in terms of a pseudoantiderivative of the cross-sectional area function (cf. (3.3)), and the mass–density function may be variable, the volume constraint becomes an integral constraint. Thus we have an isoperimetric problem of the calculus of variations and the Euler– Lagrange equations involve an additional term proportional to a Lagrange multiplier. The Euler–Lagrange equations for this problem are then formulated as a smooth mapping between appropriate Banach spaces of smooth functions, using the gravitational parameter for continuation. We first study the problem with zero gravity and show that it has a solution which is unique and corresponds to a constant cross-sectional area function. We then show that the implicit function theorem is applicable to the problem with nonzero gravity to get the existence of a local curve of solutions. This result includes the case of small but negative gravitational parameter, which can be interpreted as the case of a gravitational field vertically upward or as if the rope would hang upside down. Finally we apply the classical Leray–Schauder degree for compact maps [16, 4, 2] to extend globally the local branch found via the Implicit Function Theorem. The uniqueness of solutions for the problem with zero gravity together with some a priori estimates on the solutions allow us to rule out two of the Rabinowitz alternatives for the global continuum. In addition we get that the problem has a solution for each nonnegative value of the gravitational parameter. We then show that for fixed values of the gravitational parameter, the solution obtained is a weak local minimizer of a modified version of the original variational formulation (cf. (5.25), (5.26)). 
For any $m \ge 0$ we consider the spaces $C^m[0, L]$ consisting of functions $v$ with $m$ continuous derivatives in $[0, L]$ and with norm given by

$$\|v\|_{C^m[0,L]} = \sum_{k=0}^{m} \max_{0 \le x \le L} \bigl|v^{(k)}(x)\bigr|.$$

We observe that $C^m[0, L]$ is a Banach space and for $m \ge 1$ it is compactly embedded into $C^{m-1}[0, L]$. That is, if $\{v_k\}$ is a bounded sequence in $C^m[0, L]$, then by the Arzelà–Ascoli theorem, $\{v_k\}$ has a subsequence that converges in $C^{m-1}[0, L]$.

2. The Equations of Equilibrium

2.1. GEOMETRY OF DEFORMATION

We consider a rope or string which in its reference configuration occupies the region $\Omega$ in $\mathbb{R}^3$. We let $(x, y, z)$ represent Cartesian coordinates in $\Omega$ and assume that $[0, L] = \{x : (x, y, z) \in \Omega\}$, where the positive $x$ axis is downward in the vertical direction. For any $x \in [0, L]$ we define the cross-section of $\Omega$ at $x$ by

$$\Omega_x = \{(y, z) : (x, y, z) \in \Omega\}, \tag{2.1}$$
and let $A(x)$ be the area of $\Omega_x$. We assume that the cross-sectional area function $A(\cdot)$ is positive and continuous on $[0, L]$ (see Figure 1). We consider a one-dimensional deformation of $\Omega$ given by

$$p(x, y, z) = (u(x), y, z), \tag{2.2}$$

for some $C^1$ function $u(\cdot)$. (See Figure 1.) The requirement that an (infinitesimal) volume in the reference configuration cannot be reduced to zero by the deformation $p$ implies that

$$u'(x) > 0, \quad \forall x \in [0, L]. \tag{2.3}$$
2.2. MECHANICAL RESPONSE

For any $x \in [0, L]$ we denote by $n(x)$ the force exerted by the material on $[0, x]$ on that on $[x, L]$ in a deformed configuration. We assume that the material of the rope has mass density per unit volume at $x$ given by $\rho(x)$, where $\rho(\cdot)$ is a given positive continuously differentiable function. Hence the weight of the $[x, L]$ section of the rope is given by:

$$g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}, \tag{2.4}$$

where $g$ denotes the acceleration of gravity. Assuming that a force $W$ is applied at $x = L$, the total force exerted on the section $[x, L]$ is

$$W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{2.5}$$

For equilibrium, the forces must balance at each $x \in [0, L]$, i.e.,

$$n(x) = W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{2.6}$$
We say that the material of the rope is elastic and nonhomogeneous if for some function $\tilde{N}(\cdot, \cdot)$ we have that

$$n(x) = \tilde{N}(u'(x), x). \tag{2.7}$$

The usual way to account for the lack of homogeneity is by taking

$$\tilde{N}(u'(x), x) = A(x)\, \hat{N}(u'(x)), \tag{2.8}$$

where $\hat{N} : (0, \infty) \to \mathbb{R}$ satisfies:

A1. $\hat{N}(\cdot)$ is a strictly increasing smooth function;
A2. $\hat{N}(\nu) \to \infty$ as $\nu \to \infty$;
A3. $\hat{N}(\nu) \to -\infty$ as $\nu \to 0^{+}$.
From properties A1–A3 it follows that $\hat{N} : (0, \infty) \to \mathbb{R}$ has a smooth inverse $\hat{\nu} : \mathbb{R} \to (0, \infty)$. Writing $\hat{\nu}_N$ for the derivative of $\hat{\nu}$, we further assume that

A4. $N \mapsto N^2 \hat{\nu}_N(N)$ is strictly increasing on $[0, \infty)$;
A5. $N^2 \hat{\nu}_N(N) \to \infty$ as $N \to \infty$.

One can easily check that condition A4 is equivalent to the strict convexity of the integrand in the functional giving the total elongation of the rope (cf. (3.4)). If we combine (2.6), (2.7), and (2.8) we get that

$$\hat{N}(u'(x)) = A(x)^{-1} \Bigl( W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \Bigr). \tag{2.9}$$

(See the Appendix for a derivation of this equation from the three-dimensional theory of elasticity.) Since the top of the rope is attached to a wall we have that

$$u(0) = 0. \tag{2.10}$$
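Relations (2.9) and (2.10) determine the deformation once the area function is prescribed: inverting the constitutive function gives the stretch $u'(x)$, and one integration from $u(0) = 0$ gives the position $u(L)$ of the loaded end. The following Python sketch carries this out numerically. The helper name `elongation`, the trapezoidal discretisation, and the Hooke-type inverse law $\hat{\nu}(N) = 1 + N/E$ with modulus `E` are illustrative assumptions that do not come from the paper; the linear law is used here only for tensile loads and does not satisfy A3.

```python
def elongation(A, rho, W, g, L, nu_hat, n=2000):
    """Approximate u(L) obtained from (2.9)-(2.10).

    A, rho  : callables giving area and mass density at x
    nu_hat  : callable, inverse constitutive function (assumed given)
    n       : number of subintervals of the uniform grid on [0, L]
    """
    h = L / n
    xs = [i * h for i in range(n + 1)]
    w = [rho(x) * A(x) for x in xs]  # mass per unit reference length
    # tail[i] approximates the integral of rho*A over [xs[i], L] (trapezoidal rule)
    tail = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (w[i] + w[i + 1])
    # stretch u'(x) = nu_hat( (W + g * tail) / A(x) ), cf. (2.9)
    stretch = [nu_hat((W + g * tail[i]) / A(xs[i])) for i in range(n + 1)]
    # trapezoidal rule for u(L), using u(0) = 0
    return h * (sum(stretch) - 0.5 * (stretch[0] + stretch[-1]))


if __name__ == "__main__":
    E = 1.0e6  # hypothetical modulus for the illustrative linear law
    u_L = elongation(A=lambda x: 1.0e-3, rho=lambda x: 1000.0,
                     W=10.0, g=9.81, L=2.0, nu_hat=lambda N: 1.0 + N / E)
    # For constant A and rho the integral can be done by hand:
    # u(L) = L + W*L/(E*A) + g*rho*L**2/(2*E)
    print(u_L, 2.0 + 10.0 * 2.0 / (E * 1.0e-3) + 9.81 * 1000.0 * 4.0 / (2.0 * E))
```

For a uniform rope the stretch is linear in $x$, so the trapezoidal rule reproduces the hand computation up to rounding; for variable $A$ or $\rho$ the value is only an approximation whose accuracy depends on `n`.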
We consider two types of additional constraints: we assume either that the total mass of the rope is a given constant $M$:

$$\int_0^L \rho(x) A(x)\, dx = M, \tag{2.11}$$

or that the volume of the rope is a given constant $V$:

$$\int_0^L A(x)\, dx = V. \tag{2.12}$$
For a constant mass–density function $\rho(\cdot)$ both constraints are equivalent.

3. Rope of Minimum Elongation

Note that we can write (2.9) as:

$$u'(x) = \hat{\nu}\Bigl( A(x)^{-1} \Bigl( W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \Bigr) \Bigr). \tag{3.1}$$

Integrating now over $[0, L]$ and using (2.10), we get the following expression for the total elongation of the rope:

$$u(L) = \int_0^L \hat{\nu}\Bigl( A(x)^{-1} \Bigl( W + g \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x} \Bigr) \Bigr)\, dx. \tag{3.2}$$

The problem then is to find a function $A(\cdot)$ that minimizes the above expression for $u(L)$ subject to either constraint (2.11) or (2.12). Let

$$B(x) = \int_x^L \rho(\bar{x}) A(\bar{x})\, d\bar{x}. \tag{3.3}$$
Hence $B'(x) = -\rho(x) A(x)$ and we can write (3.2) as

$$u(L) = \int_0^L \hat{\nu}\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr)\, dx. \tag{3.4}$$

Note that condition A4 can be seen now to be equivalent to the strict convexity with respect to $B'$ of the integrand in the above functional. Note that $B(L) = 0$ and either

$$B(0) = M, \tag{3.5}$$

if (2.11) holds, or

$$\int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V, \tag{3.6}$$
if (2.12) holds. Thus our problem now is to find a function $B(\cdot)$ that minimizes (3.4) subject to $B(L) = 0$ and either of the two constraints (3.5) or (3.6).

4. Fixed Total Mass

We now study the problem of minimizing (3.4) subject to (3.5) for an arbitrary mass–density function $\rho(\cdot)$. More specifically, we study the problem

$$\min_{B \in X} J(B), \tag{4.1}$$

where

$$J(B) = \int_0^L \hat{\nu}\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr)\, dx, \tag{4.2}$$

$$X = \bigl\{ B \in C^1[0, L] : B(0) = M,\ B(L) = 0,\ B'(x) < 0\ \ \forall x \bigr\}. \tag{4.3}$$

The Euler–Lagrange equations for this functional are given by:

$$\frac{d}{dx}\Bigl[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) \Bigr] + \frac{g\rho(x)}{B'(x)}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) = 0, \quad 0 < x < L, \tag{4.4a}$$

$$B(0) = M, \quad B(L) = 0. \tag{4.4b}$$

If we let

$$H(x) = -\frac{\rho(x)(W + gB(x))}{B'(x)}, \tag{4.5}$$
then a simple computation shows that

$$\frac{d}{dx}\bigl[ H(x)^2 \hat{\nu}_N(H(x)) \bigr] = \rho(x)(W + gB(x))\, \frac{d}{dx}\Bigl[ -\frac{H(x)}{B'(x)}\, \hat{\nu}_N(H(x)) \Bigr] + \Bigl[ \frac{\rho'(x)}{\rho(x)} H(x) - g\rho(x) \Bigr] H(x)\, \hat{\nu}_N(H(x)).$$

If we multiply (4.4a) by $\rho(x)(W + gB(x))$, recall (4.5), and use the above identity, then we have that (4.4a) is equivalent to

$$\frac{d}{dx}\bigl[ H(x)^2 \hat{\nu}_N(H(x)) \bigr] - \frac{\rho'(x)}{\rho(x)}\, H(x)^2 \hat{\nu}_N(H(x)) = 0. \tag{4.6}$$

This equation can be easily integrated now to get that

$$H(x)^2 \hat{\nu}_N(H(x)) = c\rho(x), \tag{4.7}$$

for some constant $c$. The left-hand side of this equation can be written as $h(H(x))$ where $h(N) = N^2 \hat{\nu}_N(N)$. Thus (4.7) is equivalent to

$$\frac{W + gB(x)}{B'(x)} = -\frac{1}{\rho(x)}\, h^{-1}(c\rho(x)), \tag{4.8}$$

where $h^{-1}$ is the inverse function of $h$, which exists under hypotheses A4, A5. Equation (4.8) can be written as

$$B'(x) + \frac{g\rho(x)}{h^{-1}(c\rho(x))}\, B(x) = -\frac{W\rho(x)}{h^{-1}(c\rho(x))}. \tag{4.9}$$

Using the integrating factor

$$\mu(x) = \exp\Bigl( g \int_0^x \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \Bigr),$$

together with the boundary condition $B(L) = 0$, we conclude that

$$B(x) = \frac{W}{g} \Bigl[ \exp\Bigl( g \int_x^L \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \Bigr) - 1 \Bigr]. \tag{4.10}$$

It remains to determine the constant $c$. But using (4.10) we get that the boundary condition $B(0) = M$ is equivalent to

$$G(c) \equiv \frac{W}{g} \Bigl[ \exp\Bigl( g \int_0^L \frac{\rho(t)}{h^{-1}(c\rho(t))}\, dt \Bigr) - 1 \Bigr] = M. \tag{4.11}$$

It follows from hypotheses A4, A5 that $h^{-1}(0) = 0$, $h^{-1}$ is strictly increasing, and that $h^{-1}(s) \to \infty$ as $s \to \infty$. From these properties of $h^{-1}$ it follows that $G$ is strictly decreasing, $G(c) \to 0$ as $c \to \infty$, and $G(c) \to \infty$ as $c \to 0^{+}$. Thus
equation (4.11) has a solution which is unique for each $M > 0$. It follows as well that (4.4) has a unique solution for each $M > 0$.

The above argument can be easily modified to show that the problem

$$\frac{d}{dx}\Bigl[ \frac{\rho(x)(W + g\phi(x))}{\phi'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + g\phi(x))}{\phi'(x)} \Bigr) \Bigr] + \frac{g\rho(x)}{\phi'(x)}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + g\phi(x))}{\phi'(x)} \Bigr) = 0, \quad a < x < L, \tag{4.12a}$$

$$\phi(a) = B(a), \quad \phi(L) = 0, \tag{4.12b}$$

has a unique solution $\phi(\,\cdot\,; a)$ for any $a \in [0, L)$, where $B$ is the unique solution of (4.4). This fact can be used now to construct a stationary field for the functional (4.2). This, together with condition A4, which implies the strict convexity with respect to $B'$ of the integrand in (4.2), allows us to invoke Hilbert's Invariant Integral Theorem [18, Theorem 9.7] to get that the solution $B$ of (4.4) is the unique (weak) minimizer of (4.2) on (4.3).

4.1. UNIFORM MASS DENSITY

In the case where $\rho$ is constant we can determine explicitly the constant $c$ in (4.11). In this case the integrand in (4.11) is a constant which we denote by $K$. Equation (4.11) now reduces to

$$\frac{W}{g}\bigl( e^{gKL} - 1 \bigr) = M,$$

which has solution $K = (1/Lg) \ln(1 + gM/W)$. Substitution into (4.10) yields

$$B(x) = \frac{W}{g} \Bigl[ \Bigl( 1 + \frac{gM}{W} \Bigr)^{1 - x/L} - 1 \Bigr]. \tag{4.13}$$

Since $A(x) = -B'(x)/\rho$, we get after simplification that

$$A(x) = \frac{W}{g\rho L}\, \ln\Bigl( 1 + \frac{gM}{W} \Bigr) \cdot \Bigl( 1 + \frac{gM}{W} \Bigr)^{1 - (x/L)}. \tag{4.14}$$

Note that this function is decreasing, i.e., the minimum elongation is attained by tapering down the rope from top to bottom. This is the same result obtained in [20] for the case $\hat{\nu}(\cdot)$ linear (Hooke's Law). If we substitute (4.13) into (3.4) we get that the total elongation of the rope is given by

$$u(L) = L\hat{\nu}\Bigl( \frac{\rho}{K} \Bigr) = L\hat{\nu}\Bigl( \frac{g\rho L}{\ln(1 + gM/W)} \Bigr). \tag{4.15}$$

In the general case where $\rho$ depends on $x$, the cross-sectional area function need not be monotone.
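The closed-form expressions (4.13) and (4.14) for uniform density lend themselves to a quick numerical sanity check. The Python sketch below evaluates both, approximates the mass integral (2.11) with a composite Simpson rule, and compares it with the prescribed mass. The function names, the quadrature rule, and the parameter values are illustrative assumptions and are not part of the paper.

```python
import math


def B_uniform(x, W, g, M, L):
    """B(x) from (4.13) for constant mass density."""
    r = 1.0 + g * M / W
    return W / g * (r ** (1.0 - x / L) - 1.0)


def area_uniform(x, W, g, M, rho, L):
    """Cross-sectional area A(x) from (4.14) for constant mass density rho."""
    r = 1.0 + g * M / W
    return W / (g * rho * L) * math.log(r) * r ** (1.0 - x / L)


def total_mass(area, rho, L, n=2000):
    """Composite Simpson approximation of rho * integral of area over [0, L].

    n must be even; area is a callable of x.
    """
    h = L / n
    s = area(0.0) + area(L)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * area(i * h)
    return rho * s * h / 3.0


if __name__ == "__main__":
    # Hypothetical data: W in newtons, M in kilograms, rho in kg/m^3, L in metres.
    W, g, M, rho, L = 10.0, 9.81, 2.0, 1000.0, 1.0
    mass = total_mass(lambda x: area_uniform(x, W, g, M, rho, L), rho, L)
    print("prescribed mass:", M, "recovered mass:", mass)
    print("area at top and bottom:",
          area_uniform(0.0, W, g, M, rho, L), area_uniform(L, W, g, M, rho, L))
```

The ratio of the top area to the bottom area is $1 + gM/W$, so the taper becomes more pronounced as the rope's own weight $gM$ grows relative to the applied force $W$.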
5. Fixed Total Volume

We now consider the problem of minimizing (3.4) subject to (3.6) for an arbitrary mass–density function $\rho(\cdot)$. This problem can be formulated as

$$\min_{X_V} J(B), \tag{5.1}$$

where $J$ is as in (4.2) and

$$X_V = \Bigl\{ B \in C^2[0, L] : \int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V,\ B(L) = 0,\ B'(x) < 0\ \ \forall x \Bigr\}. \tag{5.2}$$

The first order necessary conditions for this problem are obtained by considering the extended functional

$$\bar{J}(\lambda, B) = \int_0^L \Bigl[ \hat{\nu}\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) + \lambda \Bigl( \frac{B'(x)}{\rho(x)} + \frac{V}{L} \Bigr) \Bigr]\, dx, \tag{5.3}$$

over the set

$$\bar{X} = \bigl\{ (\lambda, B) \in \mathbb{R} \times C^2[0, L] : B(L) = 0,\ B'(x) < 0\ \ \forall x \bigr\}, \tag{5.4}$$

where $\lambda$ is a Lagrange multiplier. By considering smooth variations $w$ with $w(L) = 0$ one gets that the Euler–Lagrange equations for (5.3), and hence for (5.1), are given by:

$$\frac{d}{dx}\Bigl[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) + \frac{\lambda}{\rho(x)} \Bigr] + \frac{g\rho(x)}{B'(x)}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) = 0, \quad 0 < x < L, \tag{5.5a}$$

$$\Bigl[ \frac{\rho(x)(W + gB(x))}{B'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x)(W + gB(x))}{B'(x)} \Bigr) + \frac{\lambda}{\rho(x)} \Bigr]_{x=0} = 0, \tag{5.5b}$$

$$\int_0^L \frac{B'(x)}{\rho(x)}\, dx = -V, \quad B(L) = 0. \tag{5.5c}$$

Note that the multiplier $\lambda$ is determined from the boundary condition (5.5b) and in fact must be negative. Since in this case it is not possible to obtain explicit solutions, we study the boundary value problem (5.5) using $g$ as a continuation parameter. Let $Y = \mathbb{R}^2 \times C^2[0, L]$, $Z = C^0[0, L] \times \mathbb{R}^2$, and

$$U = \bigl\{ (g, \lambda, B) \in Y : \lambda < 0,\ B(L) = 0,\ B'(x) < 0\ \ \forall x \bigr\}. \tag{5.6}$$

Equations (5.5) are now equivalent to $G(g, \lambda, B) = 0$ where $G : U \to Z$ is given by

$$G(g, \lambda, B) = \Bigl( G_1(g, \lambda, B),\ G_2(g, \lambda, B),\ \int_0^L \frac{B'(x)}{\rho(x)}\, dx + V \Bigr), \tag{5.7}$$
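where $G_1(g, \lambda, B)$ and $G_2(g, \lambda, B)$ are given by the left-hand sides of (5.5a) and (5.5b), respectively. A simplified version of the results in [19], which are for Schauder spaces, gives us that $G_1$, $G_2$ are twice continuously Fréchet differentiable, and since the other component of $G$ is a twice differentiable linear functional of $B$, we conclude that:

LEMMA 5.1. The function $G : U \to Z$ is twice continuously Fréchet differentiable and

$$D_{(\lambda,B)} G(g, \lambda, B) \cdot (\gamma, v) = \Bigl( D_{(\lambda,B)} G_1(g, \lambda, B) \cdot (\gamma, v),\ D_{(\lambda,B)} G_2(g, \lambda, B) \cdot (\gamma, v),\ \int_0^L \frac{v'(x)}{\rho(x)}\, dx \Bigr),$$

where $D_{(\lambda,B)} G_1(g, \lambda, B) \cdot (\gamma, v)$ is given by

$$\frac{d}{dx}\Bigl\{ \rho(x) \Bigl[ \frac{g}{B'(x)^2}\, v(x) - 2\, \frac{W + gB(x)}{B'(x)^3}\, v'(x) \Bigr] \hat{\nu}_N - \rho(x)^2\, \frac{W + gB(x)}{B'(x)^2} \Bigl[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \Bigr] \hat{\nu}_{NN} + \frac{\gamma}{\rho(x)} \Bigr\}$$
$$\quad - \frac{g\rho(x)\hat{\nu}_N}{B'(x)^2}\, v'(x) - \frac{g\rho(x)^2}{B'(x)} \Bigl[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \Bigr] \hat{\nu}_{NN},$$

and $D_{(\lambda,B)} G_2(g, \lambda, B) \cdot (\gamma, v)$ is given by

$$\Bigl\{ \rho(x) \Bigl[ \frac{g}{B'(x)^2}\, v(x) - 2\, \frac{W + gB(x)}{B'(x)^3}\, v'(x) \Bigr] \hat{\nu}_N - \rho(x)^2\, \frac{W + gB(x)}{B'(x)^2} \Bigl[ \frac{g}{B'(x)}\, v(x) - \frac{W + gB(x)}{B'(x)^2}\, v'(x) \Bigr] \hat{\nu}_{NN} + \frac{\gamma}{\rho(x)} \Bigr\}_{x=0},$$

and where the argument of $\hat{\nu}_N$ and $\hat{\nu}_{NN}$ is $-\rho(x)(W + gB(x))/B'(x)$.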
5.1. LOCAL CONTINUATION

We now study the existence of solutions of (5.5) for small values of $g$.

LEMMA 5.2. The equation $G(0, \lambda, B) = 0$ has a solution which is unique and for which the corresponding cross-sectional area function $A$ is constant.

Proof. If we set $g = 0$ in (5.5a) then we get that

$$\frac{d}{dx}\Bigl[ \frac{\rho(x) W}{B'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x) W}{B'(x)} \Bigr) + \frac{\lambda}{\rho(x)} \Bigr] = 0,$$

i.e.,

$$\frac{\rho(x) W}{B'(x)^2}\, \hat{\nu}_N\Bigl( -\frac{\rho(x) W}{B'(x)} \Bigr) + \frac{\lambda}{\rho(x)} = \text{constant}.$$

The boundary condition (5.5b) with $g = 0$ implies that this "constant" must be equal to zero, from which we conclude that

$$\Bigl( \frac{\rho(x) W}{B'(x)} \Bigr)^2 \hat{\nu}_N\Bigl( -\frac{\rho(x) W}{B'(x)} \Bigr) = -\lambda W.$$

(Note that (5.5b) implies that $\lambda < 0$.) Since the right-hand side of this equation is constant, it follows from hypothesis A4 that

$$-\frac{\rho(x) W}{B'(x)} = C,$$

for some positive constant $C$. (The volume constraint in (5.5c) implies that $C = WL/V$.) Thus $B'(x) = -\rho(x) W/C$ and since $B'(x) = -\rho(x) A(x)$ we get that $A(x) = W/C$, i.e., $A$ is constant. □

Let $(\lambda_0, B_0)$ be the solution pair of $G(0, \lambda, B) = 0$ given by the above lemma. We now have:

LEMMA 5.3. The linear map $D_{(\lambda,B)} G(0, \lambda_0, B_0)$ is a bijection from $\mathbb{R} \times C^2[0, L]$ onto $Z$.

Proof. It follows from Lemma 5.1 that given any $(f, \alpha, \eta) \in Z$, the equation $D_{(\lambda,B)} G(0, \lambda_0, B_0) \cdot (\gamma, v) = (f, \alpha, \eta)$ is equivalent to:

$$\frac{d}{dx}\Bigl[ \frac{1}{B_0'(x)^2}\, \frac{d}{dN}\bigl( N^2 \hat{\nu}_N(N) \bigr)\Big|_{N=H}\, v'(x) + \frac{\gamma}{\rho(x)} \Bigr] = f(x), \tag{5.8a}$$

$$\Bigl[ \frac{1}{B_0'(x)^2}\, \frac{d}{dN}\bigl( N^2 \hat{\nu}_N(N) \bigr)\Big|_{N=H}\, v'(x) + \frac{\gamma}{\rho(x)} \Bigr]_{x=0} = \alpha, \tag{5.8b}$$

$$\int_0^L \frac{v'(x)}{\rho(x)}\, dx = \eta, \quad v(L) = 0, \tag{5.8c}$$

where $H = -\rho(x) W/B_0'(x)$, etc. Since the coefficient of $v'(x)$ in (5.8a) is positive by hypothesis A4, problem (5.8a), (5.8b), together with the condition $v(L) = 0$ in (5.8c), can be uniquely solved for $v$ in terms of $f$, $\alpha$, $\gamma$, where the dependence on $\gamma$ is linear. (See [17].) Upon substitution of this expression for $v$ into the integral condition in (5.8c), we get a linear equation for $\gamma$, which can be uniquely solved. □

It follows now from Lemmas 5.1–5.3, and the implicit function theorem (see [15]) that:

THEOREM 5.4. For small values of $g$, the problem (5.5) has a solution that depends continuously on $g$.
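The zero-gravity solution of Lemma 5.2, which serves as the starting point of the continuation, can be written down directly from the proof: $C = WL/V$, $A = W/C = V/L$, and the multiplier follows from $C^2 \hat{\nu}_N(C) = -\lambda W$. The Python sketch below evaluates these quantities and checks that the bracket in (5.5b) vanishes. The function names, the sample data, and the constant derivative $\hat{\nu}_N = 1/E$ (a Hooke-type law with hypothetical modulus `E`) are illustrative assumptions that do not come from the paper.

```python
def zero_gravity_solution(W, V, L, nu_hat_N):
    """Return (A0, C, lambda0) for g = 0, following the proof of Lemma 5.2.

    nu_hat_N : callable, derivative of the inverse constitutive function.
    """
    C = W * L / V                    # constant value of -rho*W/B', from the volume constraint
    A0 = W / C                       # constant cross-sectional area, equal to V/L
    lam0 = -C ** 2 * nu_hat_N(C) / W # multiplier from C^2 * nu_hat_N(C) = -lambda*W
    return A0, C, lam0


def residual_5_5b(W, rho_x, B_prime_x, lam, nu_hat_N):
    """Bracket of (5.5b) with g = 0, evaluated at a point where rho = rho_x, B' = B_prime_x."""
    N = -rho_x * W / B_prime_x
    return rho_x * W / B_prime_x ** 2 * nu_hat_N(N) + lam / rho_x


if __name__ == "__main__":
    E = 2.0e5                        # hypothetical modulus of the illustrative linear law
    nu_hat_N = lambda N: 1.0 / E
    W, V, L = 10.0, 0.02, 2.0
    A0, C, lam0 = zero_gravity_solution(W, V, L, nu_hat_N)
    rho0 = 1000.0                    # sample density value at x = 0
    print("A0 =", A0, "C =", C, "lambda0 =", lam0)
    print("residual of (5.5b):", residual_5_5b(W, rho0, -rho0 * A0, lam0, nu_hat_N))
```

Because $B_0'(x) = -\rho(x) A_0$, the same bracket vanishes at every $x$ and not only at $x = 0$, which is the statement that the "constant" in the proof equals zero.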
5.2. GLOBAL CONTINUATION

In this section we carry out a global analysis of solutions of (5.5) via Leray–Schauder degree theory. In order to apply the global continuation results in [16, 4, 2], we need to recast our problem in terms of a compact operator between appropriate Banach spaces. (It turns out that assumption A4 is crucial in this respect.) The local analysis of Section 5.1 is still valid in this setting and thus we just carry out the additional steps for the global analysis.

By an analysis similar to the one that leads to (4.6), we can get that (5.5a) is equivalent to

$$\frac{d}{dx}\bigl[ H(x)^2 \hat{\nu}_N(H(x)) \bigr] - \frac{\rho'(x)}{\rho(x)}\, H(x)^2 \hat{\nu}_N(H(x)) = \lambda\, \frac{\rho'(x)}{\rho(x)}\, (W + gB(x)),$$

which in turn is equivalent to:

$$\frac{d}{dx}\Bigl[ \frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} \Bigr] = \lambda\, \frac{\rho'(x)}{\rho(x)^2}\, (W + gB(x)).$$

If we integrate this equation from $0$ to $x$ we get that

$$\frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} - \frac{H(0)^2 \hat{\nu}_N(H(0))}{\rho(0)} = \lambda \int_0^x \frac{\rho'(t)}{\rho(t)^2}\, (W + gB(t))\, dt. \tag{5.9}$$

A simple integration by parts shows that

$$\int_0^x \frac{\rho'(t)}{\rho(t)^2}\, (W + gB(t))\, dt = \frac{W + gB(0)}{\rho(0)} - \frac{W + gB(x)}{\rho(x)} + g \int_0^x \frac{B'(t)}{\rho(t)}\, dt.$$

Also from (5.5b) we have that $-H(0)^2 \hat{\nu}_N(H(0)) = \lambda (W + gB(0))$. Using these two last identities we can conclude that (5.9) is equivalent to

$$\frac{H(x)^2 \hat{\nu}_N(H(x))}{\rho(x)} = -\lambda \Bigl[ \frac{W + gB(x)}{\rho(x)} - g \int_0^x \frac{B'(t)}{\rho(t)}\, dt \Bigr]. \tag{5.10}$$

Since $\hat{\nu}_N > 0$, the boundary condition (5.5b) implies that $\lambda < 0$. Furthermore, since $B'(x) < 0$, it follows that the right-hand side of the above equation is positive. Let $h(N) = N^2 \hat{\nu}_N(N)$. By hypothesis A4, this function for $N \ge 0$ has an inverse function $h^{-1}(\cdot)$. Thus after multiplying both sides by $\rho(x)$, the above equation is equivalent to

$$\frac{W + gB(x)}{B'(x)} = F(g, \lambda, B)(x), \tag{5.11}$$

where

$$F(g, \lambda, B)(x) = -\frac{1}{\rho(x)}\, h^{-1}\Bigl( -\lambda \Bigl[ W + gB(x) - g\rho(x) \int_0^x \frac{B'(t)}{\rho(t)}\, dt \Bigr] \Bigr). \tag{5.12}$$
Since $F(g, \lambda, B)(x) < 0$ for all $x$, we can write (5.11) as

$$B'(x) - \frac{g}{F(g, \lambda, B)(x)}\, B(x) = \frac{W}{F(g, \lambda, B)(x)}. \tag{5.13}$$

If we treat the coefficient and right-hand side in this equation as if they were known functions of $x$, then after using an appropriate integrating factor and the boundary condition $B(L) = 0$ we can write that

$$B = K_2(g, \lambda, B), \tag{5.14}$$

where

$$K_2(g, \lambda, B)(x) = \frac{W}{g} \Bigl[ \exp\Bigl( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \Bigr) - 1 \Bigr]. \tag{5.15}$$

Note that

$$\frac{d}{dx} K_2(g, \lambda, B)(x) = \frac{W}{F(g, \lambda, B)(x)}\, \exp\Bigl( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \Bigr). \tag{5.16}$$

With this expression for $B'$, the volume constraint in (5.5c) becomes

$$\lambda = K_1(g, \lambda, B), \tag{5.17}$$

where

$$K_1(g, \lambda, B) = V + \lambda + \int_0^L \frac{W}{\rho(x) F(g, \lambda, B)(x)}\, \exp\Bigl( -g \int_x^L \frac{dt}{F(g, \lambda, B)(t)} \Bigr)\, dx. \tag{5.18}$$
If we let $K = (K_1, K_2)$, then (5.5) is equivalent to

$$(\lambda, B) = K(g, \lambda, B). \tag{5.19}$$

Note that $K : E \to \mathbb{R} \times C^1[0, L]$, where

$$E = \bigl\{ (g, \lambda, B) \in [0, \infty) \times (-\infty, 0) \times C^1[0, L] : B'(x) < 0\ \ \forall x \bigr\}. \tag{5.20}$$

The operator $K$ need not be compact on the whole of $E$, as the conditions $B'(x) < 0$ and $\lambda < 0$ might be violated in the limit for some converging sequence. To deal with this possibility, we define for any $\delta > 0$ the open set

$$E_\delta = \bigl\{ (g, \lambda, B) \in E : \lambda < -\delta,\ B'(x) < -\delta\ \ \forall x \bigr\}. \tag{5.21}$$

Note that $E = \cup_{\delta > 0} E_\delta$. We now have:

LEMMA 5.5. For each $\delta > 0$ and $g \in [0, \infty)$, the operator $K(g, \cdot, \cdot)$ that maps $\{(\lambda, B) : (g, \lambda, B) \in E_\delta\}$ into $\mathbb{R} \times C^1[0, L]$ is compact.
Proof. Since the mapping $f \mapsto \int_0^x f(t)\, dt$ is compact from $C[0, L]$ into itself, it follows from (5.12) that $F(g, \cdot, \cdot)$ is compact from $E_{\delta, g} = \{(\lambda, B) : (g, \lambda, B) \in E_\delta\}$ into $C[0, L]$. Furthermore, from (5.15), (5.16), and (5.18), we get that $K(g, \cdot, \cdot)$ is the composition of a continuous operator from $C[0, L]$ into $\mathbb{R} \times C^1[0, L]$ with the compact operator $F(g, \cdot, \cdot)$. Thus $K(g, \cdot, \cdot)$ is compact from $E_{\delta, g}$ into $\mathbb{R} \times C^1[0, L]$. □

We need a few preliminary lemmas before invoking the global continuation results in [16, 4, 2].

LEMMA 5.6. Let $\{(g_j, \lambda_j, B_j)\}$ be a sequence of solutions of (5.19) that converges to $(g, \lambda, B)$ in $\mathbb{R}^2 \times C^1[0, L]$ with $g \ge 0$. Then $B'(x) < 0$ for all $x$ and $\lambda < 0$.

Proof. Since $B_j' < 0$ and $\lambda_j < 0$ for all $j$, it follows that $B' \le 0$ and $\lambda \le 0$. Assume that $B'(\bar{x}) = 0$ for some $\bar{x} \in [0, L]$. Then since $B_j'(\bar{x}) < 0$ for all $j$, we get that

$$\frac{W + g_j B_j(\bar{x})}{B_j'(\bar{x})} \to -\infty, \quad j \to \infty.$$

But from (5.11) and (5.12) we observe that

$$\frac{W + g_j B_j(\bar{x})}{B_j'(\bar{x})} \to -\frac{1}{\rho(\bar{x})}\, h^{-1}\Bigl( -\lambda \Bigl[ W + gB(\bar{x}) - g\rho(\bar{x}) \int_0^{\bar{x}} \frac{B'(t)}{\rho(t)}\, dt \Bigr] \Bigr),$$

which is finite and thus we get a contradiction. Hence $B'(x) < 0$ for all $x \in [0, L]$. To argue that $\lambda < 0$, note that

$$\frac{W + g_j B_j(L)}{B_j'(L)} = \frac{W}{B_j'(L)} \to \frac{W}{B'(L)} < 0.$$

But if $\lambda = 0$, then (5.11) and (5.12) imply that

$$\frac{W + g_j B_j(L)}{B_j'(L)} \to -\frac{1}{\rho(L)}\, h^{-1}(0) = 0,$$

which again leads to a contradiction. □

LEMMA 5.7. For any solution $(g, \lambda, B)$ of (5.19) with $g \ge 0$, we have that

$$\|B\|_{C[0,L]} \le K, \tag{5.22a}$$

$$\|B'\|_{C[0,L]} \le \frac{W \|\rho\|_{C[0,L]}}{h^{-1}(-\lambda W)}\, \exp\Bigl( \frac{g}{h^{-1}(-\lambda W)} \int_0^L \rho(t)\, dt \Bigr), \tag{5.22b}$$

for some constant $K$ depending only on $\rho$ and $V$.
Proof. Since $B(L) = 0$, we have using an integration by parts that

$$B(x) = -\int_x^L B'(t)\, dt = -\int_x^L \rho(t)\, \frac{B'(t)}{\rho(t)}\, dt = -\rho(x) \int_x^L \frac{B'(t)}{\rho(t)}\, dt - \int_x^L \rho'(t) \int_t^L \frac{B'(\xi)}{\rho(\xi)}\, d\xi\, dt.$$

The volume constraint in (5.5c) and the fact that $B' < 0$ imply that

$$0 \le -\int_x^L \frac{B'(t)}{\rho(t)}\, dt \le V. \tag{5.23}$$

This inequality together with the above expression for $B$ gives the result (5.22a). To get (5.22b), note that since $\lambda < 0$, $B' < 0$, and $B(L) = 0$, we get that

$$-\lambda \Bigl[ W + gB(x) - g\rho(x) \int_0^x \frac{B'(t)}{\rho(t)}\, dt \Bigr] \ge -\lambda W.$$

The result now follows from the representations (5.12), (5.16), and the fact that $h^{-1}$ is strictly increasing. □

LEMMA 5.8. Let $\{(g_j, \lambda_j, B_j)\}$ be a sequence of solutions of (5.19) with $0 \le g_j \le R$ for all $j$ for some constant $R$. Then $\{\lambda_j\}$ satisfies

$$\liminf_{j} \lambda_j > -\infty, \quad \limsup_{j} \lambda_j < 0.$$

Proof. If the first inequality does not hold, then $\{\lambda_j\}$ would have a subsequence, which we denote again by $\{\lambda_j\}$, such that $\lambda_j \to -\infty$. Since $h^{-1}(s) \to \infty$ as $s \to \infty$, we get from (5.22b) of Lemma 5.7 that $B_j' \to 0$ in $C[0, L]$. But this is impossible because

$$\int_0^L \frac{B_j'(t)}{\rho(t)}\, dt = -V, \quad \forall j. \tag{5.24}$$

Thus $\{\lambda_j\}$ must be bounded from below. For the second inequality we argue again by contradiction. If $\{\lambda_j\}$ were not bounded away from zero, there would be a subsequence, which we denote again by $\{\lambda_j\}$, such that $\lambda_j \to 0$. It follows now from (5.22a), (5.23), and (5.12) that there are constants $c_j$ such that

$$c_j \le F(g_j, \lambda_j, B_j)(x) < 0, \quad \lim_{j \to \infty} c_j = 0.$$

Using this in (5.16) yields that

$$B_j'(x) \le \frac{W}{c_j} < 0, \quad x \in [0, L].$$

Since $c_j \to 0$, the above inequality would contradict the volume constraint (5.24). Thus $\{\lambda_j\}$ must be bounded away from zero. □
LEMMA 5.9. Let $\{(g_j,\lambda_j,B_j)\}$ be a sequence of solutions of (5.19) with $0 \le g_j \le R$ for all $j$ for some constant $R$. Then $\{(\lambda_j,B_j)\}$ is bounded in $\mathbb{R} \times C^1[0,L]$.

Proof. The result follows from Lemmas 5.7 and 5.8. □

We now have:

THEOREM 5.10. Let $\mathcal{C} \subset E$ be the connected component of solutions of (5.5) containing $(0,\lambda_0,B_0)$ where $(\lambda_0,B_0)$ is given by Lemma 5.2. Then $\mathcal{C}$ is unbounded in $\mathbb{R}^2 \times C^1[0,L]$ and (5.5) has a solution for each $g \ge 0$.

Proof. It follows from Lemma 5.5 and the results in [16, 4, 2] that $\mathcal{C}$ must satisfy at least one of the following alternatives:
(i) $\mathcal{C}$ is unbounded in $\mathbb{R}^2 \times C^1[0,L]$;
(ii) $\mathcal{C}$ contains a solution of the form $(0,\lambda^*,B^*)$ where $(\lambda^*,B^*) \ne (\lambda_0,B_0)$;
(iii) $\mathcal{C} \cap \partial E \ne \emptyset$.
We can rule out alternative (ii) using Lemma 5.2, and alternative (iii) with Lemma 5.6. Thus (i) must hold and the result about the existence of solutions for each $g \ge 0$ follows from the unboundedness of $\mathcal{C}$ and Lemma 5.9. □

This result, as stated, cannot be used to construct a consistent stationary field for the problem (5.1), basically because of the lack of uniqueness. We can however get a partial result for any fixed value of $g$. Note that condition A4 together with the fact that the constraint in (5.2) is linear in $B$ imply that the integrand in (5.3) is strictly convex in $B'$. Let $B$ be a solution of (5.5) corresponding to the length value $L$. It follows now from the results in [18, Theorems 9.10, 9.23] that for $\ell$ sufficiently small, the solution $B$ is a unique (weak) local minimizer of
$$J_\ell(v) = \int_0^\ell \hat\nu\!\left(-\frac{\rho(x)\,(W + g v(x))}{v'(x)}\right) dx, \tag{5.25}$$
on the set
$$X_\ell = \left\{ v \in C^1[0,\ell] : \int_0^\ell \frac{v'(x)}{\rho(x)}\,dx = \int_0^\ell \frac{B'(x)}{\rho(x)}\,dx,\ \ v(\ell) = B(\ell),\ \ v'(x) < 0,\ \forall x \in [0,\ell] \right\}. \tag{5.26}$$
Thus $B$ gives a local minimum among configurations of a rope of length $\ell$, with total volume equal to that of $B$ in $[0,\ell]$, and with total mass at $x = \ell$ equal to $B(\ell)$.

6. Numerical Examples

In this section we present a typical family of constitutive functions that satisfies hypotheses A1–A5. In particular, we consider functions $\hat N(\cdot)$ of the form:
$$\hat N(\nu) = A_1 \nu^{\alpha_1} - A_2 \nu^{-\alpha_2}, \tag{6.1}$$
where $A_1 > 0$, $A_2 \ge 0$, $\alpha_1, \alpha_2 > 0$. This function clearly satisfies A1–A3 and thus has an inverse function $\hat\nu(\cdot)$ such that
$$\hat N(\hat\nu(N)) = N, \qquad N \in \mathbb{R}, \tag{6.2a}$$
$$\hat\nu(\hat N(\nu)) = \nu, \qquad \nu \in (0,\infty). \tag{6.2b}$$
If we differentiate (6.2a) with respect to $N$ and solve for $\hat\nu_N(N)$, then we get that
$$h(N) = N^2 \hat\nu_N(N) = \frac{N^2}{\hat N_\nu(\hat\nu(N))}.$$
If we let $N = \hat N(\nu)$ in this expression, then we get from (6.2b) that
$$h(\hat N(\nu)) = \frac{\hat N^2(\nu)}{\hat N_\nu(\nu)}. \tag{6.3}$$
Now A5 is equivalent to $h(\hat N(\nu)) \to \infty$ as $\nu \to \infty$, which is satisfied by (6.1) for any $A_1 > 0$, $A_2 \ge 0$, $\alpha_1, \alpha_2 > 0$. If we differentiate (6.3) with respect to $\nu$, then we have that
$$h_N(\hat N(\nu)) = \hat N(\nu)\,\frac{2\hat N_\nu^2(\nu) - \hat N(\nu)\hat N_{\nu\nu}(\nu)}{\hat N_\nu^3(\nu)}. \tag{6.4}$$
Condition A4 requires that $N > 0$, which for (6.1) is equivalent to
$$\nu > \left(\frac{A_2}{A_1}\right)^{1/(\alpha_1+\alpha_2)}. \tag{6.5}$$
It follows from (6.4) now that A4 is equivalent to
$$2\hat N_\nu^2(\nu) - \hat N(\nu)\hat N_{\nu\nu}(\nu) > 0, \tag{6.6}$$
provided $\nu$ satisfies (6.5). We now have:

PROPOSITION 6.1. The constitutive function (6.1) satisfies condition A4 for any $A_1 > 0$, $A_2 \ge 0$ provided that $\alpha_1 = \alpha_2 = \alpha > 0$ or for any $\alpha_1 > 0$, $\alpha_2 \ge 1$.

Proof. A direct calculation shows that for (6.1), inequality (6.6) is equivalent to
$$\alpha_1(\alpha_1+1)A_1^2\nu^{2\alpha_1} + \alpha_2(\alpha_2-1)A_2^2\nu^{-2\alpha_2} + \big(\alpha_1^2 + \alpha_2^2 + \alpha_2 + \alpha_1(4\alpha_2-1)\big)A_1A_2\nu^{\alpha_1-\alpha_2} > 0,$$
provided $\nu$ satisfies (6.5). This inequality is automatically satisfied for any $\alpha_1 > 0$, $\alpha_2 \ge 1$. If $\alpha_1 = \alpha_2 = \alpha > 0$, the inequality is satisfied provided
$$\nu > \left(\frac{1-\alpha}{1+\alpha}\right)^{1/4\alpha}\left(\frac{A_2}{A_1}\right)^{1/2\alpha}.$$
Since the expression $(1-\alpha)/(1+\alpha)$ is less than 1 for $\alpha > 0$, this last inequality is satisfied provided $\nu$ satisfies (6.5). □
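The "direct calculation" in the proof of Proposition 6.1 can be spot-checked numerically. The sketch below is an illustration only: the function names and the sample parameter values are hypothetical. It compares $\nu^2$ times the left-hand side of (6.6), computed from the closed-form derivatives of (6.1), with the expanded expression quoted in the proof.

```python
def lhs_66(nu, A1, A2, a1, a2):
    """2*N_nu**2 - N*N_nunu for N(nu) = A1*nu**a1 - A2*nu**(-a2), cf. (6.6)."""
    N = A1 * nu**a1 - A2 * nu**(-a2)
    N1 = a1 * A1 * nu**(a1 - 1) + a2 * A2 * nu**(-a2 - 1)
    N2 = a1 * (a1 - 1) * A1 * nu**(a1 - 2) - a2 * (a2 + 1) * A2 * nu**(-a2 - 2)
    return 2.0 * N1**2 - N * N2

def expanded(nu, A1, A2, a1, a2):
    """Expanded expression from the proof of Proposition 6.1."""
    mixed = a1**2 + a2**2 + a2 + a1 * (4 * a2 - 1)
    return (a1 * (a1 + 1) * A1**2 * nu**(2 * a1)
            + a2 * (a2 - 1) * A2**2 * nu**(-2 * a2)
            + mixed * A1 * A2 * nu**(a1 - a2))

def threshold(A1, A2, a1, a2):
    """Lower bound (6.5) on nu."""
    return (A2 / A1) ** (1.0 / (a1 + a2))

# Hypothetical sample parameters (A1, A2, alpha1, alpha2).
samples = [(1.0, 2.0, 0.5, 0.5), (1.0, 0.0, 3.0, 1.0), (2.0, 1.0, 0.3, 1.5)]
for A1, A2, a1, a2 in samples:
    nu0 = threshold(A1, A2, a1, a2)
    for nu in (nu0 + 0.01, nu0 + 1.0, nu0 + 10.0):
        a = nu**2 * lhs_66(nu, A1, A2, a1, a2)
        b = expanded(nu, A1, A2, a1, a2)
        print(f"A=({A1},{A2}) alpha=({a1},{a2}) nu={nu:.3f}  "
              f"nu^2*lhs={a:.6g}  expanded={b:.6g}  positive={b > 0}")
```

The samples cover both cases of the proposition ($\alpha_1 = \alpha_2$ and $\alpha_2 \ge 1$); a numerical check of this kind illustrates the statement but does not replace the proof.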
Figure 2. Computed mid-cross-sectional area functions for the density functions (6.7) and total fixed mass of 0.03.
We show now some numerical computations for the constitutive function (6.1) for the case $A_1 = 1$, $A_2 = 0$, and $\alpha_1 = 3$. We use variable density functions which along the axis of the rope are either increasing, decreasing, or with an interior minimum. In particular, we consider:
$$\rho_1(x) = 0.1(1 + x), \tag{6.7a}$$
$$\rho_2(x) = 0.2 - 0.1x, \tag{6.7b}$$
$$\rho_3(x) = 5.0(x - 0.5)^2 + 0.1. \tag{6.7c}$$
We used the following values for the parameters $L$, $W$, $g$:
$$L = 1.0, \qquad W = 0.1, \qquad g = 9.8,$$
the units of which are in the metric system. We show in Figure 2 the computed cross-sectional area functions for the densities (6.7) and total fixed mass of 0.03. Note that for variable density functions, the area function need not be decreasing as is the case when the density is constant (cf. (4.14)). Note that for $\rho_2$ the rope is thinner at the beginning and fatter at the end as compared with the one for $\rho_1$, compensating in this way for the decrease in density. In Figure 3 we show the corresponding shape of the rope of minimum elongation for the case (6.7c) assuming circular cross sections. Similar results for the case of total fixed volume of 0.05 are shown in Figures 4 and 5.
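For readers who wish to experiment with this setup, the data of the numerical examples can be written down as follows. This is a minimal sketch, not the solver used to produce the figures (which is not described here); the Python names are hypothetical. It encodes the densities (6.7), the parameters above, and the function $h$ obtained from (6.1) with $A_1 = 1$, $A_2 = 0$, $\alpha_1 = 3$.

```python
# Parameters from the text (metric units).
L, W, g = 1.0, 0.1, 9.8

# Density functions (6.7a)-(6.7c).
def rho1(x):
    return 0.1 * (1.0 + x)

def rho2(x):
    return 0.2 - 0.1 * x

def rho3(x):
    return 5.0 * (x - 0.5) ** 2 + 0.1

# Constitutive function (6.1) with A1 = 1, A2 = 0, alpha1 = 3.
def N_hat(nu):
    return nu ** 3

def N_hat_prime(nu):
    return 3.0 * nu ** 2

def nu_hat(N):
    # Inverse of N_hat; this formula is valid for N > 0.
    return N ** (1.0 / 3.0)

def h(N):
    """h(N) = N^2 / N_hat'(nu_hat(N)), the formula preceding (6.3)."""
    return N ** 2 / N_hat_prime(nu_hat(N))

for name, rho in (("rho1", rho1), ("rho2", rho2), ("rho3", rho3)):
    values = ", ".join(f"{rho(x):.3f}" for x in (0.0, 0.5, 1.0))
    print(f"{name} at x = 0, 0.5, 1: {values}")

nu = 2.0
print("h(N_hat(nu)) =", h(N_hat(nu)), " nu^4/3 =", nu ** 4 / 3.0)
```

For this choice of parameters (6.3) gives $h(\hat N(\nu)) = \nu^4/3$, which the last line compares with the generic formula.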
Figure 3. The shape of the rope of minimum elongation for the density function (6.7c) and total fixed mass of 0.03.
Figure 4. Computed mid-cross-sectional area functions for the density functions (6.7) and total fixed volume of 0.05.
Figure 5. The shape of the rope of minimum elongation for the density function (6.7c) and total fixed volume of 0.05.
7. Conclusions

A problem perhaps more interesting from the practical point of view than the ones treated in this paper is that of minimizing the volume (thus minimizing the amount of material used) of the rope for a given length of the rope. The one-dimensional version of this problem can be treated similarly to the ones discussed here. Its three-dimensional version has applications in the petroleum industry, where long tubes from the top to the bottom of the sea (fixed length) need to be constructed with the least amount of material. In this case the tubes need to be hollow in order to transport various materials, and there is the additional complication of the external water pressure.

The use of Leray–Schauder degree techniques in elasticity has a long and successful history that we will not try to review here. We refer to [3] for examples and an extensive literature review. However, most of these applications of the Leray–Schauder degree have been limited to one-dimensional problems, like the one treated in this paper, due to the complexity of transforming the equations of elasticity into an equivalent problem in terms of a compact operator. Only recently, in [9], was such a major enterprise carried out for the three-dimensional displacement problem of nonlinear elasticity. On the other hand, the use of a degree for proper Fredholm maps of index zero [8, 12] avoids the transformation of the original problem into one in terms of a compact operator but requires some a priori estimates on solutions of the linear problem and its spectrum. For the three-dimensional mixed problem of nonlinear elasticity such spectral estimates were obtained in [11], and together with the estimates in [1] for elliptic systems, Healey and Simpson were able to apply a degree for proper Fredholm maps of index zero to get the existence of a global branch of solutions of this problem. This global continuum, in addition to the usual alternatives for such a continuum, may also "cease" to exist due to a failure of strong ellipticity, local injectivity, or the complementing condition. For specific materials with appropriate growth conditions, one can rule out termination due to a failure of strong ellipticity and local injectivity. (See, e.g., [10].)

Appendix

In this section we derive the model equations for the rope from the three-dimensional theory of elasticity. For the deformation (2.2) the deformation gradient is given by
$$\nabla p = \begin{pmatrix} u'(x) & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{A.1}$$
If we assume that the material of the body is isotropic and hyperelastic, then there exists a smooth stored energy function $\hat\sigma(F)$ of the form
$$\hat\sigma(F) = \sigma\!\left(\tfrac{1}{2}\,F \cdot F,\ \tfrac{1}{4}\,FF^t \cdot FF^t,\ \det F\right),$$
where $F \cdot H = \operatorname{trace}(FH^t)$ and such that the (first) Piola–Kirchhoff stress tensor is given by
$$S(F) = \frac{d\hat\sigma}{dF}(F) = \sigma_{,1}F + \sigma_{,2}FF^tF + (\det F)\,\sigma_{,3}F^{-t}.$$
For (A.1) we have that
$$\frac{F \cdot F}{2} = \frac{(u'(x))^2 + 2}{2}, \qquad \frac{FF^t \cdot FF^t}{4} = \frac{(u'(x))^4 + 2}{4}, \qquad \det F = u'(x). \tag{A.2}$$
It follows now that
$$S(\nabla p) = \operatorname{diag}\!\big(u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3},\ \ \sigma_{,1} + \sigma_{,2} + u'(x)\sigma_{,3},\ \ \sigma_{,1} + \sigma_{,2} + u'(x)\sigma_{,3}\big), \tag{A.3}$$
where the arguments of $\sigma_{,1}$, etc., are given by (A.2). If we let $i$ be a unit vector pointing in the positive $x$ direction, then we have from (2.1) that the force exerted by the material on $[0,x]$ on that on $[x,L]$ is given by
$$-\int_{\Omega_x} S(\nabla p) \cdot i\; ds_x,$$
where $ds_x$ denotes an element of area over the cross section $\Omega_x$. If we let $\hat\rho(x,y,z)$ denote the mass-density per unit volume at $(x,y,z)$ and we assume that a force per unit area $\widehat{W} i$ is applied at the bottom of the rope, then we get that the total force on the material on $[x,L]$ is
$$g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi,y,z)\, i\; ds_\xi\, d\xi + \int_{\Omega_L} \widehat{W}\, i\; ds_L.$$
For equilibrium we must have that
$$\int_{\Omega_x} S(\nabla p) \cdot i\; ds_x = g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi,y,z)\, i\; ds_\xi\, d\xi + \int_{\Omega_L} \widehat{W}\, i\; ds_L,$$
which upon recalling (A.3) reduces to:
$$A(x)\big[u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3}\big] = g\int_x^L\!\!\int_{\Omega_\xi} \hat\rho(\xi,y,z)\; ds_\xi\, d\xi + W, \tag{A.4}$$
where
$$A(x) = \int_{\Omega_x} ds_x, \qquad W = \widehat{W} A(L).$$
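The algebra leading from (A.1) to (A.2) and (A.3) can be verified with a few lines of code. The sketch below is illustrative only: the numerical values passed for $u'$ and for the partial derivatives $\sigma_{,1}, \sigma_{,2}, \sigma_{,3}$ are hypothetical placeholders, since the stored energy function is not specified.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def inner(A, B):
    """A . B = trace(A B^t), i.e. the sum of entrywise products."""
    return sum(A[i][j] * B[i][j] for i in range(3) for j in range(3))

def gradient(up):
    """Deformation gradient (A.1) with u'(x) = up."""
    return [[up, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def invariants(up):
    """The three arguments of sigma listed in (A.2)."""
    F = gradient(up)
    FFt = matmul(F, transpose(F))
    return inner(F, F) / 2.0, inner(FFt, FFt) / 4.0, up  # det F = up here

def piola(up, s1, s2, s3):
    """S = s1*F + s2*F F^t F + det(F)*s3*F^{-t} for the diagonal F of (A.1)."""
    F = gradient(up)
    FFtF = matmul(matmul(F, transpose(F)), F)
    F_inv_t = [[1.0 / up, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return [[s1 * F[i][j] + s2 * FFtF[i][j] + up * s3 * F_inv_t[i][j]
             for j in range(3)] for i in range(3)]

# Hypothetical sample values for u' and for sigma_{,1}, sigma_{,2}, sigma_{,3}.
up, s1, s2, s3 = 1.5, 2.0, 0.5, -1.0
print("invariants:", invariants(up))
for row in piola(up, s1, s2, s3):
    print(row)
```

The printed matrix is diagonal, with first entry $u'\sigma_{,1} + (u')^3\sigma_{,2} + \sigma_{,3}$ and the other two equal to $\sigma_{,1} + \sigma_{,2} + u'\sigma_{,3}$, as in (A.3).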
If we let
$$\hat N(u'(x)) = u'(x)\sigma_{,1} + (u'(x))^3\sigma_{,2} + \sigma_{,3},$$
and assume that $\hat\rho(x,y,z) = \rho(x)$, then (A.4) reduces to (2.9).

References

1. S. Agmon, A. Douglis and L. Nirenberg, Estimates near the boundary for solutions of elliptic partial differential equations satisfying general boundary conditions. Comm. Pure Appl. Math. II(17) (1964) 35–92.
2. J.C. Alexander and J.A. Yorke, The implicit function theorem and the global methods of cohomology. J. Funct. Anal. 21 (1976) 330–339.
3. S.S. Antman, Nonlinear Problems of Elasticity, Applied Mathematical Sciences 107. Springer, New York (1995).
4. F.E. Browder, On the continuity of fixed points under deformations of continuous mappings. Summa Brazil. Mat. 4 (1960) 183–191.
5. S.J. Cox, The shape of the ideal column. Math. Intelligencer 14(1) (1992) 16–24.
6. S.J. Cox and M. McCarthy, The shape of the tallest column. SIAM J. Math. Anal. 29(3) (1998) 547–554.
7. L. Euler, De motu corporum flexibilum. Comm. Acad. Sci. Petrop. 14 (1751) 182–196.
8. C.C. Fenske, Extensio gradus ad quasdam applicationes Fredholmii. Mitt. Math. Seminar Giessen 121 (1976) 65–70.
9. T.J. Healey, Global continuation in displacement problems of nonlinear elastostatics via the Leray–Schauder degree. Arch. Rational Mech. Anal. 152 (2000) 273–282.
10. T.J. Healey and P. Rosakis, Unbounded branches of classical injective solutions to the forced displacement problem in nonlinear elastostatics. J. Elasticity 49 (1997) 65–78.
11. T.J. Healey and H.C. Simpson, Global continuation in nonlinear elasticity. Arch. Rational Mech. Anal. 143 (1998) 1–28.
12. H. Kielhöfer, Multiple eigenvalue bifurcation for Fredholm mappings. J. Reine Angew. Math. 358 (1985) 104–124.
13. J.B. Keller and F.I. Niordson, The tallest column. J. Math. Mech. 16 (1966) 433–446.
14. J.L. Lagrange, Application de la méthode exposée précédente à la solution de différens problèmes de dynamique. Misc. Tour. 2(2) (1762) 196–298.
15. S. Lang, Real Analysis, 2nd edn. Addison-Wesley, Reading, MA (1983).
16. J. Leray and J. Schauder, Topologie et équations fonctionelles. Ann. Sci. École Norm. Sup. 3(51) (1934) 45–78.
17. I. Stakgold, Green's Functions and Boundary Value Problems. Wiley, New York (1979).
18. J.L. Troutman, Variational Calculus with Elementary Convexity. Springer, New York (1983).
19. T. Valent, Boundary Value Problems of Finite Elasticity, Springer Tracts in Natural Philosophy. Springer, New York (1988).
20. G. Verma and J. Keller, Hanging rope of minimum elongation. SIAM Rev. (1983) 369–399.
On Certain Weak Phase Transformations in Multilattices

MARIO PITTERI
DMMMSA, Università di Padova, Via Belzoni 7, 35131 Padova, Italy. E-mail: [email protected]

Received 13 September 2002; in revised form 8 April 2003

Abstract. This paper is dedicated to the memory of Clifford Truesdell, to whom I acknowledge my debt and gratitude. I present some recent results of mine on the thermomechanics of crystalline solids, research on which I began working during my stay at The Johns Hopkins University, 1977–1979. Then, in the last section, I mention some unusual merits of Truesdell which I experienced in my scientific career and which are not widely known.

Mathematics Subject Classifications (2000): 74A30, 74B20, 74G65, 74N05.

Key words: crystalline solids, phase transformations, physical possibility, quartz, thermoelasticity.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 651–671. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

I have many reasons to be grateful to Clifford Truesdell, both as a scientist and as a person. During my research activity as a postdoc I soon realized that I needed to deepen my knowledge of kinetic-statistical theories of mechanics, and The Johns Hopkins University was my choice because of Truesdell's work on the kinetic theory of gases; he was then carrying to completion his book [22] with Muncaster. During my stay there I was also exposed to other subjects I was not so familiar with, among which were elements of experimental continuum mechanics and of the history of mechanics, classical thermodynamics along Carnot's lines, stability problems in continuum mechanics, and nonlinear elasticity of crystals. In fact, in the past fifteen years the last topic has been the main focus of my research, the most recent results of which constitute the core of this paper.

There has recently been a renewed interest in the geometry and kinematics of multilattices, in view of constructing a nonlinear model of the thermomechanical behavior of complex crystals. The background, some details and references are given in [17], where some still unsolved problems are also outlined. One of these is the formulation of a unified kinematics of multilattices of different complexity. Indeed, any multilattice configuration admits a maximal skeletal lattice of translations mapping the multilattice to itself (the essential skeleton), which nevertheless shares the property of being translation-invariant with any one of its infinitely many sublattices. Once a sublattice of the essential skeleton is selected, additional vectors (shifts) have to be introduced to describe the multilattice configuration, and these
M. PITTERI
change, possibly also in number, with the changing sublattice. The kinematics developed so far in the literature works well for multilattices whose lattice of translations is the essential skeleton itself; these will be called essentially described, or essential for short. In this paper we restrict attention to essential multilattices alone. Further work is needed to cope with deformations along which a multilattice changes its periodicity, in a suitable sense. Handling such deformations would be of great interest from the theoretical point of view as well as for applications, for instance to certain phase transitions of shape memory alloys. In [15] it is shown by a simple example how the knowledge of the full geometry and kinematics of essential multilattices can help in classifying all the possibilities for their weak – that is, involving suitably small distortions – symmetry-breaking thermoelastic phase transformations, and motivation for the procedure is provided. The example is the diatomic 2-lattice used in the introductory sections of [19] to describe what is called a structural phase transition in a crystalline solid: in a primitive tetragonal lattice whose unit cell has a physically different atom in its center the transition is driven by the central atom moving off center. That this can happen in essentially two different ways is shown to be one of the possibilities; the other, orthogonal to the first in a suitable sense, corresponds to a configurational transition: the tetragonal skeleton drives the transformation, which is then accompanied by a suitable displacement of the central atom. For this kind of transition the independent generic possibilities have been classified in [5]. Here, in Section 3, I provide an explicit, general framework for generic weak bifurcations in essential multilattices, completing a scheme presented in [6]. 
As an example, in Section 4 we study the structural transitions of β-quartz, described by the 3-lattice model introduced in [12]. The analysis shows that, among others, there are two – mutually orthogonal in a sense, and both described by a 1-dimensional order parameter space – trigonal trapezohedral low-symmetry product phases, one
Figure 1. Equivalent bases for a given (planar) lattice. An example is a controlled deformation of a body-centered cubic (bcc) simple lattice along which the central atom starts moving off-center for certain (transition) values of the controls. At the transition the nonessential 2-lattice equivalent to the essential bcc simple lattice starts deforming into an essential 2-lattice.
PHASE CHANGES IN MULTILATTICES
of which is the α-phase as modelled in [12]. The other was obtained in [6] as an outcome of the search for a third quartz phase, used to give an alternative to the so-called incommensurate phase introduced in the physical literature to explain certain peculiarities of the α–β transition. Here the geometry of the 3-lattice describing this new phase is given in detail. Finally, since this paper is appearing in a volume dedicated to the memory of Clifford Truesdell, in Section 5 I mention some of his unusual merits, which I experienced in my scientific career and which are not widely known.

2. Preliminaries

I sketch here the bare essentials for the rest of the paper, referring the reader to [17] for more information. In particular, as there, I use the summation convention and "running indices" without specifying their range; for instance in expressions like "the lattice basis $e_a$" instead of "the lattice basis $\{e_a,\ a = 1, 2, 3\}$", or "the function $\hat\phi(e_a, p_r, \theta)$" instead of "the function $\hat\phi(e_1, e_2, e_3, p_1, \dots, p_{n-1}, \theta)$". This should not generate too much confusion. Also, the relations $\le$ and $<$ between groups mean subgroup of and proper subgroup of, respectively.

Let $\mathbb{Z}$ and $\mathbb{R}$ denote the integral and real numbers, respectively. Consider first the simplest triply-periodic structures, that is, simple lattices (or 1-lattices):
$$\mathcal{L} = \{N^a e_a,\ a = 1, 2, 3,\ N^a \in \mathbb{Z}\} = \mathcal{L}(e_a). \tag{1}$$
The lattice vectors (or lattice basis) $e_a$ are linearly independent in $\mathbb{R}^3$. Any basis $e_a$ uniquely determines the 1-lattice $\mathcal{L}(e_a)$, but not vice versa:
$$\mathcal{L}(\bar e_a) = \mathcal{L}(e_a) \quad\Leftrightarrow\quad \bar e_a = m_a^b e_b, \quad m \in GL(3,\mathbb{Z}); \tag{2}$$
here $GL(3,\mathbb{Z})$ is the group of 3 by 3 integral matrices with determinant $\pm 1$. The crystallographic point group (or holohedry), $P(e_a)$, of the lattice $\mathcal{L}(e_a)$ is then defined as the group of all the orthogonal transformations mapping $\mathcal{L}(e_a)$ to itself; equivalently:
$$P(e_a) = \{Q \in O(3) : Qe_a = m_a^b e_b\}. \tag{3}$$
Notice that the basis $e_a$ satisfies (3) if and only if the lattice metric
$$C = (C_{ab}), \qquad C_{ab} = e_a \cdot e_b, \tag{4}$$
is a fixed point of the map
$$C \mapsto m^t C m. \tag{5}$$
The conjugacy classes of the holohedries in $O(3)$ correspond to the well known 7 crystal systems.
Figure 2. An elementary cell for the hcp lattice, and its projection on the x–y basal plane.
By looking at the right-hand side of the equation defining $P(e_a)$ in (3) we introduce the lattice group $L(e_a)$ of a lattice $\mathcal{L}(e_a)$, associated with its lattice basis $e_a$:
$$L(e_a) = \{m \in GL(3,\mathbb{Z}) : m_a^b e_b = Qe_a,\ Q \in P(e_a)\} = \{m \in GL(3,\mathbb{Z}) : m^t C m = C\}. \tag{6}$$
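The second characterization in (6) lends itself to a brute-force computation. The following sketch is an illustration under stated assumptions, not part of the theory above: the search is restricted to matrices with entries in $\{-1, 0, 1\}$, which is enough for the two sample metrics used (a cubic and a tetragonal one) but not for an arbitrary lattice basis, and the function names are hypothetical.

```python
from itertools import product

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def congruence(m, C):
    """Return m^t C m as a tuple of tuples."""
    return tuple(
        tuple(sum(m[i][a] * C[i][j] * m[j][b] for i in range(3) for j in range(3))
              for b in range(3))
        for a in range(3))

def lattice_group(C):
    """Integral matrices with entries in {-1, 0, 1}, det = +-1 and m^t C m = C."""
    C = tuple(tuple(row) for row in C)
    found = []
    for e in product((-1, 0, 1), repeat=9):
        m = (e[0:3], e[3:6], e[6:9])
        if abs(det3(m)) == 1 and congruence(m, C) == C:
            found.append(m)
    return found

cubic = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]        # metric of an orthonormal basis
tetragonal = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # one axis of different length
print("cubic metric:", len(lattice_group(cubic)), "elements")
print("tetragonal metric:", len(lattice_group(tetragonal)), "elements")
```

For these two metrics the counts are the orders of the corresponding holohedries, since here each lattice group element corresponds to exactly one orthogonal map in $P(e_a)$.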
By the last equality the lattice group depends on the basis $e_a$ only through the corresponding metric $C$, hence can be denoted by $L(C)$. The conjugacy classes of lattice groups in $GL(3,\mathbb{Z})$ correspond to the well known 14 (Bravais) lattice types.

Real crystals (hexagonal metals, alloys, etc.) are not in general 1-lattices. Their geometry and kinematics can be described by means of multilattices, which are the union of a finite number of nontrivial translates of a 1-lattice. A simple, well known example of a 2-lattice is the hexagonal close-packed (hcp) structure, which is sketched in Figure 2. In general, an $n$-lattice $\mathcal{M}$ in 3-dimensional affine space can be defined as follows, by using Grassmann notation, choosing the origin $O$ at one of the lattice points and setting $p_0 = 0$ for convenience:
$$\mathcal{M} = \mathcal{M}(e_a, p_1, \dots, p_{n-1}) = \bigcup_{r=0}^{n-1} \{O + \mathcal{L}(e_a) + p_r\}; \tag{7}$$
$\mathcal{L}(e_a)$ is called the skeletal lattice of $\mathcal{M}$. Figure 3 is a schematic picture of a 2-dimensional (planar) 2-lattice; if the atoms represented by filled circles are physically indistinguishable from the ones represented by open circles, the 2-lattice is called monatomic, otherwise diatomic. For simplicity, in this paper all multilattices are monatomic. The multilattice descriptors $(e_a, p_r) =: \varepsilon_\sigma$, $\sigma = 1, \dots, n+2$ (in terms of which we can write $\mathcal{M} = \mathcal{M}(\varepsilon_\sigma)$) satisfy the following conditions guaranteeing the three-dimensionality of $\mathcal{M}$ and the non-overlap of the constituent 1-lattices:
$$e_1 \cdot e_2 \times e_3 \ne 0, \qquad p_r \ne p_s + l_{rs}^a e_a, \quad r, s = 0, \dots, n-1, \quad l_{rs}^a \in \mathbb{Z}, \tag{8}$$
with the exclusion of the case $r = 0 = s$.
Figure 3. Unit cells of the component 1-lattices of a planar 2-lattice.
An $n$-lattice $\mathcal{M}$ is called essential if its skeletal lattice contains all the translations mapping $\mathcal{M}$ to itself. In this case the lattice cell has minimum volume. The $O(3)$-invariant multilattice metric
$$K = (K_{\sigma\tau}), \qquad K_{\sigma\tau} = K_{\tau\sigma} = \varepsilon_\sigma \cdot \varepsilon_\tau, \quad \sigma, \tau = 1, \dots, n+2, \tag{9}$$
is an analog of (4). The manifold $Q^m_{n+2}$ of multilattice metrics is a submanifold of the vector space $Q_{n+2}$ of all symmetric matrices in $\mathbb{R}^{n+2}$; $Q^m_{n+2}$ is a "state space" for $n$-lattices analogous to the set $Q_3^>$ of positive definite quadratic forms in $\mathbb{R}^3$ (the lattice metrics) for 1-lattices. The "global symmetry group" of essential $n$-lattices expresses the indeterminateness in the choice of the multilattice descriptors:
$$\mathcal{M}(\bar\varepsilon_\sigma) = \mathcal{M}(\varepsilon_\sigma) \quad\Leftrightarrow\quad \bar\varepsilon_\sigma = \mu_\sigma^\tau \varepsilon_\tau, \quad \mu \in \Gamma_{n+2}, \tag{10}$$
where $\Gamma_{n+2} < GL(n+2,\mathbb{Z})$ consists of the matrices of the form
$$(\mu_\sigma^\tau) = \begin{pmatrix} m_a^b & l_1^b \ \cdots \ l_{n-1}^b \\ 0 & \alpha_r^s \end{pmatrix}, \qquad \bar\alpha = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ -1 & -1 & \cdots & -1 & -1 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \tag{11}$$
for $a, b = 1, 2, 3$, $r, s = 1, \dots, n-1$, $(m_a^b) \in GL(3,\mathbb{Z})$, $l_r^b \in \mathbb{Z}$, and $\alpha = (\alpha_r^s)$ belonging to the finite noncommutative group of matrices generated by the permutation matrices of the set $\{1, \dots, n-1\}$ and by the $n-1$ by $n-1$ matrices $\bar\alpha$ of the form (11)$_2$, which are obtained from the identity by replacing one of its rows by a row of $-1$s. If $\alpha$ is not a permutation of $\{1, \dots, n-1\}$, then necessarily one of its rows consists of $-1$s.
Analogous to (5), the group $\Gamma_{n+2}$ acts in a natural way on the manifold of multilattice metrics:
$$K \mapsto \mu^t K \mu. \tag{12}$$
Among all changes of descriptors, particularly important are those which produce an affine isometry of the multilattice onto itself. If, with respect to the chosen origin $O$ (which is one of the lattice points), we represent the isometry as a pair $(t, Q)$, $t \in \mathbb{R}^3$, $Q \in O(3)$, it must be, for $k = 0$ if $\alpha$ is a permutation or, otherwise, for the index $k$ of the row of $\alpha$ made of $-1$s (see (11)):
$$Q\varepsilon_\sigma = \mu_\sigma^\tau \varepsilon_\tau, \qquad t = p_k + n_0^a e_a, \quad n_0^a \in \mathbb{Z}. \tag{13}$$
In terms of the metric $K$ in (9), equality (13)$_1$ is equivalent to the condition that $K$ be a fixed point for the corresponding map (12):
$$\mu^t K \mu = K. \tag{14}$$
For any essentially described multilattice $\mathcal{M} = \mathcal{M}(\varepsilon_\sigma)$ the solutions $(t, Q)$ of (13) depend on $\mathcal{M}$ itself and not on its specific descriptors $\varepsilon_\sigma$, and they constitute the space group $S(\mathcal{M})$ of $\mathcal{M}$. The orthogonal maps appearing in (13)$_1$ constitute the crystal class $P(\mathcal{M})$ (or $P(\varepsilon_\sigma)$) of $\mathcal{M}$, which is a subgroup of the skeletal holohedry $P(e_a)$. The matrices $\mu \in \Gamma_{n+2}$ in (13)$_1$ form the lattice group $\Lambda(\varepsilon_\sigma) < GL(n+2,\mathbb{Z})$ of $\mathcal{M}(\varepsilon_\sigma)$. The finite group $\Lambda(\varepsilon_\sigma)$ depends on the choice of the $\varepsilon_\sigma$ (actually, on the corresponding metric $K$, so that it can be denoted by $\Lambda(K)$) and, under a change of descriptors, changes to a conjugate group in $\Gamma_{n+2}$. The conjugacy classes of lattice groups in $\Gamma_{n+2}$ formalize the notion of (arithmetic) $n$-lattice types, and the way $\Gamma_{n+2}$ acts on the state space $Q^m_{n+2}$ gives information on the kinematics of deformable $n$-lattices. This analysis is considerably simpler if one restricts attention to suitably small distortions:

PROPOSITION 1. Any multilattice metric $K$ has a $\Lambda(K)$-invariant neighborhood $\mathcal{N}$ in $Q^m_{n+2}$, to be called a wt-nbhd of $K$, such that, for any $\mu \in \Gamma_{n+2}$,
$$\mu^t \mathcal{N} \mu \cap \mathcal{N} \ne \emptyset \quad\Leftrightarrow\quad \mu \in \Lambda(K) \quad\Leftrightarrow\quad \mu^t K \mu = K. \tag{15}$$
Therefore, in any $\Lambda(K)$-invariant neighborhood of $K$ contained in $\mathcal{N}$ the global $\Gamma_{n+2}$-invariance reduces to the invariance under the lattice group $\Lambda(K)$:
$$\text{for any } \bar K \in \mathcal{N}, \qquad \Lambda(\bar K) \le \Lambda(K). \tag{16}$$
For any choice of descriptors $\varepsilon_\sigma$ an analogous $O(3)$-invariant neighborhood exists, and will be also called a wt-nbhd of $\varepsilon_\sigma$.

This result allows us to efficiently reduce the description of the invariance in any wt-nbhd $\mathcal{N} \subset Q^m_{n+2}$ of the metric $K$ of an arbitrarily chosen essential $n$-lattice, and to greatly simplify the classification of its generic elastic bifurcations in $\mathcal{N}$.
For simplicity, here we consider a crystalline solid in equilibrium with a heat bath of which we only control the temperature. One can extend this treatment to accommodate other controls, for instance, pressure, as in [6], or shear stresses, as in [4, 18], or [7]. Here an appropriate thermodynamic potential is the Helmholtz free energy of the multilattice, which is assumed to have a density per unit skeletal cell; this is a sufficiently smooth function
$$\phi = \hat\phi(e_a, p_r, \theta) = \bar\phi(\varepsilon_\sigma, \theta), \tag{17}$$
where $\theta$ denotes the absolute environmental temperature, regarded as a control. The free energy density at zero temperature coincides with the internal energy density, and it can be reasonably assumed to depend on the location of the multilattice points also at any given positive temperature. Therefore the functions $\hat\phi$ or $\bar\phi$ must have the same value on any two equivalent sets of descriptors for the same configuration; hence, for any $\mu \in \Gamma_{n+2}$, they satisfy the invariance conditions
$$\hat\phi(m_a^b e_b,\ \alpha_r^s p_s + l_r^a e_a,\ \theta) = \hat\phi(e_a, p_r, \theta), \qquad \bar\phi(\mu_\sigma^\tau \varepsilon_\tau, \theta) = \bar\phi(\varepsilon_\sigma, \theta), \tag{18}$$
respectively. In addition, for these functions Galilean invariance reduces to invariance under orientation-preserving isometries. In particular,
$$\hat\phi(Qe_a, Qp_r, \theta) = \hat\phi(e_a, p_r, \theta) \quad \text{for any } Q \in SO(3), \tag{19}$$
hence
$$\hat\phi(e_a, p_r, \theta) = \tilde\phi(s, C_{ab}, p_{ra}, \theta), \tag{20}$$
for
$$C_{ab} = e_a \cdot e_b, \qquad p_{ra} = p_r \cdot e_a, \qquad s = \operatorname{sgn}(e_1 \cdot e_2 \times e_3). \tag{21}$$
For any $\mu \in \Gamma_{n+2}$ the function $\tilde\phi$ satisfies the equality
$$\tilde\phi(s, C_{ab}, p_{ra}, \theta) = \tilde\phi\big(s\,s(\mu),\ m_a^i C_{ij} m_b^j,\ m_a^b(\alpha_r^s p_{sb} + C_{bi} l_r^i),\ \theta\big), \tag{22}$$
$m(\mu)$ denoting the $m$-component of $\mu$, and $s(\mu)$ the sign of $\det m(\mu)$.

3. Phase Changes in a wt-nbhd

Consider a multilattice $\mathcal{M}$ whose admissible configurations can all be described as (perhaps nonessential) $n$-lattices; a (reference) configuration in which $\mathcal{M}$ is an essential $n$-lattice, described by vectors $\varepsilon_\sigma^0 = (e_a^0, p_r^0)$, and an $O(3)$-invariant wt-nbhd $\mathcal{N}$ of $\varepsilon_\sigma^0$, based on Proposition 1. Since no configurations of higher complexity are considered, all our results are local, and in the configuration space the nonessential multilattices form smooth submanifolds of strictly lower dimension (see [17]); and since $\mathcal{N}$ is the union of disjoint $SO(3)$-invariant neighborhoods $\mathcal{N}^+$ and $\mathcal{N}^-$ of $\varepsilon_\sigma^0$ and $-\varepsilon_\sigma^0$, respectively, we can assume, without loss of generality, that in $\mathcal{N}^+$ all descriptors $(e_a, p_r)$ are essential, and $s = s_0 := \operatorname{sgn}(e_1^0 \cdot e_2^0 \times e_3^0)$. The following analogues of (13)$_1$, (18)$_1$ hold:
$$m_a^b e_b^0 = Qe_a^0, \qquad \alpha_r^s p_s^0 + l_r^a e_a^0 = Qp_r^0 \quad \text{for } Q \in P(\varepsilon_\sigma^0), \tag{23}$$
$$\hat\phi(e_a^0, p_r^0, \theta) = \hat\phi(m_a^b e_b^0,\ \alpha_r^s p_s^0 + l_r^a e_a^0,\ \theta) \quad \text{for } \mu \in \Lambda(\varepsilon_\sigma^0). \tag{24}$$
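Denoting by $\mathrm{Sym}$ the space of symmetric tensors, and by $\mathrm{Sym}^>$ the convex cone of the positive definite ones, we normalize the orientation of the (deformed) skeletal lattices of the multilattices in $\mathcal{N}^+$ by restricting attention to lattice bases of the form $e_a = Ue_a^0$, $U \in \mathrm{Sym}^>$, and define the referential shift increments $\pi_r$ by the equalities
$$p_r = U(p_r^0 + \pi_r). \tag{25}$$
The following is easily proved:
$$C_{ab} = e_a^0 \cdot Ce_b^0, \qquad C = U^2, \qquad \text{and} \qquad p_{ra} = e_a^0 \cdot C(p_r^0 + \pi_r). \tag{26}$$
Therefore in $\mathcal{N}^+$, where $s$ is fixed,
$$\tilde\phi(s, C_{ab}, p_{ra}, \theta) = \check\phi(C, \pi_r, \theta). \tag{27}$$
From (18), (23), and (25), we have, for $Q \in P^+(\varepsilon_\sigma^0)$, $\mu \in \Lambda^+(\varepsilon_\sigma^0)$:
$$\check\phi(C, \pi_r, \theta) = \check\phi(Q^t C Q,\ \alpha_r^s Q^t \pi_s,\ \theta), \qquad Q\varepsilon_\sigma^0 = \mu_\sigma^\tau \varepsilon_\tau^0; \tag{28}$$
here $P^+(\varepsilon_\sigma^0)$ is the subgroup of positive-determinant elements of $P(\varepsilon_\sigma^0)$, with an analogue for the $m$ component of the elements of $\Lambda^+(\varepsilon_\sigma^0)$. Indeed, for any $\mu \in \Lambda^+(\varepsilon_\sigma^0)$,
$$\check\phi(C, \pi_r, \theta) = \hat\phi(e_a, p_r, \theta) = \hat\phi(m_a^b e_b,\ \alpha_r^s p_s + l_r^a e_a,\ \theta) = \hat\phi\big(UQe_a^0,\ UQ(p_r^0 + \alpha_r^s Q^t \pi_s),\ \theta\big), \tag{29}$$
from which the conclusion follows by (26) and (27). A minor problem is that the matrix $\alpha = (\alpha_r^s)$ is not orthogonal; but it can be always expressed in terms of one such, and it is convenient to do so. For instance, denote by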
from which the conclusion follows by (26) and (27). A minor problem is that the matrix α = (αrs ) is not orthogonal; but it can be always expressed in terms of one such, and it is convenient to do so. For instance, denote by G := {αi : µi ∈ Λ+ (εσ0 )}
(30)
the group of the submatrices α of the elements of Λ+ (εσ0 ), and by |G| the order of G; then construct the “metric” |G|
1 t αj αj = λ2k dk ⊗ dk =: W 2 , g= |G| j =1 k=1 W =
n−1
n−1
(31)
λk dk ⊗ dk ,
k=1
the dk being an orthonormal basis of eigenvectors of g and W . It immediately follows that αit gαi = g
for all αi ∈ G.
(32)
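The averaging construction (31) and the invariance (32) can be illustrated in exact arithmetic. The sketch below makes an assumption that should be kept in mind: it takes for $G$ the full group generated, for $n - 1 = 2$, by the transposition matrix and by one matrix $\bar\alpha$ of the form (11)$_2$, whereas in the text $G$ is the subgroup selected by $\Lambda^+(\varepsilon_\sigma^0)$, which depends on the multilattice. The function names are hypothetical.

```python
from fractions import Fraction

def mul(a, b):
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def transpose(a):
    return tuple(zip(*a))

def closure(generators):
    """Finite group generated by the given matrices (assumed to generate a finite group)."""
    n = len(generators[0])
    identity = tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for s in generators:
            y = mul(x, s)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

def averaged_metric(group):
    """g = (1/|G|) * sum over the group of alpha^t alpha, as in (31)."""
    n = len(next(iter(group)))
    total = [[Fraction(0)] * n for _ in range(n)]
    for a in group:
        ata = mul(transpose(a), a)
        for i in range(n):
            for j in range(n):
                total[i][j] += ata[i][j]
    return tuple(tuple(total[i][j] / len(group) for j in range(n)) for i in range(n))

swap = ((0, 1), (1, 0))            # permutation of {1, 2}
alpha_bar = ((-1, -1), (0, 1))     # identity with its first row replaced by -1s
G = closure([swap, alpha_bar])
g = averaged_metric(G)

print("order of G:", len(G))
print("g =", g)
print("invariant:", all(mul(mul(transpose(a), g), a) == g for a in G))
```

The invariance check is the content of (32); it holds for any finite matrix group because left or right multiplication by a group element only permutes the terms of the average.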
Introduce now the group $\tilde G$, conjugate to $G$, consisting of the orthogonal matrices
$$\beta_i = W\alpha_i W^{-1} \in O(n-1), \tag{33}$$
and the new reference shift increments
$$\varpi_r = (W^{-1})_r^s \pi_s. \tag{34}$$
Then the new function $\breve\phi(C, \varpi_r) = \check\phi(C, W_r^s \varpi_s)$ has the following invariance, when $Q$ and $\beta$ are related to the same matrix $\mu \in \Lambda^+(\varepsilon_\sigma^0)$:
$$\breve\phi(C, \varpi_r) = \breve\phi(Q^t C Q,\ \beta_r^s Q^t \varpi_s), \quad \text{for any } Q \in P^+(\varepsilon_\sigma^0),\ \beta \in \tilde G. \tag{35}$$
At this point we apply a procedure used by Ericksen [5] (see also [17]) to classify generic weak bifurcations in simple lattices: we introduce orthonormal bases $V_k$, $k = 1, \dots, 6$, in $\mathrm{Sym}$ and $c_l$, $l = 1, 2, 3$, in $\mathbb{R}^3$, and the representations
$$C = 1 + \sum_{k=1}^{6} y_k V_k, \qquad \varpi_r = \sum_{l=1}^{3} y_{rl}\, c_l, \tag{36}$$
so that, in particular, $C$ is near 1 if and only if $(y_1, \dots, y_6)$ is near $0 \in \mathbb{R}^6$. In the treatment of the reduced problems in Section 4 we will choose the basis $(c_1, c_2, c_3)$ to coincide with $(i, j, k)$ introduced there, and the basis $V_1, \dots, V_6$ to consist of the tensors represented in the basis $(i, j, k)$ by the matrices shown in (47)–(49). By putting in a single list $(y_i)$ the $3(n+1)$ coordinates so introduced, we can define the corresponding free energy density $\Phi(y_i, \theta) = \breve\phi(C, \varpi_r, \theta)$, which then enjoys the invariance
$$\Phi(y_i, \theta) = \Phi(\bar y_i, \theta), \qquad \bar y_i = Q_{ij} y_j, \qquad Q \in O(3(n+1)). \tag{37}$$
By (35)$_1$ each matrix $Q$ is a block matrix, with a 6 by 6 and a $3(n-1)$ by $3(n-1)$ blocks, each one itself orthogonal. We denote by $\mathcal{G}$ the group of such orthogonal matrices $Q$ corresponding to elements $\mu$ of $\Lambda^+(\varepsilon_\sigma^0)$. In terms of $\Phi$ the equations of equilibrium in the absence of loads are, in a convenient notation,
$$\Phi_{y_i} := \frac{\partial \Phi}{\partial y_i}(y_j, \theta) = 0, \qquad i = 1, \dots, 3(n+1). \tag{38}$$
We assume these conditions to hold for $\theta = \bar\theta$ and $y_i = 0$, the latter giving the (reference) multilattice $M(\varepsilon^0_\sigma)$. Consider now the second-order symmetric tensor of moduli at the transition:
\[ L_{ij} = \hat\phi_{y_i y_j}(0, \bar\theta); \tag{39} \]
if this is invertible, then, by the implicit function theorem, the equilibrium equations (38) have one solution $y_i = \bar y_i(\theta)$ for $\theta$ near $\bar\theta$, such that $\bar y_i(\bar\theta) = 0$, $i =$
M. PITTERI
$1, \dots, 3(n+1)$. Also, by continuity and uniqueness, all points on this equilibrium branch have the same symmetry as $M(\varepsilon^0_\sigma)$. Therefore symmetry breaking can only occur if the tensor $L$ of moduli has a nontrivial kernel. The invariance (37) forces, by differentiation, the following identity among the second derivatives of the free energy density $\hat\phi$ at $(0, \bar\theta)$:
\[ L = Q^t L Q \quad \text{for any } Q \in \mathcal G, \tag{40} \]
hence the eigenspaces of $L$ are invariant under the action of $\mathcal G$:
\[ Q^t L Q y = L y = \lambda y \iff L Q y = \lambda Q y. \tag{41} \]
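The differentiation step that leads from (37) to (40), and then to (41), can be sketched as follows. The symbol $\hat\phi$ for the free energy density in the coordinates $y_i$ is a notational assumption made here; the chain rule is applied twice and the result is evaluated at $y = 0$, which every orthogonal $Q$ leaves fixed.

```latex
% Invariance (37), written with the assumed symbol \hat\phi:
\hat\phi(y,\theta) = \hat\phi(Qy,\theta)
\quad\Longrightarrow\quad
\frac{\partial^2 \hat\phi}{\partial y_i\,\partial y_j}(y,\theta)
  = Q_{ki}\,
    \frac{\partial^2 \hat\phi}{\partial \bar y_k\,\partial \bar y_l}(Qy,\theta)\,
    Q_{lj}.
% Evaluating at (y,\theta) = (0,\bar\theta), where Qy = 0:
L_{ij} = Q_{ki}\, L_{kl}\, Q_{lj}
\quad\Longleftrightarrow\quad
L = Q^t L Q .
% Since Q Q^t = 1, this is equivalent to L Q = Q L, so that
L y = \lambda y
\quad\Longrightarrow\quad
L (Q y) = Q (L y) = \lambda\, (Q y).
```

In words: $L$ commutes with every $Q$ of the group, so $Qy$ is an eigenvector for the same eigenvalue whenever $y$ is.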
This can be interpreted by saying that invariance forces certain eigenvalues of $L$ to be equal. We restrict the attention to the case, called generic by Ericksen [5] (see also [17]), in which the only conditions to be imposed on derivatives of the free energy density $\hat\phi$ are those guaranteeing that $(0, \bar\theta)$ is a stable equilibrium at which bifurcation occurs, and those forced by invariance; for instance, (40). In particular, the only eigenvalues of $L$ that are equal are the ones that are forced to be so by invariance; or, the eigenspaces of $L$ are irreducible invariant (i.i.) subspaces of $\mathbb R^{3(n+1)}$ under the action $y \mapsto Qy$ of the group $\mathcal G$, and exactly one of them is the kernel of $L$. Then, the condition that a stable phase exists, say, for $\theta > \bar\theta$ forces all the other eigenvalues to be strictly positive. We call reduced the action of $\mathcal G$ on each i.i. subspace, and also call reduced the group representing such action on that subspace. If one chooses the basis above aligned with a choice of i.i. subspaces, then each matrix $Q$ is a block matrix, each orthogonal block corresponding to an i.i. subspace, and representing an element of the reduced group on that subspace. The fact that the action of $\mathcal G$ does not mix the first 6 and the last $3(n-1)$ coordinates (see (35)) implies that the set of i.i. subspaces of $\mathbb R^{3(n+1)}$ necessarily contains those of either one of the forms
\[ \mathcal V_1 \times \{0\} \quad \text{or} \quad \{0\} \times \mathcal V_2, \tag{42} \]
where $\mathcal V_1$ ($\mathcal V_2$) is an i.i. subspace of $\mathbb R^6$ (of $\mathbb R^{3(n-1)}$) and $0 \in \mathbb R^{3(n-1)}$ ($0 \in \mathbb R^6$). Case (42)$_1$ corresponds to configurational transitions, in which the motif follows the deformation of the skeleton, at least in the beginning. Case (42)$_2$ describes structural transitions, which are driven by the deformation of the motif, followed by a suitable consequent deformation of the skeleton. We then follow a classical procedure: we determine the i.i. subspaces of $\mathbb R^6$ (of $\mathbb R^{3(n-1)}$) for case (42)$_1$ ((42)$_2$), and then consider the corresponding reduced problem; a description of these can be found in [8, 5, 19, 17]. Other possibilities for the i.i. subspaces will be analyzed elsewhere.

4. The Case of β-quartz

At low pressures quartz exhibits two stable phases, called “low” (or trigonal, or α-) quartz and “high” (or hexagonal, or β-) quartz; at room pressure, these phases are observed below and above about 574 °C, respectively.
We follow [12] (and [17]) by assuming that in any configuration of the SiO$_2$ structure the positions of the Si atoms be compatible with the definition of a 3-lattice, and neglect the oxygens; thus we describe the crystalline structure of both quartz phases by a monatomic 3-lattice, whose points are the positions of the Si atoms in the SiO$_2$ lattice. In the literature the α–β transition is attributed to a suitable deformation of the tetrahedra having the center at a Si atom, and the four nearest O atoms as vertices. Here this deformation is only described by the displacement of the Si atoms. In both α- and β-quartz the skeletal lattice type is hexagonal. A common choice of lattice vectors is the following:
\[ e_1 = (a, 0, 0), \qquad e_2 = \left( -\frac{a}{2}, \frac{\sqrt{3}\,a}{2}, 0 \right), \qquad e_3 = (0, 0, c), \tag{43} \]
in an orthonormal basis $(i, j, k)$. The rotational subgroup of the corresponding hexagonal holohedry is
\[ H_k = \left\{ 1,\ R_i^{\pi},\ R_j^{\pi},\ R_k^{\pi},\ R_k^{\pi/3},\ R_k^{2\pi/3},\ R_k^{4\pi/3},\ R_k^{5\pi/3},\ R^{\pi}_{i \pm \sqrt{3} j},\ R^{\pi}_{\sqrt{3} i \pm j} \right\}, \tag{44} \]
where $R_v^{\omega}$ denotes the rotation by the angle $\omega$ about the direction of the vector $v$. In the crystallographic literature the plane of $e_1$ and $e_2$ is called the basal plane, and the direction of $e_3$ (and of $k$) is called the (hexagonal) optic axis. One of the two possible (enantiomorphic) 3-lattice structures of β-quartz at the transition temperature $\theta_0$ has descriptors $\varepsilon^0_\sigma$, $\sigma = 1, \dots, 5$, where the lattice vectors $e^0_a$ are given by (43) for suitable choices $a_0$, $c_0$ of $a$ and $c$, and the shifts are
\[ p^0_1 = \tfrac{1}{2} e^0_1 + \tfrac{2}{3} e^0_3, \qquad p^0_2 = \tfrac{1}{2} e^0_2 + \tfrac{1}{3} e^0_3. \tag{45} \]
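As a numerical sanity check of (43) and (44), the sketch below verifies that $e_1$ and $e_2$ form an angle of 120° and that the rotation $R_k^{\pi/3}$ maps each lattice vector to an integer combination of lattice vectors, as a symmetry of the skeleton must. It is plain Python; the lattice parameters `a = 1.0`, `c = 1.1` and all function names are illustrative assumptions, not values or notation from the text.

```python
import math

def lattice_vectors(a, c):
    """Hexagonal lattice vectors (43) in the orthonormal basis (i, j, k)."""
    e1 = (a, 0.0, 0.0)
    e2 = (-a / 2.0, math.sqrt(3.0) * a / 2.0, 0.0)
    e3 = (0.0, 0.0, c)
    return e1, e2, e3

def rotate_about_k(omega, v):
    """Rotate the vector v by the angle omega about the k axis."""
    co, si = math.cos(omega), math.sin(omega)
    return (co * v[0] - si * v[1], si * v[0] + co * v[1], v[2])

def combine(coeffs, basis):
    """Linear combination sum_a coeffs[a] * basis[a]."""
    return tuple(sum(m * e[n] for m, e in zip(coeffs, basis)) for n in range(3))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def close(u, v, tol=1e-12):
    return all(abs(x - y) <= tol for x, y in zip(u, v))

# Illustrative lattice parameters (assumed values, not from the text).
basis = lattice_vectors(1.0, 1.1)
e1, e2, e3 = basis

# Angle between e1 and e2, in degrees.
cos_angle = dot(e1, e2) / math.sqrt(dot(e1, e1) * dot(e2, e2))
angle_deg = math.degrees(math.acos(cos_angle))

# Images under R_k^{pi/3}: e1 -> e1 + e2, e2 -> -e1, e3 -> e3.
images_ok = (
    close(rotate_about_k(math.pi / 3, e1), combine((1, 1, 0), basis))
    and close(rotate_about_k(math.pi / 3, e2), combine((-1, 0, 0), basis))
    and close(rotate_about_k(math.pi / 3, e3), combine((0, 0, 1), basis))
)
print(round(angle_deg, 6), images_ok)
```

The integer coefficients found here are what one would expect to collect into the lattice-group matrix associated with $R_k^{\pi/3}$.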
Figure 4 shows the projection of the 3-lattice $M(\varepsilon^0_\sigma)$ onto the basal plane of $e^0_1$ and $e^0_2$ orthogonal to the optic axis $e^0_3$. James [12] (see also [17]) constructs the above 3-lattice structure of β-quartz as follows: consider a regular hexagonal planar honeycomb of edge $d_0 = (\sqrt{3}/3)\,a_0$, one cell of which is drawn by a plain line in the lower right part of Figure 4. For each one of its vertices consider a right-handed (with respect to $k$) circular helix of radius $d_0/2$ and pitch $c_0$, whose axis goes through that vertex and is orthogonal to the plane of the honeycomb. It is possible to arrange the right-handed helices in such a way that each one meets the neighboring three in equally spaced points along the helix itself, each point having the same projection onto the plane of the honeycomb as the third point following it. Each intersection point of the helices is the site of a Si atom. The circles in Figure 4 are the projections of the helices in the plane of the honeycomb, which is the basal plane generated by $e^0_1$, $e^0_2$. The crystal class $P(\varepsilon^0_\sigma) < SO(3)$ of this 3-lattice is the rotational subgroup $H_k$ (see (44)) of the hexagonal holohedry $P(e^0_a)$, and is generated, for instance, by $R_k^{\pi/3}$ and $R_i^{\pi}$. This class is called hexagonal trapezohedral, is denoted by 6 2 2 in [11],
Figure 4. Projection onto the basal plane of the Si atoms in right-handed β-quartz, and of the descriptors $\varepsilon^0_\sigma = (e^0_a, p^0_1, p^0_2)$ for the 3-lattice given by (43) and (45) with λ > 0.
and is the actual crystal class of β-quartz, so that the monatomic 3-lattice $M(\varepsilon^0_\sigma)$ gives a good approximation of the actual structure of this quartz phase.

Case (42)$_1$

We follow [5], in the notation of [17] to which we refer for more details.

(1) The set $C(e^0_a)$ of symmetric tensors left fixed by the transformation $E \mapsto Q^t E Q$ of Sym forms the 2-dimensional subspace of Sym represented in the basis $(i, j, k)$ by matrices of the form
\[ \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \beta \end{pmatrix}; \tag{46} \]
a basis for $C(e^0_a)$ is given, for instance, by the orthonormal vectors
\[ V_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad V_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{47} \]
The i.i. subspaces contained in $C(e^0_a)$ are those and only those that are 1-dimensional. In each one of them the reduced action is trivial, and the bifurcation point is a turning (or limit) point, with change of stability but not of symmetry.
(2) In the orthogonal complement $C(e^0_a)^{\perp}$ of $C(e^0_a)$ there are two mutually orthogonal 2-dimensional i.i. subspaces $\mathcal V_1$, $\mathcal V_2$, generated by $V_3$, $V_4$ and $V_5$, $V_6$, respectively, with
\[ V_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad V_4 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{48} \]
\[ V_5 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad V_6 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}. \tag{49} \]

(3) The reduced group $P_1$ on $\mathcal V_1$ is the symmetry group of an equilateral triangle in $\mathbb R^2$ with center at the origin and a vertex on the second coordinate axis; it has order six and is generated by
\[ f := \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \approx R_i^{\pi} \quad \text{and} \quad r\!\left( \frac{2\pi}{3} \right) \approx R_k^{\pi/3}, \qquad r(\omega) := \begin{pmatrix} \cos\omega & -\sin\omega \\ \sin\omega & \cos\omega \end{pmatrix}. \tag{50} \]
Thus the action of $H_k$ on the typical element of $\mathcal V_1$ produces six monoclinic variants, the symmetry axis being $k$. In addition, $\mathcal V_1$ contains three 1-dimensional base-centered orthorhombic subspaces, whose crystal class is rhombic disphenoidal (2 2 2 in [11]). The reduced bifurcation diagram consists of three transcritical bifurcating curves which belong to the aforementioned subspaces and are all unstable. This is detailed in [17], where it is also shown how these curves can be restabilized, thus producing a subcritical bifurcation.

(4) The reduced group $P_2$ on $\mathcal V_2$ is the group of symmetries of a regular hexagon in $\mathbb R^2$ with center at the origin and a vertex on the first axis, has order twelve and is generated by
\[ f \approx R_i^{\pi} \quad \text{and} \quad r\!\left( \frac{\pi}{3} \right) \approx R_k^{5\pi/3}. \tag{51} \]
Thus the action of the hexagonal holohedry on the typical element of $\mathcal V_2$ produces twelve triclinic variants. In $\mathcal V_2$ there are two triples of symmetry-related 1-dimensional subspaces made of centered monoclinics, whose crystal class is monoclinic sphenoidal (2 in [11]). The bifurcating curves consist of two triples of pitchforks, each one in one of these triples of monoclinic subspaces.
For a triple of pitchforks to be stable it is necessary that it be supercritical: it must exist for $\theta < \theta_0$ under the assumed stability of the high-symmetry phase for $\theta > \theta_0$; then necessarily also the other triple is supercritical, and exactly one of them is stable. Here and below ≈ means “represented by” or “representing”.
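The orders of the two reduced groups in items (3) and (4) can be checked by generating each group from its two generators, the reflection $f$ and the rotation $r(\omega)$ of (50) and (51). The sketch below is plain Python; the closure routine and all names are illustrative choices, and matrix entries are rounded only to compare floating-point products.

```python
import math

def rot(omega):
    """Plane rotation r(omega) as in (50)."""
    c, s = math.cos(omega), math.sin(omega)
    return ((c, -s), (s, c))

def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def key(A, digits=9):
    """Rounded copy of A, used to identify equal matrices up to round-off."""
    return tuple(tuple(round(x, digits) + 0.0 for x in row) for row in A)

def generate_group(generators):
    """Close the set {identity} under right multiplication by the generators."""
    identity = ((1.0, 0.0), (0.0, 1.0))
    elements = {key(identity): identity}
    frontier = [identity]
    while frontier:
        A = frontier.pop()
        for G in generators:
            B = matmul(A, G)
            if key(B) not in elements:
                elements[key(B)] = B
                frontier.append(B)
    return list(elements.values())

F = ((-1.0, 0.0), (0.0, 1.0))                              # reflection f of (50)
triangle_group = generate_group([F, rot(2 * math.pi / 3)])  # item (3)
hexagon_group = generate_group([F, rot(math.pi / 3)])       # item (4)
print(len(triangle_group), len(hexagon_group))  # prints "6 12"
```

For a finite group, closing under products of the generators alone already yields inverses, so no explicit inversion is needed.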
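Case (42)$_2$

We denote by $\alpha_k^{\pi/3}$ the submatrix $\alpha$ of the matrix $\mu_k^{\pi/3} \in \Lambda^+(\varepsilon^0_\sigma)$ that corresponds to the rotation $R_k^{\pi/3}$ according to (13), etc., $\alpha_1$ denoting the 2 by 2 identity. Using the expressions of the generators $\mu_k^{\pi/3}$ and $\mu_i^{\pi}$ reported in [17], we have
\[ \alpha_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \alpha_k^{\pi}, \qquad \alpha_k^{\pi/3} = \begin{pmatrix} -1 & -1 \\ 1 & 0 \end{pmatrix} = \alpha_k^{4\pi/3}, \tag{52} \]
\[ \alpha_k^{2\pi/3} = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix} = \alpha_k^{5\pi/3}, \qquad \alpha_i^{\pi} = \begin{pmatrix} -1 & -1 \\ 0 & 1 \end{pmatrix} = \alpha_j^{\pi}, \tag{53} \]
\[ \alpha^{\pi}_{i+\sqrt{3}j} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \alpha^{\pi}_{\sqrt{3}i-j}, \qquad \alpha^{\pi}_{i-\sqrt{3}j} = \begin{pmatrix} 1 & 0 \\ -1 & -1 \end{pmatrix} = \alpha^{\pi}_{\sqrt{3}i+j}. \tag{54} \]
Therefore, by (31),
\[ g = \begin{pmatrix} \tfrac{4}{3} & \tfrac{2}{3} \\ \tfrac{2}{3} & \tfrac{4}{3} \end{pmatrix} \quad \text{and} \quad W = \frac{1}{3\sqrt{2}} \begin{pmatrix} 3+\sqrt{3} & 3-\sqrt{3} \\ 3-\sqrt{3} & 3+\sqrt{3} \end{pmatrix}, \tag{55} \]
from which, by (33) and in obvious notation,
\[ \beta_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \beta_k^{\pi}, \qquad \beta_k^{\pi/3} = -\frac{1}{2} \begin{pmatrix} 1 & \sqrt{3} \\ -\sqrt{3} & 1 \end{pmatrix} = \beta_k^{4\pi/3}, \tag{56} \]
\[ \beta_k^{2\pi/3} = -\frac{1}{2} \begin{pmatrix} 1 & -\sqrt{3} \\ \sqrt{3} & 1 \end{pmatrix} = \beta_k^{5\pi/3}, \qquad \beta_i^{\pi} = -\frac{1}{2} \begin{pmatrix} \sqrt{3} & 1 \\ 1 & -\sqrt{3} \end{pmatrix} = \beta_j^{\pi}, \tag{57} \]
\[ \beta^{\pi}_{i+\sqrt{3}j} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \beta^{\pi}_{\sqrt{3}i-j}, \qquad \beta^{\pi}_{i-\sqrt{3}j} = -\frac{1}{2} \begin{pmatrix} -\sqrt{3} & 1 \\ 1 & \sqrt{3} \end{pmatrix} = \beta^{\pi}_{\sqrt{3}i+j}. \tag{58} \]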
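The relations among $g$, $W$, the integral matrices $\alpha$ and the orthogonal matrices $\beta = W \alpha W^{-1}$ of (33) can be checked numerically. The sketch below is plain Python with hand-written 2 by 2 helpers; the dictionary keys and helper names are illustrative. It confirms that $W^2 = g$, that every $\alpha$ preserves $g$ as in (32), that every conjugate $\beta$ is orthogonal, and that the conjugate of $\alpha_k^{\pi/3}$ is the rotation by $2\pi/3$.

```python
import math

S3 = math.sqrt(3.0)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
g = [[4 / 3, 2 / 3], [2 / 3, 4 / 3]]
s = 1.0 / (3.0 * math.sqrt(2.0))
W = [[s * (3 + S3), s * (3 - S3)], [s * (3 - S3), s * (3 + S3)]]

# The six distinct integral matrices alpha of the Case (42)_2 list.
alphas = {
    "1": [[1, 0], [0, 1]],
    "R_k^{pi/3}": [[-1, -1], [1, 0]],
    "R_k^{2pi/3}": [[0, 1], [-1, -1]],
    "R_i^{pi}": [[-1, -1], [0, 1]],
    "R_{i+sqrt3 j}^{pi}": [[0, 1], [1, 0]],
    "R_{i-sqrt3 j}^{pi}": [[1, 0], [-1, -1]],
}

# Conjugation (33): beta = W alpha W^{-1}.
betas = {name: matmul(matmul(W, a), inverse(W)) for name, a in alphas.items()}

w_is_root_of_g = close(matmul(W, W), g)
alphas_preserve_g = all(
    close(matmul(transpose(a), matmul(g, a)), g) for a in alphas.values()
)
betas_orthogonal = all(
    close(matmul(transpose(b), b), IDENTITY) for b in betas.values()
)
beta_k_is_rotation = close(
    betas["R_k^{pi/3}"], [[-0.5, -S3 / 2], [S3 / 2, -0.5]]
)
print(w_is_root_of_g, alphas_preserve_g, betas_orthogonal, beta_k_is_rotation)
```

Orthogonality of the $\beta$'s is exactly what $W^2 = g$ and the invariance of $g$ guarantee, since $\beta^t \beta = W^{-1} \alpha^t g\, \alpha W^{-1}$.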
Consider now the action induced by (37)$_2$ on the 6-dimensional space of shift components, with typical element $(a_1, a_2, a_3, b_1, b_2, b_3)$ representing the pair $(\varpi_1, \varpi_2)$. One can check that this action transforms into themselves the subspaces $\mathcal W_1$ and $\mathcal W$ of the form $(0, 0, a_3, 0, 0, b_3)$ and $(a_1, a_2, 0, b_1, b_2, 0)$, respectively. The 2-dimensional subspace $\mathcal W_1$, described by the pair $(a_3, b_3)$, is irreducible invariant, and consists of monoclinic 3-lattices, with axis $k$. The reduced group on $\mathcal W_1$ has order 6, is generated by the matrices
\[ \frac{1}{2} \begin{pmatrix} \sqrt{3} & 1 \\ 1 & -\sqrt{3} \end{pmatrix} \approx R_i^{\pi} \approx R_j^{\pi} \quad \text{and} \quad r\!\left( \frac{4\pi}{3} \right) \approx R_k^{\pi/3}, \tag{59} \]
and is the symmetry group of an equilateral triangle in $\mathbb R^2$ centered at the origin and having a vertex on the bisectrix of the second and fourth quadrant. Therefore, to within a rotation by π/4 of the coordinates, this reduced problem is the same as the one in item (3) of Case (42)$_1$. As there, the bifurcation diagram consists of three unstable transcritical bifurcating curves of orthorhombic 2 2 2 symmetry. For instance, the orthorhombic axes (besides $k$) are $i$ and $j$ for the choice of shifts
\[ \pi_1 = 2\lambda k = 2\pi_2, \qquad \lambda \in \mathbb R. \tag{60} \]
This can be obtained from the corresponding 1-dimensional subspace of $\mathcal W_1$, which has the form
\[ (a_3, b_3) = \gamma\,(2+\sqrt{3},\ 1), \qquad \gamma \in \mathbb R, \tag{61} \]
by (34) and (55), or directly from the condition (see (28))
\[ R_i^{\pi} \pi_r = (\alpha_i^{\pi})^s_r\, \pi_s; \tag{62} \]
or, equivalently, from its analogue for $R_j^{\pi}$. The subspace $\mathcal W$ decomposes into the orthogonal sum of three i.i. subspaces: $\mathcal W_2$, $\mathcal W_3$, $\mathcal W_4$, the first two of dimension 1, the third of dimension 2. They are respectively generated by
\[ w_2 = (-1,\ 2+\sqrt{3},\ 0,\ 2+\sqrt{3},\ 1,\ 0), \tag{63} \]
\[ w_3 = (2+\sqrt{3},\ 1,\ 0,\ 1,\ -2-\sqrt{3},\ 0), \tag{64} \]
and
\[ w_4 = (1, 0, 0, 0, 1, 0), \qquad w_4' = (0, 1, 0, -1, 0, 0). \tag{65} \]
The reduced group on $\mathcal W_2$ [$\mathcal W_3$] is $\{1, -1\}$; for instance,
\[ 1 \approx R_i^{\pi} \quad [\,1 \approx R_j^{\pi}\,] \quad \text{and} \quad -1 \approx R_k^{\pi/3} \approx R_k^{\pi}. \tag{66} \]
Therefore, as is known, the bifurcation diagram is the standard pitchfork. A fourth-order polynomial energy is sufficient to capture the qualitative features of a (supercritical) second-order bifurcation, while a subcritical first-order one, as in the case of quartz, requires a sixth-order polynomial (see, for instance, [5, 6] or [17]). In $\mathcal W_2$ [in $\mathcal W_3$] the crystal class of the bifurcating multilattices is trigonal trapezohedral (3 2 in [11]), with $k$ as 3-fold axis; the additional generator of the point group is $R_i^{\pi}$ [is $R_j^{\pi}$]. The reduced group on $\mathcal W_4$ has order 12, is generated by the matrices
\[ \frac{1}{2} \begin{pmatrix} \sqrt{3} & 1 \\ 1 & -\sqrt{3} \end{pmatrix} \approx R_i^{\pi} \quad \text{and} \quad r\!\left( \frac{\pi}{3} \right) \approx R_k^{\pi/3}, \tag{67} \]
and is the symmetry group of a regular hexagon in $\mathbb R^2$ which is centered at the origin and has a vertex on the bisectrix of the second and fourth quadrant (compare with (59)). Therefore, to within a rotation by π/4 of the coordinates, this reduced
Figure 5. Projection as in Figure 4 for the right-handed α-quartz structure, and of the descriptors $\varepsilon^+_\sigma = (e_a, p^+_1, p^+_2)$ for the 3-lattice given by (43) and (70) with $U = 1$ and λ > 0.
problem is the same as the one in item (4) of Case (42)$_1$ and has the same qualitative bifurcation diagram. All the bifurcating branches have monoclinic 2 symmetry. We now analyze in detail the two trigonal trapezohedral subspaces $\mathcal W_2$ and $\mathcal W_3$. Using (34) and (55), we see that the reference shift increments corresponding to $\mathcal W_2$ are, in terms of real parameters $\bar\lambda$, $\lambda$,
\[ \pi_1 = \bar\lambda \left( 0,\ \frac{2(3+\sqrt{3})}{3},\ 0 \right) = 2\,(1+\sqrt{3})\, \frac{\lambda\,(e^0_1 + 2 e^0_2)}{3}, \tag{68} \]
\[ \pi_2 = \bar\lambda \left( 1+\sqrt{3},\ \frac{3+\sqrt{3}}{3},\ 0 \right) = 2\,(1+\sqrt{3})\, \frac{\lambda\,(2 e^0_1 + e^0_2)}{3}. \tag{69} \]
Equivalently, by (25), denoting by $p^+_r$ the present shifts, with $p_r = U p^0_r$, $e_a = U e^0_a$, and $\lambda$ a real parameter, we have
\[ p^+_1 = p_1 + \lambda\,(e_1 + 2 e_2), \qquad p^+_2 = p_2 + \lambda\,(2 e_1 + e_2). \tag{70} \]
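Two computations behind (63)–(69) are easy to reproduce numerically: the mutual orthogonality of the generators $w_2$, $w_3$, $w_4$ and the second generator of the 2-dimensional subspace, and the fact that applying $W$ to the pair of shift components in $w_2$ gives increments parallel to $e^0_1 + 2e^0_2$ and $2e^0_1 + e^0_2$ with one common factor. The sketch below is plain Python; it assumes $a_0 = 1$ and reads (34) as $\pi_r = W^s_r \varpi_s$, which is unambiguous here because $W$ is symmetric. The names `w4a`, `w4b` and `to_pi` are hypothetical.

```python
import math

S3 = math.sqrt(3.0)
s = 1.0 / (3.0 * math.sqrt(2.0))
W = [[s * (3 + S3), s * (3 - S3)], [s * (3 - S3), s * (3 + S3)]]

# Generators (63)-(65); w4a and w4b span the 2-dimensional subspace.
w2 = (-1.0, 2 + S3, 0.0, 2 + S3, 1.0, 0.0)
w3 = (2 + S3, 1.0, 0.0, 1.0, -2 - S3, 0.0)
w4a = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0)
w4b = (0.0, 1.0, 0.0, -1.0, 0.0, 0.0)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def to_pi(w):
    """Map the pair (varpi_1, varpi_2) stored in w to (pi_1, pi_2) via W."""
    v1, v2 = w[:3], w[3:]
    pi1 = tuple(W[0][0] * x + W[0][1] * y for x, y in zip(v1, v2))
    pi2 = tuple(W[1][0] * x + W[1][1] * y for x, y in zip(v1, v2))
    return pi1, pi2

generators = [w2, w3, w4a, w4b]
mutually_orthogonal = all(
    abs(dot(generators[m], generators[n])) < 1e-12
    for m in range(4) for n in range(m + 1, 4)
)

# Reference lattice vectors (43) with a0 = 1 (assumed value).
e1 = (1.0, 0.0, 0.0)
e2 = (-0.5, S3 / 2, 0.0)
target1 = tuple(x + 2 * y for x, y in zip(e1, e2))   # e1 + 2 e2
target2 = tuple(2 * x + y for x, y in zip(e1, e2))   # 2 e1 + e2

pi1, pi2 = to_pi(w2)
scale = pi1[1] / target1[1]   # common factor, read off the j component
parallel = all(
    abs(p - scale * t) < 1e-12
    for p, t in zip(pi1 + pi2, target1 + target2)
)
print(mutually_orthogonal, parallel)
```

The common factor `scale` only fixes the normalization of the parameter; the directions are what identify the subspace with the increments in (70).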
These shifts represent deformed β-quartz for $\lambda = 0$, while for $\lambda \neq 0$ they give the $M(\varepsilon^+_\sigma)$ 3-lattice model for trigonal trapezohedral α-quartz proposed in [12], and used also in [6, 17]. The projection of this 3-lattice onto the (basal) plane of $e_1$ and $e_2$ is sketched in Figure 5 for λ > 0. Based on [12], we recall how the α-quartz structure can be obtained by deforming the helices described above for β-quartz. If the reference β-quartz helices, of radius $d_0/2$, maintain their axes while being radially stretched, so that their radius becomes larger than $d_0/2$, then any point of initial intersection of two neighboring
helices splits into two alternative intersections. For the actual α-quartz structure, which is uniformly stretched by some $U \in \mathrm{Sym}^{>}$ whose representative matrix in the basis $(i, j, k)$ has the form (46), the projections of these intersections onto the basal plane are shown in the lower right part of Figure 5. To maintain the threefold rotational symmetry about the normal to the honeycomb, the Si atoms must still be evenly spaced on the helices on such intersections, each atom being on the same vertical (with respect to the basal plane) as the third atom following it on the helix. There are exactly two ways in which this can happen: on an arbitrarily selected helix Si atoms are placed on either the first or the second of its possible intersections with one of the neighboring helices; either choice forces in a unique way the atoms on the other neighboring helices to be all placed on the second or on the first of their intersections, etc., up to completion of the whole structure. Figure 5 shows one of the configurations obtained in this way, namely $M(\varepsilon^+_\sigma)$ with shifts given by (70) for λ > 0. For fixed λ, the other possibility is given by the Dauphiné twin $M(\varepsilon^-_\sigma)$, where the $\varepsilon^-_\sigma = (e_a, p^-_1, p^-_2)$ have the same lattice vectors $e_a$ as $\varepsilon^+_\sigma$, and shifts
\[ p^-_1 = p_1 - \lambda\,(e_1 + 2 e_2), \qquad p^-_2 = p_2 - \lambda\,(2 e_1 + e_2), \tag{71} \]
with the same λ as in (70). These two configurations correspond to symmetry-related points on the bifurcated branches of the pitchfork in the $\mathcal W_2$ subspace mentioned above. We refer the reader to [17] for more details on Dauphiné twins, and only recall that the twin multilattice $M(\varepsilon^-_\sigma)$ can be obtained from $M(\varepsilon^+_\sigma)$ by means of the rotation $R_k^{\pi}$, of order 2 (see also (66)). Using (34) and (55), we observe that the reference shift increments corresponding to elements of the subspace $\mathcal W_3$ are, in terms of real parameters $\bar\mu$, $\mu$,
\[ \pi_1 = 4\bar\mu\,(3+\sqrt{3},\ 0,\ 0) = 4\mu\,(3+\sqrt{3})\, e^0_1, \tag{72} \]
\[ \pi_2 = 4\bar\mu\,(3+\sqrt{3}) \left( \frac{1}{2},\ -\frac{\sqrt{3}}{2},\ 0 \right) = -4\mu\,(3+\sqrt{3})\, e^0_2. \tag{73} \]
Equivalently, by (25), denoting by $\hat p_r$ the corresponding present shifts, we have
\[ \hat p_1 = p_1 + \mu e_1, \qquad \hat p_2 = p_2 - \mu e_2, \tag{74} \]
with µ a real parameter. We have deformed β-quartz for $\mu = 0$, while symmetry-related points on the bifurcated branches of the pitchfork in this subspace correspond to opposite values of $\mu \neq 0$ in the shifts given by (74). The related multilattices, say $M(\hat\varepsilon^{+}_\sigma)$ and $M(\hat\varepsilon^{-}_\sigma)$, are another example of shuffle twins, very similar to the Dauphiné twins described above. In particular, the twin multilattice $M(\hat\varepsilon^{-}_\sigma)$ can be obtained from $M(\hat\varepsilon^{+}_\sigma)$ by means of the same rotation $R_k^{\pi}$, of order 2, which relates the Dauphiné twins (see (66)). Also in this case one can describe the low-symmetry phase and the twins in terms of deformation of the reference β-quartz helices. Now the radius of those
Figure 6. Projection as in Figure 5 for the right-handed quartz structure with shifts given by (74) for U = 1 and µ > 0.
helices is shrunk, becoming less than $d_0/2$, and hence neighboring helices do not intersect anymore. Figure 6 shows one of the possible arrangements of the actual helices; looking at the hexagon drawn in the lower right corner, we see that the other possibility, which gives the twinned configuration, is obtained by exchanging the occupied and the nonoccupied helices in that hexagon, and consequently in the whole structure. In his paper [6], among other things, Ericksen tackles the problem of finding an alternative to the so-called incommensurate phase, which is introduced in the physical literature to explain certain features of the α–β transition in quartz. He looks for configurations described by a 3-lattice with hexagonal skeleton and shifts such that the symmetry of the structure consists in the threefold axis $k$ alone. The reference shift increments must have the form
\[ \pi_1 = (\lambda + \mu)\, e^0_1 + 2\lambda\, e^0_2, \qquad \pi_2 = 2\lambda\, e^0_1 + (\lambda - \mu)\, e^0_2, \tag{75} \]
for λ and µ varying arbitrarily over the reals; this corresponds in fact to the 2-dimensional orthogonal sum $\mathcal W_2 \oplus \mathcal W_3$ of the above i.i. subspaces $\mathcal W_2$ and $\mathcal W_3$. In that 2-dimensional space Ericksen, generalizing an example in [19], introduces a sextic polynomial (reduced) potential in the variables λ, µ, with the coefficients of $\lambda^2$ and $\mu^2$ depending affinely on environmental pressure $p$ and temperature θ, and all the other coefficients constant. The related bifurcation analysis shows the existence of four phases, labelled I ($\lambda = 0 = \mu$, β-phase), II ($\lambda \neq 0 = \mu$, α-phase, our subspace $\mathcal W_2$), III ($\lambda = 0 \neq \mu$, our subspace $\mathcal W_3$), and IV ($\lambda \neq 0 \neq \mu$, the complement of the previous phases in $\mathcal W_2 \oplus \mathcal W_3$). According to Ericksen, phases II and III and their twins are rather similar, so that one could be mistaken for the
other. For instance, the twins can be described for both phases in terms of the same twinning operation $R_k^{\pi}$. Ericksen uses the sextic potential to describe the transition from β- to α-quartz, at constant pressure when temperature is lowered, by means of a second-order transition from phase I to phase III, followed by a first-order transition between phases III and II. Notice that a direct transition from I to IV corresponds to bifurcating into the orthogonal sum of $\mathcal W_2$ and $\mathcal W_3$ in an (initial) direction contained in neither $\mathcal W_2$ nor $\mathcal W_3$, which requires the vanishing of both the eigenvalues of $L$ corresponding to those eigenspaces. This is not forced by symmetry, and hence cannot generically occur with one control parameter, as the kernel of $L$ is a reducible invariant subspace of $\mathbb R^6$. We refer the reader to [6] for details, in particular for a comparison with certain formulae of [19] involving bifurcation into a reducible invariant subspace, of interest in his bifurcation analysis, and for comments on the incommensurate phase.
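The four phases can be read off the parametrization (75): setting µ = 0 leaves increments proportional to those of (68)–(69), and setting λ = 0 leaves increments along $e^0_1$ and $-e^0_2$ as in (72)–(73). The following sketch encodes that bookkeeping; the function names and the tolerance are illustrative assumptions, and the labels follow the listing I to IV above.

```python
def shift_increment_coefficients(lam, mu):
    """Coefficients of (pi_1, pi_2) on (e_1^0, e_2^0) according to (75)."""
    return ((lam + mu, 2 * lam), (2 * lam, lam - mu))

def classify_phase(lam, mu, tol=1e-12):
    """Phase label for the pair (lambda, mu) in the sum of the two subspaces."""
    lam_zero = abs(lam) <= tol
    mu_zero = abs(mu) <= tol
    if lam_zero and mu_zero:
        return "I"    # beta-phase, reference configuration
    if mu_zero:
        return "II"   # alpha-phase: only the lambda-direction is active
    if lam_zero:
        return "III"  # only the mu-direction is active
    return "IV"       # generic point, threefold axis k alone

for lam, mu in [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, -0.2)]:
    print(classify_phase(lam, mu), shift_increment_coefficients(lam, mu))
```

Changing the sign of the nonzero parameter within phase II or III gives the corresponding twin, consistent with (70)–(71) and with the opposite values of µ in (74).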
5. A Personal Tribute to Clifford Truesdell

The fact that this paper appears in a volume dedicated to the memory of Clifford Truesdell gives me the opportunity to acknowledge, in addition to my debt of gratitude and admiration toward him, some of his merits that are both little known and very unusual. Among his various qualities, Truesdell was very interested in many research fields different from those in which he himself or his associates were active. Of relevance to this tribute is his interest in the logic of modality, that is, of possibility and hence of necessity, especially in the version introduced by Bressan [1] under the name of physical possibility to write rigorous foundations of classical particle mechanics based on ideas of Mach and Painlevé [13]. This version of possibility has some intuitive features by means of which it is regarded as a primitive concept; it makes the axiomatization of physics rigorous and conforming to the views of Hamel [9, 10]; and it endows the axiomatization with a threefold interdisciplinary character: mathematical physics, logic, and philosophy of science. Papers having this interdisciplinary character are exceptionally rare and can be easily misunderstood. As far as I am concerned, Truesdell showed his interest in the logic of modalities by accepting as appendix G6 of his well-known book [21] my paper [14], in which I explicitly refer to the theory of modal logic presented in progressively refined versions in [1–3]. It is true that I use these versions from an intuitive point of view, but the aforementioned version of physical possibility is a key ingredient in definitions and axioms. And, indeed, in my university curriculum I never had a formal training in mathematical logic, nor attended lectures on technical parts of it.
Moreover, at about the same time, Truesdell used very strong arguments in some polemic considerations on the axiomatization of mechanics ([20], Part V, Philosophy, essay 39: Suppesian stews (1980/1981)). These arguments show that, in fact, he had understood the main features of [1]. Additional evidence of Truesdell’s appreciation of the logic of modalities is contained in a 1986 letter of his to the Accademia dei Lincei, of which he gave me a confidential copy. There, he strongly supports the research program of extending a theory à la Mach–Painlevé from particle mechanics to the mechanics of continuous media. Details, background and extensive references are given in [16]. Those of us who have worked or are still working to render the approach in [13] rigorous, and to generalize it, are well aware that, even today, that approach is widely ignored by mathematical physicists and researchers in mechanics. The interdisciplinary character of that work sometimes strongly contributes to serious misunderstandings of the related publications, even involving their contents. This confirms that Truesdell’s interest and understanding illustrated above were really very unusual.

Acknowledgements

I want to extend here to Charlotte Truesdell my gratitude to her husband Clifford expressed earlier. This work is part of the research activities of the EU Network “Phase Transitions in Crystalline Solids”, and has been partially supported by the Italian M.U.R.S.T. through the project “Mathematical Models for Materials Science”.

References

1. A. Bressan, Metodo di assiomatizzazione in senso stretto della meccanica classica. Applicazione di esso ad alcuni problemi di assiomatizzazione non ancora completamente risolti. Rend. Sem. Mat. Univ. Padova 32 (1962) 55–212.
2. A. Bressan, A General Interpreted Modal Calculus. Yale Univ. Press, New Haven/London (1972). Foreword by N.D. Belnap, Jr., 327 pages.
3. A. Bressan, (a) On physical possibility and (b) Supplement: A much used notion of physical possibility and Gödel’s undecidability theorem. In: M. Dalla Chiara Scabbia (ed.), Italian Studies in Philosophy of Science. North-Holland, Amsterdam (1981) pp. 197–210 and 211–214.
4. B. Budiansky and L. Truskinovsky, On the mechanics of stress-induced phase transformations in zirconia. J. Mech. Phys. Solids 41 (1993) 1445–1459.
5. J.L. Ericksen, Local bifurcation theory for thermoelastic Bravais lattices. In: J.L. Ericksen, R.D. James, D. Kinderlehrer and M. Luskin (eds), Microstructure and Phase Transition, IMA Volumes in Mathematics and its Applications, Vol. 54. Springer, New York (1993).
6. J.L. Ericksen, On the theory of the α–β phase transition in quartz. J. Elasticity 63 (2001) 61–86.
7. G. Fadda, L. Truskinovsky and G. Zanzotto, Unified Landau description of the tetragonal, orthorhombic, and monoclinic phases of zirconia. Phys. Rev. B 66 (2002) 174107 1–10.
8. M. Golubitsky, D. Schaeffer and I. Stewart, Singularities and Groups in Bifurcation Theory, Vol. II. Applied Mathematical Sciences, Vol. 69. Springer, New York (1988).
9. G. Hamel, Über die Grundlagen der Mechanik. Math. Annalen 66 (1908) 350–397.
10. G. Hamel, Die Axiome der Mechanik, Handbuch der Physik, Vol. 5. Springer, Berlin (1927) pp. 1–42.
11. T. Hahn (ed.), International Tables for X-ray Crystallography, Vol. A. Reidel, Dordrecht/Boston (1996).
12. R.D. James, The stability and metastability of quartz. In: S.S. Antman, J.L. Ericksen, D. Kinderlehrer and I. Müller (eds), Metastability and Incompletely Posed Problems, IMA Volumes in Mathematics and its Applications, Vol. 3. Springer, New York (1987).
13. P. Painlevé, Les Axiomes de la Mécanique. Gauthier-Villars, Paris (1922).
14. M. Pitteri, On the axiomatic foundations of temperature. In: C. Truesdell’s Rational Thermodynamics, 2nd edn. Springer, New York (1984) Appendix G6, pp. 522–544.
15. M. Pitteri, On bifurcations in multilattices. In: R. Monaco, M. Pandolfi and S. Rionero (eds), Proc. of WASCOM 2001, Porto Ercole, Italy. World Scientific, Singapore (2002).
16. M. Pitteri, On certain weak phase transformations in multilattices. TMR Network “Phase Transitions in Crystalline Solids” Preprint Series, No. 100 (2002). Also Rapporto Tecnico DMMMSA No. 88, 2/12/2002. Available at http://www.dmsa.unipd.it/tmr/public_html/PreprintDMMMSA.pdf.
17. M. Pitteri and G. Zanzotto, Continuum Models for Phase Transitions and Twinning in Crystals. CRC/Chapman & Hall, Boca Raton/London (2002).
18. N.K. Simha and L. Truskinovsky, Phase diagram of zirconia in stress space. In: R. Batra and M. Beatty (eds), Contemporary Research in the Mechanics and Mathematics of Materials. CIMNE, Barcelona (1996).
19. P. Tolédano and V. Dmitriev, Reconstructive Phase Transformations: In Crystals and Quasicrystals. World Scientific, Singapore (1996).
20. C.A. Truesdell, An Idiot’s Fugitive Essays on Science: Methods, Criticism, Training, Circumstances. Springer, New York (1984).
21. C.A. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1984).
22. C.A. Truesdell and R.G. Muncaster, Fundamentals of Maxwell’s Kinetic Theory of a Simple Monatomic Gas, Treated as a Branch of Rational Mechanics. Academic Press, New York (1980).
A New Quasilinear Model for Plate Buckling

PAOLO PODIO-GUIDUGLI
Dipartimento di Ingegneria Civile, Università di Roma “Tor Vergata”, Viale Politecnico, 1–I-00133 Roma, Italy. E-mail: [email protected]

Received 5 November 2002; in revised form 19 March 2003

Abstract. A new quasilinear model for plate buckling is presented, which reduces to von Kármán’s semilinear model through an explicit approximation procedure.

Mathematics Subject Classifications (2000): 74G60, 74K20, 74B20.

Key words: plate buckling, von Kármán equations, nonlinear eigenvalue problems.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 673–698. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

Von Kármán’s is a system of two semilinear partial differential equations, each with a biharmonic principal part, which is intended to describe the large deflections of thin elastic plates and, in particular, their buckling under the action at the boundary of either in-plane compressive loads or in-plane inward displacements. The von Kármán buckling equations have been and are popular among nonlinear analysts because they provide a relatively easy and significant example of application of various abstract techniques that have been devised to study nonlinear eigenvalue problems (cf. [1–3]). Von Kármán equations do capture well the target phenomenology. Yet, their standard “derivations” in the engineering literature (including von Kármán’s own [21]) have a host of conceptual defects that have been repeatedly exposed, but never completely cured. To quote from Truesdell’s brilliant criticism in the Epilogue of [19], one may choose to regard that theory “. . . as handed down by some higher power (a Hungarian wizard, say) and study it as a matter of pure analysis” (p. 601). Indeed, such a discrepancy between final product and derivation is anything but a strange case with named equations, since the greats’ intuition often compensates for rigor; but rarely, if ever, to such an extent. This is perhaps the reason why, both in [5] and lately in [6, p. 367], Ciarlet states that “von Kármán equations . . . play an almost mythical role in applied mathematics”. Sound mathematical justifications of the von Kármán theory have been given. Ciarlet [4] has shown that, under appropriate hypotheses on the boundary data and the elastic response, the von Kármán equations arise as the leading terms in a formal asymptotic expansion with respect to the thickness parameter of a quasilinear problem in three-dimensional elasticity. Refinements and complements to Ciarlet’s
P. PODIO-GUIDUGLI
approach are found in [8], [5, 9; 6, Chapter 5]. Moreover, very recently Friesecke et al. [10, 11] have established the Föppl–von Kármán plate theory as a Γ-limit of three-dimensional nonlinear elasticity by scaling the energy as the fifth power of the plate’s thickness. My present derivation begins with a kinematic Ansatz (relations (3.7) and (3.8)) just as von Kármán’s, and has various other similarities with it. In fact, I strive to keep as close to von Kármán’s line of reasoning as possible, and use the toolkit of modern continuum mechanics to rigorously justify every single step. Mine is an exact derivation from three-dimensional elasticity of a two-dimensional, quasilinear system of equations that reduces to von Kármán’s through an explicit approximation procedure. In other words, I show that the classical buckling equations of von Kármán can be given a rational position with respect to a two-dimensional system being an exact consequence, and having the same type of nonlinearity, of the three-dimensional system of elasticity. At variance with methods based on asymptotic expansions or variational convergence, which apparently cannot help extracting a semilinear problem from a quasilinear one, my method preserves the mathematical type of the original three-dimensional problem. Hence, the bifurcation problem I derive is more difficult than von Kármán’s: I propose it for analysis. In short, the contents of this paper are the following. In Section 2, the von Kármán equations are recalled and the associated bifurcation problem is stated, both in the standard formulation and in the formulation introduced in the cited work by Berger. Next, in Section 3, a three-dimensional buckling problem is described, for an internally constrained, homogeneous, three-dimensional plate-like body of appropriate elastic response.
The internal constraint we stipulate to hold is an exact nonlinear version of the linear constraint that allows for the classical plate theory of Kirchhoff–Love: material fibers parallel to the plate’s axis stay straight, do not change their length, and remain orthogonal to fibers orthogonal to the axis itself. We assume that the material comprising the body is a St. Venant–Kirchhoff elastic material being transversely isotropic with respect to the axial direction and capable only of deformations that agree with the constraint. We let the body be weakly clamped along its lateral boundary, in the sense that sliding parallel to the plate’s plane is allowed (Figure 2). Moreover, we let the body be in equilibrium under loads that are everywhere null, except for a uniform in-plane pressure of magnitude λ over all of the lateral boundary. For sufficiently large values of the parameter λ, we expect the body to buckle. The purpose of von Kármán’s theory is to approximately determine the critical values of the buckling parameter, as well as the accompanying buckled
An account of Davet’s refinement of Ciarlet’s work is given in Section XIV.14 of [1].
Not surprisingly, this is precisely the scaling exponent one obtains by thickness integration of the stored energy density (3.27) with the strain field evaluated as in (B.4).
I am indebted to Sergio Conti for kindly bringing this work to my attention on receiving a copy of the version of this manuscript I had just submitted for publication.
A MODEL FOR PLATE BUCKLING
shapes, without solving a full three-dimensional problem of nonlinear elasticity, whose global bifurcation analysis would be very difficult. In Section 4, it is shown that von Kármán’s first equation follows from a compatibility condition, which is an exact two-dimensional consequence of the classical St. Venant–Beltrami compatibility conditions. This compatibility condition ensures, roughly speaking, that a suitably defined plane strain field allows for the construction of a displacement field consistent with the internal constraint in force. In Section 5, it is shown, among other things, that von Kármán’s second equation follows from an equilibrium condition, which is an exact two-dimensional consequence of stress balance, both at interior points of the body and at its upper and lower faces. In both sections, the passage from three to two dimensions is performed by one mathematical tool, thickness integration. Finally, in Section 6, the bifurcation problem with respect to which von Kármán’s is given a position is formulated. The Appendix, in four parts, is meant to save the reader some ink.
2. The von Kármán Equations
For o a fixed point in the three-dimensional Euclidean space E, let {o; c1, c2, c3 ≡ z} be an orthonormal Cartesian frame, and let (x1, x2, x3 ≡ ζ) be the Cartesian coordinates of a point p = x + ζz of E, with xα the coordinates of the point x in the plane ζ = 0; moreover, let P be a simply-connected domain in that plane, with a smooth boundary ∂P, and let n(x) be the outer normal at a point x ∈ ∂P (Figure 1). Given the scalar-valued fields ϕ0 and ϕ1 over ∂P, von Kármán’s bifurcation problem consists in finding a real number λ and a pair (ϕ, w) of fields over P ∪ ∂P such that:
– in P,
$$\Delta\Delta\varphi - \tfrac{1}{2}[w, w] = 0, \tag{2.1}$$
$$\kappa\,\Delta\Delta w - [\varphi, w] = 0; \tag{2.2}$$
– in ∂P,
$$\varphi = \lambda\varphi_0, \qquad \partial_n\varphi = \lambda\varphi_1, \tag{2.3}$$
$$w = 0, \qquad \partial_n w = 0. \tag{2.4}$$
Figure 1.
A study of certain bifurcation problems in nonlinear three-dimensional elasticity is found in a recent paper by Healey and Simpson [14].
Here,
$$[a, b] := a_{,11}\,b_{,22} + a_{,22}\,b_{,11} - 2\,a_{,12}\,b_{,12}, \qquad (\cdot)_{,\alpha} := \frac{\partial(\cdot)}{\partial x_\alpha}, \tag{2.5}$$
is the Monge–Ampère differential “crochet”; the field ϕ is an Airy-type stress function; w(x)z is interpreted as the transverse displacement of a point x of P ∪ ∂P; κ > 0 is a stiffness constant bearing the same physical dimensions as ϕ; and λ is a scalar multiplier specifying the magnitude of a given distribution of in-plane, compressive loads orthogonal to ∂P. That von Kármán’s is a nonlinear eigenvalue problem is made more evident by a change in format [2]. Let ϑ be a biharmonic field over P that satisfies the boundary conditions
$$\vartheta = \varphi_0, \qquad \partial_n\vartheta = \varphi_1 \quad \text{in } \partial P. \tag{2.6}$$
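The bracket (2.5) is easy to evaluate numerically, which may help when experimenting with trial deflections. The following is a minimal sketch, not part of the original derivation: it approximates the second derivatives by central finite differences, and the names `second_derivs`, `bracket`, `w_bowl` and `w_saddle` are hypothetical, introduced only for illustration.

```python
# Sketch: evaluate the bracket [a, b] of (2.5) by central finite differences.
# All function names here are illustrative assumptions, not the paper's notation.

def second_derivs(f, x1, x2, h=1e-3):
    """Return (f_11, f_22, f_12) at (x1, x2) using central differences."""
    f11 = (f(x1 + h, x2) - 2.0 * f(x1, x2) + f(x1 - h, x2)) / h**2
    f22 = (f(x1, x2 + h) - 2.0 * f(x1, x2) + f(x1, x2 - h)) / h**2
    f12 = (f(x1 + h, x2 + h) - f(x1 + h, x2 - h)
           - f(x1 - h, x2 + h) + f(x1 - h, x2 - h)) / (4.0 * h**2)
    return f11, f22, f12


def bracket(a, b, x1, x2, h=1e-3):
    """[a, b] := a_11 b_22 + a_22 b_11 - 2 a_12 b_12, as in (2.5)."""
    a11, a22, a12 = second_derivs(a, x1, x2, h)
    b11, b22, b12 = second_derivs(b, x1, x2, h)
    return a11 * b22 + a22 * b11 - 2.0 * a12 * b12


def w_bowl(x1, x2):
    """Trial deflection with positive Gaussian curvature."""
    return x1**2 + x2**2


def w_saddle(x1, x2):
    """Trial deflection with negative Gaussian curvature."""
    return x1 * x2
```

For a single field, [w, w] equals twice the determinant of the Hessian of w, so the sign of `bracket(w, w, ...)` distinguishes bowl-like from saddle-like shapes; the bracket is also symmetric in its two arguments.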
Set ψ := ϕ − λϑ, and denote by $\psi^{(w)}$ the unique solution, for each given field w over P, of the boundary-value problem
$$\Delta\Delta\psi - \tfrac{1}{2}[w, w] = 0 \quad \text{in } P, \tag{2.7}$$
$$\psi = 0, \qquad \partial_n\psi = 0 \quad \text{in } \partial P. \tag{2.8}$$
Set
$$A(w) := \kappa\,\Delta\Delta w - [\psi^{(w)}, w], \qquad B(w) := [\vartheta, w], \tag{2.9}$$
and call (λ, w) a proper pair if it solves the problem
$$C(\lambda, w) := A(w) - \lambda B(w) = 0, \tag{2.10}$$
subject to the boundary conditions (2.4). Then, the von Kármán problem (2.1)–(2.4) may be reformulated as follows: Study the mapping
$$\lambda \mapsto W_\lambda := \{w \mid C(\lambda, w) = 0\}. \tag{2.11}$$
This problem has been repeatedly looked at by engineers and mathematicians: by the former, in the first formulation; by the latter, in the formulation (2.11), because it provides a nontrivial instance where the abstract techniques developed by Crandall and Rabinowitz [7] and Rabinowitz [18] apply.
3. Three-Dimensional Plate Buckling
In this section, with a view toward giving the two-dimensional bifurcation problem (2.1)–(2.4) the status of a rational approximation, we formulate a buckling problem for an internally constrained, three-dimensional plate-like body of appropriate elastic response. We use standard notation in continuum mechanics, and leave smoothness assumptions tacit.
Consider a continuous body occupying a three-dimensional plate-like region of cross section P and thickness 2ε, with 2ε ≪ diam(P), that is to say, a right cylinder C(ε) of axis z, that we identify pointwise with the set P × ]−ε, +ε[. Let C(ε) be weakly clamped along its lateral boundary M(ε) ≡ ∂P × ]−ε, +ε[, in the sense that sliding parallel to the plane ζ = 0 is allowed (Figure 2). Moreover, let C(ε) be in equilibrium under null body loads, null tractions over the upper and lower faces P± = P × {±ε}, and uniform in-plane dead pressure of magnitude λ over all of M(ε). For sufficiently large values of the real parameter λ, we expect the plate C(ε) to buckle, that is, to admit equilibrium solutions with the deformed shape of C(ε) different from a right cylinder of axis z with flat cross sections.
Figure 2.
We specify what restricted class of motions we choose for C(ε) in the upcoming subsection; next, we assign to C(ε) a type of elastic response compatible with such constrained kinematics (Section 3.2); finally, we express the place boundary conditions in terms of the functions that parametrize the admissible motions, as well as the traction boundary conditions in terms of the admissible active and reactive stresses (Section 3.3). We need not lay down here the governing field equations; their consequences relevant to finding a two-dimensional problem which is an exact antecedent of von Kármán’s bifurcation problem will be discussed in Sections 4 and 5.
3.1. CONSTRAINED KINEMATICS
For p ↦ f(p) a deformation of C(ε) and for F = ∇f the deformation gradient, we choose the strain measure
$$D = \tfrac{1}{2}(F^T F - 1). \tag{3.1}$$
Moreover, for u(p) = f(p) − p
the displacement vector field associated with the deformation f and for U = ∇u the displacement gradient, we note that
$$D = E + \tfrac{1}{2}U^T U, \tag{3.2}$$
where
$$E = \tfrac{1}{2}(U + U^T) =: \operatorname{sym} U \tag{3.3}$$
is the linearized strain measure. As the first step of our derivation of the von Kármán equations, we stipulate that all the deformations of C(ε) satisfy the constraint condition
$$Dz = 0 \quad \text{in } C(\varepsilon). \tag{3.4}$$
This condition is the nonlinear counterpart of the condition that characterizes the kinematics of Kirchhoff–Love plates, namely,
$$Ez = 0 \quad \text{in } C(\varepsilon). \tag{3.5}$$
Conditions (3.4) and (3.5) impose – the former exactly, the latter in the sense of the classic approximation that regards $\sup_{C(\varepsilon)}|\nabla u|$ as small – that in any deformation material fibers parallel to z stay straight, do not change their length, and remain orthogonal to fibers orthogonal to z. The system (3.5) of linear PDEs has the well-known solution
$$u(x, \zeta) = v(x) + w(x)\,z - \zeta\,\nabla w(x), \qquad v(x)\cdot z = 0, \tag{3.6}$$
parametrized by the in-plane displacement field v(x) and the transverse deflection w(x). It is not difficult to solve the nonlinear system (3.4) as well: the admissible deformations have the form
$$f(x, \zeta) = g(x) + \zeta\,m(x), \tag{3.7}$$
parametrized by the mapping $g \equiv f|_P$ delivering the deformed image f(P) of the cross section P of cylinder C(ε). Indeed, m(x), the unit normal to f(P) at the point g(x), is completely determined by g:
$$m(x) := \operatorname{uni}(\partial_{c_1} f \times \partial_{c_2} f)\big|_{(x,0)} = \operatorname{uni}(g_{,1}\times g_{,2})\big|_{x} \tag{3.8}$$
(here $\partial_{c_\alpha} f = Fc_\alpha$ denotes the derivative of f in the direction $c_\alpha$, $c_\alpha\cdot z = 0$, and $\operatorname{uni}(a) := |a|^{-1}a$ for each vector a ≠ 0).
We use the same symbol for fields such as u here and f below, no matter whether we regard them as functions of the point p ∈ C(ε), of the corresponding pair (x, ζ) ∈ P × ]−ε, +ε[, or of the corresponding triplet of coordinates (x1, x2, ζ).
To see that (3.7) implies (3.4) is the matter of a straightforward computation. We prove the converse implication in a manner different from Naghdi and Nordgren [15], who were the first to note that (3.4) and (3.7) are equivalent. With the use of definition (3.1), we write the constraint condition (3.4) as
$$Fz = F^{-T}z \quad \text{in } C(\varepsilon), \tag{3.9}$$
and we observe that
$$f_{,\zeta}(x, \zeta) = \partial_z f(x, \zeta) = F(x, \zeta)\,z. \tag{3.10}$$
It follows from (3.9) that $|Fz| = |F^{-T}z| = 1$, whence, with the use of (3.8)$_1$,
$$m(x) = F(x, 0)\,z = f_{,\zeta}(x, 0). \tag{3.11}$$
It also follows from (3.9) that
$$(Fz)_{,\zeta}\cdot Fz = 0 \quad\text{and}\quad (Fz)_{,\zeta}\cdot Fc_\alpha = 0, \quad \alpha = 1, 2,$$
whence
$$(Fz)_{,\zeta} = 0. \tag{3.12}$$
Combining (3.10)–(3.12) we have that
$$f_{,\zeta}(x, \zeta) = f_{,\zeta}(x, 0) = m(x); \tag{3.13}$$
relation (3.7) then follows by integration, on setting f(x, 0) = g(x).
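The direct implication, that a deformation of the form (3.7) satisfies Dz = 0, can also be checked numerically. The sketch below is an illustration only: the parameter fields `w` and `v` are hypothetical choices, the deformation gradient is approximated by central finite differences, and the Kirchhoff–Love field (3.6) is included for contrast, since it satisfies only the linearized constraint (3.5).

```python
import math

# Sketch: check Dz = 0 for f = g + zeta*m (3.7), and contrast with the
# Kirchhoff-Love field (3.6). The fields w and v below are assumptions
# chosen only for illustration.

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    n = math.sqrt(sum(c * c for c in a))
    return tuple(c / n for c in a)


def w(x1, x2):
    return 0.1 * x1**2 + 0.05 * x1 * x2


def grad_w(x1, x2):
    return (0.2 * x1 + 0.05 * x2, 0.05 * x1)


def v(x1, x2):
    # In-plane displacement (v . z = 0): v1 = 0.02*x2**2, v2 = 0.
    return (0.02 * x2**2, 0.0)


def f_exact(x1, x2, zeta):
    """f = g + zeta*m with g = x + v + w z and m = uni(g,1 x g,2)."""
    w1, w2 = grad_w(x1, x2)
    g1 = (1.0, 0.0, w1)            # g,1 = c1 + v,1 + w,1 z  (v,1 = 0 here)
    g2 = (0.04 * x2, 1.0, w2)      # g,2 = c2 + v,2 + w,2 z
    m = unit(cross(g1, g2))
    v1, v2 = v(x1, x2)
    g = (x1 + v1, x2 + v2, w(x1, x2))
    return tuple(g[i] + zeta * m[i] for i in range(3))


def f_kl(x1, x2, zeta):
    """p + u with u = v + w z - zeta*grad(w), as in (3.6)."""
    w1, w2 = grad_w(x1, x2)
    v1, v2 = v(x1, x2)
    return (x1 + v1 - zeta * w1, x2 + v2 - zeta * w2, zeta + w(x1, x2))


def strain_times_z(f, x1, x2, zeta, h=1e-6):
    """Return D z, with D = (F^T F - 1)/2 and F from central differences."""
    p = (x1, x2, zeta)
    cols = []                       # cols[j] approximates F c_j (j = 2: F z)
    for j in range(3):
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        fp, fm = f(*plus), f(*minus)
        cols.append(tuple((fp[i] - fm[i]) / (2.0 * h) for i in range(3)))
    # (D z)_i = 0.5 * (F c_i . F z - delta_i3)
    return tuple(0.5 * (sum(cols[i][k] * cols[2][k] for k in range(3))
                        - (1.0 if i == 2 else 0.0)) for i in range(3))
```

With these sample fields, `strain_times_z(f_exact, 0.5, 0.3, 0.05)` vanishes to within finite-difference error, while for `f_kl` the third component equals |∇w|²/2, which is small only when the deflection gradient is small.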
REMARKS. 1. To compare relations (3.6) and (3.7), we write the latter as
$$u(x, \zeta) = v(x) + w(x)\,z + \zeta\,(m - z), \tag{3.14}$$
$$v(x) + w(x)\,z = g(x) - x, \tag{3.15}$$
with v(x) · z = 0. In Cartesian components (3.6) reads
$$u_\alpha(x, \zeta) = v_\alpha(x) - \zeta\,w_{,\alpha}(x), \qquad u_3(x, \zeta) = w(x), \tag{3.16}$$
while (3.14) is
$$u_\alpha(x, \zeta) = v_\alpha(x) + \zeta\,m_\alpha(x), \qquad u_3(x, \zeta) = w(x) + \zeta\,(m_3(x) - 1) \tag{3.17}$$
(here $u_\alpha := u\cdot c_\alpha$, $u_3 := u\cdot z$). Thus, the Kirchhoff–Love kinematics is recovered whenever
$$m_\alpha \simeq -w_{,\alpha} \quad\text{and}\quad m_3 \simeq 1. \tag{3.18}$$
2. In the standard linear theory of plates, which cannot treat bifurcations and concentrates on determining transverse deflections, the fields v(x) and w(x) are typically determined separately, an exceptional situation in the exact, nonlinear theory we deal with.
3. von Kármán’s theory of plate buckling ignores in-plane displacements, just as Euler’s theory of rod buckling does with axial displacements.
3.2. CONSTRAINED ELASTIC RESPONSE
As is standard doctrine in continuum mechanics (cf., e.g., [20, Section 30; 13; 1, Section XII.12]), the kinematical constraint (3.7) should be maintained by powerless reactive stresses. In terms of the Cosserat stress S and the strain rate $\dot D$, the power expended per unit referential volume has the expression
$$\pi(S, \dot D) := S\cdot\dot D. \tag{3.19}$$
We split S into mutually orthogonal active and reactive parts:
$$S = S^{(A)} + S^{(R)}, \qquad S^{(A)}\cdot S^{(R)} = 0, \tag{3.20}$$
and assume that, at each point of C(ε), $S^{(R)}$ satisfies
$$\pi(S^{(R)}, \dot D) = 0 \tag{3.21}$$
for all admissible strain rates $\dot D$, that is, those obeying
$$\dot D z = 0, \tag{3.22}$$
a restriction that follows directly from (3.7). Since (3.22) may be equivalently written as
$$\dot D\cdot \operatorname{sym}(z\otimes a) = 0, \tag{3.23}$$
with a an arbitrary vector, the Cosserat reactive stress must then have the form
$$S^{(R)} = z\otimes d + d\otimes z + \delta\, z\otimes z, \qquad d\cdot z = 0, \quad \delta\in\mathbb{R}. \tag{3.24}$$
In other words, a reactive stress field over C(ε) takes its values in the three-dimensional subspace
$$\mathcal{R} := \operatorname{span}\{\operatorname{sym}(c_\alpha\otimes z),\; z\otimes z\} \tag{3.25}$$
of the space Sym of symmetric, 3 × 3 tensors; by (3.7) and (3.20)$_2$, respectively, both the admissible strain fields and the active stress fields are plane fields, in the sense that they take values in $\mathcal{A} := \mathcal{R}^\perp$, the orthogonal complement of $\mathcal{R}$ in Sym.
Our specification of a stress response compatible with the internal constraint (3.7) is completed by assuming that the active part of the stress depends linearly on the strain measure, in the form
$$S^{(A)} = \frac{E}{1+\nu}\Big(D + \frac{\nu}{1-\nu}(\operatorname{tr} D)\,1\Big); \tag{3.26}$$
here tr D = 1 · D = D11 + D22, with 1 the identity in $\mathcal{A}$. With (3.26) and (3.21), (3.19) becomes
$$\pi(S, \dot D) = \dot\sigma(D), \qquad \sigma(D) := \frac{E}{2(1+\nu)}\Big(|D|^2 + \frac{\nu}{1-\nu}(\operatorname{tr} D)^2\Big); \tag{3.27}$$
the inequalities
$$E > 0, \qquad \nu\in\,]{-1}, +1[ \tag{3.28}$$
guarantee strict positivity of the stored energy σ(D), as well as invertibility of the linear transformation (3.26) of $\mathcal{A}$ into itself:
$$D = \frac{1+\nu}{E}\Big(S^{(A)} - \frac{\nu}{1+\nu}(\operatorname{tr} S^{(A)})\,1\Big). \tag{3.29}$$
All in all, the material comprising the plate-like region C(ε) is a homogeneous St. Venant–Kirchhoff elastic material [20, Section 94] being transversely isotropic with respect to the direction z and capable only of deformations that agree with the constraint (3.4).
3.3. BOUNDARY CONDITIONS
We are now in a position to specify mathematically – for a continuous body having the referential shape C(ε), the motion class (3.7), and the mechanical response described by (3.20), (3.24) and (3.26) – the boundary conditions we described in words in the beginning of this section. As is customary in the nonlinear mechanics of solids, we express the traction conditions in terms of the Piola stress P, which relates as follows to the Cosserat stress measure S:
$$P = FS. \tag{3.30}$$
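Before the traction conditions are taken up, the constitutive relations (3.26), (3.27) and (3.29) can be sanity-checked on plane tensors. The following is a small sketch under stated assumptions: plane tensors are represented as 2×2 nested lists, and the function names and the numerical values used with them are hypothetical.

```python
# Sketch: the plane constitutive law (3.26), its inverse (3.29), and the
# stored energy (3.27), for 2x2 symmetric tensors given as nested lists.
# Function names are illustrative assumptions.

def trace(T):
    return T[0][0] + T[1][1]


def active_stress(D, E, nu):
    """(3.26): S = E/(1+nu) * (D + nu/(1-nu) * tr(D) * 1)."""
    c = E / (1.0 + nu)
    t = nu / (1.0 - nu) * trace(D)
    return [[c * (D[i][j] + (t if i == j else 0.0)) for j in range(2)]
            for i in range(2)]


def strain_from_stress(S, E, nu):
    """(3.29): D = (1+nu)/E * (S - nu/(1+nu) * tr(S) * 1)."""
    c = (1.0 + nu) / E
    t = nu / (1.0 + nu) * trace(S)
    return [[c * (S[i][j] - (t if i == j else 0.0)) for j in range(2)]
            for i in range(2)]


def stored_energy(D, E, nu):
    """(3.27): sigma(D) = E/(2(1+nu)) * (|D|^2 + nu/(1-nu) * tr(D)^2)."""
    norm2 = sum(D[i][j] ** 2 for i in range(2) for j in range(2))
    return E / (2.0 * (1.0 + nu)) * (norm2 + nu / (1.0 - nu) * trace(D) ** 2)
```

Applying `strain_from_stress` to the output of `active_stress` should return the original strain, and `stored_energy(D, E, nu)` should coincide with one half of S · D, in agreement with the footnoted remark that the active stress is the derivative of σ with respect to D.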
We require that
$$Pz = 0 \quad \text{in } P^+\cup P^-; \tag{3.31}$$
and that
$$Pn = -\lambda\,n \quad \text{in } M(\varepsilon). \tag{3.32}$$
Moreover, with the use of the representation (3.14) for the admissible deformations, we express the weak-clamping condition in the form
$$w = 0 \quad\text{and}\quad m = z \quad \text{in } \partial P. \tag{3.33}$$
Here E and ν are material moduli resembling, respectively, the well-known Young and Poisson moduli that characterize a linearly elastic, unconstrained, isotropic material; their operational definition requires some care (see [17, Section 19]). Note also that $S^{(A)} = \partial\sigma/\partial D$.
We now analyze the boundary conditions (3.31)–(3.33) a bit more closely. Firstly, we observe that – with (3.30), (3.20) and (3.24) – condition (3.31) can be given the form
$$d = 0 \quad\text{and}\quad \delta = 0 \quad \text{in } P^+\cup P^-. \tag{3.34}$$
Secondly, we observe that, on the lateral boundary, the Piola traction vector Pn splits into two vectors, the one reactive, the other constitutively specified:
$$Pn = FS^{(R)}n + FS^{(A)}n = (d\cdot n)\,z + FS^{(A)}n;$$
and that, due to (3.33)$_2$ and (3.9),
$$z = m = Fz = F^{-T}z.$$
Consequently, in view also of (3.26), $z\cdot FS^{(A)}n = 0$, and condition (3.32) can be given the form
$$d\cdot n = 0 \quad\text{and}\quad S^{(A)}n = -\lambda\,F^{-1}n \quad \text{in } M(\varepsilon). \tag{3.35}$$
Note that the Cosserat traction vector $S^{(A)}n$ must be orthogonal to z, but not necessarily parallel to n (unless the closed curve ∂P has some special global symmetry, say, it is a circle, or a rectangle). Thirdly, we observe that, due to (3.8), condition (3.33)$_2$ is equivalent to
$$(g_{,1}\times g_{,2})\cdot c_\alpha = 0, \quad \alpha = 1, 2. \tag{3.36}$$
Since
$$g_{,\alpha} = c_\alpha + v_{,\alpha} + w_{,\alpha}\,z, \tag{3.37}$$
an easy computation shows that (3.36) can be written as
$$-(1 + v_{2,2})\,w_{,1} + v_{2,1}\,w_{,2} = 0, \qquad v_{1,2}\,w_{,1} - (1 + v_{1,1})\,w_{,2} = 0. \tag{3.38}$$
Thus, since the standard requirement that deformation preserves local orientation (i.e., det F(x, ζ) > 0 in C(ε)) implies that
$$(1 + v_{1,1})(1 + v_{2,2}) - v_{1,2}\,v_{2,1} > 0 \quad \text{in } \partial P,$$
the gradient of w must be null along the curve ∂P. This fact, together with (3.33)$_1$, is enough to conclude that (3.33) can be equivalently written in the form
$$w = 0, \qquad \partial_n w = 0 \quad \text{on } \partial P, \tag{2.4}$$
that is to say, precisely the Dirichlet-type boundary conditions in von Kármán’s problem.
4. Compatibility and von Kármán’s First Equation
4.1. A PREPARATORY RESULT
We begin by establishing an easy consequence of the St. Venant–Beltrami compatibility conditions. As is well known (see [12, Section 14]), those conditions characterize as follows the solvability of the linear system
$$\tfrac{1}{2}(\nabla u + \nabla u^T) = E \tag{4.1}$$
for the vector field u associated to a given symmetric-valued tensor field E: for a simply-connected body, there is a solution to (4.1) if and only if the datum satisfies
$$e_{ijk}\,e_{lmn}\,E_{jm,kn} = 0, \quad i, l = 1, 2, 3, \tag{4.2}$$
(here $e_{ijk}$ is the Ricci alternator and $E_{jm} := c_j\cdot Ec_m$). If, in particular, E is plane, that is, in the present circumstances, if
$$Ez = 0, \tag{4.3}$$
then conditions (4.2) are met if and only if E has the form
$$E(x, \zeta) = E^{(0)}(x) + \zeta\,E^{(1)}(x), \tag{4.4}$$
with $E^{(0)}$ and $E^{(1)}$ (plane and) such as to satisfy, respectively,
$$2E^{(0)}_{12,12} - E^{(0)}_{11,22} - E^{(0)}_{22,11} = 0, \tag{4.5}$$
and
$$E^{(1)}_{11,2} - E^{(1)}_{12,1} = 0, \qquad E^{(1)}_{22,1} - E^{(1)}_{12,2} = 0. \tag{4.6}$$
The preparatory result we need amounts to noting that, for a plane strain field E(x, ζ) to be compatible, its thickness average
$$\bar E(x) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} E(x, \zeta)\,d\zeta \tag{4.7}$$
must satisfy (4.5), that is,
$$2\bar E_{12,12} - \bar E_{11,22} - \bar E_{22,11} = 0. \tag{4.8}$$
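The step from (4.5) to (4.8) rests on a one-line computation, spelled out here as a sketch for convenience: the term of (4.4) that is linear in ζ integrates to zero over the symmetric interval, so the thickness average coincides with the leading coefficient.

```latex
\bar E(x)
  = \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}
      \bigl(E^{(0)}(x) + \zeta\,E^{(1)}(x)\bigr)\,d\zeta
  = E^{(0)}(x)
    + \frac{1}{2\varepsilon}\Bigl[\tfrac{1}{2}\zeta^{2}\Bigr]_{-\varepsilon}^{+\varepsilon} E^{(1)}(x)
  = E^{(0)}(x).
```

Hence (4.8) is nothing but (4.5) written for the thickness average in place of the leading coefficient.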
Our plan is to show that von Kármán’s first equation follows from the compatibility condition (4.8), when applied to a suitable plane strain field.
4.2. AN EXACT ANTECEDENT OF VON KÁRMÁN’S FIRST EQUATION
Taking into account (3.1) and (3.29), and denoting by P the orthogonal projection of Sym onto its subspace $\mathcal{A}$, we first introduce the mapping
$$E = E(S, H) := \frac{1+\nu}{E}\Big(S - \frac{\nu}{1+\nu}(\operatorname{tr} S)\,1\Big) - \frac{1}{2}\,P[H^T H], \tag{4.9}$$
which associates a tensor E ∈ $\mathcal{A}$ with each pair consisting of a tensor S ∈ $\mathcal{A}$ and an arbitrary tensor H. The plane strain field we look for obtains by composition of E with fields H(x, ζ) and S(x, ζ) belonging, respectively, to the collections $\mathcal{H}$ and $\mathcal{S}$ we now describe. $\mathcal{H}$ is the collection of all tensor-valued fields over C(ε) that are gradients of vector fields of the form (3.14). The collection $\mathcal{S}$ consists of all those plane tensor fields S(x, ζ) over C(ε) whose thickness average $\bar S(x)$, a single-valued, plane tensor field over P,
(i) is divergenceless:
$$\bar S_{\alpha\beta,\beta} = 0, \quad \alpha = 1, 2; \tag{4.10}$$
(ii) satisfies the thickness average of the boundary condition (3.35)$_2$:
$$\bar S n = -\lambda\,\bar s, \qquad \bar s := \Big(\frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} F^{-1}(x, \zeta)\,d\zeta\Big)\,n = \overline{F^{-1}}\,n \quad \text{in } \partial P. \tag{4.11}$$
We note that $\bar S(x)$ admits a single-valued Airy representation in terms of a scalar field ϕ(x) over P:
$$\bar S = E\,R(\nabla\nabla\varphi)R^T, \qquad R := -c_1\otimes c_2 + c_2\otimes c_1; \tag{4.12}$$
and that, with (4.12), the boundary condition (4.11) can be written in the form
$$\varphi = \lambda\varphi_0, \qquad \partial_n\varphi = \lambda\varphi_1 \quad \text{in } \partial P, \tag{4.13}$$
where, for σ the arc-length parametrization of the boundary curve ∂P from a point x(0) ∈ ∂P,
$$E\,\varphi_0(\sigma) := -R\cdot\int_0^\sigma \big(x(\tau) - x(\sigma)\big)\otimes\bar s(\tau)\,d\tau, \tag{4.14}$$
$$E\,\varphi_1(\sigma) := -R\cdot\int_0^\sigma \bar s(\tau)\,d\tau\otimes n(\sigma). \tag{4.15}$$
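That the Airy representation (4.12) automatically delivers a divergenceless field, as required by (4.10), can be verified in components. The lines below are a sketch of that check, using only the definition of R in (4.12) and the equality of mixed partial derivatives.

```latex
% Components of R(\nabla\nabla\varphi)R^T, with R c_1 = c_2 and R c_2 = -c_1:
\bigl(R(\nabla\nabla\varphi)R^{T}\bigr)_{11} = \varphi_{,22}, \qquad
\bigl(R(\nabla\nabla\varphi)R^{T}\bigr)_{22} = \varphi_{,11}, \qquad
\bigl(R(\nabla\nabla\varphi)R^{T}\bigr)_{12} = -\varphi_{,12}.
% Divergence of \bar S = E\,R(\nabla\nabla\varphi)R^T:
\bar S_{1\beta,\beta} = E\,(\varphi_{,221} - \varphi_{,122}) = 0, \qquad
\bar S_{2\beta,\beta} = E\,(-\varphi_{,121} + \varphi_{,112}) = 0.
```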
REMARKS. 1. The implicit restriction that ϕ(0) = 0 and ∂nϕ(0) = 0 is immaterial; see [12, Section 47].
2. It is important to realize that the boundary fields ϕ0 and ϕ1 depend on the restrictions to ∂P of the parameter fields w and v: such a dependence is entailed by the presence, in the definitions (4.14) and (4.15), of the field $\bar s$. To make this point precise, note that, from (A.1) and the second of (3.33) we have that
$$F^{-1}n = (h^\alpha\cdot n)\,c_\alpha \quad \text{in } \partial P.$$
In particular, then,
$$\bar s = (\bar h^\alpha\cdot n)\,c_\alpha \quad \text{in } \partial P;$$
but (see Appendix A) the contravariant base vectors $h^\alpha$ depend functionally just on the fields w and v.
Given (S, H) ∈ $\mathcal{S}\times\mathcal{H}$, we use definition (4.9) to construct a plane field E(x, ζ) = E(S(x, ζ), H(x, ζ)) (in fact, a candidate strain field over the plate-like body C(ε) we consider), whose thickness average $\bar E$ is parametrized by three fields over P, namely, ϕ, v, and w. After some computations, we find that $\bar E$ obeys (4.8) if
$$\Delta\Delta\varphi - \frac{1}{2}[w, w] + \frac{1}{2}\Big([v_\alpha, v_\alpha] + \frac{1}{3}\,\varepsilon^2\,[m_i, m_i]\Big) = 0. \tag{4.16}$$
For the reader’s convenience, let us repeat here the first of von Kármán equations:
$$\Delta\Delta\varphi - \frac{1}{2}[w, w] = 0. \tag{2.1}$$
Comparison is striking. The differential relation (4.16) is an exact compatibility restriction involving the fields w(x), v(x), and ϕ(x) over P. As is customary with equations containing a small parameter, one investigates the dependence of the solutions of (4.16) on ε. The obvious scaling
$$w^\varepsilon = \varepsilon\,w, \qquad v^\varepsilon = \varepsilon^2\,v, \qquad \varphi^\varepsilon = \varepsilon^2\,\varphi, \tag{4.17}$$
permits us to conclude that (4.16) is an exact two-dimensional antecedent of the first von Kármán equation (2.1). Likewise, conditions (4.13) are exact antecedents of the Neumann-type boundary conditions (2.3) in von Kármán’s problem, with which they formally coincide, and to which they reduce when the scaling (4.17) is completed by taking
$$\lambda^\varepsilon = \varepsilon^2\,\lambda. \tag{4.18}$$
Note that the first two of (4.17) imply that $m^\varepsilon = z - \varepsilon\,\nabla w + o(\varepsilon)$, whence $[m_i, m_i] = O(\varepsilon^2)$.
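The reduction of (4.16) to (2.1) under the scaling (4.17) can be followed term by term. The bookkeeping below is a sketch that uses only the bilinearity of the bracket (2.5) and the footnoted estimate for the brackets of the components of m; it records the order of each term, not its sign.

```latex
\Delta\Delta\varphi^{\varepsilon} = \varepsilon^{2}\,\Delta\Delta\varphi, \qquad
[w^{\varepsilon}, w^{\varepsilon}] = \varepsilon^{2}\,[w, w], \qquad
[v^{\varepsilon}_{\alpha}, v^{\varepsilon}_{\alpha}] = \varepsilon^{4}\,[v_{\alpha}, v_{\alpha}], \qquad
\varepsilon^{2}\,[m^{\varepsilon}_{i}, m^{\varepsilon}_{i}] = O(\varepsilon^{4}).
```

After division by ε², the last two contributions are O(ε²) and drop out in the limit ε → 0, so only the two terms of (2.1) survive.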
REMARK. In this paper, we adhere to a common practice and loosely take 2ε, instead of 2εh, to measure the thickness of the plate-like region C(ε). Thus, whenever we regard ε as a dimensionless smallness parameter, as is done for the first time in (4.17), we think of it as divided by h.
5. Equilibrium and von Kármán’s Second Equation
At equilibrium, the Piola stress must satisfy Div P = 0 at each point of C(ε), or rather, equivalently,
$$(Pc_\alpha)_{,\alpha} + (Pz)_{,\zeta} = 0. \tag{5.1}$$
Our first concern is to give this equilibrium equation a form that makes the role of the reactive stress explicit.
5.1. ACTIVE AND REACTIVE EQUILIBRIUM STRESSES
We begin by observing that, for
$$h_\alpha = f_{,\alpha} = g_{,\alpha} + \zeta\,m_{,\alpha}, \qquad h_3 = f_{,\zeta} = m \tag{5.2}$$
the covariant base vectors associated with the shape of C(ε) after a deformation of the form (3.7), the deformation gradient has the representation
$$F = h_\alpha\otimes c_\alpha + m\otimes z \tag{5.3}$$
(cf. Appendix A). Because of this representation and the fact that $S^{(A)}\in\mathcal{A}$, we find that
$$P^{(A)}c_\alpha = FS^{(A)}c_\alpha = S^{(A)}_{\beta\alpha}\,h_\beta, \qquad P^{(A)}z = 0, \tag{5.4}$$
where $S^{(A)}_{\beta\alpha} := c_\beta\cdot S^{(A)}c_\alpha = S^{(A)}_{\alpha\beta}$.
Under the scaling (4.17)$_{1,2}$,
$$(h_\alpha)^\varepsilon = c_\alpha + o(\varepsilon), \qquad \bar s^{\,\varepsilon} = n + o(\varepsilon). \tag{5.5}$$
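Moreover, again by (5.3) and by (3.30) and (3.24), we find that
$$P^{(R)} = d_\alpha\,h_\alpha\otimes z + m\otimes d + \delta\,m\otimes z, \qquad d_\alpha := d\cdot c_\alpha, \tag{5.6}$$
whence
$$P^{(R)}c_\alpha = d_\alpha\,m, \qquad P^{(R)}z = d_\alpha\,h_\alpha + \delta\,m, \tag{5.7}$$
so that the nonnull components of $P^{(R)}$ are
$$P^{(R)}_{\gamma 3} := h_\gamma\cdot P^{(R)}z = (h_\gamma\cdot h_\beta)\,d_\beta, \qquad P^{(R)}_{3\gamma} := m\cdot P^{(R)}c_\gamma = d_\gamma, \qquad P^{(R)}_{33} := m\cdot P^{(R)}z = \delta. \tag{5.8}$$
Note that
$$P^{(R)}_{\gamma 3} = (h_\gamma\cdot h_\beta)\,P^{(R)}_{3\beta}, \qquad P^{(R)}_{3\gamma} = (h^\gamma\cdot h^\beta)\,P^{(R)}_{\beta 3}, \tag{5.9}$$
where $\{h^1, h^2, m\}$ is the contravariant base associated with the covariant base in (5.2). With (5.4) and (5.7), the equilibrium equation (5.1) takes the form
$$(S^{(A)}_{\beta\alpha}\,h_\beta + d_\alpha\,m)_{,\alpha} + (d_\beta\,h_\beta + \delta\,m)_{,\zeta} = 0. \tag{5.10}$$
Differentiating and taking the inner products with the base vectors (5.2), we deduce from (5.10) the following system of three scalar equations:
$$(S^{(A)}_{\beta\alpha}\,h_\beta)_{,\alpha}\cdot h_\gamma + \big((h_\beta\cdot h_\gamma)\,d_\beta\big)_{,\zeta} = 0, \qquad (h_{\beta,\alpha}\cdot m)\,S^{(A)}_{\beta\alpha} + d_{\alpha,\alpha} + \delta_{,\zeta} = 0. \tag{5.11}$$
Finally, with the use of (5.8) and (5.9), we write (5.11) in the form
$$(S^{(A)}_{\beta\alpha}\,h_\beta)_{,\alpha}\cdot h_\gamma + P^{(R)}_{\gamma 3,\zeta} = 0, \qquad (h_{\beta,\alpha}\cdot m)\,S^{(A)}_{\beta\alpha} + \big((h^\alpha\cdot h^\beta)\,P^{(R)}_{\beta 3}\big)_{,\alpha} + P^{(R)}_{33,\zeta} = 0. \tag{5.12}$$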
Remarkably, equations (5.12) have the same structure exploited, in the linear case, to give an exact derivation from three-dimensional elasticity of the classical Germain–Lagrange equation for thin plates [16]; we then manipulate these equations in the same manner. We proceed as follows: firstly, thickness integration of (5.12) allows us to determine the reactive stress field $P^{(R)}(x, \zeta)$ in C(ε) in terms of the equilibrium deformation, that is, in terms of the parameter fields ϕ(x), v(x), and w(x) over P at equilibrium; secondly, three pure (that is, reaction-free) and exact scalar consequences of (5.12) and (3.34) are found, namely, equations (5.16) and (5.18) in the next subsection; this last equation, a nonlinear counterpart of the Germain–Lagrange equation, serves as a two-dimensional antecedent of (2.2), the second of von Kármán equations; finally, by way of the scaling (4.17), the relationship between (2.2) and its antecedent (5.18) is made precise.
5.2. AN EXACT ANTECEDENT OF VON KÁRMÁN’S SECOND EQUATION
Integrating (5.12)$_{1,2}$ with the use of the boundary condition (3.34)$_1$ restricted to $P^-$, we find
$$P^{(R)}_{\gamma 3}(x, \zeta) = -\int_{-\varepsilon}^{\zeta} (S^{(A)}_{\beta\alpha}\,h_\beta)_{,\alpha}\cdot h_\gamma\,d\chi. \tag{5.13}$$
Moreover, with (5.13) and (3.34)$_2$ restricted to $P^-$, integration of (5.12)$_3$ yields
$$P^{(R)}_{33}(x, \zeta) = \int_{-\varepsilon}^{\zeta}\Big((h^\alpha\cdot h^\beta)\int_{-\varepsilon}^{\chi} (S^{(A)}_{\delta\gamma}\,h_\delta)_{,\gamma}\cdot h_\beta\,d\tau\Big)_{,\alpha}\,d\chi - \int_{-\varepsilon}^{\zeta} (h_{\beta,\alpha}\cdot m)\,S^{(A)}_{\beta\alpha}\,d\chi. \tag{5.14}$$
Relations (5.13) and (5.14), together with the second of (5.9), allow us to construct the reactive field in C(ε) whenever the deformation field is known. Of the boundary conditions (3.34), those prevailing over $P^+$ remain to be satisfied. Two of them read:
$$P^{(R)}_{\gamma 3}(x, \varepsilon) = 0 \quad \text{in } P, \tag{5.15}$$
and, with the use of (5.13), can be written as
$$\int_{-\varepsilon}^{+\varepsilon} (S^{(A)}_{\beta\alpha}\,h_\beta)_{,\alpha}\cdot h_\gamma\,d\zeta = 0 \quad \text{in } P. \tag{5.16}$$
The third reads:
$$P^{(R)}_{33}(x, \varepsilon) = 0 \quad \text{in } P, \tag{5.17}$$
or rather, with (5.14),
$$\int_{-\varepsilon}^{+\varepsilon}\Big((h^\alpha\cdot h^\beta)\int_{-\varepsilon}^{\zeta} (S^{(A)}_{\delta\gamma}\,h_\delta)_{,\gamma}\cdot h_\beta\,d\tau\Big)_{,\alpha}\,d\zeta - \int_{-\varepsilon}^{+\varepsilon} (h_{\beta,\alpha}\cdot m)\,S^{(A)}_{\beta\alpha}\,d\zeta = 0 \quad \text{in } P. \tag{5.18}$$
We postpone to the next subsection our study of the two PDEs in the unknown fields w and v to which conditions (5.16) reduce when the constitutive equation (3.26) for the active stress is taken into account, as well as our study of the accompanying boundary conditions. We now show that, given the material response we have chosen, condition (5.18) yields a fourth-order quasilinear PDE, which is an exact antecedent of the second von Kármán equation (2.2).
Here and henceforth, for short, dχ signifies (x, χ) dχ.
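To see this, a rather lengthy computation is needed, in order to make explicit the functional dependence of $S^{(A)}$, $h_i$ and $h^\alpha$ on the unknown fields w, v and, possibly, ϕ. We define
$$I_1(\varepsilon; w, v) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon}\Big((h^\alpha\cdot h^\beta)\int_{-\varepsilon}^{\zeta} (S^{(A)}_{\delta\gamma}\,h_\delta)_{,\gamma}\cdot h_\beta\,d\tau\Big)_{,\alpha}\,d\zeta, \tag{5.19}$$
$$I_2(\varepsilon; w, v, \varphi) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} (h_{\beta,\alpha}\cdot m)\,S^{(A)}_{\beta\alpha}\,d\zeta, \tag{5.20}$$
and write (5.18) in the form
$$I_1(\varepsilon; w, v) - I_2(\varepsilon; w, v, \varphi) = 0 \quad \text{in } P. \tag{5.21}$$
The dependence of $I_1$ and $I_2$ on the unknown fields is detailed in Appendix D. In particular, we find that the first of these integrals can be given the form of a quasilinear differential operator with principal part
$$\text{p.p.}(I_1(\varepsilon; w, v)) = \frac{E}{1-\nu^2}\,(m\cdot z)\,B^{\alpha\delta}(\varepsilon; w, v)\,(\Delta w)_{,\delta\alpha}. \tag{D.3}$$
We also find the following explicit representation for the second integral:
$$I_2(\varepsilon; w, v, \varphi) = E\,(m\cdot z)\,[\varphi, w] + E\,(m\cdot v_{,\alpha\beta})\big(R(\nabla\nabla\varphi)R^T\cdot c_\alpha\otimes c_\beta\big) - \frac{1}{3}\,\varepsilon^2\,(m_{,\alpha}\cdot m_{,\beta})\,S^{(1)}_{\alpha\beta}. \tag{D.8$_2$}$$
Under the scaling (4.17), we find that
$$I_1(\varepsilon; w^\varepsilon, v^\varepsilon) = \varepsilon^3\,E\,\kappa\,\Delta\Delta w + o(\varepsilon^3), \qquad \kappa := \frac{1}{3(1-\nu^2)}\,h^2; \tag{5.22}$$
as to the second integral, that
$$I_2(\varepsilon; w^\varepsilon, v^\varepsilon, \varphi^\varepsilon) = \varepsilon^3\,E\,[\varphi, w] + o(\varepsilon^3). \tag{5.23}$$
Hence,
$$\lim_{\varepsilon\to 0}\frac{1}{\varepsilon^3}\big(I_1(\varepsilon; w^\varepsilon, v^\varepsilon) - I_2(\varepsilon; w^\varepsilon, v^\varepsilon, \varphi^\varepsilon)\big) = E\,(\kappa\,\Delta\Delta w - [\varphi, w]), \tag{5.24}$$
which is enough to establish the second von Kármán equation as the limit of the equilibrium equation (5.18) when ε → 0.
Here,
$$B^{\alpha\delta}(\varepsilon; w, v) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} \zeta\,A^{\alpha\beta}\,h_\beta\cdot h_\delta\,d\zeta, \tag{D.2}$$
where
$$h^\alpha\cdot h^\beta = A^{\alpha\beta}{}_{,\zeta}. \tag{A.7}$$
Recall the remark at the end of Section 4. The last formula provides an interpretation for the dimensional constant κ in von Kármán’s second equation (2.2).
5.3. TWO COMPLEMENTING PDES AND THE ASSOCIATED BOUNDARY CONDITIONS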
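We begin by writing conditions (5.16) in a form that reflects the functional dependence of both $S^{(A)}$ and $h_\alpha$ on the unknown fields w and v:
$$I_{3\gamma}(\varepsilon; w, v) = 0, \qquad I_{3\gamma} := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} (S^{(A)}_{\beta\alpha}\,h_\beta)_{,\alpha}\cdot h_\gamma\,d\zeta. \tag{5.25}$$
We also note that
$$S^{(A)}(x, \zeta) = S^{(0)}(x) + \zeta\,S^{(1)}(x) + \zeta^2\,S^{(2)}(x), \tag{C.2}$$
while (5.2) can be written as
$$h_\alpha = h^{(0)}_\alpha + \zeta\,h^{(1)}_\alpha, \qquad h^{(0)}_\alpha := g_{,\alpha}, \quad h^{(1)}_\alpha := m_{,\alpha}. \tag{5.26}$$
It is then the matter of an easy calculation to give (5.16) the following form:
$$I_{3\gamma} = (S^{(0)}_{\beta\alpha}h^{(0)}_\beta)_{,\alpha}\cdot h^{(0)}_\gamma + \frac{1}{3}\,\varepsilon^2\Big((S^{(0)}_{\beta\alpha}h^{(1)}_\beta + S^{(1)}_{\beta\alpha}h^{(0)}_\beta)_{,\alpha}\cdot h^{(1)}_\gamma + (S^{(1)}_{\beta\alpha}h^{(1)}_\beta + S^{(2)}_{\beta\alpha}h^{(0)}_\beta)_{,\alpha}\cdot h^{(0)}_\gamma\Big) + \frac{1}{5}\,\varepsilon^4\,(S^{(2)}_{\beta\alpha}h^{(1)}_\beta)_{,\alpha}\cdot h^{(1)}_\gamma = 0 \quad \text{in } P. \tag{5.27}$$
We denote by $I_3(\varepsilon; w, v)$ the vector-valued, nonlinear differential operator with components $I_{3\gamma}$, and write (5.27) for short as
$$I_3(\varepsilon; w, v) = 0 \quad \text{in } P; \tag{5.28}$$
this equation is to complement the antecedents (4.16) and (5.21) of the von Kármán equations (Section 6). To record here the complicated, explicit form of the operator $I_3$ is scarcely relevant to our present purposes. What instead matters, as we shall demonstrate in the next section, is that $I_3$ does not depend on the unknown field ϕ. In addition, it is interesting to note that, under the scaling (4.17), we have that
$$\lim_{\varepsilon\to 0}\frac{1}{\varepsilon^2}\,I_3(\varepsilon; w^\varepsilon, v^\varepsilon) = \operatorname{Div} S^{(0)}. \tag{5.29}$$
Thus, with the use of (C.4)$_1$, we see that, in the limit when ε → 0, (5.28) reduces to
$$\frac{E}{2(1+\nu)}\Big((\Delta w)\,1 + \frac{1+\nu}{1-\nu}\,\nabla\nabla w\Big)\nabla w + \operatorname{Div} S^{(0)}(v) = 0 \quad \text{in } P. \tag{5.30}$$
The dependence of $S^{(A)}$ on w and v is detailed in Appendix C; as to $h_\alpha$, that dependence follows from (5.2), (3.8), and (3.15).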
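We now turn to determine the boundary conditions to be associated with the differential system (5.28). Combining with (5.3) the second of (3.35), we find that the latter can be given the form
$$(S^{(A)}c_\alpha\cdot n)\,h_\alpha = -\lambda\,n \quad \text{in } M(\varepsilon). \tag{5.31}$$
Another use of (C.2) and (5.26) yields:
$$(S^{(A)}c_\alpha\cdot n)\,h_\alpha = (S^{(0)}c_\alpha\cdot n)\,h^{(0)}_\alpha + \zeta\big((S^{(0)}c_\alpha\cdot n)\,h^{(1)}_\alpha + (S^{(1)}c_\alpha\cdot n)\,h^{(0)}_\alpha\big) + \zeta^2\big((S^{(1)}c_\alpha\cdot n)\,h^{(1)}_\alpha + (S^{(2)}c_\alpha\cdot n)\,h^{(0)}_\alpha\big) + \zeta^3\,(S^{(2)}c_\alpha\cdot n)\,h^{(1)}_\alpha. \tag{5.32}$$
We set
$$B_3(\varepsilon; w, v) := \frac{1}{2\varepsilon}\int_{-\varepsilon}^{+\varepsilon} (S^{(A)}c_\alpha\cdot n)\,h_\alpha\,d\zeta, \tag{5.33}$$
whence
$$B_3(\varepsilon; w, v) = (S^{(0)}c_\alpha\cdot n)\,h^{(0)}_\alpha + \frac{1}{3}\,\varepsilon^2\big((S^{(1)}c_\alpha\cdot n)\,h^{(1)}_\alpha + (S^{(2)}c_\alpha\cdot n)\,h^{(0)}_\alpha\big), \tag{5.34}$$
and stipulate that
$$B_3(\varepsilon; w, v)\times n = 0 \quad \text{in } \partial P. \tag{5.35}$$
Note that $B_3$ does not depend on the unknown field ϕ. In addition, note that, under the scaling (4.17), we have that
$$\lim_{\varepsilon\to 0}\frac{1}{\varepsilon^2}\,B_3(\varepsilon; w^\varepsilon, v^\varepsilon) = (S^{(0)}c_\alpha\cdot n)\,h^{(0)}_\alpha. \tag{5.36}$$
Thus, due to the fact that ∇w ≡ 0 on ∂P as a consequence of the Dirichlet boundary conditions (2.4), another use of the first of (C.4) permits us to conclude that, when ε → 0, (5.35) reduces to
$$(1 + \nabla v)\,S^{(0)}(v)\,n\times n = 0 \quad \text{in } \partial P. \tag{5.37}$$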
6. An Exact Two-Dimensional Antecedent of von Kármán’s Bifurcation Problem
We collect the relevant results obtained so far in order to formulate, in a format modeled after [2], the two-dimensional bifurcation problem we propose as an exact antecedent of von Kármán’s bifurcation problem (2.11). We note, firstly, that the format change introduced by Berger applies also when equation (4.16) takes the place of equation (2.1). Indeed, just as in Section 2, if we let ϑ be a biharmonic field over P that satisfies the boundary conditions (2.6) and, moreover, we let ψ = ϕ − λϑ, then we can give the system (4.16), (2.3) the following form:
$$\Delta\Delta\psi - \frac{1}{2}[w, w] + \frac{1}{2}\Big([v_\alpha, v_\alpha] + \frac{1}{3}\,\varepsilon^2\,[m_i, m_i]\Big) = 0 \quad \text{in } P, \tag{6.1}$$
$$\psi = 0, \qquad \partial_n\psi = 0 \quad \text{in } \partial P. \tag{2.8}$$
We denote by $\psi^{(w,v)}$ the unique solution of this problem. Secondly, we note that
$$I_2(\varepsilon; w, v, \varphi) = I_2(\varepsilon; w, v, \psi) + \lambda\,\widetilde B(\varepsilon; w, v, \vartheta), \tag{6.2}$$
with
$$\widetilde B(\varepsilon; w, v, \vartheta) := E\,(m\cdot z)\,[\vartheta, w] + E\,(m\cdot v_{,\alpha\beta})\big(R(\nabla\nabla\vartheta)R^T\cdot c_\alpha\otimes c_\beta\big). \tag{6.3}$$
Finally, we let
$$\widetilde A(\varepsilon; w, v) := I_1(\varepsilon; w, v) - I_2(\varepsilon; w, v, \psi^{(w,v)}), \tag{6.4}$$
$$\widetilde C(\varepsilon; \lambda, w, v) := \widetilde A(\varepsilon; w, v) - \lambda\,\widetilde B(\varepsilon; w, v), \tag{6.5}$$
and pose the problem of studying, for each ε > 0 fixed, the mapping
$$\lambda \mapsto \widetilde W(\varepsilon; \lambda),$$
where $\widetilde W(\varepsilon; \lambda)$ is the collection of all pairs (w, v) of smooth fields over P ∪ ∂P satisfying
$$\widetilde C(\varepsilon; \lambda, w, v) = 0 \quad \text{in } P, \tag{6.6}$$
together with (2.4),
$$I_3(\varepsilon; w, v) = 0 \quad \text{in } P, \tag{5.28}$$
and
$$B_3(\varepsilon; w, v)\times n = 0 \quad \text{in } \partial P. \tag{5.35}$$
It is clear that this problem, in the limit when ε → 0, reduces to problem (2.11). Once the latter problem is solved and a critical pair $(\lambda, w_\lambda)$ is selected, the limit problem which obtains from (5.28) and (5.35), that is, the problem consisting of equations (5.30) and (5.37), serves to determine the in-plane displacement field v that should accompany the critical transverse deflection $w_\lambda$.
Appendix A. Geometry of the Deformed Shape of C(ε)
The gradient of a deformation
$$f(x, \zeta) = g(x) + \zeta\,m(x) \tag{3.7}$$
of cylinder C(ε) may be written as
$$F = h_\alpha\otimes c_\alpha + m\otimes z \tag{5.3}$$
in terms of the covariant base vectors
$$h_\alpha = f_{,\alpha} = g_{,\alpha} + \zeta\,m_{,\alpha}, \qquad h_3 = f_{,\zeta} = m \tag{5.2}$$
associated with the deformed shape of C(ε). The inverse of the deformation gradient has the following expression in terms of the contravariant base vectors $h^i$:
$$F^{-1} = c_\alpha\otimes h^\alpha + z\otimes m \tag{A.1}$$
(note that $h^3 = m$). Since
$$\det F = Fc_1\times Fc_2\cdot Fz = h_1\times h_2\cdot m, \tag{A.2}$$
we find that
$$\det F(x, \zeta) = (\det F)^{(0)}(x) + \zeta\,(\det F)^{(1)}(x) + \zeta^2\,(\det F)^{(2)}(x), \tag{A.3}$$
where
$$(\det F)^{(0)} = |g_{,1}\times g_{,2}|, \qquad (\det F)^{(1)} = (g_{,1}\times m_{,2} + m_{,1}\times g_{,2})\cdot m, \qquad (\det F)^{(2)} = (m_{,1}\times m_{,2})\cdot m. \tag{A.4}$$
Expressions for the contravariant base vectors $h^\alpha$ in terms of the covariant vectors $h_i$ are:
$$h^1 = (\det F)^{-1}\,h_2\times m, \qquad h^2 = (\det F)^{-1}\,m\times h_1, \tag{A.5}$$
whence
$$h^1\cdot h^1 = (\det F)^{-2}\,h_2\cdot h_2, \qquad h^2\cdot h^2 = (\det F)^{-2}\,h_1\cdot h_1, \qquad -h^1\cdot h^2 = (\det F)^{-2}\,h_1\cdot h_2. \tag{A.6}$$
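It follows, in particular, that there are fields $A^{\alpha\beta}(x, \zeta)$ such that
$$h^\alpha\cdot h^\beta = A^{\alpha\beta}{}_{,\zeta}. \tag{A.7}$$
These fields have the form
$$A^{11}(x, \zeta) = A^{11}(x, \zeta_0) + \int_{\zeta_0}^{\zeta} \frac{(h_2\cdot h_2)^{(0)} + \chi\,(h_2\cdot h_2)^{(1)} + \chi^2\,(h_2\cdot h_2)^{(2)}}{\big((\det F)^{(0)} + \chi\,(\det F)^{(1)} + \chi^2\,(\det F)^{(2)}\big)^2}\,d\chi,$$
etc.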
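Appendix B. The Strain Field in C(ε)
In view of definition (3.1), the strain tensor D takes the form
$$D = \frac{1}{2}\big((h_\alpha\cdot h_\beta)\,c_\alpha\otimes c_\beta - c_\alpha\otimes c_\alpha\big). \tag{B.1}$$
Now, due to the first two of (5.2),
$$h_\alpha\cdot h_\beta\,(x, \zeta) = (h_\alpha\cdot h_\beta)^{(0)}(x) + \zeta\,(h_\alpha\cdot h_\beta)^{(1)}(x) + \zeta^2\,(h_\alpha\cdot h_\beta)^{(2)}(x), \tag{B.2}$$
where
$$(h_\alpha\cdot h_\beta)^{(0)} = g_{,\alpha}\cdot g_{,\beta}, \qquad (h_\alpha\cdot h_\beta)^{(1)} = g_{,\alpha}\cdot m_{,\beta} + m_{,\alpha}\cdot g_{,\beta} = -2\,g_{,\alpha\beta}\cdot m, \qquad (h_\alpha\cdot h_\beta)^{(2)} = m_{,\alpha}\cdot m_{,\beta}. \tag{B.3}$$
Consequently,
$$D(x, \zeta) = D^{(0)}(x) + \zeta\,D^{(1)}(x) + \zeta^2\,D^{(2)}(x), \tag{B.4}$$
with
$$D^{(0)} = \frac{1}{2}\big((h_\alpha\cdot h_\beta)^{(0)} - \delta_{\alpha\beta}\big)\,c_\alpha\otimes c_\beta, \qquad D^{(1)} = \frac{1}{2}\,(h_\alpha\cdot h_\beta)^{(1)}\,c_\alpha\otimes c_\beta, \qquad D^{(2)} = \frac{1}{2}\,(h_\alpha\cdot h_\beta)^{(2)}\,c_\alpha\otimes c_\beta. \tag{B.5}$$
To make explicit the dependence of the strain field in C(ε) on the surface gradients of the parameter fields w, v over P, we recall that
$$g_{,\alpha} = c_\alpha + v_{,\alpha} + w_{,\alpha}\,z, \tag{3.37}$$
whence
$$g_{,\alpha}\cdot g_{,\beta} - \delta_{\alpha\beta} = w_{,\alpha}\,w_{,\beta} + v_{\alpha,\beta} + v_{\beta,\alpha} + v_{\gamma,\alpha}\,v_{\gamma,\beta}, \tag{B.6}$$
$$g_{,\alpha\beta} = w_{,\alpha\beta}\,z + v_{,\alpha\beta}; \tag{B.7}$$
and that $m = \operatorname{uni}(g_{,1}\times g_{,2})$, with
$$g_{,1}\times g_{,2} = z - \nabla w + (c_1 + w_{,1}z)\times v_{,2} + v_{,1}\times(c_2 + w_{,2}z) + v_{,1}\times v_{,2}$$
$$\phantom{g_{,1}\times g_{,2}} = -\big(w_{,1}(1 + v_{2,2}) - w_{,2}\,v_{2,1}\big)\,c_1 - \big(w_{,2}(1 + v_{1,1}) - w_{,1}\,v_{1,2}\big)\,c_2 + (1 + v_{1,1} + v_{2,2} + v_{1,1}v_{2,2} - v_{1,2}v_{2,1})\,z. \tag{B.8}$$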
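We can then write relations (B.5) in the following form:
$$\begin{aligned}
D^{(0)} &= \tfrac{1}{2}\, \nabla w \otimes \nabla w + D(v), \qquad D(v) := \operatorname{sym}(\nabla v) + \tfrac{1}{2} (\nabla v)^T \nabla v,\\
D^{(1)} &= -(m \cdot z)\, \nabla\nabla w - (m \cdot v_{,\alpha\beta})\, c_\alpha \otimes c_\beta,\\
D^{(2)} &= \tfrac{1}{2}\, (m_{,\alpha} \cdot m_{,\beta})\, c_\alpha \otimes c_\beta.
\end{aligned} \tag{B.9}$$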
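Appendix C. The Active Stress Field in $C(\varepsilon)$

In order to find relations similar to (B.4) and (B.9) for the active stress, we first write the constitutive relation (3.26) in the form
$$S^{(A)} = \mathbb{S}[D], \qquad \mathbb{S} := \frac{E}{1+\nu}\Big(\mathbb{I} + \frac{\nu}{1-\nu}\, 1 \otimes 1\Big). \tag{C.1}$$
Secondly, we insert (B.4) into (C.1) and get
$$S^{(A)}(x, \zeta) = S^{(0)}(x) + \zeta\, S^{(1)}(x) + \zeta^2 S^{(2)}(x), \tag{C.2}$$
with
$$S^{(i)} = \mathbb{S}[D^{(i)}], \qquad i = 0, 1, 2; \tag{C.3}$$
more explicitly,
$$\begin{aligned}
S^{(0)} &= \frac{E}{2(1+\nu)}\Big(\nabla w \otimes \nabla w + \frac{\nu}{1-\nu}\, |\nabla w|^2\, 1\Big) + S^{(0)}(v),\\
S^{(1)} &= -\frac{E}{1+\nu}\, (m \cdot z)\Big(\nabla\nabla w + \frac{\nu}{1-\nu}\, (\Delta w)\, 1\Big) + S^{(1)}(w, v),\\
S^{(2)} &= \frac{E}{2(1+\nu)}\Big((m_{,\alpha} \cdot m_{,\beta})\, c_\alpha \otimes c_\beta + \frac{\nu}{1-\nu}\, |m_{,\alpha}|^2\, 1\Big),
\end{aligned} \tag{C.4}$$
where
$$S^{(0)}(v) := \mathbb{S}[D(v)], \qquad S^{(1)}(w, v) := -\mathbb{S}[(m \cdot v_{,\alpha\beta})\, c_\alpha \otimes c_\beta]. \tag{C.5}$$
It follows from (3.20) and (C.1) that the thickness average of the stress field in $C(\varepsilon)$ is the following field over $P$:
$$\bar{S} = \bar{S}^{(A)} = S^{(0)} + \tfrac{1}{3}\, \varepsilon^2 S^{(2)}.$$
In particular,
$$S^{(1)}(w, v) = -\frac{E}{1+\nu}\Big((m \cdot v_{,\alpha\beta})\, c_\alpha \otimes c_\beta + \frac{\nu}{1-\nu}\, (m \cdot v_{,\alpha\alpha})\, 1\Big). \tag{C.6}$$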
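Appendix D. From (5.18) to (5.21)

D.1. THE FIRST INTEGRAL IN (5.18)

In view of (A.7), an integration by parts taking (5.15) into account yields
$$\int_{-\varepsilon}^{+\varepsilon} (h^\alpha \cdot h^\beta) \Big( \int_{-\varepsilon}^{\zeta} (S^{(A)}_{\delta\gamma} h_\delta)_{,\gamma} \cdot h_\beta \, d\tau \Big) d\zeta = -\int_{-\varepsilon}^{+\varepsilon} A^{\alpha\beta}\, (S^{(A)}_{\delta\gamma} h_\delta)_{,\gamma} \cdot h_\beta \, d\zeta.$$
Expanding the differentiations indicated, we have that
$$\begin{aligned}
\int_{-\varepsilon}^{+\varepsilon} \big(A^{\alpha\beta}\, (S^{(A)}_{\delta\gamma} h_\delta)_{,\gamma} \cdot h_\beta\big)_{,\alpha}\, d\zeta
&= \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\delta\gamma,\gamma\alpha}\, (A^{\alpha\beta} h_\beta \cdot h_\delta)\, d\zeta + \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\delta\gamma,\gamma}\, (A^{\alpha\beta} h_\beta \cdot h_\delta)_{,\alpha}\, d\zeta\\
&\quad + \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\delta\gamma,\alpha}\, (A^{\alpha\beta} h_{\delta,\gamma} \cdot h_\beta)\, d\zeta + \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\delta\gamma}\, (A^{\alpha\beta} h_{\delta,\gamma} \cdot h_\beta)_{,\alpha}\, d\zeta.
\end{aligned}$$
The expressions (C.1)–(C.5) for $S^{(A)}$ must now be inserted into each of the four integrals on the right side of the last relation. We concentrate on the first integral, the most important because it gives rise to the principal part of the operator $I_1$, which takes the following form:
$$\begin{aligned}
\int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\delta\gamma,\gamma\alpha}\, (A^{\alpha\beta} h_\beta \cdot h_\delta)\, d\zeta
&= \Big( \int_{-\varepsilon}^{+\varepsilon} A^{\alpha\beta} h_\beta \cdot h_\delta \, d\zeta \Big) S^{(0)}_{\delta\gamma,\gamma\alpha} + \Big( \int_{-\varepsilon}^{+\varepsilon} \zeta\, A^{\alpha\beta} h_\beta \cdot h_\delta \, d\zeta \Big) S^{(1)}_{\delta\gamma,\gamma\alpha}\\
&\quad + \Big( \int_{-\varepsilon}^{+\varepsilon} \zeta^2 A^{\alpha\beta} h_\beta \cdot h_\delta \, d\zeta \Big) S^{(2)}_{\delta\gamma,\gamma\alpha}.
\end{aligned}$$
What now counts is the second addendum. With (C.4)$_3$ and (C.5)$_2$, we find that
$$S^{(1)}_{\delta\gamma,\gamma\alpha} = -\frac{E}{1-\nu^2}\, (m \cdot z)\, (\Delta w)_{,\delta\alpha} + \big(S^{(1)}(w, v)\big)_{\delta\gamma,\gamma\alpha}. \tag{D.1}$$
We set
$$B^{\alpha\delta}(\varepsilon; w, v) := \frac{1}{2\varepsilon} \int_{-\varepsilon}^{+\varepsilon} \zeta\, A^{\alpha\beta} h_\beta \cdot h_\delta \, d\zeta, \tag{D.2}$$
and, finally, write the principal part of $I_1$ as
$$\mathrm{p.p.}\big(I_1(\varepsilon; w, v)\big) = \frac{E}{1-\nu^2}\, (m \cdot z)\, B^{\alpha\delta}(\varepsilon; w, v)\, (\Delta w)_{,\delta\alpha}. \tag{D.3}$$

D.2. THE SECOND INTEGRAL IN (5.18)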
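Since
$$h_{\beta,\alpha} \cdot m = (m \cdot z)\, w_{,\alpha\beta} + m \cdot v_{,\alpha\beta} - \zeta\, m_{,\alpha} \cdot m_{,\beta}, \tag{D.4}$$
the second integral in (5.18) can be written as
$$\big((m \cdot z)\, w_{,\alpha\beta} + m \cdot v_{,\alpha\beta}\big) \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\beta\alpha}\, d\zeta - (m_{,\alpha} \cdot m_{,\beta}) \int_{-\varepsilon}^{+\varepsilon} \zeta\, S^{(A)}_{\beta\alpha}\, d\zeta.$$
Now, by the first of (4.12),
$$\frac{1}{2\varepsilon} \int_{-\varepsilon}^{+\varepsilon} S^{(A)}\, d\zeta = E\, R(\nabla\nabla\varphi) R^T, \tag{D.5}$$
so that
$$w_{,\alpha\beta} \int_{-\varepsilon}^{+\varepsilon} S^{(A)}_{\beta\alpha}\, d\zeta = (2\varepsilon) E\, [w, \varphi] = (2\varepsilon) E\, [\varphi, w]. \tag{D.6}$$
Moreover, by (C.2) and (C.3),
$$\frac{1}{2\varepsilon} \int_{-\varepsilon}^{+\varepsilon} \zeta\, S^{(A)}\, d\zeta = \tfrac{1}{3}\, \varepsilon^2 S^{(1)}, \tag{D.7}$$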
with $S^{(1)}$ depending on $w$ and $v$ as specified by (C.4)$_3$ and (C.5)$_2$. With this, we conclude that
$$\begin{aligned}
I_2(\varepsilon; w, v, \varphi) &:= \frac{1}{2\varepsilon} \int_{-\varepsilon}^{+\varepsilon} (h_{\beta,\alpha} \cdot m)\, S^{(A)}_{\beta\alpha}\, d\zeta\\
&= E (m \cdot z)\, [\varphi, w] + E (m \cdot v_{,\alpha\beta}) \big(R(\nabla\nabla\varphi) R^T \cdot c_\alpha \otimes c_\beta\big) - \tfrac{1}{3}\, \varepsilon^2 (m_{,\alpha} \cdot m_{,\beta})\, S^{(1)}_{\alpha\beta}.
\end{aligned} \tag{D.8}$$
In (D.6) we have used an identity that follows from (2.5) and the second of (4.12), namely,
$$[a, b] = \nabla\nabla a \cdot R(\nabla\nabla b) R^T.$$

Acknowledgements

I have benefited from comments by two referees. This work has been supported by the Progetti Cofinanziati 2000 and 2002 “Modelli Matematici per la Scienza dei Materiali” and by TMR Contract FMRX-CT98-0229 “Phase Transitions in Crystalline Solids”.
Cauchy’s Flux Theorem in Light of Geometric Integration Theory

G. RODNAY and R. SEGEV
Department of Mechanical Engineering, Ben-Gurion University, P.O. Box 653, Beer-Sheva 84105, Israel. E-mail: {rodnay;rsegev}@bgumail.bgu.ac.il

Received 24 June 2002; in revised form 17 January 2003

Abstract. This work presents a formulation of Cauchy’s flux theory of continuum mechanics in the framework of geometric integration theory as formulated by H. Whitney and extended recently by J. Harrison. Starting with convex polygons, one constructs a formal vector space of polyhedral chains. A Banach space of chains is obtained by a completion process of this vector space with respect to a norm. Then, integration operators, cochains, are defined as elements of the dual space to the space of chains. Thus, the approach links the analytical properties of cochains with the corresponding properties of the domains in an optimal way. The basic representation theorem shows that cochains may be represented by forms. The form representing a cochain is a geometric analog of a flux field in continuum mechanics.

Mathematics Subject Classifications (2000): 73A05, 58A05.

Key words: continuum mechanics, flux, Cauchy’s theorem, geometric integration, chains, cochains, flat, sharp, natural.
Dedicated to the memory of Clifford Truesdell who by his work and personality inspired the research of generations of scientists.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 699–719. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The Cauchy Theorem for the existence of stresses and fluxes is one of the fundamental results of continuum mechanics. Over the years, research work contributed to the subject by making the proof more rigorous, by weakening the postulates needed to prove the theorem, by extending the circumstances under which it is valid, and by proving the existence of stresses and fluxes using alternative methods and approaches.

In terms of scalar fluxes in space, the basic notions of flux theory may be described as follows. One considers the total flux $T(\partial B)$ of an extensive property $P$ through the boundary $\partial B$ of the region $B$ in a three dimensional Euclidean space. The total flux is assumed to be given as an integral of the flux density $t_B$ associated with the region $B$, a scalar field defined on $\partial B$, in the form
$$T(\partial B) = \int_{\partial B} t_B \, dA.$$
The dependence of the flux density $t_B$ on the region $B$ is considered next and it is assumed that at each point $p$, $t_B(p)$ depends on $B$ only through the unit normal vector $n$ to $\partial B$ at $p$, so one writes $t(p, n)$ for the corresponding value. Then, one assumes that the total flux is balanced by the rate of decrease of the total amount of the property $P$ in $B$ as given in terms of an integral of a scalar field $b$ over $B$, so
$$\int_{\partial B} t_B \, dA = -\int_B b \, dV.$$
Assuming that the dependence of $t(p, n)$ on $p$ is continuous, one proves Cauchy’s theorem asserting that $t(p, n)$ depends linearly on $n$. Thus, there is a vector field $\tau$ such that $t = \tau \cdot n$, where the dependence on $p$ was suppressed in the notation. Considering smooth regions such that Gauss’ theorem may be applied, the balance may be written in the form of a differential equation as $\operatorname{div} \tau + b = 0$.

The first major contribution within the continuum mechanics community was made by Noll in 1957 [11]. Noll was able to prove the dependence of the flux on the normal vector, using a weaker assumption of locality, namely, the flux density $t_B(p)$ is equal for two regions if the intersection of their boundaries contains an open neighborhood of $p$. Gurtin and Williams in [5] and later works [12, 6, 13] use alternative assumptions, bi-additivity and boundedness, to obtain both locality and the representation of total flux in terms of the integral of a flux density. Specifically, assuming that the collection of admissible regions has the structure of a Boolean lattice, it is required that $T$ be given in terms of a mapping of pairs of bodies $I(A, B)$, so $T(A) = I(A, B)$ if $B$ is the complement of $A$. For separate (disjoint) domains $A$ and $B$, $I$ satisfies $I(A \cup B, C) = I(A, C) + I(B, C)$ and $I(C, A \cup B) = I(C, A) + I(C, B)$. Then, it is assumed that $|I(A, B)| \le l\, \mathrm{area}(\partial A \cap \partial B) + k\, \mathrm{volume}(A)$ (the boundedness assumption) to obtain locality. To prove Cauchy’s theorem it is assumed also that $I(A, B) = -I(B, A)$ and that the dependence of $\tau(p, n)$ on $p$ is continuous.

In [4] Gurtin and Martins prove the linearity in $n$ of $t(p, n)$ almost everywhere, while using similar additivity and boundedness assumptions but relaxing the hypothesis that $t(p, n)$ is a continuous function of $p$. In [20, 21] Šilhavý uses a weak approach to prove the existence of stress tensors or flux vectors.
Admissible bodies are sets of finite perimeter in $E^n$, and the assumptions and results pertain to “almost every subbody” in a way which allows singularities. The resulting flux vector $\tau$ has an $L^p$ weak divergence. Degiovanni et al. [1] generalize [20, 21] by considering flux mappings $T$ whose corresponding flux vector fields $\tau$ are only locally integrable. The field $b = -\operatorname{div} \tau$ is meaningful only in the weak sense.

In order to present results that hold for domains and flux fields that are increasingly irregular, the works cited above rely on geometric measure theory of Federer [2] and de Giorgi (see [14]). For example, tools of geometric measure theory are used for choosing a universe of bodies, for a measure theoretic definition of the normal vector, and for using generalizations of Gauss’ theorem to such irregular domains.

Another approach for proving Cauchy’s theorem directly from an integral balance equation is introduced in [3]. In this paper a variational approach is taken to prove the linear dependence on the normal starting from a weaker locality postulate.

Stress theory for manifolds that are not equipped with a metric is presented in [15] from a weak point of view. Forces are defined as elements of the dual space of the Banach space of $C^k$-sections of a vector bundle over the body. Stresses are Borel measures valued in the dual of a jet bundle and they represent forces using a representation theorem. Further analytical aspects of the theory are presented in [18]. In particular, as the theory introduces continuum mechanics of order $k$ as corresponding to the space of $C^k$-sections, general consistency conditions that are analogous to Cauchy’s postulates are formulated for arbitrary values of $k$ and for stresses as irregular as Borel measures. In [16], and the following [19, 17], the analog of the classical Cauchy theorem is presented for differentiable manifolds.

In 1947 and 1948 Whitney [22] and Wolfe [24] presented a geometric theory of $r$-dimensional integration in an $n$-dimensional Euclidean space. A comprehensive treatment [23] of the theory was published by Whitney in 1957. While geometric measure theory received a lot of attention because of its relevance to the Plateau problem, the mathematical work continuing Whitney’s geometric integration theory is limited. In [7] and the following [8–10] Harrison made important extensions to Whitney’s work. To the best of our knowledge, Whitney’s abstract geometric integration theory was never used in the formulation of Cauchy flux theory in continuum mechanics.
It is our objective here to present the Cauchy flux theory from the point of view of geometric integration. In addition to offering a different approach to flux theory, the following features make it eminently suitable. Firstly, the theory considers various aspects, namely the collection of domains, integration, Stokes’ theorem (the analog of Gauss’ theorem), and fluxes, from a unified point of view. The properties and degrees of regularity of the various variables are linked. Thus, one may consider less regular domains if one is willing to consider smoother fluxes. In fact, the regions may be as irregular as the Dirac measure and its derivatives if one is willing to admit differentiable flux fields. On the other hand, the flux fields may be as irregular as essentially bounded and measurable functions when the boundaries are as irregular as the graph of an $L^1$-mapping. The way the theory is constructed, the relation between the regularity properties of domains and fluxes is optimal in the following sense. The class of domains is the largest class for which the evaluation of the various fluxes is continuous. Conversely, the class of fluxes is the largest class such that the total fluxes depend continuously on the domains.
The codimension $n - r$ is not limited to the value of 1 as in regular Cauchy flux theory. It follows that the theory may be used to formulate flux theory on membranes, strings, etc. Furthermore, the theory does not require that the $r$-dimensional domains be smooth. In fact, it permits for example the calculation of flux through a 1-dimensional “arc” on a 2-dimensional domain in $\mathbb{R}^3$ which is itself the graph of an $L^1$-mapping. In other words, not only the boundary is irregular, but so is the domain itself. Finally, the construction of continuous chains creates a bridge between the classical and weak formulation of the theory.

The elegance of the structure enables its description in just a few sentences. One starts with the building blocks, $r$-dimensional oriented cells (convex polygons) in an $n$-dimensional Euclidean space $E^n$. Then, the formal vector space of linear combinations of $r$-cells is considered, where two linear combinations $A$ and $B$ are identified if they may be further subdivided to obtain a common subdivision. The elements of this vector space are called polyhedral $r$-chains. Then, the space of polyhedral chains is completed with respect to a norm to obtain a Banach space. The elements of the resulting complete space are called either flat, sharp, or natural $r$-chains depending on the norms used, and chains collectively. Integration operators are referred to as $r$-cochains and they are defined as continuous linear operators on the space of chains.

The application of geometric integration theory to Cauchy flux theory is based on the identification of a total flux operator on regions with a cochain. In other words, a cochain is analogous to a total flux operator acting on the various domains to produce real numbers. In the case of traditional continuum mechanics, the total flux is regarded as a 2-cochain in $E^3$.
The analog of Cauchy’s flux theorem is a representation theorem stating that a cochain may be represented by an $r$-form, an antisymmetric $r$-tensor in $E^n$, using integration. As mentioned earlier, the analytical properties of chains and forms representing cochains are determined by the norm used. The topology on the space of chains allows one to extend various operations, e.g., integrals and boundaries, from polyhedral chains to the chains obtained as limits of sequences.

Federer [2, pp. 367–378] introduces flat chains as currents (roughly, continuous linear functionals on the space of smooth forms with compact supports, the geometric analogs of Schwartz distributions) and defines the flat norm as the norm induced on the dual space by the norm $\|\varphi\| = \sup_p \{|\varphi(p)|, |d\varphi(p)|\}$ on the space of smooth forms. While Federer’s treatment of flat chains is concise and elegant, it does not contain the analogs for sharp and natural chains. In addition, it seems to us that Whitney’s approach is closer in spirit to the traditional approach of continuum mechanics. Furthermore, as Federer states in [2, p. 378], his main interest has been in chains while Whitney’s main concern has been with cochains, the objects representing Cauchy fluxes of continuum mechanics.

It is noted that the expression for the representation of cochains in terms of forms also applies on general manifolds rather than a Euclidean space. In addition, while the definitions of the various norms utilize the metric structure of $E^n$,
the various topological spaces of chains remain invariant under diffeomorphisms. This suggests an extension of the theories to general manifolds. However, a formal presentation of such a theory is not available yet and will not be considered here. Thus, the basic constructions, results, and applications to Cauchy flux theory are described below. For details of the mathematical constructions and proofs see [23] and [8, 9]. We start in Section 2 with the basic building blocks: polyhedral chains and integration on polyhedral chains. Section 3 considers the construction of the various Banach spaces of chains and Section 4 presents the definitions and basic properties associated with cochains – the analogs of the Cauchy flux operators. The Cauchy theorem of fluxes is implied by the representation theorem of cochains by forms as presented in Section 5. Finally, Section 6 considers the extension of the exterior derivative to non-smooth forms through the notion of a coboundary, and the resulting local balance equation.
2. Chains and Integration

2.1. CELLS AND POLYHEDRAL CHAINS

We start with a review of the basic definitions related to integration on chains in an $n$-dimensional Euclidean space $E^n$ whose associated vector space is $V$. A cell, $\sigma$, is a non-empty bounded subset of $E^n$ expressed as an intersection of a finite collection of half spaces. The plane of $\sigma$ is the smallest affine subspace containing $\sigma$, and the dimension of $\sigma$ is the dimension of its plane. We refer to $r$-dimensional cells as $r$-cells. An oriented $r$-cell is an $r$-cell with a choice of one of the two orientations of the vector space associated with its plane. The cell $-\sigma$ is the cell that contains the same points as $\sigma$ but has the opposite orientation.

The boundary of an oriented $r$-cell, $\partial\sigma$, is a collection of oriented $(r-1)$-cells. The boundary of a 1-cell consists of two points, and a 0-cell has no boundary. The orientations of the cells that make up the boundary $\partial\sigma$ are determined by the orientation of $\sigma$ in the following way. Given a cell $\sigma' \subset \partial\sigma$, let $v_2, \ldots, v_r$ be a collection of $r - 1$ independent vectors that belong to the plane of $\sigma'$. Then, this collection is positively oriented if, given a vector $v_1$ at $\sigma'$ that belongs to the plane of $\sigma$ and points out of $\sigma$, the collection $(v_1, \ldots, v_r)$ is positively oriented relative to $\sigma$. The boundary of a 1-cell oriented by the vector $\overrightarrow{pq}$ consists of the two 0-cells $q$ positive and $p$ negative.

Oriented cells are the building blocks of chains. A polyhedral $r$-chain in $E^n$ is an element of the vector space spanned by formal linear combinations of $r$-cells, together with the following properties.
(1) The polyhedral chain $1\sigma$ is identified with the cell $\sigma$.
(2) We associate multiplication of a cell by $-1$ with the operation of inversion of orientation, i.e., $-1\sigma = -\sigma$.
(3) If an oriented cell $\sigma$ is cut into several cells $\sigma_1, \ldots, \sigma_m$, then $\sigma$ and $\sigma_1 + \cdots + \sigma_m$ are identified as polyhedral chains. Thus, we identify the union of oriented $r$-cells having disjoint interiors with the polyhedral $r$-chain which is the sum of the $r$-cells.

Polyhedral 0-chains are expressions of the form $\sum a_i p_i$, where $p_i$ are points. The boundary of a cell is thus a chain, the sum of the various oriented cells that make up the boundary as above. The space of polyhedral $r$-chains in $E^n$ is now an infinite-dimensional vector space denoted by $A_r(E^n)$. The boundary of a polyhedral $r$-chain $A = \sum a_i \sigma_i$ is a polyhedral $(r-1)$-chain defined to be $\partial A = \sum a_i\, \partial\sigma_i$. The boundary of a polyhedral 0-chain is 0. Note that by this definition $\partial$ is a linear operator $A_r(E^n) \to A_{r-1}(E^n)$.
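The linear boundary operator described above can be illustrated with a small sketch. It rests on simplifying assumptions that are not in the text: cells are restricted to simplexes given by ordered vertex labels, the orientation is encoded by the parity of the vertex ordering, and the function names are hypothetical.

```python
from collections import defaultdict

# Illustrative sketch: polyhedral chains restricted to oriented simplexes.
# A chain is a dict {sorted vertex tuple: coefficient}; this encoding is an
# assumption made for the example, not the construction used in the paper.

def normalize(simplex):
    """Sort the vertices and return (sorted tuple, sign of the sorting permutation)."""
    verts = list(simplex)
    sign = 1
    for i in range(len(verts)):
        for j in range(len(verts) - 1 - i):
            if verts[j] > verts[j + 1]:
                verts[j], verts[j + 1] = verts[j + 1], verts[j]
                sign = -sign
    return tuple(verts), sign

def add_term(chain, simplex, coeff):
    """Add coeff * simplex to the chain, using -1 * sigma = -sigma (property (2))."""
    key, sign = normalize(simplex)
    chain[key] += sign * coeff
    if chain[key] == 0:
        del chain[key]

def make_chain(terms):
    """Build a chain from (simplex, coefficient) pairs."""
    chain = defaultdict(int)
    for simplex, coeff in terms:
        add_term(chain, simplex, coeff)
    return dict(chain)

def boundary(chain):
    """Linear boundary operator; the boundary of a 0-chain is 0."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            add_term(result, face, (-1) ** i * coeff)
    return dict(result)

segment = make_chain([(("p", "q"), 1)])
triangle = make_chain([((0, 1, 2), 1)])
```

In this encoding `boundary(segment)` assigns `q` the coefficient +1 and `p` the coefficient -1, matching the convention stated for a 1-cell oriented by the vector from `p` to `q`, and `boundary(boundary(triangle))` is the empty (zero) chain.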
2.2. MULTIVECTORS

A simple $r$-vector in $V$ is defined in a formal way to be an expression of the form $v_1 \wedge \cdots \wedge v_r$, where $v_i \in V$, the vector space associated with $E^n$. We set $r$-vectors in $V$ to be elements of the vector space $V_r$ of formal linear combinations of simple $r$-vectors, together with the following properties:
(1) $v_1 \wedge \cdots \wedge (v_i + v_i') \wedge \cdots \wedge v_r = v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r + v_1 \wedge \cdots \wedge v_i' \wedge \cdots \wedge v_r$;
(2) $v_1 \wedge \cdots \wedge (a v_i) \wedge \cdots \wedge v_r = a\,(v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r)$;
(3) $v_1 \wedge \cdots \wedge v_i \wedge \cdots \wedge v_j \wedge \cdots \wedge v_r = -v_1 \wedge \cdots \wedge v_j \wedge \cdots \wedge v_i \wedge \cdots \wedge v_r$.

1-vectors are just vectors, and 0-vectors are defined to be real numbers. It is noted that any $r$-vector can be written in various equivalent ways. The various identifications above, in particular the antisymmetry, imply that the dimension of the space of $r$-vectors is
$$\dim V_r = \frac{n!}{(n-r)!\, r!},$$
where $n$ is the dimension of $V$. If $r > n$ then $V_r$ is trivial, i.e., it contains only the zero $r$-vector. Given a basis $\{e_i\}$ of $V$, the $r$-vectors $\{e_{\lambda_1 \ldots \lambda_r} = e_{\lambda_1} \wedge \cdots \wedge e_{\lambda_r}\}$, such that $1 \le \lambda_1 < \cdots < \lambda_r \le n$, form a basis of $V_r$.

Given an oriented $r$-simplex $\sigma$ in $E^n$, with vertices $p_0, \ldots, p_r$, the $r$-vector of $\sigma$, denoted by $\{\sigma\}$, is defined to be $\{\sigma\} = v_1 \wedge \cdots \wedge v_r / r!$, where the vectors $v_i$ are defined by $v_i = p_i - p_0$ and are ordered in such a way that they belong to the orientation of $\sigma$. It is noted that in case $\{\sigma_1\} = a\{\sigma_2\}$ for two $r$-simplexes $\sigma_1$ and $\sigma_2$, then the ratio between the $r$-dimensional volumes of the two simplexes relative to any metric is $|a|$. The $r$-vector of a polyhedral $r$-chain $A = \sum a_i \sigma_i$, where $\sum a_i \sigma_i$ is a simplicial subdivision of $A$, is defined by $\{\sum a_i \sigma_i\} = \sum a_i \{\sigma_i\}$. Clearly, this defines the $r$-vectors of $r$-cells too, as $r$-cells are particular polyhedral $r$-chains.
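As a concrete illustration of the $r$-vector of a simplex, the sketch below computes the components of $\{\sigma\} = v_1 \wedge \cdots \wedge v_r / r!$ relative to the basis $e_{\lambda_1} \wedge \cdots \wedge e_{\lambda_r}$ with increasing indices. It relies on the standard fact that these components are the $r \times r$ minors of the matrix of edge vectors; the function names and the zero-based index convention are assumptions made for the example.

```python
from itertools import combinations
from math import factorial

def det(mat):
    """Determinant by cofactor expansion along the first row (fine for small r)."""
    n = len(mat)
    if n == 0:
        return 1
    if n == 1:
        return mat[0][0]
    total = 0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in mat[1:]]
        total += (-1) ** col * mat[0][col] * det(minor)
    return total

def simplex_r_vector(vertices):
    """Components of {sigma} = v_1 ^ ... ^ v_r / r!, with v_i = p_i - p_0.

    Returns a dict keyed by increasing index tuples (zero-based here),
    one entry per basis r-vector.
    """
    p0 = vertices[0]
    n = len(p0)
    edges = [[p[k] - p0[k] for k in range(n)] for p in vertices[1:]]
    r = len(edges)
    components = {}
    for idx in combinations(range(n), r):
        minor = [[edge[k] for k in idx] for edge in edges]
        components[idx] = det(minor) / factorial(r)
    return components

# A triangle in E^3 lying in the plane spanned by e_1 and e_2.
triangle = simplex_r_vector([(0, 0, 0), (1, 0, 0), (0, 1, 0)])
# The same points with two vertices exchanged, i.e. the opposite orientation.
reversed_triangle = simplex_r_vector([(0, 0, 0), (0, 1, 0), (1, 0, 0)])
```

For this triangle the only non-zero component is the one along $e_1 \wedge e_2$, with value 1/2 (the area), and reversing the orientation flips its sign. The number of components equals $n!/((n-r)!\,r!)$, in agreement with the dimension formula above.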
2.3. MULTI-COVECTORS

The dual space of $V_r$ is denoted by $V^r$ and its elements are referred to as $r$-covectors. We now show how $r$-covectors can be expressed using covectors. We denote by $V^*$ the dual space of $V$, and by $V_r^*$ the space which is constructed exactly like the space $V_r$, but using the (co-)vectors of $V^*$. Hence, elements of $V_r^*$ are expressions of the form $\sum a_i f^{i_1} \wedge \cdots \wedge f^{i_r}$, where $f^{i_j} \in V^*$. The scalar product of elements of $V_r^*$ and elements of $V_r$ is defined by
$$(f^1 \wedge \cdots \wedge f^r) \cdot (v_1 \wedge \cdots \wedge v_r) = \sum_{\lambda} \varepsilon^{\lambda_1 \ldots \lambda_r} f^1(v_{\lambda_1}) \cdots f^r(v_{\lambda_r}) = \det \begin{pmatrix} f^1(v_1) & \cdots & f^1(v_r) \\ \vdots & & \vdots \\ f^r(v_1) & \cdots & f^r(v_r) \end{pmatrix},$$
for simple vectors, and extends linearly to the vector spaces. Here, $\lambda = \{\lambda_1, \ldots, \lambda_r\}$ ranges over the set of all permutations of $(1, \ldots, r)$, and $\varepsilon^{\lambda_1 \ldots \lambda_r}$ is the alternating symbol. Any element $\bar{\tau}$ of $V_r^*$ may be identified with an element $\tau$ of $V^r$ by $\tau(\alpha) = \bar{\tau} \cdot \alpha$ for any $r$-multivector $\alpha$. Furthermore, an element $\tau$ of $V^r$ may be regarded as an alternating multilinear form $\tilde{\tau}$ by $\tau(v_1 \wedge \cdots \wedge v_r) = \tilde{\tau}(v_1, \ldots, v_r)$.
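The scalar product of a simple $r$-covector with a simple $r$-vector can be evaluated directly from the permutation sum. The sketch below is only an illustration: covectors and vectors are represented by their component tuples relative to dual bases (an assumption), and the function names are hypothetical.

```python
from itertools import permutations

def perm_sign(perm):
    """Value of the alternating symbol: +1 for even permutations, -1 for odd ones."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def evaluate(f, v):
    """Value f(v) of a covector on a vector, both given by components in dual bases."""
    return sum(fi * vi for fi, vi in zip(f, v))

def pairing(covectors, vectors):
    """(f^1 ^ ... ^ f^r) . (v_1 ^ ... ^ v_r) as a signed sum over permutations."""
    r = len(vectors)
    total = 0
    for perm in permutations(range(r)):
        term = perm_sign(perm)
        for k in range(r):
            term *= evaluate(covectors[k], vectors[perm[k]])
        total += term
    return total

# f^1, f^2 are the first two dual basis covectors of a 3-dimensional space.
f1, f2 = (1, 0, 0), (0, 1, 0)
v1, v2 = (1, 2, 0), (3, 4, 5)
value = pairing([f1, f2], [v1, v2])
```

Here `value` equals the determinant of the matrix with entries $f^i(v_j)$, namely $1 \cdot 4 - 3 \cdot 2 = -2$. Exchanging the two vectors changes the sign, and repeating a vector gives zero, as expected of an alternating form.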
2.4. INTEGRATION OF FORMS OVER POLYHEDRAL CHAINS

The natural integrands over $r$-chains are $r$-forms. An $r$-form in a set $Q \subset E^n$ is an $r$-covector valued mapping defined in $Q$. An $r$-form is continuous if its components are continuous functions. The Riemann integral of a continuous $r$-form $\tau$ over an $r$-simplex $\sigma$ is defined as
$$\int_\sigma \tau = \lim_{k \to \infty} \sum_{\sigma_{ki} \in S_k\sigma} \tau(p_{ki}) \cdot \{\sigma_{ki}\},$$
where $S_k\sigma$ is a sequence of simplicial subdivisions of $\sigma$ into simplexes $\sigma_{ki}$ with mesh $\to 0$, and each $p_{ki}$ is a point in $\sigma_{ki}$. The Riemann integral of a continuous $r$-form over a polyhedral $r$-chain $A = \sum a_i \sigma_i$ is defined by $\int_A \tau = \sum a_i \int_{\sigma_i} \tau$, where $\sum a_i \sigma_i$ is a simplicial subdivision of the polyhedral chain $A$.

An $r$-form in $E^n$ is bounded and measurable if all its components relative to a basis of $V$ are bounded and measurable. The Lebesgue integral of an $r$-form $\tau$ over an $r$-cell $\sigma$ is defined by
$$\int_\sigma \tau = \int_\sigma \tau(p) \cdot \frac{\{\sigma\}}{|\sigma|} \, dp,$$
where $|\sigma|$ is the $r$-dimensional volume of $\sigma$ and the integral on the right is a Lebesgue integral of a real function. This is extended by linearity to domains that are polyhedral chains by
$$\int_A \tau = \sum_i a_i \int_{\sigma_i} \tau, \qquad \text{if } A = \sum a_i \sigma_i.$$
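The Riemann-sum definition and its linear extension to polyhedral chains can be illustrated in the simplest case $r = 1$. The following sketch makes several assumptions for the sake of a short example: a 1-form is a Python function returning covector components at a point, a 1-chain is a list of `(coefficient, (p, q))` pairs, and each segment is subdivided uniformly with the form evaluated at midpoints.

```python
def integrate_over_segment(tau, p, q, k=100):
    """Riemann sum of a 1-form over the oriented 1-simplex from p to q.

    The segment is cut into k equal sub-simplexes; tau is evaluated at each
    midpoint and paired with the 1-vector (edge vector) of the sub-simplex.
    """
    dim = len(p)
    total = 0.0
    for i in range(k):
        a = [p[d] + (q[d] - p[d]) * i / k for d in range(dim)]
        b = [p[d] + (q[d] - p[d]) * (i + 1) / k for d in range(dim)]
        mid = [(a[d] + b[d]) / 2 for d in range(dim)]
        edge = [b[d] - a[d] for d in range(dim)]
        covector = tau(mid)
        total += sum(c * e for c, e in zip(covector, edge))
    return total

def integrate_over_chain(tau, chain, k=100):
    """Extend the integral by linearity to a chain given as (a_i, (p_i, q_i)) pairs."""
    return sum(a * integrate_over_segment(tau, p, q, k) for a, (p, q) in chain)

def tau(point):
    """The 1-form -y dx + x dy in E^2, returned as covector components."""
    return (-point[1], point[0])

# Boundary of the unit square, traversed counterclockwise, as a 1-chain.
square_boundary = [
    (1, ((0.0, 0.0), (1.0, 0.0))),
    (1, ((1.0, 0.0), (1.0, 1.0))),
    (1, ((1.0, 1.0), (0.0, 1.0))),
    (1, ((0.0, 1.0), (0.0, 0.0))),
]
total_flux = integrate_over_chain(tau, square_boundary)
```

For this particular form the integral over the boundary of the unit square is 2, twice the enclosed area, and multiplying every coefficient by $-1$ (reversing the orientation) changes the sign of the result.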
2.5. STOKES’ THEOREM FOR POLYHEDRAL CHAINS

The exterior derivative of a differentiable $r$-form $\tau$ is an $(r+1)$-form $d\tau$ defined by
$$d\tau(p) \cdot (v_1 \wedge \cdots \wedge v_{r+1}) = \sum_{i=1}^{r+1} (-1)^{i-1}\, \nabla_{v_i} \tau(p) \cdot (v_1 \wedge \cdots \wedge \hat{v}_i \wedge \cdots \wedge v_{r+1}),$$
where $\hat{v}_i$ denotes a vector that has been omitted, and $\nabla_{v_i}$ is a directional derivative operator. The last definition is represented using coordinates by
$$(d\tau)_{\lambda_1 \ldots \lambda_{r+1}}(p) = \sum_{i=1}^{r+1} (-1)^{i-1}\, \frac{\partial}{\partial x^{\lambda_i}}\, \tau_{\lambda_1 \ldots \hat{\lambda}_i \ldots \lambda_{r+1}}(p).$$
Stokes’ theorem for polyhedral chains, based on the fundamental theorem of differential calculus, states that
$$\int_A d\tau = \int_{\partial A} \tau$$
for every differentiable $r$-form $\tau$ and every polyhedral $(r+1)$-chain $A$.

3. Banach Spaces of Chains

3.1. FLAT CHAINS

The mass of a polyhedral $r$-chain $A = \sum a_i \sigma_i$ in $E^n$ is defined to be $|A| = \sum |a_i|\,|\sigma_i|$, where $|\sigma_i|$ denotes the $r$-dimensional volume of $\sigma_i$. Thus, in case the interiors of the cells $\sigma_i$ of a polyhedral $r$-chain do not intersect, and if $a_i = 1$, then the mass of the polyhedral chain is exactly its $r$-dimensional volume.

DEFINITION 3.1. The flat norm, $|A|^\flat$, of a polyhedral $r$-chain $A$ in $E^n$ is defined by $|A|^\flat = \inf\{|A - \partial D| + |D|\}$, using all polyhedral $(r+1)$-chains $D$.
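To get a feel for the definition of the flat norm, one can evaluate the quantity $|A - \partial D| + |D|$ for a single admissible choice of $D$, which gives an upper bound on the infimum. The sketch below does this for a 1-chain $A$ made of two parallel, oppositely oriented segments of length $L$ a distance $d$ apart, with $D$ the rectangle between them. The chain representation (a dict of oriented segments that only cancels identical segments, without subdivision) and all names are simplifying assumptions for this example.

```python
from math import dist

def add_segment(chain, p, q, coeff=1):
    """Add coeff times the oriented segment p -> q; reversed segments get a minus sign."""
    if p > q:
        p, q, coeff = q, p, -coeff
    chain[(p, q)] = chain.get((p, q), 0) + coeff
    if chain[(p, q)] == 0:
        del chain[(p, q)]

def mass(chain):
    """Mass of a 1-chain: sum of |a_i| times the length of sigma_i."""
    return sum(abs(a) * dist(p, q) for (p, q), a in chain.items())

def flat_norm_bound_for_pair(length, gap):
    """Return (|A|, |A - dD| + |D|) for two opposite parallel segments.

    D is the rectangle spanned by the two segments, oriented counterclockwise,
    so that |D| = length * gap. The second value bounds the flat norm from above.
    """
    chain_a = {}
    add_segment(chain_a, (0.0, 0.0), (length, 0.0))
    add_segment(chain_a, (length, gap), (0.0, gap))
    corners = [(0.0, 0.0), (length, 0.0), (length, gap), (0.0, gap)]
    difference = dict(chain_a)
    for i in range(4):
        # Subtract the boundary of D edge by edge.
        add_segment(difference, corners[i], corners[(i + 1) % 4], -1)
    return mass(chain_a), mass(difference) + length * gap

mass_a, bound = flat_norm_bound_for_pair(1.0, 0.125)
```

With `length = 1.0` and `gap = 0.125` the mass of $A$ is 2, while the bound obtained from the rectangle is $(L + 2)d = 0.375$: the long sides cancel against $\partial D$, leaving only the two short sides and the area of $D$.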
We note that it is not immediate that $|\cdot|^\flat$ is indeed a norm. Furthermore, the actual calculation of the flat norm may be quite complicated even for simple $r$-chains. (For example, consider a 1-chain in the plane consisting of two oriented line segments.) Taking $D = 0$ above, it is clear that $|A|^\flat \le |A|$.

Completing the space $A_r(E^n)$ with respect to the flat norm gives a Banach space denoted by $A_r^\flat(E^n)$. That is, $A_r^\flat(E^n)$ contains the formal limits of all the sequences of polyhedral $r$-chains $A_i$ such that $\lim_{i \to \infty} |A_{i+1} - A_i|^\flat = 0$. Elements of $A_r^\flat(E^n)$ which are limits of such sequences are sometimes denoted by $\lim^\flat A_i$. We refer to elements of $A_r^\flat(E^n)$ as flat $r$-chains in $E^n$. If there are no intersections between cells and all coefficients have the value of 1, we identify the flat chain with the set that contains its points.

EXAMPLE 3.2. Consider the sequence of 1-chains $(A_i)$ in $E^2$ such that $A_i = L_i^1 + L_i^2$, where $L_i^1$ and $L_i^2$ are 1-simplexes associated with two parallel line segments having the same length $L$ and opposite orientation, and the line segment corresponding to $L_i^2$ is obtained from the line segment corresponding to $L_i^1$ by a translation of distance $d_i$ perpendicularly to its direction. If we take the rectangle generated by the two line segments as $D_i$ in the definition of the flat norm, it follows that $|A_i|^\flat \le (L + 2) d_i$. Thus, if $d_i \to 0$, the sequence $(A_i)$ converges to the zero chain in the flat norm. On the other hand, in the mass norm we have $|A_i - A_{i-1}| = 2L$ for all $i$, so the sequence does not converge. Roughly speaking, the geometrical significance of the flat norm is that, unlike the mass norm, it takes into account how closely the two segments are located. If we let the length of the line segments shrink also, so that for $A_i$, $L = d_i$, then by taking $D_i$ as above we get $|A_i|^\flat \le d_i^2 + 2 d_i$, while taking $D_i = 0$ implies $|A_i|^\flat \le 2 d_i$, so $|A_i|^\flat \to 0$ like $d_i$.

EXAMPLE 3.3. Consider the “staircase” sequence $(B_i)$ shown in Figure 1. Here,
$$A_j = \sum_{l=1}^{2^{j-1}} A_{jl}$$
is the sum of $2^{j-1}$ oriented 1-squares of size $d_j = 1/2^j$, $B_i = B_0 + \sum_{j=1}^{i} A_j$, and we take the limit as $i \to \infty$. Set for each square $A_{jl}$ the cell $D_{jl}$ such that $A_{jl} = \partial D_{jl}$. Then, using $D_{jl}$ in the definition of the flat norm, we get $|A_{jl}|^\flat \le d_j^2 = 2^{-2j}$. Hence, $|B_i - B_{i-1}|^\flat = |A_i|^\flat \le 2^{i-1}\, 2^{-2i}$, so the sequence $(B_i)$ converges.

Figure 1. The staircase.

Flat chains may be used to represent continuous and smooth submanifolds of $E^n$ and even irregular surfaces as shown above. As another example, starting with a triangle on $\mathbb{R}^2$ one may construct a plane in $\mathbb{R}^3$ by mapping the vertices using the values at the vertices of a real valued function $u$ on $\mathbb{R}^2$. One may subdivide the triangle and map the new vertices again using the mapping $u$ to construct a piecewise flat surface in $\mathbb{R}^3$ approximating the graph of $u$. This procedure may be repeated to construct a sequence of 2-chains. If for the function $u$ one uses a continuous function that is nowhere differentiable, one obtains a flat chain that represents a surface that is not rectifiable.

The Riemann integral of a continuous $r$-form $\tau$ over a flat $r$-chain $A = \lim^\flat A_i$ is defined to be $\int_A \tau = \lim \int_{A_i} \tau$, if the limit exists. The boundary of a flat $(r+1)$-chain $A = \lim^\flat A_i$ is defined to be $\partial A = \lim^\flat \partial A_i$. The boundary of a flat $(r+1)$-chain always exists as a flat $r$-chain.

3.2. SHARP CHAINS

Whitney obtained chains that are even less regular than the flat chains by introducing a possibly smaller norm. Thus, more Cauchy sequences will converge and one ends up with a larger completed space.

DEFINITION 3.4. The sharp norm $|A|^\sharp$ of a polyhedral $r$-chain $A = \sum a_i \sigma_i$ is defined by
$$|A|^\sharp = \inf \Big\{ \frac{\sum |a_i|\,|\sigma_i|\,|v_i|}{r + 1} + \Big| \sum a_i\, \mathrm{trans}_{v_i} \sigma_i \Big|^\flat \Big\},$$
using all vectors $v_i \in E^n$, where $\mathrm{trans}_v$ is a translation operator that moves each point $p$ of $\sigma$ to $p + v$, giving a translated cell $\mathrm{trans}_v \sigma$ with the same orientation as $\sigma$.

Clearly, setting all $v_i = 0$, we conclude that $|A|^\sharp \le |A|^\flat$, so the sharp norm defines a coarser topology. Completing the space $A_r(E^n)$ with respect to the sharp
norm gives a Banach space denoted by $A^\sharp_r(E^n)$ whose elements are referred to as sharp chains. It follows that $A^\flat_r(E^n)$ is a Banach subspace of $A^\sharp_r(E^n)$.

EXAMPLE 3.5. Consider again the sequence of pairs of 1-vectors in $\mathbb{R}^2$ of length $d_i$ situated a distance $d_i$ apart as above. Taking $v_1 = 0$, and $v_2$ as the vector such that $\operatorname{trans}_{v_2}$ will cause the two line segments to overlap, so that $|v_2| = d_i$, we have $|A_i|^\sharp \le d_i^2/2$. Hence, for $d_i \to 0$, the sharp norm of the shrinking pairs tends to zero faster than the flat norm.

Consider the “staircase strainer” sequence $(B_i)$ constructed in the unit square as shown in Figure 2. Here, $A_j$ is the sum of $2^{j-1}$ pairs of size $d_j = 1/2^j$, $B_i = B_0 + \sum_{j=1}^{i} A_j$, and we take the limit as $i \to \infty$. For the flat norm we have
$$|B_i - B_{i-1}|^\flat = |A_i|^\flat \approx 2^{i-1}\,\frac{2}{2^i} = 1,$$
so the sequence $(B_i)$ does not converge. On the other hand, for the sharp norm
$$|B_i - B_{i-1}|^\sharp = |A_i|^\sharp \le 2^{i-1}\,\frac{(1/2^i)^2}{2} = \frac{2^{-i}}{4},$$
and the sequence converges. Thus, we will be able to calculate the total flux through the staircase strainer limit. (The extensive property under consideration may flow through the strainer $B_i$ at the horizontal segments only.) Similarly, the “staircase mixer” sequence shown in Figure 3 converges in the sharp norm but not in the flat norm.

Roughly speaking, the difference in behavior between the flat norm and the sharp norm may be described as follows. Consider a sequence $(A_i)$ of shrinking polyhedral $r$-chains of typical size $s_i \to 0$. If $A_i$ is the boundary $\partial B_i$ of a shrinking $(r+1)$-chain $B_i$, then taking $D = B_i$ in the definition of the flat norm, $|A_i|^\flat$ shrinks
Figure 2. The staircase strainer.
Figure 3. The staircase mixer.
like $s_i^{r+1}$. If $A_i$ cannot be represented as the boundary of an $(r+1)$-chain, the $r$-dimensional mass of some subset of $A_i$ will always be present in the definition of the flat norm and hence $|A_i|^\flat$ will shrink like $s_i^r$ only. On the other hand, for the sharp norm, if one can cancel the flat norm of a chain by translating simplexes by vectors of the same order of magnitude as $s_i$, then the price to pay in the definition of the sharp norm is bounded by $s_i^{r+1}$ whether $A_i$ is the boundary of another chain or not.

The Riemann integral of a continuous $r$-form $\tau$ over a sharp $r$-chain $A = \lim A_i$ is defined to be $\int_A \tau = \lim \int_{A_i} \tau$, if the limit exists. It is noted that, a sharp chain being less regular than a flat chain, its boundary need not exist as a sharp chain.

3.3. NATURAL CHAINS

A basic notion of Harrison’s constructions is that of a dipole. A simple $r$-dimensional 0-dipole is an $r$-simplex $\sigma^0$ whose diameter satisfies $\operatorname{diam}(\sigma^0) \le 1$. A simple $r$-dimensional 1-dipole is a chain of the form $\sigma^1 = \sigma^0 - \operatorname{trans}_{v_1}\sigma^0$ for a vector $v_1$ such that $|v_1| \le 1$ and $\operatorname{trans}_{v_1}\sigma^0$ is disjoint from $\sigma^0$. Inductively, a simple $r$-dimensional $j$-dipole is an $r$-chain of the form $\sigma^j = \sigma^{j-1} - \operatorname{trans}_{v_j}\sigma^{j-1}$, where $\sigma^{j-1}$ is a simple $r$-dimensional $(j-1)$-dipole, and $v_j$ is a vector with $|v_j| \le 1$ such that $\operatorname{trans}_{v_j}\sigma^{j-1}$ is disjoint from $\sigma^{j-1}$. A simple $j$-dipole is therefore determined by the simplex $\sigma^0$ and the vectors $v_1, \dots, v_j$. A $j$-dipole is a simplicial chain
$$D^j = \sum_i a_i \sigma_i^j$$
of simple $j$-dipoles.
Given a simple $j$-dipole $\sigma^j$ constructed by the simplex $\sigma^0$ and vectors $v_1, \dots, v_j$, its $j$-dipole mass is defined by $|\sigma^j|_j = |\sigma^0||v_1| \cdots |v_j|$ ($|\sigma^0|$ is the mass of $\sigma^0$). The $j$-dipole mass of the $j$-dipole $D^j = \sum_i a_i \sigma_i^j$ is defined as
$$|D^j|_j = \sum_i |a_i||\sigma_i^j|_j.$$
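The dipole mass just defined is a plain product and a weighted sum, so it can be evaluated directly. The following is a minimal sketch; the function names and the sample numbers are illustrative assumptions, and each simple dipole is described only by the mass of its generating simplex and its list of translation vectors.

```python
import math

def simple_dipole_mass(simplex_mass, translations):
    """j-dipole mass |sigma^0| |v_1| ... |v_j| of a simple j-dipole.

    simplex_mass  -- mass |sigma^0| of the generating simplex
    translations  -- list of the j translation vectors v_1, ..., v_j
    """
    mass = simplex_mass
    for v in translations:
        mass *= math.sqrt(sum(c * c for c in v))  # Euclidean length |v_s|
    return mass

def dipole_mass(terms):
    """j-dipole mass of D^j = sum_i a_i sigma_i^j.

    terms -- list of (a_i, simplex_mass_i, translations_i)
    """
    return sum(abs(a) * simple_dipole_mass(m, vs) for a, m, vs in terms)

# Hypothetical data: a segment of length 0.5 in the plane.
zero_dipole = simple_dipole_mass(0.5, [])
one_dipole = simple_dipole_mass(0.5, [(0.0, 0.5)])
two_dipole = simple_dipole_mass(0.5, [(0.0, 0.5), (0.25, 0.0)])
chain = dipole_mass([(2.0, 0.5, [(0.0, 0.5)]), (-3.0, 0.5, [(0.0, 0.5)])])
print(zero_dipole, one_dipole, two_dipole, chain)
```

Each additional translation of length at most 1 can only reduce the dipole mass, which is why higher-order dipoles are cheap in the natural norms defined next.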
Using the notion of a dipole and the dipole mass, the $k$-natural norm, $k = 1, 2, \dots$, on the space of polyhedral chains is defined by
$$|A|^{\natural_k} = \inf\left\{\sum_{s=0}^{k} |D^s|_s + |C|^{\natural_{k-1}}\right\},$$
where the infimum is taken over all decompositions of $A$ in the form $A = \sum_{s=0}^{k} D^s + \partial C$, for $s$-dipoles $D^s$. The Banach space one obtains by completing the space of polyhedral chains relative to this norm is denoted by $A^k_r$ and its elements are referred to as $k$-natural $r$-chains. Clearly, the 0-natural norm is equivalent to the flat norm. Harrison also defines norms associated with fractional values of r that are related to the Hölder conditions but we omit the discussion of such chains here.

As $k$ increases, the spaces of natural chains become larger, i.e., $A^k_r$ is a Banach subspace of $A^l_r$ for $k < l$. For increasing values of $k$ these spaces contain increasingly irregular chains. For example, various fractals are natural chains, and the $k$th distributional derivative of the Dirac measure on the real line belongs to $A^{k+1}_1$ (see [8]).

For a $k$-natural $r$-chain $A$, let $\tau$ be a form on $A$ that has $k-1$ bounded derivatives and whose $k$th derivative is Lipschitz. The Riemann integral of $\tau$ over a natural $r$-chain $A = \lim^\natural A_i$ is defined to be $\int_A \tau = \lim \int_{A_i} \tau$. Indeed, Harrison shows that the limit always exists, as integrals over polyhedral chains are bounded by the natural norms of the chains. A clear advantage of using the natural norms in comparison with the sharp norm is the behavior of the natural chains under the boundary operator: the boundary operator of polyhedral chains extends to a continuous linear operator $\partial: A^k_r \to A^{k-1}_{r-1}$.

4. Cochains

Cochains are elements of the dual spaces to the Banach spaces of flat, sharp, and natural chains. The basic idea of the application of Whitney’s abstract integration theory to the analysis of Cauchy fluxes is that cochains in the various dual spaces are abstract counterparts of total fluxes. Specifically, for classical continuum mechanics we regard the total flux $T_A$ of a certain extensive property $P$ through a
2-dimensional domain $A$ in $E^3$ as the action $T \cdot A$ of a 2-cochain $T$ on the 2-chain $A$ associated with the domain. For the sake of simplicity we use here the same notation for both the domain and the representing chain. It is noted that chains contain more information than just the domain where they are supported. For example, any continuous function defined on a submanifold of the Euclidean space may be represented as a chain. Obviously, the coefficients for the simplexes that make up the chain will be different from 1 and will represent the values of the function. In such a case, if we interpret the value of the function as a component of a velocity field, the action of a cochain on the chain may be interpreted as the calculation of power. Thus, geometric integration theory combines the classical approaches to flux theory and the variational weak approach. An immediate benefit of using geometric integration theory is that the analysis holds for $r$-chains in $E^n$ for all values of $r \le n$.

The properties of cochains that make them suitable mathematical models for Cauchy fluxes follow firstly from the linearity of their action on chains, which is common to all Banach spaces considered above. Linearity of the action of cochains implies both the additivity and the action–interaction–antisymmetry properties assumed in various formulations of continuum mechanics. For example, given a cochain $T$, we have $T \cdot (-A) = -T \cdot A$. Secondly, the properties of the various cochains are determined by the continuity of their action on chains, which is directly linked to the norm on the respective space of chains. Basic observations regarding the relations between the various norms and the properties we expect fluxes to have will be described below.

4.1. FLAT COCHAINS

Flat $r$-cochains in $E^n$ are the elements of $A^\flat_r(E^n)^*$, the dual space of $A^\flat_r(E^n)$. We will see next how the topology induced by the flat norm is related to traditional assumptions of Cauchy flux theory.
We recall that in various formulations of Cauchy’s flux theory it is assumed that the total flux is bounded by both the volume and area of the corresponding region. That is, there are positive numbers $N_1$ and $N_2$ such that for every region $A$,
$$|T_{\partial A}| \le N_2 |\partial A|, \qquad |T_{\partial A}| \le N_1 |A|,$$
where we use the mass norm to denote both the area and volume of the respective sets. In terms of a cochain $T$ these boundedness conditions will be written as
$$|T \cdot A| \le N_2 |A|, \qquad |T \cdot \partial D| \le N_1 |D|,$$
for any $r$-chain $A$ and an $(r+1)$-chain $D$. Thus,
$$|T \cdot A| = |T \cdot A - T \cdot \partial D + T \cdot \partial D| \le |T \cdot A - T \cdot \partial D| + |T \cdot \partial D| \le N_2 |A - \partial D| + N_1 |D| \le C_T (|A - \partial D| + |D|),$$
where $C_T$ is the least upper bound of all positive numbers satisfying this relation for all $(r+1)$-chains $D$. The basic idea is to look at this relation as a requirement of continuity, $|T \cdot A| \le C_T \|A\|$, for the linear operator $T$. Since $D$ is arbitrary, it is natural to set
$$|A|^\flat = \inf_D \{|A - \partial D| + |D|\}.$$
It follows that the flat norm is the smallest of all norms that make the flux operators satisfying the boundedness condition continuous. As such, upon the completion of the space of polyhedral chains with respect to the flat norm, we obtain the largest Banach space for which the bounded flux operators are continuous. This means that flat chains are the most general geometrical objects for which the action of bounded flux operators is continuous.

Conversely, if we consider norms $|\cdot|_x$ on the space of polyhedral chains and wish to consider the action of a continuous flux functional $T$, then $|T \cdot A| \le C_T |A|_x$. If one requires that $|A|_x \le |A|$ and $|\partial D|_x \le |D|$ for any $r$-chain $A$ and $(r+1)$-chain $D$, then the boundedness conditions are implied by continuity because
$$|T \cdot A| \le C_T |A|_x \le C_T |A|,$$
and
$$|T \cdot \partial D| \le C_T |\partial D|_x \le C_T |D|.$$
In order to admit the most general flux operators that satisfy these conditions we need the largest norm such that $|A|_x \le |A|$ and $|\partial D|_x \le |D|$. Indeed it can be shown that the flat norm is the largest norm satisfying these two conditions.

4.2. SHARP COCHAINS

Sharp $r$-cochains in $E^n$ are elements of $A^\sharp_r(E^n)^*$, the dual space of $A^\sharp_r(E^n)$. Since flat chains form a Banach subspace of $A^\sharp_r(E^n)$, every sharp cochain may be restricted to flat chains. In other words, any sharp cochain is also flat. The additional property of sharp cochains that distinguishes them from flat cochains is the boundedness under translation. Given a sharp cochain $T$, consider for an $r$-cell $\sigma$ and a vector $v$ the difference in the flux due to the translation by $v$, i.e., $|T \cdot \sigma - T \cdot \operatorname{trans}_v \sigma|$. The continuity of $T$ implies that
$$|T \cdot \sigma - T \cdot \operatorname{trans}_v \sigma| \le C_T |\sigma - \operatorname{trans}_v \sigma|^\sharp \le C_T \frac{|\sigma||v|}{r+1},$$
by choosing $v_1 = 0$ and $v_2 = -v$ in the definition of the sharp norm. Thus, continuity implies that there is a positive $N_3$ such that $|T \cdot \sigma - T \cdot \operatorname{trans}_v \sigma| \le N_3 |\sigma||v|$. In particular, the difference tends to zero as the magnitude of $v$ tends to zero. Clearly, this imposes a regularity restriction on sharp cochains. In analogy with flat chains, the sharp norm is the smallest of all norms for which all the flux operators satisfying the
earlier boundedness conditions and boundedness under translation are continuous. Hence, in comparison with all other norms, it allows more elements to be added to the space of polyhedral chains in the process of completion.

Conversely, if we consider norms $|\cdot|_x$ on the space of polyhedral chains and wish to consider the action of a continuous flux functional $T$, then $|T \cdot A| \le C_T |A|_x$. If one requires that $|A|_x \le |A|$, $|\partial D|_x \le |D|$ and $|\sigma - \operatorname{trans}_v \sigma|_x \le |\sigma||v|$, for every $r$-chain $A$, $(r+1)$-chain $D$, $r$-cell $\sigma$, and vector $v$, then the boundedness conditions are implied by continuity. For example, boundedness under translation is implied by $|T \cdot \sigma - T \cdot \operatorname{trans}_v \sigma| \le C_T |\sigma - \operatorname{trans}_v \sigma|_x \le C_T |\sigma||v|$. In order to admit the most general flux operators that satisfy these three conditions we need the largest norm satisfying them. Indeed it can be shown that the sharp norm is the largest norm satisfying the conditions.

4.3. NATURAL COCHAINS

A $k$-natural $r$-cochain is an element of $A^{k*}_r$, the dual space of $A^k_r$. Since the natural norms $|\cdot|^{\natural_k}$ are smaller than the flat norm for $k > 0$, all natural cochains are flat cochains. In fact, we will see later that natural cochains are very regular. It is a basic guiding principle in geometric integration that as chains become increasingly irregular the cochains become increasingly regular.
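The contrast between the flat and sharp norms that underlies Sections 4.1 and 4.2 can be checked on the staircase strainer of Example 3.5. The sketch below only tabulates the two estimates quoted there (an estimate of 1 for each flat increment, and the bound $2^{-i}/4$ for each sharp increment); it does not compute the norms themselves, and the function names are illustrative assumptions.

```python
from fractions import Fraction

def flat_increment(i):
    """Estimate of |B_i - B_{i-1}|^flat: 2^(i-1) pairs, each of mass 2 * 2^(-i)."""
    return Fraction(2 ** (i - 1)) * 2 * Fraction(1, 2 ** i)

def sharp_increment_bound(i):
    """Bound on |B_i - B_{i-1}|^sharp: 2^(i-1) pairs, each costing d^2 / 2, d = 2^(-i)."""
    d = Fraction(1, 2 ** i)
    return Fraction(2 ** (i - 1)) * d * d / 2

n_steps = 20
flat_total = sum(flat_increment(i) for i in range(1, n_steps + 1))
sharp_total = sum(sharp_increment_bound(i) for i in range(1, n_steps + 1))

# The flat increments do not shrink, so their partial sums grow linearly,
# while the sharp bounds form a geometric series with a finite sum.
print("sum of flat increments :", flat_total)
print("sum of sharp bounds    :", float(sharp_total))
```

Exact rational arithmetic is used so that the geometric series can be compared with its closed form without rounding noise.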
5. Representation of Cochains, the Isomorphism Theorem and Fluxes

5.1. THE CAUCHY MAPPING

The Cauchy mapping of $r$-directions induced by an $r$-cochain is completely analogous to the mapping that gives the dependence of the flux density on the unit normal in classical continuum mechanics, hence the terminology we use. Let the $r$-direction $\alpha$ of an $r$-cell $\sigma$ be the $r$-vector $\{\sigma\}/|\sigma|$. The Cauchy mapping $D_T$ associated with the cochain $T$ is defined to be the function of points and $r$-directions such that
$$D_T(p, \alpha) = \lim_{i \to \infty} T \cdot \frac{\sigma_i}{|\sigma_i|},$$
where $(\sigma_i)$ is a sequence of $r$-cells containing $p$ with $r$-direction $\alpha$ such that
$$\lim_{i \to \infty} \operatorname{diam}(\sigma_i) = 0.$$
As the $r$-direction $\alpha$ is the analog of the unit normal $n$ used in continuum mechanics, the analog of Cauchy’s flux theorem will be the assertion that the restriction of the Cauchy mapping to each point $p$ may be extended to a linear mapping of $r$-vectors. In other words, $D_T$ is a form in $E^n$.
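The limit defining the Cauchy mapping can be illustrated numerically in the simplest setting. The sketch below assumes a 1-cochain in the plane that is given by integrating a smooth 1-form $\tau = P\,dx + Q\,dy$ over oriented segments; the particular form, the point, and all function names are hypothetical choices made only for illustration. The ratio $T \cdot \sigma_i / |\sigma_i|$ over shrinking segments through $p$ is compared with the linear expression $P(p)\alpha_x + Q(p)\alpha_y$.

```python
import math

def tau(x, y):
    """Hypothetical smooth 1-form tau = P dx + Q dy; returns (P, Q) at (x, y)."""
    return (x * y + 1.0, math.sin(x) - y * y)

def cochain_on_segment(a, b, n=200):
    """T . sigma: integral of tau over the oriented segment a -> b (midpoint rule)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        P, Q = tau(a[0] + t * dx, a[1] + t * dy)
        total += (P * dx + Q * dy) / n
    return total

def cauchy_ratio(p, alpha, h):
    """T . sigma / |sigma| for the segment of length 2h centred at p with direction alpha."""
    a = (p[0] - h * alpha[0], p[1] - h * alpha[1])
    b = (p[0] + h * alpha[0], p[1] + h * alpha[1])
    return cochain_on_segment(a, b) / (2.0 * h)

p = (0.3, -0.7)
alpha = (math.cos(0.4), math.sin(0.4))   # unit 1-direction
P, Q = tau(*p)
limit = P * alpha[0] + Q * alpha[1]      # value of the form on alpha at p

for h in (1e-1, 1e-2, 1e-3):
    print(h, cauchy_ratio(p, alpha, h), limit)
```

For a smooth form the ratio converges to a quantity that is linear in $\alpha$, which is the content of the flux theorem in this elementary case; the interest of the theory lies in cochains for which no such form is given in advance.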
5.2. THE REPRESENTATION THEOREM FOR SHARP FLUXES

The analog to Cauchy’s flux theorem in Whitney’s geometric integration theory for sharp cochains states the following.

PROPOSITION 5.1. For each sharp $r$-cochain $T$, the Cauchy mapping $D_T$ may be extended to a unique $r$-form that represents $T$ by
$$T \cdot A = \int_A D_T,$$
for every polyhedral chain $A$.

Clearly, the proposition defines the integral of a form over a sharp chain by continuity. Whitney’s theory determines exactly the forms that represent sharp cochains: the sharp forms. Firstly, the norm $|\cdot|_0$ is defined on $V^r$ by
$$|\tau|_0 = \sup \{ |\tau \cdot \alpha| : \alpha \text{ simple},\ |\alpha| = 1 \}.$$
The sharp norm of the form $\tau$ is defined by
$$|\tau|^\sharp = \sup_{p, q \in E^n} \left\{ |\tau(p)|_0,\ (r+1) \frac{|\tau(q) - \tau(p)|_0}{|q - p|} \right\}.$$
Then, a sharp form is defined to be a form whose sharp norm is finite. Thus, sharp forms are bounded Lipschitz forms. Using the norm topology on the space of cochains where
$$|T|^\sharp = \sup_{|A|^\sharp = 1} |T \cdot A|,$$
it can be shown that the previous proposition defines an isomorphism of the Banach space of sharp cochains and the Banach space of sharp forms.

5.3. REPRESENTATION OF FLAT COCHAINS

While sharp $r$-cochains are regular enough to be represented uniquely by sharp $r$-forms, flat $r$-cochains are less regular, and each flat $r$-cochain is represented by an equivalence class of $r$-forms which satisfy certain regularity conditions. Sharp forms representing sharp cochains are continuous and Riemann integration may be used. The representation of flat cochains by forms uses the analogous Lebesgue integration. The Lebesgue integral of an $r$-form over a flat $r$-chain $A = \lim^\flat A_i$ is defined by
$$\int_A \tau = \lim \int_{A_i} \tau$$
if the limit exists. (The integrals on the right-hand side are Lebesgue integrals on polyhedral chains defined earlier.)

The analysis of the representation of flat cochains by forms requires more attention than the sharp counterpart. For example, in the definition of the Cauchy mapping
$$D_T(p, \alpha) = \lim_{i \to \infty} T \cdot \frac{\sigma_i}{|\sigma_i|},$$
it is required that in the converging sequence $(\sigma_i)$ each of the simplexes contain $p$ as a vertex. It turns out that for each $r$-direction $\alpha$, $D_T(p, \alpha)$ is defined almost everywhere. Wolfe’s representation theorem for flat cochains, as formulated by Whitney [23, p. 261], states the following.

PROPOSITION 5.2. Let $T$ be a flat $r$-cochain in an open set $R \subset E^n$. Then, there is a set $Q \subset R$, with $|R - Q| = 0$, such that for each $p \in Q$, $D_T(p, \alpha)$ is defined for all $r$-directions $\alpha$, and is extendable to all $r$-vectors, giving an $r$-covector $D_T(p)$. The $r$-form $D_T$ is bounded and measurable in $R$. For any $r$-simplex $\sigma$ in $R$, $D_T$ is a measurable $r$-form relative to the plane of $\sigma$ and
$$T \cdot \sigma = \int_\sigma D_T.$$
In fact, one can describe exactly the flat forms, i.e., those forms that represent flat cochains. The exact conditions such that any flat $r$-form is associated with a unique flat $r$-cochain use the notions of $Q$-good simplexes and association of a form with a flat cochain. In order to avoid the technical details, and since we are mainly interested in the existence of the representing forms, these notions will not be presented here (see [23, pp. 263–266] for the details). As one would expect, a flat cochain is associated with an equivalence class of forms under equality almost everywhere. The quotient space of flat forms obtained by identifying forms that are equal almost everywhere, together with an appropriate norm, the flat norm of forms, is isomorphic to the space of flat cochains.

5.4. REPRESENTATION OF NATURAL COCHAINS

As mentioned earlier, natural cochains are regular. Harrison’s representation theorem states that for $k > 0$ every $k$-natural cochain $T$ is represented by a unique differential form $D_T$ as
$$T \cdot A = \int_A D_T,$$
where the first $k$ derivatives of $D_T$ are bounded and the $k$th derivative is Lipschitz. In fact, this relation defines an isomorphism of the space of $k$-natural cochains and the space of differential forms having this degree of smoothness (equipped with the suitable $C^{k,\mathrm{Lip}}$-norm).
6. Coboundaries and Differential Balance Equations

Coboundaries generalize exterior differentiation and their definition is purely algebraic. The coboundary $dT$ of an $r$-cochain $T$ is the $(r+1)$-cochain defined by $dT \cdot A = T \cdot \partial A$, i.e., it is the dual of the boundary operator for chains. As $\partial(\partial A) = 0$, one has $d(dT) = 0$. The basic result concerning coboundaries is that the coboundary of a flat cochain is flat and the same holds for the coboundary of a sharp cochain. This implies a very general formulation of the balance equation. For a cochain $T$ that is either sharp or flat, the coboundary exists as a flat cochain and we may define an $(r+1)$-cochain $S$ satisfying $dT + S = 0$, so the balance equation $S \cdot A + T \cdot \partial A = 0$ holds. Here, $S$ is interpreted as the cochain giving the rate of change of the total amount of the property $P$ in the flat $(r+1)$-chain $A$ (assuming there is no source term).

If the form $D_T$ representing the cochain $T$ is differentiable, then the flat form $D_{dT}$ representing $dT$ is given as the exterior derivative of $D_T$, as one would expect, i.e., $D_{dT} = dD_T$. Thus, using $\tau$ for $D_T$ and $b$ for the form representing $S$, the abstract balance equation above assumes the form
$$d\tau + b = 0, \qquad \int_A b + \int_{\partial A} \tau = 0.$$
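The two versions of the balance equation can be compared numerically in the smooth case. The sketch below assumes a hypothetical smooth 1-form $\tau = P\,dx + Q\,dy$ in the plane and takes $A$ to be the positively oriented unit square; it sets $b = -d\tau$ and checks that $\int_A b + \int_{\partial A} \tau$ vanishes up to quadrature error. The choice of $P$ and $Q$ and all names are illustrative only.

```python
import math

def P(x, y):
    return x * y * y

def Q(x, y):
    return math.sin(x) + x * y

def b(x, y):
    """Coefficient of dx^dy in b = -d(tau), with d(tau) = (dQ/dx - dP/dy) dx^dy."""
    dQ_dx = math.cos(x) + y
    dP_dy = 2.0 * x * y
    return -(dQ_dx - dP_dy)

def line_integral(a, c, n=400):
    """Integral of tau along the oriented segment a -> c (midpoint rule)."""
    dx, dy = c[0] - a[0], c[1] - a[1]
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x, y = a[0] + t * dx, a[1] + t * dy
        total += (P(x, y) * dx + Q(x, y) * dy) / n
    return total

def boundary_term(n=400):
    """Integral of tau over the counterclockwise boundary of the unit square."""
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
    return sum(line_integral(corners[k], corners[k + 1], n) for k in range(4))

def volume_term(n=400):
    """Integral of b over the unit square (midpoint rule on an n x n grid)."""
    h = 1.0 / n
    return sum(b((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

residual = volume_term() + boundary_term()
print("balance residual:", residual)
```

For this particular $\tau$ the boundary integral can be evaluated by hand edge by edge and equals $\sin 1$, which gives an independent check on the quadrature.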
In the more general case where $\tau$ is an arbitrary flat form representing the flat cochain $T$, $dT$ is a flat cochain and hence it may be represented by any flat form $d_0\tau$ in the equivalence class of $D_{dT}$. Thus, one may write the “differential” balance in the general situation of flat cochains. In fact,
$$|T|^\flat = \sup_p \{ |D_T(p)|, |D_{dT}(p)| \}.$$
The right-hand side of this identity is the flat norm of the form $D_T$.

In the particular case where $T$ is a sharp cochain represented by the sharp form $\tau = D_T$, the functions giving the components of $\tau$ are Lipschitz mappings; hence, $\tau$ has an analytic exterior derivative $d\tau$ as in Section 2.5 almost everywhere. Furthermore, it turns out that $d_0\tau = d\tau$ almost everywhere.

6.1. COBOUNDARIES FOR NATURAL COCHAINS

The fact that the boundary operator is continuous for natural chains allows the definition of the coboundary operator as the dual of the boundary operator. For natural cochains, one has $D_{dT} = dD_T$. It is noted that for natural cochains one may use a geometric definition of the exterior derivative as follows. Let $p$ be a
point and $\alpha$ an $r$-direction. Then, taking a decreasing sequence of $r$-simplexes $(\sigma_i)$ containing $p$, all of which are in the direction of $\alpha$, one has
$$d\tau(p, \alpha) = \lim_{|\sigma_i| \to 0} \frac{\int_{\partial\sigma_i} \tau}{|\sigma_i|}.$$
Thus, the balance equation holds pointwise.

Acknowledgements

The research leading to this paper was partially supported by a Kreitman Doctoral Fellowship to G. Rodnay and by the Paul Ivanier Center for Robotics Research and Production Management at Ben-Gurion University.
References

1. M. Degiovanni, A. Marzocchi and A. Musesti, Cauchy fluxes associated with tensor fields having divergence measure. Arch. Rational Mech. Anal. 147 (1999) 197–223.
2. H. Federer, Geometric Measure Theory. Springer, New York (1969).
3. R.L. Fosdick and E.G. Virga, A variational proof of the stress theorem of Cauchy. Arch. Rational Mech. Anal. 105 (1989) 95–103.
4. M.E. Gurtin and L.C. Martins, Cauchy’s theorem in classical physics. Arch. Rational Mech. Anal. 60 (1975) 305–324.
5. M.E. Gurtin and W.O. Williams, An axiomatic foundation for continuum thermodynamics. Arch. Rational Mech. Anal. 26 (1967) 83–117.
6. M.E. Gurtin, W.O. Williams and W.P. Ziemer, Geometric measure theory and the axioms of continuum thermodynamics. Arch. Rational Mech. Anal. 92 (1986) 1–22.
7. J. Harrison, Stokes’ theorem for nonsmooth chains. Bull. Amer. Math. Soc. 29(2) (1993) 235–242.
8. J. Harrison, Continuity of the integral as a function of the domain. J. Geometric Anal. 8(5) (1998) 769–795.
9. J. Harrison, Isomorphisms of differential forms and cochains. J. Geometric Anal. 8(5) (1998) 797–807.
10. J. Harrison, Flux across nonsmooth boundaries and fractal Gauss/Green/Stokes theorems. J. Phys. A 32(28) (1999) 5317–5327.
11. W. Noll, The foundations of classical mechanics in light of recent advances in continuum mechanics. In: The Axiomatic Method, with Special Reference to Geometry and Physics (Symposium at Berkeley, 1957). North-Holland, Amsterdam (1959) pp. 265–281.
12. W. Noll, Lectures on the foundations of continuum mechanics and thermodynamics. Arch. Rational Mech. Anal. 52 (1973) 61–92.
13. W. Noll, Continuum mechanics and geometric integration theory. In: Categories in Continuum Physics, Lecture Notes in Mathematics, Vol. 1174. Springer, New York (1986) pp. 17–29.
14. W. Noll and E.G. Virga, Fit regions and functions of bounded variation. Arch. Rational Mech. Anal. 102 (1988) 1–21.
15. R. Segev, Forces and the existence of stresses in invariant continuum mechanics. J. Math. Phys. 27(1) (1986) 163–170.
16. R. Segev, The geometry of Cauchy fluxes. Arch. Rational Mech. Anal. 154 (2000) 183–198.
17. R. Segev, A correction of an inconsistency in my paper ‘Cauchy’s theorem on manifolds’. J. Elasticity 63 (2002) 55–59.
18. R. Segev and G. de Botton, On the consistency conditions for force systems. Internat. J. Nonlinear Mech. 26(1) (1991) 47–59.
19. R. Segev and G. Rodnay, Cauchy’s theorem on manifolds. J. Elasticity 56 (1999) 129–144.
20. M. Šilhavý, The existence of the flux vector and the divergence theorem for general Cauchy fluxes. Arch. Rational Mech. Anal. 90 (1985) 195–212.
21. M. Šilhavý, Cauchy’s stress theorem and tensor fields with divergences in $L^p$. Arch. Rational Mech. Anal. 116 (1991) 223–255.
22. H. Whitney, Algebraic topology and integration theory. Proc. National Acad. Sci. 33 (1947) 1–6.
23. H. Whitney, Geometric Integration Theory. Princeton Univ. Press, Princeton, NJ (1957).
24. J.H. Wolfe, Tensor fields associated with Lipschitz cochains, PhD Thesis, Harvard (1948).
A Comparison of the Response of Isotropic Inhomogeneous Elastic Cylindrical and Spherical Shells and Their Homogenized Counterparts

U. SARAVANAN and K.R. RAJAGOPAL
Department of Mechanical Engineering, Texas A&M University, U.S.A.
E-mail: [email protected], [email protected]

Received 28 August 2002; in revised form 17 February 2003

Abstract. All real bodies are inhomogeneous, though in many such bodies the inhomogeneity is “mild” in that the response of the bodies can be “approximated” well by the response of a homogeneous approximation. In this study we explore the status of such approximations when one is concerned with bodies whose response is nonlinear. We find that significant departures in response can occur between that of a “mildly” inhomogeneous body and its homogeneous approximation (if the approximate model is restricted to a certain class), both quantitatively and qualitatively. We illustrate this fact within the context of a specific boundary value problem, the inflation of an inhomogeneous spherical shell. We also discuss the inappropriateness of homogenization procedures that lead to a homogenized stored energy for the body when in fact what is required is a homogenized model that predicts the appropriate stresses, as they invariably determine the failure or integrity of the body.

Mathematics Subject Classifications (2000): 74B20, 74Q15, 74Q20.

Key words: homogenization, inhomogeneous body, stored energy, isotropy, inflation, spherical shell.
Dedicated to the memory of Clifford Truesdell
1. Introduction

Nonlinear elasticity owes a great debt to the writings and research of Truesdell and those inspired by him. Much of this effort, though not all, in nonlinear elasticity is confined to understanding and describing the response of homogeneous bodies. Many bodies that are patently inhomogeneous are usually approximated by constitutive response functions for homogeneous bodies. In this short paper we are concerned with the error such an approximation entails.

We thank the National Institutes of Health and the National Science Foundation for the support of this work.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 721–749. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
All real bodies are inhomogeneous; however, in many bodies the departure from homogeneity is ignorable while in others it is not. Biological (see [1]) and geological bodies, as well as composites (see [2]), are usually such that the inhomogeneity cannot be neglected. In order to render the response of inhomogeneous bodies amenable to analysis they are oftentimes approximated as homogeneous bodies, provided that the material properties depart but slightly from a meaningful average value. Even amongst the class of bodies whose properties vary slightly from a mean, it is important to determine those classes of inhomogeneous bodies that can be well approximated by homogeneous bodies and those classes that cannot. While such an approximation seems appropriate for a reasonably wide class of inhomogeneous bodies, it transpires that it is grossly deficient in characterizing many classes of inhomogeneous bodies, even when their properties vary mildly from an average value.

There has been little work with regard to finite deformations of inhomogeneous solids, and the little that there is concerns homogenization of nonlinear elastic solids that have a stored energy that is polyconvex, with the emphasis being the determination of bounds for the stored energy. A popular model in biomechanics, defined through (12), does not have a stored energy that is polyconvex. The model (12) seems to fit the data well (of course, it is possible that a polyconvex model might also fit the data reasonably well). We also wish to caution those involved in data reduction based on a homogeneous model for a body that is supposedly “mildly” inhomogeneous: totally different values would be ascribed to the material moduli on the basis of different experiments (we discuss this aspect in some detail later).
A body $B$ is said to be materially uniform if for any two points $P_1, P_2 \in B$ there exist placers $\kappa_1$ and $\kappa_2$ and neighborhoods $N_{\mathbf{X}_1}$ of $\mathbf{X}_1 = \kappa_1(P_1)$ and $N_{\mathbf{X}_2}$ of $\mathbf{X}_2 = \kappa_2(P_2)$ such that the mechanical response of these neighborhoods is indistinguishable. If there exists a single placer $\kappa$ such that the response of all $\mathbf{X}$ belonging to $\kappa(B)$ is indistinguishable, the body is said to be homogeneous. A body that is not homogeneous is said to be inhomogeneous.

Let $\gamma$ denote a material parameter. We can define a mean value for the parameter, in the configuration $\kappa_R(B)$, through
$$\gamma_{\mathrm{mean}} = \frac{\int_{V(\kappa_R(B))} \gamma(\mathbf{X})\, dV}{V(\kappa_R(B))}, \tag{1}$$
where $V(\kappa_R(B))$ denotes the volume of the configuration $\kappa_R(B)$. Now, it is reasonable to ask when we can treat an inhomogeneous body in which a property varies over the body as a homogeneous body, with the material parameter having a constant value $\gamma_{\mathrm{mean}}$ in $\kappa_R(B)$. Here we suppose the mechanical response is determined by the single parameter $\gamma$. Otherwise we will have to consider several parameters $\gamma_i$ with all of them being approximated by a mean value. If the approximation is to make sense, then the response of the inhomogeneous body has to be close to the response of the homogeneous approximation in some sense.
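As a concrete reading of equation (1), the sketch below evaluates the volume average of a parameter that varies only with the radial coordinate in a spherical shell. The shell radii, the 5% amplitude, and the three radial profiles are hypothetical values chosen for illustration; they are not the distributions used later in the paper.

```python
import math

def shell_mean(gamma, r_in, r_out, n=2000):
    """Volume average of a radially varying parameter over a spherical shell.

    Approximates (integral of gamma dV) / V with dV = 4 pi R^2 dR (midpoint rule).
    """
    h = (r_out - r_in) / n
    weighted = 0.0
    for k in range(n):
        r = r_in + (k + 0.5) * h
        weighted += gamma(r) * 4.0 * math.pi * r * r * h
    volume = 4.0 / 3.0 * math.pi * (r_out ** 3 - r_in ** 3)
    return weighted / volume

r_in, r_out = 1.0, 1.2      # hypothetical inner and outer radii
g0 = 1.0                    # hypothetical reference value of the parameter

def constant(r):
    return g0

def linear(r):
    s = (r - r_in) / (r_out - r_in)          # 0 at inner, 1 at outer surface
    return g0 * (1.0 + 0.05 * (2.0 * s - 1.0))

def sinusoidal(r):
    s = (r - r_in) / (r_out - r_in)
    return g0 * (1.0 + 0.05 * math.sin(2.0 * math.pi * s))

for name, profile in (("constant", constant), ("linear", linear),
                      ("sinusoidal", sinusoidal)):
    print(name, shell_mean(profile, r_in, r_out))
```

Because the outer layers of a shell carry more volume, a profile that is symmetric about $g_0$ in the radial coordinate does not in general have volume mean $g_0$; profiles that are to share the same $\gamma_{\mathrm{mean}}$ must be normalized with the weight $R^2$ in mind.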
A related question of importance is whether the $\gamma_{\mathrm{mean}}$ defined via equation (1) has any relevance to an experimentally inferred constant $\gamma_{\mathrm{exp}}$ obtained through data reduction that presumes the body to be homogeneous and of the same type as the inhomogeneous body (e.g., an inhomogeneous neo-Hookean body being approximated by a homogeneous neo-Hookean body). Suppose we have an inhomogeneous body that comprises homogeneous subparts whose stored energy function belongs to a certain class, say neo-Hookean, with the different subparts having different material moduli. In general, a homogeneous approximation need not be a body of the same type, i.e., in the above example, neo-Hookean. However, if a body consists of various pieces of the same type with slightly differing values for the material moduli, we would be tempted to believe that it could be modelled by a homogeneous body belonging to the same class. Moreover, this is what is usually done (see [3]), though there are a few studies which do obtain a homogenized body belonging to a different class. As we observed earlier, it is possible that the homogenized model of an inhomogeneous body, each of its parts belonging to a certain class, need not be of the same class, i.e., the homogenization of a body comprised of different homogeneous neo-Hookean solids need not lead to a neo-Hookean body. But this depends on the homogenization procedure. Currently, the few homogenization studies pertaining to nonlinear elastic solids aim towards obtaining a homogenized stored energy. Irrespective of whether the homogenized body belongs to the same class or to another class, if the homogenization is based on energy considerations, the results are highly unsatisfactory from the point of view of applications, for the following reasons.
Such a homogenization ill-serves a person interested in the failure of the body, as failure is invariably determined by the stress in the body and not the stored energy at a point or, for that matter, the stored energy of a neighborhood of the point. Stresses are related to the derivatives of the stored energy with respect to the deformation gradient and thus having a homogeneous approximation of the stored energy does not in general provide a good approximation for the stresses. Even less useful are the studies concerning homogenization procedures for nonlinear elastic solids that obtain bounds for the stored energy, as these bounds need not be tight and, even if tight, they serve little useful purpose. The above points cannot be overemphasized.

Another question that bears investigation is whether the data reduction that presumes that the model is homogeneous leads to the same value $\gamma_{\mathrm{exp}}$ in different experiments, or at least whether the values $\gamma^1_{\mathrm{exp}}$ and $\gamma^2_{\mathrm{exp}}$ obtained from two experiments are close. In a recent study on the inflation, extension, torsion and shearing of a right circular isotropic inhomogeneous cylinder, Saravanan and Rajagopal [4] show that the answer to both the above questions is negative, i.e., $\gamma_{\mathrm{mean}}$ may not bear a close correlation to $\gamma_{\mathrm{exp}}$, and $\gamma^1_{\mathrm{exp}}$ may not be close to $\gamma^2_{\mathrm{exp}}$.

We show that different but slight variations of the property, say a piecewise constant variation, a linear variation, a sinusoidal variation, etc., all of which have the same mean value, lead to responses that are markedly different, even for global
measures of the response. When we focus our attention on local measures such as stresses, even a 5% variation about a mean for the material moduli could cause the solutions for the inhomogeneous and the approximate homogeneous body to differ by several hundred percent! Even more disturbing is the fact that the sense of the stress for the two cases, at a given location, could be different, i.e., while one predicts a compressive stress, the other could predict a tensile stress or vice versa. Of course, all this depends on the class of stored energy being considered.

It is worth noting that in the process of establishing the main thesis of the paper, an important boundary value problem, the inflation of a sector of a spherical shell, is solved for a variety of inhomogeneous bodies.

The arrangement of this paper is as follows. After a brief review of the relevant kinematics in Section 2 we introduce the different types of bodies (stored energies) that we will be considering in Section 3. We discuss the various types of inhomogeneities in Section 4 and develop the governing equation for the inflation of a sector of a spherical shell in Section 5. We follow this with a discussion of the issues concerning parameter estimation from experiments, and conclude by presenting a few interesting results on the stress distribution across the thickness of the shell.

2. Kinematics

Let $\mathbf{X} \in \kappa_R(B)$ denote a typical particle belonging to the reference configuration $\kappa_R(B)$ of the body, and let $\mathbf{x} \in \kappa_t(B)$ denote the position occupied by $\mathbf{X}$ at time $t$ in the configuration $\kappa_t(B)$. The motion of the body is defined through the mapping $\chi_{\kappa_R}$ that is one to one for each $t \in \mathbb{R}$:
$$\mathbf{x} = \chi_{\kappa_R}(\mathbf{X}, t). \tag{2}$$
We shall assume that the motion is sufficiently smooth to render all the derivatives that follow meaningful. The deformation gradient, FκR is defined through FκR =
∂χκR , ∂X
(3)
and the Cauchy–Green stretch tensors, BκR and CκR are defined through BκR = FκR FTκR ,
(4)
CκR = FTκR FκR .
(5)
The principal invariants of any second order tensor A are defined through I1 = tr A,
I2 =
1 (tr A)2 − tr A2 , 2
I3 = det A.
These kinematical quantities are sufficient for our purpose.
(6)
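The definitions (4) and (6) can be checked on a small example. What follows is only a minimal sketch, not part of the original analysis: the helper names (`matmul`, `principal_invariants`, `left_cauchy_green`) and the test deformation $\mathbf{F} = \operatorname{diag}(\lambda^{-2}, \lambda, \lambda)$ are assumptions chosen for illustration, the latter because it is isochoric.

```python
# Minimal sketch (illustrative names): principal invariants (6) of a 3x3 tensor
# stored as nested lists, applied to B = F F^T from (4).

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def det(a):
    """Determinant by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def principal_invariants(a):
    """I1 = tr A, I2 = [(tr A)^2 - tr A^2] / 2, I3 = det A."""
    i1 = trace(a)
    i2 = 0.5 * (i1 ** 2 - trace(matmul(a, a)))
    return i1, i2, det(a)

def left_cauchy_green(f):
    """B = F F^T, equation (4)."""
    return matmul(f, transpose(f))

# Assumed test case: an isochoric diagonal deformation gradient.
lam = 1.2
F = [[lam ** -2, 0.0, 0.0], [0.0, lam, 0.0], [0.0, 0.0, lam]]
I1, I2, I3 = principal_invariants(left_cauchy_green(F))
```

For this choice one expects $I_1 = \lambda^{-4} + 2\lambda^2$, $I_2 = \lambda^{4} + 2\lambda^{-2}$ and $I_3 = 1$.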
INHOMOGENEOUS BODIES AND THEIR HOMOGENIZED COUNTERPARTS
3. Constitutive Relations
In this study we shall restrict ourselves to hyperelastic, isotropic, incompressible inhomogeneous solids. The most general representation for the Cauchy stress, $\mathbf{T}$, for this class of elastic solids is
$$\mathbf{T} = -p\mathbf{1} + 2W_1\mathbf{B}_{\kappa_R} - 2W_2\mathbf{B}^{-1}_{\kappa_R}, \tag{7}$$
where $W_1$ and $W_2$ are the derivatives of the stored energy $W = W(I_1, I_2, \mathbf{X})$ with respect to the first and second principal invariants of $\mathbf{C}_{\kappa_R}$, and $-p\mathbf{1}$ is the indeterminate part of the stress due to the constraint of incompressibility. The specific form of the stored energy function depends on the particular solid that is of interest. We shall consider three types of stored energy functions associated with isotropic, incompressible inhomogeneous bodies and restrict ourselves to special forms of inhomogeneities, but these special forms suffice to make our case that great care has to be exercised in approximating inhomogeneous bodies as homogeneous bodies, however mild the inhomogeneities.
We shall first consider a generalization of the classical homogeneous neo-Hookean model (see [5]). We shall suppose that the stored energy function $W$ takes the form
$$W = \mu(\mathbf{X})(I_1 - 3), \tag{8}$$
where $I_1 = \operatorname{tr}\mathbf{C}_{\kappa_R}$ and $\mu(\mathbf{X}) > 0$ is the shear modulus. The Cauchy stress $\mathbf{T}$ in the body is given by
$$\mathbf{T} = -p\mathbf{1} + 2\mu(\mathbf{X})\mathbf{B}_{\kappa_R}. \tag{9}$$
The second model that we shall consider is the inhomogeneous version of the Mooney model [6], for which the stored energy $W$ has the form
$$W = \mu_1(\mathbf{X})(I_1 - 3) + \mu_2(\mathbf{X})(I_2 - 3). \tag{10}$$
It follows from (7) and (10) that the Cauchy stress is given by
$$\mathbf{T} = -p\mathbf{1} + 2\mu_1(\mathbf{X})\mathbf{B}_{\kappa_R} - 2\mu_2(\mathbf{X})\mathbf{B}^{-1}_{\kappa_R}, \tag{11}$$
where $\mu_1(\mathbf{X}) > 0$ and $\mu_2(\mathbf{X}) > 0$ are material moduli that have to satisfy certain restrictions (see [7]).
Finally, we introduce a model proposed by Fung [8] to describe biological tissues. The stored energy function for the inhomogeneous counterpart of Fung's model is
$$W = a(\mathbf{X})\left[e^{Q} - 1\right], \tag{12}$$
where $Q = b_1(\mathbf{X})E_{11}^2 + b_2(\mathbf{X})E_{22}^2 + b_3(\mathbf{X})E_{33}^2 + b_4(\mathbf{X})E_{11}E_{22} + b_5(\mathbf{X})E_{22}E_{33} + b_6(\mathbf{X})E_{11}E_{33}$ and $\mathbf{E} = 0.5[\mathbf{C} - \mathbf{1}]$. This stored energy is not polyconvex. We shall discuss the deformation of inhomogeneous anisotropic elastic solids elsewhere.
The problems identified in this article have to do with “homogenization” and not with issues of symmetry. Here we shall consider an isotropic model wherein $Q = b(\mathbf{X})(I_1 - 3)$. The Cauchy stress corresponding to such a stored energy is
$$\mathbf{T} = -p\mathbf{1} + 2a(\mathbf{X})b(\mathbf{X})e^{Q}\mathbf{B}_{\kappa_R}, \tag{13}$$
where $a(\mathbf{X}) > 0$ and $b(\mathbf{X}) > 0$ are material parameters.
We shall neglect body forces and, as we shall consider only static problems, the balance of linear momentum reduces to
$$\operatorname{div}(\mathbf{T}) = \mathbf{0}. \tag{14}$$
For incompressible bodies, the Cauchy stress can be expressed as
$$\mathbf{T} = -p\mathbf{1} + \mathbf{T}_e, \tag{15}$$
where, consistent with (7), $\mathbf{T}_e = 2W_1\mathbf{B}_{\kappa_R} - 2W_2\mathbf{B}^{-1}_{\kappa_R}$ is the constitutively determined part of the stress. Let us first introduce a dimensionless prescription of the position, gradient and divergence through
$$\tilde{\mathbf{x}} = \frac{\mathbf{x}}{L}, \tag{16}$$
$$\widetilde{\operatorname{grad}}(\cdot) = L\operatorname{grad}(\cdot), \tag{17}$$
$$\widetilde{\operatorname{div}}(\cdot) = L\operatorname{div}(\cdot), \tag{18}$$
where $L$ is a relevant length scale for the specific boundary value problem under consideration. We note that the deformation gradient, $\mathbf{F}_{\kappa_R}$, is already a dimensionless quantity. We introduce a parameter, $\mu_o$, with units of stress, to render the Cauchy stress dimensionless. The choice of the parameter $\mu_o$ depends on the specific form of the stored energy function that is under consideration. Here we choose $\mu_o$ to be the mean value of the shear modulus $\mu$ in the case of a neo-Hookean stored energy, the mean value of $\mu_1$ for a Mooney stored energy, and the mean value of $a$ for a Fung stored energy. Thus, equation (7) can be written in the following dimensionless form:
$$\tilde{\mathbf{T}} = -\tilde{p}\mathbf{1} + 2\tilde{W}_1\mathbf{B}_{\kappa_R} - 2\tilde{W}_2\mathbf{B}^{-1}_{\kappa_R}, \tag{19}$$
where $\tilde{W}_i = W_i/\mu_o$, $\tilde{\mathbf{T}} = \mathbf{T}/\mu_o$ and $\tilde{p} = p/\mu_o$. Consequently, equation (14) becomes
$$\widetilde{\operatorname{div}}(\tilde{\mathbf{T}}) = \mathbf{0}. \tag{20}$$
For the sake of convenience we drop the tilde, with the understanding that all the quantities considered henceforth are non-dimensional unless otherwise explicitly stated.
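To make the three constitutive choices concrete, the following minimal sketch evaluates the constitutively determined part $2W_1\mathbf{B} - 2W_2\mathbf{B}^{-1}$ of (7) in principal axes. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the material parameters are the local values at one material point, and the deformation is specified through three principal stretches.

```python
import math

# Minimal sketch (hypothetical helper names). Each model returns (W1, W2),
# the derivatives of W with respect to I1 and I2, for local parameter values.

def neo_hookean(mu):
    """W = mu (I1 - 3), equation (8)."""
    return lambda I1, I2: (mu, 0.0)

def mooney(mu1, mu2):
    """W = mu1 (I1 - 3) + mu2 (I2 - 3), equation (10)."""
    return lambda I1, I2: (mu1, mu2)

def fung_isotropic(a, b):
    """W = a (exp(Q) - 1) with Q = b (I1 - 3), so W1 = a b exp(Q) and W2 = 0."""
    return lambda I1, I2: (a * b * math.exp(b * (I1 - 3.0)), 0.0)

def extra_stress(model, stretches):
    """Principal values of 2 W1 B - 2 W2 B^{-1} for given principal stretches."""
    b = [s * s for s in stretches]                 # eigenvalues of B
    I1 = b[0] + b[1] + b[2]
    I2 = b[0] * b[1] + b[1] * b[2] + b[2] * b[0]
    W1, W2 = model(I1, I2)
    return [2.0 * W1 * bi - 2.0 * W2 / bi for bi in b]

# Usage: an isochoric stretch state (product of the stretches equals 1).
lam = 1.1
state = [lam ** -2, lam, lam]
stress_nh = extra_stress(neo_hookean(1.0), state)
stress_fung = extra_stress(fung_isotropic(1.0, 2.0), state)
```

The indeterminate part $-p\mathbf{1}$ is deliberately left out; it follows from equilibrium and the boundary conditions rather than from the constitutive relation.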
4. Forms of Inhomogeneities
For the purpose of illustrating our thesis we shall confine ourselves to a body $B$ that is the annular region between two concentric spheres:
$$B = \{(R, \Theta, \Phi) \mid R_i \le R \le R_o,\ 0 \le \Theta \le 2\pi,\ 0 \le \Phi \le \pi\}. \tag{21}$$
We shall use $R_o$ for the non-dimensionalization of the length. Let $\gamma(\mathbf{X})$ denote any of the material parameters $\mu(\mathbf{X})$, $\mu_1(\mathbf{X})$, $\mu_2(\mathbf{X})$, $a(\mathbf{X})$ or $b(\mathbf{X})$ introduced through the models in the previous section. We shall assume that the properties vary only along the radial direction, thus $\gamma(\mathbf{X}) = \gamma(R)$, and thus $\mu$, $\mu_1$, $\mu_2$, $a$ and $b$ are all functions of $R$. Before we discuss the manner in which the properties vary, we shall introduce a parameter $\bar{R}$ in terms of which we find it convenient to discuss the variation, as this parameter ranges between 0 and 1. Let
$$\bar{R} = \frac{R - R_i}{R_o - R_i}. \tag{22}$$
The forms for $\gamma(\bar{R})$ are such that
$$\gamma_{\text{mean}} = \frac{1}{V(\kappa_R(B))}\int_{V(\kappa_R(B))} \gamma(\mathbf{X})\,dV = \int_0^1 \gamma(\bar{R})\,d\bar{R}. \tag{23}$$
We shall pick, for the purpose of illustration, $\gamma_{\text{mean}} = 1$, i.e., our properties will be assumed to vary about a mean value of unity. First, we shall consider cases where the material parameter varies monotonically. Here, we investigate two types of variations, one in which $\gamma(\bar{R})$ increases from $R_i$ to $R_o$ and the other in which it decreases. While this can happen in a variety of ways, we choose the following simple variations.
4.1. LINEAR VARIATION
$$\gamma(\bar{R}) = 2(1 - \delta)\bar{R} + \delta, \tag{24}$$
where $0 < \delta < 2$. Thus,
$$\frac{d\gamma}{d\bar{R}} \begin{cases} > 0 & \text{if } 0 < \delta < 1, \\ < 0 & \text{if } 1 < \delta < 2. \end{cases} \tag{25}$$
4.2. EXPONENTIAL VARIATION
Here we suppose that
$$\gamma(\bar{R}) = \frac{\delta}{e^{\delta} - 1}\, e^{\delta \bar{R}}, \tag{26}$$
where $-\infty < \delta < \infty$. Thus,
$$\frac{d\gamma}{d\bar{R}} \begin{cases} > 0, & \delta > 0, \\ < 0, & \delta < 0. \end{cases} \tag{27}$$
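The two monotonic variations can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and a composite Simpson rule is used purely for convenience. It evaluates the linear and exponential forms on the normalized radial coordinate, which runs from 0 at the inner surface to 1 at the outer surface, and confirms that each has unit mean there.

```python
import math

# Minimal sketch (illustrative names): the monotonic property variations,
# as functions of the normalized radial coordinate rbar in [0, 1].

def linear_profile(delta):
    """gamma = 2 (1 - delta) rbar + delta, with 0 < delta < 2."""
    return lambda rbar: 2.0 * (1.0 - delta) * rbar + delta

def exponential_profile(delta):
    """gamma = delta / (exp(delta) - 1) * exp(delta * rbar), with delta != 0."""
    return lambda rbar: delta / math.expm1(delta) * math.exp(delta * rbar)

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def mean_value(profile):
    """Mean of the profile over 0 <= rbar <= 1."""
    return simpson(profile, 0.0, 1.0)

# Usage: both profiles vary about a mean of one.
stiffer_inside = linear_profile(1.5)       # decreases from 1.5 to 0.5
stiffer_outside = exponential_profile(2.0)  # increases with rbar
means = (mean_value(stiffer_inside), mean_value(stiffer_outside))
```

Note that the exponential form is undefined at $\delta = 0$ as written; in the limit $\delta \to 0$ it tends to the homogeneous value 1.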
Next we study cases where the variation of $\gamma$ is non-monotonic.
4.3. PIECEWISE CONSTANT (PWC) VARIATION
We shall assume that
$$\gamma(\bar{R}) = \begin{cases} \delta + 2(1 - \delta)\displaystyle\sum_{n=0}^{k-1} (-1)^n H\!\left(\bar{R} - \frac{n}{k}\right), & k \text{ is even}, \\[2ex] \delta + 2(1 - \delta)\dfrac{k}{k+1}\displaystyle\sum_{n=0}^{k-1} (-1)^n H\!\left(\bar{R} - \frac{n}{k}\right), & k \text{ is odd}, \end{cases} \tag{28}$$
where
$$H\!\left(\bar{R} - \frac{n}{k}\right) = \begin{cases} 0 & \text{if } \bar{R} < n/k, \\ 1 & \text{if } \bar{R} > n/k. \end{cases}$$
Here $\delta$ and $k$ determine the amplitude and frequency of the variation.
4.4. SINUSOIDAL VARIATION
In this case, we assume that
$$\gamma(\bar{R}) = 1 + \delta \sin(2k\pi\bar{R}), \tag{29}$$
where $\delta$ determines the amplitude of the variation and $k$ the frequency. Finally, we shall consider the case
$$\gamma(\bar{R}) = 1 + \delta \cos(2k\pi\bar{R}), \tag{30}$$
with $\delta$ and $k$ having the same meaning as before.
5. Inflation of a Sector of a Spherical Shell
We shall seek a semi-inverse solution of the following form, for the deformation, in spherical coordinates:
$$r = r(R), \qquad \theta = \Theta, \qquad \phi = \Phi, \tag{31}$$
where $(R, \Theta, \Phi)$ and $(r, \theta, \phi)$ represent the coordinates of a typical material point before and after the deformation, respectively. This deformation carries the region
between the two concentric spheres into a region between two other concentric spheres. The deformation gradient associated with this deformation, in spherical co-ordinates, has the following matrix representation
$$\mathbf{F} = \begin{pmatrix} \dfrac{dr}{dR} & 0 & 0 \\ 0 & \dfrac{r}{R} & 0 \\ 0 & 0 & \dfrac{r}{R} \end{pmatrix}. \tag{32}$$
Hence the left Cauchy–Green tensor has the form
$$\mathbf{B} = \begin{pmatrix} \left(\dfrac{dr}{dR}\right)^2 & 0 & 0 \\ 0 & \left(\dfrac{r}{R}\right)^2 & 0 \\ 0 & 0 & \left(\dfrac{r}{R}\right)^2 \end{pmatrix}. \tag{33}$$
The constraint of incompressibility requires that
$$\frac{dr}{dR}\left(\frac{r}{R}\right)^2 = 1. \tag{34}$$
Integrating the above results in
$$r^3 = R^3 + c_s, \tag{35}$$
where $c_s = r_i^3 - R_i^3$. For the special form of the assumed deformation, the deformation gradient is only a function of $R$ and therefore the stored energy has the form $W = W(\mathbf{F}_{\kappa_R}(R), R)$. Hence, the equilibrium equation (14) simplifies to
$$\frac{dT_{rr}}{dr} + \frac{1}{r}\left(2T_{rr} - T_{\theta\theta} - T_{\phi\phi}\right) = 0. \tag{36}$$
The Lagrange multiplier $p$ due to the constraint of incompressibility can be determined by integrating (36):
$$p(R) = T^{e}_{rr}(R) - \left\{ \int_{R}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{R}{r}\right)^4 - \left(\frac{r}{R}\right)^2\right] W_1(I_1, I_2, R)\,dR - \int_{R}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{r}{R}\right)^4 - \left(\frac{R}{r}\right)^2\right] W_2(I_1, I_2, R)\,dR \right\} - T_{rr}(r(R_o)). \tag{37}$$
Note that here $r$ is a function of $R$ as given in (35), and $I_1 = (R/r)^4 + 2(r/R)^2$, $I_2 = (r/R)^4 + 2(R/r)^2$. We carry out the integration numerically using the Newton–Cotes 8-panel rule.
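The radial integration can be sketched as follows for a neo-Hookean shell ($W_1 = \mu$, $W_2 = 0$), evaluating the radial stress at the inner surface relative to a traction-free outer surface. This is a minimal sketch under stated assumptions and not the authors' code: a composite Simpson rule is swapped in for the 8-panel Newton–Cotes rule, the function names are illustrative, and the closed form used as a check for the homogeneous case is obtained here by the substitution $\lambda = r/R$ and does not appear in the text.

```python
# Minimal sketch (illustrative names; Simpson's rule replaces the 8-panel
# Newton-Cotes rule mentioned in the text). Lengths are scaled so that Ro = 1.

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def inner_radial_stress(mu, Ri, cs, Ro=1.0, Trr_outer=0.0):
    """T_rr at the inner surface for W1 = mu(rbar), W2 = 0."""
    def integrand(R):
        r = (R ** 3 + cs) ** (1.0 / 3.0)          # r^3 = R^3 + cs, equation (35)
        rbar = (R - Ri) / (Ro - Ri)               # normalized radial coordinate
        return 4.0 * R ** 2 / r ** 3 * ((R / r) ** 4 - (r / R) ** 2) * mu(rbar)
    return Trr_outer + simpson(integrand, Ri, Ro)

def homogeneous_closed_form(mu0, Ri, cs, Ro=1.0):
    """Check value for constant mu, derived with the substitution lam = r/R."""
    lam_i = (1.0 + cs / Ri ** 3) ** (1.0 / 3.0)
    lam_o = (1.0 + cs / Ro ** 3) ** (1.0 / 3.0)
    g = lambda lam: 4.0 / lam + lam ** -4
    return mu0 * (g(lam_i) - g(lam_o))

# Usage: homogeneous shell versus a linearly graded shell with the same mean.
Ri, cs = 0.5, 0.1
P_hom = inner_radial_stress(lambda rbar: 1.0, Ri, cs)
P_lin = inner_radial_stress(lambda rbar: 2.0 * (1.0 - 1.5) * rbar + 1.5, Ri, cs)
```

For inflation ($c_s > 0$) the computed value is negative, i.e., a compressive normal stress has to act on the inner surface. The graded shell, which is stiffer near the inner surface, requires a larger magnitude than the homogeneous shell even though both have unit mean modulus.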
It immediately follows from (7) that
$$T_{rr}(R) = -p(R) + 2\left(\frac{R}{r}\right)^4 W_1(I_1, I_2, R) - 2\left(\frac{r}{R}\right)^4 W_2(I_1, I_2, R), \tag{38}$$
$$T_{\phi\phi}(R) = T_{\theta\theta}(R) = -p(R) + 2\left(\frac{r}{R}\right)^2 W_1(I_1, I_2, R) - 2\left(\frac{R}{r}\right)^2 W_2(I_1, I_2, R), \tag{39}$$
and hence the normal component of the stress in the radial direction at the inner surface is given by
$$P = T_{rr}(r(R_o)) + \left\{ \int_{R_i}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{R}{r}\right)^4 - \left(\frac{r}{R}\right)^2\right] W_1(I_1, I_2, R)\,dR - \int_{R_i}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{r}{R}\right)^4 - \left(\frac{R}{r}\right)^2\right] W_2(I_1, I_2, R)\,dR \right\}. \tag{40}$$
Thus, given a value of $r_i$, we can determine the required magnitude of the normal component of the stress in the radial direction at the inner surface to engender such a motion.
6. Some Remarks Concerning Parameter Estimation from Experiments
Let us first consider the deformations of an inhomogeneous neo-Hookean solid. Let $\mu_{\text{exp}}$ denote the constant value of the shear modulus for the homogeneous approximation of the inhomogeneous body; that is, we assume that the body is homogeneous with a constant material modulus $\mu_{\text{exp}}$, which is then determined through a correlation with an experiment in which the body is subject to the same boundary traction that has to be applied to engender a given inflation or deflation. We now determine the relationship between this constant value of the material modulus, $\mu^{\text{Sp–Inf}}_{\text{exp}}$, and the material modulus $\mu$ for the inhomogeneous body by comparing the solutions corresponding to an identical boundary value problem. It follows from (40) that
$$P - T_{rr}(r_o) = \int_{R_i}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{R}{r}\right)^4 - \left(\frac{r}{R}\right)^2\right] \mu(R)\,dR = \mu^{\text{Sp–Inf}}_{\text{exp}} \int_{R_i}^{R_o} \frac{4R^2}{r^3}\left[\left(\frac{R}{r}\right)^4 - \left(\frac{r}{R}\right)^2\right] dR.$$
Thus,
$$\mu^{\text{Sp–Inf}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} (4R^2/r^3)\left[(R/r)^4 - (r/R)^2\right]\mu(R)\,dR}{\int_{R_i}^{R_o} (4R^2/r^3)\left[(R/r)^4 - (r/R)^2\right] dR}. \tag{41}$$
We immediately recognize from (41) that different forms of $\mu(R)$ with the same $\mu_{\text{mean}}$ can lead to different values for $\mu^{\text{Sp–Inf}}_{\text{exp}}$. Note that in the above equation $R_o = 1$, since $R_o$ is used as the characteristic length scale. Figures 1–3 capture the variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ for the various types of variation of $\mu(R)$ presented in Section 4 for a given $c_s$, defined in (35). Figures 2 and 3 show that when the inhomogeneity is periodic, the homogeneous approximation is better the higher the frequency of the inhomogeneity, the amplitude remaining fixed, a result in keeping with the previous results of Saravanan and Rajagopal [4] in their investigations of the deformation of inhomogeneous isotropic annular elastic cylinders. From the same figures it can also be seen that the response for a piecewise constant variation with $k$ even is qualitatively similar to that for the sinusoidal variation, while if $k$ is odd it qualitatively resembles that for the cosine variation.
Figure 1. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and (a) $\mu(\bar{R}) = 2(1 - \delta)\bar{R} + \delta$, (b) $\mu(\bar{R}) = (\delta/(e^{\delta} - 1)) \cdot e^{\delta\bar{R}}$.
Instead of homogenizing such that the boundary traction and deformation measured in experiments are the same in both the inhomogeneous body and its homogeneous approximation, we can homogenize such that the total stored energy in the inhomogeneous body and its homogeneous counterpart are the same. Further, if the stored energies of the homogeneous subparts of the inhomogeneous body belong to the same class, say neo-Hookean, and we seek a homogeneous counterpart for this body such that its stored energy belongs to the same class as the homogeneous subparts of the inhomogeneous body, then we can mathematically seek the constant value of the shear modulus $\mu$, denoted by $\mu_{\text{mth}}$, such that the stored energy in the inhomogeneous body and its homogeneous approximation are the same. Thus
$$\mu_{\text{mth}} = \frac{\int_{R_i}^{R_o} \mu(R)\,[I_1 - 3]\,dR}{\int_{R_i}^{R_o} [I_1^{*} - 3]\,dR}, \tag{42}$$
where $I_1$ is the first principal invariant of $\mathbf{C}$ associated with the deformation field in the inhomogeneous body, while $I_1^{*}$ is the first principal invariant of $\mathbf{C}^{*}$ corresponding to the deformation field in the homogenized approximation.
The deformation of the spherical body, here, is determined completely by the condition of isochoricity, irrespective of whether the body is homogeneous or inhomogeneous. Hence
$$\mu^{\text{sph}}_{\text{mth}} = \frac{\int_{R_i}^{R_o} \mu(R)\,[I_1 - 3]\,dR}{\int_{R_i}^{R_o} [I_1 - 3]\,dR}, \tag{43}$$
where $I_1 = (R/r)^4 + 2(r/R)^2$. It immediately transpires from Figures 4–7 that correlating the stored energy does not result in a good prediction of the boundary traction required to engender the given boundary deformation. This is apparent from comparing equation (43) with (41). The prediction for the normal component of the stress in the radial direction required at the inner surface to engender a given inflation of a given inhomogeneous spherical shell, based on the equivalence of the stored energy in the inhomogeneous spherical shell and its homogeneous counterpart, could at times be twice (see Figure 7(b)) the actual value of the normal component of the stress in the radial direction required at the inner surface.
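The two homogenized moduli for the spherical shell, (41) and (43), are weighted means of $\mu$ with different weights, which is why they disagree. The sketch below illustrates this; it is not the authors' code. The function names are assumptions, Simpson's rule stands in for whatever quadrature was actually used, and the integrals are taken with respect to $dR$ exactly as the formulas are printed.

```python
# Minimal sketch (illustrative names): traction-based and energy-based moduli
# for the spherical shell, equations (41) and (43), with Ro = 1.

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def weighted_mean(mu, weight, Ri, Ro=1.0):
    """Ratio of int(mu * w dR) to int(w dR), with mu given as a function of rbar."""
    rbar = lambda R: (R - Ri) / (Ro - Ri)
    num = simpson(lambda R: mu(rbar(R)) * weight(R), Ri, Ro)
    return num / simpson(weight, Ri, Ro)

def mu_exp_sphere(mu, Ri, cs, Ro=1.0):
    """Equation (41): correlate the boundary traction."""
    def weight(R):
        r = (R ** 3 + cs) ** (1.0 / 3.0)
        return 4.0 * R ** 2 / r ** 3 * ((R / r) ** 4 - (r / R) ** 2)
    return weighted_mean(mu, weight, Ri, Ro)

def mu_mth_sphere(mu, Ri, cs, Ro=1.0):
    """Equation (43): correlate the stored energy, weight I1 - 3."""
    def weight(R):
        r = (R ** 3 + cs) ** (1.0 / 3.0)
        return (R / r) ** 4 + 2.0 * (r / R) ** 2 - 3.0
    return weighted_mean(mu, weight, Ri, Ro)

# Usage: a linear variation with delta = 1.5 (unit mean, stiffer inside).
linear = lambda rbar: 2.0 * (1.0 - 1.5) * rbar + 1.5
m_exp = mu_exp_sphere(linear, Ri=0.5, cs=0.1)
m_mth = mu_mth_sphere(linear, Ri=0.5, cs=0.1)
```

Both weights are largest near the inner surface, the energy weight more sharply so; for this profile neither estimate returns the mean value of 1, and the two differ from each other.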
Figure 2. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and $\mu(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$: (a) $k$ is even, (b) $k$ is odd.
Figure 3. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and (a) $\mu(\bar{R}) = 1 + \delta\sin(2k\pi\bar{R})$, (b) $\mu(\bar{R}) = 1 + \delta\cos(2k\pi\bar{R})$.
Figure 4. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.5$ and (a) $\mu(\bar{R}) = 2(1 - \delta)\bar{R} + \delta$, (b) $\mu(\bar{R}) = (\delta/(e^{\delta} - 1)) \cdot e^{\delta\bar{R}}$ for various load combinations.
Figure 5. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.9$ and $\mu(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$: (a) $k = 2$, (b) $k = 10$ for various load combinations.
Figure 6. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 1.9$ and $\mu(\bar{R}) = \delta + 2(1 - \delta)(k/(k+1))\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$: (a) $k = 3$, (b) $k = 11$ for various load combinations.
Figure 7. Variation of $\mu_{\text{exp}}$ with $R_i$ when $c_s = c_c = 0.1$, $\delta = 0.9$ and (a) $\mu(\bar{R}) = 1 + \delta\sin(2k\pi\bar{R})$, (b) $\mu(\bar{R}) = 1 + \delta\cos(2k\pi\bar{R})$ for various load combinations.
It would be appropriate at this juncture to discuss a very important issue concerning experimental evaluation of the material parameters, namely the ability to infer the same values for the material moduli from different experiments. Saravanan and Rajagopal [4] have shown that different deformations of an annular cylinder (i.e., different experiments) lead to different values for the experimental modulus being deduced. Here we show that the same holds for $\mu_{\text{mth}}$. In the case of the inflation, extension and torsion of an annular right circular inhomogeneous cylinder, the requirement that the deformation be isochoric completely determines the deformation, and the deformation field is the same for both the homogeneous and the inhomogeneous right circular cylinder; the difference, of course, lies in the stress field. This isochoric deformation is given by
$$\lambda r^2 = R^2 + c_c^2, \qquad \theta = \Theta + \Omega Z, \qquad z = \lambda Z, \tag{44}$$
where $c_c^2 = \lambda \cdot r_i^2 - R_i^2$, $\lambda$ being the axial extension, $\Omega$ the angle of twist per unit length, $R_i$ the inner radius of the annular cylinder in the reference configuration and $r_i$ the radius of the cylinder after inflation, i.e., in the current configuration. Following Saravanan and Rajagopal [4], we consider three special cases. The first is uniaxial extension of the right circular cylinder, for which $c_c = 0$ and $\Omega = 0$. Then, correlating the axial load required to engender a given axial extension, they obtain
$$\mu^{\text{Ax–Ext}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} \mu(R)\,R\,dR}{\int_{R_i}^{R_o} R\,dR}. \tag{45}$$
If instead we correlate the total stored energy, then
$$\mu^{cs1}_{\text{mth}} = \frac{\int_{R_i}^{R_o} \mu(R)\,dR}{\int_{R_i}^{R_o} dR} = \mu_{\text{mean}}, \tag{46}$$
from (42). Next, consider pure twisting of the annular cylinder, for which $c_c = 0$ and $\lambda = 1$. Correlating the torque required to engender a given twist, they obtained
$$\mu^{\text{Tr–Twt}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} \mu(R)\,R^3\,dR}{\int_{R_i}^{R_o} R^3\,dR}, \tag{47}$$
while correlating the total stored energy we obtain
$$\mu^{cs2}_{\text{mth}} = \frac{\int_{R_i}^{R_o} \mu(R)\,R^2\,dR}{\int_{R_i}^{R_o} R^2\,dR}. \tag{48}$$
Finally, consider the case when $\lambda = 1$ and $\Omega = 0$, corresponding to pure inflation. Correlating the radial component of the stress required to engender a given inflation, they obtained
$$\mu^{\text{Pr–Inf}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} \mu(R)\left((2R^2 + c_c^2)/R(R^2 + c_c^2)^2\right) dR}{\int_{R_i}^{R_o} \left((2R^2 + c_c^2)/R(R^2 + c_c^2)^2\right) dR}. \tag{49}$$
We obtain
$$\mu^{cs3}_{\text{mth}} = \frac{\int_{R_i}^{R_o} \left(\mu(R)/R^2(R^2 + c_c^2)\right) dR}{\int_{R_i}^{R_o} \left(1/R^2(R^2 + c_c^2)\right) dR}, \tag{50}$$
correlating the total stored energy. The values of $\mu_{\text{exp}}$ obtained from equations (45), (47) and (49), for a given variation of $\mu(R)$ and given values of $\delta$ and $k$, are plotted in Figures 4–7. Now, the problem under consideration in a different geometry leads to yet another means of estimating the value of the material moduli for the homogeneous approximation. As observed by Saravanan and Rajagopal [4], the value of $\mu_{\text{exp}}$ depends on the thickness of the cylinder or the sphere, as the case may be, an unacceptable situation. Also, the values of $\mu_{\text{exp}}$ obtained from different experiments are significantly different, again an unacceptable situation. Figures 4–7 provide the value of $\mu_{\text{mth}}$ obtained for the various deformations outlined above. Just like $\mu_{\text{exp}}$, $\mu_{\text{mth}}$ could also, for a given inhomogeneous body, vary by as much as 1800% (refer to Figure 5(a)) depending on the boundary value problem and the thickness of the cylinder or the spherical shell. This suggests that the bounds obtained on these homogenized parameters will not be tight and hence will be of little utility. Further, correlating the stored energy does not result in a good prediction of the boundary traction required to engender the given boundary deformation. This is evident from comparing equation (46) with (45), (48) with (47), or (50) with (49). It should be recognized that the above observations are based on a few studies of specific boundary value problems and, more importantly, on a few types of inhomogeneities. Consequently, there might exist other inhomogeneities for which the variation is much more severe than that observed here. All these unsatisfactory features point to the need for recognizing the exact structure of the inhomogeneity in characterizing these bodies and solving the appropriate boundary value problem, at least until a better homogenization procedure is put in place.
Clearly, the value of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ (as well as $\mu^{\text{sph}}_{\text{mth}}$) depends on the value of $c_s$.
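The six cylinder estimates (45)–(50) are all weighted means of $\mu(R)$ and differ only in the weight, so they can be tabulated side by side. The sketch below does this under stated assumptions: the function and dictionary key names are illustrative, lengths are scaled so that $R_o = 1$, and Simpson's rule is used for the quadrature.

```python
# Minimal sketch (illustrative names): the cylinder estimates (45)-(50) as
# weighted means of mu over Ri <= R <= Ro = 1, with mu a function of rbar.

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def weighted_mean(mu, weight, Ri, Ro=1.0):
    rbar = lambda R: (R - Ri) / (Ro - Ri)
    num = simpson(lambda R: mu(rbar(R)) * weight(R), Ri, Ro)
    return num / simpson(weight, Ri, Ro)

def cylinder_estimates(mu, Ri, cc, Ro=1.0):
    c2 = cc ** 2
    weights = {
        "exp_axial_extension": lambda R: R,                                   # (45)
        "mth_axial_extension": lambda R: 1.0,                                 # (46)
        "exp_torsion": lambda R: R ** 3,                                      # (47)
        "mth_torsion": lambda R: R ** 2,                                      # (48)
        "exp_inflation": lambda R: (2 * R ** 2 + c2) / (R * (R ** 2 + c2) ** 2),  # (49)
        "mth_inflation": lambda R: 1.0 / (R ** 2 * (R ** 2 + c2)),            # (50)
    }
    return {name: weighted_mean(mu, w, Ri, Ro) for name, w in weights.items()}

# Usage: linear variation with delta = 1.5 (unit mean), Ri = 0.5, cc = 0.1.
linear = lambda rbar: 2.0 * (1.0 - 1.5) * rbar + 1.5
estimates = cylinder_estimates(linear, Ri=0.5, cc=0.1)
spread = max(estimates.values()) - min(estimates.values())
```

Weights that grow with $R$ (extension, torsion) favour the outer, softer material for this profile, while the inflation weights favour the inner, stiffer material, so a single body yields noticeably different "moduli" depending on the experiment.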
Figure 8 shows the dependence of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ on $c_s$ when the sphere is made up of layers of homogeneous neo-Hookean solid.
Figure 8. Variation of $\mu^{\text{Sp–Inf}}_{\text{exp}}$ with $c_s$ when $R_i = 0.5$ and $\mu(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$: (a) $k$ is even, (b) $k$ is odd.
Next, let us consider the deformation of an inhomogeneous Mooney solid. We immediately obtain the values of $(\mu_1)^{\text{Sp–Inf}}_{\text{exp}}$ and $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ as
$$(\mu_1)^{\text{Sp–Inf}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} (4R^2/r^3)\left[(R/r)^4 - (r/R)^2\right]\mu_1(R)\,dR}{\int_{R_i}^{R_o} (4R^2/r^3)\left[(R/r)^4 - (r/R)^2\right] dR}, \tag{51}$$
$$(\mu_2)^{\text{Sp–Inf}}_{\text{exp}} = \frac{\int_{R_i}^{R_o} (4R^2/r^3)\left[(r/R)^4 - (R/r)^2\right]\mu_2(R)\,dR}{\int_{R_i}^{R_o} (4R^2/r^3)\left[(r/R)^4 - (R/r)^2\right] dR}. \tag{52}$$
It is easy to see that $(\mu_1)_{\text{exp}}$ has the same expression as that of the shear modulus ($\mu_{\text{exp}}$) in the neo-Hookean form. Hence, all the difficulties and undesirable characteristics of evaluating the shear modulus for an inhomogeneous neo-Hookean solid apply to $(\mu_1)_{\text{exp}}$. Figure 9 depicts the variation of $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ when the sphere is made up of layers of homogeneous Mooney solid. It can be seen from the figure that even when the inhomogeneity is periodic, an increase in the frequency of the inhomogeneity does not result in the homogeneous approximation being better, especially in a thick-walled spherical annulus.
Figure 9. Variation of $(\mu_2)^{\text{Sp–Inf}}_{\text{exp}}$ with $R_i$ when $c_s = 0.1$ and $\mu(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$: (a) $k$ is even, (b) $k$ is odd.
In the case of an inhomogeneous body whose stored energy function is that proposed by Fung, the stored energy of its homogeneous approximation cannot even belong to the same class, i.e., the stored energy of its homogeneous approximation will not belong to the class introduced by Fung (see [4]). However, we investigate the consequences of approximating an inhomogeneous body whose stored energy is given by the model proposed by Fung, with its material parameters varying mildly with location, as a homogeneous body with the same class of stored energy function having constant material parameters. We use the mean value for the material parameters in the homogeneous approximation and compare the pressure vs inner radius response for the various classes of inhomogeneous bodies that lead to the same homogeneous approximation. The results are depicted in Figure 10. Thus, the normal component of the stress in the radial direction required to engender the motion, parameterized by a given value of $r_i$, in the case of the inhomogeneous body can be 150% to even 300% more than that required for its corresponding homogeneous approximation. Of course, this depends on the specific variation of the material parameters. Here both the parameters $a$ and $b$ are assumed to have the same functional dependence on $R$.
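A comparison of this kind can be sketched for the isotropic Fung-type model, for which $W_1 = a\,b\,e^{b(I_1 - 3)}$ and $W_2 = 0$. The code below is an illustration only and does not reproduce Figure 10: the function names, the choice $R_i = 0.5$, the traction-free outer surface and the use of Simpson's rule are all assumptions.

```python
import math

# Minimal sketch (illustrative names): radial stress at the inner surface versus
# deformed inner radius ri for an isotropic Fung-type shell, with a and b given
# as functions of the normalized radial coordinate rbar. Ro = 1.

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def fung_inner_stress(a, b, Ri, ri, Ro=1.0):
    """T_rr at the inner surface, assuming T_rr = 0 at the outer surface."""
    cs = ri ** 3 - Ri ** 3
    def integrand(R):
        r = (R ** 3 + cs) ** (1.0 / 3.0)
        lam = r / R
        rbar = (R - Ri) / (Ro - Ri)
        I1 = lam ** -4 + 2.0 * lam ** 2
        W1 = a(rbar) * b(rbar) * math.exp(b(rbar) * (I1 - 3.0))
        return 4.0 * R ** 2 / r ** 3 * (lam ** -4 - lam ** 2) * W1
    return simpson(integrand, Ri, Ro)

def sinusoidal(delta, k):
    """gamma = 1 + delta sin(2 k pi rbar), unit mean."""
    return lambda rbar: 1.0 + delta * math.sin(2.0 * k * math.pi * rbar)

# Usage: a and b share the same variation, as in the text; delta = 0.5, k = 2.
const = lambda rbar: 1.0
wavy = sinusoidal(0.5, 2)
radii = [0.55, 0.60, 0.65]
P_hom = [fung_inner_stress(const, const, 0.5, ri) for ri in radii]
P_inh = [fung_inner_stress(wavy, wavy, 0.5, ri) for ri in radii]
```

Plotting `P_hom` and `P_inh` against `radii` gives the kind of response curve the comparison refers to; the numerical values depend on the assumed geometry and are not claimed to match the published figure.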
7. Stress Distribution
We now turn our attention to the differences between the stress distribution in the inhomogeneous solid and its homogeneous counterpart. It is not surprising that the stress distribution corresponding to the inhomogeneous body is quantitatively different from that of the homogeneous approximation. However, one should expect the qualitative features of the stress distribution, such as the derivative of the stress (which indicates whether the stress is increasing or decreasing) or the sense of the stress (tensile or compressive), to be preserved by and large, though not everywhere, in the inhomogeneous body and its homogeneous counterpart. The sense of the stress as well as its magnitude can determine the integrity or failure of the body. For instance, while certain materials can withstand significant compressive stresses, they can fail due to tensile stresses, and thus a homogenized approximation that predicts compressive stresses may lull one into a false sense of security while in fact the body may fail as tensile stresses develop in the real inhomogeneous body. Unfortunately, such is indeed the case, and this was illustrated by Saravanan and Rajagopal [4] for the case of inflation, extension, torsion and shearing of an annular cylinder.
Figure 10. $T_{rr}(r_i)$ vs $r_i$ plot for various forms of inhomogeneities when $\delta = 0.5$, $a_m = 1$, (a) $b_m = 1$, (b) $b_m = 2$ and $k = 2$ except for PWC-2, for which $k = 10$.
Figure 11. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with $R$ for a neo-Hookean stored energy function when $\mu(\bar{R}) = 2(1 - \delta)\bar{R} + \delta$ and $r_i = 0.91$.
Figure 12. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with $R$ for a neo-Hookean stored energy function when $\mu(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$ and $r_i = 0.91$.
Figure 13. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with $R$ for a Fung stored energy function when $a(\bar{R}) = 1$, $b(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$ and $r_i = 1.1$.
Figure 14. Variation of non-dimensional stress (a) $T_{rr}$, (b) $T_{\phi\phi}$ with $R$ for a Fung stored energy function when $a(\bar{R}) = 1$, $b(\bar{R}) = \delta + 2(1 - \delta)\sum_{n=0}^{k-1}(-1)^n H(\bar{R} - n/k)$ and $r_i = 1.4$.
It can be seen from Figures 11 and 12 that for an inhomogeneous neo-Hookean body the derivative of the stress is quite different from that for the homogeneous body, and in some cases even the sense of the stress is different. Finally, the stresses in the case of a sphere whose stored energy is that of the model proposed by Fung are plotted in Figures 13 and 14. For the cases illustrated in the figures, $b_{\text{mean}} = 1$. In this case a 5% variation in the parameter $b$ causes an 8% variation in the stress (to be specific, $T_{\phi\phi}$) when $r_i = 1.1$ and a 17% variation when $r_i = 1.4$. The percentage variation of the stresses increases with an increase in the mean value of the parameter $b$, for the same variation of $b$ about the mean. Thus, when $b_{\text{mean}} = 5$, a 5% variation in the parameter $b$ causes a 74% variation in the stress when $r_i = 1.4$. Thus, the variation in the stress can be an order of magnitude greater than the variation of the material parameters. Hence, if we idealize an aneurysm as a sphere and obtain the stress distribution using a homogeneous approximation, we will find that the stress will be both qualitatively and quantitatively different from the one that accounts for the inhomogeneity of the aneurysm.
References
1. Y.C. Fung, Biomechanics: Motion, Flow, Stress and Growth. Springer, New York (1990).
2. R.M. Christensen, Mechanics of Composite Materials. Wiley, New York (1979).
3. A. Imam, G.C. Johnson and M. Ferrari, Determination of the overall moduli in second order incompressible elasticity. J. Mech. Phys. Solids 43(7) (1995) 1087–1104.
4. U. Saravanan and K.R. Rajagopal, On the role of inhomogeneities in the deformation of elastic bodies. Math. Mech. Solids (accepted for publication).
5. L.R.G. Treloar, The elasticity of a network of long chain molecules – II. Trans. Faraday Soc. 39(9/10) (1943) 241–246.
6. M. Mooney, A theory of large elastic deformation. J. Appl. Phys. 11 (1940) 582–592.
7. C. Truesdell and W. Noll, The nonlinear field theories. In: Handbuch der Physik, Vol. III/3. Springer, Berlin (1965).
8. Y.C. Fung, Elasticity of soft tissues in elongation. Amer. J. Physiol. 213 (1967) 1532–1544.
On SO(n)-Invariant Rank 1 Convex Functions
M. ŠILHAVÝ
Mathematical Institute of the AV ČR, Žitná 25, 115 67 Prague 1, Czech Republic. E-mail: [email protected]
Received 9 October 2001; in revised form 30 December 2002
Abstract. Let $f$ be a function defined on the set $M^{n \times n}$ of all real square matrices of order $n$. If $f$ is SO(n)-invariant, it has a representation $\tilde{f}$ on $\mathbb{R}^n$ through the signed singular values of the matrix argument $A \in M^{n \times n}$. A necessary and sufficient condition for the rank 1 convexity of $f$ in terms of $\tilde{f}$ is given.
Mathematics Subject Classifications (2000): Primary 49K20; secondary 73C50.
Key words: rank 1 convex functions, rotational invariance.
In memory of Clifford Truesdell
This research was supported by Grant 201/00/1516 of the Grant Agency of the Czech Republic.
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 751–762. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
1. Introduction
In nonlinear elasticity and in the theory of phase transitions in solids, one minimizes the energy functional
$$I(u) = \int_{\Omega} f(Du)\,dx,$$
where $\Omega \subset \mathbb{R}^n$ is open and bounded, $u \in W^{1,p}(\Omega, \mathbb{R}^n)$ is a deformation with the gradient $Du$, $f : M^{n \times n} \to \mathbb{R} \cup \{\infty\}$ is the energy defined on the set $M^{n \times n}$ of all real square matrices of order $n$, and $1 \le p \le \infty$. Consider, for definiteness, the minimum problem
$$M = \inf\{I(u) : u \in A\}$$
on the Dirichlet class $A = \{u : u - v \in W_0^{1,p}(\Omega, \mathbb{R}^n)\}$, where $v \in W^{1,p}(\Omega, \mathbb{R}^n)$ is fixed. If the problem has a solution, i.e., if there exists a $u \in A$ such that $I(u) = M$, then $u$ is a ‘stable’ equilibrium state; on the other hand, the nonexistence of a solution indicates the possibility of phase transitions and the formation of microstructure. Since the question of Truesdell [28, Section 20] it was clear that apart from the invariance, the energy $f$ must satisfy further basic requirements, ‘constitutive inequalities’ yet to be determined. Subsequently it was found that the existence of the solution, and its further properties, are directly related to the semiconvexity properties (i.e., quasiconvexity, rank 1 convexity, polyconvexity, and convexity) of $f$ [14, 3, 8, 18, 19, 16]. Recall that $f$ is said to be quasiconvex if
$$|E|\,f(A) \le \int_{E} f(A + Dv(x))\,dx \tag{1}$$
for each $A \in M^{n \times n}$, each bounded open $E \subset \mathbb{R}^n$ with $|\partial E| = 0$, and each $v \in W_0^{1,\infty}(E)$ such that the right-hand side in (1) makes sense as the Lebesgue integral. A closely related notion is that of rank 1 convexity, which requires that
$$f((1 - t)A + tB) \le (1 - t)f(A) + tf(B) \tag{2}$$
for every t ∈ [0, 1] and every A, B ∈ Mn×n with rank(A − B) 1. For finitevalued functions, quasiconvexity implies rank 1 convexity. If f is not quasiconvex, the material exhibits microstructure and phase transformation [5–8, 13, 15]. The effective energy is given by the relaxation of I , i.e., by ¯ Qf (Du) dx, I (u) =
where Qf is the quasiconvex hull of f , i.e., the largest quasiconvex function not exceeding f . One defines the rank 1 convex hull Rf similarly. For finite-valued functions, Qf Rf , and it often happens that Qf = Rf . A function f : Mn×n → R ∪ {∞} is said to be rotationally invariant (briefly, invariant) if f (A) = f (QAR) for all A ∈ Mn×n and all Q, R proper orthogonal. For example, stored energies of isotropic solids have this property. If we define f˜(τ ) = f (diag(τ )) for any τ ∈ Rn then f˜, called the representation of f , is symmetric and even, i.e., f˜(P τ ) = f˜(τ ) = f˜(τ ) for every τ ∈ Rn , every permutation matrix P and every diagonal proper orthogonal matrix . One finds that f (A) = f˜(τ ), where τ = (τ1 , . . . , τn ) are the signed singular values of A, defined as the unique n-tuple such that τ1 , . . . , τn−1 , |τn | are the singular values of A, arranged in a nonincreasing way, and sgn τn = sgn det A [17, 20]. This paper presents a condition equivalent to the rank 1 convexity of f in terms of f˜, Theorem 6. Like the conditions in [1, 20–22, 25], the present condition involves finite differences of arguments, resembling formally the inequality (2), as opposed to the Legendre–Hadamard condition D 2 f (A)(a ⊗ b, a ⊗ b) 0,
A ∈ Mn×n , a, b ∈ Rn ,
whose nature is ‘infinitesimal.’ The form of the Legendre–Hadamard condition in terms of f˜ has been given in [12] for n = 2, in [27] for n = 3 (in a different
753
ON SO(n)-INVARIANT RANK 1 CONVEX FUNCTIONS
framework), and generally in [21, Proposition 6.4]. The reader is referred to [2, 10], and [9] for additional information. In view of its global nature, Theorem 6, and the results of [20–22, 25], can be used to define iterative procedures for evaluating the rank 1 convex hull of an invariant function [23, 26].

Theorem 6 and its proof have two parts. One part (item (i) of the theorem) is the monotonicity of invariant rank 1 convex functions as established in [24], which is closely related to the Baker–Ericksen inequalities in the differentiable case. In the special case of $O(n)$ invariant functions, a similar result has been established in [11]; however, that result does not apply to the $SO(n)$ invariant functions treated here. The second part (item (ii) of the theorem) is based on an explicit construction of a rank 1 perturbation $B$ of a given matrix $A$ with prescribed signed singular values $\beta$, Proposition 1. Such a perturbation exists only if $\beta$ and the signed singular values $\alpha$ of $A$ satisfy the interlacing inequalities to be formulated in Section 2. Let $S^k$, $k = 1, \dots, n$, denote the $k$th elementary symmetric function of $n$ variables. If $\gamma(t)$ are the signed singular values of $C(t) := (1 - t)A + tB$, $0 \le t \le 1$, then for an appropriate diagonal orthogonal matrix $\varepsilon$, the functions $S^k(\varepsilon\gamma(t))$ behave affinely, i.e.,
$$S^k(\varepsilon\gamma(t)) = (1 - t)S^k(\varepsilon\alpha) + tS^k(\varepsilon\beta), \qquad k = 1, \dots, n. \tag{3}$$
Accordingly, the function $\tilde f$ is convex on any curve satisfying (3) (provided $\alpha$, $\beta$ satisfy the bilateral interlacing inequalities), i.e.,
$$\tilde f(\gamma(t)) \le (1 - t)\tilde f(\alpha) + t\tilde f(\beta). \tag{4}$$
The necessity of (4) thus follows by a direct insertion into (2); the converse proof is based on local considerations. Namely, the rank 1 perturbations described in Proposition 1 have certain minimum properties stated in Lemma 4. Combining these with items (i), (ii) and the use of some continuity and density arguments (Lemmas 2 and 5) then completes the proof. Apart from the notation and the bilateral interlacing inequalities, Section 2 is not needed for the statement of Theorem 6; it only gathers material for the proof, and can be used as a reference as needed.

It would be desirable to integrate the two conditions of Theorem 6 into a single condition. This can be done in dimension $n = 2$, as Theorem 7 shows. However, if $n \ge 3$, the discrepancy between the functions occurring in conditions (i), (ii) of Theorem 6 seems to present a serious difficulty in such attempts. One connection between (i), (ii) is established in Lemma 4, which is the main technical improvement with respect to the previous papers of the author. The lemma shows that the rank 1 perturbations underlying condition (ii) are local minimizers of the partial products of signed singular values occurring in condition (i). This establishes the special position of these particular rank 1 perturbations. However, the result is local in nature and a full understanding of the issue is still outstanding.
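The signed singular values and the representation $\tilde f$ used throughout are easy to experiment with numerically. The following is a minimal sketch, assuming NumPy; the function names (`signed_singular_values`, `representation`, `random_rotation`) and the sample invariant function `f` are illustrative choices made here, not objects taken from the paper.

```python
import numpy as np

def signed_singular_values(A):
    """Nonincreasing singular values of A, the last one carrying the sign of det A."""
    tau = np.linalg.svd(A, compute_uv=False)  # nonnegative, nonincreasing
    if np.linalg.det(A) < 0:
        tau[-1] = -tau[-1]
    return tau

def f(A):
    # Sample SO(n)-invariant function (assumption for illustration): |A|^2 + det A
    return np.sum(A * A) + np.linalg.det(A)

def representation(tau):
    # f~(tau) = f(diag(tau))
    return f(np.diag(tau))

def random_rotation(n, rng):
    # Hypothetical helper: proper orthogonal matrix obtained from a QR factorization
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip one column to enforce det Q = +1
    return Q
```

For a random matrix `A` and rotations `Q`, `R` produced by `random_rotation`, one can compare `f(Q @ A @ R)`, `f(A)` and `representation(signed_singular_values(A))`; for an invariant function they should agree up to rounding.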
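2. Rank 1 Perturbations

This section describes a class of rank 1 perturbations of a given matrix with prescribed signed singular values and collects other supplementary facts. Let
$$G^n = \{\tau \in \mathbb{R}^n : \tau_1 \ge \tau_2 \ge \cdots \ge \tau_{n-1} \ge |\tau_n|\}$$
and note that $\tau \in \mathbb{R}^n$ is an $n$-tuple of signed singular values of some $A \in M^{n\times n}$ if and only if $\tau \in G^n$. For any $\alpha, \beta \in \mathbb{R}^n$ let $\alpha\beta = (\alpha_1\beta_1, \dots, \alpha_n\beta_n)$, $\alpha^2 = (\alpha_1^2, \dots, \alpha_n^2)$, and write $\alpha \ge 0$ if $\alpha_1 \ge 0, \dots, \alpha_n \ge 0$. For each $k \in \{1, \dots, n\}$ denote by $S^k : \mathbb{R}^n \to \mathbb{R}$ the $k$th elementary symmetric function of $n$ variables,
$$S^k(\alpha_1, \dots, \alpha_n) = \sum_{1 \le i_1 < \cdots < i_k \le n} \alpha_{i_1} \cdots \alpha_{i_k},$$
and let $S(\alpha) = (S^1(\alpha), \dots, S^n(\alpha))$. The pair $\alpha, \beta \in G^n \cap \mathbb{R}^n_+$ is said to satisfy the bilateral interlacing inequalities (BIL) if
$$\beta_1 \ge \alpha_2, \qquad \alpha_1 \ge \beta_2 \ge \alpha_3, \qquad \dots, \qquad \alpha_{n-1} \ge \beta_n.$$
The pair $\alpha, \beta \in G^n$ is said to satisfy the BIL if $(\alpha_1, \dots, \alpha_{n-1}, |\alpha_n|)$ and $(\beta_1, \dots, \beta_{n-1}, |\beta_n|)$ satisfy the BIL.

PROPOSITION 1. Let $A = \operatorname{diag}(\alpha)$, $\alpha \in G^n$. Then
(i) $\beta \in G^n$ are the signed singular values of some rank 1 perturbation of $A$ $\Leftrightarrow$ $\alpha$, $\beta$ satisfy the BIL;
(ii) let $\beta \in G^n$, $\beta \ne \alpha$, satisfy the BIL and let
$$q_j = \sqrt{x_j}, \qquad p_j = \varepsilon_j\sqrt{x_j},$$
where $\varepsilon_j \in \{1, -1\}$ is such that $\varepsilon(\beta - \alpha) \ge 0$,
$$x_j := \varepsilon_j(\beta_j - \alpha_j) \prod_{i} \frac{\varepsilon_i\beta_i - \varepsilon_j\alpha_j}{\varepsilon_i\alpha_i - \varepsilon_j\alpha_j} \ge 0,$$
and the product is taken over all $i$ for which the denominator is nonzero; then $B := A + p \otimes q$ has the signed singular values $\beta$;
(iii) in the situation of (ii), set $C = (1 - t)A + tB$, $0 \le t \le 1$, and denote by $\gamma$ the signed singular values of $C$; then
$$S(\varepsilon\gamma) = (1 - t)S(\varepsilon\alpha) + tS(\varepsilon\beta). \tag{5}$$

The requirement $\varepsilon(\beta - \alpha) \ge 0$ determines $\varepsilon_j \in \{1, -1\}$ uniquely as $\varepsilon_j = \operatorname{sgn}(\beta_j - \alpha_j)$ if $\beta_j \ne \alpha_j$, while if $\beta_j = \alpha_j$ then both choices $\varepsilon_j = \pm 1$ are possible; however, this ambiguity has no consequences for the values of $x_j$. Item (iii)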
expresses a remarkable fact that the elementary symmetric functions, when composed with $\varepsilon$, are affine functions of signed singular values along the rank 1 segments described in Proposition 1. This is not generally true for any rank 1 line segment whose endpoints have the signed singular values $\alpha$, $\beta$.

Proof. For the proof of (i), (ii), see [22, Propositions 3.1, 3.2]. To prove (iii), let $m$ be defined by $m_i = \sqrt{x_i}$, let $E := \operatorname{diag}(\varepsilon)$, $L := EA$, $M := EB \equiv L + m \otimes m$, $N := EC = (1 - t)L + tM$, and note that $L$, $M$, $N$ are symmetric. Then $\varepsilon\alpha$, $\varepsilon\beta$, $\varepsilon\gamma$ are (unordered) spectra of $L$, $M$, $N$, respectively. Denoting the characteristic polynomials of $L$, $M$, $N$ by $p$, $q$, $r$, respectively, one finds
$$q(z) = p(z) + \operatorname{cof}(L - z1)m \cdot m, \qquad r(z) = p(z) + t\operatorname{cof}(L - z1)m \cdot m$$
and hence $r = (1 - t)p + tq$. Expanding $p$, $q$, $r$ in terms of $S^k(\varepsilon\alpha)$, $S^k(\varepsilon\beta)$, $S^k(\varepsilon\gamma)$, one obtains (5). □
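Proposition 1 can be checked numerically on a small example. The sketch below assumes NumPy; the helper names and the sample data `alpha`, `beta` are illustrative assumptions, chosen so that the BIL hold and the entries of $\varepsilon\alpha$ are distinct. It builds $p$, $q$ from item (ii) and then tests items (ii) and (iii).

```python
import numpy as np

def signed_singular_values(A):
    tau = np.linalg.svd(A, compute_uv=False)
    if np.linalg.det(A) < 0:
        tau[-1] = -tau[-1]
    return tau

def elementary_symmetric(v):
    """(S^1(v), ..., S^n(v)), read off the coefficients of prod_i (z - v_i)."""
    c = np.poly(v)  # [1, -S^1, S^2, -S^3, ...]
    signs = (-1.0) ** np.arange(1, len(v) + 1)
    return signs * c[1:]

def rank_one_perturbation(alpha, beta):
    """eps, p, q of Proposition 1(ii); assumes alpha, beta satisfy the BIL."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    eps = np.where(beta >= alpha, 1.0, -1.0)  # eps * (beta - alpha) >= 0
    lam, mu = eps * alpha, eps * beta
    x = np.empty_like(alpha)
    for j in range(len(alpha)):
        x[j] = mu[j] - lam[j]  # equals eps_j * (beta_j - alpha_j)
        for i in range(len(alpha)):
            if lam[i] != lam[j]:  # skip factors with zero denominator
                x[j] *= (mu[i] - lam[j]) / (lam[i] - lam[j])
    q = np.sqrt(np.maximum(x, 0.0))  # guard against tiny negative rounding
    return eps, eps * q, q

alpha = np.array([4.0, 2.0, 1.0])   # sample data (assumption)
beta = np.array([5.0, 1.5, 1.2])    # satisfies the BIL with alpha
eps, p, q = rank_one_perturbation(alpha, beta)
A = np.diag(alpha)
B = A + np.outer(p, q)
t = 0.3
gamma = signed_singular_values((1 - t) * A + t * B)
```

Here `signed_singular_values(B)` should reproduce `beta`, and `elementary_symmetric(eps * gamma)` should equal the convex combination of `elementary_symmetric(eps * alpha)` and `elementary_symmetric(eps * beta)` with weight `t`, up to rounding.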
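LEMMA 2 ([22, Lemma 5.5]). Let $A$ be diagonal, invertible and have distinct singular values. Then there exists a dense subset $D$ of $\mathbb{R}^n \times S^{n-1}$ such that for each $(a, n) \in D$ and each $t \in \mathbb{R}$ the matrix $A + ta \otimes n$ has distinct singular values.

LEMMA 3 ([21, equation (6.11)]). Let $f : M^{n\times n} \to \mathbb{R}$ be invariant and of class $C^2$ in a neighborhood of $A = \operatorname{diag}(\alpha)$, $\alpha \in G^n$, where $\alpha^2$ has distinct components, let $B := a \otimes b$, $a, b \in \mathbb{R}^n$, and $\varepsilon \in \{1, -1\}^n$. Then
$$Df(A)(B) = \sum_{i=1}^{n} \tilde f_i B_{ii}, \tag{6}$$
$$D^2 f(A)(B, B) = \frac{1}{2} \sum_{1 \le i \ne j \le n} K_{ij}(B_{ij} - \varepsilon_i\varepsilon_j B_{ji})^2 + \sum_{i,j=1}^{n} T^{\varepsilon}_{ij}\, \varepsilon_i B_{ii}\, \varepsilon_j B_{jj}, \tag{7}$$
where $T^{\varepsilon}_{ii} = \tilde f_{ii}$ and for $i \ne j$,
$$K_{ij} = \frac{\alpha_i \tilde f_i - \alpha_j \tilde f_j}{\alpha_i^2 - \alpha_j^2}, \tag{8}$$
$$T^{\varepsilon}_{ij} = \varepsilon_i\varepsilon_j \tilde f_{ij} + \frac{\varepsilon_i \tilde f_i - \varepsilon_j \tilde f_j}{\varepsilon_i\alpha_i - \varepsilon_j\alpha_j}, \tag{9}$$
with the derivatives evaluated at $\alpha$.

Let $m_k : M^{n\times n} \to \mathbb{R}$, $k = 1, \dots, n$, be the partial products of the signed singular values
$$m_k(A) = \prod_{i=1}^{k} \tau_i,$$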
$A \in M^{n\times n}$, where $\tau$ are the signed singular values of $A$. The rank 1 perturbations described in Proposition 1 have the property that $p = \operatorname{diag}(\varepsilon)q$ and hence $\operatorname{diag}(\varepsilon)B$ is symmetric. The following lemma shows that rank 1 perturbations with $p = \operatorname{diag}(\varepsilon)q$ have a special position in the class of all rank 1 perturbations.

LEMMA 4. Let $a, b \in \mathbb{R}^n$, $\varepsilon \in \{1, -1\}^n$ satisfy
$$\varepsilon_i a_i b_i \ge 0, \qquad i = 1, \dots, n, \tag{10}$$
and define $p, q \in \mathbb{R}^n$ by $q_i = \sqrt{\varepsilon_i a_i b_i}$, $p_i = \varepsilon_i q_i$. Furthermore, let $A = \operatorname{diag}(\alpha)$, $\alpha \in G^n$, and set
$$C(t) := A + ta \otimes b, \qquad R(t) := A + tp \otimes q,$$
for any $t \in \mathbb{R}$. If $A$ is invertible and $\alpha^2$ has distinct components then for all $t$ sufficiently close to 0,
$$m_k(R(t)) \le m_k(C(t)), \quad 1 \le k < n, \qquad m_n(R(t)) = m_n(C(t)). \tag{11}$$
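The content of Lemma 4 can be probed numerically for small $t$. The following sketch assumes NumPy; the sample data `alpha`, `a`, `b` are arbitrary illustrative values with $a_i b_i > 0$, so that (10) holds with all $\varepsilon_i = 1$, and `gap` is a hypothetical helper returning $m_k(C(t)) - m_k(R(t))$.

```python
import numpy as np

def partial_products(A):
    """m_1(A), ..., m_n(A): cumulative products of the signed singular values."""
    tau = np.linalg.svd(A, compute_uv=False)
    if np.linalg.det(A) < 0:
        tau[-1] = -tau[-1]
    return np.cumprod(tau)

alpha = np.array([3.0, 2.0, 1.0])  # invertible A, alpha^2 has distinct components
a = np.array([1.0, 0.5, 0.2])
b = np.array([0.3, 1.0, 0.7])
eps = np.sign(a * b)               # all +1 for this data, so (10) holds
q = np.sqrt(eps * a * b)
p = eps * q
A = np.diag(alpha)

def gap(t):
    """m_k(C(t)) - m_k(R(t)) for k = 1, ..., n."""
    C = A + t * np.outer(a, b)
    R = A + t * np.outer(p, q)
    return partial_products(C) - partial_products(R)
```

For small values such as `t = 1e-3` or `t = -1e-3`, the first $n - 1$ entries of `gap(t)` should be nonnegative and the last entry should vanish up to rounding, in line with (11).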
Proof. Since $A$ is invertible and $\alpha^2$ has distinct components, the singular values, considered as functions of their matrix argument, are of class $C^\infty$ in a neighborhood of $A$ by [4, Section 6]. Since the first $n - 1$ signed singular values coincide with the singular values, also $m_k$, $k = 1, \dots, n - 1$, are of class $C^\infty$ in a neighborhood of $A$. Thus we can apply Lemma 3. For each $i \ne j$ denote by $K^k_{ij}$ the $K$-matrix defined by (8) for $m_k$ at $\alpha$ and for each $i$, $j$ denote by $T^{k,\varepsilon}_{ij}$ the $T$-matrix defined by (9) for $m_k$ at $\alpha$. Then for $k < n$,
$$K^k_{ij} = K^k_{ji} = \begin{cases} 0 & \text{if } 1 \le i < j \le k, \\[4pt] \dfrac{m_k}{\alpha_i^2 - \alpha_j^2} > 0 & \text{if } 1 \le i \le k < j. \end{cases} \tag{12}$$
Note first that we have
$$\det C(t) = \det A \Bigl(1 + t\sum_{i=1}^{n} \alpha_i^{-1} a_i b_i\Bigr) = \det A \Bigl(1 + t\sum_{i=1}^{n} \alpha_i^{-1} p_i q_i\Bigr) = \det R(t),$$
which implies the assertion about $m_n$. Let $c_k(t) = m_k(C(t))$, $r_k(t) = m_k(R(t))$, $t \in \mathbb{R}$. The functions $c_k$, $r_k$ are infinitely differentiable in some neighborhood of 0; let $\dot c_k$, $\dot r_k$, $\ddot c_k$, $\ddot r_k$ denote their first two derivatives at 0. Let
$$D = a \otimes b, \qquad S = p \otimes q,$$
and note that $D_{ii} = S_{ii}$. Since the first derivative depends only on the diagonal elements (see (6)), we have $\dot c_k = \dot r_k$.
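Furthermore, since $p_i q_j = \varepsilon_i\varepsilon_j p_j q_i$, equation (7) provides
$$\ddot c_k = \frac{1}{2} \sum_{1 \le i \ne j \le n} K^k_{ij}(a_i b_j - \varepsilon_i\varepsilon_j a_j b_i)^2 + \sum_{i,j=1}^{n} T^{k,\varepsilon}_{ij}\, \varepsilon_i D_{ii}\, \varepsilon_j D_{jj},$$
$$\ddot r_k = \sum_{i,j=1}^{n} T^{k,\varepsilon}_{ij}\, \varepsilon_i S_{ii}\, \varepsilon_j S_{jj} = \sum_{i,j=1}^{n} T^{k,\varepsilon}_{ij}\, \varepsilon_i D_{ii}\, \varepsilon_j D_{jj},$$
and thus
$$\ddot c_k - \ddot r_k = \frac{1}{2} \sum_{1 \le i \ne j \le n} K^k_{ij}(a_i b_j - \varepsilon_i\varepsilon_j a_j b_i)^2. \tag{13}$$
Let $u$ be the smallest integer such that
$$a_i = b_i = 0, \qquad u < i \le n,$$
where the case $u = n$ is not excluded. Then from the definition, also
$$p_i = q_i = 0, \qquad u < i \le n,$$
and both $R(t)$, $C(t)$ have block diagonal forms,
$$C(t) = \operatorname{diag}(\tilde A_1 + t\tilde a \otimes \tilde b, \tilde A_2), \qquad R(t) = \operatorname{diag}(\tilde A_1 + t\tilde p \otimes \tilde q, \tilde A_2),$$
where
$$\tilde A_1 = \operatorname{diag}(\alpha_1, \dots, \alpha_u), \qquad \tilde A_2 = \operatorname{diag}(\alpha_{u+1}, \dots, \alpha_n),$$
and $\tilde p, \tilde q, \tilde a, \tilde b \in \mathbb{R}^u$ are the obvious truncations of $p$, $q$, $a$, $b$, respectively. Thus the list of singular values of $R(t)$ is the union of the list of singular values of $\tilde A_1 + t\tilde p \otimes \tilde q$ with $\{\alpha_{u+1}, \dots, \alpha_n\}$ and the same holds for $C(t)$. Since the components of $\alpha$ are ordered and distinct, for $t$ sufficiently close to 0, we have, for continuity reasons,
$$\tau_i(C(t)) = \tau_i(R(t)) = \alpha_i, \qquad u < i \le n. \tag{14}$$
Let us distinguish the following two cases:
$$a_i b_u - \varepsilon_i\varepsilon_u a_u b_i = 0 \ \text{ for all } i < u, \qquad a_v b_u - \varepsilon_v\varepsilon_u a_u b_v \ne 0 \ \text{ for some } v < u. \tag{15}$$
Assume first (15)$_1$. Then from the definition of $u$ we have either $a_u \ne 0$ or $b_u \ne 0$. Assume the latter; the treatment under the former assumption is similar. Then from (15)$_1$,
$$a_i = \varepsilon_i \lambda b_i \tag{16}$$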
if $1 \le i \le u$, where $\lambda = \varepsilon_u a_u / b_u$. It is noted that since $\varepsilon_u a_u b_u \ge 0$, we have $\lambda \ge 0$. Exclude the trivial case $\lambda = 0$. Note also that (16) extends trivially to all $i$. Then
$$q_i = \sqrt{\varepsilon_i a_i b_i} = \sqrt{\lambda}\,|b_i|.$$
Thus there exists a $\sigma \in \{1, -1\}^n$ such that $q_i = \sigma_i\sqrt{\lambda}\, b_i$ and hence from (16), $p_i = \sigma_i a_i / \sqrt{\lambda}$. Thus if $J := \operatorname{diag}(\sigma)$, we have $D = JSJ$ and consequently
$$R(t) = JC(t)J, \qquad t \in \mathbb{R}.$$
The invariance of the signed singular values implies that
$$\tau_i(R(t)) = \tau_i(C(t)), \qquad 1 \le i \le n,$$
and (11) holds with equality signs for all $t \in \mathbb{R}$. Next assume that (15)$_2$ holds. Then by (12) and (13) we have
$$\ddot c_k - \ddot r_k \ge K^k_{vu}(a_v b_u - \varepsilon_v\varepsilon_u a_u b_v)^2 > 0.$$
Thus (11) holds for all $k < u$ and $t$ sufficiently close to 0 by continuity. Moreover, we have
$$\det(R(t)) = r_u(t)\,\alpha_{u+1} \cdots \alpha_n, \qquad \det(C(t)) = c_u(t)\,\alpha_{u+1} \cdots \alpha_n,$$
and thus since $\det(R(t)) = \det(C(t))$ we conclude that $r_u(t) = c_u(t)$ for all $t \in \mathbb{R}$. Finally using this and (14) we see that (11) holds with the equality sign for all $k \ge u$ and all $t \in \mathbb{R}$. □

Let $g : D \to \mathbb{R}$ be a function defined on an interval $D \subset \mathbb{R}$. We say that $t \in \mathbb{R}$ is a local subgradient of $g$ at $x$ if there exists an $\varepsilon > 0$ such that $g(y) - g(x) \ge t(y - x)$ for all $y \in D$ such that $|x - y| < \varepsilon$.

LEMMA 5 ([22, Proposition A.1]). Let $g : D \to \mathbb{R}$ be a continuous function on an interval $D \subset \mathbb{R}$ which has a local subgradient $t(x)$ at each $x \in D$. Then $g$ is convex.

The conclusion does not hold without the continuity hypothesis: consider, e.g., $g : \mathbb{R} \to \mathbb{R}$ given by $g(x) = 1$ if $x < 0$ and $g(x) = 0$ if $x \ge 0$.

3. Invariant Rank 1 Convex Functions

The main result of the paper is

THEOREM 6. An invariant $f : M^{n\times n} \to \mathbb{R}$ is rank 1 convex if and only if it satisfies the following two conditions:
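(i) if $\alpha, \beta \in G^n$ satisfy
$$\prod_{i=1}^{k} \alpha_i \le \prod_{i=1}^{k} \beta_i, \quad k = 1, \dots, n - 1, \qquad \text{and} \qquad \prod_{i=1}^{n} \alpha_i = \prod_{i=1}^{n} \beta_i,$$
then $\tilde f(\alpha) \le \tilde f(\beta)$;
(ii) if $\alpha, \beta \in G^n$ satisfy the BIL and $\gamma \in G^n$, $\varepsilon \in \{1, -1\}^n$, $t \in [0, 1]$ satisfy $\varepsilon(\beta - \alpha) \ge 0$ and
$$S(\varepsilon\gamma) = (1 - t)S(\varepsilon\alpha) + tS(\varepsilon\beta) \tag{17}$$
then
$$\tilde f(\gamma) \le (1 - t)\tilde f(\alpha) + t\tilde f(\beta). \tag{18}$$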
It is not clear whether the assertion holds for functions with values in $\mathbb{R} \cup \{\infty\}$. As explained in [24], condition (i) is equivalent to the Baker–Ericksen inequalities if $f$ is of class $C^2$. Item (ii) shows that the rank 1 convexity in $M^{n\times n}$ translates into the representation space $G^n$ as follows. The rank 1 line segments are replaced by curves with endpoints satisfying the BIL on which the elementary symmetric functions (composed with the sign matrix $\varepsilon$) behave like affine functions. For $f$ of class $C^1$ Theorem 6 can be deduced from [21, Section 7].

Proof. Let $f$ be rank 1 convex. (i) follows from [24, Theorem 5.4]. (ii) Let $A = \operatorname{diag}(\alpha)$ and let $B = A + p \otimes q$ be the rank 1 perturbation of $A$ with the signed singular values $\beta$ as described in Proposition 1. Consider $C = (1 - t)A + tB$ and denote by $\gamma$ the signed singular values of $C$. Then by Proposition 1(iii), $\gamma$ satisfies (17) and by the well known uniqueness property of elementary symmetric functions, this realization of $\gamma$ is the only way to satisfy (17). The rank 1 convexity inequality for $f$ and $A$, $B$, $C$ reduces to (18).

Conversely, assume that $f$ satisfies conditions (i) and (ii). Let us first show that $\tilde f$ is separately convex. It suffices to show that $h := \tilde f(\cdot, \delta)$ is convex on $\mathbb{R}$ for each $\delta = (\delta_2, \dots, \delta_n) \in G^{n-1}$. For each $p \in \mathbb{R}$ let $\xi(p) \in G^n$ be the signed singular values of $\operatorname{diag}(p, \delta)$. Assume first that $\delta_n > 0$ and prove that $h$ is convex on $[-\delta_n, \infty)$. Let $a, b, c \in [-\delta_n, \infty)$, $t \in [0, 1]$ satisfy $c = (1 - t)a + tb$ and set
$$\alpha = \xi(a), \qquad \beta = \xi(b), \qquad \gamma = \xi(c).$$
Since $\operatorname{diag}(b, \delta)$ is a rank 1 perturbation of $\operatorname{diag}(a, \delta)$ we see that $\alpha$, $\beta$ satisfy the BIL. Assume without any loss of generality that $a < b$, let $\varepsilon = (1, \dots, 1)$ and show that $\varepsilon(\beta - \alpha) \ge 0$. Indeed, using $b > a \ge -\delta_n$ one finds that $\alpha$, $\beta$, $\gamma$ are of the form
$$\alpha = (\delta_2, \dots, \delta_k, a, \delta_{k+1}, \dots, \delta_n), \qquad \beta = (\delta_2, \dots, \delta_l, b, \delta_{l+1}, \dots, \delta_n), \tag{19}$$
$$\gamma = (\delta_2, \dots, \delta_m, c, \delta_{m+1}, \dots, \delta_n), \tag{20}$$
where $k \ge m \ge l$ and the cases when $k$ or $m$ or $l$ is equal to $n$ are not excluded. From (19), (20),
$$S(\alpha) = S(a, \delta), \qquad S(\beta) = S(b, \delta), \qquad S(\gamma) = S(c, \delta),$$
and since the elementary symmetric functions are separately affine (affine in each variable), we find that (17) holds. Hence $\tilde f(\cdot, \delta)$ is convex on $[-\delta_n, \infty)$. Similar considerations show that $\tilde f(\cdot, \delta)$ is convex on $(-\infty, \delta_n]$. Since the overlap of $(-\infty, \delta_n]$ and $[-\delta_n, \infty)$ has a nonempty interior, it follows that $\tilde f(\cdot, \delta)$ is convex on $\mathbb{R}$. The same considerations apply to $\delta_n < 0$. Finally assume that $\delta_n = 0$ and let $u$ be the largest integer such that $\delta_u > 0$. The above considerations can be modified to show that $\tilde f(\cdot, \delta)$ is convex on $(-\infty, 0]$ and $[0, \infty)$. The application of (i) gives that $\tilde f(\cdot, \delta)$ is nondecreasing on $[0, \infty)$ and as $\delta_n = 0$, we have $\tilde f(-p, \delta) = \tilde f(p, \delta)$ for each $p \in \mathbb{R}$ by the even nature of $\tilde f$. Thus $\tilde f(\cdot, \delta)$ is symmetric about 0, and nondecreasing and convex on $[0, \infty)$. It follows that $\tilde f(\cdot, \delta)$ is convex on $\mathbb{R}$. To summarize, $\tilde f$ is separately convex on $\mathbb{R}^n$ and hence locally Lipschitz continuous by [14, Theorem 4.4.1, p. 112]. Since $\tilde f$ is continuous and the signed singular values are Lipschitz continuous (this can be deduced from the Lipschitz continuity of the ordinary singular values, see [4, Section 6]), $f$ is continuous.

Next let us show that conditions (i), (ii) imply the rank 1 convexity. Let $\bar A \in M^{n\times n}$, $\bar a, \bar b \in \mathbb{R}^n$ and $\varphi(t) := f(\bar A + t\bar a \otimes \bar b)$, $t \in \mathbb{R}$; we have to prove that $\varphi$ is convex. This will be done by showing that $\varphi$ has a local subgradient at each $t$. Assume first that $\bar A$, $\bar a$, $\bar b$ are such that $\bar A + t\bar a \otimes \bar b$ is nondegenerate for each $t \in \mathbb{R}$. Thus let $t$ be fixed, denote by $\alpha$ the signed singular values of $\bar A + t\bar a \otimes \bar b$ so that $\bar A + t\bar a \otimes \bar b = Q\operatorname{diag}(\alpha)R^T$ for some $Q, R \in SO(n)$. Let $a := Q^T\bar a$, $b := R^T\bar b$ so that $\varphi(s) = f(\operatorname{diag}(\alpha) + sa \otimes b)$, $s \in \mathbb{R}$. Let $p, q \in \mathbb{R}^n$, $\varepsilon \in \{1, -1\}^n$ be as in Lemma 4, let $\bar\varphi(s) := f(\operatorname{diag}(\alpha) + sp \otimes q)$, and let $\delta(s)$ be the signed singular values of $\operatorname{diag}(\alpha) + sp \otimes q$, $s \in \mathbb{R}$. Then, because of the special choice of $p$, $q$, the function $s \mapsto S(\varepsilon\delta(s))$ is affine as the proof of Proposition 1 shows. An appeal to (ii) implies that $\bar\varphi$ is convex, and since it is continuous, it has a subgradient $q$ at $t$. Combining Lemma 4 with (i), we obtain that $\varphi(s) \ge \bar\varphi(s)$ for all $s$ sufficiently close to $t$. As $\varphi(t) = \bar\varphi(t)$ we conclude that $q$ is a local subgradient of $\varphi$ at $t$. A reference to Lemma 5 implies that $\varphi$ is convex. Thus the conclusion follows under the additional assumption of nondegeneracy. Lemma 2 then extends the conclusion to a general situation. □

Note that in dimension $n = 2$, conditions (i), (ii) of Theorem 6 can be joined into one and the requirement that $f$ be finite-valued can be removed:

THEOREM 7 ([25, Theorem 5.3]). Let $f : M^{2\times 2} \to \mathbb{R} \cup \{\infty\}$ be invariant. Then $f$ is rank 1 convex if and only if
$$\tilde f(\gamma) \le (1 - t)\tilde f(\alpha) + t\tilde f(\beta)$$
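for every $\alpha, \beta, \gamma \in G^2$ and $t \in [0, 1]$ such that $\alpha$, $\beta$ satisfy the BIL and
$$\gamma_1\gamma_2 = (1 - t)\alpha_1\alpha_2 + t\beta_1\beta_2, \qquad \gamma_1 + \varepsilon\gamma_2 \le (1 - t)(\alpha_1 + \varepsilon\alpha_2) + t(\beta_1 + \varepsilon\beta_2),$$
where
$$\varepsilon = \begin{cases} +1 & \text{if } (\alpha_1 - \beta_1)(\alpha_2 - \beta_2) \ge 0, \\ -1 & \text{if } (\alpha_1 - \beta_1)(\alpha_2 - \beta_2) < 0. \end{cases}$$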
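As a quick consistency check of Theorem 7 (an illustration added here, not taken from [25]), consider $f(A) = \det A$ on $M^{2\times 2}$. Since $\det(A + ta \otimes b)$ is affine in $t$, this $f$ is rank 1 convex, and its representation is $\tilde f(\tau) = \tau_1\tau_2$. For any admissible $\alpha$, $\beta$, $\gamma$, $t$ the product constraint alone then gives

```latex
\tilde f(\gamma) = \gamma_1\gamma_2
  = (1 - t)\,\alpha_1\alpha_2 + t\,\beta_1\beta_2
  = (1 - t)\,\tilde f(\alpha) + t\,\tilde f(\beta),
```

so the inequality of the theorem holds, with equality, as it must.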
References

1. G. Aubert, Necessary and sufficient conditions for isotropic rank-one convex functions in dimension 2. J. Elasticity 39 (1995) 31–46.
2. G. Aubert and R. Tahraoui, Sur la faible fermeture de certains ensembles de contraintes en élasticité non linéaire plane. Arch. Rational Mech. Anal. 97 (1987) 33–58.
3. J.M. Ball, Convexity conditions and existence theorems in nonlinear elasticity. Arch. Rational Mech. Anal. 63 (1977) 337–403.
4. J.M. Ball, Differentiability properties of symmetric and isotropic functions. Duke Math. J. 51 (1984) 699–728.
5. J.M. Ball and R.D. James, Fine phase mixtures as minimizers of energy. Arch. Rational Mech. Anal. 100 (1987) 13–52.
6. J.M. Ball and R.D. James, Proposed experimental tests of a theory of fine microstructure and the two-well problem. Philos. Trans. Roy. Soc. London 338 (1992) 389–450.
7. M. Chipot and D. Kinderlehrer, Equilibrium configurations of crystals. Arch. Rational Mech. Anal. 103 (1988) 237–277.
8. B. Dacorogna, Direct Methods in the Calculus of Variations. Springer, Berlin (1989).
9. B. Dacorogna, Necessary and sufficient conditions for strong ellipticity of isotropic functions in any dimension. Discrete Contin. Dyn. Syst. B2 (2001) 257–263.
10. B. Dacorogna and H. Koshigoe, On the different notions of convexity for rotationally invariant functions. Ann. Fac. Sci. Toulouse II (1993) 163–184.
11. B. Dacorogna and P. Marcellini, Implicit Partial Differential Equations. Birkhäuser, Basel (1999).
12. J.K. Knowles and E. Sternberg, On the failure of ellipticity of the equations for finite elastostatic plane strain. Arch. Rational Mech. Anal. 63 (1977) 321–326.
13. R.V. Kohn and G. Strang, Optimal design and relaxation of variational problems, I, II, III. Comm. Pure Appl. Math. 39 (1986) 113–137, 139–182, 353–377.
14. C.B. Morrey Jr, Multiple Integrals in the Calculus of Variations. Springer, New York (1966).
15. S. Müller, Variational models for microstructure and phase transitions. In: Calculus of Variations and Geometric Evolution Problems (Cetraro, 1996), Lecture Notes in Math. 1713. Springer, Berlin (1999) pp. 85–210.
16. P. Pedregal, Parametrized Measures and Variational Principles. Birkhäuser, Basel (1997).
17. P. Rosakis, Characterization of convex isotropic functions. J. Elasticity 49 (1997) 257–267.
18. T. Roubíček, Relaxation in Optimization Theory and Variational Calculus. W. de Gruyter, Berlin (1997).
19. M. Šilhavý, The Mechanics and Thermodynamics of Continuous Media. Springer, Berlin (1997).
20. M. Šilhavý, Convexity conditions for rotationally invariant functions in two dimensions. In: A. Sequeira et al. (eds), Applied Nonlinear Analysis. Kluwer Academic Publishers, New York (1999) pp. 513–530.
21. M. Šilhavý, On isotropic rank 1 convex functions. Proc. Roy. Soc. Edinburgh A 129 (1999) 1081–1105.
22. M. Šilhavý, Rotationally invariant rank 1 convex functions. Appl. Math. Optim. 44 (2001) 1–15.
23. M. Šilhavý, Rank 1 convex hulls of isotropic functions in dimension 2 by 2. Math. Bohem. 126 (2001) 521–529.
24. M. Šilhavý, Monotonicity of rotationally invariant convex and rank 1 convex functions. Proc. Roy. Soc. Edinburgh A 132 (2001) 419–435.
25. M. Šilhavý, On the semiconvexity properties of rotationally invariant functions in two dimensions. (2001) To be published.
26. M. Šilhavý, Rank 1 convex hulls of rotationally invariant functions. In: Ch. Miehe (ed.), Proceedings of the IUTAM Symposium on Computational Mechanics of Solid Materials at Large Strains. Kluwer Academic Publishers, New York (2003) pp. 87–98. In press.
27. H.C. Simpson and S. Spector, On copositive matrices and strong ellipticity for isotropic elastic materials. Arch. Rational Mech. Anal. 84 (1983) 55–68.
28. C. Truesdell and W. Noll, The non-linear field theories of mechanics. In: S. Fluegge (ed.), Handbuch der Physik III/3. Springer, Berlin (1965).
On Thermodynamics of Nonlinear Poroelastic Materials

K. WILMAŃSKI
Weierstrass Institute for Applied Analysis and Stochastics, Berlin, Germany. E-mail: [email protected]

Received 30 November 2002

Abstract. The paper contains a brief presentation of a macroscopic thermodynamic model of poroelastic materials with many fluid components. Particular emphasis is placed on a Lagrangian formulation of the model and, consequently, on a consistent formulation of field equations on the reference configuration of the skeleton (the solid phase of the mixture). It is demonstrated that the model has the same structure as that in the pioneering work of C.A. Truesdell on the continuum mixture of fluids. The issue of porosity as an additional microstructural variable is given particular attention.

Mathematics Subject Classifications (2000): 74A15, 74E30, 74L05.

Key words: continuum thermodynamics, porous media, mixtures.

In memoriam of Prof. Clifford Ambrose Truesdell whose work in continuum mechanics created new standards of research in field theories
C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 763–777. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.

1. Introduction

The classical continuum theory of mixtures, whose development began in 1957 with the famous papers of Truesdell [1], is primarily designed to cover systems of many fluid components. In 1982 Bowen [2] (see also [3]) extended this classical theory to mixtures in which one component is a solid. This put theories of porous materials on the same footing as mixtures of fluids. During the last twenty years this field of research has developed rapidly and has in the meantime stimulated studies of such systems as suspensions, mixtures of granular materials saturated or not saturated with a fluid, and many others.

In spite of this development there are still some controversies concerning the construction of nonlinear models in which large deformations of the skeleton are incorporated. This is related to the fact that, in contrast to mixtures of fluids, a solid component (skeleton) lends itself naturally to a Lagrangian description of the system. Bowen used in his papers a mixed description (Lagrangian for the solid skeleton and Eulerian for the fluid), but such an approach leads to technical difficulties in applications of the model, in the formulation of boundary conditions, etc. For this reason I proposed in 1995 a different description of two-component porous materials [4]. This may be extended to many components, and first results for multicomponent porous materials have been published in [5].

In this work we present the full structure of a Lagrangian model of a poroelastic material in which there may be more than one fluid component and the kinematics of the skeleton is formulated in the Lagrangian way. In Section 2 we define the Lagrangian description of multicomponent systems and introduce various kinematical quantities analogous to those appearing in Truesdell's theory of fluid mixtures. In Section 3 we present partial balance equations in the Lagrangian description in their global and local form. It is emphasized that, in contrast to such balance equations for single continua, they contain convective contributions whose form is objective. We also present a balance equation for the microstructural field of porosity and justify its macroscopic form on phenomenological grounds. This extension of the microstructural model has been proposed in papers [6, 7]. Section 4 contains a discussion of the thermodynamic admissibility of constitutive relations for poroelastic materials with ideal fluid components. The whole development is fully macroscopic, in contrast to many other works on this subject which are based on the notion of so-called true (real) densities. These may be introduced in the present model if needed at any stage of development, but they are not necessary for the formulation of a consistent mathematical model. In order to be more specific we limit attention solely to isotropic systems. Section 5 is devoted to the specification of some special models which have an important practical bearing. In particular we discuss the simplest model of a two-component poroelastic material. In the Conclusions we indicate advantages of the Lagrangian description both for theoretical development and for the numerical evaluation of boundary value problems.
2. Porous Medium as a Mixture. Reference Configurations, Lagrangian Description

The construction of the theory of mixtures of fluids proposed by Truesdell [1] is based on the Eulerian description of motion of components. As a continuum model it is based on the assumption that at each point of the space of configurations $\mathbb{R}^3$ all components are present simultaneously. Their various contributions are characterized by different concentrations (fractions of partial mass densities to the total mass density) as well as by their own velocity fields.

A model of porous materials requires an extension of this approach. On the one hand it must account for large deformations of a solid component of the mixture which describes the behaviour of the skeleton of the porous medium. This indicates the necessity of the Lagrangian description, which has been employed in part (solely for the solid component) by Bowen [2]. On the other hand the description of the microstructure must be extended, as its properties are described not only by concentrations but also by a volume fraction of voids called porosity. This is particularly visible when the porous material consists only of the solid component, i.e., the mass densities of fluid components are all identically zero. Then the concentrations are also zero but the microstructure is not trivial. This additional field requires an additional equation, and in the above mentioned paper Bowen proposed an evolution equation describing its relaxation properties. An alternative approach was proposed earlier for granular materials by Goodman and Cowin [8]. In their paper the authors proposed a second order equation for the microstructural behaviour. Such an approach, related to the so-called principle of self-equilibrated forces, has been modified by Hutter and Svendsen [9] and is applied in the description of avalanches with abrasion [10]. In this paper we rely on a balance equation for porosity introduced in my own works [6, 7].

We consider a porous medium whose channels are filled with a mixture of $A$ fluid components. The model is constructed on a chosen reference configuration $\mathcal{B}_0$ of the solid component, i.e., all fields are functions of a spatial variable $\mathbf{X} \in \mathcal{B}_0$ and time $t \in \mathcal{T}$. We consider a thermomechanical model in which the governing fields are as follows:
1. $\rho^S$: mass density of the skeleton in the reference configuration,
2. $\rho^\alpha$, $\alpha = 1, \dots, A$: partial mass densities of fluid components referring to the unit volume of the reference configuration of the skeleton,
3. $\acute{\mathbf{x}}^S$: velocity field of the skeleton,
4. $\mathbf{F}^S$: deformation gradient of the skeleton,
5. $\acute{\mathbf{x}}^\alpha$, $\alpha = 1, \dots, A$: velocity fields of fluid components,
6. $\theta^S$: absolute temperature of the skeleton,
7. $\theta^\alpha$, $\alpha = 1, \dots, A$: absolute temperatures of fluid components,
8. $n$: porosity (the volume fraction of voids).

Further in this work we assume that the temperatures of the components are the same:
$$\theta = \theta^S = \theta^1 = \cdots = \theta^A. \tag{1}$$
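From the thermodynamic point of view little has been done for continuum theories of mixtures in which this condition is not satisfied (e.g., [11]). Some semi-kinetic models have been proposed for ionized gases (plasma; e.g., [12]).

The above fields are related to their Eulerian counterparts in the following way:
$$\begin{aligned}
\rho_t^S(\mathbf{x}, t) &:= \rho^S\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr)\, J^{S-1}\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad J^S := \det \mathbf{F}^S, \\
\rho_t^\alpha(\mathbf{x}, t) &:= \rho^\alpha\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr)\, J^{S-1}\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad \alpha = 1, \dots, A, \\
\mathbf{v}^S(\mathbf{x}, t) &:= \acute{\mathbf{x}}^S\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \\
\mathbf{v}^\alpha(\mathbf{x}, t) &:= \acute{\mathbf{x}}^\alpha\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr), \qquad \alpha = 1, \dots, A, \\
n(\mathbf{x}, t) &:= n\bigl(\mathbf{f}^{-1}(\mathbf{x}, t), t\bigr),
\end{aligned} \tag{2}$$
where the function of motion of the skeleton
$$\mathbf{x} = \mathbf{f}(\mathbf{X}, t), \tag{3}$$
is assumed to be at least twice continuously differentiable almost everywhere, i.e.,
$$\acute{\mathbf{x}}^S = \frac{\partial \mathbf{f}}{\partial t}, \qquad \mathbf{F}^S = \operatorname{Grad} \mathbf{f}. \tag{4}$$
Hence the fields $\acute{\mathbf{x}}^S$, $\mathbf{F}^S$ must satisfy the following integrability conditions:
$$\frac{\partial \mathbf{F}^S}{\partial t} = \operatorname{Grad} \acute{\mathbf{x}}^S, \qquad \operatorname{Grad} \mathbf{F}^S = \bigl(\operatorname{Grad} \mathbf{F}^S\bigr)^{T}. \tag{5}$$
The reference configuration $\mathcal{B}_0$ is chosen in such a way that it is identical with a configuration at the instant of time $t = t_0$ for which
$$\forall \mathbf{X} \in \mathcal{B}_0: \quad \mathbf{F}^S(\mathbf{X}, t_0) = \mathbf{1}. \tag{6}$$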
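The relation (2) between densities referred to the reference configuration of the skeleton and their Eulerian counterparts amounts to a division by $J^S = \det \mathbf{F}^S$. A minimal sketch for a homogeneous deformation, assuming NumPy, is given here; the function name `eulerian_density`, the rejection of $J^S \le 0$ and the sample numbers are illustrative assumptions.

```python
import numpy as np

def eulerian_density(rho_ref, F_S):
    """Current partial mass density rho_t = rho / J^S with J^S = det F^S, cf. (2)."""
    J = np.linalg.det(F_S)
    if J <= 0:
        # Assumption of this sketch: only orientation-preserving deformations are admitted
        raise ValueError("det F^S must be positive")
    return rho_ref / J

F_S = np.diag([1.1, 0.9, 1.0])           # sample skeleton deformation gradient
rho_S_t = eulerian_density(2000.0, F_S)  # skeleton
rho_f_t = eulerian_density(300.0, F_S)   # one fluid component
```

Both the skeleton and the fluid densities are scaled by the same factor because all fields are referred to the reference configuration of the skeleton.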
This choice of reference configuration is convenient for systems in which the solid component forms a skeleton whose topology does not change during the motion. This is the case in the modelling of rocks; it may or may not be the case for granular materials; and it is certainly not the case for suspensions of solid particles, which appear, for instance, after liquefaction of a compact granular material. For the above fields, field equations follow from general balance equations, which we discuss in the next section.

3. Balance Equations

We skip here the axiomatic foundations for the integral representation of a general balance law. These may be found in Truesdell's book [13], which after more than 30 years is still the most important reference on this subject. The general form of this equation for a density $\varphi(\mathbf{X}, t)$, written for an arbitrary domain $\mathcal{P}(t)$ whose motion is described by a velocity field $\mathbf{V}(\mathbf{X}, t)$, is as follows:
$$\frac{d}{dt} \int_{\mathcal{P}(t)} \varphi(\mathbf{X}, t)\, dV = \oint_{\partial\mathcal{P}(t)} \boldsymbol{\Phi}(\mathbf{X}, t) \cdot \mathbf{N}\, dS + \int_{\mathcal{P}(t)} \gamma(\mathbf{X}, t)\, dV, \tag{7}$$
where $\boldsymbol{\Phi}$ is the so-called flux of $\varphi$, and $\gamma$ is its volume supply. The first integral on the right-hand side is evaluated over the closed surface $\partial\mathcal{P}$ of the domain $\mathcal{P}$ and describes the transport through the surface. $\mathbf{N}$ is the field of unit outward normals to the surface. If we perform the differentiation on the left-hand side and apply the Stokes theorem, we obtain
$$\int_{\mathcal{P}(t)} \Bigl[\frac{\partial\varphi}{\partial t} + \operatorname{Div}(\varphi\mathbf{V} - \boldsymbol{\Phi}) - \gamma\Bigr] dV = 0. \tag{8}$$
We apply this relation to the partial quantities listed in the previous section. In order to do so we have to find the kinematics of material domains for each component related to the reference configuration $\mathcal{B}_0$. Obviously for material domains with respect to the skeleton we have $\mathbf{V} \equiv \mathbf{0}$. For fluid components we have to use the
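assumption on the simultaneous appearance of all components in each point of the domain $\mathcal{B}_t := \mathbf{f}(\mathcal{B}_0, t)$ in the configuration space. For the $\alpha$-component we then have along the trajectory
$$\forall \mathbf{x} \in \mathcal{B}_t,\ \forall \mathbf{x}' \in \mathcal{N}(\mathbf{x}) \subset \mathcal{B}_t: \quad \mathbf{x}' = \mathbf{x} + \acute{\mathbf{x}}^\alpha \Delta t + O(\Delta t^2) = \mathbf{x} + \mathbf{F}^S\bigl(\mathbf{f}^{-1}(\mathbf{x}', t) - \mathbf{f}^{-1}(\mathbf{x}, t)\bigr) + \acute{\mathbf{x}}^S \Delta t + O(|\mathbf{x}' - \mathbf{x}|^2),$$
where $\mathcal{N}(\mathbf{x})$ is a neighbourhood of $\mathbf{x}$. The limit $\Delta t \to 0$ in this relation yields the following velocity field for material domains of the $\alpha$-component in the reference configuration of the skeleton:
$$\forall \mathbf{X} \in \mathcal{B}_0: \quad \acute{\mathbf{X}}^\alpha(\mathbf{X}, t) := \lim_{\Delta t \to 0} \frac{\mathbf{f}^{-1}(\mathbf{x}', t) - \mathbf{f}^{-1}(\mathbf{x}, t)}{\Delta t} = \mathbf{F}^{S-1}(\mathbf{X}, t)\bigl(\acute{\mathbf{x}}^\alpha(\mathbf{X}, t) - \acute{\mathbf{x}}^S(\mathbf{X}, t)\bigr). \tag{9}$$
We call this field the Lagrangian velocity of the $\alpha$-component. Assuming that the balance equation (8) for a partial quantity $\varphi^\alpha$ holds true for any material domain of the $\alpha$-component, we obtain in the standard way the following local form of this equation:
$$\frac{\partial\varphi^\alpha}{\partial t} + \operatorname{Div}\bigl(\varphi^\alpha \acute{\mathbf{X}}^\alpha - \boldsymbol{\Phi}^\alpha\bigr) = \gamma^\alpha \quad \text{a.e. in } \mathcal{B}_0. \tag{10}$$
Obviously $\boldsymbol{\Phi}^\alpha$ denotes the corresponding partial flux, and $\gamma^\alpha$ is the partial volume supply. In particular we have:
• partial mass balance equations
$$\frac{\partial\rho^S}{\partial t} = \hat\rho^S, \qquad \frac{\partial\rho^\alpha}{\partial t} + \operatorname{Div}\bigl(\rho^\alpha \acute{\mathbf{X}}^\alpha\bigr) = \hat\rho^\alpha, \qquad \alpha = 1, \dots, A, \tag{11}$$
• partial momentum balance equations
$$\begin{aligned}
\frac{\partial(\rho^S \acute{\mathbf{x}}^S)}{\partial t} &= \operatorname{Div}\mathbf{P}^S + \hat{\mathbf{p}}^S + \rho^S \mathbf{b}^S, \\
\frac{\partial(\rho^\alpha \acute{\mathbf{x}}^\alpha)}{\partial t} + \operatorname{Div}\bigl(\rho^\alpha \acute{\mathbf{x}}^\alpha \otimes \acute{\mathbf{X}}^\alpha\bigr) &= \operatorname{Div}\mathbf{P}^\alpha + \hat{\mathbf{p}}^\alpha + \rho^\alpha \mathbf{b}^\alpha, \qquad \alpha = 1, \dots, A,
\end{aligned} \tag{12}$$
• partial energy balance equations
$$\begin{aligned}
\frac{\partial}{\partial t}\Bigl[\rho^S\bigl(\varepsilon^S + \tfrac{1}{2}\acute{x}^{S2}\bigr)\Bigr] &= \operatorname{Div}\bigl(\mathbf{Q}^S - \mathbf{P}^{ST}\acute{\mathbf{x}}^S\bigr) + \rho^S \mathbf{b}^S \cdot \acute{\mathbf{x}}^S + \rho^S r^S + \hat r^S, \\
\frac{\partial}{\partial t}\Bigl[\rho^\alpha\bigl(\varepsilon^\alpha + \tfrac{1}{2}\acute{x}^{\alpha 2}\bigr)\Bigr] + \operatorname{Div}\Bigl[\rho^\alpha\bigl(\varepsilon^\alpha + \tfrac{1}{2}\acute{x}^{\alpha 2}\bigr)\acute{\mathbf{X}}^\alpha\Bigr] &= \operatorname{Div}\bigl(\mathbf{Q}^\alpha - \mathbf{P}^{\alpha T}\acute{\mathbf{x}}^\alpha\bigr) + \rho^\alpha \mathbf{b}^\alpha \cdot \acute{\mathbf{x}}^\alpha + \rho^\alpha r^\alpha + \hat r^\alpha, \qquad \alpha = 1, \dots, A,
\end{aligned} \tag{13}$$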
• balance equation of porosity ∂n = −DivJ + n. ˆ (14) ∂t In these equations, all functions are defined on the reference configuration B0 of the skeleton. In this sense we may call it the Lagrangian description even though partial balance equations for fluid components contain convective parts with respect to the corresponding Lagrangian velocities. The two-point tensors PS , Pα denote the Piola–Kirchhoff partial stress tensors, S b , bα are partial body forces, ε S , ε α are partial densities of the internal energy, QS , Qα – partial heat fluxes, r S , r α are partial energy radiations, J is the flux of porosity, and all quantities with a hat denote productions. The balance equation of porosity requires some justification. We have argued in previous works on this subject (e.g., [6, 7]) that the balance equation for n follows from an averaging procedure for a representative elementary volume accounting for geometrical properties of the microstructure. However this argument is not needed if we make an extension of the continuous model of mixtures on the macroscopical phenomenological level. In such a case a new scalar field satisfies in the most general case a balance equation. Second order equations for microstructural variables appearing in some works on this subject indicate that most likely two variables rather than one additional microstructural variable should be introduced and one of them has to be eliminated from the model by substitution of one balance equation in another. The most important question which must be answered in a model with an additional balance law is if such a model can be mathematically well-posed – in particular in relation to additional boundary conditions which may be necessary. The most prominent example for those difficulties appears within the extended thermodynamics (e.g., [17]) where the extension of number of fields and, consequently, an extension of the hierarchy of field equations yields unsolved problems of boundary conditions. 
Fortunately the above balance equation for porosity, specified for two-component poroelastic materials, does not require additional boundary conditions: it possesses all properties of an evolution equation. As we shall see, further thermodynamic considerations indicate that the flux $\mathbf{J}$ results from the diffusion (relative motion of fluid components with respect to the skeleton), and the source $\hat{n}$ describes relaxation to the thermodynamic equilibrium as well as equilibrium changes of porosity $\partial n_E/\partial t$.

We make an assumption similar to the one introduced by Truesdell for mixtures of fluids [1] that the bulk productions of mass, momentum, and energy vanish, i.e., the corresponding balance equations reduce to conservation laws. Hence
$$\hat{\rho}^S + \sum_{\alpha=1}^{A}\hat{\rho}^\alpha = 0, \qquad \hat{\mathbf{p}}^S + \sum_{\alpha=1}^{A}\hat{\mathbf{p}}^\alpha = 0, \qquad \hat{r}^S + \sum_{\alpha=1}^{A}\hat{r}^\alpha = 0. \tag{15}$$
Under these conditions we can introduce bulk quantities which correspond to those introduced by Truesdell for fluid mixtures that satisfy conservation laws of
a single component continuum. Because we have chosen one of the components, the skeleton, as the reference, the form of these laws differs from the classical Lagrangian form of conservation equations of a single continuum. Namely, by addition of the partial mass balance equations (11) we obtain
$$\frac{\partial\rho}{\partial t} + \operatorname{Div}\bigl(\rho\dot{\mathbf{X}}\bigr) = 0, \qquad \rho := \rho^S + \sum_{\alpha=1}^{A}\rho^\alpha, \qquad \rho\dot{\mathbf{X}} := \sum_{\alpha=1}^{A}\rho^\alpha\acute{\mathbf{X}}^\alpha. \tag{16}$$
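As a minimal numerical sketch of the definitions (16), the following Python fragment evaluates the bulk mass density $\rho$ and the Lagrangian mean velocity $\dot{\mathbf{X}}$ at a single material point. The function name `bulk_density_and_mean_velocity` and all numerical values are assumptions chosen for illustration only; they do not come from the text.

```python
def bulk_density_and_mean_velocity(rho_s, rho_fluids, X_fluids):
    """Evaluate the definitions (16) at one point of the reference configuration.

    rho_s      -- partial mass density of the skeleton
    rho_fluids -- list of partial mass densities of the A fluid components
    X_fluids   -- list of Lagrangian velocities (3 components each) of the fluids
    """
    # Bulk density: skeleton plus all fluid components.
    rho = rho_s + sum(rho_fluids)
    # Only the fluid components enter the sum defining rho * X_dot in (16).
    momentum = [sum(r * X[i] for r, X in zip(rho_fluids, X_fluids))
                for i in range(3)]
    X_dot = [m / rho for m in momentum]
    return rho, X_dot


# Hypothetical data: one skeleton and two fluid components.
rho, X_dot = bulk_density_and_mean_velocity(
    2000.0,
    [300.0, 200.0],
    [(0.01, 0.0, 0.0), (0.0, 0.02, 0.0)],
)
print(rho, X_dot)
```

The mean velocity is a mass-weighted average of the fluid Lagrangian velocities, so it vanishes whenever no fluid component moves relative to the skeleton.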
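Hence for the single component bulk description we have to identify in relation (8): $\mathbf{V} \equiv \dot{\mathbf{X}}$. This Lagrangian mean velocity takes over the role of the barycentric velocity of the classical mixture theory. However, in contrast to the Eulerian description, the Lagrangian mean velocity is relative, i.e., similarly to the Lagrangian velocities $\acute{\mathbf{X}}^\alpha$ it is objective. The above definition yields the following conservation laws:
• momentum
$$\begin{aligned}
&\frac{\partial(\rho\dot{\mathbf{x}})}{\partial t} + \operatorname{Div}\bigl(\rho\dot{\mathbf{x}}\otimes\dot{\mathbf{X}} - \mathbf{P}\bigr) = \rho\mathbf{b},\\
&\rho\dot{\mathbf{x}} := \rho^S\acute{\mathbf{x}}^S + \sum_{\alpha=1}^{A}\rho^\alpha\acute{\mathbf{x}}^\alpha, \qquad \rho\mathbf{b} := \rho^S\mathbf{b}^S + \sum_{\alpha=1}^{A}\rho^\alpha\mathbf{b}^\alpha,
\end{aligned} \tag{17}$$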
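and the bulk Piola–Kirchhoff stress tensor $\mathbf{P}$ is defined by the relation
$$\mathbf{P} := \mathbf{P}_I - \mathbf{F}^S\Bigl(\rho^S\dot{\mathbf{X}}\otimes\dot{\mathbf{X}} + \sum_{\alpha=1}^{A}\rho^\alpha\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\otimes\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\Bigr), \qquad \mathbf{P}_I := \mathbf{P}^S + \sum_{\alpha=1}^{A}\mathbf{P}^\alpha, \tag{18}$$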
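• energy
$$\frac{\partial}{\partial t}\Bigl[\rho\Bigl(\varepsilon + \tfrac{1}{2}\dot{x}^2\Bigr)\Bigr] + \operatorname{Div}\Bigl[\rho\Bigl(\varepsilon + \tfrac{1}{2}\dot{x}^2\Bigr)\dot{\mathbf{X}} + \mathbf{Q} - \mathbf{P}^T\dot{\mathbf{x}}\Bigr] = \rho\mathbf{b}\cdot\dot{\mathbf{x}} + \rho r, \tag{19}$$
where the bulk internal energy density is defined as follows
$$\begin{aligned}
&\rho\varepsilon := \rho\varepsilon_I + \frac{1}{2}\Bigl[\rho^S\mathbf{C}^S\cdot\bigl(\dot{\mathbf{X}}\otimes\dot{\mathbf{X}}\bigr) + \sum_{\alpha=1}^{A}\rho^\alpha\mathbf{C}^S\cdot\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\otimes\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\Bigr],\\
&\rho\varepsilon_I := \rho^S\varepsilon^S + \sum_{\alpha=1}^{A}\rho^\alpha\varepsilon^\alpha, \qquad \mathbf{C}^S := \mathbf{F}^{ST}\mathbf{F}^S,
\end{aligned} \tag{20}$$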
and the bulk heat flux has the form
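$$\begin{aligned}
&\mathbf{Q} = \mathbf{Q}_I + \frac{1}{2}\Bigl[-\rho^S\dot{\mathbf{X}}\otimes\dot{\mathbf{X}}\otimes\dot{\mathbf{X}} + \sum_{\alpha=1}^{A}\rho^\alpha\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\otimes\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\otimes\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr)\Bigr]\mathbf{C}^S,\\
&\mathbf{Q}_I := \mathbf{Q}^S + \sum_{\alpha=1}^{A}\mathbf{Q}^\alpha - \rho^S\varepsilon^S\dot{\mathbf{X}} + \sum_{\alpha=1}^{A}\rho^\alpha\varepsilon^\alpha\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr) + \mathbf{P}^{ST}\mathbf{F}^S\dot{\mathbf{X}} - \sum_{\alpha=1}^{A}\mathbf{P}^{\alpha T}\mathbf{F}^S\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr),
\end{aligned} \tag{21}$$
as well as the radiation
$$\rho r := \rho^S r^S + \sum_{\alpha=1}^{A}\rho^\alpha r^\alpha - \rho^S\mathbf{b}^S\cdot\mathbf{F}^S\dot{\mathbf{X}} + \sum_{\alpha=1}^{A}\rho^\alpha\mathbf{b}^\alpha\cdot\mathbf{F}^S\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr). \tag{22}$$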
The formal similarity of these relations to the corresponding relations of the fluid mixture theory is obvious. The technical differences arise because one of the components is solid and because we have chosen this solid component, rather than the mean barycentric motion of the Eulerian description, as the reference.

4. Field Equations and Thermodynamic Admissibility for Isotropic Materials

Thermodynamics of mixtures of fluids needed more than 10 years after the publication of Truesdell's papers [1] to start to develop. The pioneering work of Müller [14] contains the most fundamental extension of the Clausius–Duhem inequality, which had been used as a condition for thermodynamic admissibility of various single component models. It is the assumption that the heat flux and the entropy flux are not related to each other by the classical universal Fourier relation $\mathbf{h} = \mathbf{q}/\theta$. A review of basic results for mixtures following from this extension can be found in the book [15].

The formal thermodynamic construction of a continuous model proceeds as follows. We need field equations for the following fields
$$\mathcal{F} := \bigl\{\rho^S, \rho^\alpha, \mathbf{F}^S, \acute{\mathbf{x}}^S, \acute{\mathbf{x}}^\alpha, \theta, n\bigr\}, \qquad \alpha = 1, \ldots, A. \tag{23}$$
They follow from the balance equations (11), (5), (12), (19), (14). However, in order to transform these equations into field equations we have to perform the so-called closure. Namely, the following quantities
$$\mathcal{R} := \bigl\{\hat{\rho}^\alpha, \mathbf{P}^S, \mathbf{P}^\alpha, \hat{\mathbf{p}}^\alpha, \varepsilon_I, \mathbf{Q}_I, \mathbf{J}, \hat{n}_{int}, n_E\bigr\}, \qquad \hat{n}_{int} := \hat{n} - \frac{\partial n_E}{\partial t}, \qquad \alpha = 1, \ldots, A, \tag{24}$$
must be specified in terms of fields and their derivatives in order to close the system. This is the constitutive problem defining materials contributing to the mixture.
The mass and momentum sources for the skeleton do not appear in the above list because, according to (15), they are not independent. Let us remark that in many cases of practical bearing additional constitutive relations may have the form of evolution equations. For instance, this is the case when the skeleton has some plastic properties, or when mass sources result from chemical reactions or adsorption/desorption processes. We do not consider such problems in this work and limit our further attention to the so-called poroelastic materials. Then the set of constitutive variables is as follows
$$\mathcal{C} := \bigl\{\rho^S, \rho^\alpha, \mathbf{F}^S, \acute{\mathbf{X}}^\alpha, \theta, \mathbf{G}, n, \mathbf{N}\bigr\}, \qquad \alpha = 1, \ldots, A, \qquad \mathbf{G} := \operatorname{Grad}\theta, \quad \mathbf{N} := \operatorname{Grad} n. \tag{25}$$
Usually this set is still much too complicated for the full thermodynamic analysis and one considers simpler models. For example, in the case of a simple two-component isotropic model of isothermal processes without mass exchange, scalar constitutive functions depend on the following set of constitutive variables
$$\mathcal{C}_{simple} := \bigl\{\rho^F, I, II, III, IV, V, VI, n\bigr\}, \tag{26}$$
where the six invariants $I, \ldots, VI$ are defined as follows
$$\begin{aligned}
&I := \operatorname{tr}\mathbf{C}^S, \qquad II := \tfrac{1}{2}\bigl(I^2 - \operatorname{tr}\mathbf{C}^{S2}\bigr), \qquad III := \det\mathbf{C}^S,\\
&IV := \acute{\mathbf{X}}^F\cdot\acute{\mathbf{X}}^F, \qquad V := \acute{\mathbf{X}}^F\cdot\mathbf{C}^S\acute{\mathbf{X}}^F, \qquad VI := \acute{\mathbf{X}}^F\cdot\mathbf{C}^{S2}\acute{\mathbf{X}}^F,
\end{aligned} \tag{27}$$
with $\acute{\mathbf{X}}^F$ being the Lagrangian velocity of the single fluid component: $\alpha = F$. We present some results for such a model further in this paper.

The fundamental assumption of continuous modelling has the form of the following constitutive relation
$$\mathcal{R} = \mathcal{R}(\mathcal{C}), \tag{28}$$
where the mapping is assumed to be at least once continuously differentiable. The constitutive functions (28) are said to be thermodynamically admissible if any solution of field equations satisfies identically the following entropy inequality
$$\frac{\partial(\rho\eta)}{\partial t} + \operatorname{Div}\bigl(\rho\eta\dot{\mathbf{X}} + \mathbf{H}\bigr) \geq 0, \qquad \eta = \eta(\mathcal{C}), \qquad \mathbf{H} = \mathbf{H}(\mathcal{C}). \tag{29}$$
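To make the six invariants (27) concrete, the following Python sketch evaluates them for a given skeleton deformation gradient $\mathbf{F}^S$ and fluid Lagrangian velocity $\acute{\mathbf{X}}^F$. It uses only the standard library; the helper names and the sample deformation (a uniaxial stretch by a factor of 2) are assumptions introduced for this illustration.

```python
def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def invariants(F_s, X_f):
    """Return (I, II, III, IV, V, VI) as defined in (27)."""
    C = matmul(transpose(F_s), F_s)      # right Cauchy-Green tensor C^S
    C2 = matmul(C, C)                    # its square
    I = trace(C)
    II = 0.5 * (I * I - trace(C2))
    III = det3(C)
    IV = dot(X_f, X_f)
    V = dot(X_f, matvec(C, X_f))
    VI = dot(X_f, matvec(C2, X_f))
    return I, II, III, IV, V, VI


# Hypothetical state: stretch by 2 along the first axis, unit fluid velocity along it.
F_s = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
X_f = [1.0, 0.0, 0.0]
result = invariants(F_s, X_f)
print(result)  # (6.0, 9.0, 4.0, 1.0, 4.0, 16.0)
```

The first three values are the principal invariants of $\mathbf{C}^S$; the last three couple the fluid velocity to the deformation of the skeleton.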
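This is the Lagrangian form of the second law of thermodynamics proposed by Müller for mixtures. As shown in 1973 by Liu (e.g., see [16]), the limitation to solutions of field equations can be eliminated from the above formulation by means of Lagrange multipliers. The equivalent form of the second law is then as follows. For all fields the following inequality must be fulfilled identically:
$$\begin{aligned}
&\frac{\partial(\rho\eta)}{\partial t} + \operatorname{Div}\bigl(\rho\eta\dot{\mathbf{X}} + \mathbf{H}\bigr) - \Lambda^S\Bigl(\frac{\partial\rho^S}{\partial t} - \hat{\rho}^S\Bigr) - \sum_{\alpha=1}^{A}\Lambda^\alpha\Bigl(\frac{\partial\rho^\alpha}{\partial t} + \operatorname{Div}\bigl(\rho^\alpha\acute{\mathbf{X}}^\alpha\bigr) - \hat{\rho}^\alpha\Bigr)\\
&\quad - \boldsymbol{\lambda}^S\cdot\Bigl(\rho^S\frac{\partial\acute{\mathbf{x}}^S}{\partial t} - \operatorname{Div}\mathbf{P}^S - \hat{\mathbf{p}}^S + \hat{\rho}^S\acute{\mathbf{x}}^S\Bigr)\\
&\quad - \sum_{\alpha=1}^{A}\boldsymbol{\lambda}^\alpha\cdot\Bigl(\rho^\alpha\Bigl(\frac{\partial\acute{\mathbf{x}}^\alpha}{\partial t} + \acute{\mathbf{X}}^\alpha\cdot\operatorname{Grad}\acute{\mathbf{x}}^\alpha\Bigr) - \operatorname{Div}\mathbf{P}^\alpha - \hat{\mathbf{p}}^\alpha + \hat{\rho}^\alpha\acute{\mathbf{x}}^\alpha\Bigr)\\
&\quad - \boldsymbol{\Lambda}^F\cdot\Bigl(\frac{\partial\mathbf{F}^S}{\partial t} - \operatorname{Grad}\acute{\mathbf{x}}^S\Bigr) - \Lambda^n\Bigl(\frac{\partial n}{\partial t} + \operatorname{Div}\mathbf{J} - \hat{n}\Bigr)\\
&\quad - \Lambda^\varepsilon\Bigl(\frac{\partial(\rho\varepsilon)}{\partial t} + \operatorname{Div}\bigl(\rho\varepsilon\dot{\mathbf{X}} + \mathbf{Q} - \mathbf{P}^T\dot{\mathbf{x}}\bigr)\Bigr) \geq 0,
\end{aligned} \tag{30}$$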
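where the Lagrange multipliers $\Lambda := \{\Lambda^S, \Lambda^\alpha, \boldsymbol{\lambda}^S, \boldsymbol{\lambda}^\alpha, \boldsymbol{\Lambda}^F, \Lambda^n, \Lambda^\varepsilon\}$ are functions of the constitutive variables $\mathcal{C}$. The exploitation of the inequality is now standard. Applying the chain rule we separate a linear part which must vanish. This yields relations for multipliers and some restrictions on constitutive relations. The remaining nonlinear part of the inequality defines the dissipation in the system. We skip here a discussion of fully general restrictions on constitutive relations. These can be found in the paper [5] and in the book [16]. We present their particular cases further in this work. However, it is worthwhile to expose the structure of the dissipation for constitutive variables $\mathcal{C}$ in which we leave out the dependence on $\mathbf{G}$ and $\mathbf{N}$. After some calculations we obtain the following so-called residual inequality
$$\mathcal{D} := \sum_{\alpha=1}^{A}\bigl(\Lambda^\alpha - \Lambda^S\bigr)\hat{\rho}^\alpha + \Lambda^n\hat{n}_{int} + \sum_{\alpha=1}^{A}\bigl(\boldsymbol{\lambda}^\alpha - \boldsymbol{\lambda}^S\bigr)\cdot\bigl(\hat{\mathbf{p}}^\alpha - \hat{\rho}^\alpha\acute{\mathbf{x}}^\alpha\bigr) - \boldsymbol{\lambda}^S\cdot\mathbf{F}^S\sum_{\alpha=1}^{A}\hat{\rho}^\alpha\acute{\mathbf{X}}^\alpha \geq 0, \tag{31}$$
where the multipliers are given by the relations
$$\begin{aligned}
&\Lambda^S = \rho\Bigl(\frac{\partial\eta}{\partial\rho^S} - \Lambda^\varepsilon\frac{\partial\varepsilon}{\partial\rho^S}\Bigr), \qquad \Lambda^\alpha = \rho\Bigl(\frac{\partial\eta}{\partial\rho^\alpha} - \Lambda^\varepsilon\frac{\partial\varepsilon}{\partial\rho^\alpha}\Bigr), \qquad \Lambda^\varepsilon = \Bigl(\frac{\partial\varepsilon}{\partial\theta}\Bigr)^{-1}\frac{\partial\eta}{\partial\theta},\\
&\rho^S\boldsymbol{\lambda}^S = -\rho\mathbf{F}^{S-T}\sum_{\alpha=1}^{A}\Bigl(\frac{\partial\eta}{\partial\acute{\mathbf{X}}^\alpha} - \Lambda^\varepsilon\frac{\partial\varepsilon}{\partial\acute{\mathbf{X}}^\alpha}\Bigr),\\
&\rho^\alpha\boldsymbol{\lambda}^\alpha = \rho\mathbf{F}^{S-T}\Bigl(\frac{\partial\eta}{\partial\acute{\mathbf{X}}^\alpha} - \Lambda^\varepsilon\frac{\partial\varepsilon}{\partial\acute{\mathbf{X}}^\alpha}\Bigr), \qquad \Lambda^n = \rho\Bigl(\frac{\partial\eta}{\partial n} - \Lambda^\varepsilon\frac{\partial\varepsilon}{\partial n}\Bigr).
\end{aligned} \tag{32}$$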
The first contribution to the dissipation function D (31) describes the dissipation due to the mass exchange between components. The second contribution is the dissipation due to the relaxation of porosity to its equilibrium value, say nE . Finally,
the last contribution is the dissipation due to the relative motion of components. It is known from the classical theory of mixtures that momentum sources are objective solely in combination with mass sources. This property is also present in the model for poroelastic materials and, consequently, the second line in the definition of $\mathcal{D}$ should be considered as a whole. There is no contribution to the dissipation from heat conduction because we have left out the dependence on the temperature gradient $\mathbf{G}$. The lack of dependence on the gradient of porosity $\mathbf{N}$ does not lead to any simplifications in the dissipation.

The thermodynamic equilibrium state is defined by the requirement that $\mathcal{D} = 0$ in this state. This means that the mass, momentum and porosity sources $\hat{\rho}^\alpha$, $\hat{\mathbf{p}}^\alpha$, $\hat{n}_{int}$ vanish in this state, and simultaneously the dissipation function $\mathcal{D}$ reaches its minimum. The second law of thermodynamics does not specify constitutive relations for sources but it limits their form by the residual inequality. This statement can be made more specific by the assumption that deviations from the thermodynamic equilibrium are small. Then the dissipation becomes a quadratic function of nonequilibrium variables. We present further the results of this simplification.

5. Some Special Cases

Let us begin with a rather formal simplification of the multicomponent model, which indicates a possible structure of energy, entropy and porosity fluxes. We assume that the intrinsic parts of the internal energy $\varepsilon_I$ and the entropy $\eta$ are independent of relative velocities $\acute{\mathbf{X}}^\alpha$. This assumption is motivated by the fact that scalar functions for isotropic materials must be at least quadratic in their dependence on vector arguments. For small deviations from the thermodynamic equilibrium such a dependence on $\acute{\mathbf{X}}^\alpha$ can be left out. If so, then relations (32)$_{4,5}$ for the multipliers become quite explicit and we obtain
$$\boldsymbol{\lambda}^S = \Lambda^\varepsilon\mathbf{F}^S\dot{\mathbf{X}}, \qquad \boldsymbol{\lambda}^\alpha = \Lambda^\varepsilon\mathbf{F}^S\bigl(\acute{\mathbf{X}}^\alpha - \dot{\mathbf{X}}\bigr). \tag{33}$$
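Then restrictions following from the second law, which we do not quote in this paper (e.g., see [6]), yield the following general form of fluxes for processes in isotropic materials with a small deviation from the thermodynamic equilibrium
$$\begin{aligned}
\mathbf{Q} &= \sum_{\alpha=1}^{A}\bigl(Q_0^\alpha\mathbf{1} + Q_1^\alpha\mathbf{C}^S + Q_2^\alpha\mathbf{C}^{S2}\bigr)\acute{\mathbf{X}}^\alpha,\\
\mathbf{H} &= \sum_{\alpha=1}^{A}\bigl(H_0^\alpha\mathbf{1} + H_1^\alpha\mathbf{C}^S + H_2^\alpha\mathbf{C}^{S2}\bigr)\acute{\mathbf{X}}^\alpha,\\
\mathbf{J} &= \sum_{\alpha=1}^{A}\bigl(J_0^\alpha\mathbf{1} + J_1^\alpha\mathbf{C}^S + J_2^\alpha\mathbf{C}^{S2}\bigr)\acute{\mathbf{X}}^\alpha,
\end{aligned} \tag{34}$$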
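where the scalar coefficients $Q_0^\alpha, \ldots, J_2^\alpha$ are solely functions of equilibrium variables
$$\mathcal{C}_{equil} = \bigl\{I, II, III, \rho^\alpha, \theta, n_E\bigr\}, \qquad n_E = n_E(III, \theta). \tag{35}$$
In particular, the last result is important because it allows us to specify the equilibrium porosity. Namely, the balance equation of porosity (14) reduces in this case as follows
$$\frac{\partial n_E}{\partial t} = 0 \quad\Rightarrow\quad \rho^S\frac{\partial n_E}{\partial\rho^S} + \sum_{\alpha=1}^{A}\rho^\alpha\frac{\partial n_E}{\partial\rho^\alpha} = 0, \tag{36}$$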
which is the partial differential equation for $n_E$. It shows that $n_E$ can be left out of the list (35) because it is not independent of the other variables. In the simple case of two components the solution of the differential equation (36) has the form
$$n_E = n_E\Bigl(\frac{\rho^F}{\rho^S}\Bigr). \tag{37}$$
The above simplification of the dependence on relative velocities and the structure of the dissipation function indicate as well the following structure of momentum and porosity sources
$$\hat{\mathbf{p}}^\alpha - \hat{\rho}^\alpha\acute{\mathbf{x}}^\alpha = \pi^\alpha\mathbf{F}^S\acute{\mathbf{X}}^\alpha, \qquad \hat{n} = -\frac{n - n_E}{\tau}, \qquad \pi^\alpha, \tau > 0, \tag{38}$$
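where the parameters $\pi^\alpha$, $\tau$ may depend on equilibrium variables.

We proceed to present the model for an important special case, the two-component poroelastic material. This models the so-called saturated porous materials, whose components on the macroscopic level are the elastic skeleton and the ideal fluid. For isothermal processes without mass exchange, thermodynamic admissibility following from the second law of thermodynamics leads to the following constitutive relations (e.g., [18]):
• partial Cauchy stresses
$$\begin{aligned}
&\mathbf{T}^S := J^{S-1}\mathbf{P}^S\mathbf{F}^{ST} = \aleph_0\mathbf{1} + \aleph_1\mathbf{B}^S + \aleph_2\mathbf{B}^{S2} - \theta\Lambda^n(n - n_E)\mathbf{1},\\
&\mathbf{B}^S := \mathbf{F}^S\mathbf{F}^{ST}, \qquad \mathbf{T}^F = -\bigl(p^F - \theta\Lambda^n(n - n_E)\bigr)\mathbf{1},
\end{aligned} \tag{39}$$
where
$$\begin{aligned}
&p^F = p^F\bigl(\rho^F, \theta\bigr), \qquad \aleph_\Gamma = \aleph_\Gamma(I, II, III, \theta), \quad \Gamma = 0, 1, 2,\\
&n_E = n_E(III, \theta), \qquad \Lambda^n = \Lambda^n\bigl(I, II, III, \rho^F, \theta\bigr),
\end{aligned} \tag{40}$$
• porosity flux and momentum source
$$\mathbf{J} = n_E\acute{\mathbf{X}}^F, \qquad \hat{\mathbf{p}} = \pi\mathbf{F}^S\acute{\mathbf{X}}^F, \qquad \pi = \pi\bigl(III, \rho^F, \theta\bigr). \tag{41}$$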
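The relaxation character of the porosity source in (38) can be illustrated numerically. The sketch below assumes a spatially homogeneous state with vanishing porosity flux ($\operatorname{Div}\mathbf{J} = 0$) and constant $n_E$ and $\tau$, so that (14) reduces to the ordinary differential equation $\mathrm{d}n/\mathrm{d}t = -(n - n_E)/\tau$. These simplifications, the function names and the numerical values are assumptions made only for this example.

```python
import math

def relax_porosity(n0, n_E, tau, t_end, steps):
    """Explicit Euler integration of dn/dt = -(n - n_E)/tau from t = 0 to t_end."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        n += dt * (-(n - n_E) / tau)
    return n

def exact_porosity(n0, n_E, tau, t):
    """Closed-form solution of the same equation: exponential decay towards n_E."""
    return n_E + (n0 - n_E) * math.exp(-t / tau)


# Hypothetical values: initial porosity 0.30, equilibrium 0.25, relaxation time 0.5.
n_num = relax_porosity(0.30, 0.25, 0.5, 2.0, 20000)
n_ex = exact_porosity(0.30, 0.25, 0.5, 2.0)
print(n_num, n_ex)
```

Under these assumptions the porosity approaches its equilibrium value monotonically, with $\tau$ setting the time scale, which is why $\tau > 0$ is required in (38).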
Consequently the model is analogous to the model of simple mixtures of fluids (e.g., [15]) in which interactions of components reduce to momentum sources and, what is characteristic for poroelastic materials, to nonequilibrium changes of porosity.

We conclude these considerations with a few remarks concerning boundary conditions. Very little has been done for models with more than two components. Therefore we limit our attention solely to the two-component case. The natural condition on the boundary $\partial\mathcal{B}_0$ is the condition for the total loading. If we denote by $\mathbf{t}_{ext}$ the vector of force density on this surface which is controlled from the external world, then it must be taken over by the total stress vector, i.e.,
$$\mathbf{t}_{ext} = \mathbf{P}\mathbf{N}\big|_{\partial\mathcal{B}_0}, \tag{42}$$
provided the interface $\partial\mathcal{B}_0$ does not possess any intrinsic structure of its own. This may not be fulfilled by many porous materials which, for instance, may possess a surface tension on contact surfaces. In addition to this dynamical condition we have to formulate a kinematical condition depending on the relative motion of components. The tangential component of this vectorial condition has been intensively investigated and the early results of Beavers and Joseph [19] have been confirmed. In the case of ideal fluid components this condition reduces to the following one
$$\acute{\mathbf{X}}^F - \bigl(\acute{\mathbf{X}}^F\cdot\mathbf{N}\bigr)\mathbf{N}\big|_{\partial\mathcal{B}_0} = 0. \tag{43}$$
The remaining normal component must be determined from investigations of a boundary layer which is created by fluid components flowing out of the porous material through a permeable boundary. A phenomenological model of this flow has been proposed by Deresiewicz and Skalak [20] and not much has been modified in this condition, even though some questions seem to be still open. For two porous materials in contact through the permeable interface $\partial\mathcal{B}_0$ this condition has the form
$$\rho^F\acute{\mathbf{X}}^F\cdot\mathbf{N} + \alpha_0\Bigl[\!\Bigl[\frac{p^F}{n}\Bigr]\!\Bigr]\Big|_{\partial\mathcal{B}_0} = 0, \tag{44}$$
where the double brackets denote a jump, $\alpha_0$ is a phenomenological coefficient of surface permeability, and the quantity in the brackets describes the difference of the pore pressure on both sides of the interface. It is a kind of driving force for the flow of the fluid through the surface.

There remains the problem of a boundary condition for the porosity. Note that the equation of porosity does not contain a divergence of porosity. Consequently it is a heterogeneous evolution equation rather than a real balance equation. For this reason it does not require any boundary condition at all. This may not be the case if we rely on the model proposed by Goodman and Cowin, in which the equation for the microstructural variable does contain spatial derivatives.
6. Conclusions

The general framework of a nonlinear model of poroelastic materials closely resembles that designed by Truesdell for mixtures of fluids. The Lagrangian formulation of the present model is solely a technical issue which makes it possible to incorporate large deformations of the solid component but does not change anything in the "philosophy" of the construction of the model. A new element arises only from the fact that we have to incorporate an additional microstructural parameter into the model. The model presented in this work contains only one such parameter, the porosity. However, the experience with soil and rock mechanics and with the mechanics of snow and glaciers indicates that the number of those parameters must be larger in many problems of practical bearing. For example, they may include tortuosity, double porosity, anisotropy of microstructure, plastic deformation of the skeleton, etc. In such cases the model must be extended even further, but the fundamental elements of the theory of mixtures would remain in such extensions.

Finally, let us remark that the linear version of the model has been extensively investigated and it seems to work very well, particularly in applications to acoustics of porous materials. Nonlinear problems of poroelastic materials are usually solved by means of numerical methods for which the Lagrangian formulation is particularly useful. In such a description a mesh of finite elements or finite volumes does not have to be changed in time to follow the motion of fluid components. Analytical results are very rare (e.g., [21]) because very little is known about the form of constitutive relations for large deformations of the skeleton.

References

1. C.A. Truesdell, Sulle basi della termomeccanica. Accad. Naz. dei Lincei, Rend. della Classe di Scienze Fisiche, Matematiche e Naturali 22(8) (1957) 33–38, 158–166.
2. R.M. Bowen, Compressible porous media models by use of the theory of mixtures. Internat. J. Engrg. Sci. 20(6) (1982) 697–735.
3. C.A. Truesdell, Rational Thermodynamics, 2nd edn. Springer, New York (1985).
4. K. Wilmanski, Lagrangian model of two-phase porous material. J. Non-Equilib. Thermodyn. 20 (1995) 50–77.
5. K. Wilmanski, Toward an extended thermodynamics of porous and granular materials. In: G. Ioos, O. Guès and A. Nouri (eds), Trends in Applications of Mathematics to Mechanics. Chapman&Hall/CRC (2000) pp. 147–160.
6. K. Wilmanski, Porous media at finite strains – The new model with the balance equation of porosity. Arch. Mech. 48(4) (1996) 591–628.
7. K. Wilmanski, A thermodynamic model of compressible porous materials with the balance equation of porosity. Transport Porous Media 32 (1998) 21–47.
8. M.A. Goodman and S.C. Cowin, A continuum theory for granular materials. Arch. Rational Mech. Anal. 48 (1972) 249–266.
9. B. Svendsen and K. Hutter, On the thermodynamics of a mixture of isotropic materials with constraints. Internat. J. Engrg. Sci. 33 (1995) 2021–2054.
10. N.P. Kirchner, Thermodynamically consistent modelling of abrasive granular materials I. Nonequilibrium theory. Proc. Roy. Soc. London A 458 (2002) 2153–2176.
11. N.T. Dunwoody and I. Müller, Thermodynamic theory of two chemically reacting ideal gases with different temperatures. Arch. Rational Mech. Anal. 29 (1968).
12. N.A. Krall and A.W. Trivelpiece, Principles of Plasma Physics. McGraw-Hill, New York (1986).
13. C. Truesdell, A First Course in Rational Continuum Mechanics, Part 1, Fundamental Concepts. Academic Press, New York (1977); also Lecture Notes: A First Course in Rational Continuum Mechanics. Johns Hopkins Univ. Press, Baltimore, MD (1972).
14. I. Müller, A thermodynamic theory of mixtures of fluids. Arch. Rational Mech. Anal. 28 (1968).
15. I. Müller, Thermodynamics. Pitman, New York (1985).
16. K. Wilmanski, Thermomechanics of Continua. Springer, Heidelberg (1998).
17. I. Müller and T. Ruggeri, Rational Extended Thermodynamics. Springer, New York (1998).
18. K. Wilmanski, Mass exchange, diffusion and large deformations of poroelastic materials. In: G. Capriz, V.N. Ghionna and P. Giovine (eds), Modeling and Mechanics of Granular and Porous Materials. Birkhäuser, Basel (2002) pp. 213–244.
19. G.S. Beavers and D.D. Joseph, Boundary conditions at a naturally permeable wall. J. Fluid Mech. 30(19) (1967) 197–207.
20. H. Deresiewicz and R. Skalak, On uniqueness in dynamic poroelasticity. Bull. Seismol. Soc. Amer. 53 (1963) 783–788.
21. B. Albers and K. Wilmanski, An axisymmetric steady-state flow through a poroelastic medium under large deformations. Arch. Appl. Mech. 69 (1999) 121–132.
Anisotropic Elasticity and Multi-Material Singularities

WAN-LEE YIN
School of Civil and Environmental Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0355, USA.

Received 24 July 2002; in revised form 21 January 2003

Abstract. Multi-material wedges associated with convergence of geometrical and material discontinuity lines generally show singular stress fields around the vertex of the wedge. In this paper, the eigenvalue problem for a multi-material wedge composed of several anisotropic elastic sectors is formulated in a completely general manner, including the cases of degenerate and extra-degenerate material sectors, and various types of edge conditions for both open and closed wedges. General representation of the elasticity solution in a degenerate or extra-degenerate anisotropic sector requires higher-order eigenmodes (generalized eigenfunctions) in addition to zeroth-order eigenmodes. Such higher-order eigenmodes are obtained from appropriate analytical expressions of the zeroth-order eigenmode by using the derivative rule. The analysis is applied to one bisector wedge and one trisector wedge in a three-layer cracked composite model to obtain accurate elasticity solutions of the singular stress fields. These solutions were determined using the traction data generated on a circular collocation path by a conventional finite element analysis.

Mathematics Subject Classifications (2000): 74E10.

Key words: anisotropic elasticity, stress singularity, multi-material wedges, eigensolutions, degenerate materials, Lekhnitskii and Stroh formalisms.
To Clifford Truesdell, in Fond Memory, Admiration and Gratitude.
1. Introduction

Composite structures involving interfaces, joints, free edges and cracks generally develop singular elastic stress fields near the intersection of lines of material and geometrical discontinuity. Examples include interface cracks, transverse matrix cracks impinging upon an adjacent ply, lap and beveled adhesive joints, skin-stiffener interfaces, and ply drops in laminated structures. These localized regions of severe stress are possible sites of failure initiation and growth. In many cases, the local geometry and state of deformation do not vary significantly in the direction tangential to the line of singularity. A local analysis model may be used where the parameters and variables depend only on two rectangular coordinates x and y in the plane perpendicular to the line of singularity. This two-dimensional model, containing two or more anisotropic elastic sectors, is called a multi-material wedge.

C.-S. Man and R.L. Fosdick (eds.), The Rational Spirit in Modern Continuum Mechanics, 779–808. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
The dissimilar sectors are bonded by radial interfaces which converge at the vertex of the wedge. A general analysis method has been developed to obtain accurate elasticity solutions of multi-material wedges using a substructure approach [1–3]. In this analysis scheme, conventional finite element analysis of the global structure is performed to provide the traction boundary data on a path bordering the wedge. A 2-D elasticity solution of the region interior to the path is obtained subsequently by constructing an eigenseries that matches the traction data at various points along the path in the least-squares sense. For bimaterial wedges associated with free edges or interface cracks, this method yields elasticity solutions of the local problem that are in close agreement with existing numerical solutions using special singular finite elements [4, 5]. In the case of interface cracks, the energy release rate predicted by the dominant singularity of the eigenseries was found to be in excellent agreement with the result of the J-integral evaluated along a remote boundary path [6, 7]. Further validation is shown by obtaining and comparing several eigenseries of the same problem with different numbers of terms, by changing the collocation path, and by using finite-element displacement solutions rather than the traction solutions as the collocation data for determining the eigenseries. These different solutions were also found to be in close agreement [8].

One feature of anisotropic elasticity that significantly complicates the theoretical analysis as well as the computational algorithms is material degeneracy. A degenerate or extra-degenerate material has repeated material eigenvalues for which the number of associated (zeroth-order) eigenvectors is smaller than the multiplicity of the eigenvalue.
The representation of the general solution of such materials must include higher-order eigenvectors (often called "generalized eigenvectors") with more complicated analytical expressions. The practical importance of the issue is indicated by isotropic and transversely isotropic materials, which are degenerate and have triple material eigenvalues ±i. Ting [9] and Yin [10] have given examples of extra-degenerate materials, and the class appears to be surprisingly wide. Such materials have a triple eigenvalue with only one independent zeroth-order eigenvector, and thus require two higher-order eigenvectors.

In previous elasticity analyses of multi-material wedges, the sectors are often assumed to be isotropic, transversely isotropic, or anisotropic but non-degenerate [11–15]. Analytical expressions of the wedge eigensolutions in such sectors are relatively simple. However, general multi-material wedges may contain degenerate sectors that are not isotropic or transversely isotropic, for which the commonly used expressions of material eigenmodes are not valid. In any such sector, a correct representation of the elasticity solution must include higher-order material eigenmodes, which are given by modified expressions of complicated types. (In this paper, the term "eigenmode" refers to one of six independent solutions of 2-D anisotropic elasticity of a material sector which may be non-degenerate, degenerate or extra-degenerate, whereas an "eigensolution" of a wedge is obtained by taking linear combinations of eigenmodes in successive sectors, matching the coefficients of combination in such a way as to satisfy displacement and traction continuity across the radial interfaces and homogeneous boundary conditions on the exterior edges.) Explicit expressions of the general solutions of extra-degenerate materials were not found in the literature until very recently. Hence current theoretical and computational analysis of multi-material wedge singularities is deficient in completeness and generality.

It was shown recently that anisotropic materials may be classified into five distinct classes, each having different representations of the general solution in terms of the material eigenvalues and eigenvectors. Degenerate and extra-degenerate materials have higher-order material eigenvectors and eigenmodes which may be obtained by differentiating appropriate analytical expressions of the zeroth-order eigenvectors and eigenmodes with respect to µ, which is temporarily regarded as a variable, followed by evaluating µ at the specific multiple eigenvalue. This derivative rule was proved analytically for all classes of degenerate and extra-degenerate materials [10]. It provides a simple and direct way of deriving the higher-order eigenvectors and eigenmodes. The expressions required for representing the wedge eigensolutions in the degenerate sectors are thereby found.

In this paper, the structure of the eigensolutions of anisotropic elasticity, both at the sector level and at the wedge level, is given in a concise form. A wedge eigensolution satisfies the continuity of tractions and displacements across all radial interfaces, and homogeneous boundary conditions on the two exterior edges. Within each anisotropic sector, the wedge eigensolution is a linear combination of the six material eigenmodes. The characteristic equation for the wedge eigenvalues is obtained explicitly regardless of the degeneracy of the sectors, and for various edge conditions including free, fixed, sliding, floating, or the elastically supported type.
Analytical expressions of the wedge eigensolutions are also given. Thus the eigen-problem of general multi-material wedges with unrestricted elastic material types is solved completely by a purely algebraic procedure resulting in fully explicit analytical expressions.

In the final section of the paper, elasticity solutions are obtained for one bisector wedge and one trisector wedge in a three-layer composite model with the middle layer containing an inclined crack. Various solutions including different numbers of eigensolutions are compared to show the trend of convergence. Comparison is also made with the asymptotic solution (the dominant singular eigensolution). It is found that the trend shown by the asymptotic solution and the associated generalized stress intensity factors may be physically irrelevant and useless because they differ drastically from the elasticity solution over any physically meaningful range of scale length.
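2. Two-Dimensional Anisotropic Elasticity – Non-Degenerate Material

Let $\alpha_{ij}$ $(i, j = 1, \ldots, 6)$ denote the anisotropic elastic compliance constants relating the strain components $\varepsilon_x, \varepsilon_y, \varepsilon_z, \gamma_{yz}, \gamma_{xz}, \gamma_{xy}$ to the stress components $\sigma_x, \sigma_y, \sigma_z, \tau_{yz}, \tau_{xz}, \tau_{xy}$, and let
$$\beta_{ij} = \alpha_{ij} - \frac{\alpha_{i3}\alpha_{j3}}{\alpha_{33}} \qquad \text{for } i, j \neq 3.$$
Then, for generalized plane deformations (i.e., deformations in which all strain components depend only on x and y), one has [16]
$$\{\varepsilon\} = [\beta]\{\sigma\}, \tag{2.1}$$
where $\{\varepsilon\} = \{\varepsilon_x, \varepsilon_y, \gamma_{yz}, \gamma_{xz}, \gamma_{xy}\}^T$, $\{\sigma\} = \{\sigma_x, \sigma_y, \tau_{yz}, \tau_{xz}, \tau_{xy}\}^T$. In the absence of body forces, an equilibrated stress field $\{\sigma\}$ may be represented by the derivatives of a pair of stress functions $F(x, y)$ and $\Psi(x, y)$:
$$\sigma_x = F_{,yy}, \qquad \sigma_y = F_{,xx}, \qquad \tau_{xy} = -F_{,xy}, \qquad \tau_{xz} = \Psi_{,y}, \qquad \tau_{yz} = -\Psi_{,x}. \tag{2.2}$$
We seek solutions for the displacement vector $\mathbf{u} \equiv \{u, v, w\}$ and the vector of stress potentials $\mathbf{q} \equiv \{F_{,y}, -F_{,x}, \Psi\}$ of the following form
$$\mathbf{u} = \mathbf{a}f(x + \mu y), \qquad \mathbf{q} = \mathbf{b}f(x + \mu y), \tag{2.3a,b}$$
or,
$$\boldsymbol{\chi}^{[0]} \equiv \{F_{,y}, -F_{,x}, \Psi, u, v, w\}^T = \boldsymbol{\xi}^{[0]}f(x + \mu y), \qquad \boldsymbol{\xi}^{[0]} \equiv \{\mathbf{b}^T, \mathbf{a}^T\}^T, \tag{2.3c,d}$$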
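where $f$ is an arbitrary analytic function and $\mathbf{a} \equiv \{a_1, a_2, a_3\}^T$ and $\mathbf{b} \equiv \{b_1, b_2, b_3\}^T$ are constant vectors. The complex parameter $\mu$ and the six-dimensional vector $\boldsymbol{\xi}^{[0]}$ will be identified later as material eigenvalues and (zeroth-order) material eigenvectors if they make the derivatives of $\mathbf{u}$ and $\mathbf{q}$ satisfy the anisotropic stress-strain relations, i.e., equation (2.1). Since $\tau_{xy} = -\partial_x F_{,y} = -b_1 f'(x + \mu y) = \partial_y(-F_{,x}) = b_2\mu f'(x + \mu y)$, one has $b_1 = -\mu b_2$. Hence
$$\mathbf{b} \equiv \mathbf{J}_1(\mu)\boldsymbol{\eta}, \qquad \text{where } \mathbf{J}_1(\mu) \equiv \begin{bmatrix} -\mu & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad \boldsymbol{\eta} \equiv \begin{Bmatrix} b_2 \\ b_3 \end{Bmatrix}. \tag{2.4}$$
Equations (2.1), (2.3a,b) and (2.4) yield the important relation
$$\mathbf{E}(\mu)\mathbf{a} = [\beta]\mathbf{P}(\mu)\boldsymbol{\eta}, \tag{2.5}$$
where the matrix functions $\mathbf{E}(\mu)$ and $\mathbf{P}(\mu)$ are defined by
$$\mathbf{E}(\mu) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & \mu \\ 0 & 0 & 1 \\ \mu & 1 & 0 \end{bmatrix}, \qquad \mathbf{P}(\mu) = \begin{bmatrix} -\mu^2 & 0 \\ -1 & 0 \\ 0 & -1 \\ 0 & \mu \\ \mu & 0 \end{bmatrix}. \tag{2.6a,b}$$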
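Notice that all columns of $\mathbf{P}(\mu)$ are orthogonal to all columns of $\mathbf{E}(\mu)$. Therefore, pre-multiplication of the last equation by $\mathbf{E}^T(\mu)[\beta]^{-1}$ and $\mathbf{P}^T(\mu)$ yields, respectively,
$$\mathbf{E}^T(\mu)[\beta]^{-1}\mathbf{E}(\mu)\mathbf{a} \equiv \boldsymbol{\Gamma}(\mu)\mathbf{a} = \mathbf{0}, \tag{2.7}$$
$$\mathbf{P}^T(\mu)[\beta]\mathbf{P}(\mu)\boldsymbol{\eta} = \mathbf{M}(\mu)\boldsymbol{\eta} = \mathbf{0}, \tag{2.8}$$
where
$$\mathbf{M}(\mu) \equiv \begin{bmatrix} l_4(\mu) & -l_3(\mu) \\ -l_3(\mu) & l_2(\mu) \end{bmatrix}, \tag{2.9a}$$
$$\begin{aligned}
l_2(\mu) &= \beta_{44} - 2\beta_{45}\mu + \beta_{55}\mu^2,\\
l_3(\mu) &= -\beta_{24} + (\beta_{25} + \beta_{46})\mu - (\beta_{14} + \beta_{56})\mu^2 + \beta_{15}\mu^3,\\
l_4(\mu) &= \beta_{22} - 2\beta_{26}\mu + (2\beta_{12} + \beta_{66})\mu^2 - 2\beta_{16}\mu^3 + \beta_{11}\mu^4.
\end{aligned} \tag{2.9b}$$
The matrix $\boldsymbol{\Gamma} \equiv \mathbf{E}^T(\mu)[\beta]^{-1}\mathbf{E}(\mu)$ in equation (2.7) is well known in the Stroh formalism [17]. Equations (2.7) and (2.8) have nontrivial solutions for $\mathbf{a}$ and $\boldsymbol{\eta}$, respectively, if and only if the following characteristic equation (2.10a) or (2.10b) is satisfied:
$$\Delta(\mu) \equiv |\boldsymbol{\Gamma}(\mu)| = 0, \qquad \delta(\mu) \equiv |\mathbf{M}(\mu)| = 0. \tag{2.10a,b}$$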
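The two conditions are in fact equivalent, and they yield three identical pairs of complex conjugate roots which are the material eigenvalues. Equivalence of (2.10a) and (2.10b) follows clearly from the relations which express one of the two vectors $\mathbf{a}$ and $\boldsymbol{\eta}$ in terms of the other:
$$\mathbf{a} \equiv \mathbf{J}_2(\mu)\boldsymbol{\eta}, \qquad \text{where } \mathbf{J}_2(\mu) \equiv \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ -\mu & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}[\beta]\mathbf{P}(\mu), \tag{2.11}$$
$$\boldsymbol{\eta} \equiv \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}[\beta]^{-1}\mathbf{E}(\mu)\mathbf{a}. \tag{2.12}$$
A direct if somewhat lengthy proof of equivalence was given by Barnett and Kirchner [18]. However, the present proof also makes transparent the equivalence of the eigenspaces of the Lekhnitskii and Stroh formalisms. For each eigenvalue $\mu$, equations (2.8), (2.9) and (2.4) yield the explicit expression of the b-vector:
$$\mathbf{b} = \begin{Bmatrix} -\mu l_2(\mu) \\ l_2(\mu) \\ l_3(\mu) \end{Bmatrix} \ \text{if } l_2(\mu) \neq 0, \qquad \text{otherwise } \mathbf{b} = \begin{Bmatrix} -\mu l_3(\mu) \\ l_3(\mu) \\ l_4(\mu) \end{Bmatrix}. \tag{2.13}$$
The corresponding material eigenvector is given by
$$\boldsymbol{\xi}^{[0]} \equiv \begin{Bmatrix} \mathbf{b} \\ \mathbf{a} \end{Bmatrix} = \mathbf{J}\boldsymbol{\eta}, \qquad \text{where } \mathbf{J} \equiv \begin{bmatrix} \mathbf{J}_1 \\ \mathbf{J}_2 \end{bmatrix}. \tag{2.14}$$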
The Stroh formalism, on the other hand, first determines the a-vector from equation (2.7). The expression is lengthy because the coefficient matrix of equation (2.7) is a 3 × 3 matrix. The Stroh formalism becomes very cumbersome in degenerate and extra-degenerate cases, where
higher-order eigenvectors (often called “generalized eigenvectors”) must be obtained through the use of the derivative rule or other equivalent steps. It is easily seen that the complex conjugate of the eigenvalue µ is associated with eigenvectors that are the complex conjugates of ξ. If the characteristic equation (2.10b) has three distinct pairs of complex conjugate roots, or if it has a double root µ0 for which all elements of the matrix M(µ0) vanish, so that equations (2.7) and (2.8) yield two independent eigenvectors associated with the double root (in addition to the eigenvector associated with a simple root), then the material is called non-degenerate. Orthotropic materials with unequal elastic constants in three material symmetry axes generally belong to the first type of non-degenerate material. For non-degenerate materials, we let B denote the matrix of the three b-vectors associated with the eigenvalues µ1, µ2 and µ3 that have positive imaginary parts, and let A be the matrix composed of the corresponding a-vectors. We define the 6 × 6 matrix of eigenvectors
\[
Z \equiv [Z_+, \bar{Z}_+] = \begin{bmatrix} B & \bar{B} \\ A & \bar{A} \end{bmatrix}, \tag{2.15}
\]
where the overbars denote complex conjugates. Then the 2D general solution of a homogeneous body of a non-degenerate anisotropic material is given by
\[
\chi \equiv \{F_{,y},\, -F_{,x},\, \Psi,\, u,\, v,\, w\} = Z\,\langle f(x + \mu y)\rangle\, c, \tag{2.16}
\]
where c is a constant vector and $\langle f(x + \mu y)\rangle$ denotes the 6 × 6 diagonal matrix with the elements $f_1(x + \mu_1 y), f_2(x + \mu_2 y), \ldots, f_6(x + \bar{\mu}_3 y)$. Real-valued solutions χ are obtained by choosing
\[
c_{j+3} = \bar{c}_j, \qquad f_{j+3}(x + \bar{\mu}_j y) \equiv \overline{f_j(x + \mu_j y)}, \qquad j = 1, 2, 3. \tag{2.17a,b}
\]
3. Two-Dimensional Anisotropic Elasticity – Degenerate and Extra-Degenerate Materials

If the characteristic equation has a repeated root µ, and the number of associated independent eigenvectors is smaller than the multiplicity of µ, then the material is called degenerate or extra-degenerate, depending on whether the deficiency of independent eigenvectors relative to the multiplicity of eigenvalues is 1 or 2. In such cases, the zeroth-order eigenvectors must be supplemented by higher-order eigenvectors to form the matrix Z. These higher-order eigenvectors yield additional independent eigenmodes, not according to the simple relations of equations (2.3a,b) but according to the “derivative rule” described in the following. The degenerate case is important in practice because isotropic materials and transversely isotropic materials are degenerate, and have the triple eigenvalues µ = ±i.

There are two classes of degenerate materials, each with a distinct representation of the general solution. The first class has a double eigenvalue µ0 which is
normal, that is, $M(\mu_0) \neq 0$. For this class, equation (2.8) has only one independent solution $\eta = \{l_2(\mu_0), l_3(\mu_0)\}^T$, which yields one eigenvector $\xi^{[0]} = J(\mu_0)\eta$ with the zeroth-order eigenmode $\chi^{[0]} = f(x + \mu_0 y)\,\xi^{[0]}$. An independent eigenmode sharing the same eigenvalue is given by the following expression evaluated at µ = µ0 and involving an arbitrary analytic function g(x + µy):
\[
\chi^{[1]} = \frac{d\chi^{[0]}}{d\mu} = \frac{d}{d\mu}\left[ g(x + \mu y)\,\xi^{[0]} \right]
= g(x + \mu y)\left( J'\{l_2, l_3\}^T + J\{l_2', l_3'\}^T \right) + y\,g'(x + \mu y)\,J(\mu)\{l_2, l_3\}^T, \tag{3.1}
\]
i.e., differentiation of the analytical expression of the zeroth-order eigenmode $\chi^{[0]}$ with respect to µ, followed by evaluation at µ = µ0, yields independent eigenmodes of higher orders. This derivative rule is easily implemented in the present compliance-based formulation because the analytical expression of $\xi^{[0]} = J(\mu_0)\eta$ is explicit and simple. In the Stroh formalism, where the eigenvectors are expressed in terms of elastic constants instead of elastic compliances, analytical expressions of the zeroth-order eigenvectors are lengthy, and the generation of higher-order eigenvectors and eigenmodes via the derivative rule becomes prohibitively cumbersome.

The second class of degenerate materials has a triple eigenvalue µ0 which is abnormal, that is, M(µ0) is the null matrix. Then equation (2.8) imposes no restriction on η. One may take $\xi^{[0]} = J(\mu_0)\{0, 1\}^T$. Applying the derivative rule to $J(\mu)\{l_2(\mu), l_3(\mu)\}^T$ twice, one obtains two additional eigenvectors $\xi^{[1]} = J\{l_2', l_3'\}^T(\mu_0)$ and $\xi^{[2]} = 2J'\{l_2', l_3'\}^T(\mu_0) + J\{l_2'', l_3''\}^T(\mu_0)$. All isotropic materials belong to this class and the triple eigenvalues are µ0 = ±i.

Finally, extra-degenerate materials have a normal triple eigenvalue µ0. Equation (2.8) has only one independent solution, which is proportional to $\{l_2(\mu_0), l_3(\mu_0)\}^T$. Three independent eigenmodes are given by $f(x + \mu y)J(\mu)\{l_2(\mu), l_3(\mu)\}^T$ and its first and second derivatives with respect to µ, followed by evaluation at µ = µ0. The expression of equation (3.1) for $\chi^{[1]}$ remains valid. The expression of $\chi^{[2]}$ involves all three eigenvectors of different orders:
\[
\chi^{[2]} = f(x + \mu_0 y)\,\xi^{[2]} + 2y\,f'(x + \mu_0 y)\,\xi^{[1]} + y^2 f''(x + \mu_0 y)\,\xi^{[0]}. \tag{3.2}
\]
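The derivative rule can be illustrated numerically. The sketch below is restricted, for brevity, to the stress-function part b(µ) = {−µ l2, l2, l3} of the eigenvector and to f(z) = z^λ. It compares the analytical first-order mode of the form (3.1) with a central finite difference of the zeroth-order mode with respect to µ. The coefficient values, the sample point and the function names are arbitrary assumptions made for the test.

```python
# Arbitrary test coefficients standing in for the compliances in l2 and l3.
TEST = {"44": 2.0, "45": 0.3, "55": 1.5, "24": 0.2, "25": 0.1,
        "46": 0.05, "14": 0.15, "56": 0.25, "15": 0.07}


def l2_l3(mu, c):
    l2 = c["44"] - 2 * c["45"] * mu + c["55"] * mu ** 2
    l3 = (-c["24"] + (c["25"] + c["46"]) * mu
          - (c["14"] + c["56"]) * mu ** 2 + c["15"] * mu ** 3)
    return l2, l3


def l2_l3_prime(mu, c):
    dl2 = -2 * c["45"] + 2 * c["55"] * mu
    dl3 = (c["25"] + c["46"]) - 2 * (c["14"] + c["56"]) * mu + 3 * c["15"] * mu ** 2
    return dl2, dl3


def b_vector(mu, c):
    l2, l3 = l2_l3(mu, c)
    return [-mu * l2, l2, l3]


def b_vector_prime(mu, c):
    l2, _ = l2_l3(mu, c)
    dl2, dl3 = l2_l3_prime(mu, c)
    return [-(l2 + mu * dl2), dl2, dl3]


def mode0(mu, x, y, lam, c):
    """Zeroth-order mode f(x + mu*y) * b(mu) with f(z) = z**lam."""
    f = (x + mu * y) ** lam
    return [f * bi for bi in b_vector(mu, c)]


def mode1(mu, x, y, lam, c):
    """First-order mode from the derivative rule: f*b' + y*f'*b."""
    z = x + mu * y
    f, df = z ** lam, lam * z ** (lam - 1)
    b, bp = b_vector(mu, c), b_vector_prime(mu, c)
    return [f * bpi + y * df * bi for bi, bpi in zip(b, bp)]


def mode1_finite_difference(mu, x, y, lam, c, h=1e-5):
    plus, minus = mode0(mu + h, x, y, lam, c), mode0(mu - h, x, y, lam, c)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]


if __name__ == "__main__":
    args = (0.2 + 0.9j, 0.7, 0.4, 0.6, TEST)
    exact, approx = mode1(*args), mode1_finite_difference(*args)
    print(max(abs(e - a) for e, a in zip(exact, approx)))
```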
Thus, for all degenerate and extra-degenerate materials, three complex conjugate pairs of eigenvectors of the zeroth and higher orders may also be obtained which form the 6 × 6 matrix of eigenvectors Z. However, the linear independence of the higher-order eigenvectors needs to be proved. A proof is suggested in the next section and the details may be found in [10]. The general two-dimensional solutions of degenerate and extra-degenerate materials are given by the following expression in place of equation (2.16):
\[
\chi = Z\,D\,\langle f(x + \mu y)\rangle\, c, \tag{3.3}
\]
where $D \equiv \langle D_1, \bar{D}_1 \rangle$ is a block-diagonal matrix of differential operators composed of a 3 × 3 matrix $D_1$ and its complex conjugate matrix $\bar{D}_1$. For non-degenerate materials, D is the identity matrix. The following expressions (3.4a) and (3.4b) give $D_1$ for degenerate and extra-degenerate materials, respectively,
\[
D_1 \equiv \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & m\,\dfrac{\partial}{\partial\mu} \\ 0 & 0 & 1 \end{bmatrix}, \qquad
D_1 \equiv \begin{bmatrix} 1 & \dfrac{\partial}{\partial\mu} & \dfrac{\partial^2}{\partial\mu^2} \\ 0 & 1 & 2\,\dfrac{\partial}{\partial\mu} \\ 0 & 0 & 1 \end{bmatrix}, \tag{3.4a,b}
\]
where m = 1 for a normal double eigenvalue and m = 2 for an abnormal triple eigenvalue. Let [rot3] denote the rotation matrix with respect to the z-axis through an angle θ:
\[
[\mathrm{rot3}] = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{3.5}
\]
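and let $[\mathrm{rot6}] \equiv \langle [\mathrm{rot3}], [\mathrm{rot3}] \rangle$. Then, from equation (3.3),
\[
\left\{ \tfrac{1}{r}F_{,\theta},\, -F_{,r},\, \Psi,\, u_r,\, u_\theta,\, w \right\}^T = [\mathrm{rot6}]\,Z\,D\,\langle f(x + \mu y)\rangle\, c. \tag{3.6}
\]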
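Differentiating equation (3.3) with respect to the coordinates x and y, one obtains
\[
\left\{ -\tau_{xy},\, -\sigma_y,\, -\tau_{yz},\, \varepsilon_x,\, \frac{\partial v}{\partial x},\, \gamma_{xz} \right\}^T = \frac{\partial \chi}{\partial x} = Z\,D\,\langle f'(x + \mu y)\rangle\, c, \tag{3.7}
\]
\[
\left\{ \sigma_x,\, \tau_{xy},\, \tau_{xz},\, \frac{\partial u}{\partial y},\, \varepsilon_y,\, \gamma_{yz} \right\}^T = \frac{\partial \chi}{\partial y} = Z\,D\,\langle \mu f'(x + \mu y)\rangle\, c. \tag{3.8}
\]
Using the transformation rules of the displacements, strains and stresses from the rectangular to the polar coordinates, one finds that
\[
\{ -\tau_{r\theta},\, -\sigma_\theta,\, -\tau_{\theta z},\, \varepsilon_r,\, \partial_r u_\theta,\, \gamma_{rz} \}^T
= [\mathrm{rot6}]\,Z\,D\,\langle (\cos\theta + \mu\sin\theta)\, f'(x + \mu y)\rangle\, c, \tag{3.9}
\]
\[
\left\{ \sigma_r,\, \tau_{r\theta},\, \tau_{rz},\, \frac{\partial_\theta u_r - u_\theta}{r},\, \varepsilon_\theta,\, \gamma_{\theta z} \right\}^T
= [\mathrm{rot6}]\,Z\,D\,\langle (\mu\cos\theta - \sin\theta)\, f'(x + \mu y)\rangle\, c. \tag{3.10}
\]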
The shear strains γxy = ∂v/∂x + ∂u/∂y and γrθ = ∂r uθ + (∂θ ur − uθ )/r may be obtained from the previous equations by taking linear combinations.
4. Multi-Material Wedges and Eigensolutions

A multi-material wedge is composed of N consecutive sectors of isotropic or anisotropic materials that are perfectly bonded along radial interfaces which converge at the vertex of the wedge. We choose a polar coordinate system (r, θ) with the vertex as the origin. The kth interface, θ = θk, separates the kth sector from the (k + 1)th sector (k = 1, 2, . . . , N − 1). In the case of an open wedge, the first and the last sectors are bounded, respectively, by exterior boundary edges θ = θ0 and θ = θN, on which boundary conditions of displacements, tractions, or of the mixed type are imposed. An artificially defined curve encircles the wedge and demarcates the interior domain of the wedge from the surrounding structure. In the case of a closed wedge, there are no exterior boundary edges. The radial lines θ = θ0 and θ = θN coincide and become the interface between the first and the Nth sector. Then the encircling curve is a closed circuit. We seek elasticity solutions of the wedge where the solution vector χ in each sector is expressed by equation (3.3) with
\[
f_1 = f_2 = f_3 = (x + \mu y)^\lambda = r^\lambda (\cos\theta + \mu\sin\theta)^\lambda. \tag{4.1}
\]
While the material eigenvalues µi vary from sector to sector, the parameter λ is required to be the same for all sectors. This is required by the continuity of the displacements and tractions across the sector interfaces. If the material of the kth sector is non-degenerate, equation (2.16) yields
\[
\chi^{(k)}(r, \theta) = r^\lambda Z^{(k)} \Lambda^{(k)}(\theta)\, c^{(k)}, \tag{4.2}
\]
where
\[
\Lambda^{(k)}(\theta) = \Big\langle (\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \mu_2^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \mu_3^{(k)}\sin\theta)^\lambda,\
(\cos\theta + \bar{\mu}_1^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \bar{\mu}_2^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \bar{\mu}_3^{(k)}\sin\theta)^\lambda \Big\rangle. \tag{4.3}
\]
Notice that equation (4.2) requires the dependence of the solution on the coordinates r and θ to be separated. In particular, the solution has the same θ-dependence irrespective of r. For a degenerate sector with µ2 = µ3, or an extra-degenerate sector with µ1 = µ2 = µ3, equations (3.3), (3.4a,b) and (4.1) yield results of the form (4.2) where $Z^{(k)}$ contains higher-order eigenvectors and where $\Lambda^{(k)}(\theta)$ is to be modified as follows:
\[
\Lambda^{(k)}(\theta) = \begin{bmatrix} \Lambda_1^{(k)} & 0 \\ 0 & \Lambda_2^{(k)} \end{bmatrix}, \tag{4.4}
\]
where for the degenerate case
\[
\Lambda_1^{(k)} \equiv D_1 \Big\langle (\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \mu_2^{(k)}\sin\theta)^\lambda,\ (\cos\theta + \mu_3^{(k)}\sin\theta)^\lambda \Big\rangle
= \begin{bmatrix}
(\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda & 0 & 0 \\
0 & (\cos\theta + \mu_2^{(k)}\sin\theta)^\lambda & m\lambda\sin\theta\,(\cos\theta + \mu_2^{(k)}\sin\theta)^{\lambda-1} \\
0 & 0 & (\cos\theta + \mu_2^{(k)}\sin\theta)^\lambda
\end{bmatrix} \tag{4.5}
\]
(m = 1 for a normal double eigenvalue and m = 2 for an abnormal triple eigenvalue), whereas for the extra-degenerate case
\[
\Lambda_1^{(k)} = \begin{bmatrix}
(\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda & \lambda\sin\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-1} & \lambda^2\sin^2\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-2} \\
0 & (\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda & 2\lambda\sin\theta\,(\cos\theta + \mu_1^{(k)}\sin\theta)^{\lambda-1} \\
0 & 0 & (\cos\theta + \mu_1^{(k)}\sin\theta)^\lambda
\end{bmatrix}. \tag{4.6}
\]
In both cases the matrix $\Lambda_2^{(k)}$ is obtained from $\Lambda_1^{(k)}$ by merely replacing $\cos\theta + \mu_1^{(k)}\sin\theta$ and $\cos\theta + \mu_2^{(k)}\sin\theta$ by the respective complex conjugates, while keeping λ unchanged.

The undetermined complex coefficient vectors $c^{(k)}$ and $c^{(k+1)}$ in two consecutive sectors are related according to the continuity of χ across the sector interface θ = θk:
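\[
Z^{(k+1)} \Lambda^{(k+1)}(\theta_k)\, c^{(k+1)} = Z^{(k)} \Lambda^{(k)}(\theta_k)\, c^{(k)}.
\]
This implies the recurrence relation for the vectors $c^{(k)}$:
\[
c^{(k+1)} = T_k\, c^{(k)}, \tag{4.7}
\]
where
\[
T_k \equiv \left[\Lambda^{(k+1)}(\theta_k)\right]^{-1} \left[Z^{(k+1)}\right]^{-1} Z^{(k)} \Lambda^{(k)}(\theta_k) \tag{4.8}
\]
is the transfer matrix relating the vectors c in two consecutive sectors. Consequently,
\[
c^{(N)} = T_{N-1} T_{N-2} \cdots T_1\, c^{(1)}. \tag{4.9}
\]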
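The transfer matrix of equation (4.8) needs the inverse of the angular matrix. For the degenerate block of equation (4.5), a short numerical check suggests that this inverse is obtained by simply changing the sign of λ. The Python sketch below is an illustration under assumed sample values of θ, λ, µ1 and µ2; the function names are hypothetical.

```python
import math


def lambda1_degenerate(theta, lam, mu1, mu2, m):
    """Upper-left 3x3 block of the angular matrix for a degenerate sector, eq. (4.5)."""
    s, c = math.sin(theta), math.cos(theta)
    p1, p2 = c + mu1 * s, c + mu2 * s
    return [[p1 ** lam, 0, 0],
            [0, p2 ** lam, m * lam * s * p2 ** (lam - 1)],
            [0, 0, p2 ** lam]]


def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inverse_residual(theta, lam, mu1, mu2, m):
    """Largest entry of |Lambda1(lam) * Lambda1(-lam) - I|."""
    prod = matmul3(lambda1_degenerate(theta, lam, mu1, mu2, m),
                   lambda1_degenerate(theta, -lam, mu1, mu2, m))
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(3) for j in range(3))


if __name__ == "__main__":
    for m in (1, 2):
        print(m, inverse_residual(0.8, 0.59 + 0.2j, 0.5 + 1.2j, 1j, m))
```

Because the block is upper triangular with a repeated diagonal entry, the off-diagonal terms cancel exactly, so no matrix inversion is needed when assembling T_k.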
The transfer matrices $T_k$, as defined by equation (4.8), involve the inverse matrices of $\Lambda^{(k+1)}$ and $Z^{(k+1)}$. It follows from equations (4.3)–(4.6) that, irrespective of material degeneracy, the inverse matrix of $\Lambda^{(k)}$ may be obtained simply by substituting −λ for λ in $\Lambda^{(k)}$, i.e.,
\[
\left[\Lambda^{(k)}(\lambda)\right]^{-1} = \Lambda^{(k)}(-\lambda). \tag{4.10}
\]
An explicit expression of the inverse matrix of $Z^{(k)}$ may also be given. It has been shown that any two eigenvectors ξ and ξ′ associated with different eigenvalues µ and µ′ are orthogonal in the following sense (see [10] for a general proof for all materials regardless of degeneracy):
\[
[\xi, \xi'] \equiv \xi^T\, \mathbb{I}\, \xi' = 0, \quad \text{where} \quad
\mathbb{I} \equiv \begin{bmatrix} 0_{3\times 3} & I_3 \\ I_3 & 0_{3\times 3} \end{bmatrix}, \tag{4.11}
\]
and $I_3$ and $0_{3\times 3}$ denote the 3 × 3 identity and null matrices, respectively. In particular, $[\xi, \bar{\xi}] = 0$. Hence the six-dimensional solution space is the direct sum of an even number of eigenspaces, one corresponding to each distinct eigenvalue, whose dimension equals the multiplicity of that eigenvalue. The eigenvectors belonging to the same eigenvalue are generally not orthogonal in the sense of the bracket product defined by equation (4.11). Clearly,
\[
\Omega \equiv [Z_+, Z_+] \equiv Z_+^T\, \mathbb{I}\, Z_+ = A^T B + B^T A \tag{4.12}
\]
is a symmetric block-diagonal matrix, and so is its complex conjugate $\bar{\Omega}$. On the other hand, orthogonality of the vectors in $Z_+$ and $\bar{Z}_+$ implies that
\[
[Z_+, \bar{Z}_+] = A^T \bar{B} + B^T \bar{A} = 0. \tag{4.13}
\]
Combining (4.12), (4.13) and their complex conjugates, one obtains
\[
[Z, Z] = Z^T\, \mathbb{I}\, Z = \begin{bmatrix} A^T & B^T \\ \bar{A}^T & \bar{B}^T \end{bmatrix} \begin{bmatrix} B & \bar{B} \\ A & \bar{A} \end{bmatrix} = \begin{bmatrix} \Omega & 0 \\ 0 & \bar{\Omega} \end{bmatrix}. \tag{4.14}
\]
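The bracket orthogonality of equation (4.11) can be checked numerically in a simple special case. The sketch below assumes a compliance for which l3 ≡ 0 and β16 = β26 = 0, so that the in-plane eigenvalues are the roots of l4 and one may take η = {1, 0}. The eigenvector ξ = {b, a} is then built from b = J1 η and a = J2 η as in (2.4) and (2.11). The compliance values and helper names are hypothetical.

```python
import cmath


def inplane_eigenvalues(b11, b12, b22, b66):
    """Roots of l4 = b11*mu^4 + (2*b12 + b66)*mu^2 + b22 with positive imaginary part."""
    mid = 2 * b12 + b66
    disc = cmath.sqrt(mid * mid - 4 * b11 * b22)
    roots = []
    for t in ((-mid + disc) / (2 * b11), (-mid - disc) / (2 * b11)):
        mu = cmath.sqrt(t)
        roots.append(mu if mu.imag > 0 else -mu)
    return roots


def eigenvector(mu, b11, b12, b22, b66):
    """xi = {b, a} for eta = {1, 0}, assuming b16 = b26 = 0 and no antiplane coupling."""
    b = [-mu, 1, 0]                    # b = J1(mu) eta
    s1 = -b11 * mu ** 2 - b12          # first row of [beta] P eta (eps_x)
    s6 = b66 * mu                      # last row of [beta] P eta (gamma_xy)
    a = [s1, s6 - mu * s1, 0]          # a = J2(mu) eta
    return b + a


def bracket(xi, xj):
    """[xi, xj] = xi^T II xj with II = [[0, I3], [I3, 0]]."""
    return sum(xi[k] * xj[k + 3] + xi[k + 3] * xj[k] for k in range(3))


if __name__ == "__main__":
    consts = (1.0, -0.3, 4.0, 6.0)     # b11, b12, b22, b66 (arbitrary test values)
    mu1, mu2 = inplane_eigenvalues(*consts)
    xi1, xi2 = eigenvector(mu1, *consts), eigenvector(mu2, *consts)
    xi1_bar = eigenvector(mu1.conjugate(), *consts)
    print("[xi1, xi2]     =", bracket(xi1, xi2))
    print("[xi1, xi1_bar] =", bracket(xi1, xi1_bar))
    print("[xi1, xi1]     =", bracket(xi1, xi1))
```

In this example the first two brackets vanish to rounding error while the last does not, consistent with the diagonal structure of (4.14).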
Hence the six-dimensional matrix Z is invertible if and only if the three-dimensional symmetric matrix Ω of equation (4.12) is. The latter has been calculated for all types of anisotropic materials and is shown to be nonsingular in every case. This provides a proof of the linear independence of the six eigenvectors of the various orders. Hence the eigenmatrix possesses a unique inverse given by
\[
Z^{-1} = \begin{bmatrix} \Omega^{-1} & 0 \\ 0 & \bar{\Omega}^{-1} \end{bmatrix} \begin{bmatrix} A^T & B^T \\ \bar{A}^T & \bar{B}^T \end{bmatrix}. \tag{4.15}
\]
The form of the matrix $\Omega^{-1}$ depends on the material type. For a material with three distinct complex conjugate pairs of eigenvalues, one has
\[
\Omega^{-1} = \left\langle \frac{1}{\delta'(\mu_1)},\ \frac{1}{\delta'(\mu_2)},\ \frac{1}{\delta'(\mu_3)} \right\rangle, \tag{4.16}
\]
where δ(µ) was defined in equation (2.10b) and the prime denotes differentiation with respect to µ. For other types of materials, $\Omega^{-1}$ can be expressed in terms of the functions l2, l3, l4, δ and their derivatives of the various orders. These expressions are less simple and may be found in [10]. Substituting equations (4.10) and (4.15) into (4.8), one obtains a reduced explicit expression of $T_k$.

The transfer matrix $T_k$ of equation (4.7) relates the constant vectors $c^{(k)}$ and $c^{(k+1)}$ of two consecutive sectors. A second type of transfer matrices may be defined to relate the values of the eigensolution on two consecutive interfaces:
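\[
\chi(r, \theta_k) = \tilde{T}_k\, \chi(r, \theta_{k-1}), \tag{4.17}
\]
where
\[
\tilde{T}_k \equiv Z^{(k)} \Lambda^{(k)}(\theta_k) \left[\Lambda^{(k)}(\theta_{k-1})\right]^{-1} \left[Z^{(k)}\right]^{-1}. \tag{4.18}
\]
Hence
\[
r^{-\lambda}\chi(\theta_N) = \Big\{ \prod \tilde{T}_k \Big\}\, r^{-\lambda}\chi(\theta_0) = \tilde{T}_N \tilde{T}_{N-1} \cdots \tilde{T}_1\, r^{-\lambda}\chi(\theta_0). \tag{4.19}
\]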
We now consider the boundary conditions on the exterior edges, which will provide the last elements for the determination of the eigenvalues and the associated eigensolutions. For a closed wedge, the lines θ = θ0 and θ = θN coincide, and the displacements and stress potentials are required to be continuous across the line. Therefore,
\[
\Big\{ \prod \tilde{T}_k - I_6 \Big\}\, \chi(\theta_0) = 0. \tag{4.20}
\]
This yields the characteristic equation for the closed wedge:
\[
\mathrm{Det}\Big[ \prod \tilde{T}_k - I_6 \Big] = 0. \tag{4.21}
\]
For every root λ of the characteristic equation, equation (4.20) has a nontrivial solution χ(θ0), uniquely determined except for an arbitrary complex factor. Then
\[
c^{(1)} = \left[\Lambda^{(1)}(\theta_0)\right]^{-1} \left[Z^{(1)}\right]^{-1} r^{-\lambda}\chi(\theta_0), \tag{4.22}
\]
and equation (4.7) yields $c^{(k)}$ of the other sectors. The displacements and the stress potentials are given by equation (4.2) in the successive sectors.

For an open wedge, there are generally three homogeneous boundary conditions imposed on the six components of $[\mathrm{rot6}]\chi = \{(1/r)F_{,\theta},\, -F_{,r},\, \Psi,\, u_r,\, u_\theta,\, w\}^T$ at θ = θ0, and another set of three conditions at θ = θN. A problem involving nonhomogeneous boundary conditions may be reduced to one with homogeneous boundary conditions by superposing a suitable particular solution. Hence the two sets of conditions may be written as
\[
Q_0 [\mathrm{rot6}(\theta_0)]\chi(\theta_0) = 0, \qquad Q_N [\mathrm{rot6}(\theta_N)]\chi(\theta_N) = 0, \tag{4.23a,b}
\]
where $Q_0$ and $Q_N$ are 3 × 6 matrices of rank 3, i.e., each having at least one nonsingular 3 × 3 submatrix. We may interchange certain pairs of elements of [rot6]χ(θ0) and the corresponding pairs of columns of $Q_0$ so that, after the rearrangement, [rot6]χ(θ0) becomes a column vector whose first and last three elements form the vectors χ′ and χ″, respectively, whereas the matrix $Q_0$ changes to [Q′, Q″], in which the first 3 × 3 submatrix Q′ is nonsingular. Equation (4.23a) becomes Q′χ′ + Q″χ″ = 0. Hence, $\chi' = -(Q')^{-1}Q''\chi''$. The vector [rot6(θ0)]χ(θ0) is related to the rearranged vector $\{\chi'^T, \chi''^T\}^T$ by a 6 × 6 matrix K. This matrix has zero elements except the following: (i) if the rearrangement leaves any element of the vector [rot6(θ0)]χ(θ0) unchanged, then the corresponding diagonal element of K has the value 1, and (ii) if any pair of distinct elements, the ith and the jth, are interchanged in the rearrangement, then $K_{ij} = K_{ji} = 1$. If the rearrangement specified by K is repeated once, then the elements of the rearranged vector resume their original positions. Hence $K^{-1} = K$. Equations (4.23) and (4.19) yield
\[
[\mathrm{rot6}(\theta_0)]\chi(\theta_0) = K \begin{Bmatrix} \chi' \\ \chi'' \end{Bmatrix} = K \begin{bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{bmatrix} \chi'', \tag{4.24}
\]
\[
Q_N [\mathrm{rot6}(\theta_N)] \Big\{ \prod \tilde{T}_k \Big\} [\mathrm{rot6}(-\theta_0)]\, K \begin{bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{bmatrix} \chi'' = 0. \tag{4.25}
\]
Hence the general form of the characteristic equation for the eigenvalues of an open multi-material wedge is
\[
\mathrm{Det}\left\{ Q_N [\mathrm{rot6}(\theta_N)] \Big\{ \prod \tilde{T}_k \Big\} [\mathrm{rot6}(-\theta_0)]\, K \begin{bmatrix} -(Q')^{-1}Q'' \\ I_3 \end{bmatrix} \right\} = 0. \tag{4.26}
\]
The displacement and stress fields in the various sectors are obtained in a way similar to the eigensolutions of a closed wedge. If the real part of λ is smaller than one, then the stress field near r = 0 has the $r^{-(1-\mathrm{Re}[\lambda])}$ type singularity. Such wedge eigenvalues are called singular. The roots of the characteristic equation with the real parts smaller than or equal to zero must be ignored because the associated eigensolution requires infinite strain energy.

We now show that the wedge eigenvalues λ occur in complex conjugate pairs. According to equation (4.2), an eigensolution of the wedge has the following expression in each sector:
\[
\chi = r^\lambda \begin{bmatrix} B & \bar{B} \\ A & \bar{A} \end{bmatrix}
\begin{bmatrix} D_1 \langle (\cos\theta + \mu\sin\theta)^\lambda \rangle & 0 \\ 0 & \bar{D}_1 \langle (\cos\theta + \bar{\mu}\sin\theta)^\lambda \rangle \end{bmatrix} c. \tag{4.27}
\]
The complex conjugate of the preceding expression yields
\[
\bar{\chi} = r^{\bar{\lambda}} \begin{bmatrix} \bar{B} & B \\ \bar{A} & A \end{bmatrix}
\begin{bmatrix} \bar{D}_1 \langle (\cos\theta + \bar{\mu}\sin\theta)^{\bar{\lambda}} \rangle & 0 \\ 0 & D_1 \langle (\cos\theta + \mu\sin\theta)^{\bar{\lambda}} \rangle \end{bmatrix} \bar{c}
= r^{\bar{\lambda}} \begin{bmatrix} B & \bar{B} \\ A & \bar{A} \end{bmatrix}
\begin{bmatrix} D_1 \langle (\cos\theta + \mu\sin\theta)^{\bar{\lambda}} \rangle & 0 \\ 0 & \bar{D}_1 \langle (\cos\theta + \bar{\mu}\sin\theta)^{\bar{\lambda}} \rangle \end{bmatrix} \hat{c}, \tag{4.28}
\]
where $\hat{c}$ is obtained from $\bar{c}$ by interchanging the first three and the last three elements. Comparing equations (4.27) and (4.28), one finds that all interfacial continuity conditions as well as the homogeneous boundary conditions on θ = θ0 and θ = θN of an open wedge remain satisfied when λ and $c^{(k)}$ are replaced by $\bar{\lambda}$ and $\hat{c}^{(k)}$, respectively (k = 1, 2, . . . , N). Therefore, if λ and $c^{(1)}, c^{(2)}, \ldots, c^{(N)}$ constitute an eigensolution of the multi-material wedge, then so do $\bar{\lambda}$ and $\hat{c}^{(1)}, \hat{c}^{(2)}, \ldots, \hat{c}^{(N)}$. For such a pair of solutions, the combined displacements and stress potentials, $\chi + \bar{\chi}$, are real-valued in all sectors. In particular, if λ is a real eigenvalue with the associated vector $c^{(k)}$ in the kth sector, then equation (4.28) implies that λ and $\hat{c}^{(k)}$ also constitute an eigensolution, and so do λ and $c^{(k)} + \hat{c}^{(k)}$. While $c^{(k)} + \hat{c}^{(k)}$ are generally not real, they yield real-valued displacements and stresses in all sectors. The last three elements of $c^{(k)} + \hat{c}^{(k)}$ are the complex conjugates of the first three.

If λ is a repeated root of the characteristic equation, then equation (4.20) or (4.25) (for closed and open wedges, respectively) may give more than one independent solution χ(θ0) or χ″, and equation (4.22) gives an equal number of independent vectors $c^{(1)}$. Each vector yields an independent eigensolution. However, the number of independent solutions of equation (4.20) or (4.25) may be smaller than the multiplicity of the eigenvalue λ. In such a case, additional eigensolutions involving the factors $\log(x + \mu_i^{(k)} y)$ and possibly their integer powers may be obtained by differentiating $\chi^{(k)} = Z^{(k)} D \langle (x + \mu^{(k)} y)^\lambda \rangle c^{(k)}$ with respect to λ. The first derivative yields the additional solution
\[
\chi^{*(k)} = Z^{(k)} D \left\langle \log(x + \mu^{(k)} y)\,(x + \mu^{(k)} y)^{\lambda} \right\rangle c^{(k)} + Z^{(k)} D \left\langle (x + \mu^{(k)} y)^\lambda \right\rangle c^{*(k)}, \tag{4.29}
\]
where, if equations (4.22) and either (4.20) or (4.25) are recast in the form $G_0(\lambda)c^{(1)} = 0$, then $c^{*(1)}$ is obtained by solving the equation
\[
G_0\, c^{*(1)} + \frac{dG_0}{d\lambda}\, c^{(1)} = 0, \tag{4.30}
\]
and $c^{*(k)}$ in the other sectors may be obtained recursively from
\[
c^{*(k+1)} = T_k\, c^{*(k)} + \frac{dT_k}{d\lambda}\, c^{(k)}. \tag{4.31}
\]
All functions of λ in equations (4.29)–(4.31) should be evaluated at the repeated root, and the evaluation should be made only after performing all required differentiations with respect to λ.

The general characteristic equations for closed and open wedges, equations (4.21) and (4.26), have an infinite number of roots. The eigenvalues must generally be determined by numerical methods. Furthermore, multiple eigenvalues need to be identified. For each multiple eigenvalue λ, equation (4.20) or (4.25) yields 3 − r independent solutions χ(θ0) or χ″, where r is the rank of the matrix in the respective equation. If the multiplicity of λ is larger than 3 − r, then additional eigensolutions of the form (4.29), and others obtained by further differentiation with respect to λ, must be found.
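One standard way to count the roots of an analytic function inside a closed contour, with multiplicity, is the argument principle: the number of zeros equals the winding number of the function values around the origin. The following Python sketch is an illustration only. A simple polynomial stands in for the wedge characteristic determinant, and the function name, contour and sample count are assumptions; the routine presumes that the function is analytic inside the circle and has no zeros on it.

```python
import cmath
import math


def count_zeros_in_circle(func, center, radius, samples=4000):
    """Winding number of func along a circle, i.e. the number of enclosed zeros.

    The phase increments between neighbouring sample points are accumulated,
    so the sampling must be fine enough that each increment stays below pi.
    """
    total = 0.0
    previous = func(center + radius)
    for k in range(1, samples + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / samples)
        current = func(z)
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2 * math.pi))


def stand_in(z):
    """Hypothetical test function with a double root at 0.5 and a simple root at 2."""
    return (z - 0.5) ** 2 * (z - 2.0)


if __name__ == "__main__":
    print(count_zeros_in_circle(stand_in, 0.0, 1.0))  # encloses the double root only
    print(count_zeros_in_circle(stand_in, 0.0, 3.0))  # encloses all three zeros
```

Shrinking the circle around a suspected root and watching the count gives its multiplicity, subject to the caveat that two very close roots cannot be told apart from a double root.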
Due to the complexity of the characteristic equations, it is not practical to ascertain the multiplicity of a root by taking and evaluating the derivatives of the equation. However, the argument principle in the complex variable theory [19] provides an exceedingly useful mathematical tool for exhaustive search of all roots in any finite region of the complex plane. The principle gives the number of zeros of a function in the region enclosed by any closed path, with a repeated root counted as many times as its multiplicity. Hence the multiplicity can be determined except in the case when there are two very close roots. For practical purposes, however, two or more slightly different roots may be treated as a single multiple root, and the elasticity solution is not appreciably affected by this alteration.

5. Examples: Elasticity Solutions of a Bisector Wedge and a Trisector Wedge

Two examples are given in this section to illustrate the entire procedure of elasticity analysis of multimaterial wedges in a composite structure subjected to arbitrary mechanical loads. For each example, several solutions based on truncated eigenseries of different lengths are computed to ascertain accuracy and convergence, and to examine the relevance and usefulness of the asymptotic solution and the associated “generalized stress intensity factors”. The procedure includes the following steps:

(i) Use a conventional finite element structural analysis code to determine the traction vector along a circular path encircling the vertex of the wedge. The path should be separated from the vertex by at least a few rings of elements. Although the finite element solution cannot closely approximate the singular stress field in the immediate vicinity of the vertex, it yields sufficiently accurate results of stress on the path. This may be validated by refining the mesh and comparing the original and refined finite element solutions.

(ii) The traction data σr, τrθ and τrz on the path are curve-fitted by Fourier series in θ. The results are integrated analytically with respect to θ to obtain the data of F,r, F,θ and Ψ.

(iii) The material eigenvalues and eigenvectors are found for each sector of the wedge by using the symbolic algebraic capabilities of Mathematica [20]. In the cases of degenerate or extra-degenerate sectors, higher-order eigenvectors associated with multiple eigenvalues, as described in Section 3, are obtained by implementing the derivative rule. This yields the matrices Z(k) of equation (4.2) for all sectors. The angular matrix functions of equation (4.3) are also determined except for the wedge eigenvalue λ.

(iv) The characteristic equation for a closed or open wedge, i.e., equation (4.21) or (4.26), respectively, is derived explicitly in closed analytical form by using symbolic algebra. For multimaterial wedges with more than two sectors, the characteristic equation contains hundreds of terms or more. All real and complex eigenvalues with the real parts below a certain level are determined by numerical techniques. This is achieved in Mathematica by using the “FindRoot” command in conjunction with contour-plotting. However, the argument principle provides crucial help for ascertaining the number of roots (and the multiplicity of repeated roots) within any closed curve. Each real or complex root λ determines a wedge eigensolution whose analytical expression in the kth sector is given by equations (4.2)–(4.6).

(v) The wedge eigensolutions associated with the successive wedge eigenvalues are linearly combined to form a (truncated) eigenseries. The coefficients of combination are determined by collocation along the path. Generally there are more data points than coefficients, and a least-square error criterion is used to best fit the data of stress potentials on the collocation path.

The composite structural model to be studied has three layers of equal thickness 2 cm. The middle layer has the length 12 cm in the x-direction while the top and bottom layers are 14 cm long, as shown in Figure 1 in its deformed state under a shear loading. All three layers are made of the same unidirectional graphite-epoxy composite whose homogenized anisotropic elastic properties are characterized by the following elastic constants:
\[
\begin{array}{lll}
E_1 = 181\ \text{GPa}, & E_2 = E_3 = 10.3\ \text{GPa}, & \nu_{12} = \nu_{13} = \nu_{23} = 0.28, \\
G_{12} = G_{13} = 7.17\ \text{GPa}, & G_{23} = 4.023\ \text{GPa}. &
\end{array} \tag{5.1}
\]

Figure 1. Deformed 3-layer model under shear.

The fiber axes in the bottom, middle and top layers are oriented at angles 30°, 0° and −60°, respectively, with respect to the z-direction. The middle layer has a crack in a 45° inclined plane that runs through the entire thickness of the layer, and also through the entire width in the z-direction. This is a matrix crack since the crack plane is parallel to the fibers. Since the stress solutions near the singularities will be examined at various scale lengths including those that are much smaller than fiber diameters, the elasticity solutions to be obtained are not so much appropriate to the three-layer composite model as to the layerwise homogenized model. That is, strictly speaking, our analysis and solutions concern the layerwise homogenized model, not the composite model. But the three layers will still be designated as 30°, 0° and −60° layers. The lower surface of the model is fixed and the upper surface is moved rigidly in the negative x-direction through a distance 1/100 cm. Plane strain condition εz = 0 is maintained for the model. The material eigenvalues of the three layers are all purely imaginary:
\[
\begin{array}{llll}
-60^\circ\ \text{layer:} & \pm 0.871351\,i, & \pm 1.25958\,i, & \pm 4.26833\,i, \\
0^\circ\ \text{layer:} & \pm i, & \pm i, & \pm i, \\
30^\circ\ \text{layer:} & \pm 0.959036\,i, & \pm 1.093396\,i, & \pm 2.596064\,i.
\end{array} \tag{5.2}
\]
The 0° middle layer is transversely isotropic, and therefore degenerate (but not extra-degenerate). It has two zeroth-order material eigenvectors and one first-order eigenvector. There are six multimaterial singularities in this model. The singularities at the two ends of the inclined crack are associated with trisector wedges. At points B and C, one has the well-known free-edge singularities, which will not be studied in this work. The bisector wedges at A and B are similar but not identical to those at D and C, respectively, due to the different orientations of the top and bottom layers. For the two bisector wedges at the reentrant corners A and D, and for the two trisector wedges at the ends of the inclined crack, the wedge eigenvalues are shown in Table I. For these four wedges, all singular eigenvalues are real. Hence the elasticity solutions do not show oscillatory behavior in the immediate vicinities of the singularities. In addition, as one approaches the vertex of a wedge, the elasticity solution approaches the asymptotic limit determined by the real eigenvector associated with the dominant real eigenvalue. Both the radial and angular dependence of this real asymptotic solution are determined by the geometry and material of the wedge sectors and the edge boundary conditions except for a real amplitude factor which is determined by remote loading. Thus, changes in remote loading can only affect this amplitude factor but cannot change the ratios of stress components of the asymptotic solution. This is in stark contrast with the case of complex conjugate dominant singularities where the stress ratios of the asymptotic solution (for example, the stress-intensity factors of interface cracks) generally depend on remote loading.
The distribution of eigenvalues in the complex plane has a similar pattern for the two corner wedges, and also for the two trisector wedges, despite the large difference in the axial stiffness of the −60° and 30° layers (110.693 GPa and 24.690 GPa, respectively). In general, the wedge eigenvalues are strongly affected by the edge conditions and the wedge geometry (including the number of sectors and sector angles), but are less sensitive to moderate changes in the elastic constants of the sectors. The two bisector wedges have simple integer eigenvalues 1, 2, 4, 6, etc. For the trisector wedges, every positive integer is a triple eigenvalue. However, one of the three eigenvectors associated with λ = 1 is a rigid rotation mode that contributes no stress. It is interesting to notice that the dominant singular eigenvalues of the two trisector wedges are smaller than 0.5 (λ = 0.4891194 and 0.4732559, respectively, for the wedge at the lower and upper end of the crack). That is, the strengths of the singularities exceed that of an interface crack.

Table I. Eigenvalues of four multimaterial wedges

[0/30] at D: 0.5585964; 0.7150752; 0.9339855; 1; 1.2822588; 1.6422164 ± 0.2330850 i; 2; 2.2631785 ± 0.3099347 i; 2.7187519; 2.9863063 ± 0.2948049 i; 3.2811556; 3.7095433 ± 0.4676336 i; 4; 4.2072597 ± 0.4663597 i; 4.7189494; 4.9936378 ± 0.3531516 i; 5.2810307; 5.8077471 ± 0.6126472 i; 6; 6.1164562 ± 0.5507335 i; 6.7189994; 6.9967129 ± 0.3782984 i; 7.2809934; 7.9133476 ± 0.8085709 i; 8

[−60/0] at A: 0.5924654; 0.6760317; 0.9679459; 1; 1.3290965; 1.6881490 ± 0.2455260 i; 2; 2.1615945 ± 0.2719786 i; 2.6696804; 2.9973732 ± 0.1681712 i; 3.3304593; 3.8604587 ± 0.7342584 i; 4; 4.0072164 ± 0.2662313 i; 4.6694182; 4.9992643 ± 0.1843503 i; 5.3306112; 5.8853940 ± 1.0211285 i; 6; 6.0013338 ± 0.2422297 i; 6.6693539; 6.9997085 ± 0.1893604 i; 7.3306568; 7.9007901 ± 1.2123062 i; 8

Lower end of crack: 0.4891194; 0.5056793; 0.6130216; 1, 1, 1; 1.4577165; 1.5280594 ± 0.0266804 i; 2, 2, 2; 2.4480841 ± 2.4652695; 2.5869116; 3, 3, 3; 3.5396360 ± 0.0972610 i; 3.7955437; 4, 4, 4; 4.2567490 ± 0.1718115 i; 4.4462049; 5, 5, 5; 5.3228318 ± 0.2168114 i; 5.4462668; 6, 6, 6; 6.5306451; 6.5936590 ± 0.3364298 i; 7, 7, 7; 7.5448581; 7.8660481 ± 1.1948926 i; 7.9450938 ± 0.3830909 i; 8, 8, 8

Upper end of crack: 0.4732559; 0.5271834; 0.6922293; 1, 1, 1; 1.3592020; 1.4947400 ± 0.0418597 i; 2, 2, 2; 2.4973147 ± 0.1793125 i; 2.5318482; 3, 3, 3; 3.5437306; 3.8255711 ± 0.9599962 i; 3.8721508 ± 0.3127019 i; 4, 4, 4; 4.4694496; 5, 5, 5; 5.2557896 ± 0.4775619 i; 5.4666851; 6, 6, 6; 6.5271205; 6.5893884 ± 0.5830187 i; 7, 7, 7; 7.5344069; 7.7590945 ± 1.9577296 i; 7.9333809 ± 0.6565767 i; 8, 8, 8

For the bisector wedge at A, an elasticity solution is obtained by the preceding procedure, using the traction data on a circle of radius r0 = 1 cm generated by a finite element analysis using over 2000 triangular elements. Twenty-two wedge eigensolutions are combined, including all eigensolutions with Re[λ] ≤ 5. The resulting interfacial stresses between the 0° and −60° layers are shown in Figure 2. The leading term of the eigenseries contributes the dominant singular stress field of the order $r^{\lambda-1}$, where λ = 0.5924654 is the first eigenvalue. When this asymptotic stress field is multiplied by the factor $r^{1-\lambda}$, the result is independent of r, and is a function of θ only. The values of its components on the upper and lower sides of the interface θ = 0 may be taken as the generalized stress intensity factors $S_{ij}^+$ and $S_{ij}^-$ (in fact $S_{ij}^+$ and $S_{ij}^-$ determine each other algebraically due to the three continuity conditions of tractions and another three continuity conditions of tangential strains):
\[
\begin{array}{lll}
S_{yy}^+ = S_{yy}^- = 5.89400, & S_{xy}^+ = S_{xy}^- = -2.41276, & S_{yz}^+ = S_{yz}^- = -1.27436, \\
S_{xx}^+ = 20.8647, & S_{zz}^+ = 6.84576, & S_{xz}^+ = -8.23518, \\
S_{xx}^- = 4.12336, & S_{zz}^- = 2.77966, & S_{xz}^- = 0.94912,
\end{array} \tag{5.3}
\]
where the unit of the stress is MPa.
Figure 2. Interfacial stresses of the bisector wedge.
The results of Figure 2 can only be deciphered for the range $10^{-2} \le \bar r \le 1$, where $\bar r \equiv r/r_0$. In Figures 3(a)–(c), the elasticity solutions for the interfacial stresses are normalized through multiplication by the factor $\bar r^{1-\lambda}$ and plotted as solid curves against $\log_{10}\bar r$. The normalized $\sigma_y$ and $\tau_{yz}$ approach the asymptotic solution very slowly (Figures 3(a) and (c)) compared with $\tau_{xy}$. Significant discrepancies remain even as $r$ decreases to the subatomic scale and beyond (since $r_0 = 1$ cm, $r = 10^{-9}$ m corresponds to $\log_{10}\bar r = -7$). This is due to the closeness of the first two eigenvalues. A two-term approximate solution, obtained by discarding all but the first two eigensolutions in the 22-term elasticity solution, yields the interfacial stresses shown as broken curves in Figures 3(a)–(c). These results agree closely with the 22-term solution except in the domain $\bar r > 10^{-3}$. The tangential stresses $\sigma_x$, $\sigma_z$ and $\tau_{xz}$ are discontinuous across the interface. Their normalized values on the upper and lower sides of the interface are shown in Figures 4(a) and (b), with $\bar r$ again plotted on a logarithmic scale.
Figure 3. (a) $\sigma_y$ of 22-, 2- and 1-term series; (b) $\tau_{xy}$ of 22-, 2- and 1-term series; (c) $\tau_{yz}$ of 22-, 2- and 1-term series.
Figure 4. Tangential stresses (a) on the upper side and (b) on the lower side of the interface.
The preceding solution is compared with five additional elasticity solutions using truncated eigenseries of various lengths. Each solution includes all terms associated with wedge eigenvalues whose real parts are not greater than $N$, where $N$ is taken successively to be 1, 2, 3, 4 and 10 ($N = 5$ corresponds to the preceding 22-term solution). The generalized stress intensity factors $S_{yy}$ for all six solutions are shown in Figure 5. For each solution, the other components of $S_{ij}^{+}$ and $S_{ij}^{-}$ are determined by the same stress ratios of the dominant singularity as given in equation (5.3). Except for the two solutions with the smallest number of terms, the solutions are in close agreement. The eigenseries converge rapidly and, for this wedge, an accurate solution requires only eigensolutions with $\lambda < 3$.
Figure 5. $S_{yy}$ of various solutions, bisector wedge.
For the trisector wedge at the lower end of the inclined crack, the traction data on the circle $r = r_0 = 1$ cm are obtained from the same finite-element analysis. A 29-term eigenseries including all eigensolutions with $\mathrm{Re}[\lambda] \le 5$ is used in collocation. The resulting interfacial stresses on the interface to the left of the singularity are shown in Figure 6. The results are normalized with respect to $\bar r^{-0.510881}$ and plotted versus $\log_{10}\bar r$ in Figures 7(a)–(c), where $\bar r$ covers a much wider range, from $10^{-50}$ to 1. The corresponding asymptotic solutions are shown as dashed horizontal lines. Any lingering faith that the asymptotic solution generally provides useful and realistic parameters for characterizing the criticality of a stress singularity and for predicting failure initiation must be dispelled by the behavior of the interfacial peeling stress shown in Figure 7(a). The peeling stress of the elasticity solution is nearly ten times greater than the asymptotic solution at $\bar r = 10^{-5}$, and about eight times greater at $\bar r = 10^{-10}$. It is more than three times greater even at $\bar r = 10^{-50}$. On the other hand, $\tau_{xy}$ approaches the asymptotic solution much faster, as shown in Figure 7(b). Therefore, the stress ratios of the elasticity solution are very different from the ratios of the generalized stress intensity factors.
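The slow approach to the asymptotic solution can be illustrated with a toy two-term model, offered as a minimal sketch and not as the computation used in this work. If a stress component is approximated by $K_1 \bar r^{\lambda_1-1} + K_2 \bar r^{\lambda_2-1}$, then after multiplication by $\bar r^{1-\lambda_1}$ its relative deviation from the asymptotic value is $(K_2/K_1)\,\bar r^{\lambda_2-\lambda_1}$, which decays very slowly when the two eigenvalues are close. In the Python sketch below, `lam1` is the first eigenvalue of the lower trisector wedge quoted in the text; `lam2`, `k1` and `k2` are hypothetical values chosen only for illustration, and the function names are assumptions.

```python
import math


def normalized_two_term(r_bar, lam1, lam2, k1, k2):
    """Two-term stress k1*r^(lam1-1) + k2*r^(lam2-1), multiplied by r^(1-lam1)."""
    return k1 + k2 * r_bar ** (lam2 - lam1)


def radius_for_tolerance(lam1, lam2, k1, k2, tol):
    """r_bar at which the second term equals `tol` times the first (relative deviation)."""
    return (tol * abs(k1 / k2)) ** (1.0 / (lam2 - lam1))


lam1 = 0.4891194   # first eigenvalue, lower trisector wedge (from the text)
lam2 = 0.52        # HYPOTHETICAL second eigenvalue, close to lam1
k1, k2 = 1.0, 5.0  # HYPOTHETICAL coefficients of the two terms

# Ratio of the two-term result to the asymptotic (one-term) value at several scales.
for exponent in (-5, -10, -50):
    r_bar = 10.0 ** exponent
    ratio = normalized_two_term(r_bar, lam1, lam2, k1, k2) / k1
    print(f"r_bar = 1e{exponent}: two-term / asymptotic = {ratio:.3f}")

# Scale below which the asymptotic term is accurate to 5% in this toy model.
r_star = radius_for_tolerance(lam1, lam2, k1, k2, tol=0.05)
print(f"log10(r_bar) for 5% agreement: {math.log10(r_star):.1f}")
```

Because the exponent $\lambda_2 - \lambda_1$ is small, reducing $\bar r$ by many orders of magnitude changes the deviation only modestly. The elasticity solutions in the text involve more than two terms, so this model is qualitative only.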
The asymptotic solution has neither a physical relation nor a mathematical semblance to the stress field in the wedge at any physically meaningful length scale, because the region of dominance of the asymptotic solution is extraordinarily small, of an order much smaller than $10^{-50}$. Outside this minute region, the next two eigensolutions contribute significantly, as shown by the dashed curves in Figures 7(a)–(c), which combine the first three terms of the 29-term solution. The 3-term solution agrees closely with the full 29-term solution in the region $\bar r < 10^{-5}$. Notice that the trisector wedge has a series of clusters of three closely spaced eigenvalues on or near the real axis, as seen in Table I.
Figure 6. Interfacial stresses on the left interface, trisector wedge.
Figure 7. (a) $\sigma_y$, (b) $\tau_{xy}$, (c) $\tau_{yz}$ of 29-, 3- and 1-term series.
Figure 8. $S_{yy}$ of various solutions, left interface of the trisector wedge.
On the interface to the right of the singularity, the interfacial stresses are found to be significantly smaller and the plots are not shown. The stresses $\sigma_y$ and $\tau_{xy}$ approach the asymptotic solution more rapidly than on the left interface, but $\tau_{yz}$ still approaches it very slowly. The generalized stress intensity factors are given by the asymptotic stresses on the left and right interfaces multiplied by $\bar r^{0.510881}$. One has (all results in MPa):
On the left interface: $S_{yy} = 0.229723$, $S_{xy} = -0.577359$, $S_{yz} = 0.888552$.
On the right interface: $S_{yy} = 0.307952$, $S_{xy} = 0.510823$, $S_{yz} = -0.323603$.
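As a small arithmetic illustration of how these factors are used, the leading-term (asymptotic) interfacial stress at a given scale is $S_{ij}\,\bar r^{\lambda-1}$, with $\bar r = r/r_0$ and $r_0 = 1$ cm. The Python sketch below uses the values listed above; the helper names are hypothetical. It evaluates only the asymptotic term, which the preceding discussion shows is not representative of the elasticity solution at physically relevant scales.

```python
import math

ONE_MINUS_LAMBDA = 0.510881  # 1 - lambda for the lower trisector wedge (from the text)
R0_METERS = 0.01             # r0 = 1 cm (from the text)

# Generalized stress intensity factors in MPa, as listed in the text.
S_LEFT = {"yy": 0.229723, "xy": -0.577359, "yz": 0.888552}
S_RIGHT = {"yy": 0.307952, "xy": 0.510823, "yz": -0.323603}


def log10_r_bar(r_meters, r0_meters=R0_METERS):
    """log10 of the normalized radius r_bar = r / r0."""
    return math.log10(r_meters / r0_meters)


def asymptotic_stress(s_ij, r_bar):
    """Leading-term stress S_ij * r_bar**(lambda - 1), in MPa."""
    return s_ij * r_bar ** (-ONE_MINUS_LAMBDA)


# r = 1e-9 m corresponds to log10(r_bar) = -7.
print("log10(r_bar) at r = 1e-9 m:", round(log10_r_bar(1e-9), 6))

# Asymptotic peeling stress on both interfaces at that scale.
r_bar = 1e-9 / R0_METERS
for name, factors in (("left", S_LEFT), ("right", S_RIGHT)):
    print(name, "asymptotic sigma_y [MPa]:", asymptotic_stress(factors["yy"], r_bar))

# Ratio of the S_yy factors, right over left; independent of r_bar.
syy_ratio = S_RIGHT["yy"] / S_LEFT["yy"]
print("S_yy right / left:", syy_ratio)
```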
In the interior segment of the interface, at sufficiently large distances from the bisector and trisector singularities, the stresses $\sigma_y$, $\tau_{xy}$ and $\tau_{yz}$ reach the constant values 0, −7.85475 MPa and 0, respectively. These limiting values are given by the layerwise constant stresses in an otherwise identical model without the inclined crack and with infinite length in the axial direction. In the opposite limit of extremely small $r$, the stress $\tau_{xy}$ on the left and right interfaces approaches negative and positive asymptotic results, respectively, whose ratio is independent of loading (since the dominant singular value is real). Besides the 29-term solution, five additional solutions using all eigensolutions with $\lambda \le N$ are obtained, where $N$ assumes the values 1, 2, 3, 4 and 6. For the stress intensity factor $S_{yy}$ on the right interface, the results of the various solutions are compared in Figure 8. The 11-term solution ($N = 2$) yields relatively poor results, with an error of more than 20% compared with the 29-term solution. The last four solutions, all of which include the eigensolutions with $\lambda \le 3$, are in excellent agreement.
On different interfaces the elasticity solutions show different stress ratios. These ratios also vary greatly with $r$. Comparison is meaningful only in a physically relevant range of $\bar r$, and comparing the interfacial stresses of different wedges requires normalization with respect to a common power, which is conveniently taken to be $\bar r^{-1/2}$. Figures 9 and 10 show the results for the bisector wedge and for the left interface of the trisector wedge, both normalized with respect to $\bar r^{-1/2}$ and plotted over the range $-7 \le \log_{10}\bar r \le 0$ (corresponding to $10^{-9}$ m $\le r \le 0.01$ m). Within this range, the stresses on the left interface of the trisector wedge (Figure 10) are much higher than those on the right interface (not shown). But the generalized stress intensity factors of the right interface exceed those on the left by more than one third. The relative intensity of the peeling stress of the two wedges, as shown in Figures 9 and 10, appears to depend on the length scale and, therefore, may require consideration of damage mechanism and microstructure.
Figure 9. Interface stresses of the bisector wedge in a realistic range of $r$.
Figure 10. Trisector wedge, left interface stresses over a realistic range of $r$.
In the literature, so much attention has been directed to the order of stress singularities that it is almost taken for granted that a stronger order is invariably more threatening. In the present case, the trisector wedge has a stronger order, with the lowest eigenvalue 0.489119 versus 0.592465 for the bisector wedge. But this difference may have very little to do with the severity of the interfacial stresses in the two wedges over the physically relevant range of scales. The results in Figure 9 for the bisector wedge are more severe than those on the right interface of the trisector wedge. Yet for exceedingly small length scales, the mathematical solution for the stresses on the latter interface will grow faster and exceed in magnitude the stresses in the bisector wedge.
Ultimately, if interface failure prediction is to be based on the elasticity solution, one must restrict attention to a physically relevant range of $r$ and ignore the mathematical solution beyond that range. That is, one must formulate and apply failure criteria to the relevant analysis results, such as those shown in Figures 9 and 10. Notice that these figures require an accurate elasticity analysis as presented in this work. To obtain results at a length scale $\bar r = 10^{-N}$ directly, one would need a conventional finite-element analysis with close to $10^{2N}$ elements, unless a substructuring method is used.
Solutions of the preceding two examples lead to the following observations:
(1) The two-step substructure approach, in which a conventional finite-element analysis is used to generate the traction boundary conditions on a path encircling a singularity, and an elasticity solution of the multimaterial wedge is obtained by combining eigensolutions, provides a reliable, highly efficient and accurate method for the analysis of singularities in heterogeneous structures.
(2) The eigenseries converge rapidly. In the two examples, the various elasticity solutions that include all eigensolutions with $\lambda \le 3$ are in excellent agreement. However, a solution that excludes some terms with smaller $\lambda$ may incur very significant error.
(3) When the lowest singular eigenvalue is real, the elasticity solution approaches the dominant singular solution asymptotically as $r \to 0$, but the approach may be very slow. As $r$ decreases to subatomic size, and even to $10^{-50} r_0$ in the present problems, the elasticity solution for the interfacial stress may still differ significantly from the asymptotic solution. Over the intervening range of $r$, the ratios of the stress components of the elasticity solution vary widely with $r$, and they can be very different from the ratios of the generalized stress intensity factors.
(4) Hence the asymptotic solution (including both the order of singularity and the generalized stress intensity factors) cannot generally be used to characterize the criticality of a stress singularity, nor as the main basis for predicting failure initiation. Interface failure may be affected by stress contributions from the second and third eigensolutions, and failure criteria must be based on relevant size scales as determined by the model geometry and microstructure (fiber diameters, etc.), since the mathematical results for the stress level in a minute region of subatomic size have no relevance to physical processes, including failure initiation.
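The second step of the substructure approach in observation (1), fitting the coefficients of an eigenseries to traction data on a circle and then evaluating the series at much smaller radii, can be sketched as a linear least-squares problem. The Python example below is a schematic under strong assumptions and is not the implementation used in this work. The angular functions are simple placeholders, not the wedge eigenfunctions; the eigenvalues are real and, apart from the first (the bisector wedge value quoted in the text), hypothetical; and the "finite-element" traction data are synthetic. All names are illustrative.

```python
import numpy as np


def fit_eigenseries(theta, traction, angular_fns):
    """Least-squares coefficients c_k with sum_k c_k * f_k(theta) ~ traction on r_bar = 1."""
    design = np.column_stack([f(theta) for f in angular_fns])
    coeffs, *_ = np.linalg.lstsq(design, traction, rcond=None)
    return coeffs


def evaluate_series(r_bar, theta, coeffs, eigenvalues, angular_fns):
    """Evaluate sum_k c_k * r_bar**(lam_k - 1) * f_k(theta)."""
    return sum(c * r_bar ** (lam - 1.0) * f(theta)
               for c, lam, f in zip(coeffs, eigenvalues, angular_fns))


# PLACEHOLDER angular functions and HYPOTHETICAL eigenvalues (first one from the text).
angular_fns = [np.ones_like, np.cos, lambda t: np.cos(2.0 * t)]
eigenvalues = [0.5924654, 1.0, 1.6]

# Synthetic stand-in for finite-element traction data sampled on the circle r_bar = 1.
theta = np.linspace(-np.pi / 3.0, np.pi / 3.0, 40)
true_coeffs = np.array([2.0, -1.0, 0.5])
traction = sum(c * f(theta) for c, f in zip(true_coeffs, angular_fns))

# Step 2 of the approach: collocation fit, then evaluation far inside the circle.
coeffs = fit_eigenseries(theta, traction, angular_fns)
print("recovered coefficients:", coeffs)
print("series at r_bar = 1e-7, theta = 0:",
      evaluate_series(1e-7, 0.0, coeffs, eigenvalues, angular_fns))
```

Once the coefficients are known, evaluating the series at $\bar r = 10^{-7}$ costs no more than at $\bar r = 1$, which is the practical advantage over refining a conventional mesh down to that scale.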