Mechanistic Images in Geometric Form
Mechanistic Images in Geometric Form Heinrich Hertz’s Principles of Mechanics
JESPER LÜTZEN
Department of Mathematics, University of Copenhagen
Great Clarendon Street, Oxford OX2 6DP Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in Oxford New York Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto With offices in Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries Published in the United States by Oxford University Press Inc., New York © Oxford University Press 2005 The moral rights of the author have been asserted Database right Oxford University Press (maker) First published 2005 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer British Library Cataloguing in Publication Data Data available Library of Congress Cataloging-in-Publication Data Lützen, Jesper. Mechanistic images in geometric form : Heinrich Hertz’s Principles of mechanics / Jesper Lützen. p. cm. Includes bibliographical references and index. ISBN 0–19–856737–5 (acid-free paper) 1. Mechanics, Analytic. 2. Geometry. I. Title. 
QA805.L85 2005 531 .01 51636—dc22 2005001534 Typeset by Newgen Imaging Systems (P) Ltd., Chennai, India Printed in Great Britain on acid-free paper by Biddles Ltd., King’s Lynn, Norfolk ISBN 0–19–856737–5 (Hbk)
978–0–19–856737–0
Preface
I first encountered Heinrich Hertz in the spring of 1988 when Jed Buchwald visited Copenhagen and began the work on what eventually became his book The Creation of Scientific Effects. Heinrich Hertz and Electric Waves. We had many interesting discussions about Hertz’s ideas, in particular his intriguing derivation in 1884 of propagating vector potentials. However, my own working relation with Hertz did not begin until 1990–91, when I visited Jed Buchwald at the University of Toronto. There, inspired by my research on Joseph Liouville, I started a study of the relationship between mechanics and differential geometry, and in that connection I met Hertz as the first physicist who based mechanics on a differential geometry of configuration space. I continued my research on Hertz’s mechanics in 1994 when Jed Buchwald invited me to come to the Dibner Institute for the History of Science and Technology at MIT, of which he had by then become the director. That year was the centenary of the death of Heinrich Hertz and of the posthumous publication of his book Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt. On this occasion several meetings were held. I was fortunate to be invited to the conference entitled ‘Heinrich Hertz: Classical Physicist, Modern Philosopher’ at the University of South Carolina, the proceedings of which were edited by D. Baird, R.I.G. Hughes and A. Nordmann (Baird et al. 1998). The many fine talks greatly inspired me to go on with my research, in particular because I realized that my work was not duplicated by others. In fact, many talks were given on the philosophical introduction to Hertz’s book on mechanics and its impact on later developments, but there was very little focus on the actual technical physical and mathematical content of the book. Since that was precisely what I intended to investigate, I continued my work, and back in Copenhagen I wrote a 93-page manuscript that was printed as a preprint in 1995.
This preprint was based on published sources exclusively. However, before I got around to sending the paper to a journal, I began to study Hertz’s drafts of the book and I realized that they added valuable information to the subject. Therefore I decided to postpone publication until I had read Hertz’s manuscripts in more detail. Due to Hertz’s handwriting and my teaching and administrative duties this turned out to take much longer than I had anticipated, and I really only had a go at it when The Dibner Institute invited me for the second time as a resident scholar in the spring of 1998.
There I had the opportunity to prepare a draft of the first half of the present book. The second half of the book was written during the fall of 2001 and the spring of 2002 when I spent a year at Caltech.
Hertz’s mechanics has been characterized as a very philosophical book and it has been investigated by many philosophers of science. My main concern is not philosophical but historical. Yet it is my hope that philosophers also will find some of the content of this book interesting. It seems to me that our understanding of Hertz’s philosophy of nature, or rather of scientific theories, is enhanced by a closer reading of his treatment of mechanics in its historical context.
The book is conceived as a contribution both to the history of mathematics and to the history of physics. In both subjects our picture of the last half of the nineteenth century has changed dramatically over the last 20–30 years. Traditionally, historians of mathematics thought of the period as one where mathematics emancipated itself from physics and built its own rigorous foundation. Historians of physics considered the period as a messy, uninteresting interlude between Maxwell and Einstein–Bohr: ‘an age of successful scientific orthodoxy, undisturbed by much thought beyond the conventions,’ ‘one of the dullest stages of thought since the time of the First Crusade’ as Alfred North Whitehead wrote.¹ Now historians of mathematics increasingly emphasize the important links that continued to tie the development of mathematics to that of physics, and historians of physics have come to appreciate the period as being interesting and complex in its own right and pointing to the revolutionary developments in the beginning of the twentieth century not only through its paradoxes but also through its approach to physics. ‘It was a time of probing and testing, a time when questions of principle were much considered and often hotly debated’ (Klein 1973, p. 58).
This book is meant as a contribution to these new trends in both disciplines. Mechanics has always been on the interface between mathematics and physics. Before 1800 there was no clear division between mathematicians and physicists, and even when a division emerged during the nineteenth century both camps continued to contribute to mechanics. Mathematicians often tried their newest mathematical ideas such as elliptic functions and Riemannian geometry in this domain, and physicists, who considered mechanics as the fundamental physical discipline, pursued many technical aspects and tried hard to improve the foundation of the science. For Hertz, the establishment of a logically consistent foundation of mechanics was the primary question. His original answer to this problem influenced Hilbert and was thus an important background for the axiomatic movement in nineteenth-century mathematics. Thus a physicist influenced the purest part of mathematics. In addition, Hertz’s use of differential geometry in configuration space situates his book centrally in a long process of geometrization of mechanics, most of which took place inside of mathematics. In this way the book illustrates the willingness among pre-Einsteinian physicists to accept higher-dimensional and non-Euclidean geometries and sheds a new light on the development of tensor analysis.
1 (Whitehead 1925, p. 148), here quoted from (Klein 1973, p. 58).
In the history of physics Hertz’s Mechanics can be seen as the summit of a long mechanical tradition and as the formulation of an ultimately unsuccessful research program for all of physics. It can also be seen as a step in the gradual separation between the simple observable phenomena of physics and the increasingly abstract theoretical and mathematical concepts and deductions used to describe these phenomena. This strong separation was a necessary prerequisite for the creation of modern physics.
As mentioned above, the literature on Hertz and his work has exploded during recent years.² The recent works can roughly be divided into two groups: one dealing with Hertz’s work on electromagnetic waves and another dealing with the philosophical aspects of his mechanics. The only recent book that deals with Hertz’s mechanics from the point of view of its physical content is Hans Paul Breunig’s Inaugural-Dissertation Zur Hertzschen Mechanik. Ursprung ihrer Grundkonzeptionen und deren Bedeutung für die Entwicklung der Physik (Breunich 1988). This is a very fine piece of work, and it complements the present book in some respects.
Many people have been of great help to me during my work on this book. I have already mentioned Jed Buchwald, who has been a strong support and a great source of information on nineteenth-century physics and its historiography. I also wish to thank him and Evelyn Simha for inviting me to the Dibner Institute and the other fellows at the Institute for making the stays so inspiring. Many of the initial ideas of the book were shaped during discussions with my next-door neighbor at the Institute, Ronald Anderson, who was working on the history of electromagnetic potentials. I am also grateful to Jed and Diana Buchwald for the invitation to Caltech, where I spent a very pleasant year, not least due to the great help of Susan Davis.
While at Caltech I had helpful discussions with Evelyn Fox Keller and in particular with Andrea Loettgers and Tilman Sauer. I have also had inspiring discussions on various aspects of Hertz’s philosophy with Gert Grasshoff, Ulrich Majer, Alfred Nordmann, David Hyder, Frederik Christiansen, and Michael Friedman and on variational calculus with Craig Fraser. Skuli Sigurdsson read the preprint version of this book very carefully and I am grateful to him for many observations and much information. Moreover, I wish to thank the Deutsches Museum for providing me with copies of Hertz’s drafts for his book on mechanics and for giving me permission to reproduce a few specimens from these drafts. Likewise I am grateful to the editors of the Archives Internationales d’Histoire des Sciences for allowing me to include a revised version of my paper (Lützen 1999) in the present book. I also wish to thank Dita Andersen and Sheryl Cobb for typing parts of the manuscript and Peter Harremoes for helping me overcome computer problems. Tinne Kjeldsen has assisted me with all aspects of this book ranging from computer problems to stylistic and scientific advice. During the refereeing process I had the good fortune to receive extensive and very useful comments and suggestions from Olivier Darrigol and Jeremy Gray. The editorial process went very smoothly thanks to the efficient staff of the Oxford University Press. Finally, my thanks go to my home institution, the Mathematics Department of the University of Copenhagen, for giving me the opportunity to get away from the daily teaching and administration. Without the sabbaticals the book would never have been written.
JESPER LÜTZEN
Department of Mathematics, University of Copenhagen
Universitetsparken 5, DK-2100 Copenhagen Ø, Denmark
2 See (Baird et al. 1998) for a comprehensive list of papers and books related to Hertz and his work.
Contents
1 Introduction
  1.1 Aim and structure of the book
  1.2 General outline of Hertz’s Mechanics
  1.3 What is new in this book?
  1.4 The publication of Hertz’s book and the preserved drafts
  1.5 Notation, quotes, references, etc.
2 The principles of mechanics before Hertz
  2.1 Principles and laws
  2.2 Foundations of mechanics
  2.3 Basic notions
  2.4 Novel expositions and critical works
3 Mechanization of physics
  3.1 The decline of the mechanistic world view
4 Problematization of the concept of force
  4.1 Forces and atoms
  4.2 The problematization of distance forces. Field theory
  4.3 Rejections of atomism
  4.4 Energetics
5 A biographical survey
  5.1 Childhood and student years (1857–1883)
  5.2 Privat Dozent in Kiel (1883–1885)
  5.3 Professor in Karlsruhe (1885–1889)
  5.4 Professor in Bonn (1889–1894)
6 Hertz’s road to mechanics
  6.1 Hertz’s electromagnetic work as a background for his mechanics
    6.1.1 Axiomatization
    6.1.2 Mechanization
    6.1.3 The elimination of distance forces
  6.2 Research on gravitation
  6.3 Ether
  6.4 An energetic beginning
  6.5 Chronology of the drafts
7 Images of nature
  7.1 A comparison of Hertz’s and Helmholtz’s signs and images
  7.2 Correctness
  7.3 Logical permissibility
  7.4 Appropriateness
    7.4.1 Distinctness
    7.4.2 Simplicity
  7.5 The relation among the criteria
8 Hertz’s earlier ideas about images
  8.1 Images in the Kiel Lectures
  8.2 The parable of the paper money
  8.3 The colorless theory and the gay garment
  8.4 Comparison of the 1894 images with earlier concepts
  8.5 Concepts in the Mechanics related to images
9 Images of mechanics
  9.1 Principles of mechanics
  9.2 The Newtonian–Laplacian image
  9.3 The energetic image
  9.4 Hertz’s image
  9.5 Conclusion of the comparison
10 Kantianism. A-priori and empirical elements of images
  10.1 Scientific representations
  10.2 A Kantian division
  10.3 A metaphysics of corporeal nature
  10.4 Kantianism in the first draft of Hertz’s Mechanics. Existence problems
  10.5 The division between kinematics and dynamics
11 Time, space and mass
  11.1 Space
    11.1.1 Pure geometry
    11.1.2 Applied geometry
  11.2 Time
  11.3 Mass. The constitution of matter
    11.3.1 The constitution of matter. Book one
    11.3.2 The constitution of matter. Book two
12 The line element: The origin of the Massenteilchen
  12.1 Hertz’s line element
  12.2 Vanishing identical material points rejected
  12.3 Massenteilchen appear
  12.4 A matter of space
  12.5 The Massenteilchen shrink and become matter free
  12.6 Conclusion
13 Hertz’s geometry of systems of points
  13.1 Geometrization of mechanics
  13.2 Why the geometric form?
  13.3 Direction, angle, and curvature in the printed book
  13.4 Direction, angle, and curvature in Hertz’s manuscripts
14 Vector quantities and their components
  14.1 Introduction
  14.2 Components and reduced components of a displacement
  14.3 Vector quantities
  14.4 Kinematic concepts
  14.5 Vector quantities in Hertz’s drafts
  14.6 The (reduced) components in Hertz’s drafts
  14.7 The origin of Hertz’s tensor analysis
  14.8 Interaction between physical content and mathematical form
15 Connections. Material systems
  15.1 Connections rather than forces
  15.2 Derivation of the equations of connection
  15.3 Holonomic and non-holonomic systems
16 The fundamental law
17 Free systems
  17.1 Straightest paths
  17.2 Dynamics of free systems
18 Cyclic coordinates
  18.1 Routh and modified Lagrangians
  18.2 Hidden cyclic motion. J.J. Thomson
  18.3 A simple standard example
  18.4 Helmholtz on adiabatic cyclic systems
  18.5 What is new in Hertz’s Mechanics?
  18.6 R. Liouville: One cyclic coordinate suffice
19 Unfree systems. Forces
  19.1 Guided systems
  19.2 Systems acted on by forces
20 Cyclic and conservative systems
  20.1 Adiabatic cyclic systems
  20.2 Conservative systems
  20.3 Hidden non-holonomic connections are not allowed
  20.4 The approximative character of cyclic and conservative systems
21 Integral principles
  21.1 Shortest and geodesic paths
  21.2 Integral principles of mechanics
22 A history of non-holonomic constraints
  22.1 Hölder’s rescue of Hamilton’s principle
  22.2 Repeated independent mistakes, rejections and rescues
23 Hertz on the Hamilton formalism
  23.1 The straightest distance
  23.2 The characteristic and principal functions
24 Mathematicians on the geometrization of the Hamilton–Jacobi formalism
  24.1 Gauss and Hamilton on geodesics, optics and dynamics
  24.2 Liouville and Lipschitz on the principle of least action
  24.3 Trajectories as geodesics
  24.4 Hertz and the mathematicians
25 Hertz on the domain of applicability of his mechanics
  25.1 Practical applications
  25.2 Validity and applicability of the fundamental law
  25.3 Constructability of forces
  25.4 Vitalism, teleology, reductionism, and mechanism in nineteenth-century biology
  25.5 Hertz on living systems
  25.6 ‘Permissible,’ ‘probable’?
  25.7 Applicability and correctness
26 Force-producing models
27 Reception, extension and impact
  27.1 Reception
  27.2 Extensions and applications
  27.3 Impact
28 List of conclusions
Appendix
Bibliography
Index
1 Introduction
1.1 Aim and structure of the book
When Heinrich Hertz’s Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt was published in 1894 it was received as an important contribution not only to mechanics but to physics as a whole. Many physicists and mathematicians did not agree with its agenda but everybody seems to have understood it. Modern readers, on the other hand, are often puzzled by its unfamiliar aim and by its content and methods. It is the purpose of the present book to show how Hertz’s Mechanics fits naturally into the mechanistic world view of the late nineteenth century and to analyse its general architecture as well as its elements. To do so is not easy. Hertz’s book was conceived as an attempt to give a logically consistent deductive presentation of mechanics. In such an ‘axiomatic’ treatise a whole series of theorems or ‘principles’ are presented as consequences of a few definitions and ‘axioms.’ However, for the person who conceived of the axiomatic system it is often the theorems or rather the theory as a whole that determines the definitions and axioms. The overall structure may be guided by philosophical considerations but the elements, the details, are often determined by the way they are supposed to fit into the general structure. Thus, like Euclid’s Elements, Hertz’s Mechanics is a closely knit web in which every little part depends on the rest of the web. For example, in order to fully understand Hertz’s definition of matter and mass in the beginning of the book it is important to know how this definition is used in the definition of the line element of the geometry of configuration space; and in order to appreciate this definition one must know how it is put to use in the expression of energy and in the formulation of the fundamental law of motion.
And in order to appreciate Hertz’s definition of curvature and his study of the straightest path, one must have in mind that isolated systems, according to Hertz’s fundamental law of motion, move along straightest paths.
Ideally, I would have structured my book as an analysis of the historical development that led Hertz from his initial general idea of the book to its final detailed presentation. However, such a grand-scale reconstruction is not possible on the basis of the available manuscripts. Moreover, such a presentation of the material would completely obscure the logical structure of the book that was of primary
importance to Hertz. Conversely, the logical structure would have been highlighted if I had started the book with a detailed summary of the technical, mathematical structure of Hertz’s Mechanics and then proceeded with a critical and historical analysis of its structure and elements (such as the concept of mass). Fearing that I would lose most of my readers half-way through the technical summary, I have decided to structure the book in a third way. I will first set the stage for Hertz’s Mechanics with brief introductions to the history of mechanics prior to Hertz (Chapter 2), to the mechanistic philosophy as it took shape in the nineteenth century (Chapter 3), in particular the widespread uneasiness with forces acting at a distance, the energetic program and the differing views on atomism (Chapter 4). Moreover, I will introduce the person and the physicist Heinrich Hertz and his earlier works (Chapter 5) and, in particular, trace those aspects of his work that shaped his ideas on mechanics (Chapter 6). Then I will proceed to the content of the Principles of Mechanics beginning with the philosophical introduction (Chapter 7). This is important in its own right and also as a background for the rest of the analysis. It tells us that Hertz intended the rest of the book to be an image or picture of the mechanical world and it shows us how Hertz compared his own Mechanics with other images (Chapter 9). His criteria for good images will often be used in the remaining part of this book. I also compare Hertz’s mature image theory to his earlier views on images and theories in physics (Chapter 8) and I point to some obvious Kantian features in Hertz’s image theory in general and in his mechanics in particular (Chapters 10 and 11). After these introductory chapters I will take the reader on a grand voyage through Hertz’s book.
I will generally follow Hertz’s technical deductions, and pause every time we have seen enough of the technical argument to be able to analyse some particular subject. For example, I will analyse Hertz’s concept of mass in two stages: first, together with my analysis of the other basic concepts of time and space, and second, after the introduction of the line element. Some of these critical analytical sections will be extended to longer digressions about the background or subsequent development of specific elements of Hertz’s book. In particular his treatment of the Hamilton formalism will be compared to earlier works by various mathematicians (Chapter 24), and his introduction and treatment of so-called non-holonomic constraints will be contextualized as a part of a long and complex development (Chapter 22). After the grand voyage through Hertz’s Mechanics I will discuss Hertz’s views on its range of applicability (Chapter 25). The book will conclude with some sections on the reception, extensions and influence of Hertz’s book (Chapters 26 and 27).
1.2 General outline of Hertz’s Mechanics
Before embarking on a more detailed investigation of Hertz’s Mechanics it will be useful to give a brief outline of the general structure of the book, to call attention to what can be found in it and, just as importantly, what cannot be found in it, and to give a qualitative idea of the technical issues involved. Such a summary will
allow the reader to keep the aim and direction in mind during the subsequent more detailed investigation, and it will explain why particular subjects are highlighted in the early chapters. I shall therefore provide such a summary here.
Hertz’s Mechanics deals with systems of a finite number of material points (or mass points in common parlance) and can easily be generalized to other systems with a finite number of degrees of freedom. It does not deal with the motion of a single material point, except as a trivial special case, and it does not deal with the motion of continuous media, as in hydrodynamics. In §7 Hertz hinted at how one can, in principle, incorporate continuum mechanics into his image of mechanics through a limiting process but he did not ‘enter into the details required for the analytical treatment of this case’ (Hertz 1894, §7). The most distinguishing feature of Hertz’s Mechanics is that it only operates with three fundamental concepts, namely time, space and mass. The concept of force (or energy) that is assumed as a fourth independent fundamental concept in other contemporary treatises of mechanics is defined a posteriori by Hertz. However, in order to eliminate force as a fundamental concept Hertz had to introduce two kinds of mass: ordinary or tangible (visible) mass and hidden or concealed mass. The first kind of mass is the one we can observe directly with our sensory apparatus. The aim of mechanics is to describe its motion. The concealed mass, on the other hand, is only felt through its interaction with ordinary mass. The material points of the system (visible as well as concealed) are allowed to interact through so-called rigid constraints expressible by first-order homogeneous differential equations in the Cartesian coordinates $x_1, x_2, \ldots, x_{3n}$ of the material points of the system. This includes connections through imaginary rods and rolling but also many other types of constraints.
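The form of such constraints can be illustrated in modern notation. The following is only a sketch and not a quotation of Hertz’s own formulae: the coefficient functions $a_{\iota\nu}$, the number $k$ of constraints and the rod length $\ell$ are symbols assumed here for illustration.

```latex
% k first-order homogeneous (Pfaffian) constraints on the 3n Cartesian coordinates
\sum_{\nu=1}^{3n} a_{\iota\nu}(x_1,\ldots,x_{3n})\,dx_\nu = 0,
\qquad \iota = 1,\ldots,k .

% Example: an imaginary rod of length \ell joining
% point 1 at (x_1,x_2,x_3) to point 2 at (x_4,x_5,x_6)
(x_1-x_4)(dx_1-dx_4) + (x_2-x_5)(dx_2-dx_5) + (x_3-x_6)(dx_3-dx_6) = 0 ,

% which is, up to a factor 2, the differential of the finite equation
(x_1-x_4)^2 + (x_2-x_5)^2 + (x_3-x_6)^2 - \ell^2 = 0 .
```

The rod thus gives a differential constraint that happens to be the differential of a finite equation; a rolling constraint may instead give differential equations for which no such finite equation exists.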
In particular, it is worth stressing that Hertz’s constraints are not restricted to localized or contact actions but also allow material points separated by long distances to interact. It is also worth remarking that Hertz explicitly allowed such differential equations that cannot be integrated. If the equations of constraint can be integrated to give a system of finite algebraic or transcendental equations of the form $F(x_1, x_2, \ldots, x_{3n}) = 0$, Hertz called the system holonomic. If the equations of constraint cannot be integrated in this way, he called the system non-holonomic. Hertz did not invent this distinction but he coined the word ‘holonomic’ and studied non-holonomic constraints more consistently than his precursors. Hertz chose to describe his material systems in terms of a differential geometric structure of configuration space. Its basic concept was the ‘distance’ between two configurations of the system, in particular the distance $ds$ between two infinitesimally close configurations (the line element). Based on this Riemannian metric, Hertz introduced other geometrical notions for the system as a whole, such as angles between two paths of the system and curvature of a path as well as kinematic notions such as velocity, acceleration, and momentum of the system. The latter notions are examples of vector quantities that Hertz introduced generally. He could then formulate his one and only law of motion: ‘Every free system persists in its state of rest or of uniform motion in a straightest path’ (Hertz 1894, §309). Here uniform motion means one with constant speed and a straightest path is the one with the smallest curvature among all
paths obeying the constraints and passing through a given configuration and having a given velocity at that configuration. Since the natural path of a system is the straightest path, Hertz investigated such paths in detail. In particular, he showed that straightest paths are the same as geodesics (locally shortest paths) for holonomic systems, whereas this identification does not hold for non-holonomic systems. Moreover, he introduced a geometric version of the Hamilton–Jacobi formalism for the investigation of geodesics. These parts of Hertz’s book as well as his appropriate but unusual definition of ‘the component along a coordinate’ and his definition of ‘vector quantities with regard to a system’ were similar to ideas that were developed in the mathematics community around the same time. In Hertz’s image of mechanics any unfree (non-isolated) system is a part (a so-called partial system) of a larger free system. The main technical problem is then to investigate the motion of the partial system when it is assumed that the rest of the system is more or less unknown. Hertz did that for two special types of partial systems: a guided system where the rest of the system is supposed to ‘perform a determined and prescribed motion’ (Hertz 1894, §431) and more importantly a ‘system acted on by forces.’ In the latter case only very special constraints are allowed to connect the partial system with the rest of the system. To be more precise, the connections between the two parts of the system must be expressible as the equality of a coordinate of the partial system and a coordinate of the remaining system. In that case Hertz could define the concept of ‘the force exerted by the remaining system on the partial system.’ Its component along a shared coordinate is, by definition, the Lagrange multiplier that the connection gives rise to. Hertz was particularly interested in studying the motion of partial systems for which the remaining system consisted of hidden masses. 
He argued that the remaining system would indeed be hidden from our observation if it is a so-called cyclic adiabatic system. In this case Hertz called the total system conservative. The reason is that he could show, with arguments borrowed from Helmholtz, that components of the force of the hidden system upon the visible system could be derived as the partial derivatives of the kinetic energy of the hidden system. For this reason Hertz defined the ‘potential energy’ of the visible system to be the kinetic energy of the hidden system. In this way Hertz was able to define a posteriori the concepts of force and energy, of which one was normally introduced as a fundamental concept in the usual treatments of mechanics. Force arose in Hertz’s mechanics as a result of connections of the system under consideration with another hidden system, and potential energy was simply the kinetic energy of this hidden system. The philosophically problematic nature of the concepts of force and energy had been overcome. Hertz showed that with his new definition of force and potential energy he could prove the usual principles of mechanics for systems acted on by forces, and in particular for conservative systems. The integral principles, such as the principle of least action and Hamilton’s principle were only applicable to holonomic systems, but this is also the case in the ordinary formulation of mechanics, although this limitation had often been overlooked before Hertz (and was often overlooked after Hertz as well).
In fact, Hertz only approximately obtained the ordinary laws for conservative systems. The approximation gets better the smaller the hidden masses are and the faster they move. In the philosophical introduction to the book Hertz made it clear that the above description of mechanics was meant as a mental image of things in nature. He did not postulate that interactions in ‘real’ mechanical systems necessarily take place the way they were described in his image. Indeed, since we have no experimental knowledge of the inner working of the hidden systems of nature the best we can do is, according to Hertz, to make sure that our image describes the known features of nature exactly. However, he argued that his image of mechanics was the best image available of the mechanical actions of nature. An image of a part of nature must, according to Hertz, be ‘permissible,’ that is logically consistent (though with some Kantian overtones) and ‘correct’ that means, as stated above, that it describes the known features of nature correctly. Among equally permissible and correct images the best image is the most appropriate in the sense that it is the most ‘distinct’ or detailed (excludes most possibilities that do not occur in nature), and, if equally distinct, the best one is the ‘simplest’ that is the one with fewest empty relations or ‘idle wheels.’ This is the sense in which Hertz claimed that his image of mechanics was the best. He realized that we cannot require nature to be simple, but he argued that we should require our image to be as simple as possible. He also stressed that later experimental knowledge of nature may very well require us to change our image.
1.3 What is new in this book?

Hertz’s Mechanics has been studied by scholars from several fields. Philosophers have analysed the philosophical introduction in great numbers, historians of science have, in fewer numbers, studied its physical content, and a few historians of mathematics have emphasized its axiomatic structure. These are indeed important novel features of Hertz’s Mechanics. There is, however, a fourth novelty that has not received the attention it deserves, namely the differential geometric form. It is important because it represents a crucial step in the geometrization of mechanics, and because it is an integrated part of Hertz’s endeavor. Indeed, as I shall argue, some of the physical content of Hertz’s image of nature is influenced in a decisive way by the geometric form. The analysis of the geometric form is one of the novel features that this book intends to contribute to the historiographic literature concerning Hertz’s Mechanics.

A second novelty is that I have based my analysis of Hertz’s mechanics not only on the published text but also on the preliminary manuscripts and drafts left behind by Hertz. They throw an interesting light on the construction and development of some of Hertz’s ideas and concepts.

But the main novelty of the book is the attempt to deal with all aspects of Hertz’s Mechanics be they mathematical, physical or philosophical. Hertz pictured nature as one connected mechanical system. According to this holistic conception it does not make sense to imagine this system to be broken up into disconnected parts
(Hertz 1894, p. 37/31). Similarly, Hertz emphasized that the physical content and the geometric form of his mechanics mutually assist each other. Moreover, it is quite obvious that his theory of images draws ideas from the technical developments of his image, and conversely that the presentation of concepts, the technical machinery, and the book as a whole were shaped by the philosophical considerations of the introduction. I therefore think that my attempt to deal with Hertz’s Mechanics as a whole follows a holistic idea that is inherent in the work itself and in its image of nature. I think this more comprehensive study throws new light on the origin, structure and content of Hertz’s last work.
1.4 The publication of Hertz’s book and the preserved drafts

Hertz worked on his Prinzipien der Mechanik during the last three years of his short life (February 22 1857–January 1 1894). He hardly did any other scientific research during this period, but his work was often interrupted by periods of illness and suffering. Before he died he negotiated the conditions of the publisher’s contract and sent most of the manuscript to the printer. On December 3 1893 he wrote to his parents:

My introduction has been set, the major part of the manuscript goes off today ready for printing, only a small part still requires a final touch. (Hertz 1977, p. 343)

After his death his last assistant Philipp Lenard helped the book through the press. In his preface Lenard wrote about Hertz’s work on the book that

the general features were settled and the greater part of the book written within about a year; the remaining two years were spent in working up the details. At the end of this time the author regarded the first part of the book as quite finished, and the second half as practically finished. (Lenard 1894)

Thus, although the book was posthumously published it is Hertz’s own work through and through. A comparison of the last manuscript and the printed book corroborates Lenard’s statement in his preface to the effect that he only made a few rather inconsequential changes in Hertz’s book. The slight disagreement in the two quotes above as to how much of the book Hertz considered completely finished and how much he wanted to go over once more (half of the book or a small part) is cleared up by Hertz’s last letter to the publisher:

Now there remains about 200 manuscript pages that I still need to copy. (Fölsing 1997, p. 513)
Counting 200 pages backwards from the end of Hertz’s last manuscript of the book gets us to §432 or near the beginning of Chapter 4 of Book 2 ‘Motion of Unfree Systems’ (§429). So §429 to §735 seems to be the part Hertz wanted to go over again.

The present discussion of Hertz’s Mechanics is based on the published book as well as the manuscripts Hertz left behind. The manuscripts preserved in the Deutsches Museum contain three successive versions of the book of increasing length, the third
of which is virtually identical to the printed work. In addition to these drafts there are seven short manuscripts of between 1 and 13 pages dealing with various special issues in his ‘geometry of systems of points.’ Together they make up a ‘zeroth’ draft of the first part of the book. The manuscripts shed light on the gradual development of Hertz’s ideas, in particular as far as the geometric aspects are concerned. However, they do not tell us much about how he worked out his theory of forces, cyclic and conservative systems or of potential energy. A survey of the content of the manuscripts, and an attempt at placing them in a chronological order can be found in the Appendix.
1.5 Notation, quotes, references, etc.

I have made two deviations from Hertz’s mathematical notation. First, I have used the letters $q$ and $p$ to denote generalized coordinates and momenta, respectively. Here I follow the modern convention as well as the convention followed by Hamilton and by Hertz in his first manuscripts on mechanics. In the published Prinzipien der Mechanik, however, Hertz followed Helmholtz who used the reverse convention denoting coordinates by $p$ and momenta by $q$. Secondly, I have replaced Hertz’s German (fraktur) letters by letters superscribed by a tilde (as $\tilde{x}$).

Quotes from Hertz’s Principles of Mechanics are taken from D.E. Jones’s and J.T. Walley’s ‘authorized English translation’ of 1900, except where it is explicitly stated otherwise. Following the convention of (Baird et al. 1998) a reference to the main part of Hertz’s book looks like (Hertz 1894, §200) where the last number is the number of the section in question.¹ The Introduction is not divided into sections, so here a reference looks like (Hertz 1894, p. 37/31). Here the first number refers to the page number in the 1910 edition of the Prinzipien der Mechanik that constitutes the third volume of Hertz’s Gesammelte Werke, and the second number is the page number in the 1956 Dover edition of the English translation. In this English edition the preface is not paginated, so when I refer to the preface I only give the page number of the 1910 German edition.

Throughout the book references to the bibliography are made by author and year in parentheses (. . .). References to Hertz’s manuscripts appear as (Ms 9) where the number indicates the number I have given the manuscript in the Appendix. References to pages and formulas in the present book are also put in parentheses (. . .).

¹ Or simply (§200) if this will cause no confusion.
2 The principles of mechanics before Hertz
From a modern perspective it may seem strange that a first-class physicist like Hertz, who had become world famous for his production of, and experiments with, electromagnetic waves, as well as for his theoretical investigations of this phenomenon, should devote three years of his life to an account of classical mechanics. Compared to electromagnetism, mechanics was an old discipline. It is usually thought to have been established as a paradigm with Newton’s Philosophiae Naturalis Principia Mathematica published in 1687. The following two centuries of normal scientific activity developed it to a high degree of maturity. When we look back on classical mechanics around 1890 we may even get the impression that the field had reached a finished, well-understood, completed state, and that it was dead as a research area. This impression is probably partly a result of our knowledge of the great revolution that was to take place during the first half of the twentieth century. However, to the physicists and mathematicians of the late nineteenth century, who did not know of these great changes to come, mechanics was still an important and active area of research. The developments that had taken place since the time of Newton had been much more innovative than the term ‘normal science’ might suggest. To be sure, the eighteenth- and nineteenth-century discoveries in mechanics were, in a sense, consistent with Newton’s principles, but they went much further than merely fixing parameters or working out consequences. They added many new concepts and principles and even changed the meaning of Newton’s ideas. The period until 1860 saw a steady flow of interesting new theoretical discoveries in mechanics, even in the area of mechanics of systems of finitely many mass points, to which I shall limit this discussion. Moreover, there was a continued discussion about the meaning, nature, and foundations of the principles of mechanics and the concepts entering into them.
These discussions went on till well into the twentieth century, and provide the background and the mechanical context for Hertz’s book. This chapter will present the most important elements of this mechanical context for Hertz’s work. First, I shall summarize the development of mechanics from Newton to Hertz. In particular, I shall introduce the different laws and principles of mechanics discussed by Hertz. I shall not attempt to give a presentation of these principles in their original form, but shall often formulate them in a (mathematical) language close
to the one used by Hertz. In this way it will be possible to refer back to these principles and formulas in later chapters. Secondly, I shall consider the discussions about the foundation of mechanics, in particular those taking place during the second half of the nineteenth century, and the different attempts to give the field a satisfactory structure.¹
2.1 Principles and laws

In 1687 Isaac Newton formulated his famous laws of motion in the Philosophiae Naturalis Principia Mathematica.

Law 1. Every body continues in its state of rest or of uniform motion in a straight line, unless it is compelled to change that state by forces impressed thereon.

Law 2. The alteration of the quantity of motion is ever proportional to the motive force impressed: and it is made in the direction of the straight line in which that force is impressed.

Law 3. To every action there is always opposed an equal reaction – or the mutual actions of the two bodies upon each other are always equal, and directed to contrary parts.

Newton did not take credit for these laws, but attributed them to his predecessors. In particular the first law, that was to become the basis for Hertz’s fundamental law, had been formulated in some sense or another by Galileo Galilei and Christiaan Huygens. In other respects Newton’s mechanics was in opposition to the mechanics of his immediate predecessors, in particular to that of René Descartes. Descartes and Newton disagreed about the status of the laws of motion (Descartes being a rationalist and Newton being mostly an empiricist) and about the possibility of actions at a distance (Descartes trying to reduce everything to impact and Newton accepting his law of gravitation, at least as a mathematical auxiliary (see Chapter 4 for a discussion of actions at a distance)). The latter difference between the two natural philosophers was partly a result of the problems they were each primarily interested in solving and it had an influence on the kind of laws they placed at the basis of mechanics. Where Newton formulated the three laws that deal with forces, Descartes formulated a law of conservation of motion. By motion Descartes meant something like momentum, so for a system of points with masses $m_i$ and velocities $\mathbf{v}_i$ the law would say in modern terms:

Principle of the conservation of momentum (or of the center of mass).
For an isolated system the total momentum is conserved, i.e.

\[ \sum_{i=1}^{3n} m_i \mathbf{v}_i = \text{const.} \tag{2.1} \]

¹ The reader who wants a more detailed and deeper account of the history of mechanics than I can give here can consult (Dugas 1988), (Szabó 1977), (Fraser 1997) and (Voss 1901). For a fuller account of philosophical and foundational issues the reader is referred to the last mentioned paper, as well as (Mach 1883, in particular the 7th edition from 1912), (Duhem 1903) and as far as the French scene is concerned (Chatzis 1995).

This law is often, e.g. by Hertz, referred to as the principle of the center of mass, because the sum in the formula can be interpreted as the momentum of the center of mass of the system. Thus the principle states that the center of mass moves as a free point mass in a straight line with constant velocity. Descartes failed to take the direction of the velocities into account and so his formulation of the principle was in fact flawed, as pointed out by Gottfried Wilhelm Leibniz. Leibniz argued that the important quantity was not the momentum but the living force (as opposed to Newton’s dead force) that is proportional to the mass and to the square of the velocity of the body. It can be measured by the height from which the body must fall in order to obtain its velocity. The live force is therefore what we would call the kinetic energy $T = \frac{1}{2} \sum_{\nu=1}^{3n} m_\nu v_\nu^2$. Here, and in the following, I have used Hertz’s convention to index the rectangular coordinates of the point masses of the system by $\nu$ and to denote by $m_{3k+1} = m_{3k+2} = m_{3k+3}$ the mass of the point mass with the coordinates $(x_{3k+1}, x_{3k+2}, x_{3k+3})$.

During the eighteenth century mechanics was transformed, extended and systematized by a number of continental mathematicians and physicists such as Jacob Bernoulli, Johann Bernoulli, Daniel Bernoulli, Pierre Varignon, Leonhard Euler, Jean le Rond d’Alembert and Joseph Louis Lagrange. Except for Jacob and Johann Bernoulli they all developed Newton’s ideas and, in particular, transformed them into a continental analytical language. Thus Varignon and Euler formulated Newton’s second law as a differential equation

\[ P_\nu = m_\nu \frac{d^2 x_\nu}{dt^2} = m_\nu \ddot{x}_\nu, \tag{2.2} \]

where $P_\nu$ is the force in the direction of the $\nu$-th rectangular coordinate. However, the early eighteenth-century mechanicians also borrowed ideas, such as conservation laws, from Descartes and Leibniz (see (Watkins 1997)), and they added new laws of their own such as the

Principle of the conservation of angular momentum (or the principle of areas). The angular momentum of a free system around a given point (the origin) is conserved:

\[ \mathbf{M} = \sum_{i=1}^{n} m_i \mathbf{x}_i \times \dot{\mathbf{x}}_i = \text{const.} \tag{2.3} \]
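The two conservation laws (2.1) and (2.3) can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes two point masses in a plane joined by a linear spring (an internal, central force), integrates their motion with the velocity Verlet scheme, and compares the total momentum and the angular momentum about the origin before and after. The masses, spring constant, step size and the function name `simulate` are illustrative assumptions.

```python
def simulate(steps=2000, dt=1e-3):
    """Two spring-coupled point masses in the plane (hypothetical test system)."""
    m = [1.0, 3.0]                      # masses m_i (assumed values)
    x = [[0.0, 0.0], [1.5, 0.0]]        # positions x_i
    v = [[0.0, 0.4], [0.2, -0.1]]       # velocities v_i
    k, rest = 5.0, 1.0                  # spring constant and rest length (assumed)

    def forces(pos):
        dx = pos[1][0] - pos[0][0]
        dy = pos[1][1] - pos[0][1]
        r = (dx * dx + dy * dy) ** 0.5
        f = k * (r - rest)              # signed strength along the joining line
        f0 = [f * dx / r, f * dy / r]   # force on point 0
        return [f0, [-f0[0], -f0[1]]]   # equal and opposite force on point 1

    def momentum():
        return [sum(m[i] * v[i][c] for i in range(2)) for c in range(2)]

    def angular_momentum():
        # z-component of sum m_i x_i x v_i about the origin
        return sum(m[i] * (x[i][0] * v[i][1] - x[i][1] * v[i][0]) for i in range(2))

    p_start, M_start = momentum(), angular_momentum()
    f = forces(x)
    for _ in range(steps):              # velocity Verlet: half kick, drift, half kick
        for i in range(2):
            for c in range(2):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]
                x[i][c] += dt * v[i][c]
        f = forces(x)
        for i in range(2):
            for c in range(2):
                v[i][c] += 0.5 * dt * f[i][c] / m[i]
    return p_start, momentum(), M_start, angular_momentum()
```

Under these assumptions both quantities should agree with their initial values to within rounding error, because the two forces are equal, opposite and directed along the line joining the points.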
The principle is often, e.g. by Hertz, called the principle of areas because it states that if one draws radius vectors from the origin to each of the points of the system, then the sum of the projections on the three coordinate planes of the areas described by these radii multiplied by the respective masses increases uniformly in time (see (Truesdell 1968, pp. 239–271) for a critical history of the principle).

Moreover, the eighteenth-century mechanicians became interested in formulating the principles of mechanics for systems that are bound together by constraints or connections that can be described by finite equations in their rectangular coordinates

\[ F_\iota(x_1, x_2, \ldots, x_{3n}) = c_\iota, \qquad \iota = 1, 2, \ldots, i, \tag{2.4} \]

or in any other system of ‘generalized’ coordinates $(q_1, q_2, \ldots, q_r)$ that will fix the position of the mechanical system

\[ F_\kappa(q_1, q_2, \ldots, q_r) = c_\kappa, \qquad \kappa = 1, 2, \ldots, k. \tag{2.5} \]
Constraints are not of importance in Newton’s main field of interest, celestial mechanics, but are present in many engineering-type mechanical systems constructed out of rigid rods, etc. Constraints that can be expressed in terms of finite equations as eqns (2.4) and (2.5) are called holonomic constraints with a terminology introduced by Hertz. Rolling motion, however, presents a kind of constraint that cannot be expressed in that way but only in terms of first-order homogeneous differential equations in the coordinates

\[ \sum_{\nu=1}^{3n} x_{\iota\nu}\, \delta x_\nu = 0, \qquad \iota = 1, 2, \ldots, i, \tag{2.6} \]

or

\[ \sum_{\rho=1}^{r} q_{\kappa\rho}\, \delta q_\rho = 0, \qquad \kappa = 1, 2, \ldots, k, \tag{2.7} \]

that cannot be integrated. We shall return to such non-holonomic constraints in Chapter 18.

In order to determine the equilibrium of a system constrained by holonomic constraints Johann Bernoulli (1717) used the principle of virtual work. In a slightly modernized language it states:

Principle of virtual work. A mechanical system acted on by forces is in equilibrium if the total work done by these forces in any virtual displacement of the system is zero:

\[ \sum_{\rho=1}^{r} P_\rho\, \delta q_\rho = 0. \tag{2.8} \]
Here, $\delta q_\rho$ is an infinitesimal change of the $\rho$-th coordinate in a virtual displacement of the system, that is a displacement that satisfies the constraints (2.5). It need not be a natural motion of the system. Moreover, $P_\rho$ is the component of the force along the $\rho$-th generalized coordinate. The meaning of a component requires a notion of combination and decomposition of forces and the eighteenth-century mechanicians agreed that this had to be done according to the parallelogram method (addition of vectors in a later nineteenth-century terminology) but there were many discussions about how to explain this law of addition (Dhombres and Radelet-de Grave 1991) and (Mach 1883).

The principle of virtual work is a law of statics, but when combined with a principle due to d’Alembert one can make use of it also in dynamics of constrained systems.

D’Alembert’s principle. Consider a mechanical system influenced by forces $P_\nu$ along the $\nu$-th coordinate $x_\nu$. If the system were not constrained its acceleration would be determined by eqn (2.2). However, when the system is constrained by eqn (2.4) or eqn (2.6) it will have other accelerations $\ddot{x}_\nu$ but the forces $P_\nu - m_\nu \ddot{x}_\nu$ will keep the system in equilibrium.

This formulation of the principle is due to Lagrange.² If d’Alembert’s principle is combined with the principle of virtual work we get what Hertz called d’Alembert’s principle.

Lagrange’s version of d’Alembert’s principle:

\[ \sum_{\nu=1}^{3n} (P_\nu - m_\nu \ddot{x}_\nu)\, \delta x_\nu = 0, \tag{2.9} \]
where the $\delta x_\nu$ are the coordinates of an arbitrary virtual displacement and $P_\nu$ is the component in the direction of $x_\nu$ of the force impressed on the system. Both Euler and d’Alembert wrote textbooks in which they tried to unite the different ideas of mechanics in one connected theory but the most ambitious eighteenth-century attempt in this direction was made by Lagrange. In his monumental Mécanique Analytique (1788) he derived all of mechanics from the principle of virtual work, using d’Alembert’s principle as a stepping stone to get from statics to dynamics, as explained above.

Lagrange discovered a convenient way to deal with constraints. In order to determine the coordinates $x_\nu$ or $q_\rho$ of an equilibrium position of a system one can of course use the equations of constraints (2.4) and (2.5) or rather the derived equations (2.6) and (2.7), where $x_{\iota\nu} = \partial F_\iota / \partial x_\nu$ and $q_{\kappa\rho} = \partial F_\kappa / \partial q_\rho$, to eliminate $i$ or $k$ of the variables, after which one can operate with the rest of the variables or an equivalent set of equally many generalized coordinates as independent unconstrained variables.³ However, Lagrange presented another method:

Lagrange multipliers. Multiply each of the derived equations (2.7) by an arbitrary number $Q_\kappa$, a so-called Lagrange multiplier, and add all the left-hand sides to the left-hand side of eqn (2.8). This will lead to the equation

\[ \sum_{\rho=1}^{r} \Bigl( P_\rho + \sum_{\kappa=1}^{k} Q_\kappa q_{\kappa\rho} \Bigr) \delta q_\rho = 0. \tag{2.10} \]
The central idea is that we can now deal with the $\delta q_\rho$ as if they were free variables. This means that eqn (2.10) has to hold for all $\delta q_\rho$ so that we have

\[ P_\rho + \sum_{\kappa=1}^{k} Q_\kappa q_{\kappa\rho} = 0, \qquad \rho = 1, 2, \ldots, r, \tag{2.11} \]

which together with the equations of constraint (2.5) give us $r + k$ equations from which we can determine the $r + k$ variables $q_\rho$ and $Q_\kappa$.

² For a discussion of the original wording and meaning of the principle see (Fraser 1997).

³ Lagrange remarked in passing that the constraints need not be given in integral form as in eqn (2.4) or eqn (2.5) but may just be given in differential form, eqn (2.6) or eqn (2.7), even when these differential equations are not integrable. See Chapter 22 for further discussion of the problems raised in the latter case.

Lagrange did not only limit himself to arguing mathematically for this procedure, he also emphasized its physical significance. From eqns (2.10) and (2.11) we see that $\sum_{\kappa=1}^{k} Q_\kappa q_{\kappa\rho}$ shows up as the $\rho$-th component of a force. It is the component of the reaction force that is created by the constraint limiting the system to move such that $F_\kappa(q_1, q_2, \ldots, q_r) = c_\kappa$. Thus Lagrange concluded that constraints can be replaced by forces, namely the reaction forces mentioned above. If the constraint requires one point of the mechanical system to move on a given surface, then the reaction forces $Q_\kappa q_{\kappa\rho}$ are the components of a force that is orthogonal to the surface. The method can be used also in dynamics to solve eqn (2.9). This will lead to the general equations of motion

\[ m_\nu \ddot{x}_\nu + \sum_{\iota=1}^{i} X_\iota x_{\iota\nu} = P_\nu, \qquad \nu = 1, 2, \ldots, 3n, \tag{2.12} \]
which together with the equations of constraint (2.4) determine the $\ddot{x}_\nu$ and the multipliers $X_\iota$.

Hertz conversely used the method of multipliers to introduce the concept of forces. In his image of mechanics there are a priori no forces, but there are constraints. Forces only emerge as Lagrange multipliers $Q_\kappa$ (Hertz restricted the definition to the case where $q_{\kappa\rho} = 1$, see Chapter 19). Lagrange multipliers can be used to deal with all other similar equations or variational problems with constraints.

Lagrange further showed how one can conveniently deal with systems described by generalized coordinates. Newton’s second law does not apply directly when written down in generalized coordinates, but Lagrange showed that if the kinetic energy $T$ is expressed in free generalized coordinates $q_\rho$ then the equations of motion can be written in the following form:

Lagrange’s equations of motion, version I.

\[ \frac{d}{dt} \frac{\partial T}{\partial \dot{q}_\rho} - \frac{\partial T}{\partial q_\rho} = P_\rho, \qquad \rho = 1, 2, \ldots, r. \tag{2.13} \]

Finally, Lagrange and Pierre Simon de Laplace noticed that the distance forces known at the time (gravitation) were of a form that could be derived as minus the gradient of a scalar force function $U$ (called by Gauss and George Green a potential) of the coordinates of the system

\[ P_\rho = -\frac{\partial U}{\partial q_\rho}. \tag{2.14} \]
If this expression for $P_\rho$ is inserted into eqn (2.13) and we use that $U$ does not depend on $\dot{q}_\rho$ we get

Lagrange’s equations of motion, version II.

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_\rho} - \frac{\partial L}{\partial q_\rho} = 0, \qquad \rho = 1, 2, \ldots, r, \tag{2.15} \]

where

\[ L = T - U \tag{2.16} \]
is the so-called Lagrangian of the system. Equations (2.15) are the general equations of motion expressed in arbitrary free coordinates (i.e. non-constrained coordinates) for a system where the forces are given by eqn (2.14). Such a system is called conservative because $T + U$ is a constant during the motion. This conservation law had been formulated in special cases by Leibniz and several eighteenth-century mechanicians under the name ‘the principle of live force.’ In its general form this principle is formulated in terms of the concept of work, that was introduced during the beginning of the nineteenth century by several French engineering scientists. For a system of forces acting on a system as above, the work done when the system moves from an initial configuration 0 to a final configuration 1 is defined to be $\int_0^1 \sum_\rho P_\rho\, dq_\rho$.

Principle of live force. The change of the kinetic energy (live force) of a system is equal to the work performed by the forces acting on the system during the motion:

\[ T_1 - T_0 = \int_0^1 \sum_{\rho=1}^{r} P_\rho\, dq_\rho. \tag{2.17} \]

This theorem is a consequence of the equations of motion, and in the case where the forces are given by way of a force function the conservation of $T + U$ follows directly. This conservation law obtained its central physical importance when Hermann von Helmholtz interpreted the potential $U$ as a form of energy,⁴ the potential energy, and thus formulated the

Principle of conservation of energy. The total energy, i.e. the sum of the kinetic and the potential energy of a system, is a constant:

\[ T + U = \text{const.} = h. \tag{2.18} \]
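Equations (2.14), (2.15), (2.17) and (2.18) can be illustrated on a single generalized coordinate. The sketch below is an assumption-laden example rather than anything found in the source: it takes a plane pendulum with angle $q$, kinetic energy $T = \frac{1}{2} m \ell^2 \dot{q}^2$ and force function $U = -m g \ell \cos q$, for which eqn (2.15) reduces to $m \ell^2 \ddot{q} = -m g \ell \sin q$. It integrates this equation with velocity Verlet, tracks the largest deviation of $T + U$ from its initial value, and accumulates a trapezoidal estimate of the work integral in eqn (2.17). The function name `pendulum` and all parameter values are hypothetical.

```python
import math

def pendulum(q0=1.0, steps=5000, dt=1e-3, m=1.0, length=1.0, g=9.81):
    """Integrate m l^2 q'' = -m g l sin(q), i.e. eqn (2.15) for this choice of L = T - U."""
    inertia = m * length * length

    def T(qd):                      # kinetic energy
        return 0.5 * inertia * qd * qd

    def U(q):                       # force function (potential energy)
        return -m * g * length * math.cos(q)

    def P(q):                       # generalized force P = -dU/dq, eqn (2.14)
        return -m * g * length * math.sin(q)

    q, qd = q0, 0.0                 # start at rest at angle q0
    h0, T0 = T(qd) + U(q), T(qd)
    work, drift = 0.0, 0.0
    for _ in range(steps):          # velocity Verlet steps
        p_old = P(q)
        qd_half = qd + 0.5 * dt * p_old / inertia
        q_new = q + dt * qd_half
        p_new = P(q_new)
        qd = qd_half + 0.5 * dt * p_new / inertia
        work += 0.5 * (p_old + p_new) * (q_new - q)   # trapezoidal estimate of the work
        q = q_new
        drift = max(drift, abs(T(qd) + U(q) - h0))
    return drift, T(qd) - T0, work
```

With these assumed values the returned energy drift should be small compared with the energies involved, and the change in kinetic energy should closely match the accumulated work; both agreements are approximate, limited by the step size.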
For Helmholtz (Helmholtz 1847) this principle was not limited to mechanical systems with gravitational forces. He argued that all natural phenomena including thermodynamics and electromagnetic phenomena are explainable by material points acting on each other by central forces and concluded that the conservation of energy was a general principle applicable to any isolated system.

⁴ Helmholtz only went part of the way. An entirely independent concept of energy was developed in the 1850s and 1860s by William Thomson and Rudolf Clausius.

Already before this conceptually important reinterpretation of the principle of live force had taken place Carl Friedrich Gauss (Gauss 1829) had formulated his principle of least constraint. Intuitively it says that a constrained system moves in such a way that at any instant it deviates as little as possible from the free motion of the system, i.e. the motion it would have had if it were not constrained. More precisely, Gauss defined the constraint of a system during a differential time interval as the sum of the squares of the deviation of each point of the system from its free motion multiplied by its mass, and showed that this constraint is minimized by the motion of the system. Analytically the principle says:

Gauss’s principle of least constraint. The natural motion of a constrained system minimizes the constraint

\[ Z = \sum_{\nu=1}^{3n} \frac{1}{m_\nu} (P_\nu - m_\nu \ddot{x}_\nu)^2. \tag{2.19} \]
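A small numerical experiment can make eqn (2.19) concrete. The sketch below is an illustration under stated assumptions, not a reconstruction of Gauss’s argument: it takes two point masses (six rectangular coordinates) under gravity and a single linear condition $\sum_\nu c_\nu \ddot{x}_\nu = 0$ on the accelerations, with arbitrarily chosen coefficients $c_\nu$ standing in for the $x_{\iota\nu}$ of eqn (2.6). The minimizing accelerations are computed in closed form with one multiplier, which reproduces the shape of eqn (2.12), and $Z$ is then compared with its value at randomly perturbed accelerations that still satisfy the condition. All names and numbers are hypothetical.

```python
import random

def gauss_acceleration(m, P, c):
    """Minimize Z = sum (P_v - m_v a_v)^2 / m_v subject to sum c_v a_v = 0."""
    lam = (sum(cv * Pv / mv for cv, Pv, mv in zip(c, P, m))
           / sum(cv * cv / mv for cv, mv in zip(c, m)))
    # m_v a_v + lam c_v = P_v, the one-constraint form of eqn (2.12)
    return [(Pv - lam * cv) / mv for Pv, cv, mv in zip(P, c, m)], lam

def constraint_Z(m, P, a):
    """Gauss's constraint, eqn (2.19), for accelerations a."""
    return sum((Pv - mv * av) ** 2 / mv for mv, Pv, av in zip(m, P, a))

def check(trials=200, seed=0):
    m = [2.0, 2.0, 2.0, 5.0, 5.0, 5.0]                     # m_{3k+1} = m_{3k+2} = m_{3k+3}
    P = [0.0, -2.0 * 9.81, 0.0, 0.0, -5.0 * 9.81, 0.0]     # gravity on both points
    c = [1.0, 2.0, 0.5, -1.0, 1.0, -0.5]                   # assumed constraint coefficients
    a, lam = gauss_acceleration(m, P, c)
    z_min = constraint_Z(m, P, a)
    rng = random.Random(seed)
    cc = sum(cv * cv for cv in c)
    worse = 0
    for _ in range(trials):
        r = [rng.uniform(-1.0, 1.0) for _ in c]
        s = sum(rv * cv for rv, cv in zip(r, c)) / cc
        d = [rv - s * cv for rv, cv in zip(r, c)]          # admissible: sum c_v d_v = 0
        if constraint_Z(m, P, [av + dv for av, dv in zip(a, d)]) > z_min:
            worse += 1
    residual = sum(cv * av for cv, av in zip(c, a))        # should vanish up to rounding
    return worse, trials, residual
```

In this setting every admissible perturbation $d_\nu$ raises $Z$ by $\sum_\nu m_\nu d_\nu^2$, so all trials should report a larger constraint than the multiplier solution.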
Gauss showed that his principle was a consequence of d’Alembert’s principle, but he thought that his new principle might be advantageous for the solution of some mechanical problems. Moreover, he emphasized that it was interesting to note that a constrained system modifies its motion just as a cunning astronomer modifies his observational data, namely according to the method of least squares. This remark obviously had a special appeal to Gauss, who was after all the discoverer (or one of the discoverers) of this method. His principle is of particular importance for our history because Hertz chose a geometrized version of Gauss’s principle as his fundamental law of motion. Gauss’s principle is a variational principle. It asks for the minimization of a certain quantity. Another variational principle, the principle of least action, had gained prominence in mechanics three quarters of a century earlier. In contrast to Gauss’s principle, which is a differential principle, the principle of least action is an integral principle. It states that a mechanical system moves from one configuration 0 to another one 1 along the path that minimizes the total ‘action’ among all paths from 0 to 1. Here the ‘action’ of a point is equal to $\int_0^1 mv\,ds$, so for a system the principle states: Principle of least action. The natural motion of a mechanical system minimizes the action
$$A = \int_0^1 \sum_{\nu=1}^{3n} m_\nu \dot{x}_\nu\,dx_\nu = \int_0^1 \sum_{\nu=1}^{3n} m_\nu \left(\frac{dx_\nu}{dt}\right)^2 dt = \int_0^1 2T\,dt \tag{2.20}$$
among all motions that take the system from the configuration 0 to the configuration 1. This principle, which is a generalization of a principle used by Fermat in optics, was formulated in 1744 by Pierre Louis Moreau Maupertuis (most generally) and Euler (most precisely), and immediately gave rise to controversy about priority, metaphysics, and meaning. Here, I shall only discuss the last point. The problem is at least twofold. First, as was quickly pointed out by Lagrange, the action integral need not in fact be a minimum. It can also be a maximum. The correct formulation of the principle is therefore that the variation of the action integral must vanish
$$\delta A = \delta \int_0^1 \sum_{\nu=1}^{3n} m_\nu \dot{x}_\nu\,dx_\nu = \delta \int_0^1 2T\,dt = 0. \tag{2.21}$$
Moreover, it is not so obvious what kind of variations one must make. The eighteenth-century writers were not entirely clear about this, but Carl Gustav Jacob Jacobi (Jacobi 1866) made it clear that when varying the path one must assume that the system traverses it with a constant total energy h = T + U. This requirement fixes the time (in particular the time when the system will reach the end configuration 1) once we have fixed the path. Jacobi used this observation to remove the time variable entirely from the action integral. He determined dt from the energy equation
$$h = \frac{1}{2}\sum_{\nu=1}^{3n} m_\nu \left(\frac{dx_\nu}{dt}\right)^2 + U \tag{2.22}$$
and inserted the value into the expression (2.20) and arrived at the following expression for the action
$$A = \int_0^1 \sqrt{2(h-U)}\,\sqrt{\sum_{\nu=1}^{3n} m_\nu (dx_\nu)^2}. \tag{2.23}$$
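The elimination of the time can be written out in two lines. The following block is a sketch of the intermediate step, reconstructed from (2.20) and (2.22) rather than quoted from Jacobi. It uses only the symbols already introduced, together with the relation 2T = 2(h − U) that follows from h = T + U.

```latex
\begin{align*}
2(h-U) &= \sum_{\nu=1}^{3n} m_\nu \left(\frac{dx_\nu}{dt}\right)^2
  \quad\Longrightarrow\quad
  dt = \sqrt{\frac{\sum_{\nu=1}^{3n} m_\nu\,(dx_\nu)^2}{2(h-U)}}, \\
A &= \int_0^1 2T\,dt = \int_0^1 2(h-U)\,dt
   = \int_0^1 \sqrt{2(h-U)}\,\sqrt{\sum_{\nu=1}^{3n} m_\nu\,(dx_\nu)^2}.
\end{align*}
```

The integrand in the last expression depends only on the path in configuration space and on the constant h, and no longer on the way in which the path is traversed in time.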
Jacobi’s version of the principle of least action says that the natural path of a mechanical system minimizes this integral among all paths between the configurations 0 and 1, or rather makes its variation vanish. This is in a certain sense a geometrization of the principle, because it only concerns the geometric properties of the motion. In order to determine how the system traverses its path one must use the principle of energy conservation. Thus the principle of energy conservation is entirely independent of Jacobi’s version of the principle of least action. Jacobi’s geometrization of the principle of least action was carried further by Joseph Liouville, Rudolf Lipschitz and Gaston Darboux (see Chapter 24). The action integral was also central for Hamilton’s and Jacobi’s formalism. Inspired by his work in optics (see Chapter 24) William Rowan Hamilton considered the action integral, but instead of fixing the end configurations and varying the path he considered the integral
$$V(h, q_1, q_2, \ldots, q_r, q_1^0, q_2^0, \ldots, q_r^0) = \int_B^C 2T\,dt \tag{2.24}$$
along the trajectory of the system and considered it as a function of the total energy h, the coordinates $q_1, q_2, \ldots, q_r$ of the final configuration C and the coordinates $q_1^0, q_2^0, \ldots, q_r^0$ of the initial configuration B. Hamilton called V the characteristic function. When it is expressed in rectangular coordinates, differentiation with respect to the initial and final coordinates yields
$$\frac{\partial V}{\partial x_\nu} = m_\nu \dot{x}_\nu, \qquad \nu = 1, 2, \ldots, 3n \tag{2.25}$$
$$\frac{\partial V}{\partial x_\nu^0} = -m_\nu \dot{x}_\nu^0, \qquad \nu = 1, 2, \ldots, 3n \tag{2.26}$$
$$\frac{\partial V}{\partial h} = t, \tag{2.27}$$
which, inserted into the equation of energy (2.18), give
$$\sum_{\nu=1}^{3n} \frac{1}{2m_\nu}\left(\frac{\partial V}{\partial x_\nu}\right)^2 + U = h \tag{2.28}$$
$$\sum_{\nu=1}^{3n} \frac{1}{2m_\nu}\left(\frac{\partial V}{\partial x_\nu^0}\right)^2 + U_0 = h. \tag{2.29}$$
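A simple special case may make these relations concrete. The following worked example is an illustration added here, not one taken from Hamilton: it assumes a single free particle of mass m on a line (so U = 0 and 3n is replaced by a single coordinate x), moving from $x^0$ to $x > x^0$ with energy h. The constant speed is $\sqrt{2h/m}$, so 2T = 2h along the trajectory.

```latex
\begin{align*}
V(h, x, x^0) &= \int_B^C 2T\,dt = 2h\,\frac{x - x^0}{\sqrt{2h/m}} = \sqrt{2mh}\,(x - x^0), \\
\frac{\partial V}{\partial x} &= \sqrt{2mh} = m\dot{x}, \qquad
\frac{\partial V}{\partial x^0} = -\sqrt{2mh} = -m\dot{x}^0, \\
\frac{\partial V}{\partial h} &= \sqrt{\frac{m}{2h}}\,(x - x^0) = t
  \quad\Longrightarrow\quad x = x^0 + \sqrt{\frac{2h}{m}}\,t, \\
\frac{1}{2m}\left(\frac{\partial V}{\partial x}\right)^2 &= h .
\end{align*}
```

The first two lines reproduce (2.25) and (2.26), the third shows how the finite equation (2.27) yields the motion, and the last is (2.28) with U = 0.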
Hamilton showed conversely that if V is determined as a solution of these two first-order non-linear partial differential equations one can determine the motion of the system from the finite⁵ equations (2.25)–(2.27). If the last equation is used to eliminate h, eqns (2.26) will determine the $x_\nu$s as a function of time and the initial values $x_\nu^0$ and $\dot{x}_\nu^0$ (ν = 1, 2, . . . , 3n). Equations (2.25) will give intermediate integrals for $\dot{x}_\nu$. Hamilton explained his theory of the characteristic function in his First Essay on a General Method in Dynamics (Hamilton 1834). At the end of the essay he introduced another function, the so-called principal function P, whose properties he investigated in the Second Essay (Hamilton 1835). Just as the characteristic function is defined in terms of the action integral, the principal function is defined in terms of the integral that appears in another variational principle, namely Hamilton’s principle. Among all the possible motions that take the system from a given initial configuration to a given final configuration in a given time t the natural motion of the system is the one that minimizes the integral
$$\int_0^t L\,dt = \int_0^t (T - U)\,dt, \tag{2.30}$$
or rather makes the variation of this integral vanish. Although this principle is named after Hamilton it was known before his time (Voss 1901, no. 42, note 243). Lagrange’s equations (2.15) are the Euler–Lagrange equations corresponding to this variational principle. We also remark that energy conservation is not presupposed in this principle. On the contrary, energy conservation is a consequence of Hamilton’s principle. Hamilton defined the principal function as the integral
$$S(t, q_1, q_2, \ldots, q_r, q_1^0, q_2^0, \ldots, q_r^0) = \int_0^t L\,dt = \int_0^t (T - U)\,dt \tag{2.31}$$
along a trajectory of the system, and considered it as a function of time and the coordinates of the initial and final configurations. He showed that it is the Legendre transform of V
$$S = V - th \tag{2.32}$$
and that it satisfies the partial differential equations
$$\frac{\partial S}{\partial t} + \sum_{\nu=1}^{3n} \frac{1}{2m_\nu}\left(\frac{\partial S}{\partial x_\nu}\right)^2 + U = 0 \tag{2.33}$$
$$\frac{\partial S}{\partial t_0} + \sum_{\nu=1}^{3n} \frac{1}{2m_\nu}\left(\frac{\partial S}{\partial x_\nu^0}\right)^2 + U_0 = 0. \tag{2.34}$$
5 I shall use the term finite equation to denote an algebraic or transcendental equation, as opposed to a differential equation.
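Moreover, once a solution to these equations has been found one can determine the motion of the system from the finite equations
$$\frac{\partial S}{\partial x_\nu} = m_\nu \dot{x}_\nu, \qquad \nu = 1, 2, \ldots, 3n \tag{2.35}$$
$$\frac{\partial S}{\partial x_\nu^0} = -m_\nu \dot{x}_\nu^0, \qquad \nu = 1, 2, \ldots, 3n \tag{2.36}$$
$$\frac{\partial S}{\partial t} = -h. \tag{2.37}$$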
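The variational statement behind the principal function, namely that the integral (2.30) has vanishing first variation along the natural motion, can be illustrated numerically. The sketch below is an added illustration under stated assumptions, not a construction found in Hamilton: it takes a single particle in uniform gravity with potential energy U = m g x, discretizes the integral of L = T − U by the midpoint rule, and perturbs the free-fall trajectory by eps·sin(πt/t_end), which leaves the end configurations fixed. All names and numerical values are hypothetical choices.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized integral of L = T - U with U = m*g*x (midpoint rule)."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        v = (x1 - x0) / dt                     # velocity on the sub-interval
        kinetic = 0.5 * m * v * v              # T
        potential = m * g * 0.5 * (x0 + x1)    # U evaluated at the midpoint
        total += (kinetic - potential) * dt
    return total

def varied_path(eps, n=200, t_end=1.0, v0=3.0, g=9.81):
    """Free-fall trajectory plus eps*sin(pi*t/t_end); end points stay fixed."""
    dt = t_end / n
    path = []
    for i in range(n + 1):
        t = i * dt
        natural = v0 * t - 0.5 * g * t * t     # solution of x'' = -g with x(0) = 0
        path.append(natural + eps * math.sin(math.pi * t / t_end))
    return path, dt

def action_of(eps):
    path, dt = varied_path(eps)
    return action(path, dt)

eps = 1e-3
# Central difference: estimates the first-order change of the action at eps = 0.
first_order = (action_of(eps) - action_of(-eps)) / (2 * eps)
# Second difference: sign tells whether the natural motion is a minimum here.
second_order = action_of(eps) + action_of(-eps) - 2 * action_of(0.0)
print(first_order, second_order)
```

In this particular example the first-order term vanishes up to rounding while the second-order term is positive, so the natural motion is a genuine minimum. In general only the vanishing of the first variation is guaranteed, as the text stresses.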
In the second essay Hamilton also derived the so-called canonical equations of motion. Lagrange had, in 1809, introduced the generalized momentum $p_\rho$ conjugate to a generalized coordinate $q_\rho$ of the system. It is defined by
$$p_\rho = \frac{\partial T}{\partial \dot{q}_\rho} = \frac{\partial L}{\partial \dot{q}_\rho}. \tag{2.38}$$
Subsequently Siméon Denis Poisson (Poisson 1809) remarked that in terms of the generalized momenta Lagrange’s equation of motion, eqn (2.15), can be written
$$\frac{\partial p_\rho}{\partial t} = \frac{\partial L}{\partial q_\rho}. \tag{2.39}$$
The two equations (2.38) and (2.39) are what Hertz called Poisson’s form of the equations of motion. Hamilton went further by defining the so-called Hamiltonian H, that is, the total energy
$$H(p_1, p_2, \ldots, p_r, q_1, q_2, \ldots, q_r) = T + U \tag{2.40}$$
considered as a function of the generalized coordinates $q_\rho$ and the conjugate generalized momenta $p_\rho$. In terms of this function he could formulate Hamilton’s equations of motion (canonical form)
$$\dot{q}_\rho = \frac{\partial_p H}{\partial p_\rho}, \qquad \rho = 1, 2, \ldots, r \tag{2.41}$$
$$\dot{p}_\rho = -\frac{\partial_p H}{\partial q_\rho}, \qquad \rho = 1, 2, \ldots, r. \tag{2.42}$$
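To see the canonical equations at work, one can integrate them numerically for a simple system. The following is a minimal sketch under explicit assumptions: a one-dimensional oscillator with H = p²/(2m) + k q²/2 is chosen purely for illustration, the function names are hypothetical, and the leapfrog (kick-drift-kick) scheme is a modern numerical device that has nothing to do with Hamilton’s own methods. The partial derivatives of H are taken with q and p as the independent variables, as in (2.41) and (2.42).

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H = T + U for a one-dimensional oscillator, U = k*q^2/2 (assumed example)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def dH_dp(q, p, m=1.0, k=1.0):
    return p / m            # right-hand side of (2.41)

def dH_dq(q, p, m=1.0, k=1.0):
    return k * q            # enters (2.42) with a minus sign

def integrate(q, p, dt=1e-3, steps=10000):
    """Leapfrog (kick-drift-kick) integration of Hamilton's equations."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step of p_dot = -dH/dq
        q += dt * dH_dp(q, p)         # full step of q_dot = dH/dp
        p -= 0.5 * dt * dH_dq(q, p)   # second half step for p
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = integrate(q0, p0)
# For m = k = 1 the exact solution is q = cos(t), so q1 can be compared with cos(10).
print(q1, math.cos(10.0), hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

The last printed number is the drift in H over the run. It stays small because H = T + U is conserved by the exact equations and the leapfrog scheme respects this closely.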
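In terms of the Hamiltonian one can also write the partial differential equations (2.28), (2.29) and (2.33), (2.34) for the characteristic and principal functions in generalized coordinates:
$$H\left(\frac{\partial V}{\partial q_1}, \frac{\partial V}{\partial q_2}, \ldots, \frac{\partial V}{\partial q_r}, q_1, q_2, \ldots, q_r\right) = h \tag{2.43}$$
$$H\left(-\frac{\partial V}{\partial q_1^0}, -\frac{\partial V}{\partial q_2^0}, \ldots, -\frac{\partial V}{\partial q_r^0}, q_1^0, q_2^0, \ldots, q_r^0\right) = h \tag{2.44}$$
and
$$\frac{\partial S}{\partial t} + H\left(\frac{\partial S}{\partial q_1}, \frac{\partial S}{\partial q_2}, \ldots, \frac{\partial S}{\partial q_r}, q_1, q_2, \ldots, q_r\right) = 0 \tag{2.45}$$
$$\frac{\partial S}{\partial t_0} + H\left(-\frac{\partial S}{\partial q_1^0}, -\frac{\partial S}{\partial q_2^0}, \ldots, -\frac{\partial S}{\partial q_r^0}, q_1^0, q_2^0, \ldots, q_r^0\right) = 0, \tag{2.46}$$
and the integrals of Hamilton’s equations (2.25)–(2.27) and (2.35)–(2.37) can be written
$$\frac{\partial V}{\partial q_\rho} = p_\rho, \qquad \rho = 1, 2, \ldots, r \tag{2.47}$$
$$\frac{\partial V}{\partial q_\rho^0} = -p_\rho^0, \qquad \rho = 1, 2, \ldots, r \tag{2.48}$$
$$\frac{\partial V}{\partial h} = t, \tag{2.49}$$
and
$$\frac{\partial S}{\partial q_\rho} = p_\rho, \qquad \rho = 1, 2, \ldots, r \tag{2.50}$$
$$\frac{\partial S}{\partial q_\rho^0} = -p_\rho^0, \qquad \rho = 1, 2, \ldots, r \tag{2.51}$$
$$\frac{\partial S}{\partial t} = -H, \tag{2.52}$$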
respectively. In 1837 Jacobi developed Hamilton’s methods further. Where Hamilton had assumed V or S to be solutions of two partial differential equations, Jacobi pointed out that it is enough to consider one of the equations. For example, he showed that if $S(t, q_1, q_2, \ldots, q_r, \alpha_1, \alpha_2, \ldots, \alpha_r)$ is a complete solution of eqn (2.45), which contains r arbitrary constants $\alpha_1, \alpha_2, \ldots, \alpha_r$ in addition to the trivial additive constant, then the equations
$$\frac{\partial S}{\partial q_\rho} = p_\rho, \qquad \rho = 1, 2, \ldots, r \tag{2.53}$$
$$\frac{\partial S}{\partial \alpha_\rho} = \beta_\rho, \qquad \rho = 1, 2, \ldots, r \tag{2.54}$$
$$\frac{\partial S}{\partial t} = -H, \tag{2.55}$$
where $\beta_1, \beta_2, \ldots, \beta_r$ are r other arbitrary constants, form a complete set of integrals to Hamilton’s equations of motion (2.41) and (2.42). A similar remark holds with respect to the characteristic function. Moreover, Jacobi emphasized that Hamilton’s formalism can be used also when the Hamiltonian depends explicitly on time.⁶
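Jacobi’s observation can be illustrated on the simplest possible system. The worked example below is an added sketch, not one taken from Jacobi’s text: it assumes a single free particle of mass m with one coordinate q, so that H = p²/(2m) and eqn (2.45) contains only the kinetic term. The particular complete solution with the single constant α is a standard textbook choice.

```latex
\begin{align*}
&\frac{\partial S}{\partial t} + \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2 = 0,
\qquad S(t, q, \alpha) = \alpha q - \frac{\alpha^2}{2m}\,t, \\
&\frac{\partial S}{\partial q} = \alpha = p, \qquad
\frac{\partial S}{\partial \alpha} = q - \frac{\alpha}{m}\,t = \beta
\quad\Longrightarrow\quad q = \beta + \frac{\alpha}{m}\,t, \\
&\frac{\partial S}{\partial t} = -\frac{\alpha^2}{2m} = -H .
\end{align*}
```

The two constants α and β play the roles of the conserved momentum and the initial position, so the finite equations (2.53)–(2.55) indeed deliver the complete motion without any further integration.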
2.2 Foundations of mechanics
From the time of Galilei and Newton to that of Hertz and beyond, the technical development of the principles of mechanics went hand in hand with a more critical and philosophical discussion about foundational and methodological issues. What is the meaning of the principles, how are they connected, what is the epistemological status of each principle, of the fundamental concepts of space, time, mass, force, and energy, and of the edifice of mechanics as a whole, and what is the status of mechanics in the overall understanding of the world? These questions were taken up by philosophers as well as by physicists and mathematicians. This discussion was shaped by the general tendencies in epistemology. Thus from the start, there were disagreements about whether the principles of mechanics were of an empirical origin or could be deduced by purely rational reflection. Newton was an empiricist whereas Descartes was a rationalist. Both standpoints were represented among the mechanicians of the eighteenth century, sometimes in various mixes.⁷ Particularly influential was the combination of the two standpoints that Immanuel Kant developed in his critical philosophy (Kant 1781; Kant 1786). Kant agreed with the empiricists that we receive our knowledge of the world through our senses but he argued that our experiences are formed by a-priori forms of intuition and concepts. Space and time are such a-priori intuitions; they are described by geometry and arithmetic, respectively, and these two sciences are constructed a-priori in intuition. Also, the laws of mechanics including the law of universal gravitation are, according to Kant, a-priori in the sense that they ‘are viewed as necessary conditions of the possibility of an objective notion of true motion’ (Friedman 1992, p. 174). Theological arguments also entered the picture.
Many seventeenth- and eighteenth-century mechanicians regarded the simple laws of mechanics as evidence of God’s perfect plan of the universe. Others went further and suggested theological arguments for the laws. In particular the variational principles appealed to such arguments. Maupertuis (and others) argued that the principle of least action held true because God (or nature) did not do anything unnecessary. Integral principles such as the principle of least action lend themselves particularly well to such theological arguments because they also implicitly suggest final causes. They determine the motion that a system must follow in order to get from an initial configuration to a final configuration, and thus imply the idea that the system wants to reach this final configuration or that God has planned it. The differential principles do not imply such final causes, but may still suggest a rational plan behind the appearances. For example, although Gauss himself voiced empiricist convictions (even concerning geometry), his remark that the principle of least constraint shows that mechanical systems handle constraints in the same way as a clever astronomer handles observational errors could suggest such a plan.
6 For an informative and clear summary and analysis of the development of the Hamilton–Jacobi formalism see (Fraser and Nakane 2002).
7 See, e.g. (Watkins 1997).
Starting with Newton, mechanics was often presented as a deductive science in the Euclidean style, starting with a few basic principles from which everything else is deduced. Even Lagrange’s analytical mechanics is built up according to this scheme. It is analytical in the sense that it uses mathematical analysis rather than synthetic geometry but it is synthetic in the sense that it is deductive in nature. As stressed by Lakatos, this means that truth is spread downward from the basic principles to the rest of the system. This Euclidean structure should not be confused with the axiomatic method that David Hilbert began to promote starting with his 1899 Grundlagen der Geometrie.
In such a modern axiomatic system one starts with a number of undefined notions; in mechanics it could be time, distance, mass, and perhaps force, together with a number of axioms relating them. The axioms are in principle arbitrary, but historically they are chosen so as to correspond to known rules for the behavior of known mechanical systems. The theorems of mechanics are the logical consequences of the axioms (logic itself being considered as conventionally chosen rules of inference). Truth is not an issue here. Only if we want to apply the axiomatic system to a given mechanical system in nature can we ask if the axiomatic system truly represents it, in the sense that axioms and theorems of the axiomatic system correspond to the real behavior of the system. In order to decide this question one needs coordinating rules that explain how one measures the basic quantities of the system such as time, distance and mass. Nineteenth-century mathematicians and mechanicians had a different view of the axiomatic, deductive method. From their point of view the basic notions were not undefined but needed a precise definition (of course this requirement had to fail at some level), and the axioms (laws, principles) were not arbitrary conventions but truths about nature. They could be considered as self-evident (as Euclid’s axioms), deducible from more basic philosophic or theological principles (as the rationalist had it) or they could be a result of empirical investigation, but they were true. The only deviation from this Euclidean scheme was presented by the empiricists who often would consider the principles or axioms as true in the sense that they were supported by experience up till now; different or more accurate future experience might necessitate revisions. Hertz seems to have been of this opinion.
2.3 Basic notions
Before we enter into a discussion of the definitions of the basic notions of mechanics we must mention the problem of terminology. This may seem a trivial issue, but in fact, the lack of a precise terminology was a major problem in the early days of classical mechanics. Basically, mechanicians agreed about the meaning of words such as time, distance, velocity, and even mass, but words like force, energy, power, work, action, quantity of motion, and pressure were used in various senses and often interchangeably. For example, kinetic energy was, following Leibniz, considered as live force, and even as late as the mid-nineteenth century when Helmholtz wrote his famous paper on the conservation of energy, he entitled it ‘Über die Erhaltung der Kraft.’ However, toward the end of the nineteenth century the terminology was essentially fixed, and no longer gave rise to confusion. Nevertheless, the more precise definition and meaning of the basic concepts such as distance, time, mass, force, and energy continued to be the subject of discussion up to Hertz’s time and beyond.
Space, Distance. As far as the concept of space and distance is concerned, the discussion was mostly concerned with the problem of Euclid’s fifth postulate (the parallel postulate) and the problem of absolute space. The first problem was of ancient origin.⁸ Many mathematicians from the ancient Greeks and medieval Arabs to seventeenth- and eighteenth-century Europeans had tried to prove the parallel postulate from the other postulates of geometry, but none of the suggested solutions had won general acceptance. Around 1830 Gauss, Nikolai Ivanovich Lobachevsky and Janos Bolyai independently of each other came to the conviction that one cannot deduce the parallel postulate from the other axioms and they set out to construct a non-Euclidean geometry in which the postulate does not hold. They considered it an empirical question whether the postulate is satisfied in real physical space or not.
Their ideas became widely known around 1868 together with the even more daring ideas of Bernhard Riemann and led physicists such as Helmholtz to embrace an empirical theory of space. However, nineteenth-century scientists generally agreed that physical space was Euclidean either because experience tells us it is, or because it is a-priori so (as Kantians would have it) or because we need some convention about space, and the Euclidean convention is the simplest (this was the point of view of Henri Poincaré). The question of the Euclidean nature of space only entered into a few nineteenth-century works of mechanics (I shall return to some of them in Chapter 24). The problem of the existence of absolute space, however, was discussed at great length in connection with mechanics because it involves the laws of motion. Indeed Newton had argued that one can distinguish a rotating reference frame (a bucket filled with water) from a non-rotating one because the water will rise at the edges of the rotating bucket but will be plane in the non-rotating one. This led him to the idea that there exists an absolute space in which the laws of motion hold. On the other hand, Leibniz had argued that all measures of distances are relative and for that reason rejected the existence of absolute space. This discussion was continued through the eighteenth and nineteenth centuries. During the last half of the nineteenth century such first-rate scientists as James Clerk Maxwell (Maxwell 1876a), Carl Neumann (Neumann 1870), William Thomson (later Lord Kelvin) and Peter Guthrie Tait (Thomson and Tait 1879)⁹ discussed the problem. The most famous treatment of the problem was the total rejection of absolute space by Ernst Mach (Mach 1883). Not only did he point out that there is no way that one can distinguish between coordinate systems that are translated relative to each other. He also questioned whether Newton’s argument shows that one can single out an equivalence class of inertial frames without making reference to matter in space. According to Mach, we have no way of knowing how a bucket of water would behave if there were no other masses present in the universe. The only thing we know is that a reference frame that is at rest relative to the fixed stars is to a good approximation an inertial frame. In some passages he even seems to suggest that inertia may be due to interactions between all matter in the universe, an idea that influenced Albert Einstein’s work on the general theory of relativity.¹⁰
8 For more information on the development of non-Euclidean geometry see (Bonola 1955) and (Gray 1989).
Time. The problem of absolute space has its counterpart in the problem of absolute time. Newton believed in the existence of an absolute, true and mathematical time, whereas Mach and others before him argued that we can only know of relative time. According to Mach we can choose one free motion as measuring time, and we can compare another free motion with respect to this standard. We can then inquire whether the second motion will traverse equal distances in the time it takes the other motion to traverse equal distances, etc. But there is no way we can tell if these movements are absolutely uniform or if they are accelerated.
The discussions of absolute and relative space and time were closely related to each other and, as the previous brief accounts have suggested, they involved a discussion of the meaning of Newton’s first law, the law of inertia.
Mass. The definition of the concept of mass was a problem from Newton through the nineteenth century. As late as 1867 Thomson and Tait defined it as the product of volume and density, which of course just raises the question: what is density? From Newton’s time mass was somehow considered to be a measure of the quantity of matter. This connected the definition of mass to the hotly debated question of the constitution of matter.¹¹ Some mechanicians such as Euler, Lazare Carnot and Barré de Saint-Venant¹² imagined that one can consider matter as made up of small identical particles, in which case one can define mass as the number of these particles. In the hands of Barré de Saint-Venant this idea gave rise to the following semi-empirical definition of mass:
The mass of a body is the ratio of two numbers expressing how many times this body and another body, chosen arbitrarily and always the same, contain parts which being separated and colliding with each other two by two, communicate opposite equal velocities to each other. (Dugas 1988, p. 437. The quote is from 1851)
9 For other references see (Mach 1883, pp. 280–297) and Voss (Voss 1901, pp. 30–35).
10 Mitchell (Mitchell 1993) has questioned this interpretation of Mach’s ideas of space.
11 I shall return to the question of matter in Chapters 4, 11 and 12.
12 See references in (Jouguet 1909, notes 7 and 48).
A similar definition of equal mass was put forward by Mach: ‘All those bodies are bodies of equal mass, which mutually acting on each other, produce in each other equal and opposite accelerations’ (Mach 1883, p. 266). However, Mach did not make any hypotheses about the constitution of matter, so instead of breaking bodies of unequal mass into pieces of equal mass, he used Newton’s third law of action and reaction to define mass: ‘If we take A as our unit, we assign to that body the mass m which imparts to A m times the acceleration that A in the reaction imparts to it’ (Mach 1883, p. 266). This entirely empirical definition is a definition of inertial mass. Mach argued that the mass we measure on a scale is the inertial mass. Indeed if the scale is in equilibrium it tells us that the masses on the pans via the lever of the scale produce in each other equal and opposite (gravitational) accelerations.¹³ If we accept additivity of mass as an experimental fact (this is not mentioned by Mach) this observation will explain why usual weighing tells us inertial mass. Mach pointed out that the constants $m_1$ and $m_2$ that enter into Newton’s law of gravitation $F = G\,m_1 m_2 / r^2$ are other constants (the gravitational masses). Only experience teaches us that they are proportional to the inertial masses.
Force. The definition of the concept of force and its relation to Newton’s three laws of motion and the parallelogram of forces remained a major problem in the foundation of mechanics and although remarkable clarifications were arrived at during the nineteenth century, Hertz did not consider the matter to be settled. Many physicists and mathematicians such as Euler would define force as the cause of change of motion and would inevitably be led into metaphysical problems. Others, such as d’Alembert and Barré de Saint-Venant, would reject the idea of motive causes as metaphysical and obscure and would instead define force by Newton’s second law as mass times acceleration.
They then had to explain how a definition could be used as a law of nature that could determine the motion of bodies. The logically clearest introduction of forces along the second line was due to Gustav Robert Kirchhoff. In his Vorlesungen über Mathematische Physik vol. 1 Mechanik (Kirchhoff 1876) he noticed that experience has shown that the acceleration of a particle is a function of its position but independent of its velocity. He therefore defined this function of position as the accelerating force acting on the particle. A particle is said to be acted on by a system of forces if its acceleration is the vector sum of these forces. Of course any force describing a motion can be considered as resulting from infinitely many systems of forces, but Kirchhoff maintained that ‘experiments show that in natural motions there can always be found systems whose separate forces are given more easily than their resultant’ (Kirchhoff 1876, pp. 11–12). For example, it is an experimental fact that we can consider the force given by the motion of the moon as resulting from gravitational forces from the Earth and the sun (and the other planets). Having introduced the accelerating force on one particle, Kirchhoff considered systems of particles, and in that connection introduced masses as certain coefficients that one needs to introduce into the differential equations of motion of a constrained system in order that they be invariant under coordinate transformations. In this way, Kirchhoff succeeded in introducing forces and masses in an entirely descriptive way. In general, he emphasized that the purpose of mechanics is to ‘describe the motions that take place in nature … rather than to explain their causes’ (Kirchhoff 1876, p. iii). This dictum was close to Mach’s opinion and Kirchhoff’s introduction of mass and force was similar to the one that Mach independently arrived at. A similar explanation of forces (but not of mass) was put forward by C. Neumann (Neumann 1887, pp. 154–168).
13 This argument is based on the empirical fact that gravitational acceleration is the same for all bodies placed at the same place in the Earth’s gravitational field. However, that empirical result is exactly the one that establishes the identity of inertial and gravitational mass.
Many mechanicians considered these problematizations of the concept of force as being ill founded. They considered force as an empirically existing object that we have immediate access to via our tactile sensations. Ferdinand Reech in particular insisted that the concept of force was independent of motion and therefore belonged to statics. It could be measured by a static elastic pull of a thread (Reech 1852). His ‘school of the thread’ was developed further by Jules Andrade (Andrade 1898).
In connection with Hertz it is particularly important to consider how his mentor Helmholtz conceived of the concept of force (Heidelberger 1993). In his early works on the conservation of ‘force’ (Helmholtz 1847) he argued metaphysically along Kantian lines that nature must consist of matter in the form of material points. According to Helmholtz the aim of science is to find the fundamental and thus invariable causes of natural phenomena that ultimately result from spatial motion of matter. He identified the causes with forces that he saw as being intimately linked to matter.
Moreover, he argued that since the fundamental forces are invariable they must be functions of the distance between the material points between which they act, and they must be central forces.¹⁴ On top of this metaphysical argument Helmholtz offered a mathematical ‘proof’ that forces that do not depend solely on distance would make a perpetuum mobile possible, i.e. they would violate energy conservation.¹⁵ After this ‘proof’ had been criticized by Lipschitz and Rudolf Clausius, Helmholtz admitted in 1870 that it was flawed. At the same time he distanced himself from metaphysical arguments of the kind he had brought forward in 1847 and approached a point of view that was in line with his general empiricist philosophy. He now believed that the aim of science is to find lawful relations between the appearances. In this respect he was in agreement with the positivists. However, he continued to believe that these laws can appropriately be formulated in terms of forces, the nature of which must be determined empirically. He did not agree with the radical positivists that science had better avoid the concept of force and he continued to believe that behind the appearances there is a reality that our physical laws approach. Still, the concept of force in his later writings was reduced from a necessary presupposition of science to a hypothetical abstraction from lawful relations (Heidelberger 1993, pp. 476–477). A separate and very hotly debated problem related to the concept of force concerned the possibility of actions at a distance. I shall come back to the discussions of this problem in Chapter 4.
14 For a discussion of the relation between this argument and Kantian philosophy see (Hyder 2004).
15 See also (Bevilacqua 1993).
Newton’s laws. As mentioned above the status of Newton’s laws was up for discussion. Were they really laws or were they definitions? Was the first law a principle relating time, space and inertial systems, or was it a definition of one of these concepts? Did the second law state a law of motion or was it a definition of force? Did the third law relate action and reaction, or was it part of the definition of the concept of mass? Moreover, the advent of electromagnetism had cast doubt on the universal validity of the third law.
The other principles. The other principles of mechanics were not so easily confused with definitions, but in many cases their initial formulation left their precise meaning unclear. For example, the principle of virtual work, d’Alembert’s principle, the principle of live force and in particular the principle of least action were originally formulated in a somewhat vague way in terms of concepts that lacked precise meaning. However, Euler’s works and in particular Lagrange’s Mécanique Analytique provided much conceptual and in particular mathematical clarification. Yet, as Jacobi pointed out in his last and only recently published lectures on mechanics (Jacobi 1996), there still remained many unsettled physical and philosophical problems. For example, Jacobi criticized Lagrange’s ‘derivation’ of the principle of virtual work that was at the basis of his presentation of mechanics. According to Pulte (Pulte 1998) Jacobi’s critique was an important source for the surge in the critical revaluation of the foundation of mechanics during the second half of the nineteenth century.
In order to illustrate the extent to which even the mathematical formulations and the validity of the principles of mechanics were up for discussion during the late nineteenth century and even the beginning of the twentieth century I shall briefly mention the discussion about the formulation, meaning, generality, and applicability of the principle of least action. This discussion also had a historiographical dimension, because it was partly conducted as an investigation of what Lagrange had meant in his various and not always too clear formulations of the principle.¹⁶ In 1816 Olinde Rodrigues showed that if the time t is varied in the principle of least action one can derive the equations of motion from it. When applied to a conservative system, energy conservation would thus become a result of the principle of least action. However, Rodrigues’s observation went unnoticed for some time and other interpretations of the principle were put forward. As we saw above, Jacobi claimed that energy conservation had to be presupposed when formulating the principle of least action, and he formulated his ‘geometric’ version of the principle, in which he had eliminated time altogether. Mikhail V. Ostrogradsky (Ostrogradsky 1850, 415ff.), on the other hand, insisted that the only true formulation of the principle was the one put forward by Hamilton (Hamilton’s principle). He was initially followed by Christian Gustav Adolph Mayer (Mayer 1877) but the latter changed his mind when he became acquainted with Rodrigues’s approach. Other mechanicians such as Sloudsky (Sloudsky 1879) and Routh (Routh 1877a) had by then come to the same conclusion as Rodrigues. Still the discussion about the meaning of the principle continued, involving such authors as Helmholtz (Helmholtz 1887), Otto Hölder (Hölder 1896), Moritz Réthy (Réthy 1897), (Réthy 1904), Aurel Edmund Voss (Voss 1900), and Philip E.B. Jourdain (Jourdain 1905b), (Jourdain 1905a), (Jourdain 1908b), (Jourdain 1913). The discussion was reflected in most of the contemporary and later literature on the history of the principle such as (Mach 1871), (Kneser 1928), and (Brunet 1938). In addition to the question about the relation between the principle of least action and Hamilton’s principle, and the problem of whether time is to be varied in the variational procedure, the discussion revolved around questions like: Under what conditions is the principle applicable to systems where the force function contains time explicitly, to non-conservative systems in general, to constraints that contain time explicitly and to non-holonomic constraints? More information about the discussion can be found in (Voss 1901), in Jourdain’s above-mentioned papers, and in his edition in Ostwald’s Klassiker of the works by Lagrange, Rodrigues, Jacobi, and Gauss (Jourdain 1908a). I shall return to the problem of non-holonomic constraints in Chapter 22.
16 For a discussion of the early history of the principle see (Pulte 1989).
2.4 Novel expositions and critical works
The above summary was intended to give an impression of the richness of the activity in classical mechanics during the half century that preceded Hertz’s Principles of Mechanics. Another way to bring out the message that mechanics and in particular its foundational aspects was a lively area of research towards the end of the nineteenth century, is to list the many new textbooks and critical works on the subject that were published during this period. Hertz explicitly mentioned that ‘in a general way I owe very much to Mach’s splendid book on the Development of Mechanics (Mach 1883). I have naturally consulted the better-known textbooks of general mechanics, and especially Thomson and Tait’s comprehensive treatise’ (Thomson and Tait 1879), (Hertz 1894, Preface, p. XXXII). Moreover, as evidence for the contemporary interest in foundational issues Hertz mentioned a series of papers in Nature by Oliver Lodge (Lodge 1893) on the foundations of dynamics. These papers were of such a late date that they cannot have influenced Hertz’s own ideas. Finally, as evidence of ‘the increasing care with which the logical analysis of the elements is carried out in the recent textbooks of mechanics’ Hertz (Hertz 1894, p. 10/8) mentioned Budde’s Allgemeine Mechanik der Punkte und starren Systeme (Budde 1890). However, he did not consider Budde’s attempt to be altogether successful: ‘The representation there given shows at the same time how great are the difficulties encountered in avoiding discrepancies in the use of the elements.’ (Hertz 1894, p. 10/9). In addition to these explicit references Hertz implicitly referred to Kirchhoff’s Mechanics (Kirchhoff 1876) (Hertz 1894, p. 21/18). As far as the more
technical mathematical machinery of his mechanics is concerned Hertz explicitly referred to Helmholtz’s papers (Helmholtz 1884) and (Helmholtz 1886) as well as to Joseph John Thomson’s (Thomson 1886) although he declared that he only learned of the latter when his own research was well advanced. Moreover, there is no doubt that he was well aware of Helmholtz’s paper on the conservation of energy (Helmholtz 1847). The textbooks of general mechanics that Hertz may have consulted include Clifford’s Elements of Dynamics (Clifford 1878), Despeyrous’s Cours de Mécanique (Despeyrous 1884), with notes by Darboux, Mathieu’s Dynamique analytique (Mathieu 1878), Maxwell’s Matter and Motion (Maxwell 1876a), C. Neumann’s Grundzüge der analytischen Mechanik, insbesondere der Mechanik starrer Körper (Neumann 1887), Julius Petersen’s Lehrbuch der Statik, Kinematik und Dynamik fester Körper (Petersen 1881), (Petersen 1884), (Petersen 1887), Resal’s Traité de mécanique générale (Resal 1873), Routh’s A Treatise on the Dynamics of a System of Rigid Bodies (Routh 1877a), and W. Schell’s Theorie der Bewegung und der Kräfte (Schell 1879). Hertz’s book did not slow down the publication of new approaches to mechanics. Such major mathematicians and physicists as Appell, Boltzmann, Koenigsberger, Love, Painlevé, and Voigt all published textbooks on mechanics during the eight years following the appearance of Hertz’s book. In a paper on the principles of rational mechanics in the Encyclopädie der Mathematischen Wissenschaften Voss listed 46 important works published between 1870 and 1901 dealing with mechanics (Voss 1901). On top of that he listed 33 historical-critical works of which the vast majority were also written in this time interval.
They include: Clifford’s The Common Sense of the Exact Sciences (Clifford 1885), Dühring’s Kritische Geschichte der allgemeinen Prinzipien der Mechanik (Dühring 1873) that later became famous or infamous among a general audience, thanks to Lenin’s criticism, C. Neumann’s Die Prinzipien der Galilei-Newton’schen Theorie (Neumann 1870), Planck’s Das Prinzip der Erhaltung der Energie (Planck 1887), du Bois-Reymond’s Über die Grundlagen der Erkenntnis in den exakten Wissenschaften (du Bois-Reymond 1890), Tait’s Properties of Matter (Tait 1885), and W. Wundt’s Die physikalischen Axiome und ihre Beziehung zum Kausalprinzip (Wundt 1866). Also, this type of literature continued to be published after 1894. In particular, the works of Poincaré and Duhamel should be mentioned. The above brief introduction to the development of mechanics and in particular its foundations was intended to demonstrate the great interest this field attracted toward the end of the nineteenth century, and to call attention to a few of the problems at stake. I shall return to other problems that Hertz addressed explicitly, for example: Is it appropriate to appeal to non-observable quantities, in particular to atoms and molecules and their attractive and repulsive forces, or should one attempt to give a phenomenological and macroscopic description of the world? Mach opted for the latter, whereas many working physicists, in particular Boltzmann, advocated the former. Connected to this problem was the status of the energy principle: Some, like Thomson and Tait, and in a purer form Ostwald (Ostwald 1888, p. 91), developed a so-called energetic program in which they based a macroscopic phenomenological mechanics on the energy principle, avoiding
thereby all references to forces between atoms (see Chapter 4). In Chapter 4 I shall deal with the fundamental problem of the nature of forces, in particular those that act at a distance. Finally, there is the question about the role of mechanics in the scientific understanding or description of the world. This is the subject of the next chapter.
3 Mechanization of physics
All physicists agree that the problem of physics consists in tracing the phenomena of nature back to the simple laws of mechanics. (Hertz 1894, beginning of the preface)
In the previous chapter I raised the question of why Hertz would have been attracted to an old and classical discipline such as mechanics. The answer I gave was that mechanics and in particular its foundations had remained an active area of research that attracted the attention of many first-rate mathematicians and physicists. However, that is only part of the answer. In order to understand Hertz’s motives it is important to realize that for Hertz and most of his contemporaries and precursors from the seventeenth century on, mechanics was not just one among many more recent areas of physics. It was the basic discipline of physics. In a weak sense this meant that mechanics was a perfect model for other disciplines, and in a stronger sense it meant that the ultimate goal of physics was to give mechanical explanations of all natural phenomena [1]. That Hertz agreed with this mechanistic reductionist program is manifest from the opening words of the preface to his book, quoted above, as well as from the following quote from the introduction: . . . the fundamental ideas of mechanics, together with the principles connecting them, represent the simplest image which physics can produce of things in the sensible world and the processes which occur in it. By varying the choice of the propositions which we take as fundamental, we can give various representations of the principles of mechanics. Hence we can thus obtain various images of things. (Hertz 1894, p. 4/4)
Thus, when Hertz found the existing foundations of mechanics unsatisfactory this must have appeared as a major problem for all of physics. To solve it would be to solve the most fundamental physical problem. The mechanistic reductionist program was as old as classical mechanics itself. The strategy was usually to explain non-mechanical physical phenomena by the motion of microscopic non-observable mechanical systems. In Newton’s optics, light rays were explained as rays of fast-moving light particles, whereas Augustin Fresnel and his nineteenth-century followers, such as Augustin Louis Cauchy, considered light to be vibrations in a mechanical medium, the luminiferous ether. Similarly, towards the end of the eighteenth century heat was usually treated as a substance that Antoine Laurent Lavoisier (1743–1794) called caloric, but the generation of heat by friction suggested in 1798 to Count Rumford that heat was some kind of motion. Faced with these two possibilities Lavoisier and Laplace (in some of their works) and in particular Fourier dealt with heat in an agnostic way in which the nature of heat was left open. However, after Fresnel had advanced his theory of light and André Marie Ampère had formulated a vibrational theory of heat the kinetic theory began to gain ground and it became the dominant one during the 1850s after James P. Joule had measured the mechanical equivalent of heat. James Clerk Maxwell in particular began to consider the thermodynamic properties of gases as being the result of the motions and collisions of the molecules of the gas. He stressed the statistical nature of the laws of thermodynamics, an emphasis taken over by Ludwig Boltzmann (early 1870s) who, using this approach, succeeded in deriving the second law of thermodynamics from the ordinary laws of mechanics. Inspired by Boltzmann, but without his statistical point of view, the old Helmholtz also suggested a mechanical model of thermodynamics (Helmholtz 1884) [2]. In his earlier days Kantian metaphysical convictions had made Helmholtz a strong proponent of the mechanistic philosophy. As late as 1869 he wrote: If motion, however is the basic change underlying all the alterations in the world, then all elementary forces are moving forces. The final goal of the sciences is thus to find all the movements and driving forces supplying the foundation of all other change. In other words, the final goal of the sciences is to reduce [everything] to mechanics. (Heidelberger 1993, p. 474)
[1] For a discussion of the mechanistic world view during the last half of the nineteenth century see (Harman 1982), (Jungnickel and McCormmach 1986), (Klein 1973), (Smith and Wise 1989), (Mertz 1903, vol. II, Chapters VI and VII), (Boltzmann 1900b) and (Boltzmann 1903). Cassirer has given very interesting analyses of the great philosophical implications of the physical theories of the late nineteenth century (Cassirer 1950, Chapter 5), (Cassirer 1910).
Such statements from his mentor may have influenced Hertz to follow his mechanistic creed stated in almost the same words in the beginning of his Mechanics. However, the mechanistic thinking in electromagnetic theory, in particular among British physicists, was probably at least as influential. I shall discuss these ideas in more detail, also because they illustrate that mechanical reductions came in many varieties, ranging from strong claims about the true ontological nature of the phenomena to weak analogies [3]. The case of William Thomson (later Lord Kelvin) and Maxwell will show that the same person could operate on many levels of mechanistic thinking. Siegel (Siegel 1991) has seen this as a wavering between Scottish scepticism and ontologically committed hypothetical mathematical deductive reasoning, put forward by John Herschel and William Whewell, and favored in Cambridge. Both Thomson and Maxwell began as agnostics, creating ‘analogies.’ It had already been remarked by the French mathematician Michel Chasles (Chasles 1837) that heat conduction and electrostatics were ‘analogous’ in the sense that they were described by the same mathematical equation (Laplace’s equation). Thomson, perhaps inspired by Chasles, carried this idea further (see (Knudsen 1985)) showing how our intuition about one of the two areas of physics could lead to correct but intuitively less obvious results in the other area. Thomson did not suggest that the physical processes of heat conduction and electrostatics resembled each other, but only that such an analogy based on a common mathematical description could enhance our understanding of both fields. When Maxwell embarked on his first work on electrodynamics in 1854–55 in which he wanted to mathematize Faraday’s theories about lines of force, he borrowed Thomson’s idea of analogy: I am trying to construct two theories, mathematically identical, in one of which the elementary conceptions shall be about fluid particles attracting at a distance while in the other nothing (mathematical) is considered but various states of polarization tension &c existing at various parts of space. The result will resemble your analogy of the steady motion of heat. Have you patented that notion with all its applications? for I intend to borrow it for a season . . .. (Maxwell to Thomson, May 15, 1855, quoted in (Larmor 1937, p. 11))
[2] See (Bierhalter 1993, p. 4/4) and Section 18.4.
[3] For a history of electromagnetism see (Darrigol 2000).
In the published paper On Faraday’s Lines of Force (Maxwell 1864b) Maxwell suggested an analogy between Faraday’s electromagnetic fields and ‘tubes of variable section carrying an incompressible fluid’ (Maxwell 1864b, p. 158). His reason for invoking the hydrodynamical analogy was ‘to present the mathematical ideas to the mind in an embodied form, as systems of lines or surfaces, and not as mere symbols, which neither convey the same ideas, nor readily adapt themselves to the phenomena to be explained’ (Maxwell 1864b, p. 187). A mathematical theory of a vector field did not exist at this time, so in order to avoid a purely analytical theory, Maxwell used the analogy to ‘embody’ the analytical theory in geometric form [4]. He counted it as an advantage that the hydrodynamical analogy clearly did not look like a true explanation of the electromagnetic phenomena: By referring everything to the purely geometrical idea of the motion of an imaginary fluid, I hope to attain generality and precision, and to avoid the dangers arising from a premature theory professing to explain the cause of the phenomena. (Maxwell 1864b, p. 159)
Within a few years, however, Maxwell thought that the time was ripe to suggest a mechanical theory ‘which if not true, can only be proved to be erroneous by experiments which will greatly enlarge our knowledge of this part of physics’ (Maxwell 1861, p. 452). This shift ‘toward a realistic, comprehensive, and explanatory theory’ [5] was preceded by a similar shift in the work of Thomson caused by his work on the kinetic theory of gases and his explanation of the rotation of the plane of polarization of polarized light in a magnetic field. The latter had given Thomson the idea that the magnetic field was due to ‘molecular vortices,’ i.e. to rotational motion of some kind. Maxwell accepted this explanation and thus exchanged the tubes of his previous analogy with rotating molecular vortices along the lines of force. This would give rise to centrifugal forces that would explain the tendency of the vortices to widen and contract, and therefore to cause magnetic attraction. He presented his new theory in a series of papers expressively entitled On Physical Lines of Force (Maxwell 1861). His commitment to this theory and its operative power to produce new physical insight has been disputed by some historians of physics. However, Siegel (Siegel 1991) has argued that Maxwell considered his theory of the vortices as very probable and a good candidate for reality. In order to explain how the vortices or tubes could all rotate in the same direction without creating too much friction Maxwell introduced small ‘idle wheels’ between the vortices (see Fig. 3.1). The motion of these spherical movable idle wheels represented an electric current. In order to incorporate electrostatics into his theory Maxwell imagined that the medium carrying the vortices was elastic, a hypothesis that made him introduce the displacement current and suggest that light was in fact transverse vibrations in the same medium. In fact, he could use measurements done by Wilhelm Weber and Rudolph Kohlrausch to calculate the velocity of a wave in the electromagnetic medium and it turned out to agree with the velocity of light in air. This made Maxwell conclude ‘that the magnetic and luminiferous media are identical’ (Maxwell to Thomson, December 10, 1861 (Larmor 1937)). Earlier on, Maxwell had speculated that his physical theory of electromagnetism might turn out to include the kinetic theory of gases, and his new result now pointed to an electromagnetic theory of light. This unexpected success lent support to the mechanical world view, and was seen as the strongest argument in favor of Maxwell’s theory. However, long before Hertz produced experimental evidence for electric waves, the specific mechanical theory that had given rise to it had been abandoned as a convincing explanation of the phenomena. Yet some physicists, including Thomson, continued to think of rigid rotating cells and idle wheels as a suggestive ‘model’ of the magnetic field. Thus, in his famous Baltimore Lectures, he formulated his mechanistic creed as follows: It seems to me that the test of ‘Do we or not understand a particular subject in physics?’ is, ‘Can we make a mechanical model of it?’ I have an immense admiration for Maxwell’s mechanical model of electro-magnetic induction. He makes a model that does all the wonderful things that electricity does in inducing currents, etc.; and there can be no doubt that a mechanical model of that kind is immensely instructive and is a step towards a definite mechanical theory of electro-magnetism. (Thomson 1985, p. 111)
Fig. 3.1. Maxwell’s mechanical model of the electromagnetic ether (Maxwell 1861).
[4] I owe this insight to Jed Buchwald.
[5] This quote is from Siegel (Siegel 1991) who has analysed Maxwell’s theory in great detail.
However, he could not accept elastic wheels so he did not consider Maxwell’s model as a complete model of electromagnetism: I never satisfy myself until I can make a mechanical model of a thing. If I can make a mechanical model I can understand it. As long as I cannot make a mechanical model all the way through I cannot understand; and that is why I cannot get the electro-magnetic theory. (Thomson 1985, p. 206)
A ‘mechanical model’ in the terminology of Thomson and most of his contemporaries was thought of as being more than a mere analogy, but mostly much less than a true mechanical explanation. Where the analogy only manifested itself on the mathematical level, a model corresponded in some ways to the natural phenomenon of which it was a model. Some models, like Maxwell’s 1862 model for electromagnetism could pass as possible realistic explanations, others were merely mechanical illustrations of specific phenomena. This, for example, is the case with the mechanism that Maxwell produced to illustrate induction of currents (Fig. 3.2). Boltzmann later incorporated so many such gear wheel models into his textbook on Maxwell’s theory that a reader flipping through it might think he has picked up a textbook for machine engineers. It is obvious that these mechanical models were not meant as attempts to explain what really goes on in the electromagnetic field. They served a variety of other purposes: a means to understanding (see Thomson’s quote above), an illustration, a didactical device, a help to further conjecturing, and most fundamentally a proof that the physical property in question can be incorporated into mechanics. A mechanical model, however unrealistic it may be, indicates that it might be possible at a later time to find more realistic models and perhaps eventually to uncover the true mechanical origin of the phenomenon. Maxwell himself soon became dissatisfied with his concrete model of the electromagnetic ether. He pointed out that even if we can make a mechanical model that produces the observed effects, there is no guarantee that this is the only possible model nor indeed the ‘correct’ one. Yet he continued to assume that in ‘space there is matter in motion, by which the observed electromagnetic phenomena are produced’ (Maxwell 1864a, p. 527). In his monumental Treatise on Electricity and Magnetism
Fig. 3.2. Mechanical model designed by Maxwell to illustrate the laws of induction of currents. Maxwell’s Treatise (Maxwell 1873b) Book 2 Chapter 7.
(Maxwell 1873b) he overruled his own scepticism as follows: It is difficult, however for the mind which has once recognized the analogy between the phenomena of self-induction and those of the motion of material bodies, to abandon altogether the help of this analogy, or to admit that it is entirely superficial and misleading. The fundamental dynamical idea of matter, as capable by its motion of becoming a recipient of momentum and of energy, is so interwoven with our forms of thought that, whenever we catch a glimpse of it in any part of nature, we feel that a path is before us leading, sooner or later, to the complete understanding of the subject. (Maxwell 1873b, §550)
Having argued that a current is associated with some form of kinetic energy and that it cannot reside in the ‘current-bearing’ wire, Maxwell concluded that the energy must be stored in the field (the ether) surrounding the wire. Toward the end of the Treatise he hypothesized, in accordance with his earlier view, that magnetism is stored as ‘very small portions of the medium (the ether), each rotating on its own axis. This is the hypothesis of molecular vortices.’ (Maxwell 1873b, §822). However, in his analysis of induction between circuits he made no such strong commitment as to the way in which the energy is stored in the ether. Still he could derive interesting results about electromagnetic induction by using a highly mathematical ‘model’ namely the Lagrange formalism: What I propose now to do is to examine the consequences of the assumption that the phenomena of the electric current are those of a moving system, the motion being communicated from one part of the system to another by forces, the nature and laws of which we do not yet even attempt to define, because we can eliminate these forces from the equations of motion by the method given by Lagrange for any connected system. (Maxwell 1873b, §552)
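The procedure Maxwell describes here can be illustrated by a short worked example in modern notation. This is only a sketch under stated assumptions, not a reproduction of Maxwell’s own formulas: the symbols q_k (the charge that has passed through circuit k), \dot q_k (the current, treated as a generalized velocity), L_1, L_2 and M (coefficients of self- and mutual induction, assumed to depend only on the geometric configuration of the circuits), R_k (resistance) and E_k (impressed electromotive force) are introduced here purely for illustration.

```latex
\begin{align}
  % Energy of the hidden moving system, assumed quadratic in the two currents
  T &= \tfrac12 L_1 \dot q_1^{\,2} + M\,\dot q_1 \dot q_2 + \tfrac12 L_2 \dot q_2^{\,2}, \\
  % Lagrange's equations, with the electromotive force and the resistance
  % entering as generalized forces
  \frac{d}{dt}\frac{\partial T}{\partial \dot q_k} - \frac{\partial T}{\partial q_k}
    &= E_k - R_k \dot q_k, \qquad k = 1, 2, \\
  % T does not depend on q_1 itself, so the equation for the first circuit reads
  \frac{d}{dt}\bigl(L_1 \dot q_1 + M \dot q_2\bigr) + R_1 \dot q_1 &= E_1 .
\end{align}
```

If the circuits are at rest, L_1 and M are constant and the last equation reduces to L_1 \ddot q_1 + M \ddot q_2 + R_1 \dot q_1 = E_1: a changing current in the second circuit acts as an electromotive force in the first. The point of the sketch is that this conclusion follows from the form of the energy alone, without any assumption about the forces inside the hidden mechanism.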
As explained in Chapter 2 Lagrange’s method makes it possible to set up the equations of motion as soon as one knows a set of generalized coordinates (and their conjugate momenta) that completely determine the mechanical system. It is not necessary to know the forces that produce the constraints. We only need to know the kinetic and potential energies (or the Lagrangian) in terms of the generalized coordinates and velocities. Thus, when Maxwell wanted to study induction between two conducting circuits (Maxwell 1873b, §568–584) he considered the currents in the two circuits as generalized velocities that together with the geometric configuration of the circuits determine the motion of the mechanical system (the ether). He then derived an expression for the energy in terms of the currents, and used Lagrange’s equations to derive theorems about the interactions between the two currents. This approach to model making had been begun by George Green in a study of reflection and refraction (Green 1842). Having expressed his dissatisfaction with Cauchy’s assumption about the centrality of the forces acting between the molecules of the ether, he continued: If, however this were not the case, we are so perfectly ignorant of the mode of action of the elements of the luminiferous ether on each other, that it would seem a safer method to take some general physical principle as the basis of our reasoning, rather than assume certain modes of action, which after all, may be widely different from the mechanism employed by nature. (Green 1842)
More particularly he used the idea of a potential function for the whole system and d’Alembert’s principle. He emphasized that this Lagrangian method was particularly suited to the treatment of systems of an immense number of particles, and that it led not only to the equations of motion but also to the boundary conditions at the interface between two different media. Similar ideas were used the following year by James MacCullagh in his Essay towards a dynamical theory of crystalline reflexion and refraction (MacCullagh 1839). The physical use of Lagrange’s formalism was considerably facilitated when Thomson and Peter Guthrie Tait included it in their influential textbook Treatise on Natural Philosophy (Thomson and Tait 1867) often referred to as T and T’. They gave the stark mathematical formalism a physical treatment using ‘impulsive forces’ to introduce the generalized momenta and the energy function. This was also the way in which Maxwell chose to deal with the Lagrangian formalism in his Treatise (Maxwell 1873b, §553–567). In a review of the second edition of T and T’ (Thomson and Tait 1879) Maxwell highlighted the use of this method in physics. Having stated that Newton’s laws are sufficient to deal with mechanical problems such as the motion of the solar system where we believe we can observe all that we have to account for he continued: But when we have reason to believe that the phenomena which fall under our observation form but a very small part of what is really going on in the system, the question is not – what phenomena will result from the hypothesis that the system is of a certain specific kind? but – what is the most general specification of a material system consistent with the condition that the motions of those parts of the system which we can observe are what we find them to be?
It is to Lagrange, in particular that we owe the method which enables us to answer this question without asserting either more or less than all that can be legitimately deduced from the observed facts. But though this method has been in the hands of the mathematicians since 1788 when the Mécanique Analytique was published, and though a few great mathematicians, such as Sir W.R. Hamilton, Jacobi, & c., have made important contributions to the general theory of Dynamics, it is remarkable how slow natural philosophers at large have been to make use of these methods. Now, however, we have only to open any memoir on a physical subject in order to see that these dynamical theorems have been dragged out of the sanctuary of profound mathematics in which they lay so long enshrined, and have been set to do all kinds of work, easy as well as difficult, throughout the whole range of physical sciences. The credit of breaking up the monopoly of the great masters of the spell, and making all their charms familiar in our ears as household words, belongs in a great measure to Thomson and Tait. The two northern wizards were the first who, without compunction or dread, uttered in their mother tongue the true and proper names of those dynamical concepts which the magicians of old were wont to invoke only by the aid of muttered symbols and inarticulate equations. And now the feeblest among us can repeat the words of power and take part in dynamical discussions which but a few years ago we should have left for our betters. (Maxwell 1879)
This quote is a fine expression of Maxwell’s mature view on mechanical modelling. It also highlights the importance of the Lagrangian formalism and Thomson’s and Tait’s role in making it available to physicists. However, Maxwell clearly underestimated his own role in this development. His treatment and application of the formalism in the Treatise was influential as was his more popular account of the effect of hidden systems (e.g. (Maxwell 1876b) and (Maxwell 1879)). In the latter he explained the ‘Lagrangian’ approach to hidden systems by way of an analogy with a belfry. In an ordinary belfry, each bell has one rope which comes down through a hole in the floor to the Bellringers’ room. But suppose that each rope, instead of acting on one bell, contributes to the motion of many pieces of machinery, and that the motion of each piece is determined not by the motion of one rope alone, but by that of several, and suppose, further, that all this machinery is silent and utterly unknown to the men at the ropes, who can only see as far as the floor above them. Supposing all this, what is the scientific duty of the men below? (Maxwell 1879, p. 783)
According to Maxwell their duty is to determine the potential energy of the machinery above in terms of the known coordinates (i.e. the position of the ropes) and its kinetic energy in terms of the known positions and velocities. This can be done by pulling the ropes in various ways and measuring the tug from each rope. These data are sufficient to determine the motion of every one of the ropes when it and all the others are acted on by any given forces. This is all that the men at the ropes can ever know. If the machinery above has more degrees of freedom than there are ropes, the coordinates which express these degrees of freedom must be ignored. There is no help for it. (Maxwell 1879, p. 784)
The last remark is a reference to the concept of ignorable or cyclic coordinates which were incorporated into the second edition of T and T’. The special property of these coordinates had been noted earlier in the century, and had been emphasized by Routh
in 1877 but they became particularly important in the Lagrangian mechanization of physical phenomena toward the end of the nineteenth century. Since they play a key role in Hertz’s mechanics, I shall devote a separate chapter (18) to their properties and use, in particular, in the hands of Helmholtz and J.J. Thomson. On the Continent, physicists were less occupied with mechanical model making than in Britain, but many of them shared the general belief that the ultimate goal was a mechanical explanation of physical phenomena. In so far as mechanical modelling was done it showed a variation similar to that in Britain. Hermann von Helmholtz used the Lagrangian formalism (more precisely the principle of least action) and cyclic coordinates to make ‘models’ of physical phenomena such as heat (Helmholtz 1884) and electromagnetism (Helmholtz 1886) (see Chapter 18), stressing that although we do not know the inner working of the mechanical systems responsible for these phenomena, the analysis shows that these phenomena do not contradict the laws of mechanics (Helmholtz 1886, p. 140). Ludwig Boltzmann continued Helmholtz’s line of research on heat the same year and explained in the introduction: The aim is not to put forward mechanical systems that will be completely congruous to hot bodies but to find all systems which show more or less analogies with the properties of hot bodies. (Boltzmann 1884, p. 122)
By this time he had himself developed Maxwell’s kinetic theory of gases into a much more realistic statistical mechanical model. Moreover, as mentioned above, he was a great constructor of concrete and totally unrealistic gear-wheel mechanisms to illustrate electromagnetic phenomena. He continued to defend the mechanical world view in popular lectures in 1900 and 1903, (Boltzmann 1900b) and (Boltzmann 1903), at a time when many other physicists had given it up.
3.1 The decline of the mechanistic world view
‘All physicists agree that the problem of physics consists in tracing the phenomena of nature back to the simple laws of mechanics.’ These opening words of Hertz’s preface to his Mechanics show that he wholeheartedly supported the mechanistic philosophy. However, the drafts of this clear-cut statement reveal that Hertz was aware that the mechanistic philosophy was not shared by everyone. Indeed, in the first draft (Ms 10) he first wrote ‘Most physicists’ [6], then changed it to ‘Probably all physicists’ [7], and finally to ‘The Physicists’ [8]. Moreover, to the next word ‘agree’ he added the word ‘completely’ [9] but later deleted it. Indeed, one of his main sources of inspiration, Ernst Mach’s Die Mechanik in ihrer Entwickelung historisch-kritisch dargestellt, had forcefully attacked the mechanical reductionistic program: We think it is a prejudice to assume that mechanics must be considered the foundation of all other branches of physics, and that all physical processes must be explained mechanically. (Mach 1883, p. 467)
[6] ‘Die Meisten Physiker.’
[7] ‘Wohl alle Physiker.’
[8] ‘Die Physiker.’
[9] ‘vollständig.’
Mach argued that it was a historical coincidence that mechanics was considered the deepest physical theory, and opted for a phenomenological approach in which physics is restricted to the description of observable facts leaving aside all hypotheses about an underlying unobservable mechanism. Mach had come to this conviction through a critical philosophical analysis. Soon thereafter the kinetic theory of gases, which had been one of the success stories of the mechanistic program, raised problems for the program (Harman 1982, 149ff). One problem was caused by the so-called equipartition theorem, another more fundamental problem was to explain how an irreversible law such as the second law of thermodynamics could arise from a completely reversible theory of mechanics. Also, electromagnetism gradually led to the decline of the mechanistic world view and ironically Hertz was among the contributors to this tendency. Indeed, in Hertz’s theoretical papers on Maxwell’s theory he did not give any mechanical explanation of the electromagnetic phenomena, but having introduced some basic physical ideas and operational definitions of the fields he treated the theory from an axiomatic, mathematical point of view. This made it possible for continental physicists to work with Maxwell’s equations without having to invoke Maxwell’s mechanical ether. Towards the end of the century, the mechanical world view began to give way to an electromagnetic world view based on the ideas of Hendrik Antoon Lorentz (see (Harman 1982, pp. 116–119) and (Jungnickel and McCormmach 1986, pp. 227–245)). The program was formulated explicitly in 1900 by Wilhelm Wien as an alternative to the mechanical world view, and in particular as ‘diametrically opposed to Hertz’s foundation of mechanics’ (Wien 1900, p. 107) and it was developed further in various forms by Max Abraham (1875–1922) and Gustav Mie (1868–1957).
4 Problematization of the concept of force
4.1 Forces and atoms The most remarkable feature of the physical content of Hertz’s Mechanics is that it is a mechanics without forces. Hertz’s elimination of the concept of force from the basis of mechanics must be seen in the context of a general development of the status of this concept. The concept of force and, in particular, of gravitation acting at a distance was introduced by Newton in his Principia (Newton 1687). He enjoyed the hope that gravitation could be explained in terms of local actions but since he could not find any satisfactory theory he left actions at a distance as an experimentally verified fact in his theory. This was strongly criticized by the Cartesians who, following René Descartes, argued that all interactions could and should be explained in terms of contact forces. A void was argued to be a self-contradictory term, since space is determined by its content, and so the whole universe was thought to be filled with particles of various sorts, the motions and impact of which were the real cause of apparent actions at a distance. For example, Descartes tried to explain the motion of the solar system as the result of vortices in the etherial particles between the sun and the planets. Similar mechanistic atomistic world views were held by Peter Gassendi and Robert Boyle (see (Van Melsen 1952) and (Shapin and Schaffer 1985)). Their atoms had mass, volume and shapes that would explain physical and chemical properties, but they had no occult properties such as being able to act where they were not, i.e. at a distance. A very different type of atom was suggested by Giuseppe Boscovich. His atoms had no extension but they acted on each other at a distance. Similar atoms or molecules played a key role in the new and influential program for physics that Pierre Simon Laplace formulated during the early years of the nineteenth century in collaboration with his chemist friend Claude Louis Berthollet (see (Fox 1974)). 
In Laplacian physics short-range forces between the molecules of matter were invoked to explain physical properties such as capillarity and chemical affinity, in the same way that gravitational forces explained the motion of the solar system in Newtonian
mechanics. Optical phenomena were explained as the result of forces acting between the light particles and ordinary matter, and heat was thought of as a fluid, the so-called caloric that could combine with ordinary matter. Similarly, electricity and magnetism were each conceived as one or perhaps two fluids. These fluids as well as caloric were believed to be imponderable, i.e. carrying no gravitational mass. In Laplacian physics, actions at a distance were fundamental, and even ‘local’ phenomena such as impact or motion on a surface were, in fact or in principle, reduced to short-range forces. Rigid constraints that Lagrange had successfully handled in his Mécanique Analytique were, in the Laplacian approach, considered to be mere mathematical idealizations of phenomena whose true physical causes could be found in the forces that act between the atoms of the mechanical system. Poisson, one of the most vigorous proponents of the Laplacian program, explained the difference between the old Lagrangian and the new Laplacian approach as follows: They [the questions of mechanics] had to be treated in an entirely abstract manner . . . within this class of generality and of abstraction Lagrange has gone as far as one could conceive possible when he replaced the physical connections [liens physiques] of bodies by the equations relating the coordinates of their different points. It is that which constitutes the analytical mechanics; but next to this admirable conception one can now offer the physical mechanics . . . (Poisson 1829 quoted from (Arnold 1978, p. 254))
In his works on dynamics Hamilton followed Poisson’s account of rigid constraints: But the science of force, or of power acting by law in space and time, has undergone another revolution, and has become already more dynamic by having almost dismissed the conceptions of solidity and cohesion, and those other material ties, or geometrically imaginable conditions, which Lagrange so happily reasoned on, and by tending more and more to resolve all connexions and actions of bodies into attractions and repulsions of points. (Hamilton 1834, p. 247)
For that reason Hamilton did not deal with constraints in his mechanical formalism at all. As pointed out by Harman (Harman 1982, p. 19) many aspects of the Laplacian unified physical world view, that encompassed both theoretical and experimental physics, continued to shape the development of physics long after the specific elements of the theory such as the imponderable fluids were abandoned. However, as we have seen, Hertz’s view of the relation between the concept of force and the concept of constraint was a total reversal of that of Poisson and Hamilton. According to Hertz connections are the physically primary concept and forces are only derived idealized epiphenomena. This reversal was part of a larger rejection of Laplacian physics.
4.2 The problematization of distance forces. Field theory Laplacian physics gradually lost its leading role around 1830, first in the theory of light with Fresnel’s wave theory then in electromagnetism and soon thereafter in the theory of heat, where the kinetic theory suppressed the caloric theory (Fox 1974). In electromagnetism André Marie Ampère rejected the magnetic fluid, explaining magnetism
instead as a result of rotating motion of electricity. This explanation was suggested by Hans Christian Ørsted’s discovery in 1820 of the deflection of a magnetic needle near an electrical current. Though Ampère’s electrodynamic theory is usually seen as a reaction against Laplacian physics, it inherited several of its important characteristics: electromagnetic phenomena were explained by actions at a distance, and these actions were considered as a result of elementary interactions between infinitesimal entities the difference being that the infinitesimal entities were neither molecules nor imponderable fluids but differential elements of conductors. Ampère’s mathematical theory of electrodynamics consisted of two steps: First, he used carefully conceived null experiments to determine the fundamental infinitesimal force law between two conducting elements, and secondly he showed how he could deduce all known electromagnetic phenomena by integrating the force law over the conductors in question, interpreting magnets as coils. A more Laplacian mathematical theory was developed by Wilhelm Weber (1804–1891). He explained the electromagnetic phenomena as a result of actions at a distance between the particles of electric fluids. However, his force law did not only depend on the distance between the electric particles but also on their velocities and acceleration. Both Ampère and Weber supplied their mathematical theories with a theory of an ether that transmitted the actions between the electrical particles. In Weber’s case this was even necessitated by Faraday’s discovery that the dielectrica separating the currents altered their interaction. Yet despite such adjustments their theories were generally considered as action at a distance theories. Maxwell embarked on a much more fundamental attack on distance forces. When he learned of Ampère’s investigations he ‘greatly admired them’ as he wrote to William Thomson in 1854 (Larmor 1937, p. 
8) but he also admitted that ‘I was not satisfied with the form of the theory which treats of elementary currents & their reciprocal actions’ (Larmor 1937, p. 8). He much preferred Faraday’s approach: Now I have heard of you speak of ‘magnetic lines of force’ & Faraday seems to make great use of them, but others seem to prefer the notion of attractions of elements of currents directly. Now I thought that as every current generated magnetic lines & was acted on in a manner determined by the lines thro wh: it passed that something might be done by considering ‘magnetic polarisation’ as a property of a ‘magnetic field’ or space and developing the geometrical ideas according to this view. (Maxwell to Thomson, November 13, 1854 (Larmor 1937, p. 8))
So already here Maxwell sketched a plan of how to avoid action at a distance by developing a mathematical description of Faraday’s lines of force. The idea that apparent distance forces could be explained as the result of actions in a ‘field’ seems to have been helped along by his study of electrostatics: I got up the fundamental principles of electricity of tension easily enough. I was greatly aided by the analogy of the conduction of heat wh: I believe is your invention (Maxwell to Thomson, November 13, 1854 (Larmor 1937, p. 7))
Indeed Thomson’s analogy (see Chapter 3) showed that electrostatic forces, that were usually thought of as distance forces, were in fact mathematically analogous to the effect of heat conduction that was usually considered to be a result of heat being diffused locally from one part of space (or of the temperature field) to the contiguous
parts of space (or the field) (see (Maxwell 1864b, p. 157)). Maxwell seems to have liked and understood this last field description better and to have conceived the idea of finding an equally mathematically precise field theory of electromagnetism based on Faraday’s unmathematical concepts. This became a life-long project. It is not surprising that when Maxwell some months later learned about Weber’s law he did not like it: I am reading Weber’s Elektrodynamische Maasbestimmungen which I have heard you speak of. I have been examining his mode of connecting electrostatics with electrodynamics, induction &c & I confess I like it not at first. (Maxwell to Thomson, May 15, 1855 (Larmor 1937, p. 11))
Also in his public writings Maxwell explicitly presented his theory as an alternative to Weber’s that avoided the latter’s forces acting at a distance. Thus, in his Dynamical Theory of the Electromagnetic Field he began with a brief account of Weber’s and Carl Neumann’s theories and their successful account of electromagnetic phenomena. However, he continued: The mechanical difficulties, however, which are involved in the assumption of particles acting at a distance with forces which depend on their velocities are such as to prevent me from considering this theory as an ultimate one though it may have been, and may yet be useful in leading to the coordination of phenomena1 . I have therefore preferred to seek an explanation of the fact in another direction, by supposing them to be produced by actions which go on in the surrounding medium as well as in the excited bodies, and endeavoring to explain the action between distant bodies without assuming the existence of forces capable of acting directly at sensible distances. (Maxwell 1864a, p. 527)
Also, in his Treatise he explained that: I have chosen this method [his own as opposed to Weber’s] because I wish to shew that there are other ways of viewing the phenomena which appear to be more satisfactory . . . than those which proceed on the hypothesis of direct action at a distance. (Maxwell 1873b, §552)
In 1873 Maxwell further gave a popular lecture to the Royal Institution On Action at a distance. Rather than ‘dressing up in philosophical language the loose opinions of men who [have] no knowledge of the facts [of nature]’ (Maxwell 1873a, p. 315) he embarked on a partly historical discussion of the pros and cons of distance forces, concluding with his own ideas of the electromagnetic ether that according to Maxwell ‘must not be regarded as mere mathematical abstractions’: But the medium, in virtue of the very same elasticity by which it is able to transmit the undulations of light, is also able to act as a spring. When properly wound up, it exerts a tension, different from the magnetic tension, by which it draws oppositely electrified bodies together, produces effects through the telegraph wires, and when of sufficient intensity, leads to the rupture and explosion called lightning. These are some of the already discovered properties of that which has often been called vacuum or nothing at all. They enable us to resolve several kinds of action at a distance into actions between contiguous parts of a continuous substance. Whether this resolution is of the nature of explication or complication, I must leave to the metaphysicians. (Maxwell 1873a, p. 323)
1 Maxwell probably referred to Helmholtz’s ultimately refuted claim that velocity-dependent force laws would contradict energy conservation.
Later, Maxwell expressed similar ideas in his papers on Attraction and Ether in the Encyclopedia Britannica (see (Maxwell 1965, vol. 2, pp. 485–491 and 763–775)). When Maxwell believed that electromagnetism was basically a mechanical phenomenon, the mechanics he had in mind was therefore a mechanics without distance forces. While ideally wanting to reduce everything to ‘matter in motion’ he often had recourse to concepts such as strain, tension and potential energy, apparently assuming them to be results of contact forces alone. In his kinetic theory of gases he likewise tried to reduce all interactions to elastic collisions between the molecules of the gas, but at one point when his predictions did not fit with experiments he assumed that the molecules interacted through a rapidly decaying action at a distance inversely proportional to the fifth power of the distance between the molecules. On the Continent there also appeared revisions of and alternatives to Weber’s electrodynamics. Thus Bernhard Riemann (Riemann 1867a) assumed that the electric action was not propagated instantaneously but with the speed of light, and the Danish physicist Ludwig Lorenz argued that light was electromagnetic propagation. Carl Neumann rephrased Weber’s theory into an ‘energetic’ theory in which the phenomena are explained as the result of potentials that are propagated with the velocity of light. He worked out these developments of Riemann’s theory in an attempt to save Weber’s theory from Helmholtz’s criticism. During the 1870s, Helmholtz developed his own alternative electromagnetic theory. Rejecting velocity-dependent forces, Helmholtz built his theory on interaction potentials through which any infinitesimal part of the electromagnetic system will interact with any other part ((Buchwald 1994, appendix 16) and (Wise 1981)). 
In contrast to Carl Neumann’s potentials, Helmholtz’s potentials depended only on the macroscopic states of the interacting systems and they were not propagated with finite velocity. In fact, Helmholtz suggested a whole family of theories depending on a parameter k entering into the expression of the potential. One value of the parameter would yield results similar to Weber’s theory (or C. Neumann’s version of it), another value results similar to the theory of Franz Neumann (Carl’s father), and a third value would simulate Maxwell’s theory. Helmholtz was happy that he could thus rephrase Maxwell’s theory within his own potential theory: Thus it appears from these investigations that the strange analogy between the motion of electricity in a dielectricum and those of light does not depend on the special form of Maxwell’s hypothesis; it turns up in an essentially similar way when we stick to the older view about the electric action at a distance. (Helmholtz 1870b, p. 558)
However, the physical content of Helmholtz’s version and that of Maxwell’s own version are very different. In Helmholtz’s version it is the direct interaction of the spatially separated pieces of laboratory equipment that creates the polarization. In Maxwell’s theory the electric and magnetic fields are the primary objects, currents and charges being only epiphenomena due to discontinuities in the field.
Hertz learned Maxwell’s theory first in its Helmholtzian version; only later did he convert to Maxwell’s own version and begin to consider Helmholtz’s reformulations in terms of distance actions to be essentially misconceived (see Chapter 6).
4.3 Rejections of atomism Actions at a distance were often coupled with various hypotheses about the atomic or molecular constitution of matter. The strongest arguments in favor of atomism came from chemistry. Around 1800 it had been shown that elements often combine in constant proportions (Van Melsen 1952, pp. 131–148). Dalton and his followers saw this as evidence of the atomic nature of matter, whereas other more positivistic chemists did not draw that conclusion. All through the nineteenth century there was a tension in both physics and chemistry between atomic and molecular theories on the one hand and non-atomic and non-molecular theories on the other hand. Though insights into the molecular realm were obtained through chemical analysis, spectral analysis and the kinetic theory of gases, many physicists and chemists cautioned that our knowledge of this realm was too fragmentary to base sound theories upon it. In fact, different areas of physics and chemistry often led to apparently inconsistent assumptions about atoms or molecules (Harman 1982, p. 134). For that reason Helmholtz only referred to the molecular level of description when the investigated phenomenon required it, as in the case of electrolysis. Even Weber originally phrased his force law as a law of interacting small volume elements of electricity and only later developed a molecular theory. British physicists developed their own approach to the problems of atomic physics. Instead of considering atoms as the fundamental building blocks of matter they considered atoms to be structures in a more fundamental substance: the ether. This approach had been opened up by Helmholtz (Helmholtz 1858). He showed that if a vortex was introduced into a non-viscid fluid, the vortex would, during the course of time, change place and shape but it would never disappear. 
William Thomson (Thomson 1867), (Thomson 1869) further showed that such vortices would interact as though they influenced each other with forces acting at a distance, a result that Tait could beautifully illustrate using smoke rings. Therefore Thomson suggested that what we call material particles are nothing but vortices in a fluid ether. Vibrations of the vortex rings were supposed to account for spectral lines, and the great variety of spectra was thought to be a result of the knottedness of the rings (this was the beginning of knot theory). Although the theory yielded only qualitative results at best, it was, as late as the late 1880s, widely considered the most promising potential explanation of all natural phenomena. (See, e.g. (Love 1887).) According to this conception of nature the ether became the carrier not only of light and electromagnetism but also of ordinary matter, all types of interactions and chemical phenomena, in fact of everything going on in the physical world. In order for that to be possible, the ether had to possess many different rather incompatible properties. Over time, William Thomson suggested several different models of the
ether but in the end none of them could account for all the effects they were supposed to explain (Smith and Wise 1989, Chapter 12). An even more radical reduction was suggested by William Kingdon Clifford in a short programmatic paper read in 1870 to the Cambridge Philosophical Society and published in (Clifford 1876). He claimed that physical space is a Riemannian manifold whose curvature varies not only from place to place but also in time. He further claimed That this variation of the curvature of space is what really happens in that phenomenon which we call the motion of matter, whether ponderable or etherial. That [moreover] in the physical world nothing else takes place but this variation, subject (possibly) to the law on continuity. (Clifford 1876)
The essential idea of Clifford’s world view, that he developed further in his popular book The Common Sense of the Exact Sciences (Clifford 1885), seems to be that instead of reducing all physical phenomena including ordinary mechanics to mechanics of an ether one should reduce them to the varying curvature of space. This daring precursor of Einstein’s general theory of relativity had little impact and was not carried out in any quantitative detail. In the introduction to his Mechanics, Hertz briefly mentioned Lord Kelvin’s vortex atoms, but only as one among many other instances where hidden masses, in this case the ether, had been used by earlier writers. His own image of mechanics did not depict ordinary masses as structures in a medium of hidden masses but more traditionally as fundamental primary objects that interact with and through a hidden system. He was not an opponent of atomism. Indeed in his Kiel Lectures (Hertz 1999) he listed all the experimental evidence for the existence of atoms and explained which conclusions one could draw from these experiments about the size of the atoms. Moreover, he speculated about their possible structure (Chapters 5 and 6). In the introduction to his Principles of Mechanics he repeated this commitment to atomism: ‘It is true that we are now convinced that ponderable matter consists of atoms; and we have definite notions of the magnitude of these atoms and of their motions in certain cases’ (Hertz 1894, p. 21/17&18). He also suggested that ‘our conception of atoms is in itself an important and interesting object for further investigation’ (Hertz 1894, p. 21/18) (he probably thought of experimental investigation). Yet he emphasized that ‘the form of the atoms, their connection, their motion in most cases – all these are entirely hidden from us’ and therefore an atomistic image ‘is in no wise specially fit to serve as a known and secure foundation for mathematical theories.’ (Hertz 1894, 21/18). 
He further recalled that: To an investigator like Gustav Kirchhoff, who was accustomed to rigid reasoning, it almost gave pain to see atoms and their vibrations wilfully stuck in the middle of a theoretical deduction2 . The arbitrarily assumed properties of the atoms may not affect the final result. The result may be correct. Nevertheless the details of the deduction are in great part presumably false; the deduction is only in appearance a proof. (Hertz 1894, p. 21/18)
Hertz did not base his Mechanics on an atomistic image of nature. The lack of experimental knowledge of atoms convinced him that any atomistic image would have to be based on arbitrary hypotheses that future experiments would probably falsify. Moreover, he conceived of his Mechanics as a foundation for all of physics, including atoms. Thus it clearly could not be based on atomism. Finally, Hertz closely linked atomism with distance forces: We have already had occasion to remark that in tracing back phenomena to force we are compelled to turn our attention continually to atoms and molecules. (Hertz 1894, p. 21/17)
2 He might have thought of Cauchy’s optics.
This rejection of atomism as a basis for his mechanics therefore formed a part of his rejection of the Newtonian–Laplacian image of mechanics.
4.4 Energetics An alternative to an atomistic, force-based mechanical conception of physics gradually took shape during the last part of the nineteenth century. The alternative was called energetics by W.J. Macquorn Rankine in 1855. As suggested by its name, this world view insisted that all physical phenomena could and should be derived as consequences of the transformation of energy. It was based on the idea that in the world there is, in addition to matter, another entity, energy, that is conserved. The principle of the conservation of energy (or of ‘force’ as it was called in the beginning) was formulated around 1850 by Julius Robert Mayer (1842), Helmholtz (1847), Rankine (1850, 1853), W. Thomson (1851), and others. For an adherent to the energetic program, physics (and chemistry and probably all material phenomena) can be described by first assigning energies to all physical (chemical . . .) states and second by finding laws that describe the conversion of energy. For example, in mechanics one would need to assign kinetic (unproblematic) as well as potential energy to all possible states of the system and determine a law that would describe the motion of the system in terms of the energy functions. This law could be Lagrange’s equations or Hamilton’s principle. Like the mechanistic philosophy of nature, energeticism came in many varieties. In a weak sense it simply meant using energy as a basic concept as Helmholtz did in his early works on the conservation of energy. Among British physicists such as Thomson, Tait, and Maxwell energeticism also provided the possibility of formulating physics without the Laplacian ‘hypothetical supposition of atoms, forces, etc. hitherto existing’3. They considered energetism to be in harmony with their mechanistic program of matter in motion, and did not mind applying energy considerations on the microscopic level. For example, they hypothesized various mechanical models in order to illustrate or explain how energy is stored. 
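The energetic recipe for mechanics described above (assign kinetic and potential energies to the states, then state a law of motion in terms of those energy functions) can be sketched in modern notation. The symbols below are assumptions made for this illustration only and are not taken from Hertz's text: q_i are generalized coordinates, T the kinetic energy, U the potential energy, and L = T - U the Lagrangian.

```latex
% Assumed notation (illustrative, not Hertz's own):
%   q_i  generalized coordinates, i = 1, ..., n
%   T    kinetic energy,  U  potential energy,  L = T - U
\[
  L(q,\dot q) = T(q,\dot q) - U(q)
\]
% Hamilton's principle: the actual motion between fixed end states
% makes the action integral stationary.
\[
  \delta \int_{t_0}^{t_1} L \, dt = 0
\]
% Equivalent differential form: Lagrange's equations.
\[
  \frac{d}{dt}\frac{\partial L}{\partial \dot q_i}
  - \frac{\partial L}{\partial q_i} = 0,
  \qquad i = 1,\dots,n .
\]
% Worked case: one particle of mass m on a line,
%   T = (1/2) m \dot x^2,  U = U(x),
% for which Lagrange's equation reduces to
\[
  m\ddot x = -\frac{dU}{dx} .
\]
```

In the one-particle case the right-hand side is what a force-based presentation would call the force. In the energetic ordering it appears only as a quantity derived from the energy function, which is the sense in which energy rather than force is taken as the basic concept.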
Many Continental physicists such as Franz Neumann, Kirchhoff, and Planck, on the other hand, emphasized the macroscopic aspect of energy conservation. They rejected atomic as well as other hypothetical hidden mechanical models and operated on the macroscopic level of observable measurable quantities. In this form energeticism strove toward a positivistic ideal and was appreciated as such by Mach and Duhem.
3 Rankine 1855 quoted from (Heidelberger 1993, p. 472).
The energetic program in its pure form was formulated by Georg Ferdinand Helm in 1887 (Helm 1887)4 , and in 1893 Friedrich Wilhelm Ostwald made it the basis of his doctrine of chemical affinity. Helm and Ostwald became the militant advocates of energetics and at the 1895 Naturforscherversammlung in Lübeck, their addresses provoked a great deal of uproar among the proponents of the kinematic mechanistic philosophy. The heated discussions between the two proponents of the energetic philosophy and the defenders of the mechanistic philosophy, Boltzmann and Carl Neumann, continued in the journals after the congress. When Hertz wrote his Mechanics the very sharp distinctions between the two philosophies had not yet been drawn up. However, Hertz was keenly aware of the energetic alternative to ordinary force- and atom-based physics, and initially contemplated writing a work on mechanics based on the energetic philosophy (see Chapter 6). In the end, he abandoned this plan, but he left the energetic image as one of the three alternative images that he evaluated in the introduction. Hertz learned about the energetic approach from his teacher Helmholtz. Indeed, in his works from around 1880, Helmholtz expressed the interactions between small parts of a physical system in terms of interaction energies (see (Helmholtz 1886)). He showed how the principle of least action could be used together with such expressions of the interaction energies as a basis not only for mechanics but also for reversible thermodynamics and electromagnetism. The latter subject was elaborated in (Helmholtz 1892). As mentioned above, Helmholtz continued to believe that a reduction to central forces was, in principle, possible although he lost faith in his earlier metaphysical argumentation and no longer believed that the determination of these forces was within immediate reach. 
So for him the energetic program was a way to show that the physical phenomena were in accordance with the principles of mechanics without the need to construct a mechanical model. More specifically his papers on monocyclic systems established the possibility of a mechanistic reduction of the theory of heat without producing a detailed model of that phenomenon. The energetism that Hertz described in the introduction to his Mechanics was of a purer kind. He realized that in order to present a self-contained description of nature based on the concept of energy it was necessary to free it from any previous notion of force or hidden motion. While praising the positivist macroscopic nature of such an image of mechanics he remarked that it ‘has never been portrayed in all its details. So far as I know, there is no textbook of mechanics which from the start teaches the subject from the standpoint of energy, and introduces the idea of energy before the idea of force. Perhaps there has never yet been a lecture on mechanics prepared according to this plan.’ (Hertz 1894, p. 17/15). The only textbook before 1890 that made energy a basic concept was Thomson and Tait’s Treatise on Natural Philosophy, also known as ‘T and T’’ (Thomson and Tait, 1879), with which Hertz was well acquainted. In earlier drafts Thomson and Tait had tried to base the whole book on the concept of energy (Smith and Wise 1989) but in the published version they introduced forces before energy. In a sense the weak type of mechanistic philosophy as described in the preceding chapter corresponds to the energetic program. Hertz borrowed from the energetic program a dissatisfaction with atomic hypotheses, in particular interatomic forces, as a basis for mechanics and physics. However, he borrowed much more from the kinematic, mechanistic program, namely the idea that mechanical (physical) phenomena must be explained by the motion of a hidden system. In that sense his image was a logical purification of the British mechanistic ideas and stood in sharp contrast to the German type of macroscopic energetism.
4 See also (Helm 1890) and his book (Helm 1898).
5 A biographical survey
The aim of this chapter is to supply a brief biography of Heinrich Hertz with particular emphasis on those elements that relate to his work on the Principles of Mechanics1 .
1 For a more comprehensive account the reader is referred to Fölsing’s very interesting biography (Fölsing 1997). The reader who prefers an English text is referred to (Mulligan 1994) and to a fine short biography in the Dictionary of Scientific Biography (McCormmach 1972). Buchwald has analysed Hertz’s scientific work before 1889 in his penetrating book (Buchwald 1994). The disagreements between Buchwald and Fölsing about Hertz’s early commitment to Maxwellianism are discussed in (Nordmann 2000).
5.1 Childhood and student years (1857–1883)
Born in 1857, Heinrich Hertz grew up in an intellectually inclined upper-middle-class Hamburg home. His father, Gustav F. Hertz, descended from a respected Jewish merchant family, but was baptized in the Lutheran church as a child. He was himself an attorney and a member of the Hamburg senate. As the oldest of five siblings, Heinrich Hertz was brought up by his mother Anna Elizabeth (born Pfefferkorn) to become something special. Already as a boy he showed exceptional intellectual as well as artistic and mechanical talents. He was always number one in his class and the director of the technical school, where he took extra geometry lessons on Sundays, recommended to his father that he ought to become a mathematician. The young Hertz was very fond of mathematics and pitied his mother because she could not share his pleasure with him. Yet, he thought that ‘Mathematics is such an abstract science and I really wish to live with people.’ (Fölsing 1997, p. 29). All through his life he continued to waver between a love for mathematics and a skepticism towards its formal features and its possible lack of applicability. Like his father, Heinrich Hertz soon became fluent in ancient Greek and Latin and also acquired knowledge of Hebrew and Arabic, the latter with such success that the professor who taught him privately tried to persuade Heinrich’s father that he should become an orientalist. He had never met such a talent before (Fölsing 1997, p. 41). From an early age Heinrich Hertz also showed that he was good with his hands. He made excellent drawings and sculptures and he enjoyed working at his carpentry bench and later at his lathe. He produced furniture, objects of art, and scientific instruments with which he conducted physical and chemical experiments. In an attempt to combine his theoretical and practical interests, Hertz decided to become a constructional engineer. After his Abitur in 1875, he moved to Frankfurt where he passed his year of practical experience in a construction bureau. He did not find the work rewarding, but in his abundant spare time he continued his theoretical studies, in particular in mathematics. The following spring he began his engineering education at the Dresden Polytechnikum. He particularly enjoyed Leo Koenigsberger’s lectures on integral calculus and Fritz Schultze’s lectures on the history of philosophy. In connection with the latter, Hertz studied various works by Immanuel Kant, in particular his Kritik der reinen Vernunft, a work that continued to influence him later when he wrote his Principles of Mechanics. He also tried to follow Koenigsberger’s lectures on analytical mechanics but he found them too difficult. After only one semester in Dresden Hertz did his one-year military service with the railroad troops in Berlin, after which he continued to serve as an officer of the reserve. Until his last fatal illness he periodically took part in military training and on festive occasions he dressed up in his lieutenant’s uniform. In 1877 Hertz resumed his engineering studies at the Polytechnikum in München, but after one month he decided that he would rather study physics. The physics professor Philipp von Jolly recommended that Hertz learn mathematics and mechanics by reading the great masters. In particular Lagrange’s Mécanique Analytique made Hertz reflect on the principles of mechanics and the concepts of force, time, space and motion (Fölsing 1997, p. 69). These reflections later matured in his last published work.
He also studied Laplace’s Mécanique Céleste, Montucla’s history of mathematics and original papers by Newton and Leibniz. Moreover, he enjoyed Alfred Pringsheim’s lectures on elliptic functions. Yet, he reported back to his parents:
The entire new mathematics (from about 1830 on) is, I think, of no great value to the physicist, however beautiful it may be intrinsically, for I find it so abstract, at least in parts, that it no longer has anything in common with reality; for instance, the non-Euclidean geometry, which is based on the assumption that the sum of the angles in a triangle need not be always equal to 2 right angles, or the geometry dealing with space of four, five, or more dimensions etc. Even the elliptical functions are, I think, of no practical value. But perhaps I am mistaken. (Hertz 1977, pp. 71–72)
Even though Hertz later decided to base his mechanics on a geometry of systems of points analogous to Riemannian high-dimensional non-Euclidean geometry his Kantian convictions about the a-priori nature of geometry made him remain skeptical about these modern ideas of mathematics. After one year in München Hertz thought he had learned enough mathematics, but he was unhappy with the München standards in his primary discipline, physics. Von Jolly’s lectures on experimental physics did not impress him and when he was allowed into the laboratory toward the end of the year, he found the level too elementary. Therefore, in 1878 he decided to move to Berlin to continue his studies at the university where Hermann von Helmholtz had established the leading physics laboratory in Germany, or perhaps in the whole
world. Hertz did not learn much from Helmholtz’s lectures but when he was allowed to work in Helmholtz’s laboratory a new world of front-line research was revealed to him. Upon arrival in Berlin, Hertz had by accident seen the advertisement of a prize competition formulated by Helmholtz. The problem was to investigate if the conduction of currents were accompanied by transport of inertial mass. Within a year Hertz had experimentally verified that if currents have inertia, the mass must be smaller than a very small specified quantity. This work won him the prize of the University, his first publication, and, perhaps most importantly, the respect of Helmholtz, who had followed Hertz’s laboratory work with great interest. In fact Hertz had come to the conclusion that Helmholtz had hoped for. Much of Helmholtz’s efforts at this period of time were focused on trying to disprove Weber’s action at a distance theory and to prove his own electromagnetic theory based on interaction potentials. According to Weber, electric currents are the result of displacement of electric particles that must necessarily have mass if instabilities should be avoided. According to Helmholtz (and Maxwell) this is not necessarily the case. By establishing a low upper boundary for the mass that may be involved, Hertz had given credence to Helmholtz rather than Weber. Some of his later works on electromagnetism were likewise aimed at falsifying Weberian electromagnetic doctrines and supporting Helmholtzian or eventually Maxwellian electromagnetic theory. While studying in Berlin, Hertz also followed theoretical courses. For the first semester he followed Kirchhoff’s lectures on electricity and magnetism and of most interest for us, the lectures by Carl Wilhelm Borchardt on analytical mechanics. The careful notes he took from these lectures later ‘proved useful’ (Hertz 1894, Preface) when he worked on his Principles of Mechanics. 
Borchardt’s course was Hertz’s second course on mechanics and soon he was presented with two other approaches to this subject. In the summer of 1879 Hertz followed Kirchhoff’s Mechanics of Rigid and Fluid Bodies. However, he did not find the lectures particularly novel because he had already acquainted himself with Kirchhoff’s untraditional approach to mechanics by reading his book published three years earlier. Yet Kirchhoff’s way of avoiding force as a fundamental concept probably gave him something to think about. Finally, in order to offer a mathematical subject for his doctoral exam, he followed Ernst Kummer’s lectures on ‘Analytical Mechanics.’ Thus he spoke from considerable personal experience when, in the introduction to the Mechanics, he mentioned the experience that
it is exceedingly difficult to expound to thoughtful hearers the very introduction to mechanics without being occasionally embarrassed, without feeling tempted now and again to apologize, without wishing to get as quickly as possible over the rudiments, and on to examples which speak for themselves. (Hertz 1894, p. 8/7)
It was this uneasiness that Hertz wanted to eliminate with his Principles of Mechanics. Hertz also followed Helmholtz’s lectures on acoustics and Kirchhoff’s lectures on optics and finished his studies with a theoretical dissertation on the currents induced in a rotating conducting or dielectric sphere by a magnet.2
2 This effect was later called the skin effect.
Helmholtz had tried to tempt Hertz to work instead on a second prize problem that he had formulated particularly with Hertz in mind. Like the first prize problem, it was an experimental test aimed at distinguishing between different theories of electromagnetism. The problem was to show that the displacement current or changing polarization exists and has measurable effects like ordinary currents in conductors. According to Maxwell and Helmholtz this ought to be the case, but according to the old Ampère theory passive dielectrics should not display such effects. However, Hertz’s theoretical investigation of the problem suggested to him that only two types of experiments could decide the matter: the investigation of induction in rotating dielectrics, the subject that he chose to treat theoretically in greater generality in his thesis, and the investigation of oscillation currents in induction spirals (Buchwald 1994, p. 92). The latter was the procedure that eventually led Hertz to the solution of the problem and to the discovery of electromagnetic waves. However, in 1879 Hertz thought that the chance of a successful outcome was too small to justify devoting three years of his life to such experiments. This was the time Helmholtz had estimated it would take to settle the prize problem. Although Helmholtz probably did not read Hertz’s careful theoretical report about the small chances of success of these experiments, he accepted Hertz’s decision to concentrate on other experiments. His consent was essential, for after Hertz had earned his doctoral degree in February of 1880 he was employed as one of Helmholtz’s assistants. During his three years in Helmholtz’s physics laboratory Hertz worked hard to fulfil his dream (and that of his parents) of achieving something extraordinary. The way to do that in Helmholtz’s laboratory was, in Buchwald’s words, to create a new effect.
This search for novelty and a somewhat qualitative enterprise contrasted starkly with the minute testing of existing theories and measurement of constants that went on in Weber’s laboratory in Göttingen. It was a style of experimental practice that Hertz adopted fully from Helmholtz’s laboratory. In between his experimental work Hertz also did theoretical investigations. This tendency of going back and forth between experimental and theoretical work was characteristic of Hertz all through his life. The research that Hertz did as an assistant in Berlin resulted in 11 published papers on a wide variety of subjects. He continued the work on the first prize problem as well as on rotating spherical conductors, and he developed a theory of the deformation of two elastic colliding balls from which he deduced a new quantitative absolute measure of hardness that he hoped would be of value to mineralogists. As has been pointed out by Buchwald (Buchwald 1994, Chapter 9), Kirchhoff severely criticized Hertz’s mathematical arguments when he refereed the paper for Borchardt’s Journal für die reine und angewandte Mathematik and, to Hertz’s regret, rewrote a large section of the derivation. From this episode Buchwald has concluded that Hertz was somewhat impatient with the rigorous style of mathematics characteristic of the Berlin mathematical community and sanctioned by Kirchhoff. This impatience should, however, be contrasted with Hertz’s later insistence on absolute conceptual clarity in his image of mechanics. Hertz also wrote on evaporation, especially of mercury, on the tides, on a new dynamometer, on floating elastic plates and on cathode rays. In the last paper he
concluded that cathode rays ‘are electrically indifferent’ and due to disturbances in the ether. After J.J. Thomson’s experimental verification of the existence of the electron in 1897, this conclusion was soon discovered to be incorrect, and cathode rays were thereafter explained as a beam of electrons. Despite the variety of Hertz’s activities during these years, Buchwald has explained most of them as an attempt to carry out Helmholtz’s research program investigating the bipartite interaction potentials characteristic of two objects in given states. Hertz, however, did not see this unity and explained that his works ‘emerged not in consistent pursuit of a larger goal, but as a result of occasional impulses that I received in large quantities from my teachers and collaborators.’ (Fölsing 1997, p. 178).
5.2 Privat Dozent in Kiel (1883–1885)
In 1883 a position as Privat Dozent in theoretical physics became vacant at the small university in Kiel. Asked for his opinion, Karl Weierstrass advised the influential Prussian ministerial adviser Friedrich Althoff to give the job to Hertz.
… I am pleased that I can mention a young physicist who is willing to habilitate in Kiel and who has the very best recommendations of competent experts (in particular Prof. G. Kirchhoff). It is Dr. Hertz, assistant at the Physical Institute of this town. He is, what one at present calls a theoretical physicist, i.e. in addition to a thorough knowledge of the physical facts, he has a mathematical education which is sufficient to explain the natural phenomena, making use of the means offered by mathematics in its present state. (Weierstrass letter to Althoff, February 28, 1883)3
Heinrich Hertz accepted the position and habilitated in Kiel with his papers on impact. At the University of Kiel there was no physical laboratory, so during his two years at this place, Hertz was forced to work exclusively on theoretical matters. He published on the distribution of pressure in an elastic cylinder, on meteorology, and on Maxwell’s equations. In this last-mentioned strange paper he argued that if one accepts that there exists only one type of electric force and one type of magnetic force, then the continental theories of electromagnetism are incomplete, and if completed they lead to Maxwell’s equations. Indeed, from the principle of the unity of electric force, Hertz deduced that changing or moving magnets ought to act on each other with electric forces. In the continental theories this action is absent, but if it is built into them, an infinite series of correction terms will lead to electric and magnetic vector potentials that propagate with a finite velocity equal to the velocity of light. He admitted that he had not shown that Maxwell’s theory was the only valid theory, but concluded that it was the most satisfactory of the known theories.4 When he later provided experimental evidence for the superior merits of Maxwell’s theory over its continental rivals, he never referred to his earlier attempt at a purely theoretical argument.
3 Staatsarchiv Berlin Dahlem, Rep 92, Abt.B, Nr.194, Ban 13. I thank Dr. Reinhard Siegmund Schultze for sending me a copy of this letter.
4 For a detailed discussion of this paper see (Buchwald 1994, pp. 177–214). See also (D’Agostino 1998).
While in Kiel Hertz lectured, among other things, on the mechanical theory of heat and on optics and he worked intensively on hydrodynamics. Moreover, in the summer semester of 1884, he gave a well-attended public lecture entitled ‘Modern Ideas on the Constitution of Matter.’ He prepared a detailed manuscript version of these lectures, apparently meant for publication. However, it was not printed during his lifetime nor immediately after his death, even though Max Planck urgently requested Hertz’s widow to do so. The manuscript has therefore remained unknown until Fölsing recently dug it up and published it (Hertz 1999). It gives a very well written semipopular but highly informed and original impression of the contemporary ideas about fundamental questions in physics and it is of particular interest for our present purpose, because it reveals the origins of some of the important elements of Hertz’s Mechanics. For example, Hertz developed a preliminary version of his later image theory of natural science and secondly he explained his ideas about the concept of matter. However, the aim of the lectures is quite different from that of his later Principles of Mechanics. Where the latter lays down an axiomatic framework that can serve as a foundation for all further study of nature, the lecture discusses what a-priori considerations and in particular, the latest experimental evidence tell us about the constitution of matter. The lecture course is divided into two parts: the first about the ether and the second about ponderable matter. In the introduction he announced a third part about the interaction between ether and ponderable matter, but there is no trace of this part in the manuscript. His aim in the first part is to argue that there exists an all-pervasive ether that is the carrier of light (transversal waves) and cathode rays (longitudinal waves) as well as the propagator of the apparent actions at a distance. 
In particular, Hertz argued elegantly that electromagnetic actions (and probably gravitation) did in fact not act at a distance but were contiguous actions through the ether. Some of his arguments against actions at a distance later found their way into the introduction of the Mechanics and some of his arguments for preferring Maxwell’s field theories foreshadow his later experimental and theoretical work on Maxwell’s theory. In particular, he argued strongly for Maxwell’s electromagnetic theory of light. In order to convey an intuitive image of propagation of electromagnetic radiation (Fortpflanzung), he considered the successive transfer by induction of electric oscillations from one oscillator to a whole series of identical oscillators. As noted by Fölsing (Fölsing 1997, p. 219), these oscillators, consisting of two conducting spheres connected by a conducting wire, resemble the oscillators he later used to produce radio waves. However, as pointed out by Buchwald (Buchwald 1994), the very roundabout and entangled way that gradually led Hertz to develop these oscillators in the laboratory, seems to owe little to his thought experiment in Kiel. It is interesting to note that Hertz did not discuss Maxwell’s mechanistic explanations of the electric field, although that would have given some idea about how the ether could carry the electromagnetic actions. In the second part Hertz related all the contemporary arguments for the atomistic structure of ponderable matter drawn from electrolysis, mechanics, optics and chemistry. In particular, Hertz showed that several different experiments suggested that an atom has a size around 10⁻⁷–10⁻⁶ mm. He did not consider atoms as
‘elementary’ in the sense of being necessarily indestructible, but considered them as having an inner structure (like 20–100 balls connected by rubber bands). He imagined that one would in the future be able to analyse the structure of the atoms by deciphering their intricate emission and absorption spectra. Bohr’s explanation of the Balmer series of hydrogen proved him right to some extent, but Hertz failed to foresee the equally important role later played by scattering experiments.
5.3 Professor in Karlsruhe (1885–1889)
In the beginning of 1885 Hertz moved on to a professorship at the Polytechnic School in Karlsruhe. Though a Polytechnic School was not as prestigious a place as a University, Hertz chose this position over a professorship in Kiel because he wanted to take up experimental work again, and Karlsruhe offered good possibilities. However, his first year in Karlsruhe was not productive. He had new heavy administrative duties, and had problems with his female acquaintances. Already in Kiel he had had an affair with a lady and trouble with another woman, and when he came to Karlsruhe he felt so lonely and desperate that he asked one of his colleagues for his daughter’s hand in marriage, apparently without knowing the woman. Ten days later he was engaged to be married but after three more days he regretted his rash action and called the engagement off. This created a scandal in the small town and upset Hertz so much that he had to ask for one month’s leave after the summer break to complete a water cure. Upon his return he continued to be depressed because he had not found a wife and embarrassed by his actions; the depression lasted until the following year when he met Elizabeth Dold, the daughter of a teacher of practical geometry at the Polytechnic. Hertz was engaged to Elizabeth in April of 1886 and married her in July of the same year. The marriage was happy and Hertz became a good and caring father for their two daughters. After almost two years of unproductive life, Hertz finally, in the fall of 1886, began a series of experiments that eventually made his name immortal. It started when he performed a routine classroom experiment aimed at demonstrating induction from one so-called Knochenhauer or Riess coil to another. Hertz observed that sparks were emitted by the first termini of the first coil when it was connected to a Leiden jar and he concluded that he had produced fast electric oscillations in the coil.
This reminded him of Helmholtz’s old prize problem and his own argument that it might be solved using such oscillating currents. The subsequent two years of intense experimentation, which resulted in 12 papers, can from a technical standpoint be described as a gradual development of the first Riess coil into an efficient oscillator or radio antenna, and an equally gradual development of the second coil into a resonator and further into an electric probe. Both were gradually tuned to higher and higher frequencies, so as to produce and detect waves with wavelengths short enough to be measured in a laboratory. On the more theoretical level, Hertz gradually changed his interpretation of the effects. At first he considered them to be the result of electrostatic and inductive forces but gradually he arrived at a full-fledged understanding that the phenomena were
a result of electromagnetic waves. At the same time he also changed his theoretical standpoint from being a Helmholtzian to being a Maxwellian (Buchwald 1994; D’Agostino 1971, 1975). In terms of results the experiments were highly productive. Not only did they lead to the desired answer to the prize question (though too late to win Hertz the prize), they also led to the discovery of the photoelectric effect and to the first artificially produced radio waves, whose behavior (such as reflection, refraction and polarization) Hertz showed to be similar to that of light. Hertz’s British colleagues, such as George Francis FitzGerald and Oliver Lodge, who had both been occupied with electromagnetic oscillations, considered Hertz’s results as the final proof of Maxwell’s theory. At first Hertz himself was more prudent. He argued that while his experiments showed that Maxwell’s theory was the best available, they could not prove that other theories were not equally good. One result even suggested that Maxwell’s theory had to be changed fundamentally. Indeed, in one of the essential steps toward the measurement of electric waves in ‘empty’ space, Hertz had measured the interference between electric waves in a conducting wire and the surrounding air. This experiment had made him conclude that the speed of the wire wave was only 2/3 of the speed of the wave in air, which is equal to the speed of light. Such a difference in velocity contradicted Maxwell’s theory and suggested to Hertz that he had detected an important effect that would eventually lead to a reformulation of the theory. However, after several of his colleagues, in particular E. Sarrasin in Genève, had been unable to detect the difference in wave velocities, Hertz finally came to the conviction that the effect was a strange artifact of the geometry of the lecture hall in which he had conducted the experiments.
Therefore in 1892, Hertz could state: ‘The object of these experiments (with electromagnetic waves) was to test the fundamental hypothesis of the Faraday–Maxwell theory, and the result of the experiments is to confirm the fundamental hypothesis of that theory’ (Hertz 1892, p. 21/20). Hertz’s experiments made him world famous. He received many visitors, became a member of several academies and received many important prizes. For example, he was awarded the Rumford Medal of the Royal Society and went to England to receive it. On this occasion he met many of the leading British physicists and electrical engineers. He also received a job offer from the University in Giessen, but through the intervention of Althoff, he chose the better position at the University of Bonn. He could also have chosen to become Kirchhoff’s successor as professor of theoretical physics in Berlin, but again he preferred an experimental position. In Bonn he was the successor of Rudolf Clausius whose house he bought and moved into with his family.
5.4 Professor in Bonn (1889–1894)
When Hertz began his job in Bonn in April of 1889, he had already written a Maxwellian account of the fields surrounding his oscillator and he had begun a general theoretical work on Maxwell’s theory. The latter work, which he completed after he had moved
to Bonn, contained an elegant, brief ‘axiomatic’ presentation of Maxwell’s theory or rather of its essential points as Hertz saw them (see below). Together with Oliver Heaviside’s earlier work from 1885, to which Hertz referred, it is usually quoted as the first presentation of Maxwell’s field theory as we now know it. Hertz’s subsequent paper on electrodynamics of moving bodies, on the other hand, was only of limited importance. The reason is that Hertz was forced to make an assumption about the motion of the ether, relative to a moving body. Fresnel, Armand H.L. Fizeau and Hendrik Antoon Lorentz (1886) had argued that the ether is stationary and thus moves relative to ponderable bodies moving through it, but Stokes, Albert A. Michelson and Edward W. Morley had argued that bodies drag the ether along with them so that there is no velocity difference between the moving body and the ether in and immediately surrounding it (Harman 1982, pp. 112–116). Even though Hertz had met Michelson in Berlin and had a deep interest in the subject (Fölsing 1997, p. 458), he did not even mention Michelson’s now famous experiment or any of the other arguments mentioned above. He simply recognized that the problem was still open and stated that he would assume a dragged ether, because this assumption was the simplest. Only two years later, however, Lorentz, on the basis of his electron theory, argued for a stationary ether. Moreover, he argued that the molecular forces were modified like electromagnetic forces when the molecule moves through the ether, which in turn implied that solid bodies would contract in the direction of their motion. This would explain the lacking effect of the Michelson–Morley experiment. As is well known, Einstein adopted the Lorentz contraction in his special theory of relativity (1905) but not its physical explanation. These developments made Hertz’s research on electrodynamics of moving bodies obsolete. 
Hertz never tried to pursue his electromagnetic research in a technological direction. However, others soon did. Less than two years after Hertz’s death, Guglielmo Marconi had refined Hertz’s laboratory instruments into devices with which he could send wireless telegraphy over a distance of two kilometers and in 1901 he could even send signals across the Atlantic. Hertz only turned to electromagnetism once more, namely in 1891, when Felix Meiner, the owner of J.A. Barth publishing house, who published the Annalen der Physik, asked him to publish a book containing his experimental and theoretical papers on electric waves. At this occasion, Hertz wrote a long introduction that was formed as a somewhat rationally reconstructed account of the events leading to his discoveries. It also contained some corrections to the papers and some philosophical remarks. The book was published in 1892. While finishing his paper on electrodynamics of moving bodies, he began to look around for other new effects that he could discover. However, although he conducted many new experiments during 1890, only one led to a publishable result, namely the significant observation that cathode rays can penetrate thin metal foils. He did not follow up this discovery either, but left his instruments to his new assistant, Philipp Lenard, who made important discoveries with them. On January 26 1891, Hertz was ‘worn out’ by the unsuccessful experiments and a week later he confided to his diary that he was ‘fed up with work in physics’
(Hertz 1977, p. 313). So when Lenard arrived on April 1, 1891, Hertz had turned to theoretical research in mechanics. The previous month Felix Klein had asked him if he would write a paper for the Encyclopädie der Mathematischen Wissenschaften on Physical Mechanics (Fölsing 1997, p. 474). Hertz replied that he would agree to write the article if he was allowed to concentrate on the fundamental questions.
In particular, [if Klein] thinks about the theory of energy in its widest sense . . . then I admit that for the next whole or half year I have set myself the task to penetrate as far as possible into these difficult things. Therefore, if I succeed to my own satisfaction, I would gladly make my work useful for others. (Hertz to Klein, March 25, 1891). (Fölsing 1997, p. 474)
Still he did ‘not know for sure if I will at all succeed in penetrating into the spirit and common sense of this subject.’ (Hertz to Klein, March 25, 1891, (Fölsing 1997, p. 474)). The project Hertz had undertaken turned out to last about three times longer than the one year he had originally planned. This was due to at least three factors: First, the project gradually became more ambitious. Instead of writing an Encyclopedia article about the concept of energy, Hertz ended up constructing a whole new formulation of the principles of mechanics that was eventually published as a separate monograph. Secondly, Hertz was prevented from working full time on his book, first by his teaching, his duties as the head of a research institute, and his duties as a physics celebrity, and afterwards by his fatal illness. Thirdly, Hertz found it more time consuming to write such a long, logically closely knit theoretical work than he had imagined. And when the formidable proportions of the work gradually occurred to him, he had already invested so much work in the project that he was intent on carrying it to the end. Yet, when the work was nearing completion, he admitted that ‘I often think I should not have begun it.’ (Hertz to Sarasin, May 19, 1893 in (Fölsing 1997, p. 500)). During the nearly three years Hertz worked on the book, he put off almost all other research and even informed his father, who wanted to take him on a holiday to Turkey, that ‘it would be impossible for me to go on a longer journey until the work that I now have in hand is complete and finished and in print; I would go all the more gladly afterwards.’ (Hertz to his parents, July 26, 1892, in (Hertz 1977)). During the last year the work almost haunted him. He seems to have realized that he might not be able to finish it, and the theoretical sedentary work seems to have increased his suffering. 
He even went so far as to try and blame his illness on the mechanics project; thus, when he had written the last word of his Prinzipien der Mechanik, he wrote to his parents: I am glad, because it was a great burden for me and I blame a large part of my last year’s infirmity on it: first because it may really be so, second because it gives me a sort of consolation to think so. In any case I feel infinitely better when I am up and about and keeping busy with my hands than when I sit at my desk or squat in my room, lost in thought. So I promise myself that I shall master my suffering more easily now. Today I even went to the laboratory and started to make some preparations for working there. Long ago I made a solemn vow not to enter on theoretical work for a long time to come. But this one had to be finished. (Hertz to his parents, October 10, 1893, in (Hertz 1977, pp. 341–343))
A biographical survey
There is no doubt that Hertz was relieved that he had finished his book and could finally undertake experimental work again. However, the entirely negative impression the letter leaves concerning his theoretical work on mechanics should be contrasted with the following more representative quote from a letter of June 20, 1892 to his parents: ‘This mathematical work is the most time consuming pleasure.’ In fact, Hertz was often quite happy when working on mechanics, but perhaps more often unhappy. His spirits changed many times and often rather suddenly between utter depression over the lack of progress and happiness or at least satisfaction resulting from obtained insights. In this respect his work with mechanics did not differ much from his earlier theoretical and experimental work. Yet the rigidity of the subject matter, the lack of spectacular breakthroughs, combined with his fatal illness, made the work on the Principles of Mechanics a more depressing experience than his earlier research. Moreover, one should not underestimate the difficulty it must have caused Hertz to start almost from scratch after his long series of successful experimental and theoretical works on electromagnetic waves, and the world fame it had earned him. Hertz himself was well aware of this psychological effect: I too am well, but not in as good spirits as I could wish; the rapid progress of my work and the external successes over the past several years have altogether spoiled me so, it would be surprising if the present stagnation did not affect me heavily. It is some consolation to know that I have been quite often the victim of a similar gloom in the past and that good times then followed. (Hertz to his parents, April 10, 1891, in (Hertz 1977, pp. 314–315))
Let us now follow Hertz’s progress with his work in mechanics as it is revealed in his diary and letters (see also (Fölsing 1997, pp. 474–499 and 513–577)). During March and April of 1891, Hertz did some initial research. Then teaching took his time until June, when he could take up mechanical research again. In August and September he traveled, but in October he was back doing mechanics until teaching and a major Helmholtz Celebration in Berlin diverted him again. After that he spent six weeks writing the introduction to the republication of his papers on electric waves and so did not return to mechanics until the Christmas holiday. Soon, however, problems with the tax authorities and many festive occasions diverted him. When in March he began to work ‘On the Hamiltonian differential equation’ (Hertz 1977, p. 323) he felt he made little progress. After a trip to Switzerland and some unsuccessful experimentation he returned again to serious work on mechanics. He had now ‘grown somewhat more content and serene about my work: although I have nothing that is complete, I can see this and that growing healthily of which one or the other may yet yield a good harvest.’ (Hertz to his parents, April 10, 1892, in (Hertz 1977, p. 323)). He ‘worked intensively on it [mechanics] throughout the summer semester,’ reflecting on May 8th on the ‘straightest distance’ (Hertz 1977, pp. 322–323). During June of 1892, he rewrote his work on mechanics quite anew (probably the second draft (see Section 6.5)) and for the first time he felt that it began ‘to take palpable shape’ (Hertz 1977, p. 323). He projected that the book would be finished by the beginning of the following year.
Professor in Bonn (1889–1894)
However, in the middle of July he was struck by an illness that made him completely unfit for work. It began as an infection of the nose but soon spread to all the cavities of his head. It may have been caused by the damp rooms in the physics institute. During the rest of 1892 Hertz went on cures and was operated on several times, but the infection just got worse until the end of January of 1893 when the doctors successfully flushed out the pus from his jaw. Then Hertz could return to work, until March when he went on a holiday to Italy to recuperate. On return he even began lecturing again. However, the infection was not gone, only temporarily stabilized. On August 6, he could report to his parents that ‘Four weeks ago, I thought once again that I would have to bring the book to a premature end, but since then it has come close to being completed and naturally I am greatly attracted by the idea of finishing with it for good.’ (Hertz 1977, p. 339). Later that month Hertz visited new spas but since it did not help, he had his jaw cut open twice in September and October. This helped so much that he could work and lecture again. On October 10, 1893, he wrote the last paragraph of his book on mechanics, and he began to negotiate a contract with Felix Meiner from Barth Publishing House. Hertz was a rather tough negotiator. He asked for a high honorarium of 3000 Marks and required that nothing be spared in the printing: Each formula must stand free and alone on a white surface, each section well spaced. This is not pure vanity on my part, rather the aim of the book requires it. It is to a large degree its aim to create absolute clarity where beforehand unclarity reigned. For this aim, I have put myself to great trouble and the printer should not destroy this; on the contrary he can and must assist considerably. (Hertz to Meiner, November 12, 1893, in (Fölsing 1997, p. 498)).
Moreover, Hertz reserved the right to make changes to later editions and even to add a third part. Meiner accepted Hertz’s demands⁵. Yet, Hertz was nagged by the possibility of mistakes in the presentation: The book could easily reduce my by and large good repute to rack and ruin, ‘if the casting fail.’ Even a minor fault can flaw the whole. It frightens me to come out with something that I have never talked over with any human being. (Hertz to his parents, November 19, 1893, in (Hertz 1977, p. 343)).
Indeed, after criticizing all previous presentations of mechanics as logically deficient, it would have been embarrassing if Hertz had committed even a minor error. On December 3, 1893, Hertz could tell his parents that ‘my introduction has been set, the major part of the manuscript goes off today, ready for printing, only a small part still requires a final touch.’ (Hertz 1977, pp. 342–343) In fact the ‘small part’ consisted of about 200 manuscript pages, close to 1/3 of the book (Fölsing 1997, p. 513). However, he never got a chance to give these manuscript pages their final touch. His condition rapidly worsened, blood poisoning supervened and on January 1, 1894, Hertz died, only 36 years old. During the last phase of his illness, it consoled him somewhat to know that his book was in good hands. He had paid Philipp Lenard to make a copy of the manuscript and indeed Lenard saw the book through the press. It was published in the summer of 1894.
⁵ On the other hand, Hertz gave Meiner a free hand to change such trivial matters as spelling, and, in fact, several words were spelled differently in the printed book. For example, Hertz consistently put an h in words such as ‘Theil’ that was rendered ‘Teil’ in the book. Other examples are Hertz’s ‘Vector’, ‘Coordinate’, and ‘Rechtwinkliche’ that were spelled ‘Vektor,’ ‘Koordinate’ and ‘Rechtwinklige’ in the book.
6 Hertz’s road to mechanics
Why did Hertz embark on his work on mechanics, and how did he gradually develop his ideas? These are questions that I shall address in this chapter. In particular, I shall discuss how various elements of his previous work point toward the ideas laid down in the Principles of Mechanics. Ideally, this chapter should have been structured as a chronological analysis of how Hertz gradually developed his ideas on mechanics. However, the existing source material does not allow that. As Hertz wrote in the letter quoted at the end of the last chapter, he did not discuss these matters with anybody, and his letters and diary are almost silent about details pertaining to his work. The drafts of the book reveal some interesting developments that we shall discuss in the later chapters. However, it is rather difficult to match these drafts with his remarks in the diary and the letters, so it is hard to date the drafts. And what is worse, even the earliest drafts and notes reveal a rather clear sense of the aim, methods, and direction of the project, making them of little value for understanding Hertz’s motivation and early approach to the subject. Therefore our account of Hertz’s motives and original apprehension of the subject must remain somewhat conjectural.
6.1 Hertz’s electromagnetic work as a background for his mechanics
Hertz’s experimental and theoretical investigations of electromagnetism provided a threefold background for his work on mechanics:
1. As an axiomatic reorganization of a field of physics.
2. As an area of physics that needed mechanical explanation.
3. As an investigation that suggested that distance forces should and could be eliminated from physics.
I shall discuss these three aspects below¹.
6.1.1 Axiomatization
Hertz characterized his work neither in electromagnetic theory nor in mechanics as axiomatization, but this phrase seems to capture quite well what he did in both areas. In his ‘Über die Grundgleichungen der Electrodynamik für ruhende Körper,’ Hertz explained his aim as follows: The structure of the system [of Maxwellian electromagnetic theory] ought to allow a clear apprehension of its logical foundation. All non-essential concepts ought to be removed from the system and the connections between the essential concepts ought to be reduced to their simplest form. In this respect, Maxwell’s own exposition does not represent the attainable goal. It often swings back and forth between the views that Maxwell inherited (vorfand) and those to which he was guided. Maxwell begins by assuming unmediated distance forces and he investigates the laws, according to which the hypothetical polarization of the dielectric ether will change under the action of such distance forces, and he ends with the claim that the polarizations really change in this way, even though the changes are in fact not caused by distance forces. This procedure leaves one with an unsatisfactory feeling that either the final results or the way in which they were obtained must be wrong. Moreover this procedure leaves, in the formulas, a number of superfluous rather rudimentary concepts behind that only had a proper meaning in the old theory of immediate action at a distance. (Hertz 1890, pp. 208–209)
¹ See also (D’Agostino 1993), (D’Agostino 1998).
Indeed, Maxwell’s Treatise presents a long and winding route leading from the observed phenomena of electrostatic action and Ampère’s laws to Maxwell’s equations, presenting even Weber’s and other competing ideas on the way. It is in a sense a rational reconstruction of Maxwell’s own way to field theory. What Hertz offered instead was a minimalistic, axiomatic treatment². He introduced the electric and magnetic fields (forces) as states in space, and postulated the relations between them in the form of what we now call Maxwell’s equations. Only then did he define derived notions such as polarization, electricity, magnetism and (ponderomotive) forces, and he showed how one can deduce various phenomena from the basic equations. Where Maxwell’s Treatise presented the account of the discoverer, who wanted to induce his theory and equations from the empirical evidence, Hertz’s paper was an a posteriori axiomatization, whose aim was logical deductive clarity. Where experimental evidence entered in many places in Maxwell’s deduction of particular points in his theory, Hertz saw the relations to experience quite differently: I will add explanations (Erläuterungen) to the formulas [Maxwell’s], but these explanations are not proofs of the formulas. Rather, the statements are given as facts drawn from experience and experience should count as their proof. To be sure, each individual equation cannot be tested against experience, only the system as a whole. The situation is hardly different as far as the equations of the ordinary mechanics is concerned. (Hertz 1890, p. 210)
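For orientation, the relations that Hertz postulated between the electric and magnetic forces can be sketched for the free ether in present-day vector notation. This is only a modern paraphrase and not Hertz's own component form: the symbols \mathbf{E} and \mathbf{H} for the two field vectors and c for the speed of light are assumptions of the sketch, and Hertz's sign and orientation conventions may differ from the ones used here.

```latex
% Free-ether field equations in modern Gaussian-style notation (a sketch).
% E, H: electric and magnetic field vectors; c: speed of light.
\begin{align}
  \frac{1}{c}\,\frac{\partial \mathbf{H}}{\partial t} &= -\nabla \times \mathbf{E}, &
  \frac{1}{c}\,\frac{\partial \mathbf{E}}{\partial t} &= \nabla \times \mathbf{H}, \\
  \nabla \cdot \mathbf{E} &= 0, &
  \nabla \cdot \mathbf{H} &= 0 .
\end{align}
```

In an axiomatic presentation of this kind the equations are simply postulated, and it is the system as a whole, rather than any single equation, that is confronted with experience.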
When Hertz wrote his mechanics, he was more concerned with investigating which parts of the theory he considered a priori and therefore unshakable by experience and which parts had empirical content. Otherwise, he proceeded similarly. He was not interested in displaying experimental evidence but simply claimed that his fundamental law was the result of the most general experience. Also in the Mechanics the aim was logical clarity, and as in electromagnetic theory this clarity was obtained by removing superfluous rudimentary elements that threatened to create logical inconsistency in the system. As he put it in the 1892 introductory survey of his electromagnetic papers: I have further endeavoured in the exposition to limit as far as possible the number of those conceptions which are arbitrarily introduced by us, and only to admit such elements as cannot be removed or altered without at the same time altering possible experimental results. (Hertz 1892, p. 30/28)
² Already twenty years earlier Carl Neumann had pioneered a different axiomatic formulation of electromagnetism.
Neither in electromagnetism nor in mechanics was Hertz the original discoverer or inventor of the theory. In both cases, his theoretical contribution was to clarify and axiomatize existing theories a posteriori and in both cases such an axiomatization was a means to choose between several more or less equivalent theories (or images in Hertz’s terminology). For electromagnetic theory, the competing theories were Weber’s, Helmholtz’s and Maxwell’s theories, the latter being the one that Hertz axiomatized. In mechanics the three competing images were the usual Newtonian–Laplacian image, the energetic image and Hertz’s new image. However, there are differences between the two cases. In electromagnetism, Hertz believed that his experiments had falsified Weber’s theory and Helmholtz’s theory except for the limiting case that is equivalent to Maxwell’s theory. In mechanics, Hertz had no such experimental falsification of any of the images (except for the energetic image for nonholonomic constraints). The parallelism, however, becomes more striking if in the electromagnetic case we consider the three versions of Maxwell’s theory: Maxwell’s own, Helmholtz’s limiting case, and Hertz’s axiomatic version. Hertz emphasized that mathematically these versions are equivalent, but physically they are quite different. Here, as in mechanics, Hertz preferred his own version of the theory because it is logically consistent and simpler. In electromagnetic theory Hertz went one step further in his axiomatic endeavors than he had indicated to be desirable in his Kiel Lectures and also further than he went in his mechanics book. In the Kiel Lectures and in the Principles of Mechanics, Hertz claimed that it is the duty of the physicist to create images (Bilder) in the mind. He admitted that these images would necessarily contain inessential elements that did not correspond to empirical elements of the external world.
In the Mechanics he further insisted that one could (and should) try to minimize them but one could not create an image without them. They were only subject to the requirement that the image be correct. In this paper on Maxwell’s equations, Hertz stripped Maxwell’s theory of so much inessential baggage that he was hardly left with an image but only with an ‘abstract and colorless’ theory. He admitted that this may seem unsatisfactory when we are ‘used to having placed before our eyes the perceptible pictures (Bild) of atoms covered with electricity.’ But, Hertz continued: Nevertheless I believe that we cannot, without deceiving ourselves, extract much more from experience than is asserted in the papers referred to. If we wish to lend more colour to the theory, there is nothing to prevent us from supplementing all this and aiding our powers of imagination by concrete representations of the various conceptions as to the nature of electric polarisation, the electric current, etc. But scientific accuracy requires of us that we should in no wise confuse the simple and homely figure, as it is presented to us by nature, with the gay garment which we use to clothe it. Of our own free will we can make no change whatever in the form of the one, but the cut and colour of the other we can choose as we please. (Hertz 1892, pp. 30–31/28)
It is obvious that positivists like Mach endorsed Hertz’s version of Maxwell’s theory: I have read your work, ‘Über die Grundgleichungen der Electrodynamik’ with very special interest. In this work you come very close to the ideal of a physics free of mythology, that I once [Mechanics, p. 468] allowed myself to recommend to my colleagues. (Mach to Hertz, September 25, 1890 (Thiele 1968, p. 134))
Boltzmann, on the other hand, who preferred mechanical images, thought that Hertz’s colorless theory was a bad joke (Fölsing 1997, p. 455), and William Thomson thought it was a nihilistic description (Knudsen 1985).
6.1.2 Mechanization
To the question ‘What is Maxwell’s theory?,’ I know of no shorter or more definite answer than the following: – Maxwell’s theory is Maxwell’s system of equations. (Hertz 1892, p. 23/21)
This quote has often been interpreted to mean that Hertz did not endorse the mechanistic philosophy according to which electromagnetic phenomena should eventually be explained by way of mechanics. However, the opening words of Hertz’s preface to his Principles of Mechanics (quoted at the beginning of Chapter 3) suggest that this is a misinterpretation. In fact, a closer reading of the quote in its context reveals that Hertz did not address the question of mechanical reductionism here. Instead, he discussed the difference between Maxwell’s theory as presented by Helmholtz, by Maxwell and by Hertz himself, and argued that they are all forms of ‘Maxwell’s Theory’ according to the above definition. It is indeed interesting to note that when Hertz discussed which inessential things he had removed from Maxwell’s own presentation he mentioned the dielectric displacement in the free ether, and the vector potentials, but he did not mention Maxwell’s mechanical ideas. Indeed he did not even mention these models or explanations at all in the paper. This clearly contributed to the understanding among many physicists that Maxwell’s theory did not need any mechanical underpinning and therefore contributed to the abolition of the mechanistic philosophy. However, it does not seem to have been Hertz’s own intention. In his paper on Maxwell’s equations, Hertz wanted to create a logically consistent macroscopic theory by removing all inessential microscopic concepts. It corresponded quite well to the axiomatic macroscopic theory of thermodynamics, which he knew well and may have been inspired by. However, the establishment of such a theory does not exclude a further reduction to a microscopic mechanical theory similar to the kinetic theory of gases. When Hertz did not mention mechanical concepts at all, he simply seems to have implied that mechanization was a different project and perhaps that time was not quite ripe for it yet.
In fact, this corresponds quite well to Maxwell’s own point of view when he was in his most positivistic mood. Indeed, in 1892, when Hertz illustrated the concept of polarization in his version of Maxwell’s theory he wrote: The explanation of the nature of the polarizations, of their relations and effects, we defer, or else seek to find out by mechanical hypotheses. (Hertz 1892, p. 27/25)
In 1890 Hertz ‘deferred,’ but in his Mechanics he declared that a mechanical explanation ‘seems to be nearly realized’ in electromagnetic theory. However, the mechanics suited for such an explanation had to be a mechanics that does not build on a concept of distance forces, a concept that Hertz had himself proved to be unnecessary in electromagnetic theory. Thus, Hertz was probably on the lookout for a mechanical explanation of his axiomatic electromagnetic theory and this can very well have been one of the main reasons why he undertook his complete reformulation of mechanics.
6.1.3 The elimination of distance forces
Already in the Kiel Lectures Hertz devoted much time to arguing that mediated contact action through the ether gave a better understanding of all interactions than actions at a distance. He did not claim that he could prove that actions at a distance did not exist, but he claimed that a theory (Anschauung) of pure contact actions was possible (both mathematically and physically), and that it was preferable as a help to understanding and controlling nature. Moreover, Hertz considered it a ‘likely’ theory at least as far as the electromagnetic actions were concerned. (Hertz 1999, pp. 62–63) He immediately addressed one possible refutation of a contact action, or field theory: Consider a surface that totally encloses a part of a physical system. In order to maintain a field theory we must be able to show that the action of the system inside the surface on the system outside the surface is entirely determined by the state on the surface. Otherwise, one cannot maintain that the action of the inner system is propagated through the surface to the outer system. An advocate of the action at a distance theory will naturally point out that since the system inside the surface has more degrees of freedom than the states on the surface, it is very unlikely that its action on the external system can be described completely by the state on the surface. And yet, the mathematical theory of potentials shows that for forces that fall off as the square of the distance, it is in fact true that the ‘forces’ on the surface determine the actions of the included system on the surroundings. Now, since all known ‘forces’ of nature are of this kind, Hertz concluded that the mentioned objection against field theory was unfounded. Indeed, the fact that only 1/r² forces exist in nature strongly suggests that field theory is the most likely theory.
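The potential-theoretic fact invoked here can be made explicit with Green's representation formula. The following is a sketch in modern notation and not Hertz's own formulation; the symbols S (the closed surface), \varphi (the potential of the enclosed system), \mathbf{n} (the unit normal on S pointing away from the enclosed system) and \mathbf{x} (a point outside S) are introduced here for illustration. It assumes that all sources lie inside S, so that \varphi is harmonic outside S, and that \varphi vanishes at infinity, as holds for inverse-square forces.

```latex
% Green's representation of an exterior harmonic potential (a sketch).
% All sources lie inside S; n points from S into the exterior region.
\begin{equation}
  \varphi(\mathbf{x}) = \frac{1}{4\pi} \oint_{S}
  \left[
    \varphi(\mathbf{y})\,
    \frac{\partial}{\partial n_{\mathbf{y}}}
    \frac{1}{\lvert \mathbf{x}-\mathbf{y} \rvert}
    \;-\;
    \frac{1}{\lvert \mathbf{x}-\mathbf{y} \rvert}\,
    \frac{\partial \varphi}{\partial n}(\mathbf{y})
  \right] \mathrm{d}S_{\mathbf{y}},
  \qquad \mathbf{x} \text{ outside } S .
\end{equation}
% Check: point source q at the centre of a sphere of radius a,
% \varphi = q/\rho. On S: \varphi = q/a, \partial\varphi/\partial n = -q/a^2.
% The first term integrates to zero (1/|x-y| is harmonic in y inside S),
% the second gives (1/4\pi)(q/a^2)(4\pi a^2/R) = q/R for |x| = R > a.
```

Only the values of the potential and of its normal derivative (that is, the normal component of the ‘force’) on S enter the right-hand side, which is the sense in which the state on the surface fixes the action of the enclosed system on its surroundings.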
To be sure, Kant and others had tried to argue along other lines that 1/r² forces were a consequence of the 3-dimensionality of space, but Hertz rejected such arguments as metaphysical and mystical. Having argued for the possibility and likelihood of a field theory, Hertz continued to argue that it is preferable (simpler) as a description of nature. To this end he considered the description of a glass of water. In the action-at-a-distance image of the water, all the hydrogen and oxygen atoms attract or repel each other, as well as the charged particles outside the glass, with strong electric forces, and, if in motion, also with magnetic forces. And yet one cannot detect any action outside the glass because all the forces compensate each other. The field-theoretic image is much simpler because here the actions compensate each other locally in the field. The field-theoretic description is, according to Hertz, also the simpler for our prediction of phenomena. To be sure, the action at a distance theory has proved to be simple in the study of the solar system, where we can resolve the problem into a series of two-particle interactions. However, for astronomers who want to study Saturn’s rings and even more for physicists, this ‘Phantom of the two points,’³ is not helpful and a field-theoretic description (here almost identified with potential theory) is simpler both because it leads to simpler mathematics and because the physicist will always see the force field (as illustrated by Faraday’s iron filings) in his mind’s eye. In the following section of his Kiel Lectures, Hertz turned to the electromagnetic actions in particular and argued that they were ‘probably’ field actions. First, he noticed that Weber’s force law (according to Helmholtz) contradicted energy conservation. Secondly, he argued that polarization of ponderable dielectrica suggested a possible way in which electromagnetic actions could propagate through the ether. In particular, he pointed out that it is possible to detect the presence of an electromagnetic field in the ether (empty space) by observing the rotation of polarization of light. He therefore suggested that if we had been born with eyes that could detect polarization no one would ever have doubted the electromagnetic field theory. The above-mentioned arguments were only skirmishes in the war between the theory of action at a distance and field theory.
As Hertz vividly expressed it in military rhetoric, befitting an officer of the reserve, the decisive battle between the two world views was to be fought on the battlefield of the electromagnetic theory of light. Already in the Kiel Lectures, Hertz described field theory as having the upper hand in this battle, in particular because it could account for the identity (within observational error) of the speed of light and the theoretically computed speed of electromagnetic waves. Thus, already in 1884, Hertz appeared as a vigorous advocate of field theories and a strong critic of action at a distance theories, and accordingly an outspoken supporter of Maxwell’s electromagnetic theory as opposed to Weber’s and even Helmholtz’s theories. It is therefore surprising that in his 1892 introduction to his collected papers on electric waves, Hertz explained that he had not quite been able to grasp the physical meaning of Maxwell’s theory when he began his Karlsruhe experiments in 1886. According to this account, Hertz used Helmholtz’s theory as his theoretical guide as late as 1886 and he did not become a full-fledged Maxwellian until after he had succeeded in describing his oscillator in purely Maxwellian terms and had axiomatized Maxwell’s theory. One might suspect that Hertz’s 1892 account consciously postponed his conversion to Maxwellianism in order to make the experiments on electric waves stand out as more crucial than they really were. However, Buchwald has convincingly argued that, in fact, Hertz had not grasped the deep implications of Maxwell’s theory when he began his experiments with electromagnetic waves, if indeed he ever did. He may have been convinced of the superiority of Maxwell’s theory on a philosophical level when he gave his Kiel Lectures, but when it came down to handling and thinking theoretically about laboratory equipment and doing calculations he was still so steeped in Helmholtzian ways of thinking that he reverted to this theory in 1886⁴.
³ Probably an allusion to Helmholtz’s idea that the most basic interaction must involve two points. Note that already here Hertz emphasized the primacy of the entire system, as opposed to some subsystem.
In his 1892 introduction to Electric Waves, Hertz emphasized that his corroboration of Maxwell’s theory also meant a corroboration of contact forces and a rejection of actions at a distance. In order to see that more clearly, let us follow his presentation of four stages ranging from pure action at a distance theory to pure field theory.
1. The pure action at a distance theory assumes that action happens unmediated between two bodies. If one of the bodies is removed, the force disappears. Hertz wrote that this conception was admitted in the theory of gravitation, but was almost abandoned in electromagnetic theory. It seems to correspond to Weber’s theory, although Hertz did not mention that.
2. From the second standpoint, an acting body will ‘strive to excite at all surrounding points attractions of definite magnitude and direction.’ (Hertz 1892, p. 24/22). This striving is only noted if a second body is placed at the point, but although the striving or action does not change anything at the place where its action is exerted, it is still assumed to be there even if there is no second body to be acted upon. Hertz identifies this standpoint with that of potential theory. He might have thought of Carl Neumann’s theory.
3. The third standpoint retains the conceptions of the second, but adds to them a further complication. It assumes that the action of the two separate bodies is not determined solely by forces acting directly at a distance. It rather assumes that the forces induce changes in the space (supposed to be nowhere empty), and that these again give rise to new distance-forces (Fernkräften). The attractions between the separate bodies depend, then, partly upon their direct action, and partly upon the influence of the changes in the medium. The change in the medium itself is regarded as an electric or magnetic polarisation of its smallest parts under the influence of the acting force. (Hertz 1892, p. 25/23)
This standpoint was the one that Helmholtz held and that guided Hertz through the early experiments with fast electric oscillations. Hertz illustrated the standpoint by two oppositely charged condenser plates with a dielectric in between (Fig. 6.1, left). The standpoint can now be varied according to how much of the interaction energy is supposed to be due to the direct action between the charged plates and how much is supposed to reside in the medium. In the limit where all interaction energy is supposed to reside in the medium, one will be led to Maxwell’s equations. This is how Helmholtz arrived at the theory he called Maxwell’s theory⁵.
⁴ For a discussion of the apparent clash between the two dates of Hertz’s conversion to Maxwellianism defended by Fölsing and Buchwald, respectively, see (Nordmann 2000).
⁵ In fact, Helmholtz had also assumed that the constant k involved in his expression of the interaction energy must be zero (see Section 4.2). However, this assumption is unnecessary (see (Buchwald 1994, pp. 378–379)).
Fig. 6.1. Maxwell’s theory according to Helmholtz (left) and according to Maxwell and Hertz (right).
However, according to Hertz’s understanding in 1892, this does not represent Maxwell’s own physical ideas. Indeed, according to this limiting case of Helmholtz’s standpoint, the charge on the plates, and therefore the distance forces, are completely counterbalanced by the opposite charge of the medium that is displaced towards it, and yet it is the action at a distance between the charges on the condenser plates that creates the polarization.
4. The fourth standpoint, which Hertz identified with Maxwell’s own, belongs to the pure conception of action through a medium. From this standpoint we acknowledge that the changes in space assumed from the third standpoint are actually present, and that it is by means of them that material bodies act upon one another. But we do not admit that these polarisations are the result of distance-forces; indeed, we altogether deny the existence of these distance-forces. (Hertz 1892, p. 27/25)
From a mathematical point of view, this standpoint is equivalent to the limiting case of Helmholtz’s standpoint but from a physical point of view, they differ fundamentally. For example, the physical image of the above-mentioned condenser plates would, according to Maxwell, look like the right-hand figure in Fig. 6.1. One notable difference between this image and the previous one is that the polarization of the medium has been reversed, the positive charge on the left plate being now a result of the displacement of positive charge in the medium towards the plate. Thus we see that Hertz described his own conversion as a shift from a Helmholtzian standpoint that allowed both actions at a distance and mediated actions through a medium to a Maxwellian standpoint that banned actions at a distance completely. Even the Maxwellian limit of Helmholtz’s theory did not eliminate actions at a distance, for as Hertz remarked: It is impossible to deny the existence of distance-forces, and at the same time regard them as the cause of the polarizations. (Hertz 1892, pp. 27–28/25)
Hertz regarded his experiments as a proof of this Maxwellian denial of actions at a distance, and even considered this as their philosophically most important
result. Having recalled the attention that his reflection and refraction experiments had attracted, he continued: A considerable part of this approval was due to reasons of a philosophic nature. The old question as to the possibility and nature of forces acting at a distance was again raised. The preponderance of such forces in theory has long been sanctioned by science, but has always been accepted with reluctance by ordinary common sense; in the domain of electricity these forces now appeared to be dethroned from their position by simple and striking experiments. (Hertz 1892, pp. 19–20/18)
A bit further on he concluded: Casting now a glance backwards we see that by the experiments above sketched the propagation in time of a supposed action-at-a-distance is for the first time proved. This fact forms the philosophic result of the experiments; and, indeed, in a certain sense the most important result. (Hertz 1892, p. 21/19)
When Hertz attached such a great value to his elimination of distance forces in electrodynamics, it is not a great surprise that he also tried to avoid this notion in his work on mechanics.
6.2 Research on gravitation
Fölsing has found traces of a sequence of events that seem to link Heinrich Hertz’s electromagnetic research more directly to his work in mechanics. It has to do with gravitation, the paradigmatic force of mechanics. Already in the Kiel Lectures on the Constitution of Matter, Hertz argued that gravitation propagated by contact forces through the ether. Indeed his potential-theoretic argument was designed to make such an assumption plausible, and his mention of Saturn’s rings was intended to show that it was the simplest theory for astronomers – let alone physicists. Hertz was aware of earlier ‘attempts to explain gravitation . . .’ (Hertz 1999, p. 108), but he did not discuss these attempts in his lecture. One such attempt that Hertz probably did not know about was due to Maxwell. It was a reaction to Faraday’s remark to the effect that ‘the idea of gravity appears to me to ignore entirely the principle of the conservation of force.’ (Faraday 1857). Maxwell’s reply, in the form of a letter to Faraday in November of 1857, sketched a theory for how gravitation could be explained by Faraday’s lines of force: You have also seen that the great mystery is, not how like bodies repel and unlike attract but how like bodies attract (by gravit[at]ion). But if you can get over that difficulty, either by making gravity the residual of the two electricities or by simply admitting it, then your lines of force can ‘weave a web across the sky’ and lead the stars in their courses without any necessarily immediate connection with the objects of their attraction. (Maxwell 1990, p. 550)
In the theory of electricity, the field lines go from positive to negative bodies and pull them together. Maxwell suggested that the gravitational field lines go from masses to the sphere at infinity and that they are pushing instead of pulling. Figure 6.2 shows his
Fig. 6.2. Maxwell’s image of the field lines around the sun and a planet. (Maxwell 1990, p. 550)
image of the field lines around the sun and a planet. He claimed that this mechanism could explain gravity so that ‘I for my part cannot realize your dissatisfaction with the law of gravitation provided you conceive it according to your own principles’ (Maxwell 1990, p. 551). Indeed Faraday agreed. Maxwell also raised great questions concerning gravity such as ‘Does it require time?’ and ‘Has it any reference to electricity? or does it stand on the very foundation of matter – mass or inertia’ (Maxwell 1990, pp. 551–552). Later, he recognized that it was not such an easy matter to explain gravitation as a result of the motion of the ether. In 1877 he wrote: ‘The attempts which have been made since the time of Newton to solve this difficult question are few in number and have not led to any well-established result’ (Maxwell 1876a, p. 121). After Hertz, through his experiments with radio waves, had proved that electromagnetic forces were propagated with the speed of light, the problem of gravitation became more urgent to him. Toward the end of his celebrated talk about light and electricity, delivered at the meeting of the Deutsche Naturforscher und Ärzte in Heidelberg in 1889, Hertz reviewed the great questions opened by his experiments, the first being the question of action at a distance: We are at once confronted with the question of direct actions-at-a-distance. Are there such? Of the many in which we once believed there now remains but one – gravitation. Is this too a deception? The law according to which it acts makes us suspicious. (Hertz 1889, p. 353/326)
Hertz’s talk was praised afterwards when Leo Königsberger had gathered many of the important German mathematicians and physicists at his home. To the question, why he [Hertz] in his talk had not openly declared that he also wanted to eliminate gravity as an action at a distance, he shyly answered ‘I am still too much of a coward for that.’ (Koenigsberger 1903, vol. 3, p. 26)
Soon, however, Hertz turned to some of the problems, mentioned by Maxwell, that confront a field theory of gravity. First, the problem of the velocity of the gravitational action. A field theory would require a finite velocity but astronomers believed that gravity acted instantaneously and Laplace had ‘proved’ in his Mécanique Céleste
(Laplace 1799, vol. 4, X. livre, Chapter 8) that the speed of the gravitational action had to be more than 10⁹ times the velocity of light. In November of 1889, Hertz asked an acquaintance, Lehmann-Filhés, for his opinion, and Lehmann-Filhés answered that Laplace’s derivation was quite untenable. Hertz agreed: The immensity of Laplace’s result shows in advance that it cannot be derived from the facts of observation, but from more or less arbitrary assumptions (Fölsing 1997, p. 461)
Hertz even speculated how one could derive the speed of gravity by accurate observations of comets or through anomalies in the motion of the moon or Mercury, to which Lehmann-Filhés answered that it would be of the greatest interest if it could be done. Now Hertz was not an astronomer, so, in May of 1890, he turned to the Cambridge astronomer George Darwin, who, however, could not help him (Fölsing 1997, p. 461). At the same time, he again ‘put my mind to old lectures on dynamics and Hamilton’s principle’ (Diary, May 3, 1890 in (Fölsing 1997, p. 461)), in particular his old notes from Borchardt’s lectures. A few days later he even ‘Asked Lipschitz [his mathematics colleague in Bonn] about the Hamiltonian principle.’ (Diary, May 7, 1890 in (Hertz 1977, p. 301)). Fölsing interprets Hertz’s renewed occupation with mechanics as a result of his interest in gravitation and indeed the chronology strongly suggests that he is right. This gives a nice direct link between his work on electromagnetism and his work on mechanics. Still, after the brief fling with theoretical mechanics in May of 1890, almost a year passed before he wholeheartedly turned to this subject (see Chapter 5). In the meantime, he once returned to gravitation in an experimental fashion. His diary entry of January 5, 1891, laconically reports: Made experiments on polarization through gravitational effect. (Hertz 1977, p. 313).
It is not at all obvious what this means. For one thing, polarization could refer to polarization of dielectrics or to polarization of light. One entirely conjectural interpretation could be that Hertz tried to measure an effect due to gravitation on the plane of polarization of polarized light, i.e. a gravitational analogue of the Faraday effect. Indeed, if gravitation is somehow due to a field in the ether, one might expect such an effect. Hertz had previously interpreted the weak electromagnetic effect on cathode rays as a result of a kind of Faraday effect (Buchwald 1994, pp. 173–174). Still, Hertz would probably have considered the possible effect of the gravitational field on the plane of polarization as being very small. At any rate, nothing came of these experiments and no later note suggests that Hertz continued to be seriously occupied with gravity. He might very well have thought that such speculations had to be postponed until a consistent theoretical framework had been erected. The Principles of Mechanics provided such a framework.
6.3 Ether
Whether concerned with the field theory of electromagnetism or of gravitation, the important task for the nineteenth-century physicist would be to construct an ether that
could carry these fields. This was also Hertz’s opinion. In his Kiel Lectures, Hertz primarily argued for the existence of an ether. Only in passing did he mention the often contradictory properties that one seems to be forced to ascribe to this substance. In the Heidelberg lecture, where he likened the subject-matter of science to a mountainous landscape and his discovery of electromagnetic waves to a pillar in a bridge connecting two ridges (electromagnetism and light), he concluded by raising the question about the nature of electricity and magnetism and continued: Directly connected with these is the great problem of the nature and properties of the ether which fills space, of its structure, of its rest or motion, of its finite or infinite extent. More and more we feel that this is the all-important problem, and that the solution of it will not only reveal to us the nature of what used to be called imponderables, but also the nature of matter itself and of its most essential properties – weight and inertia. The quintessence of ancient systems of physical science is preserved for us in the assertion that all things have been fashioned out of fire and water. Just at present physics is more inclined to ask whether all things have not been fashioned out of the ether? These are the ultimate problems of physical science, the icy summits of its loftiest range. Shall we ever be permitted to set foot upon one of these summits? Will it be soon? Or have we long to wait? We know not: but we have found a starting-point for further attempts which is a stage higher than any used before. Here the path does not end abruptly in a rocky wall; the first steps that we can see form a gentle ascent, and amongst the rocks there are tracks leading upwards. There is no lack of eager and practised explorers: how can we feel otherwise than hopeful of the success of future attempts? (Hertz 1889, p. 354/326–327)
During the following meeting of the Naturforscher in Halle in 1891, of which Hertz attended the end (Fölsing 1997, p. 479), it became somehow known that he had turned to mechanics. With the optimistic programmatic conclusion of his 1889 address in mind, it is no wonder that rumors began to circulate that Hertz had embarked on the solution of these grand questions concerning the ether. However, when Hertz subsequently heard of these rumors from his colleague Emil Cohn in Strassburg, he rejected them categorically: What you have heard about my works via Halle is unfortunately without any foundation and I do not know how this opinion has been formed. I have not at all worked with the mechanics of the electric field, and I have not obtained anything concerning the motion of the ether. This summer I have thought a great deal about the usual mechanics, but I do not think I spoke about that in Halle at all. In this area I would like to put something straight and arrange the concepts in such a way that one can see more clearly what are the definitions and what are the facts of experience, such as, for example, concepts of force and inertia. I am also already convinced that it is possible to obtain great simplifications; for example, I have only recently clarified for myself in a satisfactory manner what a mechanical force is. However, I have neither written the thing down nor do I know if others will afterwards find it satisfactory. In any case, it is something that can only ripen slowly. (Hertz to Emil Cohn, November 29, 1891, translated from the German by J.L. Deutsches Museum).
Hertz’s insistence here and in the introduction to his Mechanics that his aim was primarily to obtain logical clarity and consistency does not exclude that the explanation of the properties of the ether was a major motivating factor in the undertaking.
Indeed, in the preface to the Mechanics, Hertz explained: It is in the treatment of new problems that we recognise the existence of such open questions as a real bar to progress. So, for example, it is premature to attempt to base the equations of motion of the ether upon the laws of mechanics until we have obtained a perfect agreement as to what is understood by this name. (Hertz 1894, p. xxv)
This remark may very well represent the events that led Hertz to study mechanics. An initial attempt to create a mechanics of the ether may have revealed to him that the ordinary mechanics was still not in a state that could sustain such an endeavor. He may then have begun a critical study and reconstruction of ordinary mechanics, a project that eventually got a life of its own and resulted in the book. Towards the end of the introduction, Hertz further made it clear that his new image of mechanics would eventually prove its value in dealing with the ether. This section of the introduction contains a comparison of the usual image of mechanics and Hertz’s own image. Hertz admitted that they may arguably both be permissible (consistent) and equally appropriate, but only one of them can be correct, in the sense of describing nature accurately. In the first image, force laws (the relative accelerations) are exactly satisfied, whereas rigid connections are only approximate. In Hertz’s image rigid connections are exactly realized but the force laws are only approximations (see Chapters 19 and 20). Both images may be false, but they cannot both be correct. We do not yet possess any falsification of either theory, but we may initially tend to prefer the ordinary image because . . . in actions-at-a-distance we can actually exhibit relative accelerations which, up to the limits of our observation, appear to be invariable; whereas all fixed connections between the positions of tangible bodies are soon and easily perceived by our senses to be only approximately constant. (Hertz 1894, p. 49/41)
However, Hertz continued with an implicit reference to the mechanics of the ether: But the situation changes in favour of the third image as soon as a more refined knowledge shows us that the assumption of invariable distance-forces only yields a first approximation to the truth; a case which has already arisen in the sphere of electric and magnetic forces. And the balance of evidence will be entirely in favour of the third image when a second approximation to the truth can be attained by tracing back the supposed actions-at-a-distance to motions in an all-pervading medium whose smallest parts are subjected to rigid connections; a case which also seems to be nearly realised in the same sphere. (Hertz 1894, p. 49/41)
Thus Hertz maintained, as he had often done, that (his) experiments had shown that electromagnetic actions cannot be accurately described as actions at a distance and he added that they had almost been explained by the mechanics (with constraints and without distance forces) of an ether. This latter claim is surprising. He had never before written approvingly about the existing mechanistic field theories and as late as 1889, in his Heidelberg lecture he had contrasted the different properties of the ether: on the one hand it must carry transversal waves and must therefore behave like a solid body and on the other hand, the heavenly bodies must move unhindered through it, so that it must behave like a perfect fluid. Hertz did not conceal this ‘defect’ in our
understanding of the ether, but explained that ‘These two statements together land us in a painful and unintelligible contradiction which disfigures the otherwise beautiful development of optics’ (Hertz 1889, p. 341/315). So when Hertz in the introduction to the Mechanics wrote that a mechanical reduction was nearly realized, the ‘nearly’ apparently referred to such rather major problems. It is hard to escape the feeling that Hertz was somewhat opportunistic here. He continued the introduction of his Mechanics motivating his development of the new image as a clarification that was necessary in a future experimental evaluation of the two theories: This is the field in which the decisive battle between these different fundamental assumptions of mechanics must be fought out. But in order to arrive at such a decision, it is first necessary to consider thoroughly the existing possibilities in all directions. To develop them in one special direction is the object of this treatise, – an object which must necessarily be attained even if we are still far from a possible decision, and even if the decision should finally prove unfavourable to the image here developed. (Hertz 1894, p. 49/41)
Thus Hertz clearly stated that (one of) the aim(s) of his book was to develop an image that could serve as a basis for a theory of the ether, a theory that was in need of clarification in order to be compared with the ordinary image of mechanics. However, as he correctly wrote to Cohn, the Principles of Mechanics is not about the ether. First, it definitely does not present a pure ether theory like the one underlying Thomson’s vortex atoms. Indeed in Hertz’s image of mechanics, there is ordinary and concealed mass, the latter having no measurable gravitational mass, and it may be thought of as the stuff making up the ether. Except for the fact that we cannot directly perceive the hidden mass, it does not differ in character from the ordinary mass. In Thomson’s theory there is basically only one kind of matter, matter that makes up the ether. Ordinary mass is an epiphenomenon, explainable as a state in the ether and the mass of ordinary mass points (i.e. vortices) must be traced back to the way the hidden masses interact. Thus ordinary and hidden mass are by no means the same type of thing in the vortex theory. Hertz’s image in his Mechanics, by contrast, is a mixed theory having both ordinary and hidden mass on the same theoretical level. In this respect, his image corresponded quite well with the one he presented in his Kiel Lectures on the constitution of matter. However, in the meantime, he seems to have flirted with something like a vortex theory. Thus in the quote above from his Heidelberg lecture, Hertz claimed that the solution to the problem of the ether would also reveal ‘the nature of matter itself and of its most essential properties – weight and inertia.’ If this is a commitment to some kind of vortex theory of matter, which it certainly seems to be, then Hertz must have changed his mind back to his more traditional Kiel standpoint before writing the Mechanics.
But there is a deeper sense in which even Hertz’s hidden mass cannot truly be said to be an image of the ether. Indeed, as Hertz argued in the Heidelberg lecture, the ether was usually conceived as a continuum, either as a fluid or a solid or some strange mixture of the two. The systems that Hertz dealt with in his Mechanics, on the other
hand, were exclusively systems of finitely many material points, be they ordinary or hidden. However, as he remarked in §7: as we assign no upper limit to their number, and no lower limit to their mass, our general statements will also include as a special case that in which the system contains an infinite number of infinitely small material points. We need not enter into the details required for the analytical treatment of this case. (Hertz 1894, §7)
This is the only place in the Mechanics where Hertz mentioned continuum mechanics and thus it is the only clue as to how he imagined one could deal with the ether. However, the way in which he brushes this problem aside is quite cavalier. He must have known that it is not such an easy matter to go to the limit as his remarks seem to suggest. In fact, with his acute eye for logical inconsistencies, Hertz must have been dissatisfied with the ways this step from point mechanics to continuum mechanics was usually made. I therefore interpret Hertz’s remark not as a treatment of the problem, but as a program for further research. In fact, it is possible that this was the problem he had in mind for the third book of his mechanics, which he hinted at in his last letter to the editor (see Chapter 5). If this project had materialized, Hertz would have provided the basis for an image of the ether and a possibility for a mechanical treatment of the field theories that his experiments on electromagnetism had suggested.
6.4 An energetic beginning
For a person like Hertz who wanted to give a new presentation of mechanics without distance forces, there existed an obvious alternative to the ordinary theory: the energetic theory. And indeed Hertz first tried out this approach. This seems to be the meaning of Hertz’s letter to Klein of March 25, 1891, quoted in Section 5.4, in which he declared that his project for the next half or whole year would be to ‘study the theory of energy in its widest sense’ (Fölsing 1997, p. 474). However, having explained the main principles of the energetic image in the Introduction of his Mechanics, Hertz concluded: ‘I have discussed this second mode of representation at some length, not in order to urge its adoption, but rather to show why, after due trial, I have felt obliged to abandon it.’ (Hertz 1894, p. 29/24). One of the problems that made Hertz abandon the energetic mode of representation concerned the difficulty involved in imagining energy as a substance. If energy is to play the role of a fundamental concept rather than a mathematically derived concept, it should be possible to intuit it in space and time. However, the usual formulation of energy conservation does not necessitate such a localization of energy in space. It only postulates that the total amount of energy of an isolated system is conserved but it does not tell where the energy is located or how it gets from one place to another. This question had been addressed in electromagnetic theory by John Henry Poynting, who introduced a measure of the energy flux in the electromagnetic field: If we believe in the continuity of energy, that is, if we believe that when it disappears at one point and reappears at another it must have passed through the intervening space, we are forced
to conclude that the surrounding medium contains at least a part of the energy, and that it is capable of transferring it from point to point. ((Poynting 1884, p. 343), quoted from (Buchwald 1985, p. 43)).
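In modern vector notation, which is not the notation of the passages quoted here, Poynting's measure of the energy flux and the local energy balance it enters can be sketched as follows. The symbols are assumptions introduced for illustration: u is the energy density, S the flux vector, E, H, D, B the field quantities, and J the conduction current density; the expression for u presupposes a linear medium.

```latex
\begin{align}
  \mathbf{S} &= \mathbf{E} \times \mathbf{H}
    && \text{(energy flux per unit area and time)} \\
  u &= \tfrac{1}{2}\left(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H}\right)
    && \text{(energy density of the field)} \\
  \frac{\partial u}{\partial t} + \nabla\cdot\mathbf{S} &= -\,\mathbf{J}\cdot\mathbf{E}
    && \text{(local energy balance)}
\end{align}
```

Only the divergence of S enters the balance. For static fields without currents the balance reduces to \nabla\cdot\mathbf{S} = 0, which S can satisfy without vanishing, so the formalism can assign a steady circulation of energy to a configuration in which nothing changes.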
Hertz discussed Poynting’s result in his paper on Maxwell’s equations (Hertz 1890) but already then questioned whether such a localization of energy was really meaningful. Having emphasized that the expression of the energy flux came about by a hypothetical mathematical analysis of a surface integral into its elements he went on to argue that ‘the result thereof is not always probable’: If a magnet remains permanently at rest in presence of an electrified body, then in accordance with this result the energy of the neighbourhood must find itself in a state of continuous motion going on, of course, in closed paths. In the present state of our knowledge respecting energy there appears to me to be much doubt as to what significance can be attached to its localisation and the following of it from point to point. (Hertz 1890, §11)
The last sentence indicates that Hertz did not consider Poynting’s association of a continuous energy flow with a static configuration as being entirely impossible but only as being improbable. However, he raised a more fundamental and general doubt about the localization of energy: Considerations of this kind have not yet been successfully applied to the simplest cases of transference of energy in ordinary mechanics; and hence it is still an open question whether, and to what extent, the conception of energy admits of being treated in this manner. (Hertz 1890, §11)
Hertz’s ‘study of the theory of energy in its widest sense’ seems to have been undertaken in order to answer this basic question. When he republished his 1890 paper in his book on electrical waves he had come to the conclusion that in mechanics the idea of localization of energy would lead to paradoxical situations, which cast additional doubts on Poynting’s idea. He published his reasoning in an important note of his book: . . . a steam engine . . . drives a dynamo by means of a strap running to the dynamo and back, and which in turn works an arc lamp by means of a wire reaching to the lamp and back again. In ordinary language we say – and no exception need be taken to such a mode of expression – that the energy is transferred from the steam engine by means of the strap to the dynamo, and from this again to the lamp by the wire. But is there any clear physical meaning in asserting that the energy travels from point to point along the stretched strap in a direction opposite to that in which the strap itself moves? And if not, can there be any more clear meaning in saying that the energy travels from point to point along the wire, or – as Poynting says – in the space between the wires? There are difficulties here which badly need clearing up. (Hertz 1892, note 31)
Here, Hertz implies that if energy is imagined to be a substance localizable in space it must be attached to matter. And that leads to a paradox because the strap (i.e. matter) moves from the dynamo to the engine whereas energy flows the other way.⁶ Hertz wrote this note in December of 1891, about 9 months after he had begun his research on ‘energy in its widest sense.’ It shows that by then Hertz had come to the conclusion that conceiving of energy as a substance that could serve as a basic notion of physics would lead to paradoxes.
⁶ For a closer analysis of Hertz’s paradox see (Buchwald 1985, pp. 41–43).
While speculating about the nature of energy Hertz also studied the principle of least action, or Hamilton’s principle, which he considered the fundamental law of motion in an energetic image. In order to learn about this principle and its use in all branches of physics he naturally turned to Helmholtz’s recent papers on this subject. In a letter of December 1892, in which he revealed his serious involvement with mechanics to Helmholtz for the first time, he accorded a central place to these papers: Recently I have been confined to theoretical work on topics suggested by a study of your papers on the principle of least action. I asked myself what form mechanics should be given right from the outset if the principle of least action is to appear at the point of departure and if its various forms are to show up not as the results of complicated derivations but as obvious truths of simple significance, and to present themselves clearly and distinctly as various forms of one and the same theorem. I am to a degree satisfied with my results, but I still have six months’ or a year’s work to go on this matter, . . .. (Hertz 1977, pp. 332–333)
If Hertz turned to Helmholtz’s 1886 paper on the principle of least action, in order to find a basis for an energetic image of nature, he would also have been presented with Helmholtz’s study of cyclic motion and a reference to (Helmholtz 1884) which deals in more depth with this subject. These two papers, and their idea of cyclic motion, became fundamental for Hertz’s approach to mechanics: Both in its broad features and in its details, my own investigation owes much to the abovementioned papers ((Helmholtz 1884), (Helmholtz 1886)). (Hertz 1894, p. xxi)
The main idea (to which we will return in Chapter 18) of cyclical motion is that a cyclically moving hidden system can be ignored if a suitable term is added to the potential energy of the visible system. That is, some potential energy may be due to ignored hidden cyclic systems. The image suggested by these papers is therefore an image in which space, time, mass (visible as well as hidden) and energy are basic notions. However, to a mind intent on simplifying the image, the following thought must almost certainly have presented itself: What if all potential energy could be ascribed to the ignoration of hidden cyclic systems? In that case, all energy is kinetic energy, either of the visible masses or of the hidden masses. Kinetic energy can be deduced from space, time and mass by way of the formula E = (1/2)m(dx/dt)², so that this assumption would reduce the number of basic notions to three: space, time, and mass. This would be a conceptual simplification of the energetic image that would have removed the somewhat mysterious concept of localizable potential energy that Hertz had become convinced would lead to paradoxes. I suggest that the above course of events presents a rational and streamlined reconstruction of Hertz’s reflections about mechanics during the spring and summer of 1891, or at least a part of it. Indeed another chain of thought (probably caused by the consideration of non-holonomic constraints) suggested to Hertz that the principle of least action was not the correct basis for mechanics, so he exchanged it for his ‘principle of the straightest path.’
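The ignoration argument can be illustrated by a minimal worked example in modern Lagrangian notation. It is a textbook-style sketch in the spirit of Routh's procedure, not the derivation given by Helmholtz or Hertz. All symbols are assumptions introduced here: x is a visible coordinate, q a hidden cyclic coordinate, m the visible mass, a(x) a coupling function, and p the conserved cyclic momentum.

```latex
\begin{align}
  T &= \tfrac{1}{2} m \dot{x}^{2} + \tfrac{1}{2} a(x)\,\dot{q}^{2}
    && \text{(all energy is kinetic; $q$ itself does not appear)} \\
  p &= \frac{\partial T}{\partial \dot{q}} = a(x)\,\dot{q} = \text{const}
    && \text{(the cyclic momentum is conserved)} \\
  m\ddot{x} &= \tfrac{1}{2} a'(x)\,\dot{q}^{2} = \frac{p^{2}\,a'(x)}{2\,a(x)^{2}}
    && \text{(Lagrange equation for $x$)} \\
  U(x) &= \frac{p^{2}}{2\,a(x)}
    \quad\Longrightarrow\quad
  m\ddot{x} = -\frac{dU}{dx}
    && \text{(same motion with the hidden system ignored)}
\end{align}
```

In this sketch the apparent potential energy U(x) of the visible system equals the kinetic energy (1/2)a(x)(dq/dt)² of the hidden motion, rewritten by means of the conserved momentum p.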
6.5 Chronology of the drafts
The above reconstruction of the events leading Hertz to his own image of mechanics may explain an oddity of the preserved Nachlass pertaining to Hertz’s mechanics. The preserved notes start head on with some preliminary studies of the basic concepts of his new ‘geometry of systems of points’ and go on with a first draft of the book, presented also from the beginning. One would probably have expected Hertz to begin with some investigations of the technically complicated and theoretically central ideas, in particular how one could a posteriori introduce the classical notions of potential energy and force, beginning from his assumptions. Of course, it is possible that such notes once existed but have been lost. Yet, it is more likely that Helmholtz’s arguments showed Hertz how to deal with this central issue. Indeed, the following remark in Hertz’s preface points in this direction: ‘. . . the chapter on cyclical systems is taken almost directly from them (Helmholtz’s above mentioned papers)’ (Hertz 1894, p. xxi). Hertz seems to have been so sure about how to introduce forces that he did not address this question until the second draft. In the Appendix I have given a survey of Hertz’s manuscripts and drafts of the book, as far as they have been preserved. They show that Hertz began his research by writing a series of partly connected manuscripts dealing with his new geometric formulation of mechanics. They correspond to §1–147 of the printed book. Then he drafted a first connected manuscript of the book. However, before dealing with ‘Unfree systems,’ he interrupted (around §417 in the printed book)⁷ and began a new second draft of the material covered in the first draft. To this he added a draft of the rest of the book dealing with unfree systems, force, cyclic systems, conservative systems and discontinuous systems. Hertz wrote two more drafts of the main part of the book and two drafts of the preface and the introduction.
Thus Hertz composed five drafts of the parts concerning the geometry of systems of points, four drafts on the mechanics of free systems, three drafts of the mechanics of unfree systems including the parts about force and potential energy, and two drafts of the preface and the philosophical introduction. This reveals that the interest that has been bestowed on the various aspects of Hertz’s endeavor by historians and philosophers of science is inversely proportional to the attention Hertz devoted to them. Hertz’s diary and letters give us one chronology of the events leading to the composition of The Principles of Mechanics; the manuscripts give us another. How do they match each other? This is not so easy to decide, since the manuscripts are not dated, and the letters and entries in the diary are generally too vague to point to a particular place in the manuscripts. The only place in the diary that seems to allow a cross-reference to the manuscripts is the following sequence of entries:
6 March 1892. On the Hamiltonian differential equation.
16 April 1892. I set to work on mechanics again and worked on it intensively throughout the summer semester.
8 May 1892. Reflections on the ‘straightest distance’ of a system.
12 June 1892. In June I copied the work on mechanics completely again, it begins to take palpable shape. (Hertz 1977, pp. 322–323)
7 He may have decided to rewrite the first part rather than to continue because he had discovered that it would be preferable to introduce the so-called reduced components from the beginning (see Chapter 14).
In order not to be misled by the dates of these entries, we must first note that at least parts of these entries must be back-dated. Indeed, the summer semester began around April 16th, so how could Hertz on that day write (in perfect tense) that he had worked on mechanics during the summer semester (gearbeitet)? A similar remark seems to apply to Hertz’s entry of June 12, about his work during the month of June. Furthermore, we should remark that the entry of March 6th does not talk about Hamilton’s equations in the plural, but about Hamilton’s equation in the singular. This suggests that Hertz was not referring to the canonical differential equations (2.41) and (2.42), §379, in the Mechanics, but rather to the partial differential equation (2.43) (often called the Hamilton–Jacobi equation), satisfied by Hamilton’s characteristic function. Finally, it seems most natural to assume that Hertz mentioned this equation, and the straightest distance in his diary, because he was developing his account of these ideas for the first time. Now, the Hamilton–Jacobi equation appears for the first time in Hertz’s manuscripts on p. 37 of the first draft (Ms 9). There it is presented, in purely geometric garb, as an equation satisfied by the straightest distance S between two positions of a mechanical system (as in §227 of the book). The straightest distance is defined four pages earlier, as in §215 of the book. The subject resurfaces in a dynamical context on the three last pages of the first draft (Ms 9, pp. 51–53). This leads me to the conjecture that Hertz concluded the last 16 pages of the first draft (corresponding to §215–417 of the book) from March to May of 1892 and began over again with the second draft (Ms 12 and 14) which he finished before the end of June of that year. This second draft may have included the first version of the preface and the introduction (Ms 10 and 11).8
This conjecture is corroborated by Lenard’s statement in his preface to the book according to which ‘the general features were settled and the greater part of the book written within a year.’ The 15 months between March 1891, when Hertz began his work on mechanics, and May–June, 1892 is ‘about a year.’
8 However, one may argue that the exact wording of the June 12 entry may suggest that he was working on the 3rd draft of the book. Indeed, ‘copying again’ [aufs neue ganz abgeschrieben] seems to mean that he had already copied the work once. Still, such an interpretation would be at variance with the entries about the Hamiltonian differential equation and the ‘straightest distance.’ To be sure, these ideas are also discussed at the end of the first part of the second draft (Ms 13), but the treatment there is not so different from the treatment in the first draft that it would appear to warrant two separate entries in the diary. I think my original conjecture can be saved by noting that in fact, the first draft (at least the first half of it) was a ‘copy’ of the contents of the preliminary manuscripts (Ms 1–8); so indeed the second draft was ‘copied again’ as Hertz put it.
However, the conjecture does not fit so well with the continuation of the above quote by Lenard: ‘the remaining two years were spent in working out the details’ (Lenard 1884, p. vii). Two years before Hertz’s death brings us to January of 1892, so a literal reading of this part of Lenard’s quote would imply that Hertz had entirely finished his first draft (and perhaps even the second draft) by January of 1892, but this seems hardly in accordance with the diary. If my conjecture is correct, Hertz composed the last two drafts of his book, including at least the second draft of the preface and the introduction during the last 1½ years of his life. Although these drafts follow the second one rather closely, they contain a number of new technical details and deeper philosophical reflections. When we keep in mind that Hertz, from July 27, 1892 was seriously ill and unable to work for long periods, we can understand that he often felt overburdened with work during the last phase of his life.
7 Images of nature
According to Hertz, the aim of theoretical physics is to construct and evaluate images1 that can help us predict future events. These images are not images on canvas or on paper or in books, but they are ‘Innere Scheinbilder oder Symbole.’ This means they are images ‘produced by our mind (unsere besonderen Geistes) and necessarily affected by the characteristics of its mode of portrayal.’ (Hertz 1894, p. 3/2). Yet Hertz seems to take it for granted that these images are intersubjective in the sense that all (educated) humans can form the same images and can describe them accurately (e.g. in a book like Hertz’s own) to one another. What do we depict? First, Hertz wrote that ‘we form for ourselves images or symbols of external objects.’ (Hertz 1894, p. 1/1). In other places, he talked about ‘nature.’ He never addressed the ontology of the external objects explicitly, but it is consistent with what he wrote to assume that he believed in a real existing external world. From this external world, we depict ‘objects’ or ‘things’ as Hertz often wrote. Moreover, Hertz soon included also ‘relations’ between the objects among the things we can depict and, in fact, he seems to have made no clear distinction between the different types of things we can depict. He also used the word ‘image’ to describe both the global image (e.g. of nature) and the local image of one particular object or relation (e.g. ‘force’). For the latter, local images, however, he sometimes used the words ‘sign’ or ‘symbol’ (see the influence from Helmholtz below). In order for an image to serve its predicting purpose, Hertz made the following
Basic requirement: that the necessary consequents of the images in thought are always the images of the necessary consequents in the nature of the things pictured. (Hertz 1894, p. 1/4)
1 Following the English translation of Hertz’s Principles of Mechanics, I shall use the translation ‘image’ for Hertz’s Bild. Perhaps, it would have been more fortunate to use the translation ‘picture.’ Indeed, that would have emphasized that we do not passively receive these images but actively form them. It would also have made it clearer that Hertz’s criterion of distinctness (deutlichkeit) is a metaphor describing a painting.
This is the one and only thing (perhaps in addition to logical consistency (see below)) that Hertz required from an image, and he added: The images which we here speak of are our conception [Vorstellungen] of things. With the things themselves they are in conformity in one important respect, namely in satisfying the above-mentioned requirement. For our purpose, it is not necessary that they should be in conformity with the things in any other respect, whatever. As a matter of fact, we do not know, nor have we any means of knowing, whether our conceptions of things are in conformity with them in any other than this one fundamental respect. (Hertz 1894, p. 2/1–2)
It is interesting to note that Hertz, in the first draft of the introduction, called an image ‘true’ [Wahr] if it satisfies the basic requirement. He later changed this term to ‘correct,’ probably in order not to imply any kind of ontological truths of his images. Because of the weak relation between image and outer world, Schiemann (1998) has characterized Hertz’s image theory as a ‘loss of world in the image.’ All ‘observable’ elements of our image must, of course, correspond uniquely to elements in nature, otherwise the fundamental requirement of an image would not be fulfilled. On the other hand, any element of our image that may not be directly observed (e.g. atoms, forces, the ether, rigid connections…) may or may not correspond to things in nature. According to Hertz, an image cannot help containing such unobservable elements. Empty relations cannot be altogether avoided: they enter into the images because they are simply images. (Hertz 1894, p. 3/2) [And further:] We have felt sure from the beginning that unessential relations could not be altogether avoided in our images. (Hertz 1894, p. 15/12)
Hertz required that such inessential relations, or hypothetical unobservables, be kept to a minimum in our image. Still, he clearly stated that we cannot a priori demand from nature simplicity nor can we judge what, in her opinion, is simple…. Hence our requirement of simplicity does not apply to nature, but to the images thereof which we fashion. (Hertz 1894, p. 28/33–34)
In his Kiel Lectures of 1884, Hertz expressed his belief that the world is comprehensible ‘since what really exists [tatsächlich ist] cannot, in my opinion, be incomprehensible’ (Hertz 1999, p. 33). But contrary to many of his famous predecessors, he had no metaphysical or religious convictions about its simplicity. Thus a requirement that the image be as simple as possible was not a way to make the image more true. Truth of images, in addition to what is required by the fundamental requirement, is simply not an issue for Hertz. In this sense, Hertz’s theory was a final high point in what Schiemann has described as ‘an increasing hypothesization of scientific propositions.’ (Schiemann 1998, p. 28). It is, of course, true that earlier physicists had allowed concepts into their theories that they did not necessarily consider to be in true correspondence with nature. For example, Newton did not think of actions at a distance as the final truth about the causes of celestial mechanics. Still, until the middle of the nineteenth century, there had been a general belief that physical theories were (or ought to be) true about the world. Thomson and Maxwell’s ideas about reasoning by analogy and description of
mechanical models changed this general belief. Hertz’s image theory is just one step further in this direction.
7.1 A comparison of Hertz’s and Helmholtz’s signs and images
As pointed out by Schiemann (1998), Heidelberger (1998), and others, Hertz owed many ideas of his image theory to Helmholtz. In his physiological, philosophical and psychological papers, Helmholtz had himself developed a theory of how our minds form signs and images of the outer world. His theory of how we form such signs from our sense experiences changed over time. At first he attributed this to an a priori law of causality, and later to a presupposition of the law-likeness of all the appearances of nature (Friedman 1997). What matters here is more what Helmholtz wrote about the nature of the signs we make. In his classic statement of his sign-theory from The Facts in Perception of 1878, he wrote:
In so far as the quality of our sensation gives us a report of the character of the external influence through which it is excited, it may count as a sign [Zeichen] of the latter, but not as an image [Abbild]. For of an image one requires some kind of sameness with the pictured [abgebildeten] object, of a statue sameness of form, of a delineation sameness of perspective projection in the visual field, of a painting also sameness of color. But a sign needs to have no kind of similarity at all with that of which it is a sign. The relation between the two is limited to the fact that the same object, exerting an influence in the same circumstances, calls forth the same sign, and thus that different signs always correspond to different influences. To the popular opinion, which accepts in good faith the full truth of images [Bilder], this remainder of similarity, which we do recognize, may appear very insignificant. In reality, it is not so; for with it a matter of the very greatest importance can still be achieved, namely, the picturing [Abbildung] of the lawlikeness in the processes of the actual world. (Helmholtz 1878, translated in (Friedman 1997))2
Here, Helmholtz made clear that the popular opinion of images is very different from Hertz’s later use of the word. Hertz’s image corresponds more closely to Helmholtz’s sign. This becomes even clearer in an earlier quote from Helmholtz’s Handbuch der Physiologischen Optik 1857–67:
I believe, therefore, that there can be no possible sense at all in speaking of any other truth for our representations except a practical [truth]. Our representations of things can be nothing else at all except symbols, naturally given signs for things, that we learn to use for the regulation of our motions and actions. When we have correctly learned to read such a symbol, we are then capable of so adjusting our actions with its help that they have the desired result, that is, the expected new sensations occur. Another comparison between representations and things not only fails to exist in actuality – here all schools agree – but any other kind of comparison is in no way thinkable and has no sense at all. (Helmholtz 1857/67, translated in (Friedman 1997))
2 Schriften zur Erkenntnistheorie (1921), p. 115; Epistemological Writings (1977), pp. 121–122.
Here the relation between symbol and sensation is very similar to Hertz’s fundamental requirement of an image, and the final insistence that this is the only thinkable relation is paralleled in Hertz’s quote above. In the same book, Helmholtz even gave a similar or even clearer description of ‘images’ (they seem to be collections or successions of signs), a term that he here used in a manner very similar to Hertz’s: Thus representations of the external world are images [Bilder] of the lawlike temporal succession of natural events, and if they are correctly formed [gebildet] in accordance with the laws of our thinking, and we are able correctly to translate them back again into actuality through our actions, then the representations that we have are also the uniquely true [ones] for our faculty of thought; all others would be false. (Helmholtz 1867, p. 22)
As these quotes indicate, Hertz probably owed to Helmholtz his general idea of an image and his fundamental requirement of an image. Yet there are several differences. First, as pointed out by Schiemann, Hertz’s images are removed further from the depicted in the sense that Helmholtz thought there was only one correct image of the external world (see the quote above). Hertz, on the contrary, believed that there were many correct and logically permissible images of the same part of the external world. This has to do with Hertz’s inclusion into his image of symbols of unobservable entities. In Helmholtz’s image, every symbol is a symbol of one unique thing or sensation. In Hertz’s version, however, neither the content of a theory [image], nor its principles, concepts, and laws, but only its results, can be linked to the external world (Schiemann 1998, p. 20). Where Helmholtz was interested in explaining how we form and gradually build up the signs of the external objects, Hertz was entirely uninterested in these physiological and psychological questions. Where Helmholtz was interested in sign formation from the simplest stage (children’s first cognition of the world) to the scientific level, Hertz was only interested in the latter, and not in the scientific historical process that has led to our scientific images, nor the pedagogical issues of how best to teach them, but solely in the question of logically perfecting already existing scientific images and in comparing competing images. ‘Contrary to Helmholtz, Hertz did not advance an inductivist conception of science, but a deductivist one’ (Schiemann 1998, p. 30). Another difference, pointed out by Schiemann, is that Helmholtz tended to remove the sharp ‘distinction between a priori supposed laws of thought and those empirical propositions that are capable of revision’ (Schiemann 1998, p. 30). As we shall see below, Hertz upheld such a Kantian distinction. 
Yet, Hertz’s demand (discussed below) that images, in addition to fulfilling the basic requirement, must be in accordance with the laws of thought, had a parallel in the last quote from Helmholtz.
7.2 Correctness
Hertz set up three, now famous, criteria for the evaluation of images: logical permissibility, correctness, and appropriateness. The second of these is, in fact, just
a repetition of the first fundamental requirement: We shall denote as incorrect any permissible images, if their essential relations contradict the relations of external things, i.e., if they do not satisfy our first fundamental requirement. (Hertz 1894, p. 2/2)
Schiemann interprets this as a loosening of the first fundamental requirement: This criterion restricts the agreement of consequents necessary in thought and consequents necessary in nature (‘first fundamental requirement’) to ‘essential relations.’ ‘Essential’ in this context are exactly those successions which for whatever reason claims to be empirically verifiable. (Schiemann 1998, p. 32)
Admittedly I do not see how the inessential relations could be affected at all by the first fundamental requirement, so I think Hertz’s correctness requirement is equivalent to the first fundamental requirement. This requirement, however, does not postulate ‘agreement of consequents necessary in thought and consequents necessary in nature’ as stated by Schiemann. It only postulates that ‘the consequents of the images must be the images of the consequents,’ but not the converse, i.e. that the images of the consequents must be the consequents of the images. In other words, those parts of nature that are described by our image must be correctly described, but there may be things in nature that are not described by our image at all. The consequents in our image must be a subset of the images of the consequents in nature, but it may very well be a proper subset. As we shall see below, it is the ‘distinctness’ of the image that describes how big a subset of nature our image describes. Hertz believed that ‘without ambiguity we can decide whether an image is correct or not but only according to the state of our present experience, and permitting an appeal to later and riper experience.’ (Hertz 1894, p. 3/3). He did not raise the question of the theory dependence of our experience, but he made it quite clear that correctness may change over time. He explicitly admitted that, contrary to the beliefs of some of his contemporaries, even such an established field as mechanics could be falsified by future experiments: ‘that which is derived from experience can again be annulled by experience.’ (Hertz 1894, p. 11/9). Yet, just as he did not explain how images are derived from experience, he did not clearly explain how they could be annulled.
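The one-directional reading of the basic requirement argued for in this section can be put in symbols. What follows is only a sketch in notation that is not Hertz’s: the depiction $d$ (assigning to a pictured state of things its image) and the operators $\mathrm{Cons}_N$ and $\mathrm{Cons}_I$ (necessary consequents in nature and in thought, respectively) are assumptions introduced purely for illustration.

```latex
% Notation assumed for illustration only; none of it appears in Hertz.
% x               : a state of the things pictured, with image d(x)
% Cons_N(x)       : the necessary consequents of x in nature
% Cons_I(d(x))    : the necessary consequents of d(x) in thought
% d(Cons_N(x))    : the images of those consequents in nature that are depicted
\[
  \mathrm{Cons}_I\bigl(d(x)\bigr) \;\subseteq\; d\bigl(\mathrm{Cons}_N(x)\bigr)
  \qquad \text{(basic requirement, i.e. correctness)}
\]
% The converse inclusion is not demanded, so the subset may be proper:
\[
  \mathrm{Cons}_I\bigl(d(x)\bigr) \;\subsetneq\; d\bigl(\mathrm{Cons}_N(x)\bigr)
  \qquad \text{is compatible with correctness.}
\]
```

On this reading, how much of the right-hand side is captured by the left-hand side is a matter of distinctness, not of correctness.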
7.3 Logical permissibility
We should at once denote as inadmissible all images which implicitly contradict the laws of our thought. Hence, we postulate in the first place that all our images shall be logically permissible – or, briefly, they shall be permissible. (Hertz 1894, p. 2/2)
It is no accident that Hertz put this requirement first on his list, before correctness. It was the one he stressed the most and the property he found most wanting in earlier works on mechanics. He concluded his preface with the words: What I hope is new, and to this alone I attach value, is the arrangement and collocation of the whole – the logical or philosophical aspect of the matter. According as it marks an advance in this direction or not, my work will attain or fail of its object. (Hertz 1894, p. xxxii)
Moreover, having described the essential features of his own new images of mechanics, Hertz continued: I think that as far as logical permissibility is concerned it will be found to satisfy the most rigid requirements, and I trust that others will be of the same opinion. This merit of the representation I consider to be of the greatest importance, indeed of unique importance. Whether the image is more appropriate than another; whether it is capable of including all future experience; even whether it only embraces all present experience, all this I regard almost as nothing compared with the question whether it is in itself conclusive, pure and free from contradiction. For I have not attempted this task because mechanics has shown signs of inappropriateness in its applications, nor because it in any way conflicts with experience, but solely in order to rid myself of the oppressive feeling that to me its elements were not free from things obscure and unintelligible. (Hertz 1894, p. 39/33)
Hertz clearly believed that our laws of thought are unproblematically given to us a priori: To the question whether an image is permissible or not, we can, without ambiguity answer yes or no; and our decision will hold good all time. (Hertz 1894, p. 3/3)
The timelessness of this objective answer distinguishes permissibility from correctness. Hertz did not explain how he imagined one should be able to prove the logical permissibility of an image, and indeed this question was not seriously addressed, even among mathematicians, before Hilbert developed his Beweistheorie, and Gödel showed that even within formal logic one cannot give such proofs. Some nineteenth-century mathematicians (e.g. the Danish mathematician Petersen3) had argued even within a less formal framework that consistency proofs were unattainable. Hertz seems to have been more optimistic, at least in principle. In order to highlight logical permissibility, Hertz used the axiomatic deductive method: By way of giving expression to my desire to prove the logical purity of the system in all its details, I have thrown the representation into the older synthetic form. For this purpose, the form used has the merit of compelling us to specify beforehand, definitely even if monotonously, the logical value which every important statement is intended to have. This makes it impossible to use the convenient reservations and ambiguities into which we are enticed by the wealth of combinations in ordinary speech. But the most important advantage of the form chosen is that it is always based upon what has already been proved, never upon what is to be proved later on: thus we are always sure of the whole chain if we sufficiently test each link as we proceed. In this respect I have endeavored to carry out fully the obligations imposed by this mode of representation. (Hertz 1894, pp. 41–42/35)
3 See (Lützen 2001) and (Lützen et al. 1992).
Hertz often, in fact constantly, interrupted his mathematical presentation with philosophical explanations of the physical meaning of the mathematics or its proper place in the deductive structure, but he never tried to render the definitions or axioms plausible. Ordinary textbooks in mechanics, and, e.g. Maxwell’s Treatise, were (and are) usually written in a mixed inductive–deductive style that Hertz (correctly) considered
to blur the logical structure of the theory. As Hertz himself admitted, the austere deductive style made his Mechanics unsuitable as a textbook for beginners. The great amount of work that Hertz put into his Mechanics bears witness to his insistence on logical clarity. In his younger days he did not pay so much attention to consistency of the presentation. As pointed out by Buchwald ((Buchwald 1994, Chapter 9) and (Buchwald 2003)) Hertz’s 1882 paper on evaporation shows that he did not take the time to rewrite large parts of the paper even after he discovered that his experiments did, in fact, not show the new effect he had announced. Instead he simply added a new section that partly contradicted the conclusions in the previous part of the paper. This somewhat careless but time-saving approach to scientific paper writing contrasts starkly with his later very careful and time-consuming work on the Principles of Mechanics. As mentioned in Chapter 6 Hertz rewrote central parts of that book four or five times! The difference in approach to the composition of the two scientific texts can partly be explained sociologically. In 1882 Hertz was still a student, and it was important for him to produce scientific papers quickly. In the 1890s Hertz was a world-famous celebrity and his works were expected to be something extraordinary. But also the subject matter of mechanics required more care. The 1882 paper represented a new experiment and a possible new effect. Style and consistency were of secondary importance. But mechanics was a well-known discipline so here it was very important for Hertz to make the presentation entirely consistent and flawless. Yet, although Hertz carried out his deductive plan very carefully, and never based his arguments on unstated physical principles, his argumentation was not (mathematically) rigorous in the sense that had been promoted by the mathematicians in Berlin when he studied there. 
There are no εs or δs in the continuity arguments (that are, in fact, often differentiability arguments). Hertz freely operated with infinitely small differentials, and in his theory of matter, he summed infinitely many infinitely small quantities to give a finite result. Finally, he never specified the mathematical (geometric and arithmetic) axioms on which his mechanics was based. In his letters to his parents he often called his work on mechanics his ‘mathematical work,’ but he was not interested in mathematical subtleties. It was the logical connections between the principles of mechanics that he wanted to clear up.
7.4 Appropriateness
But two permissible and correct images of the same external objects may yet differ in respect of appropriateness. Of two images of the same object that is the more appropriate which pictures more of the essential relations of the object, – the one which we may call the more distinct. Of two images of equal distinctness the more appropriate is the one which contains, in addition to the essential characteristics, the smaller number of superfluous or empty relations, – the simpler of the two. (Hertz 1894, p. 2/2)
It is the criterion of appropriateness that allows Hertz to find the ‘best’ among a selection of permissible and correct images. The most appropriate one is the best. In this way, Hertz ends up with a unique (best) image just as Helmholtz. However,
he is not so optimistic about the unambiguity of the choice, as he was in the case of permissibility and correctness: But we cannot decide without ambiguity whether an image is appropriate or not; as to this differences of opinion may arise. One image may be more suitable for one purpose, another for another; only by gradually testing many images can we finally succeed in obtaining the most appropriate. (Hertz 1894, p. 3/2)
The criterion of appropriateness is separated into two subcriteria: distinctness and simplicity.
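Read together, the two subcriteria amount to a two-step comparison: distinctness first, simplicity only as a tie-breaker. A rough formalization is given below. It rests on an assumption that Hertz did not make, and which his own remark on the ambiguity of appropriateness cautions against, namely that the relations contained in an image $A$ can be collected into a set $E(A)$ of essential relations and a set $S(A)$ of superfluous or empty ones. Both symbols are hypothetical and serve illustration only.

```latex
% E(A), S(A) are hypothetical sets introduced for illustration only.
% A and B are permissible and correct images of the same object.
\[
  A \text{ is more distinct than } B
  \iff E(B) \subsetneq E(A),
\]
\[
  A \text{ is simpler than } B
  \iff E(A) = E(B) \ \text{ and } \ |S(A)| < |S(B)|,
\]
\[
  A \text{ is more appropriate than } B
  \iff A \text{ is more distinct than } B \ \text{ or } \ A \text{ is simpler than } B .
\]
```

Images whose sets of essential relations are not comparable by inclusion are left unranked by this sketch, which agrees with Hertz’s remark that one image may be more suitable for one purpose and another for another.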
7.4.1 Distinctness
As we made clear in the discussion of ‘correctness,’ Hertz did not require that an image depict all relations of the piece of nature in question. But the more relations it depicts, the more distinct it is. That is, the more distinct an image is, the more forceful it will be in predicting the future, because it contains more consequences in the image. Presented in this way, distinctness is an obvious virtue of an image. But let us take one of Hertz’s own examples to show that it is not so obvious after all. In his discussion of the Newtonian–Laplacian image of mechanics, Hertz criticized the image for allowing too many types of forces, forces that do not seem to exist in nature. Allowing such forces makes the image less distinct than it could otherwise be. He listed the following properties that forces in nature seem to have, but which are not postulated in the Newtonian–Laplacian image:
1. They are conservative, in the sense that they lead to the principle of the conservation of energy.
2. They can be decomposed into a sum of mutual actions between infinitely small elements of matter.
3. These elementary forces are ‘independent of absolute time and place.’
4. They are central.
5. They are determined only by the distance (or perhaps velocity and acceleration) of the elements of matter (Hertz 1894, p. 12/10).
Hertz admitted that the last two points were up for debate, but he insisted upon the desirability of limiting the range of forces in order to increase distinctness. Seen in this light, distinctness is not such an obvious virtue of an image. In fact, one may argue that a presentation of mechanics is better the more general it is. If it can deal with many types of forces, also some that we have not yet detected in nature, it will be more robust for future discoveries. Isn’t a more general theory better than a less general one? No, not according to Hertz.
His criterion of distinctness is a criterion of a physicist who wants to make his theory as vulnerable as possible to falsification, not the criterion of a mathematician imbued with a love for generality. Yet, it is a criterion of an ultratheoretical physicist. Recall that Thomson and Maxwell had praised the Lagrangian formulation for its generality that made it applicable to all sorts of special physical situations. Even Jacobi, whose interest in mechanics was of a very theoretical nature, counted it as a virtue of his treatment of the Lagrange formalism that it could deal with non-conservative forces. Indeed, even if one believes that on the microscopic level all forces are conservative, one would often want to have a macroscopic description in which energy, which is transformed to heat is considered
as being dissipated (lost): This is clearly the point of view one would like to take in practical applications of mechanics, but as Hertz pointed out … we have only spoken of appropriateness in a special sense – in the sense of a mind which endeavors to embrace objectively the whole of our physical knowledge without considering the accidental position of man in nature, and to set forth this knowledge in a simple manner. The appropriateness of which we have spoken has not reference to practical applications or the needs of mankind. (Hertz 1894, p. 46/40)
Here we encounter a tension in Hertz’s image theory. On the one hand, he acknowledged that as humans, we cannot hope to learn the truth about nature: the best we can do is to make an image of it in our minds. On the other hand, the appropriateness of the images is judged without considering the accidental position of man in nature. A comparison with axiomatic systems in mathematics may help bring out what Hertz wanted to gain by his requirement of distinctness. In mathematics two types of axiomatic systems are useful. One is the general (non-distinct) type such as the axiom system for a group, the other is the specific (distinct) type, e.g. the axiom system of Euclidean geometry, or the real or natural numbers. The first kind are useful for their generality, the second for their distinctness. Hertz’s aim was to create an image of mechanics like the last type of axiomatic systems. The reason why Jacobi, Thomson and Maxwell praised the Lagrangian version is that it was general and purposely non-distinct, like the first kind of mathematical axiom systems. When Hertz, in the definition of distinctness, talked about picturing ‘more of the essential relations of the object,’ he did not think of an extension of the depicted area to a larger domain of the external world. For example, he did not want to judge an image of space (geometry) with an image of motion in space (mechanics). His concern was to compare images of the same object (area) as to their ability to account for as much of the object as possible. Only in one place did he implicitly refer to the coexistence of images of different parts of the external world, namely in his discussion of Hamilton’s principle where he stated: In order that an image of certain external things may in our sense be permissible, not only must its characteristics be consistent amongst themselves, but they must not contradict the characteristics of other images already established in our knowledge. (Hertz 1894, p. 27/22–23)
In this quote, Hertz clearly did not speak of an already existing image of the same thing. It is clear that a new image of mechanics, e.g. Hertz’s own, may be logically incompatible with an image, e.g. the Newtonian–Laplacian that it is going to replace. In fact, in the introduction to Electric Waves Hertz stressed that one should be careful not to mix two competing images: Hence for a proper comprehension of any one of these [representations of Maxwell’s theory], the first essential is that we should endeavor to understand each representation by itself without introducing into it the ideas which belong to another. (Hertz 1892, p. 23/27)
Thus, the already established images that Hertz mentioned in the quote above must be images of other (more general) areas of nature. For example, an image of mechanics must be in harmony with our already established image of geometry, or perhaps with some principles of metaphysics (such as causality, which is in question in Hertz’s discussion of Hamilton’s principle).
It is useful to think of ‘distinctness’ (Deutlichkeit) in connection with the ‘image’ or ‘picture’ metaphor, from which Hertz borrowed it: An image is more distinct when it shows more details. A naturalistic, accurate painting is more distinct than an expressionistic or impressionistic one, not because it is a painting of a larger section of reality, but because it allows one to see more details of the common object depicted. In modern language, an image is more distinct than another if it has more pixels.
7.4.2 Simplicity
Of two images of equal distinctness, the more appropriate is the one which contains, in addition to the essential characteristics, the smaller number of superfluous or empty relations – the simpler of the two. (Hertz 1894, pp. 2–3/2)
What is the difference between essential characteristics (relations) and inessential or superfluous ones? Hertz did not give a definition of this distinction, but it can be inferred from the formulation of the requirement of correctness, to the effect that the essential relations do not contradict the external things. This seems to imply that the essential relations are the empirically testable ones, whereas the inessential ones cannot be empirically tested. In the discussion of forces in the Newtonian–Laplacian image of mechanics, Hertz called the latter ‘idle wheels’:
It cannot be denied that in very many cases the forces which are used in mechanics for treating physical problems are simply free running idle wheels [4] which keep out of the business altogether when actual facts have to be represented. (Hertz 1894, p. 14/12)
[4] ‘Leergehende Nebenräder.’ In the printed English translation it is misleadingly rendered as ‘sleeping partners.’
The use of the phrase ‘free running idle wheel’ for the empty relations may have been inspired by Maxwell’s model of the electromagnetic field in the ether (Chapter 3), even though the idle wheels in Maxwell’s model are far from free running. They play an essential role in describing the displacement current. Hertz’s requirement of simplicity is thus in its strict sense a requirement of the minimization of free running idle wheels or empty relations. In many places in the introduction, however, Hertz wrote about simplicity in a way that is hard to reconcile with such a strict reading. For example, on p. 28/23–24, he rejected Hamilton’s principle as the fundamental law of mechanics because ‘the actual relations between the things can only be represented by complicated relations, which are not even intelligible to an unprepared mind’ (Hertz 1894, p. 28/24). Hertz considered this a violation of ‘simplicity.’ Here, simplicity is not so much a question of empty relations; it is rather the opposite of ‘conceptual and mathematical complication.’ It is also questionable whether ‘simplicity’ should be understood in its strict sense when Hertz declared the mechanistic image of nature to be the simplest:
In this sense, the fundamental ideas of mechanics, together with the principles connecting them, represent the simplest image which physics can produce of things in the sensible world and the processes which occur in it. (Hertz 1894, p. 4/4)
In passing, we can remark that here ‘image’ is used in a slightly different meaning. The question here is not one of making an image of mechanics, but of mechanics being an image of other physical phenomena. Thus, Hertz sometimes used ‘simplicity’ in a more general sense than the strict sense of avoiding idle wheels. And although he did not list any criteria other than distinctness and simplicity (in this strict sense) as being part of appropriateness, he sometimes involved other criteria in his evaluation of an image. For example, his arguments in favor of the mathematical form of his own image (the geometry of systems of points) involve such properties as intuitive clarity, elegance, and beauty (Lützen 1998, p. 105).
7.5 The relation among the criteria
After a four-page general discussion of the image theory, Hertz devoted the remaining 65 pages of the Introduction to a comparison of three images of mechanics: the Newtonian–Laplacian, the energetic, and his own image. In the light of the general discussion, one would have imagined the following clear-cut structure of such a comparison: First, an absolute evaluation of the logical permissibility of the three images (valid for all times), and an exclusion of the non-permissible images. Then an absolute evaluation of the correctness (as of 1894) of the permissible images and an exclusion of incorrect images. Finally, a comparison of the appropriateness of the remaining permissible and correct images, starting with an evaluation of distinctness and ending with an evaluation of the simplicity of the equally (maximally) distinct images. In fact, this is how Hertz himself stated one should proceed when dealing with mature knowledge (such as mechanics by 1894):
Mature knowledge regards logical clearness as of prime importance: only logically clear images does it test as to correctness; only correct images does it compare as to appropriateness. (Hertz 1894, p. 11/10)
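The clear-cut, three-stage comparison described above can be written out as a small selection procedure. The following is only an illustrative sketch, not anything found in Hertz: it assumes that permissibility and correctness are available as simple yes/no verdicts and that distinctness and the number of empty relations can be scored as integers. The names `Image` and `most_appropriate`, and all the values in the usage example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    """A candidate image, reduced to hypothetical verdicts and scores."""
    name: str
    permissible: bool      # assumed yes/no verdict on logical consistency
    correct: bool          # assumed yes/no verdict on agreement with experience
    distinctness: int      # assumed score: higher means more essential relations pictured
    empty_relations: int   # assumed count of superfluous ("idle wheel") relations


def most_appropriate(images):
    """Return the image(s) that survive the three-stage comparison."""
    # Stage 1: exclude logically impermissible images.
    permissible = [im for im in images if im.permissible]
    # Stage 2: only permissible images are tested for correctness.
    correct = [im for im in permissible if im.correct]
    if not correct:
        return []
    # Stage 3a: keep the maximally distinct images.
    top = max(im.distinctness for im in correct)
    distinct = [im for im in correct if im.distinctness == top]
    # Stage 3b: among equally distinct images, prefer the fewest empty relations.
    fewest = min(im.empty_relations for im in distinct)
    return [im for im in distinct if im.empty_relations == fewest]


# Usage with made-up candidates (labels and numbers are purely illustrative).
candidates = [
    Image("A", permissible=False, correct=True, distinctness=9, empty_relations=0),
    Image("B", permissible=True, correct=False, distinctness=9, empty_relations=0),
    Image("C", permissible=True, correct=True, distinctness=5, empty_relations=1),
    Image("D", permissible=True, correct=True, distinctness=7, empty_relations=4),
    Image("E", permissible=True, correct=True, distinctness=7, empty_relations=2),
]
print([im.name for im in most_appropriate(candidates)])
```

The sketch makes visible how much the ideal procedure presupposes: each stage needs a decidable verdict or a well-defined score before the next stage can even begin.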
In several discussions of Hertz’s Mechanics (among them some of my own), it is stated that this is what Hertz did and that he finally chose his own image because it was simpler than the other two – either because it did not operate with the concepts ‘force’ or ‘energy’ (which are then in fact free moving idle wheels, because they can be avoided) or because it only operated with one, simply formulated fundamental law. In fact, however, Hertz’s comparison of the three images of mechanics was much more complex. First, he did not in fact discuss permissibility and correctness as absolute criteria, i.e. criteria whose fulfilment one could decide with a simple yes or no, for a given image. Instead he compared the three images with regard to permissibility
and correctness, arguing whether one was more permissible (or correct) than another. On the other hand, he sometimes treated appropriateness as an absolute property, although in the general section it was only presented as a relative one. For example, he asked about the Newtonian–Laplacian image: ‘Is this image perfectly distinct?’ (Hertz 1894, p. 12/10). Although perfect distinctness is not introduced in the beginning of the Introduction it is clear what Hertz meant here and indeed he continued to spell it out: ‘Does it contain all the characteristics which our present knowledge enables us to distinguish in natural motions.’ (Hertz 1894, p. 12/10). Absolute distinctness, therefore, corresponds to what Hilbert later called completeness of an axiomatic system for a branch of mathematics (or physics). Moreover, he asked of the same image: ‘Is our image simple?’ adding, ‘Is it sparing in unessential characteristics – ones added by ourselves, permissibly and yet arbitrarily, to the essential and natural ones?’ (Hertz 1894, p. 13/11). It is less obvious what could be understood by absolute simplicity, since Hertz himself underscored that inessential relations (idle wheels, unobservables) cannot be altogether avoided. And Hertz’s answer does not clarify the matter because it consists in pointing to idle wheels that can be avoided. So Hertz blurred the clear distinctions between the absolute and the relative criteria for images. He even, in many cases, changed the order of the discussion of the criteria. For example, when he discussed the energetic image, he began with its appropriateness, then turned to its correctness and finally to its permissibility. Moreover, in several cases, he was not always clear about whether a particular criticism was directed against the permissibility, the correctness, or the appropriateness of an image. 
For example, he first presented his criticism of forces in the Newtonian–Laplacian image as a criticism of the permissibility of the image, but finally concluded that it is a problem that concerns its appropriateness. Similarly, his criticism of Hamilton’s principle, in the energetic image, is first presented as a problem of correctness, but is then interpreted as a problem of appropriateness.
There are probably many reasons for this somewhat confusing comparison of the three images of mechanics. First, there are stylistic reasons: If he had proceeded according to his own general plan, he would not have had the chance to present all the arguments he wanted to present. For example, if he had simply argued that the energetic image is incorrect, because Hamilton’s principle gives the wrong trajectories for systems with non-holonomic constraints, his discussion should, according to his general principles, have stopped there, and he would never have had the chance to praise its superior appropriateness vis-à-vis the Newtonian–Laplacian image. This would have cut out his discussion of atomism, a problem he clearly wanted to discuss. The straightforward use of the general principles would also have made the discussion of the Newtonian–Laplacian image very short, at least if Hertz had decided that its use of the concept of force was impermissible.
Secondly, Hertz must have realized that absolute judgements about permissibility and correctness are not as easy to make as his earlier claim suggested. As pointed out above, he did not present any methods to check permissibility and correctness, and when dealing with a non-axiomatically described image such as the Newtonian–Laplacian and the not fully developed energetic image, it is hard to see how he could have imagined such methods. Even when dealing with his own axiomatic image, he admitted that the deductive style did not in itself guarantee absolute permissibility:
… it is obvious that the form by itself is no guarantee against error or oversight; and I hope that any chance defects will not be more harshly criticised on account of the somewhat presumptuous mode of presentation. I trust that any such defects will be capable of improvement and will not affect any important point. (Hertz 1894, p. 42/35)
As we have seen in Chapter 5, Hertz was actually nervous that he might have committed such errors. Any kind of proof theory was, of course, not considered by Hertz. Without any workable algorithm for deciding absolute permissibility and correctness (as of 1894), Hertz had to fall back on the relative comparison of the images as to their permissibility and correctness. Thirdly, although Hertz presented the three criteria: permissibility, correctness, and appropriateness, as clearly distinguishable criteria, there are, in fact, many subtle relations between them, relations that turn up in Hertz’s discussion of the three images of mechanics. First, permissibility and correctness are related, formally by the fact that an impermissible image, i.e. an image that contains a contradiction, allows one to draw any conclusion, and of course, they can not all be correct representations of the outer world. Thus, according to such a formal argument, correct images must be permissible. This reflects Hertz’s own general rule that only permissible images be checked for correctness. It is by no means obvious that it is possible to make permissible and correct images of nature at all, but Hertz claimed that this was an empirical fact: In order that this requirement (the first fundamental requirement of correctness) may be satisfied, there must be a certain conformity between nature and our thought. Experience teaches us that the requirement can be satisfied, and hence that such a conformity does in fact exist. (Hertz 1894, p. 1/1)
So, according to Hertz, natural phenomena can be imaged in a logically permissible form. However, he did not explicitly mention the above formal argument according to which impermissibility will lead to incorrectness. On the contrary, he almost argued the converse relation (Hertz 1894, p. 11/9), stating that in the early formative phases of a science, it may be advantageous to leave the logical structure somewhat vague because that would make it possible to make more correct predictions. This is, of course, true, but it is equally true that it allows many incorrect predictions as well. Hertz actually seems to have thought (Hertz 1894, pp. 9–10/8) that it is possible to have a correct image that is logically impermissible, as long as the inconsistencies are limited to the inessential characteristics that we have ourselves arbitrarily worked into the essential content given by nature. Thus, as long as the contradictions do not involve the essential observable elements, they do not, according to Hertz, have to lead to incorrectness. This shows that his ideas about our laws of thought were far from the modern conception of formal logic, according to which a contradiction anywhere in the system will allow one to deduce every statement (and its negation). Still, even for Hertz there is some relation between permissibility and correctness.
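The modern principle mentioned above, that a contradiction anywhere in a system allows one to deduce every statement, can be made explicit in a short derivation. This is a standard argument of classical propositional logic and is not found in Hertz; here P stands for any statement that the system both asserts and denies, and Q for an arbitrary statement.

```latex
\begin{align*}
&1.\quad P        && \text{premise}\\
&2.\quad \neg P   && \text{premise}\\
&3.\quad P \lor Q && \text{from 1, by disjunction introduction}\\
&4.\quad Q        && \text{from 2 and 3, by disjunctive syllogism}
\end{align*}
```

Since Q was arbitrary, the same steps also yield its negation.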
Secondly, it is essential to Hertz’s argumentation that there is a relation between simplicity and permissibility. Lack of simplicity will often lead to impermissibility. In fact, Hertz’s main method to obtain permissibility is to strip the image of as many inessential relations as possible. This, of course, is the central idea of the axiomatic method. In his discussion of the concept of force in the Newtonian–Laplacian image, he expressed the idea in this way: But we have accumulated around the terms ‘force’ and ‘electricity’ more relations than can be completely reconciled amongst themselves. We have an obscure feeling of this and want to have things cleared up. Our confused wish finds expression in the confused question as to the nature of force and electricity. But the answer which we want is not really an answer to this question. It is not by finding out more and fresh relations and connections that it can be answered; but by removing the contradictions existing between those already known, and thus perhaps by reducing their number. When these painful contradictions are removed, the question as to the nature of force will not have been answered; but our minds, no longer vexed, will cease to ask illegitimate questions. (Hertz 1894, p. 9/7–8)
That is why a problem, like the problem surrounding the concept of force in the Newtonian–Laplacian image, may be considered both as a question of permissibility and of simplicity. If the logical problems of an image can be solved just by leaving out inessential relations or slightly reformulating others, then the problem may be diagnosed as one of simplicity; if not, then the problem is one of permissibility. This seems, more or less, to be Hertz’s standpoint. Thirdly, correctness and appropriateness are related. For example, instead of considering rolling motion as incorrectly described by Hamilton’s principle, one may consider rolling motion as a motion with a little slipping, a kind of motion that is correctly described by Hamilton’s principle. This would make the description less simple, but correct. Although Hertz advanced strong arguments against such a reformulation (one related to what we, in Hadamard’s terminology, would call the well-posedness of the problems of mechanics), he finally preferred ‘that the doubt is one which affects the appropriateness of the system, not its correctness … .’ (Hertz 1894, p. 25/21)
8 Hertz’s earlier ideas about images
8.1 Images in the Kiel Lectures
Hertz’s image theory, as presented in the introduction to his Mechanics, has earned him fame as a ‘modern philosopher’ (Baird et al. 1998). However, it is somewhat ironic that Hertz initially developed a rudimentary version of this theory as a defense against philosophers. He presented this initial version of his image theory in the introduction to the Kiel Lectures. In the course of a historical survey of the main questions and answers related to the constitution of matter, Hertz pointed out that this field had in earlier times been the domain of the philosophers, but had now to a large extent been taken over by the physicists (and chemists). However, according to Hertz, a physicist interprets the aims and questions concerning the constitution of matter differently from a philosopher and gives different answers. ‘We (Hertz, the physicist and his philosopher opponent), not only can, but must, pursue our goals independently of each other’ (Hertz 1999, p. 33). The goal of the physicist is correctness; for the philosopher it is logical consistency (Hertz 1999, pp. 32–33). The physicist investigates the facts of nature (experimentally); the philosopher investigates the difficulties that the human mind encounters while trying to understand them (Hertz 1999, p. 32). Hertz wanted to investigate the constitution of matter from the point of view of a physicist, but he was well aware that such an approach was open to the criticism of the philosophers. In fact, even if the philosophers were sympathetic to the goals of the physicist, they probably would object to the conclusions. For example, if the physicist concludes that matter is made up of atoms like small balls (of diameter 10⁻⁶ mm), immersed in an ether that conducts light waves, the philosophers would probably object:
1. Since atoms are not points we can imagine that they be further divided, and we are therefore no closer to understanding the constitution of matter.
2. We cannot conceive of small balls without attributing some color to them. However, since atoms are much smaller than the wavelength of light, this makes no sense. Similarly, it makes no sense to make visual images of the light waves in the ether because they must themselves be invisible.
In other words, any idea (Vorstellung) of the constitution of matter (atoms) must contain a logical flaw because it cannot avoid appealing to sensible properties that are inapplicable on the atomic level.
Hertz put forward the first version of his image theory as the physicist’s way to anticipate this kind of philosophical criticism. First he argued that even if we strip our ideas about matter of all those properties that our mind has added in order to create a visual image, there will still remain a core of conceptually defined quantities which are connected among themselves, and to the macroscopic properties of matter, through strictly mathematically defined relations. Even if it is not allowed to consider them (these quantities) for their own sake and ascribe conceivable (vorstellbare) meanings to them, they will retain their value as auxiliary quantities for the sake of these relations. Thus, even if I am not really allowed to speak of the diameter of an atom, the quantity that I call the diameter of an atom of a certain gas still retains its meaning: it is a length in terms of which I am able to set up a relation between the thermal conductivity of the gas, its inner friction, its dielectric constants, and its index of refraction. (Hertz 1999, p. 35)
Thus, according to Hertz, it is possible to restrict physics to the simplest possible description of the sensible facts. From this point of view, everything that lies outside sensible perception is a fiction or a mathematical auxiliary quantity that only serves the purpose of facilitating this description. Hertz did not attach any separate name to this view of physics, but for future reference, let us call it the phenomenological theory. Hertz explained that ‘many physicists’ (the positivists) were of the opinion that the aim of physics was to set up such a phenomenological theory. Hertz admitted that it was possible to consider the aim of physics in this way, but he argued that it was by no means necessary or desirable to limit physics in this way.
It is a general and necessary property of the human mind that we can neither intuitively represent nor conceptually define, the things without attributing properties to them that do not at all exist in them. (Hertz 1999, p. 35)
Even in an abstract science, as geometry, it is impossible to imagine the objects, such as a line, without attributing to it some properties, such as width, that it does not possess by definition. If this is so in a highly respected and mature science such as geometry, then how and why should we avoid adding inessential properties to the essential ones in physics? In fact, such an addition of inessential elements ‘is not false intuitions, but the condition for imagination at all’ and it poses no problem as long as one keeps in mind which of the properties are essential and which are inessential. Thus let us guard ourselves from believing that we can investigate the nature of the things themselves by considering the atoms; let us also guard ourselves from confusing the unnecessary properties, that we must necessarily ascribe to them with the essential properties, that are merely time and space relations. However, let them [the philosophers] not make us believe that we have worked in vain when we have made ourselves images [Bilder] of the things that are real but do not enter into our mind, images that correspond to those things in some respects, while in other respects they bear the imprint of our imagination. We have then, in our field, followed the general course of the human mind. (Hertz 1999, p. 36)
Hertz even emphasized that there is a certain practical advantage in making such images, instead of making do with the naked phenomenological relations between measurable quantities. Indeed, by imagining the motion of the atoms, we can often predict what will happen, without making any calculations. Thus our image ‘constitutes somehow a quite useful integration machine for the differential equations of mechanics’ (Hertz 1999, p. 37).
In the Kiel Lectures, Hertz was not entirely clear about what an image must satisfy in order to be called an image of a part of reality. However, he pointed out that the essential elements must be ‘logically possible’ and ‘appropriate.’ Moreover, the image including the inessential added elements should be as ‘probable’ as possible (Hertz 1999, p. 35). At a later stage in his lectures, when he had formed an image (intuition) of the ether, he investigated whether this intuition was:
1. possible mathematically – corresponding to ‘permissible’ in the terminology of the later book – as well as physically – corresponding to ‘correct’ in the book,
2. advantageous, i.e., it must facilitate our understanding (later he spoke about usefulness for prediction of the phenomena), and
3. probable.
Hertz renounced all use of metaphysical weapons (Hertz 1999, p. 62) and explicitly rejected the question ‘is our intuition correct.’ Here he obviously did not attribute the same meaning to the word ‘correct’ as he did in the book, but rather meant to say that it makes no sense to ask about the ontological truth of the (inessential) elements of our intuition.
Hertz couched the introduction of this preliminary version of his image theory in the form of a dialogue between physics (himself) and philosophy. However, he also made it clear that he did not want to criticize the philosophers as a professional group. In fact, he explained that
The present day philosophy, in so far as it is based on Kant, to an increasing extent excludes the question of the constitution of matter from their sphere of interest and refers it to the exact natural sciences, and reserves for themselves at most, a control of the last results. (Hertz 1999, p. 25)
Hertz wanted to make his image of the constitution of matter immune to philosophical control, but he did not think of any particular philosopher or group of philosophers, but rather of the philosophical self of every human being including the physicist (Hertz 1999, p. 39).
8.2 The parable of the paper money
Hertz had another go at the problem of the meaning of physical theories at the beginning of the second part of his Kiel Lectures. After a discussion of the metaphysical properties of matter (extension, mobility, impenetrability and indestructibility), he concluded that these properties are unclear mixtures of a priori and empirical parts. Thus we are left with the question: how is it possible for the physicist to erect a theory of matter on such a shaky foundation? Hertz tried to answer by way of a parable: he likened matter to ‘paper money that the mind issues in order to control its relationship to the things’ (Hertz 1999, p. 117). The importance of the bank notes lies in the fact that they are signs of something else. Yet merchants and most of us ordinary people go about our everyday business with money as though bank notes were something in themselves, without bothering about their sign value.
Similarly, matter (that we conceive and as we conceive it, with all its general properties as something existing outside us) no longer appears as a sign for something that we cannot understand and imagine in itself. (Hertz 1999, p. 118)
This is how the practical (experimental) physicist deals with matter. He does not deal with the discussion about the concept of matter, but uses the concept as an unproblematic thing. The theoretical physicist, who deals with the theory of the constitution of matter, can be likened to the economist who investigates the laws governing the circulation of money. He needs to consider bank notes as signs, but he does not need to care about how they are made. Finally, the philosopher is like the people who study the production of paper money: the production of quality paper, careful printing, etc. The philosopher needs to enter into the workshop of our minds and investigate how we produce our concepts. The philosopher is like an engraver who enjoys a beautifully printed bill; the physicist is like the merchant who does not care about the quality of the bill. A philosopher who would reject all that a physicist says about matter because it rests on unclear definitions would be like a man who rejects paper money because it is not beautifully printed. Similarly, the physicist who ridicules the philosopher’s work on the constitution of matter is like a man who does not recognize the beauty of a well-engraved coin. Hertz appreciated both the physicist’s and the philosopher’s approaches to the constitution of matter, but he warned against mixing them. It is important to distinguish between what things mean and what they signify. If one does not make this distinction, one would be like a man who put a bank note in the melting pot because he had heard that one can change money into silver.
Hertz presented the idea of images and the parable of paper money at the beginning of the first and second parts of his Kiel Lectures, respectively. He did not himself make any connection between them but presented them as two entirely separate considerations. Still, there are obvious connections between them.
First, in both cases, the aim was to make a clear demarcation between the domain of philosophy and that of physics. In particular, Hertz wanted to explain how a physicist need not deal with metaphysical problems and need not be concerned with criticisms from philosophers. Secondly, images seem to play the same role in physics as paper money in the economy. They both symbolize something else, and the working physicists or economists do not need to concern themselves with their production [1]. Moreover, in both the theory of images and in the parable, the complete absence of ontological agreement between the sign and the signified is underscored. They only need to correspond to each other in the sense that the mathematically calculable measurable effects of the sign (image) correspond reasonably well with the signified (the external world). In all other respects, they may be as different as a car and 25,000 one-dollar bills. Although Hertz’s image theory arose as a physicist’s defense against possible metaphysical criticism from philosophers, it was itself an important contribution to philosophy.
[1] In the quote above, Hertz used the word ‘matter’ about ‘matter as we conceive it.’ Thus, when he said that he likened matter to paper money, it is conceivable that he meant to say that he likened our image (conceptions) of matter to paper money.
8.3 The colorless theory and the gay garment
In the introduction to his collection of papers on Electric Waves (Hertz 1892), Hertz gave a new, somewhat philosophical analysis of the theories he had presented, in particular in the last two theoretical papers on Maxwell’s equations. Hertz was no longer primarily interested in the different roles of philosophy and physics in dealing with our images of nature, nor in the connection these images may have to nature. His aim was to distinguish between different types of mental renderings of nature. He distinguished between a theory and its representation (Vorstellung). A theory only deals with such conceptions (quantities) as have direct empirical content, and with their (mathematical) relations. A representation of a theory adds a physical picture (Vorstellung) to the theory. This addition is made in a rather arbitrary way by the mind, in order to create a colorful image; it is a gay garment that we use to dress up the simple and homely figure (the theory) as it is presented to us by nature. In connection with electromagnetism, Hertz’s rendering of Maxwell’s equations constitutes the theory (Maxwell’s theory). In this theory, Hertz dealt with various ‘directed changes of state’ (vector fields) without explaining how we may imagine such changes. ‘If we wish to add more color to the theory, there is nothing to prevent us from supplementing all this and aiding our powers of imagination by concrete representations of the various conceptions as to the nature of electric polarization, the electric current, etc.’ (Hertz 1892, pp. 30–31/28). The distinction here between a theory and its representation corresponds very well to the distinction in the Kiel Lectures between what we called the phenomenological theory and the more colorful images. However, whereas he had clearly come down in favor of the colorful images in his Kiel Lectures, he presented a colorless ‘phenomenological’ theory in his Electric Waves.
8.4 Comparison of the 1894 images with earlier concepts It is not so obvious how Hertz’s mature images of the Mechanics compare with the 1884 and 1892 distinction between (phenomenological) theory and representation (colorful image). Michael Heidelberger has tried to clarify this relation by adding a third concept to the 1892 classification. He claims that Hertz distinguished between a representation of a theory and a presentation of it. The former, characterized by Hertz with words such as representation [Vorstellung], physical representation, interpretation [Deutung], physical meaning [physicalische Bedentung] or intuition [Anschauung] adds hypotheses about the microscopical agents responsible for the macroscopic phenomena described by the theory. The latter, its ‘presentation’ or
‘expression’ [Darstellung], is, according to Heidelberger, ‘the concrete sensual aids and devices which are used for its more or less contingent formulation in a certain historical context and which depend on our arbitrary choice’ (Heidelberger 1998, p. 19). With this distinction at hand, Heidelberger identifies Hertz’s mature images with the representations of his 1892 introduction. Now, it is true that in the Mechanics, Hertz introduced the idea of presentation [Darstellung] described by Heidelberger (see Section 8.5); however, it is not easy to locate such an idea in the introduction to Electric Waves. In particular, it seems impossible to identify the ‘presentation’ with the gay garment as Heidelberger does, first because Hertz explicitly stated that the gay garment is added to the theory in order to help our ‘Vorstellung’, i.e. the representation rather than the presentation, and secondly because it seems rather clear that his gay garment could consist in a microscopic rendering of the colorless macroscopic mathematical quantities (e.g. polarization) that belong to the representation according to Heidelberger. So in our comparison between Hertz’s mature images and his earlier concepts we seem to be faced with the question: are the 1894 images similar in nature to the 1884 and 1892 theories, or to the images of 1884 and the representations (gay garment) of 1892, or to none or both of them? First, we should notice that where Hertz in 1884 and 1892 contrasted two types of physical theorizing, he only presented one version, the image theory, in 1894. And this one-image theory has borrowed characteristics from both of the earlier two ideas. It has borrowed its name, ‘image,’ from the 1884 lectures. Moreover, Hertz characterized the images as our ‘Vorstellungen’ [‘conceptions’ in the English translation] of things, reminiscent of the terminology of representations of 1892.
Finally, images are mental images or intuitions just as the images of 1884 and the representations of 1892 were. From the earlier idea of image or representation, it has also borrowed the idea that it must necessarily contain inessential elements: ‘Empty relations cannot be altogether avoided: they enter into the images because they are simply images’ (Hertz 1894, p. 3/2). However, Hertz also used the word theory [physikalischer Theorien] synonymously with image (Hertz 1894, p. 3/3) and he borrowed the requirement of simplicity (i.e. the minimization of empty relations) from his earlier characterizations of a (phenomenological) theory. About his treatment of Maxwell’s theory, Hertz wrote in 1892:
I have further endeavoured in the exposition to limit as far as possible the number of those concepts which are arbitrarily introduced by us, and only to admit such elements as cannot be removed or altered without at the same time altering possible experimental results. (Hertz 1892, p. 30/28)
In his earlier discussions of images or representations, Hertz had not made any such requirement about simplicity. In fact, in the Kiel Lectures, an addition of more inessential elements or more color to an image may make it more appropriate in the sense ascribed to this word in 1884, because it might help our fantasy and facilitate
our ability to predict events. In the Mechanics, it is quite the reverse: by definition, appropriateness will decrease when inessential elements are added to an image. And indeed the image that Hertz painted of mechanics in 1894 was much less colorful than the presentations he had suggested two years earlier for polarization and the very vivid images he had given of the atoms in 1884. In view of the requirement of simplicity of images put forward in the Mechanics, we may pose the question: is it possible to strip an image (representation) of so many empty relations that it ceases to be an image? Hertz’s answer in 1884 and 1892 seems clearly to be ‘Yes.’ If one removes as many empty elements as possible, while still retaining the possibility of mathematically accounting for the observable facts of nature, one is left with a colorless phenomenological theory and this is not an image (presentation). The above quote from the Mechanics to the effect that an image must contain empty relations, because it is an image, seems to indicate that Hertz still held this point of view. If that was really his opinion, one may ask, how many empty relations is it possible to remove from an image before it ceases to be an image? Hertz did not explicitly address this question, but it may be said to be included in his statement to the effect that questions about appropriateness are to some extent a question of personal taste. However, it seems more in tune with what Hertz wrote in 1894 that he had, in fact, changed his mind regarding how much added material one can scrap from an image. I shall argue (1) that in 1894 he had come to the conclusion that one can remove more inessential elements than he had imagined in 1884 and 1892, while still retaining a mental image, (2) that even the most minimalistic lawful account of the phenomena will require elements that do not correspond to sensible macroscopic facts, and (3)
that a combination of 1 and 2 would characterize a simplest image as a theory. This explains how Hertz could combine the earlier notions of theory and image into one new notion of image. As far as the second point above is concerned, Hertz wrote in the Mechanics:
If we try to understand the motions of bodies around us, and to refer them to simple and clear rules, paying attention only to what can be directly observed, our attempt will in general fail. We soon become aware that the totality of things visible and tangible do not form an universe conformable to law, in which the same results always follow from the same conditions. We become convinced that the manifold of the actual universe must be greater than the manifold of the universe which is directly revealed to us by our senses. If we wish to obtain an image of the universe (Weltbild), which shall be well-rounded, complete, and conformable to law, we have to presuppose, behind the things which we see, other, invisible things – to search for confederates concealed beyond the limits of our senses. (Hertz 1894, p. 30/25)
Here, Hertz seems to deny that we can make even a lawful theory of the world around us without introducing elements that do not correspond to external sensation. [Footnote 2: To be sure, Hertz uses the word ‘Weltbild’ here, but nothing seems to indicate that he used it in the technical sense of an ‘innere Scheinbild’, and there is no trace of any distinction between an image and a theory here.] This statement is, in fact, in harmony with his earlier statements about phenomenological theories, where he also allowed for concealed confederates such as ‘the diameter of
an atom,’ as long as one does not attribute more to these mathematical entities than one can infer from the role they play in the mathematical relations, and as long as one minimizes their number. What changed in the Mechanics was therefore not so much concerned with point 2 above, but rather with point 1. Hertz seems to have come to the conclusion that one can strip an image of so many inessential elements that one is, in the end, left with a theory from which one cannot remove any more elements without the law-like theory collapsing – and he now called even this rather colorless image, an image. We may say that he has arrived at this idea of an image from his earlier colorful images by a stretch of the imagination. Why did Hertz in 1894 present a concept of an image that is so stark and colorless relative to his earlier idea of images and representations? I think we can answer this question by taking the contexts of the various image theories into account. To be sure, in the final version of the Mechanics Hertz presented his philosophical considerations as a background for his presentation of mechanics, but the historical evidence shows that chronologically, the mature image theory was developed after Hertz had constructed the physical and mathematical contents of the book. Indeed, in an early plan of the book (in Ms 13) that seems to be written at the same time as the first preserved manuscripts there is no trace of any image theory. The introduction was planned as a historical discussion of different principles of mechanics, and a critical analysis of the concepts of force and energy, concluding with Hertz’s version of Gauss’s principle of least constraint. This planned introduction was apparently never written. The image theory is presented in almost its final form in the first preserved draft of the introduction but that was written together with the second (or even the third) draft of the book. 
Thus, Hertz did not choose his particular presentation of mechanics as a result of a clearly formulated image theory or any other explicitly formulated philosophical standpoint. On the contrary, Hertz wrote the introduction to the Mechanics with a view to the image of mechanics it was supposed to support. He formulated the mature version of the image theory as a philosophical background for the Mechanics, just as the remarks in the Kiel Lectures about images and in the Electric Waves about theories and representations were written as philosophical reflections on the ideas about atoms and electromagnetism presented there. And in connection with the Mechanics, we can ask: what are the inessential elements in Hertz’s image of the mechanical world? There are really only two: the hidden masses and the connections. Of these two Hertz himself only highlighted the hidden masses, and according to Hertz, we cannot make a law-like theory without introducing them or something even less simple like force or energy. The quote above continues:
These deep-lying influences we recognized in the first two representations; we imagined [dachte] them to be entities of a special and peculiar kind, and so, in order to represent them in our image, we created the ideas of force and energy. But another way lies open to us. We may admit that there is a hidden something at work, and yet deny that this something belongs to a special category. We are free to assume that this hidden something is nought else than motion
and mass again, – motion and mass which differ from the visible ones not in themselves but in relation to us and to our usual means of perception. (Hertz 1894, p. 30/25)
Here again, Hertz speaks of representations, but taken together with the previous part of the quote the idea seems to be clear: hidden masses are not just added in order to add color to a theory. They are added in order to make a law-like theory possible at all. Now, Hertz must have been aware that a true positivist would not have agreed with him on this point, and he even mentioned Kirchhoff’s much more phenomenological treatment of mechanics. But he seems to have been of the opinion that any further simplification was unattainable. After all, even in a phenomenological account of mechanics, one has to set up equations that account for the time and space relations of the mechanical system. These will contain various coefficients that may be called forces or something else; but one cannot claim to do without such mathematical quantities. One can avoid attributing inessential properties to them, but that seems to be exactly what Hertz tried to do when he defined the concept of mass or matter (ordinary and concealed). Matter, in Hertz’s image, is nothing but an infinite collection of characteristics of space and time relations (vector functions of time). No color or shape here, as in the 1884 images of atoms, only essential space and time relations. The other hidden type of element in Hertz’s image of mechanics, connections, is treated in an entirely positivistic manner by Hertz. He never mentioned connections as a special concept that he needed in his image, on a par with time, space and mass. It is not even introduced as a ‘characteristic’ of some thing, but simply by noting that there are possible and impossible displacements. Together with certain continuity requirements, Hertz then deduced the system of first-order linear homogeneous differential equations that is the analytical expression of the connections.
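The two kinds of coefficients at issue in this paragraph can be sketched in modern notation. This is only an illustration under stated assumptions: the symbols $q_\rho$ (general coordinates), $c_{\chi\rho}$ (connection coefficients) and $a_{\rho\sigma}$ (line-element coefficients) are chosen here for exposition and are not quoted from the passage above. For a system with $r$ coordinates subject to $k$ connections, the connections appear as homogeneous linear relations among the coordinate differentials, while the masses enter only through the coefficients of the line element:

```latex
% Illustrative sketch; q, c and a are assumed names, not quoted notation.
\begin{align*}
  \sum_{\rho=1}^{r} c_{\chi\rho}(q)\,\mathrm{d}q_\rho &= 0,
      \qquad \chi = 1,\dots,k
      && \text{(connections)} \\
  \mathrm{d}s^{2} &= \sum_{\rho,\sigma=1}^{r} a_{\rho\sigma}(q)\,
      \mathrm{d}q_\rho\,\mathrm{d}q_\sigma
      && \text{(line element)}
\end{align*}
```

On this reading, the first set of equations only separates possible from impossible displacements, and nothing in it calls for a picture of rods or strings; the second carries the masses, visible or hidden, purely as coefficients.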
However, he never gave a specific name to the coefficients (as he had to those coefficients in the line element that represent masses) and he made no effort to give any intuitive image of these connections. This actually caused problems for the British physicists, who found it hard to imagine connections without getting them entangled (see Chapter 27). So, in fact, Hertz’s image of mechanics is very close to a phenomenological theory. The only thing a positivist might argue with was the fact that Hertz imagined that the coefficients in the line elements that do not correspond to observable masses can be imagined as masses of hidden point masses. [Footnote 3: Also Hertz’s image of matter being built up from indefinitely small ‘Massenteilchen’ might be a thorn in the flesh of the positivist, but as I shall argue in Chapter 12, this inessential part of the image was introduced for mathematical reasons.] Thus, the change in Hertz’s concept of image from the complex, colorful, vivid images of 1884 to the stark image of 1894 corresponds nicely to the physics he actually presented. But then the question of why Hertz changed his idea of an image can be rephrased as follows: why did he prefer to give a stark image of mechanics, where he had earlier given more colorful images of matter? Or phrased differently: why did Hertz in 1894 put such
an emphasis on simplicity, where he had earlier allowed more color to help intuition? I think there are several reasons: 1. A strategic reason could be that he wanted to argue against the other two images of mechanics that he did not like for other reasons (lack of conceptual clarity). A requirement of simplicity is a weapon that would favor his own image. 2. This strategic reason has a deeper reason: In mechanics, Hertz put great emphasis on logical permissibility. He also realized that one cannot resolve a contradiction by adding something to the image. Only by removing relations can consistency be attained. This makes simplicity a virtue. 3. But the deepest reason is probably the status of mechanics relative to atoms or electromagnetism. The laws of mechanics are supposed to be the bedrock to which every other physical phenomenon should eventually be reduced. Atoms and electromagnetic phenomena are more complicated entities. We can make images of atoms: for example as a collection of balls tied together by rubber bands. We can make images of polarization by imagining it as a specific state of the ether. However, such images are phrased in terms of other concepts, here mechanical concepts. But when it comes to the basic principles of mechanics, it is not possible to make intuitive images of such a vivid kind. If we try we will imagine them in terms of the mechanical elements we are trying to depict, e.g. if we want to make a vivid image of matter in terms of balls or something else, we need to imagine the mechanical properties of the ball. Or if we try to imagine connections in terms of rigid rods or balls rolling on each other, we are immediately confronted with the questions of the mechanics of these mechanisms. 
Thus, the only way we can imagine mechanical phenomena, without taking recourse to mechanics itself (which would be begging the question), is to phrase the images in terms of the more fundamental (and, according to Hertz, a priori) ideas of space and time. And the resulting image is therefore, in fact, similar to what he had in 1884 and 1892 called a theory. As a conclusion of this comparison of Hertz’s earlier ideas on theories and images with his mature image theory of the Mechanics, we can state that Hertz, in his mature theory of images, introduced the requirement of simplicity that he borrowed from his earlier description of a theory. In this way, images became more colorless, in fact, so colorless that they (almost) became what Hertz had earlier called theories. This change in the notion of an image was coupled with a change of the field of physics it was intended to reflect, the simple laws of mechanics calling out for stark colorless minimalistic images rather than the colorful images or representations that may help the imagination to get to grips with atoms or electromagnetic fields.
8.5 Concepts in the Mechanics related to images
Having analysed the relation between Hertz’s mature concept of an image and his earlier concepts of theory, image and representation, we will now turn to the relation
between the concept of image and other related concepts in the Mechanics itself. In fact, Hertz in various places of the book speaks of theory, analogy and model as well as of representations, scientific representations and mathematical form of an image. In some cases these words are used almost interchangeably with ‘image’ (e.g. the use of the word theory) but in other cases these words carry a separate meaning. It is important to keep these shades of meaning clear, so I shall now briefly explain some of these concepts.
Representation of an image. In his analysis of the usual Newtonian–Laplacian image of mechanics, Hertz distinguished between an image and a representation (Darstellung) of it. The latter is the concept that Heidelberger called presentation, and accurately described as ‘the concrete sensual aids and devices, which are used for its more or less contingent formulation.’ In particular, Hertz alluded to the presentations of mechanics found in textbooks.
Scientific representation (or form). A scientific representation (Darlegung) of an image is one that leads to a clear conception (Bewusstsein) of the role of the different elements or properties. In a scientific representation of an image it should be clear what elements are ascribed to the image for the sake of permissibility, what for the sake of correctness and what for appropriateness (Hertz 1894, p. 3/2). It is not entirely clear if these requirements are requirements of the image or of its representation (Darstellung). On the one hand, Hertz’s initial use of two different words (Darstellung and Darlegung) suggests that the scientific representation is an organization of the image that is prior to its representation (Darstellung).
However, having explained the ideas of an image and the requirements one must assign to an ‘image’ and its ‘scientific representation,’ Hertz continued: ‘Those are in my opinion the standpoints from which we must estimate the value of physical theories and the value of the representation [Darstellung] of physical theories’ (Hertz 1894, p. 3/3). Here he used the word ‘Darstellung’ instead of ‘Wissenschaftliche Darlegung,’ which seems to lead to the conclusion that the requirement of the ‘scientific presentation’ is part of the presentation and not prior to it. I shall return to Hertz’s requirement of a scientific representation in Chapter 10.
Theory. The last quote also seems to suggest that Hertz in the Mechanics used ‘Theory’ as a synonym for image. It may perhaps be a more inclusive, non-technical term that includes the more clearly defined images. Yet, it seems obvious that it is not a concept that is contrasted to images as in the earlier 1884 and 1892 discussions of images and representations.
Mathematical form. In the introduction to the Mechanics, Hertz distinguished between the physical content of his image and the ‘mathematical form in which it will be represented’ [in welcher wir denselben wiedergeben werden] (Hertz 1894, p. 34/29). This third German word [wiedergeben] that is rendered ‘represent’ in the English translation has to be distinguished from the two former meanings of ‘presentation.’ Indeed, where it is quite clear that the physical content is part of the image, it is not so clear whether Hertz considered the mathematical form as a part of the image or merely a part of the representation. The latter is the most likely.
At least it seems clear that analytical formulas are part of the representation of an image. Indeed, Hertz always introduced new concepts into his mechanics by first giving a verbal definition and then translating it into an ‘analytic representation’ [analytische Darstellung] in terms of mathematical symbols. However, already the verbal definition is often phrased in the language of the ‘geometry of systems of points’ so it is possible that he thought of this most original element of the mathematical form of his mechanics as belonging to the image itself. In an informal way, it definitely enhanced the visualization of mechanics. The relation between mathematics and physics may seem to have been almost reversed here relative to the situation in Hertz’s version of Maxwell’s theory. In the latter, the mathematical equations were the theory, physical representations could then be added afterward in various ways. In mechanics, the physical content is the core of the image that can then be given various mathematical forms, e.g. a traditional one or Hertz’s high-dimensional geometric one. However, there are some differences between the two situations. In fact, when Hertz talked about the mathematical form of the image in the mechanics he did not mean the fundamental equations of mechanics (or the one fundamental equation that is the analytic representation of his fundamental law). He rather referred to the possibility of expressing this equation in various mathematical garbs: a geometric garb or a purely analytical one. There is no doubt that he considered the fundamental law a central part of his image, just as Maxwell’s equation is a (the) central part of Maxwell’s theory. The question of mathematical form only concerns how one represents this law mathematically: in geometric terms or in purely analytic terms, and in which notation, etc. Therefore, the contrast between Hertz’s presentation of Maxwell’s theory and his mechanics is not so great in this respect after all. 
Model.
1. Mathematical model – modern. The concept of a mathematical model that pervades much of modern theoretical science, economics and technology owes a debt to Hertz’s image theory because of its lack of ontological commitment (Morrison 1999). However, these mathematical models would often not themselves count as images in Hertz’s sense, for two reasons. First, they are usually purely mathematical and therefore do not create a mental image, even a stark black and white one. Second, the mathematical model often describes the situation only rather well, not perfectly, and often it is only certain of the consequences of the mathematical model that correspond to empirical results. Thus, the models would be disqualified as images because they do not fulfil the requirement of correctness in a strict sense.
2. (Dynamical) model – Hertz’s concept.
Definition. A material system is said to be a dynamical model of a second system when the connections of the first can be expressed by such coordinates as to satisfy the following conditions:
(1) That the number of coordinates of the first system is equal to the number of the second.
(2) That with a suitable arrangement of the coordinates for both systems the same equations of condition exist.
(3) That by this arrangement of the coordinates the expression for the magnitude of a displacement agrees in both systems. (Hertz 1894, §418)
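The three conditions can be sketched in modern notation. The symbols below are illustrative assumptions rather than notation quoted from the definition: system A has coordinates $p_1,\dots,p_r$, system B has coordinates $q_1,\dots,q_s$, the functions $c_{\chi\rho}$ stand for the coefficients of the equations of condition, and the functions $a_{\rho\sigma}$ for the coefficients in the expression for the magnitude of a displacement.

```latex
% Illustrative sketch; p, q, c and a are assumed names, not quoted notation.
\begin{align*}
 \text{(1)}\quad & r = s \\
 \text{(2)}\quad & \sum_{\rho} c_{\chi\rho}(p)\,\mathrm{d}p_\rho = 0
   \ \text{ in A,}
   \qquad
   \sum_{\rho} c_{\chi\rho}(q)\,\mathrm{d}q_\rho = 0
   \ \text{ in B,}
   \quad \text{with the same functions } c_{\chi\rho} \\
 \text{(3)}\quad & \mathrm{d}s_A^{2} = \sum_{\rho,\sigma} a_{\rho\sigma}(p)\,
      \mathrm{d}p_\rho\,\mathrm{d}p_\sigma,
   \qquad
   \mathrm{d}s_B^{2} = \sum_{\rho,\sigma} a_{\rho\sigma}(q)\,
      \mathrm{d}q_\rho\,\mathrm{d}q_\sigma,
   \quad \text{with the same functions } a_{\rho\sigma}
\end{align*}
```

Written this way, the conditions treat A and B symmetrically and say nothing about which masses, visible or hidden, realize the coordinates. The sketch is only a paraphrase; the numbered conditions quoted above give the definition in the words of the English translation.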
This is Hertz’s precise definition of a dynamical model. Note, as Hertz did in §419, that this defines an equivalence relation among material systems as described in Hertz’s image of nature. In particular, if one system is a model of a second, then conversely, the second is also a model of the first. This is different from the relation between the external world and the image we make of it. In the latter case, only the image is a mechanical system in the sense treated in Hertz’s book and so the relation ‘being a model of’ does not apply to this situation. Moreover, the symmetry is broken in the latter case. It makes no immediate sense to claim that the external world is an image of our mental image. Still, this is precisely what Hertz claimed in two interesting passages that relate the idea of an image with that of a model: Observation 1. If we admit generally and without limitation that hypothetical masses (§301) can exist in nature in addition to those which can be directly determined by the balance, then it is impossible to carry our knowledge of the connections of natural systems further than is involved in specifying models of the actual systems. We can then, in fact, have no knowledge as to whether the systems which we consider in mechanics agree in any other respect with the actual systems of nature which we intend to consider, than in this alone, – that the one set of systems are models of the other.
As it stands, the last statement, to the effect that the systems we consider (our images) are models of the actual systems of nature, is not meaningful. It is the basic assumption of Hertz’s book that external nature can be imagined correctly by hidden masses, connections and the fundamental law of motion. However, even if this image turns out to be correct, it does not follow that nature is a mechanical system in Hertz’s sense. This is the whole idea of the image theory. It may be that the world does operate with actions at a distance or is more complicated than Hertz’s image in other ways (Hertz repeatedly emphasized that he did not have metaphysical reasons to believe that nature itself is simple). But it only makes sense to speak of a model of ‘actual systems of nature’ if they are systems in Hertz’s sense. So the meaning of the quote must be the following. Let us assume that Hertz’s image of mechanics is not only correct but also true (nature really consists of connected masses (ordinary and concealed) moving according to the fundamental law.) Even under this assumption we cannot hope to acquire true knowledge of any particular system in nature. The only thing we can hope for is that our image is a mechanical model of the natural system we intend to consider. This removes us even further from the truth about nature than the image theory had already suggested. The similarity of images and models is taken one step further in Hertz’s subsequent observation: Observation 2. The relation of a dynamical model to the system of which it is regarded as the model, is precisely the same as the relation of the images which our mind forms of things to the things themselves. For if we regard the condition of the model as the representation
of the condition of the system, then the consequents of this representation, which according to the laws of this representation must appear, are also the representation of the consequents which must proceed from the original object according to the laws of this original object. The agreement between mind and nature may therefore be likened to the agreement between two systems which are models of one another, and we can even account for this agreement by assuming that the mind is capable of making actual dynamical models of things, and of working with them. (Hertz 1894, §428)
Towards the end of this quote, Hertz more subtly likens the agreement between mind and nature to the agreement between two systems that are models of each other. But otherwise this observation calls for the same clarification as above. The end of the quote seems to mean: if Hertz’s image is a true image of nature, then we can account for our ability to build images by assuming that our brain can make models of external systems. This is Hertz’s only psychophysical consideration in the book.
Analogy. As pointed out in Chapter 3, this term was used by Thomson and Maxwell and other nineteenth-century scientists in a precise sense: two theories (about completely different things, e.g. heat and electricity) are analogous if they are described by the same mathematical equation. Hertz’s image theory may very well have been influenced by this notion of analogy, but in this strict sense images are not analogous to external nature. There are no equations in nature, and images are more than equations. However, when Hertz stated that his geometry of systems of points was ‘analogous’ to high-dimensional geometry (Hertz 1894, p. 36/30), he probably used the word in the precise sense explained here: the two theories deal with quite different things (configurations of a mechanical system and points in an n-dimensional space, respectively) but they share the analytical apparatus of line elements, etc. That is how Hertz could use ideas from Riemannian geometry without implying that his mechanics was based upon such suprasensible abstractions (see Chapters 11 and 13).
9 Images of mechanics
Having considered Hertz’s general ideas about images, let us now turn to his evaluation of the three images of mechanics. First, however, it will be useful to consider what a ‘principle of mechanics’ is, according to Hertz.
9.1 Principles of mechanics
When Hertz referred to the principle of least action and Gauss’s principle of least constraint and the principle of area (conservation of angular momentum), he simply followed tradition.
But these separate concrete propositions are not what we shall have in mind when we speak simply and generally of the principles of mechanics: by this will be meant any selection from amongst such and similar propositions, which satisfies the requirement that the whole of mechanics can be developed from it by purely deductive reasoning without any further appeal to experience. In this sense, the fundamental ideas of mechanics, together with the principles connecting them, represent the simplest image which physics can produce of things in the sensible world and the processes which occur in it. By varying the choice of the propositions which we take as fundamental, we can give various representations of the principles of mechanics. (Hertz 1894, pp. 4–5/4)
Thus, in reality, Hertz did not define what he understood by a principle of mechanics, but by a system of principles: it is what we would call an axiom system for mechanics. However, only those axioms that contain empirical knowledge are principles of mechanics. For example, geometrical axioms are not mechanical principles according to Hertz. In his own mechanics, there is only one mechanical principle, namely his fundamental law of motion. The above quote also suggests that images of mechanics are built primarily by choosing a system of mechanical principles. This is somewhat at variance with the impression Hertz gave in the beginning of the general introduction where it was the symbols of objects that primarily determined the image and it is also at variance with the description of the three concrete images of mechanics in which it is the choice of basic concepts (among the concepts, space, time, mass, force and energy) that primarily characterizes the image. However, both in the early plan of the book
(in Ms 13) and in the first draft of the preface (Ms 10) Hertz more explicitly stressed that it is the choice of the fundamental principles (the axioms) that determine the representation. If Newton’s laws are chosen one gets the usual Newtonian–Lagrangian representation and one is led to consider space, time, mass, and force as basic concepts. If Hamilton’s principle is chosen as the fundamental principle one gets the energetic presentation that considers energy rather than force as a basic concept. Finally, if Gauss’s principle is chosen one gets Hertz’s presentation ‘assuming that real forces do not exist’1 . Thus, the idea that it is primarily the choice of mechanical principle that determines a representation of mechanics seems to have been the leading idea in Hertz’s early work on mechanics. It may have receded more into the background after he developed his image theory and in the book it is mainly visible in the above quote. The fact that Hertz does not speak of images but of representations in this quote also points to an origin that pre-dates his introduction of the image theory.
9.2 The Newtonian–Laplacian image
Hertz considered the Newtonian–Laplacian image as being so familiar to all readers that he did not need to explain it in much detail. The fundamental concepts are space, time, mass and force and they are connected by the following principles: Newton’s three laws and d’Alembert’s principle. After a one-page introduction of these concepts and principles, Hertz engaged in a 10-page critical analysis (sometimes bordering on ridicule) of it. He proceeded in the order suggested by his general introduction, beginning with a frontal attack on the permissibility of the image. Here it is the use of the concept of force that is seen as the big stumbling block. In particular, Hertz asked: if we swing a stone tied to a string in a circle, our hand constantly exerts a force on the stone, but according to Newton’s third law, the stone should exert an equal but opposite force on our hand. But which? Having ruled out the so-called centrifugal force, Hertz was left without an answer to this question and hinted at the possibility that this logical impermissibility is ascribable to the fundamental laws (the principles). He then alluded to the experience that it is exceedingly difficult to expound to thoughtful hearers the very introduction to mechanics without being occasionally embarrassed, without feeling tempted now and again to apologise, without wishing to get as quickly as possible over the rudiments, and on to examples which speak for themselves. (Hertz 1894, p. 8/6–7)
¹ ‘Vorausgesetzt dass eigentliche Kräfte nicht bestehe’ (Ms 13).
As examples of problems at the beginning of mechanics books, he ridiculed Newton’s and Thomson and Tait’s definitions of mass, Lagrange’s definition of force and the different proofs of the parallelogram rule for combining forces. Many of these objections were similar to those of Mach and other critical analysers of mechanics (see Chapter 2). Surprisingly enough, he ended up arguing (as we have already seen above) that because of the image’s correctness, these inconsistencies cannot be ascribed to lack of permissibility of the image itself, but only of its representations in various textbooks. The problems are therefore, according to Hertz, problems of appropriateness. Hertz went on to declare that the Newtonian–Laplacian image was correct, although he hinted that its ability to deduce correct consequences may, in part, be due to its fuzzy logical form (see Section 7.5). ‘And,’ he continued, what here holds for the forces, can be equally asserted of the fixed connections of bodies which are represented mathematically by equations of condition between the coordinates and whose effect is determined by d’Alembert’s principle. It is mathematically possible to write down any finite or differential equation between coordinates and to require that it shall be satisfied; but it is not always possible to specify a natural, physical connection corresponding to such an equation: we often feel, indeed sometimes are convinced, that such a connection is by the nature of things excluded. And yet, how are we to restrict the permissible equations of condition? Where is the limiting line between them and the conceivable ones? To consider only finite equations of condition, as has often been done, is to go too far; for differential equations which are not integrable can actually occur as equations of condition in natural problems. (Hertz 1894, p. 13/11)
This particular criticism of the lack of distinctness of the Newtonian–Laplacian image is somewhat surprising, not because it is unwarranted in itself, but because it applies almost as well to Hertz’s own image. To be sure, Hertz limited connections to those that satisfy a certain continuity requirement, which means that they can be expressed by first-order homogeneous linear differential equations. In this way, his image is more distinct than the Newtonian–Laplacian image. However, even in this case, it often seems impossible to specify a natural, physical connection corresponding to such an equation. When finally investigating the simplicity of the Newtonian–Laplacian image, Hertz again turned to the idea of force and declared that in many descriptions of mechanical systems, forces enter as ‘free running idle wheels.’ For example, he claimed that in the description of the solar system, ‘the forces of gravitation enter as transitory aids in the calculation and then disappear from consideration’ (Hertz 1894, p. 14/12). But his main example is a variation of an argument from his Kiel Lectures (see Section 6.1.3). Consider a bar of iron lying on a table. According to the Newtonian–Laplacian image, the iron is influenced by a host of forces: every atom of the iron is attracted by gravitation to every other atom in the universe. In a similar fashion, electric and magnetic forces act. The table reacts, there are molecular forces, etc. But, in fact, all the forces, are so adjusted amongst each other that the effect of the whole lot is zero; that in spite of a thousand existing causes of motion, no motion takes place. … And it is for us to reflect whether we have really depicted the state of rest of the iron and its particles in a simple manner … , but there can be no question that a system of mechanics which does avoid or exclude them is simpler and in this sense, more appropriate than the one here considered. (Hertz 1894, p. 16/13)
With this last statement, Hertz seems to suggest that his own image is better precisely because it avoids these complications.
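The ‘equations of condition’ discussed in this section can be made concrete by a short illustration in modern notation. The symbols used here ($q_\nu$, $a_\nu$, $\omega$ and the coordinates $x$, $y$, $\theta$) are assumptions introduced for the sake of the example and are not taken from Hertz’s text. A connection expressed by a first-order homogeneous linear differential equation has the general form shown first; the second display is the standard textbook example of a knife edge at position $(x, y)$ on a plane with heading angle $\theta$, which can only move along its own direction; the third applies the Frobenius criterion, by which a single such equation is integrable exactly when $\omega \wedge d\omega = 0$.

```latex
% General form of a linear homogeneous differential connection
\[
  \sum_{\nu=1}^{n} a_{\nu}(q_1,\dots,q_n)\, dq_{\nu} = 0 .
\]

% Knife edge on a plane: no velocity component perpendicular to the blade
\[
  \omega = \sin\theta \, dx - \cos\theta \, dy = 0 .
\]

% Frobenius test for integrability of the single equation \omega = 0
\[
  d\omega = \cos\theta \, d\theta \wedge dx + \sin\theta \, d\theta \wedge dy ,
  \qquad
  \omega \wedge d\omega = -\, dx \wedge dy \wedge d\theta \neq 0 .
\]
```

Since the three-form does not vanish, no finite equation $f(x, y, \theta) = \text{const}$ reproduces this connection. This is the kind of non-integrable differential equation that, in the passage quoted above, Hertz says can actually occur in natural problems.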
9.3 The energetic image
Hertz’s description of the energetic image is more thorough. This image also rests on four fundamental concepts: space, time, mass and energy, and the basic mechanical principle is Hamilton’s principle. In this case, Hertz began with a discussion of its appropriateness, ‘since it is in this respect that the improvement is most obvious.’ (Hertz 1894, p. 20/17). It is more distinct because it avoids forces other than the conservative ones (that can be defined from a concept of potential energy). It is also simpler because it avoids the arbitrary hypotheses of atoms with special properties and can be formulated entirely macroscopically. As far as correctness is concerned, Hertz first explained how Hamilton’s principle applied to non-holonomic systems would yield the wrong conclusions. However, as mentioned in Chapter 8, Hertz was reluctant to declare the image to be incorrect. Instead he suggested that one might save the phenomena by considering rolling with just a little slipping instead of pure rolling. Rolling with a bit of slipping can be correctly described in the energetic image, so the problem may be described as one of appropriateness. Still, Hertz came up with several arguments that seem to indicate that it is not quite satisfactory to reduce the problem from one of correctness to one of appropriateness. He reminded the reader that the process of pure rolling ‘is one which is so nearly realized in the visible world that even integration machines are constructed on the assumption that it strictly takes place.’ (Hertz 1894, pp. 24–25/21). Here he referred to the great outburst of integrators, harmonic analysers and the like that were being produced during this period. Maxwell and Thomson, in particular, had emphasized the usefulness of pure rolling in such machines, in comparison to the mixed rolling and sliding that takes place in ordinary planimeters, e.g. of the Amsler type².
So when rolling appears in the visible world, how are we then to exclude it from ‘the mechanics of unknown systems, such as the atoms or the parts of the ether’ (Hertz 1894, p. 25/21)?³ Moreover, Hertz pointed out that in order for a law to give a truly fundamental description of our system, we must require ‘that when applied to approximately correct relations it should always lead to approximately correct results.’ Yet, using Hamilton’s principle to describe pure rolling and rolling with just a minute slipping produces entirely different results. However, this problem, which can be described as one of well-posedness, still did not preclude Hertz from concluding: We should prefer to admit that the doubt is one which affects the appropriateness of the system, not its correctness, so that the disadvantages which arise from it may be outweighed by other advantages. (Hertz 1894, p. 25/21)
² See (Thomson and Tait 1879, vol. 1, Appendix 3), (Galle 1912), and (Willers 1951).
³ As I shall argue in Section 20.3, Hertz’s own system does, in fact, exclude non-holonomic rolling in the hidden parts of a conservative mechanical system, but he never stated that explicitly.
Hertz then began the discussion of the permissibility by declaring that here lie the real difficulties of the energetic image. First, he pointed out that it is difficult to imagine energy as a real substance that can be localized in space (see Section 6.4). Kinetic energy is unproblematic but potential energy is problematic both because it is only definable up to an additive constant (this is, of course, untrue of a real substance) and because it will be infinite if the universe is infinite. However, as in the case of the Newtonian–Laplacian image, Hertz did not exclude that it would be possible to overcome these problems. The most prudent thing to do will be to regard it for the present an open question, whether the system can be developed in logically unexceptionable form. (Hertz 1894, p. 27/22)
Similarly, Hertz criticized Hamilton’s principle for being too complicated and metaphysically objectionable. Yet he felt that it might be possible to penetrate to the deeper and real meaning which we are convinced it possesses. If this conception is correct, the objection brought forward does really justify a doubt as to the system; but it does not apply so much to its permissibility as to its appropriateness. (Hertz 1894, p. 28/24)
Thus, although Hertz claimed that the energetic image had problems as far as its permissibility was concerned, he finally regrouped these problems as having to do with appropriateness as well.
9.4 Hertz’s image
As one would expect, Hertz’s description of his own image of mechanics is longer and more detailed than his description of the competing images. I shall return to many of his points below. Suffice it to say that he stressed that his image only deals with three fundamental concepts: space, time and mass. He further admitted that if we try to describe the motions of bodies in terms of these concepts alone we will soon be ‘convinced that the manifold of the actual universe must be greater than the manifold of the universe which is directly revealed to us by our senses.’ So we must admit that there is something hidden. In the two earlier images this was force and energy. But another way lies open to us. We may admit that there is a hidden something at work, and yet deny that this something belongs to a special category. (Hertz 1894, p. 30/25)
So Hertz introduced the concealed masses that only differ from ordinary masses in the way they interact with us (our sensory system). It is the motion of these hidden masses that gives rise to the phenomena that we call force: What we are accustomed to denote as force and as energy now become nothing more than an action of mass and motion, but not necessarily of mass and motion recognizable by our coarse senses. Such explanations of force from processes of motion are usually called dynamical; and we have every reason for saying that physics at the present day regards such explanations with great favour. (Hertz 1894, p. 31/26)
He went on to refer to earlier applications of hidden masses (the ether) and introduced rigid connections and his fundamental law. After this presentation of the physical content of his image, Hertz continued to describe and defend the untraditional mathematical form he had chosen to present it in.
He then proceeded with an evaluation of his image beginning with permissibility. As mentioned above, he considered it the most important merit of his image that it satisfies the most rigid requirements as far as permissibility is concerned. This is partly a result of the strict deductive style and the stripping of the image of all unnecessary relations. In particular, he defended the use of rigid connections but seems to admit that macroscopic connections are only approximately rigid so that ‘in seeking the actual rigid connections we shall perhaps have to descend to the world of atoms.’ (Hertz 1894, p. 41/34)
If this is similar to the situation with respect to forces in the usual image, Hertz suppressed any mention of this similarity in the subsequent discussion. The real problem faced by Hertz’s image is its correctness. Here, Hertz discussed three problems: 1. could there be connections other than the ones admitted by Hertz, 2. is the image able to deal with all the situations that the usual mechanics describes by way of forces, 3. does the image permit a correct description of living things? As to the first point, Hertz argued first that his type of connection was the only type compatible with the old proposition ‘Natura non facit saltus,’ but then admitted that we cannot obtain absolute certainty this way. So in the end, he admitted that his assumption about the possible connections is ‘of the nature of a tentatively accepted hypothesis.’ The second point is the central one and the one that all later commentators on Hertz’s mechanics pointed to. Is it possible for Hertz’s image of mechanics to correctly describe systems like the solar system or electromagnetic systems that are ordinarily described using, e.g. gravitational or electromagnetic forces? In other words, can we devise a hidden system of masses and a collection of connections with the visible system such that the actual motion of the visible system is a consequence of the fundamental law of motion applied to the complete system (visible as well as hidden)? I shall return to this problem in Chapter 25. Here, it suffices to say that he briefly discussed the problem in the introduction and after the formulation of the fundamental law, but never showed how to solve it.
In the introduction he stated that one can show that hidden systems can produce forces of a ‘very general nature; and in fact we do not deduce any restrictions for them.’ Yet he admitted: But on the other hand it remains for us to prove that any and every form of the force-functions can be realised; and hence it remains an open question whether such a mode of explanation may not fail to account for some one of the forms occurring in nature. Here again we can only bide our time so as to see whether our assumption is refuted, or whether it acquires greater and greater probability by the absence of any such refutation. (Hertz 1894, p. 44/37)
Thus, rather than seeing it as his duty to demonstrate the correctness of his image, he saw it as the task of his opponents to show that his image could not correctly describe nature. As to the description of live matter, Hertz did not commit himself but he considered it most probable that it was not describable by his mechanics (see Chapter 25). As far as appropriateness is concerned, Hertz assigned to his own image ‘about the same position as to the second image’ [the energetic one] (Hertz 1894, p. 46/39).
Strangely enough he did not argue at this point that it was simpler because it avoided a fourth fundamental concept of energy. He claimed that like the energetic image, his own image permitted a completely macroscopic description. If this is supposed to contrast with the Newtonian–Laplacian image, Hertz owes his readers an explanation of the nature of the constraints that he described as having an atomic origin. He admitted that his system only avoided hypothetical, hidden entities ‘when we are dealing with systems which are completely known, and that it disappears as soon as concealed masses come in’ (Hertz 1894, p. 47/39). However, he further argued that ‘The loss of simplicity is not due to nature, but to our imperfect knowledge of it.’ Finally, Hertz argued that his image was more appropriate than the energetic image in the sense that its fundamental law is simpler.
9.5 Conclusion of the comparison
Let me briefly summarize Hertz’s evaluation of the three images of mechanics. He spotted problems with permissibility of the first, Newtonian–Laplacian, image, but redefined them as problems of appropriateness. He had no serious doubts about its correctness. He spotted problems with both permissibility and correctness of the second, energetic, image but ended up redefining those also as problems of appropriateness. He considered his own image as permissible, distinguishing this as its most important merit, and the problems that he realized could be raised against its correctness, he deferred for future falsification or verification. As far as appropriateness is concerned, he considered the energetic and his own image on a par whereas the Newtonian–Laplacian image was far inferior. With these considerations in mind it is very surprising to read Hertz’s own final comparison of the three images, added as a two-page conclusion to the introduction. Here, he first discarded the energetic image: ‘After what we have already said, we may leave the second image out of consideration.’ (Hertz 1894, p. 48/40).
After what I have already said above, I think this is a somewhat mysterious move that cannot in fact be explained by Hertz’s earlier arguments. As for the two remaining images, Hertz now declared them to be on a par as far as permissibility and appropriateness are concerned (assuming that the Newtonian–Laplacian image is thrown into a permissible form). Therefore, the sole criterion on which to judge is correctness. Now the first image operates with fixed forces from which one can deduce approximately rigid connections whereas the third image operates with truly rigid connections from which one can deduce approximate forces. Therefore, the two images cannot both be correct, and Hertz imagined that future experience will decide between them. As already mentioned in Chapter 6, he considered the recent development of electromagnetism to be an argument against forces, and he predicted that a future development of an ether theory would vindicate his own image. This is the field in which the decisive battle between these different fundamental assumptions about mechanics must be fought out. (Hertz 1894, p. 49/61)
Thus, where Hertz, in the main text of the introduction, emphasized the permissibility of his own image, ridiculing the lack of permissibility in the Newtonian–Laplacian image, and stressed the higher distinctness and simplicity of his own image over an image that was filled with inessential forces, he now, in the conclusion, put the two images on an equal footing as far as these criteria are concerned. And where he had earlier declared that ‘no one will deny that within the whole range of our experience up to the present the correctness [of the Newtonian–Laplacian image] is perfect.’ (Hertz 1894, p. 11/9) and had admitted that the correctness of his own image was an unproven problematic hypothesis, he now argued that his image would win the battle on exactly this ground. Thus, the last two pages of the introduction read more as a second thought than as a conclusion. It is also noticeable that this last part is not present in Hertz’s first draft of the introduction, and so seems to have been added at a very late date⁴. It may be read as a conflicting view, but we may also consider the change of view as a result of a change of perspective: where his earlier evaluations were made from the point of view of Hertz’s own time, his final conclusion seems to be reached from the point of view of the future, when the Newtonian–Laplacian image might have been developed to perfection, as far as permissibility and appropriateness are concerned, but when experimental physicists like Hertz himself have developed experimental research of the atomic realm so far that the Newtonian–Laplacian image has been proven incorrect but Hertz’s own image has been proven correct. Such a future never arose.
⁴ The last pages are not contained in the second final draft either, but that is due to the fact that the last pages of this manuscript are lost. It breaks off in the middle of a sentence at the bottom of a page corresponding to p. 45/38. As discovered by Nordmann (Nordmann 1998, note 24), Hertz, in the corrections of his manuscript (#2853 at the Deutsches Museum in Munich), requested that the three main sections and the concluding remarks be set apart from the preceding text by 1/4 to 1/3 blank page. This emphasizes that the last ‘conclusion’ presents a new approach to the subject.
10 Kantianism. A-priori and empirical elements of images
In this section I shall investigate some unmistakably Kantian features of Hertz’s Mechanics. I shall argue that Hertz initially, when he embarked on his mechanics project, took over a widespread Kantian distinction between an a-priori kinematics and an empirical dynamics, and that he gradually developed and sharpened this distinction while working on the book. Moreover, I shall argue that viewing Hertz’s Mechanics in this Kantian light will clarify his otherwise odd requirements of the scientific representation of an image, and will shed more light on the concept of permissibility.
10.1 Scientific representations
As we have seen in the previous section, the properties of permissibility, correctness and appropriateness were somehow interconnected in Hertz’s discussion of the three images of mechanics. Still, he insisted that in a scientific representation of an image one should distinguish clearly between them: The postulates already mentioned (about the permissibility, correctness, and appropriateness) are those which we assign to the images themselves: to a scientific representation of the images we assign different postulates. We require of this that it should lead us to a clear conception of what properties are to be ascribed to the images for the sake of permissibility, what for correctness, and what for appropriateness. Only thus can we attain the possibility of modifying and improving our images. (Hertz 1894, p. 3/2)
Where Hertz in the beginning of the Introduction spoke of permissibility, correctness, and appropriateness as though they applied to the image as a whole, the above quote implies that one can and should use these criteria locally and identify the elements that are in the image for the sake of permissibility, those that are in it for the sake of correctness, and those that are there for the sake of appropriateness. From a modern perspective this requirement seems strange, undesirable or even impossible. After all, from a modern point of view, logical permissibility of a system is a requirement of an entire (axiomatic) system, and it does not apply to a single definition or axiom. Similarly, Pierre Duhem (Duhem 1906) and others have argued that one cannot experimentally corroborate or falsify any single physical law, but only a whole theory. Any experiment will necessarily involve many elements of the theory, so it is not possible to blame a falsification on one particular law. And yet this is precisely what Hertz insisted one should do. In order to better understand what Hertz was up to with his requirement of a scientific representation of an image, it is helpful to recall one of the criticisms he levelled at the usual Newtonian image of mechanics. Having ascertained that the image owes its laws in part to experience, he insisted that therefore they ‘can again be annulled by experience’ (Hertz 1894, p. 11/9). He admitted that many physicists would consider it unthinkable that future experiments would falsify the laws of mechanics but he attributed this conservative outlook to ‘the fact that the elements of experience are to a certain extent hidden in them [the laws of mechanics] and blended with the unalterable elements which are necessary consequents of our thought’ (Hertz 1894, p. 11/9). According to Hertz, such a mix of the elements is acceptable in the formative phases of a science, but in a mature science it is inadmissible. This criticism of the classical theory of mechanics is similar to and probably derived from Mach’s critical analysis, which Hertz explicitly referred to. Mach had also argued that the classical theory of mechanics was a product of a historical development and therefore contingent. Only critical reflection could reveal how far it was philosophically necessary. Hertz’s requirement of a scientific image should be seen as an attempt to philosophically clarify this distinction between the philosophically necessary and the empirical parts of the theory of mechanics. As indicated in the quote above, Hertz believed that such a clarification was necessary for future modifications and improvements of images.
In particular, he did not exclude that future experience might contradict his own image of mechanics. But since it is quite clear which elements are ascribed to this image for the sake of correctness, one knows that precisely these elements need to be modified in the face of a future falsification. This seems to be the motive for Hertz’s requirement of a scientific representation of an image. As mentioned in Chapter 6, Hertz had earlier in connection with Maxwell’s equations declared: ‘To be sure, each individual equation cannot be tested against experience, only the system as a whole’ (Hertz 1890, p. 210). This seems to express adherence to the view expressed later by Pierre Duhem to the effect that only theories as a whole can be tested by experiment. And yet, in the Mechanics Hertz seems to have rejected the thesis. Why? And how did Hertz imagine that one could localize the elements of an image that are ascribed to it for the sake of permissibility? In order to clarify these questions we need to investigate which elements Hertz ascribed to an image for the sake of permissibility, which for the sake of correctness and which for the sake of appropriateness. In the introduction he first gave a general answer: What is ascribed to the images for the sake of appropriateness is contained in the notations, definitions, abbreviations, and, in short, all that we can arbitrarily add or take away. What enters into the image for the sake of correctness is contained in the results of experience, from which the images are built up. What enters into the images, in order that they may be permissible, is given by the nature of our mind. (Hertz 1894, p. 3/2–3)
As far as his own representation of his image was concerned Hertz explained at the end of the introduction that he would postpone the empirical elements: Before proceeding to mechanics proper, as dependent upon physical experience, I have naturally discussed those relations which follow simply and necessarily from the definitions adopted and from mathematics; the connections of these latter with experience, if any, is of a different nature from that of the former. (Hertz 1894, p. 42/35)
Indeed, this explains the division of Hertz’s Mechanics into two books, in which the empirical elements are introduced at the beginning of book two.
10.2 A Kantian division
However, it is interesting to note that in the main text of the book Hertz phrased the division in more Kantian terms. Thus he opened the first book with the following prefatory note: The subject-matter of the first book is completely independent of experience. All the assertions made are a priori in Kant’s sense. They are based upon the laws of the internal intuition of, and upon the logical forms followed by, the person who makes the assertions; with his external experience they have no other connection than these intuitions and forms may have. (Hertz 1894, §1)
The first book deals with geometry and Kinematics of Material Systems. What is missing from this book is the fundamental law that explains which of the many possible paths a free mechanical system will actually follow in nature. This law is introduced in the second book Mechanics of Material Systems, which is prefaced by the following note: In this second book we shall understand times, spaces, and masses to be symbols for objects of external experience; symbols whose properties, however, are consistent with the properties that we have previously assigned to these quantities either by definition or as being forms of our internal intuition. Our statements concerning the relations between times, spaces and masses must therefore also be in accordance with possible, and, in particular, future experiences. These statements are based, therefore, not only on the laws of our intuition and thought, but in addition on experience. The part depending on the latter, in so far as it is not already contained in the fundamental ideas, will be comprised in a single general statement which we shall take for our Fundamental law. No further appeal is made to experience. The question of the correctness of our statements is thus coincident with the question of the correctness or general validity of that single statement. (Hertz 1894, §296)
Thus, the empirical elements of Hertz’s presentation of his image are consciously postponed to the second book, and they are minimized to one single statement: the fundamental law of motion. Thus, if future experimental evidence would falsify the image, we would, according to Hertz, know exactly what to modify, namely the fundamental law (see Chapter 25 for a more comprehensive discussion).
Having found that the division of the various elements of Hertz’s image of mechanics is presented in one way in the introduction, and in another (more Kantian) way in the main text of the Mechanics, one might ask whether the divisions are really meant to be the same. I think they are. Indeed, the end of §296 quoted above makes it fairly certain that the empirical elements in the sense of the second book are exactly what Hertz in the introduction called the elements ascribed to an image for the sake of correctness. Secondly, the a-priori elements of the first book are obviously something like the elements ascribed to the image for the sake of permissibility. However, we are faced with the obvious problem that the classification of the elements in the introduction has three classes and the classification in the main part of the book has only two. If we identify the a-priori elements with the elements ascribed to the image for the sake of permissibility and the empirical elements with the elements ascribed to the image for the sake of correctness, what then do we do with the elements ascribed to the image for the sake of appropriateness? If we recall that according to Hertz, definitions are among those elements, the quote above from page 42/35 seems to group them in the first book with the a-priori elements. But it is also clear that they are not themselves a-priori but conventional. Since we also find definitions, for example of the concept of force, in the second book, we must conclude that the elements ascribed to the image for the sake of appropriateness are simply not covered by the classification into a-priori and empirical elements. Except for that, I think Hertz intended the two classifications to correspond to each other. This argument has an interesting consequence for our understanding of Hertz’s notion of permissibility.
Indeed by identifying the two classifications, we equate the a-priori elements of an image with those that are ascribed to it for the sake of permissibility. Recall that an image is called permissible if it does not contradict our ‘laws of thought.’ On the other hand, the a-priori elements depend upon ‘the laws of the internal intuition of, and upon the logical forms followed by, the person who makes the assertions.’ Now Hertz clearly believed that the laws of thought are part of the laws of the internal intuition plus the logical forms…, i.e. our laws of thought are a-priori. Conversely, matching the classification in the introduction with the classification in the main text implies that Hertz believed that our laws of thought include the a-priori intuitions. That does not mean that Hertz believed that all a-priori statements were analytic in Kant’s sense. He clearly agreed with Kant that Euclidean geometry is a-priori and synthetic (see Chapter 11). On the contrary it means that when Hertz wrote ‘laws of thought’ he did not restrict that to logic, but meant to include all a-priori knowledge in Kant’s sense. This interpretation would mean that when Hertz required his image to be permissible, he did not simply mean that it be consistent in the modern sense. In addition to being consistent in a modern sense it should also conform to our a-priori judgments. With such a reading, it becomes much easier to understand what Hertz meant when he required that a scientific representation should single out those elements that are ascribed to it for the sake of permissibility. It means that one should separate the a-priori from the a posteriori parts of the image.
With these reflections in mind, we can also understand why Hertz in his Mechanics rejected the ‘Duhem thesis’ that he had earlier applied in his work on electromagnetism. Indeed, it is rather clear that Maxwell’s equations all have the same a posteriori status. It is therefore consistent from Hertz’s point of view to let them stand and fall together. In mechanics, however, the basic elements have very different status: some are a-priori, others are a posteriori. The first type of elements are immune to empirical tests so the Duhem thesis can only apply to the latter. In Hertz’s mechanics there is only one a posteriori element, so here the Duhem thesis becomes vacuous. I shall return to the a-priori nature of Hertz’s concepts of space and time, in particular how these intuitions may relate to our experience of the outer world (see Chapter 11), as well as to the empirical nature of the fundamental law of motion (see Chapter 16). For now it suffices to stress that Hertz’s endeavor to separate the a-priori from the empirical elements of his image, and the division of his book according to these lines, show a distinctly Kantian influence on Hertz’s thought.
10.3 A metaphysics of corporeal nature

David Hyder (Hyder 2003) has suggested that Hertz’s Mechanics is a thoroughly Kantian endeavor on an even more fundamental level in that it provides what Kant in his Metaphysische Anfangsgründe called a ‘metaphysics of corporeal nature.’ Kant had argued that in addition to our a-priori intuitions of time and space, a rational mathematical science of nature requires a basic purely philosophical analysis of the concept of matter. This ‘metaphysics of corporeal nature’ should not be based on any particular experience but only on the empirical notion of matter as a concept of nature pertaining to the pure intuitions of time and space. When combined with the empirical laws of nature it would lead to the science of physics. As Hyder points out, this corresponds well to what Hertz did in his Mechanics, the first a-priori book containing the ‘metaphysics of corporeal nature,’ whereas the second book combines it with the empirical law of motion to provide an image of physics of material systems. Hyder further argues that not only can we think of Hertz’s Mechanics in this way, Hertz himself thought about it in this way as well. By way of argument Hyder points out that Hertz read Kant’s Metaphysische Anfangsgründe in 1883 and the following year referred to it in his Kiel Lectures On the Constitution of Matter. Hyder considers these lectures as being conceived with such a Kantian goal in mind.¹ Since these lectures were the starting point for several of the ideas that Hertz later worked out in his Mechanics, Hyder has constructed a historical link between Kant’s ‘metaphysics of corporeal nature’ and the content of this book. I think that Hyder argues convincingly that Hertz would have agreed that one can view his Mechanics as providing the ‘metaphysics of corporeal nature’ that Kant called for. However, I think it is unlikely that Hertz, while working on his Mechanics, had this Kantian motive at the front of his mind. My reason is not so much that on a technical level Hertz’s metaphysics for a science of matter is quite different from what Kant had in mind, in particular as far as the concepts of mass and force are concerned. Rather, I find it conspicuous that in the entire preface and introduction where Hertz explained the nature of his undertaking and how he had been led to it, he did not mention Kant once. There are explicit references to Helmholtz and Mach, and indirect reference to the British electromagnetic tradition, but none to the Königsberg philosopher.

¹ The presentation in Chapter 8 shows that I do not share that opinion.
10.4 Kantianism in the first draft of Hertz’s Mechanics. Existence problems

Another reason to think that a Kantian agenda was not one of the main forces that drove Hertz when he initially set to work on his Mechanics is the fact that his first plan of the Mechanics (in Ms 13) does not mention Kantian notions at all. It is not even divided into an a-priori first part and an empirical second part. For example, Hertz introduced the concepts of space, time and mass by mentioning how they are measured. This corresponds to his coordinating rules in the book (see Chapter 11). There is no discussion of an a-priori intuition of these concepts. The first sign of a Kantian influence can be found in the first long draft (Ms 9) of the book, which is divided into an a-priori first book and an empirical second book. However, this division is not highlighted at the beginning of the first book, as it is in the second and subsequent drafts and in the printed book. Rather it is made explicit only in the following ‘end note’ to the first book:

Logical value of the above: All the above were developments based on our inner intuition of time and space and the laws of our logical thinking [a priori according to Kant, based on the definitions and the laws of our intuitions]. With experience there is no other connection than the assumption that in reality there are systems of masses which satisfy the conditions for our material systems, i.e. systems of material points for which certain displacements are possible and others are impossible independent of any consideration of time. (Ms 9, p. 43)
Thus, in addition to being more like an after-thought, the explicit reference to Kant in brackets was clearly intercalated into the text. This seems to suggest that the Kantian agenda was not of primary importance to Hertz. Moreover, it is interesting that the division into an a-priori and an empirical book is not so clear-cut in the first draft, as in the book. In fact, Hertz assumed some empirical content in the first book of the draft, namely the existence in nature of a system of masses fulfilling the requirements of a material system (see Chapter 15), in particular some continuity requirements that according to Hertz determine the type of connections between the material points of the system. That Hertz initially felt obliged to assume the existence of such systems points to the idea that a definition should imply the assumption of the existence of the object defined. Many mathematicians at that time held such beliefs (e.g. Poincaré (Poincaré 1952, p. 152)) with the exception that for the mathematician, existence did not mean existence in the natural world, but existence in some mathematical sense (e.g. consistency).
Already Aristotle, however, had made a strict distinction between definition and existence. It is one thing to define a square (or a rectangular pentagon) and quite another thing to establish (by construction) that a square exists or that a rectangular pentagon does not exist. In the published book Hertz seems to have adopted a more Aristotelian stance. Here, he defined material systems in the first book, and derived many theorems about them, but only in book 2 did he announce that: ‘We know from experience that there is an actual content corresponding to the conceptions so defined’ (Hertz 1894, §306). In this way he removed all reference to experience from the first book and thereby highlighted the Kantian division between the a-priori and the empirical.
10.5 The division between kinematics and dynamics

If one assumes that Hertz did not have the Kantian agenda clear at the front of his mind when he began working on the Mechanics, how can one then explain that already the first draft is divided into an almost a-priori kinematics and an empirical dynamics? I will suggest that Hertz simply followed tradition. Indeed, such a division was commonplace at the time. The distinction originated with Ampère and was adopted by Thomson and Tait in their Treatise T and T’ (Smith and Wise 1989, Chapter 11). Even Helmholtz made a similar division in his lectures in Berlin on mechanics of discrete mass points. In §1 of his published lectures given during the winter of 1893–94 (i.e. before he had seen the manuscript of Hertz’s book on mechanics) he began as follows:

Dynamics covers the theory of those natural phenomena that can be traced back (reduced to) the motion of ponderable masses. In order to be able to formulate the observed laws in this field in a systematic and exact way, we need to begin by introducing those concepts, that are suitable for a mathematical description of the observed motions. The theory of the introduction of these concepts and their connections is called Kinematics. Thus in this theory we do not yet deal with the laws of nature, which can only be determined through external experience, but rather with logical conceptualisations. (Helmholtz 1898, §1)
Considering that Helmholtz in his papers on geometry argued for an empiricist conception of space it is highly surprising to see that he characterized kinematics, which clearly includes space and time, as a logically conceptual science that is not determined by external experience. I think that this shows that at the end of the nineteenth century the German physics community shared some rather Kantian ideas regarding mechanics that even Helmholtz chose to follow, even if they clashed with his anti-Kantian views on the status of Euclidean geometry. It is therefore very possible that a physicist like Hertz, who was probably less critical of Kant than his mentor, would rather unreflectingly follow this somewhat Kantian tradition. So let me conclude by summarizing the Kantian influences on, and the Kantian features of, Hertz’s Mechanics. From the start of his involvement with mechanics he seems to have subscribed to some generally accepted Kantian ideas about the nature of mechanics, in particular the idea of an a-priori kinematics and an empirical
dynamics. He probably acquired these ideas from the general physics tradition at the time, although his own reading of Kant may also have been influential as suggested by Hyder. The Kantian ideas were apparently not initially the primary driving force behind Hertz’s involvement with mechanics, but during the two years he worked on his book, he sharpened and highlighted its Kantian features. Thus the printed book, and in particular its sharp distinction between the a-priori and the empirical, shows a strong Kantian influence. Also Hertz’s requirement of permissibility needs to be understood in this Kantian context.
11 Time, space and mass
Hertz introduced the three fundamental concepts of his mechanics, time, space, and mass, at the beginning of book one and then returned to them at the beginning of book two. In accordance with the general character of each book, the concepts are first introduced as pure a-priori intuitions in book one, and then as ‘applied’ symbols for objects of external experience in book two. In this chapter we shall first analyse Hertz’s concepts of space, beginning with the pure concept and ending with the applied concept, with special attention to the problematic connection between them. In particular, we shall try to understand how Hertz could attribute a-priori status to Euclidean geometry, even though he was familiar with the recent development of non-Euclidean geometries. Our discussion of Hertz’s conception of space and geometry will shed light on his introduction of a ‘geometry of systems of points.’ Hertz dealt with the concept of time in a way similar to the concept of space, so there is little more to say about this concept, in particular because there is no contemporary discussion of non-Euclidean time to contextualize Hertz’s ideas. His introduction(s) of the concept of mass, however, raises a lot of interesting questions. One of them concerns the notion of Massenteilchen from which all matter is built up, another concerns the concept of concealed masses, and a third concerns Hertz’s coordinative rule stating that mass is determined by weighing. We shall discuss these questions in this chapter. A deeper analysis of the concept of Massenteilchen, and of Hertz’s gradual development of it, will be postponed till the next chapter, after the introduction of the line element of Hertz’s geometry of ‘systems of points.’
11.1 Space

In accordance with the a-priori nature of book one, Hertz introduced space as follows:

The space of the first book is space as we conceive it (der Raum unserer Vorstellung). It is therefore the space of Euclid’s geometry, with all the properties which this geometry ascribes to it. (Hertz 1894, §2)
In the prefatory note of the second book, however, Hertz declared:

In this second book we shall understand times, spaces, and masses to be symbols for objects of external experience; . . . Our statements concerning the relations between times, spaces and masses must therefore also be in accordance with possible, and, in particular, future experiences. These statements are based, therefore, not only on the laws of our intuition and thought, but in addition on experience. (Hertz 1894, §296)
So where the first book presents a pure geometry, the second presents geometry as an applied science. We shall first discuss Hertz’s pure geometry and then turn to the applied geometry, and in particular to the connection between them.
11.1.1 Pure geometry

It is revealing to compare Hertz’s division of geometry into a pure and an applied geometry with the more famous division made by Einstein in his paper ‘Geometrie und Erfahrung’ (Einstein 1921), which is often cited even today. Einstein distinguished between Hilbert-style pure axiomatic geometry and practical geometry, which arises from our experience with almost rigid bodies, and he concluded with the often-quoted words:

As far as the propositions of mathematics refer to reality they are not certain; and as far as they are certain, they do not refer to reality. (Einstein 1921, p. 3)
Is Hertz’s distinction between the two types of geometry the same as Einstein’s distinction? In particular, is the pure geometry of Hertz’s first book simply formalistic axiomatic geometry? In the continuation of the explanation of the concept of space in §2 quoted above Hertz might allow this possibility: It is immaterial to us whether these properties are regarded as being given by the laws of our internal intuition, or as consequences of thought which necessarily follow from arbitrary definitions. (Hertz 1894, §2)
Is the latter possibility Hertz’s way to describe formal axiomatic geometry starting from arbitrary axioms? That is possible, but perhaps not so likely. Indeed by 1893, six years before Hilbert’s Grundlagen der Geometrie, this view of geometry had only been hinted at even in the mathematical community. It seems more likely that Hertz meant what he wrote, namely that we may consider the definitions (say of point, line, plane, parallel, circle, etc.) to be arbitrary, but that the consequences of the definitions are necessary. Such a point of view would almost be the opposite of Hilbert’s view. To Hilbert the axioms are arbitrary, whereas the objects of geometry are undefined, or only defined indirectly through the axioms. But irrespective of how one should interpret Hertz’s remark about arbitrary definitions, the possibility of viewing geometry as a result of arbitrary definitions seems to come as an after-thought. Everywhere else in the book, including the places quoted above, Hertz described geometry of the first book as the geometry of our intuition and not as a formal system given by arbitrary axioms or definitions. The remark seems to serve as a disclaimer: if the reader does not admit that Euclidean geometry is a-priori the geometry of our intuition but is a result of arbitrary definitions, this does not invalidate the remaining part of the book.
Hertz’s Kantian belief in the a-priori Euclidean nature of our intuition of space is remarkable at a time where non-Euclidean geometries were widely discussed. Indeed from the very beginning the possibility of non-Euclidean geometries had been viewed as a proof that Kant was simply wrong. Already, Gauss, who was the first to entertain the possibility that the space we live in might be non-Euclidean, in the sense that the parallel axiom does not hold, was of the opinion that ‘we must not put geometry on a par with arithmetic that exists purely a-priori but rather with mechanics’ (Gauss to Olbers 1817, (Gauss 1817)). Lobachevsky and Janos Bolyai, who were the first to publish works on non-Euclidean geometry around 1830 were of a similar opinion and so was Riemann. In his Habilitationsvortrag (Riemann 1867b) Riemann introduced manifolds that are even more non-Euclidean in which the curvature need not be constant and negative as in Lobachevsky’s and Bolyai’s geometry but could be positive and even vary from point to point. Moreover, Riemann allowed of dimensions higher than three. He did not use the word ‘geometries’ for these manifolds, reserving this term for the physical space we live in but he insisted that it was an empirical question to decide which of the a-priori equally possible manifolds corresponds to physical space. In particular, he pointed out that whereas it seems well established that real space is 3-dimensional and unbounded, it is not at all inconceivable that it is slightly positively curved (a cosmological effect), and thus finite, and that it might have variable curvature on an atomic scale, as long as the curvature integrates up to almost zero over measurable distances. He suggested that the variable curvature might be a result of the binding forces between the matter in space. This is an interesting suggestion, because it views space as more than just a container of matter by assuming some connection between space and the matter it contains. 
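The role played by the sign of the curvature in this discussion can be made concrete by a standard result for two-dimensional surfaces, the Gauss–Bonnet theorem for geodesic triangles. What follows is only an illustrative sketch in modern notation, not a formula that Riemann or Hertz wrote in this form; the symbols α, β, γ (the angles of a geodesic triangle T), K (the Gaussian curvature) and A(T) (the area of T) are assumptions introduced here for the illustration.

```latex
\[
  \alpha + \beta + \gamma - \pi \;=\; \iint_{T} K \, dA ,
  \qquad\text{so that for constant } K:\quad
  \alpha + \beta + \gamma \;=\; \pi + K \, A(T).
\]
```

On such a surface the angle sum is therefore less than, equal to, or greater than two right angles according as K is negative (Lobachevsky and Bolyai), zero (Euclid), or positive (spherical geometry). If K varies on a very small scale but its integral over a triangle of measurable size is almost zero, the measured angle sum stays almost equal to two right angles, which is one way to read the two-dimensional analogue of Riemann’s remark.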
These revolutionary thoughts about space and geometry were not widely known before the late 1860s when Gauss’s letters and Riemann’s Habilitationsvortrag were published (1866 and 1867, respectively) and the works of Lobachevsky and Bolyai were republished in a more easily accessible form. But by the 1870s non-Euclidean geometry was widely publicized by Eugenio Beltrami (Beltrami 1868a), Felix Klein (Klein 1873) and Helmholtz (Helmholtz 1868), (Helmholtz 1870a). The work of the two former made it clear that non-Euclidean geometry was not impossible, but in modern parlance just as consistent as Euclidean geometry. Helmholtz combined this new mathematical tool with his physiological and psychological research on perception in order to argue that since we know, for example through the motion of our body, that rigid bodies exist, the space we live in must be a Riemannian manifold of constant curvature. Whether this curvature is negative, zero, or positive, corresponding to Lobachevsky geometry, Euclidean geometry, or what he called spherical geometry, is, according to Helmholtz, a matter that has to be determined experimentally. In fact, he admitted that we cannot really know if there exist rigid bodies that do not change shape or size when moved around, or whether bodies just change shape and size in some regular fashion. This idea was carried further by Henri Poincaré who maintained that it is a convention that space is Euclidean. According to him it is meaningless to ask the question: ‘Is Euclidean geometry true? We might as well ask if the metric system is true and if the old weights and measures are
false: if Cartesian coordinates are true and polar coordinates false. One geometry cannot be more true than another; it can only be more convenient’ (Poincaré 1902, p. 50).¹ Yet, to Poincaré Euclidean geometry is and will continue to be the most convenient description of space, because it is the simplest in the sense that in Euclidean geometry the subgroup of translations is a normal subgroup of the group of rigid motions (see (Gray 2004)). Now, it is obvious that Hertz could not have known about Poincaré’s conventionalism. However, he was well aware of the general ideas of non-Euclidean geometry. Already on November 25th 1877 as a student in München he reported back to his parents:

The entire new mathematics (from about 1830 on) is, I think, of no great value to the physicist, however beautiful it may be intrinsically, for I find it so abstract, at least in parts, that it no longer has anything in common with reality; for instance, the non-Euclidean geometry, which is based on the assumption that the sum of the angles in a triangle need not be always equal to 2 right angles, or the geometry dealing with space of four, five, or more dimensions etc. Even the elliptical functions are, I think, of no practical value. But perhaps I am mistaken. (Hertz 1977, pp. 71–72)
Later, when Hertz went to study in Berlin, he must surely have heard of his mentor Helmholtz’s empiricist conceptions of geometry. And yet Hertz declared in his Mechanics that our intuition of space is necessarily Euclidean, as Kant had assumed. How is that possible? He clearly accepted the possibility of constructing high-dimensional analytic geometries of varying curvature. In a sense his ‘geometry of systems of points’ of the Prinzipien der Mechanik is precisely that. He seemingly realized that contrary to his earlier opinion such geometries could be of value to the physicist. And still in the introduction to the book he characterized the high-dimensional geometries of the mathematicians in a derogatory way as ‘suprasensible abstractions’ (Hertz 1894, p. 39/33). This corresponds well with his statement in the 1877 letter quoted above to the effect that non-Euclidean geometries have nothing in common with reality. These facts seem to suggest that Hertz had the following ideas about pure geometry: It is possible to create, in an analytic or axiomatic way, consistent ‘geometries’ that are different from the Euclidean one. Yet these analytically or axiomatically defined manifolds do not deserve the name geometries, because they do not correspond to our intuition of space as we sense it. Geometry is an abstract science of sensible space (this was an idea shared by almost all mathematicians before Hilbert’s Grundlagen der Geometrie) and our intuition tells us that this space is Euclidean. This is not an empirical question (not until the introduction of applied geometry of book 2), but a question that we can decide a-priori. Non-Euclidean geometries do not and cannot correspond to our conception of sensible space. They are suprasensible as Hertz wrote in the book. We find a similar view expressed by all students when they are presented with non-Euclidean geometry for the first time. 
This is not to say that the feeling is naive, but on the contrary, that it is in some sense true. Our intuition of space (though not in a Kantian sense) seems to be Euclidean. Poincaré’s conventionalist stance even shows that as late as 1900 a view of Euclidean geometry as the only convenient description of space could be the result of deep considerations. However, Hertz’s Kantian point of view was definitely conservative in the 1890s although not at all uncommon, even among mathematicians.

¹ Similar ideas were already expressed in (Poincaré 1895).
11.1.2 Applied geometry

In the second book, geometry is an applied science in the sense that spaces (distances) are now ‘symbols for objects of external experience.’ Yet it does not differ from the pure geometry of book one because, according to Hertz, the properties of these symbols ‘are consistent with the properties that we have previously assigned to these quantities either by definition or as being forms of our internal intuition.’ (Hertz 1894, §296). Why is it so? Why is the geometry of our intuition applicable to the symbols for objects of external experience? Hertz does not give any unique and explicit answer to this intriguing question. In fact, he hints at three different answers: a Kantian answer, an empiricist answer and a conventionalist answer.

1. The Kantian answer agrees best with the whole structure of Hertz’s Mechanics. As we have pointed out above Hertz insisted that the only empirical element of his mechanics was the Fundamental law. All the rest is therefore independent of experience and thus seemingly a-priori. That must mean that also applied geometry of the second book is a-priori Euclidean. Hertz does not give an argument for that, but an argument along the following Kantian lines seems to be implied: Our pure intuition of space is Euclidean and since this is also the form of empirical intuition it must necessarily apply to our experience. In other words, since we ‘process’ our external impressions through our a-priori intuition of space, our experience will be in accordance with Euclidean geometry. Such a reading of Hertz is consistent with almost everything in the Prinzipien der Mechanik. Yet, a few remarks reveal that Hertz did not think that the connection between pure and applied geometry was that simple.

2. In §296, quoted in Section 10.2, where Hertz claimed that his fundamental law was the only empirical element of his image, he did leave a slight doubt as to the possible empirical content of the basic notions of space and time.
In fact, he wrote that the part of his image that depends on experience ‘in so far as it is not already contained in the fundamental ideas’ will be comprised in the fundamental law. This could perhaps be understood in a Kantian fashion, to the effect that our a-priori Euclidean geometry does indeed have consequences for the way we conceive space. However, in §299 Hertz more explicitly allows for an empirical testing of geometry. In fact, he pointed out that the Kantian argument for the applicability of Euclidean geometry does not work in the simple way indicated above. Indeed, he made it clear that in order to speak of the applicability of our pure intuitions of space to our sensations of external experience, we need a coordinative rule for translating external experiences of concrete spaces (distances) into the language of the image, i.e. into the language of our pure intuition of space. In agreement with the metric nature of his image he gave a rule for measuring distances (rather than a rule for determining straight lines):

Rule 2. We determine space relations according to the methods of practical geometry by means of a scale. The unit of length is settled by arbitrary convention. A given point in space is specified by its relative position with regard to a system of coordinates fixed with reference to the fixed stars and determined by convention. (Hertz 1894, §299)
After this statement of the coordinative rule, which contains the only mention in the Mechanics of the hotly debated question of the existence of absolute space and the determination of inertial frames, Hertz continued: We know by experience that we are never led into contradictions when we apply all the results of Euclidean geometry to space relations determined in this manner. The rule is also determinate and unique, except for the uncertainties which we always fail to eliminate from our actual experience, both past and future. (Hertz 1894, §299)
Thus, Hertz did admit the possibility that we could be led into contradiction by applying Euclidean geometry to space relations determined according to this rule. Indeed it is hard to see why it should be inconceivable that one could come across a (large) geodesic triangle with an angle sum different from 180°. It is a matter of experience that we have not come across such a triangle yet. Now it is important to note that the above empirical test does not, according to Hertz, test Euclidean geometry, but the coordinative rule 2. That means that we can continue reading Hertz along the following somewhat Kantian lines: applied geometry is a-priori Euclidean. Experience tells us that we can translate external experience into the language of Euclidean geometry by way of rule 2. If, however, future experience will prove this to be wrong, we must find an alternative coordinative rule but we can continue to apply the Euclidean geometry of our intuition. This reading may rescue the a-priori nature of Euclidean geometry, but it is questionable how Hertz can maintain that the fundamental law of motion is the only empirical element of his image. This coordinative rule seems to be an empirical element too, unless we declare that it is not a part of the image at all. However, without the coordinative rule, it makes no sense to claim that one can test the fundamental law. I shall return to this question in Chapter 25. For now it is interesting to note that by conferring the empirical testability to the coordinative rule rather than to Euclidean geometry itself, Hertz came close to Poincaré’s later point of view. In fact, Poincaré also focused upon the dictionary by which we translate things in the physical world into mathematical terms.
For example, we may translate ‘light ray’ into ‘straight line.’ However, if future experience tells us that there are triangles composed of light rays, whose angle sum is different from 180°, then we should not reject Euclidean geometry, but the translation. This is the basis for Poincaré’s conventionalism.

3. There are some remarks by Hertz that connect his view of geometry even closer to Poincaré’s conventionalist view. In §304 Hertz stressed the conventional nature of rule 2 and the corresponding rules for translating times and masses from our sensations into our image:

Observation 3. There is, nevertheless, some apparent warrant for the question whether our three rules furnish true or absolute measures of time, space, and mass, and this question must in all probability be answered in the negative, inasmuch as our rules are obviously in part fortuitous and arbitrary. In truth, however, this question needs no discussion here, not affecting the correctness of our statements, even if we attached to the question a definite meaning and answered it in the negative. It is sufficient that our rules determine such measures as enable us to express without ambiguity the result of past and future experiences. Should we agree to use other measures, then the form of our statements would suffer corresponding changes, but in such a manner that the experiences, both past and future, expressed thereby, would remain the same. (Hertz 1894, §304)
This passage seems to hint at something like Helmholtz’s mirror world (Helmholtz 1870a, 57ff): Instead of measuring lengths by way of rigid rods (a scale) we first reflect the world in a strangely curved mirror and then measure lengths in the mirror image by way of a rigid rod. In this way the world would not be described correctly by Euclidean geometry but by way of some non-Euclidean geometry depending on the mirror. The ‘form of our statements would suffer corresponding changes.’ Hertz did not analyse what kind of changes would result, nor did he argue why he preferred the usual way of measuring lengths (Rule 2). Thus, Hertz’s brief anticipation of conventionalism remained an inconsequential parenthesis in the book.
11.2 Time

Time, too, is introduced at the start of book 1 as an a-priori intuition:

The time of the first book is the time of our internal intuition. It is therefore a quantity such that the variations of the other quantities under consideration may be regarded as dependent upon its variation; whereas in itself it is always an independent variable. (Hertz 1894, §2)
In book two, however, time is considered as an image of external experience. More precisely, Hertz pointed out that the concept of time as well as that of space and mass in themselves ‘are in no sense capable of being made the subjects of our experience, but only definite times, space-quantities, and masses. Any definite time, space-quantity, or mass may form the result of a definite experience.’ As in the case of space-quantities, a coordinative rule is required to make our intuitions into symbols for objects of external experience: Rule 1. We determine the duration of time by means of a chronometer, from the number of beats of its pendulum. The unit of duration is settled by arbitrary convention. To specify any given instant, we use the time that has elapsed between it and a certain instant determined by a further arbitrary convention. (Hertz 1894, §298)
As in the case of the rule for measuring lengths, Hertz pointed out that this rule is somehow indeterminate (§298), but he maintained that the indeterminateness is not
due to an indeterminateness of our image or of the coordinative rule but due to the indeterminateness of the external experience itself, in the sense that we have no way of determining times more accurately than by the best chronometer (§303). Hertz’s choice of a coordinating rule using a chronometer rather than a rule appealing to astronomical occurrences, such as the day (the time of revolution of the Earth), seems to be dictated by questions of accuracy. This may seem a very unphilosophical choice, one that makes time dependent on the Earth’s gravitational field and, through the centrifugal force, on its time of revolution. However, any coordinative rule will contain such arbitrary and possibly varying factors, so the choice of the chronometer is just as philosophically pleasing as any other. Indeed, the use of the Earth’s effective gravitational field (including centrifugal forces) as a standard against which time is measured gives a nice consistency to Hertz’s coordinative rules: as we shall see, Hertz’s rule for measuring mass appealed to a scale, which similarly depends on the effective gravitational field. Moreover, after Hertz had introduced the fundamental law he was able to explain the appropriateness of his choice of coordinating rule. He remarked that the fundamental law (in addition to the straightestness of the path) implied that: 2. Different free systems describe in identical times lengths of their paths proportional to each other. 3. Time as measured by a Chronometer (§298), increases proportionally to the length of the path of any one of the free moving systems. The first two statements alone contain facts of a general nature derived from experience. The third only justifies our arbitrary rule for the measure of time, and only includes the particular experience that in certain respects a chronometer behaves as a free system, although, strictly speaking, it is not such. (Hertz 1894, §323)
Thus, the choice of a chronometer is appropriate because it behaves like a free system. Any free system would have done the job equally well. This observation also reveals that the coordinating rule in a sense reduces the measure of time to the measure of distance, a reduction that fundamentally relies on the empirical fundamental law.
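The reduction of time to path length can be written out in a line or two. The following is only a sketch in modern notation, not Hertz’s own: the symbols $s_1, s_2$ (path lengths of two free systems), $v_1, v_2$ (their constant rates) and the initial instant $t_0$ are introduced here for illustration.

```latex
% Statement 2: two free systems describe, in identical times, path lengths in a fixed ratio
\frac{s_1(t) - s_1(t_0)}{s_2(t) - s_2(t_0)} = \frac{v_1}{v_2} = \text{const.}
% Statement 3: chronometer time increases proportionally to the path length of any one free system
t - t_0 = \frac{s_1(t) - s_1(t_0)}{v_1}
```

On this reading, choosing the unit of duration amounts to fixing the constant $v_1$ for one reference system, whereas the constancy of the ratios $v_1/v_2$ is a matter of experience.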
11.3 Mass. The constitution of matter
Hertz’s introduction of the concept of mass in the first book differs fundamentally from his introduction of the concepts of time and space. The latter are, according to Hertz, a-priori intuitions, whereas ‘the mass of the first book will be introduced by a definition’ or rather by a series of definitions. This makes the defined concept a property ascribed to the image for the sake of appropriateness (see Chapter 10). It could, according to Hertz, be arbitrarily added or taken away. Already when introducing the concept of space, we saw Hertz appeal to ‘arbitrary definitions.’ However, we concluded that in the case of geometry it was probably only meant as a sort of lip service to satisfy non-Kantians. In the case of mass, however, there is no doubt that Hertz really meant his definition as an arbitrary definition. That means that the definition makes no ontological claim in the sense of pretending to portray the real
nature and constitution of matter. It is only a convenient image. Of course, this holds true of all elements of an image, and of Hertz’s image in particular, but in the case of a concept introduced by a definition, there is much more freedom and arbitrariness than when we deal with the a-priori elements and the elements introduced for the sake of correctness. We cannot test the arbitrary definitions against our inner logic, nor against the experiences of the external world. The only criterion for the definitions is whether they provide an appropriate, in particular a simple image. Therefore, when analysing Hertz’s image of mass and matter, we must have this appropriateness in mind. In particular, when we are dealing with a rather axiomatically and deductively constructed theory such as Hertz’s mechanics, we should not forget that the question of simplicity is determined by the whole axiomatic system, rather than locally. That means that when we want to understand why Hertz came up with his image of matter rather than another image we must take the later use of the definitions into account. To phrase it in a Latourian fashion, we need to look at the concept in action. In the next chapter, we shall see how the concepts of mass and matter act in connection with the ‘geometry of systems of points,’ and I shall argue that this context was decisive, when Hertz chose his image of matter. In this chapter, however, I shall consider Hertz’s image of matter in the context of earlier theories on the constitution of matter, in particular his own earlier ideas as expressed in his Kiel Lectures.
11.3.1 The constitution of matter. Book one
The fundamental unit from which all matter is built up in Hertz’s mechanics is the Massenteilchen: Definition 1. A material particle [Massenteilchen] is a characteristic by which we associate without ambiguity a given point in space at a given time with a given point in space at any other time. Every material particle [Massenteilchen] is invariable and indestructible. The points in space which are denoted at two different times by the same material particle [Massenteilchen], coincide when the times coincide. Rightly understood, the definition implies this. (Hertz 1894, §3)
In the following I shall use Hertz’s original German term Massenteilchen (small mass part) rather than the term material particle used in the English translation of the Prinzipien der Mechanik. The reason is that I find it confusing that this smallest mass part is called a particle, whereas the larger entities that are built up from finitely or infinitely many Massenteilchen are called ‘material points,’ a literal translation of the German ‘materieller Punkte.’ In ordinary language a particle is larger than a point, but in the English translation of Hertz’s book it is the reverse. So in order to help the reader form appropriate images of the basic concepts, I shall cling to the German word Massenteilchen. But what is the image Hertz wanted to give his reader of the basic element of mass? The definition is, in fact, rather non-suggestive. According to the definition, a Massenteilchen is what we would today call a function of time having values in space. Expressed in terms of the mathematical language of Hertz’s image it is a
vector function from the real numbers (representing times) to R3 (representing space). According to the views expressed in Hertz’s Kiel Lectures (Hertz 1884/1999, p. 36), such time and space relations are the essential characteristics of matter (see Chapter 8). However, in these lectures, Hertz argued that it would be impossible to conceive the elements of matter without adding inessential properties. ‘What we add are not false ideas, but it is the conditions of conceptualization (Vorstellbarkeit) as such’ (Hertz 1999, p. 36). In particular, he argued that it is impossible to have a conception (image) of matter on the atomic level without imagining that it has macroscopic qualities, such as temperature, color, etc., properties that in fact lose their meaning on this level. To convey the properties of the perceptible physical world to its smallest parts is allowed when we keep in mind in how far these properties are essential – that is always only relations between quantities (Größenbeziehungen) – and in how far they are only added in order to make a conceptualization (Vorstellung) possible. (Hertz 1999, pp. 36, 37)
Yet, in his definition of Massenteilchen in his Mechanics Hertz only attributed to them the very properties that he had declared the essential ones in 1884, namely time and space relations. In light of his earlier lectures, this seems to be a very conscious move. He seems to have decided that he could after all create an image of the smallest building blocks of matter without attributing to them any inessential elements. In a certain sense he had already in his Kiel Lectures acknowledged the possibility of creating an image of matter without any inessential properties. Indeed he presented Boskovic’s idea of an atom as a geometric point with a force field attached to it. Hertz explained that this definition could be and had been interpreted in two ways. One can interpret the atom as the force field itself (Hertz 1999, p. 114), or one can think of the geometric point as an extensionless atom that carries the force field (Hertz 1999, p. 112). The Massenteilchen is, in a sense, what remains of this latter version of Boskovic’s atom if one removes the force field, which has no place in Hertz’s forceless image of mechanics. In the Kiel Lectures Hertz mentioned four properties that had been emphasized as characteristic or defining properties of matter: extension, movability, impenetrability, and indestructibility. He carefully argued that none of these properties was clear, undisputed, or a necessary consequence of experience. Boskovic’s atoms, considered as points, gave an example of a perfectly legitimate image in which matter has itself no extension; and considered as infinite force fields, these atoms are not impenetrable. As for the concept of movability, which Kant had considered the most important property of matter, Hertz argued that it is dubious whether it is a clear and significant property. His doubts about the significance and meaning of these three properties of matter are reflected in his definition of Massenteilchen in the Mechanics. 
These smallest parts of matter have no extension, the question of impenetrability is not raised at all and they may be movable, but since constant functions of time are allowed, they need not be movable. In the Kiel Lectures Hertz analysed the concept of indestructibility in more detail. According to Hertz it is not directly dictated by experience. On the contrary, many experiments seem to point to loss or creation of mass. For example, a cup of water
left to itself will gradually dry up. We have invented auxiliary hypotheses that can save the doctrine of the indestructibility of matter from such apparent falsifications, but the idea itself is not rooted in experience. Indeed it dates back to a time long before precision weighing was a part of chemical or physical practice. Rather, ‘Indestructibility is a requirement of our mind’ (Hertz 1999, p. 115). More precisely he argued as follows: The lawlikeness in nature, the form of which we search for, but whose existence we presuppose, is impossible, if we assume that from nothing and without sufficient reason something can come into being, and conversely the world would gradually stop to exist, if nothing could come into being, but here and there things could return to nothing. (Hertz 1999, p. 115)
Yet, Hertz revealed that some recent experiments had led some physicists to assume that matter could be converted into energy and conversely. That would mean that matter and energy taken together would be conserved. It might also be that we have to look for something even more extensive that we can assume to be constant. ‘This beautifully illustrates how our dogma [of indestructibility of matter] is a mix of a-priori and empirical components; the assumption of the constancy of something is a-priori, whereas it is an empiric claim that it is the weight, the mass, the amount of sensible matter which is the constant thing’ (Hertz 1999, p. 117). In the Mechanics Hertz declared that ‘every material particle [Massenteilchen] is invariable and indestructible’ (see the quote from §3 above). This statement is a part of a definition and has therefore, according to Hertz, neither a-priori nor empirical status. It is a convention. Yet in book 2 after Hertz had explained that mass is determined by weighing, he declared: The mass of a tangible body as determined by this rule possesses the properties attributed to the ideally defined mass (§4). That is to say, it can be conceived as split up into any number of equal parts, each of which is indestructible and unchangeable, and capable of being employed as a mark to refer, without ambiguity, a point of space at one time to a point of space at any other time. (Hertz 1894, §300)
It is important to note that it is the constancy of the tangible matter that is being postulated here. That means that Hertz excluded the possibility of converting tangible matter into concealed mass. Also in the Kiel Lectures, Hertz clearly spoke of the indestructibility of the ponderable matter, and did not admit conversion from ponderable matter to ether stuff. Thus, according to Hertz’s argument from 1884 there must be some empirical content in the statement in §300 quoted above. The Kiel Lectures definitely imply that the constancy of matter itself is empirically testable. However, as in the case of time and space, the wording of §300 suggests that by the 1890s Hertz rather thought that it was the coordinating rule that could be tested: if a future experiment were to falsify mass conservation, one should look for a different way of measuring mass, rather than assume that mass could be converted into something else such as energy. This would come close to a Poincaré-type conventionalism. And, as in the case of space and time, it would problematize Hertz’s claim that only the fundamental law has empirical content.
The definition of Massenteilchen thus avoided almost all the inessential properties that Hertz ten years earlier had declared to be necessary parts of an image of the smallest parts of matter. Only the essential time and space relations were left as well as the property of indestructibility that Hertz in 1884 had declared to be an important mix between a-priori and empirical knowledge. This conscious change of opinion on Hertz’s part is partly a result of the fact that in 1884 when talking about the smallest parts of matter, he usually had atoms in mind, whereas in the Mechanics the Massenteilchen are much smaller. However, as explained in Chapter 8, it is also a result of a change in the meaning of the term ‘image.’ In 1884 an image was a colorful, very visual picture, similar to what he called the ‘gay garment’ in his 1892 book. In 1894, however, an image had turned colorless and austere. We might read Hertz’s account in the Mechanics of the constitution of matter as an attempt to reduce the concepts of matter and mass to geometry (the position of the material particle) and arithmetic (the number of ‘characteristics’ [Merkmale] it consists of). Such a reduction would parallel Descartes’s attempt to reduce the property of matter to the geometric property of volume. However, Hertz did not go that far. He included mass together with space and time as the basic concepts of mechanics, and thereby of physics. While he defined force and energy a posteriori from the other three basic concepts, he did not claim he had defined mass in terms of space and time. With the definition of Massenteilchen in place, Hertz could define the concept of mass: Definition 2. The number of material particles [Massenteilchen] in any space, compared with the number of material particles [Massenteilchen] in some chosen space at a fixed time, is called the mass contained in the first space. (Hertz 1894, §4)
This definition is based on the assumption that the material world consists of a set of Massenteilchen, or a set of vector-valued functions of time. The mass contained in a given region of space at a given time (this must be implied by the definition) is proportional to the cardinality of the set of these functions that at the given time have values in the given region. More precisely, the mass is the number of these functions divided by a given reference number. He assumed this reference number to be given as the number of Massenteilchen in a given reference space. This move was clearly made to make the ideal definition correspond to the usual way we measure mass against a certain reference mass, such as a 1-kg weight. The definition seems to imply that mass can only assume discrete values, namely integral multiples of the mass of one Massenteilchen. In order to avoid that, Hertz assumed that the reference number is infinitely great: We may and shall consider the number of material particles in the space chosen for comparison to be infinitely great. The mass of the separate material particles [Massenteilchen] will therefore, by the definition, be infinitely small. The mass in any given space may therefore have any rational or irrational value. (Hertz 1894, §4)
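The counting definition can be made concrete with a small computational sketch. It is only an illustration under stated assumptions, not anything found in Hertz: the reference number is taken to be finite (the very case Hertz excluded), the trajectories are uniform motions chosen for simplicity, and the names `particle`, `in_box`, `mass` and `N_REF` are hypothetical.

```python
from fractions import Fraction

# Hypothetical model: a Massenteilchen is a function from time to a point of R^3.
def particle(x0, velocity):
    """Return a trajectory t -> (x, y, z); uniform motion is chosen only for simplicity."""
    return lambda t: tuple(x + v * t for x, v in zip(x0, velocity))

def in_box(point, lo, hi):
    """True if every coordinate of point satisfies lo <= coordinate < hi."""
    return all(l <= p < h for p, l, h in zip(point, lo, hi))

def mass(particles, region, t, reference_count):
    """Number of trajectories inside `region` at time t, divided by the reference number."""
    lo, hi = region
    count = sum(1 for p in particles if in_box(p(t), lo, hi))
    return Fraction(count, reference_count)

# Three particles at rest inside the unit cube and one drifting out along x.
particles = [particle((0.1 * k, 0.5, 0.5), (0.0, 0.0, 0.0)) for k in range(1, 4)]
particles.append(particle((0.9, 0.5, 0.5), (1.0, 0.0, 0.0)))

unit_cube = ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
N_REF = 4  # assumed finite reference number, for illustration only

m_start = mass(particles, unit_cube, t=0.0, reference_count=N_REF)
m_later = mass(particles, unit_cube, t=1.0, reference_count=N_REF)
print(m_start, m_later)
```

With a finite `N_REF` every mass computed this way is an integral multiple of `1/N_REF`, which is exactly the discreteness that Hertz avoided by taking the reference number to be infinitely great.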
This specification is, of course, mathematically problematic. Indeed, what does it mean to divide one infinite number by another? It was certainly not a clear notion in either the classical mathematics of the 1890s or the then recent Cantorian set theory. As usual, Hertz did not quibble with such mathematical subtleties. He was after conceptual clarity and not after mathematical rigor. His number concept was much closer to the sixteenth- and seventeenth-century concept containing infinitesimals and infinitely large quantities, than to the Dedekind–Cantor–Weierstrass concept of the real numbers. Today we may save Hertz’s reasoning by way of non-standard analysis, but by the standards and methods of the late nineteenth century, Hertz’s definition of mass was certainly mathematically unrigorous. Nonetheless, by assuming that the number of Massenteilchen in the reference space was infinitely large, Hertz ensured that masses can range over all positive real numbers. This made this part of his image immune to empirical test. It could be upheld whether future experiments revealed that masses only come in discrete multiples of a smallest mass, as an atomic theory might suggest, or no such effect was ever measured. In fact, Hertz’s mechanics did not primarily deal with the infinitely small Massenteilchen, but with collections of Massenteilchen, the so-called material points: Definition 3. A finite or infinitely small mass, conceived as being contained in an infinitely small space, is called a material point. A material point therefore consists of any number of material particles [Massenteilchen] connected with each other. This number is always to be infinitely great: this we attain by supposing the material particles [Massenteilchen] to be of a higher order of infinitesimals than those material points which are regarded as being of infinitely small mass. 
The masses of material points, and in especial the masses of infinitely small points, may therefore bear to one another any rational or irrational value. (Hertz 1894, §5)
Thus, in the language of vector functions, a material point is a collection (I do not call it a set because the functions could be identical) of vector-valued functions having values infinitely close together at any time. Hertz wanted to have material points with finite mass as well as with infinitely small mass. In order to ensure that even the latter could have masses that range through a continuum of values, he assumed that the Massenteilchen have masses that are infinitely small of the second (or higher) order. That means that the reference space must contain an infinity of the second order of Massenteilchen. Thus, we have the following image of matter. Matter is built up from Massenteilchen that have masses that are infinitely small of the second order. An infinity of the first order of Massenteilchen makes up mass points with infinitely small masses of the first order. Hertz called them infinitely small material points. A second-order infinity of Massenteilchen will make up a material point of finite mass, that is, a finite material point. Now, Hertz’s book did not even deal with separate material points, but with systems of points. It is the basic philosophy behind his geometry of systems of points that one can just as well start immediately with systems rather than with single points. It is
mathematically just as easy, and it is the only thing that corresponds to anything in the sensible world. A number of material points considered simultaneously is called a system of material points, or briefly a system. The sum of the masses of the separate points is, by §4, the mass of the system. Hence a finite system consists of a finite number of finite material points or of an infinite number of infinitely small material points, or of both. It is always permissible to regard a system as being composed of an infinite number of material particles [Massenteilchen]. (Hertz 1894, §6)
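The bookkeeping of orders of infinitesimals behind §§4–6 can be summarized schematically. This is a modern gloss rather than Hertz’s notation: the fixed infinitesimal $\varepsilon$ and the finite positive numbers $a$ and $b$ are assumptions introduced here, and making such expressions rigorous would require something like non-standard analysis.

```latex
% mass of a single Massenteilchen: infinitely small of the second order
\mu = \varepsilon^{2}
% infinitely small material point: a first-order infinity of Massenteilchen
m_{\text{small}} = \frac{a}{\varepsilon}\,\mu = a\,\varepsilon
% finite material point: a second-order infinity of Massenteilchen
m_{\text{finite}} = \frac{b}{\varepsilon^{2}}\,\mu = b
```

Since $a$ and $b$ may be any positive real numbers, both kinds of material point can take a continuum of mass values, which is what the second-order assumption is meant to secure.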
However, Hertz’s Mechanics only deals with systems consisting of finitely many finite material points. Why then did he bother to introduce the infinitely small mass points, which led to the complication of introducing infinitesimals of the second order? Probably, Hertz wanted to be able in principle to deal with fluid mechanics, in which a fluid is assumed to consist of a continuum of infinitesimally small parts. This would be important both for dealing with ordinary fluids, as well as with the ether. As Hertz explained in the Kiel Lectures it was not clear whether one should consider the ether as a continuous fluid or as a discrete structure (Hertz 1999, p. 52). Therefore, if Hertz’s mechanics should be able to serve as the foundation for a theory of the ether, infinite systems of infinitely small material points could not be excluded. In §7 Hertz even outlined how one could arrive at a mechanics of such systems from the mechanics of finite systems treated in the book: In what follows we shall always treat a finite system as consisting of a finite number of finite material points. But as we assign no upper limit to their number, and no lower limit to their mass, our general statements will also include as special case that in which the system contains an infinite number of infinitely small material points. We need not enter into the details required for the analytical treatment of this case. (Hertz 1894, §7)
The analytical details would include the problem of passing from sums to integrals. In particular, they should clarify the domains one should integrate over, and introduce continuity requirements by which one could define the notion of a perfect fluid. These problems, which are less trivial than Hertz’s remark might suggest, were taken up by Paul Ehrenfest (Ehrenfest 1904). Hertz’s pure image of matter in the Mechanics is a consciously minimalistic one. It does not commit itself regarding the many intriguing questions that he discussed in his Kiel Lectures, and that divided the physics community at the time. It did not even address these questions. Only the indestructibility of tactile matter is assumed in the definition. In particular, Hertz’s image did not have any bearing on the question of the atomic structure of matter. Both in his Kiel Lectures and in the introduction to the Principles of Mechanics he had expressed his conviction that matter is somehow made up from atoms, but he had also maintained that until much more is known about these atoms, one cannot build a foundational theory of mechanics (and thereby of physics) on any particular hypothesis about their properties. His image of Massenteilchen does not imply that matter can physically be divided into infinitely many infinitesimal parts, it only maintains that we can imagine these parts of matter. Of course, from a modern
perspective of quantum theory, even this proposition is problematic, but at the time it was an unproblematic assumption that would stand whichever properties one would subsequently attribute to atoms.
11.3.2 The constitution of matter. Book two
We have already appealed to the following coordinative rule that translates outer experience into the language of our pure image of book one: Rule 3. The mass of bodies that we can handle is determined by weighing. The unit of mass is the mass of some body settled by arbitrary convention. (Hertz 1894, §300)
We have also quoted the continuation of this quote to the effect that this rule translates in a correct way. In this section I shall discuss two other related points: Why did Hertz choose to measure mass with direct reference to gravity, and why did he restrict his rule to the bodies we can handle? In order to understand the restriction of the rule we need to consider the introduction of concealed masses in the subsequent section: Addition to Rule 3. We admit the presumption that in addition to the bodies which we can handle there are other bodies which we can neither handle, move, nor place in the balance, and to which Rule 3 has no application. The mass of such bodies can only be determined by hypothesis. In such hypothesis we are at liberty to endow these masses only with those properties which are consistent with the properties of the ideally [begrifflich] defined mass. (Hertz 1894, §301)
As Hertz argued in the introduction to the Mechanics we cannot account for the motion of the tangible matter without assuming that there exist concealed confederates (heimliche Mitspieler). It is the characteristic feature of Hertz’s image that this hidden something is assumed to be nothing but masses that are just like the tangible masses, with the only exception that we cannot sense them in the same way that we can sense the tangible matter. We can only infer their properties from the way they act on the tangible matter, through connections. There is therefore no difference between tangible and concealed matter except in the way they are connected to our human sensory apparatus. It is not a difference in kind, only a difference in the relation to us. In this connection it is interesting to note which name Hertz used for the concealed confederates. In the definition he used the word ‘nicht greifbaren Körper’ translated as non-tangible or non-handleable bodies. In the introduction of the book and in the technical discussion of conservative systems, he used the term ‘Verborgene Massen’ or concealed masses as opposed to ‘sichtbare’ or visible masses. It is notable that he did not use the word non-ponderable matter that he had used freely in his Kiel Lectures, to denote the matter constituting the ether. This reflects the fact that Hertz in the Mechanics totally avoided the question discussed in the lectures of whether the concealed matter has gravitational mass (Hertz 1999, p. 108). This ontological problem is turned into an epistemological one. Instead of saying that concealed matter has no gravitational mass and therefore cannot be measured on a balance, Hertz simply
postulated that we cannot place the concealed mass on the scale. Again the difference between the tangible and the concealed matter is reduced to our ability to handle it, i.e. to our relation to it. It is not a basic difference in nature. But although the difference is not basic in this sense, it is nonetheless a constant difference, in the sense that tangible matter cannot be converted into concealed matter and vice versa. At least, as I argued above, this seems to be the most natural conclusion one can draw from the indestructibility of Massenteilchen as stated in §3 combined with the claim in §300 that also tangible matter alone satisfies this conservation law. Now we can address the question of why Hertz in his coordinating rule 3 chose to state that mass is determined by weighing. In his Kiel Lectures he discussed the difference between inertial mass and gravitational mass at some length. He came to the conclusion that inertial mass has some a-priori aspects to it, in the sense that it would be impossible to imagine matter without inertial mass. It also has even stronger empirical aspects to it, for example the fact that matter without inertial mass would contradict our laws of motion, and these in turn are of an empirical origin. Gravitational mass, however, is, according to Hertz, an entirely empirical property of matter. Indeed, we can easily imagine matter without gravitational mass; we mostly think of the ether in this way. And yet we experience that all tangible mass [2] possesses gravitational mass, and this mass is even equal to (proportional to) the inertial mass irrespective of the material it is made up of. Hertz characterized this empirical fact as a ‘wonderful puzzle.’ Hertz was therefore well aware of the fundamental difference between the two types of mass, and was of the opinion that the inertial mass is a more fundamental characteristic of matter than gravitational mass. 
His definition of mass by way of Massenteilchen is clearly an attempt to give an image of inertial mass, so it may seem odd that he chose to measure mass by the specific gravitational effects rather than by the more fundamental inertial properties of matter. It may be that he followed Mach who had claimed that weighing, in fact, determines inertial mass (see Chapter 2). But this would require that Hertz accepted the constant gravitational acceleration as an empirical fact, and it also raises the new question of why he chose to determine inertial mass in this particular way rather than using Mach’s more general definition. One reason for the choice of the scale that he himself alludes to in §303 is the operational one that the best scale provides the most accurate means of measuring mass. This argument is similar to the arguments for the coordinative rules 1 and 2 regarding the measurement of times and spaces. However, I think there may be a more fundamental reason why Hertz chose the gravitational coordinative rule rather than a general inertial one. Indeed Mach’s general definition of inertial mass raises many problems. In order to see that let us recall how Mach had suggested that one determines inertial mass in the ordinary image of mechanics. One takes an isolated system consisting of two bodies that act on each other by a force. Then, the inertial masses are inversely proportional to the accelerations that the force imparts to each body. Now this is already problematic in the ordinary image of mechanics. It is not clear how one can isolate two bodies from the rest of the world, and it is a deep truth of nature that any force acting between the two bodies gives the same ratio. Why do different forces not give rise to different inertial masses? These problems remain in Hertz’s image, and several other problems arise. First, the law of action and reaction that is implicitly buried in Mach’s definition is not a universal law in Hertz’s mechanics (see Chapter 19). Secondly, forces are complicated things rather than basic notions. In order for the two bodies to act on each other with a force, they should both be coupled to a concealed system of masses. This makes it even more inexplicable that the same ratio between the masses of the tactile bodies should come out each time. Indeed it would be rather surprising if the mass of the concealed system never affected the measurement. In fact, since the inertial mass of the hidden masses behaves just like the tangible mass, except in its relation to our sensory apparatus, the mass of the system that produces the force, and also any other concealed mass that would be connected to the tangible system, would have to be taken into account. Thus, an empirical determination of inertial mass of a tangible object would, in Hertz’s image, not determine the true inertial mass but at best some effective or apparent mass that would involve the concealed systems attached to it. Such an idea of apparent mass would not be foreign to Hertz. In his paper on electromagnetism of moving bodies, he had himself assumed that material bodies drag the ether along when they move through it. Therefore one might assume that the amount of ether dragged along by the tangible matter would increase its apparent inertial mass.
[Footnote 2] Noting that the statement ‘ponderable matter has gravitational mass’ is a tautology, he used the word greifbare Materie at this place already in the 1884 lectures.
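Mach’s prescription, and the complication that concealed masses introduce, can each be put in one line. The notation is introduced here only for illustration: $m_1, m_2$ are the tangible masses, $a_1, a_2$ the magnitudes of their accelerations, and $\mu_1, \mu_2$ hypothetical concealed masses. For simplicity the sketch assumes that each concealed mass is rigidly carried along by its body and that action and reaction hold as in the ordinary image of mechanics; neither assumption is made in the text.

```latex
% Mach: in an isolated two-body system the mutual force gives m_1 a_1 = m_2 a_2, hence
\frac{m_1}{m_2} = \frac{a_2}{a_1}
% with concealed masses \mu_1, \mu_2 moving rigidly with the bodies, the same measurement yields
\frac{a_2}{a_1} = \frac{m_1 + \mu_1}{m_2 + \mu_2}
```

Under these assumptions the measured ratio fixes only the apparent masses $m_i + \mu_i$, and it would change whenever the attached concealed masses did.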
A similar non-mechanical effect had also been observed in electromagnetic theory: when an electrically charged body changes its motion, self-induction will try to counteract the change, just as if the body had a higher inertial mass than it really has. Later, Wien even suggested that all inertial mass may, in fact, be apparent mass due to such electromagnetic effects. This was the high point in the electromagnetic world view that was suggested as a replacement for the mechanical one that Hertz subscribed to. So, if Hertz had chosen a coordinative rule along the lines suggested above for measuring inertial mass he would only have been able to measure apparent mass, or different apparent masses depending on the non-controllable circumstances. It would have been very difficult to give a rule for how to determine the true inertial mass from these apparent masses. One might argue that since we can only measure apparent mass, this is the only physically meaningful one, and therefore all talk of the real inertial mass should be avoided. Such a radical solution, however, would have upset the entire image Hertz tried to establish. In particular, it would have created a slippery connection between the a-priori notions in book one and the empirical application in book two. Hertz cut the Gordian knot by deciding that one measures the mass of tangible bodies by weighing.³ It is, of course, true that all the considerations above concerning the concealed systems apply equally well to the system responsible for gravitation, and of course, even when we weigh tangible matter it is probably still connected not only to the system producing gravity but to many other systems. However, by postulating that we cannot place these concealed masses in the scale, Hertz avoided all these problems. Of course, he has not given an explanation for the postulate. Indeed it may almost be a defining property of concealed masses that we cannot handle them at will, but it is not at all clear how we can avoid putting some of the concealed mass into the scale. Nor is it explained how the concealed system responsible for gravity has the effect of measuring the mass defined in book 1. This is simply an empirical fact but nevertheless a wonderful puzzle.
³ Hertz’s coordinating rule was criticized by Jouguet (Jouguet 1909, pp. 279–280).

In his Kiel Lectures Hertz had insisted that this equality between inertial and gravitational mass ‘need an explanation. We may assume that even an easy and comprehensible explanation is possible, and that this explanation will provide us with a deep insight into the constitution of matter’ (Hertz 1999, p. 122). Hertz mentioned that one argument had been given for this relation. The idea is that matter is actually composed of only one type of very small identical parts, and that the different materials including their elements are only differing forms of this one type of matter. Such an idea had been advanced by Newton, Euler, Lazare Carnot and Saint-Venant (Jouguet 1909, notes 7 and 48). It would explain that the relation between inertial and gravitational mass is the same in all materials. It also gives an easy explanation of the proportionality between inertial and gravitational mass: Indeed, if the mass of a body is proportional to the number of its particles, if all particles attract each other in the same way, and if the gravitational force between two bodies is the sum of all the forces between the constituting particles, it is obvious that gravitation must be proportional to mass. Such an argument had been advanced by Euler. Hertz’s image of matter in the Mechanics fits nicely into this tradition.
All matter is built up from one unique type of identical Massenteilchen. Is it possible that Hertz introduced the Massenteilchen for the same reason that Euler and his followers had done, namely to account for the equality of inertial and gravitational mass? Bearing in mind what he wrote in his Kiel lecture notes it is very likely that he was aware of this aspect of his image, and at the same time it seems unlikely that he attributed much importance to it. In fact, in 1884 Hertz wrote that the one-matter theory ‘is hardly supported by anything except that which we just intend to explain by it [i.e. that the ratio between inertial mass and gravitational mass is independent of the material]. Therefore we should not take it as a sufficient and true explanation; we shall rather leave the question open’ (Hertz 1999, p. 122). Moreover, the above argument for the proportionality between inertial and gravitational mass was never mentioned by Hertz, neither in his Kiel Lectures nor in his Mechanics. In fact, as stated above the argument makes no sense, because force in general, and gravity in particular, is not a fundamental concept, but rather the result of connections with a hidden system. It would perhaps be possible to transfer the spirit of the argument to Hertz’s image if one imagined that the connections between tangible matter and the hidden system producing gravity operated on the level of Massenteilchen. However, it is conspicuous that in the Mechanics Hertz only allowed connections between material points. Thus, he missed his chance of using the argument. If the argument occurred to Hertz at all, he did not value it highly.
Why then did Hertz in his Prinzipien der Mechanik subscribe to a one-matter theory that he had more or less rejected ten years earlier? One reason for his change of view might be found in the different aims of the 1884 lectures and the Mechanics. In the 1884 lectures Hertz wanted to get to grips with the atomic structure of matter. In the Mechanics he wanted to build the simplest image of mechanics that would allow such an explanation. For such an image there would be no reason to differentiate between different types of matter (oxygen matter, helium matter, ...). Any such differentiation would just complicate the image, and since we look for the simplest image we only assume the existence of one type of matter. Such an argument seems to lie behind the following remark in Hertz’s second draft of the book:

All Massentheilchen are equal as far as our considerations are concerned. [Hertz, Ms 12, p. 1]⁴

Here, as in 1884, Hertz assumed that there might be different kinds of matter, but he postulated that the differences do not concern the considerations of the book, i.e. mechanics. However, it is conspicuous that in the third draft and in the published book Hertz omitted any such remark that would allow Massenteilchen to be different in some non-mechanical respects. This omission is a logical consequence of the strong mechanistic program expressed by Hertz at the beginning of the book. Indeed, if the behavior of the entire physical world is reducible to mechanics, then all properties of matter, be they physical or chemical, must, a fortiori, have a mechanical origin. In a mechanical world view, there is no room for properties that are not concerned with mechanics. In his Kiel Lectures Hertz did not express any strong commitment to a mechanistic world view. On the contrary, he expressed the view that understanding light as transversal mechanical waves in the ether is difficult or even impossible, whereas understanding it as an electromagnetic wave is conceivable and fits all observations. He did not in 1884 suggest that electromagnetic waves should be further reduced to mechanical properties of the ether. Therefore, in 1884 it was not inconceivable to Hertz that there might exist several types of matter. With his commitment to a mechanistic world view in the Mechanics, however, Hertz would be naturally led to a one-matter theory.
⁴ Alle Massentheilchen sind in Hinsicht unserer Betrachtungen gleich.
12 The line element: The origin of the Massenteilchen
The argument at the end of the previous chapter may explain why Hertz did not introduce different types of matter in his image of matter, one for each of the chemical elements. However, it does not explain why he chose to build his material particles from infinitely small Massenteilchen, rather than simply taking the material points themselves as the fundamental building blocks. Such an approach would have been entirely possible. Hertz could simply have defined a material point as a pair consisting of its mass (given as a real number and an arbitrarily chosen unit mass) and a function of time indicating the position of the material point. Of course, with this definition, mass would be an intrinsic property of the material point and not something one can obtain by counting, but the image would, in a sense, have gained simplicity because one would not have to introduce the Massenteilchen, which in fact operate almost as an idle wheel in Hertz’s image. So Hertz’s introduction of Massenteilchen does not seem to be dictated by his requirement of simplicity (quite the reverse), nor is it a necessary part of his image of mechanics. Why then were Massenteilchen introduced? In accordance with the reflections in Section 11.3 I shall approach this question by looking at how this concept is put to use in the rest of the axiomatic synthetic structure of Hertz’s Mechanics. The fact is that Massenteilchen only enter into the arguments in two places of the book: in the definition of mass of a material point, as I have explained in the previous chapter, and in the derivation of the line element in configuration space, which is the basic concept of Hertz’s geometric description of mechanics. I shall argue that it is the last-mentioned derivation that will provide us with a key to understanding Hertz’s choice of his image of matter, so let us now turn to this important feature of Hertz’s image.
12.1 Hertz’s line element

From the very start Hertz considered systems of material points rather than one material point. Let such a system consist of n material points, and let for convenience $m_{3\mu-2} = m_{3\mu-1} = m_{3\mu}$ denote the mass of the µ-th material point with Cartesian coordinates $(x_{3\mu-2}, x_{3\mu-1}, x_{3\mu})$. Hertz represented the configuration of this system by the coordinates $x_1, x_2, x_3, \dots, x_{3n}$. If the system is displaced from an initial configuration with the above-mentioned coordinates to a final configuration with the coordinates $x_1', x_2', \dots, x_{3n}'$, the distance between the two configurations is, according to Hertz, given by the equation:

$$m s^2 = \sum_{\mu=1}^{3n} m_\mu (x_\mu' - x_\mu)^2, \tag{12.1}$$

where $m = \frac{1}{3}\sum_{\mu=1}^{3n} m_\mu$ is the total mass of the system. If therefore the system undergoes an infinitesimal displacement from $(x_1, x_2, \dots, x_{3n})$ to $(x_1 + dx_1, x_2 + dx_2, \dots, x_{3n} + dx_{3n})$ the length ds of this infinitely small displacement will be given by

$$m\, ds^2 = \sum_{\mu=1}^{3n} m_\mu\, dx_\mu^2. \tag{12.2}$$

This is Hertz’s fundamental line element (Hertz 1894, §31). As usual in mechanics, Hertz also considered generalized coordinates $q_1, \dots, q_r$, i.e. geometrically independent parameters that completely determine the configuration of the system (Hertz 1894, §12, 13), and showed (Hertz 1894, §57) that the line element could be expressed as a positive-definite quadratic form

$$ds^2 = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\rho\, dq_\sigma \tag{12.3}$$

in the differentials of the generalized coordinates. Here, $a_{\rho\sigma}$ is defined by

$$m\, a_{\rho\sigma} = \sum_{\nu=1}^{3n} m_\nu\, \alpha_{\nu\rho}\, \alpha_{\nu\sigma}, \tag{12.4}$$

where¹

$$\alpha_{\nu\rho} = \frac{\partial x_\nu}{\partial q_\rho}. \tag{12.5}$$
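As a hedged illustration of eqns (12.4) and (12.5), the following minimal Python sketch computes the coefficients $a_{\rho\sigma}$ for a system of a single material point described in cylindrical coordinates. The function names, the choice of coordinates and the numerical values are assumptions made for this example only; they do not come from Hertz or from the manuscripts.

```python
import math


def metric_coefficients(masses, jacobian):
    """Return a[rho][sigma] as in eqn (12.4).

    masses   : the 3n coordinate masses m_nu (each point's mass listed three times)
    jacobian : 3n rows, r columns, alpha[nu][rho] = dx_nu/dq_rho as in eqn (12.5)
    """
    total_mass = sum(masses) / 3.0  # m = (1/3) * sum of the m_nu
    r = len(jacobian[0])
    return [
        [
            sum(m_nu * row[rho] * row[sigma] for m_nu, row in zip(masses, jacobian))
            / total_mass
            for sigma in range(r)
        ]
        for rho in range(r)
    ]


def cylindrical_jacobian(radius, phi):
    """alpha for one point with x = radius*cos(phi), y = radius*sin(phi), z = z."""
    return [
        [math.cos(phi), -radius * math.sin(phi), 0.0],  # dx/d(radius, phi, z)
        [math.sin(phi), radius * math.cos(phi), 0.0],   # dy/d(radius, phi, z)
        [0.0, 0.0, 1.0],                                # dz/d(radius, phi, z)
    ]


# Hypothetical values: one point of mass 2.5 at radius 2.0 and angle 0.7.
a = metric_coefficients([2.5, 2.5, 2.5], cylindrical_jacobian(2.0, 0.7))
```

Up to rounding, the coefficients form the diagonal matrix with entries 1, radius² and 1, which is the familiar cylindrical line element. For a single point the mass cancels, so the weighting only matters for systems of several points with different masses.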
In modern terminology we could say that Hertz introduced this line element ds as a metric in the configuration space, making it thereby into a Riemannian manifold. To be more specific and less anachronistic, Hertz defined various geometric notions such as angle, curvature, and straightest path, which in the end allowed him to formulate his one and only dynamical fundamental law (axiom). I shall return to this geometry of systems of points in the next chapter. Here, it suffices to note that Hertz did not postulate eqns (12.1), (12.2) or (12.3), but deduced them from two other postulates, namely:

1°. The magnitude of the displacement of one point is the usual Euclidean distance s given by

$$s^2 = \sum_{\nu=1}^{3} (x_\nu' - x_\nu)^2 \tag{12.6}$$

(Hertz 1894, §23).

2°. ‘The magnitude of the displacement of a system is the quadratic mean value of the magnitudes of the displacements of all its particles’ (i.e. Massenteilchen) (Hertz 1894, §29). Thus, the displacement s of a system is given by the equation

$$A s^2 = \sum_{i} (x_i' - x_i)^2, \tag{12.7}$$

where $x_i'$ and $x_i$ are the final and initial coordinates of all the infinitely many Massenteilchen that build up the finitely many material points of the system, and where A denotes the number of these Massenteilchen. From this infinite sum it is easy to derive the expression (12.1) of the displacement (Hertz 1894, §31). Indeed, let η denote the number of Massenteilchen in a unit mass. Then $m_\mu \eta$ denotes the number of Massenteilchen in the µ-th material point of the system, and the total number A of Massenteilchen in the system is $A = m\eta$. If we collect all the Massenteilchen belonging to each material point in the system, eqn (12.7) will therefore become:

$$m \eta s^2 = \sum_{\mu=1}^{3n} m_\mu \eta\, (x_\mu' - x_\mu)^2, \tag{12.8}$$

which implies eqn (12.1) and further eqn (12.2). We see that the introduction of the Massenteilchen allowed Hertz to deduce the weighted sums (12.1) and (12.2), in which the contribution to the distance s from each material point depends on the mass of the material point: the larger masses contributing most to the magnitude of the displacement. The Massenteilchen worked for Hertz as an active agent to produce the ‘correct’ line element (12.2). Was it this action of the Massenteilchen that led Hertz to introduce them? In an earlier publication (Lützen 1998) I conjectured that this might be the case, and now I can show it beyond doubt on the basis of Hertz’s early sketches and drafts of the Principles of Mechanics.²
¹ The reader who consults Hertz’s book should be warned that Hertz, following Helmholtz, consistently used $p_\rho$ to denote generalized coordinates, and $q_\rho$ to denote the conjugate generalized momenta. I have used the opposite convention in agreement with Hamilton’s and modern usage. Moreover, in both the German original and the English translation $\alpha_{\nu\rho}$ is virtually indistinguishable from $a_{\nu\rho}$, which may cause some confusion.
² For a survey of the manuscripts see the Appendix.
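The step from the quadratic mean over Massenteilchen, eqn (12.7), to the mass-weighted sum, eqn (12.1), can be checked on a small finite example. The Python sketch below is only an illustration under stated assumptions: the three material points, their masses and the value of η are hypothetical, a finite η stands in for Hertz’s infinitely many Massenteilchen, and each point is given by its three coordinates at once rather than through the 3n-coordinate indexing of the text.

```python
from fractions import Fraction

# Hypothetical system: (mass, initial position, final position) per material point.
POINTS = [
    (Fraction(2), (0, 0, 0), (1, 0, 0)),
    (Fraction(1, 2), (1, 1, 0), (1, 3, 2)),
    (Fraction(3), (0, 2, 1), (0, 2, 1)),
]
ETA = 4  # assumed number of Massenteilchen in a unit mass


def sq_displacement(start, end):
    """Squared Euclidean displacement of one point, as in eqn (12.6)."""
    return sum((e - s) ** 2 for s, e in zip(start, end))


def mean_over_massenteilchen(points, eta):
    """s^2 from eqn (12.7): list every Massenteilchen and average the squares."""
    squares = []
    for mass, start, end in points:
        count = mass * eta  # number of Massenteilchen in this material point
        assert count.denominator == 1, "eta must give whole particle counts"
        squares.extend([sq_displacement(start, end)] * int(count))
    return Fraction(sum(squares), len(squares))


def weighted_formula(points):
    """s^2 from eqn (12.1): mass-weighted sum divided by the total mass m."""
    total_mass = sum(mass for mass, _, _ in points)
    return sum(mass * sq_displacement(s, e) for mass, s, e in points) / total_mass


s2_counting = mean_over_massenteilchen(POINTS, ETA)
s2_weighted = weighted_formula(POINTS)
```

Both routes give the same value, 12/11 for these numbers, and the result does not depend on η as long as every particle count is a whole number. Exact fractions are used so that the agreement is not blurred by rounding.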
12.2 Vanishing identical material points rejected

The earliest sketchy notes on mechanics that exist in Hertz’s hand deal exactly with the introduction of the basic notions of the geometry of systems of points. Together with the first long draft of the book they shed a very interesting light on the creative process Hertz went through to define his line element and his image of matter. The very first page of the second of these notes is devoted to ‘Displacements of a System’.³ It begins with the definition of the configuration (or position) of a system and the distance between two configurations of a system:

1. The totality of the positions which all the material points occupy at the same time is called the configuration of the system.
2. The distance between two configurations of the system is the quadratic mean value of the distances of all its points. If $x_1, x_2, \dots, x_{3n}$ are the 3n rectangular coordinates of one configuration and $x_1', x_2', \dots, x_{3n}'$ are the rectangular coordinates of the second configuration then the distance $s_{12}$ between the two configurations is [determined by]

$$s_{12}^2 = \sum_{r=1}^{3n} (x_r' - x_r)^2. \tag{12.9}$$

(Ms 2, p. 1)⁴

We recognize the use of the quadratic mean value from the published definition. However, here in the early manuscript Hertz took the quadratic mean value over the n points of the system rather than over all the Massenteilchen. The resulting sum is an unweighted sum, and the masses of the points do not enter. In fact, Hertz explicitly added a comment to this effect:

Comment 1. The distance of two configurations [is] independent of the assumed size of the points [in the system]. Further division of a point has no influence. Holds for all the following. (Ms 2, p. 1)⁵
The last part of the note is rather mysterious. Indeed, if ‘point’ means the same as in the rest of this manuscript, the statement is simply false. Indeed, if a point of the system is divided in two it will contribute twice as much to the expression (12.9), and therefore alter it. This argument is similar to one of Galilei’s arguments (Galilei 1638) against the Aristotelian description of free fall. The placing of the comment in the manuscript suggests that it was written later than the main text, so it may express a problem Hertz had subsequently spotted in the definition, and which he wanted to change here and in the following.⁶
³ ‘Verschiebungen eines Systems’ (Hertz Ms 2, p. 1)
⁴ ‘1. Die Gesamtheit der gleichzeitigen Lagen der einzelnen mater. Punkte heisst die Lage des Systems. 2. Entfernung zweier Lagen des Systems nennen wir das quadratische Mittel der Entfernungen der einzelnen Punkte. Sind $x_1, x_2, \dots, x_{3n}$ die 3n rechtw. Coord. der einen Lage, $x_1', x_2', \dots, x_{3n}'$ . . . anderen Lage, so ist die Entfernung $s_{12}$ beider Lagen $s_{12}^2 = \sum_{r=1}^{3n} (x_r' - x_r)^2$.’ (Ms 2, p. 1). Note that the number (12.9) of the equation is my addition.
⁵ ‘Anmerkung 1. Die Entfernung zweier Lagen unabhängig von der angenommenen Grösse der einzelnen Punkte. Weitere Theilung eines Punktes hat keinen Einfluss. Gilt für alles folgende.’ (Ms 2, p. 1)

In the remainder of this and the other first six notes (Ms 1–6), Hertz tried in various ways to introduce geometric concepts such as orthogonality, angle, components and curvature of a path, and he began to introduce the concept of time and kinematic concepts such as velocity and momentum. All these investigations were based on the unweighted expression (12.9) for the distance, and the corresponding line element

$$ds^2 = \sum_{r=1}^{3n} dx_r^2 \tag{12.10}$$

as well as their expression in other generalized systems of coordinates. Only one note, whose place in the series is uncertain, contains an $m_r$ and an $M$ that seem to stand for masses. However, in a short manuscript (Ms 7) in which Hertz introduced connections between the material points of his system he also introduced the concepts of mass and matter for the first time. This manuscript begins:

Material Point. An identical quantity of matter of [vanishingly small mass and] vanishingly small volume considered in the immediate neighborhood of a certain geometric point is called a material point. [We consider all material points as equal, their number in a finite {quantity of matter} mass is proportional to this mass. Each material point is invariable and indestructible. The only change they can undergo is a change of place]. (Ms 7, p. 1)⁷
Here, Hertz has already introduced a concept of vanishingly small identical building blocks of matter. They are called material points, but should not be confused with the material points of the book. Indeed, they all have the same mass, contrary to the material points of the book, and should therefore rather be compared to the Massenteilchen of the book. It is possible that the previous notes, and the quote from Ms 2 in particular, operate with this type of vanishingly small identical material points rather than the finite material points of the book. That would explain the lack of any mention of the mass of the points in the formula (12.9) for the distance. However, it would make the sum in formula (12.9) an infinite sum (or very large sum, according to whether ‘vanishing’ means ‘infinitely small’ or ‘very small’). It would also make Comment 1 even harder to understand, since the material points in this reading are not divisible. Again this may indicate that Comment 1 was added at a later time, when Hertz had changed his idea about material points. In fact, Hertz soon changed his mind concerning the definition of a material point. He crossed out the part of the above quote placed in brackets and replaced it in the margin with the following:

A material point is entirely characterized by its mass and its place in space. (Ms 7, p. 1)⁸

⁶ It is also possible that my reading of this comment is incorrect and that it reads: ‘Further division of a point has an influence’ (‘einen Einfluss’ instead of ‘keinen Einfluss’). The reading is not entirely clear.
⁷ ‘Eine identische Menge von Materie von [verschwindend kleiner Masse und] verschwindend kleinem Volumen vorgestellt in unmittelb. Nachbarschaft eines bestimmten geometrischen Punktes heisst ein materieller Punkt. [Alle materiellen Punkte betrachten wir als gleich, die Anzahl derselben in einer endlichen {Menge der Materie} Masse ist proportional dieser Masse. Jeder materielle Punkt ist unveränderlich und unzerstörbar: Die einzige Veränderung welche sie erleiden kann, ist eine Veränderung der Orte].’ (Ms 7, p. 1). The words in { } were crossed out and replaced by ‘mass.’ This change seems to have been made while writing the remainder of the sentence.
Thus, Hertz at this point rejected his first idea of building up mass from vanishingly small identical material points, and instead changed the definition of material point to mean a finite mass (or at least a mass that need not be vanishingly small) with a vanishing volume. This concept is the same as the concept of a material point as defined in the book, except that in the marginal note of the manuscript, Hertz has rejected the attempt to build the material points out of smaller identical building blocks. Instead, he simply characterized a material point by its mass and its place in space without any further explanation of the concept of mass. This definition is similar to the one I suggested above that could have saved Hertz the introduction of the Massenteilchen.
12.3 Massenteilchen appear

In the beginning of the first extended draft (Ms 9) of the book, Hertz first repeated the new definition of a material point:

§1. Material Point and Material System 1. Configurations A. Point Definition. A material point is an identical quantity of matter of a vanishingly small volume, considered in the immediate neighborhood of a certain geometric point. A material point is entirely characterized by the specification of its mass and its place in space. (Ms 9, p. 1)⁹

However, following this definition Hertz inserted two lines in a smaller script:

A material point of vanishing mass we will call a Massentheilchen [sic!]. We will consider a material point of finite mass to be composed of Massentheilchen. (Ms 9, p. 2)¹⁰

⁸ ‘Ein materieller Punkt ist vollständig charakterisiert durch seine Masse und seinen Ort im Raum.’ (Ms 7, p. 1)
⁹ ‘§1. Materieller Punkt und Materielles System. 1. Lagen A Punkt Def. Ein materieller Punkt ist eine identische Menge von Materie von verschwindend kleinem Volum, vorgestellt in unmittelbarer Nachbarschaft eines bestimmten geometrischen Punktes. Ein materieller Punkt ist vollständig charakterisiert durch die Angabe seiner Masse und seines Orts im Raume.’ (Ms 9, p. 1).
¹⁰ ‘Einen materiellen Punkt verschwindender Masse nennen wir ein Massentheilchen [sic!] Einen materiellen Punkt von endlicher Masse betrachten wir als zusammengesetzt aus Massentheilchen.’ (Ms 9, p. 1).
Fig. 12.1. Hertz’s first introduction of Massenteilchen (Ms 9, p. 1).
So here Hertz reintroduced vanishingly small building blocks of matter, but now he gave them a new name: Massenteilchen. It is not clear from the definition that they are supposed to have equal mass, but this becomes clear in the subsequent derivation of the line element (Hertz, Ms 9, p. 3). There it also becomes clear that the mass of a material point is the number of Massenteilchen of which it is composed divided by the number of Massenteilchen in a unit mass. What made Hertz reintroduce the vanishingly small building blocks of matter that he had previously discarded? To answer this question it is important to note that the composition of the manuscript (see Fig. 12.1) suggests that the two lines containing the definition of Massenteilchen were intercalated at a later time. When? If we can answer this question we may be able to see why Hertz introduced the Massenteilchen. Let us therefore try to look for the next place where Massenteilchen occur. It happens on the bottom of page 2 after Hertz has introduced systems of material points, and has defined finite displacements of one material point (mentioning even the possibility that the material point can ‘live in’ $R^n$). He then introduced displacements of a system of material points, and defined ‘quadratic mean value’ as follows:

Auxiliary Definition. The quadratic mean value of a series of quantities is the positive square root of the arithmetic mean of the squares of the considered quantities. (Ms 9, p. 2)¹¹

He then proceeded to the crucial definition of the length of a displacement:

Definition: The magnitudes of the displacement of a system of material points will be measured [by the quadratic mean value of the displacements of the masses of the system] by the quadratic mean value of the displacements of all of the Massentheilchen of the system. (Ms 9, p. 2)¹²

¹¹ ‘Hilfsdefinition. Quadratischer Mittelwerth einer Reihe von Grössen heisst die positive Quadratwurzel aus dem arithmetischen Mittel der Quadrate der betreffenden Grössen.’ (Ms 9, p. 2).
¹² ‘Definition. Die Grösse der Verrückung eines Systems materieller Punkte wird gemessen [durch die mittlere quadratische Verrückung der Masse des Systems] durch den quadratischen Mittelwert der Verrückungen sämtlicher Massentheilchen des Systems.’ (Hertz, Ms 9, p. 2).
Fig. 12.2. Hertz’s first use of Massenteilchen in the definition of the magnitude of a displacement (Ms 9, p. 2).
In the quote above, the phrase in the bracket has been crossed out, apparently just after it was written, and replaced by the new formulation in terms of the Massenteilchen (see Fig. 12.2). Here we have caught Hertz at an important moment of transition. Until this moment his line element had always been determined by the unweighted sum (12.9) or (12.10). With the new definition he changed to the weighted sum (12.1) or (12.2) of the published book. I am not quite sure of what to make of the deleted passage. It reads conspicuously like the earlier definitions (see the first quote of this section), so it may be that he had at first intended to introduce an unweighted line element such as eqn (12.9) or eqn (12.10). However, in the earlier definition he took the quadratic mean value of the displacements of the points of the system, whereas in the deleted passage, he took the quadratic mean value of the displacements of the masses of the system. This may indicate that already when writing the deleted passage he knew he was headed for the weighted sum (12.1) or (12.2). In this case, however, he must have realized that his own auxiliary definition of ‘quadratic mean value’ did not convey a clear meaning to the expression ‘quadratic mean value of the displacements of the masses.’ In any case, at some point (before or after formulating the deleted passage) Hertz discovered that it was preferable to work with the weighted expression of distance (12.1), and the weighted line element (12.2), rather than the unweighted expressions (12.9) or (12.10). Moreover, he realized that in order to obtain those expressions in a way similar to what he had previously done, i.e. by taking a quadratic mean value, he had to divide each mass into equal Massenteilchen so that the mass of a particle was proportional to the number of its Massenteilchen. 
At this point he seems to have gone back to the beginning of the manuscript to add the two lines in small script, containing the definition of the Massenteilchen. Admittedly, the above course of events is a reconstruction. However, it fits the facts so well that it is hard to escape the conclusion that Hertz introduced his concept of a Massenteilchen, and thereby his image of matter, in order to be able to derive the correct line element for his basic geometry of configuration space. Indeed, in the earlier manuscript (Ms 7), where he had not considered the line element, he had explicitly rejected his first idea of introducing such a small building block of matter. Thus Hertz’s image of matter was not just a matter of matter. It was also a matter of the geometry.
12.4 A matter of space

At first sight the above reconstruction seems to reduce the question of Hertz’s image of matter to a purely technical mathematical question. However, it is important to note that the introduction of the Massenteilchen is only a necessity if one accepts that the line element should be deduced as a quadratic mean value, and if one accepts that it should have the weighted form (12.2). The second point seems to be almost inescapable if one wants to build a geometric mechanics such as the one Hertz had in mind. Indeed, many simple theoretical reflections could have led Hertz to realize that a ds defined by an unweighted sum (12.10) would lead into trouble. The simplest argument is the one Hertz indicated in Comment 1 above: If we take two material points and glue them together we can consider the compound either as one material point or as two material points. However, if the compound is a part of a system of material points, the unweighted line element (and therefore the resulting motion) of the system will depend upon whether we consider the compound as one or as two material points. This is unsatisfactory. It is also possible that Hertz early on knew that he wanted to express the (kinetic) energy of a system by the usual formula $\tfrac{1}{2} m (ds/dt)^2$. This is clearly false if ds is determined by eqn (12.10), but true if it is determined by eqn (12.2). Finally, it is possible that Hertz knew that he wanted to express Gauss’ principle of least constraint as a law minimizing the curvature of the path of the system. This is a central element of Hertz’s one and only Fundamental law of motion. As remarked already by Arnold Sommerfeld (Sommerfeld 1943, pp. 205–206) this would also lead to the weighted expression of the line element.
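The first two of these reflections, the gluing argument and the kinetic-energy formula, can be illustrated with a few lines of Python. This is a sketch under stated assumptions and not anything found in Hertz’s notes: the two-point system, its masses and its displacements are hypothetical, and the displacements are taken per unit time so that ds² can also be read as (ds/dt)².

```python
def squared_norm(vec):
    """Sum of the squared components of a displacement."""
    return sum(c * c for c in vec)


def unweighted_ds2(points):
    """Eqn (12.10): plain sum of squared coordinate displacements."""
    return sum(squared_norm(dx) for _, dx in points)


def weighted_ds2(points):
    """Eqn (12.2): mass-weighted sum divided by the total mass m."""
    total_mass = sum(mass for mass, _ in points)
    return sum(mass * squared_norm(dx) for mass, dx in points) / total_mass


def split_first_point(points):
    """Regard the first point as two glued points of half the mass each."""
    (mass, dx), rest = points[0], points[1:]
    return [(mass / 2, dx), (mass / 2, dx)] + rest


# Hypothetical system: (mass, displacement per unit time) for each material point.
system = [(4.0, (1.0, 0.0, 0.0)), (1.0, (0.0, 2.0, 0.0))]
split = split_first_point(system)

total_mass = sum(mass for mass, _ in system)
kinetic_energy = 0.5 * sum(mass * squared_norm(v) for mass, v in system)
```

With these numbers the unweighted expression changes from 5 to 6 when the first point is regarded as two glued points, while the weighted expression stays the same. Likewise, half the total mass times the weighted ds² reproduces the ordinary kinetic energy, whereas the unweighted ds² does not.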
It is interesting that Sommerfeld at this place already suggested that it was for this reason that Hertz introduced the Massenteilchen:

Further, in order to arrive at the geometric interpretation he [Hertz] had in mind, he had to consider all masses as multiples of an atomic unit mass, so to speak.¹³ (Sommerfeld 1943, pp. 205–206)

¹³ Note that Sommerfeld used the word unit mass for the Massenteilchen, not for the arbitrarily chosen unit mass in Hertz’s definition.

At any rate, Hertz was almost bound by the inner logic of the subject to discover that it was preferable to replace the unweighted expressions (12.9) and (12.10) by the weighted sums in eqns (12.1) and (12.2). He could have done this simply by postulating eqn (12.1) and thereby eqn (12.2) as the measure of length of a displacement, but he did not do that. Instead he clung to his old derivation by way of quadratic mean values. Why? This is a question that I do not think one can answer by reference to purely mathematical or physical facts. I think it has to do with Hertz’s philosophical view of geometry and space. In fact, the line element ds defined by eqns (12.2) or (12.3) is a non-Euclidean line element of the sort studied by several mathematicians from the time of Riemann. Hertz was well aware of these investigations, and referred to them in his book (Hertz 1894, p. xxxii and p. 39/32–33). Yet, in the Introduction to the Mechanics Hertz clearly dissociated his geometry of systems of points from the Riemannian geometries developed by the contemporary mathematicians. For example, in connection with his discussion of the Hamilton–Jacobi formalism (see Chapter 23) he remarked:

It has long since been remarked by mathematicians that Hamilton’s method contains purely geometrical truths, and that a peculiar mode of expression, suitable to it, is required in order to express these clearly. But this fact has only come to light in a somewhat perplexing form, namely, in the analogies between ordinary mechanics and the geometry of space of many dimensions, which have been discovered by following out Hamilton’s thoughts. Our mode of expression gives a simple and intelligible explanation of these analogies. It allows us to take advantage of them, and at the same time it avoids the unnatural admixture of supra-sensible abstractions with a branch of physics. (Hertz 1894, p. 39/32–33)
To a modern reader this attempt to draw a clear line between the geometry of systems of points and Riemannian geometry may seem strange. In modern textbooks a Riemannian manifold is presented as a formal analytical system to which are attached geometric names such as point, distance, map, atlas, curvature, etc. Therefore we have no problem in identifying Hertz’s geometry of systems of points with a Riemannian manifold. However, one has to keep in mind that in Hertz’s time the theory of Riemannian geometry was totally entangled with a discussion of physical space. To be sure, Riemann had constructed his concept of manifold in a rather analytical way, but he intended it to be an a-priori possible structure of physical space. He pointed out that only when we understand the wide range of a-priori possible geometries can we specify those hypotheses that lie at the basis of real physical space. Helmholtz had a similar understanding of non-Euclidean geometries. Such a concept of geometry must clearly have been problematic to Hertz. As we saw in the previous chapter, Hertz considered space to be a-priori 3-dimensional and Euclidean. From such a Kantian point of view Riemannian geometry could not play the role of an a-priori possible structure of physical space, but was a structure of a suprasensible non-existent space. Its ontological status must obviously have seemed highly problematic to Hertz. His own geometry of systems of points, on the other hand, was entirely unproblematic. It did not deal with points in a suprasensible space, but with a system of points in ordinary Euclidean space. It is symptomatic that although Hertz introduced many other geometrical terms in his geometry of systems of points, he never used the term point for his systems of points nor the term space for what we today call configuration space.
Thus Hertz could benefit from the formalism of high-dimensional Riemannian manifolds, without using the philosophically problematic concept of a high-dimensional non-Euclidean space. To Hertz, the relation between his geometry of systems of points and the Riemannian geometries introduced by the mathematicians was not one of equality or inclusion but one of analogy:

Hence there arise many analogies with the geometry of space of many dimensions; and these in part extend so far that the same propositions and notations can apply to both. But we must note that these analogies are only formal, and that, although they occasionally have an unusual appearance, our considerations refer without exception to concrete images of space as perceived by our senses. Hence all our statements represent possible experiences; if necessary, they could be confirmed by direct experiments, viz by measurements made with models. Thus we need not fear the objection that in building up a science dependent upon experience, we have gone outside the world of experience. (Hertz 1894, p. 36/30)
Here, Hertz probably ascribed to the word analogy the same technical meaning as William Thomson and Maxwell had given to it (see Section 8.5): Two physical theories are analogous if they are described by the same analytical mathematical formalism. In this case the one ‘physical’ theory is the suprasensible high-dimensional non-Euclidean space and the other physical theory is the systems of points. They share the same analytical formalism, namely what we would call Riemannian geometry. In other words, they have entirely different objects but share the analytic formalism. To Hertz it was questionable whether the theory of high-dimensional spaces has an object at all, but as he pointed out in the quote above, the statements of his geometry of systems of points all ‘represent possible experiences’ (Hertz 1894, p. 36/30). For these reasons it is reasonable to assume that Hertz would have felt very uncomfortable had he simply postulated a non-Euclidean metric such as eqn (12.2) for his configuration space. Deriving it as a quadratic mean value may have seemed more acceptable and natural to Hertz. Indeed, the ordinary Euclidean distance is obtained as a quadratic mean value of the rectangular coordinates. Hertz just extended this derivation to systems. If this reconstruction is correct, Hertz’s introduction of Massenteilchen, and thereby of his image of matter, was a result of natural philosophical reflections; these were not primarily reflections about the nature of matter, however, but rather reflections about the nature of space.
12.5 The Massenteilchen shrink and become matter free

The Massenteilchen that Hertz introduced in his first draft (Ms 9) were not quite identical to the ones in the printed version of the Mechanics. Indeed, in connection with his deduction of the expression (12.1), which otherwise follows the later deduction in the book (as sketched above), Hertz remarked:

We choose the mass of a Massentheilchen so small that with sufficient approximation it divides all masses of the system a whole number of times. (Ms 7, p. 3)14
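The counting argument behind this remark can be illustrated numerically. The following Python sketch assumes that expression (12.1) has the mass-weighted form m s² = Σ m_ν |Δr_ν|², with m the total mass of the system (an assumption here, since the formula is not restated in this section), and that every point mass is an exact whole-number multiple of a hypothetical Massenteilchen mass `eta`. Under these assumptions the plain quadratic mean of the displacements of all Massenteilchen reproduces the weighted length. The function names and the sample data are illustrative only and do not come from Hertz.

```python
from fractions import Fraction
from math import isclose, sqrt


def squared_norm(displacement):
    """Squared Euclidean length of one point's displacement vector."""
    return sum(c * c for c in displacement)


def length_weighted(masses, displacements):
    """Length from the mass-weighted sum (assumed form of eqn (12.1))."""
    total_mass = sum(masses)
    weighted = sum(m * squared_norm(d) for m, d in zip(masses, displacements))
    return sqrt(weighted / total_mass)


def length_quadratic_mean(masses, displacements, eta):
    """Quadratic mean over equal Massenteilchen of (hypothetical) mass eta."""
    squares = []
    for m, d in zip(masses, displacements):
        count = Fraction(m) / Fraction(eta)
        if count.denominator != 1:
            raise ValueError("eta must divide every mass a whole number of times")
        # Each Massenteilchen of this point undergoes the same displacement.
        squares.extend([squared_norm(d)] * count.numerator)
    return sqrt(sum(squares) / len(squares))


# Illustrative data: three points with masses 1/2, 3/10 and 6/5.
masses = [Fraction(1, 2), Fraction(3, 10), Fraction(6, 5)]
displacements = [(1.0, 0.0, 2.0), (0.5, -1.0, 0.0), (0.0, 0.3, 0.4)]
eta = Fraction(1, 10)  # divides the three masses 5, 3 and 12 times

if __name__ == "__main__":
    print(length_weighted(masses, displacements))
    print(length_quadratic_mean(masses, displacements, eta))
```

In this sketch a value of `eta` that does not divide all masses raises an error, which mirrors the dependence of the choice of Massenteilchen on the system under study.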
14 ‘Wir wählen die Masse eines Massentheilchens so klein, dass sie alle Massen des Systems mit genügend Annäherung ganzzahlig theilt.’ (Ms 7, p. 3).

Thus, despite the fact that Hertz in the definition wrote that the mass of a Massenteilchen was ‘vanishing,’ he seems to have considered it as finite, and just so small that it will measure all the masses, at least approximately. This suggests that our choice of Massenteilchen may depend on the system we want to study. Thus the Massenteilchen have the character of auxiliary quantities of an ad hoc approximative nature. There is an interesting parallel here to another instance of approximation in Hertz’s mechanics. When Hertz succeeded in explaining conservative forces (or potential energy) as the result of cyclic motions of hidden masses, this too was an approximation. If the hidden masses move very fast but are very small, the approximation will be good, but it would be perfect only if the hidden masses were infinitely small and moved infinitely fast. Even in the printed book Hertz did not go to this limit but in this case left his image as an approximation of ordinary mechanics that could be made as good as one wished (see Chapter 20).

Hertz dealt with the approximation caused by the finite size of his Massenteilchen quite differently. Here, in the published book, he refused to build his mechanics on an approximation, and he rejected the discretization of mass that would have resulted from composing mass from finite Massenteilchen. Instead, he chose to make his Massenteilchen infinitely small. This happened in two stages. In the second draft of the book (Ms 12) Hertz made the Massenteilchen infinitely small so that material particles could have finite masses with a continuum of values. Only in the third draft (Hertz, Ms 15) did he make the Massenteilchen infinitely small of the second order. It is in this connection that he introduced material particles with infinitely small masses. There is, however, one vestige of the vanishing masses in the printed version of the Mechanics. Indeed, in Definition 3, quoted from the English translation in Section 11.3.1, the second sentence in the second paragraph in fact reads: ‘This we attain by supposing the Massenteilchen to be of a higher order of infinitesimal than the material points that are almost considered of vanishing mass’.15 So, apparently Hertz here identified (or confused) ‘infinitesimal’ with ‘vanishing,’ which, as we saw above, in the first draft meant very small but not infinitely small.
15 ‘dass wir uns die Massenteilchen von höherer Ordnung unendlich klein denken, als die etwa betrachteten materiellen Punkte von verschwindender Masse’ (Hertz 1894, §5).

This confusion could be explained away as a slip of the pen,16 had it not been for the fact that in the third draft of the book Hertz had written ‘infinitely small mass’ (‘unendlich kleiner Masse’) instead of vanishing mass at the end of the sentence from §5 quoted above. Later he changed it, apparently deliberately, when he copied the definition into his fourth draft. It therefore seems as though Hertz also in the printed book wanted to convey the idea that the material points of infinitely small mass could be thought of as vanishing (perhaps of atomic or subatomic size). Be that as it may, Hertz introduced the Massenteilchen to serve a role in his geometry of systems of points, but shrank them afterwards to make them acceptable in the theory of matter. Had he taken the Massenteilchen really seriously as an image of the building blocks of matter, he could have suggested that they were indeed very small but perhaps finite, and he could have suggested testing in the future whether one could detect a resulting discretization of the mass of atoms. This would have paralleled what he did in the case of his approximation of the ordinary laws of conservative systems. That he did not choose this possibility seems to indicate that he did not consider his image of matter as a fundamental part of his image of mechanics, but mainly included it because it served a role in the derivation of the line element. This point of view fits well with Hertz’s insistence that definitions (e.g. the definition of mass) are introduced for the sake of appropriateness rather than for the sake of permissibility or correctness.

16 The translators of the English translation of Hertz’s book apparently considered it as a slip of the pen. At least they translated ‘verschwindender Masse’ into ‘infinitely small mass.’

Finally, let me mention a last important difference between the Massenteilchen of the first draft and those of the book: In the first draft the Massenteilchen (or the vanishing material point of the deleted passage in (Ms 7)) is a quantity of matter. Massenteilchen were therefore quite similar to finite masses, just smaller. In the book, however, a Massenteilchen is nothing but a ‘characteristic (Merkmal) by which we associate without ambiguity a given point in space at a given time with a given point in space at any other time’ (see the previous chapter). Hertz no longer stated that it was made of matter. Matter seems to be a result of the presence of Massenteilchen. To put it differently: in Hertz’s early manuscripts, Massenteilchen are made up from matter; in his book, matter is made up from geometrically defined Massenteilchen.17
17 This change of the definition happened in the second draft (Ms 12). This means that in this draft Massenteilchen are defined as mere characteristics (Merkmale) and, on the other hand, it is implied that they may differ in some way. No wonder that Hertz changed this formulation in the next draft.

12.6 Conclusion

The analysis of this chapter leads to the following conclusion: Hertz did not at first introduce the Massenteilchen in his image of Mechanics in order to explain the constitution of matter but as a means to derive the fundamental line element in his ‘geometry of systems of points.’ He probably wanted to give such a derivation on the basis of Euclidean geometry because he looked with suspicion on the new ideas of non-Euclidean geometry. Having introduced the Massenteilchen Hertz subsequently changed their definition as a result of reflections about the constitution of matter.
13 Hertz’s geometry of systems of points
13.1 Geometrization of mechanics

The relation between mechanics and geometry has changed dramatically over the last 350 years. Newton wrote his Principia (Newton 1687) in a geometric style, but one century later Lagrange in his Mécanique Analytique proudly announced that: ‘One will not find any figures in this work. The methods that I put forward need neither constructions nor geometric or mechanical arguments but only algebraic operations, subjected to a regular and uniform procedure’ (Lagrange 1788, vj). At the beginning of the twentieth century, geometry again came to the fore as the basis of Einstein’s general theory of relativity, and today even classical mechanics is often treated as so-called symplectic geometry (Arnold 1978). The type of geometry that is applied in mechanics has also changed. Newton built on classical Euclidean synthetic geometry, whereas Einstein and his followers have used differential geometry. In fact, differential geometry was already used in dynamics from the middle of the nineteenth century by mathematicians such as Liouville, Bertrand, Serret and Minding, and it was carried further after the publication of Riemann’s Habilitationsschrift by Beltrami, Lipschitz and Darboux. However, these researches, which I shall return to in Chapter 24, were hardly noticed by the physicists. The first physicist who used differential geometry in mechanics was Hertz, and other physicists seem to have learned about it from Hertz’s Prinzipien der Mechanik. In fact, one can argue that the geometric formalism was to be the most lasting innovation of Hertz’s book. In this chapter I shall explain Hertz’s geometry of systems of points in some detail, and discuss two questions related to it: Why did Hertz choose to present his mechanics in a geometric form, and how did he initially construct this geometry?
Hertz’s early notes on mechanics as well as the first draft of the book will give us a rather good answer to the latter question, but since his decision to create a geometric formalism pre-dates his earliest notes, they do not help us answer the first question.
13.2 Why the geometric form?

The question of why Hertz chose a geometric mathematical form is rather intriguing. Indeed, as Hertz himself emphasized ‘the physical content is quite independent of the mathematical form’ (Hertz 1894, p. 34/29). Of course, the fundamental law loses its meaning in a more traditional mathematical presentation of mechanics, but as Hertz pointed out, one can replace it with Newton’s first law combined with Gauss’s principle of least constraint. Thus, Hertz was not logically forced to create and use this novel geometric language. Keeping in mind his Kantian view of geometry and his somewhat sceptical remarks about the unnaturalness of suprasensible high-dimensional geometries, it might even seem to be a somewhat surprising choice. One might have expected this step to have been taken by a physicist with a more open mind about the new geometry, such as Helmholtz. There are two pieces of evidence suggesting that Hertz could initially have been influenced by mathematicians to develop his geometry of systems of points. The first one is the first entry related to mechanics in Hertz’s diary. It reads:

7. May. [1890] Asked Lipschitz about the Hamiltonian principle. (Hertz 1977, pp. 300–301)
Lipschitz was a professor of mathematics in Bonn, where Hertz had been appointed professor of physics in 1889. We do not know what the two professors talked about, but it is not inconceivable that Lipschitz mentioned his own paper from 1872 in which he had given a treatment of Hamilton’s principle in terms of Riemannian geometry (see Chapter 24). Another possible influence on Hertz could have come from the much younger mathematician Hermann Minkowski who was also in Bonn at the time and busy developing his Geometrie der Zahlen, another application of a high-dimensional geometry to a field of mathematics, namely number theory (Kjeldsen 2002). Indeed we know that Hertz was in regular contact with his younger mathematics colleague. However, in the introduction to the Mechanics Hertz did not mention Minkowski among his sources of inspiration and he claimed that he learned of the works of the mathematicians, including Lipschitz, after he had developed his own ideas. Indeed, after referring to J.J. Thomson’s paper (Thomson 1886) he wrote: I might have derived assistance from this paper as well [as from Helmholtz’s papers]; but as a matter of fact my own investigation had made considerable progress by the time I became familiar with it. I may say the same of the mathematical papers of Beltrami (1868b) and Lipschitz (1872), although these are of a much older date. Still I found these very suggestive, as also the more recent presentation of their investigations which Darboux (1888) has given with additions of his own. I may have missed many mathematical papers which I could and should have consulted. (Hertz 1894, Preface)
And, in fact, Hertz’s notebooks corroborate this claim. They show that he constructed the geometry of systems of points from scratch without borrowing any technical ideas from his contemporary mathematicians. He seems to have built on Riemann’s and Helmholtz’s work on manifolds, but not on any works linking these geometric investigations to mechanics. Of course, this does not exclude that contacts with Lipschitz or Minkowski might have opened his mind to the possibility of using a geometric formalism, but his own remarks in the introduction of the book seem to suggest that the major push for the geometric formalism came from the unique physical content of his image of mechanics. Indeed, although the geometric form and the physical content are formally independent, ‘they are,’ according to Hertz, ‘so suited,
that they mutually assist one another’ (Hertz 1894, p. 34/29). In what way do they assist each other? Hertz particularly emphasized that ‘the essential characteristic of the terminology used (the geometry of systems of points) consists in this, that instead of always starting from single points, it from the beginning conceives and considers whole systems of points’ (Hertz 1894, p. 34/29). Something similar can be said about the physical content of Hertz’s mechanics. Indeed, in a presentation of a mechanics without forces, there would be no point in starting with a chapter on the motion of one point mass, as is and was the traditional way to start a mechanics textbook. In ordinary mechanics there are interesting things to be said about the motion of a point mass in a force field, but when there are no forces, there is nothing one can say about the motion of one point. It is therefore an inevitable consequence of the central physical content of Hertz’s mechanics that he needed to start directly with systems of points. This match between the physical content and the geometric form, may explain how Hertz came upon the idea of his geometry of systems of points in the first place. The following quote from the introduction may reflect Hertz’s own initial train of thoughts: Every one is familiar with the expressions ‘position of a system of points’ and ‘motion of a system of points.’ There is nothing unnatural in continuing this mode of expression, and denoting the aggregate of the positions traversed by a system in motion as a path. Every smallest part of this path is then a path-element. Of two path-elements one can be a part of the other: they then differ in magnitude and only in magnitude. But two path-elements which start from the same position may belong to different paths. In this case neither of the two forms part of the other: they differ in other respects than that of magnitude, and thus we say that they have different directions. 
It is true that these statements do not suffice to determine without ambiguity the characteristics of ‘magnitude’ and ‘direction’ for the motion of the system. But we can complete our definitions geometrically or analytically so that their consequences shall neither contradict themselves nor the statements we have made; and so that the magnitudes thus defined in the geometry of the system shall exactly correspond to the magnitudes which are denoted by the same names in the geometry of the point, – with which, indeed, they always coincide when the system is reduced to a point. (Hertz 1894, pp. 34–35/29)
Thus a likely reconstruction of Hertz’s way to the geometrization of mechanics goes as follows: Having decided to create a forceless mechanics Hertz realized that he had to deal with systems of points from the start. However, he decided to try and deal with them in a way that paralleled the traditional mechanics of one mass point. He discovered that from the familiar expressions of position and motion of a system he could go on to define notions of distance and directions. He discovered that there were many ways to define these notions, but after some experimentation he discovered how to do it in such a way that the description of a system would correspond completely to the traditional description of one point mass. This reconstruction is corroborated by Hertz’s manuscripts. Indeed, the earliest notes on mechanics left behind by Hertz are concerned with the development of the geometry of systems of points. Unfortunately, there are no reflections about why such a mathematical form would be desirable, but it is rather certain that Hertz did not first develop his physical idea of a force-free mechanics in a traditional mathematical
language. Indeed, it is conspicuous that there are no notes in Hertz’s hand in which he experiments with the physical content of the image. It appears almost fully formed in the first draft of the book. Apparently Hertz was not just flattering his mentor when he wrote in the preface to the Mechanics:

Both in its broad features and in its details my own investigation owes much to the abovementioned papers (Helmholtz 1884), (Helmholtz 1886): the chapter on cyclical systems is taken almost directly from them. Apart from matters of form, my own solution differs from that of Helmholtz chiefly in two respects … . (Hertz 1894, Preface)
So, it seems that from the start of his serious engagement with mechanics, the physical content of a forceless mechanics was more or less clear in Hertz’s mind thanks to Helmholtz’s papers. Therefore he began his work by experimenting with the mathematical, geometric form. In what appears to be the first of Hertz’s notes (Ms 1) he studied the path of a point, and introduced the notions of magnitude and direction of an infinitesimal element of the path, as well as the angle between two such elements and the length of a piece of a path. He also introduced generalized coordinates and the concepts of a displacement in the direction of a coordinate and components of a displacement in the direction of a coordinate. He then crossed out the beginning of the note and replaced it by a new manuscript (Ms 2) in which all references to a path and to time have been removed. Instead he defined length1 and direction of a finite displacement as well as the angle between two displacements. Moreover, he now dealt with a system of points rather than one single point. However, he continued to use the word ‘Verschiebung,’ which suggests a motion in time through a path. The same word is used in the two subsequent manuscripts (Ms 3, 4) in which Hertz began to introduce the central concept of curvature (see the following section). However, in the next manuscript (Ms 5) the term ‘Verschiebung’ is replaced by the word ‘Verrückung,’ which is also used in the printed book. It is surely a conscious change that signals that a Verrückung does not imply a motion in time and depends only on the initial and final configurations of the system, not on the path it may have followed to get from one to the other. Hertz then went on to introduce kinematic notions such as velocity and momentum for a system moving in a path (Ms 6) and the concept of connections (Ms 7), after which he composed a new beginning of the introductory material (Ms 8).
1 We already discussed this introduction of the length of a displacement in the previous chapter.

In all of these short manuscripts (except in one section of Ms 4 that may be of a later date) Hertz did not introduce the concept of mass. It appeared for the first time in the first long draft of the book (Ms 9) that Hertz seems to have begun after completing (Ms 8). This gradual development of the basic concepts of the geometry of systems of points follows the general path that Hertz sketched in the introduction of the book (see quote above) and corroborates his own claim to have developed the ideas independently from the contemporary mathematicians. The more technical details of the manuscripts are also interesting because they shed light on Hertz’s gradual construction of the central concepts. In the previous chapter I have already discussed what they reveal about the development of the concepts of the line element (i.e. distance) and the concept of mass. In the remaining part of this chapter and the following two chapters I shall discuss Hertz’s development of the concepts of direction, angle, curvature, vector quantities, components, velocity, momentum, and connections. For each of these concepts, I shall first summarize how Hertz introduced them in the published book, and then analyse what the notes tell us about how Hertz gradually constructed them.
13.3 Direction, angle, and curvature in the printed book

Having introduced the distance (12.2) between two positions of a system, Hertz proved the triangle inequality:

Proposition. The distance between two positions of a system is always smaller than the sum of the distances of the two positions from a third. (Hertz 1894, §32)
This allowed him to construct a plane triangle with sides equal to the relative distances $|P_0P_1|$, $|P_0P_2|$ and $|P_1P_2|$ between three positions $P_0$, $P_1$, and $P_2$ of a mechanical system. The angle included between the sides with lengths $|P_0P_1|$ and $|P_0P_2|$ in the plane triangle is then by definition what Hertz called the ‘angle’ between the displacements $P_0P_1$ and $P_0P_2$, or the ‘difference in direction’ between them. He then showed that the angle, denoted $s's''$, between the two displacements $s'$ and $s''$ can be expressed in terms of the rectangular coordinates as follows:

$$m\,s's''\cos(s's'') = \sum_{\nu=1}^{3n} m_\nu (x'_\nu - x_\nu)(x''_\nu - x_\nu), \tag{13.1}$$

where $x_\nu$ is the coordinate of the common initial point of the two displacements and $x'_\nu$ and $x''_\nu$ are the coordinates of their endpoints, respectively. With this definition Hertz had only dealt with angles between two displacements from the same initial position. In order to define the angle between two displacements starting from different initial positions he needed the idea of parallel transport. He introduced this idea as follows. First, he proved the

Proposition. Two displacements of a system from the same initial position have the angle between them equal to zero when the displacements of the individual points of both the systems are parallel and correspondingly proportional, and conversely. (Hertz 1894, §38)
Then he introduced the idea of direction: He showed that the property ‘having angle zero with each other’ is an equivalence relation among displacements having the same initial position (not Hertz’s formulation) and then defined ‘direction’ as an equivalence class of this relation. In Hertz’s words: ‘The common property [Das Gemeinsame] of all such displacements is called their direction.’ This is a remarkably set-theoretic definition.
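As a numerical illustration of the construction summarized above, the following Python sketch computes the angle between two displacements from a common initial position in two ways: from the plane triangle built out of the three mutual distances (law of cosines), and from the right-hand side of eqn (13.1). It also checks the content of §38 on a displacement whose point displacements are parallel and proportional to those of another. The distance is taken to be the mass-weighted expression that results from eqn (13.1) when the two displacements coincide, with m assumed to be the total mass of the system; the coordinate lists, masses and function names are illustrative assumptions.

```python
import math


def total_mass(coord_masses):
    """Total mass m; every point's mass is listed once per coordinate."""
    return sum(coord_masses) / 3.0


def distance(coord_masses, p, q):
    """Mass-weighted distance between two positions p and q of the system."""
    weighted = sum(mv * (b - a) ** 2 for mv, a, b in zip(coord_masses, p, q))
    return math.sqrt(weighted / total_mass(coord_masses))


def clamped_acos(value):
    """acos that tolerates rounding slightly outside [-1, 1]."""
    return math.acos(max(-1.0, min(1.0, value)))


def angle_from_triangle(coord_masses, p0, p1, p2):
    """Angle at p0 in the plane triangle with the three mutual distances."""
    a = distance(coord_masses, p0, p1)
    b = distance(coord_masses, p0, p2)
    c = distance(coord_masses, p1, p2)
    return clamped_acos((a * a + b * b - c * c) / (2.0 * a * b))


def angle_from_formula(coord_masses, p0, p1, p2):
    """Angle from the mass-weighted sum on the right of eqn (13.1)."""
    s1 = distance(coord_masses, p0, p1)
    s2 = distance(coord_masses, p0, p2)
    dot = sum(mv * (x1 - x0) * (x2 - x0)
              for mv, x0, x1, x2 in zip(coord_masses, p0, p1, p2))
    return clamped_acos(dot / (total_mass(coord_masses) * s1 * s2))


# Illustrative system of two points with masses 2 and 5 (3n = 6 coordinates).
coord_masses = [2.0, 2.0, 2.0, 5.0, 5.0, 5.0]
p0 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
p1 = [1.0, 0.0, 0.0, 1.0, 2.0, 0.0]
p2 = [1.0, 1.0, 0.0, 2.0, 1.0, 1.0]
# Displacement p0 -> p3 is 2.5 times the displacement p0 -> p1.
p3 = [a + 2.5 * (b - a) for a, b in zip(p0, p1)]

if __name__ == "__main__":
    print(angle_from_triangle(coord_masses, p0, p1, p2))
    print(angle_from_formula(coord_masses, p0, p1, p2))
    print(angle_from_formula(coord_masses, p0, p1, p3))
```

The agreement of the two functions is simply the law of cosines applied to the mass-weighted distance, which is what allows the triangle construction to serve as a definition of angle.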
164
Hertz’s geometry of systems of points
The proposition §38 quoted above allowed Hertz to define what it means for two displacements from two different initial positions to be parallel: Definition. … Two displacements are equal when the displacements of the individual points are equal (i.e. they have the same magnitude and direction, §25) and two displacements are parallel when the displacements of the individual points in both are parallel and correspondingly proportional. (Hertz 1894, §41)
This notion of parallel displacement allowed Hertz to give the general definition of angle between two arbitrary displacements: Additional Note. By the difference in direction [the angle] between two displacements of a system from different initial positions we mean the angle between either of them, and a parallel displacement to the other from its own initial position. (Hertz 1894, §43)
With this definition at hand Hertz could finally (§44) find the expression for the angle between two arbitrary displacements $s'$ and $s''$ in terms of the initial coordinates $x_\nu$ and $x_{\nu 0}$, and the final coordinates $x'_\nu$ and $x''_\nu$ of the two displacements:

$$m\,s's''\cos(s',s'') = \sum_{\nu=1}^{3n} m_\nu (x'_\nu - x_{\nu 0})(x''_\nu - x_\nu). \tag{13.2}$$
Applied to two infinitesimal displacements $ds$ and $ds'$ this formula yields the expression

$$m\,ds\,ds'\cos(s,s') = \sum_{\nu=1}^{3n} m_\nu\,dx_\nu\,dx'_\nu \tag{13.3}$$

for the cosine of the angle between $ds$ and $ds'$ expressed in terms of the changes $dx_\nu$ and $dx'_\nu$ of the rectangular coordinates of the system (§56). The corresponding formula in generalized coordinates looks like

$$ds\,ds'\cos(s,s') = \sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} a_{\rho\sigma}\,dq_\rho\,dq'_\sigma, \tag{13.4}$$
where $a_{\rho\sigma}$ is defined by eqn (12.4) (Hertz 1894, §58). However, as Hertz pointed out, since parallel transport does not necessarily preserve the differences in the generalized coordinates this formula is only valid for two infinitesimal displacements at the same position. Hertz finally introduced the concept of orthogonality, and expressed it analytically by setting the right sides of the formulas (13.3) and (13.4) equal to zero.

With the concept of angle in place Hertz could define the concept of curvature of a path. A path of a system is defined as ‘the simultaneously considered aggregate of positions which a system occupies in its passage from one point to another’ (Hertz 1894, §97).2 Thus, the concept of a path is an entirely geometric concept. It has no reference to time.

2 In the English translation of the book, however, the words ‘simultaneously considered’ [gleichzeitig vorgestellte] have been left out.

An element of a path is a portion of a path limited by two infinitely near positions, and the direction of a path in a given position is defined as the direction of a path element at this position. The path is called straight if it has the same direction in all its positions. If it is not straight, the direction will change and its rate of change with regard to the length of the path is what Hertz called the curvature $c$ (Hertz 1894, §98–104). Thus, the curvature is analytically represented by

$$c = \frac{d\varepsilon}{ds}, \tag{13.5}$$

where $d\varepsilon$ is the angle between the direction of the path at the beginning and the end of a path element $ds$. In a somewhat convoluted manner Hertz derived the following expression of the curvature

$$m c^2 = \sum_{\nu=1}^{3n} m_\nu\, {x''_\nu}^{2} \tag{13.6}$$
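(§106) in terms of the second derivatives $x''_\nu$ of the rectangular coordinates with respect to the curve length along the path. From this expression he was further able to deduce the following expression of the curvature in terms of generalized coordinates

$$c^2 = \sum_{\rho=1}^{r}\sum_{\sigma=1}^{r}\left[ a_{\rho\sigma}\, q''_\rho q''_\sigma + \sum_{\tau=1}^{r}\left( 2\frac{\partial a_{\rho\sigma}}{\partial q_\tau} - \frac{\partial a_{\rho\tau}}{\partial q_\sigma} \right) q'_\rho q'_\tau q''_\sigma + \sum_{\lambda=1}^{r}\sum_{\mu=1}^{r} a_{\rho\sigma\lambda\mu}\, q'_\rho q'_\sigma q'_\lambda q'_\mu \right] \tag{13.7}$$

(§108), where primes denote derivatives with respect to the curve length and where he has introduced the symbols

$$a_{\rho\sigma\lambda\mu} = \sum_{\nu=1}^{3n} \frac{m_\nu}{m}\, \frac{\partial \alpha_{\nu\sigma}}{\partial q_\lambda}\, \frac{\partial \alpha_{\nu\rho}}{\partial q_\mu} \tag{13.8}$$

that, according to Hertz, cannot be expressed algebraically in terms of the $a_{\rho\sigma}$.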
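The relation between the definition (13.5) and the expression (13.6) can be checked on a simple example. The Python sketch below considers a hypothetical system of two mass points that move on concentric circles in a plane with a common angle θ, so that the mass-weighted arc length is s = Kθ with K² = (m₁R₁² + m₂R₂²)/m. It assumes that m is the total mass, uses only the two in-plane coordinates of each point (the third coordinate is constant and is omitted), and replaces derivatives by finite differences. Masses, radii and function names are illustrative choices, not taken from Hertz.

```python
import math

M1, M2 = 1.0, 3.0            # illustrative masses of the two points
R1, R2 = 2.0, 1.0            # illustrative radii of their circles
COORD_MASSES = [M1, M1, M2, M2]
M = M1 + M2                  # total mass (assumed meaning of m)
K = math.sqrt((M1 * R1 ** 2 + M2 * R2 ** 2) / M)   # ds = K * d(theta)


def position(s):
    """Rectangular coordinates of the system at mass-weighted arc length s."""
    theta = s / K
    return [R1 * math.cos(theta), R1 * math.sin(theta),
            R2 * math.cos(theta), R2 * math.sin(theta)]


def weighted_dot(u, v):
    """(1/m) * sum of m_v * u_v * v_v, the bilinear form behind eqn (13.3)."""
    return sum(mv * a * b for mv, a, b in zip(COORD_MASSES, u, v)) / M


def curvature_from_second_derivatives(s, h=1e-3):
    """Curvature from eqn (13.6), with x'' taken by central differences."""
    xm, x0, xp = position(s - h), position(s), position(s + h)
    second = [(a - 2.0 * b + c) / h ** 2 for a, b, c in zip(xm, x0, xp)]
    return math.sqrt(weighted_dot(second, second))


def tangent(s, h=1e-3):
    """Direction of the path element at s (central difference)."""
    return [(b - a) / (2.0 * h) for a, b in zip(position(s - h), position(s + h))]


def curvature_from_turning_angle(s, ds=1e-3):
    """Curvature from eqn (13.5): turning angle of the direction per unit length."""
    t0, t1 = tangent(s), tangent(s + ds)
    cosine = weighted_dot(t0, t1) / math.sqrt(
        weighted_dot(t0, t0) * weighted_dot(t1, t1))
    return math.acos(max(-1.0, min(1.0, cosine))) / ds


if __name__ == "__main__":
    print(curvature_from_second_derivatives(0.7))
    print(curvature_from_turning_angle(0.7))
    print(1.0 / K)
```

For this particular path both routes should agree with 1/K up to discretization error, since the direction of the system turns by exactly dθ = ds/K.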
13.4 Direction, angle, and curvature in Hertz’s manuscripts

It is interesting to compare this streamlined introduction of the concept of angle and curvature to the more roundabout way Hertz arrived at them in his notes. From the start Hertz seems to have known that the concept of curvature, and thus of angle would be central to his new presentation of mechanics. However, his initial treatment of the concepts differed in several respects from the treatment summarized above. In his first note (Ms 1) he only dealt with one point and went head on with the elements of a path, rather than with general displacements. In his second note (Ms 2), however,
he studied systems, and general displacements, but as I have already explained there is no concept of mass in these early notes, with the result that the formulas corresponding to eqns (12.1) and (13.1) do not contain the m and mν . However, the most conspicuous difference between the treatment in the book and in the notes concerns the way Hertz dealt with the problem of parallel transport.3 In fact, it seems as though Hertz initially found it difficult to define a concept of angle between two displacements (finite or infinitely small) when they have different initial positions. In (Ms 1) and (Ms 2) he only introduced the angle between two displacements with the same initial position. In (Ms 3), which seems to be a continuation of (Ms 1 and 2), Hertz then developed an approach to the concept of curvature that is different from the one published in the book. He first defined a path of a system to be straight if the displacements from one fixed initial position 0 of the system to any other position 1 have the same direction irrespective of the position 1 on the path. This presupposes the notion of direction or angle between finite displacements from the same initial point 0 as developed in (Ms 1, 2) but it does not presuppose a comparison of direction between (infinitesimal) displacements from different points of the path, as does the definition he eventually gave in the book. Hertz then defined the path to be curved if the direction of the displacement from the initial position 0 to the final position 1 on the path changes with the point 1. Then comes the definition of curvature:

Curvature of the path in the point 0 is the velocity with which the direction of this displacement changes relative to the unit of arc length, and evaluated for displacements in the immediate neighborhood of the point. As absolute measure of the curvature we use the double of its velocity.4
Among Hertz’s loose sheets of calculations (Ms 13) there are several pages containing figures and calculations pertaining to this definition of curvature. Fig. 13.1 reproduces one of these figures, which explains the relation between this first definition and the later one found in the published book. Hertz added the following comment to his first definition of curvature:

Note. We take the double in order to follow the usual geometric measurements. However our definition had to differ from the usual geometric definition because we only define differences in direction for those displacements that start from the same initial position⁵.
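The relation between the manuscript definition and the usual one can be checked numerically. The following is a minimal sketch, not anything found in Hertz's notes: it assumes a single point moving on a helix (a hypothetical test curve chosen here because its curvature a/(a² + b²) is known in closed form), measures the angle between the initial direction and the chord from position 0 to a nearby position 1, and doubles the rate at which that angle grows with arc length.

```python
import math

def helix_point(t, a, b):
    """Position on the helix (a cos t, a sin t, b t); a hypothetical test curve."""
    return (a * math.cos(t), a * math.sin(t), b * t)

def angle_between(u, w):
    """Angle between two 3-vectors via atan2, which stays accurate for tiny angles."""
    cross = (u[1] * w[2] - u[2] * w[1],
             u[2] * w[0] - u[0] * w[2],
             u[0] * w[1] - u[1] * w[0])
    dot = sum(x * y for x, y in zip(u, w))
    return math.atan2(math.sqrt(sum(c * c for c in cross)), dot)

def chord_curvature(t, a=2.0, b=1.0):
    """Twice the chord's deviation from the initial direction, per unit arc length."""
    start = helix_point(0.0, a, b)
    end = helix_point(t, a, b)
    chord = tuple(e - s for s, e in zip(start, end))
    tangent = (0.0, a, b)                 # derivative of the helix at t = 0
    arc_length = t * math.hypot(a, b)     # the helix has constant speed sqrt(a^2 + b^2)
    return 2.0 * angle_between(tangent, chord) / arc_length

a, b = 2.0, 1.0
usual_curvature = a / (a * a + b * b)     # textbook curvature of a helix
for t in (0.1, 0.01, 0.001):
    # The doubled chord rate approaches the usual curvature as the chord shrinks.
    print(t, chord_curvature(t, a, b), usual_curvature)
```

Without the factor 2 the limit would be half the usual curvature, which is the point of Hertz's remark that he takes the double "in order to follow the usual geometric measurements".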
Through a Taylor expansion and passing to the limit (an argument rather different from the one found in the book) Hertz then derived the expression (13.1) (except for the factors m and mν) for the curvature in rectangular coordinates, and from that he derived the expression in curvilinear coordinates. Judging from the many calculations (Ms 3, 4) that Hertz made in connection with this transformation of the expression for the curvature into generalized coordinates, it caused him some trouble, and the formula he ended up with is not quite identical to eqn (13.7) because he did not define the quantities aρσ λµ, but other related quantities.

³ For the later introduction of parallel transport, see (Reich 1992).
⁴ ‘Krümmung der Bahn im Punkte 0 nennen wir die Geschwindigkeit mit welcher sich die Richtung jener Verschiebung ändert bezogen auf die Einheit der Bahnlänge und berechnet für Verschiebungen im unmittelbaren Nachbarschaft der Punktes. Als absoluter Maas der Krümmung setzen wir der doppelte seiner Geschwindigkeit fest’ (Ms 3, p. 5).
⁵ ‘Anmerkung. Der doppelte wird festgesetzt um uns den schon feststehenden geometrischen Messungen anzuschliesen. Unsere Definition Müste von den üblichen geometrischen derselbe ?? zwar abweichen weil wir Richtungsunterschied nur für solche Verschiebungen definiert welche von der gleiche Anfangslage ausgehen’ (Ms 3, p. 5).

Fig. 13.1. In the manuscript (Ms 3) Hertz defined the curvature as 2(dv/ds), whereas the more traditional definition found in the book gives the curvature as (dε/ds). That these two definitions are equivalent boils down to the fact that in an isosceles triangle the two angles at the base are equal and therefore dε = 2dv. The figure to the right is Hertz’s own illustration of this fact (Ms 13).

Hertz seems to have grown dissatisfied with this untraditional definition of curvature. A subsequent one-page manuscript (Ms 5) presents an unfinished new approach to this concept. Here, he introduced the angle between two ‘neighboring displacements,’ i.e. infinitely small displacements from infinitely close positions. He first defined such neighboring displacements to be equal if the displacements of the individual points have the same direction and magnitude. That corresponds to the definition in §41 quoted above, but only for neighboring displacements. He then proved that two neighboring displacements are equal if they correspond to the same change in the rectangular coordinates. However, he also remarked that ‘two neighboring displacements that correspond to the same change of the generalized coordinates are generally not equal, but they differ by a quantity of their own order of magnitude multiplied by the order of magnitude of the distance between the two initial points’ (Ms 5)⁶.

⁶ ‘Zwei benachbarte Verrückungen deren gleiche Änderungen der wilkürlichen Coordinaten entsprechen sind im Algemein nicht Gleich, aber [?] unterschieden sich um Größen von ihrer eigenen Größenordnung multipliciert mit der Größenordnung der Abstand der Ausgangslagen’ (Ms 5).

The notion of sameness of neighboring displacements suggests an obvious definition of the angle (s, s′) between two neighboring displacements ds and ds′. Hertz
forgot to introduce this obvious notion in (Ms 5). Instead, he immediately formulated the following analytic formula for this angle:

\[ \cos(s, s') = \sum_{\nu=1}^{3n} \cos(s, x_\nu)\, \cos(s', x_\nu), \tag{13.9} \]
where (s, xν) and (s′, xν) are the angles that ds and ds′ make with the rectangular coordinates of the system (see Section 14.2 for an explanation of this notion). This formula appears in §87 of the Mechanics. Then, Hertz formulated the corresponding formula in generalized coordinates:

Remark: The angle between two neighboring displacements is given by the formula

\[ \cos(s, s') = \cos(s, q_\nu)\, \cos(s', q_\nu) \ldots \tag{13.10} \]

but only with the exception of quantities which are of the order of the distance between the displacements⁷.
This formula appears as follows in §86 of the book:

\[ \cos(s, s') = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} b_{\rho\sigma} \sqrt{a_{\rho\rho}\, a_{\sigma\sigma}}\; \cos(s, q_\rho)\, \cos(s', q_\sigma). \tag{13.11} \]
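As a plausibility check on (13.11), the sketch below compares it with the direct bilinear expression ΣΣ aρσ dqρ dq′σ / (ds ds′) for two displacements from the same position. It is an illustration under stated assumptions: the 2×2 matrix `A` is a hypothetical set of coefficients aρσ (any symmetric positive definite matrix would do), and the cosines with the coordinate directions are taken from the projection formula that the text derives in Section 14.2 as eqn (14.1).

```python
import math

# Hypothetical metric coefficients a_{rho sigma}: symmetric and positive definite.
A = [[2.0, 0.5],
     [0.5, 1.0]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; gives the coefficients b_{rho sigma}."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def inner(a, u, w):
    """Bilinear form sum_{rho,sigma} a_{rho sigma} u_rho w_sigma."""
    return sum(a[r][s] * u[r] * w[s] for r in range(2) for s in range(2))

def cos_with_coordinate(a, dq, rho):
    """cos(s, q_rho): orthogonal projection on the q_rho direction divided by ds."""
    ds = math.sqrt(inner(a, dq, dq))
    lowered = sum(a[rho][s] * dq[s] for s in range(2))
    return lowered / (math.sqrt(a[rho][rho]) * ds)

def cos_direct(a, dq, dq2):
    """Angle cosine straight from the metric."""
    return inner(a, dq, dq2) / math.sqrt(inner(a, dq, dq) * inner(a, dq2, dq2))

def cos_via_13_11(a, dq, dq2):
    """Angle cosine assembled from the cosines with the coordinates, as in (13.11)."""
    b = inverse_2x2(a)
    return sum(b[r][s] * math.sqrt(a[r][r] * a[s][s])
               * cos_with_coordinate(a, dq, r) * cos_with_coordinate(a, dq2, s)
               for r in range(2) for s in range(2))

dq, dq2 = (0.3, -0.1), (0.2, 0.4)
print(cos_direct(A, dq, dq2), cos_via_13_11(A, dq, dq2))   # the two values agree
```

Both functions evaluate the coefficients at one and the same position, which is exactly the restriction Hertz attached to the formula of §86.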
The dots in the quote thus seem to signify the summations and the coefficients, the meaning of which I shall come back to in Section 14.2. However, what is more interesting is that in the book Hertz explicitly remarked: ‘It is to be noticed that the equation §86 [(13.11) above] assumes the same position for the two displacements, whereas the equation of §87 [(13.9) above] is free from this assumption’ (Hertz 1894, §88). Thus, in the book Hertz has given up the attempt to introduce approximate expressions in generalized coordinates for the angle between neighboring displacements.

⁷ ‘Anmerkung: Der Winkel zwischen zwei benachbarten Verrückungen ist zwar gegeben durch die Formel cos ss′ = cos sqν cos s′qν . . . jedoch nur mit Ausschluß von Größen welche von der Ordnung der Abstanden?? der Verrückungen’ (Ms 5).

The idea behind the introduction of such a notion in the early manuscript (Ms 5) seems to be a wish to define the curvature of a path in a more traditional way than the one used in (Ms 3). Unlike this earlier manuscript, (Ms 5) contains a notion of angle between infinitesimal displacements starting at different points (based on an implicit notion of parallel transport), but unlike the published book the notion is (unnecessarily) restricted to neighboring displacements. As we saw above, Hertz in (Ms 3) and in the book first found an expression in rectangular coordinates and then transformed it into an expression involving generalized coordinates. In the note (Ms 5), however, he seems to try to get at the general expression directly, without the detour via rectangular coordinates. However, this is complicated by the fact that formulas such as (13.7) and (13.11) for the angle between two infinitely small displacements only hold when these displacements start
at the same position. Hertz seems to have noticed, though, that in order to find the curvature, one only has to compare directions between two path elements when they have infinitely close initial points. This seems to be the rationale behind the investigation of neighboring displacements.

Hertz’s note (Ms 5) on neighboring displacements breaks off before dealing with the concept of curvature, but the first long draft (Ms 9) of the book continued along this line. First, Hertz introduced the angle between displacements from the same initial position (Ms 9, p. 4), and then he introduced the notion of sameness of neighboring infinitely small displacements as in (Ms 5). However, after the definition he remarked:

Equality of the displacements [of the individual points] here means that the displacements in question do not differ by quantities that are of the order of this displacement multiplied by the magnitude of the distance between the initial points. Both these infinitely small quantities are independent as far as their order is concerned⁸.
So, in contrast to the definition in the note (Ms 5), Hertz now allowed differences of higher order. He then gave the definition of ‘same direction’ and ‘parallel’ for neighboring displacements that he had forgotten in the previous note, and mentioned in a parenthesis the need for defining the angle between neighboring displacements. Then he continued to formulate the following theorem:

Theorem. Two displacements from neighboring positions are equal if they correspond to equal differences in the rectangular coordinates.
Proof. Because then all the displacements of the individual points are equal, independently of the order of the distance between the initial points⁹.
⁸ ‘Gleichheit der Verrückungen soll hier heißen, daß sich die fragliche Verrückungen sich unterscheiden um Größen, welche von der Ordnung der Größe dieser Verrückung multipliciert mit der Größe der Entfernung der Ausgangslagen sind. Beide unendlich kleine Größen sind der ordnung nach von einander unabhängig’ (Ms 9, p. 11).
⁹ ‘Lehrsatz. Zwei Verrückungen aus benachbarten Lagen sind Gleich, wenn ihnen beziehentlich gleiche änderungen der rechtwinkligen Coordinaten entsprechen. Beweis. Denn Alsdann ?? sind hier die Verrückungen aller einzelnen Punkte gleich, unabhängig von der Ordnung der Entfernung der Anfangslagen’ (Ms 9, p. 11).

The proof of the theorem is very strange. First, it does not establish the conclusion of the theorem, namely that the differences of the rectangular coordinates are the same. Secondly, Hertz seems to pass from equality up to infinitesimals of higher order to exact equality without giving any further argument for that. Thirdly, the proof does not at all invoke the rectangular or any other coordinates. Indeed, Hertz had explicitly formulated a note (Ms 9, p. 4), corresponding to §36 in the book, to the effect that the concept of angle is independent of the coordinates used. Thus the proof seems to establish that if the displacements are equal in the sense that the displacements of the individual points are equal up to infinitesimals of higher order, then they are equal in the stricter sense that the displacements of the individual points are exactly equal. In that case the generalization of the concept of equality made in (Ms 9) is, in fact, an empty generalization. It is a pity that this proof is so sketchy and problematic, because it is the only proof in Hertz’s notes where he used the type of
definition that asks for equality up to an infinitesimal of higher order. It is therefore not obvious how Hertz imagined that these definitions should work. After this note Hertz stated the formulas (13.11) and (13.4) (without proofs) and pointed out that the latter holds only up to quantities of the order of the distance between the initial positions of the displacements.

The subsequent treatment of the concept of a path and its curvature is remarkably local in nature. First, Hertz pointed out that a path can be considered as the aggregate of all its (infinitely small) path elements. The direction of a path element is defined as the direction of the displacement from its initial to its final position. A path element is said to be straight when the direction does not change as the path element is traversed, except for changes that vanish relative to the length of the path element. Finally, a path is called straight if all its path elements are straight (Ms 9, pp. 12–13). This approach, in which one goes from the local to the global, is dictated by the fact that Hertz has only defined angles between neighboring displacements. In the same vein, Hertz went on to define first the curvature of a path element as the rate of change of the direction of the path relative to the path length, and then added that the curvature of the path element is also called the curvature of the path. The manuscript continues with derivations of analytic expressions of the curvature, first in rectangular and then in generalized coordinates. Except for insignificant errors of calculation and notational differences, these derivations proceed as in the printed book.

In the second draft of the Mechanics Hertz gave up this local approach and defined angles between displacements with different initial positions as well as straightness of a path and curvature, just as in the published book.

Why did the concept of direction and curvature create such a big problem for Hertz?
More specifically, why did he not immediately define the angle between two displacements from different initial positions in the rather obvious way he used in the printed book? The answers to these questions are not easy to establish from the extant sources. There could be two essentially different types of answer:

1. Hertz for some reason did not see the ‘obvious’ way out and thought that there were essential problems in defining this concept. Said differently, he did not see how to define parallel transport in his geometry.
2. Hertz saw the ‘obvious’ way out, but was reluctant to use it.

1. If Hertz did not initially see how one could define parallel transport, and consequently angles between displacements with different initial points, it may simply be because the concept is not as obvious as it may seem once one has read Hertz’s definitions in the book. In other words, it may be that our question ‘why not’ is in fact an illegitimate historical question, which is only imposed on us because we take our point of departure in later theories. Perhaps the correct question should be: Why and how did Hertz discover the definition he presented in the book? The answer to this question is probably that he discovered that the definitions appealing to higher-order infinitesimals were in fact mathematically unsatisfactory and unnecessarily complicated. However, I think that the original question of ‘why not’ is legitimate in view of the fact that the definition of angles between displacements with different initial positions is the same as the one Hertz suggested in (Ms 5)
for neighboring displacements. In fact, the definition in the book is the same as the earlier one, the only difference being that Hertz does not restrict it to the case where the initial points are infinitely close. It might seem as if something inhibited Hertz from taking this obvious step. What could that be? Is it possible that Hertz, through his mathematics reading or mathematics courses, had been made aware of the problems of this kind that one encounters if one wants to construct a totally intrinsic theory of surfaces or higher-dimensional Riemannian manifolds? In such an intrinsic approach it is, in fact, a problem that there is no direct way in which one can compare the directions of tangent vectors at two different points. In modern parlance they belong to two different tangent spaces. Only through a notion of parallel transport is this possible. In order to define parallel transport one often has recourse to transport along geodesics, but Hertz did not want to introduce this concept until later in the book.

Yet this problem is a pseudoproblem in the case of Hertz’s geometry of systems of points. Indeed Hertz’s configuration space R³ⁿ is equipped with a canonical set of rectangular coordinates that allows him to define two displacements (irrespective of their initial positions) as equal if their differences in the rectangular coordinates are the same. Of course, Hertz may have noticed that the metric (12.1) is not really the Euclidean one in R³ⁿ, but has different coefficients in front of the dxᵢ² terms. However, that does not invalidate the above definition.

It is also possible that Hertz had thought ahead in the book to the parts in which he introduced connections. Indeed, if the mechanical system is constrained by a number of holonomic constraints, then its motions take place in some lower-dimensional curved submanifold of R³ⁿ. Defining parallel transport and intrinsic geodesic curvature in this manifold would be subject to the problems mentioned above.
However, again the problem is not a real one for Hertz’s mechanics. Indeed, when talking about curvature of a path, Hertz never referred to the geodesic curvature of the path in some curved submanifold but always referred to the curvature of the path as a path in the large Euclidean space R³ⁿ in which the curved manifold is embedded. It is this embedding that makes the problem of parallel transport an easy matter. In fact, it would not have been easy for Hertz to define some kind of geodesic curvature, because in the more general case where the constraints are not holonomic, they do not define a lower-dimensional submanifold of R³ⁿ and geodesics do not correspond to paths with minimal curvature. I shall come back to this problem in Chapter 22. The discussion above suggests that the general theory of intrinsic geometry of surfaces or higher-dimensional manifolds might have misled Hertz into believing that there was a problem where in fact there was none.

2. However, it is also possible that Hertz was aware that the embedding of his geometry in R³ⁿ gave an answer to the question of parallel transport, but that he did not want to use it. What could have been his reasons for that? He might have wanted a presentation in which he did not use the embedding, a presentation in which all coordinate systems were on an equal footing, or phrased differently, a presentation where all properties were defined locally. Such an approach to mathematics and physics seems to have been followed consciously by Riemann and later by
Weyl¹⁰, who wanted to infer the global properties from the local ones. We have seen traces of this infinitesimal approach in Hertz’s first extended draft (Ms 9). In fact, such an approach is very much in tune with Hertz’s mechanics, in which differential principles are the fundamental ones, whereas integral principles are only valid in special cases (for holonomic systems). However, if Hertz consciously tried out this local approach to the concept of curvature, he never followed the idea consistently. For example, even in his most ‘local’ presentation in the second draft, he defined the concept of distance first for finite displacements, and only subsequently let the displacement become infinitely small to arrive at the expression for the line element.

¹⁰ See (Scholz 1992), (Bottazzini and Tazzioli 1995), and (Varadarajan 2003).
14 Vector quantities and their components
14.1 Introduction

Ultimately, mechanics is not only about geometric displacements, but about kinematic concepts involving time, such as velocity, momentum, and acceleration. In order to deal with such quantities, Hertz introduced the general concept of a vector quantity with respect to a mechanical system. This concept corresponds to the modern concept of a vector field over a Riemannian manifold. In this connection, Hertz also introduced the covariant components of a vector, to use the modern language of tensor analysis. These are important in Hertz’s mechanics, because the covariant components, or the reduced components as Hertz called them, of the momentum of a system are precisely the quantities that Lagrange had introduced as the generalized momenta of the system. Hertz introduced the reduced components in a beautiful geometric way, as the orthogonal projections on the coordinate directions of the vector (measured conveniently). This geometric interpretation of the generalized momenta reveals the real strength and intuitive appeal of Hertz’s geometry of systems of points.

On the basis of Hertz’s manuscripts I shall argue that his introduction of the reduced components was partly or entirely a result of the role his geometry of systems of points was intended to play in the presentation of mechanics, in particular, in connection with the Lagrangian and Hamiltonian formalisms. However, before turning to the origin of Hertz’s concept of vector quantities and their components, I shall summarize how their definitions and most important properties appear in the Principles of Mechanics.
14.2 Components and reduced components of a displacement

In an ordinary rectangular coordinate system the coordinates of a vector can be found by projecting the vector orthogonally onto the coordinate axes. However, when the coordinate system is not rectangular, this no longer holds true. This is the background for Hertz’s introduction of the reduced components. He first introduced this notion for infinitesimal displacements, and then applied it to other vector quantities.
Hertz first defined the direction of a coordinate as follows:

Definition 1. A displacement in the direction of a definite coordinate is an infinitely small displacement in which only this one coordinate is changed without the remaining ones changing. The direction of all displacements in the direction of the same coordinate from the same position is the same; it is called the direction of the coordinate in that position. (Hertz 1894, §69)

Having emphasized that the direction of a coordinate depends on the choice of the other coordinates, Hertz defined the central notion of the reduced components:

Definition 2. The reduced component of an infinitely small displacement in the direction of a given coordinate is the component of the displacement in the direction of the coordinate [i.e. (the length of) the orthogonal projection onto that direction] divided by the ratio of the change of the coordinate to a displacement in its own direction. The reduced component in the direction of a coordinate is called for shortness the component along the coordinate. (Hertz 1894, §71)
First, note the somewhat confusing choice of language here: One has to distinguish between ‘the component in the direction of a coordinate,’ which is simply the length of the orthogonal projection, and ‘the component along a coordinate,’ which is the orthogonal projection divided by a suitable quantity. In the following, I shall refer to the former as the orthogonal projection and to the latter as the reduced component when there is any doubt about the meaning. However, it would have been clumsy for Hertz to continue to talk about the reduced components, because it is precisely the reduced components, rather than the orthogonal projections, that play a central role in the rest of the book.

Why did Hertz introduce the reduced components, i.e. why did he divide the orthogonal projection by the ‘ratio of the change of the coordinate to a displacement in its own direction’? (Even the wording of the definition seems puzzling at first.) The real answer to that question will gradually emerge as we consider Hertz’s treatment of the Lagrangian and Hamiltonian formalism. However, we can get a superficial answer to the question by considering the analytical expression of the reduced component, denoted dq̄ρ, along the coordinate qρ. The orthogonal projection of a displacement ds on the direction of the coordinate qρ is equal to ds cos(s, qρ), where (s, qρ) means the angle between the displacement ds and a displacement in the direction of the coordinate qρ. Here, cos(s, qρ) can be found from eqn (13.4) by setting the differentials dqρ equal to zero except for the one corresponding to the chosen ρ, and from eqn (12.3), which under the same conditions gives ds = √(aρρ) dqρ. This will lead to the following expression for the orthogonal projection of ds on the direction of the coordinate qρ:

\[ ds \cos(s, q_\rho) = \frac{1}{\sqrt{a_{\rho\rho}}} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\sigma. \tag{14.1} \]
In order to find the reduced component along the coordinate qρ Hertz must divide this quantity by ‘the ratio of the change of the coordinate to a displacement in its own direction.’ When the coordinate qρ changes by the amount dqρ the system is displaced by an amount ds = √(aρρ) dqρ. Thus, the said ratio is dqρ/ds = 1/√(aρρ), and thus the reduced component dq̄ρ of ds along the coordinate qρ can be expressed as follows in terms of the increments of the coordinates:

\[ d\bar{q}_\rho = \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\sigma. \tag{14.2} \]
Thus, the effect of dividing by the said ratio is to get rid of the inconvenient 1/√(aρρ) in front of the summation sign and to arrive at the handy expression (14.2) for dq̄ρ. Conversely, if, following Hertz, we denote the inverse of the matrix {aρσ} by {bρσ}, we have

\[ dq_\rho = \sum_{\sigma=1}^{r} b_{\rho\sigma}\, d\bar{q}_\sigma. \tag{14.3} \]
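The passage from coordinate increments to reduced components and back can be illustrated with a small numerical sketch. The example is an assumption made for illustration only: a single point in the plane described by polar coordinates (r, φ), for which the coefficients are taken to be aρσ = diag(1, r²). The reduced component along φ then comes out as r² dφ, whereas the orthogonal projection on the φ direction is r dφ.

```python
import math

def reduced_components(a, dq):
    """Eqn (14.2): reduced components as sum_sigma a_{rho sigma} dq_sigma."""
    n = len(dq)
    return [sum(a[rho][sigma] * dq[sigma] for sigma in range(n)) for rho in range(n)]

def orthogonal_projections(a, dq):
    """Eqn (14.1): the reduced component divided by sqrt(a_{rho rho})."""
    bars = reduced_components(a, dq)
    return [bars[rho] / math.sqrt(a[rho][rho]) for rho in range(len(dq))]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; gives the coefficients b_{rho sigma}."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def coordinate_increments(b, dq_bar):
    """Eqn (14.3): recover dq_rho as sum_sigma b_{rho sigma} dq_bar_sigma."""
    n = len(dq_bar)
    return [sum(b[rho][sigma] * dq_bar[sigma] for sigma in range(n)) for rho in range(n)]

r = 2.0
a_polar = [[1.0, 0.0],
           [0.0, r * r]]      # assumed coefficients for polar coordinates (r, phi)
dq = [0.1, 0.05]              # increments (dr, dphi)

dq_bar = reduced_components(a_polar, dq)            # (dr, r^2 dphi)
projections = orthogonal_projections(a_polar, dq)   # (dr, r dphi)
recovered = coordinate_increments(inverse_2x2(a_polar), dq_bar)
print(dq_bar, projections, recovered)
```

The round trip through `inverse_2x2` returns the original increments, which is all that eqn (14.3) asserts.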
Since the expression of ds in rectangular coordinates xν, eqn (12.2), contains the weight factors mν, the reduced components are not equal to the coordinates, even in this coordinate system, but are given by the equation (§73):

\[ d\bar{x}_\nu = \frac{m_\nu}{m}\, dx_\nu. \tag{14.4} \]
An infinitesimal displacement of a system can be described either in terms of the increments in its coordinates dqρ or in terms of the reduced components dq̄ρ. As I mentioned in the introduction of the chapter, the reduced components dq̄ρ also appear in modern books on Riemannian geometry (see, e.g. (Kreyszig 1959, pp. 102–103)). There, these components derive their importance from the fact that they transform as covariant vectors under coordinate transformations. That means that if q′ρ is another system of coordinates then

\[ d\bar{q}'_\rho = \sum_{\sigma=1}^{r} \frac{\partial q_\sigma}{\partial q'_\rho}\, d\bar{q}_\sigma. \tag{14.5} \]
This transformation rule is different from the transformation rule for the coordinate increments dqρ, which transform as so-called contravariant vectors:

\[ dq'_\rho = \sum_{\sigma=1}^{r} \frac{\partial q'_\rho}{\partial q_\sigma}\, dq_\sigma. \tag{14.6} \]
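The difference between the two transformation rules can be made concrete with a short sketch. The setting is an assumption chosen for illustration: a single point in the plane, with rectangular coordinates (x, y) as the unprimed system and polar coordinates (r, φ) as the primed one. For a single point mν = m, so by (14.4) the reduced rectangular components coincide with dx and dy; in polar coordinates the coefficients are taken to be diag(1, r²).

```python
import math

def contravariant_to_polar(x, y, dx, dy):
    """Eqn (14.6): dq'_rho = sum_sigma (dq'_rho / dq_sigma) dq_sigma for (r, phi)."""
    r = math.hypot(x, y)
    dr = (x * dx + y * dy) / r
    dphi = (-y * dx + x * dy) / (r * r)
    return dr, dphi

def covariant_to_polar(x, y, dx_bar, dy_bar):
    """Eqn (14.5): dq_bar'_rho = sum_sigma (dq_sigma / dq'_rho) dq_bar_sigma."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x)
    dr_bar = math.cos(phi) * dx_bar + math.sin(phi) * dy_bar
    dphi_bar = -r * math.sin(phi) * dx_bar + r * math.cos(phi) * dy_bar
    return dr_bar, dphi_bar

x, y = 1.2, 0.9            # position, so r = 1.5
dx, dy = 0.01, -0.02       # a small displacement

dr, dphi = contravariant_to_polar(x, y, dx, dy)
dr_bar, dphi_bar = covariant_to_polar(x, y, dx, dy)   # single point: dx_bar = dx

# Consistency with eqn (14.2) in polar coordinates: dr_bar = dr, dphi_bar = r^2 dphi.
r = math.hypot(x, y)
print(dr_bar - dr, dphi_bar - r * r * dphi)   # both differences vanish up to rounding
```

The two rules give different numbers for the φ component (dφ versus r² dφ), yet each is consistent with lowering the index by the polar coefficients.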
However, Hertz did not emphasize these transformation properties of the coordinates and the reduced components. In fact he never wrote down eqns (14.5) and (14.6) in their generality, but only for the case that one of the coordinate systems is the rectangular system xν (Hertz 1894, §57, 80). Instead he derived many other analytic formulas involving the reduced components, for example the following expressions for the line element (Hertz 1894, §82)

\[ ds^2 = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\rho\, dq_\sigma = \sum_{\rho=1}^{r} dq_\rho\, d\bar{q}_\rho = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} b_{\rho\sigma}\, d\bar{q}_\rho\, d\bar{q}_\sigma \tag{14.7} \]
and the corresponding expressions for the angle between two displacements ds and ds′ (Hertz 1894, §85)

\[ ds\, ds' \cos(s, s') = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\rho\, dq'_\sigma = \sum_{\rho=1}^{r} dq_\rho\, d\bar{q}'_\rho = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} b_{\rho\sigma}\, d\bar{q}_\rho\, d\bar{q}'_\sigma. \tag{14.8} \]

In order to facilitate the introduction of the Lagrangian formalism Hertz also expressed the reduced components of an infinitesimal displacement as a partial derivative of the line element ds. Since one can describe an infinitesimal displacement of a system both by its coordinates and by its reduced components, Hertz distinguished between two types of partial derivatives: If one considers the displacement as being given in terms of the coordinates and one makes a change in one of the coordinates, while keeping the others fixed, the corresponding partial differential of ds is denoted by $\partial_q ds$. If, on the other hand, one considers the displacement as given by its reduced components, and one makes a change in one of them while keeping the rest fixed, the corresponding partial differential of ds is denoted by $\partial_p ds$. Hertz did not explain the choice of the index p, but it finds its explanation later in the book, where displacements are usually described by their coordinates, whereas momenta, which are denoted p, are described by their reduced components. With this notation the formula (14.2) and the formula (14.7) for the line element give

\[ d\bar{q}_\rho = \sum_{\sigma=1}^{r} a_{\rho\sigma}\, dq_\sigma = \frac{1}{2} \frac{\partial_q\, ds^2}{\partial\, dq_\rho} = ds\, \frac{\partial_q\, ds}{\partial\, dq_\rho}. \tag{14.9} \]
14.3 Vector quantities

In the seventh and final chapter of the first book, Hertz introduced kinematic concepts, i.e. concepts that depend on the concept of time. In order to be able to define such concepts in his geometry of systems of points, he began the chapter by introducing vector quantities with regard to a system. They correspond to the modern concept of a vector field over a Riemannian manifold. However, since Hertz had not introduced an explicit notion of tangent space, his definition was different from the modern definition:

Definition. A vector quantity with regard to a system is any quantity which bears a relation to the system, and which has the same kind of mathematical manifold as a conceivable displacement of the system. (Hertz 1894, §237)
A conceivable displacement is any displacement of the system, including displacements that do not satisfy the connections (see Chapter 15). This definition may not seem very precise, but the subsequent note makes it clear how one should operate with vector quantities:

Note 2. Every vector quantity with regard to a system can be represented geometrically by a conceivable displacement of the system. The direction of the displacement representing it is called the direction of the vector quantity. The measure of the representation can and will always be so chosen that the displacement representing it is infinitely small¹. Every vector with regard to a system which changes with the position of the system can then be represented as an infinitely small displacement of the system from the position to which its instantaneous value belongs. (Hertz 1894, §239)
Since vector quantities are represented by infinitely small displacements they satisfy what we today would call the rules of a vector field. Moreover, the definition allowed Hertz to adapt to vector quantities all the concepts he had introduced for infinitesimal displacements. In particular, vector quantities have reduced components along the coordinates. Hertz generally called them kρ (or hν if referring to the rectangular coordinates xν ). From now on I shall follow Hertz and leave out the word ‘reduced.’ Hertz could also adapt the formulas for infinitesimal displacements to apply to vector quantities in general. However, two complications arose. When describing a mechanical system Hertz would allow a choice of generalized coordinates that could describe all the possible displacements, (i.e. all those that do not violate the connections (see Chapter 15 below)) but not necessarily all the conceivable displacements. That means that a vector quantity can always be expressed in terms of its components hν along the rectangular coordinates, and from those one can deduce the components along the generalized coordinates kρ by a formula similar to eqn (14.5). However, a vector quantity cannot, in general, be described completely in terms of its components kρ along the generalized coordinates, except if the vector quantity can be represented by a possible displacement. That means that one cannot, in general, deduce the rectangular components of a vector quantity from its components along a system of generalized coordinates. The other complication, which had fundamental consequences for the principles of mechanics, is that one cannot compare or add vector quantities with regard to different systems. The only exception is when one system is a part of another system. In that case one may consider a vector quantity with respect to the partial system as a vector quantity of the larger system as well. 
¹ ‘Unendlich klein’ is wrongly translated into ‘indefinitely small’ in the English version of the book.

However, due to the reduction of the components by the factor √(aρρ), the components of the vector quantity will not be the same in the two systems, even if the coordinates of the partial system are considered as coordinates of the complete system. For example, in §254 Hertz determined the components h′ν along the rectangular coordinates x′ν of the complete system in terms of the components hν along the rectangular coordinates xν of the partial system. Let m denote the mass of the partial system and m′ denote the mass of the complete system. If now the partial system suffers a displacement, which is also a displacement of the complete system, then dx′ν = dxν for the common coordinates, whereas dx′ν = 0 for the remaining coordinates of the complete system. However, according to eqn (14.4), m dx̄ν = mν dxν and m′ dx̄′ν = mν dx′ν, and consequently m′ dx̄′ν = m dx̄ν. If a vector quantity is represented by means of this displacement, the component hν along xν is proportional to dx̄ν and the component h′ν along x′ν is proportional to dx̄′ν. Thus we get:

\[ m' h'_\nu = m h_\nu \tag{14.10} \]

for every ν that the systems have in common, whereas h′ν = 0 for the remaining coordinates. Similarly, if a vector quantity with regard to a partial system of mass m is described by the generalized coordinates qρ and these coordinates are also coordinates of the complete system of mass m′ (where for clarity we call them q′ρ), and if the remaining coordinates of the complete system are not coordinates of the partial system, we have (Hertz 1894, §255):

\[ m' k'_\rho = m k_\rho \tag{14.11} \]

for the common coordinates, whereas m′k′ρ = 0 for the remaining coordinates.
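The rescaling between a partial and a complete system can be illustrated with a few lines of arithmetic. The sketch below rests on assumptions made only for the example: the partial system is a single point of mass 1, the complete system adds a second point of mass 3, and the reduced rectangular components are computed from eqn (14.4), with mν the mass of the point to which the coordinate xν belongs. Following the convention reconstructed above, primed quantities refer to the complete system.

```python
def reduced_rectangular_components(point_masses, dx):
    """Eqn (14.4): dx_bar_nu = (m_nu / m) dx_nu, three coordinates per point."""
    total_mass = sum(point_masses)
    mass_per_coordinate = [mass for mass in point_masses for _ in range(3)]
    return [m_nu / total_mass * d for m_nu, d in zip(mass_per_coordinate, dx)]

partial_masses = [1.0]             # hypothetical partial system: one point
complete_masses = [1.0, 3.0]       # hypothetical complete system: two points

dx_partial = [0.1, 0.2, 0.3]                 # displacement of the partial system
dx_complete = dx_partial + [0.0, 0.0, 0.0]   # the second point stays at rest

h = reduced_rectangular_components(partial_masses, dx_partial)
h_prime = reduced_rectangular_components(complete_masses, dx_complete)

m = sum(partial_masses)            # mass of the partial system
m_prime = sum(complete_masses)     # mass of the complete system

for nu in range(3):
    # Common coordinates: m' h'_nu equals m h_nu, as in eqn (14.10).
    print(m_prime * h_prime[nu], m * h[nu])
print(h_prime[3:])                 # remaining coordinates: all zero
```

The components themselves differ by the factor m/m′ = 1/4, which is why a vector quantity of the partial system cannot simply be copied component by component into the complete system.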
14.4 Kinematic concepts

Having introduced the concepts of motion and path of a mechanical system in a straightforward way, Hertz defined the velocity of a system as its ‘instantaneous rate of motion’ or, expressed analytically,2 $\dot{s} = ds/dt$. This is a vector quantity with regard to the system, whose magnitude $v$ is (by eqn (12.2)) given as the positive root of the equation

$$m v^2 = \sum_{\nu=1}^{3n} m_\nu \dot{x}_\nu^2, \tag{14.12}$$

or, expressed in generalized coordinates, the positive root of the equation

$$v^2 = \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, \dot{q}_\rho \dot{q}_\sigma \tag{14.13}$$

with the components along $q_\rho$ equal to $\sum_{\sigma=1}^{r} a_{\rho\sigma} \dot{q}_\sigma$. Thus, we see how the geometry of systems of points allowed Hertz to define velocity for a system by complete analogy with the usual definition of the velocity of a single point. Similarly, he defined the momentum of a system by analogy with the definition of the momentum of one point mass as ‘the product of the mass of a system into its velocity’ (Hertz 1894, §268). Its component $p_\rho$ along the generalized coordinate $q_\rho$ (also called the momentum along this coordinate) is, according to the above, given as

$$p_\rho = m \sum_{\sigma=1}^{r} a_{\rho\sigma} \dot{q}_\sigma. \tag{14.14}$$

2 Hertz used dots to represent derivatives with respect to time.
The energy of a system is also defined by the same formula as the kinetic energy of a single point mass in ordinary mechanics, i.e. as $E = \tfrac{1}{2} m v^2$ (recall there is only kinetic energy in Hertz’s mechanics). In rectangular coordinates this gives, from eqn (14.12)

$$E = \frac{1}{2} \sum_{\nu=1}^{3n} m_\nu \dot{x}_\nu^2. \tag{14.15}$$

Thus, as we remarked in Section 12.4, the line element $ds$ is so chosen that the definition of energy as $\tfrac{1}{2} m (ds/dt)^2$ will give the usual kinetic energy (14.15) of the system. Expressed in generalized coordinates we get from eqns (14.13) and (14.14)

$$E = \frac{1}{2} m \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} a_{\rho\sigma}\, \dot{q}_\rho \dot{q}_\sigma = \frac{1}{2} \sum_{\rho=1}^{r} p_\rho \dot{q}_\rho = \frac{1}{2m} \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} b_{\rho\sigma}\, p_\rho p_\sigma, \tag{14.16}$$

where, as above, $\{b_{\rho\sigma}\}$ denotes the inverse of the matrix $\{a_{\rho\sigma}\}$. From this expression of the energy Hertz derived the following expression for the momenta along a generalized coordinate:

$$p_\rho = \frac{\partial_q E}{\partial \dot{q}_\rho} \tag{14.17}$$

as well as:

$$\dot{q}_\rho = \frac{\partial_p E}{\partial p_\rho}, \tag{14.18}$$
where we have used the index convention explained above. The first of these equations, which is also a consequence of eqn (14.9), is identical to the definition of the generalized momenta as given by Lagrange. The only difference is that Lagrange had used the Lagrangian L = T − V (i.e. the difference between the kinetic and the potential energy) instead of E in the above formula, but since an isolated system has only kinetic energy in Hertzian mechanics, this makes no difference. As I shall argue below, this accordance between the components of the momentum and the usual generalized momenta was the reason why Hertz introduced the reduced components, rather than continuing to work with the orthogonal projections on the directions of the coordinates. Finally, Hertz defined the acceleration f of a system as the rate of change of its velocity, and showed with an argument similar to the one leading to the expression (13.7) of the curvature that its components along the generalized coordinates are given by (§277):

$$f_\rho = \sum_{\sigma=1}^{r} a_{\rho\sigma} \ddot{q}_\sigma + \sum_{\sigma=1}^{r} \sum_{\tau=1}^{r} \left( \frac{\partial a_{\rho\sigma}}{\partial q_\tau} - \frac{1}{2} \frac{\partial a_{\sigma\tau}}{\partial q_\rho} \right) \dot{q}_\sigma \dot{q}_\tau \tag{14.19}$$
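or, in terms of the energy (§291, 294):

$$m f_\rho = \frac{d}{dt}\left( \frac{\partial_q E}{\partial \dot{q}_\rho} \right) - \frac{\partial_q E}{\partial q_\rho}, \qquad \rho = 1, 2, \dots, r \tag{14.20}$$

$$\phantom{m f_\rho} = \frac{d}{dt}\left( \frac{\partial_q E}{\partial \dot{q}_\rho} \right) + \frac{\partial_p E}{\partial q_\rho}, \qquad \rho = 1, 2, \dots, r \tag{14.21}$$

$$\phantom{m f_\rho} = \dot{p}_\rho + \frac{\partial_p E}{\partial q_\rho}, \qquad \rho = 1, 2, \dots, r. \tag{14.22}$$

In the middle line we have used that

$$\frac{\partial_p E}{\partial q_\rho} = -\frac{\partial_q E}{\partial q_\rho} \tag{14.23}$$

(see §292), which Hertz deduced from the equality

$$\frac{\partial_q\, ds}{\partial q_\rho} = -\frac{\partial_p\, ds}{\partial q_\rho}, \tag{14.24}$$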
which is, in turn, a consequence of eqn (14.3). The first of the above equations (14.20) corresponds to the Lagrangian equations for an isolated system in ordinary mechanics. However, in ordinary mechanics, where E must be replaced by the Lagrangian L, the equation has empirical content. In Hertz’s mechanics it is a mathematical consequence of the geometric and kinematic definitions. If the acceleration is resolved into a component in the direction of the path, and a component perpendicular to the path, Hertz showed that the former, $f_t$, is equal to $\ddot{s}$ or $\dot{v}$ and the latter, $f_r$, is equal to $cv^2$, where $c$ denotes the curvature of the path (Hertz 1894, §280–281).
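The identities (14.16), (14.17) and (14.18) can be verified on a concrete case. The sketch below is only an illustration under stated assumptions: a system with two generalized coordinates, a constant symmetric matrix $\{a_{\rho\sigma}\}$, and a mass and velocity chosen arbitrarily. All names (`momentum`, `energy_q`, `energy_p`, `partial`) are hypothetical. Because the energy is quadratic, a central difference gives its partial derivatives exactly, so exact fractions can be compared with `==`.

```python
from fractions import Fraction as F

m = F(4)                                   # assumed total mass
a = [[F(2), F(1)], [F(1), F(3)]]           # assumed constant, symmetric {a_rs}
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
b = [[a[1][1] / det, -a[0][1] / det],      # {b_rs}: inverse of {a_rs}
     [-a[1][0] / det, a[0][0] / det]]


def momentum(qdot):
    """Eqn (14.14): p_r = m * sum_s a_rs * qdot_s."""
    return [m * sum(a[r][s] * qdot[s] for s in range(2)) for r in range(2)]


def energy_q(qdot):
    """E as a function of the velocities: (m/2) * sum a_rs qdot_r qdot_s."""
    return m / 2 * sum(a[r][s] * qdot[r] * qdot[s]
                       for r in range(2) for s in range(2))


def energy_p(p):
    """E as a function of the momenta: (1/(2m)) * sum b_rs p_r p_s."""
    return sum(b[r][s] * p[r] * p[s]
               for r in range(2) for s in range(2)) / (2 * m)


def partial(f, x, i, h=F(1)):
    """Central difference in the i-th argument (exact for quadratic f)."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)


qdot = [F(1), F(2)]
p = momentum(qdot)

# The three expressions for E in eqn (14.16).
E_velocities = energy_q(qdot)
E_mixed = sum(p[r] * qdot[r] for r in range(2)) / 2
E_momenta = energy_p(p)

# Eqn (14.17): p_r = dE/d(qdot_r); eqn (14.18): qdot_r = dE/dp_r.
p_from_E = [partial(energy_q, qdot, r) for r in range(2)]
qdot_from_E = [partial(energy_p, p, r) for r in range(2)]

print(E_velocities == E_mixed == E_momenta)
print(p_from_E == p, qdot_from_E == qdot)
```

Taking $\{a_{\rho\sigma}\}$ independent of the coordinates keeps the sketch short; it means the terms in (14.19) to (14.22) that involve derivatives with respect to $q_\rho$ are not exercised here.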
14.5 Vector quantities in Hertz’s drafts

In his preliminary notes on the geometry of systems of points, Hertz did not mention kinematic quantities such as velocity, momentum, acceleration, and energy. In fact, except for one occurrence in the very first note, the concept of time did not enter into these early experiments with the formalism. Hertz had apparently seen from the start that the introduction of kinematical notions would be trivial, once the geometric concepts had been introduced in a suitable way. In the first long draft of the book (Ms 9), the kinematic concepts were defined in a way that is very similar to the way they later appeared in the printed book. The only significant difference is that Hertz did not introduce the concept of a vector quantity. Instead, he defined the components of the velocity as the time derivative of the components of the displacements, and similarly for the other vector quantities momentum and acceleration. He then operated with these components as though they were components of a displacement, without making any explicit remarks about it.
However, in the second draft, Hertz felt the need to introduce an explicit concept of vector quantity that would capture the idea of those quantities that behave as displacements, without being displacements themselves:

Notation: A quantity which has the same kind of manifold as the displacements of a system, and which can be represented by such a displacement, we will denote a vector quantity. We speak of magnitude, direction, components of a vector quantity, as of the corresponding magnitudes of a displacement, similarly of sums or differences of vector quantities.3
This definition is similar to the one that appeared in the book, but it is placed much earlier in the manuscript, in fact just after the introduction of the notion of a finite displacement. Therefore, in this definition Hertz did not insist that a vector quantity be represented by an infinitely small displacement, and he did not mention that it can be represented by a conceivable, as opposed to a possible, displacement. Moreover, Hertz did not formulate any properties of or formulas for vector quantities, and he continued to introduce magnitudes and directions for each vector quantity separately. Only in the third draft did Hertz move the introduction of the concept of vector quantity to the beginning of the chapter on kinematics, and gave it a treatment almost identical to the one he later published. So, the concept of vector quantity was not present in an explicit form in Hertz’s mind when he began working on his mechanics and only gradually emerged during his rewriting of the material. However, its initial absence had no consequence for the technical content of Hertz’s theory, since from the start he chose to deal with velocity, momentum and acceleration as if they were infinitely small displacements. The introduction of vector quantities only led to a conceptual clarification.
14.6 The (reduced) components in Hertz’s drafts

By contrast, Hertz’s introduction of the reduced components had a direct influence on the technical formalism of his mechanics. In the very first note on mechanics (Ms 1) as well as in the first full draft of the book (Ms 9), Hertz introduced the component $d\bar{q}_\rho$ of an infinitely small displacement $ds$ in the direction of $q_\rho$ as the orthogonal projection of $ds$ onto that direction. Thus, the $d\bar{q}_\rho$ of the first manuscript and the first draft is not the reduced component as it is in the published book, but the orthogonal projection. That means it differs from the $d\bar{q}_\rho$ of the book by a factor $\sqrt{a_{\rho\rho}}$. Therefore, many of the analytical formulas of the first draft differ from those of the published book. For example, having defined the momentum of a system, he found its components4 to be expressed by

$$p_\rho = \frac{m}{\sqrt{a_{\rho\rho}}} \sum_{\sigma=1}^{r} a_{\rho\sigma} \dot{q}_\sigma \tag{14.25}$$

rather than by eqn (14.14), and, in terms of the energy, as

$$p_\rho = \frac{1}{\sqrt{a_{\rho\rho}}} \frac{\partial_q E}{\partial \dot{q}_\rho} \tag{14.26}$$

rather than eqn (14.17). That means that in the first draft $p_\rho$ is not the usual Lagrangian generalized momentum conjugate to $q_\rho$ but differs from it by a factor $1/\sqrt{a_{\rho\rho}}$. As a consequence, in the first draft Hamilton’s equations at first appear as follows:

$$\dot{q}_\rho = \frac{\partial_p E}{\partial(\sqrt{a_{\rho\rho}}\, p_\rho)} \tag{14.27}$$

$$\frac{d}{dt}\left( \sqrt{a_{\rho\rho}}\, p_\rho \right) = -\frac{\partial_p E}{\partial q_\rho} \tag{14.28}$$

rather than in the following form printed in the book:

$$\dot{q}_\rho = \frac{\partial_p E}{\partial p_\rho} \tag{14.29}$$

$$\dot{p}_\rho = -\frac{\partial_p E}{\partial q_\rho}. \tag{14.30}$$

3 ‘Bezeichnung: Eine Größe welche dieselbe G??? der Manigfaltigheit besitzt wie die Verrückung eines Systems und welche als durch eine solche Verrückung dargestellt ??? bezeichnen wir als eine Vectorgröße [sic!]. Wir sprechen von Größe, Richtung, Componenten einer Vectorgröße wie von den entschprechenden Größen einer verrückung, ebenso von Summen oder Differenzen von Vectorgrößen.’

4 In the manuscripts the momentum in the direction of $q_\rho$ is called $h_\rho$ (Ms 12, p. 14).
The latter two equations present the usual appearance of Hamilton’s equations. To be sure, in ordinary presentations of mechanics the energy function E is replaced by the Hamiltonian. However, since an isolated system has no potential energy in Hertz’s mechanics, the Hamiltonian is, in fact, equal to the energy. It seems to be the discrepancy between the formulas (14.27) and (14.28) and the usual appearance of Hamilton’s equations that made Hertz introduce the reduced components. Indeed, at this place in his first draft he defined a new quantity equal to $\sqrt{a_{\rho\rho}}\, p_\rho$, i.e. equal to the reduced component of the momentum along $q_\rho$, and he then wrote up the familiar formulation of Hamilton’s equations in terms of these new variables. On page 51 of the manuscript, where Hertz introduced this quantity, he did not give it any particular name. However, on a page inserted in the manuscript between pages 10 and 11 he remarked:

NB. reduced component = $\sqrt{a_{\rho\rho}}\, dq_\rho$ (Ms 9, p. 10a).
This naming of the reduced components of a displacement seems to have been added after Hertz had discovered the need for such a concept in connection with Hamilton’s equations. Indeed, the remainder of the inserted page deals with the use of partial derivatives, and is concerned with the derivation of some formulas used in his derivation of Hamilton’s equations, e.g. formula (14.24). Hertz seems to have seen
the need for these formulas while deriving Hamilton’s equations, and since they belong naturally just after the introduction of the orthogonal projection of a displacement, he inserted the page at this place. The conjecture that the definition of the reduced component was indeed inserted at a late moment in the composition of the first draft is corroborated by the fact that the concept is not used anywhere else in the draft, even in the many instances where it would have come in handy. In the second draft of the book Hertz defined dqρ almost as in the printed book, and the reduced components obtained their central place in the geometric architecture of the mechanics. That means that the analytic formulas, including Hamilton’s equations, got their more familiar form. However, when dealing with infinitesimal displacements in the direction of a coordinate Hertz used the term ‘moment(um)’5 rather than reduced component. He did not explain why he chose this name. It may have been suggested to him by the analogy with Lagrange’s generalized momentum conjugate to a generalized coordinate but it is also possible that it was simply a natural term to use for a weighted sum like eqn (14.2). When Hertz introduced kinematic concepts later in the second draft he returned to the term ‘reduced component,’ and this was then used consistently also for infinitesimal displacements in the third draft.
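The difference between the first-draft form (14.27) and the printed form (14.29) of Hamilton’s first equation can be made concrete. The following sketch rests on assumptions that are not in the source: two generalized coordinates, a constant matrix $\{a_{\rho\sigma}\}$ whose diagonal entries are perfect squares (so that the square roots stay exact), and arbitrarily chosen mass and velocity. The names `energy_from_reduced`, `energy_from_projection` and `partial` are hypothetical. Here the ‘projection’ momentum is the quantity of eqn (14.25), i.e. the reduced momentum divided by $\sqrt{a_{\rho\rho}}$.

```python
from fractions import Fraction as F

m = F(2)                                   # assumed total mass
a = [[F(4), F(1)], [F(1), F(9)]]           # assumed constant {a_rs}
root = [F(2), F(3)]                        # sqrt(a_00), sqrt(a_11)
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
b = [[a[1][1] / det, -a[0][1] / det],      # inverse of {a_rs}
     [-a[1][0] / det, a[0][0] / det]]


def energy_from_reduced(p):
    """E in terms of the reduced momenta, as in eqn (14.16)."""
    return sum(b[r][s] * p[r] * p[s]
               for r in range(2) for s in range(2)) / (2 * m)


def energy_from_projection(p_proj):
    """E in terms of the first-draft momenta of eqn (14.25)."""
    return energy_from_reduced([root[r] * p_proj[r] for r in range(2)])


def partial(f, x, i, h=F(1)):
    """Central difference in the i-th argument (exact for quadratic f)."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)


qdot = [F(1), F(1)]
p_reduced = [m * sum(a[r][s] * qdot[s] for s in range(2)) for r in range(2)]
p_proj = [p_reduced[r] / root[r] for r in range(2)]

# Differentiating E with respect to the projection momenta does NOT give qdot:
naive = [partial(energy_from_projection, p_proj, r) for r in range(2)]
# Differentiating with respect to sqrt(a_rr) * p_r, as in (14.27), does:
first_draft = [naive[r] / root[r] for r in range(2)]
# With the reduced momenta the familiar form (14.29) holds directly:
printed = [partial(energy_from_reduced, p_reduced, r) for r in range(2)]

print(naive, first_draft == qdot, printed == qdot)
```

The division by `root[r]` implements the derivative with respect to $\sqrt{a_{\rho\rho}}\,p_\rho$ at fixed coordinates, which is only a rescaling of the variable. The second equation, (14.28) versus (14.30), would need a coordinate-dependent $\{a_{\rho\sigma}\}$ and is not covered by this sketch.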
14.7 The origin of Hertz’s tensor analysis

Hertz’s use of the term ‘moment’ has a striking parallel in Levi-Civita’s The Absolute Differential Calculus (1925, see p. 92). In this classic treatise Ricci’s co-founder of tensor calculus used this term to denote $(dq_\rho/ds)$. As in Hertz’s second draft this term is only used for a ‘displacement.’ When dealing with general vectors, Levi-Civita used the terms covariant and contravariant components. Of course, a direct influence from Hertz’s draft seems out of the question. However, it is possible that the same reasons made the two choose the term ‘moment.’ Or is it possible that the two had the term from some common source? In other words, is it conceivable that Hertz lifted his ideas about differential geometry and, in particular, the very important idea of the reduced components from a mathematical paper on this subject? As I have already mentioned in Section 13.2, Hertz himself claimed in the preface of the Principles of Mechanics that he only got to know the work of his mathematical precursors when he had already developed his own ideas rather far. It is, of course, possible that this remark only applies to the idea of applying differential geometry to mechanics. However, since Hertz explicitly quoted Beltrami’s purely mathematical paper (Beltrami 1868b) in this connection, the remark seems to signify that he did not know of the more recent literature on the mathematical theory of high-dimensional differential geometry and the rudiments of tensor calculus, or the absolute differential calculus as it was called before Einstein. Indeed, he quoted neither (Christoffel 1869) nor Ricci’s papers that began to appear (in Italian) from 1884, and it is not even quite obvious if knowledge of these papers would have helped him much. Indeed, as pointed out by Karin Reich in her thorough book on the development of tensor calculus (Reich 1994), the mathematicians approached the subject from the point of view of algebraic invariant theory, which is very different from Hertz’s approach. To be sure, Lipschitz did use analogies with mechanics to suggest ideas in his 1870 studies of Riemannian geometry and tensor calculus, and others like Voss (Voss 1880) used the ideas of tensor calculus to study curvature of Riemannian manifolds, but there is no indication that Hertz should have known these works6. Furthermore, neither these works, nor any other work on tensor analysis prior to 1894, seem to give a geometric interpretation of the covariant components (Hertz’s reduced components) of a vector as the orthogonal projections on the coordinate directions suitably reduced. Conversely, as mentioned above, Hertz did not show any particular interest in those properties of tensors that interested the contemporary mathematicians, namely their general transformation properties. This seems to corroborate the point of view already put forward in Section 13.2 that Hertz developed his geometry of systems of points independently of the works of the contemporary mathematicians, on the basis of Gauss’s, Riemann’s, and Helmholtz’s work. I have also in the previous section argued that Hertz’s introduction of the reduced component, first of an infinitesimal displacement and then of a general vector quantity, was a result of the mechanical context, in particular Lagrange’s definition of generalized momenta and Hamilton’s equations. At least the drafts of Hertz’s book suggested that this mechanical context made Hertz introduce the factor $1/\sqrt{a_{\rho\rho}}$ so as to reduce the orthogonal projections on the coordinate directions and arrive at what we now call the covariant components. However, this leaves us with the question: Why did Hertz consider the orthogonal projections in the first place?

5 In German, Hertz used the word Moment, which can be translated both as ‘moment’ and as ‘momentum.’
Here, we are left guessing, for the orthogonal projections are already introduced in Hertz’s very first notes, and he did not explain why they are of importance. Of course, it is possible that Hertz simply tried to play with the orthogonal projections, because these quantities are of importance in ordinary geometry. However, I think it is more likely that he somehow saw that the usual Lagrangian definition of momenta conjugate to a generalized coordinate could be interpreted as an orthogonal projection of the momentum onto the coordinate direction (suitably reduced), and that this suggested the study of the orthogonal projections of displacements and of other vector quantities. If this reconstruction is correct, the influence of the mechanical content on the mathematical form is even stronger than suggested in the previous paragraph. The main argument against such a strongly mechanical reconstruction of Hertz’s way to the introduction of orthogonal projections on the coordinate directions is that it becomes rather strange why Hertz would not have introduced the reduced components from the outset. The reason could be that Hertz tried to work with geometrically intuitive concepts, and he may not have considered the strange reduction by the factor $1/\sqrt{a_{\rho\rho}}$ as a natural step in a geometric presentation. Indeed, the many different formulations in the subsequent drafts of what became Definition 2 in §71 of the book show that Hertz had great difficulties in formulating the reduction in a way that satisfied himself. In fact, since the covariant transformation properties of the reduced components are not emphasized by Hertz, the definition of the reduced components in the book appears rather perplexing. So, Hertz may at first have resolved to try to do without the reduction, but later discovered that this led to such unfamiliar and inconvenient formulations of the principles of mechanics that it was preferable to make the reduction by the factor $1/\sqrt{a_{\rho\rho}}$, unnatural as it may seem at first.

6 Hertz referred to one paper by Lipschitz, namely the one on mechanics from 1872, but he did not mention (Lipschitz 1869).
14.8 Interaction between physical content and mathematical form

The conclusion of the above discussion is that it was the physical content of the mechanics that made Hertz introduce one of the most striking concepts of his new differential geometric formalism, namely the concept of reduced components of a vector quantity, a concept that is identical to the modern concept of the covariant components. The drafts strongly suggest that at the least it was the Lagrangian definition of generalized momenta combined with Hamilton’s equations that made Hertz introduce the factor $1/\sqrt{a_{\rho\rho}}$ that reduces the orthogonal projection to the covariant component. It is even likely (but uncorroborated by the manuscripts) that it was the mechanical definition of generalized momenta that suggested the importance of the orthogonal projections to Hertz in the first place. Conversely, we saw in the previous chapter how considerations about the geometric formalism (the derivation of the line element) led Hertz to introduce an important physical concept in his image of nature, namely the Massenteilchen. These examples illustrate that there was a strong connection between the physical content and the mathematical form in Hertz’s development of his Principles of Mechanics. This reflects a more general rule concerning the development of so-called applied mathematics. I write so-called, because the term applied mathematics seems to suggest that what goes on in this area is that one takes an already existing mathematical theory and applies it to something else, e.g. to a branch of natural science. With such an understanding of applied mathematics, mathematics does not benefit from being applied, and the natural science only benefits in so far as the mathematical formalism may allow better and speedier calculations of theoretical consequences. However, in reality, ‘applications of mathematics’ rarely (or never) take such a form.
It usually (always) turns out that the piece of mathematics needed for natural science is not available in a form ready to be applied, and it also turns out that the scientific theory is not precise or quantitative enough to allow the piece of mathematics to be applied. Moreover, it often happens that the mathematical formalism suggests entirely new scientific concepts and ways of thinking about the natural phenomenon. As examples, let me just mention the development of Fourier series in connection with the theory of heat conduction, the development of the concept of energy, that owes much to the mathematical theory of potentials, and the simultaneous development of quantum mechanics and the theory of Hilbert spaces and operators on them.
An often-mentioned example of the miraculous application to nature of a piece of pure mathematics is Einstein’s application of tensor calculus to his general theory of relativity. If one only considers the development of tensor calculus prior to Einstein in the context of algebraic invariant theory, this seems to be an example of the application of a piece of pure mathematics to a hitherto unrelated field of physics. However, although algebraic invariant theory was a strong trend of the development of tensor calculus prior to 1915, the development had other facets. Tensors were also developed in crystallography (in fact the name tensor comes from this line of development) (Reich 1994) and at least Lipschitz was influenced by considerations of mechanics. Moreover, there is clear evidence that although most of the papers on the absolute differential calculus were written in a strict analytic algebraic style, the authors were aware of the geometrical interpretations of the content but chose an ageometric presentation because it was more rigorous. And with our conclusions about Hertz, we can add that in Hertz’s Mechanics one finds an independent development of tensorial ideas influenced directly by a mechanical context. When all these factors are taken into consideration, the application of tensor calculus to relativity theory seems to conform better to a theory about mutual stimulation between mathematics and physics than to a theory of miraculous application of a purely mathematical theory to a branch of nature. Hertz’s own application of differential geometry to mechanics is an example of mutual stimulation.
15 Connections. Material systems
Any image of mechanics must be able to account for interactions between different parts of a system. In the ordinary image, interactions are caused by forces; in Hertz’s image they are due to constraints or connections, as Hertz called them1. They can be expressed in terms of first-order homogeneous differential equations in the coordinates:

$$\sum_{\nu=1}^{3n} x_{\iota\nu}\, dx_\nu = 0, \qquad \iota = 1, 2, \dots, i, \tag{15.1}$$

(Hertz 1894, §128) or equivalently, in generalized coordinates

$$\sum_{\rho=1}^{r} q_{\chi\rho}\, dq_\rho = 0, \qquad \chi = 1, 2, \dots, k, \tag{15.2}$$
(§130), where xιν and qχρ are continuous functions of x1 , . . . , x3n and q1 , . . . , qr , respectively.
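A simple instance of an equation of the form (15.1) may help fix ideas. The example is an illustration chosen here, not one taken from the passage: two material points held at a fixed distance from each other. Differentiating the squared distance gives one homogeneous first-order equation whose coefficients depend continuously on the position. The function names `rod_coefficients` and `is_possible` and the sample displacements are hypothetical.

```python
def rod_coefficients(position):
    """Coefficients x_{1,nu} of the single connection 'the distance between
    the two points does not change', evaluated at the given position.

    position holds the six rectangular coordinates (x1, y1, z1, x2, y2, z2).
    The equation comes from d|r1 - r2|^2 = 2 (r1 - r2) . (dr1 - dr2) = 0.
    """
    x1, y1, z1, x2, y2, z2 = position
    diff = (x1 - x2, y1 - y2, z1 - z2)
    return [diff[0], diff[1], diff[2], -diff[0], -diff[1], -diff[2]]


def is_possible(position, dx):
    """A displacement dx is possible if sum_nu x_{1,nu} * dx_nu == 0."""
    coefficients = rod_coefficients(position)
    return sum(c * d for c, d in zip(coefficients, dx)) == 0


position = [0, 0, 0, 1, 0, 0]      # second point one unit along the x-axis

translation = [1, 2, 3, 1, 2, 3]   # both points move together
rotation = [0, 0, 0, 0, 1, 0]      # second point moves perpendicular to the rod
stretch = [0, 0, 0, 1, 0, 0]       # second point moves along the rod

print(is_possible(position, translation))  # True
print(is_possible(position, rotation))     # True
print(is_possible(position, stretch))      # False
```

Integer coordinates are used so that the test for zero is exact; with floating-point data a tolerance would be needed.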
1 In the English translation of Hertz’s Mechanics the word ‘constraint’ is used exclusively to denote a certain vector quantity with regard to a material system, namely the vector quantity that appears in Gauss’s principle of least constraint (see Section 17.2). I shall often use the words connection and constraint interchangeably. The reader should be warned that Hertz’s and my use of the word connection has nothing to do with the meaning of the word in differential geometry and relativity theory, where it means a way to relate tangent spaces at different points of the manifold.

15.1 Connections rather than forces

We notice that Hertz allowed actions at a distance in his image. Nothing precludes an object on Earth from being connected to an object on a far-away star. Thus, although Hertz may initially have been directed to his new image of mechanics in part by his dislike for actions at a distance, he did not, in the end, remove this feature from his image. Why did he then prefer connections to forces? After all, forces in the ordinary image act in a way similar to the connection-coefficients $x_{\iota\nu}$ and $q_{\chi\rho}$ in Hertz’s image. Forces are given functions $F_i$ of the coordinates that enter into the differential equations of motion: $F_i = m_i\,(d^2 x_i/dt^2)$. Now, Newton’s equation is of the second order, whereas Hertz’s equations of constraint are only of the first order. However, this difference is not highlighted by Hertz. According to him, the major distinguishing difference between the two differential equations is that time enters into Newton’s second law but not into Hertz’s equations of constraint. Connections are purely geometric properties of a system, not a kinematical or dynamical property (Hertz 1894, p. 32/27). They are what Hertz sometimes called rigid connections. To be sure, when dealing with non-isolated systems, Hertz allowed the connections, i.e. the coefficients $x_{\iota\nu}$ and $q_{\chi\rho}$, to depend explicitly on time, but his most fundamental considerations concerned systems where the connections are independent of time, so-called normal (gesetzmäßige) connections. It is the purely geometric nature of the connections that makes them less objectionable to Hertz than forces2.

In the introduction to the Mechanics (Hertz 1894, p. 40/34) Hertz anticipated a criticism that he imagined would be raised from traditionally thinking physicists. They might try to argue that Hertz was guilty of a petitio principii when he first introduced connections and then deduced forces and their properties from them. In fact, in traditional mechanics connections are a result of forces that prevent the system from making certain movements. In this image it would therefore be a vicious circle if one tried to derive forces from connections. But, according to Hertz, this criticism does not apply to his new image where connections are primary basic concepts. Thus, it is entirely logically permissible to account for interactions the way Hertz did. Yet, the traditional physicist might argue that we know from experience that no connections are entirely rigid but only approximately so.
This may suggest that Hertz’s image is simply incorrect. To that objection from his imaginary opponent Hertz answered: With regard to rigid connections which are only approximately realized our mechanics will naturally only state as a fact that they are approximately satisfied; and for the purpose of this statement the idea of force is not required. If we wish to proceed to a second approximation and to take into consideration the deviations, and with them the elastic forces, we shall make use of a dynamical explanation for these as for all forces. In seeking the actual rigid connections we shall perhaps have to descend to the world of atoms. (Hertz 1894, p. 41/34)
Thus, where Laplace, Poisson, Hamilton and other traditional mechanicians would claim that when we get to know a mechanical system in detail we will discover that the constraints are, in fact, epiphenomena of reaction forces on the microscopic level (see Chapter 2), Hertz argued conversely that we only need to introduce forces if we do not know the system well enough. If we knew the system better we would see that the apparent forces are a result of rigid connections on the microscopic atomic level.

2 Høffding (Høffding 1915, p. 125) argued that Hertz preferred motion to force because he was a ‘visualist.’ Conversely, he characterized Ostwald as a ‘motorist.’
15.2 Derivation of the equations of connection

Why then did Hertz allow exactly those connections that are expressible in terms of first-order homogeneous differential equations (15.1) and (15.2)? Hertz’s answer to this question falls into two parts: He explained why one can do with these connections and he explained why one cannot do with a more restricted set of connections. The crux of the first part of Hertz’s answer is the derivation of the equations of constraint (15.1). Indeed, as everywhere else in the book Hertz defined constraints in a verbal way and then deduced the analytic expression (15.1) from it. The definition of constraints goes as follows:

There exists a connection between a series of material points when from knowledge of some of the components of the displacements of those points we are able to state something as to the remaining components. (Hertz 1894, §109)
Hertz argued that connections exist in a system if and only if there are some displacements that are excluded (impossible displacements) and others that are not excluded (possible displacements). Hertz’s mechanics does not deal with all such connections but only with continuous connections:

Definition 1. A connection of a system is said to be a continuous one when it is not inconsistent with the three following assumptions:

1. That the knowledge of all possible finite displacements should be included in the knowledge of all possible infinitely small displacements.
2. That every possible infinitely small displacement can be traversed in a straight, continuous path.
3. That every infinitely small displacement, which is possible from a given position, is also possible from any infinitely neighboring position, except for variations of the order of the distance between the positions or of a higher order.

(Hertz 1894, §115)
A system that is subject to no other than continuous connections is called a material system (Hertz 1894, §121). Hertz then argued that a system is a material system in this sense if and only if its connections are expressible by a system of first-order homogeneous linear differential equations such as (15.1) or (15.2). First, in §124 he argued somewhat loosely that constraints given by eqns (15.1) or (15.2) are continuous: The first requirement of Definition 1 must be satisfied when mention is made of the differentials of the coordinates, and the two remaining requirements follow from the equations.

Hertz gave a more detailed treatment of the converse: Assume that a material system is given a possible infinitely small displacement $dx_\nu$ from a given possible position. Assume that its coordinates $dx_\nu$ have to one another the ratios

$$\varepsilon_{11} : \varepsilon_{12} : \dots : \varepsilon_{1\,3n}. \tag{15.3}$$

If $du_1$ denotes any infinitely small quantity, then according to Hertz the equations

$$dx_\nu = \varepsilon_{1\nu}\, du_1 \tag{15.4}$$
define a possible displacement from the given position. Here, he assumed that if a possible displacement is multiplied by a real number (in the sense that every coordinate is multiplied by that number) then the resulting displacement is also possible. However, he did not argue how that property followed from the three requirements of Definition 1. If there are possible displacements from the given position that are not contained in those described by eqn (15.4), we choose a second possible displacement whose coordinates $dx_\nu$ bear to one another the ratios

$$\varepsilon_{21} : \varepsilon_{22} : \dots : \varepsilon_{2\,3n}. \tag{15.5}$$

Hertz now argued that if $du_2$ is any other infinitely small quantity, the equations

$$dx_\nu = \varepsilon_{1\nu}\, du_1 + \varepsilon_{2\nu}\, du_2 \tag{15.6}$$
define a possible displacement. Here, in addition to the above-mentioned multiplicative property Hertz has used that the sum of two possible infinitesimal displacements is again possible. This time Hertz did provide an argument in §116: According to the third requirement of Definition 1 the individual displacements may be performed successively (Hertz did not explain what to do with the infinitely small deviations) and according to the second requirement the direct displacement from the initial position to the final one is itself a possible displacement – the sum. Now Hertz proceeded as above until all possible displacements from the given point are given in the form

$$dx_\nu = \sum_{\lambda=1}^{l} \varepsilon_{\lambda\nu}\, du_\lambda. \tag{15.7}$$
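Eliminating the parameters $du_\lambda$ from a representation such as (15.7) leaves homogeneous linear equations in the $dx_\nu$, i.e. equations of the form (15.1). At a single position this elimination amounts to finding all coefficient rows that annihilate every basis displacement. The sketch below illustrates that step with plain Gauss–Jordan elimination over exact fractions. It is a modern linear-algebra reading of the argument, not Hertz’s procedure; it works at one fixed position, so the continuity of the coefficients is not modelled; and the function name `null_space` and the sample ratios are hypothetical.

```python
from fractions import Fraction as F


def null_space(rows):
    """Basis of all vectors x with sum_nu row[nu] * x[nu] == 0 for every row."""
    mat = [[F(v) for v in row] for row in rows]
    n = len(mat[0])
    pivots = []            # pivot column of each reduced row, in order
    r = 0                  # index of the next row to receive a pivot
    for c in range(n):
        pivot_row = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if pivot_row is None:
            continue       # no pivot in this column: it is a free column
        mat[r], mat[pivot_row] = mat[pivot_row], mat[r]
        pivot = mat[r][c]
        mat[r] = [v / pivot for v in mat[r]]
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                factor = mat[i][c]
                mat[i] = [vi - factor * vr for vi, vr in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
        if r == len(mat):
            break
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        x = [F(0)] * n
        x[f] = F(1)
        for row_index, pivot_col in enumerate(pivots):
            x[pivot_col] = -mat[row_index][f]
        basis.append(x)
    return basis


# Two independent possible displacements (l = 2) of a single point (3n = 3).
epsilon = [[1, 0, 2],
           [0, 1, -1]]

# Coefficient rows x_{iota,nu} of the resulting equations of connection.
constraints = null_space(epsilon)
print(constraints)

# Any combination dx_nu = eps_1nu * du_1 + eps_2nu * du_2 satisfies them.
du = [F(3), F(-5)]
dx = [sum(epsilon[lam][nu] * du[lam] for lam in range(2)) for nu in range(3)]
residuals = [sum(x * d for x, d in zip(row, dx)) for row in constraints]
print(residuals)
```

With $l$ independent displacements among $3n$ coordinates the elimination yields $3n - l$ independent equations, here the single equation $-2\,dx_1 + dx_2 + dx_3 = 0$.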
This must happen for some $l \le 3n$. From the third requirement of Definition 1 Hertz deduced that the coefficients $\varepsilon_{\lambda\nu}$ must be continuous, and he concluded that by eliminating the $du_\lambda$ one will arrive at a system of equations of the form (15.1) in which the $x_{\iota\nu}$ are continuous functions.

From a mathematical point of view (both modern and contemporary) the above derivation of the differential equations of connection is not rigorous. Yet it was seemingly important for Hertz. It is a highly streamlined version of a complicated argument found in the early manuscript (Ms 7) as well as in the first draft of the book (Ms 9). However, in these early versions of the derivation, Hertz had not explicitly formulated the first requirement of Definition 1. This seems to suggest that it was, in fact, the differential equations that were the starting point for Hertz’s considerations of connections. Definition 1, in plain words, was cooked up later to supply a foundation for the derivation of the differential equations.

Why was it important for Hertz to derive his equations of constraints from a verbal definition like Definition 1? First, as we have seen several times, Hertz always insisted on formulating the physical notions of his image independently of his specific mathematical form, deriving the ‘analytic representation’ subsequently. But in the case of the connections there was another important reason for the verbal formulation. It has to
do with the question of the correctness of Hertz’s image. The question is whether any physical system can be conceived as a material system in Hertz’s sense of the word, or expressed differently: ‘What right have we to assert that all natural connections can be expressed by linear differential equations of the first order?’ (Hertz 1894, p. 43/36). This question is, according to Hertz, of primary importance. ‘Our System stands or falls with it’ (Hertz 1894, p. 43/36). Now, the definition of continuous connections and a material system is given in the first book of the Mechanics, which, according to Hertz, is a-priori in Kant’s sense. However, Hertz did not imply that it was a-priori that all physical systems could be described as material systems in his sense of the word. That would also have been strange, considering that he clearly knew that many physicists would not agree to this proposition. In the first draft of the book he therefore accepted as an empirical fact that there exist material systems: With the experience, there was no other connection than the assumption that in reality there are systems of masses which satisfy our requirements of a material system, i.e. systems of material points for which certain displacements are possible and others are impossible, independently of any consideration of time. This assumption is confirmed by experience. (Ms 9, p. 43)3
Hertz may have added this weak appeal to experience at the end of the first draft of the first book in order to ensure that the first book is an image of something in the real world. In the printed book, a similar existential claim is postponed to the beginning of the second book (§306). However, in order to ascertain that Hertz’s mechanics gives a correct image of nature, it is, of course, not enough to know that there exists one (perhaps small and artificial) system that can be described by continuous connections. We need to know that all systems in nature can be described correctly in that way. This is the content of the continuation of the existential claim in the book: We may even assert that other than continuous connections are not found in nature, and that, consequently, every natural system of material points is a material system. (Hertz 1894, p. 306)
In the main body of the book Hertz did not comment on the probability of this central assertion, but in the introduction he admitted ‘that our assumption as to the permissible connections is of the nature of a tentatively accepted hypothesis’ (Hertz 1894, p. 44/37). Still, he added two theoretical arguments to support the hypothesis. One was a reference to ‘the great authority of Helmholtz’s name.’ Helmholtz had, according to Hertz, argued that any other type of connection cannot be realized by any practical mechanism. However, Hertz was not satisfied with this argument, because some possibility may have been overlooked, so instead he formulated a second argument:
It seems to me that the reason for our conviction should more properly be stated as follows. All connections of a system which are not embraced within the limits of our mechanics, indicate in one sense or another a discontinuous succession of its possible motions. But as a matter of fact it is an experience of the most general kind that nature exhibits continuity in infinitesimals everywhere and in every sense: an experience which has crystallised into firm conviction in the old proposition – Natura non facit saltus. In the text I have therefore laid stress upon this: that the permissible connections are defined solely by their continuity; and that their property of being represented by equations of a definite form is only deduced from this. (Hertz 1894, pp. 43–44/37)
3 ‘Mit der Erfahrung bestand keinen anderen Zusammenhang, als die Voraussetzung, daß es in der Wirklichkeit Massensysteme gebe, welche den bedingungen unserer materiellen Systeme entsprechen, also Systeme materieller Punkte fur welche gewisse Verrückungen möglich, andere unmöglich sind – unabhängig von jede rücksicht auf die Zeit. Diese Voraussetzung wird von der Erfahrung bestätigt.’ (Ms 9 p. 43)
Thus, Hertz derived the equations of constraints from the verbal Definition 1 primarily in order to show that the question of the correctness of his assumption about possible connections could be reduced to the question of the correctness of the principle Natura non facit saltus. He did not pretend that he had completely answered the correctness question in this way, or even reduced it to a metaphysical question. Indeed, as is clear from the quote, Hertz considered the principle Natura non facit saltus as a matter of experience. But since it is an ‘experience of the most general kind’ and thereby well tested, it lends a high degree of empirical probability to the resulting continuity assumptions in Definition 1 and thus to the equations of constraint. In most sections of the Mechanics Hertz restricted the type of connections further to internal and normal (gesetzmäßig) connections: Definition 2. A connection of a system is said to be an internal one when it only affects the mutual position of the points of the system … . Definition 3. A connection of a system is said to be normal when it exists independently of the time. (Hertz 1894, §117, 119)
As in the case of continuity, Hertz in the beginning of book 2 claimed that experience teaches us that there exist systems with only internal connections, and asserted ‘that this independence [of absolute position] always appears, so long as a material system is sufficiently distant in space from all other systems’ (Hertz 1894, p. 306). Moreover, he claimed that experience teaches us that systems with only internal connections are normal. Thus, Hertz primarily studied material systems whose connections are continuous, internal and normal, so-called free systems (§122). Any mechanical system in Hertz’s image of mechanics can be considered as a portion of a free system (§429)4 , so even the behavior of unfree systems must be deduced from the motion of free systems.
4 This explains why Hertz in the second draft of the book (Ms 12) chose to call free systems ‘complete’ (vollständig) rather than free.
15.3 Holonomic and non-holonomic systems
We have now seen how Hertz argued that he could do with connections that are expressed in terms of first-order homogeneous differential equations in the coordinates, and even in most cases with such equations where the coefficients only depend on the relative position of the point masses (internal connections) and are independent of time (normal connections). Now we shall turn to the question why Hertz did not restrict his connections even further. In most textbooks on mechanics connections were (and are) restricted to those that can be expressed by finite or integral equations between the coordinates:
$$F_\iota(x_1, x_2, \ldots, x_{3n}) = c_\iota, \qquad \iota = 1, 2, \ldots, i, \qquad (15.8)$$
or expressed in generalized coordinates:
$$F_\kappa(q_1, q_2, \ldots, q_r) = c_\kappa, \qquad \kappa = 1, 2, \ldots, k. \qquad (15.9)$$
Connections through rigid rods or gear wheel mechanisms are all of this kind. However, rolling motion (e.g. a ball rolling on a plane or two balls rolling on each other) cannot be described by such finite equations, but it can be described by differential equations of the form (15.1) and (15.2). That is why Hertz would not limit his connections to finite equations but allowed differential equations. In the discussion of the energetic image, he admitted that rolling with a bit of slipping could be described without recourse to differential equations of the form (15.1) or (15.2), but he found it unsatisfactory to deal with rolling in this way: first, because rolling without slipping did not violate any general principle of mechanics, and second, because such motions were, in fact, very nearly realized in various integration machines, which were constructed toward the end of the nineteenth century (see Section 9.3). This argument was not in itself sufficient for Hertz, who explicitly emphasized that his mechanics was not meant to facilitate practical mechanical computations. Yet, it indicated to Hertz that ‘we have scarcely any right, then, to exclude its (rolling without slipping) occurrence as impossible, at any rate from the mechanics of unknown systems, such as the atoms or the parts of the ether’ (Hertz 1894, p. 25/21). Thus, Hertz’s argument ran as follows: It is possible that rolling without slipping takes place on the atomic level or in the ether. Therefore, in an image that aspires to be the foundation of all physics we cannot limit our connections to those that can be expressed by finite equations but we must allow differential equations.5
5 As we shall see in Section 20.3, Hertz needed to assume that the connections among the hidden masses (the ether) are holonomic, i.e. given by finite equations such as (15.3) or (15.4). This somewhat undermines his argument for the necessity of non-holonomic connections. It is also remarkable that Larmor in 1900 used the converse argument against the use of non-holonomic constraints. He rejected Hertz’s argument against the principle of least action with the words: ‘Against this view it may be urged that the notion of rolling is foreign to molecular dynamics, on which the laws of mechanical dynamics must be ultimately based.’ (Larmor 1900, p. 277)
Of course, even in Hertz’s image some of the differential equations (15.1) or (15.2) could be integrable, which means that their left-hand sides are differentials of functions $F_\iota(x_1, x_2, \ldots, x_{3n})$ or $F_\kappa(q_1, q_2, \ldots, q_r)$, such that the corresponding connection can be expressed in integral form such as eqns (15.8) or (15.9). If the entire system of connections (15.1) or (15.2) is integrable, and thus reducible to integral equations of the form (15.8) or (15.9), Hertz called the system holonomic.6 Though he was not the first to emphasize the difference between holonomic and non-holonomic constraints, he was the first to give them their names. He explained that the name should indicate that a holonomic system ‘obeys integral (ὅλος) laws (νόμος), whereas material systems, in general, obey only differential conditions’ (Hertz 1894, §123). The name holonomic appeared for the first time in the fourth and last of Hertz’s drafts of the Mechanics (Ms 16). In the second and the third draft (Ms 12, 15) Hertz called such systems ‘Integralsysteme’ (integral systems) for obvious reasons, whereas he used the term ‘Positionssysteme’ or ‘Lagensysteme’ (position systems) in the manuscript (Ms 7), where he introduced the concept of connections for the first time, as well as in the first long draft (Ms 9). The latter names are derived from the fact that for a holonomic system the possible positions uniquely determine the possible displacements. In fact, this property is chosen by Hertz as the defining one in all his drafts and even in the published book:
Definition 3. A material system between whose possible positions all conceivable continuous motions are also possible motions is called a holonomic system. (Hertz 1894, §123)
In order to appreciate this definition, we shall follow Hertz and define a possible path as a path that is composed of possible displacements, and possible positions as the positions that can be reached (from one given possible position)7 via possible paths (Hertz 1894, §112). Having formulated these definitions Hertz proceeded with the following crucial observation: Thus all positions of possible paths are possible positions. But it is not to be understood that any conceivable path whatever through possible positions is also a possible path. On the contrary, a displacement between infinitely neighbouring possible positions may be an impossible displacement. (Hertz 1894, §113)
The definition of a holonomic system is meant to say that the latter possibility cannot occur, i.e. that any displacement between infinitely neighboring possible positions is a possible displacement. However, as formulated by Hertz, Definition 3 is not entirely correct. For example, a system of two point masses constrained by the condition that they maintain a fixed given distance between them is the simplest example of a holonomic system. However, it is, of course, possible to move this system from one possible position (i.e. one where the two points are separated by the given distance) to another possible position through a path where the two points do not always maintain the given fixed distance, i.e. through an impossible motion. So, what Hertz meant to say in the definition was either the above italicized statement where the displacements are limited to infinitesimal ones, or he could have stated that in a holonomic system all conceivable paths8 through possible positions are possible paths.
6 Holonom in German and Holonomous in the translation. I shall adopt the usual modern translation holonomic.
7 Hertz forgot to fix a given initial possible position in his definition.
8 In fact, Hertz’s use of the word motion in Definition 3 instead of path is probably a slip of the pen.
In order to understand how a system can fail to be holonomic in this sense, let us consider an example alluded to in the introduction of Hertz’s Mechanics (Hertz 1894, p. 23/19) (I shall return to this example in Chapter 21). Consider a ball of radius $r$ rolling without slipping on a fixed plane. The configuration of the ball is uniquely determined by:
1. the point $p$ on the plane where it touches the ball (2 coordinates),
2. the point $P$ on the ball where it touches the surface (2 coordinates), and
3. the angle the ball is turned around the diameter through the point of contact (1 coordinate).
It is also rather clear that it is possible to roll the ball such that it gets from a given configuration A to any other configuration B. Indeed, by choosing a suitably long curve in the plane from $p_A$ to $p_B$ it is possible to roll the ball along this curve such that the correct point $P_B$ on the ball ends up touching the correct point $p_B$ in the plane. Then one can rotate the ball around the diameter through $P_B$ till it reaches the configuration B. Thus there exists a possible path from the configuration A to any configuration B. That means that any position of the ball on the plane is a possible position and therefore that the positions of the system form a 5-dimensional manifold. However, it is easy to specify two infinitely close positions such that the displacement between them is not a possible displacement. For example, consider one position of the ball on the plane and another position where the ball is oriented in the same way but situated over a neighboring point on the surface. If the ball is rolled without slipping (i.e. is given a possible displacement) from a point on the plane to a neighboring one it will also turn through a non-zero angle around its instantaneous axis of rotation. But the ball in the above example is not turned and thus it cannot be reached through a possible infinitesimal displacement from the first position. That means that a ball rolling without slipping is not a holonomic system.
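The rolling condition in this example can be written out explicitly. The following is only a sketch in modern notation, not Hertz’s own formulation: it assumes that the plane is $z = 0$, that $(x, y)$ are Cartesian coordinates of the point of contact in the plane, and that $(d\theta_x, d\theta_y, d\theta_z)$ are the components of the infinitesimal rotation of the ball about fixed axes through its center. All of these symbols are introduced here for illustration.

```latex
% Assumed notation (not in the source): (x, y) = contact point in the plane z = 0,
% r = radius, (d\theta_x, d\theta_y, d\theta_z) = infinitesimal rotation about fixed axes.
% Rolling without slipping: the material point of the ball at the contact is at rest.
\begin{align*}
  dx - r\, d\theta_y &= 0, \\
  dy + r\, d\theta_x &= 0 .
\end{align*}
% Counting: 5 coordinates, 2 independent conditions, hence 3 infinitesimal
% changes (d\theta_x, d\theta_y, d\theta_z) that can be chosen arbitrarily.
% The displacement of the example above, (dx, dy) \neq (0, 0) with
% d\theta_x = d\theta_y = d\theta_z = 0, violates these conditions,
% so it is an impossible displacement between two possible positions.
```

Note that the $d\theta$s are in general not differentials of coordinates; to bring the conditions into the form (15.2) one would have to express them through a choice of orientation coordinates (for instance Euler angles), which is omitted in this sketch.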
But what has Hertz’s definition to do with the difference between differential and integral conditions that have given the holonomic constraints their name? This question is answered by Hertz in §132–133 of the Mechanics. He proved that a system is holonomic if and only if its differential equations of constraint (15.1) or (15.2) can be integrated to yield an equal number of finite or integral equations between its coordinates. The difficult part of this statement is the ‘only if’ part, which involves a case of the implicit function theorem. Hertz dealt with it as follows: Among the r coordinates, whose differentials are related through the k equations (15.2), he selected the first r − k independent ones. Now, pass from some initial position through possible paths to final positions with the same values of these independent coordinates. Then, if the system is assumed to be holonomic the rest of the coordinates must also have the same values in the final positions. Indeed, according to the definition, all the final positions are possible positions, and since the system is supposed to be holonomic the displacement from one final position to another one must be a possible displacement. But since the differentials of the independent coordinates are zero, and the equations of constraint are homogeneous, the remaining differentials must also be zero. Thus the first r − k coordinates completely determine the remaining coordinates, or put differently, the remaining coordinates are functions of the independent ones. The k finite equations that express this fact are the integral equations that solve the differential equations of constraint. In order to shed more light on the difference between holonomic and non-holonomic systems Hertz introduced the concept of the degree of freedom of a system. It is,
by definition, equal to ‘the number of infinitely small changes of the coordinates of a system that can be taken arbitrarily’ (Hertz 1894, §134), i.e. the dimension of the tangent space. This is the number $l$ of independent $du_\lambda$s in the proof of the form of the connections (15.6). It is thus equal to the number of coordinates diminished by the number of independent connections: $3n - i$ or $r - k$ in the above terminology. We remark with Hertz that if the system is not a holonomic one, this number is strictly less than the number of dimensions of the manifold of possible positions. For example, we remarked above that the possible positions of a ball rolling on a plane form a 5-dimensional manifold. However, an infinitely small displacement is uniquely determined by specifying an axis of rotation of the ball (two coordinates) and an angle of rotation around this axis (one coordinate). Thus the system has only three degrees of freedom. Moreover, Hertz defined a free coordinate to be a coordinate whose changes can take place independently of the changes of the other coordinates. Free coordinates are thus those coordinates that do not enter into the equations of constraint. In particular, a coordinate of absolute position is a free coordinate for a free system. In terms of these concepts Hertz could prove that the following statements about a mechanical system are equivalent:
1. It is holonomic.
2. Its equations of constraint are integrable.
3. Its positions can be expressed by coordinates that are all free. In that case there are no remaining equations of constraint.
4. The number of independent coordinates (the dimension of the manifold of possible positions) is equal to its number of degrees of freedom.
The acceptance of non-holonomic connections complicates Hertz’s mechanics considerably, but, as we have seen, he nevertheless argued they were an essential ingredient of his image.
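A smaller example than the rolling ball may help to show how the fourth statement can fail. The sketch below is not taken from Hertz; it uses three hypothetical coordinates $x, y, z$ and a single connection, once in an integrable and once in a non-integrable version.

```latex
% Hypothetical three-coordinate illustration (not from the source).
% (a) Integrable connection:
\[
  dz - y\,dx - x\,dy = d(z - xy) = 0
  \quad\Longrightarrow\quad z - xy = c .
\]
% Possible positions: the 2-dimensional surface z = xy + c.
% Degrees of freedom: 3 - 1 = 2. The two numbers agree (holonomic).
%
% (b) Non-integrable connection:
\[
  dz - y\,dx = 0 .
\]
% Degrees of freedom: again 3 - 1 = 2. But follow the closed loop in (x, y)
%   (0,0) -> (0,b) -> (a,b) -> (a,0) -> (0,0),
% starting from z = 0 and obeying dz = y dx on each leg:
\[
  \Delta z = 0 + ab + 0 + 0 = ab .
\]
% The system returns to the same (x, y) with z changed by the arbitrary amount ab,
% so every position (x, y, z) is a possible position: a 3-dimensional manifold
% of possible positions with only 2 degrees of freedom (non-holonomic).
```

In case (b) the displacement $dz \neq 0$ with $dx = dy = 0$ connects two possible positions but is itself impossible, which mirrors the unturned ball in the rolling example.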
However, this argument only partially answers the question: Is it possible to limit the possible types of connections in some other way or as Hertz himself put it: It is mathematically possible to write down any finite or differential equation between coordinates and to require that it shall be satisfied; but it is not always possible to specify a natural, physical connection corresponding to such an equation: we often feel, indeed sometimes are convinced, that such a connection is by the nature of things excluded. And yet, how are we to restrict the permissible equations of condition? Where is the limiting line between them and the conceivable ones? (Hertz 1894, p. 13/11)
Hertz asked this question in connection with the usual Newtonian–Laplacian image of mechanics, and he criticized this image for the inability to give a satisfactory answer to the question as well as to the corresponding question about the nature of possible forces in nature. According to Hertz the problem was one of distinctness. If one could restrict the possible constraints one would depict more of the essential relations of the image, and thus make it more distinct.
In the presentation and evaluation of his own image, however, Hertz wisely did not raise the question again. One might imagine that he would have liked to somehow limit connections to contact or local connections, so as to avoid the possibility of actions at a (great) distance. However, this would have been difficult in Hertz’s image. Indeed, Hertz’s masses have only infinitesimal extension, so in a system of finitely many point masses, as the ones Hertz dealt with, there would not be any contact actions except if two masses coincided. This would lead to very limited possibility for interaction, and would, moreover, contradict the principle Natura non facit saltus. So, the only possibility would be to postulate that masses could only be connected if they were sufficiently close. Yet that would involve a completely arbitrary choice of largest distance for actions, for which there would be no rational or empirical argument. So, instead of raising the question of possible limitations of the connections in his own image Hertz evaded it by the following remark: To investigate in detail the connections of definite material systems is not the business of mechanics, but of experimental physics. (Hertz 1894, p. 32/27)
He developed this point of view at greater length in §307.
16 The fundamental law
In order to account for the way a mechanical system moves in time some laws of motion are called for. In the usual Newtonian–Laplacian image Newton’s three laws are often taken as the basic ones. Hertz, on the other hand formulated one and only one law of motion: Fundamental Law. Every free system persists in its state of rest or of uniform motion in a straightest path. Systema omne liberum perseverare in statu suo quiescendi vel movendi uniformiter in directissimam. (Hertz 1894, §309)
Hertz probably formulated the Latin version of the law in order to highlight its similarity to (and simplification of) Newton’s formulation of his first law: Corpus omne perseverare in statu suo quiescendi vel movendi uniformiter in directum, nisi quatenus a viribus impressis cogitur statum illum mutare. (Newton 1687)
In order to understand Hertz’s fundamental law, we must explain the meaning of the words ‘uniform motion’ and ‘straightest path’: The motion of a system is said to be uniform if the magnitude of the velocity does not change (Hertz 1894, §263). A path of a system is said to be a straightest path, if it consists of straightest-path elements, i.e. possible elements (those that satisfy the constraints) that have smaller curvature than any other possible line element with the same position and the same direction. With that in mind we can formulate Hertz’s fundamental law as follows: A free system moves with constant speed along a path that is as straight as it can be without breaking the connections of the system. In the introduction of his Mechanics Hertz formulated the content of the fundamental law in everyday language: ‘if the connections of the system could be momentarily destroyed, its masses would become dispersed, moving in straight lines with uniform velocity; but . . . as this is impossible, they tend as nearly as possible to such a motion.’ (Hertz 1894, p. 33/28). Except for such attempts to explain the content in a more popular version, Hertz’s formulation of the fundamental law was surprisingly stable throughout his work on mechanics. Already an early plan of the book (in (Ms 13)) contains a Latin formulation identical
to the printed law. In this plan the German formulations differ slightly from the published one, but in all the drafts beginning with (Ms 9) the law reads like the published one1 . The Latin version reappeared in the third draft (Ms 15). It was the geometry of systems of points that allowed Hertz to limit the laws of motion to his one simple, elegant and intuitively appealing fundamental law. Hertz explicitly mentioned this fact as one of the merits ‘although not a very important one’ of the geometrical form of his image (Hertz 1894, p. 37/31). He did not consider it so very important, because it would have been possible to formulate the physical content of the fundamental law in ordinary mathematical language. As Hertz’s everyday wording above suggests, one would, however, have to divide the content of the law into two separate laws: 1. Newton’s first law of inertia stating that in a system consisting of free points (with no connections) the points will move uniformly in straight lines (Hertz 1894, §383), and 2. Gauss’s principle of least constraint to the effect that the natural motion of a connected system will minimize the constraint among all motions that have the same position and velocity. I shall return to Hertz’s geometric definition of the constraint of a system and his formulation of Gauss’s principle in the next chapter. Here, it suffices to note that Gauss had defined the constraint of a system in a more traditional mathematical language reminiscent of the method of least squares (2.19). Hertz proved that these two laws could ‘together replace completely the fundamental principle, and that for all systems’ (Hertz 1894, §391). Yet he did not like the idea of replacing his fundamental law with the two mentioned laws, because they would by implication contain something more, and this something more would be too much. 
In the first place they suggest the conception, which is foreign to our system of mechanics, that the connections of the material system might be destroyed; whereas we have denoted them as being permanent and indestructible throughout. In the second place we cannot, in using Gauss’s principle, avoid suggesting the idea that we are not only stating a fact, but also the cause of this fact. We cannot assert that nature always keeps a certain quantity, which we call constraint, as small as possible, without suggesting that this quantity signifies something which is for nature itself a constraint, – an uncomfortable feeling. We cannot assert that nature acts like a judicious calculator reducing his observations, without suggesting that deliberate intention underlies the action. There is undoubtedly a special charm in such suggestions; and Gauss felt a natural delight in giving prominence to it in his beautiful discovery, which is of fundamental importance in our mechanics. Still, it must be confessed that the charm is that of mystery; we do not really believe that we can solve the enigma of the world by such half-suppressed allusions. Our own fundamental law entirely avoids any such suggestions. It exactly follows the form of the customary law of inertia, and like this simply states a bare fact without any pretence of establishing it. And as it thereby becomes plain and unvarnished, in the same degree does it become more honest and truthful. (Hertz 1894, p. 37/31–32)
It may be a bit difficult to agree with Hertz’s latter assertion except on a superficial level. After all, Hertz’s own fundamental law also asks for the minimization of a certain quantity, namely the curvature. History has shown that any minimal principle, be it a differential principle such as Gauss’s principle or Hertz’s fundamental law or an integral principle such as the principle of least action, is prone to metaphysical or even religious considerations. For example, ever since Maupertuis and Euler formulated the principle of least action it has been interpreted as a metaphysical principle that results from the deliberate intention in nature not to spend more action than necessary, and it has even been used as an argument for a Divine planning of the universe. It is difficult to see that Hertz’s formulation of the fundamental law avoids any such suggestion. Nothing seems to prevent a religious or metaphysical mind from claiming that nature or God intentionally minimizes curvature, and that this is the reason for the fundamental law. One can easily formulate metaphysical or religious slogans such as ‘Nature (or God) is straight’ or ‘Nature (or God) is not crooked’ and pretend to have solved the enigma of the world by such half-suppressed allusions. One might argue that Hertz’s use of the word curvature is less suggestive than the word constraint that suggests an uncomfortable feeling, and that for this reason Hertz’s fundamental law is less prone to metaphysical abuse than Gauss’s principle. But, except for this superficial difference, I think the two principles lend themselves equally well to metaphysical speculations. So, in this one respect, I tend to agree with Hertz when he wrote:
Perhaps I am prejudiced in favour of the slight modification which I have made in Gauss’s principle, and see in it advantages which will not be manifest to others. (Hertz 1894, p. 38/32)
1 In the second draft, the word ‘free’ is replaced by ‘complete’ (vollständig).
However, it is quite clear from the above quote that Hertz himself rejected any claim of a metaphysical origin of the fundamental law. To him the fundamental law was to be considered as the probable outcome of most general experience. More strictly, the law is stated as a hypothesis or assumption, which comprises many experiences, which is not contradicted by any experience, but which asserts more than can be proved by definite experience at the present time. (Hertz 1894, §315)
Indeed, the fundamental law was, according to Hertz, the only empirical element of his image of mechanics. We have already discussed the problematic nature of this claim in Chapters 11 and 14. Here, let me just recall that some of Hertz’s remarks about space and time seem to suggest some empirical content in these notions, or at least in the coordinative rules that explain how to apply our a-priori intuitions to our sensations of outer space and time. Moreover, as we saw in the previous chapter, Hertz also characterized his assumption about the possible nature of connections as being based on an ‘experience of the most general kind.’ These are the exact same words that he used to characterize the empirical nature of his fundamental law, so it is a bit surprising that he claimed that: ‘The question of the correctness of our statements is thus coincident with the question of the correctness or general validity of that single statement’ (Hertz 1894, §296). Hertz may have considered his fundamental law as containing implicitly also the statement of the nature of connections. After all, the straightest path is defined relative to the notion of connections. I shall return to Hertz’s discussion of the validity of his fundamental law, and consequently of his mechanics in Chapter 25 after I have shown how Hertz accounted for the motion of free and unfree mechanical systems.
Hertz mentioned one other law that could have replaced his fundamental law, namely the following law of least acceleration: Proposition. A free system moves in such a manner that the magnitude of its acceleration at any instant is the smallest which is consistent with the instantaneous position, the instantaneous velocity and the connections of the system. (Hertz 1894, §344)
This proposition is a simple consequence of the fundamental law. Indeed, as we saw at the end of Section 14.4, Hertz had shown that the tangential and normal components of the acceleration are equal to $\dot v$ and $c v^2$, respectively. Thus, the square of the magnitude of the acceleration is equal to
$$v^4 c^2 + \dot v^2 . \qquad (16.1)$$
Now, according to the fundamental law, $\dot v = 0$ and $c$ is minimized for the natural motion. Thus, the acceleration is minimized. Conversely, when the acceleration is minimized, $\dot v$ must be equal to 0, i.e. the motion is uniform, and the curvature $c$ must be at a minimum, i.e. the path is a straightest path. Thus, the law of least acceleration could have replaced the fundamental law completely. As far as the mathematical form is concerned the two laws are on an equal footing. They are both formulated in the language of Hertz’s geometry of systems of points. Hertz chose to state the fundamental law in terms of uniform motion in a straightest path, because this formulation ‘has the advantage of making its meaning clearer and more unmistakable’ (Hertz 1894, §346). In one sense, however, the law of least acceleration ‘might be regarded as a preferable form of the statement of the fundamental law, inasmuch as it condenses the law into a single indivisible statement, not only externally into one sentence’ (Hertz 1894, §346). Indeed, as Hertz pointed out in §323, the fundamental law can be decomposed into three independent statements:
1. Of the possible paths of a free system its straightest paths are the only one which it pursues.
2. Different free systems describe in identical times lengths of their paths proportional to each other.
3. Time as measured by a chronometer (§298) increases proportionally to the length of the path of any one of the free moving systems.
The first two statements alone contain facts of a general nature derived from experience. The third only justifies our arbitrary rule for the measure of time, and only includes the particular experience that in certain respects a chronometer behaves as a free system, although, strictly speaking, it is not. (Hertz 1894, §323)
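The equivalence between the fundamental law and the law of least acceleration, argued earlier in this chapter from eqn (16.1), can be spelled out in one line. The sketch assumes that the curvature $c$ is a non-negative magnitude and writes $c_{\min}$ (a symbol introduced here, not Hertz’s) for the smallest curvature that the connections allow at the given position and direction.

```latex
% At a given instant the position and the velocity, hence v, are prescribed.
% What remains open is \dot v (any real number) and c (bounded below by c_min,
% an illustrative symbol not used in the source).
\begin{align*}
  |a|^2 &= \dot v^{2} + c^{2} v^{4},
  \qquad \dot v \in \mathbb{R}, \quad c \ge c_{\min} \ge 0, \\
  \min |a|^2 &= c_{\min}^{2}\, v^{4}
  \quad\text{attained if and only if}\quad
  \dot v = 0 \ \text{and}\ c = c_{\min}
  \qquad (v \neq 0).
\end{align*}
```

Since the two terms are non-negative and can be varied independently, minimizing their sum is the same as minimizing each separately, which is why the single statement about acceleration carries both the uniformity and the straightest-path content of the fundamental law.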
17 Free systems
Hertz’s strategy for studying the motion of mechanical systems was to deal first with free systems, to which the fundamental law applies, and then with unfree systems considered as subsystems of free systems. In this chapter I shall discuss Hertz’s treatment of free systems. Hertz dealt with them in two steps: In the first book (Hertz 1894, §151–236) he discussed the purely geometric properties of straightest paths. Only in the second book after the formulation of the fundamental law does it become clear why the straightest paths are particularly interesting. At that point of the Mechanics Hertz went on to investigate the dynamic theory, i.e. how systems move in time (Hertz 1894, §331–426). At each step he deduced the general differential equations, derived differential and integral principles and dealt with the special phenomena of holonomic systems. I shall structure my discussion along different lines. In order to arrive as quickly as possible at Hertz’s introduction of the concept of force, which in a sense is the highlight of the physical content of his mechanics, I shall, in this chapter, deal exclusively with the differential equations of motion and those general differential principles of mechanics that follow from them. Only later (Chapter 21) shall I investigate Hertz’s theory of integral principles in particular for holonomic systems.
17.1 Straightest paths
After Hertz had introduced the concept of curvature of a path, and the expressions (13.6) and (13.7) for it, as well as the concept of connections and their analytical expressions (15.1) and (15.2), he combined them into a study of the straightest paths (Hertz 1894, §151–165). According to the definition, a straightest path minimizes the expression of the curvature among all path elements that satisfy the connections. In order to deal with this variational problem Hertz applied the usual Lagrangian method of multipliers (Section 2.1). He observed that, according to eqn (15.1), the derivatives $x'_\nu$ with respect to the curve length will satisfy the equations
$$ \sum_{\nu=1}^{3n} x_{\iota\nu}\, x'_\nu = 0, \qquad \iota = 1, 2, \ldots, i, \tag{17.1} $$
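and thus $x''_\nu$ will satisfy the derived equations
$$ \sum_{\nu=1}^{3n} x_{\iota\nu}\, x''_\nu + \sum_{\nu=1}^{3n} \sum_{\mu=1}^{3n} \frac{\partial x_{\iota\nu}}{\partial x_\mu}\, x'_\nu x'_\mu = 0, \qquad \iota = 1, 2, \ldots, i. \tag{17.2} $$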
Hertz now needed to vary $x''_\nu$ while keeping $x_\nu$ and $x'_\nu$ (the starting point and direction) constant so as to minimize the curvature $c$ (13.6) or, equivalently, the expression
$$ \tfrac{1}{2} c^2 = \tfrac{1}{2} \sum_{\nu=1}^{3n} \frac{m_\nu}{m}\, x''^{\,2}_\nu \tag{17.3} $$
under the constraints (17.2). To this end he multiplied the $\iota$-th equation (17.2) by a multiplier $\Xi_\iota$ and added all the left-hand sides to the expression in eqn (17.3). Then he determined the stationary points of this new expression by setting its partial derivatives with respect to $x''_\nu$ equal to zero. This leads to the equations
$$ \frac{m_\nu}{m}\, x''_\nu + \sum_{\iota=1}^{i} x_{\iota\nu}\, \Xi_\iota = 0, \qquad \nu = 1, 2, \ldots, 3n, \tag{17.4} $$
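which together with the $i$ equations (17.2) determine the $3n + i$ quantities $x''_\nu$ and $\Xi_\iota$. Similarly, in terms of generalized coordinates eqn (15.2) will yield the equations
$$ \sum_{\rho=1}^{r} q_{\chi\rho}\, q''_\rho + \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} \frac{\partial q_{\chi\rho}}{\partial q_\sigma}\, q'_\rho q'_\sigma = 0, \qquad \chi = 1, 2, \ldots, k, \tag{17.5} $$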
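and using the expression (13.7) for the curvature Hertz deduced that the $q''_\sigma$ of a straightest path will satisfy the equations:
$$ \sum_{\sigma=1}^{r} a_{\rho\sigma}\, q''_\sigma + \sum_{\sigma=1}^{r} \sum_{\tau=1}^{r} \left( \frac{\partial a_{\rho\sigma}}{\partial q_\tau} - \frac{1}{2} \frac{\partial a_{\sigma\tau}}{\partial q_\rho} \right) q'_\sigma q'_\tau + \sum_{\chi=1}^{k} q_{\chi\rho}\, \Pi_\chi = 0, \qquad \rho = 1, 2, \ldots, r, \tag{17.6} $$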
where $\Pi_\chi$ are the Lagrangian multipliers. Hertz remarked in passing that the terms involving $a_{\rho\sigma\lambda\mu}$ (13.8) have disappeared from the equation. Thus, these $\tfrac{1}{4} r^2 (r + 1)$ quantities, which are necessary if one wants to know the value of the curvature of a path, are not needed to determine the straightest path. Since the differential equations of the straightest path are of second order, Hertz concluded that there is a uniquely determined straightest path from a given position in a given direction. He also derived equations that express the change of direction along the path, but there is no reason to enter into the details of these equations.
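The multiplier method of eqns (17.2) and (17.4) can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not Hertz's own procedure: it treats a system with a single equation of connection, and the function name `straightest_path_step` and its arguments are hypothetical. Solving eqn (17.4) for the second derivatives and inserting them into eqn (17.2) gives the multiplier directly. As a check, a single point constrained to a sphere of radius 2 should have a straightest path of curvature 1/2, i.e. a great circle.

```python
def straightest_path_step(masses, a, da, x_prime):
    """Solve eqns (17.2) and (17.4) for one connection (illustrative helper).

    masses  : m_nu, the mass attached to each of the 3n coordinates
    a       : coefficients x_{1 nu} of the connection sum_nu a[nu] dx_nu = 0
    da      : da[nu][mu] = partial derivative of a[nu] with respect to x_mu
    x_prime : direction x'_nu (derivatives with respect to path length)
    Returns (x'' as a list, the multiplier, the curvature c).
    """
    n = len(a)
    m = sum(masses) / 3.0  # total mass: every point contributes three coordinates
    # Double sum of eqn (17.2): sum over nu, mu of (d a_nu / d x_mu) x'_nu x'_mu
    quad = sum(da[nu][mu] * x_prime[nu] * x_prime[mu]
               for nu in range(n) for mu in range(n))
    # From eqn (17.4): x''_nu = -(m / m_nu) a_nu * multiplier
    denom = sum((m / masses[nu]) * a[nu] ** 2 for nu in range(n))
    multiplier = quad / denom
    x_second = [-(m / masses[nu]) * a[nu] * multiplier for nu in range(n)]
    # Curvature from eqn (17.3): c^2 = sum_nu (m_nu / m) x''_nu^2
    curvature = sum((masses[nu] / m) * x_second[nu] ** 2 for nu in range(n)) ** 0.5
    return x_second, multiplier, curvature


# One point of unit mass on a sphere of radius 2: connection x . dx = 0
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
x_second, multiplier, curvature = straightest_path_step(
    masses=[1.0, 1.0, 1.0],
    a=[2.0, 0.0, 0.0],          # a_nu = x_nu at the position (2, 0, 0)
    da=identity,                # d a_nu / d x_mu = delta_{nu mu}
    x_prime=[0.0, 1.0, 0.0],    # unit tangent direction
)
print(x_second, multiplier, curvature)
```

In this example the second derivative points toward the centre of the sphere, as one expects for a great circle.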
17.2 Dynamics of free systems
In the second book of the Mechanics, after the formulation and discussion of the fundamental law, Hertz embarked on a study of the kinematic and dynamic properties of natural motions of free systems, i.e. properties that depend on time as well as on space. He began by deriving some easy consequences of the fundamental law. First, he remarked that such a path is uniquely determined by an initial position and velocity (Hertz 1894, §331). That follows directly from the properties of the straightest paths. Secondly, he observed that since the magnitude of the velocity and the mass of a system do not change with time, the energy $E = \tfrac{1}{2} m v^2$ is conserved during a natural motion (§340). Thirdly, he proved the law of least acceleration, as I did at the end of the previous chapter. Having discussed these and a few other easy consequences of the fundamental law Hertz derived the differential equations of motion. He derived them from the equations of the straightest paths, by introducing time $t$ as the independent variable rather than curve length $s$ (§368). Since $v = ds/dt$ is a constant we have
$$ \dot x_\nu = x'_\nu\, v, \qquad \ddot x_\nu = x''_\nu\, v^2, \tag{17.7} $$
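so if we multiply eqns (17.2) and (17.4) by $m v^2$ and put, for shortness, $X_\iota$ instead of $m v^2\, \Xi_\iota$, we have
$$ \sum_{\nu=1}^{3n} x_{\iota\nu}\, \ddot x_\nu + \sum_{\nu=1}^{3n} \sum_{\mu=1}^{3n} \frac{\partial x_{\iota\nu}}{\partial x_\mu}\, \dot x_\nu \dot x_\mu = 0, \qquad \iota = 1, 2, \ldots, i \tag{17.8} $$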
and
$$ m_\nu\, \ddot x_\nu + \sum_{\iota=1}^{i} x_{\iota\nu}\, X_\iota = 0, \qquad \nu = 1, 2, \ldots, 3n. \tag{17.9} $$
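These $3n + i$ equations determine the $3n + i$ quantities $\ddot x_\nu$ and $X_\iota$ as functions of $x_\nu$ and $\dot x_\nu$. In the same way, from eqns (17.5) and (17.6) he found the equations of motion expressed in generalized coordinates (§371):
$$ \sum_{\rho=1}^{r} q_{\chi\rho}\, \ddot q_\rho + \sum_{\rho=1}^{r} \sum_{\sigma=1}^{r} \frac{\partial q_{\chi\rho}}{\partial q_\sigma}\, \dot q_\rho \dot q_\sigma = 0, \qquad \chi = 1, 2, \ldots, k \tag{17.10} $$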
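and
$$ m \left[ \sum_{\sigma=1}^{r} a_{\rho\sigma}\, \ddot q_\sigma + \sum_{\sigma=1}^{r} \sum_{\tau=1}^{r} \left( \frac{\partial a_{\rho\sigma}}{\partial q_\tau} - \frac{1}{2} \frac{\partial a_{\sigma\tau}}{\partial q_\rho} \right) \dot q_\sigma \dot q_\tau \right] + \sum_{\chi=1}^{k} q_{\chi\rho}\, Q_\chi = 0, \qquad \rho = 1, 2, \ldots, r, \tag{17.11} $$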
where we have put $m v^2\, \Pi_\chi = Q_\chi$. According to eqn (14.19) the quantity in the square bracket of eqn (17.11) is equal to the component $f_\rho$ of the acceleration of the system along $q_\rho$, so this equation can be written
$$ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho}\, Q_\chi = 0, \qquad \rho = 1, 2, \ldots, r. \tag{17.12} $$
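Inserted into the purely kinematical equation (14.20) this leads to Lagrange’s equations of motion (§373)
$$ \frac{d}{dt} \left( \frac{\partial_q E}{\partial \dot q_\rho} \right) - \frac{\partial_q E}{\partial q_\rho} + \sum_{\chi=1}^{k} q_{\chi\rho}\, Q_\chi = 0, \qquad \rho = 1, 2, \ldots, r. \tag{17.13} $$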
In particular, if we choose to describe a holonomic system by free coordinates $q_\rho$ the constraints vanish ($q_{\chi\rho} = 0$) so that we have the equations of motion (§375):
$$ \frac{d}{dt} \left( \frac{\partial_q E}{\partial \dot q_\rho} \right) - \frac{\partial_q E}{\partial q_\rho} = 0. \tag{17.14} $$
From this Lagrangian form of the equation of motion Hertz derived Hamilton’s equations of motion for a free holonomic system expressed in free coordinates:
$$ \dot q_\rho = \frac{\partial_p E}{\partial p_\rho}, \tag{17.15} $$
$$ \dot p_\rho = - \frac{\partial_p E}{\partial q_\rho}. \tag{17.16} $$
The first of these equations is a simple consequence of eqn (14.16) (see eqn (14.18)) and has therefore, according to Hertz, no empirical content. Formula (17.16), on the other hand, has empirical content because it is a consequence of the fundamental law, or of eqn (17.14). Indeed, according to eqn (14.17), we have
$$ p_\rho = \frac{\partial_q E}{\partial \dot q_\rho}, \tag{17.17} $$
which, inserted into eqn (17.14), yields
$$ \dot p_\rho = \frac{\partial_q E}{\partial q_\rho}. \tag{17.18} $$
But, according to eqn (14.23), we have
$$ \frac{\partial_q E}{\partial q_\rho} = - \frac{\partial_p E}{\partial q_\rho}, \tag{17.19} $$
so that eqn (17.16) follows.
Hertz concluded his general treatment of free systems by proving three more principles of mechanics: the principle of least constraint, d’Alembert’s principle and the principle of areas (conservation of angular momentum). He defined the constraint of a system to be ‘the difference between the actual acceleration of the system and the acceleration of that natural motion which would result on removal of all the equations of condition of the system’ (Hertz 1894, §385). It is a vector quantity of the system. Now, since the acceleration of a free system with no connections is zero, the constraint of such a system is equal to the acceleration of the system. Therefore, we can reformulate the law of least acceleration (see the previous chapter) in a way that corresponds to Gauss’s Principle of least constraint: The magnitude of the constraint is at every instant smaller for the natural motion of a free system than for any other possible motion which coincides with it in position and velocity at a particular instant considered. (Hertz 1894, §388)
This may seem a redundant formulation of the fundamental law, and indeed for free systems it is. However, as will become clear in the next section, Gauss’s principle will continue to hold true for non-free systems, for which the fundamental law or the law of least acceleration do not hold. Having defined the constraint of a system, Hertz could formulate and prove d’Alembert’s principle: The direction of the constraint in the natural motion of a free system is constantly perpendicular to every possible or virtual displacement of the system from its instantaneous position (Hertz 1894, §392)
Indeed, as remarked above, for a free system the constraint is equal to its acceleration. The latter is, according to eqn (17.12), equal to
$$ f_\rho = - \frac{1}{m} \sum_{\chi=1}^{k} q_{\chi\rho}\, Q_\chi, \tag{17.20} $$
and Hertz proved that a vector quantity with components $k_\rho$ along $q_\rho$ can be written in the form
$$ k_\rho = \sum_{\chi=1}^{k} q_{\chi\rho}\, \gamma_\chi \tag{17.21} $$
if and only if it is orthogonal to any virtual displacement of the system, i.e. a displacement satisfying all the eqns (15.2) of connection. In fact, let $\delta q_\rho$ denote the component along $q_\rho$ of a virtual displacement. If we multiply the $\chi$-th equation of connection (eqn (15.2) with $dq_\rho$ replaced by $\delta q_\rho$) by $\gamma_\chi$ and add them all up we get:
$$ \sum_{\rho=1}^{r} k_\rho\, \delta q_\rho = \sum_{\rho=1}^{r} \sum_{\chi=1}^{k} q_{\chi\rho}\, \gamma_\chi\, \delta q_\rho = \sum_{\chi=1}^{k} \gamma_\chi \sum_{\rho=1}^{r} q_{\chi\rho}\, \delta q_\rho = 0. \tag{17.22} $$
When we recall that the components of a vector quantity correspond to the reduced (covariant) components of a displacement, we see from eqn (14.8) that eqn (17.22) implies that $k_\rho$ (or $f_\rho$) is orthogonal to any virtual displacement. This completes Hertz’s proof of d’Alembert’s principle. As eqn (17.22) indicates, the principle can analytically be expressed by:
$$ \sum_{\rho=1}^{r} f_\rho\, \delta q_\rho = 0. \tag{17.23} $$
From this principle Hertz deduced that the acceleration of a free system along a free coordinate is always zero, and by choosing six suitable free coordinates of absolute position (the rectangular coordinates of the center of mass, and three suitable angular coordinates) he could deduce that the total linear momentum and the angular momentum around the center of mass of a free system are conserved during a natural motion. These are the principle of the center of gravity (or mass) and the principle of areas, respectively, in Hertz’s nineteenth-century wording (see Chapter 2).
18 Cyclic coordinates
In Hertz’s account of conservative systems the concept of a cyclic coordinate, and, in particular, what he called an adiabatic cyclical system enters as an important technical tool. In this chapter I shall explain the meaning of these concepts as understood in the ordinary image of mechanics. I shall first give an account of the historical development of the concept and its important mechanical and mathematical properties. Then I shall give an example of a simple mechanical system that has a cyclic coordinate. This example can then serve as a simple standard example to which I shall refer in later chapters when I want to give a concrete example of some of the properties of the systems discussed by Hertz. Hertz himself did not give any examples, but I have found it suggestive to complement his entirely general theoretical considerations with a simple example. I do not suggest that Hertz himself primarily had such simple systems in mind when he wrote his Mechanics. The example is only meant as a didactical device. At the end of this chapter I shall discuss Helmholtz’s use of cyclic motion in his mechanical model of thermodynamics. Helmholtz’s papers on this topic are important for the present book because they were the point of departure for Hertz’s ideas on cyclic systems.
18.1 Routh and modified Lagrangians
A generalized coordinate of a mechanical system is called a cyclic coordinate if it does not enter explicitly into the expressions of the kinetic or potential energy of the system – or almost equivalently in its Lagrangian or Hamiltonian. For example, if a change of the coordinate does not change the mass distribution of the system but only cyclically permutes the masses (e.g. the turning of a wheel), then the coordinate is cyclic. This is the origin of the name. The importance of such coordinates was discovered in several steps. First, it was observed that if the Hamiltonian $H(q_1, \ldots, q_r, p_1, \ldots, p_r)$ does not contain a certain generalized coordinate $q_\rho$, Hamilton’s equations
$$ \frac{dq_\rho}{dt} = \frac{\partial H}{\partial p_\rho}, \qquad \frac{dp_\rho}{dt} = - \frac{\partial H}{\partial q_\rho}, \qquad \rho = 1, 2, \ldots, r, \tag{18.1} $$
imply that the conjugate generalized momentum $p_\rho(t)$ is a constant during the motion. This simple way to determine integrals of motion was used by Joseph Liouville (see (Lützen 1990, p. 707)) around 1850. A deeper understanding was obtained by Edward John Routh (Routh 1877b) in connection with the Lagrangian formalism: Let $T$ and $V$ represent the kinetic and potential energy, respectively, of an isolated conservative mechanical system considered as functions of its generalized coordinates $q_\rho$ and their time derivatives $\dot q_\rho$ ($\rho = 1, 2, \ldots, r$). Then, as explained in Chapter 2, Lagrange had shown (Lagrange 1788) that the equations of motion can be written (see eqn (2.13))
$$ \frac{d}{dt} \frac{\partial T}{\partial \dot q_\rho} - \frac{\partial T}{\partial q_\rho} = P_\rho, \tag{18.2} $$
where $P_\rho$ is the force in the direction of the coordinate $q_\rho$. If we express the force as minus the gradient of the potential energy we get
$$ \frac{d}{dt} \frac{\partial T}{\partial \dot q_\rho} - \frac{\partial T}{\partial q_\rho} + \frac{\partial V}{\partial q_\rho} = 0, \qquad (\rho = 1, 2, \ldots, r), \tag{18.3} $$
or
$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q_\rho} - \frac{\partial L}{\partial q_\rho} = 0, \qquad (\rho = 1, 2, \ldots, r), \tag{18.4} $$
where the Lagrangian $L$ is defined by
$$ L = T - V. \tag{18.5} $$
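If the Lagrangian does not depend explicitly on some of the coordinates $q_{k+1}, q_{k+2}, \ldots, q_r$, but only on their time derivatives, Lagrange’s equations (18.4) imply that
$$ \frac{\partial L}{\partial \dot q_\rho} = c_\rho, \qquad (\rho = k + 1, k + 2, \ldots, r). \tag{18.6} $$
This corresponds to the observation made above concerning the Hamiltonian formalism. In this case, Routh suggested (Routh 1877b, §20) that one could use eqns (18.6) to express $\dot q_\rho$ ($\rho = k + 1, k + 2, \ldots, r$) in terms of the constants $c_\rho$. If we substitute these expressions into the expression for $L$ we obtain a function $L_2 = L_2(q_1, q_2, \ldots, q_k, c_{k+1}, \ldots, c_r)$ whose partial derivatives with respect to $\dot q_\rho$ and $q_\rho$ for $\rho = 1, 2, \ldots, k$ can be written as
$$ \frac{\partial L_2}{\partial \dot q_\rho} = \frac{\partial L}{\partial \dot q_\rho} + \frac{\partial L}{\partial \dot q_{k+1}} \frac{\partial \dot q_{k+1}}{\partial \dot q_\rho} + \cdots + \frac{\partial L}{\partial \dot q_r} \frac{\partial \dot q_r}{\partial \dot q_\rho} \tag{18.7} $$
or
$$ \frac{\partial L_2}{\partial \dot q_\rho} = \frac{\partial L}{\partial \dot q_\rho} + c_{k+1} \frac{\partial \dot q_{k+1}}{\partial \dot q_\rho} + \cdots + c_r \frac{\partial \dot q_r}{\partial \dot q_\rho}, \tag{18.8} $$
and
$$ \frac{\partial L_2}{\partial q_\rho} = \frac{\partial L}{\partial q_\rho} + c_{k+1} \frac{\partial \dot q_{k+1}}{\partial q_\rho} + \cdots + c_r \frac{\partial \dot q_r}{\partial q_\rho}. \tag{18.9} $$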
Therefore, if we follow Routh and make a so-called Legendre transform, defining the modified Lagrangian $\bar L$ by
$$ \bar L = L_2 - c_{k+1} \dot q_{k+1} - c_{k+2} \dot q_{k+2} - \cdots - c_r \dot q_r, \tag{18.10} $$
and again use eqn (18.6) to eliminate $\dot q_{k+1}, \ldots, \dot q_r$, this function will, by eqns (18.4), (18.8) and (18.9), satisfy the equations
$$ \frac{d}{dt} \frac{\partial \bar L}{\partial \dot q_\rho} - \frac{\partial \bar L}{\partial q_\rho} = 0 \qquad (\rho = 1, 2, \ldots, k), \tag{18.11} $$
i.e. equations exactly similar to Lagrange’s equations (18.4). As pointed out by Routh, this transformation corresponds to a partial use of Hamilton’s transformation of Lagrange’s equations, the constants cρ being precisely the momenta conjugate to qρ .
18.2 Hidden cyclic motion. J.J. Thomson
Routh used his modified Lagrangian in a study of ‘stability of motion’ (Routh 1877b) (see also (Routh 1877a)), but with Helmholtz ((Helmholtz 1884) and (Helmholtz 1886)) and J.J. Thomson ((Thomson 1886), (Thomson 1888)) the technique got a new twist in that they considered the cyclic coordinates $q_{k+1}, \ldots, q_r$ as hidden, unobservable coordinates. [Footnote 1: As Topper pointed out in his analysis of Thomson’s early commitment to mechanistic philosophy (Topper 1971), Thomson’s ideas about hidden cyclic motion went back to his thesis.] Suppose we are given a mechanical (or another type of physical) system that we describe by way of the coordinates $q_1, q_2, \ldots, q_k$. We can then, through geometric considerations, determine its apparent kinetic energy $T$ as a quadratic form in the $\dot q_i$s. If, moreover, we can determine a function $\bar L$ such that the motion of the system is described by eqn (18.11), we will conclude that $\bar L$ is the Lagrangian of the system, and hence that $V = T - \bar L$ is the potential energy of the system that in turn describes the forces on the system. However, we may have been deceived by our lack of knowledge of a certain number of hidden cyclic coordinates so that what we believed was the Lagrangian was in fact only the modified Lagrangian. What we mistook for potential energy may therefore have been the result of kinetic energy due to the hidden coordinates. To be more precise, let us with J.J. Thomson (Thomson 1888) consider a complete system as in eqns (18.4)–(18.11), which has only kinetic energy $L = T$, and such that $T$ can be divided into
$$ T = T_1 + T_{\mathrm{cycl}}, \tag{18.12} $$
where $T_1$ is a quadratic form of $\dot q_1, \dot q_2, \ldots, \dot q_k$ and $T_{\mathrm{cycl}}$ is a quadratic form in the remaining cyclic velocities. According to Euler’s fundamental theorem about homogeneous functions we have
$$ L = T = \frac{1}{2} \sum_{\rho=1}^{r} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho, \tag{18.13} $$
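and thus, according to eqns (18.10) and (18.12), the modified Lagrangian takes the form
$$ \bar L = \frac{1}{2} \sum_{\rho=1}^{r} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho - \sum_{\rho=k+1}^{r} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho = \frac{1}{2} \sum_{\rho=1}^{k} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho - \frac{1}{2} \sum_{\rho=k+1}^{r} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho = T_1 - \bar T_{\mathrm{cycl}}, \tag{18.14} $$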
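where $\bar T_{\mathrm{cycl}}$ indicates that $\dot q_{k+1}, \ldots, \dot q_r$ have been expressed as functions of $c_\rho$ and $q_1, q_2, \ldots, q_k$ by way of eqn (18.6) or equivalently by:
$$ \frac{\partial T_{\mathrm{cycl}}}{\partial \dot q_\rho} = c_\rho, \qquad \rho = k + 1, \ldots, r. \tag{18.15} $$
By Routh’s modified Lagrangian equation (18.11) we therefore have
$$ \frac{d}{dt} \frac{\partial}{\partial \dot q_\rho} \left( T_1 - \bar T_{\mathrm{cycl}} \right) - \frac{\partial}{\partial q_\rho} \left( T_1 - \bar T_{\mathrm{cycl}} \right) = 0, \qquad \rho = 1, 2, \ldots, k. \tag{18.16} $$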
These are the equations of motion of a free mechanical system described by the coordinates $q_1, \ldots, q_k$, and with kinetic energy $T_1$ and potential energy $\bar T_{\mathrm{cycl}}$. Indeed, the latter is, from eqn (18.15), a function of the variables $q_1, \ldots, q_k$ alone, independent of $\dot q_1, \ldots, \dot q_k$.
18.3 A simple standard example
In order to illustrate Thomson’s ideas let me apply them to a simple mechanical system that it will also be useful to keep in mind when discussing Hertz’s approach to mechanics. Consider a rod of length $\ell$ and moment of inertia $I$ swinging around its one end $O$ in the plane of the paper (Fig. 18.1), and let its other end $A$ be connected to a cyclical system similar to a centrifugal regulator consisting of a point mass $m$ rotating around the point $P$ (a position of the endpoint $A$) in a plane perpendicular to $OP$. The point mass $m$ is connected to $A$ by a massless string of length $2\ell$ passing through a pulley at $P$. We shall think of the rod, whose position is described by the angle $\omega = \angle AOP$, as a visible system, and the mass $m$ as a hidden system. The position of this hidden system is described by an angle $\varphi$ (see Fig. 18.1), and the distance $x = Pm$, or alternatively by $\varphi$ and $\omega$, where $x$ is expressed in terms of $\omega$ by the formula
$$ x = 2\ell - 2\ell \sin \tfrac{1}{2}\omega. \tag{18.17} $$
[Fig. 18.1. A simple standard example]
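The energy of the total system is given by
$$ E = T = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \dot x^2 + \tfrac{1}{2} m x^2 \dot\varphi^2 \tag{18.18} $$
$$ \phantom{E = T} = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \ell^2 \cos^2(\tfrac{1}{2}\omega)\, \dot\omega^2 + 2 m \ell^2 (1 - \sin \tfrac{1}{2}\omega)^2 \dot\varphi^2. \tag{18.19} $$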
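We see that
1. $\varphi$ is a cyclic coordinate of the system since it does not enter into the expression of the energy and
2. the energy can be divided as assumed by Thomson into a quadratic form $T_1 = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \ell^2 \cos^2(\tfrac{1}{2}\omega)\, \dot\omega^2$ of $\dot\omega$ and a quadratic form $T_{\mathrm{cycl}} = 2 m \ell^2 (1 - \sin \tfrac{1}{2}\omega)^2 \dot\varphi^2$ of $\dot\varphi$.
Following Routh and Thomson we shall express the last term in terms of the conserved momentum $p = p_\varphi$ conjugate to $\varphi$:
$$ p = \frac{\partial T}{\partial \dot\varphi} = 4 m \ell^2 \left( 1 - \sin \tfrac{1}{2}\omega \right)^2 \dot\varphi, \tag{18.20} $$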
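so that
$$ L = E = T = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \ell^2 \cos^2(\tfrac{1}{2}\omega)\, \dot\omega^2 + \frac{p^2}{8 m \ell^2 (1 - \sin \tfrac{1}{2}\omega)^2}. \tag{18.21} $$
Thus, the modified Lagrangian is given by
$$ \bar L = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \ell^2 \cos^2(\tfrac{1}{2}\omega)\, \dot\omega^2 - \frac{p^2}{8 m \ell^2 (1 - \sin \tfrac{1}{2}\omega)^2}, \tag{18.22} $$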
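and the visible system will move as though it had kinetic energy given by
$$ T_1 = \tfrac{1}{2} I \dot\omega^2 + \tfrac{1}{2} m \ell^2 \cos^2(\tfrac{1}{2}\omega)\, \dot\omega^2, \tag{18.23} $$
and potential energy given by
$$ U = \bar T_{\mathrm{cycl}} = \frac{p^2}{8 m \ell^2 (1 - \sin \tfrac{1}{2}\omega)^2}. \tag{18.24} $$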
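The algebra leading from eqn (18.19) to eqn (18.24) can be checked numerically. The sketch below is purely illustrative: the function names and the sample parameter values are my own assumptions, and `ell` stands for the rod length. It evaluates the cyclic kinetic energy once as a quadratic form in the hidden velocity and once as a function of the conserved momentum, and confirms that the two agree.

```python
import math


def cyclic_momentum(m, ell, omega, phi_dot):
    """Momentum conjugate to phi, eqn (18.20): p = 4 m l^2 (1 - sin(omega/2))^2 phi_dot."""
    return 4.0 * m * ell ** 2 * (1.0 - math.sin(omega / 2.0)) ** 2 * phi_dot


def t_cycl_from_velocity(m, ell, omega, phi_dot):
    """Cyclic kinetic energy as a quadratic form in phi_dot (last term of eqn (18.19))."""
    return 2.0 * m * ell ** 2 * (1.0 - math.sin(omega / 2.0)) ** 2 * phi_dot ** 2


def apparent_potential(m, ell, omega, p):
    """Apparent potential energy of the visible rod, eqn (18.24)."""
    return p ** 2 / (8.0 * m * ell ** 2 * (1.0 - math.sin(omega / 2.0)) ** 2)


# Illustrative values only (not taken from the text)
m, ell, omega, phi_dot = 0.5, 1.2, 0.7, 3.0
p = cyclic_momentum(m, ell, omega, phi_dot)
print(t_cycl_from_velocity(m, ell, omega, phi_dot), apparent_potential(m, ell, omega, p))
```

For fixed $p$ the apparent potential grows as $\omega$ increases from 0 toward $\pi$, since the radius $x$ of the hidden circular motion shrinks; to an observer who sees only the rod this looks like a force resisting the increase of $\omega$.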
From his general considerations, Thomson concluded (somewhat rashly): Thus we may look on the potential energy of any system as kinetic energy arising from the motion of systems connected with the original system – the configurations of these systems being capable of being fixed by kinesthetic or speed coordinates. [Thomson’s terms for cyclic coordinates] Thus from this point of view all energy is kinetic, and all terms in the Lagrangian function express kinetic energy, the only thing doubtful being whether the kinetic energy is due to the motion of ignored or positional coordinates; this can however be determined at once by inspection. (Thomson 1888, p. 14)
In this way, the widespread British feeling (see Chapters 3 and 4) that action at a distance, or equivalently potential energy, ought to be explainable through mechanical systems in motion (i.e. kinetic energy), was combined with Routh’s mathematical formalism of cyclic coordinates and modified Lagrangians to yield an abstract method for explaining forces kinetically, sidestepping thereby the problem of describing the hidden motions in all mechanical detail. It was precisely this approach Hertz chose in his Mechanics.
18.4 Helmholtz on adiabatic cyclic systems
However, although Hertz was obviously well aware of the general tendencies in British physics, he claimed in the preface to his book that he did not know of Thomson’s particularly clear anticipation of his approach to mechanics until his own research was well advanced. Instead, he built on Helmholtz’s two papers (Helmholtz 1884), (Helmholtz 1886). Helmholtz did not suggest that all energy might turn out to be kinetic energy, but he introduced the idea of adiabatic processes, i.e. processes where the non-cyclic observable coordinates change very slowly in comparison with the cyclic coordinates. This idea, which came to play a central role in Hertz’s mechanics, was important to Helmholtz’s account of thermodynamics. Helmholtz’s aim was to display a mechanical model of thermodynamics. The heat $dQ$ gained by a thermodynamical system when its temperature $\theta$ and its external parameters $q_i$ (e.g. the volume of a gas) are changed by infinitesimal amounts is described by the equation:
$$ dQ = dE - \sum_{i=1}^{k} P_i\, dq_i, \tag{18.25} $$
where $E$ is the total energy of the system, and $P_i$ denotes the external force tending to increase the parameter $q_i$. This means that the expression
$$ dQ_i = P_i\, dq_i \tag{18.26} $$
measures the energy gained by the system when $q_i$ is increased by an amount $dq_i$. The important fact about $dQ$ is that although it is not itself an exact differential of $q_1, \ldots, q_k$ and $\theta$, the temperature $\theta$ is an integrating denominator such that $dQ/\theta$ is an exact differential of a function $S(q_1, \ldots, q_k, \theta)$ called the entropy:
$$ dQ = \theta\, dS. \tag{18.27} $$
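Helmholtz emphasized that any function $\bar S$ of $S$ would do the trick equally well. Indeed, if we put
$$ \eta = \theta\, \frac{\partial S}{\partial \bar S} \tag{18.28} $$
we will have
$$ dQ = \theta\, dS = \theta\, \frac{dS}{d\bar S}\, d\bar S = \eta\, d\bar S. \tag{18.29} $$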
Thus, the temperature is not uniquely determined as the integrating denominator of $dQ$, but, according to Helmholtz, it is the only choice of $\eta$ that satisfies the important criterion: when two systems of equal value of $\theta$ are put into contact with each other no heat $dQ$ will flow from the one to the other. It is important to note that in this type of equilibrium thermodynamics, we consider the system determined entirely by $q_1, \ldots, q_k$ and $\theta$. That means that we consider $\dot q_1, \dot q_2, \ldots, \dot q_k$ as negligible so that the kinetic energy, which would be associated with the changes in the parameters, is neglected in comparison with the potential energy corresponding to these parameters and the energy stored as heat. Moreover, it is assumed that $\theta$ changes so slowly that the system can be considered as having the same temperature everywhere. Under these circumstances we say that the system changes adiabatically. Now, Helmholtz showed that one can mimic eqns (18.25) and (18.26) by assuming that the energy stored as heat is associated with a hidden cyclic motion. So, let us as above (eqns (18.6)–(18.16)) consider a mechanical system with $r$ degrees of freedom and $r - k$ cyclic coordinates, i.e. coordinates $q_{k+1}, \ldots, q_r$ that do not appear in the Lagrangian. However, as opposed to the system described by eqn (18.4), Helmholtz assumed that in addition to conservative inner forces described by the potential energy $V$, the system is influenced by external (not necessarily conservative) forces $P_i$ tending to increase $q_i$. Then eqn (18.4) must be replaced by the more general Lagrangian equation:
$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q_\rho} - \frac{\partial L}{\partial q_\rho} = P_\rho, \qquad \rho = 1, 2, \ldots, r. \tag{18.30} $$
The momenta
$$ p_\rho = \frac{\partial L}{\partial \dot q_\rho}, \qquad \rho = k + 1, \ldots, r \tag{18.31} $$
conjugate to the cyclic coordinates $q_\rho$ are no longer constants but we have
$$ P_\rho = \frac{d}{dt}\, p_\rho, \qquad (\rho = k + 1, \ldots, r). \tag{18.32} $$
Thus, the energy gained by the system when $q_\rho$ is increased by $dq_\rho$ can be written (see eqn (18.26)):
$$ dQ_\rho = P_\rho\, dq_\rho = P_\rho\, \dot q_\rho\, dt = \dot q_\rho\, dp_\rho, \qquad (\rho = k + 1, \ldots, r). \tag{18.33} $$
In conformity with the thermodynamic situation Helmholtz now assumed that the non-cyclic coordinates $q_1, \ldots, q_k$ change adiabatically, i.e. that $\dot q_1, \ldots, \dot q_k$ and $\ddot q_{k+1}, \ldots, \ddot q_r$ are negligible in comparison with $\dot q_{k+1}, \ldots, \dot q_r$. This will happen if the external forces are very close to the values that will keep $q_1, \ldots, q_k$ constant. For example (as mentioned by Helmholtz in (Helmholtz 1886, p. 148) but not in (Helmholtz 1884)) if the kinetic energy can be divided as in eqn (18.12) and if the external forces $P_1, P_2, \ldots, P_k$ are zero, then there are motions of the system for which $q_1, \ldots, q_r$ are constant. The assumption that the motion is adiabatic has several important consequences. First, the state of the system is described by $q_1, \ldots, q_k, \ldots, q_r$ and $\dot q_{k+1}, \ldots, \dot q_r$, and since $L$ does not depend on $q_{k+1}, \ldots, q_r$ the momenta $p_\rho$ will be functions of $q_1, \ldots, q_k, \dot q_{k+1}, \ldots, \dot q_r$. Secondly, for the non-cyclic coordinates the first term of eqn (18.30) vanishes compared with the second term so that the Lagrangian works as a sort of potential function for the external forces $P_\rho$:
$$ - \frac{\partial L}{\partial q_\rho} = P_\rho, \qquad \rho = 1, 2, \ldots, k. \tag{18.34} $$
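Thirdly, since the kinetic energy is approximately a quadratic form in the cyclic velocities alone, the other velocities being approximately zero, we have
$$ T = \frac{1}{2} \sum_{\rho=k+1}^{r} \frac{\partial T}{\partial \dot q_\rho}\, \dot q_\rho = \frac{1}{2} \sum_{\rho=k+1}^{r} \frac{\partial L}{\partial \dot q_\rho}\, \dot q_\rho. \tag{18.35} $$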
Finally, the total energy
$$ E = T + V = 2T - L = \sum_{\rho=k+1}^{r} \dot q_\rho\, \frac{\partial L}{\partial \dot q_\rho} - L \tag{18.36} $$
is a function of $q_1, \ldots, q_k, \dot q_{k+1}, \ldots, \dot q_r$ whose differential can, according to eqn (18.32), be written
$$ dE = \sum_{\rho=1}^{r} P_\rho\, dq_\rho = \sum_{\rho=1}^{k} P_\rho\, dq_\rho + \sum_{\rho=k+1}^{r} \dot q_\rho\, dp_\rho. \tag{18.37} $$
Helmholtz considered, in particular, such systems for which the cyclic motions are all determined by one parameter. He called such systems monocyclic. Let us consider the simplest case where there is only one cyclic motion described by $q_r$. In this case the energy transformation equation (18.37) can be written
$$ dQ_r = dE - \sum_{\rho=1}^{k} P_\rho\, dq_\rho, \tag{18.38} $$
where
$$ dQ_r = \dot q_r\, dp_r. \tag{18.39} $$
This is in complete accordance with eqns (18.25) and (18.27) if we interpret the work $dQ_r$ stored in the cyclic motion as a measure of heat, and let $\dot q_r$ and $p_r$ play the roles of temperature and entropy, respectively. As remarked above in the section on thermodynamics, an integrating denominator such as $\dot q_r$ is not unique, and in the case treated here, the kinetic energy of the system is itself an integrating denominator. Indeed from eqn (18.35) we have
$$ T = \frac{1}{2} \frac{\partial L}{\partial \dot q_r}\, \dot q_r = \frac{1}{2}\, \dot q_r\, p_r, \tag{18.40} $$
and therefore we obtain
$$ dQ = T\, dS, \tag{18.41} $$
if we choose $S$ such that
$$ \dot q_r\, dp_r = \tfrac{1}{2}\, \dot q_r\, p_r\, dS, \tag{18.42} $$
or
$$ dS = 2\, \frac{dp_r}{p_r}, \tag{18.43} $$
or
$$ S = 2 \log \frac{p_r}{A}, \tag{18.44} $$
where $A$ is a constant. This is, according to Helmholtz, the choice of integrating denominator that will have the property that two systems having the same value of $T$ will remain in equilibrium if coupled with each other. It is therefore the kinetic energy that plays the role of the temperature in Helmholtz’s mechanical model, in conformity with the kinetic theory of gases as developed by Boltzmann. One may also remark that $L$ (according to eqn (18.36)) plays the role of Gibbs’s free energy. As mentioned above, Helmholtz did not suggest that heat was in reality due to one hidden cyclic coordinate. Rather, he thought that it was probably a result of an elimination of many so-called adiabatic coordinates (see Helmholtz 1886, p. 157). Such eliminations can be done in a way similar to Routh’s elimination of cyclic coordinates (see (Helmholtz 1884, pp. 130–131) and (Helmholtz 1886, pp. 148–149)).
However, my research on combined monocyclic systems [systems with many cyclic motions described by one parameter (Helmholtz 1884)] have shown that also more complicated systems in motion, which will be more similar to the inner molecular motions of hot bodies, can lead to the same theorems. (Helmholtz 1886, p. 157)
18.5 What is new in Hertz’s Mechanics?
In his book Hertz acknowledged his debt to Helmholtz’s papers: Both in its broad features and in its details my own investigation owes much to the abovementioned papers (Helmholtz 1884, 1886); the chapter on cyclical systems is taken almost directly from them. (Hertz 1894, Preface)
In view of Helmholtz’s and J.J. Thomson’s works we may ask: what was new in Hertz’s Mechanics? As far as Helmholtz is concerned Hertz answered the question himself: Apart from matters of form, my own solution differs from that of von Helmholtz chiefly in two respects. Firstly I endeavour from the start to keep the elements of mechanics free from that which von Helmholtz only removes by subsequent restriction from the mechanics previously developed. Second, in a certain sense I eliminate less from mechanics, inasmuch as I do not rely upon Hamilton’s principle or any other integral principle. (Hertz 1894, Vorwort/Preface)
What Hertz says here about Helmholtz can be said of J.J. Thomson’s explicit, and other British physicists’ more implicit, idea that all energy might be kinetic so that potential energy (and thus forces) are merely apparent fictions due to our lack of knowledge of the finer details of nature. No one had tried to build mechanics up from the ground on this assumption. This was left to Hertz. Moreover, in Hertz’s Mechanics the cyclic coordinates are not just mathematical parameters; they represent coordinates of a system of hidden point masses coupled with the visible system under consideration. Hertz did not emphasize this distinctive character of his account of forces, perhaps because the contemporary physicists had implicitly made similar assumptions about the cyclic, hidden motions. However, the great difference between just introducing a new cyclic coordinate and insisting that it be a real coordinate of a hidden system is made quite clear by a simple observation by Roger Liouville from 1892.
18.6 R. Liouville: One cyclic coordinate suffices
R. Liouville (Liouville 1892) considered a conservative mechanical system $S$ described by the coordinates $q_1, \ldots, q_k$ having kinetic energy $T$ and potential energy $U$. He then constructed a new system $S_1$ described by $q_1, \ldots, q_k$ and one new cyclic coordinate $q_r$. The new system is assumed to have only kinetic energy $\bar T$ determined by
$$ \bar T = T + \frac{\dot q_r^2}{U}, \tag{18.45} $$
and no potential energy. The motion of $S_1$ will be governed by Lagrange’s equations (see eqn (18.4))
$$ \frac{d}{dt} \frac{\partial T}{\partial \dot q_\rho} - \frac{\partial T}{\partial q_\rho} + \frac{\dot q_r^2}{U^2} \frac{\partial U}{\partial q_\rho} = 0, \qquad \rho = 1, 2, \ldots, k, \tag{18.46} $$
and
$$ \frac{d}{dt} \left( \frac{\dot q_r}{U} \right) = 0. \tag{18.47} $$
From eqn (18.47) we conclude that $\dot q_r / U$ is a constant, and we consider those motions for which it is equal to 1. This means that the remaining eqns (18.46) are reduced to
$$ \frac{d}{dt} \frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} + \frac{\partial U}{\partial q_i} = 0, \qquad i = 1, 2, \ldots, k, \tag{18.48} $$
which are precisely the same as Lagrange’s equations (18.3) for the system $S$. Thus, by an analysis that is the reverse of J.J. Thomson’s argument, R. Liouville showed that in any conservative mechanical system one can assume that the potential energy is due to kinetic energy associated with just one hidden cyclic coordinate. Hertz almost certainly did not learn about R. Liouville’s brief note, and even if he did, it would probably not have helped him much; indeed it is not clear how the new cyclic coordinate could be thought of as a coordinate of a hidden material system coupled with the system $S$. In fact, as pointed out by Hertz, it can only approximately be interpreted in this way (see Section 20.4).
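R. Liouville's construction can be tried out on a concrete case. The following is a hedged numerical sketch, not anything found in Liouville's note: I assume a one-dimensional visible system with $T = \tfrac{1}{2} M \dot q^2$ and the strictly positive potential $U(q) = 1 + \tfrac{1}{2} q^2$, and I integrate both the original equation (18.3) and the extended equations (18.46) and (18.47) with a standard fourth-order Runge-Kutta scheme. All names (`original`, `extended`, `rk4`, `compare`) and parameter values are illustrative assumptions.

```python
import math

M = 1.0  # mass of the visible system (illustrative value)


def U(q):
    """Assumed potential energy of the system S; strictly positive."""
    return 1.0 + 0.5 * q * q


def dU(q):
    """Derivative of U with respect to q."""
    return q


def original(y):
    """System S, state [q, q_dot]; eqn (18.3): M q_ddot + dU/dq = 0."""
    q, v = y
    return [v, -dU(q) / M]


def extended(y):
    """System S_1, state [q, q_dot, w], where w is the cyclic velocity q_r_dot.

    Eqn (18.46): M q_ddot + (w^2 / U^2) dU/dq = 0.
    Eqn (18.47): d/dt (w / U) = 0, i.e. w_dot = w (dU/dq) q_dot / U.
    """
    q, v, w = y
    return [v, -(w * w / U(q) ** 2) * dU(q) / M, w * dU(q) * v / U(q)]


def rk4(f, y, dt, steps):
    """Classical fourth-order Runge-Kutta integration of y' = f(y)."""
    for _ in range(steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
        k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
        k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
        y = [yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
    return y


def compare(q0=1.0, dt=1e-3, steps=5000):
    """Integrate S and S_1 from rest at q0, choosing w / U = 1 initially."""
    y_s = rk4(original, [q0, 0.0], dt, steps)
    y_s1 = rk4(extended, [q0, 0.0, U(q0)], dt, steps)
    return y_s[0], y_s1[0], y_s1[2] / U(y_s1[0])


q_s, q_s1, ratio = compare()
print(q_s, q_s1, ratio)
```

Within the accuracy of the integrator the visible coordinate follows the same motion in both systems, and the ratio $\dot q_r / U$ stays at its initial value 1, as eqn (18.47) requires.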
19 Unfree systems. Forces
In chapter four of the second book of the Principles of Mechanics Hertz began the discussion of the motion of unfree systems. He made the crucial assumption that every unfree system is ‘a portion of a more extended free system’ (a so-called partial system) (Hertz 1894, §429). When Hertz considered an unfree system as a part of a free system ‘it is assumed that the rest of the system is more or less unknown, so that an immediate application of the fundamental law is impossible.’ The question is then how to take the influence of the rest of the system into account without knowing its motion in detail. Hertz mentioned two cases where this can be done in different ways. In the first case the rest of the system ‘perform a determinate and prescribed motion’ (Hertz 1894, §431ff.). Hertz introduced the term ‘guided system’ for this situation. The second, more important case, concerned ‘systems acted on by forces’ (Hertz 1894, §450ff.).
19.1 Guided systems

Consider a free system consisting of two partial systems, of which we want to describe the motion of the first while we are partially ignorant about the second. Denote the coordinates of the first partial system by $q_1, q_2, q_3, \dots, q_s$ and the coordinates of the second system by $\tilde q_1, \tilde q_2, \tilde q_3, \dots, \tilde q_{\tilde s}$.¹ The equations of connection (15.2) can then be written as

\[ \sum_{\rho=1}^{s} q_{\chi\rho}\, dq_\rho + \sum_{\rho=1}^{\tilde s} \tilde q_{\chi\rho}\, d\tilde q_\rho = 0, \qquad \chi = 1, 2, \dots, k, \tag{19.1} \]

where the $q_{\chi\rho}$s and the $\tilde q_{\chi\rho}$s are functions of all the coordinates, both the $q_\rho$s and the $\tilde q_\rho$s. If the second system performs a ‘determinate and prescribed motion’, Hertz called the first system a guided system. In that case, the $\tilde q_\rho$s are known functions of $t$, and if one inserts these functions into the equations of connection (19.1) they take on the form

\[ \sum_{\rho=1}^{s} q_{\chi\rho}\, dq_\rho + q_{\chi t}\, dt = 0, \qquad \chi = 1, 2, \dots, k, \tag{19.2} \]

where the $q_{\chi\rho}$s and $q_{\chi t}$ are now functions of the $q_\rho$s and $t$ alone. Thus, a guided system can be considered as an abnormal system in the terminology of (Hertz 1894, §117) (Section 15.2), i.e. a system whose connections depend explicitly on time. Conversely, Hertz considered every abnormal system as a guided system. This is a strong claim that Hertz needed to make in order to save the correctness of his image of mechanics.

Since the fundamental law is formulated for normal systems only, it does not apply directly to guided systems. Yet Hertz was able to show that many of the differential and integral principles he had derived for free systems still hold for guided systems. For example, the law of least acceleration discussed at the end of Chapter 16 still holds. Indeed, according to the rules for calculating with vector quantities we have

\[ m' f'^{2} = m f^{2} + \tilde m \tilde f^{2}, \tag{19.3} \]

where, following Hertz, I have used primes to denote the quantities of the total system. Now, according to the law of least acceleration applied to the total free system, the natural motion will minimize $f'$, and since we assume that the second subsystem performs a prescribed motion, $\tilde f$ is constant. Therefore, the natural motion will minimize the acceleration $f$ of the guided system. Similarly, and for almost the same reasons, Gauss’s principle of least constraint and d’Alembert’s principle hold for guided systems, and the equations of motion have the same form as for free systems. On the other hand, energy conservation clearly does not hold.

Guided systems are of some interest in dealing with some practical mechanical problems. Of more fundamental interest are the systems acted on by forces, to which we shall now turn.

¹ Hertz used German types.
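That energy is not conserved in a guided system can be seen in a textbook case that is not taken from Hertz: a bead sliding without friction on a straight rod that is made to rotate in a horizontal plane with constant angular velocity. The rod’s prescribed rotation plays the part of the second partial system, and the bead is the guided system. The sketch below assumes illustrative values (bead mass m, angular velocity omega, initial distance r0, bead initially at rest relative to the rod) and uses the closed-form solution of r'' = omega² r; the function name is hypothetical.

```python
import math

def bead_energy(t, r0=1.0, omega=2.0, m=1.0):
    """Kinetic energy of a bead on a uniformly rotating rod at time t.

    Assumed setup: radial equation r'' = omega**2 * r with r(0) = r0 and
    r'(0) = 0, whose solution is r(t) = r0 * cosh(omega * t).
    """
    r = r0 * math.cosh(omega * t)
    r_dot = r0 * omega * math.sinh(omega * t)
    # radial part plus the part due to the prescribed rotation
    return 0.5 * m * (r_dot ** 2 + (omega * r) ** 2)

for t in (0.0, 0.5, 1.0):
    print(t, bead_energy(t))
```

In this example the energy equals (1/2) m r0² omega² cosh(2 omega t), so it grows steadily: the prescribed motion of the rod keeps feeding energy into the guided bead.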
19.2 Systems acted on by forces

Hertz introduced the concept of force impressed on an unfree system only in cases where it is ‘coupled’ to another system; that means in his terminology that the coordinates of the two partial systems can be so chosen that the only equations of constraint involving both partial systems state that one or more coordinates of the first system are always equal to one or more coordinates of the second (Hertz 1894, §450, 451), i.e.

\[ \tilde q_\rho - q_\rho = 0, \qquad \rho = 1, 2, \dots, r_1, \tag{19.4} \]

or

\[ \dot{\tilde q}_\rho - \dot q_\rho = 0, \qquad \rho = 1, 2, \dots, r_1, \tag{19.5} \]

where $r_1 \le r$. In addition the first system may be constrained by equations of the form (15.2) and the second by similar equations in the tilded variables. Here, and in the following discussion, it may be helpful to keep the simple standard example (Section 18.3) in mind. In that example the concealed system is coupled to the visible system because the connection can be explained by the use of the common coordinate $\omega$. In general, if, as above, we use primes to indicate quantities of the total system consisting of the two coupled systems, (17.2) gives directly the following equations of motion

\[ m' f'_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} Q_\chi = 0, \qquad \rho = r_1 + 1, \dots, r, \tag{19.6} \]

for the coordinates of the first system that are not coupled to the second. Here, $f'_\rho$ is a vector quantity with respect to the total system. It can also be considered a vector quantity $f_\rho$ with respect to the first partial system if, according to eqn (14.10), we put $m' f'_\rho = m f_\rho$. Hence

\[ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} Q_\chi = 0, \qquad \rho = r_1 + 1, \dots, r. \tag{19.7} \]
In order to find the equations of motion connected with the coordinates constrained by eqn (19.4) or (19.5), Hertz multiplied the derivative of the left-hand side of eqn (19.5) with respect to $\dot q_\rho$ (i.e. $-1$) with a Lagrange multiplier $P_\rho$ and added it to the corresponding eqn (17.12); so for these coordinates he obtained

\[ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} Q_\chi - P_\rho = 0, \qquad \rho = 1, 2, \dots, r_1. \tag{19.8} \]

The motion of the total system and thus of the first partial system is now determined by

a. the equations (19.7) and (19.8),
b. the corresponding equations for the second partial system in the tilded variables,
c. the equations of constraint for each of the two partial systems,
d. the equations of constraint (19.4) involving both of the partial systems.

At this point, Hertz changed the point of view drastically. Instead of considering the $P_\rho$s as unknown multipliers that must be determined by solving the above-mentioned system (a–d) of equations, he considered them to be known functions of time. In that case, any other knowledge of the second system is unnecessary for the description of the motion of the first system. One just has to solve the equations of constraint of the first partial system (15.2) together with eqns (19.7) and (19.8), or equivalently,

\[ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} Q_\chi = P_\rho, \qquad \rho = 1, 2, \dots, r, \tag{19.9} \]

where we have put $P_\rho = 0$ for the uncoupled coordinates $q_\rho$, $\rho = r_1 + 1, \dots, r$.
Hertz could now define the force exerted by the second system on the first to be the aggregate of the $P_\rho$s and $P_\rho$ to be the component of the force along $q_\rho$ (Hertz 1894, §455, 460). He extended this definition to a system coupled to several other systems and discussed composition or summation of forces (§471–480). In particular, he was able to show that the components of a force along the rectangular coordinates of a system can be considered elementary forces (i.e. forces that are exerted by a single material point or on a single material point), and that any force can be considered as a sum of these elementary forces. Hertz also devoted a section (§517–530) to statics, but I shall not discuss this subject further.

With his concept of force in place, Hertz could show a series of theorems that were, in one sense or another, familiar from ordinary mechanics. For example, he showed that ‘if a force is perpendicular to every possible displacement of a material system then it has no effect on the motions of the system’ (Hertz 1894, §488). The proof is easy: Hertz had shown (eqn (17.21)) that the components $\pi_\rho$ of such a force $\pi$ (or vector quantity) can be written in the form

\[ \pi_\rho = \sum_{\chi=1}^{k} q_{\chi\rho}\, \gamma_\chi. \tag{19.10} \]

Thus, if the force $\pi$ acts on the system in addition to the force $P$, the equations of motion (19.9) can be written

\[ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} (Q_\chi - \gamma_\chi) = P_\rho, \qquad \rho = 1, 2, \dots, r. \tag{19.11} \]
The solution of this differential equation will, of course, lead to values of $Q_\chi$ that are increased by $\gamma_\chi$ compared to the solutions of eqn (19.9), but $f_\rho$ and therefore $\ddot q_\rho$, which alone determine the motion, will remain unaltered.

The preceding theorem shows that one cannot uniquely determine the forces acting on a system from the consideration of the motion of the system. The forces can only be determined modulo a force that is orthogonal to all possible displacements of the system. However, one can uniquely determine the components of the acting force in the direction of any possible displacement (§492). In particular, the component $P_\rho$ of the force exerted by a system along a free coordinate can be determined from the following expressions (§493)

\begin{align}
P_\rho &= -m f_\rho \tag{19.12}\\
&= \frac{\partial_q E}{\partial q_\rho} - \frac{d}{dt}\left(\frac{\partial_q E}{\partial \dot q_\rho}\right) \tag{19.13}\\
&= \frac{\partial_q E}{\partial q_\rho} - \dot p_\rho \tag{19.14}\\
&= -\frac{\partial_p E}{\partial q_\rho} - \dot p_\rho. \tag{19.15}
\end{align}
Here, the first equality follows from eqn (19.9) when we keep in mind that the $q_{\chi\rho}$ are zero for a free coordinate, and that the force exerted by a system is the negative of the force acting on the system. The second, third and fourth equalities follow from eqns (14.20), (14.17) and (14.23), respectively.

It is clear that a system acted on by forces does not satisfy the fundamental law, and contrary to a guided system it does not satisfy the law of least acceleration either. However, Hertz could prove that Gauss’s principle of least constraint does apply to such systems and so does d’Alembert’s principle (see Section 17.2). In the case of a system acted on by a force with components $P_\rho$ the constraint has the components $z_\rho = f_\rho - (P_\rho/m)$, so that d’Alembert’s principle has the analytic form

\[ \sum_{\rho=1}^{r} \left( f_\rho - \frac{P_\rho}{m} \right) \delta q_\rho = 0 \tag{19.16} \]

for all possible displacements $\delta q_\rho$ (see eqns (17.23) and (2.9)). Hertz also showed how to extend the principle of the center of gravity and the principle of areas to systems acted on by forces (Hertz 1894, §508–509). In connection with each of these principles Hertz pointed out that according to the ordinary way of looking at mechanics, the formulations of the principles for systems acted on by forces are the general ones, whereas the formulations for free systems are special cases. However, he emphasized that in his image of mechanics the relation is the converse: the principles formulated for free systems are the general ones, whereas the principles formulated for systems acted on by forces are consequences of them.

Hertz finally defined the work done by a force in a given time as the increase of energy of the system (Hertz 1894, §510), and showed that ‘the work which a force does on a system whilst it traverses an element of its path is equal to the product of the element and the component of the force in its direction’ (§512), or expressed analytically

\[ dE = \sum_{\rho=1}^{r} P_\rho\, dq_\rho. \tag{19.17} \]
At first glance it may seem an unnecessary restriction when Hertz only introduced the concept of force for a system coupled along certain coordinates to a second system. However, this limitation has the advantage that one can immediately define the components of the force along the common coordinates (the remaining components being zero), and, in particular, it allowed Hertz to prove that ‘force and counterforce are always equal and opposite’ (Hertz 1894, §468). The meaning of this law is that the force exerted by the first system on the second system has components that are the negative of the components $P_\rho$ of the force exerted by the second system on the first system (here, the former must be considered as a vector quantity of the first system). This is a limited version of Newton’s 3rd law. In Hertz’s Mechanics Newton’s 3rd law does not hold in a more extended sense. This is clear from the fact alone that vector quantities related to different systems cannot be compared except for the components along common coordinates (see Section 14.3). Hertz reckoned this limitation of the law of action and reaction as a merit of his mechanics because Newton’s 3rd law had been questioned when applied to electromagnetic actions at a distance (Hertz 1894, §470).

The limitation of the concept of forces to a system coupled to another system is, in fact, only an apparent limitation, since one can always conceive of a partial system as being coupled to the remaining part of the total free system through a machine. A machine is defined by Hertz to be ‘a system whose masses are considered vanishingly small in comparison with the masses of the systems with which it is coupled’ (Hertz 1894, §531). The energy of a machine is therefore vanishing so that it is entirely described by its equations of connection. If we divide any free system into two systems we can consider the first as coupled to the second through a machine; indeed we may divide the equations of constraint of the total system into three groups: those that only involve the coordinates of the first system, those that only involve the coordinates of the second, and those that involve coordinates of both. We can then consider the first group as constraints on the first system, the second group as constraints on the second and the third as constraints on a connecting machine. Thus, although Hertz’s approach does not, in general, allow one to speak of the forces exerted by the second system on the first, it is permissible to speak of the forces exerted by the machine or even by the machine and the second system on the first. In the simple standard example in Section 18.3 we can think of the rod as the visible system and the mass $m$ as the hidden system. The string is then the machine that connects the two.

In general, when a system is acted on by forces, the forces are explicit functions of time. Moreover, in most cases we are able, from the (forced) motion of the first known system, to ascertain the mass and even the change of coordinates of the second system (Hertz 1894, §598, 600).
It is therefore a matter of taste whether we will call the second system hidden. However, Hertz introduced a special case in which the second system remains truly hidden and in which the forces exerted on the first system can be expressed in terms of the coordinates of the first system alone. This is the case where the second system is a so-called adiabatic cyclical system.
20 Cyclic and conservative systems
The most interesting example of a system acted on by forces is the so-called conservative system in which a force function exists. This is a system consisting of a visible system coupled to a special kind of hidden system. In order to deal with such systems, Hertz first introduced the concept of an adiabatic cyclic system, which was intended to play the role as the second (hidden) subsystem in a conservative system.
20.1 Adiabatic cyclic systems

In accordance with his overall approach Hertz defined cyclic coordinates geometrically:

A free coordinate of a system is said to be cyclical when the length of an infinitesimal displacement of the system does not depend on the value of the coordinate but only on its change. (Hertz 1894, §546)
Since the energy is defined by the same quadratic form as the line element (14.16), this definition is equivalent to the usual definition of a cyclic coordinate as a coordinate that does not enter into the expression of the energy of the system (Section 18.1) (§548). More unusual is Hertz’s definition of a cyclic system as ‘a material system whose energy approximates sufficiently near to a homogeneous quadratic function of the rates of change of its cyclical coordinates’ (Hertz 1894, §549). It may be disturbing to find such an approximate definition in Hertz’s otherwise precise presentation. However, as mentioned by Hertz, the approximate character of the definition cannot be avoided since the energy of a system will, strictly speaking, involve the rate of change of all its coordinates. Indeed a change in a coordinate will necessarily displace at least one mass point and any motion of the system will add to the energy. The approximation is exactly the same as the one made by Helmholtz in his study of monocyclic systems (see Section 18.4). In accordance with Helmholtz’s terminology, Hertz used the term parameter to denote the non-cyclic coordinates.

Before we continue the discussion of the general cyclic system it is useful to consider the hidden system in the standard example in Section 18.3. Its energy is expressed by

\[ \tilde E = \tfrac12 m \ell^{2} \cos(\tfrac12 \omega)\, \dot\omega^{2} + 2 m \ell^{2} (1 - \sin \tfrac12 \omega)^{2}\, \dot\varphi^{2}. \tag{20.1} \]
Thus it is a cyclic system if $\dot\varphi$ is much larger than $\dot\omega$. Since the general expression (14.16) of the energy is a quadratic form in all the generalized velocities of the system, it is true, in general, that a system is cyclic if its cyclic coordinates have much larger rates of change than its parameters, corresponding to Helmholtz’s assumption that the parameters vary adiabatically.

In general, if we denote the parameters with ordinary letters and everything pertaining to the cyclic coordinates with tilded letters, we can, according to the definition of a cyclic system, write its energy in the form (see eqn (14.16))

\[ \tilde E = \frac12 \tilde m \sum_{\rho=1}^{\tilde r} \sum_{\sigma=1}^{\tilde r} \tilde a_{\rho\sigma}\, \dot{\tilde q}_\rho \dot{\tilde q}_\sigma = \frac{1}{2 \tilde m} \sum_{\rho=1}^{\tilde r} \sum_{\sigma=1}^{\tilde r} \tilde b_{\rho\sigma}\, \tilde p_\rho \tilde p_\sigma, \tag{20.2} \]

where $\tilde m$ is the mass of the cyclical system and where $\tilde a_{\rho\sigma}$ and $\tilde b_{\rho\sigma}$ are functions of the parameters $q_\rho$ alone but not of the cyclic coordinates $\tilde q_\rho$.

In analogy with thermodynamics, Hertz distinguished between two particularly interesting types of cyclic systems for which he could define a so-called force function: 1. adiabatic systems for which the cyclic momenta $\tilde p_\rho$ remain constant, and 2. isocyclic systems for which the generalized cyclic velocities $\dot{\tilde q}_\rho$ remain constant (Hertz 1894, §560). In order for a system to be an isocyclic system it is necessary for the cyclic coordinates to be coupled to other systems whose motions will impress forces that will keep the cyclic velocities constant. I shall not follow Hertz’s discussion of this case. It is more interesting to consider the adiabatic case because there the cyclic coordinates are not influenced by any forces. Indeed, from eqn (19.15) it follows that for a cyclic coordinate $\tilde q_\rho$, which by definition is free and does not enter into the expression of the energy, we have

\[ \tilde P_\rho = -\dot{\tilde p}_\rho. \tag{20.3} \]

Thus, requiring that $\tilde p_\rho$ is constant is equivalent to requiring that no forces act along the cyclic coordinates. In our standard example in Section 18.3 the hidden system is an adiabatic cyclic system with cyclic coordinate $\varphi$ and parameter $\omega$.

A force function is, according to Hertz, a function of the parameters with the property that its derivative with respect to any parameter is equal to minus¹ the force exerted² by the system along that parameter (Hertz 1894, §563). For an adiabatic cyclic system the third expression in eqn (14.16) for the energy $\tilde E$ is a force function. Indeed, since the cyclic momenta $\tilde p_\rho$ are constant in an adiabatic system this expression is a function of the parameters alone. Moreover, since the system is cyclic the energy $\tilde E$ does not contain the rate of change of the parameters, so that according to eqn (14.17) we have

\[ p_\rho = \frac{\partial_q \tilde E}{\partial \dot q_\rho} = 0. \tag{20.4} \]

Thus, if we, furthermore, assume that the parameters are also free coordinates we have, according to eqn (19.15),

\[ P_\rho = -\frac{\partial_p \tilde E}{\partial q_\rho} = -\frac{\partial_p}{\partial q_\rho} \left( \frac{1}{2 \tilde m} \sum_{\sigma=1}^{\tilde r} \sum_{\tau=1}^{\tilde r} \tilde b_{\sigma\tau}\, \tilde p_\sigma \tilde p_\tau \right), \tag{20.5} \]

which expresses the fact that $\tilde E$ is a force function for the adiabatic cyclic system. Hertz did not explicitly mention the assumption that the parameters are free coordinates. I shall return to this problem in Section 20.3. Of course, the force function is not uniquely determined by the definition, but we know that $U$ is a force function if and only if it is of the form $\tilde E + c$, where $c$ is a constant.

¹ Hertz, following many of his nineteenth-century predecessors, did not have a minus here. I shall follow modern practice. That means that I have changed sign in front of all expressions of the force function.
² For a cyclic system Hertz calculated the force exerted by the system rather than the force exerted on the system. The reason is that the cyclic system is intended to play the role of a hidden system coupled to a visible system. In that case it is, of course, the force exerted on the visible system, i.e. the force exerted by the cyclic system, that is of interest.
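For the hidden system of the standard example the force function can be written down explicitly. With the cyclic momentum p held constant, the cyclic part of the energy becomes U(ω) = p² / (8 m ℓ² (1 − sin ½ω)²), a function of the parameter ω alone. The following rough sketch, which is not part of Hertz’s or the author’s argument, evaluates this function and compares a finite-difference value of P = −dU/dω with the derivative worked out by hand. The function names, the default values p = m = ℓ = 1, and the use of the symbol ℓ for the length in the example are assumptions made for illustration.

```python
import math

def force_function(omega, p=1.0, m=1.0, ell=1.0):
    """Energy of the adiabatic cyclic system as a function of the parameter omega,
    with the cyclic momentum p held constant (illustrative default values)."""
    return p * p / (8.0 * m * ell ** 2 * (1.0 - math.sin(0.5 * omega)) ** 2)

def force_exerted(omega, h=1e-6, **kw):
    """P = -dU/d(omega) by central difference (modern sign convention)."""
    return -(force_function(omega + h, **kw) - force_function(omega - h, **kw)) / (2.0 * h)

def force_exerted_exact(omega, p=1.0, m=1.0, ell=1.0):
    """The same derivative worked out by hand, for comparison."""
    s = 1.0 - math.sin(0.5 * omega)
    return -p * p * math.cos(0.5 * omega) / (8.0 * m * ell ** 2 * s ** 3)

omega = 0.3
print(force_function(omega), force_exerted(omega), force_exerted_exact(omega))
```

Adding a constant c to `force_function` leaves both force values unchanged, which is the sense in which the force function is determined only up to an additive constant.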
20.2 Conservative systems

As mentioned above, Hertz defined a conservative system as follows:

Definition 1. A material system which contains no other concealed masses than those which form adiabatic cyclical systems is called a conservative system. (Hertz 1894, §601)
A conservative system can be divided into two partial systems, of which the first consists of the visible masses and the second consists of the hidden masses. The coordinates of the visible system (the visible coordinates of the complete system) are at the same time the parameters of the second hidden adiabatic cyclic system. For example, the standard example in Section 18.3 is a conservative system if $\dot\varphi \gg \dot\omega$ and $m$ is small compared with the mass of the rod.

The energy $T$ of the visible system is called the kinetic energy of the conservative system, and the kinetic energy $\tilde E$ of the hidden system is called the potential energy of the conservative system. The energy of a conservative system is thus the sum of its kinetic and potential energy:

\[ E = T + \tilde E. \tag{20.6} \]
As explained above, the potential energy can be considered as a function of the parameters, i.e. of the visible coordinates of the conservative system, whose derivative with respect to any visible coordinate gives minus the component along that coordinate of the force exerted by the hidden system on the visible system. This property is shared by all force functions of the hidden system, and so they are all called force functions of the conservative system. Thus, any force function $U$ of the conservative system is of the form $\tilde E + c$. Of course, we have no means of determining $\tilde E$, but the motion of the visible system will reveal a force function. Hertz therefore defined the mathematical energy $h$ to be the sum of the kinetic energy and the force function

\[ h = T + U. \tag{20.7} \]
It differs from the true energy of the system by a constant. If the conservative system is a free system its energy, and thus also its mathematical energy, is conserved. This is the reason for the name conservative for such systems.

In ordinary mechanics, kinetic and potential energy are two completely different things. Kinetic energy is a simple consequence of space, time and mass relations and is thus unproblematic according to Hertz (Hertz 1894, p. 26/22). Potential energy, on the other hand, is a problematic concept. In the ordinary Newtonian–Laplacian image of mechanics, potential energy is a concept that is constructed from the problematic concept of force. In the energetic image it is a primitive concept that is in itself marred by logical problems (Hertz 1894, p. 26/22) (see Chapter 7). In Hertz’s image, on the other hand, kinetic and potential energy of a conservative system do not differ in nature but only in the limitations of our knowledge of the masses of the system (Hertz 1894, §607). If we could obtain a better knowledge of the system, so that we could perceive the hidden masses, the potential energy would reveal itself as kinetic. Thus, in Hertz’s mechanics there is nothing logically problematic about the concept of potential energy. Hertz considered this as a major advantage of his image.

Hertz went on to show that with his concept of potential energy he could deduce the usual mechanical principles for conservative systems. In particular, the equations of motion (19.9) for a free conservative system will take the form

\[ m f_\rho + \sum_{\chi=1}^{k} q_{\chi\rho} Q_\chi = -\frac{\partial U}{\partial q_\rho}, \qquad \rho = 1, 2, \dots, r, \tag{20.8} \]

and if $q_\rho$ is a free coordinate we get

\[ m f_\rho = -\frac{\partial U}{\partial q_\rho}. \tag{20.9} \]

In particular, if the system is a holonomic one and we choose all the coordinates $q_\rho$ to be free coordinates and we define the Hamiltonian $H$ by

\[ H = T + U, \tag{20.10} \]

then the equations of motion can be written in the form

\begin{align}
\dot q_\rho &= \frac{\partial_p H}{\partial p_\rho}, \tag{20.11}\\
\dot p_\rho &= -\frac{\partial_p H}{\partial q_\rho}. \tag{20.12}
\end{align}
Indeed, the first equation is a consequence of the purely kinematical equation (14.18), if one replaces E with T (the energy of the visible system) and recalls that U does not contain pρ . The second equation follows from eqns (20.9) and (14.22). The equations (20.11) and (20.12) are Hamilton’s equations for a conservative system. Hertz emphasized that these equations as well as eqn (20.8) only contain quantities belonging to the visible system (e.g. m in eqn (20.8) is the mass of the visible system). Thus, they can be used to determine the motion of the visible system without any knowledge of the hidden system, except for the potential energy, which in this image is a result of the cyclic motion of the hidden system. Thus, the hidden system is really hidden.
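As a small illustration of how eqns (20.10) to (20.12) determine the visible motion from T and U alone, the sketch below integrates Hamilton’s equations for one free coordinate and checks that the mathematical energy h = T + U stays constant. Everything concrete here is an assumption chosen for the example and not taken from Hertz: the quadratic force function U(q) = K q²/2, the values M = K = 1, the leapfrog integration scheme and the function names.

```python
import math

M, K = 1.0, 1.0                      # assumed visible mass and stiffness

def T(p):
    return p * p / (2.0 * M)         # kinetic energy of the visible system

def U(q):
    return 0.5 * K * q * q           # assumed force function of the hidden system

def H(q, p):
    return T(p) + U(q)               # eqn (20.10)

def dH_dq(q, p):
    return K * q

def dH_dp(q, p):
    return p / M

def leapfrog(q, p, dt, steps):
    """Integrate q' = dH/dp, p' = -dH/dq (eqns (20.11), (20.12))."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)  # half step in the momentum
        q += dt * dH_dp(q, p)        # full step in the coordinate
        p -= 0.5 * dt * dH_dq(q, p)  # second half step in the momentum
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, dt=1e-3, steps=10000)
energy_drift = abs(H(q1, p1) - H(q0, p0))
print(q1, p1, energy_drift)
```

Only quantities of the visible system and the function U enter the code; nothing about the hidden masses is needed, which is the point made in the text. For this particular choice the exact solution is q(t) = cos t, so the result after 10 time units can be compared with cos 10.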
20.3 Hidden non-holonomic connections are not allowed

In the previous section we saw that for a conservative system the energy of the cyclic system (the potential energy of the conservative system) is a force function (see eqn (20.5)). In the derivation we assumed that the parameters of the cyclic system were free, as are the cyclic coordinates by definition. The assumption was needed in order to make sure that the $q_{\chi\rho}$s of eqn (19.9) are zero, which is clearly needed for the argument to work. It is also intuitively obvious that hidden constraints among the parameters are only allowed in so far as their effects are included in the expression of the kinetic energy of the cyclic system. Indeed, if the forces are supposed to be the only influence one can experience from the hidden system, and if these forces can be derived from the potential energy of the conservative system, or equivalently, the energy of the hidden cyclic system, there cannot be other equations of constraint working between the coordinates of the cyclic system. Of course, one can have holonomic constraints acting in the cyclic system, but they must be taken into account by a proper choice of free coordinates. In that case, the constraints are reflected in the expression of the energy. Non-holonomic constraints cannot be accounted for in this way and can therefore not be allowed in the hidden cyclic system. To be sure, the parameters of the cyclic system are also coordinates of the visible system and as such non-holonomic constraints can operate among them, but they must be visible constraints. They cannot be thought of as constraints of the hidden cyclic system but must explicitly be taken into account when writing down the equations of motion (19.9). Thus, though Hertz was careful to allow non-holonomic constraints in his mechanics, such constraints are not allowed among the coordinates of the hidden system used to describe the motion of conservative systems.
This is a little surprising in the light of Hertz’s own arguments for the importance of non-holonomic constraints. As we saw in Section 15.3, Hertz argued in the introduction that since non-holonomic constraints are so nearly realized in various mechanisms (integrators) we cannot exclude such constraints from the mechanics of as yet unknown systems such as atoms or the ether. This clearly gives the impression that he included non-holonomic constraints in his image of mechanics so that he could allow his hidden systems (the ether) to perform rolling motions or other motions that could not be described by holonomic constraints. And yet, as we have just argued, such hidden constraints cannot be allowed into the hidden parts of Hertz’s conservative systems. How did Hertz explain this unfortunate circumstance? He did not explain it. In fact, in the derivation of the fact that the energy of the hidden system is a force function of the visible system (20.5) Hertz did not even mention the necessary assumption that the parameters be free coordinates of the cyclical system. Everywhere else in the Mechanics he was (true to his requirement of logical permissibility) very careful to mention the assumptions needed for the various theorems and deductions. So it seems as though he simply overlooked this important fact, and thus never realized its disturbing consequence for his account of hidden systems producing conservative forces.³
20.4 The approximative character of cyclic and conservative systems

We noted above that Hertz’s definition of a cyclic system involved an approximation. Why is such an approximation necessary? This question is the more urgent since apparently J.J. Thomson did not make any approximation in his argument (see Section 18.2) that made him conclude that all energy could be considered kinetic. In order to unravel the question, let us therefore take a closer look at Thomson’s argument in connection with the simple standard example in Section 18.3. Recall that the energy of the entire system was expressed by

\begin{align}
E &= \tfrac12 I \dot\omega^{2} + \tfrac12 m \ell^{2} \cos(\tfrac12 \omega)\, \dot\omega^{2} + 2 m \ell^{2} (1 - \sin \tfrac12 \omega)^{2}\, \dot\varphi^{2} \tag{20.13}\\
&= \tfrac12 I \dot\omega^{2} + \tfrac12 m \ell^{2} \cos(\tfrac12 \omega)\, \dot\omega^{2} + \frac{\tilde p^{2}}{8 m \ell^{2} (1 - \sin \tfrac12 \omega)^{2}}. \tag{20.14}
\end{align}

Here, the first term is the true kinetic energy $T$ of the visible rod and the last term is what both Thomson and Hertz would call the potential energy of the system. The problem is what to do with the middle term. Thomson grouped it with the first term, because it is quadratic in $\dot\omega$, and thus arrived at the apparent kinetic energy $T_1 = \tfrac12 I \dot\omega^{2} + \tfrac12 m \ell^{2} \cos(\tfrac12 \omega)\, \dot\omega^{2}$ of the system. However, if we assume with Hertz that we can determine the true mass distribution of the visible system we will be able to determine the true kinetic energy $T = \tfrac12 I \dot\omega^{2}$ and we would therefore be able to detect that the apparent kinetic energy entering into the equations of motion is different from the true kinetic energy. It would therefore not be true that the second system only reveals itself through the potential energy. It would also show up in an unfortunate way in the expression of the kinetic energy. The second system would not be hidden (see Hertz’s remarks in §598 and §600 discussed at the end of the previous chapter).

³ In his Ether and Matter Joseph Larmor (Larmor 1900, p. 278) noted that one can only eliminate force as an independent concept when there are no rolling motions.
[Fig. 20.1. An even simpler example]
One way out of the problem would be to assume that we have no way of determining the true kinetic energy of the visible system, but only the apparent kinetic energy. In that case we would simply take $T_1$ as the kinetic energy of the system and would not detect the problem. However, this assumption is equivalent to the assumption that we cannot measure the true mass distribution of the visible system. Let me give an even simpler example where such an assumption would in fact suggest itself: Consider a visible rod swinging in a plane around its one end, and let the other end of the rod carry a rotating disc that we shall consider as a hidden cyclic system (Fig. 20.1). The energy of the total system can be expressed as

\[ E = \tfrac12 I \dot\omega^{2} + \tfrac12 M \ell^{2} \dot\omega^{2} + \tfrac12 I_1 \dot\varphi^{2}, \tag{20.15} \]

where $I$ and $I_1$ are the moments of inertia of the rod and the disc, respectively, and $M$ is the mass of the disc. In this simple case $\dot\varphi$ is constant, so if we consider this cyclic coordinate as a hidden coordinate we have

\[ T_1 = \tfrac12 (I + M \ell^{2})\, \dot\omega^{2}, \tag{20.16} \]

and

\[ T_{\mathrm{cycl}} = \tfrac12 I_1 \dot\varphi^{2}. \tag{20.17} \]

Here again, the apparent kinetic energy $T_1$ is not the true kinetic energy of the visible system. Instead, it is the kinetic energy of the rod with an additional mass point with a mass equal to that of the disc attached to its free end. Here, a simple way out suggests itself. We might introduce the concept of apparent mass, such that the apparent mass distribution of the visible bar would be the
‘true mass distribution’ with an added point mass $M$ at the free end. If we assume that we cannot measure the real mass distribution but only the apparent one we would be led to believe that $T_1$ is the kinetic energy of the visible system, and we would not notice any discrepancy between the kinetic energy and the function that enters into the equations of motion. Thus, the hidden system would be truly hidden. One might argue that the assumption that we can only measure the apparent mass is a rather natural one. Indeed when we put the rod on the balance, we may put the invisible disc on the balance as well. Still, we have seen in Section 11.3.2 that Hertz did not make this assumption. According to him, we cannot place the hidden mass on the balance, and the mass we measure on it is the true mass of the visible system.

I have already discussed some of the general problems that the introduction of apparent mass may give rise to and that may have prevented Hertz from introducing such a concept (Section 11.3.2). Here, we can add a more technical problem. Indeed the introduction of apparent mass can only solve the question of the difference between apparent and true kinetic energy in cases where the connection of the visible and the hidden system is very simple. In order to illustrate this, let us again consider the slightly more complicated standard example of Section 18.3. Here, the difference between the apparent and the true kinetic energy of the visible system is of the form $\tfrac12 m \ell^{2} \cos(\tfrac12 \omega)\, \dot\omega^{2}$, which represents the kinetic energy of the hidden system when the cyclic motion has stopped ($\dot\varphi = 0$), and the hidden mass $m$ is just dragged along with the visible rod. However, this time the extra term cannot be interpreted as the result of a constant additional mass distributed along the visible rod. Indeed, the term depends on $\omega$. Therefore, if we are to interpret it as resulting from an additional mass, this mass would have to depend on the position of the rod.
If we had allowed the visible rod to move outside the plane of the paper, we would see that the additional mass would depend on the velocity too. Indeed, if the rod just rotates with fixed ω around OP it would not influence the motion of m and there would be no additional mass. On the other hand, any motion in the radial direction (A moving away from P) will give rise to the additional mass discussed above. Thus, we see that Thomson’s inclusion of the term $\tfrac{1}{2} m^{2} \cos(\tfrac{1}{2}\omega)\,\dot{\omega}^{2}$ into the kinetic energy is, in fact, problematic. In Hertz’s treatment of conservative systems, this term is part of the kinetic energy of the hidden cyclic system, and thus belongs to the potential energy. However, that is problematic as well. Indeed, it depends on $\dot{\omega}$ and cannot be considered as a function of the coordinate ω of the visible system and therefore cannot play the role of a force function. The only way out is to assume that this term is negligible compared to the other two terms. In order to make $\tfrac{1}{2} m^{2} \cos(\tfrac{1}{2}\omega)\,\dot{\omega}^{2}$ small compared to $\tfrac{1}{2} I \dot{\omega}^{2}$ we must assume that the mass m of the hidden system is negligible compared to the mass of the visible system. But then the hidden system cannot have a sensible influence on the visible system, and cannot be thought of as the cause of apparent forces, unless we assume that it moves very fast. Indeed, in order for the potential energy $2 m^{2} (1 - \sin\tfrac{1}{2}\omega)^{2}\,\dot{\varphi}^{2}$ to be comparable in size to the kinetic energy $\tfrac{1}{2} I \dot{\omega}^{2}$ of the system we must assume that the cyclic velocity $\dot{\varphi}$ of the hidden system is very large compared with the velocity $\dot{\omega}$ of the visible system. Thus, if we want to have a hidden system that can influence
the visible system, and at the same time want to solve the problem with the term $2 m^{2} (1 - \sin\tfrac{1}{2}\omega)^{2}\,\dot{\varphi}^{2}$ we must assume that the cyclic velocities of the hidden system are large compared to the velocities of the visible system (the parameters of the cyclic system) and that the hidden masses are small compared to the visible masses. This assumption is equivalent to the assumption that the energy of the hidden cyclic system is a quadratic form in the cyclic velocities alone and not in the parameters. This is precisely the approximation made by Hertz. Thus, we see that the approximation made by Hertz in his definition of a cyclic system, and consequently in his definition of a conservative system, cannot be avoided. At least it cannot be avoided in standard mathematical terms. However, Hertz could have chosen to ‘go to the non-standard limit’: He could have postulated that the hidden system has infinitely small mass and moves infinitely fast. In that case, the definition of a cyclic system could have been exact except for infinitely small quantities. If he had chosen this way out, he would have behaved just as he did with regard to the problem of the continuity of mass. Here, as we saw in Section 12.5, he at first contemplated building material points out of small, but finite Massenteilchen that are chosen such ‘that with sufficient approximation it divides all masses of the system a whole number of times.’ However, he ended up discarding this approximate definition and making the Massenteilchen infinitely small, so that masses of material points could, without approximation, range over all the positive real numbers. Why did Hertz not go to a similar limit in his definition of a cyclic system? I think the reason is that he considered his definition of a cyclic system as an absolutely central part of his image of mechanics that was of an empirical nature.
As I already argued in Chapter 12 it is highly probable that in the case of the Massenteilchen Hertz considered them as concepts introduced for the sake of appropriateness, and went to the limit, in order to make sure that their size could not have measurable effects. In the case of the approximation made in the definition of a conservative system, on the other hand, there are various hints in the Mechanics that suggest that Hertz imagined that one might, in special cases, be able to measure the discrepancy between the approximate results, which are similar to those of ordinary mechanics, and the correct results. This seems to be the content of the following corollary to the theorem (20.5) about the existence of a force function for an adiabatic cyclic system: Corollary. The forces of a cyclical system along its parameters are independent of the rates of change of these parameters. It is always assumed that these changes do not exceed the values which permit us to treat the system as a cyclical one. Thus in electromagnetics the attractions between magnets are independent of the velocity of their motion, but only so long as this velocity is considerably less than the velocity of light4 . (Hertz 1894, §556)
4 Does the explicit mention of the velocity of light here suggest that Hertz imagined that the hidden systems moved with the velocity of light, or with a velocity of that order of magnitude?
So, here is an example where Hertz believed that one can already now measure when the approximation involved in the definition of a cyclic system breaks down. It is also probable that Hertz had something similar in mind when at the end of the introduction he referred to the decisive battle between the different images of mechanics (Hertz 1894, p. 49/41). Indeed one could imagine that in the future one would experience other situations where a fast-moving system did not obey the rules of ordinary mechanics, or equivalently the approximate rules deduced in Hertz’s mechanics. If one could show that the system moves according to the non-approximate laws of Hertz’s image this would count as a victory for this image and a defeat of the ordinary image. Thus, by leaving the definition of a cyclic system as an approximate definition, Hertz set up an image of mechanics whose laws were not quite the same as those of conservative systems in usual mechanics, but only approximately so. In this way he left the possibility open for an empirical test of the two systems against each other. He seems to have thought that measurements of fast-moving magnets supported his image, and seems to have hoped that future measurement would give him the final victory. It is remarkable that classical physics did break down only a decade after the publication of Hertz’s Mechanics and exactly in the area where Hertz had suspected it, namely for systems moving with velocities close to the velocity of light. It is, however, also remarkable that the solution of the problem suggested by Einstein was totally different from the one suggested by Hertz.
21 Integral principles
Hertz’s treatment of integral principles falls into three parts: In §166–196 of the Principles of Mechanics the properties of geodesics and their relation to straightest paths are discussed in a purely geometric way. These investigations then form the basis for a discussion in §347–366 of integral principles applied to the motion of free systems, and finally in §625–643 to the motion of conservative systems. Hertz’s approach differed from the usual approach in three respects. First, it was based on his geometry of systems of points, secondly, it dealt with forces as an indirect result of a coupling of the visible system with a hidden system, and thirdly, it limited the applicability of even the most general of these principles (Hamilton’s principle) to special mechanical systems, declaring that it was invalid in general. While the first two of these characteristics were a result of Hertz’s general approach to mechanics, the last one presented a deep, but, as we shall see in the next chapter, not entirely new insight into the application of integral principles to non-holonomic systems. In fact, Hertz pointed out that Hamilton’s principle and the principle of least action did not apply to non-holonomic systems. In the introduction to the book he explained in a non-technical way why these principles fail in the simple case of a ball rolling without slipping on a plane (for a clearer exposition see (Hölder 1896)): As we have already seen in Section 15.3 the ball can be rolled from any given initial configuration A to any other configuration B in the 5-dimensional manifold of configurations. However, since the system has three degrees of freedom, it can only reach a 3-dimensional submanifold of these configurations in a natural path (when no forces are applied to the system).
However, according to the principle of least action (or Hamilton’s principle), there should be a natural path from any initial configuration A to any other final configuration B along which the ball would roll when free and under no influence of forces, namely the possible path that minimizes the action integral (or another integral).¹ Even if B is a configuration that can be reached from A according to the fundamental law, the integral principles will often pick out the wrong path (Hertz 1894, §195).

¹ Here, Hertz commits the usual error of confusing the existence of a lower bound with the existence of a minimum; but the problem does, in fact, not lie here.
21.1 Shortest and geodesic paths

In Hertz’s mechanics the reason for the failure of the integral principles is to be found in the geometry of systems of points, more precisely, in the difference between straightest paths and geodesic paths. In order to explain the difference I shall summarize Hertz’s derivation (§177–189) of the equations of the geodesics and compare it to the derivation of the equation of the straightest path (eqn (17.4) together with eqn (17.2)). Hertz defined a geodesic path between two positions B and C of a mechanical system as a path whose variation in length vanishes:
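$$\delta \int_B^C ds = \int_B^C \delta\, ds = \int_B^C \sum_{\nu=1}^{3n} \frac{\partial\, ds}{\partial\, dx_\nu}\, \delta\, dx_\nu = \int_B^C \sum_{\nu=1}^{3n} \frac{\partial\, ds}{\partial\, dx_\nu}\, d\delta x_\nu = 0, \tag{21.1}$$

when the initial and final positions are kept fixed and it is assumed that the varied path as well as the original path must satisfy the equations of constraint

$$\sum_{\nu=1}^{3n} x_{\iota\nu}\, dx_\nu = 0, \qquad \iota = 1, 2, \ldots, i. \tag{21.2}$$

This means that the variations $\delta x_\nu$ must satisfy the varied equations

$$\sum_{\nu=1}^{3n} x_{\iota\nu}\, d\delta x_\nu + \sum_{\mu=1}^{3n} \sum_{\nu=1}^{3n} \frac{\partial x_{\iota\nu}}{\partial x_\mu}\, \delta x_\mu\, dx_\nu = 0, \qquad \iota = 1, 2, \ldots, i. \tag{21.3}$$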
According to the Lagrangian procedure Hertz multiplied each of eqns (21.3) by a multiplier ξι , added the sum of these expressions to the integral in eqn (21.1) and integrated by parts so as to eliminate the dδxν terms. In this way he found the following expression for the coefficients of δxν :
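$$d\!\left(\frac{\partial\, ds}{\partial\, dx_\nu}\right) + \sum_{\iota=1}^{i} x_{\iota\nu}\, d\xi_\iota + \sum_{\iota=1}^{i} \sum_{\mu=1}^{3n} \left( \frac{\partial x_{\iota\nu}}{\partial x_\mu} - \frac{\partial x_{\iota\mu}}{\partial x_\nu} \right) \xi_\iota\, dx_\mu = 0, \qquad \nu = 1, 2, \ldots, 3n. \tag{21.4}$$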
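They must be equal to zero for the variation to vanish. Using the path length s as the independent variable (primes denote differentiation with respect to s), Hertz deduced from eqns (21.4) and (12.2) that

$$\frac{m_\nu}{m}\, x_\nu'' + \sum_{\iota=1}^{i} x_{\iota\nu}\, \xi_\iota' + \sum_{\iota=1}^{i} \sum_{\mu=1}^{3n} \left( \frac{\partial x_{\iota\nu}}{\partial x_\mu} - \frac{\partial x_{\iota\mu}}{\partial x_\nu} \right) \xi_\iota\, x_\mu' = 0, \qquad \nu = 1, 2, \ldots, 3n. \tag{21.5}$$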
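Together with the equations obtained by differentiation of eqn (21.2):

$$\sum_{\nu=1}^{3n} x_{\iota\nu}\, x_\nu'' + \sum_{\nu=1}^{3n} \sum_{\mu=1}^{3n} \frac{\partial x_{\iota\nu}}{\partial x_\mu}\, x_\nu'\, x_\mu' = 0, \qquad \iota = 1, 2, \ldots, i, \tag{21.6}$$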
eqns (21.5) are the equations of the geodesic path. They differ from the equations for the straightest path (17.2) and (17.4) through the presence of the last term on the left-hand side of eqns (21.5). If, however, the constraints are all holonomic it is possible by multiplication with suitable multipliers to obtain that

$$\frac{\partial x_{\iota\nu}}{\partial x_\mu} = \frac{\partial x_{\iota\mu}}{\partial x_\nu}. \tag{21.7}$$

In that case, eqns (21.5) will be reduced to

$$\frac{m_\nu}{m}\, x_\nu'' + \sum_{\iota=1}^{i} x_{\iota\nu}\, \xi_\iota' = 0, \qquad \nu = 1, 2, \ldots, 3n, \tag{21.8}$$

which are identical to eqn (17.4) if we identify $\xi_\iota'$ with the multipliers appearing there. Thus, Hertz concluded that for holonomic systems the straightest paths and the geodesic paths are the same. He also showed that for non-holonomic systems the two types of paths are not the same. Hertz also defined a shortest path of a system between two of its positions as a path that is shorter than any other infinitely neighboring path between the same positions (Hertz 1894, §166), and he called the shortest of the possibly many shortest paths the absolutely shortest path (§167). As in usual differential geometry any geodesic path is the shortest path between sufficiently close positions on it (§176).
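The holonomy condition behind eqn (21.7) can be illustrated numerically in the simplest setting. The following is a minimal sketch, not taken from Hertz or from the text above: it assumes a single constraint a1 dx1 + a2 dx2 + a3 dx3 = 0 in three coordinates, for which a multiplier making the coefficients satisfy eqn (21.7) exists only if a · curl a = 0 (the classical integrability condition for one Pfaffian equation). The function names and the two example constraints (a knife edge and motion on concentric spheres) are illustrative assumptions.

```python
import math

def a_dot_curl_a(form, point, h=1e-5):
    """Evaluate a . curl(a) at `point` for a constraint a1 dx1 + a2 dx2 + a3 dx3 = 0.

    `form` maps a point (x1, x2, x3) to the coefficients (a1, a2, a3).
    Partial derivatives are approximated by central finite differences.
    """
    def partial(i, j):
        # d a_i / d x_j
        plus = list(point)
        minus = list(point)
        plus[j] += h
        minus[j] -= h
        return (form(plus)[i] - form(minus)[i]) / (2.0 * h)

    curl = (
        partial(2, 1) - partial(1, 2),
        partial(0, 2) - partial(2, 0),
        partial(1, 0) - partial(0, 1),
    )
    return sum(a * c for a, c in zip(form(point), curl))

def knife_edge(p):
    # Assumed example: sin(theta) dx - cos(theta) dy = 0 (no sideways slipping).
    x, y, theta = p
    return (math.sin(theta), -math.cos(theta), 0.0)

def sphere(p):
    # Assumed example: x dx + y dy + z dz = 0, i.e. x^2 + y^2 + z^2 = const.
    x, y, z = p
    return (x, y, z)

def scaled_sphere(p):
    # The same constraint multiplied by a non-constant factor; its coefficients
    # no longer satisfy (21.7) directly, but a suitable multiplier restores it.
    x, y, z = p
    f = 1.0 + x * x
    return (f * x, f * y, f * z)

point = (0.3, -1.2, 0.7)
for name, form in [("knife edge", knife_edge),
                   ("sphere", sphere),
                   ("scaled sphere", scaled_sphere)]:
    print(name, a_dot_curl_a(form, point))
```

In this sketch the knife-edge constraint gives a non-zero value at every point, so no multiplier can bring it into the form (21.7), whereas both versions of the sphere constraint give zero up to discretization error.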
21.2 Integral principles of mechanics

In the second book of the Mechanics Hertz combined the above theorems about the shortest paths and geodesics with the fundamental law to arrive at the following integral principles pertaining to the motion of holonomic systems:

Proposition 1. ‘The natural path of a free holonomic system between any two sufficiently near positions is shorter than any other possible path between the two positions’ (Hertz 1894, §347).

More generally the variation of the integral

$$\int_B^C ds = \frac{1}{\sqrt{m}} \int_B^C \sqrt{\sum_{\nu=1}^{n} m_\nu\, ds_\nu^{2}} \tag{21.9}$$
vanishes in the natural motion between any two configurations B and C of a holonomic system (Hertz 1894, §349).
This corresponds to Jacobi’s formulation of the principle of least action for systems not influenced by forces (Section 2.1, eqn (2.23)). If we restrict the variation to paths traversed with the same constant velocity or energy, we get the following principle of least time:

Proposition 2. ‘The natural motion of a free holonomic system carries the system in a shorter time from a given initial position to a sufficiently near final one than could be done with any other possible motion with the same constant value of the energy’ (Hertz 1894, §352).

Finally, relaxing the constraint of equal energy Hertz arrived at the fundamental principle of ‘least Time-Integral of the Energy.’

Proposition 3. ‘The time integral of the energy in the transference of a free holonomic system from a given initial position to a sufficiently near final one is smaller for the natural motion than for any other possible motion by which the system may pass from the given initial position to the final one in an equal time’ (Hertz 1894, §358).

More generally

$$\delta \int_{t_0}^{t_1} E\, dt = 0, \tag{21.10}$$
between any two configurations. This is Hamilton’s principle for systems not influenced by forces (Section 2.1, eqn (2.30)). In order to apply this principle to a holonomic conservative system we recall that the energy $E$ of the total system can be written as $T + \mathfrak{E}$. Thus we have

$$\delta_q \int_{t_0}^{t_1} (T + \mathfrak{E})\, dt = 0, \tag{21.11}$$

or

$$\delta_q \int_{t_0}^{t_1} (T + U)\, dt = 0, \tag{21.12}$$

where $U$ is any potential function that may differ from $\mathfrak{E}$ by a constant. The variations in eqn (21.12) are restricted by the fact that the visible as well as the hidden system must pass between two given configurations. This is what is indicated by the subscript q to δ in eqns (21.11) and (21.12). Since we have no way of knowing the configuration of the hidden system this variational principle is of no immediate use. However, we do know that the momenta p conjugate to the hidden coordinates of a conservative system remain constant. Hertz could prove (Hertz 1894, §593–628) that if $\delta_p$ indicates a variation where the momenta of the hidden variables are kept fixed, we have

$$\delta_q \int_{t_0}^{t_1} U\, dt = -\,\delta_p \int_{t_0}^{t_1} U\, dt, \tag{21.13}$$
and since T only involves the visible coordinates we clearly have

$$\delta_q \int_{t_0}^{t_1} T\, dt = \delta_p \int_{t_0}^{t_1} T\, dt, \tag{21.14}$$

and thus, according to eqn (21.12),

$$\delta_p \int_{t_0}^{t_1} (T - U)\, dt = 0. \tag{21.15}$$
This last formulation of the principle is Hamilton’s principle for holonomic conservative systems (Section 2.1, formula (2.30)). It applies to such systems because it does not pre-suppose any knowledge of the hidden system except the force function U. In a similar way, Hertz deduced from the principle of least time for a free system (Proposition 2 above) that

$$\delta_p \int_B^C \sqrt{h - U}\; ds = 0, \tag{21.16}$$

where it is assumed that the visible system passes between the configurations B and C with a constant value of the total energy $E = T + \mathfrak{E}$, or equivalently of the mathematical energy $h = T + U$. This is Jacobi’s formulation of the principle of least action for a conservative system (Section 2.1, eqn (2.23)). Hertz repeatedly stressed that in his mechanics the variational principles eqns (21.10) and (21.11) for free systems are the general forms, whereas eqns (21.15) and (21.16) are special cases. In usual mechanics it is the other way round. He also stressed that as in ordinary mechanics Hamilton’s principle (21.10) implies energy conservation, and it can therefore replace the fundamental law for holonomic systems. Jacobi’s form of the principle of least action (21.9), on the other hand, pre-supposes energy conservation and is therefore not strong enough to replace the fundamental law, even in the case of holonomic constraints (see Section 2.3). In the case of non-holonomic constraints, the difference between geodesic paths and the natural straightest path implies that none of the mentioned integral principles hold.
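Proposition 3 can be checked numerically in the simplest conceivable case. The sketch below is an illustration under stated assumptions, not part of Hertz’s argument: it assumes a single free particle of unit mass moving on a line, so that the natural motion between two positions in a given time is uniform, and it compares the time integral of the energy for that motion with the value for neighbouring motions that share the same end points and duration. All function names are hypothetical.

```python
import math

def energy_time_integral(path, t_total, mass=1.0):
    """Approximate the time integral of (1/2) m v^2 for a path given as
    positions sampled at equal time steps over the duration t_total."""
    steps = len(path) - 1
    dt = t_total / steps
    total = 0.0
    for start, end in zip(path, path[1:]):
        velocity = (end - start) / dt
        total += 0.5 * mass * velocity * velocity * dt
    return total

def uniform_path(x0, x1, steps):
    """Natural motion of the free particle: constant velocity from x0 to x1."""
    return [x0 + (x1 - x0) * k / steps for k in range(steps + 1)]

def perturbed_path(x0, x1, steps, amplitude):
    """Uniform motion plus a sine bump that vanishes at both end points."""
    return [x + amplitude * math.sin(math.pi * k / steps)
            for k, x in enumerate(uniform_path(x0, x1, steps))]

natural = energy_time_integral(uniform_path(0.0, 1.0, 200), t_total=2.0)
varied = [energy_time_integral(perturbed_path(0.0, 1.0, 200, eps), t_total=2.0)
          for eps in (-0.1, -0.01, 0.01, 0.1)]

print("natural motion:", natural)
print("varied motions:", varied)
```

Every varied motion in this sketch yields a larger integral than the uniform one, in agreement with the statement that the natural motion makes the time integral of the energy smallest among motions taking an equal time.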
22 A history of non-holonomic constraints
Hertz was not the first to introduce non-holonomic constraints into mechanics but his careful distinction between holonomic and non-holonomic constraints in the Mechanics influenced the general history of that discipline. First, his name for such constraints was quickly adopted by others, and secondly, his discussion of integral principles gave rise to a discussion of the application of non-holonomic constraints in various principles of mechanics. For this reason, I shall, in this chapter, give a brief history of non-holonomic constraints and Hertz’s place in it. First, I shall discuss the immediate reaction to Hertz’s book, and then turn to a general overview of the very messy history of repeated independent mistakes, rejections and rescues.
22.1 Hölder’s rescue of Hamilton’s principle

As we saw in the previous chapter Hertz derived various integral principles such as the principle of least action and Hamilton’s principle for holonomic conservative systems, but concluded that these principles are invalid for non-holonomic systems. If this failure of the integral principles had been a result of the special features of Hertz’s image of mechanics, his conclusions would probably not have stirred much attention. However, his reasoning applies equally well to the other images of mechanics. Indeed, in ordinary mechanics the integral principles are deduced from d’Alembert’s principle that is equivalent to Hertz’s fundamental law. That means that Hertz had pointed out a serious problem in ordinary mechanics. It had devastating effects on the energetic image because it meant that its fundamental law, Hamilton’s principle, was, in fact, incorrect for a wide variety of mechanical systems. This was one of Hertz’s major reasons for rejecting that image. The situation called for an immediate rescue operation. The operation was led by the mathematician Otto Hölder. In a paper of 1896 he showed that if one performs the variations in the ‘correct way’ (implying that Hertz’s way was ‘incorrect’) the variational principles remain valid also when applied to systems with non-holonomic constraints. The variational principles say that the variation of a certain integral must vanish along the natural path when the variations performed are subject to certain restrictions.
Hertz assumed that the varied path must be a permissible path, i.e. that it must satisfy the equations of constraint (15.1). Hölder, on the other hand, assumed that the variations $\delta x_\nu$ themselves must satisfy the constraints, i.e. that

$$\sum_{\nu=1}^{3n} x_{\iota\nu}\, \delta x_\nu = 0, \qquad \iota = 1, 2, \ldots, i, \tag{22.1}$$
and showed that under this assumption the usual derivation of Hamilton’s principle (and the principle of least action) from d’Alembert’s principle holds even when the constraints are non-holonomic. The main thing to observe here is that Hölder’s and Hertz’s assumptions about the variations are equivalent if and only if the system is holonomic. Hölder’s argument has sometimes been interpreted to mean that Hertz was wrong about the non-applicability of the variational principles to non-holonomic constraints (see, e.g. (Klein 1970, p. 72)). However, in fact, the principles discussed by Hertz and Hölder are quite different, although they are given the same name, and in one sense Hertz’s is the more natural in that it is a true variational principle. It asks for the maximization or minimization – or rather stability – of a functional within a certain well-defined class of admissible functions. Hölder’s principle fails to be a variational principle in this sense, since it does not specify a class of admissible functions but only a class of admissible variations. This means that Hölder’s resulting path does not even locally minimize a certain quantity among a specified class of paths. This makes Hölder’s principle less appealing than the usual formulations of Hamilton’s principle and the principle of least action (in the holonomic case). As appears from the following excerpt from a letter of January 15th, 1904 from Hölder to Philip E.B. Jourdain, Hölder was criticized for his reformulation of integral principles of mechanics: The view is perhaps odd at first sight and it has also already been said that I do not have a real variational problem. However it does not matter to me. What matters to me is a clear interpretation of δx, δy, . . . , δt, so that the Principles apply most generally. (Jourdain 1905b, p. 75)
In the same letter Hölder wrote that he had taken the criticism so seriously that he had begun to use the word ‘changed motion’ [abgeänderte Bewegung] instead of ‘varied motion’ [variirte Bewegung]. Hertz was well aware of the difference between the two different types of variations: ‘a displacement between infinitely neighboring possible positions may be an impossible displacement’ (Hertz 1894, §113). One may even speculate if Hertz did not himself realize the possible rescue of the integral principles pointed out by Hölder. Indeed, it is immediately obvious that one can get rid of the unpleasant last term on the left-hand side of eqn (21.4) by assuming that the variations satisfy Hölder’s eqn (22.1) instead of Hertz’s equation (21.3). This observation leads directly to Hölder’s ‘variational’ principle. But if Hertz discovered this possibility, why did he not include it in his book? First, he may have considered it too unnatural, and secondly he may have left it out
for tactical reasons. Indeed, if he had admitted that the integral principles, including Hamilton’s principle, could be rescued in this way, he would have missed a central argument against the energetic image. Even if Hertz should not himself have seen this way out of the problem, his contribution to the question of the applicability of the variational principles to non-holonomic constraints should not be recorded as that of a blunder but rather as a focusing on an essential and deep problem, which then led to Hölder’s clear solution.

Table 22.1. Hertz’s and Hölder’s variational methods compared

[Diagrams, one per column: a path from A to B with line element $ds$ and differentials $(dq_1, \ldots, dq_r)$, a varied path with line element $d\tilde{s}$ and differentials $(d\tilde{q}_1, \ldots, d\tilde{q}_r)$, and the variations $(\delta q_1, \ldots, \delta q_r)$.]

Hertz and Hölder: Trajectory is a possible path: $\sum_{\rho=1}^{r} q_{\lambda\rho}\, dq_\rho = 0$, $\lambda = 1, 2, \ldots, k$.

Hertz: Varied path is a possible path: $\sum_{\rho=1}^{r} q_{\lambda\rho}\, d\tilde{q}_\rho = 0$, $\lambda = 1, 2, \ldots, k$. Usual variational problem: minimize integral in class of admissible functions. This does not lead to trajectory unless constraints are holonomic.

Hölder: Variations are possible: $\sum_{\rho=1}^{r} q_{\lambda\rho}\, \delta q_\rho = 0$, $\lambda = 1, 2, \ldots, k$. Varied path not a possible path unless constraints are holonomic. Not a usual variational problem. This leads to trajectory.
As we shall see in the next section, Hertz was not the first to call attention to the problem, nor was Hölder the first to indicate the way out, but Hertz’s sharp denunciation became more widely known, and Hölder’s clarification was more lucid than any earlier treatment of the problem.
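The relation between the two conditions on the variations can be made explicit by a short calculation. What follows is a sketch in the notation of eqns (21.3), (21.7) and (22.1), assuming that the operations d and δ commute; it is not a derivation quoted from Hertz or Hölder.

```latex
% Hoelder's condition (22.1), differentiated along the path
% (with the summation indices relabelled in the second term):
\sum_{\nu} x_{\iota\nu}\, d\delta x_\nu
  + \sum_{\mu}\sum_{\nu} \frac{\partial x_{\iota\mu}}{\partial x_\nu}\,
    \delta x_\mu\, dx_\nu = 0 .

% Hertz's condition (21.3), expressing that the varied path is a possible path:
\sum_{\nu} x_{\iota\nu}\, d\delta x_\nu
  + \sum_{\mu}\sum_{\nu} \frac{\partial x_{\iota\nu}}{\partial x_\mu}\,
    \delta x_\mu\, dx_\nu = 0 .

% Subtracting the first from the second, the two conditions agree only if
\sum_{\mu}\sum_{\nu}
  \left( \frac{\partial x_{\iota\nu}}{\partial x_\mu}
       - \frac{\partial x_{\iota\mu}}{\partial x_\nu} \right)
  \delta x_\mu\, dx_\nu = 0 .
```

When the coefficients satisfy eqn (21.7) the bracket vanishes identically and the two kinds of variation coincide; for non-holonomic constraints the bracket does not vanish in general, which is where the two principles part ways.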
22.2 Repeated independent mistakes, rejections and rescues

The history of non-holonomic constraints is one of independent discoveries and neglect. In addition to the problem of the applicability of the integral principles to non-holonomic systems a related problem concerning Lagrange’s equations entered the discussions: When setting up Lagrange’s equations (eqns (2.13) and (2.15)) for a non-holonomic system one could be tempted to use the i equations of constraint (15.1) or (15.2) to eliminate i of the (generalized) velocities $\dot{x}_\nu$ (or $\dot{q}_\rho$), and express T (or L) as a function of 3n − i (or r − k) variables. However, the equations that result from this procedure do not give the correct trajectories. If one performs these eliminations, one has to use a different equation: Assume that the equations of constraint (15.1) allow us to express $dx_\nu$ in terms of r differentials $dq_\rho$:

$$dx_\nu = \sum_{\rho=1}^{r} \alpha_{\nu\rho}\, dq_\rho \tag{22.2}$$
and assume we use these equations to express $\dot{x}_\nu$ in T (or L) in terms of $\dot{q}_\rho$.¹ Then, from d’Alembert’s principle, one can deduce the equations

$$\frac{d}{dt} \frac{\partial T}{\partial \dot{q}_\rho} - \frac{\partial T}{\partial q_\rho} - \sum_{\nu=1}^{3n} \sum_{\sigma=1}^{r} m_\nu\, \dot{x}_\nu\, \dot{q}_\sigma \left( \frac{\partial \alpha_{\nu\rho}}{\partial q_\sigma} - \frac{\partial \alpha_{\nu\sigma}}{\partial q_\rho} \right) = P_\rho, \qquad \rho = 1, 2, \ldots, r, \tag{22.3}$$

where $P_\rho = \sum_j \alpha_{j\rho} X_j$ ($X_j$ being the force along $x_j$) is the force along $q_\rho$ and where the partial derivatives with respect to $q_i$ are performed using eqn (22.2). Thus in comparison with the usual Lagrange’s equations (2.13) we get an extra term (the last term on the left-hand side). If the constraints are all holonomic, i.e. if eqn (22.2) can be integrated to express $x_\nu$ as a function of $q_1, q_2, \ldots, q_r$, then

$$\frac{\partial \alpha_{\nu\rho}}{\partial q_\sigma} - \frac{\partial \alpha_{\nu\sigma}}{\partial q_\rho} = 0, \tag{22.4}$$

and the extra term disappears. In the case of non-holonomic constraints, however, the extra term does not vanish.

¹ I shall follow Boltzmann and call the $q_\rho$s non-holonomic coordinates.

These problems began to be noticed in the 1870s and 1880s when the importance of non-holonomic constraints was realized in connection with treatments of rolling motion. An early treatment of the problem is contained in a short paper by Norman Macleod Ferrers from 1873. It begins:

Lagrange’s generalized equations of motion are not directly applicable when the equations of condition are not expressible in an integral form. The object of the following investigation is to show the modification necessary in this case. (Ferrers 1873, p. 1)
Ferrers then derived eqns (22.3) in a slightly different form and concluded with an application to a disc rolling on a plane. According to a remark made by Routh to Ph. E.B. Jourdain (see the note in (Jourdain 1905b, p. 63)) it was Routh who called Ferrers’ attention to the failing of the usual form of Lagrange’s equations in the case of non-holonomic constraints. Routh himself pointed to the problem in a rather indirect way in the 3rd edition (Routh 1877a) of his Dynamics of a System of Rigid Bodies.² Routh himself refrained from using the non-holonomic equations of constraints to reduce the number of variables, and instead took the constraints into account by using Lagrange multipliers. For this reason Jourdain (see, e.g. (Jourdain 1905b)) called the equations arising in this way ‘Routh’s form,’ but of course this form goes back to Lagrange. This is the method used by Hertz. Routh also briefly mentioned that Hamilton’s principle would only hold true if the variations were chosen to satisfy the equations of constraints. However, he did not stress that one would commit an error by using Lagrange’s equation in non-holonomic coordinates, nor did he stress that the varied path obtained in the variational process would not satisfy the equations of constraints if the equations were non-holonomic. Therefore, his short paragraph (printed in small letters) did not create the feeling that he had addressed two essential problems, and so it was apparently overlooked. The problem concerning Hamilton’s principle was (apparently independently) highlighted in a more explicit form in 1888 by Carl Neumann in the second part of a long paper Grundzüge der analytischen Mechanik, insbesondere der Mechanik starrer Körper (Neumann 1887).
When using Hamilton’s principle to describe the motion of two bodies rolling on one another he assumed that the variation satisfied the constraints (22.1), and continued in a remark printed in small letters: As a whole one can distinguish between three different types of motion: first the as yet unknown true motion, second the fictive motion that is infinitely close to the true motion and third a type of motion that might be called the passage motion [from the true to the fictive trajectory] . . .. The true motion will of course have the character of the given material system, e.g. satisfy the relations [(15.1) or rather (22.1)]. The same holds true of the passage motion. The fictive motion, on the other hand, will in general not have the character of the system [i.e. will not necessarily satisfy equation (15.1)] . . .. And thus in the application of Hamilton’s principle we make use of a fictive motion that does not at all have the character of the material system. (Neumann 1887, p. 34)
² I have not been able to locate this edition but according to Jourdain (1905a) it is in this place almost exactly identical to the 6th widely reprinted edition vol. II, §445.

This is exactly the approach to Hamilton’s principle that was later advanced by Hölder. However, Neumann obscured the matters by claiming that he only assumed
eqn (22.1) because it was ‘advantageous to his purpose’ but that Hamilton’s principle would be correct even without it. As for the equations of motion Neumann (Neumann 1887, p. 36) used Lagrange multipliers and the ordinary Lagrange’s equations as Routh had done. While following this procedure in a paper entitled Über gleitende und rollende Bewegung, Alfred Vierkandt warned against the trap one might be tempted to fall in: Finally, I shall allow myself to warn against a misunderstanding that may easily creep in regarding the differential equations of the rolling motion . . . And thus one might think that it would be allowed to take [the equations of constraint (15.1)] into account from the start when computing the live force [the kinetic energy] T , and then to deduce the Lagrangian differential equations on the basis of this simplified expression of T . Such a simplification is, however, in general incorrect. (Vierkant 1892, p. 52)
Jacques Hadamard, who referred to Neumann and Vierkandt, proved that this simplification was allowed if and only if the equations of constraint were integrable (Hadamard 1895). However, these numerous warnings went unheeded by Ernst Lindelöff (1870–1946) who committed the natural mistake in (Lindelöff 1895) and by Paul Appell, who repeated Lindelöff’s error in his influential Traité de mécanique rationnelle (Appell 1896, vol. 2, pp. 344–349 (§452)). These errors, and three errors of the same kind committed by Dutch mathematicians, were noticed by Diederik Korteweg while examining a prize essay on rolling motion. Therefore, he decided to put the matter straight, but before his paper was published (Korteweg 1900) he discovered that Vierkandt and Hadamard had preceded him. He apparently did not discover that Appell himself had, in the meantime, learned about the papers of Vierkandt and Hadamard and
1. had published a booklet on Les mouvements de roulement en dynamique (Appell 1899) in which he explicitly corrected his former mistake,
2. had commented (in a not too clear way and without references to previous authors) on the way one ought to use Hamilton’s principle in case of non-holonomic constraints (Appell 1898), and
3. had corrected the error in later editions of his textbook.
Without being aware of these works, Boltzmann tried to use Lagrange’s equations on a special cyclic mechanical system involving rolling motion, and discovered that he arrived at the wrong result. He then published a paper Über die Form der Lagrangeschen Gleichung für nichtholonome, generalisierte Koordinaten (Boltzmann 1902) in which he derived a slight variation of eqns (22.3). A mathematically more sophisticated derivation of the equations was found independently of Boltzmann by Georg Hamel (Hamel 1904). Like Hölder, but unlike any of the other authors mentioned above, Boltzmann and Hamel referred to Hertz.
In the above list of more or less independent treatments of non-holonomic constraints one can distinguish two different responses to both the difficulties regarding the equations of motion and the difficulties regarding Hamilton’s principle. The first response is ‘the rejecting one’; the other is ‘the rescuing one’. In connection with Lagrange’s equations the rejecting response amounts to saying that it is forbidden to use these equations with a Lagrangian that has been simplified by taking the constraints into account. One must use the non-simplified Lagrangian and take account
of the constraints using multipliers. This was Routh’s, Neumann’s, Vierkandt’s, Hadamard’s, Appell’s and Korteweg’s response. According to the rescuing response the simplification of the Lagrangian using the constraints is allowed; one just has to modify Lagrange’s equations accordingly, as in eqns (22.3). This was Ferrers’, Boltzmann’s and Hamel’s response. The rejecting response to the problem concerning Hamilton’s principle was to declare the principle invalid for non-holonomic constraints. This was Hertz’s response. The rescuing response was to declare that the principle holds true when the variations (and not the varied paths) are restricted by the constraints. This was Routh’s, Neumann’s and Hölder’s response. In relation to Hertz’s work the many contributions to the discussion about non-holonomic constraints can be divided into a German and a non-German class. The Germans after 1894 were aware of Hertz’s book and referred to it; the rest of the contributors did not.
23 Hertz on the Hamilton formalism
In the introduction to his Principles of Mechanics Hertz emphasized that one of the advantages of his geometric formulation of his mechanics is . . . that it throws a bright light upon Hamilton’s method of treating mechanical problems by the aid of characteristic functions . . . In our form of the mathematical representation Hamilton’s method, instead of having the character of a side branch [as had been the case in ordinary treatments of mechanics], appears as the direct, natural, and, if one may so say, self evident continuation of the elementary statements in all cases to which it is applicable. Further, our mode of representation gives prominence to this: that Hamilton’s mode of treatment is not based as is usually assumed, on the special physical foundations of mechanics; but that it is fundamentally a purely geometrical method, which can be established and developed quite independently of mechanics, and which has no closer connection with mechanics than any other of the geometrical methods employed in it. (Hertz 1894, pp. 38–39/32)
Hertz developed the geometric version of the Hamilton formalism in the first kinematic book (Hertz 1894, §197–236) and then applied these results to the motion of free holonomic systems (§409–417), and finally to the motion of unfree systems (§644–661).
23.1 The straightest distance

The formalism only works for holonomic systems so let us, with Hertz, assume that $(q_1, \ldots, q_r)$ is a set of free coordinates of the system. In the corresponding r-dimensional configuration space Hertz introduced the geometric notion of a hypersurface or a ‘surface of positions’ as he called it (§200–214). He characterized such surfaces analytically by an equation:

$$R(q_1, \ldots, q_r) = \text{Const.} \tag{23.1}$$

In particular, he considered the surfaces consisting of the points that have a constant distance to a fixed position, and he showed that these surfaces are cut orthogonally by the straightest paths emanating from the fixed position (§222). This is an n-dimensional generalization of a theorem about geodesics on a surface that Gauss had proved in 1828 (Section 24.1). By introducing the ‘straightest distance’ $S(q_1^0, q_2^0, \ldots, q_r^0, q_1^1, q_2^1, \ldots, q_r^1)$ between the positions $P^0$ and $P^1$ as a function of their coordinates $q_1^0, q_2^0, \ldots, q_r^0$ and $q_1^1, q_2^1, \ldots, q_r^1$, respectively, he could analytically represent the surfaces with a constant distance to $P^0$ by the equation

$$S(q_1^0, q_2^0, \ldots, q_r^0, q_1^1, q_2^1, \ldots, q_r^1) = \text{Const.}, \tag{23.2}$$
where $q_1^0, \ldots, q_r^0$ are kept constant and $q_1^1, \ldots, q_r^1$ are the coordinates of the variable point on the surface. It is clear from the generalization of Gauss’s theorem that for two different values of the constant, eqn (23.2) represents two surfaces separated by a fixed orthogonal distance. Hertz showed (§227) that, as a consequence, S will satisfy the two partial differential equations:

$$\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{0}_{\rho\sigma}\,\frac{\partial S}{\partial q^{0}_{\rho}}\,\frac{\partial S}{\partial q^{0}_{\sigma}} = 1, \tag{23.3}$$

and

$$\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{1}_{\rho\sigma}\,\frac{\partial S}{\partial q^{1}_{\rho}}\,\frac{\partial S}{\partial q^{1}_{\sigma}} = 1, \tag{23.4}$$
and that, once a solution S to these equations is found, one can express the straightest paths in finite form. This closely corresponds to Hamilton’s formalism (Section 2.1). Following Carl Gustav Jacob Jacobi (Jacobi 1837) Hertz then investigated how a solution to just one of these equations could help in determining the straightest paths of the system. First, he proved that if $R(q_1, \ldots, q_r)$ is a solution to the Hamilton–Jacobi equation

$$\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b_{\rho\sigma}\,\frac{\partial S}{\partial q_{\rho}}\,\frac{\partial S}{\partial q_{\sigma}} = 1, \tag{23.5}$$

there is a constant orthogonal distance between any two surfaces represented by the equation

$$R(q_1, \ldots, q_r) = \text{Const.} \tag{23.6}$$

for two different values of the constant, and this distance is equal to the difference between the values of R on the two surfaces. Conversely, if any two surfaces from a family of surfaces represented by eqn (23.6) have everywhere the same orthogonal distance measured by the difference of the constants, then R must satisfy eqn (23.5) (Hertz 1894, §231–232). Thus, if R is a solution to eqn (23.5) the orthogonal trajectories of the family of surfaces (23.6) are straightest paths. Such orthogonal trajectories can be determined by the following system of first-order differential equations

$$\sqrt{a_{\rho\rho}}\,\cos(s, q_\rho) = \frac{\partial R}{\partial q_\rho}, \qquad \rho = 1, 2, \ldots, r \tag{23.7}$$
(equation b in (Hertz 1894, §232)). If we know a complete solution R of eqn (23.5) containing r − 1 constants $\alpha_1, \alpha_2, \ldots, \alpha_{r-1}$, in addition to the trivial additive constant, then we can even determine a system of equations for the orthogonal trajectory in finite terms. Indeed

$$\frac{\partial R(q_1, \ldots, q_r)}{\partial \alpha_i} = \beta_i, \qquad i = 1, 2, \ldots, r - 1 \tag{23.8}$$
represent such a system when (βi ) is another system of constants (§235). This corresponds to Jacobi’s famous theorem (Jacobi 1837) (Section 2.1) with the important difference that Hertz’s presentation gives a beautiful geometric interpretation of Jacobi’s purely analytic formalism. Hertz’s discussion clearly demonstrated what he had claimed in the introduction: that Hamilton’s formalism was in essence a purely geometric formalism.
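As a hedged illustration of eqns (23.5)–(23.8), one can work through the simplest case that can be written down: a two-dimensional configuration space with rectangular coordinates. The choice $b_{\rho\sigma} = \delta_{\rho\sigma}$ and the particular solution R below are assumptions made here for illustration; they are not taken from Hertz’s text.

```latex
% Assumption: r = 2, rectangular coordinates, a_{\rho\sigma} = b_{\rho\sigma} = \delta_{\rho\sigma}.
\begin{align*}
&\text{Eqn (23.5) becomes} &
\left(\frac{\partial R}{\partial q_1}\right)^2 + \left(\frac{\partial R}{\partial q_2}\right)^2 &= 1 .\\
&\text{A complete solution with } r-1 = 1 \text{ constant } \alpha: &
R &= q_1\cos\alpha + q_2\sin\alpha .\\
&\text{Surfaces (23.6):} &
q_1\cos\alpha + q_2\sin\alpha &= \text{Const.}\\
&\text{Eqn (23.7):} &
\cos(s, q_1) = \cos\alpha, \quad \cos(s, q_2) &= \sin\alpha .\\
&\text{Eqn (23.8):} &
\frac{\partial R}{\partial\alpha} = -q_1\sin\alpha + q_2\cos\alpha &= \beta .
\end{align*}
% The surfaces (23.6) are parallel straight lines with unit normal
% (\cos\alpha, \sin\alpha); two of them with constants c_1 and c_2 lie at the
% orthogonal distance |c_2 - c_1|. Eqn (23.8) is a straight line with
% direction (\cos\alpha, \sin\alpha), i.e. an orthogonal trajectory of the family.
```

In this flat case the straightest paths are ordinary straight lines, and the two constants α and β select the direction and the position of the line, respectively.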
23.2 The characteristic and principal functions

As in the case of the equations of motion, Hertz applied his theorems of the straightest distance to the dynamics of book two in two steps: first to free systems (§409–417) and then to unfree systems, or more specifically to conservative systems (§644–661). For free systems the above results about the straightest distance are immediately applicable since for such systems, the natural motion follows the straightest path. To be sure, the above theorems only determine the natural path and not the position of the system as a function of time. However, the latter can easily be found from the conservation of velocity or energy. Another way to account more directly for the way a system traverses its path is by way of Hamilton’s characteristic and principal functions. Hertz defined the characteristic function as

$$V = \sqrt{2Em}\,S, \tag{23.9}$$

where V is considered as a function of the $q_\rho^0$’s, the $q_\rho^1$’s, and E, and he defined the principal function as

$$P = \frac{mS^2}{2(t_1 - t_0)}, \tag{23.10}$$

considered as a function of the $q_\rho^0$’s, the $q_\rho^1$’s, $t_0$, and $t_1$. He deduced the two partial differential equations that each of them satisfies, corresponding to eqns (2.43) and (2.44) for V and to eqns (2.45) and (2.46) (with P instead of S) for P, as well as the equations expressing the motion in finite terms, corresponding to eqns (2.47)–(2.49) and (2.50)–(2.52) (with P instead of S). In this way he derived Hamilton’s formalism for an isolated system as a straightforward consequence of the geometric theorems about straightest paths. He also mentioned the possibility of carrying over to dynamics the theorems about general solutions to eqn (23.5) (i.e. Jacobi’s generalization of Hamilton’s ideas), and admitted that such general solutions could
be analytically advantageous. However, he did not go into any details because he felt that . . . their physical significance, on account of the mathematical complications, becomes more and more obscure. (Hertz 1894, §417)
About the introduction of the characteristic and principal functions he even remarked: It appears, moreover, that even in the characteristic and principal functions it is only the simple idea of the straightest distance which appears, and this, too, somewhat indistinctly; so that the introduction of these two functions together and in addition to the straightest distance would have but little significance if all the systems to be considered were always, as here, completely known and free. (Hertz 1894, §417)
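To see concretely why the two functions add little to the straightest distance in the free case, the definitions (23.9) and (23.10) can be evaluated directly. The sketch below assumes, as an illustration not spelled out in the passage above, that the free system covers the straightest distance S at constant speed v during the time $t_1 - t_0$ and that its energy is $E = \tfrac12 m v^2$.

```latex
% Assumptions: constant speed v along the straightest path of length S,
% and E = (1/2) m v^2.
\begin{align*}
v &= \frac{S}{t_1 - t_0}, \qquad \sqrt{2Em} = mv,\\
V &= \sqrt{2Em}\,S = mvS = \frac{mS^2}{t_1 - t_0},\\
P &= \frac{mS^2}{2(t_1 - t_0)} = \tfrac12 m v^2\,(t_1 - t_0) = E\,(t_1 - t_0),\\
\frac{\partial V}{\partial E} &= \sqrt{\frac{m}{2E}}\;S = \frac{S}{v} = t_1 - t_0,
\qquad
\frac{\partial P}{\partial t_1} = -\frac{mS^2}{2(t_1 - t_0)^2} = -E,\\
V &= P + E\,(t_1 - t_0) = 2P .
\end{align*}
```

Under these assumptions both V and P are simple algebraic functions of S, which is consistent with Hertz’s remark that only the idea of the straightest distance appears in them.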
In the case of an unfree system, where only a part of the total free system is assumed to be known, the above theorems are no longer applicable. However, when the system is conservative Hertz could define the characteristic function of the visible system as the action integral (see eqns (21.16) and (2.23))

$$V = \sqrt{2m}\int_0^1 \sqrt{h - U}\; ds; \tag{23.11}$$
where the integral has to be taken along the natural paths between two configurations of the visible coordinates alone. In the case where there is no hidden cyclic system this definition coincides with eqn (23.9) or differs from it only by a constant. The characteristic function, considered as a function of the initial and final visible coordinates and h, satisfies the following first-order partial differential equations

$$\frac{1}{2m}\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{1}_{\rho\sigma}\,\frac{\partial V}{\partial q^{1}_{\rho}}\,\frac{\partial V}{\partial q^{1}_{\sigma}} = (U - h)_1 \tag{23.12}$$

$$\frac{1}{2m}\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{0}_{\rho\sigma}\,\frac{\partial V}{\partial q^{0}_{\rho}}\,\frac{\partial V}{\partial q^{0}_{\sigma}} = (U - h)_0, \tag{23.13}$$
corresponding to eqns (2.43) and (2.44), and eqns (2.47)–(2.49) describe the motion in finite form. Similarly, Hertz defined the principal function as

$$P = \int_0^1 (T - U)\, dt, \tag{23.14}$$
where again one integrates along the trajectory of the visible system (see eqn (21.15)). When there is no hidden cyclic system this definition only differs from eqn (23.10) by a constant. The principal function considered as a function of the initial and final
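visible coordinates and the initial and final time satisfies the following first-order partial differential equations

$$\frac{1}{2m}\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{1}_{\rho\sigma}\,\frac{\partial P}{\partial q^{1}_{\rho}}\,\frac{\partial P}{\partial q^{1}_{\sigma}} + \frac{\partial P}{\partial t_1} = U_1 \tag{23.15}$$

$$\frac{1}{2m}\sum_{\rho=1}^{r}\sum_{\sigma=1}^{r} b^{0}_{\rho\sigma}\,\frac{\partial P}{\partial q^{0}_{\rho}}\,\frac{\partial P}{\partial q^{0}_{\sigma}} + \frac{\partial P}{\partial t_0} = U_0, \tag{23.16}$$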
corresponding to eqns (2.45) and (2.46) (with P instead of S), and eqns (2.50) and (2.52) describe the motion in finite form. Thus, Hertz was able to express the analytical equations of the Hamilton formalism for a conservative system, without taking the hidden system into account except through the force function U . The geometric interpretations that made his theory for the straightest distance so appealing no longer hold in his description of conservative systems. However, it is possible to introduce a different metric in configuration space, so that the geometric part of the theory also applies to conservative systems. Such a formalism had already been suggested by some mathematicians prior to Hertz.
24 Mathematicians on the geometrization of the Hamilton–Jacobi formalism
It has long since been remarked by mathematicians that Hamilton’s method contains purely geometrical truths, and that a peculiar mode of expression, suitable to it, is required in order to express these clearly. But this fact has only come to light in a somewhat perplexing form, namely, in the analogies between ordinary mechanics and the geometry of space of many dimensions, which have been discovered by following out Hamilton’s thoughts. Our mode of expression gives a simple and intelligible explanation of these analogies. It allows us to take advantage of them, and at the same time it avoids the unnatural admixture of supra-sensible abstractions with a branch of physics. (Hertz 1894, p. 39/32–33)
Together with an explicit reference to the work of Beltrami, Lipschitz and Darboux in the preface the above passage from the introduction is the only reference Hertz made to the work on mechanics done by contemporary mathematicians. Hertz correctly connected their work with his own treatment of the Hamiltonian formalism. I shall therefore here give a short summary of the mathematical developments in this area during the period 1828–1888, so that we can compare the accomplishments with those of Hertz. A more detailed discussion can be found in my paper (Lützen 1995).
24.1 Gauss and Hamilton on geodesics, optics and dynamics

The development during the middle of the nineteenth century of differential geometric notions in dynamics came about through a combination of Gauss’s differential geometry and Hamilton’s formalism in mechanics. Both have their origin in the year 1828. In that year Carl Friedrich Gauss published his Disquisitiones generales circa superficies curvas in which he explained how to study surfaces intrinsically, that is without taking into account how they are imbedded into space. As a basis for the intrinsic study of surfaces, Gauss introduced the line element

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2, \tag{24.1}$$
[Fig. 24.1. Polar geodesic coordinates]
[Fig. 24.2. General geodesic coordinates]
where u and v are ‘surface coordinates’ by which a point on the surface can be determined. He showed that the Gauss curvature is an intrinsic property (the famous Theorema Egregium) and he studied geodesics, which are also intrinsic objects. In particular, he proved the remarkable theorem (Fig. 24.1): If on a curved surface an infinite number of shortest lines (geodesics) of equal length be drawn from the same initial point, the line joining their extremities will be normal to each of the lines. (Gauss 1828, §15)
He generalized the theorem to the case where the geodesics, instead of emanating from one point, are drawn orthogonally to a given curve (Fig. 24.2). If one measures off equal distances along each geodesic the endpoints will again constitute a curve orthogonal to all the geodesics. These theorems suggested to Gauss that one can introduce particularly simple coordinates on the surface. In Fig. 24.2 let an arbitrary point P have coordinates r, φ where r is the distance from P to the given curve measured along the geodesic that cuts
the given curve orthogonally at P′, and φ is the distance from P′ to a given point O on the given curve measured along that curve. In particular, if the given curve is an infinitely small circle around O and φ measures the angle along the circle we get polar coordinates as suggested by Fig. 24.1. The theorem stated above shows that in these coordinates the line element will not have a mixed dr dφ term. Moreover, since r measures the geodesic distance the coefficient of $dr^2$ is one, so that the line element in these coordinates has the simple appearance:

$$ds^2 = dr^2 + m^2\,d\phi^2. \tag{24.2}$$
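Two standard instances of the form (24.2) may help fix ideas. They are added here as illustrations and are not taken from Gauss’s text; the unit radius of the sphere is an assumption made for simplicity.

```latex
% Geodesic polar coordinates (r, \phi) around a point O, as in Fig. 24.1.
\begin{align*}
&\text{Euclidean plane:} & ds^2 &= dr^2 + r^2\,d\phi^2, & m &= r,\\
&\text{Unit sphere, } O \text{ at the pole:} & ds^2 &= dr^2 + \sin^2 r\; d\phi^2, & m &= \sin r .
\end{align*}
% In both cases the curves r = const. (geodesic circles) cut the geodesics
% \phi = const. (rays, respectively meridians) orthogonally, because the
% line element contains no mixed term dr\,d\phi.
```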
Gauss showed how one can determine such a system of coordinates from an arbitrarily given set of surface coordinates u, v. First, one must determine r(u, v) as a solution to the first-order partial differential equation

$$EG - F^2 = E\left(\frac{\partial r}{\partial v}\right)^2 - 2F\,\frac{\partial r}{\partial u}\,\frac{\partial r}{\partial v} + G\left(\frac{\partial r}{\partial u}\right)^2, \tag{24.3}$$
and then one must determine φ(u, v) as a solution of the partial differential equation

$$\left(E\,\frac{\partial r}{\partial v} - F\,\frac{\partial r}{\partial u}\right)\frac{\partial \phi}{\partial v} = \left(F\,\frac{\partial r}{\partial v} - G\,\frac{\partial r}{\partial u}\right)\frac{\partial \phi}{\partial u}. \tag{24.4}$$

This gives a two-step method of finding geodesics on a surface: Instead of integrating the equations of the geodesics directly one first finds a solution r of eqn (24.3) and then determines φ from eqn (24.4), which states that the family of geodesics φ = const. is orthogonal to the family r = const. The procedure is completely analogous to the Hamilton–Jacobi formalism, and indeed eqn (24.3) is the Hamilton–Jacobi equation for a point mass moving on the surface and not influenced by any forces. Obviously its trajectory is a geodesic.

Now, let us turn to the other main 1828 publication in our story, namely Hamilton’s Theory of Systems of Rays (Hamilton 1828). One of the main theorems of this paper was a generalization of a theorem by Etienne Louis Malus. It states that when a system of light rays emanating from a point are reflected in a series of (curved) mirrors, they remain a normal congruence. That means that through any point on any of the reflected light rays there exists a surface that cuts all the reflected light rays orthogonally. In a series of supplements to the paper (Hamilton 1830b), (Hamilton 1830a), (Hamilton 1832) Hamilton generalized the theorem further to a system of rays that are continually refracted by moving through a continuous medium with varying refractive index v. For this purpose Hamilton introduced the characteristic function

$$V = \int_B^C v\, ds, \tag{24.5}$$
i.e. the action integral along the light ray connecting the points B and C and considered it as a function of the coordinates of these points. From the time of Fermat it was
known that the light ray between B and C minimizes the action integral V among all curves between the two points. By varying the endpoints Hamilton deduced the sought generalization of Malus’s theorem: Theorem. Consider the family of light rays emanating from a given point B (or orthogonally from a given surface) and determine on each ray a point C such that the characteristic function V (B, C) (B being the intersection of the ray and the given surface) is a given constant. Then the points C will constitute a surface S that is orthogonal to all the light rays. This theorem is an obvious 3-dimensional analogue of Gauss’s theorem about geodesics, v ds playing the role of the line element and light rays being geodesics in this metric. Hamilton also deduced the partial differential equations that V must satisfy. These came to play an important role when he generalized his method from optics to mechanics in (Hamilton 1834), (Hamilton 1835). As explained in Section 2.1, he generalized the notion of the characteristic function to a mechanical system, derived the differential equations it must satisfy and displayed the equations of motion in finite form in terms of the characteristic function. Thus, Hamilton carried the analytic aspects of his optics over to mechanics. However, the geometric aspects of his geometric optics were not carried over to mechanics; in particular, Hamilton did not prove a theorem of mechanics corresponding to the above generalization of Gauss’s and Malus’s theorems. The main reason why he did not carry the geometric aspects over to mechanics was probably that the necessary high-dimensional geometry had not been developed at the time.
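The optical theorem above can be checked by hand in the simplest situation. The sketch below assumes a homogeneous medium with constant refractive index, written n here (a symbol introduced for this illustration), and rays emanating from a single point B; neither assumption comes from Hamilton’s papers.

```latex
% Assumption: constant refractive index v = n, rays emanating from the point B.
\begin{align*}
V(B, C) &= \int_B^C n\, ds = n\,\lvert C - B\rvert
  && \text{(the rays are straight lines)},\\
\nabla_C V &= n\,\frac{C - B}{\lvert C - B\rvert},
  \qquad \lvert \nabla_C V\rvert^2 = n^2
  && \text{(first-order equation satisfied by } V\text{)},\\
V(B, C) &= \text{const.}
  \iff \lvert C - B\rvert = \text{const.}/n
  && \text{(spheres centred at } B\text{)}.
\end{align*}
% The gradient of V at C points along the ray from B to C, so every ray
% cuts each sphere V = const. orthogonally, as the theorem asserts.
```

With the line element n ds in place of ds, this is the same statement as Gauss’s theorem on geodesic circles, now in three dimensions.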
24.2 Liouville and Lipschitz on the principle of least action

As we saw in Section 2.1 Jacobi carried Hamilton’s method further, but in so doing he went further in a purely analytic direction, leaving geometry and even mechanical intuition behind. However, some of the French mathematicians who reacted to Hamilton’s and Jacobi’s ideas began to introduce some geometric methods. First, Joseph Alfred Serret gave a differential geometric derivation of the Hamilton–Jacobi equation for one point mass moving in a conservative force field, and gave a geometric meaning to the characteristic function in this case ((Serret 1848a), see also (Serret 1848b)). More interesting for the present discussion is a paper by Joseph Liouville (1809–1882) entitled ‘A remarkable expression of the quantity which for the motion of a system of material points with arbitrary constraints is a minimum according to the principle of least action’ (Liouville 1856). As indicated by the title, Liouville considered the action integral

$$A = \int_0^1 \sqrt{2m(h - U)\sum_{i,j=1}^{r} a_{ij}\, dq_i\, dq_j}, \tag{24.6}$$
where the letters have the same meaning as in Hertz’s Mechanics.¹ He showed that if V is a solution to a certain first-order differential equation (the Hamilton–Jacobi equation) then one can transform the differential form f(dq) under the square root sign in the action integral into the simple form

$$f(dq) = 2m(h - U)\sum_{i,j=1}^{r} a_{ij}\, dq_i\, dq_j = (dV)^2 + g(d\phi_2, d\phi_3, \ldots, d\phi_r), \tag{24.7}$$
where $g(d\phi_2, d\phi_3, \ldots, d\phi_r)$ is a certain positive quadratic differential form of the r − 1 new coordinates $\phi_2, \phi_3, \ldots, \phi_r$. Once the action integral is given the form

$$A = \int_0^1 \sqrt{(dV)^2 + g(d\phi_2, d\phi_3, \ldots, d\phi_r)} \tag{24.8}$$
it is easy to determine a family of trajectories. Indeed, according to the principle of least action, the trajectory from one configuration 0 to another configuration 1 will be the path that minimizes the action integral (24.8). But a path that makes $g(d\phi_2, d\phi_3, \ldots, d\phi_r) = 0$ will clearly minimize this integral. Thus by setting

$$\phi_2 = c_2, \quad \phi_3 = c_3, \quad \ldots, \quad \phi_r = c_r \tag{24.9}$$
we get a trajectory of the system, and along that trajectory the action integral takes the value

$$A = \int_0^1 dV = V(1) - V(0), \tag{24.10}$$
which shows that V is the characteristic function. Liouville’s transformation of coordinates is entirely analogous to Gauss’s transformation of coordinates from the general u, v coordinates to the r, φ coordinates in which the line element has the form (24.2), similar to the last expression in eqn (24.7). The transformation was suggested to Liouville by a paper by Schläfli (Schläfli 1852) on geodesics on an ellipsoid, and this may, in turn, have been inspired by Gauss’s transformation. So, the origin of Liouville’s transformation was of a geometric nature. Yet, Liouville presented it in an entirely analytic form in his paper. A similar presentation was given independently by Ferdinand Minding [1864]. Independently of both Liouville and Minding the same idea was pursued by Lipschitz who presented the theory in an explicitly geometric language. His paper (Lipschitz 1872) was published only a few years after the appearance of Riemann’s Habilitationsvortrag (Riemann 1867b). Its main aim was to investigate what dynamics of systems of material points would look like in a Riemannian manifold, or more generally in a Finsler manifold, where the metric is given as the p-th root of a homogeneous differential form of degree p in the coordinates. Here, I shall not enter into the difficulties created by this very general setting, but only deal with Lipschitz’s ideas in the case that he himself thought to be the physically real case, when the point masses of the system live in Euclidean 3-space. In this case, he proved theorems entirely similar to Liouville’s, but he interpreted the analytic formalism geometrically. In particular, he considered the solutions of an equation like

$$\phi(q_1, q_2, \ldots, q_r) = c \tag{24.11}$$

as determining a hypersurface (a manifold of the (r − 1)-th order) in configuration space, and the solutions of the system of r − 1 equations (24.9), i.e. the intersection of the corresponding hypersurfaces, as a curve (a manifold of the first order). Moreover, he defined a hypersurface given by eqn (24.11) and a curve given parametrically by $q_i(t)$ to be orthogonal with respect to the form f(dq) at their point of intersection if at that point the following equations hold:

$$\frac{\partial\phi/\partial q_i}{\partial\phi/\partial q_1} = \frac{\partial f(dq)/\partial q_i}{\partial f(dq)/\partial q_1}. \tag{24.12}$$

[Fig. 24.3. Geometric illustration of the solution of the Hamilton–Jacobi equation]

¹ I have modified Liouville’s notation slightly in order to fit that of Hertz and Lipschitz.
Lipschitz showed that orthogonality with respect to the form defined in eqn (24.7) is the same as orthogonality with respect to the form $\sum_{i,j=1}^{r} a_{ij}\, dq_i\, dq_j$, i.e. with respect to Hertz’s line element. Thus, Lipschitz’s concept corresponds entirely to Hertz’s concept of orthogonality between a surface of positions and a path of a mechanical system. In terms of these geometric notions Lipschitz could now prove the following generalizations of Gauss’s and Malus’s theorems to mechanics:

Theorem. Let $P(q_1, q_2, \ldots, q_r, \alpha_1, \alpha_2, \ldots, \alpha_r)$ be a complete solution of the Hamilton–Jacobi equation. Fix the values of $\alpha_1, \alpha_2, \ldots, \alpha_r$ and consider the family of trajectories of the mechanical system that are orthogonal to the hypersurface P = A with respect to the form f(dq) (defined by eqn (24.7)) (Fig. 24.3). Then the trajectories
are determined by the equations

$$\frac{\partial P}{\partial \alpha_i} = \left(\frac{\partial P}{\partial \alpha_i}\right)_0, \tag{24.13}$$

where $(\partial P/\partial\alpha_i)_0$ is the value of $\partial P/\partial\alpha_i$ at the intersection point. Moreover, any other hypersurface P = B will cut the trajectories orthogonally with respect to f(dq) and the action integral along the trajectories between the two hypersurfaces P = A and P = B will have the value B − A.

Theorem. Let $P(q_1, q_2, \ldots, q_r) = A$ denote a hypersurface (Fig. 24.3). Consider the family of trajectories of the mechanical system cutting this hypersurface orthogonally with respect to the form f(dq) (defined by eqn (24.7)). On each trajectory and on the same side of the hypersurface determine a point such that the action integral V between the hypersurface and this point is equal to B − A. Then these points make up a hypersurface that is orthogonal to all the trajectories with respect to f(dq).² Moreover, if the action integral V along a trajectory from its intersection with P = A to an arbitrary point is considered as a function of this latter point, then R = A + V is a solution of the Hamilton–Jacobi equation and the hypersurface R = A will coincide with the original hypersurface P = A.

In the above theorems the trajectories and the integral of least action must all correspond to the same given value h of the total energy, unless the potential function is a constant.
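Liouville’s transformation (24.7) and the theorems above can be followed explicitly in the simplest case. The sketch below assumes a single free particle in the plane, with constant potential U = 0, $a_{ij} = \delta_{ij}$, and polar coordinates (ρ, θ); these choices, and the symbols ρ and θ, are introduced here for illustration and do not appear in Liouville’s or Lipschitz’s papers.

```latex
% Assumptions: one particle in the plane, U = 0, a_{ij} = \delta_{ij},
% polar coordinates (\rho, \theta) around a fixed point O.
\begin{align*}
f(dq) &= 2mh\,(dx^2 + dy^2) = 2mh\,(d\rho^2 + \rho^2\, d\theta^2),\\
V &= \sqrt{2mh}\;\rho
  \quad\Longrightarrow\quad (dV)^2 = 2mh\, d\rho^2,\\
f(dq) &= (dV)^2 + g(d\phi_2), \qquad
  \phi_2 = \theta, \quad g(d\phi_2) = 2mh\,\rho^2\, d\theta^2 \ge 0,\\
\phi_2 &= c_2 \quad\text{(a straight ray through } O\text{)}, \qquad
  A = V(1) - V(0) = \sqrt{2mh}\,(\rho_1 - \rho_0).
\end{align*}
% The hypersurfaces V = const. are circles around O, cut orthogonally by the
% rays \theta = const.; the action between two of them equals the difference
% of the constants, as in the two theorems.
```

Here $\sqrt{2mh}$ is the constant momentum of the particle, so the action is simply momentum times path length.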
24.3 Trajectories as geodesics

As we saw above, Hertz only formulated these theorems for a free system but as they stand they apply equally well to Hertz’s conservative systems. Hertz explained that he did not develop the Jacobi formalism in this direction because he felt that the physical meaning would be ‘obscured under its mathematical form’ (Hertz 1894, §661). However, there may be another reason as well: when dealing with a free system, the surfaces P = A and P = B, where P is a solution to the Hamilton–Jacobi equation, are separated by a constant orthogonal distance. For a conservative system this is no longer the case. They are separated by a constant action integral. Therefore, Lipschitz’s theorems do not have the same geometric appeal as Hertz’s theorems about free systems. At least they do not have the same appeal, unless one introduces a new geometry into configuration space, namely the one defined by the line element given by f(dq) in eqn (24.7):

$$ds_L^2 = f(dq) = 2m(h - U)\sum_{i,j=1}^{r} a_{ij}\, dq_i\, dq_j = 2m(h - U)\, ds_H^2, \tag{24.14}$$

where $ds_H$ denotes Hertz’s line element. In that case, the notion of orthogonality conserves its meaning (as we mentioned above) but now the level surfaces of P are indeed equidistant. Thus, one gets exactly the same formulations of the theorems as Hertz gave for free systems. From a modern point of view, we can therefore read Lipschitz’s paper as saying: Equip configuration space with the Riemannian metric $ds_L$ defined above. Then Gauss’s theorem about geodesics holds true in this higher-dimensional manifold, and represents the generalization of Malus’s theorem to mechanical systems with total (mathematical) energy h. However, Lipschitz never explicitly called $\sqrt{f(dq)}$ a metric so this formulation is somewhat anachronistic. Yet it is not far from Lipschitz’s own way of thinking. He explicitly referred to Gauss’s investigations of geodesics, and Beltrami’s generalizations to higher dimensional Riemannian manifolds (Beltrami 1868b) as his ‘Leitfaden’ (leading thread), and clearly modelled his transformation (24.7) of the form $ds_L^2 = f(dq)$ on Gauss’s analogous transformation of the line element (24.2). The idea of treating $ds_L$ as a line element in configuration space is more explicit in the subsequent treatment by Gaston Darboux (1842–1917). In the second part of his Leçons sur la théorie générale des surfaces he explained why he had included a chapter on mechanics in a book on surfaces: In particular, I have stressed the connection one can find here between the methods employed by Gauss in the study of geodesics and those that Jacobi later applied to the problems of analytical mechanics. In this way, I have been able to show the great interest of Jacobi’s beautiful discoveries when those are considered from a geometrical point of view. (Darboux 1888, Preface)

² This part of the theorem for one particle in a force field was stated without reference to Lipschitz in (Thomson and Tait 1879, p. 353).
In particular, Darboux emphasized the analogy between the theory of geodesics and the theory of trajectories of a mechanical system. Indeed, if we equip configuration space with the metric $ds_L$ then, according to the principle of least action, trajectories of the mechanical system are determined by the variational principle

$$\delta\int_B^C ds_L = 0. \tag{24.15}$$
They are therefore geodesics in this Riemannian manifold. As we have seen, Hertz showed that a free holonomic system (not influenced by forces) moves along a geodesic in the space equipped with the metric $ds_H$. What Lipschitz had implicitly done was to introduce a new line element $ds_L$ in configuration space such that a conservative system with total (mathematical) energy h will move along geodesics even when the system is influenced by forces. The trick was to include the forces into the geometry by way of the potential energy U as in eqn (24.14). This trick had been anticipated by Liouville already in 1850–1851 in a series of lectures at the Collège de France (Lützen 1990, p. 754). He considered a particle moving on a given surface equipped with isothermal coordinates with the line element

$$ds^2 = \lambda(\alpha, \beta)\,(d\alpha^2 + d\beta^2), \tag{24.16}$$
and influenced by a conservative force with potential function U. The action integral could thus be written³

$$A = \int \sqrt{2m(h - U)\,\lambda\,(d\alpha^2 + d\beta^2)}. \tag{24.17}$$

Concerning this expression, Liouville remarked: Thus a reduction to the plane with force function λ(h − U) . . . (Liouville 1851).
Thus, Liouville suggested that one can think about the analytic problem of minimizing the above integral in two ways: Either as it was originally presented, as giving the trajectories of a point moving on the given surface with line element $ds^2 = \lambda(d\alpha^2 + d\beta^2)$ and influenced by a force field with potential U, or, alternatively, as giving the trajectories of a point moving in a plane (with line element $ds^2 = d\alpha^2 + d\beta^2$) and influenced by a force field with potential (h − U)λ. In particular, if the potential U is constant, Liouville has transformed the problem of the motion of a point in a plane under the influence of forces into the problem of finding geodesics of a surface. Liouville’s unpublished note, and Lipschitz’s and Darboux’s published works showed how one can geometrize conservative forces. If the corresponding potential energy is taken into account in the formation of the Riemannian metric of configuration space, mechanical systems will move along geodesics, i.e. as though they were not influenced by forces. In a sense this method can be thought of as a mathematically precise realization of Clifford’s dream about reducing physics to the study of curved space (see Section 4.3), and can also be thought of as an anticipation of central ideas of the general theory of relativity in which Einstein showed how to interpret gravitation as a result of the curvature of space-time. However, the dating of the publication of Clifford’s and Lipschitz’s papers excludes any direct influence between them, and nothing seems to suggest that Einstein should have known of Lipschitz’s and Darboux’s geometrization of classical mechanics. Moreover, there are important differences between the geometrization by the nineteenth-century mathematicians and Einstein’s theory. First, on a technical level, Einstein changed the geometry of space-time, whereas the mathematicians changed the geometry of configuration space.
Secondly, the mathematicians did not go as far as Einstein in the sense that they only dealt with a given mechanical system with a given total energy. Thirdly, Einstein did not go as far as the mathematicians in the sense that he only geometrized gravitational forces, whereas the mathematicians dealt with all conservative forces. Finally, and most importantly, the nineteenth-century mathematicians only considered their inclusion of the forces into the line element as an elegant mathematical trick. It was not supposed to make any fundamental change in the physical meaning of the formalism. Einstein’s formalism, on the other hand, changed the basic assumptions and principles of mechanics, and therefore added to our physical understanding of the world.

³ I have changed Liouville’s notation in eqn (24.17) and in the quotation that follows it. Liouville wrote $A = \int \sqrt{2(U + C)\,\lambda\,(d\alpha^2 + d\beta^2)}$.
24.4 Hertz and the mathematicians

Let me conclude this chapter by comparing Hertz’s geometrical approach to mechanics with that of his mathematical precursors. First, it is to be remarked that Hertz is more explicit in his dealing with the geometric formalism than the previous mathematicians. Both Lipschitz and Hertz introduced (the same) concept of orthogonality, but only Hertz spoke about $ds_H$ as a distance; Lipschitz and Darboux only spoke of $ds_L$ as being analogous to a metric on a surface. In this sense Hertz went further than his predecessors. Moreover, the mathematicians did not introduce the important concept of the reduced components (the covariant components) of a vector quantity.

On the other hand, in their dealings with conservative forces (or equivalently potential energy) the mathematicians went further than Hertz. By implicitly including the forces in the line element they could show that a mechanical system moves along geodesics, even when it is influenced by forces. Hertz also avoided forces in his mechanics, but rather than including them in the line element he accounted for them by a hidden adiabatic cyclic system. Therefore, only free (holonomic) systems move along geodesics in Hertz’s mechanics. Nothing would, of course, have prevented Hertz from introducing a second metric $ds_L$ on configuration space, but this would clearly have been a purely mathematical trick without physical substance, and, as is apparent from Hertz’s comments about the general Jacobi formalism (quoted above), Hertz was not prepared to go that far. And here is perhaps the greatest difference between Hertz and his predecessors: Hertz believed that his mechanics was not just a new mathematical formalism but provided an image of the world that was physically different from the earlier images.

In Section 13.1 I have discussed in general terms how the geometrical formalism of his predecessors might have influenced Hertz.
Now that we have discussed the mathematical details of this formalism, we can be more specific. There are three things that Hertz could have learned from Lipschitz and Darboux:

1. the general idea of geometrizing configuration space,
2. the mathematical details about how to deal with quadratic differential forms, and
3. the geometric version of the Hamilton formalism.

Recall that in the preface to his Mechanics Hertz wrote that he became familiar with the work of Beltrami and Lipschitz at a time when his own research had made ‘considerable progress.’ If he did not intentionally lie or willfully try to deceive his readers⁴, we can exclude that Hertz borrowed the general geometric ideas from the mathematicians. I have argued (Section 13.1) that there is no reason to doubt Hertz’s declaration of independence relative to his mathematical predecessors.

⁴ I say willfully deceive, because the way Hertz phrased the sentence (see Section 13.1) does not, in fact, exclude the possibility that he had borrowed all the geometric ideas from Darboux, and only familiarized himself with Beltrami’s and Lipschitz’s papers, to which Darboux explicitly referred, when his own investigations had made considerable progress. However, if that was the case, Hertz’s careful formulation is definitely deceptive.
Hertz’s manuscripts strongly suggest that he developed his geometry of systems of points on his own. But what shall we then make of his statement in the preface to the effect that when he learned of the works of Beltrami, Lipschitz and Darboux he ‘found these very suggestive’ (‘konnte ich noch reiche Anregung aus denselben schöpfen’) (Hertz 1894, p. XXXII)? This could just be a polite gesture to his older colleagues, or it could be a reference to the geometric formulation of the Hamiltonian (and the Hamilton–Jacobi) formalism. Indeed, as pointed out above, the only specific reference Hertz made to the mathematicians was to their geometric but perplexing version of Hamilton’s method. It is therefore possible that the works of the mathematicians suggested the geometric treatment of the Hamilton formalism to Hertz. However, if that is the case, it happened before Hertz finished the first long draft (Ms 9) of the Mechanics. Indeed, the end of that manuscript contains a preliminary version of the geometric Hamilton–Jacobi theory. He introduced the shortest distance $S$, derived the partial differential equations (23.3) and (23.4) and showed how the straightest paths can be determined from $S$. He did not consider general solutions to eqn (23.5) in the first manuscript, but in the sketchy draft of the second book he introduced the characteristic function $V$ (23.9) and indicated as a section heading ‘Lehrsatz [?] von Jacobi’ (Jacobi’s theorem) (Ms 9, p. 53) without substantiating this any further. Thus, it seems that whatever Hertz learned from the mathematicians, he learned it before completing the first long draft Ms 9. This makes it a matter of conjecture to try to pinpoint Hertz’s debt to his mathematical precursors.
25 Hertz on the domain of applicability of his mechanics
In this chapter I shall analyse Hertz’s views concerning the range of applicability of his mechanics, in particular his views about living systems. Hertz discussed these matters in the introduction of the book, after the introduction of conservative forces, and in four sections following the introduction of the fundamental law. These latter sections are entitled: ‘Validity of the Fundamental Law,’ ‘Limitations of the Fundamental Law,’ ‘Method of applying the fundamental law,’ and ‘Approximate Application of the Fundamental Law.’
25.1 Practical applications

Hertz considered his Mechanics as a purely theoretical endeavor. It was not intended to facilitate ‘practical applications or the needs of mankind. In respect of these latter, it is scarcely possible that the usual representation of mechanics which has been devised expressly for them can ever be replaced by a more appropriate system,’ Hertz admitted (Hertz 1894, p. 47/40).

Our representation of mechanics bears towards the customary one somewhat the same relation that a systematic grammar of a language bears to a grammar devised for the purpose of enabling learners to become acquainted as quickly as possible with what they will require in daily life. The requirements of the two are very different, and they must differ widely in their arrangement if each is to be properly adapted to its purpose. (Hertz 1894, p. 47/40)¹
¹ The German original in fact states that the customary representation (Darstellung) bears to Hertz’s representation a relation similar to the relation between a systematic grammar and a grammar devised for the quick acquisition of practical language skills. Obviously Hertz meant the reverse, and this is what the English translation states.

Since he could show that the usual principles of mechanics also hold in his image of mechanics, any analysis of a mechanical problem within the usual mechanics is, in a sense, also valid in his mechanics. And if the system is described in terms of forces, nothing is gained by trying to reduce the forces to the motion of a hidden system. Of course, from the point of view of Hertz’s mechanics theoretical insight
may be gained that way, but for practical purposes an analysis and solution in terms of Newtonian mechanics will be simpler.

If a mechanical system is described entirely through rigid connections, such as for example a linkage, a gear-wheel mechanism, an integrator, or another machine, one might use Hertz’s mechanics to study it also from a practical point of view. But even in that case one is faced with a problem that Hertz discussed in §327–330 of the book: the rigid connections that we know from experience are probably not the basic rigid connections, but may admit of further physical explanations. The fundamental law explains how a system will move when we know the basic connections. Therefore, the question arises whether the fundamental law applies at all to a system that is described not by the basic connections but by a set of equations derived from them. Hertz gave the answer ‘Yes.’

Note. When equations result from the given equations of condition of a system and the fundamental law, which have strictly the form of equations of condition, then for the determination of the motion of the system it is indifferent whether we consider the original equations alone, or instead of them the derived equations, as a representation of the connections of the system. (Hertz 1894, §327)
Hertz’s subsequent argument makes it clear that one can only leave out a set of the original connections that is a consequence of the remaining original connections and the derived ones. On top of this there is another problem: if a mechanical system, such as a machine, is investigated in detail, it will always turn out that the assumed rigid connections between its macroscopic parts are only approximately rigid. ‘We are compelled to seek the ultimate connections in the world of atoms, and they are unknown to us’ (Hertz 1894, §330). So, one might ask whether one will get an approximately correct description of the motion of the system if one applies the fundamental law to the approximate connections. Again, Hertz answered yes (Hertz 1894, §329), but he did not give any argument for this claim.
25.2 Validity and applicability of the fundamental law

As we have seen, Hertz considered the fundamental law as a probable outcome of most general experience. In fact, it is the only empirical element of his image. He maintained that it is not contradicted by any experience, but nevertheless divided mechanical systems into three classes having different relations to the fundamental law.

1. The first class consists of systems that can be shown by experience to satisfy the requirements of a free system, and to which the fundamental law applies directly. As examples of such systems Hertz mentioned rigid bodies moving freely in space and perfect fluids. ‘The fundamental law is deduced from experiences on such material systems. With regard to this first class it merely represents an experimental fact’ (Hertz 1894, §316). This very clear statement needs some qualifications. For, as Hertz pointed out (see above), we never experience absolutely rigid bodies or any other system that is exactly characterized by known constraints. That means that this
class of systems for which the fundamental law is an experimental fact is strictly speaking empty, and all mechanical systems belong to the second (or third) class. 2. The second class consists of systems to which the fundamental law cannot immediately be applied, or which do not at first sight obey the law, but which can be made amenable to the law if one makes certain hypotheses. Hertz mentioned two main examples: First, systems that do not seem to be continuous, for example if impulses occur. Here, he considered it highly probable that all occurrences in the world would, in fact, turn out to be continuous if one considers them on a small enough time and space scale. Secondly, and most interesting, the systems that are described by actions at a distance or include heat. Here, it can happen that if one brings the tangible bodies to rest they will begin moving when set free. Such behavior seemingly contradicts the fundamental law, but one can save the phenomena by assuming that the system is a part of a free system whose other masses are concealed. It appears that assumptions can always be made with regard to these concealed motions such that the complete systems obey the fundamental law. As regards the second class of natural systems the law bears the character of a hypothesis which is in part highly probable, in part fairly probable, but which, as far as we can see, is always permissible. (Hertz 1894, §317)
3. The third class consists of those systems whose motions cannot be described directly by the fundamental law and for which no definite hypotheses can be made that will make their motions conform to the law. ‘Among these are included, for instance, all systems which contain organic or living beings’ (Hertz 1894, §318). I shall investigate the last two types of systems in the following sections.
25.3 Constructability of forces

The main question regarding the second type of systems is whether systems that are usually described by forces can be described in Hertzian terms as well. Hertz returned to that question after he had introduced the derived notions of (conservative) forces in his mechanics.

With any given analytical form of a force of either kind [conservative or non-conservative], the question may be raised whether this form is consistent with the assumptions of our mechanics, or the reverse. To this question an answer cannot in general be given; in particular cases it is to be judged from the following considerations:
(1) If it can be shown that there exists a normal continuous system which exerts forces of the given form, then it is proved that the given form satisfies the postulates of our mechanics.
(2) If it can be proved that the existence of such a system is impossible, then it is shown that the given form contradicts our mechanics.
(3) If it can be shown that there exists in nature any system which we know by experience to exert forces of the given form, then we consider it thereby proved that the given form is consistent with our mechanics.
If no one of the three cases happens, then the question must remain an open one. Should such a form of force be found as would be rejected by the second consideration, but permitted by the third, then the insufficiency of the hypothesis on which our mechanics reposes, and in consequence the insufficiency of our mechanics itself, would be proved. (Hertz 1894, §666, 667)
As we saw above, Hertz considered it permissible and highly probable to consider all forces in nature as a result of a hidden system connected to the tangible system. In the introduction of the book he was a bit more explicit about this assumption: It can be shown that the form of these force functions [those that arise in Hertz’s mechanics] may be of a very general nature; and in fact we do not deduce any restrictions for them. But on the other hand it remains for us to prove that any and every form of the force functions can be realized; and hence it remains an open question whether such a mode of explanation may not fail to account for some one of the forms occurring in nature. Here again we can only bide our time so as to see whether our assumption is refuted, or whether it acquires greater and greater probability by the absence of any such refutation. We may regard it as a good omen that many distinguished physicists tend more and more to favor the hypothesis. I may mention Lord Kelvin’s theory of vortex-atoms: this presents to us an image of the material universe which is in complete accord with the principles of our mechanics. And yet our mechanics in no wise demands such great simplicity and limitation of assumptions as Lord Kelvin has imposed upon himself. We need not abandon our fundamental propositions if we were to assume that the vortices revolved about rigid or flexible, but inextensible, nuclei; and instead of assuming simply incompressibility we might subject the all-pervading medium to much more complicated conditions, the most general form of which would be a matter for further investigation. Thus there appears to be no reason why the hypothesis admitted in our mechanics should not suffice to explain the phenomena. (Hertz 1894, p. 44,45/37,38)
Hertz did not solve the crucial problem of constructing a concealed system that would account for the empirically known forces of nature such as gravitation and electromagnetic forces. In his Kiel Lectures he had argued that both types of forces could be described in terms of field theories and his subsequent famous experiments had demonstrated this to his own satisfaction as far as electromagnetic forces were concerned. However, the question still remained, how to construct a hidden mechanical system (the ether) that could carry these fields. As we have seen in Section 6.3 Hertz never planned to include such a construction as a part of his book on mechanics. The sole aim of the book was to establish the theoretical foundation for a construction of such hidden systems or in other words for constructing a model of the ether. In one place, Hertz even seems to relegate the problem of constructing hidden systems to the realm of experimental physics: ‘To investigate in detail the connections of definite material systems is not the business of mechanics, but of experimental physics’ (Hertz 1894, p. 32/27). Hertz may have thought of experiments similar to his own or other experiments that would reveal more of the hidden atomic world. Such experiments can, of course, tell us more about the connections of definite material systems. However, it seems rather clear that Hertz believed that even the minutest investigations will not be able to reveal all the connections of a system. His discussion of mass seems to suggest that he believed that one would not be able to explain all
phenomena in nature without assuming that there are masses and connections that will remain forever hidden. If that is true, experimental research will never be able to fully uncover the connections of natural systems. There will remain the theoretical problem of constructing at least one possible hidden system that will bring forth the known appearances. Hertz’s appeal to experimental physics at this point thus seems to be a rhetorical trick that allowed him to put the burden of proof on the shoulders of his opponents. Until they could prove that there exist systems (forces) in nature that cannot be accounted for in Hertz’s mechanics by any hypothesis about a connected hidden system, he would consider his mechanics as a correct description of nature.
25.4 Vitalism, teleology, reductionism, and mechanism in nineteenth-century biology

In this section I shall briefly provide a context for Hertz’s considerations about the applicability of his mechanics to living systems. For a more comprehensive analysis see (Merz 1903, Chapter X), (Lenoir 1982), and (Keller 1995, Chapter 2).

Since the scientific revolution an increasing number of phenomena related to living organisms were explained in terms of physics and chemistry. Still, in his Kritik der Urteilskraft Kant (Kant 1790) had argued that it would be impossible to explain the organization of plants and animals solely in terms of physical and chemical forces. Even the usual causal explanation from causes to effects had to be supplemented by a kind of teleological or holistic explanation that takes the organization of the whole organism into account and considers its elements both as causes of the whole and as caused by the whole. According to Kant the organizational principle is present in the embryo of every living organism and directs its growth.

Kant’s ideas were made the basis for a fruitful German research tradition that Lenoir has called teleomechanism. ‘It was mechanical in that the specific functioning of the organ was to be explained not as the result of a vital force but in terms of the forces of physics and organic chemistry; it was teleological in the sense that . . . the same sorts of physico-chemical causation that account for the functioning of the organ are not capable of being the source of its organization’ (Lenoir 1982, p. 159). Many biologists cast their reflections about the organization of living organisms in terms of a vital force (Lebenskraft). This force was sometimes considered merely as an expression of the order in the organism but at times also as a cause of this organization.
It was often considered as a force that although different from physical forces operated in accordance with the usual laws of mechanics, or at least did not contradict these laws. ‘Liebig emphasized that the Lebenskraft was to be conceived solely in terms of the order and arrangement of natural forces, and that its only mode of appearance was through the material interconnections of those forces. This implied that the Lebenskraft had to be analyzed within the same conceptual framework as all other forces, namely in terms of the general principles of motion’ (Lenoir 1982, p. 164).
The teleomechanical tradition was developed during the first half of the nineteenth century by scientists such as Reil, Kielmeyer, Hildebrandt, Weber, von Baer, Berzelius, Liebig, Müller and Leuckart. However, around the middle of the century it was challenged by Helmholtz, Emil DuBois-Reymond, and Matthias Schleiden, who were all students of Müller. The scientific basis for their rejection of the vital force was Helmholtz’s discovery of the conservation of energy. Helmholtz was led to this discovery as a result of his experimental physiological research on metabolism. In order to show that the action of muscles could be explained entirely in terms of metabolic processes that took place in the muscles themselves, he carefully investigated the energy exchange in a frog leg that was made to perform work. He concluded that all energy could indeed be accounted for as originating from the chemical reactions taking place in the muscle. There was no need to assume that the nerves contributed a vital force. This experiment included many types of energy conversions. Combining its result with the mechanical principle of vis viva he was led to the principle of conservation of energy (Lenoir 1982, p. 197).

Helmholtz only accepted material forces similar to physical forces, and his experiments had shown that all energy transformation in an organism can be accounted for by the known physical and chemical forces. Thus, there was no need for a vital force. This opinion was immediately taken over by DuBois-Reymond and Schleiden, who attacked the vitalists in a particularly vicious way. A long controversy led to a victory for the mechanistic reductionist camp. They were helped by the publication of Darwin’s ideas, which could explain how even the organization of the organs could be accounted for from a purely mechanistic point of view. However, the victory was only temporary.
The problem of the organization of living organisms resurfaced during the second part of the nineteenth century, and this time the anti-mechanicians found support in thermodynamics. According to the newly formulated second law of thermodynamics entropy always increases, but this law seemed to be contradicted by the increasing order in a growing organism. This problem made some scientists such as William Thomson denounce materialistic doctrines in connection with living systems, while others tried to save the phenomena by appealing to some variant of Maxwell’s demon that could make entropy decrease without using any perceptible amount of energy (Keller 1995, p. 59). This way out may be considered as a reintroduction of the vital force. During the twentieth century Erwin Schrödinger pointed out that a living organism is never an isolated system, and suggested that it decreased its entropy by increasing the entropy of its surroundings. ‘What an organism feeds upon is negative entropy’ (Schrödinger, quoted from (Keller 1995, p. 67)).
25.5 Hertz on living systems

Hertz did not make any explicit reference to the contemporary discussions about the possibility of reducing biological systems to mechanics. However, it is very likely that he would have known of at least the first part of the discussion, and in particular of Helmholtz’s opinions. Still, he did not follow his mentor entirely, but placed
himself somewhere between the vitalists and the materialists. He maintained that in the usual presentation of mechanics it is possible to assume that the fundamental laws include animate as well as inanimate nature. That is because ‘we give the freest play to the forms of the forces which there enter into the fundamental laws, and reserve to ourselves an opportunity of explaining, later and outside of mechanics, whether the forms of animate and inanimate nature are different, and what properties may distinguish the one from the other’ (Hertz 1894, §322). Hertz’s view is in accordance with both Helmholtz’s radical materialism and teleomechanism. Indeed, according to the latter there may be forces (vital forces) different from those usually encountered in inanimate nature, but these forces must operate according to the usual principles of motion (see Liebig’s point of view, cited above). Thus, as far as the usual mechanics is concerned, Hertz did not commit himself either to materialism or to teleomechanism.

In our presentation of the subject [Hertz’s image] greater prudence is necessary, since a considerable number of experiences which primarily relate to inanimate nature only are already included in the principle itself [the fundamental law], and the possibility of a later narrowing of the limits is much lessened. (Hertz 1894, §322)
As far as his own mechanics was concerned Hertz first rejected materialism on the basis of a sound instinct: In a system of bodies which conforms to the fundamental law there is neither any new motion nor any cause of new motion, but only the continuance of the previous motion in a given simple manner. One can scarcely help denoting such a material system as an inanimate or lifeless one. If we were to extend the law to the whole of nature, as the most general free system, and to say – ‘The whole of nature pursues with uniform velocity a straightest path,’ – we should offend against a feeling which is sound and natural. It is therefore prudent to limit the probable validity of the law to inanimate systems. This amounts to the statement that the law, applied to a system of the third class (§318), forms an improbable hypothesis. (Hertz 1894, §320)
In the introduction to the book Hertz emphasized that this limitation of the fundamental law is in a sense an advantage, since it underscores that the law is not a necessity of thought but is of an empirical origin: Our fundamental law, although it may suffice for representing the motion of inanimate matter, appears (at any rate that is one’s first and natural impression) too simple and narrow to account for even the lowest process of life. It seems to me that this is not a disadvantage, but rather an advantage of our law. For while it allows us to survey the whole domain of mechanics, it shows us what are the limits of its domain. By giving us only bare facts, without attributing to them any appearance of necessity, it enables us to recognise that everything might be quite different. (Hertz 1894, p. 45/38)
However, this initial rejection of the mechanistic point of view of his mentor was counteracted by his subsequent remarks. He argued that although it is an improbable hypothesis that the fundamental law applies to living systems, it is nevertheless
a ‘permissible one’: We know, however, so little of all the systems included under this head [the third class of systems], that it cannot be regarded as proved that such hypotheses are impossible, and that the phenomena in these systems contradict the fundamental law. Thus, then, with regard to the third class of systems of bodies the fundamental law has the character of a permissible hypothesis. (Hertz 1894, §318)
Moreover, Hertz argued that even if it could be proved that the fundamental law does not apply to animate nature, a weaker version of materialism could still be upheld: If it could be proved that living systems contradicted the hypothesis, then they would separate themselves from mechanics. In that case, but only in that case, our mechanics would require supplementing with reference to those unfree systems which, although themselves lifeless, are nevertheless parts of such free systems as contain living beings. As far as we know, such a supplement could be formed, namely from the experience that animate systems never produce any different results on inanimate ones than those which can also be produced by an inanimate system. Thus it is possible to substitute for any animate system an inanimate one; this may replace the former in any particular problem under consideration, and its specification is requisite in order that we may reduce the given problem to a purely mechanical one. (Hertz 1894, §321)
Let us have a closer look at Hertz’s attempt to show that a materialistic mechanistic view of animate nature is ‘permissible.’ The first claim, that there is no proof that animate nature cannot be explained mechanically, is a restatement of Helmholtz’s point of view. The second safeguard, however, is less clear. Hertz seems to say that even if it turned out that the behavior of a living organism in itself does not conform to the fundamental law, all its interactions with inanimate systems can nevertheless be accounted for by that law. It is possible to replace it with a robot. Does this distinction between the organism in itself and its interaction with its surroundings reflect Hertz’s awareness of the teleomechanists’ distinction between the organization of the different organs (that cannot be explained on the basis of mechanical principles alone) and the functioning of the organs (that is described completely by the principles of mechanics), or does it reflect Hertz’s awareness of the fact that the second law of thermodynamics does not hold for the living organism itself but may hold for the living organism together with its environment? Considering his relationship with Helmholtz, the first possibility is likely.

But, even if the remark may be explainable in terms of the contemporary debate about the applicability of mechanics to biological systems, it is not so easy to make sense of it within Hertz’s own philosophical framework. Indeed, how is it possible to falsify the fundamental law as applied to a living organism without letting it interact with an inanimate system? A falsification can only be made through measurements of times, spaces, and masses, and according to Hertz’s own coordinating rules this is done by way of a chronometer, a scale and by weighing, i.e. through interactions with inanimate systems.
Thus, if it could be demonstrated that an animate system contradicts the fundamental law, this will necessarily mean that its interaction with an inanimate system cannot be described by that law.
This may not at first contradict Hertz’s claim, at least in its final formulation. Here, he claimed that even if an animate system does not conform to the fundamental law one can replace it with a mechanical system (a robot) that has the same effect on an inanimate system. In his own terminology this seems to mean that we can construct a mechanical model (see Section 8.5) of the animate system. But, according to his own theory of images, we cannot do better than constructing an image or model of the system under consideration. Thus if we can find a mechanical system that has the same effects as an animate system, this mechanical system is the image of the external biological system, and we cannot say more about it. There is no difference here between the way we picture an animate system and an inanimate one. One could perhaps assume that for each particular interaction between an animate system and an inanimate one, and for each limited time interval, one can replace the animate system by a robot, but one cannot find a universal mechanical replacement. However, this is not what Hertz said and if the replacement works for arbitrarily large inanimate systems, and for arbitrarily long time intervals, it will again be impossible to distinguish the animate system from an inanimate one. In light of these reflections it seems difficult to maintain Hertz’s sharp distinction between inanimate systems of class two and animate systems of class three. In both cases the fundamental law does not describe the motions of the tangible system but in both cases Hertz claimed (without any constructive proof) that one can imagine a hidden system connected to the tangible one, such that the combined system moves according to the fundamental law. If Hertz had actually constructed hidden systems and connections that would explain the motion of inanimate systems involving gravity and electromagnetic interactions he could have made a clear distinction between systems of class two and three. 
The systems of class two would then be those for which one had, in fact, constructed an image of a hidden system such that the combined system moves according to the fundamental law and the systems of class three would be the systems for which no such system had been constructed yet. However, as presented in Hertz’s book the distinction is only one of degrees. In class two some partly successful attempts indicate that one will probably be able to construct the desired hidden system in the future, whereas for the systems of class three it seems unlikely that one will ever be able to construct entirely satisfactory hidden systems. Many philosophers from the time of Descartes had made a clear distinction between body and mind or consciousness. However, there are indications that Hertz intended his reflections about animate systems to include the mind itself. Indeed, as we have already noticed in Section 8.5, Hertz remarked that the relation between a model and the system it is a model of, is similar to the relation between external nature and the images we make of it in our mind. The agreement between mind and nature may therefore be likened to the agreement between two systems which are models of one another, and we can even account for this agreement by assuming that the mind is capable of making actual dynamical models of things, and of working with them. (Hertz 1894, §428)
If we recall that a dynamical model is a mechanical system, Hertz here implies that we can ‘assume’ that the mind can be considered as a mechanical system. Thus the mind seems to be included among the animate systems. It is permissible to make the hypothesis that it is described by the fundamental law, but it is an improbable hypothesis.
25.6 ‘Permissible,’ ‘probable’?

About the applicability of the fundamental law to mechanical systems Hertz concluded that it was a hypothesis that was ‘permissible’ and ‘probable’ in the case of inanimate systems and ‘permissible’ but ‘improbable’ in the case of animate systems. Now let us consider the meaning of these words. First, let me point out that ‘permissible’ does not mean logically permissible in the sense of Hertz’s image theory. When Hertz argued that the hypothesis is permissible he explicitly made appeal to experience. But experience has nothing to say about the logical permissibility of an image. Correctness is the property of an image that can be and must be tested against the external world. Still, ‘permissible’ cannot mean ‘correct’ because Hertz did not demonstrate the correctness of the fundamental law. That would have required him to explicitly construct the hidden systems. Rather ‘permissible’ seems to mean ‘not demonstrably incorrect’ or rather ‘not yet proved to be incorrect.’

Secondly, what did Hertz mean by saying that the hypothesis is probable for inanimate systems but improbable for animate systems? He used the word ‘probable’ in connection with his earlier theory of images in the Kiel Lectures (Chapter 8) but in his Mechanics it has no technical philosophical meaning. In fact, Hertz used it in two different ways in the sections discussed in this chapter. As far as the inanimate systems of class two were concerned Hertz argued that the hypothesis of the applicability of the fundamental law was probable, because it seemed probable that suitable hidden systems would be constructed in the near future. Thus, the probability concerned the constructability of the hidden system. However, in the case of animate systems it was a ‘sound and natural feeling’ that made Hertz proclaim that the hypothesis of the applicability of the fundamental law was improbable.
Did Hertz mean to say that it is unlikely that animate systems are really only mechanical systems? A first reading certainly suggests a meaning along these lines. However, such a statement about what really goes on in the external world is entirely foreign to Hertz’s theory of images in its mature form. If, as Hertz claimed, we can make a dynamical model (i.e. an image) of any external animate system, and if this model moves in conformity with the fundamental law, the image contains all elements of the system that we can talk about in a scientifically meaningful way. Is it possible that when Hertz was face to face with animate nature he forgot the stark nature of his own philosophy of images, and applied ‘sound and natural feelings’ about ontology that are foreign to it? Or did he also in his discussion of animate nature mean to say that it is improbable that one will find satisfactory hidden systems? Whatever meaning he attached to ‘(im)probable’ it is clear that it is a rather subjective statement about mechanical
systems and their relation to the fundamental law that does not correspond to any of the concepts concerning images that Hertz introduced in the introduction.
25.7 Applicability and correctness

It is interesting to compare Hertz’s discussion of the applicability of his fundamental law with his discussion earlier in the book of the correctness of his image. In the sections discussed above, Hertz clearly stated that the fundamental law was a result of experience, but he also seems to be willing to accept that there may be systems to which the fundamental law does not apply. This may suggest a conventionalist understanding of the laws of mechanics, similar to that of Poincaré who claimed that though the laws of mechanics resulted from experience they could not be refuted by experience (see (Nordmann 1998)). According to this point of view, future knowledge of nature may lead to auxiliary hypotheses but it will never lead to rejection of the fundamental laws. However, such a reading of these sections in Hertz’s Mechanics clashes squarely with his earlier statements about the requirement of correctness. In the earlier discussion Hertz emphasized that the fundamental law is the only empirical law that entered into his image. He also stated in the introduction that what originated from experience could also be falsified by experience (see Section 10.1) and, moreover, that ‘the question of the correctness of our statements is . . . coincident with the question of the correctness or general validity of that single statement [the fundamental law]’ (Hertz 1894, p. 11/9). Finally, he stated that a mature science should only consider correct images. These statements clearly imply that if one can find just one mechanical system to which the fundamental law does not apply Hertz’s image of mechanics is incorrect, and must be entirely rejected or at least modified. Yet in the sections discussed in this chapter Hertz signalled a willingness to allow some ‘limitations’ of his image. This raises the question: How many exceptions can one accept before one has to declare an image of mechanics to be incorrect?
Hertz did not give an explicit answer to this question, but his division of mechanical systems into three classes may indicate the following answer: Even if animate systems turn out to contradict the fundamental law, we should still accept Hertz’s image as a valid image of nature. However, if an inanimate system can be found, which cannot be described by the fundamental law, then the image should be rejected as incorrect. This interpretation is in accordance with Hertz’s remark, quoted above to the effect that if one can find a force in nature that cannot be described in Hertzian terms as a result of a hidden system ‘then the insufficiency of the hypothesis on which our mechanics reposes, and in consequence the insufficiency of our mechanics itself, would be proved.’ I must, however, admit that Hertz did not use the word ‘incorrect’ here but rather the word ‘insufficient,’ which may not entirely exclude a more conventionalist reading.
26 Force-producing models
The correctness of Hertz’s image of mechanics is essentially reduced to one question: Is it possible to construct hidden systems and connections to the tangible systems, such that the total system will obey the fundamental law, or, said differently, such that the effect on the tangible system will mimic the forces empirically found in nature? If such systems can be constructed Hertz’s image is correct; if it can be shown that such systems cannot be constructed the image will be incorrect. It is perhaps not surprising that critics of Hertz’s mechanics did not provide a proof that a construction is impossible. Indeed it is not so obvious how such an impossibility proof should be made, and, moreover, such a proof is only necessary if one accepts Hertz’s philosophical analysis. A critic of Hertz’s mechanics was more likely to reject at least parts of his philosophical analysis as well. It is more surprising that very few defenders of Hertz’s approach to mechanics tried to support his image of mechanics by constructing concealed motions that would account for concrete forces. The only serious attempts were made by Brill (Brill 1909, pp. 28–30) (see Chapter 27) and independently and more directly in 1916 by a relatively unknown assistant at the Technische Hochschule in Vienna, Franz Xaver Paulus (1895–1949). Paulus began his paper (Paulus 1916) with an admirably clear exposition of the main ideas in Hertz’s account of forces.¹ Then he treated conservative systems whose hidden masses are monocyclic. Exactly like R. Liouville (see Section 18.6), to whom he did not refer, he remarked that if a monocyclic hidden system adds an amount $\dot q_n^2/U$ to the kinetic energy of the system, the additional apparent potential energy would be $C^2 U$. This is exactly the step from eqn (18.45) to eqn (18.48) if we assume that the constant value of $\dot q_n^2/U$ is $C$ instead of one.
¹ He did not use Hertz’s geometric language, though he had himself independently suggested a similar language (Paulus 1910), nor did he treat non-holonomic constraints.

He then (following Boltzmann (Boltzmann 1891)) investigated a series of variations of the centrifugal regulator of which the following was the most general instance: Consider (Fig. 26.1) a system consisting of one visible mass $m$ that can move along a vertical $z$-axis and a hidden mass $\tilde m$ that can rotate around the $z$-axis at variable height $\tilde z$ and at variable distance $\tilde x$ from the $z$-axis. Assume that the two masses are coupled with a weightless string of fixed length that can roll over a weightless wheel $w$ that is forced through a weightless mechanism to remain at the same level as $\tilde m$.

Fig. 26.1. Paulus’s force producing mechanism (Paulus 1916)

If $z$ denotes the height of $m$ the coupling can be expressed by the equation

$z = \tilde x - \tilde x_0 + \tilde z - \tilde z_0$, (26.1)

where $\tilde x_0$ and $\tilde z_0$ are the values of $\tilde m$’s $\tilde x$- and $\tilde z$-coordinates when $z = 0$. Assume, moreover, that $\tilde m$ is forced to move along a curve $\tilde z = f(\tilde x)$ that rotates around the $z$-axis together with $\tilde m$. Then eqn (26.1) will take the form

$z = \tilde x - \tilde x_0 + f(\tilde x) - \tilde z_0$. (26.2)

Paulus used $\tilde x$ and the angle $\varphi$ around the $z$-axis as generalized coordinates of $\tilde m$ and expressed the kinetic energy of the hidden system as

$\tilde E = \tfrac{1}{2}\tilde m\,\dot{\tilde x}^2 + \tfrac{1}{2}\tilde m\,\tilde x^2\dot\varphi^2$, (26.3)

(he has forgotten the term $\tfrac{1}{2}\tilde m\,\dot{\tilde z}^2 = \tfrac{1}{2}\tilde m\,(f'(\tilde x)\,\dot{\tilde x})^2$ but that does not matter for the subsequent argument). If the system is cyclic in Hertz’s sense this means that $\tilde m$ is so small and $\dot\varphi^2$ so large that the first term in this expression vanishes in comparison with the last term so that

$\tilde E = \tfrac{1}{2}\tilde m\,\tilde x^2\dot\varphi^2$. (26.4)

According to the remark above this corresponds to $m$ being subject to a potential energy

$U(z) = C^2 \cdot \dfrac{2}{\tilde m\,\tilde x^2(z)}$. (26.5)
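The step from eqn (26.4) to eqn (26.5) can be spelled out as a short worked derivation. Here $\tilde m$ and $\tilde x$ denote the hidden mass and its distance from the $z$-axis, and $p_\varphi$ is the momentum conjugate to the cyclic angle $\varphi$. The identification $C = \tfrac{1}{2}p_\varphi$ is an assumption made for this sketch; it is chosen so that the result matches the form $C^2 U$ of the earlier remark and is not stated in this form in the text.

```latex
\begin{align*}
p_\varphi &= \frac{\partial \tilde E}{\partial \dot\varphi}
           = \tilde m\,\tilde x^{2}\,\dot\varphi = \text{const}
           && \text{(the coordinate $\varphi$ is cyclic)} \\
\tilde E  &= \tfrac{1}{2}\,\tilde m\,\tilde x^{2}\,\dot\varphi^{2}
           = \frac{p_\varphi^{2}}{2\,\tilde m\,\tilde x^{2}}
           && \text{(eliminate $\dot\varphi$)} \\
          &= C^{2}\cdot\frac{2}{\tilde m\,\tilde x^{2}},
           \qquad C = \tfrac{1}{2}\,p_\varphi
           && \text{(assumed identification of $C$)}
\end{align*}
```

Read this way, the kinetic energy of the hidden rotation depends only on $\tilde x$, and hence through the coupling only on $z$, which is why it appears to the visible mass as a potential energy.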
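Now assume that we want $m$ to be subject to a given conservative force with a positive (decreasing) potential energy $U(z)$. This can be obtained by a suitable choice of $f$. Indeed, from eqn (26.5), in which $\tilde m$ and $\tilde x$ belong to the hidden mass, we see that

$z = U^{-1}\!\left(\dfrac{2C^2}{\tilde m\,\tilde x^2}\right)$ (26.6)

and if we substitute this into eqn (26.2) we see that $f(\tilde x)$ must be determined by

$f(\tilde x) = \tilde z_0 - \tilde x + \tilde x_0 + U^{-1}\!\left(\dfrac{2C^2}{\tilde m\,\tilde x^2}\right)$. (26.7)

For example, if $z$ is subject to gravitation $U(z) = (-\gamma/z) + k$ we have

$f(\tilde x) = \tilde z_0 - \tilde x + \tilde x_0 - \dfrac{\gamma}{(2C^2/\tilde m\,\tilde x^2) - k}$. (26.8)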
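The construction in eqns (26.5) to (26.8) can be checked numerically. The following is a minimal sketch, not Paulus’s own calculation: it evaluates the guide curve of eqn (26.8), recovers the height of the visible mass from the coupling (26.2), and compares the gravitational potential at that height with the hidden kinetic energy of eqn (26.5). All function names and numerical values (`m_h` for the hidden mass, `x_h` for its distance from the axis, `x0`, `z0`, `C`, `gamma`, `k`) are illustrative assumptions.

```python
import math

def hidden_potential(x_h, m_h, C):
    """Apparent potential energy of eqn (26.5): 2 C^2 / (m~ x~^2)."""
    return 2.0 * C**2 / (m_h * x_h**2)

def gravity_potential(z, gamma, k):
    """Shifted gravitational potential U(z) = -gamma/z + k."""
    return -gamma / z + k

def guide_curve(x_h, x0, z0, m_h, C, gamma, k):
    """Guide curve f(x~) of eqn (26.8) for the gravitational case."""
    return z0 - x_h + x0 - gamma / (hidden_potential(x_h, m_h, C) - k)

def visible_height(x_h, x0, z0, m_h, C, gamma, k):
    """Height z of the visible mass from the coupling, eqn (26.2)."""
    return x_h - x0 + guide_curve(x_h, x0, z0, m_h, C, gamma, k) - z0

# Illustrative values only (assumptions, not taken from Paulus 1916).
params = dict(x0=1.0, z0=0.0, m_h=1e-3, C=0.05, gamma=1.0, k=10.0)
samples = [1.0, 2.0, 5.0]  # distances of the hidden mass from the axis
heights = [visible_height(x, **params) for x in samples]

for x, z in zip(samples, heights):
    u_visible = gravity_potential(z, params["gamma"], params["k"])
    u_hidden = hidden_potential(x, params["m_h"], params["C"])
    print(f"x~ = {x:4.1f}  z = {z:.5f}  U(z) = {u_visible:.5f}  hidden = {u_hidden:.5f}")
```

For the sampled positions the two energy columns agree to rounding error, which is the content of eqn (26.6) for this choice of $U$.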
With this argument, Paulus has shown that a suitable concealed mechanism consisting of one particle could account for ‘almost any’ attractive vertical force on one particle with the potential U (z). I write ‘almost any’ because, as Paulus pointed out, the righthand side of eqn (26.5) is strictly positive so the above mechanism can only produce the potential energy U (z) if U is everywhere positive. This is an immediate result of Hertz’s assumption that potential energy is, in fact, kinetic energy of a hidden system, kinetic energy being positive by definition. To be sure, one can always add or subtract an arbitrary constant from the function U so that one can reproduce negative potentials as well, but only if they remain bounded from below. For example, in the description of gravitation k in eqn (26.8) may be chosen arbitrarily large, but once it has been chosen z cannot obtain values less than γ /k. Paulus only worked thoroughly through this one example of one point moving along a straight line, and merely indicated in a few words how he imagined one could generalize the mechanisms to more parameters and more cyclic coordinates (Paulus 1916, pp. 858–859). More than 50 years later the leading MIT economist Paul Anthony Samuelson, independently of Paulus, investigated how Hertz’s mechanics could account for the motion of a point mass in a constant force field. He came up with a similar answer, although phrased in abstract analytical terms, without any mechanism to do the trick. Samuelson also pointed to the problem that U cannot become negative, and therefore z has a lower bound. He took that to be a strong argument against Hertz’s approach: We can always select our arbitrary origin for altitude z in such a way as to make the new gz of whatever sign we like at any range of altitudes. [This corresponds to an addition of an arbitrary constant to the potential energy]. 
But every time we consider a path that falls out of that range, we would have to change in advance our origin for z, a procedure both practically messy and aesthetically repugnant. Or, as a Hertzian would put it, this construction would seem to lack ‘appropriateness.’ . . . Moreover, the phenomenon discussed here is general: V (q)[U (z)] will for many natural problems want to run through a gamut of values from −∞ to +∞, leading to the same messy
requirement that we add new arbitrary constants to V in each different range. Why take a local train involving many transfers when an express train rides right through? (Samuelson 1971)
Samuelson’s argument assumes as a given thing that in nature there are potentials that are not bounded from below. However, there are good physical arguments to the contrary. In nature no (approximately) constant force field stretches infinitely far, and gravitational forces cease to be applicable when the gravitating bodies touch each other. The unboundedness only occurs because one makes the unphysical assumption that the attracting bodies are mathematical points. Thus, one may argue that Hertz’s assumption about bounded potential energy is in conformity with nature. The force-producing models suggested by Paulus, may lend logical support to Hertz’s image, but it is also evident that very few physicists would embrace an image of nature based on such artificial mechanisms. It is also rather obvious that it was not such models Hertz had hoped for. Paulus’s mechanisms appear ad hoc, and each interaction will require its own set of mechanisms. What Hertz had in mind was a rather simple model of the ether, which could ‘explain’ all known interactions in one stroke. One might perhaps even have hoped that such a model would predict new phenomena, which Paulus’s mechanisms could not. After Einstein’s introduction of relativity theory interest in the ether gradually faded away, and with it interest in constructing better hidden force producing models to support Hertz’s mechanics.
27 Reception, extension and impact
27.1 Reception

Hertz’s Mechanics aroused widespread interest in the physics and, to a certain degree also, the mathematics community. When Hertz died the news about his posthumous work quickly spread (see, e.g. Jones’s obituary (Jones 1894) in Nature), and when the book appeared it received much attention. It was perceived as a culmination of the search for a mechanics without distance forces, but its popularity also owed something to the tragic fate of its author: ‘In the old classical times it would have been said that he had fallen a victim to the envy of the gods’ Helmholtz wrote in his introduction to the book (Helmholtz 1894). George Francis FitzGerald appealed to the same sentiments when he opened his review of Hertz’s Mechanics with the words:

This posthumous volume of Hertz’s works, edited by Prof. Lenard, with a preface by von Helmholtz, has a doubly melancholy interest. It is the last work of Hertz upon which he was engaged until a few days before his death, and it contains a preface which is almost the last work of von Helmholtz. The pupil died shortly before his master, and by the departure of such a pupil and of such a master, science, and with science mankind, have lost many prospects of advances in the near future. (FitzGerald 1895)
The early commentators agreed, to a large extent, in their evaluation of Hertz’s book. Among the merits they counted its philosophical sophistication, the rigorous and elegant mathematical structure, and the avoidance of forces acting at a distance. As its main weakness they mentioned its complete neglect of the question of how to construct the hidden systems that would account for the observed motions in the physical world. Thus, in his preface to Hertz’s book Helmholtz wrote: Unfortunately he has not given examples illustrating the manner in which he supposed such hypothetical mechanism to act; to explain even the simplest cases of physical forces on these lines will clearly require much scientific insight and imaginative power. (Helmholtz 1894, p. XXVII)
Most authors felt that if such systems could at all be constructed, they would be so complicated and contain so many ‘idle wheels’ that it would be hard to argue that the resulting image was more ‘appropriate’ than the usual image of mechanics. If Hertz had lived he would certainly have been hard pressed for a reaction to this problem.
As it was, reviewers deeply regretted that he was prevented from explaining his views. Boltzmann wrote:

He [Hertz] created a strikingly simple system of mechanics based on very few but to be sure logically quite natural principles. Regrettably, at the same moment his voice fell silent forever, leaving unanswered the thousand open questions that surely I am not the only one to have on my mind. (Boltzmann 1900b, p. 84)
In this way, Hertz’s untimely death mollified the criticism of his Principles of Mechanics. The published receptions of Hertz’s book can be divided into three main categories: reviews, philosophical reflections, and works in which Hertz’s ideas were used or carried further. Of the reviews proper, (Ebert 1895) was brief, positive and precise, (Lampe 1896) was rather superficial and positive and (FitzGerald 1895) was informative and highly competent. In accordance with the concrete mechanistic British way of thinking about physics, he raised the ‘question as to the danger of his [Hertz’s] rigid connections becoming tangled’ (FitzGerald 1895, p. 284). Moreover, he criticized Hertz’s use of ‘space of multiple dimensions’ since ‘this . . . represents the real by the unattainable.’ He also argued that the only problem related to distance forces is to explain how a body can act ‘where it is not’ and suggested that this could be solved by considering ‘each atom as existing everywhere’ (FitzGerald 1895, pp. 285–286). Yet, despite these critical points, he concluded: ‘It [Hertz’s book] is most philosophical and condensed, and gives one of the most – if not the most – philosophical representations of dynamics that has been published. It is worthy of its author: What more can be said?’ (FitzGerald 1895, p. 285). Many reactions to Hertz’s mechanics were shaped by its mechanistic implications for other branches of physics, rather than by its actual content. Thus, in his preface to Hertz’s Mechanics Helmholtz could hardly hide his fundamental uneasiness with Hertz’s endeavor. Having eulogized his favorite student and having explained the main thrust of his new and logically perfect presentation of mechanics Helmholtz added: English physicists – e.g. 
Lord Kelvin, in his theory of vortex-atoms, and Maxwell, in his hypothesis of systems of cells with rotating contents, on which he bases his attempt at mechanical explanation of electromagnetic processes – have evidently derived a fuller satisfaction from such explanations than from the simple representation of physical facts and laws in the most general form, as given in systems of differential equations. For my own part, I must admit that I have adhered to the latter mode of representation and have felt safer in so doing; yet I have no essential objections to raise against a method which has been adopted by three physicists of such eminence. (Helmholtz 1894, pp. XXVII–XXVIII)
This rather positivist insistence on physical law as opposed to mechanical reduction is the view of the old Helmholtz. Earlier in his life he had himself argued that all phenomena should ultimately (see Chapter 3) be reduced to the laws of mechanics, but contrary to Hertz he believed that the concept of force had to be a basic ingredient of mechanics. This explains Helmholtz’s lukewarm reception of Hertz’s Mechanics.
Henri Poincaré shared this view. In his Les idées de Hertz sur la mécanique (Poincaré 1897) he praised the novel perspective Hertz had given mechanics but declared some dissatisfaction with Hertz’s image ‘because it relies too much on hypothesis.’ Rather than inventing mechanical models ‘as the English love to do’ he would rather admit our ignorance about the nature of forces. As he put it in his Science and Hypothesis: let us not forget the end we seek, which is not the mechanism: the true and only aim is unity. We ought therefore to set some limits to our ambition. Let us not seek to formulate a mechanical explanation; let us be content to show that we can always find one if we wish. (Poincaré 1902, p. 177)
It may be in order to point out that the difference between Hertz and his critics Helmholtz and Poincaré was one of degree rather than one of total disagreement. Indeed Hertz seems to have agreed with Poincaré that at least for the time being it would be premature to construct a particular model of the ether. His remarks about the experimental nature of this problem (see (Hertz 1894, p. 32/27), discussed in Sections 6.3 and 25.3, and pp. 48–50/40–41) may well be read as recognition of the fact that without further knowledge of the microstructure of the world it would involve too many hypotheses to attempt to construct a concrete model of the ether. Moreover, Hertz’s discussion of Maxwell’s equations reveals that he was perfectly willing to discuss on the level of differential equations just as Helmholtz preferred to do. His mechanics even laid the groundwork for discussing on the level of forces. However, Hertz would insist that in the end one should strive for a mechanistic explanation, and for that aim a mechanics that does not assume forces from the outset would be necessary. One might imagine that Hertz would answer Poincaré: ‘You are right. For the moment the goal is to show unity of science by showing that mechanical explanations are possible. But for such a demonstration, my image of mechanics provides a necessary foundation.’ Poincaré was also convinced that all mechanical systems could be described by rigid constraints. Actually he claimed (Poincaré 1897) that Kempe’s theorem provided a proof. This theorem states that any finite part of an algebraic curve can be described by a linkage. How this theorem should provide a proof that one can construct a hidden system that describes any interaction in nature is not clear to me. Poincaré’s paper of 1897 was not a real review but rather a philosophical and critical reflection on the principles of mechanics caused by Hertz’s book.
In particular, he rejected Hertz’s claim that ‘that which is derived from experience can again be annulled by experience’ (Hertz 1894, p. 11/9). According to Poincaré the principles of mechanics are conventions. They are not arbitrary conventions, because they derive from experience. But since they are conventions they cannot be falsified. If we experience systems that do not seem to adhere to these principles we will have to think of some auxiliary hypotheses that will save the phenomena. For example, if the motion of a planet seems to defy Newton’s laws one should not reject these laws, but look for possible perturbations. That is how Neptune was discovered from the apparent irregularities of Uranus. According to this conventionalist point of view,
that Poincaré explained in more detail in his Science and Hypothesis (Poincaré 1902, in particular pp. 104–105), it will always be simpler to imagine such hypotheses than to reject the simple laws of mechanics.¹ In his ‘review’ of Hertz’s mechanics, Poincaré wrote as a natural philosopher. It is remarkable that the mathematician Poincaré did not find it worth mentioning the novel mathematical form in which Hertz cast his mechanics. One might have imagined that this form would appeal to a mathematician who had himself derived fundamental consequences from a geometric consideration of mechanics (though Poincaré geometrized phase space rather than configuration space as Hertz did). Did he not study the main part of Hertz’s book in detail, or did he think that Hertz’s differential geometric formalism was rather obvious? Helm’s review of Hertz’s Mechanics in the Vierteljahrsschrift für wissenschaftliche Philosophie (Helm 1895) began with praise for the novelty and the ‘admirable rigor and clarity’ of the book. The mechanistic world view according to which nothing exists except masses in motion had never been presented in a clearer light. And precisely for that reason a study of Hertz’s book will reveal the deep gulf that separates this world view from the energetic one. According to Helm, Hertz’s book reveals that the mechanistic world view will necessitate the fabrication (Erdichtung) of artificial and complicated hidden masses, so that it can only be, and only pretends to be an image. Helm contrasted this poetry with the energetic world view that offered a quantitative description of the connections of nature. Thus, Helm’s review ended on a highly critical note vis-à-vis Hertz’s image theory and his mechanistic world view. That is not surprising considering that Helm wrote his review in the same year that he and Ostwald began their energetic war with the mechanistically oriented physicists.
¹ See (Gray 2004) and (Lützen 2004) for more information about Poincaré’s conventionalism and its relation to Hertz’s points of view.

Other philosophical and critical reactions to Hertz’s work include Classen’s (Classen 1897) comparison between Boltzmann’s description and Hertz’s explanation of mechanics, and the influential works by Ernst Mach (Mach 1883, 4. edn 1901) and Pierre Duhem (Duhem 1903). Although the latter two agreed on most details in Hertz’s mechanics, their overall evaluations were rather different. Duhem regarded Hertz’s program as the last step in a series of British mechanistic explanations of physics of which he was extremely critical. He stressed that ‘Hertz’s mechanics is less of a doctrine than a project or a program of a doctrine’ (Duhem 1903, p. 167). As opposed to Duhem’s negative evaluation Mach expressed an overall positive view of Hertz’s ideas. In the fourth edition of his Die Mechanik in Ihrer Entwickelung (the second edition to appear after Hertz’s book) he added a chapter where he explained that he regarded Hertz’s mechanics as an essential step forward towards the goal he himself had sketched in the first edition (which Hertz on his part had admitted to have been of great value to him (Hertz 1894, end of Preface)). Mach also argued that his own requirement of ‘economy’ was equivalent to Hertz’s requirement of ‘appropriateness.’ Of course, a positivist like Mach could not be entirely satisfied with Hertz’s hidden masses and he felt that forces were to be preferred to rigid connections; yet he concluded: ‘As an ideal program Hertz’s mechanics is more beautiful
and more unified, but for applications our usual mechanics recommends itself.’ (Mach 1883, 4. edn 1901, Chapter 2, Section 8). Mach’s discussion was important for the reception of Hertz’s work among philosophers and physicists with a bent to foundational questions. Among mathematicians Voss’s discussion of Hertz’s principles in the Encyclopädie der Mathematischen Wissenschaften (Voss 1901, in particular §28) was a more obvious reference. Hertz’s critical-historical discussion of the usual Newtonian image of mechanics was received with reservation by several authors, in particular by Volkmann. In his opinion, a physicist like Hertz was not always able to make just historical analyses. In particular, he argued that Hertz had not presented Newton’s ideas in a correct light: The recent education of physicists devotes so little time to the historical study of the classics that a scientific authority like Hertz may mislead the young generation to speak of Newtonian mechanics as a standpoint of the past. (Volkmann 1901, p. 281)
Volkmann maintained that Newton was a better leader (Führer) of the next generation than Hertz. Among the advocates of the electromagnetic world view Hertz’s Mechanics became the locus classicus for the competing mechanistic world view. Thus when Wien formulated the basic ideas of the electromagnetic world view in his programmatic paper ‘On the possibility of an electromagnetic foundation of mechanics’ he remarked: ‘The general plan of the Hertz’ian mechanics seems to me to be conceived to incorporate not only the mechanical but also the electromagnetic phenomena’ (Wien 1900, p. 96). He continued to point out that no generally accepted mechanistic explanation of the ether had been constructed on the basis of the traditional mechanics, but added that it is unknown if Hertz’s more appropriate image of mechanics will lead to a satisfactory mechanical explanation of electrodynamics. Yet he believed it was more promising to ‘consider the basic electromagnetic equations as the most general ones, from which the mechanical equations must follow.’ Such a reductionist program is, according to Wien, ‘the diametrically opposite of Hertz’s. While Hertz’s mechanics clearly aims at deducing the electromagnetic equations as consequences [of the equations of mechanics] the relation is here the opposite.’ (Wien 1900, p. 107). Not only did Wien consider Hertz’s mechanics as the best exponent of the mechanistic world view, he also admitted that the sketch he offered of an electromagnetic world view could not compete with the logical clarity of Hertz’s presentation: As far as the logical construction is concerned, an electromagnetically based mechanics can naturally not measure up to Hertz’s mechanics, for that reason alone that the system of Maxwell’s differential equations have not received a critical revision. 
However it seems to me that it has a great advantage, namely that, as we have shown, it goes outside the usual mechanics, which from now on can be considered as a first approximation. In this way there is a possibility to experimentally decide for or against it (Wien 1900, p. 107)
According to the electromagnetic world view the inertial mass of a body is of electromagnetic origin. It is only constant to a first approximation and depends both on velocity and direction of the impressed acceleration. Wien does not seem to have been aware of the fact that Hertz’s mechanics opens the same possibilities and even
suggests that such phenomena would be felt if the body moves with a velocity close to that of light (see Section 20.4). Indeed, no one seems to have tried to deduce a suitable approximation to the later relativistic transformation of mass from Hertz’s mechanics, just as one could from the electromagnetic point of view.
As a conclusion of this section I shall turn to the particularly interesting problem of the reception of Hertz’s mechanistic ideas by their editor Philipp Lenard. Lenard conscientiously saw Hertz’s posthumous work through the press and even helped with the English translation.² In 1910 he also prepared a slightly corrected second edition of the book and added a new preface in which he explained that Hertz’s aims with the book had already been fulfilled in two directions: His image theory had become common property and the logical rigor of his presentation had been generally recognized. The explanation of the hidden motions of the ether was still a problem for the future, but Lenard maintained that Hertz’s mechanics would be of prime importance in such an endeavor. He went so far as to suggest that Hertz’s mechanics might in the end result in an explanation of the so-called principle of relativity. Lenard mentioned that the ‘conviction, that it is possible to consider the material world as a mechanism has in recent years among some scientists suffered from the . . . futile search for the special nature of this mechanism’ (Lenard 1910, p. X). In his preface he did not declare himself as one of the physicists who doubted the mechanistic world view, but in his Deutsche Physik, volume 1 on mechanics, he explicitly denied that the ether can be understood in mechanistic terms, although he still upheld its existence and rejected what he called the ‘Jewish’ theory of relativity.
In particular, he was very outspoken against the unnecessary use of complicated mathematics in physics, which he considered a special characteristic of the Jewish people, who hid their lack of understanding of the truths³ of nature behind a veil of incomprehensible mathematics (Lenard 1938, Preface). He did not mention Hertz’s mechanics in particular, but it is quite clear that it would fall into the category of unnecessarily mathematical Jewish works. Lenard’s mechanics was a contribution to a special German (national socialist) type of physics that Lenard tried to establish with himself as its Führer. His earlier book Great Men of Science served the same purpose. Still, Hertz, who was presented as being ‘partly of Jewish blood,’ was included among the great men of science, and as one with a great insight into the workings of nature:
It is very noteworthy that in fifteen years between Maxwell’s publication of his work On Electricity and Magnetism and Hertz’s discoveries, a great deal had been written about ‘Maxwell’s theory’ and in particular concerning the electromagnetic theory of light, and these subjects had been lectured upon at universities, yet not even the beginning of a way to the goal had been made clear, for people simply played about with Maxwell’s equations, but not with the ideas of Maxwell and Faraday: it was a mathematical game and not scientific research that was pursued, and the results were sterile. Hertz was the first who not only understood the equations, and knew how to deal with them mathematically when necessary, but also saw the structure of ideas upon which they were based by their originator, and understood how to move about in it. The equations are, so to speak, merely ground plans of this structure, and are far from being actual inhabitable apartments; the latter can only be produced by the architect, who knows how to grasp the ideas which have been put into the ground plan. (Lenard 1934, p. 359 note 1)
² See the translator’s note in the English translation of (Hertz 1894).
³ According to Lenard Jews had no concept of truth.
Problematic as this quote may be from a historical point of view, it shows that Lenard depicted his former teacher as a true hero of scientific investigation. It is not surprising that Lenard devoted only one line of his Hertz biography to Hertz’s ‘remarkable work on the Principles of Mechanics.’
27.2 Extensions and applications
Six years after the publication of Hertz’s last work, Boltzmann wrote that he had ‘very often heard Hertz’s Mechanics being praised, but until now I have seen no one proceed along the way laid out by Hertz’ (Boltzmann 1900b, p. 83). Later, a few physicists and mathematicians did try to develop Hertz’s ideas in a limited way. In addition to Paulus, whose paper (Paulus 1916) was discussed in the previous chapter, and to a certain degree Boltzmann himself, one can mention the physicists Hendrik Antoon Lorentz and Paul Ehrenfest as well as the mathematician Alexander Brill. Three of these people, namely Boltzmann, Lorentz and Brill, also spread the gospel of Hertz’s mechanics by lecturing on the subject.⁴ As a late advocate of the mechanistic world view Boltzmann (see (Hiebert 1980)) became the most influential propagator of Hertz’s principles of mechanics. His intensive occupation with Hertz’s ideas has become legendary through a story, told in 1903, whose pun does not survive translation:
Als ich mich wochenlang ausschliesslich mit Hertz’ Mechanik befasst hatte, wollte ich einmal mit den Worten ‘Liebes Herz’ einen Brief an meine Frau beginnen und ehe ich mich versah, hatte ich Herz mit tz geschrieben. (Boltzmann 1903, p. 11)
[When I had occupied myself exclusively with Hertz’s Mechanics for weeks, I once wanted to begin a letter to my wife with the words ‘Liebes Herz’ (dear heart), and before I knew it I had written Herz with tz.]
Boltzmann was probably the first who tried to use Hertzian mechanics. In the preface (written 1895) to his famous Vorlesungen über Gastheorie he recounted: I have also studied a gas theory in which, instead of forces acting during collisions, one merely has conditional equations in the sense of the posthumous mechanics of Hertz, which are more general than those of elastic collisions; I have abandoned this theory, however, since I only had to make more new arbitrary assumptions. (Boltzmann 1896, p. 26)
The problem, as he later explained (Boltzmann 1903, p. 34), was that in order to free mechanics from forces acting at a distance, Hertz had to introduce ‘connections acting at a distance’ and in order to account for the observed phenomena Boltzmann had to make ‘arbitrary assumptions’ or hypotheses about these connections. He therefore based his gas theory and his later textbook on mechanics on the ‘old distinction between potential and kinetic energy’ (Boltzmann 1896, p. 26).
⁴ Regarding the first two see (Klein 1970, p. 66); regarding Brill see the preface of (Brill 1909).
Despite this negative experience regarding the application of Hertz’s ideas, Boltzmann missed no opportunity to mention them in his more popular and philosophic lectures (Boltzmann 1900b) and (Boltzmann 1903). In these papers he stated the usual criticism concerning the lack of concrete realization of the known actions in nature, and called for a ‘hypothesis-free extension’ of Hertz’s ideas (Boltzmann 1903, pp. 34–35). However, he considered it a real possibility that one day it would be possible to explain all natural phenomena in a non-artificial way using Hertzian hidden systems (Boltzmann 1900b, pp. 85 and 95), and therefore concluded: The Hertz’ian mechanics seems to me to be more like a program for a far future. (Boltzmann 1900b, p. 85)
In a paper in the Jahresbericht of the Deutsche Mathematiker-Vereinigung the previous year, Boltzmann had already asked physicists and mathematicians to begin concretizing this Hertzian program: It is clearly of the greatest importance for the understanding of Hertz’s mechanics to apply it to simple special cases. I herewith ask my colleagues to try to apply it to a case that I have not been able to work out. (Boltzmann 1899, p. 76)
The case in question, which probably originated in his kinetic theory of gases, concerned the collision of two elastic balls. The problem is how to avoid the obvious formulation of the constraint by way of an inequality and replace it with constraints expressed as equalities, the only ones allowed by Hertz. As an example of a similar problem Boltzmann showed how to deal with a small elastic ball moving inside a large hollow sphere. The problem of the motion of two elastic balls was solved by Brill in the following volume of the Jahresbericht (Brill 1900b). His solution was not of the non-artificial type that Boltzmann had hoped for. Indeed, it required the balls to be attached to weightless linked rods that even have to be infinitely long if the balls are allowed to move in an unbounded region. More importantly, Brill in the same paper announced that he had succeeded in generalizing Hertz’s mechanics to an incompressible fluid, a generalization that he felt might ultimately prove the general applicability of Hertz’s ideas to all of physics. He had published this generalization earlier the same year in a more obscure journal (Brill 1900a). As Brill noted in a postscript to his paper (Brill 1900b), this generalization clashed with a remark made by Boltzmann in his talk given the same year to the Deutsche Mathematiker-Vereinigung. Boltzmann had addressed the question of how to construct the ‘medium’ of hidden masses that would propagate electromagnetic and gravitational action: One cannot assign to them [the hidden masses] the structure of the media that have been used earlier including Maxwell’s luminiferous ether because in all these media one assumes the action of the kind of forces that Hertz just excludes. (Boltzmann 1900b, p. 84)
Since incompressible fluids were among the media that had earlier been suggested as carriers of actions, Brill objected to Boltzmann’s assertions and pointed out that the type of equations that specified the motion of such a fluid was of the kind allowed by Hertz (Brill 1900b, pp. 203–204). In this respect, Brill was in complete accordance
with the opinion that Hertz himself had explicitly expressed. Boltzmann accepted Brill’s arguments in a paper the main aim of which was to point out a grave mistake in a paper Die Druckkräfte in der Hydrodynamik und die Hertz’sche Mechanik by Richard A. Reiff (Reiff 1900). Reiff’s main aim was to show ‘that we encounter difficulties when we try to carry out Hertz’s idea for a continuous changing system.’ As an example of a continuous system Reiff discussed an incompressible inviscid fluid, and deduced the pressure as resulting from the conservation of mass alone. Boltzmann, however, pointed out (Boltzmann 1900a) that the deduction was fundamentally flawed, and as an alternative he referred to Brill’s correct generalization of Hertz’s methods to fluids. This led him to conclude: . . . the equations of motion of an incompressible fluid and of rigid bodies that are immersed in such a fluid or surround it can without difficulty be deduced from Hertz’s principles of mechanics. (Boltzmann 1900a, p. 668)
Boltzmann did not substantiate his claim, but he repeated it in a seminar in 1903 after which one of his students, Ehrenfest, showed in detail how one could indeed deduce the motion of rigid bodies immersed in a fluid using Hertz’s approach. This constituted his doctoral dissertation (Ehrenfest 1904). As pointed out by Klein, the deduction itself was rather straightforward (Klein 1970, pp. 66–74). However, Ehrenfest was surprised to discover that Hertz’s law of the straightest path did not lead to the usual Lagrangian equations, but had additional terms built in. The explanation of these terms was the main problem addressed by Ehrenfest. They are, in fact, analogous to the additional terms found by Boltzmann and others before him (see Section 22.2) when adjusting the Lagrangian formalism for non-holonomic systems of point masses. Five years after Ehrenfest’s thesis, and without any mention of it, Brill published his Hertzian treatment of ‘space filling media’ in book form (Brill 1909). Boltzmann, Brill and Ehrenfest centered their investigations around Hertz’s concept of hidden masses but did not really use his ‘geometry of systems of points.’ Lorentz, on the other hand, questioned whether the Hertzian hidden systems would lead ‘to a clear and satisfactory view of natural phenomena’ (Lorentz 1902, p. 1) but argued: On the contrary, it seems hardly possible to doubt the great advantage as to conciseness and clearness of expression that is gained by the mathematical form Hertz has chosen for his statements. (Lorentz 1902, p. 1)
Therefore, he chose to investigate whether the usual mechanics of systems governed by forces can also be formulated in terms of Hertz’s geometric language. His positive conclusion was published in the paper Some considerations on the principles of dynamics in connexion with Hertz’s ‘Prinzipien der Mechanik’ (Lorentz 1902).
27.3 Impact
Seven years after Hertz’s Mechanics appeared Volkmann reported that it ‘was read with equal enthusiasm by philosophers and physicists.’ It was considered
a ‘new gospel’ and had a ‘great influence . . . in particular on the younger generation’ (Volkmann 1901). Several physicists imagined that this might be the direction along which the basic questions of physics should eventually be answered. Hertz had presented his mechanics as a foundation for a subsequent mechanistic reconstruction of physics as a whole, and many of his readers agreed that it provided a program for future research. However, as a research program it mostly operated on a very abstract level. It only gave rise to a few concrete pieces of research and its lifetime as a promising program was very limited. In fact, it was challenged from the day it was published. Its inherent microscopic foundation was challenged by the energetics program, and more importantly, the mechanistic reductionist program was losing ground, not only to positivist non-reductionist ways of thinking but also to another reductionist endeavor that considered electromagnetism as more fundamental than mechanics. For example, it was argued that inertia may be a consequence of electromagnetic induction. This electromagnetic world view was advanced by Wien in (Wien 1900). Finally, in 1905 Einstein began to direct physics and in particular the foundational questions addressed by Hertz in a completely different direction. Thus, a decade after its publication, the status of Hertz’s book as a program for the future looked less promising, and it appeared more as a brilliant conclusion of a dead end in the history of physics. However, although Hertz’s Principles of Mechanics never became the fundamental basis of a new program in physics, such as Hertz and some of his immediate successors had hoped for, it did influence the later development in several ways. First, on a technical level Hertz’s term holonomic constraint was accepted, and his law of the straightest path became an ingredient in many later textbooks on rational mechanics (see, e.g. (Sommerfeld 1943, §39) and (Hamel 1949)). 
Secondly, in a certain sense Hertz’s ideas of a forceless mechanics were realized as far as gravitational forces are concerned in Einstein’s general theory of relativity. This led Grigorjan (Grigorjan and Polak 1964) and Unsöld (Unsöld 1970) to consider Hertz as a precursor of Einstein. The latter concluded . . . that Hertz in his Principles of Mechanics has anticipated a series of physical ways of thinking . . . that were only rediscovered and made fruitful twenty years later by Einstein in his [clearly independent] general theory of relativity. (Unsöld 1970, p. 342)
To be sure, Einstein has several things in common with Hertz: in addition to the elimination of gravitational forces acting at a distance, there is the use of differential geometry and tensor calculus. However, the basic idea in Einstein’s treatment of gravitational forces is that he included them in the geometry. Therefore, I think that the general theory of relativity can be viewed as a natural continuation of the ideas of Lipschitz and his followers, e.g. Ricci and Levi-Civita, whereas it stands in stark contrast to Hertz’s hidden masses. Moreover, in contrast to Hertz, Einstein only eliminated gravitational forces. This may be seen as a limitation but it is, of course, essential to the explanation of the equality of gravitational and inertial mass. Finally, there is no evidence of any influence of Hertz’s mechanics on Einstein’s train of thought. Einstein is reported to have read at least the introduction to Hertz’s
Mechanics (see (Einstein 1989, p. 75 note 42)), but it is unclear whether he read the main part of the book. He did not mention Hertz’s ideas (nor those of Lipschitz and Darboux) as a motivating factor in his later accounts of his discovery of general relativity ((Einstein 1933), (Einstein 1949)), and the editors of his collected works have not found traces of such an influence.⁵
Thirdly, the idea of explaining problematic physical effects by way of hidden masses or variables did not vanish entirely in the beginning of the twentieth century, but lived on at the fringe of the mainstream of physics. In his paper of 1916 Paulus showed how a strict (i.e. not approximate) interpretation of Hertz’s mechanics would give rise to a velocity- (and place-) dependent concept of mass. This, he suggested, might yield a ‘natural’ explanation of the effects predicted by the special theory of relativity. Moreover, Schrödinger may have conceived some relation between Einstein’s general theory of relativity and Hertz’s mechanics, as is revealed by a manuscript from 1916–1918 (see (Kuhn et al. 1967)). In a more fundamental way hidden variables have been used by the opponents of the Copenhagen interpretation of quantum mechanics (e.g. Einstein) to explain the statistical element as a result of yet unknown processes. However, these explanations have not used the technical apparatus of Hertz’s mechanics, so it is questionable to regard them as a continuation of Hertz’s point of view. It is more appropriate to regard them as continuations of more general ideas about hidden masses that had been around from the time of Descartes through the late nineteenth century.
Fourthly, the mathematical form of Hertz’s mechanics, i.e. its use of a Riemannian geometry of configuration space, has had a great impact on the later presentations of mechanics.
Although such geometric language had been employed by several mathematicians beginning (explicitly) with Lipschitz and Poincaré, physicists seem to have learned it from Hertz (see, e.g. (Lorentz 1902), (Webster 1904), and (Pihl 1955)). Thus, Hertz’s influence on the geometrization of mechanics has probably been more long lasting than his influence on other parts of mechanics.
Finally, Hertz’s Mechanics was an influential part of a general trend that reassessed the relationship between reality and theory. His image theory replaced a belief in the truth of physical theories by a critical idea of correctness that was ontologically uncommitted and that allowed elements in the theory that do not correspond to observable phenomena. Such a reassessment was important for the subsequent development of quantum mechanics with its clear separation between the theoretical formalism and the observables. The importance of this change in the philosophy of science is brought out in a clear way in (Cassirer 1950). Hertz’s insistence that there may be several permissible and correct images of reality and that only rather subjective criteria of appropriateness allow us to choose between them was an important step in a relativisation of scientific theories. According to Hacking, Kuhn ‘immortalized’ Hertz’s terminology of images in the opening sentence of his Structure of Scientific Revolutions (Kuhn 1962) and only took the next step ‘when he (in an extreme reading) maintained that “There is no criteria for saying which representation of reality is the best.” Representations get chosen by social pressures. What Hertz had held up as a possibility too scaring to discuss, Kuhn said was brute fact’ (Hacking 1983, p. 144).
The philosophical introduction to Hertz’s Mechanics was read and admired by many physicists, mathematicians and philosophers. This fact has been brought out convincingly in many recent papers. In particular, Hertz’s influence on Wittgenstein’s Tractatus (Wittgenstein 1921) has been analysed in a great number of works, e.g. (Barker 1980), (Barker 1979), (Majer 1983), (Majer 1985), (Wilson 1989), (Grasshoff 1998), (Festersen 2001), and (Kjaergaard 2002) (see Baird et al. 1998, p. 285 for a fuller list). It is generally accepted that Wittgenstein, who mentioned Boltzmann and Hertz before Schopenhauer, Frege, Russell and other philosophers as his main source of inspiration, borrowed his image theory from Hertz, and extended it to the whole domain of language and thought.⁶ Grasshoff also traced Wittgenstein’s idea of simple objects back to Hertz’s Massenteilchen, and considered Hertz’s concept of models as an integral part of Wittgenstein’s theory. Recently, Festersen (Festersen 2001) has argued that Wittgenstein’s main debt to Hertz was the essential insight that one can obtain logical clarity by removing inessential elements from an image. While Hertz’s influence on Wittgenstein was noted half a century ago, the influence of his image theory on physicists and mathematicians has only recently attracted the interest of historians and philosophers.⁷ It has been pointed out (see (Toepell 1986b), (Toepell 1986a), (Corry 1997) and (Majer 1998)) that Hilbert in his lectures on geometry referred to Hertz’s image theory already in the year Hertz’s book was published. Hilbert considered his Grundlagen der Geometrie (Hilbert 1899) to be for geometry what Hertz’s book was for mechanics, namely a critical and logically rigorous foundation.
⁵ Oral communications from several members of the Einstein project.
He considered his axioms to be images in Hertz’s sense and even considered his own requirements of an axiomatic system to correspond to Hertz’s requirements of an image: Hilbert’s consistency corresponds to Hertz’s permissibility, Hilbert’s completeness corresponds to Hertz’s correctness, and Hilbert’s independence corresponds to Hertz’s simplicity. Problematic as these translations may be, they testify to an influence of Hertz’s ideas on the axiomatic development in mathematics. Majer (Majer 1998) has argued that such an influence was not limited to Hilbert but extended also to Weyl and Ramsey. Recently, Hentschel (Hentschel 1998) has argued for an influence on Heisenberg’s paper on the anomalous Zeeman effect. Thus, although the Prinzipien der Mechanik did not become the fundamental basis of all of physics that Hertz had hoped for, it still had a substantial influence on the subsequent development of physics, mathematics and philosophy.
⁶ Ian Hacking has argued that by doing so he made a mistake (Hacking 1983, p. 145). According to Hacking, Hertz was right and Wittgenstein was wrong.
⁷ Hertz’s influence on other philosophers has not been studied as much as it deserves. See, however, (Christiansen 2004).
28 List of conclusions
Instead of repeating the general aim of the book that was explained in the preface and the introduction, I shall, in this conclusion, list the specific conclusions I have arrived at in the book. Each conclusion is followed by a reference to the section in which I have argued for it.
1. Hertz conceived of his mechanics as the foundation of all of physics (3.1).
2. Hertz thought there was strong empirical evidence for the existence of atoms (5.2), but since his mechanics was supposed to be the foundation of the atomic structures as well, it was not (and could not be) based on an atomic theory of matter (11.3.1).
3. As a student Hertz was exposed to four different introductions to mechanics (5.1).
4. Hertz completed his manuscript of the Mechanics and Lenard only made a few inconsequential changes and additions in the last parts (1.4, 5.4).
5. Hertz’s work on electromagnetism provided a threefold background for his work on mechanics:
   a. As an ‘axiomatic’ reorganization of a field of physics,
   b. As an area of physics that needed mechanical explanation, and
   c. As an investigation that suggested that actions at a distance could and should be eliminated from physics (6.1).
6. Hertz investigated if gravitation could be described as a field theory (6.2).
7. Hertz considered the theory of the ether as ‘the all-important problem’ of physics, and he conceived of his Mechanics as a necessary foundation for such a theory (6.3).
8. Hertz at first tried to make a foundation of mechanics along energetic lines. He wanted to study the theory of energy in its widest sense in order to clarify the problem of localizability of energy as suggested by Poynting (6.4).
9. Felix Klein asked Hertz to write on mechanics for the Encyclopädie der mathematischen Wissenschaften. When Hertz began to work on mechanics he at first aimed toward composing such a paper (6.4).
10. Hertz’s manuscripts and drafts of the Mechanics throw light on Hertz’s construction of the geometric formalism. The physical idea of how to introduce forces seems to have been clear to Hertz from the start. He lifted the method from Helmholtz (6.5).
11. Hertz composed five drafts of the parts of his book concerned with the geometry of systems of points, four drafts of the mechanics of free systems, three drafts of the mechanics of unfree systems, including the introduction of forces and potential energy, and two drafts of the preface and the philosophical introduction (6.5).
12. Although Hertz introduced permissibility, correctness and appropriateness as three separate requirements of an image they were interrelated in various ways. In particular, Hertz considered the removal of empty relations (idle wheels) as the main way to make an image more appropriate, but also as the main strategy for gaining logical permissibility (7.5).
13. Hertz introduced an early (1884) version of his image theory as a physicist’s defence against philosophers’ undue critique (8.1).
14. Hertz’s mature image theory can be considered as a fusion of his earlier (1884) theory and his 1892 distinction between a colorless theory and a gay garment. In particular, in his mature theory Hertz borrowed a requirement of simplicity from his 1892 description of a colorless theory and applied it to the concept of an image. In this way, images became much more colorless than they had been in 1884 (8.4).
15. Hertz had not developed his mature image theory when he began working on his mechanics. His first plan of the book did not contain any philosophical introduction. The first draft of the image theory was written together with the second or third draft of the book (10.4).
16. It is necessary to distinguish between images and many concepts related to images (8.5).
17. According to Hertz, a comparison between images must proceed as follows: First the images are checked for permissibility. The non-permissible images are discarded. The remaining images are checked for correctness. Incorrect images are discarded. Finally, the remaining images are compared for appropriateness, and the most appropriate image is chosen. Hertz did not proceed along these lines when he compared the three images of mechanics. He often compared permissibility and correctness of the images and he sometimes appealed to appropriateness as an absolute criterion. Moreover, his concluding evaluation is entirely different from the rest of the evaluations. The apparent clash between the two evaluations is explained if we consider the concluding evaluation as an evaluation from the point of view of a future physicist (9).
18. Hertz carefully distinguished between a-priori and empirical elements of his image. This Kantian distinction grew stronger while he worked on his book (10.1–10.4).
19. Of a scientific representation of an image Hertz required that one can determine what is in the image for the sake of permissibility, what for the sake of correctness and what for the sake of appropriateness (10.1). This squares with the Duhem thesis that only whole theories can be tested for correctness. We can understand the requirement as far as permissibility is concerned if we consider permissibility as being concerned not just with logic but with everything that is a-priori in Kant’s sense (10.2).
20. Our intuition of time and space is a-priori according to Hertz. However, the coordinative rules that tell us how to measure external distances and times may be empirical (11).
21. Hertz’s concept of mass and in particular the Massenteilchen were introduced in the image for the sake of appropriateness. The Massenteilchen were introduced in order to facilitate a deduction of the line element of the geometry of systems of points from the Euclidean metric of ordinary space. They were subsequently made infinitely small and defined in terms of pure space and time relations (11, 12).
22. Hertz’s mechanics does not deal with single point masses but treats systems of points as the basic entity. Hertz constructed his geometry of systems of points as the natural mathematical way to deal with systems as one entity. He constructed it independently of the contemporary mathematicians’ geometrization of mechanics (13.2).
23. Hertz thought of several ways to define the central concept of curvature in his geometry of systems of points before he settled for the one to be found in the published book. The reason was that he was unsure about how to deal with the question of parallel transport (13.4).
24. In the Mechanics, Hertz introduced a concept of reduced components of a vector quantity. It corresponds to the modern concept of a covariant component. He did not introduce this concept until the second draft of the book. He introduced it in order that Hamilton’s equations have their usual form. In particular, the notion of reduced components gives a nice geometric interpretation of generalized momenta.
25. As usual in the history of mathematics and physics, the physical content and the mathematical form interacted in Hertz’s construction of his mechanics (14.8). In a similar way the philosophical considerations were both a result of and a basis for the mathematical and physical content of the book (10.4). Thus, Hertz’s Mechanics must be considered as an integrated whole.
26. Hertz’s image of mechanics allows connections at a distance (15.1).
27. Hertz derived his equations of constraint from a general experience of continuity (15.2).
28. According to Hertz, the fundamental law of motion is the only empirical element of his image (10.2, 16). Yet the form of the constraint and perhaps the coordination rules also have some empirical content (11, 15.2).
29. Hertz emphasized that ‘that which is derived from experience can again be annulled by experience.’ This applies to the fundamental law at least as far as all inanimate nature is concerned. He believed it was improbable but permissible to assume that the law could be applied to animate nature. Yet his remarks about animate nature suggest that his image gives a correct image of animate nature as well. This would imply that there is no difference in the status of animate and inanimate nature (25.5).
30. Hertz defined force as a Lagrange multiplier (19.2).
31. Hertz only derived the usual principles for conservative systems as an approximation. It is a good approximation if the particles of the hidden system that stores the potential energy have small mass and high velocity. Hertz admitted the possibility that one might be able to detect deviations from the usual principles when the visible system moves with a velocity close to that of light (20.4).
32. Hertz’s image of mechanics does not allow non-holonomic constraints among the hidden particles. This somewhat undermines Hertz’s own argument for the necessity of non-holonomic constraints (20.3).
33. Hertz did not commit an error when he pointed out that integral principles do not apply to non-holonomic systems. He may very well have been aware of the method to save them that Hölder later suggested (21, 22).
34. Hertz’s geometrization of the Hamilton–Jacobi formalism was different from that of the mathematicians, but it may have been suggested by the latter (23, 24).
35. Hertz did not construct hidden systems that could account for gravitation and electromagnetic ‘forces.’ He referred the investigation of such concrete mechanistic explanations to experimental physics. Only a few of his successors constructed such mechanisms (26).
36. Hertz’s Mechanics was well received. It was considered a program for the future, but it functioned only in a few cases as a mechanistic foundation for all of physics, as Hertz had hoped (27).
37. Yet, it had a rather substantial influence on the subsequent development of physics, mathematics and philosophy (27.3).
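Conclusion 24 in the list above can be illustrated by a short sketch in modern notation. The symbols are assumptions introduced here for illustration and are not Hertz’s own: q^i are generalized coordinates, g_ij is the metric of configuration space, and m is the total mass of the system, with the line element scaled so that the kinetic energy is (1/2) m (ds/dt)^2. Under these assumptions, lowering the index of the velocity gives covariant (‘reduced’) components that agree with the generalized momenta up to the factor m.

```latex
% Modern notation (assumed for illustration, not Hertz's own symbols).
% Line element of configuration space and kinetic energy of the system:
ds^2 = \sum_{i,j} g_{ij}\, dq^i\, dq^j ,
\qquad
T = \tfrac{1}{2}\, m \left(\frac{ds}{dt}\right)^{2}
  = \tfrac{1}{2}\, m \sum_{i,j} g_{ij}\, \dot q^i \dot q^j .

% Covariant ("reduced") components of the velocity, obtained by lowering the index,
% and the generalized momenta (using the symmetry g_{ij} = g_{ji}):
v_i = \sum_{j} g_{ij}\, \dot q^j ,
\qquad
p_i = \frac{\partial T}{\partial \dot q^i} = m\, v_i .
```

Read this way, the generalized momenta are simply the covariant components of the momentum vector, which is one way to understand the geometric interpretation mentioned in conclusion 24.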
Appendix
The following is a list of the manuscripts preserved in the Hertz Nachlass at the Deutsches Museum. They are ordered according to a probable chronology. The Ms number is the one I use for references in this book and the HS number is the archival code from the Deutsches Museum. I also cite the opening words and the number of pages (excluding inserted unpaginated sheets) of each of the early short manuscripts. Moreover, I mention my reasons for ascribing the manuscripts their place in the chronology, and I give a brief description of the content.
Ms 1, HS 2845, 4 pp. Paginated 1–4. ‘Die gleichzeitig.’ Hertz’s first introduction of a geometry, apparently only for one point. Hertz speaks of ‘Verschiebungen’ rather than ‘Verrückungen.’ The first half page is crossed out. Ms 2 was clearly written as a replacement.
Ms 2, HS 2845, 2 pp. Paginated A, blank. ‘Verschiebungen eines Systems.’ Meant as a replacement of the first half page of Ms 1. Deals with a system of points. In the Nachlass, Ms 2 precedes Ms 1.
Ms 3, HS 2845, 5 pp. Paginated 5–7 and two unpaginated sheets with calculations. A continuation of pages 1–4 of Ms 1. ‘Krümmung.’ In the Nachlass the pages come in reverse order. ‘Verschiebungen’ rather than ‘Verrückungen.’
Ms 4, HS 2845, 5 pp. Unpaginated. ‘Aufgabe.’ One page continuation of Ms 3 followed by 4 pages with calculations. In the Nachlass they precede Ms 3. In one of the calculations masses appear for the first time.
Ms 5, HS 2845, 1 p. Unpaginated. ‘Benachbarte Verrückungen.’ In the Nachlass this page follows Ms 2 and Ms 7, but the content and the use of ‘Verrückung’ rather than ‘Verschiebung’ suggest that it belongs chronologically after Ms 3, 4.
Ms 6, HS 2845, 1 p. Paginated B. ‘Bahn etc.’ Hertz writes Verrückung (schiebung). Introduction of kinematic concepts.
Ms 7, HS 2849, 11 pp. ‘Materieller Punkt.’ Paginated 1–10 (there is a p. 7a). Changes between ‘Verschiebung’ and ‘Verrückung.’ First introduction of connections and degrees of freedom. In the Nachlass placed after Ms 15.
Ms 8, HS 2846, 20 pp. Not paginated. ‘Zeit, Raum, Materie.’ The first 20 pp. of HS 2846 contain this new beginning of the introductory material. It does not contain the concept of mass, but operates with connections.
Ms 9, HS 2845, 54 pp. First draft of book. No introduction. Second book incomplete; breaks off around §417 in printed book. Last 3 pages stand out because Hertz here denotes generalized coordinates by p rather than q as in the earlier part of the manuscript.
Ms 10, HS 2847. First draft of the preface. Paginated I–VI followed by one unpaginated page with references. May belong together with Ms 12–14 or Ms 15.
Ms 11, HS 2847 and 2849. First draft of the introduction. First part (paginated I–XXIIIb) is kept in HS 2849 together with the 3rd draft of the book, second part (paginated XXIV–XXXXVII + many unpaginated pages) is kept in HS 2847 together with the second draft. It is not clear if it was written as part of the second or third draft. The mention of images ‘Bilder’ in the second draft (in the section on models) points to the earlier possibility.
Ms 12, HS 2847. Paginated 1–139. Second draft of the beginning of the main part of the book corresponding to §1–417 in printed book, i.e. the same part that was covered by Ms 9.
Ms 13, HS 2846, 180 pp. Miscellaneous calculations. Many calculations related to integral principles of conservative holonomic systems (§625–656 of the book). It may have been made in connection with Ms 14. Some of the first calculations, e.g. in connection with curvature, are probably of an earlier date. A three-page plan of the book seems to be contemporaneous with (or pre-date) the very first manuscripts.
Ms 14, HS 2848. Paginated pp. 1–28 and 1–12. First draft of the last part of the book corresponding to §418–736. A separate sheet explains that this belongs to the second draft.
This locates the manuscript as the continuation of Ms 12 rather than of Ms 9, an ordering that is corroborated both by the content and the numbering of the sections: the last section of Ms 9 is numbered §2, the last section of Ms 12 is numbered §4, and the first new section number in Ms 13 is §5. The paper format, however, is similar to that of Ms 9 but different from that of Ms 12.
Ms 15, HS 2849, 258 pp. 3rd draft of main part of book. Paginated 1–258. Bears paragraph numbers §1–655. In the middle of the manuscript there are double numerations of the paragraphs.
Ms 16, HS 2850, HS 2851, HS 2852. 4th draft of main part of the book. 1st book is in HS 2850 (pp. 1–159, §1–295). Beginning of 2nd book (pp. 160–unpaginated, §296–595 middle) is in HS 2851. End of 2nd book pp. 1–97 (not Hertz’s pagination, §595 middle–725) is in HS 2852. The text is the same as in the printed book, except for a few small changes made by Lenard mostly toward the end of the book. The paragraph numbers 1–673 were changed by Hertz during the composition of the manuscripts. The final numbers correspond to those in the book. From §674 on Hertz did not make
the necessary changes, so these last paragraphs are numbered with a number that is 12 smaller than in the book.
Ms 17, HS 2852. Second draft of the Preface and Introduction. Paginated I–XIII and 1–91, respectively. Seems to have been written together with the fourth draft of the book.
Bibliography
Andrade, J. (1898). Leçons de Mécanique physique. Paris.
Appell, P. (1896). Traité de mécanique rationnelle, volume 2. Gauthier-Villars, Paris.
Appell, P. (1898). Sur les équations de Lagrange et le principe d’Hamilton. Bulletin de la Société Mathématique de France, 26: 265–267.
Appell, P. (1899). Les mouvements de roulement en dynamique, volume 4 of Physique Mathématique Scientia, Paris.
Arnold, V. I. (1978). Mathematical Methods of Classical Mechanics. Springer-Verlag, New York. Translated from the Russian by K. Vogtmann and A. Weinstein, Graduate Texts in Mathematics, 60.
Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.) (1998). Heinrich Hertz: Classical Physicist, Modern Philosopher, volume 198 of Boston Studies in Philosophy of Science. Kluwer, Dordrecht.
Barker, P. (1979). Untangling the net metaphor. Philosophy Research Archives, 5: 184–199.
Barker, P. (1980). Hertz and Wittgenstein. Studies in History and Philosophy of Science, 11: 243–256.
Beltrami, E. (1868a). Saggio di interpretazione della geometria non-euclidea. Giornale di Matematiche, 6: 284–312, Opere Matematiche I, 374–405, French Translation in Annales scientifiques de l’École Normale Supérieure, 6 (1869): 251–288. English Translation in (Stillwell 1991), 7–34.
Beltrami, E. (1868b). Sulla teorica generale dei parametri differenziali. Memorie dell’Accademia delle Scienze dell’Istituto di Bologna, ser. 2, 8: 551–590, Opere Matematiche, vol. 2, 74–118.
Bevilacqua, F. (1993). Helmholtz’s Ueber die Erhaltung der Kraft. The emergence of a theoretical physicist. In Cahan, D. (ed.), Hermann von Helmholtz and the Foundations of Nineteenth-Century Science, Chapter 7, pages 291–333. University of California Press, Berkeley.
Bierhalter, G. (1993). Helmholtz’s mechanical foundation of thermodynamics. In Cahan, D. (ed.), Hermann von Helmholtz and the Foundations of Nineteenth-Century Science, Chapter 11, pages 432–458. University of California Press, Berkeley.
Boltzmann, L. (1884). Über die Eigenschaften monozyklischer und anderer damit verwandter Systeme. Journal für die reine und angewandte Mathematik, 98: 68–94, Wissenschaftliche Abhandlungen III, 122–181.
Boltzmann, L. (1891). Vorlesungen über Maxwells Theorie der Elektricität und des Lichtes. Barth, Leipzig. vol. I 1891, vol. II 1893.
Boltzmann, L. (1896). Vorlesungen über Gastheorie, volume 1. Barth, Leipzig. Quotations from the English translation: Lectures on Gas Theory, University of California Press, Berkeley and Los Angeles, 1964.
Boltzmann, L. (1899). Eine Aufgabe, betreffend ein Beispiel zu Hertz’ Mechanik. Jahresberichte der Deutschen Mathematiker-Vereinigung, 7: 76–77.
Boltzmann, L. (1900a). Die Druckkräfte in Hydrodynamik und die Hertz’sche Mechanik. Annalen der Physik, 4(1): 673–677, Wissenschaftliche Abhandlungen, vol. 3, 665–669.
Boltzmann, L. (1900b). Über die Entwicklung der Methoden der theoretischen Physik in neuerer Zeit. Jahresberichte der Deutschen Mathematiker-Vereinigung, 8: 71–95.
Boltzmann, L. (1902). Über die Form der Lagrangeschen Gleichungen für nichtholonome generalisierte Koordinaten. Sitzungsberichte der Akademie der Wissenschaften zu Wien. Math-naturwiss. Klasse, 111: 1603–1614, Wissenschaftliche Abhandlungen III, 682–692.
Boltzmann, L. (1903). Über die Prinzipien der Mechanik. Zwei akademische Antrittsreden. Hirzel, Leipzig.
Bonola, R. (1955). Non-Euclidean Geometry, a Critical and Historical Study of its Developments. Dover Publications Inc., New York. Translation with additional appendices by H. S. Carslaw, Supplement containing the G. B. Halsted translations of ‘The Science of Absolute Space’ by John Bolyai and ‘The Theory of Parallels’ by Nicholas Lobachevski.
Bottazzini, U. and Tazzioli, R. (1995). Naturphilosophie and its role in Riemann’s mathematics. Revue d’Histoire des Mathématiques, 1(1): 3–38.
Breunich, H. P. (1988). Zur Hertzschen Mechanik, Inaugural-Dissertation. PhD thesis, Frankfurt.
Brill, A. (1900a). Über die Mechanik von Hertz. Mathematisch-naturwissenschaftliche Mitteilungen des mathematisch-naturwissenschaftlichen Vereins in Württemberg, 2(2): 1–16.
Brill, A. (1900b). Über ein Beispiel des Herrn Boltzmann zu der Mechanik von Hertz. Jahresberichte der Deutschen Mathematiker-Vereinigung, 8: 200–204.
Brill, A. (1909). Vorlesungen zur Einführung in die Mechanik raumfüllender Massen. Teubner, Leipzig.
Brunet, P. (1938). Etude Historique sur le Principe de la Moindre Action. Hermann, Paris.
Buchwald, J. (1985). From Maxwell to Microphysics. University of Chicago Press, Chicago.
Buchwald, J. (1994). The Creation of Scientific Effects: Heinrich Hertz and Electric Waves. University of Chicago Press, Chicago.
Buchwald, J. (2003). On the scholar’s seeing eye. In Holmes, F. L., Renn, J., and Rheinberger, H.-J. (ed.), Reworking the Bench: Research Notebooks in the History of Science, volume 7 of Archimedes, pages 309–325. Kluwer, Dordrecht.
Budde, E. (1890). Allgemeine Mechanik der Punkte und starren Systeme. Berlin.
Cassirer, E. (1910). Substanzbegriff und Funktionsbegriff. Bruno Cassirer, Berlin.
Cassirer, E. (1950). The Problem of Knowledge. Philosophy, Science, and History since Hegel. Yale University Press, New Haven. Translated by W. H. Woglom and C. W. Hendel.
Chasles, M. (1837). Mémoire sur l’attraction d’une couche ellipsoidale infiniment mince et les rapports qui ont lieu entre cette attraction et les lois de la chaleur en mouvement dans un corps en équilibre de température. Journal de l’Ecole Polytechnique, 15 (25. cahier): 266–316.
Chatzis, K. (1995). Un aperçu de la discussion sur les principes de la mécanique rationnelle en France à la fin du siècle dernier. Revue d’Histoire des Mathématiques, 1: 235–270.
Christiansen, F. V. (2004). Heinrich Hertz’s neo-Kantian Philosophy of Science, and its Development by Harald Høffding. Preprint, Copenhagen.
Christoffel, E. B. (1869). Ueber die Transformation der homogenen Differentialausdrücke zweiten Grades. Journal für die reine und angewandte Mathematik, 70: 46–70.
Classen, J. (1897). Die Prinzipien der Mechanik bei Boltzmann und Hertz. Jahrbuch der Hamburgischen Wissenschaftlichen Anstalten, 15: 27–37.
Clifford, W. K. (1876). On the space-theory of matter. Proceedings of the Cambridge Philosophical Society, 2: 157–158, Read February 21, 1870. Mathematical Papers (1882), 21–22.
Clifford, W. K. (1878). Elements of Dynamics, Part 1. London.
Clifford, W. K. (1885). The Common Sense of the Exact Sciences. London.
Cohen, I. B. (ed.) (1981). The Conservation of Energy and the Principle of Least Action. Arno Press, New York.
Corry, L. (1997). David Hilbert and the axiomatization of physics (1894–1905). Archive for History of Exact Sciences, 51(2): 83–198.
D’Agostino, S. (1971). Hertz and Helmholtz on electromagnetic waves. Scientia, 106: 637–648.
D’Agostino, S. (1975). Hertz’s researches on electromagnetic waves. Historical Studies in the Physical Sciences, 6: 261–323.
D’Agostino, S. (1993). Hertz’s researches and their place in nineteenth century theoretical physics. Centaurus, 36: 46–82.
D’Agostino, S. (1998). Hertz’s view on the methods of physics: Experiment and theory reconciled? In Baird, D., Hughes, R., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 89–102. Kluwer, Dordrecht.
Darboux, G. (1888). Leçons sur la théorie générale des surfaces, 2. partie. Gauthier-Villars, Paris.
Darrigol, O. (2000). Electrodynamics from Ampère to Einstein. Oxford University Press, Oxford.
Despeyrous, T. (1884). Cours de Mécanique. Paris. 2 volumes, with notes by G. Darboux.
Dhombres, J. and Radelet-de Grave, P. (1991). Contingence et nécessité en mécanique. Étude de deux textes inédits de Jean d’Alembert. Physis: Rivista Internazionale di Storia della Scienza, 28(1): 35–114.
du Bois-Reymond, P. (1890). Über die Grundlagen der Erkenntnis in den exakten Wissenschaften. Tübingen.
Dugas, R. (1988). A History of Mechanics. Dover Publications Inc., New York. With a foreword by Louis de Broglie, Translated from the French by J. R. Maddox, Reprint of the 1957 translation.
Duhem, P. (1903). L’Évolution de la mécanique. Joanin. Reprinted by Vrin, Paris, 1992.
Duhem, P. (1906). La théorie physique, son objet et sa structure. Paris.
Dühring, E. (1873). Kritische Geschichte der allgemeinen Prinzipien der Mechanik. Theobald Grieben, Berlin, 1st edn. 3rd edn, Leipzig, 1887.
Ebert, H. (1895). H. Hertz. Die Prinzipien der Mechanik (a review). Beiblätter zu den Annalen der Physik und Chemie, 19: 106–108.
Ehrenfest, P. (1904). Die Bewegung starrer Körper in Flüssigkeiten und die Mechanik von Hertz, Dissertation. In Collected Scientific Papers, pages 1–75.
Einstein, A. (1921). Geometrie und Erfahrung. Springer, Berlin. English translation: ‘Geometry and experience’ p. 232 in Einstein, A. Ideas and Opinions. Crown Trade, New York, 1982.
Einstein, A. (1933). The Origins of the General Theory of Relativity, volume 30 of Glasgow University Publications. Jackson, Wylie and Co., Glasgow. German version in Mein Weltbild (pp. 176–181) 2nd edn. Europa Verlag, Zürich, 1953.
Einstein, A. (1949). Autobiographical notes. In Schilpp, P. (ed.), Albert Einstein: Philosopher-Scientist, pages 1–96. Tudor, New York.
Einstein, A. (1989). The Collected Papers of Albert Einstein. Vol. 2. Princeton University Press, Princeton, NJ. The Swiss years: writings, 1900–1909 (ed.), John Stachel, Translations from the German by Anna Beck.
Faraday, M. (1857). On the conservation of force. Proceedings of the Royal Institution of Great Britain, 2: 352–365. Philosophical Magazine ser. 4 (1857): 225–239.
Ferrers, N. M. (1873). Extension of Lagrange’s equations. Quarterly Journal of Pure and Applied Mathematics, 12: 1–5.
Festersen, C. (2001). Philosophie und Mathematik in Wittgensteins Tractatus. Master’s thesis, Institut for Videnskabshistorie, Aarhus Universitet, Århus.
FitzGerald, G. F. (1895). The foundations of dynamics. Nature, 51: 283–285. Review of (Hertz 1894).
Fölsing, A. (1997). Heinrich Hertz. Eine Biographie. Hoffmann und Campe, Hamburg.
Fox, R. (1974). The rise and fall of Laplacian physics. Historical Studies in the Physical Sciences, 4: 89–136.
Fraser, C. (1997). Calculus and Analytical Mechanics. Variorum, Aldershot.
Fraser, C. and Nakane, M. (2002). The early history of Hamilton-Jacobi dynamics 1834–1837. Centaurus, 44: 161–227.
Friedman, M. (1992). Kant and the Exact Sciences. Harvard University Press, Cambridge, Mass.
Friedman, M. (1997). Helmholtz’s Zeichentheorie and Schlick’s allgemeine Erkenntnislehre: Early logical empiricism and its nineteenth century background. Philosophical Topics, 25: 19–50.
Galilei, G. (1638). Discorsi e dimostrationi matematiche intorno a due nuove scienze. Elsevier, Leiden. English translation: Dialogues Concerning Two New Sciences. Dover, New York, 1954.
Galle, A. (1912). Mathematische Instrumente. Teubner, Leipzig.
Gauss, C. F. (1817). Letter from Gauss to Olbers, April 28, 1817. In Gauss Werke, vol. VIII, 1900. Göttingen.
Gauss, C. F. (1828). Disquisitiones generales circa superficies curvas. Commentationes recentiores Societatis Regiae Scientiarum Göttingensis Math. Classe, 6: 99–146, Werke 4: 217–258. English translation: General Investigations of Curved Surfaces, Raven Press, New York, 1965.
Gauss, C. F. (1829). Über ein neues allgemeines Grundgesetz der Mechanik. Journal für die reine und angewandte Mathematik, 4: 232–235. Werke vol. 5.
Grasshoff, G. (1998). Hertz’s philosophy of nature in Wittgenstein’s Tractatus. In Baird, D., Hughes, R., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 243–268. Kluwer, Dordrecht.
Gray, J. (1989). Ideas of Space. Euclidean, Non-Euclidean, and Relativistic. The Clarendon Press Oxford University Press, New York, 2nd edn.
Gray, J. (2004). Kant, Poincaré and geometry. In Friedman, M. and Nordmann, A. (ed.), Kant’s Legacy (to appear). MIT Press, Cambridge, Mass.
Green, G. (1842). On the laws of reflection and refraction of light at the common surface of two non-crystallized media. Transactions of the Cambridge Philosophical Society, 7: 1–24, 113–120. Mathematical Papers, pp. 245–269.
Grigorjan, A. T. and Polak, L. S. (1964). Die Grundideen der Mechanik von Heinrich Hertz. NTM, Beiheft, pages 89–101.
Hacking, I. (1983). Representing and Intervening. Introductory Topics in the Philosophy of Natural Science. Cambridge University Press, Cambridge.
Hadamard, J. (1895). Sur les mouvements de roulement. Mémoires de la Société des Sciences Physiques et Naturelles de Bordeaux, (4. ser.) 5. Reprinted in (Appell 1899), 47–68.
Hamel, G. (1904). Die Lagrange-Eulerschen Gleichungen der Mechanik. Zeitschrift für Mathematik und Physik, 50: 1–57.
Hamel, G. (1949). Theoretische Mechanik. Springer-Verlag, Berlin.
Hamilton, W. R. (1828). Theory of systems of rays. Transactions of the Royal Irish Academy, 15: 69–174, Mathematical Papers, 1: 1–87.
Hamilton, W. R. (1830a). Second supplement to an essay on the theory of systems of rays. Transactions of the Royal Irish Academy, 16: 93–125, Mathematical Papers, 1: 145–163.
Hamilton, W. R. (1830b). Supplement to an essay on the theory of systems of rays. Transactions of the Royal Irish Academy, 16: 1–61, Mathematical Papers, 1: 107–144.
Hamilton, W. R. (1832). Third supplement to an essay on the theory of systems of rays. Transactions of the Royal Irish Academy, 17: 1–144, Mathematical Papers, 1: 164–293.
Hamilton, W. R. (1834). On a general method in dynamics. Philosophical Transactions of the Royal Society, II: 247–308, Mathematical Papers, 2: 103–161.
Hamilton, W. R. (1835). Second essay on a general method in dynamics. Philosophical Transactions of the Royal Society, I: 95–144, Mathematical Papers, 2: 162–211.
Harman, P. M. (1982). Energy, Force, and Matter. The Conceptual Development of Nineteenth-Century Physics. Cambridge University Press, Cambridge.
Heidelberger, M. (1993). Force, law, and experiment. The evolution of Helmholtz’s philosophy of science. In Cahan, D. (ed.), Hermann von Helmholtz and the Foundations of Nineteenth-Century Science, Chapter 12, pages 461–497. University of California Press, Berkeley.
Heidelberger, M. (1998). From Helmholtz’s philosophy of science to Hertz’s picture-theory. In Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 9–24. Kluwer, Dordrecht.
Helm, G. (1887). Die Lehre von der Energie, historisch-kritisch entwickelt. Nebst Beiträgen zu einer allgemeinen Energetik. Leipzig.
Helm, G. (1890). Ueber die analytische Verwendung des Energieprincips in der Mechanik. Zeitschrift für Mathematik und Physik, 35: 307–320.
Helm, G. (1895). Über die Hertz’sche Mechanik. Vierteljahrsschrift für wissenschaftliche Philosophie, 19: 257–262.
Helm, G. (1898). Die Energetik nach ihrer geschichtlichen Entwickelung. Veit, Leipzig.
Helmholtz, H. v. (1847). Über die Erhaltung der Kraft. G. Reimer, Berlin.
Helmholtz, H. v. (1858). Ueber Integrale der hydrodynamischen Gleichungen welche den Wirbelbewegungen entsprechen. Journal für die reine und angewandte Mathematik, 55: 25–56.
Helmholtz, H. v. (1867). Handbuch der physiologischen Optik, volume 3. Voss, Leipzig.
Helmholtz, H. v. (1868). Über die thatsächlichen Grundlagen der Geometrie. Verhandlungen des naturhistorisch-medicinischen Vereins zu Heidelberg, 4: 197–202. Wissenschaftliche Abhandlungen, 2: 610–639.
Helmholtz, H. v. (1870a). On the origin and significance of geometrical axioms. Lecture given in Heidelberg 1870. In Newman, J. (ed.), The World of Mathematics, Vol. 1. 1956, pages 647–668. Schuster, New York.
Helmholtz, H. v. (1870b). Über die Bewegungsgleichungen der Elektricität für ruhende leitende Körper. Journal für die reine und angewandte Mathematik, 72: 57–129. Wissenschaftliche Abhandlungen, vol. 1, Leipzig 1882, 545–628.
Helmholtz, H. v. (1884). Studien zur Statik monocyklischer Systeme. Sitzungsberichte der Akademie der Wissenschaften zu Berlin, pages 159–177, Wissenschaftliche Abhandlungen, 3: 117–141.
Helmholtz, H. v. (1886). Ueber die physikalische Bedeutung des Princips der kleinsten Wirkung. Journal für die reine und angewandte Mathematik, 100: 137–166 and 213–222, Wissenschaftliche Abhandlungen, 3: 203–248.
Helmholtz, H. v. (1887). Zur Geschichte des Prinzips der kleinsten Aktion. Sitzungsberichte der Königlichen Preussischen Akademie der Wissenschaften zu Berlin, pages 749–757. Nature, 36 (1887), 547, Annalen der Physik und Chemie, 34 (1888), 737–751.
Helmholtz, H. v. (1892). Das Prinzip der kleinsten Wirkung in der Elektrodynamik. Sitzungsberichte der Königlichen Preussischen Akademie der Wissenschaften zu Berlin, pages 459–475. Annalen der Physik und Chemie, 47 (1892), 1–26.
Helmholtz, H. v. (1894). Vorwort (Preface). In Hertz, H. (ed.), Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt, pages XIII–XXXII. Barth, Leipzig.
Helmholtz, H. v. (1898). Vorlesungen über die Dynamik discreter Massenpunkte... Herausgeg. v. Otto Krigar-Menzel. Vorlesungen über theoretische Physik / von H. von Helmholtz Bd 1, Abth 2. Leipzig.
Hentschel, K. (1998). Heinrich Hertz’s mechanics: A model for Werner Heisenberg’s April 1925 paper on the anomalous Zeeman effect. In Baird, D., Hughes, R., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 183–223. Kluwer, Dordrecht.
Hertz, H. (1889). Über die Beziehungen zwischen Licht und Elektricität. In Vortrag gehalten bei der 62. Versammlung deutscher Naturforscher und Ärzte zu Heidelberg am 20. September 1889. Emil Strauss, Bonn, Gesammelte Werke, 1 (1895): 339–354.
Hertz, H. (1890). Ueber die Grundgleichungen der Elektrodynamik für ruhende Körper. Annalen der Physik und Chemie, 40: 577–610. Nachrichten von der Königlichen Gesellschaft der Wissenschaften zu Göttingen (1890): 106–149. Gesammelte Werke, 2 (1894): 208–255.
Hertz, H. (1892). Untersuchungen ueber die Ausbreitung der elektrischen Kraft. Barth, Leipzig. Gesammelte Werke vol. 2. Page references to the English translation: Electric Waves. Macmillan, London, 1900.
Hertz, H. (1894). Die Prinzipien der Mechanik in neuem Zusammenhange dargestellt. Barth, Leipzig. Gesammelte Werke 3 (1910). Reprinted Sändig Vaduz 1984. English translation: The Principles of Mechanics Presented in a New Form, Macmillan, 1900. Reprinted Dover, New York 1950. Page references to the German/English translation.
Hertz, H. (1999). Die Constitution der Materie, Eine Vorlesung über die Grundlagen der Physik aus dem Jahre 1884. Springer-Verlag, Berlin. (ed.) Albrecht Fölsing.
Hertz, J. (1977). Heinrich Hertz. Erinnerungen, Briefe, Tagebücher/Memoirs, Letters, Diaries. Physik Verlag/San Francisco Press, Weinheim/San Francisco, 2nd edn.
Hiebert, E. (1980). Boltzmann’s conception of theory construction: The promotion of pluralism, provisionalism, and pragmatic realism. In Hintikka, J., Gruender, D., and Agazzi, E. (ed.), Pisa Conference Proceedings, vol. 2. Reidel, Dordrecht.
Hilbert, D. (1899). Grundlagen der Geometrie. Leipzig. Many later editions.
Høffding, H. (1915). Modern Philosophers. Lectures delivered at the University of Copenhagen during the Autumn of 1902 and Lectures on Bergson. MacMillan, London.
Hölder, O. (1896). Ueber die Principien von Hamilton und Maupertuis. Nachrichten der Königlichen Gesellschaft der Wissenschaften zu Göttingen, Math. Phys. Klasse, pages 122–157.
Hyder, D. (2003). Kantian metaphysics and Hertzian mechanics. In Stadler, F. (ed.), The Vienna Circle and Logical Empiricism, pages 35–48. Kluwer, Dordrecht.
Hyder, D. (2004). Kant, Helmholtz and the determinacy of physical theory. In Lützen, J. (ed.), The Interaction between Mathematics, Physics and Philosophy from 1850 to 1940, Kluwer, Dordrecht (to appear).
Jacobi, C. G. J. (1837). Über die Reduktion der Integration der partiellen Differentialgleichungen erster Ordnung zwischen irgend einer Zahl Variablen auf die Integration eines einzigen Systems gewöhnlicher Differentialgleichungen. Journal für die reine und angewandte Mathematik, 17: 97–162, Werke, IV, 57–127, French translation: Sur la réduction de l’intégration des équations différentielles partielles du premier ordre entre un nombre quelconque de variables à l’intégration d’un seul système d’équations différentielles ordinaires, Journal de Mathématiques Pures et Appliquées, 3 (1838), 60–96 and 161–201.
Jacobi, C. G. J. (1866). Vorlesungen über Dynamik gehalten an der Universität zu Königsberg im Wintersemester 1842–1843 und nach einem von C.W. Borchardt ausgearbeiteten Hefte. Teubner, 1866, Leipzig. Werke, Supplementband.
Jacobi, C. G. J. (1996). Vorlesungen über analytische Mechanik, volume 8 of Dokumente zur Geschichte der Mathematik [Documents on the History of Mathematics]. Deutsche Mathematiker Vereinigung, Freiburg. Lecture notes from Berlin 1847/48 prepared by Wilhelm Scheibner, With a foreword by Jürgen Jost, Edited and with a preface by Helmut Pulte.
Jones, D. E. (1894). Heinrich Hertz. Nature, pages 265–266. An obituary.
Jouguet, E. (1909). Lectures de Mécanique, Deuxième partie. Gauthier-Villars, Paris.
Jourdain, P. E. B. (1905a). Alternative forms of the equations of mechanics. Quarterly Journal of Pure and Applied Mathematics, 36: 284–296.
Jourdain, P. E. B. (1905b). On the general equations of mechanics. Quarterly Journal of Pure and Applied Mathematics, 36: 61–79.
Jourdain, P. E. B. (1908a). Abhandlungen über die Prinzipien der Mechanik von Lagrange, Rodrigues, Jacobi und Gauss. Engelmann, Leipzig.
Jourdain, P. E. B. (1908b). On those principles of mechanics which depend upon processes of variation. Mathematische Annalen, 65: 513–527.
Jourdain, P. E. B. (1913). The Principle of Least Action. Open Court, Chicago.
Jungnickel, C. and McCormmach, R. (1986). Intellectual Mastery of Nature: Theoretical Physics from Ohm to Einstein. University of Chicago Press, Chicago. 2 volumes.
Kant, I. (1781). Kritik der reinen Vernunft. English translation by N. K. Smith: Critique of Pure Reason, London, 1933.
Kant, I. (1786). Metaphysische Anfangsgründe der Naturwissenschaft. Riga. English Translation by J. L. Ellington: Metaphysical Foundations of Natural Science, Indianapolis, 1970.
Kant, I. (1790). Kritik der Urteilskraft. English Translation: Critique of Teleological Judgment, Oxford, 1928.
Keller, E. F. (1995). Refiguring Life. Columbia University Press, New York.
Kirchhoff, G. (1876). Mechanik, volume 1 of Vorlesungen über Mathematische Physik. Teubner, Leipzig. Page references to 2nd edn, 1877.
Kjaergaard, P. C. (2002). Hertz and Wittgenstein’s philosophy of science. Journal for General Philosophy of Science, 33: 121–149.
Kjeldsen, T. H. (2002). Different motivations and goals in the historical development of the theory of systems of linear inequalities. Archive for History of Exact Sciences, 56(6): 469–538.
Klein, F. (1873). Über die sogenannte Nicht-Euklidische Geometrie. Mathematische Annalen, 6: 112–145, English translation in (Stillwell 1991), 69–110.
Klein, M. J. (1970). Paul Ehrenfest. The Making of a Theoretical Physicist, volume 1. North Holland, Amsterdam.
Klein, M. J. (1973). Mechanical explanation at the end of the nineteenth century. Centaurus, 17(1): 58–82.
Kneser, A. (1928). Das Prinzip der kleinsten Wirkung von Leibniz bis zur Gegenwart. Teubner, Leipzig.
Knudsen, O. (1985). Mathematics and physical reality in William Thomson’s electromagnetic theory. In Harman, P. (ed.), Wranglers and Physics. Studies on Cambridge Physics in the 19th Century. Manchester University Press, Manchester.
Koenigsberger, L. (1903). Hermann von Helmholtz, volume 3. Vieweg, Braunschweig.
Korteweg, D. J. (1900). Über eine ziemlich verbreitete unrichtige Behandlungsweise eines Problems der rollenden Bewegung. Nieuw Archief voor Wiskunde, (2) 4: 132–155.
Kreyszig, E. (1959). Differential geometry. Mathematical Expositions, No. 11. University of Toronto Press, Toronto.
Kuhn, T. S. (1962). The Structure of Scientific Revolutions. Chicago University Press, Chicago.
Kuhn, T. S., Heilbron, J., Forman, P., and Allen, L. (1967). Sources for History of Quantum Physics. Philadelphia.
Lagrange, J. L. (1788). Mécanique Analitique. Desaint, Paris. Later editions: Mécanique Analytique. 2nd edn, Paris 1811–1815. 3rd edn, edited and annotated by Bertrand, Paris 1853, 4th edn, (ed.) Darboux 1888.
Lampe, E. (1896). Review of (Hertz 1894). Jahrbuch über die Fortschritte der Mathematik, 25: 1310–1314.
Laplace, P. S. (1799). Traité de Mécanique Céleste. Paris, 1799–1825. Oeuvres 1–5.
Larmor, J. (1900). Aether and Matter. Cambridge University Press, Cambridge.
Larmor, J. (1937). Origins of Clerk Maxwell’s Electric Ideas as Described in Familiar Letters to William Thomson. Cambridge University Press, Cambridge.
Lenard, P. (1894). Preface. In Hertz, H. (ed.), Prinzipien der Mechanik in neuem Zusammenhange dargestellt, pages v–vi. Barth, Leipzig.
Lenard, P. (1910). Vorbemerkung des Herausgebers zur zweiten Auflage. In Hertz’s Prinzipien der Mechanik, 2nd edn, pages IX–XII.
Lenard, P. (1934). Great Men of Science. MacMillan, New York.
Lenard, P. (1938). Deutsche Physik. Erster Band Einleitung und Mechanik, volume 1. Lehmann, München.
Lenoir, T. (1982). The Strategy of Life. Reidel, Dordrecht.
Lindelöff, E. (1895). Sur le mouvement d’un corps de révolution roulant sur un plan horizontal. Acta Societatis Scientiarum Fennicae, 20(10) 18 pp.
Liouville, J. (1851). Leçons au Collège de France (1851.-1er semestre). Manuscript Ms 3640 (1846–1851) conserved at the Bibliothèque de l’Institut de France.
Liouville, J. (1856). Expression remarquable de la quantité qui, dans le mouvement d’un système de points matériels à liaisons quelconques, est un minimum en vertu du principe de la moindre action. Comptes Rendus de l’Académie des Sciences Paris, 42: 1146–1154.
Liouville, R. (1892). Sur les équations de la dynamique. Comptes Rendus de l’Académie des Sciences Paris, 114: 1171–1172.
Lipschitz, R. (1869). Untersuchungen in Betreff der ganzen homogenen Funktionen von n Differentialen. Journal für die reine und angewandte Mathematik, 70: 71–102.
Lipschitz, R. (1872). Untersuchung eines Problems der Variationsrechnung, in welchem das Problem der Mechanik enthalten ist. Journal für die reine und angewandte Mathematik, 74: 116–149.
Lodge, O. (1893). The fundamental axioms of dynamics. Nature, 48: 62–63, 101–102, 126–127, 174–175.
Lorentz, H. A. (1902). Some considerations on the principles of dynamics, in connection with Hertz’s Prinzipien der Mechanik. Verslagen der Zittingen van de Wis– en Naturkundige Afdeeling der Koninklizke Akademie van Wetenschappen, 10: 876. Amsterdam Proceedings (1901–1902) 713; Abhandlungen über Theoretische Physik, I (1907) 1–22. Love, A. E. H. (1887). On recent English researches in vortex-motion. Mathematische Annalen, 30: 326–344. Lützen, J. (1990). Joseph Liouville 1809–1882. Master of Pure and Applied Mathematics, volume 15 of Studies in the History of Mathematics and Physical Sciences. Springer-Verlag, New York. Lützen, J. (1995). Interactions between mechanics and differential geometry in the 19th century. Archive for History of Exact Sciences, 49(1): 1–72. Lützen, J. (1998). Heinrich Hertz and the geometrization of mechanics. In Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 103–121. Kluwer, Dordrecht. Lützen, J. (1999). A matter of matter or a matter of space? Heinrich Hertz’s image of mass in his Prinzipien der Mechanik Archives Internationales d’Histoire des Sciences 49: 103–121. Lützen, J. (2001). Julius Petersen, Karl Weierstrass, Hermann Amandus Schwarz and Richard Dedekind on hypercomplex numbers. In Lützen, J. (ed.), Arround Caspar Wessel and the Geometric Representation of Complex Numbers, pages 223–254. Det Kongelige Danske Videnskabernes Selskab, Matematisk fysiske Meddelelser 46: 2, København. Lützen, J. (2004). Images and conventions: Kantianism, empiricism and conventionalism in Hertz’s and Poincaré’s philosophies of space and mechanics. In Friedman, M. and Nordmann, A. (ed.), Kant’s Legacy (to appear). MIT Press. Lützen, J., Sabidussi, G., and Toft, B. (1992). Julius Petersen 1839–1910. A biography. Discrete Mathematics, 100: 5–82. Special volume to mark the centennial of Julius Petersen’s ‘Die Theorie der regulären Graphs.’ MacCullagh, J. (1839). 
An essay towards a dynamical theory of crystalline reflexion and refraction. Transactions of the Royal Irish Academy, 21: 17–50. Mach, E. (1871). Die Geschichte und die Wurzel des Satzes von der Erhaltung der Arbeit. Prague. 2nd edn Barth, Leipzig 1909, English Edition Open Court, Chicago 1911, Reprinted in (Cohen 1981). Mach, E. (1883). Die Mechanik in Ihrer Entwickelung. Historisch-kritisch dargestellt. Brockhaus, Leipzig. 4th edn, 1901. 7th edn, 1912. Majer, U. (1983). Die Wissenschaftstheorie des Tractatus: Eine bisher unbekannte Form des Konventionalismus. In Erkenntnis- und Wissenschaftstheorie: Akten des 7. Internationalen Wittgenstein Symposiums, Vienna, pages 460–464, Wien. Majer, U. (1985). Hertz, Wittgenstein und der Wiener Kreis. In Dahms, H. (ed.), Philosophie, Wissenschaft, Aufklärung: Beiträge zur Geschichte und Wirkung des Wiener Kreises, pages 40–66. de Gruyter, Berlin. Majer, U. (1998). Heinrich Hertz’s picture-conception of theories: Its elaboration by Hilbert, Weyl, and Ramsey. In Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 225–242. Kluwer, Dordrecht. Mathieu, E. (1878). Dynamique analytique. Paris. Maxwell, J. C. (1861). On physical lines of force. Philosophical Magazine, 21: 161–175, 281–291, 338–348; vol. 23 (1862), 12–24, 85–95. Page numbers refer to Scientific Papers, 1: 451–513.
Maxwell, J. C. (1864a). A dynamical theory of the electromagnetic field. Transactions of the Royal Society, 155: 459–512. Scientific Papers 1: 526–597. Maxwell, J. C. (1864b). On Faraday’s lines of force. Transactions of the Cambridge Philosophical Society, 10: 27–83. Presented to the Cambridge Philosophical Society 1855/56. Scientific Papers 1: 155–229. Maxwell, J. C. (1873a). On action at a distance. Proceedings of the Royal Institution of Great Britain, 7: 44–54. Scientific Papers, 2: 311–323. Maxwell, J. C. (1873b). A Treatise on Electricity and Magnetism, volumes I + II. Clarendon, Oxford. 3rd edn, 1891. Maxwell, J. C. (1876a). Matter and Motion. London. Reprint: Dover, New York. Maxwell, J. C. (1876b). On the proof of the equations of motion of a connected system. Proceedings of the Cambridge Philosophical Society, 2: 292–294. Scientific Papers, 2: 308–309. Maxwell, J. C. (1879). Thomson and Tait’s natural philosophy. Nature, 20. Scientific Papers, 2, 776–785. Maxwell, J. C. (1965). The Scientific Papers of James Clerk Maxwell, volumes 1 and 2. Dover, New York. (ed.) Niven, W. D. Maxwell, J. C. (1990). The Scientific Letters and Papers of James Clerk Maxwell, volume 1. Cambridge University Press, Cambridge. (ed.) Harman, P. M. Mayer, A. (1877). Geschichte des Prinzips der kleinsten Action. Veit, Leipzig. McCormmach, R. (1972). Heinrich Rudolf Hertz. In Gillispie, C. (ed.), Dictionary of Scientific Biography, vol. 6, pages 340–350. Charles Scribner’s Sons, New York. Merz, J. T. (1903). A History of European Thought in the Nineteenth Century, volume 2. Blackwood and Sons, Edinburgh. Mitchell, S. (1993). Mach’s mechanics and absolute space and time. Studies in History and Philosophy of Science, 24: 565–583. Morrison, M. (1999). Models as autonomous agents. In Morgan, M. S. and Morrison, M. (ed.), Models as Mediators, pages 38–65. Cambridge University Press, Cambridge. Mulligan, J. F. (1994). Heinrich Rudolf Hertz (1857–1894). Garland, New York. Neumann, C. (1870).
Ueber die Principien der Galilei-Newton’schen Theorie. Teubner, Leipzig. Neumann, C. (1887). Grundzüge der analytischen Mechanik, insbesondere der Mechanik starrer Körper. Berichte über Verhandlungen der Königlich Sächsischen Gesellschaft der Wissenschaften zu Leipzig. Mathematisch-Physische Classe. 1. part (1887), 153–190, 2. part (1888), 22–88. Newton, I. (1687). Philosophiae naturalis principia mathematica. London. English translation. F. Cajori, 3rd edn, California 1946. Nordmann, A. (1998). ‘Everything could be different’: The principles of mechanics and the limits of physics. In Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 155–171. Kluwer, Dordrecht. Nordmann, A. (2000). Heinrich Hertz: Scientific biography and experimental life. (Essay review). Studies in History and Philosophy of Science, 31: 537–549. Ostrogradsky, M. (1850). Equations différentielles dans le problème des isopérimètres. Mémoires de l’Académie de St. Petersburg, (6) 4: 385. Ostwald, W. (1888). Die Energie und ihre Wandlungen. Leipzig Antrittsrede, Leipzig. Paulus, F. (1910). Über eine unmittelbare Bestimmung jeder einzelnen Reaktionskraft eines bedingten Punktsystems für sich aus den Lagrange’schen Gleichungen zweiter
Art. Sitzungsberichte der kaiserlichen Akademie der Wissenschaften Mathematisch-Naturwissenschaftliche Klasse (Wien), II a 119: 1669–1718. Paulus, F. (1916). Ergänzungen und Beispiele zur Mechanik von Hertz. Sitzungsberichte der kaiserlichen Akademie der Wissenschaften Mathematisch-Naturwissenschaftliche Klasse (Wien), II a 125: 835–882. Petersen, J. (1881). Forelæsninger over Statik (Lectures on Statics). Høst og Søn, København. Translated into German 1882. Petersen, J. (1884). Forelæsninger over Kinematik (Lectures on Kinematics). Høst og Søn, København. Translated into German 1884. Petersen, J. (1887). Forelæsninger over Dynamik (Lectures on Dynamics). Høst og Søn, København. Translated into German 1887. Pihl, M. (1955). Den klassiske mekanik i geometrisk beskrivelse. Det Kongelige Danske Videnskabernes Selskab Matematisk-fysiske Meddelelser, 30. Planck, M. (1887). Das Prinzip der Erhaltung der Energie. Leipzig. Poincaré, H. (1895). L’espace et la géométrie. Revue de métaphysique et de morale, 3: 631–646, In La science et l’hypothèse, pp. 77–94, Science and Hypothesis, pp. 51–71. Poincaré, H. (1897). Les idées de Hertz sur la mécanique. Revue Générale des Sciences, 8: 734–743. Oeuvres, VII, 231–250. Poincaré, H. (1902). La Science et l’Hypothèse. Paris. New ed. Flammarion, Paris 1968. English translation. Science and Hypothesis, Walter Scott, 1905, re-edited by Dover, 1952. Poincaré, H. (1952). Science and Method. Dover, New York. Poisson, S. D. (1809). Mémoire sur la variation des constantes arbitraires dans les questions de la mécanique. Journal de l’Ecole Polytechnique, 8: 266–344. Poynting, J. H. (1884). On the transfer of energy in the electromagnetic field. Philosophical Transactions, 175: 343. Collected Scientific Papers, 1920, 175. Pulte, H. (1989). Das Prinzip der kleinsten Wirkung und die Kraftkonzeptionen der rationalen Mechanik. Studia Leibnitiana Sonderheft [Studia Leibnitiana Special Issue], 19. Franz Steiner Verlag Wiesbaden GmbH, Stuttgart. 
Pulte, H. (1998). Jacobi’s criticism of Lagrange: the changing role of mathematics in the foundations of classical mechanics. Historia Mathematica, 25(2): 154–184. Reech, F. (1852). Cours de mécanique d’après la nature généralement flexible et élastique des corps. Paris. Reich, K. (1992). Levi-civitasche Parallelverschiebung affiner Zusammenhang, Übertragungsprinzip, 1916/17–1922/23. Archive for History of Exact Sciences, 44: 77–105. Reich, K. (1994). Die Entwicklung des Tensorkalküls. Birkhäuser, Basel. Reiff, R. (1900). Die Druckkräfte in der Hydrodynamik und die Hertz’sche Mechanik. Annalen der Physik, IV, 1: 225–231. Resal, H. (1873). Traité de mécanique générale, volume 1, 2. Paris. Réthy, M. (1897). Über das Prinzip der kleinsten Aktion und das Hamiltonsche Prinzip. Mathematische Annalen, 48: 514–547. Réthy, M. (1904). Über das Prinzip der Aktion und über die Klasse mechanischer Prinzipien, der es angehört. Mathematische Annalen, 58: 169–194. Riemann, B. (1867a). Ein Beitrag zur Elektrodynamik. Annalen der Physik und Chemie, 131: 237–243. Written 1858. Gesammelte mathematische Werke, 288–293. Riemann, B. (1867b). Über die Hypothesen welche der Geometrie zu Grunde liegen (Habilitationsvortrag 1854). Abhandlungen der Königlichen Gesellschaft der Wissenschaften zu Göttingen, Mathematische Classe, 13: 1–20, Werke, 271–287.
Routh, E. J. (1877a). The Advanced Part of a Treatise on Dynamics of a System of Rigid Bodies. London, 3rd edn. The 6th edn London 1905 is reprinted in Routh: Stability of Motion. Taylor and Francis, London, 1975. Routh, E. J. (1877b). A Treatise on the Stability of a given State of Motion. Macmillan, London. Reprinted in Stability of Motion, Taylor and Francis, London 1975. Samuelson, P. A. (1971). An important problem in physics. Scientia, 106: 32–39. Schell, W. (1879). Theorie der Bewegung und der Kräfte, volume 1, 2. Leipzig, 2nd edn. Schiemann, G. (1998). The loss of the world in the image. In Baird, D., Hughes, R. I. G., and Nordmann, A. (ed.), Heinrich Hertz: Classical Physicist, Modern Philosopher, pages 25–38. Kluwer, Dordrecht. Schläfli, L. (1852). Ueber das Minimum eines Integrals wenn die Variablen durch eine Gleichung zweiten Grades gegenseitig von einander abhängig sind. Journal für die reine und angewandte Mathematik, 43: 23–36. Gesammelte Abhandlungen, 2: 142–155. Scholz, E. (1992). Riemann’s vision of a new approach to geometry. In 1830–1930: a century of geometry (Paris, 1989), volume 402 of Lecture Notes in Physics, pages 22–34. Springer, Berlin. Serret, J. A. (1848a). Sur l’intégration des équations différentielles du mouvement d’un point matériel. Comptes Rendus de l’Académie des Sciences de Paris, 26: 605–610. Serret, J. A. (1848b). Sur l’intégration des équations générales de la dynamique. Comptes Rendus de l’Académie des Sciences de Paris, 26: 639–643. Shapin, S. and Schaffer, S. (1985). Leviathan and the Air-pump. Hobbes, Boyle and Experimental Life. Princeton University Press, Princeton. Siegel, D. M. (1991). Innovations in Maxwell’s Electromagnetic Theory. Molecular vortices, Displacement Current, and Light. Cambridge University Press, Cambridge. Sloudsky, T. (1879). Note sur le principe de la moindre action. Nouvelles annales de Mathématiques, (2) 18: 193–200. Smith, C. and Wise, M. N. (1989). Energy and Empire. 
A Biographical Study of Lord Kelvin. Cambridge University Press, Cambridge. Sommerfeld, A. (1943). Mechanik. Akademische Verlagsgesellschaft, Leipzig. English translation, Academic Press, New York, 1952. Stillwell, J. (1991). Sources of Hyperbolic Geometry. History of Mathematics, vol. 10. American Mathematical Society, London Mathematical Society. Szabó, I. (1977). Geschichte der mechanischen Prinzipien und ihrer wichtigsten Anwendungen. Birkhäuser Verlag, Basel. Wissenschaft und Kultur, 32. Tait, P. G. (1885). Properties of Matter. London. Thiele, J. (1968). Ernst Mach und Heinrich Hertz. Zwei unveröffentlichte Briefe aus dem Jahre 1890. NTM, 2 (12): 132–134. Thomson, J. J. (1886). On some applications of dynamical principles to physical phenomena. Philosophical Transactions, 176: 307–342 and 178 (1888): 471–526. Thomson, J. J. (1888). Applications of Dynamics to Physics and Chemistry (lectures of 1886). Macmillan, London. Reprinted 1968. Thomson, W. (1867). On vortex atoms. Philosophical Magazine, (4) XXXIV: 15–24. Proceedings of the Glasgow Philosophical Society VI (1868): 197–206; Proceedings of the Royal Society of Edinburgh, VI (1869): 94–105. Thomson, W. (1869). On vortex motion. Transactions of the Royal Society of Edinburgh, 25: 217–260. Proceedings of the Royal Society of Edinburgh, VII (1872): 576–577. Thomson, W. (1884/1985). Kelvin’s Baltimore Lectures and Modern Theoretical Physics. MIT Press, 1995, Cambridge Mass. (ed.) Kargon, R. and Achinstein, P. Thomson, W. and Tait, P. (1867). Treatise on Natural Philosophy. Oxford.
Thomson, W. and Tait, P. (1879). Treatise on Natural Philosophy. Cambridge, 2nd edn., vol. 1 (1879), vol. 2 (1883). Toepell, M.-M. (1986a). On the origins of David Hilbert’s Grundlagen der Geometrie. Archive for History of Exact Sciences, 35(4): 329–344. Toepell, M.-M. (1986b). Über die Entstehung von David Hilberts ‘Grundlagen der Geometrie.’ Number 2 in Studien zur Wissenschafts-, Sozial- und Bildungsgeschichte der Mathematik. Vandenhoeck und Ruprecht, Göttingen. Topper, D. R. (1971). Commitment to mechanism: J. J. Thomson, the early years. Archive for History of Exact Sciences, 7: 393–410. Truesdell, C. A. (1968). Whence the law of moment of momentum? In Essays in the History of Mechanics, volume 39, pages v + 384. Springer-Verlag, New York. Unsöld, A. (1970). H. Hertz’ Prinzipien der Mechanik, Versuch einer historischen Klärung. Physikalische Blätter, 26: 337–342. Van Melsen, A. G. (1952). From Atomos to Atom. The History of the Concept of Atom. Duquesne University Press, Pittsburgh. Varadarajan, V. S. (2003). Vector bundles and connections in physics and mathematics: some historical remarks. In A Tribute to C. S. Seshadri (Chennai, 2002), Trends in Mathematics pages 502–541. Birkhäuser, Basel. Vierkandt, A. (1892). Über gleitende und rollende Bewegung. Monatshefte für Mathematik und Physik, 3: 31–54. Volkmann, P. (1901). Die gewöhnliche Darstellung der Mechanik und ihre Kritik durch Hertz. Zeitschrift für den physikalischen und chemischen Unterricht, 14: 266–283. Voss, A. (1880). Zur Theorie der Transformation quadratischer Differentialausdrücke und der Krümmung höherer Mannigfaltigkeiten. Mathematische Annalen, 16: 129–178. Voss, A. (1899–1900). Über die Principe von Hamilton und Maupertuis. Nachrichten von der Königlichen Gesellschaft der Wissenschaften zu Göttingen. Mathematisch-Physische Klasse, pages 322–327. Voss, A. (1901). Die Prinzipien der rationalen Mechanik. Encyclopädie der Mathematischen Wissenschaften, IV (1) (1901–1908): 3–121. Watkins, E. 
(1997). The laws of motion from Newton to Kant. Perspectives on Science, 5: 311–348. Webster, A. G. (1904). The Dynamics of Particles and of Rigid, Elastic, and Fluid Bodies. Teubner, Leipzig. Wien, W. (1900). Über die Möglichkeit einer elektromagnetischen Begründung der Mechanik. Archives Néerlandaises des Sciences Exactes et Naturelles (Recueil de Travaux offerts par les auteurs à H.A. Lorentz à l’occasion du 25ème anniversaire de son doctorat le 11 décembre 1900), II, 5: 96–107. Also Annalen der Physik (4) 5, 1901, 501–513. Willers, F. A. (1951). Mathematische Maschinen und Instrumente. Akademie-Verlag, Berlin. Wilson, A. D. (1989). Hertz, Boltzmann and Wittgenstein reconsidered. Studies in History and Philosophy of Science, 20: 245–263. Wise, M. N. (1981). German concepts of force, energy, and the electromagnetic ether: 1845–1880. In Cantor, G. and Hodge, M. (ed.), Conceptions of the Ether. Studies in the History of Ether Theories, pages 269–307. Cambridge University Press, Cambridge, 9th edn. Wittgenstein, L. (1921). Tractatus Logico-Philosophicus. English translation Routledge and Kegan Paul (1966). Wundt, W. (1866). Die physikalischen Axiome und ihre Beziehung zum Kausalprinzip. Erlangen.
Index
absolute space, 22, 132 acceleration, 179, 206, 207 actions at a distance, 2, 9, 13, 29, 32, 40–45, 52, 55, 58, 63, 64, 67–72, 75, 84, 109, 187, 197, 213, 224, 278, 279, 284, 287, 290 adiabatic cyclic system, 213, 225–227, 233, 261 Ampère, André Marie (1775–1836), 31, 41, 42, 53, 64, 125 analogy, 31, 32, 34, 35, 37, 38, 42, 44, 51, 84, 110, 155, 156, 252 Andrade, Jules (1857–1933), 25 angle, 3, 162–165, 167–170, 174, 176, 195, 196 animate systems, see living systems apparent mass, 143, 231, 232 Appell, Paul (1855–1930), 28, 245 applicability, 264 application, 284–286 applied geometry, 131–133 applied mathematics, 185 appropriate, 5, 89–96, 99, 102, 103, 107, 117, 119, 120, 122, 134, 135, 263, 276, 278, 281, 282, 288, 291, 292 approximation, 230, 233, 234 atom, 40, 48, 55, 98, 103, 136, 279 axiomatization, 63–65 Beltrami, Eugenio (1835–1900), 129, 159, 160, 183, 252, 259, 261, 262 Berlin University, 52 Bernoulli, Daniel (1700–1782), 10
Bernoulli, Jacob (1655–1705), 10 Bernoulli, Johann (1667–1748), 10, 11 Berthollet, Claude Louis (1748–1822), 40 biology, 267 Boltzmann, Ludwig (1844–1906), 28, 31, 34, 38, 48, 66, 216, 245, 246, 274, 279, 281, 284–286, 289 Bolyai, Janos (1802–1860), 22, 129 Bonn University, 57 Borchardt, Carl Wilhelm (1817–1880), 52, 73 Boscovich, Giuseppe (1711–1787), 40 Boyle, Robert (1626–1669), 40 Brill, Alexander (1842–1935), 274, 284–286 Carnot, Lazare (1753–1823), 23, 144 cathode rays, 54, 55, 58, 73 Cauchy, Augustin Louis (1789–1857), 31, 36 center of mass (principle of), 9, 10 characteristic function, 16, 17, 20, 249, 250, 254–256, 262 Chasles, Michel (1793–1880), 31 Clausius, Rudolf (1822–1888), 25, 57 Clifford, William Kingdon (1845–1879), 28, 46, 260 Cohn, Emil (1854–1944), 74 colorless theory, 101, 291 component, 173, see component along a coordinate, see reduced component component along a coordinate, 4, 174 concealed mass, see hidden mass
configuration space, v, vi, 1, 3, 146, 147, 153, 155, 156, 171, 247, 251, 257–261, 281, 288 connection, 2–4, 10–15, 21, 27, 36, 41, 65, 75, 79, 94, 104–106, 108, 109, 111, 113, 116, 117, 121, 154, 160, 162, 163, 171, 187–200, 203, 205, 206, 220, 221, 223, 224, 229, 230, 236–246, 255, 264, 266, 267, 279–281, 284, 285, 287, 292, 293 conservation of angular momentum, 10 conservation of energy, see energy conservation conservation of momentum, 9 conservative system, 4, 14, 26, 27, 80, 141, 157, 208, 225, 227, 228, 230, 232–235, 238, 239, 251, 258, 259, 274, 292 constraint, see connection, 10, 206 continuity principle, 116 continuous connection, 189 continuous medium, 3, 76, 77, 140, 254 conventionalism, 130, 132, 133, 137 coordinative rule, 131–133, 141 correctness, 5, 75, 84, 86, 87, 93–96, 99, 107–109, 272–274, 288, 289, 291, 292 curvature, 1, 3, 46, 129, 130, 148, 154, 164–166, 168–172, 179, 180, 184, 198, 199, 201–203, 253, 260, 292 cyclic coordinate, 37, 38, 208, 210, 212–218, 225, 226, 229, 231, 276 cyclic system, 79, 80, 208, 213, 225–227, 229, 231–234 d’Alembert’s principle, 11, 12, 15, 26, 36, 112, 113, 206, 207, 220, 223, 240, 241 d’Alembert, Jean le Rond (1717–1783), 10–12, 24 d’Alembert’s principle Lagrange’s version, 12 Dühring, Eugen Karl (1833–1921), 28 Darboux, Gaston (1842–1917), 16, 28, 159, 160, 252, 259–262, 288 deductive style, 1, 21, 31, 88, 89, 95, 111, 116, 135 degree of freedom, 195 Descartes, René (1596–1650), 9, 10, 20, 40, 138, 271, 288
differential geometry, v, 159, 183, 186, 237, 252, 287 direction, 162–167, 169–171, 174, 177 displacement, 147, 148, 152, 154 distinctness, 5, 87, 89–94, 196 drafts of Hertz’s Mechanics, 6, 7, 38, 80–82, 84, 104, 112, 118, 124, 125, 145, 148, 149, 151, 156–159, 162, 169, 170, 172, 180–185, 190, 191, 194, 199, 262, 290–292 Dresden Polytechnikum, 51 Duhem, Pierre (1861–1916), 47, 119, 120, 123, 281 dynamics, 119, 125, 126 Ehrenfest, Paul (1880–1933), 140, 284, 286 Einstein, Albert (1879–1955), 23, 58, 128, 159, 186, 260, 287, 288 electromagnetic world view, 39, 143, 282, 287 electromagnetism, 7, 8, 14, 26, 31–34, 41, 43–45, 48, 52–55, 57, 58, 60, 63, 64, 66–69, 71, 73, 75, 77, 92, 101, 104–106, 116, 117, 124, 143, 145, 224, 233, 266, 271, 279, 282, 283, 285, 287, 290, 293 energeticism, 44, 47–49, 77, 79, 94, 112, 114–117, 240, 242, 281, 287, 290 energy, 179 localization in space, 77, 78, 114 energy conservation, 14, 16, 17, 22, 25, 26, 28, 47, 68, 71, 77, 90, 220, 239, 249, 268, 286 engagement, 56 entropy, 214 ether, 31, 34–36, 39, 42, 43, 45, 46, 54, 55, 58, 64, 66, 72–77, 99, 117, 140–143, 145, 193, 229, 266, 277, 280, 282, 283, 285, 290 Euclidean structure, 21 Euler, Leonhard (1707–1783), 10, 12, 15, 23, 24, 26, 144, 200, 210 Euler–Lagrange equations, see Lagrange’s equations existence problem, 124 extension, 284–286
Ferrers, Norman Macleod (1829–1903), 243, 244, 246 field theory, 41, 43, 55, 64, 67, 68, 70–73 final causes, 21 FitzGerald, George Francis (1851–1901), 57, 278 Fizeau, Armand H.L. (1819–1896), 58 force, 3, 4, 7, 9–14, 20, 24, 25, 41–43, 45, 47, 51, 52, 67, 69, 71, 112, 219, 220, 222–224, 265–267 force-producing models, 274, 277 foundations of mechanics, 20 free system, 196, 198, 201, 202, 204, 206, 207 Fresnel, Augustin (1788–1827), 30, 31, 41, 58 fundamental law of motion, 1, 3, 15, 64, 109, 198–201, 204, 206, 263–265, 269–273, 292 Galilei, Galileo (1564–1642), 9, 149 Gassendi, Peter (1592–1666), 40 Gauss’ principle of least constraint, see principle of least constraint Gauss, Carl Friedrich (1777–1855), 13–15, 21, 22, 27, 104, 111, 112, 129, 154, 160, 184, 199, 200, 206, 220, 223, 248, 252–257, 259 gay garment, 66, 101, 102, 138, 291 generalized coordinate, 13, 147 generalized momentum, 18, 179, 182 geodesic, 4, 171, 235–237, 239, 248, 252–256, 258–261 geometrization of mechanics, 159, 161, 252–262 geometry, v, vi, 1, 20–22, 50, 51, 56, 91, 92, 98, 110, 121, 122, 125, 127–134, 138, 153–155, 158–160, 170, 171, 252, 255, 260, 287, 289 geometry of systems of points, 7, 51, 80, 93, 107, 110, 127, 130, 135, 139, 148, 149, 154–162, 171, 173, 176, 178, 180, 184, 199, 201, 235, 236, 262, 286, 291, 292 gravitation, 9, 13, 14, 20, 24, 40, 55, 69, 71–73, 113, 116, 134, 143, 260, 266, 276, 285, 287, 290, 293
gravitational mass, 24, 41, 76, 141, 142, 144 Green, George (1793–1841), 13, 36 guided system, 4, 219, 220, 223 Hadamard, Jacques (1865–1963), 96, 245 Hamel, Georg (1877–1954), 245 Hamilton formalism, 16 Hamilton’s equations, 18–20, 81, 182–185, 205, 208, 229, 292 Hamilton’s principle, 4, 17, 26, 27, 47, 73, 79, 91, 92, 94, 96, 112, 114, 115, 160, 217, 235, 238–242, 244, 245 Hamilton, William Rowan (1805–1865), 16, 17, 19, 26, 41, 188, 254, 255, 257 Hamilton–Jacobi formalism, 4, 16, 81, 155, 248, 254–258, 262, 293 Heaviside, Oliver (1850–1925), 58 Heidelberg lecture, 72, 73, 75, 76 Helm, Georg Ferdinand (1851–1923), 48, 281 Helmholtz, Hermann von (1821–1894), 4, 7, 14, 22, 25, 27, 28, 31, 38, 44, 45, 47, 48, 51–54, 56, 57, 60, 65, 66, 68–70, 79, 80, 83, 85, 86, 89, 124, 125, 129, 130, 133, 147, 155, 160, 162, 184, 191, 208, 210, 213–217, 225, 226, 268–270, 278–280, 290 Hertz, Heinrich Rudolf (1857–1894) biography, 50 hidden mass, 3, 46, 76, 79, 104, 109, 115, 117, 127, 137, 141–144, 157, 227, 228, 232, 233, 274, 281, 285–288 hidden non-holonomic connection, 229 Hilbert, David (1862–1943), 21, 88, 94, 128, 130, 289 Hölder, Otto (1859–1937), 27, 240, 241, 243–246, 293 holonomic constraint, 11 holonomic system, 3, 4, 11, 171, 172, 192, 194–196, 202, 205, 228, 229, 237–241, 243, 247, 259, 261, 287 Huygens, Christiaan (1629–1695), 9 hydrodynamics, 3, 55 illness, 61 image, 2, 55, 83–118, 288, 289, 291, 292
impact, 286–289 inertial frame, 23, 132 inertial mass, 24, 52, 142–144, 282, 287 integral principles, 4, 21, 172, 202, 220, 235–237, 239–241, 243, 293 internal connection, 192 isocyclic system, 226 Jacobi, Carl Gustav Jacob (1804–1851), 15, 16, 19, 20, 26, 27, 90, 91, 248, 249, 255, 259, 261 Jolly, Philipp von (1809–1884), 51 Joule, James P. (1818–1884), 31 Jourdain, Philip E.B. (1879–1919), 27, 241, 244 Kant, Immanuel (1724–1804), 20, 51, 67, 99, 121, 123, 124, 129, 136, 267 Kantianism (neo-), 2, 5, 22, 25, 31, 51, 86, 119, 121–126, 129, 131, 132, 134, 155, 160, 291 Karlsruhe Polytechnicum, 56 Kiel lectures, 46, 55, 65, 67, 68, 71, 97, 99–102, 123, 136, 137, 140–142, 144, 145 Kiel University, 54 kinematics, 119, 125, 178 Kirchhoff, Gustav Robert (1824–1887), 24, 25, 27, 46, 47, 52–54, 57, 104 Klein, Felix (1849–1925), 59, 77, 129, 290 Koenigsberger, Leo (1837–1921), 28, 51 Korteweg, Diederik (1848–1941), 245 Kummer, Ernst (1810–1893), 52 Lagrange formalism, 35, 90 Lagrange multiplier, 12, 13, 203, 221, 244, 292 Lagrange’s equations, 13, 17, 36, 47, 205, 209, 210, 218, 243–246 Lagrange, Joseph Louis (1736–1813), 10, 12, 13, 15, 18, 21, 26, 27, 37, 41, 51, 112, 159, 173, 179, 184, 209, 244 Lagrangian, 14, 36, 179, 208–210, 213–215, 245
Laplace, Pierre Simon de (1749–1827), 13, 31, 40, 72, 188 Lavoisier, Antoine Laurent (1743–1794), 31 least action (principle of), see principle of least action Leibniz, Gottfried Wilhelm (1646–1716), 10, 14, 22, 51 Lenard, Philipp (1862–1947), 6, 58, 59, 62, 81, 278, 283, 284, 290 Levi-Civita, Tullio (1873–1941), 183 Lindelöff, Ernst (1870–1946), 245 line element, 1–3, 105, 110, 127, 146–150, 152–154, 157, 158, 163, 170, 172, 176, 179, 185, 198, 225, 252, 254–261, 292 Liouville, Joseph (1809–1882), v, 16, 159, 209, 255–257, 259, 260, 274 Liouville, Roger (1856–1930), 217, 218 Lipschitz, Rudolf (1832–1903), 16, 25, 73, 159, 160, 184, 186, 252, 255–262, 287, 288 live force (principle of), 14 living systems, 263, 267–270 Lobachevsky, Nikolai Ivanovich (1793–1856), 22, 129 Lodge, Oliver (1851–1940), 27, 57 Lorentz, Hendrik Antoon (1853–1928), 39, 58, 284, 286 Lorenz, Ludwig (1829–1891), 44 Love, Augustus Edward Hough (1863–1940), 28 München Polytechnikum, 51 MacCullagh, James (1809–1847), 36 Mach, Ernst (1838–1916), 23–25, 27, 28, 38, 39, 47, 66, 112, 120, 124, 142, 143, 281, 282 machine, 224 Malus, Etienne Louis (1775–1812), 254, 255, 257, 259 manuscript, see drafts of Hertz’s Mechanics Marconi, Guglielmo (1874–1937), 58 mass, 1, 3, 20, 23–26, 40, 52, 72, 76, 77, 79, 104, 105, 111, 112, 114, 115, 124, 125, 133–144, 146, 148, 150–154, 156–158, 161–163, 166, 231, 232, 282, 283, 288, 292, 293
Massenteilchen, 127, 135–140, 142, 144–146, 148–154, 156–158, 185, 233, 289, 292 material system, 189 mathematical energy, 228 mathematical form, 107 matter, 1, 23–25, 34, 35, 40, 44–47, 55, 71, 74, 76, 90, 97–99, 105, 106, 116, 123, 127, 129, 134–146, 149–154, 156–158, 290 Maupertuis, Pierre Louis Moreau (1698–1759), 15, 21, 200 Maxwell’s equations, 58, 64 Maxwell, James Clerk (1831–1879), 23, 28, 31–39, 42–45, 47, 52, 53, 64, 66, 70–72, 84, 90–92, 114, 156, 268, 279 Mayer, Christian Gustav Adolph (1839–1908), 27 mechanistic philosophy, 30, 31, 34, 38–40, 47–49, 55, 66, 75, 93, 145, 268–270, 279–284, 287, 293 Michelson, Albert A. (1852–1931), 58 Minding, Ferdinand (1806–1865), 159, 256 Minkowski, Hermann (1864–1909), 160 model, 108, 109, 271, 272 modified Lagrangian, 208, 210–212 momentum, 178 monocyclic system, 48, 216, 225 Morley, Edward W. (1838–1923), 58 natura non facit saltus, 192 neighboring displacement, 167, 168 Neumann, Carl (1832–1925), 23, 25, 28, 43, 44, 48, 69, 244, 245 Newton’s first law, 9, 23, 26, 160, 198, 199 Newton’s second law, 9, 10, 13, 24, 26, 188 Newton’s third law, 9, 24, 26, 112 Newton, Sir Isaac (1642–1727), 8, 9, 20–24, 30, 40, 51, 84, 144, 159, 198, 282 Newtonian–Laplacian image, 112 non-Euclidean geometry, 22, 51, 129, 130, 133, 158 non-holonomic constraint, 11 non-holonomic system, 2–4, 11, 27, 65, 79, 94, 114, 192, 194–196, 229, 235, 237, 239–246, 286, 293 normal connection, 192
Ørsted, Hans Christian (1777–1851), 42 optics, 15, 16, 30, 52, 55, 76, 252, 255 Ostrogradsky, Mikhail V. (1801–1862), 26 Ostwald, Friedrich Wilhelm (1853–1932), 28, 48, 281 paper money, 99 parallel transport, 164, 166 parameter, 225 Paulus, Franz Xaver (1895–1949), 274, 276, 277, 284, 288 permissibility, 5, 87, 88, 119, 120, 122, 126, 265, 266, 270, 272, 288, 289, 291, 292 Petersen, Julius (1839–1910), 28, 88 picture, see image Planck, Max (1858–1947), 28, 47, 55 Poincaré, Henri (1854–1912), 22, 28, 124, 129–132, 273, 280, 288 Poisson, Simeon-Denis (1781–1840), 18, 41, 188 possible displacement, 105, 177, 181, 190, 194, 195, 222 potential, 13 potential energy, 4, 14, 37, 44, 47, 79, 80, 114, 157, 179, 209–211, 213, 214, 217, 218, 227–230, 232, 259–261, 274–277, 291, 293 Poynting, John Henry (1852–1914), 77, 78, 290 practical application, 91, 263, 264 principal function, 17, 19, 249, 250 principle of areas, 10, 206, 207, 223 principle of least action, 4, 15, 16, 21, 26, 27, 48, 79, 111, 160, 200, 235, 238–241, 255, 259 Jacobi’s version, 16 principle of least constraint, 15, 21, 104, 111, 154, 160, 199, 206, 220 principle of the center of mass (or gravity), 9, 10, 207, 223 principle of virtual work, 11, 12, 26 principles of mechanics, 8, 10, 20, 26, 111 prize problem, 52 publication of the ‘Mechanics’, 61 pure geometry, 128–130
Réthy, Moritz (1846–1925), 27 radio wave, 57 Rankine, William John Macquorn (1820–1872), 47 reception, 278–284 reduced component, 173–177, 179, 181–185, 261, 292 reductionism, 267 Reech, Ferdinand (1805–1884), 25 Reiff, Richard A. (1855–1908), 286 representation of image, 106 Ricci-Curbastro, Gregorio (1853–1925), 183, 287 Riemann, Bernhard (1826–1866), 22, 44, 129, 154, 155, 159, 160, 171, 184, 256 Riemannian geometry, 6, 155, 156, 160, 175, 184, 288 Riemannian metric, 3, 259, 260 Rodrigues, Olinde (1794–1851), 26, 27 rolling motion, 11, 193, 195 Routh, Edward John (1831–1907), 27, 28, 37, 208–211, 213, 216, 244–246 Rumford, Count (Thompson, Benjamin) (1753–1814), 31, 57
Saint-Venant, Barré de (1797–1886), 23, 24, 144 Samuelson, Paul Anthony (born 1915), 276, 277 Schläfli, Ludwig (1814–1895), 256 school of the thread, 25 Schrödinger, Erwin (1887–1961), 268, 288 scientific representation, 107, 119, 120, 122, 291 Serret, Joseph Alfred (1819–1885), 159, 255 shortest distance, 262 shortest path, 236, 237 sign, 85 simplicity, 5, 90, 92–94, 96, 102–106, 109, 111, 146, 289, 291 Sommerfeld, Arnold (1868–1951), 154 space, 2, 3, 20, 22, 23, 26, 40, 46, 51, 57, 67, 79, 91, 106, 110–112, 114, 115, 121, 123–125, 127–134, 137, 142, 154–156, 200, 252, 260, 279, 281, 292 standard example, 211 straightest distance, 247–249
straightest path, 1, 3, 4, 79, 148, 198, 200–204, 235–237, 239, 247–249, 262, 269, 286, 287 straightest-path, 198 system acted on by forces, 4, 219, 220, 223, 225
Tait, Peter Guthrie (1831–1901), 23, 27, 28, 36, 37, 45, 47, 48, 112, 125 teleology, 267 tensor, 173, 183, 184, 186, 287 theory, 107 thermodynamics, 213 Thomson, Joseph John (1856–1940), 28, 38, 54, 160, 210, 217, 218, 230 Thomson, William (Lord Kelvin) (1824–1907), 23, 27, 28, 31–34, 36, 37, 42, 45, 47, 48, 66, 76, 84, 91, 110, 112, 114, 125, 156, 268 time, 3, 20–23, 27, 51, 79, 90, 106, 111, 112, 114, 115, 123, 124, 127, 131, 133, 134 uniform motion, 3, 9, 198 Unsöld, Albrecht O.J. (b. 1905), 287
vanishing material points, 149, 151 variational principle, 15, 17, 21, 238–242, 259 Varignon, Pierre (1654–1722), 10 vector quantity, 3, 4, 163, 173, 176–178, 180, 181, 184, 185, 206, 220–223, 261, 292 velocity, 178 Vierkandt, Alfred, 245 virtual displacement, 206 virtual work (principle of), 11 vital force, 269 vitalism, 267 Voss, Aurel Edmund (1845–1931), 27, 28, 184, 282
Weber, Wilhelm (1804–1891), 33, 42–45, 52, 53, 64, 65, 68 Weierstrass, Karl (1815–1897), 54 Wien, Wilhelm (1864–1928), 39, 143, 282, 287 Wittgenstein, Ludwig (1889–1951), 289