... If $\phi = \pi/2$, the equation becomes simply $\cos\phi' = -v/c$. We still have to find the amplitude of the waves, as it appears in the moving system. If we call the amplitude of the electric or magnetic force $A$ or $A'$ respectively, accordingly as it is measured in the stationary system or in the moving system, we obtain
\[ A'^2 = A^2\,\frac{(1-\cos\phi\cdot v/c)^2}{1-v^2/c^2}, \]
which equation, if $\phi = 0$, simplifies into
\[ A'^2 = A^2\,\frac{1-v/c}{1+v/c}. \]
It follows from these results that to an observer approaching a source of light with the velocity c, this source of light must appear of infinite intensity.
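The amplitude law can be explored numerically. The following Python sketch is only an illustration: the function name `amplitude_ratio_sq`, the variable `v_over_c` and the sample velocities are assumptions introduced here, not part of the text. It checks that the general law reduces to the special case for $\phi = 0$ and shows the ratio growing without bound as $v/c \to -1$, which in the sign convention of the formula corresponds to an observer approaching the source.

```python
import math


def amplitude_ratio_sq(cos_phi: float, v_over_c: float) -> float:
    """Return A'^2 / A^2 = (1 - cos(phi) * v/c)^2 / (1 - v^2/c^2)."""
    if not -1.0 < v_over_c < 1.0:
        raise ValueError("v/c must lie strictly between -1 and 1")
    return (1.0 - cos_phi * v_over_c) ** 2 / (1.0 - v_over_c ** 2)


# For phi = 0 the general law reduces to (1 - v/c) / (1 + v/c).
for v_over_c in (-0.9, -0.5, 0.0, 0.5, 0.9):
    special_case = (1.0 - v_over_c) / (1.0 + v_over_c)
    assert math.isclose(amplitude_ratio_sq(1.0, v_over_c), special_case)

# As v/c approaches -1 the ratio increases without limit.
for v_over_c in (-0.9, -0.99, -0.999):
    print(v_over_c, amplitude_ratio_sq(1.0, v_over_c))
```

The guard on `v_over_c` reflects the fact that the formula itself is singular at $|v| = c$; the sketch therefore only approaches the limit and never evaluates it.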
Lorentz and Poincaré Invariance

§ 8. Transformation of the Energy of Light Rays. Theory of the Pressure of Radiation Exerted on Perfect Reflectors

Since $A^2/8\pi$ equals the energy of light per unit of volume, we have to regard $A'^2/8\pi$, by the principle of relativity, as the energy of light in the moving system. Thus $A'^2/A^2$ would be the ratio of the "measured in motion" to the "measured at rest" energy of a given light complex, if the volume of a light complex were the same, whether measured in K or in k. But this is not the case. If $l, m, n$ are the direction-cosines of the wave-normal of the light in the stationary system, no energy passes through the surface elements of a spherical surface moving with the velocity of light:
\[ (x-lct)^2 + (y-mct)^2 + (z-nct)^2 = R^2. \]
We may therefore say that this surface permanently encloses the same light complex. We inquire as to the quantity of energy enclosed by this surface, viewed in system k, that is, as to the energy of the light complex relatively to the system k. The spherical surface, viewed in the moving system, is an ellipsoidal surface, the equation for which, at the time $\tau = 0$, is
\[ (\beta\xi - l\beta\xi v/c)^2 + (\eta - m\beta\xi v/c)^2 + (\zeta - n\beta\xi v/c)^2 = R^2. \]
If $S$ is the volume of the sphere, and $S'$ that of this ellipsoid, then by a simple calculation
\[ \frac{S'}{S} = \frac{\sqrt{1-v^2/c^2}}{1-\cos\phi\cdot v/c}. \]
Thus, if we call the light energy enclosed by this surface $E$ when it is measured in the stationary system, and $E'$ when measured in the moving system, we obtain
\[ \frac{E'}{E} = \frac{A'^2 S'}{A^2 S} = \frac{1-\cos\phi\cdot v/c}{\sqrt{1-v^2/c^2}}, \]
and this formula, when $\phi = 0$, simplifies into
\[ \frac{E'}{E} = \sqrt{\frac{1-v/c}{1+v/c}}. \]
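The energy ratio is the product of the amplitude ratio and the volume ratio. A minimal Python sketch can confirm that the three expressions fit together; the function names and the sampled values of $v/c$ and $\cos\phi$ are assumptions chosen for illustration.

```python
import math


def amplitude_ratio_sq(cos_phi: float, v_over_c: float) -> float:
    """A'^2 / A^2 = (1 - cos(phi) v/c)^2 / (1 - v^2/c^2)."""
    return (1.0 - cos_phi * v_over_c) ** 2 / (1.0 - v_over_c ** 2)


def volume_ratio(cos_phi: float, v_over_c: float) -> float:
    """S'/S = sqrt(1 - v^2/c^2) / (1 - cos(phi) v/c)."""
    return math.sqrt(1.0 - v_over_c ** 2) / (1.0 - cos_phi * v_over_c)


def energy_ratio(cos_phi: float, v_over_c: float) -> float:
    """E'/E = (1 - cos(phi) v/c) / sqrt(1 - v^2/c^2)."""
    return (1.0 - cos_phi * v_over_c) / math.sqrt(1.0 - v_over_c ** 2)


# E'/E must equal (A'^2 / A^2) * (S'/S) for every direction and velocity.
for v_over_c in (-0.8, -0.3, 0.0, 0.3, 0.8):
    for cos_phi in (-1.0, -0.5, 0.0, 0.5, 1.0):
        product = amplitude_ratio_sq(cos_phi, v_over_c) * volume_ratio(cos_phi, v_over_c)
        assert math.isclose(product, energy_ratio(cos_phi, v_over_c))

# Special case phi = 0: E'/E = sqrt((1 - v/c) / (1 + v/c)).
assert math.isclose(energy_ratio(1.0, 0.6), math.sqrt(0.4 / 1.6))
```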
It is remarkable that the energy and the frequency of a light complex vary with the state of motion of the observer in accordance with the same law.

Chap. 2. Special Relativity

Now let the co-ordinate plane $\xi = 0$ be a perfectly reflecting surface, at which the plane waves considered in § 7 are reflected. We seek for the pressure of light exerted on the reflecting surface, and for the direction, frequency, and intensity of the light after reflexion. Let the incident light be defined by the quantities $A$, $\cos\phi$, $\nu$ (referred to system K). Viewed from k the corresponding quantities are
\[ A' = A\,\frac{1-\cos\phi\cdot v/c}{\sqrt{1-v^2/c^2}}, \qquad \cos\phi' = \frac{\cos\phi - v/c}{1-\cos\phi\cdot v/c}, \qquad \nu' = \nu\,\frac{1-\cos\phi\cdot v/c}{\sqrt{1-v^2/c^2}}. \]
For the reflected light, referring the process to system k, we obtain
\[ A'' = A', \qquad \cos\phi'' = -\cos\phi', \qquad \nu'' = \nu'. \]
Finally, by transforming back to the stationary system K, we obtain for the reflected light
\[ A''' = A''\,\frac{1+\cos\phi''\cdot v/c}{\sqrt{1-v^2/c^2}} = A\,\frac{1-2\cos\phi\cdot v/c+v^2/c^2}{1-v^2/c^2}, \]
\[ \cos\phi''' = \frac{\cos\phi''+v/c}{1+\cos\phi''\cdot v/c} = -\frac{(1+v^2/c^2)\cos\phi-2v/c}{1-2\cos\phi\cdot v/c+v^2/c^2}, \]
\[ \nu''' = \nu''\,\frac{1+\cos\phi''\cdot v/c}{\sqrt{1-v^2/c^2}} = \nu\,\frac{1-2\cos\phi\cdot v/c+v^2/c^2}{1-v^2/c^2}. \]
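The three steps of the reflexion argument (transform to k, reflect, transform back to K) can be followed numerically. The Python sketch below is an illustration under stated assumptions: the helper names `to_moving` and `reflect_from_moving_mirror`, the choice of units with $c = 1$, and the sample values are all introduced here. The transformation back to K is taken to be the same transformation with $v$ replaced by $-v$.

```python
import math


def to_moving(A, cos_phi, nu, v_over_c):
    """Transform (A, cos phi, nu) to a system moving with velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v_over_c ** 2)
    d = 1.0 - cos_phi * v_over_c
    return A * d * g, (cos_phi - v_over_c) / d, nu * d * g


def reflect_from_moving_mirror(A, cos_phi, nu, v_over_c):
    """Return (A''', cos phi''', nu''') for light reflected by the moving mirror."""
    A1, cos1, nu1 = to_moving(A, cos_phi, nu, v_over_c)  # K -> k
    A2, cos2, nu2 = A1, -cos1, nu1                        # reflexion at rest in k
    return to_moving(A2, cos2, nu2, -v_over_c)            # k -> K (v -> -v)


# Compare the step-by-step result with the closed forms for the reflected light.
for v_over_c in (-0.5, 0.0, 0.2, 0.5):
    for cos_phi in (0.6, 0.8, 1.0):
        A3, cos3, nu3 = reflect_from_moving_mirror(1.0, cos_phi, 1.0, v_over_c)
        n = 1.0 - 2.0 * cos_phi * v_over_c + v_over_c ** 2
        assert math.isclose(A3, n / (1.0 - v_over_c ** 2))
        assert math.isclose(nu3, n / (1.0 - v_over_c ** 2))
        expected_cos = -((1.0 + v_over_c ** 2) * cos_phi - 2.0 * v_over_c) / n
        assert math.isclose(cos3, expected_cos, abs_tol=1e-12)
```

Amplitude and frequency pick up the same factor at every step, which is why `A3` and `nu3` agree when both start from 1.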
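The energy (measured in the stationary system) which is incident upon unit area of the mirror in unit time is evidently $A^2(c\cos\phi - v)/8\pi$. The energy leaving the unit of surface of the mirror in the unit of time is $A'''^2(-c\cos\phi''' + v)/8\pi$. The difference of these two expressions is, by the principle of energy, the work done by the pressure of light in the unit of time. If we set down this work as equal to the product $Pv$, where $P$ is the pressure of light, we obtain
\[ P = 2\cdot\frac{A^2}{8\pi}\,\frac{(\cos\phi - v/c)^2}{1-v^2/c^2}. \]
In agreement with experiment and with other theories, we obtain to a first approximation
\[ P = 2\cdot\frac{A^2}{8\pi}\cos^2\phi. \]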
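The pressure formula can be checked against an energy balance. The sketch below assumes, following Einstein's derivation, that the energy arriving on unit area of the mirror per unit time is $A^2(c\cos\phi - v)/8\pi$, that the energy leaving is $A'''^2(-c\cos\phi''' + v)/8\pi$, and that their difference equals $Pv$. The function names and the default $c = 1$ are assumptions made for illustration.

```python
import math


def reflected(A, cos_phi, v_over_c):
    """Closed forms for the reflected amplitude A''' and direction cosine cos phi'''."""
    n = 1.0 - 2.0 * cos_phi * v_over_c + v_over_c ** 2
    A3 = A * n / (1.0 - v_over_c ** 2)
    cos3 = -((1.0 + v_over_c ** 2) * cos_phi - 2.0 * v_over_c) / n
    return A3, cos3


def pressure(A, cos_phi, v_over_c):
    """P = 2 (A^2 / 8 pi) (cos phi - v/c)^2 / (1 - v^2/c^2)."""
    return 2.0 * (A ** 2 / (8.0 * math.pi)) * (cos_phi - v_over_c) ** 2 / (1.0 - v_over_c ** 2)


def power_difference(A, cos_phi, v_over_c, c=1.0):
    """Incident minus reflected energy per unit mirror area and unit time."""
    v = v_over_c * c
    A3, cos3 = reflected(A, cos_phi, v_over_c)
    incident = A ** 2 * (c * cos_phi - v) / (8.0 * math.pi)
    leaving = A3 ** 2 * (-c * cos3 + v) / (8.0 * math.pi)
    return incident - leaving


# The work done per unit time, P * v, must equal the energy difference.
for v_over_c in (0.05, 0.2, 0.5):
    for cos_phi in (0.8, 0.9, 1.0):
        work_rate = pressure(1.0, cos_phi, v_over_c) * v_over_c
        assert math.isclose(power_difference(1.0, cos_phi, v_over_c), work_rate)

# First approximation (v much smaller than c): P = 2 (A^2 / 8 pi) cos^2(phi).
assert math.isclose(pressure(1.0, 0.8, 0.0), 2.0 * 0.64 / (8.0 * math.pi))
```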
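All problems in the optics of moving bodies can be solved by the method here employed. What is essential is that the electric and magnetic force of the light which is influenced by a moving body be transformed into a system of co-ordinates at rest relatively to the body. By this means all problems in the optics of moving bodies will be reduced to a series of problems in the optics of stationary bodies.

§ 9. Transformation of the Maxwell-Hertz Equations when Convection-Currents are Taken into Account

We start from the equations
\[ \begin{aligned}
\frac{1}{c}\left\{\frac{\partial X}{\partial t} + u_x\rho\right\} &= \frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}, &\qquad \frac{1}{c}\frac{\partial L}{\partial t} &= \frac{\partial Y}{\partial z} - \frac{\partial Z}{\partial y}, \\
\frac{1}{c}\left\{\frac{\partial Y}{\partial t} + u_y\rho\right\} &= \frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}, &\qquad \frac{1}{c}\frac{\partial M}{\partial t} &= \frac{\partial Z}{\partial x} - \frac{\partial X}{\partial z}, \\
\frac{1}{c}\left\{\frac{\partial Z}{\partial t} + u_z\rho\right\} &= \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}, &\qquad \frac{1}{c}\frac{\partial N}{\partial t} &= \frac{\partial X}{\partial y} - \frac{\partial Y}{\partial x},
\end{aligned} \]
where
\[ \rho = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z} \]
denotes $4\pi$ times the density of electricity, and $(u_x, u_y, u_z)$ the velocity-vector of the charge. If we imagine the electric charges to be invariably coupled to small rigid bodies (ions, electrons), these equations are the electromagnetic basis of the Lorentzian electrodynamics and optics of moving bodies. Let these equations be valid in the system K, and transform them, with the assistance of the equations of transformation given in §§ 3 and 6, to the system k. We then obtain the equations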
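\[ \begin{aligned}
\frac{1}{c}\left\{\frac{\partial X'}{\partial \tau} + u_\xi\rho'\right\} &= \frac{\partial N'}{\partial \eta} - \frac{\partial M'}{\partial \zeta}, &\qquad \frac{1}{c}\frac{\partial L'}{\partial \tau} &= \frac{\partial Y'}{\partial \zeta} - \frac{\partial Z'}{\partial \eta}, \\
\frac{1}{c}\left\{\frac{\partial Y'}{\partial \tau} + u_\eta\rho'\right\} &= \frac{\partial L'}{\partial \zeta} - \frac{\partial N'}{\partial \xi}, &\qquad \frac{1}{c}\frac{\partial M'}{\partial \tau} &= \frac{\partial Z'}{\partial \xi} - \frac{\partial X'}{\partial \zeta}, \\
\frac{1}{c}\left\{\frac{\partial Z'}{\partial \tau} + u_\zeta\rho'\right\} &= \frac{\partial M'}{\partial \xi} - \frac{\partial L'}{\partial \eta}, &\qquad \frac{1}{c}\frac{\partial N'}{\partial \tau} &= \frac{\partial X'}{\partial \eta} - \frac{\partial Y'}{\partial \xi},
\end{aligned} \]
where
\[ u_\xi = \frac{u_x - v}{1 - u_x v/c^2}, \qquad u_\eta = \frac{u_y}{\beta(1 - u_x v/c^2)}, \qquad u_\zeta = \frac{u_z}{\beta(1 - u_x v/c^2)}, \]
and
\[ \rho' = \frac{\partial X'}{\partial \xi} + \frac{\partial Y'}{\partial \eta} + \frac{\partial Z'}{\partial \zeta} = \beta(1 - u_x v/c^2)\,\rho. \]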
Since, as follows from the theorem of addition of velocities (§ 5), the vector $(u_\xi, u_\eta, u_\zeta)$ is nothing else than the velocity of the electric charge, measured in the system k, we have the proof that, on the basis of our kinematical principles, the electrodynamic foundation of Lorentz's theory of the electrodynamics of moving bodies is in agreement with the principle of relativity. In addition I may briefly remark that the following important law may easily be deduced from the developed equations: If an electrically charged body is in motion anywhere in space without altering its charge when regarded from a system of co-ordinates moving with the body, its charge also remains constant when regarded from the "stationary" system K.
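The velocity transformation used in this section lends itself to a short numerical illustration. In the Python sketch below, the function name `transform_velocity`, the default $c = 1$ and the sample velocities are assumptions made for the example; `beta` follows the paper's usage, $\beta = 1/\sqrt{1 - v^2/c^2}$.

```python
import math


def transform_velocity(ux, uy, uz, v, c=1.0):
    """Velocity of the charge in k, given its velocity (ux, uy, uz) in K.

    k moves with velocity v along the x-axis of K.
    """
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)  # the paper's beta, not v/c
    denom = 1.0 - ux * v / c ** 2
    return (ux - v) / denom, uy / (beta * denom), uz / (beta * denom)


# A charge moving with the system k is at rest in k.
print(transform_velocity(0.5, 0.0, 0.0, 0.5))

# A signal travelling at c in K also travels at c in k.
u_xi, u_eta, u_zeta = transform_velocity(1.0, 0.0, 0.0, 0.9)
assert math.isclose(u_xi, 1.0)

# A sub-luminal velocity stays sub-luminal.
u_xi, u_eta, u_zeta = transform_velocity(0.0, 0.6, 0.0, 0.8)
assert u_xi ** 2 + u_eta ** 2 + u_zeta ** 2 < 1.0
```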
§ 10. Dynamics of the Slowly Accelerated Electron

Let there be in motion in an electromagnetic field an electrically charged particle of charge $e$ (in the sequel called an "electron"), for the law of motion of which we assume as follows: If the electron is at rest at a given epoch, the motion of the electron ensues in the next instant of time according to the equations
\[ m\frac{d^2x}{dt^2} = eX, \qquad m\frac{d^2y}{dt^2} = eY, \qquad m\frac{d^2z}{dt^2} = eZ, \]
where $x, y, z$ denote the co-ordinates of the electron, and $m$ the mass of the electron, as long as its motion is slow.

Now, secondly, let the velocity of the electron at a given epoch be $v$. We seek the law of motion of the electron in the immediately ensuing instants of time. Without affecting the general character of our considerations, we may and will assume that the electron, at the moment when we give it our attention, is at the origin of the co-ordinates, and moves with the velocity $v$ along the axis of X of the system K. It is then clear that at the given moment ($t = 0$) the electron is at rest relatively to a system of co-ordinates k which is in parallel motion with constant velocity $v$ along the axis of X. From the above assumption, in combination with the principle of relativity, it is clear that in the immediately ensuing time (for small values of $t$) the electron, viewed from the system k, moves in accordance with the equations
\[ m\frac{d^2\xi}{d\tau^2} = eX', \qquad m\frac{d^2\eta}{d\tau^2} = eY', \qquad m\frac{d^2\zeta}{d\tau^2} = eZ', \]
in which the symbols $\xi, \eta, \zeta, \tau, X', Y', Z'$ refer to the system k. If, further, we decide that when $t = x = y = z = 0$ then $\tau = \xi = \eta = \zeta = 0$, the transformation equations of §§ 3 and 6 hold good, so that we have
\[ \xi = \beta(x - vt), \quad \eta = y, \quad \zeta = z, \quad \tau = \beta(t - vx/c^2), \]
\[ X' = X, \quad Y' = \beta(Y - vN/c), \quad Z' = \beta(Z + vM/c). \]
With the help of these equations we transform the above equations of motion from system k to system K, and obtain
\[ \frac{d^2x}{dt^2} = \frac{e}{m}\frac{1}{\beta^3}X, \qquad \frac{d^2y}{dt^2} = \frac{e}{m}\frac{1}{\beta}\left(Y - \frac{v}{c}N\right), \qquad \frac{d^2z}{dt^2} = \frac{e}{m}\frac{1}{\beta}\left(Z + \frac{v}{c}M\right). \tag{A} \]
Taking the ordinary point of view we now inquire as to the "longitudinal" and the "transverse" mass of the moving electron.
We write the equations (A) in the form
\[ m\beta^3\frac{d^2x}{dt^2} = eX = eX', \qquad m\beta^2\frac{d^2y}{dt^2} = e\beta\left(Y - \frac{v}{c}N\right) = eY', \qquad m\beta^2\frac{d^2z}{dt^2} = e\beta\left(Z + \frac{v}{c}M\right) = eZ', \]
and remark firstly that $eX'$, $eY'$, $eZ'$ are the components of the ponderomotive force acting upon the electron, and are so indeed as viewed in a system moving at the moment with the electron, with the same velocity as the electron. (This force might be measured, for example, by a spring balance at rest in the last-mentioned system.) Now if we call this force simply "the force acting upon the electron,"* and maintain the equation mass × acceleration = force, and if we also decide that the accelerations are to be measured in the stationary system K, we derive from the above equations
\[ \text{Longitudinal mass} = \frac{m}{\left(\sqrt{1-v^2/c^2}\right)^3}, \qquad \text{Transverse mass} = \frac{m}{1-v^2/c^2}. \]
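To get a feeling for the size of the two masses under this particular definition of force, they can be tabulated for a few velocities. The Python sketch below is illustrative only; the function names and the sample values of $v/c$ are assumptions.

```python
import math


def longitudinal_mass(m: float, v_over_c: float) -> float:
    """m / (sqrt(1 - v^2/c^2))^3, with force and acceleration defined as in the text."""
    return m / (1.0 - v_over_c ** 2) ** 1.5


def transverse_mass(m: float, v_over_c: float) -> float:
    """m / (1 - v^2/c^2), with force and acceleration defined as in the text."""
    return m / (1.0 - v_over_c ** 2)


for v_over_c in (0.0, 0.1, 0.6, 0.9, 0.99):
    print(v_over_c, longitudinal_mass(1.0, v_over_c), transverse_mass(1.0, v_over_c))

# Both reduce to m for slow motion and grow without limit as v approaches c.
assert longitudinal_mass(1.0, 0.0) == 1.0 and transverse_mass(1.0, 0.0) == 1.0
assert math.isclose(transverse_mass(1.0, 0.6), 1.5625)
```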
With a different definition of force and acceleration we should naturally obtain other values for the masses. This shows us that in comparing different theories of the motion of the electron we must proceed very cautiously. We remark that these results as to the mass are also valid for ponderable material points, because a ponderable material point can be made into an electron (in our sense of the word) by the addition of an electric charge, no matter how small. We will now determine the kinetic energy of the electron. If an electron moves from rest at the origin of co-ordinates of the system K along the axis of X under the action of an electrostatic force X, it is clear that the energy withdrawn from the electrostatic field has the value
$\int eX\,dx$. As the electron is to be slowly accelerated, and consequently may not give off any energy in the form of radiation, the energy withdrawn from the electrostatic field must be put down as equal to the energy of motion $W$ of the electron. Bearing in mind that during the whole process of motion which we are considering, the first of the equations (A) applies, we therefore obtain
\[ W = \int eX\,dx = m\int_0^v \beta^3 v\,dv = mc^2\left\{\frac{1}{\sqrt{1-v^2/c^2}} - 1\right\}. \]
Thus, when $v = c$, $W$ becomes infinite. Velocities greater than that of light have, as in our previous results, no possibility of existence. This expression for the kinetic energy must also, by virtue of the argument stated above, apply to ponderable masses as well.

* The definition of force here given is not advantageous, as was first shown by M. Planck. It is more to the point to define force in such a way that the laws of momentum and energy assume the simplest form.

We will now enumerate the properties of the motion of the electron which result from the system of equations (A), and are accessible to experiment.

1. From the second equation of the system (A) it follows that an electric force $Y$ and a magnetic force $N$ have an equally strong deflective action on an electron moving with the velocity $v$, when $Y = Nv/c$. Thus we see that it is possible by our theory to determine the velocity of the electron from the ratio of the magnetic power of deflexion $A_m$ to the electric power of deflexion $A_e$, for any velocity, by applying the law
\[ \frac{A_m}{A_e} = \frac{v}{c}. \]
This relationship may be tested experimentally, since the velocity of the electron can be directly measured, e.g. by means of rapidly oscillating electric and magnetic fields.

2. From the deduction for the kinetic energy of the electron it follows that between the potential difference, $P$, traversed and the acquired velocity $v$ of the electron there must be the relationship
\[ P = \int X\,dx = \frac{m}{e}c^2\left\{\frac{1}{\sqrt{1-v^2/c^2}} - 1\right\}. \]
3. We calculate the radius of curvature $R$ of the path of the electron when a magnetic force $N$ is present (as the only deflective force), acting perpendicularly to the velocity of the electron. From the second of the equations (A) we obtain
\[ -\frac{d^2y}{dt^2} = \frac{v^2}{R} = \frac{e}{m}\frac{v}{c}N\sqrt{1-\frac{v^2}{c^2}}, \]
or
\[ R = \frac{mc^2}{e}\cdot\frac{v/c}{\sqrt{1-v^2/c^2}}\cdot\frac{1}{N}. \]
These three relationships are a complete expression for the laws according to which, by the theory here advanced, the electron must move.
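The kinetic-energy integral and the radius of curvature can both be evaluated numerically. The following Python sketch is an illustration under stated assumptions: units with $c = 1$ by default, a simple midpoint rule for the integral $m\int_0^v \beta^3 v\,dv$, and function names chosen for this example.

```python
import math


def kinetic_energy(m, v, c=1.0):
    """Closed form W = m c^2 (1 / sqrt(1 - v^2/c^2) - 1)."""
    return m * c ** 2 * (1.0 / math.sqrt(1.0 - (v / c) ** 2) - 1.0)


def kinetic_energy_by_quadrature(m, v, c=1.0, steps=10000):
    """Midpoint-rule estimate of m * integral_0^v beta(u)^3 u du."""
    h = v / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += (1.0 - (u / c) ** 2) ** -1.5 * u
    return m * total * h


def radius_of_curvature(m, e, v, N, c=1.0):
    """R = (m c^2 / e) * (v/c) / sqrt(1 - v^2/c^2) * (1/N)."""
    return (m * c ** 2 / e) * (v / c) / math.sqrt(1.0 - (v / c) ** 2) / N


# The integral reproduces the closed form.
assert math.isclose(kinetic_energy_by_quadrature(1.0, 0.6), kinetic_energy(1.0, 0.6), rel_tol=1e-6)

# For slow motion W approaches the Newtonian value m v^2 / 2.
assert math.isclose(kinetic_energy(1.0, 0.001), 0.5 * 0.001 ** 2, rel_tol=1e-5)

# W grows without limit as v approaches c.
for v in (0.9, 0.99, 0.999):
    print(v, kinetic_energy(1.0, v))
```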
In conclusion I wish to say that in working at the problem here dealt with I have had the loyal assistance of my friend and colleague M. Besso, and that I am indebted to him for several valuable suggestions.
"Concepts which have proved useful for ordering things easily assume so great an authority over us, that we forget their terrestrial origin and accept them as unalterable facts. They then become labeled as "conceptual necessities," "a priori situations," etc. The road of scientific progress is frequently blocked for long periods by such errors. It is therefore not just an idle game to exercise our ability to analyze familiar concepts, and to demonstrate the conditions on which their justification and usefulness depend, and the way in which these developed, little by little, from the data of experience. In this way they are deprived of their excessive authority."
Einstein (Phys. Zeitschr. 17, 101 (1916))
Einstein and Ehrenfest sharing their love of music. From a drawing by Maryke Kamerlingh Onnes.
The Principle of Relativity and the Fundamental Equations of Mechanics*
by Max Planck
(presented in the meeting on 23 March 1906)

The "principle of relativity", recently introduced by H. A. Lorentz¹ and, in a more general form, by A. Einstein², states that of two reference frames $(x, y, z, t)$ and $(x', y', z', t')$, related by
\[ x' = \frac{1}{\sqrt{1 - V^2/c^2}}\,(x - Vt), \qquad y' = y, \qquad z' = z, \qquad t' = \frac{1}{\sqrt{1 - V^2/c^2}}\left(t - \frac{V}{c^2}x\right) \tag{1} \]
($c$ velocity of light in vacuum) both can be used for the basic equations of mechanics and electrodynamics with the same justification. Neither frame can therefore be considered to be "at rest" (or preferred). If this statement proves true in general, it implies such a great simplification of all problems of the electrodynamics of moving bodies that the question of its validity deserves to be placed at the center of all theoretical research in this area.

Apparently, the most recent measurements by W. Kaufmann³ have solved this problem in a negative sense, so that any further investigation seems to be unnecessary. However, in view of the not-so-simple theory behind these experiments, I do not think one can entirely exclude the possibility that further developments of the principle of relativity prove to be compatible with the observations. Neither do I wish to give decisive weight to the consideration that, according to the principle of relativity, a moving electron would be subject to a special deformation work. After all, one can consider this work to be generally part of the kinetic energy of the electron. It is true that an electrodynamic explanation of inertia remains an open question; on the other hand, an advantage arises in that one need not ascribe a spherical shape or any other form to an electron to attain a certain dependency of inertia on velocity.

However that may be: a physical idea of such simplicity and generality as that contained in the principle of relativity deserves to be tested in more ways than one, and to be proven wrong if faulty. No better way of testing it can be found than searching for the consequences that it leads to. From this point of view, the following investigation might be of some use. It treats the problem of determining the form of the fundamental equations of motion of mechanics that have to replace the usual Newtonian equations for the motion of a free point particle
\[ m\ddot x = F_x, \qquad m\ddot y = F_y, \qquad m\ddot z = F_z \tag{2} \]

* Translated by T. Kleinschmidt and W. Kern, Univ. Heidelberg and Univ. of Massachusetts Dartmouth.
¹ H. A. Lorentz, Versl. Kon. Akad. v. Wet. Amsterdam 1904, S. 809.
² A. Einstein, Ann. d. Phys. (4) 17, 891, 1905.
³ W. Kaufmann, Sitzungsber. d. preuß. Akad. d. Wiss. 1905, S. 949; Ann. d. Phys. (4) 19, 487, 1906.
if the principle of relativity is taken into account. According to this principle the simple equations (2) only hold for a point at rest ($\dot x = 0$, $\dot y = 0$, $\dot z = 0$). For a point with a finite velocity
\[ v = \sqrt{\dot x^2 + \dot y^2 + \dot z^2} \tag{3} \]
they need to be generalized. It is true that for arbitrary values of $v$ one could simply define quantities $F_x, F_y, F_z$ as products of mass and acceleration and label them components of the force which causes the movement, as is the custom in many treatments of mechanics. However, the force defined in this manner would not have an independent physical meaning. In particular, it would lose its simple relationship with the potential energy. This is because the principle of relativity demands that in a "primed" reference frame defined by equations (1), the equations
\[ m\ddot x' = F'_x, \qquad m\ddot y' = F'_y, \qquad m\ddot z' = F'_z \]
be valid as well. But then the relations between $F_x, F_y, F_z$ and $F'_x, F'_y, F'_z$ would become quite complicated, as can be seen by deriving them from the relations between $\ddot x$ and $\ddot x'$ resulting from equation (1). A simple physical interpretation of these quantities becomes impossible.

In order to become familiar with the general relation existing between acceleration and moving force, it is advisable to start with a special case in which one knows the relationship between the components of the moving force in both reference frames. Such a case is an electromagnetic field in vacuum acting on a point particle with mass $m$ and charge $e$. In that case, we have the relations¹
\[ \begin{aligned}
E'_x &= E_x, & E'_y &= \frac{1}{\sqrt{1 - V^2/c^2}}\left(E_y - \frac{V}{c}H_z\right), & E'_z &= \frac{1}{\sqrt{1 - V^2/c^2}}\left(E_z + \frac{V}{c}H_y\right), \\
H'_x &= H_x, & H'_y &= \frac{1}{\sqrt{1 - V^2/c^2}}\left(H_y + \frac{V}{c}E_z\right), & H'_z &= \frac{1}{\sqrt{1 - V^2/c^2}}\left(H_z - \frac{V}{c}E_y\right)
\end{aligned} \tag{4} \]
for the electric and magnetic fields in both reference frames (1). Imagine a point particle at the origin of the "unprimed" reference frame $(x, y, z, t)$ with velocity components $\dot x, \dot y, \dot z$ in that frame. We ask for its equations of motion. This question has a unique answer when we imagine the point particle to be at the origin of a second reference frame which moves with constant velocity components $\dot x, \dot y, \dot z$ with respect to the original frame. Let the x-axis of this frame be along the velocity direction of the point particle, whose magnitude $v$ is given by (3). Then the particle is at rest in the new frame and the equations of motion in the new frame are valid in the simple form (2), where the force causing the motion is given by the product of electric charge $e$ and electric field strength. Now let us transform the equations of motion to a second frame in which the x-axis points again in the direction of the velocity, but which is at rest in the $(x, y, z, t)$ frame. On the one hand, equations (1) serve to transform the components of the acceleration; on the other hand, equations (4) serve for the force components, by simply replacing $V$ with $v$ everywhere. Finally, a simple rotation brings this frame into the $(x, y, z, t)$ frame. After performing all these elementary calculations one obtains the equations of motion in the form
\[ \frac{m\ddot x}{\sqrt{1 - v^2/c^2}} = eE_x - \frac{e\dot x}{c^2}\left(\dot xE_x + \dot yE_y + \dot zE_z\right) + \frac{e}{c}\left(\dot yH_z - \dot zH_y\right), \ \ldots \tag{5} \]

¹ A. Einstein, l. c., S. 909.
These three equations can be verified by considering that, according to the principle of relativity, the equations have to remain correct when replacing unprimed quantities by primed ones everywhere, while maintaining the constants $c$, $e$ and $m$. This is indeed confirmed in general because of the relations (1) and (4), for an arbitrary value of $V$. To simplify the equations of motion as much as possible, we multiply them with $\dot x$, $\dot y$ and $\dot z$, respectively, and add them. The result is
\[ e\left(\dot xE_x + \dot yE_y + \dot zE_z\right) = \frac{m\left(\dot x\ddot x + \dot y\ddot y + \dot z\ddot z\right)}{\left(1 - v^2/c^2\right)^{3/2}}. \]
Substituting this in (5) yields
\[ \frac{d}{dt}\left(\frac{m\dot x}{\sqrt{1 - v^2/c^2}}\right) = F_x, \ \text{etc.} \tag{6} \]
where we have used
\[ eE_x + \frac{e}{c}\left(\dot yH_z - \dot zH_y\right) = F_x, \ \ldots \]
These equations contain the solution of the problem posed above. They represent the generalization of the Newtonian equations of motion (2) demanded by the principle of relativity. Comparing them with the Lagrangian equations of motion
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) = F_x, \ \ldots \tag{7} \]
where $L$ denotes the Lagrangian, we obtain
\[ L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} + \text{const.} \tag{8} \]
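Multiplying equations (7) with $\dot x\,dt$, $\dot y\,dt$ and $\dot z\,dt$, respectively, and adding them yields the energy theorem
\[ d\left(\dot x\frac{\partial L}{\partial \dot x} + \dot y\frac{\partial L}{\partial \dot y} + \dot z\frac{\partial L}{\partial \dot z} - L\right) = F_x\,dx + F_y\,dy + F_z\,dz. \]
From this relation the expression of the energy $H$ of the point particle follows:
\[ H = \dot x\frac{\partial L}{\partial \dot x} + \dot y\frac{\partial L}{\partial \dot y} + \dot z\frac{\partial L}{\partial \dot z} - L = \frac{mc^2}{\sqrt{1 - v^2/c^2}} + \text{const.} \]
The equations of motion (7) can also be represented in the form of Hamilton's Principle:
\[ \int_{t_0}^{t_1} (\delta L + A)\,dt = 0, \]
where the time $t$ and the initial and the final position remain unvaried. The quantity $A$ denotes the virtual work:
\[ A = F_x\,\delta x + F_y\,\delta y + F_z\,\delta z. \]
Finally we write down Hamilton's canonical equations of motion. For this purpose we introduce momentum coordinates $p_x, p_y, p_z$, where
\[ p_x = \frac{\partial L}{\partial \dot x} = \frac{m\dot x}{\sqrt{1 - v^2/c^2}}, \ \ldots \]
Regarding the energy $H$ as a function of $p_x, p_y, p_z$, and using the abbreviation $p_x^2 + p_y^2 + p_z^2 = p^2$, we obtain
\[ H = mc^2\sqrt{1 + \frac{p^2}{m^2c^2}} + \text{const} \]
and the Hamiltonian equations of motion become:
\[ \frac{dp_x}{dt} = F_x, \quad \frac{dx}{dt} = \frac{\partial H}{\partial p_x}; \qquad \frac{dp_y}{dt} = F_y, \quad \frac{dy}{dt} = \frac{\partial H}{\partial p_y}; \qquad \frac{dp_z}{dt} = F_z, \quad \frac{dz}{dt} = \frac{\partial H}{\partial p_z}. \]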
All these relations are valid in the $(x, y, z, t)$ frame used here as well as in any other frame $(x', y', z', t')$ connected with it by equations (1).
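The chain from Lagrangian to momentum to Hamiltonian can be checked with finite differences. The Python sketch below is only a consistency check under stated assumptions: the additive constants are set to zero, units with $c = 1$ are the default, and the function names, the helper `central_difference` and the sample velocity are introduced here for illustration.

```python
import math


def lagrangian(m, vx, vy, vz, c=1.0):
    """L = -m c^2 sqrt(1 - v^2/c^2), with the additive constant set to zero."""
    return -m * c ** 2 * math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz) / c ** 2)


def momentum(m, vx, vy, vz, c=1.0):
    """p = m v / sqrt(1 - v^2/c^2), component by component."""
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz) / c ** 2)
    return m * g * vx, m * g * vy, m * g * vz


def hamiltonian(m, px, py, pz, c=1.0):
    """H = m c^2 sqrt(1 + p^2 / (m^2 c^2)), with the additive constant set to zero."""
    return m * c ** 2 * math.sqrt(1.0 + (px * px + py * py + pz * pz) / (m * c) ** 2)


def central_difference(f, x, h=1e-6):
    """Numerical derivative of a one-argument function f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


m, vx, vy, vz = 1.0, 0.3, 0.4, 0.0
px, py, pz = momentum(m, vx, vy, vz)

# p_x equals the derivative of L with respect to the x-velocity.
dL_dvx = central_difference(lambda u: lagrangian(m, u, vy, vz), vx)
assert math.isclose(dL_dvx, px, rel_tol=1e-6)

# H equals (velocity . momentum) - L.
legendre = vx * px + vy * py + vz * pz - lagrangian(m, vx, vy, vz)
assert math.isclose(hamiltonian(m, px, py, pz), legendre)

# dx/dt equals the derivative of H with respect to p_x.
dH_dpx = central_difference(lambda q: hamiltonian(m, q, py, pz), px)
assert math.isclose(dH_dpx, vx, rel_tol=1e-6)
```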
SPACE AND TIME
BY H. MINKOWSKI

THE views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.

I

First of all I should like to show how it might be possible, setting out from the accepted mechanics of the present day, along a purely mathematical line of thought, to arrive at changed ideas of space and time. The equations of Newton's mechanics exhibit a two-fold invariance. Their form remains unaltered, firstly, if we subject the underlying system of spatial co-ordinates to any arbitrary change of position; secondly, if we change its state of motion, namely, by imparting to it any uniform translatory motion; furthermore, the zero point of time is given no part to play. We are accustomed to look upon the axioms of geometry as finished with, when we feel ripe for the axioms of mechanics, and for that reason the two invariances are probably rarely mentioned in the same breath. Each of them by itself signifies, for the differential equations of mechanics, a certain group of transformations. The existence of the first group is looked upon as a fundamental characteristic of space. The second group is preferably treated with disdain, so that we with untroubled minds may overcome the difficulty of never being able to decide, from physical phenomena, whether space, which is supposed to be stationary, may not be after all in a state of uniform translation. Thus the two groups, side by side, lead their lives entirely apart. Their utterly heterogeneous character may have discouraged any attempt to compound them. But it is precisely when they are compounded that the complete group, as a whole, gives us to think.

We will try to visualize the state of things by the graphic method. Let $x, y, z$ be rectangular co-ordinates for space, and let $t$ denote time. The objects of our perception invariably include places and times in combination. Nobody has ever noticed a place except at a time, or a time except at a place. But I still respect the dogma that both space and time have independent significance. A point of space at a point of time, that is, a system of values $x, y, z, t$, I will call a world-point. The multiplicity of all thinkable $x, y, z, t$ systems of values we will christen the world. With this most valiant piece of chalk I might project upon the blackboard four world-axes. Since merely one chalky axis, as it is, consists of molecules all a-thrill, and moreover is taking part in the earth's travels in the universe, it already affords us ample scope for abstraction; the somewhat greater abstraction associated with the number four is for the mathematician no infliction. Not to leave a yawning void anywhere, we will imagine that everywhere and everywhen there is something perceptible. To avoid saying "matter" or "electricity" I will use for this something the word "substance." We fix our attention on the substantial point which is at the world-point $x, y, z, t$, and imagine that we are able to recognize this substantial point at any other time. Let the variations $dx, dy, dz$ of the space co-ordinates of this substantial point correspond to a time element $dt$. Then we obtain, as an image, so to speak, of the everlasting career of the substantial point, a curve in the world, a world-line, the points of which can be referred unequivocally to the parameter $t$ from $-\infty$ to $+\infty$. The whole universe is seen to resolve itself into similar world-lines, and I would fain anticipate myself by saying that in my opinion physical laws might find their most perfect expression as reciprocal relations between these world-lines.

The concepts, space and time, cause the $x, y, z$-manifold $t = 0$ and its two sides $t > 0$ and $t < 0$ to fall asunder. If, for simplicity, we retain the same zero point of space and time, the first-mentioned group signifies in mechanics that we may subject the axes of $x, y, z$ at $t = 0$ to any rotation we choose about the origin, corresponding to the homogeneous linear transformations of the expression $x^2 + y^2 + z^2$. But the second group means that we may, also without changing the expression of the laws of mechanics, replace $x, y, z, t$ by $x - \alpha t$, $y - \beta t$, $z - \gamma t$, $t$ with any constant values of $\alpha, \beta, \gamma$.
To establish the connexion, let us take a positive parameter $c$, and consider the graphical representation of
\[ c^2t^2 - x^2 - y^2 - z^2 = 1. \]
It consists of two surfaces separated by $t = 0$, on the analogy of a hyperboloid of two sheets. We consider the sheet in the region $t > 0$, and now take those homogeneous linear transformations of $x, y, z, t$ into four new variables $x', y', z', t'$, for which the expression for this sheet in the new variables is of the same form. It is evident that the rotations of space about the origin pertain to these transformations. Thus we gain full comprehension of the rest of the transformations simply by taking into consideration one among them, such that $y$ and $z$ remain unchanged. We draw (Fig. 1) the section of this sheet by the plane of the axes of $x$ and $t$: the upper branch of the hyperbola $c^2t^2 - x^2 = 1$, with its asymptotes. From the origin O we draw any radius vector OA' of this branch of the hyperbola; draw the tangent to the hyperbola at A' to cut the asymptote on the right at B'; complete the parallelogram OA'B'C'; and finally, for subsequent use, produce B'C' to cut the axis of $x$ at D'. Now if we take OC' and OA' as axes of oblique co-ordinates $x'$, $t'$, with the measures OC' = 1, OA' = 1/c, then that branch of the hyperbola again acquires the expression $c^2t'^2 - x'^2 = 1$, $t' > 0$, and the transition from $x, y, z, t$ to $x', y, z, t'$ is one of the transformations in question.

FIG. 1.

With these transformations we now associate the arbitrary displacements of the zero point of space and time, and thereby constitute a group of transformations, which is also, evidently, dependent on the parameter $c$. This group I denote by $G_c$.

If we now allow $c$ to increase to infinity, and $1/c$ therefore to converge towards zero, we see from the figure that the branch of the hyperbola bends more and more towards the axis of $x$, the angle of the asymptotes becomes more and more obtuse, and that in the limit this special transformation changes into one in which the axis of $t'$ may have any upward direction whatever, while $x'$ approaches more and more exactly to $x$. In view of this it is clear that group $G_c$ in the limit when $c = \infty$, that is the group $G_\infty$, becomes no other than that complete group which is appropriate to Newtonian mechanics. This being so, and since $G_c$ is mathematically more intelligible than $G_\infty$, it looks as though the thought might have struck some mathematician, fancy-free, that after all, as a matter of fact, natural phenomena do not possess an invariance with the group $G_\infty$, but rather with a group $G_c$, $c$ being finite and determinate, but in ordinary units of measure, extremely great. Such a premonition would have been an extraordinary triumph for pure mathematics.
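The special transformation that leaves $y$ and $z$ unchanged can be written, in its standard algebraic form, as $x' = (x - vt)/\sqrt{1 - v^2/c^2}$, $t' = (t - vx/c^2)/\sqrt{1 - v^2/c^2}$. That this algebraic form corresponds to the oblique axes of the figure is an assumption of the following Python sketch, as are the function names and sample numbers. The sketch checks that the expression $c^2t^2 - x^2$ keeps its value, and that for very large $c$ the transformation approaches the Newtonian one, $x' = x - vt$, $t' = t$.

```python
import math


def boost(x, t, v, c):
    """Transform (x, t) with y and z unchanged; v is the relative velocity."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t), g * (t - v * x / c ** 2)


def quadratic_form(x, t, c):
    """The expression c^2 t^2 - x^2 (with y = z = 0)."""
    return c ** 2 * t ** 2 - x ** 2


# The hyperbola c^2 t^2 - x^2 = const is mapped onto itself.
x, t, c = 2.0, 3.0, 1.0
for v in (-0.8, -0.3, 0.0, 0.6, 0.9):
    x_new, t_new = boost(x, t, v, c)
    assert math.isclose(quadratic_form(x_new, t_new, c), quadratic_form(x, t, c))

# As c grows, the transformation approaches x' = x - v t, t' = t.
v = 10.0
for c_large in (1e3, 1e5, 1e8):
    print(c_large, boost(x, t, v, c_large))
```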
Well, mathematics, though it now can display only staircase-wit, has the satisfaction of being wise after the event, and is able, thanks to its happy antecedents, with its senses sharpened by an unhampered outlook to far horizons, to grasp forthwith the far-reaching consequences of such a metamorphosis of our concept of nature. I will state at once what is the value of c with which we shall finally be dealing. It is the velocity of the propagation of light in empty space. To avoid speaking either of space or of emptiness, we may define this magnitude in another way, as the ratio of the electromagnetic to the electrostatic unit of electricity.
The existence of the invariance of natural laws for the relevant group $G_c$ would have to be taken, then, in this way: From the totality of natural phenomena it is possible, by successively enhanced approximations, to derive more and more exactly a system of reference $x, y, z, t$, space and time, by means of which these phenomena then present themselves in agreement with definite laws. But when this is done, this system of reference is by no means unequivocally determined by the phenomena. It is still possible to make any change in the system of reference that is in conformity with the transformations of the group $G_c$, and leave the expression of the laws of nature unaltered.

For example, in correspondence with the figure described above, we may also designate time $t'$, but then must of necessity, in connexion therewith, define space by the manifold of the three parameters $x', y, z$, in which case physical laws would be expressed in exactly the same way by means of $x', y, z, t'$ as by means of $x, y, z, t$. We should then have in the world no longer space, but an infinite number of spaces, analogously as there are in three-dimensional space an infinite number of planes. Three-dimensional geometry becomes a chapter in four-dimensional physics. Now you know why I said at the outset that space and time are to fade away into shadows, and only a world in itself will subsist.

II

The question now is, what are the circumstances which force this changed conception of space and time upon us? Does it actually never contradict experience? And finally, is it advantageous for describing phenomena?

Before going into these questions, I must make an important remark.
If we have in any way individualized space and time, we have, as a world-line corresponding to a stationary substantial point, a straight line parallel to the axis of t; corresponding to a substantial point in uniform motion, a straight line at an angle to the axis of t; to a substantial point in varying motion, a world-line in some form of a curve. If at any world-point x, y, z, t we take the world-line passing through that point, and find it parallel to any radius vector OA' of the above-mentioned hyperboloidal sheet, we can introduce OA' as a new axis of time, and with the new concepts of space and time thus given, the substance at the
Lorentz and Poincare Invariance
world-point concerned appears as at rest. We will now introduce this fundamental axiom: The substance at any world-point may always, with the appropriate determination of space and time, be looked upon as at rest. The axiom signifies that at any world-point the expression $c^2dt^2 - dx^2 - dy^2 - dz^2$ always has a positive value, or, what comes to the same thing, that any velocity v always proves less than c. Accordingly c would stand as the upper limit for all substantial velocities, and that is precisely what would reveal the deeper significance of the magnitude c. In this second form the first impression made by the axiom is not altogether pleasing. But we must bear in mind that a modified form of mechanics, in which the square root of this quadratic differential expression appears, will now make its way, so that cases with a velocity greater than that of light will henceforward play only some such part as that of figures with imaginary co-ordinates in geometry. Now the impulse and true motive for assuming the group Gc came from the fact that the differential equation for the propagation of light in empty space possesses that group Gc.* On the other hand, the concept of rigid bodies has meaning only in mechanics satisfying the group G∞. If we have a theory of optics with Gc, and if on the other hand there were rigid bodies, it is easy to see that one and the same direction of t would be distinguished by the two hyperboloidal sheets appropriate to Gc and G∞, and this would have the further consequence, that we should be able, by employing suitable rigid optical instruments in the laboratory, to perceive some alteration in the phenomena when the orientation with respect to the direction of the earth's motion is changed. But all efforts directed towards this goal, in particular the famous interference experiment of Michelson, have had a negative result. To explain this failure, H. A. Lorentz set up an hypothesis, the success of which lies in this very invariance in optics for the group Gc. According to Lorentz any moving body must have undergone a contraction in the direction of its motion, and in fact with a velocity v, a contraction in the ratio $1 : \sqrt{1 - v^2/c^2}$.
* An application of this fact in its essentials has already been given by W. Voigt, Göttinger Nachrichten, 1887, p. 41.
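The contraction ratio $1 : \sqrt{1 - v^2/c^2}$ stated above can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names `contraction_ratio` and `contracted_length`, and the SI value used for c, are assumptions introduced here and do not appear in the lecture.

```python
import math

# Speed of light in m/s (SI value; the lecture does not quote a number here).
C = 299_792_458.0


def contraction_ratio(v: float, c: float = C) -> float:
    """Return sqrt(1 - v^2/c^2), the ratio of moving length to rest length."""
    if abs(v) >= c:
        # The axiom in the text requires every substantial velocity to be below c.
        raise ValueError("velocity must satisfy |v| < c")
    return math.sqrt(1.0 - (v / c) ** 2)


def contracted_length(rest_length: float, v: float, c: float = C) -> float:
    """Length along the direction of motion of a body moving with velocity v."""
    return rest_length * contraction_ratio(v, c)


# Tabulate the ratio for a few velocities given as fractions of c.
for beta in (0.0, 0.1, 0.6, 0.8):
    print(beta, contracted_length(1.0, beta * C))
```

For everyday velocities the ratio differs from 1 only in the order $v^2/c^2$, which is why the effect escaped mechanical measurement.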
This hypothesis sounds extremely fantastical, for the contraction is not to be looked upon as a consequence of resistances in the ether, or anything of that kind, but simply as a gift from above, as an accompanying circumstance of the circumstance of motion. I will now show by our figure that the Lorentzian hypothesis is completely equivalent to the new conception of space and time, which, indeed, makes the hypothesis much more intelligible. If for simplicity we disregard y and z, and imagine a world of one spatial dimension, then a parallel band, upright like the axis of t, and another inclining to the axis of t (see Fig. 1) represent, respectively, the career of a body at rest or in uniform motion, preserving in each case a constant spatial extent. If OA' is parallel to the second band, we can introduce t' as the time, and x' as the space co-ordinate, and then the second body appears at rest, the first in uniform motion. We now assume that the first body, envisaged as at rest, has the length l, that is, the cross section PP of the first band on the axis of x is equal to l · OC, where OC denotes the unit of measure on the axis of x; and on the other hand, that the second body, envisaged as at rest, has the same length l, which then means that the cross section Q'Q' of the second band, measured parallel to the axis of x', is equal to l · OC'. We now have in these two bodies images of two equal Lorentzian electrons, one at rest and one in uniform motion. But if we retain the original co-ordinates x, t, we must give as the extent of the second electron the cross section of its appropriate band parallel to the axis of x. Now since Q'Q' = l · OC', it is evident that QQ = l · OD'. If dx/dt for the second band is equal to v, an easy calculation gives $OD' = OC\,\sqrt{1 - v^2/c^2}$, therefore also $PP : QQ = 1 : \sqrt{1 - v^2/c^2}$. But this is the meaning of Lorentz's hypothesis of the contraction of electrons in motion. If on the other hand we envisage the second electron as at rest, and therefore adopt the system of reference x', t', the length of the first must be denoted by the cross section P'P' of its band parallel to OC', and we should find the first electron in comparison with the second to be contracted in exactly the same proportion; for in the figure P'P' : Q'Q' = OD : OC' = OD' : OC = QQ : PP. Lorentz called the combination t' of x and t the local time of the electron in uniform motion, and applied a physical
construction of this concept, for the better understanding of the hypothesis of contraction. But the credit of first recognizing clearly that the time of the one electron is just as good as that of the other, that is to say, that t and t' are to be treated identically, belongs to A. Einstein.* Thus time, as a concept unequivocally determined by phenomena, was first deposed from its high seat. Neither Einstein nor Lorentz made any attack on the concept of space, perhaps because in the above-mentioned special transformation, where the plane of x', t' coincides with the plane of x, t, an interpretation is possible by saying that the x-axis of space maintains its position. One may expect to find a corresponding violation of the concept of space appraised as another act of audacity on the part of the higher mathematics. Nevertheless, this further step is indispensable for the true understanding of the group Gc, and when it has been taken, the word relativity-postulate for the requirement of an invariance with the group Gc seems to me very feeble. Since the postulate comes to mean that only the four-dimensional world in space and time is given by phenomena, but that the projection in space and in time may still be undertaken with a certain degree of freedom, I prefer to call it the postulate of the absolute world (or briefly, the world-postulate).
* A. Einstein, Ann. d. Phys., 17, 1905, p. 891; Jahrb. d. Radioaktivität und Elektronik, 4, 1907, p. 411.

III

The world-postulate permits identical treatment of the four co-ordinates x, y, z, t. By this means, as I shall now show, the forms in which the laws of physics are displayed gain in intelligibility. In particular the idea of acceleration acquires a clear-cut character. I will use a geometrical manner of expression, which suggests itself at once if we tacitly disregard z in the triplex x, y, z. I take any world-point O as the zero-point of space-time. The cone $c^2t^2 - x^2 - y^2 - z^2 = 0$ with apex O (Fig. 2) consists of two parts, one with values t < 0, the other with values t > 0. The former, the front cone of O, consists, let us say, of all the world-points which "send light to O," the latter, the back cone of O, of all the world-points which "receive light from O." The territory bounded by the front cone alone, we may call "before" O, that which is bounded by
the back cone alone, "after" O. The hyperboloidal sheet already discussed, $F = c^2t^2 - x^2 - y^2 - z^2 = 1$, t > 0, lies after O. The territory between the cones is filled by the one-sheeted hyperboloidal figures $-F = x^2 + y^2 + z^2 - c^2t^2 = k^2$ for all constant positive values of k. We are specially interested in the hyperbolas with O as centre, lying on the latter figures. The single branches of these hyperbolas may be called briefly the internal hyperbolas with centre O. One of these branches, regarded as a world-line, would represent a motion which, for t = −∞ and t = +∞, rises asymptotically to the velocity of light, c. If we now, on the analogy of vectors in space, call a directed length in the manifold of x, y, z, t a vector, we have to distinguish between the time-like vectors with directions from O to the sheet +F = 1, t > 0, and the space-like vectors with directions from O to −F = 1. The time axis may run parallel to any vector of the former kind. Any world-point between the front and back cones of O can be arranged by means of the system of reference so as to be simultaneous with O, but also just as well so as to be earlier than O or later than O. Any world-point within the front cone of O is necessarily always before O; any world-point within the back cone of O necessarily always after O. Corresponding to passing to the limit, c = ∞, there would be a complete flattening out of the wedge-shaped segment between the cones into the plane manifold t = 0. In the figures this segment is intentionally drawn with different widths.

Fig. 2.

We divide up any vector we choose, e.g. that from O to x, y, z, t, into the four components x, y, z, t. If the directions
of two vectors are, respectively, that of a radius vector OR from O to one of the surfaces ∓F = 1, and that of a tangent RS at the point R of the same surface, the vectors are said to be normal to one another. Thus the condition that the vectors with components x, y, z, t and $x_1, y_1, z_1, t_1$ may be normal to each other is $c^2 t t_1 - x x_1 - y y_1 - z z_1 = 0$. For the measurement of vectors in different directions the units of measure are to be fixed by assigning to a space-like vector from O to −F = 1 always the magnitude 1, and to a time-like vector from O to +F = 1, t > 0 always the magnitude 1/c. If we imagine at a world-point P (x, y, z, t) the world-line of a substantial point running through that point, the magnitude corresponding to the time-like vector dx, dy, dz, dt laid off along the line is therefore $d\tau = \frac{1}{c}\sqrt{c^2dt^2 - dx^2 - dy^2 - dz^2}$.
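As a rough numerical illustration of this line element, the sketch below sums $d\tau = \frac{1}{c}\sqrt{c^2dt^2 - dx^2 - dy^2 - dz^2}$ along a world-line given as a list of sampled world-points. The function name `proper_time`, the straight-segment sampling, and the choice of units with c = 1 in the example are assumptions made for illustration; they are not part of the lecture.

```python
import math


def proper_time(events, c=1.0):
    """Sum d_tau = (1/c) * sqrt(c^2 dt^2 - dx^2 - dy^2 - dz^2) over a polyline.

    `events` is a sequence of world-points (t, x, y, z) in order of increasing t.
    Each pair of neighbours is treated as a straight, time-like segment.
    """
    total = 0.0
    for (t0, x0, y0, z0), (t1, x1, y1, z1) in zip(events, events[1:]):
        s2 = (c ** 2) * (t1 - t0) ** 2 - (x1 - x0) ** 2 - (y1 - y0) ** 2 - (z1 - z0) ** 2
        if s2 <= 0.0:
            raise ValueError("segment is not time-like (velocity not below c)")
        total += math.sqrt(s2) / c
    return total


# Hypothetical example in units with c = 1:
# a point at rest for 10 time units, and one that moves out and back at v = 0.6.
at_rest = [(0.0, 0.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)]
out_and_back = [(0.0, 0.0, 0.0, 0.0), (5.0, 3.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)]

print(proper_time(at_rest))
print(proper_time(out_and_back))
```

The moving point accumulates less of this magnitude than the resting one between the same two world-points, since each of its segments contributes $\sqrt{25 - 9} = 4$ instead of 5.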
The integral $\int d\tau = \tau$ of this amount, taken along the world-line from any fixed starting-point $P_0$ to the variable end-point P, we call the proper time of the substantial point at P. On the world-line we regard x, y, z, t (the components of the vector OP) as functions of the proper time τ; denote their first differential coefficients with respect to τ by $\dot x, \dot y, \dot z, \dot t$; their second differential coefficients with respect to τ by $\ddot x, \ddot y, \ddot z, \ddot t$; and give names to the appropriate vectors, calling the derivative of the vector OP with respect to τ the velocity vector at P, and the derivative of this velocity vector with respect to τ the acceleration vector at P. Hence, since $c^2\dot t^2 - \dot x^2 - \dot y^2 - \dot z^2 = c^2$, we have $c^2\dot t\,\ddot t - \dot x\,\ddot x - \dot y\,\ddot y - \dot z\,\ddot z = 0$, i.e. the velocity vector is the time-like vector of unit magnitude in the direction of the world-line at P, and the acceleration vector at P is normal to the velocity vector at P, and is therefore in any case a space-like vector. Now, as is readily seen, there is a definite hyperbola which has three infinitely proximate points in common with the world-line at P, and whose asymptotes are generators of
a "front cone" and a "back cone" (Fig. 3). Let this hyperbola be called the hyperbola of curvature at P. If M is the centre of this hyperbola, we here have to do with an internal hyperbola with centre M. Let ρ be the magnitude of the vector MP; then we recognize the acceleration vector at P as the vector in the direction MP of magnitude $c^2/\rho$. If $\ddot x, \ddot y, \ddot z, \ddot t$ are all zero, the hyperbola of curvature reduces to the straight line touching the world-line in P, and we must put ρ = ∞.

Fig. 3.

IV

To show that the assumption of group Gc for the laws of physics never leads to a contradiction, it is unavoidable to undertake a revision of the whole of physics on the basis of this assumption. This revision has to some extent already been successfully carried out for questions of thermodynamics and heat radiation,* for electromagnetic processes, and finally, with the retention of the concept of mass, for mechanics.† For this last branch of physics it is of prime importance to raise the question: When a force with the components X, Y, Z parallel to the axes of space acts at a world-point P (x, y, z, t), where the velocity vector is $\dot x, \dot y, \dot z, \dot t$, what must we take this force to be when the system of reference is in any way changed? Now there exist certain approved statements as to the ponderomotive force in the electromagnetic field in the cases where the group Gc is undoubtedly admissible. These statements lead up to the simple rule: When the system of reference is changed, the force in question transforms into a force in the new space co-ordinates in such a way that the appropriate vector with the components $\dot t X, \dot t Y, \dot t Z, \dot t T$, where $T = \frac{1}{c^2}\left(\frac{\dot x}{\dot t}X + \frac{\dot y}{\dot t}Y + \frac{\dot z}{\dot t}Z\right)$ is the rate at which work is done by the force at the world-point divided by $c^2$, remains unchanged. This vector is always normal to the velocity vector at P. A force vector of this kind, corresponding to a force at P, is to be called a "motive force vector" at P.

* M. Planck, "Zur Dynamik bewegter Systeme," Berliner Berichte, 1907, p. 542; also in Ann. d. Phys., 26, 1908, p. 1.
† H. Minkowski, "Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern," Göttinger Nachrichten, 1908, p. 58.

I shall now describe the world-line of a substantial point with constant mechanical mass m, passing through P. Let the velocity vector at P, multiplied by m, be called the "momentum vector" at P, and the acceleration vector at P, multiplied by m, be called the "force vector" of the motion at P. With these definitions, the law of motion of a point of mass with given motive force vector runs thus:* The Force Vector of Motion is Equal to the Motive Force Vector. This assertion comprises four equations for the components corresponding to the four axes, and since both vectors mentioned are a priori normal to the velocity vector, the fourth equation may be looked upon as a consequence of the other three. In accordance with the above signification of T, the fourth equation undoubtedly represents the law of energy. Therefore the component of the momentum vector along the axis of t, multiplied by $c^2$, is to be defined as the kinetic energy of the point mass. The expression for this is $mc^2\,\frac{dt}{d\tau} = \frac{mc^2}{\sqrt{1 - v^2/c^2}}$, i.e., after removal of the additive constant $mc^2$, the expression $\frac{1}{2}mv^2$ of Newtonian mechanics down to magnitudes of the order $1/c^2$. It comes out very clearly in this way, how the energy depends on the system of reference. But as the axis of t may be laid in the direction of any time-like vector, the law of energy, framed for all possible systems of reference, already contains, on the other hand, the whole system of the equations of motion. At the limiting transition which we have discussed, to c = ∞, this fact retains its importance for the axiomatic structure of Newtonian mechanics as well, and has already been appreciated in this sense by I. R.
Schütz.† We can determine the ratio of the units of length and time beforehand in such a way that the natural limit of velocity becomes c = 1. If we then introduce, further, $\sqrt{-1}\,t = s$ in place of t, the quadratic differential expression $d\tau^2 = -dx^2 - dy^2 - dz^2 - ds^2$ thus becomes perfectly symmetrical in x, y, z, s; and this symmetry is communicated to any law which does not contradict the world-postulate. Thus the essence of this postulate may be clothed mathematically in a very pregnant manner in the mystic formula $3 \cdot 10^5\ \text{km} = \sqrt{-1}\ \text{secs}$.

* H. Minkowski, loc. cit., p. 107. Cf. also M. Planck, Verhandlungen der physikalischen Gesellschaft, 4, 1906, p. 136.
† I. R. Schütz, "Das Prinzip der absoluten Erhaltung der Energie," Göttinger Nachr., 1897, p. 110.
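The claim made earlier in this section, that the kinetic energy $mc^2/\sqrt{1 - v^2/c^2}$ reduces to the Newtonian $\frac{1}{2}mv^2$ once the constant $mc^2$ is removed, can be checked numerically. The sketch below is an illustration only: the function names and the unit choice c = 1 are assumptions, and the leading correction $\frac{3}{4}v^2/c^2$ comes from the standard series expansion of the square root rather than from the lecture itself.

```python
import math


def relativistic_kinetic_energy(m, v, c=1.0):
    """m c^2 / sqrt(1 - v^2/c^2) with the additive constant m c^2 removed."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return m * c ** 2 * (gamma - 1.0)


def newtonian_kinetic_energy(m, v):
    """The Newtonian expression (1/2) m v^2."""
    return 0.5 * m * v ** 2


def relative_excess(v, c=1.0, m=1.0):
    """Fraction by which the relativistic value exceeds the Newtonian one."""
    newt = newtonian_kinetic_energy(m, v)
    return (relativistic_kinetic_energy(m, v, c) - newt) / newt


# Velocities as fractions of c (units with c = 1).
for v in (0.001, 0.01, 0.1):
    # Second column: computed excess; third column: leading term (3/4) v^2/c^2.
    print(v, relative_excess(v), 0.75 * v ** 2)
```

The excess shrinks in proportion to $v^2/c^2$, which is the sense in which the two expressions agree "down to magnitudes of the order $1/c^2$."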
The advantages afforded by the world-postulate will perhaps be most strikingly exemplified by indicating the effects proceeding from a point charge in any kind of motion according to the Maxwell-Lorentz theory. Let us imagine the world-line of such a point electron with the charge e, and introduce upon it the proper time τ from any initial point. In order to find the field caused by the electron at any world-point $P_1$, we construct the front cone belonging to $P_1$ (Fig. 4). The cone evidently meets the world-line, since the directions of the line are everywhere those of time-like vectors, at the single point P. We draw the tangent to the world-line at P, and construct through $P_1$ the normal $P_1Q$ to this tangent. Let the length of $P_1Q$ be r. Then, by the definition of a front cone, the length of PQ must be r/c. Now the vector in the direction PQ of magnitude e/r represents by its components along the axes of x, y, z, the vector potential multiplied by c, and by the component along the axis of t, the scalar potential of the field excited by e at the world-point $P_1$. Herein lie the elementary laws formulated by A. Liénard and E. Wiechert.*

Fig. 4.

* A. Liénard, "Champ électrique et magnétique produit par une charge concentrée en un point et animée d'un mouvement quelconque," L'Éclairage Électrique, 16, 1898, pp. 5, 53, 106; E. Wiechert, "Elektrodynamische Elementargesetze," Arch. Néerl. (2), 5, 1900, p. 549.

Then in the description of the field produced by the electron we see that the separation of the field into electric
and magnetic force is a relative one with regard to the underlying time axis; the most perspicuous way of describing the two forces together is on a certain analogy with the wrench in mechanics, though the analogy is not complete. I will now describe the ponderomotive action of a moving point charge on another moving point charge. Let us imagine the world-line of a second point electron of the charge $e_1$, passing through the world-point $P_1$. We define P, Q, r as before, then construct (Fig. 4) the centre M of the hyperbola of curvature at P, and finally the normal MN from M to a straight line imagined through P parallel to $QP_1$. With P as starting-point we now determine a system of reference as follows: The axis of t in the direction PQ, the axis of x in direction $QP_1$, the axis of y in direction MN, whereby finally the direction of the axis of z is also defined as normal to the axes of t, x, y. Let the acceleration vector at P be $\ddot x, \ddot y, \ddot z, \ddot t$, the velocity vector at $P_1$ be $\dot x_1, \dot y_1, \dot z_1, \dot t_1$. The motive force vector exerted at $P_1$ by the first moving electron e on the second moving electron $e_1$ now takes the form $-e e_1\left(\dot t_1 - \frac{\dot x_1}{c}\right)\mathfrak{K}$, where the components $\mathfrak{K}_x, \mathfrak{K}_y, \mathfrak{K}_z, \mathfrak{K}_t$ of the vector $\mathfrak{K}$ satisfy the three relations $c\,\mathfrak{K}_t - \mathfrak{K}_x = \frac{1}{r^2}$, $\mathfrak{K}_y = \frac{\ddot y}{c^2 r}$, $\mathfrak{K}_z = 0$, and where, fourthly, this vector $\mathfrak{K}$ is normal to the velocity vector at $P_1$, and through this circumstance alone stands in dependence on the latter velocity vector. When we compare this statement with previous formulations* of the same elementary law of the ponderomotive action of moving point charges on one another, we are compelled to admit that it is only in four dimensions that the relations here taken under consideration reveal their inner being in full simplicity, and that on a three dimensional space forced upon us a priori they cast only a very complicated projection.

* K. Schwarzschild, Göttinger Nachr., 1903, p. 182; H. A. Lorentz, Enzykl. d. math. Wissensch., V, Art. 14, p. 199.

In mechanics as reformed in accordance with the world-postulate, the disturbing lack of harmony between Newtonian
mechanics and modern electrodynamics disappears of its own accord. Before concluding I will just touch upon the attitude of Newton's law of attraction toward this postulate. I shall assume that when two points of mass m, $m_1$ describe their world-lines, a motive force vector is exerted by m on $m_1$, of exactly the same form as that just given in the case of electrons, except that $+mm_1$ must now take the place of $-ee_1$. We now specially consider the case where the acceleration vector of m is constantly zero. Let us then introduce t in such a way that m is to be taken as at rest, and let only $m_1$ move under the motive force vector which proceeds from m. If we now modify this given vector in the first place by adding the factor $\dot t^{-1} = \sqrt{1 - v^2/c^2}$, which, to the order of $1/c^2$, is equal to 1, it will be seen† that for the positions $x_1, y_1, z_1$ of $m_1$ and their variations in time, we should arrive exactly at Kepler's laws again, except that the proper times $\tau_1$ of $m_1$ would take the place of the times $t_1$. From this simple remark it may then be seen that the proposed law of attraction combined with the new mechanics is no less well adapted to explain astronomical observations than the Newtonian law of attraction combined with Newtonian mechanics.

The fundamental equations for electromagnetic processes in ponderable bodies also fit in completely with the world-postulate. As I shall show elsewhere, it is not even by any means necessary to abandon the derivation of these fundamental equations from ideas of the electronic theory, as taught by Lorentz, in order to adapt them to the world-postulate.

The validity without exception of the world-postulate, I like to think, is the true nucleus of an electromagnetic image of the world, which, discovered by Lorentz, and further revealed by Einstein, now lies open in the full light of day. In the development of its mathematical consequences there will be ample suggestions for experimental verifications of the postulate, which will suffice to conciliate even those to whom the abandonment of old-established views is unsympathetic or painful, by the idea of a pre-established harmony between pure mathematics and physics.

† H. Minkowski, loc. cit., p. 110.
The Theory of Relativity and Science* W. Pauli
The special theory of relativity was linked up with the mathematical group concept, as it had already come to light in the mechanics of Galileo and Newton, now so firmly established on an empirical basis. In this system of mechanics all states of motion of the observer, or, expressed mathematically, all coordinate systems which arise from each other by a uniform motion of translation without rotation, are equally privileged. Since the state of rest of a mass does not require any particular cause for its maintenance, the same assumption had to be made in classical mechanics for the state of uniform motion, since the latter arises from the state of rest by one of the transformations contained in the group of mechanics. This formulation of the law of inertia of classical mechanics is of course not the original one, but takes account of the later development of the group concept in the mathematics of the 19th century. The development of electrodynamics during the same period culminated in the partial differential equations of Maxwell and H. A. Lorentz. It was evident that these did not admit the group of classical mechanics, since in particular the fact that the velocity of light in vacuo is independent of the state of motion of the light-sources is contained in them as a consequence.
* Helvetica Physica Acta, Supplement IV, pp. 282-286 (1956).
Would it now be necessary to abandon as only approximately valid the property whereby the laws of nature admit a group, or is the group of mechanics perhaps only approximately valid, and should it be replaced by a more general group, valid for both mechanical and electromagnetic processes? The decision was in favour of the second alternative. This postulate could be arrived at by two paths. Either one could investigate by pure mathematics what is the most general group of transformations under which the equations of Maxwell and Lorentz, which were well known at this time, preserve their form. This path was followed by the mathematician, H. Poincare. Or one could determine, by critical analysis, those physical assumptions which had led to the particular group of the mechanics of Galileo and Newton. This was the path followed by Einstein. He showed that, from the general standpoint of the equivalence of all coordinate systems moving with constant velocity with respect to each other, the invariance of simultaneity of spatially separated events, in the sense in which it is assumed in classical mechanics, involves the special additional supposition of the possibility of infinitely great signal velocities. If this supposition is dropped and replaced by the assumption of a finite maximal signal velocity, time is also transformed, and the group, mathematically speaking, leaves invariant an indefinite quadratic form in four dimensions, three of space and one of time. The electrodynamics of Maxwell and Lorentz did in fact turn out to be invariant under the group of transformations determined by Einstein on the basis of these general considerations, if the maximal signal velocity was identified with the velocity of propagation of light in vacuo. Both Einstein and Poincare took their stand on the preparatory work of H. A. Lorentz, who had already come quite close to this result, without however quite reaching it.
In the agreement between the results of the methods followed independently of each other by Einstein and Poincare I discern a deeper significance of a harmony between the mathematical method and analysis by means of conceptual experiments (Gedankenexperimente), which rests on general features of physical experience.
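The two properties Pauli describes, that the transformations form a group and that they leave an indefinite quadratic form invariant, can be checked numerically in one space dimension. The sketch below is a minimal illustration under stated assumptions: the explicit boost formula and the velocity-composition rule are the standard textbook forms, not quoted from the essay, and the function names and units with c = 1 are chosen here for convenience.

```python
import math


def boost(t, x, v, c=1.0):
    """Lorentz transformation along x with relative velocity v; returns (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2), gamma * (x - v * t)


def galilean(t, x, v):
    """Transformation of the group of classical mechanics, for comparison."""
    return t, x - v * t


def interval(t, x, c=1.0):
    """The indefinite quadratic form c^2 t^2 - x^2 (y and z suppressed)."""
    return c ** 2 * t ** 2 - x ** 2


def compose_velocities(u, v, c=1.0):
    """Velocity of the single boost equal to a boost by u followed by a boost by v."""
    return (u + v) / (1.0 + u * v / c ** 2)


event = (2.0, 1.5)  # hypothetical world-point (t, x)

# Invariance: the quadratic form is unchanged by a boost, but not by a Galilean shift.
print(interval(*event), interval(*boost(*event, 0.6)), interval(*galilean(*event, 0.6)))

# Group property: two successive boosts equal one boost with the composed velocity.
print(boost(*boost(*event, 0.5), 0.6), boost(*event, compose_velocities(0.5, 0.6)))
```

The Galilean transformation leaves t untouched and therefore changes the quadratic form, which is the numerical counterpart of the statement that the equations of Maxwell and Lorentz do not admit the group of classical mechanics.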
Einstein's first paper on relativity

H. M. Schwartz
Department of Physics, University of Arkansas, Fayetteville, Arkansas 72701
(Received 4 April 1975; revised 24 May 1976)

Because of its exceptional significance in the history of great ideas in science, Einstein's first paper on relativity, especially its first part, deserves a more careful translation into English than presently exists. A new and annotated translation of this first part is presented here, together with a brief discussion of certain aspects of Einstein's paper.
A. INTRODUCTION

With the recent perceptible awakening of an interest among physicists in the history of their subject, the older of the two pillars of current terrestrial physics, the special theory of relativity, has become the subject of intensive historical studies.1 Yet rather remarkably, seven decades after its publication, Einstein's trailblazing paper,2 "Zur Elektrodynamik bewegter Körper" ("On the Electrodynamics of Moving Bodies"), does not yet have a fully satisfactory translation into English. The only English translation available until recently, contained in the widely known collection of original papers on relativity,3 falls short as a completely reliable historical document, not so much because it is marred by a few outright mistranslations (which are in any case easily spotted) but because of its failure in important instances to convey properly the intent of a passage or of the nuances in its original presentation.4 A new translation has indeed appeared recently,5 but unfortunately its newness is limited essentially only to correcting the few flagrant mistranslations.6 For these reasons, and since Einstein's first paper on relativity represents one of the most remarkable intellectual achievements,7 a thoroughgoing English translation of this work would be of obvious interest. In the meantime, an attempt in this direction is presented in Sec. B of the present paper. It covers only the first of the two parts of Einstein's paper. It is this part which is of greatest interest historically, and the one where changes in the older translation are especially in order. A few observations concerning the second part of the paper are contained in Sec. C.
The references to Secs. B and C include comments on the text of the paper, and remarks concerning Refs. 3 and 5. The translations of the original footnotes to the first part of the paper are labeled with lower case roman letters (a, b, and c) and are located at the end of the present paper as the first three references. Except for one formula which is labeled as (A), none of the formulas are numbered in the original. Thus all the numbering of formulas by Arabic numerals is supplied in the translation. This is, however, the only essential deviation from the original. Even the punctuation of the original paper has been followed as closely as feasible, and the lengthy sentences of the original text have been retained almost throughout without dissection, so as to preserve as far as possible a sense of the flow of thought in the original writing.8 In addition to the comments on the original text contained in the footnotes, a few observations of a more systematic character are presented in the concluding section. A more complete discussion relating to this subject is deferred for presentation in a future paper dealing with certain questions in the history of special relativity (referred to in Ref. 1).
B. TRANSLATION OF THE INTRODUCTION AND THE FIRST PART OF EINSTEIN'S PAPER

That Maxwell's electrodynamics (as presently conceived), when applied to moving bodies, leads to asymmetries which do not appear to be inherent in the phenomena, is well known. Consider, for instance, the electrodynamic interaction of a magnet and a conductor. The observed phenomenon depends here only on the relative motion of the conductor and the magnet, whereas according to the usual conception, the two cases in which either one or the other of the two bodies is in motion must be strictly differentiated. For, if the magnet is moving and the conductor is stationary, there arises in the vicinity of the magnet an electric field of a certain energy, which generates a current in those places where parts of the conductor are situated. However, if the magnet is stationary and the conductor is moving, then no electric field arises in the vicinity of the magnet, whereas in the conductor there appears an electromotive force to which in itself there corresponds no energy, but which (assuming equality of the relative motion in the two considered cases) gives rise to electric currents of the same magnitude and the same course as those produced by the electric forces in the first case. Examples of a similar kind, as well as the unsuccessful attempts to verify that the earth moves relative to the "light medium,"9 lead to the conjecture that, not only in mechanics but in electrodynamics as well, no properties of
phenomena10 attach to the idea of absolute rest, but rather that the same electrodynamic and optical laws hold in all coordinate systems in which the equations of mechanics are valid, as has already been proved for first-order quantities.11 We shall raise this conjecture (whose content will be called in the sequel the "Principle of Relativity") to the status of a postulate, and in addition we shall introduce another postulate that is only seemingly inconsistent with the former, namely, that light in empty space always propagates with a definite velocity V which is independent of the state of motion of the emitting body. These two postulates suffice for arriving at a simple and consistent electrodynamics of moving bodies on the basis of Maxwell's theory for stationary bodies. The introduction of a "luminiferous ether" will prove to be superfluous, inasmuch as according to the conception that will be developed, there will be no need for the introduction of an "absolutely stationary space" endowed with special properties, or for the assignment of a velocity vector to any point of empty space at which electromagnetic processes occur. Like any other electrodynamic theory, the theory to be developed is based on the kinematics of the rigid body, since the assertions of every theory concern relations between rigid bodies (coordinate systems), clocks, and electromagnetic processes.12 Not taking account sufficiently of this circumstance is the source of the difficulties with which the electrodynamics of moving bodies has to contend at present.
I. Kinematic part

1. Definition of simultaneity
Consider a coordinate system in which Newton's equations of mechanics are valid. In order to distinguish this system from other coordinate systems to be introduced later, and for the sake of clarity of presentation, it will be called the "stationary system."

If a material point is at rest relative to this coordinate system, its position with respect to it can be determined with the aid of rigid measuring rods by the employment of the methods of Euclidean geometry, and can be expressed in terms of Cartesian coordinates.

If we wish to describe the motion of a material point, we give the values of its coordinates as functions of time. It is well to bear in mind here that such a mathematical description has physical sense only when it has first been clearly established what is to be understood by the word "time." We must note that all our judgments in which time plays a role are always judgments of simultaneous occurrences. When I say, for example: "That train arrives here at 7 o'clock," it means something like the following: "The pointing of the small hand of my watch to 7 and the arrival of the train are simultaneous events."$^{a,13}$
It might appear that it would be possible to surmount all the difficulties connected with the definition of "time" by substituting "the position of the small hand of my watch" for "time." Such a definition does in fact suffice when it is a question of defining the time exclusively at the place where the watch happens to be located, but the definition no longer suffices when it is a matter of connecting in time a series of events occurring at different places, or (what amounts to the same thing) of establishing the time of events that occur at places remote from the watch.

We could, of course, content ourselves with determining the time of every event by having an observer, situated together with a clock at the origin of coordinates, associate the position of the clock's hand with the arrival through empty space of a light signal produced at the event. But, as we know from experience, this association entails the drawback of not being independent of the standpoint of the observer supplied with the clock. We arrive at a much more practical determination by the following consideration.

If a clock is situated at the point A of space, then an observer at A can determine the times of the events in the immediate vicinity of A by noting the positions of the hands of the clock which are simultaneous with these events. If there is also a clock at the point B of space (we wish to add, "a clock of precisely the same construction as that located at A"), then the times of events in the immediate vicinity of B can be likewise determined by an observer situated at B. But without further stipulation it is impossible to compare the times of an event at A and of an event at B; we have so far defined only an "A-time" and a "B-time," but have not defined a "time" common to A and B. The latter time can, however, be defined if we establish by definition that the "time" required by light to travel from A to B is equal to the "time" which it requires to travel from B to A.
Thus let a light ray depart from A towards B at the "A-time" $t_A$, be reflected at B towards A at the "B-time" $t_B$, and return to A at the "A-time" $t'_A$. The two clocks run in synchrony by definition, when

$$t_B - t_A = t'_A - t_B. \tag{1}$$
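The criterion of Eq. (1) amounts to a simple check on three clock readings. The following Python sketch is only an illustration under stated assumptions: the function names `round_trip_readings` and `is_synchronous`, the parameter `offset_b` (a deliberate mis-setting of clock B), and the numerical values are all introduced here and do not appear in Einstein's text.

```python
def round_trip_readings(distance, light_speed, t_emit, offset_b=0.0):
    """Return the readings (t_A, t_B, t_A') for a light signal A -> B -> A.

    Both clocks are at rest. offset_b is a hypothetical error in the
    setting of clock B relative to a correctly set clock.
    """
    one_way = distance / light_speed
    t_reflect = t_emit + one_way + offset_b   # reading of clock B at reflection
    t_return = t_emit + 2.0 * one_way         # reading of clock A at return
    return t_emit, t_reflect, t_return


def is_synchronous(t_a, t_b, t_a_return, tol=1e-9):
    """Criterion of Eq. (1): t_B - t_A equals t'_A - t_B (within tol)."""
    return abs((t_b - t_a) - (t_a_return - t_b)) <= tol


# Illustrative numbers only (arbitrary units).
good = round_trip_readings(600.0, 300.0, t_emit=1.0)
bad = round_trip_readings(600.0, 300.0, t_emit=1.0, offset_b=0.5)
print(is_synchronous(*good), is_synchronous(*bad))
```

A mis-set clock at B shifts only the middle reading, so the two intervals of Eq. (1) become unequal and the check fails.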
We assume that this definition of synchronism admits of no contradictions and can be applied for any number of points; that the following conditions are, thus, generally satisfied:

1. When the clock at B is synchronous with the clock at A, then the clock at A is synchronous with the clock at B.
2. When the clock at A is synchronous both with the clock at B and with the clock at C, then also the clocks at B and C are mutually synchronous.

We have thus established with the aid of certain (imagined) physical procedures14 what is to be understood by synchronously running clocks at rest at different places, and have thereby evidently acquired a definition of "simultaneous" and of "time." The "time" of an event is the indication, simultaneous with the event, of a stationary clock located at the place of the event, and which, for all time determinations, runs in synchrony with a particular stationary clock.

We also stipulate, in agreement with experience,15 that the quantity

$$\frac{2\,\overline{AB}}{t'_A - t_A} = V \tag{2}$$

is a universal constant (the velocity of light in empty space).

The essential thing is that we have defined time by means of clocks at rest in the stationary system; we call the time so defined, because of this association with the stationary system, "the time of the stationary system."

2. On the relativity of lengths and times
The following considerations are based on the principle of relativity and on the principle of the constancy of the velocity of light. We define these two principles as follows:

1. The laws in accordance with which the states of physical systems vary are not dependent on whether these changes of state are referred to one or to the other of two coordinate systems that are in uniform translational relative motion.
2. Every light ray moves in the "stationary" coordinate system with the fixed velocity V, independently of whether this ray is emitted by a stationary or a moving body. Here

velocity = (light path)/(time interval),

where time interval is to be taken in the sense of the definition in Sec. 1.

Let there be given a stationary rigid rod, and let its length be l as measured by a measuring rod which is also stationary. We now imagine the axis of the rod placed along the X-axis of the stationary coordinate system, and then a uniform parallel translation (of velocity v) imparted to it along the X-axis in the direction of increasing x. We now inquire as to the length of the moving rod, which we suppose to be established by the following two operations:

(a) The observer moves together with the aforementioned measuring rod and the rod to be measured, and measures the length of the rod directly, by laying off the measuring rod along it, in the same way as when the rod to be measured, the observer, and the measuring rod are at rest.

(b) By means of stationary clocks set up in the stationary system and synchronized according to Sec. 1, the observer ascertains the points in the stationary system where the ends of the rod are situated at a specified time t. The distance between these two points, measured with the measuring rod previously employed, which is in this case stationary, is likewise a length, which can be denoted as the "length of the rod."

According to the principle of relativity, the length obtained with the operation (a), which we shall call "the length of the rod in the moving system," must be equal to the length l of the stationary rod.

The length obtained with the operation (b), which we shall call "the length of the (moving) rod in the stationary system," will be determined by us on the basis of our two principles, and we shall find that it differs from l.

The kinematics in general use involves the tacit assumption that the lengths determined by our two operations are exactly equal, or, in other words, that at a given instant of time t, a moving rigid body is completely replaceable in geometrical respects by the same body, when it is at rest in a definite position.

We imagine further that at the two ends (A and B) of the rod16 there are placed clocks which synchronize with the clocks of the stationary system, i.e., whose readings correspond at any given instant to the "time of the stationary system" at the places where they happen to be; these clocks are thus "synchronous in the stationary system."

We also imagine that each clock is accompanied by an observer, and that these observers apply to both clocks the criterion established in Sec. 1 for the synchronization of two clocks. Suppose that, at the time$^{b}$ $t_A$, a light ray departs from A, is reflected at B at the time $t_B$, and returns to A at time $t'_A$. Taking account of the principle of the constancy of the velocity of light, we find

$$t_B - t_A = \frac{r_{AB}}{V - v} \qquad \text{and} \qquad t'_A - t_B = \frac{r_{AB}}{V + v},$$

where $r_{AB}$ denotes the length of the moving rod, as measured in the stationary system.
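The two intervals just obtained, $r_{AB}/(V - v)$ and $r_{AB}/(V + v)$, can be evaluated directly to see how the criterion of Sec. 1 fails for the co-moving observers. The sketch below is illustrative only: the function names and the sample values are assumptions made here, not part of the original text.

```python
def leg_durations(rod_length, v, V):
    """Stationary-system durations of the legs A -> B and B -> A.

    rod_length is r_AB, the length of the moving rod as measured in the
    stationary system; v is the rod's velocity, V the velocity of light.
    """
    if not 0.0 <= v < V:
        raise ValueError("this sketch assumes 0 <= v < V")
    outbound = rod_length / (V - v)   # t_B - t_A
    inbound = rod_length / (V + v)    # t'_A - t_B
    return outbound, inbound


def comoving_observers_find_synchrony(rod_length, v, V, tol=1e-12):
    """Apply the criterion of Sec. 1 to the two clocks carried by the rod."""
    outbound, inbound = leg_durations(rod_length, v, V)
    return abs(outbound - inbound) <= tol


# Illustrative values in units where V = 1.
print(leg_durations(1.0, 0.5, 1.0))
print(comoving_observers_find_synchrony(1.0, 0.0, 1.0))
print(comoving_observers_find_synchrony(1.0, 0.5, 1.0))
```

Only for v = 0 are the two legs equal; for any nonzero velocity the intervals differ, which is the content of the conclusion drawn in the text.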
Observers accompanying the moving rod would thus find that the two clocks are not synchronous, whereas observers in the stationary system would declare them to be synchronous.

We thus see that we cannot attach any absolute significance to the concept of simultaneity, but that two events, which when viewed from a given system of coordinates are simultaneous, can no longer be considered as simultaneous events when viewed from a system moving relative to the given system.

3. Theory of coordinate and time transformations from the stationary system to one having relative to it a uniform translational motion

Let there be given in the "stationary" space17 two coordinate systems, i.e., two systems consisting each of three mutually perpendicular rigid material lines issuing from a point. Let the X-axes of the two systems coincide, and the Y- and Z-axes be respectively parallel. Let each system be provided with a rigid measuring rod and a number of clocks, and let the two measuring rods, as well as all the clocks of both systems, be exactly alike.

Let there be imparted now to the origin of one of the two systems (k) a (constant) velocity v in the direction of increasing x of the other, stationary system (K), which velocity is also communicated to the coordinate axes, the measuring rod in question, and the clocks. To each time t of the stationary system K there corresponds then a definite position of the axes of the moving system, and from symmetry considerations we are justified in assuming that the motion of k can be such that the axes of the moving system at time t (by "t" is always represented a time of the stationary system) are parallel to the axes of the stationary system.

We suppose now that space18 is measured in the stationary system K with the stationary measuring rod and in the moving system k with the comoving measuring rod, and that the respective coordinates x, y, z and $\xi, \eta, \zeta$ are thus established. Further, let the time t of the stationary system be determined at all its points where stationary clocks are located, with the aid of these clocks, by the method of light signals presented in Sec. 1; likewise let the time $\tau$ of the moving system be determined at all its points where clocks are located and are at rest relative to it, by applying the method of light signals, described in Sec. 1, between the points at which the latter clocks are located.

To every set of values x, y, z, t which determine completely the place and time of an event in the stationary system there corresponds a set of values $\xi, \eta, \zeta, \tau$ which determine that event with respect to the system k, and the problem that must now be solved is finding the system of equations that connect these quantities.
To begin with, it is clear that the equations must be linear owing to the homogeneity properties which we attribute to space and time.19

If we set x' = x − vt, then it is clear that to a point at rest in the system k there corresponds a fixed set of values x', y, z independent of time. We determine first $\tau$ as a function of x', y, z, and t. To this end we must express in terms of equations that $\tau$ is just a representation of the aggregate20 of readings of the clocks at rest in k, which have been synchronized according to the rule given in Sec. 1.

Suppose that at the time $\tau_0$ a light ray is sent out from the origin of the system k along the X-axis towards x' and is there reflected at time $\tau_1$ back to the origin of coordinates, which it reaches at time $\tau_2$; we must then have

$$\tfrac{1}{2}(\tau_0 + \tau_2) = \tau_1,$$

or, inserting the arguments of the function $\tau$, and applying the principle of the constancy of the velocity of light in the stationary system,21

$$\frac{1}{2}\left[\tau(0,0,0,t) + \tau\!\left(0,0,0,\; t + \frac{x'}{V-v} + \frac{x'}{V+v}\right)\right] = \tau\!\left(x',0,0,\; t + \frac{x'}{V-v}\right). \tag{3}$$

From this it follows by choosing x' infinitely small that

$$\frac{1}{2}\left(\frac{1}{V-v} + \frac{1}{V+v}\right)\frac{\partial\tau}{\partial t} = \frac{\partial\tau}{\partial x'} + \frac{1}{V-v}\,\frac{\partial\tau}{\partial t},$$

or

$$\frac{\partial\tau}{\partial x'} + \frac{v}{V^2 - v^2}\,\frac{\partial\tau}{\partial t} = 0. \tag{4}$$
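As a numerical cross-check of the step from Eq. (3) to Eq. (4), one can take a linear trial function $\tau = a t + b x'$ and evaluate both sides of Eq. (3). The following sketch is an illustration under assumptions made here: the names `tau_linear` and `eq3_residual`, the restriction to the x' and t dependence, and the sample values are not part of the original.

```python
def tau_linear(x_prime, t, a, b):
    """Linear trial function tau = a*t + b*x' (y and z dependence omitted)."""
    return a * t + b * x_prime


def eq3_residual(x_prime, t, v, V, a, b):
    """Left side minus right side of Eq. (3) for the trial function."""
    t_reflect = t + x_prime / (V - v)          # stationary time of reflection
    t_return = t_reflect + x_prime / (V + v)   # stationary time of return
    lhs = 0.5 * (tau_linear(0.0, t, a, b) + tau_linear(0.0, t_return, a, b))
    rhs = tau_linear(x_prime, t_reflect, a, b)
    return lhs - rhs


# Illustrative values in units where V = 1; a is an arbitrary scale factor.
v, V, a = 0.6, 1.0, 1.7
b_from_eq4 = -a * v / (V**2 - v**2)   # coefficient required by Eq. (4)
print(eq3_residual(0.3, 2.0, v, V, a, b_from_eq4))
print(eq3_residual(0.3, 2.0, v, V, a, 0.0))
```

With the coefficient fixed by Eq. (4) the residual vanishes to rounding error, while a trial function with no x' dependence (b = 0) violates Eq. (3).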
It is to be noted that instead of the origin of coordinates we could have chosen any other point as the starting point of the light ray, so that the equation just obtained holds for all values of x', y, z.22

An analogous consideration, applied to the H- and Z-axes, yields the following equations, when it is noted that light is always propagated along these axes with the velocity $(V^2 - v^2)^{1/2}$ when viewed from the stationary system:

$$\frac{\partial\tau}{\partial y} = 0, \qquad \frac{\partial\tau}{\partial z} = 0. \tag{5}$$
From these equations it follows, since $\tau$ is a linear function, that

$$\tau = a\left[t - \frac{v x'}{V^2 - v^2}\right], \tag{6}$$

where a is for the moment an unknown function $\phi(v)$, and it is assumed for the sake of brevity that, at the origin of k, t = 0 when $\tau = 0$.

Using this result, it is easy to determine the quantities $\xi, \eta, \zeta$ by expressing by means of equations that (as required by the principle of the constancy of the velocity of light together with the principle of relativity) light propagates with the velocity V also when measured in the moving system. For a light ray emitted at time $\tau = 0$ in the direction of increasing $\xi$, we have

$$\xi = V\tau \qquad \text{or} \qquad \xi = aV\left[t - \frac{v x'}{V^2 - v^2}\right].$$

But, as measured in the stationary system, the ray moves with velocity V − v relative to the origin of k, so that we have

$$\frac{x'}{V - v} = t.$$

Inserting this value of t in the equation for $\xi$, we find

$$\xi = \frac{a V^2}{V^2 - v^2}\,x'.$$

Analogously, we find, by considering light rays moving along the other two axes,

$$\eta = V\tau = aV\left[t - \frac{v x'}{V^2 - v^2}\right],$$

where

$$\frac{y}{(V^2 - v^2)^{1/2}} = t, \qquad x' = 0,$$

so that

$$\eta = \frac{aV}{(V^2 - v^2)^{1/2}}\,y, \qquad \zeta = \frac{aV}{(V^2 - v^2)^{1/2}}\,z.$$

Substituting for x' its value [i.e., x − vt], we obtain

$$\tau = \phi(v)\beta\,(t - vx/V^2), \quad \xi = \phi(v)\beta\,(x - vt), \quad \eta = \phi(v)\,y, \quad \zeta = \phi(v)\,z, \tag{7}$$

where $\beta = [1 - (v/V)^2]^{-1/2}$.

Let a spherical wave be emitted at the time $t = \tau = 0$ from the then common origin of the two systems, and let it propagate with the velocity V in the system K. If (x, y, z) is a point just reached by this wave, then

$$x^2 + y^2 + z^2 = V^2 t^2.$$

We transform this equation with the aid of our transformation equations and obtain, after a simple calculation,

$$\xi^2 + \eta^2 + \zeta^2 = V^2 \tau^2.$$

The wave under consideration is therefore, also as viewed in the moving system, a spherical wave with velocity of propagation V. This shows that our two fundamental principles are mutually compatible.

In the transformation equations that have been developed there enters still an unknown function $\phi$ of v, which we will now determine.
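The "simple calculation" showing that a spherical wave in K remains spherical in k can be checked numerically with the transformation $\tau = \phi\beta(t - vx/V^2)$, $\xi = \phi\beta(x - vt)$, $\eta = \phi y$, $\zeta = \phi z$, keeping the still undetermined factor $\phi$ free. The sketch below is illustrative: the function names, the choice of a constant `phi`, and the sample event are assumptions made here.

```python
import math


def transform(x, y, z, t, v, V, phi=1.0):
    """Transformation K -> k with the undetermined factor phi kept free."""
    beta = 1.0 / math.sqrt(1.0 - (v / V) ** 2)
    tau = phi * beta * (t - v * x / V**2)
    xi = phi * beta * (x - v * t)
    return xi, phi * y, phi * z, tau


def wave_residual(x, y, z, t, V):
    """x^2 + y^2 + z^2 - V^2 t^2, which vanishes on the light sphere."""
    return x * x + y * y + z * z - (V * t) ** 2


# Illustrative event on the light sphere, in units where V = 1.
V, v, phi = 1.0, 0.5, 1.3
event = (1.2, 0.0, 1.6, 2.0)
print(wave_residual(*event, V))
print(wave_residual(*transform(*event, v, V, phi), V))
```

Both residuals vanish to rounding error for any value of `phi`, which is consistent with the text: the compatibility of the two principles does not by itself fix $\phi$.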
To this end we introduce yet a third coordinate system K', which is in such a state of parallel-translational motion relative to the system k, parallel to the $\Xi$-axis, that its origin moves along the $\Xi$-axis with the velocity −v.23 Let all the three origins of coordinates coincide at the time t = 0 and let the time t' of the system K' vanish when t = x = y = z = 0. We denote by x', y', z' the coordinates measured in the system K', and obtain by a twofold application of our transformation equations

$$t' = \phi(-v)\beta(-v)\,(\tau + v\xi/V^2) = \phi(v)\phi(-v)\,t,$$
$$x' = \phi(-v)\beta(-v)\,(\xi + v\tau) = \phi(v)\phi(-v)\,x,$$
$$y' = \phi(-v)\,\eta = \phi(v)\phi(-v)\,y,$$
$$z' = \phi(-v)\,\zeta = \phi(v)\phi(-v)\,z.$$

Since the relations between x', y', z' and x, y, z do not contain the time t, the systems K and K' are mutually at rest, and it is clear that the transformation from K to K' must be the identity transformation. Therefore,

$$\phi(v)\,\phi(-v) = 1.$$

We now investigate the significance of $\phi(v)$. We direct our attention to that part of the H-axis24 of the system k which lies between $\xi = 0, \eta = 0, \zeta = 0$ and $\xi = 0, \eta = l, \zeta = 0$. This part of the H-axis is a rod that moves perpendicularly to its axis with the velocity v relative to the system K, and whose ends possess in K the coordinates25

$$x_1 = vt, \qquad y_1 = l/\phi(v), \qquad z_1 = 0$$

and

$$x_2 = vt, \qquad y_2 = 0, \qquad z_2 = 0.$$

Hence, the length of the rod as measured in K is $l/\phi(v)$. By symmetry, this length can depend only on the magnitude of the velocity and not on its direction or sense, so that

$$l/\phi(v) = l/\phi(-v), \qquad \text{or} \qquad \phi(v) = \phi(-v). \tag{8}$$

From this relation and the one obtained previously it follows that $\phi(v) = 1$, so that our transformation equations reduce to

$$\tau = \beta\left[t - (v/V^2)\,x\right], \qquad \xi = \beta\,(x - vt), \qquad \eta = y, \qquad \zeta = z, \tag{9}$$

where

$$\beta = \left[1 - (v/V)^2\right]^{-1/2}. \tag{10}$$
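The argument with the third system K' says that transforming with velocity v and then with velocity −v must return the original coordinates. With $\phi = 1$ this can be verified directly from Eqs. (9) and (10). The following sketch is illustrative only; the function names `beta` and `lorentz`, the tuple layout of an event, and the sample numbers are assumptions made here.

```python
import math


def beta(v, V):
    """Eq. (10): beta = [1 - (v/V)^2]^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / V) ** 2)


def lorentz(event, v, V):
    """Eq. (9): map (x, y, z, t) to (xi, eta, zeta, tau) for velocity v."""
    x, y, z, t = event
    b = beta(v, V)
    return (b * (x - v * t), y, z, b * (t - v * x / V**2))


# Illustrative event, in units where V = 1.
V, v = 1.0, 0.8
event = (3.0, -1.0, 2.0, 5.0)
there = lorentz(event, v, V)       # K -> k
back = lorentz(there, -v, V)       # k -> K', which is at rest relative to K
print(there)
print(back)
```

The round trip reproduces the starting event to rounding error, in line with the requirement that the transformation from K to K' be the identity.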
4. Physical meaning of the derived equations as concerns moving rigid bodies and moving clocks

We consider a rigid sphere$^{c}$ of radius R, which is at rest relative to the moving system k, and whose center lies at the origin of coordinates of k. The equation of the surface of this sphere, which moves with velocity v relative to the system K, is

$$\xi^2 + \eta^2 + \zeta^2 = R^2.$$

When expressed in x, y, z, the equation of this surface is at time t = 0

$$\frac{x^2}{1 - v^2/V^2} + y^2 + z^2 = R^2.$$

A rigid body, which is of spherical shape when surveyed in a state of rest, has therefore when in a state of motion (as viewed from the stationary system) the shape of an ellipsoid of revolution with the axes

$$R\left[1 - (v/V)^2\right]^{1/2}, \quad R, \quad R.$$
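The axes of the ellipsoid follow directly from the surface equation at t = 0. A minimal sketch, assuming illustrative function names and sample values that are not part of the original:

```python
import math


def ellipsoid_axes(R, v, V):
    """Semi-axes of the moving sphere as viewed from the stationary system."""
    if abs(v) >= V:
        raise ValueError("this sketch assumes |v| < V")
    return (R * math.sqrt(1.0 - (v / V) ** 2), R, R)


def on_moving_sphere(x, y, z, R, v, V, tol=1e-9):
    """True if (x, y, z) satisfies the t = 0 surface equation in K."""
    lhs = x * x / (1.0 - (v / V) ** 2) + y * y + z * z
    return abs(lhs - R * R) <= tol


# Illustrative values in units where V = 1.
axes = ellipsoid_axes(1.0, 0.6, 1.0)
print(axes)
print(on_moving_sphere(axes[0], 0.0, 0.0, 1.0, 0.6, 1.0))
print(on_moving_sphere(1.0, 0.0, 0.0, 1.0, 0.6, 1.0))
```

The point at the end of the shortened axis lies on the surface, whereas the point at distance R along the direction of motion does not.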
Thus, while the Y- and Z-dimensions of the sphere (and hence also of every rigid body of arbitrary shape) do not appear to be changed by the motion, the X-dimension appears to be shortened in the ratio $1 : [1 - (v/V)^2]^{1/2}$, and hence the more so the greater the value of v. For v = V all moving objects, viewed from the "stationary" system, shrink into plane-like structures. For superlight speeds our considerations become senseless; we shall find, moreover, in the following discussion that the velocity of light plays in our theory the role of an infinitely large velocity.

It is clear that the same results hold for bodies at rest in the "stationary" system, as viewed from a system in a state of uniform motion.

We imagine further one of the clocks which are capable of furnishing the time t when at rest in the stationary system and the time $\tau$ when at rest in the moving system, placed at the origin of coordinates of k, and set so as to mark the time $\tau$. How fast does this clock run when viewed from the stationary system?

We obviously have the following equations between the quantities x, t, and $\tau$, which refer to the position of this clock:

$$\tau = \left[1 - (v/V)^2\right]^{-1/2}\left(t - vx/V^2\right)$$

and x = vt. Hence,

$$\tau = t\left[1 - (v/V)^2\right]^{1/2} = t - \left\{1 - \left[1 - (v/V)^2\right]^{1/2}\right\}t,$$
from which it follows that the indication of the clock (viewed in the stationary system) is retarded by $1 - [1 - (v/V)^2]^{1/2}$ sec per second, or by $v^2/2V^2$ sec per second, up to quantities of fourth and higher order.

From this arises the following peculiar consequence. If at the points A and B of K there are stationary clocks that are synchronous as viewed in the stationary system, and if the clock at A is moved with the velocity v along the line joining A and B, then upon the arrival of this clock at B the two clocks no longer synchronize; instead, the clock that has moved from A to B lags behind the one which has stayed at B by $tv^2/2V^2$ sec (up to quantities of fourth and higher order), when t is the time of transit of the clock from A to B.

It is seen immediately that this result also remains true when the clock moves from A to B along an arbitrary polygonal line,26 and so also when the points A and B coincide.

If one assumes that the result proved for a polygonal line26 holds likewise for a continuously curved line, one obtains the proposition: If there are two synchronous clocks at A, and one of these is moved with uniform speed along a closed curve until it returns back to A after an interval of t sec, then upon its arrival the latter clock lags by $tv^2/2V^2$ sec behind the clock that has not been moved. From this one concludes that a balance clock27 at the Earth's equator must go more slowly by a very small amount than a precisely similar clock located at one of the Earth's poles, and otherwise subject to identical conditions.

5. Addition theorem for velocities

In the system k moving with the velocity v along the X-axis of the system K, let a point move according to the equations

$$\xi = w_\xi \tau, \qquad \eta = w_\eta \tau, \qquad \zeta = 0,$$

where $w_\xi$ and $w_\eta$ denote constants.

It is desired to find the motion of the point relative to the system K. If one substitutes in the equations of motion of the point the quantities x, y, z, t given in the transformation equations developed in Sec.
3, one obtains

$$x = \frac{w_\xi + v}{1 + v w_\xi/V^2}\,t, \qquad y = \frac{\left[1 - (v/V)^2\right]^{1/2}}{1 + v w_\xi/V^2}\,w_\eta t, \qquad z = 0.$$

Thus the law of the parallelogram of velocities holds in our theory only to first approximation. We set

$$U^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2, \qquad w^2 = w_\xi^2 + w_\eta^2,$$

and [correcting an obvious misprint]

$$\alpha = \arctan(w_\eta/w_\xi);$$

$\alpha$ is then to be looked upon as the angle between the velocities v and w. After a simple calculation it is found that28

$$U = \frac{\left[v^2 + w^2 + 2vw\cos\alpha - (vw\sin\alpha/V)^2\right]^{1/2}}{1 + vw\cos\alpha/V^2}. \tag{11}$$

It is noteworthy that v and w enter in the expression for the resultant velocity in a symmetrical way. If w has also the direction of the X-axis ($\Xi$-axis), we obtain

$$U = \frac{v + w}{1 + vw/V^2}.$$

It follows from this equation that from the superposition of two velocities that are smaller than V there always results a velocity smaller than V. For if we set $v = V - \kappa$, $w = V - \lambda$, where $\kappa$ and $\lambda$ are positive and smaller than V, then

$$U = V\,\frac{2V - \kappa - \lambda}{2V - \kappa - \lambda + \kappa\lambda/V} < V.$$

It follows further that the velocity of light V cannot be changed when it is combined with a "subluminal velocity."29 In this case we obtain

$$U = \frac{V + w}{1 + w/V} = V. \tag{12}$$

In the case that v and w have the same direction, we could have also obtained the formula for U by the composition of two transformations according to Sec. 3. If in addition to the systems K and k appearing in Sec. 3 we introduce a third coordinate system k' moving parallel to k, whose origin moves with the velocity w along the $\Xi$-axis, we obtain equations between the quantities x, y, z, t and the corresponding quantities of k', which differ from those obtained in Sec. 3 only by the replacement of "v" by the quantity

$$\frac{v + w}{1 + vw/V^2};$$

we see from this that, as must be the case, such parallel transformations form a group.

We have now derived the required laws of kinematics which correspond to our two principles, and proceed to show their application to electrodynamics.
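The addition theorem lends itself to a short numerical illustration: the general formula, Eq. (11), its parallel special case, the invariance of V, and the group property of parallel compositions. The sketch below is a hedged illustration; the function names and all numerical values are assumptions made here, not part of the original text.

```python
import math


def compose_parallel(v, w, V):
    """Resultant of two velocities along the same axis: (v + w)/(1 + vw/V^2)."""
    return (v + w) / (1.0 + v * w / V**2)


def resultant_speed(v, w, alpha, V):
    """Eq. (11): magnitude of the resultant for an angle alpha between v and w."""
    numerator = math.sqrt(
        v * v + w * w + 2.0 * v * w * math.cos(alpha)
        - (v * w * math.sin(alpha) / V) ** 2
    )
    return numerator / (1.0 + v * w * math.cos(alpha) / V**2)


# Illustrative values in units where V = 1.
V = 1.0
print(compose_parallel(0.5, 0.5, V))        # two subluminal velocities
print(compose_parallel(V, 0.3, V))          # light combined with a subluminal velocity
print(resultant_speed(0.5, 0.5, 0.0, V))    # Eq. (11) for alpha = 0
print(resultant_speed(0.9, 0.9, 1.0, V))    # oblique case
```

For alpha = 0 the general expression reduces to the parallel formula, and composing three parallel velocities in either grouping gives the same result, which is one way to see the group property stated in the text.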
C. REMARKS CONCERNING THE SECOND PART OF EINSTEIN'S PAPER

The second part of Einstein's paper deals with applications of relativistic kinematics to optics (Secs. 7 and 8) and to electron dynamics (Sec. 10), after the establishment of the Lorentz invariance of the Maxwell equations (Sec. 6) and of the Maxwell-Lorentz equations (Sec. 9),30 respectively.
In Sec. 6, in addition to deriving the relativistic covariance of Maxwell's equations for the electromagnetic field in a vacuum, Einstein points out the consequent relative nature of the concepts of electric and magnetic "force." Section 7, entitled "Theory of Doppler's Principle and of Aberration," treats not only these subjects, but also the Lorentz transformation of the "amplitude of the electric or magnetic force" associated with a monochromatic electromagnetic wave. Denoting this amplitude "as measured respectively in the stationary and moving system" by A and A', he finds (first formula on p. 57 of POR31)32: $A'/A = \beta[1 - (v/V)\cos\phi]$. This formula takes on added interest by the result obtained in Sec. 8 (see pages 57-58 of POR) for the ratio E'/E of the Lorentz-transformed and the original values of the energy of the associated light pulse, namely,33 $E'/E = \beta[1 - (v/V)\cos\phi]$.
dr-
M/33 dt-
dc- upV
' M
$\
v1 )•
(A)
VI
Einstein was fully aware of the questionable elements in his approach35; moreover, luckily, by a restricted use of the first, and correct, equation of the set (A), he obtained the correct relativistic expression for the kinetic energy of a particle (last equation on p. 63 of POR), which enabled him to obtain in 1905 his first enunciation of the equivalence of mass and energy.36 The paper concludes with a remarkable acknowledgment to Einstein's "friend and colleague, M. Besso."37

D. DISCUSSION

When Einstein's first paper on relativity is read with due attention to its content and its style, and proper account is
taken of what is known about its author in the decade ending in 1905, and of the state of basic physics during that period, there arise a number of questions that are of considerable interest to the historian and to the philosopher of science. Most of these questions have already been the subject of penetrating investigations,1 and further discussion will be presented in the last reference in Ref. 1. The present discussion will be limited largely to examining the idea expressed in Ref. 7. That the creation of the special theory of relativity in Einstein's sense was in some respects even a more remarkable achievement than the creation of the general theory may be questioned by many—Einstein's worldwide fame was certainly associated with the latter theory. What is extraordinarily unique about Einstein's formulation of special relativity is the constructive introduction of radically new ideas concerning time and space. After those ideas have been sufficiently assimilated, it is difficult to appreciate fully the intellectual feat involved in that step. It was unquestionably one of rare physico-philosophic intuition and intellectual daring. That Einstein himself must have been aware (if perhaps only dimly) of the magnitude of the contribution contained in the first part of his paper is shown by the unusual pains taken there in explaining the relativity of simultaneity and the ensuing novel conceptions of time and space, as contrasted with the rapid development of the second part, dealing with applications to the electrodynamics of moving bodies—the avowed object of the paper. At the same time, the exposition in Part 1 gives also the impression of being in large measure a spontaneous recording of the flow of exciting ideas. Along with great lucidity, there exist, therefore, not surprisingly a few minor scarcely detectable logical flaws. 
The most apparent and perhaps the strangest lapse, and one which, unfortunately, has been imitated in many textbooks, is contained in the phrasing of the principle of relativity in Law 1, Sec. 2. The laws of physics are stated there to be independent of whether they are referred to "one or the other of two coordinate systems that are in uniform translational relative motion." Without specifying explicitly that one of the coordinate systems is inertial, the statement lacks precision, of course.

Less apparent and more innocuous are the logical questions which are left open, concerning the relationship between the definition of simultaneity in terms of Eq. (1) and postulate 2 of Sec. 2. These questions touch the discussion in Sec. 3 leading to the crucial transformation equations (9) and (10). That discussion, incidentally, clearly illustrates the spontaneous-outpouring character of the exposition, and at the same time represents convincing evidence (if any were needed) that in 1905 Einstein was unaware that the result
(9) was discovered by Lorentz a year earlier. Moreover, Einstein himself never reverted to the arguments employed in Sec. 3.38

As an indication of how these arguments can be sharpened (if we pause to examine them) with a simultaneous clarification of the above-mentioned logical questions, let us consider Einstein's discussion through Eqs. (1) and (3)-(5), culminating in the result (6) or, equivalently, the first of Eqs. (7). At this point Einstein uses an alternative expression of his second postulate, namely:

2'. The speed of light in vacuum has the same value in every inertial frame.

What can be shown is that both postulate 2' and the result (9) can now be obtained very simply by using an essentially self-evident relationship, namely, that the speed of relative motion of two inertial frames is the same as measured in each of these frames (with the use, of course, of identical standards of length and duration).39 Applying this relationship, we have (by understandable reasoning involving the spirit of Einstein's first postulate)

$$t = \phi(-v)\,\beta'\left[\tau + (v/V'^2)\,\xi\right],$$

where V' is the speed of light as measured in the frame k, and $\beta'$ is the corresponding value of (10). Eliminating $\tau$ from this equation and the first equation in (7), and identifying $\phi(-v)$ with $\phi(v) = \phi$ by Einstein's isotropy-of-space argument [discussion leading to Eq. (8)], we find

$$\xi = a\left[x - \frac{V^2}{v}\left(1 - \frac{1}{\phi^2\beta\beta'}\right)t\right], \qquad a = \frac{V'^2}{V^2}\,\phi\beta. \tag{13}$$
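The elimination of $\tau$ leading to Eq. (13) can be checked numerically. The sketch below solves $t = \phi\beta'[\tau + (v/V'^2)\xi]$ for $\xi$, with $\tau = \phi\beta(t - vx/V^2)$, and compares the result with the closed form $\xi = a[x - (V^2/v)(1 - 1/(\phi^2\beta\beta'))t]$. It is an illustration under assumptions: the function names are introduced here, $\phi$ is treated as a constant, and the coefficient $a = (V'^2/V^2)\phi\beta$ is the value obtained by carrying out the elimination, assumed here to be the intended form of Eq. (13).

```python
import math


def beta(v, c):
    """[1 - (v/c)^2]^(-1/2) for a given light speed c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def xi_by_elimination(x, t, v, V, V_prime, phi):
    """Solve t = phi*beta'*(tau + v*xi/V'^2) for xi, with tau from Eq. (7)."""
    tau = phi * beta(v, V) * (t - v * x / V**2)
    return (t / (phi * beta(v, V_prime)) - tau) * V_prime**2 / v


def xi_closed_form(x, t, v, V, V_prime, phi):
    """Closed form assumed for Eq. (13), with a = (V'/V)^2 * phi * beta."""
    b, bp = beta(v, V), beta(v, V_prime)
    a = (V_prime / V) ** 2 * phi * b
    return a * (x - (V**2 / v) * (1.0 - 1.0 / (phi**2 * b * bp)) * t)


# Illustrative values; V' is deliberately taken different from V.
args = dict(x=2.0, t=3.0, v=0.5, V=1.0, V_prime=1.2, phi=0.9)
print(xi_by_elimination(**args), xi_closed_form(**args))

# With V' = V and phi = 1 the result reduces to beta*(x - v*t).
print(xi_closed_form(2.0, 3.0, 0.5, 1.0, 1.0, 1.0), beta(0.5, 1.0) * (2.0 - 0.5 * 3.0))
```

The two expressions agree for arbitrary V' and $\phi$, and the special case V' = V, $\phi = 1$ reproduces the second of Eqs. (9).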
Since £ must be proportional to x-vt, it follows that 02/T-=l. This equation will violate the principle of relativity unless ff' = /3, i.e., V = V, which proves postulate 2'. At the same time, a in Eq. (13) assumes the value j3, and Eq. (13) reduces to the second of Eqs. (9). ACKNOWLEDGMENTS The author wishes to express his indebtedness for helpful suggestions to the referees, and to Professor Lothar Schafer of the Chemistry Department of the University of Arkansas. The author also thanks Dr. Otto Nathan, trustee of the estate of Albert Einstein, for his kind permission to publish in this Journal the translation of the first part of Einstein's paper, Ref. 2. 'The imprecision which inheres in the idea of the simultaneity of two events at (approximately) the same place and which must be likewise bridged
180
Lorentz and Poincare Invaxiance over by an abstraction will not be discussed here. "Time" signifies here "'time of the stationary system," as well as "the position of the hands of the moving clock situated at the place under consideration." '•Thai is. a body which is of spherical shape when examined at rest. 'The first such study was published in this Journal: G. Holton. Am. J. Phys. 28, 627 (I960). This paper is reprinted (with omission of two paragraphs of the introductory remarks) in G. I lolton. Thematic Origins of Scientific Thought: Kepler to Einstein (Harvard U. P., Cambridge, MA, 1973). Further papers on the subject by the author are also included in this book. Up-to-date references on the history of special relativity will be included in a forthcoming study. "Lorentz. Poincare. Einstein, and the Theory of Relativity." : A. Einstein, Ann. Phys. (Leipzig) 17, 891 (1905). 'A. Einstein el al.. The Principle of Reiatirily (Dover. New York; reprinting of a translation by VV. Pcrrett and C i. B. Jeffery, first published in 1923), pp. 37-65. This will hereafter be referred to as POR. 4 This is attested to by the observations contained in a number of references of the present paper. 5 C. W. Kilmister, Special Theory of Reiatirily (Pcrgamon, New York, 1970), pp. 187-218. This will hereafter be referred toasSTOR. "These consist of a number of utter mistranslations of isolated words, and of a misleading amplification of a passage, which has been corrected by C. Scribner, Jr., Am. J. Phys. 31. 398 (1963). Another mistranslation, reproduced in STOR, is pointed out in Ref. 23. Moreover, in addition to not reproducing the original notation, as is also true of POR, and reproducing a misprint only found in POR, this translation includes, apparently inadvertently, two footnotes not found in the original paper, Ref. 2, without reproducing all the additional footnotes contained in POR. 
'It can be argued that in certain respects this achievement surpasses even Einstein's creation of the theory of general relativity. "Only in one instance, namely in the line above Eq. (7), is a minor explanatory item (enclosed in brackets) added to facilitate the reading of the text. A few changes in mathematical notation have been introduced for typographical convenience. This is the literal translation of Einstein's "Lichtmedium," for which "luminiferous ether" is the more common designation, which he. in fact, employs later in the paragraph. "That is to say, no observable physical properties. ' 'This statement reflects Einstein's ignorance in 1905 of Lorentz's relevant work of 1904. 12 The original "die Aussagen eincr jeden Theorie" is translated in POR as "the assertions of any such theory." but the addition of the word "such" to the original phrase is unwarranted. Every theory, i.e., every fundamental theory, was then apparently to Einstein, as to most of his contemporaries, reducible essentially to a consideration of electromagnetic processes (with the possible exception of.gravitation). This might constitute also an explanation for Einstein's referring earlier explicitly only to "electrodynamic and optical laws" in his first pronouncement of the "principle of relativity." '•The original in footnote a reads "Die Ungenauigkeit, welche in dem Begriffe dcr Gleichzeitigkeit zweier Ereignisse an (annahernd) demselben Orte steckt und gleichfalls durch eine Abstraction iiberbruckt wcrden muss, soil hicr nicht erortert werden" (italics supplied). This is not rendered quite accurately by the following translation in POR: "We shall not here discuss the inexactitude which lurks in the concept b
of the simultaneity of two events at approximately the same place, which can only be removed by an abstraction." It is true that the word "gleichfalls" (likewise) appears to be redundant in Einstein's statement, as it stands, but it may well be a slip of the pen, reflecting the author's awareness of the many other tacit assumptions that necessarily enter in the framing of a fundamental theory of physics, and that are basically in the nature of abstractions.
14 The rendition in POR of the original "gewisser (gedachter) physikalischer Erfahrungen" as "of certain imaginary physical experiments" is hardly accurate. On the other hand, the literal translation of "Erfahrung," namely, "experience," is also hardly appropriate in the present context.
15 The original phrase is, "Wir setzen noch der Erfahrung gemäss fest." This could also be translated, as is done in POR, "In agreement with experience we further assume," but the logical status of this statement connected with relation (2) is not quite the same as that of the related assumption announced in the introduction and reiterated in Sec. 2. See the pertinent discussion in Sec. D.
16 That is, of the moving rod.
17 That is to say, in the physical space associated with the inertial coordinate system referred to in the first paragraph of Sec. 1, and designated there as the "stationary system."
18 There appears to be some logical conflict in the present use of the word "space" and the previous one, which is commented upon in Ref. 17. Apparently, this is an instance of Einstein's terminology being still steeped in the older mode of thinking in kinematics, while he was in the process of laying the foundations for a revolutionary change in that very mode of thinking.
19 This is a laconic statement in need of amplification.
20 The more direct translation in POR of the word "Inbegriff" as "summary" does not appear to adequately convey the idea intended in this sentence. "Inbegriff" is, in fact, not an easily translatable word.
21 Actually, it is only the special case associated with Eq. (1) that is being used here (cf. Sec. D).
22 This observation is superfluous when account is taken of the linearity of the transformation equations, which property is, in fact, utilized presently in arriving at Eq. (6).
23 That is to say, moving with this velocity relative to the coordinate frame k. This is explicit in the original phraseology, and is indeed what is required to arrive at the next set of equations. But this part of the sentence ("dass sich dessen Koordinatenursprung mit der Geschwindigkeit −v auf der H-Achse bewege") is mistranslated in POR as "such that the origin of coordinates of system k moves with velocity −v on the axis of X." This mistranslation is reproduced in STOR, but with the symbol X replaced by the original symbol S.
24 The letter H (used also earlier) stands for the capital of η.
25 The order of the following two coordinate sets should be reversed for agreement with that employed earlier.
26 This has not been proved. The clock experiences an acceleration (in fact, an infinite one) at the corners of the polygon, and therefore, strictly, an additional assumption is required here to the effect that the rate of a clock is not affected by acceleration.
27 The following footnote is included in POR: "Not a pendulum clock, which is physically a system to which the Earth belongs. This had to be excluded." This footnote is not contained in the original paper, but is
reproduced in STOR.
28 In POR the last term under the square root in Eq. (11) has an obvious misprint, not contained in the original paper, nor in the German edition of POR, but reproduced in STOR.
29 Since the relation (12) is actually valid for any value of w, this statement must be taken to imply Einstein's taking here for granted the "upper-limit" property of the speed of light.
30 Einstein does not yet use in this paper the designation "Maxwell-Lorentz equations."
31 Einstein's symbol K is replaced in POR (see Ref. 3) by e.
32 Einstein outlines a proof of this important relationship in his subsequent comprehensive paper on relativity, Jahrb. Radioakt. Elektron. 4, 409-462 (1907), p. 431. This essay will form the subject of sequel papers to the present one.
33 In the derivation of this result there occurs the curious coincidence that the same (somewhat unusual) method for finding the ratio of the volumes attributed by observers in two coordinate frames having uniform relative motion, to a sphere moving with constant velocity, is also found in Poincaré's Rendiconti paper on relativity. See, e.g., H. M. Schwartz, Am. J. Phys. 39, 1289 (1971), p. 1290; or STOR, p. 151. An outline of the proof by this method of the first equation on p. 58 of POR is given in H. M. Schwartz, Introduction to Special Relativity (McGraw-Hill, New York, 1968), p. 415 (answer to problem 2-5). A proof that is based more obviously on relativistic kinematics is also indicated there.
34 A. Einstein, Ann. Phys. (Leipzig) 17, 132 (1905).
35 Referring to the "longitudinal" and "transverse" masses corresponding to Eqs. (A), he remarks (Ref. 2, p. 919): "Naturally, we shall obtain other values for the masses with other definitions of force and of acceleration: from this it is seen that one must proceed with utmost caution when comparing different theories of the motion of the electron." A footnote is contained in POR (p.
63) pointing out a more suitable definition of force, later introduced by Planck and generally accepted. This footnote is, of course, not in the original paper, but is included in STOR.
36 A. Einstein, Ann. Phys. (Leipzig) 18, 639 (1905); POR, pp. 69-71.
37 This acknowledgment is remarkable, in the first place, because until fairly recently the identity of this important friend of Einstein's was not generally known. Now there is a book about him and his correspondence with Einstein, namely, P. Spezialli, Albert Einstein-Michele Besso: Correspondence 1905-1955 (Hermann, Paris, 1972). This book contains translations of the German letters into French, notes, and an illuminating biographical sketch of Besso by Spezialli. The acknowledgment is also intriguing in the questions it raises as to the nature of the assistance which Einstein received from Besso. The translation in POR, reproduced in STOR, namely, "and I am indebted to him for several valuable suggestions," is not an altogether precise rendition of the concluding phrase "und dass ich demselben manche wertvolle Anregung verdanke," which strictly means "and that I am obliged to him for some valuable stimulation."
38 See, e.g., his derivations of the Lorentz transformations in the reference in Ref. 32, and his popular exposition, Relativity (Crown, New York, 1961), Appendix I.
39 This result may be taken to follow from a straightforward application of the principle of sufficient reason. Einstein uses it without explanation in the second of Refs. 38 (first paragraph on p. 117).
On the Origins of the Special Theory of Relativity* GERALD HOLTON
Department of Physics, Harvard University (Received May 9, 1960) Einstein's early work on relativity theory is found to be related to his other work at that time (e.g., in subject matter and style). In addition to this element of internal continuity one finds also—as a key to a new evaluation of the significance of Einstein's contribution—an external continuity with the classic, Newtonian tradition governing restrictions on permissible hypotheses. On the other hand, Einstein's work is shown to have been, in important respects, more independent of other contemporary work in this field than has recently been proposed. These continuities and discontinuities are set forth to make the point that philosophical studies of scientific work should proceed on historically valid ground. Some guiding principles are indicated for dealing with conflicting source materials for such studies.
WHEN I received the persuasive invitation to speak today on a problem of theory construction and of the logic of discovery, I noted particularly the request to bring out the historical-sociological aspects. This directive was a pleasant surprise, for I recalled that Hans Reichenbach had flatly declared himself for the opposite view when he said "The philosopher of science is not much interested in the thought processes which lead to scientific discoveries ..., that is, he is not interested in the context of discovery, but in the context of justification".¹ If, therefore, I shall make some remarks on the origins of Einstein's special theory of relativity, I will be disobeying the Reichenbachian dictum. However, I draw further strength for this resolution from Einstein, who himself declared for the value of the historical treatment of the rise of key theories in science. In fact, it is appropriate to say at the very outset to an audience consisting primarily of philosophers of science that sound historical investigations have lately perhaps been overlooked as important bases of sound philosophical discussions. Some examples come to mind immediately. The crux of the Copernican revolution was initially not, as is maintained in some philosophical works, a pragmatic search for the smallest number of components with which to build a world system, nor was it the establishment of the possibility of relativism in the choice of coordinate systems. Rather, as historians of science have shown, it was a return to an earlier, even an Aristotelian austerity concerning the type of motion judged to be suitable for the construction of the world system, mixed with a commitment to a neo-Platonic epistemology that looked for the warrant of reality in a new direction. The importance of Kepler is not that he was a mystic, an obsessed searcher for empirical rules, or a master of the intuitive, "personal" way to scientific knowledge; on the contrary, it can be shown that he was the first of the modern mathematical physicists, the first to look with some success for one dynamical explanation of all celestial and terrestrial motions. Galileo, we have had to relearn only recently, was not the patron saint of laboratory experimentation, as philosophers of science have at times maintained. Concerning the abuse to which Newton's work has had to lend itself, the less said the better. Einstein's work has not been immune from this fate. I am suggesting that in this case, as in the others, we build our philosophical analyses of science on real ground instead of dubious models, that we examine what physics was like in Einstein's time, what he did and said, how he came to do and say these things, and how he changed his mind, not once, but often. I urge this not as an easy program (for it is not that), and even not just because it is in principle better to do justice to the work of a man on his own terms rather than to use his work for a purpose which may have been inherently foreign to him. I urge this, rather, because I believe that a future source of strength of scholarship in the philosophy of science lies in philosophical analysis of historical cases. I speak of Einstein's work because his case is both typical and special.
* Presented at the Symposium "Theory Construction in Logical and Historical Perspective" on December 27, 1959, organized by Section L (History and Philosophy of Science) of the AAAS, the American Philosophical Association, and the Philosophy of Science Association. Based on work-in-progress, supported in part by a grant of the National Science Foundation.
1 P. A. Schilpp, editor, Albert Einstein: Philosopher-Scientist (Library of Living Philosophers, 1949), p. 292.
The rise of relativity theory shares many features with the rise of other important scientific theories in our time, and in addition it is of course very much more: To find another work that illuminates as richly the relationship between physics, mathematics, and epistemology, or between experiment and theory,
or one with the same range of scientific, philosophical and general intellectual implications, one would have to go back to Newton's Principia. The theory of relativity was a key development, both in physical science itself and also in modern philosophy of science. The reason for its dual significance is that Einstein's work provided not only a new principle of physics, but, as A. N. Whitehead said, "a principle, a procedure, and an explanation." Accordingly, the commentaries on the historical origins of the theory of relativity have tended to fall into two classes, each having distinguished proponents: the one views it as a mutant, a sharp break with respect to the work of the immediate predecessors of Einstein; the other regards it as an elaboration of then current work, e.g., by Lorentz and Poincaré. To my mind, the Einsteinian innovation is understood best by superposition of both views, by seeing the discontinuity of methodological orientation within an historically continuous scientific development.² Before we come to discuss this, and if we take seriously my point of view, we should first be ready to investigate a number of real problems of the historical or even "historical-sociological" kind: What are the sources for a study of the origins of the special theory of relativity (RT) and what is their probable reliability? What was the state of science around 1905, what were the contributions which prepared the field for the RT, and what did Einstein know about them? What were the steps by which Einstein reached the conclusions he published in 1905? To what extent was this work a member of a continuous chain having as its immediate predecessors Lorentz and Poincaré? What was the role of experiment in the genesis of the RT, and what the role of the existence of contradictory hypotheses? What part did epistemological analysis play in Einstein's thought? What was the early reception of the RT among scientists?
In particular, what was Einstein's relation with Mach, Lorentz, and Planck? What may we say about the style of Einstein's work and his personal orientations? What, if anything, in the
2 G. Holton, IX Congreso International de Historia de las Ciencias, Guiones de las Comunicaciones (Barcelona-Madrid, 1959), Vol. II, p. 41.
origins and content of the RT is typical of other theories with great impact on science? And even, what methodological principles for the study of the history of science emerge from this study? We would find that the existing literature is not always of help in studying such questions. The literature on the RT is of course vast. LeCat³ listed over 3400 scientific papers in the field up to 1922, with an approximately exponential growth giving a sevenfold increase in seven years. Biographically or philosophically oriented analyses are also fairly numerous (for example, by Schlick, Reichenbach, Frank, Meyerson, Cassirer, Whitehead, Wenzel, Grünbaum, Polanyi, Margenau, Lenzen, Bridgman, and Northrop). It may be remarked that there has so far been no full-scale historical study (although one is now in progress). A number of valuable essays exist in this direction (for example, by Born, Dugas, Kuznetsov, von Laue, Pauli, Straneo, and Whittaker); these are generally concerned with the chronological development of physics, and typically constitute a portion of a longer work having a purpose different from that of a primarily historical-philosophical study. For the latter, the best source is at present indeed Einstein's own set of papers.
CONTINUITY IN EINSTEIN'S WORK
To these papers we must turn to discover, for example, the elements of continuity linking Einstein's first publication on the RT with his other work at the time and with the older tradition itself. After the paper of 1905,⁴ Einstein returned to the exposition of the RT several times, and each restatement is of interest. For instance, in his book Über die spezielle und die allgemeine Relativitätstheorie⁵ he emphasized in his introduction that "the author has made the greatest effort to present the main ideas ... on the whole in the sequence and in such context as they in fact arose." It is not surprising that the sequence given there is not in accord with the sequence of
3 Maurice LeCat, Bibliographie de la Relativité (Bruxelles, 1924).
4 A. Einstein, Ann. Physik 17, 891 (1905).
5 A. Einstein, (Braunschweig, 1916).
steps in the 1905 paper itself, but the historian of science finds an interesting problem in the fact that neither of these is in accord with other autobiographical or biographical accounts. When one studies the relativity papers in the larger contextual setting of Einstein's other scientific papers, particularly those on the quantum theory of light and on Brownian motion which also were written and published in 1905, one notices two crucial points. While the three epochal papers of 1905 (sent to the Annalen der Physik at intervals of less than eight weeks) seem to be in entirely different fields, closer study shows that they arose in fact from the same general problem, namely, the fluctuations in the pressure of radiation. In 1905, as Einstein later wrote to von Laue,⁶ he had already known that Maxwell's theory leads to the wrong prediction of the motion of a delicately suspended mirror "in a Planckian radiation cavity." This connects on the one hand with the consideration of Brownian motion as well as to the quantum structure of radiation, and on the other hand with Einstein's more general reconsideration of "the electromagnetic foundations of physics" itself.⁷ One also finds that the style of the three papers is essentially the same, and reveals what is typical of Einstein's work at that time. Each begins with the statement of formal asymmetries or other incongruities of a predominantly esthetic nature (rather than, for example, a puzzle posed by unexplained experimental facts), then proposes a principle (preferably one of the generality of, say, the second law of thermodynamics, to cite Einstein's repeated analogy) which removes the asymmetries as one of the deduced consequences, and at the end produces one or more experimentally verifiable predictions.
Specifically, Einstein's first paper on the quantum theory of light opens in a typical manner: "There exists a radical formal difference between the theoretical representations which physicists have constructed for themselves concerning gases and other ponderable bodies on the one hand, and
6 Letter of January 17, 1952 (unpublished). See also Max Born in Fünfzig Jahre Relativitätstheorie, edited by A. Mercier and M. Kervaire (Bern, 1955), pp. 248-249.
7 See footnote reference 1, p. 47.
Maxwell's theory of electromagnetic processes in so-called empty space on the other hand."⁸ The significant starting point is a formalistic difference between theoretical representations in two fields of physics which, to most physicists, were so widely separated that no such comparison would have invited itself and therefore no such discrepancy would be noted. The discrepancy Einstein points out is between the discontinuous or discrete character of particles and of their energy on one hand, and the continuous nature of functions referring to electromagnetic events and of the energy per unit area in an expanding wave front on the other hand. The discussion of the photoelectric effect, for which this paper is mostly remembered, occurs toward the end, in a little over two pages out of the total sixteen. The prescription for obtaining an experimental verification of his point of view is given in a single, typically succinct Einsteinian sentence (straight-line relation with constant slope between frequency of light and stopping potential for all electrode materials). In his second paper published in 1905,⁹ Einstein points out in the second paragraph that the range of application of classical thermodynamics may be discontinuous even in volumes large enough to be microscopically observable. He ends with the equation giving Avogadro's number in terms of observables in the study of particle motion, and with the one-sentence exhortation: "May some investigator soon succeed in deciding the question which has been raised here, and which is important for the theory of heat!" Significantly, Einstein reported the following year¹⁰ that only after the publication of this paper was his attention drawn to the experimental identification, as long ago as 1888, of Brownian motion with the effect whose existence he had deduced as a necessity from the kinetic-molecular theory.
In his autobiographical notes he repeats that he did the work of 1905 "without knowing that observations concerning Brownian motion were already
8 A. Einstein, Ann. Physik 17, 132 (1905).
9 A. Einstein, Ann. Physik 17, 549 (1905).
10 A. Einstein, Ann. Physik 19, 371 (1906).
long familiar".¹¹
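The "straight-line relation with constant slope" between light frequency and stopping potential, mentioned above in connection with the light-quantum paper, can be illustrated with a minimal sketch. It assumes the modern form of the relation, e·V = h·ν − W, which is not Einstein's own notation; the function names and the two work-function values are hypothetical and chosen only for illustration.

```python
import math

PLANCK_H = 6.62607015e-34            # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def stopping_potential(frequency_hz, work_function_ev):
    """Stopping potential in volts from e*V = h*nu - W (modern notation).

    Below the threshold frequency no electrons are emitted, so the
    result is clipped at zero.
    """
    photon_energy_ev = PLANCK_H * frequency_hz / ELEMENTARY_CHARGE
    return max(photon_energy_ev - work_function_ev, 0.0)


def slope(work_function_ev, f_low=1.5e15, f_high=2.0e15):
    """Slope dV/dnu between two frequencies that both lie above threshold."""
    dv = (stopping_potential(f_high, work_function_ev)
          - stopping_potential(f_low, work_function_ev))
    return dv / (f_high - f_low)


# Two hypothetical electrode materials (work functions in eV, illustrative only).
slope_a = slope(2.3)
slope_b = slope(4.3)
```

Under these assumptions both materials give the same slope, h/e, and only the intercept depends on the electrode; this is the sense in which the slope is constant "for all electrode materials."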
The third paper of 1905¹² is, of course, Einstein's first paper on the RT. He begins again by drawing attention to a formal asymmetry, i.e., in the description of currents generated during relative motion between magnets and conductors. The paper does not invoke explicitly any of the several well-known experimental difficulties, and the Michelson and Michelson-Morley experiments are not even mentioned when the opportunity arises to show in what manner the RT accounts for them. At the end, Einstein briefly mentions here, too, specific predictions of possible experiments (giving the equation "according to which the electron must move in conformity with the theory presented here").¹³
RETURN TO A CLASSIC RESTRICTION ON HYPOTHESES
The recognition of these common elements in the three papers prepares us for the essential realization that the fundamental postulates appearing in each of the three papers are heuristic. The heuristic nature of the postulate of relativity was from the beginning apparent to Einstein (as he asserted in 1907 and later) because of the restriction of the RT to translational motions and to gravitation-free space.¹⁴ The study of the three papers together reveals also the extent to which Einstein's RT represents an attempt to restrict hypotheses to the most general kind and the smallest number possible, a goal on which Einstein often insisted.¹⁵ In the 1905 paper on RT, he makes, in addition to the
11 See footnote reference 1, p. 47. See also L. Infeld, Albert Einstein (New York, 1950), pp. 97-98.
12 A. Einstein, Ann. Physik 17, 891 (1905).
13 See footnote reference 12, p. 921.
14 On a few occasions, although not in the original paper, Einstein made this point [e.g., Ann. Physik 23, 206 (1907)]: "The relativity principle [is to be regarded] ... solely as a heuristic principle, which, considered by itself, contains only assertions about rigid bodies, clocks, and light signals."
15 Cf. A. Einstein, "The Problem of Space, Ether, and the Field in Physics," Ideas and Opinions by Albert Einstein, translated and revised by Sonja Bargmann (New York, 1954), p. 282: "The theory of relativity is a fine example of the fundamental character of the modern development of theoretical science. The initial hypotheses become steadily more abstract and remote from experience. On the other hand, it gets nearer to the grand aim of all science, which is to cover the greatest possible number of empirical facts by logical deductions from the smallest possible number of hypotheses or axioms."
two "conjectures" raised to "postulates" (i.e., of relativity and of the constancy of light velocity) only four other hypotheses: one of the isotropy and homogeneity of space, the others concerning three logical properties of the definition of synchronization of watches. In contrast, H. A. Lorentz's great paper which appeared a year before Einstein's publication¹⁶ and typified the best work in physics of its time (a paper which Lorentz declared to be based on "fundamental assumptions" rather than on "special hypotheses") contained in fact eleven ad hoc hypotheses: restriction to small ratios of velocities v to light velocity c; postulation a priori of the transformation equations (rather than their derivation from other postulates); assumption of a stationary ether; assumption that the stationary electron is round; that its charge is uniformly distributed; that all mass is electromagnetic; that the moving electron changes one of its dimensions precisely in the ratio of $(1 - v^2/c^2)^{1/2}$ to 1; that forces between uncharged particles and between a charged and uncharged particle have the same transformation properties as electrostatic forces in the electrostatic system; that all charges in atoms are in a certain number of separate "electrons"; that each of these is acted on only by others in the same atom; and that atoms in motion as a whole deform as electrons themselves do. It is for these reasons that Einstein later maintained that the RT grew out of the Maxwell-Lorentz theory of electrodynamics "as an amazingly simple summary and generalization of hypotheses which previously have been independent of one another ...".¹⁷ If one has studied the development of scientific theories, one notes here a familiar theme: the so-called scientific "revolution" turns out to be at bottom an effort to return to a classical purity.
This is not only a key to a new evaluation of Einstein's contribution, but indicates a fairly general characteristic of great scientific "revolutions." Indeed, while it is usually stressed that Einstein challenged Newtonian physics in fundamental ways, the equally correct but neglected point is the number of methodological correspondences with earlier classics, for example, with the Principia. Here a listing of some main parallels between the two works must suffice: the early postulation of general principles which in themselves do not spring directly from experience; the limitation to a few basic hypotheses¹⁸; the exceptional attention to epistemological rules in the body of a scientific work; the philosophical eclecticism of the author; his ability to dispense with mechanistic models in a science which in each case was dominated at the time by such models¹⁹; the small number of specific experimental predictions; and the fact that the most gripping effect of the work is its exhibition of a new point of view. The central problem, moreover, is the same in both works: the nature of space and time, and what follows from it for physics. Here, the basic attitudes have in both cases more in common than appears at first reading. That Newton's absolute space and absolute time were not meaningful concepts in the sense of laboratory operations, was, of course, not the original discovery of Mach; rather, it was freely acknowledged by Newton himself. But Einstein was also quite explicit that in replacing absolute Newtonian space and time with an infinite ensemble of rigid meter sticks and ideal clocks he was not proposing a laboratory-operational definition. He stated it could be realized only to some degree, "not even with arbitrary approximation," and that the fundamental role of the whole conception, both on factual and on logical grounds "can be attacked with a certain right."²⁰ Thus the RT
16 H. A. Lorentz, Proc. Acad. Sci. Amsterdam 6, 809 (1904). This paper, originally presented as part of the proceedings of the meeting of April 23, 1904, was first published in June, 1904 in the Dutch language edition of the Proceedings [12, 986-1009 (1904)].
17 See footnote reference 5, p. 28. See also A. Einstein, Scientia 15, 338 (1914).
18 Wolfgang Pauli, in Theory of Relativity [(B. G. Teubner, Leipzig, 1921 and Pergamon Press, New York, 1958), p. 5], unwittingly draws forceful attention to this particular point when summarizing his analysis of the RT in the following words: "The postulate of relativity implies that a uniform motion of the center of mass of the universe relative to a closed system will be without influence on the phenomena in such a system." Note the correspondence with the main hypothesis in the last edition of the Principia.
19 Cf. Max von Laue, Naturwissenschaften 43, 1 (1956).
20 Les Prix Nobel en 1921-1922 (Stockholm, 1923), p. 2. See also A. Einstein, Naturwissenschaften 6, 692 (1918).
merely shifted the locus of space-time from the sensorium of Newton's God to the sensorium of Einstein's abstract Gedankenexperiments, as it were, the final secularization of physics. In his tribute on the occasion of the 200th anniversary of Newton's death, Einstein wrote: "I must emphasize that Newton himself was better aware of the weakness inherent in his intellectual edifice than the generation of learned scientists which followed him. This fact has always aroused my deep admiration ... ."²¹ He then immediately draws attention to the fact that "Newton's endeavors to represent his system as necessarily conditioned by experience and to introduce the smallest number of concepts not directly referable to empirical objects is everywhere evident." He recalls that Newton regarded the law of gravitational interaction as a heuristic device, "not supposed to be a final explanation, but a rule derived by induction from experience." When the essay ends with Einstein clearly associating himself with a view of causality which he characterizes as "Newtonian," he could well have widened the context of that remark.
Chapter 3
Inquiries Regarding the Constancy of the Speed of Light
W. Ritz (1908), R. C. Tolman (1910), J. Kunz (1910), D. F. Comstock (1910), W. Pauli (1921).
The Postulate of the Constancy of the Speed of Light. Ritz's and Related Theories¹
W. Pauli
It will be shown in the next section that the constancy of the speed of light, together with the principle of relativity,² leads to a new concept of time. For this reason W. Ritz³ and, independently, Tolman,⁴ Kunz⁵ and Comstock⁶ have raised the following question: Could one not avoid such radical deductions and still preserve agreement with experiment, by refusing the constancy of the speed of light and accepting only the first postulate? It is clear then that one would have to give up not only the idea of the existence of an aether but also Maxwell's equations for the vacuum, so that the whole of electrodynamics would have to be re-created. Only Ritz has succeeded in accomplishing this in a systematic theory. He accepts the equations

$$\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial \mathbf{H}}{\partial t} = 0, \qquad \operatorname{div}\mathbf{H} = 0,$$

so that the field intensities can be derived, just as in the conventional electrodynamics, from a scalar and vector potential

$$\mathbf{E} = -\operatorname{grad}\phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$

However, the equations

$$\phi(P,t) = \int \frac{\rho\, dV_{P'}}{[r_{PP'}]_{t'=t-(r/c)}}, \qquad \mathbf{A}(P,t) = \int \frac{(1/c)\,\rho\,\mathbf{v}\, dV_{P'}}{[r_{PP'}]_{t'=t-(r/c)}}$$

of the conventional electrodynamics are now replaced by

$$\phi(P,t) = \int \frac{\rho\, dV_{P'}}{[r_{PP'}]_{t'=t-(r/[c+v_r])}}, \qquad \mathbf{A}(P,t) = \int \frac{(1/c)\,\rho\,\mathbf{v}\, dV_{P'}}{[r_{PP'}]_{t'=t-(r/[c+v_r])}}.$$
This corresponds to the assumption that it is only the speed of a light wave
relative to its source, and analogously the speed of an electromagnetic disturbance relative to the electron, which is equal to c. We shall call all theories which are based on this assumption "emission theories". Since the relativity principle is automatically satisfied by all such theories, they can all explain Michelson's interferometer experiment. It remains to be investigated whether they are also consistent with the results of other optical experiments.⁷
References
1. An excerpt translated from W. Pauli, Collected Scientific Papers by Wolfgang Pauli (edited by R. Kronig and V. F. Weisskopf, Interscience Publishers, New York, 1964), pp. 11-12.
2. That is, the first postulate.
3. W. Ritz, 'Recherches critiques sur l'électrodynamique générale', Ann. Chim. Phys., 13 (1908) 145 [Coll. Works, p. 317]; 'Sur les théories électromagnétiques de Maxwell-Lorentz', Arch. Sci. Phys. Nat., 16 (1908) 260 [Coll. Works, p. 447]; 'Du rôle de l'éther en physique', Riv. Sci., Bologna, 3 (1908) 260 [Coll. Works, p. 477]; see also P. Ehrenfest, 'Zur Frage nach der Entbehrlichkeit des Lichtäthers', Phys. Z., 13 (1912) 317; 'Zur Krise der Lichtäther-Hypothese', lecture delivered in Leyden, 1912 (Berlin, 1913).
4. R. C. Tolman, Phys. Rev., 30 (1910) 291 and 31 (1910) 26.
5. J. Kunz, Amer. J. Sci., 30 (1910) 1313.
6. D. F. Comstock, Phys. Rev., 30 (1910) 267.
7. For further discussions, see W. Pauli, The Theory of Relativity (translated by G. Field, Pergamon Press, London, 1958), pp. 6-9.
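The difference between the two retardation rules in the Pauli excerpt above, t' = t − r/c in conventional electrodynamics and t' = t − r/(c + v_r) in Ritz's theory, can be made concrete with a minimal sketch. The function names are hypothetical, and the sign convention is an assumption: v_r is taken as the component of the source velocity along the line from the source to the field point, positive when the source approaches.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def retarded_time_conventional(t, r):
    """Emission time t' = t - r/c: the signal speed is c regardless of source motion."""
    return t - r / C


def retarded_time_emission(t, r, v_r):
    """Emission time t' = t - r/(c + v_r): the signal speed is c relative to the source.

    v_r is the radial velocity component of the source toward the field
    point (assumed sign convention; positive means approaching).
    """
    return t - r / (C + v_r)


# A source at rest gives identical results under both rules.
at_rest = retarded_time_emission(1.0, C, 0.0)

# For an approaching source the emission theory predicts a shorter delay,
# hence a later emission time than the conventional rule.
approaching = retarded_time_emission(1.0, C, 0.01 * C)
```

For a source at rest the two prescriptions coincide, which is why such theories agree with experiments in which source and apparatus share the same motion; they differ only when the source moves relative to the observer.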
Lorentz and Poincare Invariance
Critical Researches on General Electrodynamics
Walter Ritz
Translated (1980) by Robert S. Fritzius 1 from Recherches critiques sur l'Électrodynamique Générale, Annales de Chimie et de Physique, 13, 145, 1908.
1 The Introduction and First Part are available in hard copy from R. S. Fritzius, 305 Hillside Drive, Starkville, MS 39759. For a short biography of Ritz, see Appendix C in this volume.

Introduction

Electric and electrodynamic phenomena have acquired in the course of these last years more and more importance. They include Optics, the laws of radiation and the innumerable molecular phenomena associated with the presence of charged centers, ions and electrons. Finally, with the notion of electromagnetic mass, mechanics itself seems obliged to become a chapter of general electrodynamics. In the form given to it by H. A. Lorentz, Maxwell's theory would thus become the turning point towards a new conception of nature, where the laws of electrodynamics, considered as primary, would
contain the laws of motion as special cases and would play the fundamental role in the physical theories which, until now, have belonged to mechanics. Under these circumstances, it is plainly desirable to have a rigorous criticism of the foundations of this theory, to give it the degree of clarity and precision that mechanics itself reached only recently after much controversy. It is in order to ask which hypotheses are essential and can be deduced from observations, which others are logically useless or can be discarded without experience ceasing to be adequately represented, and finally, which are those which can be, and should be (Oeuvres 318) rejected; a question which is asked principally in regard to absolute motion. In the first part of his Lessons on Electricity and Optics2 Poincare devoted some classic pages to the criticism of the more or less distinctive theories of Maxwell himself and of Hertz. Therefore I will concern myself only with the form that the theory took in the hands of Lorentz, a form that presents well known advantages. Some of his results can easily be extended to the other theories. Here again, I only have to recall or to complete the ideas put forward by Poincare and more importantly by Lorentz who was well aware of the different aspects under which his theory could be interpreted. In general, I set aside the phenomena of molecular order, dependent on the corpuscular theory of electricity. That fruitful concept is evidently independent in large part of ideas that we can develop about the mode of action of electric charges on one another via the ether medium, which are more specifically the object of electrodynamic equations. The result of these researches has not been favorable to the existing theories. The discussions about the difficulties that they raise show that the difficulties have a common origin intimately linked to the concept of ether, which is the basis of all these theories. 
We will see specifically that:

1° From a strictly logical point of view, the electric and magnetic forces, which in appearance play so fundamental a role in the theory, are notions that we can eliminate entirely; in reality they contain only relations of space and time. We thus return to the old elementary actions, with the sole difference that they are no longer instantaneous.

2° The theory permits an infinite number of solutions, each satisfying all the conditions, but incompatible with experience and even leading, for example, to perpetual motion. To remove these solutions we must admit by hypothesis the formulae for retarded potentials. These formulae introduce irreversibility into electrodynamics (Oeuvres 319), whereas the general equations permit reversibility. I show that, contrary to accepted ideas, they cannot be deduced from a proper specialization of the initial state. They constitute a new hypothesis, making useless the partial differential equations. To clarify this hypothesis it is necessary to distinguish the elementary actions; it is to renounce Maxwell's fundamental idea of rejecting them.

3° The notion of localization of energy in the ether is vague and allows many simple solutions.

4° The impossibility, described by Maxwell, of reducing gravitation to the same notions: the fact that the negative energy involved would correspond to an unstable system shows that these ideas do not have general applicability to the forces of nature.

5° Action and reaction are not equal, and this inequality, in the manner in which it is deduced from the introduction of absolute velocities, is contrary to experience.

6° Kaufmann's experiments on the electric and magnetic deviability of the β rays of radium don't prove that the mass of electrons is entirely of electromagnetic origin and dependent on their absolute velocity, because, on the one hand, nothing obliges us to believe, as in Lorentz's theory, that the forces are linear functions of velocity (this may be true at small velocities), and, on the other hand, one of Trouton and Noble's experiments shows that the expression of electromagnetic momentum as a function of velocity, from which Abraham has deduced that of electromagnetic mass, is certainly inexact.

7° The theory of Maxwell and of Lorentz starts from a system of absolute coordinates, that is to say, independent of all motions of matter. In order to be in agreement with experimental results, which have always, in optics and electrodynamics as well as in mechanics, confirmed the principle of relative motions, we are obliged, then, to eliminate this absolute system by hypotheses of little credibility, thus eliminating the notion of solid bodies and the concept of the invariability of ponderable masses.

2 H. POINCARÉ, Électricité et Optique: La lumière et les théories électrodynamiques, 2nd edition, Paris, 1901.
It will be necessary also, to change the principles of kinematics, to consider the rule of the velocity parallelogram just as a first approximation, valid at small speeds, (Oeuvres 320) and to make time and simultaneity completely relative notions. It would be regrettable, for the economy of our thinking if we had to live with all the complications listed above. I think, that instead of kinematics, it will be the ether hypothesis, and with it, the representation of phenomena by partial differential equations, that must be abandoned. The necessity to explain why bodies do not meet any resistance from the ether as they pass through it, and the fact that they do not modify its state, and many other considerations, have created a simple physical space out of Fresnel's mechanical ether, perfectly penetrable by matter, a system of absolute coordinates. The ether is now only a mathematical abstraction and its elimination would only be the final phase of a long evolution.
This conclusion, as I set it forth, is not at all involved with a return to actions at a distance. Nevertheless, it indeed collides head-on with many currently accepted ideas, and I am the first to admit that a hypothesis which has rendered such great services to science can't be condemned for the sole reason that it presently raises some seemingly inextricable difficulties. We should always hope for future solutions of these difficulties, or accept the idea that they are inherently a part of things, and independent of our models. This is, fortunately, not the case. This is what I have sought to demonstrate in the Second Part of this work, but the theory which I will present does not pretend to be a satisfactory and definite solution to a problem so difficult. Its primary purpose is to show how large is the unknown part which, in spite of recent advances, still exists in this domain, and in what measure, it's much smaller than we would be tempted to believe. Experimental evidence may be considered as confirmation of Maxwell and Lorentz's theory, even when we adopt, as I will do, the remarkable ideas of this latter savant on the atomic constitution of electricity, the nature of conduction current and of dielectrics, in a nutshell, the theory of electrons. These researches will show that it is not necessary to introduce absolute motion and thus to upset kinematics and the notion of time; relative velocities alone will suffice. There will be no use of notions subject to criticism such as polarization, the electric vector the magnetic (Oeuvres 321) force, etc., but only the notions of time, space, and electric charges, these latter only playing, like the masses in mechanics, the role of coefficients conveniently chosen and invariable for a given ion or electron. In a certain sense it is a mechanical theory of electricity. 
But I have not believed it advisable to bring in the more or less complicated latent mechanisms which play such an important role in Maxwell's theory. Those hypotheses are unnecessary, and, one must say, barely satisfactory. It suffices, indeed, to recall that ponderable bodies must pass through these complicated mechanisms without disturbing them and without feeling sensible action, even when their speed reaches that of celestial bodies. Impenetrability, in particular, doesn't exist in the mechanical theories, and this is the one point which isn't always sufficiently placed in evidence. Experience has shown that actions are not instantaneous; also it hasn't revealed any trace of a medium which could exist in materially empty space. I therefore felt I could restrict myself to give to the law of propagation of these actions, a very simple kinematic interpretation borrowed from the emanative theory of light and satisfying the principle of relativity of motion. Fictitious particles are constantly emitted in all directions by electric charges. They keep on moving indefinitely in straight lines with constant speed, even through material bodies. The action under gone by a charge depends uniquely on the disposition, velocity, etc., of these particles
in its immediate surroundings. The particles are therefore simply a concrete representation of kinematic and geometric data. These hypotheses are sufficient for the purely critical objective that I suggest to reach here. They permit study in detail of the law of elementary actions between electrons in motion and show in particular, that this law, almost entirely unknown for great speeds, requires, even at small speeds, an indeterminate parameter K, which is not without analogy with the one that Helmholtz has introduced in his theory. I need to specify the temporary scope of these hypotheses. Indeed, when the particles (or, if we like, the actions or energies) emitted by an electrified body reach another electric charge and modify its motion, the principle (Oeuvres 322) of action and reaction demands that they undergo on their part, a deviation, or a change, and it is very remarkable that Fizeau's experiment on the entrainment of waves, like certain other facts of optics, is not compatible with the hypothesis admitted here, and demands such a reaction. It's the opposite that happens in the ether hypothesis, as Poincare so presented it. Hertz's theory, which satisfies the principle of action and reaction, is incompatible with Fizeau's experiment. Lorentz's theory, which doesn't satisfy it, explains the experiment perfectly. But Poincare has shown that in giving a momentum to the radiant energy, everything falls into place; therefore this hypothesis is natural if this energy is projected instead of being propagated. It is precisely this that permits safeguarding this principle in the new model that I propose. We can even foresee the possibility of obtaining, by these principles, the electrodynamic terms that depend on velocity and acceleration, using only the consideration of propagation 3 , a problem that Gauss posed in his well know letter to W. 
Weber, and that Maxwell's theory didn't solve because it introduces to these terms a special quantity, the vector potential. I will return to these questions later. The remarks which precede are sufficient to explain why I didn't address optics in this criticism. In many respects, the new theory will therefore bring the reader back to some older classical ideas, which seemed destined to be forgotten. The interpretation of certain experiments will necessarily be modified. In particular, perhaps part or the whole of mass will be electromagnetic in origin, but it will be constant and won't depend on an absolute velocity. It is the forces, and not the mass, that change. Kaufmann's experiments also permit this new viewpoint. The new formulae are applicable to gravitation; they permit removing, in large part, the most apparent divergence which exists at this time between calculation and experiment regarding the perihelion motion of Mercury.

3 The word "projection" may have been intended here. (Trans.)
The theory of electrons has constituted a first partial return from Maxwell's ideas to others much older, and for those who consider as indispensable a new evolution in (Oeuvres 323) the same sense, Lorentz's hypotheses, which have been so fruitful, will maintain their importance, and the mathematical form that he gave them will continue in many cases to be the most elegant and the most practical.

FIRST PART

1. - RECALL OF LORENTZ'S THEORY 4

We know that Maxwell did not work on hypotheses on the nature of electric currents. Lorentz claims that all conduction current results from motion of electric particles, which are subjected to a kind of friction in conductors and to elastic forces in dielectrics. Many experiments have, in the last years, confirmed this hypothesis. This concept has permitted Lorentz to consider only the dielectric ether in his fundamental equations. Renouncing a purely mechanical explanation and the impenetrability of matter, Lorentz considers the ether as immobile and even present in the interior of ions and electrons. The ions and electrons modify it physically, and this modification, which is difficult to picture in a concrete fashion, is characterized by two vectors: the electric or dielectric displacement vector E, whose components are $E_x, E_y, E_z$, and the magnetic vector H ($H_x, H_y, H_z$). The electric charges are fixed to the ions, which are considered undeformable. With $\rho$ the electric density, measured in electrostatic units at (x, y, z) at time t, the system of coordinates being connected to the stable ether, $\mathbf{v}$ the velocity of the electric substance at (x, y, z, t), and c the speed of light, we have, for these constraints, the system of equations

(I) $\operatorname{curl}\mathbf{H} = \dfrac{1}{c}\dfrac{\partial \mathbf{E}}{\partial t} + 4\pi\rho\,\dfrac{\mathbf{v}}{c}$

(II) $\operatorname{curl}\mathbf{E} = -\dfrac{1}{c}\dfrac{\partial \mathbf{H}}{\partial t}$

(III) $\operatorname{div}\mathbf{E} = 4\pi\rho$

(IV) $\operatorname{div}\mathbf{H} = 0$

4 H. A. LORENTZ, Archives néerl., vol. XXV, 1892; Versuch einer Theorie der elektrischen und optischen Erscheinungen in bewegten Körpern, Leiden, 1895; Elektronentheorie: Enzyklopädie der math. Wissenschaften, Bd. V, Heft 1, Leipzig, 1904. POINCARÉ, Électricité et Optique, Chap. III, p. 422.
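(V) $\dfrac{\partial \rho}{\partial t} + \operatorname{div}\rho\mathbf{v} = 0$

The field thus created in the ether by the other charges present exerts on the element of charge $\rho\,d\tau$ the mechanical force represented in magnitude and direction by the vector $\mathbf{F}\rho\,d\tau$, where

(VI) $\mathbf{F} = \mathbf{E} + \dfrac{1}{c}(\mathbf{v} \times \mathbf{H})$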
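As a small illustration of (VI), the force per unit charge can be evaluated numerically. The sketch below is not part of Ritz's text: the function names `cross` and `force_per_unit_charge` and the sample values are assumptions chosen for illustration, with all quantities taken in Gaussian units as in equations (I) to (VI).

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def force_per_unit_charge(E, H, v, c):
    """Evaluate F = E + (1/c) (v x H), i.e. equation (VI), component by component."""
    v_cross_H = cross(v, H)
    return tuple(E[i] + v_cross_H[i] / c for i in range(3))


# Hypothetical sample values: E along x, H along z, charge moving along y.
F = force_per_unit_charge(E=(1.0, 0.0, 0.0), H=(0.0, 0.0, 2.0), v=(0.0, 3.0, 0.0), c=3.0)
print(F)
```

For a charge at rest (v = 0) the second term vanishes and F reduces to E, which is the electrostatic case.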
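In this theory there is no magnetism. The magnetization comes from Amperian particle currents. By means of certain hypotheses, to which we will return, this system of equations integrates itself by the introduction of retarded potentials. As a matter of fact we show that all solutions of the system (I) to (V), where we assume the data $\rho, v_x, v_y, v_z$, can be expressed in the form

(VII) $\mathbf{E} = -\operatorname{grad}\Phi - \dfrac{1}{c}\dfrac{\partial \mathbf{A}}{\partial t}$

(VIII) $\mathbf{H} = \operatorname{curl}\mathbf{A}$

the functions $\Phi$ (the scalar potential) and $A_x, A_y, A_z$ (components of the vector potential) being continuous with their first derivatives in all space, from zero to infinity, and satisfying the equations

(IX) $\dfrac{1}{c^2}\dfrac{\partial^2 \Phi}{\partial t^2} - \Delta\Phi = 4\pi\rho$

(X) $\dfrac{1}{c^2}\dfrac{\partial^2 A_x}{\partial t^2} - \Delta A_x = \dfrac{4\pi\rho v_x}{c}, \qquad \dfrac{1}{c^2}\dfrac{\partial^2 A_y}{\partial t^2} - \Delta A_y = \dfrac{4\pi\rho v_y}{c}, \qquad \dfrac{1}{c^2}\dfrac{\partial^2 A_z}{\partial t^2} - \Delta A_z = \dfrac{4\pi\rho v_z}{c}$

and

(XI) $\operatorname{div}\mathbf{A} = -\dfrac{1}{c}\dfrac{\partial \Phi}{\partial t}$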
Lorentz satisfies these conditions by setting

(XII) $\Phi(x, y, z, t) = \iiint \dfrac{[\rho']}{r}\,d\tau', \qquad r = \left[(x - x')^2 + (y - y')^2 + (z - z')^2\right]^{1/2}$

(XIII) $A_x = \dfrac{1}{c}\iiint \dfrac{[\rho' v'_x]}{r}\,d\tau', \qquad A_y = \dfrac{1}{c}\iiint \dfrac{[\rho' v'_y]}{r}\,d\tau', \qquad A_z = \ldots$

These expressions have the form of Newtonian potentials, with the difference that instead of taking the value of $\rho$ at point x'y'z' at time t, we have to take it at the previous time $t' = t - r/c$, the time r/c being necessary because of the propagation. This is what we will point out along with Lorentz by the notation

$[\rho'] = \rho\left(x', y', z', t - \dfrac{r}{c}\right), \qquad [\rho' v'_x] = \rho\left(x', y', z', t - \dfrac{r}{c}\right) v_x\left(x', y', z', t - \dfrac{r}{c}\right).$
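The field is therefore completely determined, and in introducing expressions (XII) and (XIII) into formulae (VII) and (VIII) (Oeuvres 326) and hence into (VI), we obtain for $F_x$

(XIV) $F_x = \iiint \left\{ -\dfrac{\partial}{\partial x}\dfrac{[\rho']}{r} - \dfrac{1}{c^2}\dfrac{\partial}{\partial t}\dfrac{[\rho' v'_x]}{r} + \dfrac{v_x}{c^2}\dfrac{\partial}{\partial x}\dfrac{[\rho' v'_x]}{r} + \dfrac{v_y}{c^2}\dfrac{\partial}{\partial x}\dfrac{[\rho' v'_y]}{r} + \dfrac{v_z}{c^2}\dfrac{\partial}{\partial x}\dfrac{[\rho' v'_z]}{r} - \dfrac{v_x}{c^2}\dfrac{\partial}{\partial x}\dfrac{[\rho' v'_x]}{r} - \dfrac{v_y}{c^2}\dfrac{\partial}{\partial y}\dfrac{[\rho' v'_x]}{r} - \dfrac{v_z}{c^2}\dfrac{\partial}{\partial z}\dfrac{[\rho' v'_x]}{r} \right\} d\tau'$

and similar expressions for $F_y, F_z$. In introducing the total derivative

$\dfrac{d}{dt} = \dfrac{\partial}{\partial t} + v_x\dfrac{\partial}{\partial x} + v_y\dfrac{\partial}{\partial y} + v_z\dfrac{\partial}{\partial z}$

and setting

(XV) $L(x, y, z, t, v_x, v_y, v_z) = \iiint \dfrac{[\rho']}{r}\left\{ 1 - \dfrac{v_x[v'_x] + v_y[v'_y] + v_z[v'_z]}{c^2} \right\} d\tau'$

Schwarzschild 5 has found for $F_x$ the remarkable form

(XVI) $F_x = -\dfrac{\partial L}{\partial x} + \dfrac{d}{dt}\dfrac{\partial L}{\partial v_x}$

This is the form of Lagrange's equations. The expressions (XIV) and (XVI) give the force undergone by an electric point of unit charge, expressed by means of elementary actions analogous to the ones considered in the old electrodynamics, without the notion of non-instantaneous transmission that we find again in Gauss and C. Neumann.

5 Göttinger Nachr., Math.-Phys. Klasse, 1903, p. 126.

A charge e, sensibly point-like, exerts, under very general considerations, on another similar charge e', a force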
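(XVII) $F_x = e e'\left(-\dfrac{\partial \Lambda}{\partial x} + \dfrac{d}{dt}\dfrac{\partial \Lambda}{\partial v_x}\right), \qquad F_y = e e'\left(-\dfrac{\partial \Lambda}{\partial y} + \dfrac{d}{dt}\dfrac{\partial \Lambda}{\partial v_y}\right), \qquad \ldots$

where

$\Lambda = \dfrac{1}{r}\left\{ 1 - \dfrac{v_x[v'_x] + v_y[v'_y] + v_z[v'_z]}{c^2} \right\},$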
$r^2 = \left[x(t) - x'\left(t - \dfrac{r}{c}\right)\right]^2 + \left[y(t) - y'\left(t - \dfrac{r}{c}\right)\right]^2 + \left[z(t) - z'\left(t - \dfrac{r}{c}\right)\right]^2$

This expression reduces itself, to a first approximation, to the law of distance squared. We can therefore name it the generalized Newton's law. This explicit expression will be given later on. In these formulae the notion of field doesn't come into play. It is very remarkable that Clausius, like Weber, in seeking to account for the electrodynamic actions by means of actions at a distance, depending on positions, velocities and accelerations of the electric points, has derived the same formulae (XV) and (XVI) with the only difference that the actions are instantaneous, so that we have to take the values of $\rho'$ and $v'$ at time t and not at time $t - r/c$.
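The relation defining r above is implicit, since r appears on both sides through the retarded instant t - r/c. A minimal sketch of how it might be solved numerically by fixed-point iteration follows. The function `retarded_distance`, the callable `source_path`, and the uniformly moving test source are hypothetical names and data introduced for illustration; convergence is only assumed here for a source moving slower than c.

```python
import math


def retarded_distance(field_point, t, source_path, c, tol=1e-12, max_iter=1000):
    """Solve r = |x(t) - x'(t - r/c)| for r by repeated substitution.

    field_point : (x, y, z) of the charge acted upon, at time t
    source_path : function s -> (x', y', z') giving the acting charge's position at time s
    """
    # Start from the instantaneous distance (the Clausius-Weber limit of the text).
    r = math.dist(field_point, source_path(t))
    for _ in range(max_iter):
        r_next = math.dist(field_point, source_path(t - r / c))
        if abs(r_next - r) <= tol:
            return r_next
        r = r_next
    raise RuntimeError("retarded distance did not converge")


# Hypothetical example: a source moving along x with speed u = 0.5 (units with c = 1),
# observed at t = 0 from a point at distance d = 1 ahead of it.
u, c, d = 0.5, 1.0, 1.0
r = retarded_distance((d, 0.0, 0.0), 0.0, lambda s: (u * s, 0.0, 0.0), c)
# For this geometry the closed form is d / (1 - u/c) = 2.
print(r)
```

Replacing `t - r / c` by `t` in the loop gives the instantaneous distance, which is the only change needed to pass from the retarded law to the instantaneous one discussed above.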
This very remarkable result, due to Schwarzschild, shows that Lorentz's theory is very close to the old theories. The first form that Lorentz gave to his theory was less abstract in the sense that, following the path that Maxwell laid out, he started from Lagrange's equations, introducing two kinds of variables, the first ones determining the positions of the electrified particles, the others the state of the ether. We attribute to this latter one kinetic energy, without determining its internal motions. It suffices that they exist. Hamilton's principle permits, likewise, restricting the variations under certain conditions, to obtain the fundamental equations (I) to (VI) by considering the electrical energy

$\dfrac{1}{8\pi}\int (E_x^2 + E_y^2 + E_z^2)\,d\tau$

as potential energy, and the magnetic energy

$\dfrac{1}{8\pi}\int (H_x^2 + H_y^2 + H_z^2)\,d\tau$
as kinetic energy. 6 This deduction is fairly complicated and can be done in various forms. 7 These two aspects of Lorentz's theory are clearly distinct. The second one is similar to Larmor's theory, 8 which, in leading to the same formulae, offers precise hypotheses on the motions of ether in an electromagnetic field as borrowed from Lord Kelvin's conceptions on gyrostatic ether. Ether is incompressible and moves itself in the direction of magnetic force lines. We know that Maxwell and Hertz explain the mechanical forces that a substance experiences in an electromagnetic field by the pressure that ether is supposed to exert on the substance, and conversely: the action of one is equal and opposite to the reaction of the other in all respects. These pressures, as Helmholtz has shown, 9 tend to set the ether (supposedly incompressible) in motion; they are, by unit volume, proportional to the time derivative of Poynting's radiant vector S:

(XVIII) $\mathbf{S} = \dfrac{c}{4\pi}\,\mathbf{E} \times \mathbf{H}$
Lorentz considers ether as immobile; he is therefore inclined to Maxwell's theory of pressures and, by that, the equality of action and reaction, the non-compensated force being characterized by the vector $\dfrac{1}{c^2}\dfrac{d\mathbf{S}}{dt}$. 10 Let's observe, in concluding this brief exposition, that if we admit unknown motions in ether, the solution of the equations is determinable only for quantities close to the order of the speed of the ether divided by that of light.

6 LORENTZ, Proc. Amsterdam Acad., 1903, p. 608; Elektronentheorie, p. 165 to 170.
7 LARMOR, Aether and Matter, Cambridge, 1903, Chap. IV. SCHWARZSCHILD, Göttinger Nachr., 1903, p. 125.
8 LARMOR, Proc. Roy. Soc., Vol. LIV, 1893, p. 438; Aether and Matter.
9 Gesammelte Abhandlungen, Vol. III, p. 526.
10 POINCARÉ, Électricité et Optique, p. 448.

2. - CRITICISM OF NOTIONS OF ELECTRIC AND MAGNETIC FIELDS

We know that the introduction of the notion of force into mechanics was the subject of much criticism. This notion calls for muscular sense, whereas the ideas of space and time are primarily of tactile and visual origin: and the irreducible psychological duality introduced at the very base of this fundamental science leaves a certain discomfort in the mind, justifiably so, for it seems very obvious that the notion eliminates itself in each particular case. Whether we measure the forces by masses and accelerations, or by elastic deformations, whether we oppose their effects with those of gravity,
etc., what we really observe and measure is always a displacement, or the absence of a displacement. Again, in this latter case, we only end up defining the difference of two forces. In the equations of mechanics, as applied to any particular example, there remain only the relations of space and time, with certain coefficients properly chosen and invariable which are the masses or other physical constants. With regard to pure logic, it is therefore with good reason that many experts have rejected the introduction of the notion of forces in the fundamental expressions as being useless. Modern electrodynamics is entirely based on the notion of electric and magnetic forces. If this were absolutely necessary it would be regrettable. But it isn't so. These notions eliminate themselves in the equations. They are logically useless. In the final analysis the theory only expresses the existence of certain relations of time and space, as it is in the case of mechanics. It will therefore be preferable to express these relations directly. We thus come back to the classical elementary actions. In fact, what are the exact definitions of the vectors for E and H fields? I say that these vectors are defined by the theory itself. Thus, without knowing the significance of these symbols we can at once, by means of certain hypotheses that (Oeuvres 330) we will examine in the next section, integrate the fundamental equations by the method of retarded potentials, and we will be led to expressions (XIV) and (XVI). The equations of motion for a material point of charge e, of mass m and of coordinates $x_1, y_1, z_1$ will be

(1) $m\,\dfrac{d^2 x_1}{dt^2} = e\,F_x(x_1, y_1, z_1, v_{x1}, v_{y1}, v_{z1}, t), \quad \ldots$
If we desire to take into account the action of the electron on itself, or of liaisons, d'Alembert's principle has to be applied, and we have, in extending the integration over the whole of the electron, and designating by $\delta x_1, \delta y_1, \delta z_1$ the virtual displacements compatible with the liaisons, and by $\mu_1, \rho_1$ the densities of the substance and of electricity,

(2) $\displaystyle\int \left\{ \left[ \mu_1 \dfrac{d^2 x_1}{dt^2} - \rho_1 F_x(x_1, \ldots) \right] \delta x_1 + \cdots \right\} d\tau_1 = 0$
After having replaced $F_x, \ldots$ with the value (XIV) or (XVI) (the terms solely relative to the electron will play a special role), we will have in (1) and (2) only relations of space and time, even when $\mu_1 = 0$, that is to say, when mass is entirely of electromagnetic origin. Now, I say that Lorentz's equations don't effectively express anything more than (1) and (2). That is to say, that the field never plays a role in
pure ether. In fact, we can only determine the field's magnitude and direction by positioning a body and observing the mechanical forces that it feels, or rather its motions and those of the ions in its near vicinity, motions which are indicated by luminous, thermal, chemical, etc., phenomena. Therefore, we only know F, and that only at such points $x_1, y_1, z_1$ where there is electrified matter, and we deduce E and H by reasoning (which is not always so simple when we have to consider absolute motion). This is to say that it suffices, in all cases, to know the formula (Oeuvres 331) that gives F as the result of elementary actions exerted by an element of charge on another element of charge, and that this second representation, with regard to the facts, is exactly equivalent to the first, which is based on the field and its partial differential equations, which play only a purely mathematical role. We can, if we please, get along without the notion of electric and magnetic fields. It is important to specify the sense of this affirmation. In the theory of light, for example, everything thus represented with regard to Lorentz's theory can be derived from elementary actions between ions in the luminous source, those in the dielectrics or conductors which constitute the optical apparatus, and finally those in the retina or photographic plate which receives the impression. Thus, we are accustomed, for example, to describe the phenomena of diffraction that we observe in the case of a slit used with a screen by considering, with Fresnel, the points of ether situated in the slit as so many centers of disturbance. This does not conform to the equation for retarded potentials though. Electric charges are the only points of origin for waves.
Lorentz's theory, or the law of elementary actions, will explain it as the combined action of ions in the source and in the screen; besides, it is easy to show, using Huygens' principle as Kirchhoff has it, the equivalence of the two methods with regard to results. It wouldn't be allowed any longer to say that the field is a purely mathematical intermediate which we could do without, if it were possible to perceive its existence in a region of ether without placing any matter in the region. That is what would have happened, for example, if ether, under the influence of a field, were to be susceptible to modification or were to move itself more or less as Hertz wants it to, and as Lord Kelvin demands. 11 Interference experiments would have put this speed into evidence. These ideas were generally very widespread; but we know that the (Oeuvres 332) experiment, conducted several times, 12 gave only negative results, as do all the experiments designed to prove the existence of ether. The hypothesis of all these motions, on the other hand, has not lent itself to any plausible mechanical explanation of electrical actions in their effects. Lorentz was therefore led to exclude it in the recent statements of his theory; and that is what gives the go-ahead for us to eliminate the notions of force and field in this theory without touching any fact of reality or any possible experiment according to it. Lorentz has already indicated 13 this point of view: "We therefore see, in the new way I'm going to present it, Maxwell's theory draws nearer to the older ideas. We can even, after we have established the simple formulae that describe the motions of the particles, leave out the reasoning that spawned them and consider looking at these formulae as expressing a fundamental law comparable with those of Weber and Clausius." The actions, however, are no longer instantaneous; and we have seen, in light of this important restriction, that there is even an identity with Clausius' law. We can easily see that the notion of field introduces the notion of absolute motion as soon as the velocities come into play, either in the field expressions or in the expression for its action on bodies. It isn't likely anymore that it depends only on coordinates and accelerations.

11 Lord Kelvin, Baltimore Lectures on Molecular Dynamics and the Wave Theory of Light, London, 1904, p. 159: "It is absolutely certain that there is a definite dynamical theory for waves of light, to be enriched, not abolished, by electromagnetic theory."
12 See in particular O. LODGE, Phil. Trans., Vol. CLXXXIV, 1893; HENDERSON and HENRY, Phil. Mag., 5th Series, Vol. XLIV, 1897, p. 20.

Sections 3-8 are omitted.

9. - ABSOLUTE MOTION

In placing the ether hypothesis at the base of electrodynamics and optics we of necessity introduce, at least for the propagation of light and electric actions, a system of coordinates independent of ordinary matter. We should therefore expect, and we have indeed long expected, an influence of absolute motion with respect to the assumed ether. We know that the experiments have always been negative. Lorentz's theory gives this result when we consider first order terms.
But the experiments (Oeuvres 361) of Michelson and Morley, Trouton and Noble, and Lord Rayleigh, which should have shown second order effects, have, contrary to the theory, also given negative results. Lorentz and FitzGerald have therefore assumed that all bodies experience a contraction of the relation (1 — v2/c2)~ll2 in the direction of their velocity v. We thus account for the observed negative results. 14 To explain this contraction, Lorentz calls to mind that, according to his theory, for a system of 13
Arch, neerl., Vol. XXV, 1892, p. 433. M. Planck has shown that if we assume that the density of the ether at the earth's surface is less than 50,000 times greater than that in the interplanetary medium, without any appreciable change in its properties resulting, we may possibly reconcile the aberration theory with the hypothesis that the ether is entrained in the earth's motion (see 14
Chap. 3. Inquiries Regaxding the Constancy of the Speed ...
209
electric charges S, at rest, in equilibrium, this same system, when assumed to be antimated by a uniform translational motion v, will still be in equilibrium if we modify its dimensions by the indicated ratio. Therefore, if the molecular actions obey the law of electrostatic actions, and if we can exclude the molecular motion, the molecules of a solid body should of necessity take the position of equilibrium, and the accepted contraction will take place. It is evident that this hypothesis confuses our notions of solids. The invariability of certain bodies, when we transfer them from one place to another, when we change their direction or their speed, gives us the experimental definition of distance and of other geometric scales. The bodies that we use, necessarily participating in the motion of the earth, will always have an infinity of movements and rotations which change their dimensions and since we don't have any way to precisely determine the absolute motion which comes into play here these deformations remain absolutely unknown. How do you physically define the true length of a body? Does the assertion of the reality of this contraction have any sense? It results, from the researches of Einstein, to which we return later on, that the answer is negative. The question of stability brings forth a second objection. (Oeuvres 362) A system of electric charges, subject only to electrostatic forces, is never in stable equilibrium. This is evident when the sole restriction imposed is the conservation of electricity. In changing all the dimensions of the system by the ratio of 1 to 1 + e, the charges of elements of corresponding volume being equal, we will have performed a deformation compatible with the conditions of the system. The energy will fall to (jr^) t h of its original value. The equilibrium, therefore, is not stable. 
The sphere, for example, is for a deformable electron a shape of unstable equilibrium, and it is the same if we suppose, with Bucherer and Langevin, that its volume is invariable,15 and a fortiori in Lorentz's hypothesis, where this restriction doesn't exist. To obtain a solid body we therefore need to add forces of a very different character from electrostatic forces, or constraints other than incompressibility, or finally whirling motions giving a dynamic equilibrium. But in all of these cases, Lorentz's explanation no longer applies, so this explanation doesn't seem acceptable. Poincare has finally objected to the hypothesis as being incomplete. New experiments could bring new terms into evidence and we would require new hypotheses if, as expected, results are negative.

14 (cont.) LORENTZ, Enzyklop. math. Wiss., vol. 5, art. 133, p. 104). This would be a very strange property of the ether.
15 This is what Ehrenfest pointed out (Physik. Zeitschr., vol. 7, 1906, p. 302). For gravitation the equilibrium would be stable; but in changing the attractions into repulsions the energy changes sign and the equilibrium becomes unstable.

The question of the complete
Lorentz and Poincare Invariance
elimination of absolute motion was therefore posed and was addressed by Lorentz,16 Poincare,17 and Einstein.18 It is no longer permissible to overlook the difference between "local time" and "true time", which was an essential point when we were content to explain the negative results observed up until now and a few analogous others. To render a full account, let us consider two points A and B which move with a constant absolute velocity v in the direction AB. A luminous wave, starting from A at instant t, will arrive (Oeuvres 363) at B at instant t'. It will have to travel the distance $AB + (t' - t)v$ with the speed c; we have then

$$t' - t = \frac{AB + (t' - t)\,v}{c}, \qquad t' - t = \frac{AB}{c - v}.$$
The duration of transmission will depend on v, and its changes include a first order term $ABv/c^2$, the correction (including higher order terms) being precisely the one we have to apply to the true time to get the local time. Lorentz19 showed that for terrestrial phenomena, this correction is without influence. In particular, to determine the speed of light directly we are obliged to make it cover a closed path which brings it back to its starting point, thus eliminating the first order terms. So, in the example considered, if the wave emitted at A is reflected at B, it will arrive at A after a time

$$\frac{AB}{c - v} + \frac{AB}{c + v} = \frac{2AB}{c}\,\frac{1}{1 - v^2/c^2}.$$
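As a rough numerical illustration of the two transit times just discussed, the sketch below compares the one-way time $AB/(c - v)$ with the round-trip time and shows that the first order term in v/c survives only in the one-way case. The numbers (c rounded to 300,000 km per second, v = 30 km per second, an arbitrary baseline AB) and the function names are assumptions chosen for illustration, not values taken from the text.

```python
def one_way_time(ab, c, v):
    """Transit time from A to B when both points move with velocity v along AB."""
    return ab / (c - v)


def return_time(ab, c, v):
    """Transit time from B back to A (the wave now runs against the motion)."""
    return ab / (c + v)


def round_trip_time(ab, c, v):
    """Total time for the closed path A -> B -> A."""
    return one_way_time(ab, c, v) + return_time(ab, c, v)


c = 300_000.0   # km/s, rounded (assumption)
v = 30.0        # km/s, illustrative absolute velocity
ab = 1.0e6      # km, hypothetical baseline

t_rest = ab / c                   # one-way time for v = 0
first_order = ab * v / c**2       # first order term AB*v/c^2

one_way_excess = one_way_time(ab, c, v) - t_rest
round_trip_excess = round_trip_time(ab, c, v) - 2.0 * t_rest

print(f"first order term AB*v/c^2 : {first_order:.3e} s")
print(f"one-way excess            : {one_way_excess:.3e} s")
print(f"round-trip excess         : {round_trip_excess:.3e} s")
```

The one-way excess agrees with $ABv/c^2$ to within terms of relative size v/c, while the round-trip excess is of second order in v/c and therefore far smaller.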
16 Amsterdam Proceedings, 1903-1904, p. 809.
17 Comptes rendus, vol. CXL, 1905, p. 1504.
18 Ann. der Physik, vol. XVII, 1905, p. 891.
19 Versuche einer Theorie, etc., p. 82ff.

But it will be otherwise for astronomical phenomena. In the determination of the speed of light by the occultations of satellites, we don't use a closed path; thus the perturbation that the hypothesis of a translation of the solar system with respect to the ether would cause in the observed retardation would be of the first order and of observable magnitude. Indeed, the delay of an occultation can reach $d/c$ (where d is the diameter of the terrestrial orbit), or about 1000 seconds. An absolute speed of the solar system (having no connection with the system's motion with respect to the closer fixed stars) equal to 30 km per second in the plane of the ecliptic would bring about a correction of about $1000 \times \frac{30}{300{,}000} = 0.1$ second for the maximum observed delay, a correction which would change sign according
to the relative position of the Earth and the satellite in relation to the direction of translation. The systematic differences for a (Oeuvres 364) long system of observations can then reach 0.2 second, a quantity which is on the order of those which we observe in astronomy.20 If, therefore, we do not wish to admit that the speed of light depends on that of the bodies emitting it and is purely relative, like all speeds (and the ether concept, alone, prevents drawing out of the relativity principle this so natural consequence), we will have to modify the definition of time. Lorentz has enunciated the hypotheses which would allow giving to the equations for an entrained system, as well as for the axes of coordinates, in uniform motion of translation v parallel to the x axis, the same form as for the case of rest. He admits that all the masses are functions of velocity, thus abandoning the principle of the conservation of mass.21 We must also, as we formerly revealed, suppress the notion of solid bodies and introduce a new definition of time. It will be the variable

$$t' = \frac{1}{\sqrt{1 - v^2/c^2}}\left(t - \frac{vx}{c^2}\right)$$
which will play the role of time in the equations. Time, thus defined, satisfies almost none of the axioms assigned to the notion of time in the ordinary sense. Two events, simultaneous for observer A, but taking place at different points, are not simultaneous anymore for a second observer B [who is] in motion with respect to the first. Simultaneity becomes a relative notion. Two equal times for observer A will not be so for B.

20 Supposing that the laws of gravitation are modified by motion as the laws of electrodynamics are, the corrections would be only of the second order and would not cancel the first order term.
21 The word mass, in the theories of Lorentz, Poincare, and Einstein, has no precise meaning anymore. The number representing it depends on the motion of the system of coordinates, the motion being absolutely arbitrary. But the force depends also on this motion and it is not the two members of the equation $d^2x/dt^2 = X, \dots$ but indeed their relation alone that remains invariable when we change this motion.

The parallelogram rule for velocities is only approximate. Thus for v and v' being the velocities of two bodies which are moving in opposite directions in relation to a primary system of coordinates, the relative speed of the first body with respect to the second, i.e., the velocity that an observer traveling along with the second body would observe, is not $v - v'$, but $\frac{v - v'}{1 - vv'/c^2}$;
it remains steadfastly inferior to the speed of light. For two β rays emitted in opposite directions by a grain of radium, each with a speed of 250,000 km per second, the relative speed will not be 500,000 km per second, but 294,000 km per second. The words "speed", "time", etc., have therefore acquired a significance very different from that which they normally have, and now only have a relative sense. The ether in this new Kinematics will not play any role because it no longer furnishes a system of absolute coordinates. But this concept will oblige us to replace the simple axioms of the conservation of mass, the invariability of solids, the parallelogram of velocities, etc., axioms which we should abandon only as a last resort, with complicated relationships presenting considerable difficulties to the imagination (similar to those of curved space in three dimensions), and which we generally can't treat rigorously, except by analytical considerations. We must add that this theory was presented by Lorentz with all reservations. Einstein (loc. cit.) has presented the same results in a different form. He admits, a priori, for the speed of light, a law which by its nature involves a large amount of arbitrariness; the comparison with that which we will adopt in the Second Part of this work will demonstrate this sufficiently. It leads, with the principle of relativity, to a definition of the simultaneity of two events at two different points, of which he makes a relative notion, and more generally [leads] to the new kinematics which has just been discussed.
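The β-ray example can be reproduced with the standard relativistic composition law $(v - v')/(1 - vv'/c^2)$ for signed collinear velocities. The sketch below is illustrative only: the function name `relative_speed` and the rounded value c = 300,000 km per second are assumptions, and with that rounding the result comes out near 295,000 km per second, the same order as the figure quoted in the text.

```python
def relative_speed(v, v_prime, c):
    """Velocity of the first body as judged from the second (signed, collinear velocities)."""
    return (v - v_prime) / (1.0 - v * v_prime / c**2)


c = 300_000.0          # km/s, rounded (assumption)
v = 250_000.0          # first beta ray, km/s
v_prime = -250_000.0   # second beta ray, emitted in the opposite direction

w = relative_speed(v, v_prime, c)
naive = v - v_prime    # what the ordinary parallelogram rule would give

print(f"ordinary kinematics: {naive:.0f} km/s")
print(f"new kinematics     : {w:.0f} km/s (always below c = {c:.0f} km/s)")

# Even for speeds very close to c the composed speed stays below c.
w_extreme = relative_speed(0.999 * c, -0.999 * c, c)
```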
The simultaneity being included in the (Oeuvres 366) definition of the length of a body in motion in relation to the fixed marks of a standard measure (since it will be a matter of pointing simultaneously to the two extremities of the body, otherwise the body would move during the interval), this body will appear to be of a different length to an observer at rest, according to its more or less large speed (although its true length remains invariable). We therefore avoid the contractions admitted by Lorentz, or rather we see that their reality is only a question of definition. Einstein verifies that Lorentz's equations are thus made independent of absolute motion, and that the law admitted by him for the propagation of light is in accordance with the equations. These then, in the measure that they express this law of propagation, become superfluous. Moreover, the reasoning does not demonstrate at all, as some authors have believed, that these transformations are the only group which leaves Lorentz's equations invariable. This problem has rather to do with Poincare's methods.17 Bucherer22 was led, by considerations on the relativity of motions, to abandon the notion of ether.

22 Physik. Zeitschr., vol. 7, 1906, p. 553.

Lorentz's equations should always be applied
by assuming that the system of coordinates is at rest with respect to the point P whose motion we study. Bucherer only considers the case of uniform motions; the action on an electron propelled with a relative velocity u = v' − v in relation to P, according to formula (13) (where we set v' = u, v = 0), will be given by the formula

$$F_x = \frac{ee'\cos(\rho, x)}{\rho^2}\left[1 + \frac{u^2 - 3u_\rho^2}{2c^2}\right].$$
For two closed currents, we verify easily (see the Second Part) that only the terms proportional to vv' play a role, the accelerations (which we haven't taken into consideration) and the terms in $v^2$, $v'^2$ being excluded: it is moreover what should happen if the intensity of the force is proportional to the product of the intensities of the currents. The action of two elements of current on each other will from then on be

$$\frac{ii'\,ds\,ds'}{\rho^2}\cos(\rho, x)\left[-\cos(ds, ds') + 3\cos(\rho, ds)\cos(\rho, ds')\right].$$

The hypotheses of Ampere are verified. The action is parallel to the line of junction ρ of the elements; but the parenthesis should be $-2\cos(ds, ds') + 3\cos(\rho, ds)\cos(\rho, ds')$. Bucherer's hypothesis is therefore irreconcilable with the laws of closed currents. This is because the notion of field ceases to be applicable there. We are therefore brought back to the complicated hypotheses which were set forth. We must say, in closing, that these complications occur not only at great speeds, but also in Fizeau's experiment on the entrainment of waves, for example. In effect, according to the principle of relativity, an observer carried along in the translational movement of a transparent body will find, for the speed of propagation of waves in this medium, the same value as if he were at rest (supposing the period to be the same, or the dispersion negligible). We would conclude that, in ordinary kinematics, the waves are totally carried along by the substance. This is not the case. The term velocity has a new meaning, and in reality, Lorentz's proof continues to be applicable. We come back to Fresnel's coefficient.

10 - SUMMARY AND CONCLUSIONS

We know that ether was, at first, only one of the numerous fluids of physics, but with the new experiments having proved to Fresnel that light waves are transversal, we had to create a body analogous to elastic solids.
But then, how can the other bodies move through it without experiencing any appreciable resistance? The question was all the more difficult with the problem of aberration obliging us to admit that the ether does not participate in the earth's translational motion, so that all bodies are constantly traversed by a current of ether of 30 km per second with zero effects despite the rigidity of ether. We must add that the elasticity of this body is quite singular, seeing that its resistance to compression would be zero, which wouldn't happen for a finite solid. We can, it is true, appeal to Lord Kelvin's rotational elasticity, by being (Oeuvres 368) careful not to take into account the perturbation that would be brought to this ingenious mechanism by the brutal passage of a body animated with a speed of 30 km per second. The difficulties increased when, the identity of light and electrical oscillations having been shown, we had to extend this system of explanations to electromagnetism as a whole. Poincare23 has exposed some of the strangeness to which we are led. Besides, experiments have refused to accord to ether that primordial property of bodies: movement. Fizeau's experiments (interpreted by Lorentz), and those of Lodge, and others, have concurred in their negative results. Ether is not entrained by the movement of matter, nor by that of charged or magnetized bodies, nor by currents, etc. The hypothesis of such motions, itself, does not permit obtaining a mechanical explanation of electrodynamics. We resigned ourselves to accept the absolute rest of ether. The hypothesis of a complete compenetration permitted the avoidance of the difficulty relative to the movement of bodies through ether. This latter has become what Drude calls a "Physical Space", the seat of electric and magnetic energy, and polarizations. It furnishes a system of coordinates independent of all matter, and to which Maxwell's and Lorentz's equations must be referred.
23 Electricite et Optique, 2nd ed., "Concerning Larmor's theory," p. 577ff.

This is already too much abstraction. It still is not enough. Indeed, according to these views, ether could also be the seat of phenomena independent of matter, and thus manifest its existence. Nothing of the kind occurs; and, in order to explain it, we needed a new hypothesis, discarding all waves that don't diverge from a material element of volume. The role of ether is again reduced. We have seen that, from now on, we can completely set aside the notion of field and the consideration of what is passing in the ether, and be content with elementary actions of charges on each other (exactly as in the older theories of Gauss, Weber, Riemann and Clausius, but with a finite time of transmission). We thus express the same facts, but by including the hypothesis on the divergence of waves and the consequential (Oeuvres 369) irreversibility, which the equations of partial derivatives are powerless
to express. The ether has become a system of absolute coordinates, a mathematical abstraction. The equations of partial derivatives are an intermediate mental construct which, however, isn't sufficient in itself. Finally, this phantom of ether itself did not stand up under the scrutiny of experiment. It seems well accepted that absolute motion cannot be put into evidence. We have seen to which hypotheses, disrupting all the principles of physics, we must have recourse to render account of this result. The only conclusion which, from then on, seems possible to me, is that ether doesn't exist, or more exactly, that we should renounce the use of this representation; that the motion of light is a relative motion like all the others; that only relative velocities play a role in the laws of nature; and finally that we should renounce the use of partial differential equations and the notion of field, in the measure that this notion introduces absolute motion. As I already stated in the Introduction, this overly negative conclusion needs two complementary remarks: a simple representation for the new mode of light motion, and the demonstration that a theory satisfying these principles is possible. The habit that we have to "substantialize", if I dare to express myself so, a habit which we owe to the old caloric, magnetic, etc., fluids and the new energy fluid, makes indispensable, indeed, the introduction of a representation which makes us realize what happens to light and electrical forces when, having left a body, they don't yet act on another. A theory which wouldn't admit such a representation would be considered by many as introducing actions at a distance, simply retarded. Moreover, as Poincare noted (Science and Hypothesis, p. 199),24 and this is one of the reasons that we can judge in favor of the existence of ether, mechanics would have it that the state of a system depends only on the immediately preceding states.
It wouldn't be so anymore if we cancelled all intermediates. Actually, we thus save only a convention, which perhaps doesn't have any extreme usefulness. We have seen that we can't arbitrarily give the initial state of ether, which must satisfy (Oeuvres 370) the formulae of retarded potentials. That is to say that the consideration of the system during a finite period is not effectively avoided. On the other hand, the pressure exerted by light on a mirror, even in a vacuum, for example, is contrary to the equality of action and reaction when applied to the material alone. We will therefore have to "substantialize" the radiant energy to save this principle, and that of the conservation of energy, whenever there is a body in which the radiation doesn't meet any material obstacle in certain directions, and for which the energy cannot, consequently, ever be fully recovered. These principles will then become conventions, in part at least, but for a greater advantage to the economy of our thought.

24 This is probably page 119. Trans.
Chapter 4
Extended Relativity and its 4-Dimensional Symmetry
H. Reichenbach (1928), A. Grünbaum (1963), W. F. Edwards (1963), L. Hsu (1997), J. P. Hsu (1997), D. A. Schneble (1997).
The Philosophy of Space and Time
Hans Reichenbach
Simultaneity

After we had specified the unit of time, which is the first metrical coordinative definition of time, we were led to the problem of uniformity, which is the second metrical coordinative definition of time and deals with the congruence of successive time intervals. There is however a second type of time comparison that concerns parallel time intervals occurring at different points in space rather than consecutive time intervals occurring at the same point in space. The comparison of such time intervals leads to the problem of simultaneity and hence to the third metrical coordinative definition of time. Although it had been known for some time that uniformity is a matter of definition (Mach,1 for instance, asserted the definitional character of the uniformity of time emphatically), the definitional character of simultaneity was recognized first by Einstein and has since become famous as the relativity of time. Einstein immediately applied his solution of the problem of simultaneity to theoretical physics and for this reason the epistemological character of his discovery has never been clearly distinguished from the physical results. Therefore, we shall not follow the road taken by Einstein, which is closely connected with the principle of the constancy of the velocity of light, but begin with the epistemological problem. To see this problem clearly, we must start with a distinction which originated with the work of Einstein. We shall distinguish between the simultaneity at the same place and the simultaneity of spatially separated events. Only the latter contains the actual problem of simultaneity; the first is strictly speaking not a simultaneity of time points, but an identity. Such a concurrence of events at the same place and at the same time is called a coincidence. In a strict coincidence there is actually no comparison of space or time since position and time are identical for both events.

1 E. Mach, The Science of Mechanics, The Open Court Publishing Co., Chicago and London, 1919, p. 223.

Practically speaking, such an
identity never occurs since we could no longer distinguish the two events. But an approximate coincidence can be realized, in the example of two colliding spheres or two intersecting light rays. Simultaneity plays no essential role even in the case of a roughly approximated coincidence, because a time comparison of distant events shows such slight differences in the determination of the time of neighboring events that they can be ignored. We can therefore treat the problem of the comparison of neighboring events similarly to the problem of coincidence and restrict our investigation to the comparison of distant events.1 This investigation will lead us to the result that the simultaneity of distant events is based on a coordinative definition. We shall demonstrate this result by showing that a comparison of time has the characteristic properties of a coordinative definition. We therefore maintain: First, it is impossible to ascertain whether two distant clocks are set "correctly" in their indication of time; second, they can be set arbitrarily and yet no contradiction will arise. Following the first line of thought, we may ask how one can determine the simultaneity of distant events. We shall consider events as distant, if the distance between them is large compared with the dimensions of the human body. The perceptual judgment of simultaneity is thus not sufficient under these circumstances. We may hear for instance the sound of thunder and notice at the same time that the hands of our watch point to 8:50. The determination of simultaneity which we make here is a comparison of neighboring events; we compare the moment when the watch indicates 8:50 with the moment when the sound of thunder reaches our ear and not with the instant of its occurrence. If we want to derive from this time determination the actual time at which the thunder occurred, we must have additional physical facts. We must know the distance which the sound has traveled and the velocity of sound, before we can calculate backwards from 8:50 to the time at which the thunder took place. But are there no other means? It is well known that we can avoid using the velocity of sound in the given example, if we observe the lightning rather than the thunder.

1 For a rigorous treatment of the comparison of neighboring events, see A., § 8.

Let us say that the lightning was
observed at 8h 49m 50s; we may then consider this time as the time at which both lightning and thunder occurred. Is this statement true? Obviously, in this type of time determination the situation is changed quantitatively but not in principle. The light of the lightning also requires a certain amount of time to reach the eye, and our judgment therefore concerns again the moment at which the light reaches our eye and not the moment when the lightning actually occurred. Only because this time difference is extremely small can we ignore it for practical purposes. It can easily be seen that the time comparison of distant events is possible only because a signal sent from one place to another is a causal chain. This process leads to a coincidence, i.e., a comparison of neighboring events, and from the time measurement thus obtained we can determine the time of the distant event only with the help of an inference. What assumptions are contained in this inference? This inference requires besides the knowledge of the distance also the knowledge of the velocity of the signal. How can this velocity be measured? In principle, there exists only one method, which we shall schematize as follows. The signal leaves a point $P_1$ at the time $t_1$ and reaches a point $P_2$ at the time $t_2$. Its velocity is then given by the quotient of the distance $P_1P_2$ and the time interval $t_2 - t_1$. Therefore, two time measurements are required which have to be made at different places. We can think of them as given by two clocks located at $P_1$ and $P_2$. If the indication of the time interval $t_2 - t_1$ is to be meaningful, however, the two clocks must have been synchronized previously, i.e., it must have been determined whether their hands occupied the same positions at the same time. In order to measure a velocity, therefore, the simultaneity of distant events must already be known. Is this statement correct? Did not Fizeau measure the velocity of light differently?
Fizeau indeed used an arrangement which did not require the simultaneity of distant events. We can schematize his measuring arrangement as follows.

Fig. 16. Round trip of a light signal.

In Fig. 16 a light ray is sent from
A at the time $t_1$ = 12:00; it is reflected at the point B, which is at a distance l from A, and finally returns to A at $t_3$ = 12:06. It has required 6 minutes to travel twice the distance l, and its velocity is thus given by the ratio of these two numbers. In this arrangement, time is measured only at A, only one clock is used and the simultaneity of distant events does not affect the problem. Of course, in the actual experiment the time interval was much smaller than 6 minutes, even though l was several kilometers long, but Fizeau measured it by an ingenious device involving a rotating gear. We have simply chosen larger numbers to clarify the illustration. On closer examination, we notice that this measurement contains a certain untested assumption, namely, that the velocity of light is the same in both directions along l. For instance, if it were less in the direction AB than in the direction BA, the velocity of light as calculated by Fizeau would correspond to neither of the two velocities, but represent an average of the two. How can we prove this assumption by Fizeau? It seems that it can be proved only if the time $t_2$ is known at which the light ray reaches B. This means, however, that we are again employing two clocks and a comparison of distant events. Our assertion that the measurement of any velocity in one direction presupposes a knowledge of simultaneity is therefore correct. Thus we are faced with a circular argument. To determine the simultaneity of distant events we need to know a velocity, and to measure a velocity we require knowledge of the simultaneity of distant events. The occurrence of this circularity proves that simultaneity is not a matter of knowledge, but of a coordinative definition, since the logical circle shows that a knowledge of simultaneity is impossible in principle. We also notice that the second characteristic of a coordinative definition, namely its arbitrariness, is satisfied.
It is arbitrary which time we ascribe to the arrival of the light ray at B. If we assume it to be 12:03, the velocity of light becomes equal in both directions. If we assume it to be 12:02, the light ray requires 2 minutes in one direction and 4 minutes in the other; this assumption is equally compatible with Fizeau's measurements. It does not make sense, therefore, to call the time 12:02 false or improbable, since we are here concerned not with an
empirical statement but with a definition. This definition determines at once the velocity of light and simultaneity, and such a determination can therefore never lead to contradictions. If we wish to determine by velocity measurements which events are simultaneous, we shall always obtain that simultaneity which has already been introduced by definition. It is this consideration that teaches us how to understand the definition of simultaneity given by Einstein

$$t_2 = t_1 + \tfrac{1}{2}(t_3 - t_1) \qquad (1)$$

which defines the time of arrival of the light ray at B as the mid-point between the time that the light was sent from A and the time that it returned to A. This definition is essential for the special theory of relativity, but it is not epistemologically necessary. Einstein's definition, too, is just one possible definition. If we were to follow an arbitrary rule restricted only to the form

$$t_2 = t_1 + \epsilon(t_3 - t_1) \qquad (2)$$

$$0 < \epsilon < 1 \qquad (3)$$

it would likewise be adequate and could not be called false. If the special theory of relativity prefers the first definition, i.e., sets $\epsilon$ equal to $\tfrac{1}{2}$, it does so on the ground that this definition leads to simpler relations. It is clear that we are dealing here merely with descriptive simplicity, the nature of which will be explained in § 27. The arbitrariness is restricted only by condition (3) which specifies that $t_2$ must lie between $t_1$ and $t_3$; otherwise the signal would arrive at B at a time earlier than its departure from A. The epistemological significance of this restriction will be discussed in detail in § 22. These considerations have shown that simultaneity is a matter of a coordinative definition. Simultaneity also has the peculiar dual character which we can most easily observe in the definition of the unit of length. What we mean by a unit of length can be defined conceptually: a unit of length is a distance with which other distances are compared.
Which distance serves as a unit for actual measurements can ultimately be given only by reference to some actual distance. The same is true of simultaneity. We can give a conceptual definition of "simultaneity": two events at distant places are simultaneous if the time scales at the respective places indicate the same time value for these events. What time points of parallel time scales do receive the same time value can ultimately be determined
only by reference to actual events. This reference is essentially of the form: "These particular events are to be called simultaneous." We say with regard to the measuring rod as well as with regard to simultaneity that only "ultimately" the reference is to be conceived in this form, because we know that by means of the interposition of conceptual relations the reference may be rather remote. We may recall here the example of the determination of a unit of length by reference to a color, mentioned in § 4, where the reference is not directly to a spatial distance. Correspondingly, we find that the reference in the definition of simultaneity is commonly not in terms of the occurrence of arbitrary events, but in terms of light, i.e., a physical process, the properties of which are utilized in the definition of simultaneity. In this fashion we are able to replace a direct reference by a description of operations which can easily be repeated, since it is commonly understood what is meant by "light" and by these operations. The definition of simultaneity through the use of light signals, for instance Einstein's definition, cannot be compared to the definition of the meter by means of the Parisian standard meter, but is to be compared to the definition of the meter by means of the earth's circumference. In this definition the physical phenomenon the earth's circumference corresponds to the physical phenomenon light in the definition of simultaneity, and the rule "count off 40 million times" corresponds to the rule "send a light signal from A to B and back and set the time of arrival at B equal to the average of the two time values at A." Such a rule does not change the nature of the coordinative definition, since what is meant by "light" and "the circumference of the earth" can ultimately be determined only through a direct reference.
The conceptual definition which we related to the coordinative definition of simultaneity may appear empty; it is tautological to define simultaneity as the equality of time values on parallel time scales. But the situation is no different for any other conceptual definition. All conceptual definitions are tautological in this sense, since they deal exclusively with analytic relations. A concept is coordinated to a combination of certain other concepts and derives its meaning only from these other concepts. The conceptual definition of the unit of length is also a tautology in this sense. Yet the desire for a different conceptual definition of simultaneity has a certain justification. We
mean more when we speak of simultaneity; we are searching for a rule that restricts the determination of the parallel time scales in a special fashion. An answer to this question can only be given by the causal theory of time which we shall develop in § 21 and § 22. We anticipate, however, that this investigation will not eliminate the relativity of simultaneity but only justify the restriction of arbitrariness given in (3).
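The rule in (2) and (3) can be made concrete with the numbers of the Fizeau illustration (departure from A at 12:00, return at 12:06). The sketch below is only an illustration: the function names are invented for the purpose, times are counted in minutes after 12:00, and the distance l is taken as the unit of length. It shows that different admissible values of ε give different one-way velocities while leaving the measured round-trip velocity untouched.

```python
from fractions import Fraction


def arrival_time_at_b(t1, t3, eps):
    """Time assigned to the arrival at B by the rule t2 = t1 + eps * (t3 - t1)."""
    if not 0 < eps < 1:
        raise ValueError("condition (3) requires 0 < eps < 1")
    return t1 + eps * (t3 - t1)


def one_way_speeds(length, t1, t3, eps):
    """Velocities A -> B and B -> A that follow from a given choice of eps."""
    t2 = arrival_time_at_b(t1, t3, eps)
    return length / (t2 - t1), length / (t3 - t2)


def round_trip_speed(length, t1, t3):
    """The only velocity measured with a single clock at A."""
    return 2 * length / (t3 - t1)


length = Fraction(1)               # distance l, used as the unit of length
t1, t3 = Fraction(0), Fraction(6)  # minutes after 12:00

# Einstein's choice eps = 1/2: arrival at 12:03, equal velocities both ways.
out_half, back_half = one_way_speeds(length, t1, t3, Fraction(1, 2))

# The alternative eps = 1/3: arrival at 12:02, 2 minutes out and 4 minutes back.
out_third, back_third = one_way_speeds(length, t1, t3, Fraction(1, 3))

average = round_trip_speed(length, t1, t3)
print(out_half, back_half, out_third, back_third, average)
```

Both choices reproduce the same six-minute round trip, which is why no measurement made with the single clock at A can decide between them.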
Reference: Hans Reichenbach, Philosophie der Raum-Zeit-Lehre (Berlin, Walter de Gruyter Company, 1928); The Philosophy of Space and Time (translated by Maria Reichenbach and John Freund, Dover Publications, 1958), pp. 123-129. For a short biography of Hans Reichenbach, see appendix D in this volume.
Reprinted with permission. Copyright ©1958 by Maria Reichenbach. All rights reserved.
Hans Reichenbach (1891-1953)
Special Relativity in Anisotropic Space
W. F. EDWARDS
Utah State University, Logan, Utah
(Received 9 January 1963)

An investigation of the propagation of light, assuming absolute spatial distances, the veracity of the principle of causality, and that the average speed of light measured over a closed path is a constant $c$, shows that the speed of light may have values between $c/2$ and $\infty$ and is a function of direction. Generalized Lorentz transformations are found which are shown to be equivalent to those conventionally adopted. That an absolute simultaneity cannot be established electromagnetically is shown to follow. The arbitrary nature of indirect observables such as time dilation is discussed.
I. INTRODUCTION

The apparent fact that one cannot unambiguously measure the "one-way" velocity of light between two points allows certain liberties in synchronizing clocks and leads to interesting differences in the formalism of the special theory of relativity. The measurement of the "average" speed of light emitted from point O, traveling through a distance d to A and then back to O, can easily be accomplished by means of distance measurements and time measurements using only one clock. This "two-way" speed is, of course, $c$. On the other hand, to measure the one-way speed between two distant points, the distance traveled by the pulse and the time lapse between its emission at O and its arrival at A must be established. The time measurement involves two clocks which must be synchronized. This synchronization cannot be accomplished by sending a light signal from O to A, for that method would presume knowledge of the quantity whose value is desired. On the other hand, the synchronization might be effected by setting two clocks while at point O and then by carrying one to A. It appears that if the effect of the motion upon the translated clock is known, then this method might be feasible. In the past, this method was discarded for lack of this knowledge or for other reasons to be mentioned.
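The distinction between the two kinds of measurement can be illustrated with a small numerical sketch. It takes the isotropic description as a baseline and assumes, purely for illustration, that the clock at A is offset from the clock at O by an unknown amount `delta` (a hypothetical parameter, not a quantity used in the paper). The round-trip time read on the single clock at O does not depend on `delta`, whereas the one-way speed inferred from the two clocks does.

```python
C = 299_792_458.0  # two-way speed of light in m/s

def round_trip_time(d):
    """Time read on the single clock at O for the trip O -> A -> O."""
    return 2.0 * d / C

def apparent_one_way_speed(d, delta):
    """One-way speed inferred from two clocks when the clock at A
    is offset by `delta` seconds (an illustrative assumption)."""
    emitted_at_o = 0.0                  # reading of clock O at emission
    arrival_read_at_a = d / C + delta   # reading of the offset clock at A
    return d / (arrival_read_at_a - emitted_at_o)

d = 1000.0  # metres
for delta in (0.0, 1e-6, -1e-6):
    # The second column never changes; the third depends on the offset.
    print(delta, round_trip_time(d), apparent_one_way_speed(d, delta))
```

Only the last column changes from row to row, which is the sense in which the one-way speed depends on how the distant clock was set.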
Two schools of thought have arisen in the absence of a one-way speed of light measurement. The attitude of the first group, and one that might have been held by Einstein,¹ is that no observable difference would result if the speed of light really were anisotropic (i.e., had a value other than $c$ in some direction).²,³ The second holds that relativity theory contains an important assumption that has not and possibly cannot be tested and that, because of this, the foundations of relativity are uncertain.⁴ Some suggest that one must "take on faith" Einstein's simultaneity ("We establish by definition that the 'time' required by light to travel from A to B equals the 'time' it requires to travel from B to A.").¹ Even recently, serious suggestions have been made of experiments futilely designed to measure the one-way velocity of light.⁴
Concerning the necessity of taking on faith the isotropy of space, may I preliminarily remark that if the adoption of one definition of simultaneity (or, which is equivalent, the one-way speed of light) could in any way lead to different observable results than those predicted by the adoption of another, then a test of the isotropy of space would exist. If, however, no experiment can be devised to make this test, then surely it makes no difference what one assumes about the simultaneity of clocks so long as the two-way speed of light equals $c$. This study investigates the consequences in the special theory of relativity of adopting definitions of simultaneity other than that used by Einstein. The following facts are found: (1) Certain restrictions are placed upon clock synchronization within a single coordinate system which have apparently been overlooked. (2) There are an infinite number of correct Lorentz transformations corresponding with the infinity of possible clock synchronizations. (3) These new transformations give the same results when applied to observable effects. (4) An example illustrates the fact that one cannot discover the actual properties of light propagation or establish an absolute simultaneity by moving a master clock about, for clocks would change in such a way as to frustrate any attempts. Finally, (5) using electromagnetic means alone, the measurement of the one-way speed of light or the establishment of an absolute simultaneity cannot be accomplished.

¹ A. Einstein, Ann. Physik 17, 891 (1905).
² H. Reichenbach, The Philosophy of Space and Time (Dover Publications, Inc., New York, 1958), p. 124.
³ A. Grunbaum, "Logical and Philosophical Foundations of the Special Theory of Relativity," in Philosophy of Science, edited by A. Danto and S. Morgenbesser (Meridian Books, New York, 1960), Sec. 2.
⁴ M. Ruderfer, Proc. IRE 48, 1661 (1960); also see P. M. Rapier, Proc. IRE 49, 1322 (1961).

II. DEFINITIONS
To obtain Lorentz transformations when other than isotropic space is assumed, we must first inquire as to the restrictions which limit the various possible choices of clock synchronization. Let us imagine that at each point within the coordinate system resides a clock at rest. The clocks are synchronized using light signals. It is assumed that an absolute measure of spatial distance using "rigid" rods exists. Any problems entailed by this assumption are neglected. The time at any point in space is simply the reading of the clock at that point. The time lapse between events at any two points is defined as the difference between the readings of the clocks at those two points. The speed of propagation of any causal signal producing events at two points is clearly the distance between the points divided by the elapsed time. The two-way speed of a light pulse traveling from point O to point A and, by reflection, back to O is an observable quantity that does not depend upon the definition of simultaneity, for only the clock at O need be used. Using the definition of time lapse, the following is seen to be an identity:
$$t_{OAO} = t_{OA} + t_{AO}, \tag{1}$$
where the subscripts indicate the path of the light signal and the first and last subscripts identify the clocks that are read when the signal reaches those points. Using the definition of speed and letting $c$ indicate the two-way speed of light, we find that
$$2x/c = (x/c_{OA}) + (x/c_{AO})$$
or
$$(1/c_{OA}) + (1/c_{AO}) = 2/c. \tag{2}$$

FIG. 1. The "circulation" path of a light signal emanating from the origin and traversing path OABCO by reflection from mirrors. It is assumed that an experiment would show that the time taken in traversing such a path would give the value $c$ for the average speed. This thought experiment is used to find the velocity of light as a function of direction.
The choices of $c_{OA}$ and $c_{AO}$ are restricted in such a way that the sense of "cause" is preserved. In other words, a light signal starting at O cannot reach A before it leaves O. Since $t_{OA}$ must be positive for such signally connected events, so must $c_{OA}$ be positive; and since this also holds for $c_{AO}$, we see from Eq. (2) that neither can be smaller than $c/2$. Therefore, the speed of light in any direction indicated by the direction cosines $\alpha$, $\beta$, and $\gamma$ has the restriction
$$c/2 \le c(\alpha,\beta,\gamma) \le \infty. \tag{3}$$
It should be clear that the choice of synchronization of clock A with respect to clock O determines the value of $c_{OA}$, or vice versa, and $c_{AO}$ follows from Eq. (2). Thus, we can use the ideas of clock synchronization or one-way speed of light interchangeably.
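Equations (2) and (3) can be tried out numerically. The following is a minimal sketch, not part of the paper: the function names are hypothetical, units are chosen so that the two-way speed is 1, and an infinite speed is represented by `math.inf`.

```python
import math

def return_speed(c_oa, c=1.0):
    """One-way speed c_AO implied by Eq. (2): 1/c_OA + 1/c_AO = 2/c."""
    if c_oa < c / 2:
        raise ValueError("Eq. (3) requires c_OA >= c/2")
    inv = 2.0 / c - 1.0 / c_oa          # slowness of the return trip
    return math.inf if inv == 0.0 else 1.0 / inv

def two_way_speed(c_oa, c_ao):
    """Average speed over the closed path O -> A -> O."""
    return 2.0 / (1.0 / c_oa + 1.0 / c_ao)

for c_oa in (0.5, 0.75, 1.0, 2.0, math.inf):
    c_ao = return_speed(c_oa)
    # The last column stays at the two-way speed for every admissible choice.
    print(c_oa, c_ao, two_way_speed(c_oa, c_ao))
```

The two extreme rows show the limits of Eq. (3): an outgoing speed of $c/2$ forces an infinite return speed, and vice versa.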
Further limitations on the arbitrariness of clock synchronization are found by imagining an experiment where a light signal from O reflects from mirrors appropriately placed so that the path is OABCO, as in Fig. 1. One can easily imagine light traversing each of the straight paths in such a way that the two-way speed for each path is equal to $c$ but the "circulation" speed, $c_{OABCO}$, is equal to $\infty$. This would occur if
$$c_{OA} = c_{AB} = c_{BC} = c_{CO} = \infty$$
and
$$c_{OC} = c_{CB} = c_{BA} = c_{AO} = c/2,$$
so that conditions (2) and (3) are met, but a value for the speed of light other than $c$ would be measured using only one clock. Although an explicit measurement of this circulation speed has not been made, I anticipate the result and assume the value of $c$ for the remainder of this paper.

III. DEPENDENCE OF SPEED UPON DIRECTION
To find the speed of light as a function of direction, we begin by writing the identity
$$t_{OABCO} = t_{OA} + t_{AB} + t_{BC} + t_{CO}, \tag{4}$$
from which we obtain, by assuming the value $c$ for the circulation speed,
$$\frac{x+y+z+(x^2+y^2+z^2)^{1/2}}{c} = \frac{x}{c_x}+\frac{y}{c_y}+\frac{z}{c_z}+\frac{(x^2+y^2+z^2)^{1/2}}{c(-\alpha,-\beta,-\gamma)}, \tag{5}$$
where $c_x$, $c_y$, and $c_z$ are the speeds in the directions of the positive x, y, and z axes, respectively [i.e., $c_x = c(1,0,0)$, etc.], and $c(-\alpha,-\beta,-\gamma)$ is the speed in the direction of the negative radius vector from the origin. For the later derivations, it is convenient to alter the form of Eq. (5). Using the definitions of the angle cosines and referring to Eq. (2) to establish that $1/c(\alpha,\beta,\gamma) = (2/c) - [1/c(-\alpha,-\beta,-\gamma)]$, Eq. (5) becomes
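$$1-\frac{c}{c(\alpha,\beta,\gamma)} = \alpha\left(1-\frac{c}{c_x}\right)+\beta\left(1-\frac{c}{c_y}\right)+\gamma\left(1-\frac{c}{c_z}\right). \tag{6}$$

FIG. 2. The relationship between the speed of light and the corresponding speed variable.

We now introduce the following "speed variables":
$$X = 1 - c/c_x, \quad Y = 1 - c/c_y, \quad Z = 1 - c/c_z, \quad R = 1 - c/c(\alpha,\beta,\gamma). \tag{7}$$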
Note that as the speed of light in a particular direction varies from $c/2$ to $\infty$ the corresponding speed variable varies from $-1$ to $+1$, as shown in Fig. 2. The restriction indicated by Eq. (3) therefore becomes
$$-1 \le R \le 1. \tag{8}$$
The speed in the opposite direction may be found simply by reversing the sign on the speed variable, as can be seen by combining Eqs. (7) and (2). Equation (6), giving the speed of light as a function of direction, now becomes
$$R(\alpha,\beta,\gamma) = \alpha X + \beta Y + \gamma Z. \tag{9}$$
This result shows that, in general, the speed of light in any direction is determined when the speeds in the directions of the x, y, and z axes are given. Furthermore, any clock synchronization leading to light speeds which are functions of direction and which do not agree with Eq. (9) contradicts the principle of causality earlier mentioned.⁵

IV. TYPES OF SPACE

We now investigate the type of "space" one obtains if clocks along the z axis are synchronized so as to make $c_z = \infty$. Letting $Z = 1$, Eq. (9) becomes
$$R = \alpha X + \beta Y + \gamma. \tag{10}$$
The arbitrariness that generally exists in X and Y is not found in this special case, as is shown by the following argument: Let $\beta = 0$ so that we are restricted to the xz plane. Where $\theta$ is the polar angle, we have $\gamma = \cos\theta$, $\alpha = \sin\theta$, and
$$R = X\sin\theta + \cos\theta. \tag{11}$$
We now find that X is limited to a single value by requiring that R lie within the range $-1$ to $1$. The extreme values of R are obtained by taking the $\theta$ derivative of Eq. (11), setting it equal to zero and solving for X, which results in
$$X = \tan\theta_m, \tag{12}$$
where $\theta_m$ is the value of $\theta$ corresponding with an extreme value of R. Substituting this back in Eq. (11) we obtain
$$R_m = 1/\cos\theta_m. \tag{13}$$
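Equations (11) to (13) can be checked by brute force. The sketch below (function names are hypothetical, and the sampling approach is an illustrative choice, not the paper's method) scans the polar angle and compares the largest $|R|$ found with $1/\cos\theta_m$ and with the equivalent closed form $(1+X^2)^{1/2}$.

```python
import math

def R(theta, X):
    """Speed variable in the xz plane when Z = 1, Eq. (11)."""
    return X * math.sin(theta) + math.cos(theta)

def extreme_R(X, samples=100_000):
    """Largest |R| over polar angles 0..pi, found by brute-force sampling."""
    return max(abs(R(math.pi * k / samples, X)) for k in range(samples + 1))

for X in (0.0, 0.1, 0.5):
    theta_m = math.atan(X)                      # Eq. (12)
    # Sampled extreme, Eq. (13), and the closed form sqrt(1 + X^2)
    print(X, extreme_R(X), 1 / math.cos(theta_m), math.sqrt(1 + X * X))
```

Within the sampling resolution the three columns agree, and the extreme value exceeds 1 for every nonzero X tried here.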
To keep $R_m$ within the allowable range, the only values of $\theta_m$ that may be admitted are 0 or $\pi$. In either case, we find from Eq. (12) that $X = 0$ or $c_x = c$. By symmetry we see also that $Y = 0$ or $c_y = c$. Using the above values for X and Y, Eq. (10) becomes
$$R = \gamma = \cos\theta, \tag{14}$$
giving
$$c(\theta) = c/(1-\cos\theta).$$
The conclusions are that in this space where $c_z = \infty$ we must have $c_x = c_y = c$ and the speed of light in any direction is a function of the polar angle only. If the speed of light in one direction is infinite, then the speed in every other direction is determined. With this definition of simultaneity, one would picture a wavefront of light emitted by a point source as a paraboloid of revolution as in Fig. 3. It is interesting that Huygens' construction appears to be as applicable in this case as in the case of spherical fronts emanating from point sources.

⁵ Reichenbach used such a contradictory synchronization, though he apparently wished to retain the principle of causality. (See p. 162, Ref. 2.) If the value of his parameter "a" is taken so that the speed of light in the positive x direction is [illegible] $c$ then, for instance, the value of the speed of light of a signal emanating from the negative x axis and traveling parallel to the y axis to the point equidistant from the two axes is −3.16, a negative value!

V. LORENTZ TRANSFORMATIONS IN ANISOTROPIC SPACE
For the work that follows, the relationship between time measurements in isotropic and anisotropic space is needed. Let the coordinate variables without subscripts indicate general space, while those with subscript i refer to isotropic space. The origins of the coordinate systems representing these spaces are coincident, the axes are aligned and the systems are at rest with respect to one another. The relationship between the readings of clocks at a position indicated by radius vector $\mathbf{r}$, whose direction cosines are $\alpha$, $\beta$, and $\gamma$, is
$$t(\mathbf{r}) = t_i(\mathbf{r}) - r\{(1/c) - [1/c(\alpha,\beta,\gamma)]\}, \tag{15}$$
or
$$t = t_i - (r/c)R. \tag{16}$$

FIG. 3. The form of a wavefront emanating from a point source situated at the origin if $c_x = c_y = c$ and $c_z = \infty$.
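Using Eq. (16) the relationship between instantaneous velocities is
$$\mathbf{V} = \frac{d\mathbf{x}}{dt} = \mathbf{V}_i\frac{dt_i}{dt} = \mathbf{V}_i\left[1 - \frac{\mathbf{V}_i\cdot\mathbf{r}}{r}\left(\frac{1}{c}-\frac{1}{c(\alpha,\beta,\gamma)}\right) - \frac{r}{c^2(\alpha,\beta,\gamma)}\,\mathbf{V}_i\cdot\nabla c(\alpha,\beta,\gamma)\right]^{-1}. \tag{17}$$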
To derive the generalized Lorentz transformations, Einstein's second postulate is modified to read: the two-way speed of light in a vacuum as measured in two coordinate systems moving with constant relative velocity is the same constant regardless of any assumptions concerning the one-way speed. The equation of the wavefront emanating from a point source of light situated at the origin is
$$r = c(\alpha,\beta,\gamma)\,t. \tag{18}$$
Using the fact that $\alpha = x/r$, $\beta = y/r$, and $\gamma = z/r$, and applying Eq. (7), Eq. (9) takes the form
$$r/c(\alpha,\beta,\gamma) = (1/c)[r - (xX + yY + zZ)], \tag{19}$$
so that Eq. (18) becomes
$$r = xX + yY + zZ + ct. \tag{20}$$
This new form for the equation of the wavefront may be used to find the Lorentz transformations between any two systems. For simplicity we examine the case where symmetry about the x axis exists and $c_y = c_z = c$, so that $Y = Z = 0$. Thus, Eq. (20) may be written
$$r = xX + ct. \tag{21}$$
The primed system S' is moving with constant velocity v relative to and as measured in the unprimed system. This motion is in the direction of the positive x axis. A different simultaneity is allowed for in the S' system, as indicated by X', which may be different from X:
$$r' = x'X' + ct'. \tag{22}$$
Writing Eq. (21) in Cartesian coordinates and squaring both sides leads to the form
$$x^2(1-X^2) + y^2 + z^2 - c^2t^2 - 2cXxt = 0. \tag{23}$$
A similar equation must hold in the primed system. As in standard derivations of the Lorentz transformations, we choose the following equations with A, B, and D to be determined:
$$x' = D(x - vt), \tag{24a}$$
$$y' = y, \tag{24b}$$
$$z' = z, \tag{24c}$$
$$t' = At + Bx. \tag{24d}$$
Substituting these in the primed version of Eq. (23), and setting the left-hand side of the resulting equation equal to the left-hand side of Eq. (23), collecting terms and setting the coefficients of the coordinate variables equal to zero, we obtain the following set of equations:
$$c^2B^2 + 2cX'BD - (1-X'^2)D^2 + 1 - X^2 = 0, \tag{25a}$$
$$c^2AB + cX'AD - cX'vBD + v(1-X'^2)D^2 - cX = 0, \tag{25b}$$
$$c^2A^2 - 2cX'vAD - v^2(1-X'^2)D^2 - c^2 = 0. \tag{25c}$$
The solution to these equations follows:
$$A = \eta[1 + (v/c)(X' + X)], \tag{26a}$$
$$B = (\eta/c)[(v/c)(X^2 - 1) + X - X'], \tag{26b}$$
$$D = \eta, \tag{26c}$$
where
$$\eta = \{[1 + (vX/c)]^2 - v^2/c^2\}^{-1/2}. \tag{27}$$
Combining Eqs. (26) with Eqs. (24) we have the general Lorentz transformations in a space where clocks along the y and z axes are set so as to make $c_y = c_z = c$ in both systems and $c_x$ and $c_{x'}$ are undetermined:
$$x' = \eta(x - vt), \tag{28a}$$
$$y' = y, \tag{28b}$$
$$z' = z, \tag{28c}$$
$$t' = \eta\{[1 + (v/c)(X + X')]t + [(v/c)(X^2 - 1) + X - X'](x/c)\}. \tag{28d}$$
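A short numerical sketch can confirm that Eqs. (28) carry the wavefront of Eq. (21) into that of Eq. (22). The function names and sample values are hypothetical, chosen only for illustration, and units are such that $c = 1$.

```python
import math

def eta(v, X, c=1.0):
    """Eq. (27)."""
    return ((1 + v * X / c) ** 2 - (v / c) ** 2) ** -0.5

def transform(x, y, z, t, v, X, Xp, c=1.0):
    """Generalized Lorentz transformation, Eqs. (28a)-(28d); Xp stands for X'."""
    n = eta(v, X, c)
    xp = n * (x - v * t)
    tp = n * ((1 + (v / c) * (X + Xp)) * t
              + ((v / c) * (X * X - 1) + X - Xp) * x / c)
    return xp, y, z, tp

def wavefront_residual(x, y, z, t, X, c=1.0):
    """r - (x X + c t); zero for events on the light front of Eq. (21)."""
    return math.sqrt(x * x + y * y + z * z) - (x * X + c * t)

v, X, Xp = 0.4, 0.3, -0.6            # illustrative values
x, y, z = 2.0, -1.0, 0.5
t = math.sqrt(x * x + y * y + z * z) - x * X   # event placed on the S wavefront

print(wavefront_residual(x, y, z, t, X))                           # zero by construction
print(wavefront_residual(*transform(x, y, z, t, v, X, Xp), Xp))    # also ~0 in S'

# With X = X' = 0 the familiar Lorentz transformation is recovered.
print(transform(1.0, 0.0, 0.0, 2.0, 0.6, 0.0, 0.0))
```

The second residual vanishes up to rounding for any admissible choice of X and X', which is the requirement from which Eqs. (25) were obtained.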
Let us now examine a few limiting cases of the above transformations. First, letting $X' = X = 0$ we obtain the traditional Lorentz transformations for isotropic space, as is easily verified. The equations in spaces where $c_x = c_{x'} = \infty$, or $X = X' = 1$, are interesting. In this case Eqs. (27) and (28) give
$$x' = [1/(1 + 2v/c)^{1/2}](x - vt), \tag{29a}$$
$$y' = y, \tag{29b}$$
$$z' = z, \tag{29c}$$
$$t' = (1 + 2v/c)^{1/2}\,t. \tag{29d}$$
The fact that the transformation of time in these two systems does not involve a spatial coordinate is particularly interesting. If we set $v = 0$, then Eq. (28d) reduces to a form of the time transformation, $t' = t + (X - X')(x/c)$ [similar to Eq. (16)], which suggests that one can obtain the generalized transformation equations simply by changing the time and velocity terms in the Lorentz transformations for isotropic space according to Eqs. (16) and (17). This, in turn, implies that the only real difference in the forms of the Lorentz transformations is in how one wishes to synchronize clocks. This is indeed the case.

VI. THE IDENTICAL NATURE OF THE TRANSFORMATIONS
Using the subscript i again to indicate a quantity in isotropic space, Eqs. (28) are now shown to reduce to the familiar transformations in all cases. Using Eq. (17) we find the velocity of the S' system as measured in the S system (which assumes $c_y = c_z = c$) to be
$$v = v_i/[1 - (v_iX/c)]. \tag{30}$$
From Eq. (16) the transformations of the time coordinates are clearly
$$t' = t_i' - x'X'/c, \qquad t = t_i - xX/c. \tag{31}$$
When these equations are substituted in the transformation Eqs. (28) the familiar Lorentz transformation equations result:
$$x' = [1/(1 - v_i^2/c^2)^{1/2}](x - v_it_i), \tag{32a}$$
$$y' = y, \tag{32b}$$
$$z' = z, \tag{32c}$$
$$t_i' = [1/(1 - v_i^2/c^2)^{1/2}](t_i - v_ix/c^2). \tag{32d}$$
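The reduction can be verified numerically for events on the x axis. In the sketch below (hypothetical function names, illustrative values, $c = 1$), an event given in isotropic coordinates is resynchronized with Eqs. (30) and (31), transformed with Eqs. (28), and converted back to isotropic time in S'; the result is compared with the familiar transformation of Eqs. (32).

```python
def generalized(x, t, v, X, Xp, c=1.0):
    """Eqs. (28a) and (28d) for events on the x axis; Xp stands for X'."""
    n = ((1 + v * X / c) ** 2 - (v / c) ** 2) ** -0.5   # Eq. (27)
    xp = n * (x - v * t)
    tp = n * ((1 + (v / c) * (X + Xp)) * t
              + ((v / c) * (X * X - 1) + X - Xp) * x / c)
    return xp, tp

def standard(x, t_i, v_i, c=1.0):
    """Familiar Lorentz transformation, Eqs. (32a) and (32d)."""
    g = (1 - (v_i / c) ** 2) ** -0.5
    return g * (x - v_i * t_i), g * (t_i - v_i * x / c ** 2)

c, X, Xp, v_i = 1.0, 0.3, -0.6, 0.5       # illustrative values
x, t_i = 2.0, 1.2                          # event in isotropic coordinates

v = v_i / (1 - v_i * X / c)                # Eq. (30)
t = t_i - x * X / c                        # Eq. (31), unprimed system
xp, tp = generalized(x, t, v, X, Xp, c)
tp_i = tp + xp * Xp / c                    # Eq. (31) solved for t_i'

print((xp, tp_i), standard(x, t_i, v_i, c))   # the two pairs agree
```

Changing `X` or `Xp` alters the intermediate values `v`, `t`, and `tp` but leaves the final pair unchanged.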
It should therefore be clear that, as far as special relativity is concerned, the same experimental results are obtained by assuming isotropic space regardless of what function of direction the speed of light "really" is. If one were able to establish absolute simultaneity using some signal having an infinite two-way speed, then presumably one particular form of Eq. (9) would give the actual manner in which light propagates. But, as we have seen, within relativity, this form would give the same results as would that which simply presumes that space is isotropic, or, for that matter, as would any other presumption concerning the speed of light, even if space is not in actuality so constituted. The equation one uses or the manner in which one sets his clocks is simply a matter of convenience.

VII. ON THE MOTION OF CLOCKS
As mentioned in the Introduction, one method that comes immediately to mind, for the absolute synchronization of clocks in a coordinate system, is to set two clocks at the same point in space and then to carry one to all other points in the coordinate system to synchronize the rest of the clocks. The question that is raised is the effect of the motion upon the transported clock. In Reichenbach's excellent book on space and time, this clock transport method is discussed.⁶ He concludes that the method must fail if the time of the master clock cannot be shown to be independent of path and velocity. I feel, however, that this very motion effect could conceivably be the means to establish absolute simultaneity. If, for instance, it could be shown that, for a particular type of motion, the effect would cause the transported clock (after being synchronized at M) to differ with a clock at point A (that was set by O using a light signal after arriving at that point) by an amount that was a function of the true nature of light velocities as predicted by the theory of relativity and a function of the type of motion, then the observed value would determine the absolute simultaneity, at least relative to the whole experimental basis of relativity. The type of motion that is adopted in discussing this question is as follows.⁷ The moving clock M, at rest in the S' system, is given a constant velocity such that it passes point O and later point A. Clocks O and A are at rest in the S system. All clocks are on the x axis. In principle, when the clock is at the origin, it can be synchronized with the clock at rest at that point with a vanishingly small error. Later, the clock passes point A where clock M is read. The velocity of M is determined by clocks O and A, the latter having been set using light signals. As in Sec. V, the types of space are limited to those symmetric about the x axis, with $Y = Z = 0$ in all coordinate systems considered. We inquire as to the difference $\Delta t'$ in the reading of the S' clock at point A and at point O, the distance between being $\Delta x$. Note that $\Delta t'$ is a directly observable quantity since it involves only one clock. If $\Delta t'$ does not remain invariant under changes in the nature of space, given by X and X', for a constant velocity between the two points as determined by the clocks in the S system, then a method would exist to establish absolute simultaneity. Using Eq. (28d) to find the time lapse $\Delta t'$ in terms of $\Delta t$ and $\Delta x$, we have
$$\Delta t' = \eta\{[1 + (v/c)(X + X')]\Delta t + [(v/c)(X^2 - 1) + X - X'](\Delta x/c)\}. \tag{33}$$
Since $\Delta x' = 0$, we have from Eq. (28a) $\Delta x = v\Delta t$. Substituting this in Eq. (33) and using Eq. (27) to eliminate $\eta$ results in
$$\Delta t' = \{[1 + (vX/c)]^2 - (v^2/c^2)\}^{1/2}\Delta t = \Delta t/\eta. \tag{34}$$

⁶ H. Reichenbach, Ref. 2, Sec. 20.
⁷ One may wish to consider the case where clock M is initially at rest at point O, accelerates in a negligible time to velocity v, then travels with this velocity till it arrives at A where it accelerates, again in a negligible time, to rest. The result of general relativity that the time change of clock M during these accelerations approaches zero as the time of acceleration approaches zero reduces this problem to the one considered in the text.
To see whether or not $\Delta t'$ changes under a different clock synchronization, we transform $\Delta t$ and $v/c$ according to Eqs. (16) and (17). Generalizing Eq. (16) to give the relationship between time measured at different positions under two arbitrary clock synchronizations and limiting the positions to the x axis gives
$$\Delta t = \Delta t_a + (\Delta x/c)(X_a - X), \tag{35}$$
where subscript "a" is used to indicate quantities in the new system with clock synchronization relating to the actual universe. The corresponding form of the velocity transformation is
$$v = v_a/[1 + (v_a/c)(X_a - X)]. \tag{36}$$
Substituting for $\Delta t$ and $v$ in Eq. (34) and using the fact that $\Delta x$ in Eq. (35) equals $v_a\Delta t_a$, we obtain
$$\Delta t' = \{[1 + (v_aX_a/c)]^2 - (v_a^2/c^2)\}^{1/2}\Delta t_a, \tag{37}$$
which, by comparison with the form of Eq. (34), clearly yields the result
$$\Delta t' = \Delta t_a'. \tag{38}$$
We conclude that if one chose a clock synchronization different from nature's intrinsic one, no observable difference would result. Thus a clock that is moved to all positions in a coordinate system to set the other clocks does not establish an absolute synchronization.
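The invariance expressed by Eq. (38) can be illustrated numerically. The sketch below uses hypothetical function names and illustrative values ($c = 1$, with the "actual" synchronization taken to be isotropic, $X_a = 0$, purely as an example). It resynchronizes the S clocks with several values of X and recomputes the lapse on the transported clock from Eq. (34).

```python
def moving_clock_lapse(dt, v, X, c=1.0):
    """Eq. (34): lapse on the moving clock M between passing O and A."""
    return ((1 + v * X / c) ** 2 - (v / c) ** 2) ** 0.5 * dt

def resynchronize(dt_a, v_a, X_a, X, c=1.0):
    """Eqs. (35) and (36), using dx = v_a * dt_a for the moving clock."""
    dx = v_a * dt_a
    dt = dt_a + (dx / c) * (X_a - X)
    v = v_a / (1 + (v_a / c) * (X_a - X))
    return dt, v

dt_a, v_a, X_a = 10.0, 0.6, 0.0     # illustrative "actual" values
for X in (-0.8, -0.3, 0.0, 0.5, 0.9):
    dt, v = resynchronize(dt_a, v_a, X_a, X)
    # dt and v depend on X; the lapse on clock M does not.
    print(X, dt, v, moving_clock_lapse(dt, v, X))
```

The S-system time lapse and velocity in the second and third columns vary with the synchronization, while the reading difference on the single transported clock stays fixed.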
VIII. THE RELATIVITY OF SIMULTANEITY

To mention the fact that two spatially noncopunctal events simultaneous in one system are not simultaneous in a system moving with constant relative velocity is now standard in textbooks. Now, however, it is easy to show that with a proper setting of clocks in the S' system, two events simultaneous in one system can always be made simultaneous in a second regardless of the value of the relative velocity and the setting of clocks in the first system. This fact, which was previously discussed by Grunbaum,⁸ can easily be proved in the case where $Y = Z = 0$. If simultaneity exists in the S system so $\Delta t = 0$, Eq. (33) becomes
$$\Delta t' = \eta[(v/c)(X^2 - 1) + X - X'](\Delta x/c). \tag{39}$$
Setting S' clocks so that
$$X' = (v/c)(X^2 - 1) + X \tag{40}$$
clearly gives the result $\Delta t' = 0$, and the events are simultaneous in the second system. By a simple analysis of Eq. (40) it can be shown that X' is between $-1$ and $+1$ for any allowed values of X and v. An interesting result of choosing X' so that events simultaneous in the S system are also simultaneous in the S' system is that the transformation of the time difference $\Delta t'$ between any two events separated by $\Delta x$ and $\Delta t$ in the S system is not a function of the spatial separation $\Delta x$, but is given simply by
$$\Delta t' = \Delta t/\eta, \tag{41}$$
as is easily verified. This is the same as Eq. (34), but the former equation applied only to clocks at rest in the two systems so that $\Delta x'$ was equal to zero. We thus can have the same time dilation regardless of the spatial separation of the two events in the S system. Figure 4 illustrates the effect of assuming different values of the velocity of light upon the concepts of past, present, and future. Assume two events, one occurring at the origin and the other at any point in hyperspace. If the second is above the x axis it is definitionally future (definitional present is abbreviated d.p.) but not necessarily causally future because there is some question as to whether or not a signal having velocity less than that of light could connect the two event points. If isotropic space is assumed, then the causally nondetermined region is as occurs in the figures between the causal future and causal past. A light cone gives the separation between these regions. Figures 4(b) and 4(c) show the situation under different definitions of simultaneity. Note that an event just above the positive x axis is causally and definitionally future when $c_x = \infty$. One can see how the concepts of past, present, and future are altered. In Fig. 4(a), lines representing the definitional present, corresponding to those in 4(b) and 4(c) but transformed to the new time system $t_a$, are shown.

⁸ A. Grunbaum, Ref. 3, p. 410.

IX. DIRECT AND INDIRECT OBSERVABLES
Since one form of the Lorentz transformation is always equivalent to all other forms obtained by changing the definition of simultaneity, any direct observable (such as two readings on a single clock) could not be changed by either a physical or a mathematical thought alteration of space-time. But such quantities as time lapse involving two clocks separated in distance with respect to the observer are not directly observable and involve the definition of simultaneity, which is, to some extent, arbitrary. Even the time dilation formulas or the concepts past, present, and future do not involve direct observables. Some of the mathematical quantities are directly observable and others are not. Nevertheless, the application of the mathematical quantities to physical experiments results in quantities which are directly observable, and these quantities are invariant under any change of clock reading or the manner in which light "really" travels. Such quantities as the standing-wave pattern in a waveguide, the diffraction pattern in a slit experiment, or the detector frequency in the observation of radiation emitted by a moving source are completely independent of how fast light really travels in one direction or how one sets his clocks. One might well argue about the reality of time dilation, or the relativity of simultaneity or other indirect observables, but in the final analysis, the argument is academic unless a two-way signal velocity greater than $c$ is discovered outside of electromagnetism. As far as any measurement is concerned, one can adopt any view he wishes consistent with the fact that the circulation speed of light is $c$ (if, indeed, it is). For most problems the most convenient assumption to make is still that of isotropic space.

FIG. 4. Showing light cones in hyperspace under three definitions of simultaneity. In the figures d.p. means definitional present. The figures are discussed in the text.

ACKNOWLEDGMENTS
I wish to thank William E. Dibble for many interesting discussions and Akeley Miller for critically reading the manuscript.
Four-dimensional symmetry of taiji relativity and coordinate transformations based on a weaker postulate for the speed of light. - I
LEONARDO HSU(1), JONG-PING HSU(2) and DOMINIK A. SCHNEBLE(2)
(1) Physics Department, University of California at Berkeley - Berkeley, CA 94720, USA
(2) Physics Department, University of Massachusetts Dartmouth - North Dartmouth, MA 02747, USA
(received 4 April 1995; approved 2 April 1996)

Summary. - The power of the four-dimensional symmetry of taiji relativity is demonstrated through an analysis of Edwards' transformation based on the weaker postulate that only the two-way speed of light is a universal constant. His transformation, involving Reichenbach's time, is shown to be inconsistent with Lorentz-group properties and the relativistic energy-momentum expression for particles. However, using the symmetry of taiji relativity as a guiding principle, we can obtain a new four-dimensional transformation which does not have these difficulties. We show that Reichenbach's concept of time is compatible with the four-dimensional symmetry of physical laws. No known experiments can distinguish physical laws in this four-dimensional formalism («extended relativity») from those in special relativity.
PACS 03.30 - Special relativity.
PACS 11.30.Cp - Lorentz and Poincare invariance.
Only at the beginning of the twentieth century, after the creation of the space-time four-dimensional symmetry of special relativity, was it recognized that the concept of symmetry played an important role in physics [1]. Subsequently, all basic symmetries, such as space-time symmetry, right-left symmetry, time-reversal symmetry etc., have been taken for granted. In 1957, however, when right-left symmetry was found to be violated in weak interactions, it was a shock to the physics community. Nowadays, many symmetries, such as CPT symmetry and isotropy of space in relation to the speed of light, are still under experimental investigation. In view of the ultraviolet divergence difficulties in quantum field theory, one may wonder whether the four-dimensional symmetry still holds at extremely high energies. Recently, Devlin has informed us that this symmetry has been tested and confirmed by measuring the decay lifetime of $K_S^0$ in flight at several hundred GeV [2]. The four-dimensional symmetry is one of the most thoroughly tested symmetry principles in physics. In the beginning of his book Invariance Principles and Elementary Particles [3], Sakurai quoted from C. N. Yang's Nobel Lecture: «Nature seems to take advantage of the simple mathematical representations of the symmetry laws. When one pauses to consider the elegance and the beautiful perfection of the mathematical reasoning involved and contrast it with the complex and far-reaching physical consequences, a deep sense of respect for the power of the symmetry laws never fails to develop.» This quotation summarizes the essence of symmetry in physics, which will be illustrated below by a non-trivial analysis of different viewpoints of the physical world to show how four-dimensional symmetry is critical to any theory. In this paper, we would like to demonstrate the importance of using four-dimensional symmetry as a guiding principle for discussing physical laws from different viewpoints of time and space. We first analyze Edwards' original attempt in 1963 to formulate a relativity theory based on a weaker postulate for the speed of light [4], namely that «the two-way speed of light in a vacuum as measured in two coordinate systems moving with constant relative velocity is the same constant regardless of any assumptions concerning the one-way speed» [5]. He derived space and time transformations involving infinitely many possible physical times which can be physically realized through Reichenbach's convention of clock synchronization and which include relativistic time as a special case [4]. Edwards' transformations were shown to be consistent with many experiments related to the propagation of light. Nevertheless, we show that they do not form the Lorentz group so that, in general, physical laws are not invariant under his transformations. Moreover, they also lead to an incorrect expression for the relativistic energy-momentum of a particle.
In view of these results, one may conclude that assuming the universality of the 2-way speed of light and Reichenbach's concept of time is wrong. However, that is not the case. We show that, using symmetry, Reichenbach's general convention of time can be accommodated in a new four-dimensional formalism which is consistent with the relativistic energy-momentum of a particle and the Lorentz group. Such a four-dimensional formalism may be termed «extended relativity» which includes special relativity as a special case. These results are physically and pedagogically interesting for the following reasons: first, as early as 1910 several physicists, including Ritz, Tolman, Kunz and Comstock, raised the question of whether it was possible to construct a relativity theory without postulating the constancy of the (one-way) speed of light [6]. They all failed because the importance and the general structure of the fundamental four-dimensional symmetry were not recognized [5,7]. Now, we are able to answer their question affirmatively. Second, people are usually puzzled by discussions of Reichenbach's convention of time and the impossibility of the unambiguous measurement or test of the one-way speed of light in the literature [4,5]. They ask: Can Reichenbach's convention of time be consistent with the Lorentz-group properties of transformation between two inertial frames? Are the ideas of Reichenbach and Edwards viable? Our results suggest that the key to answering all these questions is four-dimensional symmetry. The physical theory of taiji relativity based solely on the first postulate of relativity, i.e. the principle of relativity for physical laws, has been discussed recently [8]. Such a theory has a four-dimensional coordinate transformation between any two inertial frames, F(w, x, y, z) and F'(w', x', y', z'). The absence of a second postulate in taiji relativity makes it
impossible to factor the inherent evolution variable w, called «taiji-time», into a well-defined velocity and time as in special relativity. As a result, time and the speed of light are undefined in taiji relativity. Nevertheless, the four-dimensional taiji relativity is consistent with all known experiments [7,8]. We show that Edwards' weaker postulate for the speed of light is a further restriction of taiji relativity, just like the second postulate of special relativity. Thus, extended relativity discussed in this paper gives a more restricted view of the four-dimensional physical world than that of taiji relativity. It also shows the power of the first postulate of relativity from the vantage point of four-dimensional symmetry.

2. - Edwards' transformation with Reichenbach's time

Let us consider two inertial frames F and F', where F' is moving along the x-axis. Suppose there are two identical clocks, clock 1 located at the origin of the F frame and clock 2 at point x on the x-axis. A light signal starts from the origin (event 1) at time $t_1$, reaches clock 2 (event 2) at time $t_2$ and returns to the origin (event 3) at time $t_3$. Reichenbach's concept of time can be realized by synchronizing clock 2 to read $t_2$ by the relation [4]

$$t_2 = t_1 + \varepsilon[t_3 - t_1], \tag{2.1}$$

where $\varepsilon$ is restricted by $0 < \varepsilon < 1$, so that causality is preserved, e.g., $t_2$ cannot be earlier than $t_1$. The same synchronization can be done to clocks in the F' frame:

$$t'_2 = t'_1 + \varepsilon'[t'_3 - t'_1]. \tag{2.2}$$

The special case $\varepsilon = \varepsilon' = 1/2$ corresponds to the conventional synchronization of special relativity. One can easily verify that the two-way speed $c_{2w}$ of light along the x-axis in F is given by

$$c_{2w} = 2L/[t_3 - t_1], \tag{2.3}$$

which is independent of $t_2$ and Reichenbach's parameter $\varepsilon$. Similarly, in the F' frame, we also have a constant two-way speed of light,

$$c'_{2w} = 2L'/[t'_3 - t'_1]. \tag{2.4}$$

Following Edwards, we shall assume these two-way speeds of light to be the same,

$$c'_{2w} = c_{2w} = c. \tag{2.5}$$
The synchronization of clocks in F and F' according to (2.1)-(2.5) defines the Reichenbach times which include relativistic time as a special case [4]. In special relativity, an Einstein clock at x is synchronized according to $t_E = t_0 + x/c$ (where $t_0$ is the time of the clock at the origin of F), while the corresponding Reichenbach clock at x reads $t_R = t_0 + 2\varepsilon x/c_{2w} = t$, which follows from (2.1), (2.4) and (2.5). Hence $t_E$ and t are related by

$$t_E = t - (2\varepsilon - 1)x/c = t - qx/c, \qquad q = 2\varepsilon - 1. \tag{2.6}$$

Similarly, in the F' frame, we have the relation

$$t'_E = t' - q'x'/c, \qquad q' = 2\varepsilon' - 1. \tag{2.7}$$
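The synchronization rule (2.1) and the two-way speed (2.3) can be illustrated numerically. The following is only a sketch: the function names, the 1 km baseline and the chosen values of ε are illustrative assumptions, not part of the original text. It shows that the two-way speed needs only the clock at the origin, while the one-way speeds implied by the synchronization change with ε.

```python
import math

C = 299_792_458.0  # two-way speed of light in m/s


def reichenbach_reading(t1, t3, eps):
    """Reading t2 assigned to the distant clock by relation (2.1)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return t1 + eps * (t3 - t1)


def two_way_speed(length, t1, t3):
    """Two-way speed (2.3); it uses only readings of the clock at the origin."""
    return 2.0 * length / (t3 - t1)


def one_way_speeds(length, t1, t3, eps):
    """Outbound and return speeds implied by the chosen synchronization."""
    t2 = reichenbach_reading(t1, t3, eps)
    return length / (t2 - t1), length / (t3 - t2)


# Illustrative numbers (assumed): a 1 km baseline, signal emitted at t1 = 0.
LENGTH = 1000.0
T1 = 0.0
T3 = T1 + 2.0 * LENGTH / C

for eps in (0.25, 0.5, 0.75):
    out, back = one_way_speeds(LENGTH, T1, T3, eps)
    print(eps, two_way_speed(LENGTH, T1, T3), out, back)
```

With q = 2ε - 1, the outbound and return speeds in this sketch are c/(1 + q) and c/(1 - q); only ε = 1/2 makes them equal.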
For simplicity and without loss of generality, we set q = 0 in (2.6), so that physics in the F frame is identical to that in special relativity. We concentrate on physical implications of (2.7) in the F' frame to see whether Edwards' transformation contradicts experiments. Using (2.6) with q = 0, (2.7) and the Lorentz transformation involving Einstein times $t_E$ and $t'_E$, one can obtain the coordinate transformation between inertial frames F and F' [4],

$$t' = \gamma[(1 - \beta q')t - (\beta - q')x/c], \quad x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z; \qquad \beta = V/c, \quad \gamma = 1/(1 - \beta^2)^{1/2}, \tag{2.8}$$

where V is the speed of F' as seen from F. This type of transformation with Reichenbach's time was first derived and discussed by Edwards [4]. It is important to note that

A) Space-time (ct', x', y', z') is no longer a four-vector under the transformation (2.8).

B) As $V \to 0$, F and F' become the same inertial frame, but (2.8) does not reduce to the identity transformation. Instead it becomes

$$t' = t + q'x/c, \quad x' = x, \quad y' = y, \quad z' = z. \tag{2.9}$$

This shows that Edwards' transformations do not form the Lorentz group, except for the special case q' = 0.

C) The «invariant four-dimensional interval» under Edwards' transformation is

$$c^2t'^2 - 2q'x'ct' - x'^2(1 - q'^2) - y'^2 - z'^2 = c^2t^2 - x^2 - y^2 - z^2, \qquad q = 0. \tag{2.10}$$

We may remark that if $q \neq 0$, then $c^2t^2 - x^2 - y^2 - z^2$ will be replaced by $c^2t^2 - 2qxct - x^2(1 - q^2) - y^2 - z^2$. It can be shown that this quadratic form holds also for infinitesimal intervals.
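Properties B) and C) can be checked numerically. The sketch below assumes the form of Edwards' transformation obtained by inserting (2.7) into the Lorentz transformation, t' = γ[(1 - βq')t - (β - q')x/c] and x' = γ(x - βct), and it writes the quadratic form with the cross-term sign that follows from (2.7), namely (ct' - q'x')² - x'². The function names and the units c = 1 are illustrative assumptions; y and z are omitted because they are unchanged.

```python
import math


def edwards(t, x, beta, q_prime, c=1.0):
    """Edwards' transformation with q = 0 in F; returns (t', x')."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    t_p = gamma * ((1.0 - beta * q_prime) * t - (beta - q_prime) * x / c)
    x_p = gamma * (x - beta * c * t)
    return t_p, x_p


def interval_in_f(t, x, c=1.0):
    """Usual interval c^2 t^2 - x^2 in the frame F, where q = 0."""
    return (c * t) ** 2 - x**2


def interval_in_f_prime(t_p, x_p, q_prime, c=1.0):
    """Quadratic form in F' that follows from t'_E = t' - q' x'/c."""
    return (c * t_p) ** 2 - 2.0 * q_prime * x_p * c * t_p - x_p**2 * (1.0 - q_prime**2)


# Property C: the two quadratic forms agree for an arbitrary event.
t_p, x_p = edwards(2.0, 0.5, beta=0.6, q_prime=0.3)
print(interval_in_f(2.0, 0.5), interval_in_f_prime(t_p, x_p, 0.3))

# Property B: with beta = 0 the map is still not the identity unless q' = 0.
print(edwards(2.0, 3.0, beta=0.0, q_prime=0.4))
```

In this sketch the V → 0 limit leaves x unchanged but shifts the time by q'x/c, which is the failure of the identity property described in B).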
3. - Difficulties of Edwards' transformation

The transformation (2.8) was shown to be consistent with many experiments related to the propagation of light. Moreover, since Edwards' transformation can be obtained by a «change of time variables», (2.7), in the Lorentz transformation, one might think that it is equivalent to the Lorentz transformation [4]. However, we show that this is not the case because Edwards' transformation (2.8) does not have the Lorentz-group properties in general due to the property of t'. Thus, Edwards' transformation (2.8), as it stands, violates four-dimensional symmetry. As a result, one can show explicitly (see (3.4) below) that the time t' leads to an incorrect expression for the relativistic energy-momentum of a particle, which contradicts experiments in general, except for the special case q' = 0. (The correct transformation takes the form (4.3) rather than (2.8), as we shall see later.)
To wit, let us consider Edwards' transformation (2.8), where we have chosen a frame F in which the Reichenbach time equals the Einstein time by setting q = 0. We stress that this frame can be chosen arbitrarily. (If one wishes, one can set q' = 0 instead of q = 0. The following arguments still hold with F and F' interchanged.) For a free particle, we have the actions S and S' in F and F', respectively:

$$S = -\int mc\,ds = \int L\,dt, \quad ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2, \quad L = -mc^2(1 - v^2/c^2)^{1/2}, \quad \text{in } F\ (q = 0), \tag{3.1}$$

$$S' = -\int mc\,ds' = \int L'\,dt', \quad ds'^2 = c^2dt'^2 - 2q'dx'\,c\,dt' - dx'^2(1 - q'^2) - dy'^2 - dz'^2,$$
$$L'(E) = -mc^2[(1 - q'v'_x/c)^2 - v'^2/c^2]^{1/2}, \quad \text{in } F'\ (1 > q' > -1). \tag{3.2}$$

Note that S = S' because ds = ds', which can be verified by taking differentials of (2.8). The constant c in (3.2) is the universal 2-way speed of light measured in F'. The Lagrangians L and L' lead to the following momenta in F and F', respectively:

$$p_x(E) = \partial L/\partial v_x = mv_x/(1 - v^2/c^2)^{1/2} = p_x, \quad \text{etc. in } F; \tag{3.3}$$

$$p'_x(E) = \partial L'/\partial v'_x = (mcq' - mq'^2v'_x + mv'_x)/[(1 - q'v'_x/c)^2 - v'^2/c^2]^{1/2}, \quad \text{etc. in } F', \tag{3.4}$$

according to Edwards' formalism. We see clearly that the momentum $p'_x(E)$ in F' is in contradiction to experimental results, except for the special case q' = 0. Similarly, one can show that the result for the energy, defined by $\mathbf{p}'(E)\cdot\mathbf{v}' - L'$, in F' is also incorrect. Before dismissing the expression (3.4) as incorrect, one should see whether (3.4) is the same as the result of changing the time variable of the relativistic momentum $p'_x(SR)$ of special relativity in the F' frame,

$$p'_x(SR) = mu'_x/(1 - u'^2/c^2)^{1/2}, \qquad u'_x = dx'/dt'_E, \quad \text{etc. in } F'. \tag{3.5}$$

Under a change of time variable (2.7), we have

$$dx'/dt' = (dx'/dt'_E)(dt'_E/dt'), \ \text{etc.}, \quad \text{i.e.} \quad \mathbf{u}' = \mathbf{v}'/(1 - q'v'_x/c). \tag{3.6}$$

And from (3.5) and (3.6), we have

$$p'_x(SR) = mv'_x/[(1 - q'v'_x/c)^2 - v'^2/c^2]^{1/2} \neq p'_x(E), \tag{3.7}$$

which differs from the momentum $p'_x(E)$ in (3.4). This example shows that the lack of four-dimensional symmetry in Edwards' formalism based on transformation (2.8) with 1 > q' > -1 makes it untenable. Of course, there are many other difficulties related to the lack of symmetry, such as the non-invariance of the Maxwell equations, the Klein-Gordon equation, laws in quantum electrodynamics, etc. under transformation (2.8). In the next section, we show that if one uses four-dimensional symmetry of taiji relativity as a guiding principle, all these difficulties can be resolved.
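The mismatch between (3.4) and (3.7) is easy to see with numbers. The sketch below is illustrative only: the function names, the units m = c = 1 and the restriction to motion in the (x', y')-plane are assumptions made here. It evaluates the Lagrangian L'(E) of (3.2), the Edwards momentum (3.4), and the special-relativistic momentum (3.5) rewritten through (3.6).

```python
import math


def lagrangian_edwards(vx, vy, q_prime, m=1.0, c=1.0):
    """L'(E) of (3.2) for velocities measured with Reichenbach's time."""
    inside = (1.0 - q_prime * vx / c) ** 2 - (vx**2 + vy**2) / c**2
    return -m * c**2 * math.sqrt(inside)


def p_edwards(vx, vy, q_prime, m=1.0, c=1.0):
    """x-momentum (3.4), i.e. the derivative of L'(E) with respect to v'_x."""
    root = math.sqrt((1.0 - q_prime * vx / c) ** 2 - (vx**2 + vy**2) / c**2)
    return (m * c * q_prime - m * q_prime**2 * vx + m * vx) / root


def p_special_relativity(vx, vy, q_prime, m=1.0, c=1.0):
    """x-momentum (3.5) with u' = v'/(1 - q' v'_x/c) from (3.6)."""
    scale = 1.0 - q_prime * vx / c
    ux, uy = vx / scale, vy / scale
    return m * ux / math.sqrt(1.0 - (ux**2 + uy**2) / c**2)


# A particle at rest in F' (v' = 0) already exposes the difference.
print(p_edwards(0.0, 0.0, 0.3), p_special_relativity(0.0, 0.0, 0.3))

# For q' = 0 the two expressions coincide.
print(p_edwards(0.2, 0.1, 0.0), p_special_relativity(0.2, 0.1, 0.0))
```

In this sketch a particle at rest in F' has the Edwards momentum mcq', which vanishes only for q' = 0, whereas the special-relativistic value is zero for every q'.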
4. - Extended relativity: a four-dimensional theory with Reichenbach's time

The method for the construction of a four-dimensional symmetry framework without necessarily having the usual relativistic time has been discussed before [5,7]. The logically simplest case is the four-dimensional symmetry of taiji relativity which is based solely on the first postulate of relativity [8,7]. It can be applied to guide the construction of the four-dimensional framework for the present case with Reichenbach's time. For simplicity and without loss of generality, we choose q = 0 in synchronizing clocks in the F frame, so that we have the usual relativistic time $t = t_E$. In the F' frame, clocks are synchronized to read Reichenbach's time t'. An event is, as usual, denoted by (ct, x, y, z) in F. However, following taiji relativity [8], the same event must be denoted by (b't', x', y', z') in F' within the four-dimensional symmetry framework. We stress that it is necessary to introduce the function b' so that $(b't', x', y', z') = (w', \mathbf{r}')$ transforms like a four-vector and the laws of physics can display four-dimensional symmetry. Otherwise, laws of physics cannot be invariant under the transformation from F to F'. With synchronization of clocks in F and F' as discussed previously, the times t and t' are related as in (2.8):

$$t' = \gamma[(1 - \beta q')t - (\beta - q')x/c], \qquad q' = 2\varepsilon' - 1. \tag{4.1}$$

This may be considered as an assumption. Indeed, from the viewpoint of taiji relativity [8], this is the second postulate which is the same as assuming the universality of the two-way speed of light over a closed path in any inertial frame, as discussed in sect. 2. As usual, we start with the invariance of the four-dimensional interval $s^2$,

$$s^2 = b'^2t'^2 - x'^2 - y'^2 - z'^2 = c^2t^2 - x^2 - y^2 - z^2, \tag{4.2}$$

to derive the four-dimensional transformation. We may remark that the relation (4.2) or $[c^2t^2 - x^2 - y^2 - z^2]/s^2 = 1$, $s^2 > 0$, is nothing but the law of motion (i.e. the energy-momentum relation, $p_0^2 - \mathbf{p}^2 = m^2c^2$), for a free particle with a mass m > 0. (This equivalence can be seen more easily by using the infinitesimal form $ds^2$ in both F and F' frames. See also sect. 6 below.) By the usual method for deriving the Lorentz transformation or the taiji transformation [8], we can derive the extended four-dimensional transformation involving Reichenbach's time,

$$w' = b't' = \gamma(ct - \beta x), \quad x' = \gamma(x - \beta ct), \quad y' = y, \quad z' = z; \qquad \beta = V/c, \quad \gamma = 1/(1 - \beta^2)^{1/2}, \tag{4.3}$$

where t' is given by (4.1). This extended transformation completely determines the function b':

$$b' = (ct - \beta x)/[t(1 - \beta q') - (\beta - q')x/c], \tag{4.4}$$

i.e. if x and t are known, we can always calculate the value of b' in (4.3) and (4.4). Note that the fourth dimension in F' is now b't' = w' rather than t' or ct', where w' may be called «lightime». Although the function b' and the time t' separately have complicated transformation properties, we stress that b' and t' are separately well defined, in sharp contrast to b' and t' in taiji relativity, and that the product b't' = w' transforms as the fourth component of the coordinate four-vector. The property of b' is completely determined by (4.4) or the transformation (4.3). From now on we use w' = b't' to display four-dimensional symmetry of physical laws.

The extended transformation of velocities can be derived from (4.3):

$$c' = d(b't')/dt' = (dt/dt')\gamma(c - \beta v_x), \quad v'_x = (dt/dt')\gamma(v_x - \beta c), \quad v'_y = (dt/dt')v_y, \quad v'_z = (dt/dt')v_z, \tag{4.5}$$

where dt/dt' may be obtained from differentiation of (4.1),

$$1/[dt/dt'] = dt'/dt = \gamma\{[1 - \beta q'] - [\beta - q'][dx/dt]/c\}. \tag{4.6}$$

We stress that the definition c' = d(b't')/dt' is quite natural because $[d(b't')]^2 - d\mathbf{r}'^2 = 0$ is to be interpreted as the law of light propagation, $c'^2dt'^2 - d\mathbf{r}'^2 = 0$. Note that the transformation of the ratios $\mathbf{v}'/c'$ is precisely the same as that in special relativity,

$$v'_x/c' = (v_x/c - \beta)/(1 - v_x\beta/c); \qquad v'_y/c' = (v_y/c)/[\gamma(1 - \beta v_x/c)]. \tag{4.7}$$
Let us consider the property of c' in F'. Its value depends on its direction of propagation. Suppose a light signal propagates along an angle $\theta$ as measured in F and $\theta'$ as measured in F'. We have

$$v_x = c\cos\theta, \qquad v'_x = c'\cos\theta'. \tag{4.8}$$

From the inverse transformation of (4.5), we have

$$c = (dt'/dt)[\gamma(c' + \beta v'_x)], \quad \text{i.e.} \quad c' = c/\{(dt'/dt)[\gamma(1 + \beta\cos\theta')]\} \equiv c'(\theta'). \tag{4.9}$$

Evidently, the average speed of light over any closed path in F is a constant c. Now we prove that the average speed of a light signal over any closed path in F' is also c, though the speed of light is no longer isotropic in F'. Suppose a light signal travels along the vectors $\mathbf{r}'_i$, where i = 1, 2, ..., N, which form a closed path on the (x', y')-plane in F'. The total distance $L'_{tot}$ and the total time $T'_{tot}$ are given by

$$L'_{tot} = \sum_{i=1}^{N} r'_i \quad \text{and} \quad T'_{tot} = \sum_{i=1}^{N} r'_i/c'(\theta'_i); \qquad N > 1. \tag{4.10}$$

The average speed $c'_{av}$ of the light signal over this closed path is

$$c'_{av} = L'_{tot}/T'_{tot} = L'_{tot}\Big/\Big\{L'_{tot}/c + (q'/c)\sum_{i=1}^{N} r'_i\cos\theta'_i\Big\} = c, \tag{4.11}$$

where we have used

$$v'_{xi} = c'(\theta'_i)\cos\theta'_i, \tag{4.12}$$

$$c'(\theta'_i) = c/\{(dt'/dt)_i[\gamma(1 + \beta\cos\theta'_i)]\}, \tag{4.13}$$

$$(dt'/dt)_i = (1 + q'\cos\theta'_i)/[\gamma(1 + \beta\cos\theta'_i)], \qquad i = 1, 2, ..., N, \tag{4.14}$$

$$\sum_{i=1}^{N} r'_i\cos\theta'_i = 0. \tag{4.15}$$
Equation (4.15) is a property of a closed path in F'. Thus, the average speed of light over an arbitrary closed path is a universal constant c in extended relativity.

We note that a closed path for a light signal as observed by F observers is in general not a closed path as observed by F' observers. Suppose a signal starting from O (O'), i.e. the origin of F (F'), travels to a point A (A') on the x-axis and is reflected back to B = O ($B' \neq O'$) as observed in F (F'). Evidently, the two-way speed of this signal in F is c. One may ask: What is the average speed of this light signal as measured in F'? From Reichenbach's time (4.1) and extended transformation (4.3), we have the space and time coordinates of these events in F',

$$t'(O') = x'(O') = 0, \tag{4.16}$$

$$t'(A') = \gamma\{[1 - \beta q']t(A) - [\beta - q']x(A)/c\}, \qquad x'(A') = \gamma(x(A) - \beta ct(A)), \tag{4.17}$$

$$t'(B') = \gamma\{[1 - \beta q']t(B) - [\beta - q']x(B)/c\}, \qquad x'(B') = \gamma(x(B) - \beta ct(B)). \tag{4.18}$$

Since t(O) = x(O) = x(B) = 0, x(A) = ct(A) = L and t(B) = 2L/c, eqs. (4.16)-(4.18) lead to the «average speed» $c'_{av}(nc)$:

$$c'_{av}(nc) = \{x'(A') + |x'(B') - x'(A')|\}/\{t'(A') + |t'(B') - t'(A')|\} = c/[1 - \beta q'], \tag{4.19}$$

for such a non-closed path in F'. The result (4.19) can also be derived from (4.5) directly because this average speed is determined by two events which satisfy $\Delta x = x(B) - x(O) = 0$. As far as constant velocities are concerned, $\Delta x = 0$ is equivalent to dx = 0 or $v_x = 0$. Thus using the expression for c' for a non-closed path in F', we have

$$c'(v_x = 0) = c/(1 - \beta q'), \tag{4.20}$$

where we have used the relation (4.6) with the condition dx/dt = 0,

$$(dt'/dt) = \gamma(1 - \beta q'), \qquad v_x = 0. \tag{4.21}$$
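The two averages discussed in this section can be checked with a few lines of arithmetic. The sketch below is illustrative: it uses the direction-dependent speed c'(θ') = c/(1 + q' cos θ') that results from combining (4.13) with (4.14), takes a triangle as the closed path in F', and uses units with c = 1; the function names and the sample values of β and q' are assumptions made here.

```python
import math


def c_prime(cos_theta_p, q_prime, c=1.0):
    """One-way light speed in F' from (4.13) combined with (4.14)."""
    return c / (1.0 + q_prime * cos_theta_p)


def average_speed_closed(points, q_prime, c=1.0):
    """Average light speed around the closed polygon through `points` in F'."""
    total_length = 0.0
    total_time = 0.0
    n = len(points)
    for i in range(n):
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]
        r = math.hypot(x1 - x0, y1 - y0)
        cos_theta_p = (x1 - x0) / r
        total_length += r
        total_time += r / c_prime(cos_theta_p, q_prime, c)
    return total_length / total_time


def average_speed_nonclosed(beta, q_prime, length=1.0, c=1.0):
    """Average speed (4.19) for an out-and-back path that is closed in F."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)

    def to_f_prime(t, x):
        t_p = gamma * ((1.0 - beta * q_prime) * t - (beta - q_prime) * x / c)
        x_p = gamma * (x - beta * c * t)
        return t_p, x_p

    t_a, x_a = to_f_prime(length / c, length)      # reflection at A
    t_b, x_b = to_f_prime(2.0 * length / c, 0.0)   # return to the origin of F
    distance = x_a + abs(x_b - x_a)
    duration = t_a + abs(t_b - t_a)
    return distance / duration


triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
print(average_speed_closed(triangle, q_prime=0.6))
print(average_speed_nonclosed(beta=0.5, q_prime=0.4))
```

The closed-path average does not depend on q' because the x'-components of the legs sum to zero, as in (4.15); the non-closed value reproduces c/(1 - βq').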
5. - Basic postulates of extended relativity

The physical foundation of the extended transformation (4.3) is based on two postulates: the invariance of the form of physical laws and Reichenbach's time (4.1). The latter may also be regarded as a convention of time. As we have discussed previously, the relation (4.1) is directly related to the clock synchronization. To be specific, we have used the relation between Reichenbach's and Einstein's times, together with the Lorentz transformation, to obtain (4.1). One may ask: Is it possible to derive the extended transformation (4.3) without using Einstein's time and the Lorentz transformation as a crutch? The answer is yes. Instead of postulating (4.1), one can use its equivalent postulate: namely, that the 2-way speed of light over a closed path in any frame is a universal constant and independent of the motion of the light sources. This postulate was first made by Edwards for his formalism [4]. Let us now derive the relation (4.1) based on the two basic postulates of extended relativity:
1. The principle of relativity for physical laws: The form of a physical law must be invariant under coordinate transformations. (In other words, physical laws must display four-dimensional symmetry.)

2. The 2-way speed of light over a closed path is a universal constant and is independent of the motion of sources.

To derive (4.1) from these two postulates, we first observe that the first postulate leads to (4.2) and (4.3) but not to (4.1). Since the F' frame moves along the x-axis, its time t' can only be a linear function of t and x,

$$t' = Pt + Qx, \tag{5.1}$$

where the unknown coefficients P and Q are to be determined. From (4.5) and (5.1), we have

$$c'(+) = (dt/dt')_1\,c\gamma(1 - \beta), \qquad dx/dt = +c, \tag{5.2}$$

$$c'(-) = (dt/dt')_2\,c\gamma(1 + \beta), \qquad dx/dt = -c, \tag{5.3}$$

$$(dt'/dt)_1 = P + Qc, \tag{5.4}$$

$$(dt'/dt)_2 = P - Qc. \tag{5.5}$$

Postulate 2 implies that

$$L'/c'(+) + L'/c'(-) = 2L'/c'_{2w} = 2L'/c. \tag{5.6}$$

It follows from (5.2)-(5.6) that

$$P + \beta cQ = 1/\gamma. \tag{5.7}$$

Without loss of generality, we may express Q in terms of another parameter q' such that (5.1) is more closely related to the form (4.1) in Edwards' transformation,

$$Q = -\gamma(\beta - q')/c. \tag{5.8}$$
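The algebra leading from postulate 2 to the coefficients P and Q can be verified numerically. The sketch below is illustrative: it takes Q with the sign that reproduces (4.1), Q = -γ(β - q')/c, solves (5.7) for P, and then confirms that the two one-way speeds of (5.2)-(5.5) average to c over the round trip. The function names, the units c = 1 and the sample values of β and q' are assumptions made here.

```python
import math


def time_coefficients(beta, q_prime, c=1.0):
    """Coefficients of t' = P t + Q x from (5.7) and (5.8)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    q_coeff = -gamma * (beta - q_prime) / c   # (5.8), sign chosen to match (4.1)
    p_coeff = 1.0 / gamma - beta * c * q_coeff  # solve (5.7) for P
    return p_coeff, q_coeff


def two_way_speed_in_f_prime(p_coeff, q_coeff, beta, c=1.0):
    """Round-trip light speed along x' built from (5.2)-(5.5)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    c_plus = c * gamma * (1.0 - beta) / (p_coeff + q_coeff * c)
    c_minus = c * gamma * (1.0 + beta) / (p_coeff - q_coeff * c)
    # Equal path lengths out and back, so the average is the harmonic mean.
    return 2.0 / (1.0 / c_plus + 1.0 / c_minus)


P, Q = time_coefficients(beta=0.6, q_prime=0.3)
print(P, Q, two_way_speed_in_f_prime(P, Q, beta=0.6))
```

In this sketch P comes out as γ(1 - βq'), the coefficient of t in (4.1), and the round-trip speed equals c for any admissible q'.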
The relations (5.1), (5.7) and (5.8) lead to

$$t' = \gamma[(1 - \beta q')t - (\beta - q')x/c], \tag{5.9}$$

which is precisely the basic relation of Reichenbach's time (4.1).

6. - Physical implications of extended relativity

Although extended relativity involves a class of different concepts of time, realized by Reichenbach's procedures of clock synchronization with the universal 2-way speed of light, all physical laws have the four-dimensional forms which are identical to those in special relativity. Let us first demonstrate that extended relativity leads to a correct expression for momentum, in contrast to Edwards' formalism. The invariant action for a free particle in F' is

$$S' = -\int mc\,ds' = \int L'\,dt', \qquad c = c'_{2w} = c_{2w},$$
$$ds'^2 = c'^2dt'^2 - dx'^2 - dy'^2 - dz'^2, \qquad c' = d(b't')/dt'. \tag{6.1}$$

Note that c in (6.1) is the two-way speed of light which is a universal constant. The Lagrangian L' in the F' frame takes the form

$$L' = -mcc'(1 - v'^2/c'^2)^{1/2}, \tag{6.2}$$

which leads to the momentum $\mathbf{p}'$,

$$\mathbf{p}' = (mc\mathbf{v}'/c')/(1 - v'^2/c'^2)^{1/2}. \tag{6.3}$$

The «energy» $p'_0$ is defined as the zeroth component of the momentum by

$$p'_0 = (\mathbf{p}'\cdot\mathbf{v}' - L')/c' = mc/(1 - v'^2/c'^2)^{1/2}. \tag{6.4}$$

These form the momentum 4-vector $p'^\mu = (p'_0, \mathbf{p}')$ which satisfies the four-dimensional invariant relation

$$p'^2_0 - \mathbf{p}'^2 = m^2c^2. \tag{6.5}$$

By Noether's theorem, this is the conserved energy-momentum in extended relativity. We see clearly that the momentum $\mathbf{p}'$ and the energy $p'_0$ in F' are consistent with those seen in high-energy experiments. As a matter of fact, the momentum $p'_x$ given by (6.3) is precisely the same as the momentum $p'_x(SR)$ in (3.5) for the F' frame in special relativity because of the relations (3.6) and (2.7) and the invariant property that $ds^2$ in F' for special relativity is the same as $ds'^2$ given by (6.1) (i.e. $ds^2 = c^2dt'^2_E - dx'^2 - dy'^2 - dz'^2 = c'^2dt'^2 - dx'^2 - dy'^2 - dz'^2$):

$$p'_x(SR) = mv'_xc/\{(1 - q'v'_x/c)[c^2 - u'^2]^{1/2}\} = p'_x. \tag{6.6}$$
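The agreement stated in (6.6) can be checked numerically. The sketch below is illustrative: it uses c' = c - q'v'_x, which is an inference made here from $c'dt' = c\,dt'_E$ (the equality of the two forms of $ds^2$ quoted above) together with (2.7), and it restricts the motion to the (x', y')-plane with m = c = 1. The function names and sample velocities are assumptions.

```python
import math


def c_prime_of_velocity(vx, q_prime, c=1.0):
    """c' = d(b't')/dt' = c dt'_E/dt' = c - q' v'_x (inferred from (2.7))."""
    return c - q_prime * vx


def extended_four_momentum(vx, vy, q_prime, m=1.0, c=1.0):
    """(p0', px', py') from (6.3) and (6.4)."""
    cp = c_prime_of_velocity(vx, q_prime, c)
    root = math.sqrt(1.0 - (vx**2 + vy**2) / cp**2)
    return m * c / root, m * c * (vx / cp) / root, m * c * (vy / cp) / root


def sr_momentum_x(vx, vy, q_prime, m=1.0, c=1.0):
    """p'_x(SR) of (3.5), with u' obtained from v' through (3.6)."""
    scale = 1.0 - q_prime * vx / c
    ux, uy = vx / scale, vy / scale
    return m * ux / math.sqrt(1.0 - (ux**2 + uy**2) / c**2)


p0, px, py = extended_four_momentum(0.2, 0.1, q_prime=0.3)
print(px, sr_momentum_x(0.2, 0.1, q_prime=0.3))
print(p0**2 - px**2 - py**2)  # invariant (6.5), equal to m^2 c^2
```

Unlike the Edwards momentum (3.4), the extended momentum in this sketch vanishes for a particle at rest in F' and satisfies the invariant relation (6.5) for every q'.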
This example shows that for Reichenbach's time to be viable it must be embedded in the four-dimensional framework. Of course, the Klein-Gordon equation with the form

$$(\partial'_\mu\partial'^\mu - k^2)\Phi = 0 \tag{6.7}$$

is invariant under the extended transformation (4.3) as well because the differential operators

$$\partial_\mu = \partial/\partial x^\mu \quad \text{and} \quad \partial'_\mu = \partial/\partial x'^\mu \tag{6.8}$$

are 4-vectors, where $x^\mu = (ct, \mathbf{r})$, $x'^\mu = (w', \mathbf{r}') = (b't', \mathbf{r}')$.

7. - Remarks and conclusions

We would like to make some remarks as follows:

1) It appears that a particular concept of time, such as Reichenbach's time, cannot be judged to be correct or not on its own merits. Only when it is combined with
another postulate which specifies a symmetry framework, can one make a proper judgement as to its correctness [5,7]. Although we do not have the usual relativistic time dilation in transformation (4.3) of extended relativity, explicit calculations of the lifetime of a particle decaying in flight show that the theory is consistent with experiments. (See appendix B.)

2) If we use the inverse transformation of (4.3) to express ct and x in the function b' given by (4.4), we would obtain

$$b' = c - q'x'/t'. \tag{7.1}$$

This turns out to be consistent with the function b' obtained from comparing the four-dimensional intervals (4.2) and (2.10) (or from (2.7) and (7.2) below).

3) Although Edwards' transformation (2.8) can be obtained by a simple «change of time variables» (2.7) in the Lorentz transformation, this does not imply that the two transformations are physically equivalent. We have shown in sect. 3 that Edwards' transformation leads to an incorrect momentum because of its lack of four-dimensional symmetry (see eqs. (3.2)-(3.7)). In this sense, the principle of relativity for physical laws has not been really incorporated in his formalism. The main reason is that (ct', x', y', z') is not a four-vector. In contrast, the present formalism of extended relativity based on the new transformation (4.3) between coordinate four-vectors (w, x, y, z) and (w', x', y', z'), where w = ct and w' = b't', is explicitly consistent with this principle. If one compares the space-lightime transformation (4.3) with the Lorentz transformation, one sees that

$$ct'_E = b't', \tag{7.2}$$

where $t'_E$ and t' are, respectively, Einstein's time and Reichenbach's time. However, extended relativity is not a trivial change of time variables. If extended relativity were just a change of time variables given by (7.2), we should be able to obtain all results in extended relativity from the corresponding results in special relativity by the relation (7.2). To wit, the momentum (6.3) indeed can be obtained from the usual relativistic momentum (3.5) by a simple change of variables (7.2). However, this is not always the case. For example, a plane wave in F' is described by an invariant function

$$\exp[i(\omega't'_E - \mathbf{k}'\cdot\mathbf{r}')] \tag{7.3}$$

in special relativity. By a simple change of time variable (7.2), one obtains

$$\exp[i(\omega'b't'/c - \mathbf{k}'\cdot\mathbf{r}')], \tag{7.4}$$

which leads to an incorrect wave 4-vector $(\omega'/c, \mathbf{k}')$, since one has

$$\nu'\lambda' = c' \tag{7.5}$$

in extended relativity. This implies that the wave 4-vector in F' is

$$(k'_0, \mathbf{k}') = (\omega'/c', \mathbf{k}'). \tag{7.6}$$

We note that if quantum mechanics is formulated on the basis of extended relativity, the 4-momentum of a photon should be proportional to the extended-wave 4-vector (7.6) rather than $(\omega'/c, \mathbf{k}')$. We may remark that from (7.3) one can obtain the correct invariant phase, involving (7.6) and (w', x', y', z'), by changing both time and the quantity related to time (e.g., the frequency). In doing this, one must first essentially know the correct wave 4-vector in extended relativity. Therefore, changing the time variable in special relativity appears not to be a useful approach for obtaining extended relativity.
4) After the transformation of time variables in (2.7), we are not left with the same theory (special relativity) because, strictly speaking, special relativity is a theory which possesses the four-dimensional symmetry with a universal constant for the one-way speed of light and the Einstein time $t'_E$ has the usual time dilatation property. However, after the transformation of time in (2.7) (which actually violates and destroys the second postulate of special relativity), one has extended relativity which possesses the four-dimensional symmetry with a universal constant only for the two-way speed of light (i.e. Edwards' weaker postulate for the speed of light) and the Reichenbach time t' given by (5.9) does not have the usual time dilatation property in general. But we show in appendix B that the experimental results of lifetime dilatation of unstable particles decaying in flight can be understood in extended relativity in terms of the dilatation of their decay length.

5) Although c' differs in different directions in F', if one compares speeds of various physical objects in a given direction in F', the speed of light is still the maximum speed in the Universe. This can be seen from the expressions for extended relativistic momentum and energy in (6.3) and (6.4) or the extended transformation of velocities (4.5). Furthermore, even though the velocities c' and $\mathbf{v}'$ measured by using Reichenbach's time depend on the parameter q', as shown in (4.5) and (4.6), the experimental results (such as the Michelson-Morley experiment, the conservation laws of momentum $\mathbf{p}'$ and «energy» $p'_0$, the dilatation of lifetimes, etc.) turn out to be independent of q'. For example, the momentum $\mathbf{p}'$ in (6.3) depends on the ratio $\mathbf{v}'/c' = (d\mathbf{r}'/dt')/(dw'/dt') = d\mathbf{r}'/dw'$ which is independent of time t' or q'. Thus, the restriction for q' in (3.2) is really not necessary from the experimental viewpoint.
The basic reason for these properties is that, within the four-dimensional symmetry framework, the inherent evolution variable in the laws of physics is the lightime w' rather than Reichenbach's time t', as indicated by taiji relativity [8]. (Cf. appendix B.)

From the discussions in 3) and 4), we conclude that extended relativity and special relativity are two logically different theories: one has a universal one-way speed of light and the other has a universal two-way speed of light (which includes the former as a special case). Nevertheless, they both have four-dimensional symmetry, so that physical laws have the same forms in all inertial frames. We have examined all known experiments. Strangely enough, these experiments cannot distinguish extended relativity from special relativity. For any covariant law in special relativity, there is a corresponding law of the same form in extended relativity. This suggests that extended relativity is equivalent to special relativity, at least as far as known physics is concerned. We believe this indicates that physical properties such as our concept of time (e.g., Einstein's relativistic time or Reichenbach's time, etc.) and the corresponding speed of light in the four-dimensional symmetry framework are human conventions rather than the inherent nature of the physical world. This conclusion was already indicated by the results of four-dimensional symmetry of taiji relativity [8].
This work was supported in part by The Jing Shin Research Fund of the UMass Dartmouth Foundation. One of us (JP) would like to thank Yuan-Zhong Zhang for correspondences.
APPENDIX A

Group properties of the space-lightime transformation

If an object is at rest in F, i.e. $\mathbf{v} = (0, 0, 0)$, the extended velocity transformations (4.5) lead to

$$c' = c/(1 - \beta q') \equiv c'(0), \qquad v'_x = -\beta c/(1 - \beta q') \equiv v'_x(0), \qquad v'_y = v'_z = 0. \tag{A.1}$$

This implies that the speed of the F frame, measured from F', is different from the speed of F' measured from F. When one thinks about it, there is no reason, and in particular, no experimental evidence (obtained independently of the second postulate of special relativity) that the speed of F measured from F' should be -V. One might claim that common sense dictates that the two speeds be equal and opposite. However, as is evidenced by special relativity and quantum mechanics, common sense is often a poor guide in modern physics. Consider the ratio $v'_x(0)/c'(0)$ in (A.1). This ratio is a constant, independent of dt/dt', and satisfies

$$\beta' \equiv -v'_x(0)/c'(0) = +\beta. \tag{A.2}$$

Thus, using the ratios of velocities, $\beta'$ and $\beta$, one restores the symmetry of velocities between F and F'. The inverse transformation of (4.3) can be written as

$$ct = \gamma'(b't' + \beta'x'), \quad x = \gamma'(x' + \beta'b't'), \quad y = y', \quad z = z'; \qquad \gamma' = \gamma. \tag{A.3}$$

Let us consider also another frame F'' moving with a constant velocity $\mathbf{V}_1 = (V_1, 0, 0)$, as measured from F, and $\mathbf{V}'_1 = (V'_1, 0, 0)$, as measured in F'. Note that the ratio $V'_1/c'$ is related to $V_1$ by (4.7) [7]:

$$V'_1/c' = (V_1/c - \beta)/(1 - \beta V_1/c). \tag{A.4}$$

From (A.1), (A.2) and (A.4), we see that instead of V, one should use the ratio V/c to characterize the relative motion between inertial frames in extended relativity, as this ratio will always be constant and, more importantly, is independent of the parameter q'. The four-dimensional transformation between F and F'' is given by

$$b''t'' = \gamma_1(ct - \beta_1x), \quad x'' = \gamma_1(x - \beta_1ct), \quad y'' = y, \quad z'' = z, \tag{A.5}$$

where

$$t'' = \gamma_1[(1 - \beta_1q'')ct - (\beta_1 - q'')x]/c. \tag{A.6}$$

From (4.3) and (A.5), we can obtain the transformation between F' and F'',

$$b''t'' = \gamma'_1(b't' - \beta'_1x'), \quad x'' = \gamma'_1(x' - \beta'_1b't'), \quad y'' = y', \quad z'' = z', \tag{A.7}$$

where

$$\beta'_1 = (\beta_1 - \beta)/(1 - \beta_1\beta) = V'_1/c', \qquad \beta_1 = V_1/c, \tag{A.8}$$

$$\gamma'_1 = \gamma_1\gamma(1 - \beta_1\beta) = 1/(1 - \beta'^2_1)^{1/2}, \qquad \gamma_1 = 1/(1 - \beta_1^2)^{1/2}, \tag{A.9}$$

$$t'' = \gamma'_1[(1 - \beta'_1q'')b't' - (\beta'_1 - q'')x']/c. \tag{A.10}$$

This result (A.7), together with other properties such as the existence of an inverse transformation and associativity, demonstrates that the set of extended four-dimensional transformations forms the Lorentz group. This Lorentz-group property is the core of the four-dimensional symmetry and is crucial for extended relativity to be consistent with experiments.
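The composition property behind (A.7)-(A.10) can be illustrated numerically. In the sketch below the transformation (4.3) is represented by a 2x2 matrix acting on the pair (w, x) = (lightime, position), and the Reichenbach time is recovered from the lightime through t = (w + qx)/c, which follows from (7.2) and (2.7). The function names, the units c = 1 and the sample values are illustrative assumptions.

```python
import math


def gamma_of(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)


def boost(beta):
    """Matrix of (4.3) acting on the pair (w, x)."""
    g = gamma_of(beta)
    return ((g, -g * beta), (-g * beta, g))


def apply(matrix, w, x):
    return (matrix[0][0] * w + matrix[0][1] * x,
            matrix[1][0] * w + matrix[1][1] * x)


def relative_beta(beta1, beta):
    """beta'_1 of (A.8): velocity ratio of F'' as measured in F'."""
    return (beta1 - beta) / (1.0 - beta1 * beta)


def reichenbach_time(w, x, q, c=1.0):
    """Reichenbach time from lightime, t = (w + q x)/c, via (7.2) and (2.7)."""
    return (w + q * x) / c


beta, beta1, q2 = 0.6, 0.8, 0.25   # F -> F', F -> F'', and q'' (assumed values)
w, x = 2.0, 0.5                    # event in F with w = ct

direct = apply(boost(beta1), w, x)                                   # (A.5)
via_f_prime = apply(boost(relative_beta(beta1, beta)), *apply(boost(beta), w, x))  # (A.7)
print(direct, via_f_prime)
print(reichenbach_time(*direct, q2), reichenbach_time(*via_f_prime, q2))
```

Both routes give the same (w'', x'') and therefore the same t'', so in this sketch the parameters q' and q'' never enter the composition law for the four-vector part of the transformation.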
APPENDIX B

Decay rate and «lifetime dilatation»

Fundamental wave equations such as (6.7) with (6.8) show that four-dimensional symmetry dictates the evolution parameter to be the lightime w rather than t [8]. We must use the lightime w in a general inertial frame as the evolution variable for a state $\Phi^{(S)}(w)$ in the Schroedinger representation [8,9]:

$$i\hbar\,\frac{\partial\Phi^{(S)}(w)}{\partial w} = H^{(S)}(w)\Phi^{(S)}(w), \qquad H^{(S)} = H_0^{(S)} + H_I^{(S)}, \tag{B.1}$$

because the evolution of a physical system is usually assumed to be described by a «Hamiltonian operator» $H = P_0$ for which the corresponding evolution parameter is the lightime w. Both $H = P_0$ and $\partial/\partial w$ transform as the fourth component of a 4-vector. Naturally then, the transition probability within extended relativity is defined directly in terms of the lightime w. The usual covariant formalism of perturbation theory can be applied to quantum field theory based on extended relativity [8]. For example, let us consider the decay rate $\Gamma(1 \to 2 + 3 + ... + N)$ for a physical process $1 \to 2 + 3 + ... + N$. In a general frame, it is given by [8,9]

$$\Gamma(1 \to 2 + 3 + ... + N) = \lim_{w\to\infty}\int \frac{c\,|\langle f|S|i\rangle|^2}{w}\,\frac{d^3x_2\,d^3p_2}{(2\pi\hbar)^3}\cdots\frac{d^3x_N\,d^3p_N}{(2\pi\hbar)^3}, \tag{B.2}$$

which has the dimension of inverse time. The decay lifetime $\tau$ is given by $\tau = 1/\Gamma(1 \to 2 + 3 + ... + N)$. To illustrate the calculation of a decay rate, let us consider a simple example, i.e. the muon decay $\mu^-(p_1) \to e^-(p_2) + \nu_\mu(p_3) + \bar\nu_e(p_4)$ with the usual V - A coupling. The decay S-matrix in momentum space is, as usual, given by [9]

$$\langle f|S|i\rangle \propto (p_{01}p_{02}p_{03}p_{04})^{-1/2}\,\delta^4(p_1 - p_2 - p_3 - p_4)\,M_{sc},$$
$$M_{sc} = (G/\sqrt{2})[\bar\nu_\mu(p_3)\gamma^\lambda(1 - \gamma_5)\mu(p_1)][\bar e(p_2)\gamma_\lambda(1 - \gamma_5)\nu_e(p_4)]. \tag{B.3}$$

The muon lifetime $\tau$ can now be calculated and the result is

$$1/\tau = \Gamma(1 \to 2 + 3 + ... + N) \propto \frac{1}{p_{01}}\int\frac{d^3p_2}{p_{02}}\frac{d^3p_3}{p_{03}}\frac{d^3p_4}{p_{04}}\,\delta^4(p_1 - p_2 - p_3 - p_4)\sum_{spin}|M_{sc}|^2. \tag{B.4}$$

Everything to the right of $1/p_{01}$ in (B.4) is invariant under the extended transformation so that the decay lifetime $\tau$ is indeed dilated:

$$\tau \propto (\mathbf{p}_1^2 + m_1^2)^{1/2} = p_{01}. \tag{B.5}$$

One should keep in mind, however, that (B.2) is really for the decay length, as shown in ref. [8]. We use the universal two-way speed of light c to convert it into lifetime. It is the decay length which is really measured in the laboratory.
REFERENCES
[1] YANG C. N., Phys. Today, 33 (1980) 42.
[2] GROSSMAN N., HELLER K., JAMES C. et al., Phys. Rev. Lett., 59 (1987) 18.
[3] SAKURAI J. J., Invariance Principle and Elementary Particles (Princeton University Press, Princeton) 1964, p. v.
[4] EDWARDS W. F., Am. J. Phys., 31 (1963) 482; REICHENBACH H., The Philosophy of Space and Time (Dover, New York, N.Y.) 1958. Edwards' transformation of space and time (2.8) was rederived and discussed by several authors. See, for example, WINNIE J. A., Philos. Sci., 37 (1970) 81, 223; YUAN-ZHONG ZHANG, Experimental Foundations of Special Relativity (Science Publishers, Peking) 1979, pp. 14-21; Gen. Relativ. Gravit., 27 (1995) 475.
[5] For a detailed discussion of the universal constant of the one-way speed of light and its implications, see HSU J. P., Nuovo Cimento B, 74 (1983) 67; 88 (1985) 140; 89 (1985) 30; Phys. Lett. A, 97 (1983) 137; HSU J. P. and WHAN C., Phys. Rev. A, 38 (1988) 2248, Appendix.
[6] PAULI W., Theory of Relativity (Pergamon, London) 1958, pp. 5-9 and references therein.
[7] HSU L. and HSU J. P., Nuovo Cimento B, 111 (1996) 1283 (it is the preceding article in this issue); Experimental tests of a new Lorentz-invariant dynamics based solely on the first postulate of relativity. - II, UMassD preprint (1995).
[8] HSU J. P. and HSU L., Phys. Lett. A, 196 (1994) 1.
[9] See, for example, BJORKEN J. D. and DRELL S. D., Relativistic Quantum Mechanics (McGraw-Hill, New York, N.Y.) 1964, pp. 261-268 and 285-286; SAKURAI J. J., Advanced Quantum Mechanics (Addison-Wesley, Reading, Mass.) 1967, pp. 171-172 and 181-188.
Four-dimensional symmetry of taiji relativity and coordinate transformations based on a weaker postulate for the speed of light. - II
JONG PING HSU (1) and LEONARDO HSU (2)
(1) Physics Department, University of Massachusetts Dartmouth - North Dartmouth, MA 02747, USA
(2) Physics Department, University of California at Berkeley - Berkeley, CA 94720, USA
(received 15 May 1996; approved 9 July 1996)
To Prof. Ta-You Wu for his wonderful and tireless teaching of physics and his ninetieth birthday
Summary. — Extended relativity is a theory of four-dimensional symmetry with Reichenbach's time, in which only the 2-way speed of light is a universal constant. It includes special relativity as a special case. The theory is shown to be consistent with experiments such as Fizeau's experiment, aberration of light and precision Doppler shifts. The formulations of classical and quantum electrodynamics are discussed. They are shown to be dependent on the four-dimensional symmetry rather than on the usual constant one-way speed of light. The four-dimensional symmetry also dictates a new coordinate transformation, called the "Wu transformation", for constant-linear-acceleration frames. PACS 03.30 - Special relativity. PACS 11.30.Cp - Lorentz and Poincare invariance.
We continue to demonstrate that the four-dimensional symmetry is necessary and essential [1] for discussing physical laws from Reichenbach's viewpoint of time [2] or Edwards' weaker postulate for the speed of light [3]. Edwards attempted in 1963 to formulate a relativity theory based on a weaker postulate that the 2-way speed of light in a vacuum is a universal constant. He derived space and time transformations which involve Reichenbach's time but which do not form a four-dimensional Lorentz group in general. As a result, they lead to an incorrect expression for the relativistic energy-momentum of a particle in the Lagrangian formalism of mechanics and electrodynamics, as shown in paper I [1]. Furthermore, it appears to be impossible to obtain invariant forms of the Maxwell equations and the Dirac equation if Reichenbach's time is used as an evolution variable, as we shall see later. The reason is that Reichenbach's time does not transform covariantly as the zeroth component of the coordinate 4-vector in general. Therefore, the lack of four-dimensional symmetry makes Edwards' original transformations untenable. Recently, we have formulated and discussed taiji relativity [4] based solely on the
first postulate of relativity, i.e. the invariance of physical laws. A four-dimensional transformation between any two inertial frames, $F(w, x, y, z)$ and $F'(w', x', y', z')$, is derived. Since taiji relativity does not make any assumption regarding the speed of light, the zeroth components $w$ and $w'$ cannot be factored into a well-defined speed of light and time. In fact, the speed of light and usual time (measured in seconds) are unspecified and undefined in the theory. However, the theory of taiji relativity possesses the four-dimensional symmetry which is shown to be the only essential ingredient for the theory to be consistent with previous experiments. This sheds light on the difficulty encountered by Edwards' transformations. In paper I, we show that, guided by the four-dimensional symmetry of taiji relativity, Reichenbach's general convention of time (or, equivalently, the universal 2-way speed of light) can be used as the "second postulate" for the construction of a new four-dimensional formalism of coordinate transformation which is termed extended relativity. The second postulate is necessary to factorize, say, $w'$ into a well-defined velocity function $b'$ (called "ligh") and Reichenbach's time $t'$ in the $F'$ frame, i.e. $w' = b't'$, which is called "lightime". (See eqs. (2.1)-(2.3) in sect. 2.) It turns out that the lightime $w'$, rather than Reichenbach's time, plays the role of evolution variable in physical laws and makes extended relativity consistent with the established energy-momentum of a particle, the Lorentz group, etc. Furthermore, the covariant lightime embedded in the four-dimensional symmetry is also crucial for the formulation of a covariant quantum electrodynamics (QED) based on extended relativity, as we shall see in sect. 6.

2. - Extended relativity: A theory with universal 2-way speed of light

Reichenbach's synchronization procedure amounts to imposing a second postulate [1] upon taiji relativity [4], so that we have a well-defined "extended" time, which includes Einstein's time as a special case. For simplicity and without loss of generality, we choose $q = 0$ in the synchronization of clocks in the $F$ frame, so that we have Einstein's time $t = t_E$; while in the $F'$ frame, clocks are synchronized to read Reichenbach's time $t'$. An event is, as usual, denoted by $(ct, x, y, z)$ in $F$. Following taiji relativity [4], the same event must be denoted by $(b't', x', y', z')$ in $F'$, which is moving along the $+x$ axis with a constant velocity $V$ ($\beta = V/c$), as measured in $F$. We stress that it is necessary to introduce the "ligh function" $b'$ so that $(b't', x', y', z') = (w', \mathbf{r}')$ transforms like a four-vector and laws of physics can display four-dimensional symmetry. It was shown by Edwards on the basis of the universal 2-way speed of light that times $t$ and $t'$ were related by [3]

$$t' = \gamma\,[(1 - \beta q')\,t - (\beta - q')\,x/c], \qquad q' = 2\varepsilon' - 1, \qquad \beta = V/c, \tag{2.1}$$
which was the basic property of Reichenbach's time. Assuming relation (2.1) is effectively the same as assuming the universality of the 2-way speed of light over a closed path in any inertial frame within the four-dimensional framework. The extended four-dimensional coordinate transformation was derived [1],

$$\begin{aligned} w' &= b't' = \gamma(ct - \beta x), \qquad x' = \gamma(x - \beta ct), \qquad y' = y, \qquad z' = z; \\ \beta &= V/c, \qquad \gamma = 1/(1 - \beta^2)^{1/2}, \\ b' &= (ct - \beta x)/[t(1 - \beta q') - (\beta - q')\,x/c] = c - q'x'/t', \end{aligned} \tag{2.2}$$
where $t'$ is given by (2.1) and $b'$ is determined by (2.1) and (2.2). Although $b'$ and $t'$ in (2.1) and (2.2) separately have complicated non-covariant transformation properties, their product $w' = b't'$ has a simple transformation property and is covariant. The extended transformation of the velocities $\mathbf{v}$ and accelerations $\mathbf{a}$ can be derived from (2.2): we obtain

$$\begin{aligned} c' &= d(b't')/dt' = (dt/dt')\,\gamma(c - \beta v_x), & v'_x &= (dt/dt')\,\gamma(v_x - \beta c), \\ v'_y &= (dt/dt')\,v_y, & v'_z &= (dt/dt')\,v_z, \end{aligned} \tag{2.3}$$

and

$$\begin{aligned} a'_c &= dc'/dt' = (d^2t/dt'^2)\,\gamma(c - \beta v_x) - (dt/dt')^2\,\gamma\beta a_x, \\ a'_x &= (d^2t/dt'^2)\,\gamma(v_x - \beta c) + (dt/dt')^2\,\gamma a_x, \\ a'_y &= (d^2t/dt'^2)\,v_y + (dt/dt')^2\,a_y, \\ a'_z &= (d^2t/dt'^2)\,v_z + (dt/dt')^2\,a_z; \end{aligned} \tag{2.4}$$

where $v'_x = dx'/dt'$, $v_x = dx/dt$, etc., and $dt/dt'$ and $d^2t/dt'^2$ may be obtained from (2.1),

$$\begin{aligned} 1/[dt/dt'] &= dt'/dt = \gamma\{[1 - \beta q'] - [\beta - q']\,v_x/c\}, \\ d^2t/dt'^2 &= (dt/dt')^2\,Z, \qquad Z = [(\beta - q')\,a_x/c]/[(1 - \beta q') - (\beta - q')\,v_x/c]. \end{aligned} \tag{2.5}$$
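As a numerical sanity check of (2.1) and (2.2), the following minimal sketch maps one event from $F$ to $F'$ and exposes the quantities needed to confirm that $b' = w'/t'$ agrees with the closed form $c - q'x'/t'$ and that $w'^2 - x'^2$ equals $(ct)^2 - x^2$. The function name `extended_transform`, the units with $c = 1$ and the sample values are illustrative assumptions, not part of the paper.

```python
import math

def extended_transform(t, x, beta, q_prime, c=1.0):
    """Map an event (t, x) in F to (w', t', x', b') in F' using (2.1) and (2.2).

    Hypothetical helper for illustration; c = 1 is an assumed unit choice.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # Reichenbach's time in F', eq. (2.1)
    t_p = gamma * ((1.0 - beta * q_prime) * t - (beta - q_prime) * x / c)
    # Lightime and position in F', eq. (2.2)
    w_p = gamma * (c * t - beta * x)
    x_p = gamma * (x - beta * c * t)
    # Ligh function defined by the factorization w' = b' t'
    b_p = w_p / t_p
    return w_p, t_p, x_p, b_p

# Sample event and frame parameters (arbitrary illustrative numbers).
t, x, beta, q_prime = 2.0, 0.7, 0.6, 0.3
w_p, t_p, x_p, b_p = extended_transform(t, x, beta, q_prime)

closed_form_b = 1.0 - q_prime * x_p / t_p   # c - q' x'/t' with c = 1
interval_F = t**2 - x**2                    # (ct)^2 - x^2 in F
interval_Fp = w_p**2 - x_p**2               # w'^2 - x'^2 in F'
```

Setting `q_prime = 0.0` gives `b_p == c`, which is the special-relativity limit mentioned in the text.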
We stress that the definition $c' = d(b't')/dt'$ is quite natural because when $ds^2 = 0$, i.e. $[d(b't')]^2 - d\mathbf{r}'^2 = 0$, the law of light propagation is $c'^2dt'^2 - d\mathbf{r}'^2 = 0$. Note that we still have the four-dimensional law for the propagation of light in the $F'$ frame, although the speed of light is not isotropic and Reichenbach's time is not covariant. Also, the transformations of the velocity ratios $\mathbf{v}'/c'$ turn out to be precisely the same as those in special relativity,

$$v'_x/c' = (v_x/c - \beta)/(1 - v_x\beta/c); \qquad v'_y/c' = (v_y/c)/[\gamma(1 - \beta v_x/c)], \quad \text{etc.} \tag{2.6}$$
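The statement that the ratios $\mathbf{v}'/c'$ do not depend on $q'$ can be checked numerically. The sketch below evaluates (2.3) with $dt/dt'$ taken from (2.5) and compares the ratios with the special-relativity expressions in (2.6); the function names and test values are assumptions made for illustration.

```python
import math

def extended_ratios(vx, vy, beta, q_prime, c=1.0):
    """Velocity ratios v'_x/c' and v'_y/c' built from (2.3) and (2.5)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    # dt'/dt from (2.5); its inverse multiplies every component in (2.3)
    dtp_dt = gamma * ((1.0 - beta * q_prime) - (beta - q_prime) * vx / c)
    dt_dtp = 1.0 / dtp_dt
    c_p = dt_dtp * gamma * (c - beta * vx)    # one-way speed of light c'
    vx_p = dt_dtp * gamma * (vx - beta * c)
    vy_p = dt_dtp * vy
    return vx_p / c_p, vy_p / c_p

def sr_ratios(vx, vy, beta, c=1.0):
    """Right-hand sides of (2.6), identical to special relativity."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    rx = (vx / c - beta) / (1.0 - vx * beta / c)
    ry = (vy / c) / (gamma * (1.0 - beta * vx / c))
    return rx, ry

# The common factor dt/dt' cancels in the ratios, so q' drops out.
ratios_by_q = {q: extended_ratios(0.4, 0.2, 0.6, q) for q in (0.0, 0.3, -0.5)}
reference = sr_ratios(0.4, 0.2, 0.6)
```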
This is crucial for the extended coordinate transformations (2.2) to form the Lorentz group. For the ratios of accelerations, dividing the components of (2.4) and using (2.5), we have

$$a'_x/a'_c = [Z(v_x - c\beta) + a_x]/[Z(c - \beta v_x) - \beta a_x], \qquad a'_y/a'_c = [Zv_y + a_y]/\{\gamma[Z(c - \beta v_x) - \beta a_x]\}, \quad \text{etc.,} \tag{2.7}$$

which are more complicated than those in special relativity because $q' \neq 0$ in $Z$ given by (2.5). The momentum $p^\mu = (p_0, p_x, p_y, p_z)$ and wave vector $k^\mu = (k_0, k_x, k_y, k_z)$ transform like a coordinate four-vector (2.2). For example, we have

$$k'_0 = \gamma(k_0 - \beta k_x), \qquad k'_x = \gamma(k_x - \beta k_0), \qquad k'_y = k_y, \qquad k'_z = k_z. \tag{2.8}$$
For a light wave, $\mathbf{k} = (k\cos\theta, k\sin\theta, 0)$ and $\mathbf{k}' = (k'\cos\theta', k'\sin\theta', 0)$, eq. (2.8) leads to the formula for the aberration of light,

$$\cos\theta' = (\cos\theta - \beta)/(1 - \beta\cos\theta), \tag{2.9}$$

$$\sin\theta' = \sin\theta/[\gamma(1 - \beta\cos\theta)]; \tag{2.10}$$

where we have used $k = |\mathbf{k}| = k_0$ for a light wave. These results are the same as those in special relativity because they both have the four-dimensional symmetry.

3. - Some experimental implications of extended relativity

In the Fizeau experiment, the observed drag coefficient can be explained by the addition law (2.6) for velocity ratios with $v_x/c = 1/n$:

$$v'_x/c' = (1/n - \beta)/(1 - \beta/n), \qquad \beta = V/c. \tag{3.1}$$
The speed of light relative to the medium (at rest in the $F$ frame) is $v_x = c/n$ in $F$, where $n$ is the refractive index of the medium. Now, if the medium is moving with speed $V = \beta c$ parallel to the direction of light, the ratio $v'_x/c'$ observed by a person at rest in $F'$ is given by (3.1) as

$$v'_x/c' = 1/n_L = (1/n - \beta)/(1 - \beta/n) \approx 1/n - \beta(1 - 1/n^2), \tag{3.2}$$
where $n_L$ is the "effective refractive index" of the moving medium with velocity $+V$. Assuming each tube in the Fizeau experiment has length $L'$ and the speed of water is $\pm V = \pm\beta c$, the optical path difference $\Delta L'$ of the two beams of light is

$$\Delta L' = 2L'n'_- - 2L'n'_+ \approx 4L'n^2(V/c)(1 - 1/n^2), \tag{3.3}$$

$$1/n'_\pm = 1/n \pm (V/c)(1 - 1/n^2), \tag{3.4}$$

as expected, since the optical path is just the distance in vacuum equivalent to the actual path length traveled by each beam [7]. The formula for the aberration of light can be obtained from the inverse transformations of (2.9) and (2.10):

$$\tan\theta = \sin\theta'/[\gamma(\cos\theta' + \beta)]. \tag{3.5}$$
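The two results of this section lend themselves to a quick numerical check. The sketch below compares the exact ratio (3.1) with the first-order drag formula in (3.2), and verifies that (3.5) inverts the aberration map (2.9)-(2.10). All function names, the refractive index of water (1.333) and the chosen speeds are illustrative assumptions.

```python
import math

def fizeau_exact(n, beta):
    """Exact velocity ratio v'_x/c' from (3.1)."""
    return (1.0 / n - beta) / (1.0 - beta / n)

def fizeau_first_order(n, beta):
    """First-order expansion in beta, right-hand side of (3.2)."""
    return 1.0 / n - beta * (1.0 - 1.0 / n**2)

def aberration(theta, beta):
    """Angle theta' in F' from (2.9) and (2.10)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    cos_p = (math.cos(theta) - beta) / (1.0 - beta * math.cos(theta))
    sin_p = math.sin(theta) / (gamma * (1.0 - beta * math.cos(theta)))
    return math.atan2(sin_p, cos_p)

def inverse_aberration(theta_p, beta):
    """Angle theta in F from (3.5): tan(theta) = sin(theta')/[gamma (cos(theta') + beta)]."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return math.atan2(math.sin(theta_p), gamma * (math.cos(theta_p) + beta))

# Drag formula: the two expressions differ only at order beta^2.
drag_error = abs(fizeau_exact(1.333, 1e-4) - fizeau_first_order(1.333, 1e-4))

# Aberration: applying (3.5) after (2.9)-(2.10) should return the original angle.
theta_back = inverse_aberration(aberration(1.0, 0.3), 0.3)
```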
This shows the deviation of light when transforming to a new reference frame. Note that the invariant law for the propagation of light

$$c'^2dt'^2 - d\mathbf{r}'^2 = 0 \tag{3.6}$$

does not refer to any specific source and, hence, it holds for light emitted from any source. From a microscopic viewpoint, the state of motion of a macroscopic source of light is actually irrelevant because photons are emitted from atoms in violent motion which may not even be uniform.

4. - Doppler shifts of frequency and atomic energy levels

The extended transformation of wave four-vectors $k^\mu = (\omega/c, \mathbf{k})$ and $k'^\mu = (\omega'/c', \mathbf{k}')$ is given by (2.8):

$$\omega'/c' = \gamma(\omega/c - \beta k_x), \qquad k'_x = \gamma(k_x - \beta\omega/c), \qquad k'_y = k_y, \qquad k'_z = k_z. \tag{4.1}$$

This implies that the Doppler wavelength shift in extended relativity is

$$1/\lambda' = \gamma(1/\lambda - \beta/\lambda) = (1/\lambda)\,[(1 - \beta)/(1 + \beta)]^{1/2}, \tag{4.2}$$
where $k_x = 2\pi/\lambda$, $k_y = k_z = 0$, $k'_x = 2\pi/\lambda'$. This is the same as that in special relativity. However, the consistency of the Doppler frequency shift in (4.1) with the result of laser experiments is more subtle because $c'$ is not a constant in extended relativity. Since experiments which measure the frequency shift involve the absorption of photons by atoms, we must first re-examine the nature of atomic levels from the point of view of extended relativity. In extended relativity, Dirac's Hamiltonian $H_D$ for a hydrogen atom is given by

$$c\,i\hbar\,\partial\psi/\partial w = H_D\psi, \qquad H_D = -\boldsymbol{\alpha}\cdot\mathbf{P}c - \beta mc^2 - e^2/(4\pi r), \qquad \mathbf{P} = -i\hbar\nabla, \tag{4.3}$$

which follows from the Dirac equation (6.6) below. Using the usual method, it can be shown that (4.3) leads to atomic energy levels

$$E_n = mc^2/\{1 + \alpha_e^2/[n - h_0 + (h_0^2 - \alpha_e^2)^{1/2}]^2\}^{1/2}, \qquad h_0 = j + 1/2, \qquad \alpha_e = e^2/(4\pi c\hbar). \tag{4.4}$$
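To make (4.4) concrete, the following sketch evaluates the level formula for hydrogen and extracts the ground-state binding energy and the $n = 2$ fine-structure ordering. The numerical inputs (fine-structure constant and electron rest energy) are standard values supplied here as assumptions; they do not come from the paper, and the function name `dirac_level` is hypothetical.

```python
import math

ALPHA_E = 1.0 / 137.035999   # assumed value of the fine-structure constant
MC2_EV = 510998.95           # assumed electron rest energy in eV

def dirac_level(n, j, alpha=ALPHA_E, mc2=MC2_EV):
    """Energy E_n of eq. (4.4), including the rest energy, in eV."""
    h0 = j + 0.5
    denom = n - h0 + math.sqrt(h0**2 - alpha**2)
    return mc2 / math.sqrt(1.0 + alpha**2 / denom**2)

# Ground state (n = 1, j = 1/2): binding energy is mc^2 - E_1.
binding_1s = MC2_EV - dirac_level(1, 0.5)

# n = 2: the j = 3/2 level lies slightly above the j = 1/2 level.
fine_structure_split = dirac_level(2, 1.5) - dirac_level(2, 0.5)
```

Since (4.4) contains only $mc^2$, $\alpha_e$ and quantum numbers, the level differences take the same values in $F$ and $F'$, which is what (4.5)-(4.7) rely on.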
Thus when an electron jumps from a state $n_1$ to another state $n_2$, it will emit or absorb an energy quantum $c\hbar k_0$:

$$E_{n_2} - E_{n_1} = c\hbar k_0 = \hbar\omega, \qquad \text{in } F, \tag{4.5}$$

$$E'_{n_2} - E'_{n_1} = c\hbar k'_0 = c\hbar\omega'/c', \qquad \text{in } F', \tag{4.6}$$

where $c'$ and $c$ in (4.6) do not cancel in $F'$, except for the special case $q' = 0$. If two photons with "energy" $(\omega_0/c)c\hbar$ and $(\omega'_0/c')c\hbar$ are emitted from two hydrogen atoms at rest in $F$ and $F'$, respectively, then by the equivalence of the $F$ and $F'$ frames,

$$(\omega_0/c)c\hbar = (\omega'_0/c')c\hbar. \tag{4.7}$$

Here, the ratio $\omega'_0/c'$ is isotropic. However, if $(\omega/c)c\hbar$ and $(\omega'/c')c\hbar$ are energies of the same photon measured from $F$ and $F'$, respectively, then they are related by (4.1):

$$\omega'/c' = \gamma(\omega/c)(1 - \beta\cos\theta), \qquad k_x = k\cos\theta = (\omega/c)\cos\theta. \tag{4.8}$$
Evidently, only when $c' = c$ do we have the usual relation for the Doppler frequency shift. In general, we can only talk about an "energy shift" or a "$k_0$ shift" in extended relativity because the energy of a quantum particle is not always proportional to the corresponding frequency in all frames. In other words, the energy [1] of a particle or $k_0$ is the zeroth component of a 4-vector, but the frequency (defined on the basis of Reichenbach's time) is in general not covariant. Experimentally, one never measures the frequency directly unless a constant one-way speed of light is presupposed (so that the frequency becomes well-defined in all frames); instead, what one really measures is the shift of atomic levels in interaction with radiation or a laser, as measured in a general frame. Thus, (4.4)-(4.8) are consistent with precision measurements of Doppler shifts. Since Reichenbach's time is used in $F'$, the speed $c'$ of a photon and its frequency $\omega'$ in $F'$ are anisotropic in general; however, the ratio $\omega'/c'$ is isotropic. Four-dimensional symmetry dictates the mixture of two waves in $F'$ in terms of $(w', \mathbf{r}')$ and $(\omega'/c', \mathbf{k}')$ rather than $(t', \mathbf{r}')$ and $(\omega', \mathbf{k}')$, so their superposition is in general given by

$$A_0\sin(k'_{01}w' - \mathbf{k}'_1\cdot\mathbf{r}') + B_0\sin(k'_{02}w' - \mathbf{k}'_2\cdot\mathbf{r}'), \qquad w' = b't'. \tag{4.9}$$

Only in the $F$ frame, where $c$ is, by definition, isotropic and constant, does one have the usual expression $A_0\sin(\omega_1t - \mathbf{k}_1\cdot\mathbf{r}) + B_0\sin(\omega_2t - \mathbf{k}_2\cdot\mathbf{r})$.

5. - Classical electrodynamics in extended relativity

Although the one-way speed of light in extended relativity is not a universal constant, one still has the universal 2-way speed of light $c$. Thus, the usual action for a free particle, $-\int mc\,ds$, is an invariant in extended relativity. If a charged particle with mass $m$ and charge $e$ is moving in an electromagnetic field in a general frame, the invariant action is assumed to be [1]

$$S = -\int mc\,ds - (e/c)\int A_\mu\,dx^\mu - (1/4c)\int F_{\mu\nu}F^{\mu\nu}\,d^3r\,dw, \tag{5.1}$$

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{5.2}$$

$$ds = (g_{\mu\nu}\,dx^\mu dx^\nu)^{1/2} = dt\,(C^2 - v^2)^{1/2}, \qquad C = dw/dt, \qquad \mathbf{v} = d\mathbf{r}/dt, \tag{5.3}$$

$$e/c = -1.6021891\times10^{-20}\,(4\pi)^{1/2}\ (\mathrm{g\cdot cm})^{1/2}, \tag{5.4}$$

where $g_{\mu\nu} = (1, -1, -1, -1)$, $x^\mu = (w, x, y, z)$, $e/c$ is in Heaviside-Lorentz units and $A_\mu$ is the same as the usual electromagnetic vector potential in special relativity. Note that $C = dw/dt$ is in general not a constant in extended relativity. For a charged particle moving in an electromagnetic potential field $A_\mu$, the invariant action is given by $S_{cp}$,

$$S_{cp} = \int L_{cp}\,dt, \qquad L_{cp} = -mcC\,(1 - v^2/C^2)^{1/2} - (e/c)\,A_\mu\,dx^\mu/dt, \tag{5.5}$$

which consists of the first two terms in (5.1), with $dx^\mu/dt = (C, \mathbf{v})$ and $A_\mu = (A_0, -\mathbf{A})$. The canonical momentum of a charged particle is now given by

$$\mathbf{P} = \partial L_{cp}/\partial\mathbf{v} = \mathbf{p} + e\mathbf{A}/c, \qquad \mathbf{p} = (mc\mathbf{v}/C)/(1 - v^2/C^2)^{1/2}. \tag{5.6}$$

Note that we have used the universal 2-way speed of light $c$ to give $\mathbf{p}$ the usual dimension of mass times velocity. In contrast, there is no universal speed of light in taiji relativity, so that the covariant momentum must have the dimension of mass [4]. Following taiji relativity, we define the covariant Hamiltonian $H$ as

$$H = [(\partial L_{cp}/\partial\mathbf{v})\cdot\mathbf{v} - L_{cp}]\,(c/C) = cp_0 + eA_0, \qquad p_0 = mc/(1 - v^2/C^2)^{1/2}. \tag{5.7}$$
Note that the factor $c/C$ in definition (5.7) is necessary for the Hamiltonian $H/c$ to transform as the zeroth component of the momentum 4-vector, $H/c = P_0$. Otherwise, the Hamiltonian will not be meaningful. From (5.6), (5.7) and $H/c = P_0$ we have

$$(P_0 - eA_0/c)^2 - (\mathbf{P} - e\mathbf{A}/c)^2 = m^2c^2. \tag{5.8}$$
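For a free particle ($A_\mu = 0$), (5.8) reduces to $p_0^2 - \mathbf{p}^2 = m^2c^2$, and this should hold whatever value $C = dw/dt$ takes in a given frame. The sketch below checks that numerically from the expressions for $\mathbf{p}$ and $p_0$ in (5.6) and (5.7); the helper name and sample values are assumptions for illustration.

```python
import math

def free_four_momentum(m, v, C, c=1.0):
    """Return (p0, p) for a free particle, from (5.6) and (5.7) with A = 0.

    v is the particle speed dr/dt and C = dw/dt; both are frame dependent.
    """
    root = math.sqrt(1.0 - (v / C) ** 2)
    p0 = m * c / root
    p = m * c * (v / C) / root
    return p0, p

# Mass-shell residual p0^2 - p^2 - m^2 c^2 for several values of C.
residuals = []
for C in (1.0, 0.8, 1.3):
    p0, p = free_four_momentum(m=2.0, v=0.5 * C, C=C)
    residuals.append(p0**2 - p**2 - 2.0**2)
```

Only the ratio $v/C$ enters, which is why the mass-shell relation is insensitive to the anisotropy of $C$.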
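The Lagrange equation of motion for a charged particle in the electromagnetic field has the usual form

$$dp^\mu/ds = (e/c)\,F^{\mu\nu}\,dx_\nu/ds, \tag{5.9}$$

where $p^\mu = (p^0, \mathbf{p})$ and $x_\mu = (w, -\mathbf{r})$. Making the substitutions $\mathbf{P} \to -i\hbar\nabla$ and $P_0 \to i\hbar\,\partial/\partial w$ in (5.8), we obtain the extended relativistic Klein-Gordon equation

$$[(i\hbar\,\partial/\partial w - eA_0/c)^2 - (-i\hbar\nabla - e\mathbf{A}/c)^2 - m^2c^2]\,\phi = 0. \tag{5.10}$$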
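For a continuous charge distribution in space, the second term in (5.1) should be replaced by $-\int A_\mu J^\mu\,d^3r\,dw$. In this case, the variation of (5.1) leads to the invariant Maxwell equations in a general frame

$$\partial^\lambda F^{\mu\nu} + \partial^\mu F^{\nu\lambda} + \partial^\nu F^{\lambda\mu} = 0, \qquad \partial_\mu = \partial/\partial x^\mu, \qquad x^\lambda = (w, \mathbf{r}). \tag{5.11}$$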
Thus, Maxwell's equations have the invariant form, even if the one-way speed of light is not universal. This is consistent with the result in taiji relativity.

6. - Quantum electrodynamics based on extended relativity

We demonstrate that an extended quantum electrodynamics (QED) can be formulated because extended relativity has the four-dimensional symmetry with the lightime $w'$ as the evolution parameter in the $F'$ frame. This would be impossible if one attempted to use Reichenbach's time $t'$ in Edwards' transformation [3] as the evolution parameter because $t'$ does not transform as the zeroth component of a 4-vector. In extended QED, the invariant action $S_q$ involving $\psi$ and $A_\mu$ is assumed to be

$$S_q = \int L\,d^4x, \qquad L = \bar\psi\,[\gamma^\mu(i\hbar\,\partial_\mu + eA_\mu/c) - mc]\,\psi - (1/4c)\,F_{\mu\nu}F^{\mu\nu}, \tag{6.1}$$

where $e < 0$ and $d^4x = dw\,d^3r$. For quantization of the fermion fields, for example, the "canonical momentum" $\pi_b$ conjugate to $\psi_b$ is defined as usual to be

$$\pi_b = \partial L_\psi/\partial(\partial_0\psi_b), \qquad L_\psi = \bar\psi\,[\gamma^\mu i\hbar\,\partial_\mu - mc]\,\psi, \tag{6.2}$$

and the Hamiltonian density for a free electron is $\mathcal{H} = \pi\,\partial_0\psi - L_\psi$. For free-photon ($A_\mu$) and electron ($\psi$) fields, we have

$$A_\mu(w, \mathbf{r}) = (1/V^{1/2})\sum_{\mathbf{p},\alpha}(\hbar/2p_0)^{1/2}\,[a(\mathbf{p},\alpha)\,\varepsilon_\mu(\alpha)\exp(-ip\cdot x/\hbar) + a^\dagger(\mathbf{p},\alpha)\,\varepsilon_\mu(\alpha)\exp(ip\cdot x/\hbar)], \tag{6.3}$$

$$\psi(w, \mathbf{r}) = (1/V^{1/2})\sum_{\mathbf{p},s}(m/p_0)^{1/2}\,[b(\mathbf{p},s)\,u(\mathbf{p},s)\exp(-ip\cdot x/\hbar) + d^\dagger(\mathbf{p},s)\,v(\mathbf{p},s)\exp(ip\cdot x/\hbar)], \qquad p\cdot x = p_\mu x^\mu, \tag{6.4}$$

where

$$[a(\mathbf{p},\alpha), a^\dagger(\mathbf{p}',\alpha')] = \delta_{\mathbf{p}\mathbf{p}'}\delta_{\alpha\alpha'}, \qquad \{b(\mathbf{p},s), b^\dagger(\mathbf{p}',s')\} = \delta_{\mathbf{p}\mathbf{p}'}\delta_{ss'}, \qquad \{d(\mathbf{p},s), d^\dagger(\mathbf{p}',s')\} = \delta_{\mathbf{p}\mathbf{p}'}\delta_{ss'}; \tag{6.5}$$
and all other commutators vanish. The Dirac equation in extended relativity can be derived from (6.1). We obtain

$$i\hbar\,\partial\psi/\partial w = [\boldsymbol{\alpha}\cdot(i\hbar\nabla + e\mathbf{A}/c) - \beta mc + eA_0/c]\,\psi. \tag{6.6}$$

In view of the equations of motion (5.10) and (6.6), we must use the lightime $w$ in a general frame as the evolution variable for a state $\Phi^{(S)}(w)$ in the Schroedinger representation:

$$i\hbar\,\frac{\partial\Phi^{(S)}(w)}{\partial w} = H^{(S)}(w)\,\Phi^{(S)}(w), \qquad H^{(S)} = H_0^{(S)} + H_I^{(S)}, \tag{6.7}$$
because the evolution of a physical system is assumed to be described by a Hamiltonian operator which has the same transformation property as that of $w$. The usual covariant formalism of perturbation theory [5] can also be applied to quantum field theory based on extended relativity. To illustrate this point, let us briefly consider the interaction representation and the S-matrix based on extended relativity. The transformations of the state vector $\Phi(w)$ and operator $O$ from the Schroedinger representation to the interaction representation are

$$\Phi(w) \equiv \Phi^{(I)}(w) = \exp[iH_0^{(S)}w/\hbar]\,\Phi^{(S)}(w), \tag{6.8}$$

$$O(w) \equiv O^{(I)}(w) = \exp[iH_0^{(S)}w/\hbar]\,O^{(S)}\exp[-iH_0^{(S)}w/\hbar]. \tag{6.9}$$

Since $O^{(S)}$ and $O(w)$ are the same for $w = 0$, we have

$$i\hbar\,\frac{\partial\Phi(w)}{\partial w} = H_I(w)\,\Phi(w), \qquad H_I = \exp[iH_0^{(S)}w/\hbar]\,H_I^{(S)}\exp[-iH_0^{(S)}w/\hbar], \tag{6.10}$$

$$O(w) = \exp[iH_0^{(S)}w/\hbar]\,O(0)\exp[-iH_0^{(S)}w/\hbar]. \tag{6.11}$$
The U-matrix can be defined in terms of the lightime $w$: $\Phi(w) = U(w, w_0)\,\Phi(w_0)$, $U(w_0, w_0) = 1$. It follows from (6.10) and (6.11) that

$$i\hbar\,\frac{\partial U(w, w_0)}{\partial w} = H_I(w)\,U(w, w_0). \tag{6.12}$$

If a physical system is in the initial state $\Phi_i$ at lightime $w_0$, the probability of finding it in the final state $\Phi_f$ at a later lightime $w$ is

$$|\langle\Phi_f|U(w, w_0)\,\Phi_i\rangle|^2 = |U_{fi}(w, w_0)|^2. \tag{6.13}$$

Evidently, the average transition probability per unit lightime for $\Phi_i \to \Phi_f$ is

$$|U_{fi}(w, w_0)|^2/(w - w_0). \tag{6.14}$$
As usual, we can express the S-matrix in terms of the U-matrix, i.e. $S = U(\infty, -\infty)$, and obtain the following form:

$$S = 1 - (i/\hbar)\int_{-\infty}^{\infty}H_I(w)\,dw + (-i/\hbar)^2\int_{-\infty}^{\infty}H_I(w)\,dw\int_{-\infty}^{w}H_I(w')\,dw' + \dots. \tag{6.15}$$

For $w$-dependent operators, one can introduce a $w$-product $W$ (corresponding to the usual chronological product), so that one can write (6.15) in an exponential form:

$$S = W\exp\left[-(i/\hbar)\int\mathcal{H}_I(x^\mu)\,dw\,d^3r\right], \tag{6.16}$$

$$\int\mathcal{H}_I(x^\mu)\,d^3r = H_I(w). \tag{6.17}$$
For simplicity, one may set $\hbar = c = 1$, where $c$ is the 2-way speed of light. (These are the "natural units" in extended relativity.) For the QED Lagrangian (6.1), we can derive Feynman rules (based on extended relativity). Let us summarize the Feynman rules for QED with the Lagrangian $L$ in (6.1) with a gauge-fixing term $(\partial_\mu A^\mu)^2/(2\alpha')$:

$$L_{QED} = L - (\partial_\mu A^\mu)^2/(2\alpha'), \qquad \hbar = c = 1, \qquad e < 0. \tag{6.18}$$

The covariant photon and electron propagators are

$$-i\,[g_{\mu\nu} - (1 - \alpha')\,k_\mu k_\nu/k^2]/(k^2 + i\epsilon), \tag{6.19}$$

$$i/(\gamma_\mu p^\mu - m + i\epsilon). \tag{6.20}$$

The vertex factor is

$$ie\gamma^\mu. \tag{6.21}$$
There is a factor $\varepsilon_\mu$ for each external photon line and a factor $u(\mathbf{p}, s)$ for each absorbed electron, a factor $-1$ for each closed fermion loop, etc. These rules are identical to those in the conventional theory, if the natural unit is used. Thus, if one calculates scattering cross-sections and decay rates (with respect to the lightime $w$) of a physical process in a general frame, one will get the same result as that in special relativity [5]. For example, let us consider the decay rate $\Gamma(1 \to 2 + 3 + \dots + N)$ for a physical process $1 \to 2 + 3 + \dots + N$. It is given by

$$\Gamma(1 \to 2 + 3 + \dots + N) = \lim_{w\to\infty}\int\frac{|\langle f|S|i\rangle|^2}{w}\;\frac{d^3x_2\,d^3p_2}{(2\pi\hbar)^3}\cdots\frac{d^3x_N\,d^3p_N}{(2\pi\hbar)^3}, \tag{6.22}$$

which has the dimension of inverse length. The decay length $D$ is given by $D = 1/\Gamma(1 \to 2 + 3 + \dots + N)$. Thus in extended relativity, we have the dilatation of the decay length for a particle decay in flight. Since we have the universal 2-way speed of light $c$, we can use it to define the lifetime $\tau$ of a particle by

$$\tau = D/c. \tag{6.23}$$
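As a rough illustration of (6.22)-(6.23), the sketch below scales a rest-frame decay length $D_0 = c\tau_0$ by the factor $p_0/(mc)$, following the proportionality of the decay length (and hence $\tau = D/c$) to $p_0$. The muon rest energy and mean life are standard values supplied here as assumptions, the function name is hypothetical, and $D$ is a length in the sense of the lightime $w$, as in the text.

```python
import math

C_LIGHT = 299_792_458.0        # 2-way speed of light in m/s
MUON_MC2_MEV = 105.6583745     # assumed muon rest energy in MeV
MUON_TAU0_S = 2.1969811e-6     # assumed muon mean life at rest in s

def decay_length_and_lifetime(pc_mev, mc2_mev=MUON_MC2_MEV,
                              tau0_s=MUON_TAU0_S, c=C_LIGHT):
    """Return (D, tau): dilated decay length in m and lifetime tau = D/c in s."""
    p0c = math.sqrt(pc_mev**2 + mc2_mev**2)   # c * p0 in MeV
    dilation = p0c / mc2_mev                  # p0 / (m c)
    decay_length = c * tau0_s * dilation      # D scales with p0
    return decay_length, decay_length / c     # eq. (6.23)

D_rest, tau_rest = decay_length_and_lifetime(0.0)
D_fast, tau_fast = decay_length_and_lifetime(1000.0)   # momentum of 1 GeV/c
```

Note that the dilation factor involves only $p_0$ and $m$, not Reichenbach's time or the parameter $q'$.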
We stress that such a lifetime has little to do with Reichenbach's time in general. The result (6.23) is also consistent with experiments because the decay length dilatation is what one actually measures in the high-energy laboratory. This is not surprising because we use the same four-dimensional symmetric Lagrangian as that in special relativity.

7. - Limiting four-dimensional symmetry and accelerated Wu transformation

One may wonder whether the power of four-dimensional symmetry is strong enough to say something about constant-linear-acceleration (CLA) frames. The attempt to generalize the coordinate transformation for inertial frames to that for CLA frames through a symmetry consideration is very natural because the transformation for a CLA frame must reduce to that for an inertial frame in the limit of zero acceleration. So far, no satisfactory transformation for such non-inertial frames has been obtained in the literature, even though one has general relativity and the correspondence principle [6]. All those accelerated transformations discussed previously are not based on a symmetry principle and do not naturally reduce to the four-dimensional transformation for an inertial frame when the acceleration approaches zero. By a stroke of luck, we have found a transformation for CLA frames which does reduce to the correct four-dimensional transformation in the limit of zero acceleration and reduces to the Galilean accelerated transformation when the velocity is small. For simplicity of notation and calculations in an accelerated frame, let us denote a CLA frame by $F(w, x, y, z)$ and an inertial frame by $F_I(w_I, x_I, y_I, z_I)$. Suppose a CLA frame $F(w, x, y, z)$ is moving with a constant acceleration $\alpha$, so that its velocity is $\beta = \alpha w + \beta_0$, along the $+x$ axis. We find that the accelerated transformation between $F_I$ and $F$ is given by

$$\begin{aligned} w_I &= \gamma\beta\,(x + 1/\alpha\gamma_0^2) - \beta_0/\alpha\gamma_0, \qquad x_I = \gamma\,(x + 1/\alpha\gamma_0^2) - 1/\alpha\gamma_0, \qquad y_I = y, \qquad z_I = z; \\ \beta &= \alpha w + \beta_0, \qquad \gamma = 1/(1 - \beta^2)^{1/2}, \qquad \gamma_0 = 1/(1 - \beta_0^2)^{1/2}, \end{aligned} \tag{7.1}$$
which will be called the "Wu transformation". If one wishes, one may define $w_I = ct_I$, where $t_I$ is the usual Einstein time, in (7.1) for easy comparison with special relativity. (But this definition is not necessary for deriving experimental results.) The inverse Wu transformation of (7.1) is

$$\begin{aligned} w &= \frac{w_I + \beta_0/\alpha\gamma_0}{\alpha\,(x_I + 1/\alpha\gamma_0)} - \frac{\beta_0}{\alpha}, \\ x &= [(x_I + 1/\alpha\gamma_0)^2 - (w_I + \beta_0/\alpha\gamma_0)^2]^{1/2} - 1/\alpha\gamma_0^2, \qquad y = y_I, \qquad z = z_I. \end{aligned} \tag{7.2}$$
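The pair (7.1)-(7.2) can be checked numerically. The sketch below implements both maps for the $(w, x)$ coordinates, performs a round trip, and probes the small-acceleration limit, where the result should approach $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$. The function names and sample values are illustrative assumptions; the symbol `alpha` stands for the constant acceleration of the CLA frame.

```python
import math

def wu_forward(w, x, alpha, beta0):
    """Eq. (7.1): CLA-frame coordinates (w, x) -> inertial-frame (w_I, x_I)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0**2)
    beta = alpha * w + beta0
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    shifted_x = x + 1.0 / (alpha * gamma0**2)
    w_I = gamma * beta * shifted_x - beta0 / (alpha * gamma0)
    x_I = gamma * shifted_x - 1.0 / (alpha * gamma0)
    return w_I, x_I

def wu_inverse(w_I, x_I, alpha, beta0):
    """Eq. (7.2): inertial-frame (w_I, x_I) -> CLA-frame coordinates (w, x)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0**2)
    shifted_w = w_I + beta0 / (alpha * gamma0)
    shifted_x = x_I + 1.0 / (alpha * gamma0)
    w = shifted_w / (alpha * shifted_x) - beta0 / alpha
    x = math.sqrt(shifted_x**2 - shifted_w**2) - 1.0 / (alpha * gamma0**2)
    return w, x

# Round trip at moderate acceleration.
w_I, x_I = wu_forward(0.5, 1.2, alpha=0.1, beta0=0.3)
w_back, x_back = wu_inverse(w_I, x_I, alpha=0.1, beta0=0.3)

# Small-acceleration limit, compared with the inertial transformation.
gamma0 = 1.0 / math.sqrt(1.0 - 0.3**2)
w_lim, x_lim = wu_forward(0.5, 1.2, alpha=1e-6, beta0=0.3)
```

The limit cannot be taken by setting `alpha = 0.0` directly, since (7.1) contains $1/\alpha$; a small nonzero value is used instead.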
One can verify that (7.1) and (7.2) reduce to four-dimensional transformations of the form (2.2) in the limit of zero acceleration a. (See appendix.) We may remark that the
coordinate transformation between two CLA frames can be derived on the basis of (7.1) or (7.2). From the viewpoint of limiting four-dimensional symmetry, the CLA transformation must be expressed in terms of the Cartesian coordinates rather than other coordinates, just like the Lorentz transformation. Furthermore, the coordinates of CLA frames should play the same role and have a similar physical meaning as those of inertial frames. This appears to be different from the usual viewpoint that coordinates for accelerated frames have no physical meaning. The Wu transformation (7.1), based on the four-dimensional symmetry, differs from that obtained by Moller [6] based on the approximate principle of equivalence in general relativity because they give different spatial measurements by meter sticks or the Bohr radius of hydrogen atoms. We believe that such a difference should be tested by, say, measuring a Doppler shift of wavelength emitted from a source with a constant linear acceleration. We may remark that the constant acceleration a in (7.1) can be shown to be related to constant change of "energy" (or "moving mass") per unit length measured in an inertial frame. This differs from the usual definition of acceleration in (2.4). It is interesting to note that such a constant acceleration a dictated by the limiting four-dimensional symmetry is precisely what has been actually realized in linear accelerators in laboratories. Physical implications of the Wu transformation and their experimental tests will be discussed in a separate paper. 8. - Remarks and discussions In the formulation of QED in sect. 6, the electron is, as usual, assumed to be a point particle. However, if the physical electron is really a fuzzy point (in the sense of fuzzy set theory with a bell-shape membership function having a width L0) rather than a geometric point, then there will be a departure from the four-dimensional symmetry at short distances or large momentum [7]. 
A fuzzy-point model of a particle has been interpreted as follows: a particle by itself is a structureless point particle, but it can simultaneously exist at different places with different probabilities. As a result, the position uncertainty of such a quantum particle has a minimum width $\Delta x \sim L_0$. The Coulomb potential will be modified when r
dimensional physical laws. There are three universal and fundamental constants in QED based on extended relativity. Special relativity is a special case ($q = q' = 0$ in (2.2)).
c) Common relativity: it is based on two postulates. The additional second postulate is a common time $t' = t$ for all observers [8]. The speed of light is, roughly speaking, relative. Lightimes $w$ and $w'$ are evolution variables in four-dimensional laws. There are only two universal and fundamental constants in QED based on common relativity, precisely the same as those in taiji relativity. Common relativity has the unique advantage for dealing with many-particle systems where canonical evolution of the system is essential and for obtaining covariant thermodynamics and an invariant Planck's law of black-body radiation [9].
One may ask: how can one realize the evolution variable $w'$ in the extended coordinate transformation (2.2) by physical means? Since the invariant phase of an electromagnetic wave in the $F'$ frame is given by $k'_0w' - \mathbf{k}'\cdot\mathbf{r}'$, where $k'_0 = |\mathbf{k}'|$, we can define the lightime $w'$ in terms of $k'_0$, just as the length can be defined by the wavelength $\lambda'$ or $|\mathbf{k}'|$. We note that the "clocks", which show lightime in this theory, are the same as those in taiji relativity [4] because they have exactly the same four-dimensional transformation property. However, the taiji-time $w'$ in $F'$ cannot be factored into two well-defined functions $b'$ and $t'$ because of the absence of a second postulate, while the lightime $w'$ in extended relativity and common relativity can be factored into two well-defined functions $b'$ and $t'$, as shown in (2.1), (2.2) and ref. [8]. Our discussions show that it is extremely important to be aware of what quantities are actually measured in the experiments and what effects the assumption of a universal speed of light may have had on the interpretation of the results.
For example, we have seen in paper I that the lifetime dilatation of unstable particle decay in flight has little to do with the property of Reichenbach's time with a general parameter $q$ or $q'$, because the lifetime $\tau$ is basically defined as the decay length divided by the universal 2-way speed of light $c$. The basic reason is that the four-dimensional symmetry dictates that the decay rates in, say, QED based on extended relativity can only be defined in terms of the covariant lightime $w$ or $w'$, which has the dimension of length. The constant 2-way speed of light in extended relativity is in general not the maximum speed of physical objects in the universe. Rather, it is the one-way speed of light in a given direction that is the maximum speed of any object in that direction, as shown in (2.3). This holds for any inertial frame. It is worthwhile to note that this property of light, being the "maximum speed" of all physical objects in any given direction, is a logical consequence of the first postulate of relativity, as shown in taiji relativity [4]. We have examined a number of experimental tests of special relativity and the formulations of classical electrodynamics and QED. All of them are consistent with extended relativity. These discussions can be generalized to other field theories such as unified electroweak theory and quantum chromodynamics. As we have seen, only the four-dimensional symmetry of physical laws is absolutely essential for explanations of experimental results and for the formulation of classical electrodynamics and QED; the universality of the one-way speed of light is irrelevant. In this connection, we stress that according to the theory of taiji relativity [4], the universality of the one-way or the two-way speed of light is a convention rather than an inherent part of the physical world [10].
Note added in proofs

Suppose one writes $dw_I = \gamma(T\,dw + U\beta\,dx)$, $dx_I = \gamma(V\,dx + W\beta\,dw)$, $dy_I = dy$, $dz_I = dz$; $\gamma = 1/(1 - \beta^2)^{1/2}$, where $T$, $U$, $V$ and $W$ are four unknown functions of $x$ and $w$. The new Wu transformation (7.1) for a constant-linear-acceleration (CLA) frame can be derived from the postulate of the limiting four-dimensional symmetry of taiji relativity and the initial condition that a CLA transformation reduces to the spatial identity $\mathbf{r}_I = \mathbf{r}$ when the taiji-time $w = 0$ and the initial "velocity" $\beta_0 = 0$. This initial condition holds also for the Lorentz transformation. Thus, once the principle (or the first postulate) of relativity is rigorously stated to include the limiting cases, the concept of acceleration is determined in the physical theory based on the "extended four-dimensional framework". Within the present conceptual framework, the taiji-time $w$ in the Wu transformation (7.1) or (A.2) is a primary concept and has the dimension of length. The motion of physical objects, including light signals, is a derived concept and is described by dimensionless "taiji-velocities" $d\mathbf{r}/dw$. The taiji-time $w$ can be realized by computerized "Leonardo clocks" [4]: we could program any Leonardo clock in a CLA frame $F$ to obtain a reading $w_I$ from the nearest clock in an inertial frame $F_I$ and, based on its $F_I$ frame position $x_I$ and given parameters $\alpha$ and $\beta_0$, compute the taiji-time $w$ it should display, $w = (w_I + \beta_0/\alpha\gamma_0)/[\alpha(x_I + 1/\alpha\gamma_0)] - \beta_0/\alpha$. (See (7.2).) In the limit of zero acceleration, $w$ shown on a Leonardo clock will automatically reduce to the taiji-time in the four-dimensional transformation, $w = \gamma_0(w_I - \beta_0x_I)$. It will not reduce to relativistic time, unless the second postulate of a universal constant for the speed of light ($w = ct$, $w_I = ct_I$) is made in this limit [4].
This paper is dedicated to Prof. Ta-You Wu for his wonderful and tireless teaching of physics and his ninetieth birthday. The work was supported in part by The Jing Shin Research Fund of the UMass Dartmouth and by a grant from the Potz Science Fund.
APPENDIX
Limiting four-dimensional symmetry and constant-linear-acceleration frames

For simplicity, let us denote a CLA frame by $F(w, x, y, z)$ and an inertial frame by $F_I(w_I, x_I, y_I, z_I)$. Suppose a CLA frame $F(w, x, y, z)$ is moving with a constant acceleration $\alpha$, so that its velocity is

$$\beta = \alpha w + \beta_0, \tag{A.1}$$

along the +x axis. Guided by the limiting four-dimensional symmetry, we find that the linearly accelerated transformation between $F_I$ and F should be

$$w_I = \gamma\beta\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{\beta_0}{\alpha\gamma_0}, \quad x_I = \gamma\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{1}{\alpha\gamma_0}, \quad y_I = y, \quad z_I = z; \tag{A.2}$$

$$\beta = \alpha w + \beta_0, \quad \gamma = 1/(1-\beta^2)^{1/2}, \quad \gamma_0 = 1/(1-\beta_0^2)^{1/2}.$$
This is a generalization of the accelerated transformation obtained by Wu and Lee [7] based on a kinematic approach to satisfy limiting four-dimensional symmetry. It will be
Chap. 4. Extended Relativity ...
called the "Wu transformation". When $\beta_0$ approaches zero, the accelerated transformation (A.2) reduces to the well-known transformation obtained in ref. [7], provided one uses a new time $\tau$: $\beta = \alpha w + \beta_0 = \tanh(\alpha\tau)$. Furthermore, one can verify that the Wu transformation (A.2) indeed reduces to the four-dimensional transformation in the limit of zero acceleration $\alpha \to 0$: $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$, $y_I = y$, $z_I = z$; where $\gamma_0 = 1/(1-\beta_0^2)^{1/2}$. With the definitions $w_I = ct_I$ and $w = ct$, the "extended relativistic time" t in the CLA frame F is completely determined by (A.2). In other words, if the time $t_I$ and the position $x_I$ of an event as observed in the inertial frame $F_I$ are known, then the corresponding time t in the CLA frame can be calculated, provided the constants c, $\alpha$ and $\beta_0$ are given. Such a time t in a CLA frame can be physically realized by computerized "Leonardo clocks" [4]: evidently, the time t in the CLA frame is not the relativistic time in general and, hence, the constant c by itself is not physically meaningful. Only when the acceleration $\alpha$ vanishes does the time t in (A.2) for the frame F reduce to the relativistic time, and c become physically meaningful. We may remark that the definitions $w_I = ct_I$ and $w = ct$ are not necessary for deriving observable results because we may directly use w as the evolution variable. When $\beta_0 \to 0$, the inverse of the Wu transformation (A.2) leads to

$$w = \frac{ct_I}{1 + \alpha x_I} \approx ct_I(1 - \alpha x_I), \quad ct_I = w_I,$$

$$x = \frac{1}{\alpha}\left[1 + 2\alpha x_I + \alpha^2(x_I^2 - c^2 t_I^2)\right]^{1/2} - \frac{1}{\alpha} \approx x_I - \frac{c^2\alpha t_I^2}{2}.$$
Thus, $c^2\alpha$ is related to a constant acceleration g in Newtonian mechanics by the relation

$$\alpha = g/c^2 \tag{A.4}$$

when velocities are small. In this sense, the Wu transformation (A.2) is a four-dimensional generalization of the Galilean transformation for accelerated frames in classical mechanics. From (A.1) we obtain

$$ds^2 = c^2\,dt_I^2 - d\mathbf{r}_I^2 = g_{00}\,dw^2 - d\mathbf{r}^2, \quad g_{00} = \gamma^4(\gamma_0^{-2} + \alpha x)^2, \tag{A.5}$$
where x in a CLA frame F is restricted to the region $x > x_s = -1/(\alpha\gamma_0^2)$, which may be pictured as a "wall singularity" at $x_s$. We may remark that the finite Wu transformation (A.2) implies that the space-time of the CLA frame $F(w, x, y, z)$ is flat, i.e. the Riemann curvature tensor vanishes, $R^{i}{}_{jkm} = 0$, which can also be directly calculated by using the metric tensors in (A.5). The velocity of a fixed point $x_I$ in $F_I$ as measured by F-observers using the evolution variable [4] w is $dx/dw$ with $x_I$ fixed. From (A.2), we find

$$(dx/dw)_{x_I} = -\beta\,(g_{00})^{1/2}; \quad x > -1/(\alpha\gamma_0^2), \quad 1 > \beta^2. \tag{A.6}$$

We see that only in the approximation $g_{00} \approx 1$ do we have $(dx/dw)_{x_I} \approx -\beta$ and

$$(d^2x/dw^2)_{x_I} \approx -\alpha = \text{constant}. \tag{A.7}$$
We note that the Wu transformation (A.2) holds for general $w_I$ and w. In the limit of zero acceleration, it reduces to the four-dimensional taiji transformation [4]. If one wishes, one may define

$$w_I = ct_I \ \text{ and } \ w = bt, \quad b = \frac{ct_I - \beta x_I}{t_I(1 - \beta q') - (\beta - q')\,x_I/c}, \tag{A.8}$$
where $t_I$ and t are, respectively, Einstein's time and "extended" Reichenbach's time (and b is the corresponding "light function"), then the limit of zero acceleration of (A.2) is the extended transformation (2.2) (where the inertial frame F' corresponds to the CLA frame F of (A.2) in the limit of zero acceleration). One can formulate, say, classical electrodynamics in a CLA frame. According to taiji relativity, physical results in the CLA frame F should be independent of the definition in (A.8).
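The limiting behaviour claimed for the Wu transformation can be checked numerically. The following is only a minimal sketch, assuming (A.2) is read as $w_I = \gamma\beta(x + 1/\alpha\gamma_0^2) - \beta_0/\alpha\gamma_0$, $x_I = \gamma(x + 1/\alpha\gamma_0^2) - 1/\alpha\gamma_0$; the function names `wu_transform`, `boost` and `leonardo_reading` are hypothetical labels introduced here for illustration and do not come from the paper.

```python
import math


def wu_transform(w, x, alpha, beta0):
    """CLA-frame (w, x) -> inertial-frame (w_I, x_I), assuming the form (A.2)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    beta = alpha * w + beta0                      # (A.1)
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    shifted = x + 1.0 / (alpha * gamma0 ** 2)
    w_i = gamma * beta * shifted - beta0 / (alpha * gamma0)
    x_i = gamma * shifted - 1.0 / (alpha * gamma0)
    return w_i, x_i


def boost(w, x, beta0):
    """Four-dimensional transformation expected in the limit alpha -> 0."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    return gamma0 * (w + beta0 * x), gamma0 * (x + beta0 * w)


def leonardo_reading(w_i, x_i, alpha, beta0):
    """Taiji-time w that a clock in F should display, given (w_I, x_I)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    ratio = (w_i + beta0 / (alpha * gamma0)) / (x_i + 1.0 / (alpha * gamma0))
    return ratio / alpha - beta0 / alpha


if __name__ == "__main__":
    w, x, beta0 = 2.0, 3.0, 0.3
    # Small acceleration: (A.2) should approach the constant-velocity transformation.
    print(wu_transform(w, x, alpha=1e-6, beta0=beta0), boost(w, x, beta0))
    # Finite acceleration: the clock formula should recover the original w.
    w_i, x_i = wu_transform(w, x, alpha=0.01, beta0=beta0)
    print(leonardo_reading(w_i, x_i, alpha=0.01, beta0=beta0))
```

The sample values of w, x, $\alpha$ and $\beta_0$ are arbitrary; any choice with $|\alpha w + \beta_0| < 1$ and $x > -1/(\alpha\gamma_0^2)$ would serve equally well.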
REFERENCES
[1] Hsu L., Hsu J. P. and Schneble D., Nuovo Cimento B, 111 (1996) 1299. This is referred to as paper I in the text.
[2] Reichenbach H., The Philosophy of Space and Time (Dover, New York) 1958.
[3] Edwards W. F., Am. J. Phys., 31 (1963) 482.
[4] Hsu J. P. and Hsu L., Phys. Lett. A, 196 (1994) 1; 217 (1996) 359; Hsu L. and Hsu J. P., Nuovo Cimento B, 111 (1996) 1283.
[5] See, for example, Bjorken J. D. and Drell S. D., Relativistic Quantum Mechanics (McGraw-Hill, New York) 1964, pp. 261-268 and pp. 285-286; Sakurai J. J., Advanced Quantum Mechanics (Addison-Wesley, Reading, Mass.) 1967, pp. 171-172 and pp. 181-188; Weinberg S., The Quantum Theory of Fields (Cambridge University Press, New York) 1995, pp. 134-147.
[6] Møller C., Danske Vid. Sel. Mat.-Fyz., xx, No. 19 (1943); Fock V., The Theory of Space Time and Gravitation (Pergamon, New York) 1958, pp. 206-211; Wu T. Y. and Lee Y. C., Int. J. Theor. Phys., 5 (1972) 307; Ta-You Wu, Theoretical Physics, Vol. 4, Theory of Relativity (Lian Jing Publishing Co., Taipei) 1978, pp. 172-175.
[7] Hsu J. P., Nuovo Cimento B, 80 (1984) 183; 88 (1985) 140; Hsu J. P. and Pei S. Y., Phys. Rev. A, 37 (1988) 1406.
[8] For a detailed discussion of common time in the four-dimensional framework and its implications, see Hsu J. P., Nuovo Cimento B, 74 (1983) 67; 88 (1985) 140; 89 (1985) 30; Phys. Lett. A, 97 (1983) 137; Hsu J. P. and Whan C., Phys. Rev. A, 38 (1988) 2248, appendix.
[9] Hsu J. P., Nuovo Cimento B, 93 (1986) 178.
[10] In other words, all physical results in taiji relativity or extended relativity can be derived by simply using the quantities (w, x, y, z) and (w', x', y', z') without ever mentioning time t or t' (measured in seconds) and speeds of light or other physical objects.
Chapter 5
The Splendid Union of Special Relativity and Quantum Mechanics
P. A. M. Dirac (1927-1928), S. Tomonaga (1946), J. Schwinger (1948), R. Feynman (1949), F. Dyson (1949).
The Quantum Theory of the Emission and Absorption of Radiation
By P. A. M. DIRAC, St. John's College, Cambridge, and Institute for Theoretical Physics, Copenhagen.
(Communicated by N. Bohr, For. Mem. R.S. Received February 2, 1927.)

§ 1. Introduction and Summary.

The new quantum theory, based on the assumption that the dynamical variables do not obey the commutative law of multiplication, has by now been developed sufficiently to form a fairly complete theory of dynamics. One can treat mathematically the problem of any dynamical system composed of a number of particles with instantaneous forces acting between them, provided it is describable by a Hamiltonian function, and one can interpret the mathematics physically by a quite definite general method. On the other hand, hardly anything has been done up to the present on quantum electrodynamics. The questions of the correct treatment of a system in which the forces are propagated with the velocity of light instead of instantaneously, of the production of an electromagnetic field by a moving electron, and of the reaction of this field on the electron have not yet been touched. In addition, there is a serious difficulty in making the theory satisfy all the requirements of the restricted
Chap. 5. The Union of Special Relativity and Quantum ...
principle of relativity, since a Hamiltonian function can no longer be used. This relativity question is, of course, connected with the previous ones, and it will be impossible to answer any one question completely without at the same time answering them all. However, it appears to be possible to build up a fairly satisfactory theory of the emission of radiation and of the reaction of the radiation field on the emitting system on the basis of a kinematics and dynamics which are not strictly relativistic. This is the main object of the present paper. The theory is non-relativistic only on account of the time being counted throughout as a c-number, instead of being treated symmetrically with the space co-ordinates. The relativity variation of mass with velocity is taken into account without difficulty.

The underlying ideas of the theory are very simple. Consider an atom interacting with a field of radiation, which we may suppose for definiteness to be confined in an enclosure so as to have only a discrete set of degrees of freedom. Resolving the radiation into its Fourier components, we can consider the energy and phase of each of the components to be dynamical variables describing the radiation field. Thus if $E_r$ is the energy of a component labelled r and $\theta_r$ is the corresponding phase (defined as the time since the wave was in a standard phase), we can suppose each $E_r$ and $\theta_r$ to form a pair of canonically conjugate variables. In the absence of any interaction between the field and the atom, the whole system of field plus atom will be describable by the Hamiltonian

$$H = \sum_r E_r + H_0 \tag{1}$$

equal to the total energy, $H_0$ being the Hamiltonian for the atom alone, since the variables $E_r$, $\theta_r$ obviously satisfy their canonical equations of motion

$$\dot E_r = -\frac{\partial H}{\partial\theta_r} = 0, \quad \dot\theta_r = \frac{\partial H}{\partial E_r} = 1.$$
When there is interaction between the field and the atom, it could be taken into account on the classical theory by the addition of an interaction term to the Hamiltonian (1), which would be a function of the variables of the atom and of the variables $E_r$, $\theta_r$ that describe the field. This interaction term would give the effect of the radiation on the atom, and also the reaction of the atom on the radiation field. In order that an analogous method may be used on the quantum theory, it is necessary to assume that the variables $E_r$, $\theta_r$ are q-numbers satisfying the standard quantum conditions $\theta_r E_r - E_r\theta_r = ih$, etc., where h is $(2\pi)^{-1}$ times the usual Planck's constant, like the other dynamical variables of the problem. This assumption immediately gives light-quantum properties to
the radiation.* For if $\nu_r$ is the frequency of the component r, $2\pi\nu_r\theta_r$ is an angle variable, so that its canonical conjugate $E_r/2\pi\nu_r$ can only assume a discrete set of values differing by multiples of h, which means that $E_r$ can change only by integral multiples of the quantum $(2\pi h)\nu_r$. If we now add an interaction term (taken over from the classical theory) to the Hamiltonian (1), the problem can be solved according to the rules of quantum mechanics, and we would expect to obtain the correct results for the action of the radiation and the atom on one another. It will be shown that we actually get the correct laws for the emission and absorption of radiation, and the correct values for Einstein's A's and B's. In the author's previous theory,† where the energies and phases of the components of radiation were c-numbers, only the B's could be obtained, and the reaction of the atom on the radiation could not be taken into account. It will also be shown that the Hamiltonian which describes the interaction of the atom and the electromagnetic waves can be made identical with the Hamiltonian for the problem of the interaction of the atom with an assembly of particles moving with the velocity of light and satisfying the Einstein-Bose statistics, by a suitable choice of the interaction energy for the particles. The number of particles having any specified direction of motion and energy, which can be used as a dynamical variable in the Hamiltonian for the particles, is equal to the number of quanta of energy in the corresponding wave in the Hamiltonian for the waves. There is thus a complete harmony between the wave and light-quantum descriptions of the interaction. We shall actually build up the theory from the light-quantum point of view, and show that the Hamiltonian transforms naturally into a form which resembles that for the waves.
The mathematical development of the theory has been made possible by the author's general transformation theory of the quantum matrices.‡ Owing to the fact that we count the time as a c-number, we are allowed to use the notion of the value of any dynamical variable at any instant of time. This value is

* Similar assumptions have been used by Born and Jordan ['Z. f. Physik,' vol. 34, p. 886 (1925)] for the purpose of taking over the classical formula for the emission of radiation by a dipole into the quantum theory, and by Born, Heisenberg and Jordan ['Z. f. Physik,' vol. 35, p. 606 (1925)] for calculating the energy fluctuations in a field of black-body radiation.
† 'Roy. Soc. Proc.,' A, vol. 112, p. 661, § 5 (1926). This is quoted later by loc. cit., I.
‡ 'Roy. Soc. Proc.,' A, vol. 113, p. 621 (1927). This is quoted later by loc. cit., II. An essentially equivalent theory has been obtained independently by Jordan ['Z. f. Physik,' vol. 40, p. 809 (1927)]. See also F. London, 'Z. f. Physik,' vol. 40, p. 193 (1926).
a q-number, capable of being represented by a generalised "matrix" according to many different matrix schemes, some of which may have continuous ranges of rows and columns, and may require the matrix elements to involve certain kinds of infinities (of the type given by the $\delta$ functions*). A matrix scheme can be found in which any desired set of constants of integration of the dynamical system that commute are represented by diagonal matrices, or in which a set of variables that commute are represented by matrices that are diagonal at a specified time.† The values of the diagonal elements of a diagonal matrix representing any q-number are the characteristic values of that q-number. A Cartesian co-ordinate or momentum will in general have all characteristic values from $-\infty$ to $+\infty$, while an action variable has only a discrete set of characteristic values. (We shall make it a rule to use unprimed letters to denote the dynamical variables or q-numbers, and the same letters primed or multiply primed to denote their characteristic values. Transformation functions or eigenfunctions are functions of the characteristic values and not of the q-numbers themselves, so they should always be written in terms of primed variables.)

If $f(\xi, \eta)$ is any function of the canonical variables $\xi_k$, $\eta_k$, the matrix representing f at any time t in the matrix scheme in which the $\xi_k$ at time t are diagonal matrices may be written down without any trouble, since the matrices representing the $\xi_k$ and $\eta_k$ themselves at time t are known, namely,

$$\xi_k(\xi'\xi'') = \xi_k'\,\delta(\xi'\xi''),$$
$$\eta_k(\xi'\xi'') = -ih\,\delta(\xi_1'-\xi_1'')\ldots\delta(\xi_{k-1}'-\xi_{k-1}'')\,\delta'(\xi_k'-\xi_k'')\,\delta(\xi_{k+1}'-\xi_{k+1}'')\ldots \tag{2}$$

Thus if the Hamiltonian H is given as a function of the $\xi_k$ and $\eta_k$, we can at once write down the matrix $H(\xi'\xi'')$. We can then obtain the transformation function, $(\xi'/\alpha')$ say, which transforms to a matrix scheme ($\alpha$) in which the Hamiltonian is a diagonal matrix, as $(\xi'/\alpha')$ must satisfy the integral equation

$$\int H(\xi'\xi'')\,d\xi''\,(\xi''/\alpha') = W(\alpha')\cdot(\xi'/\alpha'), \tag{3}$$
of which the characteristic values $W(\alpha')$ are the energy levels. This equation is just Schrödinger's wave equation for the eigenfunctions $(\xi'/\alpha')$, which becomes an ordinary differential equation when H is a simple algebraic function of the

* Loc. cit., II, § 2.
† One can have a matrix scheme in which a set of variables that commute are at all times represented by diagonal matrices if one will sacrifice the condition that the matrices must satisfy the equations of motion. The transformation function from such a scheme to one in which the equations of motion are satisfied will involve the time explicitly. See p. 628 in loc. cit., II.
$\xi_k$ and $\eta_k$, on account of the special equations (2) for the matrices representing $\xi_k$ and $\eta_k$. Equation (3) may be written in the more general form

$$\int H(\xi'\xi'')\,d\xi''\,(\xi''/\alpha') = ih\,\partial(\xi'/\alpha')/\partial t, \tag{3'}$$
in which it can be applied to systems for which the Hamiltonian involves the time explicitly.

One may have a dynamical system specified by a Hamiltonian H which cannot be expressed as an algebraic function of any set of canonical variables, but which can all the same be represented by a matrix $H(\xi'\xi'')$. Such a problem can still be solved by the present method, since one can still use equation (3) to obtain the energy levels and eigenfunctions. We shall find that the Hamiltonian which describes the interaction of a light-quantum and an atomic system is of this more general type, so that the interaction can be treated mathematically, although one cannot talk about an interaction potential energy in the usual sense.

It should be observed that there is a difference between a light-wave and the de Broglie or Schrödinger wave associated with the light-quanta. Firstly, the light-wave is always real, while the de Broglie wave associated with a light-quantum moving in a definite direction must be taken to involve an imaginary exponential. A more important difference is that their intensities are to be interpreted in different ways. The number of light-quanta per unit volume associated with a monochromatic light-wave equals the energy per unit volume of the wave divided by the energy $(2\pi h)\nu$ of a single light-quantum. On the other hand a monochromatic de Broglie wave of amplitude a (multiplied into the imaginary exponential factor) must be interpreted as representing $a^2$ light-quanta per unit volume for all frequencies. This is a special case of the general rule for interpreting the matrix analysis,* according to which, if $(\xi'/\alpha')$ or $\psi_{\alpha'}(\xi_k')$ is the eigenfunction in the variables $\xi_k$ of the state $\alpha'$ of an atomic system (or simple particle), $|\psi_{\alpha'}(\xi_k')|^2$ is the probability of each $\xi_k$ having the value $\xi_k'$, [or $|\psi_{\alpha'}(\xi_k')|^2\,d\xi_1'\,d\xi_2'\ldots$ is the probability of each $\xi_k$ lying between the values $\xi_k'$ and $\xi_k' + d\xi_k'$, when the $\xi_k$ have continuous ranges of characteristic values] on the assumption that all phases of the system are equally probable. The wave whose intensity is to be interpreted in the first of these two ways appears in the theory only when one is dealing with an assembly of the associated particles satisfying the Einstein-Bose statistics. There is thus no such wave associated with electrons.

* Loc. cit., II, §§ 6, 7.
§ 2. The Perturbation of an Assembly of Independent Systems.

We shall now consider the transitions produced in an atomic system by an arbitrary perturbation. The method we shall adopt will be that previously given by the author,† which leads in a simple way to equations which determine the probability of the system being in any stationary state of the unperturbed system at any time.‡ This, of course, gives immediately the probable number of systems in that state at that time for an assembly of the systems that are independent of one another and are all perturbed in the same way. The object of the present section is to show that the equations for the rates of change of these probable numbers can be put in the Hamiltonian form in a simple manner, which will enable further developments in the theory to be made.

Let $H_0$ be the Hamiltonian for the unperturbed system and V the perturbing energy, which can be an arbitrary function of the dynamical variables and may or may not involve the time explicitly, so that the Hamiltonian for the perturbed system is $H = H_0 + V$. The eigenfunctions for the perturbed system must satisfy the wave equation

$$ih\,\partial\psi/\partial t = (H_0 + V)\,\psi,$$

where $(H_0 + V)$ is an operator. If $\psi = \sum_r a_r\psi_r$ is the solution of this equation that satisfies the proper initial conditions, where the $\psi_r$'s are the eigenfunctions for the unperturbed system, each associated with one stationary state labelled by the suffix r, and the $a_r$'s are functions of the time only, then $|a_r|^2$ is the probability of the system being in the state r at any time. The $a_r$'s must be normalised initially, and will then always remain normalised. The theory will apply directly to an assembly of N similar independent systems if we multiply each of these $a_r$'s by $N^{1/2}$ so as to make $\sum_r |a_r|^2 = N$. We shall now have that $|a_r|^2$ is the probable number of systems in the state r.

The equation that determines the rate of change of the $a_r$'s is§

$$ih\,\dot a_r = \sum_s V_{rs}\,a_s, \tag{4}$$

where the $V_{rs}$'s are the elements of the matrix representing V. The conjugate imaginary equation is

$$-ih\,\dot a_r^* = \sum_s V_{rs}^*\,a_s^* = \sum_s a_s^*\,V_{sr}. \tag{4'}$$

† Loc. cit., I.
‡ The theory has recently been extended by Born ['Z. f. Physik,' vol. 40, p. 167 (1926)] so as to take into account the adiabatic changes in the stationary states that may be produced by the perturbation as well as the transitions. This extension is not used in the present paper.
§ Loc. cit., I, equation (25).
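The statement that the $a_r$'s, once normalised, always remain normalised can be illustrated numerically. The sketch below is an assumption-laden toy case, not part of the paper: it reads equation (4) as $ih\,\dot a_r = \sum_s V_{rs} a_s$, sets h = 1, takes a two-state system with a constant Hermitian V chosen arbitrarily, and integrates with a classical fourth-order Runge-Kutta step. The names `derivative`, `rk4_step`, `evolve` and `total_probability` are hypothetical.

```python
import math

H_BAR = 1.0  # Dirac's h (Planck's constant over 2*pi); natural units assumed


def derivative(a, V):
    """Right-hand side of equation (4): da_r/dt = (1/ih) * sum_s V_rs a_s."""
    n = len(a)
    return [sum(V[r][s] * a[s] for s in range(n)) / (1j * H_BAR) for r in range(n)]


def rk4_step(a, V, dt):
    """One classical Runge-Kutta step for the coupled amplitudes."""
    def shifted(base, slope, factor):
        return [x + factor * k for x, k in zip(base, slope)]

    k1 = derivative(a, V)
    k2 = derivative(shifted(a, k1, dt / 2), V)
    k3 = derivative(shifted(a, k2, dt / 2), V)
    k4 = derivative(shifted(a, k3, dt), V)
    return [x + dt / 6 * (p + 2 * q + 2 * r + s)
            for x, p, q, r, s in zip(a, k1, k2, k3, k4)]


def evolve(a0, V, t_total, steps):
    """Integrate equation (4) from t = 0 to t = t_total."""
    a = list(a0)
    dt = t_total / steps
    for _ in range(steps):
        a = rk4_step(a, V, dt)
    return a


def total_probability(a):
    """Sum over r of |a_r|^2, which should stay equal to its initial value."""
    return sum(abs(x) ** 2 for x in a)


# Illustrative two-state perturbation (an assumption): V_12 = V_21 = 0.5.
V = [[0.0, 0.5], [0.5, 0.0]]
a_final = evolve([1.0 + 0j, 0.0 + 0j], V, t_total=1.0, steps=1000)

if __name__ == "__main__":
    print("total probability:", total_probability(a_final))
    print("|a_2|^2 numeric:", abs(a_final[1]) ** 2,
          "analytic sin^2(0.5):", math.sin(0.5) ** 2)
```

For this particular V the amplitudes can be written in closed form, $a_1 = \cos(t/2)$ and $a_2 = -i\sin(t/2)$, which gives an independent check on the integration.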
If we regard $a_r$ and $ih\,a_r^*$ as canonical conjugates, equations (4) and (4') take the Hamiltonian form with the Hamiltonian function $F_1 = \sum_{rs} a_r^* V_{rs} a_s$, namely,

$$\frac{da_r}{dt} = \frac{1}{ih}\frac{\partial F_1}{\partial a_r^*}, \quad ih\,\frac{da_r^*}{dt} = -\frac{\partial F_1}{\partial a_r}.$$

We can transform to the canonical variables $N_r$, $\phi_r$ by the contact transformation

$$a_r = N_r^{1/2} e^{-i\phi_r/h}, \quad a_r^* = N_r^{1/2} e^{i\phi_r/h}.$$

This transformation makes the new variables $N_r$ and $\phi_r$ real, $N_r$ being equal to $a_r a_r^* = |a_r|^2$, the probable number of systems in the state r, and $\phi_r/h$ being the phase of the eigenfunction that represents them. The Hamiltonian $F_1$ now becomes

$$F_1 = \sum_{rs} V_{rs}\,N_r^{1/2} N_s^{1/2}\,e^{i(\phi_r - \phi_s)/h},$$

and the equations that determine the rate at which transitions occur have the canonical form

$$\dot N_r = -\frac{\partial F_1}{\partial\phi_r}, \quad \dot\phi_r = \frac{\partial F_1}{\partial N_r}.$$
A more convenient way of putting the transition equations in the Hamiltonian form may be obtained with the help of the quantities

$$b_r = a_r e^{-iW_r t/h}, \quad b_r^* = a_r^* e^{iW_r t/h},$$

$W_r$ being the energy of the state r. We have $|b_r|^2$ equal to $|a_r|^2$, the probable number of systems in the state r. For $b_r$ we find

$$ih\,\dot b_r = W_r b_r + ih\,\dot a_r e^{-iW_r t/h} = W_r b_r + \sum_s V_{rs}\,b_s\,e^{i(W_s - W_r)t/h}$$

with the help of (4). If we put $V_{rs} = v_{rs}\,e^{i(W_r - W_s)t/h}$, so that $v_{rs}$ is a constant when V does not involve the time explicitly, this reduces to

$$ih\,\dot b_r = W_r b_r + \sum_s v_{rs} b_s = \sum_s H_{rs} b_s, \tag{5}$$

where $H_{rs} = W_r\delta_{rs} + v_{rs}$, which is a matrix element of the total Hamiltonian $H = H_0 + V$ with the time factor $e^{i(W_r - W_s)t/h}$ removed, so that $H_{rs}$ is a constant when H does not involve the time explicitly. Equation (5) is of the same form as equation (4), and may be put in the Hamiltonian form in the same way.

It should be noticed that equation (5) is obtained directly if one writes down the Schrödinger equation in a set of variables that specify the stationary states of the unperturbed system. If these variables are $\xi_h$, and if $H(\xi'\xi'')$ denotes
a matrix element of the total Hamiltonian H in the ($\xi$) scheme, this Schrödinger equation would be

$$ih\,\partial\psi(\xi')/\partial t = \sum_{\xi''} H(\xi'\xi'')\,\psi(\xi''), \tag{6}$$

like equation (3'). This differs from the previous equation (5) only in the notation, a single suffix r being there used to denote a stationary state instead of a set of numerical values $\xi_k'$ for the variables $\xi_k$, and $b_r$ being used instead of $\psi(\xi')$. Equation (6), and therefore also equation (5), can still be used when the Hamiltonian is of the more general type which cannot be expressed as an algebraic function of a set of canonical variables, but can still be represented by a matrix $H(\xi'\xi'')$ or $H_{rs}$.

We now take $b_r$ and $ih\,b_r^*$ to be canonically conjugate variables instead of $a_r$ and $ih\,a_r^*$. The equation (5) and its conjugate imaginary equation will now take the Hamiltonian form with the Hamiltonian function

$$F = \sum_{rs} b_r^* H_{rs} b_s. \tag{7}$$

Proceeding as before, we make the contact transformation

$$b_r = N_r^{1/2} e^{-i\theta_r/h}, \quad b_r^* = N_r^{1/2} e^{i\theta_r/h}, \tag{8}$$

to the new canonical variables $N_r$, $\theta_r$, where $N_r$ is, as before, the probable number of systems in the state r, and $\theta_r$ is a new phase. The Hamiltonian F will now become

$$F = \sum_{rs} H_{rs}\,N_r^{1/2} N_s^{1/2}\,e^{i(\theta_r - \theta_s)/h},$$

and the equations for the rates of change of $N_r$ and $\theta_r$ will take the canonical form

$$\dot N_r = -\frac{\partial F}{\partial\theta_r}, \quad \dot\theta_r = \frac{\partial F}{\partial N_r}.$$

The Hamiltonian may be written

$$F = \sum_r W_r N_r + \sum_{rs} v_{rs}\,N_r^{1/2} N_s^{1/2}\,e^{i(\theta_r - \theta_s)/h}. \tag{9}$$

The first term $\sum_r W_r N_r$ is the total proper energy of the assembly, and the second may be regarded as the additional energy due to the perturbation. If the perturbation is zero, the phases $\theta_r$ would increase linearly with the time, while the previous phases
naturally suggests itself is to make these canonical variables q-numbers satisfying the usual quantum conditions instead of c-numbers, so that their Hamiltonian equations of motion become true quantum equations. The Hamiltonian function will now provide a Schrödinger wave equation, which must be solved and interpreted in the usual manner. The interpretation will give not merely the probable number of systems in any state, but the probability of any given distribution of the systems among the various states, this probability being, in fact, equal to the square of the modulus of the normalised solution of the wave equation that satisfies the appropriate initial conditions. We could, of course, calculate directly from elementary considerations the probability of any given distribution when the systems are independent, as we know the probability of each system being in any particular state. We shall find that the probability calculated directly in this way does not agree with that obtained from the wave equation except in the special case when there is only one system in the assembly. In the general case it will be shown that the wave equation leads to the correct value for the probability of any given distribution when the systems obey the Einstein-Bose statistics instead of being independent.

We assume the variables $b_r$, $ih\,b_r^*$ of § 2 to be canonical q-numbers satisfying the quantum conditions

$$b_r \cdot ih\,b_r^* - ih\,b_r^* \cdot b_r = ih$$

or

$$b_r b_r^* - b_r^* b_r = 1,$$

and

$$b_r b_s - b_s b_r = 0, \quad b_r^* b_s^* - b_s^* b_r^* = 0, \quad b_r b_s^* - b_s^* b_r = 0 \quad (s \ne r).$$

The transformation equations (8) must now be written in the quantum form

$$b_r = (N_r + 1)^{1/2} e^{-i\theta_r/h} = e^{-i\theta_r/h} N_r^{1/2}, \quad b_r^* = N_r^{1/2} e^{i\theta_r/h} = e^{i\theta_r/h} (N_r + 1)^{1/2}, \tag{10}$$

in order that the $N_r$, $\theta_r$ may also be canonical variables. These equations show that the $N_r$ can have only integral characteristic values not less than zero,† which provides us with a justification for the assumption that the variables are q-numbers in the way we have chosen. The numbers of systems in the different states are now ordinary quantum numbers.

† See § 8 of the author's paper 'Roy. Soc. Proc.,' A, vol. 111, p. 281 (1926). What are there called the c-number values that a q-number can take are here given the more precise name of the characteristic values of that q-number.
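The quantum condition $b_r b_r^* - b_r^* b_r = 1$ and the integral characteristic values of $N_r = b_r^* b_r$ can be made concrete with finite matrices. What follows is a sketch under an assumption that is not in the paper: the variables of a single state r are represented in the basis $N' = 0, 1, \ldots, D-1$ by the usual lowering matrix, truncated at a finite size D, so the condition can only hold away from the last row. The helper names are hypothetical.

```python
import math


def lowering_matrix(dim):
    """Matrix of b in the basis N' = 0..dim-1, with <n-1| b |n> = sqrt(n)."""
    b = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        b[n - 1][n] = math.sqrt(n)
    return b


def transpose(m):
    """Conjugate transpose of a real matrix, used here for b*."""
    return [list(row) for row in zip(*m)]


def matmul(x, y):
    """Product of two square matrices stored as nested lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def commutator(x, y):
    """x y - y x for square matrices."""
    xy, yx = matmul(x, y), matmul(y, x)
    size = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(size)] for i in range(size)]


DIM = 6  # truncation size, an arbitrary illustrative choice
b = lowering_matrix(DIM)
b_star = transpose(b)
number = matmul(b_star, b)     # N = b* b, diagonal with entries 0, 1, 2, ...
comm = commutator(b, b_star)   # equals 1 on the diagonal except the last entry

if __name__ == "__main__":
    print("diag(N):", [round(number[n][n], 12) for n in range(DIM)])
    print("diag(b b* - b* b):", [round(comm[n][n], 12) for n in range(DIM)])
```

The last diagonal entry of the commutator differs from unity only because the matrices have been cut off at a finite size; it moves further out as D is increased.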
The Hamiltonian (7) now becomes

$$F = \sum_{rs} b_r^* H_{rs} b_s = \sum_{rs} N_r^{1/2} e^{i\theta_r/h}\,H_{rs}\,(N_s + 1)^{1/2} e^{-i\theta_s/h} = \sum_{rs} H_{rs}\,N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2}\,e^{i(\theta_r - \theta_s)/h}, \tag{11}$$

in which the $H_{rs}$ are still c-numbers. We may write this F in the form corresponding to (9)

$$F = \sum_r W_r N_r + \sum_{rs} v_{rs}\,N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2}\,e^{i(\theta_r - \theta_s)/h}, \tag{11'}$$

in which it is again composed of a proper energy term $\sum_r W_r N_r$ and an interaction energy term.

The wave equation written in terms of the variables $N_r$ is†

$$ih\,\frac{\partial}{\partial t}\psi(N_1', N_2', N_3' \ldots) = F\,\psi(N_1', N_2', N_3' \ldots), \tag{12}$$

where F is an operator, each $\theta_r$ occurring in F being interpreted to mean $ih\,\partial/\partial N_r'$. If we apply the operator $e^{\pm i\theta_r/h}$ to any function $f(N_1', N_2', \ldots N_r', \ldots)$ of the variables $N_1'$, $N_2'$, ... the result is

$$e^{\pm i\theta_r/h} f(N_1', N_2', \ldots N_r', \ldots) = e^{\mp\partial/\partial N_r'} f(N_1', N_2', \ldots N_r', \ldots) = f(N_1', N_2', \ldots N_r' \mp 1, \ldots).$$

If we use this rule in equation (12) and use the expression (11) for F we obtain‡

$$ih\,\frac{\partial}{\partial t}\psi(N_1', N_2', N_3' \ldots) = \sum_{rs} H_{rs}\,N_r'^{1/2} (N_s' + 1 - \delta_{rs})^{1/2}\,\psi(N_1', N_2' \ldots N_r' - 1, \ldots N_s' + 1, \ldots). \tag{13}$$

We see from the right-hand side of this equation that in the matrix representing F, the term in F involving $e^{i(\theta_r - \theta_s)/h}$ will contribute only to those matrix elements that refer to transitions in which $N_r$ decreases by unity and $N_s$ increases by unity, i.e., to matrix elements of the type $F(N_1', N_2' \ldots N_r' \ldots N_s';\ N_1', N_2' \ldots N_r' - 1 \ldots N_s' + 1 \ldots)$. If we find a solution $\psi(N_1', N_2' \ldots)$ of equation (13) that is normalised [i.e., one for which $\sum_{N_1', N_2' \ldots} |\psi(N_1', N_2' \ldots)|^2 = 1$] and that satisfies the proper initial conditions, then $|\psi(N_1', N_2' \ldots)|^2$ will be the probability of that distribution in which $N_1'$ systems are in state 1, $N_2'$ in state 2, ... at any time.

Consider first the case when there is only one system in the assembly. The probability of its being in the state q is determined by the eigenfunction

† We are supposing for definiteness that the label r of the stationary states takes the values 1, 2, 3, ....
‡ When s = r, $\psi(N_1', N_2' \ldots N_r' - 1 \ldots N_s' + 1)$ is to be taken to mean $\psi(N_1' N_2' \ldots N_r' \ldots)$.
$\psi(N_1', N_2', \ldots)$ in which all the N''s are put equal to zero except $N_q'$, which is put equal to unity. This eigenfunction we shall denote by $\psi\{q\}$. When it is substituted in the left-hand side of (13), all the terms in the summation on the right-hand side vanish except those for which r = q, and we are left with

$$ih\,\frac{\partial}{\partial t}\psi\{q\} = \sum_s H_{qs}\,\psi\{s\},$$

which is the same equation as (5) with $\psi\{q\}$ playing the part of $b_q$. This establishes the fact that the present theory is equivalent to that of the preceding section when there is only one system in the assembly.

Now take the general case of an arbitrary number of systems in the assembly, and assume that they obey the Einstein-Bose statistical mechanics. This requires that, in the ordinary treatment of the problem, only those eigenfunctions that are symmetrical between all the systems must be taken into account, these eigenfunctions being by themselves sufficient to give a complete quantum solution of the problem.† We shall now obtain the equation for the rate of change of one of these symmetrical eigenfunctions, and show that it is identical with equation (13).

If we label each system with a number n, then the Hamiltonian for the assembly will be $H_A = \sum_n H(n)$, where H(n) is the H of § 2 (equal to $H_0 + V$) expressed in terms of the variables of the nth system. A stationary state of the assembly is defined by the numbers $r_1, r_2 \ldots r_n \ldots$ which are the labels of the stationary states in which the separate systems lie. The Schrödinger equation for the assembly in a set of variables that specify the stationary states will be of the form (6) [with $H_A$ instead of H], and we can write it in the notation of equation (5) thus:

$$ih\,\dot b(r_1 r_2 \ldots) = \sum_{s_1, s_2 \ldots} H_A(r_1 r_2 \ldots;\ s_1 s_2 \ldots)\,b(s_1 s_2 \ldots), \tag{14}$$

where $H_A(r_1 r_2 \ldots;\ s_1 s_2 \ldots)$ is the general matrix element of $H_A$ [with the time factor removed]. This matrix element vanishes when more than one $s_n$ differs from the corresponding $r_n$; equals $H_{r_m s_m}$ when $s_m$ differs from $r_m$ and every other $s_n$ equals $r_n$; and equals $\sum_n H_{r_n r_n}$ when every $s_n$ equals $r_n$. Substituting these values in (14), we obtain

$$ih\,\dot b(r_1 r_2 \ldots) = \sum_m \sum_{s_m \ne r_m} H_{r_m s_m}\,b(r_1 r_2 \ldots r_{m-1} s_m r_{m+1} \ldots) + \sum_n H_{r_n r_n}\,b(r_1 r_2 \ldots). \tag{15}$$

We must now restrict $b(r_1 r_2 \ldots)$ to be a symmetrical function of the variables $r_1, r_2 \ldots$ in order to obtain the Einstein-Bose statistics. This is permissible since if $b(r_1 r_2 \ldots)$ is symmetrical at any time, then equation (15) shows that

† Loc. cit., I, § 3.
$\dot b(r_1 r_2 \ldots)$ is also symmetrical at that time, so that $b(r_1 r_2 \ldots)$ will remain symmetrical.

Let $N_r$ denote the number of systems in the state r. Then a stationary state of the assembly describable by a symmetrical eigenfunction may be specified by the numbers $N_1, N_2 \ldots N_r \ldots$ just as well as by the numbers $r_1, r_2 \ldots r_n \ldots$, and we shall be able to transform equation (15) to the variables $N_1, N_2 \ldots$. We cannot actually take the new eigenfunction $b(N_1, N_2 \ldots)$ equal to the previous one $b(r_1 r_2 \ldots)$, but must take one to be a numerical multiple of the other in order that each may be correctly normalised with respect to its respective variables. We must have, in fact,

$$\sum_{r_1, r_2 \ldots} |b(r_1 r_2 \ldots)|^2 = 1 = \sum_{N_1, N_2 \ldots} |b(N_1, N_2 \ldots)|^2,$$

and hence we must take $|b(N_1, N_2 \ldots)|^2$ equal to the sum of $|b(r_1 r_2 \ldots)|^2$ for all values of the numbers $r_1, r_2 \ldots$ such that there are $N_1$ of them equal to 1, $N_2$ equal to 2, etc. There are $N!/N_1!\,N_2! \ldots$ terms in this sum, where $N = \sum_r N_r$ is the total number of systems, and they are all equal, since $b(r_1 r_2 \ldots)$ is a symmetrical function of its variables $r_1, r_2 \ldots$. Hence we must have

$$b(N_1, N_2 \ldots) = (N!/N_1!\,N_2! \ldots)^{1/2}\,b(r_1 r_2 \ldots).$$

If we make this substitution in equation (15), the left-hand side will become $ih\,(N_1!\,N_2! \ldots/N!)^{1/2}\,\dot b(N_1, N_2 \ldots)$. The term $H_{r_m s_m}\,b(r_1 r_2 \ldots r_{m-1} s_m r_{m+1} \ldots)$ in the first summation on the right-hand side will become

$$[N_1!\,N_2! \ldots (N_r - 1)! \ldots (N_s + 1)! \ldots/N!]^{1/2}\,H_{rs}\,b(N_1, N_2 \ldots N_r - 1 \ldots N_s + 1 \ldots), \tag{16}$$

where we have written r for $r_m$ and s for $s_m$. This term must be summed for all values of s except r, and must then be summed for r taking each of the values $r_1, r_2 \ldots$. Thus each term (16) gets repeated by the summation process until it occurs a total of $N_r$ times, so that it contributes

$$N_r\,[N_1!\,N_2! \ldots (N_r - 1)! \ldots (N_s + 1)! \ldots/N!]^{1/2}\,H_{rs}\,b(N_1, N_2 \ldots N_r - 1 \ldots N_s + 1 \ldots)$$
$$= N_r^{1/2} (N_s + 1)^{1/2}\,(N_1!\,N_2! \ldots/N!)^{1/2}\,H_{rs}\,b(N_1, N_2 \ldots N_r - 1 \ldots N_s + 1 \ldots)$$

to the right-hand side of (15). Finally, the term $\sum_n H_{r_n r_n}\,b(r_1 r_2 \ldots)$ becomes

$$\sum_r N_r H_{rr} \cdot b(r_1 r_2 \ldots) = \sum_r N_r H_{rr} \cdot (N_1!\,N_2! \ldots/N!)^{1/2}\,b(N_1, N_2 \ldots).$$

Hence equation (15) becomes, with the removal of the factor $(N_1!\,N_2! \ldots/N!)^{1/2}$,

$$ih\,\dot b(N_1, N_2 \ldots) = \sum_r \sum_{s \ne r} N_r^{1/2} (N_s + 1)^{1/2}\,H_{rs}\,b(N_1, N_2 \ldots N_r - 1 \ldots N_s + 1 \ldots) + \sum_r N_r H_{rr}\,b(N_1, N_2 \ldots), \tag{17}$$
which is identical with (13) [except for the fact that in (17) the primes have been omitted from the $N$'s, which is permissible when we do not require to refer to the $N$'s as q-numbers]. We have thus established that the Hamiltonian (11) describes the effect of a perturbation on an assembly satisfying the Einstein-Bose statistics.
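The counting argument above can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are illustrative assumptions, states are labelled from 0 rather than 1, and the occupation numbers are arbitrary small integers. It confirms by brute force that the number of label sequences $(r_1, r_2 \ldots)$ with given occupation numbers is $N!/N_1!\,N_2! \ldots$, and that $N_r$ times the factor in (16), divided by the common factor removed in (17), equals $N_r^{1/2}(N_s+1)^{1/2}$.

```python
from itertools import product
from math import factorial, sqrt


def multinomial(occupations):
    """N!/(N_1! N_2! ...): number of label sequences with these occupation numbers."""
    count = factorial(sum(occupations))
    for n in occupations:
        count //= factorial(n)  # each division is exact
    return count


def count_by_enumeration(occupations):
    """Brute-force count of sequences (r_1, ..., r_N) with exactly these occupations."""
    n_states = len(occupations)
    target = tuple(occupations)
    hits = 0
    for labels in product(range(n_states), repeat=sum(occupations)):
        if tuple(labels.count(state) for state in range(n_states)) == target:
            hits += 1
    return hits


def norm_factor(occupations):
    """(N_1! N_2! ... / N!)^(1/2), relating b(r_1 r_2 ...) to b(N_1, N_2 ...)."""
    return 1.0 / sqrt(multinomial(occupations))


def hopping_coefficient(occupations, r, s):
    """N_r times the factor of term (16), divided by the factor removed in (17).

    Requires state r to be occupied (occupations[r] >= 1) and r != s.
    """
    shifted = list(occupations)
    shifted[r] -= 1
    shifted[s] += 1
    return occupations[r] * norm_factor(shifted) / norm_factor(occupations)


if __name__ == "__main__":
    occ = [2, 1, 3]  # hypothetical occupation numbers N_1, N_2, N_3
    print(multinomial(occ), count_by_enumeration(occ))
    print(hopping_coefficient(occ, 0, 1), sqrt(occ[0] * (occ[1] + 1)))
```

The brute-force enumeration grows as (number of states) to the power $N$, so it is only practical for a handful of systems; it is there to confirm the closed form, not to replace it.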
§ 4. The Reaction of the Assembly on the Perturbing System. Up to the present we have considered only perturbations that can be represented by a perturbing energy $V$ added to the Hamiltonian of the perturbed system, $V$ being a function only of the dynamical variables of that system and perhaps of the time. The theory may readily be extended to the case when the perturbation consists of interaction with a perturbing dynamical system, the reaction of the perturbed system on the perturbing system being taken into account. (The distinction between the perturbing system and the perturbed system is, of course, not real, but it will be kept up for convenience.) We now consider a perturbing system, described, say, by the canonical variables $J_k$, $w_k$, the $J$'s being its first integrals when it is alone, interacting with an assembly of perturbed systems with no mutual interaction, that satisfy the Einstein-Bose statistics. The total Hamiltonian will be of the form
$$H_T = H_P(J) + \sum_n H(n),$$
where $H_P$ is the Hamiltonian of the perturbing system (a function of the $J$'s only) and $H(n)$ is equal to the proper energy $H_0(n)$ plus the perturbation energy $V(n)$ of the $n$th system of the assembly. $H(n)$ is a function only of the variables of the $n$th system of the assembly and of the $J$'s and $w$'s, and does not involve the time explicitly. The Schrödinger equation corresponding to equation (14) is now
$$ih\,\dot b(J', r_1 r_2 \ldots) = \sum_{J''} \sum_{s_1, s_2 \ldots} H_T(J', r_1 r_2 \ldots; J'', s_1 s_2 \ldots)\, b(J'', s_1 s_2 \ldots),$$
in which the eigenfunction $b$ involves the additional variables $J_k'$. The matrix element $H_T(J', r_1 r_2 \ldots; J'', s_1 s_2 \ldots)$ is now always a constant. As before, it vanishes when more than one $s_n$ differs from the corresponding $r_n$. When $s_m$ differs from $r_m$ and every other $s_n$ equals $r_n$, it reduces to $H(J' r_m; J'' s_m)$, which is the $(J' r_m; J'' s_m)$ matrix element (with the time factor removed) of $H = H_0 + V$, the proper energy plus the perturbation energy of a single system of the assembly; while when every $s_n$ equals $r_n$, it has the value $H_P(J')\,\delta_{J'J''} + \sum_n H(J' r_n; J'' r_n)$. If, as before, we restrict the eigenfunctions
to be symmetrical in the variables $r_1, r_2 \ldots$, we can again transform to the variables $N_1, N_2 \ldots$, which will lead, as before, to the result
$$ih\,\dot b(J', N_1', N_2' \ldots) = H_P(J')\, b(J', N_1', N_2' \ldots) + \sum_{J''} \sum_{r,s} N_r'^{1/2} (N_s' + 1 - \delta_{rs})^{1/2} H(J'r; J''s)\, b(J'', N_1', N_2' \ldots N_r' - 1 \ldots N_s' + 1 \ldots). \tag{18}$$
This is the Schrödinger equation corresponding to the Hamiltonian function
$$F = H_P(J) + \sum_{r,s} H_{rs}\, N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2} e^{i(\theta_r - \theta_s)/h}, \tag{19}$$
in which $H_{rs}$ is now a function of the $J$'s and $w$'s, being such that when represented by a matrix in the $(J)$ scheme its $(J'J'')$ element is $H(J'r; J''s)$. (It should be noticed that $H_{rs}$ still commutes with the $N$'s and $\theta$'s.) Thus the interaction of a perturbing system and an assembly satisfying the Einstein-Bose statistics can be described by a Hamiltonian of the form (19). We can put it in the form corresponding to (11') by observing that the matrix element $H(J'r; J''s)$ is composed of the sum of two parts, a part that comes from the proper energy $H_0$, which equals $W_r$ when $J_k' = J_k''$ and $s = r$ and vanishes otherwise, and a part that comes from the interaction energy $V$, which may be denoted by $v(J'r; J''s)$. Thus we shall have
$$H_{rs} = W_r\,\delta_{rs} + v_{rs},$$
where $v_{rs}$ is that function of the $J$'s and $w$'s which is represented by the matrix whose $(J'J'')$ element is $v(J'r; J''s)$, and so (19) becomes
$$F = H_P(J) + \sum_r W_r N_r + \sum_{r,s} v_{rs}\, N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2} e^{i(\theta_r - \theta_s)/h}. \tag{20}$$
The Hamiltonian is thus the sum of the proper energy of the perturbing system $H_P(J)$, the proper energy of the perturbed systems $\sum_r W_r N_r$, and the perturbation energy $\sum_{r,s} v_{rs}\, N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2} e^{i(\theta_r - \theta_s)/h}$.
§ 5. Theory of Transitions in a System from One State to Others of the Same Energy. Before applying the results of the preceding sections to light-quanta, we shall consider the solution of the problem presented by a Hamiltonian of the type (19). The essential feature of the problem is that it refers to a dynamical system which can, under the influence of a perturbation energy which does not involve the time explicitly, make transitions from one state to others of the same energy. The problem of collisions between an atomic system and an electron, which has been treated by Born,* is a special case of this type. Born's method is to find a periodic solution of the wave equation which consists, in so far as it involves the co-ordinates of the colliding electron, of plane waves, representing the incident electron, approaching the atomic system, which are scattered or diffracted in all directions. The square of the amplitude of the waves scattered in any direction with any frequency is then assumed by Born to be the probability of the electron being scattered in that direction with the corresponding energy. This method does not appear to be capable of extension in any simple manner to the general problem of systems that make transitions from one state to others of the same energy. Also there is at present no very direct and certain way of interpreting a periodic solution of a wave equation to apply to a non-periodic physical phenomenon such as a collision. (The more definite method that will now be given shows that Born's assumption is not quite right, it being necessary to multiply the square of the amplitude by a certain factor.) An alternative method of solving a collision problem is to find a non-periodic solution of the wave equation which consists initially simply of plane waves moving over the whole of space in the necessary direction with the necessary frequency to represent the incident electron. In course of time waves moving in other directions must appear in order that the wave equation may remain satisfied. The probability of the electron being scattered in any direction with any energy will then be determined by the rate of growth of the corresponding harmonic component of these waves. The way the mathematics is to be interpreted is by this method quite definite, being the same as that of the beginning of § 2. We shall apply this method to the general problem of a system which makes transitions from one state to others of the same energy under the action of a perturbation. Let $H_0$ be the Hamiltonian of the unperturbed system and $V$ the perturbing energy, which must not involve the time explicitly.
* Born, 'Z. f. Physik,' vol. 38, p. 803 (1926).
If we take the case of a continuous range of stationary states, specified by the first integrals, $\alpha_k$ say, of the unperturbed motion, then, following the method of § 2, we obtain
$$ih\,\dot a(\alpha') = \int V(\alpha'\alpha'')\, d\alpha'' \cdot a(\alpha''), \tag{21}$$
corresponding to equation (4). The probability of the system being in a state for which each $\alpha_k$ lies between $\alpha_k'$ and $\alpha_k' + d\alpha_k'$ at any time is $|a(\alpha')|^2\, d\alpha_1' \cdot d\alpha_2' \ldots$ when $a(\alpha')$ is properly normalised and satisfies the proper initial conditions. If initially the system is in the state $\alpha^0$, we must take the initial value of $a(\alpha')$ to be of the form $a^0 \cdot \delta(\alpha' - \alpha^0)$. We shall keep $a^0$ arbitrary, as it would be inconvenient to normalise $a(\alpha')$ in the present case. For a first approximation
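we may substitute for $a(\alpha'')$ in the right-hand side of (21) its initial value. This gives
$$ih\,\dot a(\alpha') = a^0 V(\alpha'\alpha^0) = a^0 v(\alpha'\alpha^0)\, e^{i[W(\alpha') - W(\alpha^0)]t/h},$$
where $v(\alpha'\alpha^0)$ is a constant and $W(\alpha')$ is the energy of the state $\alpha'$. Hence
$$ih\,a(\alpha') = a^0\,\delta(\alpha' - \alpha^0) + a^0 v(\alpha'\alpha^0)\, \frac{e^{i[W(\alpha') - W(\alpha^0)]t/h} - 1}{i[W(\alpha') - W(\alpha^0)]/h}. \tag{22}$$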
For values of the $\alpha_k'$ such that $W(\alpha')$ differs appreciably from $W(\alpha^0)$, $a(\alpha')$ is a periodic function of the time whose amplitude is small when the perturbing energy $V$ is small, so that the eigenfunctions corresponding to these stationary states are not excited to any appreciable extent. On the other hand, for values of the $\alpha_k'$ such that $W(\alpha') = W(\alpha^0)$ and $\alpha_k' \neq \alpha_k^0$ for some $k$, $a(\alpha')$ increases uniformly with respect to the time, so that the probability of the system being in the state $\alpha'$ at any time increases proportionally with the square of the time. Physically, the probability of the system being in a state with exactly the same proper energy as the initial proper energy $W(\alpha^0)$ is of no importance, being infinitesimal. We are interested only in the integral of the probability through a small range of proper energy values about the initial proper energy, which, as we shall find, increases linearly with the time, in agreement with the ordinary probability laws. We transform from the variables $\alpha_1, \alpha_2 \ldots \alpha_u$ to a set of variables that are arbitrary independent functions of the $\alpha$'s such that one of them is the proper energy $W$, say, the variables $W, \gamma_1, \gamma_2, \ldots \gamma_{u-1}$. The probability at any time of the system lying in a stationary state for which each $\gamma_k$ lies between $\gamma_k'$ and $\gamma_k' + d\gamma_k'$ is now (apart from the normalising factor) equal to
$$d\gamma_1' \cdot d\gamma_2' \ldots d\gamma_{u-1}' \int |a(\alpha')|^2\, \frac{\partial(\alpha_1', \alpha_2' \ldots \alpha_u')}{\partial(W', \gamma_1' \ldots \gamma_{u-1}')}\, dW'. \tag{23}$$
For a time that is large compared with the periods of the system we shall find that practically the whole of the integral in (23) is contributed by values of $W'$ very close to $W^0 = W(\alpha^0)$. Put
$$a(\alpha') = a(W', \gamma') \quad \text{and} \quad \partial(\alpha_1', \alpha_2' \ldots \alpha_u')/\partial(W', \gamma_1' \ldots \gamma_{u-1}') = J(W', \gamma').$$
Then for the integral in (23) we find, with the help of (22) (provided $\gamma_k' \neq \gamma_k^0$ for some $k$)
$$\int |a(W', \gamma')|^2 J(W', \gamma')\, dW'$$
$$= |a^0|^2 \int |v(W', \gamma'; W^0, \gamma^0)|^2 J(W', \gamma')\, \frac{|e^{i(W' - W^0)t/h} - 1|^2}{(W' - W^0)^2}\, dW'$$
$$= 2|a^0|^2 \int |v(W', \gamma'; W^0, \gamma^0)|^2 J(W', \gamma')\, [1 - \cos (W' - W^0)t/h]/(W' - W^0)^2 \cdot dW'$$
$$= 2|a^0|^2\, t/h \cdot \int |v(W^0 + hx/t, \gamma'; W^0, \gamma^0)|^2 J(W^0 + hx/t, \gamma')\, (1 - \cos x)/x^2 \cdot dx,$$
if one makes the substitution $(W' - W^0)t/h = x$. For large values of $t$ this reduces to
$$2|a^0|^2\, t/h \cdot |v(W^0, \gamma'; W^0, \gamma^0)|^2 J(W^0, \gamma') \int_{-\infty}^{\infty} (1 - \cos x)/x^2 \cdot dx = 2\pi |a^0|^2\, t/h \cdot |v(W^0, \gamma'; W^0, \gamma^0)|^2 J(W^0, \gamma').$$
The probability per unit time of a transition to a state for which each $\gamma_k$ lies between $\gamma_k'$ and $\gamma_k' + d\gamma_k'$ is thus (apart from the normalising factor)
$$2\pi |a^0|^2/h \cdot |v(W^0, \gamma'; W^0, \gamma^0)|^2 J(W^0, \gamma')\, d\gamma_1' \cdot d\gamma_2' \ldots d\gamma_{u-1}', \tag{24}$$
which is proportional to the square of the matrix element associated with that transition of the perturbing energy. To apply this result to a simple collision problem, we take the $\alpha$'s to be the components of momentum $p_x, p_y, p_z$ of the colliding electron and the $\gamma$'s to be $\theta$ and
c2
Thus the $J(W^0, \gamma')$ of the expression (24) has the value
$$J(W^0, \gamma') = E'P' \sin\theta'/c^2, \tag{25}$$
where $E'$ and $P'$ refer to that value for the energy of the scattered electron which makes the total energy equal the initial energy $W^0$ (i.e., to that value required by the conservation of energy). We must now interpret the initial value of $a(\alpha')$, namely, $a^0\,\delta(\alpha' - \alpha^0)$, which we did not normalise. According to § 2 the wave function in terms of the variables $\alpha_k$ is $b(\alpha') = a(\alpha')\, e^{-iW't/h}$, so that its initial value is
$$a^0\,\delta(\alpha' - \alpha^0)\, e^{-iW't/h} = a^0\,\delta(p_x' - p_x^0)\,\delta(p_y' - p_y^0)\,\delta(p_z' - p_z^0)\, e^{-iW't/h}.$$
If we use the transformation function*
$$(x'/p') = (2\pi h)^{-3/2}\, e^{i \sum_{xyz} p_x' x'/h},$$
and the transformation rule
$$\psi(x') = \int (x'/p')\, \psi(p')\, dp_x'\, dp_y'\, dp_z',$$
we obtain for the initial wave function in the co-ordinates $x, y, z$ the value
$$a^0\, (2\pi h)^{-3/2}\, e^{i \sum_{xyz} p_x^0 x'/h}\, e^{-iW't/h}.$$
* The symbol $x$ is used for brevity to denote $x, y, z$.
This corresponds to an initial distribution of $|a^0|^2 (2\pi h)^{-3}$ electrons per unit volume. Since their velocity is $P^0 c^2/E^0$, the number per unit time striking a unit surface at right-angles to their direction of motion is $|a^0|^2 P^0 c^2/(2\pi h)^3 E^0$. Dividing this into the expression (24) we obtain, with the help of (25),
$$4\pi^2 (2\pi h)^2\, \frac{E'E^0}{c^4}\, |v(p'; p^0)|^2\, \frac{P'}{P^0}\, \sin\theta'\, d\theta'\, d\phi'. \tag{26}$$
This is the effective area that must be hit by an electron in order that it shall be scattered in the solid angle $\sin\theta'\, d\theta'\, d\phi'$
physically in evidence, so that it appears to have been created. Since there is no limit to the number of light-quanta that may be created in this way, we must suppose that there are an infinite number of light-quanta in the zero state, so that the $N_0$ of the Hamiltonian (20) is infinite. We must now have $\theta_0$, the variable canonically conjugate to $N_0$, a constant, since
$$\dot\theta_0 = \partial F/\partial N_0 = W_0 + \text{terms involving } N_0^{-1/2} \text{ or } (N_0 + 1)^{-1/2}$$
and $W_0$ is zero. In order that the Hamiltonian (20) may remain finite it is necessary for the coefficients $v_{r0}$, $v_{0r}$ to be infinitely small. We shall suppose that they are infinitely small in such a way as to make $v_{r0} N_0^{1/2}$ and $v_{0r} N_0^{1/2}$ finite, in order that the transition probability coefficients may be finite. Thus we put
$$v_{r0} (N_0 + 1)^{1/2} e^{-i\theta_0/h} = v_r, \qquad v_{0r} N_0^{1/2} e^{i\theta_0/h} = v_r^*,$$
where $v_r$ and $v_r^*$ are finite and conjugate imaginaries. We may consider the $v_r$ and $v_r^*$ to be functions only of the $J$'s and $w$'s of the atomic system, since their factors $(N_0 + 1)^{1/2} e^{-i\theta_0/h}$ and $N_0^{1/2} e^{i\theta_0/h}$ are practically constants, the rate of change of $N_0$ being very small compared with $N_0$. The Hamiltonian (20) now becomes
$$F = H_P(J) + \sum_r W_r N_r + \sum_{r \neq 0} [v_r N_r^{1/2} e^{i\theta_r/h} + v_r^* (N_r + 1)^{1/2} e^{-i\theta_r/h}] + \sum_{r \neq 0} \sum_{s \neq 0} v_{rs}\, N_r^{1/2} (N_s + 1 - \delta_{rs})^{1/2} e^{i(\theta_r - \theta_s)/h}. \tag{27}$$
The probability of a transition in which a light-quantum in the state $r$ is absorbed is proportional to the square of the modulus of that matrix element of the Hamiltonian which refers to this transition. This matrix element must come from the term $v_r N_r^{1/2} e^{i\theta_r/h}$ in the Hamiltonian, and must therefore be proportional to $N_r'^{1/2}$ where $N_r'$ is the number of light-quanta in state $r$ before the process. The probability of the absorption process is thus proportional to $N_r'$. In the same way the probability of a light-quantum in state $r$ being emitted is proportional to $(N_r' + 1)$, and the probability of a light-quantum in state $r$ being scattered into state $s$ is proportional to $N_r'(N_s' + 1)$. Radiative processes of the more general type considered by Einstein and Ehrenfest,† in which more than one light-quantum take part simultaneously, are not allowed on the present theory. To establish a connection between the number of light-quanta per stationary state and the intensity of the radiation, we consider an enclosure of finite volume, $A$ say, containing the radiation. The number of stationary states for light-quanta of a given type of polarisation whose frequency lies in the range $\nu_r$ to $\nu_r + d\nu_r$ and whose direction of motion lies in the solid angle $d\omega_r$ about the direction of motion for state $r$ will now be $A \nu_r^2\, d\nu_r\, d\omega_r/c^3$. The energy of the light-quanta in these stationary states is thus $N_r' \cdot 2\pi h \nu_r \cdot A \nu_r^2\, d\nu_r\, d\omega_r/c^3$. This must equal $A c^{-1} I_r\, d\nu_r\, d\omega_r$, where $I_r$ is the intensity per unit frequency range of the radiation about the state $r$. Hence
$$I_r = N_r' (2\pi h) \nu_r^3/c^2, \tag{28}$$
† 'Z. f. Physik,' vol. 19, p. 301 (1923).
so that $N_r'$ is proportional to $I_r$ and $(N_r' + 1)$ is proportional to $I_r + (2\pi h)\nu_r^3/c^2$. We thus obtain that the probability of an absorption process is proportional to $I_r$, the incident intensity per unit frequency range, and that of an emission process is proportional to $I_r + (2\pi h)\nu_r^3/c^2$, which are just Einstein's laws.* In the same way the probability of a process in which a light-quantum is scattered from a state $r$ to a state $s$ is proportional to $I_r [I_s + (2\pi h)\nu_s^3/c^2]$, which is Pauli's law for the scattering of radiation by an electron.†
* The ratio of stimulated to spontaneous emission in the present theory is just twice its value in Einstein's. This is because in the present theory either polarised component of the incident radiation can stimulate only radiation polarised in the same way, while in Einstein's the two polarised components are treated together. This remark applies also to the scattering process.
† Pauli, 'Z. f. Physik,' vol. 18, p. 272 (1923).
§ 7. The Probability Coefficients for Emission and Absorption. We shall now consider the interaction of an atom and radiation from the wave point of view. We resolve the radiation into its Fourier components, and suppose that their number is very large but finite. Let each component be labelled by a suffix $r$, and suppose there are $\sigma_r$ components associated with the radiation of a definite type of polarisation per unit solid angle per unit frequency range about the component $r$. Each component $r$ can be described by a vector potential $\kappa_r$ chosen so as to make the scalar potential zero. The perturbation term to be added to the Hamiltonian will now be, according to the classical theory with neglect of relativity mechanics, $c^{-1} \sum_r \kappa_r \dot X_r$, where $X_r$ is the component of the total polarisation of the atom in the direction of $\kappa_r$, which is the direction of the electric vector of the component $r$. We can, as explained in § 1, suppose the field to be described by the canonical variables $N_r$, $\theta_r$, of which $N_r$ is the number of quanta of energy of the component $r$, and $\theta_r$ is its canonically conjugate phase, equal to $2\pi h \nu_r$ times the $\theta_r$ of § 1. We shall now have $\kappa_r = a_r \cos \theta_r/h$, where $a_r$ is the amplitude of $\kappa_r$, which can be connected with $N_r$ as follows: the flow of energy per unit area per unit time for the component $r$ is $\tfrac{1}{2}\pi c^{-1} a_r^2 \nu_r^2$. Hence the intensity per unit frequency range of the radiation in the neighbourhood of the component $r$ is $I_r = \tfrac{1}{2}\pi c^{-1} a_r^2 \nu_r^2 \sigma_r$. Comparing this with equation (28), we obtain $a_r = 2 (h\nu_r/c\sigma_r)^{1/2} N_r^{1/2}$, and hence
$$\kappa_r = 2 (h\nu_r/c\sigma_r)^{1/2} N_r^{1/2} \cos \theta_r/h.$$
The Hamiltonian for the whole system of atom plus radiation would now be, according to the classical theory,
$$F = H_P(J) + \sum_r (2\pi h \nu_r) N_r + 2c^{-1} \sum_r (h\nu_r/c\sigma_r)^{1/2} \dot X_r N_r^{1/2} \cos \theta_r/h, \tag{29}$$
where $H_P(J)$ is the Hamiltonian for the atom alone. On the quantum theory we must make the variables $N_r$ and $\theta_r$ canonical q-numbers like the variables $J_k$, $w_k$ that describe the atom. We must now replace the $N_r^{1/2} \cos \theta_r/h$ in (29) by the real q-number
$$\tfrac{1}{2}\{N_r^{1/2} e^{i\theta_r/h} + e^{-i\theta_r/h} N_r^{1/2}\} = \tfrac{1}{2}\{N_r^{1/2} e^{i\theta_r/h} + (N_r + 1)^{1/2} e^{-i\theta_r/h}\}$$
so that the Hamiltonian (29) becomes
$$F = H_P(J) + \sum_r (2\pi h \nu_r) N_r + h^{1/2} c^{-3/2} \sum_r (\nu_r/\sigma_r)^{1/2} \dot X_r \{N_r^{1/2} e^{i\theta_r/h} + (N_r + 1)^{1/2} e^{-i\theta_r/h}\}. \tag{30}$$
$$ih\,a(\alpha') = \delta_{\alpha'\alpha^0} + v(\alpha'\alpha^0)\, \frac{e^{i[W(\alpha') - W(\alpha^0)]t/h} - 1}{i[W(\alpha') - W(\alpha^0)]/h},$$
corresponding to (22). If, as before, we transform to the variables $W, \gamma_1, \gamma_2 \ldots \gamma_{u-1}$, we obtain (when $\gamma' \neq \gamma^0$)
$$a(W'\gamma') = v(W', \gamma'; W^0, \gamma^0)\, [1 - e^{i(W' - W^0)t/h}]/(W' - W^0).$$
The probability of the system being in a state for which each $\gamma_k$ equals $\gamma_k'$ is $\sum_{W'} |a(W'\gamma')|^2$. If the stationary states lie close together and if the time $t$ is not too great, we can replace this sum by the integral $(\Delta W)^{-1} \int |a(W'\gamma')|^2\, dW'$, where $\Delta W$ is the separation between the energy levels. Evaluating this integral as before, we obtain for the probability per unit time of a transition to a state for which each $\gamma_k = \gamma_k'$
$$2\pi/h\,\Delta W \cdot |v(W^0, \gamma'; W^0, \gamma^0)|^2. \tag{32}$$
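The step from the sum over closely spaced levels to the rate (32) can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything from the paper: units with $h = 1$, a constant matrix element $|v|$, equally spaced levels centred on $W^0$, and hypothetical function names. It first estimates $\int_{-\infty}^{\infty} (1 - \cos x)/x^2\, dx$ (expected to be close to $\pi$), then compares the directly summed probability $\sum_{W'} |a(W'\gamma')|^2$ with the rate (32) multiplied by $t$.

```python
import math


def growth_integral(cutoff=200.0, steps=200_000):
    """Midpoint-rule estimate of the integral of (1 - cos x)/x^2 over the real line."""
    width = cutoff / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * width  # midpoints avoid the removable singularity at x = 0
        total += (1.0 - math.cos(x)) / (x * x)
    # 1/cutoff approximates the non-oscillating part of the tail beyond the cutoff.
    half_line = total * width + 1.0 / cutoff
    return 2.0 * half_line  # the integrand is even in x


def summed_probability(v_abs, level_spacing, t, h=1.0, n_levels=20_000):
    """Sum of |a(W')|^2 over levels W' = W0 + k * level_spacing, |k| <= n_levels."""
    total = 0.0
    for k in range(-n_levels, n_levels + 1):
        delta = k * level_spacing
        if k == 0:
            total += (v_abs * t / h) ** 2  # limit of the general term as delta -> 0
        else:
            total += 2.0 * v_abs**2 * (1.0 - math.cos(delta * t / h)) / delta**2
    return total


def transition_rate(v_abs, level_spacing, h=1.0):
    """Probability per unit time according to (32): 2 pi |v|^2 / (h * Delta W)."""
    return 2.0 * math.pi * v_abs**2 / (h * level_spacing)


if __name__ == "__main__":
    print(growth_integral(), math.pi)
    v_abs, spacing, t = 1e-3, 0.01, 20.0  # arbitrary illustrative values
    print(summed_probability(v_abs, spacing, t), transition_rate(v_abs, spacing) * t)
```

The illustrative values keep $t\,\Delta W/h$ small, which is the regime the text describes as "the time $t$ is not too great"; for much longer times the discreteness of the levels becomes visible and the linear growth no longer holds.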
In applying this result we can take the $\gamma$'s to be any set of variables that are independent of the total proper energy $W$ and that together with $W$ define a stationary state. We now return to the problem defined by the Hamiltonian (30) and consider an absorption process in which the atom jumps from the state $J^0$ to the state $J'$ with the absorption of a light-quantum from state $r$. We take the variables $\gamma'$ to be the variables $J'$ of the atom together with variables that define the direction of motion and state of polarisation of the absorbed quantum, but not its energy. The matrix element $v(W^0, \gamma'; W^0, \gamma^0)$ is now $h^{1/2} c^{-3/2} (\nu_r/\sigma_r)^{1/2} \dot X_r(J^0J')\, (N_r^0)^{1/2}$, where $\dot X_r(J^0J')$ is the ordinary $(J^0J')$ matrix element of $\dot X_r$. Hence from (32) the probability per unit time of the absorption process is
$$\frac{2\pi}{h\,\Delta W} \cdot \frac{h\nu_r}{c^3 \sigma_r}\, |\dot X_r(J^0J')|^2 N_r^0.$$
To obtain the probability for the process when the light-quantum comes from any direction in a solid angle $d\omega$, we must multiply this expression by the number of possible directions for the light-quantum in the solid angle $d\omega$, which is $d\omega\, \sigma_r\, \Delta W/2\pi h$. This gives
$$d\omega\, \frac{\nu_r}{h c^3}\, |\dot X_r(J^0J')|^2 N_r^0 = d\omega\, \frac{1}{2\pi h^2 c \nu_r^2}\, |\dot X_r(J^0J')|^2 I_r$$
with the help of (28). Hence the probability coefficient for the absorption process is $1/2\pi h^2 c \nu_r^2 \cdot |\dot X_r(J^0J')|^2$, in agreement with the usual value for Einstein's absorption coefficient in the matrix mechanics. The agreement for the emission coefficients may be verified in the same manner.
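The algebra leading to the absorption coefficient can be spot-checked with arbitrary numbers. The following sketch is only a consistency check under assumptions: all function and parameter names are hypothetical, the numerical values carry no physical meaning, and $h$ stands for Planck's constant divided by $2\pi$ as in the text. It combines (28), (32), the matrix element quoted above and the number of directions $d\omega\,\sigma_r\,\Delta W/2\pi h$, and compares the result with $d\omega \cdot |\dot X_r(J^0J')|^2 I_r/2\pi h^2 c \nu_r^2$.

```python
import math


def occupation_from_intensity(intensity, nu, h, c):
    """Invert equation (28): N_r = I_r c^2 / ((2 pi h) nu_r^3)."""
    return intensity * c**2 / (2.0 * math.pi * h * nu**3)


def rate_single_direction(xdot_sq, nu, sigma, level_spacing, n_quanta, h, c):
    """Rate (32) with |v|^2 = (h nu / (c^3 sigma)) |Xdot|^2 N."""
    v_sq = h * nu / (c**3 * sigma) * xdot_sq * n_quanta
    return 2.0 * math.pi / (h * level_spacing) * v_sq


def rate_solid_angle(xdot_sq, nu, sigma, level_spacing, n_quanta, d_omega, h, c):
    """Multiply by the number of directions d_omega * sigma * Delta W / (2 pi h)."""
    directions = d_omega * sigma * level_spacing / (2.0 * math.pi * h)
    single = rate_single_direction(xdot_sq, nu, sigma, level_spacing, n_quanta, h, c)
    return directions * single


def absorption_coefficient(xdot_sq, nu, h, c):
    """Coefficient of I_r per unit solid angle: |Xdot|^2 / (2 pi h^2 c nu^2)."""
    return xdot_sq / (2.0 * math.pi * h**2 * c * nu**2)


if __name__ == "__main__":
    # Arbitrary illustrative values, not physical constants.
    h, c, nu, sigma, spacing = 1.3, 2.7, 0.9, 5.0, 0.02
    intensity, d_omega, xdot_sq = 0.4, 0.1, 0.7
    n_quanta = occupation_from_intensity(intensity, nu, h, c)
    lhs = rate_solid_angle(xdot_sq, nu, sigma, spacing, n_quanta, d_omega, h, c)
    rhs = d_omega * absorption_coefficient(xdot_sq, nu, h, c) * intensity
    print(lhs, rhs)
```

Note that the density of components $\sigma_r$ and the level spacing $\Delta W$ cancel in the final rate, which is why neither appears in the absorption coefficient.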
The present theory, since it gives a proper account of spontaneous emission, must presumably give the effect of radiation reaction on the emitting system, and enable one to calculate the natural breadths of spectral lines, if one can overcome the mathematical difficulties involved in the general solution of the wave problem corresponding to the Hamiltonian (30). Also the theory enables one to understand how it comes about that there is no violation of the law of the conservation of energy when, say, a photo-electron is emitted from an atom under the action of extremely weak incident radiation. The energy of interaction of the atom and the radiation is a q-number that does not commute with the first integrals of the motion of the atom alone or with the intensity of the radiation. Thus one cannot specify this energy by a c-number at the same time that one specifies the stationary state of the atom and the intensity of the radiation by c-numbers. In particular, one cannot say that the interaction energy tends to zero as the intensity of the incident radiation tends to zero. There is thus always an unspecifiable amount of interaction energy which can supply the energy for the photo-electron.
I would like to express my thanks to Prof. Niels Bohr for his interest in this work and for much friendly discussion about it.
Summary. The problem is treated of an assembly of similar systems satisfying the Einstein-Bose statistical mechanics, which interact with another different system, a Hamiltonian function being obtained to describe the motion. The theory is applied to the interaction of an assembly of light-quanta with an ordinary atom, and it is shown that it gives Einstein's laws for the emission and absorption of radiation.
The interaction of an atom with electromagnetic waves is then considered, and it is shown that if one takes the energies and phases of the waves to be q-numbers satisfying the proper quantum conditions instead of c-numbers, the Hamiltonian function takes the same form as in the light-quantum treatment. The theory leads to the correct expressions for Einstein's A's and B's.
The Quantum Theory of the Electron By P. A. M. DIRAC St. John's College, Cambridge (Communicated by R. H. Fowler, F.R.S.—Received January 2, 1928)
THE new quantum mechanics, when applied to the problem of the structure of the atom with point-charge electrons, does not give results in agreement with experiment. The discrepancies consist of "duplexity" phenomena, the observed number of stationary states for an electron in an atom being twice the number given by the theory. To meet the difficulty, Goudsmit and Uhlenbeck have introduced the idea of an electron with a spin angular momentum of half a quantum and a magnetic moment of one Bohr magneton. This model for the electron has been fitted into the new mechanics by Pauli,* and Darwin,† working with an equivalent theory, has shown that it gives results in agreement with experiment for hydrogen-like spectra to the first order of accuracy. The question remains as to why Nature should have chosen this particular model for the electron instead of being satisfied with the point-charge. One would like to find some incompleteness in the previous methods of applying quantum mechanics to the point-charge electron such that, when removed, the whole of the duplexity phenomena follow without arbitrary assumptions. In the present paper it is shown that this is the case, the incompleteness of the previous theories lying in their disagreement with relativity, or, alternatively, with the general transformation theory of quantum mechanics. It appears that the simplest Hamiltonian for a point-charge electron satisfying the requirements of both relativity and the general transformation theory leads to an explanation of all duplexity phenomena without further assumption. All the same there is a great deal of truth in the spinning electron model, at least as a first approximation. The most important failure of the model seems to be that the magnitude of the resultant orbital angular momentum of an electron moving in an orbit in a central field of force is not a constant, as the model leads one to expect.
[* Proc. Roy. Soc. (A), 117, 610 (1928).]
* Pauli, Z. f. Physik, vol. 43, p. 601 (1927).
† Darwin, Roy. Soc. Proc., A, vol. 116, p. 227 (1927).
§ 1. Previous Relativity Treatments
The relativity Hamiltonian according to the classical theory for a point electron moving in an arbitrary electro-magnetic field with scalar potential $A_0$ and vector potential $\mathbf{A}$ is
$$F \equiv -\left(\frac{W}{c} + \frac{e}{c}A_0\right)^2 + \left(\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2,$$
where $\mathbf{p}$ is the momentum vector. It has been suggested by Gordon* that the operator of the wave equation of the quantum theory should be obtained from this $F$ by the same procedure as in non-relativity theory, namely, by putting
$$W = ih\,\frac{\partial}{\partial t}, \qquad p_r = -ih\,\frac{\partial}{\partial x_r}, \qquad r = 1, 2, 3,$$
in it. This gives the wave equation
$$F\psi \equiv \left[-\left(ih\,\frac{\partial}{c\,\partial t} + \frac{e}{c}A_0\right)^2 + \sum_r \left(-ih\,\frac{\partial}{\partial x_r} + \frac{e}{c}A_r\right)^2 + m^2c^2\right]\psi = 0, \tag{1}$$
the wave function $\psi$ being a function of $x_1, x_2, x_3, t$. This gives rise to two difficulties. The first is in connection with the physical interpretation of $\psi$.
* Gordon, Z. f. Physik, vol. 40, p. 117 (1926).
Gordon, and also independently Klein,† from considerations of the conservation theorems, make the assumption that if $\psi_m$, $\psi_n$ are two solutions
$$\text{and} \quad \mathbf{I}_{mn} = -\frac{e}{2m}\left\{-ih\,(\psi_m\, \mathrm{grad}\, \psi_n - \psi_n\, \mathrm{grad}\, \psi_m) + 2\,\frac{e}{c}\,\mathbf{A}\,\psi_m\psi_n\right\}$$
are to be interpreted as the charge and current associated with the transition $m \to n$. This appears to be satisfactory so far as emission and absorption of radiation are concerned, but is not so general as the interpretation of the non-relativity quantum mechanics, which has been developed‡ sufficiently to enable one to answer the question: What is the probability of any dynamical variable at any specified time having a value lying between any specified limits, when the system is represented by a given wave function $\psi_n$? The Gordon-Klein interpretation can answer such questions if they refer to the position of the electron (by the use of $\rho_{nn}$), but not if they refer to its momentum, or angular momentum or any other dynamical variable. We should expect the interpretation of the relativity theory to be just as general as that of the non-relativity theory. The general interpretation of non-relativity quantum mechanics is based on the transformation theory, and is made possible by the wave equation being of the form
$$(H - W)\psi = 0, \tag{2}$$
i.e., being linear in $W$ or $\partial/\partial t$, so that the wave function at any time determines the wave function at any later time. The wave equation of the relativity theory must also be linear in $W$ if the general interpretation is to be possible.
† Klein, Z. f. Physik, vol. 41, p. 407 (1927).
‡ Jordan, Z. f. Physik, vol. 40, p. 809 (1927); Dirac, Roy. Soc. Proc., A, vol. 113, p. 621 (1927).
The second difficulty in Gordon's interpretation arises from the fact that if one takes the conjugate imaginary of equation (1), one gets
$$\left[-\left(-\frac{W}{c} + \frac{e}{c}A_0\right)^2 + \left(-\mathbf{p} + \frac{e}{c}\mathbf{A}\right)^2 + m^2c^2\right]\psi = 0,$$
which is the same as one would get if one put $-e$ for $e$. The wave equation (1) thus refers equally well to an electron with charge $e$ as to one with charge $-e$. If one considers for definiteness the limiting case of large quantum numbers one would find that some of the solutions of the wave equation are wave packets moving in the way a particle of charge $-e$ would move on the classical theory, while others are wave packets moving in the way a particle of charge $e$ would move classically. For this second class of solutions $W$ has a negative value. One gets over the difficulty on the classical theory by arbitrarily excluding those solutions that have a negative $W$. One cannot do this on the quantum theory, since in general a perturbation will cause transitions from states with $W$ positive to states with $W$ negative. Such a transition would appear experimentally as the electron suddenly changing its charge from $-e$ to $e$, a phenomenon which has not been observed. The true relativity wave equation should thus be such that its solutions split up into two non-combining sets, referring respectively to the charge $-e$ and the charge $e$. In the present paper we shall be concerned only with the removal of the first of these two difficulties. The resulting theory is therefore still only an approximation, but it appears to be good enough to account for all the duplexity phenomena without arbitrary assumptions.
§ 2. The Hamiltonian for No Field Our problem is to obtain a wave equation of the form (2) which shall be invariant under a Lorentz transformation and shall be equivalent to (1) in the limit of large quantum numbers. We shall consider first the case of no field, when equation (1) reduces to
$$(-p_0^2 + \mathbf{p}^2 + m^2c^2)\psi = 0 \tag{3}$$
if one puts
$$p_0 = \frac{W}{c} = ih\,\frac{\partial}{c\,\partial t}.$$
The symmetry between $p_0$ and $p_1, p_2, p_3$ required by relativity shows that, since the Hamiltonian we want is linear in $p_0$, it must also be linear in $p_1$, $p_2$ and $p_3$. Our wave equation is therefore of the form
$$(p_0 + \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \beta)\psi = 0, \tag{4}$$
where for the present all that is known about the dynamical variables or operators $\alpha_1, \alpha_2, \alpha_3, \beta$ is that they are independent of $p_0, p_1, p_2, p_3$, i.e., that they commute with $t, x_1, x_2, x_3$. Since we are considering the case of a particle moving in empty space, so that all points in space are equivalent, we should expect the Hamiltonian not to involve $t, x_1, x_2, x_3$. This means that $\alpha_1, \alpha_2, \alpha_3, \beta$ are independent of $t, x_1, x_2, x_3$, i.e., that they commute with $p_0, p_1, p_2, p_3$. We are therefore obliged to have other dynamical variables besides the co-ordinates and momenta of the electron, in order that $\alpha_1, \alpha_2, \alpha_3, \beta$ may be functions of them. The wave function $\psi$ must then involve more variables than merely $x_1, x_2, x_3, t$. Equation (4) leads to
$$0 = (-p_0 + \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \beta)(p_0 + \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \beta)\psi$$
$$= [-p_0^2 + \Sigma\, \alpha_1^2 p_1^2 + \Sigma\, (\alpha_1\alpha_2 + \alpha_2\alpha_1) p_1 p_2 + \beta^2 + \Sigma\, (\alpha_1\beta + \beta\alpha_1) p_1]\psi, \tag{5}$$
where the $\Sigma$ refers to cyclic permutation of the suffixes 1, 2, 3. This agrees with (3) if
$$\alpha_r^2 = 1, \qquad \alpha_r\alpha_s + \alpha_s\alpha_r = 0 \quad (r \neq s), \qquad \beta^2 = m^2c^2, \qquad \alpha_r\beta + \beta\alpha_r = 0, \qquad r, s = 1, 2, 3.$$
If we put $\beta = \alpha_4 mc$, these conditions become
$$\alpha_\mu^2 = 1, \qquad \alpha_\mu\alpha_\nu + \alpha_\nu\alpha_\mu = 0 \quad (\mu \neq \nu), \qquad \mu, \nu = 1, 2, 3, 4. \tag{6}$$
We can suppose the $\alpha_\mu$'s to be expressed as matrices in some matrix scheme, the matrix elements of $\alpha_\mu$ being, say, $\alpha_\mu(\zeta'\zeta'')$. The wave function $\psi$ must now be a function of $\zeta$ as well as $x_1, x_2, x_3, t$. The result of $\alpha_\mu$ multiplied into $\psi$ will be a function $(\alpha_\mu\psi)$ of $x_1, x_2, x_3, t, \zeta$ defined by
$$(\alpha_\mu\psi)(x, t, \zeta) = \sum_{\zeta'} \alpha_\mu(\zeta\zeta')\, \psi(x, t, \zeta').$$
We must now find four matrices a^ to satisfy the conditions (6). We make use of the matrices ot =
0 1
1 0,
0 i
<*i
-i 0.
0- 3
=
which Pauli introduced* to describe the three components of spin angular momentum. These matrices have just the properties o\ = 1
GTOS
+
(r 9± s),
(7)
that we require for our a's. We cannot, however, just take the cr's to be three of our a's, because then it would not be possible to find the fourth. We must extend the a's in a diagonal manner to bring in two more rows and columns, so that we can introduce three more matrices QI, Q2, QS of the same form as ax, a2, a 3, but referring to different rows and columns, thus:
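$$\sigma_1 = \begin{pmatrix} 0&1&0&0 \\ 1&0&0&0 \\ 0&0&0&1 \\ 0&0&1&0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0&-i&0&0 \\ i&0&0&0 \\ 0&0&0&-i \\ 0&0&i&0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&1&0 \\ 0&0&0&-1 \end{pmatrix},$$
$$\rho_1 = \begin{pmatrix} 0&0&1&0 \\ 0&0&0&1 \\ 1&0&0&0 \\ 0&1&0&0 \end{pmatrix}, \quad \rho_2 = \begin{pmatrix} 0&0&-i&0 \\ 0&0&0&-i \\ i&0&0&0 \\ 0&i&0&0 \end{pmatrix}, \quad \rho_3 = \begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&-1&0 \\ 0&0&0&-1 \end{pmatrix}.$$
* Pauli, loc. cit.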
The $\rho$'s are obtained from the $\sigma$'s by interchanging the second and third rows, and the second and third columns. We now have, in addition to equations (7),
$$\rho_r^2 = 1, \qquad \rho_r\rho_s + \rho_s\rho_r = 0 \quad (r \neq s), \tag{7'}$$
and also
$$\rho_r\sigma_t = \sigma_t\rho_r.$$
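If we now take
$$\alpha_1 = \rho_1\sigma_1, \qquad \alpha_2 = \rho_1\sigma_2, \qquad \alpha_3 = \rho_1\sigma_3, \qquad \alpha_4 = \rho_3,$$
all the conditions (6) are satisfied, e.g.,
$$\alpha_1^2 = \rho_1\sigma_1\rho_1\sigma_1 = \rho_1^2\sigma_1^2 = 1,$$
$$\alpha_1\alpha_2 = \rho_1\sigma_1\rho_1\sigma_2 = \rho_1^2\sigma_1\sigma_2 = -\rho_1^2\sigma_2\sigma_1 = -\alpha_2\alpha_1.$$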
The following equations are to be noted for later reference
$$\rho_1\rho_2 = i\rho_3 = -\rho_2\rho_1, \qquad \sigma_1\sigma_2 = i\sigma_3 = -\sigma_2\sigma_1, \tag{8}$$
together with the equations obtained by cyclic permutation of the suffixes. The wave equation (4) now takes the form
$$[p_0 + \rho_1(\sigma, p) + \rho_3 mc]\psi = 0, \tag{9}$$
where $\sigma$ denotes the vector $(\sigma_1, \sigma_2, \sigma_3)$. ($(\sigma, p) = \Sigma_r\,\sigma_r p_r$, $\sigma_r p_r$ being a matrix product.)
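The claim that the four matrices $\rho_1\sigma_1$, $\rho_1\sigma_2$, $\rho_1\sigma_3$, $\rho_3$ satisfy conditions (6) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original paper; the helper names (`matmul`, `swap_2_3`, `satisfies_conditions_6`) are illustrative assumptions. It builds the $\rho$'s from the $\sigma$'s by the row and column interchange described in the text and tests every anticommutator.

```python
# Pure-Python 4x4 complex matrices stored as lists of rows (no external libraries).

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = scale(0, I4)

# The sigma's extended "in a diagonal manner" to four rows and columns.
sigma1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
sigma2 = [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]]
sigma3 = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

def swap_2_3(m):
    """Interchange the second and third rows and the second and third columns."""
    p = [0, 2, 1, 3]
    return [[m[p[i]][p[j]] for j in range(4)] for i in range(4)]

rho1, rho2, rho3 = swap_2_3(sigma1), swap_2_3(sigma2), swap_2_3(sigma3)

# alpha_1..alpha_3 = rho_1 sigma_r, alpha_4 = rho_3, as in the text.
alphas = [matmul(rho1, sigma1), matmul(rho1, sigma2), matmul(rho1, sigma3), rho3]

def satisfies_conditions_6(mats):
    """True if alpha_mu alpha_nu + alpha_nu alpha_mu = 2 delta_{mu nu} for all pairs."""
    for mu, a in enumerate(mats):
        for nu, b in enumerate(mats):
            anti = add(matmul(a, b), matmul(b, a))
            expected = scale(2, I4) if mu == nu else ZERO4
            if not close(anti, expected):
                return False
    return True

print(satisfies_conditions_6(alphas))
```

The same helpers can be used to confirm relation (8), for instance that `matmul(rho1, rho2)` equals `scale(1j, rho3)`.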
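§ 3. Proof of Invariance under a Lorentz Transformation
Multiply equation (9) by $\rho_3$ on the left-hand side. It becomes, with the help of (8),
$$[\rho_3 p_0 + i\rho_2(\sigma_1 p_1 + \sigma_2 p_2 + \sigma_3 p_3) + mc]\psi = 0.$$
Putting
$$p_0 = ip_4, \qquad \rho_3 = \gamma_4, \qquad \rho_2\sigma_r = \gamma_r, \qquad r = 1, 2, 3, \tag{10}$$
we have
$$\Big[i\sum_\mu \gamma_\mu p_\mu + mc\Big]\psi = 0, \qquad \mu = 1, 2, 3, 4. \tag{11}$$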
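The $p_\mu$ transform under a Lorentz transformation according to the law
$$p'_\mu = \sum_\nu a_{\mu\nu}\,p_\nu,$$
where the coefficients $a_{\mu\nu}$ are c-numbers satisfying
$$\sum_\mu a_{\mu\nu}a_{\mu\tau} = \delta_{\nu\tau}, \qquad \sum_\tau a_{\mu\tau}a_{\nu\tau} = \delta_{\mu\nu}.$$
The wave equation therefore transforms into
$$\Big[i\sum_\mu \gamma'_\mu p'_\mu + mc\Big]\psi = 0, \tag{12}$$
where $\gamma'_\mu = \sum_\nu a_{\mu\nu}\gamma_\nu$. Now the $\gamma_\mu$, like the $\alpha_\mu$, satisfy
$$\gamma_\mu^2 = 1, \qquad \gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 0 \quad (\mu \neq \nu).$$
These relations can be summed up in the single equation
$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}.$$
We have
$$\gamma'_\mu\gamma'_\nu + \gamma'_\nu\gamma'_\mu = \sum_{\tau\lambda} a_{\mu\tau}a_{\nu\lambda}(\gamma_\tau\gamma_\lambda + \gamma_\lambda\gamma_\tau) = 2\sum_{\tau\lambda} a_{\mu\tau}a_{\nu\lambda}\delta_{\tau\lambda} = 2\sum_\tau a_{\mu\tau}a_{\nu\tau} = 2\delta_{\mu\nu}.$$
Thus the $\gamma'_\mu$ satisfy the same relations as the $\gamma_\mu$. Thus we can put, analogously to (10),
$$\gamma'_4 = \rho'_3, \qquad \gamma'_r = \rho'_2\sigma'_r,$$
where the $\rho'$'s and $\sigma'$'s are easily verified to satisfy the relations corresponding to (7), (7') and (8), if $\rho'_2$ and $\rho'_1$ are defined by
$$\rho'_2 = -i\gamma'_1\gamma'_2\gamma'_3, \qquad \rho'_1 = -i\rho'_2\rho'_3.$$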
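The statement that the transformed matrices $\gamma'_\mu = \sum_\nu a_{\mu\nu}\gamma_\nu$ obey the same relations as the $\gamma_\mu$ can be illustrated with a concrete set of coefficients. The sketch below (plain Python, not from the paper) assumes a boost along $x_1$ written in the variables $p_1, \ldots, p_4$ with $p_0 = ip_4$, so that the coefficients in the (1, 4) block are $\cosh\chi$ and $\mp i\sinh\chi$; the function names and the sign convention of the boost are illustrative assumptions.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

# 4x4 sigma and rho matrices of section 2.
SIGMA = [
    [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]],
    [[0, -1j, 0, 0], [1j, 0, 0, 0], [0, 0, 0, -1j], [0, 0, 1j, 0]],
    [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]],
]
RHO2 = [[0, 0, -1j, 0], [0, 0, 0, -1j], [1j, 0, 0, 0], [0, 1j, 0, 0]]
RHO3 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# gamma_r = rho_2 sigma_r (r = 1, 2, 3), gamma_4 = rho_3, as in (10).
GAMMA = [matmul(RHO2, s) for s in SIGMA] + [RHO3]

def boost_coefficients(chi):
    """Complex orthogonal coefficients a_{mu nu} for a boost along x_1 (assumed convention)."""
    a = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    a[0][0] = a[3][3] = math.cosh(chi)
    a[0][3] = -1j * math.sinh(chi)
    a[3][0] = 1j * math.sinh(chi)
    return a

def transformed_gammas(a):
    """gamma'_mu = sum_nu a_{mu nu} gamma_nu."""
    out = []
    for mu in range(4):
        g = [[0] * 4 for _ in range(4)]
        for nu in range(4):
            g = add(g, scale(a[mu][nu], GAMMA[nu]))
        out.append(g)
    return out

def satisfies_relations(gs, tol=1e-9):
    """True if g_mu g_nu + g_nu g_mu = 2 delta_{mu nu} for all pairs."""
    for mu in range(4):
        for nu in range(4):
            anti = add(matmul(gs[mu], gs[nu]), matmul(gs[nu], gs[mu]))
            for i in range(4):
                for j in range(4):
                    target = 2 if (mu == nu and i == j) else 0
                    if abs(anti[i][j] - target) > tol:
                        return False
    return True

print(satisfies_relations(transformed_gammas(boost_coefficients(0.7))))
```

The check only uses the orthogonality of the $a_{\mu\nu}$, which is why imaginary entries in the boost cause no difficulty.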
We shall now show that, by a canonical transformation, the $\rho'$'s and $\sigma'$'s may be brought into the form of the $\rho$'s and $\sigma$'s. From the equation $\rho_3'^2 = 1$, it follows that the only possible characteristic values for $\rho'_3$ are $\pm 1$. If one applies to $\rho'_3$ a canonical transformation with the transformation function $\rho'_1$, the result is
$$\rho'_1\rho'_3(\rho'_1)^{-1} = -\rho'_3\rho'_1(\rho'_1)^{-1} = -\rho'_3.$$
Since characteristic values are not changed by a canonical transformation, $\rho'_3$ must have the same characteristic values as $-\rho'_3$. Hence the characteristic values of $\rho'_3$ are $+1$ twice and $-1$ twice. The same argument applies to each of the other $\rho'$'s, and to each of the $\sigma'$'s. Since $\rho'_3$ and $\sigma'_3$ commute, they can be brought simultaneously to the diagonal form by a canonical transformation. They will then have for their diagonal elements each $+1$ twice and $-1$ twice. Thus, by suitably rearranging the rows and columns, they can be brought into the form $\rho_3$ and $\sigma_3$ respectively. (The possibility $\rho'_3 = \pm\sigma'_3$ is excluded by the existence of matrices that commute with one but not with the other.) Any matrix containing four rows and columns can be expressed as
$$c + \sum_r c_r\sigma_r + \sum_r c'_r\rho_r + \sum_{rs} c_{rs}\rho_r\sigma_s, \tag{13}$$
where the sixteen coefficients $c, c_r, c'_r, c_{rs}$ are c-numbers. By expressing $\sigma'_1$ in this way, we see, from the fact that it commutes with $\rho'_3 = \rho_3$ and anticommutes* with $\sigma'_3 = \sigma_3$, that it must be of the form
$$\sigma'_1 = c_1\sigma_1 + c_2\sigma_2 + c_{31}\rho_3\sigma_1 + c_{32}\rho_3\sigma_2,$$
i.e., of the form
$$\sigma'_1 = \begin{pmatrix} 0 & a_{12} & 0 & 0 \\ a_{21} & 0 & 0 & 0 \\ 0 & 0 & 0 & a_{34} \\ 0 & 0 & a_{43} & 0 \end{pmatrix}.$$
* We say that $a$ anticommutes with $b$ when $ab = -ba$.
The condition $\sigma_1'^2 = 1$ shows that $a_{12}a_{21} = 1$, $a_{34}a_{43} = 1$. If we now apply the canonical transformation: first row to be multiplied by $(a_{21}/a_{12})^{1/2}$ and third row to be multiplied by $(a_{43}/a_{34})^{1/2}$, and first and third columns to be divided by the same expressions, $\sigma'_1$ will be brought into the form of $\sigma_1$, and the diagonal matrices $\sigma'_3$ and $\rho'_3$ will not be changed. If we now express $\rho'_1$ in the form (13) and use the conditions that it commutes with $\sigma'_1 = \sigma_1$ and $\sigma'_3 = \sigma_3$ and anticommutes with $\rho'_3 = \rho_3$, we see that it must be of the form
$$\rho'_1 = c'_1\rho_1 + c'_2\rho_2.$$
The condition $\rho_1'^2 = 1$ shows that $c_1'^2 + c_2'^2 = 1$, or $c'_1 = \cos\theta$, $c'_2 = \sin\theta$. Hence $\rho'_1$ is of the form
$$\rho'_1 = \begin{pmatrix} 0 & 0 & e^{-i\theta} & 0 \\ 0 & 0 & 0 & e^{-i\theta} \\ e^{i\theta} & 0 & 0 & 0 \\ 0 & e^{i\theta} & 0 & 0 \end{pmatrix}.$$
If we now apply the canonical transformation: first and second rows to be multiplied by $e^{i\theta}$ and first and second columns to be divided by the same expression, $\rho'_1$ will be brought into the form $\rho_1$, and $\sigma_1, \sigma_3, \rho_3$ will not be altered. $\rho'_2$ and $\sigma'_2$ must now be of the form $\rho_2$ and $\sigma_2$, on account of the relations $i\rho'_2 = \rho'_3\rho'_1$, $i\sigma'_2 = \sigma'_3\sigma'_1$.
Thus by a succession of canonical transformations, which can be combined to form a single canonical transformation, the $\rho'$'s and $\sigma'$'s can be brought into the form of the $\rho$'s and $\sigma$'s. The new wave equation (12) can in this way be brought back into the form of the original wave equation (11) or (9), so that the results that follow from this original wave equation must be independent of the frame of reference used.
§ 4. The Hamiltonian for an Arbitrary Field
To obtain the Hamiltonian for an electron in an electromagnetic field with scalar potential $A_0$ and vector potential $A$, we adopt the usual procedure of substituting $p_0 + (e/c)A_0$ for $p_0$ and $p + (e/c)A$ for $p$ in the Hamiltonian for no field. From equation (9) we thus obtain
$$\Big[p_0 + \frac{e}{c}A_0 + \rho_1\Big(\sigma,\, p + \frac{e}{c}A\Big) + \rho_3 mc\Big]\psi = 0. \tag{14}$$
This wave equation appears to be sufficient to account for all the duplexity phenomena. On account of the matrices $\rho$ and $\sigma$ containing four rows and columns, it will have four times as many solutions as the non-relativity wave equation, and twice as many as the previous relativity wave equation (1). Since half the solutions must be rejected as referring to the charge $+e$ on the electron, the correct number will be left to account for duplexity phenomena. The proof given in the preceding section of invariance under a Lorentz transformation applies equally well to the more general wave equation (14). We can obtain a rough idea of how (14) differs from the previous relativity wave equation (1) by multiplying it up analogously to (5). This gives, if we write $e'$ for $e/c$,
$$0 = [-(p_0 + e'A_0) + \rho_1(\sigma, p + e'A) + \rho_3 mc] \times [(p_0 + e'A_0) + \rho_1(\sigma, p + e'A) + \rho_3 mc]\psi$$
$$= \big[-(p_0 + e'A_0)^2 + (\sigma, p + e'A)^2 + m^2c^2 + \rho_1\{(\sigma, p + e'A)(p_0 + e'A_0) - (p_0 + e'A_0)(\sigma, p + e'A)\}\big]\psi. \tag{15}$$
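We now use the general formula, that if $B$ and $C$ are any two vectors that commute with $\sigma$,
$$(\sigma, B)(\sigma, C) = \Sigma\,\sigma_1^2 B_1C_1 + \Sigma\,(\sigma_1\sigma_2 B_1C_2 + \sigma_2\sigma_1 B_2C_1) = (B, C) + i\,\Sigma\,\sigma_3(B_1C_2 - B_2C_1) = (B, C) + i(\sigma, B \times C). \tag{16}$$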
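Formula (16) is easy to test numerically in the special case where the components of $B$ and $C$ are ordinary numbers. The sketch below (plain Python, not from the paper) uses the two-rowed Pauli matrices and hypothetical helper names; note that in the text $B$ and $C$ may be operators that do not commute with each other, a case this c-number check does not cover.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two-rowed Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def sigma_dot(v):
    """(sigma, v) = sum_r sigma_r v_r for a vector v of three numbers."""
    return [[sum(v[r] * SIGMA[r][i][j] for r in range(3)) for j in range(2)]
            for i in range(2)]

def dot(b, c):
    return sum(x * y for x, y in zip(b, c))

def cross(b, c):
    return [b[1] * c[2] - b[2] * c[1],
            b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0]]

def formula_16_holds(b, c, tol=1e-12):
    """Check (sigma,B)(sigma,C) = (B,C) + i (sigma, B x C) for number-valued B, C."""
    lhs = matmul(sigma_dot(b), sigma_dot(c))
    s = sigma_dot(cross(b, c))
    rhs = [[dot(b, c) * (1 if i == j else 0) + 1j * s[i][j] for j in range(2)]
           for i in range(2)]
    return all(abs(lhs[i][j] - rhs[i][j]) < tol
               for i in range(2) for j in range(2))

print(formula_16_holds([1.0, 2.0, -0.5], [0.3, -1.2, 2.0]))
```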
Taking $B = C = p + e'A$, we find
$$(\sigma, p + e'A)^2 = (p + e'A)^2 + i\,\Sigma\,\sigma_3[(p_1 + e'A_1)(p_2 + e'A_2) - (p_2 + e'A_2)(p_1 + e'A_1)] = (p + e'A)^2 + he'(\sigma, \operatorname{curl} A).$$
Thus (15) becomes
$$0 = \Big[-(p_0 + e'A_0)^2 + (p + e'A)^2 + m^2c^2 + e'h(\sigma, \operatorname{curl} A) - ie'h\rho_1\Big(\sigma,\, \operatorname{grad} A_0 + \frac{1}{c}\frac{\partial A}{\partial t}\Big)\Big]\psi$$
$$= \big[-(p_0 + e'A_0)^2 + (p + e'A)^2 + m^2c^2 + e'h(\sigma, H) + ie'h\rho_1(\sigma, E)\big]\psi,$$
where $E$ and $H$ are the electric and magnetic vectors of the field. This differs from (1) by the two extra terms
$$\frac{eh}{c}(\sigma, H) + \frac{ieh}{c}\rho_1(\sigma, E)$$
in $F$. These two terms, when divided by the factor $2m$, can be regarded as the additional potential energy of the electron due to its new degree of freedom. The electron will therefore behave as though it has a magnetic moment $(eh/2mc)\,\sigma$ and an electric moment $(ieh/2mc)\,\rho_1\sigma$. This magnetic moment is just that assumed in the spinning electron model. The electric moment, being a pure imaginary, we should not expect to appear in the model. It is doubtful whether the electric moment has any physical meaning, since the Hamiltonian in (14) that we started from is real, and the imaginary part only appeared when we multiplied it up in an artificial way in order to make it resemble the Hamiltonian of previous theories.
§ 5. The Angular Momentum Integrals for Motion in a Central Field
We shall consider in greater detail the motion of an electron in a central field of force. We put $A = 0$ and $e'A_0 = V(r)$, an arbitrary function of the radius $r$, so that the Hamiltonian in (14) becomes
$$F \equiv p_0 + V + \rho_1(\sigma, p) + \rho_3 mc.$$
We shall determine the periodic solutions of the wave equation $F\psi = 0$, which means that $p_0$ is to be counted as a parameter instead of an operator; it is, in fact, just $1/c$ times the energy level. We shall first find the angular momentum integrals of the motion. The orbital angular momentum $m$ is defined by $m = x \times p$, and satisfies the following "Vertauschungs" relations
$$m_1x_1 - x_1m_1 = 0, \qquad m_1x_2 - x_2m_1 = ihx_3,$$
$$m_1p_1 - p_1m_1 = 0, \qquad m_1p_2 - p_2m_1 = ihp_3,$$
$$m \times m = ihm, \qquad m^2m_1 - m_1m^2 = 0, \tag{17}$$
together with similar relations obtained by permuting the suffixes. Also $m$ commutes with $r$, and with $p_r$, the momentum canonically conjugate to $r$. We have
$$m_1F - Fm_1 = \rho_1\{m_1(\sigma, p) - (\sigma, p)m_1\} = \rho_1(\sigma, m_1p - pm_1) = ih\rho_1(\sigma_2p_3 - \sigma_3p_2),$$
and so
$$mF - Fm = ih\rho_1\,\sigma \times p. \tag{18}$$
Thus $m$ is not a constant of the motion. We have further
$$\sigma_1F - F\sigma_1 = \rho_1\{\sigma_1(\sigma, p) - (\sigma, p)\sigma_1\} = \rho_1(\sigma_1\sigma - \sigma\sigma_1, p) = 2i\rho_1(\sigma_3p_2 - \sigma_2p_3),$$
with the help of (8), and so
$$\sigma F - F\sigma = -2i\rho_1\,\sigma \times p.$$
Hence
$$(m + \tfrac{1}{2}h\sigma)F - F(m + \tfrac{1}{2}h\sigma) = 0.$$
Thus $m + \tfrac{1}{2}h\sigma$ ($= M$ say) is a constant of the motion. We can interpret this result by saying that the electron has a spin angular momentum of $\tfrac{1}{2}h\sigma$, which, added to the orbital angular momentum $m$, gives the total angular momentum $M$, which is a constant of the motion. The Vertauschungs relations (17) all hold when $M$'s are written for the $m$'s. In particular
$$M \times M = ihM \qquad \text{and} \qquad M^2M_3 = M_3M^2.$$
$M_3$ will be an action variable of the system. Since the characteristic values of $m_3$ must be integral multiples of $h$ in order that the wave function may be single-valued, the characteristic values of $M_3$ must be half odd integral multiples of $h$. If we put
$$M^2 = (j^2 - \tfrac{1}{4})h^2, \tag{19}$$
$j$ will be another quantum number, and the characteristic values of $M_3$ will extend from $(j - \tfrac{1}{2})h$ to $(-j + \tfrac{1}{2})h$.* Thus $j$ takes integral values. One easily verifies from (18) that $m^2$ does not commute with $F$, and is thus not a constant of the motion. This makes a difference between the present theory and the previous spinning electron theory, in which $m^2$ is constant, and defines the azimuthal quantum number $k$ by a relation similar to (19). We shall find that our $j$ plays the same part as the $k$ of the previous theory.
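That the spin part $\tfrac{1}{2}h\sigma$ by itself obeys the same Vertauschungs relation as an angular momentum, $S \times S = ihS$, follows from (8) and can be confirmed directly. The sketch below is a minimal illustration in plain Python, not part of the paper; it sets $h = 1$ (an assumption made only to keep the numbers simple) and uses hypothetical helper names.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

H = 1.0  # units with h = 1 (assumption for the illustration)

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Spin angular momentum S = (1/2) h sigma.
S = [scale(0.5 * H, s) for s in SIGMA]

def spin_obeys_vertauschung():
    """Check S_1 S_2 - S_2 S_1 = i h S_3 and its cyclic permutations."""
    for r in range(3):
        s, t = (r + 1) % 3, (r + 2) % 3
        commutator = sub(matmul(S[r], S[s]), matmul(S[s], S[r]))
        if not close(commutator, scale(1j * H, S[t])):
            return False
    return True

print(spin_obeys_vertauschung())
```

Since `S[2]` is diagonal, its characteristic values $\pm\tfrac{1}{2}h$ can be read off directly, which is the origin of the half odd integral values of $M_3$.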
§ 6. The Energy Levels for Motion in a Central Field
We shall now obtain the wave equation as a differential equation in $r$, with the variables that specify the orientation of the whole system removed. We can do this by the use only of elementary non-commutative algebra in the following way. In formula (16) take $B = C = m$. This gives
$$(\sigma, m)^2 = m^2 + i(\sigma, m \times m) = (m + \tfrac{1}{2}h\sigma)^2 - h(\sigma, m) - \tfrac{3}{4}h^2 - h(\sigma, m) = M^2 - 2h(\sigma, m) - \tfrac{3}{4}h^2. \tag{20}$$
Hence
$$\{(\sigma, m) + h\}^2 = M^2 + \tfrac{1}{4}h^2 = j^2h^2.$$
* See Roy. Soc. Proc., A, vol. 111, p. 281 (1926).
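Up to the present we have defined $j$ only through $j^2$, so that we could now, if we liked, take $jh$ equal to $(\sigma, m) + h$. This would not be convenient since we want $j$ to be a constant of the motion while $(\sigma, m) + h$ is not, although its square is. We have, in fact, by another application of (16),
$$(\sigma, m)(\sigma, p) = i(\sigma, m \times p),$$
since $(m, p) = 0$, and similarly
$$(\sigma, p)(\sigma, m) = i(\sigma, p \times m),$$
so that
$$(\sigma, m)(\sigma, p) + (\sigma, p)(\sigma, m) = i\,\Sigma\,\sigma_1(m_2p_3 - m_3p_2 + p_2m_3 - p_3m_2) = i\,\Sigma\,\sigma_1 \cdot 2ihp_1 = -2h(\sigma, p),$$
or
$$\{(\sigma, m) + h\}(\sigma, p) + (\sigma, p)\{(\sigma, m) + h\} = 0.$$
Thus $(\sigma, m) + h$ anticommutes with one of the terms in $F$, namely $\rho_1(\sigma, p)$, and commutes with the other three. Hence $\rho_3\{(\sigma, m) + h\}$ commutes with all four, and is therefore a constant of the motion. But the square of $\rho_3\{(\sigma, m) + h\}$ must also equal $j^2h^2$. We therefore take
$$jh = \rho_3\{(\sigma, m) + h\}. \tag{21}$$
We have, by a further application of (16),
$$(\sigma, x)(\sigma, p) = (x, p) + i(\sigma, m).$$
Now a permissible definition of $p_r$ is $(x, p) = rp_r + ih$, and from (21)
$$(\sigma, m) = \rho_3jh - h.$$
Hence
$$(\sigma, x)(\sigma, p) = rp_r + i\rho_3jh. \tag{22}$$
Introduce the quantity $\varepsilon$ defined by
$$r\varepsilon = \rho_1(\sigma, x). \tag{23}$$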
Since $r$ commutes with $\rho_1$ and with $(\sigma, x)$, it must commute with $\varepsilon$. We thus have
$$r^2\varepsilon^2 = [\rho_1(\sigma, x)]^2 = (\sigma, x)^2 = x^2 = r^2, \qquad \text{or} \qquad \varepsilon^2 = 1.$$
Since there is symmetry between $x$ and $p$ so far as angular momentum is concerned, $\rho_1(\sigma, x)$, like $\rho_1(\sigma, p)$, must commute with $M$ and $j$. Hence $\varepsilon$ commutes with $M$ and $j$. Further, $\varepsilon$ must commute with $p_r$, since we have
$$(\sigma, x)(x, p) - (x, p)(\sigma, x) = ih(\sigma, x),$$
which gives
$$r\varepsilon(rp_r + ih) - (rp_r + ih)r\varepsilon = ihr\varepsilon,$$
which reduces to
$$\varepsilon p_r - p_r\varepsilon = 0.$$
From (22) and (23) we now have
$$r\varepsilon\rho_1(\sigma, p) = rp_r + i\rho_3jh, \qquad \text{or} \qquad \rho_1(\sigma, p) = \varepsilon p_r + i\varepsilon\rho_3jh/r.$$
Thus
$$F = p_0 + V + \varepsilon p_r + i\varepsilon\rho_3jh/r + \rho_3mc. \tag{24}$$
Equation (23) shows that $\varepsilon$ anticommutes with $\rho_3$. We can therefore by a canonical transformation (involving perhaps the $x$'s and $p$'s as well as the $\sigma$'s and $\rho$'s) bring $\varepsilon$ into the form of the $\rho_2$ of § 2 without changing $\rho_3$, and without changing any of the other variables occurring on the right-hand side of (24), since these other variables all commute with $\varepsilon$. $i\varepsilon\rho_3$ will now be of the form $i\rho_2\rho_3 = -\rho_1$, so that the wave equation takes the form
$$F\psi \equiv [p_0 + V + \rho_2p_r - \rho_1jh/r + \rho_3mc]\psi = 0.$$
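If we write this equation out in full, calling the components of $\psi$ referring to the first and third rows (or columns) of the matrices $\psi_\alpha$ and $\psi_\beta$ respectively, we get
$$(F\psi)_\alpha \equiv (p_0 + V)\psi_\alpha - h\frac{\partial\psi_\beta}{\partial r} - \frac{jh}{r}\psi_\beta + mc\,\psi_\alpha = 0,$$
$$(F\psi)_\beta \equiv (p_0 + V)\psi_\beta + h\frac{\partial\psi_\alpha}{\partial r} - \frac{jh}{r}\psi_\alpha - mc\,\psi_\beta = 0.$$
The second and fourth components give just a repetition of these two equations. We shall now eliminate $\psi_\alpha$. If we write $hB$ for $p_0 + V + mc$, the first equation becomes
$$\Big(\frac{\partial}{\partial r} + \frac{j}{r}\Big)\psi_\beta = B\psi_\alpha,$$
which gives on differentiating
$$\frac{\partial^2\psi_\beta}{\partial r^2} + \frac{j}{r}\frac{\partial\psi_\beta}{\partial r} - \frac{j}{r^2}\psi_\beta = B\frac{\partial\psi_\alpha}{\partial r} + \frac{\partial B}{\partial r}\psi_\alpha$$
$$= \frac{B}{h}\Big[-(p_0 + V - mc)\psi_\beta + \frac{jh}{r}\psi_\alpha\Big] + \frac{1}{h}\frac{\partial V}{\partial r}\psi_\alpha$$
$$= -\frac{(p_0 + V)^2 - m^2c^2}{h^2}\psi_\beta + \Big(\frac{j}{r} + \frac{1}{Bh}\frac{\partial V}{\partial r}\Big)\Big(\frac{\partial}{\partial r} + \frac{j}{r}\Big)\psi_\beta.$$
This reduces to
$$\frac{\partial^2\psi_\beta}{\partial r^2} + \Big[\frac{(p_0 + V)^2 - m^2c^2}{h^2} - \frac{j(j+1)}{r^2}\Big]\psi_\beta - \frac{1}{Bh}\frac{\partial V}{\partial r}\Big(\frac{\partial}{\partial r} + \frac{j}{r}\Big)\psi_\beta = 0. \tag{25}$$
The values of the parameter $p_0$ for which this equation has a solution finite at $r = 0$ and $r = \infty$ are $1/c$ times the energy levels of the system. To compare this equation with those of previous theories, we put $\psi_\beta = r\chi$, so that
$$\frac{\partial^2\chi}{\partial r^2} + \frac{2}{r}\frac{\partial\chi}{\partial r} + \Big[\frac{(p_0 + V)^2 - m^2c^2}{h^2} - \frac{j(j+1)}{r^2}\Big]\chi - \frac{1}{Bh}\frac{\partial V}{\partial r}\Big(\frac{\partial}{\partial r} + \frac{j+1}{r}\Big)\chi = 0. \tag{26}$$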
If one neglects the last term, which is small on account of $B$ being large, this equation becomes the same as the ordinary Schroedinger equation for the system, with relativity correction included. Since $j$ has, from its definition, both positive and negative integral characteristic values, our equation will give twice as many energy levels when the last term is not neglected. We shall now compare the last term of (26), which is of the same order of magnitude as the relativity correction, with the spin correction given by Darwin and Pauli. To do this we must eliminate the $\partial\chi/\partial r$ term by a further transformation of the wave function. We put $\chi = B^{1/2}\chi_1$, which gives
$$\frac{\partial^2\chi_1}{\partial r^2} + \frac{2}{r}\frac{\partial\chi_1}{\partial r} + \Big[\frac{(p_0 + V)^2 - m^2c^2}{h^2} - \frac{j(j+1)}{r^2}\Big]\chi_1 - \Big[\frac{1}{Bh}\frac{j}{r}\frac{\partial V}{\partial r} - \frac{1}{2}\frac{1}{Bh}\frac{\partial^2V}{\partial r^2} + \frac{3}{4}\frac{1}{B^2h^2}\Big(\frac{\partial V}{\partial r}\Big)^2\Big]\chi_1 = 0. \tag{27}$$
The correction is now, to the first order of accuracy,
$$\frac{1}{Bh}\Big(\frac{j}{r}\frac{\partial V}{\partial r} - \frac{1}{2}\frac{\partial^2V}{\partial r^2}\Big),$$
where $Bh = 2mc$ (provided $p_0$ is positive). For the hydrogen atom we must put $V = e^2/cr$. The first order correction now becomes
$$-\frac{e^2}{2mc^2r^3}(j + 1). \tag{28}$$
If we write $-j$ for $j + 1$ in (27), we do not alter the terms representing the unperturbed system, so
$$\frac{e^2}{2mc^2r^3}\,j \tag{28'}$$
will give a second possible correction for the same unperturbed term. In the theory of Pauli and Darwin, the corresponding correcting term is
$$\frac{e^2}{2mhc^2r^3}(\sigma, m)$$
when the Thomas factor $\tfrac{1}{2}$ is included. We must remember that in the Pauli-Darwin theory, the resultant orbital angular momentum $k$ plays the part of our $j$. We must define $k$ by
$$m^2 = k(k + 1)h^2$$
instead of by the exact analogue of (19), in order that it may have integral characteristic values, like $j$. We have from (20)
$$(\sigma, m)^2 = k(k + 1)h^2 - h(\sigma, m), \qquad \text{or} \qquad \{(\sigma, m) + \tfrac{1}{2}h\}^2 = (k + \tfrac{1}{2})^2h^2,$$
hence
$$(\sigma, m) = kh \qquad \text{or} \qquad -(k + 1)h.$$
The correction thus becomes
$$\frac{e^2}{2mc^2r^3}\,k \qquad \text{or} \qquad -\frac{e^2}{2mc^2r^3}(k + 1),$$
which agrees with (28) and (28'). The present theory will thus, in the first approximation, lead to the same energy levels as those obtained by Darwin, which are in agreement with experiment.
The Radiation Theories of Tomonaga, Schwinger, and Feynman
F. J. DYSON
Institute for Advanced Study, Princeton, New Jersey
(Received October 6, 1948) A unified development of the subject of quantum electrodynamics is outlined, embodying the main features both of the Tomonaga-Schwinger and of the Feynman radiation theory. The theory is carried to a point further than that reached by these authors, in the discussion of higher order radiative reactions and vacuum polarization phenomena. However, the theory of these higher order processes is a program rather than a definitive theory, since no general proof of the convergence of these effects is attempted. The chief results obtained are (a) a demonstration of the equivalence of the Feynman and Schwinger theories, and (b) a considerable simplification of the procedure involved in applying the Schwinger theory to particular problems, the simplification being the greater the more complicated the problem. I. INTRODUCTION
As a result of the recent and independent discoveries of Tomonaga,1 Schwinger,2 and Feynman,3 the subject of quantum electrodynamics has made two very notable advances. On the one hand, both the foundations and the applications of the theory have been simplified by being presented in a completely relativistic way; on the other, the divergence difficulties have been at least partially overcome. In the reports so far published, emphasis has naturally been placed on the second of these advances; the magnitude of the first has been somewhat obscured by the fact that the new methods have been applied to problems which were beyond the range of the older theories, so that the simplicity of the methods was hidden by the complexity of the problems. Furthermore, the theory of Feynman differs so profoundly in its formulation from that of Tomonaga and Schwinger, and so little of it has been published, that its particular advantages have not hitherto been available to users of the other formulations. The advantages of the Feynman theory are simplicity and ease of application, while those of Tomonaga-Schwinger are generality and theoretical completeness.
The present paper aims to show how the Schwinger theory can be applied to specific problems in such a way as to incorporate the ideas of Feynman. To make the paper reasonably self-contained it is necessary to outline the foundations of the theory, following the method of Tomonaga; but this paper is not intended as a substitute for the complete account of the theory shortly to be published by Schwinger. Here the emphasis will be on the application of the theory, and the major theoretical problems of gauge-invariance and of the divergencies will not be considered in detail. The main results of the paper will be general formulas from which the radiative reactions on the motions of electrons can be calculated, treating the radiation interaction as a small perturbation, to any desired order of approximation. These formulas will be expressed in Schwinger's notation, but are in substance identical with results given previously by Feynman. The contribution of the present paper is thus intended to be twofold: first, to simplify the Schwinger theory for the benefit of those using it for calculations, and second, to demonstrate the equivalence of the various theories within their common domain of applicability.*
1 Sin-itiro Tomonaga, Prog. Theoret. Phys. 1, 27 (1946); Koba, Tati, and Tomonaga, Prog. Theoret. Phys. 2, 101, 198 (1947); S. Kanesawa and S. Tomonaga, Prog. Theoret. Phys. 3, 1, 101 (1948); S. Tomonaga, Phys. Rev. 74, 224 (1948).
2 Julian Schwinger, Phys. Rev. 73, 416 (1948); Phys. Rev. 74, 1439 (1948). Several papers, giving a complete exposition of the theory, are in course of publication.
3 R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948); Phys. Rev. 74, 939, 1430 (1948); J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys. 17, 157 (1945). These articles describe early stages in the development of Feynman's theory, little of which is yet published.
* After this paper was written, the author was shown a letter, published in Progress of Theoretical Physics 3, 20^ (1948) by Z. Koba and G. Takeda. The letter is dated May 22, 1948, and briefly describes a method of treatment of radiative problems, similar to the method of this paper.
II. OUTLINE OF THEORETICAL FOUNDATIONS
Relativistic quantum mechanics is a special case of non-relativistic quantum mechanics, and it is convenient to use the usual non-relativistic terminology in order to make clear the relation between the mathematical theory and the results of physical measurements. In quantum electrodynamics the dynamical variables are the electromagnetic potentials $A_\mu(\mathbf{r})$ and the spinor electron-positron field $\psi_\alpha(\mathbf{r})$; each component of each field at each point $\mathbf{r}$ of space is a separate variable. Each dynamical variable is, in the Schrödinger representation of quantum mechanics, a time-independent operator operating on the state vector $\Phi$ of the system. The nature of $\Phi$ (wave function or abstract vector) need not be specified; its essential property is that, given the $\Phi$ of a system at a particular time, the results of all measurements made on the system at that time are statistically determined. The variation of $\Phi$ with time is given by the Schrödinger equation
$$i\hbar[\partial/\partial t]\Phi = \Big\{\int H(\mathbf{r})\,d\tau\Big\}\Phi, \tag{1}$$
where $H(\mathbf{r})$ is the operator representing the total energy-density of the system at the point $\mathbf{r}$. The general solution of (1) is
$$\Phi(t) = \exp\Big\{[-it/\hbar]\int H(\mathbf{r})\,d\tau\Big\}\Phi_0, \tag{2}$$
with $\Phi_0$ any constant state vector. Now in a relativistic system, the most general kind of measurement is not the simultaneous measurement of field quantities at different points of space. It is also possible to measure independently field quantities at different points of space at different times, provided that the points of space-time at which the measurements are made lie outside each other's light cones, so that the measurements do not interfere with each other. Thus the most comprehensive general type of measurement is a measurement of field quantities at each point $\mathbf{r}$ of space at a time $t(\mathbf{r})$, the locus of the points $(\mathbf{r}, t(\mathbf{r}))$ in space-time forming a 3-dimensional surface $\sigma$ which is space-like (i.e., every pair of points on it is separated by a space-like interval). Such a measurement will be called "an observation of the system on
(Footnote *, continued.) Results of the application of the method to a calculation of the second-order radiative correction to the Klein-Nishina formula are stated. All the papers of Professor Tomonaga and his associates which have yet been published were completed before the end of 1946. The isolation of these Japanese workers has undoubtedly constituted a serious loss to theoretical physics.
-vmj t(r)H(T)dT\*o,
(3)
which differs from
field quantities at any given point of space are independent of time." This statement is plainly non-relativistic, and so (4) is, in spite of appearances, a non-relativistic equation. The simplest way to introduce a new state vector $\Psi$ which shall be a relativistic invariant is to require that the statement "a system has a constant state vector $\Psi$" shall mean "a system consists of photons, electrons, and positrons, traveling freely through space without interaction or external disturbance." For this purpose, let
$$H(\mathbf{r}) = H_0(\mathbf{r}) + H_1(\mathbf{r}), \tag{5}$$
where $H_0$ is the energy-density of the free electromagnetic and electron fields, and $H_1$ is that of their interaction with each other and with any external disturbing forces that may be present. A system with constant $\Psi$ is, then, one whose $H_1$ is identically zero; by (3) such a system corresponds to a $\Phi$ of the form
$$\Phi(\sigma) = T(\sigma)\Phi_0, \qquad T(\sigma) = \exp\Big\{-[i/\hbar]\int t(\mathbf{r})H_0(\mathbf{r})\,d\tau\Big\}. \tag{6}$$
It is therefore consistent to write generally
$$\Phi(\sigma) = T(\sigma)\Psi(\sigma), \tag{7}$$
thus defining the new state vector $\Psi$ of any system in terms of the old $\Phi$. The differential equation satisfied by $\Psi$ is obtained from (4), (5), (6), and (7) in the form
;fc[a*/a
(8)
Now if g(r) is any time-independent field operator, the operator
where $H_1(x_0)$ is the time-dependent form of the energy-density of interaction of the two fields with each other and with external forces. The left side of (9) represents the degree of departure of the system from a system of freely traveling particles and is a relativistic invariant; $H_1(x_0)$ is also an invariant, and thus is avoided one of the most unsatisfactory features of the old theories, in which the invariant $H_1$ was added to the non-invariant $H_0$. Equation (9) is the starting point of the Tomonaga-Schwinger theory.
III. INTRODUCTION OF PERTURBATION THEORY
Equation (9) can be solved explicitly. For this purpose it is convenient to introduce a oneparameter family of space-like surfaces filling the whole of space-time, so that one and only one member
I
Hi(x)dx
is denoted the integral of Hi(x) over the 4dimensional volume between the surfaces
f
Hi{x)dx
are denoted integrals over the whole volume to the past of no and to the future of
Hi(x)dx\
g(xo) = (r(
(9)
and the general solution of (9) is
*(
(12)
Expanding the product (10) in ascending powers of $H_1$ gives a series
$$U = 1 + (-i/\hbar c)\int_{-\infty}^{\infty} H_1(x_1)\,dx_1 + (-i/\hbar c)^2\int_{-\infty}^{\infty} dx_1\int_{-\infty}^{\sigma(x_1)} H_1(x_1)H_1(x_2)\,dx_2 + \cdots. \tag{13}$$
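The relation between an ordered product of infinitesimal factors and its expansion in powers of the interaction can be seen in a finite-dimensional toy model. The sketch below is not part of the paper: it replaces the field-theoretic energy density by a hypothetical Hermitian 2x2 matrix `interaction(t)` depending on a single time variable, sets $\hbar c = 1$, and compares the ordered product of factors $1 - iH\,dt$ (later times standing to the left) with the series truncated after the second-order term. All names and numerical values are assumptions chosen for illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]

def interaction(t, g=0.01):
    """Hypothetical Hermitian 2x2 interaction; values at different times do not commute."""
    return [[g * math.cos(t), g * (1 - 1j * t)],
            [g * (1 + 1j * t), -g * math.cos(t)]]

def ordered_product(n, total_time=1.0):
    """Product of (1 - i H(t_k) dt), with later times multiplied on the left."""
    dt = total_time / n
    u = I2
    for k in range(n):
        step = add(I2, scale(-1j * dt, interaction((k + 0.5) * dt)))
        u = matmul(step, u)
    return u

def series_to_second_order(n, total_time=1.0):
    """1 + (-i) sum H dt + (-i)^2 sum_{t1 > t2} H(t1) H(t2) dt^2 on the same grid."""
    dt = total_time / n
    first, second, earlier = ZERO2, ZERO2, ZERO2
    for k in range(n):
        h = interaction((k + 0.5) * dt)
        second = add(second, matmul(h, earlier))  # pairs with all earlier times
        first = add(first, h)
        earlier = add(earlier, h)
    return add(add(I2, scale(-1j * dt, first)), scale(-dt * dt, second))

def max_difference(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

u = ordered_product(200)
u2 = series_to_second_order(200)
print(max_difference(u, u2))
```

For a weak interaction the two results differ only by terms of third order in the coupling, which is the sense in which a series of this kind approximates the full product.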
Further, U is by (10) obviously unitary, and
iy-i= f? = l + (i/hc) j
H&Jdx^V/hc)*
small perturbation as was done in the last section. Instead, H* alone is treated as a perturbation, the aim being to eliminate H' but to leave H' in its original place in the equation of motion of the system. Operators S(
Hl(x2)H1(xl)dxi+•
• •. (14)
It is not difficult to verify that U is a function of
(17)
Suppose now a new type of state vector fl(o-) to be introduced by the substitution ¥(
dxA
(18)
By (9), (15), (17), and (18) the equation of motion for fl(o-) is *ac[dG/3
(19)
The elimination of the radiation interaction is hereby achieved; only the question, "How is the new state vector fl(cr) to be interpreted?," remains. It is clear from (19) that a system with a constant J2 is a system of electrons, positrons, and photons, moving under the influence of their mutual interactions, but in the absence of external fields. In a system where two or more particles are actually present, their interactions alone will, in general, cause real transitions and scattering processes to occur. For such a system, it is rather "unphysical" to represent a state of motion including the effects of the interactions by a constant state vector; hence, for such a system the new representation has no simple interpretation. However, the most important systems are those in which only one particle is actually present, and its interaction rv. ELIMINATION OF THE RADIATION with the vacuum fields gives rise only to virtual INTERACTION processes. In this case the particle, including the In most of the problem of electrodynamics, the effects of all its interactions with the vacuum, energy-density Hi(x<>) divides into two parts— appears to move as a free particle in the absence of external fields, and it is eminently reasonable #,(*„) =-ff<(*o)+ff'(*o), (15) to represent such a state of motion by a constant ir , (*o) = -Cl/c]j,(*o)4 1 ,(*o), (16) state vector. Therefore, it may be said that the operator, the first part being the energy of interaction of HT(xo) = (5(
single physical particle deviates, in the external field, from the motion represented by a constant state-vector, i.e., from the motion of an observed "free" particle. If the system whose state vector is constantly 12 undergoes no real transitions with the passage of time, then the state vector 12 is called "steady." More precisely, 12 is steady if, and only if, it satisfies the equation
mass, and should have been used instead of ilo(r) in the definition (6) of T(o-). Consequently, the second bracket should have been used instead of Hi(t) in Eq. (8). The definition of S(
5(oc)n = n.
The value of 5m can be adjusted so as to cancel out the self-energy effects in S( =o) (this is only a formal adjustment since the value is actually infinite), and then Eq. (21) will be valid for one-electron states. For the photon self-energy no such adjustment is needed since, as proved by Schwinger, the photon self-energy turns out to be identically zero. The foregoing discussion of the self-energy problem is intentionally only a sketch, but it will be found to be sufficient for practical applications of the theory. A fuller discussion of the theoretical assumptions underlying this treatment of the problem will be given by Schwinger in his forthcoming papers. Moreover, it must be realized that the theory as a whole cannot be put into a finally satisfactory form so long as divergencies occur in it, however skilfully these divergencies are circumvented; therefore, the present treatment should be regarded as justified by its success in applications rather than by its theoretical derivation. The important results of the present paper up to this point are Eq. (19) and the interpretation of the state vector 12. The state vector * of a system can be interpreted as a wave function giving the probability amplitude of finding any particular set of occupation numbers for the various possible states of free electrons, positrons, and photons. The state vector 12 of a system with a given t o n a given surface <J is, crudely speaking, the ^ which the system would have had in the infinite past if it had arrived at the given ¥ on a under the influence of the interaction H'(xo) alone. The definition of 12 being unsymmetrical between past and future, a new type of state vector 12' can be defined by reversing the direction of time in the definition of 12. Thus the 12' of a system with a given * on a given a is the *
(21)
As a general rule, one-particle states are steady and many-particle states unsteady. There are, however, two important qualifications to this rule. First, the interaction (20) itself will almost always cause transitions from steady to unsteady states. For example, if the initial state consists of one electron in the field of a proton, IIT will have matrix elements for transitions of the electron to a new state with emission of a photon, and such transitions are important in practice. Therefore, although the interpretation of the theory is simpler for steady states, it is not possible to exclude unsteady states from consideration. Second, if a one-particle state as hitherto defined is to be steady, the definition of 5(
SmcWMW) + (H,(r)-ta^'(i)j3f(r)).
The first bracket on the right here represents the energy-density of the free electromagnetic and electron fields with the observed electron rest-
H'(x0) = H'(.ro) +HS(xa) = ff'(x0) -8MC^(I,)*W.
' Here Schwinger's notation ^ = ^-*j3 is used.
(22)
which the system would reach in the infinite future if it continued to move under the influence of H'(x0) alone. More simply, Ω' can be defined by the equation Ω'(
(23)
Since S(°o) is a unitary operator independent of
The Schwinger theory works directly from Eqs. (19) and (20), the aim being to calculate the matrix elements of the "effective external potential energy" HT between states specified by their state vectors ft. The states considered in practice always have ft of some very simple kind, for example, ft representing systems in which one or two free-particle states have occupation number one and the remaining free-particle states have occupation number zero. By analogy with (13), S{<m) is given by S(
satisfactory agreement with experimental results. In this paper the development of the Schwinger theory will be carried no further; in principle the radiative corrections to the equations of motion of electrons could be calculated to any desired order of approximation from formula (25). In the Feynman theory the basic principle is to preserve symmetry between past and future. Therefore, the matrix elements of the operator HT are evaluated in a "mixed representation;" the matrix elements are calculated between an initial state specified by its state vector Qi and a final state specified by its state vector ft/. The matrix element of HT between two such states in the Schwinger representation is ft2*ifrft1 = Q2'*5(«>)ii'rfti,
(26)
and therefore the operator which replaces $H_T$ in the mixed representation is
$$H_F(x_0) = S(\infty) H_T(x_0) = S(\infty)(S(\sigma))^{-1} H^e(x_0) S(\sigma). \qquad (27)$$
Going back to the original product definition of $S(\sigma)$ analogous to (10), it is clear that $S(\infty) \times (S(\sigma))^{-1}$ is simply the operator obtained from $S(\sigma)$ by interchanging past and future. Thus, R(σ) = S(∞)(S(
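$$\cdots \times \int_{-\infty}^{\infty} H^I(x_1)\,dx_1 + (-i/\hbar c)^2 \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\sigma(x_1)} H^I(x_1) H^I(x_2)\,dx_2 + \cdots, \qquad (24)$$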
and (S(
$$\cdots \times \int dx_1 \int dx_2 \cdots \int dx_n \times \big[H^I(x_n), \big[\cdots, \big[H^I(x_2), \big[H^I(x_1), H^e(x_0)\big]\big] \cdots \big]\big]. \qquad (25)$$
The repeated commutators in this formula are characteristic of the Schwinger theory, and their evaluation gives rise to long and rather difficult analysis. Using the first three terms of the series, Schwinger was able to calculate the second-order radiative corrections to the equations of motion of an electron in an external field, and obtained
$$\cdots \times \int H^I(x_2) H^I(x_1)\,dx_2 + \cdots. \qquad (28)$$
The physical meaning of a mixed representation of this type is not at all recondite. In fact, a mixed representation is normally used to describe such a process as bremsstrahlung of an electron in the field of a nucleus when the Born approximation is not valid; the process of bremsstrahlung is a radiative transition of the electron from a state described by a Coulomb wave function, with a plane ingoing and a spherical outgoing wave, to a state described by a Coulomb wave function with a spherical ingoing and a plane outgoing wave. The initial and final states here belong to different orthogonal systems of wave functions, and so the transition matrix elements are calculated in a mixed representation. In the Feynman theory the situation is
322
Lorentz
and Poincare
Invariance
analogous; only the roles of the radiation interaction and the external (or Coulomb) field are interchanged; the radiation interaction is used instead of the Coulomb field to modify the state vectors (wave functions) of the initial and final states, and the external field instead of the radiation interaction causes transitions between these state vectors. In the Feynman theory there is an additional simplification. For if matrix elements are being calculated between two states, either of which is steady (and this includes all cases so far considered), the mixed representation reduces to an ordinary representation. This occurs, for example, in treating a one-particle problem such as the radiative correction to the equations of motion of an electron in an external field; the operator $H_F(x_0)$, although in general it is not even Hermitian, can in this case be considered as an effective external potential energy acting on the particle, in the ordinary sense of the words. This section will be concluded with the derivation of the fundamental formula (31) of the Feynman theory, which is the analog of formula (25) of the Schwinger theory. If $F_1(x_1)$,
$\cdots$, $F_n(x_n)$ are any operators defined, respectively, at the points $x_1, \cdots, x_n$ of space-time, then
$$P(F_1(x_1), \cdots, F_n(x_n)) \qquad (29)$$
will denote the product of these operators, taken in the order, reading from right to left, in which the surfaces
$H^I(x_n))$. Since the integrand is a symmetrical function of the points $x_1, \cdots, x_n$, the value of the integral is just n! times the integral obtained by restricting the integration to sets of points $x_1, \cdots, x_n$ for which $\sigma(x_i)$ occurs after $\sigma(x_{i+1})$ for each i.
The restricted integral can then be further divided into (n+1) parts, the j'th part being the integral over those sets of points with the property that $\sigma(x_0)$ lies between
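$$J_n = n! \sum_{j=1}^{n+1} \int_{-\infty}^{\sigma(x_0)} dx_j \cdots \int_{-\infty}^{\sigma(x_{n-1})} dx_n \times \int_{\sigma(x_0)}^{\infty} dx_{j-1} \cdots \int_{\sigma(x_2)}^{\infty} dx_1 \times H^I(x_1) \cdots H^I(x_{j-1})\, H^e(x_0)\, H^I(x_j) \cdots H^I(x_n). \qquad (30)$$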
Now if the series (24) and (28) are substituted into (27), sums of integrals appear which are precisely of the form (30). Hence finally
$$H_F(x_0) = \sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!]\, J_n = \sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n \times P(H^e(x_0), H^I(x_1), \cdots, H^I(x_n)). \qquad (31)$$
By this formula the notation $H_F(x_0)$ is justified, for this operator now appears as a function of the point $x_0$ alone and not of the surface σ. The further development of the Feynman theory is mainly concerned with the calculation of matrix elements of (31) between various initial and final states. As a special case of (31) obtained by replacing $H^e$ by the unit matrix in (27),
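$$S(\infty) = \sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_n \times P(H^I(x_1), \cdots, H^I(x_n)). \qquad (32)$$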
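The counting step behind (30), and hence the factor [1/n!] in (31) and (32), is that a symmetrical integrand integrated over all orderings of n points gives n! times the integral over one fixed ordering. The following is a minimal numerical sketch of that statement only. It replaces the operators by an ordinary symmetric function and space-time by the interval [0, 1], both of which are assumptions made for illustration; the names `symmetric_integrand`, `full_sum` and `ordered_sum` are hypothetical.

```python
import itertools
import math


def symmetric_integrand(points):
    """Toy stand-in for a symmetrical integrand: invariant under any
    permutation of its arguments (it is NOT an operator product)."""
    return math.cos(sum(points)) * math.prod(1.0 + x for x in points)


def grid(m):
    """Midpoints of m equal cells on [0, 1]."""
    return [(i + 0.5) / m for i in range(m)]


def full_sum(n, m):
    """Riemann sum over all n-tuples of pairwise distinct grid points."""
    total = 0.0
    for tup in itertools.permutations(grid(m), n):
        total += symmetric_integrand(tup)
    return total / m**n


def ordered_sum(n, m):
    """Riemann sum restricted to the single ordering x_1 > x_2 > ... > x_n."""
    total = 0.0
    for tup in itertools.combinations(grid(m), n):  # increasing tuples
        total += symmetric_integrand(tup[::-1])     # reversed: decreasing
    return total / m**n


if __name__ == "__main__":
    n, m = 3, 10
    print(full_sum(n, m), math.factorial(n) * ordered_sum(n, m))
```

The two printed numbers agree up to rounding, because every unordered set of n distinct points is visited n! times by the unrestricted sum and once by the ordered one. Coincident points are left out of both sums; in the continuum they form a set of measure zero.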
VI. CALCULATION OF MATRIX ELEMENTS
In this section the application of the foregoing theory to a general class of problems will be explained. The ultimate aim is to obtain a set of rules by which the matrix element of the operator (31) between two given states may be written down in a form suitable for numerical evaluation, immediately and automatically. The fact that such a set of rules exists is the basis of the Feynman radiation theory; the derivation in this section of the same rules from what is
fundamentally the Tomonaga-Schwinger theory constitutes the proof of equivalence of the two theories. To avoid excessive complication, the type of matrix element considered will be restricted in two ways. First, it will be assumed that the external potential energy is
$$H^e(x_0) = -[1/c]\, j_\mu(x_0) A_\mu^e(x_0), \qquad (33)$$
that is to say, the interaction energy of the electron-positron field with electromagnetic potentials $A_\mu^e(x_0)$ which are given numerical functions of space and time. Second, matrix elements will be considered only for transitions from a state A, in which just one electron and no positron or photon is present, to another state B of the same character. These restrictions are not essential to the theory, and are introduced only for convenience, in order to illustrate clearly the principles involved. The electron-positron field operator may be written
$$\psi_\alpha(x) = \sum_u \phi_{u\alpha}(x)\, a_u, \qquad (34)$$
where the $\phi_{u\alpha}(x)$ are spinor wave functions of free electrons and positrons, and the $a_u$ are annihilation operators of electrons and creation operators of positrons. Similarly, the adjoint operator
$$\bar\psi_\alpha(x) = \sum_u \bar\phi_{u\alpha}(x)\, \bar a_u, \qquad (35)$$
where $\bar a_u$ are annihilation operators of positrons and creation operators of electrons. The electromagnetic field operator is
$$A_\mu(x) = \sum_v \big(A_{v\mu}(x)\, b_v + A_{v\mu}^*(x)\, \bar b_v\big), \qquad (36)$$
where $b_v$ and $\bar b_v$ are photon annihilation and creation operators, respectively. The charge-current 4-vector of the electron field is
$$j_\mu(x) = iec\, \bar\psi(x)\gamma_\mu\psi(x); \qquad (37)$$
strictly speaking, this expression ought to be antisymmetrized to the form7
$$j_\mu(x) = \tfrac{1}{2}\, iec\, \{\bar\psi_\alpha(x)\psi_\beta(x) - \psi_\beta(x)\bar\psi_\alpha(x)\}\, (\gamma_\mu)_{\alpha\beta}, \qquad (38)$$
but it will be seen later that this is not necessary in the present theory. Consider the product P occurring in the n'th integral of (31); let it be denoted by $P_n$. From (16), (22), (33), and (37) it is seen that $P_n$ is a sum of products of (n+1) operators $\bar\psi_\alpha$, (n+1) operators $\psi_\alpha$, and not more than n operators $A_\mu$, multiplied by various numerical factors. By $Q_n$ may be denoted a typical product of factors $\bar\psi_\alpha$, $\psi_\alpha$, and $A_\mu$, not summed over the indices such as α and μ, so that $P_n$ is a sum of terms such as $Q_n$. Then $Q_n$ will be of the form (indices omitted)
$$Q_n = \bar\psi(x_{i_0})\psi(x_{i_0})\,\bar\psi(x_{i_1})\psi(x_{i_1}) \cdots$$
7 See Wolfgang Pauli, Rev. Mod. Phys. 13, 203 (1941), Eq. (96), p. 224.
be made for each possible division into pairs and the results added together. It follows from the above considerations that the matrix element of $Q_n$ for the transition A → B is a sum of contributions, each contribution arising from a specific way of dividing the factors of $Q_n$ into two single factors and pairs. A typical contribution of this kind will be denoted by M. The two factors of a pair must involve a creation and an annihilation operator for the same particle, and so must be either one $\bar\psi$ and one $\psi$ or two A; the two single factors must be one $\bar\psi$ and one $\psi$. The term M is thus specified by fixing an integer k, and a permutation $r_0, r_1, \cdots, r_n$ of the integers 0, 1, ⋯, n, and a division $(s_1, t_1), (s_2, t_2), \cdots, (s_h, t_h)$ of the integers $j_1, \cdots, j_m$ into pairs; clearly m = 2h has to be an even number; the term M is obtained by choosing for single factors
$$\cdots \times P(A(x_{s_1}), A(x_{t_1})) \cdots P(A(x_{s_h}), A(x_{t_h})), \qquad (40)$$
a factor ε being inserted which takes the value ±1 according to whether the permutation of $\bar\psi$ and $\psi$ factors between (39) and (40) is even or odd. Then in (40) each product of two associated factors (but not the two single factors) is to be independently replaced by the sum of its matrix elements for processes involving the successive creation and annihilation of the same particle. Given a bilinear operator such as $A_\mu(x)A_\nu(y)$, the sum of its matrix elements for processes involving the successive creation and annihilation of the same particle is just what is usually called the "vacuum expectation value" of the operator, and has been calculated by Schwinger. This quantity is, in fact (note that Heaviside units are being used)
$$\langle A_\mu(x)A_\nu(y)\rangle_0 = \tfrac{1}{2}\hbar c\, \delta_{\mu\nu}\{D^{(1)} + iD\}(x-y),$$
where $D^{(1)}$ and D are Schwinger's invariant D functions. The definitions of these functions will not be given here, because it turns out that the vacuum expectation value of $P(A_\mu(x), A_\nu(y))$ takes an even simpler form. Namely,
$$\langle P(A_\mu(x), A_\nu(y))\rangle_0 = \tfrac{1}{2}\hbar c\, \delta_{\mu\nu} D_F(x-y), \qquad (41)$$
where $D_F$ is the type of D function introduced by Feynman. $D_F(x)$ is an even function of x, with the integral expansion
$$D_F(x) = -[i/2\pi^2] \int_0^{\infty} \exp[i\alpha x^2]\, d\alpha, \qquad (42)$$
where $x^2$ denotes the square of the invariant length of the 4-vector x. In a similar way it follows from Schwinger's results that
$$\langle P(\bar\psi_\alpha(x), \psi_\beta(y))\rangle_0 = \tfrac{1}{2}\eta(x,y)\, S_{F\beta\alpha}(x-y), \qquad (43)$$
where
$$S_{F\beta\alpha}(x) = -(\gamma_\mu(\partial/\partial x_\mu) + \kappa_0)_{\beta\alpha}\, \Delta_F(x), \qquad (44)$$
$\kappa_0$ is the reciprocal Compton wave-length of the electron, $\eta(x,y)$ is −1 or +1 according as $\sigma(x)$ is earlier or later than $\sigma(y)$, and
$$\Delta_F(x) = -[i/2\pi^2] \int_0^{\infty} \exp[i\alpha x^2 - i\kappa_0^2/4\alpha]\, d\alpha. \qquad (45)$$
Substituting from (41) and (44) into (40), the matrix element M takes the form (still omitting the indices of the factors $\bar\psi$, $\psi$, and A of $Q_n$)
$$M = \epsilon \prod_{i \neq k} \big(\tfrac{1}{2}\eta(x_i, x_{r_i})\, S_F(x_i - x_{r_i})\big) \times \prod_j \big(\tfrac{1}{2}\hbar c\, D_F(x_{s_j} - x_{t_j})\big)\, P(\bar\psi(x_k), \psi(x_{r_k})). \qquad (46)$$
The single factors $\bar\psi(x_k)$ and $\psi(x_{r_k})$ are conveniently left in the form of operators, since the matrix elements of these operators for effecting the transition A → B depend on the wave functions of the electron in the states A and B. Moreover, the order of the factors $\bar\psi(x_k)$ and $\psi(x_{r_k})$ is immaterial, since they anticommute with each other; the ordered product may therefore be replaced by $\eta(x_k, x_{r_k})\bar\psi(x_k)\psi(x_{r_k})$, and (46) becomes
$$M = \epsilon' \prod_{i \neq k} \big(\tfrac{1}{2} S_F(x_i - x_{r_i})\big) \prod_j \big(\tfrac{1}{2}\hbar c\, D_F(x_{s_j} - x_{t_j})\big) \times \bar\psi(x_k)\psi(x_{r_k}), \qquad (47)$$
with
$$\epsilon' = \epsilon \prod_i \eta(x_i, x_{r_i}). \qquad (48)$$
Now the product in (48) is $(-1)^p$, where p is the number of occasions in the expression (40) on which the $\psi$ of a P bracket occurs to the left of the $\bar\psi$. Referring back to the definition of ε after Eq. (40), it follows that ε' takes the value +1 or −1 according to whether the permutation of $\bar\psi$ and

ing section it will be shown how this solution-in-principle can be reduced to a much simpler and more practical procedure.

VII. GRAPHICAL REPRESENTATION OF MATRIX ELEMENTS

Let an integer n and a product $P_n$ occurring
ment yields the simple formula
$$\epsilon' = (-1)^l. \qquad (51)$$
This formula is one result of the present theory which can be much more easily obtained by intuitive considerations of the sort used by Feynman. In Feynman's theory the graph corresponding to a particular matrix element is regarded, not merely as an aid to calculation, but as a picture of the physical process which gives rise to that matrix element. For example, an electron line joining $x_1$ to $x_2$ represents the possible creation of an electron at $x_1$ and its annihilation at $x_2$, together with the possible creation of a positron at $x_2$ and its annihilation at $x_1$. This interpretation of a graph is obviously consistent with the methods, and in Feynman's hands has been used as the basis for the derivation of most of the results, of the present paper. For reasons of space, these ideas of Feynman will not be discussed in further detail here. To the product $P_n$ correspond a finite number of graphs, one of which may be denoted by G; all possible G can be enumerated without difficulty for moderate values of n. To each G corresponds a contribution C(G) to the matrix element of (31) which is being evaluated. It may happen that the graph G is disconnected, so that it can be divided into subgraphs, each of which is connected, with no line joining a point of one subgraph to a point of another. In such a case it is clear from (47) that C(G) is the product of factors derived from each subgraph separately. The subgraph $G_1$ containing the point $x_0$ is called the "essential part" of G, the remainder $G_2$ the "inessential part." There are now two cases to be considered, according to whether the points $x_k$ and $x_{r_k}$ lie in $G_2$ or in $G_1$ (they must clearly both lie in the same subgraph). In the first case, the factor $C(G_2)$ of C(G) can be seen by a comparison of (31) and (32) to be a contribution to the matrix element of the operator S(∞) for the transition A → B.
Now letting G vary over all possible graphs with the same $G_1$ and different $G_2$, the sum of the contributions of all such G is a constant $C(G_1)$ multiplied by the total matrix element of S(∞) for the transition A → B. But for one-particle states the operator S(∞) is by (21) equivalent to the identity operator and gives, accordingly, a zero matrix element for the transition A → B. Consequently, the disconnected G for which $x_k$ and $x_{r_k}$ lie in $G_2$ give zero contribution to the matrix element of (31), and can be omitted from further consideration. When $x_k$ and $x_{r_k}$ lie in $G_1$, again the C(G) may be summed over all G consisting of the given $G_1$ and all possible $G_2$; but this time the connected graph $G_1$ itself is to be included in the sum. The sum of all the C(G) in this case turns out to be just $C(G_1)$ multiplied by the expectation value in the vacuum of the operator S(∞). But the vacuum state, being a steady state, satisfies (21), and so the expectation value in question is equal to unity. Therefore the sum of the C(G) reduces to the single term $C(G_1)$, and again the disconnected graphs may be omitted from consideration. The elimination of disconnected graphs is, from a physical point of view, somewhat trivial, since these graphs arise merely from the fact that meaningful physical processes proceed simultaneously with totally irrelevant fluctuations of fields in the vacuum. However, similar arguments will now be used to eliminate a much more important class of graphs, namely, those involving self-energy effects. A "self-energy part" of a graph G is defined as follows; it is a set of one or more vertices not including $x_0$, together with the lines joining them, which is connected with the remainder of G (or with the edge of the diagram) only by two electron lines or by one or two photon lines. For definiteness it may be supposed that G has a self-energy part F, which is connected with its surroundings only by one electron line entering F at $x_1$, and another leaving F at $x_2$; the case of photon lines can be treated in an entirely analogous way. The points $x_1$ and $x_2$ may or may not be identical.
From G a "reduced graph" $G_0$ can be obtained by omitting F completely and joining the incoming line at $x_1$ with the outgoing line at $x_2$ to form a single electron line in $G_0$, the newly formed line being denoted by λ. Given $G_0$ and λ, there is conversely a well determined set Γ of graphs G which are associated with $G_0$ and λ in this way; $G_0$ itself is considered also to belong to Γ. It will now be shown that the sum C(Γ) of the contributions C(G) to the matrix element of (31) from all the graphs G of Γ reduces to a single term $C'(G_0)$.
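The division of a disconnected graph into its essential part (the connected subgraph containing $x_0$) and its inessential remainder is an ordinary connected-components computation. The following sketch assumes a graph is given as a list of vertex labels plus a list of lines between vertices, with lines running to the edge of the diagram simply left out; the function names are hypothetical and the representation is chosen only for illustration.

```python
from collections import defaultdict


def connected_parts(vertices, lines):
    """Split a graph into connected subgraphs (as sets of vertices).

    `lines` holds (a, b) pairs for electron or photon lines joining two
    vertices; direction is irrelevant for connectivity.
    """
    adjacency = defaultdict(set)
    for a, b in lines:
        adjacency[a].add(b)
        adjacency[b].add(a)

    parts, seen = [], set()
    for v in vertices:
        if v in seen:
            continue
        part, stack = set(), [v]
        while stack:                      # depth-first walk from v
            u = stack.pop()
            if u in part:
                continue
            part.add(u)
            stack.extend(adjacency[u] - part)
        seen |= part
        parts.append(part)
    return parts


def essential_part(vertices, lines, x0=0):
    """Return the connected subgraph that contains the vertex x0."""
    for part in connected_parts(vertices, lines):
        if x0 in part:
            return part
    raise ValueError("x0 is not a vertex of the graph")


if __name__ == "__main__":
    # Vertices 0, 1, 2 are linked to one another; 3 and 4 form a
    # separate closed loop that never touches x0 = 0.
    lines = [(2, 0), (0, 1), (1, 2), (3, 4), (4, 3)]
    print(essential_part([0, 1, 2, 3, 4], lines))
```

In the language of the text, the returned set plays the role of $G_1$ and every other component belongs to $G_2$.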
Suppose, for example, that the line λ in $G_0$ leads from a point $x_3$ to the edge of the diagram. Then $C(G_0)$ is an integral containing in the integrand the matrix element of
$$\bar\psi_\alpha(x_3) \qquad (52)$$
for creation of an electron into the state B. Let the momentum-energy 4-vector of the created electron be p; the matrix element of (52) is of the form
$$Y_\alpha(x_3) = a_\alpha \exp[-i(p \cdot x_3)/\hbar] \qquad (53)$$
with $a_\alpha$ independent of $x_3$. Now consider the sum C(Γ). It follows from an analysis of (31) that C(Γ) is obtained from $C(G_0)$ by replacing the operator (52) by
$$\sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int_{-\infty}^{\infty} dy_1 \cdots \int_{-\infty}^{\infty} dy_n \times P(\bar\psi_\alpha(x_3), H^I(y_1), \cdots, H^I(y_n)). \qquad (54)$$
(This is, of course, a consequence of the special character of the graphs of Γ.) It is required to calculate the matrix element of (54) for a transition from the vacuum state 0 to the state B, i.e., for the emission of an electron into state B. This matrix element will be denoted by $Z_\alpha$; C(Γ) involves $Z_\alpha$ in the same way that $C(G_0)$ involves (53). Now $Z_\alpha$ can be evaluated as a sum of terms of the same general character as (47); it will be of the form
$$Z_\alpha = \sum_i \int K_i^{\beta\alpha}(y_i - x_3)\, Y_\beta(y_i)\, dy_i,$$
where the important fact is that $K_i$ is a function only of the coordinate differences between $y_i$ and $x_3$. By (53), this implies that
$$Z_\alpha = R_{\beta\alpha}(p)\, Y_\beta(x_3), \qquad (55)$$
with R independent of $x_3$. From considerations of relativistic invariance, R must be of the form
$$\delta_{\beta\alpha} R_1(p^2) + (p_\mu\gamma_\mu)_{\beta\alpha} R_2(p^2),$$
where $p^2$ is the square of the invariant length of the 4-vector p. But since the matrix element (53) is a solution of the Dirac equation,
$$p^2 = -\hbar^2\kappa_0^2, \qquad (p_\mu\gamma_\mu)_{\beta\alpha} Y_\beta = i\hbar\kappa_0 Y_\alpha,$$
and so (55) reduces to
$$Z_\alpha = R_1 Y_\alpha(x_3),$$
with $R_1$ an absolute constant. Therefore the sum C(Γ) is in this case just $C'(G_0)$, where $C'(G_0)$ is obtained from $C(G_0)$ by the replacement
$$\bar\psi(x_3) \to R_1 \bar\psi(x_3). \qquad (56)$$
In the case when the line λ leads into the graph $G_0$ from the edge of the diagram to the point $x_3$, it is clear that C(Γ) will be similarly obtained from $C(G_0)$ by the replacement
$$\psi(x_3) \to R_1^* \psi(x_3). \qquad (57)$$
There remains the case in which λ leads from one vertex $x_3$ to another $x_4$ of $G_0$. In this case $C(G_0)$ contains in its integrand the function
$$\tfrac{1}{2}\eta(x_3, x_4)\, S_{F\beta\alpha}(x_3 - x_4), \qquad (58)$$
which is the vacuum expectation value of the operator
$$P(\bar\psi_\alpha(x_3), \psi_\beta(x_4)) \qquad (59)$$
according to (43). Now in analogy with (54), C(Γ) is obtained from $C(G_0)$ by replacing (59) by
$$\sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int_{-\infty}^{\infty} dy_1 \cdots \int_{-\infty}^{\infty} dy_n \times P(\bar\psi_\alpha(x_3), \psi_\beta(x_4), H^I(y_1), \cdots, H^I(y_n)), \qquad (60)$$
and the vacuum expectation value of this operator will be denoted by
$$\tfrac{1}{2}\eta(x_3, x_4)\, S_{F\beta\alpha}'(x_3 - x_4). \qquad (61)$$
By the methods of Section VI, (61) can be expanded as a series of terms of the same character as (47); this expansion will not be discussed in detail here, but it is easy to see that it leads to an expression of the form (61), with $S_F'(x)$ a certain universal function of the 4-vector x. It will not be possible to reduce (61) to a numerical multiple of (58), as $Z_\alpha$ was in the previous case reduced to a multiple of $Y_\alpha$. Instead, there may be expected to be a series expansion of the form
$$S_{F\beta\alpha}'(x) = \big(R_2 + a_1(\Box^2 - \kappa_0^2) + a_2(\Box^2 - \kappa_0^2)^2 + \cdots\big) S_{F\beta\alpha}(x) + \big(b_1 + b_2(\Box^2 - \kappa_0^2) + \cdots\big) \times (\gamma_\mu[\partial/\partial x_\mu] - \kappa_0)_{\beta\gamma}\, S_{F\gamma\alpha}(x), \qquad (62)$$
where $\Box^2$ is the Dalembertian operator and the a, b are numerical coefficients. In this case C(Γ) will be equal to the $C'(G_0)$ obtained from $C(G_0)$ by the replacement
$$S_F(x_3 - x_4) \to S_F'(x_3 - x_4). \qquad (63)$$
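Returning to the first case, the step from (55) to its reduced form can be spelled out in one line. The sketch below assumes only the invariant form of R and the two consequences of the Dirac equation quoted in the text; the index placement is an assumption about the garbled original.

```latex
\begin{aligned}
Z_\alpha
  &= \bigl[\delta_{\beta\alpha} R_1(p^2)
      + (p_\mu\gamma_\mu)_{\beta\alpha} R_2(p^2)\bigr]\, Y_\beta(x_3) \\
  &= \bigl[R_1(p^2) + i\hbar\kappa_0\, R_2(p^2)\bigr]_{p^2 = -\hbar^2\kappa_0^2}\; Y_\alpha(x_3).
\end{aligned}
```

Because $p^2$ is fixed at $-\hbar^2\kappa_0^2$ for a free electron, the bracket no longer depends on p or on $x_3$. It is presumably this single number that the text goes on to call the absolute constant $R_1$.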
Applying the same methods to a graph G with a self-energy part connected to its surroundings by two photon lines, the sum C(Γ) will be obtained as a single contribution $C'(G_0)$ from the reduced graph $G_0$, $C'(G_0)$ being formed from $C(G_0)$ by the replacement
$$D_F(x_3 - x_4) \to D_F'(x_3 - x_4). \qquad (64)$$
The function $D_F'$ is defined by the condition that
$$\tfrac{1}{2}\hbar c\, \delta_{\mu\nu} D_F'(x_3 - x_4) \qquad (65)$$
is the vacuum expectation value of the operator
$$\sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int_{-\infty}^{\infty} dy_1 \cdots \int_{-\infty}^{\infty} dy_n \times P(A_\mu(x_3), A_\nu(x_4), H^I(y_1), \cdots, H^I(y_n)), \qquad (66)$$
and may be expanded in a series
$$D_F'(x) = \big(R_3 + c_1\Box^2 + c_2(\Box^2)^2 + \cdots\big) D_F(x). \qquad (67)$$
Finally, it is not difficult to see that for graphs G with self-energy parts connected to their surroundings by a single photon line, the sum C(Γ) will be identically zero, and so such graphs may be omitted from consideration entirely. As a result of the foregoing arguments, the contributions C(G) of graphs with self-energy parts can always be replaced by modified contributions $C'(G_0)$ from a reduced graph $G_0$. A given G may be reducible in more than one way to give various $G_0$, but if the process of reduction is repeated a finite number of times a $G_0$ will be obtained which is "totally reduced," contains no self-energy part, and is uniquely determined by G. The contribution $C'(G_0)$ of a totally reduced graph to the matrix element of (31) is now to be calculated as a sum of integrals of expressions like (47), but with a replacement (56), (57), (63), or (64) made corresponding to every line in $G_0$. This having been done, the matrix element of (31) is correctly calculated by taking into consideration each totally reduced graph once and once only. The elimination of graphs with self-energy parts is a most important simplification of the theory. For according to (22), $H^I$ contains the subtracted part $H^S$, which will give rise to many additional terms in the expansion of (31). But if any such term is taken, say, containing the factor $H^S(x_i)$ in the integrand, every graph corresponding to that term will contain the point $x_i$ joined to the rest of the graph only by two electron lines, and this point by itself constitutes a self-energy part of the graph. Therefore, all terms involving $H^S$ are to be omitted from (31) in the calculation of matrix elements. The intuitive argument for omitting these terms is that they were only introduced in order to cancel out higher order self-energy terms arising from $H^i$, which are also to be omitted; the analysis of the foregoing paragraphs is a more precise form of this argument. In physical language, the argument can be stated still more simply; since δm is an unobservable quantity, it cannot appear in the final description of observable phenomena.

VIII. VACUUM POLARIZATION AND CHARGE RENORMALIZATION
The question now arises: What is the physical meaning of the new functions $D_F'$ and $S_F'$, and of the constant $R_1$? In general terms, the answer is clear. The physical processes represented by the self-energy parts of graphs have been pushed out of the calculations, but these processes do not consist entirely of unobservable interactions of single particles with their self-fields, and so cannot entirely be written off as "self-energy processes." In addition, these processes include the phenomenon of vacuum polarization, i.e., the modification of the field surrounding a charged particle by the charges which the particle induces in the vacuum. Therefore, the appearance of $D_F'$, $S_F'$, and $R_1$ in the calculations may be regarded as an explicit representation of the vacuum polarization phenomena which were implicitly contained in the processes now ignored. In the present theory there are two kinds of vacuum polarization, one induced by the external field and the other by the quantized electron and photon fields themselves; these will be called "external" and "internal," respectively. It is only the internal polarization which is represented yet in explicit fashion by the substitutions (56), (57), (63), (64); the external will be included later. To form a concrete picture of the function $D_F'$, it may be observed that the function $D_F(y - z)$ represents in classical electrodynamics the retarded potential of a point charge at y acting upon a point charge at z, together with the retarded potential of the charge at z acting on the charge at y. Therefore, $D_F$ may be spoken of loosely as "the electromagnetic interaction between two point charges." In this semiclassical picture, $D_F'$ is then the electromagnetic interaction between two point charges, including the effects of the charge-distribution which each charge induces in the vacuum. The complete phenomenon of vacuum polarization, as hitherto understood, is included in the above picture of the function $D_F'$. There is nothing left for $S_F'$ to represent. Thus, one of the important conclusions of the present theory is that there is a second phenomenon occurring in nature, included in the term vacuum polarization as used in this paper, but additional to vacuum polarization in the usual sense of the word. The nature of the second phenomenon can best be explained by an example. The scattering of one electron by another may be represented as caused by a potential energy (the Møller interaction) acting between them. If one electron is at y and the other at z, then, as explained above, the effect of vacuum polarization of the usual kind is to replace a factor $D_F$ in this potential energy by $D_F'$. Now consider an analogous, but unorthodox, representation of the Compton effect, or the scattering of an electron by a photon. If the electron is at y and the photon at z, the scattering may be again represented by a potential energy, containing now the operator $S_F(y - z)$ as a factor; the potential is an exchange potential, because after the interaction the electron must be considered to be at z and the photon at y, but this does not detract from its usefulness. By analogy with the 4-vector charge-current density $j_\mu$ which interacts with the potential $D_F$, a spinor Compton-effect density $u_\alpha$ may be defined by the equation
$$u_\alpha(x) = A_\mu(x)(\gamma_\mu)_{\alpha\beta}\psi_\beta(x),$$
and an adjoint spinor by
$$\bar u_\alpha(x) = \bar\psi_\beta(x)(\gamma_\mu)_{\beta\alpha}A_\mu(x).$$
These spinors are not directly observable quantities, but the Compton effect can be adequately described as an exchange potential, of magnitude proportional to $S_F(y - z)$, acting between the Compton-effect density at any point y and the adjoint density at z. The second vacuum polarization phenomenon is described by a change in the form of this potential from $S_F$ to $S_F'$. Therefore, the phenomenon may be pictured in physical terms as the inducing, by a given element of Compton-effect density at a given point, of additional Compton-effect density in the vacuum around it. In both sorts of internal vacuum polarization, the functions $D_F$ and $S_F$, in addition to being altered in shape, become multiplied by numerical (and actually divergent) factors $R_3$ and $R_2$; also the matrix elements of (31) become multiplied by numerical factors such as $R_1R_1^*$. However, it is believed (this has been verified only for second-order terms) that all n'th-order matrix elements of (31) will involve these factors only in the form of a multiplier
$$(eR_2R_3^{1/2})^n;$$
this statement includes the contributions from the higher terms of the series (62) and (67). Here e is defined as the constant occurring in the fundamental interaction (16) by virtue of (37). Now the only possible experimental determination of e is by means of measurements of the effects described by various matrix elements of (31), and so the directly measured quantity is not e but $eR_2R_3^{1/2}$. Therefore, in practice the letter e is used to denote this measured quantity, and the multipliers R no longer appear explicitly in the matrix elements of (31); the change in the meaning of the letter e is called "charge renormalization," and is essential if e is to be identified with the observed electronic charge. As a result of the renormalization, the divergent coefficients $R_1$, $R_2$, and $R_3$ in (56), (57), (62), and (67) are to be replaced by unity, and the higher coefficients a, b, and c by expressions involving only the renormalized charge e. The external vacuum polarization induced by the potential $A_\mu^e$ is, physically speaking, only a special case of the first sort of internal polarization; it can be treated in a precisely similar manner. Graphs describing external polarization effects are those with an "external polarization part," namely, a part including the point $x_0$ and connected with the rest of the graph by only a single photon line. Such a graph is to be "reduced" by omitting the polarization part entirely and renaming with the label $x_0$ the
point at the further end of the single photon line. A discussion similar to those of Section VII leads to the conclusion that only reduced graphs need be considered in the calculation of the matrix element of (31), and that the effect of external polarization is explicitly represented if in the contributions from these graphs a replacement
$$A_\mu^e(x) \to A_\mu^{e\prime}(x) \qquad (68)$$
is made. After a renormalization of the unit of potential, similar to the renormalization of charge, the modified potential $A_\mu^{e\prime}$ takes the form
$$A_\mu^{e\prime}(x) = \big(1 + c_1\Box^2 + c_2(\Box^2)^2 + \cdots\big) A_\mu^e(x), \qquad (69)$$
where the coefficients are the same as in (67). It is necessary, in order to determine the functions $D_F'$, $S_F'$, and $A_\mu^{e\prime}$, to go back to formulas (60) and (66). The determination of the vacuum expectation values of the operators (60) and (66) is a problem of the same kind as the original problem of the calculation of matrix

in a sequel to the present paper a similar evaluation of the function $S_F'$; the analysis involved is too complicated to be summarized here.

IX. SUMMARY OF RESULTS

In this section the results of the preceding pages will be summarized, so far as they relate to the performance of practical calculations. In effect, this summary will consist of a set of rules for the application of the Feynman radiation theory to a certain class of problems. Suppose an electron to be moving in an external field with interaction energy given by (33). Then the interaction energy to be used in calculating the motion of the electron, including radiative corrections of all orders, is
$$H_E(x_0) = \sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!]\, J_n = \sum_{n=0}^{\infty} (-i/\hbar c)^n [1/n!] \int \cdots$$
such graphs can exist only for even n. $K_n$ is the sum of a contribution K(G) from each G. Given G, K(G) is obtained from $J_n$ by the following transformations. First, for each photon line joining x and y in G, replace two factors $A_\mu(x)A_\nu(y)$ in $P_n$ (regardless of their positions) by
$$\tfrac{1}{2}\hbar c\, \delta_{\mu\nu} D_F'(x - y), \qquad (71)$$
with $D_F'$ given by (67) with $R_3 = 1$, the function $D_F$ being defined by (42). Second, for each electron line joining x to y in G, replace two factors $\bar\psi_\alpha(x)\psi_\beta(y)$ in $P_n$ (regardless of positions) by
$$\tfrac{1}{2} S_{F\beta\alpha}'(x - y) \qquad (72)$$
with $S_F'$ given by (62) with $R_2 = 1$, the function $S_F$ being defined by (44) and (45). Third, replace the remaining two factors $P(\bar\psi_\gamma(z), \psi_\delta(w))$ in $P_n$ by $\bar\psi_\gamma(z)\psi_\delta(w)$ in this order. Fourth, replace $A_\mu^e(x_0)$ by $A_\mu^{e\prime}(x_0)$ given by
$$A_\mu^{e\prime}(x) = A_\mu^e(x) - [\alpha/15\pi\kappa_0^2]\, \Box^2 A_\mu^e(x) \qquad (73)$$
or, more generally, by (69). Fifth, multiply the whole by $(-1)^l$, where l is the number of closed loops in G as defined in Section VII. The above rules enable $K_n$ to be written down very rapidly for small values of n. It should be observed that if $K_n$ is being calculated, and if it is not desired to include effects of higher order than the n'th, then $D_F'$, $S_F'$, and $A_\mu^{e\prime}$ in (71), (72), and (73) reduce to the simple functions $D_F$, $S_F$, and $A_\mu^e$. Also, the integrand in $J_n$ is a symmetrical function of $x_1, \cdots, x_n$; therefore, graphs which differ only by a relabeling of the vertices $x_1, \cdots, x_n$ give identical contributions to $K_n$ and need not be considered separately. The extension of these rules to cover the calculation of matrix elements of (70) of a more general character than the one-electron transitions hitherto considered presents no essential difficulty. All that is necessary is to consider graphs with more than two "loose ends," representing processes in which more than one particle is involved. This extension is not treated in the present paper, chiefly because it would lead to unpleasantly cumbersome formulas.

X. EXAMPLE: SECOND-ORDER RADIATIVE CORRECTIONS
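As a preliminary to the example, the fifth rule of Section IX (multiply by $(-1)^l$, with l the number of closed loops) can be sketched in code. The representation is an assumption made for illustration: each vertex is mapped to the vertex reached by the electron line leaving it, or to `None` when that line runs out to the edge of the diagram. The function names are hypothetical.

```python
def count_closed_loops(successor):
    """Count closed electron loops in a graph.

    `successor[v]` is the vertex reached by the electron line leaving v,
    or None if that line leads to the edge of the diagram.
    """
    seen = set()
    loops = 0
    for start in successor:
        if start in seen:
            continue
        path = []
        v = start
        # Follow electron lines until the edge or a visited vertex is hit.
        while v is not None and v not in seen:
            seen.add(v)
            path.append(v)
            v = successor[v]
        # Returning to a vertex of the current path closes a loop.
        if v is not None and v in path:
            loops += 1
    return loops


def loop_sign(successor):
    """Sign factor (-1)**l of the fifth rule."""
    return -1 if count_closed_loops(successor) % 2 else 1


if __name__ == "__main__":
    # One open electron line running edge -> x2 -> x0 -> x1 -> edge;
    # the orientation is an assumption and does not affect the count.
    open_line = {2: 0, 0: 1, 1: None}
    print(count_closed_loops(open_line), loop_sign(open_line))
```

A graph consisting of a single open electron line has no closed loops, so the sign factor is +1; a graph containing one closed electron loop would pick up a factor −1.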
As an illustration of the rules of procedure of the previous section, these rules will be used for writing down the terms giving second-order radiative corrections to the motion of an electron in an external field. Let the energy of the external field be
$$-[1/c]\, j_\mu(x_0) A_\mu^e(x_0). \qquad (74)$$
Then there will be one second-order correction term
$$U = [\alpha/15\pi\kappa_0^2]\,[1/c]\, j_\mu(x_0)\, \Box^2 A_\mu^e(x_0)$$
arising from the substitution (73) in the zero-order term (74). This is the well-known vacuum polarization or Uehling term.9 The remaining second-order term arises from the second-order part $J_2$ of (70). Written in expanded form, $J_2$ is
$$J_2 = ie^3 \int dx_1 \int dx_2\, P\big(\bar\psi_\alpha(x_0)(\gamma_\lambda)_{\alpha\beta}\psi_\beta(x_0)A_\lambda^e(x_0),\; \bar\psi_\gamma(x_1)(\gamma_\mu)_{\gamma\delta}\psi_\delta(x_1)A_\mu(x_1),\; \bar\psi_\epsilon(x_2)(\gamma_\nu)_{\epsilon\zeta}\psi_\zeta(x_2)A_\nu(x_2)\big).$$
Next, all admissible graphs with the three vertices $x_0$, $x_1$, $x_2$ are to be drawn. It is easy to see that there are only two such graphs, that G shown in Fig. 1, and the identical graph with $x_1$ and $x_2$ interchanged. The full lines are electron lines, the dotted line a photon line. The contribution K(G) is obtained from $J_2$ by substituting according to the rules of Section IX; in this case l = 0, and the primes can be omitted from (71), (72), (73) since only second-order terms are required. The integrand in K(G) can be reassembled into the form of a matrix product, suppressing the suffixes α, ⋯, ζ. Then, multiplying by a factor 2 to allow for the second graph, the complete second-order correction to (74) arising from $J_2$ becomes
$$L = -i[e^3/8\hbar c] \int dx_1 \int dx_2\, D_F(x_1 - x_2) A_\mu^e(x_0) \times \bar\psi(x_1)\gamma_\nu S_F(x_0 - x_1)\gamma_\mu S_F(x_2 - x_0)\gamma_\nu\psi(x_2).$$
FIG. 1.
9 Robert Serber, Phys. Rev. 48, 49 (1935); E. A. Uehling, Phys. Rev. 48, 55 (1935).
Lorentz and Poincare Invariance
This is the term which gives rise to the main part of the Lamb-Retherford line shift,10 the anomalous magnetic moment of the electron,11 and the anomalous hyperfine splitting of the ground state of hydrogen.11 The above expression L is formally simpler than the corresponding expression obtained by Schwinger, but the two are easily seen to be equivalent. In particular, the above expression does not lead to any great reduction in the labor involved in a numerical calculation of the Lamb shift. Its advantage lies rather in the ease with which it can be written down.
In conclusion, the author would like to express his thanks to the Commonwealth Fund of New York for financial support, and to Professors Schwinger and Feynman for the stimulating lectures in which they presented their respective theories.
10 W. E. Lamb and R. C. Retherford, Phys. Rev. 72, 241 (1947).
Notes added in proof (To Section II). The argument of Section II is an over-simplification of the method of Tomonaga,1 and is unsound. There is an error in the derivation of (3); derivatives occurring in $H(r)$ give rise to noncommutativity between $H(r)$ and field quantities at $r'$ when $r$ is a point on $\sigma$ infinitesimally distant from $r'$. The argument should be amended as follows. $\Phi$ is defined only for flat surfaces $t(r) = t$, and for such surfaces (3) and (6) are correct. $\Psi$ is defined for general surfaces by (12) and (10), and is verified to satisfy (9). For a flat surface, $\Phi$ and $\Psi$ are then shown to be related by (7). Finally, since $H_1$ does not involve the derivatives in $H$, the argument leading to (3) can be correctly applied to prove that for general
Chap. 5. The Union of Special Relativity
and Quantum ...
The Radiation Theories of Tomonaga, Schwinger and Feynman
The Physical Review, 75, 1949, pp. 486-502. (Also included in Selected Papers on Quantum Electrodynamics, Julian Schwinger, editor, [New York, Dover, 1958.])
Freeman J. Dyson Institute for Advanced Study, Princeton, New Jersey
Commentary
1 Condensed by the author from Selected Papers of Freeman Dyson with Commentary, published by the American Mathematical Society in 1996 (pages 11-13 in that volume).
The American scientific tradition was strongly empirical. Theory was regarded as a necessary evil, needed for the correct understanding of experiments but not valued for its own sake. Quantum field theory had been invented and elaborated in Europe. It was a sophisticated mathematical construction, motivated more by considerations of mathematical beauty than by success in explaining experiments. The majority of American physicists had not taken the trouble to learn it. They considered it, as Samuel Johnson considered Italian opera, an exotic and irrational entertainment. Thus it happened that I arrived at Cornell from England in 1947 as a student, and found myself the only person in the whole university who knew about quantum field theory. The great Hans Bethe and the brilliant Richard Feynman taught me a tremendous lot about many areas of physics, but when we were dealing with quantum field theory I was the teacher and they were the students. Bethe and Feynman had been doing physics successfully for many years without the help of quantum field theory, and they were not eager to learn it. It was my luck that I arrived with this gift from Europe just at the moment when new precise experiments on the fine details of atomic energy levels required quantum field theory for their correct interpretation. Julian Schwinger had known about quantum field theory long before. But he shared the American view that it was a mathematical extravagance, better avoided unless it should turn out to be essential. In 1948 he understood that it could be useful. He used it for his calculations of the energy level shifts
revealed by the experiments of Lamb and Retherford, Foley and Kusch at Columbia. But he used it grudgingly. In his publications he preferred not to speak explicitly about quantum field theory. Instead, he spoke about Green's Functions. At Cornell I was learning Richard Feynman's quite different way of calculating atomic processes. Feynman had never been interested in quantum field theory. He had his own private way of doing calculations. His way was based on things that he called "Propagators", which were probability amplitudes for particles to propagate themselves from one space-time point to another. He calculated the probabilities of physical processes by adding up the propagators. He had rules for calculating the propagators. Each propagator was represented graphically by a collection of diagrams. Each diagram gave a pictorial view of particles moving along straight lines and colliding with one another at points where the straight lines met. When I learned this technique of drawing diagrams and calculating propagators from Feynman, I found it completely baffling, because it always gave the right answers but did not seem to be based on any solid mathematical foundation. Feynman called his way of calculating physical processes "the space-time approach", because his diagrams represented events as occurring at particular places and at particular times. The propagators described sequences of events in space-time. It later turned out that Feynman's propagators were merely another kind of Green's Functions. Feynman had been talking the language of Green's Functions all his life without knowing it. Green's Functions also appeared in the work of Sin-Itiro Tomonaga, who had developed independently a new and elegant version of relativistic quantum field theory. His work was done in the complete isolation of war-time Japan, and was published in Japanese in 1943. 
The rest of the world became aware of it only in the spring of 1948, when an English translation of it arrived in Princeton, sent by Hideki Yukawa to Robert Oppenheimer. Tomonaga was a physicist in the European tradition, having worked for two years with Heisenberg at Leipzig before the war. For him, in contrast to Schwinger and Feynman, quantum field theory was a familiar and natural language in which to think. After the war, Tomonaga's students in Japan had been applying his ideas to calculate the properties of atoms and electrons with high accuracy, and were reaching the same results as Schwinger and Feynman. When Tomonaga's papers began to arrive in America, I was delighted to see that he was speaking the language of quantum field theory. It did not take me long to put the various ingredients of the pudding together. When the pudding was cooked, all three versions of the new theory of atoms and electrons turned
out to be different ways of expressing the same basic idea. The basic idea was to calculate Green's Functions for all atomic processes that could be directly observed. Green's Functions appeared as the essential link between the methods of Schwinger and Feynman, and Tomonaga's relativistic quantum field theory provided the firm mathematical foundation for all three versions of quantum electrodynamics. The unification of quantum electrodynamics was done, with the active encouragement of Bethe, Schwinger and Feynman, during the summer of 1948. A detailed analysis of the structure led to two simple conclusions. The results of systematic perturbation theory were identical with the results obtained from Feynman graphs using Feynman's rules of calculation. And the procedures of mass and charge renormalization got rid of the infinities in the theory and made all observable quantities finite. The effect of my paper was to make quantum electrodynamics into a convenient tool for practical calculations. By a systematic use of perturbation theory one could calculate physical processes to any desired accuracy. The same basic ideas and techniques were later incorporated into the unified theory of electromagnetic and weak interactions by Weinberg and Salam, and into the quantum chromodynamic theory of strong interactions which became the "Standard Model" of modern particle physics.
Chapter 6
The Lorentz and Poincare Groups and Their Implications
E. Wigner (1939), Van der Waerden (1932), V. Bargmann (1947, 1948), S. Watanabe (1951), E. Inonu (1953), Yu. M. Shirokov (1958), M. A. Naimark (1957)
The Quantum Theory of Fields (vol 1)
S. Weinberg
University of Texas at Austin
Symmetries, Quantum Lorentz Transformation and The Poincare Algebra
2.2
Symmetries
A symmetry transformation is a change in our point of view that does not change the results of possible experiments. If an observer $O$ sees a system in a state represented by a ray $\mathcal{R}$ or $\mathcal{R}_1$ or $\mathcal{R}_2, \ldots$, then an equivalent observer $O'$ who looks at the same system will observe it in a different state, represented by a ray $\mathcal{R}'$ or $\mathcal{R}'_1$ or $\mathcal{R}'_2, \ldots$, respectively, but the two observers must find the same probabilities
$$P(\mathcal{R} \to \mathcal{R}_n) = P(\mathcal{R}' \to \mathcal{R}'_n). \qquad (2.2.1)$$
(This is only a necessary condition for a ray transformation to be a symmetry; further conditions are discussed in the next chapter.) An important theorem proved by Wigner² in the early 1930s tells us that for any such transformation $\mathcal{R} \to \mathcal{R}'$ of rays we may define an operator $U$ on Hilbert space, such that if $\Psi$ is in ray $\mathcal{R}$ then $U\Psi$ is in the ray $\mathcal{R}'$, with $U$ either unitary and linear
$$(U\Phi, U\Psi) = (\Phi, \Psi), \qquad (2.2.2)$$
$$U(\xi\Phi + \eta\Psi) = \xi\, U\Phi + \eta\, U\Psi \qquad (2.2.3)$$
or else antiunitary and antilinear
$$(U\Phi, U\Psi) = (\Phi, \Psi)^*, \qquad (2.2.4)$$
$$U(\xi\Phi + \eta\Psi) = \xi^*\, U\Phi + \eta^*\, U\Psi. \qquad (2.2.5)$$
Wigner's proof omits some steps. A more complete proof is given at the end of this chapter in Appendix A. As already mentioned, the adjoint of a linear operator $L$ is defined by
$$(\Phi, L^\dagger \Psi) \equiv (L\Phi, \Psi). \qquad (2.2.6)$$
This condition cannot be satisfied for an antilinear operator, because in this case the right-hand side of Eq. (2.2.6) would be linear in $\Phi$, while the left-hand side is antilinear in $\Phi$. Instead, the adjoint of an antilinear operator $A$ is defined by
$$(\Phi, A^\dagger \Psi) \equiv (A\Phi, \Psi)^* = (\Psi, A\Phi). \qquad (2.2.7)$$
With this definition, the conditions for unitarity or antiunitarity both take the form
$$U^\dagger = U^{-1}. \qquad (2.2.8)$$
There is always a trivial symmetry transformation $\mathcal{R} \to \mathcal{R}$, represented by the identity operator $U = 1$. This operator is, of course, unitary and linear. Continuity then demands that any symmetry (like a rotation or translation or Lorentz transformation) that can be made trivial by a continuous change of some parameters (like angles or distances or velocities) must be represented by a linear unitary operator $U$ rather than one that is antilinear and antiunitary. (Symmetries represented by antiunitary antilinear operators are less prominent in physics; they all involve a reversal in the direction of time's flow. See Section 2.6.) In particular, a symmetry transformation that is infinitesimally close to being trivial can be represented by a linear unitary operator that is infinitesimally close to the identity:
$$U = 1 + i\epsilon t \qquad (2.2.9)$$
with $\epsilon$ a real infinitesimal. For this to be unitary and linear, $t$ must be Hermitian and linear, so it is a candidate for an observable. Indeed, most (and perhaps all) of the observables of physics, such as angular momentum or momentum, arise in this way from symmetry transformations.
The set of symmetry transformations has certain properties that define it as a group. If $T_1$ is a transformation that takes rays $\mathcal{R}_n$ into $\mathcal{R}'_n$, and $T_2$ is another transformation that takes $\mathcal{R}'_n$ into $\mathcal{R}''_n$, then the result of performing both transformations is another symmetry transformation, which we write $T_2 T_1$, that takes $\mathcal{R}_n$ into $\mathcal{R}''_n$. Also, a symmetry transformation $T$ which takes rays $\mathcal{R}_n$ into $\mathcal{R}'_n$ has an inverse, written $T^{-1}$, which takes $\mathcal{R}'_n$ into $\mathcal{R}_n$, and there is an identity transformation, $T = 1$, which leaves rays unchanged. The unitary or antiunitary operators $U(T)$ corresponding to these symmetry transformations have properties that mirror this group structure, but with a complication due to the fact that, unlike the symmetry transformations themselves, the operators $U(T)$ act on vectors in the Hilbert space, rather than on rays. If $T_1$ takes $\mathcal{R}_n$ into $\mathcal{R}'_n$, then acting on a vector $\Psi_n$ in the ray $\mathcal{R}_n$, $U(T_1)$ must yield a vector $U(T_1)\Psi_n$ in the ray $\mathcal{R}'_n$, and if $T_2$ takes this ray into $\mathcal{R}''_n$, then acting on $U(T_1)\Psi_n$ it must yield a vector $U(T_2)U(T_1)\Psi_n$ in the ray $\mathcal{R}''_n$. But $U(T_2 T_1)\Psi_n$ is also in this ray, so these vectors can differ only by a phase $\phi_n(T_2, T_1)$:
$$U(T_2)U(T_1)\Psi_n = e^{i\phi_n(T_2,T_1)}\, U(T_2 T_1)\Psi_n. \qquad (2.2.10)$$
Furthermore, with one significant exception, the linearity (or antilinearity) of $U(T)$ tells us that these phases are independent of the state $\Psi_n$. Here is the proof. Consider any two different vectors $\Psi_A$, $\Psi_B$, which are not proportional to each other. Then, applying Eq. (2.2.10) to the state $\Psi_{AB} \equiv \Psi_A + \Psi_B$, we have
$$e^{i\phi_{AB}}\, U(T_2 T_1)(\Psi_A + \Psi_B) = U(T_2)U(T_1)(\Psi_A + \Psi_B) = U(T_2)U(T_1)\Psi_A + U(T_2)U(T_1)\Psi_B = e^{i\phi_A}\, U(T_2 T_1)\Psi_A + e^{i\phi_B}\, U(T_2 T_1)\Psi_B. \qquad (2.2.11)$$
Any unitary or antiunitary operator has an inverse (its adjoint) which is also unitary or antiunitary. Multiplying (2.2.11) on the left with $U^{-1}(T_2 T_1)$, we have then
$$e^{\pm i\phi_{AB}}(\Psi_A + \Psi_B) = e^{\pm i\phi_A}\Psi_A + e^{\pm i\phi_B}\Psi_B, \qquad (2.2.12)$$
the upper and lower signs referring to $U(T_2 T_1)$ unitary or antiunitary, respectively. Since $\Psi_A$ and $\Psi_B$ are linearly independent, this is only possible if
$$e^{i\phi_{AB}} = e^{i\phi_A} = e^{i\phi_B}. \qquad (2.2.13)$$
So as promised, the phase in Eq. (2.2.10) is independent of the state-vector $\Psi_n$, and therefore this can be written as an operator relation
$$U(T_2)U(T_1) = e^{i\phi(T_2,T_1)}\, U(T_2 T_1). \qquad (2.2.14)$$
For ...
$$T(\bar\theta)\,T(\theta) = T\big(f(\bar\theta, \theta)\big) \qquad (2.2.15)$$
with $f^a(\bar\theta, \theta)$ a function of the $\bar\theta$s and $\theta$s. Taking $\theta^a = 0$ as the coordinates of the identity, we must have
$$f^a(\theta, 0) = f^a(0, \theta) = \theta^a. \qquad (2.2.16)$$
As already mentioned, the transformations of such continuous groups must be represented on the physical Hilbert space by unitary (rather than antiunitary) operators $U(T(\theta))$. For a Lie group, these operators can be
represented in at least a finite neighborhood of the identity by a power series
$$U(T(\theta)) = 1 + i\theta^a t_a + \tfrac{1}{2}\theta^b\theta^c t_{bc} + \cdots, \qquad (2.2.17)$$
where $t_a$, $t_{bc} = t_{cb}$, etc. are Hermitian operators independent of the $\theta$s. Suppose that the $U(T(\theta))$ form an ordinary (i.e., not projective) representation of this group of transformations, i.e.,
$$U(T(\bar\theta))\,U(T(\theta)) = U\big(T(f(\bar\theta, \theta))\big). \qquad (2.2.18)$$
Let us see what this condition looks like when expanded in powers of $\theta^a$ and $\bar\theta^a$. According to Eq. (2.2.16), the expansion of $f^a(\bar\theta, \theta)$ to second order must take the form
$$f^a(\bar\theta, \theta) = \theta^a + \bar\theta^a + f^a{}_{bc}\,\bar\theta^b\theta^c + \cdots \qquad (2.2.19)$$
with real coefficients $f^a{}_{bc}$. (The presence of any terms of order $\theta^2$ or $\bar\theta^2$ would violate Eq. (2.2.16).) Then Eq. (2.2.18) reads
$$\big[1 + i\bar\theta^a t_a + \tfrac{1}{2}\bar\theta^b\bar\theta^c t_{bc} + \cdots\big] \times \big[1 + i\theta^a t_a + \tfrac{1}{2}\theta^b\theta^c t_{bc} + \cdots\big] = 1 + i\big(\theta^a + \bar\theta^a + f^a{}_{bc}\,\bar\theta^b\theta^c + \cdots\big)t_a + \tfrac{1}{2}\big(\theta^b + \bar\theta^b + \cdots\big)\big(\theta^c + \bar\theta^c + \cdots\big)t_{bc} + \cdots \qquad (2.2.20)$$
The terms of order $1$, $\theta$, $\bar\theta$, $\theta^2$, and $\bar\theta^2$ automatically match on both sides of Eq. (2.2.20), but from the $\bar\theta\theta$ terms we obtain a non-trivial condition
$$t_{bc} = -t_b t_c - i f^a{}_{bc}\, t_a. \qquad (2.2.21)$$
This shows that if we are given the structure of the group, i.e., the function $f(\bar\theta, \theta)$, and hence its quadratic coefficient $f^a{}_{bc}$, we can calculate the second-order terms in $U(T(\theta))$ from the generators $t_a$ appearing in the first-order terms. However, there is a consistency condition: the operator $t_{bc}$ must be symmetric in $b$ and $c$ (because it is the second derivative of $U(T(\theta))$ with respect to $\theta^b$ and $\theta^c$) so Eq. (2.2.21) requires that
$$[t_b, t_c] = i C^a{}_{bc}\, t_a, \qquad (2.2.22)$$
where $C^a{}_{bc}$ are a set of real constants known as structure constants
$$C^a{}_{bc} \equiv -f^a{}_{bc} + f^a{}_{cb}. \qquad (2.2.23)$$
Such a set of commutation relations is known as a Lie algebra. In Section 2.7 we will prove in effect that the commutation relation (2.2.22) is the single condition needed to ensure that this process can be continued: the complete power series for $U(T(\theta))$ may be calculated from an infinite sequence of relations like Eq. (2.2.21), provided we know the first-order terms, the generators $t_a$. This does not necessarily mean that the operators $U(T(\theta))$ are uniquely determined for all $\theta^a$ if we know the $t_a$, but it
does mean that the $U(T(\theta))$ are uniquely determined in at least a finite neighborhood of the coordinates $\theta^a = 0$ of the identity, in such a way that Eq. (2.2.15) is satisfied if $\theta$, $\bar\theta$, and $f(\bar\theta, \theta)$ are in this neighborhood. The extension to all $\theta^a$ is discussed in Section 2.7.
There is a special case of some importance, that we will encounter again and again. Suppose that the function $f(\bar\theta, \theta)$ (perhaps just for some subset of the coordinates $\theta^a$) is simply additive
$$f^a(\bar\theta, \theta) = \theta^a + \bar\theta^a. \qquad (2.2.24)$$
This is the case for instance for translations in spacetime, or for rotations about any one fixed axis (though not for both together). Then the coefficients $f^a{}_{bc}$ in Eq. (2.2.19) vanish, and so do the structure constants (2.2.23). The generators then all commute
$$[t_b, t_c] = 0. \qquad (2.2.25)$$
Such a group is called Abelian. In this case, it is easy to calculate $U(T(\theta))$ for all $\theta^a$. From Eqs. (2.2.18) and (2.2.24), we have for any integer $N$
$$U(T(\theta)) = \Big[U\Big(T\Big(\frac{\theta}{N}\Big)\Big)\Big]^N.$$
Letting $N \to \infty$, and keeping only the first-order term in $U(T(\theta/N))$, we have then
$$U(T(\theta)) = \lim_{N\to\infty}\Big[1 + \frac{i}{N}\theta^a t_a\Big]^N$$
and hence
$$U(T(\theta)) = \exp(i t_a \theta^a). \qquad (2.2.26)$$
2.3
Quantum Lorentz Transformations
Einstein's principle of relativity states the equivalence of certain 'inertial' frames of reference. It is distinguished from the Galilean principle of relativity, obeyed by Newtonian mechanics, by the transformation connecting coordinate systems in different inertial frames. If $x^\mu$ are the coordinates in one inertial frame (with $x^1, x^2, x^3$ Cartesian space coordinates, and $x^0 = t$ a time coordinate, the speed of light being set equal to unity) then in any other inertial frame, the coordinates $x'^\mu$ must satisfy
$$\eta_{\mu\nu}\, dx'^\mu dx'^\nu = \eta_{\mu\nu}\, dx^\mu dx^\nu \qquad (2.3.1)$$
or equivalently
$$\eta_{\mu\nu}\, \frac{\partial x'^\mu}{\partial x^\rho} \frac{\partial x'^\nu}{\partial x^\sigma} = \eta_{\rho\sigma}. \qquad (2.3.2)$$
Here $\eta_{\mu\nu}$ is the diagonal matrix, with elements
$$\eta_{11} = \eta_{22} = \eta_{33} = +1, \quad \eta_{00} = -1 \qquad (2.3.3)$$
and the summation convention is in force: we sum over any index like $\mu$ and $\nu$ in Eq. (2.3.2), which appears twice in the same term, once upstairs and once downstairs. These transformations have the special property that the speed of light is the same (in our units, equal to unity) in all inertial frames;* a light wave travelling at unit speed satisfies $|d\mathbf{x}/dt| = 1$, or in other words $\eta_{\mu\nu}\, dx^\mu dx^\nu = d\mathbf{x}^2 - dt^2 = 0$, from which it follows that also $\eta_{\mu\nu}\, dx'^\mu dx'^\nu = 0$, and hence $|d\mathbf{x}'/dt'| = 1$. Any coordinate transformation $x^\mu \to x'^\mu$ that satisfies Eq. (2.3.2) is linear³
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu \qquad (2.3.4)$$
with $a^\mu$ arbitrary constants, and $\Lambda^\mu{}_\nu$ a constant matrix satisfying the conditions
$$\eta_{\mu\nu}\, \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma = \eta_{\rho\sigma}. \qquad (2.3.5)$$
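For some purposes, it is useful to write the Lorentz transformation condition in a different way. The matrix $\eta_{\mu\nu}$ has an inverse, written $\eta^{\mu\nu}$, which happens to have the same components: it is diagonal, with $\eta^{00} = -1$, $\eta^{11} = \eta^{22} = \eta^{33} = +1$. Multiplying Eq. (2.3.5) with $\eta^{\sigma\tau}\Lambda^\kappa{}_\tau$, and inserting parentheses judiciously, we have
$$\eta_{\mu\nu}\Lambda^\mu{}_\rho\big(\Lambda^\nu{}_\sigma \Lambda^\kappa{}_\tau \eta^{\sigma\tau}\big) = \Lambda^\kappa{}_\rho = \eta_{\mu\nu}\eta^{\nu\kappa}\Lambda^\mu{}_\rho.$$
Multiplying with the inverse of the matrix $\eta_{\mu\nu}\Lambda^\mu{}_\rho$ then gives
$$\Lambda^\nu{}_\sigma \Lambda^\kappa{}_\tau \eta^{\sigma\tau} = \eta^{\nu\kappa}. \qquad (2.3.6)$$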
These transformations form a group. If we first perform a Lorentz transformation (2.3.4), and then a second Lorentz transformation $x'^\mu \to x''^\mu$, with
$$x''^\mu = \bar\Lambda^\mu{}_\rho x'^\rho + \bar a^\mu = \bar\Lambda^\mu{}_\rho\big(\Lambda^\rho{}_\nu x^\nu + a^\rho\big) + \bar a^\mu$$
then the effect is the same as the Lorentz transformation $x^\mu \to x''^\mu$, with
$$x''^\mu = \big(\bar\Lambda^\mu{}_\rho \Lambda^\rho{}_\nu\big)x^\nu + \big(\bar\Lambda^\mu{}_\rho a^\rho + \bar a^\mu\big). \qquad (2.3.7)$$
(Note that if $\Lambda^\mu{}_\nu$ and $\bar\Lambda^\mu{}_\nu$ both satisfy Eq. (2.3.5), then so does $\bar\Lambda^\mu{}_\rho \Lambda^\rho{}_\nu$, so this is a Lorentz transformation. The bar is used here just to distinguish one Lorentz transformation from the other.) The transformations $T(\Lambda, a)$ induced on physical states therefore satisfy the composition rule
$$T(\bar\Lambda, \bar a)\,T(\Lambda, a) = T(\bar\Lambda\Lambda,\ \bar\Lambda a + \bar a). \qquad (2.3.8)$$
* There is a larger class of coordinate transformations, known as conformal transformations, for which $\eta_{\mu\nu}\, dx'^\mu dx'^\nu$ is proportional though generally not equal to $\eta_{\mu\nu}\, dx^\mu dx^\nu$, and which therefore also leave the speed of light invariant. Conformal invariance in two dimensions has proved enormously important in string theory and statistical mechanics, but the physical relevance of these conformal transformations in four spacetime dimensions is not yet clear.
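Taking the determinant of Eq. (2.3.5) gives
$$(\mathrm{Det}\,\Lambda)^2 = 1 \qquad (2.3.9)$$
so $\Lambda^\mu{}_\nu$ has an inverse, $(\Lambda^{-1})^\nu{}_\rho$, which we see from Eq. (2.3.5) takes the form
$$(\Lambda^{-1})^\rho{}_\nu = \Lambda_\nu{}^\rho \equiv \eta_{\nu\mu}\eta^{\rho\sigma}\Lambda^\mu{}_\sigma. \qquad (2.3.10)$$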
The inverse of the transformation $T(\Lambda, a)$ is seen from Eq. (2.3.8) to be $T(\Lambda^{-1}, -\Lambda^{-1}a)$, and, of course, the identity transformation is $T(1, 0)$. In accordance with the discussion in the previous section, the transformations $T(\Lambda, a)$ induce a unitary linear transformation on vectors in the physical Hilbert space
$$\Psi \to U(\Lambda, a)\Psi.$$
The operators $U$ satisfy a composition rule
$$U(\bar\Lambda, \bar a)\,U(\Lambda, a) = U(\bar\Lambda\Lambda,\ \bar\Lambda a + \bar a). \qquad (2.3.11)$$
(As already mentioned, to avoid the appearance of a phase factor on the right-hand side of Eq. (2.3.11), it is, in general, necessary to enlarge the Lorentz group. The appropriate enlargement is described in Section 2.7.) The whole group of transformations $T(\Lambda, a)$ is properly known as the inhomogeneous Lorentz group, or Poincare group. It has a number of important subgroups. First, those transformations with $a^\mu = 0$ obviously form a subgroup, with
$$T(\bar\Lambda, 0)\,T(\Lambda, 0) = T(\bar\Lambda\Lambda, 0), \qquad (2.3.12)$$
known as the homogeneous Lorentz group. Also, we note from Eq. (2.3.9) that either $\mathrm{Det}\,\Lambda = +1$ or $\mathrm{Det}\,\Lambda = -1$; those transformations with $\mathrm{Det}\,\Lambda = +1$ obviously form a subgroup of either the homogeneous or the inhomogeneous Lorentz groups. Further, from the 00-components of Eqs. (2.3.5) and (2.3.6), we have
$$(\Lambda^0{}_0)^2 = 1 + \Lambda^i{}_0\Lambda^i{}_0 = 1 + \Lambda^0{}_i\Lambda^0{}_i \qquad (2.3.13)$$
with $i$ summed over the values 1, 2, and 3. We see that either $\Lambda^0{}_0 \ge +1$ or $\Lambda^0{}_0 \le -1$. Those transformations with $\Lambda^0{}_0 \ge +1$ form a subgroup. Note that if $\Lambda^\mu{}_\nu$ and $\bar\Lambda^\mu{}_\nu$ are two such $\Lambda$s, then
$$(\bar\Lambda\Lambda)^0{}_0 = \bar\Lambda^0{}_0\Lambda^0{}_0 + \bar\Lambda^0{}_1\Lambda^1{}_0 + \bar\Lambda^0{}_2\Lambda^2{}_0 + \bar\Lambda^0{}_3\Lambda^3{}_0.$$
But Eq. (2.3.13) shows that the three-vector $(\Lambda^1{}_0, \Lambda^2{}_0, \Lambda^3{}_0)$ has length $\sqrt{(\Lambda^0{}_0)^2 - 1}$, and similarly the three-vector $(\bar\Lambda^0{}_1, \bar\Lambda^0{}_2, \bar\Lambda^0{}_3)$ has length $\sqrt{(\bar\Lambda^0{}_0)^2 - 1}$, so the scalar product of these two three-vectors is bounded by
$$\big|\bar\Lambda^0{}_1\Lambda^1{}_0 + \bar\Lambda^0{}_2\Lambda^2{}_0 + \bar\Lambda^0{}_3\Lambda^3{}_0\big| \le \sqrt{(\Lambda^0{}_0)^2 - 1}\,\sqrt{(\bar\Lambda^0{}_0)^2 - 1}, \qquad (2.3.14)$$
and so
$$(\bar\Lambda\Lambda)^0{}_0 \ge \bar\Lambda^0{}_0\Lambda^0{}_0 - \sqrt{(\Lambda^0{}_0)^2 - 1}\,\sqrt{(\bar\Lambda^0{}_0)^2 - 1} \ge 1.$$
The subgroup of Lorentz transformations with $\mathrm{Det}\,\Lambda = +1$ and $\Lambda^0{}_0 \ge +1$ is known as the proper orthochronous Lorentz group. Since it is not possible by a continuous change of parameters to jump from $\mathrm{Det}\,\Lambda = +1$ to $\mathrm{Det}\,\Lambda = -1$, or from $\Lambda^0{}_0 \ge +1$ to $\Lambda^0{}_0 \le -1$, any Lorentz transformation that can be obtained from the identity by a continuous change of parameters must have $\mathrm{Det}\,\Lambda$ and $\Lambda^0{}_0$ of the same sign as for the identity, and hence must belong to the proper orthochronous Lorentz group. Any Lorentz transformation is either proper and orthochronous, or may be written as the product of an element of the proper orthochronous Lorentz group with one of the discrete transformations $\mathcal{P}$ or $\mathcal{T}$ or $\mathcal{P}\mathcal{T}$, where $\mathcal{P}$ is the space inversion, whose non-zero elements are
$$\mathcal{P}^0{}_0 = 1, \quad \mathcal{P}^1{}_1 = \mathcal{P}^2{}_2 = \mathcal{P}^3{}_3 = -1, \qquad (2.3.15)$$
and $\mathcal{T}$ is the time-reversal matrix, whose non-zero elements are
$$\mathcal{T}^0{}_0 = -1, \quad \mathcal{T}^1{}_1 = \mathcal{T}^2{}_2 = \mathcal{T}^3{}_3 = 1. \qquad (2.3.16)$$
Thus the study of the whole Lorentz group reduces to the study of its proper orthochronous subgroup, plus space inversion and time-reversal. We will consider space inversion and time-reversal separately in Section 2.6. Until then, we will deal only with the homogeneous or inhomogeneous proper orthochronous Lorentz group.
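The defining conditions of this section can be checked numerically. The following is a minimal sketch, assuming the matrix form $\Lambda^T \eta \Lambda = \eta$ of Eq. (2.3.5) and $\Lambda^{-1} = \eta \Lambda^T \eta$ of Eq. (2.3.10); the helper names (`boost_x`, `rotation_z`, `lorentz_residual`, `inverse_from_metric`) and the sample velocities are illustrative choices, not part of the text.

```python
import math

# Metric eta = diag(-1, +1, +1, +1), with index 0 the time coordinate (Eq. 2.3.3).
ETA = [[(-1.0 if m == 0 else 1.0) if m == n else 0.0 for n in range(4)]
       for m in range(4)]
IDENTITY = [[1.0 if m == n else 0.0 for n in range(4)] for m in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def boost_x(v):
    """Boost along x with |v| < 1 (units with c = 1); the sign of v is a convention."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    lam = [row[:] for row in IDENTITY]
    lam[0][0] = lam[1][1] = g
    lam[0][1] = lam[1][0] = g * v
    return lam


def rotation_z(angle):
    """Rotation about the z axis, acting on the (x, y) block only."""
    lam = [row[:] for row in IDENTITY]
    c, s = math.cos(angle), math.sin(angle)
    lam[1][1], lam[1][2] = c, s
    lam[2][1], lam[2][2] = -s, c
    return lam


def lorentz_residual(lam):
    """Largest violation of eta_{mu nu} Lam^mu_rho Lam^nu_sigma = eta_{rho sigma}."""
    return max_abs_diff(matmul(transpose(lam), matmul(ETA, lam)), ETA)


def inverse_from_metric(lam):
    """Inverse built as eta Lam^T eta, the matrix form of Eq. (2.3.10)."""
    return matmul(ETA, matmul(transpose(lam), ETA))


# Space inversion of Eq. (2.3.15): a Lorentz transformation with Det = -1.
SPACE_INVERSION = [[(1.0 if m == 0 else -1.0) if m == n else 0.0 for n in range(4)]
                   for m in range(4)]

lam = matmul(boost_x(0.6), rotation_z(0.3))
lam_bar = boost_x(-0.8)
composite = matmul(lam_bar, lam)  # the product of Eq. (2.3.7) with a = 0
```

Under these assumptions the product of two proper orthochronous transformations again satisfies Eq. (2.3.5), has unit determinant, and has a 00-component of at least 1, while the space inversion satisfies Eq. (2.3.5) with determinant -1.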
2.4
The Poincare Algebra
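As we saw in Section 2.2, much of the information about any Lie symmetry group is contained in properties of the group elements near the identity. For the inhomogeneous Lorentz group, the identity is the transformation $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu$, $a^\mu = 0$, so we want to study those transformations with
$$\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu, \qquad a^\mu = \epsilon^\mu, \qquad (2.4.1)$$
both $\omega^\mu{}_\nu$ and $\epsilon^\mu$ being taken infinitesimal. The Lorentz condition (2.3.5) reads here
$$\eta_{\rho\sigma} = \eta_{\mu\nu}\big(\delta^\mu{}_\rho + \omega^\mu{}_\rho\big)\big(\delta^\nu{}_\sigma + \omega^\nu{}_\sigma\big) = \eta_{\sigma\rho} + \omega_{\sigma\rho} + \omega_{\rho\sigma} + O(\omega^2).$$
We are here using the convention, to be used throughout this book, that indices may be lowered or raised by contraction with $\eta_{\mu\nu}$ or $\eta^{\mu\nu}$:
$$\omega_{\sigma\rho} \equiv \eta_{\mu\sigma}\,\omega^\mu{}_\rho, \qquad \omega^\mu{}_\rho \equiv \eta^{\mu\sigma}\omega_{\sigma\rho}.$$
Keeping only the terms of first order in $\omega$ in the Lorentz condition (2.3.5), we see that this condition now reduces to the antisymmetry of $\omega_{\mu\nu}$
$$\omega_{\mu\nu} = -\omega_{\nu\mu}. \qquad (2.4.2)$$
Since $U(1, 0)$ carries any ray into itself, it must be proportional to the unit operator,* and by a choice of phase may be made equal to it. For an infinitesimal Lorentz transformation (2.4.1), $U(1 + \omega, \epsilon)$ must then equal 1 plus terms linear in $\omega_{\rho\sigma}$ and $\epsilon_\rho$. We write this as
$$U(1 + \omega, \epsilon) = 1 + \tfrac{1}{2} i\,\omega_{\rho\sigma} J^{\rho\sigma} - i\,\epsilon_\rho P^\rho + \cdots. \qquad (2.4.3)$$
Here $J^{\rho\sigma}$ and $P^\rho$ are operators independent of $\omega$ and $\epsilon$, and the dots denote terms of higher order in $\omega$ and/or $\epsilon$. In order for $U(1 + \omega, \epsilon)$ to be unitary, the operators $J^{\rho\sigma}$ and $P^\rho$ must be Hermitian
$$J^{\rho\sigma\dagger} = J^{\rho\sigma}, \qquad P^{\rho\dagger} = P^\rho. \qquad (2.4.4)$$
Since $\omega_{\rho\sigma}$ is antisymmetric, we can take its coefficient $J^{\rho\sigma}$ to be antisymmetric also
$$J^{\rho\sigma} = -J^{\sigma\rho}. \qquad (2.4.5)$$
As we shall see, $P^1$, $P^2$, and $P^3$ are the components of the momentum operators, $J^{23}$, $J^{31}$, and $J^{12}$ are the components of the angular momentum vector, and $P^0$ is the energy operator, or Hamiltonian.**
* In the absence of superselection rules, the possibility that the proportionality factor may depend on the state on which $U(1, 0)$ acts can be ruled out by the same reasoning that we used in Section 2.2 to rule out the possibility that the phases in projective representations of symmetry groups may depend on the states on which the symmetries act. Where superselection rules apply, it may be necessary to redefine $U(1, 0)$ by phase factors that depend on the sector on which it acts.
** We will see that this identification of the angular-momentum generators is forced on us by the commutation relations of the $J^{\mu\nu}$. On the other hand, the commutation relations do not allow us to distinguish between $P^\mu$ and $-P^\mu$, so the sign for the $\epsilon_\rho P^\rho$ term in (2.4.3) is a matter of convention. The consistency of the choice in (2.4.3) with the usual definition of the Hamiltonian $P^0$ is shown in Section 3.1.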
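We now examine the Lorentz transformation properties of $J^{\rho\sigma}$ and $P^\rho$. We consider the product
$$U(\Lambda, a)\,U(1 + \omega, \epsilon)\,U^{-1}(\Lambda, a),$$
where $\Lambda^\mu{}_\nu$ and $a^\mu$ are here the parameters of a new transformation, unrelated to $\omega$ and $\epsilon$. According to Eq. (2.3.11), the product $U(\Lambda^{-1}, -\Lambda^{-1}a)\,U(\Lambda, a)$ equals $U(1, 0)$, so $U(\Lambda^{-1}, -\Lambda^{-1}a)$ is the inverse of $U(\Lambda, a)$. It follows then from (2.3.11) that
$$U(\Lambda, a)\,U(1 + \omega, \epsilon)\,U^{-1}(\Lambda, a) = U\big(\Lambda(1 + \omega)\Lambda^{-1},\ \Lambda\epsilon - \Lambda\omega\Lambda^{-1}a\big). \qquad (2.4.6)$$
To first order in $\omega$ and $\epsilon$, we have then
$$U(\Lambda, a)\big[\tfrac{1}{2}\omega_{\rho\sigma}J^{\rho\sigma} - \epsilon_\rho P^\rho\big]U^{-1}(\Lambda, a) = \tfrac{1}{2}\big(\Lambda\omega\Lambda^{-1}\big)_{\mu\nu}J^{\mu\nu} - \big(\Lambda\epsilon - \Lambda\omega\Lambda^{-1}a\big)_\mu P^\mu. \qquad (2.4.7)$$
Equating coefficients of $\omega_{\rho\sigma}$ and $\epsilon_\rho$ on both sides of this equation (and using (2.3.10)), we find
$$U(\Lambda, a)\,J^{\rho\sigma}\,U^{-1}(\Lambda, a) = \Lambda_\mu{}^\rho \Lambda_\nu{}^\sigma\big(J^{\mu\nu} - a^\mu P^\nu + a^\nu P^\mu\big), \qquad (2.4.8)$$
$$U(\Lambda, a)\,P^\rho\,U^{-1}(\Lambda, a) = \Lambda_\mu{}^\rho P^\mu. \qquad (2.4.9)$$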
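For homogeneous Lorentz transformations (with $a^\mu = 0$), these transformation rules simply say that $J^{\mu\nu}$ is a tensor and $P^\mu$ is a vector. For pure translations (with $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu$), they tell us that $P^\rho$ is translation-invariant, but $J^{\rho\sigma}$ is not. In particular, the change of the space-space components of $J^{\rho\sigma}$ under a spatial translation is just the usual change of the angular momentum under a change of the origin relative to which the angular momentum is calculated. Next, let's apply rules (2.4.8), (2.4.9) to a transformation that is itself infinitesimal, i.e., $\Lambda^\mu{}_\nu = \delta^\mu{}_\nu + \omega^\mu{}_\nu$ and $a^\mu = \epsilon^\mu$, with infinitesimals $\omega^\mu{}_\nu$ and $\epsilon^\mu$ unrelated to the previous $\omega$ and $\epsilon$. Using Eq. (2.4.3), and keeping only terms of first order in $\omega^\mu{}_\nu$ and $\epsilon^\mu$, Eqs. (2.4.8) and (2.4.9) now become
$$i\big[\tfrac{1}{2}\omega_{\mu\nu}J^{\mu\nu} - \epsilon_\mu P^\mu,\ J^{\rho\sigma}\big] = \omega_\mu{}^\rho J^{\mu\sigma} + \omega_\nu{}^\sigma J^{\rho\nu} - \epsilon^\rho P^\sigma + \epsilon^\sigma P^\rho, \qquad (2.4.10)$$
$$i\big[\tfrac{1}{2}\omega_{\mu\nu}J^{\mu\nu} - \epsilon_\mu P^\mu,\ P^\rho\big] = \omega_\mu{}^\rho P^\mu. \qquad (2.4.11)$$
Equating coefficients of $\omega_{\mu\nu}$ and $\epsilon_\mu$ on both sides of these equations, we find the commutation rules
$$i[J^{\mu\nu}, J^{\rho\sigma}] = \eta^{\nu\rho}J^{\mu\sigma} - \eta^{\mu\rho}J^{\nu\sigma} - \eta^{\sigma\mu}J^{\rho\nu} + \eta^{\sigma\nu}J^{\rho\mu}, \qquad (2.4.12)$$
$$i[P^\mu, J^{\rho\sigma}] = \eta^{\mu\rho}P^\sigma - \eta^{\mu\sigma}P^\rho, \qquad (2.4.13)$$
$$[P^\mu, P^\rho] = 0. \qquad (2.4.14)$$
This is the Lie algebra of the Poincare group.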
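The commutation rules above can be checked in a small matrix model. The sketch below is an illustration under stated assumptions, not the operators on the physical Hilbert space: it uses the 5x5 affine matrices that send $(x, 1)$ to $(\Lambda x + a, 1)$, which obey the composition rule (2.3.8), and reads off generators with the convention $1 + \tfrac{1}{2}i\omega_{\rho\sigma}J^{\rho\sigma} - i\epsilon_\rho P^\rho$. The explicit matrix elements and all function names are choices made for this example.

```python
from itertools import product

N = 5  # four spacetime rows/columns plus one affine row/column


def eta(m, n):
    """Metric with eta^{00} = -1 and eta^{11} = eta^{22} = eta^{33} = +1."""
    if m != n:
        return 0
    return -1 if m == 0 else 1


def zeros():
    return [[0j] * N for _ in range(N)]


def J(rho, sigma):
    """(J^{rho sigma})^mu_nu = -i (eta^{rho mu} delta^sigma_nu - eta^{sigma mu} delta^rho_nu)."""
    m = zeros()
    for mu in range(4):
        for nu in range(4):
            m[mu][nu] = -1j * (eta(rho, mu) * (sigma == nu) - eta(sigma, mu) * (rho == nu))
    return m


def P(rho):
    """(P^rho)^mu_4 = i eta^{rho mu}, so that -i eps_rho P^rho shifts x^mu by eps^mu."""
    m = zeros()
    for mu in range(4):
        m[mu][4] = 1j * eta(rho, mu)
    return m


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]


def combine(terms):
    """Linear combination sum(coeff * matrix) over (coeff, matrix) pairs."""
    out = zeros()
    for coeff, mat in terms:
        for i in range(N):
            for j in range(N):
                out[i][j] += coeff * mat[i][j]
    return out


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))


def residual_jj():
    """Largest violation of Eq. (2.4.12) over all index choices."""
    worst = 0.0
    for mu, nu, rho, sigma in product(range(4), repeat=4):
        lhs = combine([(1j, comm(J(mu, nu), J(rho, sigma)))])
        rhs = combine([(eta(nu, rho), J(mu, sigma)), (-eta(mu, rho), J(nu, sigma)),
                       (-eta(sigma, mu), J(rho, nu)), (eta(sigma, nu), J(rho, mu))])
        worst = max(worst, max_abs_diff(lhs, rhs))
    return worst


def residual_pj():
    """Largest violation of Eq. (2.4.13) over all index choices."""
    worst = 0.0
    for mu, rho, sigma in product(range(4), repeat=3):
        lhs = combine([(1j, comm(P(mu), J(rho, sigma)))])
        rhs = combine([(eta(mu, rho), P(sigma)), (-eta(mu, sigma), P(rho))])
        worst = max(worst, max_abs_diff(lhs, rhs))
    return worst


def residual_pp():
    """Largest violation of Eq. (2.4.14) over all index choices."""
    return max(max_abs_diff(comm(P(mu), P(rho)), zeros())
               for mu, rho in product(range(4), repeat=2))
```

These matrices are not Hermitian (the boost generators in particular), so the model tests only the algebraic structure of Eqs. (2.4.12) to (2.4.14), not unitarity.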
In quantum mechanics a special role is played by those operators that are conserved, i.e., that commute with the energy operator $H = P^0$. Inspection of Eqs. (2.4.13) and (2.4.14) shows that these are the momentum three-vector
$$\mathbf{P} = \{P^1, P^2, P^3\} \qquad (2.4.15)$$
and the angular-momentum three-vector
$$\mathbf{J} = \{J^{23}, J^{31}, J^{12}\} \qquad (2.4.16)$$
and, of course, the energy $P^0$ itself. The remaining generators form what is called the 'boost' three-vector
$$\mathbf{K} = \{J^{10}, J^{20}, J^{30}\}. \qquad (2.4.17)$$
These are not conserved, which is why we do not use the eigenvalues of $\mathbf{K}$ to label physical states. In a three-dimensional notation, the commutation relations (2.4.12), (2.4.13), (2.4.14) may be written
$$[J_i, J_j] = i\,\epsilon_{ijk}J_k, \qquad (2.4.18)$$
$$[J_i, K_j] = i\,\epsilon_{ijk}K_k, \qquad (2.4.19)$$
$$[K_i, K_j] = -i\,\epsilon_{ijk}J_k, \qquad (2.4.20)$$
$$[J_i, P_j] = i\,\epsilon_{ijk}P_k, \qquad (2.4.21)$$
$$[K_i, P_j] = i H \delta_{ij}, \qquad (2.4.22)$$
$$[J_i, H] = [P_i, H] = [H, H] = 0, \qquad (2.4.23)$$
$$[K_i, H] = i P_i, \qquad (2.4.24)$$
where $i, j, k$, etc. run over the values 1, 2, and 3, and $\epsilon_{ijk}$ is the totally antisymmetric quantity with $\epsilon_{123} = +1$. The commutation relation (2.4.18) will be recognized as that of the angular-momentum operator. The pure translations $T(1, a)$ form a subgroup of the inhomogeneous Lorentz group with a group multiplication rule given by (2.3.7) as
$$T(1, \bar a)\,T(1, a) = T(1, \bar a + a). \qquad (2.4.25)$$
This is additive in the same sense as (2.2.24), so by using (2.4.3) and repeating the same arguments that led to (2.2.26), we find that finite translations are represented on the physical Hilbert space by
$$U(1, a) = \exp(-i P^\mu a_\mu). \qquad (2.4.26)$$
In exactly the same way, we can show that a rotation $R_\theta$ by an angle $|\theta|$ around the direction of $\boldsymbol\theta$ is represented on the physical Hilbert space by
$$U(R_\theta, 0) = \exp(i\,\mathbf{J}\cdot\boldsymbol\theta). \qquad (2.4.27)$$
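The limiting argument that leads from the generators to a finite rotation can be illustrated numerically. The sketch below assumes the 3x3 matrices $(J_i)_{jk} = -i\epsilon_{ijk}$ as a stand-in for the angular-momentum operators (an illustrative choice, not taken from the text), checks Eq. (2.4.18) for them, and compares the product $[1 + i\theta J_3/N]^N$ at large $N$ with an ordinary rotation matrix about the third axis. Whether that matrix is read as an active or a passive rotation is left open here.

```python
import math


def eps(i, j, k):
    """Totally antisymmetric symbol with eps(0, 1, 2) = +1 (indices 0..2 stand for 1..3)."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# Illustrative generators: (J_i)_{jk} = -i eps_{ijk}.
J = [[[-1j * eps(i, j, k) for k in range(3)] for j in range(3)] for i in range(3)]


def algebra_residual():
    """Largest violation of [J_i, J_j] = i eps_{ijk} J_k, Eq. (2.4.18)."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            ab, ba = matmul(J[i], J[j]), matmul(J[j], J[i])
            lhs = [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]
            rhs = [[sum(1j * eps(i, j, k) * J[k][r][c] for k in range(3))
                    for c in range(3)] for r in range(3)]
            worst = max(worst, max_abs_diff(lhs, rhs))
    return worst


def rotation_from_generator(theta, doublings=20):
    """Approximate exp(i theta J_3) by (1 + i theta J_3 / N)^N with N = 2**doublings."""
    n = 2 ** doublings
    step = [[(1.0 if r == c else 0.0) + 1j * theta * J[2][r][c] / n
             for c in range(3)] for r in range(3)]
    for _ in range(doublings):  # repeated squaring raises the step to the power N
        step = matmul(step, step)
    return step


def rotation_matrix(theta):
    """Closed form of exp(i theta J_3) for the generators defined above."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
```

The finite-$N$ product differs from the exponential by terms of order $\theta^2/N$, so the agreement improves as `doublings` is raised, until rounding error in the very small step starts to dominate.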
It is interesting to compare the Poincare algebra with the Lie algebra of the symmetry group of Newtonian mechanics, the Galilean group. We
could derive this algebra by starting with the transformation rules of the Galilean group and then following the same procedure that was used here to derive the Poincare algebra. However, since we already have Eqs. (2.4.18)-(2.4.24), it is easier to obtain the Galilean algebra as a low-velocity limit of the Poincare algebra, by what is known as an Inonu-Wigner contraction⁴,⁵. For a system of particles of typical mass $m$ and typical velocity $v$, the momentum and the angular-momentum operators are expected to be of order $\mathbf{J} \sim 1$, $\mathbf{P} \sim mv$. On the other hand, the energy operator is $H = M + W$ with a total mass $M$ and non-mass energy $W$ (kinetic plus potential) of order $M \sim m$, $W \sim mv^2$. Inspection of Eqs. (2.4.18)-(2.4.24) shows that these commutation relations have a limit for $v \ll 1$ of the form
$$[J_i, J_j] = i\,\epsilon_{ijk}J_k, \qquad [J_i, K_j] = i\,\epsilon_{ijk}K_k, \qquad [K_i, K_j] = 0,$$
$$[J_i, P_j] = i\,\epsilon_{ijk}P_k, \qquad [K_i, P_j] = i M \delta_{ij},$$
$$[J_i, W] = [P_i, W] = 0, \qquad [K_i, W] = i P_i,$$
$$[J_i, M] = [P_i, M] = [K_i, M] = [W, M] = 0,$$
with $\mathbf{K}$ of order $1/v$. Note that the product of a translation $\mathbf{x} \to \mathbf{x} + \mathbf{a}$ and a 'boost' $\mathbf{x} \to \mathbf{x} + \mathbf{v}t$ should be the transformation $\mathbf{x} \to \mathbf{x} + \mathbf{v}t + \mathbf{a}$, but this is not true for the action of these operators on Hilbert space:
$$\exp(i\mathbf{K}\cdot\mathbf{v})\exp(-i\mathbf{P}\cdot\mathbf{a}) = \exp(iM\mathbf{a}\cdot\mathbf{v}/2)\exp\big(i(\mathbf{K}\cdot\mathbf{v} - \mathbf{P}\cdot\mathbf{a})\big).$$
The appearance of the phase factor $\exp(iM\mathbf{a}\cdot\mathbf{v}/2)$ shows that this is a projective representation, with a superselection rule forbidding the superposition of states of different mass. In this respect, the mathematics of the Poincare group is simpler than that of the Galilean group. However, there is nothing to prevent us from formally enlarging the Galilean group, by adding one more generator to its Lie algebra, which commutes with all the other generators, and whose eigenvalues are the masses of the various states. In this case physical states provide an ordinary rather than a projective representation of the expanded symmetry group. The difference appears to be a mere matter of notation, except that with this reinterpretation of the Galilean group there is no need for a mass superselection rule.
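The phase factor in the product of a boost and a translation is an instance of the identity $e^A e^B = e^{[A,B]/2}\,e^{A+B}$, valid when $[A, B]$ commutes with both $A$ and $B$; here $A = i\mathbf{K}\cdot\mathbf{v}$, $B = -i\mathbf{P}\cdot\mathbf{a}$, and $[A, B] = iM\mathbf{a}\cdot\mathbf{v}$ follows from $[K_i, P_j] = iM\delta_{ij}$. The sketch below checks that identity in a toy model only: real strictly upper-triangular 3x3 matrices stand in for $A$ and $B$, so that the commutator is central and the exponential series stops after the quadratic term. The names and the sample values are assumptions made for the illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(a, b, scale_b=1.0):
    """Return a + scale_b * b for 3x3 matrices."""
    return [[a[i][j] + scale_b * b[i][j] for j in range(3)] for i in range(3)]


def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), scale_b=-1.0)


def upper(x, y, z):
    """Strictly upper-triangular matrix with entries x, y next to the diagonal and z in the corner."""
    return [[0.0, x, z], [0.0, 0.0, y], [0.0, 0.0, 0.0]]


IDENTITY = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]


def exp_nilpotent(m):
    """exp(m) = 1 + m + m^2/2, exact because m^3 = 0 for strictly upper-triangular 3x3 m."""
    return add(add(IDENTITY, m), matmul(m, m), scale_b=0.5)


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


A = upper(0.7, 0.0, 0.0)   # toy stand-in for i K.v
B = upper(0.0, 1.3, 0.0)   # toy stand-in for -i P.a
C = commutator(A, B)       # central element, toy stand-in for i M a.v

lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
half_c = [[0.5 * entry for entry in row] for row in C]
rhs = matmul(exp_nilpotent(half_c), exp_nilpotent(add(A, B)))
```

In the toy model the factor `exp_nilpotent(half_c)` plays the role of $\exp(iM\mathbf{a}\cdot\mathbf{v}/2)$: it differs from the identity exactly because the two stand-in generators fail to commute.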
Lorentz Group in Feynman's World - W i g n e r ' s Little Groups and Their Applications Y. S. Kim1 Department of Physics, University of Maryland, College Park, Maryland 20742, U.S.A. Marilyn E. Noz 2 Department of Radiology, New York University, New York, New York 10016, U.S.A.
Abstract R. P. Feynman was quite fond of inventing new physics. It is shown that some of his physical ideas can be supported by the mathematical instruments available from the Lorentz group. As a consequence, it is possible to construct a Lorentz-covariant picture of the parton model, It is shown first how the Lorentz group can be used for studying the internal space-time symmetries of relativistic particles. These symmetries are dictated by Wigner's little groups, whose transformations leave the energy-momentum four-vector of a given particle invariant. The symmetry of massive particles is like the threedimensional rotation group, while the symmetry of massless particles is locally isomorphic to the two-dimensional Euclidean group. It is noted that the E(2)like symmetry of massless particles can be obtained as an infinite-momentum and/or zero-mass limit of the 0(3)-like little group for massive particles. It is shown that the formalism can be extended to cover relativistic particles with space-time extensions, such as heavy ions and hadrons in the quark model. It is possible to construct representations of the little group using harmonic oscillators, which Feynman et al. used for studying relativistic extended hadrons. This oscillator formalism allows us to show that Feynman's parton model is a Lorentz-boosted quark model. The formalism also allows us to explain in detail Feynman's rest of the universe which is contained in his parton picture.
352 Lorentz and Poincare Invariance
1
Introduction
The role of Lorentz covariance in quantum field theory, and thus in Feynman diagrams, is well known. In this paper, we study how the Lorentz group plays a role in other physical theories initiated by Feynman. Richard Feynman and Eugene Wigner left their own legacies in physics. Their approaches to physics appear to be different. Feynman knew how to observe the real world and wrote down systematically how the physical world behaves, while Wigner was able to develop mathematical tools suitable to physics. The most controversial phenomenological observation Feynman made was his parton model. Among the many contributions Wigner made, his 1939 paper on the inhomogeneous Lorentz group is still controversial in that its physical contents are still being explored. In this report, we propose to combine Feynman's parton picture with Wigner's representation of the Lorentz group. The paper thus consists of two parts. First, we explain in detail Wigner's representation of the Lorentz group for internal space-time symmetries of relativistic particles. We then deal with Feynman's world using coupled harmonic oscillators. It is shown that his parton model can be explained within the framework of Wigner's representation.
In order to see which aspect of Wigner's work is relevant to Feynman's parton model, let us go back to Einstein. If the momentum of a particle is much smaller than its mass, the energy-momentum relation is $E = p^2/2m + mc^2$. If the momentum is much larger than the mass, the relation is $E = cp$. These two different relations can be combined into one covariant formula $E = \sqrt{m^2 + p^2}$. This aspect of Einstein's $E = mc^2$ is also well known. In addition, particles have internal space-time variables. Massive particles have spins while massless particles have their helicities and gauge degrees of freedom.
As a "further content" of Einstein's $E = mc^2$, we shall discuss how the internal space-time structures of massive and massless particles can be unified into one covariant package, as $E = \sqrt{m^2 + p^2}$ does for the energy-momentum relation. The mathematical framework of this program was developed by Eugene Wigner in 1939 [1]. He constructed the maximal subgroups of the Lorentz group whose transformations leave the four-momentum of a given particle invariant. These groups are known as Wigner's little groups. Thus, the transformations of the little groups change the internal space-time variables of the particle, while leaving its four-momentum invariant. The little group is a covariant entity and takes different forms for particles moving with different speeds. In order to achieve the zero-mass and/or infinite-momentum limit of the O(3)-like little group to obtain the E(2)-like little group, we use the group contraction technique introduced by Inönü and Wigner [2], who obtained the E(2) group by taking a flat-surface approximation of a spherical surface at the north pole. In 1987, Kim and Wigner [3] observed that it is also possible to make a cylindrical approximation of the spherical surface around the equatorial belt. While the correspondence between O(3) and the O(3)-like little group is transparent, the E(2)-like little group contains both the E(2) group and the cylindrical group [4]. We study this aspect in detail in this report.
Let us look at the world of R. P. Feynman. In his papers, he totally avoids group-theoretical language. However, whenever appropriate, his physical reasoning is consistent with the Lorentz group. Let us look at Feynman diagrams.
They are surprisingly consistent with Lorentz covariance of the S-matrix formalism of quantum field theory. This aspect is well known. In this paper, we study Feynman's papers published in 1969 [5] and 1971 [6], and the chapter on the density matrix in his 1972 book on statistical mechanics [7]. Feynman appears to be dealing with three different physical problems in these three works. However, the Lorentz group allows us to combine them into one great piece of work which includes a covariant description of Feynman's parton picture. How do we do this? In their 1971 paper [6], Feynman et al. used harmonic oscillators to work out hadronic mass spectra. Even though they used relativistic oscillators, they did not address the question of whether their formalism constitutes a representation of Wigner's little group. On the other hand, Wigner did not use harmonic oscillators too often in his papers, and Feynman did not pay enough attention to Lorentz covariance. Thus, in order to combine Feynman's initiative with Wigner's formalism, we can construct representations of the little group using harmonic oscillators. In so doing, we construct harmonic oscillator wave functions which can be Lorentz-boosted. The question then is whether we can produce new physics by combining them.
In this paper, we develop Wigner's mathematical formalism first, and we then use it to interpret Feynman's physics. In Sec. 2, we present a brief history of applications of the little groups to internal space-time symmetries of relativistic particles. It is pointed out in Sec. 3 that the translation-like transformations of the E(2)-like little group correspond to gauge transformations. In Sec. 4, we discuss the contraction of the three-dimensional rotation group to the two-dimensional Euclidean group. In Sec. 5, we discuss the little group for a massless particle as the infinite-momentum and/or zero-mass limit of the little group for a massive particle.
In Sec. 6, we move into the world of R. P. Feynman. Feynman was particularly fond of harmonic oscillators in formulating new ideas. It is pointed out that harmonic oscillators embrace many useful properties of the Lorentz group. In Sec. 7, we review the quantum mechanics of coupled harmonic oscillators in which one of them corresponds to the world in which we do physics, and the other is considered as the rest of the universe. In Sec. 8, it is shown that the time-separation variable in a two-body bound state belongs to Feynman's rest of the universe. It is shown also that Feynman's oscillator formalism includes this time-separation variable. We show in Sec. 9 that this O(1,1) formalism enables us to construct a covariant model of relativistic extended particles. As a consequence, we show that the quark and parton models are two different aspects of one covariant object. It is shown also that this parton picture exhibits the decoherence effect. From the historical point of view, we are dealing here with further contents of Einstein's energy-momentum relation. This question is addressed in Sec. 10.
2
Wigner's Little Groups
The Poincare group is the group of inhomogeneous Lorentz transformations, namely Lorentz transformations preceded or followed by space-time translations. In order to study this group, we have to understand first the group of Lorentz transformations,
the group of translations, and how these two groups are combined to form the Poincaré group. The Poincaré group is a semi-direct product of the Lorentz and translation groups. The two Casimir operators of this group correspond to the (mass)$^2$ and (spin)$^2$ of a given particle. Indeed, the particle mass and its spin magnitude are Lorentz-invariant quantities. The question then is how to construct the representations of the Lorentz group which are relevant to physics. For this purpose, Wigner in 1939 studied the subgroups of the Lorentz group whose transformations leave the four-momentum of a given free particle invariant [1]. The maximal subgroup of the Lorentz group which leaves the four-momentum invariant is called the little group. Since the little group leaves the four-momentum invariant, it governs the internal space-time symmetries of relativistic particles. Wigner shows in his paper that the internal space-time symmetries of massive and massless particles are dictated by the O(3)-like and E(2)-like little groups respectively. The O(3)-like little group is locally isomorphic to the three-dimensional rotation group, which is very familiar to us. For instance, the group SU(2) for the electron spin is an O(3)-like little group. The group E(2) is the Euclidean group in a two-dimensional space, consisting of translations and rotations on a flat surface. We perform these transformations every day when we move from home to school. The mathematics of these Euclidean transformations is also simple. However, the group of these transformations is not well known to us. In Sec. 4, we give a matrix representation of the E(2) group. The group of Lorentz transformations consists of three boosts and three rotations. The rotations therefore constitute a subgroup of the Lorentz group. If a massive particle is at rest, its four-momentum is invariant under rotations.
Thus the little group for a massive particle at rest is the three-dimensional rotation group. Then what is affected by the rotation? The answer to this question is very simple. The particle in general has its spin. The spin orientation is going to be affected by the rotation! If the rest-particle is boosted along the z direction, it will pick up a non-zero momentum component. The generators of the O(3) group will then be boosted. The boost will take the form of conjugation by the boost operator. This boost will not change the Lie algebra of the rotation group, and the boosted little group will still leave the boosted four-momentum invariant. We call this the O(3)-like little group. We realize that the standard four-vector convention is $(t, x, y, z)$, but it is more convenient to use $(x, y, z, t)$ when we study the light-cone coordinate system and group contractions. In this non-standard convention, the four-momentum vector for the particle at rest is $(0, 0, 0, m)$, and the three-dimensional rotation group leaves this four-momentum invariant. This little group is generated by
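$$
J_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
J_2 = \begin{pmatrix} 0 & 0 & i & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
J_3 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad (1)
$$
which satisfy the commutation relations:
$$
[J_i, J_j] = i\epsilon_{ijk} J_k . \qquad (2)
$$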
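As a quick check of Eqs.(1) and (2), the following minimal sketch (pure Python, with helper names `matmul`, `comm`, `scaled` and `close` that are introduced here for illustration only) multiplies the matrices out and confirms both the rotation algebra and the statement that the rest-frame four-momentum $(0, 0, 0, m)$ is left invariant, in the sense that every generator annihilates it. The mass value is an arbitrary illustrative number.

```python
# Rotation generators of Eq.(1), written in the (x, y, z, t) ordering of the text.
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(len(a))] for r in range(len(a))]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
J1 = [[0, 0, 0, 0], [0, 0, -i, 0], [0, i, 0, 0], [0, 0, 0, 0]]
J2 = [[0, 0, i, 0], [0, 0, 0, 0], [-i, 0, 0, 0], [0, 0, 0, 0]]
J3 = [[0, -i, 0, 0], [i, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

# Eq.(2): [J_i, J_j] = i eps_ijk J_k, checked for the three cyclic pairs.
algebra_ok = (close(comm(J1, J2), scaled(i, J3))
              and close(comm(J2, J3), scaled(i, J1))
              and close(comm(J3, J1), scaled(i, J2)))

# Little-group condition: each generator annihilates the rest-frame momentum (0, 0, 0, m).
m = 1.0  # illustrative mass value (assumption)
p_rest = [[0], [0], [0], [m]]
zero = [[0], [0], [0], [0]]
momentum_invariant = all(close(matmul(J, p_rest), zero) for J in (J1, J2, J3))

print(algebra_ok, momentum_invariant)
```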
It is not possible to bring a massless particle to its rest frame. In his 1939 paper [1], Wigner observed that the little group for a massless particle moving along the z axis is generated by the rotation generator around the z axis, namely $J_3$ of Eq.(1), and two other generators which take the form
$$
N_1 = \begin{pmatrix} 0 & 0 & -i & i \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad
N_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & -i & i \\ 0 & i & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}. \qquad (3)
$$
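If we use $K_i$ for the boost generator along the i-th axis, these matrices can be written as
$$
N_1 = K_1 - J_2, \quad N_2 = K_2 + J_1, \qquad (4)
$$
with
$$
K_1 = \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \end{pmatrix}, \quad
K_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}. \qquad (5)
$$
The generators $J_3$, $N_1$ and $N_2$ satisfy the following set of commutation relations:
$$
[N_1, N_2] = 0, \quad [J_3, N_1] = iN_2, \quad [J_3, N_2] = -iN_1 . \qquad (6)
$$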
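The relations in Eqs.(3)-(6) can be confirmed by direct matrix multiplication. The sketch below is a minimal pure-Python check; the helper functions and the value of `omega` are illustrative assumptions, not part of the text. It builds $N_1$ and $N_2$ from Eq.(4), compares them with Eq.(3), tests the commutators of Eq.(6), and verifies that all three generators annihilate a light-like momentum $(0, 0, \omega, \omega)$ along z.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(len(a))] for r in range(len(a))]

def combine(a, b, sign):
    """Return a + sign * b, entry by entry."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
J1 = [[0, 0, 0, 0], [0, 0, -i, 0], [0, i, 0, 0], [0, 0, 0, 0]]
J2 = [[0, 0, i, 0], [0, 0, 0, 0], [-i, 0, 0, 0], [0, 0, 0, 0]]
J3 = [[0, -i, 0, 0], [i, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
K1 = [[0, 0, 0, i], [0, 0, 0, 0], [0, 0, 0, 0], [i, 0, 0, 0]]
K2 = [[0, 0, 0, 0], [0, 0, 0, i], [0, 0, 0, 0], [0, i, 0, 0]]

# Eq.(4): N1 = K1 - J2, N2 = K2 + J1, compared with the explicit matrices of Eq.(3).
N1 = combine(K1, J2, -1)
N2 = combine(K2, J1, +1)
N1_eq3 = [[0, 0, -i, i], [0, 0, 0, 0], [i, 0, 0, 0], [i, 0, 0, 0]]
N2_eq3 = [[0, 0, 0, 0], [0, 0, -i, i], [0, i, 0, 0], [0, i, 0, 0]]
matches_eq3 = close(N1, N1_eq3) and close(N2, N2_eq3)

# Eq.(6): the E(2)-like commutation relations.
zero4 = [[0] * 4 for _ in range(4)]
e2_like = (close(comm(N1, N2), zero4)
           and close(comm(J3, N1), scaled(i, N2))
           and close(comm(J3, N2), scaled(-i, N1)))

# Little-group condition for a massless particle moving along z.
omega = 2.0  # illustrative energy (assumption)
p = [[0], [0], [omega], [omega]]
annihilates_p = all(close(matmul(G, p), [[0]] * 4) for G in (J3, N1, N2))

print(matches_eq3, e2_like, annihilates_p)
```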
In Sec. 4, we discuss the generators of the E(2) group. They are $J_3$, which generates rotations around the z axis, and $P_1$ and $P_2$, which generate translations along the x and y directions respectively. If we replace $N_1$ and $N_2$ by $P_1$ and $P_2$, the above set of commutation relations becomes the set for the E(2) group given in Eq.(18). This is the reason why we say the little group for massless particles is E(2)-like. Very clearly, the matrices $N_1$ and $N_2$ generate Lorentz transformations. It is not difficult to associate the rotation generator $J_3$ with the helicity degree of freedom of the massless particle. Then what physical variable is associated with the $N_1$ and $N_2$ generators? Indeed, Wigner was the one who discovered the existence of these generators, but he did not give any physical interpretation to these translation-like generators. For this reason, for many years, only those representations with zero eigenvalues of the N operators were thought to be physically meaningful [8]. It was not until 1971 that Janner and Janssen reported that the transformations generated by these operators are gauge transformations [9, 10, 12]. The role of this translation-like transformation has also been studied for spin-1/2 particles, and it was concluded that the polarization of neutrinos is due to gauge invariance [11, 13].
Another important development along this line of research is the application of group contractions to the unification of the two different little groups for massive and massless particles. We always associate the three-dimensional rotation group with a spherical surface. Let us consider a circular area of radius 1 kilometer centered on the north pole of the earth. Since the radius of the earth is more than 6,450 times longer, the circular region appears flat. Thus, within this region, we can use the E(2) symmetry group. The validity of this approximation depends on the ratio of the two radii.
In 1953, Inönü and Wigner formulated this problem as the contraction of O(3) to E(2) [2]. How about then the little groups which are isomorphic to O(3) and E(2)? It is reasonable to expect that the E(2)-like little group for massless particles can be obtained as a limiting case of the O(3)-like little group. In 1981, it was observed by Ferrara and Savoy that this limiting process is the Lorentz boost [14]. In 1983, using the same limiting process as that of Ferrara and Savoy, Han et al. showed that transverse rotation generators become the generators of gauge transformations in the limit of infinite momentum and/or zero mass [15]. In 1987, Kim and Wigner showed that the little group for massless particles is the cylindrical group, which is isomorphic to the E(2) group [3]. This is illustrated in Fig. 1.
[Figure 1 diagram: the north-pole limit leads to E(2) and the equatorial-belt limit leads to the cylindrical group, both taken as $R \to \infty$; E(2) and the cylindrical group are marked "Isomorphic", and the cylindrical group and the little groups are marked "Identical".]
Figure 1: Contraction of O(3) to E(2) and to the cylindrical group, and contraction of the O(3)-like little group to the E(2)-like little group. The correspondence between E(2) and the E(2)-like little group is isomorphic but not identical. The cylindrical group is identical to the E(2)-like little group. The Lorentz boost of the O(3)-like little group for a massive particle is the same as the contraction of O(3) to the cylindrical group.
3
Translations and Gauge Transformations
It is possible to get the hint that the N operators generate gauge transformations from Weinberg's 1964 papers [8, 11]. But it was not until 1971 that Janner and Janssen explicitly demonstrated that they generate gauge transformations [9, 10]. In order to fully appreciate their work, let us compute the transformation matrix
$$
\exp\left(-i(uN_1 + vN_2)\right) \qquad (7)
$$
generated by $N_1$ and $N_2$. Then the four-by-four matrix takes the form
$$
\begin{pmatrix} 1 & 0 & -u & u \\ 0 & 1 & -v & v \\ u & v & 1 - (u^2 + v^2)/2 & (u^2 + v^2)/2 \\ u & v & -(u^2 + v^2)/2 & 1 + (u^2 + v^2)/2 \end{pmatrix}. \qquad (8)
$$
If we apply this matrix to the four-momentum vector
$$
p = (0, 0, \omega, \omega) \qquad (9)
$$
of a massless particle, the momentum remains invariant. It therefore satisfies the condition for the little group. If we apply this matrix to the electromagnetic four-potential
$$
A = (A_1, A_2, A_3, A_0)\exp\left(i(kz - \omega t)\right), \qquad (10)
$$
with $A_3 = A_0$, which is the Lorentz condition, the result is a gauge transformation. This is what Janner and Janssen discovered in their 1971 and 1972 papers [9]. Thus the matrices $N_1$ and $N_2$ generate gauge transformations.
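The step from Eq.(7) to Eq.(8) can be reproduced numerically. In the sketch below (pure Python; the numerical values of `u`, `v`, `omega` and the potential components are illustrative assumptions), the exponent $M = -i(uN_1 + vN_2)$ turns out to satisfy $M^3 = 0$, so the exponential series stops at $1 + M + M^2/2$. The code then applies the result to the momentum of Eq.(9) and to the amplitude of a potential with $A_3 = A_0$; the potential changes only by a common shift $uA_1 + vA_2$ in its third and fourth components, that is, by a multiple of $(0, 0, 1, 1)$, which is proportional to the four-momentum.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
N1 = [[0, 0, -i, i], [0, 0, 0, 0], [i, 0, 0, 0], [i, 0, 0, 0]]  # Eq.(3)
N2 = [[0, 0, 0, 0], [0, 0, -i, i], [0, i, 0, 0], [0, i, 0, 0]]  # Eq.(3)

u, v = 0.3, -0.7  # illustrative parameters (assumption)

# Exponent of Eq.(7): M = -i (u N1 + v N2).
M = [[-i * (u * N1[r][c] + v * N2[r][c]) for c in range(4)] for r in range(4)]
M2 = matmul(M, M)
M3 = matmul(M2, M)
nilpotent = close(M3, [[0] * 4 for _ in range(4)])

# Because M^3 = 0, the exponential series terminates: exp(M) = 1 + M + M^2 / 2.
D = [[(1 if r == c else 0) + M[r][c] + M2[r][c] / 2 for c in range(4)] for r in range(4)]

s = (u * u + v * v) / 2
D_eq8 = [[1, 0, -u, u],
         [0, 1, -v, v],
         [u, v, 1 - s, s],
         [u, v, -s, 1 + s]]
matches_eq8 = close(D, D_eq8)

# Eq.(9): the light-like momentum is unchanged.
omega = 2.0
p = [[0], [0], [omega], [omega]]
momentum_invariant = close(matmul(D, p), p)

# Eq.(10) amplitude with A3 = A0: only a common shift along (0, 0, 1, 1) appears.
A1, A2, A0 = 0.5, -1.2, 2.0
A = [[A1], [A2], [A0], [A0]]
shift = u * A1 + v * A2
gauge_shift_only = close(matmul(D, A), [[A1], [A2], [A0 + shift], [A0 + shift]])

print(nilpotent, matches_eq8, momentum_invariant, gauge_shift_only)
```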
4
Contraction of O(3) to E(2)
In this section, we explain what the E(2) group is. We then explain how we can obtain this group from the three-dimensional rotation group by making a flat-surface or cylindrical approximation. This contraction procedure will give a clue to obtaining the E(2)-like symmetry for massless particles from the O(3)-like symmetry for massive particles by taking the infinite-momentum limit. The E(2) transformations consist of a rotation and two translations on a flat plane. Let us start with the rotation matrix applicable to the column vector $(x, y, 1)$:
$$
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (11)
$$
Let us then consider the translation matrix:
$$
T(a, b) = \begin{pmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix}. \qquad (12)
$$
If we take the product $T(a, b)R(\theta)$, we obtain
$$
T(a, b)R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & a \\ \sin\theta & \cos\theta & b \\ 0 & 0 & 1 \end{pmatrix}. \qquad (13)
$$
This is the Euclidean transformation matrix applicable to the two-dimensional xy plane. The matrices $R(\theta)$ and $T(a, b)$ represent the rotation and translation subgroups respectively. The above expression is not a direct product because $R(\theta)$ does not commute with $T(a, b)$. The translations constitute an Abelian invariant subgroup because two different T matrices commute with each other, and because
$$
R(\theta)T(a, b)R^{-1}(\theta) = T(a', b'). \qquad (14)
$$
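A minimal numerical illustration of Eqs.(11)-(14) follows, in pure Python with illustrative values of `theta`, `a` and `b` (all assumptions of this sketch). It confirms the product of Eq.(13), shows that conjugating a translation by a rotation gives another translation whose parameters $(a', b')$ are simply the rotated $(a, b)$, and shows that conjugating a rotation by a translation leaves a non-zero translation part, so the result is not a pure rotation about the origin.

```python
import math

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def R(theta):
    """Rotation matrix of Eq.(11), acting on the column vector (x, y, 1)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def T(a, b):
    """Translation matrix of Eq.(12)."""
    return [[1, 0, a], [0, 1, b], [0, 0, 1]]

theta, a, b = 0.4, 1.5, -0.5  # illustrative values (assumption)
c, s = math.cos(theta), math.sin(theta)

# Eq.(13): the product T(a, b) R(theta).
product_ok = close(matmul(T(a, b), R(theta)), [[c, -s, a], [s, c, b], [0, 0, 1]])

# Eq.(14): R T R^{-1} is again a translation, with (a', b') the rotated (a, b).
a_rot, b_rot = a * c - b * s, a * s + b * c
conj_translation = matmul(matmul(R(theta), T(a, b)), R(-theta))
translations_invariant = close(conj_translation, T(a_rot, b_rot))

# T R T^{-1} keeps a non-zero translation column, so it is not of the form R(theta').
conj_rotation = matmul(matmul(T(a, b), R(theta)), T(-a, -b))
rotation_not_invariant = abs(conj_rotation[0][2]) > 1e-6 or abs(conj_rotation[1][2]) > 1e-6

print(product_ok, translations_invariant, rotation_not_invariant)
```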
The rotation subgroup is not invariant because the conjugation
$$
T(a, b)R(\theta)T^{-1}(a, b) \qquad (15)
$$
does not lead to another rotation. We can write the above transformation matrix in terms of generators. The rotation is generated by
$$
J_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \qquad (16)
$$
The translations are generated by
$$
P_1 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
P_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & i \\ 0 & 0 & 0 \end{pmatrix}. \qquad (17)
$$
These generators satisfy the commutation relations:
$$
[P_1, P_2] = 0, \quad [J_3, P_1] = iP_2, \quad [J_3, P_2] = -iP_1 . \qquad (18)
$$
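The generators of Eqs.(16)-(17) can be tested against Eq.(18) in a few lines. The sketch below is pure Python with illustrative helper names and parameter values (assumptions of this example). It also uses the sign convention $\exp(-i(aP_1 + bP_2))$, taken over from Eq.(7) as an assumption, to show that the generators reproduce the translation matrix of Eq.(12); since $(aP_1 + bP_2)^2 = 0$, the exponential reduces to $1 - i(aP_1 + bP_2)$.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(len(a))] for r in range(len(a))]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
J3 = [[0, -i, 0], [i, 0, 0], [0, 0, 0]]  # Eq.(16)
P1 = [[0, 0, i], [0, 0, 0], [0, 0, 0]]   # Eq.(17)
P2 = [[0, 0, 0], [0, 0, i], [0, 0, 0]]   # Eq.(17)
zero3 = [[0] * 3 for _ in range(3)]

# Eq.(18): the Lie algebra of E(2).
e2_algebra = (close(comm(P1, P2), zero3)
              and close(comm(J3, P1), scaled(i, P2))
              and close(comm(J3, P2), scaled(-i, P1)))

# Finite translation: (a P1 + b P2)^2 = 0, so exp(-i(a P1 + b P2)) = 1 - i(a P1 + b P2).
a, b = 1.5, -0.5  # illustrative values (assumption)
G = [[a * P1[r][c] + b * P2[r][c] for c in range(3)] for r in range(3)]
square_vanishes = close(matmul(G, G), zero3)
finite = [[(1 if r == c else 0) - i * G[r][c] for c in range(3)] for r in range(3)]
reproduces_T = close(finite, [[1, 0, a], [0, 1, b], [0, 0, 1]])

print(e2_algebra, square_vanishes, reproduces_T)
```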
This E(2) group is not only convenient for illustrating the groups containing an Abelian invariant subgroup, but also occupies an important place in constructing representations for the little group for massless particles, since the little group for massless particles is locally isomorphic to the above E(2) group. The contraction of O(3) to E(2) is well known and is often called the Inönü-Wigner contraction [2]. The question is whether the E(2)-like little group can be obtained from the O(3)-like little group. In order to answer this question, let us closely look at the original form of the Inönü-Wigner contraction. We start with the generators of O(3). The $J_3$ matrix is given in Eq.(1), and
$$
J_2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad
J_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \qquad (19)
$$
The Euclidean group E(2) is generated by $J_3$, $P_1$ and $P_2$, and their Lie algebra has been discussed in Sec. 1. Let us transpose the Lie algebra of the E(2) group. Then $P_1$ and $P_2$ become $Q_1$ and $Q_2$ respectively, where
$$
Q_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad
Q_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & i & 0 \end{pmatrix}. \qquad (20)
$$
Together with $J_3$, these generators satisfy the same set of commutation relations as that for $J_3$, $P_1$, and $P_2$ given in Eq.(18):
$$
[Q_1, Q_2] = 0, \quad [J_3, Q_1] = iQ_2, \quad [J_3, Q_2] = -iQ_1 . \qquad (21)
$$
These matrices generate transformations of a point on a circular cylinder. Rotations around the cylindrical axis are generated by $J_3$. The matrices $Q_1$ and $Q_2$ generate translations along the direction of the z axis. The group generated by these three matrices is called the cylindrical group [3, 4]. We can achieve the contractions to the Euclidean and cylindrical groups by taking the large-radius limits of
$$
P_1 = \frac{1}{R} B^{-1} J_2 B, \quad P_2 = -\frac{1}{R} B^{-1} J_1 B, \qquad (22)
$$
and
$$
Q_1 = -\frac{1}{R} B J_2 B^{-1}, \quad Q_2 = \frac{1}{R} B J_1 B^{-1}, \qquad (23)
$$
where
$$
B(R) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & R \end{pmatrix}. \qquad (24)
$$
The vector spaces to which the above generators are applicable are $(x, y, z/R)$ and $(x, y, Rz)$ for the Euclidean and cylindrical groups respectively. They can be regarded as the north-pole and equatorial-belt approximations of the spherical surface respectively [3]. Fig. 2 illustrates how the Euclidean and cylindrical contractions are made.
Figure 2: North-pole and equatorial-belt approximations. The north-pole approximation leads to the contraction of O(3) to E(2). The equatorial-belt approximation corresponds to the contraction to the cylindrical group.
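The large-radius limits in Eqs.(22)-(24) can be watched numerically. The sketch below (pure Python) evaluates the four combinations at a large but finite R and compares them with the matrices of Eqs.(17) and (20). Two points are assumptions of this example rather than statements of the text: the three-by-three $J_1$ is taken to be the upper-left block of the $J_1$ in Eq.(1), and the finite value $R = 10^4$ stands in for the limit. The residual difference is of order $1/R^2$.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def max_deviation(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
# Assumed 3x3 form of J1 (upper-left block of Eq.(1)); J2 as in Eq.(19).
J1 = [[0, 0, 0], [0, 0, -i], [0, i, 0]]
J2 = [[0, 0, i], [0, 0, 0], [-i, 0, 0]]

# Contraction targets: Eq.(17) and Eq.(20).
P1 = [[0, 0, i], [0, 0, 0], [0, 0, 0]]
P2 = [[0, 0, 0], [0, 0, i], [0, 0, 0]]
Q1 = [[0, 0, 0], [0, 0, 0], [i, 0, 0]]
Q2 = [[0, 0, 0], [0, 0, 0], [0, i, 0]]

def contracted(R):
    """Evaluate the right-hand sides of Eqs.(22) and (23) at finite R."""
    B = [[1, 0, 0], [0, 1, 0], [0, 0, R]]          # Eq.(24)
    Binv = [[1, 0, 0], [0, 1, 0], [0, 0, 1.0 / R]]
    p1 = scaled(1.0 / R, matmul(matmul(Binv, J2), B))
    p2 = scaled(-1.0 / R, matmul(matmul(Binv, J1), B))
    q1 = scaled(-1.0 / R, matmul(matmul(B, J2), Binv))
    q2 = scaled(1.0 / R, matmul(matmul(B, J1), Binv))
    return p1, p2, q1, q2

R = 1.0e4  # large but finite radius standing in for the limit
deviations = [max_deviation(got, want)
              for got, want in zip(contracted(R), (P1, P2, Q1, Q2))]
converged = all(d < 2.0 / R**2 for d in deviations)

print(deviations, converged)
```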
5
Contraction of O(3)-like Little Group to E(2)-like Little Group
Since $P_1$ ($P_2$) commutes with $Q_2$ ($Q_1$), we can consider the following combination of generators:
$$
F_1 = P_1 + Q_1, \quad F_2 = P_2 + Q_2 . \qquad (25)
$$
Then these operators also satisfy the commutation relations:
$$
[F_1, F_2] = 0, \quad [J_3, F_1] = iF_2, \quad [J_3, F_2] = -iF_1 . \qquad (26)
$$
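However, we cannot make this addition using the three-by-three matrices for $P_i$ and $Q_i$ to construct three-by-three matrices for $F_1$ and $F_2$, because the vector spaces are different for the $P_i$ and $Q_i$ representations. We can accommodate this difference by creating two different z coordinates, one with a contracted z and the other with an expanded z, namely $(x, y, Rz, z/R)$. Then the generators become
$$
P_1 = \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
P_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad (27)
$$
and
$$
Q_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
Q_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (28)
$$
Then $F_1$ and $F_2$ will take the form
$$
F_1 = \begin{pmatrix} 0 & 0 & 0 & i \\ 0 & 0 & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad
F_2 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (29)
$$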
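The four-by-four matrices of Eqs.(27)-(29) can be checked against Eq.(26) directly. The following pure-Python sketch (helper names are illustrative assumptions) builds $F_1$ and $F_2$ as the sums of Eq.(25), confirms that $P_1$ commutes with $Q_2$ and $P_2$ with $Q_1$ in this enlarged space, and verifies the E(2)-like commutators with the four-by-four $J_3$ of Eq.(1).

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(len(a))] for r in range(len(a))]

def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled(factor, a):
    return [[factor * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

i = 1j
zero4 = [[0] * 4 for _ in range(4)]
J3 = [[0, -i, 0, 0], [i, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]  # Eq.(1)
P1 = [[0, 0, 0, i], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]   # Eq.(27)
P2 = [[0, 0, 0, 0], [0, 0, 0, i], [0, 0, 0, 0], [0, 0, 0, 0]]   # Eq.(27)
Q1 = [[0, 0, 0, 0], [0, 0, 0, 0], [i, 0, 0, 0], [0, 0, 0, 0]]   # Eq.(28)
Q2 = [[0, 0, 0, 0], [0, 0, 0, 0], [0, i, 0, 0], [0, 0, 0, 0]]   # Eq.(28)

# Eq.(25): F_i = P_i + Q_i, compared with the explicit form of Eq.(29).
F1, F2 = added(P1, Q1), added(P2, Q2)
F1_eq29 = [[0, 0, 0, i], [0, 0, 0, 0], [i, 0, 0, 0], [0, 0, 0, 0]]
F2_eq29 = [[0, 0, 0, 0], [0, 0, 0, i], [0, i, 0, 0], [0, 0, 0, 0]]
matches_eq29 = close(F1, F1_eq29) and close(F2, F2_eq29)

# P1 commutes with Q2, and P2 with Q1, in the (x, y, Rz, z/R) space.
cross_commute = close(comm(P1, Q2), zero4) and close(comm(P2, Q1), zero4)

# Eq.(26): E(2)-like commutation relations for J3, F1, F2.
e2_like = (close(comm(F1, F2), zero4)
           and close(comm(J3, F1), scaled(i, F2))
           and close(comm(J3, F2), scaled(-i, F1)))

print(matches_eq29, cross_commute, e2_like)
```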
The rotation generator $J_3$ takes the form of Eq.(1). These four-by-four matrices satisfy the E(2)-like commutation relations of Eq.(26). Now the B matrix of Eq.(24) can be expanded to
$$
B(R) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & R & 0 \\ 0 & 0 & 0 & 1/R \end{pmatrix}. \qquad (30)
$$
This matrix includes both the contraction and expansion in the light-cone coordinate system, as illustrated in Fig. 3. If we make a similarity transformation on the above form using the matrix
$$
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 0 & 0 & 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}, \qquad (31)
$$
which performs a 45-degree rotation of the third and fourth coordinates, then this matrix becomes
$$
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \cosh\eta & \sinh\eta \\ 0 & 0 & \sinh\eta & \cosh\eta \end{pmatrix}, \qquad (32)
$$
with $R = e^{\eta}$. This form is the Lorentz boost matrix along the z direction.
Figure 3: Light-cone coordinates. When the system is Lorentz-boosted, one of the axes expands while the other becomes contracted. Both the expansion and contraction are needed for the contraction of the O(3)-like little group to the E(2)-like little group.
If we start with the set of expanded rotation generators $J_3$ of Eq.(1), and perform the same operation as the original Inönü-Wigner contraction given in Eq.(22), the result is
$$
N_1 = -\frac{1}{R} B^{-1} J_2 B, \quad N_2 = \frac{1}{R} B^{-1} J_1 B, \qquad (33)
$$
where $N_1$ and $N_2$ are given in Eq.(3). The generators $N_1$ and $N_2$ are the contracted $J_2$ and $J_1$ respectively in the infinite-momentum and/or zero-mass limit. It was noted in Sec. 3 that $N_1$ and $N_2$ generate gauge transformations on massless particles. Thus the contraction of the transverse rotations leads to gauge transformations. We have seen in this section that Wigner's O(3)-like little group can be contracted into the E(2)-like little group for massless particles. Here, we worked out
explicitly for the spin-1 case, but this mechanism should be applicable to all other spins. Of particular interest are spin-1/2 particles. This case has been studied by Han, Kim and Son [11]. They noted that there are also gauge transformations for spin-1/2 particles, and that the polarization of neutrinos is a consequence of gauge invariance. It has also been shown that the gauge dependence of spin-1 particles can be traced to the gauge variable of the spin-1/2 particle [16]. It would be very interesting to see how the present formalism is applicable to higher-spin particles. Another case of interest is the space-time symmetry of relativistic extended particles. In 1973 [17], the present authors constructed a ground-state harmonic oscillator wave function which can be Lorentz-boosted. It was later found that this oscillator formalism can be extended to represent the O(3)-like little group [18, 19]. This oscillator formalism has a stormy history because it ultimately plays a pivotal role in combining quantum mechanics and special relativity [20, 21]. With these wave functions, we propose to solve the following problem in high-energy physics. The quark model works well when hadrons are at rest or move slowly. However, when they move with speed close to that of light, they appear as a collection of an infinite number of partons [5]. The question then is whether the parton model is a Lorentz-boosted quark model. This question has been addressed before [22, 23], but it can generate more interesting problems [24]. The present situation is presented in Table 1.
Table 1: Massive and massless particles in one package. Wigner's little group unifies the internal space-time symmetries for massive and massless particles. It is a great challenge for us to find another unification: the unification of the quark and parton pictures in high-energy physics.
                                  Massive, Slow      COVARIANCE                          Massless, Fast
Energy-Momentum                   E = p^2/2m         Einstein's E = [p^2 + m^2]^{1/2}    E = p
Internal Space-time Symmetry      S_3; S_1, S_2      Wigner's Little Group               S_3; Gauge Trans.
Relativistic Extended Particles   Quark Model        One Covariant Theory                Parton Model
We are now ready to consider the third row of Table 1. In this table, we would like to say that the quark model and the parton model are two different manifestations of one covariant entity. In order to appreciate fully this covariant aspect, let us examine Feynman's style of doing physics.
6
Feynman's World
Feynman was quite fond of using harmonic oscillators to probe new territories of physics. In this section, we examine which oscillator formalism is most suitable to interpret some of Feynman's papers during the period 1969-1972. This formalism should accommodate special relativity and the quantum mechanics of extended objects. Let us start with a simple physical system. Two coupled harmonic oscillators play many important roles in physics. In group theory, this system generates a symmetry group as rich as O(3,3) [25], which has many interesting subgroups useful in all branches of physics. The group O(3,1) is of course essential for studying covariance in special relativity. It is applicable to three space-like variables and one time-like variable. In the harmonic oscillator regime, those three space-like coordinates are separable. Thus, it is possible to separate longitudinal and transverse coordinates. If we leave out the transverse coordinates, which do not participate in Lorentz boosts, the only relevant variables are the longitudinal and time-like variables. The symmetry group for this case is O(1,1), easily derivable from the Hamiltonian for the two-oscillator system. It is widely known that this simple mathematical device is the basic language for two-photon coherent states known as squeezed states of light [26, 27]. However, this O(1,1) device plays a much more powerful role in physics. According to Feynman, the adventure of our science of physics is a perpetual attempt to recognize that the different aspects of nature are really different aspects of the same thing [28]. Feynman wrote many papers on different subjects of physics, but they are coming from one paper according to him. We are not able to combine all of his papers, but we can consider three of his works published during the period 1969-72.
In this paper, we would like to consider Feynman's 1969 report on partons [5], the 1971 paper he published with his students on the quark model based on harmonic oscillators [6], and the chapter on the density matrix in his 1972 book on statistical mechanics [7]. In these three different works, Feynman deals with three distinct aspects of nature. We shall see whether Feynman was saying the same thing in them. For this purpose, we shall use the O(1,1) symmetry derivable directly from the Hamiltonian for two coupled oscillators [29]. The standard procedure for this two-oscillator system is to separate the Hamiltonian using normal coordinates. The transformation to the normal coordinate system becomes very simple if the two oscillators are identical. We shall use this simple mathematics to find a common ground for the above-mentioned articles written by Feynman. First, let us look at Feynman's book on statistical mechanics [7]. He makes the following statement about the density matrix.
"When we solve a quantum-mechanical problem, what we really do is divide the universe into two parts - the system in which we are interested and the rest of the universe. We then usually act as if the system in which we are interested comprised the entire universe. To motivate the use of density matrices, let us see what happens when we include the part of the universe outside the system."
In order to see clearly what Feynman had in mind, we use the above-mentioned coupled oscillators. One of the oscillators is the world in which we are interested
and the other oscillator as the rest of the universe. There will be no effects on the first oscillator if the system is decoupled. Once coupled, we need a normal coordinate system in order to separate the Hamiltonian. Then it is straightforward to write down the wave function of the system. Then the mathematics of this oscillator system is directly applicable to Lorentz-boosted harmonic oscillator wave functions, where one variable is the longitudinal coordinate and the other is the time variable. The system is uncoupled if the oscillator wave function is at rest, but the coupling becomes stronger as the oscillator is boosted to a high-speed Lorentz frame [19]. We shall then note that for a two-body system, such as the hydrogen atom, there is a time-separation variable which is to be linearly mixed with the longitudinal space-separation variable. This space-separation variable is known as the Bohr radius, but we never talk about the time-separation variable in the present form of quantum mechanics, because this time-separation variable belongs to Feynman's rest of the universe. If we pretend not to know this time-separation variable, the entropy of the system will increase when the oscillator is boosted to a high-speed system [24]. Does this increase in entropy correspond to decoherence? Not necessarily. However, in 1969, Feynman observed the parton effect in which a rapidly moving hadron appears as a collection of incoherent partons [5]. This is the decoherence mechanism of current interest.
7
Two Coupled Oscillators
Two coupled harmonic oscillators serve many different purposes in physics. It is well known that this oscillator problem can be formulated into a problem of a quadratic equation in two variables. To make a long story short, let us consider a system of two identical oscillators coupled together by a spring. The Hamiltonian is
$$
H = \frac{1}{2m}\left\{p_1^2 + p_2^2\right\} + \frac{1}{2}\left\{K\left(x_1^2 + x_2^2\right) + 2Cx_1x_2\right\}. \qquad (34)
$$
We are now ready to decouple this Hamiltonian by making the coordinate rotation:
$$
y_1 = \frac{1}{\sqrt{2}}(x_1 - x_2), \quad y_2 = \frac{1}{\sqrt{2}}(x_1 + x_2). \qquad (35)
$$
In terms of this new set of variables, the Hamiltonian can be written as
$$
H = \frac{1}{2m}\left\{p_1^2 + p_2^2\right\} + \frac{K}{2}\left\{e^{2\eta}y_1^2 + e^{-2\eta}y_2^2\right\}, \qquad (36)
$$
with
$$
\exp(\eta) = \sqrt{(K + C)/(K - C)}. \qquad (37)
$$
Thus $\eta$ measures the strength of the coupling. If $y_1$ and $y_2$ are measured in units of $(mK)^{1/4}$, the ground-state wave function of this oscillator system is
$$
\psi_0(x_1, x_2) = \frac{1}{\sqrt{\pi}}\exp\left\{-\frac{1}{2}\left(e^{\eta}y_1^2 + e^{-\eta}y_2^2\right)\right\}. \qquad (38)
$$
The wave function is separable in the $y_1$ and $y_2$ variables. However, for the variables $x_1$ and $x_2$, the story is quite different.
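The decoupling step of Eq.(35) can be illustrated with a short numerical sketch. It treats the potential of Eq.(34) as the quadratic form $\frac{1}{2}x^T V x$ with $V$ having $K$ on the diagonal and $C$ off the diagonal, and shows that the rotation to $(y_1, y_2)$ removes the cross term. The values of `K` and `C` are illustrative assumptions, and the sketch checks only the structure of the rotated potential (its diagonal entries $K - C$ and $K + C$ and their ratio); it does not address the unit conventions behind Eqs.(36) and (38).

```python
import math

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

K, C = 2.0, 1.0  # illustrative spring constants with K > C (assumption)

# Potential of Eq.(34) as a quadratic form: V(x) = (1/2) x^T V x.
V = [[K, C], [C, K]]

# Inverse of Eq.(35): x1 = (y1 + y2)/sqrt(2), x2 = (-y1 + y2)/sqrt(2), i.e. x = O y.
s = 1.0 / math.sqrt(2.0)
O = [[s, s], [-s, s]]
O_transposed = [[s, -s], [s, s]]

# In the y coordinates the quadratic form becomes O^T V O, which is diagonal.
V_y = matmul(matmul(O_transposed, V), O)
decoupled = close(V_y, [[K - C, 0.0], [0.0, K + C]])

# The ratio of the two normal-mode spring constants is the square of Eq.(37).
exp_eta = math.sqrt((K + C) / (K - C))
ratio_matches = abs(V_y[1][1] / V_y[0][0] - exp_eta ** 2) < 1e-12

print(V_y, decoupled, ratio_matches)
```

With C set to zero the two diagonal entries coincide and $\exp(\eta) = 1$, which is the uncoupled case.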
The key question is how quantum mechanical calculations in the world of the observed variable are affected when we average over the other variable. The $x_2$ space in this case corresponds to Feynman's rest of the universe, if we only consider quantum mechanics in the $x_1$ space. As was discussed in the literature for several different purposes [27, 19], the wave function of Eq.(38) can be expanded as
$$
\psi_\eta(x_1, x_2) = \frac{1}{\cosh(\eta/2)}\sum_k \left(\tanh\frac{\eta}{2}\right)^k \phi_k(x_1)\phi_k(x_2). \qquad (39)
$$
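The expansion of Eq.(39) can be compared numerically with the closed form of Eq.(38). The sketch below makes two assumptions that the text does not spell out: $\phi_k$ is taken to be the normalized harmonic-oscillator eigenfunction in dimensionless units, generated here by the standard three-term recurrence, and the infinite sum is truncated at `kmax` terms. The sample point and the value of `eta` are arbitrary illustrative choices.

```python
import math

def oscillator_functions(x, kmax):
    """Normalized oscillator eigenfunctions phi_0 .. phi_kmax at the point x."""
    phi = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if kmax >= 1:
        phi.append(math.sqrt(2.0) * x * phi[0])
    for k in range(1, kmax):
        # phi_{k+1} = sqrt(2/(k+1)) x phi_k - sqrt(k/(k+1)) phi_{k-1}
        phi.append(math.sqrt(2.0 / (k + 1)) * x * phi[k]
                   - math.sqrt(k / (k + 1)) * phi[k - 1])
    return phi

def psi_closed(x1, x2, eta):
    """Gaussian form of Eq.(38), with y1, y2 from Eq.(35)."""
    y1 = (x1 - x2) / math.sqrt(2.0)
    y2 = (x1 + x2) / math.sqrt(2.0)
    return math.exp(-0.5 * (math.exp(eta) * y1 * y1 + math.exp(-eta) * y2 * y2)) / math.sqrt(math.pi)

def psi_series(x1, x2, eta, kmax=80):
    """Truncated sum of Eq.(39)."""
    t = math.tanh(eta / 2.0)
    f1 = oscillator_functions(x1, kmax)
    f2 = oscillator_functions(x2, kmax)
    total = sum(t ** k * f1[k] * f2[k] for k in range(kmax + 1))
    return total / math.cosh(eta / 2.0)

eta, x1, x2 = 1.0, 0.7, -0.4  # illustrative values (assumption)
difference = abs(psi_closed(x1, x2, eta) - psi_series(x1, x2, eta))
expansion_agrees = difference < 1e-12

print(difference, expansion_agrees)
```

At $\eta = 0$ only the $k = 0$ term survives and the wave function factorizes into $\phi_0(x_1)\phi_0(x_2)$.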
This expansion corresponds to the two-photon coherent states in Yuen's paper [26], and the wave function of Eq.(38) is a squeezed wave function [27]. The question then is what lessons we can learn from the situation in which we average over the $x_2$ variable. In order to study this problem, we use the density matrix. From this wave function, we can construct the pure-state density matrix
$$\rho(x_1, x_2; x_1', x_2') = \psi_\eta(x_1, x_2)\,\psi_\eta(x_1', x_2'). \tag{40}$$
If we are not able to make observations on $x_2$, we should take the trace of the $\rho$ matrix with respect to the $x_2$ variable. Then the resulting density matrix is
$$\rho(x, x') = \int \psi_\eta(x, x_2)\left\{\psi_\eta(x', x_2)\right\}^* dx_2. \tag{41}$$
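We have for simplicity replaced $x_1$ and $x_1'$ by $x$ and $x'$ respectively. If we perform the integral of Eq.(41), the result is
$$\rho(x, x') = \left(\frac{1}{\cosh(\eta/2)}\right)^2 \sum_k \left(\tanh\frac{\eta}{2}\right)^{2k} \phi_k(x)\,\phi_k^*(x'), \tag{42}$$
which leads to $\mathrm{Tr}(\rho) = 1$. It is also straightforward to compute the integral for $\mathrm{Tr}(\rho^2)$. The calculation leads to
$$\mathrm{Tr}(\rho^2) = \left(\frac{1}{\cosh(\eta/2)}\right)^4 \sum_k \left(\tanh\frac{\eta}{2}\right)^{4k}. \tag{43}$$
The sum of this series is $1/\cosh\eta$, which is smaller than one if the parameter $\eta$ does not vanish. This is of course due to the fact that we are averaging over the $x_2$ variable which we do not measure. The standard way to measure this ignorance is to calculate the entropy defined as
$$S = -\mathrm{Tr}\left(\rho \ln(\rho)\right), \tag{44}$$
where $S$ is measured in units of Boltzmann's constant. If we use the density matrix given in Eq.(42), the entropy becomes
$$S = 2\left\{\cosh^2\left(\frac{\eta}{2}\right)\ln\left(\cosh\frac{\eta}{2}\right) - \sinh^2\left(\frac{\eta}{2}\right)\ln\left(\sinh\frac{\eta}{2}\right)\right\}. \tag{45}$$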
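The statements about the trace, the purity and the entropy can be checked numerically from the diagonal weights in Eq.(42). The following is a minimal sketch, not part of the original text: the function names and the truncation `n_terms` are illustrative assumptions, and the closed form is only evaluated for $\eta > 0$.

```python
import math

def reduced_spectrum(eta, n_terms=200):
    """Diagonal weights of rho(x, x') in Eq.(42): tanh^(2k)(eta/2) / cosh^2(eta/2).

    n_terms is an assumed truncation of the infinite sum over k.
    """
    t2 = math.tanh(eta / 2) ** 2
    norm = 1.0 / math.cosh(eta / 2) ** 2
    return [norm * t2 ** k for k in range(n_terms)]

def entropy_closed_form(eta):
    """Closed form of Eq.(45); requires eta > 0 so that ln(sinh) is finite."""
    ch = math.cosh(eta / 2)
    sh = math.sinh(eta / 2)
    return 2 * (ch ** 2 * math.log(ch) - sh ** 2 * math.log(sh))

eta = 1.0  # sample coupling strength, chosen only for illustration
weights = reduced_spectrum(eta)

trace = sum(weights)                                           # Tr(rho)
purity = sum(w * w for w in weights)                           # Tr(rho^2), Eq.(43)
entropy = -sum(w * math.log(w) for w in weights if w > 0.0)    # Eq.(44)

print(trace, purity, 1 / math.cosh(eta))
print(entropy, entropy_closed_form(eta))
```

The truncated sums reproduce $\mathrm{Tr}(\rho) = 1$, $\mathrm{Tr}(\rho^2) = 1/\cosh\eta$ and the entropy of Eq.(45) to numerical precision for this sample value of $\eta$.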
This expression can be translated into a more familiar form if we use the notation tanh| = e x p ( - ^ ) ,
(46)
where $\omega$ is given in Eq.(37) [30]. It is known in the literature that this rise in entropy and temperature causes the Wigner function to spread wide in phase space, causing an increase of uncertainty [29]. Certainly, we cannot reach a classical limit by increasing the uncertainty. On the other hand, we are accustomed to thinking that this entropy increase has something to do with decoherence, and that the lack of coherence has something to do with a classical limit. Are they compatible? We thus need a new vision in order to define precisely the word "decoherence."
8
Time-separation Variable in Feynman's Rest of the Universe
Quantum field theory has been quite successful in terms of perturbation techniques in quantum electrodynamics. However, this formalism is based on the S matrix for scattering problems and is useful only for physical processes where a set of free particles becomes another set of free particles after interaction. Quantum field theory does not address the question of localized probability distributions and their covariance under Lorentz transformations. The Schrödinger quantum mechanics of the hydrogen atom deals with a localized probability distribution. Indeed, the localization condition leads to the discrete energy spectrum. Here, the uncertainty relation is stated in terms of the spatial separation between the proton and the electron. If we believe in Lorentz covariance, there must also be a time separation between the two constituent particles. Before 1964 [31], the hydrogen atom was used for illustrating bound states. These days, we use hadrons, which are bound states of quarks. Let us use the simplest hadron consisting of two quarks bound together with an attractive force, and consider their space-time positions $x_a$ and $x_b$, and use the variables
$$X = (x_a + x_b)/2, \qquad x = (x_a - x_b)/2\sqrt{2}. \tag{47}$$
The four-vector $X$ specifies where the hadron is located in space and time, while the variable $x$ measures the space-time separation between the quarks. According to Einstein, this space-time separation contains a time-like component which actively participates, as can be seen from
$$\begin{pmatrix} z' \\ t' \end{pmatrix} = \begin{pmatrix} \cosh\eta & \sinh\eta \\ \sinh\eta & \cosh\eta \end{pmatrix} \begin{pmatrix} z \\ t \end{pmatrix}, \tag{48}$$
when the hadron is boosted along the $z$ direction. In terms of the light-cone variables defined as [32]
$$u = (z + t)/\sqrt{2}, \qquad v = (z - t)/\sqrt{2}, \tag{49}$$
the boost transformation of Eq.(48) takes the form
$$u' = e^{\eta} u, \qquad v' = e^{-\eta} v. \tag{50}$$
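The diagonal form of the boost in light-cone variables can be verified directly. The sketch below is an illustration rather than part of the original text: the helper names `boost` and `light_cone` and the sample event are assumptions.

```python
import math

def boost(z, t, eta):
    """Boost along z with rapidity eta, Eq.(48)."""
    z_new = math.cosh(eta) * z + math.sinh(eta) * t
    t_new = math.sinh(eta) * z + math.cosh(eta) * t
    return z_new, t_new

def light_cone(z, t):
    """Light-cone variables u = (z + t)/sqrt(2), v = (z - t)/sqrt(2), Eq.(49)."""
    return (z + t) / math.sqrt(2), (z - t) / math.sqrt(2)

eta = 0.7            # sample rapidity
z, t = 1.3, -0.4     # sample space-time separation

u, v = light_cone(z, t)
u_boosted, v_boosted = light_cone(*boost(z, t, eta))

# Eq.(50): u is stretched by e^eta, v is contracted by e^(-eta).
print(u_boosted, math.exp(eta) * u)
print(v_boosted, math.exp(-eta) * v)
# The product u*v, proportional to z^2 - t^2, is unchanged by the boost.
print(u_boosted * v_boosted, u * v)
```

The last line illustrates why the transformation is called a squeeze: one light-cone axis expands by exactly the factor by which the other contracts.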
The $u$ variable becomes expanded while the $v$ variable becomes contracted. Does this time-separation variable exist when the hadron is at rest? Yes, according to Einstein. In the present form of quantum mechanics, we pretend not to know anything about this variable. Indeed, this variable belongs to Feynman's rest of the universe. In this report, we shall see the role of this time-separation variable in the decoherence mechanism. Also in the present form of quantum mechanics, there is an uncertainty relation between the time and energy variables. However, there are no known time-like excitations. Unlike Heisenberg's uncertainty relation applicable to position and momentum, the time and energy separation variables are c-numbers, and we are not allowed to write down the commutation relation between them. Indeed, the time-energy uncertainty relation is a c-number uncertainty relation [33], as is illustrated in Fig. 4.
Figure 4: Space-time picture of quantum mechanics (axes $z$ and $t$; labels "Dirac: Uncertainty without Excitations" and "Heisenberg: Uncertainty with Excitations"). There are quantum excitations along the space-like longitudinal direction, but there are no excitations along the time-like direction. The time-energy relation is a c-number uncertainty relation.
How does this space-time asymmetry fit into the world of covariance [17]? This question was studied in depth by the present authors. The answer is that Wigner's O(3)-like little group is not a Lorentz-invariant symmetry, but is a covariant symmetry [1]. It has been shown that the time-energy uncertainty applicable to the time-separation variable fits perfectly into the O(3)-like symmetry of massive relativistic particles [19]. The c-number time-energy uncertainty relation allows us to write down a time distribution function without excitations [19]. If we use Gaussian forms for both space and time distributions, we can start with the expression
$$\psi(z, t) = \left(\frac{1}{\pi}\right)^{1/2} \exp\left\{-\frac{1}{2}\left(z^2 + t^2\right)\right\} \tag{51}$$
for the ground-state wave function. What do Feynman et al. say about this oscillator wave function?
In their classic 1971 paper [6], Feynman et al. start with the following Lorentz-invariant differential equation:
$$\frac{1}{2}\left\{x_\mu^2 - \frac{\partial^2}{\partial x_\mu^2}\right\}\psi(x) = \lambda\,\psi(x). \tag{52}$$
This partial differential equation has many different solutions depending on the choice of separable variables and boundary conditions. Feynman et al. insist on Lorentz-invariant solutions which are not normalizable. On the other hand, if we insist on normalization, the ground-state wave function takes the form of Eq.(51). It is then possible to construct a representation of the Poincare group from the solutions of the above differential equation [19]. If the system is boosted, the wave function becomes
$$\psi_\eta(z, t) = \left(\frac{1}{\pi}\right)^{1/2} \exp\left\{-\frac{1}{2}\left(e^{-2\eta} u^2 + e^{2\eta} v^2\right)\right\}. \tag{53}$$
This wave function becomes Eq.(51) if $\eta$ becomes zero. The transition from Eq.(51) to Eq.(53) is a squeeze transformation. The wave function of Eq.(51) is distributed within a circular region in the $uv$ plane, and thus in the $zt$ plane. On the other hand, the wave function of Eq.(53) is distributed in an elliptic region with the light-cone axes as the major and minor axes respectively. If $\eta$ becomes very large, the wave function becomes concentrated along one of the light-cone axes. Indeed, the form given in Eq.(53) is a Lorentz-squeezed wave function. This squeeze mechanism is illustrated in Fig. 5. It is interesting to note that the Lorentz-invariant differential equation of Eq.(52) contains the time-separation variable which belongs to Feynman's rest of the universe. Furthermore, the wave function of Eq.(51) is identical to that of Eq.(38) for the coupled oscillator system, if the variables $z$ and $t$ are replaced by $x_1$ and $x_2$ respectively. Thus the entropy increase due to the unobservable $x_2$ variable is applicable to the unobserved time-separation variable $t$.
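The squeeze can be made quantitative with a small numerical experiment. The sketch below is an illustration under stated assumptions, not the authors' calculation: it integrates the probability density $|\psi_\eta(z,t)|^2$ from Eq.(53) on a finite grid (the grid size and extent are arbitrary choices) and compares the longitudinal spread $\langle z^2 \rangle$ with the value $\cosh(2\eta)/2$ that follows from the Gaussian widths along $u$ and $v$.

```python
import math

def squeezed_density(z, t, eta):
    """|psi_eta(z, t)|^2 for the Lorentz-squeezed Gaussian of Eq.(53)."""
    u = (z + t) / math.sqrt(2)
    v = (z - t) / math.sqrt(2)
    exponent = math.exp(-2 * eta) * u * u + math.exp(2 * eta) * v * v
    return math.exp(-exponent) / math.pi

def grid_moments(eta, half_width=12.0, n=241):
    """Riemann-sum estimates of the norm and of <z^2> on an assumed n-by-n grid."""
    h = 2 * half_width / (n - 1)
    norm = 0.0
    z_squared = 0.0
    for i in range(n):
        z = -half_width + i * h
        for j in range(n):
            t = -half_width + j * h
            weight = squeezed_density(z, t, eta) * h * h
            norm += weight
            z_squared += weight * z * z
    return norm, z_squared

eta = 0.5  # sample boost parameter
norm, z_var = grid_moments(eta)
rest_norm, rest_z_var = grid_moments(0.0)

print(norm, z_var, math.cosh(2 * eta) / 2)
print(rest_norm, rest_z_var)
```

The norm stays at one while $\langle z^2 \rangle$ grows from $1/2$ at rest to $\cosh(2\eta)/2$, so the area of the distribution is preserved even though its projection on the $z$ axis widens.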
9
Feynman's Parton Picture
It is a widely accepted view that hadrons are quantum bound states of quarks having localized probability distribution. As in all bound-state cases, this localization condition is responsible for the existence of discrete mass spectra. The most convincing evidence for this bound-state picture is the hadronic mass spectra which are observed in high-energy laboratories [6, 19]. However, this picture of bound states is applicable only to observers in the Lorentz frame in which the hadron is at rest. How would the hadrons appear to observers in other Lorentz frames? To answer this question, we can use the picture of Lorentz-squeezed hadrons discussed in Sec. 8.
Figure 5: Effect of the Lorentz boost on the space-time wave function (axes $z$ and $t$; distributions shown for $\beta = 0$ and $\beta = 0.8$). The circular space-time distribution at the rest frame becomes Lorentz-squeezed to become an elliptic distribution.
The radius of the proton is $10^{-5}$ of that of the hydrogen atom. Therefore, it is not unnatural to assume that the proton has a point charge in atomic physics. However, while carrying out experiments on electron scattering from proton targets, Hofstadter in 1955 observed that the proton charge is spread out [34]. In this experiment, an electron emits a virtual photon, which then interacts with the proton. If the proton consists of quarks distributed within a finite space-time region, the virtual photon will interact with quarks which carry fractional charges. The scattering amplitude will depend on the way in which quarks are distributed within the proton. The portion of the scattering amplitude which describes the interaction between the virtual photon and the proton is called the form factor. Although there have been many attempts to explain this phenomenon within the framework of quantum field theory, it is quite natural to expect that the wave function in the quark model will describe the charge distribution. In high-energy experiments, we are dealing with the situation in which the momentum transfer in the scattering process is large. Indeed, the Lorentz-squeezed wave functions lead to the correct behavior of the hadronic form factor for large values of the momentum transfer [35]. Furthermore, in 1969, Feynman observed that a fast-moving hadron can be regarded as a collection of many "partons" whose properties do not appear to be quite different from those of the quarks [5]. For example, the number of quarks inside a static proton is three, while the number of partons in a rapidly moving proton appears to be infinite. The question then is how the proton, looking like a bound state of quarks to one observer, can appear different to an observer in a different Lorentz frame. Feynman made the following systematic observations.
a. The picture is valid only for hadrons moving with velocity close to that of light.
b. The interaction time between the quarks becomes dilated, and partons behave as free independent particles.
c. The momentum distribution of partons becomes widespread as the hadron moves fast.
d. The number of partons seems to be infinite or much larger than that of quarks.
Because the hadron is believed to be a bound state of two or three quarks, each of the above phenomena appears as a paradox, particularly b) and c) together. In order to resolve this paradox, let us write down the momentum-energy wave function corresponding to Eq.(53). If the quarks have the four-momenta $p_a$ and $p_b$, we can construct two independent four-momentum variables [6]
$$P = p_a + p_b, \qquad q = \sqrt{2}\,(p_a - p_b), \tag{54}$$
where $P$ is the total four-momentum and is thus the hadronic four-momentum. $q$ measures the four-momentum separation between the quarks. Their light-cone variables are
$$q_u = (q_0 - q_z)/\sqrt{2}, \qquad q_v = (q_0 + q_z)/\sqrt{2}. \tag{55}$$
The resulting momentum-energy wave function is
$$\phi_\eta(q_z, q_0) = \left(\frac{1}{\pi}\right)^{1/2} \exp\left\{-\frac{1}{2}\left(e^{-2\eta} q_u^2 + e^{2\eta} q_v^2\right)\right\}. \tag{56}$$
Because we are using here the harmonic oscillator, the mathematical form of the above momentum-energy wave function is identical to that of the space-time wave function. The Lorentz squeeze properties of these wave functions are also the same. This aspect of the squeeze has been exhaustively discussed in the literature [19, 22, 23]. When the hadron is at rest with $\eta = 0$, both wave functions behave like those for the static bound state of quarks. As $\eta$ increases, the wave functions become continuously squeezed until they become concentrated along their respective positive light-cone axes. Let us look at the z-axis projection of the space-time wave function. Indeed, the width of the quark distribution increases as the hadronic speed approaches the speed of light. The position of each quark appears widespread to the observer in the laboratory frame, and the quarks appear like free particles. The momentum-energy wave function is just like the space-time wave function, as is shown in Fig. 6. The longitudinal momentum distribution becomes widespread as the hadronic speed approaches the velocity of light. This is in contradiction with our expectation from nonrelativistic quantum mechanics that the width of the momentum distribution is inversely proportional to that of the position wave function. Our expectation is that if the quarks are free, they must have sharply defined momenta, not a widespread distribution. However, according to our Lorentz-squeezed space-time and momentum-energy wave functions, the space-time width and the momentum-energy width increase in the same direction as the hadron is boosted. This is of course an effect of Lorentz covariance. This indeed is the key to the resolution of the quark-parton paradox [19, 22]. The most puzzling problem in the parton picture is that partons in the hadron appear as incoherent particles, while quarks are coherent when the hadron is at rest. Does this mean that the coherence is destroyed by the Lorentz boost? The answer is NO, and here is the resolution to this puzzle.
Figure 6: Lorentz-squeezed space-time and momentum-energy wave functions (panels: quarks at $\beta = 0$ and partons after the boost; space-time deformation, where a weaker spring constant lets the quarks become almost free; momentum-energy deformation, where the parton momentum distribution becomes wider). As the hadron's speed approaches that of light, both wave functions become concentrated along their respective positive light-cone axes. These light-cone concentrations lead to Feynman's parton picture.
When the hadron is boosted, the hadronic matter becomes squeezed and becomes concentrated in the elliptic region along the positive light-cone axis, as is illustrated in Figs. 5 and 6. The length of the major axis becomes expanded by $e^{\eta}$, and the minor axis is contracted by $e^{-\eta}$. This means that the interaction time of the quarks among themselves becomes dilated. Because the wave function becomes widespread, the distance between one end of the harmonic oscillator well and the other end increases. This effect, first noted by Feynman [5], is universally observed in high-energy hadronic experiments. The period of oscillation increases like $e^{\eta}$. On the other hand, the external signal, since it is moving in the direction opposite to that of the hadron, travels along the negative light-cone axis. If the hadron contracts along the negative light-cone axis, the interaction time decreases by $e^{-\eta}$. The ratio of the interaction time to the oscillator period becomes $e^{-2\eta}$. The energy of each proton coming out of the Fermilab accelerator is 900 GeV. This leads to a ratio of $10^{-6}$. This is indeed a small number. The external signal is not able to sense the interaction of the quarks among themselves inside the hadron. Indeed, Feynman's parton picture is one concrete physical example where the decoherence effect is observed. As for the entropy, the time-separation variable belongs to the rest of the universe. Because we are not able to observe this variable, the entropy increases as the hadron is boosted to exhibit the parton effect. The decoherence is thus accompanied by an entropy increase. Let us go back to the coupled-oscillator system. The light-cone variables in Eq.(53) correspond to the normal coordinates in the coupled-oscillator system given in Eq.(35).
According to Feynman's parton picture, the decoherence mechanism is determined by the ratio of widths of the wave function along the two normal coordinates. The result is listed in the third row of Table 1.
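The size of the ratio $e^{-2\eta}$ for a 900 GeV proton can be estimated with a few lines of arithmetic. This is a rough sketch rather than the authors' calculation: it assumes a proton rest mass of about 0.938 GeV (not quoted in the text) and identifies $\cosh\eta$ with the Lorentz factor $E/m$; the function name is hypothetical.

```python
import math

PROTON_MASS_GEV = 0.938  # assumed rest mass of the proton, not given in the text

def interaction_to_period_ratio(energy_gev, mass_gev=PROTON_MASS_GEV):
    """Return e^(-2 eta), taking cosh(eta) = E/m for the boosted hadron."""
    gamma = energy_gev / mass_gev
    eta = math.acosh(gamma)
    return math.exp(-2 * eta)

ratio = interaction_to_period_ratio(900.0)
print(ratio)
```

Under these assumptions the ratio comes out at a few times $10^{-7}$, which agrees with the small value quoted above to within an order of magnitude.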
10
Further Contents of Einstein's Formula for Energy, Mass, and Momentum
In Table 1, we put Wigner's formalism and Feynman's observation into a single package which could be called "Further Contents of Einstein's $E = mc^2$." To physicists, $E = mc^2$ means $E = \sqrt{m^2 + p^2}$. Of course, the mass $m$ has different meanings for these two different formulas, one for the rest mass and the other for the moving mass. This distinction is so obvious to physicists that there is a tendency not to mention it in the physics literature. However, the distinction is not so trivial to those who study how special relativity was developed. Indeed, there has been a recent debate on this issue, and the debate is likely to continue. The present authors have not done enough research on this issue, but would like to acknowledge a very comprehensive review article by Okun [36], entitled "Concept of Mass," and comments on this article by various authors. It is not clear whether Einstein was concerned with the question of whether the particles are point particles or objects with internal space-time structures, because the internal space-time symmetry was not formulated until 1939 when Wigner published his paper based on the little groups [1]. On the other hand, Wigner's approach starts from Einstein's energy-momentum relation for free particles. Thus, the energy-momentum relation remains valid for particles with internal structure only if there is a covariant description of internal space-time symmetries. Indeed, this is the main point of the present paper.
Table 2: Historical Necessity. Newtonian Mechanics is Galilean-covariant, and Maxwell's theory is Lorentz-covariant. Do they have to be based on two different kinds of covariance?
                       Galilean Covariance    Lorentz Covariance
Newtonian Mechanics    YES                    NO
Maxwell Theory         NO                     YES
Another historical question is who formulated the mathematics for special relativity. Here, prominent names are Lorentz, Poincare and Minkowski. This is also an interesting and important issue. The present authors have done some research along this line, but not enough to make an impact on the existing literature. The following point is well known, but seldom mentioned. Before Einstein, Newtonian mechanics and Maxwell's equations were based on two different covariance principles, as is summarized in Table 2. Thus the development of special relativity is of historical necessity to those, like Einstein, who believed this world is one covariant world. It is now firmly established that mechanics should also be Lorentz-covariant. Furthermore, it is a well-accepted view that Lorentz covariance should become Galilean covariance for slow particles. This slow-speed limit is not as trivial as taking a numerical limit of the speed of the particle divided by the speed of light. This limiting process was worked out by Inonu and Wigner in their 1953 paper [2], where they introduced group contractions. The Inonu-Wigner group contraction also allows us to take a large-speed limit, which we used in the present paper. It is interesting to note that both limiting processes can be derived from the Inonu-Wigner contraction.
Acknowledgments The author would like to thank Professor Lev Okun for sending us his list of papers on the concept of mass, and also for helpful comments.
References
[1] E. P. Wigner, Ann. Math. 40, 149 (1939).
[2] E. Inonu and E. P. Wigner, Proc. Natl. Acad. Sci. (U.S.A.) 39, 510 (1953).
[3] Y. S. Kim and E. P. Wigner, J. Math. Phys. 28, 1175 (1987).
[4] Y. S. Kim and E. P. Wigner, J. Math. Phys. 31, 55 (1990).
[5] R. P. Feynman, The Behavior of Hadron Collisions at Extreme Energies, in High Energy Collisions, Proceedings of the Third International Conference, Stony Brook, New York, edited by C. N. Yang et al., pages 237-249 (Gordon and Breach, New York, 1969).
[6] R. P. Feynman, M. Kislinger, and F. Ravndal, Phys. Rev. D 3, 2706 (1971).
[7] R. P. Feynman, Statistical Mechanics (Benjamin/Cummings, Reading, MA, 1972).
[8] S. Weinberg, Phys. Rev. 134, B882 (1964); ibid. 135, B1049 (1964).
[9] A. Janner and T. Janssen, Physica 53, 1 (1971); ibid. 60, 292 (1972).
[10] For later papers on this problem, see J. Kuperzstych, Nuovo Cimento 31B, 1 (1976); D. Han and Y. S. Kim, Am. J. Phys. 49, 348 (1981); J. J. van der Bij, H. van Dam, and Y. J. Ng, Physica 116A, 307 (1982).
[11] D. Han, Y. S. Kim, and D. Son, Phys. Rev. D 26, 3717 (1982).
[12] Y. S. Kim, in Symmetry and Structural Properties of Condensed Matter, Proceedings 4th International School of Theoretical Physics (Zajaczkowo, Poland), edited by T. Lulek, W. Florek, and B. Lulek (World Scientific, 1997).
[13] Y. S. Kim, in Quantum Systems: New Trends and Methods, Proceedings of the International Workshop (Minsk, Belarus), edited by Y. S. Kim, L. M. Tomil'chik, I. D. Feranchuk, and A. Z. Gazizov (World Scientific, 1997).
[14] S. Ferrara and C. Savoy, in Supergravity 1981, S. Ferrara and J. G. Taylor eds. (Cambridge Univ. Press, Cambridge, 1982), p. 151. See also P. Kwon and M. Villasante, J. Math. Phys. 29, 560 (1988); ibid. 30, 201 (1989). For earlier papers on this subject, see H. Bacry and N. P. Chang, Ann. Phys. 47, 407 (1968); S. P. Misra and J. Maharana, Phys. Rev. D 14, 133 (1976).
[15] D. Han, Y. S. Kim, and D. Son, Phys. Lett. B 131, 327 (1983). See also D. Han, Y. S. Kim, M. E. Noz, and D. Son, Am. J. Phys. 52, 1037 (1984).
[16] D. Han, Y. S. Kim, and D. Son, J. Math. Phys. 27, 2228 (1986).
[17] Y. S. Kim and M. E. Noz, Phys. Rev. D 8, 3521 (1973).
[18] Y. S. Kim, M. E. Noz and S. H. Oh, J. Math. Phys. 20, 1341 (1979).
[19] Y. S. Kim and M. E. Noz, Theory and Applications of the Poincare Group (Reidel, Dordrecht, 1986).
[20] P. A. M. Dirac, Proc. Roy. Soc. (London) A 183, 284 (1945).
[21] H. Yukawa, Phys. Rev. 91, 415 (1953).
[22] Y. S. Kim and M. E. Noz, Phys. Rev. D 15, 335 (1977).
[23] Y. S. Kim, Phys. Rev. Lett. 63, 348 (1989).
[24] Y. S. Kim and E. P. Wigner, Phys. Lett. A 147, 343 (1990).
[25] D. Han, Y. S. Kim, and M. E. Noz, J. Math. Phys. 36, 3940 (1995).
[26] H. P. Yuen, Phys. Rev. A 13, 2226 (1976).
[27] Y. S. Kim and M. E. Noz, Phase Space Picture of Quantum Mechanics (World Scientific, Singapore, 1991).
[28] R. P. Feynman, http://www.aip.org/history/esva/exhibits/feynman.htm (the Feynman page of the Emilio Segre Visual Archives of the American Institute of Physics).
[29] D. Han, Y. S. Kim, and M. E. Noz, Am. J. Phys. 67, 61 (1999).
[30] D. Han, Y. S. Kim, and Marilyn E. Noz, Phys. Lett. A 144, 111 (1989).
[31] M. Gell-Mann, Phys. Lett. 13, 598 (1964).
[32] P. A. M. Dirac, Rev. Mod. Phys. 21, 392 (1949).
[33] P. A. M. Dirac, Proc. Roy. Soc. (London) A 114, 234 and 710 (1927).
[34] R. Hofstadter and R. W. McAllister, Phys. Rev. 98, 217 (1955).
[35] K. Fujimura, T. Kobayashi, and M. Namiki, Prog. Theor. Phys. 43, 73 (1970).
[36] L. B. Okun, Physics Today, June 1989, pp. 31-36. See also the letters from W. Rindler, M. Vnayck, P. Muragesan, S. Ruschin, C. Sauter, and L. B. Okun, Physics Today, May 1990, pp. 13, 115-117.
Chapter 7
The Isotropy of the Speed of Light c: A Convenient Assumption
W. F. Edwards (1963), J. A. Winnie (1970), R. Mansouri and R. U. Sexl (1977), Y. Z. Zhang (1995)
Test Theories of Special Relativity
Yuan Zhong Zhang¹
¹ Institute of Theoretical Physics, Academia Sinica, P.O. Box 2735, Beijing, P.R. China. E-mail: [email protected]
Received April 5, 1994
We review the Edwards transformation, and investigate the Robertson transformation and the Mansouri-Sexl (MS) transformation. It is shown that the MS transformation is a generalization of the Robertson transformation, just as the Edwards transformation is a generalization of the Lorentz transformation. In other words, the MS transformation differs from the Robertson transformation by a directional parameter q, just as is the case for the Edwards and Lorentz transformations. So the MS transformation predicts the same observable effects as the Robertson transformation, just as the Edwards transformation does with the Lorentz transformation. This is to say that the directional parameter q representing the anisotropy of the one-way speed of light is not observable in any physical experiment. The observable difference between the MS (Robertson) transformation(s) and the Lorentz transformation is caused by the anisotropy of the two-way speed of light. Therefore a physical test of the MS transformation is a test of the two-way speed of light, but not of the one-way speed of light.
1. INTRODUCTION
In Einstein's theory of special relativity [1], constancy of the speed of light is the second postulate. With this postulate, a clock located at any position in an inertial frame can be synchronized with a clock at the origin of the frame by means of a light pulse. Since that time, the clock synchronization problem has been discussed by many authors. Robertson [2] proposed a more general transformation. Reichenbach (Ref. 3, p. 142) and
Grunbaum [4] discussed this problem in detail, and pointed out that no observable difference would result if the speed of light really were anisotropic. Ruderfer [5] held that special relativity contains an important assumption which has not and possibly cannot be tested. Edwards [6] and Winnie [7] obtained a generalized Lorentz transformation starting from the constancy of the two-way speed of light. It was concluded that the generalized Lorentz transformation predicts the same observable effects as the standard Lorentz transformation. Later, Mansouri and Sexl [8] proposed another more general transformation. Since then, many papers on this topic have been published [9-17]. However, some ambiguities still exist in comparing the test theory with physical experiments [8,17]. Thus it is necessary to analyze these kinds of test theories in detail. In this paper, we shall first recall the Edwards transformation and its physical meaning, and then investigate the Robertson transformation and the Mansouri-Sexl (MS) transformation. We give the physical meaning of these transformations, and show the connections among the transformations under consideration. We analyze time retardation, the so-called first-order experiment (the Romer experiment), and the transversal Doppler effect using the MS transformation. These analyses are examples to show that the first-order effect does not appear in any so-called first-order experiments.
2. ONE-WAY SPEED AND TWO-WAY SPEED
Consider a Cartesian coordinate frame whose origin is the point $O$. Let $P$ denote a point with coordinates $(x, y, z)$, and $\mathbf{r} = (x, y, z)$ indicate a radial vector. $c_r$ and $c_{-r}$ refer to the one-way speed of light in the direction of $\mathbf{r}$ and in the opposite direction, respectively.
We define the two-way speed of light along the path $l_{OP} + l_{PO}$ as $\bar{c}_r = (l_{OP} + l_{PO})/(t_{OP} + t_{PO})$, where $l_{OP} = l_{PO} = r$, $t_{OP} = r/c_r$ is the time lapse between the emission of the light pulse at $O$ and its arrival at $P$, and $t_{PO} = r/c_{-r}$ is the time interval spent by the pulse from $P$ back to $O$. So the two-way speed of light can be expressed as
$$\bar{c}_r = \frac{2 c_r c_{-r}}{c_r + c_{-r}}. \tag{1}$$
Equation (1) implies that the choices of $c_r$ and $c_{-r}$ are restricted in such a way that the sense of cause is preserved. In other words, a light signal starting at $O$ cannot reach $P$ before it leaves $O$. Since $t_{OP}$ and $t_{PO}$ must be positive, so must $c_r$ and $c_{-r}$ be positive. Thus (1) leads to the restriction
$$c_r > \frac{\bar{c}_r}{2}, \qquad c_{-r} > \frac{\bar{c}_r}{2}. \tag{2}$$
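It is convenient to introduce a directional parameter $q_r$ as follows, with $\bar{c}_r$ the two-way speed of Eq. (1):
$$c_r = \frac{\bar{c}_r}{1 - q_r}, \qquad c_{-r} = \frac{\bar{c}_r}{1 + q_r}. \tag{3a}$$
Using eqs. (3a) in (2), we get the limit on the directional parameter
$$-1 < q_r < +1. \tag{4a}$$
In particular, along the x-, y-, and z-axes, we have
$$c_i = \frac{\bar{c}_i}{1 - q_i}, \qquad c_{-i} = \frac{\bar{c}_i}{1 + q_i}, \tag{3b}$$
$$-1 < q_i < +1, \qquad i = x, y, z. \tag{4b}$$
Let us discuss the relation between $q_r$ and $q_i$. Consider the following "loops" of light:
$$l_+ = l_{OA} + l_{AB} + l_{BP} + l_{PO}, \qquad l_- = l_{OP} + l_{PB} + l_{BA} + l_{AO}, \tag{5}$$
where $l_{OA}$ is the distance between $O$ and $A$, and so on. Coordinates of the points $O$, $A$, $B$ and $P$ are $(0, 0, 0)$, $(x, 0, 0)$, $(x, y, 0)$ and $(x, y, z)$, respectively. Let $t_+$ and $t_-$ denote the time intervals spent by the light pulse traveling along $l_+$ and $l_-$, respectively, i.e.,
$$t_+ = t_{OA} + t_{AB} + t_{BP} + t_{PO}, \qquad t_- = t_{OP} + t_{PB} + t_{BA} + t_{AO}. \tag{6}$$
Substituting $t_{OA} = x/c_x$, $t_{AO} = x/c_{-x}$, $t_{AB} = y/c_y$, $t_{BA} = y/c_{-y}$, $t_{BP} = z/c_z$, $t_{PB} = z/c_{-z}$, $t_{OP} = r/c_r$, $t_{PO} = r/c_{-r}$ into (6), we have
$$t_+ = \frac{x}{c_x} + \frac{y}{c_y} + \frac{z}{c_z} + \frac{r}{c_{-r}}, \qquad t_- = \frac{r}{c_r} + \frac{z}{c_{-z}} + \frac{y}{c_{-y}} + \frac{x}{c_{-x}}. \tag{7}$$
Using the definition (3), we obtain from (7)
$$t_+ - t_- = 2r\left[\frac{q_r}{\bar{c}_r} - \left(\frac{q_x}{\bar{c}_x}\cos\alpha + \frac{q_y}{\bar{c}_y}\cos\beta + \frac{q_z}{\bar{c}_z}\cos\gamma\right)\right], \tag{8}$$
where $\cos\alpha = x/r$, $\cos\beta = y/r$, $\cos\gamma = z/r$, and $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$. Assuming $t_+ = t_-$, we obtain from (8)
$$\frac{q_r}{\bar{c}_r} = \frac{q_x}{\bar{c}_x}\cos\alpha + \frac{q_y}{\bar{c}_y}\cos\beta + \frac{q_z}{\bar{c}_z}\cos\gamma. \tag{9}$$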
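The relations of this section can be illustrated with a short numerical sketch. It is an illustration under an extra simplifying assumption not made in the text, namely that the two-way speed is the same in every direction, so that Eq. (9) reduces to $q_r = q_x\cos\alpha + q_y\cos\beta + q_z\cos\gamma$. The function names and the sample values are hypothetical.

```python
import math

def one_way_speeds(c_bar, q):
    """One-way speeds (c_r, c_-r) from the two-way speed and directional parameter, Eq. (3a)."""
    return c_bar / (1 - q), c_bar / (1 + q)

def two_way_speed(c_out, c_back):
    """Two-way speed from the two one-way speeds, Eq. (1)."""
    return 2 * c_out * c_back / (c_out + c_back)

def q_along(point, q_axes):
    """Directional parameter along r from Eq. (9), assuming equal two-way speeds on all paths."""
    r = math.sqrt(sum(coord * coord for coord in point))
    return sum(q_i * coord / r for q_i, coord in zip(q_axes, point))

def loop_times(point, q_axes, c_bar=1.0):
    """Travel times t_plus and t_minus around the two loops of Eq. (7)."""
    r = math.sqrt(sum(coord * coord for coord in point))
    c_r, c_minus_r = one_way_speeds(c_bar, q_along(point, q_axes))
    t_plus = r / c_minus_r   # return leg P -> O
    t_minus = r / c_r        # outgoing leg O -> P
    for coord, q_i in zip(point, q_axes):
        c_i, c_minus_i = one_way_speeds(c_bar, q_i)
        t_plus += coord / c_i          # legs O -> A -> B -> P
        t_minus += coord / c_minus_i   # legs P -> B -> A -> O
    return t_plus, t_minus

c_bar = 1.0
q_axes = (0.3, -0.2, 0.5)   # sample directional parameters along x, y, z
point = (1.0, 2.0, 2.0)     # sample point P with positive coordinates

c_out, c_back = one_way_speeds(c_bar, q_axes[0])
print(two_way_speed(c_out, c_back))   # recovers c_bar whatever the value of q
print(loop_times(point, q_axes))      # equal loop times when q_r obeys Eq. (9)
```

The first check shows that the two-way speed built from Eq. (3a) does not depend on $q$, and the second that the two loop times coincide once $q_r$ is fixed by Eq. (9).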
3. EDWARDS TRANSFORMATION
Let us recall a generalized Lorentz transformation as an example of how to compare a test theory of special relativity with physical experiments. Edwards [6] assumed the constancy of the two-way speed of light, and modified Einstein's second postulate as: the two-way speed of light in a vacuum as measured in two coordinate systems moving with constant relative velocity is the same constant regardless of any assumptions concerning the one-way speed. The constancy of the two-way speed of light implies $\bar{c}_r = \bar{c}_{r'} = c = \text{constant}$. For simplicity, let $q = q_x \neq 0$, $q' = q_{x'} \neq 0$, $q_y = q_{y'} = q_z = q_{z'} = 0$, so we have from (3)
$$\begin{aligned} c_x &= \frac{c}{1 - q}, & c_{-x} &= \frac{c}{1 + q}, & c_y &= c_{-y} = c_z = c_{-z} = c, \\ c_{x'} &= \frac{c}{1 - q'}, & c_{-x'} &= \frac{c}{1 + q'}, & c_{y'} &= c_{-y'} = c_{z'} = c_{-z'} = c. \end{aligned} \tag{10}$$
From the constancy of the two-way speed, Edwards [6] obtained the following generalized Lorentz transformation:
$$\begin{aligned} t &= \frac{1}{\sqrt{[1 + (v/c)q']^2 - (v^2/c^2)}}\left\{\left[1 + \frac{v}{c}(q + q')\right]t' - \left[\frac{v}{c}\left(1 - q'^2\right) + (q - q')\right]\frac{x'}{c}\right\}, \\ x &= \frac{1}{\sqrt{[1 + (v/c)q']^2 - (v^2/c^2)}}\,(x' - vt'), \\ y &= y', \qquad z = z', \end{aligned} \tag{11}$$
where $v$ is the velocity of the inertial frame $S(txyz)$ with respect to $S'(t'x'y'z')$. In the case $q' = 0$, the frame $S'$ is a "preferred" reference system to be denoted by $\Sigma(TXYZ)$. In this case, the Edwards transformation (11) reduces to
$$\begin{aligned} t &= \frac{1}{\sqrt{1 - (v^2/c^2)}}\left\{\left[1 + \frac{v}{c}q\right]T - \left(\frac{v}{c} + q\right)\frac{X}{c}\right\}, \\ x &= \frac{1}{\sqrt{1 - (v^2/c^2)}}\,(X - vT), \\ y &= Y, \qquad z = Z. \end{aligned} \tag{12}$$
How is the Edwards transformation (11) applied to physical experiments? It is noted that the coordinates $t$ and $t'$ are not directly observable because they depend upon the definition of simultaneity (an observable time should be a proper time), and hence all quantities associated with $t$ and $t'$, such as $u_x = dx/dt$ and $u_{x'} = dx'/dt'$, are also not directly observable. On the other hand, distant clocks in physical experiments are generally synchronized by means of Einstein simultaneity, i.e. the constancy of the one-way speed of light. Thus in order to compare the mathematical quantities in the test theory with data given in physical experiments, a relation between a general clock synchronization and Einstein clock synchronization is needed. Let quantities with a subscript "0" correspond to Einstein simultaneity. Consider a light signal traveling from $O$ to $P$. Let $t_O$ be the departure time at $O$, and $t$ or $t_0$ be the arrival time at $P$. A general clock synchronization implies
$$t = t_O + \frac{x}{c_x} + \frac{y}{c_y} + \frac{z}{c_z}. \tag{13}$$
On the other hand, Einstein clock synchronization gives
$$t_0 = t_O + \frac{x}{c} + \frac{y}{c} + \frac{z}{c}. \tag{14}$$
The relation between $t$ and $t_0$ follows from (13) and (14):
$$t = t_0 + x\left(\frac{1}{c_x} - \frac{1}{c}\right) + y\left(\frac{1}{c_y} - \frac{1}{c}\right) + z\left(\frac{1}{c_z} - \frac{1}{c}\right). \tag{15a}$$
Similarly in frame $S'$, we have
$$t' = t'_0 + x'\left(\frac{1}{c_{x'}} - \frac{1}{c}\right) + y'\left(\frac{1}{c_{y'}} - \frac{1}{c}\right) + z'\left(\frac{1}{c_{z'}} - \frac{1}{c}\right). \tag{15b}$$
For Edwards clock synchronization, using (10), (15) leads to
$$t = t_0 - q\,\frac{x}{c}, \qquad t' = t'_0 - q'\,\frac{x'}{c}. \tag{16}$$
Using (16) we obtain relations between the velocities ($u_x = dx/dt$, $u_{x'} = dx'/dt'$, ...) corresponding to Edwards simultaneity and the ones [$(u_x)_0 = dx/dt_0$, $(u_{x'})_0 = dx'/dt'_0$, ...] corresponding to Einstein simultaneity:
$$u_x = \frac{(u_x)_0}{1 - q\,(u_x)_0/c}, \qquad u_{x'} = \frac{(u_{x'})_0}{1 - q'\,(u_{x'})_0/c}. \tag{17a}$$
In particular, letting $v = u_{x'}$ and $v_0 = (u_{x'})_0$, eq. (17a) becomes
$$v = \frac{v_0}{1 - (v_0/c)\,q'}, \tag{17b}$$
where $v$ and $v_0$ are velocities of the reference system $S(txyz)$ relative to $S'(t'x'y'z')$, which are measured by means of Edwards simultaneity and Einstein simultaneity, respectively. Using (16) and (17b), the Edwards transformation (11) reduces to the standard form,
$$\begin{aligned} t_0 &= \frac{1}{\sqrt{1 - (v_0^2/c^2)}}\left(t'_0 - \frac{v_0}{c^2}\,x'\right), \\ x &= \frac{1}{\sqrt{1 - (v_0^2/c^2)}}\,(x' - v_0 t'_0), \\ y &= y', \qquad z = z'. \end{aligned} \tag{18}$$
This result shows that the difference between the Edwards transformation and the Lorentz transformation is just their different definitions of simultaneity, and that the Edwards transformation predicts the same observable effects as the Lorentz transformation. In other words, the directional parameters $q$ and $q'$, and hence the one-way speed of light, cannot be tested in any physical experiment. Let us now give an example in illustration of this result. The Doppler effect can be easily obtained (Ref. 18, p.33):
$$ \nu' = \nu\,\frac{\sqrt{[1 + (v/c)q']^2 - (v^2/c^2)}}{1 + (v/c)q' - (v/c)\cos\alpha}, \tag{19} $$
where $\nu$ is the frequency of light emitted by a source moving at velocity $v$ relative to an observer, $\nu'$ is the corresponding frequency measured by the observer, and $\alpha$ is the angle between the propagation direction of the light and the velocity $v$. It is stressed that one cannot insert into (19) a value for the speed of a light source taken from a physical experiment, because $v$ is defined by Edwards simultaneity with a non-zero value of $q$, while the speed of a source in the physical laboratory is generally measured by means of Einstein simultaneity. For this reason, the measured value should be identified with $v_0$ in (17b). This implies that (17b) should be substituted into (19). In this way we arrive at the standard form
$$ \nu' = \nu\,\frac{\sqrt{1 - (v_0^2/c^2)}}{1 - (v_0/c)\cos\alpha}. \tag{20} $$
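The substitution of (17b) into the Edwards-synchronized Doppler formula can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, units with $c = 1$ are used by default, and the square-root numerator of the Edwards form is the one implied by (17b) together with the standard formula, not a quotation from Ref. 18.

```python
import math

def einstein_speed(v, q_prime, c=1.0):
    """Invert (17b), v = v0 / (1 - (v0/c) q'), giving v0 = v / (1 + (v/c) q')."""
    return v / (1.0 + (v / c) * q_prime)

def doppler_edwards(nu, v, q_prime, alpha, c=1.0):
    """Doppler formula in terms of the Edwards-synchronized speed v.
    Assumption: the numerator is the one implied by (17b) and the standard form."""
    beta = v / c
    numerator = math.sqrt((1.0 + beta * q_prime) ** 2 - beta ** 2)
    return nu * numerator / (1.0 + beta * q_prime - beta * math.cos(alpha))

def doppler_standard(nu, v0, alpha, c=1.0):
    """Standard relativistic Doppler formula with the Einstein-synchronized speed v0."""
    beta0 = v0 / c
    return nu * math.sqrt(1.0 - beta0 ** 2) / (1.0 - beta0 * math.cos(alpha))

if __name__ == "__main__":
    v, alpha = 0.4, 0.7
    for q_prime in (-0.5, 0.0, 0.3, 0.8):
        v0 = einstein_speed(v, q_prime)
        # The two columns agree for every q': the parameter drops out
        # once the measured (Einstein) speed v0 is used.
        print(q_prime, doppler_edwards(1.0, v, q_prime, alpha),
              doppler_standard(1.0, v0, alpha))
```

For each value of $q'$ the two printed frequencies coincide, which is the statement that $q'$ has no observable effect on the Doppler shift.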
It is eq. (20) that is to be compared with physical experiments. We conclude that, in comparing a prediction of a test theory with the data given by physical experiments, a correct result is obtained only if the different clock synchronizations are taken into account.

4. ROBERTSON TRANSFORMATION

Robertson [2] proposed the following general transformation:
$$ t = a_0^{-1}\left(1 - \frac{v^2}{c^2}\right)^{-1}\left(T - \frac{v}{c^2}X\right), \qquad x = a_1^{-1}\left(1 - \frac{v^2}{c^2}\right)^{-1}(X - vT), \qquad y = a_2^{-1}Y, \qquad z = a_2^{-1}Z, \tag{21} $$
where $a_0, a_1, a_2$ are arbitrary functions of $v^2$, and the frame $\Sigma(TXYZ)$ is a preferred reference system in which the one-way speed of light in a vacuum is a constant $c$. For the convenience of the rest of the text, the parameters $a_0, a_1, a_2$ are replaced by new ones as follows:
$$ a_0 = a^{-1}, \qquad a_1 = b^{-1}\left(1 - \frac{v^2}{c^2}\right)^{-1}, \qquad a_2 = d^{-1}. \tag{22} $$
In terms of these new parameters, the Robertson transformation (21) becomes
$$ t = \frac{a}{1 - (v^2/c^2)}\left(T - \frac{v}{c^2}X\right), \qquad x = b\,(X - vT), \qquad y = d\,Y, \qquad z = d\,Z. \tag{23a} $$
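In order to understand the physical meaning of the parameters, let us rewrite (23a) as
$$ t = d\,\frac{c}{c_\perp\sqrt{1 - (v^2/c^2)}}\left(T - \frac{v}{c^2}X\right), \qquad x = d\,\frac{c_\parallel}{c_\perp\sqrt{1 - (v^2/c^2)}}\,(X - vT), \qquad y = d\,Y, \qquad z = d\,Z, \tag{23b} $$
where $c_\parallel$ and $c_\perp$ are defined by
$$ c_\parallel = \frac{cb}{a}\left(1 - \frac{v^2}{c^2}\right), \qquad c_\perp = \frac{cd}{a}\sqrt{1 - (v^2/c^2)}, \tag{24} $$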
and it will be shown below that $c_\parallel$ and $c_\perp$ represent the two-way speed of light in the directions parallel and perpendicular to $v$, respectively. Now calculate the speed of light in the frame $S(txyz)$. In the frame $\Sigma$, the one-way speed of light is isotropic, i.e.
$$ c^2T^2 - X^2 - Y^2 - Z^2 = 0. \tag{25} $$
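Substituting (23b) into (25), we have
$$ c_r^2\left(\frac{\cos^2\alpha}{c_\parallel^2} + \frac{\cos^2\beta}{c_\perp^2} + \frac{\cos^2\gamma}{c_\perp^2}\right) - 1 = 0, \tag{26} $$
where $x/t = c_r\cos\alpha$, $y/t = c_r\cos\beta$ and $z/t = c_r\cos\gamma$ are used. The solution to (26) for $c_r$ is given by
$$ c_r = \frac{c_\parallel c_\perp}{\sqrt{c_\parallel^2 + (c_\perp^2 - c_\parallel^2)\cos^2\alpha}}. \tag{27a} $$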
In particular, (27a) leads to
$$ c_{-r} = c_r, \qquad \bar{c}_r \equiv \frac{2c_r c_{-r}}{c_r + c_{-r}} = c_r, \qquad c_x = c_{-x} = c_\parallel, \qquad c_y = c_{-y} = c_z = c_{-z} = c_\perp. \tag{27b} $$
One can see from (27) that in the Robertson test theory the one-way speed of light in a given direction is equal to the one in the opposite direction, but the two-way speed of light in general depends upon $v^2$ and is anisotropic. Therefore the parameters $c_\parallel, c_\perp, d$ (or $a_0, a_1, a_2$) can be determined by experiments. For instance, a limit on $c_\parallel$ and $c_\perp$ may come from measurements of the two-way speed of light, and then, together with the time retardation experiments, a limit on the parameter $d$ can be obtained.
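The step from the light cone (25) to the direction-dependent speed (27a) can be verified numerically. The sketch below assumes the form of the Robertson transformation in which the time and $x$ equations carry the factors $c/c_\perp$ and $c_\parallel/c_\perp$ (eq. (23b) as read here); the function names and the sample parameter values are hypothetical, and $c = 1$.

```python
import math

C = 1.0  # one-way speed of light in the preferred frame Sigma

def robertson(T, X, Y, Z, v, c_par, c_perp, d):
    """Map a Sigma event (T, X, Y, Z) to S coordinates (t, x, y, z), assumed form of (23b)."""
    g = math.sqrt(1.0 - (v / C) ** 2)
    t = d * C / (c_perp * g) * (T - v * X / C ** 2)
    x = d * c_par / (c_perp * g) * (X - v * T)
    return t, x, d * Y, d * Z

def speed_27a(cos_alpha, c_par, c_perp):
    """Speed of light in S at angle alpha to the x-axis, eq. (27a)."""
    return c_par * c_perp / math.sqrt(
        c_par ** 2 + (c_perp ** 2 - c_par ** 2) * cos_alpha ** 2)

def measured_speed(n, v, c_par, c_perp, d):
    """Send a light ray along the unit vector n in Sigma (from the common origin)
    and return (speed in S, cosine of its angle to the x-axis in S)."""
    t, x, y, z = robertson(1.0, C * n[0], C * n[1], C * n[2], v, c_par, c_perp, d)
    r = math.sqrt(x * x + y * y + z * z)
    return r / t, x / r

if __name__ == "__main__":
    v, c_par, c_perp, d = 0.3, 0.9, 1.1, 1.2   # illustrative values only
    for a in (0.0, 0.5, 1.2, 2.0, math.pi):
        n = (math.cos(a), math.sin(a) * 0.6, math.sin(a) * 0.8)
        speed, cos_alpha = measured_speed(n, v, c_par, c_perp, d)
        print(round(speed, 12), round(speed_27a(cos_alpha, c_par, c_perp), 12))
```

Because (27a) depends on $\cos^2\alpha$ only, the same speed is obtained for a direction and its opposite, which is the content of (27b).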
5. THE MANSOURI-SEXL TRANSFORMATION

Mansouri and Sexl [8] proposed a more general linear transformation,
$$ t = aT + \mathbf{e}\cdot\mathbf{x}, \qquad x = b\,(X - vT), \qquad y = d\,Y, \qquad z = d\,Z, \tag{28} $$
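where the frame $\Sigma(TXYZ)$ is a preferred inertial reference system in which the one-way speed of light is isotropic. The frame $S(txyz)$ moves at velocity $v$ in the positive $x$ direction with respect to $\Sigma$, and the parameters $a$, $b$, $d$, and $\mathbf{e}$ are functions of $v$. Let us introduce the new set of parameters $\mathbf{q} = (q_x, q_y, q_z)$, $c_\parallel$, $c_\perp$ in place of the old parameters $\mathbf{e} = (e_x, e_y, e_z)$, $a$, $b$:
$$ e_x = -\frac{a}{cb\,[1 - (v^2/c^2)]}\left(\frac{v}{c} + q_x\right) = -\frac{1}{c_\parallel}\left(\frac{v}{c} + q_x\right), \qquad e_y = -\frac{a}{cd\sqrt{1 - (v^2/c^2)}}\,q_y = -\frac{q_y}{c_\perp}, \qquad e_z = -\frac{a}{cd\sqrt{1 - (v^2/c^2)}}\,q_z = -\frac{q_z}{c_\perp}, \tag{29} $$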
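where the constant $c$ is the speed of light in $\Sigma$, and $c_\parallel$ and $c_\perp$ are defined by (24). Putting (29) in (28), the Mansouri-Sexl transformation can be expressed as
$$ t = d\,\frac{c}{c_\perp\sqrt{1 - (v^2/c^2)}}\left\{\left[1 + \frac{v}{c}q_x\right]T - \left[\frac{v}{c} + q_x\right]\frac{X}{c} - \left(q_y\frac{Y}{c} + q_z\frac{Z}{c}\right)\sqrt{1 - (v^2/c^2)}\right\}, $$
$$ x = d\,\frac{c_\parallel}{c_\perp\sqrt{1 - (v^2/c^2)}}\,(X - vT), \qquad y = d\,Y, \qquad z = d\,Z. \tag{30} $$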
Next we shall prove that the new parameters $(q_x, q_y, q_z)$ are just the directional parameters defined by (3) and (4). To this end, we first calculate the speed of light in $S$. Substituting the Mansouri-Sexl transformation (30) into (25), we get the equation satisfied by the one-way speed of light in the frame $S$:
$$ c_r^2\left[\left(\frac{q_r}{\bar{c}_r}\right)^2 - \frac{1}{c_\perp^2} - \left(\frac{1}{c_\parallel^2} - \frac{1}{c_\perp^2}\right)\cos^2\alpha\right] + 2c_r\,\frac{q_r}{\bar{c}_r} + 1 = 0, \tag{31a} $$
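where $c_r = r/t$, $x/t = c_r\cos\alpha$, $y/t = c_r\cos\beta$, $z/t = c_r\cos\gamma$, $\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$; $c_\parallel$ and $c_\perp$ are given by (24), and we will show them to be just the two-way speeds of light parallel and perpendicular to $v$; and $q_r/\bar{c}_r$, with $\bar{c}_r$ the two-way speed of light in the direction $r$, is defined by
$$ \frac{q_r}{\bar{c}_r} = \frac{q_x}{c_\parallel}\cos\alpha + \frac{q_y}{c_\perp}\cos\beta + \frac{q_z}{c_\perp}\cos\gamma. \tag{31b} $$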
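Solutions to (31) for $c_r$ and $c_{-r}$ are given by
$$ c_r = \left(\sqrt{\frac{1}{c_\perp^2} + \left(\frac{1}{c_\parallel^2} - \frac{1}{c_\perp^2}\right)\cos^2\alpha} - \frac{q_r}{\bar{c}_r}\right)^{-1}, \tag{32a} $$
$$ c_{-r} = \left(\sqrt{\frac{1}{c_\perp^2} + \left(\frac{1}{c_\parallel^2} - \frac{1}{c_\perp^2}\right)\cos^2\alpha} + \frac{q_r}{\bar{c}_r}\right)^{-1}. \tag{32b} $$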
In particular, the one-way speed of light along the $i$-axis can be found from (32) and (31b):
$$ c_i = \frac{\bar{c}_i}{1 - q_i}, \qquad c_{-i} = \frac{\bar{c}_i}{1 + q_i}, \qquad i = x, y, z, \tag{32c} $$
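where $\bar{c}_x = c_\parallel$ and $\bar{c}_y = \bar{c}_z = c_\perp$. The result shows that $c_\parallel$ and $c_\perp$ are just the two-way speeds of light along the $x$-axis and the $y$-axis (or $z$-axis), respectively, that the new parameters $(q_x, q_y, q_z)$ defined by (29) have the same meaning as the directional parameters given in (3), and hence that $q_r/\bar{c}_r$ defined by (31b) is the same as the one given in (9). Consequently, from (32), the speed of light in the frame $S$ reduces to
$$ c_r = \frac{\bar{c}_r}{1 - q_r}, \qquad c_{-r} = \frac{\bar{c}_r}{1 + q_r}, \tag{32d} $$
$$ \bar{c}_r = \frac{2c_r c_{-r}}{c_r + c_{-r}} = \frac{c_\parallel c_\perp}{\sqrt{c_\parallel^2 + (c_\perp^2 - c_\parallel^2)\cos^2\alpha}}. \tag{32e} $$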
This shows that the two-way speed of light (32e) is the same as the one, (27), in the Robertson test theory. We see that (30) with the new set of parameters $(c_\perp, c_\parallel, \mathbf{q}, d)$ is equivalent to the original MS transformation (28). However, the new parameters have the merit of an explicit physical meaning: $c_\perp$ and $c_\parallel$ are the two-way speeds of light, and $c_\perp \neq c_\parallel$ represents the anisotropy of the two-way speed of light; a non-zero value of $q_i$ implies the anisotropy of the one-way speed of light; and $d$ is a trivial common (or "conformal") factor.
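The relation between the one-way and two-way speeds in the MS test theory can also be checked by mapping light rays from the preferred frame into $S$. This is a sketch under assumptions: the transformation is coded in the form in which the time equation contains the bracket $[1 + (v/c)q_x]T - [(v/c) + q_x]X/c$ (eq. (30) as read here), $q_r$ is normalized with the two-way speed, the function names and parameter values are hypothetical, and $c = 1$.

```python
import math

C = 1.0  # one-way speed of light in the preferred frame Sigma

def mansouri_sexl(T, X, Y, Z, v, c_par, c_perp, d, q):
    """Map a Sigma event to S coordinates with the assumed form of the MS transformation."""
    qx, qy, qz = q
    g = math.sqrt(1.0 - (v / C) ** 2)
    k = d * C / (c_perp * g)
    t = k * ((1.0 + v / C * qx) * T - (v / C + qx) * X / C
             - (qy * Y + qz * Z) / C * g)
    x = d * c_par / (c_perp * g) * (X - v * T)
    return t, x, d * Y, d * Z

def two_way_speed(cos_alpha, c_par, c_perp):
    """Two-way speed of light at angle alpha to the x-axis."""
    return c_par * c_perp / math.sqrt(
        c_par ** 2 + (c_perp ** 2 - c_par ** 2) * cos_alpha ** 2)

def one_way_speeds(n, v, c_par, c_perp, d, q):
    """Return (speed found from the transformed ray, two-way speed / (1 - q_r))."""
    t, x, y, z = mansouri_sexl(1.0, C * n[0], C * n[1], C * n[2],
                               v, c_par, c_perp, d, q)
    r = math.sqrt(x * x + y * y + z * z)
    ca, cb, cg = x / r, y / r, z / r          # direction cosines in S
    c_bar = two_way_speed(ca, c_par, c_perp)
    q_r = c_bar * (q[0] * ca / c_par + q[1] * cb / c_perp + q[2] * cg / c_perp)
    return r / t, c_bar / (1.0 - q_r)

if __name__ == "__main__":
    params = dict(v=0.3, c_par=0.9, c_perp=1.1, d=1.2, q=(0.2, -0.1, 0.05))
    for a in (0.0, 0.8, 1.6, 2.4, math.pi):
        n = (math.cos(a), math.sin(a) * 0.6, math.sin(a) * 0.8)
        print(one_way_speeds(n, **params))
```

Setting `q=(0, 0, 0)` reproduces the Robertson result, so the only effect of the directional parameters is to split the two-way speed into unequal one-way speeds.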
Let us now consider the relationship between the MS transformation and the Robertson transformation. The MS transformation differs from the Robertson transformation just by a non-zero value of $q_i$, i.e. by a different simultaneity. Thus the MS transformation (30) reduces to the Robertson transformation (23b) when the MS simultaneity is changed to the Robertson simultaneity. In fact, by analogy with (15a), we have the following relation between the MS simultaneity and the Robertson simultaneity:
$$ t_{MS} = t_R + x\left(\frac{1}{c_x} - \frac{1}{c_\parallel}\right) + y\left(\frac{1}{c_y} - \frac{1}{c_\perp}\right) + z\left(\frac{1}{c_z} - \frac{1}{c_\perp}\right), \tag{33} $$
where $t_{MS}$ and $t_R$ are the time coordinates in (30) and (23b), respectively. Substituting (33) into (30) and using (32c), the MS transformation (30) reduces to the Robertson transformation (23b). Therefore the relationship between the MS and Robertson transformations is just an analogue of the one between the Edwards and Lorentz transformations.

6. ON COMPARING THEORIES WITH EXPERIMENTS

The previous sections show that the Edwards transformation (12) reduces to the Lorentz transformation under the change of simultaneity (15), and that the MS transformation (30) reduces to the Robertson transformation (23b) under the change of simultaneity (33). This implies that the directional parameter $q$ is just a result of the definition of simultaneity, and hence should not appear in any physical experiment. In this section we shall repeat the procedure of changing the simultaneity, as done in the previous section, for some examples, so as to reveal the meaning of the results obtained in the previous sections. First, note that the proper length and the proper time interval, and hence the two-way speed (e.g. $c_\perp$ or $c_\parallel$), which is defined as the ratio of a proper length to a proper time interval, are independent of the definition of simultaneity. On the other hand, the coordinate time interval depends on the definition of simultaneity.

6.1. On the reciprocity rule of velocity

The Robertson addition law of velocity comes from (23b):
$$ u_x = \frac{c_\parallel}{c}\,\frac{U_X - v}{1 - (v/c^2)U_X}, \qquad u_y = \frac{c_\perp}{c}\,\frac{U_Y\sqrt{1 - (v^2/c^2)}}{1 - (v/c^2)U_X}, \tag{34} $$
where $u_x = dx/dt$, $U_X = dX/dT$, ..., and $u_z = U_Z = 0$ is assumed for simplicity. The speed of $S(txyz)$ as measured in $\Sigma(TXYZ)$ may be obtained
by putting $u_x = u_y = 0$ in (34),
$$ U_X = v, \qquad U_Y = 0. \tag{35a} $$
Similarly, the speed of $\Sigma$ as measured in $S$ is obtained by putting $U_X = U_Y = 0$ in (34):
$$ v_R \equiv u_x = -\frac{c_\parallel}{c}\,v, \qquad u_y = 0. \tag{35b} $$
It is seen from (35) that $v_R \neq -v$, i.e. the reciprocity rule of velocity is not valid. This is due to the different definitions of simultaneity in $\Sigma$ and $S$. Thus "velocity" has no absolute meaning; it depends on the simultaneity. The MS addition law of velocity follows from (30):
$$ u_x = \frac{c_\parallel}{c}\,\frac{U_X - v}{[1 + (v/c)q_x] - [(v/c) + q_x](U_X/c) - q_y(U_Y/c)\sqrt{1 - (v^2/c^2)}}, $$
$$ u_y = \frac{c_\perp}{c}\,\frac{U_Y\sqrt{1 - (v^2/c^2)}}{[1 + (v/c)q_x] - [(v/c) + q_x](U_X/c) - q_y(U_Y/c)\sqrt{1 - (v^2/c^2)}}. \tag{36} $$
Equation (36) gives the speed of $S$ as measured in $\Sigma$ as $U_X = v$, $U_Y = 0$, and the speed of $\Sigma$ as measured in $S$ as
$$ v_{MS} \equiv u_x = -\frac{c_\parallel}{c}\,\frac{v}{1 + (v/c)q_x} = \frac{v_R}{1 - (v_R/c_\parallel)\,q_x}, \tag{37} $$
where (35b) is used. Again $v_{MS} \neq -v$, i.e. the reciprocity rule of velocity is not valid. The relation between $v_{MS}$ and $v_R$ given by (37) can be derived directly from the following relationship between the MS and Robertson definitions of simultaneity:
$$ dt_{MS} = dt_R - dx\left(\frac{1}{c_\parallel} - \frac{1}{c_x}\right) = dt_R\left(1 - \frac{q_x}{c_\parallel}\frac{dx}{dt_R}\right), \tag{38a} $$
where (32c) is used. Equation (38a) is just an analogue of (15a). Thus by definition we have from (38a)
$$ v_{MS} = \frac{dx}{dt_{MS}} = \frac{dx}{dt_R}\left(1 - \frac{q_x}{c_\parallel}\frac{dx}{dt_R}\right)^{-1}. \tag{38b} $$
Equation (38b) is just (37), where $v_R = dx/dt_R$. In short, a velocity is defined by a coordinate time interval, and hence depends on the definition of simultaneity. Therefore, before making use of experimental data in a test theory, one must consider whether the simultaneity used in the experiments is the same as the one used in the test theory.
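The failure of reciprocity and the relation between $v_{MS}$ and $v_R$ can be illustrated with a few lines of code. This is a minimal sketch, assuming the MS addition law with the common denominator $[1 + (v/c)q_x] - [(v/c) + q_x](U_X/c) - q_y(U_Y/c)\sqrt{1 - v^2/c^2}$; the function name and the numerical values are hypothetical.

```python
import math

def ms_addition(UX, UY, v, c_par, c_perp, qx, qy, c=1.0):
    """Velocity (u_x, u_y) in S of an object with velocity (UX, UY) in Sigma,
    using the assumed form of the MS addition law (U_Z = 0)."""
    g = math.sqrt(1.0 - (v / c) ** 2)
    denom = (1.0 + (v / c) * qx) - ((v / c) + qx) * UX / c - qy * (UY / c) * g
    return (c_par / c) * (UX - v) / denom, (c_perp / c) * UY * g / denom

if __name__ == "__main__":
    v, c_par, c_perp, qx, qy = 0.3, 0.9, 1.1, 0.2, -0.1   # illustrative values
    # An object at rest in S has U_X = v in Sigma.
    print(ms_addition(v, 0.0, v, c_par, c_perp, qx, qy))
    # Speed of Sigma as measured in S, with MS and with Robertson simultaneity.
    v_ms = ms_addition(0.0, 0.0, v, c_par, c_perp, qx, qy)[0]
    v_r = ms_addition(0.0, 0.0, v, c_par, c_perp, 0.0, 0.0)[0]
    print(v_ms, v_r / (1.0 - (v_r / c_par) * qx), -v)
```

The last line prints $v_{MS}$, the right-hand side of the relation between $v_{MS}$ and $v_R$, and $-v$: the first two coincide, and neither equals $-v$ unless $c_\parallel = c$ and $q_x = 0$.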
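6.2. Time retardation of a moving clock

6.2.1. Let a clock be at rest in the system $S$, i.e. its spatial coordinates in $S$ do not change. Putting $\Delta x = \Delta y = \Delta z = 0$ in (30) and denoting $\Delta\tau = \Delta t$, $\Delta T_E = \Delta T$, one arrives at
$$ \Delta\tau = d\,\frac{c}{c_\perp\sqrt{1 - (v^2/c^2)}}\left\{\left[1 + \frac{v}{c}q_x\right]\Delta T_E - \left[\frac{v}{c} + q_x\right]\frac{\Delta X}{c}\right\}, \tag{39a} $$
$$ \Delta X = v\,\Delta T_E. \tag{39b} $$
Using (39b) in (39a), one gets
$$ \Delta\tau = d\,\frac{c}{c_\perp}\sqrt{1 - (v^2/c^2)}\;\Delta T_E, \tag{40} $$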
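where $\Delta\tau$ is a proper time interval, and $\Delta T_E$ is a coordinate time interval corresponding to the Einstein simultaneity. Thus $q$ does not appear in (40).

6.2.2. Let a clock be at rest in $\Sigma$. Putting $\Delta X = \Delta Y = \Delta Z = 0$ in (30) and denoting $\Delta t_{MS} = \Delta t$, we obtain
$$ \Delta t_{MS} = d\,\frac{c}{c_\perp\sqrt{1 - (v^2/c^2)}}\left[1 + \frac{v}{c}q_x\right]\Delta T, \tag{41a} $$
$$ \Delta x = -d\,\frac{c_\parallel}{c_\perp\sqrt{1 - (v^2/c^2)}}\,v\,\Delta T, \tag{41b} $$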
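where $\Delta T$ is a proper time interval, and $\Delta t_{MS}$ is a coordinate time interval corresponding to the MS simultaneity. Now we change the MS simultaneity to the Robertson simultaneity, i.e. using (33) we have
$$ \Delta t_{MS} = \Delta t_R - \Delta x\left(\frac{1}{c_\parallel} - \frac{1}{c_x}\right) = \Delta t_R - \Delta x\,\frac{q_x}{c_\parallel}, \tag{42} $$
where (32c) is used. Putting (41b) in (42) and using (41a), one arrives at
$$ \Delta t_R = d\,\frac{c}{c_\perp}\,\frac{\Delta T}{\sqrt{1 - (v^2/c^2)}}. \tag{43} $$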
We see that $q$ is eliminated. In fact, (43) can be obtained directly from the first equation of (23b) with $\Delta X = \Delta Y = \Delta Z = 0$. The meaning of the above discussion is that (41a) is equivalent to (43) in any given
experiment. In other words, (41a) and (43) predict the same value for a given measurement. For example, let a clock C move from a point A (at the local time $t_A$) to a point B (at the local time $t_B$), and assume that the difference $t_B - t_A$ is, e.g., 5 sec. Both $v$ and $t_B - t_A$ are measured with the Einstein simultaneity, as is done in physical experiments. The question is then: how should the MS transformation be compared with this measurement? The answer is obvious. If (43) is used, we can simply identify $t_B - t_A = 5$ sec with $\Delta t_R$ in (43), because the difference between $c_\perp$ and $c$ is neglected in the experiment. However, if one would like to use (41a), $t_B - t_A = 5$ sec cannot be identified with $\Delta t_{MS}$ in (41a), because of the different simultaneity. One should use (42) with $\Delta t_R = t_B - t_A = 5$ sec to get a value of $\Delta t_{MS}$, and then put it in (41a). This obviously predicts the same value for the time retardation as (43).

6.3. The Römer experiment

O. Römer [19] determined $c$ from the occultation of the moons of Jupiter. The intervals ($\Delta T$) at which one of the moons of Jupiter enters the shadow of the planet are constant, as seen from Jupiter. Seen from the Earth, irregularities appear due to the change in the Earth-Jupiter distance. As this distance increases, light signals take longer to reach the Earth. This permits a determination of the velocity of light. We shall re-analyze the Römer experiment here by making use of the MS transformation. We denote Jupiter together with its moons by J(T), which emits light signals at a constant time interval, $\Delta T = T_B - T_A$ as timed by J(T), towards the Earth, where they are received at a time interval $\Delta\tau = t_2 - t_1$ as measured by the Earth (E). In principle, this experiment can be represented as in Figure 1. In Figure 1, a clock J(T) (Jupiter) at rest in the system $\Sigma(TXYZ)$ moves from a point A to a point B as seen in the system $S(txyz)$.
$t_A$ is the reading of a clock A at rest at the point A when the clock J(T) passes A, the corresponding reading of J(T) being $T_A$; $t_B$ is the reading of a clock B at rest at the point B when J(T) arrives at B, the corresponding reading of J(T) being $T_B$. The distance between A and B is denoted by $\Delta x$. The first light signal, emitted from J(T) at A (at the time $t_A$ in $S$; the corresponding time in $\Sigma$ is $T_A$), propagates at a velocity $c(\theta)$ along a path $r(\theta)$ and reaches the point E (the Earth with a clock E) at the reading $t_1$ of the clock E, as seen in the system $S$. The second light signal, emitted by J(T) at B (at the time $t_B$ in $S$; the corresponding time in $\Sigma$ is $T_B$), propagates at a velocity $c(\pi-\theta)$ along a path $r(\pi-\theta)$, and reaches E at the time $t_2$ timed by the clock E, as seen in $S$. So the difference $t_2 - t_1$ is the time interval between the receptions of the two signals by E, which is a proper time interval.

Figure 1. A sketch map for a Römer-type experiment.

Let us calculate the difference between $\Delta\tau$ and $\Delta T$ by use of the MS transformation (30). Notations are defined in Fig. 1. For simplicity, assume that Jupiter, J(T), is at rest in the system $\Sigma(TXYZ)$, and the Earth (E) is at rest in $S(txyz)$. Both $\Delta T = T_B - T_A$ and $\Delta\tau = t_2 - t_1$ are proper time intervals. The time interval in $S$ corresponding to $\Delta T$ is $\Delta t = t_B - t_A$, which is a coordinate time interval. In $S$, $\Delta\tau = t_2 - t_1$ is given by
$$ t_2 - t_1 = (t_B - t_A) + \frac{r(\pi-\theta)}{c(\pi-\theta)} - \frac{r(\theta)}{c(\theta)}, $$
or
$$ \Delta\tau = \Delta t + \frac{r}{c(\pi-\theta)} - \frac{r}{c(\theta)}, \tag{44} $$
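where $r = r(\pi-\theta) = r(\theta)$ is assumed for simplicity. The one-way speed of light $c(\theta)$ is given by (32a). Using (31b) in (32a), we get
$$ c(\theta) = \left[\sqrt{\frac{1}{c_\perp^2} + \left(\frac{1}{c_\parallel^2} - \frac{1}{c_\perp^2}\right)\cos^2\theta} - \frac{q}{c_\parallel}\cos\theta\right]^{-1}, \tag{45} $$
where $\mathbf{q} = (q, 0, 0)$ is assumed for simplicity. Putting (45) in (44), one arrives at
$$ \Delta\tau = \Delta t + (2r\cos\theta)\,\frac{q}{c_\parallel}, \tag{46} $$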
or
$$ |\Delta\tau - \Delta t| = 2\,\frac{r}{c_\parallel}\,|q\cos\theta| = 2t\left|\frac{v}{c_\parallel} + ec\right|, \tag{47} $$
where $t = r/c$ is the (average) time in which the light signals reach E (the Earth), the definition (29), i.e. $(c/c_\parallel)q = -[ec + (v/c_\parallel)]$, is used, and $\theta = 0$ is considered. To the first order, i.e. with $c_\parallel \simeq c = 1$ and $e = 2\alpha v$, (47) becomes
$$ |\Delta\tau - \Delta t| = 2(1 + 2\alpha)\,v\,t. \tag{48} $$
This is just eq. (2.2) of Mansouri and Sexl's paper [8]. Note that (48) can also be obtained from the Edwards transformation (12), because the MS and Edwards transformations are equivalent to the first order. It is emphasized that the incorrect step made by Mansouri and Sexl in their paper was to compare (48) directly with the values given in the Römer experiment, because they did not distinguish the coordinate time interval from the proper time interval. Note that the time interval $\Delta t = t_B - t_A$ in (48) is a difference between two readings taken by two separate clocks A and B, and hence it is a coordinate time interval depending on the simultaneity in the system $S$. However, the corresponding time interval $\Delta T = T_B - T_A$ timed by the single moving clock J(T) is a proper time interval, so that $\Delta T \neq \Delta t$ because of the time retardation effect. Therefore, the correct step is to compare the two proper time intervals, $\Delta\tau$ and $\Delta T$. For this purpose, we transform $\Delta t$ into $\Delta T$ using the Mansouri-Sexl transformation (30). For simplicity, we have already assumed that the clock J(T) is at rest in the system $\Sigma(TXYZ)$, i.e. $\Delta X = \Delta Y = \Delta Z = 0$. The relation between $\Delta t$ and $\Delta T$ is already given by (41a). Putting (41b) in (41a), we have
$$ \Delta t = d\,\frac{c}{c_\perp}\,\frac{\Delta T}{\sqrt{1 - (v^2/c^2)}} - \Delta x\,\frac{q}{c_\parallel}. \tag{49} $$
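Substituting (49) into (46) for $\Delta t$, we obtain
$$ \Delta\tau = d\,\frac{c}{c_\perp}\,\frac{\Delta T}{\sqrt{1 - (v^2/c^2)}} + (2r\cos\theta - \Delta x)\,\frac{q}{c_\parallel} = d\,\frac{c}{c_\perp}\,\frac{\Delta T}{\sqrt{1 - (v^2/c^2)}}, \tag{50} $$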
where the term involving $q$ vanishes, because $2r\cos\theta = \Delta x$ (see Fig. 1). We see that the difference between the two proper time intervals, $\Delta\tau$ and $\Delta T$, does not depend on $q$. Thus to the first order, (50) reads
$$ \Delta\tau = \Delta T, \qquad \text{or} \qquad \Delta\tau - \Delta T = 0. \tag{51} $$
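The cancellation of $q$ in the Römer-type arrangement can be reproduced numerically. The following sketch rests on several assumptions that are spelled out in the comments: the one-way speed is taken with $\mathbf{q} = (q, 0, 0)$, the displacement of the clock in $S$ is taken as $\Delta x = -d\,c_\parallel v\,\Delta T/(c_\perp\sqrt{1 - v^2/c^2})$, the two light paths have equal length $r$ with $2r\cos\theta = \Delta x$, and all names and numbers are hypothetical.

```python
import math

def inverse_one_way_speed(theta, q, c_par, c_perp):
    """1/c(theta) for q = (q, 0, 0): anisotropic two-way part minus the q term."""
    root = math.sqrt(1.0 / c_perp ** 2
                     + (1.0 / c_par ** 2 - 1.0 / c_perp ** 2) * math.cos(theta) ** 2)
    return root - (q / c_par) * math.cos(theta)

def reception_interval(dT, v, q, theta, c_par, c_perp, d, c=1.0):
    """Proper time between the two receptions at E for an emission interval dT."""
    g = math.sqrt(1.0 - (v / c) ** 2)
    dt_ms = d * c / (c_perp * g) * (1.0 + (v / c) * q) * dT   # coordinate interval in S
    dx = -d * c_par / (c_perp * g) * v * dT                    # displacement A -> B (assumed sign)
    r = dx / (2.0 * math.cos(theta))                           # geometry: 2 r cos(theta) = dx
    return dt_ms + r * (inverse_one_way_speed(math.pi - theta, q, c_par, c_perp)
                        - inverse_one_way_speed(theta, q, c_par, c_perp))

if __name__ == "__main__":
    dT, v, theta = 1.0, -0.3, 0.4          # v < 0 so that dx > 0 here
    c_par, c_perp, d = 0.9, 1.1, 1.2
    expected = d / (c_perp * math.sqrt(1.0 - v ** 2)) * dT   # q-independent value, c = 1
    for q in (-0.6, 0.0, 0.25, 0.7):
        print(q, reception_interval(dT, v, q, theta, c_par, c_perp, d), expected)
```

Every value of $q$ yields the same reception interval, so a Römer-type measurement constrains only $d$, $c_\perp$ and $v$, not the one-way speed of light.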
Thus we conclude that there is no first order effect in the so-called first order experiments, unless an absolute simultaneity could be found.

6.4. The transversal Doppler effect

The transversal Doppler effect given by the Lorentz transformation is of the second order in the relative velocity $v$. Mansouri and Sexl [8] claimed that the MS test theory would predict a first order effect in the transversal Doppler shift. Again they did not distinguish the coordinate time interval from the proper time interval. It is well known that the transversal Doppler effect is just the time dilation effect of a moving clock. Equation (50) shows that the time retardation effect is of the second order, i.e. the directional parameter $q$ does not appear in the transversal Doppler shift. In fact, this result is given in Ref. 18, p. 33, i.e. by (19), because the MS and Edwards transformations are equivalent to each other to the first order, as mentioned above. Thus we have from (19)
$$ \nu' = \nu\left(1 + \frac{v}{c}\cos\theta\right) \tag{52} $$
to the first order. For the transversal shift, i.e. $\theta = \pi/2$, (52) reads $\nu = \nu'$. Therefore we see that there is also no first order effect in the transversal Doppler shift. Finally we want to stress that the analyses in this section are trivial and unnecessary, because the general discussions of the previous sections are valid for all physical events involving the so-called first order experiments.

7. DISCUSSION AND CONCLUSIONS

The following relations can be shown by comparing the Lorentz transformation, the Edwards transformation (12), the Robertson transformation (23), and the Mansouri-Sexl transformation (30):
$$ \begin{array}{ccc} \text{Lorentz} & \xleftarrow{\;q = 0\;} & \text{Edwards} \\ \uparrow\; c_\parallel = c_\perp = c & & \uparrow\; c_\parallel = c_\perp = c \\ \text{Robertson} & \xleftarrow{\;q = 0\;} & \text{Mansouri-Sexl} \end{array} $$
The advantages of the new set of parameters $(c_\parallel, c_\perp, \mathbf{q}, d)$ are not only their explicit physical meaning, but also the separation of the parameters representing the two-way speed of light from the one representing the one-way speed of light: $c_\perp \neq c_\parallel$ represents the anisotropy of the two-way
speed of light; a non-zero value of $q$ implies the anisotropy of the one-way speed of light. Furthermore the parameter $d$ becomes a common ("conformal") factor, and thus is a trivial constant. Therefore $c_\parallel$ and $c_\perp$ may be determined by measuring the two-way speed of light. However, the directional parameter $q$ cannot appear in any experiment where simultaneity is defined by a light signal. This implies that, when the different clock synchronizations are taken into account, the Edwards transformation (12) is equivalent to the Lorentz transformation, while the MS transformation is equivalent to the Robertson transformation. We thus come to the following conclusions.

(i) The Mansouri-Sexl transformation predicts the same observable effects as the Robertson transformation, just as the Edwards transformation does with the Lorentz transformation.

(ii) In other words, the directional parameter $q$ cannot be observed in any physical experiment. That is to say, its modulus can be given any value in the range $(-1, +1)$, or the definition of simultaneity can be chosen arbitrarily. Einstein simultaneity is the simplest one among the theories in which the one-way speed of light is isotropic, while Robertson simultaneity is the simplest one among the theories in which the two-way speed of light is anisotropic. This conclusion agrees with [17].

(iii) Therefore, a test of the Mansouri-Sexl transformation is just a test of the anisotropy of the two-way speed of light (and a test of the parameter $d$), but not a test of the anisotropy of the one-way speed of light. (The new type of measurements reported in [17] are tests involving unidirectional propagation along several baselines together with clock transport connecting the ends of each baseline. These kinds of experiments are related to the problem of slow transport of clocks.)
For instance, the constancy of the values obtained in earlier measurements of the two-way speed of light may yield a limit on the two parameters $c_\perp$ and $c_\parallel$, and then the second-order Doppler effects may give a limit on the third parameter $d$.

ACKNOWLEDGMENTS

The author would like to thank Profs. James M. Nester (Department of Physics, National Central University, Chungli, Taiwan) and Wei-Tou Ni (Department of Physics, National Tsing Hua University, Hsinchu, Taiwan) for some helpful discussions, and also wishes to thank Professor C. M. Will (Department of Physics, Washington University) for some suggestions on revising this paper.
REFERENCES

1. Einstein, A. (1905). Ann. Phys. (Leipzig) 17, 891.
2. Robertson, H. P. (1949). Rev. Mod. Phys. 21, 378.
3. Reichenbach, H. (1958). The Philosophy of Space and Time (Dover Publications, Inc., New York).
4. Grünbaum, A. (1960). In Philosophy of Science, A. Danto and S. Morgenbesser, eds. (Meridian Books, New York).
5. Ruderfer, M. (1960). Proc. IRE 48, 1661.
6. Edwards, W. F. (1963). Am. J. Phys. 31, 482.
7. Winnie, J. A. (1970). Phil. Sci. 37, 81; 37, 223.
8. Mansouri, R., Sexl, R. U. (1977). Gen. Rel. Grav. 8, 497, 515, 809.
9. Bertotti, B. (1979). Radio Sci. 14, 621.
10. MacArthur, D. W. (1986). Phys. Rev. A 33, 1.
11. Haugan, M. P., Will, C. M. (1987). Physics Today 40, 69.
12. Abolghasem, G., Khajehpour, M. R. H., Mansouri, R. (1988). Phys. Lett. A 132, 310.
13. Riis, E., et al. (1988). Phys. Rev. Lett. 60, 81; (1989). ibid. 62, 842.
14. Bay, Z., White, J. A. (1989). Phys. Rev. Lett. 62, 841.
15. Gabriel, M. D., Haugan, M. P. (1990). Phys. Rev. D 41, 2943.
16. Krisher, T. P., et al. (1990). Phys. Rev. D 42, 731.
17. Will, C. M. (1992). Phys. Rev. D 45, 403.
18. Zhang, Y. Z. (1979/1994). Experimental Tests of Special Relativity (Science Press, Beijing).
19. Römer, O. (1676). Mem. Acad. 10, 575; Karlov, L. (1970). Austral. J. Phys. 23, 243.
Chapter 8
Common Relativity and its 4-Dimensional Symmetry
J. P. Hsu (1976-1983), T. N. Sherry (1980).
Common Time in a Four-Dimensional Symmetry Framework¹

J. P. Hsu² and T. N. Sherry³

Received March 9, 1979

Following the ideas of Poincaré, Reichenbach, and Grünbaum concerning the convention of setting up clock systems, we analyze clock systems and light propagation within the framework of four-dimensional symmetry. It is possible to construct a new four-dimensional symmetry framework incorporating common time: observers in different inertial frames of reference use one and the same clock system, which is located in any one of the frames. Consequently, simultaneity has a meaning independent of position and independent of frame of reference. A further consequence is that the two-way speeds of light alone are isotropic in any frame. By the choice of clock system there will be one frame in which the one-way speed of light is isotropic. This frame can be arbitrarily chosen. The difference between one-way speeds and two-way speeds of light signals is considered in detail.
1. INTRODUCTION

It is beyond doubt that the special theory of relativity, based on (A) the principle of relativity for physical laws and (B) the principle of universal light speed, is a consistent theory. Today, it is widely believed that the universality of light speed is an inherent property of nature; in fact, that it is a principle that must be operating in nature. The Michelson-Morley experiment(1) provides the support for this belief. Furthermore, it is also believed that the only possible physical time is the relativistic time. It is true that if one accepts the two principles of special relativity (A) and (B) there
¹ Work supported by the NRC, NASA, and the U.S. DOE.
² Space Sciences Laboratory, NASA/Marshall Space Flight Center, Huntsville, Alabama. Present address: Physics Department, Southeastern Massachusetts University, North Dartmouth, Massachusetts.
³ International Center for Theoretical Physics, Miramare, Trieste, Italy.
0015-9018/80/0200-0057$03.00/0 © 1980 Plenum Publishing Corporation
is nothing new to be gained from the treatment of a four-dimensional symmetry framework. We examine closely the assumptions (A) and (B) which form the basis for special relativity. From the physical point of view it appears that the principle (A) is fundamental and necessary. The principle (B), however, is neither fundamental nor necessary. In fact, our analysis shows that the light principle (B) is, essentially, a convention. In other words, special relativity based upon the principles (A) and (B) is only a particular realization of the four-dimensional symmetry. It is our contention that assumptions which are incompatible with those of special relativity, i.e., with (A) and (B), can be used as the foundations for a new and consistent four-dimensional symmetry framework.⁴

The assumptions which we use in setting up our alternative four-dimensional framework are: (A) The principle of relativity for physical laws, according to which the laws of physics should take the same form in all inertial reference frames. (B') The existence of a common, or universal (as we have previously(2,3) named it), time which is the same for all observers, even those in different reference frames.

The role of convention in clock synchronization has been discussed by, among others, Poincaré,(4) Reichenbach,(5) and Grünbaum.(6) It is clear that the analysis of light propagation cannot be separated from the discussion of how to set up clock systems, because, from an experimental point of view, it is time, not speed, which is the primary concept. Evidently, an observer in a reference frame that does not have a clock system cannot measure the velocity of an object, let alone decide whether the speed of light in that frame is the same as or different from the speed of light in another frame.
Within the context of the special theory of relativity, if there are a number of different inertial frames moving relative to each other, we must construct a number of grids of clocks and associate one grid of clocks with each frame. The clocks in each frame are synchronized by using light signals, which are assumed to travel with the universal speed. Any observer can read and compare the times of moving clocks and clocks at rest in his frame of reference. We note that in special relativity an observer at rest in a frame
⁴ We might draw an analogy here to the situation in geometry: If one accepts all of Euclid's axioms, one has a self-consistent geometry. However, Euclid's "parallel axiom" bothered some mathematicians. When it was shown that assumptions incompatible with those of Euclid could be used as the foundation of consistent deductive geometries, it was realized that his parallel hypothesis was not necessary for geometry. The analysis of Euclid's parallel axiom eventually had a profound influence on the evolution of scientific and philosophic thought.
must use the grid of clocks associated with that frame to define time and thus to measure speed. Within a new theory based on assumptions (A) and (B') it is clear that the clocks used in special relativity will not be suitable. Within such a different context we consider the following clock system for observers in different frames: We need to construct only one grid of clocks in, say, the frame F. This F frame can be chosen arbitrarily because we do not assume a preferred frame. We synchronize the clocks in F so that the speeds of light signals are constant and isotropic in F. By convention, all observers, whatever their frame of reference, use this grid of clocks to define time. It is not necessary to set up a grid of clocks in each frame. To be more specific, we assume that there is a clock at each point in the F frame and that there is an observer at each point in every inertial frame. To record time the observers use the clock nearby so that they do not have to worry about the finite speed of the light signal. In this way, observers in different frames make use of a common time rather than the relativistic time. Such a common time sounds strange. However, after some reflection, one sees that there is no logical self-contradiction in such a definition. In principle, any observer can read a clock, whatever its state of motion (see Appendix). As we have emphasized, the choice of frame F is arbitrary. If one wishes, the clock system in F can be discarded and another one can be set up in a different frame F'. The clocks can be synchronized so that the speed of light is constant and isotropic in F', and we then require that all observers use the grid of clocks in F'. However, the time of this clock system in F' will be different from the time of the clock system in F.
Thus, common time is not absolute in the sense of the Newtonian time.⁵ As a matter of fact, the common time is the most naive concept of time, which is intuitively clear and is used by people in daily life. Such a simple concept of common time was rejected by physicists after the advent of special relativity because its use in a three-dimensional symmetry framework (with the Galilean transformation) leads to false results. However, if one uses common time in a four-dimensional symmetry framework, a very different picture of the world emerges. The initial suggestion and formulation of such a four-dimensional symmetry framework was given recently by Hsu.(2) It was argued in Ref. 2 that the new symmetry framework is actually consistent with the same experiments which are quoted as evidence for the special theory of relativity. Experiments such as the Michelson-Morley experiment, the Kennedy-Thorndike experiment, and even the time dilation experiment have been
⁵ Newton believed that there existed only one time, the "absolute, true and mathematical time, (which) of itself, and by its own nature, flows equally without relation to anything external."
examined within the new framework in great detail. Unfortunately, the original paper led to some misunderstandings.(7,8) Certain physical quantities were not properly interpreted, or used. In the present paper we hope to present the formulation of the framework in a clearer fashion. In this paper we place emphasis upon the setting up of clock systems and the operational meaning of common time in the four-dimensional symmetry framework. We demonstrate that, using the common time, a mathematical framework that displays four-dimensional symmetry can be set up. In this framework, spacetime four-vectors
$$ x^\mu = (x, y, z, bt) \tag{1} $$
are transformed in a manner similar to special relativity. The differences between special relativity and this formalism, which might be termed common relativity, are (a) the transformation properties of time and (b) the definition of and transformation properties of light speed. Special relativity assumes that the coefficient of time is a universal constant. This forces t to transform and leads us to identify the universal constant with the (unique) speed of light. On the other hand, within the framework of common time, the time t does not transform. Invariance under four-dimensional transformations forces the coefficient b of t to become a variable. Furthermore, it precludes the interpretation of this coefficient as the (unique) speed of light. We clarify also the properties of light propagation within this framework. It turns out that one-way and two-way speeds of light have very different properties. For example, the one-way speed of light signals in a fixed direction will be constant in an arbitrary frame, but it will vary with direction; i.e., it is not isotropic. However, two-way speeds of light (their definition will be given in Section 4) are isotropic in any inertial frame, though they have different values in different frames. In setting up the formalism we will begin by choosing an F frame arbitrarily. This is the frame in which the common time is defined. Thus the light speed is uniquely defined in this frame; i.e., the coefficient of t in (1) is constant in the F frame, and its magnitude equals the speed of light in the F frame. We shall then perform the transformations to other frames of reference. These transformations will enable us to discuss the various possible one-way and two-way speeds of light. We will then be able to write down the general transformation between arbitrary frames of reference.
2. THE FOUR-DIMENSIONAL TRANSFORMATION. A SPECIAL CASE

In this section we introduce the new transformation laws between inertial frames of reference that occur in our framework. It is first necessary
Lorentz and Poincare Invariance
to understand how inertial frames are characterized within our framework. Inertial frames of reference are inherently four dimensional: three of the dimensions we identify as the three space dimensions x, y, and z, and the fourth we call, for the moment, $x^0$. This fourth quantity (or coordinate) $x^0$ has the same length dimension as x, y, and z. In terms of the time variable we have

$$x^0 = bt \tag{2}$$
If the time variable t that we use in Eq. (2) is the common time, then the framework is that of common relativity. In that case the quantity b, which is variable, does not have immediate physical significance. However, as we investigate the theory further, its meaning and properties will become clearer. If, on the other hand, the variable t in (2) is the relativistic time, then the framework is that of special relativity, in which case b is a universal constant, the same in all inertial frames of reference, and is identified with the speed of light. Whichever framework we specify, it is clear that the inertial frame of reference is the same when specified in terms of x, y, z, and $x^0$. It is only when we decompose $x^0$ according to Eq. (2) that the differences show. Formally, we can define the infinitesimal interval of the four-dimensional flat space in the most general form

$$ds^2 = [d(bt)]^2 - dx^2 - dy^2 - dz^2 \tag{3}$$

in a frame F. If we require that the infinitesimal interval (3) be invariant under changes of inertial reference frames, i.e.,

$$[d(bt)]^2 - dx^2 - dy^2 - dz^2 = [d(b't')]^2 - dx'^2 - dy'^2 - dz'^2 \tag{4}$$

we are led to the following general four-dimensional transformation between two arbitrary inertial frames F and F':

$$x' = \gamma(x - \beta bt), \quad y' = y, \quad z' = z, \quad b't' = \gamma(bt - \beta x), \quad \gamma = (1 - \beta^2)^{-1/2} = \text{const} \tag{5}$$

In general b' and t' could be functions (of x and t) in the transformation. In the special case when we identify b' and b with the (universal) speed of light c, Eqs. (5) reduce to the Lorentz transformations of special relativity. However, in common relativity the equations have a different structure because of the definition of common time, viz.

$$t' = t \tag{6}$$
We notice, further, that the parameter $\beta$ is not defined in the transformation (5). To understand the meaning and properties of b, b', and $\beta$ in the common relativity we now make use of the operational definition of common time which we introduced in Section 1. The interpretation of $\beta$ in the general case will be given in Section 4, while in Section 3 we shall examine the relationship between the quantities b and b' and the operational definitions of the speeds of light signals. Let us examine the special case when the frame F is the frame in which the clock system is set up to define the common time. Then in F for light emitted from an arbitrary source the speed is constant, isotropic, and uniquely defined. Thus, in the frame F we have

$$b = c = \text{const} \tag{7}$$

where the symbol c is reserved for speeds of light. Consequently, the parameter $\beta$ in Eq. (5) can be identified with the ratio v/c, where v is the speed of the frame F' as measured in the F frame. Then the new four-dimensional transformation from the F frame (in which the clock system is set up to define the common time) to any other frame F' is given by

$$x' = \gamma(x - vt), \quad y' = y, \quad z' = z, \quad b't = \gamma(ct - vx/c), \quad \gamma = (1 - v^2/c^2)^{-1/2} \tag{8}$$
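As a numerical illustration of the special transformation (8), the following minimal sketch maps an event from F to F' and checks that the interval of Eq. (4) is preserved while the common time t is left untouched. The function names, the sample event, and the choice of units with c = 1 are assumptions made for illustration only; they do not appear in the paper.

```python
import math

def gamma(v, c=1.0):
    """Factor gamma = (1 - v^2/c^2)^(-1/2) appearing in Eq. (8)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def to_moving_frame(x, y, z, t, v, c=1.0):
    """Map an event (x, y, z, t) in F to (x', y', z', b't) in F' via Eq. (8).

    The common time t is not transformed; only the product b't is.
    """
    g = gamma(v, c)
    return g * (x - v * t), y, z, g * (c * t - v * x / c)

def interval(x, y, z, w):
    """Four-dimensional interval, with w the fourth (length-valued) coordinate."""
    return w ** 2 - x ** 2 - y ** 2 - z ** 2

# Hypothetical sample event and frame speed, in units with c = 1.
x, y, z, t, v = 0.3, 0.2, -0.1, 2.0, 0.6
xp, yp, zp, bt_p = to_moving_frame(x, y, z, t, v)
b_prime = bt_p / t  # the variable coefficient b' for this event (requires t != 0)
```

In this sketch `interval(x, y, z, t)` and `interval(xp, yp, zp, bt_p)` agree to rounding error, and `b_prime` differs from c, which is the sense in which the coefficient of t becomes a variable.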
Here, b' is a function (of x/t) in the transformation. Its appearance in the transformation laws is essential if we are to have a four-dimensional, rather than a three-dimensional, transformation incorporating common time. The transformation laws (8), taken by themselves, can easily be misunderstood. It is important to remember that they define a transformation between inertial frames of reference, but where the decomposition (2) with common time is employed. Thus, for example, the coordinate four-vector is (x', y', z', b't) and not (x', y', z', t): the relationship between (x, y, z, t) and (x', y', z', t) is a three-dimensional transformation, and so cannot give rise to general transformations between four-dimensional inertial frames. Let us examine now some of the properties of the transformations (8): (i) When we consider an event in the F frame with position coordinates x, y, z ($x \neq 0$) and time coordinate t = 0, the transformation law appears to be singular. In particular, the transformation

$$b't = \gamma(ct - vx/c) \tag{9}$$
needs to be correctly defined for such events. The quantity (or function) b' is defined for such events to be infinite, but with the following property:

$$\text{``}b'(t = 0) \text{ times } 0\text{''} = \lim_{t \to 0}\,(b't) = -\gamma\beta x \tag{10}$$
With this definition the transformation law (9) is consistent for such events. In fact, although b' has the dimensions of a speed, this singular definition of $b'(x \neq 0, t = 0)$ does not affect known physical concepts, for, as we shall see in Section 3, this divergent b' is not related to any physical speed.9 (ii) When we examine an event for which both x = 0 and t = 0, but y and z are arbitrary, in the F frame we see that the corresponding value of b't in the F' frame is, in fact, zero no matter how x and t approach zero. (iii) When we consider two events in the F frame, one at $x_1$ at time $t_1$, and the other at $x_2$ at time $t_2$, then the transformation laws for the intervals $\Delta x = x_1 - x_2$ and $\Delta t = t_1 - t_2$ are

$$\Delta x' = \gamma(\Delta x - v\,\Delta t), \quad \Delta(b't) = \gamma(c\,\Delta t - v\,\Delta x/c) \tag{11}$$

where

$$\Delta(b't) = (b't)_1 - (b't)_2 = (b_1' - b_2')\,t_1 + b_2'(t_1 - t_2) = \Delta b'\,t_1 + b_2'\,\Delta t \tag{12}$$

(iv) The apparent singularity in the transformations (8) referred to in (i) only occurs at the origin of t. For, if for the two events in (iii) we have $t_1 = t_2 = t$, i.e., $\Delta t = 0$, then Eqs. (11) give us

$$\Delta x' = \gamma\,\Delta x, \quad \Delta(b't) = (\Delta b')\,t = -\gamma v\,\Delta x/c \tag{13}$$

which is not singular. (v) If the two events in (iii) correspond to two events on the world line of a particle moving with constant speed w in the F frame (through the origin of F), then we have

$$x_1/t_1 = x_2/t_2 = w \tag{14}$$

and so

$$\Delta b' = b_1' - b_2' = \gamma\beta\,(x_2/t_2 - x_1/t_1) = 0$$

9 We note in passing that the extra constraint b't = 0 for t = 0, which was imposed in Ref. 2, is now seen to be unnecessary.
and so in this case we have

$$\Delta x' = \gamma(\Delta x - v\,\Delta t) = \gamma(w - v)\,\Delta t, \quad \Delta(b't) = b'\,\Delta t = \gamma c\,(1 - wv/c^2)\,\Delta t \tag{15}$$

The first of Eqs. (15) is quite interesting, as it shows that in the F' frame also, $\Delta x'$ is linear in $\Delta t$. In other words, the path of the particle is a straight line in the F' frame, even with our unconventional use of common time. Before we progress any further in our understanding of the quantities b and b', it is necessary to understand how the propagation of light signals is treated in our framework. We will use this treatment, which follows in Section 3, to give whatever physical meaning is possible to b'.
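The statement that a uniformly moving particle has a constant b' and a straight-line path in F' can be checked numerically. The sketch below samples events on a world line x = wt through the origin of F and evaluates b' and x'/t from Eq. (8). The helper names and the sample values of v and w are assumptions for illustration, and units with c = 1 are used.

```python
import math

def b_prime(x, t, v, c=1.0):
    """b' = (b't)/t from the fourth relation of Eq. (8); defined for t != 0."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (c * t - v * x / c) / t

def x_prime(x, t, v, c=1.0):
    """x' from the first relation of Eq. (8)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (x - v * t)

v, w = 0.6, 0.25                  # hypothetical frame speed and particle speed in F
times = [0.5, 1.0, 2.0, 4.0]      # common times at which the particle is observed
b_values = [b_prime(w * t, t, v) for t in times]
slopes = [x_prime(w * t, t, v) / t for t in times]
```

Every entry of `b_values` equals $\gamma c(1 - wv/c^2)$ and every entry of `slopes` equals $\gamma(w - v)$, in line with Eqs. (14) and (15).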
3. THE PROPAGATION OF LIGHT

Clearly, a proper treatment of the propagation of light and an understanding of the speed of light signals play a crucial role in the development of the new symmetry framework. Whereas in the special theory of relativity it is the assumption of a universal light speed that leads us to use a time that transforms in an unusual way, so in this case it is the assumption of a common time that leads us to consider unusual transformations for the speed of light. Let us consider a light signal emitted at time t = 0 at the origin of the special frame F. The propagation of this light signal in F is given by

$$c^2t^2 - x^2 - y^2 - z^2 = 0 \tag{16}$$

We assume that all light signals that travel in the same direction in the frame F have the same speed relative to F. With this assumption it is clear that the motion of the source of a light signal has no effect on the speed of the light signal. We find the law of propagation of the above light signal in an arbitrary frame F' by applying the four-dimensional transformation (8) to Eq. (16):

$$(b't)^2 - (x')^2 - (y')^2 - (z')^2 = 0 \tag{17}$$

The form of this equation is the same as that of (16). In this way we see that the principle of relativity is obeyed for light propagation. In the F frame the speed of light is uniquely defined. This does not occur in any other reference frame. In the F' frame the speed of the light signal that propagates according to Eq. (17) is

$$b' = |\mathbf{r}'|/t = d(b't)/dt \tag{18}$$
where b't is given by the transformation law (8), for an event (x, y, z, ct) on the world line of the light signal in F. Thus the speed is

$$\frac{d}{dt}(b't) = \gamma\left(c - \frac{v\dot{x}}{c}\right) \tag{19}$$

where $\dot{x}$ is the x component of the speed of the same light signal in F. If the light signal travels along the x axis of F, we have $\dot{x} = \pm c$ and the speed of the light signal in F' is given by

$$\gamma(c \mp v) = c'_\mp \tag{20}$$

depending on whether the light signal travels in the positive or negative direction along the x axis. We call the speed defined in this way a one-way speed of light. In the general case when the light signal travels in an arbitrary direction, specified by spherical polar angles $\theta$ and $\phi$, we have

$$\dot{x} = c\sin\theta\cos\phi \tag{21}$$

so that the corresponding one-way speed of this light signal in the F' frame is given by

$$\gamma(c - v\sin\theta\cos\phi) \tag{22}$$

The above prescription is crucial to a proper understanding of our work. It can be restated as follows: To find the speed of a light signal in F' we use the fourth transformation from (8) subject to a constraint which corresponds to a description of the physical arrangement. Thus, for example, the constraint $x = \pm ct$ corresponds to an event on the world line of the light signal in the F frame, and the imposition of this constraint in Eq. (19) yields a one-way speed of light in F' corresponding to the direction of the light signal in F. From Eq. (22) it is clear that the one-way speeds of light signals in different directions in the F' frame will not be the same. Thus the one-way speed of light signals in F' is not isotropic. Each frame of reference is characterized by a set of one-way speeds of light corresponding to the different possible directions.
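The anisotropy of the one-way speed can be made concrete with a short sketch that evaluates Eq. (22) for a few directions. The function name and the sample frame speed are assumptions for illustration; the angles refer to the direction of the signal in F, and units with c = 1 are used.

```python
import math

def one_way_speed(theta, phi, v, c=1.0):
    """One-way light speed in F' per Eq. (22), for a signal whose direction
    in F is given by the spherical polar angles (theta, phi)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (c - v * math.sin(theta) * math.cos(phi))

v = 0.6  # hypothetical speed of F' relative to F
along_plus_x = one_way_speed(math.pi / 2, 0.0, v)        # gamma * (c - v)
along_minus_x = one_way_speed(math.pi / 2, math.pi, v)   # gamma * (c + v)
along_y = one_way_speed(math.pi / 2, math.pi / 2, v)     # gamma * c
```

The three values differ, which is the direction dependence described in the text; only in the limit v = 0 do they coincide with c.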
We must also describe how to calculate the one-way speed of a light signal that is emitted from an arbitrary point of the frame F. For this case it is helpful to define

$$c' = \frac{d}{dt}(b't) \tag{23}$$

which is easy to understand in terms of velocities. In fact, with this definition, $(\dot{x}, \dot{y}, \dot{z}, c)$ forms a four-vector, as can be seen from the transformation law for velocities

$$\dot{x}' = \gamma(\dot{x} - v), \quad \dot{y}' = \dot{y}, \quad \dot{z}' = \dot{z}, \quad c' = \gamma(c - v\dot{x}/c) \tag{24}$$

where $\dot{x}' = dx'/dt$, etc., and we note that $c' = c'(\dot{x})$. In the general case, the prescription is that if $\dot{x}$ is related to the x component of the speed of light in the F frame, as in (21), then the corresponding c' is the one-way speed of this light signal measured in F'. The above prescription tells us how to calculate the one-way speed of any light signal in an arbitrary frame of reference. Furthermore, in so doing we have given physical meaning to $b'(x/t)$ for $|x/t| < c$, or $d(b't)/dt$ for $|\dot{x}| < c$. When this restriction is imposed, $\dot{x}$ is the x component of the speed of a light signal in some direction. For $|\dot{x}| > c$ there is no such interpretation possible. These latter values of $b'(x/t)$, or $d(b't)/dt$, enter the theory to define the metric, but we do not require that they represent the speeds of physical objects. There are other speeds of light which can be defined, namely the two-way speeds of light. In fact, as we will show in the following section, values of $b'(x/t)$ or $(d/dt)(b't)$ for $|\dot{x}|$ or $|x/t| < c$ can also be identified as two-way speeds of light. This interpretation enables us to completely define the four-dimensional transformation between arbitrary frames of reference.
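The velocity law (24) can be checked against the light-propagation law (17): for a light signal in F, the magnitude of the transformed three-velocity should equal the transformed fourth component c'. The sketch below does this for one arbitrarily chosen direction. The function name and sample numbers are assumptions for illustration, with c = 1.

```python
import math

def transform_velocity(xdot, ydot, zdot, v, c=1.0):
    """Apply Eq. (24) to the four-vector (xdot, ydot, zdot, c) defined in F.

    Returns (xdot', ydot', zdot', c'), all taken with respect to common time.
    """
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (xdot - v), ydot, zdot, g * (c - v * xdot / c)

theta, phi, v = 1.1, 0.7, 0.6   # hypothetical direction in F and frame speed
light = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))        # unit-speed light velocity in F (c = 1)
xd, yd, zd, c_p = transform_velocity(*light, v)
speed_in_f_prime = math.sqrt(xd ** 2 + yd ** 2 + zd ** 2)
```

Here `speed_in_f_prime` coincides with `c_p`, so the fourth component of the transformed four-vector is indeed the one-way speed of that signal in F'.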
4. THE TRANSFORMATION BETWEEN ARBITRARY FRAMES OF REFERENCE

As we noted in the previous section, there is another speed of light that can be defined. We call it a two-way speed of light because it is the average speed of a light signal that travels first in one direction and then in the opposite direction. It is clear from the previous section that such a quantity has a role to play within our framework, since the one-way speeds are not isotropic in an arbitrary frame. To illustrate the concept let us evaluate the average speed of a light signal emitted from the origin of F along the +x axis and reflected back to the origin by a mirror. We depict in Fig. 1 the path of this light signal, which we denote by (a), from the points of view of both F and F'. In F the calculation is trivial and we have

$$c_{av} = (ct_1 + ct_2)/(t_1 + t_2) = c \tag{25}$$

where $t_1$ and $t_2$, the times for the outward and inward trips, respectively, are equal. In F' the calculation is as simple and we find

$$c'_{av} = (c'_- t_1 + c'_+ t_2)/(t_1 + t_2) = \tfrac{1}{2}(c'_- + c'_+) = \gamma c, \quad t_1 = t_2 \tag{26}$$

[Fig. 1. The paths traveled by light signal (a) in the F and F' frames.]
This answer does not depend on the distance traveled in either direction. Furthermore, as can be seen from Eq. (22), this speed is also isotropic. In the previous section we gave a prescription which could be used to calculate the one-way speed of light in an arbitrary direction in any frame of reference. A similar prescription can be given to evaluate the speed in (26). In the fourth transformation law of (8), $b't = \gamma(ct - vx/c)$, we impose the constraint that at the end the light signal returns to its point of origin in F; i.e., x = 0. With this constraint the rhs reduces to $\gamma ct$, which gives the required two-way speed $\gamma c$. In what follows it will be simpler to use this prescription rather than an explicit calculation, although they yield the same answer in each case.
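The isotropy of the two-way speed can be illustrated by averaging the one-way speeds of Eq. (22) for a direction and its reverse. The sketch below assumes, as in the discussion of Eq. (26), that the outward and return legs take equal common time, so that the average speed is the plain mean of the two one-way speeds. Function names and sample values are illustrative assumptions, with c = 1.

```python
import math

def one_way_speed(theta, phi, v, c=1.0):
    """One-way light speed in F' per Eq. (22)."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * (c - v * math.sin(theta) * math.cos(phi))

def two_way_speed(theta, phi, v, c=1.0):
    """Average speed in F' for an out-and-back trip that starts and ends at
    the same point of F. Equal leg times make this a plain mean."""
    out = one_way_speed(theta, phi, v, c)
    back = one_way_speed(math.pi - theta, phi + math.pi, v, c)  # reversed direction
    return 0.5 * (out + back)

v = 0.6  # hypothetical speed of F' relative to F
directions = [(math.pi / 2, 0.0), (math.pi / 2, math.pi / 2), (0.4, 2.0), (1.3, -0.8)]
two_way_values = [two_way_speed(th, ph, v) for th, ph in directions]
```

All entries of `two_way_values` equal $\gamma c$, independent of direction, which is the result quoted in Eq. (26).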
Table I. The Velocities of the Frames F, F' and F" Relative to Each Other (a)

        F                          F'           F"
F       0                          $v$          $v_1$
F'      $v' = -\gamma(v)v$         0            $u_1'$
F"      $v'' = -\gamma(v_1)v_1$    $u_1''$      0

(a) E.g., the velocity of F' relative to F is v (along the x axis).
There are a number of different two-way speeds of light which can be defined. Let the frames F and F' be as we have used them earlier. We introduce a third frame F" which moves along the +x axis of F with speed $v_1$. We list in Table I the speeds of these three frames relative to one another. We discuss three types of light signals: (a) a light signal as introduced above, emitted from the origin of the x axis and reflected back to the origin in F; (b) a light signal that returns to its point of origin in F'; (c) a light signal that returns to its point of origin in F". Furthermore, in each of the three frames we can ask what is the average speed of each of the three light signals. In Table II we define the notation which we use for these speeds. For example, $c_2'(F)$ is the two-way speed in the F' frame of the light signal that returns to its point of origin in F. Since the speed of F relative to F' is denoted by v', we can also denote this by $c_2'(v')$. In Fig. 2 we have depicted the paths of light signal (b) in both the F and F' frames. Since the speed of light is uniquely defined in the frame F it is clear that

$$c_2(0) = c_2(v) = c_2(v_1) = c \tag{27}$$

The result derived in Eq. (26) gives us

$$c_2'(F) = c_2'(v') = \gamma(v)c, \quad c_2''(F) = c_2''(v'') = \gamma(v_1)c \tag{28}$$

Table II. Notation for Two-Way Speeds of Light Signals (a), (b), and (c) in the Frames of Reference F, F', and F"

        F                        F'                          F"
(a)     $c_2(F)$ [$c_2(0)$]      $c_2'(F)$ [$c_2'(v')$]      $c_2''(F)$ [$c_2''(v'')$]
(b)     $c_2(F')$ [$c_2(v)$]     $c_2'(F')$ [$c_2'(0)$]      $c_2''(F')$ [$c_2''(u_1'')$]
(c)     $c_2(F'')$ [$c_2(v_1)$]  $c_2'(F'')$ [$c_2'(u_1')$]  $c_2''(F'')$ [$c_2''(0)$]
[Fig. 2. The paths traveled by light signal (b) in the F and F' frames.]
Of the remaining four types, $c_2'(F')$ and $c_2''(F'')$ are calculated in the same way, and $c_2'(F'')$ and $c_2''(F')$ are related to one another by the interchange of v and $v_1$. To evaluate $c_2'(F')$, we use $(d/dt)(b't) = \gamma(c - v\dot{x}/c)$ subject to the constraint that $\dot{x}' = 0$; i.e., the signal returns to its point of origin in the F' frame. This constraint, using Eq. (24), gives $\dot{x} = v$, so that

$$\gamma(c - v^2/c) = c/\gamma$$

i.e.,

$$c_2'(F') = c_2'(0) = c/\gamma(v) \tag{29}$$

Similarly, then, we also have

$$c_2''(F'') = c_2''(0) = c/\gamma(v_1) \tag{30}$$
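The two results $\gamma c$ and $c/\gamma$ are both instances of the fourth relation in Eq. (24), evaluated at the velocity (in F) of the frame in which the signal returns to its starting point. A minimal sketch, with a hypothetical function name and sample speed, and c = 1:

```python
import math

def gamma(v, c=1.0):
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def two_way_speed_in_f_prime(xdot, v, c=1.0):
    """Fourth relation of Eq. (24): speed measured in F' (moving at v in F)
    of a light signal that returns to its origin in a frame moving at xdot in F."""
    return gamma(v, c) * (c - v * xdot / c)

v = 0.6
returns_in_f = two_way_speed_in_f_prime(0.0, v)      # signal (a): gamma * c
returns_in_f_prime = two_way_speed_in_f_prime(v, v)  # signal (b): c / gamma, Eq. (29)
```

For this choice `returns_in_f` is 1.25 and `returns_in_f_prime` is 0.8; their product is $c^2$, a relation that follows directly from the two formulas.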
To evaluate $c_2'(F'')$ we must first write down the four-dimensional transformation between the frames F' and F". The general structure of this transformation is given by Eq. (5), where the unknown parameter is $\bar\beta$, say, to distinguish it from the $\beta$ that occurs in (8). By considering the transformations from F to F' and F to F" of the type (8), we deduce the following form for the four-dimensional transformation from F' to F":

$$x'' = \frac{(1 - \beta\beta_1)\,x' + (\beta - \beta_1)\,b't}{(1 - \beta^2)^{1/2}(1 - \beta_1^2)^{1/2}}, \quad y'' = y', \quad z'' = z', \quad b''t = \frac{(1 - \beta\beta_1)\,b't + (\beta - \beta_1)\,x'}{(1 - \beta^2)^{1/2}(1 - \beta_1^2)^{1/2}} \tag{31}$$

where $\beta_1 = v_1/c$. This set of transformations can be written in the form of Eqs. (5), where the parameter $\bar\beta$ is identified as

$$\bar\beta = (\beta_1 - \beta)/(1 - \beta\beta_1) \tag{32}$$
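The identification of the parameter in Eq. (32) can be verified by composing transformations numerically: going from F to F" directly should agree with going from F to F' and then from F' to F" with the combined parameter $(\beta_1 - \beta)/(1 - \beta\beta_1)$. In the sketch below, `boost` is a hypothetical helper implementing Eq. (5) in the plane of x and the fourth coordinate, and all numbers are illustrative (c = 1).

```python
import math

def boost(x, w, beta):
    """Eq. (5) restricted to the (x, w) plane, where w stands for the fourth
    coordinate (bt in F, b't in F', and so on)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (x - beta * w), g * (w - beta * x)

beta, beta1 = 0.6, 0.8                               # v/c and v_1/c, hypothetical
beta_bar = (beta1 - beta) / (1.0 - beta * beta1)     # Eq. (32)

x, w = 0.3, 2.0                                      # an event in F, with w = ct
x_p, w_p = boost(x, w, beta)                         # F -> F'
x_pp, w_pp = boost(x, w, beta1)                      # F -> F" directly
x_pp_via, w_pp_via = boost(x_p, w_p, beta_bar)       # F' -> F" using Eq. (32)
```

The pairs `(x_pp, w_pp)` and `(x_pp_via, w_pp_via)` coincide, so the F' to F" transformation has the form of Eqs. (5) with the parameter of Eq. (32).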
In fact, the result (32) is a particular case of the velocity addition law within our framework. To see this we examine the velocity transformations (24)

$$\dot{x}' = \gamma(v)(\dot{x} - \beta c), \quad c'(\dot{x}) = \gamma(v)(c - \beta\dot{x}) \tag{33}$$

We let $\dot{x}$ be the velocity of an object at rest in the F" frame; i.e., $\dot{x} = v_1$. Then, taking the ratio of the two equations (33), we find

$$\frac{\dot{x}'(v_1)}{c'(\dot{x})|_{\dot{x}=v_1}} = \frac{v_1/c - \beta}{1 - \beta v_1/c} = \frac{\beta_1 - \beta}{1 - \beta\beta_1} = \bar\beta \tag{34}$$

Here $\dot{x}'(v_1)$ is the velocity of F" measured in F', which we have denoted in Table I as $u_1'$,

$$u_1' = \gamma(v)(v_1 - v) \tag{35}$$

and $c'(\dot{x})|_{\dot{x}=v_1}$ is given by (33) subject to the constraint; i.e.,

$$c'(\dot{x})|_{\dot{x}=v_1} = c\,(1 - \beta\beta_1)/(1 - \beta^2)^{1/2} \tag{36}$$

To identify this quantity, let us now evaluate the two-way speed, which we have denoted by $c_2''(F')$. We use the fourth equation of (31) subject to the constraint x' = 0. Thus, using also (8), we have

$$c_2''(F') = \frac{d}{dt}(b''t)\Big|_{x'=0} = \frac{1 - \beta\beta_1}{(1 - \beta^2)^{1/2}(1 - \beta_1^2)^{1/2}}\,\frac{d}{dt}(b't)\Big|_{x'=0} = \frac{c\,(1 - \beta\beta_1)}{(1 - \beta_1^2)^{1/2}} \tag{37}$$
Interchanging v and $v_1$, we then obtain the two-way speed

$$c_2'(F'') = c\,(1 - \beta\beta_1)/(1 - \beta^2)^{1/2} \tag{38}$$

But this is just the quantity we have evaluated in (36). Thus we see that we have now a simple physical interpretation for the parameter $\bar\beta$, namely

$$\bar\beta = u_1'/c_2'(F'') \tag{39}$$

In terms of this parameter we can now write the four-dimensional transformation between arbitrary frames F' and F" as

$$x'' = \bar\gamma(x' - \bar\beta\,b't), \quad y'' = y', \quad z'' = z', \quad b''t = \bar\gamma(b't - \bar\beta\,x') \tag{40}$$

where $\bar\gamma = (1 - \bar\beta^2)^{-1/2}$. In discussing the various two-way speeds of light we have given an alternative physical interpretation to the quantity $d(b't)/dt$ for values $|\dot{x}| < c$. In fact, this quantity is the two-way speed of light measured in F' of a light signal that returns to its point of origin in a frame moving with a velocity $(\dot{x}, 0, 0)$ relative to the F frame. On the other hand, if $|\dot{x}| > c$, then such an interpretation for $d(b't)/dt$ does not exist, as there is no frame of reference that moves with a velocity greater than c relative to the F frame. It is for this reason that the singularity in (8) referred to in Section 2 does not alter physical quantities. We conclude that c' in (33) [or equivalently b' in (8)] has physical significance when it has an operational meaning; i.e., when $0 < |\dot{x}| < c$. In this sense, the four-dimensional transformation may be termed a space-light transformation with common time. The four-dimensional transformations form a Lie group, just as do the Lorentz transformations. Thus the inverse transformations to (8) and (33) can be constructed. For example, suppose an object is at rest in the F frame; then its velocity in the F' frame is given by

$$\dot{x}' = -\gamma v, \quad \dot{y}' = \dot{z}' = 0, \quad c'(0) = \gamma c$$

according to the velocity law (24). Then the speed of the F frame relative to the F' frame is

$$v' = -v/(1 - v^2/c^2)^{1/2}$$
which is different from v. However, the ratio v'/c'(0) is the same as v/c up to the sign:

$$v'/c'(0) = -v/c = -\beta$$

Thus in the inverse to Eqs. (8) we may use $-v'/c'(0)$ in place of v/c.
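The inverse transformation can be sketched in the same way: applying Eq. (5) with the parameter $v'/c'(0) = -\beta$ to the F' coordinates should return the original F coordinates. The helper `boost` and the sample numbers below are assumptions for illustration, with c = 1.

```python
import math

def boost(x, w, beta):
    """Eq. (5) restricted to the (x, w) plane, with w the fourth coordinate."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (x - beta * w), g * (w - beta * x)

v, c = 0.6, 1.0
g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
v_prime = -g * v                      # velocity of F measured in F'
c_prime_0 = g * c                     # c'(0): two-way speed in F' of a signal returning in F
beta_inverse = v_prime / c_prime_0    # equals -v/c

x, t = 0.3, 2.0                       # hypothetical event in F
x_p, w_p = boost(x, c * t, v / c)     # F -> F', Eq. (8)
x_back, w_back = boost(x_p, w_p, beta_inverse)  # F' -> F
```

Although `v_prime` differs in magnitude from v, the ratio `beta_inverse` is exactly $-v/c$, and the round trip recovers (x, ct).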
5. DISCUSSION

In this paper we have discussed the formulation of a four-dimensional symmetry framework making use of a common time in all inertial frames of reference. The purpose of the paper was to see whether or not this concept of time could be incorporated into a genuine four-dimensional symmetry framework. We have shown that this can indeed be done. Essentially the common time is defined by a clock system set up in an arbitrarily chosen inertial frame of reference. In this frame the description of physics is as we expect it. However, since the time is common to all inertial frames, certain physical quantities (but not physical laws) appear to be very different when we transform to another frame of reference. In particular, in any other frame of reference the speed of light (with respect to the common time) is not isotropic. Nevertheless, as we are working in a manifestly four-dimensional framework, the principle of relativity is preserved: physical laws take the same form in all reference frames when written in terms of four-vectors and tensors. The coordinate four-vector is $(x, y, z, x^0)$ where each coordinate has the same (length) dimension. It is only when the four-vector (x, y, z, bt) is decomposed into a three-vector (x, y, z) and the common time t that apparent differences occur. In fact, the transformation from one inertial frame to another is given in terms of the four-vector (x, y, z, bt). It is important to emphasize that the coefficient of the common time t is not identified a priori with the speed of light. We have specified in Section 3 how one calculates the speed of a light signal in an arbitrary frame of reference. According to this well-defined prescription, we use the fourth of the special transformations (8) subject to the constraint that the coordinate x in the F frame lies on the world line of the light signal, and then we calculate $d(b't)/dt$.
In fact, as we saw in Section 4, prescriptions of this type have wider application and can be used to give, for example, two-way speeds of light. Finally, we have completely parametrized the transformation between two arbitrary frames of reference in terms of quantities defined in those frames of reference alone.
Although in this framework the speed of light is not a uniquely defined quantity, nevertheless the maximum speed attainable by a physical object moving in a specific direction is the one-way speed of a light signal in that direction.(8) Thus, the speed of a light signal is a limiting speed in our theory also. It is believed that the special theory of relativity is on very firm ground with regard to experimental support. In the past a number of experiments have been quoted in support of special relativity, including, for example, the Michelson-Morley experiment, the Fizeau experiment, the Kennedy-Thorndike experiment, and the Doppler shift experiment. Each of these experiments has also been shown to be consistent with common relativity.(2,10) For example, within the framework of this paper, the Michelson-Morley experiment yields a null result. The important point to note, as was pointed out in Ref. 2, is that what is actually compared are two-way speeds of light in orthogonal directions. But, as we have seen in Section 4, the two-way speeds of light turn out to be isotropic, even though the one-way speeds are not so. The Fizeau experiment and the Kennedy-Thorndike experiment are treated in a similar vein. In the case of the Doppler shift it actually turns out that for what is experimentally measured the same transformation laws hold as in special relativity, even though the transformations between inertial frames look so different. We include in an appendix a short discussion of this. Even the time dilation experiment is consistent with common relativity.(2) With the use of common time the lifetimes of the same unstable particle measured by observers in different frames of reference have the same value, as one expects. However, if an observer compares the lifetime of a particle at rest and the lifetime of a similar particle in motion, the latter will be the larger of the two.
This result follows because the decay width of an unstable particle is the same, whatever its rest frame. Thus we also achieve consistency with the time dilation experiment. Furthermore, in our framework, just as in special relativity, the speed of a light signal is independent of the state of motion of the source of that light signal. As explained in Section 3, this property is built into the framework. Thus any of the tests of special relativity that test this property alone may actually be seen to test the same property in common relativity. For example, in a recent experiment the arrival time of pulses from a binary x-ray source was measured with great accuracy.(10) The experiment shows that if the dependence of the speed of light on the velocity v of the source is given by c + kv, then k must be smaller than $2 \times 10^{-9}$. The conclusion we draw from these considerations is that the results of the experiments, referred to above, should be seen as evidence for the four-dimensional nature of the symmetry framework rather than just for the specific realization in which the speed of light is universal. The experiments are also
consistent with common relativity. They do not constitute conclusive evidence to prove the universality of the speed of light. They are consistent with that hypothesis; but, as we have just seen, they are also consistent with the alternative hypothesis of common time. It is an interesting question whether or not the universality of the speed of light is a testable hypothesis. In fact, because the definition of light speed depends on the definition of time, it is not possible to test the hypothesis. The hypothesis is, then, a convention which is convenient for the synchronization of clocks in different frames rather than an inherent property of light. This viewpoint resembles greatly the opinion expressed by Poincaré.(3) Suppose the speeds of light signals are measured to be nonuniversal. All this really means is that the clock system does not read relativistic time. If, further, the speeds of light signals agree with the predictions of this paper, then the clock system uses common time. What this paper has shown is that such a common time can be used within the four-dimensional symmetry framework. We may remark, at this stage, that the drawback of the synchronization schemes discussed by Reichenbach and Grünbaum is the absence of a four-dimensional symmetry framework. As a consequence, it does not seem likely that quantum electrodynamics, or other field theories, could be correctly formulated in a simple way with the adoption of such synchronizations and the accompanying transformations derived by Winnie.(11)

APPENDIX. PLANE ELECTROMAGNETIC WAVES AND DOPPLER SHIFTS

As we have seen in Section 3, the law of propagation of light in common relativity is given by

$$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2 = (c')^2\,dt^2 - dx'^2 - dy'^2 - dz'^2 = 0 \tag{A1}$$
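where $c' = d(b't)/dt$. A plane wave of light is described by

$$\exp(ik_\mu x^\mu) = \exp[i(\mathbf{k}\cdot\mathbf{r} - (\omega/c)\,ct)], \quad \exp[ik'_\mu (x')^\mu] = \exp[i(\mathbf{k}'\cdot\mathbf{r}' - (\omega'/c')\,b't)] \tag{A2}$$

in the frames F and F', respectively. The wave four-vectors $k^\mu = (k_x, k_y, k_z, \omega/c)$ and $k'^\mu = (k_x', k_y', k_z', \omega'/c')$ are related by the four-dimensional transformation

$$k_x' = \gamma(k_x - \beta\omega/c), \quad k_y' = k_y, \quad k_z' = k_z, \quad \omega'/c' = \gamma(\omega/c - \beta k_x) \tag{A3}$$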
The four-dimensional transformation law between arbitrary frames applies to this four-vector also. The plane wave satisfies the equations

$$\left(\frac{\partial^2}{\partial\mathbf{r}^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right)\exp(ik_\mu x^\mu) = -\left(k^2 - \frac{\omega^2}{c^2}\right)\exp(ik_\mu x^\mu) = 0 \tag{A4}$$

and

$$\left(\frac{\partial^2}{\partial\mathbf{r}'^2} - \frac{\partial^2}{\partial(b't)^2}\right)\exp[ik'_\mu (x')^\mu] = -\left[(k')^2 - \left(\frac{\omega'}{c'}\right)^2\right]\exp[ik'_\mu (x')^\mu] = 0 \tag{A4b}$$

in the F and F' frames, respectively. We note here that

$$\frac{\partial^2}{\partial\mathbf{r}^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \frac{\partial^2}{\partial\mathbf{r}'^2} - \frac{\partial^2}{\partial(b't)^2}$$

which follows directly from the four-dimensional transformations (8). The transformation equations (A3) are those for the Doppler shift. Since these differ from the corresponding transformations of special relativity, one might naively expect a measurable difference to show up. This is not so, however, and this is a demonstration of the fact that both frameworks describe the same phenomena. The quantities in (A3) are not what are tested in the laboratory. The reason is as follows: Suppose the photon is emitted from an atom at rest in the frame F'. The unshifted quantities k' and $\omega'/c'$ are measured in F'. This cannot be done in the laboratory because the atom (or F' frame) is moving relative to the laboratory, at rest in F, say, with a typical thermal speed $\beta \sim 10^{-6}$. Instead, we use the unshifted quantities $k_0$ and $\omega_0/c$ associated with an atom of the same kind at rest in the laboratory. Thus in the laboratory we compare the shifted $(k, \omega/c)$ with the unshifted $(k_0, \omega_0/c)$. The unshifted wavelengths $\lambda_0$ and $\lambda'$ are the same, and hence

$$\omega'/c' = \omega_0/c \tag{A5}$$

using (A4). It follows, then, that

$$1/\lambda_0 = (1/\lambda)\,\gamma(1 - \beta), \quad \omega_0 = \omega\,\gamma(1 - \beta) \tag{A6}$$

where we have set $k_y = k_z = 0$ for simplicity. These relations are identical to those derived for this experiment within the conventional framework of special relativity.
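The Doppler factor $\gamma(1 - \beta)$ in (A6) can be reproduced numerically. The sketch below assumes that the wave four-vector $(k_x, k_y, k_z, \omega/c)$ transforms in the same way as the coordinate four-vector in Eq. (8), and considers light travelling along +x in F so that $k_x = \omega/c$. The function name and the sample values are illustrative assumptions; a large $\beta$ is used only to make the numbers easy to read.

```python
import math

def transform_wave_vector(kx, ky, kz, w_over_c, beta):
    """Transform (k_x, k_y, k_z, omega/c) from F to F', assuming the same
    four-dimensional law as for coordinates. The last entry is omega'/c'."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return g * (kx - beta * w_over_c), ky, kz, g * (w_over_c - beta * kx)

beta = 0.6                 # hypothetical v/c of the emitting atom's frame F'
w_over_c = 2.0             # omega/c of the shifted wave observed in F
kx = w_over_c              # light along +x in F: |k| = omega/c, k_y = k_z = 0

kx_p, ky_p, kz_p, w_over_c_p = transform_wave_vector(kx, 0.0, 0.0, w_over_c, beta)
doppler_factor = (1.0 - beta) / math.sqrt(1.0 - beta ** 2)   # gamma * (1 - beta)
```

Both `kx_p / kx` and `w_over_c_p / w_over_c` equal `doppler_factor`, and the transformed wave still satisfies $k' = \omega'/c'$, consistent with (A4b) and with the relations (A5) and (A6).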
For the transformations of the "energy-momentum" four-vectors for material particles and photons [related to (A6)] and the transformations of the electromagnetic fields in the present framework, we refer to Ref. 2. From previous discussions, we have seen that the essence of the theory is the symmetry property of a physical law rather than the symmetry property of a physical quantity such as time or the speed of light. In particular, the anisotropy of the speed of light in common relativity does not imply that inertial frames are distinguishable, because this anisotropy is related to the convention of setting up a clock system. We stress that the invariance of physical laws ensures that all inertial frames are experimentally completely indistinguishable.
ACKNOWLEDGMENTS

We would like to acknowledge the many helpful discussions we have had with Prof. C. B. Chiu and Dr. P. B. Eby. The research was accomplished while one of us (JPH) held an NRC Senior Resident Research Associateship.
REFERENCES

1. E. U. Condon and H. Odishaw, in Handbook of Physics (McGraw-Hill, 1967), pp. 6-158-6-165.
2. J. P. Hsu, Found. Phys. 6, 317 (1976); 7, 953 (1977); 8, 371 (1978).
3. C. B. Chiu, J. P. Hsu, and T. N. Sherry, CPT preprint 292.
4. H. Poincaré, Bull. Sci. Math. 28, 306 (1904); The Foundation of Science (The Science Press, New York, 1921), pp. 303-313.
5. H. Reichenbach, The Philosophy of Space and Time (Dover, New York, 1958).
6. A. Grünbaum, Philosophical Problems of Space and Time (Reidel, Boston, 1973).
7. T. M. Kalotas and A. R. Lee, Found. Phys. 8, 603 (1978).
8. J. P. Hsu and T. N. Sherry, Found. Phys. 8, 609 (1978).
9. J. P. Hsu and J. A. Underwood, Found. Phys. 8, 833 (1978).
10. K. Brecher, Phys. Rev. Lett. 39, 1051 (1977).
11. J. A. Winnie, Phil. Sci. 37, 81, 223 (1970).
Lorentz and Poincare Invariance
Questions on Universal Constants and Four-Dimensional Symmetry from a Broad Viewpoint - I.
J. P. Hsu (*)
Space Sciences Laboratory, NASA/Marshall Space Flight Center - Huntsville, Ala. 35812
(received 5 November 1982)
Summary. It is demonstrated that there is a flexibility in clock synchronizations and that the four-dimensional symmetry framework can be viewed broadly. A new viewpoint of the four-dimensional framework is discussed on the basis of a common time for all observers who may be in motion relative to each other. Such a common time can be realized by a special method of clock synchronization and it is not absolute in the Newtonian sense. We suggest that the truly universal constants in physics are $J = 0.35\cdot 10^{-37}$ g cm and $\bar{e} = 1.6\cdot 10^{-20}$ e.m.u. rather than $\hbar$, $e$ (in e.s.u.) and the speed of light, because $J$ and $\bar{e}$ are independent of a special arrangement of the measuring apparatus, such as clock synchronizations.
PACS. 03.30. - Special relativity.
1. - Introduction.
In a previous paper (1), which discussed cosmic matter-antimatter separation during the expansion of the Universe, we noted that it is also possible to construct a four-dimensional space-time framework with a cosmic time defined by the evolution of the Universe as a whole (2). This is interesting because the rate of the expansion of the Universe has a universal meaning to all observers who may be in motion relative to each other. It is well known that the four-dimensional bound-state wave function in the Bethe-Salpeter equation does not have a probability interpretation in the sense of elementary quantum mechanics. (See sect. 5 below.) This difficulty is due to the appearance of «relative time» in bound-state wave functions (3). Both of these problems are related to the relativistic time in the usual four-dimensional framework. Because of these facts, it is both interesting and worthwhile to investigate the four-dimensional symmetry without the usual relativistic time for an analysis of physical phenomena. In other words, we view the four-dimensional symmetry from a new and different viewpoint, namely that there is a common time for all observers: a «common viewpoint».
The common time for all observers makes sense if and only if it can be realized by a synchronization of clock systems. It must be stressed that i) when one discusses synchronizations of clock systems, the four-dimensional symmetry of physical laws must be preserved because there is no preferred inertial frame of reference, and ii) we shall not take over the whole Newtonian concept of absolute time, that is, we do not assert that time is absolute, intrinsic and unique because we do not know how to realize such a time. We take over only the part of the Newtonian concept of time which can actually be realized by clock systems.
The purposes of the present investigation based on this new viewpoint are to reveal what the truly universal constants are and to demonstrate the flexibility in the concept of time. To accomplish this, we first analyze to what extent the universality of $c$, $\hbar$ and $e$ can be taken for granted (4).
(*) Present address: Physics Department, Southeastern Massachusetts University, North Dartmouth, Mass. 02747.
(1) J. P. Hsu: Nuovo Cimento B, 61, 249 (1981); see also Phys. Lett. B, 119, 328 (1982); Lett. Nuovo Cimento, 28, 128 (1980); Phys. Rev. Lett., 42, 934, 1720 (1980). The author should like to thank Prof. S. BERGIA for correspondence concerning the evolution of the Universe from the viewpoint of a cosmic time.
It is demonstrated that different concepts of time (5) can be introduced in the four-dimensional symmetry framework, which sheds some light on the difficulty in the unification of quantum mechanics and four-dimensional symmetry mentioned previously. Furthermore, the result gives us a broad view of the physical world.
To keep physical theory in contact with physical reality, it is desirable to see how a common time for all observers can be physically realized. Suppose there are two sets of infinitely many clocks, one set on the ground ($F$-frame) and one set in a train, placed along a straight line (the $x$-axis). All clocks are identical. The clocks on the ground are synchronized as usual by using a light signal (its one-way speed is defined to be the same as the two-way, back-and-forth, speed in $F$). Consequently, the speed of light emitted from any source measured on the ground is isotropic and constant, $c = 2.99\cdot 10^{8}$ m/s. Now suppose all clocks in the train are co-moving with another frame $F'$. They have a constant velocity $\mathbf{V} = (V, 0, 0)$ as measured on the ground, and their rate of ticking will slow down relative to the clocks on the ground. Any observer can compare the rate and the time of a moving clock and the clock at rest in his frame. Therefore, he can adjust the rate of the clocks in the train ($F'$-frame) so that the clocks in $F'$ tick with the same rate as the $F$ clocks. Moreover, the $F'$ observer can adjust his clock in the train $F'$ to read the same time as the nearby clock on the ground. Let us assume that there is a clock at each point in every inertial frame and there is an observer at each point in every frame, so that both sets of clocks have, by the procedure above, the same rate of ticking and read the same time. Note that, to record the time of an event, the observers use the clocks near the event so that they do not have to worry about the finite speed of light. In this way, observers in different frames have their own clock systems and have a common time. We may remark that actually one need not bother to set up the $F'$ clocks; one simply requires observers in the train ($F'$) to use the clocks on the ground ($F$) to record time.
Why is such a common time not absolute? The reason is as follows: There is no preferred frame and, therefore, the choice of synchronizing first the clocks in the $F$-frame is arbitrary.
(2) The possibility of introducing a cosmic time in cosmology is well known, see, for example, R. ADLER, M. BAZIN and M. SCHIFFER: Introduction to General Relativity (New York, N. Y., 1965), p. 62 and 339.
(3) A. A. LOGUNOV and A. N. TAVKHELIDZE: Nuovo Cimento, 29, 380 (1963); R. N. FAUSTOV: Ann. Phys. (N. Y.), 78, 176 (1973). To avoid the difficulty of the relativistic time in the bound-state wave function, these authors introduced a single-time wave function to describe bound states of many particles.
(4) The fundamental length will not be discussed here because of the lack of a well-established theory. For recent discussion of the possible fundamental length in quantum field theories, see J. P. Hsu: Nuovo Cimento A, 55, 145 (1980); 56, 1 (1980); J. P. Hsu and E. MAC: Nuovo Cimento B, 49, 55 (1979), and references therein.
(5) J. P. Hsu: Found. Phys., 8, 371 (1978); 6, 317 (1976).
If one wishes, one can choose to synchronize the clocks in another frame $F'$ by using light signals so that the speed of light is constant and isotropic in the $F'$-frame. We then require the observers in $F$ to adjust their clocks in such a way that they have the same rate and read the same time as those in the $F'$-frame. Therefore, all observers in $F$ and $F'$ have a common time too. However, this common time is different from the common time discussed previously. Thus we have seen that the common time for all observers is not unique and, therefore, not absolute.
In sect. 2, we discuss the true universality of basic constants by considering a class of measurement processes, based on velocity = distance/time interval, which always yields some number when used by an observer. In sect. 3, we formulate the four-dimensional symmetry framework based on the «common viewpoint» (i.e. a common time for all observers), so that the co-ordinate four-vector, the wave four-vector, etc. are defined. We then discuss other related processes of measuring light speed: the speed of light measured by using the relation $c = \lambda\nu$ and the measurement of the two-way speed of light. In sect. 4, invariant «action functions» for physical laws in the new four-dimensional symmetry framework with the common time are established to discuss universal constants. We demonstrate that the truly universal constants are
$J = 0.35\cdot 10^{-37}$ g cm and $\bar{e} = 1.6\cdot 10^{-20}$ (g cm)$^{1/2}$ (or e.m.u.)
rather than $c$, $\hbar$ and $e$ (in e.s.u.). This can be substantiated on purely theoretical grounds if one accepts the criterion that a truly universal constant should not depend on a special arrangement of the measuring apparatus. In sect. 5, we point out that physics in this new framework and that in special relativity are equivalent as far as one-particle systems and the $S$-matrix in field theories are concerned. The advantages of the new conceptual framework for describing bound-state, many-particle systems and cosmology are briefly discussed.
2. - True universality of basic constants.
Let us discuss the true universality of basic constants such as the electromagnetic coupling strength $\alpha = 1/137$ and the speed of light, etc. True universality of a quantity means that the quantity has the same value when measured in various ways (e.g. measured by observers on the ground and by those in a train) at different places and times. For measurements, standards of «meter», «second» and «gram» must first be given, and the clock system must be set up before one measures velocities, etc. In the inertial frame $F$ (the ground), an atom of some specific element at rest has a definite mass $m_a$ and can emit a radiation with specific wave-length $\lambda_a$. The quantities $m_a$ and $\lambda_a$ can be used as standards of «gram» and «meter» in $F$. In another frame $F'$ (a train or another star), an atom of the same element at rest (in $F'$) has mass $m'_a$ and can emit the same characteristic radiation with a specific wave-length $\lambda'_a$. They can be used as standards of «gram» and «meter» in $F'$. Similarly atomic clocks may be used to define «second» in, say, the $F$-frame. These definitions are quite natural because all inertial frames are assumed to be equivalent (6).
Let us consider a class of measurement processes for the speed of light based on the definition $v = \Delta r/\Delta t$. We concentrate on measurements of speeds of the same light signal by observers in different inertial frames within the framework of four-dimensional symmetry. Suppose there is a grid of clocks on the ground (the $F$-frame) and these clocks are synchronized as discussed previously. Suppose the train is the $F'$-frame which moves with a constant velocity $\mathbf{V} = (V, 0, 0)$ as measured in $F$ such that $x$ and $x'$ are related by (see sect. 3)
(1)  $x' = \gamma(x - \beta c t), \quad \gamma = (1 - \beta^2)^{-1/2}, \quad \beta = V/c.$
(6) For a comprehensive review of the birth of special relativity, see S. BERGIA: in Einstein, A Centenary Volume (Cambridge, Mass., 1979), p. 65.
Evidently, the speed of a light signal measured in $F$ has the unique value $c$ by the definition of time (or by the set-up of the clock system in $F$). We now ask how to measure the speed of light in the train ($F'$-frame) and under what condition the observers in the train will measure the value $c$ for the speed of the same light signal. The standards for «meter» and «second» given previously are not sufficient. One must also consider the clock system (i.e. the time) to be used by the $F'$ observers. Because this is quite subtle conceptually, let us consider this in detail and repeat some well-known results for the sake of comparison. According to Einstein's viewpoint, the observers in $F'$ must have their own time $t'$ which differs from $t$. That is, another grid of clocks must be set up in $F'$ in such a way that the time $t'$ of an $F'$ clock is related to the time $t$ read by the $F$-frame clocks by the relation
(2)  $t' = \gamma(t - \beta x/c),$
if the origins of $F$ and $F'$ coincide at time $t = 0$. Suppose there is a light signal propagating along the $+x$ axis from the position $x = 0$ at time $t = 0$, so that $x/t = c$. It follows from (1) and (2) that the observers in the train will measure
(3)  $c' = x'/t' = c$
for the speed of the same signal. Evidently, if one assumes the universality of light speed (3) and uses light signals to synchronize the clocks in the train, one gets the relativistic time (2). In any case, in order that the speed of light has the same value $c$ for both $F$ and $F'$ frames, one must either use the relativistic clock systems, which by definition read the time in (2), to measure velocity, or, equivalently, one simply assumes the speed of light to be universal and uses light signals to calibrate the $F'$ clocks.
However, the measured speed of light in $F'$ is quite different according to the «common viewpoint» discussed in sect. 1. In this case, the observers in $F'$ have a clock system which reads the same time $t$ as the $F$ clocks. Therefore, the observers in the train will measure
(4)  $c' = \dfrac{x'}{t} = \dfrac{c - V}{(1 - V^2/c^2)^{1/2}}$
for the one-way speed of the same light signal, $x/t = c$. Result (4) is not surprising at all because both the observers in the train and on the ground use the common time to measure speed. Result (4) can be measured and tested experimentally by using the common clock systems. We stress that result (4) based on common time does not imply that result (3) based on the relativistic time is wrong, and vice versa. What we strive to demonstrate is that one needs a very special arrangement
of the measuring apparatus (i.e. the relativistic clock systems) to make light speed appear to be the same to all observers. This is in sharp contrast with the universality of the electromagnetic coupling strength $\alpha = 1/137$, which is a pure number and, therefore, independent of the measuring apparatus and standards of units.
One may ask: Is there a natural concept of time? Is there a right way to measure the speed of light? We note that, before the twentieth century, people would regard common time as the natural concept of time because it resembles the common-sense time used in daily life, and the nonuniversality of light speed in (4) would have been natural and obvious. Yet nowadays physicists regard the relativistic time as the only possible physical time. Any other concept of time such as the common time is probably regarded as unnatural. True, the common time $t' = t$ in the three-dimensional symmetry framework (with the Galilean transformation) leads to incorrect results, so that it was abandoned. However, the situation is quite different within the four-dimensional symmetry framework, as we shall see in sect. 3. Let us cast away all these prejudices of the age and recognize the flexibility in the concept of time. As long as a concept of time is self-consistent and does not contradict experiments, it is viable and should be allowed. In order to develop physics in the future and to understand Nature more broadly, one should keep one's mind open and should be aware of the flexibility of basic concepts.
We stress that the propagation of light with universal speed (3) or with nonuniversal speed (4) is imposed upon our physical theory and is not imposed upon the physical world. In other words, the difference between result (4) and that in special relativity is completely due to different concepts (or definitions) of time. Therefore, it is not possible to say which way to measure the speed of light is the right way because both are consistent with experiments.
This view will be discussed and substantiated below.
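The contrast between results (3) and (4) can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names and the choice $V = 0.6c$ are illustrative assumptions, and the code simply evaluates relations (1) and (2) for a signal with $x = ct$.

```python
import math

C = 2.99792458e8  # speed of light in the F-frame, m/s


def gamma(V, c=C):
    """Factor gamma = (1 - beta^2)^(-1/2) with beta = V/c, as in Eq. (1)."""
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)


def light_speed_relativistic_time(V, c=C, t=1.0):
    """x'/t' for the signal x = c*t, using Eqs. (1) and (2)."""
    x = c * t
    g = gamma(V, c)
    x_prime = g * (x - V * t)            # Eq. (1): beta*c*t = V*t
    t_prime = g * (t - V * x / c**2)     # Eq. (2): beta*x/c = V*x/c^2
    return x_prime / t_prime


def light_speed_common_time(V, c=C, t=1.0):
    """x'/t for the same signal when the F' clocks read the F time t."""
    x = c * t
    x_prime = gamma(V, c) * (x - V * t)  # Eq. (1)
    return x_prime / t


V = 0.6 * C  # illustrative train speed (assumption)
c_rel = light_speed_relativistic_time(V)
c_common = light_speed_common_time(V)
c_eq4 = (C - V) / math.sqrt(1.0 - (V / C) ** 2)  # closed form of Eq. (4)
```

With relativistic clocks the ratio $x'/t'$ reproduces $c$, while with common-time clocks the ratio $x'/t$ reproduces the closed form (4); the same signal and the same length standard are used in both cases, only the clock system differs.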
3. - Four-dimensional symmetry from the «common viewpoint».
Suppose the common-time clock systems in $F$ and $F'$ are set up as discussed previously and the speed of light is uniquely defined in the $F$-frame. Since there is a common time for all inertial frames according to the «common viewpoint», the four-dimensional transformation between inertial frames $F$ and $F'$ is
(5)  $x' = \gamma(x - \beta c t), \quad y' = y, \quad z' = z, \quad b't = \gamma(ct - \beta x),$
where
(6)  $\beta = V/c, \quad \gamma = (1 - \beta^2)^{-1/2},$
and $F'$ moves with a constant velocity $(V, 0, 0)$ as measured in the $F$-frame.
The quantity $b'$ is defined by transformation (5). It is not a constant and is not to be identified with the speed of light in the $F'$-frame. The meaning of $b'$ will be clarified later by finding out its relation to the speed of light in $F'$. From the «common viewpoint», the four-dimensional co-ordinate vectors associated with $F$ and $F'$ are, by definition,
(7)  $(x_0, x, y, z)$ and $(x'_0, x', y', z'),$
respectively, where $x_0 = ct$ and $x'_0 = b't$. Any event can be specified by a four-vector $(ct, x, y, z)$ as observed in $F$. The corresponding four-vector in $F'$ can be calculated from the four-dimensional transformation (5). The infinitesimal interval between two events in the four-dimensional flat space is defined as
(8)  $ds^2 = dx_0^2 - dx^2 - dy^2 - dz^2 = c^2dt^2 - d\mathbf{r}^2.$
If one starts from the definition of four-coordinates for an event in (7) and requires the interval (8) to be invariant under a four-dimensional transformation,
(9)  $ds^2 = ds'^2 = dx_0'^2 - d\mathbf{r}'^2 = [d(b't)]^2 - d\mathbf{r}'^2,$
one can derive transformation (5). On the other hand, if one starts with (5) which defines $b'$, then (9) follows trivially. Mathematically, $ds$ can be used as an evolution variable to define a velocity, e.g. $dx^\mu/ds$, but it has no direct experimental meaning according to the «common viewpoint». The propagation of a light signal is described as usual by the law
(10)  $ds = 0,$
which is invariant under transformation (5). This indicates that $b'$ is related to the speed of light $c'$ by the relation
(11)  $d(b't)/dt = c'$
because $ds^2 = 0$ can be written as
(12)  $ds^2 = [d(b't)]^2 - d\mathbf{r}'^2 = c'^2dt^2 - d\mathbf{r}'^2 = 0.$
In this particular case, $ds = 0$, $c'$ in (11) is a one-way speed of light and relation (11) can be derived from (5). The transformation of velocities, measured by using the common time, is
(13)  $\dot{x}' = \gamma(\dot{x} - \beta c), \quad \dot{y}' = \dot{y}, \quad \dot{z}' = \dot{z}, \quad c' = \gamma(c - \beta\dot{x}),$
      $\dot{x}' = dx'/dt, \quad \dot{x} = dx/dt, \quad c' = d(b't)/dt, \quad$ etc.
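As a numerical illustration of transformation (5) and the velocity transformation (13), the sketch below checks that the interval is preserved and that a light signal keeps $ds = 0$ in $F'$. It is only an illustrative check under stated assumptions: the function names, the units with $c = 1$ and the sample event are choices made here, not part of the original text.

```python
import math


def common_time_transform(t, x, V, c=1.0):
    """Eq. (5): return (x', b't) for an event (t, x) in F; y and z are unchanged."""
    beta = V / c
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (x - beta * c * t), g * (c * t - beta * x)


def velocity_transform(xdot, V, c=1.0):
    """Eq. (13): return (dx'/dt, c') for a particle with dx/dt = xdot in F."""
    beta = V / c
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (xdot - beta * c), g * (c - beta * xdot)


c, V = 1.0, 0.5          # units with c = 1 in F (assumption)
t, x = 2.0, 0.7          # an arbitrary sample event in F

x_p, bt_p = common_time_transform(t, x, V, c)
interval_F = (c * t) ** 2 - x**2      # (ct)^2 - x^2
interval_Fp = bt_p**2 - x_p**2        # (b't)^2 - x'^2

# A light signal moving along +x in F has xdot = c.
xdot_p, c_p = velocity_transform(c, V, c)
```

For the light signal the two returned values coincide, so $c'^2dt^2 - dx'^2 = 0$ as in (12), and the common value equals the one-way speed in (4).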
Note that $c'$ is a function of $\dot{x}$ in general. For the special case $\dot{x} = +c$ (or $ds = 0$), we obtain result (4) which shows nonuniversality of light speed. Nevertheless, the law for the propagation of light has the four-dimensional form and is invariant under transformation (5), as one can see from (10) and (12).
Within the present framework with the common time, simultaneity of two distant events has a «universal» meaning, independent of positions. To illustrate this point, let us consider light signals starting out at time $t = 0$ from some point $M$ on the $x$-axis (which coincides with the point $M'$ on the $x'$-axis of $F'$ at $t = 0$) in two opposite directions. Since the velocity of light is equal to $c$ in both directions in $F$, the signals $\alpha_A$ and $\alpha_B$ will, respectively, reach points $A$ and $B$, equidistant from $M$, at the same time $T$. Now, according to the concept of common time, the same two events (i.e. arrival of the signals at $A$ and $B$ on the $x$-axis or $A'$ and $B'$, respectively, on the $x'$-axis) are also simultaneous to the observers in $F'$, reading the $F'$ clocks near $A$ (or $A'$) and $B$ (or $B'$). Since the point $M'$ is moving with $V$ toward the point $B$, the distance $M'B'$ is shorter than the distance $M'A'$. The observers in $F'$ measure the speeds of the light signals to be
(14)  $c'(\alpha_A) = \dfrac{M'A'}{T'} = \dfrac{c + V}{(1 - V^2/c^2)^{1/2}}$  for signal $\alpha_A$,
(15)  $c'(\alpha_B) = \dfrac{M'B'}{T'} = \dfrac{c - V}{(1 - V^2/c^2)^{1/2}}$  for signal $\alpha_B$,
where $T' = T$, $x_A/T = MA/T = c$ and $x_B/T = MB/T = c$. This example shows clearly the difference between the concepts in this framework and those in special relativity. Furthermore, although the picture of the world from the «common viewpoint» is quite different from that of the Einsteinian viewpoint, both viewpoints are self-consistent.
Now let us consider the second class of measurements of the light speed, i.e. the two-way speed of light (7), defined as the total back-and-forth distance divided by the total time interval. Suppose the light signal starts from $\mathbf{r}'_0$ in $F'$, travels along the $+x'$-axis (or $-x'$-axis) a distance $L'$ and is reflected back to $\mathbf{r}'_0$. Evidently, one measures
(16)  $c_{2\text{-way}} = c$
in the $F$-frame. However, the two-way speed of this signal as measured by observers in $F'$ is
(17)  $c'_{2\text{-way}} = c(1 - V^2/c^2)^{1/2},$
(7) W. F. EDWARDS: Am. J. Phys., 31, 482 (1963); J. A. WINNIE: Philos. Sci., 37, 81, 223 (1970); R. MANSOURI and R. U. SEXL: Gen. Rel. Grav., 8, 497 (1977); J. P. Hsu and T. N. SHERRY: Found. Phys., 10, 57 (1980); ZHANG YUAN-ZHONG: Experimental Foundations of Special Relativity, Chapt. 1 (Beijing, 1979).
independent of the directions (i.e. $\pm x'$-axis). It can be easily shown that results (17) and (16) hold for a light signal traveling (a distance $L'$) along the $y'$-axis: $2L'/2t = (d^2 - V^2t^2)^{1/2}/t = c(1 - V^2/c^2)^{1/2}$ and $2d/2t = c$ in $F'$ and $F$, respectively. Thus the two-way speed is isotropic in both $F$ and $F'$ frames. This explains the null result in the Michelson experiments (carried out in $F$ and $F'$) and others. It is important to note from (13) and (17) that $c'_{2\text{-way}} = c'(\dot{x})$ with $\dot{x} = V$. The reason is that this measurement of the two-way speed of light corresponds to two events with $dx' = 0$ (or $dx'/dt = 0$). Thus the quantity $c'$ in (13) can be either the one-way speed or the two-way speed, depending on $\dot{x}$.
It is also possible to measure the speed of light by a third class of measurement process, namely by measuring the wave-length $\lambda$ and the frequency $\nu$. These quantities are related by the relation $c = \lambda\nu$, which can be written in the four-dimensional form by using the electromagnetic-wave four-vector $k_\mu = (\omega/c, \mathbf{k})$ in $F$:
(18)  $(\omega/c)^2 - \mathbf{k}^2 = 0, \quad \omega = 2\pi\nu, \quad |\mathbf{k}| = 2\pi/\lambda.$
Similarly, in the $F'$-frame we have $c' = \lambda'\nu'$, i.e.
(19)  $(\omega'/c')^2 - \mathbf{k}'^2 = 0,$
where $c'$ is the one-way speed of light in $F'$. A plane wave is described by
$\varphi = \exp[ik_\mu x^\mu]$ and $\varphi' = \exp[ik'_\mu x'^\mu]$
in $F$ and $F'$, respectively (8). Note that, if one uses the relativistic time $t'_{\rm rel}$ in $F'$, this plane wave in $F'$ is described by $\varphi'_{\rm rel} = \exp[i(\omega'_{\rm rel}/c)\,ct'_{\rm rel} - i\mathbf{k}'\cdot\mathbf{r}']$, where $\omega'_{\rm rel}$ is measured in terms of $t'_{\rm rel}$. The frequencies $\omega'$ and $\omega'_{\rm rel}$ are related by $\omega'/c' =$
measured under different conditions, or different clock systems at different places in space and time, while relativistic invariance refers to the property of a quantity or a law under Lorentz transformation (in which relativistic clock systems must be used).
B) Once ether theory and «objective» Lorentz contraction have been discarded, Michelson's experiment can only be interpreted in terms of isotropy of light speed in «moving» frames. It is trivial to say that this check of isotropy does not involve direct time measurements and, therefore, does not depend on any specific way of synchronizing clocks. It is perhaps less trivial to show explicitly, by using an alternative synchronization, that only isotropy of the «two-way» speed of light, e.g. (17), is required.
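The isotropic two-way speed (17) can be recovered from the two anisotropic one-way speeds (14) and (15) by timing a round trip with the common time. The sketch below is only an illustration under stated assumptions: the function names, the arm length $L' = 1$, the units with $c = 1$ and the value $V = 0.6$ are chosen here for convenience.

```python
import math


def one_way_speeds(V, c=1.0):
    """Magnitudes of the one-way light speeds in F' along +x' and -x', Eqs. (15) and (14)."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (c - V), g * (c + V)


def two_way_speed(V, L=1.0, c=1.0):
    """Round-trip distance 2L divided by the total common time for out and back."""
    forward, backward = one_way_speeds(V, c)
    total_time = L / forward + L / backward
    return 2.0 * L / total_time


V = 0.6                      # illustrative speed of F' in units of c (assumption)
c_two_way = two_way_speed(V)
c_eq17 = math.sqrt(1.0 - V**2)   # Eq. (17) with c = 1
```

The round-trip value does not depend on which direction is traversed first, which is the sense in which only the two-way speed needs to be isotropic to account for a null Michelson result.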
4. - Truly universal constants.
We have seen that the speed of light is no longer a universal constant in the present framework. What are the universal constants? This question can only be answered by the laws of physics. Let us consider «action functions» which lead to physical laws. The usual action $S = -\int mc\,ds - \int (e/c)A_\mu\,dx^\mu$ is not suitable now because it is not invariant under the four-dimensional transformation (5). However, we have a new invariant «action» for a charged particle with rest mass $m$ moving in the electromagnetic four-potential field:
(20)  $S = -\int m\,ds - \bar{e}\int a_\mu\,dx^\mu = -\int\left[mc(1 - v^2/c^2)^{1/2} + \bar{e}a_\mu v^\mu\right]dt,$  in $F$,
      $S = -\int m\,ds' - \bar{e}\int a'_\mu\,dx'^\mu = -\int\left[mc'(1 - v'^2/c'^2)^{1/2} + \bar{e}a'_\mu v'^\mu\right]dt,$  in $F'$,
      $ds^2 = c^2dt^2 - d\mathbf{r}^2, \quad v = |d\mathbf{r}/dt|, \quad v^\mu = dx^\mu/dt,$
      $ds'^2 = [d(b't)]^2 - d\mathbf{r}'^2 = c'^2dt^2 - d\mathbf{r}'^2, \quad c' = d(b't)/dt, \quad v' = |d\mathbf{r}'/dt|,$
where $m$ and $\bar{e} = e$ (in e.s.u.)$/c$ are constants and $a_\mu\,(= A_\mu/c)$ is defined as a four-vector. Since $S = \int\mathcal{L}\,dt$, the Lagrangian $\mathcal{L}$ takes the form
(21)  $\mathcal{L} = -mc'(1 - v'^2/c'^2)^{1/2} + \bar{e}\,\mathbf{a}'\cdot\mathbf{v}' - \bar{e}a'_0c',$
which is an invariant. We have
(22)  $\mathbf{Q}' = \dfrac{\partial\mathcal{L}}{\partial\mathbf{v}'} = \dfrac{m\mathbf{v}'/c'}{(1 - v'^2/c'^2)^{1/2}} + \bar{e}\mathbf{a}' = \mathbf{q}' + \bar{e}\mathbf{a}',$
      $H' = \mathbf{v}'\cdot\dfrac{\partial\mathcal{L}}{\partial\mathbf{v}'} - \mathcal{L} = \dfrac{mc'}{(1 - v'^2/c'^2)^{1/2}} + \bar{e}a'_0c' = c'(q'_0 + \bar{e}a'_0),$
in $F'$. We see that $\mathbf{Q}'$ and $Q'_0 = H'/c'$ form a four-vector $Q'^\mu = (Q'_0, \mathbf{Q}')$ and that
(23)  $(Q'_0 - \bar{e}a'_0)^2 - (\mathbf{Q}' - \bar{e}\mathbf{a}')^2 = q_0'^2 - \mathbf{q}'^2 = m^2.$
In the $F$-frame, we have the four-vector $q^\mu$, corresponding to $q'^\mu$:
(24)  $q^k = m(v^k/c)/(1 - v^2/c^2)^{1/2}, \quad q^0 = m/(1 - v^2/c^2)^{1/2}.$
The «momentum four-vectors» $q'^\mu$ and $q^\mu$ in (22) and (24) should be related to the wave four-vectors $k'^\mu$ and $k^\mu$ for the particles in $F'$ and $F$, respectively:
(25)  $q^\mu = Jk^\mu, \quad q'^\mu = Jk'^\mu, \quad \mu, \nu = 0, 1, 2, 3,$
      $k^\mu = (k_0, \mathbf{k}), \quad k'^\mu = (k'_0, \mathbf{k}').$
Because $q_\mu$ is related to the usual momentum $p_\mu$ in $F$ by the relation $q_\mu = p_\mu/c$, $J$ must be a universal constant and its value must be related to $\hbar/c$ determined in special relativity in the $F$-frame. We may remark that from the common viewpoint $\hbar$ ($= J$ times the speed of light) is just a composite constant and is not universal.
If one examines the «action function» in quantum electrodynamics, one sees that the only meaningful form is
(26)  $S_{\rm QED} = \int\left[\bar{\psi}\gamma^\mu(iJ\partial_\mu - \bar{e}a_\mu)\psi - m\bar{\psi}\psi - \dfrac{1}{16\pi}f_{\mu\nu}f^{\mu\nu}\right]d^4x,$
where $d^4x = d^3r\,dx^0$, $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$, $\partial_\mu = \partial/\partial x^\mu$ and $S_{\rm QED}$ has the same dimension as $S$ and $J$ (i.e. mass times length). Note that, if one does not specify a particular frame, one can write $x^\mu = (x^0, \mathbf{r}) = (bt, \mathbf{r})$, where $b$ is a variable in general. When one chooses a particular frame $F$ (or $F'$), one has $x^\mu = (ct, \mathbf{r})$ (or $x'^\mu = (b't, \mathbf{r}')$) (9). Thus the present four-dimensional symmetry implies that $J$ and $\bar{e}$ must be universal constants and are, respectively, equal to $\hbar/c$ and $e$ (in e.s.u.)$/c$:
(27)  $J = 0.351\,772\,93\cdot 10^{-37}$ g cm, $\quad \bar{e} = 1.6021891\cdot 10^{-20}$ (g cm)$^{1/2}$ (or e.m.u.).
The electromagnetic coupling strength $\alpha = \bar{e}^2/J = 1/137.035\,982$ remains the same as the usual value $e^2/\hbar c$.
(9) The present four-dimensional symmetry framework can be used to formulate quantum electrodynamics and other field theories, such as those in J. P. Hsu: Phys. Rev. Lett., 36, 646 (1976); 42, 934 (1979); Phys. Rev. D, 5, 981 (1972); Found. Phys., 8, 371 (1978); 6, 317 (1976).
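The numerical relations among $J$, $\bar{e}$, $\hbar$, $e$ and $\alpha$ quoted in (27) can be reproduced with a few lines of arithmetic. This is a minimal sketch: the values of $J$ and $\bar{e}$ are taken from (27), the CGS value of $c$ in $F$ is the standard one, and the variable names are chosen here for illustration.

```python
# CGS (Gaussian) units throughout.
C_CGS = 2.99792458e10   # speed of light in the F-frame, cm/s
J = 0.35177293e-37      # g cm, from Eq. (27)
E_BAR = 1.6021891e-20   # (g cm)^(1/2) or e.m.u., from Eq. (27)

# Coupling strength from the two constants of Eq. (27) alone.
alpha = E_BAR**2 / J

# Composite constants as they appear in the F-frame.
hbar = J * C_CGS        # erg s
e_esu = E_BAR * C_CGS   # e.s.u.

# The usual expression e^2 / (hbar c) gives the same pure number.
alpha_usual = e_esu**2 / (hbar * C_CGS)
inverse_alpha = 1.0 / alpha
```

The factors of $c$ cancel identically in $e^2/\hbar c$, which is why $\alpha$ can be written in terms of $J$ and $\bar{e}$ without any reference to the speed of light.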
We suggest that the truly universal constants in physics are $J$ and $\bar{e}$ rather than $\hbar$, $e$ (in e.s.u.) and the speed of light. The reason is not that the «common viewpoint» is better than or superior to the Einsteinian viewpoint. Rather, the reasons are as follows: A truly universal constant should not depend on a special arrangement of the apparatus used by observers in different frames of reference. So far, all experiments indicate that the basic laws of physics must have four-dimensional symmetry. The four-dimensional symmetry from the Einsteinian viewpoint implies that $\hbar$, $e$ and the speed of light (and their combinations $J = \hbar/c$, $\bar{e} = e/c$, $\alpha = e^2/\hbar c$, etc.) are universal constants. Yet the four-dimensional symmetry from the «common viewpoint» implies that $J$ and $\bar{e}$ (and their combinations $\alpha = \bar{e}^2/J$, etc.) are universal constants and that the speed of light is not universal. Despite the radical differences between the two viewpoints in the concept of time and in the picture of the world, the constants $J$ and $\bar{e}$ are universal, no matter whether observers in different frames use the relativistic clock systems or the common-time clock systems to record time and to measure velocities. However, this is not so for the Planck constant $\hbar$, the charge in e.s.u. and the speed of light. (Here the charge in e.s.u. is understood as the charge in e.m.u. times the speed of light.) In the appendix, we show that $J$ and $\bar{e}$ are universal even if one uses other clock systems to describe Nature within the flat four-dimensional-symmetry framework.
Let us briefly consider the role of $\varepsilon_0$ and $\mu_0$ in electromagnetism within the present four-dimensional framework. The four-dimensional transformation property of a physical quantity can be most easily seen from the «action» function for the physical system under consideration. The Gaussian system is used for the usual action
$S = -\int mc\,ds - \int (e/c)A_\mu\,dx^\mu - (1/16\pi)\int F_{\mu\nu}F^{\mu\nu}\,d^4x,$
where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$. As far as the system of units is concerned, the new «actions» (20) and (26) correspond to the usual action $S$, so that in our system of units $\varepsilon_0$ and $\mu_0$ are dimensionless. (In our system of units, the charge is $\bar{e}$ rather than $e$.) Moreover, electric and magnetic fields have the same dimension, and one has a simple covariant relation between the electromagnetic four-potential $a_\mu$ and the field tensor $f_{\mu\nu}$ (i.e. the electric and the magnetic fields). Otherwise, it is rather inconvenient to write down covariant laws of quantum electrodynamics (10). The choice of the system of units is merely a matter of convenience. Of course, if one works in a particular frame, say, $F$ (in which the speed of light is isotropic and uniquely defined), one can use any system of units without having this inconvenience. For details, we refer to ref. (10).
(10) The Heaviside and Gaussian systems are more suitable for microscopic problems involving the electrodynamics of individual charged particles. See, for example, J. D. JACKSON: Classical Electrodynamics (New York, N. Y., 1966), appendix p. 611. For covariant formulations in the MKS system of units, see W. K. H. PANOFSKY and M. PHILLIPS: Classical Electricity and Magnetism (Reading, Mass., 1962), p. 437.
5. - Remarks and conclusions.
Is there a way to distinguish between the four-dimensional framework with the common time and that with the relativistic time? According to the set-up of the common-time clock systems, the speed of light has, by definition, the unique value $c$ in the $F$-frame. The «action functions» (20) and (26) lead to the result that the physical laws in the $F$-frame within the present framework are identical with those in the framework of special relativity. Since all inertial frames are equivalent, the laws of physics in another frame (e.g. the $F'$-frame) can be obtained from the laws in $F$ by four-dimensional transformations. This is true in both frameworks because a physical law can be expressed in a form which displays four-dimensional symmetry and has the same form in any inertial frame. Therefore, the physics related to the «actions» (20) and (26) in this framework is the same as that in special relativity. In this sense, no experiment related to the laws which can be derived from the invariant actions (20) and (26) can distinguish these two frameworks and rule out one of them.
What we have done in this paper is to propose a Newtonian-type viewpoint from which to view the four-dimensional physical world and to suggest truly universal constants. This new viewpoint enables us to view the four-dimensional symmetry broadly. It reveals that there is an intimate relation between the concept of time and the set-up of clock systems and that the concept of time is not unique. It also reveals that the truly universal basic constants are $J$ and $\bar{e}$ rather than $\hbar$, $e$ and the speed of light. It is interesting to note that DIRAC has expressed the belief that, in the physics of the future, $\hbar$, $\alpha$ and the speed of light would not all be fundamental constants and that only two of them are fundamental (11). Based on the previous discussions, it is likely that $J$ and $\bar{e}$ are the two fundamental and universal constants in the physics of the future.
The presence of the scalar common time in this framework leads to a new invariant quantity $G = q_0/c = q'_0/c'$. This can be derived by using (13) with $(\dot{x}, \dot{y}, \dot{z}) = (v_1, v_2, v_3)$ and the transformations for $q^\mu$ and $q'^\mu$ in (24). We obtain
$\dfrac{q'_0}{c'} = \dfrac{q_0}{c}\,\dfrac{1 - \beta q_1/q_0}{1 - \beta v_1/c} = \dfrac{q_0}{c} = G,$
(11) P. A. M. DIRAC: Sci. Am., 208, 48 (1963).
Chap. 8. Common Relativity ... 431 where we have used (24). This invariant quantity G may be called «genergy » because G of a particle is equivalent to t h e conserved quantity «energy » q0 in the frame F (in which the speed of light is isotropic and uniquely denned). We stress that G is a function of t h e momentum q because of the relation <Mo— qq = m2 for a particle. Since the ratio ^0/(da;0/ds) is a constant, one does not have an invariant quantity like G in special relativity. Finally, we would like to remark on possible differences between this conceptual framework and the usual framework. We believe that this new viewpoint has some advantages because it sheds light on the difficulty in unifying quantum mechanics and four-dimensional symmetry, mentioned in t h e beginning. To wit, in the many-time formalism of Block and the Bethe-Salpeter equation, t h e wave functions are manifestly four-dimensional and describe the probability amplitude for finding particle 1 at position r x and time tlt particle 2 at position r.2 and time t2 for a two-particle system (12). This does not make sense from the viewpoint of observation, because what appears to our consciousness is really the physical world at a certain time. However, in the Schrodinger equation and the Tamm-Dancoff formalism, the three-dimensional wave functions describe completely (in the sense of quantum mechanics) the physical system at a certain time. This appears to indicate that the probability concept in quantum mechanics and the concept of the relativistic time in the four-dimensional symmetry framework are incompatible in general. On the other hand, within t h e present conceptual framework, the common time is the only time in the four-dimensional wave function 0 describing t h e manyparticle system aad will enable us to interpret 0 as t h e probability amplitude at a certain time. 
This resembles the Schrödinger equation and the Tamm-Dancoff formalism and distinctly differs from that in special relativity. Furthermore, the common time for all observers has advantages in cosmology and in statistical mechanics, because it enables us to describe the evolution of the many-particle systems by one single time (13). These subjects are interesting and important in their own right and shall be discussed in separate papers.

***

To a wonderful teacher, Dr. T. Y. WU. Part of the work was accomplished while the author held an NRC Senior Resident Research Associateship at NASA-MSFC.
(12) G. WENTZEL: Quantum Theory of Fields (New York, N. Y., 1949), p. 138; F. J. DYSON: Phys. Rev., 91, 1543 (1953); KUAN TU-NAN, ZHU HSI-QUEN, HO TSO-XIU, QING CHENG-KUI and CHAO WEI-QIN: Proceedings of the 1980 Guangzhou Conference on Theoretical Particle Physics, Vol. 2 (Beijing, 1980), p. 1390.
(13) See, for example, R. HAKIM: J. Math. Phys. (N. Y.), 8, 1315 (1967); J. P. HSU and T. Y. SHI: Phys. Rev. D, 26, 2745 (1982).
432 Lorentz and Poincare Invariance APPENDIX
True universality of J and e.

We shall show that, within the flat four-dimensional framework, the constants J and e are truly universal in the sense that they are universal constants, no matter whether one uses the common-clock system, the relativistic-clock system or other clock systems to record time and to describe Nature. In general, the flat four-dimensional framework admits the following four-dimensional transformation between inertial frames F and F':
$$(A.1)\qquad x' = \gamma(x - \beta x_0), \quad y' = y, \quad z' = z, \quad x'_0 = \gamma(x_0 - \beta x),$$
where
$$x_0 = ct, \quad x'_0 = b't', \quad \beta = V/c, \quad \gamma = (1 - \beta^2)^{-1/2}.$$
We stress that here $x'_0$ is defined as $b't'$, which differs from that in eqs. (5) and (6) (where $x'_0 = b't$). This transformation can be derived from the invariance of the interval $ds^2 = c^2 dt^2 - d\mathbf{r}^2 = [d(b't')]^2 - d\mathbf{r}'^2$. Note that $x_0 = ct$ means that the speed of light is uniquely defined in the F-frame by synchronization of clocks in F, as discussed in the introduction. The relation $x'_0 = b't'$ in (A.1) indicates that, in general, one has the freedom of choosing a relation between $t'$ and $t$ (i.e. choosing a particular clock system for the F'-frame). For example, if one chooses $t' = t$, we have a particular four-dimensional symmetry framework with a common time, as discussed previously. If one chooses $t' = \gamma(t - \beta x/c)$, one can show that $b' = c$ and one has the Lorentz transformation. These are two simple cases, which can be realized by using «common time» and «relativistic» clock systems. However, one can also choose a relation (A.2)
$t' = f_1(\beta^2)\,t$,
or a more complicated relation (A.3)
$t' = f_2(x, t)$,
where $f_1$ and $f_2$ are some functions. This kind of time $t'$ for the F'-frame can be realized by setting up clocks in F' according to (A.2) or (A.3). Once a specific function is chosen, one can derive results in the same way as discussed in the paper. In this way, physical laws in terms of variables in (A.1), etc. have the four-dimensional symmetry form. This shows the flexibility in the set-ups of clocks in different inertial frames. Examining these cases, one can see that only the choice $t' = \gamma(t - \beta x/c)$ leads to universal light speed. All other choices lead to nonuniversal light speed. Therefore, when one uses other clock systems (corresponding to (A.2) and (A.3)) to describe Nature, the invariant action function for a physical system should not involve light speed explicitly because it is not universal. The «action» functions (20) and (26) with $x'_0 = b't'$ can be shown to be invariant under the general transformation (A.1). This implies that the constants J and e are truly universal in the sense that their values are independent of the clock systems used in the flat four-dimensional symmetry framework.
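The dependence of the light speed on the clock choice can be illustrated with a minimal numerical sketch. It assumes the transformation (A.1) for x', a light signal x = ±ct emitted from the origin of F, and compares the two simple clock systems named above; the function names and the string labels "common" and "relativistic" are illustrative choices, not notation from the paper.

```python
import math


def lorentz_factor(beta):
    """gamma = (1 - beta^2)^(-1/2), as defined under (A.1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def one_way_light_speed(direction, beta, clock, c=1.0):
    """Return dx'/dt' in F' for a light signal x = direction * c * t in F.

    direction: +1 or -1 (signal along +x or -x in F).
    clock: "common" uses t' = t; "relativistic" uses t' = gamma (t - beta x / c).
    """
    g = lorentz_factor(beta)
    t = 1.0                           # any t > 0 works; the signal starts at the origin
    x = direction * c * t
    x_prime = g * (x - beta * c * t)  # x' = gamma (x - beta x_0), with x_0 = c t
    if clock == "common":
        t_prime = t
    elif clock == "relativistic":
        t_prime = g * (t - beta * x / c)
    else:
        raise ValueError("clock must be 'common' or 'relativistic'")
    return x_prime / t_prime


if __name__ == "__main__":
    for clock in ("common", "relativistic"):
        forward = one_way_light_speed(+1, 0.6, clock)
        backward = one_way_light_speed(-1, 0.6, clock)
        print(clock, forward, backward)
```

With the common-time clocks the forward and backward speeds differ, while the relativistic clocks return the same magnitude c in both directions.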
NEWS AND VIEWS (Extract) (On Common Time and Common Relativity)
Nature Editorial
Those who from time to time send to Nature, as manuscripts intended for publication, proofs that the special theory of relativity must be wrong will in future, at least in the first instance, be referred to the issue of Il Nuovo Cimento (74B, 67, 1983) in which Dr. J. P. Hsu from the National Aeronautics and Space Administration's Goddard Space Flight Center [2] works out some of the consequences of rejecting the notion that the velocity of light is the same in all frames of reference moving relative to each other (or, more strictly, in all inertial frames). For although Dr. Hsu's professed objective is to demonstrate that some 'fundamental' constants are more fundamental than others, he embarks on his argument by throwing away the assumption that the velocity of light is constant (and isotropic) for the sake of a system for measuring time which satisfies one of the rudimentary goals of the anti-relativists: doing away with problems of simultaneity. In passing, and inevitably, he shows that the game cannot be worth the candle.

Hsu's starting point, explicitly and emphatically not that of an anti-relativist, is that it should be philosophically permissible to require a system for the measurement of what he calls "common time" that should be valid in all relatively moving frames of reference. The procedure is straightforward, and follows Einstein's original discussion of the problem. Within one frame of reference, a system of identical clocks ticking at what is supposed to be the same rate can easily be synchronized by means of light signals exchanged between observers attending the various instruments. Similarly, a population of clocks in a relatively moving frame of reference can be synchronized among themselves, and then there is nothing to prevent those in charge of the second system from adjusting the rate at which each in the set of moving clocks ticks so that the two systems always tell the same time. And so on, to other systems moving with different relative velocity. So there is no practical obstacle to the construction of a system of common time in which people's commonsense expectations will not be offended. The other consequences, however, will be less palatable.
For the purposes of argument, it is sufficient to consider two linear frames of reference S and S', with the second moving relative to the first with velocity v. In standard special relativity, the transformation from one system to another is easily accomplished by the relations $x' = \gamma(x - vt)$ and $ct' = \gamma(ct - \beta x)$, where x, x' and t, t' are respectively the coordinates and times in the two frames, where c is the velocity of light, $\beta$ is the ratio $v/c$ and $\gamma$ is the familiar $(1 - v^2/c^2)^{-1/2}$. Implicit in these relations is the assumption that the velocity of light is c in both frames of reference. If, however, it is the time that remains unchanged, the standard equation for the transformation of time must be changed into a form that Hsu writes as $b't = \gamma(ct - \beta x)$, where b' is neither the velocity of light in the second frame nor even, for that matter, a constant. But it does turn out that the velocity of light in the second frame is given by $c' = \gamma(c - v)$, from which it is in no sense surprising that, on the assumption that the velocity of light is isotropic (the same in both directions) in the first frame, it turns out to be anisotropic in the second. Logically, this is a serious stumbling block for anti-relativists who may have chosen to follow Hsu thus far, for if it is accepted that either frame can be chosen as a starting point for a system of universally synchronous clocks, the notion that the velocity of light in all other frames should be anisotropic is much more absurd than the usual anti-relativists' complaint about the twin "paradox".
As it happens, Hsu does show that the anisotropy of the velocity of light in moving frames of reference should not affect measurements of the velocity of light as obtained from a round-trip measurement as in the Michelson-Morley experiment, whose null result is therefore not an absolute bar to attempts to construct a system for measurement of time in which the difficulties of simultaneity simply melt away. [3]
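The round-trip statement can be checked with a short sketch under stated assumptions: common time t' = t, the spatial transformation x' = γ(x − vt), y' = y, and an arm of length L measured in the moving frame's own coordinates. The one-way speeds along the direction of motion are then γ(c − v) and γ(c + v); a transverse signal must keep x' = 0, so its speed is √(c² − v²) each way. The function names are illustrative.

```python
import math


def lorentz_factor(beta):
    """gamma = (1 - v^2/c^2)^(-1/2) with beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def round_trip_time_longitudinal(arm_length, beta, c=1.0):
    """Out-and-back time along the direction of motion, in common time."""
    g = lorentz_factor(beta)
    v = beta * c
    out_speed = g * (c - v)    # anisotropic one-way speed, outbound
    back_speed = g * (c + v)   # anisotropic one-way speed, return
    return arm_length / out_speed + arm_length / back_speed


def round_trip_time_transverse(arm_length, beta, c=1.0):
    """Out-and-back time along an arm perpendicular to the motion."""
    v = beta * c
    transverse_speed = math.sqrt(c * c - v * v)  # same in both directions
    return 2.0 * arm_length / transverse_speed


if __name__ == "__main__":
    for beta in (0.1, 0.6, 0.9):
        print(beta,
              round_trip_time_longitudinal(1.0, beta),
              round_trip_time_transverse(1.0, beta))
```

Both arms give the same round-trip time 2γL/c, so an interferometer comparing them sees no fringe shift even though the one-way speeds differ.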
REFERENCES
[1] Nature Editorial, NATURE 303, 129 (1983).
[2] Goddard Space Flight Center should read Marshall Space Flight Center.
[3] Interested readers can look up the rest of the editorial in ref. 1.
Chapter 9
The Aether and Relativistic Quantum Fields
P. A. M. Dirac (1951), T. D. Lee (1981), J. D. Bjorken (1988)
November 24, 1951
NATURE
LETTERS TO THE EDITORS

The Editors do not hold themselves responsible for opinions expressed by their correspondents. No notice is taken of anonymous communications.

Is there an Æther?

In the last century, the idea of a universal and all-pervading aether was popular as a foundation on which to build the theory of electromagnetic phenomena. The situation was profoundly influenced in 1905 by Einstein's discovery of the principle of relativity, leading to the requirement of a four-dimensional formulation of all natural laws. It was soon found that the existence of an aether could not be fitted in with relativity, and since relativity was well established, the aether was abandoned.

Physical knowledge has advanced very much since 1905, notably by the arrival of quantum mechanics, and the situation has again changed. If one re-examines the question in the light of present-day knowledge, one finds that the aether is no longer ruled out by relativity, and good reasons can now be advanced for postulating an aether.

Let us consider in its simplest form the old argument for showing that the existence of an aether is incompatible with relativity. Take a region of space-time which is a perfect vacuum, that is, there is no matter in it and also no fields. According to the principle of relativity, this region must be isotropic in the Lorentz sense: all directions within the light-cone must be equivalent to one another. According to the aether hypothesis, at each point in the region there must be an aether, moving with some velocity, presumably less than the velocity of light. This velocity provides a preferred direction within the light-cone in space-time, which direction should show itself up in suitable experiments. Thus we get a contradiction with the relativistic requirement that all directions within the light-cone are equivalent.
This argument is unassailable from the 1905 point of view, but at the present time it needs modification, because we have to apply quantum mechanics to the aether. The velocity of the aether, like other physical variables, is subject to uncertainty relations. For a particular physical state the velocity of the aether at a certain point of space-time will not usually be a well-defined quantity, but will be distributed over various possible values according to a probability law obtained by taking the square of the modulus of a wave function. We may set up a wave function which makes all values for the velocity of the aether equally probable. Such a wave function may well represent the perfect vacuum state in accordance with the principle of relativity.

One gets an analogous problem by considering the hydrogen atom with neglect of the spins of the electron and proton. From the classical picture it would seem to be impossible for this atom to be in a state of spherical symmetry. We know experimentally that the hydrogen atom can be in a state of spherical symmetry (any spectroscopic S-state is such a state) and the quantum theory provides an explanation by allowing spherically symmetrical wave functions, each of which makes all directions for the line joining electron to proton equally probable.

We thus see that the passage from the classical theory to the quantum theory makes drastic alterations in our ideas of symmetry. A thing which cannot be symmetrical in the classical model may very well be symmetrical after quantization. This provides a means of reconciling the disturbance of Lorentz symmetry in space-time produced by the existence of an aether with the principle of relativity.

There is one respect in which the analogy of the hydrogen atom is imperfect. A state of spherical symmetry of the hydrogen atom is quite a proper state: the wave function representing it can be normalized. This is not so for the state of Lorentz symmetry of the aether.

Let us assume the four components $v_\mu$ of the velocity of the aether at any point of space-time commute with one another. Then we can set up a representation with the wave functions involving the v's. The four v's can be pictured as defining a point on a three-dimensional hyperboloid in a four-dimensional space, with the equation:
$$v_0^2 - v_1^2 - v_2^2 - v_3^2 = 1, \qquad v_0 > 0. \qquad (1)$$
A wave function which represents a state for which all aether velocities are equally probable must be independent of the v's, so it is a constant over the hyperboloid (1). If we form the square of the modulus of this wave function and integrate over the three-dimensional surface (1) in a Lorentz-invariant manner, which means attaching equal weights to elements of the surface which can be transformed into one another by a Lorentz transformation, the result will be infinite. Thus this wave function cannot be normalized.
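The divergence of the Lorentz-invariant integral over the hyperboloid (1) can be made concrete with a small sketch. It assumes the standard parametrization $v_0 = \cosh\chi$, $|\mathbf{v}| = \sinh\chi$, for which the invariant surface element is $\sinh^2\chi\, d\chi\, d\Omega$; the cutoff `chi_max` and the function names are illustrative and do not appear in the letter.

```python
import math


def invariant_volume_numeric(chi_max, steps=20000):
    """Midpoint-rule estimate of 4*pi * integral_0^chi_max sinh(chi)^2 d(chi)."""
    h = chi_max / steps
    total = 0.0
    for i in range(steps):
        chi = (i + 0.5) * h
        total += math.sinh(chi) ** 2
    return 4.0 * math.pi * total * h


def invariant_volume_closed(chi_max):
    """Closed form of the same integral: pi * (sinh(2 chi_max) - 2 chi_max)."""
    return math.pi * (math.sinh(2.0 * chi_max) - 2.0 * chi_max)


if __name__ == "__main__":
    # A constant wave function has norm proportional to this volume,
    # which grows without bound as the cutoff is removed.
    for chi_max in (1.0, 2.0, 4.0, 8.0):
        print(chi_max, invariant_volume_closed(chi_max))
```

The volume grows roughly like $e^{2\chi_{\max}}$, so a wave function that is constant over the hyperboloid has no finite norm.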
The states corresponding to wave functions that can be normalized are the only states that can be attained in practice. A state corresponding to a wave function which cannot be normalized should be looked upon as a theoretical idealization, which can never be actually realized, although one can approach indefinitely close to it. Such idealized states are very useful in quantum theory, and we could not do without them. For example, any state for which there is a particle with a specified momentum is of this kind (the wave function cannot be normalized because from the uncertainty principle the particle would have to be distributed over the whole universe) and such states are needed in collision problems.

We can now see that we may very well have an aether, subject to quantum mechanics and conforming to relativity, provided we are willing to consider the perfect vacuum as an idealized state, not attainable in practice. From the experimental point of view, there does not seem to be any objection to this. We must make some profound alterations in our theoretical ideas of the vacuum. It is no longer a trivial state, but needs elaborate mathematics for its description.

I have recently [1] put forward a new theory of electrodynamics in which the potentials $A_\mu$ are restricted by:
$$A_\mu A^\mu = k^2,$$
where k is a universal constant. From the continuity of $A_0$ we see that it must always have the same sign and we may take it positive. We can then put
$$k^{-1} A_\mu = v_\mu \qquad (2)$$
and get v's satisfying (1). These v's define a velocity. Its physical significance in the theory is that if there is any electric charge it must flow with this velocity, and in regions where there is no charge it is the velocity with which a small charge would have to flow if it were introduced.

We have now the velocity (2) at all points of space-time, playing a fundamental part in electrodynamics. It is natural to regard it as the velocity of some real physical thing. Thus with the new theory of electrodynamics we are rather forced to have an aether.

P. A. M. DIRAC
St. John's College, Cambridge. Oct. 9.

[1] Proc. Roy. Soc., A, 209, 291 (1951).
Particle Physics and Introduction to Field Theory
T. D. Lee, Columbia University

Chapter 16
VACUUM AS THE SOURCE OF ASYMMETRY
16.1 What Is Vacuum?

In the last century, in order to understand how the electromagnetic force, and later the electromagnetic wave, could be transmitted in space, the vacuum was viewed as a medium called aether. In his note 3075 on experimental research Faraday wrote *

For my own part, considering the relation of a vacuum to the magnetic force and the general character of magnetic phenomena external to the magnet, I am more inclined to the notion that in the transmission of the force there is such an action, external to the magnet, than that the effects are merely attraction and repulsion at a distance. Such an action may be a function of the aether; for it is not at all unlikely that, if there be an aether, it should have other uses than simply the conveyance of radiations.

However, since at that time the nonrelativistic Newtonian mechanics was the only one available, the vacuum was thought to provide an absolute frame which could be distinguished from other moving frames by measuring the velocity of light. As is well-known, this led to the downfall of aether and the rise of relativity.

We know now that vacuum is Lorentz-invariant, which means that just by our running around and changing the reference system, we are not going to alter the vacuum. But Lorentz invariance does not embody all physical characteristics. We may still ask: What is this vacuum state?

* Michael Faraday, Experimental Researches in Electricity (London, R. and J. E. Taylor, 1839-55).

In the modern treatment, we define the vacuum as the lowest-energy state of the system. It has zero 4-momentum. In most quantum-field theories, the vacuum is used only to enable us to perform the mathematical construct of a Hilbert space. From the vacuum state we build the one-particle state, then the two-particle state, ...; hopefully, the resulting Hilbert space will eventually resemble our universe. From this approach, different vacuum state means different Hilbert space, and therefore different universe.

From Dirac's hole theory we know that the vacuum, although Lorentz-invariant, is actually quite complicated. In general, we may expect the vacuum to be as complex as any spin-0 field at 4-momentum
$$k_\mu = 0 . \qquad (16.1)$$
Like a spin-0 field, it is conceivable that the vacuum state may carry quantum numbers such as isospin I, parity P, strangeness S, etc. In this context we may ask: Could the vacuum be regarded as a physical medium? If under suitable conditions the properties of the vacuum, like those of any medium, can be altered physically, then the answer would be affirmative. Otherwise it might degenerate into semantics.

The analysis given below will be based primarily on two of the most remarkable phenomena in modern physics: (i) missing symmetry, and (ii) quark confinement. The former will be discussed now and the latter in the next chapter.
16.2 Missing Symmetry

If we add up the symmetry quantum numbers such as I, S, P, C, ..., of all matter, we find these numbers to be constantly changing
$$\frac{d}{dt}\begin{pmatrix} I \\ S \\ P \\ C \\ CP \end{pmatrix}_{\rm matter} \neq 0 . \qquad (16.2)$$
Aesthetically, this may appear disturbing. Why should nature abandon perfect symmetry? Physically, this also seems mysterious. What happens to these missing quantum numbers? Where do they go? Can it be that matter alone does not form a closed system? If we also include the vacuum, then perhaps symmetry may be restored
$$\frac{d}{dt}\begin{pmatrix} I \\ S \\ P \\ C \\ CP \end{pmatrix}_{\rm matter\,+\,vacuum} = 0 . \qquad (16.3)$$
As a bookkeeping device, this is clearly possible. It also forms the basic idea underlying the important topic of spontaneous symmetry breaking, developed by Y. Nambu and others.* In such a scheme one often assumes that there exists some phenomenological spin-0 field $\phi$ with
$$\phi_{\rm vac} \equiv \langle {\rm vac} | \phi | {\rm vac} \rangle \neq 0 . \qquad (16.4)$$

* For a history of this subject, see Y. Nambu, Fields and Quanta 1, 33 (1970).
Consequently, the observed asymmetry can be attributed entirely to the state vector of our universe, not to the physical law. [Examples will be given later.] On the other hand, unless we have other links connecting matter with vacuum, how can we be sure that this idea is right, and not merely a tautology?

A way out of this dilemma is to realize that in (16.1) the restriction $k_\mu = 0$ for the vacuum state is only a mathematical idealization. After all, very likely the universe does have a finite radius, and $k_\mu$ is therefore never strictly zero. So far as the microscopic system of particle physics is concerned, there is little difference between $k_\mu = 0$ and $k_\mu$ nearly 0; the latter corresponds to a state that varies only very slowly over a large space-time extension. This means that if the idea expressed by (16.4) is correct, then under suitable conditions, we must be able to produce excitations, or domain structures, in the vacuum. In such an excited state, there exists a volume $\Omega$ whose size is $\gg$ the relevant microscopic dimension; inside $\Omega$ we have the expectation value $\langle\phi\rangle \neq \phi_{\rm vac}$. The symmetry properties inside $\Omega$ can then be different from those outside.

16.3 Vacuum Excitation
How can we produce such a change in $\langle\phi\rangle$? Consider the correspondence with a ferromagnet:

$\phi$ <—> magnetic spin,
$J$ = matter source <—> magnetic field,

as shown in Fig. 16.1.

Fig. 16.1. Domain structures in a ferromagnet vs. in the vacuum. [Panels: spin direction in a ferromagnet; vacuum excitation, with $\langle\phi\rangle = \phi_{\rm vac}$ outside.]
In the case of a very large ferromagnet, because the spins interact linearly with the magnetic field, a domain structure can be created by applying an external magnetic field over a large volume. Furthermore, after domains are created, we may remove the external field; depending on the long-range forces, the surface energy and other factors, such a domain structure may persist even after the external magnetic field is removed.

Similarly, by applying over a large volume any matter source J which has a linear interaction with $\phi(x)$, we may hope to create * a domain structure in $\langle\phi\rangle$. [...]
$$\mathcal{L} = -\tfrac{1}{2}\left(\frac{\partial\phi}{\partial x_\mu}\right)^2 - U(\phi) , \qquad (16.5)$$
where the absolute minimum of U is at $\phi = \phi_{\rm vac}$ and with $U(\phi_{\rm vac}) = 0$. Since we are interested in the long-wavelength limit of the field, the scalar field [...] but zero outside. For a sufficiently large $\Omega$, we may neglect the surface energy. The energy of the system becomes
$$[\,U(\phi) + J\phi\,]\,\Omega . \qquad (16.6)$$
Its minimum determines the expectation value of $\phi$ inside $\Omega$. The graphs in Fig. 16.2 illustrate how the new expectation value $\bar\phi = \langle\phi(x)\rangle$ can be changed under the influence of J. If the missing symmetry is due to $\phi_{\rm vac} \neq 0$, then by changing $\bar\phi$ we may alter the symmetry properties inside $\Omega$ dynamically. Of course, to do a realistic experiment to reclaim our missing symmetry is not easy. But it is the prerogative of the theorist to contemplate such a situation.

* T. D. Lee and G. C. Wick, Phys. Rev. D9, 2291 (1974).
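How the minimum of (16.6) shifts under a source can be sketched numerically. The example below assumes, purely for illustration, a single-well potential $U(\phi) = \tfrac{1}{2}\mu^2(\phi - \phi_{\rm vac})^2$ near its absolute minimum; this quadratic form, the parameter values and the function names are assumptions, not the potential used in the text. For this choice the minimum of $U + J\phi$ sits at $\phi_{\rm vac} - J/\mu^2$, which the code recovers by a ternary search.

```python
def energy_density(phi, source, phi_vac=1.0, mu=2.0):
    """U(phi) + J*phi for the assumed quadratic U = mu^2 (phi - phi_vac)^2 / 2."""
    return 0.5 * mu ** 2 * (phi - phi_vac) ** 2 + source * phi


def minimize_convex(func, low, high, iterations=200):
    """Ternary search for the minimum of a convex function on [low, high]."""
    for _ in range(iterations):
        third = (high - low) / 3.0
        left, right = low + third, high - third
        if func(left) < func(right):
            high = right
        else:
            low = left
    return 0.5 * (low + high)


def shifted_expectation(source, phi_vac=1.0, mu=2.0):
    """Value of phi inside the volume that minimizes U + J*phi."""
    return minimize_convex(
        lambda phi: energy_density(phi, source, phi_vac, mu),
        phi_vac - 10.0,
        phi_vac + 10.0,
    )


if __name__ == "__main__":
    for source in (0.0, 0.4, 0.8):
        print(source, shifted_expectation(source))
```

With J = 0 the search returns $\phi_{\rm vac}$; a nonzero J moves the expectation value away from it, which is the mechanism the section describes.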
16.4 CP Nonconservation and Spontaneous Symmetry Breaking

We discuss here one of the simplest examples that illustrates the phenomenon of spontaneous symmetry breaking. Our purpose is to give a theory * in which (i) the Lagrangian is invariant under CP and T, but (ii) its S-matrix violates CP and T symmetry.

* T. D. Lee, Physics Reports 9C, No. 2 (1974).

Fig. 16.2. Change of ... [Panels (a) and (b): $U + J\phi$ plotted against $\phi$, with $\phi_{\rm vac}$ marked.]
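Let us assume the system consists of a spin-½ Dirac field $\psi$ and a spin-0 Hermitian field $\phi$. The Lagrangian density is
$$\mathcal{L} = -\tfrac{1}{2}\left(\frac{\partial\phi}{\partial x_\mu}\right)^2 - U(\phi) - \psi^\dagger\gamma_4\left(\gamma_\mu\frac{\partial}{\partial x_\mu} + m\right)\psi - i g\,\psi^\dagger\gamma_4\gamma_5\psi\,\phi , \qquad (16.7)$$
where
$$U(\phi) = \tfrac{1}{8}\kappa^2(\phi^2 - \rho^2)^2 . \qquad (16.8)$$
From Hermiticity, the parameters m, g, $\rho$ and $\kappa$ must be real. It can readily be verified that $\mathcal{L}$ is invariant under T, C and P where
$$T\,\phi(\mathbf{r}, t)\,T^{-1} = -\phi(\mathbf{r}, -t) , \qquad (16.9)$$
$$C\,\phi\,C^{-1} = \phi , \qquad (16.10)$$
$$P\,\phi(\mathbf{r}, t)\,P^{-1} = -\phi(-\mathbf{r}, t) . \qquad (16.11)$$
The corresponding transformations of $\psi$ are given by (10.5), (10.9) and (13.53). Because U is a fourth-order polynomial in $\phi$, the theory is renormalizable.

The vacuum expectation value of $\phi$ is determined by the minimum of $U(\phi)$. As shown in Fig. 16.3, we have either
$$\langle\phi\rangle_{\rm vac} = \rho > 0 \qquad (16.12)$$
or $\langle\phi\rangle_{\rm vac} = -\rho$. In either case, since $\phi$ is of T = −1, CP = −1 and P = −1, a nonzero expectation value $\langle\phi\rangle_{\rm vac} \neq 0$ violates T (and also CP and P). The T invariance of the Lagrangian implies that if $\langle\phi\rangle_{\rm vac} = \rho$ is a solution, $\langle\phi\rangle_{\rm vac} = -\rho$ must also be one. These two solutions transform into each other under T, but by itself neither is invariant under T. (It is also not invariant under CP and P, though it is under C and CPT.)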
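Because of quantum effects, the field $\phi$ fluctuates around its vacuum expectation value. We may choose $\langle\phi\rangle_{\rm vac} = \rho$, and write
$$\phi = \rho + \delta\phi .$$
In terms of $\delta\phi$, U becomes
$$U = \tfrac{1}{2}\mu^2(\delta\phi)^2\left(1 + \frac{\delta\phi}{2\rho}\right)^2 , \qquad (16.13)$$
where $\mu = \kappa\rho$.

Fig. 16.3. The potential energy density $U(\phi) = \tfrac{1}{8}\kappa^2(\phi^2 - \rho^2)^2$.

For the solution $\langle\phi\rangle_{\rm vac} = \rho$, we may perform a unitary transformation under which $\phi$ is unchanged, but
$$\psi \to e^{-\frac{1}{2} i\gamma_5\alpha}\,\psi . \qquad (16.14)$$
Therefore, the quadratic expressions
$$\psi^\dagger\gamma_4\psi \to \psi^\dagger e^{\frac{1}{2} i\gamma_5\alpha}\gamma_4\, e^{-\frac{1}{2} i\gamma_5\alpha}\psi = \psi^\dagger\gamma_4\, e^{-i\gamma_5\alpha}\psi = \psi^\dagger\gamma_4(\cos\alpha - i\gamma_5\sin\alpha)\psi$$
and
$$i\,\psi^\dagger\gamma_4\gamma_5\psi \to i\,\psi^\dagger e^{\frac{1}{2} i\gamma_5\alpha}\gamma_4\gamma_5\, e^{-\frac{1}{2} i\gamma_5\alpha}\psi = i\,\psi^\dagger\gamma_4\gamma_5\, e^{-i\gamma_5\alpha}\psi = \psi^\dagger\gamma_4(\sin\alpha + i\gamma_5\cos\alpha)\psi .$$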
Hence, by choosing
$$\tan\alpha = g\rho/m , \qquad (16.15)$$
we have
$$\psi^\dagger\gamma_4(m + i g\rho\gamma_5)\psi \to \psi^\dagger\gamma_4 M\psi , \qquad (16.16)$$
where
$$M = (m^2 + g^2\rho^2)^{1/2} . \qquad (16.17)$$
By substituting (16.14) into (16.7), we find that the Lagrangian density $\mathcal{L}$ becomes
$$\mathcal{L} = -\tfrac{1}{2}\left(\frac{\partial\,\delta\phi}{\partial x_\mu}\right)^2 - U - \psi^\dagger\gamma_4\left(\gamma_\mu\frac{\partial}{\partial x_\mu} + M\right)\psi - g\,\psi^\dagger\gamma_4(\sin\alpha + i\gamma_5\cos\alpha)\psi\,\delta\phi . \qquad (16.18)$$
Since the operator $\psi^\dagger\gamma_4\psi$ is of P = 1, C = 1 and T = 1 while the operator $i\,\psi^\dagger\gamma_4\gamma_5\psi$ is of P = −1, C = 1 and T = −1, any exchange of the $\delta\phi$ quantum would give an interference term between these two operators that violates T, P and CP; but the product symmetry CPT remains intact. Figure 16.4 gives an example of such an interference term, whose amplitude is
$$A = g^2\sin\alpha\cos\alpha\,(k^2 + \mu^2)^{-1} , \qquad (16.19)$$
where k denotes the 4-momentum transfer.

Fig. 16.4. A T-violating scattering diagram due to the exchange of the quantum of $\delta\phi$. [Vertices: $g\sin\alpha$ and $i g\gamma_5\cos\alpha$.]
The model discussed in this section illustrates the basic mechanism of a spontaneous T violation. [The same discussion applies to CP and P as well.] One assumes that the ground state of the system (called the vacuum) has a nonzero expectation value $\langle\phi\rangle_{\rm vac}$, where $\phi$ is a T = −1 phenomenological spin-0 field; thus, the vacuum is noninvariant under T, even though the Lagrangian satisfies T invariance.

The T invariance of the Lagrangian implies that the vacuum must have a double degeneracy. It is interesting to examine the barrier penetration between these two degenerate solutions: $\langle\phi\rangle_{\rm vac} = \rho$ and $-\rho$ of Fig. 16.3. By enclosing the entire system in a finite volume $\Omega$ with a periodic boundary condition, we may expand $\phi$ in terms of the usual Fourier series
$$\phi = q + \sum_{\mathbf{k}\neq 0} \phi_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{r}} , \qquad (16.20)$$
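where q is independent of $\mathbf{r}$. Substituting this expression into (16.7), we find
$$L = \int \mathcal{L}\, d^3r = \tfrac{1}{2}\dot{q}^2\,\Omega - U(q)\,\Omega + \cdots$$
where $\cdots$ depends on $\phi_{\mathbf{k}}$ and the fermion field $\psi$. The barrier-penetration amplitude may be crudely estimated by concentrating on the q-degree of freedom. We may set $\phi_{\mathbf{k}} = 0$ and $\psi = 0$. The Hamiltonian becomes then
$$H = \frac{p^2}{2\Omega} + U(q)\,\Omega \qquad (16.21)$$
where $p = \Omega\dot{q}$. According to the W.K.B. approximation in quantum mechanics, the barrier-penetration amplitude is
$$\sim \exp\left\{ -\Omega \int_{-\rho}^{\rho} [\,2U(q)\,]^{1/2}\, dq \right\} \qquad (16.22)$$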
which goes to zero exponentially as the volume of the system approaches infinity.

From (16.17) we see that the fermion mass changes from m to M when the expectation value of $\phi$ is $\rho$. The symmetry-violating amplitude also depends on $\langle\phi\rangle_{\rm vac}$. Hence if we follow the discussion given in Section 16.3, by applying a matter source J over a large volume $\Omega$ we may alter $\langle\phi\rangle$ inside $\Omega$, and thereby change the mass of the particle and the symmetry-violating amplitude.

Remarks. The above example demonstrates the essence of the spontaneous symmetry-breaking mechanism. The Lagrangian is invariant under a certain group $G$ of symmetry transformations. But the vacuum state is not, and that gives rise to symmetry-violating phenomena. By applying $G$ onto the vacuum state, we must generate other states degenerate with the vacuum. In a realistic cosmological model, the volume $\Omega$ of the universe may be expected to be finite. In general, there would be nonzero, but very small, barrier-penetration amplitudes between these different "vacua", which can lift the degeneracy. While such effects can be safely neglected at our present stage of evolution, they may have been important at a much earlier period when $\Omega$ was still of microscopic dimensions.

In Chapter 22 we shall discuss further the application of the spontaneous symmetry-breaking mechanism to various continuous symmetry groups.
The New Ether J.D. Bjorken FERMI NATIONAL ACCELERATOR LABORATORY
I. Introduction In this talk, I want to address three topics. The first is to review yet again the standard model of elementary particles and forces. This standard model summarizes a fantastic amount of progress made in the last twenty years in our understanding of the basic building blocks of matter and the basic forces in nature. The second goal of this talk is to connect the standard model with the early universe,1 the "big bang." The third goal, to which the title refers, is to highlight the emerging importance as time goes on of the vacuum state. By the vacuum state, the "new ether," I mean that quantum state of the world, or of some region of the world, which contains absolutely nothing within, where all the matter has been removed, and where there exists no energy or momentum. It is, in a word, nothing at all. But what can one say about "nothing at all"? At this point I'm tempted to try for the next four minutes and twenty-two seconds to do for particle physics what John Cage did for music. But I doubt that the chairman will let me do that. Although one probably should not bother talking about nothing at all, nevertheless serious physicists do. The study of the vacuum has become a very sophisticated subject, about which there is a great deal of expertise.2 There are some experts at this meeting. Of course all these experts are theorists. One of them is Gerard 't Hooft. And I can say with certainty that Gerard understands nothing at all better than I do. So what is vacuum? Well, experimentalists know what it is; it's what's left in the can after everything is pumped out. But, of course, if one takes a can and pumps everything out, and the temperature of the can is finite, everything isn't pumped out. There are still photons radiated from the walls, hence a gas of photons existing inside the can. In order to get rid of those, one has to go to zero temperature to pump out the photons as well. Now, in thinking about vacuum, that suggests approaching the ideal vacuum from the point of view of the big bang and the early universe. Just take a piece of the universe which at early times has lots of things in it. Then, as the universe expands and cools, fewer and fewer particles per unit volume remain. If we live in an open universe with Ω < 1, maybe in the distant future we'd get something that is as close to real vacuum as a particle physicist imagines it.3 So in what follows I will try to describe, in the context of big-bang history of the early universe, the raw material of particle physicists and to set the stage for the end of the talk, where we discuss what this vacuum is and why it looks to me more complicated than nineteenth-century ether.
II. Big Bang Overview
The earliest epoch of the big bang we will discuss starts a few femtoseconds after the beginning. This is a long time after the start for contemporary particle theorists; this is a very conservative talk. At this time the contents of the universe are believed to be a hot plasma with a temperature of about 10 TeV, somewhat above what can be reached, in terms of energy per particle, in the biggest accelerators. As time goes on, the universe expands and cools. As it goes down through lower temperature scales, it passes through all the energy scales of interest to experimental particle physics and nuclear physics. When the universe is 10 milliseconds old, the temperature is down to 10 MeV or so. Beyond that it's on to the first three minutes, and we need not discuss that.⁴ The features we will discuss in detail are shown in Table I. Needless to say, especially from the nanosecond time scale on back, there is a fair amount of conjecture, because we don't have much in the way of experimental facts beyond the 10-100 GeV mass scale, where the highest energy collisions yet studied in detail have just yielded the intermediate bosons of the weak interactions. In those very early times we use some theoretical hubris; namely that the standard model has been working so well that we think it will, in its gross features, extrapolate quite a way up in the energy scale without serious error.
Table I.
Time          Temperature   Comments
10⁻¹⁴ sec     10 TeV        Symmetric world
10⁻¹² sec     1 TeV
              400 GeV       second-order electroweak phase transition
10⁻¹⁰ sec     100 GeV       W, Z freezeout
10⁻⁸ sec      10 GeV        top quark freezeout
10⁻⁶ sec      1 GeV         bottom, charm freezeout
              200 MeV       first-order deconfining phase transition
~10⁻⁴ sec     100 MeV       baryon asymmetry appears
~10⁻² sec     10 MeV
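The times and temperatures in Table I follow, to a good approximation, the radiation-era rule that the age of the universe scales as the inverse square of its temperature. The Python sketch below is an illustration added here, not part of the original talk. It anchors that scaling to the one point quoted in the text (about 10 TeV at about 10⁻¹⁴ sec) and ignores the slow change in the number of particle species present in the plasma. The names `time_at_temperature`, `T_REF_GEV` and `T_REF_SEC` are chosen for this sketch only.

```python
# Rough radiation-era scaling: t is proportional to 1 / T^2.
# Anchor point taken from the talk: T ~ 10 TeV at t ~ 1e-14 sec.
# Assumption: the number of relativistic species is treated as constant.

T_REF_GEV = 1.0e4    # 10 TeV expressed in GeV
T_REF_SEC = 1.0e-14  # age of the universe at the anchor temperature


def time_at_temperature(temp_gev):
    """Return the approximate age (sec) at which the plasma has cooled to temp_gev."""
    return T_REF_SEC * (T_REF_GEV / temp_gev) ** 2


# Temperatures mentioned in the talk, in GeV.
milestones = [
    ("10 TeV", 1.0e4),
    ("1 TeV", 1.0e3),
    ("400 GeV", 400.0),
    ("100 GeV", 100.0),
    ("10 GeV", 10.0),
    ("1 GeV", 1.0),
    ("200 MeV", 0.2),
    ("100 MeV", 0.1),
    ("10 MeV", 0.01),
]

for label, temp in milestones:
    print(f"{label:>8}  t ~ {time_at_temperature(temp):.1e} sec")
```

With this single anchor the rule returns roughly 10 milliseconds at 10 MeV, which is the figure quoted in the overview, so the scaling is at least consistent with the numbers given in the talk.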
III. Early Times; Highest Energy (10⁻¹⁴ sec, 10 TeV)
So let us start at the beginning, when the time is ~10⁻¹⁴ sec and the temperature is ~10 TeV. While this is the most speculative epoch we'll be talking about, it's the one which really epitomizes most cleanly our present view of the fundamental building blocks and forces that characterize the standard model, a view which exhibits a high degree of symmetry. What are the contents of this big bang plasma? Let us build it up by quickly going up the temperature scale. At three degrees we have just photons. But as one goes up in the energy scale, the photons start colliding with each other, and they can make other particles. First of all come electrons and positrons. Then they all collide and make still others, such as ordinary hadrons. Going up the temperature scale, one makes all the species that exist, more or less in equal proportion. Table II shows the "periodic table" of building blocks. All of them are there in the plasma at this temperature of 1-10 TeV. There is the ordinary stuff that we're made of: up and down quarks in the three colors, electrons and their neutrinos ν_e.

Table II. Periodic table of building blocks
                   Quarks          Leptons
4th Generation?    ?     ?         ?    ?
3rd Generation     ttt   bbb       τ    ν_τ
2nd Generation     ccc   sss       μ    ν_μ
1st Generation     uuu   ddd       e    ν_e

Forces
Sources            Carrier         Strength
Charge             photon          1/137
"weak isospin"     W⁺, W⁻, Z       1/30
color              8 gluons        1/7

In addition there are the two replications of that at higher mass scales; one is the "second family": charm and strange quarks, the muon and its neutrino. The other replication, or "third family", contains top and bottom quarks, the tau lepton and its neutrino. All of these objects have been discovered for sure except for the top quark and the tau neutrino. Everybody believes the latter exists, but it has never been directly observed. But someday it may be discovered at Fermilab. Now all of these quarks, leptons, and their antiparticles are swimming around in this initial 10 TeV plasma as more or less a free gas. But there do exist interactions between them. The principal forces are the "gauge" forces. The electromagnetic force couples to charged sources; the carrier of the force is the photon. The intrinsic strength of the force is measured by the famous number 1/137, which is small. (That number should go to 1/128 or some slightly larger number at this scale, but it's still small.) So everything charged, e.g. the top quarks of charge 2/3, bottom quarks of charge -1/3, the leptons with charge ±1, will interact via the electromagnetic force. The weak force, mediated by the now famous intermediate bosons W± and Z⁰, couples to something called weak isospin. All of the quark and lepton species, at least in part, have weak isospin and couple to the weak force. Its strength is measured by a number which, instead of 1/137, is about 1/30. The third "gauge" interaction is the strong force, which acts among the quarks. The quarks, which are "colored", couple to each other through exchange of gluons with a strength of order 1/7. What makes this picture so satisfying is
that there is a great similarity in the way the quarks and the leptons in the various generations behave. Furthermore, all these forces have a great similarity as well. At this 10 TeV scale all the forces are inverse square forces, like the electrical force. All the force carriers, the photon, the W± and the Z⁰, the gluons, can be considered massless, because the energy scale is much higher than, e.g., the mass scale of the intermediate bosons. All of the force carriers have two polarizations, like photons. The electric vector moves transversely to the direction of motion, either in the horizontal or the vertical. Better, one can circularly polarize the quanta and talk about left and right-handed helicities. All of these forces are based on gauge principles and are described by Maxwell-like equations, although in detail the latter two are more complicated than the electromagnetic. And all the force laws possess lots of symmetry. These underlying symmetries have been vital to the discovery of the force laws and the consistency of the theories that describe them. The building blocks, the quarks and leptons, have helicity properties similar to the gauge particles. The neutrinos come in only one helicity. Their polarization, if you like, rotates to the left, with their spin angular momentum pointing against the direction of motion. The anti-neutrinos have the opposite helicity. Electrons and positrons, as well as the quarks, possess both left and right helicities for both the particle and the anti-particle. But at this high temperature these are independent degrees of freedom without significant communication between them. And only left-helicity particles or right-helicity antiparticles participate in the weak interaction. Thus, in summary, at the 10 TeV scale all the quark and lepton degrees of freedom are present in the big bang plasma. There is a high degree of symmetry between different species of quarks and species of leptons.
There are no forces between the building blocks which are "strong". Even the strong force mediated by gluons is characterized by a coupling strength ~1/7, only four times as large as that of the "weak" force. The three known forces (strong, weak and electromagnetic) are at this scale inverse-square forces derived from symmetry principles analogous to electromagnetic gauge invariance. Quarks and leptons can be broken into right and left handed parts; the left-handed parts behave much like neutrinos and participate in the weak interaction. The picture is relatively simple and very symmetrical, and slightly incomplete. We have left some things out. But in this simplest description of the standard model
they are usually left out at this stage. When I have to put them back in, I'll return to this temperature scale.
IV. The Electroweak Phase Transition (10⁻¹² sec, 400 GeV)
So now we let the clock run, bringing down the temperature of the universe, and watch what happens. Below one TeV we begin to approach the top end of the present and future accelerator ranges of study. At the picosecond time scale, i.e., the several hundred GeV range, there is an occurrence that 99 percent of all theorists believe takes place. This is called the electro-weak phase transition, expected to be of second order. It has an analogy in solid state physics, which is simply the transition from normal metal to superconductor. The solid-state analog of the plasma of quarks and gluons is the gas of electrons in a metal. Below the superconducting transition temperature is formed a kind of Bose condensate of bound electron-hole pairs. The analogous electroweak condensate is very important for elementary-particle phenomenology at low energies. Its presence is believed to be responsible for the fact that three of the gauge bosons, the W⁺, W⁻, and Z⁰, acquire a big mass of about 100 GeV. This phenomenon also has its analog in solid state physics in the Meissner effect. Long range electromagnetic fields in a charged superfluid or superconducting medium get expelled. Long range electromagnetic fields cannot exist inside of the condensate. Currents are formed at the boundaries of a superconductor (something which can't happen very easily in vacuum, which doesn't have a boundary) such that the fields fall off exponentially as one goes into the superfluid medium, just like a Yukawa meson field falls off exponentially away from its source. In the solid state analog one can colloquially say that the photon gets a mass. That's loose talk, but roughly speaking the Compton wavelength of the photon is, in the superconducting analog, the London penetration depth.
In the particle physics case, the analogous words are that below this critical temperature of several hundred GeV, some sort of condensate is forming which is analogous to the superfluid. By analogy with the Meissner effect, the W± and Z obtain a mass. It turns out that the square of the mass is proportional to the weak coupling constant (1/30) and to the square of the critical temperature. That means the W
mass comes out to about 80 GeV, eighty times the proton mass. The Z comes out a little heavier. Now the big question is what this condensate is. Are the degrees of freedom identifiable in terms of the quarks and leptons in the periodic table? While many certainly have tried, I think there is a clear answer of "no." The condensate is not made of the known degrees of freedom, and we must put in something extra. Now the simplest model which does that, the Lagrangian-Higgs model, also has a solid state analog. It is analogous to an obsolete theory of superconductivity called the Ginzburg-Landau theory. One to one the solid-state equations can be mapped over, with some corrections in honor of special relativity. So if we now add in the minimum addition to the periodic table, we may go back to our original high temperature of 10 TeV and ask what has to be thrown into the plasma to account for this phenomenon. Whatever this turns out to be is dubbed the "Higgs sector." This Higgs sector turns out to be at least two particles along with their anti-particles. One of them is electrically charged and the other is neutral. They possess no helicity or spin and they must have "weak isospin." That is, they couple to the W± and the Z. Of course, the charged particle couples to the photon. But none of the four have color and thus they do not couple to gluons. So that is what is seen at high temperature in this minimum model, the Lagrangian-Higgs model. At a low temperature below the second order phase transition, a condensate is formed from a combination of the two neutral Higgs particles. The other three are subsumed into the W⁺, W⁻ and the Z⁰. Those degrees of freedom are necessary to convert the massless W± and the Z⁰, with their two transverse polarizations that vibrate in the horizontal and the vertical, into a massive particle which can also have its polarization vibrating in the third, longitudinal direction.
There is an extra degree of freedom for the massive spin one particle not present for the massless one, provided by these new degrees of freedom. So 33.3% of the W± and Z⁰ are made from the Higgs sector. Finally there is a dynamical remnant of the condensate, excitations which have an "energy gap" and correspond to another massive spinless particle, which is called the Higgs boson. Now remember that the minimum model of our world, namely the "first generation" up and down quarks, electron and its neutrino, is far from everything. We don't understand why the rest should be there. And, in the same way, there's no reason why these four extra "Lagrangian-Higgs" particles should be the whole story. Real life
may well be much more complicated, with a periodic table for the Higgs degrees of freedom as big as what we have for quarks and leptons. It may even be as big as what is known as the Rosenfeld tables. They are a small book, packed with data on all of the particles (hadrons) that one makes out of quarks, antiquarks, and gluons. The study of the Higgs sector may well be as exploratory as the study of the strong force, from Yukawa to QCD.
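The mass estimate made earlier in this section, with the square of the W mass proportional to the weak coupling and to the square of the critical temperature, can be checked with a line of arithmetic. The sketch below is an illustration added here, not the author's calculation. It takes the weak strength of about 1/30 from Table II and the 400 GeV transition temperature from the section heading, and it assumes a proportionality constant of exactly one, which the talk does not specify. The name `w_mass_estimate` is hypothetical.

```python
import math

# Inputs quoted in the talk.
ALPHA_WEAK = 1.0 / 30.0   # weak coupling strength (Table II)
T_CRITICAL_GEV = 400.0    # electroweak transition temperature

# Assumption: the unspecified constant of proportionality is set to 1.
PREFACTOR = 1.0


def w_mass_estimate(alpha, t_critical_gev, prefactor=PREFACTOR):
    """Estimate M_W from M_W^2 = prefactor * alpha * T_c^2 (result in GeV)."""
    return math.sqrt(prefactor * alpha) * t_critical_gev


mass = w_mass_estimate(ALPHA_WEAK, T_CRITICAL_GEV)
print(f"Estimated W mass: {mass:.0f} GeV")
```

With a unit prefactor the estimate lands in the 70 to 80 GeV range, the same ballpark as the roughly 80 GeV quoted in the text. The exact value depends on the order-one factor assumed here.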
V. Freeze-outs (10⁻¹¹ to 10⁻⁶ sec; 400 to 1 GeV)
As the universe continues to expand and the temperature drops from hundreds of GeV down to, say, 10 GeV, new things happen. The first is what can be called the "freeze-out" of the W's and Z's. As the universe expands and the temperature drops below the W and Z rest energy, they disappear from the plasma. They decay into quarks and leptons. This decay process was present at high temperatures as well, but the W and Z bosons could be replenished by collision processes: they are radiated by quarks and leptons just like photons are radiated by charged particles. At low temperature there is no longer enough collision energy to produce the rest energy of the W and Z. After the W and Z disappear, quarks and other unstable particles freeze out in a similar way. So far, at these high temperature scales, the rest mass of the quarks and the leptons hasn't entered at all, and in fact has been totally ignored. Where does the rest mass come from? Again it is believed to come from the same mechanism that gives the W± and Z their mass. In order to have this happen, the quarks and the leptons must be allowed to couple to the Higgs sector. In particular a Yukawa interaction of Higgs meson emission and absorption is postulated such that, say, the left-handed top quark can change to a right-handed one, the one with the opposite helicity, plus a Higgs meson, and vice versa. This is similar to the Yukawa force used for the strong interaction. In fact, the charged Higgs mesons may also be exchanged; consequently the top can go to the charged Higgs plus bottom quark. This interaction between the Higgses and the top quarks suffices to create another phenomenon, which has its analogy in modern BCS superconductivity theory, called the energy gap. In the BCS theory, the electron excitations above the Fermi surface have an energy gap. In this case that is interpreted as the top quark getting a mass, namely, the minimum
energy that the top quark can have is a nonvanishing amount above zero. That is simply its rest energy. And the amount of mass the top quark gets is proportional to the amount of coupling that it has to the Higgs bosons. Alas, the mass is not predicted; one gets out no more than what is put in. Now as the temperature comes down and becomes less than the rest mass of the top quarks, they disappear from the big bang plasma. A loss mechanism is again the weak decay of the top quarks. In addition, they can annihilate with their anti-particles, because they are there in about equal numbers. Again if one wants to get them back into the plasma through inverse processes, it is very hard to do because the rest mass has to be created in a collision. When the temperature is small compared to the rest mass it is very unlikely that there are the particles around to do that. And so the top and antitop drop out of equilibrium ("freeze out") and are lost from the plasma, essentially in the same way the W's and the Z's were lost when the temperature went well below 100 GeV. That is also true for massive Higgs bosons, which have to be there in the plasma at high temperature. As we go on in time and down in temperature, the rest of the quarks start freezing out by the same mechanism. The bottom quark, with its rest energy of 5 GeV, is next, then the charm quark with its rest energy of 2 GeV, soon followed by the highest mass lepton, the tau lepton, which is the second replication of the electron. It has a mass about the same as the charm quark and will also drop out in the few-GeV temperature regime. Now the time scale is getting almost to a microsecond; the freeze-outs are the main feature in this epoch following the electroweak phase transition.
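The statement that heavy particles are hard to regenerate once the temperature falls below their rest energy can be made semi-quantitative with the Boltzmann factor exp(-m/T), which governs how rarely thermal collisions supply an energy m. The sketch below is an illustration added here and uses only that standard statistical-mechanics factor. It leaves out density prefactors, decay rates, and the expansion of the universe, all of which a real freeze-out calculation would need. The rest energies for bottom and charm are the ones quoted in the talk; the function name `boltzmann_factor` is chosen for this sketch.

```python
import math

# Rest energies quoted in the talk, in GeV.
REST_ENERGY_GEV = {"bottom": 5.0, "charm": 2.0}


def boltzmann_factor(rest_energy_gev, temp_gev):
    """Thermal suppression exp(-m/T) for creating a particle of rest energy m."""
    return math.exp(-rest_energy_gev / temp_gev)


# Temperatures (GeV) spanning the freeze-out epoch described in the text.
for temp in (10.0, 5.0, 2.0, 1.0, 0.5):
    row = ", ".join(
        f"{name}: {boltzmann_factor(m, temp):.2e}"
        for name, m in REST_ENERGY_GEV.items()
    )
    print(f"T = {temp:>4} GeV  ->  {row}")
```

The suppression is mild while the temperature is comparable to the rest energy and becomes severe once the temperature is several times smaller, which is the sense in which a species "freezes out."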
VI. The Confining Phase Transition (10⁻⁵ sec, 200 MeV)
Downward, below one-GeV temperature and going toward 100 MeV, another very interesting thing happens. It is associated with the change in the content of the big-bang plasma from quarks and gluons to the hadrons (protons, pions) that we're made of, and is usually called the de-confining phase transition. But I will call it the "confining" phase transition: on the way down in temperature it's confining, but looking up in temperature from below (as we are usually obliged to do) it's de-confining.
At temperatures well above about 200 MeV, the important degrees of freedom in the plasma are the quarks and the gluons (and some photons and leptons). At, say, 300 MeV those which haven't frozen out are up, down, and strange quarks, their antiquarks, the gluons, the photons, the muon, the electron, and all the neutrinos, all interacting with each other. Now as the energy scale comes down, the strong force rapidly gets stronger through subtle "vacuum polarization" effects. At about 200 MeV, things come to a head and there is a first order phase transition,⁵ with lots of latent heat. Below that first-order phase transition there are no more quarks and gluons; the density and temperature are low enough so that identifiable mesons and baryons can most of the time exist (the available volume per hadron is larger than the hadron size). At the same time something called the QCD non-perturbative vacuum, or condensate, forms. This is a relatively woolly concept, not under as good control, in my opinion, as the idea of the electro-weak condensate. What is this new condensate? It is not quite like a superconductor or a superfluid, but there are some similarities in the implications. To appreciate it, one must understand the main feature of this de-confining phase transition. It is the transition between the strong force law at short distances, which is inverse square (just like all the other ones), and the force law at long distances, which is a constant force, associated with a "string tension." What is supposed to happen is that the color fields surrounding a quark and an anti-quark which are close together look pretty much like the electromagnetic fields associated with a charged dipole. But if one pulls the quark and antiquark apart by more than 10⁻¹³ cm or so, the field lines somehow are supposed to get squashed together into a flux tube. Colloquially, this can be described as pressure from the vacuum squeezing those flux lines in.
One may speak of a gluon condensate or "non-perturbative QCD vacuum" on the outside, excluding the colored flux from it in a manner something like the Meissner effect. In fact there are nice analogies to magnetic monopoles travelling through type-two superconductors, in which this phenomenon does occur. The field lines of the monopole are crowded together into quantized vortices or flux tubes trailing behind the monopole. But while there are some analogies, the actual mathematical description of this nonperturbative QCD vacuum still remains rather primitive. At the same time and temperature, something else happens; a superfluid condensate of the quark degrees of freedom forms, again in analogy to BCS superconductivity. This is called the chiral phase
transition. It has an order parameter associated with it, which is, roughly speaking, the effective mass of quarks. And that adds to the richness of the vacuum below the confining phase transition.
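The change of force law described in this section, inverse square at short distance and constant at long distance, is often modelled by adding a Coulomb-like term to a constant string tension. The sketch below is an illustration added here, not taken from the talk. The numerical values (a strong coupling of 0.3, a string tension of 0.9 GeV per femtometer, and the color factor 4/3) are typical textbook-style assumptions, and the names `force_gev_per_fm` and `crossover_radius_fm` are hypothetical.

```python
import math

HBAR_C_GEV_FM = 0.1973           # conversion constant, GeV * fm
ALPHA_STRONG = 0.3               # assumed strong coupling at these distances
STRING_TENSION_GEV_PER_FM = 0.9  # assumed constant long-distance force

# Coefficient of the inverse-square term; 4/3 is the assumed color factor.
COULOMB_COEFF_GEV_FM = (4.0 / 3.0) * ALPHA_STRONG * HBAR_C_GEV_FM


def force_gev_per_fm(r_fm):
    """Magnitude of the quark-antiquark force at separation r (in fm)."""
    return COULOMB_COEFF_GEV_FM / r_fm**2 + STRING_TENSION_GEV_PER_FM


def crossover_radius_fm():
    """Separation at which the inverse-square and constant terms are equal."""
    return math.sqrt(COULOMB_COEFF_GEV_FM / STRING_TENSION_GEV_PER_FM)


print(f"crossover separation ~ {crossover_radius_fm():.2f} fm")
for r in (0.05, 0.1, 0.3, 1.0, 3.0):
    print(f"r = {r:>4} fm  force ~ {force_gev_per_fm(r):8.2f} GeV/fm")
```

Under these assumed numbers the two terms are equal at a few tenths of a femtometer, and by one femtometer (10⁻¹³ cm) the constant term dominates, which is the regime the text associates with the flux tube.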
VII. Later Times and the Emergence of Baryons
As we continue onward on this quick trip through early history, below 100 MeV and down to 10 MeV, we have no quarks and gluons. Strange particles and muons are freezing out as well as the protons, neutrons, and their anti-particles because the temperature is much below their rest mass. Baryons disappear via particle-antiparticle annihilation, but thank goodness they don't all disappear, or we wouldn't be here to talk about it. At temperatures somewhere around 50 MeV or so, the protons and neutrons significantly outnumber the anti-protons and anti-neutrons. The latter very rapidly disappear completely but not the former, because there is a slight asymmetry in the amount of matter relative to anti-matter in the early universe. If we go back again to very early times, say when the temperature was hundreds of MeV or above, less than one part per billion excess of quarks over anti-quarks was needed to account for the amount of baryonic matter in the universe now. The asymmetry was a very tiny effect in most of the history of the early universe. That tiny effect undoubtedly has a very deep physical origin, and probably originated long before the epoch that we have discussed, which started at the first femtosecond or so after the bang. It may well have the same origin as the observed CP violation in the neutral K meson system, a phenomenon found in 1964 by Fitch, Cronin and their collaborators.⁶ The neutral K is a down quark bound to an antistrange quark; it turns out that its decay properties into two pions are slightly different from the decay properties of its antiparticle into two pions. Therefore there is, in the basic laws of the physics that govern these weak decays, a slight asymmetry between properties of matter and anti-matter. The connection between the history of the early universe and this phenomenon goes back to Andrei Sakharov in 1967, in an impressive, visionary set of ideas.⁷ Most theorists now accept this connection as a working hypothesis, and furthermore try to blame both problems on the Higgs sector. (What else is there to blame?) After this freeze-out, when the baryons are left behind and the anti-baryons are gone, we enter the "classical" period from a temperature of 10 MeV on downward to the present temperature of 10⁻⁴ electron volts, from the first ten milliseconds to the next 10¹⁰ years, when a few inconsequential things happened such as the formation of nuclei, atoms, stars, galaxies, Steven Weinberg, and the written history of all of this.⁴
VIII. Comments
Let us now review the implications, experimental or otherwise, of this kind of history. First of all, the scenario is probably wrong in detail, especially at the highest energy scale. The "Higgs mechanism" for the origin of the intermediate boson masses argues for new degrees of freedom and new forces which aren't well understood, and which we are probably describing very inaccurately. Beyond what is "needed" there may well be other new degrees of freedom, new forces, and maybe more phase transitions (or maybe fewer; maybe the idea of the electro-weak phase transition is wrong). In terms of laboratory experiments, this argues for particle searches at all mass scales to see whether something has been left out. There is a lot of room for improvement in that area. Secondly and more specifically, what really is the nature of that electroweak phase transition? Is it really at 400 GeV, or at some different temperature? Are its properties what we think they are? Is there more than one transition? All this bears upon the delineation of the Higgs sector. The mass scale is quite firmly set as no higher than one TeV, and that is very suggestive that one really needs to experimentally explore up to that mass scale very well. This is why particle physicists are in such forceful unanimity about the need for the superconducting super-collider, or SSC. This twenty TeV machine, with a cost of about three billion dollars, addresses in observational terms the physics of the TeV mass-scale, i.e. the mass-scale of the expected electroweak phase transition. The next question, a little more modest, is the nature of the confining phase transition.⁸ Here there are two areas in which much can be and is being done. On the theory side, one needs to calculate better the equation of state of the plasma as well as its transport properties.
Experimentally, the relativistic heavy ion collider RHIC, which is proposed to be built some day at Brookhaven and which I hope is built, may provide means of producing quark-gluon plasma
at temperatures above the confining phase transition, much as it was supposed to be during the first microseconds of the big bang. The origin of the baryon asymmetry is a vital question for us, literally. That invites incisive studies of CP violation anywhere one can find it, not only in the K system, but also possibly in heavy-quark (bottom or charm) meson systems. There is a tremendous technological challenge here. There is also very rapid progress, which will probably evolve for the next couple of decades, in being able to experimentally examine particles containing charm and bottom quarks as well as we do now with K's.
IX. The New Ether
What about the ether itself? I think one can't help but be impressed at how complicated the vacuum has become. We see all these phase transitions, with the vacuum ascribed all sorts of nontrivial properties. The particle theorists tend to talk about vacuum in the same language as used by the condensed matter people: condensates, order parameters, etc. It is as if the vacuum really is physical, that there is dynamics associated with it. If so, there is a question that immediately arises, namely whether the vacuum gravitates. If there is dynamics associated with all these phase-transition mechanisms, then there should be energy, at least potential energy, associated with them. If one tries to calculate it, it usually comes out infinite and is subtracted away; it is quite an ambivalent question when approached from the point of view of formal quantum field theory. But were there potential energy associated with those condensation phenomena, it would give an enormous cosmological constant in the equations of general relativity. Experimentally, it is not there to one part in 10¹²⁰ of what might naively be expected. It is a nontrivial theoretical challenge to reduce the discrepancy. There are a lot of other complications in the vacuum which I haven't even mentioned. In the old days, when I started out in this game, a popular subfield was something called "axiomatic field theory." Before one had a good handle on how to make theories of strong and weak interactions, one started with principles believed to be absolutely safe, and tried to derive general consequences from them. Some of those principles were the properties of the vacuum. These seemed absolutely trivial and self-evident at the time, namely that the vacuum state is unique, that it is Lorentz invariant, has zero
energy, and has zero momentum. Other than that last one, all those postulates have nowadays been abandoned. The vacuum is not unique. In the theory of the strong force (QCD) there are a countably infinite number of vacua, which differ only by the number of topological knots in pure gauge potentials that are present in vacuum configurations. A pure gauge potential doesn't contribute any energy to the vacuum. But if the topology of the gauge potential is non-trivial, its vacuum is non-trivially different from ones, for example, that don't have any gauge potential at all. Furthermore there are dynamical couplings between all of these different vacua, associated with the names "tunnelling", "instanton" and 't Hooft. This dynamics is extraordinarily subtle, and is not at all irrelevant. This vacuum degeneracy can potentially lead to observable CP violations in the strong force. This would be a disaster, because Norman Ramsey and his colleagues have measured the absence of the neutron electric dipole moment to very, very high accuracy. This creates no doubt about the fact that there is no substantial CP violation in the strong force. This situation is patched up in the present theory by adding still more Higgs condensates at extremely high mass scales. This is a serious problem, not fully solved, which also has experimental implications. The most exciting one is the search for "invisible axions," which might be a candidate for the dark matter of the universe. The second of the three postulates, the Lorentz invariance of the vacuum, is abandoned in gauge theories. The description of the vacuum in what are called physical gauges ends up not being Lorentz-invariant.⁹ A Lorentz transformation of vacuum is accompanied by a gauge transformation. Of course the physical consequences are Lorentz covariant and gauge-invariant. But the description looks a little clumsy.
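The size of the cosmological-constant mismatch mentioned earlier in this section, some 120 orders of magnitude, can be illustrated by comparing a naive vacuum energy density of order (cutoff)⁴ with the critical density of the universe, which bounds any cosmological constant. The sketch below is an illustration added here, not a calculation from the talk. The constants are standard reference values supplied for this sketch, the Hubble parameter h = 0.7 is an assumption, and the choice of cutoff (the reduced Planck mass, or the 400 GeV electroweak scale) is exactly the kind of naive guess the text refers to. The function names are hypothetical.

```python
import math

HBAR_C_GEV_CM = 1.973e-14           # conversion constant, GeV * cm
RHO_CRIT_GEV_PER_CM3 = 1.054e-5     # critical density divided by h^2, GeV / cm^3
HUBBLE_H = 0.7                      # assumed dimensionless Hubble parameter
REDUCED_PLANCK_MASS_GEV = 2.435e18  # one possible naive cutoff
ELECTROWEAK_SCALE_GEV = 400.0       # transition temperature used in the talk


def critical_density_gev4(h=HUBBLE_H):
    """Critical density converted to natural units (GeV^4)."""
    return RHO_CRIT_GEV_PER_CM3 * h**2 * HBAR_C_GEV_CM**3


def orders_of_magnitude(cutoff_gev):
    """log10 of (naive vacuum energy density ~ cutoff^4) / (critical density)."""
    return math.log10(cutoff_gev**4 / critical_density_gev4())


for label, cutoff in (
    ("reduced Planck mass", REDUCED_PLANCK_MASS_GEV),
    ("electroweak scale", ELECTROWEAK_SCALE_GEV),
):
    print(f"{label:>20}: mismatch ~ 10^{orders_of_magnitude(cutoff):.0f}")
```

The exact exponent depends on which scale is taken as the naive expectation, so the figure of 120 should be read as an order-of-magnitude statement rather than a precise number.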
As we discussed earlier, the third of the three postulates, zero energy of the vacuum, ought to be true in order to avoid an enormous gravitational cosmological constant. But as we also discussed, we don't really know why. So it would seem at the very least that we are being led into a very complicated description of something which ought to be absolutely simple. And it may be that, just as with the old ether, it is our descriptive structure that is wrong. Sooner or later we may get a description that is as efficient and elegant as special relativity, one that leaves the physics consequences more or less alone and gets rid of a lot of excess conceptual baggage. On the other hand, maybe these complications really imply that there are
elements of physical reality somehow involved in this description of the vacuum, ultimately observable. I wouldn't think the observations are just around the corner. If they were to be accessible, I would question at the same time all the sacred principles of contemporary physics, including gauge-invariance, Lorentz-covariance, and maybe quantum mechanics itself. If such basic principles were in need of modification, I feel that any replacement would for sure have to be not only subtle but very beautiful.

References
1. See the talk of George Field in these proceedings.
2. There is even a book: J. Rafelski and B. Müller, "The Structured Vacuum," Deutsch (Thun, Switzerland, 1985).
3. A splendid discussion of what might happen, and whether life could survive, is given by Freeman Dyson, Rev. Mod. Phys. 51, 447 (1979).
4. See S. Weinberg, "The First Three Minutes," Basic Books, Inc. (New York, 1977).
5. Probably. The order of the transition is still a matter of theoretical debate. But no matter what, a lot of latent heat is released within a small temperature increment.
6. J. Christenson, J. Cronin, V. Fitch, and R. Turlay, Phys. Rev. Lett. 13, 138 (1964).
7. A. Sakharov, Sov. Phys. JETP Lett. 5, 24 (1967).
8. See the talk of Arthur Kerman, these proceedings.
9. "Unphysical" gauges can be Lorentz covariant, but allow unphysical particles in the spectrum, including particles emitted with negative probabilities. It is not clear that, from a physics point of view, this is to be preferred.
Chapter 10
The Logically Simplest Theory of Relativity and its 4-Dimensional Symmetry
L. Hsu (1990-1994), J. P. Hsu (1994)
Can one derive the Lorentz Transformation from precision experiments?
Leonardo Hsu Physics Department Harvard University Cambridge, MA 02138
Investigation of two derivations of the Lorentz transformation from the results of three fundamental experiments reveals that an additional theoretical assumption has been made, making the special relativity transformations the inevitable result.
A more careful derivation shows
that the three fundamental experiments of special relativity are not enough to determine a space-time transformation uniquely.
"This is a nice presentation and idea. Clearly written."
Isaac F. Silvera
I. INTRODUCTION
Ever since its introduction in 1905 in Einstein's paper, "On the Electrodynamics of Moving Bodies," the theory of special relativity has generated much discussion and has had a profound impact on physics. By putting space and time on an equal footing and postulating the equivalence of all inertial frames, the theory changed the whole framework within which physics is done and the way in which physical laws are formulated. As J. D. Jackson points out, relativistic effects play a significant role in physics, "from the lowest energies in atomic systems (where the precision is so high that the tiny relativistic effects must be included) to the highest laboratory energies in the giant particle accelerators (where relativistic effects are gross and must enter even the crudest considerations)." Because so much of the theory is based on two postulates which, though familiar now, seemed to fly in the face of common sense at the time, a fair amount of effort has been devoted to trying to replace the foundational postulates with laboratory observations, i.e., to derive the Lorentz transformations from experimental results such as the Michelson-Morley experiment, measurements of the Doppler shift, and so on.
In this paper, I will discuss two such attempts, one made by H. P. Robertson in 1949, and a much more recent one made by Dieter Hils and John L. Hall (following a parameterization of the space-time transformation by R. Mansouri and R. Sexl) just this past year. Both start with a general linear transformation containing a number of unknown parameters and, using previous experimental results, obtain values for the parameters which uniquely fix the transformation to be the one obtained by Lorentz. However, I believe that in the two derivations, the authors have, perhaps unconsciously, made certain assumptions which inevitably led them to the Lorentz transformation. In this paper, I will discuss what these assumptions are and then, using a transformation even more general than the ones used by Robertson or by Hils and Hall and the same experimental data, see to what extent the transformation parameters can really be fixed.
The derivations of the Lorentz transformation by Robertson and by Hils and Hall used the results of only three experiments to completely specify the transformation parameters.
These are the Michelson-Morley, Kennedy-Thorndike, and Ives-Stilwell experiments. Before going on, here is a quick summary of these three fundamental experiments of special relativity and their results.
II. THREE CLASSICAL TESTS OF SPECIAL RELATIVITY
The earliest experiment which suggested that the then-modern view of physics needed revising was the Michelson-Morley experiment, first performed in 1881 by A. A. Michelson alone, and again, with greater precision, in 1887 in a collaboration with Edward Morley. This experiment was originally designed to determine the velocity of the Earth with respect to the ether, a hypothetical medium that permeated the universe and was responsible for the transmission of electromagnetic waves. The setup consisted of a number of full and half-silvered mirrors, a light source, and an observing telescope. Light from the source was split into two equal beams by a half-silvered mirror. The two beams would travel mutually perpendicular paths of equal length, one of them aligned parallel with the Earth's motion through the ether, before being recombined by the half-silvered mirror to form an interference pattern. The light in the first arm would travel at the velocity of light plus or minus the speed of the Earth through the ether while the second beam would travel only at the speed of light. The beams would thus take different amounts of time to travel equal path lengths and would be out of phase when recombined to produce a pattern of light and dark bands.
By observing the change in the
interference pattern as the setup was rotated by 90 degrees, one could calculate the velocity of the Earth relative to the ether. As we know, no change was ever observed in the interference pattern.
The motion of the Earth through the ether, if indeed there was an
ether, was undetectable.
Since this experiment has been performed at
many orientations and times of the year, the null result is usually interpreted as meaning:
The time required for light to travel a distance L
and back is independent of its direction. Since Lorentz and Fitzgerald showed that a particular length contraction along the direction of motion could provide the same null result to the Michelson-Morley experiment as an isotropic speed of light, modern versions of the experiment have sought to test the isotropy of an "etalon of length" in space.
One such test was carried out by A. Brillet and J. L. Hall.
A He-Ne laser, a Fabry-Perot cavity and a servo mechanism
forming a feedback loop are mounted on a rotating optical bench.
The
laser beam passes through the cavity with the servo continuously adjusting the frequency of the laser so that its beam satisfies standing wave boundary conditions inside the cavity.
Thus, changes in the length of
the cavity are seen as changes in the frequency of the laser light.
Part of
the beam is diverted from the loop, directed upwards along the table's axis of rotation, and combined with the beam of a nonrotating, highly stable laser.
The combined beam is then fed into a beat detector, the beats being produced because of the differing frequencies of the two beams, and the output of the beat detector is finally sent to a computer which stores the information over the course of the table's rotation. If space is not isotropic and lengths are contracted along a particular direction, then the frequency of the He-Ne laser will change periodically as the length of the cavity changes in response to its orientation, and the beat frequency will change periodically. Taking into account many sources of error, Brillet and Hall found that one could rule out the possibility of a length contraction to a very high degree of accuracy (ΔL/L would have to be < 1 x 10 ).
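The remark above that a Lorentz-Fitzgerald contraction reproduces the Michelson-Morley null result can be checked with a few lines of arithmetic. The following is only a sketch under classical ether-theory assumptions (light moving at c in the ether frame, the apparatus moving at speed v); the arm length and speed are illustrative inputs, and the function name is hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def ether_round_trip_times(arm_length, v, contracted=False):
    """Round-trip times (parallel arm, perpendicular arm) in classical ether theory.

    With contracted=True the parallel arm is shortened by sqrt(1 - v^2/c^2).
    """
    beta2 = (v / C) ** 2
    parallel_length = arm_length * math.sqrt(1.0 - beta2) if contracted else arm_length
    # Parallel arm: light closes on the far mirror at c - v, returns at c + v.
    t_parallel = parallel_length / (C - v) + parallel_length / (C + v)
    # Perpendicular arm: light follows a triangular path in the ether frame.
    t_perpendicular = 2.0 * arm_length / math.sqrt(C**2 - v**2)
    return t_parallel, t_perpendicular


ARM = 11.0       # illustrative arm length in metres (assumption)
V_EARTH = 3.0e4  # roughly the Earth's orbital speed in m/s

t_par, t_perp = ether_round_trip_times(ARM, V_EARTH)
t_par_c, t_perp_c = ether_round_trip_times(ARM, V_EARTH, contracted=True)
print("uncontracted difference (s):", t_par - t_perp)
print("contracted difference (s):  ", t_par_c - t_perp_c)
```

Without contraction the two times differ by about L v^2/c^3; with it they agree, which is why the modern tests look at the isotropy of a length standard instead.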
The second of the three experiments, chronologically, is the Kennedy-Thorndike test, the results of which were first published in 1932. The original purpose was to see whether absolute time could be ruled out experimentally, since the results of all other experiments to date could be explained by means other than modifying the concept of absolute time as found in the Galilean transformation.
The setup for the experiment
resembles that of the Michelson-Morley experiment except that the two arm lengths are different from each other.
Light from a source is split into
two beams by a half silvered mirror, the beams travel in two different directions and are reflected back upon themselves.
The half-silvered
mirror then recombines the two beams and reflects them into a telescope for observation of their interference pattern. If there really was absolute time, it would manifest itself as a change in the interference pattern over time as the velocity of the apparatus through space changed due to the rotational motion of the Earth.
Over a
period of several days' observation however, no change in the interference pattern could be found that might correspond to the changing velocity of the earth.
Kennedy and Thorndike noted that "there is no effect
corresponding to absolute time unless the velocity of the solar system in space is no more than about half that of the earth in its orbit," a possibility
that has been ruled out by modern astronomical measurements. The usual interpretation of this null result is: The time for light to travel out to a point and back in an inertial frame is independent of the velocity of the inertial frame.
As with the Michelson-Morley experiment, modern Kennedy-Thorndike experiments have utilized lasers to make very precise measurements. In a very recent experiment by D. Hils and J. L. Hall, a setup similar to the one described above for a modern Michelson-Morley experiment (different lasers were used and the apparatus could not rotate) was used to try to detect a 24 hour sidereal variation in the beat frequency corresponding to the Earth's rotation. This measurement is the physical equivalent of that made by the original Kennedy-Thorndike experiment, comparing the transformation of time and length in a moving frame. Again, the looked-for variation could not be found with any certainty and so the original null result is reconfirmed, but to a level of accuracy 300 times better than the first attempt.
The third and last experiment to be described here is the Ives-Stilwell experiment, which measured the second-order Doppler shift of a moving light source by comparing the spectra produced by stationary and moving hydrogen canal rays (an old term for positively charged ions produced in a gas by electrical discharge).
The ions were produced in an arc between some filaments and a grounded aluminum electrode and then accelerated between that and a second electrode held at some high voltage. In order to eliminate the effects of the first-order shift (which would have masked any second-order effects), one must measure the spectrum of the moving particles at right angles to the particles' direction of motion. Since this is extremely difficult to accomplish experimentally, Ives and Stilwell instead took two simultaneous measurements of the rays, one with and one opposite to their direction of motion, resulting in one red-shifted and one blue-shifted line; the second-order displacement can then be calculated by comparing the center of gravity of the two lines with the line produced by a stationary source. They discovered that the wavelength of the light emitted from the moving ions was shifted by a factor of $(1 - v^2/c^2)^{-1/2}$, where v is the velocity of the emitting ions with respect to the observer.
As with the previous two experiments, modern reproductions of the Ives-Stilwell experiment have used lasers and beat frequency detectors. In one version, the frequency of the beam from a laser was tuned to a particular transition of neon atoms at rest while another was tuned to the same transition of moving neon atoms. The two beams were combined and the beat frequency of the combined beam was measured for several velocities of the moving neon atoms to measure the Doppler shift of the frequency of the transition.
Now let us see how the results of these experiments are used to determine a unique space-time transformation.
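As a numerical aside on the Ives-Stilwell method summarized above, the sketch below shows how averaging the forward and backward lines cancels the first-order shift and leaves the second-order one. It assumes the special-relativistic longitudinal Doppler formulas; the rest wavelength and speed are illustrative values, and the function names are hypothetical.

```python
import math


def doppler_lines(lam0, beta):
    """Wavelengths seen along the beam axis for a source with speed beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    lam_blue = gamma * (1.0 - beta) * lam0  # source approaching the observer
    lam_red = gamma * (1.0 + beta) * lam0   # source receding from the observer
    return lam_blue, lam_red


def centroid_shift(lam0, beta):
    """Displacement of the center of gravity of the two lines from the rest line."""
    lam_blue, lam_red = doppler_lines(lam0, beta)
    return 0.5 * (lam_blue + lam_red) - lam0


lam0 = 486.1   # nm, illustrative rest wavelength (assumption)
beta = 0.005   # illustrative ion speed in units of c (assumption)
print("relative centroid shift:", centroid_shift(lam0, beta) / lam0)
print("v^2 / (2 c^2):          ", beta**2 / 2)
```

The first-order terms, proportional to beta, cancel in the average, so the centroid is displaced by (gamma - 1) times the rest wavelength, which is v^2/2c^2 to second order.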
III. DERIVING THE LORENTZ TRANSFORMATION?
The first of the two attempts to find a purely experimental basis for the Lorentz transformation was made by H. P. Robertson in 1949. Following is a short summary of his work and my criticism of his derivation.
To start, Robertson defines two inertial frames: Σ, a "rest" frame with coordinates τ, ξ, η, and ζ (or $\xi^0$, $\xi^1$, $\xi^2$, and $\xi^3$), where τ is the time coordinate, and S, a moving frame with coordinates t, x, y, and z (or $x^0$, $x^1$, $x^2$, and $x^3$), with t the time coordinate. He also defines the metric
(0)  $d\sigma^2 = \gamma_{\mu\nu}\, d\xi^\mu d\xi^\nu = d\tau^2 - \frac{1}{c^2}\left(d\xi^2 + d\eta^2 + d\zeta^2\right)$
assuming that the speed of light is isotropic in the Σ frame. The problem is to find the transformation T which expresses the Greek letters in terms of the Roman ones. In the most general case, we have
(1)  $\xi^\mu = \sum_{\nu=0}^{3} a^\mu_{\ \nu}\, x^\nu, \qquad \mu = 0, 1, 2, 3$
Using appropriate symmetry arguments, the 16 coefficients can be reduced to four parameters as follows. Take the components of the velocity vector of the S frame measured from the Σ frame as $v^a$ (a = 1, 2, 3), so the equation of motion for the spatial origin of the S frame is
(2)  $\tau = a^0_{\ 0}\, t, \qquad \xi^a = a^a_{\ 0}\, t = v^a \tau \qquad \text{for } x^1 = x^2 = x^3 = 0$
If time is to work normally in both frames, $a^0_{\ 0}$ cannot be zero.
Now consider a light signal propagated from the origin at t = 0 which is reflected at coordinates $p^a$ (a = 1, 2, 3) and returns to the origin at $t_0$. The outgoing signal is on the light cone defined by the equation
(3)  $\gamma_{\mu\nu}\,(a^\mu_{\ 0}\, t + a^\mu_{\ a} x^a)(a^\nu_{\ 0}\, t + a^\nu_{\ b} x^b) = 0$
while the reflected light is described by the equation
(4)  $\gamma_{\mu\nu}\,[a^\mu_{\ 0}\,(t - t_0) + a^\mu_{\ a} x^a]\,[a^\nu_{\ 0}\,(t - t_0) + a^\nu_{\ b} x^b] = 0$
Assuming that the speed of light is the same in both directions along a line in the S frame, the point $(t_0/2, p^a)$ must satisfy both equations. Setting the equations equal to each other, we find that all the cross terms must vanish, or
(5)  $\gamma_{\mu\nu}\, a^\mu_{\ 0}\, a^\nu_{\ a}\, p^a = 0$
Since this must hold for all $p^a$, equations (2) and (5) give
(6)  $g_{0a} = a^0_{\ 0} a^0_{\ a} - \frac{1}{c^2}\sum_{b} a^b_{\ 0} a^b_{\ a} = a^0_{\ 0}\left(a^0_{\ a} - \frac{1}{c^2}\sum_{b} a^b_{\ a} v^b\right) = 0$
To simplify the transformation without loss of generality, one can align the axes of both frames such that the relative motion is purely along the $\xi^1$-axis. Then $v^2 = v^3 = 0$ and we write $v = v^1$. By choosing the rest of the axes in the two frames to be parallel, requiring azimuthal symmetry about the x-axis, and constraining all motion to be in the x or ξ direction, the transformation equations simplify to
(7)  $\tau = a^0_{\ 0}\, t + \frac{v}{c^2}\, a^1_{\ 1}\, x, \qquad \xi = v\, a^0_{\ 0}\, t + a^1_{\ 1}\, x, \qquad \eta = a^2_{\ 2}\, y, \qquad \zeta = a^2_{\ 2}\, z$
The metric (0) can now be rewritten in terms of t, x, y, and z as
(8)  $d\sigma^2 = (g_0\, dt)^2 - \frac{1}{c^2}\left[(g_1\, dx)^2 + g_2^2\,(dy^2 + dz^2)\right]$
where, from (7), $g_0 = a^0_{\ 0}\sqrt{1 - v^2/c^2}$, $g_1 = a^1_{\ 1}\sqrt{1 - v^2/c^2}$, and $g_2 = a^2_{\ 2}$. Of course, as v goes to 0, all g's and a's should become 1.
To determine the remaining parameters, Robertson now turns to the three aforementioned experiments.
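The step from the simplified transformation to the diagonal metric can be spot-checked numerically. The sketch below assumes the reduced transformation τ = a00 t + (v/c²) a11 x, ξ = v a00 t + a11 x, η = a22 y, ζ = a22 z, and the identifications g0 = a00 sqrt(1 - v²/c²), g1 = a11 sqrt(1 - v²/c²), g2 = a22; these identifications are my reading of the algebra, and all function names are hypothetical.

```python
import math
import random

C = 1.0  # units in which c = 1


def to_rest_frame(dt, dx, dy, dz, v, a00, a11, a22):
    """Map a coordinate step in S to the rest frame using the reduced transformation."""
    dtau = a00 * dt + v * a11 * dx / C**2
    dxi = v * a00 * dt + a11 * dx
    return dtau, dxi, a22 * dy, a22 * dz


def interval_rest(dtau, dxi, deta, dzeta):
    """Interval in the rest frame, where light is assumed isotropic."""
    return dtau**2 - (dxi**2 + deta**2 + dzeta**2) / C**2


def interval_moving(dt, dx, dy, dz, v, a00, a11, a22):
    """Same interval written with the diagonal g's in S coordinates."""
    root = math.sqrt(1.0 - v**2 / C**2)
    g0, g1, g2 = a00 * root, a11 * root, a22
    return (g0 * dt) ** 2 - ((g1 * dx) ** 2 + g2**2 * (dy**2 + dz**2)) / C**2


rng = random.Random(0)
for _ in range(5):
    step = [rng.uniform(-1, 1) for _ in range(4)]
    params = (rng.uniform(0.1, 0.9), rng.uniform(0.5, 2), rng.uniform(0.5, 2), rng.uniform(0.5, 2))
    lhs = interval_rest(*to_rest_frame(*step, *params))
    rhs = interval_moving(*step, *params)
    print(f"{lhs:+.12f}  {rhs:+.12f}")
```

The two columns agree for arbitrary steps and parameters, which confirms that the dt dx cross term cancels once the coefficient of x in τ is tied to a11 by the two-way light condition.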
As stated above, the results of the Michelson-Morley experiment tell us that the time required for light to travel out to a point and back is independent of the direction. Using (8), Robertson finds that the time it takes for light to travel out a distance L in a direction making an angle θ with the x-axis and back is
(9)  $t = \frac{L}{c\, g_0(v)}\left\{\sqrt{[g_1(v)\cos\theta]^2 + [g_2(v)\sin\theta]^2} + \sqrt{[g_1(v)\cos(\theta + \pi)]^2 + [g_2(v)\sin(\theta + \pi)]^2}\right\}$
For this expression to be independent of θ, a necessary condition is
(10)  $g_1(v) = g_2(v)$
where both g's are positive, since the a's are positive.
Next, Robertson uses the Kennedy-Thorndike experiment to further fix the values of the g's.
From (9) and (10), the time it takes for light to travel a distance L in any direction is
(11)  $t = \frac{L\, g_1(v)}{c\, g_0(v)}$
Therefore, the difference in time Δt for the two light beams to return to the mirror which initially split them is related to the difference in the lengths of the arms of the interferometer ΔL by the equation:
(12)  $\Delta t = \frac{g_1(v)\,\Delta L}{c\, g_0(v)}$
The parameter v in this equation is the velocity of the apparatus, which changes periodically as a result of the Earth's orbit about the sun and its rotation about its own axis. A change in Δt would appear as a shift in the interference pattern. Because the pattern did not change over a long period, Robertson deduces that the ratio $g_1/g_0$ is independent of v and, since $g_1/g_0 = 1$ when v = 0 (both are identically 1), this value must hold for all v.
To summarize to this point, we have determined that $g_0(v) = g_1(v) = g_2(v)$. Setting all three equal to g(v), the transformation T then looks like
(13)  $\tau = \frac{g(v)}{\sqrt{1 - v^2/c^2}}\left(t + \frac{v x}{c^2}\right), \qquad \xi = \frac{g(v)}{\sqrt{1 - v^2/c^2}}\,(v t + x), \qquad \eta = g(v)\, y, \qquad \zeta = g(v)\, z$
To find definite values for g, Robertson turns to the Ives-Stilwell experiment. He introduces a third reference frame S', moving with velocity v' with respect to Σ where v' > v, with clocks set such that at t = t' = 0, the origins of S and S' coincide. By setting dξ/dτ = v' and dx/dt = u in (13), the velocity u of S' with respect to S is found to be
(14)  $u = \frac{v' - v}{1 - v v'/c^2}$
Suppose that there is a light source at the origin of S' which is co-moving with that frame. Its equations of motion relative to the S frame are obtained by setting x', y', and z' to 0
(15)  $\frac{g(v)}{\sqrt{1 - v^2/c^2}}\left(t + \frac{v x}{c^2}\right) = \frac{g(v')}{\sqrt{1 - v'^2/c^2}}\, t', \qquad \frac{g(v)}{\sqrt{1 - v^2/c^2}}\,(v t + x) = \frac{g(v')}{\sqrt{1 - v'^2/c^2}}\, v' t'$
and solving for x and t
(16)  $t = \rho\, t', \qquad x = u \rho\, t', \qquad \rho = \frac{g(v')}{g(v)}\,\frac{1}{\sqrt{1 - u^2/c^2}}$
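The algebra leading from the two-frame matching conditions to t = ρt' and x = uρt' is easy to get wrong by hand, so here is a small numerical check. It assumes the forms of (13) to (16) as read here, namely τ = g (t + v x/c²)/sqrt(1 - v²/c²) and ξ = g (v t + x)/sqrt(1 - v²/c²) for each moving frame; the numerical values of v, v', g(v), and g(v') are arbitrary test inputs and the function names are hypothetical.

```python
import math


def relative_velocity(v, v_prime, c=1.0):
    """Velocity u of S' as measured in S, equation (14)."""
    return (v_prime - v) / (1.0 - v * v_prime / c**2)


def source_event_in_S(t_prime, v, v_prime, g_v, g_vp, c=1.0):
    """Coordinates (t, x) in S of the S' origin at S' time t_prime."""
    root = math.sqrt(1.0 - v**2 / c**2)
    root_p = math.sqrt(1.0 - v_prime**2 / c**2)
    # Rest-frame coordinates of the event (x' = 0 in S').
    tau = g_vp * t_prime / root_p
    xi = g_vp * v_prime * t_prime / root_p
    # Invert tau = g (t + v x / c^2) / root and xi = g (v t + x) / root.
    t = root * (tau - v * xi / c**2) / (g_v * (1.0 - v**2 / c**2))
    x = root * (xi - v * tau) / (g_v * (1.0 - v**2 / c**2))
    return t, x


v, v_prime, g_v, g_vp, t_prime = 0.3, 0.7, 1.1, 0.9, -2.0  # arbitrary test values
u = relative_velocity(v, v_prime)
rho = (g_vp / g_v) / math.sqrt(1.0 - u**2)
t, x = source_event_in_S(t_prime, v, v_prime, g_v, g_vp)
print("t:", t, " rho t':", rho * t_prime)
print("x:", x, " u rho t':", u * rho * t_prime)
```

Only the ratio g(v')/g(v) enters ρ, which is why the Doppler measurement constrains that ratio rather than g itself.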
If the light source sends out a signal at some t' < 0, while it is approaching an observer R located at the origin of S, the signal will be received by R at a time
(17)  $t_+ = t - \frac{x}{c} = \left(1 - \frac{u}{c}\right)\rho\, t'$
Two signals sent by the moving source separated by a time Δt' would be seen by observer R separated by a time
(18)  $\Delta t_+ = \left(1 - \frac{u}{c}\right)\rho\, \Delta t'$
For a periodically emitting source, we can think of the wavelength of the radiation emitted as λ' = cΔt' in the S' frame and
(19)  $\lambda_+ = \left(1 - \frac{u}{c}\right)\rho\, \lambda'$
in the S frame. By a similar argument, radiation emitted as the source is moving away from the observer has its wavelength altered like
(20)  $\lambda_- = \left(1 + \frac{u}{c}\right)\rho\, \lambda'$
The same radiation emitted from a source stationary with respect to the observer has wavelength λ' = λ, so the center of gravity of the two shifted lines is displaced from the non-moving source's line by an amount
(21)  $\Delta\lambda = \tfrac{1}{2}(\lambda_+ + \lambda_-) - \lambda = (\rho - 1)\,\lambda = \left[\frac{g(v')}{g(v)}\left(1 - \frac{u^2}{c^2}\right)^{-1/2} - 1\right]\lambda$
In their experiment, Ives and Stilwell determined that the second-order Doppler shift Δλ/λ could be given by the expression u²/2c². Since this result holds for motion of the source at an arbitrary angle from the x-axis and there is no correlation between v and the x-component of u, Robertson determines g(v) and g(v') to both be unity (to second order). Thus he claims to have arrived at the familiar Lorentz transformation and four-dimensional metric, purely on the basis of experimental results.
While each step in this derivation seems reasonable, I believe that Robertson has made an unwarranted assumption near the beginning of his calculation which has made the Lorentz transformation the inevitable result.
It comes while he is reducing the number of undetermined coefficients in the transformation T. When considering a light signal being sent out from the origin and reflected back from a point, Robertson writes: "We agree to set the auxiliary clock situated at $x^a = p^a$ [in the S frame] in such a way that it records the time $t_0/2$ for the...reflection" (emphasis his). In other words, he has constrained the velocity of light along a line in S to be equal in both directions along that line. Now, even before this, Robertson had assumed that the speed of light is isotropic in his "rest" frame Σ. This is a perfectly reasonable assumption and is useful for simplifying the discussion, but, in order to truly derive a transformation using only experimental results, one should not put any restrictions on the speed of light in any other frames. Putting his assumption about the speed of light in S together with the results of the Michelson-Morley experiment, we soon arrive at the conclusion that the speed of light must be isotropic in all inertial frames. This constraint results in a great loss of generality and contradicts an earlier statement made by Robertson, "No assumption is here made concerning the velocity of light...in S." Later on, we shall see how the absence of this constraint affects the derivation.
The second derivation of the Lorentz transformation is more direct. The parameterization of the general transformation and calculations were first done by Reza Mansouri and Roman Sexl in a 1977 series of papers using the best data available at the time.
In 1990, Dieter Hils and John Hall used modern laser techniques to greatly improve the resolution of the Kennedy-Thorndike experiment and, using Mansouri and Sexl's framework, confirmed their Lorentz derivation to a much higher level of confidence. In this derivation, one first writes down a general linear transformation between two frames, S, the preferred frame, and S', the moving frame:
(22)  $t = a(v)\,T + \epsilon x, \qquad x = b(v)\,(X - vT), \qquad y = d(v)\,Y, \qquad z = d(v)\,Z$
where
(23)  $a(v) = 1 + \alpha\,\frac{v^2}{c^2} + \dots, \qquad b(v) = 1 + \beta\,\frac{v^2}{c^2} + \dots, \qquad d(v) = 1 + \delta\,\frac{v^2}{c^2} + \dots$
As before, the three spatial axes in both frames have been aligned and the relative motion of S and S' is restricted to the x-direction. Hils and Hall set ε by the convention for clock synchronization and determine a(v), b(v), and d(v) from experimental data. Using Einstein's method of clock synchronization, they fix
(24)  $\epsilon = -v/c^2$
The speed of a light signal traveling at an angle θ with respect to the x-axis is then
(25)  $c(\theta) = \sqrt{(x/t)^2 + (y/t)^2} = c\left[1 + \left(\tfrac{1}{2} - \beta + \delta\right)\frac{v^2}{c^2}\sin^2\theta + (\beta - \alpha - 1)\,\frac{v^2}{c^2}\right]$
to second order. This result holds quite generally since we have already fixed c to be azimuthally symmetric around the x-axis.
With ε determined, a(v) is the only unknown parameter left in the transformation between t and T.
From the results of modern measurements of the Doppler shift, α is determined to be
(26)  $\alpha = -\tfrac{1}{2} \pm 10^{-7}$
The most accurate Michelson-Morley experiments, showing that the speed of light is independent of θ, fix
(27)  $\tfrac{1}{2} - \beta + \delta = 0 \pm 5 \times 10^{-9}$
And in light of this value, the Kennedy-Thorndike experiment performed by Hils and Hall, showing that the two-way speed of light does not depend on the velocity of the reference frame, gives us
(28)  $\beta - \alpha - 1 = 0 \pm 7 \times 10^{-5}$
Therefore, we can deduce that
(29)  $\beta = \tfrac{1}{2} \pm 7 \times 10^{-5}, \qquad \delta = 0 \pm 7 \times 10^{-5}$
Putting these figures into (25), we see that c is now a universal constant to an accuracy of $7 \times 10^{-5}$, the biggest uncertainty in any of the three measurements, and equations (22) now have the form of the Lorentz transformation to second order.
However, this derivation has the same problem as Robertson's.
By using Einstein's method of synchronization to determine the parameter ε, Hils and Hall have, just as Robertson did, constrained the speed of light along a line to be the same in both directions along that line. This is evident from (25), where sin θ is squared so that c(θ) = c(θ + π). An additional effect of choosing ε to be $-v/c^2$ that Robertson's assumption does not have is that in order for this set of transformation equations to have four-dimensional symmetry, the parameters α, β, and δ must assume the values -1/2, 1/2, and 0 respectively. This fact is noted indirectly when Hils and Hall mention that "the kinematical parameters a(v), b(v), and d(v) might be determined by theory." Then the remainder of the derivation is merely a check to see whether or not the results of the three experiments are consistent with the four-dimensional symmetry of physical laws. Since this framework has been verified again and again by countless experiments, most notably high energy particle experiments, the verification of those values for the parameters is rather trivial. Once ε has been set in that manner, the Lorentz transformation is the only possibility with four-dimensional symmetry permitted by (22). For any other values of α, β, and δ, we get only three-dimensional symmetry (Galilean invariance) at best, which has been conclusively ruled out by experiment.
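To make the role of the three second-order parameters concrete, the following sketch evaluates the light-speed expression (25), c(θ)/c = 1 + (1/2 - β + δ)(v²/c²) sin²θ + (β - α - 1)(v²/c²), for the special-relativistic values and for a deliberately perturbed β. The perturbation size and the velocity are hypothetical inputs chosen only for illustration.

```python
import math


def light_speed_ratio(theta, v_over_c, alpha, beta, delta):
    """c(theta)/c to second order in v/c, following equation (25)."""
    v2 = v_over_c**2
    anisotropy = (0.5 - beta + delta) * v2 * math.sin(theta) ** 2  # Michelson-Morley term
    velocity_term = (beta - alpha - 1.0) * v2                      # Kennedy-Thorndike term
    return 1.0 + anisotropy + velocity_term


SR = dict(alpha=-0.5, beta=0.5, delta=0.0)          # special-relativistic values
OFF = dict(alpha=-0.5, beta=0.5 + 1e-3, delta=0.0)  # hypothetical deviation in beta

v_over_c = 1.0e-3
for theta_deg in (0, 45, 90, 180):
    theta = math.radians(theta_deg)
    print(theta_deg,
          light_speed_ratio(theta, v_over_c, **SR),
          light_speed_ratio(theta, v_over_c, **OFF))
```

Because θ enters only through sin²θ, the expression is the same at θ and θ + π for any parameter values, so no setting of α, β, δ within this formula can reveal a difference between the two directions along a line.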
In sum, my criticism concerning the previous derivations is that by using the Einstein clock synchronization convention, both have actually assumed that the one-way speed of light is the same in both directions along a line in the moving frame, instead of deducing it from any experimental results.
Of course, for purposes of setting up any kind of
clock system at all, one may assume an isotropic and constant speed of light in a "rest" or "preferred" frame.
However, in order to truly derive a
transformation from experimental results, one should not make the least assumption regarding the speed of light in any other frame.
With this in
mind, let us now see how precisely a space-time transformation can be specified based on the results of the three experiments.
IV. A MORE GENERAL FORM
To start with, I choose
(30)
$x' = A(v)\,x + B(v)\,ct = D(v)\,[A'(v)\,x + B'(v)\,ct]$
$y' = D(v)\,y$
$z' = D(v)\,z$
$b't' = E(v)\,ct + F(v)\,x = D(v)\,[E'(v)\,ct + F'(v)\,x]$
as a parameterization of a more general form for a linear transformation between reference frames S (a "rest" frame) and S' (a moving frame) than either of those proposed by Robertson or Hils and Hall.
The coordinates in
S and S' are (ct, x, y, z) and (b't', x', y', z') respectively and v is the velocity of S' with respect to S.
As in the previous derivations, I have aligned the
coordinate axes such that the motion of S' is along the x-axis of the S frame and have introduced azimuthal symmetry about the axis of motion (y and z transform in the same way).
Also, I have assumed that the speed of light
is isotropic in S only, so we can use Einstein's procedure to synchronize the clocks in S.
In the moving frame, the speed of light is not, in general, isotropic, so we can only write b't' in the transformation equations, instead of ct'. Although we don't yet know of a relation between t and t' and so cannot separate b' from t', I make the further assumption that the time coordinates are linearly related and write
(31)  $t' = M(v)\,t + N(v)\,x$
Physically, this relation can be realized for any choice of M and N since the rate of ticking and the reading of a clock can be arbitrarily adjusted; all (31) means is that an S' clock located at x shows the time Mt + Nx while an S clock at the same position shows time t. In using the above equations, I have defined my metric in S to be
(32)  $(ct)^2 - x^2 - y^2 - z^2 = \sigma^2$
To begin the determination of the seven parameters (A, B, D, E, F, M, and N), we start with the requirement that an object located at the origin of S' and co-moving with that frame must have a velocity v in the rest frame. Introducing the condition x' = 0 in (30), we get
(33)  $0 = A'(v)\,x + B'(v)\,ct \quad\Rightarrow\quad \frac{x}{t} = v = -\frac{c\,B'(v)}{A'(v)}$
Next, we use the results of the Michelson-Morley experiment, which tells us that:
For a given v, the time required for light to make a trip out to
a point and back is independent of the direction.
This is equivalent to
saying that the speed of light, averaged over a round trip, is the same in any direction.
Suppose that we are in the rest frame S watching the
(moving) apparatus in S' from the side (see Fig. 1). We will see the light signal traversing the vertical arm of the experiment make a triangular path as shown below.
The velocity of this light beam measured from the moving frame is (using (30) and (31))
(34)  $c'_{\perp} = \frac{\Delta y'}{\Delta t'} = \frac{D\sqrt{c^2 - v^2}\,\Delta t}{(M + Nv)\,\Delta t} = \frac{D\,c\,\sqrt{1 - \beta^2}}{M + Nv}$
where β = v/c.
Fig. 1. View of simplified MM experiment (labels: half-silvered mirror, telescope).
From the same vantage point, we see the horizontal beam describe a path as shown in (35). The average speed of this light beam is (taking the arm lengths to be L' measured from S')
(35)  $c'_{\parallel} = \frac{2L'}{\Delta t'_1 + \Delta t'_2} = \frac{D\,A'\,c\,(1 - \beta^2)}{M + Nv}$
Setting the horizontal and vertical averages equal to each other and using (33), we determine that
(36)  $A' = \gamma, \qquad B' = -\beta\gamma$
where $\gamma = (1 - \beta^2)^{-1/2}$.
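The Michelson-Morley step can be replayed numerically. The sketch below computes the round-trip average light speed in S' along and across the direction of motion, working in S and converting with x' = D A'(x - vt) (which uses B' = -βA' from (33)), y' = D y, and t' = Mt + Nx. The values of D, M, and N are arbitrary test inputs, and the function name is hypothetical.

```python
import math


def arm_speeds(v, D, A_p, M, N, c=1.0):
    """Round-trip average light speeds (parallel, perpendicular) measured in S'."""
    beta = v / c
    # Perpendicular arm: in S the light advances v dt along x and sqrt(c^2 - v^2) dt along y.
    c_perp = D * c * math.sqrt(1.0 - beta**2) / (M + N * v)
    # Parallel arm of length L' in S'; at fixed t its length in S is L'/(D A').
    L_prime = 1.0
    L = L_prime / (D * A_p)
    dt_out = L / (c - v)    # light chasing the far mirror
    dt_back = L / (c + v)   # light returning toward the approaching splitter
    dtp_out = M * dt_out + N * (c * dt_out)       # t' = M t + N x with dx = +c dt
    dtp_back = M * dt_back + N * (-c * dt_back)   # and dx = -c dt on the way back
    c_par = 2.0 * L_prime / (dtp_out + dtp_back)
    return c_par, c_perp


v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v**2)
for D, M, N in [(1.0, 1.0, 0.0), (1.3, 0.9, 0.2), (0.7, 1.25, -0.75)]:
    print((D, M, N), arm_speeds(v, D, gamma, M, N), arm_speeds(v, D, 1.0, M, N))
```

With A' = γ the two averages coincide whatever D, M, and N are, while A' = 1 leaves them unequal. This illustrates why this step fixes A' and B' but leaves D, M, and N undetermined.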
I have taken the positive root of A' to make the x and x' axes parallel rather than antiparallel.
We now apply the Kennedy-Thorndike results, first writing the metric given by (32) in terms of primed variables
(37)  $\sigma^2 = (BF - EA)^{-2}\left[(b't')^2\,(A^2 - B^2) - x'^2\,(E^2 - F^2) + 2\,b't'x'\,(EB - AF)\right] - \frac{y'^2 + z'^2}{D^2}$
At this point, Robertson used the metric to find an expression for the time it takes for a light signal to make a trip out to a point and back in the moving frame and made this expression independent of the direction of the trip and of v. Since we do not have expressions for M and N, it is impossible for us to do the analogous step without creating an expression loaded with more unknown parameters than we will eventually be able to solve for. To circumvent this problem, we opt for a different, but equivalent, interpretation of the Kennedy-Thorndike experiment.
The unchanging interference pattern seen in the original experiment and the constant frequency of the cavity laser in modern experiments indicate that the recombining light beams in the interferometer maintain the same absolute and relative phases as the velocity of the apparatus changes. We might say that this test shows that the final phase of a light signal making a round trip between two points is independent of the velocity of the observer. Therefore, instead of making time be the round-trip invariant, I will use a quantity I call the "light path," equal to c't' (b' = c' where the motion of light is concerned). The light path is the distance traveled by a light beam in a time t' and, since the wavelength of the light does not depend on the type of clock synchronization we use (the relation between t and t'), requiring the round-trip light path to be invariant is the same as requiring the final phase of the light signal to be independent of the direction of its trip. Solving for c't' from the metric (c't' = b't' when dσ² = 0), we obtain
(38)  $[c't'](\theta') = \frac{\ell'\cos\theta'\,(A'F' - E'B') + \ell'\,(E'A' - B'F')}{A'^2 - B'^2}$
where
(39)  $x' = \ell'\cos\theta', \qquad \sqrt{y'^2 + z'^2} = \ell'\sin\theta'$
So the round trip light path is
(40)  $[c't'](\theta') + [c't'](\theta' + \pi) = \frac{2\,\ell'\,(E'A' - B'F')}{A'^2 - B'^2}$
The Kennedy-Thorndike experiment requires that this expression be independent of velocity. As we can see from (30), the parameters A', E', and D must reduce to 1 and B' and F' must become 0 when v = 0, so
(41)  $\frac{E'A' - B'F'}{A'^2 - B'^2} = 1$
for all v. Using (36), the relation between E' and F' is then
(42)  $E' = \frac{1 - \beta\gamma F'}{\gamma}$
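The Kennedy-Thorndike step can be checked in the same spirit. The sketch below takes A' = γ and B' = -βγ from (36), imposes the relation E' = (1 - βγF')/γ for an arbitrary F', and confirms two things: the one-way light path [c't'](θ') = ℓ'[cos θ'(A'F' - E'B') + (E'A' - B'F')]/(A'² - B'²) really lies on the light cone of S, and the round-trip light path equals 2ℓ' for every direction and velocity. The values of F' and D are test inputs; the function names are hypothetical.

```python
import math


def primed_parameters(beta, F_p):
    """A', B' from (36) and E' from (42), for a freely chosen F'."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    A_p, B_p = gamma, -beta * gamma
    E_p = (1.0 - beta * gamma * F_p) / gamma
    return A_p, B_p, E_p, F_p


def light_path(theta_p, ell_p, A_p, B_p, E_p, F_p):
    """One-way light path c't' in direction theta', equation (38)."""
    num = ell_p * math.cos(theta_p) * (A_p * F_p - E_p * B_p) + ell_p * (E_p * A_p - B_p * F_p)
    return num / (A_p**2 - B_p**2)


def interval_in_S(bt_p, x_p, rho_p, A_p, B_p, E_p, F_p, D=1.0):
    """(ct)^2 - x^2 - y^2 - z^2 for an event given in primed coordinates."""
    det = D * (A_p * E_p - B_p * F_p)
    ct = (A_p * bt_p - F_p * x_p) / det
    x = (E_p * x_p - B_p * bt_p) / det
    return ct**2 - x**2 - (rho_p / D) ** 2


params = primed_parameters(0.5, -0.2)
for theta_deg in (0, 60, 135):
    th = math.radians(theta_deg)
    out = light_path(th, 1.0, *params)
    back = light_path(th + math.pi, 1.0, *params)
    print(theta_deg, out, back, out + back,
          interval_in_S(out, math.cos(th), math.sin(th), *params))
```

The one-way light paths depend on direction through F', but their sum does not, so this experiment constrains only the combination in (42).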
Finally, we turn to the Ives-Stilwell experiment to try to determine the remaining parameters D, E', F', M and N. In order to proceed, we must make one last, small assumption regarding the invariant form of a plane wave. I postulate that, as in special relativity, the form for a plane wave which is invariant under (30) is
(43)  $\exp[i(x^0 k_0 - \mathbf{r}\cdot\mathbf{k})] = \exp[i(x'^0 k'_0 - \mathbf{r}'\cdot\mathbf{k}')]$
with $x^0 = ct$ and $x'^0 = b't'$.
Under this very reasonable assumption, substituting (30) into the primed phase and comparing the coefficients of $x^0$, x, and y gives the transformation equations of the wave four-vector
(44)
$k_0 = D\,(E'k'_0 - B'k'_x) = D\,(E'k'_0 + \beta\gamma\,k'_x)$
$k_x = D\,(A'k'_x - F'k'_0) = D\,(\gamma k'_x - F'k'_0)$
$k_y = D\,k'_y$
where
(45)  $k_x = k_0\cos\theta, \qquad k_y = k_0\sin\theta, \qquad k'_x = k'_0\cos\theta', \qquad k'_y = k'_0\sin\theta'$
Note that for this part, I have rotated the coordinate axes so that the wave has no z dependence to simplify the ensuing math.
In the Ives-Stilwell experiment, only the second-order Doppler shift was measured. In order to eliminate the first-order shift, the light source must be viewed perpendicularly to its direction of motion. This is equivalent to setting θ = 90 degrees. We first use the $k_x$ and $k_y$ equations to solve for $k_0$,
(46)  $k_x = 0 \;\Rightarrow\; \cos\theta' = \frac{F'}{\gamma}, \qquad k_0 = k_y = D\,k'_y = D\,k'_0\sqrt{1 - \cos^2\theta'} = D\,k'_0\sqrt{1 - F'^2/\gamma^2}$
Using the relation between E' and F' obtained above and the transformation equation for $k_0$, we find
(47)  $k_0 = D\left(\frac{1 - \beta\gamma F'}{\gamma}\,k'_0 + \beta\gamma\,k'_0\,\frac{F'}{\gamma}\right) = \frac{D\,k'_0}{\gamma}$
Setting the expressions for $k_0$ in (46) and (47) equal to each other gives
(48)  $D\,k'_0\sqrt{1 - F'^2/\gamma^2} = \frac{D\,k'_0}{\gamma} \quad\Rightarrow\quad F' = -\beta\gamma$
Chap. 10. The Logically Simplest Theory of Relativity ...
491
I have chosen F' to be negative so that t' and t will have the same sign. The only remaining task is to determine D, M and N. Since k = 2iz/l, we can rewrite (47) as (49)
X -
i-A'/D
so to second order accuracy
4_A fiJl
(50)
==
DD ([ ,I ++
11 ^r* )J -~ )'
=~
l u (O-O ~
,J
+ • •i - c"-
A
From the Ives-Stilwell experiment, we know that the second order wavelength shift Δλ/λ of a moving light source is related to its velocity by v²/2c². For our proposed transformation to correctly predict the outcome of this experiment, we see that we must have D = 1.
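As a rough numerical illustration of the size of this second order shift, the sketch below compares the exact transverse Doppler factor of special relativity, γ − 1, with the leading term v²/2c². The function name and the sample speed (β = 0.005) are assumptions chosen for illustration and are not taken from the text.

```python
import math

def second_order_shift(beta):
    """Return (exact, approx) second order wavelength shift for beta = v/c.

    exact  : gamma - 1, the transverse Doppler factor of special relativity
    approx : beta**2 / 2, the leading term quoted in the text
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma - 1.0, 0.5 * beta**2

# Hypothetical sample speed, chosen only for illustration.
exact, approx = second_order_shift(0.005)
print(f"gamma - 1 = {exact:.6e},  beta^2/2 = {approx:.6e}")
```

The two numbers differ only at order β⁴, which is why an experiment sensitive to the second order shift cannot by itself distinguish the exact factor from its leading term.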
V. DISCUSSIONS AND CONCLUSIONS
Having culled as much information as possible out of the three fundamental experiments, we can now write the transformation equations (30) as
(51) b' t '
-
"2 ' = X
with the invariant metric
$(b't')^2 - x'^2 - y'^2 - z'^2 = s^2 = (ct)^2 - x^2 - y^2 - z^2$. (52)
Notice that at no point did we ever postulate a method of clock synchronization, as Robertson or Hils and Hall did, which would have been equivalent to assuming a certain relation between t and t' (or equivalently, c and b').
As a result, it is impossible, on the basis of these three experiments alone, to separate the variables b' and t' in the transformation or to determine the parameters M and N.
By choosing a method of clock
synchronization equivalent to that of special relativity and thus assuming that the speed of light was the same in both directions along a line joining the two clocks, Robertson and Hils and Hall have actually fixed the speed of light to be a universal constant.
Without this stipulation, we see that an
infinite number of transformations, all consistent with the results of the three experiments, are allowed, corresponding to arbitrary choices of M and N.
Each of these transformations has the Lorentz group properties (this is evident because v'/c' is independent of M and N) and four-dimensional symmetry, reducing the "tests of the isotropy of light" to mere tests of the four-dimensional symmetry of natural laws.
In the future,
consideration of these many possibilities may lead to new ways of looking at physics. It is my belief that no experiment to date contains enough information to exactly specify a transformation.
To do so, one would have
to devise a method to measure the one way velocity of light without introducing a method of clock synchronization, a task whose feasibility is a much debated subject.
In the meantime, more and more accurate versions
of these experiments will challenge the limits of our measurement ability and provide more and more precise verification of the four-dimensional symmetry of our world.
References
0 J. D. Jackson, Phys. Today 40, No. 5, 34 (1987).
1 H. P. Robertson, Rev. Mod. Phys. 21, 378 (1949).
2 D. Hils and J. L. Hall, Phys. Rev. Lett. 64, 1697 (1990).
3 R. M. Mansouri and R. U. Sexl, J. Gen. Rel. Grav. 8, 497 (1977).
4 R. M. Mansouri and R. U. Sexl, J. Gen. Rel. Grav. 8, 515 (1977).
5 R. M. Mansouri and R. U. Sexl, J. Gen. Rel. Grav. 8, 809 (1977).
6 A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887).
7 A. Brillet and J. L. Hall, Phys. Rev. Lett. 42, 549 (1979).
8 R. J. Kennedy and E. M. Thorndike, Phys. Rev. 42, 400 (1932).
9 H. E. Ives and G. R. Stilwell, J. Opt. Soc. Am. 28, 215 (1938); 31, 369 (1941).
10 M. Kaivola, O. Poulsen, E. Riis, and S. A. Lee, Phys. Rev. Lett. 54, 255 (1985).
A physical theory based solely on the first postulate of relativity
J. P. Hsu
Physics Department, University of Massachusetts Dartmouth, North Dartmouth, MA 02747, USA
Leonardo Hsu 1
Physics Department, Harvard University, Cambridge, MA 02138, USA
Received 22 June 1994; revised manuscript received 22 September 1994; accepted for publication 7 October 1994
Communicated by V. M. Agranovich
Abstract
Using the first postulate of relativity only, we develop a general theory, termed taiji relativity, which has four-dimensional symmetry and is consistent with experiments. Within this framework, the speed of light c is no longer a universal constant. Thus, quantum electrodynamics has only two fundamental constants, $\bar{e}$ and J, which are analogues of e and ħ. Some new results are implied.
The two postulates of special relativity are the basis of a theory that has been tremendously successful in describing a wide range of phenomena. However, using only Einstein's first postulate, the principle of relativity, one can construct a more general theory. Taiji relativity, as we term it (see note added), is physically distinct from and logically simpler than special relativity, as we make only one postulate, and also reproduces all of special relativity's successes. The predictive powers of taiji relativity stem directly from its four-dimensional symmetry. Consider the usual case of two inertial frames, F and F'. To simplify the discussion, we define the speed of light to be constant and isotropic in the F frame by synchronizing the clocks in that frame according to the usual method of special relativity 2. We make this definition in one frame only and could just as easily have used the F' frame, as all frames are equivalent, so this procedure does not select a preferred frame. As the principle of relativity itself does not specify how the F and F' clocks should be related, the theory does not tell us anything about the speed of light in any other frame [1]. However, taiji relativity can still predict experimental results. Assume F and F' have relative motion along a common x/x' axis. We denote the as-yet-unknown relation between t and t' by
$t' = A(x, t)$, (1)
where A(x, t) is an unspecified function and x is the F coordinate of the F' clock showing time t'.
1 Present address: Physics Department, University of California at Berkeley, Berkeley, CA 94720, USA.
2 Some readers may cry foul here and claim that this assumption constitutes a second, albeit weak, postulate. However, the existence of this frame is not necessary to this paper. One could imagine a universe of frames F, F', F'', etc. from which F suddenly disappeared. No physics would change. If we do not make this assumption here, the ensuing discussion would become very mathematically messy.
We describe events by the four-vectors
$(ct, \mathbf{r}) = (ct, x, y, z) = x^\mu$, $\mu = 0, 1, 2, 3$, (2)
$(b't', \mathbf{r}') = (b't', x', y', z') = x'^\mu$. (3)
Here, b' is an arbitrary function of x and t, and depends on A(x, t). We denote the zeroth component $x^0$ (the "lightime") of $x^\mu$ in a general frame by w (= bt), where b = c only in the frame in which the speed of light is constant. The four-dimensional taiji transformation relating $x^\mu$ and $x'^\mu$ is
$b't' = \gamma(ct - \beta x)$, $x' = \gamma(x - \beta ct)$, $y' = y$, $z' = z$, $\beta = V/c$, $\gamma = 1/(1 - \beta^2)^{1/2}$, (4)
where V is the speed of F' as seen from F. It preserves the four-dimensional interval
$s^2 = (b't')^2 - \mathbf{r}'^2 = (ct)^2 - \mathbf{r}^2$, $\mathbf{r}^2 = \mathbf{r}\cdot\mathbf{r}$, (5)
and the infinitesimal interval
$ds^2 = [d(b't')]^2 - d\mathbf{r}'^2 = c^2 dt^2 - d\mathbf{r}^2$. (6)
As b' depends on t and x, t' can no longer be isolated in transformation (4). Time by itself can no longer be regarded as a fourth dimension. In taiji relativity, one must use the lightime w. We define
$c' = d(b't')/dt'$ (7)
in order to write (6) in the form
$ds^2 = c'^2 dt'^2 - d\mathbf{r}'^2 = c^2 dt^2 - d\mathbf{r}^2$. (8)
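A physical interpretation of c' can be gleaned from the broad relativistic velocity transformation
$c' = \gamma(c - \beta v_x)/B$, $v'_x = \gamma(v_x - \beta c)/B$, $v'_y = v_y/B$, $v'_z = v_z/B$, (9)
$\mathbf{v}' = d\mathbf{r}'/dt'$, $\mathbf{v} = d\mathbf{r}/dt$, $B = (\partial A/\partial x)(dx/dt) + \partial A/\partial t$. (10)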
When ds = 0, (6) describes the propagation of a light signal: c = dr/dt and c' = dr'/dt', where c' is the speed of light in F'. If ds ≠ 0, however, then a physical interpretation of c' becomes complicated, as it depends on v. In the case v = (V, 0, 0), c' can be interpreted as the average (or two-way) speed of a light signal which travels from $\mathbf{r}'_0$ to $\mathbf{r}'_1$, and then back to $\mathbf{r}'_0$. This two-way speed can be shown to be isotropic in all frames, as it must be in order for the theory to be consistent with the Michelson-Morley experiment 3. Taiji relativity also leads to a new criterion for inertial frames. Setting v = 0 in (9) to find the speed of the F frame as seen from F', we get
$c' = \gamma c/B = c'(0)$, $v'_x = -\gamma\beta c/B$, $v'_y = v'_z = 0$. (11)
This implies that the speed of the F frame, measured from F', is not only different from the speed of F' measured from F, but also depends on the function A = A(x, t). The ratio $v'_x/c'(0)$, however, is constant and is equal to −V/c. Consider also another inertial frame F'' moving with a constant velocity $(V_1, 0, 0)$ as measured from F and a different velocity $(V'_1, 0, 0)$ as measured from F'. Here, $V'_1$ depends on B. The ratio $V'_1/c'$, on the other hand, is constant and has the value (from (9))
$V'_1/c' = (V_1/c - \beta)/(1 - \beta V_1/c)$. (12)
We see that V'/c' is the parameter one should use to characterize the relative motion of inertial frames in taiji relativity. Using (12), the broad transformations (4) can also be shown to form the Lorentz group. Let us now look more closely at the behavior of light. Suppose we have a light signal propagating along the x-axis, with v = (c, 0, 0), then
$c' = \gamma c(1 - \beta)/B = c'_+$, $v'_x = \gamma c(1 - \beta)/B$, $v'_y = 0$, $v'_z = 0$, (13)
where B is defined in (9). As expected by four-dimensional symmetry, we have $c'^2 - v'^2 = c^2 - v^2 = 0$. If v = (−c, 0, 0), then
$c' = \gamma c(1 + \beta)/B = c'_-$. (14)
3 When B is constant, for example, the velocity transformation (9) is consistent with the Michelson-Morley experiment because the two-way speed of light is isotropic in F'. The time interval Δt' for a round-trip of light in F' is angular independent: $\Delta L'/c'(\theta') + \Delta L'/c'(\theta' + \pi) = \Delta t' = \Delta L'/[\gamma c(1 - \beta\cos\theta_1)/B] + \Delta L'/[\gamma c(1 - \beta\cos\theta_2)/B] = 2\gamma B\Delta L'/c$, where $\cos\theta_1 = (\cos\theta' + \beta)/(1 + \beta\cos\theta')$, $\cos\theta_2 = [\cos(\theta' + \pi) + \beta]/[1 + \beta\cos(\theta' + \pi)]$ and $B = M + N\,dx/dt = M + \beta cN$. Here, we have used A = Mt + Nx and the condition for a round-trip dr' = 0 in F', i.e., v = (V, 0, 0). The same is true when A is not linear in x and t, but a little messier to work out.
If A(x, t) is specified, one creates a completely new theory in which the relationship between F and F' clocks is postulated. In special relativity, the second postulate sets
$A = \gamma(t - \beta x/c)$, (or $B = \gamma(1 - \beta v_x/c)$), (15)
so that $c'_+ = c'_- = c$. However, since there is no such postulate in taiji relativity to fix A, the theory says nothing definite about $c'_+$ and $c'_-$. We now present some clarifications. The indeterminacy in the way clocks in different frames are related is often misunderstood. One important point is that taiji relativity is not simply special relativity with some change of variable. Although one can obtain Eq. (4) by substituting b't' for ct' in the usual Lorentz transformations, we shall see in the remainder of this paper that this absolutely cannot be done with respect to other taiji relativistic equations and quantities such as the action S or the Dirac equation. If taiji relativity were just a change of variables, it would be physically identical to special relativity, based on the same two postulates and, as stated in the first paragraph, it is not. In addition, as we will see shortly, taiji relativity has only two fundamental and universal constants, instead of the usual three (ħ, c, and e). This indicates that a real change in the physics has occurred, as a simple change of variables could never effect this reduction. Another common question is: do not all experiments show that relativistic time (as given by (15)) is the only correct time? We are not saying that special relativity is wrong. Instead, taiji relativity shows that relativistic time, or any particular time system for that matter, is not a necessary ingredient of a theory for it to correctly reproduce all known experimental results. The four-dimensional symmetry of the physical framework is all that matters. As we have already seen in regard to the Michelson-Morley experiment in footnote 3 and shall see for a few other experiments later in this paper, the function A always disappears when we calculate the quantities that are actually measured in experiments.
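The way A drops out of measured quantities can be illustrated with the round-trip argument of footnote 3 and with the special relativistic choice (15). The sketch below assumes, as the footnote does, that B is a constant for the round trip; the function names and the numerical values of β, B and L are illustrative assumptions.

```python
import math

def light_speed_Fprime(theta_p, beta, B, c=1.0):
    """One-way speed c'(theta') of light in F', following footnote 3.

    theta_p is the propagation angle in F'; the angle theta seen in F obeys
    cos(theta) = (cos(theta') + beta) / (1 + beta*cos(theta')).
    B is treated as a constant (clock function A linear in x and t).
    """
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    cos_t = (math.cos(theta_p) + beta) / (1.0 + beta * math.cos(theta_p))
    return gamma * c * (1.0 - beta * cos_t) / B

def round_trip_time(theta_p, L, beta, B, c=1.0):
    """Time for light to travel a length L out along theta' and back."""
    return (L / light_speed_Fprime(theta_p, beta, B, c)
            + L / light_speed_Fprime(theta_p + math.pi, beta, B, c))

beta, B, L = 0.4, 1.7, 1.0   # arbitrary illustrative values
gamma = 1.0 / math.sqrt(1.0 - beta**2)
for deg in (0, 30, 90, 150):
    th = math.radians(deg)
    print(deg, light_speed_Fprime(th, beta, B), round_trip_time(th, L, beta, B))
print("2*gamma*B*L/c =", 2 * gamma * B * L)

# Special relativity, Eq. (15): B = gamma*(1 - beta*vx/c) for a signal with velocity vx.
def sr_B(vx, beta, c=1.0):
    return (1.0 - beta * vx / c) / math.sqrt(1.0 - beta**2)

c_plus = gamma * (1.0 - beta) / sr_B(+1.0, beta)    # Eq. (13) with c = 1
c_minus = gamma * (1.0 + beta) / sr_B(-1.0, beta)   # Eq. (14) with c = 1
print(c_plus, c_minus)
```

The one-way speeds printed in the loop vary with direction, but the round-trip time does not, which is the content of the footnote. With the special relativistic B of (15), both one-way speeds reduce to c.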
As we claim that any arbitrary time system is consistent with all known experiments, one might ask how one could actually set up a clock system for any given function A. Any clock has only two adjustable parameters, its reading and rate of ticking. By basing our "clocks" on a computer chip, we could program any clock in F' to obtain a time reading from the nearest F clock and, based on its F frame position x, compute the time it should display (t' = A(x, t)). The resultant time system, however complicated, must be considered valid and "physical" as long as it produces results consistent with all known experiments. As no known experimentally measurable quantity actually depends on A itself, any time system with four-dimensional symmetry is allowed and there are no bounds on the value or restrictions on the functional form of A. Now we look beyond the simple coordinate transformations of taiji relativity to some of its implications. In the realm of mechanics, more radical changes must be made. Since c is no longer a universal constant, the usual action $-\int mc'\,ds'$, where $ds'^2 = dx'^\mu dx'_\mu$, is no longer invariant. To preserve its invariance, we redefine the action and the associated Lagrangian for a single particle as
$S_f = -m\int (dw^2 - d\mathbf{r}^2)^{1/2} = \int L\,dt$, (17)
$L = -mC(1 - v^2/C^2)^{1/2}$, $C = dw/dt \neq$ const, (18)
where m is the rest mass of the particle. Here, $S_f$ and L are given in terms of quantities measured in a general frame with coordinate $x^\mu = (w, \mathbf{r})$. As the dimension of our Lagrangian differs from that of the special relativistic one by a velocity, other quantities in taiji relativity will also have different units from their conventional analogues. The "momentum" of a particle is now
$\mathbf{p} = \partial L/\partial\mathbf{v} = (m\mathbf{v}/C)/(1 - v^2/C^2)^{1/2}$, (19)
which has the dimension of mass and the Hamiltonian H is defined by
$H = [(\partial L/\partial\mathbf{v})\cdot\mathbf{v} - L]/C = m/(1 - v^2/C^2)^{1/2} = (\mathbf{p}^2 + m^2)^{1/2}$, (20)
which also has the dimension of mass and can be interpreted as the "energy" $p_0$ of a free particle. It follows from (19), (20), and $H = p_0$ that
$p_0^2 - \mathbf{p}^2 = m^2$. (21)
Just as in special relativity, $p^\mu = (p_0, \mathbf{p})$ is a four-vector.
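The mass-shell relation (21) can be verified directly from the definitions (19) and (20) for any value of C = dw/dt. The sketch below is an illustrative check in one spatial dimension; the function name and the numerical values are assumptions, with v and C expressed in the same (arbitrary) units.

```python
import math

def taiji_momentum_energy(m, v, C):
    """Return (p, H) from Eqs. (19) and (20) for speed v and C = dw/dt."""
    root = math.sqrt(1.0 - (v / C) ** 2)
    p = (m * v / C) / root     # Eq. (19), has the dimension of mass
    H = m / root               # Eq. (20), the "energy" p0
    return p, H

m, v = 1.0, 0.6                # illustrative values; v in the same units as C
for C in (1.0, 2.0, 5.0):      # arbitrary values of C = dw/dt
    p, H = taiji_momentum_energy(m, v, C)
    print(C, p, H, H**2 - p**2)
```

The last printed column stays equal to m² for every C, even though p and H individually depend on the ratio v/C.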
We mentioned before that taiji relativity implies the existence of only two fundamental and universal constants in quantum electrodynamics. We now find their values: In quantum mechanics, the wave four-vector $k^\mu$ of a particle is proportional to its four-momentum $p^\mu$, so we can write
$p^\mu = Jk^\mu$, in the F frame, $p'^\mu = J'k'^\mu$, in the F' frame, (22)
where J and J' are proportionality constants. Based on the four-dimensional taiji transformations of $k'^\mu = (\omega'/c', \mathbf{k}')$,
$\omega'/c' = \gamma(\omega/c - \beta k_x)$, $k'_x = \gamma(k_x - \beta\omega/c)$, (23)
$k'_y = k_y$, $k'_z = k_z$, (24)
and $p'^\mu$, we see that $p^\mu/k^\mu = p'^\mu/k'^\mu$. Thus, J = J', so J is a universal constant. Comparing (22) in the F frame with the conventional relation $p^\mu_{SR} = \hbar k^\mu$, where $p^\mu_{SR} = cp^\mu$, we deduce the value of J to be ħ/c, or
$J = 3.5177293 \times 10^{-38}$ g cm. (25)
This J is the analog of Planck's constant in the conventional framework and is found in expressions such as $\mathbf{p} = -iJ\nabla$, $\exp(ip_\mu x^\mu/J)$, $d^3r\,d^3p/(2\pi J)^3$. For the other universal constant, consider Eq. (21) for a charged particle moving in an electromagnetic field. As usual, we substitute $p^\mu - \bar{e}a^\mu$ for $p^\mu$, where $\bar{e}$ is some constant, to get
$(p_0 - \bar{e}a_0)^2 - (\mathbf{p} - \bar{e}\mathbf{a})^2 = m^2$, in F, (26)
$(p'_0 - \bar{e}a'_0)^2 - (\mathbf{p}' - \bar{e}\mathbf{a}')^2 = m^2$, in F'. (27)
Comparing (26) with the corresponding conventional relation in the F frame, we see that $\bar{e} = e/c$ (e in esu) and $a^\mu = a^\mu(w, \mathbf{r}) = A^\mu(ct, \mathbf{r})/c$, where $A^\mu$ is the usual vector potential. We substitute $a^\mu$ for $A^\mu$ to ensure that the invariant action for the charged particle and electromagnetic field,
$S_m = S_f - \bar{e}\int a_\mu\,dx^\mu - \frac{1}{4}\int f_{\mu\nu}f^{\mu\nu}\,d^3r\,dw$, $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$, (28)
has the same dimensions as the action $S_f$ in (17). As can be seen from (28), $\bar{e}$ is a universal constant and has the value (in Heaviside-Lorentz units)
$\bar{e} = -1.6021891 \times 10^{-20}\,(4\pi)^{1/2}$ (g cm)$^{1/2}$. (29)
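The numerical values in (25) and (29) can be reproduced from conventional CGS constants. In the sketch below, the values of ħ, c and e (in esu) are standard reference values supplied as assumptions; they do not appear in the text. The last line evaluates the combination ē²/(4πJ), which the text identifies as the fine structure constant.

```python
import math

# CGS values consistent with the constants quoted in Eqs. (25) and (29).
# hbar and e_esu are standard reference values supplied here as assumptions;
# they are not listed in the text itself.
hbar = 1.0545887e-27      # erg s
c = 2.99792458e10         # cm / s
e_esu = 4.803242e-10      # esu (Gaussian units)

J = hbar / c                                   # Eq. (25), in g cm
e_bar = (e_esu / c) * math.sqrt(4 * math.pi)   # magnitude of Eq. (29), Heaviside-Lorentz
alpha = e_bar**2 / (4 * math.pi * J)           # fine structure constant

print(f"J       = {J:.7e} g cm")
print(f"|e_bar| = {e_bar / math.sqrt(4 * math.pi):.7e} (4 pi)^(1/2) (g cm)^(1/2)")
print(f"1/alpha = {1 / alpha:.3f}")
```

Because ē²/(4πJ) equals e²/(ħc) in Gaussian units, the speed of light cancels and the dimensionless constant keeps its conventional value.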
One can verify that the fine structure constant is given by $\alpha = \bar{e}^2/4\pi J = 1/137$, retaining its well-known value. From the viewpoint of quantum mechanics, the new definition of the action has an effect on the interpretation of the energy levels of a hydrogen atom as well. The covariant Dirac Hamiltonian for the hydrogen atom is now $H_D = -\boldsymbol{\alpha}\cdot\mathbf{p} - \beta m - \bar{e}^2/4\pi r$, $\mathbf{p} = -iJ\nabla$, and the Dirac equation is
$iJ\,\partial\psi/\partial w = H_D\psi$, $w = bt$. (30)
Both $\partial/\partial w$ and $H_D$ transform like the zeroth component of a four-vector. Therefore, the lightime w must be used as the evolution parameter for a physical system in a general frame. Since $H_D$ now has the dimension of mass, the Dirac equation leads to atomic "mass" levels
$M_n = m/\{1 + \alpha^2/[n - k + (k^2 - \alpha^2)^{1/2}]^2\}^{1/2}$, $\alpha = \bar{e}^2/4\pi J$, (31)
where k = j + 1/2. In taiji relativity, an atomic system is considered to emit or absorb "mass quanta." Physically, we can visualize a transition as occurring when a photon of "moving mass" $Jk_0$ (analogous to the hν) is absorbed or emitted by an electron. To see that this interpretation is consistent with what is observed in the lab, consider its effect on the Doppler shift experiments. The moving mass of a photon is related to its frequency by
$Jk_0 = J\omega/c$, in F, $Jk'_0 = J\omega'/c'$, in F'. (32)
To find the relationship between ω and ω', we set $k_x = k\cos\theta = (\omega/c)\cos\theta$ in (23) to get
$\omega'/c' = \gamma(\omega/c)(1 - \beta\cos\theta)$. (33)
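Equation (33) relates only the ratios ω/c and ω'/c', so it can be evaluated without choosing a clock function. The sketch below tabulates the fractional change of this ratio for emission parallel, perpendicular and antiparallel to the motion; the function name and the value β = 0.01 are illustrative assumptions.

```python
import math

def ratio_in_Fprime(omega_over_c, theta, beta):
    """Eq. (33): omega'/c' in F' from omega/c and the angle theta in F."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * omega_over_c * (1.0 - beta * math.cos(theta))

beta = 0.01                      # illustrative relative speed, V/c
k0 = 1.0                         # omega/c in F, arbitrary units
for deg in (0, 90, 180):
    k0_prime = ratio_in_Fprime(k0, math.radians(deg), beta)
    print(deg, k0_prime / k0 - 1.0)
```

At 90 degrees only the second order term γ − 1 survives, while at 0 and 180 degrees the first order terms of opposite sign dominate.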
In the actual Doppler shift experiments, it is the shift of atomic energy (or mass) levels that is being measured, rather than the frequency of light. As a result, we want to find how the moving mass Jω/c of a photon, and not the frequency ω, differs between frames. Although the transformations for ω' and c' both individually depend on A, that for their ratio does not (Eq. (33)), making it impossible for this experiment to tell us anything about what A(x, t) "should" be. As expected, the taiji relativistic Doppler shift is identical to the conventional Doppler shift if c = c'. Another example of how observable quantities turn out not to depend on the function A can be seen in experiments measuring the dilation of the decay time of particles in flight. A crucial point regarding these experiments is that one really does not measure the decay time dilation, but the dilation of the decay length of the particle. In a general frame with coordinates (w, x, y, z), we can express the S matrix in terms of an interaction Hamiltonian $H_I$ by
$S = 1 + (-i/J)\int H_I(w)\,dw + (-i/J)^2\int H_I(w)\,dw\int H_I(w')\,dw' + \ldots$ (34)
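The inverse decay length 1/D for a particle in flight is [2]
$1/D = \Gamma(1 \to 2 + 3 + \ldots + N) = \lim_{w\to\infty}\int \frac{|\langle f|S|i\rangle|^2}{w}\,\frac{d^3r_2\,d^3p_2}{(2\pi J)^3}\cdots\frac{d^3r_N\,d^3p_N}{(2\pi J)^3}$, (35)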
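which is defined as the decay rate with respect to the lightime w. For a given process, such as $\mu^-(p_1) \to e^-(p_2) + \nu_\mu(p_3) + \bar{\nu}_e(p_4)$, 1/D is given by
$1/D = \Gamma(1 \to 2 + 3 + 4) \propto \frac{1}{p_{01}}\int \delta^4(p_1 - p_2 - p_3 - p_4)\,\frac{d^3p_2}{p_{02}}\,\frac{d^3p_3}{p_{03}}\,\frac{d^3p_4}{p_{04}}\sum|M|^2$, (36)
$M = (G/\sqrt{2})[\bar{\nu}_\mu\gamma^\lambda(1 - \gamma_5)\mu(p_1)] \times [\bar{e}(p_2)\gamma_\lambda(1 - \gamma_5)\nu_e(p_4)]$. (37)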
Everything except $1/p_{01}$ in (36) is invariant in taiji relativity so the decay length is dilated, D(in flight)/D(at rest) = $p_{01}/m_1$, in agreement with experiments. This kind of agreement can be obtained for all other known experiments as well. The decay length case highlights an important point. When comparing theory and experiment, it is critically important to be aware of what quantities are actually measured and what effects the assumption of the second postulate may have had on the interpretation of the results. With regard to other laws of physics, consider, for example (28) generalized to the case of a continuous charge distribution in space. The second term in (28) becomes $-\int a_\mu j^\mu\,d^3r\,dw$ and we regain the invariant Maxwell equations
$\partial_\mu f^{\mu\nu} = j^\nu$, $\partial^\lambda f^{\mu\nu} + \partial^\mu f^{\nu\lambda} + \partial^\nu f^{\lambda\mu} = 0$, $\partial_\mu = \partial/\partial x^\mu$, $x^\lambda = (w, \mathbf{r})$, $f_{\mu\nu}(x^\lambda) = \partial_\mu a_\nu(x^\lambda) - \partial_\nu a_\mu(x^\lambda)$. (38)
Although taiji relativity may seem unnecessarily complex, with its b' instead of a neat universal constant c, it should be remembered that almost all of the formulas simplify tremendously when quantities are paired with c'; for example, the transformation between velocity ratios v'/c' and v/c, as opposed to that between velocities v' and v. Oftentimes in experiments, it is the ratio that is actually being measured, rather than a specific quantity by itself. Also, the complexity of a theory is more accurately judged by the number of postulates one must make, rather than the number of terms in its mathematical formulas. Taiji relativity shows us that the speed of light is not truly universal because it depends on the choice of a particular method of synchronizing clocks in different inertial frames, a choice which need not even be made. A framework involving four-dimensional symmetry, guaranteeing that the laws of physics take the same form in all frames, is the only requirement for a theory to explain all known experiments. Taiji relativity does have new implications for modern physics and the way we view our universe that differ from those found in special relativity. One new result is the possibility of obtaining a single four-dimensional Liouville equation for many-body problems. This is impossible to do in special relativity due to the complex relationship between t and t'. Since taiji relativity tells us that any relation at all will do, we can set t = t' to simplify the calculational maze of many coupled Liouville equations presented by special relativity 4. Another effect of postulating t = t' is that it is now possible to derive an invariant Planck distribution ⟨n⟩. Thus, the observed anisotropy of the 3K microwave background radiation cannot be attributed to a Doppler shift due to the Earth's "absolute" motion through the cosmic radiation 5. The anisotropy must instead come from a small non-uniformity in the distribution of galactic matter, and thus the difference between various time systems made possible by taiji relativity is not just a matter of formalism or of interpretation; it is a real physical difference. The idea of constructing a relativity theory by using only the first postulate of special relativity has been discussed by Ritz, Tolman, Pauli and others 6; but the crucial taiji transformation (4) and physical implications in Eqs. (9)-(38) are new and are our original works 7. This work was supported in part by The Jing Shin Research Fund of the UMassD Foundation.
Note added
Our results suggest that any specific system of time such as relativistic time and the universal speed of light are not physical entities inherent in nature but human conventions imposed upon it. However, the four-dimensional symmetry appears to be inherent and truly fundamental in Nature. Thus it seems appropriate to term this theory of relativity (based solely on the first postulate of special relativity) "taiji relativity" because, in ancient Chinese thought, taiji denotes the ultimate principle or the condition as it existed before the creation of the world.
4 Common time (t = t') does not define a preferred frame and is not absolute because the frame in which the speed of light is defined to be constant can be arbitrarily chosen. It differs from Newtonian time in that the coordinate transformation is given by (4) rather than the Galilean transformation. Once the additional postulate of common time is made, one has a new theory which may be termed "common relativity". See also Ref. [1].
5 For a proof of this result within the four-dimensional symmetry framework with t' = t, see Ref. [3] (in which the new invariant temperature x is given by $k_BT/c^3$, where the usual temperature T may not be invariant).
6 See, for example, Ref. [4]. The authors of Ref. [4] did not get a satisfactory answer because they did not recognize the four-dimensional symmetry based solely on the first postulate of relativity. Reichenbach, Edwards, Winnie, Tyapkin, Mandel'shtam, Grunbaum, Hsu, Sherry, Underwood, Mansouri, Sexl, Logunov, Zhang et al. [5] also discussed physical implications without postulating the constancy of the (one-way) speed of light. (A theory of gravity based on the usual physical field in Minkowski space formulated by Logunov et al. appears to be quite interesting, for it is closer to the framework of taiji relativity than the conventional theory.)
7 For detailed discussions, see Ref. [6].
References
[1] J.P. Hsu, Phys. Lett. A 97 (1983) 137; Nuovo Cimento B 74 (1983) 67; J.P. Hsu and C. Whan, Phys. Rev. A 38 (1988) 2248, Appendix.
[2] J.D. Bjorken and S.D. Drell, Relativistic quantum mechanics (McGraw-Hill, New York, 1964) pp. 261-268, 285, 286; J.J. Sakurai, Advanced quantum mechanics (Addison-Wesley, Reading, MA, 1967) pp. 171, 172, 181-188.
[3] J.P. Hsu, Nuovo Cimento B 93 (1986) 178.
[4] W. Ritz, Ann. Chim. Phys. 13 (1908) 145; R.C. Tolman, Phys. Rev. 30 (1910) 291; W. Pauli, Theory of relativity (Pergamon, Oxford, 1956) pp. 5-9.
[5] H. Reichenbach, The philosophy of space and time (Dover, New York, 1958) p. 127; L.I. Mandel'shtam, Lectures on optics, relativity and quantum mechanics (Moscow, 1972); A.A. Tyapkin, Sov. Phys. Usp. 15 (1972) 205; Lett. Nuovo Cimento 7 (1973) 760; J.P. Hsu and T.N. Sherry, Found. Phys. 10 (1980) 57; A.A. Logunov, Yu. M. Loskutov and Yu. V. Chugreev, Theor. Math. Phys. 67 (1986) 425; 69 (1986) 1179; Yuan-zhong Zhang, Experimental foundations of special relativity (Science Press, Beijing, 1979) [in Chinese].
[6] L. Hsu and J.P. Hsu, Experimental tests of a new Lorentz invariant dynamics based solely on the first postulate of relativity, Experimental tests of a new Lorentz invariant quantum electrodynamics based solely on the first postulate of relativity, UMass Dartmouth preprints (1994).
Physics Letters A 217 (1996) 359
Erratum
A physical theory based solely on the first postulate of relativity (Physics Letters A 196 (1994) 1) *
Jong-Ping Hsu, Leonardo Hsu
On page 3, paragraph 3, line 11: "footnote 2" should read "footnote 3". On page 5, paragraph 4, lines 5-7 after Eq. (38): "between v'/c' and v'/c, as opposed to that between v and v'." should read "between velocity ratios v'/c' and v/c, as opposed to that between velocities v' and v." We would like to clarify that quantities such as t', b', v', c' and frequency ω' are, strictly speaking, undefined in taiji relativity as they depend on the unspecified function A(x, t) or its derivative B = dA(x, t)/dt. Only quantities such as the product b't' = w' and the ratios v'/c' and ω'/c' are independent of A(x, t) and, hence, well-defined. It is possible to formulate taiji relativity without the use of the undefined quantities t', v', c', etc. because all of the theoretical and experimental results in the theory can be directly obtained on the basis of the four-dimensional symmetry with the transformation w' = γ(w − βx), x' = γ(x − βw), y' = y, z' = z, where the lightimes w and w' have the dimension of length and play the role of "inherent" evolution variable. Taiji relativity does not postulate the speed of light to be c in the frame F and c' in the frame F', because c' is undefined. One may use taiji clocks which read w and w' in F and F' respectively based on the invariant phase $k_0w - \mathbf{k}\cdot\mathbf{r}$ of the atomic radiations, where $k_0$ plays the role of the angular "taiji frequency". This resembles that, in special relativity, the angular frequency
* SSDI of the original article: 0375-9601(94)00848-5.
Part II
Experiments for Lorentz and Poincare Invariance *
* See Ref. [16] in chapter 11 for more detailed discussions of tests of special relativity
Chapter 11 The Fizeau Experiment
Historically, the first ether theory describing the propagation of light rays in moving media was developed by A. J. Fresnel beginning in 1818 [1]. He made calculations which were able to explain the phenomena of polarization, diffraction and interference. The interference of light was originally investigated by T. Young in 1801-1804. Young's results supported the wave theory of light developed by C. Huygens in his book "Treatise on Light" (1690). They also suggested the existence of a medium, called aether (or ether), for the vibrations of light waves. By analogy with the velocity of sound in elastic media, Fresnel assumed that the velocity of light in media is inversely proportional to the square root of the density of an elastic ether. He derived the formula for the velocity of light in an elastic medium without dispersion,
$u = u' + fv$, (11.1)
with
$f = 1 - 1/n^2$, (11.2)
known as Fresnel's drag coefficient, where v is the velocity of the medium with the index of refraction n relative to the ether and u' is the velocity of light in the medium when v = 0. If the second term on the right-hand side of Eq. (11.1) is interpreted as the velocity of an ether wind, then the quantity $v - fv = v/n^2$ is the velocity of the medium relative to the ether wind. Thus we see that the ether is dragged along partially by the moving medium. Physically, Fresnel's derivation of (11.2) was based on the assumption of a partial drag of the light by the medium. In order to test Fresnel's ether theory, H. L. Fizeau [2] first measured the speed of light in running water in 1851 and obtained a result which was in agreement with (11.1) and (11.2). Later, he performed a second experiment in 1859 [3] and obtained the same result. We may remark that Fizeau (and Foucault) also carried out an experiment in 1850 to show that light travels more slowly in water than in air, which gave strong evidence against Newton's particle theory of light.
Fig. 11.1. Schematic diagram of Fizeau's experiment.
A schematic diagram of Fizeau's 'aether-drag' apparatus is shown in Fig. 11.1. A beam of light from a source S is divided by a (semitransparent silver-coated) glass plate P, which is placed at an angle of 45° to the direction of propagation, into a transmitted part 1 and reflected part 2. The transmitted beam 1, reflected by the mirrors M1, M2 and M3, traverses a rectangular path PM1M2M3P; again a certain fraction of the beam passes the plate P and enters the telescope T. The reflected beam 2 traverses the same rectangle in the opposite direction. On its return to P it is partially reflected into T where it interferes with beam 1. Between M3 (or P) and M2 (or M1) is inserted a tube filled with water, so that light beams 1 and 2 pass through water on the path PM1 as well as on the path M2M3. As indicated in Fig. 11.1, beam 1 traverses the water in a direction opposite to the direction of motion of the water, while beam 2 has the same direction of motion as the water. In Fizeau's experiment, the total length of the path of light was l ≈ 1.5 m. The velocity of the water current v was about 7 m/sec. If the source emits monochromatic or nearly monochromatic light, interference fringes can be seen when one looks through the telescope. The observed shift of the central position of interference fringes, when the direction of v was inverted, was equal to 0.46 parts of the distance between two neighboring fringes. A possible dispersion effect is within the experimental error, and therefore, could not be observed. In special relativity, the law of velocity addition for moving media without dispersion is given by
$u = \frac{c/n + v}{1 + (v/c)/n} \approx \frac{c}{n} + \left(1 - \frac{1}{n^2}\right)v$. (11.3)
To the first-order in v/c, the law of velocity addition (11.3) leads to Fresnel's formulas (11.1) and (11.2). Thus, Fizeau's experiment can be interpreted as a test of Einstein's law of velocity addition involving a moving medium. However, there is another interpretation from a broad viewpoint: In a theory based on broad Lorentz and Poincare invariance, the law of addition for dimensionless-velocity is given by [4]
where we have assumed that the medium with a refractive index n is at rest in the F' frame, so that the dimensionless speed of light in the medium as measured by the F'-observers is $\beta'_L = 1/n$, where n > 1. Thus, the result of Fizeau's experiment can be considered as a support of broad Lorentz and Poincare invariance [4]. After 1905, several Fizeau-type experiments were performed. According to the relationship between the direction of the motion of the medium and that of the light beam, these "drag experiments" can be classified as follows:
(i) The first type is the longitudinal drag experiment, such as the original Fizeau's experiments [2,3], Michelson-Morley experiment (1886) [5] and Zeeman's experiments [6-9], where the direction of the velocity of the medium is parallel to that of the light beam.
(ii) The second type is the transverse drag experiment, for example, Jones' (1971, 1975) experiments [10,11], where the direction of the velocity of the medium is perpendicular to that of the light beam.
(iii) The third type includes other experiments [12,13], where the angle between the directions of the velocities of the medium and light beam is Brewster's angle.
All these three types of experiments involve media with or without dispersion. The results given by the experiments without dispersion are in agreement with Fresnel's formulas (11.1), while other experiments with dispersion cannot be explicitly explained by Fresnel's ether theory because the index of refraction in his theory is constant. Lorentz's electron theory [14,15] can also give a satisfactory explanation of the drag experiments with dispersion. Furthermore, the relativistic electrodynamics of moving media with dispersion can also give an excellent explanation of all of the drag experiments with dispersion effect (see [16] for details).
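The agreement between Fresnel's formulas (11.1)-(11.2) and the relativistic velocity addition (11.3) at first order in v/c can be checked numerically for the water speed quoted above. In the sketch below, the refractive index n = 1.33 for water is an assumed value not given in the text, and the function names are illustrative.

```python
C = 299_792_458.0   # speed of light in vacuum, m/s

def fresnel_speed(n, v, c=C):
    """Fresnel's formulas (11.1)-(11.2): u = c/n + (1 - 1/n^2) v."""
    return c / n + (1.0 - 1.0 / n**2) * v

def relativistic_speed(n, v, c=C):
    """Velocity addition (11.3): u = (c/n + v) / (1 + v/(n c))."""
    return (c / n + v) / (1.0 + v / (n * c))

n_water = 1.33   # assumed refractive index of water (not given in the text)
v_water = 7.0    # m/s, the flow speed quoted for Fizeau's experiment

drag_fresnel = fresnel_speed(n_water, v_water) - C / n_water
drag_relativistic = relativistic_speed(n_water, v_water) - C / n_water
print(drag_fresnel, drag_relativistic)
```

Both expressions give a drag of roughly 3 m/s on top of c/n, well below the full 7 m/s of the water. The difference between them is of second order in v/c and far too small to matter at this flow speed.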
Lorentz and Poincare Invariance
REFERENCES
[1] A. J. Fresnel, Ann. Chim. Phys. 9 (1818) 57.
[2] H. L. Fizeau, Compt. Rend. 33 (1851) 349.
[3] H. L. Fizeau, Ann. Chim. Phys. 57 (1859) 385.
[4] Jong-Ping Hsu, Einstein's Relativity and Beyond: New Symmetry Approaches (World Scientific, Singapore, 2000). See also Chapter 10 in this book.
[5] A. A. Michelson and E. W. Morley, Am. J. Sci. 31 (1886) 377.
[6] P. Zeeman, Proc. Roy. Acad. Amsterdam 17 (1914) 445; 18 (1915) 398.
[7] P. Zeeman, Proc. Roy. Acad. Amsterdam 22 (1920) 462.
[8] P. Zeeman and Miss A. Snethlage, Proc. Roy. Acad. Amsterdam 22 (1920) 512.
[9] P. Zeeman et al., Proc. Roy. Acad. Amsterdam 23 (1922) 1402.
[10] (a) R. V. Jones, J. Phys. A (GB) 4 (1971) L1. (b) R. V. Jones, Proc. R. Soc. Lond. A328 (1972) 337.
[11] R. V. Jones, Proc. R. Soc. Lond. A345 (1975) 351.
[12] W. M. Macek et al., J. Appl. Phys. 35 (1964) 2556.
[13] H. R. Bilger and A. T. Zavadny, Phys. Rev. A5 (1972) 591.
[14] H. A. Lorentz, The Theory of Electrons (Teubner, Leipzig, 1916).
[15] L. Rosenfeld, Theory of Electrons (North-Holland, Amsterdam, 1951).
[16] Yuan Zhong Zhang, Special Relativity and its Experimental Foundations (World Scientific, Singapore, 1998).
Fizeau (1819-1896)
Chapter 12 The Michelson-Morley Experiment

The famous and ingenious Michelson experiment of 1881 was stimulated by a suggestion made by J. C. Maxwell in 1879. Maxwell believed in the existence of the aether and was eager to determine the earth's motion through it. In his letter to Todd of the U.S. Nautical Almanac Office (Washington), Maxwell suggested a terrestrial experiment on the (two-way) speed of light, since the (first-order) effect related to the eclipses of Jupiter's moons was very difficult to observe, as pointed out by Todd. Maxwell said that the effect due to the earth's motion in terrestrial experiments is of the second order, $v^2/c^2 \sim 10^{-8}$, which would be too small to be observed. Fortunately, Maxwell's letter to Todd was also read by Michelson, then a young naval instructor. Michelson was not deterred by the difficulty and, two years later, came up with an ingenious device (the Michelson interferometer) with unprecedented sensitivity to measure this very small second-order effect.

In general, experiments testing the constancy of the speed of light can be classified into two types. The first type compares the speed of light in different directions within the same inertial frame of reference. The second type compares the speed of light in different inertial frames. The first type can again be divided into two groups: the "closed path of light" and the "one-way path of light" experiments. The purpose of some of the "closed path of light" experiments was to search for a possible second-order effect of the "ether wind" (or "ether drift"), or for the absolute motion of the earth relative to an assumed absolute frame. The "one-way path of light" experiments observed the transverse Doppler frequency shifts. The Michelson-Morley experiment is the first of the closed-path type, using the Michelson interferometer sketched in Fig. 12.1.
The whole apparatus is firmly mounted on a massive base and can be rotated about a vertical axis through any desired angle. Light from the source S is partly reflected and partly transmitted by the mirror P, which has a semitransparent silver coating on its front face. The transmitted ray 1 is reflected by the mirrors $M_1$ and P, and then enters the telescope T. The reflected ray 2, traveling toward and being reflected from the mirror $M_2$, passes through P, and is then brought into interference with ray 1 in T. If monochromatic light
is emitted from the source, interference fringes are observed in T. We now calculate the travel-time difference of the two rays using the ether theory. The arrangement is assumed to move with the velocity $v$ in the direction of the path $PM_1$ relative to the ether. According to the ether theory, the velocities of ray 1 are $c - v$ and $c + v$ along the paths $PM_1$ and $M_1P$, respectively. Let $l_1$ and $l_2$ represent the lengths of the paths $PM_1$ and $PM_2$, respectively. Then the traveling time of ray 1 from P to $M_1$ and back to P is
$$t_1 = \frac{l_1}{c-v} + \frac{l_1}{c+v} = \frac{2 l_1 c}{c^2 - v^2}. \qquad (12.1)$$
To calculate the traveling time of the second ray, we must consider the path actually traversed by it. Let $t_2/2$ represent the traveling time of ray 2 from P to $M_2$. Due to the motion of the apparatus through the ether, the actual path is
$$l_2' = \sqrt{l_2^2 + (v t_2/2)^2}, \qquad (12.2)$$
where $v t_2/2$ represents the path length passed through by the ether wind in the time $t_2/2$. Thus, the traveling time of ray 2 from P to $M_2$ and back to P is given by
$$t_2 = \frac{2 l_2'}{c} = \frac{2}{c}\sqrt{l_2^2 + (v t_2/2)^2},$$
i.e.,
$$t_2 = \frac{2 l_2}{\sqrt{c^2 - v^2}}. \qquad (12.3)$$
From Eqs. (12.1) and (12.3) we obtain the travel-time difference of the two rays when they enter the telescope T,
$$\Delta t = t_1 - t_2 = \frac{2}{c}\left[\frac{l_1}{1 - v^2/c^2} - \frac{l_2}{\sqrt{1 - v^2/c^2}}\right]. \qquad (12.4)$$
If the entire apparatus is now rotated through 90°, the two light paths $l_1$ and $l_2$ interchange their relationships to the direction of motion through the ether. After this rotation, the travel-time difference of the two rays is given by
$$\Delta t' = \frac{2}{c}\left[\frac{l_1}{\sqrt{1 - v^2/c^2}} - \frac{l_2}{1 - v^2/c^2}\right]. \qquad (12.5)$$
The change of the travel-time difference produced by the rotation is then given by
$$\delta t = \Delta t - \Delta t' = \frac{2(l_1 + l_2)}{c}\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right], \qquad (12.6)$$
Fig. 12.1. Schematic diagram of the Michelson interferometer.
or, for small velocities ($v^2 \ll c^2$),
$$\delta t \approx \frac{l_1 + l_2}{c}\,\frac{v^2}{c^2}. \qquad (12.7)$$
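The size of this effect is easy to check numerically. The sketch below evaluates the exact expression (12.6) and the small-velocity form (12.7), and converts the time shift into a number of fringes by dividing by the optical period $\lambda/c$, as in Eq. (12.8). The function names are illustrative; the arm lengths, wavelength and orbital speed are the values quoted in the text for the 1881 and 1887 experiments.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def delta_t_exact(l1, l2, v, c=C):
    """Change in travel-time difference after a 90 degree rotation, Eq. (12.6)."""
    beta2 = (v / c) ** 2
    return 2.0 * (l1 + l2) / c * (1.0 / (1.0 - beta2) - 1.0 / math.sqrt(1.0 - beta2))

def delta_t_approx(l1, l2, v, c=C):
    """Small-velocity approximation, Eq. (12.7)."""
    return (l1 + l2) / c * (v / c) ** 2

def fringe_shift(l1, l2, v, wavelength, c=C):
    """Expected fringe shift: the time shift divided by the optical period."""
    return c * delta_t_exact(l1, l2, v, c) / wavelength

V_ORBIT = 3.0e4       # earth's orbital speed, m/s
WAVELENGTH = 5.9e-7   # m

fringe_1881 = fringe_shift(1.2, 1.2, V_ORBIT, WAVELENGTH)    # Michelson, 1881
fringe_1887 = fringe_shift(11.0, 11.0, V_ORBIT, WAVELENGTH)  # Michelson-Morley, 1887

print(f"1881 arms (1.2 m): expected shift = {fringe_1881:.3f} fringe")
print(f"1887 arms (11 m):  expected shift = {fringe_1887:.3f} fringe")
```

At $v/c \approx 10^{-4}$ the exact and approximate expressions agree to about one part in $10^8$, so the approximation is more than adequate for these estimates.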
This leads to the fringe shift
$$\delta = \frac{c\,\delta t}{\lambda} = \frac{l_1 + l_2}{\lambda}\,\frac{v^2}{c^2}, \qquad (12.8)$$
where $\lambda$ is the wavelength of the monochromatic light. If one uses the velocity of the earth in its orbit, $v \sim 30$ km/sec, $l_1 = l_2 = 1.2$ m and $\lambda \sim 5.9 \times 10^{-7}$ m, one has $\delta \sim 0.04$ fringe. This shows that, after the apparatus is rotated through an angle of 90°, the expected displacement of the interference fringes should be observable. In 1881 A. A. Michelson [1] first performed this experiment. No significant shift of the fringe pattern was observed after rotating the apparatus through 90°. Considering the accuracy of the measurement, the null result implies that the velocity of an ether wind should be less than 21.2 km/sec if the ether theory is correct. This null result was completely unexpected and very disappointing to physicists. Lord Rayleigh urged Michelson to carry out a more precise experiment. Later, in 1887, Michelson and Morley [2] performed a new set of measurements with technical improvements. The massive base of the apparatus floated in mercury and could thus be rotated, without disturbance, about a vertical axis through any desired angle. The length of the
light path was $l_1 = l_2 = 11$ m, and the light wavelength was $\lambda = 5.9 \times 10^{-7}$ m. If the velocity of the ether wind is taken to be $v = 30$ km/sec, the earth's orbital velocity, the shift of the fringes should then be $\delta \sim 0.37$ according to Eq. (12.8). After they rotated the apparatus, the largest of the observed fringe shifts was less than 0.01. Half a year later, when the earth's orbital velocity had the opposite direction, they again performed a set of measurements and still did not observe any shift. The experimental accuracy of the null result in this experiment gave an upper limit of 4.7 km/sec for the velocity of the ether wind.

In order to explain the absence of any effect due to the motion of the ether in Michelson's experiment, FitzGerald (1889), Lorentz (1892) [3] and Lodge (1893) [4] independently put forward the hypothesis that any rigid body moving through the ether with a velocity $v$ is contracted in its direction of motion by a factor $(1 - v^2/c^2)^{1/2}$. (This is called the FitzGerald-Lorentz contraction.) The length of the path $PM_1$ in Michelson's equal-arm interferometer is then not $l$, but $l(1 - v^2/c^2)^{1/2}$, while the length of $PM_2$ is unchanged, since $PM_2$ is at right angles to the direction of motion of the apparatus. Instead of (12.1), one obtains
$$t_1 = \frac{2l}{c\sqrt{1 - v^2/c^2}} = t_2,$$
while $t_2$ is still given by (12.3) with $l_2 = l_1 = l$. In this case the phase difference $\delta$ vanishes, in agreement with Michelson's null experiment. This contraction hypothesis was regarded as an immediate forerunner of Einstein's theory of relativity. However, it should be emphasized that the concept of FitzGerald and Lorentz contradicts the fundamental principle of relativity, because the FitzGerald-Lorentz contraction is absolute, while the Lorentz contraction in Einstein's theory is relative.

After Einstein developed his theory of special relativity, many experiments of the Michelson-Morley type with improved experimental conditions were carried out by several physicists. For instance, Kennedy (1926) [5] and Illingworth (1927) [6] independently performed measurements by means of a Michelson interferometer with unequal arm lengths (i.e., $l_1 \neq l_2$). Their measurements gave negative results, which lead to upper limits of 5.1 km/sec and 2.3 km/sec on the velocity of the ether wind, respectively. In a repetition of the Michelson-Morley experiment carried out by Joos [7] in 1930, the sensitivity of the arrangement was extended to the point where an ether wind of velocity as small as 1.5 km/sec could have been observed. Again the observation gave a null result, which implies an upper limit of 1.5 km/sec for the velocity of the ether wind. We may remark that a positive result for a measurement of the Michelson-Morley type was obtained by Miller [8] in 1933, which implied the "ether
wind" velocity to be about 10 km/sec. However, Shankland et al. (1955) [9] analyzed this experiment carefully and concluded that the observed fringe shift was produced by statistical fluctuations and temperature variations. Another experiment was performed by Shamir and Fox [10] in 1969, in which the two arms of the interferometer consisted of a transparent solid and the light source was a He-Ne laser. Their apparatus could detect a fringe shift of $10^{-5}$. When the apparatus was rotated through 90°, no shift of the fringes was detected.

All these experiments depend on the interference of light (the Michelson interferometer). In contrast, L. Essen [11] performed an experiment of the Michelson-Morley type in 1955 by using a microwave resonator cavity as the arm of an interferometer. His null result gave an upper limit of 2.5 km/sec for the velocity of the ether wind. In 1964, Jaseja et al. [12] performed measurements with two He-Ne masers, which were equivalent to a Michelson-Morley experiment with improved precision. Their null result gave an upper limit of 0.95 km/sec for the velocity of the ether wind.

In the above experiments of the Michelson-Morley type, the paths of light are all closed [13, 14]. The null results can be easily explained by the isotropy of the two-way speed of light, regardless of any assumptions concerning the one-way speed. The constancy of the two-way speed of light can be formulated by the relation
$$c_n = \frac{c}{1 - \mathbf{q}\cdot\mathbf{e}}, \qquad (12.9)$$
where $c_n$ denotes a non-isotropic one-way speed of light propagating along the direction $\mathbf{e}$, $c$ is the constant average velocity of light along a closed path (i.e., the two-way speed of light), $q = |\mathbf{q}|$ is a constant limited by $-1 < q < +1$, and $\mathbf{e}$ is a unit vector in the direction of light propagation. In order to explain the closed-path experiments, we calculate the time taken by a light ray to traverse a closed path:
$$t = \oint \frac{dl}{c_n} = \oint \frac{dl}{c} - \oint \frac{\mathbf{q}\cdot\mathbf{e}\,dl}{c}. \qquad (12.10)$$
Using Stokes's theorem, the integral in the second term on the right-hand side can be written as
$$\oint \mathbf{q}\cdot\mathbf{e}\,dl = \oint \mathbf{q}\cdot d\mathbf{l} = \iint (\nabla\times\mathbf{q})\cdot d\mathbf{s} = 0. \qquad (12.11)$$
Here $\nabla\times\mathbf{q} = 0$ since $\mathbf{q}$ is a constant vector independent of the space-time coordinates. In general, the result (12.11) holds if $\mathbf{q} = \nabla A(\mathbf{r},t)$, where $A(\mathbf{r},t)$ is an arbitrary function. Thus, the time for traversing a closed path, equation (12.10), becomes
$$t = \oint \frac{dl}{c}.$$
This is nothing but the definition of the average speed of light along the closed path, i.e., the directionality parameter $\mathbf{q}$ has no effect in these experiments. Therefore, the Michelson-Morley-type experiments are tests of the isotropy of the two-way speed of light rather than tests of the isotropy of the one-way speed of light. They can be interpreted as support for broad Lorentz and Poincaré invariance [15].
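The cancellation expressed by (12.10) and (12.11) can be illustrated with a short numerical sketch. It assumes a constant vector $\mathbf{q}$ and a closed path made of straight segments in a plane, with the one-way speed on each segment taken from (12.9). The function names and the particular paths are illustrative choices, and units are chosen so that $c = 1$.

```python
import math

def closed_path_time(vertices, q, c=1.0):
    """Travel time around a closed polygon when the one-way speed is c/(1 - q.e)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        ex, ey = dx / length, dy / length          # unit vector e along the segment
        c_n = c / (1.0 - (q[0] * ex + q[1] * ey))  # one-way speed, Eq. (12.9)
        total += length / c_n
    return total

def perimeter(vertices):
    """Total length of the closed polygon."""
    n = len(vertices)
    return sum(
        math.hypot(vertices[(i + 1) % n][0] - vertices[i][0],
                   vertices[(i + 1) % n][1] - vertices[i][1])
        for i in range(n)
    )

q = (0.3, -0.2)                              # constant directionality vector, |q| < 1
arm = [(0.0, 0.0), (11.0, 0.0)]              # out-and-back path along one arm
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]

for name, path in (("arm", arm), ("triangle", triangle)):
    print(name, closed_path_time(path, q), perimeter(path))
```

On the out-and-back arm the outgoing leg takes 7.7 time units and the return leg 14.3, yet the total is the same 22 units that an isotropic speed $c = 1$ would give. This is the sense in which a closed-path experiment is insensitive to $\mathbf{q}$.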
Morley
Michelson and Einstein
REFERENCES
[1] A. A. Michelson, Am. J. Sci. 22 (1881) 120.
[2] A. A. Michelson and E. W. Morley, Am. J. Sci. 34 (1887) 333; Phil. Mag. 24 (1887) 449.
[3] G. F. FitzGerald, Am. J. Sci. 13 (1889) 349; H. A. Lorentz, Verh. K. Akad. Wet. 1 (1892) 74.
[4] O. Lodge, Phil. Trans. R. Soc. A 184 (1893) 727.
[5] R. J. Kennedy, Proc. Natl. Acad. Sci. 12 (1926) 621.
[6] K. K. Illingworth, Phys. Rev. 30 (1927) 692.
[7] G. Joos, Ann. Physik 7 (1930) 385.
[8] D. C. Miller, Revs. Mod. Phys. 5 (1933) 203.
[9] R. S. Shankland et al., Revs. Mod. Phys. 27 (1955) 167.
[10] J. Shamir and R. Fox, Nuovo Cimento 62B (1969) 258.
[11] L. Essen, Nature 175 (1955) 793.
[12] T. S. Jaseja, A. Javan, J. Murray, and C. H. Townes, Phys. Rev. A 133 (1964) 1221.
[13] E. W. Silvertooth, J. Opt. Soc. Amer. 62 (1972) 1330. This and the next reference [14] are closed light-path experiments which differ from those of the Michelson-Morley type.
[14] W. S. N. Trimmer, R. F. Baierlein, J. E. Faller, and H. A. Hill, Phys. Rev. D8 (1973) 3321.
[15] See also J. P. Hsu and Leonardo Hsu, in Chapter 10, footnote 3.
Chapter 13 The Wilson-Wilson Experiment

Long before 1905, electromagnetic phenomena for moving bodies (especially magnetic dielectric materials) were difficult problems that had been investigated by many people. In 1831 M. Faraday [1] first studied the so-called "unipolar induction". He found that a steady electric current was produced in a stationary conducting wire placed near a rotating magnet. Although this discovery had been widely applied in engineering for the construction of electric generators (i.e., unipolar machines), many different explanations of the "unipolar induction" were in dispute for a long time [2,3]. The electric and magnetic effects, which are related only to the velocities of moving bodies relative to an observer, appear to have a certain symmetry. Classical electrodynamics with Galilean transformations cannot explain such a symmetry. This symmetry was one of Einstein's motives for developing his theory of special relativity, as stressed in his first paper in 1905.

In principle, the problem of the electrodynamics of moving bodies was solved in 1905 by Einstein and Poincaré. That is, the covariant Maxwell equations, the Lorentz force and the basic equations of motion for charged particles were obtained and discussed by Poincaré and Einstein. (Einstein's resultant equation for charged particles was not invariant, in contrast to Poincaré's invariant result, and was corrected by Planck in 1906. See Chapter 2.) Later, Minkowski (1908) [4] proved that the equations and the boundary conditions for the phenomenological electrodynamics of moving media (or bodies) can be derived from Maxwell's equations and the boundary conditions for media at rest, within the framework of special relativity. The results of measurements of electromagnetic induction are only of the first order in $v/c$, where $v$ is the velocity of the moving body.
Predictions of the 'Maxwell-Minkowski electrodynamics' based on special relativity are in agreement with, say, the Wilson-Wilson experiment[5]. Consequently, these experimental results have been considered as tests and confirmations of special relativity. Some authors claimed that these experiments test and confirm Einstein's simultaneity. However, this is not an unambiguous test [6] because, as stressed in [1] a 'generalized Lorentz transformation' can also lead to the same results. We may remark that electromagnetic experiments, with the exception of Fresnel
drag experiments, have not been repeated for a long time because of the difficulty of improving the experimental accuracy.

In order to test the electrodynamics of moving media, Einstein and Laub (1908) proposed a possible experiment. Suppose one has an insulating slab of material with a dielectric constant $\varepsilon$ and a permeability $\mu$. When the slab moves through a uniform magnetic field $\mathbf{H}$ with a constant velocity $\mathbf{v}$, there is a potential difference between two opposite faces of the slab (see Fig. 13.1). In 1913, M. Wilson and H. A. Wilson of the Rice Institute (Houston, Texas) performed a measurement "on the electric effect of rotating a magnetic insulator in a magnetic field" [5]. A schematic diagram of the Wilson-Wilson experiment is shown in Fig. 13.1. If one assumes a local equivalence between a rotation and a constant linear motion, then the Wilson-Wilson experiment is equivalent to the experiment discussed by Einstein and Laub.

In the Wilson-Wilson experiment the magnetic field is static, i.e., $\partial\mathbf{H}/\partial t = 0$. In the laboratory frame, Maxwell's equations lead to $\nabla\times\mathbf{E} = 0$. Using Stokes's theorem, one has
$$\oint \mathbf{E}\cdot d\mathbf{l} = \iint (\nabla\times\mathbf{E})\cdot d\mathbf{s} = 0, \qquad (13.1)$$
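where the closed integration path is AVBA. Since $\mathbf{E}$ vanishes outside the condenser, i.e., $\mathbf{E}_{\rm ext} = 0$, (13.1) can be written as
$$\oint_{AVBA} \mathbf{E}\cdot d\mathbf{l} = \int_{AVB} \mathbf{E}_{\rm ext}\cdot d\mathbf{l} + \int_{BA} \mathbf{E}_{\rm int}\cdot d\mathbf{l} = \int_{BA} \mathbf{E}_{\rm int}\cdot d\mathbf{l} = 0. \qquad (13.2)$$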
This implies $(E_{\rm int})_z = 0$ inside the medium. By using the first-order approximation of the constitutive relations [1],
$$\mathbf{D} = \varepsilon\mathbf{E} + (\varepsilon\mu - 1)\,\frac{1}{c}\,\mathbf{v}\times\mathbf{H}, \qquad (13.3)$$
we obtain
$$D_z = (\varepsilon\mu - 1)\,\frac{v}{c}\,H, \qquad (13.4)$$
where we have used $\mathbf{H} = (0, H, 0)$. These results for $\mathbf{D}$ are obtained by using the Lorentz transformation of the electromagnetic fields $(\mathbf{E}, \mathbf{B}; \mathbf{D}, \mathbf{H})$ and the constitutive relations $\mathbf{D}' = \varepsilon\mathbf{E}'$ and $\mathbf{B}' = \mu\mathbf{H}'$. Because $\mathbf{D}_{\rm ext} = 0$, there is an electric field inside the condenser:
$$\frac{(D_{\rm int})_z}{\varepsilon} = \mu\left(1 - \frac{1}{\varepsilon\mu}\right)\frac{v}{c}\,H. \qquad (13.5)$$
This will lead to a potential difference between the two plates of the condenser. This potential difference is proportional to $(1 - 1/\varepsilon\mu)$.

As shown in Fig. 13.1, the apparatus of the Wilson-Wilson experiment consists of a hollow cylinder made of sealing wax in which steel balls are embedded,
Legend of the apparatus drawing: AA, outer coating of cylinder; BB, inner coating of cylinder; P, driving pulley; SS, solenoid; C, brush on outer coating; D, brush on inner coating; QQ, conical bearings; XXXX, water jacket; EEEE, ebonite bushings supporting brush rods; WW, bars on brush rods; K, key for earthing outer coating; TTTT, metallic screen; H, wire leading to electrometer; R, R', resistances of 90 to 10 ohms; M, key; S, dry cell; V, voltmeter; YY, fibre tube insulating BB from shaft.

Fig. 13.1. Schematic diagram of the Wilson-Wilson experiment (with the assumption of local equivalence of rotation and constant linear motion). A large condenser is filled with a medium characterized by $(\varepsilon, \mu)$. The two plates of the condenser are connected to the terminals of a voltmeter. There is a uniform magnetic field $\mathbf{H}$ along the y-axis. When the condenser and medium move with a velocity $\mathbf{v}$ along the x-axis, there is a potential difference between the two plates of the condenser due to the z-component of the internal electric field, $(D_{\rm int})_z/\varepsilon$.
where the dielectric constant and relative permeability of the cylinder are respectively $\varepsilon = 6.0$ and $\mu = 3.0$. The inner and outer metal coatings of the cylinder form a condenser. In the experiment the cylindrical condenser rotates about its axis, and the magnetic field is parallel to the axis. The Wilson-Wilson experiment gave an average value of $(1 - 1/\varepsilon\mu) = 0.96$, which is consistent with the theoretical value 0.944 based on relativity theory with 4-dimensional symmetry. It can be considered as support for broad Lorentz and Poincaré invariance, provided there is a local equivalence between rotation and constant linear motion.

However, Pellegrini and Swift re-analyzed the Wilson-Wilson experiment in 1995. They used a rotating coordinate system directly, without assuming a local equivalence between rotation and constant linear motion. Their conclusion challenged the conventional interpretation that the Wilson-Wilson experiment supports special relativity. They concluded that the Lorentz transformation cannot be applied to the Wilson-Wilson experiment, which involves rotation [6].
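The comparison between the measured and predicted factors is simple arithmetic, sketched below with the material constants quoted above. The function names are illustrative; `internal_field` evaluates the first-order expression (13.5) and is included only to show that it agrees with (13.4) divided by $\varepsilon$.

```python
def wilson_factor(eps, mu):
    """The factor (1 - 1/(eps*mu)) that governs the potential difference."""
    return 1.0 - 1.0 / (eps * mu)

def internal_field(eps, mu, beta, H):
    """First-order internal field of Eq. (13.5); beta = v/c, H in arbitrary units."""
    return mu * wilson_factor(eps, mu) * beta * H

EPS, MU = 6.0, 3.0  # values quoted for the sealing-wax and steel-ball cylinder
MEASURED = 0.96     # average value reported for the experiment

predicted = wilson_factor(EPS, MU)
relative_deviation = (MEASURED - predicted) / predicted

print(f"predicted factor:   {predicted:.4f}")
print(f"measured factor:    {MEASURED:.2f}")
print(f"relative deviation: {relative_deviation:.2%}")
```

With these values the measured factor lies a little under two percent above the prediction.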
Harold Albert Wilson (1874-1964) (Photographed by his daughter Joan Wilson Shened in 1954. Photo courtesy of Prof. S. A. Dodds, Rice University)
REFERENCES
[1] M. Faraday, Experimental Researches in Electricity, Vol. I (London, 1855), pp. 225-230.
[2] R. Becker, Electromagnetic Fields and Interactions, Vol. I: Electromagnetic Theory and Relativity (Blaisdell, New York, 1964), §§71, 86, 87, and 88.
[3] A. Sommerfeld, Electrodynamics (Academic Press, New York, 1952); J. Djuric, J. Appl. Phys. 46 (1975) 679.
[4] H. Minkowski, Nachr. Ges. Wiss. Gottingen 53 (1908).
[5] M. Wilson and H. A. Wilson, Proc. Roy. Soc. (A) 89 (1913) 99. See also H. A. Wilson, Phil. Trans. A 204 (1904).
[6] For an interesting analysis of the Wilson-Wilson experiment based on a rotating coordinate system, see G. N. Pellegrini and A. R. Swift, Am. J. Phys. 63 (1995) 694. These authors argued that this experiment does not test special relativity, since the experiment involved rotation rather than uniform translation. One of their conclusions was that either the experiment was wrong or the theoretical analysis must be modified.
H.A. Wilson
Chapter 14 The Kennedy-Thorndike Experiment

In 1932, R. J. Kennedy and E. M. Thorndike [1] performed an important variation of the Michelson-Morley experiment. Their experiment differed from that of Michelson and Morley in two main respects: (a) The difference in length of the two arms of the Michelson interferometer was made large. (b) The interferometer was held fixed in a single direction in the laboratory, and the interference fringes were observed over a period of months.

The experiment performed by Kennedy and Thorndike was very difficult because it required the apparatus to be extremely stable for a long time [2]. They looked for shifts in the interference pattern over many days at different times of the year, but they never detected any seasonal or diurnal shifts of the interference fringes. Therefore, it is as if they had moved the apparatus from one inertial frame to another, running the experiment with the apparatus at rest each time, and observed no shift in the interference fringes in each frame. The usual interpretation of this null result is that the time (in seconds) it takes light to make a round trip along the extra length of the longer interferometer arm is the same in all inertial frames. The usual conclusion in terms of the speed of light is that "the observations say that the round-trip speed of light has the same numerical magnitude [i.e., 299,792,458 m/s] in inertial frames of reference having a relative velocity of 60 kilometers/second" [3]. Moreover, this experiment is important because its null result excludes the original hypothesis of the FitzGerald-Lorentz contraction of length.

However, Leon Hsu [4] has given a concise and interesting re-interpretation of the Kennedy-Thorndike experiment strictly in terms of lengths.
Such an interpretation offers a new view of what can be inferred from the result of this experiment: The observed interference pattern in this type of experiment (including the Michelson-Morley experiment) depends on the phase difference of the beams from the two arms of the interferometer when they recombine. This phase difference depends on the difference in the optical path lengths of the two arms. The experimental result that no significant shift was ever observed in the interference fringes implies that the difference in the optical path lengths of the two arms is the same in all inertial
frames. Here, the arm length is measured in units of the wavelength of the monochromatic light used in the experiment [5]. Based on this experiment, one can conclude that "the extra round-trip distance traveled by a light pulse along the longer interferometer arm must have the same numerical value in all inertial frames when expressed in units of light wavelengths" [4]. This interpretation indicates that the Kennedy-Thorndike experiment can be understood solely in terms of the Principle of Relativity for physical laws. In other words, the experiment is consistent with the broad Lorentz and Poincare invariance or any relativity theory with 4-dimensional symmetry of the Lorentz and Poincare groups [6]. The argument is as follows: The Principle of Relativity implies that a meter stick at rest in a frame F and measured by observers in F has the same length as that at rest in F' and measured by observers in F'. Therefore, the extra round-trip distance that light must travel along the longer interferometer arm and the wavelength of the monochromatic light used in the experiment must both be the same in all inertial frames when measured in terms of meters or any other standard of length. Since the interference pattern generated depends on the number of extra wavelengths which can be fit into the extra round-trip distance of the longer interferometer arm, the pattern must be the same in all inertial frames. Consequently, one must have a null result. We stress that, by interpreting the experiments in terms of lengths alone, we did not need to make any postulates regarding the speed of light. Following this re-interpretation, Leon Hsu points out that "the difference in the implications of the Kennedy and Thorndike experiments for the speed of light as expressed in meter/meter or meter/second reveals a fundamental difference between measuring time in units of meters or seconds. 
The properties of a clock which measures time in units of meters [7] is based on a physical phenomenon, the propagation of light in a vacuum. The behavior of such clocks in different inertial frames is determined by properties of our universe which cannot be changed by humans. However, the properties of a clock which measures time in units of seconds is based on human convention. The behavior of such clocks in different inertial frames is determined by a postulate regarding the value we should measure for the speed of light (in meter/second)."
REFERENCES
[1] R. J. Kennedy and E. M. Thorndike, Phys. Rev. 42 (1932) 400.
[2] In 1990, Hils and Hall carried out an improved Kennedy-Thorndike experiment by searching for sidereal variations between the frequency of a laser locked to an $I_2$ reference line and a laser locked to the resonance frequency of a highly stable cavity. No variations were found at the level of $2 \times 10^{-13}$. See D. Hils and J. L. Hall, Phys. Rev. Lett. 64 (1990) 1697.
[3] E. F. Taylor and J. A. Wheeler, Spacetime Physics, 2nd ed. (W. H. Freeman, New York, 1992), p. 88.
[4] Leon Hsu, A re-interpretation of two basic experiments of special relativity in terms of length, preprint, Univ. of Minnesota, 2001. We would like to thank Leon Hsu for his permission to use materials from his paper. For a more detailed discussion, we refer to the original paper.
[5] One important issue is the use of units. Since the universal constancy of the speed of light is not assumed in Leon Hsu's re-interpretation, the conventional definition of the meter in terms of the speed of light is no longer useful. (After 1983, the meter is, by convention, defined as the distance traveled by light in 1/299,792,458 of a second in the vacuum.) Here, we will use a definition of the meter from the past: the distance between two marks on a platinum bar stored near Paris. This old definition of length involves fewer assumptions. Based on such an analysis, one can see that this type of experiment is insufficient to conclude that the round-trip speed of light (in meter/second) must be a universal constant in all inertial frames. See ref. 4.
[6] See J. P. Hsu and Leonardo Hsu, in Chapter 10.
[7] See Figure 2 on page xxvi for a clock which measures time in units of meters.
Chapter 15 The Ives-Stilwell Experiment

The Doppler effect can be derived from the transformation of the wave 4-vector $(k_0, \mathbf{k})$. In particular, we have
$$k_0' = k_0\,\frac{1 - \beta\cos\theta}{\sqrt{1 - \beta^2}}, \qquad (15.1)$$
where $k_0'^2 - \mathbf{k}'^2 = 0$, $k_0^2 - \mathbf{k}^2 = 0$, $k' = 2\pi/\lambda'$ and $k = |\mathbf{k}| = 2\pi/\lambda$. In special relativity, one has the relations $\beta = v/c$, $k_0' = \omega'/c$ and $k_0 = \omega/c$. However, equation (15.1) holds for all relativity theories with the 4-dimensional symmetry of the Lorentz and Poincaré groups. In other words, 4-dimensional symmetry dictates that (15.1) results from the transformation of the wave 4-vector $(k_0, \mathbf{k})$.

About 90 years ago, Einstein and Ritz first suggested that an experiment might be carried out to measure the wavelength of light emitted at right angles to the direction of motion by fast-moving atoms. Such an experiment has commonly been perceived as performed by observing light emitted transversely to the direction of motion of the atoms. However, it would be extremely difficult to make sure that the observation was made exactly at right angles to the direction of the rays, and very small deviations from this direction would introduce a first-order effect which would swamp the result being sought. This can be seen clearly by expanding (15.1) as a power series in $\beta$. Omitting terms of order higher than $\beta^2$ and using the relations $k' = 2\pi/\lambda' = 2\pi/\lambda_0$ and $k = 2\pi/\lambda$, we have
$$\lambda = \lambda_0\left(1 - \beta\cos\theta + \tfrac{1}{2}\beta^2\right), \qquad \lambda' = \lambda_0. \qquad (15.2)$$
The light-emitting atoms are assumed to be at rest in the F' frame, so that the wavelength $\lambda'$ measured by the F'-observers is the same as the wavelength $\lambda_0$ emitted by the same type of atoms at rest in the F frame and measured by the F-observers, i.e., $\lambda' = \lambda_0$. The second and third terms in (15.2) are respectively called the longitudinal and transverse shifts. The angle $\theta$ is
the direction of propagation of the light relative to the atoms' velocity as measured in the laboratory frame F. At room temperature, hydrogen atoms move with a speed of $v \sim 10^6$ m/sec, or $\beta \sim 1/300$. If $\theta = 90.2°$ in (15.2), the first-order shift $\lambda_0\beta\cos\theta \sim \lambda_0\beta\pi/900 \sim \lambda_0(1/300)^2$ is of the same order as the second-order shift $\lambda_0\beta^2$.

In 1938, Ives and Stilwell [1] carried out an experiment in which this difficulty was avoided by measuring light emitted near 0° and 180°, where the longitudinal shift changes slowly with angle. A hydrogen discharge tube was used to produce $H_2^+$ and $H_3^+$ ions in the Ives-Stilwell experiment. Ions of $H^+$ were not observed because they were immediately captured by hydrogen molecules to form $H_3^+$. The $H_2^+$ and $H_3^+$ ions were accelerated through a potential of about $10^4$ volts. By neutralization and dissociation, they produced neutral but still excited hydrogen atoms, which emitted hydrogen lines of the Balmer series. The wavelength was measured using a diffraction grating and photographic recording.

Ives and Stilwell measured the Doppler shift of light rays emitted near 0° and 180° by hydrogen atoms. A schematic diagram of this experiment is given in Fig. 15.1. For light rays emitted by moving atoms in two opposite directions, Eq. (15.1) leads to
$$\lambda = \frac{\lambda_0\,(1 - (v/c)\cos\theta)}{\sqrt{1 - v^2/c^2}}, \qquad (15.3a)$$
$$\lambda_r = \frac{\lambda_0\,(1 + (v/c)\cos\theta)}{\sqrt{1 - v^2/c^2}}, \qquad (15.3b)$$
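where $\lambda_r$ is the wavelength of the reflected light wave. The quantities $\lambda_+$ and $\lambda_-$ are defined as
$$\lambda_+ \equiv \tfrac{1}{2}(\lambda_r + \lambda) = \frac{\lambda_0}{\sqrt{1 - v^2/c^2}}, \qquad (15.4)$$
$$\lambda_- \equiv \tfrac{1}{2}(\lambda_r - \lambda) = \frac{\lambda_0\,(v/c)\cos\theta}{\sqrt{1 - v^2/c^2}}. \qquad (15.5)$$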
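These relations can be checked numerically. The sketch below evaluates (15.3a) and (15.3b), forms the combinations (15.4) and (15.5), and repeats the estimate that at $\theta = 90.2°$ the first-order shift is as large as the second-order one. The rest wavelength is an illustrative choice (roughly the H-beta line) and is an assumption, not a value taken from the text; $\beta = 1/300$ and the 7° viewing angle are the values quoted in the text and in Fig. 15.1.

```python
import math

def doppler_wavelengths(lam0, beta, theta):
    """Direct and reflected wavelengths, Eqs. (15.3a) and (15.3b)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    lam = lam0 * gamma * (1.0 - beta * math.cos(theta))
    lam_r = lam0 * gamma * (1.0 + beta * math.cos(theta))
    return lam, lam_r

lam0 = 486.1e-9            # illustrative rest wavelength, m (assumption)
beta = 1.0 / 300.0         # value quoted in the text
theta = math.radians(7.0)  # viewing angle quoted in Fig. 15.1
gamma = 1.0 / math.sqrt(1.0 - beta**2)

lam, lam_r = doppler_wavelengths(lam0, beta, theta)
lam_plus = 0.5 * (lam_r + lam)   # Eq. (15.4): independent of theta
lam_minus = 0.5 * (lam_r - lam)  # Eq. (15.5): first-order shift

print(f"lam_plus - lam0     = {lam_plus - lam0:.4e} m")
print(f"0.5*lam0*beta^2     = {0.5 * lam0 * beta**2:.4e} m")
print(f"lam_minus           = {lam_minus:.4e} m")

# Near 90 degrees a tiny angular error gives a first-order term of second-order size.
first_order = lam0 * beta * abs(math.cos(math.radians(90.2)))
second_order = lam0 * beta**2
ratio = first_order / second_order
print(f"first/second order at 90.2 deg = {ratio:.2f}")
```

The mean wavelength $\lambda_+$ carries only the second-order shift, whatever the viewing angle, which is why the direct and reflected lines were measured together.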
Equation (15.4) shows that the relation between $\lambda_+$ and $\lambda_0$ is the same as that of the transverse Doppler effect. It also shows that, to second order, we have
$$\lambda_+ - \lambda_0 \approx \frac{1}{2}\,\lambda_0\,\frac{v^2}{c^2}, \qquad (15.6)$$
which can be compared to the results of the Ives-Stilwell experiment. Values of the three different wavelengths $\lambda_0$, $\lambda$ and $\lambda_r$ can be measured by experiment, and therefore the values of the left-hand side of equation (15.6) are regarded as the observed values, while the right-hand side can be calculated as a theoretical prediction by using either of the following two methods. The
Chap. 15. The Ives-Stilwell Experiment
«rf&&
533
^ &&
\
CANAL RAY
W**0*
F i g . 1 5 . 1 . Schematic diagram of the Ives-Stilwell experiment. The light is emitted by high speed hydrogen atoms, and also by atoms at rest. It is viewed in a direction making a small angle (6° in Otting's work, 7° in other) with the direction of the canal rays. The light is observed directly and by reflection from a mirror on the line of sight.
first is an optical means, the right-hand side of (15.6) is expressed in terms of the observed wavelengths A0, A and Ar: lAt,».,(Ar-A)a 2 V-8Aocos'0-
(
j
The second method is to calculate the right-hand side of (15.6) by use of electrodynamic quantities. Let $V$ denote the voltage between two electrode plates, $e$ be the charge of the accelerated ion, and $M_0$ be the mass of the particle observed. The equation of motion for the accelerated particle leads to $eV = \frac{1}{2}M_0v^2$, and then we have
$$\frac{v^2}{c^2} = \frac{2eV}{M_0c^2}, \tag{15.8}$$
$$\frac{1}{2}\lambda_0\frac{v^2}{c^2} = \frac{\lambda_0\,eV}{M_0c^2}. \tag{15.9}$$
Values for the left-hand side of Eq. (15.6) were measured during the experiment, while the right-hand side can be calculated by (15.7) or (15.9). Measurements performed by Ives and Stilwell (1938) [1] are in agreement with equations (15.6)-(15.9). The original purpose of this experiment was not to test special relativity, because these equations could also be obtained from the contraction ether theory. In fact, Ives [2] emphasized that he did not employ the constancy of the speed of light, but used the ether theory for the prediction. However, this experiment can be interpreted as a high accuracy test of the second-order Doppler shift, giving strong support to broad Lorentz and Poincare invariance.

Fig. 15.2. Spectrograms obtained for several voltages (panel labels: 7859 Volts, 13702 Volts, 207055 Volts).

In their experiment, Ives and Stilwell tested the agreement between (15.5) and (15.8). The results are shown in Table 15.1, where $v$ in the quantity "$\lambda_0 v/c$ computed" in column 4 was evaluated by Eq. (15.8); the values of $\lambda_-$ in the quantity "$\lambda_-/\cos 7^\circ$" in column 5 were obtained from Eq. (15.5) by use of the observed values of $\lambda_r$ and $\lambda$. Table 15.1 shows the agreement between the computed and observed values.

Table 15.1

Plate   Voltage   Line   $\lambda_0 v/c$ (Å), computed   Mean $\Delta\lambda$ (Å), observed ($\Delta\lambda/\cos 7^\circ$)
169     6788      H3     10.62                            10.35
160     7780      H2     14.04                            14.02
163     9187      H2     15.30                            15.40
170     10574     H2     16.34                            16.49
165     11566     H3     13.88                            14.07
172     13560     H2     18.50                            18.67
172     13560     H3     15.05                            15.14
177     18350     H2     21.55                            21.37

The final results obtained by Ives and Stilwell are shown in Fig. 15.2 and Table 15.2. The figure shows typical spectrograms obtained for several applied voltages. In each, the central, undisplaced line is seen, accompanied at either side by two companions, which, by their separations from the center line by distances in the ratio $\sqrt{2}/\sqrt{3}$, are identified as H2 and H3. In the table, the values in column 4 were calculated from Eq. (15.9); the values in column 5 were obtained from Eq. (15.7); in column 6 the quantity $\Delta\lambda$ is defined by the left-hand side of Eq. (15.6), i.e., $\Delta\lambda = \lambda_+ - \lambda_0$. Table 15.2 shows that the experimental values agree with the prediction of (15.1). Several similar measurements were carried out later [3-6].

Table 15.2

Plate   Voltage   Line   $\lambda_0 v^2/2c^2$ (Å), computed from voltage   $\lambda_0 v^2/2c^2$ (Å), computed from observed $\Delta\lambda$   $\Delta\lambda$ (Å), observed
169     6788      H3     0.0116                                             0.0109                                                              0.011
160     7780      H2     0.0203                                             0.0202                                                              0.0185
163     9187      H2     0.0238                                             0.0243                                                              0.0225
170     10574     H2     0.0275                                             0.0280                                                              0.027
165     11566     H3     0.0198                                             0.0203                                                              0.0205
172     13560     H2     0.0352                                             0.0360                                                              0.0345
172     13560     H3     0.0233                                             0.0237                                                              0.0215
177     18350     H2     0.0478                                             0.0469                                                              0.047
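
The "computed from voltage" columns of Tables 15.1 and 15.2 can be approximately reproduced from the nonrelativistic relation $eV = \frac{1}{2}M_0v^2$. The sketch below is only an illustration under stated assumptions: the rest wavelength is taken to be that of the H-beta line (4861.3 Å), the accelerated particle is assumed to carry one elementary charge and to have the mass of two or three hydrogen atoms (for the H2 and H3 lines), and the function name `shifts_from_voltage` is hypothetical.

```python
import math

LAMBDA0 = 4861.3     # angstroms; H-beta rest wavelength (assumption)
M_H_C2 = 938.78e6    # eV; approximate rest energy of one hydrogen atom (assumption)

def shifts_from_voltage(volts, n_atoms):
    """Return (lambda0*v/c, lambda0*v^2/(2 c^2)) in angstroms.

    Uses eV = (1/2) M0 v^2 with M0 equal to n_atoms hydrogen masses and a
    single elementary charge, so that v^2/c^2 = 2 eV / (M0 c^2).
    """
    beta_sq = 2.0 * volts / (n_atoms * M_H_C2)
    return LAMBDA0 * math.sqrt(beta_sq), 0.5 * LAMBDA0 * beta_sq

# (voltage, atoms per ion, tabulated first-order value, tabulated second-order value)
rows = [
    (6788, 3, 10.62, 0.0116),
    (7780, 2, 14.04, 0.0203),
    (9187, 2, 15.30, 0.0238),
    (18350, 2, 21.55, 0.0478),
]

for volts, n_atoms, tab_first, tab_second in rows:
    first, second = shifts_from_voltage(volts, n_atoms)
    print(f"{volts:6d} V  H{n_atoms}: {first:6.2f} (table {tab_first}), "
          f"{second:.4f} (table {tab_second})")
```

Under these assumptions the computed numbers land within a couple of percent of the tabulated ones; the residual differences presumably reflect details of the original voltage calibration that are not reproduced here.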
In 1979, Hasselkamp, Mondry, and Scharmann [7] performed a new experiment which differed from all previous ones in an important way: they measured the second-order Doppler shift by directly observing light emitted perpendicular to the direction of a linearly moving light source. Their result was also consistent with equation (15.1).
REFERENCES
[1] H. E. Ives and G. R. Stilwell, J. Opt. Soc. Am. 28 (1938) 215.
[2] H. E. Ives, J. Opt. Soc. Am. 27 (1937) 389.
[3] H. E. Ives and G. R. Stilwell, J. Opt. Soc. Am. 31 (1941) 369.
[4] G. Otting, Physik. Z. 40 (1939) 681.
[5] W. Kantor, Spectr. Letters 4 (1971) 61.
[6] H. I. Mandelberg and L. Witten, J. Opt. Soc. Am. 52 (1962) 529.
[7] D. Hasselkamp, E. Mondry, and A. Scharmann, Z. Physik A 289 (1979) 151.

Chapter 16 The Observation of the Muon Lifetime Dilation

Let us consider a particle 1 which decays into particles 2 and 3, $1 \to 2 + 3$, in a general inertial frame $F(w, x, y, z)$. Within special relativity, one has the relation $w = ct$. The time dilation of special relativity implies that the mean decay lifetime of particle 1 moving with a velocity $v_1$ would increase by a factor of $\gamma = (1 - v_1^2/c^2)^{-1/2}$ compared to that of the particle 1 at rest. Experimentally, the quantity which is actually measured in particle decay in flight is the decay-length rather than time. Thus, these experiments can also be interpreted in terms of the dilation of the mean decay-length. In order to see that a dilation of the decay-length holds in the broad 4-dimensional symmetry framework of Lorentz and Poincare, we note that, in quantum field theory, the decay-length $D$ is defined as the inverse of the decay rate $\Gamma(1 \to 2 + 3)$:
$$D = \frac{1}{\Gamma(1 \to 2+3)}, \qquad \Gamma(1 \to 2+3) \propto \frac{1}{p_{01}}\int\!\!\int \frac{d^3p_2}{p_{02}}\,\frac{d^3p_3}{p_{03}}\;\delta^4(p_1 - p_2 - p_3)\sum_{\text{spin}}|M|^2, \tag{16.1}$$
where $p_1$, $p_2$ and $p_3$ are the 4-momenta of particles 1, 2, 3 respectively. The integral is invariant under the Lorentz group, so that the decay-length is proportional to the 'energy' $p_{01}$ of particle 1 (or to $1/(1 - \beta_1^2)^{1/2}$), where $\beta_1 = v_1/c$ in special relativity. One can write the dilated decay-length $D$ in terms of the decay-length $D_0$ when the particle's velocity approaches zero:
$$D = \frac{D_0}{\sqrt{1 - \beta_1^2}}. \tag{16.2a}$$
In the widely used book of particle data, the mean decay-length $D$ is defined to be related to the mean lifetime $\tau$ by the relation:
$$D_0 = c\tau_0, \qquad \tau = \frac{\tau_0}{\sqrt{1 - \beta_1^2}}, \tag{16.2b}$$
so that both of them transform as the zeroth component of a 4-vector, as required by 4-dimensional symmetry.

Fig. 16.1. Experimental arrangement for measuring the mean lifetime of particle decay in flight. (The beam, moving with velocity v, passes the detectors S1, S2, S3 and S4 in turn.)

To be specific, let us consider a pion $\pi^+$ which decays into a muon $\mu^+$ and a muon-neutrino $\nu_\mu$, $\pi^+ \to \mu^+ + \nu_\mu$, as shown in figure 16.1. A narrow beam of pions passes through three detectors, $S_1$, $S_2$ and $S_3$. Some of the pions decay between detectors $S_3$ and $S_4$. If the distance $L$ between $S_3$ and $S_4$ is sufficiently large, the probability of a $\mu^+$ entering the detector $S_4$ is negligible because the $\mu^+$'s produced in the decay move out in all directions. The coincidence of signals in $S_1$, $S_2$ and $S_3$ gives the total number of pions $N_0$ before decay. The coincidence of signals in $S_1$, $S_2$, $S_3$ and $S_4$ gives the number of pions $N'$ remaining after traveling a length $L$. The relationship between $N_0$ and $N'$ is given by
$$N' = N_0\,e^{-L/(\beta_1 D)}, \qquad D = \frac{D_0}{\bigl(1 - (v_1/c)^2\bigr)^{1/2}} = \gamma D_0, \qquad D_0 = c\tau_0, \tag{16.3}$$
in an inertial laboratory frame. Experimentally, one changes the distance $L$ between two detectors and measures $N'$ and $N_0$ for each $L$. Then, one plots a graph of $\ln(N'/N_0)$ versus $L$ to get a straight line with the slope $-1/(\beta_1 D) = -(1 - \beta_1^2)^{1/2}/(\beta_1 D_0)$. Thus one can test the dilation factor $\gamma = (1 - \beta_1^2)^{-1/2}$ and measure the mean lifetime $\tau_0 = D_0/c$. Note that the rest lifetime $\tau_0$ can also be measured by first stopping the particles in a medium and then measuring their lifetime at rest.

The first experiment of this type was carried out by Rossi and Hall in 1941 [1], who measured the lifetime of muons created by cosmic rays. Since that time, many measurements have been performed by using mesons produced in the atmosphere and in accelerators [1-7]. The first observed muons were
those produced by cosmic rays colliding with molecules in Earth's upper atmosphere (~10-20 km). Although muons have a proper lifetime of only $\tau_0 = 2.2 \times 10^{-6}$ sec, a large number of them are observed at sea level. This shows that the mean decay-length of the muons should be at least tens of kilometers long. If the decay lifetime $\tau$ of the flying muons were equal to the proper lifetime $\tau_0$, then their mean free path would be only about $c\tau_0 = 660$ m even if they moved near the speed of light, so that only a very small fraction of the muons could reach sea level. The experimental result is explained by the dilation of the decay-length or lifetime: the lifetime of the flying muons increases by a factor of $\gamma = (1 - \beta^2)^{-1/2}$ compared to the proper lifetime, i.e., $\tau = \gamma\tau_0$. From the point of view of an observer co-moving with the muons, although the lifetime of the muons is $\tau_0$, the distance to sea level contracts by a factor of $\gamma$ compared to the original tens of kilometers. The explanation given by the time-dilation effect is thus equivalent to that of the contraction effect.

Rossi and Hall (1941) [1] first observed the momentum-dependence of the decay rate of the $\mu$-mesons in cosmic rays, and obtained a result in agreement with the prediction of Eq. (16.2). A series of measurements of the lifetime of muons in flight were carried out in the CERN muon storage ring. Farley et al. [8] reported a value of the muon lifetime in flight for a dilation factor $\gamma \approx 12$, which agreed within 1% with the predicted value obtained by applying this factor and the measured lifetime at rest to equation (16.2b).
In 1977, they [9] reported separate measurements for $\mu^+$ and $\mu^-$, with a $\gamma$ factor of 29.33, which are an order of magnitude more precise and which show that the predictions of Lorentz and Poincare invariance hold even under centripetal accelerations as large as $10^{18}g$, where $g = 980$ cm/sec$^2$. The decay lifetimes for kaons in flight have also been measured. For example, the lifetime of $K_S^0$'s in flight at several hundred GeV was measured in 1987, with a $\gamma$ factor of ~700 [10]. The result is still consistent with broad Lorentz and Poincare invariance.
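
The cosmic-ray argument above is simple arithmetic, and a short sketch may make it concrete. The code assumes the survival law $N'/N_0 = e^{-L/(\beta D)}$ with $D = \gamma c\tau_0$; the muon speed $\beta = 0.999$ and the production height of 15 km are hypothetical illustrative values (the text only gives a range of about 10-20 km), and the function names are not from the source.

```python
import math

C = 2.998e8          # speed of light in m/s
TAU0_MUON = 2.2e-6   # proper muon lifetime in s (value quoted in the text)

def lorentz_gamma(beta):
    """Dilation factor gamma = (1 - beta^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def surviving_fraction(length_m, beta, tau0=TAU0_MUON, dilated=True):
    """N'/N0 = exp(-L / (beta * D)), with D = gamma*c*tau0 (or c*tau0 if not dilated)."""
    gamma = lorentz_gamma(beta) if dilated else 1.0
    decay_length = gamma * C * tau0
    return math.exp(-length_m / (beta * decay_length))

d0 = C * TAU0_MUON   # decay-length at rest, about 660 m
beta = 0.999         # hypothetical muon speed in units of c
height = 15e3        # hypothetical production height in m

with_dilation = surviving_fraction(height, beta)
without_dilation = surviving_fraction(height, beta, dilated=False)

print(f"c*tau0 = {d0:.0f} m, gamma = {lorentz_gamma(beta):.1f}")
print(f"fraction reaching sea level with dilation:    {with_dilation:.3f}")
print(f"fraction reaching sea level without dilation: {without_dilation:.2e}")
```

With these illustrative numbers roughly a third of the muons survive the trip when the lifetime is dilated, compared with a fraction of order $10^{-10}$ without dilation.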
REFERENCES
[1] B. Rossi and D. B. Hall, Phys. Rev. 59 (1941) 223.
[2] D. H. Frisch and J. H. Smith, Am. J. Phys. 31 (1963) 342.
[3] D. M. Lederman et al., Phys. Rev. 83 (1951) 685.
[4] R. P. Durbin et al., Phys. Rev. 88 (1952) 179.
[5] J. D. Jackson, Classical Electrodynamics (New York, Wiley, 1962).
[6] A. J. Greenberg et al., Phys. Rev. Lett. 23 (1969) 1267.
[7] D. S. Ayres et al., Phys. Rev. D 3 (1971) 1051.
[8] F. J. M. Farley et al., Nuovo Cimento 45 (1966) 281; J. Bailey et al., Phys. Lett. B 28 (1968) 287; Nuovo Cim. A 9 (1972) 369.
[9] J. Bailey et al., Nature 268 (1977) 301.
[10] N. Grossman, K. Heller, et al., Phys. Rev. Lett. 59 (1987) 18.

Chapter 17 "Experimental Tests" of the Second Postulate of Special Relativity

There are many experiments which purport to test the second postulate of special relativity, i.e., the universal constancy of the (one-way) speed of light. Many of these experiments use moving light sources such as binary star systems, stars, the sun, extragalactic systems, moving media, and gamma ray sources. The results of these experiments can be summarized as follows:

(A) The one-way speed of light has been proven to be independent of the motion of the light source. However, experiments have not unambiguously established the universal constancy of the one-way speed of light in different inertial frames, as we shall see below [1].

(B) The two-way speed of light has been shown to be isotropic in different inertial frames.

In the analysis of these experiments, one can assume a simple relation
$$c' = c \pm kv, \tag{17.1}$$
between the speed of light $c$ and the velocity $v$ of a moving source, where $c$ denotes the speed of a light signal from a stationary source. The quantity $c'$ is the speed of the same light signal as measured by an observer moving with a velocity $v$ relative to the source. The parameter $k$ is then determined by the experiment. The value $k = 0$ corresponds to Einstein's relativity, while $k = 1$ corresponds to Ritz's "ballistic" hypothesis of light emission. An experimental null result would give an upper limit on the parameter $k$.

Let us consider evidence from observations of binary systems. The discussions of Comstock (1910) [2] and de Sitter (1913) [3] concerning the orbits of close binary stars are the oldest astronomical evidence for the property that the speed of light is independent of the motion of the light source. Since each of the two stars moves with a high speed in its orbit, if the speed of light did depend on the velocity of its source, then the light from the approaching star would reach the earth in a shorter time than that from the receding member of the doublet.

Fig. 17.1. Observations of a binary system.

In figure 17.1, we assume that twin stars $S_1$ and $S_2$ move in a circular orbit around their common center of gravity. At the time $t_1$ star $S_1$ reaches point A, with its velocity pointing toward the earth. If a light ray emitted by the star has a velocity $c + v$, then an observer on the earth will receive this ray at the time
$$t_1^* = t_1 + \frac{L}{c + v}, \tag{17.2}$$
where $L$ is the distance between the binary and the earth. Let $t_{AB}$ represent the amount of time it takes $S_1$ to travel from point A to point B. A light signal emitted from the star at B will now have a velocity $c - v$, and will reach the earth at the time
$$t_2^* = t_1 + t_{AB} + \frac{L}{c - v}. \tag{17.3}$$
The amount of time between the arrival of the two light signals at the earth is
$$t^*_{AB1} = t_2^* - t_1^* = t_{AB} + \frac{L}{c - v} - \frac{L}{c + v} = t_{AB} + \tau, \tag{17.4a}$$
where
$$\tau = \frac{2vL}{c^2 - v^2}. \tag{17.4b}$$
The difference $t^*_{AB1}$ is a half-period of the orbit of star $S_1$ as seen from the earth. We now consider the second star $S_2$. Suppose $S_2$ is at point B at the time $t_2$ with its velocity directed away from the earth. A light signal emitted
from $S_2$ at $t_2$ will arrive at the earth at the time
$$t_3^* = t_2 + \frac{L}{c - v}. \tag{17.5}$$
Similarly, $S_2$ is at point A at the time $t_2 + t_{AB}$. Another light signal emitted at that time will be received on the earth at the time
$$t_4^* = t_2 + t_{AB} + \frac{L}{c + v}. \tag{17.6}$$
Thus, the amount of time between the arrival of the two light signals from $S_2$ on the earth (or the half-period of the star's orbit) is given by
$$t^*_{AB2} = t_4^* - t_3^* = t_{AB} - \tau, \tag{17.7}$$
where $\tau$ is again defined by Eq. (17.4b). These results show that if Ritz's "ballistic" hypothesis for the speed of light is correct, the half-periods of the stars in a binary system will depend on the distance between the star system and the earth. This would introduce a considerable spurious eccentricity into the orbits of the binary systems. W. de Sitter (1913) [3] first considered this problem and pointed out that for many binary systems $\tau$ is of the same order of magnitude as $t_{AB}$. He came to the following conclusion: If the speed of light were not assumed constant, then for circular orbits of spectroscopic binary stars the time dependence of the Doppler effect would correspond to that of an eccentric orbit. Since the actual orbits have very small eccentricity, this leads one to conclude that the speed of light is, to a large degree, independent of the velocity $v$ of the binary star. From known data on $\beta$ Aurigae, he calculated an upper limit for the $k$-value of less than 0.002 by use of expression (17.1). Later, Zurhellen's estimate (1914) [4] gave $k < 10^{-6}$. In 1977, Brecher obtained a still smaller upper bound for $k$, of $10^{-9}$ [5].

Now let us consider laboratory experiments using moving light sources. Alvager, Farley, Kjellman, and Wallin (1964, 1966) [6] measured the speed of $\gamma$ rays from the decay of neutral pions, $\pi^0$, with an energy larger than 6 GeV. The arrangement is shown in Fig. 17.2 (a). Neutral pions are produced at the CERN Proton Synchrotron (PS) in an internal Be target bombarded by 19.2 GeV/c protons. The $\gamma$ rays are observed at an angle of about 6° relative to the circulating protons. In order to achieve high accuracy in the measurement of the $\gamma$-ray velocity, the time-of-flight measurement is based on the bunched structure of the circulating proton synchrotron beam. The beam has bunches of a few nanoseconds half-width and a separation of about 100 nanoseconds. In order to preserve the bunch structure of the beam, the
radio frequency voltage is maintained during the target irradiation (which is about 100 microseconds). Due to the very short lifetime of the neutral pions ($\sim 10^{-16}$ sec), the resulting beam of $\gamma$ rays has the same radio frequency structure. The method for the time measurement is shown in Fig. 17.2 (b). As the counter assembly is moved back along the beam line with the cable length held constant, the start pulses will arrive successively later with respect to the stop pulses. When the detector has been moved a distance $L \approx c/f$ [A → B in Fig. 17.2 (a)], the start pulses will again show the same time relative to the stop pulses provided $c' = c$ (where $c'$ represents the velocity of $\gamma$ rays from moving $\pi^0$), and identical distributions will be obtained on the time analyzer in the two positions A and B. A non-zero value of $\Delta\tau = \tau_A - \tau_B$ in Fig. 17.2 (b) will show that $c' \neq c$. (As will be shown below, however, an anisotropy of the one-way speed of light does not necessarily imply a non-zero value of $\Delta\tau = \tau_A - \tau_B$.) The corresponding $\gamma$-ray flight time between A and B will then be $\Delta\tau + 1/f$, so that
$$c' = L\left(\Delta\tau + \frac{1}{f}\right)^{-1}. \tag{17.8}$$
Experimentally, the average value of $\Delta\tau$ was found to be
$$\Delta\tau = 0.000 \pm 0.013 \text{ nsec}. \tag{17.9}$$
By use of the experimental values $L = 31.4503 \pm 0.0005$ m and $f = 9.53220 \pm 0.00005$ MHz, the speed of the $\gamma$ rays of energy > 6 GeV is found to be
$$c' = (2.9979 \pm 0.0004) \times 10^{10} \text{ cm/sec}, \tag{17.10}$$
which is in agreement with the accepted value of the speed of light from stationary sources.

Fig. 17.2. The Alvager-Farley-Kjellman-Wallin experiment. (a) Arrangement; (b) Measurements of time.

Is the experimental value $c'$ in (17.10) really the one-way speed of light? If so, then this experiment directly contradicts the logical analysis of simultaneity by Reichenbach and other works in Chapters 4 and 8. To clarify the question, we shall prove that this measured speed $c'$ in (17.10) is actually the two-way speed of light rather than the one-way speed of light [6], contrary to the authors' claim [7]. A schematic diagram of the experimental setup is shown in Fig. 17.3. A detector D is first located at the position A [see Fig. 17.3 (a)] and then moved to position B [see Fig. 17.3 (b)]. A time analyzer TP is assumed to be fixed at the position B [see Fig. 17.3 (a,b)]. For simplicity, we assume that the length of the electric cable between D and TP is equal to the distance $L$ between A and B.

Fig. 17.3. Schematic diagram for explaining the Alvager-Farley-Kjellman-Wallin experiment.

Suppose the one-way speed of light $u$ in a medium along the direction $\mathbf{e}$ is given by
$$u = \frac{\bar{u}}{1 - (\bar{u}/c_2)\,\mathbf{q}\cdot\mathbf{e}}. \tag{17.11}$$
The quantity $\bar{u}$ is the average speed of light along a closed path (i.e., the two-way speed of light) in a medium. In vacuum, one has $\bar{u} = c_2$, a constant two-way speed of light. The two-way speeds $\bar{u}$ and $c_2$, as well as the parameter $\mathbf{q}$, are independent of the motion of the source. Let $t_A$ denote the time at which a $\gamma$-ray pulse from the decay of $\pi^0$ mesons is received by D and also the time at which an electrical pulse is produced. Writing $q = \mathbf{q}\cdot\mathbf{e}$ for the direction from A to B, the propagation time of the electric pulse from D to TP is given by
$$t_{AB} = \frac{L}{u} = \frac{L}{\bar{u}} - \frac{L}{c_2}\,q, \tag{17.12}$$
where $u$ is the one-way speed of the electric pulse in the cable, which is given by Eq. (17.11), and $\bar{u}$ is the two-way speed of light in the cable. The arrival time of this pulse at TP is then
$$t_A^* = t_A + t_{AB} = t_A + \frac{L}{\bar{u}} - \frac{L}{c_2}\,q. \tag{17.13}$$
On the other hand, when the detector D is moved to the position B, the time at which the $\gamma$ ray arrives at D through the distance $L$ in vacuum is
$$t_B = t_A + \frac{L}{c'} = t_A + \frac{L}{c_2} - \frac{L}{c_2}\,q, \tag{17.14}$$
where $c'$ is the one-way speed of the $\gamma$ ray in vacuum, which is given by Eq. (17.11) with $u = c'$ and the two-way speed $\bar{u} = c_2$. (The $\gamma$ ray received by D at B is not the same $\gamma$ ray received by D when it was located at A, so that the time $t_B$ given by Eq. (17.14) differs from an actual time by $N/f$, where $N$ is an integer and $f$ is the repetition frequency of the $\gamma$-ray pulses as seen in Fig. 17.2 (b). This difference, however, does not change the following final result.) The electrical pulse produced by D at the position B at time $t_B$ will propagate along the cable of length $L$ and reach the analyzer TP at the time
$$t_B^* = t_B + \frac{1 - (\bar{u}/c_2)\,q}{\bar{u}}\,\frac{L}{2} + \frac{1 + (\bar{u}/c_2)\,q}{\bar{u}}\,\frac{L}{2} = t_B + \frac{L}{\bar{u}},$$
or, by use of Eq. (17.14),
$$t_B^* = t_A + \frac{L}{\bar{u}} - \frac{L}{c_2}\,q + \frac{L}{c_2} = t_A^* + \frac{L}{c_2}, \tag{17.15}$$
or
$$t_B^* - t_A^* = \frac{L}{c_2}. \tag{17.16}$$
Thus, by definition, the velocity of the $\gamma$ ray is
$$\frac{L}{t_B^* - t_A^*} = c_2. \tag{17.17}$$
Note that here $c_2$ is defined in Eq. (17.11) as the two-way velocity of light in vacuum: $c_2 = \bar{u}$. Based on this more careful analysis, we conclude that (A) this experiment determines the two-way speed of light rather than the one-way speed of light and (B) the experiment shows the two-way speed of light to be independent of the motion of the source [8,9].
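
The cancellation of the one-way parameter $q$ in (17.12)-(17.17) can be verified numerically. The sketch below is a minimal illustration under stated assumptions: the one-way speed is taken as $u = \bar{u}/(1 - (\bar{u}/c_2)\,q)$, the cable at position B is modeled as running half its length along the beam direction and half against it (as the two terms in the expression for $t_B^*$ suggest), and the two-way cable speed of $2.0 \times 10^8$ m/s is a hypothetical value. The function names are illustrative. The last line evaluates Eq. (17.8) with $\Delta\tau = 0$ using the quoted $L$ and $f$.

```python
import math

C2 = 2.99792458e8   # two-way speed of light in vacuum, m/s
U_BAR = 2.0e8       # two-way pulse speed in the cable, m/s (hypothetical value)
L_AB = 31.4503      # distance between positions A and B in m (quoted in the text)
F_RF = 9.53220e6    # repetition frequency in Hz (quoted in the text)

def one_way_speed(two_way, q_dot_e, c2=C2):
    """Assumed form of Eq. (17.11): u = u_bar / (1 - (u_bar/c2) * q.e)."""
    return two_way / (1.0 - (two_way / c2) * q_dot_e)

def speed_from_timing(delta_tau, length=L_AB, freq=F_RF):
    """Eq. (17.8): c' = L / (delta_tau + 1/f)."""
    return length / (delta_tau + 1.0 / freq)

def measured_speed(q, length=L_AB, u_bar=U_BAR, c2=C2, t_a=0.0):
    """Speed L / (t_B* - t_A*) inferred from the analyzer arrival times."""
    # Detector at A: the electric pulse travels from A to B through the cable.
    t_a_star = t_a + length / one_way_speed(u_bar, q, c2)
    # Detector at B: the gamma ray first crosses from A to B in vacuum ...
    t_b = t_a + length / one_way_speed(c2, q, c2)
    # ... then the pulse runs L/2 along +e and L/2 along -e in the cable.
    half = 0.5 * length
    t_b_star = (t_b + half / one_way_speed(u_bar, q, c2)
                + half / one_way_speed(u_bar, -q, c2))
    return length / (t_b_star - t_a_star)

for q in (-0.5, 0.0, 0.3, 0.9):
    print(f"q = {q:+.1f}: measured speed = {measured_speed(q):.6e} m/s")
print(f"L*f with delta_tau = 0: {speed_from_timing(0.0):.5e} m/s")
```

Whatever value of $q$ is chosen, the inferred speed equals the two-way speed $c_2$, which is the content of Eq. (17.17).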
REFERENCES
[1] Note that experiments have not established unambiguously that the one-way speed of light is the same as the two-way speed of light and that the one-way speed of light is isotropic in different inertial frames. See also references 7 and 8.
[2] D. F. Comstock, Phys. Rev. 30 (1910) 267.
[3] W. de Sitter, Proc. Acad. Sci. Amst. 15 (1913) 1297.
[4] W. Zurhellen, Astr. Nachr. 198 (1914) 1.
[5] K. Brecher, Phys. Rev. Lett. 39 (1977) 1051.
[6] Yuan Zhong Zhang, Special Relativity and its Experimental Foundations, 1998, World Scientific Publishing Co Pte Ltd, Singapore, pp. 168-170.
[7] T. Alvager, F. J. M. Farley, J. Kjellman and L. Wallin, Phys. Lett. 12 (1964) 260; Arkiv Fysik 31 (1966) 145.
[8] D. Sadeh, Phys. Rev. Lett. 10 (1963) 271. T. A. Filippas and J. G. Fox, Phys. Rev. 135 (1964) 1071. In the Sadeh and Filippas-Fox experiments, the light paths (i.e., the $\gamma$-ray path and the detection apparatus circuit) formed a closed path. The detection apparatus was calibrated by means of a light source at rest. In other words, the difference between the two arrival times of two $\gamma$ rays from a moving source to two separate detectors was measured by comparison with that in the case of a $\gamma$-ray source at rest. Thus, these experiments, like other closed-path experiments, test the constancy only of the two-way speed of light but not of the one-way speed of light.
[9] T. Alvager, A. Nilsson and J. Kjellman, Nature 197 (1963) 1191; Arkiv Fysik 26 (1964) 209. The Alvager-Nilsson-Kjellman experiment involved measurements of the time-of-flight for two $\gamma$ rays arriving at the same detector. Thus, these measurements do not depend on any definition of simultaneity. This means that this experiment proved that the one-way speed of light is independent of the motion of the light source. However, the experimental result does not imply the isotropy of the one-way speed of light.

Chapter 18 The Mass-Velocity Relation Experiment

The mass-velocity relation is one of the fundamental and important predictions of special relativity. It is given by
$$m = \frac{m_0}{\sqrt{1 - u^2/c^2}}, \tag{18.1}$$
where $m_0$ is the constant (rest) mass of a particle and $u$ is its velocity in an inertial frame. This relation was obtained and discussed by Lorentz before the creation of special relativity in 1905. It appears that this was the first physical "prediction" of relativity theory and was tested by an experiment using $\beta$ rays by W. Kaufmann around 1905 [1]. Such an experimental test helped to raise interest in relativity among physicists at that time, even though the results of this early experiment were not in favor of relativity and were controversial [2].

The mass-velocity relation (18.1) is intimately related to the relativistic properties of a free particle whose motion is characterized by the energy-momentum 4-vector or the 4-momentum $p^\mu = (p^0, \mathbf{p})$, with $(p^0)^2 - \mathbf{p}^2 = m_0^2$, within the 4-dimensional symmetry framework of broad Lorentz and Poincare invariance [3]. Thus, the velocity-dependence in (18.1) is essentially a property of the zeroth component of the 4-momentum. In special relativity, one has the universal constant $c$ and $\beta = u/c$, so that one can define the usual energy, $E = p^0c^2$, and momentum, $\mathbf{P} = \mathbf{p}c$:
$$E = p^0c^2 = \frac{m_0c^2}{\sqrt{1 - u^2/c^2}}, \tag{18.2}$$
$$\mathbf{P} = \mathbf{p}c = \frac{m_0\mathbf{u}}{\sqrt{1 - u^2/c^2}}. \tag{18.3}$$
In high energy physics and quantum field theory, kinematic relations of the form (18.2) are indispensable. They have been extensively used in the covariant formulation of theories, in all sorts of calculations and, experimentally, in designing high energy accelerators and fitting data.

The first experimental investigation concerning the dependence of mass on velocity was carried out by Kaufmann (1901) [1]. In determining the charge-to-mass ratio $e/m$ of the $\beta$ rays, he first found that the value of $m$
varied with the velocity of the electron, if the charge $e$ is assumed to be independent of the velocity. In order to explain this phenomenon, many expressions for the mass-velocity relation were derived from different models in the ether theory [4] and Lorentz's theory of the electron [5]. Kaufmann's early experiments were not conclusive in deciding between the competing theories of Abraham and Lorentz. Later, more measurements were performed, which agreed with the prediction of (18.1).

Let us consider a test of (18.1) with a high energy synchro-cyclotron. Consider a particle with a rest mass $m_0$ and a charge $e$ which moves in a static magnetic field $\mathbf{B}$. In special relativity, the Lorentz force exerted on the particle by the magnetic field is
$$\mathbf{F} = \frac{e}{c}\,\mathbf{u}\times\mathbf{B} = \frac{e}{c}\,\mathbf{u}_\perp\times\mathbf{B}, \tag{18.4}$$
where $\mathbf{u}_\perp$ is the component of the particle's velocity $\mathbf{u}$ in the direction perpendicular to $\mathbf{B}$. This equation shows that the force $\mathbf{F}$ is perpendicular to the velocity $\mathbf{u}$ so that $\mathbf{F}\cdot\mathbf{u} = 0$. The equation of motion for the particle is then
$$m\frac{d\mathbf{u}}{dt} = \mathbf{F} = \frac{e}{c}\,\mathbf{u}_\perp\times\mathbf{B}, \tag{18.5}$$
where $m$ is the relativistic mass given by Eq. (18.1). If $\mathbf{u}$ is perpendicular to $\mathbf{B}$, equation (18.5) can be written as
$$p = mu = \frac{e}{c}\,B\rho, \tag{18.6}$$
where $\rho$ denotes the radius of the circle in which the particle moves. Equation (18.6) shows that the orbit of a charged particle moving perpendicularly to the magnetic field $\mathbf{B}$ is a circle of radius $\rho$. The angular velocity of the particle is
$$\omega = \frac{u}{\rho}. \tag{18.7}$$
By use of Eq. (18.6), the above equation becomes
$$\omega = \left(\frac{e}{m}\right)\frac{B}{c}. \tag{18.8}$$
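
To get a feeling for the size of the effect in Eq. (18.8), the following sketch evaluates the relativistic factor for a proton. It is an illustration under stated assumptions: the kinetic energy of 385 MeV is used as an example value, the field of 1.5 T is hypothetical, and the angular frequency is written in SI form, $\omega = eB/(\gamma m_0)$, rather than in the Gaussian form used in the text. The function names are illustrative.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in C
M_PROTON = 1.67262192e-27    # proton rest mass in kg
M_P_C2_MEV = 938.272         # proton rest energy in MeV

def gamma_from_kinetic_energy(t_mev, m0c2_mev=M_P_C2_MEV):
    """gamma = 1 + T/(m0 c^2), from T = (m - m0) c^2 and m = gamma*m0."""
    return 1.0 + t_mev / m0c2_mev

def beta_from_gamma(gamma):
    """u/c from the mass-velocity relation m = m0 / sqrt(1 - u^2/c^2)."""
    return math.sqrt(1.0 - 1.0 / gamma**2)

def cyclotron_omega_si(b_tesla, gamma, charge=E_CHARGE, m0=M_PROTON):
    """SI form of Eq. (18.8): omega = q*B / (gamma*m0), in rad/s."""
    return charge * b_tesla / (gamma * m0)

gamma = gamma_from_kinetic_energy(385.0)          # example kinetic energy in MeV
beta = beta_from_gamma(gamma)
omega_relativistic = cyclotron_omega_si(1.5, gamma)   # 1.5 T is a hypothetical field
omega_constant_mass = cyclotron_omega_si(1.5, 1.0)    # what a velocity-independent mass would give

print(f"gamma = {gamma:.4f}, u/c = {beta:.4f}")
print(f"omega ratio (relativistic / constant mass) = "
      f"{omega_relativistic / omega_constant_mass:.4f}")
```

At this energy the orbital frequency is lower than the constant-mass value by a factor $1/\gamma \approx 0.71$, so the effect being tested is large, not a small correction.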
The frequency of a voltage applied to accelerate the particle should be equal to the angular frequency $\omega$ (synchronization). Equation (18.8) shows that by directly measuring the angular frequency $\omega$ and magnetic field $B$, we can obtain the mass of a charged particle moving in a synchro-cyclotron. Measurements of this kind for 385-MeV protons (corresponding to a velocity of $u \approx 0.7c$) were performed by Grove and
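Fox (1953) [6]. From the measured values of $\omega$ and $B$, they calculated the quantity
$$\left(\frac{e}{m}\right)_1 = \frac{\omega c}{B}. \tag{18.9}$$
At the same time, by putting the value of the orbital radius $\rho$ into Eq. (18.7) and then using (18.1), they calculated the relativistic prediction:
$$\left(\frac{e}{m}\right)_2 = \frac{e}{m_0}\sqrt{1 - \frac{u^2}{c^2}} = \frac{e}{m_0}\sqrt{1 - \frac{\omega^2\rho^2}{c^2}}. \tag{18.10}$$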
From their measurements, they obtained
$$\frac{(e/m)_1 - (e/m)_2}{(e/m)_2} = -0.0006 \pm 0.001.$$
This is in agreement with (18.1) to within 0.1%.

Relation (18.1) can also be tested by measurements of the flight-time of moving particles. In order to separate the law of electromagnetic motion of charged particles from the mass-velocity law, Bertozzi (1964) [7] performed an experiment in which the speeds of electrons with kinetic energies in the range 0.5-15 MeV were determined by measuring the time required for the electrons to traverse a given distance, and the kinetic energy was determined by calorimetry. His result shows that the dependence of the kinetic energies of the electrons on their velocities agrees with the following relation
$$\frac{u^2}{c^2} = 1 - \left(\frac{m_0c^2}{m_0c^2 + T}\right)^2, \tag{18.11}$$
which is obtained from the kinetic energy expression $T = (m - m_0)c^2$. If the kinetic energy $T$ is much greater than the proper energy $m_0c^2$, to the lowest order in $(m_0c^2/T)^2$, the above equation reduces to
$$\frac{c - u}{c} \approx \frac{m_0^2c^4}{2T^2}. \tag{18.12}$$
Brown et al. (1973) [8] compared the velocity of 11-GeV electrons to the velocity of visible light by use of time-of-flight techniques. Their result was given by
$$\frac{c - u}{c} = \frac{\Delta t}{t} = (-1.3 \pm 2.7) \times 10^{-6}, \tag{18.13}$$
where $t$ was the time required for the electron to traverse a given distance and $\Delta t$ represented the difference of the times required for electrons and light signals to traverse the same distance. This implies that the velocity of the
11-GeV electrons was found to be equal to the velocity of visible light to an accuracy of $10^{-4}$. The difference predicted by the relativistic equation (18.12) should be approximately equal to $m_0^2c^4/(2T^2) \sim 10^{-9}$, which is consistent with the observed value above. Later, Guiragossian et al. (1975) [9] performed a similar measurement at the Stanford Linear Accelerator Center. In this experiment the relative velocities of 15-GeV $\gamma$ rays and electrons with energies ranging from 15 to 20.5 GeV were measured by using a time-of-flight technique with 1-psec sensitivity and a flight path of about 1 km. No significant difference between the velocities of light and the velocities of the electrons was observed to within 2 parts in $10^7$. According to the relativistic equation (18.12), the expected difference should be $(c - u)/c \sim 5 \times 10^{-10}$. Thus the observed result agrees with the prediction (18.1) with a $\gamma$ factor of roughly $4 \times 10^4$.

In view of equations (18.1)-(18.8), the great success in constructing high energy synchro-cyclotrons has provided a powerful confirmation of equation (18.1). At the end of the 20th century, the highest obtainable momenta for proton beams in the laboratory had reached $10^6$ GeV/c. Experiments using such beams have verified relation (18.1) with gamma factors of up to 1,000,000 [10].
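
The orders of magnitude quoted for the electron time-of-flight experiments follow directly from (18.11) and (18.12). The sketch below evaluates them; it assumes an electron rest energy of 0.511 MeV and treats the quoted beam energies as kinetic energies (at these energies the distinction between kinetic and total energy is negligible). The function names are illustrative.

```python
import math

M_E_C2_GEV = 0.51099895e-3   # electron rest energy in GeV

def beta_squared(t, m0c2=M_E_C2_GEV):
    """Eq. (18.11): u^2/c^2 = 1 - (m0 c^2 / (m0 c^2 + T))^2."""
    return 1.0 - (m0c2 / (m0c2 + t)) ** 2

def deficit_exact(t, m0c2=M_E_C2_GEV):
    """(c - u)/c from Eq. (18.11), written to avoid cancellation near u = c."""
    x = (m0c2 / (m0c2 + t)) ** 2            # x = 1 - u^2/c^2
    return x / (1.0 + math.sqrt(1.0 - x))   # equals 1 - sqrt(1 - x)

def deficit_approx(t, m0c2=M_E_C2_GEV):
    """Eq. (18.12): (c - u)/c ~ (m0 c^2)^2 / (2 T^2) for T >> m0 c^2."""
    return 0.5 * (m0c2 / t) ** 2

for energy_gev in (11.0, 15.0, 20.5):
    gamma = 1.0 + energy_gev / M_E_C2_GEV
    print(f"T = {energy_gev:5.1f} GeV: gamma = {gamma:8.0f}, "
          f"(c-u)/c = {deficit_exact(energy_gev):.3e} "
          f"(approx {deficit_approx(energy_gev):.3e})")

# Low-energy end of the 0.5-15 MeV range: the electron is already fast.
print(f"u^2/c^2 at T = 0.5 MeV: {beta_squared(0.5e-3):.4f}")
```

The predicted deficits are several orders of magnitude below the experimental sensitivities quoted above, so these measurements confirm the relation by finding no difference between the electron and light velocities.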
REFERENCES
[1] W. Kaufmann, Nachr. Ges. Wiss. Gottingen, Math-Nat. Kl. 143 (1901).
[2] A. I. Miller, Albert Einstein's Special Theory of Relativity (Addison-Wesley, 1981) pp. 226-235, 334-352.
[3] See Jong-Ping Hsu and Leonardo Hsu, Phys. Lett. A 196 (1994) 1. See Chapter 10, second paper, equations (19)-(21).
[4] M. Abraham, Ann. d. Phys. 10 (1903) 105. Abraham assumed that an electron is a rigid sphere moving with a velocity $u$, and then obtained the mass formula
$$m = \frac{3}{4}\,\frac{m_0}{\beta^2}\left[\frac{1 + \beta^2}{2\beta}\ln\frac{1 + \beta}{1 - \beta} - 1\right], \qquad \beta = u/c.$$
[5] H. A. Lorentz, Proc. Acad. Sci. Amst. 6 (1904) 809.
[6] D. J. Grove and J. G. Fox, Phys. Rev. 90 (1953) 378.
[7] W. Bertozzi, Am. J. Phys. 32 (1964) 551.
[8] B. C. Brown et al., Phys. Rev. Lett. 30 (1973) 763.
[9] Z. G. T. Guiragossian, G. B. Rothbart, and M. R. Yearian, Phys. Rev. Lett. 34 (1975) 335.
[10] Particle Data Group, Review of Particle Physics, Phys. Rev. D 54 (1996) 193.

Chapter 19 The Mass-Energy Relation Experiment

The equivalence of mass and energy is one of the most important results of the special theory of relativity. Its expression is probably the best known formula in physics. Its validity has been demonstrated by the powerful explosion of the atomic bomb. Within the framework of special relativity, a free particle with the energy $E$ has the inertial mass $m$. The connection between the magnitudes of $E$ and $m$ is given by Einstein's famous formula
$$E = mc^2, \tag{19.1}$$
where the mass $m$ is the inertial mass. Experimentally, the changes $\Delta E$ and $\Delta m$ are directly observable, but the absolute magnitudes $E$ and $m$ are not. These changes satisfy a similar relation:
$$\Delta E = \Delta m\,c^2. \tag{19.2}$$
This relation for a physical process can be derived from relativistic mechanics by using (19.1) and the conservation of energy. Thus a test of the relation between mass and energy is also a test of the conservation of energy. Relation (19.2) implies that a certain amount of change (transfer) in energy must be accompanied by a corresponding change (transfer) in mass, and vice versa. This shows the equivalence of mass and energy in special relativity.

In general, as long as a theory has the 4-dimensional symmetry of the Lorentz and Poincare groups, one can write down an invariant action [1] for a particle and obtain the 'energy-momentum' 4-vector, $p^\mu = (\gamma m_0, \gamma m_0\beta^i)$, where $i = 1, 2, 3$, $\gamma = (1 - \beta^2)^{-1/2}$ and $\beta^i$ is a component of a dimensionless velocity vector of a particle. In such a theory, the 'energy' is given by $p^0 = m = \gamma m_0$ and, hence, is also consistent with the mass-energy relation experiments which support (19.1).

To be specific, let us consider the relation between mass and energy of a nuclear reaction within the framework of special relativity. Let $A$ and $B$ represent two nuclei in initial states of a process, and $a$ and $b$ be two nuclei produced through the reaction:
$$A + B \to a + b. \tag{19.3}$$
We shall use below a symbol with a subscript i = A, B, a and b to denote the physical quantity corresponding to the nuclei A, B, a and b. For instance, $E_A$ and $E_B$ represent the total energies of the nuclei A and B respectively, $T_a$ denotes the kinetic energy of the nucleus a, and so on.
According to the conservation law of energy, the total energies before and after the reaction should be equal to each other:

$$E_A + E_B = E_a + E_b. \tag{19.4}$$

By use of the definition of energy, $E = mc^2 = T + m_0c^2$, the above equation gives the law of energy transfer:

$$\Delta T = \Delta E_0, \qquad E_0 = m_0c^2, \tag{19.5a}$$

where

$$\Delta E_0 = E_{0A} + E_{0B} - E_{0a} - E_{0b}, \tag{19.5b}$$

$$\Delta T = T_a + T_b - T_A - T_B. \tag{19.5c}$$
This indicates that after the reaction, the increase in the total kinetic energy ΔT equals the decrease in the total proper energy $\Delta E_0$. Substituting relation (19.1) of mass and energy into the conservation law (19.4), we have

$$m_A + m_B = m_a + m_b. \tag{19.6}$$

This is the law of conservation of relativistic mass. It shows that, in special relativity, the conservation of energy and the conservation of mass are two equivalent laws rather than two independent laws. The relation of mass and velocity shows that the inertia of a massive particle is no longer measured by its proper mass, but should be measured by its total (i.e., relativistic) mass. Let us introduce the concept of the moving mass $m_T$, which is the difference between the relativistic mass m and the proper (or rest) mass $m_0$:

$$m_T = m - m_0. \tag{19.7}$$

By comparison of this equation with the definition $T = (m - m_0)c^2$, one can see that the moving mass $m_T$ is related to the kinetic energy T as follows:

$$T = m_Tc^2. \tag{19.8}$$
By inserting Eq. (19.7) in Eq. (19.6), we obtain the law of mass transfer:

$$\Delta m_T = \Delta m_0, \tag{19.9a}$$

where

$$\Delta m_0 = m_{0A} + m_{0B} - m_{0a} - m_{0b}, \tag{19.9b}$$

$$\Delta m_T = m_{Ta} + m_{Tb} - m_{TA} - m_{TB}. \tag{19.9c}$$
This result indicates that after the reaction, the decrease $\Delta m_0$ (i.e., the so-called "mass defect") in the total rest mass is equal to the increase $\Delta m_T$ in the total kinetic mass. Due to the relation $E_0 = m_0c^2$ and Eq. (19.8), equations (19.9) and (19.5) are not independent. In fact, using $E_0 = m_0c^2$ and (19.5b), we can write (19.5a) as

$$\Delta T = \Delta m_0\,c^2. \tag{19.10}$$

Similarly, using $T = m_Tc^2$ and (19.5c), Eq. (19.5a) can be written as

$$c^2\,\Delta m_T = \Delta E_0, \qquad E_0 = m_0c^2. \tag{19.11}$$
Experimentally, the observed quantities are the rest masses $m_{0i}$ and the kinetic energies $T_i$ for i = A, B, a, b. From these observed values one can calculate the changes $\Delta m_0$ and ΔT in the total mass and kinetic energy, and then compare the results with the mass-energy relation (19.10). Thus it is the law of energy transfer, Eq. (19.10), that is used for direct comparison with experimental results.

A nucleus consists of protons and neutrons. The rest mass of a nucleus is less than the sum of the rest masses of the involved protons and neutrons. The difference between them is the "mass defect", which is proportional to the binding energy of the nucleus. The mass defect for a nucleus consisting of A nucleons (Z protons and A − Z neutrons) is given by

$$\Delta m_0 = Z\,m_0(\mathrm{H}) + (A - Z)\,m_0(n) - M_0(Z, A), \tag{19.11}$$

where $m_0(\mathrm{H}) = 1.008142$ u (1 u = $1.6603 \times 10^{-24}$ grams) is the rest mass of a hydrogen atom, $m_0(n) = 1.008982$ u the rest mass of a free neutron, and $M_0(Z, A)$ the rest mass of a nucleus consisting of Z protons and A − Z neutrons. The binding energy of the nucleus $\Delta E_0$ is just the energy corresponding to the mass defect $\Delta m_0$:

$$\Delta E_0\ (\mathrm{MeV}) = 931\,\left[1.008142\,Z + 1.008982\,(A - Z) - M_0(Z, A)\right]. \tag{19.12}$$
The average binding energy per nucleon $\Delta E_0/A$ is the required energy to separate a nucleon from the nucleus.

The first test of the mass-energy relation was carried out by Cockcroft and Walton (1932) [2] by use of the nuclear reaction ${}_1\mathrm{H}^1 + {}_3\mathrm{Li}^7 \to 2\alpha$. The observed values by means of a mass-spectrometer are as follows:

$$m_0({}_3\mathrm{Li}^7) = 7.01818\ \mathrm{u}, \qquad m_0({}_1\mathrm{H}^1) = 7.01818\ \mathrm{u}, \qquad m_0({}_2\mathrm{He}^4) = 7.01818\ \mathrm{u}. \tag{19.13}$$

By putting these values into Eq. (19.9b), one arrives at the change in the proper energy, $\Delta E_0 = c^2\Delta m_0 = 17.25$ MeV, which should be equal to the difference ΔT of the kinetic energies of the two α particles and the incident proton. In the Cockcroft-Walton experiment, the kinetic energy for the incident
proton is 0.25 MeV and after the reaction the kinetic energy of each α particle is measured to be 8.6 MeV. The ${}_3\mathrm{Li}^7$ target is at rest and hence its kinetic energy $T({}_3\mathrm{Li}^7) = 0$. From these values one obtains the increase in the kinetic energy after the reaction compared to that before the reaction, which should be $\Delta T = (2 \times 8.6 - 0.25)$ MeV = 16.95 MeV. Smith (1939) [3] repeated the experiment of Cockcroft and Walton with higher accuracy and measured an energy liberation of $\Delta T = 17.28 \pm 0.03$ MeV, which agrees well with the expected value $\Delta E_0 = 17.25$ MeV. Many experiments using various nuclear reactions have been performed. They all verified the mass-energy relation to very high accuracy [4,5,6].

Let us consider the process of $e^+e^-$ pair annihilation. Experiments have shown that a slow positron $e^+$ will be captured by an electron $e^-$ to form positronium. In the singlet state of positronium, the spin directions of the electron and positron are anti-parallel, while in the triplet state, the spins of the electron and positron are parallel. Positronium in the singlet state will annihilate and produce 2 photons with a lifetime of about $10^{-10}$ sec. Due to the conservation laws of momentum and energy, the energy of each photon should be $m_0c^2 = 0.511$ MeV. That has been confirmed by experiments. A triplet positronium, with the lifetime of about $1.5 \times 10^{-7}$ sec, will annihilate into three photons. In this process the spin and angular momentum are conserved. Experiments showed that the directions of motion of the three photons lie in the same plane and that the momentum is conserved.

An inverse process, in which radiation energy is transferred into proper energy (or rest mass), is as follows: a photon in the electric field of a nucleus can convert into a positron-electron pair. Experiments show that such a process occurs only when the energy (hν) of the photon is larger than $2m_0c^2$.
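The energy balance of the Cockcroft-Walton reaction discussed in this chapter can be illustrated numerically through Eq. (19.10). The sketch below is only an illustration: the atomic masses and the conversion factor are modern tabulated values on the carbon-12 scale, which is an assumption of this example and not the mass scale or the figures used in the text, and the function names are hypothetical.

```python
# Modern atomic masses in u (assumed values, carbon-12 scale; not from the text).
MASS_U = {"H1": 1.00782503, "Li7": 7.01600344, "He4": 4.00260325}
U_TO_MEV = 931.494  # energy equivalent of 1 u in MeV (modern value, assumed)


def rest_energy_release_mev(initial, final, masses=MASS_U):
    """Delta E_0 = c^2 * (sum of initial rest masses - sum of final rest masses)."""
    delta_m0 = sum(masses[name] for name in initial) - sum(masses[name] for name in final)
    return delta_m0 * U_TO_MEV


def kinetic_energy_gain_mev(t_initial, t_final):
    """Delta T = total kinetic energy after the reaction minus that before it."""
    return sum(t_final) - sum(t_initial)


if __name__ == "__main__":
    delta_e0 = rest_energy_release_mev(["H1", "Li7"], ["He4", "He4"])
    # Kinetic energies quoted in the text: 0.25 MeV proton, Li target at rest,
    # two alpha particles of 8.6 MeV each.
    delta_t = kinetic_energy_gain_mev([0.25, 0.0], [8.6, 8.6])
    print(f"Delta E_0 = {delta_e0:.2f} MeV, Delta T = {delta_t:.2f} MeV")
```

With these assumed masses the rest-energy release comes out near 17.3 MeV, close to the kinetic energy gains quoted above.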
REFERENCES
[1] Jong-Ping Hsu and Leonardo Hsu, Phys. Lett. A 196 (1994) 1 [see Chapter 10 of this volume] and Leonardo Hsu and Jong-Ping Hsu, Nuovo Cimento B 111 (1996) 1283.
[2] J. D. Cockcroft and E. T. S. Walton, Proc. Roy. Soc. A137 (1932) 223.
[3] M. M. Smith, Jr., Phys. Rev. 56 (1939) 548.
[4] G. Stephenson and C. W. Kilmister, Special Relativity for Physicists (Longmans, Green and Co., 1960).
[5] M. C. Hudson and W. H. Jocobson, Jr., Phys. Rev. 167 (1968) 1064.
[6] W. H. Wapstra et al., Nucl. Data A9 (1971) 364.
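Chapter 20 The Thomas Precession Experiment

Let us consider experimental measurements of the (g − 2) factor of the electron. Suppose an electron moves in a uniform magnetic field B. Let the velocity v of the electron be perpendicular to the magnetic field B. In this case the electron will move along a circular orbit with the angular frequency given by Eq. (18.8),

$$\omega_e = \frac{eB}{mc} = \frac{eB}{\gamma m_0 c}, \tag{20.1a}$$

where

$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}. \tag{20.1b}$$

The precession of the electron spin is described by [1]

$$\frac{d\mathbf{s}}{dt} = \boldsymbol{\omega}_s \times \mathbf{s},$$

where

$$\boldsymbol{\omega}_s = \frac{ge}{2m_0c}\left(\mathbf{B} - \frac{\mathbf{v}}{c}\times\mathbf{E}\right) + \boldsymbol{\omega}_T = \frac{ge}{2m_0c}\,\mathbf{B} + \boldsymbol{\omega}_T \tag{20.2}$$

is the precession frequency of the spin when E = 0. The angular frequency $\boldsymbol{\omega}_T$ of Thomas precession is given by [2]

$$\boldsymbol{\omega}_T = \left(1 - \frac{1}{\sqrt{1 - v^2/c^2}}\right)\frac{\mathbf{v}\times\mathbf{a}}{v^2} = (1 - \gamma)\,\frac{\mathbf{v}\times\mathbf{a}}{v^2}, \tag{20.3}$$

where a is the acceleration of the electron moving along a circular orbit in the magnetic field, i.e.,

$$\mathbf{a} = -\frac{e}{\gamma m_0 c}\,\mathbf{v}\times\mathbf{B}. \tag{20.4}$$

From (20.3) and (20.4), we have

$$\boldsymbol{\omega}_T = -(1 - \gamma)\,\frac{e}{\gamma m_0 c\,v^2}\,\mathbf{v}\times(\mathbf{v}\times\mathbf{B}) = (1 - \gamma)\,\frac{e\mathbf{B}}{\gamma m_0 c}, \tag{20.5}$$

where v is perpendicular to B. Using this result in Eq. (20.2), we obtain the precession frequency of the electron spin in the uniform magnetic field B,

$$\omega_s = \left(\frac{g}{2} - 1\right)\frac{eB}{m_0c} + \omega_e, \tag{20.6a}$$

or

$$\omega_a = \omega_s - \omega_e = \left(\frac{g}{2} - 1\right)\frac{eB}{m_0c}. \tag{20.6b}$$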
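The cancellation of the γ dependence in the difference $\omega_s - \omega_e$ can be checked numerically by evaluating the vector products directly. The following is a minimal sketch, assuming the sign conventions $\mathbf{a} = -(e/\gamma m_0 c)\,\mathbf{v}\times\mathbf{B}$ and $\boldsymbol{\omega}_T = (1-\gamma)\,\mathbf{v}\times\mathbf{a}/v^2$, units in which $eB/m_0c = 1$ and $c = 1$, and an illustrative geometry with v along x and B along z. The function names and the value of g used in the example are illustrative, not taken from the text.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def anomaly_frequency(g, gamma):
    """Return omega_s - omega_e along B, in units of eB/(m0 c), for v perpendicular to B."""
    if gamma <= 1.0:
        raise ValueError("gamma must exceed 1 so that v is non-zero")
    beta = math.sqrt(1.0 - 1.0 / gamma**2)       # v/c
    v = (beta, 0.0, 0.0)                         # velocity in units of c
    b = (0.0, 0.0, 1.0)                          # (e/m0 c) B, unit magnitude
    a = tuple(-x / gamma for x in cross(v, b))   # acceleration in the field
    omega_t = tuple((1.0 - gamma) * x / beta**2 for x in cross(v, a))  # Thomas term
    omega_s = tuple(0.5 * g * bi + ti for bi, ti in zip(b, omega_t))   # spin precession, E = 0
    omega_e = tuple(bi / gamma for bi in b)      # orbital frequency
    return omega_s[2] - omega_e[2]


if __name__ == "__main__":
    for gamma in (1.2, 12.0, 29.2):
        print(gamma, anomaly_frequency(2.0023193, gamma))
```

For any γ > 1 the result equals g/2 − 1 up to rounding, which is the content of Eq. (20.6b).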
Here $\omega_e$ is the electronic (orbit) angular frequency defined by Eq. (20.1). According to the assumption by Uhlenbeck and Goudsmit (1925), the relation between the electron magnetic moment μ and the spin angular momentum s is

$$\boldsymbol{\mu} = -\frac{ge}{2m_0c}\,\mathbf{s},$$

where g is called the g factor (e.g., g = 2 for the electron), −e is the charge of the electron, $m_0$ is the rest mass of the electron, and c is the speed of light in vacuum. This means that g = 2 for an electron [3]. However, experiments show that the g factor is slightly different from 2. The difference, the (g − 2) factor, corresponds to the anomalous magnetic moment of an electron. Equation (20.6b) gives the difference between the electronic spin precession frequency $\omega_s$ and its orbital frequency $\omega_e$ predicted by Einstein's theory of special relativity, which can be directly measured. Therefore, by use of an observed value of the quantity $eB/m_0c$ in Eq. (20.6b), one can find a value of the factor (g − 2)/2. Experimental results for measuring the (g − 2) factor of leptons are shown in Table 20.1.

The g − 2 factor can be calculated by use of quantum electrodynamics, which is based on Einstein's theory of special relativity. This calculation to order $\alpha^3$ gives [4]

$$\frac{1}{2}(g - 2) = \frac{1}{2}\,\frac{\alpha}{\pi} - 0.328479\left(\frac{\alpha}{\pi}\right)^2 + (1.49 \pm 0.25)\left(\frac{\alpha}{\pi}\right)^3,$$

where

$$\alpha = \frac{1}{137.03604(11)}$$

is the fine structure constant. This theoretical result agrees with experiments to within a very high accuracy.

We now consider the comparison between Eq. (20.6a) and the g − 2 measurements, which does not depend on quantum electrodynamics. Table 20.1 shows that the values of g − 2 are independent of the velocities of the measured particles, which is in agreement with the prediction of Eq. (20.6a). This can then be regarded as a direct test of Thomas precession. In order to estimate the relative accuracy of these measurements, a theory or a model different from Einstein's special relativity is needed. For this purpose, Newman et al. (1978) [5] suggested a model in which the momentum p and energy E of a free electron have a relation E = E(p), and then the rest mass of the electron is given by

$$\frac{1}{m_0} = \lim_{p \to 0}\frac{1}{p}\,\frac{dE}{dp}. \tag{20.7}$$
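The cyclotron frequency of the electron in a uniform magnetic field B is

$$\omega_c = \frac{eB}{\tilde{\gamma}\,m_0 c}, \tag{20.8}$$

with

$$\frac{1}{\tilde{\gamma}} = \frac{m_0}{p}\,\frac{dE}{dp}. \tag{20.9}$$

Then the precession frequency of the electron spin, (20.6a), should be changed to

$$\omega_s = \frac{geB}{2m_0c} + (1 - \gamma)\,\omega_c = \omega_c\left(\frac{g\tilde{\gamma}}{2} - \gamma + 1\right), \tag{20.10}$$

with

$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}, \qquad v = \frac{dE}{dp},$$

and hence the difference between the cyclotron and precession frequencies is

$$\omega_D = \omega_s - \omega_c = \left(\frac{g}{2} - \frac{\gamma}{\tilde{\gamma}}\right)\frac{eB}{m_0c}. \tag{20.11}$$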
Newman et al. regarded the quantity $(1 - \gamma)\,\omega_c$ as a relativistic effect, i.e., the Thomas effect, where γ has the usual form. On the other hand, the $\tilde{\gamma}$ in the expression for $\omega_c$ comes from dynamics and thus may be different from the usual γ. The result (20.11) from this model gives a relation between the frequency $\omega_D$ and the electron velocity. Newman et al. used two experimental results for determining the factor $(g/2) - \gamma/\tilde{\gamma}$ on the right-hand side of Eq. (20.11). One is the experiment carried out by Wesley and Rich (1971) [6] (see Table 20.1), in which the magnetic field was $B = 1.2 \times 10^3$ G, the kinetic energy of the electron was 110 keV (the corresponding velocity of the electron was v/c = 0.57, i.e., γ = 1.2), and the difference in frequency $\omega_D$ was directly measured. By putting these values in Eq. (20.11), they obtained

$$\frac{g}{2} - \frac{\gamma}{\tilde{\gamma}} = 0.00115965770(350). \tag{20.12}$$

Van Dyck, Schwinberg, and Dehmelt (1977) [7] performed an experiment (see Table 20.1) in which the electrons were nonrelativistic ($\gamma - 1 = 10^{-9}$) and the observed difference in frequency was

$$\frac{\omega_D}{\omega_c} = 0.00115965241(20). \tag{20.13}$$

By comparing Eqs. (20.12) and (20.13), they obtained

$$\frac{\tilde{\gamma}}{\gamma} - 1 = (5.3 \pm 3.5) \times 10^{-9}. \tag{20.14}$$
This shows that these measurements agree with the prediction ($\tilde{\gamma} = \gamma$) of Einstein's special relativity to within an accuracy of $5 \times 10^{-9}$. As pointed out by Combley et al. (1979) [8], of course, the relative accuracy obtained from this method depends strongly on the model adopted for the breakdown of special relativity. Cooper et al. (1979) [9] used the above model and assumed that the quantity $\gamma/\tilde{\gamma}$ can be expanded in a power series in (γ − 1):

$$\frac{\gamma}{\tilde{\gamma}} = 1 + C_1(\gamma - 1) + \cdots.$$

Here a value of the coefficient $C_1$ was obtained from two experimental results through

$$C_1 = -\frac{a^{(1)} - a^{(2)}}{\gamma^{(1)} - \gamma^{(2)}},$$

where

$$a = \omega_D\left(\frac{eB}{m_0c}\right)^{-1} = \frac{g}{2} - \frac{\gamma}{\tilde{\gamma}}.$$

The superscripts (1) and (2) correspond to two experiments with different leptonic velocities. Their results are shown in Table 20.2.
Table 20.1. Experimental measurements for g − 2 of leptons

Year   Investigator                            Lepton   γ            (g − 2)/2
1977   Van Dyck, Schwinberg and Dehmelt [7]    e⁻       1 + 10⁻⁹     0.00115965241(20)
1971   Wesley and Rich [6]                     e⁻       1.2          0.00115965770(350)
1979   Cooper et al. [9]                       e⁻       2.5 × 10⁴    0.0011622(200)
1972   Bailey et al. [10]                      μ±       12           0.00116616(31)
1977   Bailey et al. [11]                      μ±       29.2         0.001165922(9)
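The agreement between the third-order QED expression quoted earlier in this chapter and the most precise electron entry of Table 20.1 can be checked by direct evaluation. This is a minimal sketch using only the coefficients and the value of α given in the text; the central value 1.49 of the third-order coefficient is used, and the function name is illustrative.

```python
import math

ALPHA = 1.0 / 137.03604  # fine structure constant as quoted in the text


def qed_anomaly(alpha=ALPHA, third_order_coefficient=1.49):
    """Evaluate (g - 2)/2 through order alpha^3 with the coefficients quoted in the text."""
    x = alpha / math.pi
    return 0.5 * x - 0.328479 * x**2 + third_order_coefficient * x**3


if __name__ == "__main__":
    measured = 0.00115965241  # Van Dyck, Schwinberg and Dehmelt, Table 20.1
    theory = qed_anomaly()
    print(f"theory = {theory:.11f}, measured = {measured:.11f}, "
          f"difference = {theory - measured:.1e}")
```

With these inputs the two numbers differ by a few parts in $10^9$, comparable to the spread allowed by the quoted uncertainty ±0.25 in the third-order coefficient.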
Table 20.2. Summary of lepton g − 2 relativity tests

Method             References      γ⁽¹⁾   γ⁽²⁾         C₁
μ⁻, μ⁺ g factor    [10] and [11]   12     29.2         (1.4 ± 1.8) × 10⁻⁸
e⁻ g factor        [6] and [7]     1      1.2          (−2.6 ± 1.8) × 10⁻⁸
e⁻ g factor        [7] and [9]     1      2.5 × 10⁴    (−1.0 ± 8.0) × 10⁻¹⁰
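The entries of Table 20.2 can be recomputed from the (g − 2)/2 values of Table 20.1. The sketch below assumes the sign convention $\gamma/\tilde{\gamma} = 1 + C_1(\gamma - 1)$, so that $C_1 = -(a^{(1)} - a^{(2)})/(\gamma^{(1)} - \gamma^{(2)})$, and combines the two quoted uncertainties in quadrature. Both choices are assumptions of this example rather than statements taken from the cited papers, and the names are illustrative.

```python
import math

# (a, uncertainty of a, gamma) taken from Table 20.1
VAN_DYCK = (0.00115965241, 2.0e-10, 1.0)
WESLEY_RICH = (0.00115965770, 3.5e-9, 1.2)
COOPER = (0.0011622, 2.0e-5, 2.5e4)
BAILEY_1972 = (0.00116616, 3.1e-7, 12.0)
BAILEY_1977 = (0.001165922, 9.0e-9, 29.2)


def c1_coefficient(first, second):
    """Return (C_1, uncertainty) from two (a, sigma_a, gamma) measurements."""
    a1, sigma1, gamma1 = first
    a2, sigma2, gamma2 = second
    c1 = -(a1 - a2) / (gamma1 - gamma2)
    sigma = math.hypot(sigma1, sigma2) / abs(gamma1 - gamma2)
    return c1, sigma


if __name__ == "__main__":
    pairs = [("muon", BAILEY_1972, BAILEY_1977),
             ("electron, gamma = 1 and 1.2", VAN_DYCK, WESLEY_RICH),
             ("electron, gamma = 1 and 2.5e4", VAN_DYCK, COOPER)]
    for label, first, second in pairs:
        c1, sigma = c1_coefficient(first, second)
        print(f"{label}: C1 = {c1:.2e} +/- {sigma:.1e}")
```

Under these assumptions the three rows of Table 20.2 are reproduced to the quoted precision.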
REFERENCES
[1] Yuan Zhong Zhang, Special Relativity and its Experimental Foundations (World Scientific Publishing Co Pte Ltd, Singapore, 1997) p. 269.
[2] L. H. Thomas, Phil. Mag. 3 (1927) 1. See also Yuan Zhong Zhang, ref. 1, p. 49. We note that all formulas in classical electrodynamics, including those related to the Thomas precession, can be written within the 4-dimensional symmetry framework of, say, taiji relativity (or common relativity) consistent with experiments and without involving the universal speed of light c. See Jong-Ping Hsu, Einstein's Relativity and Beyond - New Symmetry Approaches (World Scientific Publishing Co Pte Ltd, Singapore, 2000) Chapters 10 and 12.
[3] The correct gyromagnetic ratio g = 2 was first obtained by taking the non-relativistic limit of the Dirac equation (see Chapter 5). Thus most people considered that g = 2 is a consequence of Dirac's relativistic equation for the electron, a splendid result of the union of Lorentz and Poincaré invariance and quantum mechanics.
[4] J. Schwinger, Phys. Rev. 73 (1948) 416; ibid. 76 (1949) 790; R. Karplus and N. Kroll, Phys. Rev. 77 (1950) 536; C. Sommerfield, Ann. Phys. (USA) 5 (1958) 26; A. Petermann, Nucl. Phys. 5 (1958) 677; M. J. Levine, Phys. Rev. Lett. 26 (1971) 1351.
[5] D. Newman, G. W. Ford, A. Rich and E. Sweetman, Phys. Rev. Lett. 40 (1978) 1355.
[6] J. Wesley and A. Rich, Phys. Rev. A4 (1971) 1341.
[7] R. Van Dyck, P. Schwinberg, and H. Dehmelt, Phys. Rev. Lett. 38 (1977) 310.
[8] F. Combley et al., Phys. Rev. Lett. 42 (1979) 1383.
[9] P. S. Cooper et al., Phys. Rev. Lett. 42 (1979) 1386.
[10] J. Bailey et al., Nuovo Cimento 9 (1972) 369.
[11] J. Bailey et al., Phys. Lett. B68 (1977) 191.
Appendices
Woldemar Voigt (1850-1919)*
By C. Runge

On December 13, Woldemar Voigt passed away after a short but severe illness. Almost 70 years old, he was still leading a full and active life, which makes the gap left by his absence even larger. The scientific world has lost a great individual. An even greater loss has been suffered by his university, the University of Goettingen. Perhaps the greatest loss, however, is the one suffered by those who had the chance to get to know Voigt personally. The University of Goettingen must now get along without a great scientist, teacher, and supporter of science in many fields. It was important to Voigt to encourage the younger generations. He was interested in all affairs concerning the university. Everybody could rely on his generosity. He made many projects and professorial appointments possible, sometimes by making personal sacrifices.

Voigt is one of several great physicists who were students of Franz Neumann. Voigt studied under Neumann between 1871 and 1874 in Koenigsberg. Born in 1850 in Leipzig, Voigt left school with the 'Maturitaetsexamen'** in 1868. He studied at the University of Leipzig until the beginning of the German-French war. As a twenty-year-old student, he became a soldier in a campaign against the French. After the campaign, he continued his studies in Koenigsberg under Richelot and Franz Neumann and obtained his doctorate in 1874. After his return to Leipzig, he obtained a job as a high school teacher at the Nicolaischule, which he had attended when he was a young boy. Following Neumann's advice, Voigt decided to start an academic career. In 1875 he resigned from his teaching job in Leipzig to start writing his "Habilitation." Shortly before giving his inaugural lecture in Leipzig, he was offered an associate professorship in Koenigsberg.
Voigt would replace Franz Neumann who, at 77 years of age, wanted to spend more of his time on his own scientific research rather than on lectures or under the direction of the Physics Department of the University. The permanent replacement of Neumann was not easy. The university did not even own a physics laboratory. In 1876, during Neumann's 50-year doctoral jubilee, the Prime Minister of Ostpreussen promised to build a physics laboratory. The laboratory was finished in 1885, later than expected. By this time, Voigt had already left Koenigsberg. In 1883 a chair position in the physics department at Goettingen was offered to him. Voigt
stayed in Goettingen until he died. A new physics institute was supposed to be built for him and E. Riecke. But as in the case of the physics laboratory at Koenigsberg, the work on the institute was not finished until 1905, again later than expected. In the summer of 1914, Voigt resigned from his position as director of the institute. Nevertheless he continued giving lectures in theoretical physics at the University of Goettingen. To duly honor his scientific activities goes beyond the scope of this obituary. In his work, he studied mainly the field of crystallography, "the great magnificent field," as he said in the preface of his book about crystallography which was published in 1910, "to which I have returned over and over again within the last 36 years." But Voigt's many-faceted studies included other fields of research as well, such as the elasticity and stability of solids, thermodynamics, and magneto- and electro-optics, about which he also wrote a book. In addition to hundreds of articles which were published in the Annalen der Physik, the Goettinger Nachrichten and in the Physikalische Zeitschrift, and in his own Kompendium der Physik and Thermodynamik, one should also mention the many other works of his students which were done at his suggestion. A great number of physicists from Germany and other countries studied under him, both theoreticians and experimentalists. In addition to his scientific capability, Voigt also had a great artistic talent. As a young man, it was hard for him to decide whether to spend his whole life on music or on science. "In music one is either a great musician or a poor one, in physics one can also be a mediocre physicist," he said modestly. Voigt never lost his love for music during his life. He was an expert on the works of Bach and a large number of Bach cantatas were performed under his direction in the church at the University of Goettingen. 
He also organized great performances at his own house with his musical friends whom he gathered around him. On December 15, in front of his house, where he was often found waving the conductor's baton, stood an organ and his bier. From there, pallbearers carried his coffin away to the sounds of a Bach chorale.

*Translated from Physikalische Zeitschrift (No. 4, XXI, pp. 81-82, 1920) by Andreas Ernst, Department of Physics and Astronomy, Univ. Heidelberg.
**This term really indicates that he was a master student and would be guaranteed admission to university.
George Francis FitzGerald (1851-1901) J. P. Hsu
George Francis FitzGerald was born at Monkstown in Dublin. His father was a Trinity professor and his mother was the sister of the distinguished physicist G. J. Stoney, who named the electron. He was tutored at home and entered Trinity College in Dublin at the age of 16. Later, he married the daughter of a physics professor and spent his whole life at Trinity. [1] FitzGerald was a tutor at Trinity in 1879 and became a professor in 1881. He was the first physicist to suggest a method of producing radio waves by an oscillating electric current, which was verified experimentally later by H. R. Hertz between 1885 and 1889. FitzGerald was well-known among the physics communities of Great Britain and Ireland. [1] He was highly honored for his extensive knowledge of physics, his critical powers and his interesting speculations. He frankly told his friends that he was not in the least sensitive to making mistakes. His habit was to rush out with all sorts of crude notions in the hope that they might set others to thinking and lead to some advance. He did precisely this in his short paper submitted to Science.

A crucial issue before special relativity was the motion of the earth through the aether. One should be able to detect the aether wind by measuring the propagation of light in the aether. Maxwell noted that such an effect was extremely small, a second-order effect, in his 1879 letter to D. P. Todd, Director of the Nautical Almanac Office in Washington, D.C. [2] Michelson was doing post-graduate work in Helmholtz's laboratory, on leave from the U.S. Navy. He read Maxwell's letter and was undeterred. He went ahead to invent an ingenious interferometer in order to detect Maxwell's second-order effect. Michelson carried out an experiment with his newly invented interferometer. In order to avoid urban perturbative vibrations, he did the experiment at the astrophysical observatory in Potsdam rather than in Berlin.
However, no shift of interference fringes was detected in Michelson's experiment (1881) when the interferometer was rotated by 90 degrees. The same experiment was repeated by Michelson together with Edward W. Morley in 1887 in Cleveland. They built a new interferometer and took great care to minimize perturbative vibrations. As a result, they obtained the same null outcome with a much better accuracy. The null result was a disappointment to Lorentz, Kelvin, Rayleigh, Michelson and others who believed in aether. In the spring of 1889, FitzGerald conceived his idea of contraction of
length of a moving rod in the aether when he was visiting his friend Oliver Lodge in Liverpool. This idea was the first positive reaction to Michelson and Morley's null result. As stressed by Weaire, [1] FitzGerald did not simply cook up the hypothesis of length contraction in an arbitrary way: it could well have been inspired by what FitzGerald knew of the work of Oliver Heaviside, whose calculations showed that the electric field of a moving charge is compressed by the motion, according to Maxwell's equations. The calculations were published in the December issue of The Electrician in 1888. FitzGerald guessed that, if molecular forces were similar to electric forces, it was reasonable that material bodies would be contracted in the direction of motion relative to the aether. FitzGerald sent a letter to Science, then an obscure American journal. [3] The very short paper by FitzGerald was published two weeks later. In 1892, Lorentz independently proposed the same idea of length contraction. This was the first spark of theoretical speculation, the FitzGerald-Lorentz contraction, that eventually caused a prairie fire in theoretical physics in the 20th century. [4]
References
[1] D. Weaire, Physics World, 5, 31 (1992).
[2] A. Pais, Subtle is the Lord... (Oxford Univ. Press, 1982) pp. 112-113.
[3] FitzGerald did not submit his short paper to the proceedings of the Royal Dublin Society, apparently because he was in a huff with that society. See ref. 1.
[4] J. P. Hsu, Einstein's Relativity and Beyond - New Symmetry Approaches (World Scientific, Singapore-New Jersey, 2000) pp. 22-26.
Abbreviated Biographical Sketch of Walter Ritz (1878-1909) Robert S. Fritzius
Walter Ritz was born in Sion in the southern Swiss canton of Valais on February 22, 1878. His father, Raphael Ritz, a native of Valais, was a well-known landscape and interior scenes artist. His mother was the daughter of the engineer Noerdlinger of Tübingen, Germany. After an all-too-short but brilliant career in physics, Ritz died at age 31 in Göttingen, Germany, on July 7, 1909.

As a specially gifted student, the young Ritz excelled academically at the Lycée communal of Sion. In 1897 he entered the Polytechnic school of Zurich, where he began studies in engineering. He soon found that he couldn't live with the approximations and compromises involved with engineering, so he switched to more mathematically exacting studies in physics. (Albert Einstein was one of his classmates.) Following a severe bout with what may have been pleurisy, he transferred to Göttingen, Germany, in 1901 to get away from the humid climate of Zurich. There his forming aspirations were strongly influenced by Voigt and Hilbert.

Ritz's dissertation (under Voigt) concerned a mathematical expression to predict the frequencies of the lines in atomic spectral series. His classical approach to the phenomenon involved elastic atomic vibrations. (In retrospect it should be noted that this approach could be considered as similar in spirit to our current ideas about the vibrational origin of molecular spectra.) His oral doctoral examination was passed summa cum laude. Ritz's work in spectroscopic theory eventually led to what is still known as the Ritz combination principle. His interpretation of the underlying mechanism was rejected but his 1908 mathematical formulation for the frequencies of the lines in a given spectral series
vm,n = —7
~
(m, n = 1,2,...)
multiplied by Planck's constant is the Rutherford-Bohr 1913 quantization rule for quantum mechanics.

In the spring of 1903 Ritz visited Leiden to attend a series of lectures by H. A. Lorentz on electrodynamics problems and his new theory of electrons. In June that year he was in Bonn at the Heinrich Kayser institute, where he found the hitherto undiscovered m = 4 diffuse series line of potassium which was predicted in his dissertation. In November that year Ritz began work on producing infra-red photographic plates at the Ecole Normale Superieure in Paris. In July 1904 his health failed and he returned to Zurich. During the following three years of unsuccessfully trying to regain his health, Ritz was outside the scientific centers. It wasn't until 1906 that he began to publish again, in spite of his poor health.
In September 1907 he moved to his mother's home town of Tübingen, which was a center for spectral research. In 1908 he relocated to Göttingen, where he qualified as a Privatdozent (private lecturer) at the University. There he produced his opus magnum, Recherches critiques sur l'Électrodynamique Générale. In the First Part of his "Critical Researches" Ritz delineated in depth his version of the shortcomings of the continuum-ether Maxwell-Lorentz electromagnetic theory and urged science to avoid the strange consequences associated with adopting Einstein's special theory of relativity. That theory had been formulated to bring the Maxwell-Lorentz partial differential equations into harmony with our failures to detect our motion through the ether. In the Second Part Ritz outlined a way to take Lorentz's ideas back to the classical electrodynamic theories of Gauss, Weber, Riemann, and Clausius by use of his time-asymmetric finite-speed elementary interactions. Ritz warned his readers that his preliminary approach was flawed because, in order to remain faithful to Lorentz's formulations, he had adhered to the superposition theorem, which amounts to action without reaction.

In essence, Ritz's model, as enunciated in 1908, is not applicable to optics for macroscopic distances in dispersive media such as the Earth's atmosphere. Yet it may be found to be applicable on a microscopic scale (millimeters down to atomic dimensions) and for distances on the order of perhaps one light year in the vacuum of interstellar space. (See Fox's article in the sources below.)

In 1908-1909 Ritz and Einstein held a war in Physikalische Zeitschrift over the proper way to mathematically represent black-body radiation (the radiation problem from the ultraviolet catastrophe) and over the theoretical origin of the second law of thermodynamics. The final paper in that series looks as though it could have been written by the journal editors. Its language judges in favor of Ritz.
Six weeks after its publication Ritz was in a higher court. (See Fox.)
Sources used in preparing this limited biography:
Ritz, W., Gesammelte Werke - Oeuvres, published by the Société Suisse de Physique, Gauthier-Villars, Paris, 1911.
Forman, P., Dictionary of Scientific Biography, XI, 475, Charles Scribner's Sons, New York, 1975.
Fox, J. G., Am. J. Phys., 33, 1, 1965.
Hans Reichenbach (1891-1953): Principal Dates [1] M. Reichenbach
1891      Born on September 26th in Hamburg
1910-15   University studies in Berlin, Munich, Göttingen, Erlangen and the Technische Hochschule at Stuttgart
1915      Ph.D., Erlangen
1915-17   German soldier on the Russian front
1917-20   Radio engineer in the laboratory of a radio firm at Berlin
1920-26   Privatdozent, later Associate Professor, Technische Hochschule at Stuttgart
1926-33   Associate Professor of Philosophy of Physics, University of Berlin
1933      Dismissal from University of Berlin under Hitler
1933-38   Professor of Philosophy, University of Istanbul
1938-53   Professor of Philosophy, University of California, Los Angeles
1947      Visiting Professor at Columbia University, City College of New York, and the New School for Social Research
1947      President, American Philosophical Association, Pacific Division
1952      Lectures at the Institut Henri Poincaré, Paris
1953      Invitation to deliver the William James lectures at Harvard University
1953      Died on April 9th in Los Angeles
References [1] From the Reidel volume No. I. Hans Reichenbach, Selected Writings, 1909-1953.
Appendices

The Most General Linear-Acceleration Transformation of Spacetime Based on Limiting 4-Dimensional Symmetry

Jong-Ping Hsu
Department of Physics, University of Massachusetts Dartmouth, North Dartmouth, MA 02747-2300, USA
The spacetime transformations of non-inertial frames must have 4-dimensional symmetry of the Lorentz and Poincare groups in the limit of zero acceleration. This natural requirement leads to fairly simple general-linear-acceleration transformations of spacetime. The set of general transformations forms a group, which includes the Wu and Møller groups (for constant linear accelerations) and the Lorentz and Poincare groups as special limiting cases. Both the Planck constant h and the speed of light c are no longer universal constants under such general transformations of spacetime. Physical implications regarding truly universal constants and basic wave equations for both inertial and non-inertial frames are discussed.
1 Introduction
Inertial frames are idealizations or approximations of non-inertial frames with small accelerations. Experiments have established that physical laws in inertial frames display 4-dimensional symmetry of the Lorentz and Poincare groups. However, almost all physical frames of reference in the universe are, strictly speaking, non-inertial because of the long-range action of the 'gravitational force.' Thus, basic laws of physics and the universal constants should be understood not only in inertial frames but also in non-inertial frames [1,2]. In particular, we already know that the speed of light is not a universal constant in non-inertial frames. It is natural and necessary to require that the laws of physics in non-inertial frames must display the 4-dimensional symmetry of the Lorentz and Poincare groups in the limit of zero acceleration. Such a requirement is postulated as the principle of 'limiting 4-dimensional symmetry' [2] in this paper.

The limiting 4-dimensional symmetry principle has been used to derive a simple generalization of the Lorentz transformation, i.e., spacetime transformations involving a constant velocity and a constant linear acceleration [2]. The set of transformations forms the Wu group, which includes the Møller group [3] and the Lorentz group as limiting cases [4]. Here, the same limiting 4-dimensional symmetry is employed to generalize spacetime transformations for frames with arbitrary accelerations along a straight line. These frames are called general-linear-acceleration (GLA) frames. The "4-dimensional spacetime" (w, x, y, z) of non-inertial frames is a generalization of the Minkowski spacetime for inertial frames. Let us call such a spacetime "general taiji (GT) spacetime." Since the constant c of the speed of light has no operational meaning in GLA frames, we will directly use w (with the dimension of length) instead of t (or ct) for the generalized evolution variable and call it the "GT time" (or simply time) for GLA frames [5].

With the help of such GLA transformations, basic wave equations, truly universal constants and physical properties of spacetime for both inertial and non-inertial frames can be discussed. In addition, the quantization of gauge fields in GT spacetime with non-constant metric tensors could shed light on the quantum theory of gravity [6].
2 Kinematical Approach to General Spacetime Transformations Based on Limiting 4-Dimensional Symmetry
Let us consider first the implication of limiting 4-dimensional symmetry for spacetime transformations between a non-inertial frame $F_c(w, x, y, z)$ (with initial velocity $\beta_0$ and constant linear acceleration $\alpha_0$) and an inertial frame $F_I(w_I, x_I, y_I, z_I)$. Suppose that the frame $F_c$ has both its velocity and acceleration directed along parallel $x$ and $x_I$ axes and that the origins of $F_c$ and $F_I$ coincide at the time $w = w_I = 0$. We have transformations for the constant-linear-acceleration frame $F_c(x)$ and an inertial frame $F_I(x_I)$ [2]:

$$w_I = \gamma\beta x + \frac{\gamma\beta}{\alpha_0\gamma_0^2} + a_0 + w_0, \qquad x_I = \gamma x + \frac{\gamma}{\alpha_0\gamma_0^2} + b_0 + x_0, \qquad y_I = y + y_0, \qquad z_I = z + z_0, \tag{1}$$

$$a_0 = -\frac{\beta_0}{\alpha_0\gamma_0}, \qquad b_0 = -\frac{1}{\alpha_0\gamma_0}, \qquad \beta = \alpha_0 w + \beta_0, \qquad \gamma = \frac{1}{\sqrt{1-\beta^2}}, \qquad \gamma_0 = \frac{1}{\sqrt{1-\beta_0^2}},$$

where the velocity $\beta = \beta(w)$ is a linear function of time $w$. The transformation (1) with the condition $x_0^\mu = (w_0, x_0, y_0, z_0) = 0$ is called the Wu transformation. It reduces to the Møller transformation when $\beta_0 = 0$ and $x_0^\mu = 0$, provided a change of time variable, $w = (1/\alpha_0)\tanh(\alpha_0 w^*)$, is made [2]. One can verify that, in the limit of zero acceleration, $\alpha_0 \to 0$, the transformation (1) reduces to the 4-dimensional transformations:

$$w_I = \gamma_0(w + \beta_0 x) + w_0, \qquad x_I = \gamma_0(x + \beta_0 w) + x_0, \qquad y_I = y + y_0, \qquad z_I = z + z_0. \tag{2}$$
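The limit that takes (1) into (2) can be checked numerically. The sketch below is only an illustration: it sets $w_0 = x_0 = 0$, and the function names and sample values ($w = 0.3$, $x = 0.7$, $\beta_0 = 0.4$) are arbitrary choices, not taken from the paper. It evaluates the Wu transformation for decreasing $\alpha_0$ and compares the result with the Lorentz form.

```python
import math

def wu_transform(w, x, alpha0, beta0):
    """Wu transformation (1) along x, with w0 = x0 = 0 (illustrative helper)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    beta = alpha0 * w + beta0                  # velocity, linear in w
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    shift = x + 1.0 / (alpha0 * gamma0 ** 2)   # x + 1/(alpha0 gamma0^2)
    w_i = gamma * beta * shift - beta0 / (alpha0 * gamma0)
    x_i = gamma * shift - 1.0 / (alpha0 * gamma0)
    return w_i, x_i

def lorentz_limit(w, x, beta0):
    """Transformation (2) with w0 = x0 = 0."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    return gamma0 * (w + beta0 * x), gamma0 * (x + beta0 * w)

def limit_error(alpha0, w=0.3, x=0.7, beta0=0.4):
    """Largest coordinate difference between (1) and (2) at one sample point."""
    wu = wu_transform(w, x, alpha0, beta0)
    lo = lorentz_limit(w, x, beta0)
    return max(abs(wu[0] - lo[0]), abs(wu[1] - lo[1]))

# The difference should shrink roughly in proportion to alpha0.
for alpha0 in (1e-2, 1e-4, 1e-6):
    print(alpha0, limit_error(alpha0))
```

The two large terms of order $1/\alpha_0$ cancel in (1), so for extremely small $\alpha_0$ floating-point round-off would eventually dominate; the values used here stay well clear of that regime.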
Thus, limiting 4-dimensional symmetry of the Lorentz and Poincare invariance is satisfied. The differential form of the Wu transformation (1) for constant linear acceleration is

$$dw_I = \gamma(W_c\,dw + \beta\,dx), \qquad dx_I = \gamma(dx + \beta W_c\,dw), \qquad dy_I = dy, \qquad dz_I = dz; \tag{3}$$

$$W_c = \gamma^2(\gamma_0^{-2} + \alpha_0 x).$$

Thus we have

$$ds^2 = dw_I^2 - dx_I^2 - dy_I^2 - dz_I^2 = g_{c\mu\nu}\,dx^\mu dx^\nu, \qquad g_{c\mu\nu} = \left(W_c^2(w,x), -1, -1, -1\right), \tag{4}$$

where $g_{c\mu\nu}$ is the metric tensor for the spacetime of constant-linear-acceleration frames. Now let us consider a generalization of the inhomogeneous Wu transformation (1) to the most general non-inertial frame $F(w, x, y, z)$ moving with an arbitrary velocity $\beta(w)$ or arbitrary acceleration $\alpha(w)$ along the x-axis,

$$\beta(w) = \beta_1(w) + \beta_0, \qquad \beta(0) = \beta_0, \qquad \alpha(w) = \frac{d\beta(w)}{dw} = \frac{d\beta_1(w)}{dw}, \qquad \alpha(0) = \alpha_0. \tag{5}$$
The last two initial conditions are related to the fact that the origins of $F(w, x, y, z)$ and $F_I(w_I, x_I, y_I, z_I)$ coincide at GT time $w = w_I = 0$. The velocity $\beta(w)$ is an arbitrary function of GT time $w$ and characterizes the general linear acceleration in the x-direction of a non-inertial frame $F$. This function must be given in order to specify the acceleration of a non-inertial frame. For example, if $\beta = J_{e0}w^2/2 + \alpha_0 w + \beta_0$, then the non-inertial frame has a constant jerk $J_{e0}$ and a variable acceleration $d\beta/dw = J_{e0}w + \alpha_0$, which is linear in time $w$. One of the simplest generalizations of the constant-acceleration case (3) is to write the local relation for $F(x)$ and $F_I(x_I)$ in the following form,

$$dw_I = \gamma(W_a\,dw + \beta\,dx), \qquad dx_I = \gamma(dx + \beta W_b\,dw), \qquad dy_I = dy, \qquad dz_I = dz; \tag{6}$$

$$\gamma = \frac{1}{\sqrt{1-\beta^2}}, \qquad \beta^2 < 1, \qquad \beta = \beta_1(w) + \beta_0,$$

where the two functions $W_a = W_a(w,x)$ and $W_b = W_b(w,x)$ may be different in general, in contrast to the case of constant linear acceleration in (1). The local form (6) is a minimal departure from the Wu transformations (3) for constant accelerations. The limiting 4-dimensional symmetry dictates that the two unknown functions $W_a(w,x)$ and $W_b(w,x)$ must satisfy the following two integrability conditions for the differential equations in (6):

$$\frac{\partial(\gamma W_a)}{\partial x} = \frac{\partial(\gamma\beta)}{\partial w}, \qquad \frac{\partial(\gamma\beta W_b)}{\partial x} = \frac{\partial\gamma}{\partial w}. \tag{7}$$
Since $\gamma$ and $\beta$ are functions of $w$ only, (6) and (7) lead to the results

$$w_I = \gamma\beta x + \int \gamma A(w)\,dw, \qquad x_I = \gamma x + \int \gamma\beta B(w)\,dw, \tag{8}$$

where we obtain $W_a$ and $W_b$, $W_a\ (\text{or } W_b) = \gamma^2\alpha x + A(w)\ (\text{or } B(w))$, from (7) and substitute them in (6) to carry out integrations. Limiting 4-dimensional symmetry and minimal departure from (1) suggest that the two integrals in (8) have the following forms involving an acceleration function $\alpha(w)$ [7]:

$$\int \gamma A(w)\,dw = \frac{\gamma\beta}{\alpha(w)\gamma_0^2} + a_0, \tag{9}$$

$$\int \gamma\beta B(w)\,dw = \frac{\gamma}{\alpha(w)\gamma_0^2} + b_0. \tag{10}$$
Such a generalization for spacetime transformations is essentially an assumption guided by 4-dimensional symmetry of the Lorentz and Poincare groups and by minimal departure from the Wu and Møller transformations for constant-linear-acceleration frames. By differentiations of (9) and (10), we can determine the two functions $A(w)$ and $B(w)$,

$$A(w) = \frac{\gamma^2}{\gamma_0^2} - \frac{\beta J_e}{\alpha^2\gamma_0^2}, \qquad B(w) = \frac{\gamma^2}{\gamma_0^2} - \frac{J_e}{\alpha^2\gamma_0^2\beta}, \qquad J_e(w) = \frac{d\alpha}{dw} = \frac{d^2\beta}{dw^2}, \tag{11}$$

where $J_e(w)$ is the jerk, which is the third-order time derivative of coordinates. From equations (8)-(10), we obtain simple and general spacetime transformations for non-inertial frames with the most general linear acceleration $\alpha(w)$:

$$w_I = \gamma\beta\left(x + \frac{1}{\alpha(w)\gamma_0^2}\right) - \frac{\beta_0}{\alpha_0\gamma_0} + w_0, \qquad x_I = \gamma\left(x + \frac{1}{\alpha(w)\gamma_0^2}\right) - \frac{1}{\alpha_0\gamma_0} + x_0, \qquad y_I = y + y_0, \qquad z_I = z + z_0, \tag{12}$$
where the two constants of integration $a_0 = -\beta_0/(\alpha_0\gamma_0)$ and $b_0 = -1/(\alpha_0\gamma_0)$ are determined by the limiting 4-dimensional symmetry as $\alpha(w) = \alpha_0 \to 0$. The relations in (12) may be termed "general taiji (GT) transformations" for a frame $F(x)$ with a general linear acceleration $\alpha(w)$, which is an arbitrary function of $w$. In the following discussions, we shall set $x_0^\mu = 0$. The GT transformations for the differentials $dx^\mu$ and $dx_I^\mu$ can be obtained from (12):

$$dw_I = \gamma(W_a\,dw + \beta\,dx), \qquad dx_I = \gamma(dx + \beta W_b\,dw), \qquad dy_I = dy, \qquad dz_I = dz; \tag{13}$$

$$W_a = \gamma^2\left(\alpha x + \frac{1}{\gamma_0^2}\right) - \frac{\beta J_e(w)}{\alpha^2\gamma_0^2} > 0, \qquad W_b = \gamma^2\left(\alpha x + \frac{1}{\gamma_0^2}\right) - \frac{J_e(w)}{\beta\alpha^2\gamma_0^2} > 0.$$
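The step from (12) to (13) can be checked by finite differences. The following sketch assumes the constant-jerk example $\beta = J_{e0}w^2/2 + \alpha_0 w + \beta_0$ with $x_0^\mu = 0$; the numerical parameter values and all function names are hypothetical choices for illustration. It compares numerical partial derivatives of (12) with the coefficients $\gamma W_a$, $\gamma\beta$, $\gamma\beta W_b$ and $\gamma$ read off from (13).

```python
import math

# Hypothetical constant-jerk frame: beta(w) = JE0*w^2/2 + ALPHA0*w + BETA0
JE0, ALPHA0, BETA0 = 0.01, 0.2, 0.1
GAMMA0 = 1.0 / math.sqrt(1.0 - BETA0 ** 2)

def beta(w):
    return 0.5 * JE0 * w * w + ALPHA0 * w + BETA0

def alpha(w):
    return JE0 * w + ALPHA0

def gamma(w):
    return 1.0 / math.sqrt(1.0 - beta(w) ** 2)

def gt_transform(w, x):
    """GT transformation (12) with w0 = x0 = 0."""
    shift = x + 1.0 / (alpha(w) * GAMMA0 ** 2)
    w_i = gamma(w) * beta(w) * shift - BETA0 / (ALPHA0 * GAMMA0)
    x_i = gamma(w) * shift - 1.0 / (ALPHA0 * GAMMA0)
    return w_i, x_i

def coefficients_13(w, x):
    """(dw_I/dw, dw_I/dx, dx_I/dw, dx_I/dx) as given by (13)."""
    g, b, a = gamma(w), beta(w), alpha(w)
    w_star = g * g * (a * x + 1.0 / GAMMA0 ** 2)
    u = JE0 / (a * a * GAMMA0 ** 2)
    w_a = w_star - b * u
    w_b = w_star - u / b
    return g * w_a, g * b, g * b * w_b, g

def coefficients_fd(w, x, h=1e-6):
    """The same four derivatives from central differences of (12)."""
    wp, xp = gt_transform(w + h, x)
    wm, xm = gt_transform(w - h, x)
    wq, xq = gt_transform(w, x + h)
    wn, xn = gt_transform(w, x - h)
    return ((wp - wm) / (2 * h), (wq - wn) / (2 * h),
            (xp - xm) / (2 * h), (xq - xn) / (2 * h))

def max_mismatch(w, x):
    return max(abs(p - q) for p, q in zip(coefficients_13(w, x), coefficients_fd(w, x)))

print(max_mismatch(0.5, 1.0))  # expected to be at round-off / truncation level
```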
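The invariant interval $ds^2$ in GLA frames can be obtained from (13),

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu, \qquad g_{00} = W^2, \qquad g_{01} = g_{10} = U, \qquad g_{11} = g_{22} = g_{33} = -1, \tag{14}$$

$$W^2(w,x) = \gamma^4\left(\alpha x + \frac{1}{\gamma_0^2}\right)^2 - \left(\frac{J_e(w)}{\alpha^2\gamma_0^2}\right)^2 > 0, \qquad U = \frac{J_e(w)}{\alpha^2\gamma_0^2}.$$

Mathematically, the transformation (13) can be obtained by the replacement

$$(dw, dx) \to (dw^*, dx^*) = (W^*dw,\ dx - U\,dw),$$

where $W^* = \gamma^2(\alpha x + 1/\gamma_0^2)$, and then followed by a 4-dimensional rotation on the w-x plane with y and z axes fixed:

$$dw_I = \gamma(dw^* + \beta\,dx^*), \qquad dx_I = \gamma(dx^* + \beta\,dw^*), \qquad dy_I = dy, \qquad dz_I = dz;$$

$$dw^* = W^*dw, \qquad dx^* = dx - U\,dw.$$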
When the jerk $J_e(w) = d\alpha(w)/dw$ vanishes, we have $\alpha(w) = \alpha_0$ and one can see that the GT transformation (12) reduces to the Wu transformation (1) for a constant-linear-acceleration frame $F_c(x)$, in which the time axis is everywhere orthogonal to the spatial coordinate curves. The contravariant metric tensors for a general non-inertial frame can be obtained from (14):

$$g^{00} = \frac{1}{W^2 + U^2}, \qquad g^{01} = g^{10} = \frac{U}{W^2 + U^2}, \qquad g^{11} = \frac{-W^2}{W^2 + U^2}, \qquad g^{22} = g^{33} = -1. \tag{15}$$

Using (14) and (15), one can verify that $g_{\alpha\lambda}g^{\lambda\beta} = \delta_\alpha^\beta$, with all other components of $g^{\mu\nu}$ vanishing. If $x_0^\mu = 0$, the inverse of the GT transformation (12) is found to be
$$\beta = \frac{w_I + \beta_0/(\alpha_0\gamma_0)}{x_I + 1/(\alpha_0\gamma_0)}, \qquad x = \sqrt{\left(x_I + \frac{1}{\alpha_0\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha_0\gamma_0}\right)^2} - \frac{1}{\alpha(w)\gamma_0^2}, \qquad y = y_I, \qquad z = z_I. \tag{16}$$

If a specific function for $\beta(w)$ is given and one can solve time $w$ in terms of $\beta$, then the first equation in (16) can be written in the form $w = w(\beta)$. For example, the simplest generalization of constant linear acceleration is the case with a constant jerk, $J_{e0}$ = constant or $\beta = J_{e0}w^2/2 + \alpha_0 w + \beta_0$. When $w$ is positive, we have

$$w = \frac{1}{J_{e0}}\left[-\alpha_0 + \sqrt{\alpha_0^2 + 2J_{e0}(\beta - \beta_0)}\,\right], \tag{17}$$

where $\beta$ is given by (16), so that we have the transformation for time, $w = w(w_I, x_I)$.

3 Physical Implications and Discussions
The coordinates $x^\mu$ specified by the metric tensor in (15) for a non-inertial frame may be called "general taiji (GT) spacetime." They are the preferred coordinates for the general taiji transformation with limiting 4-dimensional symmetry [8]. Other choices of coordinates will not satisfy the limiting 4-dimensional symmetry. Thus, the present theory of spacetime for general non-inertial frames is not a generally covariant theory, in contrast to the general theory of relativity. Furthermore, the Riemann curvature tensor of the GT spacetime vanishes, as implied by the GT transformation (12). Other physical implications are as follows:
(A) Spacetime-Dependent Speed of Light in Non-inertial Frames

It is known that the law for the propagation of light is $ds = 0$ in an inertial frame. Thus, the propagation of light in a non-inertial frame is described by the same invariant law (14) with $ds = 0$. In order to see the property of light in a GLA frame, let us consider some specific and simple cases. Suppose a light signal moves along the x-axis, i.e., $dy = dz = 0$; the speed of light $\beta_{Lx}$ is found to be

$$\beta_{Lx} = \frac{dx}{dw} = \gamma^2(\alpha x + \gamma_0^{-2}) + \frac{J_e(w)}{\alpha^2\gamma_0^2}, \qquad dy = dz = 0, \tag{18}$$

which is certainly different from the speed of light $\beta_L = 1$ (derived from $ds^2 = dw_I^2 - dx_I^2 = 0$) in an inertial frame. If the signal moves in the y-direction, i.e., $dx = dz = 0$, eq. (14) with $ds = 0$ leads to the speed of light $\beta_{Ly}$,

$$\beta_{Ly} = \frac{dy}{dw} = \sqrt{\gamma^4\left(\alpha x + \frac{1}{\gamma_0^2}\right)^2 - \left(\frac{J_e(w)}{\alpha^2\gamma_0^2}\right)^2}, \qquad dx = dz = 0. \tag{19}$$
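As a consistency check, the speeds (18) and (19) should make the interval (14) vanish. The sketch below again assumes the constant-jerk example $\beta = J_{e0}w^2/2 + \alpha_0 w + \beta_0$; the parameter values and function names are illustrative assumptions. It builds $W^2$ and $U$, evaluates both speeds, and inserts them into $ds^2$.

```python
import math

def metric_functions(w, x, je0, alpha0, beta0):
    """Return (W^2, U, W*) of (14) for a constant-jerk frame (assumed example)."""
    beta = 0.5 * je0 * w * w + alpha0 * w + beta0
    alpha = je0 * w + alpha0
    gamma_sq = 1.0 / (1.0 - beta ** 2)
    gamma0_sq = 1.0 / (1.0 - beta0 ** 2)
    w_star = gamma_sq * (alpha * x + 1.0 / gamma0_sq)
    u = je0 / (alpha ** 2 * gamma0_sq)
    return w_star ** 2 - u ** 2, u, w_star

def light_speeds(w, x, je0, alpha0, beta0):
    """Speeds of light along +x, eq. (18), and along y, eq. (19)."""
    w_sq, u, w_star = metric_functions(w, x, je0, alpha0, beta0)
    return w_star + u, math.sqrt(w_sq)

def interval(dw, dx, dy, w_sq, u):
    """ds^2 from (14) with dz = 0."""
    return w_sq * dw * dw + 2.0 * u * dw * dx - dx * dx - dy * dy

params = dict(je0=0.01, alpha0=0.2, beta0=0.1)
w_sq, u, _ = metric_functions(0.5, 1.0, **params)
beta_lx, beta_ly = light_speeds(0.5, 1.0, **params)
print(beta_lx, interval(1.0, beta_lx, 0.0, w_sq, u))  # interval ~ 0
print(beta_ly, interval(1.0, 0.0, beta_ly, w_sq, u))  # interval ~ 0
```

With zero jerk and $w = x = 0$ both speeds reduce to 1, the inertial-frame value.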
(B) The Laws of Velocity-Addition in Non-Inertial Frames

In general, the law of velocity addition can be obtained from (13),

$$\frac{dx_I}{dw_I} = \frac{dx/dw + \beta W_b}{W_a + \beta\,dx/dw}, \qquad \frac{dy_I}{dw_I} = \frac{dy/dw}{\gamma(W_a + \beta\,dx/dw)}, \qquad \frac{dz_I}{dw_I} = \frac{dz/dw}{\gamma(W_a + \beta\,dx/dw)}. \tag{20}$$
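The velocity-addition law (20) can be exercised on light signals. In the sketch below, which assumes the constant-jerk example with illustrative parameter values and hypothetical function names, a signal whose speed in $F$ solves $dx_I/dw_I = 1$ is mapped to $F_I$, and a signal moving along y with speed $W$ is checked to have unit speed in $F_I$.

```python
import math

def gt_functions(w, x, je0, alpha0, beta0):
    """Return (beta, W_a, W_b, W) of (13)-(14) for a constant-jerk frame (assumed example)."""
    beta = 0.5 * je0 * w * w + alpha0 * w + beta0
    alpha = je0 * w + alpha0
    gamma_sq = 1.0 / (1.0 - beta ** 2)
    gamma0_sq = 1.0 / (1.0 - beta0 ** 2)
    w_star = gamma_sq * (alpha * x + 1.0 / gamma0_sq)
    u = je0 / (alpha ** 2 * gamma0_sq)
    w_metric = math.sqrt(w_star ** 2 - u ** 2)
    return beta, w_star - beta * u, w_star - u / beta, w_metric

def add_velocity(ux, uy, beta, w_a, w_b):
    """Velocity-addition law (20): (dx/dw, dy/dw) in F -> (dx_I/dw_I, dy_I/dw_I)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    denom = w_a + beta * ux
    return (ux + beta * w_b) / denom, uy / (gamma * denom)

beta, w_a, w_b, w_metric = gt_functions(0.5, 1.0, je0=0.01, alpha0=0.2, beta0=0.1)

# Light along x: the speed in F that solves dx_I/dw_I = 1 in (20).
light_x = (w_a - beta * w_b) / (1.0 - beta)
vx_from_x, _ = add_velocity(light_x, 0.0, beta, w_a, w_b)

# Light along y in F (dx = 0, dy/dw = W): its speed in F_I should be 1.
vx_from_y, vy_from_y = add_velocity(0.0, w_metric, beta, w_a, w_b)
print(vx_from_x, math.hypot(vx_from_y, vy_from_y))
```

Note that a signal moving purely along y in $F$ acquires an x-component in $F_I$, as expected for a moving frame.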
In particular, if $\beta_L^I = dx_I/dw_I = 1$, then the velocity-addition law (20) leads to the same spacetime-dependent speed of light (18) in GLA frames.

(C) The General Taiji Group

Let us consider two other GLA frames $F'$ and $F''$, which are respectively characterized by arbitrary velocities $\beta'(w')$, $\beta''(w'')$, initial accelerations $\alpha_0'$, $\alpha_0''$, and initial velocities $\beta_0'$, $\beta_0''$. Thus we have the GT transformations among $F_I$, $F$, $F'$ and $F''$. The GT transformation between $F$ and $F'$, and that between $F$ and $F''$, can be obtained. Using these relations, one can show that the GT transformation between $F'(x')$ and $F''(x'')$ has the same form as that for the $F$ and $F'$ frames. Other group properties can also be verified. Thus, the general taiji transformations form a group, which may be called the 'general taiji (GT) group.' This GT group involves one arbitrary acceleration function $\alpha(w)$ and two parameters, i.e., the initial acceleration $\alpha_0$ and the initial velocity $\beta_0$. The GT group includes the Wu and Møller groups (for constant linear accelerations) and the Lorentz and Poincare groups as special limiting cases.

(D) Physical Time in Inertial and Non-Inertial Frames

The usual time $t$ (measured in, say, seconds) and the universal constant $c$ (measured in, say, cm/sec) do not exist within this theory in general. The physical time $w$, called taiji time, has the dimension of length and can be realized physically by 'computerized clocks' [5] (or 'light clocks' as shown on p. xxvi in this volume). Since the variable speed of light is given by the function (18), it is very complicated to use such a non-constant speed of light to synchronize clocks in a GLA frame $F(w, x, y, z)$. However, it is not necessary in general to use light signals to synchronize clocks [2]. If the computer chips are not affected by acceleration, or the effect due to acceleration can be corrected, then one can use a grid of 'computer clocks' in $F$ to realize the taiji time $w$: namely, suppose a computer clock can accept information concerning its position in the $F_I$ frame, obtain $w_I$ from the nearest $F_I$ clock, and then compute and display $w$ using the inverse transformation of (12). In the case of constant jerk, we have the transformation of time $w$ (17), which includes the transformation of time, $w = \gamma_0(w_I - \beta_0 x_I)$, in relativity theory as a special limiting case when $J_{e0} \to 0$ and $\alpha_0 \to 0$. In a GLA frame, time $w$ is restricted by $\beta^2(w) < 1$ and space is limited by $x > -1/(\alpha(w)\gamma_0^2)$, as shown in (12) and (16).
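The 'computer clock' procedure described in (D) can be sketched for the constant-jerk case: given $(w_I, x_I)$, obtain $\beta$ from the first relation in (16), $w$ from (17), and then $x$ from (16). The code below is a minimal illustration under assumed parameter values, with $x_0^\mu = 0$ and $w > 0$; the function names are hypothetical. It checks the result by a round trip through (12).

```python
import math

# Hypothetical constant-jerk frame parameters (illustration only)
JE0, ALPHA0, BETA0 = 0.01, 0.2, 0.1
GAMMA0 = 1.0 / math.sqrt(1.0 - BETA0 ** 2)

def gt_forward(w, x):
    """GT transformation (12) with w0 = x0 = 0: (w, x) in F -> (w_I, x_I) in F_I."""
    beta = 0.5 * JE0 * w * w + ALPHA0 * w + BETA0
    alpha = JE0 * w + ALPHA0
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    shift = x + 1.0 / (alpha * GAMMA0 ** 2)
    return (gamma * beta * shift - BETA0 / (ALPHA0 * GAMMA0),
            gamma * shift - 1.0 / (ALPHA0 * GAMMA0))

def computer_clock(w_i, x_i):
    """Inverse transformation (16)-(17): what a clock in F would display."""
    big_w = w_i + BETA0 / (ALPHA0 * GAMMA0)
    big_x = x_i + 1.0 / (ALPHA0 * GAMMA0)
    beta = big_w / big_x                                   # first relation in (16)
    # Eq. (17), valid for w > 0 in the constant-jerk case
    w = (-ALPHA0 + math.sqrt(ALPHA0 ** 2 + 2.0 * JE0 * (beta - BETA0))) / JE0
    alpha = JE0 * w + ALPHA0
    x = math.sqrt(big_x ** 2 - big_w ** 2) - 1.0 / (alpha * GAMMA0 ** 2)
    return w, x

w_i, x_i = gt_forward(0.5, 1.0)
w_back, x_back = computer_clock(w_i, x_i)
print(w_back, x_back)  # should reproduce the starting point (0.5, 1.0) up to round-off
```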
(E) Classical Electrodynamics in Inertial and Non-Inertial Frames

Let us consider classical electrodynamics in GLA frames. For a continuous charge distribution in space, the invariant action for the electromagnetic fields and their interaction is assumed to be

$$S_{em} = -\int\left(a_\mu j^\mu + \frac{1}{4}f_{\mu\nu}f^{\mu\nu}\right)\sqrt{-g}\;d^4x, \qquad f_{\mu\nu} = D_\mu a_\nu - D_\nu a_\mu, \tag{21}$$

$$\sqrt{-g} = \sqrt{-\det g_{\alpha\beta}} = \sqrt{W^2 + U^2} = \gamma^2\left(\alpha x + \frac{1}{\gamma_0^2}\right) > 0,$$

where $D_\mu$ denotes the partial covariant derivative. The most general Maxwell equations for both inertial and non-inertial frames can be derived from $S_{em}$.

(F) Truly Universal Constants in both Inertial and Non-inertial Frames

Since the speed of light $c$ in an accelerated frame $F$ is no longer a universal constant, the invariant action for a charged particle moving in the electromagnetic 4-potential $a_\mu(x)$ in a GLA frame is assumed to be

$$S = \int\left(-m\,ds - e\,a_\mu dx^\mu\right), \qquad e = -1.6021891 \times 10^{-20}\sqrt{4\pi}\ (\mathrm{g\cdot cm})^{1/2}. \tag{22}$$
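As a quick arithmetic check, the charge $e$ in (22) can be combined with the quantum constant $J = 3.5177293 \times 10^{-38}$ g·cm quoted in this item to form the dimensionless ratio $\alpha_e = e^2/(4\pi J)$. The variable names below are illustrative; the numerical inputs are the ones given in the text.

```python
import math

# Values as quoted in the text (electromagnetic units with the sqrt(4*pi) factor)
E_CHARGE = -1.6021891e-20 * math.sqrt(4.0 * math.pi)   # (g cm)^(1/2)
J_QUANTUM = 3.5177293e-38                               # g cm

# Dimensionless coupling alpha_e = e^2 / (4 pi J)
alpha_e = E_CHARGE ** 2 / (4.0 * math.pi * J_QUANTUM)
print(alpha_e, 1.0 / alpha_e)
```

The factor $4\pi$ cancels, so the result depends only on the ratio $(1.6021891 \times 10^{-20})^2 / (3.5177293 \times 10^{-38})$, and no speed of light enters the calculation.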
The action $S$ is formally the same as that in a constant-linear-acceleration frame [2]. Following the same reasoning as that for the frames with constant linear accelerations [2], the truly universal and fundamental constants in both inertial and non-inertial frames are the quantum constant, $J = 3.5177293 \times 10^{-38}$ g·cm, and the electric charge in the electromagnetic units given by (22) [or $\alpha_e = e^2/(4\pi J) = 1/137.036$]. It is interesting that these universal constants for non-inertial frames with arbitrary linear accelerations turn out to be precisely the same as those in the theory of relativity which is formulated solely on the basis of the first principle of relativity, without making any assumptions concerning the speed of light [5].

(G) Generalized Klein-Gordon and Dirac Equations for Non-inertial Frames

The most general Klein-Gordon and Dirac equations for both inertial and non-inertial frames are

$$\left[g^{\mu\nu}(iJD_\mu - ea_\mu)(iJD_\nu - ea_\nu) - m^2\right]\phi = 0, \tag{23}$$

$$iJ\Gamma^\mu\partial_\mu\psi + \frac{iJ}{2}(\partial_\mu\Gamma^\mu)\psi + \frac{iJ}{2}\left(\partial_\mu\ln\sqrt{-g}\right)\Gamma^\mu\psi - m\psi = 0, \tag{24}$$

where $\{\Gamma^\mu, \Gamma^\nu\} = 2g^{\mu\nu}(x)$. The Dirac equation can be derived from a symmetrized Lagrangian [9].

The author would like to thank Dr. L. Hsu for his useful discussions and suggestions. The work was supported in part by the Jing Shin Research Fund of the UMassD Foundation and the Potz Science Fund.
References

1. For early discussions of acceleration transformations, see A. Einstein, Jahrb. Rad. Elektr. 4, 411 (1907); L. Page, Phys. Rev. 49, 254 (1936); C. Møller, Danske Vid. Sel. Mat.-Fys. xx, No. 19 (1943); T. Fulton, F. Rohrlich and L. Witten, Nuovo Cimento XXVI, 652 (1962) and Rev. Mod. Phys. 34, 442 (1962); E. A. Desloge and R. J. Philpott, Am. J. Phys. 55, 252 (1987).
2. For more recent discussions based on limiting 4-dimensional symmetry, see Jong-Ping Hsu and Leonardo Hsu, Nuovo Cimento B 112, 575 (1997) and Chin. J. Phys. 35, 407 (1997); Jong-Ping Hsu, Einstein's Relativity and Beyond - New Symmetry Approaches (World Scientific, Singapore, 2000), Chapters 21-23. Limiting 4-dimensional symmetry is simply the 4-dimensional symmetry of the Lorentz and Poincare groups applied to acceleration transformations in the limit of zero acceleration.
3. C. Møller, Danske Vid. Sel. Mat.-Fys. xx, No. 19 (1943); see also The Theory of Relativity (Oxford University Press, 1952), Chapter VII.
4. See ref. 2 and Jong-Ping Hsu and Leonardo Hsu, in JingShin Theoretical Physics Symposium in Honor of Professor Ta-You Wu (Editors J. P. Hsu and L. Hsu, World Scientific, 1998), pp. 393-412.
5. Such a taiji time w appears naturally in the theory of spacetime for inertial frames based solely on the first postulate of relativity, without making any postulate regarding the speed of light. See Jong-Ping Hsu and Leonardo Hsu, Phys. Lett. A 196, 1 (1994) and ref. 2. 'Taiji' denotes, in ancient Chinese thought, the ultimate principle or the condition as it existed before the creation of the world.
6. See, for example, A. A. Logunov and M. A. Mestvirishvili, Progr. Theor. Phys. 74, 31 (1985). If Logunov's field-theoretic approach to gravity is viable, then the present GT spacetime could provide a specific mathematical framework to further explore its physical implications.
7. The generalization with this replacement $\alpha_0 \to \alpha(w)$ [for the accelerations in the denominators of (9) and (10)] is crucial. Mathematically, this replacement implies that the two variables w and x in W(w,x) cannot be separated. In contrast, a general transformation for GLA frames was discussed in a previous paper based on the separation of w and x in W(w,x). See J. P. Hsu, in FRONTIERS OF PHYSICS AT THE MILLENNIUM, SYMP (Ed. Y. L. Wu and J. P. Hsu, World Scientific, 2001). The generality in this conference paper turned out to be restricted and, hence, not completely satisfactory because the additional assumption of separation of variables, $W(w,x) = W_1(w)W_2(x)$, prevents the truly general linear accelerations from being fully realized. See also T. Kleinschmidt, Master Thesis, UMass Dartmouth (2001).
8. This is analogous to the fact that pseudo-Cartesian coordinates are the preferred coordinates for the Lorentz transformations.
9. Such a Lagrangian involves $(iJ/2)[\bar\psi\Gamma^\mu\partial_\mu\psi - (\partial_\mu\bar\psi)\Gamma^\mu\psi]$. $\Gamma^\mu$ can be related to the constant Dirac matrices $\gamma^a$ by $\Gamma^\mu = \gamma^a e^\mu_a$, where the tetrad $e^\mu_a$, $a = 0, 1, 2, 3$, is a set of four mutually orthogonal unit vectors. In the field theory with the Poincare or the de Sitter group as the gauge group, the gauge invariant Lagrangian must have the tetrad as a 'scale gauge field,' in addition to the usual gauge fields. See J. P. Hsu, Phys. Lett. 119B, 328 (1982).
About the Author: Jong-Ping Hsu

Jong-Ping Hsu received his B.S. in physics from the National Taiwan University and his M.S. in physics from the National Tsing Hua University. In 1965, he traveled to the University of Rochester to study particle physics and earned his Ph.D. degree in 1969. He has worked at McGill University; Rutgers, the State University of New Jersey; the University of Texas at Austin; and Marshall Space Flight Center, NASA; and is currently at the University of Massachusetts Dartmouth. He has also been a visiting scientist at Brown University, MIT, National Taiwan University, Beijing Normal University and Academy of Science, China. Hsu's research has been concentrated in the areas of gauge field theories, the unified electroweak theory, fuzzy quantum mechanics and fields, and broad views of 4-dimensional symmetry. In addition to editing four conference proceedings, he has published one book and over 120 papers and articles. Hsu is currently a Chancellor Professor of physics and the Director of the Jing Shin Research Fund at the University of Massachusetts Dartmouth.
About the Author: Yuan-Zhong Zhang

Yuan-Zhong Zhang graduated from the University of Science and Technology of China in 1965. He joined the Institute of Physics, Academia Sinica, as an Assistant Researcher in the same year and became an Associate Researcher in 1977. Zhang was an Associate Researcher at the Institute of Theoretical Physics in 1980, then was promoted to Associate Professor in 1986 and Professor in 1992. In addition, Zhang was an Associate Member of the International Center for Theoretical Physics, Italy, during 1988-1993. He has published more than 80 papers and two books. Zhang is currently head of the Gravity Group at the Institute of Theoretical Physics at Academia Sinica. He is also a Member of the Science Committee of the Institute, Vice-chairman of the council of the Chinese Society of Gravitation and Relativistic Astrophysics, and a Co-Editor of the Editorial Board of the Journal of Chinese Physics.
Advanced Series on Theoretical Physical Science

LORENTZ AND POINCARE INVARIANCE: 100 YEARS OF RELATIVITY
by J-P Hsu (University of Massachusetts Dartmouth) and Y-Z Zhang (Institute of Theoretical Physics)
This collection of papers provides a broad view of the development of Lorentz and Poincare invariance and spacetime symmetry throughout the past 100 years. The issues explored in these papers include: (1) formulations of relativity theories in which the speed of light is not a universal constant but which are consistent with the four-dimensional symmetry of the Lorentz and Poincare groups and with experimental results, (2) analyses and discussions by Reichenbach concerning the concepts of simultaneity and physical time from a philosophical point of view, and (3) results achieved by the union of the relativity and quantum theories, marking the beginnings of quantum electrodynamics and relativistic quantum mechanics. Ten of the fundamental experiments testing special relativity are also discussed, showing that they actually support a four-dimensional spacetime based on broad Lorentz and Poincare invariance which is more general than and includes the special theory of relativity. The generalization of the concepts of simultaneity, physical time and the nature of the speed of light within a four-dimensional spacetime framework leads to the conclusion that the symmetries embodied by the special theory of relativity can be realized using only a single postulate — the principle of relativity for physical laws.
ISBN 981-02-4721-4
www.worldscientific.com