= <j>(C, T). , one can replace this infinite group by a finite group, as is discussed by PITTERI [7]. There are the central inversions, for which the action is ea -> - e a , b 3 = 0, ), one can replace B by a constant multiple of H, the magnetic field intensity, as this is interpreted in much of the literature. TRUESDELL & TOUPIN [13, Ch. F] use a different interpretation of H, which I [12] adopted. Besides 0. Granted this and using the action of SO(3) on ea, we get the induced action on L as L^RL, )= should be invariant under rotations, {ea,9i,B), where 6 denotes absolute temperature. Given Eq. (2.4), it is really better to use one of the form (P = 4>{ta,pi,6). 0 or e, • e2 A e3 < 0. mbaeb, p -> a p + laea, {m, a, 1} G L(E a , P)& $ L(ea,p). orei-e2Ae3 V-cfl=0. 0. , will be very smooth. However, p can easily lose its customary smoothness, at places where bifurcation occurs. Bifurcation theory would suggest that it would acquire some type of algebraic singularity, $ being rather likely to inherit this. As I see it, it is not unlikely that such bifurcation would trigger or be associated with some type of phase transition. Clearly, dependence on other types of internal parameters could, equally well, give rise to such effects. In this light, we should be wary of assuming that , at least.' Isothermal elastic moduli correspond to second derivatives with respect to C. Adiabatic moduli differ from these by terms involving the remaining second derivatives. As such things are judged, both kinds should reduce to continuous functions of 6, along the equilibrium paths considered, as should C. Thus I will always assume that 9, b->c so, at 9, the configuration has the form (3.3). There will also be the branch given by (3.6) as well as the third mentioned, the three meeting at 9, so the initial assumption implies a type of bifurcation. Of course, (3.1) implies the usual differential conditions (3.7) (C,9)-]^d3 O /da = 0. 
Thus, it makes sense to think of C 4 4 as a function of 9 on this branch, near 9 = 9. If it were positive, except at 9 = 9, a slight perturbation in p2, (RFHEa, 0), (2.8) with fixed values of Ea, the kind of constitutive equation which is associated with thermoelasticity, incorporating some general features suggested by molecular theory. One aim is to try to use such theory to analyze near-transition phenomena, which can involve twinning, in particular. ZANZOTTO [4] has explored the experimental evidence concerning the applicability of the above theory of twinning, writing that " . . . some experimental observations suggest that it is not safe to use thermoelasticity theory to describe the behavior of bodies whose structure is not given by a simple Bravais lattice. On the contrary, it does seem rather safe to use it for the latter. The reasons are rather unclear." Among other things, he discusses clear evidence that (2.6) fails to apply to deformations occurring in mechanical twinning of zirconium, which is a rather simple hexagonal close-packed crystal, but not a Bravais lattice. For it, there is no clear evidence that (2.4) fails. In some more complex crystals, for example calcite, (2.6) seems to apply to twinning deformations, so it seems hard to spot any clear pattern in this. In cases I have checked, transition strains in Bravais lattices also fit (2.6). Without belaboring the matter, ZANZOTTO'S statement seems to me the best available rule of thumb. So, it seems that our theory is pretty good for Bravais lattices. Occasionally it seems to apply to a crystal not of this kind, but one better look hard at the available experimental evidence before gambling much on this. Primarily, we have used ideas of molecular theory to estimate the group G. Conceivably, thermoelasticity theory could apply, when some of these ideas fail but, here, I will ignore this possibility. 
Workers involved in molecular theories of crystal elasticity have not considered the possibility that (2.6) fails to apply, as far as I know. To a large extent, the classical theory of crystallographic groups aims at describing the symmetry of particular configurations. Such theory, as it applies to Bravais lattices, is briefly summarized by SCHWARZENBERGER [5], who also covers some topological considerations to be discussed later. I won't repeat the references he gives to group-theoretic results mentioned below. 0 (flk-ap,9) = q>(Ak-CApt6)=y(Ak'QTCQAp,9), = f{v,r,k,e) = f(v,r,-k,e),
(56)
In principle, $\phi$ should be considered to be invariant under an infinite discrete group of the kind described by Ericksen [6]. However, as is discussed by Parry [7] and Pitteri [8], we need only require invariance under the point group corresponding to our cubic reference, if we are concerned with deformations meeting restrictions which they describe. It so happens that such restrictions are met by all deformations satisfying our constraints. Thus, as long as we accept the notion that the constraints apply, we need only require that
$$\phi(H^{T} C H, T) = \phi(C, T) = \hat\phi(f, g, h, T), \tag{57}$$
with $H$ belonging to the indicated point group. In particular, this includes orthogonal transformations interchanging pairs of our preferred base vectors, which means that $\hat\phi$ should be a symmetric function of $f$, $g$ and $h$, this being enough to ensure invariance under the full group. Note that, with (47), $\phi$ is then invariant under (46). In physical terms, it is pretty obvious that twin-related configurations should have the same energy. Physically, this supports the view that $\hat\phi$ should remain invariant under the cubic group when the crystal has transformed to configurations of the tetragonal type. Given this symmetry, we can reduce $\phi$ to a function of elementary symmetric functions, a matter discussed carefully by Ball [9]. Bearing in mind the constraint (12), this means that $\phi$ is expressible in the form
$$\phi = \hat\phi(J, K, T), \tag{58}$$
where
$$6J = (f-1)^2 + (g-1)^2 + (h-1)^2 = \operatorname{tr}(C-1)^2, \tag{59}$$
$$2K = (f-1)(g-1)(h-1) = \det(C-1). \tag{60}$$
Mathematically, potentials of essentially the same form and similar character arise in considerations of isotropic-nematic phase transformations in liquid crystals, so that we will borrow some results occurring in Ericksen's [10] discussion of these. First, the inequality
$$K^2 \le J^3 \tag{61}$$
always holds. When this reduces to equality, at least two of the quantities $f$, $g$ and $h$ must be equal. Refining this a bit, we have
$$f = g = h = 1 \iff J = K = 0, \tag{62}$$
characterizing our cubic phases. Configurations of the tetragonal type are covered by
$$\left.\begin{aligned} f &= g \ne h \\ g &= h \ne f \\ f &= h \ne g \end{aligned}\right\} \iff K^2 = J^3 > 0. \tag{63}$$
These correspond to the nematic phases in liquid crystals, the cubic phases being the analog of the isotropic phases in the latter. Also, the constraint (12), together with the condition that these functions be positive, provides another restriction. By an elementary calculation,
$$fgh = 2K - 3J + 1 > 0, \tag{64}$$
J. L. ERICKSEN
one can show that (61) and (64), along with $J > 0$, cover the limitations on possible values of $J$ and $K$. In general terms, we then want $\hat\phi$ to have an absolute minimum of the kind indicated by (62) when $T > T_c$, switching to the kind indicated by (63) when $T < T_c$, it being possible that both retain some status near $T = T_c$, as at least relative minimizers. Expressing some of these ideas more formally, we at least want that
$$\hat\phi(J, K, T) > \hat\phi(0, 0, T), \qquad T > T_c \tag{65}$$
and, for some choice of the function $J = J_0(T)$, and for one of the two choices of algebraic signs,
$$\hat\phi(J, K, T) > \hat\phi(J_0, \pm J_0^{3/2}, T) = \psi(J_0, T), \qquad T < T_c. \tag{66}$$
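The restrictions (61) and (64) on $J$ and $K$ can be spot-checked numerically. The following is only a minimal sketch, assuming that the constraint (12), which is not reproduced in this section, takes the form $f + g + h = 3$; the function names are hypothetical and chosen for illustration.

```python
import random

def invariants(f, g, h):
    """Return (J, K) as defined in (59)-(60) for the principal values f, g, h."""
    a, b, c = f - 1.0, g - 1.0, h - 1.0
    J = (a * a + b * b + c * c) / 6.0
    K = a * b * c / 2.0
    return J, K

def sample_constrained(rng):
    """Draw positive f, g, h with f + g + h = 3 (assumed form of constraint (12))."""
    while True:
        f = rng.uniform(0.0, 3.0)
        g = rng.uniform(0.0, 3.0)
        h = 3.0 - f - g
        if min(f, g, h) > 0.0:
            return f, g, h

def check(n=1000, seed=0, tol=1e-9):
    """Verify inequality (61) and identity (64) on n random constrained samples."""
    rng = random.Random(seed)
    for _ in range(n):
        f, g, h = sample_constrained(rng)
        J, K = invariants(f, g, h)
        assert K * K <= J ** 3 + tol                        # inequality (61)
        assert abs(f * g * h - (2 * K - 3 * J + 1)) < tol   # identity (64)
    return True
```

For a configuration of tetragonal type such as $f = g = 1.2$, $h = 0.6$, the sketch gives $K^2 = J^3$, the equality case of (61) described by (63).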
When they first reported these transformations in an A-15 superconductor, Batterman and Barrett [11] opined that they might well be of second order, this being a reasonable opinion, I think. For this, it is important that $J_0 \to 0$ as $T \to T_c$, and data such as are presented by Keller and Hanak [2] indicate that this might be true. To make a long story short, experimentation still seems to leave doubt as to whether such transformations are of second-order, or of first-order, with small discontinuities in $J_0$, etc. masked by experimental errors. For analyzing such small deformations, it seems natural to try to approximate $\phi$ by a polynomial of rather low degree in $C - 1$, if you like by the first few terms in the Taylor expansion of a smooth function. One of the form
$$\hat\phi = a(T) + b(T)J + c(T)K + d(T)J^2 \tag{67}$$
covers the possible quartics. There should be no danger of confusing the temperature-dependent coefficients with constants similarly labelled earlier. Assume that the temperature-dependent coefficients are smooth and similarly approximated near $T = T_c$, and you have what is sometimes called mean field theory. As is discussed by Wilson [12], for example, such assumptions go wrong in analyses of critical points in fluids, etc., situations bearing some similarity to the kinds of transformations considered here. Still, it seems to me worthwhile to better understand what kinds of predictions are associated with such a guess, and my own understanding of this leaves much to be desired. Of course, one could try a compromise, using (67), but allowing the coefficients to have mild singularities at $T = T_c$.
With (61) and (67), we clearly have
$$\hat\phi \ge \psi(J, T) = a + bJ - |c| J^{3/2} + dJ^2, \tag{68}$$
from which it is clear that we need
$$d > 0 \tag{69}$$
to get the minimizers, so we will try assuming this. Were $d < 0$ for this term in a Taylor expansion, one might still have the minima, but one would need to consider higher order terms in the expansion to sort this out, for a smooth potential. To have even a relative minimum of cubic form ($J = K = 0$), for $T > T_c$, we must have
$$b(T) > 0, \qquad T > T_c. \tag{70}$$
Other extremals can be located, by setting the derivative of $\psi$ equal to 0, giving
$$b - 3|c| J^{1/2}/2 + 2dJ = 0, \tag{71}$$
a quadratic in $J^{1/2}$. It will have real roots if
$$9c^2 > 32bd, \tag{72}$$
Constitutive theory for some constrained elastic crystals
and we want this, at least for $T < T_c$. For whatever it is worth, the Landau-type argument that the transformation should not be of second-order is as follows. To have bifurcation occur at $T = T_c$, it is easy to see that one needs $b(T_c) = 0$, so $J = 0$ then satisfies (71). At $T_c$, $\phi$ should still be a minimum, for $J = K = 0$. An inspection of (67) or (71) makes clear that, for this, it is necessary that $c(T_c) = 0$. Grant that $\phi$ is thrice differentiable and, by essentially the same analysis, you come to this conclusion. As Landau [4] saw it, it is highly improbable that two functions of one variable should vanish simultaneously. Nowadays, experts in bifurcation theory would, I think, agree that generically, such a transformation is not of second-order. If we argue generically, $b$ should remain positive at and near $T = T_c$, so that the cubic phase should retain some status, as a relative minimizer, for $T < T_c$. Then, as $J$ increases, $\psi$ must take on a local maximum before it can take on another minimum. Thus the latter must correspond to the larger root of (71), when this is real. At this, we have $J = J_0(T)$, with
$$J_0^{1/2} = \left(3|c| + \sqrt{9c^2 - 32bd}\right)/8d. \tag{73}$$
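The competition between the two minima of $\psi$ can be explored with a few lines of code. This is a hedged sketch rather than part of the theory: it evaluates $\psi$ from (68) at a fixed temperature, takes the larger root of (71) as in (73), and compares $\psi(J_0)$ with $\psi(0)$. The function names and the sample coefficients are hypothetical, and $d > 0$ is assumed, as in (69).

```python
import math

def psi(J, a, b, c, d):
    """psi(J, T) of (68) at fixed temperature: a + b J - |c| J^(3/2) + d J^2."""
    return a + b * J - abs(c) * J ** 1.5 + d * J * J

def tetragonal_root(b, c, d):
    """Larger root of (71), returned as J0; None if d <= 0 or the roots are not real."""
    disc = 9.0 * c * c - 32.0 * b * d
    if d <= 0.0 or disc < 0.0:
        return None
    x = (3.0 * abs(c) + math.sqrt(disc)) / (8.0 * d)  # x = J0^(1/2), cf. (73)
    return x * x

def preferred_phase(a, b, c, d):
    """Name the phase with the lower value of psi among J = 0 and J = J0."""
    J0 = tetragonal_root(b, c, d)
    if J0 is None or psi(J0, a, b, c, d) >= psi(0.0, a, b, c, d):
        return "cubic"
    return "tetragonal"
```

With the illustrative values $b = d = 1$, the sketch returns "cubic" for $c = 1.9$ and "tetragonal" for $c = 2.1$, so the switch occurs as $c^2$ passes through $4bd$.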
By elementary calculation, the cubic phase $J = 0$ has the lowest energy when
$$\psi(J_0, T) > \psi(0, T) \iff c^2 < 4bd, \tag{74}$$
and we want this for $T > T_c$. Similarly, the tetragonal phase does when
$$\psi(0, T) > \psi(J_0, T) \iff c^2 > 4bd, \tag{75}$$
and we want this for $T < T_c$. Of course, $T_c$ represents the temperature at which the two energies become equal, so that
$$c^2 = 4bd \quad \text{at} \quad T = T_c, \tag{76}$$
and generically, this should be an isolated temperature. Assuming this form of $\hat\phi$ applies to the A-15 superconductors, we must have $J_0(T_c)$ very small, which requires that, for $T$ near $T_c$,
$$|c|/d \ll 1, \qquad b/d \ll 1, \tag{77}$$
so that, by the indicated kind of reasoning, this gives one estimate of what $\hat\phi$ might look like, near the transformation, certainly involving a bias in favor of the notion that the transformation is of first-order. One might introduce guesses about the temperature dependence of coefficients near $T_c$, based on ideas of smoothness. Otherwise, this seems to be the simplest kind of model which accommodates the two phases and twinning. It could do no harm to better understand what all it predicts, and how this compares to the behavior of real crystals, but it is a somewhat naive guess.

6. EQUILIBRIUM EQUATIONS
Here we begin by reverting to index notation. In dealing with constrained elastic materials, we follow the most common practice, which is to use the format suggested by Ericksen and Rivlin [13], to introduce kinds of Lagrange multipliers or forces of constraint. Some possible generalizations are considered by Antman [14], who argues that they might well be of import for some kinds of theories, but not elasticity theory. Here we write
$$\bar\phi = \phi - \pi(C_{11} + C_{22} + C_{33} - 3) + 2(\lambda_1 C_{23} + \lambda_2 C_{31} + \lambda_3 C_{12}), \tag{78}$$
where $\pi$ and the $\lambda$'s are arbitrary functions of position, $\phi$ being given by a definite constitutive equation, for example, that represented by (67). Then, ignoring the constraints, treat $\bar\phi$ as the potential for an unconstrained material, using any of the common formulae for calculating stresses, to be used as usual in equations of equilibrium or motion. Without
really looking at such calculations, we can notice one curiosity. We have four multipliers to play with, only three equations to be satisfied, so that it seems that it should be possible to get any kinematically possible deformation to satisfy the equilibrium equations with zero body force, this still leaving only three equations to determine four unknowns. Before, we found that a naive count of equations and unknowns was misleading, so that we should look more closely at the equations. For example, the Piola-Kirchhoff stress tensor is given by 71 = dftdy'j = T'ky[k,
(79)
where fjk
=
fkj
=
af/dCjt + d$/dCkJ,
(80)
or, in matrix form,
f=b3 IU2
n2 xA,
(8i)
Xi /X3II
with H = 8<j>/dCkk — n
(no sum).
(82)
The equilibrium equations
can, with the help of (13), be reduced to the form
Ti+r}kfjk = o.
(83)
With C of the form (11) we get, after some calculation
(A,hHAih-U*h =*n (^,),2 + (^3),l-(^),2 =2 \,
(hX1X, +
(84)
{hX,\2-(hn),,=^J
where 2®, =[
2$2 = [^-2g(df/8g)h,
(85)
2$ 3 = [(0-2A(fl<£/5/!)],3. With the deformation given, (84) then reduces to three linear equations for the four unknown multipliers, seeming to reinforce our first impression. We do know that we have the generalized plane deformations given by (21). Assuming that the multipliers do not depend on x3, the above system reduces to
{fX,h = {fn + a)A (<M3),, = (
^,2 + A 2 , , = 0
J
(86)
with
2
(87)
Then (86), and (86) 2 can be viewed as integrability conditions for functions £, and rf such that
/ * , = {.„
fn + c = i,2-)
(88)
Eliminating A3 and n gives 9li-f 1.2 = 0 1 9l2-fi.i=g*-fry
(89)
One can solve for the gradient of either function, then cross-differentiate to get an equation for the other. For example, that for £ is [(g/f)i,ih -Kg/niih
= (*-g°/f),2
(90)
a linear hyperbolic equation having as characteristics our old friends, the possible twin planes. Locally, any solution of this generates a solution of the equilibrium equations. That is, one can work back from this to get $\lambda_3$ and $\pi$, etc. Similarly, we can satisfy (86)$_3$ by writing $\lambda_1$ and $\lambda_2$ in terms of derivatives of an arbitrary function. Certainly, this serves to confirm the first impression. With the several constraints, we begin to approach the situation encountered in rigid body mechanics. There one might introduce stress as an essentially arbitrary tensor, restricted a bit by the condition that its divergence vanishes, for example. The idea is more cumbersome than useful, so that we use other familiar ideas to formulate and solve physical problems. It does suggest that we might need to change our thinking habits, to make effective use of these theories of highly constrained materials.

Inherently, the physical situations envisaged are complex. As should be clear from our consideration of minimizers, the simplest problem, of the equilibrium of an unloaded crystal, is nontrivial, and requires stability analyses. For various static problems, one can use the notion of minimum energy to formulate problems, building stability criteria into the formulation. Certainly, this is a sensible approach, but it is hardly a panacea. I have begun to look at some of the simplest experiments from this point of view, but find it tricky, so that it seems premature to comment on this. Of course, the general format is designed to produce some agreement with a few of the observations, but not enough to firm up specific constitutive equations, or to assess the quality of predictions of such theory.

Acknowledgement: This material is based on work supported by the National Science Foundation under Grant No. MEA-8304750.
REFERENCES
1. N. Nakanishi, Lattice softening and the origin of SME. In Shape Memory Effect in Alloys (Edited by J. Perkins). Plenum, New York (1975).
2. K. R. Keller and J. J. Hanak, Phys. Rev. 154, 628 (1967).
3. J. L. Ericksen, Archs Ration. Mech. Analysis 73, 99 (1980).
4. L. D. Landau, On the theory of phase transitions. In Collected Papers of L. D. Landau (Edited by D. TerHaar). Gordon and Breach, and Pergamon, New York (1965).
5. A. E. H. Love, A Treatise on the Mathematical Theory of Elasticity, 4th edn. Cambridge Univ. Press, England (1927).
6. J. L. Ericksen, Adv. Appl. Mech. 17, 189 (1977).
7. G. P. Parry, Math. Proc. Camb. Phil. Soc. 80, 189 (1976).
8. M. Pitteri, J. Elasticity 14, 175 (1984).
9. J. M. Ball, Differentiability properties of symmetric and isotropic functions, to be published in Duke Math.
10. J. L. Ericksen, A thermodynamic view of order parameters for liquid crystals. In Orienting Polymers, Springer-Verlag Notes in Mathematics (Edited by J. L. Ericksen), Vol. 1063. Springer-Verlag, New York (1984).
11. B. W. Batterman and C. S. Barrett, Phys. Rev. Lett. 13, 290 (1964).
12. K. G. Wilson, Rev. Mod. Phys. 55, 583 (1983).
13. J. L. Ericksen and R. S. Rivlin, J. Rat. Mech. Analysis 3, 281 (1954).
14. S. S. Antman, Re. Accad. Naz. LinceilO, 256 (1982).
Material Instabilities in Continuum Mechanics and Related Mathematical Problems © 1988 Oxford University Press
11 SOME CONSTRAINED ELASTIC CRYSTALS J. L. ERICKSEN
1. INTRODUCTION

Often, crystals undergo phase transformations involving some change of symmetry. When such transformations are of second-order, and often when they are of first-order but weakly so, it is observed that some linear elastic moduli become quite small compared to others, near transition. It then seems rather reasonable to try using simpler, idealized theory, roughly considering the larger moduli as infinite or, more properly, regarding the crystals as constrained materials. Motivated by observations of cubic-tetragonal transformations in A-15 superconductors, Ericksen [1] formulated a thermoelasticity theory of this kind, which also shows some promise as a theory for somewhat similar transformations occurring in Indium-Thallium systems. A variety of different constraints would be in reasonable agreement with the observations. Ericksen [1] gives one theoretical argument leading to a unique choice. Here, we give a rather different argument leading to the same choice.

In the Indium-Thallium systems, but not in the A-15 superconductors, cubic and tetragonal phases have been observed to coexist, by Burkart and Read [2], and they form rather complicated configurations. Ball and James [3] found a way to deduce algorithms which enable one to describe coarser features of such
configurations. We will present and discuss the results of such calculations, for the theory of constrained crystals.

The reasoning of Ball and James does not suggest why such configurations should be stable enough to be observed. Burkart and Read [2] attempt to rationalize this, but a satisfactory theory for this remains to be developed. It should explain why such coexistence is not observed in A-15 superconductors. It would be easy to explain this, if such transformations are of second-order, as was suggested by the first workers to observe them, Batterman and Barrett [4]. As is discussed by Ericksen [5], reasoning accepted by many physicists leads to the conclusion that they should not be, but there is some room for argument about this. We have not seen clear experimental proof that the implied discontinuities in deformation occur. In any event, such stability questions remain open.

2. CONSTRAINTS

For the aforementioned crystals, we have cubic crystals at higher temperatures, transforming to crystals of tetragonal form, as the temperature $T$ is lowered through a critical value $T_0$ and, commonly, the tetragonal phase contains twins. The transformation involves a discontinuity in deformation which is quite small. In the A-15 superconductors, it might be zero or just small enough to be obscured by experimental errors. In the Indium-Thallium alloys, data presented by Burkart and Read [2] indicate strains of the order $10^{\ldots}$, definitely non-zero, but not terribly large. Briefly, we seek a theory capable of describing both phases, twinning in the tetragonal phase, as well as the effects of small loadings and temperature changes. Roughly, the deformations of interest should be considered as finite, although they are not very large; conversion of one twin to another involves finite rotations, for example.

In the cubic phase, the linear elastic strain energy $W$ is of the form
$$2W = (\hat c_{11} - \hat c_{12})(\eta_{11}^2 + \eta_{22}^2 + \eta_{33}^2) + [(\hat c_{11} + 2\hat c_{12})/3](e_{11} + e_{22} + e_{33})^2 + 4\hat c_{44}(e_{12}^2 + e_{23}^2 + e_{31}^2), \tag{2.1}$$
where
$$\eta_{ij} = e_{ij} - \tfrac{1}{3}(e_{11} + e_{22} + e_{33})\delta_{ij} \tag{2.2}$$
and $e$ is the infinitesimal strain tensor. The $\hat c_{ij}$ are elastic moduli, labelled as most experimentalists do, except that we have added carets to distinguish these from components of a tensor $C$ to be used later. Ericksen [1], noting observations indicating that, for $T$ near $T_0$,
$$(\hat c_{11} - \hat c_{12})/(\hat c_{11} + 2\hat c_{12}) \ll 1, \qquad (\hat c_{11} - \hat c_{12})/\hat c_{44} \ll 1, \tag{2.3}$$
thought of the denominators as effectively infinite, to get
$$e_{11} + e_{22} + e_{33} = 0, \tag{2.4}$$
and
$$e_{12} = e_{23} = e_{31} = 0, \tag{2.5}$$
as a first estimate of likely constraints. In some way, we need to extrapolate these, to apply to finite measures of strain.

From the viewpoint of nonlinear thermoelasticity theory, it is convenient to take as a reference configuration the (unstressed) cubic phase at the transition temperature $T_0$. Refer this to rectangular Cartesian coordinates $x = (x_1, x_2, x_3)$, with the orthonormal base vectors $e_i$ parallel to the usual cubic lattice vectors, as is presumed in (2.1). A deformation maps $x$ to
$$u = u(x), \tag{2.6}$$
with
$$F = \nabla u, \qquad \det F > 0 \tag{2.7}$$
the usual deformation gradient, and the symmetric tensor
$$C = F^{T}F \tag{2.8}$$
is a commonly used measure of finite deformation. The only
reasonable extrapolation of (2.5) seems to be
$$C_{12} = C_{23} = C_{31} = 0. \tag{2.9}$$
As will be made clear, if it is not already so, a variety of reasonable extrapolations of (2.4) exist. Ericksen [1] gives an argument leading to the choice
$$C_{11} + C_{22} + C_{33} = 3, \tag{2.10}$$
and, shortly, we will argue this in a different way. Given the experimental errors involved in measuring small strains, it is hard to say which extrapolation might best fit the deformations experienced by the real crystals. From what little experience we have, if one picked the constraint to make the mathematical theory simplest, one might well pick (2.10).

3. JUMP DISCONTINUITIES

Consider a material surface, with unit normal $N$ in the reference configuration, across which the deformation gradient $F$ suffers a finite discontinuity, with $u$ remaining continuous. If the two limiting values are denoted by $F$ and $\bar F$, the usual kinematical conditions of compatibility read
$$\bar F = F(1 + A \otimes N), \tag{3.1}$$
where $A$ is the so-called amplitude vector. From this, we get
$$\bar C = \bar F^{T}\bar F = (1 + N \otimes A)\,C\,(1 + A \otimes N) = C + N \otimes M + M \otimes N, \tag{3.2}$$
where
$$2M = 2CA + (A \cdot CA)N. \tag{3.3}$$
We assume that (2.9) applies, so that
$$\bar C - C = \alpha\, e_1 \otimes e_1 + \beta\, e_2 \otimes e_2 + \gamma\, e_3 \otimes e_3. \tag{3.4}$$
For the present, we ignore the remaining, ambiguous constraint. From (3.2), if $V$ is a non-zero vector satisfying
$$V \cdot M = V \cdot N = 0, \tag{3.5}$$
then
$$(\bar C - C)V = 0. \tag{3.6}$$
Using (3.4), we then get
$$\alpha V_1 = \beta V_2 = \gamma V_3 = 0, \tag{3.7}$$
with $V_i = V \cdot e_i$, the components of $V$. If $\alpha = \beta = \gamma = 0$, $\bar C = C$, implying the existence of a rotation matrix $R$ such that
$$\bar F = RF. \tag{3.8}$$
With (3.1), this gives
$$R = F(1 + A \otimes N)F^{-1} = 1 + FA \otimes F^{-T}N. \tag{3.9}$$
This implies that any vector perpendicular to $F^{-T}N$ is an eigenvector of $R$, with eigenvalue one. Since a nontrivial rotation has only one axis, we have
$$\alpha = \beta = \gamma = 0 \implies R = 1 \implies A = 0, \tag{3.10}$$
the trivial case. This result is well-known to those familiar with twinning theory. If two of the three quantities vanish, say
$$\alpha = \beta = 0 \ne \gamma, \tag{3.11}$$
it is easy to show that (3.2)-(3.4) imply that $N$, $M$ and $A$ must all be parallel to $e_3$, so we can take
$$N = e_3, \qquad A = \delta e_3. \tag{3.12}$$
With the determinants of $F$ and $\bar F$ positive, (3.1) implies that
$$\det(1 + A \otimes N) = 1 + A \cdot N > 0. \tag{3.13}$$
For the particular case at hand, this gives $1 + \delta > 0$, a mild restriction. If just one of the three quantities vanishes, there is no important loss of generality in assuming that
$$\gamma = 0, \qquad \alpha\beta \ne 0, \qquad V = e_3. \tag{3.14}$$
Then, with (3.5), we can represent the unit vector $N$ in the form
$$N = \cos\phi\, e_1 + \sin\phi\, e_2. \tag{3.15}$$
Now $M$, also perpendicular to $V$, must satisfy
$$M \otimes N + N \otimes M = \alpha\, e_1 \otimes e_1 + \beta\, e_2 \otimes e_2. \tag{3.16}$$
An elementary analysis then shows that $M$ has the form
$$M = mP, \qquad P = \cos\phi\, e_1 - \sin\phi\, e_2, \tag{3.17}$$
where $m$ is some non-zero scalar. This gives
$$\bar C_{11} - C_{11} = \alpha = 2m\cos^2\phi, \qquad \bar C_{22} - C_{22} = \beta = -2m\sin^2\phi, \qquad \bar C_{33} = C_{33}. \tag{3.18}$$
The diagonal matrices $C$ and $\bar C$ must be positive definite, implying that
$$C_{11} + 2m\cos^2\phi > 0, \qquad C_{22} - 2m\sin^2\phi > 0. \tag{3.19}$$
Suppose that we are given (diagonal) $C$, $m$ and $\phi$ satisfying these conditions. Then (3.15) determines $N$, and (3.17) determines $P$. From (3.3), we must have
$$A = C^{-1}(mP - aN), \tag{3.20}$$
with
$$2a = A \cdot CA = C^{-1}(mP - aN) \cdot (mP - aN), \tag{3.21}$$
giving a quadratic equation for $a$, viz.
$$a^2\, N \cdot C^{-1}N - 2(mP \cdot C^{-1}N + 1)a + m^2\, P \cdot C^{-1}P = 0. \tag{3.22}$$
It is elementary to verify that our assumptions imply that the two roots are real. Condition (3.13) implies that
$$1 + mP \cdot C^{-1}N > a\, N \cdot C^{-1}N, \tag{3.23}$$
which selects one of the two roots, that given by
$$(N \cdot C^{-1}N)\, a = 1 + mP \cdot C^{-1}N - \sqrt{\Delta}, \tag{3.24}$$
$$\Delta = (mP \cdot C^{-1}N + 1)^2 - m^2 (C^{-1}P \cdot P)(C^{-1}N \cdot N). \tag{3.25}$$
With (3.20), this determines $A$. One can take $F = \sqrt{C}$, the usual positive definite square root of the matrix $C$, use (3.1) to determine $\bar F$ and verify that the corresponding $\bar C$ is diagonal, etc. So, this summarizes the kind of solutions of (3.1) which are permitted by the constraint (2.9).

4. THE REMAINING CONSTRAINT

In the A-15 superconductors and Indium-Thallium alloys, discontinuities of the kind considered above commonly occur in the tetragonal phase, associated with twinning. For such twins, observations indicate that $N$ takes on one of the six directions given by
$$\sqrt{2}\,N = e_i \pm e_j, \qquad i \ne j. \tag{4.1}$$
Clearly, these do not conform to (3.12), but fit (3.15), or equivalents obtained by replacing $e_1$ and $e_2$ by two other base vectors, or their negatives. Also, one has
$$\cos^2\phi = \sin^2\phi = \tfrac{1}{2}. \tag{4.2}$$
Briefly, observations indicate that, in the unstressed cubic phase
$$C = c\,1, \tag{4.3}$$
$c$ being a function of temperature, describing thermal expansion, with $c(T_0) = 1$ a consequence of our choice of reference configuration. In the unstressed tetragonal phase, $C$ can take on any one of three values, of the form
$$C^{(1)} = \mathrm{diag}(\nu^2, \mu^2, \mu^2), \qquad C^{(2)} = \mathrm{diag}(\mu^2, \nu^2, \mu^2), \qquad C^{(3)} = \mathrm{diag}(\mu^2, \mu^2, \nu^2), \tag{4.4}$$
with $\mu$ and $\nu \ne \mu$ positive functions of temperature. Clearly, these conform to (2.9). If experimentalists had doubts about this, they would not be likely to call these tetragonal phases. As is discussed in some detail by Ericksen [1], the
possibility of twinning is linked to the assumption that governing constitutive equations are invariant under the cubic point group. It would seem unreasonable not to require this of the constraints. Assuming (2.9) holds, we thus look for a constraint extrapolating (2.4), of the form
$$f(C_{11}, C_{22}, C_{33}) = 0, \tag{4.5}$$
with $f$ a smooth symmetric function of its arguments. Our assumption, that the reference configuration is attainable, gives the condition
$$f(1, 1, 1) = 0. \tag{4.6}$$
Any such function will satisfy
$$\frac{\partial f}{\partial C_{11}} = \frac{\partial f}{\partial C_{22}} = \frac{\partial f}{\partial C_{33}} = b \quad \text{when} \quad C = 1, \tag{4.7}$$
for some value of $b$. For infinitesimal deformations, we can linearize $f$, writing
$$C_{ii} = 1 + 2e_{ii} \quad \text{(no sum)} \tag{4.8}$$
to introduce infinitesimal strains, and use
$$f \approx 2b(e_{11} + e_{22} + e_{33}) = 0. \tag{4.9}$$
Assuming only that $b \ne 0$, we thus get (2.4) in this approximation, so the linear estimate is of little help, in selecting particular choices of $f$.

With $f$ a symmetric function, if $f = 0$ holds for $C = C^{(1)}$, it will hold for $C^{(2)}$ and $C^{(3)}$. Briefly, this is enough to ensure that the standard analysis of twinning in the unstressed tetragonal phase will go through, for any reasonable choice of $f$. Naturally, $f = 0$ should hold for a value of $C^{(1)}$ well approximating observed values, but (4.9) comes close to guaranteeing this, for the crystals considered.

Application of small loads to a twinned crystal may or may not cause the discontinuities to move through the material, perhaps to be created or destroyed.
What they seem not to want to do is rotate through the material. That is, they prefer the material planes listed in (4.1), although the loadings cause $C$ to change. In the A-15 superconductors, I have not seen evidence of other kinds of jump discontinuities in $F$. In Indium-Thallium alloys, cubic and tetragonal phases coexist, in a complicated way. Possibly, some differently oriented discontinuity is in this picture, but I have not seen clear evidence for it, so will ignore this.

So, let us find forms of $f$ which force $N$ to take on just the values given by (4.1). First, suppose that, for all values of $C$ of interest,
$$\frac{\partial f}{\partial C_{11}} = \frac{\partial f}{\partial C_{22}} = \frac{\partial f}{\partial C_{33}}. \tag{4.10}$$
A general integral of these equations is an arbitrary function of
$$y = C_{11} + C_{22} + C_{33} - 3, \tag{4.11}$$
so, with (4.6), we have
$$f = \hat f(y), \qquad \hat f(0) = 0. \tag{4.12}$$
Since we are only interested in the set where $f = 0$ we can, with very little loss of generality, take $f = y$, giving the constraint as (2.10). Bearing in mind (3.4), we see that this excludes possibilities like (3.11), as would various reasonable alternatives. The rest are like (3.18). If $C$ and $\bar C$ satisfy (2.10), we see from (3.18) that either $m = 0$, the trivial case, or (4.2) holds, implying that $N$ is included in the set given by (4.1). So, this form of $f$ has the desired property.

If $f$ is a symmetric function not covered by the above
argument, there will be some value of $C$ such that
$$f(C_{11}, C_{22}, C_{33}) = 0 \tag{4.13}$$
and
$$\frac{\partial f}{\partial C_{11}} \ne \frac{\partial f}{\partial C_{22}}. \tag{4.14}$$
Now fix such a $C$ and consider
$$f(\bar C_{11}, \bar C_{22}, \bar C_{33}) = 0, \tag{4.15}$$
with
$$\bar C_{11} = C_{11} + 2m\cos^2\phi, \qquad \bar C_{22} = C_{22} - 2m\sin^2\phi, \qquad \bar C_{33} = C_{33}, \tag{4.16}$$
as an equation to be solved for $m$, in terms of $\phi$. Since $f$ is a symmetric function, (4.15) will be satisfied at $(m, \phi) = (\hat m, \hat\phi)$, where
$$\hat m = C_{22} - C_{11} \ne 0, \qquad \cos^2\hat\phi = \sin^2\hat\phi = \tfrac{1}{2}, \tag{4.17}$$
since this makes $\bar C_{11} = C_{22}$ etc. Calculating the partial derivative of $f$ with respect to $m$ at this place, we get
$$\frac{\partial f}{\partial m} = \frac{\partial f}{\partial C_{22}} - \frac{\partial f}{\partial C_{11}} \ne 0. \tag{4.18}$$
Thus, by the implicit function theorem, we can solve (4.15)-(4.16) for $m$, locally. Thus, we can pick $\phi$ near $\hat\phi$, not satisfying (4.2), and find a corresponding $m$. With the analysis given in Section 3, it is then clear that we can construct jump discontinuities, with $N$ not included in the set given by (4.1). Essentially, then, (2.10) is the only extrapolation of (2.4) which forces all jump discontinuities to be on the twin planes. As is discussed by Ericksen [1], relevant equilibrium equations have a hyperbolic character and, with the constraints selected, and only with these, the characteristic surfaces also become the twin planes.
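The implicit-function argument can be illustrated with one concrete alternative constraint. The choice below, $f = C_{11}C_{22}C_{33} - 1$, is an assumption made only for this sketch and is not proposed in the text; it is symmetric, satisfies (4.6), and linearizes to (2.4). For it, (4.15)-(4.16) can be solved for $m$ in closed form, and a non-trivial solution exists for $\phi$ away from the value required by (4.2). The function names are hypothetical.

```python
import math

def f_det(C11, C22, C33):
    """Hypothetical alternative constraint function: det C - 1 for diagonal C."""
    return C11 * C22 * C33 - 1.0

def amplitude_m(C11, C22, phi):
    """Non-zero m solving f_det = 0 for the jumped state (4.16), at given phi.

    Expanding (C11 + 2 m c2)(C22 - 2 m s2) = C11 C22 and discarding the
    trivial root m = 0 leaves m = (C22 c2 - C11 s2) / (2 c2 s2).
    """
    c2, s2 = math.cos(phi) ** 2, math.sin(phi) ** 2
    return (C22 * c2 - C11 * s2) / (2.0 * c2 * s2)

def jumped_state(C11, C22, C33, m, phi):
    """Diagonal entries of the jumped state, as in (4.16)."""
    return (C11 + 2.0 * m * math.cos(phi) ** 2,
            C22 - 2.0 * m * math.sin(phi) ** 2,
            C33)
```

At $\phi = \pi/4$ the formula reduces to $m = C_{22} - C_{11}$, in agreement with (4.17). By contrast, under (2.10) the trace of the jump in (3.18) is $2m(\cos^2\phi - \sin^2\phi)$, which vanishes for $m \ne 0$ only when (4.2) holds.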
Clearly, this indicates that these become the possible bearers of various other kinds of weaker discontinuities. As he indicates, one can construct relatively simple Helmholtz free energy functions which, near $T = T_a$, have absolute or relative minima at the cubic phase, described by
$C = 1$, (4.19)
the form of (4.3) satisfying (2.10), and at the tetragonal phase, described by (4.4), with
$\nu = \sqrt{3 - 2\mu^2} > 0$ (4.20)
to satisfy (2.10). At least qualitatively, such forms are capable of describing the cubic-tetragonal transformation, twinning, effects of small shear stresses, etc. There is a rather interesting interpretation of the constraints, not mentioned by Ericksen [1]. In the cubic reference configuration, a diagonal of the unit cell is parallel to the unit vector
$E = (\pm e_1 \pm e_2 \pm e_3)/\sqrt{3}$, (4.21)
where the algebraic signs can be chosen arbitrarily. If the constraints are satisfied,
$\|FE\|^2 = E \cdot CE = (C_{11} + C_{22} + C_{33})/3 = 1$, (4.22)
so the material is inextensible in all of these directions. Conversely, if it is inextensible in all, the constraints (2.9) and (2.10) must hold.
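The inextensibility statement (4.22) can be checked numerically. The following is only a sketch: it assumes that the tetragonal phase (4.4) corresponds to a diagonal stretch with entries $\mu, \mu, \nu$ along $e_1, e_2, e_3$, with $\nu$ fixed by (4.20). The function names are illustrative and do not come from the paper.

```python
import itertools
import math

def tetragonal_stretch(mu):
    """Diagonal entries (mu, mu, nu) of an assumed tetragonal F, with nu from (4.20)."""
    nu = math.sqrt(3.0 - 2.0 * mu * mu)
    return [mu, mu, nu]

def diagonal_lengths(F_diag):
    """Return |F E| for the four body diagonals E = (e1 +/- e2 +/- e3)/sqrt(3) of (4.21)."""
    s = 1.0 / math.sqrt(3.0)
    lengths = []
    for s2, s3 in itertools.product((1.0, -1.0), repeat=2):
        E = (s, s2 * s, s3 * s)
        FE = [f * e for f, e in zip(F_diag, E)]
        lengths.append(math.sqrt(sum(c * c for c in FE)))
    return lengths

constrained = diagonal_lengths(tetragonal_stretch(0.95))
# A uniform dilatation violates the trace constraint, so the diagonals stretch.
unconstrained = diagonal_lengths([1.1, 1.1, 1.1])
print(constrained, unconstrained)
```

With the constraint satisfied, all four lengths come out equal to one to rounding error, whereas the uniform dilatation stretches every diagonal.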
After I mentioned this in a lecture, David Parker noted that, for the plane deformations considered by Ericksen [1], the material behaves as if it were inextensible in two in-plane directions, making these deformations analogous to the plane deformations encountered in textiles. In three dimensions, one has something more like a block of foam rubber, with four appropriately oriented systems of inextensible fibres running through it.
Possibly, this will help some to picture possible deformations.

5. PHASE INTERFACES

An example of twinning can be obtained as follows. Set
$F = \sqrt{C(3)} = \mu(e_1 \otimes e_1 + e_2 \otimes e_2) + \nu\, e_3 \otimes e_3$, (5.1)
with $\mu$ and $\nu$ satisfying (4.20). Introduce the orthonormal basis
$f_1 = e_1$, $f_2 = (e_2 + e_3)/\sqrt{2}$, $f_3 = (e_2 - e_3)/\sqrt{2}$, (5.2)
$f_3$ being one of the directions listed in (4.1). Set
$A = a f_2$, $N = f_3$. (5.3)
With a little calculation, one finds that
$\bar F = F(1 + A \otimes N)$ (5.4)
satisfies
$\bar F^T \bar F = C(2)$ (5.5)
provided
$a = 6(1 - \mu^2)/(3 - \mu^2)$. (5.6)
Often, one sees sets of these layers separated by parallel planes of discontinuity, with the deformation gradient alternating between the two values.
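The "little calculation" behind (5.4)-(5.6) can be sketched numerically. The example below assumes that $C(2)$ denotes the tetragonal variant with its distinguished axis along $e_2$, i.e. the diagonal tensor with entries $\mu^2, \nu^2, \mu^2$. That reading, and the helper names, are assumptions made for illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def twin_gradient(mu):
    """Return (F_bar, nu) with F_bar = F (1 + A (x) N) built from (4.20), (5.1)-(5.6)."""
    nu = math.sqrt(3.0 - 2.0 * mu * mu)             # (4.20)
    a = 6.0 * (1.0 - mu * mu) / (3.0 - mu * mu)     # (5.6)
    F = [[mu, 0.0, 0.0], [0.0, mu, 0.0], [0.0, 0.0, nu]]   # (5.1)
    r = 1.0 / math.sqrt(2.0)
    f2 = [0.0, r, r]                                # (5.2)
    f3 = [0.0, r, -r]
    # Shear 1 + A (x) N with A = a f2 and N = f3, as in (5.3).
    shear = [[(1.0 if i == j else 0.0) + a * f2[i] * f3[j] for j in range(3)]
             for i in range(3)]
    return matmul(F, shear), nu

mu = 0.98
F_bar, nu = twin_gradient(mu)
C_bar = matmul(transpose(F_bar), F_bar)
print(C_bar)
```

For this value of $\mu$ the product $\bar F^T \bar F$ comes out diagonal, with $\nu^2$ moved from the $e_3$ slot to the $e_2$ slot, which is what one expects of a twin relating two tetragonal variants.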
The thickness of these layers can vary in a random or regular manner. In Indium-Thallium alloys, observations of coexistence of tetragonal and cubic phases, by Burkart and Read [2], involve such a system of twins in the tetragonal phase. Those layers with value $F$ seem to be of about the same thickness, as do those with value $\bar F$. However, the two thicknesses are not the same, one being about twice the other. This configuration meets the cubic phase at an interface which is roughly planar, close enough to a plane that one can estimate its crystallographic orientation. As is discussed by Ericksen [1], one cannot fit the above $F$ to a value consistent with another fitting (4.19). This presumed that the constraints are satisfied, but this is not vital. So, indications are that the interface or transition region is more complicated, probably involving inhomogeneous deformation near the interface. Burkart and Read [2] give a "schematic representation" of the interface, perhaps to be regarded as a guess as to how atoms could adjust their positions and avoid great distortions of the latter. Roughly similar configurations are observed in various crystals which undergo martensitic transformations, and we have not yet found any theory to describe any of them, in detail. Ball and James [3] noticed that one can construct minimizing sequences for the total energy which resemble such configurations, involving regularly spaced twins, with the thickness of the layers approaching zero. Such sequences do not converge to energy minimizers. It is not so clear why configurations resembling some of these should be stable enough to be observed.
However, by analyzing the limit, they derive formulae, which seem to describe coarser features of such interfaces rather well. With our constraints, there is doubt as to whether the kinds of deformations needed in the derivation can be constructed. However, it seems interesting to see what the end result predicts. Involved is a scalar $\lambda \in (0, 1)$, interpretable as the fractional thickness of the layers in which $\bar F$ occurs, $F$ occurring in the proportion $1 - \lambda$. Then, in the obvious sense, the average deformation gradient in the tetragonal phase is
$\langle F \rangle = \lambda \bar F + (1 - \lambda) F = F(1 + \lambda A \otimes N)$. (5.7)
In the cubic phase, (4.19) should hold, implying that the deformation gradient should be $R$, some rotation. According to the Ball-James analysis, we should have
$\langle F \rangle = R(1 + B \otimes D)$, (5.8)
where $D$ is the normal to the nominal plane phase interface, and $B$ is some vector. If $\langle F \rangle$ were the actual deformation gradient in the tetragonal phase, this would, of course, be the usual kinematical condition of compatibility. Burkart and Read [2] deduce a kind of approximation to this, using a notion of small average strain, neglecting some small rotations, etc. Here, $F$, $A$ and $N$ can be regarded as given, but $\lambda$, $R$, $B$ and $D$ are not, so one uses (5.7) and (5.8), i.e.
$R(1 + B \otimes D) = F(1 + \lambda A \otimes N)$, (5.9)
to try to determine them, it being understood that $F$, $A$ and $N$ are given by (5.1) and (5.3). We have found no way to avoid some tedious calculations in doing this, and will not record the grim details. It is helpful to begin by noting that, if $E$ is a unit vector perpendicular to $D$ and $N$, (5.9) implies that
$FE = RE \Rightarrow \|FE\| = 1$. (5.10)
It is easy to check that, as the notation suggests, $E$ must be one of the inextensible directions, given by (4.21), one in the subset perpendicular to $N$. This gives two possible directions,
$E = (e_1 + e_2 + e_3)/\sqrt{3}$, (5.11)
or
$E = (e_1 - e_2 - e_3)/\sqrt{3}$. (5.12)
One can select either and proceed, with the knowledge that $D$ must be perpendicular to $E$.
By taking determinants, one gets
$1 + B \cdot D = \det F = \mu^2 \nu$. (5.13)
Then, one requires other conditions guaranteeing that $R$, obtained by solving (5.9), is a rotation, the tedious part. With (5.12) as the choice, one gets two solutions. One is of the form:
$\lambda = (3 + \sqrt{4\mu^2 - 3})/6$, (5.14)
$\sqrt{2}\,\mu D = f_1 + (f_2 + \sqrt{4\mu^2 - 3}\, f_3)/\sqrt{2}$, (5.15)
$\sqrt{2}\, B = \mu(1 - \nu)\,[\nu f_1 - (f_2 + \sqrt{4\mu^2 - 3}\, f_3)/\sqrt{2}]$, (5.16)
$R$ being calculated from (5.9). Clearly, these exist only if
$4\mu^2 \geq 3$, (5.17)
a condition not guaranteed by (4.20). To get the second solution, simply replace $\sqrt{4\mu^2 - 3}$ by $-\sqrt{4\mu^2 - 3}$ in these prescriptions. As long as (5.17) holds, the two values of $\lambda$ lie in (0,1), as they should, and any value in the interval can be obtained, by suitable choice of $\mu$. When equality holds in (5.17), the two solutions coalesce, with $\lambda = 1/2$. For a fixed value of $\mu$, the two values of $\lambda$ sum to one. One gets two similar solutions using (5.11). The two values of $E$ are related by an element of the cubic point group, reversing $e_2$ and $e_3$. Clearly, this leaves $F$ and $A \otimes N$ invariant; one can get this by transforming the entries in (5.9) by the indicated group element. Actually, one of the previous solutions can be converted to the other, by a group element interchanging $e_2$ and $e_3$, but this is a bit more subtle. Altogether, this gives four solutions, the number we should have, according to the work of Ball and James [3], although the breakdown of solutions as $\lambda \to 1/2$ was not something we anticipated from the general considerations. In the observations of Burkart and Read [2], $\mu$ is close to one. From Eqn. (5.14), we have
$\lim_{\mu \to 1} \lambda = 2/3$, (5.18)
in good agreement with their observations. Similarly,
$\lim_{\mu \to 1} D = [f_1 + (f_2 + f_3)/\sqrt{2}]/\sqrt{2} = (e_1 + e_2)/\sqrt{2}$, (5.19)
also in good agreement with their estimate of the crystallographic orientation of the normal to the nominal interface.
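Since the "grim details" are not recorded, a numerical check of the end result may be useful. The sketch below takes $\lambda$, $D$ and $B$ in the forms read from (5.14)-(5.16), solves (5.9) for $R$, and measures how far $R$ is from being orthogonal. The reading of (5.16) in particular is a reconstruction and should be treated as an assumption, as should all function names.

```python
import math

def habit_plane_solution(mu, sign=1.0):
    """Return (lam, D, B, R) for one root of (5.14)-(5.16); sign=-1 gives the other root."""
    nu = math.sqrt(3.0 - 2.0 * mu * mu)              # (4.20)
    a = 6.0 * (1.0 - mu * mu) / (3.0 - mu * mu)      # (5.6)
    w = sign * math.sqrt(4.0 * mu * mu - 3.0)        # requires (5.17)
    lam = (3.0 + w) / 6.0                            # (5.14)
    r = 1.0 / math.sqrt(2.0)
    f1, f2, f3 = [1.0, 0.0, 0.0], [0.0, r, r], [0.0, r, -r]
    g = [(f2[i] + w * f3[i]) * r for i in range(3)]
    D = [(f1[i] + g[i]) / (math.sqrt(2.0) * mu) for i in range(3)]     # (5.15)
    B = [mu * (1.0 - nu) * r * (nu * f1[i] - g[i]) for i in range(3)]  # (5.16), as read
    Fd = [mu, mu, nu]
    # Right side of (5.9): F (1 + lam A (x) N), with A = a f2 and N = f3.
    G = [[Fd[i] * ((1.0 if i == j else 0.0) + lam * a * f2[i] * f3[j])
          for j in range(3)] for i in range(3)]
    # Inverse of 1 + B (x) D via the Sherman-Morrison formula.
    bd = sum(B[i] * D[i] for i in range(3))
    inv = [[(1.0 if i == j else 0.0) - B[i] * D[j] / (1.0 + bd)
            for j in range(3)] for i in range(3)]
    R = [[sum(G[i][k] * inv[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    return lam, D, B, R

def orthogonality_defect(R):
    """Largest entry of |R^T R - 1|; zero when R is orthogonal."""
    return max(abs(sum(R[k][i] * R[k][j] for k in range(3)) - (1.0 if i == j else 0.0))
               for i in range(3) for j in range(3))

lam_plus, D_plus, B_plus, R_plus = habit_plane_solution(0.95, 1.0)
lam_minus, D_minus, B_minus, R_minus = habit_plane_solution(0.95, -1.0)
lam_near_one, D_near_one, _, _ = habit_plane_solution(0.999999, 1.0)
print(lam_plus, lam_minus, orthogonality_defect(R_plus), orthogonality_defect(R_minus))
```

Both roots give an $R$ that is orthogonal to rounding error, the two values of $\lambda$ sum to one, and for $\mu$ close to one the values approach the limits (5.18) and (5.19).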
Interestingly, in this limit, $D$ becomes normal to a twin plane, different from that associated with the twin planes in the tetragonal phase, a possible bearer of discontinuities. If the result were precise, one might have such a plane of discontinuity as the real interface, separating the two phases, with some kind of inhomogeneous deformation near the interface. Conceivably, re-examination of the derivation of (5.9), with the constrained deformation, would modify the result to be in agreement with this. In any event, this is what seems to be suggested, by the "schematic representation" of Burkart and Read [2]. Another possibility might have the twin planes with the normals indicated by (5.19) as surfaces of discontinuity, arranged in stairstep fashion.
In places, the twin planes involved in the tetragonal phase might extend, to provide risers, separating one set of twins from the cubic phase. In this picture, the real interface becomes crinkled, but need not stray far from the nominal plane. It is not so obvious that one must have such walls separating the phases; $F$ could vary smoothly, at least in places, to effect the transition. Also, some different kind of crinkling might occur, as suggested below.
With electron microscopy, one might decide what kind of picture of the interface fits best, but I have not seen such observations. With our highly constrained deformations, it is not very clear whether one can construct deformations fitting any of these pictures. If one can, it is likely that one can find Lagrange multipliers enabling one to satisfy equations of equilibrium in a weak sense, from Ericksen's [1] discussion of such matters. One also needs some analysis of their stability, which is likely to be tricky. So, we are far from having satisfactory analysis of such phase mixtures. It seems worthwhile to note that the twin planes in the tetragonal phase intersect those with normal indicated by (5.19) in a line parallel to $E$, given by (5.12), also the direction of the line of intersection of either set of twin planes with the plane with normal $D$.
Another set of twin planes containing such lines exists, those with normal parallel to $e_1 + e_3$. With three such systems of planes, one can form an interesting variety of crinkled planes, triangular cylinders, etc., geometrically. Of course, arranging patterns of discontinuities also involves considerations of compatibility conditions, complicating matters.

ACKNOWLEDGEMENT

Much of the research covered in Sections 3-5 was done while I was visiting Heriot-Watt University, where I greatly appreciated the warm hospitality, lively discussions, and having some time to think. Especial thanks go to John Ball, for much help in arranging this. Also, discussions with Richard James helped me to make a little progress in understanding the phase mixtures in Indium-Thallium alloys.

REFERENCES

1. J.L. Ericksen, "Constitutive theory for some constrained elastic crystals", to appear in Int. J. Solids and Structures.
2. M.W. Burkart and T.A. Read, "Diffusionless phase change in Indium-Thallium systems", Trans. AIME J. Metals 197 (1953), pp. 1516-1524.
3. J.M. Ball and R.D. James, "Fine phase mixtures as minimizers of energy", to appear in Arch. Ratl. Mech. Anal.
4. B.W. Batterman and C.S. Barrett, "Crystal structure of superconducting V3Si", Phys. Rev. Lett. 13 (1964), pp. 390-392.
5. J.L. Ericksen, "Some phase transitions in crystals", Arch. Ratl. Mech. Anal. 73 (1980), pp. 99-124.

Arch. Rational Mech. Anal. 139 (1997) 181-200. © Springer-Verlag 1997
Equilibrium Theory for X-ray Observations of Crystals J. L. ERICKSEN Communicated by M. GURTIN
1. Introduction

In molecular theories of crystal elasticity, one needs some way of relating changes in atomic arrangements to macroscopic deformation. The oldest idea, introduced by CAUCHY, was that the gross motion agrees with atomic motion, at the mass points representing atoms. After appreciating reasons why this is not really sound, physically, BORN (for a brief and critical presentation of this theory, see STAKGOLD [1]) modified this assumption in what seems to be a very reasonable way. His assumption, called the Born Rule, hereafter abbreviated to B.R., has become the standard assumption in this area. It is widely appreciated that it is not consistent with observations of phenomena involving slip, commonly associated with dislocation motion and what we call plasticity. While linear elasticity theory has been used extensively in studies of isolated dislocations, we all know that it is not a good theory to use for predicting what deformations will occur in a simple tension experiment, for example. Elasticity theory has been used successfully to deal with other kinds of deformations which can be large, associated with phase transitions involving changes of symmetry, including sophisticated descriptions (for a recent exposition of theory of this kind, see BHATTACHARYA et al. [2]) of rather complicated arrangements of twins occurring with these. From this, it seems reasonable to include such deformations among those commonly regarded as elastic and, possibly, to use similar analyses for other kinds of twins which are rather different, physically. Those alluded to above are transformation twins, occurring naturally in unstressed samples as the temperature changes, associated with phase transitions. Then there are the growth twins, occurring naturally as a crystal solidifies from a melt or solution; some include twins occurring during annealing. Here again, elasticity theory has been useful in designing mechanical treatments to eliminate some of them. Early work of this kind, done during World War II, relating to Dauphine twins in quartz,
was covered by the WOOSTERS [3] and THOMAS & WOOSTER [4]. So, at least some problems involving growth twins are within the province of elasticity theory. In the third category are the deformation twins, produced by applying loads involving shear, remaining when loads are removed. There is a good and recent survey of work in this area, by CHRISTIAN & MAHAJAN [5]. It is clear from this that workers have been struggling hard to find ideas which can be used to understand the phenomena observed, with limited success, including some use of elasticity theory, among other things. Studies have turned up so many complications that it is hard to be optimistic about finding any theory with a broad range of applicability. What are called mechanical twins can be either transformation or deformation twins. From a different perspective, ZANZOTTO [6] pointed out a basic difficulty which, for me at least, helps clear the air a bit. After looking at many observations, he concluded that B.R. is reliable for what might reasonably be regarded as elastic deformations for two kinds of crystals, in view of the aforementioned experience. One covers the monatomic crystals fitting the description of Bravais lattices. The other consists of the so-called shape-memory alloys. For other kinds of crystals, it sometimes applies, often does not, and there seems to be no obvious pattern in this. At least for failures relating to twinning, he gives an ingenious argument, convincing to me, that when B.R. fails to apply, so does elasticity theory. No viable alternative to B.R. suggests itself. In pondering such difficulties, I got to thinking about what kind of field theory would result, if one used the ideas of molecular theories of elasticity, but did not use B.R. Here, I present some of my thoughts on this, which lead to an equilibrium theory of lattice vector fields, etc., things more related to X-ray observations and to some made using electron microscopes. 
It is in a form such that, if one trusts B.R. for a limited set of deformations, one can use it to describe these. ZANZOTTO notes that, for some but not all crystals, B.R. applies pretty well to thermal expansion, although it might then fail for twinning. Also, I will set up an analog of the elementary theory of twinning used in elasticity theory, albeit by rather different reasoning.
2. On Molecular Theory

Except for some earlier writers, those who work on molecular theories of crystal elasticity aim at calculating strain energy functions, using the energy arguments of thermoelasticity theory to deduce formulae for stress tensors. Unfortunately, this requires use of the B.R., to be avoided here. Various kinds of molecular theory are used, producing different strain energy functions. For continuum theory, I have looked for a format general enough to cover what is common to such theories. Here, I present some of the reasoning I used for this. Since various failings of B.R. are associated with polyatomic crystals, we will consider these. Consider a crystal filling all of space, with atoms idealized as mass points, to be definite. Commonly, it is thought of as undergoing temperature-dependent vibratory motion about positions of equilibrium. With some abuse of language, we interpret the latter positions as the positions of the atoms. The simplest crystal configurations are the Bravais simple lattices. For these, all atoms are identical, and the position $x_N$ of the $N$th atom can be represented in the form
$x_N = n^a_N e_a + \text{const.}$, $N = 1, 2, \ldots$, (1)
where $e_a$ ($a = 1, 2, 3$) are lattice vectors, required to be linearly independent. Also, the $n^a_N$ are integers, with every choice of integers giving a possible atomic position. Here and in the following, the summation convention is used. Actually, these simplest configurations are not of great interest here, since B.R. applies pretty well to them. Other crystal configurations can be described as a finite set of interpenetrating Bravais lattices with the same lattice vectors. Basically, the lattice vectors generate a translation group, describing the periodicity required by the classical definition of a crystal. Number the lattices by Greek indices. If there are $M$ of these, they will have positions given by
$x_{N\alpha} = n^a_N e_a + p_\alpha$, $\alpha = 1, \ldots, M$, $N = 1, 2, \ldots$, (2)
where the $p_\alpha$ are some vectors. These positions are to be distinct, which means that the differences $p_\alpha - p_\beta$, $\alpha \neq \beta$, cannot have integer components relative to the basis $e_a$. Atoms on different lattices may or may not be identical. The relative translations like $p_\alpha - p_1$ are called shifts, following PITTERI [7], who discusses some important features of such multi-lattices. Concerning molecular theories of elasticity, it is generally agreed that these deal only with rather short-range interactions. Since the aim is to get energy functions of the kind used in elasticity, one has to argue that, to a good approximation, they have the usual property of being additive set functions. One does not want to try to keep track of those little vibrations. For this, one either uses "zero-temperature" models, assuming they vanish or, if not, one uses statistical mechanics, typically, which introduces the temperature $\theta$ as a control variable. One should be aware that there is some ambiguity in the multi-lattice description. For example, a body-centered cubic monatomic crystal fits the description of a Bravais lattice. However, it is more often described as a 2-lattice, with lattice vectors orthogonal and of equal length, pictured as the edges of a cube, one atom in the second lattice sitting in the center of a cube. This might not be the best way to picture the maximal translation group, but it is good for picturing the point group, for example. Of course, for elasticity theory, one wants to cover some variety of configurations. Generally, if a configuration can be described as a multi-lattice with some value of $M$, it can also be described as one with a larger value of $M$, although not with the same choice of lattice vectors. Strictly speaking, lattice vectors should be associated with the maximal translation group for the configuration, which means a minimum value of $M$, so I am abusing language a bit.
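The ambiguity just described can be made concrete for the body-centered cubic example. The sketch below generates positions from (2) in two ways, once as a Bravais lattice ($M = 1$) with one common choice of primitive vectors, and once as a 2-lattice with cube-edge lattice vectors and a shift to the cube center. Coordinates are measured in units of half the cube edge so that everything stays in exact integers; the particular primitive vectors and all names are choices made here for illustration.

```python
import itertools

def multilattice_points(e, shifts, n_range, bound):
    """Positions x = n^a e_a + p_alpha of (2), kept when every |coordinate| <= bound."""
    points = set()
    for n in itertools.product(range(-n_range, n_range + 1), repeat=3):
        for p in shifts:
            x = tuple(sum(n[a] * e[a][i] for a in range(3)) + p[i] for i in range(3))
            if all(abs(c) <= bound for c in x):
                points.add(x)
    return points

# Description 1: Bravais lattice (M = 1) with primitive bcc vectors.
primitive = [(-1, 1, 1), (1, -1, 1), (1, 1, -1)]
as_bravais = multilattice_points(primitive, [(0, 0, 0)], n_range=6, bound=4)

# Description 2: 2-lattice (M = 2) with cube edges and a shift to the cube center.
cube_edges = [(2, 0, 0), (0, 2, 0), (0, 0, 2)]
as_two_lattice = multilattice_points(cube_edges, [(0, 0, 0), (1, 1, 1)], n_range=6, bound=4)

print(len(as_bravais), as_bravais == as_two_lattice)
```

Both descriptions produce the same set of positions inside the sampled box, although the lattice vectors and the value of $M$ differ.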
So, using individual judgement, one somehow fixes a value of $M$, for a calculation of a specific strain-energy function. Of course, one must make some assumptions about how the atoms interact with each other, and numerous possibilities for this occur in the literature. Note also that $e_a$ and $p_\alpha$ determine the positions of the atoms. It is then pretty clear that for, say, $\varphi$, the Helmholtz free-energy function per unit mass, one gets some constitutive equation of the form
$\varphi = \varphi(e_a, p_\alpha, \theta)$, (3)
maybe this only for $\theta = 0$. With any decent molecular theory, it should be invariant under rotations and translations. For the former, it is described by
$e_a \to R e_a$, $p_\alpha \to R p_\alpha$, $R^{-1} = R^T$, $\det R = 1$. (4)
If we put this in, consider the infinitesimal approximation $R = 1 + \Omega$, $\Omega^T = -\Omega$, and differentiate $\varphi$ with respect to $\Omega$ at $\Omega = 0$, then we get a differential identity, asserting that a second-order tensor $T$ is symmetric, where
$T = \frac{\partial\varphi}{\partial e_a} \otimes e_a + \frac{\partial\varphi}{\partial p_\alpha} \otimes p_\alpha = T^T$. (5)
From general experience with continuum mechanics, this should be associated with a balance of couples, like the condition that the Cauchy stress tensor be symmetric. For the translations, we have
$e_a \to e_a$, $p_\alpha \to p_\alpha + c$, (6)
where $c$ is an arbitrary constant vector. Similarly, this gives the differential identity
$\sum_{\alpha=1}^{M} \frac{\partial\varphi}{\partial p_\alpha} = 0$, (7)
the experience being that this should be associated with a balance of forces. If one is familiar with B.R., one can apply it to elasticity theory to get a formula somewhat similar to (5) for the Cauchy stress, omitting the part associated with $p_\alpha$. If one knows the old molecular theory of Born, one might spot some equilibrium equations which eliminate those terms associated with $p_\alpha$, to be introduced later. I shall sketch how relevant calculations are made for a simple kind of molecular theory, a zero temperature model, with atoms subject to central forces. Essentially, these are old calculations presented by LOVE [8, note B]. Being of an older school, he calculated the Cauchy stress tensor for Bravais simple lattices and 2-lattices directly, then used elasticity theory to find the strain energy, the reverse of what is now the common practice. The generalization to $M$-lattices is almost obvious, certainly easily done. I will just rearrange terms to fit the format suggested above, indicating how both are calculated directly from molecular theory of this kind. Here, the atoms are considered to be at rest. In the $\alpha$th lattice, all atoms are identical, so have the same mass $\mu_\alpha$. Pick one atom from each lattice and call this set a pseudo-molecule. It has the mass
$\mu = \sum_{\alpha=1}^{M} \mu_\alpha$. (8)
The central forces are described by a set of potential functions
$f_{\alpha\beta} = f_{\beta\alpha}$, (9)
each depending only on the distance between one pair of atoms, consisting of one in the $\alpha$th lattice, the other in the $\beta$th. These should approach zero quite rapidly as the distances tend to infinity, so that infinite sums to be encountered converge rather rapidly. Generally, the energy of the infinite crystal is infinite. However, we can get a reasonable estimate of energy per unit mass by giving each atom a fair share of the energy. If $x$ and $y$ denote positions of two atoms, the pair potential for them will have a certain value, to be shared equally by them. For example, if we pick an atom in the first lattice, its share of the total energy is
$\sum_{\alpha=1}^{M} \sum_n f_{1\alpha}(|n^a e_a + p_\alpha - p_1|)/2$, (10)
where $\sum_n$ denotes the sum over all integers $n^a$, excepting $n^a = 0$ in $f_{11}$. It has the analogous meaning in calculations to come. Now add up the contributions of this kind for a pseudo-molecule, and divide by its mass, to get the energy per unit mass
$\varphi = \sum_{\alpha,\beta=1}^{M} \sum_n f_{\alpha\beta}(|n^a e_a + p_\alpha - p_\beta|)/2\mu$, (11)
a special form of (3) for this theory, considered to apply at $\theta = 0$, where the difference between internal energy and Helmholtz free energy is no longer significant. With the rather explicit formula, one can verify aforementioned invariance properties of $\varphi$ and some additional ones not yet mentioned, which also hold for other versions of molecular theory. For the positions described by (2), different choices of $e_a$ and $p_\alpha$ give exactly the same set of positions. This gives transformations of the form
$e_a \to m_a^b e_b$, $p_\alpha \to p_\alpha + n_\alpha^a e_a$, (12)
where the $m$'s and $n$'s are integers such that $\det \|m_a^b\| = \pm 1$. Not all of these can be viewed as leaving
$p_\alpha \to -p_\alpha$. (13)
With $m_a^b = -\delta_a^b$ in (12), we can, independently, reverse $e_a$, so the pair could map the domain to itself. However, central inversions sometimes relate different constitutive equations describing enantiomorphs, so I am inclined to interpret this as indicated by
$\varphi(e_a, p_\alpha, \theta) = \varphi'(-e_a, -p_\alpha, \theta)$, (14)
where $\varphi$ and $\varphi'$ might be the same or different functions, then describing enantiomorphs. In the latter case, one could regard the two as branches of a double-valued function which is invariant under the indicated transformation. If atoms in some lattices are identical, $\varphi$ is a symmetric function of the corresponding vectors $p_\alpha$. I shall continue to include the improper transformations, because they can be involved in growth twins, for example, but do keep in mind the above discussion. To calculate the stress tensor, we use CAUCHY'S interpretation of it. Pick a plane with unit normal $\nu$ and on it a region $R$ of area $S$, bounded by a curve $c$. In (11), for example, the infinite sum is replaced by a finite sum, in practice, which means that interactions between atoms only count when the distance between them is less than a certain value. Roughly, a continuum point is a ball of about this size. The idea is that the region $R$ has typical linear dimensions large compared to this, but is small enough to be regarded as infinitesimal, macroscopically. Ideas of this kind are used in justifying the notion that the Helmholtz free energy is (approximately) an additive set function, for macroscopic regions. The idea is to consider pairs of atoms whose lines of action cross $R$, to calculate the resultant force exerted by those on the side into which $\nu$ points, on those on the opposite side. Consider one such pair, say at some positions $x$ in the first lattice, $y$ in the second, $y$ being in the side into which $\nu$ points, so
$(y - x) \cdot \nu > 0$. (15)
For some integers $n^a$, we have
$y - x = n^a e_a + p_2 - p_1$,
and the force exerted on the one at $x$ by that at $y$ is
$f'_{12}(|y - x|)\,\frac{y - x}{|y - x|}$
or
$f'_{12}(|n^a e_a + p_2 - p_1|)\,\frac{n^a e_a + p_2 - p_1}{|n^a e_a + p_2 - p_1|}$. (16)
Various pairs of this kind have the same relative position vector and lines of action intersecting $R$, so multiply (16) by this number, $n_0$. To estimate this, LOVE constructs a cylindrical region capped by $R$, this translated by the relative position vector, bounded on the side by the ruled surface with rulings parallel to this. Then, $n_0$ should be the number of atoms in this region, a large number. Also, it should contain about the same number of atoms in each of the other lattices. The total mass for the region can then be estimated from microscopic and macroscopic reasoning, giving the equation
$\rho S (y - x) \cdot \nu = n_0 \mu$, (17)
where $\mu$ is the mass of a pseudo-molecule. For this set of pairs, we thus get an estimate of the resultant force as (16) multiplied by
$\rho S (n^a e_a + p_2 - p_1) \cdot \nu / \mu$. (18)
This should be summed over almost all the values of $n^a$ consistent with (15). There is the point that, for some rather large relative position vectors almost perpendicular to $\nu$, their lines of action cross the plane outside $R$. Ignore this, for the moment. Now consider the analogous calculation for pairs in these two lattices, with that in the first lattice being in the region into which $\nu$ points. If we look at the result, we see that this is equivalent to doing the first sum, for integers $n^a$ satisfying the reverse of the inequality (15). With (17), we can add in directions such that $(y - x) \cdot \nu = 0$. So, add these two contributions, divide by $S$ to get the force per unit area, and let $R$ increase, to fill out the whole plane, to catch those directions temporarily ignored. For pairs in the same lattice, one uses a slightly different argument. Consult LOVE [8, note B] if this causes trouble. Adding up all the contributions of this kind gives, as a prescription for the Cauchy stress tensor $t$,
$t = \frac{\rho}{2\mu} \sum_{\alpha,\beta=1}^{M} \sum_n f'_{\alpha\beta}(|K_{\alpha\beta}|)\,\frac{K_{\alpha\beta} \otimes K_{\alpha\beta}}{|K_{\alpha\beta}|}$, (19)
where
$K_{\alpha\beta} = n^a e_a + p_\alpha - p_\beta$. (20)
With this and (11), it is straightforward to verify that, with $T$ given by (5),
$t = \rho T = \rho\left(\frac{\partial\varphi}{\partial e_a} \otimes e_a + \frac{\partial\varphi}{\partial p_\alpha} \otimes p_\alpha\right)$. (21)
I think that this applies quite generally, perhaps with some idea of stress which is not the same as CAUCHY'S, but do not know how to make a convincing argument for this. Among the troublesome cases is an old one, discussed in some detail by POINCARE [9], in connection with elasticity theory, covered more briefly by STAKGOLD [1]. Briefly, POINCARE considered what are sometimes called pseudo-potentials, potentials depending in a rather general way on the positions of all the atoms. The negative derivative of this with respect to its position gives the force on an atom, but there is no sensible way of calculating a part of this, as exerted by subsets of other atoms. Obviously, CAUCHY'S idea of stress does not really make sense in this context. There is a counterpart of this in nonlinear field theory. Matter can produce fields, not with parts attributable to a subset, although a field can exert a force on a subset. However, in those molecular theories, with some approximations and use of energy arguments, one gets a stress tensor fitting the usual prescription used in elasticity theory, where CAUCHY'S idea is commonly used. This is not the place for a discussion of such subtleties, although the matter deserves serious thought.
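The relation between (5), (11), (19) and (21) can be checked numerically for a small example. The sketch below assumes a single illustrative pair potential $f(r) = e^{-r^2}$ for all pairs, truncates the lattice sums at $|n^a| \le 2$, and compares $T$ from central differences of $\varphi$ with the pair sum in (19) divided by $\rho$. The potential, the truncation, the sample lattice and all names are assumptions made for illustration, not taken from the paper.

```python
import itertools
import math

def f(r):
    """Illustrative pair potential, the same for every pair of lattices."""
    return math.exp(-r * r)

def df(r):
    return -2.0 * r * math.exp(-r * r)

def relative_vectors(e, p, n_max=2):
    """Yield K = n^a e_a + p_alpha - p_beta of (20), skipping n = 0 when alpha == beta."""
    M = len(p)
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        for al in range(M):
            for be in range(M):
                if al == be and n == (0, 0, 0):
                    continue
                yield [sum(n[a] * e[a][i] for a in range(3)) + p[al][i] - p[be][i]
                       for i in range(3)]

def phi(e, p, mu):
    """Energy per unit mass (11), with truncated lattice sums."""
    total = sum(f(math.sqrt(sum(k * k for k in K))) for K in relative_vectors(e, p))
    return total / (2.0 * mu)

def perturbed(vectors, idx, i, delta):
    out = [list(v) for v in vectors]
    out[idx][i] += delta
    return out

def gradient(fun, vectors, h=1e-5):
    """Central-difference derivative of fun with respect to every vector component."""
    return [[(fun(perturbed(vectors, k, i, h)) - fun(perturbed(vectors, k, i, -h))) / (2.0 * h)
             for i in range(3)] for k in range(len(vectors))]

def T_from_energy(e, p, mu):
    """T of (5): sum of (d phi/d e_a) (x) e_a and (d phi/d p_alpha) (x) p_alpha."""
    dphi_de = gradient(lambda ee: phi(ee, p, mu), e)
    dphi_dp = gradient(lambda pp: phi(e, pp, mu), p)
    T = [[0.0] * 3 for _ in range(3)]
    for grads, vecs in ((dphi_de, e), (dphi_dp, p)):
        for g, v in zip(grads, vecs):
            for i in range(3):
                for j in range(3):
                    T[i][j] += g[i] * v[j]
    return T

def T_from_pair_sum(e, p, mu):
    """Pair sum of (19) divided by rho: (1/2mu) sum f'(|K|) K (x) K / |K|."""
    S = [[0.0] * 3 for _ in range(3)]
    for K in relative_vectors(e, p):
        r = math.sqrt(sum(k * k for k in K))
        c = df(r) / (r * 2.0 * mu)
        for i in range(3):
            for j in range(3):
                S[i][j] += c * K[i] * K[j]
    return S

e = [[1.0, 0.1, 0.0], [0.0, 1.1, 0.2], [0.1, 0.0, 0.9]]
p = [[0.0, 0.0, 0.0], [0.4, 0.5, 0.45]]
T_num = T_from_energy(e, p, mu=2.0)
T_sum = T_from_pair_sum(e, p, mu=2.0)
max_diff = max(abs(T_num[i][j] - T_sum[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

The two tensors agree to within finite-difference error, and the symmetry asserted in (5) is visible directly in the pair-sum form. Multiplying either by $\rho$ gives the stress of (21).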
We are not quite done, there being a matter of consistency. Certainly, any atoms subject to non-zero forces would not be at rest. For a Bravais simple lattice subject to central forces, it is easy to see that the force acting on any one vanishes but, as is familiar to workers in this area, this is not true for the more complicated configurations, in general. By calculations much like those we have done, one can calculate the resultant force $f_\alpha$ on any atom in the $\alpha$th lattice, and show that it is given by
$f_\alpha = -\mu \frac{\partial\varphi}{\partial p_\alpha}$. (22)
From (7), what is automatic is that
$\sum_{\alpha=1}^{M} f_\alpha = 0$, (23)
i.e., that the resultant force on a pseudo-molecule vanishes, verifying the notion that (7) represents a balance of forces. However, we should also introduce, as equilibrium equations for $p_\alpha$, the condition that each atom be subject to no force,
$\frac{\partial\varphi}{\partial p_\alpha} = 0$. (24)
Then, when they are satisfied,
$t = \rho \frac{\partial\varphi}{\partial e_a} \otimes e_a$. (25)
Of course, $t$ is here independent of position, so $\nabla \cdot t = 0$. If (24) is nicely soluble for $p_\alpha$, we can eliminate these variables and, with B.R., get the formula for the CAUCHY stress which is generally accepted. Note that, with (12), solutions of (24) for $p_\alpha$ are never unique. I think it likely that some of the failures of elasticity theory discussed by ZANZOTTO [6] are associated with the possibility that these equations are not so nicely soluble. For a purely mechanical theory like this, it seems to me unlikely, physically, to be able to find external forces to balance $f_\alpha$ and not introduce other variables, perhaps electrical polarizations or magnetizations. Also, I am comfortable with assuming that (24) applies to general forms of the function $\varphi$, partly because I do not see how else to get molecular theory and elasticity theory to be consistent, when B.R. applies. So, we have scraped together a kind of format, a start toward building a continuum theory of equilibrium, albeit not of a very familiar kind. Clearly, what follows more directly from those molecular theories is not a theory of macroscopic deformation, but a theory of those vectors $e_a$ and $p_\alpha$, really a theory of X-ray observations.
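The distinction between the automatic balance (23) and the extra equilibrium condition (24) shows up clearly in a small 2-lattice example. The sketch below evaluates the force on one atom of each lattice as the pair sum obtained by differentiating (11), again for an assumed pair potential $f(r) = e^{-4r^2}$ with truncated lattice sums. The potential, the cubic lattice vectors and the function names are illustrative assumptions.

```python
import itertools
import math

def df(r):
    """Derivative of the illustrative pair potential f(r) = exp(-4 r^2)."""
    return -8.0 * r * math.exp(-4.0 * r * r)

def atomic_forces(e, p, n_max=3):
    """Force on one atom of each lattice: f_gamma = -sum_beta sum_n f'(|K|) K/|K|,
    with K = n^a e_a + p_gamma - p_beta and n = 0 skipped when gamma == beta."""
    M = len(p)
    forces = []
    for ga in range(M):
        total = [0.0, 0.0, 0.0]
        for be in range(M):
            for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
                if ga == be and n == (0, 0, 0):
                    continue
                K = [sum(n[a] * e[a][i] for a in range(3)) + p[ga][i] - p[be][i]
                     for i in range(3)]
                r = math.sqrt(sum(k * k for k in K))
                for i in range(3):
                    total[i] -= df(r) * K[i] / r
        forces.append(total)
    return forces

cubic = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Generic shift: individual forces are non-zero, but they cancel over the pseudo-molecule.
generic = atomic_forces(cubic, [[0.0, 0.0, 0.0], [0.4, 0.5, 0.45]])
generic_sum = [generic[0][i] + generic[1][i] for i in range(3)]

# Shift to the cube center: each atom is force-free, so (24) holds.
centered = atomic_forces(cubic, [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]])
print(generic, generic_sum, centered)
```

For the generic shift the two forces are equal and opposite, as (23) requires, while only the symmetric shift satisfies (24) atom by atom.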
3. Constraints and B.R. For continuum theory, consider ea and pa as vector fields, defined over some regions of space, these being smooth except on curves or surfaces or perhaps at isolated points. Singularities there occurring might well represent
^ _ _ _
173 Equilibrium Theory for X-ray Observations of Crystals
189
isolated defects, for example vacancies, dislocations or twins. Even when these fields are smooth, they can involve continuous distributions of defects. Measures of these associated with $e_a$ only are discussed in some detail by DAVINI & PARRY [10]. Probably, some failures of B.R. are associated with large numbers of defects. For example, the experiments of BALZER & SIGVALDSON [11] on thermal expansion of zinc indicate a serious failure of B.R., which they attribute to changes in locations of vacancies, along with changes in dislocations.

I do not have a very good feeling for what goes on in the motions associated with forming deformation twins. Logically, there are two possibilities which might occur when twins form without producing other defects. Either the changes in all atomic positions in a twin can be described by some linear transformation of the simple shearing kind or not. In the former case, the experience is that the macroscopic deformation gradient is likely to be such a transformation. In the latter case, workers try to fit this gradient to a lattice consisting of a subset of atoms, which seems to be possible, generally. Then, they try to estimate how the remainder have been shuffled around, to get to their final positions. Roughly, this is what "shuffling" means, in practice. There are inequivalent, more precise descriptions in the literature. For further discussion of shuffling, including sketches of possibilities, see CHRISTIAN & MAHAJAN [5]. For multi-lattices, with $p_i$ determined by equilibrium equations depending on the material, shuffling does occur, in most cases.

Here, I limit applicability of theory, by excluding continuous distributions of defects, essentially. From the discussion of DAVINI & PARRY [10], excluding continuous distributions of dislocations gives the constraint

$$\operatorname{curl} e^a = 0, \tag{26}$$
where $e^a$ are the reciprocal lattice vectors, the basis dual to $e_a$. At least locally, this is equivalent to having three scalar functions $\chi^a$ such that

$$e^a = \nabla\chi^a. \tag{27}$$
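One direction of the local equivalence between (26) and (27) is a one-line check. The sketch below assumes only that the $\chi^a$ are twice continuously differentiable; the Cartesian index notation, with $\partial_j = \partial/\partial x_j$, is introduced here for illustration.

```latex
% Assumption: e^a = \nabla\chi^a with \chi^a of class C^2; Cartesian components.
(\operatorname{curl} e^a)_i
  = \epsilon_{ijk}\,\partial_j e^a_k
  = \epsilon_{ijk}\,\partial_j\partial_k\chi^a
  = 0 ,
% because \epsilon_{ijk} is antisymmetric and \partial_j\partial_k is symmetric in (j,k).
% Conversely, on a simply connected region where e^a is smooth and curl e^a = 0,
% one can define
\chi^a(x) = \chi^a(x_0) + \int_{x_0}^{x} e^a\cdot dx ,
% the line integral being independent of the path chosen.
```

The converse needs a simply connected region, which is why the equivalence is stated only locally.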
Generally, if $c$ is a Burgers circuit, a closed oriented curve on which $e^a$ is smooth, the numbers

$$b^a = \oint_c e^a\cdot dx \tag{28}$$
are interpretable as certain components of the Burgers vector for the circuit, which should vanish unless the circuit encloses a line defect, usually regarded as a dislocation line, where $e^a$ can be expected to have singularities. Assuming $b^a = 0$, for all circuits in a region where $e^a$ is smooth, is enough to make (27) apply throughout the region.

Obviously, if we are to use the suggestions from molecular theory, we need some prescription for the mass density $\rho$. From the microscopic point of view, the mass per unit volume is the mass of a pseudo-molecule divided by the volume of a unit cell, suggesting what I accept, i.e.,

$$\rho = \mu/|e_1\cdot e_2\wedge e_3| = \mu\,|e^1\cdot e^2\wedge e^3|. \tag{29}$$
Having numerous point defects, like vacancies or interstitials, could upset this. So, effectively, continuous distributions of point defects are excluded. From the earlier discussion of thermal expansion of zinc, one should be leery of trying to apply the present theory to it. Note that, if we use (12) to transform $e_a$, the transformed vectors again satisfy (26) and give the same value to $\rho$.

Deliberately, I have postponed describing B.R. To use it, one introduces some possible configuration as a reference and a definite choice of lattice vectors $E_a$ for it. If $F$ denotes the usual deformation gradient, relative to the reference, the assumption is that the vectors given by

$$e_a = FE_a \tag{30}$$

are a possible set of lattice vectors in the deformed configuration. This gives

$$e^a = F^{-T}E^a. \tag{31}$$
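Two consequences of (30) can be checked directly. The sketch assumes that the reference lattice vectors $E_a$ have reciprocals $E^a$ with $E^a\cdot E_b = \delta^a_b$, writes $\mu$ for the constant in (29), and writes $\rho_0$ for the value (29) gives when applied to $E_a$; the symbol $\rho_0$ is introduced here for illustration.

```latex
% (i) Duality: with e_a = F E_a, the vectors F^{-T} E^a are the reciprocals.
F^{-T}E^a\cdot F E_b = E^a\cdot F^{-1}F E_b = E^a\cdot E_b = \delta^a_b .
% (ii) The unit-cell volume scales with det F:
e_1\cdot e_2\wedge e_3 = (\det F)\,E_1\cdot E_2\wedge E_3 ,
% so (29) reproduces the usual Jacobian rule for the density:
\rho = \frac{\mu}{|e_1\cdot e_2\wedge e_3|}
     = \frac{\mu}{|\det F|\,|E_1\cdot E_2\wedge E_3|}
     = \frac{\rho_0}{|\det F|} .
```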
If we regard this as a function of spatial coordinates $x$, we can verify that it satisfies (26). Also, (29) is consistent with the usual Jacobian rule for relating $\rho$ to the reference mass density, when $F$ is used. Commonly, $E_a$ is taken to be constant. Then, combining (27) and (31) gives an equation which can be integrated easily, to give

$$\chi^a = X\cdot E^a + \text{const.} \tag{32}$$

or, equivalently,

$$X = \chi^a E_a + \text{const.} \tag{33}$$
with $X$ representing positions of material points in the reference configuration. It is not hard to generalize this to cases where $E_a$ is not constant, but (27) applies to it. So, if it seems safe to apply B.R. to a limited set of deformations, to link them to changes in $e_a$ and $p_i$, one can use these ideas to do so. I have assumed, and will continue to assume, that B.R. applies to the trivial cases of rotations and translations. With either (32) and (33) or the indicated generalization, $X = \text{const.} \iff \chi^a = \text{const.}$, so those crystallographic points are also material points, when B.R. applies. In some twinning situations, it is clear that some crystallographic or irrational planes of interest do not move as material surfaces, so B.R. cannot then apply.

Here, in discussing B.R., I do not insist that $e_a$ be true lattice vectors, generating the maximal translation group: I am allowing some use of the weakened version labelled W.B.R. by ZANZOTTO [6]. I do agree with him that, for a decent theory, one must fix on some mode of description of configurations, fixing a value of M, for a given crystal. As is clear from the discussion by CHRISTIAN & MAHAJAN [5], failures of B.R. for twinning are commonly linked to shuffling which, in itself, does not produce anything recognizable as a defect.

Commonly, in crystallography, workers mention $(z_1, z_2, z_3)$ crystallographic planes, where the $z$'s are some integers, meaning a surface for which the normal has the direction

$$z_a e^a = \nabla(z_a\chi^a). \tag{34}$$

They either state or rely on common usage to identify how the $e_a$ are to be selected. Here, such a surface is described by what looks like the equation of a plane, i.e.,

$$z_a\chi^a = \text{const.} \tag{35}$$
so, I call these crystallographic planes. Workers also speak of irrational planes, described in the same way with real numbers $z_a$, not replaceable by integers. From this, and similar descriptions of crystallographic directions, it is natural to think of the $\chi^a$ as Cartesian coordinates of points in an affine space, which I call crystallographic space. So, instead of connecting positions in space to positions of material points in a reference, we are connecting them to these crystallographic points. Since boundaries of crystals tend to be made up of parts of such planes, this view can be of some help in thinking about possible boundary conditions. Given the various notions about atomic shuffling, it seems to me likely that making sense of macroscopic deformations might well involve averaging over a larger scale than is needed for X-ray observations, making it a bit tricky to combine these in a single theory.

Clearly, adding a constant to $\chi^a$ changes nothing of physical interest. Essentially, we have interpreted $\chi^a$ as Cartesian coordinates in an affine space, which fits nicely with this translation group. Also, it has an infinite discrete group acting on it, inferred from (12). Putting these two groups together, we get transformations of the form

$$\chi^a \to m^a_b\,\chi^b + c^a, \tag{36}$$

where the
4. Equilibrium Equations

From the previous discussion, it is pretty clear that we need to select some equilibrium equations for $\chi^a$, which involves some subtleties. For $\varphi$, etc., I assume that the types of constitutive equations considered in §2 apply to the fields $e_a$ and $p_i$, although these are there assumed to be constants. However, given (27), it is better to replace $e_a$ by $e^a = \nabla\chi^a$ in these, which is routine. So, we will have

$$\varphi = \varphi(e_a, p_i, \theta) = \bar\varphi(e^a, p_i, \theta), \tag{37}$$

with equilibrium equations given in part by

$$\frac{\partial\varphi}{\partial p_i} = \frac{\partial\bar\varphi}{\partial p_i} = 0, \tag{38}$$

$\theta$ being regarded as a constant, really a control parameter. With (38), the CAUCHY stress is given by

$$t = \rho\,\frac{\partial\varphi}{\partial e_a}\otimes e_a = -\rho\,e^a\otimes\frac{\partial\bar\varphi}{\partial e^a} = -\rho\,\frac{\partial\bar\varphi}{\partial e^a}\otimes e^a, \tag{39}$$

from (5) and (21). This should satisfy the usual equilibrium equations

$$\nabla\cdot t + f = 0, \tag{40}$$
$f$ being the body force per unit volume. Of course, in (37) and (39), $e^a = \nabla\chi^a$. For fitting the equations to principles of virtual work, etc., it helps to introduce the energy per unit volume

$$w = \rho\bar\varphi = w(\nabla\chi^a, p_i, \theta), \tag{41}$$

with $\rho$ given by (29)$_2$. This gives

$$t = w\,1 - \frac{\partial w}{\partial\nabla\chi^a}\otimes\nabla\chi^a, \tag{42}$$

comparable to my [13, eq. (4.5)]. There, in eq. (4.3), is a prescription for a configurational stress, the obvious analog being

$$c_a = \frac{\partial w}{\partial\nabla\chi^a}, \tag{43}$$

for which I use the same name. By a simple calculation,

$$\nabla\cdot t = -(\nabla\cdot c_a)\,\nabla\chi^a, \tag{44}$$

where $\det\|\nabla\chi^a\| \neq 0$, from the assumption that lattice vectors are linearly independent. Thus, the equilibrium equations have the alternative forms

$$\nabla\cdot t = -f \iff \nabla\cdot c_a = -g_a, \qquad f = -g_a\nabla\chi^a. \tag{45}$$
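The "simple calculation" behind (44) can be spelled out as follows. This is a sketch under the assumptions that (38) holds, that $\theta$ is uniform, and that $\nabla\cdot t$ contracts on the first index of $t$. It reads (42) as $t = w\,1 - c_a\otimes\nabla\chi^a$, with $c_a$ as in (43); the component symbols $c_{a\,i}$ and $t_{ij}$ are introduced here for illustration.

```latex
% Chain rule. The \partial w/\partial p_i terms drop out by (38), since
% w = \rho\bar\varphi and \rho does not depend on the shifts; \theta is constant.
\partial_j w = c_{a\,i}\,\partial_j\partial_i\chi^a .
% Divergence of t_{ij} = w\,\delta_{ij} - c_{a\,i}\,\partial_j\chi^a on the first index:
\partial_i t_{ij}
  = \partial_j w
    - (\partial_i c_{a\,i})\,\partial_j\chi^a
    - c_{a\,i}\,\partial_i\partial_j\chi^a
  = -(\nabla\cdot c_a)\,\partial_j\chi^a ,
% i.e. \nabla\cdot t = -(\nabla\cdot c_a)\,\nabla\chi^a, which is (44).
% Because the \nabla\chi^a are linearly independent, writing f = -g_a\nabla\chi^a
% defines g_a uniquely, and \nabla\cdot t = -f is then the same as
% \nabla\cdot c_a = -g_a, which is (45).
```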
In the latter form, they are of the usual Euler-Lagrange type, with a source term, better fitting variational formats. Given a conceptually clear statement
of the physical interpretation of $c_a$, it might be feasible to deduce the formula (43) from the molecular theory discussed earlier, or use the formula to find a good interpretation, something I have not yet tried. With (29) and the usual relations for dual bases, we have

$$\rho e_1 = \mu\,|\det\nabla\chi|\,e_1 = \mu\,e^2\wedge e^3 = \mu\,\nabla\chi^2\wedge\nabla\chi^3, \quad\text{etc.} \tag{46}$$

For one thing, this is related to a measure of point defects discussed by DAVINI & PARRY [10]. If $R$ is a region, with $e^a$ smooth on $\partial R$, one has the measures

$$\mu\int_{\partial R} e^2\wedge e^3\cdot dS = \int_{\partial R}\rho\,e_1\cdot dS, \quad\text{etc.,} \tag{47}$$

where $dS$ denotes the vector element of area. With (46), it is easy to show that

$$\nabla\cdot(\rho e_a) = 0, \tag{48}$$

so these measures vanish if $R$ contains no holes and, in it, $e^a$ is smooth. Note also that, from (29),

$$\frac{\partial\rho}{\partial\nabla\chi^a} = \rho\,e_a \;\Rightarrow\; \nabla\cdot\frac{\partial\rho}{\partial\nabla\chi^a} = 0. \tag{49}$$
In language used by experts in the calculus of variations, $\rho$ is a null Lagrangian.

Let us turn to a principle of virtual work, much like that I [13] used for a generalization of elasticity theory. There, the discussion fails to mention that the domain variation of reference configurations can involve adding or subtracting mass. Here, I exclude all singularities and topological complications, dealing with smooth fields and smooth regions, topologically equivalent to balls. Reasonably, $w$ can be expected to become infinite at values of its arguments for which positions of different atoms coincide. Certainly, loadings cause a body to occupy different positions in space, so we should vary domains. With no clear ideas about deformations, it is not so clear what should be meant by a material body, except that its mass should remain fixed, viz.,

$$\mathcal{M} = \int_R \rho\,dv = \text{const.} \tag{50}$$
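The statements that $\nabla\cdot(\rho e_a) = 0$ and that $\rho$ is a null Lagrangian can be verified in a few lines. The sketch assumes smooth $\chi^a$ and positive orientation, so that $\rho = \mu\det\|\nabla\chi^a\|$ with $\mu$ the constant in (29); it is an illustration, not a substitute for the argument in the cited literature.

```latex
% Case a = 1, using the identity
% \nabla\cdot(u\wedge v) = v\cdot\operatorname{curl}u - u\cdot\operatorname{curl}v :
\nabla\cdot(\rho e_1)
  = \mu\,\nabla\cdot(\nabla\chi^2\wedge\nabla\chi^3)
  = \mu\,\bigl(\nabla\chi^3\cdot\operatorname{curl}\nabla\chi^2
      - \nabla\chi^2\cdot\operatorname{curl}\nabla\chi^3\bigr)
  = 0 .
% Derivative of a determinant with respect to its rows (cofactor formula):
\frac{\partial\rho}{\partial\nabla\chi^a}
  = \mu\,\det\|\nabla\chi^b\|\;e_a
  = \rho\,e_a ,
% so the Euler-Lagrange expression built from \rho vanishes identically:
\nabla\cdot\frac{\partial\rho}{\partial\nabla\chi^a}
  = \nabla\cdot(\rho e_a) = 0 .
```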
As usual, we consider one-parameter families of fields, defined over regions which similarly vary. For the domains, we have invertible mappings of the form

$$\bar x = \bar x(x,t), \qquad \bar x(x,0) = x, \tag{51}$$

mapping some domain $R(0)$ to $R(t)$, with

$$\delta x \overset{\text{def}}{=} \frac{\partial\bar x}{\partial t}(x,0). \tag{52}$$
Similarly, the functions $\chi^a$ need to be defined on $R(t)$, and varied independently, so we introduce functions

$$\bar\chi^a = \bar\chi^a(\bar x,t), \qquad \bar\chi^a(\bar x(x,0),0) = \chi^a(x), \tag{53}$$

with

$$\delta\chi^a = \frac{\partial\bar\chi^a}{\partial t}(x,0). \tag{54}$$

I assume that the fields $\bar p_i(\bar x,t)$ are obtained as solutions of (38) and are smooth, $\delta p_i$ being calculated in the same way as $\delta\chi^a$. It turns out to be convenient to introduce the combination

$$\Delta\chi^a = \delta\chi^a + \nabla\chi^a\cdot\delta x, \tag{55}$$

the variational equivalent of the material derivative of $\chi^a$, with $\delta x$ regarded as the analog of a velocity field. First, a calculation gives, with

$$\varepsilon = \int_R w\,dv \tag{56}$$

as energy,

$$\delta\varepsilon = -\int_R \nabla\cdot c_a\,\delta\chi^a\,dv + \int_{\partial R}(c_a\,\delta\chi^a + w\,\delta x)\cdot dS, \tag{57}$$
where $R = R(0)$. This is to be equated to virtual work, denoted by $W$, which can be regarded as a linear functional of $\delta\chi^a$ and $\delta x$ or, equivalently, of $\Delta\chi^a$ and $\delta x$. I use the latter, writing

$$W = \int_R (f\cdot\delta x + g_a\,\Delta\chi^a)\,dv + \int_{\partial R}(h\cdot\delta x + k_a\,\Delta\chi^a)\,da, \tag{58}$$

where $da$ denotes the scalar element of area and the coefficients of the variations are generalized forces. So we want

$$\delta\varepsilon = W \quad\text{when}\quad \delta\mathcal{M} = 0. \tag{59}$$
As noted earlier, I do not think it reasonable to think that one can produce external forces conjugate to $p_i$, so I do not include a linear functional of them. Actually, one could deduce (38) from (59). It is easiest to deal with the constraint by using a Lagrange multiplier. However, let us first look at balance laws obtained by using variations based on Lie groups leaving $w$ invariant, so $\delta\varepsilon = 0$. There are the translations in physical space for which

$$\Delta\chi^a = 0, \qquad \delta x = \text{const.}, \tag{60}$$

yielding

$$\int_R f\,dv + \int_{\partial R} h\,da = 0, \tag{61}$$

which I interpret as a balance of forces. For the rotations,

$$\Delta\chi^a = 0, \qquad \delta x = d\wedge x, \tag{62}$$

with $d$ any constant vector, we get the balance of moments

$$\int_R x\wedge f\,dv + \int_{\partial R} x\wedge h\,da = 0. \tag{63}$$

Finally, there are the translations in crystallographic space indicated in (36), for which

$$\Delta\chi^a = \delta\chi^a = \text{const.}, \qquad \delta x = 0, \tag{64}$$

producing

$$\int_R g_a\,dv + \int_{\partial R} k_a\,da = 0, \tag{65}$$
which I interpret as a balance of configurational forces. We now consider general variations. As interior equations, we then get, by routine calculations,

$$\nabla\cdot c_a + g_a = 0, \qquad f + g_a\nabla\chi^a = 0 \quad\text{in } R. \tag{66}$$

With (44), these imply that (45) is satisfied, (66) emerging as the equilibrium equation favored for $\chi^a$, although (45)$_1$ is equivalent. Matching the boundary terms gives the familiar

$$t\nu = h \quad\text{on } \partial R, \tag{67}$$

where $\nu$ is the outward directed unit normal, and the not so familiar

$$c_a\cdot\nu = k_a + \lambda\rho\,e_a\cdot\nu \quad\text{on } \partial R, \tag{68}$$

with $\lambda$ as the Lagrange multiplier associated with fixing the mass. Now, at least as long as mass is conserved, we generally think of $\bar\varphi$ and $\bar\varphi + \alpha$ as physically equivalent, if $\alpha$ is a constant or, more generally, an affine function of $\theta$. However, they give different values to $c_a$, the difference being the divergenceless vectors

$$\alpha\,\frac{\partial\rho}{\partial\nabla\chi^a} = \alpha\rho\,e_a. \tag{69}$$
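The effect of replacing $\bar\varphi$ by $\bar\varphi + \alpha$ can be traced through (41) to (43). The sketch takes $\alpha$ constant, reads (42) as $t = w\,1 - c_a\otimes e^a$ with $e^a = \nabla\chi^a$, and uses $\partial\rho/\partial\nabla\chi^a = \rho e_a$ together with $e_a\otimes e^a = 1$. Primes mark the shifted quantities and are notation introduced here for illustration.

```latex
% Shifted energy density and configurational stress:
w' = \rho(\bar\varphi + \alpha) = w + \alpha\rho ,
\qquad
c_a' = \frac{\partial w'}{\partial\nabla\chi^a} = c_a + \alpha\rho\,e_a .
% The added vectors \alpha\rho e_a are divergenceless by (48).
% The Cauchy stress is unchanged:
t' = w'\,1 - c_a'\otimes e^a
   = (w + \alpha\rho)\,1 - c_a\otimes e^a - \alpha\rho\,e_a\otimes e^a
   = w\,1 - c_a\otimes e^a
   = t .
```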
In my estimation, we should regard such different vectors as physically equivalent. So, I suggest that boundary conditions like (68) need only to be satisfied to within this ambiguity, as is implied by (68). One can check that $\bar\varphi$ and $\bar\varphi + \alpha$ give the same Cauchy stress, using (41) and (42). Considering our rather meager experience with formulating boundary conditions, etc., for configurational forces, it is tempting to use the alternative formulation in terms of the stress tensor. However, considerations of configurational forces are being used increasingly, in studies of various kinds of defects, so there is reason to try to better understand them. There are recent studies emphasizing them, by GURTIN [14] and MAUGIN & TRIMARCO [15].

One should not view (51) as describing a one-parameter family of macroscopic deformations; there is no compelling reason to do so. Even in elasticity theory, we use adjectives like "virtual", to indicate that such mappings might or might not be realizable as deformations, physically. Naively, one might expect that, for macroscopic deformations, the variational equivalent of the equation of continuity should hold, i.e.,

$$\delta\rho + \nabla\cdot(\rho\,\delta x) = 0, \tag{70}$$

and it does not for all variations considered above.³ On the other hand, the mass density associated with macroscopic deformation is likely to be some average of $\rho$, if this involves averaging on some larger scale.

6. On Twinning

To a large extent, theories of twinning deal with twins present in crystals subject to no loads, in what are, or at least seem to be, equilibrium configurations. I shall tackle these in a way which seems to me natural, for the theory here considered, although it is not completely conventional. Such twins could be of the deformation kind, for example, produced by loads involving shear. We are then doing a post-mortem of what occurs, after loads are removed, and the crystal has come to rest. We begin by considering configurations minimizing the energy

$$\varepsilon = \int_R \rho\varphi\,dv, \tag{71}$$

subject to the constraint

$$\mathcal{M} = \int_R \rho\,dv = \text{const.}, \tag{72}$$

$R$ being an unspecified region. Here, I use $\varphi$ rather than $\bar\varphi$, to stay closer to what is conventional. Proceeding naively, we try to find constant minimizers $e_a$ and $p_i$. For these,

$$\varepsilon = \mathcal{M}\varphi, \tag{73}$$

so it is a matter of minimizing $\varphi$. Clearly, $\rho$ is constant, so all that matters about $R$ is its volume. If $e_a$ and $p_i$ denote such a minimizer, we can apply the transformations leaving $\varphi$ invariant to generate an infinite set of such minimizers. If $\bar e_a$ and $\bar p_i$ denote any of these, there is a choice of symmetry transformations such that

$$\bar e_a = Q\,m_a^b\,e_b, \qquad \bar p_i = Q(p_i + n_i^a e_a + c), \tag{74}$$
where $Q$ is some orthogonal transformation, the rest being obtained by combining (4), (6), (12) and (13). For cases where atoms in different lattices are identical, one can generalize this a bit, allowing interchanges of corresponding $p_i$. However, for simplicity, I ignore this.

³ By one way of interpreting this, our variations cover some atomic shuffling as well as gross deformations.

From the extremal conditions for a minimum, it follows that the stress tensor vanishes and the configurational stress is physically equivalent to zero. It might be possible to use the theory of space groups to say more about minimizers. Not having thought this through carefully, I will not say more.

For elementary twinning theory, one considers a plane I call a twin⁴ plane, across which $e_a$ and $p_i$ undergo a jump discontinuity to a symmetry-related configuration, some choice of $\bar e_a$ and $\bar p_i$ in (74). One could get trivial examples of this kind by having one configuration throughout, with the different choices of $e_a$ and $p_i$ possible for one configuration. It is understood that these are to be excluded, that it is not possible to relate the configurations on the two sides by transformations of the form (74), with $Q = 1$. For studies based on elasticity theory, another condition is used, kinematic conditions of compatibility coming from the assumption that the displacement is continuous on the twin plane. When B.R. fails to apply, it is at best doubtful to rely on this. However, there is a different assumption which is equivalent to this when B.R. applies, and is still usable when it does not. It is

Assumption 1. For a twin, isolated from other singularities, it is possible to choose lattice vectors so that the Burgers vector vanishes for all Burgers circuits lying in a neighborhood of the plane, including those intersecting the twin plane.

The statement is not conventional, so I will expend some ink, to explain how it relates to more conventional ideas used in discussions of twinning. Assuming it, one can pick some point $P_0$ in the neighborhood, assign a value of $\chi^a = \chi^a_0$ there, then use the path-independent integrals implied by it to define $\chi^a$ at any other point $P$ in the neighborhood, as indicated by

$$\chi^a(P) = \chi^a_0 + \int_{P_0}^{P} e^a\cdot dx. \tag{75}$$
At the twin plane, $\chi^a$ is then continuous, giving us an analog of the continuous displacement assumed in elasticity theory. This gives us the analogous kinematic condition of compatibility. Denoting by $\bar e^a$ and $e^a$ the limiting values on the twin plane obtained as one-sided limits, we have

$$\bar e^a = \nabla\bar\chi^a = \nabla\chi^a + q^a\nu = e^a + q^a\nu \tag{76}$$

for some scalars $q^a$, $\nu$ being the unit normal to the plane or, equivalently,

$$\bar e^a = He^a, \tag{77}$$

where

$$H = 1 + \nu\otimes q, \qquad q = q^a e_a. \tag{78}$$
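The step from continuity of $\chi^a$ to (76) through (78) is the usual Hadamard-type argument. In the sketch, $\bar\chi^a$ and $\chi^a$ denote smooth extensions from the two sides of the plane, $u$ is any vector tangent to the plane, and $q^a$ is defined by the last equality in the first display; these labels are assumptions made for illustration.

```latex
% \bar\chi^a - \chi^a vanishes on the plane, so its tangential derivatives vanish:
u\cdot\nu = 0
\;\Rightarrow\;
u\cdot(\nabla\bar\chi^a - \nabla\chi^a) = 0
\;\Rightarrow\;
\bar e^a - e^a = q^a\,\nu ,
\qquad
q^a = \nu\cdot(\bar e^a - e^a) .
% Rewriting with q = q^b e_b and e_b\cdot e^a = \delta_b^a:
(1 + \nu\otimes q)\,e^a
  = e^a + \nu\,(q\cdot e^a)
  = e^a + q^a\,\nu
  = \bar e^a .
```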
⁴ In the literature, this and other adjectives are used, e.g., interface, twinning or composition.

Consider the equation of the twin plane in crystallographic space. With $\chi^a$ continuous, the two limits can be described by the same equation, so it makes sense to say that it is a certain crystallographic or irrational plane, not that it is one thing on one side, another on the other. Certainly, this fits standard procedures in discussions of twinning. Also, for any vector $u$ perpendicular to $\nu$, we have

$$u\cdot\nu = 0 \;\Rightarrow\; u\cdot\bar e^a = u\cdot e^a, \tag{79}$$
which means that crystallographic directions in the plane, with the same Miller indices on both sides, coincide. An example of a growth twin in alum not satisfying this condition is mentioned by ZANZOTTO [16, note 1], so the assumption does not apply universally. As he mentions, there seems not to be a definition of growth twins which is generally accepted. However, I do not know of any observations of mechanical twins similar to those seen in alum.

In using continuum theory, we do lose sight of the detailed arrangements of atoms on both sides, as is the case with X-ray observations. So, there is no way of knowing whether atoms on one side might be translated relative to the other by a fraction of lattice spacing, for example. This is something that might be seen, using electron microscopes, but one then encounters some uncertainty about how well what is observed in the small samples there used relates to what occurs in the bulk.

What we have discussed relates to special choices of lattice vectors, but these must be compatible with (74), so the $\bar e^a$ must be the reciprocals of some choice of $\bar e_a$ in (74). One implication is that

$$\det\|\bar e^a\| = \pm\det\|e^a\|, \tag{80}$$

giving

$$\det H = 1 + q\cdot\nu = \pm 1. \tag{81}$$

The relation between some growth twins does involve orientation reversing operations, so cases where the lower sign holds in (81) might be of interest for some of these. I do not think that this possibility is of interest for mechanical twins, and so accept

Assumption 2. For mechanical twins, there is a way of choosing $e^a$ so that (81) applies, with the upper choice of sign.

There is the logical possibility that, with different choices of $e^a$, one can get both choices of sign in (81), a possibility I have not explored, but I know that it happens, in special cases. With Assumption 2, we have

$$\det H = 1 \iff q\cdot\nu = 0, \tag{82}$$

$$\bar e_a = H^{-T}e_a = (1 - q\otimes\nu)\,e_a. \tag{83}$$
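The formulas (81) to (83) rest on two elementary facts about rank-one perturbations of the identity, sketched here. The only assumption beyond the text is the convention $(a\otimes b)c = a\,(b\cdot c)$.

```latex
% Determinant of a rank-one update of the identity:
\det(1 + \nu\otimes q) = 1 + q\cdot\nu .
% If q\cdot\nu = 0, then (q\otimes\nu)(q\otimes\nu) = (\nu\cdot q)\,q\otimes\nu = 0, so
(1 + q\otimes\nu)(1 - q\otimes\nu) = 1 - (q\otimes\nu)(q\otimes\nu) = 1 ,
\qquad
H^{-T} = (1 + q\otimes\nu)^{-1} = 1 - q\otimes\nu .
% Duality is preserved: with \bar e^a = H e^a and \bar e_b = H^{-T} e_b,
\bar e^a\cdot\bar e_b
  = H e^a\cdot H^{-T}e_b
  = e^a\cdot H^{T}H^{-T}e_b
  = e^a\cdot e_b
  = \delta^a_b .
```

Since $q\cdot\nu = 0$ and $\nu$ is a unit vector, $1 - q\otimes\nu$ is a simple shear with shear plane normal $\nu$ and amount $|q|$.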
That is, the linear transformation applying to lattice vectors has the form of that for a simple shearing deformation. For mechanical twins, it seems to be generally agreed that the relative macroscopic deformation gradient is of the simple shearing kind. To try to determine this from X-ray observations, workers try to find lattice vectors related as in (83) and, it seems, always find some. Most but not all observations indicate that twin planes are crystallographic planes. Then, theoretically, there are infinitely many, if there is one. Many involve shears too huge to be considered seriously but, typically, one gets some different possibilities which look reasonable, on the face of it. Failures of B.R. occur when none of these agree with observations, as does happen for some crystals. Given a set of lattice vectors satisfying (82) and (83), one can work back, to show that these do conform to Assumptions 1 and 2, providing more evidence that these assumptions apply to mechanical twins. Using the fact that $\bar e_a$ must also be one of the $\bar e_a$ given by (74) leads to the rather standard twinning equation

$$Q\,m_a^b\,e_b = (1 - q\otimes\nu)\,e_a, \tag{84}$$

Assumption 2 implying that

$$\det Q\,\det\|m_a^b\| = 1. \tag{85}$$
The experience is that (84) admits various solutions not matched by observations, suggesting that some other criterion should be introduced, but I do not know what to propose for it. For Bravais simple lattices, there is no loss in generality assuming both determinants are positive, so $Q$ is a rotation. However, one should be aware that the possibility that both are negative has a different effect on (73).

Even when it fails to apply, one can use B.R. to borrow twin analysis based on the version of (84) which is used in elasticity theory, giving us the analogous elementary theory of twinning. In a similar way, one can adapt most of the theory of twin microstructures. However, I do not want to expend the ink to explain subtleties related to this. For deformation twins, it is clear from evidence discussed by CHRISTIAN & MAHAJAN [5] that interaction of twins with dislocations and other defects is important, so there is also reason to adapt theories of such defects based on elasticity theory, which seems feasible.

Acknowledgement. I thank MARIO PITTERI and GIOVANNI ZANZOTTO for helpful discussion of various aspects of this theory.

References

[1] STAKGOLD, I., The Cauchy relations in a molecular theory of elasticity, Quarterly of Applied Mathematics, 8, 169-186 (1950).
[2] BHATTACHARYA, K., FIROOZYE, N. B., JAMES, R. D. & KOHN, R. V., Restrictions on microstructure, Proceedings of the Royal Society of Edinburgh, 124A, 843-878 (1994).
[3] WOOSTER, W. A. & WOOSTER, N., Control of electrical twinning in quartz, Nature, 159, 40-406 (1946).
[4] THOMAS, L. A. & WOOSTER, W. A., Piezocrescence - the growth of Dauphiné twinning in quartz under stress, Proceedings of the Royal Society of London, A208, 43-63 (1951).
[5] CHRISTIAN, J. W. & MAHAJAN, S., Deformation twinning, Progress in Materials Science, 39, 1-157 (1995).
[6] ZANZOTTO, G., On the material symmetry group of elastic crystals and the Born rule, Archive for Rational Mechanics and Analysis, 121, 1-36 (1992).
[7] PITTERI, M., On ν + 1 lattices, Journal of Elasticity, 15, 3-25 (1985).
[8] LOVE, A. E. H., A treatise on the mathematical theory of elasticity, 4th ed., Cambridge University Press, Cambridge, 1927.
[9] POINCARÉ, H., Leçons sur la théorie de l'élasticité, Georges Carré, Paris, 1892.
[10] DAVINI, C. & PARRY, G. P., A complete list of invariants for defective crystals, Proceedings of the Royal Society of London, A432, 34-365 (1991).
[11] BALZER, R. & SIGVALDSON, H., Equilibrium vacancy concentration measurements on zinc single crystals, Journal of Physics F: Metal Physics, 9, 17-178 (1979).
[12] ERICKSEN, J. L., On the symmetry of deformable crystals, Archive for Rational Mechanics and Analysis, 72, 1-13 (1979).
[13] ERICKSEN, J. L., Remarks concerning forces on line defects, Zeitschrift für Angewandte Mathematik und Physik, 46, Special Issue, S247-S271 (1995).
[14] GURTIN, M. E., The nature of configurational forces, Archive for Rational Mechanics and Analysis, 131, 67-100 (1995).
[15] MAUGIN, G. A. & TRIMARCO, C., The dynamics of configurational forces at phase-transition fronts, Meccanica, 30, 605-619 (1995).
[16] ZANZOTTO, G., Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some experimental results, Notes I and II, Atti della Accademia Nazionale dei Lincei, 82, 725-741, 743-756 (1988).

5378 Buckskin Bob Road
Florence, Oregon 97439

(Accepted May 2, 1996)
Weierstrass Institute for Applied Analysis and Stochastics
Report No. 18 2000
Anniversary Volume Krzysztof Wilmanski Contributions to Continuum Theories
A Minimization Problem in the X-ray Theory J. L. ERICKSEN 5378 Buckskin Bob Rd Florence, OR 97439, USA
Summary

My purpose is to elaborate some features of a continuum theory of X-ray observations of crystals, relating to an energy minimization problem. The problem is similar to one commonly used in thermoelasticity theory to analyze phenomena relating to phase transitions and twinning, but there are some subtle differences.

1 Introduction

Those involved in molecular theories of elasticity theory for crystals commonly use the Cauchy-Born rule to convert functions of lattice vectors to functions of deformation gradients. Zanzotto (1992) pointed out that, for deformations associated with twinning, for example, the rule seems to be reliable for Bravais lattices and shape memory alloys, but not for [...] and it is somewhat different from the analog used in elasticity theory. Here, I discuss some ideas that I have found helpful for better understanding this problem.

2 Kinematics of n-lattices

The X-ray theory is a continuum theory of deformable n-lattices, that is of n geometrically identical crystal lattices, translated relative to each other. Atoms in any one of the lattices are required to be identical, but atoms occurring in different lattices need not be the same. To within a translation, the identical lattices can be described by three linearly independent vectors $e_a$, the lattice vectors, determining the reciprocal lattice vectors (dual basis) $e^a$, satisfying

$$e^a\cdot e_b = \delta^a_b, \quad [...]$$
position vectors of the latter, relative to this origin. Again, there are infinitely many ways of choosing these for a given configuration and, as is discussed by Pitteri (1985), for example, these are related by transformations of the form

$$p_i \to \alpha_i^j p_j + l_i^a e_a. \tag{2.4}$$

For monatomic crystals, the $\alpha$ matrices are elements of the group generated by

(a) Matrices obtained by replacing a column in the unit matrix by one with all entries equal to $-1$.
(b) Matrices obtained by interchanging two columns in the unit matrix. (2.5)

This covers picking the origin and other atoms used to determine shifts to be any selection of atoms, taking one from each lattice and renumbering the lattices so they are, essentially, permutations. For polyatomic crystals, only permutations permuting identical atoms are allowed. The shifts are restricted by the condition that two atoms must not be in the same place, formalized by

$$p_i \neq l^a e_a \quad\text{and, if } i\neq j, \quad p_i - p_j \neq \bar l^a e_a, \tag{2.6}$$

for any integers $l^a$ and $\bar l^a$.

In most discussions of crystals, $e_a$ and $p_i$ are regarded as constant vectors but, in the X-ray theory, they are vector fields, functions of positions in space. Effectively, this is what is done in other continuum theories. With elasticity theory, they are commonly regarded as functions of position in some reference configuration, which is essentially equivalent. The concept of reference configurations is not basic to the X-ray theory, although one could introduce analogs of these, when it serves some useful purpose.

Symmetry of particular descriptions is best described by the lattice groups $L(e_a, p_i)$ introduced by Pitteri (1985). These are defined by the possible solutions of the equations

$$Qe_a = m_a^b e_b \iff Qe^a = (m^{-1})^a_b\,e^b, \qquad Qp_i = \alpha_i^j p_j + l_i^a e_a, \qquad Q\in O(3), \tag{2.7}$$

where $m$, $\alpha$ and $l$ are selected from the matrices occurring in (2.2) and (2.4). Here, (2.7)$_1$ describes a lattice group element for the skeletal lattice. It can happen that it holds but (2.7)$_2$ does not, the n-lattice having less symmetry than the skeletal. Also, there is a caveat. Some descriptions of n-lattices are nonessential, meaning that the configuration involved can also be described as an $n'$-lattice, with $n' < n$. If not, the description is called essential. For nonessential descriptions, one can't rely on (2.7) to correctly describe the symmetry of the configuration: it can underestimate how much symmetry occurs in the configuration. The remaining discussion excludes nonessential descriptions. A lattice group element is then described by the following set of integers:

$$(m, \alpha, l)\in L(e_a, p_i). \tag{2.8}$$

In particular, these satisfy

$$m^p = 1, \quad p = 1, 2, 3, 4, \text{ or } 6, \qquad m^p = 1 \Rightarrow \alpha^p = 1. \tag{2.9}$$

In (2.9)$_2$ the converse implication does not hold, so it can be that $\alpha = 1$ when $p = 2$, for example. For present purposes, this rather sketchy description will suffice. A good general reference for these matters and other aspects of the theory of crystals is the book by Pitteri and Zanzotto (1999). Results concerning various lattice groups are presented by them, Adeleke (1999), Ericksen (1999), Fadda and Zanzotto (1999), Parry (1998) and Pitteri and Zanzotto (1998). Pitteri (1998) and I (1998) presented different characterizations of nonessential descriptions.

3 Constitutive equations
The constitutive equations I (1997) proposed involve an assumption which amounts to exeluding continuous distributions of dislocations: (3.1) there are scalar functions \ a s u c n t n a t e" = Vx"(3-1) This and other entities are considered as functions of position in space, there being no material coordinates in this theory. Hereafter, (3.1)
78
187 is assumed to hold. Also, motivated by the idea phase transition problems, the configurations of that mass is a multiple of that of a unit cell, I interest can be included in a neighborhood of assumed that, for the mass density p, the kind introduced by Pitteri (1985), centered at some configuration. For such a domain, it p = ke e Ae |, (o.i) suffices to consider ? to be invariant under a finl e „ i TT i t S rou Pi the lattice group for the center. For where k is a vpositive constant, ror the Helm. . . , . . present purposes, it is not necessary to go mto holtz free energy per unit mass, the constitutive . , _ these matters in detail, equations are of the form
ip =
(3.3) 4
Energy minimization
These are assumed to be invariant under rota- j ^ l g g 7 | tlonsi
l g g g ) b r i e f l y c o n s i d e r e d t h e p r o blem
of
minimizing the energy
R £ SO(3).
/
[
,
'
E = Jpvdx, n
In part, the equilibrium equations are z!f_ — o dpi
(4.1)
where ft is a fixed region, subject to the con' '' that the value of the mass,
stra 11
(3-5)
When these are satisfied, the other constitutive equations reduce to the form
M=
pdx
(4.2)
n
is fixed. Also, 9 is assumed to be fixed, so I will ignore it. In this, I was interested in configurations involving complicated arrays of twins, among other things, configurations not smooth enough to satisfy the equilibrium equations everywhere. Roughly, the idea is to consider the various material bodies of the same mass that could occupy the region fi, selecting those minimizin E In e thermoelasticity theory, one minimizes t h e anal ° g o f E f o r a fixed material body, instead. My general experience in such matters led me to believe that, for minimizers, the Cauchy stress should reduce to a constant hyV • t = —(V • c a )e°, (3.7) drostatic pressure, which depends on the value of M, for a given material and region Q. Correso, if the body force vanishes, as will be as- s p o n dence with knowledgeable persons has insumed here, dicated that, even for smooth minimizers, this , , is not obvious to some. I doubt that it is possi_ V- t = 0 and V • ca = 0, (3.8) , . . . r <•,,• •,. ble to give a rigorous general proof of this, with giving three independent equations for the realistic assumptions allowing for complicated three unknowns Xa• Concerning material sym- arrangements of twins, phase mixtures and the metry, the basic idea is that (2.2) and (2.4) de- l i k e a n d - certainly, I am not capable of doing so. scribe physically equivalent lattice vectors and W h a t i s feasible is to make the idea plausible, shifts. However, one needs to tailor this to u s i n g r a t h e r n a i v e f o r m a l arguments, Now what one considers to be the domain of V, and > l ( 1 9 9 7 ) s e t UP a Principle of virtual work w h i c h l u s e d t o one needs to exercise judgment in selecting this. > S e t equations satisfied by For example, for some but not all twinning and smooth extremals described previously. In this,
I used domain perturbations, which need to be restricted to be mappings of $\Omega$ onto itself. This does affect some conclusions about boundary conditions, but not interior equations. Instead of trying to work with the extremal equations, let us approach the problem in a different way, still quite formal, possibly useful for those interested in devising a more rigorous treatment. First fix $\Omega$, using your own judgment about the domain of $\varphi$. Introduce a subenergy function obtained by permitting $\mathbf{p}_i$ to take on all values allowed, defined by

$$\sigma(\mathbf{e}^a) = \inf_{\mathbf{p}_i}\varphi(\mathbf{e}^a,\mathbf{p}_i). \tag{4.3}$$

Physically, I don't see a good reason to doubt that this can be regarded as a minimum, taken on for some values of the shifts satisfying (3.5). In the molecular theories of elasticity, workers solve the analog of (3.5), but one also wants solutions fitting a condition like (4.3), to get stable equilibria. With it, we have for any fields $\mathbf{e}^a(x)$ satisfying the mass constraint,

$$\int_\Omega \rho\,\sigma(\mathbf{e}^a(x))\,dx \le E, \tag{4.4}$$

with $E$ evaluated for these fields and any admissible shift fields: one needs to exercise some judgment about minimal smoothness requirements.

Now, I digress a bit, to introduce a reformulation of the constitutive equations which seems worth mentioning. Consider the change of variables described by

$$\mathbf{f}^a = \frac{\mathbf{e}^a}{(\mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3)^{1/3}}, \qquad \mathbf{e}^a = (kv)^{-1/3}\,\mathbf{f}^a, \tag{4.5}$$

$v$ being the specific volume, the $\mathbf{f}^a$ being homogeneous of degree zero in the $\mathbf{e}^b$. Here, I have used (3.4). Then with

$$\varphi = \hat\varphi(\mathbf{f}^a,\mathbf{p}_i,v,\theta), \tag{4.6}$$

obtained by substituting (4.5)$_2$ in $\varphi$, one gets, by a routine calculation,

$$\mathbf{t} = \mathbf{t}_D - p\,\mathbf{1}, \tag{4.7}$$

where, when (3.5) holds,

$$\mathbf{t}_D = \rho\left[\left(\mathbf{f}^a\cdot\frac{\partial\hat\varphi}{\partial\mathbf{f}^a}\right)\frac{\mathbf{1}}{3} - \mathbf{f}^a\otimes\frac{\partial\hat\varphi}{\partial\mathbf{f}^a}\right] \tag{4.8}$$

is the stress deviator, and

$$p = -\frac{\partial\hat\varphi}{\partial v} = -\frac{\operatorname{tr}\mathbf{t}}{3} \tag{4.9}$$

is, by a reasonable interpretation, the (possibly negative) thermodynamic pressure, also what is commonly regarded as the mechanical pressure. A formulation quite similar to this for thermoelasticity was introduced by Flory (1961), in a study of thermodynamics of high polymers, and I (1981) made use of it in another thermodynamic study.

Now we introduce another subenergy function, an analog of one Chipot and Kinderlehrer (1988), Kinderlehrer (1988) and I (1981) used, in elasticity and thermoelasticity theories, Fonseca and Parry (1992) using it for a more general theory,

$$\tau(v) = \inf_{\rho^{-1}=v}\sigma(\mathbf{e}^a). \tag{4.10}$$

Granted that it really depends on $\theta$, I interpret it as what thermodynamicists commonly mean by the Helmholtz free energy per unit mass, for the kinds of physical problems being considered, but they won't relate $v$ to $\mathbf{e}^a$. For the fields $\mathbf{e}^a$ mentioned above, we should then have

$$E'' = \int_\Omega \rho\,\tau(v)\,dx = \int_\Omega v^{-1}\tau(v)\,dx \le E' \le E, \tag{4.11}$$

where, as in (4.5),

$$v = 1/\rho = 1/(k\,\mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3). \tag{4.12}$$

Again, I don't think it unreasonable to regard the infimum as a minimum. Then, using (4.5)-(4.9), it is fairly easy to show that, for the fields remaining after the selection implied by (4.10), the Cauchy stress reduces to an hydrostatic pressure. Taking $\tau(v)$ as a possible function $\hat\varphi$, one gets

$$\mathbf{t} = -p\,\mathbf{1}, \qquad p = -\frac{d\tau}{dv}, \tag{4.13}$$

as one would expect from this. Assuming this satisfies the equilibrium equation (3.8)$_1$, perhaps in a weak sense, we get

$$\frac{d\tau}{dv} = -p = \text{const.} \tag{4.14}$$
The difficulties in giving a rigorous derivation of $\mathbf{t} = -p\,\mathbf{1}$, $p = \text{const.}$, for complicated minimizers are like those discussed by Kinderlehrer (1988) for the analogous situation in elasticity theory, with additional complications associated with the mass constraint. For situations rather similar, mathematically, Chipot and Kinderlehrer (1988) and Fonseca and Parry (1992) showed that the mean stress is an hydrostatic pressure, so this might be provable for the case at hand. At least for less complicated minimizers, $v$ should satisfy (4.14) and the mass constraint, linking the value of the pressure $p$ to that of $M$. In elasticity theory, for example, workers generally assume that there are homogeneous minimizers and it is easy to see how these conditions work out for these. I believe that a thermodynamicist would be likely to simply assume (4.14), regard both $p$ and $M$ as control variables, and use standard techniques to analyze such problems, allowing the volume to vary. In proposing the problem described above, I was primarily concerned with adapting the theory of microstructures used in elasticity theory and, for this, it seems to me to be trickier to deal with controlling $p$ and $M$, since $\Omega$ then needs to be variable. However, this deserves more thought: it is desirable to be able to control $p$.
Acknowledgment: I thank Marion Ericksen for her help with the typing.

References

Adeleke, S., 1999, On the classification of monoatomic crystal multilattices, in preparation.
Ball, J.M. and James, R.D., 1992, Proposed experimental tests of a theory of fine microstructures and the two-well problem, Philosophical Transactions of the Royal Society of London, 333A, 389-450.
Chipot, M. and Kinderlehrer, D., 1988, Equilibrium configurations of crystals, Archive for Rational Mechanics and Analysis, 103, 237-277.
Christian, J.W. and Mahajan, S., 1995, Deformation twinning, Progress in Materials Science, 39, 1-157.
Ericksen, J.L., 1981, Some simple cases of the Gibbs phenomenon for thermoelastic solids, Journal of Thermal Stresses, 4, 13-30.
Ericksen, J.L., 1997, Equilibrium theory for X-ray observations, Archive for Rational Mechanics and Analysis, 139, 181-200.
Ericksen, J.L., 1998, On nonessential descriptions of crystal multilattices, Journal of Mathematics and Mechanics of Solids, 3, 363-392.
Ericksen, J.L., 1999, On groups occurring in the theory of crystal multilattices, to appear in Archive for Rational Mechanics and Analysis.
Fadda, G. and Zanzotto, G., 1999, The arithmetic symmetry of monatomic planar 2-lattices, to appear in Acta Crystallographica.
Flory, P.J., 1961, Thermodynamic relations for high elastic polymers, Transactions of the Faraday Society, 57, 829-838.
Fonseca, I. and Parry, G., 1992, Equilibrium configurations of defective crystals, Archive for Rational Mechanics and Analysis, 120, 245-283.
Kinderlehrer, D., 1988, Remarks about equilibrium configurations of crystals, in: Materials Instabilities in Continuum Mechanics, ed. J.M. Ball, Oxford University Press, 217-241.
Parry, G., 1998, Low dimensional lattice groups for the continuum mechanics of phase transitions in crystals, Archive for Rational Mechanics and Analysis, 145, 1-22.
Pitteri, M., 1985, On (ν + 1)-lattices, Journal of Elasticity, 15, 3-25.
Pitteri, M., 1998, Geometry and symmetry of multi-lattices, International Journal of Plasticity, 14, 139-147.
Pitteri, M. and Zanzotto, G., 1998, Beyond space groups: the arithmetic symmetry of deformable multi-lattices, Acta Crystallographica, A54, 339-373.
Pitteri, M. and Zanzotto, G., 1999, Continuum models for phase transitions and twinning in crystals, to be published by CRC/Chapman and Hall, London.
Zanzotto, G., 1992, On the material symmetry groups of elastic crystals and the Born rule, Archive for Rational Mechanics and Analysis, 121, 1-36.
Journal of Elasticity 55: 201-218, 1999.
© 2000 Kluwer Academic Publishers. Printed in the Netherlands.
Notes on the X-ray Theory

J.L. ERICKSEN
5378 Buckskin Bob Rd., Florence, OR 97439, U.S.A.

Received 24 September 1999

Abstract. My aim is to better compare thermoelasticity theory with a continuum theory of X-ray observations of crystals, to be described, to make it easier to adapt ideas and techniques from one to the other. To this end, I will present some different ways of formulating the X-ray theory. Also, I will adapt symmetry considerations from the former, partly to analyze problems of a kind not considered in thermoelasticity theory.

Key words: constitutive equations for crystals, reference configurations.
1. Introduction

Those involved in molecular theories of elasticity theory for crystals commonly use the Cauchy-Born rule, hereafter abbreviated to CBR, to convert functions of lattice vectors to functions of deformation gradients. Zanzotto [1] pointed out that, for deformations associated with twinning, for example, the rule seems to be reliable for Bravais lattices and shape memory alloys. However, for other kinds of crystals, it sometimes applies, but frequently does not and, in the latter cases, elasticity theory also fails to apply, in general. To provide some theory for such cases, I [2] proposed an equilibrium theory dealing only with the X-ray observations of lattice vectors, etc., hereafter called the X-ray theory. Here, my aim is to present some analogs of some useful results in thermoelasticity theory, to help in the development of the X-ray theory, and reformulations of equations emphasizing similarities between the two kinds of theories. Also, I shall discuss use of symmetry assumptions to help analyze some equilibrium equations that have no counterpart in thermoelasticity theory, as well as to analyze problems not considered in such theory. Elsewhere, I [3] outlined an adaptation of the theory of microstructures based on elasticity theory, presented by Ball and James [4], for example. This is useful in analyzing rather complicated configurations, involving many twins. Also, I [5] elaborated features of an energy minimization problem which is important for this, among other things.

2. Kinematics of n-lattices

The X-ray theory is a continuum theory of deformable n-lattices, that is, of n geometrically identical crystal lattices, translated relative to each other. Atoms in
any one of the lattices are required to be identical, but atoms occurring in different lattices need not be the same. To within a translation, the identical lattices can be described by three linearly independent vectors $\mathbf{e}_a$, the lattice vectors, determining the reciprocal lattice vectors (dual basis) $\mathbf{e}^a$, satisfying

$$\mathbf{e}^a\cdot\mathbf{e}_b = \delta^a_b, \qquad \mathbf{e}^a\otimes\mathbf{e}_a = \mathbf{1}. \tag{2.1}$$

A specification of these vectors defines what is called a skeletal lattice, for any crystal configuration having the periodicity described by them. For any given lattice, there are infinitely many choices of lattice vectors, related by transformations of the form

$$\mathbf{e}_a \to m_a^b\,\mathbf{e}_b \;\Leftrightarrow\; \mathbf{e}^a \to (m^{-1})^a_b\,\mathbf{e}^b, \qquad m = \|m_a^b\| \in GL(3,\mathbb{Z}). \tag{2.2}$$

In all matrices used here, the lower index labels rows. To describe how the different lattices are translated relative to each other, we use shift vectors, denoted by

$$\mathbf{p}_i, \qquad i = 1,\ldots,n-1. \tag{2.3}$$

To determine these, pick some atom in one of the lattices as origin, and some atom in each of the other lattices. Then, the shifts are the position vectors of the latter, relative to this origin. Again, there are infinitely many ways of choosing these for a given configuration and, as is discussed by Pitteri [6], for example, these are related by transformations of the form

$$\mathbf{p}_i \to \alpha_i^j\,\mathbf{p}_j + l_i^a\,\mathbf{e}_a, \qquad l_i^a\in\mathbb{Z}. \tag{2.4}$$
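As a numerical illustration of (2.1) and (2.2), the following sketch builds the reciprocal vectors for a set of lattice vectors and checks that a change of lattice vectors by an integer matrix of determinant ±1 transforms the reciprocal vectors by the inverse matrix. The numerical values of the lattice vectors and of m, and the helper name `dual_basis`, are assumptions chosen only for illustration; they do not come from the text.

```python
import numpy as np

def dual_basis(E):
    """Rows of E are the lattice vectors e_a; rows of the result are the
    reciprocal vectors e^a, so that e^a . e_b = delta^a_b."""
    return np.linalg.inv(E).T

# Assumed lattice vectors, one per row (illustrative values only).
E = np.array([[1.0, 0.0, 0.0],
              [0.3, 0.9, 0.0],
              [0.1, 0.2, 1.3]])

# Assumed element of GL(3, Z); the lower index a of m_a^b labels rows.
M = np.array([[1, 1, 0],
              [0, 1, 0],
              [2, 0, 1]])
det_M = int(round(np.linalg.det(M)))

D = dual_basis(E)
E_new = M @ E                    # e_a -> m_a^b e_b
D_new = dual_basis(E_new)

M_inv = np.linalg.inv(M)         # entry [b, a] holds (m^{-1})^a_b
D_expected = M_inv.T @ D         # e^a -> (m^{-1})^a_b e^b

duality_ok = np.allclose(D @ E.T, np.eye(3))
transform_ok = np.allclose(D_new, D_expected)
print(det_M, duality_ok, transform_ok)
```

Storing the vectors as rows makes the convention that the lower index labels rows map directly onto ordinary matrix products.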
For monatomic crystals, the $\alpha$ matrices are elements of the group generated by

(a) Matrices obtained by replacing a column in the unit matrix by one with all entries equal to $-1$.
(b) Matrices obtained by interchanging two columns in the unit matrix. (2.5)

This covers picking the origin and other atoms used to determine shifts to be any selection of atoms, taking one from each lattice and renumbering the lattices so they are, essentially, permutations. For polyatomic crystals, only permutations permuting identical atoms are allowed. The shifts are restricted by the condition that two atoms must not be in the same place, formalized by

$$\mathbf{p}_i \ne l_i^a\,\mathbf{e}_a \quad\text{and, if } i\ne j,\quad \mathbf{p}_i - \mathbf{p}_j \ne l_{ij}^a\,\mathbf{e}_a, \tag{2.6}$$

for any integers $l_i^a$ and $l_{ij}^a$. In most discussions of crystals, $\mathbf{e}_a$ and $\mathbf{p}_i$ are regarded as constant vectors but, in the X-ray theory, they are vector fields, functions of positions in space. Effectively, this is what is done in other continuum theories. With elasticity theory, they are commonly regarded as functions of position in some reference configuration,
which is essentially equivalent. The concept of reference configurations is not basic to the X-ray theory but, as will be discussed, one can introduce an analog of these, making it easier to see some similarities between the X-ray theory and thermoelasticity theory. Symmetry of particular descriptions is best described by the lattice groups $L(\mathbf{e}_a,\mathbf{p}_i)$ introduced by Pitteri [6]. These are determined by the possible solutions of the equations

$$\mathbf{Q}\mathbf{e}_a = m_a^b\,\mathbf{e}_b \;\Leftrightarrow\; \mathbf{Q}\mathbf{e}^a = (m^{-1})^a_b\,\mathbf{e}^b, \qquad \mathbf{Q}\in O(3), \tag{2.7a}$$

$$\mathbf{Q}\mathbf{p}_i = \alpha_i^j\,\mathbf{p}_j + l_i^a\,\mathbf{e}_a, \tag{2.7b}$$

where $m$, $\alpha$ and $l$ are selected from the matrices occurring in (2.2) and (2.4). Here, (2.7a) describes a lattice group element for the skeletal lattice. It can happen that it holds but (2.7b) does not, a possibility to be discussed later. Also, there is a caveat. Some descriptions of n-lattices are nonessential, meaning that the configuration involved can also be described as an n'-lattice, with n' < n. If not, the description is called essential. For nonessential descriptions, one cannot rely on (2.7) to correctly describe the symmetry of the configuration: it can underestimate how much symmetry occurs in the configuration. I do not wish to deal with complications associated with this so, hereafter, any description considered is to be regarded as essential. A lattice group element is then described by the following set of integers:

$$(m,\alpha,l)\in L(\mathbf{e}_a,\mathbf{p}_i). \tag{2.8}$$

In particular, these satisfy

$$m^p = 1, \qquad p = 1,2,3,4,6, \tag{2.9a}$$

$$m^p = 1 \;\Rightarrow\; \alpha^p = 1. \tag{2.9b}$$
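The conditions (2.9) can be checked directly on small integer matrices. The sketch below computes the order of two matrices that arise for the hexagonal example of Section 6, together with a shear that lies in GL(3, Z) but has no finite order and so cannot occur in a lattice group. The function name `matrix_order`, the cutoff `max_order` and the choice of the shear are assumptions made for illustration.

```python
import numpy as np

def matrix_order(m, max_order=12):
    """Smallest p >= 1 with m^p = identity, or None if none is found
    up to max_order (an assumed cutoff)."""
    identity = np.eye(m.shape[0], dtype=int)
    power = identity.copy()
    for p in range(1, max_order + 1):
        power = power @ m
        if np.array_equal(power, identity):
            return p
    return None

m_sixfold = np.array([[1, 1, 0], [-1, 0, 0], [0, 0, 1]])    # sixfold rotation
m_twofold = np.array([[1, 0, 0], [-1, -1, 0], [0, 0, -1]])  # twofold rotation
m_shear = np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]])       # infinite order

order_six = matrix_order(m_sixfold)
order_two = matrix_order(m_twofold)
order_shear = matrix_order(m_shear)

# For a monatomic 2-lattice alpha = -1 is allowed: alpha^6 = 1 is consistent
# with (2.9b), while alpha^2 = 1 although the square of m_sixfold is not the
# identity, so the converse of (2.9b) fails.
alpha = -1
converse_fails = (alpha ** 2 == 1) and not np.array_equal(
    m_sixfold @ m_sixfold, np.eye(3, dtype=int))
print(order_six, order_two, order_shear, converse_fails)
```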
In (2.9b) the converse implication does not hold, and examples with $\alpha^p = 1$ for $p = 2$ will be discussed later. For present purposes, this rather sketchy description will suffice. A good general reference for these matters and other aspects of the theory of crystals is the book by Pitteri and Zanzotto [7]. Results concerning various lattice groups are presented by them, Adeleke [8], Ericksen [3, 5], Fadda and Zanzotto [9], Parry [10] and Pitteri and Zanzotto [11]. Pitteri [12] and I [13] presented different characterizations of nonessential descriptions.

3. Constitutive Equations

The constitutive equations I [2] proposed are of the form

$$\varphi = \varphi(\mathbf{e}_a,\mathbf{p}_i,\theta), \qquad \mathbf{t} = \rho\left(\mathbf{e}_a\otimes\frac{\partial\varphi}{\partial\mathbf{e}_a} + \mathbf{p}_i\otimes\frac{\partial\varphi}{\partial\mathbf{p}_i}\right) = \mathbf{t}^T, \qquad \eta = -\frac{\partial\varphi}{\partial\theta}, \tag{3.1}$$
plus a description of configurational stresses, to be mentioned later. Actually, I used position vectors relative to an arbitrary origin instead of shifts, an inconsequential difference. Here, the interpretations are that [...]

$$\varphi(\mathbf{R}\mathbf{e}_a,\mathbf{R}\mathbf{p}_i,\theta) = \varphi(\mathbf{e}_a,\mathbf{p}_i,\theta), \qquad \mathbf{R}\in SO(3). \tag{3.2}$$

For example, this yields $\mathbf{t} = \mathbf{t}^T$ as an identity. Using the shifts takes care of the invariance under translations. Similarly, the molecular theory and invariance considerations led to equilibrium equations for the shifts, of the form

$$\frac{\partial\varphi}{\partial\mathbf{p}_i} = \mathbf{0}. \tag{3.3}$$

The idea that the mass is a multiple of the mass of a unit cell gave, for the mass density $\rho$,

$$\rho = k\,|\mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3|, \tag{3.4}$$

where $k$ is a positive constant. Also, by excluding continuous distributions of dislocations, I inferred that, at least locally, there are scalar functions $\chi^a$ such that

$$\mathbf{e}^a = \nabla\chi^a. \tag{3.5}$$

Henceforth, all such fields are assumed to satisfy this. Clearly,

$$\mathbf{e}_1\cdot\mathbf{e}_2\wedge\mathbf{e}_3 > 0 \quad\text{or}\quad \mathbf{e}_1\cdot\mathbf{e}_2\wedge\mathbf{e}_3 < 0. \tag{3.6}$$

Similarly, the domain is not likely to include shifts violating (2.6). Obviously, this excludes replacing SO(3) by O(3) in (3.2) or using (2.2) with $\det m = -1$. It does not exclude combining the latter with improper orthogonal transformations, as will be considered later. I do have reservations about including such transformations in the invariance group for $\varphi$, in general, thinking it better to think hard about physical implications of this, in considering special cases, before assuming it. Later, I shall say a bit more about this. Of course, any transformations in the invariance group of
one of Pitteri's [6] domains, centered at an essential configuration. This picks out a finite subgroup as the group mapping the neighborhood to itself, essentially the elements occurring in the lattice group for the center, combined with (3.2). For some kinds of twinning and phase transition phenomena, elasticity theory of this kind has been quite successful. For other such phenomena, one needs domains to be larger than Pitteri neighborhoods, although it might be judged to be reasonable to use some bounded neighborhoods, in particular cases. Then, some transformations in the infinite group might map only some configurations in the domain back to the domain, and it is not so easy to properly account for this. As far as I know, no one has worked out an example of this kind.

4. Reformulations

With (3.5), it is rather awkward to use the formulation (3.1), so it seems better to replace $\mathbf{e}_a$ by $\mathbf{e}^a$ in the list of independent variables, as I [2] did, for some analyses. Later, I will discuss a compromise, using $\mathbf{e}_a$, the common practice in elasticity theory, to better compare the two kinds of theories. After working with the theory a while, I found it advantageous to make an additional change. Introduce variables $p_i^a$ such that

$$\mathbf{p}_i = p_i^a\,\mathbf{e}_a \;\Rightarrow\; p_i^a = \mathbf{p}_i\cdot\mathbf{e}^a \tag{4.1}$$

and, in place of (3.1), use

$$\varphi = \varphi(\mathbf{e}^a, p_i^a, \theta) \tag{4.2}$$

as the potential determining the constitutive equations. Making the change of variables, one gets

$$\mathbf{t} = -\rho\,\mathbf{e}^a\otimes\frac{\partial\varphi}{\partial\mathbf{e}^a} = -\rho\,\frac{\partial\varphi}{\partial\mathbf{e}^a}\otimes\mathbf{e}^a, \tag{4.3a}$$

$$\mathbf{c}_a = [\ldots], \tag{4.3b}$$

$$[\ldots] \tag{4.3c}$$

in which it is not assumed that the equilibrium equations (3.3) are satisfied: they become

$$\frac{\partial\varphi}{\partial p_i^a} = 0. \tag{4.4}$$

As before, the symmetry of $\mathbf{t}$ indicated in (4.3a) is really an identity, derivable from (3.2). Further, the three vectors $\mathbf{c}_a$ represent the configurational stress mentioned earlier, being analogous to the configurational stress used in elasticity theory. This
has its uses, in analyzing singular solutions representing defects, but I will not use it here. Bearing in mind (3.6), one needs to decide which orientation of lattice vectors to employ, so I choose

$$\mathbf{e}_1\cdot\mathbf{e}_2\wedge\mathbf{e}_3 > 0 \;\Leftrightarrow\; \mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3 > 0. \tag{4.5}$$

Now this and the inner products

$$G^{ab} \overset{\text{def}}{=} \mathbf{e}^a\cdot\mathbf{e}^b = G^{ba} \tag{4.6}$$

determine the $\mathbf{e}^a$ to within a rotation, and any choice of these uniquely determines $\mathbf{e}_a$. Then $\mathbf{p}_i$ is determined by (4.1), given any possible values of $p_i^a$. Of course, the $G^{ab}$ must be coefficients of a positive definite quadratic form, to determine admissible lattice vectors. Since $p_i^a$ is an inner product of vectors, it is invariant under SO(3), in fact O(3). Said differently, if $\varphi$ is invariant under SO(3), as described by (3.2), it can also be considered to be a function of the form

$$\varphi = \tilde\varphi(G^{ab}, p_i^a, \theta). \tag{4.7}$$
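The two facts used here, that $G^{ab}$ and the shift components are unchanged by a rotation and that $G^{ab}$ fixes the reciprocal vectors to within a rotation, can be illustrated numerically. In the sketch below the numerical values, the helper `rotation_about_z` and the use of a Cholesky factorization to pick one representative set of vectors with a given Gram matrix are assumptions made for illustration, not part of the theory as stated.

```python
import numpy as np

def rotation_about_z(angle):
    """Proper rotation by `angle` (radians) about the third axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Assumed reciprocal vectors e^a (one per row, positive orientation) and shift p.
D = np.array([[1.0, 0.2, 0.0],
              [0.0, 0.8, 0.1],
              [0.3, 0.0, 1.1]])
p = np.array([0.25, 0.4, 0.6])

G = D @ D.T          # G^{ab} = e^a . e^b
p_comp = D @ p       # p^a = p . e^a

Q = rotation_about_z(0.7)
D_rot = D @ Q.T      # rows are Q e^a
p_rot = Q @ p

invariant = (np.allclose(G, D_rot @ D_rot.T)
             and np.allclose(p_comp, D_rot @ p_rot))

# One representative set of vectors with Gram matrix G: rows of the Cholesky
# factor C (G = C C^T). The matrix R with C R = D is then orthogonal.
C = np.linalg.cholesky(G)
R = np.linalg.solve(C, D)
is_rotation = (np.allclose(R @ R.T, np.eye(3))
               and np.isclose(np.linalg.det(R), 1.0))
print(invariant, is_rotation)
```

The Cholesky factor exists because $G^{ab}$ is positive definite, which is the admissibility condition mentioned above.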
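While this should be useful for many calculations, and has not been noted previously, I will not use it here, but will make some use of an equivalent. It is a simple matter to rephrase (4.3) in terms of this function. Of course, this does not cover the additional invariances discussed previously, but I will not pursue this, except to note that (2.4) implies that $\varphi$ is a multiply periodic function of $p_i^a$, if its domain is suitably infinite. There is another formulation which seems to me to be worth mentioning. Consider the change of variables described by

$$\mathbf{f}^a = \frac{\mathbf{e}^a}{(\mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3)^{1/3}} \;\Leftrightarrow\; \mathbf{e}^a = (kv)^{-1/3}\,\mathbf{f}^a, \qquad v = \frac{1}{\rho} = \frac{1}{k(\mathbf{e}^1\cdot\mathbf{e}^2\wedge\mathbf{e}^3)}, \tag{4.8}$$

$v$ being the specific volume, the $\mathbf{f}^a$ being homogeneous of degree zero in the $\mathbf{e}^b$. Here, I have used (3.4). Then, with

$$\varphi = \hat\varphi(\mathbf{f}^a, p_i^a, v, \theta), \tag{4.9}$$

obtained by substituting (4.8) in $\varphi$, one gets, by a routine calculation,

$$\mathbf{t} = \mathbf{t}_D - p\,\mathbf{1}, \tag{4.10}$$

where

$$\mathbf{t}_D = \rho\left[\left(\frac{\partial\hat\varphi}{\partial\mathbf{f}^a}\cdot\mathbf{f}^a\right)\frac{\mathbf{1}}{3} - \frac{\partial\hat\varphi}{\partial\mathbf{f}^a}\otimes\mathbf{f}^a\right] \tag{4.11}$$

is the stress deviator, and

$$p = -\frac{\partial\hat\varphi}{\partial v} = -\frac{\operatorname{tr}\mathbf{t}}{3} \tag{4.12}$$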
is, by a reasonable interpretation, the thermodynamic pressure, also what is commonly regarded as the mechanical pressure. A formulation quite similar to this for thermoelasticity theory was introduced by Flory [14], in a study of thermodynamics of high polymers, and I [5, 15] made use of it and (4.11) in other thermodynamic considerations. Now, there is a way of eliminating the shifts, by introducing a subenergy function $\sigma(\mathbf{e}^a,\theta)$ that I [5] used a bit. In different theories, for particular kinds of problems, subenergies of a different kind have been used by Chipot and Kinderlehrer [16], Ericksen [5, 15], Fonseca and Parry [17] and Kinderlehrer [18]. Here, the idea is to fix $\mathbf{e}^a$ and $\theta$, let the $p_i^a$ vary, and take

$$\sigma(\mathbf{e}^a,\theta) = \inf_{p_i^a}\varphi(\mathbf{e}^a, p_i^a, \theta), \tag{4.13}$$

for whatever is taken to be the domain of $\varphi$. I believe that this is at least close to what those interested in molecular theories of elasticity theory hope to be doing, when they solve what is, essentially, an equivalent of (4.4), for measures of shifts in terms of lattice vectors and, sometimes, $\theta$. For this, the next step is to assume CBR, to introduce the deformation gradient. I introduced the X-ray theory to provide some theory for the rather common cases where the CBR fails to apply, so I avoid this step. Obviously, using $\sigma(\mathbf{e}^a,\theta)$ as the basic potential leads to simpler theory, more like elasticity or thermoelasticity theory, and I expect that it is likely to be adequate for some problems involving phase transitions and twinning. For analyzing phase transitions which might involve bifurcations, etc. associated with shifts, it would seem unwise to use it. Also, configurations for which the underlying descriptions become nonessential still need special handling and, for this, one must deal with shifts, even if they do not appear explicitly in $\sigma$. Similarly, shifts are needed to determine the symmetry of configurations of interest and for determining the nature of Pitteri neighborhoods. Also, if one is considering generalizing the theory to deal with non-equilibrium processes, one should give thought to the shifts, in my opinion. So, one needs to use good judgment, in deciding whether to use the simplified theory.

5. Some Implications of Invariance

In elasticity theory, it is well known that one can deduce restrictions on the Cauchy stress and other tensors, evaluated at a configuration having some symmetry. Similar results can be obtained for the X-ray theory. For this, we consider a Pitteri neighborhood, centered at the configuration of interest, say $\bar{\mathbf{e}}^a$, $\bar p_i^a$, $\theta$, having some nontrivial lattice group. In terms of the variables now used, this will involve some set of transformations of the form

$$\mathbf{Q}\bar{\mathbf{e}}^a = (m^{-1})^a_b\,\bar{\mathbf{e}}^b, \tag{5.1a}$$

$$\bar p_i^b\,m_b^a = \alpha_i^j\,\bar p_j^a + l_i^a, \tag{5.1b}$$
obtained as an equivalent of (2.7), accounting for the different variables used. Rewrite any such transformation as

$$\bar{\mathbf{e}}^a = \mathbf{Q}^T(m^{-1})^a_b\,\bar{\mathbf{e}}^b, \qquad \bar p_i^a = (m^{-1})^a_b\,(\alpha_i^j\,\bar p_j^b + l_i^b), \tag{5.2}$$

to get transformations mapping the set $(\bar{\mathbf{e}}^a, \bar p_i^a)$ to itself. If $(\mathbf{e}^a, p_i^a)$ is in the neighborhood, its lattice group will be some subgroup of this lattice group and, with some exceptions, it will be a proper subgroup. However, $\varphi$ will be assumed to be invariant under this lattice group, at least if improper orthogonal transformations are excluded. Applying the same transformations to it, we get

$$\tilde{\mathbf{e}}^a = \mathbf{Q}^T(m^{-1})^a_b\,\mathbf{e}^b, \qquad \tilde p_i^a = (m^{-1})^a_b\,(\alpha_i^j\,p_j^b + l_i^b), \tag{5.3}$$

where $(\mathbf{e}^a, p_i^a)$ and $(\tilde{\mathbf{e}}^a, \tilde p_i^a)$ might or might not be the same. By a routine calculation,
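$$\frac{\partial\varphi}{\partial\mathbf{e}^a}(\mathbf{e}^c, p_k^c, \theta) = \mathbf{Q}\,(m^{-1})^b_a\,\frac{\partial\varphi}{\partial\mathbf{e}^b}(\tilde{\mathbf{e}}^c, \tilde p_k^c, \theta). \tag{5.4}$$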
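Now, evaluating at the center, and using (4.3) we easily get

$$\mathbf{Q}^T\,\mathbf{t}(\bar{\mathbf{e}}^a, \bar p_i^a, \theta)\,\mathbf{Q} = \mathbf{t}(\bar{\mathbf{e}}^a, \bar p_i^a, \theta), \tag{5.5}$$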
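it being unimportant whether (4.4) is satisfied. For any particular crystal class, the restrictions imposed on $\mathbf{t}$ by this are the same as those listed in the table presented by Truesdell [19, p. 201] for his tensor C/o for example. Similarly, with the derivatives evaluated at the center,

$$\frac{\partial\varphi}{\partial p_j^b} = \alpha_i^j\,(m^{-1})^a_b\,\frac{\partial\varphi}{\partial p_i^a}, \tag{5.6}$$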
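identities which can be used to help solve (4.4). Of course, one can also deduce similar relations for the configurational stress, but I won't record these. While it might be useful to similarly treat higher derivatives, for example to derive linearized equations, I will not pursue this.

6. Examples

Consider a 2-lattice of the hexagonal kind, described in part by

$$\mathbf{e}_1 = a\,\mathbf{i}, \quad \mathbf{e}_2 = \frac{a(-\mathbf{i}+\sqrt{3}\,\mathbf{j})}{2}, \quad \mathbf{e}_3 = c\,\mathbf{k}; \qquad \mathbf{e}^1 = \frac{\mathbf{i}+\mathbf{j}/\sqrt{3}}{a}, \quad \mathbf{e}^2 = \frac{2\,\mathbf{j}}{\sqrt{3}\,a}, \quad \mathbf{e}^3 = \frac{\mathbf{k}}{c}, \tag{6.1}$$

where $a$ and $c$ are positive constants, the vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ being an orthonormal basis, oriented so that

$$\mathbf{i}\cdot\mathbf{j}\wedge\mathbf{k} > 0. \tag{6.2}$$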
Let $\mathbf{R}(\psi,\mathbf{v})$ denote the rotation with axis $\mathbf{v}$, angle $\psi$ in radians. The point group for this skeletal lattice is generated by the orthogonal transformations

$$-\mathbf{1}, \qquad \mathbf{R}(\pi/3,\mathbf{k}), \qquad \mathbf{R}(\pi,\mathbf{i}). \tag{6.3}$$

Here, I will not assume that $\varphi$ is invariant under reflections, in the sense described before, so I discard the first generator. To complete the description, we need to specify some shift

$$\mathbf{p} = p^a\,\mathbf{e}_a, \qquad p^a = \mathbf{p}\cdot\mathbf{e}^a. \tag{6.4}$$

Let us pick this so that $\mathbf{R}(\pi/3,\mathbf{k})$ belongs to the point group for the 2-lattice configuration. A calculation gives

$$\mathbf{R}(\pi/3,\mathbf{k})\,\mathbf{e}_a = m_a^b\,\mathbf{e}_b, \tag{6.5}$$

with

$$m = \|m_a^b\| = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{6.6}$$
(6.7)
where a = 1 for diatomic crystals, a = ± 1 for monatomic crystals. For a diatomic crystal, any admissible shift gives an essential description. For a monatomic crystal, the description is either not admissible or nonessential, when the p" are all either integers or half integers, and such values are to be excluded. For (6.6), this gives ,,'-/
= ap'+l\
l
p = ap2 + l2, p3 = ap3+P.
(6.8)
Also, adding integers to the pa gives equivalent shifts. To within such transformations, we get a = 1 =>• pl = p2 = 0,
/73 is essentially arbitrary,
(6.9)
meaning that p3 can be any number except an integer for diatomic crystals, halfintegers also being excluded for monatomic crystals. Also, one can reduce it modulo integers, getting equivalent shifts. For monatomic crystals, we can also use l
a = -\^p
,
ll+l2 = —^,
2/2-/1 p =^—, 2 2
, P P =2'
(6 10)
-
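This includes the common hexagonal close-packed configurations, with

$$p^1 = \tfrac{2}{3}, \qquad p^2 = \tfrac{1}{3}, \qquad p^3 = \tfrac{1}{2}, \qquad l = (1,1,1), \tag{6.11}$$

and configurations described by

$$p^1 = \tfrac{2}{3}, \qquad p^2 = \tfrac{1}{3}, \qquad p^3 = 0, \qquad l = (1,1,0). \tag{6.12}$$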
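These shifts can be checked against the condition $p^b m_b^a = \alpha p^a + l^a$ with exact rational arithmetic. In the sketch below the function names `residual` and `shift_for` are assumptions introduced for illustration; the matrix is the one obtained for the rotation about k through π/3, and `shift_for` encodes the solution for α = -1 in terms of the integers l.

```python
from fractions import Fraction as F

# m_a^b for the rotation R(pi/3, k); the lower index a labels rows.
M = [[1, 1, 0], [-1, 0, 0], [0, 0, 1]]

def residual(p, alpha, l):
    """Components of p^b m_b^a - (alpha p^a + l^a); all vanish when the
    symmetry condition on the shift holds."""
    return [sum(p[b] * M[b][a] for b in range(3)) - (alpha * p[a] + l[a])
            for a in range(3)]

def shift_for(l):
    """Shift components for alpha = -1 and integers l = (l1, l2, l3)."""
    l1, l2, l3 = l
    return [F(l1 + l2, 3), F(2 * l2 - l1, 3), F(l3, 2)]

hcp = shift_for((1, 1, 1))       # hexagonal close-packed case
planar = shift_for((1, 1, 0))    # case with vanishing third component

res_hcp = residual(hcp, -1, (1, 1, 1))
res_planar = residual(planar, -1, (1, 1, 0))
# For alpha = 1 a shift along e_3 satisfies the condition with l = 0.
res_axial = residual([F(0), F(0), F(1, 4)], 1, (0, 0, 0))
print(hcp, planar, res_hcp, res_planar, res_axial)
```

Using `Fraction` avoids any tolerance question: the residuals are exactly zero when the condition holds.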
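You might like to determine for yourself whether there are other admissible solutions not equivalent to one of these. For any of these possibilities, we can use (5.5) with $\mathbf{Q} = \mathbf{R}(\pi/3,\mathbf{k})$ to determine that, at the configuration considered, the Cauchy stress must be of the form

$$\mathbf{t} = A\,\mathbf{1} + B\,\mathbf{k}\otimes\mathbf{k}, \tag{6.13}$$

where $A$ and $B$ are some numbers, as is the case for elasticity theory, for a crystal configuration with this symmetry. At the configurations considered, set

$$\Phi_a = \frac{\partial\varphi}{\partial p^a}, \tag{6.14}$$

the version of (4.4) used here. For the case (6.9), (5.6) gives

$$\Phi_1 = \Phi_2 = 0, \tag{6.15}$$

with no conditions on $\Phi_3$. For the generator $\mathbf{R}(\pi,\mathbf{i})$ in (6.3), a calculation gives

$$\mathbf{R}(\pi,\mathbf{i})\,\mathbf{e}_a = m_a^b\,\mathbf{e}_b, \tag{6.16}$$

with

$$m = \begin{pmatrix} 1 & 0 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{6.17}$$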
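Two of the symmetry statements above lend themselves to a quick numerical check: the integer matrix associated with the rotation through π about i, and the form of a symmetric tensor that is invariant under the rotation through π/3 about k. The sketch below is only an illustration; the lattice constants, the random test tensor and the device of averaging over the cyclic group generated by the sixfold rotation (which projects onto the invariant tensors) are assumptions, not steps taken in the text.

```python
import numpy as np

def rot_z(angle):
    """Rotation by `angle` (radians) about k."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

a, c = 1.0, 1.6   # assumed lattice constants
E = np.array([[a, 0.0, 0.0],
              [-a / 2.0, np.sqrt(3.0) * a / 2.0, 0.0],
              [0.0, 0.0, c]])          # rows e_1, e_2, e_3 of the hexagonal lattice

# Rotation through pi about i: i -> i, j -> -j, k -> -k.
Q_pi_i = np.diag([1.0, -1.0, -1.0])
M = E @ Q_pi_i.T @ np.linalg.inv(E)    # Q e_a = m_a^b e_b
M_int = np.rint(M).astype(int)

# Special shift p = e_3 / 2: the rotation sends p to p - e_3.
p = 0.5 * E[2]
shift_ok = np.allclose(Q_pi_i @ p, p - E[2])

# Average Q^T t Q over the six powers of R(pi/3, k), starting from an
# arbitrary symmetric tensor; the result is invariant under that rotation.
rng = np.random.default_rng(0)
S = rng.normal(size=(3, 3))
t = S + S.T
Q6 = rot_z(np.pi / 3.0)
powers = [np.linalg.matrix_power(Q6, n) for n in range(6)]
t_avg = sum(P.T @ t @ P for P in powers) / 6.0

k = np.array([0.0, 0.0, 1.0])
A = t_avg[0, 0]
B = t_avg[2, 2] - A
form_ok = np.allclose(t_avg, A * np.eye(3) + B * np.outer(k, k))
invariant_ok = np.allclose(Q6.T @ t_avg @ Q6, t_avg)
print(M_int, shift_ok, form_ok, invariant_ok)
```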
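For a special case of (6.9), this belongs to the lattice group, since

$$p^3 = \tfrac{1}{2} \;\Rightarrow\; \mathbf{R}(\pi,\mathbf{i})\,\mathbf{p} = \mathbf{p} - \mathbf{e}_3, \qquad l = (0,0,-1), \tag{6.18}$$

and, as usual, this also applies to shifts equivalent to this, if they are in the domain of $\varphi$. Then, (5.6) gives

$$\Phi_3 = 0, \tag{6.19}$$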
so, in this case, the third equation is also satisfied, as a consequence of symmetry. For monatomic crystals, (6.18) is excluded, giving a non-essential description. However, for admissible values of $p^3$, the indicated rotation transforms $p^3$ to $-p^3$, satisfying (6.7) with $\alpha = -1$, $l = 0$. Now, there is a judgment by Landau and Lifshitz [20, Section 130] that "Indeed, it is highly improbable that the atoms of a crystal belonging to its Bravais lattice should be distributed more symmetrically than the symmetry of the crystal requires." It is not very difficult to think of ways of interpreting this that are not compatible with experience. So, workers employ some ground rules to avoid such contradictions. I do not know of a place where all of these are clearly described, and I do not understand this well enough to present such a list. However, with the ideas that have been presented, it is fairly easy to explore particular configurations and decide whether they might reasonably be regarded as unlikely.

First, let us consider the examples of diatomic crystals considered above. For the case $p^3 = 1/2$, the skeletal lattice and 2-lattice share the same point group, but they do not in the cases where $0 < p^3 < 1/2$. So, possibly, the latter should be regarded as unlikely. Thus, we should determine whether there is some important difference between the two kinds of cases. In discussions I have seen of such matters, I would guess that workers are considering unstressed crystals or, perhaps, crystals subjected to a fixed hydrostatic pressure, although this is not always made clear. To be on the safe side, I assume they are unstressed. To better assess this, let us try reasoning sometimes used by physicists in such situations, involving unknown functions, here $\varphi$. We have three adjustable constants, $p^3$ along with $a$ and $c$ in (6.1). To have such unstressed equilibria, we need to have three functions of these vanish, as indicated by

$$\mathbf{t} = \mathbf{0} \;\Leftrightarrow\; A = B = 0 \text{ in (6.13)}, \qquad \Phi_3 = 0. \tag{6.20}$$

In such cases, one idea used is that N equations in N unknowns might well be solvable, and equations implied by symmetry are not included in the count, so we have the case N = 3. It would be considered unlikely that we could satisfy any additional equations not implied by these. Also, it is considered to be fair to assume that various inequalities are satisfied, for example those involved in second derivative tests for $\varphi$ to have at least a relative minimum, at the extremal considered. From this view, it is then plausible that such equilibria might be observed in some crystals, there being no important difference between the two kinds of cases mentioned above. Now, there are various real crystals such that the point group for these is a proper subgroup of that for the skeletal lattice. So, naturally, one does not want to conclude that these are highly unlikely. The kind of reasoning illustrated is one way of distinguishing the likely from the unlikely.

Now, let us consider a configuration which would be considered as unlikely. The skeletal lattice is primitive cubic, with lattice vectors described by

$$\mathbf{e}_1 = a\,\mathbf{i}, \qquad \mathbf{e}_2 = a\,\mathbf{j}, \qquad \mathbf{e}_3 = a\,\mathbf{k}, \tag{6.21}$$
where the vectors $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ are as before. Again, it is a 2-lattice, with a shift of the form

$$\mathbf{p} = b\,\mathbf{k}. \tag{6.22}$$

It does not really matter whether the crystal is monatomic or diatomic, except for determining what values of $b$ must be excluded. For any choice, some rotations in the cubic point group are not in the 2-lattice point group, for example, $\mathbf{R}(\pi/2,\mathbf{i})$. By calculations much like those done before, one can verify that two equations are implied by symmetry,

$$\Phi_1 = \Phi_2 = 0, \tag{6.23}$$

and that (6.13) also applies to these. Here, we only have the two adjustable constants $a$ and $b$, but need to satisfy three equations,

$$A = B = 0 \text{ in (6.13)} \quad\text{and}\quad \Phi_3 = 0, \tag{6.24}$$

and it is not considered likely that this can be done. Actually, workers tend to hedge a bit in such cases, granting that they might be satisfied, somewhat accidentally, at a particular temperature, but not if this is changed slightly. It is a simple matter to change the skeletal lattice to eliminate the difficulty: make it tetragonal, with $|\mathbf{e}_1| = |\mathbf{e}_2| \ne |\mathbf{e}_3|$ to introduce a third adjustable constant, without changing (6.23) or (6.24). Here, I follow the common practice of taking it as obvious that the stated change is what is to be expected, generically, in free thermal expansion. This part of the X-ray theory has not yet been developed. However, it is much like that for thermoelasticity theory, except that it allows for possible bifurcations involving shifts, which could be of interest in analyses of some phase transformations. For the subject being considered, it is likely to be understood that such phenomena are excluded. In any event, with the modification in lattice vectors, the configuration is no longer unlikely.

There is a variation on this problem which seems to me to be interesting. Assume that (6.21) and (6.22) hold, but replace (6.24) by

$$B = \Phi_3 = 0, \tag{6.25}$$

it being likely that these and the inequality $A < 0$ can be satisfied. The stress then reduces to an hydrostatic pressure, $p = -A$. If we consider varying $\theta$ a bit, it is likely that this will generate a function $p(\theta)$. So, by controlling $p$ and $\theta$ to stay on this curve, we could maintain the cubic skeletal lattice for $\theta$ in some interval. In observing critical points in fluids, workers do control $p$ and $\theta$ on strategically selected paths, so there are precedents for doing what is suggested. Probably, one could find various other configurations with this feature. Not having seen a situation of this kind discussed by experts, I am not sure how they would relate this to the statement quoted above. Independent of this, I find it interesting that there are such possibilities. In any event, my purpose is more to present some simple applications
of symmetry conditions, in analyses of a kind not considered in thermoelasticity theory, not to do a critique of that statement. For the cases described by (6.10), (5.6) yields ®a = 0,
(6.26)
so these equilibrium equations are all satisfied, as a consequence of symmetry. Here, it is easy to verify that any of the possibilities listed in (6.10) admits a lattice group element corresponding to (6.17), although we have not needed to use this. Generally, this depends on the choice of $l^a$, as described by (6.7). Generally, two configurations are regarded as having the same symmetry provided that it is possible to choose lattice vectors and shifts so that their lattice groups coincide. For example, replacing the 1/2 in (6.18) by any other half-integer gives a configuration with the same symmetry. Now, any monatomic 2-lattice is centro-symmetric, in the sense that the entries
$$Q = -1, \qquad m = -1, \qquad \alpha = -1, \qquad l = 0 \tag{6.27}$$
are included in elements of the point and lattice groups for such configurations. By combining these as indicated by
$$e_a = Q m_a^b e_b, \qquad Qp = \alpha p = -p, \tag{6.28}$$
we can comply with (4.5). For a theory of such crystals, the assumption that $\psi$ is invariant under reflections can then be interpreted as
$$\psi(e_a, p, \theta) = \psi(e_a, -p, \theta). \tag{6.29}$$
For these special cases, I think that it is rather reasonable to assume this. However, most diatomic 2-lattices are not centro-symmetric, and assuming that (6.29) applies to them comes close to asserting that there is no theoretical distinction between monatomic and diatomic 2-lattices. Certainly, I do not feel comfortable making this assumption without thinking very hard about the physical implications and pondering available experimental information. The assumption could be good for some but not all such crystals. Before, we did use something like (6.29) for diatomic crystals, but only for special configurations, not for all in the domain of $\psi$. Certainly, $\psi$ can satisfy (6.28), for particular values of its arguments, when the function does not enjoy this invariance. These rather simple examples serve to illustrate some uses of ideas of symmetry in connection with assessing Cauchy stresses and those equilibrium equations, in situations not considered in elasticity theory. If you are familiar with applications of other kinds of symmetry considerations to elasticity theory, there is a good chance of adapting them. One should bear in mind that, like elasticity theory, the X-ray theory does not cover various effects encountered in real crystals, for
example piezoelectric or piezo-magnetic effects, and special kinds of symmetries are associated with these.

7. Reference Configurations

In elasticity theory, workers associate a reference configuration with a material body. However, sometimes tacitly, we often use the notion of a reference configuration for an homogeneous material, filling all of space. Subregions of it serve as reference configurations for bodies made of this material. With the X-ray theory, it is not so obvious how to formulate a good analog of the former, in general, because of conceptual difficulties in giving a precise description of what should be meant by a material body. Actually, I will note some difficulties of this kind, encountered in elasticity theory. However, we can do something with the latter. Often, some particular configuration does play an important role in theoretical considerations. For example, when the domain of $\psi$ is taken to be a Pitteri neighborhood, its center is such a configuration, and this covers likely kinds of linearized equations. Now, I [2] interpreted the $\chi^a$ associated with (3.5) as a special choice of coordinates of points in what I named crystallographic space, here denoted by C. The idea is that this space should serve for all crystals, not be different for the different kinds. Obviously, adding constants to the $\chi^a$ does not affect $\nabla\chi^a$. I interpreted this as inducing the translation group
$$\chi^a \to \chi^a + \text{const}, \tag{7.1}$$
the motivation for considering C to be an affine space. Similarly, (2.2) induces
$$\chi^a \to (m^{-1})^a_b \chi^b, \tag{7.2}$$
as an action of GL(3, Z) on C, so one gets the group generated by (7.1) and (7.2) acting on C. I accept the classical view that this group defines the geometry of C. In particular, no second order tensor which could serve as a metric tensor is invariant under this group, so the notions of length and angle are not part of this geometry, although one can introduce invariant volume measures. As a manifold, this space is of a simple kind, an affine space which can be covered by one chart, etc. One could introduce more general kinds of coordinates, for example, those permitted by regarding C as an analytic manifold, and this might be useful for problem solving. However, for present purposes, it is simpler to use the special coordinates. For similar reasons, I shall use rectangular Cartesian coordinates for points in Euclidean 3-space, denoted by E. It is easy to reformulate the equations to use general coordinates. Now consider fields $e^a(x)$, $x \in \Omega \subset E$. By integrating (3.5), we get functions
$$\chi^a(x) \in \hat\Omega \subset C, \tag{7.3}$$
describing a mapping of $\Omega$ onto $\hat\Omega$, assuming constants of integration are somehow fixed: if not we get the equivalence class generated by the translation group. With
reservations very similar to those associated with deformations in elasticity theory, the mapping is invertible, with a differentiable inverse. Differentiating the inverse and using familiar properties of inverses, we get
$$e_a = \frac{\partial x}{\partial \chi^a}. \tag{7.4}$$
Now, almost all analyses in crystal elasticity theory assume that the relevant material symmetry group is a finite group. As I interpret this, it is tantamount to assuming that the domains of constitutive functions are Pitteri neighborhoods. So, for the X-ray theory, I shall consider such neighborhoods. Consider the center as representing a reference configuration for a material, with constant lattice vectors and shifts. Denote these and related coordinates in E by capital letters. Then,
$$E^a = \nabla\chi^a = \frac{\partial \chi^a}{\partial X} \;\Rightarrow\; \chi^a - E^a \cdot X = A^a = \text{const}, \tag{7.5}$$
giving a fixed map of E onto C, assuming that the $A^a$ are somehow fixed. Inverting this, we get
$$X = (\chi^a - A^a) E_a. \tag{7.6}$$
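As a numerical illustration of the fixed map between E and C given by (7.5) and (7.6), the following Python sketch may help. It is not part of the original analysis: the lattice vectors `E`, the constants `A`, the point `X` and all function names are hypothetical choices, and exact rational arithmetic is used only so that the round trip can be checked without rounding error.

```python
from fractions import Fraction as Fr

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dual_basis(E):
    """Reciprocal vectors E^a, defined by E^a . E_b = delta^a_b."""
    vol = dot(E[0], cross(E[1], E[2]))
    return [[c / vol for c in cross(E[(a + 1) % 3], E[(a + 2) % 3])]
            for a in range(3)]

def to_crystallographic(X, E_dual, A):
    """Coordinates chi^a = E^a . X + A^a, as in (7.5)."""
    return [dot(E_dual[a], X) + A[a] for a in range(3)]

def to_reference(chi, E, A):
    """Position X = (chi^a - A^a) E_a, as in (7.6)."""
    return [sum((chi[a] - A[a]) * E[a][i] for a in range(3)) for i in range(3)]

# Hypothetical reference lattice vectors E_a (rows) and constants A^a.
E = [[Fr(2), Fr(0), Fr(0)],
     [Fr(1), Fr(3), Fr(0)],
     [Fr(0), Fr(1), Fr(1, 2)]]
A = [Fr(1), Fr(-2), Fr(5)]
E_dual = dual_basis(E)

X = [Fr(7, 3), Fr(-1), Fr(4)]
chi = to_crystallographic(X, E_dual, A)
X_back = to_reference(chi, E, A)  # inverting (7.5) recovers X

# Translating X by the lattice vector E_1 changes only chi^1, by exactly 1.
X_shifted = [X[i] + E[1][i] for i in range(3)]
chi_shifted = to_crystallographic(X_shifted, E_dual, A)
step = [chi_shifted[a] - chi[a] for a in range(3)]
```

The last check reflects the role of the $\chi^a$ as crystallographic coordinates: a translation by a lattice vector of the reference changes them by integers, independently of the metric properties of the $E_a$.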
Now, compose the map (7.3) with this, as indicated by
$$X(x) = [\chi^a(x) - A^a] E_a \in \tilde\Omega \subset E, \tag{7.7}$$
mapping $\Omega$ to another region $\tilde\Omega$ in E. Formally, this looks like the inverse of a deformation but, conceptually, it is not necessarily related to any deformation. As is obvious from (7.4) or (7.6),
$$E_a = \frac{\partial X}{\partial \chi^a}, \tag{7.8}$$
so, in obvious notation,
$$\frac{\partial x}{\partial X} = \frac{\partial x}{\partial \chi^a} \otimes \frac{\partial \chi^a}{\partial X} \tag{7.9}$$
or, equivalently, what looks like CBR,
$$F_X = \frac{\partial x}{\partial X} \;\Rightarrow\; e_a = F_X E_a, \tag{7.10}$$
$F_X$ being the analog in the X-ray theory of the deformation gradient. Whenever the mapping inverse to (7.7) coincides with the macroscopic deformation, (7.10) describes the CBR. In the X-ray theory, this implies that any line element gets mapped to one with the same crystallographic indices. Generally, this does not apply to material line elements, when CBR fails to apply. If one believes that, for some crystal,
the rule applies to some limited set of deformations of the reference, one can use this to correlate some predictions of the X-ray and elasticity theories. For example, if one trusts this for phenomena covered by linear thermoelasticity theory, one can use measurements of moduli to determine corresponding moduli in the X-ray theory. As is noted by Zanzotto [1], the rule does not apply to thermal expansion of zinc, so the assumption is not necessarily safe, although other examples of this kind seem not to be known. Here, there is experimental evidence that (3.4) is violated, probably because of variations in the distribution of vacancies. For this reason, the X-ray theory is here not on firm ground, although one could construct constitutive equations to fit this particular phenomenon to the data. Actually, thermoelasticity theory seems not to suffer from this defect, because the larger scale measurements of deformation include contributions from the vacancies. However, one then cannot use it to calculate changes in lattice vectors. To express
(7.11)
where
$$C_X = F_X^T F_X \tag{7.12}$$
is the analog of the Cauchy-Green tensor commonly used in thermoelasticity theory. Considered as restricted to a Pitteri neighborhood, $\psi$ will be invariant under the lattice group for the center, with elements of the form
$$Q E_a = m_a^b E_b, \qquad P_i^c = (m^{-1})^c_d (\alpha_i^j P_j^d + l_i^d), \tag{7.13}$$
with the usual understanding about preserving the orientation of lattice vectors. With the understanding that the $E_a$ are fixed, we have
$$\psi = \psi(C_X, p_i^c, \theta) = \psi(Q^T C_X Q, (m^{-1})^c_d (\alpha_i^j p_j^d + l_i^d), \theta). \tag{7.14}$$
Now use (4.4) or (4.13) to eliminate $p_i^c$. If this goes smoothly, as is tacitly assumed in thermoelasticity theory, one gets a thermodynamic potential which is invariant under the point group for the n-lattice: Obviously, one gets thermoelasticity theory by assuming the CBR, and formally similar equations if it does not. Mathematically, solving (4.4) might not go so smoothly, and this might be relevant to understanding some phase transitions. In thermoelasticity theory, it is commonly assumed that a body is represented by a fixed reference configuration, although experts know of exceptions to this rule. For example, if one considers the creation of an edge dislocation, the effect is to remove a thin slice from the reference configuration, perhaps separating it into two disjoint regions. Or consider the study of incoherent phase transitions by Cermelli and Gurtin [21]. Before the transition occurred, the body occupied some region, in a configuration which could be taken as a reference configuration for the
body, say $\Omega_0$. After transition it again occupies some region $\Omega$ in space. Trace this back to the reference configuration for the material in the manner described above and, for the kinds of situations considered by these authors, one will get some set, generally different from $\Omega_0$. They realize this, as is clear from their Figure 1, although their rules for determining the set differ from mine. Such phenomena do involve some kinds of failures of the CBR, rather different from those associated with deformation twinning. Generally, for any kind of failure of this rule, I think it prudent to at least consider the possibility that the assumption of a fixed reference configuration for a body might not be good. For deformation twins, when CBR fails to apply, the shear describing deformation differs from that used in the X-ray theory, making it seem likely that this will be associated with a change in the reference configuration. I have analyzed this for type I, type II and compound twins, all of the types for which CBR has been observed to fail. The reference configuration does then change and, given data for a twin from twinning tables, a rather simple calculation determines how it changes. It would take more ink than I wish to expend here to properly explain this and its ramifications, so I will cover it in a separate paper. Suppose that we put one body in two configurations occupying two regions $\Omega_1$ and $\Omega_2$ and are able to determine two corresponding regions $\tilde\Omega_1$ and $\tilde\Omega_2$ in the reference configuration for a material. The mass of a body should be fixed and the mass density of the reference is constant, so the volume of the latter regions should be the same, even if their shapes differ. The same would apply to cases where one had two different bodies of the same material and the same mass, so this is not enough to distinguish one body from another. Why is it important to make such a distinction?
I believe that it is because we have some theory for helping us describe what we can do to a body, to induce it to change its size and shape, in most cases to which thermoelasticity applies. Certainly, it would be very helpful to have some theory of this kind to supplement the X-ray theory, but we do not have it: one needs a more general theory, covering deformation and the crystallography. It would be nice to have realistic examples of twins in some crystal for which the CBR fails to apply and the twins can be included in some Pitteri neighborhood, but I have not yet found one. If a crystal occurs in a configuration of maximal symmetry, twins in it cannot be included in such a neighborhood. This excludes the examples Zanzotto [1] found in hexagonal close-packed crystals. The modes he found in (orthorhombic) α-uranium are not excluded for this reason. I [3, 5] explored two modes for this material. For one, CBR fails, but the twins cannot be included in any Pitteri neighborhood. For the other, they can be so included, but CBR applies. In unpublished work, I examined the remaining modes, including the {021} mode mentioned by Crocker [22], but not by various other writers, including Zanzotto [1]. Perhaps it is not widely accepted. For all of these, CBR fails and none of the twins can be included in one of those neighborhoods. There are various kinds of crystals for which CBR fails to apply to some observed twins, so examples of the kind indicated might well exist.
Acknowledgment

I thank Giovanni Zanzotto for some very helpful suggestions, and Marion Ericksen for her help with the typing.

References

1. G. Zanzotto, On the material symmetry groups of elastic crystals and the Born rule. Arch. Rational Mech. Anal. 121 (1992) 1-36.
2. J.L. Ericksen, Equilibrium theory for X-ray observations. Arch. Rational Mech. Anal. 139 (1997) 181-200.
3. J.L. Ericksen, On groups occurring in the theory of crystal multi-lattices. To appear in Arch. Rational Mech. Anal. 148 (1999) 145-178.
4. J.M. Ball and R.D. James, Proposed experimental tests of a theory of fine microstructures and the two-well problem. Philos. Trans. Roy. Soc. London Ser. A 338 (1992) 389-450.
5. J.L. Ericksen, A minimization problem in the X-ray theory. To appear in Contributions to Continuum Theories, Weierstrass Institute for Applied Analysis and Stochastics (1999).
6. M. Pitteri, On (ν + 1)-lattices. J. Elasticity 15 (1985) 3-25.
7. M. Pitteri and G. Zanzotto, Continuum Models for Phase Transitions and Twinning in Crystals. To be published by CRC/Chapman and Hall, London (1999).
8. S. Adeleke, On the classification of monoatomic crystal multilattices. Submitted to Math. Mech. Solids (1999).
9. G. Fadda and G. Zanzotto, The arithmetic symmetry of monoatomic planar 2-lattices. To appear in Acta Crystallogr. (1999).
10. G. Parry, Low dimensional lattice groups for the continuum mechanics of phase transitions in crystals. Arch. Rational Mech. Anal. 145 (1998) 1-22.
11. M. Pitteri and G. Zanzotto, Beyond space groups: The arithmetic symmetry of deformable multi-lattices. Acta Crystallogr. A 54 (1998) 359-373.
12. M. Pitteri, Geometry and symmetry of multi-lattices. Internat. J. Plasticity 14 (1998) 139-147.
13. J.L. Ericksen, On non-essential descriptions of crystal multi-lattices. J. Math. Mech. Solids 3 (1998) 363-392.
14. P.J. Flory, Thermodynamic relations for high elastic polymers. Trans. Faraday Soc. 57 (1961) 829-838.
15. J.L. Ericksen, Some simple cases of the Gibbs phenomenon for thermoelastic solids. J. Thermal Stresses 4 (1981) 13-30.
16. M. Chipot and D. Kinderlehrer, Equilibrium configurations of crystals. Arch. Rational Mech. Anal. 103 (1988) 237-277.
17. I. Fonseca and G. Parry, Equilibrium configurations of defective crystals. Arch. Rational Mech. Anal. 120 (1992) 245-283.
18. D. Kinderlehrer, Remarks about equilibrium configurations of crystals. In: J.M. Ball (ed.), Material Instabilities in Continuum Mechanics, Oxford Univ. Press (1988) pp. 217-241.
19. C. Truesdell, A First Course in Rational Continuum Mechanics, Vol. I. Academic Press, New York (1997).
20. L.D. Landau and E. Lifshitz, Statistical Physics. Pergamon, London/Paris and Addison-Wesley, Reading, MA (1958).
21. P. Cermelli and M.E. Gurtin, On the kinematics of incoherent phase transitions. Acta Metallurgica et Materialia 42 (1994) 3349-3359.
22. A.G. Crocker, The crystallography of deformation twinning in α-uranium. J. Nucl. Mater. 16 (1965) 306-326.
23. G. Zanzotto, The Cauchy-Born hypothesis, nonlinear elasticity and mechanical twinning in crystals. Acta Crystallogr. A 52 (1996) 839-849.
Arch. Rational Mech. Anal. 164 (2002) 103-131
Digital Object Identifier (DOI) 10.1007/s00205-002-0213-x
© Springer-Verlag (2002)
On Pitteri Neighborhoods Centered at Hexagonal Close-Packed Configurations J. L. ERICKSEN Communicated by D. KINDERLEHRER
Abstract

I study the structure of Pitteri neighborhoods centered at hexagonal close-packed configurations of monatomic crystals using my previously introduced X-ray theory. The theory is generalized to cover some effects of magnetism. I give an alternative formulation of X-ray theory that makes comparison to conventional continuum theories more transparent. I study the influence of lattice symmetry on the constitutive relations of nonlinear thermoelasticity and thermomagnetoelasticity, consistent with the existence of a Pitteri neighborhood.
1. Introduction This is a study of the structure of Pitteri neighborhoods [1] centered at hexagonal close-packed configurations of monatomic crystals. It is based on my [2] X-ray theory, generalized to cover some effects of magnetism and I seek to improve understanding of such constitutive equations. I reorganize this by introducing a linear transformation L which can be interpreted as the usual deformation gradient F whenever the Cauchy-Born rule is accepted, hereafter abbreviated as CBR. This makes it somewhat easier to see the similarities and differences between the X-ray theory and more familiar continuum theories, primarily thermoelasticity theory, and to adapt some results from the latter. For monatomic 2-lattices, a center of this kind is one of maximal symmetry. Given the structure of such neighborhoods it is rather easy to determine the structure of neighborhoods centered at less symmetric configurations contained in the neighborhood. FADDA & ZANZOTTO [3] have described how these different symmetries are related, but I will add some information on how this affects constitutive equations, and on how magnetization can influence crystal symmetry. Certainly, I would like the X-ray theory better if it could deal with deformation in the numerous cases where CBR fails to apply. However, numerous workers
interested in crystal plasticity, deformation twinning etc. have wrestled hard with this kind of issue, without finding very satisfactory ways of dealing with it, and I have no helpful new ideas about it.

2. X-ray theory

A monatomic crystal 2-lattice consists of two identical lattices with three linearly independent lattice vectors $e_a$. One is translated relative to the other, the translation being described by a shift vector p representing the position of an atom in one of the lattices relative to some atom in the other. The possibility that two atoms occupy the same position is excluded, leading to the restriction
$$p \neq n^a e_a, \qquad n^a \in \mathbb{Z}. \tag{2.1}$$
For any particular configuration, there are infinitely many ways of choosing these vectors. If $(e_a, p)$ is one possibility, $(\bar e_a, \bar p)$ is another, provided that
$$\bar e_a = m_a^b e_b, \qquad m = \|m_a^b\| \in GL(3, \mathbb{Z}) \tag{2.2}$$
and
$$\bar p = \alpha p + l^a e_a, \qquad \alpha = \pm 1, \qquad l^a \in \mathbb{Z}. \tag{2.3}$$
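The change of description in (2.2) and (2.3) can be tried out on a small example. The sketch below is only an illustration under assumed data: the lattice vectors `e`, the shift `p`, the integer matrix `m`, the sign `alpha` and the integers `l` are hypothetical, and the function names are not from the original. It checks that the new lattice vectors generate the same lattice and that the new shift still satisfies the restriction (2.1).

```python
from fractions import Fraction as Fr

def det3(a):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def components(e, v):
    """Coefficients c^a in v = c^a e_a, by Cramer's rule (rows of e are e_a)."""
    d = det3(e)
    return [det3([v if r == a else e[r] for r in range(3)]) / d
            for a in range(3)]

def is_lattice_vector(e, v):
    """True if v = n^a e_a with integer n^a, the case excluded by (2.1)."""
    return all(c.denominator == 1 for c in components(e, v))

def change_description(e, p, m, alpha, l):
    """Apply (2.2)-(2.3); m[a][b] holds m_a^b (lower index labels rows)."""
    assert det3(m) in (1, -1), "m must belong to GL(3, Z)"
    assert alpha in (1, -1)
    e_bar = [[sum(m[a][b] * e[b][i] for b in range(3)) for i in range(3)]
             for a in range(3)]
    p_bar = [alpha * p[i] + sum(l[a] * e[a][i] for a in range(3))
             for i in range(3)]
    return e_bar, p_bar

# Hypothetical configuration: lattice vectors e_a (rows) and shift p.
e = [[Fr(1), Fr(0), Fr(0)],
     [Fr(0), Fr(1), Fr(0)],
     [Fr(0), Fr(0), Fr(2)]]
p = [Fr(1, 2), Fr(1, 3), Fr(0)]

m = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]]
alpha, l = -1, [1, 0, 2]

e_bar, p_bar = change_description(e, p, m, alpha, l)
same_lattice = (all(is_lattice_vector(e, v) for v in e_bar)
                and all(is_lattice_vector(e_bar, v) for v in e))
```

Exact fractions are used so that the integrality tests are not affected by rounding.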
For elements of matrices, the lower index labels rows, unless there is a statement to the contrary. In the X-ray theory, which is an equilibrium theory, these are considered to be vector fields, functions of position in space. Originally, I [2] assumed that, for a given material, there is a constitutive equation for
$$\varphi = \varphi(e_a, p, \theta), \tag{2.4}$$
where $\theta$ denotes absolute temperature. I assumed that there are scalar functions $\chi^a$ such that
$$e^a = \nabla\chi^a, \tag{2.5}$$
$e^a$ denoting the reciprocal lattice vectors (dual basis). Physically, this excludes continuous distributions of dislocations. The idea that mass is proportional to the mass of a unit cell is formalized by the assumption that
$$\rho = m/|e_1 \cdot e_2 \wedge e_3| = m\,|e^1 \cdot e^2 \wedge e^3|, \tag{2.6}$$
with $\rho$ the mass density, m being a positive constant. Alternative formulations can be more convenient, depending on the kind of problems being considered. Given (2.5), there is some reason to use, as an argument in $\varphi$, $e^a$ rather than $e_a$. In my [4] analysis of growth twins in quartz, I found this advantageous. For these, deformation is not relevant, except for studying how they are affected by changes of pressure, temperature, etc. after they have grown. Thus, thermoelasticity theory is useless for analyzing the as-grown configurations, for
example. However, in elementary analyses of deformation or transformation twins, $e_a$ and relevant shifts are assumed to be piecewise constant, and this is also the case for phase transitions. For these, it really does not matter whether $e_a$ or $e^a$ are used, except that one, usually the former, can better fit common usage. I [5,6] have used (2.4) in such analyses, to better match practices, although not very much use of constitutive equations is made in such studies. Had I dealt with more complicated inhomogeneous configurations, I would have replaced $e_a$ by $e^a$, in cases where I did not expect CBR to apply, generally speaking. However, I did not do this in comparing [7] the X-ray theory with thermoelasticity theory because, in the latter, spatial descriptions are rarely used. When CBR does apply, it might seem to be advantageous to follow the usual practice in elasticity theory, of using it to introduce material coordinates or coordinate-free equivalents. I felt rather comfortable with the assumption in my [8] study of the α-β phase transition in quartz, which uses a 3-lattice model, introducing a reformulation of equations for this. However, for reasons to be discussed later, I will use spatial descriptions here. In crystals, it is fairly common for phase transitions to be associated with bifurcations in variables not accounted for by thermoelasticity theory, making this theory inadequate for analyzing them. It is clear that this is the case for the α-β transition in quartz. The X-ray theory is a bit more flexible, with its accounting for shifts, so I tried using this. The results seem sensible, but these transitions involve some very complicated phenomena which have long puzzled workers and remain to be understood. There is the idea that values of
$$L(e_a, p) = \{\{m, \alpha, l\} \mid m_a^b e_b = Q e_a,\ \alpha p + l^a e_a = Q p,\ Q \in O(3)\}, \tag{2.7}$$
where m, $\alpha$ and l are as described in (2.2) and (2.3). Here, m and l are the matrices with components $m_a^b$ and $l^a$, respectively. This is associated with the more familiar point group
$$P(e_a, p) = \{Q \in O(3) \mid Q e_a = m_a^b e_b,\ Q p = \alpha p + l^a e_a\}, \tag{2.8}$$
where the values of m, $\alpha$ and l are those that occur in (2.7). While space groups and site-symmetry groups are often considered in such discussions, I will not make any
use of them. There are exceptional cases where lattice groups are not unique. These involve non-essential descriptions, which describe configurations also describable as Bravais lattices (1-lattices). PITTERI [9] and I [10] have discussed these in somewhat different ways. They occur when the numbers $p \cdot e^a$ are all integers or half integers, the possibility that they are all integers being excluded by (2.1). We shall not encounter them here. Two configurations are regarded as having the same symmetry if it is possible to choose lattice vectors and shifts so that their lattice groups coincide. For monatomic 2-lattices, FADDA & ZANZOTTO [3,11] have described all of the symmetry types, there being 29, and kinematical possibilities for symmetry breaking. In trying to apply these ideas to constitutive equations, we encounter some subtleties. As defined by (2.7), these lattice groups always contain the element
$$m = -1, \qquad \alpha = -1, \qquad l = 0 \;\Rightarrow\; Q = -1, \tag{2.9}$$
which reverses the orientation of lattice vectors. As usual, the domain of
$$L^+(e_a, p) = \{\{m, \alpha, l\} \mid m_a^b e_b = R e_a,\ \alpha p + l^a e_a = R p,\ R \in SO(3)\} \tag{2.10}$$
and similarly restrict the point group to $P^+(e_a, p)$, corresponding to these elements. Note that
$$m_a^b e_b = R e_a \;\Leftrightarrow\; (m^{-1})^b_a e^a = R e^b. \tag{2.11}$$
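The equivalence (2.11) can be verified on a concrete case. The following Python sketch is an illustration only, under assumed data: it takes a hypothetical set of lattice vectors (of face-centered cubic type) and a quarter turn `R` about the third coordinate axis, computes the integer matrix `m` from $m_a^b e_b = R e_a$, and then checks the statement for the reciprocal vectors. All names are hypothetical.

```python
from fractions import Fraction as Fr

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def det3(a):
    return dot(a[0], cross(a[1], a[2]))

def dual_basis(e):
    """Reciprocal vectors e^a with e^a . e_b = delta^a_b."""
    vol = det3(e)
    return [[c / vol for c in cross(e[(a + 1) % 3], e[(a + 2) % 3])]
            for a in range(3)]

def inv3(a):
    """Inverse of a 3x3 matrix via cofactors (cyclic-index form)."""
    d = det3(a)
    return [[Fr(a[(j + 1) % 3][(i + 1) % 3] * a[(j + 2) % 3][(i + 2) % 3]
                - a[(j + 1) % 3][(i + 2) % 3] * a[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def apply(R, v):
    return [dot(row, v) for row in R]

# Hypothetical lattice vectors e_a (rows) and a rotation by 90 degrees.
e = [[Fr(0), Fr(1), Fr(1)],
     [Fr(1), Fr(0), Fr(1)],
     [Fr(1), Fr(1), Fr(0)]]
R = [[0, -1, 0],
     [1, 0, 0],
     [0, 0, 1]]
e_dual = dual_basis(e)

# Row a of m holds the components m_a^b of R e_a relative to the e_b.
m = [[dot(e_dual[b], apply(R, e[a])) for b in range(3)] for a in range(3)]
m_inv = inv3(m)

# Left and right sides of the second relation in (2.11), for b = 0, 1, 2.
lhs = [[sum(m_inv[a][b] * e_dual[a][i] for a in range(3)) for i in range(3)]
       for b in range(3)]
rhs = [apply(R, e_dual[b]) for b in range(3)]
```

Here `m` comes out with integer entries and determinant 1, as it must for an element of the restricted group, and `lhs` agrees with `rhs`.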
It is easy to check that transforming $e_a$ and p by any orthogonal transformation does not change $L^+(e_a, p)$ but does change $P^+(e_a, p)$ to a conjugate group, obtained using the orthogonal transformation as a similarity transformation. Roughly, (2.10) cuts in half the neighborhoods considered by Pitteri: his arguments can be revised to allow for this. Also, I want to include some effects of magnetism, by replacing (2.4) by
$$\varphi = \varphi(e^a, p, \hat m, \theta), \tag{2.12}$$
where $\hat m$ is the usual magnetization, an axial vector, the domain being taken as one of the Pitteri neighborhoods. In describing this, I add a caret to distinguish it from the matrices of integers. To better fit (2.5), I have replaced $e_a$ by $e^a$. Of course, the domains of these functions must be interpreted to include the dependence on $\hat m$, but using the neighborhoods does not limit this. These functions are considered to be invariant under finite rotations,
$$\varphi(R e^a, R p, R \hat m, \theta) = \varphi(e^a, p, \hat m, \theta), \qquad R \in SO(3), \tag{2.13}$$
there being no problem in having these transformations map a neighborhood to itself. For any fixed configuration, including a fixed value of $\hat m$, we again have the equivalence class of descriptions described by (2.2) and (2.3), since these still represent different ways of describing the same 2-lattices. If the center is described by the values $E_a$ and P of the lattice vectors and shift, this motivates the assumption that
$$\varphi((m^{-1})^b_a e^a, \alpha p + l^a e_a, \hat m, \theta) = \varphi(e^b, p, \hat m, \theta), \qquad \{m, \alpha, l\} \in L^+(E_a, P). \tag{2.14}$$
Recall that this covers the symmetry of all configurations in the neighborhood, since their lattice groups are subgroups of that for the center, by the properties of the neighborhood. For magnetism, another kind of material symmetry is often assumed, which I label as TR,
$$\mathrm{TR}: \quad \varphi(e^a, p, -\hat m, \theta) = \varphi(e^a, p, \hat m, \theta), \tag{2.15}$$
the idea being that this transformation represents a time reversal, although a different transformation of $\hat m$ under time reversals also is frequently mentioned in the literature. For a long time, workers assumed that invariance under such transformations always applies to equilibrium theories. Then, workers began to find crystals exhibiting the phenomenon of piezomagnetism, which is excluded by the assumption. So, while the assumption seems to apply to many materials, it fails to apply to all. I will consider both possibilities. There is one possibility for considering invariance and reflections, by combining Q = -1 and m = -1 to get the orientation-preserving transformation
$$Q m_a^b e_b = e_a \;\Leftrightarrow\; Q (m^{-1})^b_a e^a = e^b. \tag{2.16}$$
Applying Q = -1 to p reverses it, but $\hat m$ is not changed, since it is an axial vector. By interpreting it this way, we get the assumption
$$\varphi(e^a, -p, \hat m, \theta) = \varphi(e^a, p, \hat m, \theta). \tag{2.17}$$
While this seems to be like (2.15), there are differences. Often, $\hat m = 0$ and small perturbations of this need to be included in the domain, but (2.1) forbids having p = 0. Of course, if we accept (2.17), we must require that, whenever p is in the domain, so is -p. If we accept (2.16), why not also combine Q = -1 and $\alpha = -1$ to get transformations not changing any of the arguments? As a personal choice, I have decided not to use (2.17) here. I leave it to interested workers to check its implications for themselves. I do not wish to get into a lengthy discussion of the theory of magnetism, which is a complicated subject, since I shall not make much use of it here. I [12] have given my own views on the theories of magnetic and dielectric effects. These are somewhat unconventional, but are in line with some general ideas presented by TRUESDELL & TOUPIN [13, Ch. F]. For general references on electromagnetism, I prefer this reference and, especially, the work by WHITTAKER [14]. Particularly for ferromagnets, workers consider exchange forces to be important, covering these by allowing $\varphi$ to also depend on $\nabla\hat m$, adding a function quadratic in these variables.
I [12] have discussed this in a general way but, for simplicity, will ignore it here. For ferromagnets, the coverage by BROWN [15,16] is good. I retain one set of equilibrium equations that I [2] introduced,
$$\frac{\partial \varphi}{\partial p} = 0, \tag{2.18}$$
the only difference being that these now involve m. For configurations with some nontrivial symmetry, these are partially satisfied as a consequence of symmetry, and we will cover related theory for an equivalent of this. Similar theory will be presented for 3
(2.19)
with $t_M$ denoting a part depending on the material, $t_F$ a field contribution, which does not. For the former, I adapt an analog of what I [12] suggested for thermoelasticity theory:
$$t_M = -\rho\, e^a \otimes \frac{\partial \varphi}{\partial e^a} + \rho\, \frac{\partial \varphi}{\partial p} \otimes p, \tag{2.20}$$
which inherits transformation properties from invariances of $\varphi$. For example, it transforms as a second-order tensor under SO(3). For our considerations of constitutive equations, the prescriptions of $t_F$, magnetic body forces and body couples are not relevant, so you could use my [12] suggestions or more conventional prescriptions. Here, I will not make use of this. Merely, I am indicating that a prescription is available for this and, if $\varphi$ is known, there is a constitutive equation for it. Similarly there are prescriptions for configurational stresses and entropy, not used here. Later, I will say something about implications of symmetry on stress, when effects of magnetism are ignored.
3. Reformulation

Here, we reformulate the equations to make invariance considerations better resemble those commonly encountered in other kinds of constitutive theory. Usually, these employ point groups. First, whether or not CBR applies, we can introduce a reference configuration for a material, as an homogeneous configuration filling all of space, equipped with a set of lattice vectors $E_a$ and a shift P and, for a Pitteri neighborhood, the center is the obvious choice. Reasonably, we should assign some particular value(s) of $\hat m$ to the center but, for the present, I ignore this. When CBR applies, we can associate subregions with particular material bodies. CBR asserts that, given a deformation of a crystal with deformation gradient F, the vector fields
$$e_a = F E_a \;\Leftrightarrow\; e^a = F^{-T} E^a \tag{3.1}$$
are a possible choice of lattice vectors in the deformed crystal. For spatial descriptions, it is simpler to use (3.1)$_2$, which gives
$$F^{-1} = \nabla X = E_a \otimes e^a, \tag{3.2}$$
with X representing the position vector in the reference configuration. When CBR does not apply, we can replace F by a linear transformation L satisfying the analog of (3.1), so
$$e_a = L E_a \;\Leftrightarrow\; e^a = L^{-T} E^a, \tag{3.3}$$
and it follows from (2.5) that, at least locally, there are functions Y(x) such that
$$L^{-1} = \nabla Y = E_a \otimes e^a. \tag{3.4}$$
You can consider its range as a subregion of the reference configuration for the material. So, when you trust CBR, just interpret L as F. It is not hard to modify this
to allow for the possibility that CBR applies only to subsets of configurations in the neighborhood, not necessarily including the center. If the configuration arises from deformation of a solid body, the difference between this mapping and that describing deformation seems likely to be describable as what is called a neutral deformation by DAVINI [17] and FONSECA & PARRY [18], judging from their ideas and my [6] study of deformation twins. However, we lack reliable theory for predicting these. Of course, a crystal configuration need not arise from deformation of another configuration. For example, it might be produced by solidification from a melt. If we introduce a transformation in $L^+(E_a, P)$, we have
$$m_a^b e_b = L m_a^b E_b = \bar L E_a \;\Leftrightarrow\; (m^{-1})^b_a e^a = \bar L^{-T} E^b, \tag{3.5}$$
where
$$\bar L = L R, \qquad R \in P^+(E_a, P), \tag{3.6}$$
this being how a lattice group gets replaced by a point group in elasticity theory, for example. Of course, the lattice group also acts on shifts. For this, let
$$\pi = L^{-1} p - P \;\Leftrightarrow\; p = L(P + \pi). \tag{3.7}$$
This will serve as a measure of the shift. To get a transformation rule for it under the lattice group, consider applying a transformation in $L^+(E_a, P)$. For P, we have
$$R^T(\alpha P + l^a E_a) = P, \tag{3.8}$$
and we should have
$$L^{-1} \to \bar L^{-1} = R^T L^{-1}, \qquad p \to \bar p = \alpha p + l^a e_a. \tag{3.9}$$
By a routine calculation, this gives the transform as
$$\bar\pi = \bar L^{-1} \bar p - P = \alpha R^T \pi, \tag{3.10}$$
which is simpler than the rule for p. However, it does differ from the usual rule for transforming vectors under the point group when a = — 1. Now, invariance of
$\mathbf{R} \in SO(3)$,
(3.11)
and it then follows that the symmetric tensor
$$\mathbf{D} = \mathbf{L}^{-1}\mathbf{L}^{-T}$$
(3.12)
transforms to itself, and it determines $\mathbf{L}$ to within transformations in $SO(3)$. When $\mathbf{L} = \mathbf{F}$, this is $\mathbf{C}^{-1}$, where $\mathbf{C}$ is the Cauchy-Green tensor commonly used in nonlinear elasticity theory. Similarly, $\boldsymbol{\pi}$ transforms to itself under $SO(3)$. Regarding $\mathbf{P}$ as given and with $\mathbf{L}$ determined as indicated by $\mathbf{D}$, (3.7) determines $\mathbf{p}$ in terms
217 Pitted Neighborhoods of Hexagonal Close-Packed Configurations
111
of $\boldsymbol{\pi}$ to within the same transformations in $SO(3)$. A similar way of representing $\mathbf{m}$ is needed, and there are various possibilities for this. In [12], I mentioned two, assuming $\mathbf{L} = \mathbf{F}$. One that I estimated as fitting a practice of some workers dealing with ferromagnets gives, when $\det\mathbf{L} > 0$, as I am assuming,
$$\mathbf{M} = \mathbf{L}^{-1}\mathbf{m}/\det\mathbf{L}^{-1}.$$
(3.13)
This and the other transform to themselves under $SO(3)$ and, under the lattice group for the center, transform according to
$$\mathbf{M} \to \mathbf{R}^T\mathbf{M}, \qquad \mathbf{R} \in P^+(\mathbf{E}_a, \mathbf{P}).$$
(3.14)
Also, all have the property that, if one allows improper unimodular transformations, which I will not, $\mathbf{M}$, like $\mathbf{m}$, transforms as an axial vector. For present purposes, we can interpret $\mathbf{M}$ as either of these choices, or others conforming to (3.14). Expressing $\varphi$ as a function of these variables, we get functions of the form
(3.15)
where I follow the practice in elasticity of ignoring the dependence on the fixed vectors $\mathbf{E}_a$ and $\mathbf{P}$. Hereafter, I simplify notation by using $\varphi$ for this function as well. Of course, these functions are automatically invariant under $SO(3)$, the invariance under $L^+(\mathbf{E}_a, \mathbf{P})$ translating to
=
(3.16)
Obviously, (2.17) translates to TR :
(3.17)
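Now, every configuration in the neighborhood has a lattice group $L^+(\mathbf{e}_a, \mathbf{p})$, which can be trivial, that is a subgroup of $L^+(\mathbf{E}_a, \mathbf{P})$ and, for given $\mathbf{e}_a$ and $\mathbf{p}$, this determines a point group $P^+(\mathbf{e}_a, \mathbf{p})$ which is conjugate to the corresponding subgroup of $P^+(\mathbf{E}_a, \mathbf{P})$. This restricts $\mathbf{L}$ as indicated by
$$\mathbf{R}'\mathbf{L} = \mathbf{L}\mathbf{R} \Leftrightarrow \mathbf{L}^{-1}\mathbf{R}'^{T} = \mathbf{R}^T\mathbf{L}^{-1}, \qquad \mathbf{R}' \in P^+(\mathbf{e}_a, \mathbf{p}), \quad \mathbf{R} \in P^+(\mathbf{E}_a, \mathbf{P}),$$
(3.18)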
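as is familiar to workers in elasticity when $\mathbf{L} = \mathbf{F}$, and is easily proved. We can use $SO(3)$ to adjust $\mathbf{e}_a$ and $\mathbf{p}$ to get $P^+(\mathbf{e}_a, \mathbf{p})$ to be a subgroup of $P^+(\mathbf{E}_a, \mathbf{P})$, instead of just being conjugate to one. Introduce the polar decomposition of $\mathbf{L}^{-1}$ in the form
$$\mathbf{L}^{-1} = \mathbf{W}\bar{\mathbf{R}}, \qquad \bar{\mathbf{R}} \in SO(3), \qquad \mathbf{W} = \mathbf{W}^T > 0,$$
(3.19)
the latter inequality meaning that $\mathbf{W}$ is positive definite, with
$$\mathbf{D} = \mathbf{W}^2.$$
(3.20)
From (3.12) and (3.18),
$$\mathbf{R}^T\mathbf{D}\mathbf{R} = \mathbf{D}.$$
(3.21)
Then, (3.20) determines $\mathbf{W}$ uniquely for given $\mathbf{D}$, implying that
$$\mathbf{R}^T\mathbf{W}\mathbf{R} = \mathbf{W} \Rightarrow \mathbf{R}^T\mathbf{W} = \mathbf{W}\mathbf{R}^T.$$
(3.22)
Rearranging (3.18) gives
$$\mathbf{R}^T\mathbf{W} = \mathbf{W}\tilde{\mathbf{R}}^T, \qquad \tilde{\mathbf{R}} = \bar{\mathbf{R}}\mathbf{R}'\bar{\mathbf{R}}^T,$$
(3.23)
changing the point group of interest to a conjugate group and rotating $\mathbf{e}_a$ and $\mathbf{p}$ to new values $\bar{\mathbf{e}}_a$ and $\bar{\mathbf{p}}$. Comparing (3.22) and (3.23), we get
$$\tilde{\mathbf{R}} = \mathbf{R},$$
(3.24)
so the new group is a subgroup, as indicated by
$$P^+(\bar{\mathbf{e}}_a, \bar{\mathbf{p}}) \subset P^+(\mathbf{E}_a, \mathbf{P}).$$
(3.25)
By an elementary calculation, we then get
$$\alpha\mathbf{R}^T\boldsymbol{\pi} = \boldsymbol{\pi}, \qquad \mathbf{R} \in P^+(\bar{\mathbf{e}}_a, \bar{\mathbf{p}}),$$
(3.26)
$\alpha$ being the value in $L^+(\mathbf{E}_a, \mathbf{P})$ corresponding to this value of $\mathbf{R} \in P^+(\mathbf{E}_a, \mathbf{P})$. However, $\mathbf{M}$ does not always get transformed to itself because, from (2.14), the lattice group $L^+(\mathbf{E}_a, \mathbf{P})$ has a trivial action on $\mathbf{m}$. When the operation indicated in (2.14) is included in $L^+(\mathbf{e}_a, \mathbf{p})$ we can replace it by the corresponding point group operation, then use invariance under $SO(3)$ to conclude that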
(3.27)
Since this works differently for different arguments, it does not seem sensible to try to treat it as something to be covered by the invariance group for $\varphi$ or
(3.28)
Of course, we can use these results to get invariance conditions on derivatives of $\varphi$. For example, with
$$\mathbf{A} = \frac{\partial\varphi}{\partial\mathbf{D}},$$
(3.29)
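differentiating (3.28) gives
$$\mathbf{A}(\mathbf{R}^T\mathbf{D}\mathbf{R}, \alpha\mathbf{R}^T\boldsymbol{\pi}, \mathbf{R}^T\mathbf{M}, \theta) = \mathbf{R}^T\mathbf{A}(\mathbf{D}, \boldsymbol{\pi}, \mathbf{M}, \theta)\mathbf{R}, \qquad \mathbf{R} \in L^+(\mathbf{E}_a, \mathbf{P}),$$
(3.30)
and, if $\mathbf{R} \in P^+(\mathbf{e}_a, \mathbf{p})$, we can use (3.21) and (3.26) to reduce the first two arguments on the left to the values on the right. If also $\mathbf{R}^T\mathbf{M} = \mathbf{M}$ or, when TR applies, $\mathbf{R}^T\mathbf{M} = \pm\mathbf{M}$, then all arguments can be regarded as the same, so the value of $\mathbf{A}$ is mapped to itself. Then, (3.27) is, effectively, a consequence of invariance under $L^+(\mathbf{E}_a, \mathbf{P})$. Similar remarks apply to the derivatives
$$\boldsymbol{\Phi} = \frac{\partial\varphi}{\partial\boldsymbol{\pi}}, \qquad \boldsymbol{\mathcal{M}} = \frac{\partial\varphi}{\partial\mathbf{M}}$$
(3.31)
and higher-order derivatives. Later, I will make some use of such symmetry arguments. It is a routine matter to use (2.20) to get prescriptions for tjyi and similar implications of symmetry for it but, generally, the latter do not apply to t/r.

4. Hexagonal close-packed neighborhoods

We now consider a neighborhood centered at some particular hexagonal close-packed configuration. Described in a conventional way, the lattice and reciprocal lattice vectors are of the form
$$\mathbf{E}_1 = a_0\mathbf{i}, \qquad \mathbf{E}_2 = a_0\left(-\tfrac{1}{2}\mathbf{i} + \tfrac{\sqrt{3}}{2}\mathbf{j}\right), \qquad \mathbf{E}_3 = c_0\mathbf{k},$$
$$\mathbf{E}^1 = \frac{1}{a_0}\left(\mathbf{i} + \frac{1}{\sqrt{3}}\mathbf{j}\right), \qquad \mathbf{E}^2 = \frac{2}{\sqrt{3}\,a_0}\mathbf{j}, \qquad \mathbf{E}^3 = \frac{1}{c_0}\mathbf{k},$$
(4.1)
where $a_0$ and $c_0$ are particular positive numbers, $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ being some orthonormal basis, usually considered to be right-handed. The shift is
$$\mathbf{P} = \tfrac{2}{3}\mathbf{E}_1 + \tfrac{1}{3}\mathbf{E}_2 + \tfrac{1}{2}\mathbf{E}_3.$$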
(4.2)
The corresponding groups $L^+(\mathbf{E}_a, \mathbf{P})$ and $P^+(\mathbf{E}_a, \mathbf{P})$, which are of order twelve, are described in Section 7. There is then the question of what to take as a reference value or possibly a set of reference values for $\mathbf{M}$. When TR applies, the two values $\pm\mathbf{M}_0$ should be allowed for some choice of the vector $\mathbf{M}_0$, for example. What seems to be a common practice suggests assuming that, at some reference temperature,
$$\mathbf{A}(\mathbf{1}, \mathbf{0}, \mathbf{M}_0, \theta_0) = \boldsymbol{\Phi}(\mathbf{1}, \mathbf{0}, \mathbf{M}_0, \theta_0) = \mathbf{0},$$
(4.3)
and, often,
$$\boldsymbol{\mathcal{M}}(\mathbf{1}, \mathbf{0}, \mathbf{M}_0, \theta_0) = \mathbf{0},$$
(4.4)
where $\mathbf{A}$ and
pick some non-zero vector $\mathbf{M}_0$, it seems reasonable to think that this might well break the hexagonal symmetry. Usually, workers interested in ferromagnets seem to ignore the opinion of Landau and Lifshitz. At least, this is how I interpret the common acceptance of the idea that a ferromagnetic crystal can be cubic when $\mathbf{m}$ has some particular crystallographic direction since, for example, a vector invariant under one of the cubic groups up to a reversal can only be the null vector. To put the question another way: How likely is it that (4.3) can be satisfied? I have not found a discussion of this in the literature, but will do some analyses relating to it. As will be exemplified, it is not reasonable to think that the symmetry of configurations, for given lattice vectors and shift, is completely independent of the value of $\mathbf{m}$. To deal with this, I will pick out a special set of values of $\mathbf{M}$ for each configuration: those fitting
$$\text{NO TR:} \quad \mathbf{R}^T\mathbf{M} = \mathbf{M} \quad \forall\, \mathbf{R} \in P^+(\mathbf{e}_a, \mathbf{p}),$$
(4.5)
and
$$\text{TR:} \quad \mathbf{R}^T\mathbf{M} = \pm\mathbf{M} \quad \forall\, \mathbf{R} \in P^+(\mathbf{e}_a, \mathbf{p}),$$
(4.6)
NO TR and TR meaning that TR does not or does apply, respectively. Then, from the discussion of (3.29), we have
$$\mathbf{R}^T\mathbf{A}\mathbf{R} = \mathbf{A} \quad \forall\, \mathbf{R} \in P^+(\mathbf{e}_a, \mathbf{p}),$$
(4.7)
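where the arguments in both values of $\mathbf{A}$ coincide. From the information concerning symmetric second-order tensors having this property, given by TRUESDELL [20, p. 201], for example, we infer that, for any configuration having the lattice group of the center, it is of the form
$$\mathbf{A} = \beta\mathbf{1} + \gamma\,\mathbf{k} \otimes \mathbf{k}$$
(4.8)
for some scalars $\beta$ and $\gamma$. Similarly, we can read off forms of $\mathbf{A}$ for the other symmetry types. Here, (4.8) still applies if we consider values of $a$ and $c$ differing from $a_0$ and $c_0$. Similarly, we get, for all transformations in $L^+(\mathbf{e}_a, \mathbf{p})$,
$$\alpha\mathbf{R}^T\boldsymbol{\pi} = \boldsymbol{\pi}, \qquad \alpha\mathbf{R}^T\boldsymbol{\Phi} = \boldsymbol{\Phi}.$$
(4.9)
In particular, this gives
$$L^+(\mathbf{e}_a, \mathbf{p}) = L^+(\mathbf{E}_a, \mathbf{P}) \Rightarrow \boldsymbol{\pi} = \boldsymbol{\Phi} = \mathbf{0}.$$
(4.10)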
So, (4.3) reduces to the two equations $\beta = \gamma = 0$, which can be regarded as equations for determining the two unknowns $a$ and $c$. By reasoning commonly used by physicists, it is not unlikely that $n$ equations in $n$ unknowns will have a solution, here interpreted as $a_0$ and $c_0$. So, here, it is not unlikely that we can have this kind of center, satisfying (4.3). By a similar argument, (4.6), here implying that $\mathbf{M}$ is parallel to $\pm\mathbf{k}$, provides three equations in three unknowns when TR applies, and workers interested in ferromagnets commonly assume that it does. Generally, (4.3) gives nine equations and, at most, we have five unknowns, $a$, $c$ and three components of $\mathbf{M}$. At least superficially, this suggests that the number of independent equations might well exceed the number of unknowns. By common consent, it is then unlikely that they can be satisfied. We will explore questions of this kind in more detail. Reasoning of this kind also suggests that some values of $\mathbf{M}$ are likely to break symmetry of configurations having some nontrivial symmetry. I remind the reader that, usually, the symmetry of a crystal is judged by examining specimens not subject to shear stresses, although magnetic effects can complicate considerations of this kind. Of course, a very slight change in symmetry could be masked by the inevitable experimental errors, but might be revealed in more accurate observations done in the future.
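The conventional description of the center used in this section can be checked numerically. The following is a minimal sketch, assuming the lattice vectors take the usual form $\mathbf{E}_1 = a_0\mathbf{i}$, $\mathbf{E}_2 = a_0(-\mathbf{i}/2 + \sqrt{3}\,\mathbf{j}/2)$, $\mathbf{E}_3 = c_0\mathbf{k}$, with shift $\mathbf{P} = \tfrac{2}{3}\mathbf{E}_1 + \tfrac{1}{3}\mathbf{E}_2 + \tfrac{1}{2}\mathbf{E}_3$. The function names and the numerical values of $a_0$ and $c_0$ are illustrative choices, not taken from the text.

```python
import math

def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def hcp_vectors(a0, c0):
    """Return (E, E_rec): lattice vectors E_a and reciprocal vectors E^a,
    given as components on the orthonormal basis (i, j, k)."""
    s3 = math.sqrt(3.0)
    E = [
        (a0, 0.0, 0.0),                    # E_1 = a0 i
        (-0.5 * a0, 0.5 * s3 * a0, 0.0),   # E_2 = a0 (-i/2 + sqrt(3) j/2)
        (0.0, 0.0, c0),                    # E_3 = c0 k
    ]
    E_rec = [
        (1.0 / a0, 1.0 / (s3 * a0), 0.0),  # E^1
        (0.0, 2.0 / (s3 * a0), 0.0),       # E^2
        (0.0, 0.0, 1.0 / c0),              # E^3
    ]
    return E, E_rec

def shift(E):
    """Shift P = (2/3) E_1 + (1/3) E_2 + (1/2) E_3."""
    coeffs = (2.0 / 3.0, 1.0 / 3.0, 0.5)
    return tuple(sum(c * E[a][i] for a, c in enumerate(coeffs)) for i in range(3))

def gramian(E):
    """Matrix of inner products e_a . e_b."""
    return [[dot(E[a], E[b]) for b in range(3)] for a in range(3)]

# Hypothetical values: a0 = 1 and the ideal axial ratio c0/a0 = sqrt(8/3).
a0, c0 = 1.0, math.sqrt(8.0 / 3.0)
E, E_rec = hcp_vectors(a0, c0)
P = shift(E)
G = gramian(E)

# Duality of the two bases: E^a . E_b should be the Kronecker delta.
duality = [[dot(E_rec[a], E[b]) for b in range(3)] for a in range(3)]
```

With these assumptions the Gramian has $G_{12} = -\tfrac{1}{2}G_{11}$ and $G_{22} = G_{11}$, and for the ideal axial ratio the atom at $\mathbf{P}$ lies at distance $a_0$ from the atom at the origin.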
5. Structure of the neighborhoods

Here, I will list information about configurations contained in neighborhoods of the hexagonal kind, for the simplest case, $\mathbf{M}_0 = \mathbf{0}$ for the special configurations such that
$$\mathbf{L}^{-1} = \mathbf{W}.$$
(5.1)
Actually, the analyses will uncover some possibilities with $\mathbf{M}_0 \neq \mathbf{0}$. As we have seen, the point groups for these are subgroups of $P^+(\mathbf{E}_a, \mathbf{P})$, not just conjugate to these. Also, other possible configurations can be obtained by applying rotations to these. Each configuration generates a set of variants, obtained by taking the orbit of it under $L^+(\mathbf{E}_a, \mathbf{P})$. From elementary group theory, the number of these is obtained by dividing twelve, the order of $L^+(\mathbf{E}_a, \mathbf{P})$, by the order of the lattice group of the configuration. I will put the variants under one heading, describing the (conjugate) lattice groups for each. As should be clear from the previous discussion, what symmetry a configuration has depends on the value of $\mathbf{M}$. I will list special values of this which are clearly compatible with the symmetry considered when TR does not apply, denoted by NO TR, and when it does, denoted by TR, the analogs of (4.5) and (4.6). Values of $\mathbf{W}$, $\mathbf{A}$, $\boldsymbol{\pi}$,
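suitably defined fixed sets are then convex subsets, as can be seen from the following arguments. For $\tau \in [0, 1]$, consider
$$\mathbf{D}(\tau) = \tau\mathbf{D}_1 + (1 - \tau)\mathbf{D}_2,$$
(5.2)
where $\mathbf{D}_1$ and $\mathbf{D}_2$ are symmetric, positive definite, close to $\mathbf{1}$, and the corresponding point groups both include a rotation $\mathbf{R}$. Then, it is easy to verify that $\mathbf{D}(\tau)$ is also symmetric, positive definite, close to $\mathbf{1}$ and transformed to itself by $\mathbf{R}$. Similarly, if $\boldsymbol{\pi}_1$ and $\boldsymbol{\pi}_2$ are small and such that
$$\alpha\mathbf{R}^T\boldsymbol{\pi}_K = \boldsymbol{\pi}_K, \qquad K = 1, 2,$$
(5.3)
then
$$\boldsymbol{\pi}(\tau) = \tau\boldsymbol{\pi}_1 + (1 - \tau)\boldsymbol{\pi}_2$$
(5.4)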
also has these properties. With some such restrictions, this comes close to establishing the convexity. It might seem that this could be generalized to give convex global fixed sets. The difficulty with this is that a line joining two configurations might contain a configuration excluded by (2.1) or a configuration with a non-essential description, which needs careful consideration at least. However, we cannot have a sequence of such configurations approaching a center of the kind we are considering. Within the limits indicated, such local theory of fixed sets should be much like that I [21] presented for Bravais lattices. To firm up this line of thought, it seems important to understand how the dependence on $\mathbf{M}$ influences symmetry of configurations. Here, I shall not pursue this matter of convexity, but will present some thoughts on the influence of $\mathbf{M}$. We begin with the configurations of greatest symmetry, those in the fixed set containing the center. For the headings, the names and type numbers listed are those used by FADDA & ZANZOTTO [3,11]. Results listed below are obtained using (3.22), (4.5)-(4.7) and (4.9). Because I use (4.5) and (4.6), which is not a common practice, I regard the descriptions as tentative. Later, I will modify some slightly. At the end of each description, I will present characterizations of $\mathbf{e}_a$ and $\mathbf{p}$ which apply whether or not $\mathbf{L}$ is restricted to satisfy (5.1), which should make it easier to picture the configurations.

5.1. Hexagonal close-packed configurations, type 27

To avoid excessive use of letters, I will employ some of the same letters in describing different symmetry types. First consider the configurations of maximal symmetry, lying in the fixed set determined by the center. Routine calculations give the following possible forms:
$$\mathbf{W} = c\mathbf{1} + d\,\mathbf{k} \otimes \mathbf{k}, \qquad \boldsymbol{\pi} = \boldsymbol{\Phi} = \mathbf{0}, \qquad \mathbf{A} = \beta\mathbf{1} + \gamma\,\mathbf{k} \otimes \mathbf{k},$$
(5.5)
along with
$$\text{NO TR:} \quad \mathbf{M} = \mathbf{0} \Rightarrow \boldsymbol{\mathcal{M}} = \mathbf{0},$$
(5.6)
$$\text{TR:} \quad \mathbf{M} = \pm m\mathbf{k} \Rightarrow \boldsymbol{\mathcal{M}} = \pm\mu\mathbf{k}.$$
(5.7)
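The tensor forms listed for type 27 can be checked directly against the condition $\mathbf{R}^T\mathbf{W}\mathbf{R} = \mathbf{W}$. The sketch below assumes that the point group of the center consists of the six rotations about $\mathbf{k}$ through multiples of 60° together with six 180° rotations about in-plane axes spaced 30° apart. That is the usual description of a rotation group of order twelve containing seven 180° rotations, but the specific axis directions and all function names are assumptions made for illustration.

```python
import math

def rotation(axis, angle):
    """Rotation matrix about the unit vector `axis` by `angle` (Rodrigues formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def conjugate(R, T):
    """Return R^T T R."""
    return matmul(transpose(R), matmul(T, R))

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

def hexagonal_point_group():
    """Twelve rotations: six about k, six 180-degree rotations about in-plane axes
    (assumed axis directions, measured from i)."""
    k = (0.0, 0.0, 1.0)
    group = [rotation(k, n * math.pi / 3.0) for n in range(6)]
    for n in range(6):
        phi = n * math.pi / 6.0
        group.append(rotation((math.cos(phi), math.sin(phi), 0.0), math.pi))
    return group

def axial_tensor(c, d):
    """Components of c 1 + d k(x)k on the basis (i, j, k)."""
    return [[c, 0.0, 0.0], [0.0, c, 0.0], [0.0, 0.0, c + d]]

group = hexagonal_point_group()
W = axial_tensor(1.02, -0.03)  # hypothetical values close to 1
generic = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]

# The axial form is mapped to itself by every element; a generic tensor is not.
w_invariant = all(close(conjugate(R, W), W) for R in group)
generic_invariant = all(close(conjugate(R, generic), generic) for R in group)
```

Under these assumptions `w_invariant` is true while `generic_invariant` is false, which is the sense in which the form $c\mathbf{1} + d\,\mathbf{k} \otimes \mathbf{k}$ is singled out by the symmetry.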
Here, $\mathbf{e}_a$ and $\mathbf{p}$ have the forms (4.1) and (4.2) respectively, with $a_0$ and $c_0$ replaced by different numbers. Also, by hindsight, we could have taken $\mathbf{M}_0$ to be of the form (5.7) when TR applies. Later we will find a similar generalization is possible when it does not. For $\mathbf{e}_a$, the Gramian matrix is of the form
$$G = \|\mathbf{e}_a \cdot \mathbf{e}_b\| = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{11} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}$$
(5.8)
and $\mathbf{p}$ is of the form (4.2). Results of this kind are obtained by analyzing (2.10) for the values of $m$ occurring in the governing lattice group. By similar analyses, values of $\mathbf{p}$ are obtained using (2.10).

5.2. Hexagonal configurations, type 25

For these, the lattice groups are of order six, so there are two variants, but they share the same lattice group, with elements described in Section 7 as
$$L_2, L_4, L_6, L_8, L_{10} \text{ and } L_{12}.$$
(5.9)
Thus the variants are included in the same fixed set. Using (5.9), we get
$$\mathbf{W} = c\mathbf{1} + d\,\mathbf{k} \otimes \mathbf{k}, \qquad \boldsymbol{\pi} = s\mathbf{k}, \qquad \boldsymbol{\Phi} = \sigma\mathbf{k}, \qquad \mathbf{A} = \beta\mathbf{1} + \gamma\,\mathbf{k} \otimes \mathbf{k}.$$
(5.10)
For both variants, the restrictions on $\mathbf{M}$ and $\boldsymbol{\mathcal{M}}$ are the same as in the previous case. Take one variant and make the substitutions
$$s \to -s, \qquad \sigma \to -\sigma$$
(5.11)
to get the other. Of course, the type-27 configurations are included in the same fixed set, occurring for $s = \sigma = 0$. As is mentioned by FADDA & ZANZOTTO [3] and is easily verified, the lattice vectors for these two types are of the same form, the reason for calling these hexagonal, but the shifts are a little different. Obviously, the type-27 configurations separate the configurations with $s > 0$ from those with $s < 0$ so, in this sense, the variants form two disjoint sets. For $\mathbf{e}_a$, (5.8) again applies and
$$\mathbf{p} = \tfrac{2}{3}\mathbf{e}_1 + \tfrac{1}{3}\mathbf{e}_2 + p\,\mathbf{e}_3.$$
(5.12)
When letters like this p are included in such expressions, it means that the numbers are not required to take on a particular value although, generally, the various parameters involved are not all independent.
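The form $\boldsymbol{\pi} = s\mathbf{k}$ and the substitution $s \to -s$ can be illustrated with the transformation rule $\boldsymbol{\pi} \to \alpha\mathbf{R}^T\boldsymbol{\pi}$. The sketch below assumes that the type-25 operations are the identity and the two 120° rotations about $\mathbf{k}$, taken with $\alpha = 1$, plus three 180° rotations about in-plane axes 60° apart, taken with $\alpha = -1$. The in-plane axis directions and the helper names are hypothetical; the conclusion about the $\mathbf{k}$ component does not depend on them.

```python
import math

def rotation(axis, angle):
    """Rotation matrix about the unit vector `axis` by `angle` (Rodrigues formula)."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]

def act(alpha, R, v):
    """Return alpha * R^T v, the rule used for the shift measure pi."""
    return tuple(alpha * sum(R[j][i] * v[j] for j in range(3)) for i in range(3))

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

k = (0.0, 0.0, 1.0)

# Assumed type-25 operations as (alpha, R) pairs.
ops = [(1, rotation(k, n * 2.0 * math.pi / 3.0)) for n in range(3)]
for n in range(3):
    phi = n * math.pi / 3.0  # assumed direction of the in-plane axis
    ops.append((-1, rotation((math.cos(phi), math.sin(phi), 0.0), math.pi)))

s = 0.01  # hypothetical small value
pi_axial = (0.0, 0.0, s)
pi_basal = (s, 0.0, 0.0)

# pi = s k is left unchanged by every operation; a basal-plane vector is not.
axial_fixed = all(close(act(a, R, pi_axial), pi_axial) for a, R in ops)
basal_fixed = all(close(act(a, R, pi_basal), pi_basal) for a, R in ops)

# An operation outside the group, the 180-degree rotation about k with alpha = -1,
# sends s k to -s k, which is the substitution relating the two variants.
other_variant = act(-1, rotation(k, math.pi), pi_axial)
```

Here `axial_fixed` is true and `basal_fixed` is false, and `other_variant` equals $-s\mathbf{k}$, in line with (5.10) and (5.11).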
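5.3. Base-centered orthorhombic configurations, type 14

The order of the lattice groups for these is four, giving three variants. For each, the point group contains three 180° rotations, with perpendicular axes, which are different for different variants, but $\mathbf{R}_{\mathbf{k}}^{\pi}$ is included in all. For this one, $\alpha = -1$ and, for the other two, $\alpha = 1$. We use an orthonormal basis with directions $\mathbf{f}_1$, $\mathbf{f}_2$ and $\mathbf{k}$, $\mathbf{f}_1$ being the axis of the rotation associated with $\alpha = 1$. Involved are the three lattice groups with elements
$$L_3, L_8, L_{11}, L_{12},$$
(5.13)
$$L_3, L_7, L_{10}, L_{12}$$
(5.14)
and
$$L_3, L_6, L_9, L_{12}.$$
(5.15)
Calculations give
$$\mathbf{W} = c\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \dots, \qquad \boldsymbol{\pi} = s\,\mathbf{f}_1, \qquad \boldsymbol{\Phi} = \sigma\,\mathbf{f}_1, \qquad \mathbf{A} = \beta\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \dots$$
(5.16)
For magnetization,
$$\text{NO TR:} \quad \mathbf{M} = \boldsymbol{\mathcal{M}} = \mathbf{0},$$
(5.17)
$$\text{TR:} \quad \mathbf{M} = \pm m_1\mathbf{f}_1, \ \pm m_2\mathbf{f}_2 \ \text{or} \ \pm m_3\mathbf{k},$$
(5.18)
with
$$\boldsymbol{\mathcal{M}} = \pm\mu_1\mathbf{f}_1, \ \pm\mu_2\mathbf{f}_2 \ \text{or} \ \pm\mu_3\mathbf{k}.$$
(5.19)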
Obviously, the variants reside in different fixed sets, since their lattice groups differ. For individual configurations, crystallographers have conventional ways of describing them, which are used by FADDA & ZANZOTTO [3,11]. Bear in mind that, in collecting configurations into one neighborhood, it can be necessary to use unconventional descriptions and, for this reason, there are some differences between their descriptions and mine. The characterizations of $\mathbf{e}_a$ and $\mathbf{p}$ are different for the different lattice groups. I list them in the same order I listed the lattice groups. For the Gramians, calculations give
$$\begin{Vmatrix} G_{11} & G_{12} & 0 \\ G_{12} & G_{11} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix},$$
(5.20)
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & 0 \\ -\tfrac{1}{2}G_{22} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}$$
(5.21)
and
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}.$$
(5.22)
Here, only (5.20) fits the conventional description of lattice vectors used by FADDA & ZANZOTTO [3], for example. It is true that the others describe configurations with the same symmetry, but they occur in different parts of the neighborhood and, for continuum theory, it is important to account for this. Also,
$$\mathbf{p} = p\,\mathbf{e}_1 + (1 - p)\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3,$$
(5.23)
$$\mathbf{p} = \dots$$
(5.24)
and
$$\mathbf{p} = r\,\mathbf{e}_1 + (2r - 1)\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3.$$
(5.25)
As is obvious from (5.20)-(5.22), the values of lattice vectors in (5.23)-(5.25) should not be regarded as the same, but it is cumbersome to use different symbols for the three sets. It is important to account for all three, since they occupy different positions in the neighborhood.

5.4. Base-centered monoclinic configurations, type 7

The monoclinic configurations have lattice groups of order two, giving six variants. These involve 180° rotations and, from the Appendix, seven of these are included in $P^+(\mathbf{E}_a, \mathbf{P})$. If we take the orbit under $L^+(\mathbf{E}_a, \mathbf{P})$ of a configuration associated with one of these, we get lattice groups related by similarity transformations and, using (7.1), it is easy to show that these must share the same value of $\alpha$. Looking at the three with $\alpha = 1$, we find that they do occur on the same orbit, giving the lattice groups with elements
$$L_K, L_{12}, \qquad K = 7, 9, 11.$$
(5.26)
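From (5.13)-(5.15), each of these is included in one of the lattice groups for type 14. From the work of FADDA & ZANZOTTO [3], this is enough to identify them as being of type 7. To get the six variants, it is rather obvious that these must split into three pairs, with the two in each pair having the same lattice group. What happens is that the symmetry of each type-14 variant gets broken in the same way, to double the number of variants. Pick a value of $K$ for $L_K$ and the type-14 lattice group containing it. Then the axis of the rotation corresponding to $L_K$ is one of the three vectors $\mathbf{f}_1$ used in describing type 14, and good bases are $\mathbf{f}_1$, $\mathbf{f}_2$ and $\mathbf{k}$. To determine how the two type-7 variants with this lattice group are related, we pick some configuration with this symmetry and determine its orbit under the lattice group for the type 14 considered. By doing the relevant calculations, we get
$$\mathbf{W} = c\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \dots, \qquad \boldsymbol{\pi} = s\,\mathbf{f}_1, \qquad \boldsymbol{\Phi} = \sigma\,\mathbf{f}_1, \qquad \mathbf{A} = \beta\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \dots,$$
(5.27)
along with
$$\text{NO TR:} \quad \mathbf{M} = m\,\mathbf{f}_1, \qquad \boldsymbol{\mathcal{M}} = \mu\,\mathbf{f}_1$$
(5.28)
and either
$$\text{TR:} \quad \mathbf{M} = \pm m\,\mathbf{f}_1, \qquad \boldsymbol{\mathcal{M}} = \pm\mu\,\mathbf{f}_1,$$
(5.29)
or
$$\text{TR:} \quad \mathbf{M} = \pm(m_1\mathbf{f}_2 + m_2\mathbf{k}), \qquad \boldsymbol{\mathcal{M}} = \pm(\mu_1\mathbf{f}_2 + \mu_2\mathbf{k}).$$
(5.30)
The two variants mentioned above are obtained using the substitutions
$$e \to -e, \qquad \delta \to -\delta, \qquad m \to -m, \qquad \mu \to -\mu,$$
(5.31)
when (5.28) or (5.29) applies, replacing the last two by either
$$m_1 \to -m_1, \qquad \mu_1 \to -\mu_1$$
(5.32)
or its equivalent
$$m_2 \to -m_2, \qquad \mu_2 \to -\mu_2$$
(5.33)
when (5.30) applies. Here, we get
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & G_{13} \\ -\tfrac{1}{2}G_{22} & G_{22} & -2G_{13} \\ G_{13} & -2G_{13} & G_{33} \end{Vmatrix},$$
(5.34)
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & G_{13} \\ -\tfrac{1}{2}G_{11} & G_{22} & -\tfrac{1}{2}G_{13} \\ G_{13} & -\tfrac{1}{2}G_{13} & G_{33} \end{Vmatrix}$$
(5.35)
and
$$\begin{Vmatrix} G_{11} & G_{12} & G_{13} \\ G_{12} & G_{11} & G_{13} \\ G_{13} & G_{13} & G_{33} \end{Vmatrix}.$$
(5.36)
For the shifts, calculations give
$$\mathbf{p} = p\,\mathbf{e}_1 + \tfrac{1}{2}p\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3,$$
(5.37)
$$\mathbf{p} = q\,\mathbf{e}_1 + (2q - 1)\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3$$
(5.38)
and
$$\mathbf{p} = r\,\mathbf{e}_1 + (1 - r)\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3.$$
(5.39)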
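5.5. Base-centered monoclinic configurations, type 6

The theory of these is somewhat similar to that for type 7. These involve three 180° rotations, now with axes perpendicular to $\mathbf{k}$, making 60° or 120° angles with each other, depending on which angle is used, all with $\alpha = -1$. Again, they involve three lattice groups, now with elements
$$L_\kappa, L_{12}, \qquad \kappa = 6, 8, 10,$$
(5.40)
but they are all contained in one lattice group, that given for type 25 by (5.9). Again, from the work of Fadda and Zanzotto, this implies that they are of type 6. For any of the lattice groups given by (5.40), pick an orthonormal basis $\mathbf{g}_1$, $\mathbf{g}_2$ and $\mathbf{k}$ such that $\mathbf{g}_1$ is the axis of the rotation in the corresponding point group, orienting these so that $\mathbf{R}_{\mathbf{k}}^{2\pi/3}$ takes one value of $\mathbf{g}_1$ and $\mathbf{g}_2$ to another. Then, we find that
$$\mathbf{W} = c\,\mathbf{g}_1 \otimes \mathbf{g}_1 + d\,\mathbf{g}_2 \otimes \mathbf{g}_2 + e(\mathbf{g}_2 \otimes \mathbf{k} + \mathbf{k} \otimes \mathbf{g}_2) + \dots, \qquad \mathbf{A} = \beta\,\mathbf{g}_1 \otimes \mathbf{g}_1 + \gamma\,\mathbf{g}_2 \otimes \mathbf{g}_2 + \dots$$
(5.41)
with
$$\text{NO TR:} \quad \mathbf{M} = m\,\mathbf{g}_1, \qquad \boldsymbol{\mathcal{M}} = \mu\,\mathbf{g}_1$$
(5.42)
and either
$$\text{TR:} \quad \mathbf{M} = \pm m\,\mathbf{g}_1, \qquad \boldsymbol{\mathcal{M}} = \pm\mu\,\mathbf{g}_1,$$
(5.43)
or
$$\text{TR:} \quad \mathbf{M} = \pm(m_1\mathbf{g}_2 + m_2\mathbf{k}), \qquad \boldsymbol{\mathcal{M}} = \pm(\mu_1\mathbf{g}_2 + \mu_2\mathbf{k}).$$
(5.44)
Taking the orbit of these under the type-25 lattice group just gives the other two values of $\mathbf{g}_1$ and $\mathbf{g}_2$, giving three variants. To get the rest, you can transform these using $L_3$, for example. This induces the substitutions
$$e \to -e, \qquad s_2 \to -s_2, \qquad \delta \to -\delta, \qquad \sigma_2 \to -\sigma_2, \qquad m \to -m, \qquad \mu \to -\mu$$
(5.45)
when (5.41) or (5.42) applies, or replaces the last by the pair
$$m_1 \to -m_1, \quad \mu_1 \to -\mu_1 \qquad \text{or} \qquad m_2 \to -m_2, \quad \mu_2 \to -\mu_2$$
(5.46)
when (5.43) applies. Generally, this produces six different configurations from one, and it is easy to see what happens when they are not all different. Here,
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{22} & G_{23} \\ 0 & G_{23} & G_{33} \end{Vmatrix},$$
(5.47)
$$\begin{Vmatrix} G_{11} & G_{12} & G_{13} \\ G_{12} & G_{11} & -G_{13} \\ G_{13} & -G_{13} & G_{33} \end{Vmatrix}$$
(5.48)
and
$$\begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & G_{13} \\ -\tfrac{1}{2}G_{22} & G_{22} & 0 \\ G_{13} & 0 & G_{33} \end{Vmatrix}.$$
(5.49)
With this,
$$\mathbf{p} = p\,\mathbf{e}_1 + (2p - 1)\,\mathbf{e}_2 + q\,\mathbf{e}_3,$$
(5.50)
$$\mathbf{p} = r\,\mathbf{e}_1 + (1 - r)\,\mathbf{e}_2 + s\,\mathbf{e}_3$$
(5.51)
and
$$\mathbf{p} = t\,\mathbf{e}_1 + \tfrac{1}{2}t\,\mathbf{e}_2 + u\,\mathbf{e}_3.$$
(5.52)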
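5.6. Primitive monoclinic configurations, type 2

These are the monoclinic configurations which FADDA & ZANZOTTO [3] describe as all having the same lattice group, a subgroup of the type 14, the elements being
$$L_3, L_{12},$$
(5.53)
the former having $\alpha = -1$. Note that this is included in all the groups (5.13)-(5.15). This involves a 180° rotation with $\mathbf{k}$ as axis, which is not included in the point groups of the monoclinics previously considered. As might be obvious from this, you can get these by taking configurations with any one of the three and breaking the symmetry, to double the number of variants. To do so, use the orthonormal basis $\mathbf{f}_1$, $\mathbf{f}_2$ and $\mathbf{k}$ used for the type 14 and, for the type 2, you get
$$\mathbf{W} = c\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \dots, \qquad \boldsymbol{\Phi} = \sigma_1\mathbf{f}_1 + \sigma_2\mathbf{f}_2, \qquad \mathbf{A} = \beta\,\mathbf{f}_1 \otimes \mathbf{f}_1 + \gamma\,\mathbf{f}_2 \otimes \mathbf{f}_2 + \delta(\mathbf{f}_1 \otimes \mathbf{f}_2 + \mathbf{f}_2 \otimes \mathbf{f}_1) + \varepsilon\,\mathbf{k} \otimes \mathbf{k},$$
(5.54)
along with
$$\text{NO TR:} \quad \mathbf{M} = m\mathbf{k}, \qquad \boldsymbol{\mathcal{M}} = \mu\mathbf{k},$$
(5.55)
and either
$$\text{TR:} \quad \mathbf{M} = \pm m\mathbf{k}, \qquad \boldsymbol{\mathcal{M}} = \pm\mu\mathbf{k},$$
(5.56)
or
$$\text{TR:} \quad \mathbf{M} = \pm(m_1\mathbf{f}_1 + m_2\mathbf{f}_2), \qquad \boldsymbol{\mathcal{M}} = \pm(\mu_1\mathbf{f}_1 + \mu_2\mathbf{f}_2).$$
(5.57)
Taking the orbit of one of these under the corresponding type-14 lattice group generates two variants by the substitutions
$$e \to -e, \qquad s_2 \to -s_2, \qquad \sigma_2 \to -\sigma_2, \qquad \delta \to -\delta$$
(5.58)
and
$$m_1 \to -m_1, \quad \mu_1 \to -\mu_1 \qquad \text{or} \qquad m_2 \to -m_2, \quad \mu_2 \to -\mu_2$$
(5.59)
depending on which of (5.55)-(5.57) applies. So, this describes the six variants, and it is again easy to see what happens if they are not all distinct. In this case, we get
$$\begin{Vmatrix} G_{11} & G_{12} & 0 \\ G_{12} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}$$
(5.60)
and
$$\mathbf{p} = p\,\mathbf{e}_1 + q\,\mathbf{e}_2 + \tfrac{1}{2}\mathbf{e}_3.$$
(5.61)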
5.7. Triclinic configurations, type 1

These have the trivial lattice group, consisting of the identity, so symmetry imposes no restrictions on the quantities of interest. Of course, it is possible to get twelve variants from one, and calculate the substitutions relating quantities of interest for them, but I do not think it worthwhile to list the details. Several comments are in order. For one thing, in all cases, $\boldsymbol{\pi}$ and $\boldsymbol{\Phi}$ have the same general form, because symmetry conditions restricting them are the same. Setting $\boldsymbol{\Phi} = \mathbf{0}$ then gives the same number of equations as the number of parameters in $\boldsymbol{\pi}$. This is also the case for $\mathbf{W}$ and $\mathbf{A}$ as well as $\mathbf{M}$ and $\boldsymbol{\mathcal{M}}$. There is then a reasonable possibility of having a configuration satisfying
$$\boldsymbol{\Phi} = \mathbf{A} = \boldsymbol{\mathcal{M}} = \mathbf{0}$$
(5.62)
for some materials, at some temperature, for any of the symmetry types considered. These are the kinds of configurations that workers seem to like as reference configurations, and they are of some importance in dealing with various kinds of phase transitions. It will become clear that, in some cases, this can be attained with some but not all more general forms of $\mathbf{M}$. The difficulty lies in stating an alternative to (4.5) which produces such generalizations. So, I could have saved a little ink by not listing all, but thought it better to record the complete list. Also, there is some overlap with the presentation by FADDA & ZANZOTTO [3]. The main difference is that they do not discuss the effects on any constitutive equations or how introducing another variable, here $\mathbf{M}$, affects the symmetry. For one thing, I am trying to cover the opinion of LANDAU & LIFSHITZ [19, Sect. 130] mentioned in Section 4, so I am exploring ideas about this. What is not so obvious is what should be meant by "the symmetry of the crystal" when $\mathbf{M} \neq \mathbf{0}$. As another point, it is not necessary that the center be an equilibrium configuration for the material and, for twinning studies, it could be useful to use one which is not. I [5] noted that the common {130} twins in $\alpha$-uranium could be described using a type-27 neighborhood, but no known phase of this material has this symmetry. However, in this case, this is of limited value, since other known twins in this material cannot fit into this neighborhood. Further, I think that, potentially, it is useful to understand the structure of neighborhoods with centers of high symmetry because smaller parts are also neighborhoods with less symmetric centers, so we get information about possible structures of these and what can limit their size. Finally, suppose that we were interested in a second-order or weak first-order type-25 $\leftrightarrow$ type-27 phase transition, for example. From information now at hand, we see that the important difference is having $s \neq 0$ in (5.10)$_2$. A physicist would, I think, infer that $s$ is a good choice of order parameter, and use Landau theory to analyze it, at least as a first step. Theory of this general kind is discussed at length by TOLEDANO & DMITRIEV [22]. It involves using polynomial constitutive equations of relatively low degree and of different degrees in different arguments.

6. Quadratic constitutive equations

Here, we consider constitutive equations obtained by expanding
$$\mathbf{M}_0 = \mathbf{0}, \qquad \boldsymbol{\pi} = \mathbf{0}, \qquad \theta = \theta_0, \qquad \varphi_0 = 0.$$
(6.1)
To do so, replace $\mathbf{D}$ by
$$\boldsymbol{\varepsilon} = (\mathbf{1} - \mathbf{D})/2.$$
(6.2)
If CBR applies and only infinitesimal deformations are allowed, you can replace this by the strain tensor commonly used in linear theories. I will split the energy into parts indicated by
$$\varphi = \varphi_T + \varphi_P + \varphi_L.$$
(6.3)
Here, $\varphi_T$ includes what is commonly used in thermoelasticity theory plus some magnetic terms, those which would be used for a linear theory of paramagnetic phases. For these and particularly for ferromagnets, theory more nonlinear is used, with deformations still considered to be infinitesimal. Conventional theory of this kind and other kinds of information on ferromagnetic materials are discussed by CARR [23]. Remaining magnetic terms are covered by
(6.4)
Also, $\boldsymbol{\pi} = \mathbf{0} \Rightarrow$
(6.5)
Primarily because of the unusual transformation laws for $\boldsymbol{\pi}$, I did not find the form of $\varphi$ completely worked out in the literature, so did this myself. In doing such calculations, it helps to note that $L^+(\mathbf{E}_a, \mathbf{P})$ can be generated by two elements, for example $L_1$ and $L_6$. Of course, a function is invariant under the group provided it is invariant under a set of generators. Finally,
$$\dots + A_2(\varepsilon_{33})^2 + A_3\det\|\varepsilon_{\alpha\beta}\| + A_4\varepsilon_{\alpha 3}\varepsilon_{\alpha 3} + A_5\varepsilon_{\alpha\alpha}\varepsilon_{33} + (A_6\varepsilon_{\alpha\alpha} + A_7\varepsilon_{33})(\theta - \theta_0) + A_8(\theta - \theta_0)^2 + A_9 M_\alpha M_\alpha + A_{10}(M_3)^2,$$
(6.6)
(6.7)
$$\dots + A_{13}(\pi_3)^2 + A_{14}[2\varepsilon_{12}\pi_1 + (\varepsilon_{11} - \varepsilon_{22})\pi_2],$$
(6.8)
and
$$\varphi_L = A_{15}\varepsilon_{\alpha\alpha} + A_{16}\varepsilon_{33} + A_{17}(\theta - \theta_0),$$
(6.9)
where the $A$'s are material constants. Here, with the numbering indicated by writing the position vector as $\mathbf{x} = x_1\mathbf{i} + x_2\mathbf{j} + x_3\mathbf{k}$, Greek indices take on values 1 and 2, $e_{\alpha\beta}$ being the alternating tensor. Of course, since $\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}^T$, it makes no difference whether you take the first or the second index to label rows in $\|\varepsilon_{\alpha\beta}\|$. Properly interpreted, $\mathbf{A}$ is a symmetric tensor, so you must bear this in mind in calculating it, as is done in elasticity theory. This expansion could be used for any configuration in the neighborhood which is in the fixed set of the center and for which $\mathbf{M}_0 = \mathbf{0}$. For any such, (5.5) holds. If TR does not apply, it is not likely that $A_{11}$ will vanish. If not, a simple calculation gives
$$\text{NO TR:} \quad (5.6) \Rightarrow M_1 = M_2 = 0.$$
(6.10)
This term can produce piezomagnetic effects, when other moduli satisfy suitable stability inequalities. I have not made a search to determine whether there are known real crystals of the monatomic hexagonal close-packed kind that are piezomagnetic. If $\mathbf{M}$ does not satisfy these conditions, it should cause a type-27 configuration to shift to one of lower symmetry, and you can find candidates by examining the descriptions of symmetry types presented earlier. The theory of piezomagnetism is rather similar to that of piezoelectricity, which is discussed in detail by CADY [24].
I do not know of a similar exposition of piezomagnetism. To satisfy (4.3) when e = 0, we also need Ai4 = A i 5 = 0 .
(6.11)
When TR applies, $A_{11} = 0$ and, according to this approximate theory, we can have (5.5) for any value of $\mathbf{M}$. However, if higher order terms are included, this is not the case. For example, a possible cubic term is a constant times
$$\mathbf{M} \cdot \boldsymbol{\varepsilon}\mathbf{M},$$
(6.12)
and for the contribution to $\mathbf{A}$ from this to be consistent with (5.5)$_4$, (5.7)$_1$ must hold. One could quibble about the possibility that other nonlinear terms might cancel out the contribution from this for particular values of $\mathbf{M}$. However, to me, it makes plausible the idea that, generically, we will not get (5.5) unless (5.7)$_1$ is satisfied. From here on, I will ignore dependence on $\theta$. When TR does not apply, (6.10) does not require that $M_3 = 0$, suggesting that it might be possible to relax (5.6)$_1$ in this way, without vitiating (5.5). Actually, this is the case. To see this, first fix $\boldsymbol{\varepsilon}$ at a value consistent with (5.5)$_1$, take $\mathbf{M} = (0, 0, M_3)$, $\boldsymbol{\pi} = (0, 0, \pi_3)$ and consider the fact that
(6.13)
Now, for the same values of $\boldsymbol{\varepsilon}$ and $\mathbf{M}$, take $\boldsymbol{\pi} = (\pi_1, \pi_2, 0)$ and consider the invariance of
(6.14)
Evaluate this at $\boldsymbol{\pi} = \mathbf{0}$ and you easily get
$$\Phi_\alpha(\boldsymbol{\varepsilon}, \mathbf{0}, (0, 0, M_3)) = 0 \Rightarrow \boldsymbol{\Phi} = \mathbf{0}.$$
(6.15)
It is then a matter of showing that $\mathbf{A}$ has the right form. Represent components of $\boldsymbol{\varepsilon}$ by three groups of terms. One consists of terms invariant under $P^+(\mathbf{E}_a, \mathbf{P})$, which are
$$\varepsilon_{11} + \varepsilon_{22} \ \text{and} \ \varepsilon_{33}.$$
(6.16)
The others can be pictured as two vectors in the basal plane, vanishing when $\boldsymbol{\varepsilon}$ is consistent with (5.5)$_1$,
$$\mathbf{u} = (\varepsilon_{13}, \varepsilon_{23}), \qquad \mathbf{v} = (2\varepsilon_{12}, \varepsilon_{11} - \varepsilon_{22}).$$
(6.17)
By a routine calculation, we can verify that, under the lattice group element $L_2$, these do in fact transform as vectors. Under $L_1$, $\mathbf{v}$ transforms like $\boldsymbol{\pi}$, which gives the term multiplied by $A_{14}$ in (6.8). Of course, we need to verify that this is also invariant under $L_6$, which is easy. Fix the arguments in (6.16), take $\boldsymbol{\pi} = \mathbf{0}$, $\mathbf{M} = (0, 0, M_3)$,
$\mathbf{v} = \mathbf{0}$, and use the fact that $\varphi$ is invariant under $L_2$. By an argument much like that used to get (6.15), you find that, when $\mathbf{u} = \mathbf{0}$ also holds,
$$\frac{\partial\varphi}{\partial\varepsilon_{13}} = \frac{\partial\varphi}{\partial\varepsilon_{23}} = 0.$$
(6.18)
Treating $\mathbf{v}$ as $\mathbf{u}$ was treated, we get, for the same arguments, the equivalent of
$$\frac{\partial\varphi}{\partial\varepsilon_{12}} = 0, \qquad \frac{\partial\varphi}{\partial\varepsilon_{11}} = \frac{\partial\varphi}{\partial\varepsilon_{22}}.$$
(6.19)
For $\boldsymbol{\mathcal{M}}$, copy the argument used to get (6.15) and you find that (5.5) still holds when (5.7) is replaced by
$$\text{NO TR:} \quad \mathbf{M} = m\mathbf{k} \Rightarrow \boldsymbol{\mathcal{M}} = \mu\mathbf{k}.$$
(6.20)
For the symmetry type 25, we can use essentially the same arguments to get (5.10) still holds when (5.6) is replaced by
$$\text{NO TR:} \quad \mathbf{M} = m\mathbf{k} \Rightarrow \boldsymbol{\mathcal{M}} = \mu\mathbf{k}.$$
(6.21)
Again, for this, $\mathbf{M}$ should be of the form (5.7) when TR applies. I think it sensible to allow these modifications to the symmetry types. For type 14 with NO TR, it suffices to examine the variant described by (5.13). We can use (6.7) to infer that, for $\mathbf{A}$ to be of the form (5.16)$_4$, we need
$$M_1 = M_2 = 0.$$
(6.22)
Then, verify that
$$[2\varepsilon_{12}\pi_2 - (\varepsilon_{11} - \varepsilon_{22})\pi_1]M_3$$
(6.23)
is invariant under $L^+(\mathbf{E}_a, \mathbf{P})$ and you can use this to infer that, for $\boldsymbol{\Phi}$ to be of the form (5.16)$_3$,
$$M_3 = 0 \Rightarrow \mathbf{M} = \mathbf{0},$$
(6.24)
so (5.17) should hold. For this, we should find a reason to reject the possibility that $\varepsilon_{11} = \varepsilon_{22}$, which can be done. If TR applies, we can use (6.12) to verify that (5.18) should hold. Thus, for this symmetry type, we get just what is covered by (5.17)-(5.19). For types 7, 6 and 2, we can use the same terms in a similar way to argue that the previously stated restrictions on $\mathbf{M}$ should be respected. So, it is only for the most symmetric types 27 and 25 that we have reasons to relax the restrictions previously imposed. While the theory of crystal symmetry has other uses, it does play an important role in the selection of invariance groups for various kinds of constitutive equations. Even for the quadratic approximation considered here, it is important whether TR applies, this being what distinguishes materials having a paramagnetic phase from those having a piezomagnetic phase. This suggests that it would be useful to refine the symmetry classification provided by lattice groups, to include TR or NO TR distinctions. Intuitively, configurations fitting (5.21) and (5.22) could be viewed
as having different symmetries, and it seems clear that linear theories obtained by expanding about these take somewhat different forms. I have no strong prejudices about how we should deal with such possibilities, but think that this deserves some serious thought. Now, I will restrict my attention to theory ignoring magnetic effects. Ignore the statements about requirements on M in the descriptions of symmetry types and you get the descriptions of symmetry types for the simpler theory. For the Cauchy stress tensor t, you can use (2.20) as one way of representing it. Make the change of variables indicated in (3.15) with M deleted and you get the equivalent as $t = t^T = 2\rho L^{-T} A L^{-1}$.
(6.25)
For any of the symmetry types, evaluating this for L of the form (5.1) gives t of the same general form as A. For other values of L, transform this by the rotation occurring in the polar decomposition (3.19), which does not alter A. When magnetization is accounted for, calculation of self fields gives a contribution to the stress depending on the size and shape of the body considered, this being one of the things that makes it hard to make comparable statements about the form of this tensor.
7. Appendix
Herewith is a list of the lattice group elements for $L^+(e_a, p)$ for the center described by (4.1) and (4.2), and the group multiplication table for it. First, the group multiplication operation is $\{\bar m, \bar\alpha, \bar l\} \cdot \{m, \alpha, l\} = \{\bar m m, \bar\alpha\alpha, \bar\alpha l + \bar l m\}$.
(7.1)
This is inferred from the composition $e_a \to \bar e_a = m_a^b e_b \to \bar{\bar e}_a = \bar m_a^b \bar e_b = (\bar m m)_a^b e_b$ and the analog for the shift. The elements are
I
I 10 - 1 1 0 ,-1,111 1 1||
,
(7.3)
- 1 - 1 0 ,l,||10 0|||,
(7.4)
0 01
f 0 Rfcr/3=L2 = j
(7.2)
1 0
I o oi
)
]
f -1 0 0 ) R * L 3 = { 0 - 1 0 , - 1 , ||0 0 1|| [,
[ 0 01
J
(7.5)
f -1-10 ] 1 0 0 ,1,11-1-1 0||[, l4/3:L4= ( 0 0 1 ]
f o-io
I
I 001
J
(7.6)
R ^ / 3 : L 5 = I 1 1 0 ,-1,1110 1||},
R^:L6={
[ 1 0 -1-1
R^
+ 2 J
0
1 l
11° ° ~
,l,||00-l||},
(7.9)
,-1,111 1 0|| ' ,
(7.10)
I
f °l °
'i+^i 2i
R~y :U=
\
100
[-100 R^:L9= •
1 1 0 0 0-1
I , 1, H - 1 0 - 1 | | 1 , j
_i i + V3i [ - 1 - 1 0 R^2 + 2 J : L 1 0 = { 0 1 0
( 0 0 -1
^i+'i R*
2 +2J
(7.8)
]
II1 l ° :L7= • 0-1
(7.7)
1 , - 1 , ||10 0|[ } ,
I 0 0 -1
^i+'l 2
0 0
(7.11)
| , - 1 , ||0 0 0 | | [ ,
I
[ 0 - 1 0 1 : L n = ] - 1 0 0 ,1,11-1-1 - 1 | | [ , [ 0 0 -1 )
1: L n = {1, 1, 0}
(group identity).
(7.12)
(7.13)
(7.14)
Here, the rotation listed as the first item is the corresponding point-group element. Lattice groups do determine such point groups to within similarity transformations obtained using orthogonal transformations. They also determine space groups, but different and inequivalent lattice groups can correspond to one space group, the lattice groups distinguishing differences in symmetry not recognized by space groups or site-symmetry groups. Essentially, subgroups determine how the neighborhood gets decomposed into subsets, where configurations in one subset have the same symmetry, ignoring some of the subtleties discussed earlier, associated with including magnetization.
The group multiplication table is
       L1   L2   L3   L4   L5   L6   L7   L8   L9   L10  L11
L1     L2   L3   L4   L5   1    L7   L8   L9   L10  L11  L6
L2     L3   L4   L5   1    L1   L8   L9   L10  L11  L6   L7
L3     L4   L5   1    L1   L2   L9   L10  L11  L6   L7   L8
L4     L5   1    L1   L2   L3   L10  L11  L6   L7   L8   L9
L5     1    L1   L2   L3   L4   L11  L6   L7   L8   L9   L10
L6     L11  L10  L9   L8   L7   1    L5   L4   L3   L2   L1
L7     L6   L11  L10  L9   L8   L1   1    L5   L4   L3   L2
L8     L7   L6   L11  L10  L9   L2   L1   1    L5   L4   L3
L9     L8   L7   L6   L11  L10  L3   L2   L1   1    L5   L4
L10    L9   L8   L7   L6   L11  L4   L3   L2   L1   1    L5
L11    L10  L9   L8   L7   L6   L5   L4   L3   L2   L1   1
(7.15)
presented as a matrix, where the element in the $i$th row and $j$th column is the product $L_i \cdot L_j$, omitting the obvious products involving the identity element. The same table applies to the point group, and it is easier to use this to calculate the entries. As is clear from (7.1) and (7.14), the value of m for the inverse of an element with this value is $m^{-1}$ and they share the same value of $\alpha$. For any particular subgroup, it is routine to characterize the possible values of $e_a$, $e^a$, $p$, and $R$ associated with this symmetry. For each such element, take the m from the list, using $Re_a = m_a^b e_b$ to determine restrictions on the lengths of lattice vectors, etc. Readers might find helpful the discussion by FADDA & ZANZOTTO [3,11] for picturing how the configurations fit together.
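The composition rule (7.1) and the remark about inverses can be checked mechanically. What follows is only a minimal sketch, assuming each element is stored as a triple (m, alpha, l) with m an integer 3×3 matrix, alpha = ±1 and l an integer row vector. The function names are illustrative, the sample `L3` is the element for the rotation by π as best it can be read from the list, and `g` is a hypothetical element used only to exercise the formulas.

```python
import numpy as np

def compose(bar, elem):
    """Product {mbar, abar, lbar} . {m, a, l} = {mbar m, abar a, abar l + lbar m}, as in (7.1)."""
    m_bar, a_bar, l_bar = bar
    m, a, l = elem
    return (m_bar @ m, a_bar * a, a_bar * l + l_bar @ m)

def inverse(elem):
    """Inverse element: m -> m^{-1}, same alpha, shift chosen so the product is {1, 1, 0}."""
    m, a, l = elem
    # m is an integer matrix with determinant +-1, so its inverse is again integer.
    m_inv = np.rint(np.linalg.inv(m)).astype(int)
    return (m_inv, a, -a * (l @ m_inv))

def same(x, y):
    """Compare two triples entry by entry."""
    return np.array_equal(x[0], y[0]) and x[1] == y[1] and np.array_equal(x[2], y[2])

# Group identity {1, 1, 0}.
IDENTITY = (np.eye(3, dtype=int), 1, np.zeros(3, dtype=int))

# L3 as read from the list; treat the entries as an assumption.
L3 = (np.diag([-1, -1, 1]), -1, np.array([0, 0, 1]))

# Hypothetical element, not taken from the list.
g = (np.array([[0, 1, 0], [-1, -1, 0], [0, 0, 1]]), 1, np.array([1, 0, 0]))

print(same(compose(L3, L3), IDENTITY))          # a two-fold element squares to the identity
print(same(compose(inverse(g), g), IDENTITY))   # inverse shares alpha and uses m^{-1}
```

Storing l as a row vector makes the term $\bar l m$ a plain vector-matrix product, which is why `l_bar @ m` appears without a transpose.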
References
1. M. PITTERI: On (ν + 1)-lattices. Journal of Elasticity 15, 3-25 (1985).
2. J. L. ERICKSEN: Equilibrium theory for X-ray observations of crystals. Arch. Rational Mech. Anal. 139, 181-200 (1997).
3. G. FADDA & G. ZANZOTTO: Symmetry breaking in monoatomic 2-lattices. International Journal of Nonlinear Mechanics 36, 527-547 (2001).
4. J. L. ERICKSEN: On the theory of growth twins in quartz. To appear in Mathematics and Mechanics of Solids.
5. J. L. ERICKSEN: Twinning analyses in the X-ray theory. International Journal of Solids and Structures 38, 967-995 (2001).
6. J. L. ERICKSEN: On correlating two theories of twinning. Arch. Rational Mech. Anal. 153, 261-289 (2000).
7. J. L. ERICKSEN: Notes on the X-ray theory. Journal of Elasticity 55, 201-218 (1999).
8. J. L. ERICKSEN: On the theory of the α-β phase transition in quartz. Journal of Elasticity 63, 61-86 (2001).
9. M. PITTERI: Geometry and symmetry of multi-lattices. International Journal of Plasticity 14, 139-147 (1998).
10. J. L. ERICKSEN: On nonessential descriptions of crystal multilattices. Mathematics and Mechanics of Solids 4, 363-392 (1998).
11. G. FADDA & G. ZANZOTTO: On the arithmetic classification of crystal structures. To appear in Acta Crystallographica A.
12. J. L. ERICKSEN: Electromagnetic effects in thermoelastic materials. Mathematics and Mechanics of Solids 7, 165-189 (2002).
13. C. TRUESDELL & R. A. TOUPIN: The Classical Field Theories. Handbuch der Physik (ed. S. Flügge) III/1, 226-793 (1960).
14. E. T. WHITTAKER: A History of the Theories of Aether and Electricity (2 volumes), Thomas Nelson and Sons, London, 1953.
15. W. F. BROWN, JR.: Magnetoelastic interactions. In Springer Tracts in Natural Philosophy (ed. C. TRUESDELL), vol. 9, Springer-Verlag, New York, 1966.
16. W. F. BROWN, JR.: Micromagnetics. R. E. Krieger Publishing Co., Huntington, 1978.
17. C. DAVINI: A proposal for a continuum theory of defective crystals. Arch. Rational Mech. Anal. 96, 295-317 (1986).
18. I. FONSECA & G. PARRY: Equilibrium configurations of defective crystals. Arch. Rational Mech. Anal. 120, 245-283 (1992).
19. L. D. LANDAU & E. M. LIFSHITZ: Statistical Physics. Pergamon Press, Oxford, 1959.
20. C. TRUESDELL: A First Course in Rational Continuum Mechanics, Vol. 1. Academic Press, New York 1977.
21. J. L. ERICKSEN: On the symmetry of deformable crystals. Arch. Rational Mech. Anal. 72, 1-13 (1979).
22. P. TOLEDANO & V. DMITRIEV: Reconstructive Phase Transitions in Crystals and Quasicrystals. World Scientific, Singapore 1996.
23. W. F. CARR, JR.: Secondary effects in ferromagnetism. Handbuch der Physik (ed. S. FLÜGGE & H. P. J. WIJN), XVIII/2, 1966, pp. 274-340.
24. W. G. CADY: Piezoelectricity. McGraw-Hill, London 1946.

5378 Buckskin Bob Rd.
Florence OR 97439, USA

(Accepted November 1, 2001)
Published online September 4, 2002 - © Springer-Verlag (2002)
3. Defects The first paper in this series, "Volterra dislocations in nonlinearly elastic bodies," deals with the definition and some elementary features of a generalization to finite deformations of Volterra dislocations. The second, "Twinning of crystals," is an exposition of twinning theory, as I then understood it. One thing discussed in "Thermoelastic considerations for continuously dislocated crystals" is what could be regarded as an antecedent of the X-ray theory or one interpretation of theory sometimes used by physicists. I found fault with it and discussed reasons for believing that Gibbs would have disliked it. Also, I discussed thermoelasticity theory, including the difficulty of defining entropy when changes in distributions of dislocations occur, and some thoughts about calculating forces on dislocations. Further, I discuss defining thermodynamic potentials when continuous distributions occur, but are not really changed by deformation. In different ways the subsequent two papers, "Some surface defects in unstressed thermoelastic solids" and "Stable equilibrium configurations of elastic crystals", cover elementary theory of twinning based on nonlinear elasticity theory and illustrative examples. I then present in "On nonlinear elasticity theory for crystal defects" my thoughts on using nonlinear elasticity theory to treat dislocations in crystals. One thing discussed here is the possibility of having dislocations not of the Volterra or Somigliana kinds. Another is the desirability of better combining twinning theory with dislocation theory. Also, some common misconceptions are mentioned. One of the theories mentioned in the title of "On correlating two theories of twinning" is the X-ray theory that does not explicitly involve deformation. Workers concerned with deformation twins often encounter cases to which the Cauchy-Born hypothesis does not apply. 
Then, it is common to assume that it applies to lattice vectors for some nonessential description, then to use conventional twinning equations based on ideas about deformation, giving the second theory to which the title refers. For the X-ray theory, one can introduce a linear transformation taking lattice vectors for one of the twins to the other, using an essential description. This would be a possible value of the relative deformation gradient if the Cauchy-Born hypothesis held. What is done is to characterize the relation between this and the deformation gradient used in the other theory in terms of certain lattice invariant shears. "Twinning analyses in the X-ray theory" is an exposition of twinning theory, treating known examples of twins in α-uranium, for most of which the Cauchy-Born hypothesis does not apply. In the paper "Twinning theory for some Pitteri neighborhoods," for what I call a Pitteri neighborhood centered at a configuration of the monatomic hexagonal close-packed type, this being a configuration of maximal symmetry, all generically possible twins are
characterized, using the X-ray theory. This idea can be used to treat twins in crystals of lesser symmetry, as long as their configurations lie in the neighborhood. Twins occurring in hexagonal close-packed crystals cannot be included in any Pitteri neighborhood. Such neighborhoods are also called weak-transition neighborhoods or Ericksen-Pitteri neighborhoods. "On the X-ray theory of twinning" covers the basic X-ray theory of twinning, with new results on the solubility of twinning equations and some properties of solutions, when they exist. As defined by some experts, rotation twins are twins such that applying a two-, three-, four- or six-fold rotation to one of the twins gives the orientation of the other. Further, the rotation axis must either lie in the twinning plane or be perpendicular to it. I discuss these issues in the article "On the theory of rotation twins in crystal multilattices," and for the X-ray theory, all such solutions are characterized. As an example, a twinned configuration observed in staurolite is analyzed. A study of all of the well-established growth twins in quartz is given in "On the theory of growth twins in quartz." Most are successfully analyzed by use of the X-ray theory of twinning, the exceptions being the Zinnwald and Zwickau twins. These are unusual in that crystallographically inequivalent planes occur on the two sides of the discontinuity plane, making them more like grain boundaries. While different definitions of cyclic twins occur in the literature on mineralogy, a reasonable interpretation is that they are growth twins involving multiple twinning, according to the same twin law, with composition planes repeated at a constant angular interval. This is similar to the way oranges are divided into sections, although configurations involving sets of parallel planes are also included by some.
Accepting idealized descriptions of these in the literature, in the forthcoming paper "On the theory of cyclic growth twins," I use the X-ray theory to analyze some different kinds occurring in rutile, aragonite, marcasite and quartz. One conclusion is that, excepting cases where the interfaces are parallel planes, most if not all such twins are subject to residual stresses, but would not be if lattice parameters had values different from those observed. With the assumption that they were unstressed when created, this gives some information about environments occurring when they were formed. Typically, these are found in the field, the environment in which they grew being unknown. One unusual solution of the X-ray twinning solutions described in the forthcoming paper "Unusual solutions of twinning equations in the X-ray theory," involves a 180° rotation as the isometry, with a variety of possible zigzag interfaces. One could use these to approximate cylindrical surfaces with rulings parallel to the axis of rotation. Also, it is well-known that type I and type II twins satisfy twinning equations that can be satisfied by any set of lattice vectors. For the X-ray theory, all such solutions are characterized in this paper, there being four kinds. Extended summaries of these two papers are provided at the end of this chapter.
Analele Științifice ale Universității "Al. I. Cuza" din Iași, Tomul XXIII, S. I a, f. 2, 1977
VOLTERRA DISLOCATIONS IN NONLINEARLY ELASTIC BODIES BY J. L. ERICKSEN
1. Introduction. In his impressive memoir on dislocations, Volterra [1] presented concepts and analyses which he considered appropriate for isotropic, linearly elastic materials. Later, Somigliana [2,3] considered more general types of dislocations. Our purpose is to give a definition and local characterization of Volterra dislocations for nonlinearly elastic materials. Unfortunately, approximations accepted in linear theory tend to obscure the ideas involved, so the generalization could, perhaps, be made in different ways. Very roughly, our basic notion is that Volterra dislocations are those which leave a smooth body smooth. It turns out that this includes dislocations of a more general nature than those considered by Volterra, although less general than considered by Somigliana. Ideas much like those used here are rather commonly employed by those experts who deal with imperfections in various materials. They also introduce other restrictions based on ideas of microstructure, which are outside the realm of elasticity theory. Here, I use only ideas which are naturally associated with elasticity theory itself. Thus, considerations of microstructure can well single out subsets of our Volterra dislocations.
2. Preliminaries. It is perhaps simplest to think of beginning with one homogeneous parent body P, in a homogeneous reference configuration, referred to rectangular Cartesian material coordinates $X^A$. In various ways, we can cut out and glue together parts, to form other bodies containing dislocations. Normally, smooth deformations of P are described by invertible mappings of the form
(2.1) $x^i = x^i(X^A)$ or $x = x(X)$,
with deformation gradients F given by
(2.2) $F^i{}_A = \partial x^i / \partial X^A$,
where x refers to the rectangular Cartesian spatial coordinates. In places, it will be easier to use the inverses
(2.3) $X = X(x)$, $G = F^{-1}$.
For P, we will require a constitutive equation for the (symmetric) Cauchy stress tensor t, of the form
(2.4) $t = \bar t(G)$.
We require that $\bar t$ be analytic for $G \in A$, where A is some domain. We are only interested in realizable deformations, which should satisfy
(2.5) $\det G > 0$,
as well as being in the domain A. For the most part, we could get by with much weaker continuity assumptions. We do employ a standard existence theorem which presumes analyticity; perhaps this is only a fault of the proof. For constitutive equations which we have seen used in practice, the assumption is satisfied, for some choice of A. Whatever material symmetry P may have is described by unimodular matrices H which, under matrix multiplication, form a group $\mathcal{G}$,
(2.6) $\det H = 1$, $H \in \mathcal{G}$.
The constitutive function $\bar t$ must then satisfy
(2.7) $\bar t(HG) = \bar t(G)$, $\forall H \in \mathcal{G}$, $G \in A$.
We have some need to refer to the equations of equilibrium, with zero body force, viz.
(2.8) $\nabla \cdot t = 0$,
or, in component form,
(2.9) $t_{ij,j} = 0$.
In practice, t is more often considered as a function of F but, with (2.5) it is a simple matter to convert such to the form (2.4). This summarizes as much of the elementary nonlinear elasticity theory as we require. Let us take a local view of neighborhoods of two different material surfaces $S^+$ and $S^-$ in P, which are to be moved and glued together to form a single material surface S. In a natural way, this sets up a mapping $S^+ \leftrightarrow S^-$; two particles correspond if they will meet on S. Describing the correspondence by letting corresponding particles have the same surface coordinates $u^\alpha$, which will also serve for S, we write parametric descriptions for $S^+$ and $S^-$, respectively, as
(2.10) $X = X^+(u^\alpha)$, $X = X^-(u^\alpha)$.
Of course, necessary cutting of P is envisaged, to leave $S^+$ and $S^-$ as parts of the boundaries of two domains $D^+$ and $D^-$, respectively. In principle,
$D^+$ and $D^-$ could form a single domain but, for a local analysis, we presume them disjoint,
(2.11) $D^+ \cap D^- = \emptyset$.
Generally, we are interested in deformations such that (2.1) maps $D^+$ and $D^-$ smoothly onto domains $d^+$ and $d^-$, with $d^+$ and $d^-$ having only boundary points in common, and with S included in this common set. The bodies could touch where they are not glued, so other points could be in this intersection. Of course, $d^+ \cup d^-$ is to be the domain, $D^+ \cup D^-$ the range of the inverse map (2.3)$_1$. Now consider one fixed mapping of the type indicated, letting bars denote quantities associated with it; then $\bar G$ is to be smooth in $d^+$ and $d^-$, approaching smooth limits $\bar G^+$ and $\bar G^-$ as we approach $S \subset d^+ \cap d^-$ from the interior of $d^+$ and $d^-$, respectively. With the mapping described in the form (2.1), we have implied that S is given parametrically by
(2.12) $\bar x = \bar x(u^\alpha) = \bar x[X^+(u^\alpha)] = \bar x[X^-(u^\alpha)]$,
with the obvious analog for the inverted map. With (2.1), (2.12) and the chain rule, we have
(2.13) $\bar x_{,\alpha} = (\bar G^{-1})^+ X^+_{,\alpha} = (\bar G^{-1})^- X^-_{,\alpha}$
or, equivalently,
(2.14) $X^+_{,\alpha} = \bar G^+ \bar x_{,\alpha}$, $X^-_{,\alpha} = \bar G^- \bar x_{,\alpha}$.
Since this mapping is fixed, the $\bar x$ can serve as material coordinates; we can use the glued-up body as a "reference configuration". Subsequent deformations can be described as before, or by mappings of the form
(2.15) $x = x(\bar x)$, $\bar x = \bar x(x)$.
Let $\hat G$ be defined, using (2.15), as
(2.16) $\hat G = \partial \bar x / \partial x$,
so, by the chain rule
(2.17) $G = \bar G \hat G$.
We can then calculate $\hat t$, a new constitutive equation for t, by the rule
(2.18) $\hat t(\hat G, \bar x) = \bar t(\bar G \hat G)$.
As a function of $\hat G$, $\hat t$ will have a domain $\hat A$ simply related to A, as indicated by (2.17). As a function of $\bar x$, its domain will be $d^+ \cup d^-$. Suppose, for the moment, that $\hat G(\bar x)$ is continuous on S. Then, approaching S from the two sides gives the limits
(2.19) $\hat t^+(\hat G, \bar x) = \bar t(\bar G^+ \hat G)$, $\hat t^-(\hat G, \bar x) = \bar t(\bar G^- \hat G)$,
so, in general, the new constitutive equation is not continuous. If it is not, one can hardly expect (2.18) to deliver solutions with $\hat G$ continuous on S. There is some possibility of defining weak solutions of (2.8), supplementing it with physically appropriate jump conditions, but this is not our aim. Rather, we wish to single out the special cases where, for all $G \in A$, $\bar G$ is such that $\hat t$ is a continuous function of $\hat G \in \hat A$. A fixed deformation $\bar x = \bar x(X)$, accomplishing this locally, will be called a Volterra dislocation solution. Linear theory presumes all deformations so slight that $\hat t$ can be taken as the same function as $\bar t$. Clearly, this obscures the issue. It is then more in spirit than in detail that our Volterra dislocations are like what Volterra called "distorsions". Volterra's work on isotropic materials has of course been adapted to crystals, etc. Taking into account how this has been done, our definition seems apt, and Somigliana's name might well be attached to those which produce a discontinuous $\hat t$.
3. Local Conditions. From (2.5), $\bar G$ has positive determinant, so there is a unique matrix K, with $\det K > 0$, defined on S, such that
(3.1) $\bar G^- = K \bar G^+$.
It is then easily seen that the requirement that $\hat t$ be continuous translates into the requirement that
(3.2) $\bar t(KG) = \bar t(G)$, $\forall G \in A$.
From (2.7), this is certainly satisfied if, at each point of S,
(3.3) $K \in \mathcal{G}$.
Some writers take $\mathcal{G}$ to be the maximal group such that (2.7) holds, in which case (3.3) must hold. Others would, we think, concede that (3.3) includes the more likely cases. Thus, depending on one's attitude toward $\mathcal{G}$, (3.3) is either necessary, or close to it, so we presume it applies. It will be shown that (3.1) and (3.3) do characterize the Volterra dislocations, locally. Some consequences follow easily. From (2.14) and (3.1),
(3.4) $X^-_{,\alpha} = K X^+_{,\alpha}$,
and K should be a smooth function of $u^\alpha$. If $\mathcal{G}$ is a discrete group, as is the case for crystals, K must be constant, and
(3.5) $K = \text{const.} \Rightarrow X^- = K X^+ + \text{const.}$
describes the possible relations between the surfaces $S^-$ and $S^+$. If, as is the case for isotropic or transversely isotropic materials, $\mathcal{G}$ admits a continuous subgroup, then K might vary smoothly with position. From the obvious integrability conditions for (3.4), the choices of $S^+$ and $K(u^\alpha)$ are limited by
(3.6) $K_{,\beta} X^+_{,\alpha} = K_{,\alpha} X^+_{,\beta}$.
For an isotropic material, $\mathcal{G}$ is the group of rotations. With K a rotation, (3.4) implies that the surface metric tensors $g^\pm_{\alpha\beta}$ of $S^\pm$ satisfy
(3.7) $g^-_{\alpha\beta} = X^-_{,\alpha} \cdot X^-_{,\beta} = X^+_{,\alpha} \cdot X^+_{,\beta} = g^+_{\alpha\beta}$.
Thus, whatever is $S^+$, $S^-$ must be obtainable from it by a mapping which preserves all distances, and common textbooks on differential geometry deal with the theory of surfaces so related. For example, if $S^+$ is plane, $S^-$ must be a developable surface. Suppose, conversely, that we are given any pair of surfaces $S^+$ and $S^-$, related so that (3.7) holds. Introduce their unit normals $N^\pm$, oriented in accordance with the usual convention that
(3.8) $X^\pm_{,1} \wedge X^\pm_{,2} \cdot N^\pm > 0$.
It then follows easily that there exists a unique rotation $K(u^\alpha)$ such that
(3.9) $X^-_{,\alpha} = K X^+_{,\alpha}$, $N^- = K N^+$,
so (3.4) holds. If, say, $S^+$ is part of a plane, $S^-$ of a right circular cylinder, then K is not constant. If we labor over this, it is because, in the linear theory of Volterra dislocations, Volterra and others have argued that the infinitesimal equivalent of K must be constant. For reasons mentioned earlier, the assumptions made are not precisely equivalent, and we here see that our assumptions permit more general types of dislocations; Volterra's "distorsions" are in the subset indicated by (3.5), the still smaller subset which can be treated with linear theory.
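The plane-to-cylinder example can be made concrete numerically. The sketch below is illustrative only: the parametrizations, the radius `R` and all function names are assumptions introduced here, not taken from the paper. It takes $S^+$ as part of the plane $X^+ = (u^1, u^2, 0)$ and $S^-$ as part of a cylinder of radius R with $X^- = (R\sin(u^1/R), u^2, R(1 - \cos(u^1/R)))$, builds K from the two tangent frames as in (3.9), and lets one see that K is a rotation which changes with $u^1$ while the surface metrics agree as in (3.7).

```python
import numpy as np

R = 2.0  # hypothetical cylinder radius

def tangents_plane(u1, u2):
    """Tangent vectors of the plane X+ = (u1, u2, 0)."""
    return np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])

def tangents_cylinder(u1, u2):
    """Tangent vectors of X- = (R sin(u1/R), u2, R (1 - cos(u1/R)))."""
    theta = u1 / R
    return np.array([np.cos(theta), 0.0, np.sin(theta)]), np.array([0.0, 1.0, 0.0])

def metric(t1, t2):
    """Surface metric tensor built from the two tangent vectors, cf. (3.7)."""
    return np.array([[t1 @ t1, t1 @ t2], [t2 @ t1, t2 @ t2]])

def frame(t1, t2):
    """Columns: the two tangents and the unit normal oriented as in (3.8)."""
    n = np.cross(t1, t2)
    n = n / np.linalg.norm(n)
    return np.column_stack([t1, t2, n])

def K_at(u1, u2):
    """Matrix taking the frame of S+ to the frame of S-, cf. (3.9)."""
    frame_plus = frame(*tangents_plane(u1, u2))
    frame_minus = frame(*tangents_cylinder(u1, u2))
    return frame_minus @ np.linalg.inv(frame_plus)

K0 = K_at(0.0, 0.0)
K1 = K_at(1.0, 0.3)
print(np.allclose(K1.T @ K1, np.eye(3)), np.linalg.det(K1))  # orthogonality and determinant
print(np.allclose(K0, K1))                                   # whether K is constant
```

Because the tangent frame of the plane is the identity matrix here, K is simply the frame of the cylinder, a rotation about the $u^2$ direction through the angle $u^1/R$.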
For crystals, most writers have assumed that $\mathcal{G}$ is one of the crystallographic groups, which are subgroups of the orthogonal group. Ericksen [4] points out that, while this is sound for the theory of infinitesimal deformations, it is less apt for finite deformations. For the latter, $\mathcal{G}$ should also include elements representing certain types of shearing deformations. The larger group is still discrete, so (3.5) applies. The bulk of the literature on dislocations in crystals deals only with the subcase where K = 1. A sketch of a deformation applying to a case where K ≠ 1 is given in Toupin's discussion of a "dislocated crystal with 180° twist", sometimes called a Moebius crystal. Here, K ≠ 1, but it is still an element of a crystallographic group. Cases where K represents one of the aforementioned shears seem not to have been studied. In discussing screw dislocations, Wesolowski and Seeger [5] do suggest that some such study might be feasible, if we correctly interpret their closing remarks.
4. Symmetry. Of course, the glued-up body inherits some material symmetry from the parent P and, for a Volterra dislocation, there is no disruption of the apparent symmetry on the dislocation surface S. From (2.7) and (3.2),
(4.1) $\hat t(\hat G, \bar x) = \bar t(H \bar G \hat G) = \bar t(\bar G \hat H \hat G) = \hat t(\hat H \hat G, \bar x)$,
where
(4.2) $\hat H = \bar G^{-1} H \bar G$.
Thus, as H ranges over $\mathcal{G}$, $\hat H$ ranges over a conjugate group $\hat{\mathcal{G}}$; the usual change in apparent symmetry accompanying a change of reference configuration. If we fix H = const., and approach S from the two sides, we get the limits
(4.3) $\hat H^+ = (\bar G^{-1})^+ H \bar G^+$, $\hat H^- = (\bar G^{-1})^- H \bar G^- = (\bar G^{-1})^+ K^{-1} H K \bar G^+$,
where we have used (3.1). Thus, in general, $\hat H^+ \neq \hat H^-$. However, from (3.3),
(4.4) $K^{-1} H K \in \mathcal{G}$,
so $\hat H^+$ and $\hat H^-$ range over the same set of matrices, or
(4.5) $\hat{\mathcal{G}}^+ = \hat{\mathcal{G}}^-$.
In this sense also, then, the dislocated body is smooth.
5. Sufficiency. One might like to have the fixed deformation describe an equilibrium configuration, corresponding to a solution of (2.8). One might think that this would give rise to additional local restrictions on Volterra dislocations, or that we have overlooked some which apply more generally. A construction indicates that we have in hand all relevant local conditions. It is here where we need the aforementioned analyticity assumptions on $\bar t$, etc. Suppose we are given any particular solution of (2.8), with deformation
(5.1) $X = X(x)$
analytic and invertible in some domain d. For a local analysis, we can replace d by any convenient subdomain, say a small ball, to eliminate singularities, topological complications, etc. Now pick any analytic surface $\Sigma$ which slices d in two subdomains $d^+$ and $d^-$. In $d^+$, we take the fixed deformation to be the restriction of (5.1) to $d^+$, so that
(5.2) $\bar G = G$ in $d^+$,
etc., and we take $S = \Sigma$, to be given parametrically by
(5.3) $\bar x = \bar x(u^\alpha)$,
analytic in $u^\alpha$. The range of this restriction of (5.1) will give us $D^+$, and $S^+$ will be the image of S. By the chain rule, its parametric description, $X^+(u^\alpha)$, will satisfy
(5.4) $X^+_{,\alpha} = \bar G^+ \bar x_{,\alpha}$,
in accord with (2.13). The problem is then to construct a deformation in $d^-$, or a suitable part near S, which will give us a Volterra dislocation solution in the corresponding part of d. We now require some analytic surface $S^-$, related to $S^+$ in a manner compatible with (3.3) and (3.4); any such choice will do. This involves some choice of $K(u^\alpha)$, so we assume it known. Thus, we can use (3.1) to calculate $\bar G^-$ on S, and it will be analytic. With (3.2), $\bar G^-$ will be in the domain A where $\bar t$ is analytic if $\bar G^+$ is; we have
(5.5) $\bar G^- = K \bar G^+$ on S, $K \in \mathcal{G}$.
Referring to the fixed deformation in $d^-$, we know that its map, $X = \bar X(\bar x)$, must take on the known values $X = X^-$ on S, and that its
first derivatives on S must be given by (5.5). Further, in $d^-$, (2.8) should be satisfied, with t given by the analytic constitutive equation $\bar t(G)$. This is a standard Cauchy problem for these equations. If S is not a characteristic surface, the Cauchy-Kowalewski theorem assures us of a solution in part of $d^-$, near S, and we have our dislocation solution. It may happen that the image $D^-$ of $d^-$ does not satisfy (2.11), but one can arrange this by replacing X by X + const. This construction makes quite clear in what sense (3.1) and (3.3) characterize the Volterra dislocations, locally. Some would go so far as to say that $\bar t$ should be so limited that the equilibrium equations never admit real characteristics, eliminating the exceptional cases. We think that this goes too far. However, dislocations wherein S is characteristic might well exhibit some unusual pathology, so it might be wise to put them in a separate category.
6. Remarks. For constrained materials, it is easy to adapt the earlier analyses but, in general, it is at least tricky to adapt the analysis of § 5. For nonlinearly elastic materials, the simplest example of a Volterra dislocation solution is for a material subject to the constraint of incompressibility, that discussed by Truesdell and Noll [6, p. 195]. As they note, such solutions provide examples of bodies which are materially uniform. Our analyses suggest some new methods for constructing these. Superficially, it would appear that, by going to the limit of infinitesimal deformations, we would get Volterra dislocations more general than those considered by Volterra. To settle this, one needs to look very carefully at what is required for the linearization to make sense. We have only taken a quick look at this question. Indications are that Volterra probably did include all of those which qualify as infinitesimal.
Acknowledgement. This work was supported by a grant from the U.S. National Science Foundation.
REFERENCES
1. Volterra V. - Sur l'équilibre des corps élastiques, Ann. Éc. Norm. 24, 401-517 (1907).
2. Somigliana C. - Sulla teoria delle distorsioni elastiche, Nota I, Rend. Accad. Lincei 23, 463-472 (1914).
3. Somigliana C. - Sulla teoria delle distorsioni elastiche, Nota II, Rend. Accad. Lincei 24, 655-666 (1915).
4. Ericksen J. L. - Special topics in elastostatics, to appear in "Advances in Applied Mechanics", (ed. C.-S. Yih), Academic Press, New York, 1977.
5. Wesolowski Z. and Seeger A. - On the screw dislocation in finite elasticity theory, in "Mechanics of Generalized Continua", (ed. E. Kröner), pp. 295-297, Springer-Verlag, New York, 1968.
6. Truesdell C. and Noll W. - The non-linear field theories of mechanics, in Handbuch der Physik, (ed. S. Flügge), pp. 1-591, Springer-Verlag, Berlin-Heidelberg-New York, 1965.
Received 21.V.1977
The Johns Hopkins University, Baltimore, Maryland, U.S.A.
Reprinted from Metastability and Incompletely Posed Problems, edited by S. Antman, J. L. Ericksen, D. Kinderlehrer, and I. Müller. © 1987 Springer-Verlag New York, Inc.

TWINNING OF CRYSTALS (I)

J. L. Ericksen
Departments of Aerospace Engineering and Mechanics and School of Mathematics, University of Minnesota

Abstract: Twinning is a kind of defect, commonly observed in crystalline solids. We explore some implications of this, with respect to the theory of crystal elasticity.

1. Introduction
When the crystallographer uses mathematics, it is more likely to be elementary algebra, geometry or group theory than sophisticated analysis. However, some of the things he or the metallurgist observes and describes can be viewed as realizations of solutions of partial differential equations of a rather unusual kind, involving deep analytical difficulties. Twinning and like phenomena are included among the defects commonly seen in crystalline solids. Nonlinear elasticity theory is being used, with some success, to analyze such phenomena. In a very formal sense, the general equations are old and well-known, Euler-Lagrange equations of a somewhat special kind, there being a great number of studies of special cases. Yet the simplest considerations of these phenomena have made clear that we need to revise our thinking about the subject. My purpose is to elaborate these vague remarks.
2. Elastic Crystals

Here, we ponder what might be considered to be the simplest possible kind of problem in elasticity theory. Consider a homogeneous elastic body, with a shape as simple as you like, and seek to determine how it can be in stable equilibrium, when it is subject to no forces or kinematic constraints. Recall that such theories involve a fixed configuration of a body as a reference, all others being related to this by deformations (mappings), usually fairly smooth and one-to-one. If F denotes the gradient of the map from the reference to another configuration, we generally have

$\det F > 0$, (2.1)

implying that

$C = C^T = F^T F$ is positive-definite. (2.2)

If W denotes the strain energy per unit reference volume, we have a constitutive equation of the form

$W = W(C) = W(F)$. (2.3)

That W depends on F, in the combination C, reflects the fact that we don't expect the energy to change, if we merely subject the material to a static translation and rotation, motions which don't affect C. Tacitly, it is understood that W will also depend on temperature, commonly thought of as a control parameter.
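The remark that a superposed rotation does not affect C can be checked numerically. The following is a minimal sketch, not part of the original argument: the particular deformation gradient `F`, the rotation angle and all function names are hypothetical choices made only for illustration.

```python
import math

def transpose(M):
    """Transpose of a 3x3 matrix stored as nested lists."""
    return [[M[j][i] for j in range(3)] for i in range(3)]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def right_cauchy_green(F):
    """Return C = F^T F for a 3x3 deformation gradient F, as in (2.2)."""
    return matmul(transpose(F), F)

def rotation_about_z(angle):
    """Proper rotation through `angle` radians about the third axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical deformation gradient with positive determinant, cf. (2.1).
F = [[1.1, 0.2, 0.0],
     [0.0, 0.9, 0.1],
     [0.0, 0.0, 1.05]]
Q = rotation_about_z(0.7)

C = right_cauchy_green(F)
C_rotated = right_cauchy_green(matmul(Q, F))  # C computed from Q F

# Largest entrywise difference between the two tensors.
max_difference = max(abs(C[i][j] - C_rotated[i][j])
                     for i in range(3) for j in range(3))
```

Up to rounding, `max_difference` vanishes, which is the statement that any W depending on F only through C is unchanged by the rotation.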
I will keep the discussion fairly informal. Consider W to be as smooth as you wish, defined on the entire domain indicated by (2.2). Generally, we know that the function W depends on the material and the choice of reference configuration. By conventional reasoning, in the absence of any constraints on the deformation, and with no forces applied, the stable equilibria should be minimizers of the total energy

$E = \int_B W$, (2.4)

B denoting the region occupied by the body in the reference configuration.
As the usual formal conditions for extremals, we have the Euler-Lagrange equations and natural boundary conditions

$\nabla \cdot \dfrac{\partial W}{\partial F} = 0$, $\dfrac{\partial W}{\partial F}\, N = 0$, (2.5)

where N is normal to $\partial B$. For other static problems, experience indicates that we are likely to be dealing with (2.5), with different boundary or other side conditions, or related minimization problems. Actually, it is easier to note that E will be minimized if W is, so the common practice is to look for a symmetric positive definite tensor $C_1$ such that

$W(C) \ge W(C_1)$. (2.6)

Given such a constant $C_1$, we can find a constant $F = F_1$ satisfying

$F_1^T F_1 = C_1$, $\det F_1 > 0$, (2.7)

and integrate this gradient to get a mapping giving a minimizer. Given $C_1$, the mapping is determined to within a translation and rotation, so we have a nice smooth minimizer, a trivial solution of (2.5). One might question the tacit assumption that W attains its minimum but, really, I don't find fault with the reasoning, as far as it goes. Rather, it is the possibility that W might have
more than one such minimizer, and questions related to this, which we will consider.

So far, we have said nothing about material symmetry, which certainly has some relevance to the crystals, in particular. General theory for this introduces an invariance condition of the type

$W(H^T C H) = W(C)$, $H \in G$, (2.8)

the consensus of opinion being that G should contain only H such that $\det H = \pm 1$. Also, since $H = -1$ satisfies (2.8) for any W, there is no real loss of generality in assuming that

$\det H = 1$. (2.9)

Then, if $C_1$ is a minimizer, so is $H^T C_1 H$, for any $H \in G$. In almost all expositions of elasticity theory, it is assumed, often tacitly, that, for solids,

$H^T C_1 H = C_1 \quad \forall\, H \in G$, (2.10)
and, clearly, this must hold, if $C_1$ is to be the only minimizer of W. Automatically, this forces G to be a compact group, conjugate to some subgroup of the orthogonal group. Then, the minimizer becomes the obvious choice for the reference, and using it gives $C_1 = 1$, (2.9) and (2.10) then implying that G is a subgroup of the orthogonal group. All this is compatible with much successful experience with linear elasticity theory. Essentially, Noll [1] relied on such experience, in defining solids by the condition that the maximal G leaving W invariant is such a group. Certainly, I accepted the basic line of thought for a time and, no doubt, some still do.

If one accepts it, it remains possible that W has other minimizers, not so related by symmetry. As a physicist might argue this, this could well occur at a particular temperature. Change the temperature a little and, because there is no reason for the energies to stay equal, it is almost certain that they won't, so one will take over as the minimizer. So, for most values of the temperature, all minimizers are related by symmetry. The other configuration might continue to have some status as a relative minimizer of W, perhaps be stable enough to be observed. One might even see both, coexisting in the same body as a metastable configuration. Such things do happen frequently. Austenite and martensite are names long used to describe more and less symmetric phases occurring together in common steels, with the martensite twinned, both phases containing other defects. One needs a microscope to discern this structure. In some other alloys, the metallurgist sees simpler morphologies, cleaner arrangements of austenitic and twinned martensitic phases. So, very commonly in crystalline solids, one sees configurations which must be somewhat stable, since we see them persist for long periods of time, but can hardly be more than metastable. Usually, they are not so smooth. Clearly, it is not so easy to give a precise definition of relative minima of the energy which covers these. Some thoughts concerning such questions can be found in the work of James [2,3].

Some of these ideas about elasticity theory need to be revised to accommodate phenomena like twinning.
To abstract some important features of such phenomena, we need to have at least two symmetry-related minimizers, so we must give up (2.10), and replace it by the statement that there is at least one $H \in G$ such that

$C_2 = H^T C_1 H \neq C_1$. (2.11)

It is then elementary that, if $F_1$ and $F_2$ are any two values of F corresponding to $C_1$ and $C_2$, as indicated by (2.1) and (2.2), there must be some rotation

R, $R^{-1} = R^T$, $\det R = 1$, (2.12)

such that

$F_2 = R F_1 H$. (2.13)
Actually, we want more, that two such configurations be able to coexist in the same body, coherently. That is, we want there to be two adjacent regions, with F taking the value $F_1$ in one, $F_2$ in the other, with F the gradient of a continuous deformation. If you like, when those interested in crystals speak of twins, they mean such Siamese twins. At the surface where the parts join, the assumption of continuity implies the well-known conditions of compatibility,

$F_2 = (1 + a \otimes n) F_1$, (2.14)

where n is the unit normal to the interface and a is the so-called amplitude vector, a measure of the size of the jump in F. Eliminating $F_2$ between (2.13) and (2.14), we get

$R F_1 H = (1 + a \otimes n) F_1$. (2.15)

With $F_1$ and $F_2$ constants, a and n must be constant, so the interface is a plane, or part thereof, and the interfaces seen in many crystals do conform pretty well to this. When these seem to be distorted a bit, I would interpret this as evidence of residual stresses, other configurations which might well be interpretable as extremals of E, stable enough to be observed.

Now, let us try to adhere to convention, by selecting the first configuration as a reference, taking

$F_1 = 1 \Rightarrow RH = 1 + a \otimes n$. (2.16)

Then, the conventional choice of G for a crystal would take this to be the crystallographic point group associated with this configuration, which makes H orthogonal. Elementary analysis then shows that (2.16) can hold only if

$R = H^T$, $a = 0$, (2.17)

and this won't do. So, either we give up the attempt to use elasticity theory, or we revise some of our ideas concerning G. What has happened, as I see it, is that the indicated conventional estimate of G is faulty; it is good enough for classical linear theory, but such theory can't cope with the phenomena being considered.
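The step from (2.16) to (2.17) is described only as elementary. One way to fill it in is sketched below, assuming $\det R = \det H = 1$ as in (2.9) and (2.12), H orthogonal, and n a unit vector. The symbols Q, v and $\lambda$ are introduced only for this calculation.

```latex
\begin{align*}
Q &:= RH = 1 + a \otimes n, \qquad Q^{T} Q = 1
   \quad \text{(since $R$ and $H$ are orthogonal)},\\
Q^{T} Q - 1 &= n \otimes a + a \otimes n + (a \cdot a)\, n \otimes n = 0 .
\end{align*}
Applying the last identity to any vector $v$ with $v \cdot n = 0$ gives
$(a \cdot v)\, n = 0$, so $a = \lambda n$ for some scalar $\lambda$. Then
\begin{align*}
(2\lambda + \lambda^{2})\, n \otimes n = 0
\quad \Longrightarrow \quad \lambda = 0 \ \text{ or } \ \lambda = -2 .
\end{align*}
The choice $\lambda = -2$ gives $Q = 1 - 2\, n \otimes n$, a reflection with
$\det Q = -1$, contradicting $\det Q = \det R \, \det H = 1$. Hence $a = 0$,
$RH = 1$ and $R = H^{-1} = H^{T}$, which is (2.17).
```

With $a = 0$ there is no jump in F across the interface, which is why an orthogonal H cannot describe a twin.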
3. Molecular Considerations

To clear the air a bit, we first consider a molecular picture of the simplest kinds of crystal configurations, identical atoms arranged on what is sometimes called a simple or a Bravais lattice, filling all of space. Take three constant linearly independent vectors $a_K$ (K = 1,2,3). Put identical atoms at all of the points with position vectors of the form

$\sum_{K=1}^{3} n^K a_K$, (3.1)

where the $n^K$ are integers, positive or negative. In common jargon, the $a_K$ are lattice vectors for the configuration, characterizing the periodic structure demanded by the crystallographer's definition of crystals. As a rather simple special case, suppose that

$a_K \cdot a_L = 0$, $K \neq L$, (3.2)

and

$|a_1| \neq |a_2|$. (3.3)

Consider the possibility that such a configuration corresponds to our first minimizer, for some choice of the function W, and that we take it as the reference. If you like, use the rectangular Cartesian coordinates, with axes parallel to the $a_K$, to represent F etc. as matrices. Obviously, a 180° rotation with $a_K$ as axis takes the infinite set of physically indistinguishable points to itself. Orthogonal transformations of this kind form the point group associated with the configuration, the conventional estimate of G. Of course, this reflects the notion that, had someone so rearranged the identical atoms, unbeknownst to us, we could design no experiment to detect this.

We now construct a different kind of rearrangement which, by the same kind of reasoning, should belong to G. Consider a simple shearing deformation, a homogeneous deformation with gradient

$F = S = 1 + \bar{a} \otimes \bar{n}$, $\bar{a} \cdot \bar{n} = 0$, (3.4)

wherein

$\bar{a} = \alpha (a_1 + a_2)$, (3.5)

$\bar{n} = a_3 \wedge (a_1 + a_2) / |a_3 \wedge (a_1 + a_2)|$, (3.6)

$\alpha$ being a scalar. Let

$\bar{a}_K = F a_K$ (3.7)

be the lattice vectors describing positions of atoms in the new configuration. By sketching a picture of this, you can easily see that if you choose $\alpha$ correctly, the vectors $\bar{a}_K$ will also be orthogonal, with

$|\bar{a}_1| = |a_2|$, $|\bar{a}_2| = |a_1|$, $|\bar{a}_3| = |a_3|$, (3.8)

so use this value of $\alpha$. Further, the sketch then indicates that if you subject this deformed configuration to a 180° rotation of the form

$R_1 = -1 + 2\, \bar{n} \otimes \bar{n}$, $R_1^2 = 1$, (3.9)

you can make the set of atomic positions come back to what it was originally. Individual atoms do not return to their starting positions, but we have accepted the view that this doesn't matter. So we should have that

$H = R_1 S \in G$, (3.10)

and it is easy to verify that

$H^2 = 1$. (3.11)
Next consider (2.16), which now reads

$RH = R R_1 S = R R_1 (1 + \bar{a} \otimes \bar{n}) = 1 + a \otimes n$. (3.12)

Clearly, we can satisfy this by taking

$R = R_1^T = R_1$, $a = \bar{a}$, $n = \bar{n}$, (3.13)

and use this to construct an example of a pair of coexisting configurations of the kind discussed earlier.
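The construction in (3.4)-(3.11) can be tried out numerically. The sketch below makes several assumptions that are not in the text: the lattice vectors are taken along the coordinate axes with hypothetical lengths `p`, `q`, `r`; the scalar in (3.5) is given the closed form $\alpha = (p^2 - q^2)/(p q \sqrt{p^2 + q^2})$, which is my own solution of the conditions (3.8); and the 180° rotation in (3.9) is taken about the unit vector of (3.6). All function names are illustrative.

```python
import math

def matvec(M, v):
    """Apply a 3x3 matrix to a vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def matmul(A, B):
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def identity_plus_dyad(c_identity, u, v, c_dyad):
    """Return c_identity * 1 + c_dyad * (outer product of u and v)."""
    return [[c_identity * (1.0 if i == j else 0.0) + c_dyad * u[i] * v[j]
             for j in range(3)] for i in range(3)]

def twin_element(p, q):
    """Build S, R1 and H = R1 S for a1 = (p,0,0), a2 = (0,q,0), a3 along z."""
    s = math.hypot(p, q)
    # Assumed closed form for the scalar in (3.5), chosen so that (3.8) holds.
    alpha = (p * p - q * q) / (p * q * s)
    a_bar = [alpha * p, alpha * q, 0.0]          # (3.5)
    n_bar = [-q / s, p / s, 0.0]                 # (3.6), direction of a3 ^ (a1 + a2)
    S = identity_plus_dyad(1.0, a_bar, n_bar, 1.0)    # simple shear (3.4)
    R1 = identity_plus_dyad(-1.0, n_bar, n_bar, 2.0)  # 180 degree rotation (3.9)
    return S, R1, matmul(R1, S)                  # H as in (3.10)

# Hypothetical edge lengths with p != q, cf. (3.3).
p, q, r = 1.0, 1.2, 0.9
a1, a2, a3 = [p, 0.0, 0.0], [0.0, q, 0.0], [0.0, 0.0, r]
S, R1, H = twin_element(p, q)

H_a1 = matvec(H, a1)      # comes out as -a2, up to rounding
H_a2 = matvec(H, a2)      # comes out as -a1, up to rounding
H_a3 = matvec(H, a3)      # comes out as -a3, up to rounding
H_squared = matmul(H, H)  # comes out as the identity, cf. (3.11)
```

Under these assumptions H sends each lattice vector to the negative of another lattice vector, so it maps the set of atomic positions onto itself without being orthogonal, and `R1` times `H` reproduces the shear `S`, which is the content of (3.12) and (3.13).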
The pair of coexisting configurations so constructed is an example of twinning, as the crystallographer uses the word. When he uses it, he has in mind some such array of atoms, fitting his concept of a crystal, as a periodic structure. Some reinterpretation must be made, to use any continuum theory to analyze such things. To relate macroscopic motion to microscopic motion, one must introduce some hypothesis. Our (3.7) states the most common hypothesis, relating lattice vectors to the macroscopic deformation gradient, what is commonly called the Born or Cauchy-Born hypothesis. It seems so natural that it is easy to overlook the fact that it is an assumption, but it has its limits, failing to apply to motions encountered in some phase transitions, for example. For a given configuration, infinitely many sets of vectors can serve as lattice vectors. Resulting ambiguities in the Cauchy-Born hypothesis are discussed in detail by Ericksen [4]. They do cause real difficulties for crystallographers and metallurgists seeking to relate observations of lattice vectors to deformation. Variants of the hypothesis are used in a rather informal way, to handle some particular kinds of exceptions, but it is hard to find one umbrella to cover all. My experience is that elasticity theory does poorly in describing what occurs, in such exceptional cases.

Having found reason to think that the classical estimates of G are in need of revision, we can try to make a different kind of estimate, by appealing to molecular theories of elasticity.
For the simplest, most classical molecular theories, it is rather easy to do this, as is discussed by Ericksen [5]. This involves picking a reference configuration, with reference lattice vectors $A_K$, describing the periodicity of the crystal, considered as filling all of space. As is well known from crystallography, and mentioned above, there are infinitely many ways of choosing these. To be more specific, another set of vectors $\bar{A}_K$ is eligible if and only if it is related to the possible set $A_K$ by relations of the form

$\bar{A}_K = \sum_{L=1}^{3} m_K^L A_L$, (3.14)

where

$m = \| m_K^L \|$ (3.15)

is a matrix of integers, with

$\det m = \pm 1$. (3.16)

Clearly, these form a group G'. Given m, we can define a linear transformation H(m) by

$\bar{A}_K = H(m) A_K = m_K^L A_L$, (3.17)

and such H are the elements of a conjugate group. The aforementioned molecular theory leads to the conclusion that

$G = \{ H(m) : m \in G' \}$, (3.18)

i.e., that this is the group leaving W invariant. Molecular theory has its difficulties, in describing real crystals, but I believe that this prediction is quite good, as long as elasticity theory can reasonably be applied.
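As an illustration of (3.14)-(3.17), the following sketch builds H(m) from a set of reference lattice vectors and an integer matrix m. It is one possible way to do the computation, not a construction taken from the text: H is assembled from reciprocal vectors, and the particular vectors `A`, the matrix `m` and all function names are hypothetical.

```python
def cross(u, v):
    """Vector product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal_vectors(A):
    """Vectors A^K with A^K . A_L = delta^K_L for independent A_1, A_2, A_3."""
    volume = dot(A[0], cross(A[1], A[2]))
    return [[c / volume for c in cross(A[(k + 1) % 3], A[(k + 2) % 3])]
            for k in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def lattice_map(A, m):
    """Return (A_bar, H) with A_bar_K = sum_L m[K][L] A_L and H A_K = A_bar_K."""
    if abs(det3(m)) != 1:
        raise ValueError("m must be an integer matrix with determinant +1 or -1")
    A_bar = [[sum(m[K][L] * A[L][i] for L in range(3)) for i in range(3)]
             for K in range(3)]
    A_rec = reciprocal_vectors(A)
    # H = sum over K of (A_bar_K outer A^K), so that H A_L = A_bar_L.
    H = [[sum(A_bar[K][i] * A_rec[K][j] for K in range(3)) for j in range(3)]
         for i in range(3)]
    return A_bar, H

# Hypothetical, non-orthogonal reference lattice vectors and integer matrix.
A = [[1.0, 0.0, 0.0], [0.3, 1.1, 0.0], [0.2, 0.1, 0.8]]
m = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
A_bar, H = lattice_map(A, m)

# Images of the reference lattice vectors under H, to compare with A_bar.
images = [[sum(H[i][j] * A[K][j] for j in range(3)) for i in range(3)]
          for K in range(3)]
```

Each entry of `images` agrees with the corresponding entry of `A_bar` up to rounding, so H carries the lattice generated by the $A_K$ onto itself even though H is not orthogonal here.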
Some kinds of twins involve what the experts call shuffling, rearrangements of atoms in a unit cell, not involving macroscopic deformation. Elasticity theory is too coarse to deal properly with such phenomena, for example. Pitteri [6,7] discusses more general ideas of kinematics and symmetry which apply to such crystals. James' [8] study of quartz serves to illustrate these ideas.

Use (3.18) and you automatically include the element indicated by (3.10), for the example considered. In translating ideas used by crystallographers and metallurgists into the language of elasticity theory, one encounters various other kinds of elements, enough so that I feel quite comfortable with this choice. With this view, the invariance of the energy is essentially the same for all crystals, although minimizing configurations can display different symmetries. In brief, nonlinear elasticity theories involving strain energies invariant under G, as just estimated, seem to be capable of describing some of the transition phenomena observed in crystals. As is discussed by Parry [9] and Pitteri [10], select subgroups can suffice, if one is only concerned with suitably small deformations of a particular reference. Make a more conventional choice of G, and, from kinematic considerations, you can make an estimate of the limits of validity of the assumption. From this perspective, the ideas of invariance used in linear theory are seen as sound enough, but one can go wrong, in extrapolating them to nonlinear theory.

Here, I have sketched some of the reasoning which has led us to take seriously a general kind of mathematical theory, deserving deeper study. As mentioned earlier, we know that it can't cope with all of the phenomena observed in crystals, but it covers some which are well outside the range of linear theory. Strategies for constructing strain energies invariant under
G are discussed by Parry [9], Pitteri [10] and Fonseca [11], the latter borrowing more ideas from molecular theory.

4. Unloaded Bodies

It is well known and easy to show that the group G' is infinite, discrete and not compact. Since G is conjugate to G', it shares these properties. One consequence is that no positive-definite symmetric tensor is invariant under G. Said differently, if W is invariant under G, it is impossible to satisfy (2.10). Given the nature of this group, we can't expect W to grow large as F gets large, in general, although it might happen for particular sequences. Roughly, there is some limit to the size of shear stresses which such a body can tolerate. Said differently, global existence theory for traction boundary value problems, problems of the Neumann type, is not likely to be had.
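The claim that W cannot be expected to grow with F can be made concrete with a small sketch. It assumes, purely for illustration, a cubic reference lattice whose lattice vectors are the coordinate unit vectors, so that the integer matrices with determinant ±1 themselves belong to G; the family `lattice_invariant_shear(k)` and the other names are hypothetical.

```python
def lattice_invariant_shear(k):
    """H_k = 1 + k (e1 outer e2): an integer matrix with determinant 1."""
    return [[1, k, 0], [0, 1, 0], [0, 0, 1]]

def cauchy_green(H):
    """C = H^T H, the image of C = 1 under the action C -> H^T C H of (2.8)."""
    return [[sum(H[k][i] * H[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(C):
    """Sum of the diagonal entries of a 3x3 matrix."""
    return C[0][0] + C[1][1] + C[2][2]

# Trace of C along the orbit of C = 1 under the shears H_0, H_1, ..., H_4.
traces = [trace(cauchy_green(lattice_invariant_shear(k))) for k in range(5)]
```

Here `traces` equals `[3, 4, 7, 12, 19]`, that is $3 + k^2$. By (2.8) every one of these tensors has the same energy as C = 1, so the energy stays fixed along a sequence of deformations that becomes arbitrarily large.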
Studies of local existence could help us better understand what happens, as we approach data for which existence fails but, for this discussion, we ignore the more complex problems hinted at here.

Assuming W has a minimizer $C_1$, there are infinitely many ways of choosing H to satisfy (2.11). Of particular interest is the subset for which (2.15) or the equivalent (2.16) can be satisfied, enabling configurations to coexist in unloaded bodies. Included among the possibilities for satisfying (2.16) are the so-called lattice invariant shears, discussed in some detail by Ericksen [4], cases of the type

$R = 1$, $H = 1 + a \otimes n$, $a \cdot n = 0$, (4.1)

G admitting an infinite number of elements of this kind. From this, it follows
that the Euler-Lagrange operators indicated by (2.5) cannot always satisfy the conditions of strong ellipticity, as noted in Ericksen [5]. Thus, nature provides us with realizations of solutions of equations of this kind. That the equations degenerate on a rather complicated set can be seen as follows. Pick any case where (2.16) is satisfied and set

$\phi(x) = W(1 + x\, a \otimes n)$. (4.2)

From our assumptions, $\phi$ has minima at x = 0 and x = 1, so

$\phi'(0) = \phi'(1) = 0$, $\phi''(0) \ge 0$, $\phi''(1) \ge 0$. (4.3)

The strong ellipticity condition implies that

$\phi''(x) > 0$. (4.4)

Clearly, (4.3) can't hold, if (4.4) holds for $x \in [0,1]$. Physically, we expect the inequalities in (4.3) to be strict, at least in general, at the minimizers, x = 0 and x = 1. Then the inequality in (4.4) must in fact be reversed for some $x \in (0,1)$, so it doesn't help to just allow equality in (4.4); the Legendre-Hadamard inequality also fails. By similarly using the example involving (3.10), with lattice vectors nearly equal in length, one can get such degeneracies to occur at deformations which are very small, albeit finite. The observations indicate that, depending on the temperature and type of crystal, such deformations can be very small or large. These and other quirks of W, induced by invariance, are discussed in some detail by Fonseca [11].

The more common kinds of twinning are covered by solutions of (2.15) such that

$R^2 = H^2 = 1$, (4.5)

including as a special case the example discussed in §3. As is discussed in detail by Pitteri [12], $H^2 = 1 \Rightarrow R^2 = 1$. He gives an example to show that $R^2 = 1 \nRightarrow H^2 = 1$, but notes that, if $R^2 = 1$, then either $H^2 = 1$ or $H^n$ is unbounded, as $n \to \infty$. Earlier, Gurtin [13] showed that, if H is orthogonal and $R^2 = 1$, then $H^2 = 1$. Given $F_1$ and H satisfying (4.5), one solves (2.15) for possible values of R, a and n, details being discussed by these writers. So, one can calculate $F_2$, arrange any number of parallel strips of arbitrary width, with F alternating between one value and the other, minimizers with any number of planes of discontinuity. One sees a great variety of such arrays in real crystals.
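The arrangement of parallel strips can be written down explicitly. The sketch below assumes, for simplicity, that the first configuration is the reference, so the two gradients are 1 and $1 + a \otimes n$ as in (2.16), and that the interface positions come in pairs bounding the strips of the second kind. The vectors `a`, `n`, the list `interfaces` and the function names are hypothetical.

```python
def twinned_width_below(s, interfaces):
    """Total width of the strips with gradient 1 + a (outer) n lying below height s.

    `interfaces` is an increasing list with an even number of plane positions;
    the strips (interfaces[0], interfaces[1]), (interfaces[2], interfaces[3]), ...
    carry the second gradient, the rest of the body carries the identity.
    """
    total = 0.0
    for start, end in zip(interfaces[0::2], interfaces[1::2]):
        total += min(max(s - start, 0.0), end - start)
    return total

def laminate(x, a, n, interfaces):
    """Continuous piecewise-affine deformation y(x) = x + a g(x . n)."""
    s = sum(xi * ni for xi, ni in zip(x, n))
    g = twinned_width_below(s, interfaces)
    return [xi + ai * g for xi, ai in zip(x, a)]

# Hypothetical amplitude, unit normal and interface positions.
a = [0.2, 0.0, 0.0]
n = [0.0, 1.0, 0.0]
interfaces = [0.0, 0.5, 1.0, 1.25]

# The deformation itself is continuous across the plane x . n = 0.5 ...
just_below = laminate([0.0, 0.5 - 1e-9, 0.0], a, n, interfaces)
just_above = laminate([0.0, 0.5 + 1e-9, 0.0], a, n, interfaces)
jump = max(abs(u - v) for u, v in zip(just_below, just_above))

# ... while the slope of g along n switches between 1 and 0 from strip to strip.
slope_in_twin = (twinned_width_below(0.3, interfaces)
                 - twinned_width_below(0.2, interfaces)) / 0.1
slope_in_parent = (twinned_width_below(0.8, interfaces)
                   - twinned_width_below(0.7, interfaces)) / 0.1
```

Since the gradient of y is $1 + g'\, a \otimes n$ with $g'$ equal to 0 or 1, every strip carries one of the two minimizing gradients, and the number and widths of the strips can be chosen freely by editing `interfaces`.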
In specimens which might seem otherwise identical, the number and width of these layers can be quite different, so such nonuniqueness of minimizers is not unrealistic.

If $F_1$ and H be given, not restricted by (4.5), the analysis of Ericksen [14] shows that either (2.15) has no solutions for R etc. or it has two, depending on whether $F_1$ and H satisfy a certain equation. Cases where H satisfies (4.5) are exceptional, in that there is no restriction on $F_1$, except what is implied by (2.11).
This might be viewed as a mathematical explanation of why such twins are so commonly observed in crystals. Actually, the model is simplistic. Physically, one expects there to be some surface energy associated with the discontinuity, making the total energy higher than it would be if the discontinuity were absent. Commonly, a crystallographer or metallurgist will acknowledge this, with a comment to the effect that such surface energies tend to be unusually small. It does reinforce the notion that, very often, we observe solids in configurations which are only metastable, and there is a paucity of good theory for dealing with such things.

If H belongs to a finite subgroup of G, $H^N = 1$, where N = 2, 3, 4 or 6. Then, when N > 2, some rather mild restrictions on $F_1$ are required, for (2.15) or (2.16) to be soluble for R etc. These conclusions come from analysis of Ericksen [14]. Configurations with N > 2 occur in nature but do seem to be less common. Some workers regard some of these as twins, others don't. It isn't hard to construct examples of possibilities not of this form and not included in (4.1), but we don't yet have a good way of characterizing all of the possibilities.

With the numerous possibilities for satisfying (2.15), it might be expected that different planes of discontinuity could intersect, for example to give regions fitting together like the sections of an orange, and such things are observed. James [2,15,16] has developed algorithms useful for analyzing situations of this general kind, not restricted to cases where the deformations involved are symmetry-related, including analyses of some particular cases encountered in practice. Ericksen [17] presents some fairly general schemes for constructing symmetry-related configurations of this kind. It will take more effort to characterize all minimizers, but it seems hopeful that this can be done. It seems rather evident that it doesn't much matter what we assume for the size and shape of the body, since we satisfy (2.5) by having

$\dfrac{\partial W}{\partial F} = 0$ (4.6)

as a consequence of minimizing W. For a start, we can think in terms of what might occur in a crystal filling all of space, taking restrictions to find some possibilities for a crystal of finite size. In doing this, one might overlook possibilities which would be acceptable for bodies with special shapes or topologies. On occasion, one does see configurations which agree well with our picture of minimizers.
Parallel twin planes pretty well cover the complications seen in common observations of high temperature superconductors, for example. Generally, the discontinuities don't move freely about, although one might get them to shift, by applying and removing forces, if the force is large enough. In this sense, a particular minimizer seems to be stable with respect to small enough disturbances. If it were otherwise, linear elasticity theory might not do as well as it does, in describing small motions. Clearly, one should bear this in mind, in attempting to analyze changes of minimizers. Pego [18] has developed dynamical stability theory, using a one-dimensional viscoelastic model, predicting that analogous discontinuous static solutions can be stable, when the disturbances are small enough, an interesting beginning step toward constructing relevant dynamical stability theory.

The metallurgist might take one of our minimizers and treat it to produce a more complicated configuration, something he often does to make a specimen harder, for example. Practically, such specimens can be quite stable; we don't expect our chisels to lose their hardness if we treat them well, although we can make them lose their temper, by overheating them while grinding them, for example.

Simple possibilities, of some interest, involve cases where W also has another relative minimum, of different symmetry. In various crystals, one has more symmetric and less symmetric phases of this kind, often called austenite and martensite, respectively and, commonly, the martensite contains twins. One might see some such combination occurring in the same specimen. Loading and unloading can change the relative proportions. In the so-called memory alloys, this pretty well describes the configurations commonly observed. Even during the loading, the parts are much like this, not far from homogeneous. If, when unloaded, the parts seem to be homogeneous, with planar interfaces, it is at least tempting to think that (4.6) applies. James [19] considers some simpler problems of this kind, involving loading, more detailed analysis of what occurs in particular crystals. As he mentions, he assumes invariance only under a finite subgroup of our group. As is discussed by Fonseca [11], there are theoretical indications that configurations supporting shear stresses cannot be very stable, according to the type of theory being discussed here. As James mentions, some possible arrangements of discontinuity planes militate against the applicability of (4.6), so it might be necessary to then consider more complicated solutions of (2.5).
In any event, it seems reasonable to try to develop fairly detailed theory of the parts, if the number is not large, to improve understanding of their interactions. Specimens of interest can contain a great number of parts, so one would also like a coarser and simpler theory for these, perhaps a statistical theory somewhat like that considered by Muller and Wilmanski [20].

As is covered in the exposition of Nishiyama [21, Ch. 6], metallurgists and crystallographers have long used a kind of geometric or kinematic theory to analyze configurations which seem to be piecewise homogeneous. This involves jump conditions like (2.14) and ideas of crystal symmetry. Now and then, strain energy might be mentioned, in a casual way. However, one doesn't see any formula for it, and it is not involved in such analyses. The ideas used there fit quite well with the ideas of elasticity theory, including our estimate of the invariance of W. As I interpret it, there is a tacit assumption that (4.6) applies. That is, the manner in which rotations are treated then makes sense and, in general, it would not otherwise. On occasion, they do have difficulty in rationalizing an observation, as is discussed by Nishiyama. It would be nice to know whether some more complicated solutions of (2.5) would better fit such observations, but this question remains to be settled. It would seem that one would need to have more specific information about the functions W which might apply, to do much with such problems and, currently, we know very little about this.

In the various steels and other alloys, one encounters a great variety of morphologies, as can be seen by glancing through the pictures presented by Nishiyama [21], for example. Various other kinds of defects occur and are of concern, for example point or line defects. There is a bulky literature involving use of linear elasticity theory, attempts to understand parts of what is seen. There is no good way to summarize such work in a few words. Of course, it lends support to the notion that complicated singular solutions of (2.5) can be used to model at least some of what is seen. Also, it becomes evident that our "simplest problem" can be exceedingly complicated.

The practical metallurgist has learned to control coarser features of such things, to relate these in a rather empirical way to properties which are desirable for particular practical uses of his metals. This does suggest that some kind of control theory is to be had. It does seem plausible enough that a simplistic form could be developed, and it should be instructive to do this. The equations used by Pego [18] seem to be unusually tractable, and one could pose non-trivial control problems for them. Ignore all but our minimizers, and seek a strategy to induce a given one to shift to another such target, compatible with the notion that it is the same lump of matter. Certainly, we will not get the same set of minimizers, if we impose some kinematical constraints, and we might tailor these to fit the initial and target configurations. It seems clear enough that we can use the idea to at least partially control the outcome and that we would do best by controlling the displacement of every point on the boundary, perhaps with some judicious choice of control path.
It is a bit risky to guess about the nature of hard theorems which need to be proved, to explore such lines of thought, so it doesn't seem worthwhile to speculate more.

Acknowledgement: This material is based on work supported by the National Science Foundation under Grant No. MEA-830475fl.

References

1. Noll, W., A mathematical theory of the mechanical behavior of continuous media, Arch. Rat'l. Mech. Anal. 2, 197-226 (1958).
2. James, R.D., Finite deformation by mechanical twinning, Arch. Rat'l. Mech. Anal. 77, 143-176 (1981).
3. James, R.D., A relation between the jump in temperature across a propagating phase boundary and the stability of solid phases, J. Elasticity 13, 357-378 (1983).
4. Ericksen, J.L., The Cauchy and Born hypotheses for crystals, in Phase Transformations and Material Instabilities in Solids (ed. M.E. Gurtin), Academic Press, New York (1984).
5. Ericksen, J.L., Special topics in elastostatics, Adv. Appl. Mech. (ed. C.-S. Yih) 17, 189-244 (1977).
6. Pitteri, M., On ν+1 lattices, J. Elasticity 15, 3-25 (1985).
7. Pitteri, M., On crystallographic space groups and generalized lattice groups, pending publication.
8. James, R.D., The stability and metastability of quartz, in these proceedings.
9. Parry, G.P., On the elasticity of monatomic crystals, Math. Proc. Camb. Phil. Soc. 80, 189-211 (1976).
10. Pitteri, M., Reconciliation of local and global symmetries of crystals, J. Elasticity 14, 175-190 (1984).
11. Fonseca, I.M.Q.C., Variational methods for elastic crystals, Ph.D. Thesis, Univ. Minnesota (1985) (to appear in Arch. Rat'l. Mech. Anal.).
12. Pitteri, M., On type II twins, to appear in Int. J. Plasticity.
13. Gurtin, M.E., Two-phase deformations of elastic solids, Arch. Rat'l. Mech. Anal. 84, 1-29 (1983).
14. Ericksen, J.L., Some surface defects in unstressed thermoelastic solids, Arch. Rat'l. Mech. Anal. 88, 337-345 (1985).
15. James, R.D., Displacive phase transitions in solids, to appear in J. of Mech. and Physics of Solids.
16. James, R.D., Stress-free joints and polycrystals, Arch. Rat'l. Mech. Anal. 86, 13-37 (1984).
17. Ericksen, J.L., Stable equilibrium configurations of elastic crystals, Arch. Rat'l. Mech. Anal. 94, 1-14 (1986).
18. Pego, R.L., Phase transitions: stability and admissibility in one dimensional nonlinear viscoelasticity, pending publication; cf. also Dynamical Problems in Continuum Physics, I.M.A. Volumes (ed. J. Bona, J.L. Ericksen, C. Dafermos, and D. Kinderlehrer), Springer (1986).
19. James, R.D., The arrangement of coherent phases in a loaded body, in Phase Transformations and Material Instabilities in Solids (ed. M.E. Gurtin), Academic Press, New York (1984).
20.
Nishiyama, Z., Martensitic Transformations, Academic Press, New York - San Francisco - London (1978).
Reprinted from The Mechanics of Dislocations, Proceedings of an International Symposium, Houghton, MI; 28-31 August 1983

Thermoelastic Considerations for Continuously Dislocated Crystals

J. L. Ericksen
Department of Aerospace Engineering and Mechanics and School of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA

IN DEALING WITH DISLOCATIONS, pictured as isolated singularities, we often employ elasticity theory. In terms of molecular theory, we often consider it to involve averages over regions with diameters large compared to spacing between atoms. The notion of continuous distributions of dislocations suggests still larger regions, containing many such defects. It seems to me instructive to ponder what difficulties we might encounter, if we try to apply something like thermoelasticity theory in either case. For crystal configurations simple enough to treat as Bravais lattices, with occasional defects, there is a rather natural way to formulate such theory, to make clearer some of the subtleties. My purpose is to elaborate
MOLECULAR CONSIDERATIONS

By way of motivation, let us reconsider ideas involved in the simplest and oldest molecular theories of elasticity, such as were introduced by Cauchy [1, 2, 3]. Cauchy introduced an approximation, replacing sums by integrals, leading to some confusion as to just what is predicted. In this respect, the exposition by Love [4, Note A] is clearer. Bear in mind that Cauchy had his concept of stress, but the concept of strain energy came later. So did the notion that, in a solid which we perceive to be at rest, atoms are not still, but undergo thermal motions. So we picture atoms at rest. Suppose they are subject to central forces. Then, and perhaps only then, the notion that one atom exerts a certain force on another has meaning. Consider the (identical) atoms to be located on a Bravais lattice, with (constant) lattice vectors a_K (K = 1, 2, 3) and reciprocal lattice vectors a^K, satisfying the relations

    a^K \cdot a_L = \delta^K_L , \qquad a^K_i \, a_K^j = \delta_i^j ;    (1)

then, using Cauchy's idea of stress, we can calculate it, given a pair potential which decays rapidly enough with distance. As is discussed by, say, Love [4, Note B], one adds up the forces associated with pairs of atoms having lines of action crossing any given plane. Apart from a trivial translation, the lattice vectors determine the positions of all the atoms, so it is rather clear that the Cauchy stress σ must reduce to a function of these. The theory at hand gives formulae of this kind, viz.

    \sigma_{ij} = \frac{m}{|a_1 \times a_2 \cdot a_3|} \frac{\partial \phi}{\partial a_K^i} \, a_K^j
                = -m \, |a^1 \times a^2 \cdot a^3| \, a^K_i \frac{\partial \phi}{\partial a^K_j} ,    (2)

where m is the mass of an atom, and φ is a certain function of the a_K or a^K, which depends on the choice of pair potential. By a simple calculation, one can estimate that ρ, the macroscopic mass per unit volume, is given by

    \rho = \frac{m}{|a_1 \times a_2 \cdot a_3|} = m \, |a^1 \times a^2 \cdot a^3| .    (3)
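The relations (1) and the two expressions for the density in (3) are easy to check numerically. The following is only a sketch: the lattice vectors and the atomic mass are made-up illustrative values, and the helper names (`cross`, `dot`, `reciprocal`) are hypothetical, not taken from the text.

```python
# Sketch: reciprocal lattice vectors and the density formula (3).
# The lattice vectors and the atomic mass are illustrative assumptions.

def cross(u, v):
    """Vector product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def reciprocal(a1, a2, a3):
    """Return (a^1, a^2, a^3) such that a^K . a_L = delta^K_L."""
    vol = dot(cross(a1, a2), a3)           # signed cell volume a_1 x a_2 . a_3
    b1 = [c / vol for c in cross(a2, a3)]
    b2 = [c / vol for c in cross(a3, a1)]
    b3 = [c / vol for c in cross(a1, a2)]
    return b1, b2, b3

a = ([1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.0, 0.3, 2.0])  # hypothetical a_K
b = reciprocal(*a)                                        # a^K
m = 3.0                                                   # hypothetical atomic mass

# First relation in (1): a^K . a_L = delta^K_L
gram = [[dot(b[K], a[L]) for L in range(3)] for K in range(3)]

# Formula (3): the two expressions for rho should agree
cell_volume = abs(dot(cross(a[0], a[1]), a[2]))
rho_direct = m / cell_volume
rho_recip = m * abs(dot(cross(b[0], b[1]), b[2]))
```

Here `gram` should come out as the identity matrix to rounding error, and `rho_direct` and `rho_recip` should agree, reflecting the fact that the reciprocal cell volume is the inverse of the cell volume.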
To get to elasticity theory, one introduces another assumption, usually put in early enough to obscure (2). Pick some configuration, with lattice vectors A_K, as a reference, and consider a macroscopic deformation taking a point in the reference at X to

    x = F X + constant,    (3)

F being any matrix with positive determinant. Cauchy's idea was that if X be taken as the reference position of an atom, (3) will give its position in the deformed configuration. This amounts to assuming that, in the deformed crystal, a possible set of lattice vectors is given by

    a_K = F A_K .    (4)

Plugging this into φ reduces this to a function of F, giving us what we now consider to be the strain energy per unit mass. Selecting a reference configuration for solids is a rather arbitrary matter. If you read what Gibbs [5] wrote about solids, you will see that he appreciated this. Here, we will pursue a notion which, at first, might seem paradoxical. One can give elastic deformation an important place in the theory, without accepting the idea that any particular reference configuration must play a fundamental role. In this respect, I am elaborating ideas discussed by Davini [6], who considers very similar types of theory. By an elementary calculation one can show that (2) and (3) can be reduced to equations commonly accepted in nonlinear elasticity theory, in dealing with strain energies which need not derive from this or another molecular model. Also, one can calculate the strain energy by adding up pair potentials, in the manner recommended by Poincaré [7], and the results are consistent. Actually, (4) makes perfectly good sense as a relation between macroscopic variables, if we think of lattice vectors as averages determined by x-ray observations, which average out the thermal motions. Interpreted this way, it seems to apply well not only to the small deformations which we commonly consider to be elastic, but to rather large deformations encountered in some phase transitions or cases of martensitic twinning where defects such as dislocations can be of little import. So the old theory poorly estimates the function φ but, otherwise, it looks pretty good, as a theory of elasticity or thermoelasticity, if we consider φ as the Helmholtz free energy. With more modern molecular theories, the habit is to calculate the free energy and assume, as is suggested by thermodynamics, that using this in (2), or the equivalent, will give the Cauchy stress. Although we give it this name, it is really not so clear that it is the same, conceptually.

Copyright 1985 American Society of Metals

ONE THERMODYNAMIC VIEW

Turning to a different topic, we find Landau [8] discussing thermodynamic potentials for, and certain phase transitions in crystals, without mentioning the macroscopic deformation gradient F, or any reference configuration. At least by example, the preceding discussion suggests how one might do this. For our rather simple crystals, what is suggested is a Helmholtz free energy function of the form

    \phi = \phi(a^K, T) ,    (5)

where T is the temperature, and the a^K are vector fields, functions of x, current positions. No harm in requiring them to vary slowly, on a scale employing atomic spacing as a typical length. Such theory seems to fit pretty well the ideas used in the discussion of Landau and Lifshitz [9]. They would seem to fit better if we added terms quadratic in ∇a^K, but I'll ignore this complication. Statistical theory does play a role in their thinking so, really, I am discussing a theory which is not entirely the same. Assuming that no body forces act, what interior equations should be satisfied, for thermodynamic equilibrium? We should deal with a fixed quantity of matter. If we accept (3), which seems rather reasonable, we should impose a constraint of the form

    \int a^1 \times a^2 \cdot a^3 \, dv = constant .    (6)

To avoid considering loadings in detail, we can assume that the boundary and values of a^K on or near it remain fixed, to get interior equations, and ignore body forces. If these are the only constraints, the usual formal calculations give, as interior equations,

    \frac{\partial \phi}{\partial a^K_i} = \lambda \, a_K^i ,    (7)

where λ is a Lagrange multiplier. Within this framework, it is not very obvious how to calculate stresses. If we assume we can use (2), we get

    \sigma_{ij} = -p \, \delta_{ij} ,    (8)

where

    p = \lambda \, m \, |a^1 \times a^2 \cdot a^3| ,    (9)

so the stress must reduce to a hydrostatic pressure. Landau was interested in solids subject only to such pressures so, perhaps, would not be bothered by this. Some thermodynamicists seem to like the idea that, in thermodynamic equilibrium, the stress should reduce to this form. From these views, there is nothing obviously wrong with our calculation of stress, although some of us might find the conclusion disturbing. According to the theory of continuous distributions of dislocations, we can make one interpretation of what we have done. By permitting rather arbitrary small and smooth variations of the a^K, we are permitting introduction of continuous distributions of dislocations, at least. After writing down his famous criteria for thermodynamic stability, Gibbs [5] elaborated what he meant by "possible variations". After reading his words, I find it hard to believe that he would accept the analysis sketched above. In his words, "In order to apply to any system the criteria of equilibrium which have been given, a knowledge is requisite of its passive forces or resistances to change, in so far, at least, as they are capable of preventing change. (Those passive forces which only retard change, like viscosity, need not be considered.)" In elaborating this by mentioning examples, he writes "...that which prevents the changes in solids which imply plasticity, (in other words, changes of the form to which the solid tends to return), when the deformation does not exceed certain limits." As I interpret, in permitting those easy variations in continuous distributions of dislocations, we have ignored the "passive forces" which prevent this, ignoring Gibbs' advice on how to use his criteria. One might interpret his words in some slightly different way, but it seems to me very hard to argue that we have respected his advice, in the analysis sketched. His [5] later analyses of equilibria of solids in contact with fluids do make clear that, in his view, it was possible for a solid to be in stable equilibrium, while supporting shear stresses. Of course, the argument for using the indicated prescription to calculate stresses is rather weak, but I will strengthen it. I hesitate to claim that Landau, say, disagreed, since I have taken some liberty in interpreting ideas described by Landau and Lifshitz [9]. Simply, this provides some food for thought.
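The step from the interior equations to a purely hydrostatic stress can be written out in a few lines. What follows is a sketch under stated assumptions: it takes the interior equations (7) to read \partial\phi/\partial a^K_i = \lambda a_K^i, and it uses the reciprocal-vector form of the stress formula (2), with the index placement shown here being an assumption rather than something the text spells out.

```latex
\begin{align*}
\sigma_{ij}
  &= -\rho \, a^K_i \, \frac{\partial \phi}{\partial a^K_j}
     && \text{(assumed form of (2), with } \rho = m\,|a^1 \times a^2 \cdot a^3| \text{ from (3))} \\
  &= -\rho \, \lambda \, a^K_i \, a_K^j
     && \text{(assumed form of (7))} \\
  &= -\rho \, \lambda \, \delta_{ij}
     && \text{(second relation in (1))} \\
  &= -p \, \delta_{ij} ,
     \qquad p = \lambda \, m \, |a^1 \times a^2 \cdot a^3| .
\end{align*}
```

On this reading, the pressure in (9) is just the multiplier λ times the density, which is why no shear stress can survive once the lattice vectors are varied freely.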
ANOTHER THERMODYNAMIC VIEW

Let us now revise the aforementioned analyses, paying more attention to Gibbs' advice. Clearly, elasticity theory provides a rather different view of the problem just discussed, and we do use it to analyze real crystals, containing some imperfections, sometimes using Gibbs' criteria. As I see it, this works by restricting the variations which are considered. It seems fair to say that Gibbs was a master at the art of judiciously selecting restricted variations, and this is an important factor in getting useful results from any mathematical theory of stability. Consider putative equilibrium fields a^K(x), defined over some region R. They need not be terribly smooth, but I will here gloss difficulties associated with properly accounting for singularities. Consider a smooth one-to-one map

    \bar{x} = \bar{x}(x) ,    (10)

of R onto \bar{R}, which might be the same or a different region, and let

    F = \nabla \bar{x} .    (11)

We would like to consider this as a smooth elastic deformation, one which does not really change the nature of defects in the crystal. What seems to be a reasonable way to mathematize this is to adapt Cauchy's rule, equation (4). That is, for \bar{R}, possible lattice vectors \bar{a}_K(\bar{x}) are assumed to be related to those in R by

    \bar{a}_K(\bar{x}) = F \, a_K(x) ,    (12)

or, equivalently,

    \bar{a}^K(\bar{x}) = (F^{-1})^T a^K(x) .    (13)

Before, we could take F = 1, and still vary a^K rather independently, so there is quite a difference. Essentially, we take this as a definition of elastic deformation of a state. It was Gibbs' habit to consider small but finite variations so, if you like, restrict the deformations accordingly. Considering a^K(x) as fixed, we have

    \phi(\bar{a}^K, T) = \ldots    (15)

Assuming that it applies, it is a simple matter to use the chain rule, to verify that (2) agrees with the prescription of the Cauchy stress commonly accepted in thermoelasticity theory. So, we now have a different reason to think that the prescription is good. These smooth elastic deformations preserve values of integrals along curves, of the form

    \int a^K \cdot dx .    (16)

From the theory of continuous distributions of dislocations, when the curves close, these are interpretable as Burgers vectors for such circuits, or more properly, components of these, if we use as a basis the lattice vectors. Variations considered before might fix these for circuits lying in the boundary of a body, but let them vary in a rather arbitrary way, in the interior. Perhaps, this makes clearer the remark that, before, we were permitting changes in continuous distributions of dislocations. Similar remarks apply to surface integrals of the form

    \int a^K \times a^L \cdot ds ,    (17)

where ds is the usual vector element of area, and to volume integrals of the form

    \int |a^1 \times a^2 \cdot a^3| \, dv = \int \frac{dv}{|a_1 \times a_2 \cdot a_3|} .    (18)

As is discussed in some detail by Davini [6], restricting one's attention to variations which leave all these integrals unchanged is almost equivalent to assuming that (13) applies. As I see it, this adds to our understanding of what we are doing, in assuming (13), freezing these measures of defects. Also, it is convenient to note that, at least roughly, (18) gives the number of unit cells in the volume, |a_1 × a_2 · a_3| being the volume of a unit cell. Again roughly, this presumes that we can add up the volumes of the little unit cells, to get the total volume. The variations previously considered did not preserve these integrals locally, but did for the entire body. We will consider this a bit more, later. As long as we deal with elastic deformations of a fixed ground state, there is no difficulty with defining the "stress-energy momentum tensor", as Eshelby liked to call it. However, to get its divergence to vanish, one needs to have φ not depend explicitly on x. It is rather clear that this will be true if the a^K are constant, generally not otherwise. Physically, defects will make a crystal appear to be somewhat inhomogeneous. In this respect, use of such tensors to analyze the moving of dislocations through the material seems to involve some approximations. I have not thought carefully about the ramifications, so I won't claim that this is a matter of great import. In one respect, we have pulled a fast one, (14) rather suggesting that the Helmholtz free energy is well-defined, for sets of states which need not be elastically equivalent. By the usual thinking in thermoelasticity theory, entropy is determined only to within an additive constant, φ to within an affine function of temperature so, really, we determine an infinite set of potentials, for each set of fields a^K which are elastically equivalent. Given two fields a^K which are not elastically equivalent, we might argue that there is no obvious way to define the differences in entropy, so we are stuck with dealing with infinitely many valued potentials, making it at least hard to know how to apply Gibbs' criteria. An obvious alternative is to consider φ as the function obtained by considering a^K to be constants, states which are elastically equivalent via homogeneous deformations. I don't see that classical thermodynamics forbids us from doing this, and it gives us what is needed to do the calculations needed, to employ Gibbs' criteria. In the warnings about plasticity, he seemed to presume that one might find some such way to do the calculations. Doing them is not the same as trusting them, his advice being aimed more at the latter question.

SHIFTING DISLOCATIONS

Calculations of forces on defects consider these as moving through the material, if only infinitesimally, something that we tend to regard as a process which is inherently irreversible. Commonly, we use for this the Helmholtz free energy obtained from elasticity theory. In doing so, we seem to be dealing with changes of entropy, in dealing with changes which cannot be approximated by reversible processes. From the viewpoint of classical thermodynamics, this is a dubious business, at best. However, with dislocations viewed as isolated singularities, we can, I think, argue our way out of this difficulty, fairly well. First, consider two perfect crystals, identical, except that one is larger than the other. By adding material to the smaller, we could make it identical to the larger. We all agree, I think, that the two entropies are defined and equal. Similarly, processes which can be regarded as accomplished by interchanging positions of identical atoms must be regarded as leaving the entropy fixed. Is it not because Gibbs accepted this that he regarded his calculations of entropies for ideal gas mixtures as giving rise to a paradox? So, if we can view a process as accomplished by such additions, subtractions, or rearrangements of material, we can use such ideas to define entropy changes, even when the process seems to be inherently irreversible. Consider two dislocations in an infinite crystal, with the usual view that these correspond to isolated singularities. We wish to consider them as moving apart, relative to the material, translating one relative to the other by n lattice spacings. So, take n "planes" of atoms, and insert them in the region between the two dislocations, in such a way as to produce no other singularities, and to give the desired increase in separation. One can use elastic deformations of the slab or original material to get the fit. Similarly, remove n planes on the other side of one dislocation so that there is no net gain or loss of atoms. It seems pretty clear that this, combined with elastic deformation, can produce an effect physically equivalent to moving the dislocations in the desired manner, and that one can apply similar ideas to some more complicated arrangements. So, it makes sense to think that changes in thermodynamic potentials can be defined for situations of this kind, outside some little neighborhood of the singularities, in a way which is consistent with thermodynamic practice. With the singularities themselves, it seems to be more a matter of individual judgment, to decide how to handle them.

One point seems worth noting. In the process envisaged, we add material, increasing the number of unit cells here, decreasing it elsewhere, keeping the grand total fixed. From this view, an overall constraint like (6) still seems reasonable, but such volume integrals over sub-volumes might well change. When we deal with continuous distributions of dislocations, the disappearance of those singularities makes it easier for us to overlook these difficulties in defining potentials. Conceptually, it is still not so clear that one can deal successfully with changes of entropy for the inherently irreversible processes. For sake of argument, I'll grant that we somehow make sense of a constitutive equation like (5), although this is perhaps too simple to be taken very seriously. Davini [6] considers somewhat more complicated constitutive equations, avoiding, or at least postponing consideration of entropy, which might well be a better course. Certainly, we can move defects about in space, by elastic deformations. If we view them as isolated singularities, we might think of moving them through the material as a two-step process. First keep the singularities fixed in space, adding or subtracting such material as might be necessary, to make the desired shift feasible. Then allow elastic deformations of the modified configurations, using ideas like those sketched in Section 3 to analyze the latter. For the first step, we allow reciprocal lattice vector fields to change from a^K(x) to \bar{a}^K(x), in such a way as to maintain the nature of the singularities, and introduce no others. For dislocations, we should keep fixed the Burgers vector for circuits not intersecting the singularities, or

    \oint (\bar{a}^K - a^K) \cdot dx = 0    (19)
for all such circuits. These ideas lead naturally to the requirement that there be rather smooth, single valued scalars ψ^K(x) such that

    \bar{a}^K - a^K = \nabla \psi^K .    (20)

With the usual ideas, one would have curl a^K = 0, away from singularities, but potentials for a^K would be multivalued. Actually, (20) looks very much like the gauge transformations associated with dislocations in gauge theory and, perhaps, we should take this as a clue as to how we might better link gauge theory to dislocation theory. We might define topological deformations of a crystal configuration as the set of mappings obtained by combining the elastic deformations with those described by (20). Certainly, I am not an expert in the topological theory of defects but, as I interpret notions of experts, this deformation seems to fit their prejudices, at least reasonably well. As discussed earlier, one might reasonably impose (6) as a constraint on these variations, for the entire body, but not for sub-regions. Concerning the surface integrals (17), my own thoughts are not so clear, so it seems premature to make a specific recommendation. Analyses given by Davini [6] might be helpful, in assessing this. Of course, one can apply these kinematical considerations to continuous distributions of dislocations, as well as the discrete varieties. One might consider minimizing appropriate thermodynamic potentials with respect to the combination of elastic deformations and those given by (20), with whatever constraints seem to you reasonable. Assuming these include (6), we then get procedures intermediate between those hinted at in Sections 3 and 4. If I interpret correctly, Gibbs warned us to be wary of this, as a good criterion for stability. Since we still fix the essential character of dislocations, albeit in a different sense, one is getting close to splitting hairs, in deciding just where Gibbs would like a line to be drawn. I don't find it so easy to describe precisely what we do, in calculating forces on dislocations, to fit it into this context. If I correctly estimate, it is more a matter of making some rather specific choices of a^K(x) and \bar{a}^K(x), consistent with (20), at least. Then we apply ideas more like those described in Section 4, to adjust these, elastically. If we view this as a very restricted set of variations, I don't find it very easy to say what concept gives them this favored status. We might avoid this, by arguing that, having the stress tensor, we can calculate resultants, so we should use this idea to calculate forces on dislocations. If not, just what are we to mean, by the concept of stress, in such situations? As we all know, practice is in contradiction with the notion and, at least in part, this is related to Eshelby's worries about the different kinds of forces. Conceptually, the idea that we can speak of force exerted by one defect on another is a bit wrong, and hard to make sense of, for nonlinear theory. A better view is that defects influence and are influenced by fields, here envisaged as the elastic fields.
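The claim that adding the gradient of a smooth, single valued scalar leaves the circuit integrals (16) unchanged can be illustrated numerically in two dimensions. This is a minimal sketch, not part of the original analysis: the scalar `psi`, the circular circuit and the helper names are all hypothetical choices made for illustration.

```python
import math

def psi(x, y):
    # Hypothetical smooth, single-valued scalar; any such choice would do.
    return math.sin(x) * math.cos(2.0 * y) + 0.3 * x * y

def grad_psi(x, y, h=1e-6):
    # Central-difference approximation to the gradient of psi.
    gx = (psi(x + h, y) - psi(x - h, y)) / (2.0 * h)
    gy = (psi(x, y + h) - psi(x, y - h)) / (2.0 * h)
    return gx, gy

def circuit_integral(field, radius=1.0, steps=2000):
    """Midpoint-rule approximation of the line integral of `field` around a circle."""
    total = 0.0
    for k in range(steps):
        t0 = 2.0 * math.pi * k / steps
        t1 = 2.0 * math.pi * (k + 1) / steps
        tm = 0.5 * (t0 + t1)
        xm, ym = radius * math.cos(tm), radius * math.sin(tm)
        dx = radius * (math.cos(t1) - math.cos(t0))
        dy = radius * (math.sin(t1) - math.sin(t0))
        fx, fy = field(xm, ym)
        total += fx * dx + fy * dy
    return total

# Change in one circuit integral of type (16) when a gradient is added, as in (20).
delta_burgers = circuit_integral(grad_psi)

# Contrast: a field with circulation, which is not the gradient of a single-valued scalar.
rotational = circuit_integral(lambda x, y: (-y, x))
```

The first integral should vanish to within numerical error, so the Burgers-type measure is unchanged, while the second returns roughly 2π, the kind of change that a multivalued potential would produce.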
SUMMARY

It is at least interesting to consider the practice of using variations including elastic deformations and those permitted by (20), conditioned a bit by constraints which might seem reasonable, such as (6), in minimizing energies. This is what we seem to be nibbling at, for example in calculating forces on dislocations. Thermodynamic considerations such as mentioned above seem not to indicate any fatal flaws in the idea, at least for the isolated singularities. It boils down to a matter of interpretation as to just what we should make of Gibbs' words. I didn't quote all that he said, but, certainly, there is room for individual judgments about this and, of course, he was not infallible. The analysis of Section 3 does, I think, indicate what you can get if you think he was totally wrong. In Section 4, we try to lean in the opposite direction, to set another limit. Then, we try to fish out an estimate of what the consensus of opinion might be, as to where a line should be drawn, as well as to try to find the links to old and newer lines of thought which seem to me promising. If I emphasize a particular kind of theory, it is because I have found this useful for analyzing things of more interest to me personally, like the twinning of crystals, and some of the phase transitions, things which seem to be well beyond the scope of linear theory. For such things, nonlinear thermoelasticity theory is essentially equivalent, and the considerations of dislocations could be rephrased, in this other language, when one is dealing with these, as isolated singularities. Physically, it seems easier and clearer to put in view the lattice vectors, so I have done this. I don't find it so easy to shift gears, to deal with continuous distributions of dislocations, but some of the ideas are useful, as I have tried to indicate. Also, I have tried to record some of the problems which bother me, in dealing with them.

ACKNOWLEDGMENT

This material is based on work supported by the National Science Foundation under Grant No. MEA-7911255.

REFERENCES

1. A. L. Cauchy, Ex. de Math. 3, 227-287 (1828).
2. A. L. Cauchy, Ex. de Math. 3, 253-277 (1828).
3. A. L. Cauchy, Ex. de Math. 4, 312-369 (1829).
4. A. E. H. Love, "A Treatise on the Mathematical Theory of Elasticity", 4th Ed., Cambridge University Press, Cambridge (1927).
5. J. W. Gibbs, Trans. Conn. Acad. III, 108-248 (1876) and 313-521 (1878).
6. C. Davini, Istituto di Meccanica Teorica ed Applicata (Udine) report IMTA/011 (1983).
7. H. Poincaré, "Leçons sur la Théorie de l'Élasticité", G. Carré, Paris (1892).
8. L. D. Landau, "On the theory of phase transitions", in Collected Papers of L. D. Landau (ed. D. Ter Haar), Gordon and Breach and Pergamon Press, New York-London-Paris (1965).
9. L. D. Landau and E. M. Lifshitz, "Statistical Physics" (translated by E. Peierls and R. F. Peierls), Pergamon Press, London-Paris and Addison-Wesley Publishing Co., Inc., Reading (1958).
Offprint from "Archive for Rational Mechanics and Analysis", Volume 88, Number 4, 1985, pp. 337-345 © Springer-Verlag 1985 Printed in Germany
Some Surface Defects in Unstressed Thermoelastic Solids J. L. ERICKSEN
Dedicated to Walter Noll, on the Occasion of His Sixtieth Birthday

I. Introduction
In the literature on crystals, "twinning" is a word used to describe a variety of phenomena involving different but symmetry-related configurations which coexist in crystals, meeting to form surfaces of discontinuity. As is discussed by PITTERI [1], there have been various attempts to formulate a more precise general definition of the word, as it applies to crystals. His discussion makes clear that some types of twinning are outside the scope of thermoelasticity theory. His definition excludes some phenomena which some experts on crystals call twins, like the "rotational twins" described by BARRETT & MASSALSKI [2, p. 406], things which seem to me more reminiscent of the multiple births we commonly describe by other words, like triplets or sextuplets. Whatever one calls them, they are of physical interest, as are other somewhat similar phenomena. My purpose is to present elements of thermoelasticity theory for things of this general kind.
II. Thermoelastic Bodies
To abstract features which seem significant, we consider a homogeneous thermoelastic body, referred to a homogeneous reference configuration. For present purposes, it is characterized by one smooth constitutive equation, of the form

    \phi = \phi(C, \theta) ,    (2.1)

with φ identified as the Helmholtz free energy per unit mass, θ as absolute temperature and

    C = F^T F = C^T .    (2.2)

Here, F is the usual deformation gradient, with

    \det F > 0 ,    (2.3)

implying that C is positive definite. To discuss symmetry-related configurations we want there to be some non-trivial material symmetry, so we require that

    \phi(H^T C H, \theta) = \phi(C, \theta) , \qquad H \in G ,    (2.4)
G being some non-trivial subgroup of the unimodular group. For crystals, molecular theory suggests for G a group which can be represented by the unimodular matrices of integers, as is discussed by ERICKSEN [3], and the experience is that this is an apt choice for analyses of twinning, etc. The H then representing G need not be these matrices of integers, but are similar to them. Select subgroups can suffice for particular problems. For general discussion, we need not fix any particular choice of G. For any particular choice of θ = θ_1, C = C_1 is a natural state provided the inequality

    \phi(C, \theta_1) \ge \phi(C_1, \theta_1)    (2.5)

holds in the strict sense, at least when C - C_1 is small enough. In more physical terms, these are stable or metastable unstressed equilibrium configurations, equilibrium of the simplest kind. With (2.4), for any H ∈ G, H^T C_1 H is also a natural state. Many studies of thermoelasticity theory presume that

    H^T C_1 H = C_1 \qquad \forall H \in G .    (2.6)

The group suggested above for crystals leaves invariant no symmetric, positive tensor so, with this choice of G, (2.6) cannot be satisfied. Even if we ignore this, it is necessary to deny (2.6), to accommodate common examples of twinning. So we want φ to be such that, for some H ∈ G, and some natural state C_1,

    C_2 = H^T C_1 H \ne C_1 , \qquad \det H = 1 ,    (2.7)

giving us at least two different, symmetry-related natural states. Alternatively, if F_1 and F_2 are any deformation gradients corresponding to C_1 and C_2, as indicated by (2.2) and (2.3), there exists a rotation R,

    R^{-1} = R^T , \qquad \det R = 1 ,    (2.8)

such that

    F_2 = R F_1 H .    (2.9)

One more condition is inherent in notions of phenomena like twinning, that it should be possible for such natural states to coexist, coherently, in the same body. That is, it should be feasible to regard F_1 and F_2 as values of F in adjacent regions, with F the gradient of a continuous deformation. This introduces the requirement that the classical kinematic conditions of compatibility be satisfied where the regions meet, viz.

    F_2 = (1 + a \otimes n) F_1 .    (2.10)
Here, n is the unit normal to the surface of discontinuity, in the current configuration, and a is the so-called amplitude vector. In this most elementary formulation, F_1, F_2, R etc. are constants; we are interested in the possibilities for patching together homogeneous deformations, with occasional planes of discontinuity, etc. What we desire, I think, is a neat way of classifying and picturing all of these, for crystals at least. Certainly, there is interest in like phenomena encountered in stressed crystals, and other types of complications, but I will ignore such ramifications. As it applies to thermoelasticity theory, PITTERI'S [1] definition of twinning requires that

    R^2 = H^2 = 1 ,    (2.11)_1

(2.9) then implying that

    F_1 = R F_2 H ,    (2.11)_2

reflecting the idea that F_1 and F_2 are related like twins. Actually, one of the conditions (2.11) implies the other, so one can assume less. As he indicates, the theory of such twins in unstressed crystals is rather well understood. Similar remarks apply to the lattice-invariant shears, as defined by ERICKSEN [4], cases where

    R = 1 , \qquad H \ne 1 .    (2.13)

Other possibilities include at least some of the "rotational twins" described by BARRETT & MASSALSKI [2, p. 406]. Their description suggests that, for these, we want, for some integer N,

    H^N = 1    (2.14)
and perhaps more; their statements about possible amplitudes, etc. seem not to follow from this alone. Later, we will discuss implications of (2.14) in detail. Since I am hardly an expert crystallographer, I could be overlooking other possibilities known to such experts. In any event, it seems sensible to try to understand the full set of mathematical possibilities.

III. A Solution

In the setting indicated, it is rather natural to think of selecting F_1 and H, then trying to calculate values of other quantities involved, when they exist. From (2.9) and (2.10), one can eliminate F_2, to obtain

    R F_1 H = (1 + a \otimes n) F_1 ,    (3.1)

wherein the rotation R as well as the vectors a and n are regarded as unknowns. We can lump together the known quantities, writing

    J = F_1 H^{-1} F_1^{-1} ,    (3.2)

replacing (3.1) by

    R = (1 + a \otimes n) J .    (3.3)

From (2.7)_2 and (3.2),

    \det J = \det H^{-1} = 1 .    (3.4)
Whatever other properties J might have depends on the choice of F_1 and H. For crystals, H, hence J, is similar to a matrix of integers, so tr J is an integer, for example and, later, we make a little use of this. With (3.4), we do have, by taking determinants of both sides of (3.3),

    1 = \det(1 + a \otimes n) ,    (3.5)

implying that

    (1 + a \otimes n)^{-1} = 1 - a \otimes n ,    (3.6)

and, bearing in mind that we already assumed n to be a unit vector,

    a \cdot n = 0 , \qquad n \cdot n = 1 .    (3.7)
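For readers who want the intermediate algebra, the two facts used here follow from standard identities for a rank-one update of the identity. The block below is a short sketch of that reasoning, using only the symbols a and n already introduced.

```latex
\begin{align*}
\det(1 + a \otimes n) &= 1 + a \cdot n
  && \text{(determinant of a rank-one update of the identity)}, \\
(1 + a \otimes n)(1 - a \otimes n) &= 1 - (a \cdot n)\, a \otimes n
  && \text{(since } (a \otimes n)(a \otimes n) = (n \cdot a)\, a \otimes n \text{)}.
\end{align*}
```

So a unit determinant forces a · n = 0, and with that the second line reduces to the identity, which is the statement that 1 - a ⊗ n is the inverse of 1 + a ⊗ n.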
With (3.3), we can eliminate the rotation R, using

    R^{-1} = J^{-1}(1 - a \otimes n) = R^T = J^T(1 + n \otimes a) ,

or, equivalently,

    K \equiv J J^T = (1 - a \otimes n)(1 - n \otimes a) .    (3.8)

From its definition and (3.4), we have that

    K = K^T > 0 , \qquad \det K = 1 .    (3.9)

Again, any other properties it might have depend on special choices of F_1 and H. Equations of the type (3.8) arise in almost all discussions of simple shearing deformations in nonlinear elasticity theory, from which it is perhaps clear that (3.8) cannot hold for all K satisfying (3.9). To analyze this, let e be a unit vector perpendicular to a and n,

    e \cdot a = e \cdot n = 0 , \qquad e \cdot e = 1 .    (3.10)

Then (3.8) implies that

    K e = e ,    (3.11)

so it is clearly necessary that

    \det(K - 1) = 0 .    (3.12)

So, we must pick F_1 and H to satisfy this condition, or the equivalent. If we do, we can calculate possible vectors a and n, satisfying (3.8), use (3.3) to calculate R, in brief to solve the basic problem posed. One way to do the calculation is to introduce the spectral representation of K. Granted that (3.12) and (3.9) hold, it will be of the form

    K = e \otimes e + \lambda \, f \otimes f + \lambda^{-1} g \otimes g ,    (3.13)

with e, f and g orthonormal vectors, and λ > 0. If λ = 1, then K = 1, and it then follows from (3.1) that a = 0, a trivial case of no interest. Otherwise, it is clear that the e occurring in (3.11) must be that given in (3.13), so we have not seriously abused notation. Also, from (3.7) and (3.8), we infer that

    K n = n - a ,    (3.14)

or

    a = (1 - K) n    (3.15)

and, using (3.7),

    n \cdot K n = n \cdot n = 1 .    (3.16)
276 Defects in Thermoelastic Solids
341
n = (cos v ) / + (sin y>) g,
(3.17)
So, write and use (3.16) to determine y>. This gives A cos2 f + A"-1 sin2 y) = 1 or cos v = ± 1 / ^ + 1,
sin f = ± ]/kl{X+ 1),
(3.18)
where any choice of the signs can be used. For each choice, (3.15) determines a corresponding value of $a$. By simple calculation one can verify that the values calculated satisfy (3.8). Nominally, this gives four solutions, but only two are essentially different. That is, we obviously get one solution from another by replacing $a$ by $-a$ and $n$ by $-n$, and this accounts for two of the four. This analysis seems to be new, though special cases have been treated by various writers. One can deduce various conditions which are equivalent to (3.11), but I won't belabor this.

From this view, the only difficulty lies in deciding under what physical circumstances it is likely that (3.12) will be satisfied, and in better understanding physical implications of the analysis. Clearly, it only takes an infinitesimal shift in $K$ to violate (3.12). Normally, we think of $F_1$ as varying smoothly with $\theta$, over some temperature interval, the phenomenon of thermal expansion. Superficially, this makes it seem likely that (3.12) will fail to hold over an interval of temperatures. That there are likely exceptions is known from the theory of twinning. Briefly, suppose that $H^2 = 1$. Then from (3.2), it is also true that $J^2 = 1$. With (3.4) and a bit of matrix theory, one can then show that $J$ must be of the form
$$J = -1 + 2c \otimes d, \quad c \cdot d = 1, \tag{3.19}$$
so
$$K = (-1 + 2c \otimes d)(-1 + 2d \otimes c). \tag{3.20}$$
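The twinning form (3.19)-(3.20) can be checked with exact integer arithmetic. The following is only a minimal sketch: the sample vectors `c` and `d` are hypothetical, chosen so that $c \cdot d = 1$, and the helper names are not from the paper. It confirms that $J^2 = 1$ and that $K$ leaves fixed the vector $e = c \times d$, which is perpendicular to both $c$ and $d$.

```python
# Exact integer check of the twinning form (3.19)-(3.20).
# The vectors c and d are hypothetical sample data with c . d = 1.

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

c = [1, 2, 0]
d = [1, 0, 3]          # c . d = 1, as (3.19) requires

cd = outer(c, d)
# (3.19): J = -1 + 2 c (x) d
J = [[-IDENTITY[i][j] + 2 * cd[i][j] for j in range(3)] for i in range(3)]
# (3.20): K = (-1 + 2 c (x) d)(-1 + 2 d (x) c), i.e. J times its transpose
K = matmul(J, transpose(J))

e = cross(c, d)        # perpendicular to both c and d
J_squared = matmul(J, J)
K_e = matvec(K, e)
```

Here `J_squared` equals the identity and `K_e` equals `e`, so $K$ has the eigenvalue 1 that the spectral form (3.13) presupposes.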
Clearly, (3.11) then holds, with $e$ perpendicular to $c$ and $d$; it matters little what $F_1$ is, as long as $H$ does not map $C_1$ to itself. Somewhat similar reasoning applies to the lattice invariant shears, typified by (2.13). Cases to be discussed do restrict $C_1$, but only a little more, so the superficial impression mentioned above is at least not entirely correct.

A rather curious result is implicit in analyses given by ERICKSEN [5], although one has to examine the special cases to check it. After I noticed it, and mentioned it to RICHARD JAMES, he found and showed me a more direct proof. Suppose that we have a solution for any case where $H$ is a rotation. Then there exist 180° rotations $R_1$ and $R_2$, viz.
$$R_1^2 = R_2^2 = 1, \tag{3.21}$$
such that
$$R F_1 H = R_1 F_1 R_2. \tag{3.22}$$
This comes close to saying that, with $H$ orthogonal, all solutions are of the twinning type, basically, although $R_2$ might not belong to $G$. In various cases encountered in practice, one can choose a reference configuration so $H$ is orthogonal, although it might not be a choice which intuition suggests is most natural. In this respect, the lattice invariant shears are exceptional, involving $H$ not similar to any rotation. So, (3.22) gives a characterization of a subset of the possibilities, and we might hope for some characterization of other subsets, covering the lot.

IV. Cases with $H^N = 1$
If, as is the case for crystals, $G$ admits non-trivial finite subgroups, we have the possibility of using $H$ in one of these. There will then be a smallest integer $N$ such that
$$H^N = 1. \tag{4.1}$$
As indicated before, the case $N = 2$, associated with twins, as defined by PITTERI [1], is pretty well understood, so we are primarily interested in cases with $N > 2$. Let me review some classical arguments, for the reader who might be rusty. First, $H$ leaves invariant at least one symmetric positive definite tensor. One way of determining it is to take any tensor $S$ such that
$$S = S^T > 0, \tag{4.2}$$
and calculate the average
$$\langle S \rangle = \frac{1}{N} \sum_{M=1}^{N} (H^M)^T S H^M. \tag{4.3}$$
It is then easy to verify that
$$\langle S \rangle = \langle S \rangle^T > 0, \quad H^T \langle S \rangle H = \langle S \rangle. \tag{4.4}$$
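The averaging argument (4.2)-(4.4) is easy to test exactly. The sketch below is illustrative only: the integer matrix `H` (a six-fold element written in a lattice basis) and the tensor `S` are hypothetical sample data, and the sum is taken over $M = 1, \dots, N$ as in (4.3). It checks that the average is symmetric, positive definite (via leading principal minors) and invariant under $H$.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(row) for row in zip(*A)]

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]

# Hypothetical sample data: an integer matrix of order 6 and a symmetric
# positive definite tensor S.
H = [[Fraction(x) for x in row] for row in ([0, -1, 0], [1, 1, 0], [0, 0, 1])]
S = [[Fraction(x) for x in row] for row in ([2, 1, 0], [1, 3, 1], [0, 1, 4])]

def order_of(H, max_order=24):
    """Smallest N with H^N = 1, as in (4.1)."""
    power = H
    for N in range(1, max_order + 1):
        if power == IDENTITY:
            return N
        power = matmul(power, H)
    raise ValueError("no finite order found up to max_order")

def group_average(S, H):
    """The average (4.3): (1/N) * sum over M = 1..N of (H^M)^T S H^M."""
    N = order_of(H)
    total = [[Fraction(0)] * 3 for _ in range(3)]
    power = IDENTITY
    for _ in range(N):
        power = matmul(power, H)
        term = matmul(matmul(transpose(power), S), power)
        total = [[total[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return [[entry / N for entry in row] for row in total]

def leading_minors(A):
    m1 = A[0][0]
    m2 = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    m3 = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
          - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
          + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return [m1, m2, m3]

S_avg = group_average(S, H)
```

Using `Fraction` keeps the invariance test an exact equality rather than a floating-point comparison.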
Also, $H$ must be similar to some rotation,
$$H = L^{-1} R L, \quad R^{-1} = R^T, \quad \det R = 1. \tag{4.5}$$
One way of determining $L$ is to solve $L^T L = \langle S \rangle$ for it. Then, (4.4) gives
$$H^T L^T L H = L^T L \;\Leftrightarrow\; (L^{-T} H^T L^T)(L H L^{-1}) = 1; \tag{4.6}$$
with this and (2.7)$_2$, (4.5) is immediate. For crystals, the possibilities are
$$N = 1, 2, 3, 4, 6. \tag{4.7}$$
As mentioned before, $H$ should now also be similar to a matrix of integers, so
$$\operatorname{tr} H = \operatorname{tr} R = 1 + 2\cos\chi = \text{integer}, \tag{4.8}$$
$\chi$ being the angle of rotation. Determine the angles possible, and you get (4.7). The result, and the arguments used to get it, are familiar to crystallographers, since they are used in deducing the crystallographic point groups. Further, according to the theory of material symmetry, one can subject $H$ to a similarity transformation, by making a change of reference configuration, as is discussed by NOLL [6]. So, there is no real loss of generality in choosing a reference such that
$$H = R, \tag{4.9}$$
and I'll assume this. For crystals, this implies that $H$ then belongs to one of the crystallographic point groups, rather familiar things. However, it will generally not be the point group commonly associated with the natural state considered. With (4.9), we can employ (3.22) in the form
$$\bar{R} F_1 R = R_1 F_1 R_2, \tag{4.10}$$
with $R_1$ and $R_2$ satisfying (3.21), $\bar{R}$ denoting the rotation on the left of (3.22), now distinguished from $R = H$. So, we first consider
$$R_1 F_1 R_2 = (1 + a \otimes n) F_1, \tag{4.11}$$
a standard twinning problem. As is known, and clear from the previous discussion, it will have two solutions. Since $R_2$ is a 180° rotation, we can represent it in the form
$$R_2 = -1 + 2h \otimes h, \quad h \cdot h = 1, \tag{4.12}$$
$h$ being the axis. Rotating an unstressed body leaves it unstressed. We can introduce a normalizing condition based on this, and do this by taking
$$F_1 = F_1^T = \sqrt{C_1}, \tag{4.13}$$
the usual positive definite symmetric square root. Then, the calculations give, for one solution,
$$n = F_1^{-1}h / \|F_1^{-1}h\|, \quad a = 2\,(n - \|F_1^{-1}h\|\, F_1 h), \quad R_1 = -1 + 2n \otimes n, \tag{4.14}$$
clearly consistent with (3.21). Similarly, the second solution is
$$a = \alpha F_1 h, \quad n = 2\alpha^{-1}\,(F_1^{-1}h - F_1 h/\|F_1 h\|^2), \quad R_1 = -1 + 2a \otimes a/\|a\|^2, \tag{4.15}$$
the scalar $\alpha$ being fixed by the condition that $n \cdot n = 1$, which gives
$$\alpha^2 = 4\,(\|F_1^{-1}h\|^2 - \|F_1 h\|^{-2}). \tag{4.16}$$
Of course, it is here presumed that $R_2$ does not leave $C_1$ invariant,
$$R_2 C_1 R_2 \neq C_1. \tag{4.17}$$
With (4.12) and (4.13), this is equivalent to
$$R_2 F_1 R_2 \neq F_1, \tag{4.18}$$
and it guarantees that the two solutions are well defined, with $a \neq 0$. This gives the solutions explicitly, for $N = 2$; take $R_2 = R$.
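The two twinning solutions can be checked numerically. The sketch below assumes the readings $n = F_1^{-1}h/\|F_1^{-1}h\|$, $a = 2(n - \|F_1^{-1}h\| F_1 h)$ for (4.14) and $a = \alpha F_1 h$, $n = 2\alpha^{-1}(F_1^{-1}h - F_1 h/\|F_1 h\|^2)$ for (4.15), with $\alpha$ from (4.16). The diagonal stretch `F1` and the axis `h` are hypothetical sample data, and the function names are not from the paper.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def matvec(A, v):
    return [dot(row, v) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def shear(a, n):
    """The tensor 1 + a (x) n."""
    return [[(1.0 if i == j else 0.0) + a[i] * n[j] for j in range(3)]
            for i in range(3)]

def half_turn(u):
    """The 180-degree rotation -1 + 2 u (x) u about the unit vector u."""
    return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

# Hypothetical data: F1 symmetric positive definite (here diagonal), and a
# unit axis h that is not an eigenvector of F1, so that (4.18) holds.
STRETCHES = [1.0, 1.2, 0.9]
F1 = [[STRETCHES[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
F1_INV = [[1.0 / STRETCHES[i] if i == j else 0.0 for j in range(3)]
          for i in range(3)]
h = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0), 0.0]
R2 = half_turn(h)                      # (4.12)

def first_solution():
    w = matvec(F1_INV, h)
    scale = norm(w)
    n = [x / scale for x in w]
    Fh = matvec(F1, h)
    a = [2.0 * (n[i] - scale * Fh[i]) for i in range(3)]
    return half_turn(n), a, n          # (4.14)

def second_solution():
    w = matvec(F1_INV, h)
    Fh = matvec(F1, h)
    Fh2 = dot(Fh, Fh)
    alpha = 2.0 * math.sqrt(dot(w, w) - 1.0 / Fh2)    # (4.16)
    a = [alpha * x for x in Fh]
    n = [(2.0 / alpha) * (w[i] - Fh[i] / Fh2) for i in range(3)]
    axis = [x / norm(a) for x in a]
    return half_turn(axis), a, n       # (4.15)

def residual(R1, a, n):
    """Largest entry of R1 F1 R2 - (1 + a (x) n) F1, cf. (4.11)."""
    lhs = matmul(matmul(R1, F1), R2)
    rhs = matmul(shear(a, n), F1)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
```

Calling `residual(*first_solution())` and `residual(*second_solution())` should both return values at the level of rounding error.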
Otherwise, for (4.10) to hold, it is necessary that
$$R^T C_1 R = R_2 C_1 R_2, \tag{4.19}$$
or
$$\tilde{R}^T C_1 \tilde{R} = C_1, \tag{4.20}$$
where
$$\tilde{R} = R R_2, \quad R = \tilde{R} R_2, \tag{4.21}$$
and clearly we don't want to have $\tilde{R} = 1$. Mathematically, it is not so hard to satisfy these conditions. For example, any rotation can be written as the product of two 180° rotations, so take $R$ to be what you will; we will it to be a rotation, with angle $2\pi/N$. So represent it, taking one rotation as $R_2$, the other as $\tilde{R}$; these need not belong to $G$. Then pick $C_1$ to have the axis of $\tilde{R}$ as an eigenvector, and satisfy (4.17). Then (4.20) will hold. Some other forms of $R$ can be accommodated, by letting $C_1$ have two equal eigenvalues; if all three coincide, (4.17) can't hold, and we must respect it. Physically, we would like $C_1$ to be a natural state, for some realistic form of the constitutive function, which should be invariant under a group containing $R$. Possibly, some limitation might be deduced from this, but I am not sure of this. If we satisfy (4.19), we will have some values for $R_2$ and $F_1$, information needed to use (4.14) or (4.15), either giving us values for $R_1$, etc. Also, with (4.13), (4.19) is equivalent to
$$R^T F_1 R = R_2 F_1 R_2, \tag{4.22}$$
from which it follows that (4.10) holds, with
$$\bar{R} = R_1 R_2 R^T. \tag{4.23}$$
To recapitulate briefly, we then have, the quantities determined as indicated,
$$\bar{R} F_1 R = R_1 R_2 R^T F_1 R = R_1 R_2 R_2 F_1 R_2 = R_1 F_1 R_2 = (1 + a \otimes n) F_1. \tag{4.24}$$
Rather obviously, the analysis applies to any case where H is similar to a rotation, not just to cases where H belongs to a finite group. For the crystals, it is known that H is similar to a rotation if and only if it belongs to a finite group and, at least so far, most of the interest in such phenomena has centered around the crystals. Clearly, the set of possibilities here considered has a rather clear group-theoretic status. With the apparatus given, it is easy enough to construct illustrative examples. I have found some fitting the description of rotational twins given by BARRETT & MASSALSKI [2, p. 406], for example, with the property that R = 1. However, I have not found time for a deeper study of the set.
There remain other logical possibilities, cases where $H$ is not similar to a rotation and $R \neq 1$. One can concoct examples, by composing twinning solutions with suitably related lattice invariant shears, and one might replace the twinning solutions by the other types considered above. I haven't noticed examples of an essentially different kind, and observations known to me suggest nothing else. Lacking good theorems or more illuminating examples, I can say little more. While the basic formulation is quite elementary, the questions it raises are not so easy to settle.

Acknowledgement. This material is based on work supported by the National Science Foundation under Grant No. MEA-8304750.
References

1. PITTERI, M., On the kinematics of mechanical twinning in crystals, Arch. Rational Mech. Anal. 88, 25-58 (1985).
2. BARRETT, C. & MASSALSKI, T. B., Structure of Metals, 3rd ed. McGraw Hill, Inc., 1966.
3. ERICKSEN, J. L., "Special Topics in Nonlinear Elastostatics", in Advances in Applied Mechanics (ed. C.-S. YIH), vol. 17, New York: Academic Press, 1977.
4. ERICKSEN, J. L., The Cauchy and Born hypotheses for crystals, MRC Technical Summary Report #2591, University of Wisconsin, 1983.
5. ERICKSEN, J. L., Stress-free joints, J. Elasticity 13, 3-15 (1983).
6. NOLL, W., A mathematical theory of the mechanical behavior of continuous media, Arch. Rational Mech. Anal. 2, 197-226 (1958).

Department of Aerospace Engineering and Mechanics
and
School of Mathematics
University of Minnesota
Minneapolis

(Received August 7, 1984)
Offprint from "Archive for Rational Mechanics and Analysis", Volume 94, Number 1, 1986, pp. 1-14 © Springer-Verlag 1986 Printed in Germany

Stable Equilibrium Configurations of Elastic Crystals

J. L. ERICKSEN

For J. B. Serrin on his 60th birthday

1. Introduction
Nonlinear thermoelasticity theory for crystals is generating some knotty questions in the calculus of variations and in the theory of related partial differential equations. This stems from invariance assumptions which are suggested by molecular theory, and seem to be needed to analyze some commonly observed phenomena, such as twinning. What might be regarded as the simplest problem is to characterize the most stable configurations of unloaded crystals. Even this is difficult, because these are not all unique, but form infinite sets. A number of these do seem to match configurations observed in real crystals. After elaborating this a bit, I will present methods for constructing solutions which are special, but resemble configurations which are observed.

2. Crystal Elasticity

Kinematical descriptions of crystal configurations and their symmetries are discussed in detail by PITTERI [1, 2]. Some of the simpler kinds are adequately described by a set of three linearly independent vectors $e_a$ ($a = 1, 2, 3$), called lattice vectors. Then, two such sets $e_a$ and $\bar{e}_a$ describe the same configuration provided
$$\bar{e}_a = m_a^b e_b, \tag{2.1}$$
where the $m_a^b$ are integers such that
$$\det m = \pm 1, \quad m = \|m_a^b\|. \tag{2.2}$$
Under the obvious rule for composing these linear transformations, they form a group $G$,
$$m \in G, \tag{2.3}$$
which is infinite, discrete and not compact. Essentially, molecular theories of elasticity are schemes for calculating $\phi$, the Helmholtz free energy per unit mass,
as a function of lattice vectors and temperature $T$,
$$\phi = \phi(e_a, T). \tag{2.4}$$
In some situations, for example for the quartz problems discussed by JAMES [3], one finds that additional variables are needed, for an adequate description. Other difficulties are encountered in dealing with long range interactions. However, (2.4) seems to apply to a variety of crystals. When it does, one should get a single value of $\phi$ for a single configuration, so
$$\phi(m_a^b e_b, T) = \phi(e_a, T), \quad m \in G. \tag{2.5}$$
The notion that rotating a configuration should not alter this energy gives rise to the additional invariance that, for any rotation $R$,
$$R^{-1} = R^T, \quad \det R = 1, \tag{2.6}$$
$$\phi(R e_a, T) = \phi(e_a, T). \tag{2.7}$$
To get to elasticity or thermoelasticity theory, the usual practice is to introduce the Born hypothesis, which is discussed in detail by ERICKSEN [4]. Briefly, one picks some reference configuration, including a definite choice of (constant) reference lattice vectors $E_a$. Then, the assumption is that a macroscopic deformation $x = \chi(X)$, with gradient
$$F = \nabla\chi, \quad \det F > 0, \tag{2.8}$$
takes $E_a$ to $e_a$, a possible set of lattice vectors in the deformed configuration, given by the formula
$$e_a = F E_a. \tag{2.9}$$
Thus
$$\phi(e_a, T) = \phi(F E_a, T) = \hat{\phi}(F, T), \tag{2.10}$$
the $E_a$ being regarded as fixed. To describe the invariance of $\hat{\phi}$, we define, for each $m \in G$, a linear transformation $H(m)$ by the rule
$$H(m) E_a = m_a^b E_b, \tag{2.11}$$
such $H$ satisfying
$$\det H = \pm 1, \tag{2.12}$$
and generating a group $\bar{G}$ conjugate to $G$. Then, by easy calculation,
$$\hat{\phi}(RFH, T) = \hat{\phi}(F, T), \quad H \in \bar{G}. \tag{2.13}$$
Commonly, such invariances are obscured by approximations made in calculating thermodynamic potentials from molecular theory. However, examples of molecular theories consistent with the above description can be found, for example in the works of EFTIS, MACDONALD & ARKILIC [5] and ERICKSEN [6]. The indicated invariance causes $\hat{\phi}$ to have some unusual properties. Since $G$ and $\bar{G}$ are not compact, one can pick any $F$ in the domain of $\hat{\phi}$, and find a sequence $H_n \in \bar{G}$ with $F H_n$ unbounded as $n \to \infty$. Thus, the assumption that $\hat{\phi} \to \infty$ when $F$ grows large is untenable. Similarly, the invariance prevents
where $\rho$ is a positive constant, the mass density, and $\Omega$ is the region occupied by a body in the reference configuration. Make $\Omega$ as simple and smooth as you like, assume $\hat{\phi}$ very smooth, etc., to simplify matters as much as is possible. Here, there are no constraints on deformation, so we can begin by trying to minimize the integrand. This might well be possible. Suppose that for some constant $F = F_0$, with $\det F_0 > 0$, we have
$$\hat{\phi}(F, T) \geq \hat{\phi}(F_0, T), \tag{2.15}$$
for all $F$ in the domain of $\hat{\phi}$. This gives us, as a very simple minimizer, a homogeneous configuration with constant lattice vectors. Of course, with (2.13), we have an infinite collection of similar minimizers, with deformation gradients of the form
$$R F_0 H. \tag{2.16}$$
Were $\bar{G}$ a continuous group, we might find a smooth inhomogeneous deformation, with $F$ taking values in this set, but one can show that here, this is impossible. One could have additional minimizers not so related by invariance and, physically, this can happen at isolated temperatures, but we ignore this. A complication is suggested by observations of twinning. Physically, it seems quite reasonable to regard twinned configurations as minimizers. Generalizing this idea, we should allow configurations for which $F$ is piecewise continuous, as long as the displacement remains continuous. If $R_1 F_0 H_1$ and $R_2 F_0 H_2$ are two values in neighboring regions, meeting at a smooth surface with normal $N$ in the reference configuration, the usual kinematical conditions of compatibility require that there exist a vector $A$ satisfying
$$R_2 F_0 H_2 = R_1 F_0 H_1 (1 + A \otimes N). \tag{2.17}$$
One can normalize this by requiring that $N$ be a unit vector, but I find it more convenient to use different normalizations in different cases. I will just assume that $A \otimes N = 0$ only if $A = 0$. With $F_0$, the $R$'s and $H$'s constant, it is clear that $A$ and $N$ can be taken as constant so, where smooth, such surfaces are planes. As is discussed by JAMES [8, 9], one can deduce additional restrictions, if more than two such regions should meet. Briefly, there are many possibilities for so tacking together these homogeneous deformations and, to some extent, they depend on the function
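$$H = H_2 H_1^{-1}, \quad \bar{A} = H_1 A, \quad \bar{N} = H_1^{-T} N, \tag{2.18}$$
(2.17) reduces to much the same thing, viz.
$$R F_0 H = F_0 (1 + \bar{A} \otimes \bar{N}). \tag{2.19}$$
I will assume that $H$ belongs to the subgroup $\bar{G}^+$ of $\bar{G}$ defined by
$$\det H = 1 \;\Leftrightarrow\; \det m = 1, \tag{2.20}$$
there being no real loss of generality in doing this. Then, by taking determinants of both sides of (2.19), we find that
$$\det(1 + \bar{A} \otimes \bar{N}) = 1 \;\Leftrightarrow\; \bar{A} \cdot \bar{N} = 0. \tag{2.21}$$
By setting
$$a = F_0 \bar{A}, \quad n = F_0^{-T} \bar{N} \;\Rightarrow\; a \cdot n = 0, \tag{2.22}$$
we can rewrite (2.19) as
$$R F_0 H = (1 + a \otimes n) F_0. \tag{2.23}$$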
One can think of picking values of $F_0$ and $H$, and solving the latter for $R$, $a$ and $n$, if possible. As is established by ERICKSEN [10], it is impossible unless
$$\det(J J^T - 1) = 0, \tag{2.24}$$
where
$$J = F_0 H^{-1} F_0^{-1}. \tag{2.25}$$
If (2.24) holds, then either
$$J J^T = 1 \;\Rightarrow\; a = 0 \tag{2.26}$$
or there are two non-trivial solutions.* Given $(R, a, n)$ for one, one can get corresponding values $(\bar{R}, \bar{a}, \bar{n})$ for the other by using the following prescriptions:
$$2\bar{a} = -\|n\|^2 a + 2n, \tag{2.27}$$
$$(\|a\|^2 \|n\|^2 + 4)\, \bar{n} = 4a + 2\|a\|^2 n. \tag{2.28}$$
Then, set
$$\bar{R} = \tilde{R} R, \tag{2.29}$$
where
$$\tilde{R} = (1 + \bar{a} \otimes \bar{n})(1 - a \otimes n). \tag{2.30}$$

* For this purpose, we count solutions obtained by multiplying $a$ by a scalar, and dividing $n$ by it, as the same.
It is straightforward to verify that $\tilde{R}$ is a rotation, and that the prescriptions deliver the second solution. This does give one characterization of solutions, and a way of constructing numerous examples, but we would like some better way of picturing the possibilities. At least for intuitive purposes and for making contact with works on crystallography, it can be useful to revert to the description in terms of lattice vectors, eliminating the rather arbitrary reference configuration. Set
$$e_a = F_0 E_a \tag{2.31}$$
and use (2.11) and (2.23) to get the equivalent
$$R F_0 H E_a = R F_0 m_a^b E_b = R m_a^b e_b = S e_a, \tag{2.32}$$
where
$$S = 1 + a \otimes n. \tag{2.33}$$
That is, we can get lattice vectors on the other side of the plane by applying this simple shear to $e_a$, or we can get them by applying $R$ and $m$ as indicated. Some special cases are now well understood. Cases summarized in (2.26) translate to the condition that
$$R m_a^b e_b = e_a. \tag{2.34}$$
This means that $R$ is an element of the crystallographic point group for these lattice vectors. In the terminology used by ERICKSEN [11], $m$ belongs to the corresponding lattice group. From the theory of these groups, we know that they are finite, conjugate and different for different choices of $e_a$. Also, we must have
$$R^N = m^N = 1, \quad N = 1, 2, 3, 4 \text{ or } 6. \tag{2.35}$$
This covers the trivial solutions, involving no shear. There is another infinite set of solutions for which $R = 1$, so that
$$S e_a = m_a^b e_b. \tag{2.36}$$
These are the so-called lattice invariant shears, discussed in detail by ERICKSEN [4], things considered by metallurgists seeking to determine what deformations have taken place in certain phase transformations. For these, the x-ray crystallographer sees the lattice vectors as continuous, but one can need to consider them to analyze shapes of crystallites. Here, such $m$'s do not depend on the choice of $e_a$. The more common forms of twinning involve choices of $m$ such that $m^2 = 1$. Then, as is discussed in detail by PITTERI [12], (2.31) can be solved for any choice of $e_a$ and solutions have the property that
$$m^2 = 1 \;\Rightarrow\; R^2 = 1. \tag{2.37}$$
Solutions can be of the form (2.34) or be non-trivial, depending on the choice of $m$ and $e_a$. PITTERI also includes an interesting example showing that
$$R^2 = 1 \;\not\Rightarrow\; m^2 = 1, \tag{2.38}$$
pointing out that one will either have $m^2 = 1$ or $m^N$ unbounded as $N \to \infty$, if $R^2 = 1$. Further, PITTERI has shown me a proof that, given any $m \in G$, with $\det m = 1$, we can find choices of $e_a$ such that (2.13) can be satisfied or, equivalently, such that (2.24) holds, for the corresponding arguments. A characterization of solutions possible with $H$ similar to a rotation is given by ERICKSEN [10], who notes that this is equivalent to the assumption that
$$H^N = 1 \;\Leftrightarrow\; m^N = 1, \quad N = 1, 2, 3, 4 \text{ or } 6. \tag{2.39}$$
The case $N = 1$ is trivial and $N = 2$, covered by (2.36), is well understood, so cases with $N > 2$ are of more interest. At first, I thought that this might be a way of characterizing certain kinds of observed patterns, the "rotation twins" which are described by BARRETT & MASSALSKI [13, p. 406]. It is not, but there is some connection. Later, I will elaborate on this. In the words of BARRETT & MASSALSKI, "Crystals are rotation twins if a two-, three-, four-, or six-fold rotation of one crystal about a twinning axis produces the orientation of the other. The rotation axis lies either in the twinning plane or normal to it and is not a symmetry element of the lattice of the individual crystals."
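The list of orders in (2.35) and (2.39) reflects the familiar crystallographic restriction: a rotation similar to an integer matrix has integer trace $1 + 2\cos\theta$. As a rough illustration, assuming the rotation angle is taken to be $2\pi/N$, the admissible orders can be enumerated numerically; the function names below are hypothetical.

```python
import math

def trace_of_rotation(N):
    """Trace 1 + 2 cos(2 pi / N) of a rotation through the angle 2 pi / N."""
    return 1.0 + 2.0 * math.cos(2.0 * math.pi / N)

def allowed_orders(max_order, tol=1e-9):
    """Orders N up to max_order for which the trace is an integer."""
    return [N for N in range(1, max_order + 1)
            if abs(trace_of_rotation(N) - round(trace_of_rotation(N))) < tol]

orders = allowed_orders(24)
```

The resulting list is `[1, 2, 3, 4, 6]`, with traces 3, -1, 0, 1 and 2 respectively.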
3. Examples

Here, I will present examples with $m^4 = 1$, most of which do not fit the aforementioned description of rotation twins. Pick any set of (linearly independent) lattice vectors $e_a$ such that
$$\|e_1\| = \|e_2\| = 1, \quad e_3 \cdot (e_1 + e_2) = 0, \tag{3.1}$$
and let $e^a$ denote the corresponding reciprocal lattice vectors (the dual basis). Let $e$ denote the unit vector given by
$$e = (e_1 + e_2)/\|e_1 + e_2\| = (e_1 + e_2)/\sqrt{2(1 + e_1 \cdot e_2)}. \tag{3.2}$$
One 180° rotation is defined by
$$R_1 = -1 + 2e \otimes e. \tag{3.3}$$
By an elementary analysis, one can verify that, with (3.1), we can also write
$$e = (e^1 + e^2)/\|e^1 + e^2\| = (e^1 + e^2)\,\|e_1 + e_2\|/2; \tag{3.4}$$
thus, (3.3) is equivalent to
$$R_1 = -1 + (e_1 + e_2) \otimes (e^1 + e^2) = -1 + (e^1 + e^2) \otimes (e_1 + e_2). \tag{3.5}$$
This means that $R_1$ belongs to the point group for such $e_a$. Another 180° rotation is given by
$$R_2 = -1 + 2e_2 \otimes e_2. \tag{3.6}$$
Composing these, we obtain the rotation
$$R = R_2 R_1 \tag{3.7}$$
with axis parallel to the vector (
(3.9)
and, by using the definition of $R$, we find that
$$\cos\theta = e_1 \cdot e_2. \tag{3.10}$$
We can choose $e_1$ and $e_2$, consistent with (3.1), to get any desired angle, excepting $\cos\theta = \pm 1$, which would contradict the requirement that the $e_a$ be linearly independent. Now, define $m \in G$ as follows:
$$m_1^a e_a = -e_2, \quad m_2^a e_a = e_1, \quad m_3^a e_a = e_3, \tag{3.11}$$
it being straightforward to show that
$$m^4 = 1. \tag{3.12}$$
By elementary calculations, we then find that
$$R m_1^a e_a = e_1 - 2(e_1 \cdot e_2)\, e_2, \quad R m_2^a e_a = e_2, \quad R m_3^a e_a = e_3 - 2(e_2 \cdot e_3)\, e_2. \tag{3.13}$$
We then have
$$R m_b^a e_a = (1 + a \otimes n)\, e_b, \tag{3.14}$$
with $a = 2e_2$, $n = e^2 - e_2$. To have $n \neq 0$, we need
$$e^2 \neq e_2, \tag{3.15}$$
which will hold unless $e_1 \cdot e_2 = e_2 \cdot e_3 = e_1 \cdot e_3 = 0$, that is, unless the vectors are all orthogonal. If they are all orthogonal, (3.13) reduces to the form (2.34), $R$ then being in the associated point group, the solutions becoming trivial. For one thing, the examples show that the analog of (2.37) does not hold, i.e.
$$m^4 = 1 \;\not\Rightarrow\; R^4 = 1. \tag{3.16}$$
From (3.10), we can have $R^4 = 1$, a 90° rotation, by assuming that $e_1 \cdot e_2 = 0 \neq e_1 \cdot e_3$. Then $n$ is orthogonal to $e_1$ and $e_2$, hence parallel to $e^3$ which, from (3.8), is the axis of rotation. This particular case conforms to the description of rotation twins given by BARRETT & MASSALSKI [13, p. 406]. Another special case satisfies their description in part. By choosing $e_1$ and $e_2$ so that $2e_1 \cdot e_2 = -1$, we see from (3.10) that $R$ will satisfy $R^3 = 1$. Their description also dictates that the axis of rotation, here $e^3$, should be either parallel or perpendicular to $n$. Checking this, one finds that it is not parallel, and is perpendicular only if $e_1 \cdot e_3 = e_2 \cdot e_3 = 0$. Of course, one can pick $e_a$ not satisfying these conditions. A somewhat similar example is given by PITTERI [12], with $R^2 = 1$, with the axis of $R$ neither parallel nor perpendicular to $n$. As far as I know, nothing excludes the possibility that any lattice vectors might minimize some potential function $\phi$, and occur in nature.
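The example of this section can be reproduced numerically. The sketch below is a minimal check under stated assumptions: the lattice vectors are hypothetical sample data satisfying (3.1), `dual` holds the reciprocal vectors $e^a$, and the shear is taken with $a = 2e_2$, $n = e^2 - e_2$. It verifies (3.10) through the trace of $R = R_2 R_1$ and verifies (3.14) on each lattice vector.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def matvec(A, v):
    return [dot(row, v) for row in A]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def half_turn(u):
    """The 180-degree rotation -1 + 2 u (x) u about the unit vector u."""
    return [[2.0 * u[i] * u[j] - (1.0 if i == j else 0.0) for j in range(3)]
            for i in range(3)]

# Hypothetical lattice vectors obeying (3.1): |e1| = |e2| = 1 and
# e3 perpendicular to e1 + e2 (e3 lies along e1 - e2 plus an out-of-plane part).
cos_theta = 0.3
e1 = [1.0, 0.0, 0.0]
e2 = [cos_theta, math.sqrt(1.0 - cos_theta ** 2), 0.0]
e3 = [0.4 * (e1[i] - e2[i]) for i in range(3)]
e3[2] = 1.0
basis = [e1, e2, e3]

# Reciprocal (dual) basis e^a, with e^a . e_b = delta^a_b.
volume = dot(e1, cross(e2, e3))
dual = [[x / volume for x in cross(e2, e3)],
        [x / volume for x in cross(e3, e1)],
        [x / volume for x in cross(e1, e2)]]

s = [e1[i] + e2[i] for i in range(3)]
length = math.sqrt(dot(s, s))
R1 = half_turn([x / length for x in s])       # (3.3)
R2 = half_turn(e2)                            # (3.6)
R = matmul(R2, R1)                            # (3.7)

# Images of e_1, e_2, e_3 under m, as in (3.11).
images_under_m = [[-x for x in e2], e1, e3]

a = [2.0 * x for x in e2]
n = [dual[1][i] - e2[i] for i in range(3)]
S = [[(1.0 if i == j else 0.0) + a[i] * n[j] for j in range(3)]
     for i in range(3)]

# Largest mismatch in (3.14): R m e_b versus (1 + a (x) n) e_b.
mismatch = max(abs(x - y)
               for b in range(3)
               for x, y in zip(matvec(R, images_under_m[b]),
                               matvec(S, basis[b])))
cos_from_trace = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
```

Varying `cos_theta` (away from $\pm 1$) gives rotations of any angle while $m^4 = 1$ stays fixed, which is the point made by (3.16).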
4. Some Constructions

Here, we deduce some fairly general procedures for constructing examples of solutions fitting the aforementioned description of rotation twins. First, consider solutions of (2.32) with the special property that
$$Ra = a \neq 0, \quad R \neq 1, \tag{4.1}$$
so that $R$ has a unique axis parallel to $a$, hence perpendicular to $n$. Then, (2.32) is equivalent to
$$M \equiv m_a^b\, e_b \otimes e^a = R^T S = R^T + a \otimes n. \tag{4.2}$$
Further, since $n$ is perpendicular to $a$, the axis of $R$, there is a vector $b$ such that
$$n = (R - 1)\, b \neq 0, \quad b \cdot a = 0. \tag{4.3}$$
Let
$$\bar{S} = 1 + a \otimes b, \tag{4.4}$$
and we find that
$$\bar{S} R^T \bar{S}^{-1} = R^T + a \otimes Rb - a \otimes b = R^T + a \otimes n = M, \tag{4.5}$$
so $M$ is similar to $R^T$. This implies that
$$M^N = R^N = 1, \quad N = 2, 3, 4 \text{ or } 6. \tag{4.6}$$
Briefly, taking traces gives
$$\operatorname{tr} M = m_a^a = \text{integer} = \operatorname{tr} R^T = 1 + 2\cos\theta, \tag{4.7}$$
where $\theta$ is the angle of rotation. Insert the possible integers, bearing in mind that $M$ is similar to $R^T$, and you get (4.6), which is well known to crystallographers. It can be helpful to note that any rotation $R$ is similar to its transpose: it is easy to verify that
$$R^T = \tilde{R} R \tilde{R}^T, \tag{4.8}$$
where $\tilde{R}$ is a 180° rotation, with axis perpendicular to that of $R$. Thus, $M$ is also similar to $R$. With (4.5) and (4.2), we have
$$M e_a = m_a^b e_b = \bar{S} R^T \bar{S}^{-1} e_a. \tag{4.9}$$
Let
$$\bar{e}_a = \bar{S}^{-1} e_a, \quad e_a = \bar{S} \bar{e}_a, \tag{4.10}$$
and we have, as the equivalent of (4.9),
$$R m_a^b \bar{e}_b = \bar{e}_a. \tag{4.11}$$
Referring to (2.34), this means that $m$ is in the lattice group for the lattice vectors $\bar{e}_a$, $R$ being the corresponding element of the associated point group, also similar to $m$. To construct examples, we can pick any $m \in G$ such that
$$m^N = 1, \quad N = 2, 3, 4 \text{ or } 6. \tag{4.12}$$
As is discussed by ERICKSEN [11], this guarantees that there will exist vectors $\bar{e}_a$ and a rotation $R$ satisfying (4.11), with $R^N = 1$. It is in fact fairly easy to determine them, given $m$. Then pick $a \neq 0$ parallel to the axis of $R$, $b \neq 0$ any vector perpendicular to it. With $\bar{S}$ then defined by (4.4), we can use (4.10) to determine $e_a$, satisfying (4.9). Use (4.3) to define $n$ which, by the construction, will be perpendicular to $a$. Routine calculations then show that we have a solution of (2.32), satisfying (4.1). With solutions of this kind, we can construct configurations consisting of symmetry-related wedges, fit together like the sections of an orange, special cases of the topological possibilities discussed by JAMES [8, 9]. Later, I will discuss this a bit more. Much the same kind of analysis applies to solutions of (2.32) such that
$$Rn = n, \quad R \neq 1, \tag{4.13}$$
which also fit the description of rotation twins. Again define $M$ as above. Instead of (4.2), we use
$$M^{-T} = R^T (1 - n \otimes a) = R^T - n \otimes a. \tag{4.14}$$
Here, we set
$$a = (1 - R)\, c, \quad c \cdot n = 0, \tag{4.15}$$
and similarly conclude that
$$M^{-T} = \bar{S} R^T \bar{S}^{-1}, \tag{4.16}$$
with
$$\bar{S} = 1 + n \otimes c, \tag{4.17}$$
or, equivalently,
$$M = \bar{S}^{-T} R^T \bar{S}^T. \tag{4.18}$$
Then, setting
$$\bar{e}_a = \bar{S}^T e_a, \tag{4.19}$$
we obtain, as the analog of (4.11),
$$R m_a^b \bar{e}_b = \bar{e}_a. \tag{4.20}$$
In a similar way, this can be read backwards to construct examples, and $\bar{S}^T$ is also interpretable as a simple shear. These solutions cannot be used to generate the "orange section" configurations referred to above, but do fit the description of rotation twins. Bearing in mind (4.8), we note that, for all of these examples, $M$ is similar to $R$. Suppose, conversely, that $M$ is similar to $R$ and
$$M = R^T (1 + a \otimes n). \tag{4.21}$$
In particular, we must have
$$\operatorname{tr} M = \operatorname{tr} R^T = \operatorname{tr} R, \tag{4.22}$$
implying that
$$R^T a \cdot n = 0. \tag{4.23}$$
Also $M^{-T}$ must be similar to $R$, and
$$M^{-T} = R^T (1 - n \otimes a), \tag{4.24}$$
similarly implying that
$$R^T n \cdot a = Ra \cdot n = 0. \tag{4.25}$$
Of course, we must also have $a \cdot n = 0$. By elementary reasoning, left to the reader, one finds that the three equations can hold only if at least one of the following four conditions holds:
$$Ra = \pm a, \quad Rn = \pm n. \tag{4.26}$$
If $Ra = -a$, $R$ is a 180° rotation, so $R^2 = 1$. With $M$ similar, we also have $M^2 = 1$. These are then standard twinning solutions, referred to in (2.37). Similar remarks apply to the possibility that $Rn = -n$. For all such twinning solutions, it is known that either $Ra = a$ and $Rn = -n$ or $Rn = n$ and $Ra = -a$. Thus, when $M$ is similar to $R$, (4.26) reduces to the statement that either $Ra = a$ or $Rn = n$, the possibilities analyzed above.
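The first construction of this section (the case $Ra = a$) can be illustrated with exact integer arithmetic. The following is a sketch under stated assumptions: `R` is taken to be a three-fold rotation about $(1, 1, 1)$, represented by a cyclic permutation matrix, and the vectors `a`, `b` are hypothetical choices satisfying (4.1) and (4.3). The shear written $1 + a \otimes b$ in (4.4) is called `S_bar` here. The code checks the similarity (4.5) and the consequence $M^3 = 1$ from (4.6).

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [dot(row, v) for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

def shear(u, v):
    """The tensor 1 + u (x) v."""
    return [[int(i == j) + u[i] * v[j] for j in range(3)] for i in range(3)]

IDENTITY = [[int(i == j) for j in range(3)] for i in range(3)]

# Hypothetical data: R rotates by 120 degrees about (1, 1, 1).
R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]     # maps (x, y, z) to (z, x, y)
a = [1, 1, 1]                             # R a = a, as in (4.1)
b = [1, -1, 0]                            # b . a = 0, as in (4.3)

Rb = matvec(R, b)
n = [Rb[i] - b[i] for i in range(3)]      # n = (R - 1) b

R_T = transpose(R)
M = [[R_T[i][j] + a[i] * n[j] for j in range(3)] for i in range(3)]   # (4.2)

S_bar = shear(a, b)                       # (4.4)
S_bar_inv = shear(a, [-x for x in b])     # inverse, because a . b = 0
conjugated = matmul(matmul(S_bar, R_T), S_bar_inv)                    # (4.5)
M_cubed = matmul(matmul(M, M), M)
```

Here `n` comes out as $(-1, 2, -1)$, perpendicular to `a`, and `M` has integer entries with trace 0, the value of $1 + 2\cos(2\pi/3)$ in (4.7).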
Thus, we have a rather nice characterization of a subset of the rotation twins, at least. There might or might not be other solutions fitting their description. I have not made a careful study of this question, but do not know of any examples.

5. Composition

Suppose that we have two solutions of (2.32) for the same choice of $e_a$, say
$$\bar{S} e_a = (1 + \bar{a} \otimes \bar{n})\, e_a = \bar{R} \bar{m}_a^b e_b \tag{5.1}$$
and
$$S e_a = (1 + a \otimes n)\, e_a = R m_a^b e_b. \tag{5.2}$$
One way of composing these is obtained by noting that
$$\bar{R} \bar{m}_a^b (S e_b) = \bar{R} S \bar{R}^T (\bar{R} \bar{m}_a^b e_b) = \bar{R} S \bar{R}^T (\bar{S} e_a) = \bar{R} R\, \bar{m}_a^b m_b^c e_c = \bar{R} R\, \bar{m}_a^b m_b^c \left(\bar{R}^T \bar{S} (\bar{m}^{-1})_c^d e_d\right). \tag{5.3}$$
In particular, we then have
$$\hat{S} (\bar{S} e_a) = \hat{R} \hat{m}_a^b (\bar{S} e_b), \tag{5.4}$$
with
$$\hat{S} = \bar{R} S \bar{R}^T = 1 + (\bar{R} a) \otimes (\bar{R} n), \quad \hat{R} = \bar{R} R \bar{R}^T, \quad \hat{m} = \bar{m} m \bar{m}^{-1} \in G. \tag{5.5}$$
So, (5.1) links $e_a$ by an admissible discontinuity to lattice vectors $\bar{S} e_a$. With (5.4) we link the latter by a similar discontinuity to lattice vectors $\hat{S}(\bar{S} e_a)$. One can similarly compose to link the latter to a fourth set, etc. This is a first step toward building configurations involving several regions tacked together. It needs to be supplemented by considerations such as are discussed by JAMES [8, 9], relevant when the various planes of discontinuity intersect. In particular, nothing prevents us from taking the two solutions indicated by (5.1) and (5.2) to be the same, iterating this as many times as we desire. This gives a sequence of the form
$$S e_a = R m_a^b e_b,$$
$$R S R^T (S e_a) = R m_a^b (S e_b),$$
$$R^2 S (R^T)^2 (R S R^T S)\, e_a = R m_a^b (R S R^T S)\, e_b,$$
the $N$th step giving
$$R^N S (R^T)^N e_a^{N-1} = R m_a^b e_b^{N-1} = R^{N+1} (m^{N+1})_a^b e_b, \tag{5.6}$$
with
$$e_a^{N-1} = \left[R^{N-1} S (R^T)^{N-1}\right] \cdots \left[R S R^T\right] S\, e_a. \tag{5.7}$$
Here, (5.6)$_2$ follows immediately from (5.3)$_3$ for $N = 1$ and is easily established by induction for $N > 1$. Sometimes, after a finite number of steps, this iteration will return the lattice vectors to their original values, the condition for this being that, for some $N = V < \infty$,
$$R^V (m^V)_a^b e_b = e_a. \tag{5.8}$$
One can continue the iteration, reproducing the collection $e_a^{N-1}$ in a periodic fashion, so I call these periodic solutions. Recalling (2.33), we see that (5.8) implies that $m^V$ is in the lattice group for $e_a$, with $R^V$ in the corresponding point group. These groups are finite and conjugate, so we have
$$R^V \text{ similar to } m^V \text{ similar to } M^V, \tag{5.9}$$
with $M$ defined as in (4.2). That these groups are finite also implies that there are smallest integers $P$ and $Q$ such that
$$R^P = 1, \quad m^Q = M^Q = 1. \tag{5.10}$$
Also, it implies that there exists a rotation $\tilde{R}$ such that
$$M \text{ is similar to } \tilde{R} \;\Rightarrow\; \tilde{R}^Q = 1. \tag{5.11}$$
First, suppose that $P = Q$, in which case we can take $\tilde{R} = R$, giving us the solutions discussed in § 4. For example, we might pick one with $Ra = a$, $P = Q = 3$. The iteration generates the three directions, $n$, $Rn$ and $R^2 n$, making 120° angles with each other. Clearly, we can construct three half planes whose normals have these directions, their edges meeting on a line with direction $a$. In the wedges thus formed, we can assign lattice vectors $e_a$, $R m_a^b e_b$ and $R^2 (m^2)_a^b e_b$, and have all the desired jump conditions satisfied on the planes of discontinuity. Clearly, (5.8) holds, with $V = 3$. Such triplets are called by some "trills". Clearly, one can construct similar solutions with four or six wedges, resembling the structure seen in oranges. If one uses the solutions with $Rn = n$, one gets parallel planes, so one can't form such wedges, but one can put together any number of strips, with the lattice vectors changing in a periodic fashion, as one traverses them. The width of the strips can be assigned arbitrarily. Such patterns of plane defects often occur in parts of real crystals. Frequently, these are the standard twins, described by (2.37) but, clearly, they need not be. From casual observation, one might be misled into thinking that they are, when they are not.
The aforementioned description of rotation twins does serve to warn us against making this mistake, to look closely at the crystal structure. There are the remaining possibilities, with $P \neq Q$. In § 3, we encountered examples with $P = 3$, $Q = 4$, enough to indicate that we are not discussing the null set. From (5.8)-(5.11) we must have
$$\operatorname{tr} R = 1 + 2\cos(2\pi/P), \quad \operatorname{tr} M = \operatorname{tr} \tilde{R} = 1 + 2\cos(2\pi/Q) = \text{integer}, \tag{5.12}$$
and
$$\cos(2V\pi/P) = \cos(2V\pi/Q). \tag{5.13}$$
If $P = 1$, the only solutions of (5.1) are the lattice invariant shears noted in (2.36), and none of these satisfy (5.10). Similarly, $Q = 1$ yields no nontrivial possibilities. As is discussed by PITTERI [12], possibilities consistent with (5.10) are such that $P = 2 \Rightarrow Q = 2$ and vice versa. The remaining possible values of $P$ and $Q$ are 3, 4 and 6. From (5.13), we must have
$$V(1/P - 1/Q) = \text{integer}, \tag{5.14}$$
so it is straightforward to find the smallest possible values of $V$. With (2.18), we easily find that
$$(P, Q) = (3, 4) \text{ or } (4, 3) \;\Rightarrow\; V = 12,$$
$$(P, Q) = (3, 6) \text{ or } (6, 3) \;\Rightarrow\; V = 6, \tag{5.15}$$
$$(P, Q) = (4, 6) \text{ or } (6, 4) \;\Rightarrow\; V = 12.$$
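The values listed in (5.15) follow from (5.14) by a short search. As a small sketch, with a hypothetical function name, one can look for the smallest positive integer $V$ that makes $V(1/P - 1/Q)$ an integer, using exact rational arithmetic.

```python
from fractions import Fraction

def smallest_period(P, Q, limit=1000):
    """Smallest V >= 1 with V * (1/P - 1/Q) an integer, as in (5.14)."""
    gap = Fraction(1, P) - Fraction(1, Q)
    for V in range(1, limit + 1):
        if (V * gap).denominator == 1:
            return V
    raise ValueError("no period found up to the given limit")

periods = {pair: smallest_period(*pair) for pair in [(3, 4), (3, 6), (4, 6)]}
```

This gives 12, 6 and 12 for the three pairs, and the same values with $P$ and $Q$ interchanged, since only the magnitude of the difference matters.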
In all these cases, (5.8) is satisfied in the trivial way, by having $R^V = m^V = 1$. Since all solutions with $m^Q = 1$ are fairly well characterized by ERICKSEN [10], one might better characterize the (proper) subset satisfying (5.10), with $P \neq Q$, and determine the kinds of patterns which they can generate. Here, I'll not pursue this. These rather naive and elementary analyses give us snapshots of what some of our minimizers can look like, a means of constructing certain kinds of examples, but not enough to give us a good picture of the whole set. To some degree, metallurgy is the art of controlling these and more complex morphologies, albeit imperfectly. Clearly, we have a long way to go, to develop mathematical theory to apply to such problems.

Acknowledgment. This material is based on work supported by the National Science Foundation under Grant No. MEA-8304750. Also, I thank Dr. MARIO PITTERI for helpful comments.

References

1. PITTERI, M., On ν + 1 lattices, J. Elasticity 15, 3-25 (1985).
2. PITTERI, M., On crystallographic space groups and generalized lattice groups, pending publication.
3. JAMES, R. D., The stability and metastability of quartz, to appear in Proc. Workshop on Metastability and Partial Differential Equations, Minneapolis, 1985, to be published by Springer-Verlag.
4. ERICKSEN, J. L., "The Cauchy and Born Hypotheses for Crystals", in Phase Transformations and Material Instabilities in Solids (ed. M. E. GURTIN), New York: Academic Press, 1984.
5. EFTIS, J., MACDONALD, D. E. & ARKILIC, G. M., Theoretical calculations on the pressure variation of second-order elastic coefficients for alkali metals, Mater. Sci. Eng. 7, 141-150 (1971).
6. ERICKSEN, J. L., "Special Topics in Nonlinear Elastostatics", in Advances in Applied Mechanics (ed. C.-S. YIH), vol. 17, New York: Academic Press, 1977.
7. FONSECA, I. M. Q. C., Variational Methods for Elastic Crystals, Ph.D. Thesis, Univ. Minnesota, 1985.
8. JAMES, R. D., Mechanics of coherent phase transformations in solids, MRL Report, Brown University, Division of Engineering, October, 1982.
9. JAMES, R. D., Stress-free joints and polycrystals, Arch. Rational Mech. Anal. 86, 13-37 (1984).
10. ERICKSEN, J. L., Some surface defects in unstressed thermoelastic solids, Arch. Rational Mech. Anal. 88, 337-345 (1985).
11. ERICKSEN, J. L., On the symmetry of deformable crystals, Arch. Rational Mech. Anal. 72, 1-13 (1979).
12. PITTERI, M., On type II twins, to appear in Int. J. Plasticity.
13. BARRETT, C. & MASSALSKI, T. B., Structure of Metals, 3rd ed. McGraw Hill, Inc., 1966.

Department of Aerospace Engineering and Mechanics
and
School of Mathematics
University of Minnesota
Minneapolis

(Received August 19, 1985)
International Journal of Plasticity, Vol. 14, Nos. 1-3, pp. 9-24, 1998
© 1998 Elsevier Science Ltd
PII: S0749-6419(97)00037-5
ON NONLINEAR ELASTICITY THEORY FOR CRYSTAL DEFECTS

J. L. Ericksen*†

5378 Buckskin Bob Road, Florence, OR 97439, U.S.A.

(Received in final revised form 8 June 1997)

Abstract. In recent years, modifications of nonlinear elasticity theory for crystals have been used successfully to analyze phenomena associated with twinning and phase transitions involving changes in crystal symmetry. Here, I attempt to assess how adapting the ideas to dislocation theory could bring in some mechanisms not present in linear dislocation theory, and why this might help us better understand plasticity phenomena. Conversely, twinning theory could benefit from adapting ideas used in dislocation theory. © 1998 Elsevier Science Ltd. All rights reserved

Key words: A. twinning, dislocations.

I. INTRODUCTION
For some time, I have been interested in nonlinear theories of defects in materials, particularly in twinning phenomena in crystals and in defects observed in liquid crystals. It is gratifying to see the dramatic improvement in these which has occurred in recent years, although I don't claim much credit for this.‡ There are many unsolved problems associated with these. For some related to twinning, I believe that it will be useful to adapt ideas used in dislocation theory, for crystals subject to non-zero surface tractions, particularly.

As to dislocation theory, I have been motivated to think about concepts related to this more by questions arising in discussions and correspondence with experts in this area than by personal interests. However, I have long believed that linear dislocation theory is inadequate to treat some kinds of phenomena which are observed in experiments related to plasticity. More recently, I concluded that some common ideas about basic concepts needed to be revised. This matters little for calculations made using linear theory. However, I don't believe that this is also true for nonlinear theory.

*I dedicate this paper to the memory of my dear friend, James F. Bell.
†Corresponding author. Fax: (541) 997 6399.
‡Work of this kind is discussed by Ball and James (1992), Bhattacharya and Kohn (1996) and Ericksen (1996) for crystals and Virga (1994), for liquid crystals, for example.

From my experience with versions of nonlinear crystal elasticity theory now being used to analyze twinning and phase transitions, I see reasons to believe that nonlinear dislocation theory based on it would differ from linear dislocation theory, qualitatively, by introducing the following kinds of possibilities not contained in linear theory.

(1) Prediction of non-zero volume changes associated with dislocations.
(2) Having dislocation solutions not of the Volterra or Somigliana kinds.
(3) Occurrence of slow waves which might even come to rest.
(4) Providing an alternative in elasticity theory to separate theories of Peierls-Nabarro forces.
(5) Extending the theory to cover twinning.

Theories of Peierls-Nabarro forces do patch some nonlinear theory to linear elasticity theory, using the idea that the periodic structure of crystals implies a periodic variation of forces. I see indications that all five of these items are likely to be of some importance in attaining a better understanding of some plasticity phenomena. My purpose is to elaborate this.

In plasticity, Bell was primarily interested in what occurs at large deformations, in metals of the common f.c.c. kind. Well, they are not so large for well-annealed polycrystals or single crystals, particularly when the latter are not of very high purity. If they are of high enough purity, one can get those stage I and II regimes where, in a uniaxial-stress experiment, plasticity occurs with a linear stress-strain curve. Bell was more interested in the behavior in stage III, where the stress-strain curve changes to parabolic form. In the regimes of most interest to him, applying loads causes a decrease in volume, which pretty well disappears when the loads are removed, in the kinds of experiments he did. At first, I was sceptical about this, there being loose theoretical reasons to believe that the volume should sometimes increase, sometimes decrease, depending on the kinds of loads which are involved. So, I asked him various questions about the experiments, and did my own analysis of some of the data obtained in his laboratory for polycrystals and some obtained by Hsu (1969) for single crystals. This convinced me that Bell was right. After that, he produced much more evidence of this, both for polycrystals and single crystals, emphasizing metals of the face-centered kind. This is covered in various publications by Bell, e.g. (1973, Section 4.35; 1996a).
It is already clear that, unlike linear theory, nonlinear theory can predict volume changes associated with dislocations. There is an old analysis by Toupin and Rivlin (1960), which produced a formula for the changes in volume occurring in an unloaded sample which has been subjected to cold working. As far as I know, Bell did not do experiments of this kind. The formula was obtained using slightly nonlinear theory, second order elasticity, with some basic assumptions which might reasonably be applied to theories which are more nonlinear. Wright (1982) collected available data on this, finding that the formula fit these surprisingly well. For Cu which had been subjected to larger loads, he noted some discrepancy, which is not very surprising, in view of the second-order approximation. Theoretically, the volume could increase or decrease, depending on material properties, but the observations found by Wright all give an increase. As is mentioned by him, this could cover some things other than dislocations, but he also mentions various things related to dislocations which might be thought to be important, but are not accounted for. It is also true that, theoretically, the volume change would occur if only dislocations were involved. Wright mentions that the calculations also should apply to twins. Since he published this, the developments in elasticity theory related to twinning theory make clear that second-order theory is inadequate to describe twins, so it is not reasonable to try to apply the Toupin-Rivlin formula to these, although the basic ideas about averaging, etc. appear to be reasonable for twins. Workers have considered crystals containing very large numbers of twins, even infinitely many, to model twin microstructures observed in shape-memory alloys, associated with Martensitic
transformations. These have been quite successful in describing configurations observed in unloaded samples. At least for the configurations which have been analyzed, there is no change in volume, theoretically, and I don't know of any observations contradicting this.

As to Bell's volume changes, the Toupin-Rivlin calculations say nothing about them, since they don't allow for non-zero surface tractions. It is only reasonable to expect from this that nonlinear theory will predict some kind of volume changes, likely to depend on which nonlinear theory is used, particularly if some involve different mechanisms than others. Here, (2) and (4) could be of some relevance. However, those volume changes seem to me to be intimately related to the constraint Bell discovered, tr U = tr V = 3, where U and V are the symmetric tensors occurring in the polar decomposition of the deformation gradient, as a replacement for the common assumption of incompressibility. Again, I have checked this, agreeing that it applies to a wide variety of the kinds of loading programs he was concerned with, in (f.c.c.) polycrystals and single crystals. However, in presenting his last thoughts on the latter, Bell (1996b) opined that a second constraint might also apply, but time ran out before he could check this out. My guess is that he would have first tried setting a linear combination of the invariants in his eqn (14) equal to zero, determining whether this gives a relation consistent with measurements. I suggest that workers interested in crystal plasticity explore this; my intuition agrees with his about this. Generally, when a theory of constrained materials applies to some material, it is because some moduli are very small compared to others. In (3), those slow waves suggest small tangent moduli, occurring as a nonlinear effect. So, my guess is that some such softening causes those odd volume changes, possibly relating these to (3).
For a different kind of situation, involving softening of shear moduli at zero stress near phase transitions, one might think that this would be associated with the constraint of incompressibility, as is suggested by linear theory. However, nonlinear analyses by Ericksen (1986, 1988) gave constraints not compatible with this, one of which is much like that found by Bell.

In numerous conversations we had, Bell mentioned his belief that it is important to understand why it is that macroscopic deformations which one would expect to be homogeneous are not, according to macroscopic measurements but, on a large enough scale, the differences seem to cancel out, making it possible to get reproducible experimental gross measurements. Of course, dislocation theory deals with deformation on a much finer scale, so the less coarse deformation hinted at above should still be regarded as a rather gross average. For this, I think that (2), (3) and (4) might all have some relevance.

Concerning (5), plasticity effects related to deformation twinning are now receiving considerable attention, as is clear from the exposition of Christian and Mahajan (1995). One can do a bit with linear theory, by using different constitutive equations for the twins. As far as I know, the first useful results of this kind were those reported by the Woosters (1946) and Thomas and Wooster (1951), on detwinning of quartz. However, nonlinear theory is needed and used for many twinning problems. I think that (2) is likely to be relevant to some interactions between twins and dislocations, among other things. Some theoretical background is needed for any discussion of the matters mentioned above, so I turn to this.

II. ELASTICITY EQUATIONS
For analyses in elasticity theory, it is almost automatic that workers use material coordinates as independent variables, this being a more convenient way of handling most
problems. For some analyses relating to defects, it really is better to use spatial co-ordinates instead, as Toupin and Rivlin (1960) did, after noting some reasons for this. For nonlinear theory, one can organize the two kinds of formulations so that the two kinds of equations look quite similar, as Ericksen (1995) did, in presenting some first thoughts on concepts relating to defects, in the context of nonlinear elasticity theory. In this section, I'll summarize forms of basic equations, as they apply to smooth solutions. Some are conventional, derivations for the others being given in the paper just cited.

Elasticity theory involves comparing two kinds of configurations. There are the various configurations which a body occupies in different regions in space, here considered as equilibria, which might be attained by a body under different kinds of loadings. Things described here, like the regions or Cauchy stress tensors, are of course described in terms of spatial positions. I'll use Cartesian tensor notations, Latin indices for things associated with these positions, $x_i$ denoting the spatial co-ordinates. Then there are the material coordinates $X_\alpha$, Greek indices being used for things associated with these. This is my way of emphasizing which things are associated with which regions. I'll interpret material coordinates in two slightly different ways. One involves a reference configuration for an homogeneous material, considered as filling all of space, $X_\alpha$ labelling positions here. The other is a reference configuration for a body, thought of as some particular subregion of the former. For us, the reference is to be interpreted as a perfect crystal, unstressed, a reasonably stable configuration of this kind, with reference mass density $\rho_0$, some positive constant. There is the view that all real crystals contain some defects, so that such configurations are not really attained.
It is not really necessary to include the reference among the aforementioned spatial configurations, but it does complicate matters if you don't accept the common compromises about this, settling for relatively pure crystals which have been well annealed, etc. to approximate the ideal. I accept such compromises so, for the reference, x = X. When we do what is conventional, use $X_\alpha$ as the independent variables, we think of mapping a reference configuration of a body to spatial configurations, described by functions of the form

$$x_i = x_i(X_\alpha), \tag{1}$$
commonly regarded as smooth and smoothly invertible. If spatial co-ordinates are used as independent variables, we use instead the inverse functions, of the form

$$X_\alpha = X_\alpha(x_i). \tag{2}$$
Generally, the material description is more convenient, because the region in the reference configuration associated with a body is known and fixed. However, one can be interested in designing a body to have a certain shape, when it is subject to certain loads, clamping conditions, etc. Then the situation is reversed, so it is better to use the spatial description. Occasionally, problems of this kind have come up in my conversations with engineers. I happen to remember the first time this occurred, a long time ago, in a conversation with Ray Mindlin, concerning the design of better piston rings. For somewhat similar reasons, the spatial description can be better for some analyses of defects, as I will mention later.

For the material description, we introduce the strain energy per unit reference volume W, as a function of the deformation gradient $x_{i,\alpha}$. The requirement that it not be changed
by superposing rigid body motions on any configuration reduces it to a function of the usual Cauchy-Green tensor

$$C_{\alpha\beta} = x_{i,\alpha}\,x_{i,\beta} = C_{\beta\alpha}, \tag{3}$$

so, we have

$$W = W(x_{i,\alpha}) = W(C_{\alpha\beta}). \tag{4}$$
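The reduction to eqns (3) and (4) rests on a short calculation, sketched here under the assumption that a superposed rigid motion is written $x_i \to Q_{ij}x_j + c_i$, with $Q_{ij}$ a constant rotation and $c_i$ a constant vector (symbols introduced only for this illustration).

```latex
% Assumed notation (not in the original text): Q_{ij} is a constant rotation,
% Q_{ki}Q_{kj} = \delta_{ij}; c_i is a constant vector.
\begin{align*}
x_i \;\to\; x_i^{*} &= Q_{ij}x_j + c_i
  \quad\Longrightarrow\quad x^{*}_{i,\alpha} = Q_{ij}\,x_{j,\alpha},\\
C^{*}_{\alpha\beta} &= x^{*}_{k,\alpha}\,x^{*}_{k,\beta}
  = Q_{ki}Q_{kj}\,x_{i,\alpha}\,x_{j,\beta}
  = \delta_{ij}\,x_{i,\alpha}\,x_{j,\beta}
  = C_{\alpha\beta}.
\end{align*}
```

So any function of $C_{\alpha\beta}$ is unaffected by the superposed motion. That every invariant W can be written this way is the standard converse, which is not shown here.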
Essentially, this is an idea used by Cauchy in his development of linear elasticity from macroscopic considerations; he also treated molecular theory. As is customary, I assume that W = 0 in the reference configuration. The Piola-Kirchhoff tensor, measuring force applied to the actual configuration per unit reference area, is given by

$$T_{\alpha i} = \frac{\partial W}{\partial x_{i,\alpha}}, \tag{5}$$

and the conventional equilibrium equations are

$$T_{\alpha i,\alpha} + \rho_0 f_i = 0, \tag{6}$$

where $f_i$ is the body force per unit mass. Then there is that (referential) configurational stress tensor

$$P_{\alpha\beta} = W\delta_{\alpha\beta} - T_{\alpha i}\,x_{i,\beta}, \tag{7}$$
generally not symmetric. The first index goes with the reference vector element of area, the second with the force. It is straightforward to show that, when eqn (6) is satisfied, this satisfies

$$P_{\alpha\beta,\alpha} + \rho_0 g_\beta = 0, \qquad g_\beta = -f_i\,x_{i,\beta}. \tag{8}$$
Conversely, if eqn (8) is satisfied, so is eqn (6), granted the usual condition related to invertibility, i.e.

$$\det \| x_{i,\alpha} \| > 0. \tag{9}$$
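The step from eqn (6) to eqn (8) and back can be written out in a few lines. The sketch below assumes a homogeneous material, so that W depends on $X_\alpha$ only through $x_{i,\alpha}$, and a twice-differentiable deformation; the index conventions are those of eqns (5)-(7).

```latex
% Assumptions: W = W(x_{i,\alpha}) with no explicit dependence on X_\alpha;
% x_i(X_\alpha) is twice continuously differentiable.
\begin{align*}
W_{,\beta} &= \frac{\partial W}{\partial x_{i,\alpha}}\,x_{i,\alpha\beta}
            = T_{\alpha i}\,x_{i,\alpha\beta},\\
P_{\alpha\beta,\alpha}
  &= W_{,\beta} - T_{\alpha i,\alpha}\,x_{i,\beta} - T_{\alpha i}\,x_{i,\beta\alpha}
   = -\,T_{\alpha i,\alpha}\,x_{i,\beta},\\
P_{\alpha\beta,\alpha} + \rho_0 g_\beta
  &= -\bigl(T_{\alpha i,\alpha} + \rho_0 f_i\bigr)\,x_{i,\beta},
  \qquad g_\beta = -f_i\,x_{i,\beta}.
\end{align*}
```

The right side of the last line vanishes when eqn (6) holds, which gives eqn (8). Conversely, if the left side vanishes, condition (9) makes the matrix $x_{i,\beta}$ invertible, so the bracket must vanish, which is eqn (6).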
So eqn (8) can also be used as the equations of equilibrium. If you superpose rigid rotations, the $P_{\alpha\beta}$ and $g_\beta$ don't change, but $T_{\alpha i}$ and $f_i$ do, reflecting the fact that rotating the actual configuration rotates the forces acting on it. As is traditional, $T_{\alpha i}$ and $f_i$ are regarded as mechanical forces and they are conjugate to $x_i$. The configurational forces are instead conjugate to $X_\alpha$, which means that they do work in quite different ways. For almost all, if not all, analyses of defects, body forces are assumed to vanish, so T and P both have zero divergence. Eshelby and many other writers then introduce the displacement u = x - X and add T to P to get a slightly different version,

$$\bar P_{\alpha\beta} = W\delta_{\alpha\beta} - T_{\alpha\gamma}\,u_{\gamma,\beta}. \tag{10}$$
As a young student, I was taught not to add tensors of different kinds, as is done here, and it does have some unpleasant consequences for nonlinear theory. For one thing, rotating the body amounts to applying a rotation to x but not X, so u does not rotate as a vector would. While $P_{\alpha\beta}$ is not affected by such rotations, the modified tensor $\bar P_{\alpha\beta}$ of eqn (10) is. Also, satisfying div $\bar P$ = 0 does not always imply that div T = 0. For such reasons, those interested in nonlinear theory tend to use P rather than $\bar P$, as is the case in the recent discussions of configurational forces by Gurtin (1995) as well as Maugin and Trimarco (1995). Both cover ideas about configurational forces not discussed here, and omit some covered here. Let σ denote the stress tensor used in linear theory. Calculating the linear approximation to these tensors, one gets

$$T \approx \sigma, \qquad P \approx -\sigma, \qquad \bar P \approx 0. \tag{11}$$
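The estimates in eqn (11) follow from counting orders in the displacement gradient. The sketch below assumes small $u_{i,\alpha}$, takes W to be quadratic to leading order (consistent with W = 0 and zero stress in the reference configuration), and writes $\bar P$ for the tensor of eqn (10); the overbar is a label adopted here for the illustration.

```latex
% Assumptions: x_i = X_i + u_i with |u_{i,\alpha}| small; W = O(|\nabla u|^2);
% \sigma is the linear-theory stress; \bar P denotes the tensor defined in eqn (10).
\begin{align*}
x_{i,\beta} &= \delta_{i\beta} + u_{i,\beta}, \qquad
T_{\alpha i} = \sigma_{\alpha i} + O(|\nabla u|^{2}),\\
P_{\alpha\beta} &= W\delta_{\alpha\beta} - T_{\alpha i}\,(\delta_{i\beta} + u_{i,\beta})
   = -\,\sigma_{\alpha\beta} + O(|\nabla u|^{2}),\\
\bar P_{\alpha\beta} &= P_{\alpha\beta} + T_{\alpha\beta}
   = W\delta_{\alpha\beta} - T_{\alpha\gamma}\,u_{\gamma,\beta}
   = O(|\nabla u|^{2}).
\end{align*}
```

To first order, then, $T \approx \sigma$, $P \approx -\sigma$ and $\bar P \approx 0$; the leading part of $\bar P$ is the quadratic expression obtained by replacing T with σ and W with its quadratic approximation.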
Here, I don't use indices, since σ approximates different kinds of tensors. What is done in linear theory, following Eshelby,* is to use the linear approximation for T, giving the usual linear equilibrium equations as an approximation of eqn (6). Then use eqn (10), with T replaced by σ, W being taken as the quadratic form used in linear theory. So this becomes a quadratic function of displacement gradients, familiar to those who know a bit about defect theory. This satisfies eqn (8), with the approximation $g \approx -f$. For this, what is changed by using P is indicated in the linear contribution indicated in eqn (11), which does of course have a vanishing divergence, when body forces vanish. For singular solutions, with P or the modified $\bar P$ of eqn (10) used to calculate the configurational force exerted on defects, it could make a difference which one uses, depending on what side conditions are imposed on σ, near singularities. Eshelby believed very strongly that forces acting on defects are configurational or, as he often put it, not real. By this reasoning, a force calculated using σ would be real, so this should give zero, as a contribution to the force (or moment of force) acting on a defect. I think it accurate to say that this has become the conventional practice, for calculating forces on defects. So, for linear theory, we have the quadratic

$$\bar P_{\alpha\beta} = W\delta_{\alpha\beta} - \sigma_{\alpha\gamma}\,u_{\gamma,\beta}, \tag{12}$$
to be used for calculating force on defects. Granted this, it doesn't really matter whether one uses P or the modified $\bar P$ of eqn (10), for such linear theory. Conceptually, there is no analog of the balance of moments of mechanical forces, for the configurational forces, in general. Actually, for isotropic materials, P turns out to be symmetric for nonlinear but not linear theory, giving the obvious analog for nonlinear theory. Conceptually, crystals are not isotropic, so this quirk of isotropic materials is not relevant.

One thing about the linear theory is worth noting. If one did a systematic approximation of P or $\bar P$, retaining all terms quadratic in displacement gradients, one would get what is described above, plus additional terms. This might better describe P, but it won't have a vanishing divergence, in general. So, at least tacitly, the judgment is that it is more important to have the vanishing divergence. Toupin and Rivlin (1960) do use systematic approximations, but make no use of any ideas about configurational forces. Effectively, they make a different judgment.

*See e.g. Eshelby (1951, 1956). There is some nonlinear theory in the latter.

Incidentally, there is a routine which leads to eqn (8), rather automatically. From the old work of Noether (1918), we know that, if a Lagrangian is invariant under an n-parameter
continuous group, there are n vector-like quantities such that their divergence vanishes whenever the Euler-Lagrange equations are satisfied, and there is a routine for calculating these. Here, eqn (6) gives the Euler-Lagrange equations, when f = 0. For homogeneous materials, the Lagrangian W is invariant under the 3-parameter group of translations of $X_\alpha$. Turn the crank and you get eqn (7) as a formula for the corresponding divergenceless quantities. Use the invariance under rotations and you get what amounts to the usual balance of moments, for nonlinear theory.

For the spatial version, it is pretty much a matter of using the same kinds of arguments, interchanging the roles played by $x_i$ and $X_\alpha$. So we use eqn (2) in place of eqn (1), replace W by w, the strain energy per unit present volume, to be regarded as a function of $X_{\alpha,i}$. What does not get interchanged is the invariance under finite rotations, which means that w reduces to a function of $C_{\alpha\beta}$, given by eqn (3) or, better, its inverse $D_{\alpha\beta}$, given by

$$D_{\alpha\beta} = X_{\alpha,i}\,X_{\beta,i}, \tag{13}$$

so we have

$$w = w(X_{\alpha,i}) = w(D_{\alpha\beta}). \tag{14}$$
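That the tensor of eqn (13) is the inverse of the Cauchy-Green tensor of eqn (3) can be checked directly. The sketch assumes only that the maps (1) and (2) are inverse to each other, so that the chain rule gives the two identities in the first line.

```latex
% Assumption: eqns (1) and (2) are mutually inverse smooth maps.
\begin{align*}
x_{i,\beta}\,X_{\beta,j} &= \delta_{ij}, \qquad
X_{\gamma,i}\,x_{i,\alpha} = \delta_{\gamma\alpha},\\
C_{\alpha\beta}\,D_{\beta\gamma}
  &= x_{i,\alpha}\,\bigl(x_{i,\beta}X_{\beta,j}\bigr)\,X_{\gamma,j}
   = x_{i,\alpha}\,\delta_{ij}\,X_{\gamma,j}
   = X_{\gamma,i}\,x_{i,\alpha}
   = \delta_{\alpha\gamma}.
\end{align*}
```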
Here, in calculating forces, we use the present elements of area. If $dS_i$ denotes the vector element of area of this kind, $dS_\alpha$ the correspondent in the reference configuration, the usual kinematical relation is

$$dS_i = J\,X_{\alpha,i}\,dS_\alpha, \qquad J = \det \| x_{i,\alpha} \|. \tag{15}$$
So we introduce spatial measures of configurational and mechanical stresses as indicated by

$$p_{i\alpha}\,dS_i = P_{\beta\alpha}\,dS_\beta, \qquad t_{ji}\,dS_j = T_{\alpha i}\,dS_\alpha, \tag{16}$$
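$t_{ji}$ being the familiar Cauchy stress tensor. The prescriptions for these work out to be

$$p_{i\alpha} = \frac{\partial w}{\partial X_{\alpha,i}} \tag{17}$$

and

$$t_{ji} = w\delta_{ji} - p_{j\alpha}\,X_{\alpha,i}. \tag{18}$$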
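As a consistency check, eqn (18) can be recovered from eqns (7), (15) and (16). The sketch assumes the usual relation w = W/J between the energy per unit present volume and per unit reference volume, with J as in eqn (15); that relation is standard but is an assumption here, since the text does not write it out.

```latex
% Assumptions: w = W/J; eqns (1) and (2) are mutually inverse,
% so x_{j,\beta} X_{\beta,i} = \delta_{ji} and x_{k,\alpha} X_{\alpha,i} = \delta_{ki}.
\begin{align*}
dS_\beta &= J^{-1}x_{i,\beta}\,dS_i
  && \text{(eqn (15) solved for } dS_\beta\text{)},\\
p_{i\alpha} &= J^{-1}x_{i,\beta}\,P_{\beta\alpha}, \qquad
t_{ji} = J^{-1}x_{j,\alpha}\,T_{\alpha i}
  && \text{(from eqn (16))},\\
p_{j\alpha}\,X_{\alpha,i}
  &= J^{-1}x_{j,\beta}\bigl(W\delta_{\beta\alpha} - T_{\beta k}\,x_{k,\alpha}\bigr)X_{\alpha,i}
   = w\,\delta_{ji} - t_{ji}
  && \text{(using eqn (7))}.
\end{align*}
```

Rearranging the last line gives eqn (18).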
If you put in the function $w(D_{\alpha\beta})$ of eqn (14) for w, you can check that this gives $t_{ji}$ as a symmetric tensor, as it should. The new equilibrium equations are

$$t_{ji,j} + \rho f_i = 0, \tag{19}$$

and

$$p_{i\alpha,i} + \rho g_\alpha = 0, \tag{20}$$

where ρ is the present mass density. As before, eqn (19) implies eqn (20) and vice versa. It might seem curious, but this gives a formula for $p_{i\alpha}$ which looks more like eqn (5), for $T_{\alpha i}$,
while eqn (18) looks more like eqn (7). In this sense, the roles played by these stresses get reversed. So, if we use the same kind of reasoning we did before, for linear theory, we should linearize the configurational stress and use the argument used before for P to get an analogous quadratic formula for t.

Suppose that I didn't think much about what I was doing, and confused the material co-ordinates $X_\alpha$ for spatial co-ordinates and vice versa. I would then introduce as the displacement u' = X - x, really -u. Suppose that I am involved in a calculation based on linear theory, so I use the linear equations, and do not worry about deducing them from nonlinear theory. However, one can do the derivations. With the understandings, the linear elastic strains are taken to be

$$e'_{jk} = \tfrac{1}{2}\bigl(u'_{j,k} + u'_{k,j}\bigr), \tag{21}$$
and plugging this into the linear constitutive equations gives the corresponding linear estimate of stress σ'. Working out the linear approximations gives

$$p \approx \sigma', \qquad t \approx -\sigma'. \tag{22}$$
Comparing this to eqn (11), I get trapped into thinking that p is the Piola-Kirchhoff stress, t the configurational stress. Actually, since u' = -u, I should use σ = -σ' to compare the two formulae. However, I carry on, to calculate forces on dislocations. Will my blunder cause me to get wrong answers? Try this for yourself. Take any such calculation you find in the literature. Most likely, the author(s) won't tell you which co-ordinates they are using, so assume they are spatial, and interpret what they call displacement as u', probably not their intent. My experience is that you will get the same result, formally. So simply reinterpret the conventional (10), and you get the analogous

$$t_{ji} = w\delta_{ji} - \sigma'_{jk}\,u'_{k,i} = w\delta_{ji} - \sigma_{jk}\,u_{k,i}, \tag{23}$$
really the same equation, differently interpreted. This is a bit strange, since it is not symmetric, as the Cauchy stress should be. As before, eqn (23) does not include all the terms which you would get from a systematic expansion such as Toupin and Rivlin (1960) used, but its divergence will vanish. As far as I know, I am the only one who has realized that there are these two interpretations of linear theory which are conceptually very different. Had Eshelby known this, it would be like him to acknowledge it and give some argument for preferring one to the other. Later, I'll discuss this a bit more.

III. KINEMATICS OF DISLOCATIONS
Eshelby contributed much to working out procedures for setting up and solving defect problems using linear theory, this being treated in some detail by Eshelby (1951), for example. Concerning dislocations, he wrote there that "The fundamental property is

$$\sum u_{i,j}\,\Delta x_j = b_i \tag{22*}$$

*His equation number.
for any closed circuit embracing the dislocation line." While he doesn't use the words here, there is no doubt that the circuits are what are commonly called Burgers circuits, $b_i$ being the Burgers vector. For these, the atomistic view is that you look at the atoms in a crystal containing defects, constructing Burgers circuits which are closed curves, consisting of straight lines connecting atoms, avoiding the defects. In the continuum view, we no longer see the atoms. For the analog, we look at some configuration actually taken on and, in it, construct a closed oriented curve, not passing through any singularities in the corresponding elastic solution. For this to make sense, we need to use the spatial representation of such a curve, say

$$x_i = x_i(\tau), \quad 0 \le \tau \le \tau_0, \qquad x_i(0) = x_i(\tau_0). \tag{24}$$
The next step is to construct the image of this curve in a perfect crystal. I won't belabor how this is done in the atomistic scene, but in the continuum analog, we plug eqn (24) into eqn (2), to get the corresponding curve in the reference configuration. This may or may not be a closed curve, depending on what dislocation lines are embraced by the Burgers circuit. The Burgers vector for the circuit is

$$B_\alpha = X_\alpha(\tau_0) - X_\alpha(0) = \int_0^{\tau_0} dX_\alpha = \int_0^{\tau_0} X_{\alpha,i}\,dx_i, \tag{25}$$
the latter integral being taken on the curve given by eqn (24). If you like, you can replace X by my u' in the last integral. Try to interchange the role played by the perfect and defective crystal here, to put the closed curves in the perfect crystal, and it doesn't make sense. So, for the aforementioned quotation to make sense, we should interpret Eshelby's co-ordinates as spatial co-ordinates, his u as what I call u'. Often, he didn't say whether his co-ordinates should be interpreted as material or spatial, in his writings but, from various clues, I don't doubt that he was interpreting them as material, generally. From this, I assume

Proposition 1: Burgers circuits and dislocation lines should be described, using spatial coordinates. The Burgers vector is a vector in the reference configuration.

Certainly, Eshelby knew about those atomic pictures associated with Burgers vectors and circuits. It used to be common for those lecturing on dislocations to explain this and I'm sure he heard at least as many of these as did I. What is strange is that he overlooked the logical implications for elasticity theory, summarized in my Proposition 1. So have his followers, as far as I know. I have used familiar ideas wrongly, without thinking much about it, kicking myself when the light dawned. Apparently, he did something like this. Conceptually, it does have important implications. One that would have shocked him is that a force capable of doing work in changing the dislocation line must be conjugate to x and the P he recommended is not. He was knowledgeable about such matters and was generally careful about them. Mechanical forces could so do work, and commonly do. The latter cannot do work in changing the Burgers vector, but configurational forces might. I'll come back to such questions later.

There is the problem of deciding what kinds of singularities should be considered to represent defects, dislocations in particular. For the latter, the usual picture of an isolated
defect involves a surface often thought of as a slip plane, with a jump in displacement occurring there, a more violent singularity occurring on all or part of the boundary of this surface, what is interpreted as the dislocation line. Actually, a slip plane is more likely to be a curved surface, but I accept this common abuse of language. Assuming no other singularities occur, the Burgers vector should have the same, non-zero value for all similarly oriented circuits enclosing this line, and vanish for all circuits not enclosing it. With these conditions, it is feasible to allow or exclude jump discontinuities in displacement gradients. From Proposition 1, we should use spatial co-ordinates as independent variables. Kinematically, one can deduce conditions on jumps, using the conditions on Burgers vectors, where the discontinuity surface is smooth. Label with a plus sign limits of functions taken from one side, a minus sign those taken from the other, let $\nu_i$ denote the unit normal, outward directed relative to the region labelled +, and let

$$[f] = f^{+} - f^{-} \tag{26}$$
denote the jump in the function occurring in the square bracket. Then, one gets

$$[X_\alpha] = \text{const.} \tag{27}$$

and that, for finite jumps in derivatives, there are functions $A_\alpha$ such that

$$[X_{\alpha,i}] = A_\alpha\,\nu_i. \tag{28}$$
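The passage from eqn (27) to eqn (28) is the classical kinematical compatibility argument. A minimal sketch, assuming the discontinuity surface is smooth and the one-sided limits of $X_{\alpha,i}$ exist along it, with $t_i$ denoting an arbitrary tangent vector to the surface (a symbol introduced for this illustration):

```latex
% Assumptions: smooth discontinuity surface with unit normal \nu_i;
% t_i is any tangent vector, t_i \nu_i = 0 (hypothetical symbol, not in the original);
% one-sided limits of X_{\alpha,i} exist on the surface.
\begin{align*}
[X_\alpha] = \text{const. on the surface}
  \quad&\Longrightarrow\quad [X_{\alpha,i}]\,t_i = 0
  \quad\text{for every tangent } t_i,\\
[X_{\alpha,i}]
  &= \bigl([X_{\alpha,j}]\,\nu_j\bigr)\nu_i
   + \bigl([X_{\alpha,i}] - [X_{\alpha,j}]\,\nu_j\,\nu_i\bigr)
   = A_\alpha\,\nu_i,
  \qquad A_\alpha = [X_{\alpha,j}]\,\nu_j .
\end{align*}
```

The second bracket is the tangential part of the jump, which vanishes by the first line, leaving only the normal part.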
Briefly, with the zero Burgers vector condition, you get some integrals independent of path, which can be used to define a function $Y_\alpha$ near the surface which is continuous across the surface and would be $X_\alpha$, except that it doesn't match the non-zero Burgers vector condition. Apply the customary conditions of compatibility to it and you get eqn (28).

Here, I touch upon reasoning which needs to be used to construct the region associated with the body in the reference configurations, to take care of little chunks of matter which might have been taken from one place in a body and put in another. It won't be the same body you had before those defects occurred, a remark that also applies to mine. So, for a body containing many defects, it is not easy to keep book on all that is needed to construct the corresponding reference region. For various problems, it is not really necessary to do this "damage assessment".

Linear theory almost always uses the Volterra model of dislocations, which means taking $A_\alpha = 0$. Eshelby gave some thought to approximating these by Somigliana dislocations, which also have $A_\alpha = 0$. One can show that, with the usual assumption that the strain energy function is strictly positive, $A_\alpha$ must vanish. Briefly, discontinuities like these with $A_\alpha \ne 0$ are associated with stress waves, which must move with calculable, non-zero speeds. So, having $A_\alpha \ne 0$ would be a genuine nonlinear phenomenon. This is important for twinning, commonly considered to fit eqn (28) and a special case of eqn (27), $[X_\alpha] = 0$, or the equivalent of such conditions.

Concerning the dislocations, some things have entered my mind. One is that, with the Volterra model, one is really dealing with multi-valued functions, which means that the discontinuity surface can be chosen rather arbitrarily; one can put the cut where you like, within reason, which seems to me a bit unphysical, given that slip planes really are particular crystallographic planes. Concerning this, Nabarro (1987, p. 21) writes that "In the
theory of dislocations in an elastic continuum there is no such special plane, and it is in fact an essential part of this theory that the state of the body depends only on the line and its strength b, and not on the cut bounded by this line which was made in order to introduce the dislocation." My view is that nonlinear theory should be capable of distinguishing those slip planes, perhaps by allowing jumps with $A_\alpha \ne 0$ near the line. From experience with twinning theory, we have some understanding of what properties constitutive equations should have, to make it possible to have such jumps. One needs a band of deformations too unstable to be observed, separating more stable regimes. In the latter, near the unstable regime, some acoustic speed gets small, vanishing as one enters the unstable region or, in static terms, some normally positive tangent modulus goes to zero, becoming negative in the unstable regime. One can state the conditions more precisely, but I don't want to get into technical details. Also, with such dislocations, it could then be easier to understand conversions of dislocations to twins, which are considered in deformation twinning. Referring to the introduction, this bears on (2) and (3).

For plasticity, there are indications of some such softening and of slow waves in the observations of what Bell (1973, Section 4.31) called the Savart-Masson effect, along with the McReynold slow waves. This is better evidence, but still weak. For the theory considered, we are considering deformation on a finer scale, somewhere between atomic spacing and the distance between dislocations. The measurements are on a much larger scale. So, it is a bit tricky to interpret what the measurements imply about events on the finer scale. Also, most of the observations mentioned above are on polycrystals, not single crystals.
However, as is discussed in some detail by Bell (1981), there is evidence that quite similar phenomena do occur in Stage III deformation of single crystals, including X-ray observations, made using high-speed scanning measurements, taken during loading. For the latter, one is getting into regimes where the crystal is obviously damaged by large numbers of defects, making X-ray pictures fuzzy. Perhaps an expert could decide whether the phenomenon at hand is visible in the haze, but I can't. Theoretically, the phenomenon is one which might be produced by loading and either remain or disappear when loads are removed; I am pretty sure that one could construct theories of either kind. So, we have a genuinely nonlinear effect which might or might not be helpful in understanding plasticity. I'd like to see workers find other possibilities and improve our theoretical understanding of them.

IV. FORCES
With the misinterpretation pointed out in the last section, one should look carefully at interpretations of many calculations. It is a tricky matter to correctly interpret calculations of forces acting when a dislocation moves through a material. For one thing, this changes the slip plane, adding to or subtracting from it. This involves changing the Burgers vector for some circuits from zero to a non-zero value or vice versa. As Ericksen (1995) notes, configurational forces can do work in this process. For another, for edge or mixed dislocations, one needs to bring in or remove vacancies or atoms, so one should consider forces between these and the dislocation, or maybe the slip plane. I have not thought through for myself how best to view these. Then, there is the point that changes in the slip plane are firmly linked to changes in the line, so changes in position of the latter are not really independent of those changes in Burgers vectors. In addition, there is the atomistic view that Burgers vectors should be integral linear combinations of (reference) lattice vectors, which means that they change only if the line shifts by an integral number
of lattice spacings, roughly speaking. This is important in the considerations of Peierls-Nabarro forces, in particular. For linear theory, the Peach-Koehler forces are among those considered, and it is hard to know how to define these, for nonlinear theory. However, I'll attempt to interpret the symbolism. The idea is to try to estimate how surface tractions influence motion of dislocations. In Eshelby's (1956, Section 9) treatment of this, he considers dislocation loops, along with a smooth linear elastic solution with its stress tensor denoted by $\sigma^T$, describing the loading. He considers an infinitesimal variation of a small part of the dislocation line, adding an infinitesimal area to the slip plane. If $dS$ represents the vector element of area for this, his calculation of the change in energy $\delta E$ gives for it
$$\delta E = \pm b_i \sigma^T_{ij}\, dS_j, \tag{29}$$
with $b_i$ the Burgers vector. One can fix the sign by adopting conventions for orienting the vectors, as he does. To interpret this, consider the slip plane to be located in the spatial configuration, so $dS$ is the spatial version and, from Proposition 1, the Burgers vector is in the reference configuration. The interpretation of $\sigma^T$ fitting these is the linear approximation to the spatial configurational stress. So, I would rewrite this as
$$\delta E = \pm B_a p_{ia}\, dS_i, \tag{30}$$
and interpret this as the work done in changing the Burgers vector from zero to $B_a$ in this process. Eshelby doesn't, skipping to the next step, using
$$dS = l\, s \wedge \xi, \tag{31}$$
where $l$ is the length of the line segment varied, $s$ is its direction, and $\xi$ is the infinitesimal displacement vector, indicating how it is displaced. With this, one can put eqn (30) in the form
$$\delta E = l\, F \cdot \xi, \tag{32}$$
as a justification for regarding F as a force per unit length acting on the dislocation; work this out and you get the Peach-Koehler formula, as I interpret it. Effectively, this converts a configurational force to one which is conjugate to x. With the singularity at the dislocation, one should not use the total stress in this formula and, with nonlinear theory, one can't simply add smooth solutions to singular solutions, so this seems to make some sense for linear theory only. In writing about his memories of Eshelby, Nabarro (1985) writes that "Eshelby maintained this distinction [between mechanical and configurational forces] rigorously. When he calculated the force between parallel disclinations in a nematic liquid crystal and found that 'the supposedly configurational force in a nematic is in fact a real force exerted on the core of the dislocation by the surrounding medium', he was very disturbed, and he circulated the draft of his paper [Eshelby (1980)] to many colleagues before publishing it." As was acknowledged by Eshelby (1980), I was one of the
many, so I know something about his worries. Certainly, he felt very strongly that forces on defects should be configurational, although I have never understood his reasons for this. So, he started believing that the liquid crystal workers had made some error, which he would find and correct. However, after a careful examination, he concluded that it was he who was wrong and, being an honest and scholarly person, he made this public. I doubt that it even occurred to him that he might be somewhat wrong about this, for solids. At the time, I had not thought much about these calculations. However, questions relating to this came up in later correspondence and discussions with others, inducing me to collect my thoughts, which are presented by Ericksen (1995). In taking a hard look at the calculations, I found the conceptual error described in (Section III). I then argued that, in particular, the force on a dislocation line should sometimes be regarded as mechanical, to be calculated using the Cauchy stress tensor. Involved here are forces not calculated using elasticity, but from the microscopic view of atoms, perturbed a bit from the periodic arrangement, what are commonly called Peierls or Peierls-Nabarro forces. Actually, a kind of continuum theory is often used for these, motivated by the atomic picture. Remember the elementary arguments used to make plausible the notion that crystals must have a finite shear strength. One version I remember being exposed to in more than one lecture on dislocation theory, long ago, considers a one-dimensional shear stress-strain curve associated with a picture of atoms arranged periodically. Pick the crystallographic direction of shear in a way making it easy to see what should happen and it's intuitively obvious that this should be a periodic curve. So, start from a place where the shear stress vanishes, and the slope of the curve is positive.
For a small positive shear stress, associated with small shear strains, one gets a point closer to a maximum on the curve, thought of as a limit of metastability, it being conceded that one should get pretty close to this, physically, in a perfect crystal. Note that this implies that the tangent modulus vanishes at the limit and should at least get small for deformations viewed as attainable, at least in the perfect crystal. With the Peierls-Nabarro forces, the aim is to do more realistic calculations using this idea, perhaps with atomistic theory. Of course, one does not use linearized theory, for this part of the calculation. The question is whether the dislocation line will move through the material, when forces are exerted on it by other defects, walls, etc. The idea is to assume that it does not, so one then uses elasticity theory to calculate forces acting on it and determine whether these can be balanced by the Peierls-Nabarro force. As was already mentioned in (Section III), the force acting on the dislocation line should be conjugate to x. To me, having the dislocation line there is not really different from having some foreign object embedded in the material, when it is considered not to move through the material. Given this, it seems to me reasonable to use the Cauchy stress tensor to calculate that force. Also, we should match this force to the Peierls-Nabarro force. Conceptually, I have no idea how to calculate such a force, in atomistic theory, if it is not to be interpreted as mechanical. There is another point in that, with the forces regarded as mechanical, one should also balance couples. I do share the common view that the equilibrium theory is not what should be used, to analyze what happens when a dislocation moves through the material. Physically, I don't see a good way to fit the Peach-Koehler forces into such calculations since they involve moving the dislocation through the material. 
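The step from eqns (29) and (31) to eqn (32) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes the plus sign in eqn (29), reads the wedge in eqn (31) as the ordinary vector product, and uses made-up values for the Burgers vector, the stress, the line direction and the displacement. The function names are hypothetical. With these conventions the force per unit length comes out as F = (b·σ) × s.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def vec_mat(b, sigma):
    """Return g_j = b_i sigma_ij."""
    return tuple(sum(b[i] * sigma[i][j] for i in range(3)) for j in range(3))

def energy_change(b, sigma, length, s, xi):
    """Eqn (29) with the + sign, with dS = length * (s x xi) as in eqn (31)."""
    dS = tuple(length * c for c in cross(s, xi))
    return dot(vec_mat(b, sigma), dS)

def peach_koehler(b, sigma, s):
    """Force per unit length F for which dE = length * F . xi, eqn (32)."""
    return cross(vec_mat(b, sigma), s)

# Assumed illustrative data: an edge dislocation along z with Burgers
# vector along x, loaded by a pure shear stress sigma_xy = 2.
b = (1, 0, 0)
sigma = [[0, 2, 0],
         [2, 0, 0],
         [0, 0, 0]]
s = (0, 0, 1)
xi = (0.01, 0.0, 0.0)
length = 1.0

F = peach_koehler(b, sigma, s)
dE_direct = energy_change(b, sigma, length, s, xi)
dE_from_force = length * dot(F, xi)
print(F, dE_direct, dE_from_force)
```

The identity used is the cyclic property of the triple product, g · (s × ξ) = ξ · (g × s) with g = b·σ. Choosing the other sign in eqn (29), or the opposite orientation of the line, reverses the sign of F.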
So, as an amateur in this business, I'll leave this to experts. In using nonlinear elasticity theory to do analyses of problems relating to twinning and phase transitions, workers have found it important to use a theory of material symmetry
which is quite different from that used for linear theory and, by most workers not concerned with these phenomena, for nonlinear theory as well. Essentially, it better accounts for the periodic nature of crystals, building in those periodic shear stress-strain curves as a contribution to the stress, for example. So, from this view, the Peierls-Nabarro forces can be cancelled out by requiring resultant stresses to vanish, as was assumed by Toupin and Rivlin (1960). Workers have found some tricks for simplifying theory in some cases, compromising with the periodicity, but saving the part of this which is important for the phenomena considered. One is not really interested in all those maxima and minima in that shear stress-strain curve, but it is good to see at least one. So, there is some experience in using continuum theory analogous to that involved in the consideration of Peierls-Nabarro forces. From this view, some of the latter kinds of atomistic calculations could be viewed as substitutes for experiments we don't know how to execute, providing some information about constitutive equations. So, this elaborates item (4) in the introduction. My experience is that research moves ahead faster when workers with different kinds of expertise get involved, and there are real experts in continuum theory of this kind. Admittedly, this is still quite speculative, and it will take hard research to better understand what such theory can do, concerning the dislocations. My reinterpretation of linear theory and thoughts about different kinds of forces both suggest one side condition associated with configurational stress, which I'll state in words as
Proposition 2: Resultant configurational forces exerted on a dislocation line vanish.
Interpret this in the same way as workers have done with the analogous statement for mechanical forces, as the vanishing of the integral over tubular surfaces surrounding the line, of configurational surface tractions.
For linear theory, it is clear that this does not force Peach-Koehler forces to vanish, for example. Concerning those surfaces of discontinuity discussed in (Section III), extremal conditions in the calculus of variations suggest, for one thing
$$[p_{ia}]\,\nu_i = 0, \tag{33}$$
in the notation used before. This is satisfied automatically for Volterra dislocations and for the kinds of twins most commonly analyzed, those occurring in unstressed bodies. With my proposal for linear theory, this becomes the commonly used jump condition for the stress tensor. In brief, the assumption is not obviously wrong, as far as I can see, but neither has it been put to a very good test. Actually, in considering the less than perfect mobility of transformation twins subjected to loading, I found a reason to reject eqn (33) for these. However, I don't see any similar reason to reject it for dislocations. Toupin and Rivlin (1960) did not use it, but did use the usual condition on the Cauchy stress tensor associated with jump discontinuities,
$$[t_{ji}]\,\nu_j = 0, \tag{34}$$
in my notation. Reasonably, this applies to likely generalizations of the Volterra and Somigliana dislocations. Combining this with eqns (17), (27) and (28), one gets
$$[w] = A_a\, p_{ia}\, \nu_i, \tag{35}$$
where $p_{ia}$ can be evaluated on either side. Again, this is satisfied in a trivial way by Volterra dislocations, etc. So, eqns (33) and (34) are consistent with common practices, for whatever that is worth. Given that eqn (33) is unfamiliar, it is natural to view it with more suspicion, although eqn (34) might also fail to apply because surface energies are important. For edge dislocations, one might introduce a surface energy, in describing why atoms again draw closer when a half plane is removed, but it is hard to see why this should have any effect after the gap closes. What is harder is to decide what to say about singularities associated with dislocation lines. Fixing the Burgers vector for circuits embracing the dislocation line obviously does force $X_{a,j}$ to behave rather badly, near the dislocation line. The considerations of periodic shear stress-strain curves, etc. suggest that the stresses, etc. don't really behave as badly as is depicted by linear elastic solutions. Tame these a little, and one could make sense of resultant forces on surfaces intersected by dislocation lines, which would be helpful. It would take more taming to get energy integrals to converge, but this could happen, I think, with reasonable nonlinear constitutive equations. Given all the doubts about such matters, I think it premature to try to propose definite assumptions for such singularities. However, I will endorse one used by Eshelby (1951, Section 7), viz.
Proposition 3: An elastic body reacts to applied forces in the same way whether it is self-stressed or not.
and suggest that you examine what he wrote concerning this. Of course, one needs to use some judgment, in interpreting statements of this kind. For linear theory, it excludes the possibility of using concentrated force solutions to equilibrate unbalanced surface tractions in representing defects, for example. With the suggested side conditions on configurational forces, we have almost enough information to do a copy of the Toupin-Rivlin procedures for averaging the Cauchy stress, which should yield additional information. One also needs some kind of boundary conditions on configurational surface tractions. I think that these might somehow be involved in helping or resisting motion of defects from the boundary to the interior or vice versa, but I haven't digested literature relevant to this. Thus, I won't try to suggest anything definite for these. So, I have discussed some of my reasons for thinking that those nonlinear mechanisms listed in the introduction are built into versions of nonlinear crystal elasticity theory now being used, and are likely to be of some importance for understanding some plasticity phenomena. The next step should be to formulate and tackle some more definite problems of this kind. My thoughts on this are not yet clear enough to present, so I will not expend more ink.

REFERENCES
Ball, J. M. and James, R. D. (1992) Proposed experimental tests of fine microstructures and the two-well problem. Phil. Trans. R. Soc. London A338, 389.
Bell, J. F. (1973) The experimental foundations of solid mechanics. In Handbuch der Physik, vol. VIa/1, ed. C. Truesdell, pp. 1-813. Springer-Verlag, Berlin-Heidelberg-New York.
Bell, J. F. (1981) A physical basis for continuum theories of finite strain plasticity, Part II. Arch. Rational Mech. Anal. 75, 103.
Bell, J. F. (1996a) The decrease in volume during finite plastic strain. Meccanica, (in press).
Bell, J. F. (1996b) The kinematics of large plastic strain in cubic single crystals: a new look in the laboratory at G. I. Taylor's analysis of finite shear on face diagonals. In Contemporary Research in the Mechanics and Mathematics of Materials, eds R. C. Batra and M. F. Beatty, pp. 11-39. CIMNE, Barcelona.
Bhattacharya, K. and Kohn, R. V. (1996) Symmetry, texture and the recoverable strain of shape-memory polycrystals. Acta Mater. 44, 529-542.
Christian, J. W. and Mahajan, S. (1995) Deformation twinning. Progress in Mat. Sci. 39, 1.
Ericksen, J. L. (1986) Constitutive theory for some constrained elastic crystals. Int. J. Solids Structures 22, 951.
Ericksen, J. L. (1988) Some constrained elastic crystals. In Material Instabilities in Continuum Mechanics and Related Mathematical Problems, ed. J. M. Ball, pp. 119-136. Clarendon Press, Oxford.
Ericksen, J. L. (1995) Remarks concerning forces on line defects. ZAMP 46 (special issue), S247.
Ericksen, J. L. (1996) Thermal expansion involving phase transitions in certain thermoelastic crystals. Meccanica, (in press).
Eshelby, J. D. (1951) The force on an elastic singularity. Phil. Trans. R. Soc. London A244, 87.
Eshelby, J. D. (1956) The continuum theory of lattice defects. Solid State Physics 3, 79.
Eshelby, J. D. (1980) The force on a disclination in a liquid crystal. Phil. Mag. A42, 359.
Gurtin, M. E. (1995) The nature of configurational forces. Arch. Rational Mech. Anal. 131, 67.
Hsu, N. N.-H. (1969) Experimental studies of latent work hardening of aluminum single crystals. Ph.D. dissertation, The Johns Hopkins University, Baltimore.
Maugin, G. A. and Trimarco, C. (1995) The dynamics of configurational forces at phase-transition fronts. Meccanica 30, 439.
Nabarro, F. R. N. (1985) Material forces and configurational forces in interaction of elastic singularities. In Proc. Int. Symp. on Mechanics of Dislocations, 1983, eds E. C. Aifantis and J. P. Hirth, pp. 1-3. Michigan Technical University, American Society of Metals, Metals Park.
Nabarro, F. R. N. (1987) Theory of crystal dislocations. Dover Publications Inc., New York.
Noether, E. (1918) Invariante Variationsprobleme. Nachr. Akad. Wiss. Goettingen, Math.-Phys Kl 2, 235.
Thomas, L. A. and Wooster, W. A. (1951) Piezocrescence-the growth of Dauphine Twinning in quartz under stress. Proc. R. Soc. London A208, 43.
Toupin, R. A. and Rivlin, R. S. (1960) Dimensional changes in crystals caused by dislocations. J. Math. Phys. 1, 8.
Virga, E. G. (1994) Variational theories for liquid crystals. Chapman and Hall, London-Glasgow-Weinheim-New York-Tokyo-Melbourne-Madras.
Wooster, W. A. and Wooster, N. (1946) Control of electrical twinning in quartz. Nature 159, 405.
Wright, T. W. (1982) Stored energy and plastic volume change. Mechanics of Materials 1, 185.
Arch. Rational Mech. Anal. 153 (2000) 261-289 Digital Object Identifier (DOI) 10.1007/s002050000093 © Springer-Verlag (2000)
On Correlating Two Theories of Twinning J. L. ERICKSEN
Communicated by D. KINDERLEHRER
1. Introduction

After ZANZOTTO [17] made it clear that elasticity theory is inadequate to analyze various twins observed in crystal multilattices, I began to think about the possibility of constructing a different kind of theory, enabling some analysis of such twins and some phase transitions where a similar difficulty occurs. Here, we will concentrate on a theory for twins of this kind. Typically, those working on deformation twins use two kinds of observations. First, there are the X-ray observations, used to determine how the atoms are arranged in the two configurations forming a twin. Second, there are the observations of macroscopic deformation, commonly done using optical methods, determining the regions occupied by the untwinned and twinned crystal, usually when it bears no loads. In trying to understand deformation twins, workers use various ideas and assumptions. However, if you look at the review by CHRISTIAN & MAHAJAN [4], for example, you will see that constitutive equations are rarely mentioned or used. Frankly, I am unable to find constitutive theory to deal with the observations of deformation. However, it occurred to me that it should be feasible to construct a partial theory, covering only the X-ray observations, which is rather different from theories commonly used to interpret such measurements. Here, I will try to fit this together with some special assumptions used for the deformations. Long ago, when I was a student, I was taught that X-ray observations are of the microscopic kind, determining how atoms are arranged in a crystal. However, if you consider widths of typical X-ray beams, it is clear that one is viewing a very large number of atoms, certainly enough to justify regarding related measurements as being of a macroscopic nature. With such techniques, one can determine quantities describing crystal structures, lattice vectors and the like.
Given these, one can construct a model of a perfect crystal, describing how the atoms are arranged in it.
However, real crystals commonly contain large numbers of dislocations and other defects that cannot be detected in X-ray observations. Thus, the picture of the perfect crystal has some status in relation to the macroscopic measurements considered, but one needs to bear in mind that it ignores those common defects. For this reason, the lattice vectors considered in the X-ray theory are not necessarily the same as might be observed in devices providing higher resolution. Given such thoughts, it seemed to me reasonable to construct a continuum theory of X-ray observations, as a macroscopic theory, dealing with lattice vector and shift fields. As a first step, I [6] constructed a general framework of this kind, patterned after molecular theories of elasticity. I call this the X-ray theory. To get to elasticity theory, workers use the Cauchy-Born rule, hereafter abbreviated as CBR, to relate changes in lattice vectors to macroscopic deformation. ZANZOTTO [17] established that, for the cases of interest here, this fails to apply, so I avoid use of this hypothesis. In theories of continuum mechanics, it is a common practice to introduce reference configurations. Originally, I did not consider it important to do so for the X-ray theory. Later, I [8] did a comparative study of the X-ray and thermoelasticity theories and, for this purpose, did introduce reference configurations. However, for the kinds of phenomena considered here, I indicated that I believe that it is not very useful to do so. Later, I will explain why results obtained here reinforce this view. In 1997, I [6] introduced an unconventional assumption concerning twins, used here and to be introduced later. In 1999, I [9] developed what is, nominally, the elementary theory of twinning a little further. Actually, this subject is not without its subtleties. Here, in Section 3, I will explain some matters barely mentioned there, and present some new results.
There is some overlap with discussions presented there, with some changes in how results are presented. Section 4 is essentially expository, a presentation of my interpretation of various ideas used in considering deformation. I have tried to make this paper self-contained, while avoiding excessive repetition.

2. Preliminaries

For this study, we do not need much of the general apparatus of the X-ray theory. Primarily, we will be concerned with lattice vector and reciprocal lattice vector fields, denoted by $e_a$ and $e^a$ respectively. These satisfy the usual relations for dual bases,
$$e_a \cdot e^b = \delta_a^b, \qquad e_a \otimes e^a = e^a \otimes e_a = 1. \tag{2.1}$$
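As a concrete check of the dual basis relations (2.1), the following Python sketch builds reciprocal vectors from lattice vectors by the standard cross-product construction and evaluates both relations. The particular lattice vectors and the function names are illustrative assumptions, not taken from the X-ray theory.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def reciprocal_basis(e):
    """Return the reciprocal vectors e^a dual to the lattice vectors e_a."""
    e1, e2, e3 = e
    volume = dot(e1, cross(e2, e3))  # signed volume of the unit cell
    if volume == 0:
        raise ValueError("lattice vectors must be linearly independent")
    return [tuple(c / volume for c in cross(e2, e3)),
            tuple(c / volume for c in cross(e3, e1)),
            tuple(c / volume for c in cross(e1, e2))]

# Hypothetical, non-orthogonal lattice vectors chosen only for illustration.
lattice = [(1.0, 0.0, 0.0), (0.5, 1.0, 0.0), (0.0, 0.0, 2.0)]
recip = reciprocal_basis(lattice)

# First relation in (2.1): e_a . e^b = delta_a^b.
gram = [[dot(lattice[a], recip[b]) for b in range(3)] for a in range(3)]

# Second relation in (2.1): the sum over a of e_a (x) e^a is the identity.
identity = [[sum(lattice[a][i] * recip[a][j] for a in range(3))
             for j in range(3)] for i in range(3)]

print(gram)
print(identity)
```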
For elementary twinning theory, these are piecewise constant fields. In the background are constitutive equations for
$$\varphi = \varphi(e_a, p_i, \theta), \tag{2.2}$$
where $\theta$ denotes absolute temperature and
$$p_i, \qquad i = 1, \ldots, n-1, \tag{2.3}$$
is a set of shift vectors. Elsewhere, I [8] described alternative formulations which offer some advantages. Briefly, an n-lattice consists of n geometrically identical lattices, translated relative to each other, $e_a$ generating the translation group describing the periodicity which is inherent in crystals. Pick an atom from each of the lattices, and choose one of these as a base point. Then, the shifts are position vectors of these atoms, relative to the base point. Briefly and roughly, elementary twinning theory deals with certain pairs of minimizers of $\varphi$. This function should be invariant under rotations,
$$\varphi(Re_a, Rp_i, \theta) = \varphi(e_a, p_i, \theta), \qquad R \in SO(3). \tag{2.4}$$
Also, it should be invariant under transformations of the form
$$e_a \to m_a^b e_b, \qquad m = \|m_a^b\| \in GL(3, \mathbb{Z}), \qquad \det m = 1, \tag{2.5}$$
or a suitable subset of these, depending on the kinds of twins considered. Actually, some workers mention also using improper orthogonal transformations. There is a mathematically acceptable way of doing this, by combining a central inversion, effectively $R = -1$, with $m = -1$, leading to the assumption
$$\varphi(e_a, p_i, \theta) = \varphi(e_a, -p_i, \theta). \tag{2.6}$$
One can alter this in a physically equivalent manner, using ideas of equivalent shifts, which might be important when one considers $\varphi$ to be restricted to one of PITTERI'S [13] neighborhoods. From the physical point of view, I do not feel comfortable making this assumption, and have not found an example of a deformation twin for which it is unavoidable. This brief sketch of the X-ray theory is, I think, sufficient to make the following discussions comprehensible.
3. X-ray theory of twins

Among experts on twinning, there is some disagreement as to what should be the general definition of a twin. For the X-ray theory, I accept the common view that a twin involves two different configurations of a crystal, meeting at a plane, called the twin plane or composition plane, such that the positions of atoms in the two can be related by an isometry. This plane can be observed and characterized crystallographically, using X-rays. Lattice vectors and shifts are not affected by translations but, with rare exceptions, both undergo jump discontinuities at the plane. At least when the descriptions are essential, this leads to the conclusion that, if $e_a$ (or $e^a$) and $p_i$ describe one of the configurations, then $Qe_a$ (or $Qe^a$) and $Qp_i$ are possible values of these vectors for the other, where Q is the orthogonal transformation involved in such an isometry: there can be more than one, and will be in cases to be studied later, when additional assumptions will be introduced. The twinning theory considered here is elementary, ignoring complications sometimes
considered by workers. These include consideration of surface energies, the possibility that the discontinuity plane is a thin layer with a microstructure, or that it might be composed of thin strips of differently oriented planes. Generally, such details are too microscopic to be observable using X-rays. One expert informs me that, sometimes, in his X-ray observations, he observes changes in lattice vectors near a twin plane. It is reasonable to suppose that any of the items mentioned could produce this. I [6] introduced another assumption. Briefly, this said that it is possible to choose lattice vectors on the two sides such that the Burgers vectors for circuits in the neighborhood of the composition plane are all null vectors. If $e^a$ and $\bar{e}^a$ are so related, this led to the jump condition
$$\bar{e}^a = (1 - n \otimes a)\, e^a, \tag{3.1}$$
where n is the unit normal to the composition plane and a is some vector. This is not a conventional assumption in twinning theory, but will play an important role in our considerations. Here, we will consider only deformation twins in unstressed crystals and, for these, one can add the conditions that these vectors have the same orientation and that the volumes of unit cells coincide, implying that $a \cdot n = 0$. Here, I assume that $a \neq 0$. The case $a = 0$ is of some interest for Dauphine twins in quartz, which are associated with deformation, the only example of this kind known to me. Then, (3.1) is equivalent to
$$\bar{e}_a = S e_a, \qquad S = 1 + a \otimes n, \qquad a \cdot n = 0. \tag{3.2}$$
Here, $\bar{e}_a$ must be in the equivalence class of lattice vectors generated by $Qe_a$, producing the twinning equation
$$\bar{e}_a = S e_a = Q m_a^b e_b, \qquad m = \|m_a^b\| \in GL(3, \mathbb{Z}). \tag{3.3}$$
Now, S looks like the deformation gradient for a simple shearing deformation, like that associated with forming a deformation twin, call it $\mathcal{S}$. However a priori, S has nothing to do with deformation. For example, (3.3) applies to some growth twins which have no deformation involved in their creation. However, we will see that there is a relation between S and $\mathcal{S}$ for deformation twins. Sometimes, a relation of the form (3.3) is satisfied by $\mathcal{S}$, and we then say that CBR (Cauchy-Born Rule) is satisfied. In this context, (3.3) is familiar to experts. However, for various observed twins, $\mathcal{S}$ does not satisfy an equation of this form. By and large, we will be concerned with how S and $\mathcal{S}$ are related, in the latter cases. Note that (3.3) implies that
$$\det Q \, \det m = 1 \tag{3.4}$$
and $Q = \pm R$, where R is a rotation. Thus, in solving (3.3), there is no loss of generality in taking $Q = R$, as I will do. However, it can happen that, when shifts are taken into account, the relevant isometries require use of only one of the two choices of Q. There is the customary classification of twins, into type I, type II, compound and any not included in these. In the latter category, I know of only one example,
of a twin in LaNbO4, and the analysis of this by JIAN & JAMES [11] makes clear that CBR applies to it. This makes it uninteresting for us, so we will consider only the first three types. For type I twins, one is dealing with the 180° rotation $R_I$ satisfying
$$R_I^2 = 1, \qquad R_I n = n, \tag{3.5}$$
or its negative. As is well known, for (3.3) to hold, it is necessary that n be a rational direction, meaning that there are relatively prime integers $k_a$ such that
$$\text{direction of } n = k_a e^a \overset{\text{def}}{=} K_1. \tag{3.6}$$
Here, $K_1$ is one of four directions listed in twinning tables and it is commonly determined from X-ray observations. For these twins, the other three are ignored in the X-ray theory, since they are not determined by such observations. Also, for these, the $m = m_I$ in (3.3) is of the form
$$(m_I)_a^b = -\delta_a^b + l^b k_a, \qquad l^a k_a = 2, \tag{3.7}$$
where the $l^a$ are, of course, integers. Then, (3.3) is equivalent to
$$S = 1 + \bar{a} \otimes K_1 = R_I(-1 + \eta \otimes K_1), \tag{3.8}$$
where
$$\eta = l^a e_a \;\Rightarrow\; \eta \cdot K_1 = 2, \qquad \bar{a} = a/|K_1|, \qquad K_1 = |K_1|\, n. \tag{3.9}$$
Solving for $\bar{a}$ gives
$$\bar{a} = \frac{(R_I - 1)\eta}{2} = -\eta + \frac{\eta \cdot K_1}{|K_1|^2} K_1 = -\eta + \frac{2K_1}{|K_1|^2}, \tag{3.10}$$
and one obviously has
$$a = |K_1|\, \bar{a}, \qquad n = K_1/|K_1|. \tag{3.11}$$
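A small numerical check of (3.7) through (3.10) may help here. The Python sketch below is only an illustration: it assumes a simple cubic lattice (so that the reciprocal vectors coincide with the lattice vectors) and hypothetical integers k = (1, 1, 2), l = (0, 0, 1), which satisfy $l^a k_a = 2$. It computes $\bar{a}$ from (3.10) in exact rational arithmetic and confirms that $S e_a = R_I (m_I)_a^b e_b$. The function names are not from the source.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def lincomb(coeffs, vecs):
    """Return sum_a coeffs[a] * vecs[a], using exact rational arithmetic."""
    return tuple(sum(Fraction(c) * v[i] for c, v in zip(coeffs, vecs))
                 for i in range(3))

def rotate_180(axis, v):
    """180 degree rotation about axis: -v + 2 (axis . v) axis / |axis|^2."""
    scale = 2 * dot(axis, v) / dot(axis, axis)
    return tuple(-x + scale * ax for x, ax in zip(v, axis))

def type_one_solution(e, e_rec, k, l):
    """Return (K1, eta, a_bar) following (3.6), (3.9) and (3.10)."""
    if sum(la * ka for la, ka in zip(l, k)) != 2:
        raise ValueError("the integers must satisfy l^a k_a = 2")
    K1 = lincomb(k, e_rec)
    eta = lincomb(l, e)
    a_bar = tuple(-h + 2 * K / dot(K1, K1) for h, K in zip(eta, K1))
    return K1, eta, a_bar

def twinning_equation_holds(e, k, K1, eta, a_bar):
    """Check S e_a = R_I (m_I)^b_a e_b for each lattice vector."""
    for ea, ka in zip(e, k):
        sheared = tuple(x + ab * dot(K1, ea) for x, ab in zip(ea, a_bar))
        m_e = tuple(-x + ka * h for x, h in zip(ea, eta))  # (m_I)^b_a e_b
        if sheared != rotate_180(K1, m_e):
            return False
    return True

# Assumed example: simple cubic lattice, so e^a coincides with e_a.
e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
k = (1, 1, 2)
l = (0, 0, 1)
K1, eta, a_bar = type_one_solution(e, e, k, l)
ok = twinning_equation_holds(e, k, K1, eta, a_bar)
print(a_bar, ok)
```

Other integers l with the same value of $l^a k_a$ give other vectors $\bar{a}$, all orthogonal to $K_1$.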
Here, (3.10) is a useful way of describing solutions, slightly different from that used by PITTERI [14]. For type II twins, a 180° rotation $R_{II}$ is involved, satisfying
$$R_{II}\, a = a, \qquad R_{II}^2 = 1. \tag{3.12}$$
For (3.3) to hold, it is then necessary that a be rational, meaning that
$$\text{direction of } a = m^a e_a \overset{\text{def}}{=} \eta_1, \tag{3.13}$$
the $m^a$ being relatively prime integers. For these twins, $\eta_1$, another direction listed in twinning tables, can be obtained from X-ray observations. So can $K_1$, the direction of n, but for the time being, I will ignore this. Here, $m = m_{II}$ is of the form
$$(m_{II})_a^b = -\delta_a^b + m^b n_a, \qquad m^a n_a = 2. \tag{3.14}$$
An analog of (3.8) is
$$S = 1 + \eta_1 \otimes \bar{K} = R_{II}(-1 + \eta_1 \otimes K), \qquad \eta_1 \cdot K = 2, \tag{3.15}$$
with
$$a = \alpha \eta_1, \qquad \bar{K} = \alpha n, \qquad \alpha = |a|/|\eta_1| = |\bar{K}|, \qquad K = n_a e^a. \tag{3.16}$$
Solving for $\bar{K}$ gives
$$\bar{K} = \frac{(1 - R_{II})K}{2} = K - \frac{2\eta_1}{|\eta_1|^2}, \tag{3.17}$$
with
$$n = \bar{K}/|\bar{K}|, \qquad a = |\bar{K}|\, \eta_1. \tag{3.18}$$
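The type II formulas can be checked in the same spirit. The sketch below again assumes a simple cubic lattice and hypothetical integers m = (1, 1, 1), n = (1, 1, 0), which satisfy $m^a n_a = 2$. It evaluates $\bar{K}$ from (3.17) and tests the analog of the twinning equation, $S e_a = R_{II} (m_{II})_a^b e_b$. All names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def lincomb(coeffs, vecs):
    """Return sum_a coeffs[a] * vecs[a], using exact rational arithmetic."""
    return tuple(sum(Fraction(c) * v[i] for c, v in zip(coeffs, vecs))
                 for i in range(3))

def rotate_180(axis, v):
    """180 degree rotation about axis: -v + 2 (axis . v) axis / |axis|^2."""
    scale = 2 * dot(axis, v) / dot(axis, axis)
    return tuple(-x + scale * ax for x, ax in zip(v, axis))

def type_two_solution(e, e_rec, m, n):
    """Return (eta1, K, K_bar) following (3.13), (3.16) and (3.17)."""
    if sum(ma * na for ma, na in zip(m, n)) != 2:
        raise ValueError("the integers must satisfy m^a n_a = 2")
    eta1 = lincomb(m, e)
    K = lincomb(n, e_rec)
    K_bar = tuple(Kc - 2 * h / dot(eta1, eta1) for Kc, h in zip(K, eta1))
    return eta1, K, K_bar

def twinning_equation_holds(e, n, eta1, K_bar):
    """Check S e_a = R_II (m_II)^b_a e_b for each lattice vector."""
    for ea, na in zip(e, n):
        sheared = tuple(x + h * dot(K_bar, ea) for x, h in zip(ea, eta1))
        m_e = tuple(-x + na * h for x, h in zip(ea, eta1))  # (m_II)^b_a e_b
        if sheared != rotate_180(eta1, m_e):
            return False
    return True

# Assumed example: simple cubic lattice, so e^a coincides with e_a.
e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
m = (1, 1, 1)
n = (1, 1, 0)
eta1, K, K_bar = type_two_solution(e, e, m, n)
ok = twinning_equation_holds(e, n, eta1, K_bar)
print(K_bar, ok)
```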
When theory does not require a direction to be rational, the custom is to call it irrational. Examples are the direction of a for type I twins and of n for type II twins. Here again, (3.17) is a way of describing solutions which I find useful, slightly different from that used by PITTERI [15], for example. A twin is compound if it is both of type I and type II, with S being the same for the solutions (3.10) and (3.17). Using (3.3), (3.7) and (3.14), we then get
$$R_I (m_I)_a^b e_b = R_{II} (m_{II})_a^b e_b. \tag{3.19}$$
Since $R_I$ and $R_{II}$ are 180° rotations with perpendicular axes, composing them in either order gives a 180° rotation with axis perpendicular to both, as indicated by
$$R_u = R_I R_{II} = R_{II} R_I, \qquad a \cdot n = a \cdot u = n \cdot u = 0, \qquad R_u^2 = 1. \tag{3.20}$$
Then (3.19) yields
$$R_u e_a = (m_u)_a^b e_b, \qquad m_u \in GL(3, \mathbb{Z}), \tag{3.21}$$
where
$$m_u = m_I m_{II} = m_{II} m_I. \tag{3.22}$$
The interpretation of (3.21) is that $R_u$ is included in the point group for the skeletal lattice. This does not imply that either $R_u$ or $-R_u$ is in the point group for a multilattice, although it is true for a Bravais lattice (1-lattice). However, point groups for the latter are well-known, as are popular choices of lattice vectors for these, so (3.21) can be useful for verifying that a twin is compound. For example, CAHN [3] made use of an equivalent of this in characterizing the {130} twins in α-uranium, as I interpret his remarks. These are "There is a plane of symmetry normal to (130), namely (001). From this, we can deduce that $\eta_1$ must coincide with the intersection of (130) and (001), that is the direction [310]. If this were not so, (001) would no longer be a symmetry plane in the twinned parts of a grain, and this would violate the fundamental requirement that the crystal be unchanged by twinning."
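Equating values of $S$ gives
$$a \otimes \kappa_1 = \eta_1 \otimes \hat\kappa \;\Rightarrow\; a = \beta\,\eta_1, \qquad \hat\kappa = \beta\,\kappa_1 \tag{3.23}$$
for some scalar $\beta$, or
$$S = 1 + \beta\,\eta_1 \otimes \kappa_1, \tag{3.24}$$
where $\beta$ is, as yet, undetermined. Now, a simple calculation based on (3.22) gives
$$l^a n_a = 0 \;\Leftrightarrow\; \eta \cdot \kappa = 0, \tag{3.25}$$
to be satisfied, along with (3.7)$_2$ and (3.14)$_2$, by these integers. Also, by routine calculations, using (3.10), (3.17), (3.23) and (3.25), one finds that the following equations must be satisfied
$$\eta_1 \cdot u = \kappa_1 \cdot u = \eta \cdot u = \kappa \cdot u = 0. \tag{3.26}$$
So, $(\eta_1, \kappa_1)$ and $(\eta, \kappa)$ are two sets of orthogonal directions lying in the plane with normal $u$, with $\eta \cdot \kappa_1 = 2$ and $\eta_1 \cdot \kappa = 2$. From this geometry, it is obvious that
$$|\eta_1||\kappa| = |\eta||\kappa_1|, \qquad \frac{\eta_1 \cdot \eta}{|\eta_1||\eta|} = -\frac{\kappa_1 \cdot \kappa}{|\kappa_1||\kappa|}. \tag{3.27}$$
When all of these conditions are satisfied, one can use (3.10), (3.17) and (3.23) to show that $\beta$ is given by
$$\beta = -\frac{\eta_1 \cdot \eta}{|\eta_1|^2} = \frac{\kappa_1 \cdot \kappa}{|\kappa_1|^2}. \tag{3.28}$$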
Again, this is a useful but unconventional way of describing solutions of the twinning equations. Commonly, an estimate of $\kappa_1$ is obtained from X-ray observations, subject to the inevitable experimental errors. Also, for type II and compound twins, one can get such estimates of $\eta_1$, interpreted as the axis of $R_{II}$. Occasionally, workers use X-ray observations to get estimates of $\eta$ or $\kappa$, but this involves some guesswork, so I will ignore this. For example, CAHN [3] did get an estimate of $\kappa$ for a twin in α-uranium by recognizing that it was conjugate to another twin he observed, $\kappa$ being its value of $\kappa_1$. Often, when a twin is observed, its conjugate is not, so this is an unusual situation. Generally, this means that the related value should be regarded as subject only to the conditions required by the solvability of the twinning equations, in the X-ray theory.

Next, we explore non-uniqueness of solutions of the twinning equations, beginning with those of type I. Of course, an estimate of $\kappa_1$ can be obtained from X-ray observations, and workers agree with the conclusion of the X-ray theory, that it is rational, so workers fit the data to get the integers $k_a$, referred to in (3.6). However, the integers $l^a$ could be any satisfying (3.7)$_2$, giving an equivalence class of solutions. Given one set $l_0^a$, one can get others by adding integers $\bar l^a$ as indicated by
$$l^a = l_0^a + \bar l^a, \qquad \bar l^a k_a = 0, \tag{3.29}$$
or
$$\eta = l^a e_a = \eta_0 + \bar\eta, \qquad \bar\eta = \bar l^a e_a \;\Rightarrow\; \bar\eta \cdot \kappa_1 = 0. \tag{3.30}$$
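This gives a 2-parameter family of solutions, parameterized by two arbitrary integers. From (3.8), starting with the shear
$$S_0 = 1 + a_0 \otimes \kappa_1 = R_I(-1 + \eta_0 \otimes \kappa_1), \tag{3.31}$$
we get any of the other shears described by
$$S = 1 + a \otimes \kappa_1 = R_I\bigl(-1 + (\eta_0 + \bar\eta) \otimes \kappa_1\bigr) = R_I(-1 + \eta_0 \otimes \kappa_1)\,\bar S = S_0 \bar S, \tag{3.32}$$
where
$$\bar S = 1 - \bar\eta \otimes \kappa_1, \qquad \bar\eta \cdot \kappa_1 = 0, \tag{3.33}$$
is a shear with the property that
$$\bar S e_a = (\bar m_I)^b_a e_b, \qquad \bar m_I = \|\delta^b_a - \bar l^b k_a\| \in GL(3, \mathbb{Z}). \tag{3.34}$$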
This is then a lattice invariant shear, mapping $e_a$ to another set of lattice vectors for the same configuration, without requiring a change in the shifts. Such shears are not detectable in X-ray observations so, physically, this explains the non-uniqueness of solutions. For type II twins, as described above, we can apply the same reasoning to the integers $n_a$ in (3.14). With analogous notation, we get
$$S = S_0 \bar S = \bar S S_0. \tag{3.35}$$
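That the matrix in (3.34) is unimodular whenever $\bar l^a k_a = 0$ can be seen from a small integer example. This is a sketch with illustrative integers (assumed, not from the text) and hypothetical helper names; it also forms the inverse, which is obtained by reversing the sign of the added term.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def lattice_invariant_matrix(l_bar, k, sign=-1):
    """Matrix with entry [a][b] = delta_ab + sign * l_bar[b] * k[a], as in (3.34) for sign=-1."""
    if sum(x * y for x, y in zip(l_bar, k)) != 0:
        raise ValueError("the sketch requires l_bar . k = 0")
    return [[(1 if a == b else 0) + sign * l_bar[b] * k[a] for b in range(3)]
            for a in range(3)]

# Illustrative integers: components k_a of kappa_1 and an increment l_bar^a
k = [1, 3, 0]
l_bar = [3, -1, 2]

m_bar = lattice_invariant_matrix(l_bar, k)              # delta - l_bar k
m_bar_inverse = lattice_invariant_matrix(l_bar, k, +1)  # delta + l_bar k
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
```

Because both the matrix and its inverse have integer entries, the map sends one set of lattice vectors to another set generating the same lattice.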
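Using the same symbols for these shears should not cause confusion. Here, we use
$$n_a = n_{0a} + \bar n_a, \qquad \kappa = n_a e^a = \kappa_0 + \bar\kappa, \qquad \bar\kappa = \bar n_a e^a, \qquad \eta_1 \cdot \bar\kappa = 0. \tag{3.36}$$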
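With this, (3.15) and (3.17), we have
$$S_0 = 1 + \eta_1 \otimes \hat\kappa_0, \qquad \bar S = 1 + \eta_1 \otimes \bar\kappa, \qquad \hat\kappa_0 = \kappa_0 - \frac{2\eta_1}{|\eta_1|^2}, \tag{3.37}$$
with
$$\bar S e_a = (\bar m_{II})^b_a e_b, \qquad \bar m_{II} = \|\delta^b_a + m^b \bar n_a\|, \qquad m^a \bar n_a = 0, \tag{3.38}$$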
again describing a 2-parameter family of lattice invariant shears. Finally, for the compound twins, $\eta_1$ and $\kappa_1$ are regarded as known, fixing the integers $m^a$ and $k_a$, which must satisfy
$$\eta_1 \cdot \kappa_1 = m^a k_a = 0. \tag{3.39}$$
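From (3.7)$_2$, (3.14)$_2$, and (3.25), the integers $l^a$ and $n_a$ must satisfy
$$l^a k_a = 2, \quad m^a n_a = 2, \quad l^a n_a = 0 \;\Leftrightarrow\; \eta \cdot \kappa_1 = 2, \quad \eta_1 \cdot \kappa = 2, \quad \eta \cdot \kappa = 0 \tag{3.40}$$
and be compatible with (3.26), where $u$, the direction of $a \wedge n$, is determined by $\eta_1$ and $\kappa_1$. Writing $l^a = l_0^a + \bar l^a$ as before, we get
$$\bar\eta \cdot \kappa_1 = \bar\eta \cdot u = 0, \qquad \bar\eta = \bar l^a e_a, \tag{3.41}$$
implying that $\bar\eta$ has the same direction as $\pm\eta_1$. Since the components $m^a$ of $\eta_1$ are relatively prime integers, this gives
$$l^a = l_0^a - n\,m^a \;\Leftrightarrow\; \eta = \eta_0 - n\,\eta_1 \tag{3.42}$$
for some integer $n$. Treating $n_a$ similarly, then using (3.40), we get
$$n_a = n_{0a} + n\,k_a \;\Leftrightarrow\; \kappa = \kappa_0 + n\,\kappa_1. \tag{3.43}$$
Then, using (3.28), we get
$$\beta = \beta_0 + n, \tag{3.44}$$
a 1-parameter family of shears, parametrized by $n$. Here, we have (3.35), with the lattice invariant shears now given by
$$\bar S = 1 + n\,\eta_1 \otimes \kappa_1, \tag{3.45}$$
satisfying
$$\bar S e_a = (\bar m_c)^b_a e_b, \qquad \bar m_c = \|\delta^b_a + n\,m^b k_a\|. \tag{3.46}$$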
It remains for us to deal with the fact that our solutions for type II twins which are not compound do not account for the fact that estimates of $\kappa_1$ are commonly obtained from X-ray observations. Often, specifying $\eta_1$ and $\kappa_1$ implies loss of existence of solutions. Of course, one would like to have the vector $\hat\kappa$ in (3.17) take on the specified direction $\kappa_1$. However, $\kappa_1$ is subject to some experimental error, so it would suffice to attain a value which agrees with the estimate, within the estimated error limits, and this is feasible. To analyze this, choose whatever value of $\kappa$ you like, denote it by $\kappa_0$, and introduce the corresponding value of $\hat\kappa$,
$$\hat\kappa_0 = \frac{(1 - R_{II})\,\kappa_0}{2} = \kappa_0 - \frac{2\eta_1}{|\eta_1|^2}, \tag{3.47}$$
from (3.17), with $\eta_1$ considered as given. Introduce a rational estimate of $\kappa_1$, denoted by $\tilde\kappa_1$, matching the data within the margin of error, requiring that
$$\eta_1 \cdot \tilde\kappa_1 = 0, \tag{3.48}$$
in accordance with common practice for such estimates. From (3.15)$_3$ and (3.17),
$$\eta_1 \cdot \hat\kappa_0 = 0. \tag{3.49}$$
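From (3.36), the available values of $\kappa$ are given by
$$\kappa = \kappa_0 + \bar\kappa, \qquad \eta_1 \cdot \bar\kappa = 0, \tag{3.50}$$
with $\kappa_0$ and $\bar\kappa$ rational. Let
$$v = v^a e_a \tag{3.51}$$
be a non-zero vector such that
$$v \cdot \eta_1 = v \cdot \tilde\kappa_1 = 0 \;\Rightarrow\; v \cdot \hat\kappa_0 = v \cdot \kappa_0, \tag{3.52}$$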
from (3.17). For any choice of $\bar\kappa$, we can use (3.17) and (3.50) to calculate a corresponding value of $\hat\kappa$ satisfying $\eta_1 \cdot \hat\kappa = 0$. Ideally, we would like this to have the direction $\tilde\kappa_1$, which is to say be perpendicular to $v$. This would be possible, if we could find a value of $\bar\kappa$ such that
$$v \cdot (\hat\kappa_0 + \bar\kappa) = 0 \;\Leftrightarrow\; v \cdot (\kappa_0 + \bar\kappa) = 0, \tag{3.53}$$
from (3.52). Now, $\eta_1 = m^a e_a$, where the $m^a$ are relatively prime integers, implying that one can find $m \in GL(3, \mathbb{Z})$, with these numbers as the first row. Using this to transform lattice vectors, we get a choice of $e_a$ such that
$$\eta_1 = e_1. \tag{3.54}$$
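The existence of a matrix in $GL(3, \mathbb{Z})$ with a prescribed first row of relatively prime integers can be made constructive. The sketch below is one possible construction, based on the Euclidean algorithm, and is not taken from the text; the function names and the sample vectors are hypothetical. It reduces the given row to $(\pm 1, 0, 0)$ by elementary operations while recording the inverse operations as row operations on an accumulating matrix.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def complete_to_unimodular(m):
    """Return an integer matrix W with det W = +1 or -1 whose first row is m.

    Invariant kept throughout: sum_k v[k] * W[k] == m (as row vectors).
    """
    v = list(m)
    n = len(v)
    W = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for j in range(1, n):
        while v[j] != 0:
            q = v[0] // v[j]
            # v[0] -= q * v[j]; compensate with W[j] += q * W[0]
            v[0] -= q * v[j]
            W[j] = [wj + q * w0 for wj, w0 in zip(W[j], W[0])]
            # swap entries 0 and j, and the matching rows of W
            v[0], v[j] = v[j], v[0]
            W[0], W[j] = W[j], W[0]
    if abs(v[0]) != 1:
        raise ValueError("components are not relatively prime")
    if v[0] == -1:
        W[0] = [-x for x in W[0]]
    return W

# Illustrative components m^a of eta_1
W = complete_to_unimodular([2, 3, 5])
W_neg = complete_to_unimodular([6, -10, 15])
```

With $W$ in hand, the new lattice vectors $e'_a = W^b_a e_b$ have $e'_1 = m^a e_a = \eta_1$, which is the normalization (3.54).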
Since $\tilde\kappa_1$ and $\bar\kappa$ are rational vectors and perpendicular to $\eta_1$, we then have
$$\tilde\kappa_1 = p\,e^2 + q\,e^3, \qquad \bar\kappa = r\,e^2 + s\,e^3, \tag{3.55}$$
where $p$ and $q$ as well as $r$ and $s$ are relatively prime integers. Also, $\hat\kappa_0$ is perpendicular to $\eta_1$, so
$$\hat\kappa_0 = \varkappa\,e^2 + \lambda\,e^3, \tag{3.56}$$
where these components are real numbers, not likely to be integers. We are free to choose $r$ and $s$. Referring to (3.51), we can take
$$v^2 = q, \qquad v^3 = -p, \tag{3.57}$$
to satisfy (3.52)$_2$, and calculate $v^1$, using
$$v \cdot \eta_1 = v \cdot e_1 = (v^1 e_1 + q\,e_2 - p\,e_3) \cdot e_1 = 0. \tag{3.58}$$
In (3.55), $p$ and $q$ cannot both vanish, so I will assume $q \neq 0$. This can be attained by renumbering $e_2$ and $e_3$, if necessary. Working out the equation we would like to satisfy, (3.53), we get
$$E = 0, \qquad E = \frac{p}{q} - \frac{\varkappa + r}{\lambda + s}, \tag{3.59}$$
to determine $r$ and $s$. Solve it when possible, but this is more likely to be impossible. However, given the experimental errors, it seems reasonable to accept approximations such that, for some number $\mu$ covering the estimated margin of error,
$$|E| < \mu. \tag{3.60}$$
It is easy to see that (3.60) is satisfied for
$$r = np, \qquad s = nq \;\Rightarrow\; \bar\kappa = n\,\tilde\kappa_1, \tag{3.61}$$
where $n$ is an integer, when $|n|$ is sufficiently large.
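To see how the choice $r = np$, $s = nq$ drives the mismatch below any tolerance, one can tabulate it directly. The sketch below assumes that $E$ is the difference between $p/q$ and the ratio of the $e^2$ and $e^3$ components of $\hat\kappa_0 + \bar\kappa$; the function names and all numerical values are illustrative assumptions, not data from the text.

```python
def mismatch(p, q, comp2, comp3, n):
    """E for r = n*p, s = n*q, where comp2 and comp3 are the (real)
    e^2 and e^3 components of the starting shear normal."""
    r, s = n * p, n * q
    return p / q - (comp2 + r) / (comp3 + s)

def smallest_n(p, q, comp2, comp3, mu, n_max=100000):
    """Smallest positive n with |E| < mu, or None if none is found up to n_max."""
    for n in range(1, n_max + 1):
        if abs(mismatch(p, q, comp2, comp3, n)) < mu:
            return n
    return None

# Illustrative values: rational estimate (p, q), irrational-looking start, tolerance mu
p, q = 1, 2
comp2, comp3 = 0.37, 1.21
mu = 1e-3
n_found = smallest_n(p, q, comp2, comp3, mu)
```

For positive components, as here, $|E|$ decreases roughly like $1/n$, so a tighter tolerance $\mu$ simply demands a proportionally larger $n$.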
Of course, how large $n$ must be depends on the choice of the starting value of $\kappa_0$, so one could choose this in such a way as to satisfy the condition for all positive $n$, for example. With the values of $\bar\kappa$ thus determined, one gets an infinite number of theoretical estimates of $\kappa_1$, of the form
$$\kappa_1 = \nu\Bigl(\kappa_0 + n\,\tilde\kappa_1 - \frac{2\eta_1}{|\eta_1|^2}\Bigr) = \nu\,(\hat\kappa_0 + n\,\tilde\kappa_1), \tag{3.62}$$
where $\nu$ is a number one can choose and, in this form, the result does not depend on the choice of lattice vectors. Generally, $\kappa_1$ is irrational. In such cases, a common practice is to choose a particular set of lattice vectors and normalize by using ratios making one component equal to unity. Clearly, one can get different values of $\kappa_1$ by using slightly different values of the rational estimate $\tilde\kappa_1$, for example. Obviously, the X-ray theory does not pick out a particular value as being exact and I am not sure what should be meant, physically, by an exact value. The analysis here is new. Previously, I [9] deduced a result of this kind for the particular case of the "{172}" twins in α-uranium, using rather similar ideas. Here, the quotation marks indicate that this is a rational approximation to $\kappa_1$, which is considered to be irrational. From calculations I have done and comments made by others, I think this estimate fits the experimental data, within the likely errors. At the time, I did not see how to treat the general case.
4. Deformation theory of twins

Somewhat different theory is used for twins produced by deformation, for which X-ray observations are supplemented by measurements of deformation, commonly obtained using optical observations. Since the theories are different, I think it important to understand how their predictions are related, and whether these can be in conflict. First, some identities are used, easily derived from equations presented by KELLY & GROVES [12, Section 10.2]. Given the orthogonal vectors $a$ and $n$ describing a shear $S$, we introduce four vectors $\eta_1$, $\eta_2$, $\kappa_1$ and $\kappa_2$, described by
$$\eta_1 \text{ has the direction of } a, \qquad \eta_2 \text{ that of } \frac{2n}{|n|^2} - a, \qquad \kappa_1 \text{ that of } n, \qquad \kappa_2 \text{ that of } n + \frac{2a}{|a|^2}. \tag{4.1}$$
From these, it follows that the four vectors all lie in the plane of $a$ and $n$,
$$\eta_1 \cdot v = \eta_2 \cdot v = \kappa_1 \cdot v = \kappa_2 \cdot v = 0, \qquad v = a \wedge n, \tag{4.2}$$
and
$$\eta_1 \cdot \kappa_1 = \eta_2 \cdot \kappa_2 = 0, \qquad \eta_1 \cdot \kappa_2 > 0, \qquad \eta_2 \cdot \kappa_1 > 0. \tag{4.3}$$
Actually, workers sometimes take $\eta_1$ and $a$ to be anti-parallel, etc., but, usually, the directions picked conform to (4.3)$_2$ and (4.3)$_3$. Typical examples occur in the cases treated in Section 7, where these directions are parallel for some realistic values of lattice parameters, anti-parallel for others. Bear in mind that reversing $a$ and $n$ does not change $S$. Also, with $R_I$ and $R_{II}$ interpreted as before, one gets a generalization of (3.8), by using routine calculations to verify that
$$H_I = -1 + \frac{2\,\eta_2 \otimes \kappa_1}{\eta_2 \cdot \kappa_1} \tag{4.4}$$
satisfies
$$\det H_I = 1, \qquad H_I^2 = 1, \qquad S = R_I H_I. \tag{4.5}$$
These equations play an important role in ZANZOTTO'S [17] study of failures of CBR and related failures of elasticity theory. Similarly, (3.15) gets generalized by verifying that
$$H_{II} = -1 + \frac{2\,\eta_1 \otimes \kappa_2}{\eta_1 \cdot \kappa_2} \tag{4.6}$$
satisfies
$$\det H_{II} = 1, \qquad H_{II}^2 = 1, \qquad S = R_{II} H_{II}. \tag{4.7}$$
Also, using (4.3), it is easy to show that
$$H_I H_{II} = H_{II} H_I. \tag{4.8}$$
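The identities (4.5), (4.7) and (4.8) can be confirmed with exact arithmetic for a sample shear. The sketch below assumes an orthonormal basis and assumes that the four directions are those of $a$, $2n/|n|^2 - a$, $n$ and $n + 2a/|a|^2$, which is one reading consistent with (4.2) and (4.3); the vectors and helper names are illustrative.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def minus_one_plus(u, v, c):
    """-1 + c * (u x v), with x the tensor product."""
    return [[c * u[i] * v[j] - (1 if i == j else 0) for j in range(3)]
            for i in range(3)]

def rot180(axis):
    return minus_one_plus(axis, axis, Fraction(2) / dot(axis, axis))

# Illustrative orthogonal pair a, n defining the shear S = 1 + a x n
a = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
n = [Fraction(1), Fraction(-1), Fraction(1)]
assert dot(a, n) == 0
S = [[(1 if i == j else 0) + a[i] * n[j] for j in range(3)] for i in range(3)]

eta1, kappa1 = a, n
eta2 = [2 * ni / dot(n, n) - ai for ai, ni in zip(a, n)]    # assumed form
kappa2 = [ni + 2 * ai / dot(a, a) for ai, ni in zip(a, n)]  # assumed form

H_I = minus_one_plus(eta2, kappa1, Fraction(2) / dot(eta2, kappa1))
H_II = minus_one_plus(eta1, kappa2, Fraction(2) / dot(eta1, kappa2))
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
```

Since $H_I$ and $H_{II}$ are unchanged when $\eta_2$, $\kappa_1$ (or $\eta_1$, $\kappa_2$) are rescaled, only the directions of the four vectors matter in this check.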
In the applications, $\eta_1$ and $\eta_2$ are commonly considered as linear combinations of some lattice vectors $e_a$, whereas $\kappa_1$ and $\kappa_2$ are referred to the reciprocal lattice vectors $e^a$. We have seen examples in the previous Section, with the possibilities
$$\eta_2 = \eta \text{ for type I or compound twins} \;\Rightarrow\; \eta_2 \cdot \kappa_1 = 2, \tag{4.9}$$
and
$$\kappa_2 = \kappa \text{ for type II or compound twins} \;\Rightarrow\; \eta_1 \cdot \kappa_2 = 2, \tag{4.10}$$
with $\eta_1$ and $\kappa_1$ as described there. Nothing said so far relates $S$ to macroscopic deformation. If $S$ is considered to represent macroscopic shear deformation, $\eta_1$ represents the direction of $a$, now interpreted as the direction of shear, whereas before it represented the direction of the axis of $R_{II}$. Conceptually, these directions are different. If you look at twinning tables, you can find descriptions of $\eta_1$ but, for type II or compound twins, the measurements are sometimes made of one of these directions, sometimes of the other, and you must go back to the original sources to find out which it was. Suffice it to say that I made a concerted, but unsuccessful effort to find out if there are any observations of cases where these two directions are different. So, I believe that, empirically, they always coincide. Later, I will say a bit more about this.

In such tables, data for $\eta_2$ and $\kappa_2$ are almost always obtained from observations of deformation. Generally, these differ from the rather ambiguous values of $\eta$ and $\kappa$ used in the X-ray theory. So, to avoid confusion, I used different notation. Commonly, experimental estimates of $\kappa_1$ listed in tables are obtained from X-ray observations. It is easy to check that $\eta_2$ is the direction in the plane of $a$ and $n$, not parallel to $a$, such that $\eta_2$ and $S\eta_2$ have the same length. This direction is included in a plane with normal also in the plane of $a$ and $n$, with direction $\kappa_2$. As is perhaps obvious, suitable pairs of the four vectors determine $S$, for example $\eta_1$ and $\kappa_2$ or $\eta_2$ and $\kappa_1$. Of course, these are combinations which cannot be determined using only X-ray observations. Clearly, one could put data from the tables into the identities (4.5) and (4.7) to relate these to $a$ and $n$, etc., but I do not think that this is really useful. Neither do I regard this as a theory of deformation twins, since it only involves identities. So, we need to introduce some assumption(s). I will now describe my interpretation of one used in practice.
Consider an unstressed crystal twin configuration, using an essential description. There are then two logical possibilities, both encountered in practice.

(a) The macroscopic deformation gradient $S$ satisfies an equation of the form (3.3). We then say that CBR applies. (4.11)

(b) The macroscopic deformation gradient $S$ does not satisfy such an equation. We then say that CBR fails to apply. (4.12)

Of course, these statements apply only to the very special deformations commonly considered in discussions of twinning. Now, if $e_a$ is a set of lattice vectors for an essential description, what are called sub-lattice vectors $\bar e_a$ are choices of lattice vectors for a non-essential description of the same configuration. I [7] discussed relations between these two kinds of descriptions in some detail, including ways of determining whether a description is non-essential. Briefly, $e_a$ and $\bar e_a$ are related by a transformation of the form
$$\bar e_a = M^b_a e_b, \qquad M^b_a \in \mathbb{Z}, \qquad |\det M| > 1. \tag{4.13}$$
Also, by transforming $e_a$ and $\bar e_a$ independently, using transformations in $GL(3, \mathbb{Z})$, one can get pairs such that
$$\bar e_a = m_a e_a \quad \text{(no sum)}, \tag{4.14}$$
where the $m_a$ are positive integers such that $m_1$ is a divisor of $m_2$ and $m_2$ is a divisor of $m_3$. Later, I will calculate examples of the latter. Now I state my interpretation of an assumption commonly made by those concerned with deformation twins.

Hypothesis: when (4.12) holds there are sublattice vectors $\bar e_a$ and $\bar f_a$ such that
$$\bar f_a = S \bar e_a = Q\,m^b_a \bar e_b, \qquad m \in GL(3, \mathbb{Z}). \tag{4.15}$$
Here, as before, $Q$ is an orthogonal transformation involved in an isometry, relating the two configurations involved in the twin considered. Clearly, one relies on X-ray observations to determine the possible isometries. Probably, the hypothesis can always be satisfied, given the experimental errors involved in measuring lattice vectors, by approximating real numbers by rationals. In practice, workers tend to dislike such assumptions if they require using ratios of large numbers but, as far as I can tell, it is a matter of individual judgment as to what is too large. With (4.13), one can work out the equivalent for the essential description, getting
$$S e_a = Q\,r^b_a e_b, \tag{4.16}$$
where
$$r = M^{-1} m M \tag{4.17}$$
is a unimodular matrix of rational numbers, having the special properties implied by (4.17). When (4.12) applies, $S e_a$ is not a possible set of lattice vectors on the side to which $S$ applies. For type I twins, the idea is as before: an isometry involving $\pm R_I$ should relate the two configurations involved in a twin. Using (4.4) and (4.5), we must have
$$S = R_I\Bigl(-1 + \frac{2\,\eta_2 \otimes \kappa_1}{\eta_2 \cdot \kappa_1}\Bigr) = R_I\,r^b_a\,e_b \otimes e^a. \tag{4.18}$$
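A small example shows how the matrix $r$ of (4.17) can be unimodular with rational, non-integer entries. The sketch below takes $M$ diagonal, as in (4.14), so that $r^b_a = m^b_a\,m_b/m_a$ with no sum; the particular integers and the function names are illustrative assumptions.

```python
from fractions import Fraction

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def essential_matrix(m, diag):
    """r = M^{-1} m M for M = diag(diag): r[a][b] = m[a][b] * diag[b] / diag[a]."""
    return [[Fraction(m[a][b] * diag[b], diag[a]) for b in range(3)]
            for a in range(3)]

# Illustrative sublattice: e_bar_3 = 2 e_3, with a unimodular integer m on the sublattice
diag = [1, 1, 2]
m = [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 1]]
r = essential_matrix(m, diag)
```

Here `r` keeps determinant 1 but acquires the entry 1/2, which is the kind of half-integer that signals a failure of CBR in the essential description.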
Using the fact that the $r^b_a$ are rational, it is not hard to show that $\eta_2$ and $\kappa_1$ must be rational, so we can take
$$\eta_2 = p^a e_a, \qquad \kappa_1 = k_a e^a \;\Rightarrow\; \eta_2 \cdot \kappa_1 = p^a k_a, \tag{4.19}$$
where the $p^a$ as well as $k_a$ are relatively prime integers. Now we set
$$\eta = \frac{2\eta_2}{\eta_2 \cdot \kappa_1} \;\Rightarrow\; \eta \cdot \kappa_1 = 2, \tag{4.20}$$
and, by writing
$$S = 1 + a \otimes \kappa_1, \tag{4.21}$$
we can copy (3.10) to get (4.22). Here the direction of $a$, hence $\eta_1$, as well as $\kappa_2$ need not be rational, the custom being to call them irrational. As noted above, $\eta_2$ and $\kappa_1$ must be rational. This is the way you will find type I twins which are not compound described in twinning tables and, if they are so described, it is safe to assume that workers have judged them to be type I twins which are not compound. The X-ray theory gives different values of $\eta_2$ in the situations considered, but does not disagree as to which of the four directions are irrational. One difference between this and the X-ray theory is that, here, experimental estimates of the integers $p^a$ as well as $k_a$ are available, enough to determine $S$, the former coming from observations of deformation. The analysis of type II twins is similar, replacing (4.18) by
$$S = R_{II}\Bigl(-1 + \frac{2\,\eta_1 \otimes \kappa_2}{\eta_1 \cdot \kappa_2}\Bigr) = R_{II}\,r^b_a\,e_b \otimes e^a. \tag{4.23}$$
This requires $\eta_1$ and $\kappa_2$ to be rational, say
$$\eta_1 = m^a e_a, \qquad \kappa_2 = q_a e^a \;\Rightarrow\; \eta_1 \cdot \kappa_2 = m^a q_a, \tag{4.24}$$
where these components are relatively prime integers. Here, $\eta_2$ and $\kappa_1$ need not be rational so, by the same custom, they are called irrational. Again, if the four vectors are so reported in tables, it is safe to assume that workers have judged them to be twins of this kind. Here, by setting
$$\kappa = \frac{2\kappa_2}{\eta_1 \cdot \kappa_2} \;\Rightarrow\; \eta_1 \cdot \kappa = 2, \tag{4.25}$$
and by writing
$$S = 1 + \eta_1 \otimes \hat\kappa, \tag{4.26}$$
we can copy (3.17), to get
$$\hat\kappa = \frac{(1 - R_{II})\,\kappa}{2} = \kappa - \frac{2\eta_1}{|\eta_1|^2}. \tag{4.27}$$
Again, there is the difference that the integers in (4.24) are both available as experimental estimates. Of course, $\hat\kappa$ should have the direction of $\kappa_1$, which can be estimated from X-ray observations, but we do not have the flexibility used in the X-ray theory, to try to enforce this, except for that involved in replacing matrices of integers by $r$. Frankly, I am not sure what workers would do, if the value of $\kappa_1$ obtained from (4.27) were in serious disagreement with the value obtained from X-ray observations. Neither do I know of an example where this has proved to be the case. So, the two kinds of theories treat this issue in different ways, but I see no clear evidence of any real conflict between them.
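For the compound twins, the deformation theory obviously requires $\eta_1$, $\eta_2$, $\kappa_1$ and $\kappa_2$ all to be rational directions and, again, this is how they are reported in the tables, etc. Again, by writing
$$S = 1 + \beta\,\eta_1 \otimes \kappa_1, \tag{4.28}$$
we can copy (3.28) to get
$$\beta = -\frac{\eta_1 \cdot \eta}{|\eta_1|^2} = \frac{\kappa \cdot \kappa_1}{|\kappa_1|^2}, \tag{4.29}$$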
where $\eta$ is given by (4.20), $\kappa$ by (4.25). Here, I have presented solutions of the twinning equations in unconventional ways, to make it easier to relate the shear deformations to the shears associated with the X-ray theory.

Now, a word of warning about twinning tables. Often, the indices listed refer to lattice vectors associated with non-essential descriptions. Experienced workers know this and ways to compensate for it, for example by allowing half-integers in (3.3) for centered 1-lattices described as 2-lattices, when CBR applies. However, my experience is that there are various disadvantages to the practice. For example, in my study of {130} twins in α-uranium [9], it was only after replacing the commonly used non-essential description by an essential one that I realized that these could be included in a Pitteri neighborhood. When this can be done, it simplifies analysis of patterns of twins.

Before, I mentioned the two conceptually different interpretations of $\eta_1$, in discussing type II and compound twins. I do not know of a compelling reason to believe that the two directions always coincide. So, I note some easy implications of their being different, for type II twins. Suppose that we do have that isometry, involving $R_{II}$, with $\eta_1$ as the direction of its axis, and that a direction of shear deformation is $\tilde\eta_1 \neq \eta_1$. Let $S_1$ denote the shear deformation and suppose that Hypothesis (4.15) applies so that, from (4.23), there is some $r$ such that
$$S_1 e_a = R_{II}\,r^b_a e_b, \qquad S_1 = 1 + a_1 \otimes n, \qquad a_1 \cdot n = 0. \tag{4.30}$$
Now set
$$S_2 = R_{II} S_1^{-1} R_{II} = 1 + a_2 \otimes n. \tag{4.31}$$
A routine calculation then gives
$$S_2 e_a = R_{II}\,(r^{-1})^b_a e_b, \qquad a_2 = R_{II} a_1 \;\Rightarrow\; a_2 \cdot n = 0, \tag{4.32}$$
using the fact that $n$ is perpendicular to $\eta_1$. With the directions $\tilde\eta_1$ of $a_1$ and $\eta_1$ different, $a_2 \neq a_1$. One possibility is
$$a_2 = -a_1 \;\Rightarrow\; a_2 \cdot \eta_1 = 0 \text{ and } S_2 = S_1^{-1}, \tag{4.33}$$
the directions of $a_1$ and $a_2$ then being essentially the same. Otherwise
$$a_2 = R_{II} a_1 \neq \pm a_1, \tag{4.34}$$
and the directions of $a_1$ and $a_2$ are definitely different. As I interpret (4.32), it implies that $S_2$ is also a possible value of the shear deformation. It is easy to verify that, when (4.34) applies,
$$S_3 = S_1 S_2 = S_2 S_1 \tag{4.35}$$
is also a simple shear, with $\eta_1$ as the shear direction, so $\tilde\eta_1 = \eta_1$ could be attained in this way. There is a theoretical example involving (4.34). PITTERI [15] presented it, and it involves a 180° rotation about a rational direction, so $\eta_1$ is rational, with $m$ of infinite order. However, his $a$ is neither parallel nor antiparallel to $\eta_1$, so one can use (4.32) to get another value, with a different direction. Later, ZANZOTTO [16] showed that, by superposing a lattice invariant shear, one could convert his solution to one with $m^2 = 1$, and this changes $a$ so it becomes parallel to $\eta_1$. So, as he notes, this is essentially a type II twin, in disguise. Of course, this does not preclude the possibility that a deformation might have $a$ at the value given by Pitteri, or the indicated alternative. Techniques developed by ADELEKE [1] make it more routine to analyze such ambiguities in descriptions of twins which conform to (3.3), as the indicated example does, the case at hand being covered by his subcase 3.3.4.3. This indicates that such examples are, in a similar way, type II twins in disguise. This does not cover the cases where the $r$ in (4.30) fails to reduce to a matrix of integers. I do not fully understand the theoretical implications of this non-uniqueness of $\eta_1$, so do not think it worthwhile to say more about this hypothetical situation. As to (4.33), I have found no evidence that any expert has considered such a possibility. I have not made a serious effort myself to construct an example which is physically plausible, so I leave it at that.
5. Compound twins

To relate the shears involved in the two kinds of theories, I think it best to begin with the compound twins, for which the results are easier to interpret. Here, I assume that the vectors $\eta_1$ and $\kappa_1$ are the same in the two theories and, of course, rational, since both theories require this. For the X-ray theory, choose any of the possible solutions of the twinning equations or, equivalently, some particular vector $\eta$, denoted as before by $\eta_0$, described by (3.8) and (3.9). This gives a shear $S_0$ for this theory,
$$S_0 = 1 + \beta_0\,\eta_1 \otimes \kappa_1, \qquad \beta_0 = -\frac{\eta_1 \cdot \eta_0}{|\eta_1|^2}, \tag{5.1}$$
from (3.28). For the deformation theory, we get from (4.20) and (4.29) the analog
$$S = 1 + \beta\,\eta_1 \otimes \kappa_1, \qquad \beta = -\frac{2\,\eta_1 \cdot \eta_2}{|\eta_1|^2\;\eta_2 \cdot \kappa_1}, \tag{5.2}$$
wherein $\eta_1$ and $\eta_2$ are rational. Consider the vector
$$w = 2\eta_2 - (\eta_2 \cdot \kappa_1)\,\eta_0 \tag{5.3}$$
which, from (3.26) and (4.2), lies in the plane of $\eta_1$ and $\kappa_1$. Also, from (3.9) and (4.19), it has integer components in the basis $e_a$ and is perpendicular to $\kappa_1$, so it has the direction of $\pm\eta_1$. Since the components of $\eta_1$ are relatively prime integers, there is then some integer $n$ such that
$$w = -n\,\eta_1. \tag{5.4}$$
With this, (3.28) and (5.2), we get
$$\beta = \beta_0 + \frac{n}{\eta_2 \cdot \kappa_1}. \tag{5.5}$$
With this, we have
$$S = S_0 \bar S = \bar S S_0, \tag{5.6}$$
where $\bar S$ is the shear given by
$$\bar S = 1 + \frac{n}{\eta_2 \cdot \kappa_1}\,\eta_1 \otimes \kappa_1. \tag{5.7}$$
If $\eta_2 \cdot \kappa_1$ is a divisor of $n$, this is a lattice invariant shear and, by replacing $\eta_0$ by another possible value of $\eta$, we could get $S = S_0$, implying that CBR applies. For other cases, we need to introduce the relevant sublattices to interpret $\bar S$. This involves using (4.13) and, as we noted before, it is always possible to transform the lattice vectors so that (4.14) holds. To accomplish this, again use the fact that $\eta_1$ is rational, to get lattice vectors $e_a$ such that
$$\eta_1 = e_1. \tag{5.8}$$
This still leaves us free to transform $e_2$ and $e_3$, leaving $e_1$ fixed. Since $\kappa_1$ is rational and perpendicular to $\eta_1$, one can construct an acceptable transformation of this kind, to attain
$$\kappa_1 = e^3. \tag{5.9}$$
Now, having done this, set
$$\alpha = \begin{cases} \eta_2 \cdot \kappa_1 & \text{if this is an odd number}, \\ \eta_2 \cdot \kappa_1 / 2 & \text{if it is an even number}, \end{cases} \tag{5.10}$$
and take as sublattice vectors
$$\bar e_1 = e_1, \qquad \bar e_2 = e_2, \qquad \bar e_3 = \alpha\,e_3. \tag{5.11}$$
Note that
$$\bar e^1 = e^1, \qquad \bar e^2 = e^2, \qquad \bar e^3 = e^3/\alpha. \tag{5.12}$$
Also
$$\eta_2 = l^a e_a, \qquad l^3 = \eta_2 \cdot \kappa_1. \tag{5.13}$$
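Then,
$$S = R_I\Bigl(-1 + \frac{2\,\eta_2 \otimes \kappa_1}{\eta_2 \cdot \kappa_1}\Bigr) = R_I\Bigl(-1 + 2\,e_3 \otimes e^3 + \frac{2\,(l^1 e_1 + l^2 e_2) \otimes e^3}{\eta_2 \cdot \kappa_1}\Bigr) = R_I\Bigl(-1 + 2\,\bar e_3 \otimes \bar e^3 + \frac{2\alpha\,(l^1 \bar e_1 + l^2 \bar e_2) \otimes \bar e^3}{\eta_2 \cdot \kappa_1}\Bigr). \tag{5.14}$$
A routine calculation then gives
$$S \bar e_a = R_I\,m^b_a \bar e_b, \tag{5.15}$$
with
$$m = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ L^1 & L^2 & 1 \end{pmatrix}, \tag{5.16}$$
where
$$L^i = \frac{2\alpha\,l^i}{\eta_2 \cdot \kappa_1} = \begin{cases} 2 l^i & \text{if } \eta_2 \cdot \kappa_1 \text{ is odd}, \\ l^i & \text{if it is even}. \end{cases} \tag{5.17}$$
So, this verifies that (5.11) does give a possible sublattice. Also
$$\bar S = 1 + \frac{n}{\eta_2 \cdot \kappa_1}\,e_1 \otimes e^3 = 1 + \frac{n\alpha}{\eta_2 \cdot \kappa_1}\,\bar e_1 \otimes \bar e^3 = \begin{cases} 1 + n\,\bar e_1 \otimes \bar e^3 & \text{if } \eta_2 \cdot \kappa_1 \text{ is odd}, \\ 1 + (n/2)\,\bar e_1 \otimes \bar e^3 & \text{if } \eta_2 \cdot \kappa_1 \text{ is even}. \end{cases} \tag{5.18}$$
However, in the latter case, it follows from (5.3) and (5.4) that $n$ is even, so $n/2$ is an integer. This establishes that
$$\bar S \text{ is a lattice invariant shear for the sublattice.} \tag{5.19}$$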
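The bookkeeping in (5.3) to (5.18) can be followed on a concrete set of integers. The sketch below works in the basis where $\eta_1 = e_1$ and $\kappa_1 = e^3$, takes $w = -n\,\eta_1$ as the sign convention, and assumes $\eta_2 \cdot \kappa_1 > 0$; the function name and the sample components are illustrative assumptions, not values from the text.

```python
from fractions import Fraction

def compound_sublattice(eta2, eta0):
    """Given components l^a of eta_2 and of a chosen eta_0 (with eta_0 . kappa_1 = 2),
    return n, alpha, the matrix m of (5.16), and the coefficients of the
    extra shear relative to the lattice and to the sublattice."""
    l1, l2, l3 = eta2                 # l3 = eta_2 . kappa_1, assumed positive
    if eta0[2] != 2:
        raise ValueError("the sketch needs eta_0 . kappa_1 = 2")
    w = [2 * a - l3 * b for a, b in zip(eta2, eta0)]   # (5.3)
    if w[1] != 0 or w[2] != 0:
        raise ValueError("w is not parallel to eta_1 = e_1")
    n = -w[0]                                          # w = -n eta_1
    alpha = l3 if l3 % 2 else l3 // 2                  # (5.10)
    m = [[-1, 0, 0],
         [0, -1, 0],
         [2 * alpha * l1 // l3, 2 * alpha * l2 // l3, 1]]   # (5.16), (5.17)
    coeff_lattice = Fraction(n, l3)           # coefficient of e_1 (x) e^3
    coeff_sublattice = Fraction(n * alpha, l3)  # coefficient of e_bar_1 (x) e_bar^3
    return n, alpha, m, coeff_lattice, coeff_sublattice

# Illustrative case with eta_2 . kappa_1 = 4 (even)
n, alpha, m, coeff_lattice, coeff_sublattice = compound_sublattice(
    eta2=[1, 0, 4], eta0=[1, 0, 2])
```

In this example the extra shear has a half-integer coefficient relative to the original lattice, so CBR fails, but an integer coefficient relative to the sublattice with $\bar e_3 = 2 e_3$.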
Now, the shear associated with the X-ray theory puts the lattice vectors and shifts at values they should have, to account for isometries relating the two configurations, but this theory does not really fix the region in which the atoms reside. By going to a sublattice, we are effectively ignoring what happens to a large fraction of the atoms. The usual view is that they must somehow get shuffled around, to be compatible with the isometries dictated by X-ray observations. Restating (5.19), we have
$$\bar S \bar e_a = \bar m^b_a \bar e_b, \qquad \bar m \in GL(3, \mathbb{Z}), \tag{5.20}$$
or
$$\bar S\,(\bar m^{-1})^b_a\,\bar e_b = \bar e_a, \tag{5.21}$$
where it is a simple matter to use (5.7) to calculate $\bar m$. When we use (5.21), the $\bar e_a$ get restored to their original values, which is consistent with the

Hypothesis: The shear denoted above by $\bar S$ results in no change in the crystallography: the values of $e_a$ and $p_i$ occurring after it is applied can be taken to be the same as those used before it was applied. (5.22)

So, this is my way of explaining how the issues can be resolved. Of course, $\bar S$ does affect the macroscopic deformation. In fact this is a quantitative measure, useful for this purpose, describing a rather complicated process, involving atomic shuffling. Certainly, the changes in positions of the atoms are too complex to be described by a linear transformation. While I have not seen it explained in this way before, I believe that it is consistent with what is done by experts, in practice. How they deal with shufflings is covered in some detail in the review of work on deformation twinning by CHRISTIAN & MAHAJAN [4]. One way of proceeding is to first consider how the atoms would be arranged, if the linear transformation $S$ were applied to the positions of all atoms on the sheared side. Then, one finds some relatively simple way of changing some of the positions, to get them to agree with the positions dictated by the relevant isometries.

Suppose that we start with an essential description of an $n$-lattice, and introduce a non-essential description of this as an $n'$-lattice, $n' > n$, as we have done. From the general theory of non-essential descriptions, which I [7] have discussed in some detail, $n$ is a divisor of $n'$, and the effect is to replace each of the original lattices by $n'/n$ sublattices. As above, consider a plane with normal $e^3$ ($= \kappa_1$) which contains an atom in one of the original lattices. By translating this with integral combinations of $e_1$ and $e_2$, we get a net in this plane. The non-essential description also covers all atoms so, by similar reasoning, the same net occurs in one of the sublattices.
In the simplest view, the shear deformation $S$ then describes how the atoms in any sublattice move, relative to an atom in this sublattice, and those in a plane with this normal behave as rigid units. This is simplistic, because one cannot know whether identical atoms might have exchanged positions, producing an indistinguishable outcome. However, I see no danger in assuming the simplest picture. Similarly, because of the requirement that $S$ produce an arrangement of atoms isometric to the original, one can assume that all the planes move in the same direction, by different amounts. To see this, consider a 2-lattice, with $\eta_2 \cdot \kappa_1 = 4$, so $\bar e_3 = 2 e_3$. This describes the 2-lattice as a 4-lattice. If $p$ is the original shift vector, the three shifts for the 4-lattice description can be taken as
$$\bar e_3/2 = e_3, \qquad p, \qquad p + \bar e_3/2. \tag{5.23}$$
Now transform these, using $R_I$. The requirements for isometries imply that $R_I e_3$ must be a lattice vector for the sheared crystal and $(R_I e_3) \cdot e^3 = e_3 \cdot e^3 = 1$. From this, one can infer that, for the two sublattices representing the original lattice containing the origin, the distance between those planes stays fixed, whether they are in the same or different sublattices. Also, the changes in shifts associated with isometries are consistent with the planes all being translated in the direction $\eta_1 = e_1$. In a similar way, one can deal with the other two shifts. So, I picture the collection
of planes as like a deck of cards restricted to slide in one direction. Of course, the amounts they can translate are not arbitrary, because the atoms must come to positions respecting isometries involved. Later, I will go through the relevant calculations for a twinning mode observed in some hexagonal close-packed crystals, all of which are pure metals. As should soon become clear, the same argument can be used for the type I twin, up to a point. For these, $\kappa_1$ is again rational, producing the common nets. There is the difference in that, for these, $\eta_1$ can be irrational. With type II twins, we generally do not have these nets, since $\kappa_1$ is generally irrational. So, the reasoning for these needs to be changed, a matter taken up in the next Section.
6. Other types of twins

First, consider the type I twins that are not compound. Here, I assume that the two theories employ the same $\kappa_1$, agreeing that it is rational, this being the only one of the four directions determined by X-ray observations. For the deformation theory, $\eta_2$ is rational, with $\eta_1$ and $\kappa_2$ considered to be irrational. In addition, I assume that the hypothesis (5.22) also applies to shears labeled $\bar S$. Again, choose any (rational) value $\eta_0$ of $\eta$ for the X-ray theory, giving the shear
$$S_0 = 1 + a_0 \otimes \kappa_1 = R_I(-1 + \eta_0 \otimes \kappa_1), \qquad \eta_0 \cdot \kappa_1 = 2, \tag{6.1}$$
and the analog for the deformation theory
$$S = 1 + a \otimes \kappa_1 = R_I\Bigl(-1 + \frac{2\,\eta_2 \otimes \kappa_1}{\eta_2 \cdot \kappa_1}\Bigr). \tag{6.2}$$
Here again, the vector
$$w = 2\eta_2 - (\eta_2 \cdot \kappa_1)\,\eta_0 \tag{6.3}$$
has integer components relative to the basis $e_a$ and is perpendicular to $\kappa_1$. However, $\eta_0$ need not lie in the plane of $\eta_2$ and $\kappa_1$. So, take a vector $\hat w$ parallel to $w$ with components that are relatively prime integers. We then have
$$2\eta_2 = (\eta_2 \cdot \kappa_1)\,\eta_0 - n\,\hat w, \tag{6.4}$$
for some integer $n$. Putting this into (6.2), and doing routine calculations, one gets
$$S = S_0 \bar S = \bar S S_0, \tag{6.5}$$
where
$$\bar S = 1 + \frac{n\,\hat w \otimes \kappa_1}{\eta_2 \cdot \kappa_1}, \tag{6.6}$$
much like (5.7) except that $\hat w$ is no longer parallel to $\eta_1$, in general, being some rational vector perpendicular to $\kappa_1$ which depends on the choice of $\eta_0$. Again, if
$\eta_2 \cdot \kappa_1$ is a divisor of $n$, we can use another choice of $\eta_0$ to get $S = S_0$, so CBR then applies. In the other cases, we can again choose the lattice vectors $e_a$ so that
$$\kappa_1 = e^3 \tag{6.7}$$
and, for any such choice, use (5.11) and (5.12) to introduce sublattice vectors $\bar e_a$. This gives
$$S \bar e_a = R_I\,m^b_a \bar e_b, \tag{6.8}$$
where $m$ is the unimodular matrix of integers described by (5.16). Again, $\bar S$ is a lattice invariant shear for the sublattice. If you look in detail at the vector $n\hat w$ in (6.6), you see that, usually, the directions of shear for $S_0$ and $\bar S$ are different, no matter what choice you make for $\eta_0$, the direction for $S$ being a combination of these. Basically, this is what makes these twins different from the compound twins. For given $S$, there are infinitely many choices of $S_0$ and $\bar S$, owing to the fact that lattice-invariant shears for the essential description can be included in either. However you do this, $\bar S$ describes a part of $S$ that cannot be detected in the usual X-ray observations. Again, there are the planes, having $\kappa_1$ as normal, in which atoms belong both to a lattice and a sublattice. They do move as rigid units, again like a deck of cards. There are other problems that could be considered, for example what choice(s) of $\eta_0$ give $S_0$ best approximating $S$, but I will not pursue them.

For type II twins which are not compound, a rather similar analysis is feasible. I assume that the two theories use the same value of $\eta_1$, which is rational. They agree that $\kappa_1$ is, by the usual standards, irrational. For the X-ray theory, we had an infinite number of solutions of the twinning equation, even if we matched data on $\kappa_1$, within experimental error. So, we can choose any value $\kappa_0$ of $\kappa$. It will be rational and have integer components relative to the basis $e^a$, which are relatively prime. Now, by arguments very similar to those used above, we get
$$2\kappa_2 - (\eta_1 \cdot \kappa_2)\,\kappa_0 = n\,\hat w, \tag{6.9}$$
where $\hat w$ is a vector with relatively prime integer components relative to the basis $e^a$, satisfying
$$\hat w \cdot \eta_1 = 0. \tag{6.10}$$
Then
$$S = R_{II}\Bigl(-1 + \frac{2\,\eta_1 \otimes \kappa_2}{\eta_1 \cdot \kappa_2}\Bigr) = R_{II}\Bigl(-1 + \eta_1 \otimes \kappa_0 + \frac{n\,\eta_1 \otimes \hat w}{\eta_1 \cdot \kappa_2}\Bigr) = S_0 \bar S = \bar S S_0, \tag{6.11}$$
where $S_0$ is the shear corresponding to $\kappa_0$, according to the X-ray theory, and
$$\bar S = 1 + \frac{n\,\eta_1 \otimes \hat w}{\eta_1 \cdot \kappa_2}. \tag{6.12}$$
Now, writing the shears in the forms (3.15) and (4.26), we have
$$\bar S = S S_0^{-1} = 1 + \eta_1\otimes(K - K_0), \tag{6.13}$$
so
$$K - K_0 = \frac{n\,w}{\eta_1\cdot K_2}. \tag{6.14}$$
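Generally, $K$ and $K_0$ will not be rational directions, but the direction $w$ of this difference is. Also, if $\eta_1\cdot K_2$ is a divisor of $n$, there is a value of $K$ such that $S = S_0$, so CBR applies. To find a sublattice for the other cases, one can introduce lattice vectors such that
$$\eta_1 = e_1 \tag{6.15}$$
and
$$K_2 = (\eta_1\cdot K_2)\,e^1 + k\,e^3, \tag{6.16}$$
these components being relatively prime integers. Then set
$$\bar e_1 = e_1,\qquad \bar e_2 = e_2,\qquad \bar e_3 = a\,e_3, \tag{6.17}$$
where
$$a = \eta_1\cdot K_2 \ \text{ if } \eta_1\cdot K_2 \text{ is odd},\qquad a = \eta_1\cdot K_2/2 \ \text{ if } \eta_1\cdot K_2 \text{ is even}. \tag{6.18}$$
A calculation then gives
$$S\,\bar e_a = R_\eta\,m_a^b\,\bar e_b, \tag{6.19}$$
where
$$m = \begin{pmatrix} 1 & 0 & 0\\ 0 & -1 & 0\\ b & 0 & -1 \end{pmatrix}, \tag{6.20}$$
with
$$b = 2k \ \text{ if } \eta_1\cdot K_2 \text{ is odd},\qquad b = k \ \text{ if } \eta_1\cdot K_2 \text{ is even}. \tag{6.21}$$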
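The matrix in (6.20) can be checked by direct arithmetic. The following is a minimal sketch, assuming that the lattice part of the type II shear is the map $-1 + 2\,\eta_1\otimes K_2/(\eta_1\cdot K_2)$, with vectors given by components on $e_a$ and covectors by components on $e^a$. The function name and the sample integers are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def type2_sublattice_matrix(eta_dot_K2, k):
    """Return (a, b, m) for the choices (6.15)-(6.18).

    Vectors are stored by components on e_a, covectors by components on e^a,
    so a vector-covector product is an ordinary component dot product.
    """
    eta1 = (1, 0, 0)                      # eta_1 = e_1
    K2 = (eta_dot_K2, 0, k)               # K_2 = (eta_1.K_2) e^1 + k e^3
    a = eta_dot_K2 if eta_dot_K2 % 2 else eta_dot_K2 // 2
    sub = [(1, 0, 0), (0, 1, 0), (0, 0, a)]   # sublattice vectors on e_a
    rows = []
    for v in sub:
        # apply -1 + 2 eta_1 (x) K_2 / (eta_1 . K_2) to v
        coeff = Fraction(2 * sum(K * x for K, x in zip(K2, v)), eta_dot_K2)
        image = [-x + coeff * e for x, e in zip(v, eta1)]
        # re-express the image on the sublattice basis (third vector is a e_3)
        rows.append([image[0], image[1], image[2] / a])
    return a, rows[2][0], rows

a_odd, b_odd, m_odd = type2_sublattice_matrix(3, 5)     # eta_1.K_2 odd
a_even, b_even, m_even = type2_sublattice_matrix(4, 3)  # eta_1.K_2 even
```

In both sample cases every entry of the resulting matrix is an integer, which is what makes the map lattice-invariant for the sublattice but not, in general, for the original lattice.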
Again, one can interpret $\bar S$ as a lattice-invariant shear, for the sublattice. Note that, since $\eta_1\cdot K_0 = 2$,
$$K_0 = 2e^1 + p\,e^2 + q\,e^3, \tag{6.22}$$
for some integers $p$ and $q$, which we can choose. Using (6.9), (6.16) and (6.22) to calculate $w$, we get
$$n\,w = 2k\,e^3 - (\eta_1\cdot K_2)(p\,e^2 + q\,e^3) = 2ak\,\bar e^3 - (\eta_1\cdot K_2)(p\,e^2 + q\,e^3). \tag{6.23}$$
Using (6.18), $n\,w/(\eta_1\cdot K_2)$ is a vector with integer components relative to the basis $\bar e^a$. This is enough to demonstrate that $\bar S$, given by (6.12), represents a lattice-invariant shear for the sublattice. Note that the part involving $p$ and $q$ represents a lattice-invariant shear for the original lattice, an effect of using different values of $K_0$. So, using (6.12), (6.18) and (6.21), we find that the important part gives planes with normal $e^3$, a shear described by
$$\bar S = 1 + b\,\bar e_1\otimes\bar e^3, \tag{6.24}$$
the choice for $p = q = 0$. Here again, these are planes in which the atoms belong both to a lattice and a sublattice, those having $e^3$ as normal. However, this is different from the direction of $K_1$ associated with $S$. Thus, one expects such planes not to behave as rigid units, but for atoms in such a plane to be translated by different amounts, in the direction $\eta_1$.

Granted that $K_1$ is generally irrational, the composition plane contains only a finite number of rows of atoms, at best. From this, it seems likely that, microscopically, this "plane" has a somewhat complicated structure. One idea, discussed by CHRISTIAN & MAHAJAN [4], is that it consists of small parts of rational planes, joined together, to give a microstructure too fine to be observable, using X-rays. I do not think that it is fruitful to try to discuss such speculations here, partly because the macroscopic theories considered might well be too crude to deal with such questions. From the above discussion, we do have a particular set of rational planes, playing a rather important role, for whatever that is worth.

For the type I and type II twins, one can again apply the reasoning following (5.21) to argue that these versions of $\bar S$ do not really change the lattice vectors or shifts either. In comparing the X-ray theory with thermoelasticity theory, I [8] introduced an analog of the reference configurations used in the latter theory, for the X-ray theory. The shears denoted by $S_0$ do not cause this to change, implying that deformations do not change it, when CBR applies. However, using information given there, it is easy to see that, when it fails to apply, the shears denoted by $\bar S$ do cause it to change, making it less useful to introduce reference configurations. I believe that this is related to the fact, reasoned by ZANZOTTO [17], that elasticity theory generally fails to apply, when this is the case for CBR.
The X-ray theory represents my attempt to express in mathematical terms various ideas about such observations that workers seem to use when thinking about twinning and phase transitions in crystals, among other things. This includes a theory of twinning, which I believe to be reliable, summarized in Section 3. However, it says nothing about other kinds of observations used in experiments on deformation twins. For these, workers use additional ideas and assumptions. In Section 4, I discussed my interpretation of some of these. I believe that it is important to determine whether the combination is internally consistent, in describing deformation twins, in order to provide another check on the X-ray theory and to illustrate what the X-ray theory can and cannot do in analyzing such twins. I think that the general
arguments demonstrate that the combination of assumptions leads to a consistent description of deformation twins. My arrangement of ideas is unconventional, but I believe that it is at least reasonably consistent with lines of thought used by others, in dealing with these matters.

In the previous discussions, I glossed an ambiguity. Given sublattice vectors $\bar e_a$ satisfying (4.13) and (4.15), one can always find larger sublattices satisfying these conditions; multiply these lattice vectors by the same integer, for example. Those I have used are, in a certain sense, optimal. To explain this, I first note that, in practice, we really start with (4.16) and (4.17), with some description of $r$. Check back through the calculations done and you find that, in all cases of interest, $r$ is of the form
$$r_a^b = -\delta_a^b + 2\,r_a s^b/c,\qquad c = r_a s^a, \tag{6.25}$$
where the $r_a$ as well as the $s^b$ are relatively prime integers. Since the $m \in GL(3,\mathbb{Z})$ in (4.17) is similar to $r$, it must be of the form
$$m_a^b = -\delta_a^b + \bar r_a\bar s^b,\qquad \bar r_a\bar s^a = 2, \tag{6.26}$$
for some integers $\bar r_a$ and $\bar s^b$. By writing (4.17) in the form
$$M r = m M, \tag{6.27}$$
it is easy to check that this is equivalent to the equations
$$M_a^b\,r_b = \lambda\,\bar r_a \tag{6.28}$$
and
$$M_b^a\,\bar s^b = 2\lambda\,s^a/c, \tag{6.29}$$
where $\lambda$ is a scalar. Now, by shifting a factor of 2 from $\bar r_a$ to $\bar s^a$, if necessary, we can assume that the $\bar r_a$ are relatively prime. Then, since the left sides of (6.28) are integers,
$$\lambda \in \mathbb{Z}. \tag{6.30}$$
All of these statements are invariant under transforming $\bar e_a$ and $e_a$, independently, using transformations in $GL(3,\mathbb{Z})$. Using these, we can arrange that
$$r_a = \bar r_a = \delta_a^3 \;\Rightarrow\; M_a^3 = \lambda\,\delta_a^3 \quad\text{and}\quad \bar s^3 = 2, \tag{6.31}$$
implying that
$$\lambda \text{ is a divisor of } \det M. \tag{6.32}$$
Also, in (6.29), the entries on the left are integers and the $s^a$ are relatively prime, whence it follows that
$$2\lambda/c \in \mathbb{Z}. \tag{6.33}$$
There are then two possibilities,
$$c \text{ is odd} \;\Rightarrow\; c \text{ is a divisor of } \lambda, \text{ hence of } \det M, \tag{6.34}$$
or
$$c \text{ is even} \;\Rightarrow\; c/2 \text{ is a divisor of } \lambda, \text{ hence of } \det M. \tag{6.35}$$
With this, it is easy to check that

The sublattices we have used give the smallest possible values of $|\det M|$, (6.36)

or, if you prefer, the unit cells of minimum volume. I note that this statement is invariant under the transformations of lattice vectors used in deducing it. Of course, one can get other possibilities of this kind, by transforming the lattice vectors we used, but most such choices will not satisfy (4.14). Another point is that configurations related by the deformations denoted by $\bar S$ are, essentially, examples of the neutrally related states introduced by DAVINI [5] in the theory of continuous distributions of defects and discussed more recently by FONSECA & PARRY [10], for example: these writers do not mention the shifts needed to describe multilattices, something that should be considered. In this context, the relation of $\bar S$ to atomic shuffling adds a bit, physically, to the perspective of neutrally related states.

7. An example

Here, we will do relevant calculations for a kind of compound twin to which CBR fails to apply. This mode, commonly denoted by $\{10\bar{1}2\}$, is observed in all pure metal crystals of the hexagonal close-packed kind. For these monatomic 2-crystals, take
$$e_1 = a\,i,\quad e_2 = \frac{a}{2}\bigl(-i + \sqrt{3}\,j\bigr),\quad e_3 = c\,k,\qquad e^1 = \frac{1}{a}\Bigl(i + \frac{1}{\sqrt{3}}\,j\Bigr),\quad e^2 = \frac{2}{\sqrt{3}\,a}\,j,\quad e^3 = \frac{1}{c}\,k, \tag{7.1}$$
with the shift
$$p = \frac{2}{3}\,e_1 + \frac{1}{3}\,e_2 + \frac{1}{2}\,e_3. \tag{7.2}$$
Here, $a$ and $c$ are positive constants, $i$, $j$ and $k$ being orthonormal vectors such that $i\cdot j\wedge k > 0$. From the twinning table presented by BARRETT & MASSALSKI [2, p. 415], for example, one gets
$$\eta_1 = -2e_1 - e_2 + e_3,\qquad \eta_2 = 2e_1 + e_2 + e_3, \tag{7.3}$$
$$K_1 = e^1 + 2e^3,\qquad K_2 = -e^1 + 2e^3, \tag{7.4}$$
from which
$$\eta_1\cdot K_2 = \eta_2\cdot K_1 = 4. \tag{7.5}$$
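These products are easy to confirm numerically. The sketch below assumes the hexagonal basis $e_1 = a\,i$, $e_2 = (a/2)(-i + \sqrt{3}\,j)$, $e_3 = c\,k$ with its dual basis, and the twinning elements $\eta_1 = -2e_1 - e_2 + e_3$, $\eta_2 = 2e_1 + e_2 + e_3$, $K_1 = e^1 + 2e^3$, $K_2 = -e^1 + 2e^3$, as read from (7.1)-(7.4). The helper names and the sample values of $a$ and $c$ are hypothetical.

```python
import math

def hcp_basis(a, c):
    """Cartesian components of e_1, e_2, e_3 and of the dual basis e^1, e^2, e^3."""
    s3 = math.sqrt(3.0)
    e = [(a, 0.0, 0.0), (-a / 2, s3 * a / 2, 0.0), (0.0, 0.0, c)]
    d = [(1 / a, 1 / (s3 * a), 0.0), (0.0, 2 / (s3 * a), 0.0), (0.0, 0.0, 1 / c)]
    return e, d

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def combine(coeffs, basis):
    """Linear combination of three basis vectors with the given coefficients."""
    return tuple(sum(k * b[i] for k, b in zip(coeffs, basis)) for i in range(3))

e, d = hcp_basis(1.0, 1.623)          # sample values, c/a as for magnesium
eta1 = combine((-2, -1, 1), e)
eta2 = combine((2, 1, 1), e)
K1 = combine((1, 0, 2), d)
K2 = combine((-1, 0, 2), d)

duality = [[dot(d[i], e[j]) for j in range(3)] for i in range(3)]  # should be the unit matrix
products = {
    "eta1.K2": dot(eta1, K2),
    "eta2.K1": dot(eta2, K1),
    "eta1.K1": dot(eta1, K1),
    "eta2.K2": dot(eta2, K2),
}
```

The last two products vanish, as they must for a shear direction lying in its own shear plane, while the first two reproduce the value in (7.5).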
For the X-ray theory, a simple calculation gives
$$\eta_1\cdot e_2 = K_1\cdot e_2 = 0, \tag{7.6}$$
so, in (3.26), we can take
$$u = e_2 \;\Rightarrow\; K\cdot e_2 = \eta\cdot e_2 = 0. \tag{7.7}$$
With this, (3.9)$_3$, (3.15)$_3$ and (3.25)$_2$, we get
$$K = n_1 e^1 + 2(1 + n_1)\,e^3,\qquad \eta = (n_1 + 1)(2e_1 + e_2) - n_1 e_3, \tag{7.8}$$
where $n_1$ is an arbitrary integer. Then, using (3.28), (7.1) and (7.4), we get the set of shears described by
$$S_0(n_1) = 1 + \Bigl(n_1 + 1 - \frac{c^2}{3a^2 + c^2}\Bigr)\,\eta_1\otimes K_1, \tag{7.9}$$
the possible choices of the $S_0$ referred to before. With (4.29), (4.20) or (4.25), (7.1) and (7.4), we can calculate the shear deformation, which gives
$$S = 1 + \Bigl(\frac{1}{2} - \frac{c^2}{3a^2 + c^2}\Bigr)\,\eta_1\otimes K_1. \tag{7.10}$$
Then, for any choice of $n_1$, we can calculate the corresponding
$$\bar S(n_1) = 1 - \Bigl(n_1 + \frac{1}{2}\Bigr)\,\eta_1\otimes K_1. \tag{7.11}$$
Minimizing this shear, to make $S_0(n_1)$ as close as possible to $S$, obviously gives the two possibilities
$$n_1 = 0 \quad\text{or}\quad n_1 = -1. \tag{7.12}$$
To compare the sizes of shear associated with $S$ and these choices of $\bar S$, we can use the ratio
$$R = \pm\frac{1/2}{\dfrac{1}{2} - \dfrac{c^2}{3a^2 + c^2}} = \pm\frac{3a^2 + c^2}{3a^2 - c^2}. \tag{7.13}$$
For realistic values of $c/a$, $|R|$ is quite large. For example, using values of $c/a$ listed by BARRETT & MASSALSKI [2, p. 415], I get, for

Magnesium, $c/a = 1.623$, $|R| = 15.39$,
Zirconium, $c/a = 1.592$, $|R| = 11.87$, (7.14)

the latter value being more typical for these twins in metals. Certainly, the magnitudes of shears associated with $S$ and our choices of $S_0$ are quite different for the twins considered. So, CBR fails miserably in these cases. I have not done such calculations for other kinds of twins, so I do not know how representative these values are, generally. To determine a sublattice, we transform lattice vectors in the following way:
$$e_1 \to -2e_1 - e_2 + e_3,\qquad e_2 \to -e_2,\qquad e_3 \to e_1. \tag{7.15}$$
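The sizes quoted in (7.14) can be reproduced approximately with a few lines of arithmetic. This is only a sketch, assuming that the ratio in (7.13) reduces in magnitude to $(3a^2 + c^2)/|3a^2 - c^2|$, which depends on $a$ and $c$ only through $c/a$; the function name is hypothetical.

```python
def shear_ratio(c_over_a):
    """Magnitude of R, assuming |R| = (3 + (c/a)^2) / |3 - (c/a)^2|."""
    g2 = c_over_a ** 2
    return (3.0 + g2) / abs(3.0 - g2)

ratio_magnesium = shear_ratio(1.623)
ratio_zirconium = shear_ratio(1.592)
```

Under this assumption both ratios come out well above 10, close to the magnitudes listed in (7.14). The ratio grows without bound as $c/a$ approaches $\sqrt{3}$, where the macroscopic twinning shear itself vanishes.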
This gives
$$\eta_1 = e_1,\qquad K_1 = e^3, \tag{7.16}$$
agreeing with (5.8) and (5.9). Then, as indicated by (5.10), (5.11) and (7.5), sublattice vectors can be taken as
$$\bar e_1 = e_1,\qquad \bar e_2 = e_2,\qquad \bar e_3 = 2e_3, \tag{7.17}$$
describing the 2-lattices as 4-lattices. The three shifts for the latter can be read off from (5.23) and (7.2). Perhaps this is enough to illustrate that making such calculations is now a matter of routine. I notice that for Zirconium, ZANZOTTO [17] lists three other twinning modes. For one, CBR applies. For the other two, it does not, since $\eta_2\cdot K_1$ has the values 6 and 17. In turn, this means that the sublattices for these are quite different from each other and from those given above. Similar situations are observed in other crystals. I am very sympathetic with the efforts of those working on deformation twins, trying to find some rationale to explain such matters, but have no helpful ideas concerning this.

Acknowledgements. I thank MARION ERICKSEN for her help with typing chores.
References

1. ADELEKE, S. (1999), On matrix equations of twinning in crystals. To appear in Mathematics and Mechanics of Solids.
2. BARRETT, C. S. & MASSALSKI, T. B. (1966), Structure of Metals, 3rd ed. McGraw-Hill, New York.
3. CAHN, R. W. (1953), Plastic deformation of alpha-uranium; twinning and slip, Acta Metallurgica 1, 49-67.
4. CHRISTIAN, J. W. & MAHAJAN, S. (1995), Deformation twinning, Progress in Materials Science 39, 1-157.
5. DAVINI, C. (1986), A proposal for a continuum theory of defective crystals, Archive for Rational Mechanics and Analysis 96, 295-317.
6. ERICKSEN, J. L. (1997), Equilibrium theory for X-ray observations, Archive for Rational Mechanics and Analysis 139, 181-200.
7. ERICKSEN, J. L. (1998), On non-essential descriptions of crystal multi-lattices, Journal of Mathematics and Mechanics of Solids 3, 363-392.
8. ERICKSEN, J. L. (1999), Notes on the X-ray theory, Journal of Elasticity 55, 201-218.
9. ERICKSEN, J. L. (1999), Twinning analyses in the X-ray theory. To appear in International Journal of Solids and Structures.
10. FONSECA, I. & PARRY, G. (1992), Equilibrium configurations of defective crystals, Archive for Rational Mechanics and Analysis 120, 245-283.
11. JIAN, L. & JAMES, R. D. (1997), Prediction of microstructure in monoclinic LaNbO4 by minimization, Acta Materialia 45, 4271-4281.
12. KELLY, A. & GROVES, G. W. (1970), Crystallography and crystal defects, Addison-Wesley, Reading.
13. PITTERI, M. (1985), On (ν + 1)-lattices, Journal of Elasticity 15, 3-25.
14. PITTERI, M. (1985), On the kinematics of mechanical twinning in crystals, Archive for Rational Mechanics and Analysis 88, 25-57.
15. PITTERI, M. (1986), On type-2 twins in crystals, International Journal of Plasticity 2, 99-106.
16. ZANZOTTO, G. (1988), Twinning in minerals and metals, with some experimental results, Nota II. Mechanical twinning and growth twinning, Atti Accademia Nazionale dei Lincei, Classe fisiche, matematiche e naturali 82, 743-756.
17. ZANZOTTO, G. (1992), On the material group of elastic crystals and the Born rule, Archive for Rational Mechanics and Analysis 121, 1-36.

5378 Buckskin Bob Rd. Florence Oregon
(Accepted January 15, 2000) Published online July 12, 2000 - © Springer-Verlag (2000)
International Journal of Solids and Structures 38 (2001) 967-995
www.elsevier.com/locate/ijsolstr
Twinning analyses in the X-ray theory
J.L. Ericksen *
5378 Buckskin Bob Road, Florence, OR 97439, USA
Received 20 April 1999
Abstract Commonly, expositions of twinning theory combine at least two different kinds of observations: measurements of macroscopic deformation and those made using X-rays. I believe that there is some merit in considering the latter separately because, often, the two kinds do not mesh very well. Here, my aim is to elaborate this and to improve the twinning theory based on a theory of X-ray observations to be described. © 2001 Elsevier Science Ltd. All rights reserved. Keywords: Continuum theory of crystal defects; Twinning theory; Crystal multilattices
1. Introduction
The research of Zanzotto (1992) made clear that, for many crystals, the commonly used Cauchy-Born rule for relating changes in lattice vectors to macroscopic deformation is not consistent with observations of the changes associated with twinning and phase transitions. Briefly, this is the assumption that the macroscopic deformation gradient, applied as a linear transformation to lattice vectors before deformation, gives a possible set of lattice vectors in the deformed crystal. By arguments that I find convincing, he concludes that, when it is not, elasticity theory is inadequate to deal with these phenomena. He does note that the assumption does seem to be reliable for some kinds of crystals, the Bravais lattices and shape-memory alloys, for which elasticity theory has been used successfully to describe such phenomena. For other kinds of crystals, the assumption sometimes applies, but often fails, and there seems to be no clear pattern in this. When it fails, we have no reliable theory for relating deformation to the lattice vectors and shifts describing the crystal structures, commonly observed using X-ray methods. I thought it desirable to construct some type of theory to describe at least some such observations. To this end, I proposed (Ericksen, 1997) a continuum theory of crystal multilattices (footnote 1) dealing only with the X-ray observations, hereafter called X-ray theory. Certainly, it is important to deal with deformations, but I do not know how to do so. Experts do find this very difficult. In my presentation of this theory, I included some brief comments about related twinning theory, introducing an idea not used before in the literature, as far as
* Fax: +1-541-997-6399.
1 The book by Pitteri and Zanzotto (2000) is a very good reference for relevant information on these and the Bravais lattices.
0020-7683/01/$ - see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0020-7683(00)00069-X
I know. Here, my purpose is to discuss this kind of theory in some detail, including an illustrative example, an analysis of one of the five twinning modes observed in orthorhombic α-uranium, which involves some subtleties. Also, a partial analysis of another twin in this material is presented to illustrate other points. Generally, my aim is to link better such calculations to the theory of constitutive equations. Twinning analyses in the X-ray theory are somewhat different from what you are likely to find in expositions of twinning theory, so I will elaborate this. Another of my aims is to adapt to the X-ray theory, the kinds of analyses, based on elasticity theory, that have been used to describe some microstructures involving twinning. Conceptually, these have relied on notions of reference configurations and deformations, concepts which are not involved in the X-ray theory. Also, to relate calculations to X-ray observations, the Cauchy-Born rule is used, and the X-ray theory can be used for cases where it fails or is not relevant, as is the case for some studies of growth twins. Mathematically, such use of elasticity theory involves considering minimizing sequences for energy functionals that do not converge to minimizers, but have useful limits which can be described, using the theory of the probability distributions known as the Young measures. The literature on applications of this to elasticity theory has grown rather large. For readers not familiar with it, I suggest starting with the paper by Ball and James (1992), which covers the basic mathematical theory, some applications and a number of relevant references. Actually, it might be useful to reconsider how the mathematical theory applies to functions defined on bounded domains, to provide a better basis for most calculations of this kind. Some newer ideas and references are covered by Bhattacharya et al. (1994).
I will introduce an alternative to the formulation used for elasticity theory which generalizes more easily to the X-ray theory, then briefly indicate how to do the generalization. For this part to be comprehensible, one needs to have some understanding of the basic mathematical theory, and I will not rehash this. Particularly, in Section 6, I have made a serious effort to understand and explain common practices that I have found confusing and seem likely to be confusing to other theorists. In this, I could not avoid making some guesses, so bear this in mind, in assessing my opinions about this.
2. X-ray theory

Here, I will give a brief summary of my X-ray theory (Ericksen, 1997), restricting the discussion of this to monatomic crystals for simplicity. The configurations are described as n lattices, where n is any positive integer: constitutive equations treat n as fixed, the configurations as variable. An n lattice consists of n identical lattices, translated in different ways relative to each other. The usual idea is that a lattice describes a set of "positions" of identical atoms, representable in the form
$$n^a e_a + \text{const.},\qquad n^a \in \mathbb{Z}, \tag{2.1}$$
where the (three) vectors $e_a$, the lattice vectors, are a set of linearly independent vectors. The quotation marks indicate that, physically, "position" really means a point describing some averaged location of an atom. For various purposes, it is also important to introduce the reciprocal lattice vectors (dual basis) $e^a$, satisfying
$$e^a\otimes e_a = e_a\otimes e^a = 1,\qquad e^a\cdot e_b = \delta^a_b. \tag{2.2}$$
For an n lattice, one also needs to describe how the different lattices are translated relative to some point in space. With the usual ideas of invariance under translations associated with Galilean invariance, there are some advantages in taking the point to be some position in one of the lattices, using what Pitteri (1985a) has called shifts, a set of vectors denoted by
$$p_i,\qquad i = 1,\dots,\nu = n - 1. \tag{2.3}$$
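To make the description concrete, the following minimal sketch builds the dual basis for a given set of lattice vectors and lists some points of a 2 lattice, taking the origin at an atom of the first lattice. Exact rational arithmetic is used so that the duality relations can be checked without rounding. The sample lattice vectors, the shift and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dual_basis(e):
    """Reciprocal lattice vectors e^a with e^a . e_b = delta^a_b."""
    e1, e2, e3 = e
    inv_vol = 1 / Fraction(dot(e1, cross(e2, e3)))
    return [tuple(inv_vol * x for x in cross(e2, e3)),
            tuple(inv_vol * x for x in cross(e3, e1)),
            tuple(inv_vol * x for x in cross(e1, e2))]

def n_lattice_points(e, shifts, reach):
    """Points n^a e_a and n^a e_a + p_i for integers n^a with |n^a| <= reach."""
    offsets = [(0, 0, 0)] + list(shifts)
    points = []
    for n in product(range(-reach, reach + 1), repeat=3):
        base = tuple(sum(n[a] * e[a][i] for a in range(3)) for i in range(3))
        for p in offsets:
            points.append(tuple(b + s for b, s in zip(base, p)))
    return points

lattice_vectors = [(2, 0, 0), (1, 3, 0), (0, 1, 1)]          # sample e_a
shift = (Fraction(1, 2), Fraction(1, 2), Fraction(1, 3))     # sample p_1, so n = 2
reciprocal = dual_basis(lattice_vectors)
points = n_lattice_points(lattice_vectors, [shift], reach=1)
```

Adding any integer combination of the lattice vectors to the shift gives another valid shift for the same configuration, which is one reason there are infinitely many ways of choosing these vectors.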
For any configuration, there are infinitely many ways of choosing these vectors. Commonly, estimates of these are obtained using X-rays, probed over the width of an X-ray beam, wide enough to include a very large number of atoms. For this reason, observations made using electron microscopes reveal more about atomic arrangements near defects which might even be invisible in X-ray observations, for example. This motivated me to consider a continuum theory, treating the indicated vectors as vector fields, functions of position in space. This is what I call the X-ray theory, which is an equilibrium theory. My interest is in providing some theory for certain phenomena outside the range of validity of elasticity theory. Included in this are the discontinuities associated with twinning in some crystals, such as are documented by Zanzotto (1992), changes associated with phase transitions in some crystals, and growth twins. For such phenomena, I believe that it is reasonable to accept some assumptions I made in most cases. There are exceptions for some growth twins, noted in Section 6. I excluded continuous distributions of dislocations, leading to the conclusion that one has an analog of the inverse of the deformation gradient in elasticity theory. That is, there are scalar functions $\chi^a$ such that
$$e^a = \nabla\chi^a. \tag{2.4}$$
This leaves open the possibility of analyzing isolated dislocations, similar to the way this is done in elasticity theory. Generally, twinning involves finite jumps in $e_a$, $e^a$ and $p_i$ across some surface. After pondering observations and common practices, I did and still do consider it to be reasonable to assume that it is possible to choose lattice vectors on the two sides, so that the Burgers vector vanishes for all Burgers circuits intersecting the discontinuity surface, not enclosing other defects, most likely to be dislocation lines. This implies that $\chi^a$ can be taken to be continuous across the discontinuity surface. This is an analog of the usual assumption of continuity of the displacement in elasticity theory for twins. Associated with this is the usual kinematic condition of compatibility, which can be put in the form
$$\bar e^a = (1 - n\otimes a)\,e^a, \tag{2.5}$$
where $n$ is the unit normal to the discontinuity surface and $a$ is an amplitude vector. Also, $\bar e^a$ and $e^a$ here represent limiting values from the two sides. This will play an important role in later discussions. For twins in unstressed crystals, it is the common understanding that the volume of a unit cell is the same on both sides. Also, for most if not all mechanical twins and some growth twins, the experience is consistent with the assumption that $\bar e^a$ and $e^a$ can be selected so as to have the same orientation. With both of these assumptions,
$$a\cdot n = 0, \tag{2.6}$$
and Eq. (2.5) is equivalent to
$$\bar e_a = (1 + a\otimes n)\,e_a. \tag{2.7}$$
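The equivalence of the two forms of the compatibility condition can be checked on a small example. The sketch below assumes the relations $\bar e_a = (1 + a\otimes n)\,e_a$ for lattice vectors and $\bar e^a = (1 - n\otimes a)\,e^a$ for reciprocal vectors, with $a\cdot n = 0$, and compares the reciprocal vectors obtained from the formula with those computed directly from the transformed lattice vectors. The sample vectors and helper names are hypothetical.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def cell_volume(e):
    return dot(e[0], cross(e[1], e[2]))

def dual_basis(e):
    inv_vol = 1 / Fraction(cell_volume(e))
    return [tuple(inv_vol * x for x in cross(e[1], e[2])),
            tuple(inv_vol * x for x in cross(e[2], e[0])),
            tuple(inv_vol * x for x in cross(e[0], e[1]))]

def rank_one_update(u, v, x, sign):
    """Return (1 + sign * u (x) v) x = x + sign * (v . x) u."""
    s = sign * dot(v, x)
    return tuple(xi + s * ui for xi, ui in zip(x, u))

e = [(2, 0, 0), (1, 3, 0), (0, 1, 1)]                 # sample lattice vectors
n = (0, 0, 1)                                         # unit normal
a = (Fraction(1, 3), Fraction(-2, 5), 0)              # amplitude with a . n = 0

e_bar = [rank_one_update(a, n, x, +1) for x in e]     # (1 + a (x) n) e_a
dual = dual_basis(e)
dual_bar_formula = [rank_one_update(n, a, x, -1) for x in dual]   # (1 - n (x) a) e^a
dual_bar_direct = dual_basis(e_bar)
```

The two sets of reciprocal vectors coincide and the unit cell volume is unchanged, as expected when the amplitude is perpendicular to the normal. If $a\cdot n \neq 0$, the cell volume changes and the simple formula $(1 + a\otimes n)$ for the lattice vectors no longer follows from the reciprocal form.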
Another possibility will be described in Section 6. I expect that Eq. (2.5) also applies to twins in samples bearing small loads, which are often associated with metastable equilibria. Then, Eq. (2.6) might well fail. No doubt, Eq. (2.7) looks more familiar to those with some experience in the analysis of mechanical twins. There, the indicated linear transformations are often associated with the macroscopic deformation gradient, a simple shear, and this is, effectively, an application of the Cauchy-Born rule. Similarly, there are conditions on the shifts associated with twins. I will not discuss these in general, but will treat a special case later, in an example, and discuss it a bit more in Section 6. Typically, mechanical twins do involve a simple shearing deformation like this, in unloaded specimens, with a deformation gradient of the form
$$F = 1 + b\otimes n, \tag{2.8}$$
where $b$ is perpendicular to $n$. To compare this with Eq. (2.7), take as a reference configuration that with the lattice vectors $e_a$. Commonly, X-ray observations provide infinitely many possibilities for satisfying
Eq. (2.7), for a given twin and choice of ea. When the Cauchy-Bom rule does apply, one of these satisfies a = b and, usually, in Bravais lattices, it is the one with the smallest value of |a|. So, this is a guess often made by workers, to get a value of F from X-ray observations, although they use other ideas for multilattices. When the Cauchy-Born rule fails, none of the possibilities indicated by Eq. (2.7) agrees with Eq. (2.8). Obviously, one really needs both X-ray observations and measurements of macroscopic deformation to check this. Later, we will consider an illustrative example. To get something like the Cauchy-Born rule, workers often use what Zanzotto (1992) calls sublattices. Briefly, this involves using larger lattice vectors, effectively ignoring some sets of atoms in the real lattices. Empirically, it seems that such sublattices exist for all mechanical twins, when the Cauchy-Born rule fails to apply, although I do not know of a good physical reason for this. In Section 6, I will mention variations on twinning equations used by workers to allow for this. Obviously, it is desirable to test the aforementioned assumptions by analyzing observed twins for which there is some reason to suspect that they might not apply. It seemed to me possible that the examples to be discussed might be of this kind. We will see how analyses of these work out. It is an old difficulty in twinning theory that, often, theoretically possible twins are not observed, and I have no new ideas for trying to remedy this. It is tricky, since we cannot know what might be observed in the future. One does need to be able to account for mass densities, and I made a rather obvious assumption (Ericksen, 1997) about this, to be mentioned later. Also, I introduced a constitutive function for cp, the Helmholtz free energy per unit mass, of the form (2.9)
(2.10)
Obviously, it is a matter of making a change of variables to get one from the other, and I will not belabor converting the analyses I gave to put them in terms of (p. For the analysis of twinning, in particular, one needs to make some assumptions about the invariance group for q> or cp and, unfortunately, this is a complicated business, only partly because one is dealing with n lattices, where n can be any positive integer. It is a bit easier to discuss this for
$R \in SO(3)$,
(2.11)
implying that these transformations must map the domain of
(2-12)
So, for example, any lattice described by ea can also be described by —e0, but the two possibilities cannot both be in the domain of cp. With growth twins, there is real possibility of having enantiomorphic configuarations on the two sides. There is then the possibility that two different "mirror image" constitutive equations are appropriate. This does not necessarily require that their lattice vectors be oppositely oriented, but it allows for this possibility, in constitutive theory. I interpret the different constitutive equations to mean that it is impossible to remove such twins by applying loads. This fits the experience with Brazil twins in quartz, for example. Similarly, one needs to exclude the values of p,, which let two different atoms occupy the same position. This puts some conditions on the topology of the domain that are not encountered in the
theory of Bravais lattices (1-lattices). I will ignore these, as they are not needed for matters to be discussed here. Another complication not encountered in Bravais lattices is one that I glossed (Ericksen, 1997). The domain is likely to include nonessential descriptions in the terminology of Pitteri (1998) and Pitteri and Zanzotto (2000, Chapter 4). Briefly, these are the values of ea and p, for an n lattice describing a configuration which can also be described as an n' lattice, with n' < n. He gave one characterization of these and, more recently, I gave (Ericksen, 1997) a different one. In some cases of interest, but not all, one can pick domains excluding these. Otherwise, it is still not clear how best to deal with those occurring in the domains of functions, so, I will not face up to this difficulty. As candidates for transformations to be included in the invariance group for a function
$$m = \|m_a^b\| \in G, \tag{2.13}$$
the group of unimodular matrices of integers, often denoted by GL(3, Z) or a similar notation. Of course, Eq. (2.12) excludes the possibilities with det m = -1 although, as I noted (Ericksen, 1997), one could combine these with improper orthogonal transformations. In these and other matrices, my convention is that the lower index always labels rows. To define shifts, we picked an atom in one of the lattices as an origin, and numbered the lattices in some manner. Picking another such atom and renumbering gives transformations of the form
$$p_i \to \bar p_i = a_i^j\,p_j + n_i^a\,e_a,\qquad n_i^a \in \mathbb{Z}, \tag{2.14}$$
where the matrices $a = \|a_i^j\|$ can be interpreted as forming an offbeat representation of the permutation group on n objects. Pitteri (1985a) described a set of generators which, with my convention, consists of

(a) matrices obtained by replacing a column in the unit matrix by entries all equal to -1, and
(b) matrices obtained by interchanging two columns in the unit matrix, (2.15)
for essential descriptions. He uses a different convention, making his matrices transposes of mine. As I see it, these transformations also apply to those nonessential descriptions, but one should also explore additional implications of their being describable as $n'$ lattices. As I discussed (Ericksen, 1999) in some detail, it suffices to use a proper subset of these, for monatomic crystals, except for some smaller values of n. For polyatomic crystals, additional restrictions are imposed on the $a$'s.

Now, for Bravais lattices, it has been very useful to consider as a domain one of the neighborhoods of Pitteri (1984, 1985a) because, effectively, these pick out finite subgroups of the infinite discrete groups, which map the neighborhood onto itself, in a nice way, more traditional invariance groups. It is possible for these neighborhoods to be unbounded with respect to applying uniform dilatations to lattice vectors, but most analyses done do not use this flexibility. Unfortunately, there are cases where the configurations of interest cannot be in the same neighborhood, for example, the very common twins in cubic and hexagonal crystals, although it is physically reasonable to take the domains of interest to be bounded for these. So, there is room for some new ideas, to deal with such exceptions, preferably without requiring the use of infinite discrete groups. Pitteri (1985a) generalized the basic theory of neighborhoods from his earlier version (Pitteri, 1984) for Bravais lattices, but only when they are centered at essential descriptions. Then, the relevant discrete group reduces to a finite group, the lattice group of the center. The lattice group for a nonessential description, described in the same way, is never unique. Elements of lattice groups are sets of integers, of the form
$$(m, a, q),\qquad q = \|q_i^a\|,\quad q_i^a \in \mathbb{Z}, \tag{2.16}$$
where the matrices $m$ and $a$ are of the form described above, this set being associated with the equations
$$Q e_a = m_a^b\,e_b,\qquad Q p_i = a_i^j\,p_j + q_i^a\,e_a,\qquad Q \in O(3). \tag{2.17}$$
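The generators listed in (2.15) can be explored by brute force for small n. The sketch below builds the two kinds of generator matrices for $\nu = n - 1$ shifts, using the convention that the lower index labels rows, and closes them under matrix multiplication. It is an illustration only: the function names are hypothetical, and it relies on the expectation that the matrices form a faithful representation of the permutation group on n objects, so that the closure should contain n! matrices.

```python
def generators(nu):
    """Generator matrices of type (a) and (b) for nu = n - 1 shifts."""
    ident = [[int(i == j) for j in range(nu)] for i in range(nu)]
    gens = []
    for col in range(nu):
        # (a) one column of the unit matrix replaced by entries equal to -1
        g = [row[:] for row in ident]
        for row in g:
            row[col] = -1
        gens.append(g)
    for c1 in range(nu):
        for c2 in range(c1 + 1, nu):
            # (b) two columns of the unit matrix interchanged
            g = [row[:] for row in ident]
            for row in g:
                row[c1], row[c2] = row[c2], row[c1]
            gens.append(g)
    return [tuple(map(tuple, g)) for g in gens]

def matmul(x, y):
    size = len(x)
    return tuple(tuple(sum(x[i][k] * y[k][j] for k in range(size))
                       for j in range(size)) for i in range(size))

def generated_group(gens):
    """All finite products of the generators (terminates because the group is finite)."""
    group = set(gens)
    frontier = list(gens)
    while frontier:
        g = frontier.pop()
        for h in gens:
            p = matmul(g, h)
            if p not in group:
                group.add(p)
                frontier.append(p)
    return group

order_for_2_lattices = len(generated_group(generators(1)))
order_for_3_lattices = len(generated_group(generators(2)))
order_for_4_lattices = len(generated_group(generators(3)))
```

A type (a) generator corresponds to moving the origin to an atom of another lattice, while a type (b) generator simply renumbers two of the shifts.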
One uses these, restricted to be consistent with Eq. (2.12), evaluated at the lattice vectors and shifts describing the configuration chosen as center. It turned out that crystallographers had not worked on these groups, hampering development of this theory. However, progress is being made in understanding them. For example, Pitteri and Zanzotto (1998) constructed a very nice example showing that these can be used to distinguish subtle differences in symmetry that are not by the commonly used space groups and sitesymmetry groups. Generally, they can distinguish differences in symmetry missed by the latter groups. Also, Pitteri and Zanzotto (2000, Chapter 4) present calculations of these groups for some special cases. I characterized (Ericksen, 1999) the groups for two lattices and three lattices with lattice vectors of most of the lower symmetry triclinic and monoclinic types. Adeleke (1999) characterized these for general n lattices with body-centered orthorhombic lattice vectors. The latter two papers explain how the as are a representation of the permutation group on n objects. Also, Parry (1998) treated some low-dimensional lattice groups. I find that it eases some such analyses, if one replaces Eq. (2.17) by the equivalent
rf*2 = « # + «?,
(2-18)
where Pi=P?*a=>rf = Pr*f-
(2-19)
This also better fits formulation (2.10), with p, replaced by p\. As yet, we know very little about what kinds of phase-transitions, twinning, etc. can be analyzed, using these neighborhoods, for multilattices. Also, there is need to understand how to use neighborhoods centered at nonessential descriptions, in a similar way. Generally, it is not hard to calculate the lattice group for a particular configuration, given an essential description, as I will do for an example discussed later. I think it obvious that the X-ray theory is still in a very rudimentary state and it is a complicated theory. However, tackling special problems with newer theories usually results in gains in our understanding, and I think it is now feasible to do some of this.

3. An example

In attempting to analyze twin patterns which are or might be observed in a particular material, a good first step is to analyze a single twin that has been observed in it, with constant lattice vectors and shifts on each side. One can find lists of observed twins and other general information on them in standard references such as Barrett and Massalski (1966), Hall (1954), Kelly and Groves (1951), Klassen-Nekliudova (1964) and Reed-Hill et al. (1964). Also, there is a fairly recent review of work on deformation twinning, containing much information and numerous references, by Christian and Mahajan (1995). Bear in mind that observations of new kinds of twins occur from time to time, so views of the subject and tables can and do change, as a result of this. Briefly, deformation twins are produced by suitably loading a crystal, then removing the loads. Transformation twins do not involve loading, occurring naturally in unstressed crystals as a result of various phase transitions involving a change of symmetry. Intuitively, producing deformation twins treats crystals rather roughly, more so than the treatment producing transformation twins, for example.
From this, one might guess that other kinds of defects are more likely to accompany deformation twins. However, twinning theory not accounting for this has been quite successful. Some information listed in such tables is not really relevant to the X-ray theory, referring to observations of macroscopic deformation. For an example, I will pick a material for which Zanzotto (1992) concluded that elasticity theory is
inadequate to describe all observed twins. Conceivably, this might suggest that my assumptions do not all apply. The materials he implicates include some hexagonal close-packed crystals. The lattice group for these is maximal, which implies that it is impossible for the two configurations involved in a twin to be in the same Pitteri neighborhood. In this respect, the lower symmetry (orthorhombic) α-uranium he analyzes seems to be more promising, these twins being produced by stress, induced mechanically or thermally. This material is also interesting, because a variety of rather unusual twins are observed in it. It is described as a (monatomic) four-lattice by Barrett and Massalski (1966, p. 170), a description used by all workers, as far as I know. The lattice vectors are chosen to be of the form
$e_1 = a\,i, \quad e_2 = b\,j, \quad e_3 = c\,k, \qquad e^1 = i/a, \quad e^2 = j/b, \quad e^3 = k/c,$ (3.1)
where the vectors i, j and k are orthonormal, and a, b, c are constants, with
$0 < a < b < c.$ (3.2)
Later, I will say more about relevant values of these and the shifts. Look at the table given by Barrett and Massalski (1966, p. 415), for example, and you get a rather standard way of listing twinning information. Some tables include additional entries, use slightly different notations, etc. For α-uranium, they list five twinning modes. The first, which is also the mode most frequently observed, is described as
Twinning plane, $K_1$: $\{130\}$
Twinning direction, $\eta_1$: $\langle 3\bar{1}0 \rangle$
Second undistorted plane, $K_2$: $\{1\bar{1}0\}$
Direction $\eta_2$: $\langle 110 \rangle$
Shear: 0.299
(3.3)
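Tables of this kind can be sanity-checked mechanically: a listed direction must lie in its listed plane, which for plane indices (hkl) and direction indices [uvw] means hu + kv + lw = 0, whatever the lattice metric. The following is a minimal Python sketch of that check. The function name is hypothetical, and the placement of the minus signs (overbars) in the index triples is an assumption made here, chosen so that the first two entries match Eq. (3.4).

```python
def lies_in_plane(plane, direction):
    """Return True if direction [uvw] lies in plane (hkl), i.e. h*u + k*v + l*w == 0.

    Plane indices are components in the reciprocal basis and direction indices
    are components in the lattice basis, so no metric is needed.
    """
    return sum(h * u for h, u in zip(plane, direction)) == 0


# Entries of the {130} mode; minus signs stand for overbars (assumed placement).
K1, eta1 = (1, 3, 0), (3, -1, 0)
K2, eta2 = (1, -1, 0), (1, 1, 0)

print("eta1 in K1:", lies_in_plane(K1, eta1))
print("eta2 in K2:", lies_in_plane(K2, eta2))
```

Reading the indices "as they stand" without the overbars, for example (130) with [310], fails this test, which is the kind of error the check is meant to catch.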
The labels $K_1$, $\eta_1$, etc. are standard. I will follow the common practice of using the $K_1$ entry to label twinning modes, so these are the {130} twins, or it is the {130} mode. The numbers involved are crystallographic indices of the directions mentioned; one can use the numbers as they stand, or use crystallographically equivalent sets of directions. In some tables occurring in the literature, using such numbers as they stand can lead to error. At least check that they imply that $K_1$ and $\eta_1$, as well as $K_2$ and $\eta_2$, are orthogonal, as they must be. Later, I will mention cases in point. The description of twinning elements by Barrett and Massalski (1966, p. 411) should make fairly clear the interpretation of these entries, at least for mechanical twins, but I will raise a question about the interpretation of $\eta_1$. Here, only the first and, sometimes, the second entries are relevant to the X-ray theory, the others referring to observations of macroscopic deformation. Actually, workers exercise ingenuity, in trying to get best estimates of all entries. I do not think it is worth getting into a lengthy discussion of this. So, allow for the fact that my remarks about such matters are somewhat simplistic. With the obvious difference in how the planes are transformed, some workers prefer to use different names for the planes, calling the latter composition planes. To illustrate some ideas in a simpler way, I will make a "lucky guess" that $a \parallel \eta_1$ (a is parallel to $\eta_1$), the vector with $\eta_1$ as components relative to $e_a$. Workers familiar with twinning analyses might spot that this is not just a guess. Later, I will explain this. A more thoughtful treatment is described in Section 6. Also, I will gloss over some subtleties related to twinning theory, discussed in Section 6, that are not important here. In terms of the vectors a and n, the first two entries give
$n \parallel e^1 + 3e^2, \quad a \parallel 3e_1 - e_2,$ (3.4)
or a pair of directions crystallographically equivalent to these. Ignoring the remaining entries, we have
$S_x = 1 + a \otimes n = 1 + \alpha(3e_1 - e_2) \otimes (e^1 + 3e^2),$ (3.5)
where α is some scalar. Here, the $e_a$ are lattice vectors on one side of the twin. The subscript on S indicates that I interpret this as a shear to be determined by the X-ray theory and measurements only, although I will
compare it with the value of the shear deformation. Here, $S_x$ represents a one parameter family of values, parameterized by α. To qualify as a twin in an unstressed material, it is generally agreed that some orthogonal transformation applied to the configuration on one side gives the configuration on the other. For most observed mechanical twins, this is a 180° rotation with n as axis, associated with what are called type I or rotation twins.² For the occasional exception, it is most often a 180° rotation with a as axis, associated with what are called type II or reflection twins. For a long time, experts were not convinced that other types of mechanical twins exist in nature. However, recent observations, to be discussed later, seem to establish that there are. Simply, I do not know how well these are accepted by experts. For α-uranium, the types of observed twins are mentioned by Christian and Mahajan (1995, Section 2.8). The twin at hand is compound, meaning that it can be analyzed as either type I or type II. It will be analyzed as type I, using
$R = -1 + 2\,n \otimes n, \quad n = (e^1 + 3e^2)/|e^1 + 3e^2|.$ (3.6)
Now, from this, the set
$\bar e_a = R e_a, \quad \bar p_i = R p_i$ (3.7)
is a possible set of lattice vectors and shifts on the other side. From Eq. (2.7), the vectors indicated there by $e_a$ are also a possible set of lattice vectors on the same side. Using Eq. (2.13) to put this together, we get the twinning equation
$m_a^b \bar e_b = S_x e_a = [1 + \alpha(3e_1 - e_2) \otimes (e^1 + 3e^2)]\, e_a,$ (3.8)
with R given by Eq. (3.6). This is to be solved for possible values of the scalar α and m ∈ G. This will look familiar to those dealing with twins in Bravais lattices and shape memory alloys, for which the Cauchy-Born rule applies, according to Zanzotto (1992). Then, a possible value of $S_x$ is taken to be the macroscopic deformation gradient F, as described earlier. However, Eq. (3.8) is based on a different concept and, generally, I believe that it applies when the Cauchy-Born rule fails. To compare with the macroscopic shear deformation, it is convenient to proceed as follows: For any value of α, we can determine a matrix $\boldsymbol{\mu}(\alpha)$ such that Eq. (3.8) holds, with m replaced by $\boldsymbol{\mu}$. This is just a description of (mixed) components of $R S_x$ in the lattice vector basis. A calculation gives
$\boldsymbol{\mu}(\alpha) = \begin{pmatrix} 1 - 3\mu & \mu & 0 \\ 6 - 9\mu & 3\mu - 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$ (3.9)
where
$\mu = 6a^2/(9a^2 + b^2) + \alpha.$ (3.10)
It is easy to check that
$\det \boldsymbol{\mu} = 1, \quad \boldsymbol{\mu}^2 = 1.$ (3.11)
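The two identities in Eq. (3.11) hold for every value of the scalar μ, not only for integers. A short exact-arithmetic sketch in Python illustrates this; the helper names are hypothetical and the matrix entries are those of Eq. (3.9), with rows listing the components of the image of each lattice vector.

```python
from fractions import Fraction

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def mu_matrix(mu):
    """Matrix of Eq. (3.9) for a given scalar mu."""
    return [[1 - 3 * mu, mu, 0],
            [6 - 9 * mu, 3 * mu - 1, 0],
            [0, 0, -1]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def satisfies_3_11(mu):
    """Check det = 1 and square = identity, exactly."""
    M = mu_matrix(mu)
    return det3(M) == 1 and matmul(M, M) == IDENTITY


for mu in (Fraction(0), Fraction(1), Fraction(1, 2), Fraction(7, 3)):
    print(mu, satisfies_3_11(mu))
```

Using `Fraction` rather than floats keeps the comparison with the identity exact, so the check does not depend on a tolerance.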
Clearly, we will have $\boldsymbol{\mu} \in G$ if and only if μ is an integer, call it m. So, Eq. (3.8) will be satisfied, for any integer m, if we take
$\alpha = m - 6a^2/(9a^2 + b^2).$ (3.12)
² One should be wary of the fact, pointed out by Zanzotto (1988), that inequivalent definitions of the types occur in the literature, and it is sometimes hard to know which a writer has in mind, if you do not know some common practices, explained in Section 6. I use a slight generalization of the one Zanzotto selects, covering multilattices.
This type of indeterminacy is familiar to those who do twinning analyses, using X-ray data, to estimate F, assuming that the Cauchy-Born rule might apply and, soon, I will explain this. One then cannot really determine the macroscopic deformation gradient uniquely from such X-ray observations. To try to do so, I will use the guess commonly used for Bravais lattices, minimizing the shear magnitude $s_x$, which is given by
$s_x(m) = |\alpha|\,|3e_1 - e_2|\,|e^1 + 3e^2| = |m(9a^2 + b^2) - 6a^2|/ab$ (3.13)
by a routine calculation. Here, it might be reasonable to consider the two smallest, as they give shears in opposite directions; they are
$s_x(0) = 6a/b \quad (\alpha < 0),$ (3.14)
$s_x(1) = b/a + 3a/b \quad (\alpha > 0).$ (3.15)
Now, from the description of $\eta_2$ in Eq. (3.3), one can calculate a similar formula for s, the magnitude of the macroscopic shear deformation. This gives a (known) formula
$s = \tfrac{1}{2}|b/a - 3a/b| = s_x(1/2).$ (3.16)
This does not agree with Eq. (3.13) for any integer m. For our α-uranium, data cited by Barrett and Massalski (1966, p. 170) give
$b/a = 2.056 \quad (\alpha > 0),$ (3.17)
yielding the value s = 0.299 noted in Eq. (3.3). For this ratio, Eq. (3.14) gives 2.92, much too large and in the wrong direction. With Eq. (3.15), we get the right direction, but with the value 3.515 much too large. Certainly, one cannot attribute these large discrepancies to experimental errors. What we are seeing seems to be clear evidence of a failure of the Cauchy-Born rule, but we shall see that it is not. This analysis applies to any n-lattice, at least when the lattice vectors are associated with an essential description. From the four-lattice description of α-uranium presented by Barrett and Massalski (1966, p. 170), I read off shifts, obtained by using as origin the point they label as 0 y 1/4, y = 0.015 ± 0.005, getting
$p_1 = (e_1 + e_2)/2, \quad p_2 = -2y\,e_2 + e_3/2, \quad p_3 = p_2 + (e_1 + e_2)/2.$ (3.18)
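The shear values quoted after Eq. (3.17) can be reproduced from Eq. (3.13) with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and a is set to 1 because only the ratio b/a enters the result.

```python
def shear_magnitude(m, a, b):
    """Shear magnitude s_x(m) = |m(9a^2 + b^2) - 6a^2| / (ab), as in Eq. (3.13)."""
    return abs(m * (9 * a**2 + b**2) - 6 * a**2) / (a * b)


# Only the ratio b/a = 2.056 of Eq. (3.17) matters, so take a = 1 (assumed scale).
a, b = 1.0, 2.056

for m in (0, 1, 0.5):
    print(f"m = {m}: s_x = {shear_magnitude(m, a, b):.3f}")
```

With integer m the two smallest values come out near 2.92 and 3.515, while m = 1/2 gives a value close to 0.30, in line with the measured shear; the last digit depends on how b/a is rounded.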
This is essentially the same as an example I presented (Ericksen, 1998, Eq. (82)), of a nonessential four-lattice description. It can also be described as a two-lattice, with lattice vectors describing a base-centered orthorhombic lattice. The routines presented there give these as
$[(e_1 + e_2)/2,\; e_2,\; e_3],$ (3.19)
and one shift, which can be taken as
$p_2 = -2y\,e_2 + e_3/2.$ (3.20)
However, analyses to follow work out more neatly if one uses the equivalent set
$\bar e_1 = (e_1 - e_2)/2, \quad \bar e_2 = (e_1 + e_2)/2,$ (3.21)
a commonly used description of a base-centered orthorhombic lattice, along with an equivalent shift, given by
$\bar p = p_2 + \bar e_2 = 2y\,\bar e_1 + (1 - 2y)\,\bar e_2 + e_3/2.$ (3.22)
Certainly, experts know that the configuration can be described as a two-lattice, as is clear from the discussion of Christian and Mahajan (1995, Section 2.8), for example, and they compensate for this by allowing half integers in Eq. (3.8). As is noted by Pitteri and Zanzotto (2000, Chapter 4), point and space groups calculated using nonessential descriptions are sometimes only proper subgroups of those obtained using essential descriptions. Here, the two descriptions give the same point groups and space groups. This means that, for most and perhaps all conventional analyses, it really does not matter which description one uses. However, for an analysis to follow, the difference is important. Consider Eq. (3.16), describing the macroscopic deformation. Changing variables to replace the old lattice vectors by the new, we get, with the shear S described by Eq. (3.16),
$S\bar e_1 = -R\bar e_1 - R\bar e_2, \quad S\bar e_2 = R\bar e_2, \quad S\bar e_3 = -R\bar e_3,$ (3.23)
and the coefficients on the right now form a unimodular matrix of integers, given by
$m_1 = \begin{pmatrix} -1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},$ (3.24)
with
$\det m_1 = 1, \quad m_1^2 = 1.$ (3.25)
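The passage from the half-integer case of Eq. (3.9) to the integer matrix of Eq. (3.24) is a change of lattice basis, and it can be checked exactly. In the sketch below, P is assumed to hold the components of the new lattice vectors of Eq. (3.21) in the old basis (rows are vectors, and the third vector is taken unchanged); all names are hypothetical.

```python
from fractions import Fraction as Fr


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


# Matrix of Eq. (3.9) evaluated at mu = 1/2 (components in the old basis).
mu = Fr(1, 2)
mu_half = [[1 - 3 * mu, mu, 0],
           [6 - 9 * mu, 3 * mu - 1, 0],
           [0, 0, -1]]

# New basis of Eq. (3.21): e1' = (e1 - e2)/2, e2' = (e1 + e2)/2, e3' = e3 (assumed).
P = [[Fr(1, 2), Fr(-1, 2), 0],
     [Fr(1, 2), Fr(1, 2), 0],
     [0, 0, 1]]
# Inverse relation: e1 = e1' + e2', e2 = -e1' + e2', e3 = e3'.
P_inv = [[1, 1, 0],
         [-1, 1, 0],
         [0, 0, 1]]

# With rows as images of basis vectors, the components transform as P * mu * P^-1.
m1 = matmul(matmul(P, mu_half), P_inv)
print(m1)
```

The result has integer entries even though `mu_half` does not, which is the sense in which the essential description removes the need for half integers.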
The effect is to allow half integers as well as integers in Eq. (3.13) and minimizing sx in this larger set gives S, which agrees with measurements of the deformation gradient. With the nonessential description, we thus get the poor estimate of minimum shear, and an apparent failure of the Cauchy-Born rule, using the same X-ray observations. With the essential description, the Cauchy-Born rule and minimum shear hypothesis do apply. As was mentioned earlier, workers avoid this trap, by allowing half integers. As a matter of taste, I do not like this dodge, but it works, for this calculation. For the linear transformation S, it does not matter what basis we use to describe it. So, the Cauchy — Born rule does apply, with the new choice of lattice vectors and shifts given by Eqs. (3.21) and (3.22), providing an essential description.
(3.26)
A calculation gives
$e^1 + 3e^2 = 2\bar e^2 - \bar e^1, \quad 3e_1 - e_2 = 2(2\bar e_1 + \bar e_2),$ (3.27)
and S = 1 + 2(b2 - 3a2)(2e, + e2)
(3.28)
With the new choice of lattice vectors, the table corresponding to Eq. (3.3) becomes
Twinning plane, $K_1$: $\{\bar{1}20\}$
Twinning direction, $\eta_1$: $\langle 210 \rangle$
Second undistorted plane, $K_2$: $\{100\}$
Direction $\eta_2$: $\langle 010 \rangle$
Shear: 0.299
(3.29)
As in the previous calculation, the X-ray theory involves an infinite number of shears, now including the one describing the macroscopic shear. As is familiar to experts, this occurs because of the existence of lattice
invariant shears $S_L$, meaning shears that map one set of lattice vectors to another one for the same lattice. For this reason, they are invisible, in X-ray observations. Here, the important ones are of the form
$S_L = 1 + m(3e_1 - e_2) \otimes (e^1 + 3e^2),$ (3.30)
for the old set of lattice vectors, where m is any integer. For the new set, a look at Eq. (3.27) gives
$S_L = 1 + m'(2\bar e_1 + \bar e_2) \otimes (2\bar e^2 - \bar e^1), \quad m' = 2m,$ (3.31)
where m' is any integer, justifying the rather common practice of using integers and half integers in Eq. (3.13), as workers do and as I did in Eq. (3.16). For a satisfactory analysis of these twins, φ should be invariant under finite rotations and the subgroup of the lattice group for $\bar e_a$ and $\bar p$ which is consistent with Eq. (2.12), in particular. The complete lattice group is of order eight, involving all of the orthogonal transformations in the point group for the lattice vectors indicated in Eq. (3.21). Elements corresponding to a central inversion and 180° rotations with axes $e_2$ and $e_3$ are, respectively,
$\left\{ \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix},\; -1,\; (0,0,0) \right\},$ (3.32a)
$\left\{ \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix},\; 1,\; (-1,-1,-1) \right\},$ (3.32b)
$\left\{ \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\; -1,\; (0,0,1) \right\},$ (3.32c)
where, interpreted as in Eq. (2.16), these serve as generators of the complete lattice group for this configuration, and the first can be deleted in considering the invariance group for φ. To properly cover the twins considered, we also need to have φ invariant under the $m_1$ given by Eq. (3.24), hence under the group generated by it and the two m's in Eqs. (3.32b) and (3.32c). For these twins to be included in a Pitteri neighborhood, it is necessary that this be a finite group. This can be true only if the three m's are included in the lattice group of some Bravais lattice. If so, the lattice vectors for the latter, denoted by $c_a$, are possible candidates for the center of a Pitteri neighborhood. So, one looks at the three equations of the form
$R c_a = m_a^b c_b,$ (3.33)
for the three indicated m's, the R's being some unknown rotations, the $c_a$ also being unknowns. As all satisfy det m = 1, m² = 1, these are all 180° rotations, if Eq. (3.33) can be satisfied. Analyzing this, one finds that the $c_a$ must satisfy the following conditions:
$c_1 \cdot c_3 = c_2 \cdot c_3 = 0, \quad |c_1| = |c_2|, \quad 2c_1 \cdot c_2 = -|c_2|^2,$ (3.34)
identifying these as commonly used lattice vectors for hexagonal lattices. These should be selected to have the same orientation as the lattice vectors $\bar e_a$. It is easy to determine the three axes of the aforementioned 180° rotations, which are included in the point group for the hexagonal lattice. With the R's thus defined, one also needs to define a shift p, for the hexagonal configuration, transforming so as to match the second and third entries in Eq. (3.32), and to be such as to preserve the invariance associated with Eq. (3.24). Also, it is preferable that the description be essential, as we know so
little about neighborhoods centered at nonessential descriptions. However, for this special situation, one could deal with this, if it were necessary, but it is not. Represent the shift by
$p = p^a c_a.$ (3.35)
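The finiteness requirement stated above, for the group generated by the $m_1$ of Eq. (3.24) and the two m's of Eqs. (3.32b) and (3.32c), can be explored by brute-force closure under multiplication. The following Python sketch does this. The entries used for the two rotations are an assumption: they are the matrices obtained when 180° rotations about $e_2$ and $e_3$ act on the basis of Eq. (3.21), and the function names are hypothetical.

```python
def matmul(A, B):
    """Product of two 3x3 integer matrices stored as tuples of tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))


m1 = ((-1, -1, 0), (0, 1, 0), (0, 0, -1))    # Eq. (3.24)
m_b = ((0, -1, 0), (-1, 0, 0), (0, 0, -1))   # assumed matrix part of (3.32b)
m_c = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))    # assumed matrix part of (3.32c)


def generated_group(generators, max_order=1000):
    """Close a set of finite-order matrices under right multiplication by generators.

    Raises RuntimeError if more than max_order elements appear, which would
    suggest that the generated group is not finite.
    """
    group = set(generators)
    frontier = list(generators)
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = matmul(g, h)
            if product not in group:
                if len(group) >= max_order:
                    raise RuntimeError("exceeded max_order; group may be infinite")
                group.add(product)
                frontier.append(product)
    return group


group = generated_group([m1, m_b, m_c])
print("order of generated group:", len(group))
```

Under these assumptions the closure terminates with twelve elements, the number of proper rotations in the point group of a hexagonal lattice, which is consistent with the identification of the $c_a$ as hexagonal lattice vectors.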
The requirement that this fit the lattice group elements (3.32b) and (3.32c) yields
$p^1 + p^2 = 1, \quad p^3 = 1/2.$ (3.36)
The condition to be avoided, that the description be nonessential, is that the shift is equivalent to one with components all equal to zero or a half. To include a lattice group element involving $m_1$, we need to have
$-p^1 = \pm p^1 + n^1, \quad -p^1 + p^2 = \pm p^2 + n^2, \quad -p^3 = \pm p^3 + n^3,$ (3.37)
where the $n^a$ are integers and the same choice of sign must be used in the three entries. Here, I used Eq. (2.18). For the upper sign, one gets only nonessential descriptions, so we try the lower. The conditions do not determine the $p^a$ uniquely, leaving room for using some equivalent shifts. Whichever one uses, one gets an hexagonal close-packed configuration. I prefer a standard choice,
$p = c_1/3 + 2c_2/3 + c_3/2.$ (3.38)
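The choice in Eq. (3.38) can be tested directly against the lattice group elements. The sketch below assumes that an element {m, α, l} acts on the shift components as $p^a m_a^b = \alpha p^b + l^b$, and it uses the element forms {m, α, l} as read here from Eqs. (3.32b), (3.32c) and from $m_1$ with α = -1, l = (0,1,0); these readings and all names are assumptions for illustration.

```python
from fractions import Fraction as Fr


def transform(p, m):
    """Components of the rotated shift: (p m)^b = sum_a p^a m_a^b."""
    return tuple(sum(p[a] * m[a][b] for a in range(3)) for b in range(3))


def fits(p, element):
    """True if the shift p is compatible with the element (m, alpha, l)."""
    m, alpha, l = element
    return transform(p, m) == tuple(alpha * p[b] + l[b] for b in range(3))


ELEMENT_B = (((0, -1, 0), (-1, 0, 0), (0, 0, -1)), 1, (-1, -1, -1))   # assumed (3.32b)
ELEMENT_C = (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), -1, (0, 0, 1))      # assumed (3.32c)
ELEMENT_1 = (((-1, -1, 0), (0, 1, 0), (0, 0, -1)), -1, (0, 1, 0))     # m_1 with lower sign

# Hexagonal close-packed shift of Eq. (3.38), components relative to c_a.
p_hcp = (Fr(1, 3), Fr(2, 3), Fr(1, 2))

# Shift of Eq. (3.22) with y = 0.015, components relative to the basis of Eq. (3.21).
y = Fr(15, 1000)
p_uranium = (2 * y, 1 - 2 * y, Fr(1, 2))

for name, p in (("hcp", p_hcp), ("alpha-uranium", p_uranium)):
    print(name, [fits(p, e) for e in (ELEMENT_B, ELEMENT_C, ELEMENT_1)])
```

On these assumptions the hexagonal shift is compatible with all three elements, while the α-uranium shift is compatible only with the first two, as one would expect if the extra element belongs to the center of the neighborhood alone.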
This defines the corresponding lattice group element as
$\left\{ \begin{pmatrix} -1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix},\; -1,\; (0,1,0) \right\},$ (3.39)
the complete lattice group for the center being that for an hexagonal close-packed configuration, as described here. This makes it seem likely that the α-uranium configurations considered can be included in such a neighborhood, and I do believe this. I will not try to prove it, but will give a plausibility argument. For example, experts know that the configuration can be viewed as a distorted hexagonal one. Frank (1953), an ingenious person, got results similar to mine, by noticing a similarity between α-uranium and zinc. I got it by a rather routine calculation and will point out implications of this concerning constitutive theory, assuming my belief is correct. I note that, for this analysis, it is at best awkward to use the four-lattice description. So, this is one of various examples illustrating why it is better to use essential descriptions, unless there is a very good reason not to do so. Let us try to construct a path joining the α-uranium configuration first considered to an hexagonal close-packed configuration, with
$|c_1| = |c_2| = |\bar e_1| = |\bar e_2|.$ (3.40)
On this, we wish to have the lattice groups of all configurations on the path be exactly that of the α-uranium configuration, except for the center, which will have a larger lattice group, of course. Denote by $f_a$ and q the lattice vectors and shift for any configuration on the path. We require these to satisfy
$|f_1| = |f_2| = |\bar e_1|, \quad f_1 \cdot f_3 = f_2 \cdot f_3 = 0, \quad q = \lambda f_1 + (1 - \lambda) f_2 + f_3/2,$ (3.41)
where λ is a parameter. Also, think of the $f_a$ as continuous functions of λ, so we have a path beginning at λ = 2y = 0.03 and ending at λ = 1/3. At the beginning, using the data on α-uranium, one can calculate the angle determined by $\bar e_1$ and $\bar e_2$, which is 128°, roughly. So, the angle determined by $f_1$ and $f_2$ needs to decrease from this value to 120° at the other end, and consider it to decrease monotonically, for simplicity. A problem could arise if this path contained a nonessential description. Now, Eq. (3.41) describes one at λ = 1/2, for example, but there are none in the λ-interval of interest. Also, Eq. (3.41) is enough to guarantee that the lattice groups on the path have that at the beginning as a subgroup, at least. A problem could arise if the path passed through a configuration with a larger lattice group, but a calculation indicates that this does not happen. From this, I conclude that such a path is a connected set fitting the description of an hexagonal close-packed neighborhood. It is not really necessary to assume that Eq. (3.40) holds, but it makes the reasoning easier. To proceed, I assume that this inclusion applies. It should be noted that the paths and center need not be equilibrium configurations, because this requires that ∂φ/∂p = 0, which might or might not be satisfied, except at the α-uranium configuration. Here, I interpret observations as implying that the latter is an equilibrium configuration. In molecular theories of elasticity, workers follow Born (1923), using molecular models to determine
(3.42)
used in Eq. (3.8), and p ^ R ( - p + e2).
(3.43)
Note that the procedure picks out one of the infinitely many equivalent shifts for the twin. In the jargon used by workers in this area, we have predicted a definite kind of shuffling. Workers use various kinds of reasoning to estimate shuffling in various kinds of twins, but seem not to relate this to any definite theory of constitutive equations, as far as I can tell. As we use only two of the three variants, it might be possible to use the third to describe more complicated patterns of twins coexisting in one crystal, but I will not pursue this. While this provides some basis for analyzing patterns of modes of the {130} kind, it seems most unlikely, on the face of it, that it will be possible to include the other observed twinning modes in this neighborhood. Later, I will do a partial analysis of one such mode, eliminating any doubt about this, finding that it alone cannot be contained in any such neighborhood. Certainly, it will be very difficult to construct a theory to deal with all of the modes. However, we have good clues for constructing a theory to explore the effect of small loads on {130} twins, for example. I presented (Ericksen, 1997) general equilibrium equations etc. needed for this. I do not mean to imply that it would be an easy matter to find an appropriate constitutive function
twinning tables, $\eta_1$ can always be regarded as representing the axis of rotation for type II and compound twins. Here, this makes $a \parallel \eta_1$. After reading Section 6, you should understand my reasons for saying this. For the two prescriptions to agree, one should always have $b \parallel a$, for such twins, and I know of no way to justify assuming this, theoretically. I do not know of any observations of such mechanical twins indicating failure of these directions to be parallel, so the assumption that they are seems to have some status, empirically. Theoretically, I do not like using the same name for things that are different, conceptually. I note that, by itself, the usual description really implies that one cannot determine $\eta_1$, using only X-ray observations, as there is no reliable way of relating the latter to deformation. However, workers sometimes do, by accepting the other description, for this, as is clear from the work of Cahn (1953), for example. If you look at various twinning tables, you will see three kinds of entries. For some twins, including those at hand, you will find entries presented as precise integers. This might make you wonder, since they represent experimental data, which are always subject to some error. For other entries, you will find some comment to the effect that certain entries are irrational. Then, the table might or might not list an approximation to the entries, using some set of integers. Certainly, workers know about approximating irrational numbers by rationals, and that, given an experimental estimate of some number associated with theory, one cannot determine whether the exact value is irrational or rational. To make sense of this, one needs to have some understanding of how workers make such decisions. In Section 6, I will explain this, as I understand it. Do not think that this is a typical example of twinning analyses for multilattices. These twins are unusual in more than one way.
As the basic atomic arrangement is not that of a Bravais lattice, it is at least somewhat exceptional to have the Cauchy-Born rule apply, as was mentioned earlier. Various type I and type II twins are not compound and this can complicate the analysis of them, as will be illustrated in Section 6. Even for Bravais lattices, there are many twins that cannot be included in any Pitteri neighborhood, as was mentioned earlier. With all of these nice features, we got a good correlation with the macroscopic deformation, using an hypothesis which is ad hoc, but is quite successful, for Bravais lattices. For these twins, elasticity theory might be adequate, but there are the other modes in this material, to which this theory does not apply, according to Zanzotto (1992). Workers trying to deal with deformation in other crystals are generally interested in trying to understand how all atoms move in this deformation, and we produced a particular estimate of shuffling which workers might accept, as a reasonable guess about this for these twins. Generally, such workers deal with twins that do not work out as happily, trying various hypotheses, with limited success, to correlate X-ray observations with measurements of deformation. I will not deal with these issues. For those interested in them, I suggest reading the discussion by Christian and Mahajan (1995) of them, which includes relevant references. In Section 6, I will analyze another mode in auranium, to illustrate some of the kinds of complications that can occur, in dealing with X-ray observations of twins.
4. Microstructures: elasticity theory

For elasticity theory, the idea that has been used is to fix some reference configuration for a crystalline body, and to consider minimizing sequences for the strain energy, or Helmholtz free energy functional, with sequences involving twinning deformations. Commonly, the constitutive equation is considered to be restricted to one of the Pitteri (1984) neighborhoods,³ making this invariant under a finite material symmetry group $G_M$, essentially some point group, and the body is considered to be unloaded. As was mentioned earlier, there are practical difficulties involved in treating twins that cannot be included in such a neighborhood, although one could do so, in principle. Essentially no progress has been made, in analyzing microstructures in such cases. Also, the energy density is assumed to have a set of minima generated by one or two, the orbit of these generated by SO(3) and $G_M$. We discussed an analog of this, in our example. Most of the interest has been in Martensitic transformations, where fine-scale twinning microstructures occur naturally when Austenite transforms to Martensite, in crystals bearing no loads. However, the method can be used for twins not of such transformation types, at least when they fit into some neighborhood. In such endeavors, one is trying to relate theory to observations of twin microstructures. Experimentally, one then has information concerning the region actually occupied by the twinned specimen, in the Martensitic phase, the crystallographic orientation and arrangements of the twins, and, often, some information about the parent untwinned Austenitic specimen, commonly taken as a reference configuration. Following the custom in elasticity theory, workers have used material coordinates as independent variables, so the twin planes, etc. are described as certain directions in the reference configuration. To compare with observations, one then needs to map these to the actual configuration, to correct the values of angles between differently oriented twin planes, for example. For this, my impression is that what is really used is the identification of relevant crystallographic directions in the reference and deformed configurations provided by the Cauchy-Born rule. Certainly, this hypothesis is used in an important way, to relate X-ray observations of the crystallographic orientations of twin planes etc. to the calculations, and to select $G_M$. So, one must do something different when the rule fails to apply, one of my reasons for proposing the X-ray theory.

³ For a slightly more general result and simpler treatment of such neighborhoods for Bravais lattices, cf. Ball and James (1992). Largely, thermoelasticity theory treats multilattices as Bravais lattices.
Commonly considered minimizing sequences involve increasing numbers of such planes with the distance between parallel planes approaching zero in the limit, the deformation gradient F undergoing finite jumps across these. Rather obviously, such values of F will not converge pointwise to a value of F, the basic reason why a minimizer is not obtained in the limit. However, this is one kind of limit which can be described, using the theory of Young measures, permitting one to do some useful calculations. With this theory, one can also describe sequences not only involving twins, such as the Austenite-Martensite interfaces studied by James and Kinderlehrer (1989), for example. Briefly, this describes the kind of theory I would like to adapt to the X-ray theory. Roughly, my idea is to interchange the roles played by the spatial coordinates and material coordinates. We are interested in comparing calculations with observations of a specimen with microstructure, occupying some region Ω in space. Instead of fixing a reference configuration, fix Ω. Again roughly, the idea is to consider the various material bodies that might be able to be in the observed configuration. So, instead of the usual deformation, we consider the inverse, maps of Ω to other domains, of the form
$x = x(y).$ (4.1)
Here, x and y denote the material and spatial coordinates, or coordinate-free equivalents, respectively. From the practice used in the previous view, I will borrow the assumption that these functions are included in $W^{1,\infty}$: it seems to me that arguments favoring this fit equally well with either of the two procedures. To use the spatial coordinates as independent variables, write the relevant energy in the form
$E = \int_\Omega \rho\, w\, dv,$ (4.2)
putting the constitutive equation for w in the form
$w = w(F^{-1}),$ (4.3)
where F is again the usual deformation gradient: include a dependence on temperature, if you like. Clearly, one can take a constitutive equation for the energy per unit reference volume and transform it to get w, or vice versa. I see no real difficulty in defining Pitter's neighborhoods to fit either formulation, for example. Roughly, the idea is to consider minimizing E, with the constraint that the total mass M be fixed. Here,
355 982
J.L. Ericksen I International Journal of Solids and Structures 38 (2001) 967-995
$M = \int_\Omega \rho \, dv$,
(4.4)
ρ being the mass density. In the customary approach, one takes care of this in a trivial way, by fixing the reference mass density $\rho_0$ and the material region. With the new procedure, we use $\rho = \rho_0 \det F^{-1}$,
(4.5)
with $\rho_0$ a fixed constant. For the spaces and sequences of interest, $\det F^{-1}$, along with $F^{-1}$ and $\mathrm{adj}\,F^{-1}$, are weakly continuous, making ρ weakly continuous, in particular. Briefly, this means that, in the limit, one can use the Young measures to calculate the mass of subregions of Ω. Or, one can do this using the weak limits. There are possible reasons to prefer one to the other, too technical to discuss here. One does need to bear in mind that sequences considered should respect the condition that M is fixed. In typical calculations involving only twins, one considers the twinning equation, which can be put in the form $F_2 = R F_1 H = (1 + b \otimes n) F_1$, $R \in SO(3)$, $H \in G_M$, $b \cdot n = 0$,
(4.6)
or the equivalent $F_2^{-1} = H^{-1} F_1^{-1} R^T = F_1^{-1}(1 - b \otimes n)$, $b \cdot n = 0$,
(4.7)
where $F_1^{-1}$ and $F_2^{-1}$ are values of $F^{-1}$ for some pair of minimizers of w, n is the unit normal to a twin plane and b is the amplitude vector referred to in Eq. (2.8). Of course, using such minimizers picks out a particular value of M. With the usual understandings about material symmetry and the assumption that the Cauchy-Born rule applies, this is compatible with the twinning equation used in Eq. (3.8). This gives the same value of ρ for the two deformations, so it is easy to deal with Eq. (4.4) in such cases, and not hard for sequences involving a mix of Austenite and Martensite, for example. Essentially, Eq. (4.6) is the kinematic condition of compatibility, enabling one to construct piecewise homogeneous maps of the form (4.1), with x continuous and $F^{-1}$ undergoing a finite jump from $F_1^{-1}$ to $F_2^{-1}$ across planes with normal n, fitting $W^{1,\infty}$. Perhaps this is enough to indicate how one can redo calculations in the literature, using this procedure. I will not argue that, for elasticity theory, this procedure is better than the usual one. It does have one little advantage, in avoiding transforming descriptions obtained in the reference configuration. My reason for considering it is pragmatic, to be able to adapt such techniques to the X-ray theory, and only the latter is suitable for this.
5. Microstructures: X-ray theory
Recall Eq. (2.4) and the fact that, by specializing the choice of lattice vectors a bit, one can arrange that $\chi^a$ is continuous across twin planes. For example, referring to the lattice vectors in Eq. (3.8), we could use as $\bar e^a$ and $e^a$ those values involved in our example. Mathematically, $e^a = \nabla\chi^a$, then, has essentially the same properties as the $F^{-1}$ considered before, and there is an analogous twinning equation for these. The other vectors are shifts $p_i$, $i = 1, \ldots, n - 1$, for an n lattice. As was the case in our example, these also can suffer finite discontinuities across twin planes, which can be analyzed using Eq. (3.7). If you like, you can adjust this, using Eq. (2.14) on either side, as we did in our example, there to better fit the descriptions to a neighborhood. So, in place of Eq. (4.1), we have an energy density function, the form described in Eq. (2.10) being more appropriate than is that described in Eq. (2.9). As an analog of Eq. (4.5), I proposed (Ericksen, 1997) using $\rho = k\,|\det\|\nabla\chi^a\||$,
(5.1)
where k is a positive constant, again with essentially the same features as Eq. (4.5). This can exclude some variable continuous distributions of point defects, for example vacancies, a remark that also applies to the way mass is accounted for in elasticity theory.
For $\chi^a$, the obvious analog of what is done in elasticity theory is to consider it to be in the function space $W^{1,\infty}$. For $p_i$, the finite jumps in it are akin to those occurring in $\nabla\chi^a$, fitting the function space $L^\infty$. For this combination, the general theory of Young measures, etc. is available. In principle, one can then proceed as before, considering minimizing sequences for $E = \int_\Omega \rho\varphi \, dv$,
(5.2)
with $M = \int_\Omega \rho \, dv$
(5.3)
held fixed, as before. In much of this kind of work in elasticity theory, workers do not really use the function w, merely the assumption that it has certain minimizers. Using Pitteri's neighborhoods makes it rather easy to generate an orbit of minimizers that are also in the neighborhood. For the cases involving twins that cannot be included in one such neighborhood, such as one to be encountered in an example, one could use observations to estimate some minimizers, take their orbits under SO(3), and possibly use other general ideas of invariance to enlarge the list. In Section 6, I will suggest a possible strategy for this, which is speculative and only loosely defined, involving two modes in α-uranium. At least tacitly, one then assumes that such minimizers are in the domain of
6. General theory of twinning
Here, my aim is to elaborate the more elementary parts of the general theory of twinning, according to the X-ray theory. This is designed to be compatible with twinning equations used by workers analyzing type I and type II twins, with information obtained from X-ray observations, as well as some permitting analyses of more general kinds of twins. As the atoms are arranged in different ways in the two regions separated by a twin plane, the twin plane direction $K_1$ can be determined, with the inevitable experimental error, using X-ray methods, it being relatively easy to do so. At least in principle, one can also determine how the atoms are arranged in both regions although, in practice, this can be difficult. We have seen some reasons why it is
better to use essential descriptions, and there are others. However, workers often use nonessential descriptions, so one might well need to learn how to recognize these and do a translation of the description much like that done for the example considered earlier. The discussion to follow assumes that essential descriptions are used. It is a very common understanding that, for such a pair of arrangements to be called a twin, in an unloaded crystal, one requirement is that the two arrangements can be related by some orthogonal transformation Q, not necessarily unique. That is, if $e_a$ or $e^a$ and $p_i$ describe the configuration on one side, then $\bar e_a = Q e_a$, $\bar e^a = Q e^a$ and $\bar p_i = Q p_i$
(6.1a,b,c)
represent possible values of these vectors on the other side. In some writings, it is not entirely clear that the authors intend to require an equivalent of Eq. (6.1a), but I am fairly sure that they do. An expert on X-ray observations informs me that some slight departures from Eq. (6.1a,b,c) are tolerated, in practice, but I will ignore this. Obviously, that Eq. (6.1a,b,c) holds can be shown to be possible or not, if one knows how the atoms are arranged. Of course, one can use Eqs. (2.13) and (2.14) to get other descriptions, as we did in the example. If Q belongs to the point group for the n lattice, so Eq. (2.17) is satisfied for some values of the integers involved, the two configurations are the same, so this possibility is excluded. However, for n lattices, this does not always exclude the possibility that Q is in the point group for the lattice vectors only. Another assumption is tacit in twinning tables. That is, they list just one entry for $K_1$, for example, although, since two different values of $e^a$ occur, this direction might have different indices on the two sides. It is easy to see that, if they match for one choice of these two sets of vectors, one can make them different, by introducing an equivalent set on one side, leaving that on the other as is. So, this presumes special choices of these, the obvious possibility being to use pairs related as in Eq. (6.1a,b,c). Conditions obtaining from this are discussed by Zanzotto (1988, Note 2). I will satisfy the condition in a different way, which is less restrictive in this respect, but is more restrictive in others. For purposes of discussion, I'll take the conditions described as minimal requirements for a surface discontinuity to be called a twin. However, at the end, I will explain why this is not completely consistent with practice. As might be expected from this, there are, in the literature, different definitions of twins and I have not tried to locate all of these.
Those that I have inspected impose Eq. (6.1a,b,c) and some other conditions, depending on the definition. Some and perhaps all exclude some observed configurations called twins. To do common kinds of twinning analyses, one needs more than the minimal requirements noted above. Roughly, one aim is to assume as little as possible, consistent with this desideratum. Another is to sort out information which is relevant to the X-ray theory. Here, I will present my view of two classes which seem to me to be interesting, from this perspective, for what are considered to be unstressed crystals. For obvious reasons, these emphasize statements that can be verified by X-ray observations alone, although these are not the only kinds of observations of interest. One class deals with a subset of twins I will call generalized type I twins, defined as follows:
(a) These satisfy Eq. (6.1), as interpreted above,
(b) $K_1$ is rational,
(c) There is a choice of reciprocal lattice vectors $\tilde e^a$ equivalent to $\bar e^a$ such that $\tilde e^a = (1 - n \otimes a) e^a$, $\tilde e^a = m^a_b \bar e^b = m^a_b Q e^b$, $m \in G$,
(6.2a-c)
where n is the unit normal to the twin plane and a is some vector. Here, Eq. (6.2a) just repeats Eq. (2.5), discussed earlier. The second equation merely mathematizes the assumption of equivalence. As is pointed out by Zanzotto (1988, Note II), it is always possible to pick lattice vectors so that two are parallel to the plane, which can help simplify analyses. It follows from these assumptions that
$\det\|1 - a \otimes n\| = 1 - a \cdot n = \pm 1$,
(6.3)
so, with the upper sign, $a \cdot n = 0$
(6.4)
and, with the lower, $a \cdot n = 2$.
(6.5)
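As a quick check of Eq. (6.3), the following minimal sketch evaluates $\det\|1 - a \otimes n\|$ for two sample vectors and compares it with $1 - a \cdot n$. The particular vectors and the helper names (`det3`, `one_minus_a_n`, `dot`) are illustrative assumptions, not taken from the paper; exact rational arithmetic is used so that rounding does not blur the comparison.

```python
from fractions import Fraction as Fr

def det3(A):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def one_minus_a_n(a, n):
    """Matrix of 1 - a (x) n, with entries delta_ij - a_i n_j."""
    return [[(1 if i == j else 0) - a[i] * n[j] for j in range(3)]
            for i in range(3)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical unit normal with rational components: |n|^2 = (1 + 4 + 4)/9 = 1.
n = [Fr(1, 3), Fr(2, 3), Fr(2, 3)]
a_upper = [Fr(2), Fr(-1), Fr(0)]                        # a . n = 0, Eq. (6.4)
a_lower = [2 * ni + ai for ni, ai in zip(n, a_upper)]   # a . n = 2, Eq. (6.5)

for a in (a_upper, a_lower):
    lhs = det3(one_minus_a_n(a, n))
    rhs = 1 - dot(a, n)
    print(f"a.n = {dot(a, n)}: det = {lhs}, 1 - a.n = {rhs}")
```

The two sample vectors are chosen to land on the upper and lower signs in Eq. (6.3), respectively.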
I note that Eq. (6.4) includes the possibility that a = 0, which is worth bearing in mind for growth twins, in particular Dauphiné twins in quartz. For these, the simplified model used by James (1987) seems to do quite well. Unlike the Brazil twins, mentioned in Section 2, these can be removed by mechanical treatments, as was discovered during World War II and later reported by Thomas and Wooster (1951). For this reason, it does seem sensible to use the same constitutive equation for both configurations. I interpret assumption (b) as $n = K_1/|K_1|$, $K_1 = k_a e^a$,
(6.6)
where $k_a$ are relatively prime integers. Then, using Eq. (6.2a), (6.4)-(6.6), we have $k_a \tilde e^a = K_1 - (a \cdot K_1)\,n = K_1 - (a \cdot n)\,K_1 = \pm K_1$,
(6.7)
which is commonly interpreted as acceptable matching of the two values of $K_1$. This does not assume that $\tilde e^a$ and $e^a$ are related by an isometry, as was discussed above, although it achieves the matching of indices. For this, it is not necessary that the $k_a$ be integers, and this will be important when we consider the second class. However, a useful result follows when they are. Suppose that Eq. (6.5) applies. With the $k_a$ being relatively prime integers, there are integers $l^a$ such that $k_a l^a = 1$,
(6.8)
by elementary number theory. Then, $m = \|\delta^a_b - 2 k_b l^a\| \in G$, with $m^2 = 1$, $\det m = -1$,
(6.9)
so we can introduce equivalent reciprocal lattice vectors given by $\hat e^a = m^a_b \tilde e^b = m^a_b (1 - n \otimes a) e^b$, $a \cdot n = 2$.
(6.10)
By a routine calculation, this gives $\hat e^a = (1 - n \otimes \hat a) e^a$,
(6.11)
where $\hat a = a - 2|K_1|\, l^a e_a \Rightarrow n \cdot \hat a = 0$.
(6.12)
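The "elementary number theory" step can be made concrete. The sketch below, under the assumption that the matrix in Eq. (6.9) has entries $\delta^a_b - 2 k_b l^a$, finds integers $l^a$ with $k_a l^a = 1$ by the extended Euclidean algorithm and then checks $m^2 = 1$ and $\det m = -1$. The indices `k` are a sample choice, and the function names are hypothetical helpers, not anything defined in the paper.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b) >= 0."""
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    g, x, y = ext_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def bezout3(k):
    """Integers l with k[0]*l[0] + k[1]*l[1] + k[2]*l[2] == 1."""
    g12, x, y = ext_gcd(k[0], k[1])
    g, u, v = ext_gcd(g12, k[2])
    if g != 1:
        raise ValueError("the k_a must be relatively prime")
    return (x * u, y * u, v)

def m_matrix(k, l):
    """m[a][b] = delta^a_b - 2 k_b l^a (assumed reading of Eq. (6.9))."""
    return [[(1 if a == b else 0) - 2 * k[b] * l[a] for b in range(3)]
            for a in range(3)]

def matmul(A, B):
    return [[sum(A[i][p] * B[p][j] for p in range(3)) for j in range(3)]
            for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

k = (4, -3, 2)            # sample relatively prime indices (illustrative)
l = bezout3(k)
m = m_matrix(k, l)
print("l =", l, " k.l =", sum(p * q for p, q in zip(k, l)))
print("m^2 == 1:", matmul(m, m) == IDENTITY, " det m =", det3(m))
```

The integers $l^a$ returned are only one of infinitely many solutions; any other choice differs by integers $r^a$ with $r^a k_a = 0$.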
Similarly, if Eq. (6.4) holds, we can transform it in a similar way to have Eq. (6.5) apply, so the two versions are equivalent, in this sense, when $K_1$ is rational. This has its limits. There are type II twins which are in this class, the compound twins. For these, the directions of a or $\hat a$ can be obtained from X-ray observations, as was mentioned in Section 3, implying that a and $\hat a$ are not always physically equivalent. However, one can still use the idea to transform twinning equations using Eq. (6.4) to equivalents using Eq. (6.5), or vice versa, which might be helpful for theoretical studies. It is a common notion that compound twins can be analyzed as type I and as type II. As was mentioned in footnote 1, this is subject to interpretation, as different definitions of the types are in the literature. Properly interpreted, this is true, as far as the lattice vectors are concerned, as is discussed by Zanzotto (1988). Ponder what this presumes about shifts, and you should see that this might not always be true for n lattices. However, my experience is that, in practice, when $K_1$ and $\eta_1$ are reported as rational, it does mean that they can be analyzed either way and,
soon, I will explain what I see as the reasons for this. Those accustomed to using the mathematical definitions of rational and irrational numbers should be aware of the fact that this is not exactly what workers in this area mean by the words. To them, it seems to be like picking a number at random from some interval, when you do not know in advance that it must be rational. Since the rationals are only a countable subset of the real numbers, the number picked will be irrational, almost certainly. In the remainder of this paper, I will consider only cases for which Eq. (6.4) holds. Then, Eq. (6.2a) is equivalent to $\tilde e_a = (1 + a \otimes n) e_a$,
(6.13)
used earlier and, with Eq. (6.1b), we get what looks like the more standard twinning equation $\tilde e_a = \bar m_a^b Q e_b = (1 + a \otimes n) e_a$, $\bar m = m^{-1}$.
(6.14)
However, when the Cauchy-Born rule fails, this equation can still be used, but it cannot be satisfied for a = b, the vector indicated in Eq. (2.8). As will be described later, various workers then use a variation on Eq. (6.14), with a replaced by b. From this view, my procedure is not conventional. Some workers interested in mechanical twins use arguments about shearing deformations to motivate using Eq. (6.14). To me, this muddies the water. If the reasoning were sound, Eq. (6.14) should be satisfied when we take as 1 + a
$R = -1 + 2n \otimes n$.
(6.15)
While analysis of Eq. (6.14) for these is very familiar to those who have done twinning calculations,⁴ I will belabor it, to make some points. First, note that, if it is satisfied with $(R, \bar m)$, it is also with $(-R, -\bar m)$ although, in general, only one of these will be consistent with Eq. (6.1c). However, for (monatomic) one and two lattices, both hold when one does, and this includes our example. Taking the former, consider the equivalent $H e_a = \bar m_a^b e_b$, $H \overset{\text{def}}{=} R(1 + a \otimes n)$,
(6.16)
and verify that $H^2 = 1 \Rightarrow \bar m^2 = 1$, $\det H = 1 \Rightarrow \det \bar m = 1$, $H^T n = n \Rightarrow \bar m_a^b k_b = k_a$.
(6.17)
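The three properties of H collected in Eq. (6.17) can be checked directly for a sample shear. This is a minimal sketch, assuming $R = -1 + 2n \otimes n$ and $H = R(1 + a \otimes n)$ with $a \cdot n = 0$; the vectors `n` and `a` are illustrative rational choices, not data from the paper.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    return [[sum(A[i][p] * B[p][j] for p in range(3)) for j in range(3)]
            for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical unit normal and a shear vector orthogonal to it.
n = [Fr(1, 3), Fr(2, 3), Fr(2, 3)]
a = [Fr(2), Fr(-1), Fr(0)]

# R = -1 + 2 n (x) n is the 180 degree rotation about n.
R = [[-I3[i][j] + 2 * n[i] * n[j] for j in range(3)] for i in range(3)]
# S = 1 + a (x) n is the simple shear.
S = [[I3[i][j] + a[i] * n[j] for j in range(3)] for i in range(3)]
H = matmul(R, S)

# H^T n, computed componentwise.
HT_n = [sum(H[j][i] * n[j] for j in range(3)) for i in range(3)]

print("H^2 == 1:", matmul(H, H) == I3)
print("det H   :", det3(H))
print("H^T n == n:", HT_n == n)
```

The same checks fail if `a` is given a component along `n`, which is one way to see why Eq. (6.4) is needed here.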
⁴ See Zanzotto (1988, Note 1) and Pitteri (1985b), for example.
It seems to be consistent with experience that type I twins always have $K_1$ rational, as is required for Eq. (6.16) to have any solutions. However, with the inevitable experimental errors, measurements cannot really confirm or contradict this. Thus, necessarily, workers rely on some theory to decide this, using equations more or less like Eq. (6.16). The conditions on $\bar m$ require that, for some set of integers $l^a$ satisfying Eq. (6.9), $\bar m_a^b = -\delta_a^b + 2 l^b k_a$,
(6.18)
one point being that X-ray observations of type I twins that are not compound can yield values of $k_a$, but not of $l^a$ or of the direction of the vector a. So, here, we can use Eq. (6.4) and regard it as physically equivalent to Eq. (6.5). Also, if Eq. (6.9) is satisfied by $l^a$, it is also satisfied by $\bar l^a = l^a + r^a$, $r^a k_a = 0$,
(6.19)
where the $r^a$ must be integers, of course. For any choice of these integers, there is a vector a such that Eq. (6.14) is satisfied, given by $a = 2(n - |K_1|\, l^a e_a) \Rightarrow a \cdot n = 0$.
(6.20)
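To see Eqs. (6.16), (6.18) and (6.20) working together, the sketch below builds the vector a from sample integers $k_a$, $l^a$ and a sample lattice, then checks that $H e_a$ equals $(-\delta_a^b + 2 l^b k_a) e_b$ and that $a \cdot n = 0$. The lattice vectors, the indices and all function names are hypothetical choices made for illustration; floating point is used, so agreement is only to rounding error.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def reciprocal(e):
    """Reciprocal vectors e^a with e^a . e_b = delta^a_b."""
    vol = dot(e[0], cross(e[1], e[2]))
    return [[c / vol for c in cross(e[(a + 1) % 3], e[(a + 2) % 3])]
            for a in range(3)]

def combo(coeffs, vecs):
    """Linear combination sum_a coeffs[a] * vecs[a]."""
    return [sum(coeffs[a] * vecs[a][i] for a in range(3)) for i in range(3)]

def shear_vector(e, k, l):
    """Return (n, a): n = K1/|K1| and a = 2(n - |K1| l^a e_a), Eq. (6.20)."""
    K1 = combo(k, reciprocal(e))
    norm = math.sqrt(dot(K1, K1))
    n = [c / norm for c in K1]
    le = combo(l, e)
    return n, [2 * (n[i] - norm * le[i]) for i in range(3)]

def apply_H(n, a, x):
    """Apply H = R(1 + a (x) n), with R = -1 + 2 n (x) n, to x."""
    y = [x[i] + a[i] * dot(n, x) for i in range(3)]
    return [-y[i] + 2 * n[i] * dot(n, y) for i in range(3)]

def residual(e, k, l):
    """Largest component of H e_a - (-delta_a^b + 2 l^b k_a) e_b."""
    n, a = shear_vector(e, k, l)
    worst = 0.0
    for i in range(3):
        coeffs = [(-1 if i == b else 0) + 2 * l[b] * k[i] for b in range(3)]
        lhs, rhs = apply_H(n, a, e[i]), combo(coeffs, e)
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
    return worst

e = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.3, 0.0, 1.5]]  # hypothetical lattice
k = (1, 3, 0)                  # sample indices of K1
l1 = (1, 0, 0)                 # k . l1 = 1
r = (3, -1, 0)                 # r . k = 0, as in Eq. (6.19)
l2 = tuple(p + q for p, q in zip(l1, r))

for l in (l1, l2):
    n, a = shear_vector(e, k, l)
    print(l, "a.n =", round(dot(a, n), 12), "residual =", residual(e, k, l))
```

Both choices of $l^a$ satisfy the twinning equation with the same normal n but different vectors a, which illustrates the ambiguity by a lattice invariant shear.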
From Eq. (6.20), it is easy to see that replacing $l^a$ by $\bar l^a$ amounts to adding in a lattice invariant shear, not detectable in X-ray observations. So, understandably, theory gives us this ambiguous estimate of a. Here, $K_1$ is the only twinning element used, although one does need more information to determine that a twin is of type I. Excepting compound twins, I do not see how one could determine other elements from X-ray observations alone, without adding some hypotheses, for twins in this class: sometimes, workers do add such hypotheses, to estimate other elements. Now, how is $\eta_1$ determined? According to most expositions, it describes the direction of the vector b in Eq. (2.8). I have come to believe that this is not the only interpretation used, in practice. Try the following experiment: Select any twinning table, and ignore all entries for which $\eta_1$ is described as rational. I will deal with these later. This will leave you with a much shorter list: if there are none left, try another table. Now, select one such mode. It is rather likely that this will give no quantitative information for $\eta_1$. To understand this, it might help to consider a case history. Cahn (1953) did pioneering work in determining twinning elements for the modes he observed in α-uranium, and he used some ingenuity in doing so. For example, the data in Eq. (3.3) are his, but we are ignoring these. However, his observations of {112} twins are relevant. He concluded that these are type I twins, with $\eta_1$ irrational, but was unable to get quantitative estimates of it. One might think it a simple matter to measure the direction of b but, in practice, it can be very difficult and, for him, it was not feasible to get good data of this kind. To infer that $\eta_1$ is irrational, he used an indirect argument. He also observed "{1$\bar{7}$2}" twins, to be analyzed later, and concluded that these two modes are conjugate, using theory relating to this to draw this conclusion.
This gives enough information to enable you to make a theoretical estimate of $\eta_1$, if you want to. Thus, these experiments really gave no information about $\eta_1$, and this seems not to be an extremely unusual difficulty. Often, if one mode is observed, its conjugate is not, as seems to be the case for the {130} twins, for example. Thus, one cannot always use the reasoning based on this, which helped Cahn. So, an answer to the question posed is that $\eta_1$ is not always determined and, if it is, it might be by a theoretical estimate, or by some experiment. The set of integers $r^a$ can always be represented parametrically, with two arbitrary integers as parameters. In our example, we saw only one. A calculation shows that, had this not been compound, or had we not noticed this, the shear in Eq. (3.31) should have been replaced by $1 + [m'(2e_1 + e_2) + r e_3] \otimes (2e^2 - e^1)$,
(6.21)
where r is an arbitrary integer. A calculation shows that r = 0 for the minimum shear. So, in this case, using this would cancel the error of omission, since it is compound. In other cases involving type I twins, it certainly is better practice to determine whether the twin is compound. For compound twins, the relevant $\bar l^a$ in Eq. (6.19) reduce to a one parameter family, as in our example. For the X-ray theory, I believe that Eq. (6.16) is reliable for locating the possible energy wells associated with the twins considered. However, only a small number of these are likely to be relevant, physically.
When it is feasible to include some in a Pitteri neighborhood, it seems to be a good rule to use these. However, there are many cases for which this is impossible. So, we need, but do not yet have, a good alternative for such cases. Certainly, this is a road block that needs to be surmounted, to develop a good theory of twin patterns and related microstructures. Later, I will mention a few thoughts about this. From Eq. (6.20) follows another point, that, for type I twins that are not compound, Eq. (6.2a) is redundant, and observations agree with the view that all of these are generalized type I twins. That is, given a rational $K_1$ and any set of lattice vectors, Eq. (6.16) can always be satisfied. For compound twins, when $K_1$ and $\eta_1$ are determined by X-ray observations, one should confirm that these are compatible with Eq. (6.2a). If not, I would conclude that, most likely, there is either some fault in the experiments or in the interpretation of them, but I am biased. Shortly, I will indicate how $\eta_1$ has been determined using X-ray observations, for some twins. Most twins are either of type I or can be described as being both of type I and of type II, so most observed twins are in this class. Now, I turn to the second class, consisting of what I will call generalized type II twins. In particular, this covers the less common observations of type II twins with $K_1$ considered to be irrational. By private communication, Richard James informs me that these are in fact rather common in copper based shape-memory alloys. As was mentioned before, the Cauchy-Born rule seems to apply to all shape-memory alloys. Examples of such twins in other kinds of crystals seem to be rare. To define the class, simply replace statement (b) in Eq. (6.2a-c) by (b)' $K_1$ is irrational.
(6.22)
This allows for the possibility of observing examples not of type II and, later, I will mention a recently discovered example. One does run into rather different problems in analyzing the generalized type II twins, so there is some reason to put them in a separate category. Before, I mentioned the interpretation of $\eta_1$ commonly found in expositions. However, in practice, I find that another one is also used, which is not obviously equivalent. For the following discussion, I will take at face value the following statement by a well-known experimentalist, Cahn (1953), as describing how $\eta_1$ can be determined, for type II twins: "For a twin of the second kind, the orientations of parent and twin are related by a rotation of 180° about $\eta_1$ as axis." Certainly, he did use this, in estimating $\eta_1$ in cases where he could not get good measurements of deformation. So, what is to be determined experimentally is this axis, something that can be determined using X-ray observations, at least in principle. After pondering this and other bits of evidence, I concluded that the best way to mathematize this is as follows: Proceed by satisfying Eq. (6.14) with $Q = \pm R$, $R = -1 + 2\nu \otimes \nu$, $\nu = a/|a|$,
(6.23)
this being what I take as a definition of type II twins, for the X-ray theory. Except for allowing Q = -R, it is just my interpretation of the quotation above. In this, I also include compound twins. As suggested by the quotation, I will consider $\eta_1$ as describing the direction of a. For analyzing the twinning equation, it is enough to consider the upper sign, as before. I still use Eq. (6.4), which seems to fit the examples observed. This differs from the previous case in that ν and n can be obtained from X-ray observations, with some experimental errors. Here, it is known⁵ that for Eq. (6.14) to be soluble, it is necessary that $\nu = \eta_1/|\eta_1|$, $\eta_1 = t^a e_a$,
(6.24)
⁵ See Zanzotto (1988), for example.
where the $t^a$ are relatively prime integers. That is, $\eta_1$ must be rational. With this or some similar guide, workers will pick the numbers to fit the data to within the errors in these. It is also known that the equation does not require $K_1$ to be rational. The practice is to call it irrational unless there is some good reason to believe that it is rational. The likely reason is that it can be described equally well as a type I twin. Then, the equations require that both $K_1$ and $\eta_1$ be rational. I believe that this is the real reason why these elements are always reported this way, for what are called compound twins. In twinning tables, I have not yet found an example of a type I twin which is not compound, for which $\eta_1$ is reported as rational. So, my impression is that $\eta_1$ is judged to be irrational for any type I twin that is not compound. This makes it a pretty safe bet that, if these entries are reported as rational, it implies that the twins can be analyzed as type I or as type II. As I said, I am interpreting $\eta_1$ as representing the axis of rotation. As was noted before, from the X-ray observations alone, there is no reliable theory for determining the vector b in Eq. (2.8). In particular, Eq. (6.14) is not always satisfied with a = b. So, if we insist that $\eta_1$ represents the direction of b, how do workers conclude that $\eta_1$ is rational, in these cases? Certainly, workers accept this conclusion and, later, I will explain how they use a variation on the twinning equation mentioned earlier, to obtain such conclusions. For these twins, one could assume that $b \parallel a$, a condition weaker than a = b, which seems to apply to twins observed in α-uranium, at least, avoiding an inconsistency with the idea that $\eta_1$ represents the shear direction. I just do not know whether twins are observed which violate this assumption. I think that failure of this assumption is a theoretical possibility, and that it might avoid confusion if we were to use different notations for the two interpretations.
Of course, this assumption is not really relevant to the X-ray theory, but one interpretation of $\eta_1$ is. Until someone proves me wrong, I will assume that the values reported in tables are consistent with my definition. I think it clear that X-ray observations should be used to determine the direction of this axis and, if only such observations are available, this is how $\eta_1$ would be determined, in practice. So, it seems to me to be a good definition for the X-ray theory, at least. One could make the condition that $b \parallel a$ part of the definition of type II twins. However, experimentalists can only check this approximately, with a margin of error which is not always so small. Also, this does not make sense for growth twins, or for twins for which it is not known how they formed. So, I do not like this idea. From Eq. (6.4), there is some scalar α, such that $a = \alpha\eta_1$.
(6.25)
Then, one question is whether one can always choose α so that Eq. (6.14) is satisfied, with Q = R, given by Eq. (6.23) and a by Eq. (6.25). This reduces to the question of whether one can always find relatively prime integers $s_a$ satisfying $s_a t^a = 1$
(6.26)
such that $\alpha n + 2\eta_1/|\eta_1|^2 = 2s \overset{\text{def}}{=} 2 s_a e^a$
(6.27)
is satisfied for some value of α. It is not hard to show that one can pick lattice vectors, integers $t^a$ and a unit vector $n \perp \eta_1$, for which this is impossible. So, one could argue that, hypothetically, there are type II twins for which Eq. (6.14) is not satisfied. Of course, with my definition of type II, this is impossible. I doubt that workers using another would accept the possibility that one will encounter a realization of this, in nature. This does make it desirable to explore a realistic example. For α-uranium, there are the type II "{1$\bar{7}$2}" twins, with $\eta_1 = \langle 312 \rangle$, the quotation marks indicating that this is a rational approximation to $K_1$, considered to be irrational. These data were produced by Cahn (1953), so $\eta_1$ represents the axis of rotation. The entries for these two elements agree with Cahn's and listings in tables presented by Hall (1954) and Klassen-Nekliudova (1964), for example, but the table given by Barrett and Massalski (1966) gives $K_1$ = "{172}" and the same $\eta_1$. This is not consistent with the fact that these directions are orthogonal. The table
presented by Christian and Mahajan (1995) also does this and puts the entries for $\eta_1$ and $\eta_2$ in the wrong places. Now, it is easy to show that, for this mode, the Cauchy-Born rule fails to apply. Essentially, the basic idea is what I used to get (3.16). For this, one uses twinning elements describing shear. In Zanzotto (1992), a list of linear transformations for α-uranium, it is $H_2$ that is associated with this mode. He uses the usual four-lattice description. Apply this to these lattice vectors and express the result as a linear combination of the lattice vectors. Do it again, using the essential description. For the Cauchy-Born rule to apply, at least the latter coefficients should be integers, forming an element of G. In both cases, one finds that they are rational numbers, but not integers. Actually, various workers consider the twinning equation to be generalized, allowing such rational numbers as well as integers, this being the variation on the twinning equation mentioned earlier. Here, the equation is phrased as one involving the shear deformation. This does not change conclusions about which twinning elements are rational. Essentially, this is a way of describing the observations supporting the view, mentioned in Section 2, that one can always find a sublattice to which the Cauchy-Born rule applies. If you perform the above calculations, you should be able to find the sublattices for the two descriptions, and determine whether these are the same or different. Here, what I am doing is unconventional, assuming that Eqs. (6.14) and (6.23) apply, with an unconventional interpretation of these, despite the failure of the Cauchy-Born rule. I do believe that this is sound. Of course, I concede that it is possible that someone could find clear evidence that my belief is wrong. Those interested in constitutive theory need to be aware of this, and try to take it into account.
When I considered this, I concluded that a sensible form of constitutive equations for the energy density is what I used for the X-ray theory, so I am interested in learning what can be done with it. My view is that, if identical atoms somehow exchange positions, this does not affect the energy. Of course, using the X-ray theory does not preclude the common practice of introducing other hypotheses to relate the shear deformation to lattice vectors and shifts. Simply, I do not see any good way of relating these to a sound theory of constitutive equations. This is an important and challenging open problem, in need of a good solution. I will start by taking the approximation as exact. The indices refer to the four-lattice description described in Section 3. Converting them to the essential description used there, I get $K_1 = 4e^1 - 3e^2 + 2e^3$,
(6.28a)
$\eta_1 = e_1 + 2e_2 + e_3$,
(6.28b)
$s = s_a e^a$,
(6.28c)
with the integers $s_a$ satisfying $s_a t^a = \eta_1 \cdot s = s_1 + 2s_2 + s_3 = 1$.
(6.29)
Then, Eq. (6.27) gives three equations for α: one can use the information on the lattice vectors given in Section 3 to calculate the necessary entries. By routine calculation, I get, as conditions for these to admit a solution for α, $2(s_1 + s_2) - s_3 = x \overset{\text{def}}{=} (12a^2 - 4c^2)/z$,
(6.30)
and $s_1 - 2s_3 = y \overset{\text{def}}{=} (3a^2 - b^2 - 8c^2)/z$,
(6.31)
where $z = 9a^2 + b^2 + 4c^2$.
(6.32)
It is easy to pick numbers a, b and c such that x is not very close to an integer, for example, so these equations are violated. However, using data presented by Barrett and Massalski (1966, p. 170), I calculate that, approximately, $x = -0.001$, $y = -1.001$,
(6.33)
quite close to x = 0, y = -1. So, interpret this as an error in what we took for $K_1$, but take the values of $s_a$ suggested by this, which are $s_1 = 4n - 1$, $s_2 = 1 - 3n$, $s_3 = 2n$,
(6.34)
where n is any integer, and $t^a$ can be read off from Eq. (6.28b). Again, this would amount to adding lattice invariant shears of a particular kind, invisible to X-rays, if the starting estimate of $K_1$ had satisfied the twinning equation. For any choice of the integer n, one can solve the twinning equation for the unit normal $\mathbf{n}$ and α, thereby getting an infinite number of possibilities for both. It is not hard to check that all these directions are very close to that given by Eq. (6.28a), when n is very large, and that α is very large when n is. Define $\kappa_a$ by $\alpha\mathbf{n} = \kappa_a e^a \Rightarrow \kappa_1 + 2\kappa_2 + \kappa_3 = 0$.
(6.35)
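The integer conditions in Eqs. (6.29)-(6.34) are easy to check mechanically. The sketch below assumes that the left side of Eq. (6.31) reads $s_1 - 2s_3$, the reading consistent with Eqs. (6.29), (6.30) and (6.34), since the definitions of x, y and z give $x - y = 1$ identically. The lattice parameters used are rough illustrative values for α-uranium in ångströms, assumed here for the example; they are not the Barrett and Massalski data, so the numbers will differ somewhat from Eq. (6.33).

```python
from fractions import Fraction as Fr

def x_y(a, b, c):
    """x and y of Eqs. (6.30)-(6.32) for lattice parameters a, b, c."""
    z = 9 * a * a + b * b + 4 * c * c
    return (12 * a * a - 4 * c * c) / z, (3 * a * a - b * b - 8 * c * c) / z

def s_family(n):
    """The one-parameter family of integers s_a in Eq. (6.34)."""
    return (4 * n - 1, 1 - 3 * n, 2 * n)

def conditions(s):
    """Left sides of Eq. (6.29), Eq. (6.30) and (assumed) Eq. (6.31)."""
    s1, s2, s3 = s
    return (s1 + 2 * s2 + s3, 2 * (s1 + s2) - s3, s1 - 2 * s3)

# Assumed, approximate lattice parameters (illustrative only).
a, b, c = Fr(2854, 1000), Fr(5870, 1000), Fr(4955, 1000)
x, y = x_y(a, b, c)
print("x =", float(x), " y =", float(y), " x - y =", x - y)

# Every member of the family meets the conditions with x = 0, y = -1.
for n in range(-2, 3):
    print(n, s_family(n), conditions(s_family(n)))
```

With x and y rounded to 0 and -1, each triple returned by `conditions` should read (1, 0, -1), independent of the integer n.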
As an example of numbers obtained for n small, I calculate that „ = 0 => K3/K{ = 0.5005,
(6.36)
also quite close to the starting value of 1/2. From this, it is pretty clear that, while different values of the integer n will give different values of $\mathbf{n}$, they are not very different, although the values of a can differ considerably. Of course, values of these depend on values of a, b and c, which are subject to some experimental error, changes of temperature, etc. There is no obvious reason why the relevant combinations of these should be rational numbers, so $K_1$ is regarded as irrational. From this exercise, it does seem pretty clear, on the face of it, that workers have done calculations somewhat similar to mine, to draw their conclusions about $K_1$ being irrational. Given these calculations, I seriously doubt that experimentalists can get X-ray data sufficiently accurate to pick out a particular value of n. However, I would be happy to be proven wrong about this. Even so, I would like to see some theoretical reason for picking one. Earlier, I noted that, for this mode, in particular, the Cauchy-Born rule fails to apply, so we cannot expect to get a reliable determination of n, by using shear data. Another theoretical possibility is to try to determine it so that the twin is contained in some Pitteri neighborhood. This involves satisfying three equations like Eq. (3.33), the difference being that the $m_1$ used there is replaced by
$$m_2 = \|-\delta_a^b + 2 s_a t^b\| \Rightarrow m_2^2 = 1, \quad \det m_2 = 1. \tag{6.37}$$
However, it is not hard to show that there are no lattice vectors satisfying these conditions for any permissible values of the $s_a$, so these twins cannot be included in any Pitteri neighborhood.
(6.38)
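To make the remark about rational approximations concrete, here is a minimal sketch, assuming the ratio 0.5005 quoted in Eq. (6.36) is taken as an exact decimal. The denominator bounds are hypothetical choices made only for illustration; they are not taken from the paper.

```python
from fractions import Fraction

# Ratio quoted in Eq. (6.36), treated as an exact decimal (an assumption).
ratio = Fraction("0.5005")


def best_rational(x, max_denominator):
    """Closest fraction to x whose denominator does not exceed max_denominator."""
    return x.limit_denominator(max_denominator)


# Hypothetical bounds on the size of the integers one is willing to accept.
bounds = [10, 100, 1000, 2000]
approximations = {q: best_rational(ratio, q) for q in bounds}

for q, frac in approximations.items():
    error = abs(float(frac - ratio))
    print(f"denominator <= {q:4d}: {frac}  (error {error:.2e})")
```

With small integers the best approximation stays at 1/2, and a closer match requires denominators near one thousand. This is the sense in which a measured ratio can be indistinguishable from a simple rational value at small n.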
At least for the present, I do not see another possibility that is easy to assess. However, there is another speculative line of thought that seems to me to be promising enough to mention, although it would take some hard work to firm it up and assess it. Now, I return to the speculation. First, Cahn (1953), in his work on determining twinning elements for α-uranium, presented a number of photographs of patterns of twins he observed, including some involving the modes considered in our examples. Mostly, these involve differently oriented twins. For example, he found that "{172}" twins can intersect each other, also that they can intersect {130} twins. So, these are among the patterns that the X-ray theory should treat. I have not tried to collect other information of this kind that is available, but this is enough to supply some motivation for the discussion.
J.L. Ericksen / International Journal of Solids and Structures 38 (2001) 967-995
Now, in common practices, I observe an intuitive prejudice concerning analyses of twins. Roughly, it is a common notion that interactions involving different energy wells will involve only wells which are very close to each other. In one way or another, various workers use this idea, rather successfully, on the whole. In our first example, we used two versions, one being the minimum shear assumption. The other is more of a topological nature. With twins that can be included in a neighborhood, it has become routine to use this to locate neighboring wells, this being the second version used in the first example. Other variations are used, in dealing with deformation. Mathematically, it is at best unclear that such neighborhoods always include the wells closest to one, and it would not surprise me if someone produced a counterexample, with a reasonable interpretation of "closest". Pragmatically, it has been a successful selection criterion, for cases to which it applies. For the {130} twins, we used this rule to get the three variants discussed before and experience suggests that the corresponding wells are likely to suffice, to describe their role in patterns of twins. Clearly, this idea is of no help, in deciding which wells to use, to try to analyze patterns involving the "{172}" twins, in particular, and we know that these can also involve {130} twins. Roughly, what the indicated thoughts suggest to me is selecting the "{172}" wells closest to those of the {130} variants, and rejecting the rest. As a general strategy, I like this way of getting rid of most or all of the ambiguities associated with that arbitrary integer, etc. The difficulty is that it is not obvious exactly how best to accomplish this, or to forecast how well it might do, in delivering wells needed for satisfactory analyses. Let us think a bit more about this, in general terms. For one variant, we did determine that infinite set of solutions of the twinning equations. Using Eqs. (2.14) and (6.1a,b,c), one can calculate all possible shifts, to complete these solutions. We should also determine the corresponding sets for the other two variants. This is routine, a matter of applying transformations to the first set. If we like, we can take orbits of these under SO(3). At present, I have not firmly decided whether to do so, but I lean toward it. Either way, one should take each {130} variant, look at the entire list of "{172}" possibilities, and pick out the one(s) closest to it. Here, we have another uncertainty, "closest" being subject to interpretation. There are various possible norms that could be used for this. For reasons not yet very clear to me, I think that some are better than others, so I am not ready to make a definite proposal for this. I do not mean to exclude the possibility that some more topological interpretation might be best, but I have no concrete suggestions for this. Obviously, one needs to make definite decisions about these uncertainties, and determine the results. I believe that it should be possible to do so, but the best way to confirm this is to produce the results. So, my proposal is rather vague and speculative. Granted a successful outcome, one should take the orbit of the selected descriptions under SO(3). Suppose that we have done all this. Then, we have a collection of energy wells, to be used in trying to analyze patterns involving one or both of these modes. There is no guarantee that the procedure will deliver the wells needed to properly analyze the observed patterns, so this needs to be explored. For this reason also, my proposal is speculative. The common practice of using only the wells included in a Pitteri neighborhood is subject to the same reservation but, in practice, it has worked well, giving us some reason to be optimistic about this. For whatever it is worth, Frank (1953) used a somewhat similar idea, to compare twins in α-uranium with those in zinc.
If the procedure works well for these modes, one could try including other observed modes, in a similar way. So, with some luck, this could become a useful partial theory of twins in α-uranium. If I had an idea which I thought would be more likely to be successful than this, I would have discussed it instead. For this, it seems necessary to assume that $\varphi$ is invariant under an infinite group, unless someone finds a clever trick to evade this. However, for typical analyses, it is not necessary to specify this function, although one is likely to need it in the future. As I have not found a good way of selecting that arbitrary integer etc., this is, for me, a problem not yet solved. For this reason, I have referred to my analysis of these twins as being only a partial one. Whether or not the suggested procedure is successful, I have barely scratched the surface in constructing a useful theory of twinning in this material. Here, my aim is more to illustrate the issues arising in realistic
cases, and to describe the tools we have for dealing with X-ray observations of twins. The two quite different examples seem to me to be good for this. I think it worthwhile to work out additional realistic examples, to better illustrate all of the kinds of difficulties that do arise, and need to be dealt with, perhaps by creating other kinds of tools. For twins with $K_1$ irrational, it is not immediately obvious how this should be defined. As best as I can estimate the consensus of opinion about this, it is to use ratios of components of $\mathbf{n}$. So, for our example, we might use
$$K_1 \cong \{1, \kappa_2/\kappa_1, \kappa_3/\kappa_1\}. \tag{6.39}$$
A different kind of example of a generalized type II twin occurs in orthoniobate. These are transformation twins, associated with a second-order phase transition. I have not studied the literature on these, except that I did look carefully at the analysis of observed microstructures by Jian and James (1997), which agrees well with the experimental data. This alone is rather good evidence that these are neither of type I nor of type II, with $K_1$ and $\eta_1$ both being irrational. This is the first such twin to be observed, as far as I know. This is one of those nice cases where the Cauchy-Born rule applies and the twins can be included in a neighborhood. The former is expected, this being a shape-memory material. The latter is rather obvious from the fact that these twins are associated with a second-order phase transition. Then, it is not really necessary to use the X-ray theory, thermoelasticity theory being adequate for analyzing these. This is what Jian and James (1997) do. Of course, this theory ignores the shifts. I do not really doubt that they are arranged properly and, otherwise, these twins do qualify as generalized type II twins that are not of type II. I have not tried analyzing these, using the X-ray theory, but would expect to encounter ambiguities similar to those encountered in the last example, associated with lattice invariant shears. It might be worthwhile to do so, to understand how the nice features in this case enable one to eliminate these. Or, perhaps, they are not completely eliminated, but leave us with inconsequential ambiguities. There is the common idea that, if $K_1$ and $\eta_1$ are rational, they can be treated as either type I or type II. As I explained before, this is, essentially, a tautology, if you recognize how it is decided that they are rational.
This seems to cover the observations of generalized type I twins, but I have not studied the mathematical possibility of including others, for which $K_1$ should be judged to be rational, by similar reasoning. I should say that, in practice, some workers say that calling $K_1$ irrational really means that one needs rather large integers $k_a$ to match the measurements, within experimental error. Apart from the fact that this seems to require some subjective judgment, and makes the theoretical distinction fuzzy, I do not really object to this. Perhaps, some of them will not like my explanation of this. I believe that the last α-uranium example is fairly typical, illustrating the kinds of issues which will arise, when one tries to analyze type II twins with $K_1$ irrational. Another such twin of this kind, also observed in α-uranium, involves an additional ambiguity. With $\eta_1 = \langle 512 \rangle$, it is either the "{197}" or the "{176}" mode. Some writers use one, some the other, and a few persons mention both. Of course, rational approximations are not unique, but these seem to me not to be very close to each other. Some workers mention that the rational approximation for the "{172}" mode is unusually good. I have not tried analyzing these, so am not sure what problems are created by the added ambiguity. Generally, some twins of this kind might be simpler than others, because they are contained in a Pitteri neighborhood and/or conform to the Cauchy-Born rule, for example. Finally, it is fair to ask whether there are things called twins which are not in either of my classes. I know of no observations of mechanical twins of this kind, but there are some such examples of growth twins, as is rather clear from the discussion by Zanzotto (1988). For example, he mentions cases of growth twins in some materials, for example alum, which do not even meet what I took as minimal requirements: they involve inequivalent crystallographic planes on the two sides of the twin plane.
Thus, they cannot be listed in the usual way, in twinning tables. Frankly, I do not see how to phrase a general definition of a twin which includes such cases and also excludes the grain boundaries in polycrystals, for example, and rather different
kinds of theory are used for the latter. So, I would not call these twins, but I seem to be alone in this. I do not really object to having a third category of twins, allowing for any not in either of my classes. It is only that less theory is available for these. It would be nice for theorists if workers would agree on a general definition of twins, but I am not very optimistic about this. This covers my thoughts on what might be viewed as elementary twinning analyses, according to the X-ray theory. Hopefully, this is enough to make clear how well these fit common practices, and to give some idea of what the X-ray theory can and cannot do. For nonspecialists, I have tried to point out practices which seem to me to be somewhat confusing.
Acknowledgements
I thank Richard James, Mario Pitteri and Giovanni Zanzotto for some very helpful suggestions, and Marion Ericksen, for her help in getting the typing done.

References

Adeleke, S.A., 1999. On the classification of monoatomic crystal multilattices. submitted for publication.
Ball, J.M., James, R.D., 1992. Proposed experimental tests of a theory of fine microstructures and the two-well problem. Philosophical Transactions of the Royal Society of London 333A, 389-450.
Barrett, C.S., Massalski, T.B., 1966. Structure of Metals. Third edn., McGraw-Hill, New York.
Bhattacharya, K., Firoozye, N.B., James, R.D., Kohn, R.V., 1994. Restrictions on microstructures. Proceedings of the Royal Society of Edinburgh 124A, 843-878.
Born, M., 1923. Atomtheorie des festen Zustandes. Second edn., B.G. Teubner, Leipzig.
Cahn, R.W., 1953. Plastic deformation of alpha-uranium; twinning and slip. Acta Metallurgica 1, 49-67.
Christian, J.W., Mahajan, S., 1995. Deformation twinning. Progress in Materials Science 39, 1-157.
Ericksen, J.L., 1997. Equilibrium theory for X-ray observations. Archive for Rational Mechanics and Analysis 139, 181-200.
Ericksen, J.L., 1998. On non-essential descriptions of crystal multi-lattices. Journal of Mathematics and Mechanics of Solids 3, 363-392.
Ericksen, J.L., 1999. On groups occurring in the theory of crystal multi-lattices. Archive for Rational Mechanics and Analysis 148, 145-178.
Frank, F.C., 1953. A note on twinning in alpha-uranium. Acta Metallurgica 1, 71-74.
Hall, E.O., 1954. Twinning and diffusionless transformations in metals. Butterworths, London.
James, R.D., 1987. The stability and metastability of quartz. In: Antman, S., Ericksen, J.L., Kinderlehrer, D., Müller, I. (Eds.), Metastability and Incompletely Posed Problems. Springer, New York, pp. 147-175.
James, R.D., Kinderlehrer, D., 1989. Theory of diffusionless phase transformations. In: Rascle, M., Serre, D., Slemrod, M. (Eds.), Lecture Notes in Physics, vol. 344. pp. 51-84.
Jian, L., James, R.D., 1997. Prediction of microstructure in monoclinic LaNbO4 by energy minimization. Acta Materialia 45, 4271-4281.
Kelly, A., Groves, G.W., 1970. Crystallography and crystal defects. Addison-Wesley, Reading, MA.
Klassen-Nekliudova, M.V., 1964. Mechanical twinning of crystals. Consultants Bureau, New York.
Parry, G.P., 1998. Low-dimensional lattice groups for the continuum mechanics of phase transitions in crystals. Archive for Rational Mechanics and Analysis 145, 1-22.
Pitteri, M., 1984. Reconciliation of local and global symmetries of crystals. Journal of Elasticity 14, 175-190.
Pitteri, M., 1985a. On (v+1)-lattices. Journal of Elasticity 15, 3-25.
Pitteri, M., 1985b. On the kinematics of mechanical twinning in crystals. Archive for Rational Mechanics and Analysis 88, 25-57.
Pitteri, M., 1998. Geometry and symmetry of multi-lattices. International Journal of Plasticity 14, 139-157.
Pitteri, M., Zanzotto, G., 1998. Beyond space groups: the arithmetic symmetry of deformable multi-lattices. Acta Crystallographica A54, 359-373.
Pitteri, M., Zanzotto, G., 2000. Continuum models for phase transitions and twinning in crystals. CRC/Chapman and Hall, London, submitted for publication.
Reed-Hill, R.E., Rogers, H.C., Hirth, J.P., 1964. Deformation Twinning. Gordon and Breach, New York.
Stakgold, I., 1950. The Cauchy relations in a molecular theory of elasticity. Quarterly Journal of Applied Mathematics 8, 169-186.
Thomas, L.A., Wooster, W.A., 1951. Piezocrescence-the growth of Dauphiné twinning in quartz under stress. Proceedings of the Royal Society of London A 208, 43-62.
Zanzotto, G., 1988. Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some available experimental results: Notes I and II. Atti Accademia Nazionale dei Lincei, Rend. Fis., 82, 725-742 and 743-756.
Zanzotto, G., 1992. On the material symmetry group of elastic crystals and the Born rule. Archive for Rational Mechanics and Analysis 121, 1-36.
Continuum Mech. Thermodyn. (2002) 14: 249-262
Digital Object Identifier (DOI) 10.1007/s001610200092
© Springer-Verlag (2002)

Original Article

Twinning theory for some Pitteri neighborhoods*

J. L. Ericksen
5378 Buckskin Bob Road, Florence, OR 97439, USA

Received December 3, 2001 / Published online May 21, 2002 - © Springer-Verlag 2002
Communicated by Epifanio Virga, Pavia
Analyses of twins in crystals can require use of more than one set of constitutive equations and/or use of a set that is invariant under an infinite crystallographic group. Here, we will explore the kinds of twins that can be treated using one set of constitutive equations, which are invariant under a finite crystallographic group, along with being invariant under the continuous group of rotations. The finite group considered here is the lattice group for monatomic hexagonal close-packed crystals, the domains of constitutive equations being Pitteri neighborhoods centered at such configurations.
1 Introduction

The common practice in treating material symmetry for theories of crystals assumes that the associated invariance group for a constitutive equation is some finite group. Essentially, this presumes that its domain is restricted to some choice of a Pitteri [1985a] neighborhood, depending on the material. Such a neighborhood is centered at a particular configuration and has the property that the lattice group of any configuration in it is a subgroup of that of the center. If the domains are considered to be suitably infinite, the invariance groups are infinite, and theory of this kind is much more complicated. For twinning analyses, there are advantages to having the invariance groups be relatively large, since this makes the collection of possible twinning solutions a relatively large set. If one is interested in cases where it is not necessary to deal with infinite groups and one prefers not to, the best one can do is to use a Pitteri neighborhood centered at a configuration of maximal symmetry. For Bravais lattices, Pitteri and Zanzotto [2002] treat theory of this kind for cubic and hexagonal neighborhoods. As will become clear, this excludes some kinds of twins that can occur in n-lattices, for n > 1. For monatomic 2-lattices, one such possibility for the center is an hexagonal close-packed configuration. The structure of such neighborhoods is rather well understood from the work of Fadda and Zanzotto [2001a, 2001b] and Ericksen [2002a], and similar results are not yet available for other possibilities. So, I will build on this, treating elementary twinning analyses for such neighborhoods. While it might seem curious, what cannot be treated with such theory are twins in hexagonal close-packed crystals, because the lattice groups for the pair involved in such a twin generate an infinite group.
This is a common difficulty in twinning theory for configurations of maximal symmetry, as well as for various kinds of deformation twins occurring in configurations of lesser symmetry. Other kinds of phenomena can also exclude use of such theory, as is exemplified in Ericksen's [2001a] study of growth twins in quartz. Another point is that one can use theory to be developed for crystals such that the center is not observed to be taken on as an equilibrium configuration. For example, in analyzing twins in (orthorhombic) α-uranium, Ericksen [2001b] found that the common {130} twins could be included in a neighborhood of the kind considered, although no known phase of this material has this symmetry. This opens the door for developing theory for studying effects of loadings on these twins, for example.

* This paper is dedicated to the memory of Frank Leslie, a scholar and a gentleman.
2 Background

Here, I will deal only with the most elementary kinds of twinning analyses, primarily kinematical studies. While this makes little use of constitutive equations, I think it important to understand how these fit into the picture. To be considered are monatomic 2-lattices, consisting of two identical interpenetrating lattices, described by lattice vectors $\mathbf{e}_a$, with the reciprocal lattice vectors (dual basis) $\mathbf{e}^a$ and a shift vector $\mathbf{p}$. Pick as a base point some atom in one of the lattices. Then, $\mathbf{p}$ is the position vector relative to it of some atom in the other lattice. The descriptions considered are essential, meaning that the configurations cannot also be described as Bravais lattices. For a given 2-lattice filling all of space, there are infinitely many ways of selecting these vectors. If $(\mathbf{e}_a, \mathbf{p})$ is one set, $(\bar{\mathbf{e}}_a, \bar{\mathbf{p}})$ is another provided
$$\bar{\mathbf{e}}_a = m_a^b \mathbf{e}_b \iff \bar{\mathbf{e}}^a = (m^{-1})^a_b \mathbf{e}^b, \quad m = \|m_a^b\| \in GL(3,\mathbb{Z}) \tag{2.1}$$
and
$$\bar{\mathbf{p}} = \alpha \mathbf{p} + l^a \mathbf{e}_a, \quad \alpha = \pm 1, \quad l^a \in \mathbb{Z}. \tag{2.2}$$
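The equivalence in (2.1) between the change of lattice vectors and the change of reciprocal vectors can be checked numerically. The following is a minimal sketch under stated assumptions: the basis `e` and the integer matrix `m` are arbitrary illustrative choices, and rows of each array hold the vectors, so the lower index labels rows.

```python
import numpy as np


def dual_basis(e):
    """Rows of the result are the reciprocal vectors e^a, with e^a . e_b = delta^a_b."""
    return np.linalg.inv(e).T


# Illustrative lattice vectors e_a stored as rows (hypothetical values).
e = np.array([[1.0, 0.0, 0.0],
              [0.2, 1.1, 0.0],
              [0.1, 0.3, 0.9]])

# Illustrative element of GL(3, Z): integer entries, determinant +1.
m = np.array([[1, 1, 0],
              [0, 1, 0],
              [0, 0, 1]])

e_bar = m @ e                     # e_bar_a = m_a^b e_b
dual_direct = dual_basis(e_bar)   # reciprocal basis computed from e_bar

# (m^{-1})^a_b e^b: the upper index a is the column of m^{-1}, hence the transpose.
m_inv = np.linalg.inv(m)
dual_from_rule = m_inv.T @ dual_basis(e)
```

Both routes give the same reciprocal vectors, and the inverse of `m` is again an integer matrix, as membership in GL(3, Z) requires.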
In matrices, my convention is that the lower index labels rows, except that I omit the lower index (1) in
$$l = \|l^1 \; l^2 \; l^3\|.$$
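Any particular set of these vectors determines some finite groups, those relevant here being a lattice group
$$L(\mathbf{e}_a, \mathbf{p}) = \{m, \alpha, l \mid m_a^b \mathbf{e}_b = \mathbf{Q}\mathbf{e}_a, \; \alpha\mathbf{p} + l^a\mathbf{e}_a = \mathbf{Q}\mathbf{p}, \; \mathbf{Q} \in O(3)\}, \tag{2.3}$$
a skeletal lattice group
$$L(\mathbf{e}_a) = \{m \mid m_a^b \mathbf{e}_b = \mathbf{Q}\mathbf{e}_a, \; \mathbf{Q} \in O(3)\}, \tag{2.4}$$
a point group
$$P(\mathbf{e}_a, \mathbf{p}) = \text{all } \mathbf{Q} \text{ occurring in (2.3)} \tag{2.5}$$
and a skeletal point group
$$P(\mathbf{e}_a) = \text{all } \mathbf{Q} \text{ occurring in (2.4)}. \tag{2.6}$$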
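Membership in the skeletal lattice group (2.4) can be tested without constructing Q, since two bases are related by an orthogonal transformation exactly when their Gram matrices agree. The sketch below uses a hexagonal skeletal lattice with illustrative lattice parameters; the function name and the numerical values are assumptions, not taken from the paper.

```python
import numpy as np


def in_skeletal_lattice_group(m, e, tol=1e-12):
    """True if m_a^b e_b = Q e_a for some orthogonal Q, tested via the Gram matrix."""
    gram = e @ e.T                       # gram[a, b] = e_a . e_b
    return np.allclose(m @ gram @ m.T, gram, atol=tol)


# Hexagonal lattice vectors as rows, with illustrative parameters a and c.
a, c = 1.0, 1.6
e_hex = np.array([[a, 0.0, 0.0],
                  [-a / 2, a * np.sqrt(3) / 2, 0.0],
                  [0.0, 0.0, c]])

# Integer matrix of the 60 degree rotation about e_3: e_1 -> e_1 + e_2, e_2 -> -e_1.
sixfold = np.array([[1, 1, 0],
                    [-1, 0, 0],
                    [0, 0, 1]])

# A lattice invariant shear: it maps the lattice to itself but is not a symmetry.
shear = np.array([[1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 1]])

sixfold_is_member = in_skeletal_lattice_group(sixfold, e_hex)
shear_is_member = in_skeletal_lattice_group(shear, e_hex)
```

Extending the test to the full lattice group (2.3) would also require checking the shift condition, which is omitted here.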
We shall see an example of a case for which the order of $L(\mathbf{e}_a)$ and $P(\mathbf{e}_a)$ is larger than that of the other two groups, which is not unusual for n-lattices with n > 1. The values of m, $\alpha$ and l mentioned in describing the lattice groups are, of course, a subset of those defined in (2.1) and (2.2). For Bravais lattices, the lattice and point groups are just the skeletal groups. Early work on lattice groups for multi-lattices was done by Pitteri [1985a], spawning a number of papers involving these. Twinning involves jump discontinuities in some or all of these vectors across some surface, and many discussions of this associate twins with deformation. This is not relevant to the formation of growth twins, occurring naturally as a crystal is grown. Also, in numerous cases of interest, producing them does involve deformation, but we lack reliable ways of relating this to the changes in crystallography. Ericksen [1997] thought it feasible to develop theory to correctly describe the crystallography and, to this end, proposed what is called the X-ray theory. One assumption, excluding continuous distributions of dislocations, gives the existence of functions $\chi^a$ such that
$$\mathbf{e}^a = \nabla\chi^a. \tag{2.7}$$
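There are various equivalent formulations of the constitutive equations. Here, I won't make explicit use of these, but they are used implicitly, and can be taken to be of the form
$$\varphi = \varphi(\mathbf{e}^a, p^b, \theta), \quad \mathbf{t} = -\rho\,\mathbf{e}^a \otimes \frac{\partial\varphi}{\partial\mathbf{e}^a} = -\rho\,\frac{\partial\varphi}{\partial\mathbf{e}^a} \otimes \mathbf{e}^a = \mathbf{t}^T, \tag{2.8}$$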
where $\varphi$ is the Helmholtz free energy per unit mass, $\mathbf{t}$ the Cauchy stress tensor, $\rho$ the mass density, $\eta$ the entropy per unit mass, $\theta$ the absolute temperature and $p^a$ are the components of $\mathbf{p}$ indicated by
$$\mathbf{p} = p^a\mathbf{e}_a \iff p^a = \mathbf{p}\cdot\mathbf{e}^a. \tag{2.9}$$
The requirement that the descriptions be essential is that these numbers cannot all be reduced to 0 or 1/2 by transformations of the form (2.2). The symmetry of $\mathbf{t}$ is implied by the assumption that $\varphi$ is invariant under finite rotations,
$$\varphi(\mathbf{Q}\mathbf{e}^a, p^b, \theta) = \varphi(\mathbf{e}^a, p^b, \theta). \tag{2.10}$$
With $\theta$ considered as a control parameter, the equilibrium equations consist of the usual equations for $\mathbf{t}$ along with (2.11) Alternative formulations are discussed by Ericksen [1999, 2002a, 2001c]. To get a twinning equation, Ericksen [1997] introduced a new idea, that it is possible to choose lattice vectors on the two sides such that the Burgers vector vanishes, for all Burgers circuits in the neighborhood of the discontinuity surface. For such a pair, this led to the jump condition
$$\bar{\mathbf{e}}^a = (\mathbf{1} - \mathbf{n}\otimes\mathbf{a})\,\mathbf{e}^a, \tag{2.12}$$
where $\bar{\mathbf{e}}^a$ and $\mathbf{e}^a$ are the limiting values on the two sides of the discontinuity surface, $\mathbf{n}$ is its unit normal and $\mathbf{a}$ is some vector. With this, it becomes reasonable to consider growth twins as well as twins more associated with deformation. Also, (2.12) is interpretable as the kinematical condition of compatibility associated with having the $\chi^a$ be continuous on the surface. It is easy to introduce deformation into the theory by using the traditional Cauchy-Born rule, when one trusts it. For almost all elementary twinning analyses, workers have in mind unstressed crystals or those subjected to hydrostatic pressures, this being associated with the common assumption that the two configurations are related by some isometry. Also, it is typical to consider piecewise constant configurations, as I shall do here. The isometries are introduced through the equations
$$\bar{\mathbf{e}}^a = (\mathbf{1} - \mathbf{n}\otimes\mathbf{a})\,\mathbf{e}^a = \mathbf{Q}(m^{-1})^a_b\,\mathbf{e}^b \iff \bar{\mathbf{e}}_a = (\mathbf{1} - \mathbf{n}\otimes\mathbf{a})^{-T}\mathbf{e}_a = \mathbf{Q}\,m_a^b\,\mathbf{e}_b, \quad \mathbf{Q} \in O(3) \tag{2.13}$$
and
$$\bar{\mathbf{p}} = \mathbf{Q}(\alpha\mathbf{p} + l^a\mathbf{e}_a). \tag{2.14}$$
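With various well-established kinds of twins, although not all, (2.13) does seem to apply, judging from Ericksen's [2000, 2001a, 2001b] efforts to check this out. So, consider (2.13) and (2.14) as selecting a special class of twins, for which one can do some analyses. From (2.13), it follows that either
$$\det(\mathbf{1} - \mathbf{n}\otimes\mathbf{a}) = 1 \Rightarrow \mathbf{a}\cdot\mathbf{n} = 0, \quad (\mathbf{1} - \mathbf{n}\otimes\mathbf{a})^{-T} = \mathbf{1} + \mathbf{a}\otimes\mathbf{n} \tag{2.15}$$
or
$$\det(\mathbf{1} - \mathbf{n}\otimes\mathbf{a}) = -1 \Rightarrow \mathbf{a}\cdot\mathbf{n} = 2, \quad (\mathbf{1} - \mathbf{n}\otimes\mathbf{a})^{-T} = \mathbf{1} - \mathbf{a}\otimes\mathbf{n}. \tag{2.16}$$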
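The two alternatives (2.15) and (2.16) rest on the identity det(1 - n ⊗ a) = 1 - a · n, together with the stated inverse transposes. A minimal numerical check, with arbitrary illustrative vectors chosen only so that a · n equals 0 or 2, might look like this:

```python
import numpy as np


def twin_tensor(n, a):
    """Return 1 - n (x) a, the tensor acting on the reciprocal vectors in (2.12)."""
    return np.eye(3) - np.outer(n, a)


n = np.array([0.0, 0.0, 1.0])        # unit normal (illustrative choice)
a_s = np.array([0.3, -0.2, 0.0])     # a . n = 0, the case (2.15)
a_o = np.array([0.3, -0.2, 2.0])     # a . n = 2, the case (2.16)

F_s = twin_tensor(n, a_s)
F_o = twin_tensor(n, a_o)

det_s = np.linalg.det(F_s)           # expected to equal 1 - a_s . n
det_o = np.linalg.det(F_o)           # expected to equal 1 - a_o . n

inv_T_s = np.linalg.inv(F_s).T       # to be compared with 1 + a (x) n
inv_T_o = np.linalg.inv(F_o).T       # to be compared with 1 - a (x) n
```

In the second case the tensor is its own inverse, which is why the inverse transpose is simply 1 - a ⊗ n.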
I call those described by (2.15) S-twins, those by (2.16) O-twins, indicating that the two sets of lattice vectors have the same or opposite orientations, respectively. For mechanical twins, the experience is that a simple shearing deformation gradient of the form $\mathbf{1} + \mathbf{a}\otimes\mathbf{n}$ takes one configuration to the other. Sometimes, (2.13) applies with this value of $\mathbf{a}$. For this reason, (2.13) was studied before I proposed the X-ray theory, for example by Pitteri [1985b, 1986]. However, in numerous deformation twins, (2.13) does not hold for this value of $\mathbf{a}$. This means that the traditional Cauchy-Born rule for relating deformation to changes in lattice vectors fails to apply and, then, we do not have a reliable alternative. With my interpretation of (2.12), it is not necessary that $\mathbf{a}$ be associated with deformation and, in most twins I have studied, including such cases and growth twins, one can find values of $\mathbf{a}$ such that (2.13) is satisfied. It does fail to apply when the two sides of the discontinuity surface are crystallographically inequivalent, as is the case for some things called twins, and the failures I have
encountered are of this kind. It should be mentioned that workers have been unable to agree on a general definition of a twin. Here, the focus is on solutions of (2.13) and (2.14), so these are the twins of interest. Obviously, any of the transformations (2.1) and (2.2) is determined by giving the set of integers indicated by
$$L = \{m, \alpha, l\}. \tag{2.17}$$
If one first applies one labeled $L$ then applies $\bar{L}$ to the transformed vectors, the result is equivalent to applying the transformation
$$\bar{L} \circ L = \{\bar{m}m, \; \bar{\alpha}\alpha, \; \bar{\alpha}l + \bar{l}m\} \tag{2.18}$$
to the original set of vectors. So, we can take this as describing the group multiplication for this infinite group of transformations. The most commonly observed mechanical twins employ in (2.13) and (2.14) transformations such that
$$L^2 = L \circ L = 1 = \{1, 1, 0\}, \quad \text{with } \mathbf{a}\cdot\mathbf{n} = 0. \tag{2.19}$$
From (2.18), this implies that
$$m^2 = 1, \quad \alpha l + lm = 0. \tag{2.20}$$
For the present, I assume that
$$m \notin L(\mathbf{e}_a). \tag{2.21}$$
Then, the work of Pitteri [1985b, 1986] shows that, with $\mathbf{R}_{\mathbf{v}}^{\theta}$ denoting the rotation with axis $\mathbf{v}$ and angle $\theta$, either
$$\mathbf{Q} = \pm\mathbf{R}_{\mathbf{n}}^{\pi} \tag{2.22}$$
or
$$\mathbf{Q} = \pm\mathbf{R}_{\mathbf{a}}^{\pi}. \tag{2.23}$$
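The composition rule (2.18) and the involution conditions (2.20) can be checked directly. The sketch below represents a transformation as a tuple (m, α, l), with l a row of integers as in the matrix convention stated earlier. The function names and the specific integers are hypothetical, chosen only so that the conditions hold.

```python
import numpy as np


def compose(L_bar, L):
    """Transformation equivalent to applying L first and then L_bar, as in (2.18)."""
    m_bar, alpha_bar, l_bar = L_bar
    m, alpha, l = L
    return (m_bar @ m, alpha_bar * alpha, alpha_bar * l + l_bar @ m)


def apply(L, e, p):
    """Apply (2.1) and (2.2); rows of e are the lattice vectors e_a."""
    m, alpha, l = L
    return m @ e, alpha * p + l @ e


# Illustrative involution: m = -1 + u v with u_a v^a = 2, alpha = 1, and l with l^a u_a = 0.
u = np.array([1, 1, 0])
v = np.array([1, 1, 0])
m = -np.eye(3, dtype=int) + np.outer(u, v)
L = (m, 1, np.array([1, -1, 1]))
L_squared = compose(L, L)

# Consistency of (2.18) with sequential application, for a second arbitrary transformation.
other = (np.array([[1, 1, 0], [0, 1, 0], [0, 0, 1]]), -1, np.array([0, 2, 1]))
e = np.array([[1.0, 0.0, 0.0], [0.2, 1.1, 0.0], [0.1, 0.3, 0.9]])
p = np.array([0.3, 0.4, 0.5])
e_seq, p_seq = apply(other, *apply(L, e, p))
e_cmp, p_cmp = apply(compose(other, L), e, p)
```

Here `L_squared` reproduces the identity {1, 1, 0}, and the composite agrees with applying the two transformations in sequence.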
Of course, (2.19) implies that $\det\mathbf{Q} = \det m = \pm 1$. For the X-ray theory, I take (2.19) and (2.22) as the definition of a type I twin, (2.19) and (2.23) as the definition of a type II twin, it being also assumed that (2.13) and (2.14) hold. I have examined information on numerous twins, finding that what are classified as twins of these types in the literature do seem to be consistent with these definitions. For S-twin solutions of (2.13) with $\mathbf{a} \neq \mathbf{0}$, Ericksen [1986] showed that if $\mathbf{Q} = \mathbf{R}$, with either $\mathbf{R}\mathbf{a} = \mathbf{a}$ or $\mathbf{R}\mathbf{n} = \mathbf{n}$, then the corresponding m is similar to $\mathbf{R}$. For 180° rotations of this kind, this gives $m^2 = 1$. A result obtained by Pitteri [1998] then gives the stronger result that (2.19) holds, assuming as we do that the descriptions are essential. Frequently, in later analyses, solutions for type I and type II twins occur. These involve m's with $\det m = 1$, (2.20) then implying that they are of the form
$$m_a^b = -\delta_a^b + u_a v^b, \quad u_a v^a = 2, \tag{2.24}$$
where $u_a$ and $v^a$ can be taken as integers. Let
$$\mathbf{u} = u_a\mathbf{e}^a, \quad \mathbf{v} = v^a\mathbf{e}_a. \tag{2.25}$$
Then, for the same m, one has two S-twin solutions of (2.13), described by
$$\text{type I:} \quad \mathbf{n}\otimes\mathbf{a} = \mathbf{u}\otimes\left(2\frac{\mathbf{u}}{|\mathbf{u}|^2} - \mathbf{v}\right) \tag{2.26}$$
and
$$\text{type II:} \quad \mathbf{n}\otimes\mathbf{a} = \left(\mathbf{u} - 2\frac{\mathbf{v}}{|\mathbf{v}|^2}\right)\otimes\mathbf{v}. \tag{2.27}$$
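As a check on (2.26) and (2.27), the following sketch verifies numerically that both choices satisfy the second form of (2.13) for S-twins, with Q taken as the 180° rotation about n for type I and about a for type II. The basis and the integers u_a, v^a are illustrative assumptions, and n is left unnormalized since only the product n ⊗ a enters.

```python
import numpy as np


def rot_pi(axis):
    """Rotation by 180 degrees about the given axis: 2 k (x) k - 1, with k the unit axis."""
    k = axis / np.linalg.norm(axis)
    return 2.0 * np.outer(k, k) - np.eye(3)


# Illustrative lattice vectors e_a as rows, and their reciprocal vectors e^a as rows.
e = np.array([[1.0, 0.0, 0.0],
              [0.2, 1.1, 0.0],
              [0.1, 0.3, 0.9]])
e_dual = np.linalg.inv(e).T

# Hypothetical integers with u_a v^a = 2, giving m as in (2.24).
u_int = np.array([1, 1, 0])
v_int = np.array([1, 1, 0])
m = -np.eye(3) + np.outer(u_int, v_int)

u = u_int @ e_dual        # u = u_a e^a
v = v_int @ e             # v = v^a e_a

# Type I, from (2.26): n parallel to u.
n1, a1 = u, 2.0 * u / (u @ u) - v
# Type II, from (2.27): a parallel to v.
n2, a2 = u - 2.0 * v / (v @ v), v


def residual(n, a, Q):
    """Largest entry of (1 + a (x) n) e_a - Q m_a^b e_b over all a."""
    lhs = (np.eye(3) + np.outer(a, n)) @ e.T
    rhs = Q @ (m @ e).T
    return np.abs(lhs - rhs).max()


res_type1 = residual(n1, a1, rot_pi(u))
res_type2 = residual(n2, a2, rot_pi(v))
```

Both residuals vanish to rounding error, and a · n = 0 in each case, as required for S-twins.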
Of course, the values of shifts must satisfy (2.14). For type I twins, the direction of $\mathbf{n}$ is rational, meaning that it can be described by integer components relative to the basis $\mathbf{e}^a$. This implies that the discontinuity surface is
parallel to some crystallographic plane. For type II twins, the direction of $\mathbf{a}$ is rational in a different sense, being describable by integer components relative to the basis $\mathbf{e}_a$, implying that it is parallel to some row of atoms. Lattice invariant shears are S-twins satisfying (2.13) and (2.14) with $\mathbf{Q} = \mathbf{1}$. There are trivial solutions of (2.13) and (2.14) with $L \in L(\mathbf{e}_a, \mathbf{p})$. Compositions of these two kinds are called fake twins: for these no discontinuities evident in typical X-ray observations occur. For most twins, the orientation of $\mathbf{n}$ must be related in a definite way to the crystallography. There are twins, called penetration twins, with a discontinuity surface not of this kind. Usually, these are growth or transformation twins. From reports of observations I have seen, the surface can be of rather random shape, or of zigzag shape, then being made up of different kinds of crystallographic planes. Well-known twins of these kinds include the Brazil twins, favoring the zigzag shapes, and Dauphiné twins, preferring random shapes, these occurring in quartz. Tentatively, Ericksen [2002b] proposed that at least some penetration twins are represented by some solutions of the twinning equations which are not of the trivial kind, but are such that, in (2.13),
$$m \in L(\mathbf{e}_a). \tag{2.28}$$
It then follows that there are two kinds of possibilities for this, penetration S-twins, with
$$\mathbf{a} = \mathbf{0}, \tag{2.29}$$
for which (2.13) involves no explicit dependence on $\mathbf{n}$, and what I call exceptional O-twins, with
$$\mathbf{a} = 2\mathbf{n}. \tag{2.30}$$
For given m and e a , one can assign n and adjust Q to satisfy (2.13) in the latter case, so this equation does not really restrict values of n. Ericksen [2001a] verifies that (2.29) does apply to Brazil and Dauphine twins in quartz. There is need to check how well (2.28) applies to other things called penetration twins, something I have not yet done. Also, his analysis of Ch. Friedel twins in quartz describes these as exceptional O-twins, and workers do not view them as being penetration twins. For Bravais lattices, there are only trivial possibilities satisfying (2.28), so one does not have non-trivial twins of these kinds. For the time being, we exclude these two kinds and fake twins. Then, for given L, (2.13) may or may not have any solutions, depending on the values of e a . Ericksen [1985, 2002b] gave a criterion for existence and Adeleke [2000] characterized all solutions for S-twins. Later, I shall describe and use the criterion. Also, Ericksen [2002b] showed that, when there are solutions, there are four, two representing S- twins, called conjugate S-twins, two representing O-twins, called conjugate O-twins. Further, he showed that the latter pair can be obtained by applying a simple transformation to the former, making it feasible to use Adeleke's results to analyze all O-twins. In particular, type I twins have as conjugates those of type II. For these, (2.13) can be satisfied for any admissible values of e a . From results of Pitteri and Zanzotto [ 1998], it follows that the only S-twin solutions of (2.13) with this property are these two types and lattice invariant shears. Most observed mechanical twins are either of type I or are compound twins, which means that they can be analyzed as type I and as type II twins. As I interpret this, we should satisfy two sets of twinning equations. For the type I, I will use the notation above, so Q is given by (2.22), for example. For the Type II, I use the notation e a = (1 - n
(2.31)
For compound twins, Ericksen [2002b] showed that, to within lattice invariant shears,
$$\hat{a} = a \tag{2.32}$$
and that
$$R^{\pi}_{a \wedge n} \ \text{and/or}\ -R^{\pi}_{a \wedge n} \in P(e_a, p). \tag{2.33}$$
Further, if (2.33) applies to a type II twin, it is compound. For a type I twin, one can use certain lattice invariant shears to change the direction of a, leaving that of n and the condition $m^2 = 1$ fixed, and one cannot expect (2.33) to hold for all such directions. However, if (2.33) does hold for one of these values, the twin is compound and (2.32) will hold for this value. For the special cases to be considered here, it is automatic that $\hat{a} = a$. Theory to be used here excludes considering improper orthogonal transformations, but we will encounter examples associated with $R^{\pi}_{a \wedge n}$. To treat twins using one set of constitutive equations with its domain restricted to a Pitteri
J.L. Ericksen
neighborhood, one should restrict the choice of L and Q to leave these equations invariant. As I have interpreted this, it excludes O-twins and restricts the possible values of L to a finite set, the lattice group of the center, restricted to exclude transformations reversing the orientation of $e_a$, so
$$\det Q = \det m = 1. \tag{2.34}$$
So, we will explore what twins can be treated in this way, using hexagonal close-packed neighborhoods. Henceforth, I replace Q by R as a reminder that all orthogonal transformations considered are rotations. For each symmetry type included in one of these, I will list the lattice group elements of the possible variants, and present a characterization of values of e a and p that are consistent with this symmetry, borrowing this information from Ericksen [2002a]. Then, I will describe the twins that can be treated with this theory. Ericksen [2002a] considers more general theory, also covering magnetic effects, which I ignore, for simplicity. Also, he covers restrictions implied by symmetry on t and
(2.35)
then take the orbit of each of these and the original under SO(3). Each of these orbits is a variant. From elementary group theory, it follows that the number N of different variants is given by $N = N'/N''$, where $N'$ is the order of $L(E_a, P)$ and $N''$ the order of $L(e_a, p)$. It is easy to see that $L$ and $\bar{L}$ give the same variant whenever
$$\bar{L} = L L', \quad L' \in L(e_a, p) \tag{2.36}$$
or, in the language of group theory, whenever $L$ and $\bar{L}$ belong to the same right coset. In the following, the names and numbers of symmetry types are those used by Fadda and Zanzotto [2001a, 2001b].

3 Hexagonal close-packed configurations (type 27)

A conventional description of these configurations is given by
$$e_1 = a\,i, \quad e_2 = a\left(-\tfrac{1}{2}\,i + \tfrac{\sqrt{3}}{2}\,j\right), \quad e_3 = c\,k, \tag{3.1}$$
$$e^1 = \frac{1}{a}\left(i + \frac{1}{\sqrt{3}}\,j\right), \quad e^2 = \frac{2}{\sqrt{3}\,a}\,j, \quad e^3 = \frac{1}{c}\,k \tag{3.2}$$
and
$$\mathbf{p} = \tfrac{2}{3}\,e_1 + \tfrac{1}{3}\,e_2 + \tfrac{1}{2}\,e_3, \tag{3.3}$$
where a and c are positive constants, with i, j and k representing some orthonormal basis. We have in mind using theory based on restricting the domain of
Twinning theory for some Pitteri neighborhoods
of the orthonormal basis and values of a and c differing from those at the center, restricted by the size of the neighborhood. However, as was mentioned earlier, we cannot deal with twins in these because they involve pairs of configurations which cannot be included in any Pitteri neighborhood. As is noted by Ericksen [2002a], the equilibrium equation (2.11) is automatically satisfied as a consequence of symmetry for these configurations. Symmetry considerations also imply that equating the Cauchy stress tensor t to zero or to a given hydrostatic pressure gives two equations for determining values of a and c.
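As a small numerical cross-check of the description (3.1)-(3.3), the sketch below builds the lattice vectors, the reciprocal vectors and the shift, reading them as the usual hexagonal close-packed cell with $e_1 = a i$, $e_2 = a(-i/2 + \sqrt{3} j/2)$, $e_3 = c k$ and $\mathbf{p} = \tfrac{2}{3}e_1 + \tfrac{1}{3}e_2 + \tfrac{1}{2}e_3$. The numerical values of a and c are illustrative assumptions, not values from the text.

```python
import numpy as np

a, c = 2.0, 3.3  # illustrative lattice parameters (assumed, not from the text)

i, j, k = np.eye(3)  # orthonormal basis

# Lattice vectors (3.1), stored as rows e_1, e_2, e_3
e = np.array([a * i, a * (-0.5 * i + np.sqrt(3) / 2 * j), c * k])

# Reciprocal vectors (3.2), stored as rows e^1, e^2, e^3
e_dual = np.array([(i + j / np.sqrt(3)) / a, 2 * j / (np.sqrt(3) * a), k / c])

# Duality check: e^a . e_b should be the identity matrix
duality = e_dual @ e.T

# Gramian G = ||e_a . e_b||
gram = e @ e.T

# Shift (3.3) in Cartesian components
p = (2 / 3) * e[0] + (1 / 3) * e[1] + 0.5 * e[2]

print(np.round(duality, 12))
print(gram)
print(p)
```

The Gramian printed here has $G_{12} = -G_{11}/2$ and $G_{11} = G_{22}$, which is the form the later sections specialize or relax for the less symmetric types.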
4 Hexagonal configurations (type 25)

Here, and in discussions of other configurations to follow, I make free use of results of Fadda and Zanzotto [2001a, 2001b] and Ericksen [2002a] concerning descriptions of the configurations. For the type 25, there is just one lattice group, of order six, with elements
$$L_2, L_4, L_6, L_8, L_{10}, L_{12}, \tag{4.1}$$
so, here, we have 12/6 = 2 variants, with the same lattice group, one that is not changed by transformations generating variants. I will describe lattice vectors of the various configurations by the Gramian matrices
$$G = \| e_a \cdot e_b \|, \tag{4.2}$$
the shifts as linear combinations of lattice vectors. For the type 25, the lattice vectors are of the same general form as those for type 27, giving
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{11} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}. \tag{4.3}$$
This means that every value of m obtained from an element of $L^+(E_a, P)$ is in $L^+(e_a)$. This implies that the lattice vectors of a pair of variants differ at most by a rotation and that only penetration twins are possible. The possible shifts are of the form
$$\mathbf{p} = \tfrac{2}{3}\,e_1 + \tfrac{1}{3}\,e_2 + \left(\tfrac{1}{2} + p\right) e_3. \tag{4.4}$$
For any configuration, one gets the value of $\mathbf{p}$ for the other variant by taking any element of $L^+(E_a, P)$ not listed in (4.1), using this to transform the original value, which induces the transformation
$$p \to -p, \tag{4.5}$$
with the equilibrium equation (2.11) reducing to one equation to determine this parameter. It is easy to see that any two elements of $L^+(E_a, P)$ not listed in (4.1) are related by some transformation of the form (2.36) so, for twinning analyses, it is only necessary to consider one element. Taking one of the simplest, $L_3$, one satisfies the twinning equation (2.13) using
$$\bar{e}_a = e_a = R\, m_a^b e_b, \tag{4.6}$$
where
$$m = \begin{Vmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix}, \tag{4.7}$$
R being the corresponding point group element, a 180° rotation with $e_3$ as axis. Start with some $\mathbf{p}$ of the form (4.4), transform it with this value of R and the lattice group element $L_3$ and you get just what is indicated by (4.5). The better known penetration twins include the Brazil and Dauphiné twins in quartz and, mathematically, these are more like the latter. For example, unlike the Brazil twins, but like the Dauphiné twins, the two configurations
are not enantiomorphs. I do not know of an observed twin of this kind. General experience suggests that they might occur naturally when a type 25 is grown, or occur as transformation twins, when a type 27 → type 25 phase transition occurs. Experience with Dauphiné twins suggests that it might also be possible to produce them by applying rather concentrated loads. For these and other twins occurring in quartz, Ericksen [2001a] used the twinning equations to analyze various configurations involving more than one twin, and one could similarly treat possibilities for twins of this kind crossing each other, for example.

5 Base-centered orthorhombic configurations (type 14)

For configurations of this kind, the order of the lattice group is four, giving three variants. They have different lattice groups, with elements
$$L_3, L_8, L_{11}, L_{12}, \tag{5.1}$$
$$L_3, L_7, L_{10}, L_{12} \tag{5.2}$$
and
$$L_3, L_6, L_9, L_{12}. \tag{5.3}$$
Except for the group identity $L_{12}$, these all satisfy $L^2 = 1$, $L \neq 1$. The Gramians and shifts are given, respectively, by
$$G = \begin{Vmatrix} G_{11} & G_{12} & 0 \\ G_{12} & G_{11} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}, \tag{5.4}$$
$$\mathbf{p} = p\,e_1 + (1 - p)\,e_2 + \tfrac{1}{2}\,e_3, \tag{5.5}$$
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & 0 \\ -\tfrac{1}{2}G_{22} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}, \tag{5.6}$$
$$\mathbf{p} = q\,e_1 + \tfrac{q}{2}\,e_2 + \tfrac{1}{2}\,e_3, \tag{5.7}$$
and
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}, \tag{5.8}$$
$$\mathbf{p} = r\,e_1 + (2r - 1)\,e_2 + \tfrac{1}{2}\,e_3. \tag{5.9}$$
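The three Gramian forms can be checked against the lattice groups by noting that, if $\bar{e}_a = m_a^b e_b$, the Gramian becomes $m G m^T$, which must equal G when m belongs to the lattice group. The sketch below does this for one two-fold element from each of the three groups. The integer matrices are my reading of the Appendix list, and the numerical Gramian components are arbitrary illustrative values, so both should be treated as assumptions.

```python
import numpy as np

def gram_after(m, G):
    """Gramian of the vectors m_a^b e_b, given the Gramian G of the e_a."""
    return m @ G @ m.T

# Integer matrices m for three two-fold elements (assumed reading of the Appendix)
m8 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, -1]])    # L8, axis along e_1 + e_2
m7 = np.array([[1, 1, 0], [0, -1, 0], [0, 0, -1]])   # L7, axis along 2e_1 + e_2
m6 = np.array([[1, 0, 0], [-1, -1, 0], [0, 0, -1]])  # L6, axis along e_1

g11, g12, g22, g33 = 1.3, 0.4, 2.1, 3.0  # arbitrary illustrative components

G_first = np.array([[g11, g12, 0], [g12, g11, 0], [0, 0, g33]])
G_second = np.array([[g11, -g22 / 2, 0], [-g22 / 2, g22, 0], [0, 0, g33]])
G_third = np.array([[g11, -g11 / 2, 0], [-g11 / 2, g22, 0], [0, 0, g33]])

checks = {
    "first form under L8": np.allclose(gram_after(m8, G_first), G_first),
    "second form under L7": np.allclose(gram_after(m7, G_second), G_second),
    "third form under L6": np.allclose(gram_after(m6, G_third), G_third),
    # A mismatched pairing should generally fail:
    "first form under L7": np.allclose(gram_after(m7, G_first), G_first),
}
for name, ok in checks.items():
    print(name, ok)
```

The last entry is included only to show that the invariance is not automatic: the first form is not preserved by an element from another variant's lattice group.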
For variants, the parameters occurring in these shifts are related, the relations being described by Ericksen [2002a], which I won't repeat. The same comment applies to other shifts to be considered. From information given in the Appendix, it is easy to see that each right coset contains two elements satisfying $L^2 = 1$, so one can use either of these to construct solutions for type I and type II twins: generally, using different elements in the same right coset describes the same twins, in a different way. The twins obtained by this procedure are all compound. For an indication of why this is, consider those obtained by starting with a configuration in the first variant, to be adjoined by one in the third. The latter can be obtained from the former by solving the twinning equations, using $L_7$ or $L_{10} = L_7 L_3$, which are in the same right coset, $L_3$ being in all three lattice groups. Solving the twinning equations with $L_7$, using (2.26) and (2.27), one finds that the conjugate twins satisfy
$$\text{type I:}\quad a \parallel e_2, \quad n \parallel e^1, \tag{5.10}$$
$$\text{type II:}\quad a \parallel 2e_1 + e_2, \quad n \parallel e^1 - 2e^2. \tag{5.11}$$
I won't bother to record what one gets for |a|. These four directions are all perpendicular to $e_3$ and, from (10.5), $R^{\pi}_{e_3}$ is the point group element corresponding to $L_3$, which is in the first lattice group, in particular.
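The coset statements used here can be checked mechanically. The sketch below is only an illustration under stated assumptions: it encodes the twelve elements abstractly as words $r^k s^f$ with $r = L_1$, $s = L_6$ and $s r = r^{-1} s$, a labelling inferred from the multiplication table (10.15) (for example $L_1 L_6 = L_7$, $L_6 L_1 = L_{11}$), and it reads the first and third lattice groups as $\{L_3, L_8, L_{11}, L_{12}\}$ and $\{L_3, L_6, L_9, L_{12}\}$. The function names are hypothetical.

```python
# Hypothetical encoding: (k, f) stands for r^k s^f, with r = L1, s = L6, s r = r^(-1) s.
ELEMENTS = {1: (1, 0), 2: (2, 0), 3: (3, 0), 4: (4, 0), 5: (5, 0), 12: (0, 0),
            6: (0, 1), 7: (1, 1), 8: (2, 1), 9: (3, 1), 10: (4, 1), 11: (5, 1)}
INDEX = {code: i for i, code in ELEMENTS.items()}

def mul(i, j):
    """Index of the product L_i . L_j."""
    (k1, f1), (k2, f2) = ELEMENTS[i], ELEMENTS[j]
    k = (k1 + k2) % 6 if f1 == 0 else (k1 - k2) % 6
    return INDEX[(k, f1 ^ f2)]

def coset(g, subgroup):
    """The set {L_g . L' : L' in subgroup}, in the sense of (2.36)."""
    return {mul(g, h) for h in subgroup}

def involutions(elements):
    """Elements other than the identity L12 that satisfy L^2 = 1."""
    return {x for x in elements if x != 12 and mul(x, x) == 12}

first = {3, 8, 11, 12}   # assumed reading of (5.1)
third = {3, 6, 9, 12}    # assumed reading of (5.3)

coset_of_L7 = coset(7, first)
conjugated = {mul(mul(7, h), 7) for h in first}  # L7 is its own inverse

print(sorted(coset_of_L7), sorted(involutions(coset_of_L7)))
print(sorted(conjugated))
```

With this encoding, the coset of $L_7$ contains exactly two elements with $L^2 = 1$, namely $L_7$ and $L_{10}$, and conjugating the first lattice group by $L_7$ gives the third, in line with the text.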
Refer to the discussion of (2.33), and you infer that these are compound twins. Or, you can solve the twinning equations using $L_{10}$, which gives the same four values of a and n, but interchanges the designations type I and type II. By this kind of argument, one can cross-check the assertion that, here, we can take $\hat{a} = a$ in (2.32). It is easy to see that any type 14 configuration can be related as a type I or type II twin to configurations in the other two variants. For exceptional values of $e_a$, one can get penetration twins. This is the case if they are such that (5.4) degenerates because $2G_{12} = -G_{11}$, but the shifts given by (5.5) are inequivalent to those given by (3.3), for example.

6 Base-centered monoclinics (type 7)

Here, the lattice groups are of order two, giving six variants. They are arranged in three pairs, with the two in a pair sharing the same lattice group. The three lattice groups have the elements
$$L_K, L_{12}, \quad K = 7, 9, 11 \Rightarrow L_K^2 = 1. \tag{6.1}$$
For these, the forms of G and p are, respectively,
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & G_{13} \\ -\tfrac{1}{2}G_{22} & G_{22} & -2G_{13} \\ G_{13} & -2G_{13} & G_{33} \end{Vmatrix}, \tag{6.2}$$
$$\mathbf{p} = p\,e_1 + \tfrac{p}{2}\,e_2 + \tfrac{1}{2}\,e_3, \tag{6.3}$$
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & G_{13} \\ -\tfrac{1}{2}G_{11} & G_{22} & -\tfrac{1}{2}G_{13} \\ G_{13} & -\tfrac{1}{2}G_{13} & G_{33} \end{Vmatrix}, \tag{6.4}$$
$$\mathbf{p} = q\,e_1 + (2q - 1)\,e_2 + \tfrac{1}{2}\,e_3, \tag{6.5}$$
and
$$G = \begin{Vmatrix} G_{11} & G_{12} & G_{13} \\ G_{12} & G_{11} & G_{13} \\ G_{13} & G_{13} & G_{33} \end{Vmatrix}, \tag{6.6}$$
$$\mathbf{p} = r\,e_1 + (1 - r)\,e_2 + \tfrac{1}{2}\,e_3. \tag{6.7}$$
Each of the elements $L_7$, $L_9$, $L_{11}$ is in one of the lattice groups (5.1)-(5.3), and, excluding $L_{12}$, the remaining elements in one of these are in the same right coset. For example, $L_7$ is in (5.2), as are $L_3$ and $L_{10} = L_3 L_7$. So, from the one variant with $L_7$ in its lattice group, we can get another, using either $L_3$ or $L_{10}$, both of which satisfy $L^2 = 1$. This gives the variant with the same lattice group. We can solve the twinning equations using $L_3$, say, getting type I and type II twins involving variants of this kind. Calculating the values of a and n and using (6.2), one finds that they are all perpendicular to $2e_1 + e_2$ and, by (10.9), the point group element corresponding to $L_7$ is $R^{\pi}_{2e_1 + e_2}$. It then follows from considerations like those used for type 14 configurations that these are compound twins. The other possibility is for the twins to have different lattice groups. Here again, these are type I and type II twins. For example, start with a configuration having the first lattice group, involving $L_7$. This time, pick L satisfying $L^2 = 1$, but not one included in (5.2). I pick $L_9$, which produces a variant with the lattice group containing $L_{11}$. From (2.26) and (2.27), one finds that using $L_9$ gives ( e2 \ type I: n
(6.8)
type II: n ® a = U2 - 2 - ^ - ± | ^ _ j ® ( ei + 2e2).
(6.9)
2
and
Of course, $e_a$ should conform to (6.2). Excluding cases for which G takes on values for more symmetric configurations, there is just one point group element to consider in (2.33), $R^{\pi}_{2e_1 + e_2}$, and, generally, its axis is not perpendicular to the two directions given by (6.8) or (6.9). Essentially, this is the reason why these twins are not compound. Using (2.14), one can show that, for such twins, the parameters in (6.3) and (6.7) are related by
$$r = 1 - \frac{p}{2}. \tag{6.10}$$
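The relation between the shift parameters can be reproduced by direct substitution. The sketch below is an illustration under stated assumptions: it takes the shift of the first type 7 form to be $\mathbf{p} = p\,e_1 + \tfrac{p}{2} e_2 + \tfrac{1}{2} e_3$, takes $L_9 = \{m, \alpha, l\}$ with the values shown (my reading of the Appendix list), and uses the transformation $\mathbf{p} \to \alpha \mathbf{p} + l^a e_a$, re-expressed in the new lattice vectors $\bar{e}_a = m_a^b e_b$. The value of p and the helper name are illustrative.

```python
from fractions import Fraction

# Element L9 = {m, alpha, l}, an assumed reading of the Appendix list
M9 = ((-1, 0, 0), (1, 1, 0), (0, 0, -1))
ALPHA9 = 1
L9_INTEGERS = (-1, 0, -1)

def transformed_shift(components, m, alpha, l):
    """Components of alpha*p + l^a e_a relative to e_bar_a = m_a^b e_b.

    Written for the case m . m = 1 (true for M9), so that m is its own inverse.
    """
    old = [alpha * c + li for c, li in zip(components, l)]  # relative to e_a
    return tuple(sum(old[a] * m[a][b] for a in range(3)) for b in range(3))

p = Fraction(3, 5)                          # illustrative parameter value
shift_before = (p, p / 2, Fraction(1, 2))   # assumed form of the first shift
r, second, third = transformed_shift(shift_before, M9, ALPHA9, L9_INTEGERS)

print(r, second, third)
```

The result has the form $r\,e_1 + (1 - r)\,e_2 + \tfrac{1}{2} e_3$ with $r = 1 - p/2$, which is the relation stated above.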
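Again, any type 7 configuration can be related to some in any other variant as a type I or a type II twin.

7 Base-centered monoclinic configurations (type 6)

As in the previous case, one has six variants, arranged in pairs which share the same lattice group, the three lattice groups being
$$L_K, L_{12}, \quad K = 6, 8, 10 \Rightarrow L_K^2 = 1. \tag{7.1}$$
All are contained in the type 25 lattice group given by (4.1). Forms of G and p for the three are, respectively,
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{11} & 0 \\ -\tfrac{1}{2}G_{11} & G_{22} & G_{23} \\ 0 & G_{23} & G_{33} \end{Vmatrix}, \tag{7.2}$$
$$\mathbf{p} = p\,e_1 + (2p - 1)\,e_2 + q\,e_3, \tag{7.3}$$
$$G = \begin{Vmatrix} G_{11} & G_{12} & G_{13} \\ G_{12} & G_{11} & -G_{13} \\ G_{13} & -G_{13} & G_{33} \end{Vmatrix}, \tag{7.4}$$
$$\mathbf{p} = r\,e_1 + (1 - r)\,e_2 + s\,e_3, \tag{7.5}$$
and
$$G = \begin{Vmatrix} G_{11} & -\tfrac{1}{2}G_{22} & G_{13} \\ -\tfrac{1}{2}G_{22} & G_{22} & 0 \\ G_{13} & 0 & G_{33} \end{Vmatrix}, \tag{7.6}$$
$$\mathbf{p} = t\,e_1 + \tfrac{t}{2}\,e_2 + u\,e_3. \tag{7.7}$$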
One possibility is to have twins sharing the same lattice group. For these one can get type I and type II twins by picking L satisfying $L^2 = 1$, in the normalizer of the lattice group considered. For the first variant, $L_3$ and $L_9$ qualify, and they lie in the same right coset, so one can use either. A calculation like those used before shows that these are compound twins. Again, it is easy to show that, for exceptional values of $e_a$, one can get penetration twins. If one transforms a configuration with one of these lattice groups by an element which is not in the normalizer, but is in (4.1), satisfying $L^2 = 1$, one gets a variant with a lattice group not containing either. For example, the variant associated with $L_6$, transformed by $L_8$, has $L_{10}$ in its lattice group. Obviously, one can use this to get type I and type II twins. Generally, these are not compound twins. Here again, any type 6 configuration can be related to some configurations in each other variant by type I and by type II twins.

8 Primitive monoclinic configurations (type 2)

For these, there are six variants, all sharing the same lattice group, with elements
$$L_3, L_{12}, \tag{8.1}$$
a subgroup of all of the type 14 groups listed in (5.1)—(5.3). These arise from breaking the type 14 symmetry, to double the number of variants. Pick one of the type 14 groups and either of the other two L's lying in it, excluding
$L_{12}$. These are in the same right coset, so one then gets type I and type II twins. By arguments like those used before, one can show that these are compound twins. In this way, one links a configuration with three included in different variants. In this case, some variants cannot be linked by type I and type II twins, those which are obtained by using two of the four elements $L_1$, $L_2$, $L_4$ and $L_5$, lying in different right cosets: $L_1$ or $L_4$, and $L_2$ or $L_5$. For any of these, m is of the form
$$m = \begin{Vmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & 1 \end{Vmatrix}, \quad ad - bc = 1. \tag{8.2}$$
For all variants, G and p are of the form
$$G = \begin{Vmatrix} G_{11} & G_{12} & 0 \\ G_{12} & G_{22} & 0 \\ 0 & 0 & G_{33} \end{Vmatrix}, \tag{8.3}$$
$$\mathbf{p} = p\,e_1 + q\,e_2 + \tfrac{1}{2}\,e_3, \tag{8.4}$$
with different values of the components for different variants. Now, with Q restricted to be a rotation R, (2.13) is equivalent to
$$R = (1 - n \otimes a) H^{-1} = R^{-T} = (1 + a \otimes n) H^{T}, \tag{8.5}$$
where
$$H = (m^{-1})_a^b\, e_b \otimes e^a. \tag{8.6}$$
From Ericksen's [1985, 2002b] analysis of this, (2.13) can be satisfied provided
$$K = H^{T} H \ \text{has 1 as an eigenvalue.} \tag{8.7}$$
Ericksen [2002b] notes that it is obvious from his equation (4.19) that, if (8.7) is satisfied by some values of m and $e_a$, it is also satisfied by $m^{-1}$ and the same values of $e_a$. This has some relevance, since $L_5^{-1} = L_1$, $L_4^{-1} = L_2$. There are some exceptional values of $e_a$ for which K = 1, giving rise to penetration twins. Otherwise, a and n are perpendicular to the eigenvector corresponding to this eigenvalue. Besides satisfying this condition, n must be a unit vector satisfying
$$n \cdot K n = |Hn|^2 = 1. \tag{8.8}$$
There are two solutions of this, with n and -n regarded as the same solution. With n so determined, a is given by
$$a = (1 - K)\,n, \tag{8.9}$$
and one can then use (8.5) to calculate R. Now, for m of the form (8.2) and consistent with (8.3), it is easy to show that
$$H e_3 = e_3 \ \text{and}\ H^{T} e_3 = e_3 \ \Rightarrow\ K e_3 = e_3 \ \text{and}\ R e_3 = e_3. \tag{8.10}$$
So, for example, $L_1$ and $L_2$ produce twins involving the variants they produce: one can use these to get corresponding shifts. So, one configuration can be related to some configuration in any other variant by some kind of twin, not always of type I or type II. Here, I won't calculate the solutions in detail. For those interested in doing so, one could use the procedure described above, but you might well find it simpler to use the results of Adeleke [2000], in his category IV.2. While they are not common, there are observed twins that are not of type I, type II or penetration twins, although I know of none exemplifying the theoretical possibilities at hand. For example, the Ch. Friedel twins in α-quartz involve as an isometry a 90° rotation, but these are O-twins, according to Ericksen's [2001a] analysis of them. Also, while α-quartz has hexagonal symmetry, a 2-lattice model of it is not satisfactory. S-twins somewhat like those encountered here were found rather recently in LaNbO4. Jian and James [1997] successfully analyze them using a tetragonal neighborhood. Our neighborhoods contain no tetragonal configurations, which are needed to describe a related phase transition. There is also the obvious matter that these are not monatomic 2-lattices. Occasionally, new kinds of twins are found, so good examples might be found in the future.
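The conditions (8.7)-(8.9) can be illustrated numerically by working backwards from a known solution. The sketch below does not solve the twinning equation. It assumes the reading $R = (1 - n \otimes a)H^{-1}$ of (8.5), picks an illustrative unit vector n, a vector a with $a \cdot n = 0$ and a rotation R (all arbitrary assumptions), builds H from them, and then confirms that $K = H^T H$ has the eigenvalue 1, that $n \cdot K n = 1$, and that $a = (1 - K)n$.

```python
import numpy as np

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` (Rodrigues formula)."""
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    W = np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]])
    return np.eye(3) + np.sin(angle) * W + (1 - np.cos(angle)) * (W @ W)

n = np.array([1.0, 0.0, 0.0])   # unit normal (illustrative)
a = np.array([0.0, 0.3, 0.0])   # satisfies a . n = 0 (illustrative)
R = rotation([0, 0, 1], 0.7)    # any rotation (illustrative)

S = np.eye(3) - np.outer(n, a)  # the tensor 1 - n (x) a
H = R.T @ S                     # chosen so that R = (1 - n (x) a) H^{-1}
K = H.T @ H

has_unit_eigenvalue = bool(np.isclose(np.linalg.eigvalsh(K), 1.0).any())
n_K_n = n @ K @ n
a_from_K = (np.eye(3) - K) @ n
R_from_transpose_form = (np.eye(3) + np.outer(a, n)) @ H.T

print(has_unit_eigenvalue, n_K_n, a_from_K)
```

The last quantity checks the second form in (8.5): with $a \cdot n = 0$, $(1 + a \otimes n)H^T$ reproduces R.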
9 Triclinic configurations (type 1)
For these, the lattice group has the identity as its only element, so there are twelve variants. Then, no two elements of $L^+(E_a, P)$ are included in the same right coset. This group contains seven elements satisfying $L^2 = 1$, $L \neq L_{12}$, so one can use these to relate any type 1 configuration to seven others by type I and type II twins. As is rather obvious from the lack of symmetry, these are generally not compound twins. Clearly, some variants cannot be related in this way. Generally, these cannot be related by more general kinds of twins, because (8.7) is not satisfied. I have not really explored other twins that are possible for exceptional configurations, although it is easy to construct examples. With this, we have drawn some conclusions about all of the types of configurations that are included in hexagonal close-packed neighborhoods. It does seem plausible that twins that are possible for all configurations with the same symmetry type are more likely to be observed than those that are not. However, better mathematical theory for explaining why many theoretically possible twins have not been observed would be welcome.
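The counts quoted in the preceding sections can be reproduced by generating the group from two of its elements. The sketch below is an illustration under stated assumptions: it takes the product rule in the form $\{m, \alpha, l\} \cdot \{\bar{m}, \bar{\alpha}, \bar{l}\} = \{\bar{m} m, \bar{\alpha}\alpha, \bar{\alpha} l + \bar{l} m\}$ and uses the triples for $L_1$ and $L_6$ as I read them from the Appendix. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 3x3 integer matrices stored as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def vecmat(v, M):
    """Row vector times matrix."""
    return tuple(sum(v[a] * M[a][b] for a in range(3)) for b in range(3))

def product(x, y):
    """x . y = {mb m, ab a, ab l + lb m} for x = {m, a, l}, y = {mb, ab, lb}."""
    (m, a, l), (mb, ab, lb) = x, y
    lbm = vecmat(lb, m)
    return (matmul(mb, m), ab * a, tuple(ab * l[i] + lbm[i] for i in range(3)))

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
IDENTITY = (I3, 1, (0, 0, 0))
# Assumed readings of two Appendix entries
L1 = (((1, 1, 0), (-1, 0, 0), (0, 0, 1)), -1, (1, 1, 1))
L6 = (((1, 0, 0), (-1, -1, 0), (0, 0, -1)), -1, (1, 0, 0))

# Generate the group by repeatedly multiplying with the two generators
group, frontier = {IDENTITY}, [IDENTITY]
while frontier:
    x = frontier.pop()
    for g in (L1, L6):
        y = product(x, g)
        if y not in group:
            group.add(y)
            frontier.append(y)

involutions = [x for x in group if x != IDENTITY and product(x, x) == IDENTITY]

# Number of variants N = N'/N'' for lattice groups of the orders met in the text
variants = {order: len(group) // order for order in (12, 6, 4, 2, 1)}

print(len(group), len(involutions), variants)
```

Under these assumptions the group has twelve elements, seven of them satisfying $L^2 = 1$ apart from the identity, and the variant counts 1, 2, 3, 6 and 12 match those quoted for lattice groups of order 12, 6, 4, 2 and 1.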
10 Appendix
Herewith is a list of the orientation preserving lattice group elements $L^+(E_a, P)$ for the center described by (3.1)-(3.3), and the group multiplication table for it. As is implied by (2.18), the group multiplication operation is
$$\{m, \alpha, l\} \cdot \{\bar{m}, \bar{\alpha}, \bar{l}\} = \{\bar{m} m,\ \bar{\alpha}\alpha,\ \bar{\alpha} l + \bar{l} m\}, \tag{10.1}$$
this being inferred from the composition
$$e_a \to \bar{e}_a = m_a^b e_b \to \bar{\bar{e}}_a = \bar{m}_a^b \bar{e}_b = (\bar{m} m)_a^c e_c \tag{10.2}$$
and the analog for the shift. The elements are
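$$R^{\pi/3}_{k} : L_1 = \left\{ \begin{Vmatrix} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{Vmatrix},\ -1,\ \|1\ \ 1\ \ 1\| \right\} : R^{\pi/3}_{e_3}, \tag{10.3}$$
$$R^{2\pi/3}_{k} : L_2 = \left\{ \begin{Vmatrix} 0 & 1 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix},\ 1,\ \|{-1}\ \ 0\ \ 0\| \right\} : R^{2\pi/3}_{e_3}, \tag{10.4}$$
$$R^{\pi}_{k} : L_3 = \left\{ \begin{Vmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{Vmatrix},\ -1,\ \|0\ \ 0\ \ 1\| \right\} : R^{\pi}_{e_3}, \tag{10.5}$$
$$R^{4\pi/3}_{k} : L_4 = \left\{ \begin{Vmatrix} -1 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{Vmatrix},\ 1,\ \|{-1}\ \ {-1}\ \ 0\| \right\} : R^{4\pi/3}_{e_3}, \tag{10.6}$$
$$R^{5\pi/3}_{k} : L_5 = \left\{ \begin{Vmatrix} 0 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{Vmatrix},\ -1,\ \|1\ \ 0\ \ 1\| \right\} : R^{5\pi/3}_{e_3}, \tag{10.7}$$
$$R^{\pi}_{i} : L_6 = \left\{ \begin{Vmatrix} 1 & 0 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ -1,\ \|1\ \ 0\ \ 0\| \right\} : R^{\pi}_{e_1}, \tag{10.8}$$
$$R^{\pi}_{\frac{\sqrt{3}}{2} i + \frac{1}{2} j} : L_7 = \left\{ \begin{Vmatrix} 1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ 1,\ \|0\ \ 0\ \ {-1}\| \right\} : R^{\pi}_{2e_1 + e_2}, \tag{10.9}$$
$$R^{\pi}_{\frac{1}{2} i + \frac{\sqrt{3}}{2} j} : L_8 = \left\{ \begin{Vmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ -1,\ \|1\ \ 1\ \ 0\| \right\} : R^{\pi}_{e_1 + e_2}, \tag{10.10}$$
$$R^{\pi}_{j} : L_9 = \left\{ \begin{Vmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ 1,\ \|{-1}\ \ 0\ \ {-1}\| \right\} : R^{\pi}_{e_1 + 2e_2}, \tag{10.11}$$
$$R^{\pi}_{-\frac{1}{2} i + \frac{\sqrt{3}}{2} j} : L_{10} = \left\{ \begin{Vmatrix} -1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ -1,\ \|0\ \ 0\ \ 0\| \right\} : R^{\pi}_{e_2}, \tag{10.12}$$
$$R^{\pi}_{-\frac{\sqrt{3}}{2} i + \frac{1}{2} j} : L_{11} = \left\{ \begin{Vmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{Vmatrix},\ 1,\ \|{-1}\ \ {-1}\ \ {-1}\| \right\} : R^{\pi}_{e_2 - e_1}, \tag{10.13}$$
$$1 : L_{12} = \{1, 1, 0\} : 1 \quad \text{(group identity)}. \tag{10.14}$$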
Here, the rotation listed as the first item is the corresponding point group element for the configurations described by (3.1)-(3.3). The last entries are rotations, to be interpreted as follows. Consider any configuration which has this particular element in its lattice group. Then, the indicated rotation is the corresponding point group element. It is easy to verify that, when (3.1)-(3.3) apply, this rotation coincides with that listed on the left. Lattice groups do determine such point groups to within similarity transformations obtained using orthogonal transformations. They also determine space groups, but different and inequivalent lattice groups can correspond to one space group, the lattice groups distinguishing differences in symmetry not recognized by space groups or site-symmetry groups. Essentially, subgroups determine how the neighborhood gets decomposed into subsets, where configurations in one subset have the same symmetry, ignoring some of the subtleties mentioned earlier, associated with including magnetization. The group multiplication table is
$$\begin{Vmatrix}
L_2 & L_3 & L_4 & L_5 & L_{12} & L_7 & L_8 & L_9 & L_{10} & L_{11} & L_6 \\
L_3 & L_4 & L_5 & L_{12} & L_1 & L_8 & L_9 & L_{10} & L_{11} & L_6 & L_7 \\
L_4 & L_5 & L_{12} & L_1 & L_2 & L_9 & L_{10} & L_{11} & L_6 & L_7 & L_8 \\
L_5 & L_{12} & L_1 & L_2 & L_3 & L_{10} & L_{11} & L_6 & L_7 & L_8 & L_9 \\
L_{12} & L_1 & L_2 & L_3 & L_4 & L_{11} & L_6 & L_7 & L_8 & L_9 & L_{10} \\
L_{11} & L_{10} & L_9 & L_8 & L_7 & L_{12} & L_5 & L_4 & L_3 & L_2 & L_1 \\
L_6 & L_{11} & L_{10} & L_9 & L_8 & L_1 & L_{12} & L_5 & L_4 & L_3 & L_2 \\
L_7 & L_6 & L_{11} & L_{10} & L_9 & L_2 & L_1 & L_{12} & L_5 & L_4 & L_3 \\
L_8 & L_7 & L_6 & L_{11} & L_{10} & L_3 & L_2 & L_1 & L_{12} & L_5 & L_4 \\
L_9 & L_8 & L_7 & L_6 & L_{11} & L_4 & L_3 & L_2 & L_1 & L_{12} & L_5 \\
L_{10} & L_9 & L_8 & L_7 & L_6 & L_5 & L_4 & L_3 & L_2 & L_1 & L_{12}
\end{Vmatrix}, \tag{10.15}$$
presented as a matrix, where the element in the ith row and jth column is the product $L_i \cdot L_j$, omitting the obvious products involving the identity element. The same table applies to the point group, and it is easier to use this to calculate the entries. As is clear from (10.1) and (10.14), the value of m for the inverse of an element with this value is $m^{-1}$ and they share the same value of α. For any particular subgroup, it is routine to characterize
the possible values of $e_a$, $e^a$, p and R associated with this symmetry. For each such element, take the m from the list, using $R e_a = m_a^b e_b$, to determine restrictions on the lengths of lattice vectors, etc. Readers might find helpful the discussion by Fadda and Zanzotto [2001a, 2001b] for picturing how the configurations fit together.

References

Adeleke S (2000) On matrix equations of twinning in crystals. Math. Mech. Solids 5, 395-415.
Ericksen JL (1985) Some surface defects in unstressed thermoelastic solids. Arch. Rational Mech. Anal. 88, 337-345.
Ericksen JL (1986) Stable equilibrium configurations of elastic crystals. Arch. Rational Mech. Anal. 94, 1-14.
Ericksen JL (1997) Equilibrium theory for X-ray observations of crystals. Arch. Rational Mech. Anal. 139, 181-200.
Ericksen JL (1999) Notes on the X-ray theory. J. Elasticity 55, 201-218.
Ericksen JL (2000) On correlating two theories of twinning. Arch. Rational Mech. Anal. 151, 261-289.
Ericksen JL (2001a) On the theory of growth twins in quartz. Math. Mech. Solids 6, 359-386.
Ericksen JL (2001b) Twinning analyses in the X-ray theory. Int. J. Solids and Structures 38, 967-995.
Ericksen JL (2001c) On the phase transition in quartz. J. Elasticity 63, 61-86.
Ericksen JL (2002a) On Pitteri neighborhoods centered at hexagonal close-packed configurations. To appear in Arch. Rational Mech. Anal.
Ericksen JL (2002b) On the X-ray theory of twinning. To appear in Math. Mech. Solids.
Fadda G, Zanzotto G (2001a) Symmetry breaking in monoatomic 2-lattices. Int. J. Nonlinear Mechanics 36, 527-547.
Fadda G, Zanzotto G (2001b) On the arithmetic classification of crystal structures. Acta Cryst. A 57, 492-506.
Jian L, James RD (1997) Prediction of microstructure in monoclinic LaNbO4 by energy minimization. Acta Mater. 45, 4271-4281.
Pitteri M (1985a) On (ν + 1) lattices. J. Elasticity 55 201-218.
Pitteri M (1985b) On the kinematics of mechanical twinning in crystals. Arch. Rational Mech. Anal. 88, 25-58.
Pitteri M (1986) On type-2 twins in crystals. Int. J. Plasticity 2, 99-106.
Pitteri M (1998) Geometry and symmetry of multi-lattices. Int. J. Plasticity 14, 13-57.
Pitteri M, Zanzotto G (1998) Beyond space groups: the arithmetic symmetry of deformable multilattices. Acta Cryst. A54, 359-373.
Pitteri M, Zanzotto G (2002) Symmetry-breaking and transformation twinning. Pending publication.
On The X-ray Theory of Twinning
J. L. ERICKSEN
5378 Buckskin Bob Road, Florence, OR 97439, USA (Received 22 December 2001; accepted 7 January 2002)
Dedicated to Millard F. Beatty

Abstract: Twinning equations associated with the X-ray theory are conceptually different from others in the literature, in that they are not linked to deformation. Despite this, they have been applied to deformation twins, as well as to growth twins, with some success. In part, this is an exposition of such theory, but it also contains new results.

Key Words: twinning theory, continuum theory of crystals
1. INTRODUCTION Workers interested in crystals have long used the idea that, at fixed temperature, changes in the energy of a crystal are determined by changes in its crystallography, using this in molecular theories of elasticity and studies of phase transitions involving changes of symmetry, among other things. For the transitions, workers often use what is essentially macroscopic theory, assuming that the relevant thermodynamic potential is a function of variables describing crystal structure, perhaps with other kinds of variables considered to be important. Typically, workers set up and analyze relatively simple energy minimization problems, with simplifying assumptions appropriate for a narrow range of problems. My experience is that studies of the kinds presented by Toledano and Dmitriev [1] are fairly typical. When crystal structure and deformation are both considered and when workers make clear what they assume about relating these, the common practice is to use the Cauchy-Born rule, hereafter referred to as CBR. Zanzotto [2] pointed out that this seems to be reliable for deformations encountered in phase transitions and twinning in Bravais lattices and shape-memory alloys. However, it often fails to apply to such phenomena in crystal multi-lattices. Those concerned with deformation twinning are painfully aware of this, as is clear from the review by Christian and Mahajan [3], but these workers have had only limited success in finding reliable substitutes for CBR. After pondering these facts, I [4] decided to formulate a theory to try to correctly describe the crystallography, leaving open the possibility of using CBR for limited changes of configurations, when one trusts it. I call this the X-ray theory, since it deals with observations commonly made using X-rays. In this respect, it is somewhat unconventional. 
Also, instead of the traditional point or space groups, it uses as invariance groups the lattice groups associated with Pitteri's [5, 6] neighborhoods or the infinite groups he discusses. The merits
Mathematics and Mechanics of Solids 7: 331-352, 2002 © 2002 Sage Publications
DOI: 10.1177/108128028475
of doing this are made clear by Pitteri [7] and Pitteri and Zanzotto [8]. Since I proposed this theory, I have been testing it on various kinds of phenomena, including twinning analyses, with some success. Formally, the theory of twinning presented by Pitteri and Zanzotto [8] is quite similar to that to be discussed, and I recommend that readers also study it. Conceptually, there is an important difference. In their twinning equations, an important ingredient is the kinematic condition of compatibility, associated with requiring the displacement to be continuous when the deformation gradient undergoes jump discontinuities across some surface. In my theory of twinning, I use instead an assumption not related to deformation, to be described later. This makes it feasible to analyze growth twins, those occurring naturally as a crystal is grown, for which consideration of deformation is not relevant to their formation. These involve an interesting variety of twins which are rather different from those produced mechanically. I [9] used this to analyze most of the well-established growth twins in quartz, with good results, there being two kinds to which my twinning equations obviously cannot be applied. Commonly, deformation twins are generated by applying, then removing, shear stresses. Here, my equations often give results which are different from those based on the other view. However, I [10] explained how they complement instead of contradicting each other.
2. BACKGROUND Here, I will deal only with the most elementary kinds of twinning analyses, primarily kinematical studies. This does not really make use of constitutive equations, although it is consistent with those of the X-ray theory. However, it is important to understand how some groups fit into the picture. To be considered are n-lattices, consisting of n identical interpenetrating lattices, described by lattice vectors $e_a$, with the reciprocal lattice vectors (dual basis) $e^a$, and n - 1 shift vectors $p_i$. Different lattices can be populated by the same or different kinds of atoms. Pick as a base point some atom in one of the lattices. Then, the $p_i$ are the position vectors relative to it of some atom in each of the other lattices. For a given n-lattice filling all of space, there are infinitely many ways of selecting these vectors. If $(e_a, p_i)$ is one set, $(\bar{e}_a, \bar{p}_i)$ is another provided
$$\bar{e}_a = m_a^b e_b \ \Rightarrow\ \bar{e}^a = (m^{-1})_b^a e^b, \quad m = \| m_a^b \| \in GL(3, \mathbb{Z}) \tag{1}$$
and
$$\bar{p}_i = \alpha_i^j p_j + l_i^a e_a, \quad l_i^a \in \mathbb{Z}. \tag{2}$$
For matrices, my convention is that the lower index labels rows. A description of this kind is called non-essential if the configuration can also be described as an n-lattice with a smaller value of n, essential otherwise. The former are discussed in some detail by Pitteri [7] and Ericksen [11]. Here, I consider only essential descriptions. The matrices denoted by α describe the effect of interchanging identical atoms, as is discussed in some detail by Pitteri [6] and Ericksen [12]. Here, we will not use the specific forms of these. Together, (1) and (2) describe infinite groups to be used.
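A short numerical illustration of (1) may help fix the index conventions. The sketch below assumes the reading $\bar{e}_a = m_a^b e_b$ and $\bar{e}^a = (m^{-1})_b^a e^b$, with the lower index labelling rows. The lattice vectors and the integer matrix m are arbitrary illustrative choices, not taken from the text.

```python
import numpy as np

# Rows are lattice vectors e_a of an illustrative (hypothetical) lattice
e = np.array([[1.0, 0.0, 0.0], [0.3, 1.1, 0.0], [0.2, 0.4, 0.9]])
e_dual = np.linalg.inv(e).T  # rows are e^a, so that e^a . e_b = delta

# m = ||m_a^b||: the lower index a labels rows; integer entries, det = +1 or -1
m = np.array([[1, 1, 0], [0, 1, 0], [2, 0, 1]])
m_inv = np.linalg.inv(m)

e_bar = m @ e                  # e_bar_a = m_a^b e_b
e_bar_dual = m_inv.T @ e_dual  # e_bar^a = (m^{-1})_b^a e^b

# The transformed vectors are again dual to each other
duality = e_bar_dual @ e_bar.T

# m^{-1} has integer entries, so e_a and e_bar_a generate the same lattice
inverse_is_integer = bool(np.allclose(m_inv, np.round(m_inv)))

print(np.round(duality, 12))
print(inverse_is_integer)
```

Because m and its inverse are both integer matrices, each set of lattice vectors is an integer combination of the other, which is why the two descriptions cover the same set of points.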
Any particular set of these vectors determines some finite groups, those relevant here being a lattice group
$$L(e_a, p_i) = \{ m, \alpha, l \mid m_a^b e_b = Q e_a,\ \alpha_i^j p_j + l_i^a e_a = Q p_i,\ Q \in O(3) \}, \tag{3}$$
a skeletal lattice group
$$L(e_a) = \{ m \mid m_a^b e_b = Q e_a,\ Q \in O(3) \}, \tag{4}$$
a point group
$$P(e_a, p_i) = \text{all } Q \text{ occurring in (3)} \tag{5}$$
and a skeletal point group
$$P(e_a) = \text{all } Q \text{ occurring in (4)}, \tag{6}$$
often called the holohedral point group or holohedry. The order of $L(e_a)$ and $P(e_a)$ can be and often is larger than that of the other two groups. The values of m, α and l mentioned in describing the lattice groups are, of course, a subset of those referred to in (1) and (2). Twinning involves jump discontinuities in these vectors across some surfaces, and many discussions of these associate them with deformation. As terminology used by experts more broadly interested in twinning is described by one of these, Cahn [13, p. 388],

The interface or composition surface between the individuals constituting a twin is something quite distinct from the twin plane, where there is one. The latter is a plane of structural symmetry between the components, while the interface may or may not be plane, and may or may not be parallel to the twin plane. When the interface is parallel to the twin plane, the edifice is termed a contact twin....When the interface is crystallographically irregular, so that the components appear as if they had grown simultaneously but independently of each other, the edifice is called a penetration twin.
My impression is that, when workers deal with something other than contact twins, they are generally dealing with observations of growth twins, not some theory. Later, I will say a little more about this. To understand what these words mean to experts, it seems necessary to study how they interpret them in specific cases, and to assess how much consistency there is in the usage. Certainly, not all experts interpret twin planes in the same way. For example, according to Hurlbut and Klein [14, p. 98], "the operations that may relate a crystal to its twinned counterpart are: (1) reflection by a mirror plane (twin plane)..." but, according to Hirth and Lothe [15, p. 812], "...$K_1$ is called the composition plane or twin plane." Commonly, $K_1$ is discussed only for twins associated with deformation, being the invariant plane of a simple shearing deformation and, often, the isometry relating the twins is not equivalent to a reflection. For a type II deformation twin, for example, this is not always what I would interpret as a mirror plane and neither of these interpretations seems to be exactly what Cahn had in mind. Otherwise, interpretation seems to be a bit complicated.
J. L. ERICKSEN
For example, in describing Japan twins in quartz, Dana and Dana [16, p. 91] describe four kinds, saying that all are contact twins, that their type I does involve a twin plane, but that the others only involve "pseudo twin" planes. Their type I fits my description of a type I twin, to be discussed later, as a special case. The other types (II, III and IV) involve combining these with Brazil and/or Dauphiné twins. This implies joining enantiomorphic configurations and/or configurations with different polarities of the a-axes. Apparently, this is why they are not regarded as true twin planes by these writers. Because I am not sure what the words mean, I will not try to describe good examples of cases where a twin plane and interface both exist, but are different. An additional complication is that some things Dana and Dana call Japan twins have an interface which is not a plane, being of zigzag form; I would find another name for these, or explain why not. Simply, I [9] treated these four kinds of twins and more complicated arrays as different solutions of my twinning equations. These are growth twins, so other workers have not associated twinning equations with them. Frankly, I do not see why it is important to introduce a notion of a twin plane, although I do not object to others doing so. While workers seem unable to agree upon a general definition of a twin, at least most things called twins have associated with them an interface, considered as a surface of discontinuity. Often, but not always, it is a plane or at least appears to be one from macroscopic observations. Occasionally, workers consider replacing the surface by a thin transition layer, as is illustrated in a sketch presented by Christian and Mahajan [3, Figure 8(b)], for example. For the X-ray theory, one assumption I made excludes continuous distributions of dislocations, which gives the existence of functions $\chi^a$ such that
$$e^a = \nabla\chi^a, \tag{7}$$
at least locally. Essentially, this is what makes it feasible to use CBR$^1$ for subsets of configurations. To get a twinning equation, I introduced a different idea, that it is possible to choose lattice vectors on the two sides such that the Burgers vector vanishes for all Burgers circuits in the neighborhood of the discontinuity surface. For such a pair, this led to the jump condition
$$\bar e^a = (1 - n\otimes a)\,e^a, \tag{8}$$
where $\bar e^a$ and $e^a$ are the limiting values on the two sides of the discontinuity surface, $n$ is its unit normal and $a$ is some vector. This is the assumption mentioned earlier, as being used instead of the kinematic compatibility condition associated with deformation, permitting applications to growth twins. Here, (8) is interpretable as the kinematical condition of compatibility associated with having the functions $\chi^a$ in (7) be continuous on the surface but, in principle, this says nothing about deformation. For almost all elementary twinning analyses, workers have in mind unstressed crystals or those subjected to hydrostatic pressures, this being associated with the idea that the two configurations should be related by some isometry. Also, it is typical to consider piecewise constant configurations, as I shall do here. My way of introducing isometries is through the equations
$$\bar e^a = (1 - n\otimes a)\,e^a = Q\,(m^{-1})^a_b\,e^b \iff \bar e_a = (1 - n\otimes a)^{-T} e_a = Q\,m_a^b\,e_b \tag{9}$$
ON THE X-RAY THEORY OF TWINNING
and
$$\bar p_i = Q\,(\alpha_i^j p_j + l_i^a e_a). \tag{10}$$
Obviously, the substitutions $m \to -m$, $Q \to -Q$ take one solution of (9) to another, but (10) is not always compatible with both when it is with one. To save ink, I will consider only one of the two possibilities in various cases. Intuitively, these equations should imply that values of relevant thermodynamic potentials can be taken to have the same value on the two sides. However, as is clear from my [9] study of growth twins, there are cases where it is necessary to consider the two functions as symmetry related, but different. With various observed kinds of twins, (9) and (10) do seem to apply, judging from my [9, 10, 17] efforts to check this out. They do fail to apply to some things called twins, which involve discontinuity surfaces that are crystallographically inequivalent on the two sides, and most of the failures I have encountered are of this kind. In quartz, the Zinnwald and Zwickau twins are of this kind, for example. For some materials, CBR fails to apply to free thermal expansion, for example, and this can invalidate the prescription I [4] proposed for the mass density, in particular. It should be mentioned that workers seem unable to agree upon a general definition of a twin, and my assumption leading to (8) has not been considered by others. So, if you like, consider (9) and (10) as defining a special class of twins, for which one can do some analyses. From (9), it follows that either
$$\det(1 - n\otimes a) = 1 \;\Rightarrow\; a\cdot n = 0,\quad (1 - n\otimes a)^{-T} = 1 + a\otimes n \quad \text{(S)} \tag{11}$$
or
$$\det(1 - n\otimes a) = -1 \;\Rightarrow\; a\cdot n = 2,\quad (1 - n\otimes a)^{-T} = 1 - a\otimes n \quad \text{(O)}. \tag{12}$$
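The two cases in (11) and (12) follow from elementary properties of the rank-one perturbation $1 - n\otimes a$. The following is a short worked check, added here for illustration only; the scalar $c$ is an auxiliary symbol that does not appear in the original text.

```latex
% N = n \otimes a satisfies N^2 = (a \cdot n) N, and \det(1 - N) = 1 - a \cdot n.
\begin{align*}
\det(1 - n\otimes a) &= 1 - a\cdot n
  \quad\Longrightarrow\quad
  \det = 1 \iff a\cdot n = 0, \qquad \det = -1 \iff a\cdot n = 2, \\
(1 - n\otimes a)(1 + c\, n\otimes a) &= 1 + \bigl(c - 1 - c\, a\cdot n\bigr)\, n\otimes a . \\
a\cdot n = 0:&\quad c = 1
  \;\Longrightarrow\; (1 - n\otimes a)^{-1} = 1 + n\otimes a,
  \quad (1 - n\otimes a)^{-T} = 1 + a\otimes n, \\
a\cdot n = 2:&\quad c = -1
  \;\Longrightarrow\; (1 - n\otimes a)^{-1} = 1 - n\otimes a,
  \quad (1 - n\otimes a)^{-T} = 1 - a\otimes n .
\end{align*}
```

That the determinant can only be $\pm 1$ reflects the right-hand side of (9): $Q$ is orthogonal and $m$ is an integer matrix with integer inverse.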
The designations S and O indicate that the two sets of lattice vectors have the same or opposite orientations, respectively. For deformation twins, the experience is that a simple shearing deformation gradient of the form $1 + b\otimes n$ with $b\cdot n = 0$ takes one configuration to the other. Sometimes, (9) applies with $a = b$, (11) then applying. For this reason, (9) was studied before I proposed the X-ray theory, for example by Pitteri [18, 19]. However, in numerous deformation twins, (9) does not hold for a possible value of $a = b$. Often, workers then relax (9) by assuming that it applies to some non-essential description, allowing this to be different for different twins in the same material. I [10] have discussed how this relates to my version of twinning theory. The difficulty is in trying to predict which non-essential description(s) apply to any particular case. This means that the traditional CBR for relating deformation to changes in lattice vectors fails to apply, and we do not have a reliable alternative. This is one thing that motivated me to propose the X-ray theory, to provide some theory for the kinds of things commonly observed using X-rays. Obviously, the transformations (1) and (2) are determined by giving the set of integers described by
$$L = \{m, \alpha, l\}. \tag{13}$$
If one first applies one labeled $L_1$, then applies $L_2$ to the transformed vectors, the result is equivalent to applying the transformation
$$L_2 \circ L_1 = \{m_2 m_1,\; \alpha_2\alpha_1,\; \alpha_2 l_1 + l_2 m_1\} \tag{14}$$
to the original set of vectors. So, we can take this as describing group multiplication for this infinite group of transformations. As is explained by Pitteri [7] as well as Pitteri and Zanzotto [8], when $L$ is an element of some lattice group,
$$L \in L(e_a, p_i) \;\Rightarrow\; L^n = 1,\quad n = 1, 2, 3, 4, \text{ or } 6, \tag{15}$$
this applying only to essential descriptions. For non-essential descriptions, they note that these numbers are replaced by multiples of them. As particular solutions of the twinning equations, one has those characterized by
$$a\cdot n = 0,\quad a \neq 0,\quad \text{(9) and (10) with } Q = 1, \tag{16}$$
called lattice invariant shears$^2$. For these, one possibility for (10) is $\bar p_i = p_i$, and $m$ is of the form
$$m_a^b = \delta_a^b + m^b n_a,\quad m^a n_a = 0, \tag{17}$$
where $m^a$ and $n_a$ can be taken as integers. With typical X-ray observations, such shears cannot be detected. For some, but not many, $n$-lattices, one can get another solution by the substitution mentioned after (10). In some commonly encountered cases, one can use them to change values of $a$ and $m$ in the twinning equations, to get physically equivalent solutions, as far as such observations are concerned, and the equivalent of this is often done in trying to relate shear deformation to X-ray observations. Similarly, one has solutions of the form
$$a\cdot n = 2,\quad \text{(9) and (10) with } Q = 1, \tag{18}$$
with $m$ of the form
$$m_a^b = \delta_a^b - k_a l^b,\quad k_a l^a = 2 \;\Rightarrow\; m^2 = 1, \tag{19}$$
where $k_a$ and $l^a$ can be taken as integers. Here again, one could take $\bar p_i = p_i$ and, occasionally, get another solution by the above substitution. In particular cases, this can also be used to make useful adjustments to the values of $a$ and $m$. Then there are what are regarded as trivial solutions,
$$\{m, \alpha, l\} \in L(e_a, p_i) \text{ and } Q \in P(e_a, p_i), \tag{20}$$
giving
$$Q^T m_a^b e_b = e_a,\quad Q^T(\alpha_i^j p_j + l_i^a e_a) = p_i,\quad Q^T (m^{-1})^a_b\, e^b = e^a. \tag{21}$$
Combining such a solution with a lattice invariant shear gives a fake twin: no discontinuity would be visible in X-ray observations. These have been used to generate from one solution equivalent solutions with different values of $\{m, \alpha, l\}$ and $Q$. Later, I will give a brief discussion of associating constitutive equations with these and other twins.
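The integer matrices in (17) and (19) can be checked numerically. The sketch below is illustrative only: the helper names (`shear_matrix`, `involution_matrix`, `det3`) and the sample integers are assumptions chosen for the example, not taken from the paper. It stores $m_a^b$ with row index $a$ and column index $b$.

```python
# Pure-Python check of the integer matrices in (17) and (19).
# M[a][b] stores the component m_a^b (row index a, column index b).
# All helper names and sample integers here are hypothetical.

def identity():
    return [[1 if a == b else 0 for b in range(3)] for a in range(3)]

def matmul(A, B):
    return [[sum(A[a][c] * B[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def contract(upper, lower):
    """Sum over the repeated index, e.g. m^a n_a."""
    return sum(x * y for x, y in zip(upper, lower))

def shear_matrix(m_up, n_low):
    """m_a^b = delta_a^b + m^b n_a with m^a n_a = 0, as in (17)."""
    if contract(m_up, n_low) != 0:
        raise ValueError("(17) requires m^a n_a = 0")
    unit = identity()
    return [[unit[a][b] + m_up[b] * n_low[a] for b in range(3)]
            for a in range(3)]

def involution_matrix(k_low, l_up):
    """m_a^b = delta_a^b - k_a l^b with k_a l^a = 2, as in (19)."""
    if contract(l_up, k_low) != 2:
        raise ValueError("(19) requires k_a l^a = 2")
    unit = identity()
    return [[unit[a][b] - k_low[a] * l_up[b] for b in range(3)]
            for a in range(3)]

# A lattice invariant shear and the shear with the opposite amplitude.
shear = shear_matrix([1, 0, 0], [0, 1, 0])
shear_inverse = shear_matrix([-1, 0, 0], [0, 1, 0])

# An orientation-reversing involution of the kind appearing in (19).
flip = involution_matrix([1, 1, 0], [1, 1, 0])
```

Under these assumptions the shear has determinant 1 and an integer inverse obtained by reversing the sign of $m^a$, while the matrix built from (19) squares to the identity and has determinant $-1$, in line with the S and O designations.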
3. COMMONLY OBSERVED TWINS

As is familiar to at least some of those knowledgeable about twinning, most commonly observed mechanical twins are S-twins which employ in (9) and (10) transformations such that
$$L^2 = L \circ L = 1 = \{1, 1, 0\},\quad \text{with } a\cdot n = 0. \tag{22}$$
From (14), this implies that
$$m^2 = 1,\quad \alpha^2 = 1,\quad \alpha l + l m = 0. \tag{23}$$
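The step from (14) and (22) to (23) can be sketched in code. This is a minimal illustration, assuming a single shift vector so that $\alpha$ is a $1\times 1$ matrix and $l$ a $1\times 3$ matrix; the function names and the sample element are hypothetical.

```python
# Composition rule (14) for L = (m, alpha, l), with matrices stored as lists of rows.
# m is 3x3, alpha is nu x nu, l is nu x 3, where nu is the number of shifts.
# Function names and sample values are illustrative assumptions.

def matmul(A, B):
    """Product of two rectangular matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def compose(L2, L1):
    """Apply L1 first, then L2: {m2 m1, alpha2 alpha1, alpha2 l1 + l2 m1}."""
    m2, alpha2, l2 = L2
    m1, alpha1, l1 = L1
    return (matmul(m2, m1),
            matmul(alpha2, alpha1),
            matadd(matmul(alpha2, l1), matmul(l2, m1)))

def identity_element(num_shifts):
    """The element {1, 1, 0} of (22)."""
    m = [[1 if a == b else 0 for b in range(3)] for a in range(3)]
    alpha = [[1 if i == j else 0 for j in range(num_shifts)]
             for i in range(num_shifts)]
    l = [[0, 0, 0] for _ in range(num_shifts)]
    return (m, alpha, l)

half_turn = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]

# Satisfies m^2 = 1, alpha^2 = 1 and alpha l + l m = 0, so L o L = {1, 1, 0}.
L_good = (half_turn, [[-1]], [[0, 0, 1]])

# Same m and alpha, but alpha l + l m = (-2, 0, 0), so (23) fails.
L_bad = (half_turn, [[-1]], [[1, 0, 0]])

square_good = compose(L_good, L_good)
square_bad = compose(L_bad, L_bad)
```

The second element shows that $m^2 = 1$ and $\alpha^2 = 1$ alone are not enough: the third condition in (23) couples the shifts to $m$.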
For the present, I assume that
$$m \notin L(e_a). \tag{24}$$
Here, (22)4 requires that, for (9) to be satisfied, $\det Q = \det m = \pm 1$ and, for simplicity, I will only consider the possibility
$$\det Q = \det m = 1, \tag{25}$$
except where there is a note to the contrary. Then, the work of Pitteri [19] shows that, with $R_v^\psi$ denoting the rotation with axis $v$ and angle $\psi$, either
$$Q = R_n^\pi \tag{26}$$
or
$$Q = R_a^\pi \tag{27}$$
for solutions of (9). For the X-ray theory, I take (22) and (26) as the definition of a type I twin, (22) and (27) as the definition of a type II twin, it being also assumed that (9) and (10) hold. I am inclined to accept relaxing this, to allow $\det Q = \det m = -1$ as well as (25), if examples of this kind turn up. I have examined information on numerous twins, finding that what are classified as twins of these types in the literature do seem to be consistent with these definitions. My usage is unusual, in applying these names to some growth twins as well as mechanical twins. There is a partial converse, which follows from results obtained by Pitteri
[18] and Ericksen [20], for essential descriptions: if (9) is satisfied with $Q$ of the form (26) or (27), then (23)$_1$ must hold for S-twins. Solutions of (9) for these two types are presented by Pitteri [18, 19]. I will record them in a slightly different way. First, (23) and (25) imply that $m$ is of the form
$$m_a^b = -\delta_a^b + u_a v^b,\quad u_a v^a = 2, \tag{28}$$
where $u_a$ and $v^a$ can be taken as integers. Also, it is easy to see that (9) is equivalent to
$$1 - n\otimes a = QH,\quad H = (m^{-1})^a_b\, e^b\otimes e_a, \tag{29}$$
with (28) giving
$$H = -1 + u\otimes v,\quad u = u_a e^a,\quad v = v^a e_a. \tag{30}$$
For type I twins, solving (29) gives
$$n\otimes a = u\otimes\left(\frac{2u}{|u|^2} - v\right). \tag{31}$$
Here, for example, $n = \pm u/|u|$ and it is immaterial which of these two is used. Commonly, $n$ is said to be rational or to have a rational direction if there is a vector with this direction having integer components relative to the basis $e^a$. Here, this is the case, since the $u_a$ are integers, so
$$\text{for type I twins, } n \text{ is rational.} \tag{32}$$
This means that the discontinuity surface is parallel to some crystallographic planes. While my theory of twinning is not conventional, this statement seems to be accepted by all workers involved in twinning problems, and it can be deduced from twinning equations in use that are not equivalent to mine. Here, for example, there are lattice invariant shears of the form S = l + c
(33)
in (28) gives a different value of H also satisfying (30)5 and, using (31), it is easy to see that the effect is to superpose a lattice invariant shear. In terms of what is commonly observed using X-rays, this gives a physically equivalent solution. Similarly, for type II twins, one gets
$$n\otimes a = \left(u - \frac{2v}{|v|^2}\right)\otimes v. \tag{34}$$
So, for the same $m$, we get these two solutions. Many theoretically possible twins have not been observed. In particular, type I twins are much more common than type II twins. If there is some vector with the direction of $a$, having integer components relative to the basis $e_a$, $a$ is said to be rational or to have a rational direction. Here, with the $v^a$ being integers, this is the case, so
$$\text{for type II twins, } a \text{ is rational.} \tag{35}$$
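The type I and type II solutions can be verified numerically for a sample lattice. The sketch below assumes $H = -1 + u\otimes v$ with $u = u_a e^a$, $v = v^a e_a$ and $u_a v^a = 2$, takes $Q$ to be the rotation by $\pi$ about $n$ (type I) or about $a$ (type II), and checks $1 - n\otimes a = QH$. The lattice vectors, the integers and all function names are illustrative assumptions, not data from the paper.

```python
import math

# Numerical check of type I and type II solutions of 1 - n(x)a = Q H.
# Basis vectors, integers and helper names are hypothetical sample choices.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    return [x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0]]

def scale(c, x):
    return [c * a for a in x]

def combine(coeffs, vectors):
    """Return sum_a coeffs[a] * vectors[a]."""
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(3)]

def outer(x, y):
    return [[a * b for b in y] for a in x]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_pi(axis):
    """Rotation by pi about `axis`: -1 + 2 (axis (x) axis) / |axis|^2."""
    unit = scale(1.0 / math.sqrt(dot(axis, axis)), axis)
    return [[2.0 * unit[i] * unit[j] - (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def reciprocal_basis(e):
    volume = dot(e[0], cross(e[1], e[2]))
    return [scale(1.0 / volume, cross(e[1], e[2])),
            scale(1.0 / volume, cross(e[2], e[0])),
            scale(1.0 / volume, cross(e[0], e[1]))]

def residual(n, a, Q, H):
    """Largest entry of (1 - n (x) a) - Q H, which should vanish."""
    lhs = mat_sub(identity(), outer(n, a))
    rhs = matmul(Q, H)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))

def type_one(u, v):
    """n = u/|u|, a = 2u/|u| - |u| v, Q = rotation by pi about n."""
    norm_u = math.sqrt(dot(u, u))
    n = scale(1.0 / norm_u, u)
    a = [2.0 * ni - norm_u * vi for ni, vi in zip(n, v)]
    return n, a, rotation_pi(n)

def type_two(u, v):
    """n parallel to u - 2v/|v|^2, a parallel to v, Q = rotation by pi about a."""
    w = [ui - 2.0 * vi / dot(v, v) for ui, vi in zip(u, v)]
    norm_w = math.sqrt(dot(w, w))
    return scale(1.0 / norm_w, w), scale(norm_w, v), rotation_pi(v)

e_low = [[1.0, 0.0, 0.0], [0.0, 1.2, 0.0], [0.3, 0.0, 1.5]]  # sample e_a
e_up = reciprocal_basis(e_low)                                # reciprocal e^a
u = combine([1, 1, 0], e_up)    # u = u_a e^a with integer u_a
v = combine([1, 1, 0], e_low)   # v = v^a e_a with integer v^a, u_a v^a = 2
H = mat_sub(outer(u, v), identity())

n1, a1, Q1 = type_one(u, v)
n2, a2, Q2 = type_two(u, v)
```

In both cases $a\cdot n = 0$ follows from $u\cdot v = u_a v^a = 2$. For the type I solution the normal is along $u$, which has integer components in the reciprocal basis; for the type II solution the amplitude is along $v$, which has integer components in the direct basis.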
This means that $a$ is parallel to some rows of atoms. So the words "is rational" have different meanings in (32) and (35). For deformation twins, with $a$ interpreted as an amplitude of a shear deformation, workers accept this. Although in my interpretation $a$ is not necessarily related to any deformation, there are cases where such an amplitude satisfies my equations. Given that irrationals can be approximated within specified bounds by rationals and that observations always are subject to some errors, it is not possible to determine experimentally whether such directions are rational. The practice is to assume they are irrational unless some theory requires them to be rational. One can get other solutions from one, with the same direction of $a$, using lattice invariant shears obtained by the substitution
$$u_a \to u_a + y_a,\quad y_a v^a = 0. \tag{36}$$
This does change the direction of $n$. For type II twins, one can get an estimate of the direction of $a$ and $n$ from X-ray observations, with the inevitable experimental errors. Thus, the different solutions obtained are not really physically equivalent, even if one excludes other kinds of observations. However, sometimes one can use this to get values of these vectors better matching observations. I [10] discussed this in more detail. For deformation twins, the evidence seems to indicate this is also the direction of the amplitude of the shear deformation and one uses other methods to determine its magnitude, commonly optical methods. I [10] did a little analysis, looking at the theoretical possibility that the two directions are different. I did make a serious effort to locate observations of a difference in these directions, but found none. Similarly, for various kinds of twins, one also uses X-ray observations to get an estimate of the direction of $n$, relative to the crystal structure. Note that for a given $H$ fitting (28) and (29), one can get these solutions for any admissible values of $e_a$ and $p_i$. Assuming that $a \neq 0$, $a\cdot n = 0$, Pitteri and Zanzotto [8] showed that, if $m$ is such that (9) can be satisfied for all $e_a$, then either it describes a lattice invariant shear or $m^2 = 1$. It is easy to show that this remains true if one allows $a = 0$. This suggests a study which should be helpful in understanding less common twins. Suppose that we fix some skeletal lattice group $L'$ and consider the set
$$S_{L'} = \{e_a \mid L(e_a) = L'\} \tag{37}$$
and seek all values of
$$\{m \mid \text{(9) can be satisfied } \forall\, e_a \in S_{L'}\}. \tag{38}$$
I [9] did use this idea to analyze Friedel twins in quartz, for which $Q$ is a 90° rotation. My view is that, with occasional exceptions, changing pressure and/or temperature will generally change values of $e_a$ (and $p_i$) without changing the lattice group, so one is more likely to observe twins that can be formed for any such values.
Of course, one should ensure that the shifts satisfy (10). For the general theory, one gets no equations to be satisfied by $\alpha$ and $l$, so there is nothing else to be said about this, except that one can use (10) to calculate possible values of $\bar p_i$ in terms of given values of $p_i$. Later, I will mention theoretical reasons making it desirable to restrict possible choices of $L$ to a finite set, in some situations. Then, the values of $\alpha$ and $l$ are linked to choices of $m$, as a set belonging to some lattice group element. Most observed mechanical twins are either of type I or they are compound twins, which means that they can be analyzed as type I and as type II twins. As I interpret this, we should satisfy two sets of twinning equations. For the type I, I will use the notation above, so $Q$ is given by (26), for example. For the type II, I use
$$\hat e_a = (1 - \hat n\otimes\hat a)^{-T} e_a = \hat Q\,\hat m_a^b\, e_b,\quad \hat p_i = \hat Q\,(\hat\alpha_i^j p_j + \hat l_i^a e_a),\quad \hat Q = R_{\hat a}^\pi. \tag{39}$$
To workers with some experience in doing twinning calculations, it might be surprising that I do not take $\hat a = a$. However, in the X-ray theory, it is not unusual for (9) to be satisfied for infinitely many values of $a$, with equivalent values of $\bar e^a$ and shifts, and $a$ is not necessarily related to deformation, so it seems to me better to avoid this. However, $(\bar e_a, \bar p_i)$ and $(\hat e_a, \hat p_i)$ should describe the same configuration, hence be related by some transformations fitting (1) and (2), say
$$\hat e_a = \tilde m_a^b\,\bar e_b,\quad \hat p_i = \tilde\alpha_i^j\,\bar p_j + \tilde l_i^a\,\bar e_a. \tag{40}$$
I will not try to explore in detail all the implications of these equations, but it is easy to obtain some interesting results using (40) with the twinning equations. For one thing, this gives eb = Qmabeb = Q (mm" 1 ); e»,
(m-%
(41)
which can be rearranged to give QQe = (mm-1 m)abeb ^QQeP(ea),
(42)
and it is easy to verify that, with a • n = 0 implied by (22)2,
Q del QQ = QQ = ± R f n ,
(43)
if we allow not only (25), but also d e t Q = d e t m = —1. Of course, we always have Q G -P(ea) ^ - Q e P(ea). It is also true that the combination of m s in (42) is an element of L(ea), which makes it conjugate to ± R ^ A n . Among other things, this gives m 2 = 1, m = mm" 1 !!!.
(44)
So, in the X-ray theory, this is a necessary condition for a twin to be compound. By doing a similar analysis of shifts, one can verify the stronger condition that
R f n E P(ea,Pi)
and/or - R f n e P { e a , P i ) .
(45)
When an isometry of this kind applies, one can use it to transform a type II solution to a type I, for example. A proof of this is sketched in the Appendix for the first possibility mentioned in (45): the other possibility can be treated in essentially the same way. So, this is also a sufficient condition for a type II twin to be compound. For a type I twin, one can use lattice invariant shears to change the direction of a, leaving that of n fixed, and it is unreasonable to think that (45) holds for all such directions. However, if it holds for one, one can use it to construct a type II solution and, for this choice of a, a = a. So, it is compound. I will not give a proof of this, which is very much like that for the type II case. From another view, we have ea = (1 - n ® a) ea = (nT 1 )" e* = (1 - n
(46)
(1 - n ® a)" 1 (1 - n
(47)
or
with S = l + (a-a)®n
(48)
then being a lattice invariant shear. Adjust this in the way discussed earlier for the type I solution and you get another new result, that, to within lattice invariant shears, $\hat a = a$. I believe that these easily proven results are new. Next consider cases of type II deformation twins for which the CBR does not apply, with some shear deformation gradient $S_D = 1 + b\otimes n$, $b\cdot n = 0$. Accept that it will apply to lattice vectors associated with some non-essential description and that $b$ is rational, assumptions that are widely accepted. Let $S_X$ denote any solution obtained using the X-ray theory, with amplitude $a$ parallel to $b$. Then, I [10] showed that there is a shear $S$ such that
$$S_D = S_X S = S S_X, \tag{49}$$
$S$ being interpretable as a lattice-invariant shear for the non-essential description, but not for the essential. There are similar results for type I and compound twins. Generally, the three shears involve three different values of $n$, which seems curious, although it is easy to understand, mathematically. However, one can get solutions in the X-ray theory for which the difference in values of directions of $n$ for $S_D$ and $S_X$ is as small as you like. Referring to the description of terminology quoted in Section 2, one interpretation might be that these twins are not exactly contact twins. Reasonably, one could say that the interface is or is well approximated by a plane associated with $S_X$. Planes associated with any of the X-ray solutions fit some descriptions of twin planes, although it is unclear why one should favor one such plane instead of another. I have not found a good discussion of just how experimentalists determine twin planes and plane interfaces, when they exist but are not parallel. For crystals to which CBR applies, it can be reasonable to use the thermoelastic theory of twinning. For this, as it applies to type II twins, Pitteri and Zanzotto [21] note that, if the strain gets perturbed, for example by a change of temperature, and the crystal stays stress free, then the interface should rotate through the material. I also find this curious.
4. OTHER TWINS

While they are relatively rare, other kinds of twins have been observed. First, let us consider that (24) and, possibly, (22) are violated. Then, (9) and (10) are to be satisfied with
$$m_a^b e_b = \tilde Q e_a,\quad \tilde Q \in O(3), \tag{50}$$
this implying that, for (9) to be satisfied,
$$1 - n\otimes a = Q\tilde Q \in O(3). \tag{51}$$
Then, it is easy to show that
$$a\cdot n = 0 \;\Rightarrow\; a = 0 \;\Rightarrow\; Q = \tilde Q^T. \tag{52}$$
In studying the applicability of my equations to growth twins in quartz, I [9] encountered two types fitting this description, commonly called penetration twins and, tentatively, suggested using these equations as a definition of penetration twins. It needs to be explored how well this fits other things called penetration twins. As was mentioned in the quotation from Cahn [13] in Section 2, the basic property is that the discontinuity surfaces do not have normals with particular crystallographic directions. With (52), it is obvious that one gets no restrictions on $n$. For quartz, the examples are the well-known Brazil and Dauphiné twins, and my equations apply to various configurations involving combinations of these, sometimes including other kinds. In particular, these are the kinds of twins mentioned earlier as accompanying some Japan twins. They involve jumps in shifts. Observations indicate that discontinuity surfaces associated with Dauphiné twins generally have rather random shapes. The two configurations generally occupy nearly equal volumes. Those associated with Brazil twins tend to be of zigzag form, involving planes having various crystallographic directions, when they occur in the interior and, generally, one occupies a much larger part of the volume: I do not know of a good theoretical reason for these differences. They can also occur, with one configuration as a small growth on the surface of a crystal in the twin configuration. Actually, there are differences in symmetry between the Brazil and Dauphiné twins, which I used to explain the observations that Brazil twins cannot be removed by mechanical treatments although this is possible for Dauphiné twins. The other logical possibility is what I will call exceptional O-twins,
$$a\cdot n = 2 \;\Rightarrow\; a = 2n \;\Rightarrow\; Q\tilde Q = -R_n^\pi. \tag{53}$$
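The implications in (52) and (53) follow from requiring $1 - n\otimes a$ to be orthogonal, with $|n| = 1$. The short derivation below is added as an illustration and is not quoted from the original.

```latex
\begin{align*}
(1 - n\otimes a)^T (1 - n\otimes a)
  &= 1 - a\otimes n - n\otimes a + a\otimes a = 1
  \quad\Longrightarrow\quad a\otimes a = a\otimes n + n\otimes a . \\
\text{Applying both sides to } n:&\quad (a\cdot n)\, a = a + (a\cdot n)\, n . \\
a\cdot n = 0:&\quad a = 0, \qquad 1 - n\otimes a = 1 , \\
a\cdot n = 2:&\quad a = 2n, \qquad 1 - n\otimes a = 1 - 2\, n\otimes n = -\bigl(-1 + 2\, n\otimes n\bigr) = -R_n^{\pi} .
\end{align*}
```

In the first case the product of the two orthogonal transformations in (51) is the identity, and in the second it is the reflection in the plane with normal $n$.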
If one has values of variables satisfying (53), one can fix all but $n$ and $Q$ and let these vary, to keep (53) satisfied. In this sense, $n$ is not required to have a particular crystallographic direction. In this respect, these solutions are somewhat like those for penetration twins. However, (53) applies to Friedel twins in quartz and workers do not regard them as penetration twins. That is, according to my [9] analysis of these, they provide an example of twins of this kind, with $Q$ a 90° rotation about an axis perpendicular to $n$.
Rotation twins are also observed. As described by Barrett and Massalski [22, p. 406],

"Crystals are rotation twins if a two-, three-, four- or sixfold rotation of one crystal about a twinning axis produces the orientation of the other. The rotation axis lies either in the twinning plane or normal to it and is not a symmetry element of the individual crystals."
For S-twins, I [20, 23] presented solutions of (9) of this kind, but there is reason to also consider O-twins, one example of the latter being provided by the aforementioned Friedel twins. Previously, Zanzotto [24, 25] had done interesting studies of the effects of pressure and temperature on rare quartz crosses involving these, my studies adding a little to his theory. He found that, in general, changes in pressure and/or temperature induce shear stresses in such configurations unless their values lie on certain curves. He reasoned that those found in good condition should have grown on or quite near such a curve, passing through atmospheric pressure at room temperature. I [9] noted that, for similar reasons, his curve also applies to the Zinnwald and Zwickau twins in quartz, although these are not observed to form crosses. As was mentioned earlier, my twinning equations do not apply to these two kinds of twins, but I copied what Zanzotto did for the Friedel twins, considering the effect on them of changing pressure and/or temperature, using CBR, which does seem to apply to these phenomena in quartz. Crosses are much more common in staurolite, for example, but relevant data are lacking for them. As was mentioned earlier, the Friedel twins are observed to involve a 90° rotation with axis in the (planar) interface. My analysis of the twinning equations and those by Zanzotto do use this and mine requires that they be O-twins, $a\cdot n = 2$. Thus, my [20, 23] solutions for S-twins do not apply to these twins. I do not have enough experience with growth twins to judge whether this is unusual. It did lead to the prediction that, by mechanical means, it is not possible to create or destroy such twins, or to make them move through the material. For these twins, I know of no observations relevant to this. While my theory describes the Brazil twins as S-twins, they do involve joining enantiomorphic configurations, this being associated with a difference in shifts.
For the Friedel twins, Zanzotto and I used quite different reasoning, but we came to agreeable conclusions concerning finding two values, both satisfying $m^2 = 1$, $\det m = -1$. Both are used to construct solutions for the crosses. One could adjust these in various ways, by adding in Brazil or Dauphiné twins, but observations concerning occurrence of these are lacking. As should be clear by now, quartz provides examples of an interesting variety of twins. I know of only two examples of deformation twins that are not of type I, type II or compound. One that was discovered recently, in LaNbO$_4$, has been successfully analyzed by Jian and James [26], using thermoelasticity theory. Here, CBR is used successfully. The interface seems to be planar. The isometry is a rotation with axis in this plane, an angle fairly close to but different from 90°, so these are not type I, type II or rotation twins. The other example involves Dauphiné twins in quartz. Generally, these are not regarded as deformation twins, because they commonly get created as growth or transformation twins. However, they can be created by applying rather concentrated forces. I do not know of another kind of penetration twin that can be produced mechanically.

Now consider solving (9) for given values of $m$ and $e_a$. Rewrite (9), as was done in (29), the only difference being that (28) is not required to be satisfied. Rearrange this to get
$$Q = (1 - n\otimes a)\,H^{-1} \tag{54}$$
and use $Q^{-T} = Q$ to eliminate $Q$. This gives
$$(1 - n\otimes a)^{-T} H^T = (1 - n\otimes a)\,H^{-1} \tag{55}$$
or the equivalent
$$(1 - a\otimes n)(1 - n\otimes a) = K = H^T H \tag{56}$$
as an equation for determining $a$ and $n$. From the way it is defined, $K$ is symmetric, positive definite and $\det K = 1$. Obviously, the matrix on the left has an eigenvalue equal to 1, with eigenvector perpendicular to $a$ and $n$. Thus,
$$\text{for a solution to exist, } K \text{ must have an eigenvalue equal to 1.} \tag{57}$$
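When (57) holds, we have
$$K = \lambda\, f_1\otimes f_1 + \frac{1}{\lambda}\, f_2\otimes f_2 + f_3\otimes f_3, \tag{58}$$
where the $f_a$ are orthonormal eigenvectors. One possibility is
$$\lambda = 1 \;\Rightarrow\; K = 1, \tag{59}$$
giving either the trivial possibilities indicated in (20) or the twins covered by (52) or (53). Another possibility is
$$\lambda \neq 1,\quad a\cdot n = 0. \tag{60}$$
One can then solve (56), with $|n| = 1$, getting
$$a = n - w,\quad w = \lambda n_1 f_1 + \frac{1}{\lambda} n_2 f_2 = Kn,\quad n_a = n\cdot f_a, \tag{61}$$
and
$$n_1^2 = \frac{1}{1+\lambda},\quad n_2^2 = \frac{\lambda}{1+\lambda} \;\Rightarrow\; n\cdot Kn = 1,\quad n\cdot f_3 = 0. \tag{62}$$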
This gives four solutions, with the ± signs available in taking square roots in (62). However, if $(a, n)$ is one of these solutions, $(-a, -n)$ is another, and this describes the same twin, so one really gets two solutions. In more conventional twinning theory, such pairs are called conjugate or reciprocal (S-) twins, and I use the former name for them, as described in the X-ray theory. From Section 3, it is clear that the conjugate of a type I twin is a type II twin. This analysis is essentially what I [27] used for deformation twins. What is new is the analogous result for the other case,
$$\lambda \neq 1,\quad a\cdot n = 2. \tag{63}$$
Here, one gets
$$a = n + w, \tag{64}$$
with $w$ again given by (61)$_2$, $n$ by (62). So, this gives two more solutions, which I call conjugate O-twins, and to get the four, we only need (57) to be satisfied, excepting the case (59). With any of these solutions, use (54) to calculate $Q$, and you have solved the equivalent of (9). Actually, one gets eight solutions, by including multiplying $m$ and $Q$ by $-1$. Of course, one should keep in mind that (10) also needs to be satisfied. This analysis is a reasonable way of introducing these sets of solutions, and it gives one tool for attacking (38). However, it has its faults, in that X-ray observations commonly give estimates of $n$ and atomic arrangements on the two sides, from which $Q$ is estimated. Usually, one gets $m$ by trying to satisfy twinning equations. There is another way to view these results. Suppose that we have a solution of (9)$_2$ and we apply the orthogonal transformation $\bar Q = 1 - 2n\otimes n$ to both sides. By a simple calculation, this gives another solution, corresponding to the substitutions
$$a \to \bar a = 2n - a,\quad Q \to \bar Q Q, \tag{65}$$
with $\bar a\cdot n = 2$ if $a\cdot n = 0$, $\bar a\cdot n = 0$ if $a\cdot n = 2$. So, this transformation simply relates S-twin and O-twin solutions of (9), and it is a simple matter to adjust shifts to conform to (10). A better approach has been explored by Adeleke [28], for the important case where $a\cdot n = 0$. He has characterized all solutions of equations equivalent to (9), for any given $Q$ and $n$ as well as for given $m$. He assumes that $\det Q = \det m = 1$ but, as is noted above, it is easy to get the solutions for the other possibilities, $\det Q = \det m = -1$ and $a\cdot n = 2$. In various cases, if one of these is consistent with (10), not all of the others are. Relevant information on $Q$ and $n$ is available in the literature for numerous observed twins. In particular, rotation S-twins with $Q^2 \neq 1$ are all included in his subcase 3.3.2.1 or case 3.3.3, with $Q^2 = 1$ in his subcase 3.3.4.3. There is another consideration which might be of some interest. Taking into account the fact that $\det K = 1$, a necessary and sufficient condition that $K$ have 1 as an eigenvalue is that
$$\operatorname{tr} K = \operatorname{tr} K^{-1}. \tag{66}$$
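The construction of conjugate solutions from an eigenvalue $\lambda \neq 1$ of $K$, together with the trace test (66), can be sketched numerically. The code below works in the eigenbasis $f_1, f_2, f_3$, so that $K = \mathrm{diag}(\lambda, 1/\lambda, 1)$; the value $\lambda = 2$, the sample matrices and the function names are assumptions for illustration. For $\det K = 1$, $\operatorname{tr}K^{-1}$ equals the sum of the principal $2\times 2$ minors of $K$, which is what the test uses.

```python
import math

# Conjugate S- and O-twin solutions of (1 - a(x)n)(1 - n(x)a) = K in the
# eigenbasis of K, plus the trace test tr K = tr K^{-1}.
# lambda = 2 and all helper names are hypothetical sample choices.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def outer(x, y):
    return [[xi * yj for yj in y] for xi in x]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(3)] for i in range(3)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def left_side(n, a):
    """(1 - a (x) n)(1 - n (x) a), the left side of (56)."""
    return matmul(mat_sub(identity(), outer(a, n)),
                  mat_sub(identity(), outer(n, a)))

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

def has_unit_eigenvalue(K, tol=1e-12):
    """Trace test for symmetric K with det K = 1: tr K = tr K^{-1}."""
    trace = K[0][0] + K[1][1] + K[2][2]
    minors = (K[0][0] * K[1][1] - K[0][1] * K[1][0]
              + K[0][0] * K[2][2] - K[0][2] * K[2][0]
              + K[1][1] * K[2][2] - K[1][2] * K[2][1])
    return abs(trace - minors) < tol

def twin_solutions(lam, kind):
    """Return the two inequivalent (n, a) pairs; kind is 'S' or 'O'."""
    n1 = math.sqrt(1.0 / (1.0 + lam))
    n2 = math.sqrt(lam / (1.0 + lam))
    sign_w = -1.0 if kind == "S" else 1.0   # a = n - w (S) or a = n + w (O)
    solutions = []
    for sign in (1.0, -1.0):
        n = [n1, sign * n2, 0.0]
        w = [lam * n[0], n[1] / lam, 0.0]   # w = K n
        a = [ni + sign_w * wi for ni, wi in zip(n, w)]
        solutions.append((n, a))
    return solutions

lam = 2.0
K = [[lam, 0.0, 0.0], [0.0, 1.0 / lam, 0.0], [0.0, 0.0, 1.0]]
s_twins = twin_solutions(lam, "S")
o_twins = twin_solutions(lam, "O")

# A K built directly from a simple shear, and a det-1 matrix with no unit eigenvalue.
K_shear = left_side([0.0, 0.0, 1.0], [0.5, 0.0, 0.0])
K_none = [[2.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 2.0]]
```

The two S-type pairs and the two O-type pairs all reproduce the same $K$, which is the sense in which (57) alone suffices to produce the four conjugate solutions once $\lambda \neq 1$.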
Let
$$G_{ab} = e_a\cdot e_b,\quad G^{ab} = e^a\cdot e^b, \tag{67}$$
these being components of inverse matrices. For a given symmetry of the skeletal lattice, specifications of possible $e_a$ will give them depending on some lattice parameters, and (66) takes the form
$$f(m, G) = f(m^{-1}, G), \tag{68}$$
where
$$f(m, G) = G_{ac}\,G^{bd}\,m_b^a\,m_d^c. \tag{69}$$
Note that (68) implies that, if (57) is satisfied by some values of $m$ and $e_a$, it is also satisfied by $m^{-1}$ and the same values of $e_a$. Also, for a given symmetry type, (38) requires (66) to be satisfied for all possible values of the lattice parameters for a given symmetry type, a restriction on $m$. Of course, lattice invariant shears along with values of $m$ associated with types I and II twins are among the solutions for any symmetry type, so (66) always has infinitely many solutions. Clearly, this complicates the problem of trying to classify solutions. Another special case seems worth noting. Consider the possibility of satisfying (9) for S-twins with
$$m \text{ having 1 as an isolated eigenvalue and } \det m = 1. \tag{70}$$
Judging from various kinds of twins that I have explored, it seems quite common to have $m^n = 1$, $n = 2, 3, 4$ or $6$, $\det m = 1$, implying (70). For suitably weak phase transitions in 1-lattices, such transformation twins are discussed in detail by Pitteri and Zanzotto [29]. They do use CBR, which seems to be reliable for twinning in 1-lattices. As is well-known, such matrices are similar to rotation matrices, the solutions being covered by Adeleke [28, cases IV 2 and VI]. Simply, I add observations which are not so obvious from his descriptions. Also, it follows from Adeleke's work that values of $m$ associated with rotation S-twins have 1 as an eigenvalue, but might or might not satisfy $m^n = 1$ for any finite $n$. With (70), there are numbers $k_a$ and $l^a$ such that
$$m_a^b k_b = k_a,\quad m_b^a l^b = l^a. \tag{71}$$
Set
$$k = k_a e^a,\quad l = l^a e_a. \tag{72}$$
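For S-twins, (9) should be satisfied with $\det \mathbf{Q} = 1$ and the assumption that 1 is an isolated eigenvalue excludes $\mathbf{Q} = \mathbf{1}$, so we have the usual axis of rotation satisfying
$$\mathbf{Q}\mathbf{e} = \mathbf{e}, \qquad \mathbf{e} \cdot \mathbf{e} = 1. \tag{73}$$
Multiplying both sides of the first version in (9) by $k_a$ and summing on $a$, we get
$$(\mathbf{Q} - \mathbf{1})\mathbf{k} = -(\mathbf{a} \cdot \mathbf{k})\,\mathbf{n}. \tag{74}$$
Similarly using $l^a$ with the second gives
$$(\mathbf{Q} - \mathbf{1})\mathbf{l} = (\mathbf{n} \cdot \mathbf{l})\,\mathbf{a}. \tag{75}$$
From these two equations, it follows that
$$(\mathbf{a} \cdot \mathbf{k})(\mathbf{n} \cdot \mathbf{e}) = (\mathbf{a} \cdot \mathbf{e})(\mathbf{n} \cdot \mathbf{l}) = 0. \tag{76}$$
Then, by elementary analyses, exploring the possibilities for satisfying (76), one finds that at least one of the following equations must hold:
$$\mathbf{e} \parallel \mathbf{k} \;\Rightarrow\; \mathbf{a} \cdot \mathbf{k} = \mathbf{a} \cdot \mathbf{e} = 0 \tag{77}$$
or
$$\mathbf{e} \parallel \mathbf{l} \;\Rightarrow\; \mathbf{n} \cdot \mathbf{l} = \mathbf{n} \cdot \mathbf{e} = 0. \tag{78}$$
Obviously, for both to hold, it is necessary that $\mathbf{k} \parallel \mathbf{l}$. An example of this kind that is not a rotation twin, with $\mathbf{m}^4 = \mathbf{1}$, does occur in the analysis by Jian and James [26] of that unusual twin in LaNbO$_4$. Of course, (77) or (78) are only necessary conditions for (9) to be satisfied when (70) holds, and one needs to dig through the results of Adeleke [28] to construct such solutions. When $\mathbf{k} \parallel \mathbf{l}$, it is easy to see that, with $\mathbf{K}$ given by (56)$_2$,
$$\mathbf{K}\mathbf{k} = \mathbf{k}, \tag{79}$$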
so (57) is satisfied, giving us solutions of the kind described above. Of course, having $\mathbf{k} \parallel \mathbf{l}$ does restrict values of lattice vectors. As an example, I shall note the other calculations for one situation, Adeleke's subcase 3.3.2.1, which fits (78). This involves rotations with axis perpendicular to $\mathbf{n}$ that do not take a plane with this normal to itself, a rotation to be prescribed to fit this description. While this need not be a rotation twin, one can prescribe it to get rotation twins with $\mathbf{R}^n = \mathbf{1}$, $n = 3, 4$ or $6$. Of course, a 180° rotation is excluded, since it would take the plane to itself. Let $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ be an orthonormal basis with
$$\mathbf{n} = \mathbf{p}, \qquad \mathbf{R}\mathbf{r} = \mathbf{r}, \qquad \mathbf{R} = \mathbf{R}_{\mathbf{r}}^{\psi}, \tag{80}$$
for some angle $\psi$. Also, $\mathbf{m}$ can be prescribed, subject to some conditions. First, $\mathbf{m}$ must have 1 as an eigenvalue, so that, in particular, (71)$_2$ applies. Also, it is necessary that there exist numbers $y^a$ such that
$$l^a,\ y^a \text{ and } m_b^a y^b \text{ are linearly independent.} \tag{81}$$
Any such set can be used. There are then solutions with $\mathbf{e}^a$ obtained by solving the equations
$$\mathbf{e}^a \cdot \mathbf{q} = y^a, \qquad \mathbf{e}^a \cdot (\mathbf{R}^T \mathbf{q}) = m_b^a y^b, \qquad \mathbf{e}^a \cdot \mathbf{r} = l^a, \tag{82}$$
from which I get
$$\mathbf{e}^a = (\csc\psi\, m_b^a y^b - \cot\psi\, y^a)\,\mathbf{p} + y^a \mathbf{q} + l^a \mathbf{r}. \tag{83}$$
For $\mathbf{a}$, Adeleke gets
$$\mathbf{a} = m_b^a (\mathbf{e}^b \cdot \mathbf{n})\, \mathbf{R}\mathbf{e}_a - \mathbf{n}. \tag{84}$$
It does not seem easy to see what point groups could be attained for such values of $\mathbf{e}^a$, an important step in assessing prospects for satisfying (38). By perusing Adeleke's work, one could get the other solutions for rotation twins.
There is a large literature dealing with observations of twins. For those unfamiliar with this, I recommend starting with the short notes by Zanzotto [30].
5. CONSTITUTIVE EQUATIONS
Usually, constitutive equations for crystals are assumed to be invariant under some finite crystallographic group. Nonlinear thermoelastic theory of this kind has been used successfully by various workers to analyze phenomena associated with martensitic phase transitions in shape-memory alloys, treating them as Bravais lattices (1-lattices). These involve twins, often very finely spaced, forming microstructures. They have been studied using the ideas concerning Young measures discussed by Ball and James [31], for example. Essentially, replacing the infinite group described by (1) and (2) by a finite subgroup involves restricting the domains of constitutive equations to Pitteri's [5, 6] neighborhoods. Any of these is centered at a particular configuration equipped with a set of lattice vectors $\mathbf{E}_a$ and shifts $\mathbf{P}_i$, and has the property that the lattice group of any configuration in the neighborhood is a subgroup of $L(\mathbf{E}_a, \mathbf{P}_i)$. This suggests using this as an invariance group for constitutive functions restricted to such a neighborhood, except for one thing. I [9] have argued that it is not reasonable to assume that the domains include lattice vectors of opposite orientation, so I use instead
$$L^+(\mathbf{E}_a, \mathbf{P}_i) = \{ L \in L(\mathbf{E}_a, \mathbf{P}_i) \mid \det \mathbf{m} = 1 \} \quad \text{and} \quad \mathbf{Q} \in SO(3). \tag{85}$$
I [4] proposed constitutive equations for
(86)
where the arguments are considered as fields, functions of position in space. Elsewhere, I [32, 33, 34] have discussed equivalent formulations which are convenient for some kinds of studies. Assuming that (86) is invariant under (85) is not conventional, since workers have become accustomed to use point or space groups as the basic descriptors of crystal symmetry. I [4] gave prescriptions for the Cauchy stress tensor, configurational stress entropy per unit mass, and mass density which I have not seen elsewhere, except for the latter, but they are not used here. Of course, the former should satisfy the usual equilibrium equations, interpretable as three equations for determining the three functions^". As equations for determining the shifts, I borrowed what is used in classical molecular theories of elasticity,
$$\frac{\partial \varphi}{\partial \mathbf{p}_i} = 0. \tag{87}$$
In the previous discussions in this paper, it was tacitly assumed that these are satisfied by one of the twins. In most if not all cases of interest, one can use symmetry arguments to show that they must also be satisfied in the other twin. Another important invariance is
$$\varphi(\mathbf{Q}\mathbf{e}_a, \mathbf{Q}\mathbf{p}_i, \theta) = \varphi(\mathbf{e}_a, \mathbf{p}_i, \theta), \qquad \mathbf{Q} \in SO(3). \tag{88}$$
Now, if we are to associate a single constitutive equation of this kind to twinning phenomena, it should accommodate some of the elementary kinds of solutions considered earlier. For this it is necessary that the lattice groups for the twins be included in some lattice group, so there is a possible center. For solving (9), the values of $\{\mathbf{m}, \alpha, \mathbf{l}\}$ that can be used are just those for some element of $L^+(\mathbf{e}_a, \mathbf{p}_i)$. For observed twins, this is impossible for some, possible for others. For example, I [17] found that a very commonly observed twin in (orthorhombic) α-uranium could be included in a neighborhood centered at a hexagonal close-packed configuration, although no known phase of this material has this symmetry. So one could use such theory to study effects of stress on these twins, for example. However, I also found that other deformation twins observed in this material cannot be included in any Pitteri neighborhood. My limited experience is that this is fairly common for deformation twins. This is also the case for twins in crystals of maximal symmetry, for example the common monatomic hexagonal close-packed and body-centered or face-centered cubic crystals. This does not exclude the possibility of using for these constitutive equations that are invariant under infinite groups. However, not all twins can be treated in this way. Exceptions include growth twins involving joining enantiomorphic configurations, the Brazil twins in quartz, for example. I [35] have discussed twins that can be analyzed using a constitutive equation restricted to a Pitteri neighborhood centered at a monatomic hexagonal close-packed configuration. This study emphasizes twins satisfying (38), although other possibilities are mentioned.
APPENDIX
Suppose that (39) and (45) are satisfied with $\mathbf{R}_{\mathbf{a} \wedge \mathbf{n}}^{\pi} \in P(\mathbf{e}_a, \mathbf{p}_i)$. We wish to show that the type II twin is then compound. Since $\mathbf{n}$, $\mathbf{a}$ and $\mathbf{a} \wedge \mathbf{n}$ are orthogonal, we have
$$\mathbf{R}_{\mathbf{a}}^{\pi} = \mathbf{R}_{\mathbf{n}}^{\pi} \mathbf{R}_{\mathbf{a} \wedge \mathbf{n}}^{\pi}. \tag{A1}$$
We also have the equivalent of (39), ea = (1 + a
(A2)
and, for the shifts, we can use p,=R>=R^Rf"P,,
(A3)
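It is easy to verify that the corresponding $L$ satisfies $L^2 = 1$. From (45), we get
$$\mathbf{R}_{\mathbf{a} \wedge \mathbf{n}}^{\pi} \mathbf{e}_a = m_a^b \mathbf{e}_b, \qquad \mathbf{R}_{\mathbf{a} \wedge \mathbf{n}}^{\pi} \mathbf{p}_i = \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a, \tag{A4}$$
where, according to Pitteri [7], this lattice group element will satisfy
$$\{\mathbf{m}, \alpha, \mathbf{l}\}^2 = 1, \quad \text{or} \quad \mathbf{m}^2 = \mathbf{1}, \quad \alpha^2 = \mathbf{1}, \quad \alpha \mathbf{l} + \mathbf{l}\mathbf{m} = \mathbf{0}. \tag{A5}$$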
For this, it is important that the descriptions be essential. Thus, (A2) becomes ea = (1 + a
(A6)
which satisfies (9), with R" as the isometry. For this situation, Pitteri's [ 18] or my [20] results give (mm) = 1 =>• m m = m m = m,
(A7)
say. Using (Al) and (A4), we can rearrange (A3) to get P,=R?(a'p/+/,aea),
(A8)
which is consistent with (10). However, generally, {m, a, 1} ^ 1, so this does not quite qualify as a type I twin solution. This can be fixed, by using the admissible change of shifts p, =aJtPj +l,aea,
(A9)
with
~lf =lb<-
(A10)
A routine calculation then shows that P,=R"Po
(All)
and, by using $\mathbf{e}_a$ and the shifts defined by (A9), we satisfy the requirements for a type I twin, so the type II twin is compound. By very similar arguments, one can get the analogous result for type I twins and the alternative, $-\mathbf{R}_{\mathbf{a} \wedge \mathbf{n}}^{\pi} \in P(\mathbf{e}_a, \mathbf{p}_i)$ in (45).
Acknowledgment. I thank Giovanni Zanzotto for helpful comments and for supplying useful references.
NOTES
1. For any reference configuration with lattice vectors $\mathbf{E}_a$, CBR asserts that, if $\mathbf{F}$ is the deformation gradient, then $\mathbf{e}_a = \mathbf{F}\mathbf{E}_a$ ($\Leftrightarrow \mathbf{e}^a = \mathbf{F}^{-T}\mathbf{E}^a$) is a possible set of lattice vectors (reciprocal lattice vectors) in the deformed configuration.
2. Commonly, the shear is regarded as the equivalent operation on $\mathbf{e}_a$, which is $\mathbf{1} + \mathbf{a} \otimes \mathbf{n}$.
REFERENCES
[1] Toledano, P. and Dmitriev, V.: Reconstructive Phase Transitions in Crystals and Quasicrystals, World Scientific Publishing Co., London, 1996.
[2] Zanzotto, G.: On the material symmetry group of elastic crystals and the Born rule. Archive for Rational Mechanics and Analysis, 121, 1-36 (1992).
[3] Christian, J. W. and Mahajan, S.: Deformation twinning. Progress in Materials Science, 39, 1-157 (1995).
[4] Ericksen, J. L.: Equilibrium theory for X-ray observations. Archive for Rational Mechanics and Analysis, 139, 181-200 (1997).
[5] Pitteri, M.: Reconciliation of local and global symmetries of crystals. Journal of Elasticity, 14, 175-190 (1984).
[6] Pitteri, M.: On (ν + 1) lattices. Journal of Elasticity, 55, 201-218 (1985).
[7] Pitteri, M.: Geometry and symmetry of multilattices. International Journal of Plasticity, 14, 139-157 (1998).
[8] Pitteri, M. and Zanzotto, G.: Beyond space groups: the arithmetic symmetry of deformable multilattices. Acta Crystallographica, A54, 359-373 (1998).
[9] Ericksen, J. L.: On the theory of growth twins in quartz. Mathematics and Mechanics of Solids, 6, 359-386 (2001).
[10] Ericksen, J. L.: On correlating two theories of twinning. Archive for Rational Mechanics and Analysis, 153, 261-289 (2000).
[11] Ericksen, J. L.: On non-essential descriptions of crystal multi-lattices. Mathematics and Mechanics of Solids, 4, 363-392 (1998).
[12] Ericksen, J. L.: On groups occurring in the theory of crystal multi-lattices. Archive for Rational Mechanics and Analysis, 148, 145-178 (1999).
[13] Cahn, R. W.: Twinned crystals. Advances in Physics, 3, 363-445 (1954).
[14] Hurlbut, C. S. Jr. and Klein, C.: Manual of Mineralogy (after James D. Dana), 19th ed., John Wiley and Sons, New York, 1977.
[15] Hirth, J. P. and Lothe, J.: Theory of Dislocations, 2nd ed., John Wiley, New York, 1982.
[16] Dana, J. D. and Dana, E. S.: The System of Mineralogy, 7th ed., Vol. 3 (rewritten and enlarged by C. Frondel), John Wiley, New York, 1962.
[17] Ericksen, J. L.: Twinning analyses in the X-ray theory. International Journal of Solids and Structures, 38, 967-995 (2001).
[18] Pitteri, M.: On the kinematics of mechanical twinning in crystals. Archive for Rational Mechanics and Analysis, 88, 25-58 (1985).
[19] Pitteri, M.: On type-2 twins in crystals. International Journal of Plasticity, 2, 99-106 (1986).
[20] Ericksen, J. L.: Stable equilibrium configurations of elastic crystals. Archive for Rational Mechanics and Analysis, 94, 1-14 (1986).
[21] Pitteri, M. and Zanzotto, G.: Transformation twinning and Mallard's law. In Contemporary Research in the Mechanics and Mathematics of Materials (ed. R. C. Batra and M. F. Beatty), pp. 298-309, International Centre for Numerical Methods in Engineering, Barcelona, 1996.
[22] Barrett, C. S. and Massalski, T. B.: The Structure of Metals, 3rd ed., McGraw-Hill, New York, 1966.
[23] Ericksen, J. L.: Some surface defects in unstressed thermoelastic solids. Archive for Rational Mechanics and Analysis, 88, 337-345 (1985).
[24] Zanzotto, G.: Geobarothermometric properties of growth twins and mathematical analyses of quartz data for a broad range of temperatures and pressures. Physics and Chemistry of Minerals, 16, 783-789 (1989).
[25] Zanzotto, G.: Thermoelastic stability of multiple growth twins in quartz and general barothermometric implications. Journal of Elasticity, 23, 253-287 (1990).
[26] Jian, L. and James, R. D.: Prediction of microstructure in monoclinic LaNbO4 by minimization. Acta Materialia, 45, 4271-4281 (1997).
[27] Ericksen, J. L.: Some surface defects in unstressed thermoelastic solids. Archive for Rational Mechanics and Analysis, 88, 337-345 (1985).
[28] Adeleke, S.: On matrix equations of twinning in crystals. Mathematics and Mechanics of Solids, 5, 395-414 (2000).
[29] Pitteri, M. and Zanzotto, G.: Symmetry breaking and transformation twinning. Pending publication.
[30] Zanzotto, G.: Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some experimental results, Nota I and Nota II. Atti della Accademia Nazionale dei Lincei, 82, 725-741 and 743-756 (1988).
[31] Ball, J. and James, R. D.: Proposed experimental tests of a theory of fine microstructures and the two-well problem. Philosophical Transactions of the Royal Society of London, A333, 389-450 (1992).
[32] Ericksen, J. L.: Notes on the X-ray theory. Journal of Elasticity, 55, 201-218 (1999).
[33] Ericksen, J. L.: On the α-β phase transition in quartz. Journal of Elasticity, 63, 61-86 (2001).
[34] Ericksen, J. L.: On Pitteri neighborhoods centered at hexagonal close-packed configurations. To appear in Archive for Rational Mechanics and Analysis.
[35] Ericksen, J. L.: Twinning theory for some Pitteri neighborhoods. Submitted to Continuum Mechanics and Thermodynamics.
Journal of Elasticity 70: 267-283, 2003. © 2003 Kluwer Academic Publishers. Printed in the Netherlands.
On the Theory of Rotation Twins in Crystal Multilattices
J.L. ERICKSEN
5378 Buckskin Bob Rd., Florence, OR 97439, U.S.A.
Received 1 May 2002; in revised form 1 October 2002
Abstract. Rotation twins form a subset of twins in crystals which at least closely resemble many of the twins that are observed. My purpose is to characterize all solutions of this kind for twinning equations in the X-ray theory of crystals. An analysis of a common kind of growth twins in staurolite is presented.
Mathematics Subject Classifications (2000): 74E15, 82D25.
Key words: twinning theory, continuum theory of crystals.
Dedicated to the memory of Clifford Truesdell
1. Introduction The definition of rotation twins to be studied here is that given by Barrett and Massalski [1, p. 406], "Crystals are rotation twins if a two-, three-, four- or sixfold rotation about a twinning axis produces the orientation of the other. The rotation axis lies either in the twinning plane or normal to it and is not a symmetry element of the individual crystals." Of course, use of the adjective "rotation" is reasonably interpreted as implying that not all configurations called twins are rotation twins. Except for the two-fold possibility, which applies to almost all mechanical twins, all examples known to me occur in growth twins. Here, my purpose is to describe all solutions of the twinning equations in my [2] X-ray theory for such twins. I [3, 4] have described how these equations are a bit different from some others used in studies of mechanical twins, which cannot reasonably be applied to growth twins. Also, I [5] have verified that my equations describe several different kinds of growth twins that are well-established in quartz, excepting those for which the two sides represent crystallographically inequivalent surfaces. It should be noted that workers have been unable to agree on a general definition of twins. In the words of one expert, Cahn [6, Section 1.1], one of the best
known proposals, attributed to Friedel, describes a "true twin" as involving a pair of configurations such that "... the two crystals can be brought into one congruent configuration by reflection in a lattice plane of low indices, or by a rotation through 60°, 90°, 120° or 180° about a lattice row of low indices." The decision as to how low the indices must be seems to be left to the individual, but practice favors using single digits. Generally, Cahn seems to take this view fairly seriously, although he describes what he considers to be some rare exceptions. Other workers call other kinds of exceptions twins. Structural considerations motivated Hartman [7] to exclude existence of three-, four- and six-fold rotations in his definition of twins. Given that all measurements are subject to some error, there is no way to confirm experimentally that the rotation involved in one of these examples is exactly 90°, for example. Particularly in minerals, various kinds of complicated intergrowths occur, and most workers are not happy to call all of these twins. I note that Barrett and Massalski do not mention that the rotation axes should be restricted to the kind of crystallographic directions mentioned by Cahn, and I will not assume this. However, according to my theory, it turns out that, in most cases, these axes are parallel to some (parallel) rows of atoms or to normals of some (parallel) crystallographic planes, and I will note kinds of cases where one of these conditions must hold. I won't expend more ink in trying to describe all of the different views on what should be meant by a twin. For present purposes, I will regard an intergrowth as a twin if it is described by some nontrivial solution of my twinning equations. 2. Background A crystal n-lattice, pictured as filling all of space, consists of n geometrically identical lattices translated relative to each other, with lattice vectors ea and thenduals e a , the reciprocal lattice vectors. 
Physically, the atoms in any one of the lattices are identical, but atoms in different lattices can be the same or different. To describe the relative translations occurring when $n > 1$, we use shift vectors $\mathbf{p}_i$, $i = 1, \ldots, n - 1$. Pick one atomic position in each of the lattices and take one as a base point. Then the shifts are position vectors of the others relative to the base point. For a given configuration, there are infinitely many ways of choosing the vectors, $(\mathbf{e}_a, \mathbf{p}_i)$ and $(\bar{\mathbf{e}}_a, \bar{\mathbf{p}}_i)$ being two possibilities if the first is and the second satisfies
$$\bar{\mathbf{e}}_a = m_a^b \mathbf{e}_b \quad \Leftrightarrow \quad \bar{\mathbf{e}}^a = (m^{-1})_b^a \mathbf{e}^b, \tag{2.1}$$
where $\mathbf{m} = \| m_a^b \|$ is a unimodular matrix of integers, and
$$\bar{\mathbf{p}}_i = \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a, \qquad l_i^a \in \mathbb{Z}, \tag{2.2}$$
the matrices $\alpha = \| \alpha_i^j \|$ being discussed in some detail by Pitteri [8] and Ericksen [9]. Briefly, they describe interchanging identical atoms. Here, the detailed descriptions of them are not important. In dealing with matrices, my convention is that the lower index labels rows. The transformations (2.1) and (2.2) form an infinite discrete group, with the group product indicated by
$$\{\bar{\mathbf{m}}, \bar{\alpha}, \bar{\mathbf{l}}\} \cdot \{\mathbf{m}, \alpha, \mathbf{l}\} = \{\bar{\mathbf{m}}\mathbf{m},\ \bar{\alpha}\alpha,\ \bar{\alpha}\mathbf{l} + \bar{\mathbf{l}}\mathbf{m}\}. \tag{2.3}$$
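The product (2.3) can be illustrated by composing two changes of lattice vectors and shifts. This is a minimal sketch for a 2-lattice, assuming that the product is read as "apply the second factor first, then the barred one" and that matrices act on rows of components, as in the lower-index-labels-rows convention. The function names and the sample integer data are hypothetical.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matadd(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def product(g_bar, g):
    """{m_bar, al_bar, l_bar} . {m, al, l} in the assumed reading of (2.3)."""
    m_bar, al_bar, l_bar = g_bar
    m, al, l = g
    return (matmul(m_bar, m),
            matmul(al_bar, al),
            matadd(matmul(al_bar, l), matmul(l_bar, m)))


def act(g, lattice, shifts):
    """Apply (2.1) and (2.2): rows of `lattice` are e_a, rows of `shifts` are p_i."""
    m, al, l = g
    return matmul(m, lattice), matadd(matmul(al, shifts), matmul(l, lattice))


# Hypothetical data for a 2-lattice: alpha is 1x1 and l is 1x3.
G1 = ([[1, 1, 0], [0, 1, 0], [0, 0, 1]], [[-1]], [[1, 0, 2]])
G2 = ([[0, 1, 0], [-1, 0, 0], [0, 0, 1]], [[1]], [[0, -1, 1]])
E = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]   # lattice vectors e_a as rows
P = [[1, 1, 1]]                         # one shift vector p_1

one_step = act(product(G2, G1), E, P)
two_steps = act(G2, *act(G1, E, P))
print(one_step == two_steps)
```

Applying the product in one step gives the same vectors as applying the two transformations in sequence, which is what makes (2.3) the natural composition rule under these assumptions.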
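For each configuration of an $n$-lattice with $n > 1$, we have four finite groups that are relevant here, the lattice group
$$L(\mathbf{e}_a, \mathbf{p}_i) = \{\mathbf{m}, \alpha, \mathbf{l} \mid m_a^b \mathbf{e}_b = \mathbf{Q}\mathbf{e}_a,\ \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a = \mathbf{Q}\mathbf{p}_i,\ \mathbf{Q} \in O(3)\}, \tag{2.4}$$
the skeletal lattice group
$$L(\mathbf{e}_a) = \{\mathbf{m} \mid m_a^b \mathbf{e}_b = \mathbf{Q}\mathbf{e}_a,\ \mathbf{Q} \in O(3)\}, \tag{2.5}$$
the point group
$$P(\mathbf{e}_a, \mathbf{p}_i) = \{\mathbf{Q} \mid \mathbf{Q}\mathbf{e}_a = m_a^b \mathbf{e}_b,\ \mathbf{Q}\mathbf{p}_i = \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a,\ \{\mathbf{m}, \alpha, \mathbf{l}\} \in L(\mathbf{e}_a, \mathbf{p}_i)\}, \tag{2.6}$$
and the skeletal point group
$$P(\mathbf{e}_a) = \{\mathbf{Q} \mid \mathbf{Q}\mathbf{e}_a = m_a^b \mathbf{e}_b,\ \mathbf{m} \in L(\mathbf{e}_a)\}. \tag{2.7}$$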
Often, the latter is called the holohedral point group or holohedry. For a Bravais lattice (1-lattice), the lattice and point groups are just the skeletal groups. While workers seem unable to agree on a general definition of twins, they are generally considered to involve jump discontinuities in lattice vectors and/or shifts although, frequently, the latter are not considered explicitly. Normally, at least tacitly, these are considered as they occur in crystals at constant temperature that are unstressed, or in which the stress is a constant hydrostatic pressure, with $\mathbf{e}_a$, $\mathbf{e}^a$ and $\mathbf{p}_i$ piecewise constant. So, assume this and the fact that the aforementioned Barrett-Massalski description of rotation twins presumes this. In this context, the twinning equations I [2] proposed are of the form
$$\bar{\mathbf{e}}^a = (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})\,\mathbf{e}^a = \mathbf{Q}\,(m^{-1})_b^a \mathbf{e}^b \quad \Leftrightarrow \quad \bar{\mathbf{e}}_a = (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})^{-T} \mathbf{e}_a = \mathbf{Q}\, m_a^b \mathbf{e}_b \tag{2.8}$$
and
$$\bar{\mathbf{p}}_i = \mathbf{Q}(\alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a), \qquad \mathbf{Q} \in O(3). \tag{2.9}$$
Here, $(\mathbf{e}_a, \mathbf{p}_i)$ and $(\bar{\mathbf{e}}_a, \bar{\mathbf{p}}_i)$ are values of these vectors on the two sides of the discontinuity surface, $\mathbf{n}$ being its unit normal, $\mathbf{m}$, $\alpha$ and $\mathbf{l}$ being some choice of the matrices referred to in (2.1) and (2.2), $\mathbf{a}$ being some vector not having a definite physical interpretation. One finds it by solving (2.8), when possible, which depends on the nature of data available for the other variables. Here, the description of rotation twins gives some information about $\mathbf{Q}$, that it is a rotation with axis parallel or perpendicular to $\mathbf{n}$, angles of rotation being those commonly encountered in various studies of crystals. There are solutions of (2.8) with $\mathbf{Q}$ a rotation with axis making a different angle with $\mathbf{n}$. Our task is to characterize all choices of the other variables in (2.8) and (2.9) for the indicated possibilities of $\mathbf{Q}$ and $\mathbf{n}$. In this setting, (2.9) is almost trivial. For given $\mathbf{p}_i$, we could satisfy it by taking $\bar{\mathbf{p}}_i = \mathbf{Q}\mathbf{p}_i$, for example. Of course, in trying to match a solution to some observations, one should match the observed shifts, when data on these are available. In addition, $\mathbf{e}_a$ and $\mathbf{p}_i$ should satisfy some equilibrium equations, which I [2] have described, but are not used explicitly here. Whether they can be satisfied by reasonably stable configurations for any one of the kinds of configurations discussed depends on the nature of the particular constitutive equations considered. It follows from (2.8) that
$$\det(\mathbf{1} - \mathbf{n} \otimes \mathbf{a}) = 1 - \mathbf{a} \cdot \mathbf{n} = \pm 1, \tag{2.10}$$
which distinguishes two kinds of possibilities. The upper sign gives S-twins with
$$\mathbf{a} \cdot \mathbf{n} = 0 \;\Rightarrow\; (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})^{-T} = \mathbf{1} + \mathbf{a} \otimes \mathbf{n} \tag{2.11}$$
and the lower gives O-twins with
$$\mathbf{a} \cdot \mathbf{n} = 2 \;\Rightarrow\; (\mathbf{1} - \mathbf{n} \otimes \mathbf{a})^{-T} = \mathbf{1} - \mathbf{a} \otimes \mathbf{n}. \tag{2.12}$$
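The algebra behind (2.10) to (2.12) is easy to confirm on integer examples. The following is a minimal sketch, assuming that $\mathbf{n} \otimes \mathbf{a}$ is the matrix with entries $n_i a_j$ and using a unit normal along the first coordinate axis; the sample vectors are hypothetical.

```python
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def outer(u, v):
    """Tensor product u (x) v with entries u_i v_j."""
    return [[ui * vj for vj in v] for ui in u]


def combine(a, b, sign):
    """Entrywise a + sign * b for 3x3 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(3)] for i in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(k):
    return (k[0][0] * (k[1][1] * k[2][2] - k[1][2] * k[2][1])
            - k[0][1] * (k[1][0] * k[2][2] - k[1][2] * k[2][0])
            + k[0][2] * (k[1][0] * k[2][1] - k[1][1] * k[2][0]))


def inverse_transpose(n, a):
    """(1 - n (x) a)^{-T} according to (2.11) or (2.12)."""
    s = dot(a, n)
    if s == 0:                       # S-twin case
        return combine(IDENTITY, outer(a, n), +1)
    if s == 2:                       # O-twin case
        return combine(IDENTITY, outer(a, n), -1)
    raise ValueError("a . n must be 0 or 2")


N = (1, 0, 0)                        # unit normal (illustrative)
CASES = {"S": (0, 3, -2), "O": (2, 1, 5)}

for label, a in CASES.items():
    shear = combine(IDENTITY, outer(N, a), -1)          # 1 - n (x) a
    check = matmul(transpose(shear), inverse_transpose(N, a))
    print(label, det3(shear), check == IDENTITY)
```

Multiplying the transpose of $\mathbf{1} - \mathbf{n} \otimes \mathbf{a}$ by the claimed inverse transpose and comparing with the identity avoids any explicit matrix inversion.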
Here, the S and O refer to the fact that $\mathbf{e}_a$ and $\bar{\mathbf{e}}_a$ or, equivalently, $\mathbf{e}^a$ and $\bar{\mathbf{e}}^a$ have the same or opposite orientations, respectively. As will become clear, rotation twins of both kinds or at least configurations closely resembling them are encountered, in practice. For some deformation twins, S-twin solutions of (2.8) are used, with $\mathbf{a}$ interpreted as the amplitude of a simple shearing deformation. The X-ray theory does not require this interpretation. Note that, if (2.8) is satisfied for some values of $\mathbf{Q}$ and $\mathbf{m}$, it is also satisfied if we simply replace these by $-\mathbf{Q}$ and $-\mathbf{m}$, leaving the remaining arguments unchanged. However, it can be that (2.9) is satisfied by one of these and not the other, for equivalent sets of shifts. In considering rotation twins, I interpret the description as implying that we should assume that
$$\mathbf{Q} = \mathbf{R} \in SO(3), \tag{2.13}$$
from which it follows that
$$\det \mathbf{m} = 1 \text{ for S-twins}, \qquad \det \mathbf{m} = -1 \text{ for O-twins}. \tag{2.14}$$
Obviously, it is easy to remove the restriction (2.13). Henceforth, I write $\mathbf{R}$ in place of $\mathbf{Q}$ whenever these are considered as rotations and denote by $\mathbf{R}_{\mathbf{v}}^{\psi}$ the rotation through angle $\psi$ with axis in the direction of $\mathbf{v}$. There are solutions of (2.8) and (2.9) sometimes called fake twins, because the discontinuities involved are not visible in typical X-ray observations. One kind consists of the lattice invariant shears:
$$\text{S-twins with } \mathbf{Q} = \mathbf{1}. \tag{2.15}$$
Sometimes, workers compose some of these with certain other solutions to adjust values of $\mathbf{a}$ or $\mathbf{n}$. The other kind involves solutions with
$$\{\mathbf{m}, \alpha, \mathbf{l}\} \in L(\mathbf{e}_a, \mathbf{p}_i), \tag{2.16}$$
perhaps combined with a suitable lattice invariant shear. It is easy to show that, for S-twins of this kind, $\mathbf{a} = \mathbf{0}$ and $\mathbf{Q}^T$ is the corresponding element of $P(\mathbf{e}_a, \mathbf{p}_i)$. For O-twins of this kind, $\mathbf{a} = 2\mathbf{n}$ and $-\mathbf{Q}^T \mathbf{R}_{\mathbf{n}}^{\pi}$ is the corresponding element of $P(\mathbf{e}_a, \mathbf{p}_i)$. One can also compose these with another solution to get alternative descriptions of the same twin, with different values of $\mathbf{Q}$, $\mathbf{m}$, etc. For $n$-lattices with $n > 1$, having suitable symmetry, one can have somewhat similar but nontrivial solutions not of this kind, for example,
$$\text{S-twins with } \mathbf{m} \in L(\mathbf{e}_a), \quad \mathbf{m} \notin L(\mathbf{e}_a, \mathbf{p}_i) \;\Rightarrow\; \mathbf{a} = \mathbf{0}, \tag{2.17}$$
$\mathbf{n}$ then being arbitrary. This applies to at least some twins called penetration twins. I call them penetration twins, not implying that all things called this fit (2.17). I am still looking for other possibilities for describing these. According to my [5] analyses of Brazil and Dauphiné twins in α-quartz, they are of this kind. Observations of Dauphiné twins indicate that the discontinuity surfaces are rather random surfaces. The isometry can be taken as a certain 180° rotation. Theoretically, one could pick planes such that these fit the description of rotation twins and workers have done this in considering occurrence of these in some Japan twins in quartz, for example. The surfaces associated with Brazil twins are different, usually of zigzag form, involving pieces of various kinds of crystallographic planes. I know of no theoretical reason for this difference. The isometry involved can be taken as a central inversion. There is also the analog of (2.17),
$$\text{O-twins with } \mathbf{m} \in L(\mathbf{e}_a), \quad \mathbf{m} \notin L(\mathbf{e}_a, \mathbf{p}_i) \;\Rightarrow\; \mathbf{a} = 2\mathbf{n} \;\Rightarrow\; \mathbf{1} - \mathbf{n} \otimes \mathbf{a} = \mathbf{1} - 2\,\mathbf{n} \otimes \mathbf{n} = -\mathbf{R}_{\mathbf{n}}^{\pi}. \tag{2.18}$$
One could also call these penetration twins, but I am not yet sure how consistent this is with practice, although we will see an example that is. So, for the present, I call them exceptional O-twins. According to my [5] analysis of Friedel twins in α-quartz, (2.18) applies to them and they fit the description of rotation twins. The isometry can be taken as a 90° rotation, with axis perpendicular to $\mathbf{n}$. Theoretically, the interfaces are orthogonal planes, and I do not know of any observations that are inconsistent with this. Those who have a little familiarity with twinning studies have encountered examples of the rotation twins of the two-fold kind, said to be of types I and II. Analyses of these are presented by Pitteri [10, 11]. These are S-twins, so $\mathbf{a} \cdot \mathbf{n} = 0$ and, for both, $\mathbf{m}^2 = \mathbf{1}$, $\det \mathbf{m} = 1$. This implies that $\mathbf{m}$ is of the form
$$m_a^b = -\delta_a^b + u^b v_a, \qquad u^a v_a = 2, \tag{2.19}$$
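where $u^a$ and $v_a$ can be taken as integers. For both, the isometries are 180° rotations. Solving (2.8) gives
$$\text{type I:} \quad \mathbf{Q} = \mathbf{R}_{\mathbf{n}}^{\pi}, \quad \mathbf{n} \parallel \mathbf{v}, \tag{2.20}$$
and
$$\text{type II:} \quad \mathbf{Q} = \mathbf{R}_{\mathbf{a}}^{\pi}, \quad \mathbf{a} \parallel \mathbf{u}, \tag{2.21}$$
where
$$\mathbf{u} = u^a \mathbf{e}_a, \qquad \mathbf{v} = v_a \mathbf{e}^a. \tag{2.22}$$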
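The form (2.19) can be checked directly: any integers $u^a$, $v_a$ with $u^a v_a = 2$ give a matrix with $\mathbf{m}^2 = \mathbf{1}$ and $\det \mathbf{m} = 1$. A minimal sketch follows, with the lower index labeling rows; the function name and the sample integers are illustrative assumptions.

```python
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def twin_matrix(u, v):
    """m[a][b] = -delta_a^b + u^b v_a, as in (2.19); requires u . v == 2."""
    if sum(x * y for x, y in zip(u, v)) != 2:
        raise ValueError("u^a v_a must equal 2")
    return [[u[b] * v[a] - (1 if a == b else 0) for b in range(3)]
            for a in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def det3(k):
    return (k[0][0] * (k[1][1] * k[2][2] - k[1][2] * k[2][1])
            - k[0][1] * (k[1][0] * k[2][2] - k[1][2] * k[2][0])
            + k[0][2] * (k[1][0] * k[2][1] - k[1][1] * k[2][0]))


# Illustrative integers with u^a v_a = 1*1 + 1*1 + 0*5 = 2.
U = (1, 1, 0)
V = (1, 1, 5)
M = twin_matrix(U, V)

print(matmul(M, M) == IDENTITY, det3(M))
```

The same matrix serves both type I and type II twins; the two types differ in how $\mathbf{Q}$, $\mathbf{n}$ and $\mathbf{a}$ are obtained from $\mathbf{u}$ and $\mathbf{v}$, which this sketch does not attempt to compute.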
As is known from studies of deformation twins, it follows from (2.20) that, for type I twins, $\mathbf{n}$ is rational, meaning that there is a vector in this direction with integer components relative to the basis $\mathbf{e}^a$. Such directions are normal to crystallographic planes. For type II twins, it is also known that (2.22) implies that $\mathbf{a}$ is rational in a different sense, here meaning that there is a vector in this direction with integer components relative to the basis $\mathbf{e}_a$. Rows of atoms have such directions. Of course, these are also the directions of the axes of rotation. Most mechanical twins are of type I. Like the lattice invariant shears, these solutions are available for any set of lattice vectors. Pitteri and Zanzotto [12] show that these are the only S-twins with this property. As is discussed in slightly different ways by Adeleke [13] and Ericksen [14], one subset of rotation twins has been characterized completely,
$$\text{rotation S-twins with } \mathbf{R}\mathbf{n} = \mathbf{n}, \quad \mathbf{a} \neq \mathbf{0}. \tag{2.23}$$
From the analyses of these, it follows that $\mathbf{n}$, the axis of rotation, must be parallel to a normal of some crystallographic planes and $\mathbf{m}$ must be similar to $\mathbf{Q}$. Adeleke [13] describes all S-twin solutions of (2.8) with $\mathbf{a} \neq \mathbf{0}$, making our task relatively easy. In one respect, possibilities with $\mathbf{a} = \mathbf{0}$ are somewhat like those mentioned above for Dauphiné twins, since a value of $\mathbf{Q}$ included in any point group satisfies $\mathbf{Q}^N = \mathbf{1}$ for $N = 1, 2, 3, 4$ or $6$. Thus, take $\mathbf{Q} = \mathbf{R} \neq \mathbf{1}$ to be a rotation of this kind and you can pick $\mathbf{n}$ to be parallel to its axis, so you then have a description of all rotation S-twins such that $\mathbf{R}\mathbf{n} = \mathbf{n}$. For any set of lattice vectors, $\mathbf{R} \in P(\mathbf{e}_a)$ implies that the axis of $\mathbf{R}$ is normal to some crystallographic planes. If you are not familiar with this, it is easily seen by taking the inner product of the equation in (2.7) with the axis, so this property extends to these penetration twins, assuming we are not dealing with the trivial fake twins. Thus, we can consider these possibilities to be characterized.
3. S-twins with $\mathbf{R}^N = \mathbf{1}$, $N = 3, 4$ or $6$
Here, $N$ is considered to be the smallest positive integer such that $\mathbf{R}^N = \mathbf{1}$. The analogous assumption is made elsewhere. From the above remarks, the only S-twins that need to be considered are those for which the axis of $\mathbf{R}$ is normal to $\mathbf{n}$. The cases indicated in the heading all have the property that the rotations do not take a plane with normal $\mathbf{n}$ to itself. Such cases are special cases of the solutions covered by Adeleke [13], in his subcase 3.3.2.1, so it is only necessary to specialize his results. Let $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ be an orthonormal basis with
$$\mathbf{n} = \mathbf{i}, \qquad \mathbf{R}\mathbf{k} = \mathbf{k}, \qquad \mathbf{R} = \mathbf{R}_{\mathbf{k}}^{\psi}, \tag{3.1}$$
for any of the angles $\psi$ fitting the powers listed in the heading. There are some restrictions on $\mathbf{m}$, namely
(a) $\mathbf{m}$ must have 1 as an eigenvalue $\Rightarrow$ $m_b^a x^b = x^a$, (3.2)
where the $x^a$ are not all zero, and
(b) for some real numbers $y^a$, $x^a$, $y^a$ and $m_b^a y^b$ are linearly independent. (3.3)
In the following, one can use any such values of $\mathbf{m}$ and $y^a$. Translating Adeleke's notation to mine, one gets that $\mathbf{e}^a$ is obtained by solving
$$\mathbf{e}^a \cdot \mathbf{k} = x^a, \qquad \mathbf{e}^a \cdot \mathbf{j} = y^a, \qquad \mathbf{e}^a \cdot (\mathbf{R}^T \mathbf{j}) = m_b^a y^b, \tag{3.4}$$
from which
$$\mathbf{k} = x^a \mathbf{e}_a. \tag{3.5}$$
If the eigenspace corresponding to 1 is one-dimensional, the $x^a$ are, to within a scalar factor, integers, so $\mathbf{k}$ is parallel to some rows of atoms. Obviously, this is the case if 1 is an isolated eigenvalue. The other logical possibility is that the eigenvalues are 1, 1, 1. Then, one can use the Jordan Canonical Theorem on matrices to show that, if $\mathbf{m} \neq \mathbf{1}$ and if $\mathbf{m}$ does not correspond to a lattice invariant shear, trivial possibilities, this eigenspace is one-dimensional. Adeleke [13] uses this theorem repeatedly. Using the results above, I calculate that
$$\mathbf{e}^a = (\csc\psi\, m_b^a y^b - \cot\psi\, y^a)\,\mathbf{i} + y^a \mathbf{j} + x^a \mathbf{k}. \tag{3.6}$$
I won't belabor the elementary calculations giving $\mathbf{e}_a$. Then, Adeleke's calculations give
$$\mathbf{a} = m_b^a (\mathbf{e}^b \cdot \mathbf{i})\, \mathbf{R}\mathbf{e}_a - \mathbf{i}. \tag{3.7}$$
This generalizes solutions I [14] obtained for the special case $\mathbf{R}\mathbf{a} = \mathbf{a}$. Among the solutions considered here, these and only these have $\mathbf{m}$ similar to $\mathbf{R}$. Thus, $\mathbf{R}^N = \mathbf{1} \Rightarrow \mathbf{m}^N = \mathbf{1}$ just for this special case. By using (3.5) to calculate $\mathbf{R}\,(m^{-1})_b^a \mathbf{e}^b$ and rearranging terms, I get an alternative to (3.7) as
$$\mathbf{a} = \{\csc\psi\,[m_b^a + (m^{-1})_b^a]\, y^b - 2\cot\psi\, y^a\}\, \mathbf{e}_a. \tag{3.8}$$
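The construction in (3.1) to (3.8) can be tested numerically against the first form of the twinning equations (2.8). What follows is a sketch under stated assumptions, not Adeleke's or the author's procedure: components are taken relative to $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, the array `T[a][b]` stores $m_b^a$ (so it is the transpose of the matrix written with the lower index labeling rows), and the sample matrix, the vectors `X`, `Y` and the angles are hypothetical.

```python
import math


def inv3(m):
    """Inverse of a 3x3 matrix via cyclic cofactors."""
    c = [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
          - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
          for j in range(3)] for i in range(3)]
    det = sum(m[0][j] * c[0][j] for j in range(3))
    return [[c[j][i] / det for j in range(3)] for i in range(3)]


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def rotation_s_twin(T, x, y, psi):
    """Return reciprocal vectors e^a from (3.6), the vector a from (3.8), and T^{-1}."""
    t_inv = inv3(T)
    ty, t_inv_y = matvec(T, y), matvec(t_inv, y)
    csc, cot = 1.0 / math.sin(psi), math.cos(psi) / math.sin(psi)
    e_up = [[csc * ty[a] - cot * y[a], y[a], x[a]] for a in range(3)]
    e_inv = inv3(e_up)
    # Dual vectors e_a satisfy e_a . e^b = delta_a^b.
    e_dn = [[e_inv[c][a] for c in range(3)] for a in range(3)]
    coeff = [csc * (ty[a] + t_inv_y[a]) - 2.0 * cot * y[a] for a in range(3)]
    a_vec = [sum(coeff[a] * e_dn[a][c] for a in range(3)) for c in range(3)]
    return e_up, a_vec, t_inv


def twinning_residual(T, x, y, psi):
    """Largest mismatch in (1 - n(x)a) e^a = R (m^{-1})_b^a e^b, plus a . n."""
    e_up, a_vec, t_inv = rotation_s_twin(T, x, y, psi)
    n = [1.0, 0.0, 0.0]                                  # n = i, as in (3.1)
    rot = [[math.cos(psi), -math.sin(psi), 0.0],
           [math.sin(psi), math.cos(psi), 0.0],
           [0.0, 0.0, 1.0]]                              # rotation about k
    worst = 0.0
    for a in range(3):
        lhs = [e_up[a][c] - n[c] * dot(a_vec, e_up[a]) for c in range(3)]
        w = [sum(t_inv[a][b] * e_up[b][c] for b in range(3)) for c in range(3)]
        rhs = matvec(rot, w)
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
    return worst, dot(a_vec, n)


# Hypothetical data: det T = 1, T X = X, and X, Y, T Y linearly independent.
T = [[1, 0, 0], [0, 2, 1], [0, 1, 1]]
X = [1, 0, 0]
Y = [0, 1, 0]

for angle in (math.pi / 2, 2 * math.pi / 3, math.pi / 3):
    print(twinning_residual(T, X, Y, angle))
```

In this example the residual is at the level of rounding error and $\mathbf{a} \cdot \mathbf{n}$ vanishes, as required for an S-twin. The sample matrix is not similar to a rotation, which matches the remark that $\mathbf{m}^N = \mathbf{1}$ holds only in a special case.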
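Except for differences of notation, most of these calculations are included as special cases of those given by Adeleke [13], but (3.8) is not.
4. S-twins with $\mathbf{R}^2 = \mathbf{1}$
There are obvious possibilities of this kind with the axis of $\mathbf{R}$ perpendicular to $\mathbf{n}$, the type II twins described by (2.21), but these do not quite include all such solutions, which are covered by Adeleke's [13] subcase 3.3.4.3. For all, it is necessary that
$$\mathbf{m} \text{ have } 1, -1, -1 \text{ as its eigenvalues.} \tag{4.1}$$
If $\mathbf{m}^2 = \mathbf{1}$, one gets type II twins. If not, proceed as follows. Of course, by hypothesis, one will have, for some orthonormal basis $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$,
$$\mathbf{R}\mathbf{i} = \mathbf{i}, \qquad \mathbf{R}\mathbf{j} = -\mathbf{j}, \qquad \mathbf{R}\mathbf{k} = -\mathbf{k}, \qquad \mathbf{n} = \mathbf{k}. \tag{4.2}$$
Introducing a pair of eigenvectors of $\mathbf{m}$, we have
$$m_b^a x^b = x^a, \qquad m_b^a y^b = -y^a. \tag{4.3}$$
One can take $\mathbf{e}^a$ as any linearly independent vectors such that
$$\mathbf{e}^a \cdot \mathbf{i} = x^a, \qquad \mathbf{e}^a \cdot \mathbf{j} = y^a. \tag{4.4}$$
So,
$$\mathbf{e}^a = x^a \mathbf{i} + y^a \mathbf{j} + z^a \mathbf{k}, \tag{4.5}$$
where, except for the requirement of linear independence, the numbers $z^a$ are arbitrary. Adeleke's description of $\mathbf{a}$ is
$$\mathbf{a} = m_b^a (\mathbf{e}^b \cdot \mathbf{n})\, \mathbf{R}\mathbf{e}_a - \mathbf{n} = m_b^a z^b\, \mathbf{R}\mathbf{e}_a - \mathbf{k}. \tag{4.6}$$
The values of $\mathbf{m}$ involved here have the property that
$$\mathbf{m}''^2 = \mathbf{1}, \qquad \mathbf{m} = \mathbf{m}'\mathbf{m}'', \tag{4.7}$$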
where $\mathbf{m}'$ is a value of $\mathbf{m}$ corresponding to a lattice invariant shear. One way to see this is to use Adeleke's observation that, by a similarity transformation using matrices of integers with determinant one, $\mathbf{m}$ can be reduced to the form*
$$\mathbf{m} = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & r & -1 \end{Vmatrix}, \tag{4.8}$$
where the entries are, of course, integers. Said differently, given an $\mathbf{m}$ with the properties described above, one can find lattice vectors relative to which it reduces to the form (4.8).
* He and I use different conventions, making his matrices transposes of mine.
If $r = 0$, $\mathbf{m}^2 = \mathbf{1}$, so assume that $r \neq 0$. It is then easy to verify that (4.7) is satisfied by
$$\mathbf{m}' = \begin{Vmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ rp & -r & 1 \end{Vmatrix}, \qquad \mathbf{m}'' = \begin{Vmatrix} 1 & 0 & 0 \\ p & -1 & 0 \\ q & 0 & -1 \end{Vmatrix}, \tag{4.9}$$
that m' does represent a lattice invariant shear and that $m''^2 = 1$. Pitteri [11] first produced an example of an S-twin solution of (2.8) with $R^2 = 1$, $m^2 \neq 1$. For this example, Zanzotto [15] showed that it is really a type II twin, in disguise. Here, we can come to the same conclusion, by similar reasoning. To do so, note that we have solutions of (2.8), which I put in a form more like that used by Zanzotto,
\[ R\, m_a^b \mathbf{e}_b = (1 + \mathbf{a} \otimes \mathbf{n})\mathbf{e}_a = \bar{\mathbf{e}}_a. \tag{4.10} \]
With (4.7), this is equivalent to R(m")fle» = (1 + a ® n ^ m ' " 1 ) * ^ = la = (m'" 1 )*^,
(4.11)
ea and ta being equally acceptable choices of lattice vectors on the second side. Assuming the lattice vectors used correspond to (4.8), we use (4.5) to calculate that e 3 = z 3 k = z 3n
(4.12)
and, with this that (m/-1)*efe = (l + b ® n ) e a ,
b n = 0,
(4.13)
where b = rz3(ez - pei).
(4.14)
Then, (4.11) becomes R(m")*eft = (l + c®n)e fc = § a ,
c = a + b.
(4.15)
This is a solution of (2.8) with $m^2 = 1$, $\det m = 1$, $m = m''$, axis of R perpendicular to n, implying that it is a type II twin, so
\[ R\mathbf{c} = \mathbf{c}. \tag{4.16} \]
The isometry is then the same as it is for the type II twin. In the example presented by Pitteri [11], Zanzotto [15] reasoned from this that the apparently different solution is physically equivalent, and I agree. Thus, all of the cases considered in this section are really type II twins. Actually, what appeared to be exceptions are compound twins, meaning that they can also be described as type I twins.
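The matrix algebra behind (4.7) and (4.8) can be checked mechanically. The sketch below is only an illustration: it assumes that the shear factor is the upper triangular matrix with entries rp and -r in its last column, and, because the index convention fixes the order of the product, it tests both orders. All function names are hypothetical.

```python
from itertools import product

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 integer matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def reduced_form(p, q, r):
    """The matrix m of (4.8)."""
    return [[1, p, q], [0, -1, r], [0, 0, -1]]


def shear(p, r):
    """Assumed shear factor m': the identity plus a rank-one nilpotent part."""
    return [[1, 0, r * p], [0, 1, -r], [0, 0, 1]]


def shear_inverse(p, r):
    """Inverse of shear(p, r); the nilpotent part just changes sign."""
    return [[1, 0, -r * p], [0, 1, r], [0, 0, 1]]


def is_involution(m):
    """True when m squared is the identity."""
    return matmul(m, m) == IDENTITY


def check_factorizations(p, q, r):
    """Return (m^2 == 1, left cofactor is an involution, right cofactor is an involution)."""
    m = reduced_form(p, q, r)
    s, s_inv = shear(p, r), shear_inverse(p, r)
    assert matmul(s, s_inv) == IDENTITY
    left_cofactor = matmul(s_inv, m)    # m = s * left_cofactor
    right_cofactor = matmul(m, s_inv)   # m = right_cofactor * s
    return (is_involution(m),
            is_involution(left_cofactor),
            is_involution(right_cofactor))


def scan(limit=3):
    """True if the claims hold for all integers p, q, r with absolute value <= limit."""
    for p, q, r in product(range(-limit, limit + 1), repeat=3):
        m_inv, left_inv, right_inv = check_factorizations(p, q, r)
        if not (left_inv and right_inv and m_inv == (r == 0)):
            return False
    return True
```

Calling `scan()` checks two things over a small range of integers: m itself squares to the identity exactly when r = 0, and removing the shear factor on either side leaves a matrix that squares to the identity.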
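5. O-twins with $R\mathbf{n} = \mathbf{n}$

Having disposed of the S-twins, we now consider the type of O-twins described in the heading. So, we are to satisfy
\[ R_n^{\psi} (m^{-1})_a^b \mathbf{e}_b = (1 - \mathbf{n} \otimes \mathbf{a})\mathbf{e}_a, \quad \mathbf{a} \cdot \mathbf{n} = 2, \tag{5.1} \]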
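for any of the angles $\psi$ associated with rotation twins. Operate on both sides with $-R_n^{\pi}$, replace m by $\bar m = -m$ and note that
\[ -R_n^{\pi}(1 - \mathbf{n} \otimes \mathbf{a}) = 1 - \mathbf{n} \otimes \bar{\mathbf{a}}, \tag{5.2} \]
where
\[ \bar{\mathbf{a}} = 2\mathbf{n} - \mathbf{a} \;\Rightarrow\; \bar{\mathbf{a}} \cdot \mathbf{n} = 0. \tag{5.3} \]
Also,
\[ R_n^{\pi} R_n^{\psi} = R_n^{\psi + \pi}. \tag{5.4} \]
Thus, (5.1) gets transformed to the S-twin solution
\[ R_n^{\bar\psi}(\bar m^{-1})_a^b \mathbf{e}_b = (1 - \mathbf{n} \otimes \bar{\mathbf{a}})\mathbf{e}_a, \quad \bar\psi = \psi + \pi, \quad \bar m = -m, \tag{5.5} \]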
and, obviously, we can transform (5.5) in a similar way to get (5.1). There are exceptional cases. For one,
\[ \psi = \pi \;\Rightarrow\; \bar\psi = 2\pi \;\Rightarrow\; R_n^{\bar\psi} = 1, \tag{5.6} \]
(5.5) then describing a lattice invariant shear. One is then combining the trivial $-R_n^{\pi} = 1 - 2\mathbf{n} \otimes \mathbf{n}$ with a lattice invariant shear, to get a solution which is rather trivial, but perhaps not useless. Another exceptional case occurs if
\[ \bar m \in L(\mathbf{e}_a) \;\Leftrightarrow\; m \in L(\mathbf{e}_a), \tag{5.7} \]
(5.5) then describing a penetration twin or a fake twin. Either way, $\bar{\mathbf{a}} = 0$. Then, $R_n^{\bar\psi} \in P(\mathbf{e}_a)$, so the axis n is normal to some crystallographic planes. For the remaining solutions, if $\psi$ is one of the angles associated with rotation twins, $\psi + \pi$ is another, so one can take any of the S-twin solutions discussed in Section 3 and transform it to get an O-twin solution. From the remarks after (2.23), it follows that, for the twins considered in this section, the axis of rotation is parallel to the normal of some crystallographic planes.

6. O-twins with axis of R perpendicular to n

For these, we can use the fact that any rotation can be written as a product of two 180° rotations with axes perpendicular to that of the rotation and one of these axes can be chosen at will. If u is a unit vector such that u · n = 0, there is then a vector v such that
\[ R_u^{\psi} = R_n^{\pi} R_v^{\pi}, \quad \mathbf{v} \cdot \mathbf{u} = 0, \quad 2(\mathbf{v} \cdot \mathbf{n})^2 = 1 + \cos\psi, \quad |\mathbf{v}| = 1. \tag{6.1} \]
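The fact used here, that a rotation is a product of two 180° rotations about axes perpendicular to its own axis, is easy to confirm numerically. The sketch below is an illustration only: it takes the rotation axis to be the third coordinate axis, and the names `first_axis` and `second_axis` are hypothetical. The second axis is chosen at will in the perpendicular plane, and the first is placed at half the rotation angle from it.

```python
import math


def half_turn(axis):
    """180-degree rotation about a unit vector a, written as 2 a(x)a - 1."""
    return [[2.0 * axis[i] * axis[j] - (1.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def rotation_about_k(psi):
    """Counterclockwise rotation by psi about the third coordinate axis."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def half_turn_axes(psi, phi):
    """Two unit axes perpendicular to k whose half-turns compose to the rotation.

    The second axis makes an arbitrary angle phi with the first coordinate
    axis; the first axis is rotated from it by psi / 2.
    """
    second_axis = (math.cos(phi), math.sin(phi), 0.0)
    first_axis = (math.cos(phi + psi / 2.0), math.sin(phi + psi / 2.0), 0.0)
    return first_axis, second_axis


def max_abs_difference(a, b):
    """Largest entrywise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


def factorization_error(psi, phi):
    """Deviation of half_turn(first) * half_turn(second) from the rotation by psi."""
    first_axis, second_axis = half_turn_axes(psi, phi)
    composed = matmul(half_turn(first_axis), half_turn(second_axis))
    return max_abs_difference(composed, rotation_about_k(psi))
```

For any pair of angles, `factorization_error(psi, phi)` should vanish to rounding error, and the two axes satisfy the same kind of relation as in (6.1): twice the square of their scalar product equals 1 + cos ψ.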
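Given an O-twin solution of the form
\[ R_u^{\psi}(m^{-1})_a^b \mathbf{e}_b = (1 - \mathbf{n} \otimes \mathbf{a})\mathbf{e}_a = \bar{\mathbf{e}}_a, \quad \mathbf{a} \cdot \mathbf{n} = 2, \tag{6.2} \]
we can transform it as we did (5.1) to get the S-twin solution
\[ R_v^{\pi}(\bar m^{-1})_a^b \mathbf{e}_b = (1 - \mathbf{n} \otimes \bar{\mathbf{a}})\mathbf{e}_a = \hat{\mathbf{e}}_a, \quad \hat{\mathbf{e}}_a = -R_n^{\pi}\bar{\mathbf{e}}_a, \quad \bar m = -m, \tag{6.3} \]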
with $\bar{\mathbf{a}}$ again given by (5.3). One exceptional case occurs when (6.3) describes a penetration twin, so that
\[ \bar m \in L(\mathbf{e}_a) \;\Leftrightarrow\; m \in L(\mathbf{e}_a), \quad \bar{\mathbf{a}} = 0. \tag{6.4} \]
It then follows that $\bar m$ is similar to $R_v^{\pi}$, so
\[ \bar m^2 = 1 \;\Rightarrow\; m^2 = 1. \tag{6.5} \]
In these cases, it is not necessary that u be parallel to a row of atoms or to a normal of a crystallographic plane. My [5] analysis of Friedel twins in quartz, which is consistent with Zanzotto's [16, 17], shows that they provide an example fitting (6.4) and (6.5), with $\psi = \pi/2$. Another exceptional case occurs when
\[ \mathbf{v} \cdot \mathbf{n} = 0 \;\Rightarrow\; R_v^{\pi} = R_{u \wedge n}^{\pi}. \tag{6.6} \]
Then, (6.3) describes a rotation S-twin of the kind discussed in Section 4, essentially a type II twin. Using (2.21), one can transform it to get solutions of (5.1) rather explicitly. In this case, u need not be parallel to a normal of crystallographic planes or to rows of atoms. This covers the nontrivial possibilities with $\bar m^2 = 1$, so I now assume that $m^2 \neq 1$. If v is parallel to n, (6.1) gives $R_u^{\psi} = 1$, which is of no interest. So, we can assume that v is neither parallel nor perpendicular to n. Then, $R_v^{\pi}$ does not map a plane with normal n to itself, and we do have
\[ R_v^{\pi}\mathbf{u} = -\mathbf{u}, \quad \mathbf{u} \cdot \mathbf{n} = 0. \tag{6.7} \]
Such solutions are characterized by Adeleke [13], in his subcase 3.3.2.2. Restrictions on m not already mentioned are that m must have —1 as an eigenvalue =>• m£kh = —kb and
m*4 = —la, (6.8)
for some numbers ka and Za, not all in either set being zero. Also, there must be numbers ya such that k", y", and mabyb are linearly independent.
(6.9)
Of course, m can be replaced by $m^{-1}$ in (6.8). If I calculate correctly, such $y^a$ always exist for any particular m of the kind considered, but the linear independence fails for some values of these. For calculations to follow, one can use any values of m and $y^a$ that are consistent with (6.8) and (6.9). Take u as a unit vector and use the orthonormal basis u, n, w = u ∧ n. Then, for some angle
v = cos
(6.11)
Adeleke's prescription for ea gives them as solutions of e ° - u = Jk°,
e°-w = / ,
ea-(R»=mfcV,
(6.12)
which gives ea = kau + (esc 2
(6.13)
For a, his prescription gives a = mfeb • nR^e a - n.
(6.14)
Of course, one must transform this to get the corresponding solutions of (6.1), using a = a + 2n. From (6.12)i, it follows that the axis of RJ, is given by u = kaea. Now, starting with the combination Rl(m~l)leb,
(6.15) use (6.7) and (6.10) to get
R^n = cos2
(6.16)
R^w = sin 2
(6.17)
to evaluate this. One finds that it can be put in the form (1 — n
(6.19)
Using (6.13), one gets 1 = kalau + sec (playa (— sin
(6.20)
where (6.10) is used. With the possibility v • n = 0 excluded, a and v cannot be parallel, so these determine the direction of a plane with normal 1. When the
eigenspace of m corresponding to -1 is one-dimensional, $k^a$ and $l_a$ are proportional to integers, implying that u is parallel to some rows of atoms and l is normal to some crystallographic planes. Either -1 is an isolated eigenvalue, in which case the eigenspace is one-dimensional, or the eigenvalues are 1, -1, -1. In the latter case, one can reduce m to the form (4.8). It is then easy to show that if $m^2 \neq 1$, the eigenspace corresponding to -1 is one-dimensional, and we have already covered possibilities with $m^2 = 1$. As is clear from (6.18), there are exceptional cases when $m^2 = 1$. One possibility is that one has type I or type II twins, so sin 2
=»
in e L(ea),
R; € P(ea).
(6.21)
With this, we have all solutions of the twinning equations describing rotation twins.

7. An example

In part, this is an attempt to warn those unfamiliar with research on minerals of some of the pitfalls. Experimentally, it is not always easy to distinguish between various twin laws which seem to be quite different. In the words of some experts, Donnay et al. [18], "Whenever the crystal lattice or one of its multiple lattices possesses pseudo-symmetry, the crystal may twin (twinning by pseudo-merohedry or by reticular pseudo-merohedry). If the pseudo-symmetry is pronounced and sufficiently high, several twin laws may lead to nearly identical orientations of the twinned individual. The resulting twins have been called "neighboring twins" (macles voisines, Friedel, 1926). Because the relative orientation of one of the twinned individuals with respect to the other is known to morphologists only to within the limits of error of optical goniometry, the identification of neighboring twins may be a difficult problem, as is well illustrated by cryolite, staurolite, harmotome, morvenite, etc." These workers developed a rather sensitive technique for distinguishing between such possibilities. As an example, they considered four likely possibilities for describing one kind of twin in staurolite. These involve four rotations with different axes. Two have 120° angles, the angles for the other two being 90° and 180°. According to their observations, that with the 180° angle is the best fit, among the four. I have looked at several references on staurolite and plan to look at more, since I have found them rather confusing and incomplete, partly because of my ignorance and meager experience with minerals. In one of the more recent references, the book by Klein and Hurlbut [19, pp. 104, 105 and 438, 439], various information is presented, including the chemical composition, from which it is clear that this is a complicated multilattice, and that the composition is somewhat different in different specimens. This is also the case in various other minerals. It is described as monoclinic with a β angle of 90°, being pseudo-orthorhombic. As I interpret this and other writings, the skeletal lattice is orthorhombic, but the configuration of shifts reduces the symmetry to monoclinic. The space group is described as C2/m. No details concerning shifts are given. Although writers tend not to say so, the general practice seems to be to use as lattice vectors a mutually orthogonal set, with magnitudes given by these writers as
\[ a = |\mathbf{e}_1| = 7.83\ \text{Å}, \quad b = |\mathbf{e}_2| = 16.62\ \text{Å}, \quad c = |\mathbf{e}_3| = 5.65\ \text{Å}. \tag{7.1} \]
Other estimates I have seen of these numbers are not very different. As they describe them, two common kinds occur, both being called penetration twins, and both being involved in crosses commonly found in this material, "(1) with twin plane {0 3 1} in which the individuals cross at nearly 90° (Figure 13.24b); (7.2) (2) with twin plane {2 3 1} in which they cross at nearly 60° (Figure 13.24c)." I will only present an analysis of the first kind. Also, I note that Donnay et al. [18] do not consider the twin plane {0 3 1} as one of the more likely possibilities. Here, {a be} represents the direction ae1 + be2+ce3 or a crystallographic equivalent. For the particular direction noted, replace curly brackets by parentheses. Concerning the first kind, in a paper written almost two decades earlier, Hartman [7] complains that too many workers ignore experimental work by Friedel [20], then quite old, which he accepts. This is also endorsed by Cahn [6], a major expert. Briefly, Friedel found that the isometry is better described by K)i(7-3) From such remarks, I became doubtful that the interface is exactly a {0 3 1} plane. Similarly, Klein and Hurlbut [19] ignore the work mentioned above, by Donnay et al. For the second kind, they found that this isometry is better described as a 180° rotation about [3 1 3], the direction 3ei + e2 + 3e3. One could replace this by a crystallographic equivalent, the set of these being described by (313). Various writers just give one direction, assuming you know that the equivalents can be used and, here, I follow suit. I decided to try analyzing the first kind as a rotation twin, using (7.3) This axis is perpendicular to the normal to the aforementioned (0 3 1) plane or an equivalent. I assume that the interface is a plane with normal perpendicular to the axis, but not necessarily this one. A priori, it might be an S-twin or an O-twin, these being growth twins. I tried both possibilities, concluding that only the latter is appropriate. 
I shall sketch my analysis of the latter, using the analysis in Section 6. We are given that
\[ \mathbf{u} = \frac{\mathbf{e}_1}{a} = a\mathbf{e}^1, \quad \psi = \frac{\pi}{2}. \tag{7.4} \]
With the lattice vectors orthogonal, $\mathbf{e}^2$ and $\mathbf{e}^3$ must be in the plane determined by n and w, where the vectors are defined as in Section 6, giving
\[ c\mathbf{e}^3 = -\sin\varphi\,\mathbf{n} + \cos\varphi\,\mathbf{w}, \quad b\mathbf{e}^2 = \cos\varphi\,\mathbf{n} + \sin\varphi\,\mathbf{w}, \tag{7.5} \]
where $\varphi$ is some angle, unknown since no particular value of n is assumed. Of course, these values of $\mathbf{e}^a$ are to be consistent with (6.13), and I reject solutions possible only for isolated values of b/c. This gives equations which can be solved by elementary methods, to get two solutions. For m, one gets the values $m_1$ =
1 0 0 1
0 I! 0 |,
m2=
0 0 -1 I
1 0 0 -1
0
0
0 0
1
eL(ea).
(7.6)
For the first, one gets (n
=> »i = 2 n i
(7.7)
=»
(7.8)
and, for the second (n ® a) 2 = (be2 - ce3) ® (j
- * j
a2 = 2n 2 .
The two directions of n are orthogonal, these solutions being rather similar to the solutions Zanzotto [16,17] and I [5] got by different reasoning for the Friedel twins in quartz. The latter form 90° crosses, and one can use much the same reasoning to construct solutions for such crosses in staurolite. For (7.7), say, the direction of n is given by
\[ \frac{b}{c}\mathbf{e}^2 + \mathbf{e}^3, \quad \frac{b}{c} = 2.94, \tag{7.9} \]
for the approximate values of b and c given in (7.1). In the usual jargon, this is an irrational direction. In dealing with these, it is a common practice to give a rational direction, using fairly small integers and, here, a likely choice is a (0 3 1) plane, referred to above as a twin plane. While $m_1$ and $m_2$ are not in $L(\mathbf{e}_a, \mathbf{p}_i)$,
\[ m_3 = m_1 m_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \in L(\mathbf{e}_a, \mathbf{p}_i) \;\Rightarrow\; m_1 = m_3 m_2, \tag{7.10} \]
from the claim that these are monoclinic crystals. Here, my interpretation of the literature is that the 180° rotation included in the point group is $R^{\pi}_{\mathbf{e}_1}$. So, the two solutions are symmetry related. As far as I can tell, this seems to be a satisfactory description of these twins. What is the basis for calling these penetration twins, particularly by those like Klein and Hurlbut who associate a definite twin plane with them? According to Cahn [6, p. 388], the interfaces of penetration twins are crystallographically irregular, and interfaces are not always parallel to twin planes, when these exist. I have not yet found reports of observations of details of these interfaces, so I might be missing something. From looking at sketches of the 90° crosses and eyeing
one specimen, the interfaces at least resemble perpendicular planes. By looking at sketches of the "nearly 60°" crosses, the reader will be as able as I am to judge whether the interfaces are planes and if these are likely to be rotation twins. Other kinds of intergrowths more or less like these, not always found in crosses, are among the things called penetration twins. Sketches of these staurolite crosses and other twinned configurations in this and other materials are presented by various writers, for example, Klein and Hurlbut [19, pp. 103-106] and Dana [21, pp. 181— 194]. With (7.6), for the same values of mi and m2, one can get solutions for what I call penetration S-twins without restrictions on n, or for what I call exceptional O-twins with different values of n, but these involve different isometries. There is a little mystery which I seem to have resolved. For the second kind of twin described in (7.2), Cahn [6, p. 373] says nothing about the twin plane mentioned there, but does mention a twin law "... with a three fold twin axis [101],...". I expected to find this among the four likely prospects considered by Donnay et al., but did not. At first, I thought that this could be due to some misprint. Then, I found that Hartman [7, p. 234] noted that different workers use different values of c, one based on "... the X-ray unit cell...", the other on "... the morphological description...", the latter being twice the former. Assuming Cahn but not the others used the latter, this corresponds to the [102] possibility considered by Donnay et al. As was mentioned earlier, this is not the one they consider the most likely, which is a two fold rotation with twin axis [3 13]. However, this seems to give a likely explanation of the indicated difference. Of the two, Cahn's paper was published a bit earlier so, at the time, he might not have learned of the results of the other writers. 
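The closeness of the irrational direction in (7.9) to the rational (0 3 1) plane can be quantified with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: the lattice vectors are taken as mutually orthogonal with the magnitudes in (7.1), so that each reciprocal vector is a unit vector divided by the corresponding length, and (7.9) is read as the plane with indices (0, b/c, 1). The helper names are hypothetical.

```python
import math

# Lattice parameters of (7.1), in angstroms.
A_LEN, B_LEN, C_LEN = 7.83, 16.62, 5.65


def plane_normal(h, k, l):
    """Normal h e^1 + k e^2 + l e^3 for orthogonal lattice vectors."""
    return (h / A_LEN, k / B_LEN, l / C_LEN)


def angle_deg(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.degrees(math.acos(cosine))


def irrational_normal():
    """Direction (b/c) e^2 + e^3, read here as the plane (0, b/c, 1)."""
    return plane_normal(0.0, B_LEN / C_LEN, 1.0)


def deviation_from_031():
    """Angle between the irrational normal and the (0 3 1) normal."""
    return angle_deg(irrational_normal(), plane_normal(0, 3, 1))


def angle_between_031_variants():
    """Angle between the normals of (0 3 1) and (0 -3 1)."""
    return angle_deg(plane_normal(0, 3, 1), plane_normal(0, -3, 1))
```

With these numbers the ratio b/c rounds to 2.94, the irrational normal lies within about half a degree of the (0 3 1) normal, and the two symmetry-related (0 3 1) planes meet at an angle close to, but not exactly, 90°, in keeping with the description "nearly 90°".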
I will not recommend a solution for these twins without demonstrating that it is adequate to describe the "nearly 60°" crosses, something I have not yet explored. I do think it desirable to try to analyze more of the many growth twins, to better understand how well my twinning equations apply to them, and to determine how useful they might be in helping to resolve ambiguities mentioned at the beginning of this Section.

References

1. C.S. Barrett and T.B. Massalski, The Structure of Metals, 3rd edn. McGraw-Hill, New York (1966).
2. J.L. Ericksen, Equilibrium theory for X-ray observations. Arch. Rational Mech. Anal. 139 (1997) 181-200.
3. J.L. Ericksen, On correlating two theories of twinning. Arch. Rational Mech. Anal. 151 (2000) 261-289.
4. J.L. Ericksen, Twinning analyses in the X-ray theory. Internat. J. Solids Structures 38 (2001) 967-995.
5. J.L. Ericksen, On the theory of growth twins in quartz. Math. Mech. Solids 6 (2001) 359-386.
6. R.W. Cahn, Twinned crystals. Adv. in Phys. Quart. Suppl. Phil. Mag. 3 (1954) 363-445.
7. P. Hartman, On the morphology of growth twins. Zeits. Krist. 107 (1956) 225-237.
8. M. Pitteri, On (ν + 1) lattices. J. Elasticity 15 (1985) 3-25.
9. J.L. Ericksen, On groups occurring in the theory of crystal multi-lattices. Arch. Rational Mech. Anal. 148 (1999) 145-178.
10. M. Pitteri, On the kinematics of mechanical twinning in crystals. Arch. Rational Mech. Anal. 88 (1985) 25-58.
11. M. Pitteri, On type-2 twins in crystals. Internat. J. Plasticity 2 (1986) 99-106.
12. M. Pitteri and G. Zanzotto, Beyond space groups: The arithmetic symmetry of deformable multilattices. Acta Cryst. A 54 (1998) 359-373.
13. S. Adeleke, On matrix equations of twinning in crystals. Math. Mech. Solids 5 (2000) 395-415.
14. J.L. Ericksen, Some surface defects in unstressed thermoelastic solids. Arch. Rational Mech. Anal. 88 (1985) 337-345.
15. G. Zanzotto, Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some experimental results. Mechanical twinning and growth twinning, Nota II. Atti Accad. Naz. Lincei 82 (1988) 725-741, 743-756.
16. G. Zanzotto, Geobarothermometric properties of growth twins and mathematical analyses of quartz for a broad range of temperatures and pressures. Phys. Chem. Minerals 16 (1989) 783-789.
17. G. Zanzotto, Thermoelastic stability of multiple growth twins in quartz and general barothermometric implications. J. Elasticity 23 (1990) 253-287.
18. G. Donnay, J.D.H. Donnay and V.J. Hurst, Precession goniometry to identify neighboring twins. Acta Cryst. 8 (1955) 507-509.
19. C. Klein and C.S. Hurlbut, Jr., Manual of Mineralogy (after James D. Dana), 21st edn, revised. Wiley, New York (1999).
20. G. Friedel, Sur les macles de la staurotide. Bull. Soc. Franc. Mineral. 45 (1922) 8-15.
21. B.W. Dana, A Textbook on Mineralogy with an Extended Treatise on Crystallography (revised and enlarged by W.E. Ford). Wiley, New York (1932).
On the Theory of Growth Twins in Quartz
J. L. ERICKSEN
5378 Buckskin Bob Road, Florence, OR 97439, USA (Received 3 April 2000; Final version 14 April 2000)
Abstract: Being generally interested in twinning phenomena in crystals, the author became curious to know what can be done with available continuum theory, to better analyze growth twins in minerals. The most common mineral is quartz, a very useful material, and a variety of different kinds of growth twins occur in quartz crystals. So, he picked this material for his first attempt to analyze growth twins. The author is encouraged to find that, even at the elementary level, this does add something to our understanding of them. His aim is to elaborate this, using thermoelasticity theory and the X-ray theory described below.
Key Words: twinning, crystal mechanics, phase transitions, properties of quartz
1. INTRODUCTION For deformation twins in many crystals, nonlinear thermoelasticity theory fails to apply, as was made clear by Zanzotto [1]. This is associated with failure of the commonly used hypothesis relating changes in lattice vectors to macroscopic deformation, the Cauchy-Born rule, hereafter abbreviated as CBR. Similarly, CBR fails to apply to deformations associated with some phase transitions. Also, there are difficulties in using such theory to analyze growth twins, twins occurring when a crystal is grown, the concept of deformation being irrelevant to how they are related, when they are formed. Once they are formed, there are some possibilities for using thermoelasticity theory to analyze how they respond to changes in temperature and stress, as Zanzotto [2,3] did, in an interesting study of likely environments for growth of rare quartz crosses or, as Thomas and Wooster [4] did, in finding a theoretical basis for good mechanical treatments to remove Dauphine twins from quartz crystals, for example. These impede their piezoelectric response. As is discussed by Dana and Dana [5, pp. 75-98], a variety of kinds of growth twins have been reported to be observed in quartz, several being well confirmed. Also, rather complicated configurations involving the same or different kinds occur. They and other workers mention that essentially all quartz crystals contain some twins. I [6] proposed a theory of X-ray observations of crystals, hereafter called the X-ray theory, to help deal with cases in which CBR fails to apply or is irrelevant, as is the case for growth twins. It is constructed so that for subsets of changes of configurations, one can use CBR to introduce deformation when one wishes to do so. For various reasons, I think it worthwhile to determine what the X-ray theory can do, to describe some of the observations of quartz.
Mathematics and Mechanics of Solids, 6: 359-386,2001 ©2001 Sage Publications
I [7,8] have developed some of the more elementary theory of deformation twins, and I think it reasonable to develop similar theory for growth twins. As noted above, quartz provides a variety of examples for this, some being rather unusual. Thus, it provides a good place to explore, to determine how well the X-ray theory performs, in analyzing growth twins. Also, the α-β phase transition is interesting, as an example of a transition which cannot be satisfactorily analyzed using thermoelasticity theory, although CBR seems to apply to the deformations involved in it, as is discussed by James [9]. I note that James [10] did an interesting study of some such phenomena in quartz, using theory which is similar to the X-ray theory, but less complete. I'll include some very brief comments about the transition, but won't attempt to analyze it: it has some curious features which have puzzled workers, including me. Primarily, my concern here is with elementary analyses of the well-established kinds of growth twins in quartz. I will use both the X-ray and the thermoelasticity theories, showing how they complement each other. The most common twins are those called Brazil, Dauphiné, and Japan twins and combinations of these. For these, the X-ray theory is used. For the rarer Friedel twins, I consider both kinds of theory, which provides more understanding than one can get from using just one of these. For the Zinnwald and Zwickau twins, which are also rather rare, I use only thermoelasticity theory. These have different properties, which make the X-ray theory less useful for them. As far as I know, these are the only kinds of twins that are recognized as well-established by experts, although various others have been reported as having been observed, some only once.
2. X-RAY THEORY

The X-ray theory involves two kinds of vector fields, functions of position in space. For one, there are the lattice vectors $\mathbf{e}_a$ and their duals, the reciprocal lattice vectors $\mathbf{e}^a$ satisfying
\[ \mathbf{e}_a \cdot \mathbf{e}^b = \delta_a^b, \quad \mathbf{e}_a \otimes \mathbf{e}^a = \mathbf{e}^a \otimes \mathbf{e}_a = 1. \tag{1} \]
I excluded continuous distributions of dislocations by assuming that, at least locally, there are scalar functions $\chi^a$ such that
\[ \mathbf{e}^a = \nabla\chi^a. \tag{2} \]
This makes the theory compatible with CBR for subsets of changes of configurations, leaving it to individual judgment to decide when to trust it. Briefly, it asserts that if one picks a reference configuration with lattice vectors $\mathbf{E}_a$ and subjects it to a deformation with deformation gradient F, the vectors $\mathbf{F}\mathbf{E}_a$ are a possible set of lattice vectors in the deformed crystal. For the mass density ρ, I assumed that mass is proportional to that of a unit cell, formalized as
\[ \rho = \frac{m}{|\mathbf{e}_1 \cdot \mathbf{e}_2 \wedge \mathbf{e}_3|} = m\,|\mathbf{e}^1 \cdot \mathbf{e}^2 \wedge \mathbf{e}^3|, \tag{3} \]
where m is a positive constant. This is also compatible with CBR. It fails to apply to thermal expansion of zinc, according to the experiments of Balzer and Sigvaldson [11], who attribute this to changes in the distribution of vacancies and dislocations, defects which are invisible in X-ray observations. This relates to the fact that X-ray observations are of a macroscopic nature, involving averaging over the thickness of a beam. For thermal expansion of quartz, Jay [12] found that CBR does apply, implying that (3) should hold, but he mentions evidence that it fails for bismuth. For any particular configuration, there are infinitely many possible choices of lattice vectors, related by transformations of the form
\[ \mathbf{e}_a \to m_a^b \mathbf{e}_b \;\Leftrightarrow\; \mathbf{e}^a \to (m^{-1})^a_b \mathbf{e}^b, \quad m = \|m_a^b\| \in GL(3, \mathbb{Z}). \tag{4} \]
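The duality relations (1), the cell volume entering (3) and the change of lattice vectors (4) can be illustrated with exact rational arithmetic. The sketch below is an illustration only: the lattice vectors and the integer matrix in the usage note are arbitrary choices, not data from the text, and the function names are hypothetical. Rather than inverting m, it verifies the equivalent statement that the old reciprocal vectors are recovered from the new ones by summing over the row index of m.

```python
from fractions import Fraction


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def cell_volume(e):
    """Signed volume e_1 . e_2 ^ e_3 of the unit cell."""
    return dot(e[0], cross(e[1], e[2]))


def reciprocal(e):
    """Reciprocal vectors e^a, dual to the lattice vectors e_a."""
    vol = Fraction(cell_volume(e))
    return [tuple(x / vol for x in cross(e[(a + 1) % 3], e[(a + 2) % 3]))
            for a in range(3)]


def change_lattice_vectors(m, e):
    """New lattice vectors m_a^b e_b; the lower index a labels the rows of m."""
    return [tuple(sum(m[a][b] * e[b][i] for b in range(3)) for i in range(3))
            for a in range(3)]


def reciprocal_rule_holds(m, e):
    """Check that e^b equals the sum over a of m[a][b] times the new e^a."""
    e_new = change_lattice_vectors(m, e)
    r, r_new = reciprocal(e), reciprocal(e_new)
    return all(
        r[b] == tuple(sum(m[a][b] * r_new[a][i] for a in range(3))
                      for i in range(3))
        for b in range(3))
```

For example, with `e = [(2, 0, 0), (1, 3, 0), (0, 1, 5)]` and `m = [[1, 1, 0], [-1, 0, 0], [0, 0, 1]]`, the rule holds and the absolute cell volume is unchanged, so the density given by (3) does not depend on which set of lattice vectors is used.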
In all matrices used here, the lower index on entries labels rows. The second kind of vectors consist of the shifts, denoted by
\[ \mathbf{p}_i, \quad i = 1, \ldots, n-1, \tag{5} \]
for an n-lattice. Such a lattice consists of n geometrically identical interpenetrating lattices, with the same lattice vectors. To get a set of shifts, pick one atom from each lattice and select one as a base point. Then the shifts are the position vectors relative to the base point of the other atoms selected. Obviously, there are also infinitely many ways of choosing these, and they are related by transformations of the form
\[ \mathbf{p}_i \to \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a, \quad l_i^a \in \mathbb{Z}. \tag{6} \]
We will be concerned only with the theory of monatomic 3-lattices, and for these, the possible choices of α are
_
1 0
°i
-
o
a
=
0 - 1 j _j
5
i '
a 2
_ - ! ! -
> «6 =
-
1
0 1 j Q
0'
a3
_
~
-1 °
-
1
1'
a 4
_ ! - !
•
~
0
-
1 '
(7)
As was mentioned earlier, the lower index on elements labels rows. Some workers, for example Pitteri and Zanzotto [13, Chap. 4], use a different convention, making their matrices transposes of mine. For a lattice, the point group $P(\mathbf{e}_a)$ and lattice group $L(\mathbf{e}_a)$ are finite groups, defined by
\[ P(\mathbf{e}_a) = \{ Q \in O(3) \mid Q\mathbf{e}_a = m_a^b \mathbf{e}_b, \; m \in GL(3, \mathbb{Z}) \} \tag{8} \]
and
\[ L(\mathbf{e}_a) = \{ m \in GL(3, \mathbb{Z}) \mid m_a^b \mathbf{e}_b = Q\mathbf{e}_a, \; Q \in O(3) \}. \tag{9} \]
Any n-lattice, equipped with a choice of $\mathbf{e}_a$, defines a skeletal lattice, a 1-lattice with these lattice vectors, so these groups are defined for n-lattices. In addition, there is the point group defined by
\[ P(\mathbf{e}_a, \mathbf{p}_i) = \{ Q \in O(3) \mid Q\mathbf{e}_a = m_a^b \mathbf{e}_b, \; Q\mathbf{p}_i = \alpha_i^j \mathbf{p}_j + l_i^a \mathbf{e}_a, \; m \in GL(3, \mathbb{Z}) \}, \tag{10} \]
where, for monatomic 3-lattices, the possible choices of α and l are as described above. This can be of lower order than $P(\mathbf{e}_a)$, since the conditions on shifts can exclude some elements of the latter, and we will encounter examples of this kind. Lattice group elements for a lattice are of the form
\[ \{m, \alpha, l\} \in L(\mathbf{e}_a, \mathbf{p}_i), \tag{11} \]
consisting of matrices of the kind indicated that are compatible with (10). Concerning constitutive equations, there are various equivalent formulations of those that I proposed [6], and I now prefer one I presented later [14]. For this,
(12)
where θ denotes the absolute temperature and
\[ p_i^a = \mathbf{p}_i \cdot \mathbf{e}^a \;\Leftrightarrow\; \mathbf{p}_i = p_i^a \mathbf{e}_a. \tag{13} \]
As to what invariances φ should have, this depends on what is considered to be its domain, and there are different reasonable possibilities, depending on the problems one wishes to consider. As a general restriction, I use that described as follows: Throughout the domain of
<
0 and e 1 • e 2 A e 3 > 0 0 ande 1 • e 2 A e 3 < 0. (14)
Here, I use the former alternative, except when there is a statement to the contrary. This restriction is based on the idea that
R e SO(3).
(15)
We do have the equivalence classes of lattice vectors and shifts discussed earlier, and their equivalents for the variables are now used. Physically, the value of
the configuration, not on how this is represented. So, whenever transformations of the kind indicated by (4) and (6) map an argument in the domain to others in the domain,
(16)
n=
d
~le
J
While configurational stresses can be useful for analyzing some kinds of singularities, they are not for those considered here. From (16), it is clearly important for
(18)
Also, I [14] derived identities satisfied by t and 5> evaluated at values of e" and/?? with a nontrivial lattice group. These are R r t R = t,
ReP'(efl)p,)
(19)
{m, a, 1} e L' (efl, p , ) ,
(20)
and the equivalent of m $ = $a,
where the primes denote these groups, restricted to be consistent with (14). For the cases to be considered, this imposes no restrictions. These do require that
groups indicated here. This will be true if its domain is a Pitteri [15] neighborhood centered at the configuration considered, in particular. However, to cover various twinning analyses, it can be necessary for the domain to be larger and, in some cases encountered, it is necessary to consider more than one constitutive equation.
3. THE β-PHASE

The behavior of some twins makes it useful to consider two phases of quartz. At room temperature, unstressed quartz is found in the α-phase, transforming to the more symmetric β-phase at a temperature of about 574 °C, in crystals bearing no loads. In practice, specimens subjected to atmospheric pressure are commonly regarded as unstressed. Since the latter phase is a bit simpler, we will consider it first. For unstressed configurations in both phases, for describing the lattice and reciprocal lattice vectors, I use the rather common choice,
\[ \mathbf{e}_1 = a\mathbf{i}, \quad \mathbf{e}_2 = a\left(-\tfrac{1}{2}\mathbf{i} + \tfrac{\sqrt{3}}{2}\mathbf{j}\right), \quad \mathbf{e}_3 = c\mathbf{k}, \]
\[ \mathbf{e}^1 = \frac{1}{a}\left(\mathbf{i} + \frac{1}{\sqrt{3}}\mathbf{j}\right), \quad \mathbf{e}^2 = \frac{2}{\sqrt{3}\,a}\mathbf{j}, \quad \mathbf{e}^3 = \frac{1}{c}\mathbf{k}, \tag{21} \]
where (i, j, k) is an orthonormal basis, oriented so that
\[ \mathbf{i} \cdot \mathbf{j} \wedge \mathbf{k} > 0, \tag{22} \]
a and c being positive functions of temperature. Often, k is referred to as the c-axis. In either phase, configurations can be of one of two kinds, commonly described as right- and left-handed. My view is that, in a theory of the solid state, these should be considered as different, enantiomorphic materials. I will use a simplified theory which seems to perform well for studies of the kind done here and which is like that used by some workers. This describes these materials as monatomic 3-lattices. Essentially, this substitutes an atom for a combination of the silicon and oxygen atoms occurring in the real crystal and ignores the fact that these are ionic crystals. The β-phase is not piezoelectric, but the α-phase is. In the β-phase, the two shifts for the right-handed variety can be taken as
\[ \mathbf{p}_1^R = p_1^{Ra}\mathbf{e}_a = \tfrac{1}{2}\mathbf{e}_1 + \tfrac{1}{3}\mathbf{e}_3, \quad \mathbf{p}_2^R = p_2^{Ra}\mathbf{e}_a = \tfrac{1}{2}\mathbf{e}_2 + \tfrac{2}{3}\mathbf{e}_3, \tag{23} \]
and, for the left-handed,
\[ \mathbf{p}_i^L = p_i^{La}\mathbf{e}_a = -\mathbf{p}_i^R, \quad i = 1, 2. \tag{24} \]
Of course, for either material,
6) and
0) .
(25)
Here,
DR = DL .
,
(26)
Since these are effectively different materials, it is not really important that the domains coincide. For example, an alternative is to take DL
TDR
=
^ yL
=
T(f)R
^
^
where T denotes the transformation described by
\[ T\mathbf{e}_a = -\mathbf{e}_a, \quad T\mathbf{p}_i = -\mathbf{p}_i \;\Leftrightarrow\; Tp_i^a = p_i^a. \tag{28} \]
For the twins to be considered, (26) better fits the analysis, although we shall see that something rather similar to (27) is more appropriate for Friedel twins. I see no harm in making different choices for different problems, as long as it is made clear what is being used. Actually, different workers have made different choices, sometimes without being clear what they were using, causing some confusion. While it might seem curious, the two functions
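$$R_1 e_a = (m_1)_a^b e_b \;\Leftrightarrow\; R_1 e^a = (m_1^{-1})_b^a e^b, \quad R_1 = R_k^{\pi/3}, \qquad (29)$$

where

$$m_1 = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad m_1^{-1} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad m_1^6 = 1, \qquad (30)$$

and

$$R_2 e_a = (m_2)_a^b e_b \;\Leftrightarrow\; R_2 e^a = (m_2)_b^a e^b, \quad R_2 = R_i^{\pi}, \qquad (31)$$

with

$$m_2 = m_2^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \qquad (32)$$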
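Working out the corresponding generators for the lattice groups, one gets

$$\{m_1, \alpha_2, l_1\} \quad \text{and} \quad \{m_2, \alpha_3, l_2\} \qquad (33)$$

for the right-handed and, for the left-handed,

$$\{m_1, \alpha_2, -l_1\} \quad \text{and} \quad \{m_2, \alpha_3, -l_2\}, \qquad (34)$$

where $\alpha_2$ and $\alpha_3$ are described in (7) and

$$l_1 = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad l_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}. \qquad (35)$$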
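The statement that these triples generate the lattice group can be tested with exact arithmetic. The sketch below is only illustrative and rests on stated assumptions: the β-phase shifts are taken as $p_1 = \tfrac12 e_1 + \tfrac23 e_3$, $p_2 = \tfrac12 e_2 + \tfrac13 e_3$, the integer matrices are as read here from (30), (32) and (35), and the 2×2 matrices `ALPHA_R1`, `ALPHA_R2` are worked out for this sketch as stand-ins for the $\alpha$ matrices of (7), which are not reproduced in this section. It checks that $R p_i = \alpha_i^j p_j + l_i^a e_a$ holds in lattice components.

```python
from fractions import Fraction as Fr

def act(m, v):
    """Components of R v, when R e_a = m[a][b] e_b and v = v[a] e_a."""
    return [sum(v[a] * m[a][b] for a in range(3)) for b in range(3)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def affine(alpha, l, shifts):
    """Components of alpha_i^j p_j + l_i^a e_a for i = 1, 2."""
    return [[sum(alpha[i][j] * shifts[j][b] for j in range(2)) + l[i][b] for b in range(3)]
            for i in range(2)]

M1 = [[1, 1, 0], [-1, 0, 0], [0, 0, 1]]      # rotation by pi/3 about k
M2 = [[1, 0, 0], [-1, -1, 0], [0, 0, -1]]    # rotation by pi about i
L1 = [[1, 0, 1], [0, 0, 1]]
L2 = [[1, 0, 0], [0, -1, 0]]
ALPHA_R1 = [[-1, 1], [-1, 0]]   # assumption: stand-in for alpha_2 of (7)
ALPHA_R2 = [[-1, 0], [-1, 1]]   # assumption: stand-in for alpha_3 of (7)

SHIFTS_BETA = [[Fr(1, 2), Fr(0), Fr(2, 3)], [Fr(0), Fr(1, 2), Fr(1, 3)]]

def is_generator(m, alpha, l, shifts):
    """True if rotating the shifts reproduces alpha p + l e exactly."""
    return [act(m, p) for p in shifts] == affine(alpha, l, shifts)

def power(m, n):
    out = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for _ in range(n):
        out = matmul(out, m)
    return out
```

Replacing `L1`, `L2` by their negatives and the shifts by their negatives gives the corresponding check for the left-handed variety.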
I will use some but not all elements of the lattice group for these configurations. However, in the last section, I will describe all of the elements in detail. Now, let
$$\Phi^i_a = \frac{\partial \varphi}{\partial p_i^a}. \qquad (36)$$

With (20), using the generators described above, one finds that

$$\Phi^i_a = 0. \qquad (37)$$
So, this subset of equilibrium equations is satisfied by symmetry. Note that this applies to any values of a and c describing configurations in the domain of
(38)
which characterizes the stresses possible in these configurations. One can get this from the table presented by Truesdell [16, p. 201] or by a simple calculation based on (19). With the shift components fixed by (23) and (24), A and B reduce to functions of a, c, and θ. Setting t = 0 then gives two equations in two unknowns,

$$A(a, c, \theta) = B(a, c, \theta) = 0, \qquad (39)$$
for determining a and c as functions of θ, at least for values of this likely to be relevant to the occurrence of the β-phase. Usually, crystals found in the field are predominantly right- or left-handed, containing small amounts of the enantiomorph, the two possibilities occurring with about the same frequency. The smaller contributors can occur as growths on the surface of the larger or be included in the interior. Dana and Dana [5, Fig. 61] present a sketch of typical shapes of the latter, rather simple regions with piecewise planar boundaries "such as {1010}, {1010}, {0111}, {0001} and {1121}." For hexagonal crystals, it is customary to use the four index notation: interpret {p, q, −p−q, r} as the direction $p e^1 + q e^2 + r e^3$, or one crystallographically equivalent to it, considered as a normal to a plane. I refer the reader to their discussion for additional details on these twins. Using (26), it is easy to show that these solutions are symmetry related in an obvious way for the left- and right-handed varieties. The coexisting enantiomorphs form the so-called Brazil twins, the first kind of growth twins to be considered. My view that the two varieties should be regarded as different solid materials implies that mass won't be transported across such phase boundaries, even when α-β phase transitions occur, which is consistent with experience. This does not preclude the possibility that conversions might occur if the crystal passes through a fluid phase, but the X-ray theory does not deal with fluid phases. Concerning terminology used by experts, Cahn [17, p. 388] writes

The interface or composition surface between the individuals constituting a twin is something quite distinct from the twin plane, where there is one. The latter is a plane of structural symmetry between the components, while the interface may or may not be plane, and may or may not be parallel to the twin plane. When the interface is parallel to the twin plane, the edifice is termed a contact twin. When the interface is crystallographically irregular, so that the components appear as if they had grown simultaneously but independently of each other, the edifice is called a penetration twin.
I [6] proposed that, for most twins, including growth twins, it should be possible to choose reciprocal lattice vectors $e^a$ and $\bar e^a$ on the two sides of the discontinuity surface such that for all Burgers circuits in the neighborhood of the surface, the Burgers vectors vanish. This led to the jump condition

$$\bar e^a = (1 - n \otimes a)\, e^a, \qquad (40)$$

where n is the unit normal to the surface and a is some vector, defined on the surface. Another common idea is that for twins in unstressed crystals, the two configurations should be related by some isometry. This leads to the twinning equation

$$\bar e^a = (1 - n \otimes a)\, e^a = Q\, m^a_b\, e^b, \quad Q \in O(3), \quad m \in GL(3, \mathbb{Z}). \qquad (41)$$

For Brazil twins, the obvious isometry has Q = −1, and we can satisfy this with

$$Q = -1, \quad m = -1 \;\Rightarrow\; \bar e^a = e^a \;\Rightarrow\; a = 0. \qquad (42)$$

Actually, this does not require the crystal to be unstressed, holding for any $e^a$ described by (21). So, the stresses can be any of the form (38). Also, the isometry implies more, that

$$Q = -1 \;\Rightarrow\; e_a \to -e_a \ \text{and} \ p_i \to -p_i. \qquad (43)$$

As before, combine this with m = −1 and we get what was used in (26),

$$\bar e^a = m^a_b\, Q e^b = e^a \ \text{and} \ \bar p_i = -p_i \;\Rightarrow\; \bar p_i^a = -p_i^a, \qquad (44)$$
agreeing with (41) and justifying calling these twins, involving enantiomorphs. Here, at least, the suggested twinning equation applies. There are configurations called twins, for which crystallographically inequivalent planes meet at the interface. As I [6] noted, mentioning an example observed in alum, one should not expect (41) to apply to these. In quartz, the well-established Zinnwald twins are of this kind, as is clear from the description of these given by Friedel [18], for example. Later, we will do some analysis of these and the somewhat similar Zwichau twins, using thermoelasticity theory. For various deformation twins which have been unloaded, workers use the twinning equation

$$\bar e_a = Q\, m_a^b\, e_b = (1 + a \otimes n)\, e_a, \quad Q \in O(3), \quad m \in GL(3, \mathbb{Z}), \quad a \cdot n = 0, \qquad (45)$$
or a variation on this with rather similar features. Almost always, a ≠ 0 and solutions then restrict n to certain crystallographic directions. Here, there is no such restriction. This is associated with penetration twins, mentioned in the aforementioned explanation by Cahn, observed in various growth twins, involving nonplanar discontinuity surfaces, often of a somewhat random kind. Rather obviously, no particular crystallographic plane is suggested by the analysis, nothing that would serve as a twin plane, a typical situation for what are called penetration twins. As was mentioned earlier, those observed in Brazil twins do have relatively simple geometric shapes. Perhaps because the X-ray theory does not deal with growth processes or interfacial energies, it provides no reason for this. Brazil twins are sometimes called "optical twins," since they can be seen using polarized light, this being associated with optical activity. Here at least, what makes it possible to have a = 0 is that $Q = -1 \in P(e_a)$ but $\notin P(e_a, p_i)$. I note that, with (26), one can use the same analysis without assuming that the crystals are unstressed, allowing for the stresses represented by (38), but I won't belabor this. Also, with some wariness, James [10] uses CBR to introduce macroscopic deformation into his equations. As I [7] explained, it is easy to do this for the X-ray theory for particular changes of configurations, whenever you trust it. Like James, I think it wise to be wary of assuming that it applies, without looking hard at the evidence, but workers often do so. To use thermoelasticity theory, it seems necessary to do so, and later, I will do this. We have not yet had a reason to deal with deformation, but later, we will.
4. THE α-PHASE

The α-phase is piezoelectric, but, as was mentioned before, we do not account for this, following the lead of Thomas and Wooster [4] and James [10], among others. For the α-phase, the lattice vectors are also of the form (21), but for the point and lattice groups, there is some loss of symmetry, the generators for the former now being $R_k^{2\pi/3}$ and $R_i^{\pi}$. The order of either of these groups is 6, half that for the β-phase. This is associated with a change in the shifts to the form

$$p_1 = \tfrac{1}{2} e_1 + \tfrac{2}{3} e_3 + \lambda\,(e_1 + 2e_2), \quad p_2 = \tfrac{1}{2} e_2 + \tfrac{1}{3} e_3 + \lambda\,(2e_1 + e_2), \qquad (46)$$
for the right-handed variety, and the negatives of these for the left-handed, where λ is a function of θ. Referring to (22), some writers prefer to refer the left-handed varieties to left-handed bases. Obviously, one can describe the reversal of $p_i$ by replacing $e_a$ by $-e_a$, leaving the components unchanged and simplifying the listing of these a bit, and there is the same advantage for listing piezoelectric moduli. For the problems to be considered later, I find this disadvantageous. Cady [19, Chap. XVI] discusses how an orthonormal basis is commonly determined, physically, when piezoelectric effects are also considered. For the configurations described above, piezoelectric effects should not occur, according to the linear theory used by Cady for both phases. As we shall see, (46) is really double-valued, in the sense that (17) gives a pair of symmetry-related solutions. Here,

$$R_3 e_a = (m_3)_a^b e_b \;\Leftrightarrow\; R_3 e^a = (m_3^{-1})_b^a e^b, \quad R_3 = R_k^{2\pi/3} = R_1^2, \qquad (47)$$

with

$$m_3 = \begin{pmatrix} 0 & 1 & 0 \\ -1 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad m_3^{-1} = \begin{pmatrix} -1 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \;\Rightarrow\; m_3^3 = 1. \qquad (48)$$
Of course, $R_2 = R_i^{\pi}$ again transforms $e_a$ and $e^a$ as described by (31) and (32). Calculating the corresponding generators of the lattice group for the right-handed kind, one gets

$$\{m_2, \alpha_3, l_2\} \quad \text{and} \quad \{m_3, \alpha_5, l_3\}, \qquad (49)$$

where the first generator is again described by (32) and (33), $\alpha_5$ is given in (7), and
13
= I -1 I I I
<50>
replace $l_i$ by $-l_i$ to get the analogs for the left-handed variety. Of course, one can use (20) to get the identities satisfied by $\Phi^i_a$, which imply that it reduces to the form

$$(\Phi^i_a) = \begin{pmatrix} \Phi & -2\Phi & 0 \\ -2\Phi & \Phi & 0 \end{pmatrix}, \qquad (51)$$
where Φ ≠ 0 in general: equating it to zero gives an equation for determining λ. Since λ = 0 is a solution, the α-β transition involves nonuniqueness of solutions of this equation, with an exchange of stability. With thermoelasticity theory, there is no way to account for this, and one should. With the new point group, it still follows that t is of the form (38), so one gets (39) as equations for determining a and c. For (38) to apply, it is not necessary that Φ = 0 be satisfied, but bear in mind that A and B now also depend on λ. Using the invariance of φ under SO(3), one can show that they are even functions of λ. This is related to the fact that such stresses are ineffective in removing Dauphiné twins, according to the theory of Thomas and Wooster [4]. As was mentioned before, the Brazil twins continue to exist in the α-phase; one can again use (26) and (46) for the new shifts. However, with λ ≠ 0, we also get the possibility of Dauphiné twins. These are sometimes called "electrical twins" since they affect electrical responses. They can occur as growth or transformation twins, occurring naturally when an unstressed crystal in the β-phase is cooled enough to transform to the α-phase. They can also be generated by applying rather concentrated loads. They disappear in the β-phase. To understand these, one can look at how the previous generator $R_1 = R_k^{\pi/3}$ acts on the new shifts. Denote by $p_i^+$ the values of $p_i$ when λ > 0, $p_i^-$ when λ < 0. Then, at the same value of |λ|, a calculation gives

$$\bar p_i = R_1 p_i^+ = (\alpha_2)_i^j\, p_j^- + (l_1)_i^a\, e_a, \qquad (52)$$
where $\alpha_2$ is given in (7) and $l_1$ is given by (35), (52) implying that the shifts $\bar p_i$ are equivalent to $p_i^-$. For the Dauphiné twins, one can take $e^a$ to have the same value on both sides, with shifts $p_i^+$ on one side and $p_i^-$ on the other. For the isometry, one can use that suggested by (52), or combine this with an element in the point group for the α-phase. Using $R_4 = R_1 R_k^{2\pi/3} = R_k^{\pi}$ gives the simpler

$$\bar p_i = R_4 p_i^+ = p_i^- + (l_4)_i^a\, e_a, \quad R_4 = R_1^3, \qquad (53)$$

with

$$l_4 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix}. \qquad (54)$$

The corresponding m is

$$m_4 = m_4^{-1} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (55)$$
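The distinction between an element of the point group and an operation producing a Dauphiné twin can be illustrated with exact arithmetic. This is a sketch under stated assumptions: the α-phase shifts are taken as $p_1 = \tfrac12 e_1 + \tfrac23 e_3 + \lambda(e_1 + 2e_2)$, $p_2 = \tfrac12 e_2 + \tfrac13 e_3 + \lambda(2e_1 + e_2)$, and two sets of shifts are regarded as equivalent when the atomic positions {0, p_1, p_2} coincide modulo lattice translations, up to a shift of origin to another atom. The helper names and the sample value of λ are hypothetical.

```python
from fractions import Fraction as Fr

M3 = [[0, 1, 0], [-1, -1, 0], [0, 0, 1]]   # rotation by 2*pi/3 about k
M4 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # rotation by pi about k

def shifts(lam):
    """Lattice components of p_1, p_2 for a given value of lambda."""
    lam = Fr(lam)
    return [[Fr(1, 2) + lam, 2 * lam, Fr(2, 3)],
            [2 * lam, Fr(1, 2) + lam, Fr(1, 3)]]

def act(m, v):
    """Components of R v, when R e_a = m[a][b] e_b and v = v[a] e_a."""
    return [sum(v[a] * m[a][b] for a in range(3)) for b in range(3)]

def rotate(m, ps):
    return [act(m, p) for p in ps]

def atoms_mod_lattice(ps, origin):
    """Atomic positions {0, p_1, p_2} relative to origin, reduced modulo the lattice."""
    points = [[Fr(0)] * 3] + [list(p) for p in ps]
    return {tuple((x - o) % 1 for x, o in zip(pt, origin)) for pt in points}

def equivalent(ps, qs):
    """True if the two 3-lattices coincide up to a translation."""
    target = atoms_mod_lattice(ps, [Fr(0)] * 3)
    origins = [[Fr(0)] * 3] + [list(q) for q in qs]
    return any(atoms_mod_lattice(qs, o) == target for o in origins)
```

With, say, `lam = Fr(1, 20)`, the rotation `M3` maps the shifts to an equivalent set, while `M4` maps them to the set with λ replaced by −λ, which is not equivalent to the original unless λ = 0.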
This is the choice of Thomas and Wooster [4] as well as James [10], for example. Taking the isometry suggested by the latter, we can start with $e_a$, given by (21) on the side where $p_i = p_i^+$ and, on the other side, use the equivalent of $R_4 e_a$,

$$\bar e_a = R_4 (m_4)_a^b\, e_b = e_a, \qquad (56)$$

and take as shifts $p_i^-$, consistent with this isometry. Also, we have

$$\bar e^a = (1 - n \otimes a)\, e^a \;\Rightarrow\; a = 0, \qquad (57)$$

as was the case for Brazil twins. Again, this suggests the possibility of penetration twins, and they are. Here again, we have the kind of situation mentioned before, $Q \in P(e_a)$ but $\notin P(e_a, p_i)$. With these precedents, I introduce a tentative generalization, namely,
Proposition 1. Penetration twins in unstressed crystals are consistent with (40) in the special cases described by

$$a = 0 \ \text{and} \ Q e_a = m_a^b e_b, \quad Q \in O(3), \quad m \in GL(3, \mathbb{Z}) \;\Rightarrow\; Q \in P(e_a), \quad Q \notin P(e_a, p_i). \qquad (58)$$
Workers seem unable to agree on a general definition of twins, so the best one can do is to determine how well what are called penetration twins fit this description, and I have only confirmed this for two examples. As is discussed by Dana and Dana [5, p. 79], the Dauphiné twins are transparent to the eye but can be revealed by etching a plane intersecting the twins: sometimes one can see evidence of their existence on the surface of a crystal. Also, the volumes occupied by the twins are roughly equal, in typical crystals. It is found that, commonly, the interfaces have quite irregular shapes, these twins being in these respects rather different from Brazil twins. Also, unlike Brazil twins, Dauphiné twins disappear when the crystal transforms to the β-phase. Furthermore, as was first reported by Wooster and Wooster [20], they can be removed by applying suitable loads to the α-phase, whereas the Brazil twins cannot. Of course, there is the theoretical difference that, for Dauphiné twins, the isometries relating the twins involve proper orthogonal transformations and, for Brazil twins, it is easy to see that the analogous isometries must involve the improper kind. As was the case for Brazil twins, the analysis of Dauphiné twins does not require the specimens to be unstressed.
5. THE COMBINED LAW

Some twins occurring in quartz are described by Dana and Dana [5, p. 90] as governed by

The combined law (Syn: Leydolt twinning; Leibisch twinning; compound twinning; Dauphiné-Brazil twinning). The twinned parts are related geometrically as by a combination of rotation of 180° around the c-axis and by reflection over $\{11\bar{2}0\}$, or simply as a reflection over {0001}.... Twins on the Combined law have been produced artificially by introducing Dauphiné twins, e.g. by cooling through the 573° inversion point, into quartz already twinned on the Brazil law. The combined law results where the secondary Dauphiné twinning cuts across the initial Brazil twin.

When I first read this, I found it a bit confusing, so let us consider such an intersection. At least locally, the two interfaces divide a neighborhood into four sectors. As suggested by our earlier studies, we can use the same values of $e^a$ and $e_a$ throughout. On one side of the Brazil twin, we will have two right-handed configurations $\mathcal{R}_1$ and $\mathcal{R}_2$, associated with a Dauphiné twin, separated by some surface. From our study of these twins, the shifts can be taken as

$$\mathcal{R}_1 : p_i^+ \ (\lambda > 0), \quad \mathcal{R}_2 : p_i^- \ (\lambda < 0), \qquad (59)$$

with these described by (47). Similarly, on the other side, we have the analogous left-handed configurations $\mathcal{L}_1$ and $\mathcal{L}_2$, and from our study of Brazil twins, the shifts can be taken as

$$\mathcal{L}_1 : -p_i^+, \quad \mathcal{L}_2 : -p_i^-. \qquad (60)$$
It is easy to check that these satisfy all relevant twinning equations. However, something seems to be amiss, since the "neighbors" $\mathcal{R}_1$ and $\mathcal{L}_1$ ($\mathcal{R}_2$ and $\mathcal{L}_2$) are not related by "a reflection over {0001}." However, with Q representing this reflection, a simple calculation gives

$$Q e_a = m_a^b e_b, \quad m = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad (61)$$

indicating that the $Q e_a$ could be used as lattice vectors in any of the sectors. Also,

$$Q p_1^+ = -p_1^- + e_1, \quad Q p_2^+ = -p_2^- + e_2, \qquad (62)$$

the shifts on the right being equivalent to $-p_1^-$ and $-p_2^-$, respectively. Thus, Q transforms $\mathcal{R}_1$ to $\mathcal{L}_2$ and, similarly, it transforms $\mathcal{R}_2$ to $\mathcal{L}_1$. So, in this way, Q does relate the configurations on opposite sides of the Brazil twin. One can do a similar calculation for cases where the Brazil twin stays on one side of the twin, but they are in contact on some surface. Dana and Dana mention that such configurations are sometimes observed.
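The calculation behind (62) is short enough to reproduce exactly. The sketch below assumes the α-phase shifts $p_1 = \tfrac12 e_1 + \tfrac23 e_3 + \lambda(e_1 + 2e_2)$, $p_2 = \tfrac12 e_2 + \tfrac13 e_3 + \lambda(2e_1 + e_2)$ and the matrix m of (61) for the reflection over {0001}; the function names and the sample value of λ are illustrative only. It returns $Q p_i^+ + p_i^-$ in lattice components, which by (62) should be $e_1$ and $e_2$.

```python
from fractions import Fraction as Fr

M_Q = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]   # reflection over {0001}

def shifts(lam):
    """Lattice components of p_1, p_2 for a given value of lambda."""
    lam = Fr(lam)
    return [[Fr(1, 2) + lam, 2 * lam, Fr(2, 3)],
            [2 * lam, Fr(1, 2) + lam, Fr(1, 3)]]

def act(m, v):
    """Components of Q v, when Q e_a = m[a][b] e_b and v = v[a] e_a."""
    return [sum(v[a] * m[a][b] for a in range(3)) for b in range(3)]

def combined_law_offsets(lam):
    """Q p_i^+ + p_i^- in lattice components, for i = 1, 2."""
    plus, minus = shifts(lam), shifts(-lam)
    return [[x + y for x, y in zip(act(M_Q, plus[i]), minus[i])] for i in range(2)]
```

Since the offsets are lattice vectors, $Q p_i^+$ and $-p_i^-$ describe the same left-handed configuration, which is the sense in which Q carries one right-handed sector to the left-handed sector of opposite λ.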
6. JAPAN TWINS

The name "Japan twins" is used for a variety of rather different growth twins, all having the appearance of differently oriented crystals stuck to each other, forming a structure with a V-shaped appearance. Observations of these are commonly made when they are in the α-phase, and I won't consider the β-phase. For a brief description, I'll quote part of the discussion of Dana and Dana [5, p. 91], referring readers to this for more details. In their words,

Four related types of twins are grouped under this name. All are contact twins of two individuals, with the c-axes inclined at 84°33'. A pair of prism faces in the two individuals are coplanar in the twin. The composition plane in all instances is $(11\bar{2}2)$. In one of the four twin types this is a twin plane, while in the others it is a pseudotwin plane in that it describes only the angular relation between the axial systems of the twinned parts. The further description of these twins rests on the hand, which may be the same or different in the twinned individuals, and on the relation between the polarities of the axes in the twinned axial systems (fig. 5). Alternate descriptions can be based on a combination of hand and the identity of the particular faces of the form $\{11\bar{2}2\}$ that are parallel to the idealized composition plane; or on a combination of hand and the identity of the unit rhombohedron faces that separately terminate the coplanar faces of the twinned individuals. The four types of Japan twin further can be considered as comprising a basic operation, that of a 180° rotation around the normal to $(11\bar{2}2)$, which appears either alone (Type I), or in combination with the Dauphiné Law twin operation (Type II), the Brazil Law twin operation (type III), or the Combined Law twin operation (Type IV).

For the simplest possibility, one has two individuals of the same hand and polarity, with a plane interface. The normal to this has the direction

$$K_1 = k_a e^a = e^1 + e^2 + 2e^3, \qquad (63)$$
and, at least for unstressed crystals, we should try to satisfy (42) with

$$Q = R^{\pi}_{K_1} \equiv R_5, \qquad (64)$$

which should also transform the shifts on one side to a possible set on the other. This fits the description of type I twins that I [8] use for the X-ray theory, and, from results given there, one can read off infinitely many solutions such that $\bar e^a$ and $e^a$ have the same orientation (a · n = 0). These are given by

$$a = |K_1| \left( \frac{2 K_1}{|K_1|^2} - \eta \right), \quad \eta = l^a e_a, \quad n = \frac{K_1}{|K_1|}, \qquad (65)$$

and

$$m^a_b = -\delta^a_b + l^a k_b, \qquad (66)$$

where the $l^a$ are any integers satisfying

$$k_a l^a = l^1 + l^2 + 2 l^3 = 2 \;\Rightarrow\; m^2 = 1 \ \text{and} \ \eta \cdot K_1 = 2. \qquad (67)$$

Any two of these are related by a lattice invariant shear, not detectable by X-ray observations, so they could be regarded as physically equivalent. However, making a particular choice does make some analyses easier. One consideration is that
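$$R_6 e_a = (m_6)_a^b e_b, \quad m_6 = m_6^{-1} = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad R_6 = R^{\pi}_{e_1 - e_2} \in P(e_a). \qquad (68)$$

With the choice

$$\eta = e_1 + e_2, \qquad (69)$$

this and $K_1$ are both perpendicular to the axis of rotation of $R_6$ so

$$R_6 \eta = -\eta, \quad R_6 K_1 = -K_1. \qquad (70)$$

Then, using (65), we get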
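$$R_6 a = -a, \quad R_6 n = -n \;\Rightarrow\; R_6 (n \otimes a) = (n \otimes a) R_6, \qquad (71)$$

and, from (66), (41) is satisfied with

$$m = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 2 & 2 & -1 \end{pmatrix} \;\Rightarrow\; m^2 = 1, \quad m\, m_6 = m_6\, m. \qquad (72)$$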
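A numerical check of the type I solution may help in reading (64)-(72). The sketch below assumes the lattice vectors of (21), the choice $\eta = e_1 + e_2$, the vector $a = 2n - |K_1|\eta$ (an equivalent way of writing (65)), and the convention that the listed matrix m has the lower index as row, so that the twinning equation reads $(1 - n \otimes a)e^p = Q \sum_q m[q][p]\, e^q$ with Q the rotation by π about n. These conventions and all names are assumptions of this sketch.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def comb(s, u, t, v):
    """Return s*u + t*v for 3-vectors u and v."""
    return tuple(s * x + t * y for x, y in zip(u, v))

def lattice(a, c):
    """Lattice vectors e_a and reciprocal vectors e^a in the basis (i, j, k)."""
    s3 = math.sqrt(3.0)
    e = [(a, 0.0, 0.0), (-a / 2.0, a * s3 / 2.0, 0.0), (0.0, 0.0, c)]
    e_dual = [(1.0 / a, 1.0 / (a * s3), 0.0), (0.0, 2.0 / (a * s3), 0.0), (0.0, 0.0, 1.0 / c)]
    return e, e_dual

M_JAPAN = [[0, 1, 0], [1, 0, 0], [2, 2, -1]]   # matrix m as read from (72)

def japan_twin_residual(a, c):
    """Largest violation of a.n = 0 and of the twinning equation, for given a and c."""
    e, e_dual = lattice(a, c)
    k1 = comb(1.0, comb(1.0, e_dual[0], 1.0, e_dual[1]), 2.0, e_dual[2])
    norm_k1 = math.sqrt(dot(k1, k1))
    n = comb(1.0 / norm_k1, k1, 0.0, k1)
    eta = comb(1.0, e[0], 1.0, e[1])
    a_vec = comb(2.0, n, -norm_k1, eta)
    worst = abs(dot(a_vec, n))
    for p in range(3):
        lhs = comb(1.0, e_dual[p], -dot(a_vec, e_dual[p]), n)   # (1 - n x a) e^p
        v = (0.0, 0.0, 0.0)
        for q in range(3):
            v = comb(1.0, v, float(M_JAPAN[q][p]), e_dual[q])   # sum over q of m[q][p] e^q
        rhs = comb(2.0 * dot(n, v), n, -1.0, v)                 # rotation by pi about n
        worst = max(worst, max(abs(x - y) for x, y in zip(lhs, rhs)))
    return worst
```

The residual vanishes to rounding error for any positive a and c, consistent with the remark that this solution does not restrict the lattice parameters.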
For one thing, this gives

$$R_6 \bar e^a = (1 - n \otimes a) R_6 e^a = (1 - n \otimes a)(m_6)^a_b\, e^b = (m_6)^a_b\, \bar e^b \;\Rightarrow\; R_6 \in P(\bar e_a). \qquad (73)$$

Also, a simple calculation gives

$$R_6 R_4 = R^{\pi}_{e_1 + e_2} \in P(e_a, p_i). \qquad (74)$$

Thus, since $R_4$ produces Dauphiné twins, so does $R_6$. Explicitly, a calculation gives

$$R_6 p_i^+ = (\alpha_6)_i^j\, p_j^- + (l_6)_i^a\, e_a, \qquad (75)$$

where $\alpha_6$ is given by (7) and

$$l_6 = \begin{pmatrix} 0 & -1 & -1 \\ -1 & 0 & -1 \end{pmatrix}. \qquad (76)$$

The shifts on the right are equivalent to $p_i^-$, so I use the latter. With these results, one can analyze combinations of the basic Japan type I twins and Dauphiné twins, including the possibilities of the interfaces crossing one or more times, coinciding partially or wholly, or being tangent along some curve. Combine a type I Japan twin with a Dauphiné twin and you get a theoretically possible type II Japan twin. From other comments made by these authors, I get the impression that there is likely to be more of a transition zone, involving a number of differently oriented Dauphiné twins, and I am then not sure what workers interpret as the interface. The rules are simple: when crossing the Japan plane, rotate values of $e^a$ and $p_i$ using $R_5$, implying that

$$p_i^a \to \bar p_i^a = m^a_b\, p_i^b. \qquad (77)$$
And, when crossing a Dauphiné interface, reverse the sign of λ, leaving the value of $e^a$ unchanged. As before, the shape of the latter interface is essentially arbitrary. Try a few examples, and you can verify that it is easy to calculate values of these vectors in the subregions formed by the interfaces. Since Brazil twins involve Q = −1 and m = −1, which commute with all elements of O(3) and GL(3, Z), respectively, it is even easier to get the analogous results for these. When generalizing the previous combinations to include them, the rule is that, when crossing such an interface, multiply the shifts or values of $p_i^a$ by −1 and leave the value of $e^a$ unchanged. Combine a type I Japan twin with a Brazil twin with the same interface and you get a theoretically possible type III, although, as before, these seem to involve what I interpreted before as transition zones. Similarly, by using both Brazil and Dauphiné twins, one can get a similar description of type IV Japan twins. If my interpretations are essentially correct, this covers the most commonly observed twins, in a general way, and the twinning equations do apply to these. The analyses above apply to quartz, which need not be unstressed, but (38) should be specialized to allow only hydrostatic pressures (B = 0), to exclude unbalanced forces.
7. CH. FRIEDEL'S TWIN LAW

We now begin to consider the rarer kinds of twins. First, it is easy to verify that

$$e^1 - e^2 = \frac{2}{3a^2}\,(e_1 - e_2), \qquad (78)$$

that the three vectors

$$e_1 + e_2, \quad e_3 \quad \text{and} \quad e_1 - e_2 \qquad (79)$$

are mutually orthogonal, and that, in the order listed, they form a right-handed triad. For the Friedel law, I will start with only one piece of information, mentioned in various places in the literature, as is discussed by Zanzotto [2, 3]: for the isometry involved, one can take

$$R_7 = R^{\pi/2}_{e_1 - e_2}. \qquad (80)$$

I assume that this does relate possible choices of shifts on the two sides, although specific information about such details is not mentioned in descriptions I have seen. Of course, these differ from the twins considered previously, by having the basic isometry be a 90° rotation, and we shall see that there is another difference. I note that

$$R_7^2 = R_6, \qquad (81)$$

where $R_6$ satisfies (68), and I will use this later. I formulate the problem as that of finding all solutions of the twinning equation

$$\bar e^a = \mathcal{E} e^a = R_7\, m^a_b\, e^b, \quad \mathcal{E} = 1 - n \otimes a, \qquad (82)$$

with the usual assumption that the $e^a$ are as described in (21), subject to the following:

Assumption 1. A solution is relevant only if it exists for an open set of values of a and c, with fixed values of n ⊗ a and m.
My view is that a and c are likely to be a little different for different samples in the same environment, assumption 1 allowing for this and slight variations in the environment, when different crystals are grown. There are two logical possibilities, both to be considered:

$$\bar e^a \ \text{and} \ e^a \ \text{oppositely oriented} \;\Rightarrow\; a \cdot n = 2, \quad \det m = -1 \qquad (83)$$

and

$$\bar e^a \ \text{and} \ e^a \ \text{with same orientation} \;\Rightarrow\; a \cdot n = 0, \quad \det m = 1. \qquad (84)$$

To explore this, I will introduce vectors $\hat e_a$ and their duals $\hat e^a$, which do not depend on a or c, by

$$\hat e_1 = \frac{1}{a} e_1 = i, \quad \hat e_2 = \frac{1}{a} e_2 = -\tfrac{1}{2}\, i + \tfrac{\sqrt{3}}{2}\, j, \quad \hat e_3 = \frac{1}{ar} e_3 = k,$$
$$\hat e^1 = a e^1 = i + \tfrac{1}{\sqrt{3}}\, j, \quad \hat e^2 = a e^2 = \tfrac{2}{\sqrt{3}}\, j, \quad \hat e^3 = a r e^3 = k, \qquad (85)$$

where

$$r = \frac{c}{a}. \qquad (86)$$
In terms of these,

$$R_7 = \tfrac{1}{2} (\hat e^1 - \hat e^2) \otimes (\hat e_1 - \hat e_2) + \hat e^3 \otimes (\hat e_1 + \hat e_2) - \tfrac{1}{2} (\hat e^1 + \hat e^2) \otimes \hat e_3, \qquad (87)$$

which is, of course, independent of a and c. Then, express n ⊗ a in the form

$$n \otimes a = N_a \hat e^a \otimes A^b \hat e_b, \qquad (88)$$

where the components are assumed to be independent of a and c. It is not necessary that the two vectors on the right be the same as their correspondents on the left, and, later, I will use certain factors of proportionality. Also, we must have

$$N_a A^a = n \cdot a = 2 \quad \text{if} \ \det m = -1, \qquad (89)$$

or

$$N_a A^a = n \cdot a = 0 \quad \text{if} \ \det m = 1. \qquad (90)$$

Rearrange the twinning equation to the form
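$$R_7^T \mathcal{E} \hat e^1 = m^1_1 \hat e^1 + m^1_2 \hat e^2 + \tfrac{1}{r}\, m^1_3 \hat e^3,$$
$$R_7^T \mathcal{E} \hat e^2 = m^2_1 \hat e^1 + m^2_2 \hat e^2 + \tfrac{1}{r}\, m^2_3 \hat e^3, \qquad (91)$$
$$R_7^T \mathcal{E} \hat e^3 = r\,(m^3_1 \hat e^1 + m^3_2 \hat e^2) + m^3_3 \hat e^3.$$

Requiring this to hold for an interval of values of r, we get

$$m^1_3 = m^2_3 = m^3_1 = m^3_2 = 0, \qquad (92)$$

and the condition that m be unimodular gives

$$m^3_3 = \pm 1, \quad m^1_1 m^2_2 - m^1_2 m^2_1 = \pm 1. \qquad (93)$$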
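With this information, it is routine to solve (91), and one gets two possibilities. For one,

$$m = m^- = (m^-)^{-1} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \qquad (94)$$

and

$$n \otimes a = N^- \otimes N^-, \quad |N^-|^2 = 2, \qquad (95)$$

with

$$N^- = k - \tfrac{1}{2}\,(i + \sqrt{3}\, j) = c\, e^3 - \tfrac{a}{2}\,(e^1 + e^2) = \hat e^3 - \tfrac{1}{2}\,(\hat e^1 + \hat e^2). \qquad (96)$$

For the other,

$$m = m^+ = (m^+)^{-1} = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad (97)$$

and

$$n \otimes a = N^+ \otimes N^+, \quad |N^+|^2 = 2, \qquad (98)$$

with

$$N^+ = k + \tfrac{1}{2}\,(i + \sqrt{3}\, j) = c\, e^3 + \tfrac{a}{2}\,(e^1 + e^2) = \hat e^3 + \tfrac{1}{2}\,(\hat e^1 + \hat e^2) \;\Rightarrow\; N^+ \cdot N^- = 0. \qquad (99)$$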
For both, det m = −1, so the lattice vectors on opposite sides are of opposite orientation, implying that, unlike those associated with Dauphiné twins, these interfaces cannot move through the material. As far as I know, this prediction is new. In this respect, they are like Brazil twins. With either of these solutions, one can construct a solution describing the rare quartz crosses analyzed by Zanzotto [2, 3]. Picture looking along the direction of $e_1 - e_2$, with this vector pointing toward you, seeing the two orthogonal planes

$$N^+ \cdot x = 0 \quad \text{and} \quad N^- \cdot x = 0. \qquad (100)$$

These divide the region of interest into four, which, for the first solution, will be denoted by $\mathcal{R}_i$, i = 1, ..., 4, numbered as follows: $\mathcal{R}_1$ is a region bounded on the right by $N^+ \cdot x = 0$ and on the left by $N^- \cdot x = 0$. Then, number the rest consecutively as you follow a counterclockwise path encircling the line of intersection. In the four regions, assign values $e^a_{(i)}$ of $e^a$ as follows:

$$\text{in } \mathcal{R}_1, \text{ take } e^a_{(1)} = e^a, \text{ given by (21)},$$
$$\text{in } \mathcal{R}_2, \text{ take } e^a_{(2)} = \mathcal{E}^- e^a_{(1)}, \qquad (101)$$
$$\mathcal{E}^- = 1 - N^- \otimes N^- = -R^{\pi}_{N^-}, \qquad (102)$$
$$\text{in } \mathcal{R}_3, \text{ take } e^a_{(3)} = \mathcal{E}^+ e^a_{(2)}, \quad \mathcal{E}^+ = 1 - N^+ \otimes N^+ = -R^{\pi}_{N^+}, \qquad (103)$$
$$\text{in } \mathcal{R}_4, \text{ take } e^a_{(4)} = \mathcal{E}^- e^a_{(3)}. \qquad (104)$$

It is easy to verify the relations

$$R_7^T \mathcal{E}^+ R_7 = \mathcal{E}^-, \quad R_7^T \mathcal{E}^- R_7 = \mathcal{E}^+. \qquad (105)$$

A calculation made using these shows that the relevant twinning equation is satisfied across each of the four interfaces and that

$$e^a_{(i+1)} = R_7\, (m^-)^a_b\, e^b_{(i)} = R_7^i\, ([m^-]^i)^a_b\, e^b_{(1)}, \quad \text{(no sum)}, \quad i = 1, \ldots, 3. \qquad (106)$$
With a suitable choice of region, this generates a solution for a cross, for any values of a and c. However, one needs to bear in mind (38), remembering that it applies to the α-phase. To avoid unbalanced forces, one needs to set B = 0, which does restrict the choices of a and c, but this still lets r vary with pressure and temperature. If it did not, the basic assumption 1 would not make much sense. Now, there is a simple relation between this set and the analog generated by the second solution. A calculation gives

$$m^+ = m^- m_6 = m_6 m^-, \qquad (107)$$

where $m_6$ is the matrix described in (68), satisfying (68)$_1$, bearing in mind (81). Using this, we could eliminate $m_6$ in the second solution by adding two to the powers of $R_7$ in the set of solutions above. The effect is to rearrange the regions as indicated by

$$\mathcal{R}_1 \to \mathcal{R}_3, \quad \mathcal{R}_2 \to \mathcal{R}_4, \quad \mathcal{R}_3 \to \mathcal{R}_1, \quad \mathcal{R}_4 \to \mathcal{R}_2. \qquad (108)$$

Alternatively, one could use

$$R_7^2\, (m_6)^a_b\, e^b = e^a \qquad (109)$$

to reduce powers of $R_7$ in the corresponding set of solutions. But this is just the transformation taking one configuration to its Dauphiné twin. Working out the details, one finds that of the two sets of solutions, one can be interpreted as replacing a configuration by the Dauphiné twin of it. This does not imply that either involves Dauphiné twins by itself, merely that the parameter λ involved in the description of shifts can be either positive or negative, and one can so relate the solutions. As before, one can incorporate Brazil or Dauphiné twins in either, by adding these, with the appropriate interfaces. For the latter, one needs to be thoughtful in dealing with the rotations. From descriptions I have seen, it is not clear to me which of such possibilities really occur. For rarer twins, as these are, experimental details of this kind are apt to be missing. Workers do not always take the steps necessary to determine whether Brazil or Dauphiné twins are involved in an important way. From these calculations, I infer that it is theoretically possible for Friedel twins, as well as the crosses, to be grown with various values of a and c, depending on the pressure and temperature prevailing when they are grown. Physically, this is quite different from taking a configuration having particular values of these parameters when grown, then subjecting the sample to different pressures and/or temperatures, the kind of situation analyzed by Zanzotto [2, 3], using thermoelasticity theory. The X-ray theory does not deal with deformation, at least not without adding assumptions, although it is consistent with CBR, commonly assumed in thermoelasticity theory. Now, I'll sketch my view of what is involved when a cross grown when $a = a_0$, $c = c_0$, $r = r_0$ is subjected to a change of pressure and/or temperature. We begin with those two orthogonal planes associated with the interfaces and have reasoned that these must continue to be material planes.
Consider what the material in one of the subregions, say $\mathcal{R}_1$, would do under such changes, if it were not constrained by its neighbors. For this, I think it is safe to rely on thermoelasticity theory, implying that the change would produce a deformation with gradient of the form

$$F = \frac{a}{a_0}\, 1 + \left( \frac{c}{c_0} - \frac{a}{a_0} \right) k \otimes k, \qquad (110)$$
perhaps combined with a rotation, which is unimportant for this consideration. With this, one can determine how this changes the planes with initial normals $N^+$ and $N^-$. One finds that if $r = r_0$, they remain orthogonal, but if r changes to a different value, they don't. If $\mathcal{R}_1$ were by itself or in contact with only $\mathcal{R}_2$ and/or $\mathcal{R}_4$, the outer boundaries would be free to rotate to accommodate this, but in the cross, the four regions can't do so. Thus, one expects that changing r will produce some more complicated distribution of stresses, likely to damage the cross, unless $r - r_0$ is quite small. Commonly, they are observed, presumably in good condition, with the interfaces orthogonal, at room temperature and atmospheric pressure. Reasonably, one can use this to get a value of $r_0$ which is about 1.1. Essentially, I am describing how Zanzotto [2, 3] analyzed the situation, except that he proceeded to use the best available information to determine the pressure-temperature curve along which r = 1.1, describing conditions likely for the growth of the specimens found. Generally, this line of reasoning seems to me to be very reasonable, but one part of the reasoning he used now seems to be shaky. From the literature, he found that those planes are observed to make 45° angles with the c-axis, but infinitely many planes do this. He made a definite choice, based on the idea that "up to crystallographic equivalence, $ 0 is the only rational plane forming an angle of π/4 with the basal ($E_1$, $E_2$) plane." Long after he wrote this, I [7] explained my interpretation of how experts decide whether directions are rational or irrational and, as I now interpret this one, it would be classified as irrational. Also, according to the theory being used here, it does not matter whether it is rational or irrational. However, for whatever reason, the directions he found are consistent with those I predicted. With the X-ray theory, we have a better theoretical basis for the choice.
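The claim that the two interface planes stay orthogonal only if r keeps its original value follows from a one-line calculation, sketched here under stated assumptions: F is taken as $(a/a_0)\,1 + (c/c_0 - a/a_0)\, k \otimes k$, normals of material planes are carried by $F^{-T}$ (the standard kinematic rule), and the initial normals are $N^\pm = k \pm u$, with u the unit vector along $e_1 + e_2$. The function name and sample values are hypothetical.

```python
import math

U1 = (0.5, math.sqrt(3.0) / 2.0, 0.0)   # unit vector along e_1 + e_2

def deformed_normal_dot(a0, c0, a, c):
    """Dot product of the images of N+ and N- under F^{-T}.

    F is symmetric, scaling vectors perpendicular to k by a/a0 and k by c/c0,
    so F^{-T} scales the in-plane part of a normal by a0/a and the k part by c0/c.
    """
    s_plane, s_axis = a0 / a, c0 / c
    n_plus = (s_plane * U1[0], s_plane * U1[1], s_axis)
    n_minus = (-s_plane * U1[0], -s_plane * U1[1], s_axis)
    return sum(x * y for x, y in zip(n_plus, n_minus))
```

The result equals $(c_0/c)^2 - (a_0/a)^2$, which vanishes exactly when $c/c_0 = a/a_0$, that is, when $r = r_0$.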
So, by using both kinds of theories, we have improved our grasp of the situation. Now, I have emphasized that the domain of

(111)

By making the change of variables induced by

$$m_a^b = -\delta_a^b, \quad e_a \to -e_a, \quad p_i \to p_i \;\Rightarrow\; p_i^a \to -p_i^a, \quad \theta \to \theta, \qquad (112)$$

we get a transformed function

(113)

and we need to use both functions to properly analyze Friedel twins. To get to thermoelasticity theory, we will eliminate $p_i^a$ using the equilibrium equations for these, or by taking the infimum of

$$e^a = F^{-T} E^a \;\Leftrightarrow\; -e^a = F^{-T} (-E^a), \qquad (114)$$

where F is the usual deformation gradient. Use this to reduce
8. ZINNWALD AND ZWICHAU TWIN LAWS
As we mentioned earlier, the twinning equation does not apply to the Zinnwald twins, because they involve joining crystallographically inequivalent planes, and the Zwichau twins involve a similar difficulty. As a personal matter, I prefer to classify such twins as special kinds of grain boundaries, but this is not the custom. What one can do is use thermoelasticity theory to draw interesting conclusions about effects of changes of temperature and pressure on these, after they have grown. First, consider a perfect crystal. With the thermoelastic ideas used before, a sphere in the crystal when grown should deform to an ellipsoid of revolution about the c-axis, remaining a sphere only if r stays at its original value. This will intersect the various planes, giving ellipses which are congruent for planes with normals making the same angle with the c-axis, and not for other planes. If one were to cut crystals on two planes and neatly glue these together, the configuration could tolerate changes in pressure and/or temperature without harm, provided these ellipses coincide. Otherwise, undesirable stresses will develop when r changes, somewhat like those associated with the crosses. If they are found in good condition at room temperature and atmospheric pressure, as seems to be the case for the twins being considered, one can again use Zanzotto's pressure-temperature curve to estimate conditions likely for their growth. First, the description of Zwichau twins by Dana and Dana [5, p. 97] reads, in part, "The twinned individuals have two coplanar prism faces with the c-axes inclined at 42° 17'." The rest of the description is not needed here. For any prism face, the normal is perpendicular to the c-axis, so these faces do have congruent ellipses. However, with the angle indicated, these will not coincide, so Zanzotto's pressure-temperature curve applies to them.
For the Zinnwald twins, a rhombohedral face meets a prism face, faces making different angles with the c-axis. So, that curve also applies to them. Probably, this is one reason why these twins are relatively rare. We have now said something about all of the well-established twins in quartz that I have found in the literature, using theory to draw some conclusions about them and doing some testing of the theories. As is clear from the numerous sketches presented by Dana [21, pp. 180-197], a great variety of twinned configurations are found in minerals, providing many opportunities for better analyzing them, using the ideas tried out here. Also, I find it interesting that there are noticeable differences between all of the kinds of twins considered here.
Also, there are growths resembling twins that are not regarded as true twins, discussed briefly by Dana [21, p. 86]. As I understand it, angles, and so on, are not quite what they should be, based on the usual measurements of lattice parameters at room temperature and atmospheric pressure. As a speculation, it seems possible that some or all of these grew in different environments, somewhere off an analog of the aforementioned curve, and, if they had then been examined there, at least some of them would pass as twins. For most minerals, relevant information on effects of pressure and temperature is not available to check such guesses. Since quartz is so common and useful, better information of this kind is available for it. Essentially, this is what led Zanzotto [2, 3] to study the quartz crosses, although crosses are much more common in staurolite, in particular.
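The congruence statement in Section 8 can be checked numerically. The following is only a minimal sketch under stated assumptions: the thermoelastic deformation is modeled as a pure stretch F = diag(λ_a, λ_a, λ_c) with the c-axis along the third coordinate, and the helper names (`stretch`, `normal_at`, `ellipse_semi_axes`) and the sample angles are hypothetical, not taken from the paper. It maps the unit circle lying in a plane of given normal and reports the semi-axes of the resulting ellipse.

```python
import numpy as np


def plane_basis(normal):
    """Return a 3x2 matrix whose columns form an orthonormal basis of the plane orthogonal to `normal`."""
    n = np.asarray(normal, dtype=float)
    # Full SVD of the 1x3 row vector: rows 1 and 2 of vh span its null space.
    _, _, vh = np.linalg.svd(n.reshape(1, 3))
    return vh[1:].T


def ellipse_semi_axes(F, normal):
    """Semi-axes (largest first) of the image under F of the unit circle in the plane with the given normal."""
    return np.linalg.svd(F @ plane_basis(normal), compute_uv=False)


def normal_at(alpha_deg, phi_deg):
    """Unit normal making angle alpha with the c-axis (third coordinate), with azimuth phi."""
    a, p = np.radians(alpha_deg), np.radians(phi_deg)
    return np.array([np.sin(a) * np.cos(p), np.sin(a) * np.sin(p), np.cos(a)])


def stretch(lam_a, lam_c):
    """Assumed deformation: equal stretch lam_a in the basal plane, lam_c along the c-axis."""
    return np.diag([lam_a, lam_a, lam_c])


if __name__ == "__main__":
    F = stretch(1.0, 1.02)  # hypothetical change of the axial ratio
    for alpha, phi in [(90, 0), (90, 60), (50, 0)]:
        print(alpha, phi, ellipse_semi_axes(F, normal_at(alpha, phi)))
```

In this sketch, two prism-type normals (both at 90° to the c-axis) give the same semi-axes, while a normal at a different angle gives a different ellipse unless λ_a = λ_c. Congruence alone does not settle whether two glued ellipses coincide; their relative orientation in the interface also matters, which is the point made about the Zwichau twins.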
9. THE β-LATTICE GROUP
After pondering various writings on the α-β transition, I formed the opinion that a rather good theory of this could be obtained by using the three-lattice model, restricted to a Pitteri neighborhood, centered at a configuration associated with that describing an unstressed right-handed configuration of the β-phase. Only minor changes are needed to adapt such analyses to left-handed configurations. There is interest in the effect of various kinds of stress on the transition, which will produce changes in the symmetry of configurations, and such theory can accommodate these, at least if the stresses are small enough. To provide some relevant background for this, I will include more information on the β-lattice group than was needed for our twinning analyses. First, the group multiplication operation is
$\{m, \alpha, l\} \cdot \{\bar m, \bar\alpha, \bar l\} = \{m \bar m, \; \alpha \bar\alpha, \; \alpha \bar l + l \bar m\}.$ (115)
This is inferred from the composition
$e_a \to \bar e_a = m_a^b e_b \to \bar m_a^b \bar e_b = (m \bar m)_a^b e_b$ (116)
and the analog for shifts. Next is a description of all lattice group elements associated with any configuration described by (21) and (23).
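The multiplication rule (115) can be sketched in a few lines. This is only an illustration under assumptions: m is stored as a 3x3 integer matrix, α as a 2x2 matrix, and l and the shift array p as 2x3 arrays (two shifts for a three-lattice); the function names are hypothetical. The check uses the relation p m = α p + l, which appears in this section as (130): if one configuration satisfies it for two elements, it should also satisfy it for their product.

```python
import numpy as np


def compose(g, g_bar):
    """Product {m, alpha, l} . {m_bar, alpha_bar, l_bar} following (115)."""
    m, alpha, l = g
    m_bar, alpha_bar, l_bar = g_bar
    return (m @ m_bar, alpha @ alpha_bar, alpha @ l_bar + l @ m_bar)


def fixes(g, p):
    """True if the shift array p satisfies p m = alpha p + l for the element g."""
    m, alpha, l = g
    return np.allclose(p @ m, alpha @ p + l)


def element_fixing(p, m, alpha):
    """Build an element {m, alpha, l}, choosing l so that p satisfies p m = alpha p + l."""
    return (m, alpha, p @ m - alpha @ p)


# Assumed array shapes: m is 3x3, alpha is 2x2, l and p are 2x3.
IDENTITY = (np.eye(3), np.eye(2), np.zeros((2, 3)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = rng.normal(size=(2, 3))
    g1 = element_fixing(p, rng.integers(-2, 3, size=(3, 3)), rng.integers(-2, 3, size=(2, 2)))
    g2 = element_fixing(p, rng.integers(-2, 3, size=(3, 3)), rng.integers(-2, 3, size=(2, 2)))
    print(fixes(compose(g1, g2), p))
```

The random matrices here are not genuine lattice group elements; they only exercise the algebra of the rule, including its associativity.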
[Equations (117)-(127), which list the lattice group elements L_1, ..., L_11 as triples {m, α, l}, each preceded by its associated rotation, are not legible in this copy.]
1 : L_12 = {1, 1, 0} (group identity). (128)
Here, the rotation listed as the first item is that associated with the description in (21) and (23), introduced to make it easier to correlate some elements with those encountered earlier. Lattice groups do determine such point groups to within similarity transformations obtained using orthogonal transformations. They also determine space groups, but different and inequivalent lattice groups can correspond to one space group, the lattice groups distinguishing differences in symmetry not recognized by space groups. In a Pitteri neighborhood, the lattice group
of any configuration is some subgroup of that at the center and, essentially, the subgroups determine how the neighborhood gets decomposed into subsets, where configurations in one subset have the same symmetry. In thinking about this, I found it useful to have the group multiplication table, which is
L2   L3   L4   L5   1    L11  L6   L7   L8   L9   L10
L3   L4   L5   1    L1   L10  L11  L6   L7   L8   L9
L4   L5   1    L1   L2   L9   L10  L11  L6   L7   L8
L5   1    L1   L2   L3   L8   L9   L10  L11  L6   L7
1    L1   L2   L3   L4   L7   L8   L9   L10  L11  L6
L7   L8   L9   L10  L11  1    L1   L2   L3   L4   L5
L8   L9   L10  L11  L6   L5   1    L1   L2   L3   L4
L9   L10  L11  L6   L7   L4   L5   1    L1   L2   L3
L10  L11  L6   L7   L8   L3   L4   L5   1    L1   L2
L11  L6   L7   L8   L9   L2   L3   L4   L5   1    L1
L6   L7   L8   L9   L10  L1   L2   L3   L4   L5   1
(129)
presented as a matrix, where the element in the ith row and jth column is the product L_i · L_j, omitting the obvious products involving the identity element. The same table applies to the point group, and it is easier to use this to calculate the entries. For any particular subgroup, it is routine to characterize the possible values of e_a, e^a, p_i, and R associated with this symmetry. For each such element, take the m from the list, using R e_a = m_a^b e_b to determine restrictions on the lengths of lattice vectors, and so on. For p_i, one could use the analog implied by (10), but I prefer to use
$p\,m = \alpha p + l, \quad p = \|p_i^a\|,$ (130)
which is easily shown to be equivalent. Take the relevant values of m, α, and l from the list, then find all p satisfying (130). Do this for all elements in the subgroup to determine all of the restrictions. One can then use (19) to get the identities satisfied by t, and (20) to get those satisfied by Φ. One can do a bit more with general arguments, but one cannot analyze possible solutions of Φ = 0 without introducing more specific assumptions about constitutive equations, for example. As a simple example, consider the subgroup generated by L_6. From the multiplication table, L_6^2 = 1, so this is a subgroup of order two. From (122),
$R e_1 = e_1, \quad R e_2 = -e_1 - e_2, \quad R e_3 = -e_3,$ (131)
and, by elementary calculations, this is equivalent to
$e_1 \cdot (e_1 + 2 e_2) = e_1 \cdot e_3 = e_2 \cdot e_3 = 0, \quad R = R_{e_1}^{\pi}.$ (132)
Then, using (122) and (130), one calculates that the shift components are given by
P
p 2p-l r p
-
2q q
'
^ 133 >
where there are no restrictions beyond the requirements that this give an admissible pair of shifts and that the configuration be included in the neighborhood, complications that I will gloss over here. From (19), one gets
e_1 is an eigenvector of t, (134)
while (20) reduces <& to the form #=
a C
-2a o
T
0
.
(135)
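The subgroup example above leans on the multiplication table (129). As a hedged sketch, the legible entries of that table are consistent with a dihedral index rule: L_1, ..., L_5 behave like powers of a six-fold rotation, L_12 is the identity, and L_6, ..., L_11 are two-fold elements. The encoding below is that inferred rule, an assumption about the labeling rather than something stated in the text, and the function names are hypothetical. It recovers L_6 · L_6 = 1 and lists the subgroup generated by any chosen elements.

```python
def decode(i):
    """Map the index of L_i to (kind, k): 'r' for rotation powers, 's' for two-fold elements, k in 0..5."""
    if i == 12:
        return ("r", 0)  # L_12 is the identity
    if 1 <= i <= 5:
        return ("r", i)
    if 6 <= i <= 11:
        return ("s", i - 6)
    raise ValueError(f"no element L_{i}")


def encode(kind, k):
    """Inverse of decode, reducing k modulo 6."""
    k %= 6
    if kind == "r":
        return 12 if k == 0 else k
    return 6 + k


def product(i, j):
    """Index k such that L_i . L_j = L_k under the inferred dihedral rule."""
    (kind_i, a), (kind_j, b) = decode(i), decode(j)
    if kind_i == "r" and kind_j == "r":
        return encode("r", a + b)
    if kind_i == "r" and kind_j == "s":
        return encode("s", b - a)
    if kind_i == "s" and kind_j == "r":
        return encode("s", a + b)
    return encode("r", b - a)


def generated_subgroup(generators):
    """Sorted indices of the subgroup generated by the given elements."""
    elems = {12}
    frontier = set(generators)
    while frontier:
        elems |= frontier
        frontier = {product(x, y) for x in elems for y in elems} - elems
    return sorted(elems)


if __name__ == "__main__":
    print(product(6, 6))
    print(generated_subgroup([6]))
```

The same helper can enumerate the other subgroups, which is the decomposition of the Pitteri neighborhood by symmetry that the section describes.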
Formally, setting a = C = 0 gives three equations for determining p, q, and r. Of course, one needs to keep in mind that, for some values of lattice parameters, the lattice group will be larger. The lattice groups for the α- and β-configurations both contain this group, for example. These elementary considerations barely scratch the surface in understanding the α-β transition, but they add a little to results that have been used in this area. Here, I won't pursue this, although I think it would be useful to do so.
Acknowledgments. I thank Richard James and Giovanni Zanzotto for helpful advice and for helping me to locate useful references.
REFERENCES
[1] Zanzotto, G.: On the material symmetry group of elastic crystals and the Born rule. Archive for Rational Mechanics and Analysis, 121, 1-36 (1992).
[2] Zanzotto, G.: Geobarothermometric properties of growth twins and mathematical analysis of quartz data for a broad range of temperatures and pressures. Physics and Chemistry of Minerals, 16, 783-789 (1989).
[3] Zanzotto, G.: Thermoelastic stability of multiple growth twins in quartz and general barothermometric implications. Journal of Elasticity, 23, 253-287 (1990).
[4] Thomas, L. A. and Wooster, W. A.: Piezocrescence - The growth of Dauphiné twinning in quartz under stress. Proceedings of the Royal Society of London, A208, 43-62 (1951).
[5] Dana, J. D. and Dana, E. S.: The System of Mineralogy, 7th ed., Vol. 3 (rewritten and enlarged by C. Frondel), John Wiley, New York, 1962.
[6] Ericksen, J. L.: Equilibrium theory for X-ray observations of crystals. Archive for Rational Mechanics and Analysis, 139, 181-200 (1997).
[7] Ericksen, J. L.: Twinning analyses in the X-ray theory. International Journal of Solids and Structures, 38, 967-995 (2001).
[8] Ericksen, J. L.: On correlating two theories of twinning. Archive for Rational Mechanics and Analysis, forthcoming.
[9] James, R. D.: Displacive phase transformations in solids. Journal of Mechanics and Physics of Solids, 34, 359-394 (1986).
[10] James, R. D.: The stability and metastability of quartz, in Metastability and Incompletely Posed Problems, ed. S. Antman, J. L. Ericksen, D. Kinderlehrer, & I. Müller, IMA Volumes in Mathematics and its Applications, 3, 147-175 (1987).
[11] Balzer, R. and Sigvaldson, H.: Equilibrium vacancy concentrations measurements on zinc single crystals. Journal of Physics F: Metal Physics, 9, 171-178 (1979).
[12] Jay, A. H.: The thermal expansion of quartz. Proceedings of the Royal Society of London, A142, 237-247 (1933).
[13] Pitteri, M. and Zanzotto, G.: Continuum Models for Phase Transitions and Twinning in Crystals, CRC/Chapman and Hall, London, 2000.
[14] Ericksen, J. L.: Notes on the X-ray theory. Journal of Elasticity, 55, 201-218 (1999).
[15] Pitteri, M.: On lattices. Journal of Elasticity, 15, 3-25 (1985).
[16] Truesdell, C.: A First Course in Rational Continuum Mechanics, Academic Press, New York, 1977.
[17] Cahn, R. W.: Twinned crystals. Advances in Physics, 3, 363-445 (1954).
[18] Friedel, G.: Sur les macles du quartz. Bulletin Société Française de Minéralogie, 46, 79-95 (1923).
[19] Cady, W. G.: Piezoelectricity, Dover, New York, 1946.
[20] Wooster, W. A. and Wooster, N.: Control of electrical twinning in quartz. Nature, 157, 405-406 (1946).
[21] Dana, E. S.: A Textbook in Mineralogy, 4th ed. (revised and enlarged by W. E. Ford), John Wiley, New York, 1932.
On the Theory of Cyclic Growth Twins
J. L. Ericksen
In this title paper, to appear soon in Mathematics and Mechanics of Solids, I analyze examples of cyclic twins that occur as growth twins in various minerals. There is some disagreement among workers about the precise definition of these, but most have a structure resembling that of an orange. That is, the sample is divided into differently oriented crystals, meeting at discontinuity planes, with a common angle between neighboring planes. The number n of different orientations can take on various values. For examples treated, n = 3 for twins observed in rutile and aragonite. Actually, these involve 6 sections, with the discontinuity planes forming 60° angles, but just 3 different orientations occur. Quartz gives an example with n = 4, marcasite with n = 5, while another twin in rutile has n = 8. Probably, these were unstressed when grown, but changes in pressure and temperature cause examples found in the field to be subject to internal stresses. Mineralogists commonly give idealized descriptions of crystal structures they find, and I expect that this is the case for cyclic twins. To analyze these, I use twinning equations from the X-ray theory, for a neighboring pair of twins, assuming they are unstressed. These are e a = (1 - n
Unusual Solutions of Twinning Equations in the X-ray Theory
J. L. Ericksen
The title paper, to appear subsequently in Mathematics and Mechanics of Solids, deals with the twinning equations described in the paper "On the theory of cyclic growth twins." The solutions covered are such that they hold for any set of values of e^a that are linearly independent. There are exactly four kinds of these. For all, Q can be taken as a 180° rotation. Two are well-known, being called type I and type II twins. For both,
$m^a_b = -\delta^a_b + g_b h^a, \quad g_a h^a = 2, \quad \text{with } g = g_a e^a, \; h = h^a e_a,$
e_a being the lattice vectors, the basis dual to e^a. For type I,
$Q n = n, \quad n \otimes a = g \otimes \left( \frac{2g}{|g|^2} - h \right),$
requiring that n be normal to a crystallographic plane. These are by far the most commonly observed types of twins. For type II,
$Q a = a, \quad n \otimes a = \left( g - \frac{2h}{|h|^2} \right) \otimes h,$
requiring that a be parallel to a row of atoms. For these, e^a and ē^a have the same orientation, which means that they can describe either growth or mechanical twins. For the other two, they have opposite orientations, restricting their use to growth twins. As far as I know, no one else has found these. For both,
$m^a_b = -\delta^a_b + g_b h^a, \quad g_a h^a = 0,$
with g and h defined as above. For one type,
$Q n = n, \quad n \otimes a = g \otimes \left( \frac{2g}{|g|^2} - h \right),$
requiring that n be normal to a crystallographic plane. For the other,
$Q a = a, \quad n \otimes a = \left( \frac{2h}{|h|^2} + g \right) \otimes h,$
requiring that a be parallel to a row of atoms. Then, I noticed that one can get solutions with zigzag surfaces of discontinuity by using two solutions of the latter kind, with the same value of h, the two values g and -g. Divide space into parallel strips with normal g, spacing these any way you like. Using pieces of the two discontinuity planes alternately, form a continuous surface, with these intersecting on the aforementioned planes. On one side of this, use the value of e^a shared by the two solutions. On the other side, alternate the two values of ē^a in the parallel strips used, fitting these to the corresponding discontinuity plane. It turns out that these two configurations are related by a lattice invariant shear mapping the parallel strips onto themselves. This means that they describe the same configuration, so one can ignore the discontinuities relating them. Some observed growth twins called penetration twins do have zigzag interfaces, but I have not explored the possibility of these solutions fitting any of these.
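The four kinds of solutions can be checked numerically. The sketch below rests on an assumption: the twinning equation is truncated in this copy, so the check uses the relation Q(-1 + g ⊗ h) = 1 - n ⊗ a, which is one reading of it and is labeled as such in the code. The function names are hypothetical, and random vectors stand in for crystallographic ones, so integrality of the components is ignored.

```python
import numpy as np


def rotation_pi(axis):
    """180-degree rotation about `axis`: Q = -1 + 2 u (x) u, with u the unit vector along axis."""
    u = np.asarray(axis, dtype=float)
    u = u / np.linalg.norm(u)
    return -np.eye(3) + 2.0 * np.outer(u, u)


def pair_with_product(c, rng):
    """Random vectors g, h adjusted so that g . h = c."""
    g = rng.normal(size=3)
    h = rng.normal(size=3)
    h = h + (c - g @ h) * g / (g @ g)
    return g, h


def residual(Q, g, h, n_outer_a):
    """Largest entry of Q(-1 + g (x) h) - (1 - n (x) a); the relation itself is an assumption."""
    M = -np.eye(3) + np.outer(g, h)
    return np.abs(Q @ M - (np.eye(3) - n_outer_a)).max()


def plane_normal_solution(g, h):
    """Q n = n with n along g: n (x) a = g (x) (2g/|g|^2 - h)."""
    return rotation_pi(g), np.outer(g, 2.0 * g / (g @ g) - h)


def atom_row_solution(g, h, sign):
    """Q a = a with a along h: n (x) a = (g + sign * 2h/|h|^2) (x) h."""
    return rotation_pi(h), np.outer(g + sign * 2.0 * h / (h @ h), h)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # g.h = 2 with sign -1 is type II; g.h = 0 with sign +1 is the second new kind.
    for c, sign in ((2.0, -1.0), (0.0, 1.0)):
        g, h = pair_with_product(c, rng)
        print(c, residual(*plane_normal_solution(g, h), g, h) if False else None)
        Q, na = plane_normal_solution(g, h)
        print("plane-normal residual:", residual(Q, g, h, na))
        Q, na = atom_row_solution(g, h, sign)
        print("atom-row residual:", residual(Q, g, h, na))
```

Under this reading, the determinant of -1 + g ⊗ h is -(1 - g·h), so it is +1 when g·h = 2 and -1 when g·h = 0, in line with the remark that the first two kinds preserve orientation and the other two reverse it.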
4. Phase Transitions Experimentally, it can be at least difficult to determine whether a transformation is of second-order or is of first-order, but very weak. There is a Landau theory used by physicists that can be used to exclude as very unlikely some kinds of second-order transitions. This theory does assume that constitutive equations are quite smooth, although physicists have reasons to believe that they exhibit fairly mild singularities. Interest in a transition that was thought to be of second-order but violated the exclusion rules led me, in the paper "Some phase transitions in crystals," to explore kinds of singularities that can negate those rules. Also, I had been concerned about reconciling the conventional assumption that constitutive equations should be invariant under only a finite group with my idea that it should be an infinite group. So I presented a conjecture about this, later proved by Pitteri. Basically, the result is that one can restrict the domain of a constitutive equation to a neighborhood of a given configuration, called the center, such that the neighborhood is mapped to itself only by a finite subgroup, the lattice group for the center. Finding limits on such neighborhoods is a purely kinematical problem. The article "Continuous martensitic transitions in thermoelastic solids" is a presentation of theory of phase transitions based on thermoelasticity theory. At the time, expositions of thermoelasticity theory or elasticity theory had not covered this topic. Some but not all phase transitions are such that the configurations involved can be included in a Pitteri neighborhood and these are the transitions that I call weak. Often, such transitions are analyzed by physicists using Landau theory. 
The structure in "Weak martensitic transformations in Bravais lattices" uses generic arguments on polynomial approximations to thermodynamic potentials; and here I discuss features of weak-transitions for Bravais lattices, for which the Cauchy-Born hypothesis seems to be reliable. Phase transitions in crystals that can be treated as Bravais lattices, based on thermoelasticity theory, are studied in "Bifurcation and martensitic transformations in Bravais lattices." Theory of this kind seems to be reliable for shape-memory alloys although, strictly speaking, these are not crystals. This paper presents a fairly general discussion of possible bifurcations occurring when pressure and temperature are used as control parameters. There is some emphasis on theory relating to having some moduli become relatively small near transition, often referred to as moduli becoming soft. Examples involving cubic and tetragonal configurations are presented. The article "Local bifurcation theory for thermoelastic Bravais lattices" deals with the kind of theory used in the previous paper, except that only temperature is considered to be a control parameter. In a generic sense, this characterizes the kinematically possible bifurcations associated with the different point groups. There is an error in this, corrected in a later paper "Thermal expansion involving phase transitions in certain thermoelastic crystals."
As is well known, in 1850 Bravais reasoned that there are 14 kinds of Bravais lattices, using the idea that two configurations are of the same type if they can be joined by certain kinds of paths in the space of lattice vectors. Without realizing it, I presented in "On the symmetry of deformable crystals" (see Chapter 1) an example showing that two kinds of configurations that he concluded were of different types could in fact be joined by such a path. Pitteri and Zanzotto noticed this and proved that, with Bravais' assumptions, one gets not 14 but 11 types: one can use the theory of lattice groups to get the 14. In "On the possibility of having different Bravais lattices connected thermodynamically," I use generic reasoning to conclude that, by varying pressure and temperature, it is thermodynamically possible to get the two kinds of configurations occurring in my example to be related by a second-order phase transition. The α-β phase transition in quartz is accompanied by numerous complications that have long puzzled workers. Although the Cauchy-Born hypothesis seems to apply, thermoelasticity theory is inadequate to treat this transition even if one ignores many complications, it being essential to account for the fact that it is a multilattice. Wanting a realistic example of this kind, I used a 3-lattice model used earlier by James to explore, in "On the theory of the α-β phase transition in quartz," what would be predicted by rather simplistic analyses. This is a transition that needs much more work, if only to translate ideas used by physicists into forms more comprehensible to workers in mechanics and mathematics.
Some Phase Transitions in Crystals J. L. ERICKSEN Communicated by D. D. JOSEPH
1. Introduction
In 1937, LANDAU* [1] proposed a theory of second-order phase transitions in crystals. According to it, some types of changes in crystal symmetry should not be observed. Different physicists seem to have different opinions about the soundness of this theory. Some of the types excluded seem to be observed. It is not impossible for an experimentist to mistake a weak first-order transition for one of second order. If data indicates it to be of second order, one can always argue that still better data might indicate that it is not. Thus, some exercise of judgment is involved, in deciding whether the conflict exists. Let me describe one case. From observations of a cubic-tetragonal transition in V_3Si, BATTERMAN & BARRETT [2] concluded, reasonably I think, that this is a second-order transition. ANDERSON & BLOUNT [3] pointed out that, if so, there is a conflict with the Landau theory. The general tenor of their paper indicates that they do not think it impossible that the theory is wrong, for they give opinions as to why it might be, as well as reasons for thinking that this and some other transitions are of second order. More recently, we find MICHELSON [4] writing that "... Landau's approach is invalid in the intermediate vicinity of phase transition points and leads to incorrect results regarding the critical behavior (critical exponents). However, as regards the determination of the allowed symmetry changes and types of ordering, this approach is believed to yield correct results." By implication, BATTERMAN & BARRETT were wrong, although MICHELSON does not mention such cases. Or, perhaps, the implication is that ANDERSON & BLOUNT incorrectly interpreted LANDAU'S theory. If the latter authors interpret incorrectly, so do I. Other examples of confusing statements can be found. There is a fair-sized literature on the subject and, along the way, I will mention some of the information which seems relevant.
I have been led to rethink the whole matter, and find myself in disagreement with LANDAU, on some of the more basic theoretical issues involved. From the viewpoint of modern continuum mechanics, this involves some fundamental issues, concerning which there are differences of opinion. Experts here have given little thought to one which plays a central role in the Landau theory, concerning the improbability of certain kinds of predictions being observable. I find fault with this, but don't reject the whole idea. I explore an alternative, in an informal way, but still do not see how best to reformulate this, in a rigorous way. Insofar as some cubic-tetragonal and other transitions are concerned, I come to conclusions which disagree with LANDAU'S, although my reasoning seems not to eliminate all exclusion rules. Presently, I do not think it feasible to give a definitive treatment, but it is a good problem which needs to be solved. It is time that we learn how to ascertain what, say, thermoelasticity theory really predicts about those second-order transitions. Whatever we might think about the relevant experimentation, we need to set the theory straight.
* The reference is to his collected works. As is noted there, the original work was published as two papers, each of two parts, one in Russian, the other in German.
Archive for Rational Mechanics and Analysis, Volume 73, © by Springer-Verlag 1980
2. Background
Physically, we are concerned with the equilibrium of crystals, in an environment where temperature θ is controlled. For simplicity, I will consider them to be subject to no forces, θ being the only control variable, following ANDERSON & BLOUNT [3]. It is relatively easy to modify the analyses to accommodate pressure variations, as LANDAU did. I have in mind situations where varying the pressure a bit does not eliminate or change to first-order the transitions considered, although the transition temperature etc. might shift. According to WEGER, SILBERNAGEL & GREINER [15], varying pressure up to 408 atmospheres has a negligible effect on the transition temperature of V_3Si, but uniaxial stress has a more significant influence. To see the full picture, we should consider the effects of various kinds of loadings, but I will not do this. The experimentist must adopt some operational definition of a second-order phase transition, and it is pertinent to have some understanding of what it is. More accurately, I give my interpretation of what is typical. I don't think that I could phrase a precise definition which would make everyone completely happy. In principle, that is dictated by the theory which is applicable, but the experimentist can't be sure what it is. Commonly lattice vectors are observed, by X-ray methods. I simplify a bit, for clever methods can reveal finer structure. Lattice vectors for one configuration are not unique, and I don't want to dwell on the issue of which selection we are to make. Suffice it to say that we should be able to find a set which varies in quite a smooth fashion with θ, remaining continuous at the transition temperature. Later, I will say a bit more about this. Their derivatives with respect to θ might exhibit finite or infinite discontinuities at transitions.
Measurements of the macroscopic deformation gradient and linear elastic moduli should indicate that these also remain continuous, but derivatives might exhibit similar discontinuities. Further, the transition should not involve a latent heat; the entropy should also be a continuous function of θ. The transition should be reversible; if we cycle θ through its transition value, we should come back to the original configuration, etc. If any of these conditions fail, we would not consider that we have a second-order transition. Experimentists are likely to look at other things, like electrical resistivity. Clearly, a small discontinuity in elastic moduli might be lost in experimental error, and a different measurement could tip us off that this has been missed. If, say, some
elastic modulus is judged to jump discontinuously, we would consider that the transition is of the first order. This need not be reversible, in the sense indicated above. Here irreversibility is likely to be interpreted as meaning that, on at least one side of the transition, one is getting into a state which is only metastable, as occurs in the supercooling of fluids. The operational definition involves a prejudice that thermoelasticity theory has some relevance, at least. By my interpretation, this is very close to what LANDAU used and, for purposes of comparison, I will use it, as it is likely to be interpreted by physicists. Actually, such theory deals with macroscopic deformations, not lattice vectors, and we need to account for the latter. Temporarily, let us gloss over this. Then the above description makes fairly good sense, if we can reasonably assume the crystals to be in homogeneous configurations. In the cubic-tetragonal transition of V_3Si, the cubic phase seems to qualify. In the tetragonal phase, BATTERMAN & BARRETT [2] observed a banded structure, suggesting that the "... bands in the topograph are from twin-related lamellae." Each band might reasonably be considered to be homogeneous. I think it reasonable to consider a band by itself, so will, like LANDAU, presume that we deal with homogeneous configurations. It does indicate that one should be cautious of accepting at face value experimental values of elastic moduli, etc. in the tetragonal phase. Perhaps more important is the warning that an adequate theory should be capable of coping with such twinning phenomena. LANDAU seems not to have appreciated this. Thermoelastic theory of twinning is in an underdeveloped state, but one needs the right view of material symmetry to treat it. Let me describe one view which seems rather reasonable.
In the temperature range where we have cubic phase assume, as usual, that constitutive equations are invariant under the indicated point group, as we would normally do, if we did not know of the existence of the tetragonal phase. Similarly, in the temperature range where the tetragonal phase occurs, ignore the cubic phase, treating it as you would any other tetragonal crystal. From traditional views of symmetry of crystals, this seems like the natural thing to do. Otherwise, it seems that we would have to know whether a crystal could change symmetry, before we could analyze its linear elastic behavior. Thus the work of G U R T I N & WILLIAMS [6] is based on this idea, and it was LANDAU'S view. My view is that this is not quite right, although it is not totally wrong. To get a sensible theory of twinning, one needs a different view, and this may, but need not be associated with our second-order phase transitions. I don't want to get into the theory of twinning, but will try to make these points a bit clearer. As we mentioned, there is need for some auxiliary hypotheses concerning lattice vectors. The usual practice is to borrow that used in molecular theories of crystal elasticity. I assume this to be a part of LANDAU'S theory, although he, like various other physicists, leaves this unsaid. The assumption originated in the oldest and simplest theory, that of CAUCHY [7] . Other assumptions which he made have been generalized and modified, but this one remains standard. CAUCHY considered all atoms to be identical, subject to central forces of short range. Allowed configurations are simple lattices, which excludes such things as hexagonal close-packed crystals. That is, for some choice of constant and linearly independent lattice vectors ea (a = 1,2, 3), the set of position vectors of
atoms, relative to one, must be, precisely, the set given by
$v(n^a) = n^a e_a,$ (2.1)
where n^a stands for any selection of positive and negative integers, zero being included as an integer. Of course, the crystal is considered to fill all of space. One such configuration, equipped with a definite set of lattice vectors, E_a, is taken as a reference. Then, a homogeneous deformation gradient, F, is associated with a change of lattice vectors to e_a, given by
$e_a = F E_a.$ (2.2)
As a matter of definition, any crystal must have a periodic structure, described by a set of lattice vectors. The single atom in LANDAU'S model can be replaced by a set or, equivalently, we can think of the crystal as composed of several monatomic simple lattices, with the same lattice vectors. The more classical theories of CAUCHY and BORN are summarized rather neatly and briefly by STAKGOLD [8]. LANDAU thought more in terms of a statistical description, but made no use of any description of interactions, etc. Atomistic ideas might underlie our notions of these kinematical ideas of lattice vectors, crystal symmetry, etc. Otherwise, his exclusion rules are based on thermodynamics, of a classical kind, not molecular theory. ANDERSON & BLOUNT [3] opine that thermoelasticity theory is inadequate to describe cubic-tetragonal transitions in particular, if they are of second order. This is indicated by remarks made like "The Batterman-Barrett transitions, then, if they are indeed second-order, must have some additional, as yet unknown, order parameters." Frankly, I am not sure what we are to mean by a second-order transition, without reference to some rather specific type of theory, but I will let this pass. Neither is it clear to me how the addition of more variables would change the conclusion concerning the possibility of the transition. It does seem worthwhile to review an argument which has been used by some to dispose of some of the things which might be called internal variables. As an illustrative example, I will use one which has been discussed a bit in the literature. Some monatomic lattices which are not simple, as well as some diatomic lattices, can be thought of as composed of two simple lattices. One is translated slightly, relative to the other. A vector p, describing this translation, will serve as our internal variable. The atomic positions are described by relative position vectors of the form (2.3)
$v(n^a) = n^a e_a,$
$w(n^a) = v(n^a) + p.$
Molecular theory such as is described by STAKGOLD [8] does not presume an a priori relation like (2.2) to relate $p$ and $F$, but leads to a relation which depends on the nature of interactions. Clearly, $e_a$ and $p$ determine all atomic positions, hence all allowed configurations, so it is not unreasonable to expect that $\phi$, the Helmholtz free energy per unit mass, will be given by a constitutive equation of the form (2.4)
$\phi = \tilde\phi(e_a, p, \theta),$
459 Phase Transitions in Crystals
103
and this is consistent with such molecular theory. To eliminate $p$, we fix $e_a$ and $\theta$, and determine $p = \bar p(e_a, \theta)$ by the condition that it minimize (2.5)
$\tilde\phi(e_a, p, \theta).$
Insofar as molecular theory is concerned, this is a hard calculation, so workers tend to deal only with infinitesimal changes in configurations, etc., but this is the idea. This, together with (2.2), gives the isothermal strain energy function (2.6)
$\hat\phi(F, \theta) = \tilde\phi(F E_a, \bar p(F E_a, \theta), \theta),$
$E_a$ being considered as fixed. Clearly, much the same argument can be used to eliminate various kinds of internal variables, including some which we might not know of. For the problems at hand, we will wind up minimizing the potential anyway, so why not think of doing it in this way, reducing it to a thermoelastic problem? The argument is, I think, not so bad, but it has its limits. If, in (2.5), we permit relative minima, then we might easily fail to get $\hat\phi$ as a single-valued function of its arguments. Thus, it seems, chances are rather good that we are excluding some metastable configurations. If this is the kind of thing that is involved, then I would side with ANDERSON & BLOUNT, up to a point. To proceed, we then need some more or less definite theory, explicitly involving those mysterious variables to which they refer. Another possibility seems to me at least as likely, and, with some effort, we can accommodate it, within the framework of thermoelasticity theory. I will again describe it in terms of our example. Suppose that, in the part of its domain which is involved,
460 104
J. L. ERICKSEN
bifurcation. We are then lost if we don't know what kinds of internal parameters are involved. I think it sensible to go ahead, as I suggest, making some allowance for the possibility of such ties. I do conclude that good theory should allow for the possibility that potentials of interest might not be so smooth, and I will assume no more than is necessary about this. Neither do I like using one kind of reasoning for analytic or $C^\infty$ functions, and a very different one for those which are much less smooth. It is not easy to patch up LANDAU'S reasoning to accommodate less smooth functions. I interpret this to mean that we need to find some different approach, and I will take a stab at this. There is another subtlety which deserves more attention than I can give it here. Suppose that all atoms are alike and that, in (2.3), we have, say, $p = e_1/2$. It is then easily seen that the configuration can also be described as a simple lattice with different lattice vectors, namely $(e_1/2, e_2, e_3)$. An infinitesimal shift in atomic positions could invalidate the latter description, by making $p \neq e_1/2$. By one way of looking at it, the slight shift has doubled the size of a lattice vector, but we don't really expect the volume of the crystal to double; it need not change at all. If we view the simple lattice as a degenerate case of a diatomic lattice, the lattice vectors will only shift slightly. Pursuing this idea to more complex multi-lattices, we see that many lattice vectors can be associated with one configuration of a crystal, and we cannot expect (2.2) to apply, irrespective of what choices we make for the different configurations. To some degree, (2.2) serves as a guideline for selecting lattice vectors. In the transitions considered by LANDAU, he considered that expected atomic positions change in a continuous manner, giving one possible basis for a definition of second-order transitions.
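The remark about $p = e_1/2$ can be checked on a small example. The following is a minimal sketch, assuming an illustrative cubic choice of lattice vectors and a finite comparison window; the helper names (`combo`, `window`, `two_lattice`, `simple_lattice`) are hypothetical and not part of the original text. It compares the two-lattice of (2.3) with the simple lattice generated by $(e_1/2, e_2, e_3)$, first for $p = e_1/2$ and then for a slightly shifted $p$.

```python
from fractions import Fraction
from itertools import product

def combo(n, e):
    """Return the position n^a e_a for an integer triple n and lattice vectors e."""
    return tuple(sum(n[a] * e[a][i] for a in range(3)) for i in range(3))

def window(points, size=2):
    """Keep only the points whose coordinates all lie in [-size, size]."""
    return {x for x in points if all(abs(c) <= size for c in x)}

def two_lattice(e, p, reach=6):
    """Union of the simple lattice v = n^a e_a and its translate w = v + p, as in (2.3)."""
    pts = set()
    for n in product(range(-reach, reach + 1), repeat=3):
        v = combo(n, e)
        pts.add(v)
        pts.add(tuple(v[i] + p[i] for i in range(3)))
    return pts

def simple_lattice(e, reach=6):
    """All positions n^a e_a with |n^a| <= reach."""
    return {combo(n, e) for n in product(range(-reach, reach + 1), repeat=3)}

one, zero, half = Fraction(1), Fraction(0), Fraction(1, 2)
e = [(one, zero, zero), (zero, one, zero), (zero, zero, one)]  # illustrative choice of e_a
e_halved = [(half, zero, zero), e[1], e[2]]                     # the vectors (e_1/2, e_2, e_3)

p_special = (half, zero, zero)                                  # p = e_1/2
p_shifted = (half + Fraction(1, 100), zero, zero)               # a slightly shifted p (assumed value)

same_when_special = window(two_lattice(e, p_special)) == window(simple_lattice(e_halved))
same_when_shifted = window(two_lattice(e, p_shifted)) == window(simple_lattice(e_halved))
print(same_when_special, same_when_shifted)
```

Exact rational arithmetic is used so that the set comparison is not affected by rounding. The window only serves to avoid edge effects of the finite enumeration.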
Intuitively, it is then plausible that the macroscopic deformation gradient should also vary continuously. I accept as plausible that one can select lattice vectors also varying continuously, which conform to (2.2). Some such assumption seems necessary, to correlate microscopic and macroscopic views of the problem. There is a third possibility. Namely, $\bar p$ might not exist for all $e_a$, giving rise to holes in the kinematically possible domain where $\hat\phi$ is not defined. It might be feasible to do some analysis of such cases, but I will not attempt this. If one looks carefully at various molecular theories of elasticity, accepting conditions of the kind indicated by (2.5) to eliminate $p$ or its analogs, one finds that there is agreement concerning certain properties of our thermoelastic potential. I will denote this by $\hat\phi$, whether or not it is associated with our example of diatomic crystals. First, the requirement of Galilean invariance is built in, and leads to the conclusion that $\hat\phi$ is expressible as a function of the scalar products of lattice vectors,
(2.7)
$\phi = \hat\phi(C, \theta),$
where the matrix (2.8)
$C = \|e_a \cdot e_b\|$
is symmetric and positive definite. With (2.2), C can be regarded as the matrix of components of the Cauchy-Green tensor commonly used in nonlinear elasticity theory, if we use the reference reciprocal lattice vectors as a basis. In this sense,
(2.7) is a standard description of this thermoelastic constitutive equation. Secondly, one finds that (2.9)
$\hat\phi(M C M^T, \theta) = \hat\phi(C, \theta), \quad \forall M \in G.$
Here, $G$ is an infinite, discrete group, represented by all matrices of the form (2.10)
$M = \|m_a^b\|, \quad \det M = \pm 1,$
with the $m_a^b$ being any set of positive or negative integers which satisfy the condition on the determinant. It is this, whether we are dealing with a cubic crystal, a tetragonal crystal, or whatever. This is very different from what is commonly used, which is, effectively, one of the finite subgroups of $G$. For local considerations, such as are of interest here, the differences are not so great, but subtle. For the old theory of CAUCHY, deduction of (2.9) is discussed by ERICKSEN [11]. The theories covered by STAKGOLD [8] can be analyzed in a similar way. At bottom, it rests on premises which are generally accepted in molecular theory, such as the periodicity which defines crystals, and the interchangeability of identical atoms. I have my reservations about molecular theory, but think that this prediction is reliable. In part, my prejudice is based on considerations of the twinning phenomenon. For a sensible analysis of second-order phase transitions, it is almost necessary that
Somewhat tentatively, I side with those physicists and others, who do not think that this is the right course. If we do otherwise, we should have some good physical reason for doing so, and we might inquire as to what it is. One line of thought has occurred to me. To some degree, it is suggested by the imperfection analysis used for structures, such as are treated by THOMPSON & HUNT [14], but there are differences. For a particular material, as we interpret the term in practice, we will not know the potential function precisely. As I see it, it is true, but not of prime importance, that experimental errors or inadequacies in mechanistic theory prevent this. If nothing else mattered, we might increase accuracy by focusing on those places where small differences have important consequences. Still, if the tiniest differences are of great import, experiments would not likely be reproducible. Here, I accept the notion that a material should be described not by one constitutive equation, but a set. No two crystals are exactly alike, if only because real crystals contain some accidental imperfections. By using different methods of crystal growth, we can dramatize the differences, as in the transforming and non-transforming crystals used by TESTARDI & BATEMAN [15]. We might not like to regard these as the same material but neither do we wish to go to the opposite extreme, considering each sample to be a different material. Experimentists would not regard the materials as the same, if their elastic moduli, etc., were very different. If we grant this, it is reasonable to consider that, for one material, the set will contain a variety of functions. Probably, functions close to one should deliver essentially the same predictions. Without some condition like this, it is unlikely that experimentists, using slightly different samples, could agree on how a material behaves.
The idea of closeness should imply that not only values of the functions, but their first and second derivatives are close and, perhaps, something else. My intent is to try out this rather vague idea. Physically, it is not entirely clear that every small shift in
change in symmetry, as is indicated by examples given by ERICKSEN [16]. To avoid lengthy discussion of this, I will avoid discussion of argumentative cases. Again, I find myself in some disagreement with LANDAU, and others, as to what we should mean by this.
3. Equilibrium Configurations
For studying stability of equilibrium of these crystals, I don't find fault with the traditional criterion used by LANDAU, so I follow suit. That is, at any particular value of $\theta$, $C = \bar C$ is a (stable or metastable) equilibrium configuration provided that (3.1)
$\hat\phi(C, \theta) \ge \hat\phi(\bar C, \theta),$
at least for $C$ near $\bar C$. Of course, this only makes sense when the arguments all lie in the domain of $\hat\phi$, and $\bar C$ could be a boundary point. I will follow LANDAU, regarding $\bar C$ as always being in the interior. Some experts in continuum mechanics would not like this, if $\bar C$ refers to the value occurring at one of our phase transitions, for the following reason. As we approach the transition, some acoustic wave speed will approach zero. R. A. TOUPIN has shown me a proof, worked out in collaboration with H. THOMAS, that this always happens, at a second-order phase transition involving a change of symmetry. My ideas of a change of symmetry are slightly different, but I concur with the result. Experimentally, unusual types of damping are observed and, as one gets quite close to transition, it becomes impossible to transmit a measurable signal, at least in crystals such as are discussed by KELLER & HANAK [17], for example. That this speed really goes to zero is then inferred by extrapolating data, so one might quibble about this. This might affect the applicability of the theory, but not what theory predicts. Theoretically, the transition point is likely to have neighboring states where some such speeds would become imaginary. Some would like to exclude these, on the grounds that such configurations are too unstable to be observable. ERICKSEN [11, Section IVF] notes that, if the domain of
about this whole business, although I do find the work of DAFERMOS [18] enlightening. It suggests to me that it is safer not to excise these subdomains. Thus, this is the way I will gamble. Clearly, there is need for additional serious studies of these matters. Of course, there is the implication that measurements of elastic moduli by wave propagation measurements might be misinterpreted. For such reasons, I prefer to try to straighten out the theory, without relying very heavily on empirical information. If $C$ is an equilibrium state, and (2.9) applies, so is $M C M^T$, for any $M \in G$, an infinite group. It would seem that three difficulties might arise. First, infinite sets can easily have limit points. Conceivably, then, this invariance might, by itself, prevent an equilibrium state from being isolated. Second, it might seem impossible, or at least very difficult, to account for all of the consequences of this invariance. Third, it is not completely clear how we reconcile this with traditional theories of material symmetry, which have been used quite successfully in various linear theories. PARRY [19] has proposed a method for constructing and characterizing properly invariant functions which is somewhat intuitive but, I think, sound. In pondering it, it occurred to me that one theorem would clear up various such questions, and PITTERI [20] has found a proof. Given any $C$, symmetric and positive definite, we can define a subgroup of $G$, $\mathcal{L}(C)$, by the relation (3.2)
$\mathcal{L}(C) = \{M \in G : M C M^T = C\}.$
As is discussed by ERICKSEN [16], this is always a finite group, conjugate to one of the crystallographic point groups. What I refer to is then given by a proposition of a purely kinematical nature, viz.
Proposition. Given $C$, there exists a neighborhood of it, $N(C)$, such that, for any $M \in G$, either (a)
$M \in \mathcal{L}(C)$
or (b)
$M : N(C) \to N, \quad N \cap N(C) = \emptyset.$
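The definition (3.2) lends itself to a brute-force check for simple choices of $C$. The following is a minimal sketch, not the author's method: it enumerates integer matrices with entries in $\{-1, 0, 1\}$ and keeps those with $\det M = \pm 1$ and $M C M^T = C$. Restricting the entries in this way is an assumption that happens to be exhaustive for the two diagonal examples used here (the diagonal entries of $M C M^T = C$ force each row of $M$ to be a signed unit vector); it would not be adequate for a general $C$. The function names and the sample matrices are hypothetical.

```python
from itertools import product

def mat_mul(A, B):
    """Product of two 3x3 matrices given as tuples of tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))

def transpose(A):
    return tuple(tuple(A[j][i] for j in range(3)) for i in range(3))

def det(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def stabilizer(C):
    """Matrices M with entries in {-1, 0, 1}, det M = +-1 and M C M^T = C (cf. (3.2)).

    Assumption: the entry range is only guaranteed to be exhaustive for the
    diagonal examples below, not for arbitrary C.
    """
    C = tuple(tuple(row) for row in C)
    found = []
    for entries in product((-1, 0, 1), repeat=9):
        M = (entries[0:3], entries[3:6], entries[6:9])
        if det(M) in (1, -1) and mat_mul(mat_mul(M, C), transpose(M)) == C:
            found.append(M)
    return found

cubic = ((1, 0, 0), (0, 1, 0), (0, 0, 1))        # C = a*identity with a = 1 (illustrative)
tetragonal = ((2, 0, 0), (0, 2, 0), (0, 0, 3))   # diag(b, b, c) with b = 2, c = 3 (illustrative)

L_cubic = stabilizer(cubic)
L_tetragonal = stabilizer(tetragonal)
print(len(L_cubic), len(L_tetragonal), set(L_tetragonal) <= set(L_cubic))
```

For these inputs the enumeration gives a finite group of signed permutation matrices in the first case and a proper subgroup of it in the second, which is the situation the proposition is meant to handle.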
Then, if C is an equilibrium configuration, there need not be any other in N(C), so it can be isolated. If we are interested only in a neighborhood of C, the restriction of
(3.3)
$C = a \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad a > 0,$
then, from the theory of crystallographic groups, or by elementary calculations, we find that $\mathcal{L}(C)$ consists of the matrices (3.4)
$\begin{pmatrix} \pm 1 & 0 & 0 \\ 0 & \pm 1 & 0 \\ 0 & 0 & \pm 1 \end{pmatrix}, \quad \begin{pmatrix} \pm 1 & 0 & 0 \\ 0 & 0 & \pm 1 \\ 0 & \pm 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & \pm 1 & 0 \\ \pm 1 & 0 & 0 \\ 0 & 0 & \pm 1 \end{pmatrix},$
$\begin{pmatrix} 0 & \pm 1 & 0 \\ 0 & 0 & \pm 1 \\ \pm 1 & 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 & \pm 1 \\ \pm 1 & 0 & 0 \\ 0 & \pm 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 & \pm 1 \\ 0 & \pm 1 & 0 \\ \pm 1 & 0 & 0 \end{pmatrix},$
where the algebraic signs can be assigned independently. In, say, a cubic-tetragonal transition, $C$ would be of this form at the transition temperature, so this group is certainly relevant there, as LANDAU would assume. If the configuration were of the tetragonal form indicated by (3.5)
$C' = \begin{pmatrix} b & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix},$
it is easy to see that, when $b \neq c$, $\mathcal{L}(C')$ is a proper subgroup of $\mathcal{L}(C)$. This group will map $C$ to itself, so $N(C')$ does not contain $C$. On the other hand, every neighborhood of $C$ contains some such $C'$, so at least some will be in $N(C)$. For a cubic-tetragonal transition of this kind, $C' \to C$ at the transition, so, locally, we are concerned with this neighborhood. Now
(3.6)
$C'' = \begin{pmatrix} c & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & b \end{pmatrix},$
a third possibility being given by the other obvious permutation. Now $\mathcal{L}(C')$ is conjugate to $\mathcal{L}(C'')$, but a different group. The neighborhood $N(C')$ cannot contain these other two tetragonal configurations etc. Intuitively, when the crystal is in the cubic phase, it does not know one lattice vector from another, so it does not know which should begin to assume the different length. Thus, part of the crystal might shift to (3.5), another to (3.6), assuming that it can accept the discontinuity where the parts meet. In a nutshell, this is the twinning phenomenon, for such tetragonal configurations, and we need this cubic invariance, in the tetragonal phase, to describe it. Such considerations have convinced me that the invariance assumption given by (2.9) is the right one. In other transitions, the cubic group would be replaced by $\mathcal{L}(C)$, where $C$ is the
value occurring at the transition temperature. In the language used by ERICKSEN [16], one might take off into a fixed set corresponding to any subgroup of this group, kinematically. As he indicates, $\mathcal{L}(C)$ is not really a point group, although it is conjugate to one, and this difference is a bit subtle. Otherwise, this notion of branching off to subgroups accords with LANDAU'S ideas about this, except that he would not assume that
(3.7)
$\frac{\partial\hat\phi}{\partial C_{ab}} = 0,$
and (3.8)
$\frac{\partial^2\hat\phi}{\partial C_{ab}\,\partial C_{de}} E_{ab} E_{de} \ge 0,$
for all symmetric $E$, the derivatives being evaluated at the equilibrium state. If (3.8) is strict, these conditions imply (3.1). Also, then, the implicit function theorem applies. Locally, we can solve (3.7) to get $C = C(\theta)$, uniquely, as a differentiable function and, by continuity, (3.8) will remain strict, for $\theta$ in some interval. Further, $\mathcal{L}(C(\theta))$ will be independent of $\theta$. Briefly, fix any value of $\theta = \theta_0$ in the interval, and consider $\mathcal{L}(C(\theta_0))$. Since
Conditions necessary for this yield values of $E$ for which the equality must hold in (3.8). I will sketch the kind of reasoning used by LANDAU, adapting it to this slightly different view of symmetry. At the transition temperature, $\theta = \bar\theta$, we will have the group $\mathcal{L}(C)$, evaluated at the transition configuration. Now $\hat\phi$ is invariant under this, which implies that the second derivatives here are mapped by it to themselves, as components of a fourth-order tensor. This is the idea used in linear theory, to get restrictions on linear elastic moduli. This means that these satisfy a certain set of linear equations. These are to be regarded as identities, not equations. Similar remarks apply to the third derivatives, fourth derivatives, etc., when these exist. If we consider a path coming into this, in one of the possible fixed sets, similar remarks apply, but it will be a different set of identities, in general. On such a path, the determinant of the Hessian of
(3.9)
$E \in V \Leftrightarrow \frac{\partial^2\hat\phi}{\partial C_{ab}\,\partial C_{de}} E_{de} = 0.$
Here and in the following, quantities involved are evaluated at the putative transition. As mentioned above, we can calculate $E$ which must, necessarily, lie in $V$. If there were others, the second derivatives would have to satisfy some additional equations, which is improbable. By using this line of thought, one can pin down the structure of $V$, for any particular kind of transition. Later, we will do this, in special cases. Here, one uses the fact that the group $\mathcal{L}(C)$ maps the second derivatives to themselves, so it must map $V$ onto $V$, $C$ being considered to be fixed at the transition value. If we assume, as LANDAU did, that $\hat\phi$ is at least three times differentiable, we will have, for $E \in V$ and sufficiently small, (3.10)
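$3!\,[\hat\phi(C + E, \bar\theta) - \hat\phi(C, \bar\theta)] = \frac{\partial^3\hat\phi}{\partial C_{ab}\,\partial C_{cd}\,\partial C_{ef}} E_{ab} E_{cd} E_{ef} + o(|E|^3),$
so that (3.1) requires
$\frac{\partial^3\hat\phi}{\partial C_{ab}\,\partial C_{cd}\,\partial C_{ef}} E_{ab} E_{cd} E_{ef} = 0, \quad \forall E \in V,$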
and it was a condition equivalent to this that LANDAU used. As mentioned above, the third derivatives satisfy some identities, which we are free to use. It can happen that these reduce the condition to an identity. If so, he would allow that this type of transition might well be observable, and he would go on to look at higher derivatives, to get conditions necessary for this. If not, we have a type of transition which, in his judgment, should not be observed. The approach makes some kind of sense, if we can assume that the potentials are analytic. Then, locally, each function is determined by its coefficients in a power series expansion, and we can inquire into possible relations between the latter. The identities simply account for the invariance of the potentials. What are we saying about the functions, when we deny those relations between coefficients? One view seems to me to make sense. If some particular equation relating coefficients were satisfied, we could change the coefficients very slightly, and it would no longer be satisfied. Thus we are really requiring insensitivity to small perturbations, within the set of analytic, invariant functions. If we were concerned with only one function, the coefficients would satisfy some equations. Take any two, which depend on temperature, and eliminate $\theta$, to get an equation satisfied by the pair. Surely, we do not wish to assert that such equations are obtainable only very rarely. Also, it seems contrived to assert that some particular equations won't occur by this process. A small perturbation means only a slight change in derivatives, of all orders, if I correctly infer what is meant from calculations used by physicists. If not, what are we to mean? If my view is correct, then something must give, once we admit the possibility that potentials have singularities. We can't force, say, the third or fourth derivatives to be near, if they don't exist. Neither is it clear what we are to mean by analogous coefficients.
It might be argued that potentials can be approximated well enough by analytic or $C^\infty$ functions. This is true, with some ideas of what it means for functions to be close. To decide whether it is true, we need to have some idea of what notion of closeness is appropriate. If we have this, then the need for such approximation might disappear. It is pretty clear, from remarks quoted previously, that some physicists believe the singularities to be real. If they are correct, something must be wrong with the approximation. Later consideration of critical exponents provides some hint as to what it might be. Such considerations have convinced me that we must find some way to select the right topologies, or give up the whole line of thought. This is, I think, the most serious flaw in the Landau theory, and I find it difficult to set it right.
4. Special Cases
To get some feeling for how we should use ideas mentioned earlier, it seems to me better to deal with some rather specific special cases. Always, I will consider that we have a cubic equilibrium path for $\theta > \bar\theta$, with the inequality (3.8) strict except at $\bar\theta$, where we have a critical point*; the inequality fails to be strict. We are only concerned with a neighborhood of the critical point. Any potential
* Mathematicians use this phrase to describe any points where (3.7) holds. I follow older practice, used in referring to critical points in fluids, etc.
considered is to be of class $C^{(2)}$, at least, and invariant under the cubic group (3.4). A priori, no such function is considered to be improbable. However, if a prediction obtains from one, we either get essentially the same prediction for all others which are close to it, or we don't. If we don't, we consider the prediction not to be in accord with possible observations of a material. In more picturesque terms, we have picked a sample which is bad, insofar as the effect at hand is concerned. It might behave more normally, with respect to another effect. I have observed that experimentists do, on occasion, ignore data obtained from an exceptional sample. Here, as explained before, a material is not to be confused with a single sample, particularly when the latter is of exceptional nature. We are willing to adjust the idea of closeness a bit, to accommodate some reasonable demands, there being no other way to decide what is appropriate. Past considerations indicate that it should imply that values of
(4.1)
$4\,d^2\hat\phi = \hat C_{11}(E_{11}^2 + E_{22}^2 + E_{33}^2) + 2\hat C_{12}(E_{11}E_{22} + E_{22}E_{33} + E_{33}E_{11}) + 4\hat C_{44}(E_{12}^2 + E_{23}^2 + E_{31}^2).$
Here, $\hat C_{11}$, $\hat C_{12}$ and $\hat C_{44}$ are moduli, in notation similar to that used by experimentists. I have added carets, to avoid possible confusion with components of $C$. Also, the usual moduli are these, multiplied by $a^2$. To compare this with, say, LOVE'S [22, §110] description, identify $d^2\hat\phi$ with his $2W$, $E_{11}/a$ with $2e_{xx}$, $E_{12}/a$ with $e_{xy}$, etc. The stability inequality (3.8) is equivalent to (4.2)
$\hat C_{11} - \hat C_{12} \ge 0, \quad \hat C_{11} + 2\hat C_{12} \ge 0, \quad \hat C_{44} \ge 0,$
one or more of the equalities holding at a critical point. As LANDAU would have it, only one should fail to be strict, giving three possible types of critical points. To branch off to a tetragonal configuration of the type (3.5), an elementary analysis indicates that $d^2\hat\phi$ must vanish for some value of $E$ such that $E_{11} = E_{22} \neq E_{33}$, $E_{12} = E_{23} = E_{31} = 0$, which requires that (4.3)
$\hat C_{11} = \hat C_{12}.$
If the remaining two inequalities are strict, one calculates that the linear space $V$ defined by (3.9) is two-dimensional, consisting of all $E$ such that (4.4)
$E_{11} + E_{22} + E_{33} = 0, \quad E_{12} = E_{23} = E_{31} = 0.$
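The statements around (4.2) to (4.4) can be illustrated by evaluating the quadratic form directly. The following is a minimal sketch, assuming the standard cubic form $4\,d^2\hat\phi = \hat C_{11}(E_{11}^2 + E_{22}^2 + E_{33}^2) + 2\hat C_{12}(E_{11}E_{22} + E_{22}E_{33} + E_{33}E_{11}) + 4\hat C_{44}(E_{12}^2 + E_{23}^2 + E_{31}^2)$; the function names and the numerical values of the moduli are hypothetical choices made only for illustration.

```python
def d2phi(E, c11, c12, c44):
    """Second differential of the potential for a symmetric 3x3 matrix E.

    Uses the cubic quadratic form quoted in the lead-in, divided by 4.
    """
    squares = E[0][0] ** 2 + E[1][1] ** 2 + E[2][2] ** 2
    cross = E[0][0] * E[1][1] + E[1][1] * E[2][2] + E[2][2] * E[0][0]
    shear = E[0][1] ** 2 + E[1][2] ** 2 + E[2][0] ** 2
    return (c11 * squares + 2 * c12 * cross + 4 * c44 * shear) / 4

def diagonal(e1, e2, e3):
    return [[e1, 0, 0], [0, e2, 0], [0, 0, e3]]

shear_12 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]

# Illustrative moduli at a critical point of the type (4.3): c11 = c12, c44 > 0.
c11 = c12 = 1
c44 = 1

on_V_tetragonal = d2phi(diagonal(1, 1, -2), c11, c12, c44)   # trace-free, E11 = E22 != E33
on_V_other = d2phi(diagonal(1, -1, 0), c11, c12, c44)        # another trace-free diagonal E
dilatation = d2phi(diagonal(1, 1, 1), c11, c12, c44)         # probes c11 + 2*c12
shear = d2phi(shear_12, c11, c12, c44)                       # probes c44
print(on_V_tetragonal, on_V_other, dilatation, shear)
```

With these values the form vanishes on the two independent trace-free diagonal directions and is positive on the dilatation and on the shear, consistent with $V$ being two-dimensional when only $\hat C_{11} - \hat C_{12}$ fails to be positive.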
Now d3
interpretations of LANDAU, this type of transition should not be observed. For this, it does not matter whether we use his ideas of invariance, or mine. Of course, I must think this through independently, since I don't accept his criterion. I will not consider all of the eight possibilities represented in (4.2), but only those such that (4.5)
$\hat C_{11} + 2\hat C_{12} > 0.$
Physically, cubic equilibrium configurations are certainly observable in some materials, so we would be in trouble if our theory were to deny this. Reasonably, we can use this, to check out ideas of closeness, etc. Some of the conditions for equilibrium at a cubic configuration follow automatically from the invariance of
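(4.6)
$C_{12} = C_{23} = C_{31} = 0 \Rightarrow \frac{\partial\hat\phi}{\partial C_{12}} = \frac{\partial\hat\phi}{\partial C_{23}} = \frac{\partial\hat\phi}{\partial C_{31}} = 0.$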
For this, it is not necessary that diagonal components of C be equal, so this also applies to our tetragonal configurations, for example. For the cubic configurations we have, in addition to this (4-7)
Cll
d
so the condition that d$ = 0 reduces to one equation. Let
(48) (4.9)
£- 3 » da
4
BCl{
^= 9(C11+2C12).
Here, I use the fact that d2
a>a{0),
— (a,0)
a
(4.10)
A small perturbation in
sufficiently insensitive to perturbations. Also, a slight perturbation should leave (4.5) satisfied. If all the inequalities in (4.2) are similarly strict, they should remain strict. Here, we see the need for keeping first and second derivatives close together and, possibly, we want the differences to be uniformly small, in some neighborhood; I am just not sure what are the minimal hypotheses needed, to prove such results. In any event, the statement that we have a cubic equilibrium point where d2
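(4.11)
$\hat C_{11} + 2\hat C_{12} > 0, \quad \hat C_{44} > 0, \quad \hat C_{11} - \hat C_{12} \begin{cases} > 0, & \theta > \bar\theta, \\ = 0, & \theta = \bar\theta, \\ < 0, & \theta < \bar\theta, \end{cases}$
or
(4.12)
$\hat C_{11} + 2\hat C_{12} > 0, \quad \hat C_{11} - \hat C_{12} > 0, \quad \hat C_{44} \begin{cases} > 0, & \theta > \bar\theta, \\ = 0, & \theta = \bar\theta, \\ < 0, & \theta < \bar\theta. \end{cases}$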
In either case, the cubic configuration becomes unstable for $\theta < \bar\theta$, so we don't expect to observe it. I will focus on (4.11). From (4.3), this is the one which is relevant for cubic-tetragonal transitions, so we will be able to compare our conclusions with LANDAU'S. Also, data such as are sketched by REHWALD [23] are compatible with (4.11), not with the possibility that $\hat C_{11} + 2\hat C_{12} = \hat C_{11} - \hat C_{12} = 0$, which we did not cover. For such reasons, this special case seems to be of some physical interest. Although I will emphasize (4.11), there is an interesting point connected with another possibility, a second-order transition leaving the crystal in cubic form. Using (4.9), and the implicit function theorem, we see that the cubic branch would exhibit no unusual discontinuity, etc. unless $\hat C_{11} + 2\hat C_{12} = 0$ at the critical
point. The implicit function theorem does then not guarantee that the cubic branch can be extended to $\theta < \bar\theta$. However, to have the transition, it must, and $\hat C_{11} + 2\hat C_{12}$ must be non-negative, near $\theta = \bar\theta$. Naively, it seems that, if we are to be consistent in our reasoning, we must judge that a material will not so behave, and this is in agreement with what obtains from my interpretation of the Landau criterion. This gives an exclusion rule which is consistent with experience, as far as I know. Actually, we might hedge a little. Some breakdown in smoothness of the cubic branch is to be expected at $\theta = \bar\theta$, so the analogy is not very precise. I am not sure what we should make of this. To proceed with the local analysis of the critical points described by (4.11), it is convenient to eliminate some variables, in a traditional way. I find POINCARÉ [24] doing this, in 1885, for example. This leads, in a natural way, to what physicists call "order parameters", after LANDAU. With the invariance described by (3.4), we can use (4.6), and similar considerations of second derivatives, to infer that (4.13)
$\hat\phi(C, \theta) - \hat\phi(C_{11}, C_{22}, C_{33}, 0, 0, 0, \theta) = \alpha C_{12}^2 + \beta C_{23}^2 + \gamma C_{31}^2 + R,$
with $R = o(C_{12}^2 + C_{23}^2 + C_{31}^2)$. At the critical point, $\alpha = \beta = \gamma = \hat C_{44}/2 > 0$, so, by continuity, these coefficients are positive nearby. Locally, we will then have, as a strict inequality, (4.14)
$\hat\phi(C, \theta) \ge \hat\phi(C_{11}, C_{22}, C_{33}, 0, 0, 0, \theta).$
For considerations of equilibria with $C_{12} = C_{23} = C_{31} = 0$, near the critical point, it then suffices to consider only comparison states of diagonal form. To simplify notation, we set (4.15)
$x_1 = C_{11} > 0, \quad x_2 = C_{22} > 0, \quad x_3 = C_{33} > 0, \quad \hat\phi(x_1, x_2, x_3, 0, 0, 0, \theta) = \psi(x_i, \theta),$
(3.4) implying that $\psi$ is a symmetric function of the $x_i$. In dealing with such functions, we* commonly consider them to be expressed as functions of three elementary symmetric functions, commonly taken as the set (4.16)
$I = x_1 + x_2 + x_3, \quad II = x_1x_2 + x_2x_3 + x_3x_1, \quad III = x_1x_2x_3.$
I find it better to use a different set, viz. (4.17)
$I_1 = I, \quad 3I_2 = -y_1y_2 - y_2y_3 - y_3y_1, \quad 2I_3 = y_1y_2y_3,$
where (4.18)
$y_i = x_i - I/3, \quad \sum_{i=1}^{3} y_i = 0.$
* An example is provided by the theory of isotropic elastic materials, with the strain energy considered to be a symmetric function of principal stretches. Here, most workers use the representation, although one can find exceptions.
Either set is expressible in terms of the other, by polynomial relations. We can write (4.19)
$\psi(x_i, \theta) = \chi(I_1, I_2, I_3, \theta),$
where $\chi$ is a continuous function. It is not so easy to see what other properties it inherits, from $\psi$ being of class $C^{(2)}$, etc., at places where two or three of the $x_i$ become equal, i.e. at our cubic and tetragonal configurations. If one is too hasty to make the switch, one can miss some important points. The $y_i$ are roots of the cubic equation (4.20)
$y_i^3 - 3I_2 y_i - 2I_3 = 0.$
Since these must be real, the discriminant of the cubic must be non-negative, which gives the condition (4.21)
$I_3^2 \le I_2^3.$
The equality is taken on if and only if at least two of the $y_i$ are equal. If three are equal, it follows that all vanish, so $I_2 = I_3 = 0$. Thus, we have the characterization (4.22)
tetragonal configuration $\Leftrightarrow I_3^2 = I_2^3 > 0$,
cubic configuration $\Leftrightarrow I_2 = I_3 = 0$.
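The characterization (4.22) can be tried out on sample diagonal configurations. The following is a minimal sketch, assuming the definitions $I_1 = x_1 + x_2 + x_3$, $y_i = x_i - I_1/3$, $3I_2 = -(y_1y_2 + y_2y_3 + y_3y_1)$ and $2I_3 = y_1y_2y_3$; exact rational arithmetic is used so that the equality $I_3^2 = I_2^3$ can be tested without rounding. The function names and the label `"other"` for configurations that are neither cubic nor tetragonal are hypothetical.

```python
from fractions import Fraction

def invariants(x1, x2, x3):
    """Return (I1, I2, I3) for the diagonal entries x_i, following the lead-in definitions."""
    x = [Fraction(v) for v in (x1, x2, x3)]
    i1 = sum(x)
    y = [v - i1 / 3 for v in x]
    i2 = -(y[0] * y[1] + y[1] * y[2] + y[2] * y[0]) / 3
    i3 = y[0] * y[1] * y[2] / 2
    return i1, i2, i3

def classify(x1, x2, x3):
    """Classify a diagonal configuration using the discriminant condition."""
    _, i2, i3 = invariants(x1, x2, x3)
    # Real x_i always satisfy the discriminant inequality I3^2 <= I2^3.
    assert i3 ** 2 <= i2 ** 3
    if i2 == 0 and i3 == 0:
        return "cubic"
    if i3 ** 2 == i2 ** 3:
        return "tetragonal"
    return "other"

print(classify(2, 2, 2), classify(2, 2, 3), classify(3, 2, 2), classify(1, 2, 4))
```

The second and third calls correspond to two of the three tetragonal variants obtained by permuting the diagonal entries; both give the same $I_2$ and $I_3$, since these are symmetric functions.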
Clearly, some of these need to be in the domains of $\psi$ and $\chi$, or we cannot have the transitions, and we hardly expect the domains to include only these. Physically, $\psi$ and $\chi$ are defined only for real configurations, so we see that the domain of $\chi$ is, inherently, not an open set. Its partial derivatives are then not well-defined, at configurations covered by (4.22). The inequality (4.21) will play an important role, in some of our analyses. Because of the problems indicated, I will use $\chi$ only rarely, and then with some care. We can safely eliminate one more variable, by using the change of variables indicated by (4.23)
$z_1 = x_1 + x_2 + x_3 = I, \quad 6z_2 = x_1 + x_2 - 2x_3 = y_1 + y_2 - 2y_3 = -3y_3, \quad 2\sqrt{3}\,z_3 = x_1 - x_2 = y_1 - y_2,$
to get (4.24)
$x_1 = z_1/3 + z_2 + \sqrt{3}\,z_3, \quad x_2 = z_1/3 + z_2 - \sqrt{3}\,z_3, \quad x_3 = z_1/3 - 2z_2.$
A calculation gives (4.25)
$I_2 = z_2^2 + z_3^2, \quad I_3 = (3z_3^2 - z_2^2)\,z_2.$
Also, one can check that the equality in (4.21) holds when (4.26)
$(y_1 - y_2)(y_2 - y_3)(y_3 - y_1) = 0 \Leftrightarrow z_3(z_3^2 - 3z_2^2) = 0,$
giving us the new descriptions of cubic and tetragonal configurations. We have, in particular, (4.27)
cubic configuration $\Leftrightarrow z_2 = z_3 = 0$.
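The algebra behind (4.25) and (4.26) is easy to spot-check numerically. The following is a minimal sketch, assuming the inverse change of variables $x_1 = z_1/3 + z_2 + \sqrt{3}z_3$, $x_2 = z_1/3 + z_2 - \sqrt{3}z_3$, $x_3 = z_1/3 - 2z_2$ and the definitions of $I_2$, $I_3$ in terms of $y_i = x_i - I/3$. The constant $6\sqrt{3}$ relating the two sides of (4.26) is worked out here and is not stated in the text; the function names and the random sampling scheme are hypothetical.

```python
import math
import random

SQRT3 = math.sqrt(3.0)

def x_from_z(z1, z2, z3):
    """Inverse change of variables: diagonal entries x_i from (z1, z2, z3)."""
    return (z1 / 3 + z2 + SQRT3 * z3, z1 / 3 + z2 - SQRT3 * z3, z1 / 3 - 2 * z2)

def invariants(x):
    """Return I1, I2, I3 and the deviators y_i for diagonal entries x."""
    i1 = sum(x)
    y = [v - i1 / 3 for v in x]
    i2 = -(y[0] * y[1] + y[1] * y[2] + y[2] * y[0]) / 3
    i3 = y[0] * y[1] * y[2] / 2
    return i1, i2, i3, y

def max_mismatch(samples=1000, seed=0):
    """Largest absolute deviation from the claimed identities over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        z1, z2, z3 = (rng.uniform(-1.0, 1.0) for _ in range(3))
        i1, i2, i3, y = invariants(x_from_z(z1, z2, z3))
        differences = (y[0] - y[1]) * (y[1] - y[2]) * (y[2] - y[0])
        worst = max(
            worst,
            abs(i1 - z1),
            abs(i2 - (z2 ** 2 + z3 ** 2)),
            abs(i3 - (3 * z3 ** 2 - z2 ** 2) * z2),
            abs(differences - 6 * SQRT3 * z3 * (z3 ** 2 - 3 * z2 ** 2)),
        )
    return worst

print(max_mismatch())
```

The printed value should be at the level of floating-point rounding. Since the product of differences is a nonzero multiple of $z_3(z_3^2 - 3z_2^2)$, the two sides of (4.26) vanish together.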
These variables are examples of order parameters; they vanish in the more symmetric configuration, but not in the others. If we are interested just in the cubic-tetragonal case, the two order parameters do not vary independently. Here, practice would dictate that we use one, $z_2$ being a likely choice. Now, the idea is to solve $\partial\psi/\partial z_1 = 0$, near the critical point, to eliminate $z_1$. Of course, it will vanish at the critical point and we calculate that, there, (4.28)
$12\,\frac{\partial^2\psi}{\partial z_1^2} = \hat C_{11} + 2\hat C_{12} > 0.$
Thus, we can use the implicit function theorem, as indicated by (4.29)
$\frac{\partial\psi}{\partial z_1} = 0 \Leftrightarrow z_1 = \bar z_1(z_2, z_3, \theta),$
where $\bar z_1$ is locally unique, and of class $C^{(1)}$. If we set $z_2 = z_3 = 0$, this gives the cubic branch, extended to $\theta < \bar\theta$, discussed earlier, as well as the stable part for $\theta > \bar\theta$. Schematically, we have (4.30)
cubic branch $\Leftrightarrow z_1 = \bar z_1(0, 0, \theta), \quad z_2 = z_3 = 0,$
and we would be on it, or one of the three tetragonal branches, when $z_2$ and $z_3$ satisfy (4.26). Also, with (4.28), we will have, locally, (4.31)
$\psi(z_i, \theta) \ge \bar\psi(z_2, z_3, \theta) \overset{\mathrm{def}}{=} \psi[\bar z_1(z_2, z_3, \theta), z_2, z_3, \theta].$
Thus, for considerations of equilibrium, we can use $\bar\psi$ in place of $\psi$ or $\hat\phi$. The local uniqueness of $\bar z_1$ and invariance of $\psi$ imply that $\bar z_1$, $\psi$ and $\bar\psi$ are all invariant under a set of transformations which are equivalent to the permutations of $x_i$, viz. (4.32)
$z_2 \to z_2, \quad z_3 \to -z_3,$
(4.33)
$2z_2 \to \sqrt{3}\,z_3 - z_2, \quad 2z_3 \to z_3 + \sqrt{3}\,z_2,$
and (4.34)
$2z_2 \to -\sqrt{3}\,z_3 - z_2, \quad 2z_3 \to z_3 - \sqrt{3}\,z_2.$
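That these three transformations act as interchanges of the $x_i$ can be confirmed on a sample point. The following is a minimal sketch, assuming the relations $x_1 = z_1/3 + z_2 + \sqrt{3}z_3$, $x_2 = z_1/3 + z_2 - \sqrt{3}z_3$, $x_3 = z_1/3 - 2z_2$, with $z_1$ left unchanged by each map; the function names and the sample values are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)

def x_from_z(z1, z2, z3):
    """Diagonal entries x_i from (z1, z2, z3), per the relations in the lead-in."""
    return (z1 / 3 + z2 + SQRT3 * z3, z1 / 3 + z2 - SQRT3 * z3, z1 / 3 - 2 * z2)

def t_432(z2, z3):
    """Transformation (4.32): z2 -> z2, z3 -> -z3."""
    return z2, -z3

def t_433(z2, z3):
    """Transformation (4.33): 2 z2 -> sqrt(3) z3 - z2, 2 z3 -> z3 + sqrt(3) z2."""
    return (SQRT3 * z3 - z2) / 2, (z3 + SQRT3 * z2) / 2

def t_434(z2, z3):
    """Transformation (4.34): 2 z2 -> -sqrt(3) z3 - z2, 2 z3 -> z3 - sqrt(3) z2."""
    return (-SQRT3 * z3 - z2) / 2, (z3 - SQRT3 * z2) / 2

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

z1, z2, z3 = 3.0, 0.2, -0.1          # arbitrary sample point
x = x_from_z(z1, z2, z3)

swaps_x1_x2 = close(x_from_z(z1, *t_432(z2, z3)), (x[1], x[0], x[2]))
swaps_x2_x3 = close(x_from_z(z1, *t_433(z2, z3)), (x[0], x[2], x[1]))
swaps_x1_x3 = close(x_from_z(z1, *t_434(z2, z3)), (x[2], x[1], x[0]))
print(swaps_x1_x2, swaps_x2_x3, swaps_x1_x3)
```

The three maps are reflections of the $(z_2, z_3)$ plane, and since the three transpositions generate all permutations of the $x_i$, any symmetric function of the $x_i$ is invariant under compositions of them as well.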
Since $\bar z_1$ might only be of class $C^{(1)}$, it might appear that $\bar\psi$ is no smoother, but it is of class $C^{(2)}$. To see this, one expands $\bar\psi$ to second order, noting that (4.29) holds at relevant points. Consideration of this and the differential of $\bar z_1$ establishes the assertion. In place of $\bar\psi$, we can use the analog of (4.19),
(4.35)
$\bar\psi = \bar\chi(I_2, I_3, \theta),$
and this involves essentially the same difficulties as before. The reduction to $\bar\psi$ relies on inequalities, etc., not likely to be affected by slight perturbations in the potential. In the remainder of this section, we fix $\theta = \bar\theta$, working on the critical isotherm, so I will not explicitly refer to this argument. From (4.30), we will have (4.36)
$d\bar\psi = 0, \quad \text{at } z_2 = z_3 = 0.$
The condition that we have a critical point translates to (4.37)
$d^2\bar\psi = 0, \quad \text{at } z_2 = z_3 = 0.$
Following LANDAU, in spirit, the primary question is whether the condition (4.38)
$\bar\psi(z_2, z_3) \ge \bar\psi(0, 0)$
can be satisfied by a material, for $z_2$ and $z_3$ small. From (4.25), we can use $\sqrt{I_2}$ as a norm, (4.32) and (4.37) implying that, for $I_2 \to 0$, (4.39)
$\bar\psi(z_2, z_3) - \bar\psi(0, 0) = o(I_2).$
I take it as now being plausible that there are materials for which the reduction to $\bar\psi$ is feasible, (4.39) applies, etc., so it is only necessary to decide whether (4.38) can hold and display suitable insensitivity to small perturbations. It might help to consider a special case, to get some feeling for what is involved. Suppose we start with a function of the form (4.40)
$\bar\psi = a I_2^p,$
with $p > 1$, to comply with (4.39), and $a > 0$, so (4.38) holds, in the strict sense. We can perturb it in various ways, without violating (4.38). Clearly, we can change $a$ and $p$ a little, or introduce perturbations so that (4.40) is replaced by (4.41)
$\bar\psi = a I_2^p + o(I_2^p).$
Using (4.21), we can easily tack on terms of the same order, which might be negative. For example, we could replace (4.40) by (4.42)
with $b > 0$, $b/a$
Recalling the statement, in the Introduction, criticizing LANDAU for getting the wrong values for critical exponents, most physicists would be upset, if I claimed that it was nonsensical to speak of critical exponents, for a material. My view is that we should not rule that they are meaningless for all materials. What then are we talking about, when we are dealing with a critical isotherm? Here, my interpretation is that we assume that $\Delta\tilde\psi$ can be written in the form
(4.43) $\Delta\tilde\psi = \psi_1 + \psi_2,$
where $\psi_2$ is small relative to $\psi_1$, near the critical point, like the term of small order in (4.41), $\Delta\tilde\psi$ denoting the difference in $\tilde\psi$ from the value taken on at the critical point. If we take as a guide the survey by SENGERS, HOCKEN, and SENGERS [25], we are to assume* that $\psi_1$ is a homogeneous function, as indicated by
(4.44) $\psi_1(\lambda^q z_2, \lambda^r z_3) = \lambda^s\,\psi_1(z_2, z_3)$
for $\lambda > 0$, and some choice of the real numbers $q$, $r$ and $s$. Considering the invariance involved, I think it just as reasonable to make the more special assumption described by
(4.45) $q = r = 1.$
For this, (4.40) and (4.41) serve as examples, with $s = 2p$. If it makes sense to think that $s$ has a rather definite value for some material, it makes no sense to think of those terms of lower order as being small perturbations, as I see it. Thus I would not allow them. Then, it is clear that, by my standards, (4.38) can hold, for particular materials. Clearly, it is rather important for (4.38) to be strict, if $\psi_2$ is to be small compared to $\psi_1$. Insofar as our considerations cover the factors involved, it seems, contrary to what LANDAU said, that our cubic-tetragonal transitions might well be observable, but I will probe this a bit more. Mainly, the difference seems to result from differences of opinion concerning what function spaces and topologies are appropriate. I have only roughed out the structure which seems to me to be appropriate. It seems not to fit one of the standard mathematical patterns, and I am not sure how best to reduce it to a precise form. It might be wise to explore less local problems in a similar way, looking for other clues. There are the first-order transitions, for example.
* Something of this kind might be deducible, from the ideas of bifurcation of internal parameters discussed in §2, but I have not explored this much. It would not shock me to find a material for which the assumption is bad. Merely, I accept the idea that the assumption makes sense, for some materials.
5. Bifurcation
I will now forget LANDAU, and turn my attention to POINCARÉ. To have a transition, we need not only that (4.38) hold, but that we have a suitable stable branch, for $\theta < \bar\theta$. According to the Principle of Exchange of Stabilities, as conceived by POINCARÉ [24], we should have two things. One is the existence of a smooth branch which changes from stable to unstable, as our parameter $\theta$ goes through its critical value $\bar\theta$; our cubic branch qualifies. Second, on it, the determinant of the Hessian of $\tilde\psi$ should undergo a change of sign. A calculation indicates that it does not. This happens because $C_{11} - C_{12}$ is a double root, as is clear from the fact that the linear space $V$, given by (4.4), is two-dimensional. Consideration of examples led POINCARÉ to believe that, in cases where the determinant does not change sign, it is unlikely that one will have the kind of branching which we need. Experts in bifurcation theory tend to shy away from such two-dimensional cases, and to assume too much smoothness, for our purposes. It is easily seen that the $V$ corresponding to (4.12) is three-dimensional, so it is offbeat also. My own amateur attempts to cope with this indicate that, despite what POINCARÉ thought, it is very likely that we will have the exchange of stability which we need. Consider the continuous function $\chi(I_2, I_3, \theta)$, referred to in (4.35), near the critical point. From (4.22), we have, at the critical point,
(5.1) $I_2 = I_3 = 0, \quad \theta = \bar\theta.$
Fix $I_2 > 0$ and $\theta < \bar\theta$. Then $\chi$ reduces to a function of $I_3$ which, from (4.21), varies on the closed interval
(5.2) $-I_2^{3/2} \le I_3 \le I_2^{3/2}.$
Then, $\chi$ will take on its minimum on this interval. This might occur at one of the end points. They are similar, so suppose it occurs at that given by
(5.3) $I_3 = -I_2^{3/2}.$
If this applies for all $I_2 > 0$, $\bar\theta - \theta > 0$, at least when these are small, we will have, near the critical point,
(5.4) $\chi(I_2, I_3, \theta) \ge \chi(I_2, -I_2^{3/2}, \theta) \overset{\text{def}}{=} \hat\chi(I_2, \theta).$
From (4.22), we are then dealing with tetragonal configurations. For any fixed value of $I_2$, there are three of these, related by a renumbering of lattice vectors. We can restrict our attention to that of the form (3.5). From (4.15), (4.23) and (4.25), it is described by
(5.5) $z_2 = \sqrt{I_2} > 0, \quad z_3 = 0.$
Now consider $\tilde\psi$ on the cubic path $z_2 = z_3 = 0$, where $\theta < \bar\theta$. There $d\tilde\psi = 0$, and an elementary calculation shows that $d^2\tilde\psi$ has $C_{11} - C_{12}$ as a factor. From (4.11), it is negative, so $d^2\tilde\psi < 0$. Thus $\tilde\psi$ has a local maximum, at such cubic points. Said differently, there is a positive function $f(\theta)$ such that, for $\theta < \bar\theta$,
(5.6) $0 < I_2 < f(\theta) \Rightarrow \tilde\psi(z_2, z_3, \theta) < \tilde\psi(0, 0, \theta);$
we again use the fact that $\sqrt{I_2}$ serves as a norm. For any given value of $I_2$, we pick $z_2$ and $z_3$ to give this value, and satisfy (5.3). This is possible because (4.20) has real roots, when $I_2$ and $I_3$ are given, satisfying (5.3). Thus (5.6) implies that, for $\theta < \bar\theta$,
(5.7) $0 < I_2 < f(\theta) \Rightarrow \hat\chi(I_2, \theta) < \hat\chi(0, \theta).$
Now suppose that (4.38) holds, in a strict enough sense so that
(5.8) $I_2 > 0 \Rightarrow \hat\chi(I_2, \bar\theta) > \hat\chi(0, \bar\theta).$
What we have done is to guarantee the hypotheses of a bifurcation theorem given by ERICKSEN [26, Th. 2]. The conclusion is as follows. Pick any $\varepsilon > 0$, and consider the set $S$ defined by
(5.9) $I_2^2 + u^2 < \varepsilon^2, \quad u = \bar\theta - \theta > 0, \quad I_2 \ge 0.$
Then, for $u$ sufficiently small, there exist functions $\delta(\theta) > 0$ and $g(\theta)$ such that (5.10)
|(g(0)'U)e5'
That is, $\chi$ takes on at least a local minimum, at $I_2 = g(\theta)$, which is as close as we like to the critical point, and in the proper range of temperature. We would like to have a bit more; $z_2$ and $z_3$ should be differentiable functions of $\theta$, for $\theta < \bar\theta$, to have what would be regarded as a normal transition. POINCARÉ was willing to gloss over technical difficulties of this kind, in formulating his Principle of Exchange of Stabilities. By the same standards, then, the cubic-tetragonal transition is quite a likely one, although inconsistent with his Principle. It is not hard to construct potentials which exhibit such a transition, giving smooth branches. It is a little more difficult to analyze cases where the first minimum occurs in the interior of the interval given by (5.2). Then, (5.3) is replaced by an unknown function, giving $I_3$ in terms of $I_2$ and $\theta$, and there are worries about its smoothness. Otherwise, one can analyze this in the same way, with much the same conclusions. This would give transitions from cubic configurations to those of the form
(5.11) $C = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix},$
with $a$, $b$ and $c$ all different, in general. We have not checked all points. For example, we checked that a material could be in cubic equilibrium configurations, but not that it would be in those of tetragonal form, etc. I would not mind using this to help firm up ideas of closeness, etc., but don't see that it helps much, or that it leads us to contradictions. I have noted places where opinions might differ, where there are gaps in theory, etc. so any conclusions drawn are speculative. As I now see it, there is no obvious reason why either type of transition should not be observed. Rather similar assessment of probabilities described by (4.12) indicates that, similarly, the analog of (4.38) can hold. Here, the relevant determinant does change sign on the cubic branch so, according to POINCARÉ's Principle, we should have the necessary branching. Cubic-triclinic transitions would fit here, for example. POINCARÉ's Principle is based on theorems, but there is not a theorem which really proves that we have this bifurcation, for the case at hand. JOHN BALL has shown me that one can deduce a weak result, similar to those mentioned above, so I consider such transitions to be possible. As I interpret him, LANDAU would exclude all of these transitions. Naturally, others must decide which, if either of us, is correct. If nothing else, I think that I have demonstrated that quite subtle differences in reasoning can lead to very different conclusions, concerning exclusion rules. Admittedly, the precision of my reasoning leaves something to be desired. As to LANDAU, the reader can judge for himself. In discussing real crystals, I have overemphasized some very special types, because it was these that first led me to reconsider the whole theory. My reasoning does not rely on a judgment as to whether these particular transitions really are, or are not, of second order. My conclusion is that, a priori, they might be. Whether they really are must then be decided by looking carefully at the observations, and exercising expert judgments. I don't claim to be such an expert, and this is no place for amateurs.
Acknowledgment: This work was supported by National Science Foundation Grant ENG 76-14765 A02.
References
1. LANDAU, L. D., "On the theory of phase transitions", in Collected Papers of L. D. Landau (ed. D. TER HAAR), New York-London-Paris, Gordon and Breach and Pergamon Press 1965.
2. BATTERMAN, B. W., & BARRETT, C. S., Crystal structure of superconducting V3Si, Phys. Rev. Lett. 13, 390-392 (1964).
3. ANDERSON, P. W., & BLOUNT, E. I., Symmetry considerations on martensitic transformations: "ferroelectric" metals?, Phys. Rev. Lett. 14, 217-219 (1965).
4. MICHELSON, A., "Weak Lifschitz condition" and the allowed types of ordering in second-order phase transitions, Phys. Rev. B18, 459-464 (1978).
5. WEGER, M., SILBERNAGEL, B. G., & GREINER, E. S., Effect of stress on the superconducting transition temperature of V3Si, Phys. Rev. Lett. 13, 521-523 (1964).
6. GURTIN, M. E., & WILLIAMS, W. O., Phases of elastic materials, ZAMP 18, 132-135 (1967).
7. CAUCHY, A.-L., Sur les equations differentielles d'equilibre ou de mouvement pour un systeme de points materiels sollicites par des forces d'attraction ou de repulsion mutuelle. Ex. de Math. 4 (1829) = Œuvres, 9(2), 162-173 (1891).
8. STAKGOLD, I., The Cauchy relations in a molecular theory of elasticity, Quart. Appl. Math. 8, 169-186 (1949).
9. ERICKSEN, J. L., Nonlinear elasticity of diatomic crystals, Int. J. Solids Structures 6, 951-957 (1970).
10. PARRY, G. P., On diatomic crystals, Int. J. Solids Structures 14, 283-287 (1978).
11. ERICKSEN, J. L., "Special Topics in Nonlinear Elastostatics", in Advances in Applied Mechanics (ed. C.-S. YIH), vol. 17, New York, Academic Press, 1977.
12. THOMPSON, J. M. T., "Catastrophe Theory and its Role in Applied Mechanics", in Theoretical and Applied Mechanics (ed. W. T. KOITER), Amsterdam-New York-Oxford, North Holland Publishing Co. 1976.
13. GIBBS, J. W., On the equilibrium of heterogeneous substances, Trans. Conn. Acad. 3, 108-248 (1875-6) and 343-524 (1877-8) = Scientific Papers, London, Longmans & Green, 1906.
14. THOMPSON, J. M. T., & HUNT, G. W., "A General Theory of Elastic Stability", London-New York-Sydney-Toronto, John Wiley & Sons, Ltd. 1973.
15. TESTARDI, L. E., & BATEMAN, T. B., Lattice instability of high-transition-temperature superconductors. II. Single-crystal V3Si results, Phys. Rev. 154, 402-410 (1967).
16. ERICKSEN, J. L., On the symmetry of deformable crystals, Arch. Rational Mech. Anal. 72, 1-13 (1979).
17. KELLER, K. R., & HANAK, J. J., Ultrasonic measurements in Single-Crystal Nb3Sn, Phys. Rev. 154, 628-632 (1967).
18. DAFERMOS, C. M., The mixed initial-boundary value problem for the equations of nonlinear one-dimensional viscoelasticity, J. Differential Equat. 6, 71-86 (1969).
19. PARRY, G. P., On the elasticity of monatomic crystals, Math. Proc. Camb. Phil. Soc. 30, 189-211 (1976).
20. PITTERI, M., Reconciliation of local and global symmetries of crystals, pending publication.
21. ERICKSEN, J. L., On the symmetry and stability of thermoelastic solids, J. Appl. Mech. 45, 740-744 (1978).
22. LOVE, A. E. H., "A Treatise on the Mathematical Theory of Elasticity", 4th ed. Cambridge, Cambridge University Press 1927.
23. REHWALD, W., Lattice softening and stiffening of single crystal niobium stannide at low temperatures. Phys. Lett. 27A, 287-288 (1968).
24. POINCARÉ, H., Sur l'equilibre d'une masse fluide animee d'un mouvement de rotation, Acta Math. 7, 259-380 (1885).
25. SENGERS, A. L., HOCKEN, R., & SENGERS, J. V., Critical-point universality and fluids, Physics Today 30, 42-51 (1977).
26. ERICKSEN, J. L., Variations on a bifurcation theorem by Poincaré, to appear in Meccanica.
Department of Mechanics
The Johns Hopkins University
Baltimore, Maryland
(Received October 26, 1979)
CONTINUOUS MARTENSITIC TRANSITIONS IN THERMOELASTIC SOLIDS
J. L. Ericksen
Department of Mechanics, Johns Hopkins University, Baltimore, MD 21218
This paper illustrates how thermoelasticity theory can be used to analyze a rather simple type of phase transition involving a shear critical point.
INTRODUCTION
The purpose of this paper is to provide illustrative examples of one of the mathematically possible kinds of phase transitions that can be analyzed by using thermoelasticity theory. In terms of the theory of phase transitions sketched by Ericksen [1], this is one of those that involve shear critical points. Since it is a continuous transition of the martensitic kind, this leads us into a topic that has also been neglected in thermoelasticity theory, namely, some of the elementary theory of twinning. According to Landau's [2] theory of continuous phase transitions in crystals, the type considered might be observable. Ericksen [3] discusses another type which Landau would exclude, but which it seems can be observed. Some researchers, such as Mnyuck [4], lean toward the view that continuous transitions never occur in nature, although a number do come very close to being continuous. This may or may not be important, since bifurcation theorists tend to treat such situations by perturbing a potential which gives continuous transitions. Absolutely perfect real crystals are hard to come by, and this could be relevant. It is not impossible to treat discontinuous martensitic transitions, but it is trickier, so it is better to treat the simpler case first. We develop a small part of the elementary theory of twinning, applicable to our continuous or comparable discontinuous martensitic transitions.
This material is based on work supported by the National Science Foundation under Grant No. CME 79 11112.
Journal of Thermal Stresses, 4:107-119, 1981. Copyright © 1981 by Hemisphere Publishing Corporation.
BASIC FORMULATION
Physically, we are concerned with a homogeneous thermoelastic body, subjected to a controlled pressure $p$, in a heat bath with temperature $\theta$, which is also controlled. For the present, we are concerned with cases where the body is in stable equilibrium, in homogeneous configurations. Let $X^K$ denote rectangular cartesian material coordinates, in some homogeneous reference configuration, $x^i$ being the spatial coordinates. The deformation gradient $F$, given by the usual formula
$F^i_K = \partial x^i / \partial X^K$    (1)
is then constant for configurations of interest. The body is then described by a Helmholtz free energy per unit mass of the form
$\phi = \phi(C_{KL}, \theta)$    (2)
where
$C_{KL} = F^i_K F^i_L$    (3)
The relevant thermodynamic potential is the Gibbs free-energy density, given by
$\zeta = \phi + p\upsilon$    (4)
which, for $p$ and $\theta$ fixed, should be a minimum for stable equilibrium. Here, $\upsilon$, the specific volume, is given by
$\upsilon = \sqrt{\det\|C_{KL}\|}\,/\rho_0$    (5)
where $\rho_0$ is the mass density in the reference configuration. Following Ericksen [1], we decompose $C$ into a volume factor and a measure of shear deformation $\gamma$, as indicated by
$C_{KL} = (\rho_0\upsilon)^{2/3}\,\gamma_{KL}, \quad \det\|\gamma_{KL}\| = 1$    (6)
writing
$\phi = \phi(\gamma_{KL}, \upsilon, \theta)$    (7)
We then define a subpotential $\psi$, by minimizing $\phi$, for fixed values of $\upsilon$ and $\theta$. Nominally, this gives $\bar\gamma_{KL}(\upsilon, \theta)$, possibly multivalued, such that
$\phi(\gamma_{KL}, \upsilon, \theta) \ge \phi[\bar\gamma_{KL}(\upsilon, \theta), \upsilon, \theta] \overset{\text{def}}{=} \psi(\upsilon, \theta)$    (8)
$\psi$ being identified with the Helmholtz free-energy density commonly used by thermodynamicists. We are concerned with cases where the minimization problem indicated by (8) involves a bifurcation, at some values of $\upsilon$, $\theta$, at which $\bar\gamma_{KL}$ remains continuous.
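As a small numerical illustration of (3), (5) and (6), the following Python sketch splits a given tensor $C$ into the specific volume and a unimodular shear measure. It is not part of the paper: the function names, the sample deformation gradient and the value used for $\rho_0$ are assumptions chosen only for the example.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def cauchy_green(F):
    """C_KL = F^i_K F^i_L, i.e. C = F^T F, as in (3)."""
    return [[sum(F[i][K] * F[i][L] for i in range(3)) for L in range(3)]
            for K in range(3)]


def split_volume_shear(C, rho0):
    """Return (upsilon, gamma) with C = (rho0*upsilon)**(2/3) * gamma."""
    upsilon = math.sqrt(det3(C)) / rho0          # equation (5)
    scale = (rho0 * upsilon) ** (2.0 / 3.0)      # volume factor in (6)
    gamma = [[c / scale for c in row] for row in C]
    return upsilon, gamma


# Hypothetical sample data, not taken from the paper.
RHO0 = 2.0
F_SAMPLE = [[1.10, 0.20, 0.00],
            [0.00, 0.90, 0.10],
            [0.00, 0.00, 1.05]]
C_SAMPLE = cauchy_green(F_SAMPLE)
UPSILON, GAMMA = split_volume_shear(C_SAMPLE, RHO0)

if __name__ == "__main__":
    print("specific volume:", UPSILON)
    print("det(gamma):", det3(GAMMA))
```

Since $\det C = (\rho_0\upsilon)^2$, dividing $C$ by $(\rho_0\upsilon)^{2/3}$ leaves a matrix of unit determinant, which is what the second part of (6) requires.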
We are also concerned with materials for which $\phi$ displays a simple kind of invariance, namely invariance under a 180° rotation about an axis, which we take to be the $X^3$ axis, so
$\phi(\gamma_{11}, \gamma_{12}, \gamma_{22}, \gamma_{33}, \gamma_{13}, \gamma_{23}, \upsilon, \theta) = \phi(\gamma_{11}, \gamma_{12}, \gamma_{22}, \gamma_{33}, -\gamma_{13}, -\gamma_{23}, \upsilon, \theta)$    (9)
We also assume that, for the local analyses of interest, it can be considered to have no other invariances. Looked at more globally, invariance groups for crystals generally contain discrete shears, so they are part of a larger invariance group; but for local analyses the assumptions just made can apply, and we will not get into a long discussion of this. Mathematically, the assumptions could apply to materials that are not single crystals. Some kinds of additional invariance can invalidate assumptions to be made. We might employ the format presented by Ericksen [1] to take care of the constraint indicated in (6), but it is simpler here to eliminate $\gamma_{33}$ by solving the second part of (6) for it, in terms of the remaining components, and to relabel the arguments as follows:
$(u_1, u_2, u_3) = (\gamma_{11}, \gamma_{12}, \gamma_{22}), \quad (v_1, v_2) = (\gamma_{13}, \gamma_{23})$    (10)
Then we have $\phi$ expressible as a (different) function of these, and it is easily verified that
$\phi(u_i, v_\alpha, \upsilon, \theta) = \phi(u_i, -v_\alpha, \upsilon, \theta)$    (11)
where the Greek and Latin indices take on the obvious values; we will apply the summation convention to these. In terms of these variables, a shear critical point occurs at values of the arguments where, first,
$\partial\phi/\partial u_i = 0$    (12)
$\partial\phi/\partial v_\alpha = 0$    (13)
and, second, the inequality
$d^2\phi = \dfrac{\partial^2\phi}{\partial u_i\,\partial u_j}\,du_i\,du_j + 2\,\dfrac{\partial^2\phi}{\partial u_i\,\partial v_\alpha}\,du_i\,dv_\alpha + \dfrac{\partial^2\phi}{\partial v_\alpha\,\partial v_\beta}\,dv_\alpha\,dv_\beta \ge 0$    (14)
holds, but not in the strict sense. Here, $\phi$ is assumed to be relatively smooth. As is discussed by Tisza [7], it is neither wise nor necessary to assume too much differentiability with respect to $\theta$, and the same applies for $\upsilon$. Here, we will ignore such subtleties and assume that $\phi$ is very smooth, as our primary purpose is to illustrate the kinds of reasoning commonly used by those interested in phase transitions, and where such reasoning leads, within the context of a moderately well understood continuum theory. The next step of giving a careful critique and of refining analyses, minimizing smoothness assumptions, etc. will not be attempted here.
STABILITY CONSIDERATIONS
To obtain the kind of transition we are after, we need $\phi$ to have some special properties and, a priori, it might not be too obvious what they are. Landau [2] introduced lines of reasoning that, to put it roughly, select the most likely kinds of potentials. From (10), it follows that
$v_\alpha = 0 \Rightarrow \partial\phi/\partial v_\alpha = 0$    (15)
so naive considerations would suggest that the minimizers involved in (8) are of this kind. Suppose this is the case for $\upsilon$ and $\theta$ in some domain. If so, we would get the corresponding $u_i$ by the more arduous task of solving (11) to determine possible extremals. Then, more likely, we would use (13) as a first check on the possibility that the solutions are at least local minimizers; if the inequality is strict, this much is guaranteed. Texts on linear elasticity or thermoelasticity theory always assume strict inequalities, so, in some sense, this is the normal situation. Furthermore, we can then employ the implicit function theorem to conclude that this $u_i$ is locally unique, and that it depends smoothly on $\upsilon$ and $\theta$. To check this, we would take the differential of (11), holding $v_\alpha = 0$, viz.
$\dfrac{\partial^2\phi}{\partial u_i\,\partial u_j}\,du_j = -\dfrac{\partial^2\phi}{\partial u_i\,\partial\upsilon}\,d\upsilon - \dfrac{\partial^2\phi}{\partial u_i\,\partial\theta}\,d\theta$    (16)
The implicit function theorem then holds if
$\dfrac{\partial^2\phi}{\partial u_i\,\partial u_j}\,du_j = 0 \Rightarrow du_j = 0$    (17)
and we easily see that the strict version of (13) implies this. It might happen that such a configuration is only metastable, giving the possibility of a discontinuous transition to some other state, but we ignore this. What we are interested in is the possibility of a continuous transition, letting $v_\alpha$ become nonzero. For this, we first consider the problem of partial minimization indicated by
$\phi(u_i, v_\alpha, \upsilon, \theta) \ge \phi(\bar u_i, v_\alpha, \upsilon, \theta)$    (18)
at least near $v_\alpha = 0$. We will then be solving (11), employing the analogous considerations of second derivatives, starting from $v_\alpha = 0$. If the corresponding second differential is strictly positive, the implicit function theorem again applies and we get a smooth solution, which is unique, at least locally. Using the fact that it is unique, we then have
$u_i = \bar u_i(v_\alpha, \upsilon, \theta) = \bar u_i(-v_\alpha, \upsilon, \theta)$    (19)
We will assume this and, again, that no problems arise because these might be only relative minima. Consideration of the possibility that this second differential is not strictly positive, using the Landau theory, indicates that it is improbable that it will give a continuous transition. We will not go through this argument. Now, we have reduced the problem to a consideration of the smooth subpotential
$\Phi(v_\alpha, \upsilon, \theta) \overset{\text{def}}{=} \phi(\bar u_i, v_\alpha, \upsilon, \theta) = \Phi(-v_\alpha, \upsilon, \theta)$    (20)
and the next step is to minimize it, for fixed values of $\upsilon$ and $\theta$. Thus, we wish to solve
$\partial\Phi/\partial v_\alpha = 0$    (21)
starting from $v_\alpha = 0$; somewhere, we want to get branching to $v_\alpha \ne 0$, and so we require the implicit function theorem to fail, viz.
$\det\left\|\dfrac{\partial^2\Phi}{\partial v_\alpha\,\partial v_\beta}\right\| = 0$    (22)
This gives one equation, involving $\upsilon$ and $\theta$. Potentials which do this might satisfy the condition at an isolated point in the $\upsilon\theta$ plane, in some region, or along a curve. As Landau would see it, the first or last case could be acceptable; for simplicity, I will assume the last, and so we have some curve, represented by
$f(\upsilon, \theta) = 0$    (23)
on which (21) holds. With respect to the remaining possibility, this might be a fluid phase, and we would guess that Landau would not accept much else. Suppose that the minimizers are $v_\alpha = 0$ on one side, say
$v_1 = v_2 = 0$ for $f < 0$    (24)
with
$\dfrac{\partial^2\Phi}{\partial v_\alpha\,\partial v_\beta}(0, \upsilon, \theta)\,dv_\alpha\,dv_\beta \ge 0$ in $f \le 0$    (25)
the inequality being strict when $f < 0$. On $f = 0$, it is not strict, and it will vanish if and only if $dv_\alpha$ satisfies
$\dfrac{\partial^2\Phi}{\partial v_\alpha\,\partial v_\beta}(0, \upsilon, \theta)\,dv_\alpha = 0$    (26)
Either the corresponding matrix is of rank one, or it vanishes. The latter would correspond to having three functions in the two variables $\upsilon$ and $\theta$ vanish, and Landau would reject this as highly improbable, so it is of rank one. Generally, the idea is that $m$ equations in $n$ unknowns might well have solutions if $m \le n$, but it is unlikely that they will if $m > n$. Some would replace "unlikely" by a stronger term such as "highly improbable." Some, but not all, experts in stability theory also like this notion. This should be taken with a grain of salt, but it is not a bad idea to use as a tool in devising a classification of potentials into different types. Also, with a potential not precisely known, which seemed to be borderline, one might well side with Landau in placing a bet on its status. Fix some point $P$ on $f = 0$, and there pick nonparallel vectors $a_\alpha$ and $b_\alpha$ such that, at $P$,
$\dfrac{\partial^2\Phi}{\partial v_\alpha\,\partial v_\beta}\,a_\beta = 0$    (27)
Using these as a basis, we can write
$v_\alpha = k\,a_\alpha + l\,b_\alpha$    (28)
Near $P$, this change of variables gives
$\hat\Phi(k, l, \upsilon, \theta) = \Phi(k\,a_\alpha + l\,b_\alpha, \upsilon, \theta) = \hat\Phi(-k, -l, \upsilon, \theta)$    (29)
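At $P$, we have, evaluating derivatives at $k = l = 0$,
$\dfrac{\partial^2\hat\Phi}{\partial l^2} = \dfrac{\partial^2\Phi}{\partial v_\alpha\,\partial v_\beta}\,b_\alpha\,b_\beta > 0$    (30)
Thus, as before, we can solve $\partial\hat\Phi/\partial l = 0$ for $l$, giving
$l = \bar l(k, \upsilon, \theta) = -\bar l(-k, \upsilon, \theta)$    (31)
and obtain another smooth subpotential,
$\bar\Phi(k, \upsilon, \theta) \overset{\text{def}}{=} \hat\Phi(k, \bar l, \upsilon, \theta) = \bar\Phi(-k, \upsilon, \theta)$    (32)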
Let primes denote partial derivatives with respect to $k$. Then, at $P$, for $k = 0$, we have
$\bar\Phi' = \bar\Phi'' = \bar\Phi''' = 0$    (33)
so, there,
$\bar\Phi(k) = \bar\Phi(0) + \tfrac{1}{24}\,\bar\Phi^{(iv)}k^4 + \cdots$    (34)
For $P$ to be physically attainable, with $k = 0$, we need $\bar\Phi(0)$ to be a minimum there. This will fail if $\bar\Phi^{(iv)} < 0$. By the now familiar counting of equations, $\bar\Phi^{(iv)} = 0$ will, most likely, hold only at isolated points on $f = 0$. Thus we can pick $P$ so that $\bar\Phi^{(iv)} \ne 0$ and restrict our attention to potentials such that
$\bar\Phi^{(iv)} > 0$    (35)
to get the necessary minimum, or at least a local minimum. By continuity, this will hold near $P$, for $k$ small. Similar considerations indicate that, most likely, $\bar\Phi''$ will change sign as we pass through $f = 0$; so, again locally,
$\bar\Phi'' > 0$ for $f < 0$, $\quad \bar\Phi'' < 0$ for $f > 0$    (36)
This implies that the branch $k = 0$ becomes unstable for $f > 0$. The fact that we have a strict local minimum at $P$ essentially guarantees that we will have some stable branches for $f > 0$, according to a bifurcation theorem given by Ericksen [5]. To estimate their behaviour near $P$, we introduce the approximation
$\bar\Phi(k) \cong \bar\Phi(0) + \tfrac{1}{2}\,\bar\Phi''k^2 + \tfrac{1}{24}\,\bar\Phi^{(iv)}k^4, \quad f > 0$    (37)
Setting $\bar\Phi' = 0$ gives, besides the possibility $k = 0$,
$k^2 = -6\,\bar\Phi''/\bar\Phi^{(iv)}$    (38)
From (35) and (36), this gives real values of $k$ for $f > 0$, values approaching zero at $P$, and values becoming imaginary for $f < 0$. In the jargon of the bifurcation theorists, we have a typical "pitchfork" bifurcation, two stable branches being given by
$k \cong \pm\sqrt{-6\,\bar\Phi''/\bar\Phi^{(iv)}}$    (39)
related by a symmetry relation for the material. In the jargon of other workers, for $f < 0$ we have the austenitic phase; the twin phase for $f > 0$ is called martensitic. To get the Helmholtz free energy $\psi(\upsilon, \theta)$ commonly used by thermodynamicists, indicated by (8), we evaluate the function $\bar\Phi$ at these minima. This gives
$\psi = \bar\Phi(0, \upsilon, \theta)$ for $f < 0$
$\psi = \bar\Phi(0, \upsilon, \theta) - \tfrac{3}{2}\,\big(\bar\Phi''(0, \upsilon, \theta)\big)^2/\bar\Phi^{(iv)}(0, \upsilon, \theta)$ for $f > 0$    (40)
At $P$, this gives a potential that, together with its first derivatives, is continuous, second derivatives having finite discontinuities. To complete the problem, it is necessary to minimize the potential
$\zeta = \psi(\upsilon, \theta) + p\upsilon$    (41)
for fixed values of $p$ and $\theta$. Assuming, as Landau would, that no new difficulties show up here, this should give
$\upsilon = \upsilon(p, \theta)$    (42)
as a continuous function, with jump discontinuities in derivatives on $f = 0$. In the classification described by Mayer and Streeter [6] this is a classical second-order phase transition. Incidentally, it nicely illustrates how a puzzle noted by them can be resolved. In (40), the function indicated for $f > 0$ is also defined for $f < 0$, and it has there a lower value than the function $\psi$. If the extrapolation were meaningful, it would seem that we should not observe our austenitic phase. Of course, it is clear here that the extrapolation is meaningless, giving an imaginary value of $k$. Also, $k$ can be regarded as a quasi-thermodynamic variable, in the sense that Tisza [7] uses this term, so we have a good illustration of a role that such a variable can play. Using a different model, suggested by molecular theories of crystal elasticity, Parry [8] gives an interesting study, illustrating these and other points.
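The pitchfork picture in (37)-(40) can be illustrated with a few lines of Python. This is only a sketch under stated assumptions, not the author's method: it takes the truncated quartic (37) at face value, and it models the sign change (36) by the hypothetical choice that the second derivative equals $-cf$ with a constant $c > 0$. The function names and numerical values are assumptions.

```python
import math


def quartic(k, phi0, phi2, phi4):
    """Truncated expansion (37): phi0 + phi2*k^2/2 + phi4*k^4/24."""
    return phi0 + 0.5 * phi2 * k ** 2 + phi4 * k ** 4 / 24.0


def stable_branches(phi2, phi4):
    """Minimizers of the quartic for phi4 > 0: k = 0, or the pair from (38)."""
    if phi4 <= 0.0:
        raise ValueError("phi4 must be positive, as required by (35)")
    if phi2 >= 0.0:
        return [0.0]                       # austenite: single minimum
    k = math.sqrt(-6.0 * phi2 / phi4)      # equation (38)
    return [-k, k]                         # martensite: twin-related pair


def helmholtz(phi0, phi2, phi4):
    """Value of the quartic at its minimizers, as in (40)."""
    if phi2 >= 0.0:
        return phi0
    return phi0 - 1.5 * phi2 ** 2 / phi4


def phi2_model(f, c=1.0):
    """Hypothetical linear model of the sign change (36): phi2 = -c*f."""
    return -c * f


if __name__ == "__main__":
    PHI0, PHI4 = 0.0, 3.0
    for f in (-0.2, 0.0, 0.2):
        p2 = phi2_model(f)
        print(f, stable_branches(p2, PHI4), helmholtz(PHI0, p2, PHI4))
```

In this sketch the second expression in (40) is used only where (38) gives a real $k$; evaluating it for $f < 0$ would produce a lower number, but one that corresponds to no real configuration, which is the resolution of the puzzle described above.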
TWINNED CONFIGURATIONS
In matrix form, the symmetry assumption described by (8) can be written as
$\phi(P\gamma P^T, \upsilon, \theta) = \phi(\gamma, \upsilon, \theta)$    (43)
where
$P = P^{-1} = P^T = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$    (44)
It was argued before that configurations assumed in austenite are such that $\gamma = \gamma_0$, where
$P\gamma_0 P = \gamma_0$    (45)
Similarly, in martensite, we have two possible values, $\gamma_1$ and $\gamma_2$, related by
$P\gamma_1 P = \gamma_2 \ne \gamma_1$    (46)
and they share the same specific volume. There is then the possibility that, in going from austenite to martensite, part of a sample would undergo the shear $\gamma_1$, another part undergoing $\gamma_2$, giving us a twinned sample. We then have a configuration which is only piecewise homogeneous. However, if the values of $\gamma_1$, $\gamma_2$, etc. are such that they would minimize the Gibbs free energy for a homogeneous sample, then apparently the twinned sample would be equally stable. There then arises the question of whether any conditions should be satisfied at the surface of discontinuity where the twins contact each other, the twin boundary. One might worry about the possibility of an interfacial energy, but we will not. On the basis of some assumptions, we can deduce some conditions. It seems reasonable to assume that changes in $p$ and $\theta$ moving us into the twinned martensitic phase produce a displacement that is continuous at the twin boundary: the calculation is easily modified to allow for some slip. For simplicity, we will assume that the discontinuity is a plane. Actually, one can prove that a smooth surface is planar. To avoid confusion, the reference configuration has been chosen in the austenite phase, with $\upsilon$, $\gamma_1$, and $\gamma_2$ being continuous functions of $p$ and $\theta$ in the martensite. These quantities determine the deformation gradients $F_1$ and $F_2$ only within a rigid rotation. By elementary calculations, we then infer that the two should be related as indicated by
$F_2 = QF_1P, \quad Q^{-1} = Q^T, \quad \det Q = 1$    (47)
where $Q$ describes an undetermined rotation. Trivial integration gives the two deformations as
$x_1 = F_1X + a_1, \quad x_2 = F_2X + a_2$    (48)
where $a_1$ and $a_2$ are arbitrary constants, the inverse deformations being given by
$X_1 = F_1^{-1}(x - a_1), \quad X_2 = F_2^{-1}(x - a_2)$    (49)
For any fixed $p$ and $\theta$, the twin boundary plane will be given by an equation of the form
$n\cdot x + \alpha = 0, \quad n\cdot n = 1$    (50)
where $n$ is its unit normal. The assumed continuity condition is that we should have $X_1 = X_2$, for all $x$ satisfying (49). For this, it is necessary that there exist a vector $a$ such that
$F_2^{-1} = F_1^{-1}(1 + a\otimes n)$    (51)
The remaining condition is an equation that can be regarded as determining $a_2$ in terms of other constants. From (47), the determinants of $F_2$ and $F_1$ have the same value, so we must have
$\det(1 + a\otimes n) = 1 \Rightarrow a\cdot n = 0$    (52)
and this implies that
$(1 + a\otimes n)^{-1} = 1 - a\otimes n$    (53)
Then, taking inverses and using (47), (51) becomes
$F_2 = (1 - a\otimes n)F_1 = QF_1P$    (54)
Now, let
$T = F_1PF_1^{-1}$    (55)
from which it follows from (44) that
$T = T^{-1}$    (56)
With this, we obtain, from (54) and (56),
$T = Q^T(1 - a\otimes n) = (1 + a\otimes n)Q$    (57)
From (44), we can write $P$ in the form
$P = -1 + 2E\otimes E, \quad E = (0, 0, 1)$    (58)
So, inserting this in (55), we get
$T = -1 + 2c\otimes d$    (59)
where
$c = F_1E, \quad d = (F_1^{-1})^TE, \quad c\cdot d = E\cdot E = 1$    (60)
Introduce the axis of rotation, e, viz.
$$Q e = e, \qquad e \cdot e = 1. \tag{61}$$
Operating on e with the second part of (57) gives
$$(n \cdot e)(Q a + a) = 0, \tag{62}$$
and operating on e with the transposed equation gives
$$(a \cdot e)(Q n + n) = 0. \tag{63}$$
If $a \neq 0$ and e is not perpendicular to both a and n, it then follows that Q represents a 180° rotation, representable as
$$Q = -1 + 2 e \otimes e = Q^T, \tag{64}$$
the second part of (57) then reducing to
$$-(Q a) \otimes n = a \otimes (Q n). \tag{65}$$
Also, using (57) and (64), we have
$$T + 1 = 2 e \otimes e - (Q a) \otimes n = 2 c \otimes d. \tag{66}$$
From (65) and (66), an elementary analysis shows that there are two possibilities. Bear in mind that, in (61), e is not distinguished from $-e$, and n is similarly ambiguous. To within such ambiguities, we have
$$\text{i)}\quad n = e = d/\sqrt{d \cdot d}, \qquad a = 2\left(\sqrt{d \cdot d}\; c - e\right),$$
$$\text{ii)}\quad a/\sqrt{a \cdot a} = e = c/\sqrt{c \cdot c}, \qquad \sqrt{a \cdot a}\; n = 2\left(e - \sqrt{c \cdot c}\; d\right). \tag{67}$$
The possibility that a = 0 gives $F_1 = F_2$, which we will not have in the martensite. By an elementary exercise, based on the second part of (52) and on (57), e cannot be perpendicular to both a and n, so (67) must apply. Thus, given $F_1$, we can use (60) to get c and d, (67) to get a and n, and (54) to get $F_2$. It is then a simple matter to fit together the constants $a_1$, $a_2$, etc. to get a continuous displacement, or to allow for some slip. Similar
conditions are encountered in various studies of twinning, but often they are discussed in such a way as to leave the impression that one must consider atomic arrangements, etc., to get them. In the twinning analysis, we have not used the fact that the phase transition is continuous, and the analysis applies equally well to cases where it is not. Clearly, a different, more global stability analysis is needed for the latter, but it would take too long to elaborate on this here. Of more physical interest are analogous transitions associated with loadings not of the hydrostatic form. These are encountered in various newer alloys that can undergo rather large, reversible deformations. Through judicious selection of a stability criterion, some of these can also be reduced to considerations of homogeneous or piecewise homogeneous configurations. The problems are then nontrivial but much like those that have become rather routine, for example, in studies of the stability of structures. Thus, it seems both timely and feasible to get such theory in better shape, and to give serious thought to what can be done with inhomogeneous deformations. We have not belabored kinematical interpretations of (67), etc., but one thing must be noted. Whichever we use, we will have, in obvious notation,
$$C_2 = F_2^T F_2 = (Q F_1 P)^T Q F_1 P = P C_1 P, \tag{68}$$
so the two give the same value of $C_2$, as should be the case if they are to give the same value to the thermodynamic potential. If we label the two as $F_2$ and $F_2'$, we must then have
$$F_2' = \bar{Q} F_2 = Q' F_1 P = \bar{Q} Q F_1 P, \tag{69}$$
where $\bar{Q}$ represents some rigid rotation, $Q'$ and $Q$ being the 180° rotations already discussed. Thus
$$\bar{Q} Q = Q', \qquad \bar{Q} = Q' Q. \tag{70}$$
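The recipe "use (60) to get c and d, (67) to get a and n, and (54) to get $F_2$" can be traced numerically. The sketch below is only an illustration under stated assumptions: the sample gradient `F1` is an arbitrary invertible matrix chosen for the test, not data from the text, the helper names are hypothetical, and only case i) of (67) is implemented. It checks that $a \cdot n = 0$, that $(1 - a \otimes n) F_1 = Q F_1 P$ as in (54), and that $C_2 = P C_1 P$ as in (68).

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def identity_plus_dyad(s, t, u, v):
    """Return s*1 + t*(u (x) v) as a 3x3 list of lists."""
    return [[s * (1.0 if i == j else 0.0) + t * u[i] * v[j] for j in range(3)]
            for i in range(3)]

def inv3(A):
    """Inverse of a 3x3 matrix via cofactors (cyclic-index form)."""
    cof = [[A[(i + 1) % 3][(j + 1) % 3] * A[(i + 2) % 3][(j + 2) % 3]
            - A[(i + 1) % 3][(j + 2) % 3] * A[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    det = sum(A[0][j] * cof[0][j] for j in range(3))
    return [[cof[j][i] / det for j in range(3)] for i in range(3)]

def twin_case_i(F1, E=(0.0, 0.0, 1.0)):
    """Case i) of (67): returns a, n, Q, P and F2 = (1 - a (x) n) F1."""
    c = matvec(F1, E)                        # c = F1 E, from (60)
    d = matvec(transpose(inv3(F1)), E)       # d = F1^{-T} E, from (60)
    norm_d = math.sqrt(dot(d, d))
    e = [x / norm_d for x in d]              # rotation axis, e = d/|d|
    n = e[:]                                 # twin-plane normal, n = e
    a = [2.0 * (norm_d * ci - ei) for ci, ei in zip(c, e)]
    Q = identity_plus_dyad(-1.0, 2.0, e, e)  # 180 degree rotation, (64)
    P = identity_plus_dyad(-1.0, 2.0, E, E)  # P = -1 + 2 E (x) E, (58)
    F2 = matmul(identity_plus_dyad(1.0, -1.0, a, n), F1)
    return a, n, Q, P, F2

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# Hypothetical sample gradient (an assumption made for this check only).
F1 = [[1.1, 0.2, 0.0], [0.0, 0.9, 0.3], [0.1, 0.0, 1.2]]
a, n, Q, P, F2 = twin_case_i(F1)
C1 = matmul(transpose(F1), F1)
C2 = matmul(transpose(F2), F2)

err_an = abs(dot(a, n))                                  # second part of (52)
err_54 = max_abs_diff(F2, matmul(matmul(Q, F1), P))      # (54)
err_68 = max_abs_diff(C2, matmul(matmul(P, C1), P))      # (68)
```

All three residuals should vanish to rounding error for any invertible `F1`; case ii) could be coded the same way by exchanging the roles of c and d.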
If we shift to one of the twins as a reference configuration, so that, say, $F_1 = 1$, then, from (51),
$$F_2^{-1} = 1 + a \otimes n \tag{71}$$
represents a simple shearing deformation, indicating the relation between the twins. Then, T becomes the basic operator, describing the invariance of the thermodynamic potential.
REFERENCES

1. J. L. Ericksen, Some Simpler Cases of the Gibbs Problem for Thermoelastic Solids, J. Therm. Stresses, vol. 4, pp. 13-30, 1981.
2. L. D. Landau, On the Theory of Phase Transitions, in D. Ter Haar (ed.), Collected Papers of L. D. Landau, Gordon and Breach, and Pergamon Press, New York-London-Paris, 1965.
3. J. L. Ericksen, Some Phase Transitions in Crystals, Arch. Ration. Mech. Anal., vol. 73, pp. 99-124, 1980.
4. Yu. V. Mnyukh, Molecular Mechanism of Polymorphic Transitions, Mol. Cryst. Liq. Cryst., vol. 52, pp. 163-200, 1979.
5. J. L. Ericksen, Variations on a Bifurcation Theorem by Poincaré, Meccanica, vol. 13, pp. 3-5, 1979.
6. J. E. Mayer and S. F. Streeter, Phase Transitions, J. Chem. Phys., vol. 7, pp. 1019-1025, 1939.
7. L. Tisza, On the General Theory of Phase Transitions, in R. Smoluchowski, J. E. Mayer, and W. A. Weyl (eds.), Phase Transformations in Solids, Wiley, New York, and Chapman & Hall, London, 1951.
8. G. Parry, On Phase Transitions Involving Internal Strain, to appear in Int. J. Solids Struct.
9. C. S. Barrett and T. B. Massalski, Structures of Metals, 3d ed., McGraw Hill, New York, 1966.

Received October 20, 1980. Request reprints from J. L. Ericksen.
Offprint from "Archive for Rational Mechanics and Analysis", Volume 107, Number 1, 1989, pp. 23-36. © Springer-Verlag 1989. Printed in Germany.

Weak Martensitic Transformations in Bravais Lattices
J. L. ERICKSEN
Dedicated to B. D. Coleman, on the Occasion of His Sixtieth Birthday

1. Introduction
Nonlinear thermoelasticity theory is being used, with some success, to analyze phenomena associated with phase transitions in some crystals, involving a change in crystal symmetry, what are often called Martensitic transformations. Roughly, these are the crystals which are, or at least behave as if they were, Bravais lattices. For such lattices, molecular theories of thermoelasticity imply that the Helmholtz free energy should be invariant under an infinite discrete group. However, workers often use constitutive equations which are invariant only under a finite subgroup, to analyze behavior near transitions. Physicists are likely to use polynomials of as low degree as is feasible, what is sometimes called "Landau Theory", to treat second-order or "weak" first-order transitions. I don't know how to give a precise meaning to the notion of a weak transition. By one rather pragmatic interpretation, a transition is weak if behavior near the transition of interest can be analyzed, satisfactorily, with a free energy function which is invariant only under some finite subgroup. Using this idea, one can deduce some properties which transitions must have, to be considered weak. My purpose is to elaborate this, to try to get some better understanding of what limits the ranges of applicability of what are, really, two versions of thermoelasticity theory. Briefly, if an unloaded crystal is subjected to different temperatures, it is likely to experience phase transitions at isolated values of the temperature. Commonly, these are of first order, involving sudden changes in deformation, the "transition strains", which can be large or small. Twins are likely to appear or disappear if some change in symmetry occurs, among other things. Application of loads is likely to shift the transition temperature and produce other unusual effects even when the loads are quite small. The idea is to use one constitutive equation to cover such deformations and a temperature interval.
Physically, it is reasonable enough to think of its domain as bounded, but it does need to include the phases of interest. Roughly, I will argue that, in assuming that a finite invariance group suffices, we are, effectively, restricting the constitutive equation to a rather special kind of domain. To be weak, a transition must connect two states in one such domain and, as we shall see, it is fairly easy to deduce some necessary conditions for this.
2. Elastic crystals

First, we review some ideas about nonlinear thermoelasticity theory which are implied by various versions of molecular theory, for Bravais lattices. Think of what is, intuitively, a homogeneous configuration of a crystal filling all of space. According to the classical definition of a crystal, it must admit a translation group generated by three linearly independent lattice vectors $e_a$ (a = 1, 2, 3). That is, any vector with integer components relative to this basis translates any point to a physically indistinguishable point. The simplest kinds of configurations are the Bravais lattices. These consist of identical atoms, positioned so that all of their positions are given by applying such a translation group to one position. Molecular theories of elasticity for these tend to be simpler and involve fewer hypotheses than do those used for more complex crystals. Roughly, they deal with relatively short-range interactions. Now, in crystallography, it has long been appreciated that different lattice vectors can generate the same lattice: $e_a$ and $\bar{e}_a$ do provided
$$\bar{e}_a = m^b_a e_b, \tag{2.1}$$
where
$$m = \| m^b_a \| \tag{2.2}$$
is any matrix of integers with
$$\det m = \pm 1. \tag{2.3}$$
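Why (2.3) is the right condition can be seen from the inverse: when $\det m = \pm 1$, the inverse of an integer matrix is again an integer matrix, so each set of lattice vectors is an integer combination of the other and both generate the same lattice. The following sketch illustrates this; the function names and the sample matrix are hypothetical choices made for the illustration, and `m[b][a]` stands for $m^b_a$.

```python
def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def integer_inverse(m):
    """Inverse of an integer 3x3 matrix with det = +1 or -1 (again integer)."""
    d = det3(m)
    if d not in (1, -1):
        raise ValueError("det m must be +1 or -1, as in (2.3)")
    # adj[i][j] is the cofactor of m[j][i]; the inverse is adj / d = d * adj.
    adj = [[m[(j + 1) % 3][(i + 1) % 3] * m[(j + 2) % 3][(i + 2) % 3]
            - m[(j + 1) % 3][(i + 2) % 3] * m[(j + 2) % 3][(i + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[d * adj[i][j] for j in range(3)] for i in range(3)]

def change_basis(m, e):
    """Return e_bar with e_bar[a] = sum_b m[b][a] * e[b], as in (2.1)."""
    return [[sum(m[b][a] * e[b][k] for b in range(3)) for k in range(3)]
            for a in range(3)]

# Hypothetical example: a simple cubic basis and one element of GL(3, Z).
e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
m = [[2, 1, 0], [1, 1, 0], [0, 0, -1]]   # det m = -1
e_bar = change_basis(m, e)
m_inv = integer_inverse(m)
e_back = change_basis(m_inv, e_bar)       # recovers the original basis
```

Because `m_inv` has integer entries, the original vectors are integer combinations of the new ones, which is the sense in which the two bases generate the same lattice.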
We employ the usual summation convention. Such matrices form a representation of the infinite group GL(3, Z), the general linear transformations on the integers, in three dimensions, a group I denote by G. Now, for the Bravais lattices, prescribing a set of lattice vectors fixes the atomic positions to within an unimportant translation of the crystal. From this, it is rather clear that molecular theories of thermoelasticity will imply a constitutive equation for the Helmholtz free energy density of the form
$$\varphi = \varphi(e_a, \theta) = \varphi(m^b_a e_b, \theta), \qquad m \in G, \tag{2.4}$$
G being the infinite group mentioned in the introduction. Another rather obvious implication is that rigidly rotating the crystal won't change $\varphi$, or
$$\varphi(e_a, \theta) = \varphi(R e_a, \theta), \qquad R \in SO(3), \tag{2.5}$$
SO(3) being the group of rotations. Tacitly assumed here is that only the atomic positions matter. Paramagnetic-ferromagnetic transitions do occur in Bravais lattices and, to treat them, one needs to introduce other "internal variables". With more complex crystals, there are more possibilities for encountering other kinds of variables which can play an important role in transitions. JAMES [1] and RIVLIN [2] discuss cases of this kind, associated with certain transformations in quartz. In molecular theory, one strategy is to try to eliminate such variables by minimizing
$\varphi(e_a, \theta)$ satisfying (2.4) so, to this extent, the crystal behaves as if it were a Bravais lattice. This does exclude some kinds of metastable configurations which can be important, physically. Or, if one allows them,
(2.6)
are a possible set of lattice vectors in the deformed configuration. Then, we can introduce a group $\bar{G}$, conjugate to G, consisting of the linear transformations H given by
$$H \in \bar{G} \iff H E_a = m^b_a E_b, \qquad m \in G. \tag{2.7}$$
Then, we have, from above
Associated with any set of lattice vectors $e_a$ is a finite subgroup $L(e_a)$ of G which I like to call the lattice group, defined by
$$L(e_a) = \{ m \in G \mid Q = m^a_b\, e_a \otimes e^b \in O(3) \}. \tag{2.9}$$
Here O(3) denotes the three-dimensional orthogonal group, $e^a$ the reciprocal lattice vectors, the dual basis satisfying
$$e^a \cdot e_b = \delta^a_b. \tag{2.10}$$
Then, for example, the expression for Q in (2.9) is equivalent to the formula
$$Q e_a = m^b_a e_b. \tag{2.11}$$
As is clear from (2.1), there are infinitely many choices of lattice vectors for a given configuration. Applying (2.1) changes the lattice group, by a similarity transformation involving some $m \in G$. We account for this by considering all of these to be in an equivalence class, as is briefly indicated by
$$L(m^b_a e_b) \sim L(e_a), \qquad m \in G. \tag{2.12}$$
We accept the classical view that two Bravais lattices have the same symmetry provided these equivalence classes coincide or, equivalently, if it is possible to choose lattice vectors so that their lattice groups coincide. SCHWARZENBERGER [5] calls the set of lattices having the same symmetry a Bravais lattice. For us, this might be confusing, since the different ones can be related by non-trivial deformations. Instead, I will say that they have the same symmetry type. Associated with $L(e_a)$ is a conjugate group $P(e_a)$ consisting of the orthogonal transformations indicated by (2.9), the point group. Applying a rotation to $e_a$ changes the point group to a conjugate group of O(3), by the obvious similarity transformation. Since this is commonly regarded as a trivial operation, we regard these as being in an equivalence class indicated by
$$P(R e_a) \sim P(e_a), \qquad R \in SO(3). \tag{2.13}$$
Also, it is easy to verify that
$$P(m^b_a e_b) = P(e_a), \qquad m \in G, \tag{2.14}$$
i.e. the point group is independent of the choice of lattice vectors. Following [5], we say that two Bravais lattices belong to the same crystal system provided that their point groups are in the same equivalence class. It is known that there are 7 crystal systems. The corresponding point groups are described in Table 1 in the Appendix. The reader might be more familiar with the number 32, which is obtained by permitting proper subgroups of the 7. These can be realized in some crystal structures, but not Bravais lattices. Elasticians might be familiar with the number 11. The number 32 effectively reduces to 11 because elastic strain energies are automatically invariant under central inversions. Clearly, 4 of these can only be realized in structures more complex than Bravais lattices. From remarks made earlier one should be cautious, in using elasticity theory for them. SCHWARZENBERGER
As is familiar to group theorists, and is discussed in some detail by ERICKSEN [6], any finite subgroup of G is a lattice group for some choices of lattice vectors and, independent of the choice, it determines a point group, to within the equivalence indicated by (2.13). Said differently, configurations having the same symmetry type are in the same crystal system. However, crystals in the same system can have different symmetry types, it being known that there are 14 symmetry types. The breakdown is described in Table 2, in the Appendix. Thus, the distinction between lattice and point groups is of some import. Elasticians are in the habit of glossing it and, sometimes, this is justifiable. One of our aims is to elaborate this. It simplifies matters a bit to replace the groups by subgroups, discarding the elements with negative determinants, which involves no real loss in generality. We denote by G+ the subgroup of G with det m = 1, and similarly define L+(ea) and P+(ea). Also, it follows from (2.6) that
(2.15)
or, with a slight abuse of notation, (2.16)
cp =
(2.17)
m£G,
the same for all Bravais lattices. Configurations of particular interest are the natural states, minimizers of
One of these will generate the infinite set $m^T D m$, for $m \in G^+$. As is perhaps obvious, and easy to verify, these all have the same symmetry type or, if you like, they are symmetry-related minimizers. Reasonably, $\varphi$ can have other minimizers, with a different symmetry type, at isolated values of $\theta$, transition temperatures associated with first-order Martensitic transformations. When we say that a crystal has a certain symmetry, we generally mean that this is the symmetry of a natural state, as I interpret it. Clearly, the assumption that (2.6) holds gives
$$e_a \cdot e_b = F E_a \cdot F E_b = E_a \cdot C E_b, \tag{2.19}$$
where
$$C = F^T F \tag{2.20}$$
is the Cauchy-Green tensor commonly used in nonlinear elasticity theory. With (2.8) and (2.9), we then have
θ), H ∈ Ḡ⁺. (2.21)
What is more conventional is to use (2.21) with $E_a$ taken as lattice vectors for a natural state at some value of $\theta$, and $\bar{G}^+$ replaced by the finite group $P^+(E_a)$.
When we allow the possibility of Martensitic transformations, more than one point group can fit this description. If one exercises some good judgement in making the choice, theory of this kind can produce good predictions for some kinds of transitions, those we loosely defined as weak. This is well illustrated by the work of BALL & JAMES [7], for example.
3. Weak transitions

Rather tacit in the previous discussion is the assumption that the constitutive equation is defined for all kinematically possible configurations, or at least some related by very large deformations. Physically, the theory cannot be expected to apply to such a wide range, so there is some motivation for replacing the previous function by a restriction of it to some smaller domain. This will affect the invariance, since the invariance group must map the domain to itself. What is most likely is that the restriction will be invariant under some subgroup of $G^+$. For our purposes, it would be nice to have it be a finite subgroup. There is a natural way of doing this. It stems from the following fact, discussed briefly by SCHWARZENBERGER [5], in more detail by PITTERI [8]. Bear in mind that, given any symmetric, positive definite matrix D, one can find vectors $e_a$ satisfying (2.15) and they are determined to within an orthogonal transformation. Thus, D determines a lattice group as indicated by
$$L^+(D) = \{ m \in G^+ \mid m^T D m = D \}, \tag{3.1}$$
as follows from (2.10). Then, the fact is that, given $\bar{D} = \bar{D}^T > 0$, there is a neighborhood of it, $N(\bar{D})$, such that, for $m \in G^+$,
$$D \in N(\bar{D}) \;\Rightarrow\; m^T D m \notin N(\bar{D}) \quad \text{if} \quad m \notin L^+(\bar{D}) \tag{3.2}$$
and
$$D \in N(\bar{D}) \;\Rightarrow\; m^T D m \in N(\bar{D}) \quad \text{if} \quad m \in L^+(\bar{D}). \tag{3.3}$$
Clearly, if $\varphi$ is invariant under $G^+$, its restriction to $N(\bar{D})$ will be invariant under the finite group $L^+(\bar{D})$ and not under any larger subgroup of $G^+$. From (3.1)-(3.3), it follows immediately that
$$D \in N(\bar{D}) \;\Rightarrow\; L^+(D) \subset L^+(\bar{D}). \tag{3.4}$$
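Definition (3.1) is directly computable for a given metric. The sketch below is a brute-force illustration, not PITTERI's construction: it searches integer matrices column by column with entries in a small range (the bound 2 is an assumption that suffices for the two metrics used here), and it takes a simple cubic metric as the center and a nearby simple tetragonal one as the perturbed metric. Integer metrics are used so that all comparisons are exact; rescaling a metric does not change its lattice group.

```python
from itertools import product

def quad(D, u, v):
    """u^T D v for integer 3-vectors u, v and a symmetric integer metric D."""
    return sum(u[i] * D[i][j] * v[j] for i in range(3) for j in range(3))

def det_cols(c0, c1, c2):
    """Determinant of the matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))

def lattice_group(D, bound=2):
    """Integer m with det m = 1 and m^T D m = D, entries in [-bound, bound].

    Each element is stored as a tuple of its three columns."""
    vecs = list(product(range(-bound, bound + 1), repeat=3))
    # Column a of m must have the same D-length as the a-th basis vector.
    cands = [[v for v in vecs if quad(D, v, v) == D[a][a]] for a in range(3)]
    group = set()
    for c0 in cands[0]:
        for c1 in cands[1]:
            if quad(D, c0, c1) != D[0][1]:
                continue
            for c2 in cands[2]:
                if quad(D, c0, c2) != D[0][2] or quad(D, c1, c2) != D[1][2]:
                    continue
                if det_cols(c0, c1, c2) == 1:
                    group.add((c0, c1, c2))
    return group

D_cubic = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # simple cubic metric
D_tetra = [[4, 0, 0], [0, 4, 0], [0, 0, 5]]   # simple tetragonal, c/a = sqrt(5)/2

L_cubic = lattice_group(D_cubic)
L_tetra = lattice_group(D_tetra)
```

For these two metrics the tetragonal lattice group is a subgroup of the cubic one, which is the inclusion (3.4) in the situation where the tetragonal metric lies in the neighborhood of the cubic one.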
As I see it, assuming that
One point is that, within $N(\bar{D})$, we can set up a one-to-one correspondence between lattice and point groups, so there is no important distinction between them. To see this, take as a reference lattice vectors $E_a$ such that
$$\| E_a \cdot E_b \| = \bar{D}. \tag{3.5}$$
For any other $D \in N(\bar{D})$, we can, by normalizing the choice of an orthogonal transformation, assume that corresponding lattice vectors $e_a$ satisfy a special version of the Cauchy-Born rule,
$$e_a = U E_a, \qquad U = U^T > 0. \tag{3.6}$$
Then,
$$R \in P^+(e_a) \;\Rightarrow\; R e_a = m^b_a e_b, \qquad m \in L^+(D). \tag{3.7}$$
From (3.4), it then follows that, for the same m, there is a rotation $\bar{R}$ such that
$$m \in L^+(\bar{D}) \;\Rightarrow\; \bar{R} E_a = m^b_a E_b. \tag{3.8}$$
Putting these together, we have
$$R e_a = R U E_a = m^b_a e_b = U (m^b_a E_b) = U \bar{R} E_a$$
or
$$R U = U \bar{R}. \tag{3.9}$$
By an elementary exercise in linear algebra, (3.9) is satisfied if and only if
$$\bar{R} = R, \qquad R^T U R = U \qquad \text{and} \qquad R \in P^+(E_a). \tag{3.10}$$
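The "elementary exercise" behind (3.10) can be read as an application of the uniqueness of the polar decomposition. A brief sketch, writing $\bar{R}$ for the rotation introduced in (3.8) and using only that U is symmetric and positive definite:

```latex
R U = U \bar{R} = \bar{R}\,(\bar{R}^{T} U \bar{R}),
\qquad \bar{R}^{T} U \bar{R} = (\bar{R}^{T} U \bar{R})^{T} > 0
\;\Longrightarrow\;
R = \bar{R}, \quad U = \bar{R}^{T} U \bar{R} = R^{T} U R .
```

Both sides are factorizations of the same invertible tensor into a rotation times a symmetric positive definite tensor, and such a factorization is unique.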
Clearly, this makes $P^+(e_a)$ a subgroup of $P^+(E_a)$, and attaches a unique rotation to each $m \in L^+(E_a)$. So, to answer a question raised earlier, it is in this sense justifiable to gloss the difference between point and lattice groups, within one of our neighborhoods. It is worth noting that, in deducing (3.10), we only used the inclusion of lattice groups indicated in (3.4), a condition which is only necessary for D to be contained in $N(\bar{D})$. Also, the conclusion does not depend much on $\varphi$. Read differently, it says that, if $e_a$ is given and $E_a$ is any other set of similarly oriented lattice vectors with $L^+(e_a) \subset L^+(E_a)$, then, to within a rotation, $E_a = U^{-1} e_a$, where $U^{-1}$ commutes with all $R \in P^+(e_a)$. To (3.10), there is a kind of converse, easily proved by similar arguments. Suppose (3.6) and (3.8) hold. Then
$$R U = U R \;\Rightarrow\; R \in P^+(e_a) \quad \text{and} \quad m \in L^+(e_a). \tag{3.11}$$
If one considers $e_a$ as a natural state, (3.10) describes the restrictions on U which are often considered to apply to stretches produced by thermal expansion, or application of a hydrostatic pressure. Essentially, this is based on assumptions of local uniqueness and smooth dependence of U on $\theta$ or pressure, likely to be good except at phase transitions. The possible forms of U for the various point groups are neatly deduced by COLEMAN & NOLL [9], for example. The shorter list for Bravais lattices is in Table 3, in the Appendix.
In, say, the* BCC-BCT transitions in Indium-Thallium alloys discussed by BALL & JAMES [7], one does not have a crystal, in the strict sense. As with other alloys, a small amount of the alloying element is more randomly spaced. However, the bulk of the atoms do fit a Bravais lattice. The observed transition strains do fit (3.10) with $e_a$ corresponding to BCT natural states, $E_a$ as a BCC natural state. So, the BCT lattice group is contained in the BCC lattice group. The latter is maximal, not contained in any other lattice group, so, from (3.4), we are pretty well forced into identifying $\bar{D}$ with the cubic phase, as is implied above. It is then a problem in kinematics to determine whether the indicated BCT phases lie in such an $N(\bar{D})$. Later, we will discuss some difficulties which could arise here. If the transition strain is small enough, it will be in $N(\bar{D})$ and, in the case at hand, it is quite small. I won't pursue this question. However, PITTERI'S [8] analysis of (3.1) etc. is rather constructive, involving estimates which might be useful in settling some questions of this kind. Assuming the answer is affirmative in the case at hand, one has a fairly good justification for assuming that
$$U_R = R^T U R, \qquad R \in P^+(E_a), \tag{3.12}$$
U^U
producing lattice vectors
$$(e_R)_a = U_R E_a, \tag{3.13}$$
configurations giving the same value to $\varphi$, if only it is invariant under $P^+(E_a)$. In dealing with Martensitic transformations, workers are likely to identify $E_a$ with Austenite, a higher-temperature natural state, the $(e_R)_a$ with lower-temperature, symmetry-related Martensitic natural states. For analysis of twins, likely to form when the Austenite is cooled enough to transform to Martensite, it is important to have such symmetry-related natural states. Suppose $P^+(e_a)$ is a group of order N, with elements $R_i$ (i = 1, ..., N). Then, from (3.10), all these should leave U invariant:
$$R_i^T U R_i = U \quad (\text{no sum on } i). \tag{3.14}$$
Thus, one will get the same $U_R$ for all rotations of the form
$$R_i R, \qquad i = 1, \ldots, N, \tag{3.15}$$
that is, for all rotations in the same right coset of $P^+(e_a)$ relative to $P^+(E_a)$. Also, using (3.11), one can show that, for two rotations in $P^+(E_a)$ to give the same value to $U_R$, they must lie in the same right coset. The number n of symmetry-related phases is then also the number of right cosets and, from the theory of finite groups, this is
$$n = \bar{N}/N, \qquad \bar{N} = \text{order } P^+(E_a). \tag{3.16}$$

* We use the abbreviations of symmetry types given in Table 2 of the Appendix.

If $L^+(E_a)$ is, in turn, a subgroup of a larger lattice group then, in some cases, another choice of $E_a$ could give a larger value of n, a fact which might be helpful in attempting to analyze complex arrangements of twins, etc. Of course, we can get infinitely many, with
(3.17)
Commonly, this is used to describe twinned Martensite, considered as a piecewise homogeneous natural Martensite state, arising from homogeneous Austenite, by a continuous displacement. Different values of R will give different solutions provided they are in different cosets or, equivalently, if their product is not an element of the smaller point group $P^+(e_a)$. My experience is that it is easier to use the last criterion. With Table 1, it isn't very hard to sort this out, for any particular choice of group and subgroup. Clearly, we can apply similar reasoning to get the possible forms of deformations taking any lattice vector in $N(\bar{D})$ to any other. Adding in arbitrary rotations, we will have, say,
$$e^{(1)}_a = R^{(1)} U^{(1)} E_a, \qquad e^{(2)}_a = R^{(2)} U^{(2)} E_a, \tag{3.18}$$
with $U^{(1)}$ and $U^{(2)}$ restricted as in (3.10). Then, we will have
$$e^{(2)}_a = F e^{(1)}_a,$$
with
$$F = R^{(2)} U^{(2)} (U^{(1)})^{-1} R^{(1)T}. \tag{3.19}$$
Possibly, such relations would be useful, when possible choices of D exist, but are not intuitively obvious choices. With these observations, we have found some guidelines for deciding whether a phase transition can reasonably be considered to be weak, covered by a finite group. It does emphasize the need to better understand the structure of neighborhoods N(D). ** As is noted by ZANZOTTO [4], two different definitions of these occur in the literature and are alleged to be equivalent, but are not mathematically equivalent. I disregard that involving conditions that certain directions be rational.
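The coset count (3.16) is easy to check by enumeration. The sketch below is an illustration under stated assumptions: the reference point group is taken to be the 24 cube rotations written as signed permutation matrices, the stretches are hypothetical diagonal matrices with integer entries (chosen only so that comparisons are exact), and the function names are not from the text. It forms $U_R = R^T U R$ as in (3.12) and counts the distinct results.

```python
from itertools import permutations, product

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cubic_rotations():
    """The 24 rotations of the cube: signed permutation matrices with det +1."""
    rots = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            R = [[signs[i] if perm[i] == j else 0 for j in range(3)]
                 for i in range(3)]
            if det3(R) == 1:
                rots.append(R)
    return rots

def conjugate(R, U):
    """R^T U R, returned as a tuple of tuples so that it can go in a set."""
    return tuple(
        tuple(sum(R[k][i] * U[k][l] * R[l][j] for k in range(3) for l in range(3))
              for j in range(3))
        for i in range(3))

def variants(U, group):
    """Distinct symmetry-related stretches U_R = R^T U R, R in the group."""
    return {conjugate(R, U) for R in group}

def stabilizer(U, group):
    """Rotations leaving U invariant, as in (3.14)."""
    target = tuple(tuple(row) for row in U)
    return [R for R in group if conjugate(R, U) == target]

group = cubic_rotations()
U_tet = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # tetragonal stretch, axis k
U_ort = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]   # orthorhombic stretch

n_tet = len(variants(U_tet, group))          # 24 / 8
n_ort = len(variants(U_ort, group))          # 24 / 4
```

With these choices the tetragonal stretch has a stabilizer of order 8 and three variants, and the orthorhombic one a stabilizer of order 4 and six variants, in agreement with $n = \bar{N}/N$.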
4. Examples

First, we consider an example of a Martensitic transformation which is certainly not weak. It is of the BCC-FCC type, which occurs in Iron, for example. These have the same point group, so one might be tempted to try to use a constitutive equation invariant under it. I won't belabor the difficulties encountered if one tries this. Although they are in the same crystal system, they have different symmetry types, so there is no way of choosing lattice vectors so the two lattice groups coincide. These two groups must then generate a larger group. However, there is no larger lattice group, so the larger group must be infinite. It might be a proper subgroup of $G^+$, but it won't simplify matters much to use it instead of $G^+$ as the invariance group. FONSECA [10] discusses possible ways of constructing constitutive functions invariant under $G^+$, but I don't know of an example of a constitutive equation appropriate for a transition of this kind. Further, there are ramifications concerning bifurcation theory. Were equilibrium paths of the BCC and FCC phases to intersect, a continuity argument indicates that, at the intersection, the lattice group would contain both lattice groups, which is impossible. A rather similar situation occurs with transitions of the TRIG-H type. A TRIG point group can be a proper subgroup of the H point group, and the latter is not a proper subgroup of any larger group. Thus, if any point group is appropriate, it must be the H group. From Tables 1 and 2 in the Appendix, allowing the possibility that the two orthonormal bases might differ, we see that at least the two vectors labelled as k must be parallel. Then, from (3.10) and Table 3, it follows that the linear transformation relating them must be of the form
$$U = \alpha 1 + \beta k \otimes k. \tag{4.1}$$
From Table 1, the larger point group contains the rotation $R^{\pi/3}(k)$. It commutes with U so, from (3.11), it should also be in the smaller point group. However, it is not. So the TRIG lattice group is not contained in the H lattice group. Again, because the latter is maximal, the two lattice groups generate an infinite group, so one might as well use $G^+$ as the invariance group. Also, by an argument like that used before, equilibrium paths of these kinds cannot intersect. Another observation is that the deformation associated with a transition of this kind will not be of the form (4.1). Briefly, as above, one can use (4.1) and (3.4) to get elements which should be in the TRIG point group, but aren't. Now consider configurations with lattice vectors of the form
$$e_1 = a i, \qquad e_2 = a j, \qquad e_3 = \frac{a}{2}(i + j) + b k, \tag{4.2}$$
with (i,j, k) an orthonormal basis, a and b positive scalars. For most values of a and b, these are BCT configurations, and linear transformations of the form (4.1) map one into any other. It isn't hard to show that (3.10) is then satisfied, with P+(ea) interpreted as this tetragonal point group. So, it might seem that one might use this as the basic invariance group, for transitions leaving the symmetry unchanged. A quirk arises because some cubic configurations are included. We have BCC when
$$b = a/2, \qquad \text{FCC when} \qquad b = a/\sqrt{2}. \tag{4.3}$$
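As a worked check of how (4.1) acts on the family (4.2), with the basis (i, j, k) held fixed and $\alpha$, $\beta$ as in (4.1):

```latex
U e_1 = \alpha a\, i, \qquad
U e_2 = \alpha a\, j, \qquad
U e_3 = \frac{\alpha a}{2}(i + j) + (\alpha + \beta)\, b\, k ,
\qquad\text{so}\qquad (a, b) \mapsto (a', b') = \bigl(\alpha a,\ (\alpha + \beta)\, b\bigr).
```

Starting from the BCC value $b = a/2$, the image satisfies the FCC condition $b' = a'/\sqrt{2}$ exactly when $(\alpha + \beta)/\alpha = \sqrt{2}$, that is, when the stretch along k exceeds the stretch in the basal plane by the factor $\sqrt{2}$.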
This divides the BCT configurations into 3 disjoint sets, according as $b < a/2$, $a/2 < b < a/\sqrt{2}$ or $b > a/\sqrt{2}$, what SCHWARZENBERGER [5] calls the short, medium and long body-centered tetragonals. It follows from (3.4) that, if $\bar{D}$ is tetragonal, $N(\bar{D})$ cannot contain cubic configurations and, from above, there is no $\bar{D}$ with $N(\bar{D})$ containing both BCC and FCC configurations. This doesn't prove that it is impossible to treat as weak such things as short-long BCT, BCC-long BCT or FCC-short BCT transitions but, with the obvious complications, it is not so clear that it would be advantageous to do so. Actually, FONSECA [10] has done some analysis of cubic-tetragonal transitions, with $\varphi$ invariant under $G^+$, so it is not hopeless to use this approach. With, say, a short-medium BCT transition, the two configurations can be very close to each other, which might have tempted us to take one as $\bar{D}$ and use the tetragonal group. Now, it is not so sure that this can be done and, even if it can, it might be better to use as $\bar{D}$ a nearby cubic configuration, and the associated group. Here, we have discussed classification of the BCT symmetry type into 3 subtypes. SCHWARZENBERGER [5] gives a more satisfactory and comprehensive analysis, using topological ideas to classify all symmetry types into subtypes. Clearly, this is of some help in understanding the structure of $N(\bar{D})$, although it is not a panacea. It is perhaps worth noting that (4.1) can obviously relate the BCC and FCC configurations given by (4.3). It is then what is called "Bain strain" in the metallurgical literature, this being in agreement with the observed deformation associated with such transitions in at least some crystals. One point seems worth mentioning. For SCHWARZENBERGER and other mathematical crystallographers, one set of lattice vectors is as good as any other. Objectively, the X-ray crystallographer is in this position. HARTSHORNE & STUART [11, p. 1] note that different workers have, in fact, opted for distinguishably different choices. For us, this is not quite true, since the choice is linked to macroscopic deformation. For us, all similarly oriented sets of lattice vectors are topologically equivalent, in the sense of being related by linear transformations with positive determinant, and with this is associated, in a natural way, a set of neighborhoods, including those we have called $N(\bar{D})$. SCHWARZENBERGER has a somewhat different view of matters topological, shared by other mathematical crystallographers. From the viewpoint of the crystallographer, the usual picture of a face-centered cubic is nice, making clear what is the point group, and you can figure out where all the atoms are. There is the difficulty that it suggests, wrongly, that the edges of the cube are lattice vectors. There is a similar situation with the body-centered cubic. Try to apply the Cauchy-Born hypothesis to such "lattice vectors", taking the Bain strain as correct, and you will see a pitfall. Enough said. From one point of view, we have compared two kinds of thermoelasticity theory. One, more conventional and thereby more familiar, assumes
surveys by KINDERLEHRER [13, 14], for example. We should, I think, learn to live with the idea that a constitutive equation in one theory is to be regarded as a restriction of one in the other. With the rather simple reasoning used, I have done no more than indicate that this has some rather definite implications.

Appendix

Here, for easy reference, I include some information on point groups, etc., for Bravais lattices. If v is a unit vector, $\theta$ an angle expressed in radians, $R^{\theta}(v)$ denotes the rotation with v as axis, $\theta$ as an angle. Also, i, j and k represent vectors forming an orthonormal basis, $v_n$ denoting the four unit vectors given by
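In Table 1, "point groups" means the groups denoted by $P^+$ earlier, P being the point group generated by $P^+$ and the central inversion $Q = -1$. The groups are numbered according to their order.

Table 1. Point groups for Bravais lattices

Number  Order  Elements
1       1      1
2       2      1, $R^{\pi}(i)$
3       4      1, $R^{\pi}(i)$, $R^{\pi}(j)$, $R^{\pi}(k)$
4       6      1, $R^{\pi}$ about three axes perpendicular to k (60° apart), $R^{2\pi/3}(k)$, $R^{4\pi/3}(k)$
5       8      1, $R^{\pi}(i)$, $R^{\pi}(j)$, $R^{\pi}(k)$, $R^{\pi}((i \pm j)/\sqrt{2})$, $R^{\pi/2}(k)$, $R^{3\pi/2}(k)$
6       12     1, $R^{\pi}(i)$, $R^{\pi}(j)$, $R^{\pi}(k)$, $R^{\pi}((\sqrt{3}\, i \pm j)/2)$, $R^{\pi}((i \pm \sqrt{3}\, j)/2)$, $R^{\pi/3}(k)$, $R^{2\pi/3}(k)$, $R^{4\pi/3}(k)$, $R^{5\pi/3}(k)$
7       24     1, $R^{\pi/2}(i)$, $R^{\pi/2}(j)$, $R^{\pi/2}(k)$, $R^{\pi}(i)$, $R^{\pi}(j)$, $R^{\pi}(k)$, $R^{3\pi/2}(i)$, $R^{3\pi/2}(j)$, $R^{3\pi/2}(k)$, $R^{\pi}((i \pm j)/\sqrt{2})$, $R^{\pi}((j \pm k)/\sqrt{2})$, $R^{\pi}((k \pm i)/\sqrt{2})$, $R^{2\pi/3}(v_n)$, $R^{4\pi/3}(v_n)$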
Different workers do attach different names and symbols to these. With the same numbering and names which are, I think, somewhat familiar to elasticians, we have

Table 2. Systems and symmetry types

Number  System        Symmetry types
1       Triclinic     Triclinic (TRI)
2       Monoclinic    Simple Monoclinic (SM), Centered Monoclinic (CM)
3       Orthorhombic  Simple Orthorhombic (SO), Centered Orthorhombic (CO), Body-centered Orthorhombic (BCO), Face-centered Orthorhombic (FCO)
4       Trigonal      Trigonal (TRIG)
5       Tetragonal    Simple Tetragonal (ST), Body-centered Tetragonal (BCT)
6       Hexagonal     Hexagonal (H)
7       Cubic         Simple cubic (SC), Body-centered cubic (BCC), Face-centered cubic (FCC)
where letters in parentheses denote abbreviations used here. Lattice groups for the different symmetry types depend on the choice of lattice vectors, so it is impossible to list all of them. One can use some rule for picking particular sets of lattice vectors, as SCHWARZENBERGER [5] does, use the point group $P^+$ which fits and calculate $m \in L^+$, using
$$m^a_b = e^a \cdot R e_b, \qquad R \in P^+.$$
The group generated by $L^+$ and $m^a_b = -\delta^a_b$ is L. Again, with the same numbering and i, j, k as in Table 1, we have Table 3, $\alpha$ and $\beta$ being any scalars such that $\alpha > 0$, $\alpha + \beta > 0$.

Table 3. Forms of U satisfying (3.10)

Number   Restriction on U
1        None
2        U has i as eigenvector
3        U has i, j, k as eigenvectors
4, 5, 6  $U = \alpha 1 + \beta k \otimes k$
7        $U = \alpha 1$
Acknowledgement. I thank MARJORIE SENECHAL for bringing to my attention the paper by SCHWARZENBERGER, helping me to get a somewhat better understanding of those neighborhoods. This research was partially supported by AFOSR/XOP and NSF, under NSF grant DMS-8842048.
References
1. JAMES, R. D., The stability and metastability of quartz, in Metastability and Incompletely Posed Problems, IMA Vol. 3 (ed. S. ANTMAN, J. L. ERICKSEN, D. KINDERLEHRER & I. MÜLLER), Springer-Verlag, 1987, 147-176.
2. RIVLIN, R. S., Some thoughts on material stability, Proc. IUTAM Symp. Finite Elasticity (ed. D. E. CARLSON & R. T. SHIELD), Martinus Nijhoff, 1982, 105-122.
3. ERICKSEN, J. L., Multi-valued strain energy functions for crystals, Int. J. Solids Structures 18, 913-916, 1982.
4. ZANZOTTO, G., Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some available experimental results, to appear in Rend. dell'Accad. Lincei.
5. SCHWARZENBERGER, R. L. E., Classification of crystal lattices, Proc. Camb. Phil. Soc. 72, 325-349, 1972.
6. ERICKSEN, J. L., On the symmetry of deformable crystals, Arch. Rational Mech. Anal. 72, 1-13, 1979.
7. BALL, J. M., & JAMES, R. D., Fine phase mixtures as minimizers of energy, Arch. Rational Mech. Anal. 100, 13-52, 1987.
8. PITTERI, M., Reconciliation of local and global symmetries of crystals, J. Elasticity 14, 175-190, 1984.
9. COLEMAN, B. D., & NOLL, W., Material symmetry and thermostatic inequalities in finite elastic deformations, Arch. Rational Mech. Anal. 15, 87-111, 1964.
10. FONSECA, I., Variational methods for elastic crystals, Ph. D. Thesis, University of Minnesota, 1985.
11. HARTSHORNE, N. H., & STUART, A., Practical Optical Crystallography, American Elsevier Publishing Co., 1964.
12. TRUESDELL, C., A First Course in Rational Continuum Mechanics, Vol. I, Academic Press, 1977.
13. KINDERLEHRER, D., Remarks about equilibrium configurations of crystals, in Material Instabilities in Continuum Mechanics (ed. J. M. BALL), Oxford University Press, 1988.
14. KINDERLEHRER, D., Phase transitions in crystals: the analysis of microstructure, to appear in Proc. Int. Colloquium in Free Boundary Problems: Theory and Applications.
Department of Aerospace Engineering & Mechanics, and School of Mathematics, University of Minnesota, Minneapolis
(Received September 9, 1988)
Journal of Elasticity 28: 55-78, 1992. © 1992 Kluwer Academic Publishers. Printed in the Netherlands.
Bifurcation and Martensitic transformations in Bravais lattices*
J.L. ERICKSEN
Department of Aerospace Engineering and Mechanics and School of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA
Received 23 August 1989
Abstract. Behavior of crystalline solids exhibiting shape memory effects seems to be associated with special kinds of Martensitic transformations, near which some linear elastic moduli are small compared to others. This work explores the possibility of interpreting this in terms of special kinds of bifurcations.
1. Introduction
It is rather common for crystals to undergo Martensitic transformations, phase transitions involving some change of symmetry. Near these, alloys exhibiting shape memory effects behave quite differently than do various steels, for example. Roughly, they do seem to behave much more elastically. Currently, nonlinear thermoelasticity theory is being used with some success to analyze near-transition phenomena in such solids, despite the fact that very little is known about the nature of relevant constitutive equations. Some, for example Nakanishi [1], have noted that, in these more elastic materials, some moduli are quite small compared to others, and opined that this is of some import. This and some other kinds of observations suggested to me that, theoretically, their behavior might be associated with rather special kinds of sub-critical bifurcations, involving both stable and unstable branches, and that, indirectly, the observations do supply some information about the unobservable unstable branches. Getting this straight could help us find better constitutive equations, as well as design some interesting experiments. Briefly, my purpose is to elaborate this. While the analyses have some relevance for second-order phase transitions, the interest is more in first-order transitions, wherein the deformation changes in a discontinuous manner as the temperature $\theta$ passes through a critical value in an unloaded crystal, for example.
* This work was supported by AFOSR/XOP and NSF, under NSF grant DMS-8842048.
2. Basic equations
From a viewpoint which is somewhat close to molecular theory, we deal with crystals whose thermodynamic state is determined by the absolute temperature $\theta$ and a set of lattice vector fields $e_a$ ($a = 1, 2, 3$), three linearly independent vector fields. That is, $\phi$, the Helmholtz free energy per unit mass, is expressible as a function of these variables. To make the connection with thermoelasticity theory, we introduce one configuration, with constant lattice vectors $E_a$, as a reference, and employ the Cauchy-Born hypothesis. Briefly, this is the assumption that, if $F$ is the macroscopic deformation gradient relative to this reference, then the vectors $e_a$ given by
\[ e_a = F E_a \tag{2.1} \]
are a possible set of lattice vectors in the deformed configuration. The assumptions best fit the simplest kinds of crystals, the Bravais lattices. These are most easily described by considering the lattice vectors as constant, with the body filling all of space. It consists of identical atoms which, like mass points, can be considered to have no structure, arranged so that their relative position vectors consist of all vectors with integer components relative to some set of lattice vectors. For more complex crystals, one might or might not need more variables to describe their states. Even if not, it is important to us that (2.1) apply to deformations encountered in phase transitions, as well as the twinning phenomena, etc., which tend to be associated with this. Here, my experience agrees with what Zanzotto [2] concluded from his appraisal of observations concerning mechanical twinning etc. That is, it seems fairly safe to assume that (2.1) applies to crystals fitting the description of Bravais lattices. For other kinds of crystals, it seems to apply to some, not to others and, in this, there seems to be no clear pattern. Strictly speaking, alloys are not crystals, in that small amounts of alloying elements tend to be arranged somewhat randomly, instead of periodically. Here, I'll gloss over this difference. While I won't use any particular molecular theory, I will accept some general implications of molecular theories of Bravais lattices. This includes (2.1), which has been the standard assumption for molecular theories of elasticity. There is the usual condition of objectivity, here reasonably interpreted to include invariance with respect to reflections, leading to the conclusion that $\phi$ is expressible as a function of $\theta$ and the scalar products described by the matrix
\[ D = \| e_a \cdot e_b \|, \tag{2.2} \]
i.e.
\[ \phi = \hat{\phi}(D, \theta). \tag{2.3} \]
With (2.1), we have
\[ e_a \cdot e_b = E_a \cdot C E_b, \tag{2.4} \]
where
\[ C = F^T F \tag{2.5} \]
is the usual Cauchy-Green tensor. Thus, with the $E_a$ considered as fixed, we have
\[ \hat{\phi}(D, \theta) = \tilde{\phi}(C, \theta), \tag{2.6} \]
a constitutive equation of the kind commonly considered in thermoelasticity theory. Now, for any given configuration of a Bravais lattice, the lattice vectors can be chosen in infinitely many ways: $e_a$ and $\bar{e}_a$ do so provided
\[ \bar{e}_a = m_a^b e_b, \tag{2.7} \]
whenever
\[ m = \| m_a^b \| \in G = GL(\mathbb{Z}, 3), \tag{2.8} \]
that is, for $m$ any matrix of integers with
\[ \det m = \pm 1. \tag{2.9} \]
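As an informal numerical check of (2.1)-(2.9), the following Python sketch uses hypothetical integer data: the reference vectors `E`, the deformation gradient `F` and the unimodular matrix `m` are illustrative choices, not taken from the text, and the helper names are likewise assumptions. It verifies that the matrix $D$ built from $e_a = F E_a$ agrees with $E_a \cdot C E_b$, and that a change of lattice vectors of the type (2.7) leaves $\det D$, and hence the cell volume, unchanged.

```python
# Minimal sketch with hypothetical integer data; helper names are illustrative.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def metric(vectors):
    """Matrix D of scalar products e_a . e_b, as in (2.2)."""
    return [[dot(a, b) for b in vectors] for a in vectors]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

E = [[1, 0, 0], [1, 2, 0], [0, 1, 3]]   # reference lattice vectors E_a (rows)
F = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]   # deformation gradient, det F = 2 > 0

# Cauchy-Born hypothesis (2.1): e_a = F E_a.
e = [matvec(F, Ea) for Ea in E]
D = metric(e)

# Cauchy-Green tensor (2.5), C = F^T F, and the identity (2.4).
C = [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(3)]
     for i in range(3)]
D_from_C = [[dot(Ea, matvec(C, Eb)) for Eb in E] for Ea in E]

# Change of lattice vectors (2.7): e_bar_a = m_a^b e_b, with det m = +1.
m = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]   # m[a][b] plays the role of m_a^b
e_bar = [[sum(m[a][b] * e[b][i] for b in range(3)) for i in range(3)]
         for a in range(3)]
D_bar = metric(e_bar)

print(D == D_from_C, det3(m), det3(D) == det3(D_bar))
```

Integer data are used so that the comparisons are exact; with floating-point lattice vectors one would compare within a tolerance instead.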
Physically, it seems clear enough that $\hat{\phi}$ should depend only on the configuration, not on the particular choice of lattice vectors used to describe it. This requires the invariance
\[ \hat{\phi}(m^T D m, \theta) = \hat{\phi}(D, \theta), \qquad m \in G. \tag{2.10} \]
With this, $\tilde{\phi}$ acquires a conjugate invariance group $G(E_a)$, consisting of the linear transformations $H$ satisfying
\[ G(E_a) = \{ H \mid H E_a = m_a^b E_b,\ m \in G \}. \tag{2.11} \]
That is,
\[ \tilde{\phi}(H^T C H, \theta) = \tilde{\phi}(C, \theta), \qquad H \in G(E_a). \tag{2.12} \]
For any fixed choice of lattice vectors $e_a$, we have the lattice group,
\[ L(e_a) = \{ m \in G \mid Q = m_a^b\, e_b \otimes e^a \in O(3) \}, \tag{2.13} \]
wherein $O(3)$ is the three-dimensional orthogonal group and $e^a$ denotes the reciprocal lattice vectors, the basis dual to $e_a$. Equivalently, such $Q$ satisfy
\[ Q e_a = m_a^b e_b, \tag{2.14} \]
and these represent a group $P(e_a)$ conjugate to $L(e_a)$, the corresponding point group. As is well-known, these are finite groups. It is easy to verify that applying any fixed orthogonal transformation to $e_a$ does not change $L(e_a)$. Since any symmetric positive definite matrix determines lattice vectors to within orthogonal transformations, we will sometimes abuse notation a bit, writing $L(D)$ instead of $L(e_a)$. Applying to $e_a$ a transformation of the type (2.7) does change $L(e_a)$, to a conjugate group, but does not change $P(e_a)$. Applying an orthogonal transformation does change $P(e_a)$, by the obvious similarity transformation.
Considering changes of symmetry does involve making some rather subtle distinctions. According to the classical theory of symmetry for Bravais lattices, we first divide the crystals into classes. Two crystal configurations belong to the same class provided that their point groups can be made to coincide, by applying some orthogonal transformation to lattice vectors describing one. It is known that seven classes are realized in Bravais lattices.* These classes are then subdivided into fourteen symmetry types, using the lattice groups: two Bravais lattices have the same symmetry type provided one can choose lattice vectors so that their lattice groups coincide. It does follow that those with the same symmetry type belong to the same class. For example, the cubic class contains three symmetry types, simple cubics, body-centered cubics and face-centered cubics. It is certainly consistent with classical views of symmetry to view a phase transition involving two of these as involving a change of symmetry despite lack of any change in the point group, and this is my view.
* Pertinent discussion and references to such matters are given by Schwarzenberger [3].
The classical scheme leaves rather open the choice of lattice vectors. Applying a given deformation to a given set of lattice vectors will, according to (2.1), pick out just one of the infinitely many sets of lattice vectors describing the deformed configuration. From this view, the possibility arises that the lattice might change, although the two configurations still have the same symmetry type. From the classical view, there is then no change of symmetry, although such changes are of some physical interest. This is typical of what occurs in, say, mechanical twinning. A common practice is not to refer to such configurations as being of different symmetry, but to call them twins or symmetry-related configurations. This is also my view, so "change of symmetry" will mean "change of symmetry type".
Here, we will be concerned with only the simplest kinds of equilibria occurring in crystals subject to a hydrostatic pressure $p$, in a heat bath at temperature $\theta$, when these and the crystal configurations are homogeneous. Then, $p$ and $\theta$ are considered as control parameters, and our concern is with the changes in equilibrium configurations which occur when these are varied. For analyzing these, we use the thermodynamic potential
\[ \mathcal{G} = M \phi + p V, \tag{2.15} \]
where $M$ is the (fixed) mass of the crystal and $V$ the volume it has, in whatever configuration is to be considered. As a practical matter, very little is known about what forms such functions should take so, really, we are fishing for clues concerning this. If we introduce some reference configuration with (fixed) reference volume $V_R$, we will have, with the usual assumption that
\[ \det F > 0, \tag{2.16} \]
\[ V = (\det F) V_R = (\det C)^{1/2} V_R, \tag{2.17} \]
with
\[ \mathcal{G} = \tilde{\mathcal{G}}(C, p, \theta) = M \tilde{\phi}(C, \theta) + p (\det C)^{1/2} V_R. \tag{2.18} \]
Like $\tilde{\phi}$, it is invariant under the group $G(E_a)$ referred to in (2.11). There is the alternative of using the description in terms of lattice vectors. Here, one can argue this directly, by assuming that the crystal contains a fixed number $n$ of unit cells, with $m$ the mass of a unit cell. Then
\[ M = mn, \qquad V = n |e_1 \cdot e_2 \wedge e_3| = n (\det D)^{1/2} \tag{2.19} \]
and
\[ \mathcal{G} = \hat{\mathcal{G}}(D, p, \theta) = n \{ m \hat{\phi} + p (\det D)^{1/2} \}, \tag{2.20} \]
this potential, like $\hat{\phi}$, being invariant under the group $G$. Since we should also have
\[ V_R = n |E_1 \cdot E_2 \wedge E_3|, \tag{2.21} \]
it is easy to show that the two expressions for $\mathcal{G}$ are equivalent, assuming that (2.1) applies. With either description, equilibria are, by the usual thermodynamic definition, extremals of $\mathcal{G}$, for example, the values of $D$ satisfying
\[ \partial \hat{\mathcal{G}} / \partial D = 0, \tag{2.22} \]
for any fixed values of $p$ and $\theta$. For them to be stable means that they are absolute minimizers, relative minimizers being metastable. Other equilibria are unstable. We will be interested in all of the three kinds. Also of interest are the second derivatives, schematically indicated by
\[ a(D, p, \theta) = \frac{\partial^2 \hat{\mathcal{G}}}{\partial D\, \partial D}. \tag{2.23} \]
From the invariance of $\hat{\mathcal{G}}$, it follows that, for $m \in G$,
\[ a^{ghkm}(D_{ef}, p, \theta) = a^{abcd}(m_e^r m_f^s D_{rs}, p, \theta)\, m_a^g m_b^h m_c^k m_d^m \tag{2.24} \]
and, from (2.13), we have, trivially,
\[ a(m^T D m, p, \theta) = a(D, p, \theta), \qquad m \in L(D). \tag{2.25} \]
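For reference, a transformation rule of the kind (2.24) can be recovered from the invariance of the potential by two applications of the chain rule. The following sketch writes $\bar{D}_{ab} = m_a^c m_b^d D_{cd}$ for the transformed matrix and uses a hat to mark the potential regarded as a function of $D$; both notations are adopted here only for this derivation.

```latex
\begin{align*}
\hat{\mathcal{G}}(\bar{D}, p, \theta) &= \hat{\mathcal{G}}(D, p, \theta),
  \qquad \bar{D}_{ab} = m_a^c\, m_b^d\, D_{cd}, \\
\frac{\partial \hat{\mathcal{G}}}{\partial D_{gh}}(D, p, \theta)
  &= \frac{\partial \hat{\mathcal{G}}}{\partial \bar{D}_{ab}}(\bar{D}, p, \theta)\,
     m_a^g\, m_b^h, \\
a^{ghkn}(D, p, \theta)
  &= a^{abcd}(\bar{D}, p, \theta)\, m_a^g\, m_b^h\, m_c^k\, m_d^n .
\end{align*}
```

For $m \in L(D)$ one has $\bar{D} = D$, so both sides are evaluated at the same argument, and the last line becomes a set of linear restrictions on the components of $a$ at that configuration.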
Bear in mind the remarks about notation, after (2.13). From this way of looking at it, restrictions on elastic moduli implied by symmetry could be obtained by picking an equilibrium configuration and taking the restriction of (2.24) to its lattice group. With (2.25), this gives a set of linear equations to be satisfied by values of $a$, at this configuration. The common practice goes more as follows. Consider a particular equilibrium configuration, for some particular values $p = p_0$ and $\theta = \theta_0$, and use it as a reference configuration. Then, the fourth order tensor
\[ \mathsf{B} = K\, \frac{\partial^2 \tilde{\mathcal{G}}}{\partial C\, \partial C}(1, p_0, \theta_0) \tag{2.26} \]
is what is commonly considered to be the tensor describing elastic moduli. By the logically equivalent reasoning, it should be mapped to itself by all transformations in the point group for the ground state. Working this out gives the special relations for different crystal classes which are in common usage. In this form, they are the same for different symmetry types which are included in the same class. Partly because of the nature of experiments, it is convenient to think of $p$ and $\theta$ as varying along particular paths, say
\[ p = p(\tau), \qquad \theta = \theta(\tau), \qquad \tau \in [\tau_1, \tau_2], \tag{2.27} \]
with $p(\tau)$ and $\theta(\tau)$ at least continuous functions of $\tau$. With the invariance, one equilibrium configuration generates infinitely many but, since the group is discrete, equilibria are likely to be isolated from each other, for fixed values of $p$ and $\theta$. What seems physically reasonable is the usual picture of continuous equilibrium paths, with one occasionally intersecting others, and it is such possibilities that we will explore. For this purpose, we will consider two kinds of equilibrium paths. One kind is defined by the conditions that
(i) $D = D(\tau)$ is a solution of (2.22), depending continuously on $\tau$, for $\tau \in [\tau_1, \tau_2]$.
(ii) $L(D(\tau))$ is independent of $\tau$, for $\tau \in (\tau_1, \tau_2)$.
Paths of this kind will be referred to as tolerable paths. By continuity, it follows that, for them,
\[ L[D(\tau)] \subset L[D(\tau_i)], \qquad i = 1, 2, \tag{2.28} \]
wherein the inclusion is to be interpreted in the group theoretic sense, the group on the left being a subgroup, perhaps proper, of those on the right. One could state this in a different way, by introducing assumptions of smoothness and the assumption that, for $\tau \in (\tau_1, \tau_2)$,
\[ \ker a(\tau) = \{ A = A^T \mid a(\tau) A = 0 \} \tag{2.29} \]
contains only the zero tensor, and using the implicit function theorem. Then, (ii) follows from the local uniqueness implied by this. Here, $a(\tau)$ is defined in the obvious way, by composition, i.e.
\[ a(\tau) = a(D(\tau), p(\tau), \theta(\tau)). \tag{2.30} \]
Note that our assumptions do permit $a(\tau)$ to have a non-trivial kernel at either end point, $\tau = \tau_1$ or $\tau = \tau_2$. For the other kind of path, we consider the subset which passes the second derivative test for stability. More precisely, a tolerable path will be called nice provided that
\[ B = B^T \neq 0 \ \Rightarrow\ B \cdot [a(\tau) B] > 0, \qquad \tau \in (\tau_1, \tau_2). \tag{2.31} \]
Here and elsewhere, the dot means the usual inner product for symmetric matrices*
\[ A \cdot B = \operatorname{tr} AB. \tag{2.32} \]
By introducing a reference configuration, one can, of course, describe the paths as mappings of the $\tau$-interval to the space of Cauchy-Green tensors, replacing $D(\tau)$ by $C(\tau)$, etc. In conditions involving $a(\tau)$, one then gets the obvious replacement of this by second derivatives of $\tilde{\mathcal{G}}$ with respect to $C$, implied by the chain rule. Granted the usual assumption (2.16), along with (2.1), deformations map lattice vectors to sets with the same orientation. Also, invariance conditions like (2.10) are unaffected by replacing $m$ by $-m$. From this it follows that there is no real loss in generality in replacing the various groups by the subgroups with positive determinant. We will do this, denoting such groups with a plus sign, e.g.
\[ G^+ = \{ m \in G \mid \det m = 1 \}. \tag{2.33} \]
The elements of $P^+(E_a)$ then consist of rotations, denoted by $R$, with such additional markings as might be necessary to distinguish one from another. At least tacitly, we have assumed that $\tilde{\mathcal{G}}$ is at least twice differentiable with respect to $C$. Later, we will also insist that these derivatives be continuous functions of $C$ and $\tau$, at least near the equilibrium paths of interest. Physicists rather expect that thermodynamic potentials will exhibit some rather mild singularities at second-order phase transitions, providing some reason not to assume more smoothness than is needed for the analysis.
* For some purposes, it might be preferable to use a more fundamental interpretation. Clearly $a(\tau)B$ describes a linear functional on symmetric tensors, being then an element of the dual space. So, in (2.32), one can think of $A$ and $B$ as elements of the two dual spaces, using the similar definition of the dot which goes with this.
3. Some useful relations
Any configuration can be described by some set of lattice vectors, which can be taken as the $E_a$ for a reference configuration, at given values of $p$ and $\theta$. With this are associated the (conjugate) groups $L^+(E_a)$ and $P^+(E_a)$. Relative to this reference, the fourth order tensor $\mathsf{B}$ defined by (2.26) is invariant under $P^+(E_a)$. With it, we can define $W$, a quadratic function of symmetric tensors,
\[ 2W(A) = A \cdot \mathsf{B} A, \qquad A = A^T, \tag{3.1} \]
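which is commonly considered as the strain energy function for linear elasticity theory, at least when the ground state is a stable equilibrium configuration. With $P^+(E_a)$, we can decompose the space of symmetric tensors into two orthogonal spaces,
\[ \mathcal{V}(E_a) = \{ A = A^T \mid R^T A R = A,\ \forall R \in P^+(E_a) \} \tag{3.2} \]
and
\[ \mathcal{V}^\perp(E_a) = \{ B = B^T \mid B \cdot A = 0,\ \forall A \in \mathcal{V}(E_a) \}. \tag{3.3} \]
Let the elements of $P^+(E_a)$ be denoted by
\[ R_1, \ldots, R_n \in P^+(E_a). \tag{3.4} \]
Then, if $E$ is any symmetric tensor, we define its average by
\[ \langle E \rangle = \frac{1}{n} \sum_{m=1}^{n} R_m^T E R_m. \tag{3.5} \]
Then, it is easy to verify that
\[ \langle E \rangle \in \mathcal{V}(E_a) \tag{3.6} \]
and
\[ E \in \mathcal{V}^\perp(E_a) \ \Rightarrow\ \langle E \rangle = 0. \tag{3.7} \]
Also, $E$ can always be written as
\[ E = E_1 + E_2, \qquad E_1 \in \mathcal{V}(E_a), \qquad E_2 \in \mathcal{V}^\perp(E_a). \tag{3.8} \]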
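Putting this in (3.1), we get, using the invariance of $\mathsf{B}$ under $P^+(E_a)$,
\[ \begin{aligned} W(R_m^T E R_m) = W(E) &= W(E_1 + R_m^T E_2 R_m) \\ &= W(E_1) + (R_m^T E_2 R_m) \cdot \mathsf{B} E_1 + W(R_m^T E_2 R_m) \\ &= W(E_1) + W(E_2) + (R_m^T E_2 R_m) \cdot \mathsf{B} E_1. \end{aligned} \]
By summing this over $m$ and using (3.7), we then get
\[ W(E) = W(E_1 + E_2) = W(E_1) + W(E_2). \tag{3.9} \]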
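The averaging construction (3.5) and the splitting (3.9) can be checked on a small example. The sketch below is only illustrative: it assumes the tetragonal rotation group generated by a half-turn about $i$ and a quarter-turn about $k$, takes an arbitrary symmetric tensor `E`, and uses a quadratic energy of tetragonal form with hypothetical moduli. All function names are assumptions made for this example.

```python
from fractions import Fraction

def mat_mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def transpose(A):
    return tuple(zip(*A))

def generate_group(generators):
    """Close a set of rotation matrices under multiplication."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = mat_mul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

def average(E, group):
    """<E> = (1/n) sum_m R_m^T E R_m, as in (3.5)."""
    total = [[Fraction(0)] * 3 for _ in range(3)]
    for R in group:
        T = mat_mul(mat_mul(transpose(R), E), R)
        for i in range(3):
            for j in range(3):
                total[i][j] += T[i][j]
    n = len(group)
    return tuple(tuple(x / n for x in row) for row in total)

def inner(A, B):
    """A . B = tr AB for symmetric matrices, as in (2.32)."""
    return sum(A[i][j] * B[j][i] for i in range(3) for j in range(3))

def two_W(A, C11=5, C12=1, C13=2, C33=7, C44=3, C66=4):
    """Twice a tetragonal quadratic energy; the moduli values are hypothetical."""
    return (C11 * (A[0][0]**2 + A[1][1]**2) + 2 * C12 * A[0][0] * A[1][1]
            + 2 * C13 * (A[0][0] + A[1][1]) * A[2][2] + C33 * A[2][2]**2
            + 4 * C44 * (A[0][2]**2 + A[1][2]**2) + 4 * C66 * A[0][1]**2)

R_pi_i = ((1, 0, 0), (0, -1, 0), (0, 0, -1))       # rotation by pi about i
R_half_pi_k = ((0, -1, 0), (1, 0, 0), (0, 0, 1))   # rotation by pi/2 about k
group = generate_group([R_pi_i, R_half_pi_k])

E = ((1, 2, 3), (2, 5, 4), (3, 4, 7))              # arbitrary symmetric tensor
E1 = average(E, group)                             # invariant part
E2 = tuple(tuple(E[i][j] - E1[i][j] for j in range(3)) for i in range(3))

print(len(group), E1, inner(E1, E2), two_W(E) == two_W(E1) + two_W(E2))
```

Exact rational arithmetic is used so that the orthogonality of the two parts and the additivity of the energy can be tested by equality rather than within a tolerance.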
In terms more physical, the "interaction energy" between infinitesimal strains lying in the two subspaces vanishes. It is natural to think that this result should be known, but I have not yet found a reference. With this, it is useful to introduce the two restrictions
\[ W_1(E) = \text{restriction of } W(E) \text{ to } \mathcal{V}(E_a), \tag{3.10} \]
and
\[ W_2(E) = \text{restriction of } W(E) \text{ to } \mathcal{V}^\perp(E_a). \tag{3.11} \]
In these terms, for equilibria occurring on a nice branch, we have
\[ W(E) \geq 0 \iff W_1(E) \geq 0 \ \text{ and } \ W_2(E) \geq 0, \tag{3.12} \]
where, on the domains of these functions, each of the inequalities holds in the strict sense. Concerning another matter, suppose that we have two sets of lattice vectors with the same orientation, such that the lattice group of one is a subgroup of that of the other. Taking, say, that with the larger group as reference, we have
\[ L^+(e_a) \subset L^+(E_a), \tag{3.13} \]
with $e_a$ denoting the other set of lattice vectors. Also the two sets determine a value $F_0$ of $F$, satisfying (2.1) and (2.16), i.e.
\[ e_a = F_0 E_a. \tag{3.14} \]
We have the polar decomposition
\[ F_0 = R_0 U_0, \qquad U_0 = U_0^T > 0, \tag{3.15} \]
with $R_0$ a rotation, and can then define lattice vectors $e_a^0$ by
\[ e_a^0 = U_0 E_a. \tag{3.16} \]
Since they differ from $e_a$ by a rotation, we will have
\[ L^+(e_a^0) = L^+(e_a). \tag{3.17} \]
With this normalization of rotations, we have
\[ P^+(e_a^0) \subset P^+(E_a) \tag{3.18} \]
and, with
\[ C_0 = F_0^T F_0 = U_0^2, \tag{3.19} \]
we also have that
\[ R \in P^+(E_a) \ \text{ and } \ R^T C_0 R = C_0 \iff R \in P^+(e_a^0), \tag{3.20} \]
according to results obtained by Ericksen [4]. Also noted there is a kind of converse to (3.20). That is, if $e_a$ and $E_a$ are two sets of lattice vectors such that
\[ e_a = U E_a, \qquad U = U^T > 0, \qquad P^+(e_a) \subset P^+(E_a), \tag{3.21} \]
then
\[ R^T U R = U,\ \forall R \in P^+(e_a) \ \Rightarrow\ L^+(e_a) \subset L^+(E_a). \tag{3.22} \]
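A relation of the kind (3.20) is easy to explore numerically. In the following sketch, which is an illustration under stated assumptions rather than part of the original analysis, the reference lattice is taken to be simple cubic with $E_a$ along $i$, $j$, $k$, so that $P^+(E_a)$ consists of the 24 signed permutation matrices of determinant one; the three tensors `C0` are hypothetical diagonal examples. The function `stabilizer` collects the rotations satisfying $R^T C_0 R = C_0$.

```python
from itertools import permutations, product

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def cubic_rotations():
    """The 24 cubic rotations, written as signed permutation matrices."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            R = [[signs[i] if j == perm[i] else 0 for j in range(3)]
                 for i in range(3)]
            if det3(R) == 1:
                rotations.append(R)
    return rotations

def conjugate(R, C):
    """Return R^T C R."""
    return [[sum(R[k][i] * C[k][l] * R[l][j]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def stabilizer(C0, group):
    """Rotations R in the group with R^T C0 R = C0."""
    return [R for R in group if conjugate(R, C0) == C0]

P_plus = cubic_rotations()
orders = {
    "cubic": len(stabilizer([[2, 0, 0], [0, 2, 0], [0, 0, 2]], P_plus)),
    "tetragonal": len(stabilizer([[1, 0, 0], [0, 1, 0], [0, 0, 4]], P_plus)),
    "orthorhombic": len(stabilizer([[1, 0, 0], [0, 2, 0], [0, 0, 3]], P_plus)),
}
print(orders)
```

The three subgroup orders found this way match the orders of the corresponding point groups of rotations (24, 8 and 4).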
In particular, $e_a$ and $E_a$ could be taken as two configurations on the same tolerable path, implying that their lattice groups coincide. In the sense indicated, their point groups coincide, and (3.20) gives restrictions on the relative deformation, implied by symmetry. Commonly, for example in the work of Coleman and Noll [5], (3.20) is used to estimate the restrictions on deformations produced by changes of pressure and/or temperature in crystals, with various symmetries. Specialization of such results to Bravais lattices is given by Ericksen [4], who also discusses examples where (3.21) holds, but (3.22) does not. That is, tacit assumptions concerning relations between lattice groups underlie such results.
4. Bifurcations with no change of symmetry
Suppose that, for some control path parametrized by $\tau$, we have two tolerable branches $D(\tau)$ and $\bar{D}(\tau)$, defined for $\tau < \tau_2$, at least near $\tau_2$, and meeting at $\tau_2$, so that
\[ D(\tau_2) = \bar{D}(\tau_2) = D_2, \tag{4.1} \]
say. We first consider the rather special case where
\[ L(D(\tau)) = L(\bar{D}(\tau)) = L(D_2), \tag{4.2} \]
one of the ways to have a bifurcation involving no change of symmetry. Picking as a reference lattice vectors $E_a$ compatible with $D_2$, we will have two Cauchy-Green tensors describing the two paths, $C(\tau)$ and $\bar{C}(\tau)$, with
\[ C(\tau_2) = \bar{C}(\tau_2) = 1. \tag{4.3} \]
From (3.20), they must satisfy
\[ R^T C(\tau) R = C(\tau), \qquad R^T \bar{C}(\tau) R = \bar{C}(\tau), \qquad \forall R \in P^+(E_a) \tag{4.4} \]
and, for $p$ and $\theta$ on the control path, the equilibrium equation
\[ \partial \tilde{\mathcal{G}} / \partial C = 0. \tag{4.5} \]
Of course, we wish to exclude the possibility that the two coincide: we assume that*
\[ \tau \neq \tau_2 \ \Rightarrow\ C(\tau) \neq \bar{C}(\tau), \tag{4.6} \]
so the tensors given by
\[ E(\tau) = |C(\tau) - \bar{C}(\tau)|^{-1} [C(\tau) - \bar{C}(\tau)] \]
are symmetric tensors contained in the set of symmetric tensors with norm equal to one, a compact set. Thus, the set has at least one limit point $E$, with
\[ |E| = 1. \tag{4.7} \]
* One could weaken this by assuming that there is a sequence of values of $\tau$ approaching $\tau_2$, for which the two differ.
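Here, the norm, induced by the inner product (2.32), is invariant under the orthogonal group, in particular under $P^+(E_a)$. From this and (4.4) it follows that
\[ E(\tau) \in \mathcal{V}(E_a), \tag{4.8} \]
the space defined by (3.2) and, by continuity, we will also have
\[ E \in \mathcal{V}(E_a). \tag{4.9} \]
Now, let $B$ be any fixed symmetric tensor and consider
\[ f(\sigma, \tau) = B \cdot \frac{\partial \tilde{\mathcal{G}}}{\partial C} [\sigma C(\tau) + (1 - \sigma) \bar{C}(\tau), p(\tau), \theta(\tau)], \tag{4.10} \]
for $\sigma \in [0, 1]$. Here, it is important that the indicated convex combination of $C$ and $\bar{C}$ be in the domain of $\tilde{\mathcal{G}}$. It is symmetric and positive definite, and, from (4.3), for such $\sigma$,
\[ \lim_{\tau \to \tau_2} [\sigma C(\tau) + (1 - \sigma) \bar{C}(\tau)] = 1. \tag{4.11} \]
We need to have it in this domain only for $\tau$ near $\tau_2$, so the assumption is really not restrictive. Now, from (4.5) and (4.10),
\[ f(0, \tau) = f(1, \tau) = 0, \]
so by the mean value theorem,
\[ \exists\, \sigma^* \in (0, 1) \ \ni\ \frac{\partial f}{\partial \sigma}(\sigma^*, \tau) = 0, \tag{4.12} \]
it being understood that $\sigma^*$ depends on $\tau$. Dividing this by $|C(\tau) - \bar{C}(\tau)|$, we get
\[ 0 = B \cdot \frac{\partial^2 \tilde{\mathcal{G}}}{\partial C\, \partial C} [\sigma^* C(\tau) + (1 - \sigma^*) \bar{C}(\tau), p(\tau), \theta(\tau)]\, E(\tau). \tag{4.13} \]
Now let $\tau \to \tau_2$, taking a subsequence if necessary to get $E(\tau)$ to converge to $E$. Bearing in mind (2.26), one gets, in the limit,
\[ B \cdot \mathsf{B} E = 0. \]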
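The step from (4.12) to (4.13) can be made explicit. The sketch below writes $C_\sigma = \sigma C(\tau) + (1 - \sigma)\bar{C}(\tau)$, an abbreviation used only here, marks the second branch with an overbar, and marks the potential regarded as a function of $C$ with a tilde; these notational choices are assumptions of the sketch.

```latex
\begin{align*}
\frac{\partial f}{\partial \sigma}(\sigma, \tau)
  &= B \cdot \frac{\partial^2 \tilde{\mathcal{G}}}{\partial C\, \partial C}
     [C_\sigma, p(\tau), \theta(\tau)]\, [C(\tau) - \bar{C}(\tau)], \\
0 = \frac{1}{|C(\tau) - \bar{C}(\tau)|}\,
    \frac{\partial f}{\partial \sigma}(\sigma^*, \tau)
  &= B \cdot \frac{\partial^2 \tilde{\mathcal{G}}}{\partial C\, \partial C}
     [C_{\sigma^*}, p(\tau), \theta(\tau)]\, E(\tau).
\end{align*}
```

Since $C_{\sigma^*} \to 1$ by (4.11), it is the assumed continuity of the second derivatives near the equilibrium paths that permits passage to the limit.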
Given the arbitrariness of $B$, we must then have
\[ \mathsf{B} E = 0 \ \text{ for some } E \neq 0 \text{ in } \mathcal{V}(E_a), \tag{4.14} \]
so $\ker \mathsf{B}$ is certainly not the null set. From (3.1), we then have
\[ W(E) = 0. \tag{4.15} \]
For cases of interest to us, at least one of the paths is nice, (2.31) being satisfied on it. It then follows, by continuity, that
\[ W(A) \geq 0, \tag{4.16} \]
although this cannot hold in the strict sense. The mathematical possibilities are numerous. It is my intuitive judgment that those most likely to be of physical interest are those of
\[ \text{Type } \bar{1}: \ \ker \mathsf{B} \text{ is a one-dimensional space}, \tag{4.17} \]
which does force it to be contained in $\mathcal{V}(E_a)$. One could cover other possibilities by defining Type $\bar{2}$ as those for which this linear space has dimension 2, etc. However, I'll not try to analyze these other possibilities. Here, the overbars are added to indicate that these definitions are only to be accepted temporarily. Later, we will introduce a different definition which I consider to be better, which reduces to (4.17) for the particular cases considered here. Given (4.16) and (3.12), it is clear enough that, for the Type $\bar{1}$ possibilities, we have
\[ W_1(E) \geq 0 \text{ is not strict, but } W_2 > 0 \text{ is}. \tag{4.18} \]
We consider two examples which are possible for Bravais lattices. The first is of the cubic-cubic kind. The governing point group $P^+(E_a)$ is of order 24, with a set of generators which can be taken as
\[ R^{\pi}(i), \quad R^{\pi}(j), \quad R^{2\pi/3}[3^{-1/2}(i + j + k)]. \tag{4.19} \]
Here and in the following, $R^{\theta}(v)$ denotes the rotation with angle $\theta$, expressed in radians, $v$ denoting a unit vector which is the axis of rotation. The vectors $i$, $j$, $k$ will denote an orthonormal basis, suitably related to directions intrinsic to crystals. For this case
\[ A \in \mathcal{V}(E_a) \iff A \propto 1, \tag{4.20} \]
so we must have $\ker \mathsf{B} = \mathcal{V}(E_a)$. In terms of the classical labeling of elastic moduli used by Love [5] and many experimentalists, the linear elastic strain energy function relative to the above basis is of the form
\[ 2W(A) = C_{11}(A_{11}^2 + A_{22}^2 + A_{33}^2) + 2C_{12}(A_{11}A_{22} + A_{22}A_{33} + A_{33}A_{11}) + 4C_{44}(A_{12}^2 + A_{23}^2 + A_{31}^2), \tag{4.21} \]
where the $C$'s are moduli, not to be confused with components of the tensor $C$. Of course, this is deduced using the equivalent of equations implied by (2.24) and (2.25). By easy calculation, the equation to be satisfied at bifurcation is
\[ C_{11} + 2C_{12} = 0, \tag{4.22} \]
this combination being proportional to the bulk modulus. From (4.20), $\mathcal{V}^\perp(E_a)$ consists of all traceless tensors. Calculating $W$ for these, using $\operatorname{tr} A = 0$ to eliminate $A_{33}$, we get
\[ W_2(A) = (C_{11} - C_{12})(A_{11}^2 + A_{22}^2 + A_{11}A_{22}) + 2C_{44}(A_{12}^2 + A_{13}^2 + A_{23}^2), \tag{4.23} \]
which should be strictly positive, i.e.
\[ C_{11} - C_{12} > 0, \qquad C_{44} > 0, \tag{4.24} \]
wherein we could replace $-C_{12}$ by $C_{11}/2$. On a nice cubic branch, the bulk modulus should be positive, $C_{11} + 2C_{12} > 0$. By the rather heuristic kind of reasoning which is often used in discussions of such matters, it might well vanish on some curve in the $p$-$\theta$ plane, marking a limit of metastability for a cubic phase. Then, in a bifurcation joining a nice branch to another, it is more likely that one will cross this curve, instead of becoming tangent and veering back. This makes the other branch unstable, with the bulk modulus becoming negative on it. Also, if the moduli satisfy some other equation, this is likely to occur only at isolated values of $(p, \theta)$. Then, the idea is that an arbitrary control path is more likely to intersect a particular curve than to pass through a particular point. Similar reasoning applies to the other kinds of bifurcations here considered. As another example, we consider a bifurcation of the tetragonal kind. Here, the governing point group is of order eight, with
\[ R^{\pi}(i), \quad R^{\pi/2}(k) \tag{4.25} \]
as a possible set of generators. Then, $\mathcal{V}(E_a)$ is two-dimensional, of the form
\[ \operatorname{Diag}(a, a, b) \in \mathcal{V}(E_a), \tag{4.26} \]
with $a$ and $b$ arbitrary. Also, $\mathcal{V}^\perp(E_a)$ consists of matrices of the form
\[ A = \begin{pmatrix} c & d & e \\ d & -c & f \\ e & f & 0 \end{pmatrix} \in \mathcal{V}^\perp(E_a), \tag{4.27} \]
where the entries are again arbitrary. The linear elastic strain energy now takes the form
\[ 2W = C_{11}(A_{11}^2 + A_{22}^2) + 2C_{12}A_{11}A_{22} + 2C_{13}(A_{11} + A_{22})A_{33} + C_{33}A_{33}^2 + 4C_{44}(A_{13}^2 + A_{23}^2) + 4C_{66}A_{12}^2. \tag{4.28} \]
Evaluating this on $\mathcal{V}(E_a)$, we get
\[ 2W_1(A) = 2(C_{11} + C_{12})a^2 + 4C_{13}ab + C_{33}b^2. \tag{4.29} \]
From above, this should be non-negative, and vanish for a one-dimensional subspace, giving the conditions
\[ C_{11} + C_{12} > 0, \qquad C_{33} > 0, \qquad 2C_{13}^2 = (C_{11} + C_{12})C_{33}, \tag{4.30} \]
the latter replacing the usual inequality
\[ 2C_{13}^2 < (C_{11} + C_{12})C_{33} \tag{4.31} \]
which should apply on nice paths. By evaluating $W$ on $\mathcal{V}^\perp(E_a)$, we get
\[ W_2(A) = (C_{11} - C_{12})c^2 + 2C_{44}(e^2 + f^2) + 2C_{66}d^2, \tag{4.32} \]
which should be strictly positive, i.e.
\[ C_{11} > C_{12}, \qquad C_{44} > 0, \qquad C_{66} > 0. \tag{4.33} \]
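The tetragonal conditions can be illustrated with numbers. In the sketch below, the moduli are hypothetical values chosen so that the degeneracy condition $2C_{13}^2 = (C_{11} + C_{12})C_{33}$ holds, and the expressions used for the Young's moduli are the standard ones obtained by inverting the stiffness relations of linear elasticity for a uniaxial stress; they are supplied here as an assumption, not quoted from the text.

```python
from fractions import Fraction

def two_W1(a, b, C11, C12, C13, C33):
    """Twice the energy on Diag(a, a, b): 2(C11+C12)a^2 + 4 C13 ab + C33 b^2."""
    return 2 * (C11 + C12) * a**2 + 4 * C13 * a * b + C33 * b**2

def delta(C11, C12, C13, C33):
    """(C11 + C12) C33 - 2 C13^2, positive on nice paths, zero at bifurcation."""
    return (C11 + C12) * C33 - 2 * C13**2

def youngs_modulus_k(C11, C12, C13, C33):
    """Young's modulus for uniaxial stress along k."""
    return Fraction(delta(C11, C12, C13, C33), C11 + C12)

def youngs_modulus_i(C11, C12, C13, C33):
    """Young's modulus for uniaxial stress along i (or j)."""
    return Fraction((C11 - C12) * delta(C11, C12, C13, C33),
                    C11 * C33 - C13**2)

# Hypothetical moduli at bifurcation: 2 * 4^2 = (3 + 1) * 8.
critical = dict(C11=3, C12=1, C13=4, C33=8)
# Hypothetical moduli on a nice path: 2 * 3^2 < (3 + 1) * 8.
nice = dict(C11=3, C12=1, C13=3, C33=8)

# Null direction of W1 at bifurcation: (C11 + C12) a + C13 b = 0.
a, b = critical["C13"], -(critical["C11"] + critical["C12"])

print(two_W1(a, b, **critical), youngs_modulus_k(**critical),
      youngs_modulus_i(**critical))
print(two_W1(a, b, **nice), youngs_modulus_k(**nice), youngs_modulus_i(**nice))
```

With the critical values, the energy vanishes along the null direction and both Young's moduli vanish; with the second set, all three quantities are positive.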
It is easy to interpret (4.30) and (4.33) in simple physical terms. For example, it follows that, for a uniaxial stress proportional to $i \otimes i$ (or $j \otimes j$), the corresponding Young's modulus vanishes. I won't pursue the analogous calculations for the remaining possibilities associated with (4.1). However, for future reference, it is worth noting that (4.17) can be interpreted in a different way, as the condition that $\ker \mathsf{B}$ is an irreducible linear vector space, contained in $\mathcal{V}(E_a)$. From the nature of $\mathcal{V}(E_a)$, irreducible subspaces of it must be one-dimensional.* Also, any irreducible vector space must be contained in either $\mathcal{V}(E_a)$ or in $\mathcal{V}^\perp(E_a)$. In discussing bifurcations involving no change of symmetry, we should also consider the possibility that $L(D(\tau)) \neq L(\bar{D}(\tau))$, with these groups being conjugate, related by a similarity transformation involving an element of $G^+$. Or, that $L(D(\tau)) = L(\bar{D}(\tau)) \neq L(D_2)$. However, analysis of these possibilities is so much like that for bifurcations involving a change of symmetry that I will leave exploration of these to the reader.
5. Bifurcations involving a change of symmetry
As before, we consider two tolerable paths $D(\tau)$ and $\bar{D}(\tau)$, meeting at $\tau = \tau_2$, with
\[ D(\tau_2) = \bar{D}(\tau_2) = D_2, \tag{5.1} \]
but now, for $\tau < \tau_2$, we must have, in particular,
\[ L(D(\tau)) \neq L(\bar{D}(\tau)), \tag{5.2} \]
and, by continuity, we must have
\[ L(D(\tau)) \subset L(D_2), \qquad L(\bar{D}(\tau)) \subset L(D_2). \tag{5.3} \]
Clearly, at least one of the two, say $L(D(\tau))$, must be a proper subgroup of $L(D_2)$. Again, we introduce, as a reference, lattice vectors $E_a$ fitting $D_2$. For $D(\tau)$, we normalize rigid rotations as was discussed in §3, to get lattice vectors $e_a$ satisfying relations of the form
\[ e_a = U E_a, \qquad U = U^T > 0. \tag{5.4} \]
* Briefly, a linear vector space which is invariant under a group is irreducible provided no subspace of lower dimension is also invariant under the group.
Then, from (3.18),
\[ P^+(e_a) \subset P^+(E_a), \tag{5.5} \]
here as a proper subgroup. With this deformation is associated the Cauchy-Green tensor
\[ C(\tau) = U^2, \qquad C(\tau_2) = 1. \tag{5.6} \]
Then, it follows from (3.20) that there must be at least one rotation $R$ such that
\[ \check{C}(\tau) = R^T C(\tau) R \neq C(\tau), \qquad \tau < \tau_2, \qquad R \in P^+(E_a). \tag{5.7} \]
Since $R$ is in the invariance group for $\tilde{\mathcal{G}}$, $\check{C}(\tau)$ also describes an equilibrium path, which is tolerable or nice, according as $C(\tau)$ is tolerable or nice. It is not hard to show that the two are symmetry related paths, so I'll leave it to the reader to prove this. Now, as before, we can introduce
\[ E(\tau) = |C(\tau) - \check{C}(\tau)|^{-1} [C(\tau) - \check{C}(\tau)], \tag{5.8} \]
and, by essentially the same arguments as before, conclude that this set has at least one limit "point" $E$, satisfying
\[ \mathsf{B} E = 0. \tag{5.9} \]
It is only that $E(\tau)$ here has rather different properties. For one thing, it follows from (3.2) and (5.7) that
\[ A \cdot \check{C}(\tau) = A \cdot C(\tau) \qquad \forall A \in \mathcal{V}(E_a), \tag{5.10} \]
whence follows that
\[ E \in \mathcal{V}^\perp(E_a), \tag{5.11} \]
in contrast to (4.9). Here, with $\mathsf{B}$ invariant under $P^+(E_a)$, $\ker \mathsf{B}$ must contain, with $E$, all of the tensors obtained by transforming $E$ by transformations in $P^+(E_a)$, and the linear vector space generated by such tensors, that is to say an irreducible subspace of $\mathcal{V}^\perp(E_a)$ and, generally, such spaces have a dimension greater than one. From group theoretic considerations such as are discussed by Ericksen [4], the number $N$ of symmetry related branches which can be generated in this way is given by
\[ N = \text{order of } P^+(E_a) / \text{order of } P^+(e_a). \tag{5.12} \]
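The count (5.12) can be illustrated for a cubic reference configuration and a tetragonal stretch. The following sketch is an example under stated assumptions: $P^+(E_a)$ is taken as the 24 signed permutation matrices of determinant one, the stretch `U` is a hypothetical diagonal tensor, and the subgroup is identified with the rotations that leave `U` unchanged. It compares the ratio of group orders with the number of distinct tensors $R^T U R$ found by direct enumeration.

```python
from itertools import permutations, product

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def cubic_rotations():
    """The 24 cubic rotations, written as signed permutation matrices."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            R = [[signs[i] if j == perm[i] else 0 for j in range(3)]
                 for i in range(3)]
            if det3(R) == 1:
                rotations.append(R)
    return rotations

def conjugate(R, U):
    """Return R^T U R."""
    return [[sum(R[k][i] * U[k][l] * R[l][j]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def variants(U, group):
    """Distinct tensors R^T U R obtained from the rotations in the group."""
    seen = []
    for R in group:
        V = conjugate(R, U)
        if V not in seen:
            seen.append(V)
    return seen

P_plus = cubic_rotations()
U = [[2, 0, 0], [0, 2, 0], [0, 0, 3]]   # hypothetical tetragonal stretch

subgroup_order = sum(1 for R in P_plus if conjugate(R, U) == U)
N = len(P_plus) // subgroup_order       # ratio of orders, as in (5.12)
branches = variants(U, P_plus)

print(N, len(branches))
```

In this example the three variants are the stretches with the distinguished axis along $i$, $j$ or $k$, the familiar three tetragonal variants of a cubic parent.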
It is possible to show that the irreducible spaces generated by these coincide. So far, we have merely used the fact that $L(D(\tau))$ is a proper subgroup of $L(D_2)$, so the analysis also applies to kinds of bifurcations involving no change of symmetry which were mentioned, but not analyzed, in §4. In this context, $D(\tau)$ could be one of these symmetry-related branches. Now, from the results obtained, it follows that $\ker B$ must contain some tensor $A$ such that
$$\operatorname{tr} A = \det A = 0. \tag{5.13}$$
In part, this is obvious from (5.11), since all tensors in $v^\perp(E_a)$ have zero trace, as was mentioned before. If $\ker B \cap v^\perp(E_a)$ is one-dimensional, then, by the definition of $v^\perp(E_a)$, there is some rotation $R \in P^+(E_a)$ such that
$$A \in \ker B \cap v^\perp(E_a) \Rightarrow R^T A R \propto A, \quad R^T A R \neq A, \tag{5.14}$$
from which it follows that
$$R^T A R = -A \Rightarrow \operatorname{tr} A = \det A = 0. \tag{5.15}$$
If $\ker B \cap v^\perp(E_a)$ is at least two-dimensional, pick two linearly independent tensors $A_1$ and $A_2$ in it, which are then traceless. If neither has zero determinant, determine a scalar $\lambda$ as a real root of the cubic equation $\det(A_1 + \lambda A_2) = 0$; then $A = A_1 + \lambda A_2$ satisfies (5.13). Now, for $A$ satisfying (5.13), we infer from the spectral representation
$$A = \mu\,(e_1 \otimes e_1 - e_2 \otimes e_2), \tag{5.16}$$
where $e_1$ and $e_2$ are orthonormal, that
$$A = a \otimes b + b \otimes a, \quad a \cdot b = 0, \tag{5.17}$$
with
$$2a = \mu\,(e_1 + e_2), \quad b = e_1 - e_2, \tag{5.18}$$
for example. Of course, this gives
$$B(a \otimes b + b \otimes a) = 0. \tag{5.19}$$
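The passage from the spectral form (5.16) to the dyadic form (5.17) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the value of `mu` and the orthonormal pair `e1`, `e2` are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def outer(u, v):
    """Tensor product u (x) v as a 3x3 nested list."""
    return [[ui * vj for vj in v] for ui in u]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def scale(s, A):
    return [[s * x for x in row] for row in A]

def trace(A):
    return A[0][0] + A[1][1] + A[2][2]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

# Illustrative values only (assumptions, not taken from the text).
mu = 0.7
s = 1.0 / math.sqrt(2.0)
e1 = [s, s, 0.0]
e2 = [s, -s, 0.0]

# Spectral form (5.16): A = mu (e1 (x) e1 - e2 (x) e2).
A_spectral = scale(mu, add(outer(e1, e1), scale(-1.0, outer(e2, e2))))

# Vectors from (5.18): 2a = mu (e1 + e2), b = e1 - e2.
a = [0.5 * mu * (p + q) for p, q in zip(e1, e2)]
b = [p - q for p, q in zip(e1, e2)]

# Dyadic form (5.17): A = a (x) b + b (x) a.
A_dyadic = add(outer(a, b), outer(b, a))

max_diff = max(abs(A_spectral[i][j] - A_dyadic[i][j])
               for i in range(3) for j in range(3))
a_dot_b = sum(p * q for p, q in zip(a, b))
```

Here `max_diff` and `a_dot_b` should vanish to rounding error, and `trace(A_spectral)` and `det3(A_spectral)` should both be zero, as (5.13) requires.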
Essentially this result is due to H. Thomas and R.A. Toupin [6], obtained in studying second-order phase transitions involving a change of symmetry, by a slightly different derivation. For one thing, it means that, at bifurcation, the linear elastic differential operator fails to be strongly elliptic. In more physical terms, the stress corresponding to some infinitesimal simple shearing deformation vanishes. Or, if you prefer a dynamical interpretation, some (isothermal) shear wave speed vanishes. As can be seen from the examples considered in §4, those bifurcations can occur without loss of strong ellipticity. Now, if the bifurcation does involve a change of symmetry, there should be, in addition to these symmetry-related branches, a branch corresponding to a different symmetry type, which could be taken as $\bar{D}(\tau)$. One possibility is that $L(\bar{D}(\tau)) = L(D_2)$, in which case no further branches need occur. One can use (3.20) to get restrictions on the deformations which can relate $\bar{D}(\tau)$ to $D_2$ or, for a particular value of $\tau$, $\bar{D}(\tau)$ to any of the symmetry-related branches, but I don't see a way of extracting more information concerning $\ker B$. The other possibility is that $L(\bar{D}(\tau))$ is a proper subgroup of $L(D_2)$. Then, in a similar way, one can generate a second set of symmetry-related branches, giving contributions to $\ker B$ which also lie in $v^\perp(E_a)$. Otherwise, I see nothing worth noting about this. Clearly, with the different possible choices of groups, etc., the mathematical possibilities are numerous. Here, heuristic reasoning suggests to me that possibilities most likely to be of physical interest will, for one thing, involve at least one nice branch so, as before, we will have
$$W(A) \ge 0.$$
(5.20)
Further, that they can be classified as Type 1, if we modify the definition of this, as suggested by remarks made earlier, to replace (4.16) by
$$\text{Type 1: } \ker B \text{ is an irreducible subspace.} \tag{5.21}$$
Since it must here contain elements in $v^\perp(E_a)$, $\ker B$ will be contained in $v^\perp(E_a)$ if it is of this type. One could similarly define Type 2 to include the cases where $\ker B$ is expressible as the direct sum of two irreducible subspaces, etc. Such cases might be of some interest, but I will exclude them. Then, (5.20) should here reduce to
$$W_1(A) \ge 0, \text{ strictly}, \qquad W_2(A) \ge 0, \text{ not strictly}, \tag{5.22}$$
with $W_2$ vanishing just on an irreducible subspace. We consider just one example, with $D(\tau)$ a tetragonal branch, $\bar{D}(\tau)$ being cubic. Since the cubic lattice (and point) groups are maximal, we must have
$$L(\bar{D}(\tau)) = L(D_2). \tag{5.23}$$
That is, $D_2$ must also be cubic. The orders of the cubic and tetragonal point groups are 24 and 8, respectively, so, by (5.12), we should have $24/8 = 3$ symmetry-related tetragonal branches. In the orthonormal basis used in (4.18) to describe the cubic point group, the three corresponding Cauchy-Green tensors for the tetragonal branches are of the form
$$C_1(\tau) = \mathrm{diag}(a, a, b), \quad C_2(\tau) = \mathrm{diag}(a, b, a), \quad C_3(\tau) = \mathrm{diag}(b, a, a), \tag{5.24}$$
where $a$ and $b$ are different functions of $\tau$, with
$$a(\tau_2) = b(\tau_2) = 1. \tag{5.25}$$
For the cubic branch, the Cauchy-Green tensor is of the form
$$\bar{C} = \mathrm{diag}(c, c, c), \tag{5.26}$$
with $c(\tau)$ satisfying
$$c(\tau_2) = 1. \tag{5.27}$$
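The count of three tetragonal branches can be reproduced by brute force. The sketch below is an illustration under stated assumptions: it takes the 24 proper cubic rotations to be the signed permutation matrices of determinant +1, and it uses placeholder integers `a = 2`, `b = 3` in place of the unknown functions $a(\tau)$, $b(\tau)$. It then computes the orbit and the stabilizer of $\mathrm{diag}(a, a, b)$ under $C \to R^T C R$.

```python
from itertools import permutations, product

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def cubic_rotations():
    """Signed 3x3 permutation matrices with determinant +1 (24 of them)."""
    rots = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            M = [[signs[i] if j == perm[i] else 0 for j in range(3)]
                 for i in range(3)]
            if det3(M) == 1:
                rots.append(M)
    return rots

def conjugate(R, C):
    """Return R^T C R."""
    return matmul(matmul(transpose(R), C), R)

def as_key(M):
    return tuple(tuple(row) for row in M)

rots = cubic_rotations()
a, b = 2, 3  # placeholder values with a != b (assumption for illustration)
C1 = [[a, 0, 0], [0, a, 0], [0, 0, b]]

orbit = {as_key(conjugate(R, C1)) for R in rots}
stabilizer = [R for R in rots if conjugate(R, C1) == C1]
n_branches = len(rots) // len(stabilizer)
```

The orbit consists of the three diagonal forms in (5.24), and the stabilizer has eight elements, so that `n_branches` agrees with the ratio $24/8$ given by (5.12).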
The main tool used in making these deductions is (3.20), combined with determination of how the cubic point group admits tetragonal subgroups. These are the only branches which need be involved, so we will assume this, for simplicity. Given (5.24), it is easy to apply the general analysis to show that $\ker B$ must contain the linear vector space indicated by
$$\mathrm{diag}(k, l, -k - l) \in \ker B, \tag{5.28}$$
with $k$ and $l$ arbitrary. One can verify that this is invariant under the cubic point group and irreducible, so $\ker B$ should contain no other elements. The appropriate form of $W(A)$ is given by (4.21). By straightforward calculation, we then find the conditions on moduli to be
$$C_{11} + 2C_{12} > 0, \quad C_{11} = C_{12}, \quad C_{44} > 0. \tag{5.29}$$
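The "straightforward calculation" can be sketched as follows, under the assumption that (4.21), which is not reproduced in this section, is the usual cubic quadratic form in Voigt moduli; the normalization used there may differ by a constant factor.

```latex
% Assumed standard cubic form (normalization may differ from (4.21)):
W(A) = \tfrac{1}{2} C_{11}\,(A_{11}^2 + A_{22}^2 + A_{33}^2)
     + C_{12}\,(A_{11}A_{22} + A_{22}A_{33} + A_{33}A_{11})
     + 2 C_{44}\,(A_{12}^2 + A_{23}^2 + A_{31}^2).

% On the space (5.28), A = diag(k, l, -k-l):
W(A) = C_{11}\,(k^2 + kl + l^2) - C_{12}\,(k^2 + kl + l^2)
     = (C_{11} - C_{12})\,(k^2 + kl + l^2).

% On multiples of the identity, A = \alpha\,1:
W(A) = \tfrac{3}{2}\,(C_{11} + 2C_{12})\,\alpha^2.

% On pure shears (zero diagonal):
W(A) = 2 C_{44}\,(A_{12}^2 + A_{23}^2 + A_{31}^2).
```

Since $k^2 + kl + l^2 > 0$ unless $k = l = 0$, $W$ vanishes on the whole of (5.28) exactly when $C_{11} = C_{12}$, and it is positive on the two complementary irreducible subspaces exactly when $C_{11} + 2C_{12} > 0$ and $C_{44} > 0$, which is (5.29).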
For this, it does not really matter whether it is the cubic, or the tetragonal branches, or all of these, which are nice. Actually, such a bifurcation could involve some additional branches and still be of Type 1, as long as the occurrence of these is consistent with (5.29), it being possible to add in 6 symmetry-related orthotropic branches, for example. Of course, I have merely deduced necessary conditions for the bifurcations, not explored the possibility of constructing thermodynamic potentials capable of describing them. Neither have I seriously explored related rigorous forms of general bifurcation theory, so I am not sure what this might predict about the likelihood of various possibilities.
6. An example

Here, we will consider an example which is hypothetical, but at least worth considering, as a possible theoretical interpretation of behavior observed in some shape memory alloys. It involves a subcritical bifurcation of a special kind, associated with a first-order phase transition, of a cubic-tetragonal kind. The proposed scene is as follows. For simplicity, we consider the case where $p$ is fixed, $\theta$ being the control parameter. At higher values of $\theta$, we have a stable cubic path. As $\theta$ is lowered, this changes from stable to metastable at some temperature $\theta_c$, remaining nice until $\theta$ reaches the value $\theta_1 < \theta_c$. The value $\theta_1$ marks the limit of metastability for this branch. There, it meets unstable tetragonal branches, defined for
$$\theta_1 \le \theta \le \theta_2. \tag{6.1}$$
From the previous analysis, there should be three tetragonal branches. At $\theta = \theta_2$, each of the three undergoes a bifurcation with no change of symmetry, giving three symmetry-related tetragonal branches, defined for $\theta \le \theta_2$, which are nice for $\theta < \theta_2$, $\theta = \theta_2$ marking their limit of metastability. At $\theta = \theta_c$, they change from metastable to stable, being stable for lower values of $\theta$. The phase transitions are considered to occur, by deformations suddenly occurring, at some particular values of $\theta$, from the cubic to one or more of the tetragonal paths, or vice-versa. Physically, first-order transitions tend to be somewhat hysteretic. Thus, in cooling the cubic, transition is likely to occur for some $\theta$ less than $\theta_c$, but above $\theta_1$. Experience with such transitions suggests that, when the transition occurs, the tetragonal phase is likely to contain twins and other complex microstructures, such as are analyzed by Ball and James [7]. For simplicity, they idealize to exclude hysteresis, which is observed. I note that their picture of three energy wells of tetragonal kind fits our picture of the three paths. Now on heating a tetragonal form, which might contain twins etc., transition to the cubic form is likely to occur at some temperature above $\theta_c$, but below $\theta_2$, granted that there is some hysteresis. For such reasons, precise measurement of $\theta_1$, $\theta_c$, and $\theta_2$ is not likely to be feasible. The unstable paths, including the endpoints, are to be considered as equilibria which are not observable. However, various implications of this picture have some meaning in terms of possible observations. For one thing, the connections imply that each of the three stable parts of the tetragonal paths has a lattice group which is a subgroup of that pertaining to the cubic path. If we take any of the observable cubic configurations as a reference, it then follows from (3.20) that, anywhere along the three tetragonal branches, the three Cauchy-Green tensors must be of the form (5.24). So this is something which can be tested, experimentally, on the parts which are stable enough to be observed. Secondly, as we cool the cubic we should have the elastic moduli satisfy (5.29) at $\theta = \theta_1$, theoretically. We don't really expect to observe this, but might well see evidence suggesting it, the ratios
$$(C_{11} - C_{12})/(C_{11} + 2C_{12}), \quad (C_{11} - C_{12})/C_{44} \tag{6.2}$$
becoming unusually small. As is discussed by Nakanishi [1], softening of this general kind does seem to be a common feature in shape-memory alloys and (6.2) seems to apply to such transitions, when they are of the cubic-tetragonal kind. At least, this is the case for the Indium-Thallium alloys analyzed by Ball and James [7]. Also, the observed transition strains do fit the description given above. From the theoretical picture, another kind of softening should also occur, associated with the bifurcation at $\theta = \theta_2$. There, the moduli should behave as indicated by (4.30) and (4.36). Nearby, hopefully in tetragonal phases stable enough to be observed, we could observe softening of the kind indicated, ratios like
$$[(C_{11} + C_{12})\,C_{33} - 2C_{13}^2]/C_{13}, \tag{6.3}$$
becoming unusually small. Probably because such tetragonal phases tend to be twinned, etc., data concerning their elastic moduli is scarce. One can find a measurement of one modulus at one temperature and composition, another at different values of these quantities, etc. Suffice it to say that I would want better data to make a good assessment of the quality of this prediction, but I think that it is likely to be good. The analysis given in §6 does provide some
basis for designing experiments to test this. So, if nothing else, trying to fit observations to a bifurcation pattern has produced a question to be answered by experimentalists, one which I think is important. To make possible some numerical calculations, Richard James and I constructed a relatively simple constitutive equation capable of describing near-transition behavior for a transition of the cubic-tetragonal kind, including the twinning etc. Data on Indium-Thallium was used for order of magnitude estimates of adjustable constants. It is hard to know whether this model is even qualitatively correct but, for whatever it is worth, it does predict a bifurcation pattern of the kind described. Conceivably, a different bifurcation pattern could produce indistinguishable predictions concerning observations. I have tried to follow the advice of William of Ockham, picking the simplest pattern which fits the known facts. Another simple pattern would replace the three unstable branches by an unstable cubic branch which meets three nice tetragonal branches at a higher temperature. This would give the same kinds of transition strains, and the mesh with the analysis of Ball and James [7] is equally good. However, the implications concerning softening of moduli are entirely different. For example, it is clear from (4.22) and (4.24) that the first ratio in (6.2) should now become very large instead of very small. This is enough to reject this possibility for the Indium-Thallium alloys mentioned above, for example. By examining the nice tetragonal branches, one can see that they also soften in a very different way. The example represents just one of various changes of symmetry observed to occur and be associated with shape memory effects, and with the softening of moduli which seems to go along with this. Here, I won't try to discuss any of the other possibilities, but I have tried to present the tools which seem to me useful for such analyses.
References
1. N. Nakanishi, Lattice softening and the origin of SME. In: J. Perkins (ed.), Shape Memory Effects in Alloys, pp. 147-176. New York-London: Plenum (1975).
2. G. Zanzotto, Twinning in minerals and metals: remarks on the comparison of a thermoelastic theory with some available experimental results, notes I and II. Atti Accad. Lincei Rend. Fis. 82 (1988) 723-741 and 743-756.
3. R.L.E. Schwarzenberger, Classification of crystal lattices. Proc. Camb. Phil. Soc. 72 (1972) 325-349.
4. J.L. Ericksen, Weak Martensitic transformations in Bravais lattices. Arch. Rat. Mech. Anal. 107 (1989) 23-36.
5. A.E.H. Love, A Treatise on the Mathematical Theory of Elasticity, 4th ed. Cambridge: Cambridge University Press (1927).
6. Private communication from R.A. Toupin.
7. J.M. Ball and R.D. James, Fine phase mixtures as minimizers of energy. Arch. Rat. Mech. Anal. 100 (1987) 13-52.
The IMA Volumes in Mathematics and Its Applications. Volume 54: Microstructure and Phase Transition © 1993 Springer-Verlag
LOCAL BIFURCATION THEORY FOR THERMOELASTIC BRAVAIS LATTICES

J.L. ERICKSEN*

Abstract. Some weak first order phase transitions encountered in shape-memory alloys seem to occur near bifurcations which are not actually observed. Being curious about the nature of these in particular, I began to think about the nature of all the bifurcation patterns predicted by the theory of thermoelastic Bravais lattices. Presented here are local characterizations of the most likely patterns near a bifurcation.
1. Introduction. As is made evident by some of the papers collected in reference [1], for example, nonlinear thermoelasticity theory describing Bravais lattices is proving to be useful for analyzing near-transition behavior of alloys exhibiting shape-memory effects. These involve Martensitic transformations, commonly of first order. From observations such as are discussed in various places in reference [2], it seems typical of these transformations that, as one cools an unloaded crystal which is in the Austenitic phase at higher temperatures, some linear elastic modulus becomes unusually small, seeming to extrapolate to zero at a temperature slightly below that at which the crystal transforms to the Martensitic phase. While there is little experimental evidence to support it, there is some opinion that a similar effect is likely to occur as the Martensitic phase is warmed, to get near the transformation temperature. Theoretical experience suggests that the vanishing of such moduli will be associated with some kind of bifurcation, probably involving equilibrium branches too unstable to be observable. This led me to be curious as to what might be deduced from theory about the natures of such bifurcation patterns. Briefly and roughly, as I interpret it, the general problem is to get a qualitative description of the bifurcation patterns most likely to occur in thermoelastic Bravais lattices. There is a global problem, involving one or more connected sets of equilibrium branches, with any number of bifurcations. For the most part, I'll be concerned only with the nature of bifurcation patterns near one bifurcation, the local problem. A few remarks about a more global problem are included in Section 10. On the face of it, this seems to involve a large number of possibilities.
Physically, equilibria can have any of the symmetries associated with the seven point groups which can be realized in Bravais lattices, which are described in the Appendix, and branches with different symmetries can meet at bifurcations. With each point group is associated some number of elastic moduli, giving different possible choices of that which vanishes at bifurcation. However, we will see that there are ways to reduce these to just six different mathematical problems which I found tractable, although I am not expert in bifurcation theory and have not yet found references to all. However, I found it hard to make some points without giving some indication of how the analysis can be done, so I have done this. Mathematically, there are additional possibilities but, by an old criterion which is to be discussed and used, they are unlikely to be encountered. My purpose is to elaborate these remarks. To keep this paper from being unreasonably long, I will omit some of the details in various analyses, but indicate the kind of reasoning which is used in getting the results. There are some different aspects to this study. One is the reduction of the numerous possibilities to the aforementioned six problems. One needs to understand this in some detail, to know how to relate any particular physical possibility to one of the six mathematical problems. Then, one needs some analysis of each of the six and, as noted above, I'll sketch mine. Finally, I'll discuss a particular case which seems to represent a rather common occurrence, experimentally. Some of the elementary bifurcation theory for Bravais lattices is discussed by Ericksen [3], but the present study goes far beyond this, to give a fairly complete local theory of the generic kind.

*Department of Aerospace Engineering and Mechanics and School of Mathematics, University of Minnesota, Minneapolis, MN 55455. Member of the Research Group on Transitions and Defects in Ordered Materials.

2. Basic formulation. Here, the absolute temperature is considered to be uniform and to be a control variable, and we will consider only homogeneous configurations of our Bravais lattices. In our analyses, some homogeneous configuration will be taken as a reference so, for the configurations considered, the right Cauchy-Green tensor $C$ will be independent of position. According to thermoelasticity theory, the Helmholtz free energy density $\phi$ is a function of the form
$$\phi = \phi(C, \theta). \tag{2.1}$$
While there are physical reasons to be interested in cases where
$$P = \{Q \in P_0 \mid Q^T C Q = C\}, \tag{2.2}$$
it being understood that, with $C_0 = 1$, any possible $Q$ is an orthogonal transformation, one of those described in the Appendix.
For us, equilibria will mean unstressed configurations, values of $C$ and $\theta$ satisfying
$$\frac{\partial \phi}{\partial C} = 0. \tag{2.3}$$
If $(C_0, \theta_0)$ is one, one can try to use the implicit function theorem to infer the existence of smooth equilibrium paths $C = C(\theta)$ through it, on which (2.3) is satisfied. What is important for this are some properties of the fourth order tensor
$$\mathfrak{B}(C_0, \theta_0) = \left.\frac{\partial^2 \phi}{\partial C\,\partial C}\right|_{(C_0, \theta_0)}, \tag{2.4}$$
or, what is equivalent, $a$, the tensor of linear elastic moduli. Since properties of the latter are more likely to be familiar to the reader, I'll use it. To get $a$ from $\mathfrak{B}$, one changes the reference configuration to make $C_0 = 1$, then takes
$$a = 4\,\mathfrak{B}(1, \theta_0). \tag{2.5}$$
In a way, this is a bit awkward since, if one is evaluating $a$ on a smooth branch, $C = C(\theta)$, one needs to continually change the reference configuration to make $C(\theta) = 1$, but this is what fits rather common practice. As defined, $a$ can be interpreted in the usual way, as a symmetric linear transformation on the space of symmetric second-order tensors. This involves considering the latter as equipped with the inner product
$$A \cdot B = \operatorname{tr} AB, \tag{2.6}$$
with $A$ and $B$ representing any pair of symmetric second order tensors. In this setting, $a$ then has real eigenspaces with the usual orthogonality properties. Now, from the view of the implicit function theorem, the trivial case occurs when
$$\ker a = 0, \tag{2.7}$$
in which case one has the local existence and uniqueness of the aforementioned smooth branch. On such a branch, as long as (2.7) holds, it follows that the point group is the same as that at the (fixed) reference, $P_0$, also that the values of $C(\theta)$ are restricted by the obvious interpretation of (2.2): one can read off the possible forms from the description by Truesdell [5, Ch. IX], for example. Also, $a$, evaluated at any point on such a branch, will be invariant under $P_0$. So, these are the normal branches, commonly considered in the linear thermoelasticity theory, for example. In fact, such theories also presume that the eigenvalues of $a$ are positive. Our interest is more in limit points of such branches, at which (2.7) fails to hold. By continuity, they will satisfy (2.3), and the point group at the limit point will either be, or include as a subgroup, that on any of the normal branches having it as a limit point. Said differently, these are the places where bifurcations can occur.
Now, the intent is not to try to find all of the mathematical possibilities, but to look for the most likely. To assess this, I'll use an old criterion, still commonly used by physicists, which I trust. One could use instead more modern versions of generic bifurcation theory, but this involves making some sticky decisions about what is the appropriate topology, etc., like those discussed by Man [6] in his discussion of Gibbs' phase rule. In any event, what I'll do is to consider things of interest, like $a$, evaluated on a normal branch, as functions of $\theta$. As a consequence of invariance, they will satisfy some equations, to be viewed as identities. Otherwise, the view is that one equation might well be satisfied at a particular value of $\theta$, but it is unlikely that two or more independent equations will there be satisfied. There is a physical idea which supplies some motivation for accepting such ideas, which is also covered in the interesting discussion by Man [6]. Briefly, it involves the idea that a material is described not by one constitutive equation, but a collection of slightly different equations, an idea which appeals to me. In particular, with a value of $a$ being invariant under one of the seven point groups, its eigenspaces will also have this property and, in some cases, this requires that some of the eigenvalues be equal. By considering spectral representations, one can show that, when the number of distinct eigenvalues is as large as it can be, the eigenspaces are irreducible. That is, they contain no proper subspaces which are invariant under the applicable point group. It might well happen that two which could be different do coincide at a particular value of $\theta$, on a branch, but this would count as an equation. Now, for $a$ to have a non-trivial kernel, one of its eigenvalues must vanish. By our criterion, it is then unlikely that two eigenvalues will be equal unless, as a consequence of invariance, the two must be equal.
This provides us with one important conclusion, viz.

(2.8) Most likely, when $\ker a \neq 0$, the eigenspaces of $a$, in particular $\ker a$, are irreducible.

In Section 10, I illustrate a subtlety associated with this. Briefly, it is not considered unlikely that the point group at a limit point of some branch might be larger than that on the branch, and, when this happens, it might require some coalescence of eigenvalues or other conditions not then to be considered as equations. Later, we will make use of some other consequences of our criterion, but this one will be used repeatedly. So, for any of the seven point groups, an eigenvalue associated with any of the irreducible subspaces might vanish, and we need to consider all such possibilities. One could envisage the possibility that the list might be enlarged, by some consideration of the nature of branches for which the point considered is a limit point but, as we shall see, this is a problem which takes care of itself.

3. General scheme for reducing problems. Essentially, we follow common practice in bifurcation theory, with some accounting for the quirks induced by invariance. The reference configuration is considered to be fixed, at a place where $\ker a \neq 0$, at some particular temperature $\theta_0$. So, by assumption
$$\ker a \neq 0 \tag{3.1}$$
and, as is suggested by (2.8), we also assume that the eigenspaces of $a$ are irreducible. Then, $\ker a$ is an irreducible space of some dimension $N$. Later, it will become clear that $N \le 3$. As a basis for symmetric second-order tensors, we introduce an orthonormal basis of eigenvectors of $a$, denoted by $E_i$ $(i = 1, \dots, 6)$,
$$E_i \cdot E_j = \delta_{ij}, \quad i, j = 1, \dots, 6, \tag{3.2}$$
with $E_1, \dots, E_N$ spanning $\ker a$. In particular, values of $C$ are represented by components $x_i$, as indicated by
$$C = 1 + \sum_{i=1}^{6} x_i E_i, \tag{3.3}$$
with $x_i = 0$ at the putative bifurcation point. By the implicit function theorem, we can solve (2.3) locally to get relations of the form
$$x_j = f_j(x_1, \dots, x_N, \theta), \quad j = N + 1, \dots, 6, \tag{3.4}$$
these functions being unique and smooth: they also satisfy the conditions
$$f_j(0, \dots, 0, \theta_0) = 0. \tag{3.5}$$
Substituting these into the function $\phi$, we get the reduced potential, represented by a smooth function of the form
$$\Phi(x_1, \dots, x_N, \theta). \tag{3.6}$$
Then, by standard reasoning, the remaining equilibrium equations reduce to
$$\frac{\partial \Phi}{\partial x_i} = 0 \tag{3.7}$$
and
$$\frac{\partial \Phi}{\partial x_i} = \frac{\partial^2 \Phi}{\partial x_i\,\partial x_j} = 0 \quad \text{at } x_i = 0,\ \theta = \theta_0, \tag{3.8}$$
where the indices take on the obvious $N$ values. As I interpret the jargon used by physicists, $x_1, \dots, x_N$ are "order parameters." From remarks made earlier, $\phi$ can be considered to be invariant under the point group $P_0$ associated with the reference configuration and, from this and the aforementioned uniqueness of solutions, $\Phi$ inherits some invariance, which needs to be determined. The relevant point group involves some orthogonal transformations $Q$, noted in the Appendix, transforming $C$ to
$$Q^T C Q = 1 + \sum_{i=1}^{6} x_i\,(Q^T E_i Q), \tag{3.9}$$
and with $E_i$ as a basis, we can determine numbers $\Lambda^j_i$ such that
$$Q^T E_i Q = \sum_{j=1}^{6} \Lambda^j_i E_j. \tag{3.10}$$
Of course, this induces on $x_i$ the transformation to the components $\bar{x}_i$ of $Q^T C Q$ given by
$$\bar{x}_i = \sum_{j=1}^{6} \Lambda^i_j x_j, \quad i = 1, \dots, 6. \tag{3.11}$$
Now, since the inner product (2.6) is invariant under all orthogonal transformations, the $Q^T E_i Q$ also form an orthonormal basis, whence follows that
$$\Lambda = \|\Lambda^i_j\|, \quad i, j = 1, \dots, 6, \tag{3.12}$$
is an orthogonal matrix, of a rather special kind. That is, since the eigenspaces of $a$ are invariant under $P_0$, $\Lambda$ must map any such eigenspace onto itself. Thus, the obvious restriction of this linear transformation to an eigenspace is described by a matrix which is also orthogonal. In particular, this is true of the submatrix
$$\lambda = \|\Lambda^i_j\|, \quad i, j = 1, \dots, N, \tag{3.13}$$
describing the transformation of order parameters, so it satisfies
$$\lambda^{-1} = \lambda^T. \tag{3.14}$$
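The construction of the induced $6 \times 6$ matrix from a given $Q$ can be illustrated numerically. The following is a sketch under stated assumptions: the basis used is a generic orthonormal basis of symmetric tensors for the inner product (2.6), not the eigenvectors of $a$ (which depend on the material), and $Q$ is taken, for illustration only, to be a rotation by $\pi/2$ about the third axis. The function names are hypothetical.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inner(A, B):
    """Inner product (2.6): A . B = tr(AB)."""
    return sum(A[i][k] * B[k][i] for i in range(3) for k in range(3))

def orthonormal_basis():
    """A generic orthonormal basis of symmetric 3x3 tensors (an assumption)."""
    s = 1.0 / math.sqrt(2.0)
    basis = []
    for i in range(3):
        M = [[0.0] * 3 for _ in range(3)]
        M[i][i] = 1.0
        basis.append(M)
    for i, j in ((1, 2), (0, 2), (0, 1)):
        M = [[0.0] * 3 for _ in range(3)]
        M[i][j] = M[j][i] = s
        basis.append(M)
    return basis

def induced_matrix(Q, E):
    """Lam[j][i] is the coefficient of E_j in Q^T E_i Q."""
    Qt = transpose(Q)
    n = len(E)
    Lam = [[0.0] * n for _ in range(n)]
    for i in range(n):
        T = matmul(matmul(Qt, E[i]), Q)
        for j in range(n):
            Lam[j][i] = inner(E[j], T)
    return Lam

# Illustrative Q: rotation by pi/2 about the third axis.
Q = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
E = orthonormal_basis()
Lam = induced_matrix(Q, E)

# Check that Lam Lam^T is the identity.
gram = [[sum(Lam[i][k] * Lam[j][k] for k in range(6)) for j in range(6)]
        for i in range(6)]
orthogonality_error = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                          for i in range(6) for j in range(6))
```

For this $Q$, the tensor $e_1 \otimes e_1$ is carried to $e_2 \otimes e_2$, so `Lam[1][0]` equals 1, and `orthogonality_error` vanishes to rounding error, consistent with the matrix being orthogonal.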
A g «(?„) ,
Bear in mind that it depends on which eigenspace is considered to be ker Q, as well as Pa. To simplify notation a bit, we now write (3.16)
y = (*!,...,**)•
After a bit of calculation, using the uniqueness of the solutions (3.4), one finds that 9t(Po) is the invariance group for the reduced potential, so (3-17)
*(Ay,0) = * ( y , f l ) ,
A e
at least for y near the origin. Here, I have glossed one point, in that (3.17) makes no sense unless the domain of $ is invariant under SH(Po). One can take care of this by considering the domain to be a suitably chosen neighborhood.
Another item of interest is the point group associated with $y_0$, a particular value of $y$ near the origin. To determine this, one calculates the subgroup $\mathfrak{R}(P_0, y_0)$ leaving $y_0$ invariant:
$$\mathfrak{R}(P_0, y_0) = \{\lambda \in \mathfrak{R}(P_0) \mid \lambda y_0 = y_0\}. \tag{3.18}$$
With (3.4), $y_0$ determines a point $x_0$ in the six-dimensional space $(x_1, \dots, x_6)$, and $\lambda$ is a submatrix of at least one $6 \times 6$ matrix $\Lambda(\lambda)$. From the uniqueness of solutions of (3.4), it then follows that
$$\Lambda(\lambda)\,x_0 = x_0. \tag{3.19}$$
In turn, $\Lambda(\lambda)$ is associated with at least one $Q \in P_0$, by (3.11); call it $Q(\lambda)$. From (3.3), $x_0$ determines some value $C_0$ of $C$ and, with (3.19), we have
$$Q(\lambda)^T C_0\, Q(\lambda) = C_0. \tag{3.20}$$
The linear transformations $Q(\lambda)$ thus obtained then form some subgroup of $P_0$ which, as is indicated in (2.2), is the point group associated with this particular configuration. From such theory it follows that the order of this group can be calculated by dividing the order of $P_0$ by the number of distinct points in the orbit of $C_0$ under $P_0$ or, what is the same, the number of distinct points in the orbit of $y_0$ under $\mathfrak{R}(P_0)$. From the Appendix, it is clear that the order uniquely determines the point group. Frequently, I will use this idea to identify point groups. Concerning the function $\phi$, we have assumed that it is smooth, has some invariance locally, and admits some equilibria compatible with bifurcation. What is left rather tacit above is the notion that, at $\theta = \theta_0$, $y = 0$ represents a limit point of one or more normal branches: as it will turn out, it isn't important to make this more explicit. Otherwise, one can show that, for our local analyses, $\Phi$ need not satisfy any conditions beyond those mentioned explicitly above. Of course, any additional assumptions made about $\phi$ could lead to additional restrictions on $\Phi$, not accounted for here. To proceed with analysis of possible bifurcations, one needs to know the possible values of $N$ and the groups $\mathfrak{R}(P_0)$. Different physical problems can and do lead to the same $N$ and $\mathfrak{R}(P_0)$ although, in some cases, one needs to make some judicious choices of the eigenvectors, to make such identifications. So, this is the next step to be taken, to construct guide maps to link each of the physical possibilities to one of the mathematical bifurcation problems.

4. Guide maps. Essentially, this section is a tabulation of important information for each of the seven point groups described in the Appendix. It includes characterizations of the possible orthogonal irreducible subspaces which are, for our purposes, the possible eigenspaces. Also, where needed, I'll describe my special choice of the eigenvectors $E_1, \dots, E_N$, one which can be used to get the particular representations of $\mathfrak{R}(P_0)$ noted below.
I'll not discuss derivations in detail, but some general comments might be helpful. One can start with the second-order tensors which are left invariant under the group $P_0$, which are described by Truesdell [5, Chapter IX], for example. These form a linear space of some dimension. Associated with this are one-dimensional irreducible spaces, equal in number to the aforementioned dimension. If the number exceeds one, the particular form of such eigenvectors can be expected to be different for different materials, etc., for a given point group. This space will always include the identity, so all tensors in the orthogonal complement will be traceless. In this complement, there can be some one-dimensional irreducible spaces. If so, they will consist of tensors transforming to themselves by some elements of the group, to their negatives by others, making it easy to locate these. If $\ker a$ is of the first kind, one has
$$N = 1, \quad \mathfrak{R}(P_0) = SO(1), \tag{4.1}$$
where $SO(1)$ is the proper orthogonal group in one dimension, the trivial group consisting only of the identity
$$SO(1) = \{1\}. \tag{4.2}$$
If $\ker a$ is of the second kind,
$$N = 1, \quad \mathfrak{R}(P_0) = O(1), \tag{4.3}$$
where $O(1)$, the one-dimensional orthogonal group, also includes reflections
$$O(1) = \{1, -1\}, \quad y_1 \to -y_1. \tag{4.4}$$
N =2,
m(P0) = G(iP) ,
^ = -K/2, TT/3 or 2vr/3 .
where G(ip) is the group generated by the two matrices (4-6)
/-I
0\
/ cosV>
sinV>\
V0
1/
\ - s i n y cosipj
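The groups $G(\psi)$ are small enough to enumerate directly. The sketch below is illustrative only: it closes the two generators in (4.6) under multiplication, using rounded floating-point entries as set keys, and reports the group order for each admissible $\psi$. It also computes the orbit of a sample point `y0`, an arbitrary choice on the reflection axis, to illustrate the orbit-counting rule of Section 3. The helper names are hypothetical.

```python
import math

def matmul2(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def key(M):
    """Rounded entries, used to recognise equal matrices despite rounding."""
    return tuple(tuple(round(x, 9) + 0.0 for x in row) for row in M)

def generate_group(generators):
    """Close a finite set of matrices under right multiplication."""
    identity = ((1.0, 0.0), (0.0, 1.0))
    elements = {key(identity): identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            p = matmul2(g, h)
            k = key(p)
            if k not in elements:
                elements[k] = p
                frontier.append(p)
    return list(elements.values())

def G(psi):
    """Group generated by the two matrices in (4.6)."""
    c, s = math.cos(psi), math.sin(psi)
    reflection = ((-1.0, 0.0), (0.0, 1.0))
    rotation = ((c, s), (-s, c))
    return generate_group([reflection, rotation])

def orbit_size(group, y0):
    points = {tuple(round(sum(M[i][k] * y0[k] for k in range(2)), 9) + 0.0
                    for i in range(2)) for M in group}
    return len(points)

orders = {name: len(G(psi)) for name, psi in
          (("pi/2", math.pi / 2), ("pi/3", math.pi / 3),
           ("2pi/3", 2 * math.pi / 3))}

# Sample point on the reflection axis (an arbitrary illustrative choice).
size = orbit_size(G(2 * math.pi / 3), (0.0, 0.1))
```

The orders come out as 8, 12 and 6 for $\psi = \pi/2$, $\pi/3$ and $2\pi/3$, respectively. For $\psi = 2\pi/3$ the sample point has an orbit of three points, so that, if $P_0$ had order 24, the counting rule of Section 3 would assign it a point group of order $24/3 = 8$.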
Having determined the two-dimensional possibilities, one can again take the orthogonal complement, and look for the three-dimensional possibilities. It turns out that there is just one of these, associated with cubic symmetry, and none of higher dimension. For it,
$$N = 3, \quad \mathfrak{R}(P_0) = G(3), \tag{4.7}$$
where $G(3)$ is a group of order 24, generated by two improper orthogonal transformations $Q_1$ and $Q_2$ which, in notation used in the Appendix, can be taken as
$$Q_1 = -R(\pi/2, e_1) \quad \text{and} \quad Q_2 = -R(\pi/2, e_2). \tag{4.8}$$
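The stated order of $G(3)$ can be confirmed by closing the two generators under multiplication. This sketch assumes that $R(\pi/2, e_i)$ denotes the rotation by $\pi/2$ about the axis $e_i$ (the Appendix is not reproduced here); the sense of rotation does not affect the group generated. Integer entries make the computation exact.

```python
def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def generate_group(generators):
    """Close a finite set of integer matrices under right multiplication."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            p = matmul(g, h)
            if p not in seen:
                seen.add(p)
                frontier.append(p)
    return seen

# Q1 = -R(pi/2, e1) and Q2 = -R(pi/2, e2), with R assumed to be a rotation
# by pi/2 about the indicated axis.
Q1 = ((-1, 0, 0), (0, 0, 1), (0, -1, 0))
Q2 = ((0, 0, -1), (0, -1, 0), (1, 0, 0))

G3 = generate_group([Q1, Q2])
n_improper = sum(1 for M in G3 if det3(M) == -1)
minus_identity = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
```

The closure has 24 elements, half of them improper, and it does not contain the central inversion.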
With the point groups numbered as in the Appendix, I now tabulate important details. Tensors are described in terms of components, relative to the bases there used.

No. 1 The irreducible subspaces are all one-dimensional, and consist of tensors left invariant under the group, orthogonal to each other, so (4.1) applies to any of them.

No. 2 Cases to which (4.1) applies involve four orthogonal one-dimensional spaces spanning the space of tensors of the form
$$\begin{pmatrix} a & b & 0 \\ b & c & 0 \\ 0 & 0 & d \end{pmatrix}.$$
Here and below $a$, $b$, etc. are arbitrary numbers. The remaining irreducible spaces are also one-dimensional, consisting of two orthogonal spaces, spanning the space indicated by
$$\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a & b & 0 \end{pmatrix},$$
these being possibilities to which (4.3) applies.
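No. 3 Cases to which (4.1) applies involve three one-dimensional spaces, with span of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}.$$
The remaining irreducible spaces are also one-dimensional, those of the form
$$\begin{pmatrix} 0 & a & 0 \\ a & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & 0 \\ a & 0 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & a \\ 0 & a & 0 \end{pmatrix},$$
with (4.3) applying to these.

No. 4 Cases to which (4.1) applies involve two one-dimensional spaces, with span of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}.$$
The remaining irreducible spaces consist of two two-dimensional spaces. One is of the form
$$\begin{pmatrix} a & b & 0 \\ b & -a & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
If in it one takes as a basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
(4.5) applies with $\psi = 2\pi/3$. The other is of the form
$$\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a & b & 0 \end{pmatrix}$$
and, with the basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$
(4.5) applies, also with $\psi = 2\pi/3$.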
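No. 5 Cases to which (4.1) applies involve two one-dimensional spaces, with span of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}.$$
There are also two to which (4.3) applies, one of the form
$$\begin{pmatrix} 0 & a & 0 \\ a & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
the other of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & -a & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Finally, there is one with dimension two, of the form
$$\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a & b & 0 \end{pmatrix}.$$
With the basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$
(4.5) applies, with $\psi = \pi/2$.

No. 6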
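Cases to which (4.1) applies involve two one-dimensional spaces with span of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}.$$
The other irreducible spaces are two-dimensional. One is of the form
$$\begin{pmatrix} a & b & 0 \\ b & -a & 0 \\ 0 & 0 & 0 \end{pmatrix};$$
with the basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix},$$
(4.5) applies, with $\psi = 2\pi/3$. The other is of the form
$$\begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a & b & 0 \end{pmatrix};$$
with the basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$
(4.5) applies, this time with $\psi = \pi/3$.

No. 7 Here, there is just one space to which (4.1) applies, of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & a \end{pmatrix}.$$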
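There is one of the two-dimensional kind, of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}; \qquad a + b + c = 0.$$
With
$$\sqrt{2}\,E_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \sqrt{6}\,E_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix},$$
(4.5) applies, with $\psi = 2\pi/3$. Finally, there is a three-dimensional space of the form
$$\begin{pmatrix} 0 & a & b \\ a & 0 & c \\ b & c & 0 \end{pmatrix}.$$
With the basis
$$\sqrt{2}\,E_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \sqrt{2}\,E_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad \sqrt{2}\,E_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},$$
(4.7) applies.

So, this describes all of the likely choices of ker a, and links each one to one of six groups $\mathfrak{R}(P_0)$. The next step is to deal with the bifurcation theory for each of the six.

5. Cases with N = 1.
When $N = 1$, we are concerned with the function $\Phi(y, \theta)$ discussed in Section 3, $y$ here being a scalar variable. As indicated in (3.8), we are given that it is a smooth function satisfying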
$$\frac{\partial \Phi}{\partial y}(0, \theta_0) = \frac{\partial^2 \Phi}{\partial y^2}(0, \theta_0) = 0, \tag{5.1}$$
with (5.1)$_2$ counting as one equation. Suppose first that (4.1) applies, so that $\Phi$ is not subject to any invariance conditions. Then, what is likely is that
$$\frac{\partial^2 \Phi}{\partial y\,\partial\theta}(0, \theta_0) \neq 0, \tag{5.2}$$
for the vanishing of this would give a second equation. So we have here another use of our criterion. We can then use the implicit function theorem to solve the equilibrium equation
$$\frac{\partial \Phi}{\partial y} = 0 \tag{5.3}$$
locally, for $\theta$, to get
$$\theta = \theta_0 + f(y), \qquad f(0) = 0, \qquad f'(0) = 0, \tag{5.4}$$
the latter condition being obtained by calculating $f'(0)$, using (5.1)$_2$. By a similar calculation, one finds that, most likely,
$$f''(0) \neq 0, \tag{5.5}$$
so, near $y = 0$, $f$ is either positive or negative, its graph resembling a parabola. If, say, $f < 0$, one gets two equilibrium branches for $\theta < \theta_0$, coming together at $\theta = \theta_0$, to form one smooth curve in the $y$-$\theta$ plane. Since $\mathfrak{R}(P_0)$ is here trivial, the orbit of a point on one of these branches consists of just one point. From the discussion in Section 3, this means that the point group there is the same as it is at the reference $y = 0$, $\theta = \theta_0$ and, from the information in Section 4, this can be any of the seven. So, this gives a bifurcation involving no change of symmetry. Also, what is likely is that $\partial^2\Phi/\partial y^2$ will be positive on one of these branches, negative on the other.

When (4.3) applies, $\Phi(y, \theta)$ is restricted to be an even function of $y$. What is then likely is the typical pitchfork bifurcation often encountered in the literature on bifurcation theory. Briefly, here
$$y = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y} = 0, \tag{5.6}$$
giving one equilibrium branch, passing smoothly through $\theta = \theta_0$. Since $y = 0$ is a fixed point for this group, the point group on it is the same as that at the reference and, from the discussion in Section 4, this could be No. 2, 3, or 5. Besides the branch $y = 0$, what is likely are the branches defined either for $\theta > \theta_0$ or for $\theta < \theta_0$, describable by an equation of the form
$$y^2 = f(\theta), \qquad f(\theta_0) = 0, \qquad f'(\theta_0) \neq 0, \tag{5.7}$$
where $f(\theta)$ is a smooth function. If, say, $f'(\theta_0) > 0$ and $\theta > \theta_0$, one will have an equilibrium point on one of these branches, with an orbit consisting of two points. The order of the point group at one of these will then be half of what it is at $y = 0$. According to the Landau [7] exclusion rules, these are the bifurcations associated with likely second-order phase transitions for our Bravais lattices. Here, we do not preclude the possibility that all of these branches represent unstable equilibria, and they could not describe such transitions when they are. One can deduce something about the differences in sign of $\partial^2\Phi/\partial y^2$ on the different branches, but I won't elaborate this, which will be familiar to anyone who knows a bit about such bifurcations.

Essentially, these are textbook examples, encountered in various kinds of applications. I have elaborated them a bit, partly to illustrate how our criterion for being likely is used, to get more definite predictions. Also, we will see that some of the more complicated possibilities can be reduced to one of these cases. Further, if one looks at the analysis sketched, one sees that the smoothness assumptions on $\phi$ are much stronger than is here necessary; to reduce smoothness assumptions to a minimum, one is likely to need to find alternatives to these and other arguments to be used.

Another point seems worth mentioning. In the case just discussed, a symmetry argument gave us the existence of one smooth branch ($y = 0$) passing through the bifurcation point. As it does, what is likely is that, on it, $\partial^2\Phi/\partial y^2$ will change sign; if it didn't, the partial derivative of this with respect to $\theta$ would vanish at bifurcation, giving a second equation. By an old result, due to Poincaré [8, §2],
this implies the existence of equilibria with $y \neq 0$, in the neighborhood of the bifurcation point, and he discussed a generalization of this result to n-dimensional problems. This, combined with formal analysis, can be used to make rigorous arguments leading to (5.7), with fairly mild continuity assumptions. Very similar ideas underlie bifurcation analysis of cases yet to be considered.

6. The case N = 2, $\psi = \pi/2$.
When $N = 2$, and $\Phi(y_1, y_2, \theta)$ is invariant under any of the groups $G(\psi)$ described in (4.5) and (4.6), one has
$$\frac{\partial \Phi}{\partial y}(Qy, \theta) = Q\,\frac{\partial \Phi}{\partial y}(y, \theta), \qquad Q \in G(\psi), \tag{6.1}$$
with $y = (y_1, y_2)$. These groups have in common the property that the only vector left invariant by the group is the null vector. From (6.1), we then infer that
$$y = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y} = 0, \tag{6.2}$$
so there is at least this equilibrium branch, passing smoothly through the bifurcation point. When $\psi = \pi/2$, it is easy to verify that
$$\Phi(y_1, y_2) = \Phi(-y_1, y_2) = \Phi(y_1, -y_2) = \Phi(y_2, y_1), \tag{6.3}$$
from which we have, in particular,
$$y_1 = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y_1} = 0, \tag{6.4}$$
and
$$y_2 = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y_2} = 0, \tag{6.5}$$
and (6.5) can be obtained by transforming (6.4) by an element of $G(\pi/2)$. With (6.4), and $y_1 = 0$, the equilibrium equations reduce to
$$\frac{\partial \Phi}{\partial y_2}(0, y_2, \theta) = 0, \tag{6.6}$$
$\Phi$ then being an even function of $y_2$. Essentially, this is the second case discussed for $N = 1$. Thus we get the analog of (5.7), branches described by
$$y_1 = 0, \qquad y_2^2 = f(\theta), \qquad f(\theta_0) = 0, \qquad f'(\theta_0) \neq 0. \tag{6.7}$$
Transforming this gives the solution of (6.5)
$$y_2 = 0, \qquad y_1^2 = f(\theta). \tag{6.8}$$
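To make the branches (6.7) and (6.8) concrete, here is a minimal sketch built on a hypothetical model potential, not one taken from the paper: $\Phi = \beta(\theta)\,|y|^2 + \gamma\,y_1^2 y_2^2 + \delta\,|y|^4$ with the assumed choices $\beta(\theta) = \theta_0 - \theta$, $\gamma = \delta = 1$ and $\theta_0 = 1$. This model has the symmetries (6.3). On the axis $y_1 = 0$ its equilibrium equation gives $y_2^2 = f(\theta) = (\theta - \theta_0)/(2\delta)$, and the code checks that all four symmetry-related axis points are equilibria of the full two-variable model.

```python
import math

THETA0 = 1.0  # hypothetical bifurcation temperature
GAMMA = 1.0   # assumed coefficient of y1^2 * y2^2
DELTA = 1.0   # assumed coefficient of |y|^4

def beta(theta):
    """Assumed coefficient of |y|^2; it vanishes at THETA0."""
    return THETA0 - theta

def grad_phi(y1, y2, theta):
    """Gradient of the model potential with respect to (y1, y2)."""
    common = beta(theta) + 2.0 * DELTA * (y1 * y1 + y2 * y2)
    return (
        2.0 * y1 * (common + GAMMA * y2 * y2),
        2.0 * y2 * (common + GAMMA * y1 * y1),
    )

def f(theta):
    """Branch function of the model: y2^2 = f(theta) on the axis y1 = 0."""
    return (theta - THETA0) / (2.0 * DELTA)

def axis_orbit(theta):
    """The four symmetry-related equilibria on the coordinate axes."""
    r = math.sqrt(f(theta))
    return [(0.0, r), (0.0, -r), (r, 0.0), (-r, 0.0)]

if __name__ == "__main__":
    theta = 1.5
    for point in axis_orbit(theta):
        print(point, grad_phi(point[0], point[1], theta))
```

With these choices $f(\theta_0) = 0$ and $f'(\theta_0) = 1/(2\delta) \neq 0$, so the model branch has the form required in (6.7).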
Clearly, where $f$ is positive, the orbit of one equilibrium point like $(0, \sqrt{f(\theta)})$ consists of four distinct points. From Section 4, the point group at $y = 0$, $\theta = \theta_0$ can only be No. 5, configurations of the tetragonal kind, so (6.2) is a branch with this symmetry. From the Appendix, the order of the point group is eight. That on the symmetry-related branches described by (6.7) and (6.8) is then 8/4 = 2, making this the point group No. 2, the monoclinic configurations. So, we have at least these branches. To explore the possibility of additional branching, we use (6.3), along with a little exercise in invariance theory, to infer that
$$\Phi = F(I, J, \theta), \tag{6.9}$$
where $F$ is a smooth function of the variables
$$I = |y|^2 = y_1^2 + y_2^2, \qquad J = y_1^2 y_2^2. \tag{6.10}$$
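A small numerical check may help here. The sketch below, which is an illustration with hypothetical helper names, verifies at a sample point that $I$ and $J$ are unchanged by the two generators of $G(\pi/2)$ (the reflection $y_1 \to -y_1$ and the quarter turn obtained from (4.6) with $\psi = \pi/2$), and compares a central-difference estimate of the Jacobian determinant $\partial(I, J)/\partial(y_1, y_2)$ with the closed form $4 y_1 y_2 (y_1^2 - y_2^2)$ obtained by direct differentiation. The determinant vanishes exactly on the axes and on the diagonals $y_1^2 = y_2^2$, which is where $(I, J)$ stop being usable as local coordinates.

```python
def invariants(y1, y2):
    """The pair (I, J) of (6.10)."""
    return (y1 * y1 + y2 * y2, (y1 * y2) ** 2)

def reflect(y1, y2):
    """First generator of G(pi/2): y1 -> -y1."""
    return (-y1, y2)

def quarter_turn(y1, y2):
    """Second generator of G(pi/2), from (4.6) with psi = pi/2."""
    return (y2, -y1)

def jacobian_det_numeric(y1, y2, h=1e-6):
    """Central-difference estimate of d(I, J)/d(y1, y2)."""
    ip, jp = invariants(y1 + h, y2)
    im, jm = invariants(y1 - h, y2)
    dI1, dJ1 = (ip - im) / (2 * h), (jp - jm) / (2 * h)
    ip, jp = invariants(y1, y2 + h)
    im, jm = invariants(y1, y2 - h)
    dI2, dJ2 = (ip - im) / (2 * h), (jp - jm) / (2 * h)
    return dI1 * dJ2 - dI2 * dJ1

def jacobian_det_closed_form(y1, y2):
    """Determinant obtained by differentiating (6.10) by hand."""
    return 4.0 * y1 * y2 * (y1 * y1 - y2 * y2)

if __name__ == "__main__":
    y = (0.3, -0.7)
    print(invariants(*y), invariants(*reflect(*y)), invariants(*quarter_turn(*y)))
    print(jacobian_det_numeric(*y), jacobian_det_closed_form(*y))
```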
While our smoothness assumptions make it irrelevant, it is worth noting that, were $\Phi$ only a few times differentiable, $F$ would be a bit less smooth, at some places, as follows from the results of Ball [9]. For any equilibrium points satisfying the condition
$$\frac{\partial(I, J)}{\partial(y_1, y_2)} = 4 y_1 y_2 \left(y_1^2 - y_2^2\right) \neq 0, \tag{6.11}$$
the equilibrium equations become
$$\frac{\partial F}{\partial I} = \frac{\partial F}{\partial J} = 0. \tag{6.12}$$
Suppose that there were a sequence of such equilibrium points, having $y = 0$, $\theta = \theta_0$ as a limit point. Then, by continuity, we would have
$$\frac{\partial F}{\partial I}(0, 0, \theta_0) = \frac{\partial F}{\partial J}(0, 0, \theta_0) = 0, \tag{6.13}$$
that is, at $y = 0$, $\theta = \theta_0$. To see whether this is likely, we can use the first few terms in an expansion of $\Phi$ about $y = 0$, viz.
$$\Phi = \alpha + \beta I + \gamma J + \delta I^2 + o(|y|^4), \tag{6.14}$$
where $\alpha$, $\beta$ etc. are functions of $\theta$. From our assumptions one equation must be satisfied, viz.
$$\beta(\theta_0) = 0, \tag{6.15}$$
so it is unlikely that $\gamma(\theta_0) = 0$, i.e. that (6.13) is satisfied. So, we expect not to have equilibria of this kind. From (6.11), and our coverage of cases where $y_1 y_2 = 0$, this leaves the possibility of equilibria such that $y_1^2 = y_2^2 \neq 0$ and, by similar arguments, one can show that these are unlikely to occur.
7. The case N = 2, $\psi = 2\pi/3$.
With $\Phi(y_1, y_2, \theta)$ invariant under the group $G(2\pi/3)$ described by (4.5) and (4.6), we already have from (6.2), the branch
$$y = 0, \tag{7.1}$$
and the easy observation that
$$y_1 = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y_1} = 0. \tag{7.2}$$
This suggests looking at the possibility of satisfying
$$y_1 = 0, \qquad \frac{\partial \Phi}{\partial y_2}(0, y_2, \theta) = 0, \tag{7.3}$$
with $y_2 \neq 0$. While this might seem to be the same as one of the two cases for which $N = 1$, it is not. For one thing, one can differentiate (6.1) with respect to $\theta$ to get, in particular,
$$\frac{\partial^2 \Phi}{\partial y_2\,\partial\theta}(0, 0, \theta) = 0, \tag{7.4}$$
a consequence of symmetry. This makes it clear that this is unlike the first case discussed in Section 5. Also, it is easy to check that $\Phi(0, y_2, \theta)$ is here not an even function of $y_2$, making it unlike the second case. By using our now-familiar criterion, it is easy to check that, most likely,
$$\frac{\partial^3 \Phi}{\partial y_2^2\,\partial\theta}(0, 0, \theta_0) \neq 0. \tag{7.5}$$
So, on the branch (7.1), $\partial^2\Phi/\partial y_2^2$ will change sign, as $\theta$ passes through the value $\theta_0$. By applying the result of Poincaré, mentioned at the end of Section 5, to $\Phi(0, y_2, \theta)$, one infers the existence of solutions of (7.3) with $y_2 \neq 0$, in the neighborhood of $y_2 = 0$, $\theta = \theta_0$. I won't expend the work required to describe my elementary analysis of these, but it produces one branch described by
$$y_1 = 0, \qquad y_2 = f(\theta), \qquad f(\theta_0) = 0, \qquad f'(\theta_0) \neq 0. \tag{7.6}$$
Transforming this by elements of the group gives two more, of the form
$$2 y_1 = \pm\sqrt{3}\,f(\theta), \qquad 2 y_2 = -f(\theta), \tag{7.7}$$
or three symmetry-related branches in all. To explore whether occurrence of additional branches is likely, we note that the invariance of $\Phi$ under $G(2\pi/3)$ implies that it is an even function of $y_1$, with $y_2$ fixed. Thus, we can regard it as a smooth function of $y_1^2$ and $y_2$ or, equivalently, of the variables
$$X = y_1^2 - 3 y_2^2, \qquad Y = y_2, \tag{7.8}$$
so
$$\Phi(y_1, y_2, \theta) = F(X, Y, \theta), \tag{7.9}$$
a function which is still subject to some invariance requirements. What we have above are equilibria on the lines $y_1 = 0$, and $y_1 = \pm\sqrt{3}\,y_2$, so let us look for equilibria occurring elsewhere with, say, $\theta > \theta_0$. If $f(\theta)$ is then positive, we look at the domain described by
$$y_2 < 0, \qquad \sqrt{3}\,y_2 < y_1 < 0. \tag{7.10}$$
If we have any equilibrium point not on one of the three lines, some point on its orbit will be in this domain. In the $X$-$Y$ plane, this maps to
$$-3Y^2 < X < 0, \qquad Y < 0. \tag{7.11}$$
and know that they are satisfied at (0, —/(0)/2). Suppose that, at 8 = 8lt they are also satisfied at (X\,Yi) =f (0,—f(8i)/2) and consider the function (7.13)
g(r) = ~
[rX, , rYl + (r - l)/(tf,)/2,«i] •
It vanishes at T = 0 and r = 1 so, for some T £ (0,l),<7'(T) = 0. This gives us a point at which
(7 i4)
wF>x^i£§Y^+w^=0-
-
In a similar way, by using dF/dY, we can get a point, generally different, at which
(7-is)
_g_x1
+
g ( y 1 + M)/2) = o.
Here, we can replace (Xi,Yi + F{8\)/2) by the unit vector (7.16)
(Xx/r, ( y i + M ) / 2 ) / r ) , r = \(XUYX + M ) / 2 ) | .
Now, suppose that there were a sequence of such equilibria, having as a limit point (0,0) as 8 —t #o- This will give a sequence of points on the unit circle which must have some limit point. Also, the points where (7.14) and (7.15) hold will tend to (0,0) as 8 —* 8o. Putting this together, one gets the condition (7.17)
d2F d^F
( d2F \ 2
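Now, with a little exercise in invariant theory, one gets the approximation
$$\Phi = \alpha + \beta |y|^2 + \gamma\,y_2\left(3 y_1^2 - y_2^2\right) + \delta |y|^4 + o(|y|^4)$$
or
$$F(X, Y, \theta) \simeq \alpha + \beta\left(X + 4Y^2\right) + \gamma\,Y\left(3X + 8Y^2\right) + \delta\left(X + 4Y^2\right)^2, \tag{7.18}$$
where $\alpha$, $\beta$, etc. are functions of $\theta$. From our assumptions, $\beta(\theta_0)$ must vanish, one equation to be satisfied. For (7.17) to hold would require that $\gamma(\theta_0)$ also vanish, so this is unlikely. With slight changes in the argument, one can cover the possibility that $f(\theta) < 0$ for $\theta > \theta_0$, and cover equilibria which might occur for $\theta < \theta_0$. In brief, what is likely is that we will have the branches described by (7.1), (7.6) and (7.7), and no others. Physically, these could be of the trigonal-monoclinic, hexagonal-orthorhombic or cubic-tetragonal kind. The latter possibility is discussed in more detail in Section 10.

8. The case N = 2, $\psi = \pi/3$.
With $\Phi(y_1, y_2, \theta)$ invariant under $G(\pi/3)$, we again have the trivial branch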
where a,/?, etc. are functions of 8. From our assumptions, /?(#o) must vanish, one equation to be satisfied. For (7.17) to hold would require that 7(#o) also vanish, so this is unlikely. With slight changes in the argument, one can cover the possibility that f(6) < 0 for 8 > 80, and cover equilibria which might occur for 8 < 80 • In brief, what is likely is that we will have the branches described by (7.1), (7.6) and (7.7), and no others. Physically, these could be the trigonal-monoclinic, hexagonalorthorhombic or cubic-tetragonal kind. The later possibility is discussed in more detail in Section 10. 8. The case N = 2, ip = TT/3. With #(yi,t/ 2 ,0) invariant under G(TT/3), we again have the trivial branch (8.1)
y = 0,
for a start. Notice that this group includes rotations with angle 7r, and you see that •& satisfies
(8.2)
*(»i,ita,fl) = *(-yi,th,») = *(vi,-yj,0) ,
as well as some additional conditions. One oddity is that one has to go to approximations of rather high order, to get polynomials which are invariant under this group, and not the full orthogonal group; my calculations give
(8.3) # = a + 0\y\2 + T |y| 4 + % | 6 + e(y? - yl) [4(y? - y22) - 3|y|4] + o(|y| 6 ) , where a, /?, etc. are functions of 8. Still, $(0, yi,8) is a rather general even function of 2/2 and, as was the case for G(ir/2), we get branches of the typical pitchfork kind
(8.4)
y , = 0 , i£ = / ( * ) ,
/(0o) = O,
f(0o)*O.
Transforming this by the elements of the group has the effect of rotating this by angles, mr/3 , n = 1,2,3,4,5. Since (8.3) describes two equilibria when / > 0, the orbit consists of twelve distinct points. Checking the information you have been given, in Section 4 and the Appendix, you find that, so far, what we have is a hexagonal-triclinic bifurcation. By essentially the same reasoning, one can get a similar but different array of branches, in the orbit of one of the form (8.5)
!fa = 0,
y2=g(8),
g(eo) = 0,
g'(8o)^0-
a fact which seems to me rather curious. Here, it is convenient to regard $\Phi$ as a somewhat special function of the variables
$$K = |y|^2, \qquad L = y_1^2 - y_2^2 \ \Rightarrow\ |L| \le K, \quad K \ge 0, \tag{8.6}$$
and $\theta$,
$$\Phi = F(K, L, \theta), \tag{8.7}$$
(8.2) serving to justify this. In terms of these, some of the equilibria described above lie on the lines in the $K$-$L$ plane given by
$$L = \pm K. \tag{8.8}$$
If there are any other kinds, they will be in the sector
$$K > 0, \qquad |L| < K. \tag{8.9}$$
Suppose that there is one there, at $(K_0, L_0)$, at $\theta = \theta_1$. By a routine calculation to determine others on the same orbit, one gets them at
$$\left(K_0,\ -L_0/2 \pm M_0\right), \qquad 4 M_0^2 = 3\left(K_0^2 - L_0^2\right). \tag{8.10}$$
Now, when the three points indicated are distinct, repeated use of the mean value theorem yields the condition
$$\frac{\partial^3 F}{\partial L^3}(K_1, L_1, \theta_1) = 0, \tag{8.11}$$
$(K_1, L_1)$ being some point in the sector. So, if there is a sequence of these approaching $(0, 0)$ as $\theta_1 \to \theta_0$, we must have
$$\frac{\partial^3 F}{\partial L^3}(0, 0, \theta_0) = 0. \tag{8.12}$$
With (8.3), this gives $\varepsilon(\theta_0) = 0$ and, by now familiar reasoning, this is unlikely. Within the sector, $M_0 \neq 0$, so, for two of the points to coincide, we must have
$$L_0 = -L_0/2 \pm M_0, \tag{8.13}$$
in which case there are two distinct points. By a routine calculation, these correspond to points on the lines
$$y_1 = \pm\sqrt{3}\,y_2, \qquad y_2 = \pm\sqrt{3}\,y_1,$$
lines obtained by taking orbits of the lines $y_1 = 0$ and $y_2 = 0$. So, there are such equilibria, but they have been accounted for. Here, particularly, I find the analysis unpleasant, in that it presumes much differentiability of $\phi$. Perhaps some nicer treatment is in the literature, but I haven't found one.
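The routine calculation behind the orbit formula for $L$ can be checked numerically. In the sketch below, which is an illustration with hypothetical function names, a sample point is rotated through multiples of $\pi/3$ using the rotation matrix of (4.6); for each image the code checks that $K = |y|^2$ is unchanged, that $L = y_1^2 - y_2^2$ takes one of the three values $L_0$ or $-L_0/2 \pm M_0$ with $4M_0^2 = 3(K_0^2 - L_0^2)$, and that the sixth-degree combination $L(4L^2 - 3K^2)$ is invariant. The reflection $y_1 \to -y_1$ leaves $K$ and $L$ unchanged, so it is not tested separately.

```python
import math

def to_KL(y1, y2):
    """The variables K = |y|^2 and L = y1^2 - y2^2."""
    return (y1 * y1 + y2 * y2, y1 * y1 - y2 * y2)

def rotate(y1, y2, angle):
    """Apply the rotation matrix of (4.6) with psi = angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * y1 + s * y2, -s * y1 + c * y2)

def sextic(y1, y2):
    """Sixth-degree combination L * (4*L^2 - 3*K^2)."""
    K, L = to_KL(y1, y2)
    return L * (4.0 * L * L - 3.0 * K * K)

def predicted_L_values(K0, L0):
    """The three L values on an orbit: L0 and -L0/2 +/- M0."""
    M0 = math.sqrt(3.0 * (K0 * K0 - L0 * L0)) / 2.0
    return (L0, -L0 / 2.0 + M0, -L0 / 2.0 - M0)

def check_orbit(y1, y2, tol=1e-12):
    """Compare the rotated images of (y1, y2) with the predictions."""
    K0, L0 = to_KL(y1, y2)
    allowed = predicted_L_values(K0, L0)
    for n in range(6):
        z1, z2 = rotate(y1, y2, n * math.pi / 3.0)
        K, L = to_KL(z1, z2)
        if abs(K - K0) > tol:
            return False
        if min(abs(L - a) for a in allowed) > tol:
            return False
        if abs(sextic(z1, z2) - sextic(y1, y2)) > tol:
            return False
    return True

if __name__ == "__main__":
    print(check_orbit(0.8, 0.3))
```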
9. The case N = 3.
Here, we are dealing with functions $\Phi(y_1, y_2, y_3, \theta)$, which are invariant under the large group $G(3)$ with generators described in (4.8). Again, the only vector left invariant by this group is the null vector, so we have the branch implied by
$$y = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y} = 0, \tag{9.1}$$
passing smoothly through $y = 0$, $\theta = \theta_0$, configurations of the cubic kind. The invariance implies that, when any one of the coordinates vanishes, $\Phi$ reduces to an even function of the other two, whence follows that
$$y_2 = y_3 = 0 \ \Rightarrow\ \frac{\partial \Phi}{\partial y_2} = \frac{\partial \Phi}{\partial y_3} = 0, \ \text{etc.}, \tag{9.2}$$
with $\Phi(y_1, 0, 0, \theta)$ an even function of $y_1$. After checking the invariance requirements on it, one finds that these don't exclude the occurrence of the typical pitchfork bifurcations, so we will have branches described by
$$y_2 = y_3 = 0, \qquad y_1^2 = f(\theta), \qquad f(\theta_0) = 0, \qquad f'(\theta_0) \neq 0. \tag{9.3}$$
The orbit of this gives two more "copies," obtained by replacing $y_1$ by $y_2$ and by $y_3$. So, where $f(\theta) > 0$, the orbit of one such equilibrium point contains six distinct points. The order of the cubic group is 24, so the order of the point group on any of these branches is 24/6 = 4, identifying this as the orthorhombic group. Here, at least when $\Phi$ is a polynomial, it is known that it can be shown to be representable in the form
$$\Phi = \Phi_1 + K \Phi_2, \qquad K = y_1 y_2 y_3, \tag{9.4}$$
with $\Phi_1$ and $\Phi_2$ symmetric functions of $y_1^2$, $y_2^2$ and $y_3^2$, expressible as functions of
$$I = y_1^2 + y_2^2 + y_3^2, \qquad J = y_1^4 + y_2^4 + y_3^4, \qquad \text{and} \qquad K^2 = (y_1 y_2 y_3)^2. \tag{9.5}$$
Or, more simply, we have
$$\Phi = F(I, J, K, \theta), \tag{9.6}$$
where $F$ is a polynomial in $I$, $J$, $K$.* In fact, one can adapt the analysis by Ball [9] to justify our assuming that (9.6) holds, with $F$ a smooth function. At points such that
$$\frac{\partial(I, J, K)}{\partial(y_1, y_2, y_3)} = 8\left(y_1^2 - y_2^2\right)\left(y_2^2 - y_3^2\right)\left(y_3^2 - y_1^2\right) \neq 0, \tag{9.7}$$
the equilibrium equations are
$$\frac{\partial F}{\partial I} = \frac{\partial F}{\partial J} = \frac{\partial F}{\partial K} = 0, \tag{9.8}$$
and it is easy enough to show that it is unlikely that these hold at $y = 0$, $\theta = \theta_0$, so, by reasoning now familiar, we can dismiss such equilibria. From (9.7), this leaves unsettled the possibilities like
$$y_2 = \pm y_3 \neq 0. \tag{9.9}$$

*This can be inferred from results of Smith and Rivlin [10], for example.
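From above, it is easy to see that the restriction of $\Phi$ to the set
$$y_2 = y_3, \tag{9.10}$$
say, is an even function of this variable, so we can regard it as a function of its square, so
$$\Phi(y_1, y_2, y_2, \theta) = G(X, Z, \theta), \tag{9.11}$$
with
$$X = y_1, \qquad Z = y_2^2. \tag{9.12}$$
From, say, (9.5) and (9.6), it is clear that
$$\Phi = \alpha + \beta I + \gamma K + \delta I^2 + \varepsilon J + o(|y|^4), \tag{9.13}$$
where $\alpha$, $\beta$, etc. are functions of $\theta$,
$$\beta(\theta_0) = 0 \tag{9.14}$$
being the one equation we allow. From this we have the approximation
$$G = \alpha + \beta\left(X^2 + 2Z\right) + \gamma X Z + \delta\left(X^2 + 2Z\right)^2 + \varepsilon\left(X^4 + 2Z^2\right). \tag{9.15}$$
Also, for equilibria with $Z > 0$, conditions of equilibria imply that
$$\frac{\partial G}{\partial X} = \frac{\partial G}{\partial Z} = 0. \tag{9.16}$$
With (9.15), it is easy to verify that these are satisfied at $X = 0$, $Z = 0$, $\theta = \theta_0$. Also, here,
$$\det \begin{pmatrix} \dfrac{\partial^2 G}{\partial X^2} & \dfrac{\partial^2 G}{\partial X\,\partial Z} \\ \dfrac{\partial^2 G}{\partial X\,\partial Z} & \dfrac{\partial^2 G}{\partial Z^2} \end{pmatrix} = -\gamma^2(\theta_0), \tag{9.17}$$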
which is not likely to vanish. So, the implicit function theorem applies, giving us the existence and uniqueness of solutions of (9.16), locally. In first approximation, we get the solution as
$$X \simeq -\frac{2\beta'(\theta_0)(\theta - \theta_0)}{\gamma(\theta_0)}, \qquad Z \simeq \left[\frac{2\beta'(\theta_0)(\theta - \theta_0)}{\gamma(\theta_0)}\right]^2, \tag{9.18}$$
which is consistent with having $Z > 0$, as it should be. Really, there are three equilibrium equations to be satisfied, in general. However, it is easy to check that
$$y_2 = y_3 \ \Rightarrow\ \frac{\partial \Phi}{\partial y_2} = \frac{\partial \Phi}{\partial y_3}; \tag{9.19}$$
one can use (9.6) to confirm this, for example, as well as checking that it is sufficient to satisfy (9.16). Now (9.18) suggests that, for these solutions, $Z = X^2$, and it isn't hard to verify this, using the uniqueness of solutions. So, this gives us equilibria of the form
$$y_1 = f(\theta), \qquad y_2 = y_3, \qquad y_2^2 = f^2(\theta), \tag{9.20}$$
(9.18) giving a first approximation of $f(\theta)$. Of course, it is unlikely that $\beta'(\theta_0) = 0$. Naturally, $G(3)$ generates an orbit of "copies" of (9.20), which covers the possibilities of having any two of the three quantities $y_1^2$, $y_2^2$, $y_3^2$ be equal and nonzero, and we have also covered the cases where some or all vanish. Counting up the number of distinct points on an orbit of this kind, I find four. So the order of the point group for these is 24/4 = 6, the trigonal type. So, we have quite a variety of branches, in this case. I do not know of observations of a softening modulus which would suggest a bifurcation of this kind. However, there aren't very many observations of this general kind, and there are hardly any for Martensitic phases, which here could be either of the orthorhombic or the trigonal kind, for suitable forms of the function $\phi$.

10. Example.
For shape-memory alloys, one of the more commonly observed possibilities has as Austenite a crystal of cubic form. Since this group is maximal, any limit point is also cubic. Upon cooling it, the observations indicate softening described by
$$C_{11} - C_{12} \to 0, \tag{10.1}$$
in notation commonly used by experimentalists and by Love [11], for example. Here the C's are not to be confused with components of the tensor C. Referring to the comments about the cubic (No. 7) in Section 4, the eigenvalue which seems to vanish is that associated with the two-dimensional eigenspace, giving us the case $N = 2$, $\mathfrak{R}(P_0) = G(2\pi/3)$ considered in Section 7: the observations give that the other two eigenvalues remain positive. From the discussion in Section 7, the cubic branch passes smoothly through $\theta = \theta_0$, with the indicated eigenvalue becoming negative, for $\theta < \theta_0$, or, said differently,
$$C_{11} - C_{12} < 0 \quad \text{for} \quad \theta < \theta_0. \tag{10.2}$$
Also, we have the three symmetry-related branches noted in (7.6) and (7.7), involving the tetragonal symmetry. One of these will match the description in the Appendix, with the same basis $e_1$, $e_2$, $e_3$ used for the tetragonal and the cubic configurations. From Section 4, a generally has five distinct eigenvalues for tetragonal crystals, but at the bifurcation point one has the cubic symmetry so, as a consequence of invariance, some of the five must become equal to match the three for the cubic. This provides an example of an exceptional case, not regarded as unlikely, in which normally distinct eigenvalues can become equal and also vanish. Looking at what is involved, we find that, for the tetragonal, the eigenvalues associated with the two-dimensional eigenspace and the one-dimensional space of the form
$$\begin{pmatrix} 0 & a & 0 \\ a & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{10.3}$$
become equal to a positive eigenvalue for the cubic, so both of these are positive, locally. Of the two eigenvalues with span of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix} \tag{10.4}$$
one approaches that associated with the one-dimensional eigenspace for the cubic and is therefore positive. The other merges with that for which the eigenspace is of the form
$$\begin{pmatrix} a & 0 & 0 \\ 0 & -a & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{10.5}$$
these vanishing at the bifurcation point. With some additional analysis, which I won't elaborate, one can show that, nearby, one of the two is positive, one negative, with both of these changing sign as $\theta$ passes through $\theta_0$, indicating that these branches are unstable, having one negative eigenvalue, associated with a one-dimensional eigenspace. Actually, that associated with (10.5) is positive for $\theta > \theta_0$. There are simple possibilities for such branches regaining stability. For example, for $\theta > \theta_0$, the negative eigenvalue is one of those with the span (10.4), and it could become zero at some such temperature. From Section 4, this would give a bifurcation for which $N = 1$, the first kind discussed in Section 5. That is, one would have a tetragonal branch with positive eigenvalues, for temperatures below the temperature at which this bifurcation occurs. Exchange of stability between this and the cubic would then produce a rather typical subcritical bifurcation, a cubic-tetragonal phase transition of first order, and transitions of this general kind occur in shape-memory alloys. If this picture applies, one has a prediction that a certain eigenvalue associated with the tetragonal should become relatively small, near the indicated bifurcation. It is at least within the realm of possibility that experiments could confirm or deny this: I have not seen such data yet, but expect that experimentalists will produce some eventually. It isn't hard to see that more complicated combinations of bifurcations can be associated with cubic-tetragonal transitions and predict softening of the same tetragonal modulus. Simply, it is not very clear what can be done, experimentally, to discern between such possibilities, but this is a question which seems to deserve thoughtful consideration. With the information provided, an interested reader can similarly treat the possibility that the negative eigenvalue which again vanishes is that associated with (10.5), which gives a theoretically possible picture of a cubic-orthorhombic transition of first order, associated with softening of the same cubic modulus. Again, there are observations of cubic-orthorhombic transitions associated with this kind of softening in shape-memory alloys, but there is a quirk. What the suggested analysis gives could fit some kinds of cubic-orthorhombic transitions, but not one of the body-centered-cubic-face-centered-orthorhombic type, for example; this can be inferred by calculating the form of C associated with the orthotropic phase. On the other hand, transitions of this kind are observed. If nothing else, this indicates that it can be feasible to prove wrong a simple guess about a global bifurcation pattern, with the rather limited experimental information now available.
More accurately, I am inclined to believe that the general theory does apply to such transitions, it being the guess about the global bifurcation pattern which is at fault. After seeing what is wrong, one can make better educated guesses about the global pattern, finding some which could fit such transitions. One might find reason to reject some such, given experimental information concerning what is the soft modulus in the orthorhombic phase. This discussion serves to illustrate some of the kinds of things that can be inferred about unstable branches, using kinds of experimental results which could now be obtained, by knowledgeable experimentalists. I see this as a small piece of a larger problem. Among workers interested in phase transitions, it is a rather common practice to make some guess about the form of the relevant thermodynamic potential, involving some adjustable parameters which can be determined, using data which either has been or obviously can be obtained. Roughly, the general problem is to find the qualities characterizing the more likely guesses, and I expect some of these to be of a qualitative nature. For the more mathematical studies, what is often important is not the specific form of our function $\phi$, say, but its qualitative features. Looking back at our analyses, we have used some polynomial approximations of fairly high degree, and such polynomials represent popular guesses as to the form of $\phi$. From the work of Smith and Rivlin [10], one can determine those which are invariant under any one of the point groups. My experience is that you will have trouble finding relevant data to determine all of the adjustable constants involved. Even if you find enough data to determine all, you are likely to have a complicated function, making it hard to see just what general properties it has, that might enable you to apply some mathematical theorem. Thinking about this persuaded me that it was worthwhile to develop the kinds of pictures which I have presented here. Those who wonder why I have not considered crystals of more general kinds might consult the paper by Zanzotto [12]; there are some good physical reasons for this.

Appendix
The (three-dimensional) crystallographic groups, being subgroups of the orthogonal group $O(3)$, have elements $Q \in O(3)$, representable in the form
$$Q = \pm R, \qquad \text{where} \quad R = R(\psi, e) \in SO(3)$$
is a rotation with angle $\psi$, $e$ being a unit vector, representing the axis of rotation. The point groups realizable in Bravais lattices all contain the central inversion $Q = -1$, so can be generated by this and some group consisting of rotations. What is referred to as the point group $P_0$ in this paper is the latter group. Strictly speaking, this is not correct, but I don't think that this will cause confusion. Involved in the description of each is an orthonormal basis $e_i$ ($i = 1, 2, 3$), vectors often referred to as crystallographic axes. Where you see the designation $e = p$, this means to use for $e$ all of the four vectors $(e_1 \pm e_2 \pm e_3)/\sqrt{3}$. Listed according to their order, the indicated groups are
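$$\begin{array}{cll}
\text{Number} & \text{Elements} & \text{Order} \\
1 & 1 & 1 \\
2 & 1,\ R(\pi, e_3) & 2 \\
3 & 1,\ R(\pi, e_1),\ R(\pi, e_2),\ R(\pi, e_3) & 4 \\
4 & 1,\ R(\pi, e_1),\ R(\pi, (e_1 \pm \sqrt{3}\,e_2)/2),\ R(2\pi/3, e_3),\ R(4\pi/3, e_3) & 6 \\
5 & 1,\ R(\pi, e_1),\ R(\pi, e_2),\ R(\pi, e_3),\ R(\pi, (e_1 \pm e_2)/\sqrt{2}),\ R(\pi/2, e_3),\ R(3\pi/2, e_3) & 8 \\
6 & 1,\ R(\pi, e_1),\ R(\pi, e_2),\ R(\pi, e_3),\ R(\pi, (e_1 \pm \sqrt{3}\,e_2)/2),\ R(\pi, (\sqrt{3}\,e_1 \pm e_2)/2), & \\
  & R(\pi/3, e_3),\ R(2\pi/3, e_3),\ R(4\pi/3, e_3),\ R(5\pi/3, e_3) & 12 \\
7 & 1,\ R(\pi/2, e_1),\ R(\pi/2, e_2),\ R(\pi/2, e_3),\ R(\pi, e_1),\ R(\pi, e_2),\ R(\pi, e_3), & \\
  & R(3\pi/2, e_1),\ R(3\pi/2, e_2),\ R(3\pi/2, e_3),\ R(2\pi/3, p),\ R(4\pi/3, p), & \\
  & R(\pi, (e_i \pm e_j)/\sqrt{2})\ (i < j) & 24
\end{array}$$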
Crystals having symmetries described by one of these form one of the crystal systems or classes, as they are sometimes called. Different workers use different names for these, those used here being as follows:
1. Triclinic
2. Monoclinic
3. Orthorhombic
4. Trigonal
5. Tetragonal
6. Hexagonal
7. Cubic
One can further refine these into different symmetry types. As is discussed by Ericksen [4], this is of some importance for understanding what kinds of bifurcations are possible, but here, it is not necessary to make any explicit use of these refinements. One could specify the symmetry type at bifurcation, then infer something about the symmetry types on branches, for any of the cases here considered. By not doing so, I have left the theory somewhat incomplete. For those familiar with such matters, I don't think that it would be hard to fill in details of this kind.
Acknowledgement. This work was supported by AFOSR/XOP and NSF, under grant NSF/DMS 8718881.

REFERENCES
[1] Material Instabilities in Continuum Mechanics (ed. J.M. Ball), Oxford University Press (1988).
[2] Shape Memory Effects in Alloys (ed. J. Perkins), Plenum Press, New York-London (1975).
[3] ERICKSEN, J.L., Bifurcation and Martensitic transformations in Bravais lattices, to appear in J. Elasticity.
[4] ERICKSEN, J.L., Weak Martensitic transformations in Bravais lattices, Arch. Ration. Mech. Anal. 107 (1989), pp. 23-36.
[5] TRUESDELL, C., First Course in Rational Mechanics, vol. 1, Academic Press, New York-San Francisco-London (1977).
[6] MAN, C.-S., Material stability, the Gibbs conjecture and the first phase rule for substances, Arch. Ration. Mech. Anal. 91 (1985), pp. 1-53.
[7] LANDAU, L.D., On the theory of phase transitions, in Collected Papers of L.D. Landau (ed. D. Ter Haar), Gordon and Breach, and Pergamon Press, New York-London-Paris (1965).
[8] POINCARÉ, H., Sur l'équilibre d'une masse fluide animée d'un mouvement de rotation, Acta Math. 7 (1885), pp. 259-380.
[9] BALL, J.M., Differentiability properties of symmetric and isotropic functions, Duke Math. J. 51 (1984), pp. 699-728.
[10] SMITH, G.F. AND RIVLIN, R.S., The strain-energy function for anisotropic elastic materials, Trans. Am. Math. Soc. 88 (1958), pp. 175-193.
[11] LOVE, A.E.H., A Treatise on the Mathematical Theory of Elasticity, 4th ed., Cambridge University Press (1927).
[12] ZANZOTTO, G., Twinning in minerals and metals: remarks on the comparison of a thermoelasticity theory with some available experimental results, Notes I and II, Atti Acc. Lincei Rend. Fis. 82 (1988), pp. 723-741 and 743-746.
560 Meccanica 31: 473^88, 1996. © 1996 Kluwer Academic Publishers. Printed in the Netherlands.
Thermal Expansion Involving Phase Transitions in Certain Thermoelastic Crystals J.L. ERICKSEN
5378 Buckskin Bob Road; Florence, Oregon 97439, USA
(Received: 21 March 1995)

Abstract. This corrects an error and extends some analyses given in my paper [1] on branchings possible in Bravais lattices, involving deformations associated with thermal expansion.

Sommario. Questo lavoro corregge un errore ed estende alcune considerazioni contenute in un mio lavoro [1] sulle ramificazioni possibili nei reticoli di Bravais, collegate alle deformazioni associate a dilatazioni termiche.

Key words: Bravais lattices, Phase transitions, Thermomechanics of continua.
1. Introduction

Phase transitions, often involving a change of symmetry, are common in crystals. The experience is that nonlinear thermoelasticity theory is useful for analyzing some of these. Believing that such theory is underdeveloped, I have been attempting to improve it, primarily by trying to improve local theory covering kinds of branching likely to occur in crystals. In [1], I made an error in branching analyses for Bravais lattices of the trigonal kind. My main purpose is to correct and extend these analyses. These are local analyses, in the sense these words are commonly used by mathematicians. For example, in studying first order phase transitions, we pick a neighborhood which includes one of the phases, but generally not all of those occurring. As I interpret physicists, they sometimes use a different idea of local theory, in analyzing what are sometimes called weak first-order phase transitions. I will elaborate this, later. I believe that it is feasible to make significant improvements in theory of this kind, which is useful for describing shape-memory alloys. Strictly speaking, these are not crystals, the periodicity being disrupted by alloying atoms. However, thermoelasticity theory designed for Bravais lattices has proved to be quite successful for at least some of these.

It will become clear that the notions of weak transitions definitely do not apply to some kinds of physically interesting transitions. The task of developing sound theory for these involves overcoming some difficulties which seem formidable to me, so I am less optimistic about making significant improvements in theory for these, even when indications are that they should fall within the province of thermoelasticity theory. For true crystals which can be reasonably described as Bravais lattices, transitions observed tend to be of the latter kind. What should be done is to allow varying two control variables, pressure and temperature.
This introduces some complications which I prefer not to deal with here. What I will do is fix the pressure to be zero, using temperature as a control variable. Mathematically, essentially the same analysis can be used to treat cases where the pressure is fixed at any non-zero value; it is only a matter of adding to the potential the potential associated with pressure loadings.
Dealing with varying pressures at fixed temperature can be done in a similar way. However, I'll point out why dealing with variations of both is a bit tricky.

2. Preliminaries

We will use nonlinear thermoelasticity theory to study the equilibrium behavior of crystals subject to zero forces, in contact with an ideal heat bath at a constant value $\theta_0$ of the absolute temperature, $\theta_0$ being our control variable. In an elementary book on thermodynamics [2, Ch. 1] which happened to be at hand, I find that the thermodynamic potential appropriate for this is the ballistic free energy: its extremals are then to be regarded as equilibria, stable when they minimize this potential. What is commonly used in work on phase transitions, including what I did in [1], is to use the Helmholtz free energy at the temperature $\theta_0$. The two are not exactly equivalent, so it is not likely that predictions made will be independent of the choice. So, let's take a look at this. Let $\varepsilon$, $\beta$ and $\eta$ denote, respectively, internal energy, ballistic free energy and entropy, all per unit mass. Then, by definition

$\beta = \varepsilon - \theta_0 \eta$. (1)

If $\varphi$ denotes the Helmholtz free energy, per unit mass, at some temperature $\theta$,

$\varphi = \varepsilon - \theta \eta$. (2)

By one way of formulating thermoelasticity theory for crystals, viewed as homogeneous materials, we have a constitutive equation for $\varepsilon$, which can be taken to be the same for all material points, by a suitable choice of reference configuration. Once the latter has been fixed, it will be given by a function of the form

$\varepsilon = \varepsilon(C, \eta)$, (3)

which I assume to be smooth. Here, $C$ denotes the right Cauchy-Green tensor, the usual symmetric positive definite second order tensor. The function is restricted by invariance requirements associated with crystal symmetries, to be discussed later. With this formulation, the temperature of the body is given by

$\theta = \partial\varepsilon/\partial\eta$. (4)

In using the ballistic free energy, one allows configurations for which $\theta \neq \theta_0$. When one uses the Helmholtz free energy, one sets $\theta = \theta_0$. What we should do is to integrate the chosen thermodynamic potential over a given body and use the techniques of the calculus of variations, etc., to study possible equilibria. Instead, I follow the common practice of considering only values of $C$ and $\eta$ which are independent of position, looking at extremals of $\beta$ or $\varphi$. For $\beta$, the conditions for an extremal are

$\partial\varepsilon/\partial C = 0$, $\theta = \theta_0$. (5)
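The step from the definition (1) to the equilibrium conditions (5) can be spelled out as follows. This is only a short sketch, using (1), (3) and (4) with $\theta_0$ held fixed and $C$, $\eta$ varied independently.

```latex
\begin{align*}
d\beta &= d\varepsilon - \theta_0\, d\eta
        = \frac{\partial \varepsilon}{\partial C}\cdot dC
          + \left(\frac{\partial \varepsilon}{\partial \eta} - \theta_0\right) d\eta \\
       &= \frac{\partial \varepsilon}{\partial C}\cdot dC + (\theta - \theta_0)\, d\eta .
\end{align*}
```

So $d\beta = 0$ for all $dC$ and $d\eta$ exactly when $\partial\varepsilon/\partial C = 0$ and $\theta = \theta_0$. Because $\theta_0 \eta$ is linear in $\eta$, the second variations of $\beta$ and $\varepsilon$ coincide.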
Here and in the following, I exclude theories of constrained materials. Then there is the second derivative test, briefly indicated by the condition that

$d^2\beta = d^2\varepsilon \ge 0$ (6)
for all $dC$ and $d\eta$. This formulation would not be used for the other potential, since $\varphi$ is then regarded as a function of $C$ and $\theta$, not $\eta$. A formulation of the latter kind is equivalent, if (4) is uniquely invertible, to give $\eta$ as a smooth function of $C$ and $\theta$. One wants $\theta$ to be an increasing function of $\eta$, with $C$ fixed, to be able to satisfy conditions of stability, such as are implicit in (6). At least most workers seem happy to assume this. With this formulation, we have

$\varphi = \varphi(C, \theta)$, (7)

with equilibrium equations

$\partial\varphi/\partial C = 0$, $\theta = \theta_0$. (8)
If we take the thermodynamic potential to be $\varphi$, the second derivative test is

$d^2\varphi = dC \cdot \aleph\, dC > 0$, $\aleph = \partial^2\varphi/\partial C\,\partial C$, (9)

in notation which should be self explanatory. Essentially, (9) is the quadratic form used to calculate the isothermal strain energy used in linear theory. If you use $\beta$, hence (6), you get a condition only slightly different, which is equivalent to

$d^2\beta = dC \cdot \aleph\, dC + s\,(d\theta)^2 > 0$, (10)

where

$s = \partial\eta/\partial\theta$ (11)

is, like $\aleph$, evaluated at the equilibrium configuration, $\theta s$ being the specific heat at constant deformation. Basically, we accepted the assumption that $s > 0$, in arguing that the two formulations of thermoelasticity theory are equivalent, and I won't consider what might be predicted if $s$ vanishes or becomes negative. Generally, when workers speak of measurements of specific heat, etc., they refer to the specific heat at zero stress or possibly at some constant pressure. For the former, calculate $\theta\, d\eta/d\theta$ for $C$ a function of $\theta$ which satisfies (8). I leave it to the interested reader to determine how this behaves at branchings to be considered. For linear thermoelasticity theory, it is important that the inequality (10) be strict, but, to a worker with some experience in linear elasticity theory and heat transfer, these are obvious restrictions to impose on the linear constitutive equations. From now on, we will assume that the heat bath and body temperatures are the same. To set the scene, suppose that we have

$\partial\varphi/\partial C = 0$ at $C = C^{\#}$, $\theta = \theta^{\#}$, (12)
where the symbol # denotes some particular values of the variables. Also, we are given that this configuration represents a crystal configuration which is one of the representatives in a Bravais lattice, with trigonal symmetry. We would like to know if this is likely to be interpretable as a limit point of other equilibrium points occurring at slightly different temperatures and, if so, what can be said about these being arranged on smooth branches, what symmetries these might have, etc. About the constitutive function
for example, (9) might or might not be satisfied. Expecting that different functions will predict different kinds of behavior, we will try to cover all of the likely possibilities. First, it is convenient to use this configuration as a reference, so we have

$C^{\#} = 1$. (13)

Second, we need to have in mind general aspects of nonlinear thermoelasticity theory, as it applies to crystals, bearing in mind that the ultimate goal is to describe phase transitions which can be of first-order and involve changes of symmetry. For one thing, we need to have some way of relating values of $C$ to lattice vectors, to keep track of the symmetries. Those interested in molecular theories of elasticity use for this the Cauchy-Born hypothesis. One introduces some set of lattice vectors $A_k$ ($k = 1, 2, 3$) for whatever one takes to be the reference. Then, if this is changed by a macroscopic deformation with gradient $F$, the vectors

$a_k = F A_k$ (14)
are, by assumption, a possible set of lattice vectors for the new configuration. With the usual definition, $C = F^T F$, $C$ determines $a_k$ only to within an orthogonal transformation. Zanzotto [3] has looked hard at how well this agrees with observations, for deformations which can reasonably be considered to be elastic. He concludes that it is not reliable in general, but it does seem to be trustworthy for Bravais lattices and at least some shape-memory alloys. My experience is that the theory of Bravais lattices is rather successful for the latter. For more complicated kinds of crystals, (14) does not always fail to apply, but there seems to be no obvious pattern in this. In fact, Zanzotto's work leads to the conclusion that, for some crystals, elasticity theory is incapable of describing observations of mechanical twinning, for example. It remains to be explored whether some of the theories of continua with microstructure, such as have interested Gianfranco Capriz, can do better, but this seems worth exploring. However, this has induced me to focus on developing theory for the Bravais lattices, at present.

For Bravais lattices, with atoms pictured as structureless mass points, giving a possible set of lattice vectors determines the positions of these, to within an unimportant translation. Thus, it is reasonable to think that, at bottom, $\varphi$ is determined by these and $\theta$. Also, for any two sets of lattice vectors which determine the same configuration, $\varphi$ should have the same value. The linear transformations relating these represent the infinite discrete group $GL(3, \mathbb{Z})$, that is

$a_k \to m_k^{\,p} a_p$, (15)

where the $m$'s are integers, with $\det\|m_k^{\,p}\| = \pm 1$, giving us $m \in GL(3, \mathbb{Z})$. For any particular choice of $a_k$, some finite subgroup, called the lattice group, is singled out by

Lattice group of $a_k$ $= \{m \in GL(3, \mathbb{Z}) \mid m_k^{\,p} a_p = Q a_k,\ Q^T Q = 1\}$. (16)
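The membership condition in (16) can be tested without constructing $Q$: $m_k^{\,p} a_p = Q a_k$ holds for some orthogonal $Q$ exactly when the transformed vectors have the same inner products as the $a_k$, that is, when the Gram matrix of the lattice vectors is unchanged. The following is a minimal sketch under that reformulation, not part of the original analysis. The function names, the restriction of the search to integer matrices with entries in {-1, 0, 1}, and the two sample Gram matrices (a rhombohedral cell with $\cos\alpha = 1/4$, scaled to integers, and a simple cubic cell) are all assumptions made for illustration.

```python
from itertools import product


def transform_gram(m, gram):
    """Gram matrix of the new vectors m_k^p a_p, i.e. m G m^T."""
    mg = [[sum(m[k][p] * gram[p][q] for p in range(3)) for q in range(3)]
          for k in range(3)]
    return [[sum(mg[k][q] * m[l][q] for q in range(3)) for l in range(3)]
            for k in range(3)]


def lattice_group(gram, entries=(-1, 0, 1)):
    """Integer matrices (entries limited to `entries`, an assumption of this
    sketch) that leave the Gram matrix unchanged. Invariance of a
    nondegenerate Gram matrix already forces det m = +1 or -1."""
    group = []
    for flat in product(entries, repeat=9):
        m = [list(flat[0:3]), list(flat[3:6]), list(flat[6:9])]
        if transform_gram(m, gram) == gram:
            group.append(m)
    return group


# Hypothetical samples: equal lengths and equal angles with cos(alpha) = 1/4
# (multiplied by 4 to stay in integers), and three orthogonal unit vectors.
trigonal_gram = [[4, 1, 1], [1, 4, 1], [1, 1, 4]]
cubic_gram = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

trigonal_group = lattice_group(trigonal_gram)
cubic_group = lattice_group(cubic_gram)
print(len(trigonal_group), len(cubic_group))
```

Both groups found this way contain $m = -1$, so in each case half of the elements have determinant $+1$.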
The corresponding $Q$'s then form a group conjugate to this, a finite subgroup of the orthogonal group, which is the point group. For any choice of $a_k$, linear transformations $H$ satisfying

$m_k^{\,p} a_p = H a_k$, (17)
then represent a group $G(a_k)$ conjugate to $GL(3, \mathbb{Z})$. Putting this all together, we infer that

$\varphi(H^T C H, \theta) = \varphi(C, \theta)$, $H \in G(A_k)$. (18)
Included in this group are any orthogonal transformations in the point group for Ak, which then form a finite subgroup of G (A k ). In the sense described, the invariance group is really the same for all Bravais lattices. Linear theory is concerned with what happens near a minimizer of
Even if one accepts this plan, there is a question of how best to use it. Typically, one will have scant experimental data concerning the function
(19)
All trigonal crystals belong to the same Bravais lattice; we can arrange that the lattice group for any two is the same, by making a suitable choice of possible lattice vectors. We wish to
study the situation indicated by (12) and (13), when this reference configuration is trigonal. A possible set of lattice vectors for this satisfies

$A_2 = R A_1$, $A_3 = R A_2$, (20)

where $A_1$ must be picked so that this gives three linearly independent vectors, for one thing. One also needs to be careful that this does not describe a configuration of greater symmetry. For example, if you pick $A_1$ so that these three are orthogonal, these lattice vectors describe a simple cubic configuration. It is known that the only possible configurations of greater symmetry are cubic. If you study this a bit, you will see that you can also get body-centered and face-centered configurations, for different special choices of $A_1$. In addition to (19), the point group contains three 180° rotations, described by

$R_1(A_1, A_2, A_3) = (-A_3, -A_2, -A_1)$,
$R_2(A_1, A_2, A_3) = (-A_1, -A_3, -A_2)$, (21)
$R_3(A_1, A_2, A_3) = (-A_2, -A_1, -A_3)$,

automatically satisfied by lattice vectors satisfying (20). Axes of these are perpendicular to one of the lattice vectors and to the axis of $R$ in (19). This group of order 6 has two kinds of subgroups which can be point groups for Bravais lattices of lower symmetry. One is described by

Group consisting of identity, describing triclinic crystals. (22)

Transforming such a configuration by the group gives an orbit of six symmetry-related triclinic configurations. Another kind of subgroup is of order two, employing one of the 180° rotations in (21), say

Group with elements 1 and $R_1$, describing monoclinic crystals. (23)
The orbit of such a configuration gives three such configurations. There are two Bravais lattices of the monoclinic kind, the simple monoclinics and the base-centered kind. For the lattice group of one to be included in the trigonal, it must be as described by one of the equations (21), with $A_k$ replaced by a possible choice of lattice vectors for it. Without belaboring the matter, this makes them base-centered. There is another subgroup, that given by (19), but no Bravais lattice has this as a point group. So, in terms of those Pitteri neighborhoods, a trigonal neighborhood contains the trigonal, triclinic and base-centered monoclinic kinds of configurations. There we have a list of the changes of symmetries which might be treatable as weak, using a trigonal neighborhood. Now, one can say something about values of $C$ associated with the different kinds of symmetries occurring in a trigonal neighborhood. For example, the trigonals all have the same lattice group, as well as the same point group, so one can use known results$^1$ to infer that we will have, for the $R$ noted in (19),

Trigonal $\Leftrightarrow$ $C = p\,1 + q\, e \otimes e$, $Re = e$, $R^3 = 1$, $e \cdot e = 1$, (24)

$^1$ Forms of $C$ invariant under the various point groups are listed in the text by Truesdell [5, Ch. VI], for example.
where $p$ and $q$ are scalars, with $p - 1$ and $q$ not too large: there are values of these describing cubic configurations, which cannot be included in a trigonal neighborhood. Similarly, for any of the three 180° rotations noted in (21), we have, for example,

Monoclinic $\Leftrightarrow$ $C$ has $e$ as an eigenvector, $R_1 e = e$, $R_1^2 = 1$, $e \cdot e = 1$, (25)

at least for $C$ close to 1. All other values of $C$ describe triclinic configurations, again for $C$ close to 1. Now, we use the implicit function theorem to determine what we can about equilibria near $C = 1$, $\theta = \theta^{\#}$, generally neighborhoods much smaller than those used for the theory of weak transitions. Letting # denote evaluation of quantities at $C = 1$, $\theta = \theta^{\#}$, what is important for this is

$\ker \aleph^{\#} = \{D = D^T \mid \aleph^{\#} D = 0\}$. (26)
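The orbit counts quoted for the trigonal, monoclinic and triclinic kinds (one, three and six symmetry-related configurations) can be checked numerically by letting the six rotations of the trigonal point group act on sample values of $C$ through $C \to Q^T C Q$. This is a rough sketch only: the orthonormal basis (axis of the 120° rotation along the third base vector, axis of one 180° rotation along the first), the helper names and the three sample tensors are assumptions chosen for illustration, not data from the text.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def conjugate(q, c):
    """Return Q^T C Q."""
    return matmul(transpose(q), matmul(c, q))


def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


cos120, sin120 = -0.5, math.sqrt(3.0) / 2.0
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
R = [[cos120, -sin120, 0.0], [sin120, cos120, 0.0], [0.0, 0.0, 1.0]]  # 120 degrees about e3
R1 = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]            # 180 degrees about e1
R_sq = matmul(R, R)
group = [identity, R, R_sq, R1, matmul(R1, R), matmul(R1, R_sq)]


def orbit_size(c):
    """Number of distinct tensors Q^T C Q as Q runs over the group."""
    orbit = []
    for q in group:
        image = conjugate(q, c)
        if not any(close(image, seen) for seen in orbit):
            orbit.append(image)
    return len(orbit)


# Hypothetical sample values of C close to 1.
trigonal_c = [[1.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.0, 0.9]]        # form (24), e = e3
monoclinic_c = [[1.2, 0.0, 0.0], [0.0, 1.0, 0.1], [0.0, 0.1, 0.9]]      # e1 is an eigenvector
triclinic_c = [[1.2, 0.05, 0.03], [0.05, 1.0, 0.1], [0.03, 0.1, 0.9]]   # generic

sizes = (orbit_size(trigonal_c), orbit_size(monoclinic_c), orbit_size(triclinic_c))
print(sizes)
```

The orbit size is the group order 6 divided by the order of the subgroup leaving $C$ unchanged, which is how the counts arise from the subgroups (22) and (23).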
If this gives $D = 0$ as the only possibility, the simplest form of the implicit function theorem applies; locally, there exists a unique, smooth solution of the equilibrium equations, with $C = 1$ at $\theta = \theta^{\#}$. This is the least interesting possibility for us. However, at a first order phase transition, what is usual is that measured elastic moduli will imply that the stability inequality holds, which puts us in this situation. Thus, although we don't see it, this branch will continue on and, by continuity, (9) will continue to be satisfied for some interval of temperatures. In our trigonal neighborhood, if this starts as a trigonal branch, it follows from the local uniqueness that it will stay trigonal so, on it, $C$ will be of the form (24). If it were a monoclinic branch, the invariance would give a trio of branches, this possibility being excluded by the uniqueness of solutions. Similarly, triclinic branches are excluded. To analyze other possibilities, we need to determine what are the likely nontrivial possibilities for $\ker \aleph^{\#}$, which we will do in the next Section. In schematic form, the basic idea is to write

$C = 1 + D + E$, (27)

where $D \in \ker \aleph^{\#}$ and $E$ is in the orthogonal complement

$k^{\perp} = \{E = E^T \mid D \cdot E = 0,\ D \in \ker \aleph^{\#}\}$. (28)

Then, locally, the implicit function theorem tells us that we can solve equilibrium equations for

$E = E(D, \theta)$, $E(0, \theta^{\#}) = 0$, (29)
$E$ being unique and smooth. Now, $\aleph$, representing the second derivatives of
(30)
for $E$ in terms of $D$ and $\theta$, to get values of

$C = 1 + D + E(D, \theta)$. (31)

If we use (31), with $D$ replaced by a transform, we would get a value, say $C = 1 + R^T D R + E(R^T D R, \theta)$. On the other hand, it follows from invariance that equation (30) is also satisfied for $R^T C R = 1 + R^T D R + R^T E(D, \theta) R$. This would violate the uniqueness of solutions unless

$R^T E(D, \theta) R = E(R^T D R, \theta)$. (32)
Obviously, one can replace $R$ by any other rotation in the trigonal group in this, to conclude that the function $E$ is invariant under the group. The next step is to introduce the reduced potential

$\varphi_r(D, \theta) = \varphi(1 + D + E(D, \theta), \theta)$.

With (29) we know what invariance it inherits from $\varphi$,

$\varphi_r(R^T D R, \theta) = \varphi_r(D, \theta)$ (33)

for any rotation $R$ in the trigonal group, and that it is a smooth function. Also, from the way it was constructed, we have

$d\varphi_r = d^2\varphi_r = 0$, for all $dD$, $d\theta = 0$ at $D = 0$, $\theta = \theta^{\#}$. (34)

Also, the remainder of the equilibrium equations are equivalent to satisfying

$dD \cdot \partial\varphi_r/\partial D = 0 \quad \forall\, dD \in \ker \aleph^{\#}$. (35)
Now, we are interested in $\ker \aleph^{\#}$, and know that $\aleph^{\#}$ is a fourth order tensor which is invariant under the trigonal group. As is clear from (9), we have

$U \cdot \aleph^{\#} V = V \cdot \aleph^{\#} U$

for all symmetric tensors $U$ and $V$. So it has real eigenvalues and eigenspaces, with the usual property that eigenspaces corresponding to different eigenvalues are orthogonal. Also, each eigenspace is mapped to itself by the group. For an eigenspace to be one-dimensional, it must then have elements which are all proportional to one, which is either mapped to itself or to its negative by the group. There are various ways of picking two orthogonal tensors of the form (24), to get two one-dimensional eigenspaces. So we should look for two of this kind, and will find them. Then, the other eigenspaces will be orthogonal to these. It is not hard to dispose of the possibility of the tensors that might be transformed to their negatives. This leaves us eigenspaces which are at least two-dimensional, and there are various ways of constructing orthogonal sets using
these. Setting any of the eigenvalues equal to zero should then give us the possible kernels. Of the six eigenvalues, we should have four of them distinct. So, we must somehow cover the possibility that any one of the four vanishes. So, this gives a tentative description of what we should expect to find, in looking at $\aleph^{\#}$ in more detail.

3. Kernels and Reduced Potentials

If $\rho$ denotes the reference mass density and $\varepsilon$ the infinitesimal strain, the linear elastic (isothermal) strain energy is

$W = 2\rho\, \varepsilon \cdot \aleph^{\#} \varepsilon$. (36)
The form of this function for the trigonal crystals is well known, and involves 6 independent elastic moduli. Experimentalists generally use the labelling of moduli used by Love [6, Ch. VI] among others, who does discuss a way of deriving the forms appropriate for crystals of the various symmetries. If you use this you need to be aware that his shear strains are twice what are used in most modern work, as you can see from his Section 3. Various other writers follow this practice. Let me describe how I prefer to view this situation. Introduce an orthonormal basis, with the third base vector taken as the axis of the 120° rotation $R$. Then, invariant tensors will have the form

$\begin{pmatrix} p & 0 & 0 \\ 0 & p & 0 \\ 0 & 0 & q \end{pmatrix}$. (37)

In the orthogonal complement of these, one possible invariant set has the form

$U = \begin{pmatrix} 0 & 0 & a \\ 0 & 0 & b \\ a & b & 0 \end{pmatrix}$. (38)

If you explore how it transforms under the subgroup (19), you see that it transforms like a vector $f$, of the form

$f = \sqrt{2}\,(a, b, 0)$, $f \cdot e = 0$. (39)

The invariant set orthogonal to (37) and (38) is of the form

$V = \begin{pmatrix} c & d & 0 \\ d & -c & 0 \\ 0 & 0 & 0 \end{pmatrix}$. (40)

The non-zero entries in it change when $V$ is transformed by an element of the group. A calculation shows that

$g = \sqrt{2}\,(d, c, 0)$, $g \cdot e = 0$ (41)

transforms as a vector for the subgroup, although not more generally. Although $U$ and $V$ are orthogonal, $f$ and $g$ are not, which merely means that we are using a different inner product,
when we use the vector representation. Any eigenspace orthogonal to tensors of the form (37) will be a subset of $U + V$ and, in looking for these, we will want to use the norm and constraint indicated by

$1 = |U + V|^2 = |U|^2 + |V|^2 = |f|^2 + |g|^2$. (42)

Given any infinitesimal strain tensor $\varepsilon$, we can decompose it into the parts indicated, which gives

$p = (\varepsilon_{11} + \varepsilon_{22})/2$, $q = \varepsilon_{33}$,
$a = \varepsilon_{13}$, $b = \varepsilon_{23}$, (43)
$c = (\varepsilon_{11} - \varepsilon_{22})/2$, $d = \varepsilon_{12}$.

Having these facts in mind, it is easy to construct a formula for $W$ which is only invariant under (19),

$W = k_1(\varepsilon_{11} + \varepsilon_{22})^2 + 2k_2(\varepsilon_{11} + \varepsilon_{22})\varepsilon_{33} + k_3\varepsilon_{33}^2 + k_4|f|^2 + 2k_5\, f \cdot g + k_6|g|^2 + 2k_7\, e \cdot f \wedge g$, (44)

the $k$'s representing a possible choice of the seven material constants, the correct number for this symmetry. We have not imposed the constraint (42). Of course, we want $W$ to be invariant under the 180° rotations described in (21) and, for this, a routine calculation gives the requirement

$k_7 = 0$. (45)

For $W$ of this form it is easy to show that, usually, one gets two one-dimensional eigenspaces of the form (24), with eigenvalues which are those of the quadratic given by the first three terms in $W$. The possibility that one of these vanishes gives the equation

$k_1 k_3 = k_2^2$, (46)

the one equation we are allowed at this temperature. So, in particular, we reject as unlikely the possibility that these two eigenvalues both vanish. The kernel would then consist of tensors of this form, with the ratio of the entries fixed at a material constant, tensors of the form

$D = \begin{pmatrix} z & 0 & 0 \\ 0 & z & 0 \\ 0 & 0 & rz \end{pmatrix}$, $r = \text{const.}$, (47)

$z$ being arbitrary except that our interest is in what happens when it is near zero. Of course, (44) could be translated into an equation relating the moduli used by experimentalists, but I won't pursue this. If these eigenvalues are non-zero, the possible kernels lie in the orthogonal complement, which means analyzing the remaining three terms in (44). So, for this part of
$W$, one needs to find the eigenvalues and eigenvectors, which is not hard. For the eigenvalues, one gets the quadratic

$\lambda^2 - (k_4 + k_6)\lambda + k_4 k_6 - k_5^2 = 0$, (48)

and, in terms of $f$ and $g$, the eigenspaces are described by

$(\lambda - k_4) f = k_5 g$. (49)

A little exercise shows that if $\lambda_1$ and $\lambda_2$ are the roots of (48) we have

$\dfrac{\lambda_1 - k_4}{k_5} = -\dfrac{k_5}{\lambda_2 - k_4} = \dfrac{1}{t}$, (50)

say. So, for one eigenspace, we have

$f = t g$, (51)

and for the other

$f = -g/t$. (52)
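The relations (48)-(52) are easy to check numerically. The sketch below is only an illustration: the moduli `k4`, `k5`, `k6` are hypothetical numbers, and since (51) and (52) make $f$ a scalar multiple of $g$, a single pair of coefficients stands in for the two vectors.

```python
import math

# Hypothetical values of the moduli, chosen only for illustration.
k4, k5, k6 = 3.0, 1.0, 2.0

# Roots of the quadratic (48).
trace = k4 + k6
disc = math.sqrt((k4 - k6) ** 2 + 4.0 * k5 ** 2)
lam1 = 0.5 * (trace + disc)
lam2 = 0.5 * (trace - disc)

# The two expressions for t that follow from (50).
t = k5 / (lam1 - k4)
t_alt = -(lam2 - k4) / k5


def apply_block(f, g):
    """Stationarity conditions of k4|f|^2 + 2 k5 f.g + k6|g|^2 under the
    constraint (42): (f, g) -> (k4 f + k5 g, k5 f + k6 g)."""
    return k4 * f + k5 * g, k5 * f + k6 * g


# Eigenspace (51): f = t g, taking g = 1.
image_1 = apply_block(t, 1.0)
# Eigenspace (52): f = -g / t, taking g = 1.
image_2 = apply_block(-1.0 / t, 1.0)

print(t, t_alt)
print(image_1, (lam1 * t, lam1))
print(image_2, (-lam2 / t, lam2))
```

The identity behind (50) is $(\lambda_1 - k_4)(\lambda_2 - k_4) = -k_5^2$, which follows from the sum and product of the roots of (48).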
For one of these to be $\ker \aleph^{\#}$, our one allotted equation is

$k_4 k_6 = k_5^2$, (53)

and we then use our criterion to reject the possibility that $t = 0$ or $t = \infty$. As was anticipated at the end of Section 2, we here get our two-dimensional eigenspaces; we can assign the two-dimensional vector $g$ (or $f$) arbitrarily in (51) or (52). When the kernel is two-dimensional, we can set $\lambda_1 = 0$ in (50), and use (38)-(41) to represent (49) in terms of second order tensors, parametrized in terms of $x$ and $y$, where

$D(x, y) = \begin{pmatrix} y & x & tx \\ x & -y & ty \\ tx & ty & 0 \end{pmatrix}$, (54)
$t$ being the material constant noted above. This disagrees with what is stated in [1], which is erroneous. Now, we have only partly specified the orthonormal base vectors $e_1$, $e_2$, $e_3$ used for second order tensors. The trigonal group can be generated by the 120° rotation $R$ and the 180° rotation $R_1$, and we now select $e_1$ to be the axis of the latter, so

$R_1 e_1 = e_1$, $R e_3 = e_3$. (55)

Transforming $D$ by $R$ gives what we used before, implicitly,

$R(x, y) = [(-x + \sqrt{3}\,y)/2,\ -(\sqrt{3}\,x + y)/2]$ (56)

and, by a simple calculation,

$R_1(x, y) = (-x, y)$, (57)
which then generates the action of the trigonal group on $(x, y)$. So, there are just two possibilities to consider for $\ker \aleph^{\#}$, either the one-dimensional or the two-dimensional possibilities noted above. For either, we decompose the tensors into parts in $\ker \aleph^{\#}$ and its orthogonal complement, solving the equilibrium equations for the latter, then get the potential as a function of the form

$\varphi = \varphi_r(z, \theta)$, (58)

not subject to any invariance requirements. In the two-dimensional case, we get

$\varphi = \varphi_r(x, y, \theta)$, (59)

to be invariant under the group generated by (56) and (57).

4. Branching Analyses

Consider the easiest case first, indicated by (58). What we are given is, from (34),

$\partial\varphi_r/\partial z = \partial^2\varphi_r/\partial z^2 = 0$ at $z = 0$, $\theta = \theta^{\#}$, (60)

and we are interested in solutions of

$\partial\varphi_r/\partial z = 0$ (61)

for $z$ and $\theta - \theta^{\#}$ small. Here again excluding unlikely possibilities, we have

$\partial^2\varphi_r/\partial z\,\partial\theta \neq 0$, (62)

so we can use the implicit function theorem to solve for $\theta$, as a smooth function of $z$, which gives, for some non-zero constant $v$,

$\theta = \theta^{\#} + v z^2 + o(z^2)$. (63)
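The constant $v$ in (63) can be identified by a short expansion. This is a sketch, assuming in addition that $\partial^3\varphi_r/\partial z^3 \neq 0$ at $z = 0$, $\theta = \theta^{\#}$ (an assumption in the spirit of excluding unlikely possibilities, not a statement taken from the text), and using that $\partial\varphi_r/\partial z$ and $\partial^2\varphi_r/\partial z^2$ vanish there, as (34) implies.

```latex
\begin{align*}
0 = \frac{\partial \varphi_r}{\partial z}
  &= \left.\frac{\partial^2 \varphi_r}{\partial z\,\partial\theta}\right|_{\#}(\theta - \theta^{\#})
   + \frac{1}{2}\left.\frac{\partial^3 \varphi_r}{\partial z^3}\right|_{\#} z^2 + \cdots \\
\Longrightarrow\quad
\theta &= \theta^{\#} + v z^2 + o(z^2), \qquad
v = -\left.\frac{\partial^3 \varphi_r/\partial z^3}
               {2\,\partial^2 \varphi_r/\partial z\,\partial\theta}\right|_{\#} .
\end{align*}
```

The omitted terms, proportional to $z(\theta - \theta^{\#})$ and $(\theta - \theta^{\#})^2$, are of higher order once $\theta - \theta^{\#}$ is of order $z^2$. To the same order, $\partial^2\varphi_r/\partial z^2 \cong (\partial^3\varphi_r/\partial z^3)\, z$ along the solution curve, so its sign changes with the sign of $z$.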
So, locally, one has either $\theta \ge \theta^{\#}$ or $\theta \le \theta^{\#}$, depending on the sign of $v$, which is not likely to vanish. Rather obviously, this corresponds to having two equilibrium branches for, say $\theta > \theta^{\#}$, meeting at $\theta^{\#}$, where $dz/d\theta$ becomes infinite. With $z$ invariant under the trigonal group, both branches are trigonal. Also, a calculation shows that $\partial^2\varphi_r/\partial z^2$ has opposite signs on the two branches. So, as far as $\varphi_r$ is concerned, one passes a second derivative test for stability and the other fails. One could restabilize the latter by having another transition of this kind at a different temperature, to get the second derivative test to hold again. This pictures what could be a weak trigonal-trigonal phase transition as a typical subcritical bifurcation. Of course, $\varphi_r$ could pass a second derivative test for stability and
sketching this, you can easily see how this should change with $\theta$, to conform to the picture described above. You will also see another possible picture, with an exchange of stability involving two branches which are not connected by any other equilibrium branch. It is of some importance to understand the relative merits of such alternatives, but I won't pursue this. Turning to the other possibility, indicated by (59), we have, from (34),

$\nabla\varphi_r = \nabla\nabla\varphi_r = 0$ at $x = y = 0$, $\theta = \theta^{\#}$, (64)

which I will analyze in a different way than I did in [1]. First, it follows immediately from (57) that

$x = 0 \Rightarrow \partial\varphi_r/\partial x = 0$. (65)

Expanding $\varphi_r$ in a Taylor's series gives

$\varphi_r = h_1(x^2 + y^2) + h_2(y^3 - 3x^2 y) + h_3(x^2 + y^2)^2 + \cdots$, (66)

where I have accounted for the invariance and discarded an additive function of $\theta$ only. Here $h_1$, $h_2$ and $h_3$ are functions of $\theta$. It is not necessary to include the quartic term, but workers are apt to use a quartic, as a model for more specific calculations. From (64) we have

$h_1(\theta^{\#}) = 0$. (67)

Using our criterion to reject unlikely possibilities, we have, in particular

$h_2(\theta^{\#}) \neq 0$, $h_1'(\theta^{\#}) \neq 0$, (68)

so $h_1$ will change sign as $\theta$ passes through $\theta^{\#}$. From (66), it is clear that

$x = y = 0 \Rightarrow \nabla\varphi_r = 0$. (69)
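The invariance used in writing the expansion can be checked directly from the kernel tensor (54) and the actions (56) and (57). In the sketch below, which is an illustration rather than part of the original argument, the conjugation convention $D \to Q^T D Q$, the counterclockwise sense of the 120° rotation about $e_3$, the helper names and the sample numbers are assumptions. The two polynomials $x^2 + y^2$ and $y^3 - 3x^2 y$ are the quadratic and cubic combinations left unchanged by both actions.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def conjugate(q, d):
    """Return Q^T D Q."""
    return matmul(transpose(q), matmul(d, q))


def kernel_tensor(x, y, t):
    """The tensor D(x, y) of (54)."""
    return [[y, x, t * x], [x, -y, t * y], [t * x, t * y, 0.0]]


def act_R(x, y):
    """Action (56) of the 120 degree rotation on (x, y)."""
    s3 = math.sqrt(3.0)
    return (-x + s3 * y) / 2.0, -(s3 * x + y) / 2.0


def act_R1(x, y):
    """Action (57) of the 180 degree rotation about e1."""
    return -x, y


def quadratic(x, y):
    return x * x + y * y


def cubic(x, y):
    return y ** 3 - 3.0 * x * x * y


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(3) for j in range(3))


R = [[-0.5, -math.sqrt(3.0) / 2.0, 0.0],
     [math.sqrt(3.0) / 2.0, -0.5, 0.0],
     [0.0, 0.0, 1.0]]
R1 = [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]

x, y, t = 0.3, -0.7, 0.4  # hypothetical sample values
D = kernel_tensor(x, y, t)

rotated_ok = close(conjugate(R, D), kernel_tensor(*act_R(x, y), t))
flipped_ok = close(conjugate(R1, D), kernel_tensor(*act_R1(x, y), t))
print(rotated_ok, flipped_ok)
print(cubic(x, y), cubic(*act_R(x, y)), cubic(*act_R1(x, y)))
```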
Obviously, the orbit of this under the invariance group gives only this configuration, which means that it is a trigonal branch. As far as $\varphi_r$ is concerned, it passes a second derivative test for stability where $h_1 > 0$, and fails it where $h_1 < 0$. Using the cubic approximation in (66), one can make a first estimate of other branches passing through the origin, which gives three possibilities,

$x = 0$, $y \cong -2h_1/(3h_2)$,
$x = \pm\sqrt{3}\,y$, $y \cong h_1/(3h_2)$, (70)

the latter being transforms of the first by the invariance group. That there are three means that these are monoclinic configurations. We need to show that there really are such equilibria and, for this, we need only consider the first. From (65), one equilibrium equation is satisfied for $x = 0$. For the other, set

$y = \xi(\theta - \theta^{\#})$. (71)

With a suitable bound on $\xi$, $y$ is then of order $|\theta - \theta^{\#}|$. Using (65) we then have

$\partial\varphi_r/\partial y = h_2(\theta^{\#})(\theta - \theta^{\#})^2[\kappa\xi + 3\xi^2] + o(\theta - \theta^{\#})^2$, $\kappa = 2h_1'(\theta^{\#})/h_2(\theta^{\#})$. (72)
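The first estimates in (70) can be checked on a model potential. The sketch below assumes the cubic truncation $h_1(x^2 + y^2) + h_2(y^3 - 3x^2 y)$ with $h_1$ and $h_2$ frozen at hypothetical numerical values; it evaluates the gradient at the three points of (70) and the second derivatives on the branch $x = 0$. It illustrates the algebra only and is no substitute for the existence argument based on (71) and (72).

```python
import math

# Hypothetical values of h1 and h2 at some temperature near the transition.
h1, h2 = 0.2, 1.5


def grad(x, y):
    """Gradient of h1 (x^2 + y^2) + h2 (y^3 - 3 x^2 y)."""
    return (2.0 * h1 * x - 6.0 * h2 * x * y,
            2.0 * h1 * y + 3.0 * h2 * (y * y - x * x))


def hessian(x, y):
    """Second derivatives (xx, xy, yy) of the same cubic model."""
    return (2.0 * h1 - 6.0 * h2 * y,
            -6.0 * h2 * x,
            2.0 * h1 + 6.0 * h2 * y)


# The three branch points estimated in (70).
y_first = -2.0 * h1 / (3.0 * h2)
y_other = h1 / (3.0 * h2)
branch_points = [
    (0.0, y_first),
    (math.sqrt(3.0) * y_other, y_other),
    (-math.sqrt(3.0) * y_other, y_other),
]

gradients = [grad(x, y) for x, y in branch_points]
pxx, pxy, pyy = hessian(*branch_points[0])

print(gradients)
print(pxx, pxy, pyy, pxx * pyy - pxy ** 2)
```

On the branch $x = 0$ the second derivatives come out as $6h_1$, $0$ and $-2h_1$ in this model, with product $-12h_1^2$, so the critical point is a saddle whatever the sign of $h_1$.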
Obviously, one can pick an interval of $\xi$ values, not including $\xi = 0$, so that the quantity in brackets is positive at one endpoint, negative at the other. Then, for $\theta - \theta^{\#}$ small enough, $\partial\varphi_r/\partial y$ will have opposite signs at these endpoints, so it vanishes somewhere in the interval. Clearly, this agrees with the estimate of $y$ in (70), to first order in $\theta - \theta^{\#}$. On this branch, a first order estimate gives

$\partial^2\varphi_r/\partial x^2 \cong 2h_1 - 6h_2 y \cong 6h_1$,
$\partial^2\varphi_r/\partial y^2 \cong 2h_1 + 6h_2 y \cong -2h_1$, (73)
$\partial^2\varphi_r/\partial x\,\partial y \cong 0$,

indicating that $\varphi_r$ does not have a relative minimum here, so this is an unstable branch, as are the other two copies indicated in (70). Incidentally, one could push through direct calculations to find that values of $C$ occurring on these branches are consistent with (25), if one did not already know this. One could arrange that, on the side where $h_1 > 0$, one has a metastable trigonal branch, branching occurring at the limit of metastability. There, two equal eigenvalues of $\aleph$ vanish, the rest remaining positive. As one moves off onto a monoclinic branch, symmetry no longer requires those two eigenvalues to coincide. If both were positive, $\varphi_r$ would here have a relative minimum, contradicting (73). So, at least one of these is negative, and both might be. To decide this, I did somewhat lengthy calculations that I won't present, finding that, on these branches, near $\theta = \theta^{\#}$,

$\det \aleph = -12 h_1^2 P + o(\theta - \theta^{\#})^2$, (74)
where P > 0 is the product of the non-vanishing eigenvalues at 9 — 9#, which are by assumption positive. So the eigenvalues in question are of opposite sign. On the side where h\ > 0, one could restabilize the monoclinic branch by having another transition to reverse the sign of d2tpT/dy2, a monoclinic—monoclinic transition. This gives a subcritical bifurcation much like that considered before, mathematically, except that it involves the trio of branches. So, this seems to be the simplest pattern for a trigonal-monoclinic transformation, one which can be explicitly analyzed, using the quartic (66). Another possibility is to follow this transition with one more transition, of the monoclinictriclinic kind, essentially letting x = 0 split into two branches, related by the rotation R\. From [1], this gives a typical pitchfork bifurcation. Make the latter two transitions close enough to each other, and the metastable monoclinic phase won't have a chance to become stable. I believe that this is the simplest pattern to describe a trigonal-triclinic transition, and know that you will find the quartic inadequate to describe it. One can describe this with a sextic. Often, when one has a number of symmetry-related phases, they exist together in a sample to form microstructures which can be complex. Instead of pursuing this, I refer the reader to Bhattacharya et al. [7], for a recent summary of what is known about theory of this kind. Finally, I note that there is much confusion in the literature about bases for classifying Bravais lattices into 14 types, as is cleared up by Pitteri and Zanzotto [8]. References 1. Ericksen, J. L., 'Local bifurcation theory for thermoelastic Bravais lattices', In: D. Kinderlehrer, R. James, M. Luskin and J. L. Ericksen (Eds.), IMA Volumes in Mathematics and its Applications vol. 54, Springer-Verlag, New York, 1993, pp. 57-84.
575
488
J.L. Ericksen
2. Ericksen, J. L., Introduction to the Thermodynamics of Solids, Chapman and Hal), London, 1991. 3. Zanzotto, G., 'On the material symmetry groups of elastic crystals and the Born rule', Arch. Rational Mech. Anal, 121, (1992) 1-36. 4. Pitteri, M., 'Reconciliation of local and global symmetries of crystals', J. Elasticity, 14, (1984) 175-190. 5. Truesdell, C, A First Course in Rational Continuum Mechanics, vol. 1, Academic Press, New York, 1977. 6. Love, A.E.H., A Treatise on the Mathematical Theory of Elasticity, 4th edition, Cambridge University Press, Cambridge, 1927. 7. Bhattacharya, K., Firoozye, N.B., James, R.D. and Kohn, R.V., 'Restrictions on microstructure', Proc. Roy. Soc. Edinburgh, 124A, (1994) 843-878. 8. Pitteri, M. and Zanzotto, G., 'On the definition and classification of Bravais lattices' (to appear).
On the Possibility of Having Different Bravais Lattices Connected Thermodynamically
J. L. ERICKSEN
5378 Buckskin Bob Rd, Florence OR 97439-8320 (Received April 28, 1995; Final version August 11, 1995)
Abstract: It is generally accepted that the simplest kinds of crystals exhibit 14 kinds of symmetry, the Bravais lattices described in texts treating crystals. Briefly, Bravais's idea was to regard two configurations as having the same symmetry provided they can be connected by a special kind of path, in the space of lattice vectors. However, with his assumptions, he overlooked the possibility that some of the 14 can be joined by such paths, which reduces the 14 to 11. To be explored is the question of whether paths found to connect the different Bravais lattices are thermodynamically possible, when pressure and temperature are suitably controlled.
I. INTRODUCTION

What we will be concerned with are the simplest kinds of crystals, pictured here as a collection of identical mass points, filling all of space. To within an unimportant translation of the whole set, a possible configuration is completely described by giving three linearly independent vectors $a_k$, the lattice vectors. Put one point at the origin; then position vectors of the rest are the vectors

$$\sum_k n^k a_k, \tag{1}$$

where the $n^k$ are integers. Any choice of these locates one of the points. Two sets of lattice vectors $a_k$ and $a_k^*$ describe the same collection of points, provided that there is some matrix $m = \|m_k^p\|$ of integers, with $\det m = \pm 1$, such that

$$a_k^* = m_k^p a_p. \tag{2}$$

Such matrices form a representation of an infinite discrete group, the general linear group on the integers, which I'll call G. Associated with any choice of $a_k$ is its lattice group $L(a_k)$, given by

$$L(a_k) = \{m \in G \mid m_k^p a_p = Q a_k,\ Q^{-1} = Q^T\}, \tag{3}$$

and its point group $P(a_k)$, consisting of all orthogonal Q occurring here. Every lattice group contains $m = -1$, so every point group contains the central inversion $Q = -1$. For our purposes, it is convenient to exclude these, so we replace $L(a_k)$ by what is left, that is,

$$L^+(a_k) = \{m \in L(a_k) \mid \det m = 1\}, \tag{4}$$
with $P^+(a_k)$ denoting the corresponding part of $P(a_k)$, a set of rotations. As is well known, these lattice and point groups are finite groups. Rotating such a configuration amounts to rotating its lattice vectors, which doesn't change the associated $L^+$, but does change $P^+$, by a similarity transformation. Replacing $a_k$ by $a_k^*$, as in (2), does change $L^+$, by a similarity transformation, using the m in (2), but leaves $P^+$ unaltered. With a slight abuse of language, $L^+$ and $P^+$ will be called the lattice and point groups, respectively. These are some elementary ideas about crystallography to be used here.

Mathematics and Mechanics of Solids 1: 5-24, 1996 © 1996 Sage Publications, Inc.

There is the notion that configurations need not be related by an orthogonal transformation to have the same symmetry. For example, what commonly occurs in thermal expansion of one of our crystals? Reasonably, we can picture this as described by a set of lattice vectors depending continuously on temperature, so these generally get shorter or longer, if we exclude phase transitions, which might occur at isolated values of temperature. We intuitively accept the idea that there is no change in symmetry. Similar remarks apply to what occurs if we consider what happens to crystals subject to varying hydrostatic pressures. If, for some crystal, there is such a "normal" path connecting two configurations, we should regard them as having the same symmetry. By considering all configurations, we see that this will lead to some variety of maximal connected sets. Essentially, each of these can be viewed as having a different kind of symmetry, unless they can be related by one of the trivial operations mentioned in the first paragraph.

A line of thought like this was used by Bravais [1] to infer that there are 14 types of symmetry. Pick up your favorite reference on crystals and, most likely, it will have some discussion of this matter, along with some description of these types. The result is well accepted and useful. I note that a Bravais lattice now means the collection of all lattice vectors with the same symmetry, according to this classification. So these are the 14 Bravais lattices, as the words are now understood. However, there are subtleties involved in trying to understand the reasoning leading to the result.
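The statement around (1) and (2), that an integer matrix with determinant ±1 changes the lattice vectors without changing the set of points, can be checked on a small example. The following is only an illustrative sketch: the function names, the sample lattice vectors, and the choice to store $m_k^p$ as `m[p][k]` are assumptions made here, not part of the paper.

```python
from itertools import product


def det3(m):
    """Determinant of a 3x3 integer matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def change_basis(m, a):
    """Return a*_k = sum_p m[p][k] * a_p, following (2)."""
    return [tuple(sum(m[p][k] * a[p][i] for p in range(3)) for i in range(3))
            for k in range(3)]


def lattice_points(a, n_max):
    """All points sum_k n^k a_k with integers |n^k| <= n_max, following (1)."""
    points = set()
    for n in product(range(-n_max, n_max + 1), repeat=3):
        points.add(tuple(sum(n[k] * a[k][i] for k in range(3))
                         for i in range(3)))
    return points


# Hypothetical sample data: orthogonal lattice vectors and a unimodular m.
a = [(1, 0, 0), (0, 2, 0), (0, 0, 3)]
m = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
a_star = change_basis(m, a)

# Every point built from one basis (small coefficients) is found among the
# points built from the other basis (somewhat larger coefficients).
same_points = (lattice_points(a_star, 2) <= lattice_points(a, 4)
               and lattice_points(a, 2) <= lattice_points(a_star, 4))
```

Because only finitely many coefficients are enumerated, this is a consistency check on a finite window of the lattice, not a proof; the proof rests on the inverse of m also being an integer matrix when det m = ±1.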
One could consider continuous paths in the space of lattice vectors, on which either the point or lattice group remains fixed. It is easy to show that fixing the point group fixes the lattice group. Also, fixing the lattice group fixes the point group, to within the trivial operations mentioned above, so either assumption fixes both, essentially. Although this seems to be the most obvious way to begin, it was not the way chosen by Bravais. Roughly, he allowed paths on which the symmetry increases, excluding those on which it decreases. At the ends, the symmetry should be the same. There are at least two reasonable ways of interpreting these words, to describe particular kinds of paths, which I will call Bravais connections, namely,

Bravais P-connections are continuous paths on which the point group is the same at the endpoints, to within the aforementioned orthogonal transformations. At interior points, it can be larger, but not smaller.

or

Bravais L-connections are continuous paths on which the lattice group is the same at the endpoints. At interior points, it can be larger, but not smaller. (5)

Note that in (5), "the same" is to be interpreted literally. As discussed in some detail by Pitteri and Zanzotto [2], the two types are understood to be equivalent in much of the literature, but they discovered that this is not true. As they interpret the words written by Bravais, he had in mind (5), and I agree. For one thing, they realized that I [3] had, unwittingly, found a Bravais P-connection joining lattice vectors in two different Bravais lattices. So, logically, he should have put them in the same Bravais lattice. They then did more analyses, to prove that, using the indicated interpretation of his assumptions, there are exactly 11! Clearly, this is something we need to better understand. As to my example, I was trying to illustrate differences between lattice and point groups.
Similar in spirit to Bravais's theory is the topological study by Schwarzenberger [4], who distinguishes more than 14 types of symmetry. The difference can be understood in terms of admissible paths. For Schwarzenberger, admissible paths are of the more obvious kind. That is, the point group must be constant along the path, to within the trivial ambiguity associated with orthogonal transformations. My example has it larger at just one point, which does disqualify it from being one of Schwarzenberger's paths. There is another approach. I don't remember where or when I first learned about it, and don't know its origins. It seems to be regarded as equivalent to Bravais's [1], by at least some workers. I accepted this, without exploring it carefully. As a matter of taste, I have preferred it because it seems simpler, conceptually. Simply, regard two configurations as having the same symmetry provided that, for some choice of their lattice vectors, their lattice groups are the same. This involves no consideration of paths, obviously. In [3], I introduced a notion of fixed sets, as a way of helping picture how the Bravais lattices have connected parts, a view which is a bit different from that of Bravais or Schwarzenberger [4], permitting one to look at various kinds of paths, including both of the latter. Briefly, two configurations in the same fixed set can be joined by a Bravais L-connection lying in this set, and two configurations will be in the same fixed set if their lattice groups coincide. Pitteri and Zanzotto [2] have found group-theoretic results proving that this approach does give exactly 14 different types of symmetry. I expect that many workers will continue to regard this as the right answer. So, had Bravais employed the paths described by (5), and used correct reasoning, he would have gotten the right answer. However, it seems clear from the discussion of Pitteri and Zanzotto that he did not use such paths and they found other faults in his reasoning. 
I do note that, if the aim is to get the usual 14, assumptions about paths are redundant. My view is not that the paths mentioned are uninteresting, but that it is better to study them and others within the framework of thermodynamics. In this, kinematical studies like those of Schwarzenberger are likely to be useful, as are other kinds of kinematical results. Here, my main purpose is to explore the question of whether a Bravais P-connection of the kind occurring in my example and two similar ones found by Pitteri and Zanzotto [2] are thermodynamically possible equilibria, in a crystal, when it is in a heat bath at a controlled temperature $\theta$, and subjected to a static pressure p, also a control variable. More accurately, I'll use the equations of nonlinear thermoelasticity theory, along with rather conventional thermodynamic reasoning. I won't do much to assess the merits of different ideas of connectivity, but suggest that you think about this as you see what kinds of problems we encounter.
II. THERMOELASTICITY THEORY

I want to use thermoelasticity theory to analyze equilibria for Bravais lattices in situations where, for one crystal, equilibria can be in different Bravais lattices, depending on the environment. Roughly, thermoelasticity theory allows lattice vectors to deform differently at different material points, so deformed crystals do not fit classical definitions of crystals, strictly speaking. I presume that the reader is familiar with the implied abuse of language. As will become clear, this is not really involved in analysis to be done. Theory of the kind indicated is discussed in some detail in [5], for example, so I will be brief. One assumption is that
the Helmholtz free energy per unit mass is a function $\varphi$ of the lattice vectors $a_k$ and the temperature $\theta$. This function is, by assumption, invariant under the infinite discrete group G mentioned in the introduction. Thus it is the same for all the simple kinds of crystals considered. If this seems to contradict everything you have read, bear with me. It works out to be not as contradictory as it might seem. Commonly used invariance arguments imply that $\varphi$ is expressible in the form

$$\varphi = \varphi(a_k \cdot a_p, \theta). \tag{6}$$

To relate this to macroscopic deformation, we use the Cauchy-Born hypothesis. Pick some reference configuration, with constant lattice vectors (Endnote 1) $A_k$. Then, if F is the macroscopic deformation gradient, the assumption is that the vectors

$$a_k = F A_k \tag{7}$$
are a possible set of lattice vectors in the deformed configuration. For our purposes, deformations encountered can be small enough to let us use a much simpler kind of theory, using the restriction of $\varphi$ to a Pitteri neighborhood of the reference configuration. There, $\varphi$ reduces to a function of C and $\theta$ that is invariant under the point group,

$$\varphi(C, \theta) = \varphi(Q^T C Q, \theta), \quad Q \in P^+(A_k), \tag{8}$$

where $C = F^T F$. Because central inversions map C to itself, one could use $P(A_k)$ in place of $P^+(A_k)$, but this makes some accounting less neat. From [6] or [7], all configurations in this neighborhood have lattice groups that are subgroups of $L^+(A_k)$, and one can use (7) to relate these to values of F. As is rather customary, I assume values of F satisfy det F > 0. Also, because C determines F to within a rotation, which does not affect lattice groups, one can equally well relate lattice groups to values of C. Also, as noted in [3], a lattice group determines a point group, to within the ambiguity associated with rotations so, with this ambiguity, values of C determine point groups. Here, I will assume that
stable equilibria minimize

$$\int_B \rho\,\gamma\,dV, \tag{9}$$

where

$$\gamma = \varphi + (p/\rho)\det F, \tag{10}$$

$\rho$ is the (constant) reference mass density, B is our choice of material region, some fairly simple shape, maybe a rectangular parallelepiped, and we can use

$$(\det F)^2 = \det C, \tag{11}$$

to get $\gamma$ as a function of C, p, and $\theta$. As is rather obvious, $\gamma$ and $\varphi$ enjoy the same invariance. Later, we will see what is commonly regarded as the Gibbs function. For reasons of tractability, I will assess stability using derivative tests, which won't distinguish between absolute and relative minima. Probably, it would shed new light to study this, not assuming $\varphi$ is restricted to a Pitteri neighborhood, or to use arguments not so local, at least. What I ask is, roughly, are there reasonable choices of $\varphi$, such that its minimizers behave in a certain way? With the local theory, we do arrive at some rather definite kinds of equilibrium patterns, which seem to me to be physically reasonable.

1. Footnotes are found in the Notes section before the References.
III. MATERIALS

As I am doing here, we theoreticians somehow describe phenomena to be analyzed. Then we are inclined to talk of a material as something described by one constitutive equation, in our case a particular choice of the function $\varphi$, once we have decided on what to use for a reference configuration. As you probably know, this is a bit unrealistic. If a careful experimentalist does his or her best to get 100 identical samples and do identical experiments on them, there are likely to be differences in the measurements too large to be attributable to experimental error; one might behave very differently from the rest. I have encountered enough examples of this to induce me to believe it. It would be better to think of a material as described by a set of slightly different constitutive equations. What I am after are predictions applying to almost all samples of some material, roughly speaking. There is a discussion of such matters by Man [10]. What I will do is to use some rather naive tests to check that predictions to be obtained are fairly robust. Physically, it is not enough to keep values of $\varphi$ almost the same. One can argue that this is not even necessary, because adding an affine function of $\theta$ is of no importance, physically. Commonly measured things, like stresses, elastic moduli, and specific heats at constant pressure should be nearly the same and these are, essentially, first and second derivatives. Certainly, one wants more, that is, the different equations should deliver almost the same predictions for situations to which the theory is likely to apply, physically. By common consent, such perturbations should not destroy invariance. I do not think it is possible to make these ideas into rigorous mathematical hypotheses at present.
IV. KINEMATICAL CONSIDERATIONS

Consider lattice vectors depending continuously on a parameter $\tau$. For $\tau < 0$, we have $a_k = a_k^-(\tau)$, all with the same lattice group $L^-$. For $\tau > 0$, we have $a_k = a_k^+(\tau)$, all with the same lattice group $L^+ \neq L^-$. At $\tau = 0$ these have the common limit $a_k^0$. For $\tau < 0$, take

$$a_p^- \cdot a_k^- = 0 \ \text{ for } p \neq k, \quad \text{and} \quad |a_1^-| \neq |a_2^-| \neq |a_3^-| \neq |a_1^-|, \tag{12}$$
lattice vectors associated with the simple orthorhombic Bravais lattice. The point group then has elements of the form 1, $R_1$, $R_2$, $R_3$, where $R_k$ is a 180° rotation with axis parallel to $a_k^-$. The corresponding lattice group elements are diagonal matrices described by the identity and (Endnote 2)

$$R_1 = m_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \tag{13}$$

$$R_2 = m_2 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \tag{14}$$

$$R_3 = m_3 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{15}$$

For $\tau > 0$, the lattice vectors are of the base-centered orthorhombic kind, with

$$|a_1^+| = |a_2^+| \neq |a_3^+|, \quad a_1^+ \cdot a_3^+ = a_2^+ \cdot a_3^+ = 0, \quad a_1^+ \cdot a_2^+ \neq 0. \tag{16}$$

Again, the point group consists of 1 and three 180° rotations $R_k^*$, with perpendicular axes, which are, respectively, parallel to $a_1^+ + a_2^+$, $a_1^+ - a_2^+$, and $a_3^+$. The corresponding lattice group consists of the identity and the three symmetric matrices indicated by

$$R_1^* = m_1^* = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \tag{17}$$

$$R_2^* = m_2^* = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \tag{18}$$

$$R_3^* = m_3^* = m_3. \tag{19}$$

Looking at (12) and (16), we see that it is possible for these to meet at $\tau = 0$, for $a_k^0$ of the form

$$a_p^0 \cdot a_k^0 = 0 \ \text{ for } p \neq k, \quad \text{and} \quad |a_1^0| = |a_2^0| \neq |a_3^0|, \tag{20}$$

lattice vectors of the simple tetragonal kind. The tetragonal has lattice and point groups of order 8, the latter being described in the appendix. Above, we have 6 elements. What remains is a 90° rotation, say

$$R_\perp = m_\perp = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{21}$$

and the 270° rotation $R_\perp^T$.

Several comments are in order. First, as was mentioned in the introduction, Pitteri and Zanzotto [2] noticed that this puts the two kinds of orthorhombics in the same Bravais lattice, according to Bravais's criterion, although there are numerous comments to the contrary in the literature. Second, there is no way to choose lattice vectors so that these two kinds of lattice vectors have the same lattice group. If you wish to prove this for yourself, it is not hard. What
one needs to show is that there is no $m \in G$ such that the matrices $m\,m_k\,m^{-1}$ are, in some order, $m_1^*$, $m_2^*$, and $m_3^*$. Third, it does not seem obvious that our paths are thermodynamically possible; the Pitteri and Zanzotto work raises questions of this kind, for other pairs of Bravais lattices. There are some similarities and some differences in the different cases. It would be interesting to find some kind of crystal that is observed to follow one of those Bravais P-connections, joining configurations in different Bravais lattices. I do see a reason to believe that it would be hard to demonstrate this conclusively. One needs certain kinds of second-order phase transitions. With this, there is the usual difficulty that, experimentally, it is very difficult to distinguish between such a phase transition and a first-order transition involving very small jumps.

Finally, we do need to pick a reference configuration. Our Pitteri neighborhood must contain the three kinds of configurations mentioned above. This will be true if we take it to be one of the simple tetragonal kind; this is what I will try. One could try a simple cubic neighborhood, an alternative I have not explored. I do not want to begin by assuming those thermodynamic paths to be possible; if they were, I would take $A_k = a_k^0$. So, let us postpone making the selection more definite.

The next order of business is to understand the structure of Pitteri neighborhoods of this kind. First, the divisors of 8, the order of the tetragonal lattice group and point group, described in the appendix, are 1, 2, 4, and 8. From elementary group theory, the order of any subgroup must be one of these numbers. Recall the property that any configuration in the neighborhood has a lattice group that is one of these subgroups. For $A_k$, we have assumed that

$$A_p \cdot A_k = 0 \ \text{ for } p \neq k, \quad \text{and} \quad |A_1| = |A_2| \neq |A_3|, \tag{22}$$

and we have a listing of lattice groups plus information about point groups listed above. Each $m \in L^+(A_k)$ is associated with exactly one rotation $R \in P^+(A_k)$ by what we have abbreviated as R = m, as in Endnote 2. Consider any value of F, subject to the condition that it keeps us in the neighborhood. From (7), this gives lattice vectors $a_k$ having some lattice group. For any $m \in L^+(a_k)$, there is some rotation R such that

$$R a_k = R F A_k = m_k^p a_p = F m_k^p A_p. \tag{23}$$

However, it is also true that $m \in L^+(A_k)$, so there is some $R' \in P^+(A_k)$ such that

$$m_k^p A_p = R' A_k, \tag{24}$$

from which $RF = FR'$, or

$$R'^T C R' = C. \tag{25}$$
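The passage from (23) and (24) to (25) can be spelled out in two lines. This is a sketch of the intermediate steps, using only $C = F^T F$ and the orthogonality of R:

```latex
\begin{aligned}
R F A_k &= F m_k^p A_p = F R' A_k \quad (k = 1, 2, 3)
  \;\Longrightarrow\; R F = F R', \\
R'^{T} C R' &= R'^{T} F^{T} F R' = (F R')^{T} (F R')
  = (R F)^{T} (R F) = F^{T} R^{T} R F = F^{T} F = C .
\end{aligned}
```

The first implication uses the linear independence of the $A_k$: two linear maps that agree on a basis are equal.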
By reading this backward, you get the converse result. Now, if you take any value of C and transform it in the manner indicated by the left side of (25) for every $R \in P^+(A_k)$, you get a certain number of distinct values, which is called an orbit. The values of such R leaving C invariant form a subgroup, linked uniquely to a subgroup of $L^+(A_k)$, which is the lattice group for the lattice vectors determined by C, using (7). Divide 8, the order of the tetragonal group, by the order of the subgroup, and you get the number of values of C in the corresponding orbit. These facts are familiar to experts, and I will use them.

Two other items need to be kept in mind. First, to consider the restriction of $\varphi$ to be invariant under the tetragonal group, it is necessary that its domain be invariant: if C is in the domain, so must be its orbit. This is not a strong restriction but, as is usually the case with neighborhoods that can be picked in various ways, some choices won't satisfy this restriction. Second, not all subgroups of a lattice group are lattice groups. For example, the $m_\perp$ given by (21) generates a subgroup of order 4. However, if you look at the conditions lattice vectors must satisfy to have this in their lattice group, you will find that they must be of the simple tetragonal or simple cubic form, so this is just a proper subgroup. Hence, one can examine all subgroups of $L^+(A_k)$, pick out those that are possible lattice groups, relate these to subgroups of $P^+(A_k)$, and determine (Endnote 3) the values of C left invariant under each to get a picture of how the possible symmetries occur in the neighborhood. Below are the details. These are given in terms of components of C, using the orthonormal basis

$$e_k = A_k/|A_k|. \tag{26}$$
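The counting rule just stated (orbit size equals 8 divided by the order of the subgroup leaving C invariant) can be tried out by brute force. The sketch below is illustrative only: it assumes the matrices (13) and (21) may be used directly as rotations relative to the basis (26), in the spirit of the abbreviation R = m, and the function names and the integer sample values standing in for C − 1 are hypothetical choices made here.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as tuples of tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def transpose(a):
    return tuple(tuple(a[j][i] for j in range(3)) for i in range(3))


IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
M1 = ((1, 0, 0), (0, -1, 0), (0, 0, -1))      # 180 degree rotation, as in (13)
M_PERP = ((0, 1, 0), (-1, 0, 0), (0, 0, 1))   # 90 degree rotation, as in (21)


def generate_group(generators):
    """Close a set of matrices under multiplication (finite groups only)."""
    group = {IDENTITY}
    frontier = [IDENTITY]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = matmul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group


def act(r, c):
    """The action R^T C R appearing on the left side of (25)."""
    return matmul(matmul(transpose(r), c), r)


def orbit(c, group):
    return {act(r, c) for r in group}


def stabilizer(c, group):
    return {r for r in group if act(r, c) == c}


TETRAGONAL_GROUP = generate_group([M1, M_PERP])

# Hypothetical integer stand-ins for C - 1; only the pattern of entries matters.
samples = {
    "tetragonal": ((1, 0, 0), (0, 1, 0), (0, 0, 2)),
    "simple orthorhombic": ((1, 0, 0), (0, 2, 0), (0, 0, 3)),
    "base-centered orthorhombic": ((1, 4, 0), (4, 1, 0), (0, 0, 2)),
    "monoclinic, invariant under m3": ((1, 4, 0), (4, 2, 0), (0, 0, 3)),
    "triclinic": ((1, 4, 5), (4, 2, 6), (5, 6, 3)),
}
orbit_sizes = {name: len(orbit(c, TETRAGONAL_GROUP))
               for name, c in samples.items()}
```

For each sample, `len(orbit(c, TETRAGONAL_GROUP)) * len(stabilizer(c, TETRAGONAL_GROUP))` equals the group order, which is the division rule quoted in the text.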
Values of C must lie in the neighborhood and, of course, C must be positive definite. It would be nice to characterize the boundary of the largest possible such neighborhood and to make a study of how
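Tetragonal configurations, with the full lattice group $L^+(A_k)$ of order 8, for which

$$C - 1 = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \beta \end{pmatrix}, \tag{27}$$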
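with orbits consisting of just one value of C. Next largest are those with lattice groups of order 4, orbits containing two different values.

Simple orthorhombic configurations with lattice group elements 1, $m_1$, $m_2$, $m_3$, an orbit being described by

$$C - 1 = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{pmatrix}, \quad \begin{pmatrix} \beta & 0 & 0 \\ 0 & \alpha & 0 \\ 0 & 0 & \gamma \end{pmatrix}. \tag{28}$$

Here and in the following, the entries are to be restricted, so that the listed entries for an orbit are all different. However, using the same letter for different orbits does not imply that such values must be the same.

Base-centered orthorhombic configurations with lattice group elements 1, $m_1^*$, $m_2^*$, $m_3^*$, with orbit of the form

$$C - 1 = \begin{pmatrix} \alpha & \pm\gamma & 0 \\ \pm\gamma & \alpha & 0 \\ 0 & 0 & \beta \end{pmatrix}. \tag{29}$$

Configurations of lesser symmetry are

Simple monoclinic configurations with lattice group elements 1 and either $m_1$ or $m_2$, these giving the same orbit, with four different values of C, indicated by

$$C - 1 = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & \pm\gamma \\ 0 & \pm\gamma & \delta \end{pmatrix}, \quad \begin{pmatrix} \beta & 0 & \pm\gamma \\ 0 & \alpha & 0 \\ \pm\gamma & 0 & \delta \end{pmatrix}. \tag{30}$$

Simple monoclinic configurations, with lattice group elements 1, $m_3$. Orbits contain four different values of C, indicated by

$$C - 1 = \begin{pmatrix} \alpha & \pm\beta & 0 \\ \pm\beta & \gamma & 0 \\ 0 & 0 & \delta \end{pmatrix}, \quad \begin{pmatrix} \gamma & \pm\beta & 0 \\ \pm\beta & \alpha & 0 \\ 0 & 0 & \delta \end{pmatrix}. \tag{31}$$

Centered monoclinic configurations with lattice group elements 1 and either $m_1^*$ or $m_2^*$, these giving the same orbit, with the four values of C given by

$$C - 1 = \begin{pmatrix} \alpha & \beta & \pm\gamma \\ \beta & \alpha & \mp\gamma \\ \pm\gamma & \mp\gamma & \delta \end{pmatrix}, \quad \begin{pmatrix} \alpha & -\beta & \pm\gamma \\ -\beta & \alpha & \pm\gamma \\ \pm\gamma & \pm\gamma & \delta \end{pmatrix}. \tag{32}$$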
Triclinic configurations, with the trivial lattice group consisting of the identity. This covers values of C not fitting any of the above descriptions. The orbit consists of eight values of C. (33)

I leave it to the reader to determine how these eight are related. Note that for $\gamma \to 0$ in (32), we get deformations of the kind described by (29). Also, we do if we let $\gamma \to \alpha$ in (31). This means that we can connect these two kinds of monoclinics by a Bravais P-connection passing through base-centered orthorhombics. So, as was first noticed by Pitteri and Zanzotto [2], Bravais should have put these in the same Bravais lattice. Obviously, the two kinds of orthorhombics can be connected by such a path, passing through the reference configuration, as followed from my example [3]. Also, for later analyses, it is important to note that the simple monoclinics described by (31) have as limits both kinds of orthorhombics, so the latter two can be connected by continuous paths through the former. Bravais would not allow such paths.

This is not a bad place for the reader to pause, to form a mental picture of how these parts fill up the neighborhood, to think about those questions of connectivity mentioned in the introduction. A description of this kind, for trigonal neighborhoods, is given in [5]. Later, I will summarize such information for some other kinds of neighborhoods.

Rather obviously, with $\varphi$ invariant under the tetragonal group, if one value of C satisfies the equilibrium equations, so will any other values on the same orbit, and there will be some, if the configuration is not tetragonal. Also, they will be equally stable or unstable. This opens the possibility of equilibria for which C takes on more than one such value in one body, which can involve rather complicated microstructures, such as are discussed by Bhattacharya, Firoozye, James, and Kohn [9].
Those familiar with such theory might have spotted that we have orbits with properties needed for twinning to be possible in the monoclinic and orthorhombic configurations, which will be encountered in Section VI, for example.
V. REDUCED POTENTIALS

Here, we use theory of a still more local nature, the implicit function theorem, to deduce something about those equilibria we hope to find. It will help us to get some idea of what properties the function $\varphi$ should have.
Suppose that we have one solution of the equilibrium equations in our tetragonal neighborhood, say

$$\partial\gamma/\partial C = 0 \ \text{ at } \ C = C_0, \quad p = p_0, \quad \theta = \theta_0. \tag{34}$$

Suppose also that this is a tetragonal configuration, so that $C_0$ is of the form found in (27), for some choice of the numbers $\alpha$ and $\beta$. To use the implicit function theorem, we introduce the fourth order tensor, in notation that should be clear,

$$K = \partial^2\gamma/\partial C\,\partial C. \tag{35}$$

Let $K_0$ denote the value it has, at the arguments indicated in (34). What is relevant is its kernel, ker $K_0$, the linear space of symmetric second order tensors D satisfying

$$K_0 D = 0 \iff D \in \ker K_0. \tag{36}$$

If D = 0 is the only possibility, the simplest form of the implicit function theorem applies; namely, for p near $p_0$ and $\theta$ near $\theta_0$, there is a unique value of C satisfying (34), depending smoothly on p and $\theta$, with $C = C_0$ at $p = p_0$, $\theta = \theta_0$. To get those orthorhombic phases into the picture, we need ker $K_0$ to be nontrivial, in such a way that it favors branches of those kinds. To try to better understand this, let us assume that ker $K_0$ is nontrivial in some way, not yet specified, and take this as our reference, so

$$C_0 = 1. \tag{37}$$

The quadratic form based on $K_0$ is

$$\varepsilon_0 = E \cdot K_0 E, \tag{38}$$

where E is any symmetric second-order tensor, and we use the inner product $A \cdot B = \operatorname{tr} AB$ for these. Because we are evaluating $K_0$ at a tetragonal configuration, $\varepsilon_0$ is invariant under this group, being in this respect like the linear elastic strain energy function for a simple tetragonal crystal. So, it is a function of the same general form, something available in some books on linear elasticity theory, for example Love [12, ch. VI]. Look this up, rearrange terms to express this as a sum of squares, and you get

$$\varepsilon_0 = k_1(E_{11} - E_{22})^2 + k_2 E_{12}^2 + k_3(E_{13}^2 + E_{23}^2) + k_4(E_{11} + E_{22} - 2kE_{33})^2 + k_5(kE_{11} + kE_{22} + E_{33})^2, \tag{39}$$
where the k's are material constants (Endnote 4), and components of E are relative to the orthonormal basis used in Section IV. The fourth order tensor $K_0$ has the symmetry required to have the usual orthogonality of eigenspaces. To within unimportant numerical factors, its eigenvalues are the numbers $k_i$ (i = 1 to 5). Having a nontrivial kernel amounts to having one or more of these vanish. So, we need to decide which one(s) should vanish. Let us look at the eigenspaces corresponding to each of these. A calculation gives

$$\text{for } k_1: \quad \begin{pmatrix} x & 0 & 0 \\ 0 & -x & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{40}$$

$$\text{for } k_2: \quad \begin{pmatrix} 0 & y & 0 \\ y & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{41}$$

$$\text{for } k_3: \quad \begin{pmatrix} 0 & 0 & x_1 \\ 0 & 0 & x_2 \\ x_1 & x_2 & 0 \end{pmatrix}, \tag{42}$$

$$\text{for } k_4: \quad \begin{pmatrix} x_3 & 0 & 0 \\ 0 & x_3 & 0 \\ 0 & 0 & -2kx_3 \end{pmatrix}, \tag{43}$$

$$\text{for } k_5: \quad \begin{pmatrix} kx_4 & 0 & 0 \\ 0 & kx_4 & 0 \\ 0 & 0 & x_4 \end{pmatrix}, \tag{44}$$
where the x's and y are arbitrary variables. From one view, this can be used as a change of variables, replacing components of C − 1 by the x's and y. That is, we can decompose C − 1 into parts lying in these orthogonal spaces. This gives the invertible transformation, designed so the left sides vanish when C = 1,

$$\begin{aligned}
2x &= C_{11} - C_{22}, \\
y &= C_{12}, \\
x_1 &= C_{13}, \\
x_2 &= C_{23}, \\
(1 + 2k^2)x_3 &= (C_{11} + C_{22})/2 - kC_{33} + k - 1, \\
(1 + 2k^2)x_4 &= k(C_{11} + C_{22}) + C_{33} - 2k - 1.
\end{aligned} \tag{45}$$

Now, to get the simple orthorhombics, we want to get equilibria of the form of (28) instead of (27), the only difference being that $C_{11} = C_{22}$ in one case, not the other. So, we would like to make it easy to get $x \neq 0$, which means letting $k_1 = 0$. Similarly, to get the base-centered kind, we want $y \neq 0$, so $k_2 = 0$. I know that it will make physicists nervous to assume that two different eigenvalues vanish so, for now, let us assume only what seems safe,

$$k_3 k_4 k_5 \neq 0, \tag{46}$$

regarding the guess as just that, that is,

$$k_1 = k_2 = 0? \tag{47}$$
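As a check on the change of variables (45), one can map the components of C to the new variables and then rebuild C by adding the eigenspace elements (40) to (44) to the identity. The sketch below uses exact rational arithmetic; the function names and the sample values of C and k are illustrative assumptions, not data from the paper.

```python
from fractions import Fraction


def to_order_parameters(C, k):
    """Map the components of C to (x, y, x1, x2, x3, x4), following (45)."""
    s = 1 + 2 * k * k
    x = (C[0][0] - C[1][1]) / 2
    y = C[0][1]
    x1 = C[0][2]
    x2 = C[1][2]
    x3 = ((C[0][0] + C[1][1]) / 2 - k * C[2][2] + k - 1) / s
    x4 = (k * (C[0][0] + C[1][1]) + C[2][2] - 2 * k - 1) / s
    return x, y, x1, x2, x3, x4


def from_order_parameters(x, y, x1, x2, x3, x4, k):
    """Rebuild C as the identity plus the sum of the matrices (40)-(44)."""
    return [
        [1 + x + x3 + k * x4, y, x1],
        [y, 1 - x + x3 + k * x4, x2],
        [x1, x2, 1 - 2 * k * x3 + x4],
    ]


# Hypothetical sample values, chosen only to exercise the formulas.
k = Fraction(1, 3)
C = [
    [Fraction(11, 10), Fraction(1, 20), Fraction(1, 50)],
    [Fraction(1, 20), Fraction(9, 10), Fraction(-1, 25)],
    [Fraction(1, 50), Fraction(-1, 25), Fraction(21, 20)],
]
parameters = to_order_parameters(C, k)
rebuilt = from_order_parameters(*parameters, k)
```

With these values `rebuilt` reproduces `C` exactly, and feeding in the identity for C returns six zeros, consistent with the remark that the left sides of (45) vanish when C = 1.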
So, using (45), we express $\gamma$ as a function of the new variables. According to the implicit function theorem, we can solve part of the equilibrium equations, that is,

$$\partial\gamma/\partial x_i = 0, \quad i = 1, 2, 3, 4, \tag{48}$$

to get

$$x_i = \bar{x}_i(x, y, p, \theta), \quad \bar{x}_i(0, 0, p_0, \theta_0) = 0, \tag{49}$$

these functions being unique and smooth, locally. Actually, what I would really like is to have this be a minimization, that is, to have

$$\gamma(x_i, x, y, p, \theta) \geq \gamma(\bar{x}_i, x, y, p, \theta), \tag{50}$$

at least near the given equilibrium at $(p_0, \theta_0)$. The obvious second derivative test is satisfied if

$$k_3 > 0, \quad k_4 > 0, \quad k_5 > 0. \tag{51}$$
Bearing in mind that there are limits on the domain of $\varphi$, we can then introduce the reduced potential, obtained by inserting the solutions (49) into $\gamma$,

$$\gamma_r(x, y, p, \theta) = \gamma(x_i, x, y, p, \theta), \quad x_i \text{ given by (49)}. \tag{52}$$

As is customary in bifurcation theory, we have eliminated most of the variables in the problem, keeping only those that determine the nature of branching, what physicists sometimes call "order parameters." Now, the remaining equilibrium equations reduce to

$$\partial\gamma_r/\partial x = \partial\gamma_r/\partial y = 0. \tag{53}$$

As is discussed in some detail in [5], the reduced potential inherits invariance from $\gamma$, here being invariant under the tetragonal group, as this acts on (x, y). By straightforward calculations, this gives the conditions

$$\gamma_r(x, y, p, \theta) = \gamma_r(-x, y, p, \theta) = \gamma_r(x, -y, p, \theta). \tag{54}$$

From this, it is obvious that

$$x = y = 0 \implies \partial\gamma_r/\partial x = \partial\gamma_r/\partial y = 0, \tag{55}$$

giving us at least these tetragonal equilibria: they must be, because there is obviously only one configuration of this kind in an orbit, and because the functions in (49) are unique.
VI. BRANCHING ANALYSIS

Here, we need to get into analysis involving more nonlinearity, to explore branching. We are interested in what is likely to happen near the origin, for p near $p_0$, $\theta$ near $\theta_0$. A Taylor expansion up to quartic terms in x and y, using (54), gives

$$\gamma_r = g_0 + g_1 x^2 + g_2 y^2 + g_3 x^4 + 2g_4 x^2 y^2 + g_5 y^4, \tag{56}$$

where the g's are smooth functions of p and $\theta$. We could similarly approximate them, but this only makes equations longer. I will assume that the quartic part is positive in the strict sense, subject to the inequalities

$$g_3 > 0, \quad g_5 > 0, \quad g_4 > -(g_3 g_5)^{1/2}. \tag{57}$$

Intuitively, I want a rather high energy barrier around the origin, to favor having rather stable equilibria occur in the vicinity. With the quartic approximation, we can see the lower slopes, but not the top. The questionable (47) translates to

$$g_1(p_0, \theta_0) = g_2(p_0, \theta_0) = 0? \tag{58}$$
Now, as was discussed in Section III, I want predictions that are not very sensitive to small changes in the basic constitutive equations. The idea is that the functions $g_i$ are slightly different for the different samples of a material, and we are looking for predictions that are essentially the same for all or at least almost all samples. One common way of dealing with this amounts to rejecting the possibility that more than two equations in two unknowns are satisfied. I will use less formal reasoning leading to the same conclusions. Consider the possibility that $g_1 = 0$ on some curve in the p-$\theta$ plane. If $g_1$ had the same algebraic sign on both sides, then a slight perturbation in this function would either produce no such curve, or give two curves,
with some rare exceptions. So, we reject this possibility and, for similar reasons, cases where $g_1 = 0$ at an isolated point. On the other hand, if $g_1$ has opposite signs on the two sides, a small perturbation will give $g_1 = 0$ on a slightly different curve, again changing sign as we cross the curve, a tolerable situation. In a similar way, $g_2$ might well vanish on some other curve, and it is then not unlikely that the two curves will intersect. For (58) to hold, we want the two curves to have a point in common. If they were there just tangent, a little perturbation would make them separate or cross, which will not do. On the other hand, having two curves cross is a tolerable situation, granted that the point of intersection will be a little different for different samples. I also tolerate inequalities like (57), because I do not want those energy barriers to be very different. Mathematically, this is different from what was mentioned in Section III, amounting to the requirement that perturbations do not much change fourth derivatives. I could accept the possibility that those two curves cross at more than one point, but will not consider it here. What seems robust is having those two curves cross, dividing the $p$-$\theta$ plane into four parts, at least locally. To describe them, we have the four regions

I: points where $g_1 > 0$, $g_2 > 0$,
II: points where $g_1 > 0$, $g_2 < 0$,
III: points where $g_1 < 0$, $g_2 < 0$,
IV: points where $g_1 < 0$, $g_2 > 0$. (59)
These will share common boundary points, parts of the curves $g_1 = 0$ and $g_2 = 0$, and the point common to all, described by

$$ g_1(p_0, \theta_0) = g_2(p_0, \theta_0) = 0, \tag{60} $$
the curves and this point being a little different for different samples. We have thus removed the question mark in (58). Next, treating the quartic approximation as exact, we can get explicit solutions of the equilibrium equations. Of course, these are only first approximations, for $x$, $y$, $p - p_0$ and $\theta - \theta_0$ near zero. There are the tetragonal equilibria noted in (55), defined in all four regions, which pass a second derivative test for stability in region I, only. Note that in I, including boundary points on $g_1 = 0$ and $g_2 = 0$, these actually minimize the quartic. There are pairs of equilibria of the simple orthorhombic kind, given by

$$ x^2 = -g_1/(2g_3), \qquad y = 0 \tag{61} $$

in regions III and IV. For a second derivative test for these, we calculate that those derivatives, evaluated using (61), are

$$ \partial^2\psi_r/\partial x^2 = -4g_1, \qquad \partial^2\psi_r/\partial x\,\partial y = 0, \qquad \partial^2\psi_r/\partial y^2 = 2(g_2 - g_1 g_4/g_3), \tag{62} $$

so the second derivative test is satisfied where

$$ g_1 < 0, \qquad g_2 > g_1 g_4/g_3. \tag{63} $$

There are also pairs of equilibria of the base-centered kind, given by

$$ x = 0, \qquad y^2 = -g_2/(2g_5) \tag{64} $$
in regions II and III. For these, similar calculations give that the second derivative test for stability is satisfied provided that

$$ g_2 < 0, \qquad g_1 > g_2 g_4/g_5. \tag{65} $$

Finally, there can be quadruples of equilibria given by

$$ x^2 = (g_2 g_4 - g_1 g_5)/g > 0, \qquad y^2 = (g_1 g_4 - g_2 g_3)/g > 0, \tag{66} $$

where

$$ g = 2(g_3 g_5 - g_4^2). \tag{67} $$

Conditions for these to pass a second derivative test for stability are reducible to

$$ g\,x^2 y^2 > 0. \tag{68} $$
Using (45) and (66), you can verify that these are simple monoclinic configurations of the kind described by (31). Physically, I want there to be at least one stable equilibrium configuration possible for any choice of $(p, \theta)$ near $(p_0, \theta_0)$, and I have listed all to be found, using the quartic. We can do this, either including or excluding those monoclinic configurations. Let us first consider excluding them, by making them unstable, taking

$$ g_4 > 0, \qquad g_4^2 > g_3 g_5, \tag{69} $$

which is consistent with (57). Then, in I, the tetragonal is the only possibility. In IV and part of III, the simple orthorhombics at least pass a second derivative test, as (63) requires $g_1 < 0$, reaching the limit of metastability in III, where

$$ g_2 g_3 = g_1 g_4, \qquad g_1 < 0, \qquad g_2 < 0. \tag{70} $$

Similarly, in II and part of III, there are the base-centered orthorhombics, these reaching a limit of metastability in III, where

$$ g_1 g_5 = g_2 g_4, \qquad g_1 < 0, \qquad g_2 < 0. \tag{71} $$

With (69), the regions where these configurations satisfy a second derivative test for stability overlap. One is more stable than the other, except where they give the same value to $\psi_r$, which works out to be where

$$ g_1^2 g_5 = g_2^2 g_3, \tag{72} $$
occurring on some curve in III. Nominally, a first-order phase transition should occur, on a control path, as this crosses the curve. However, it is common for first-order phase transitions to be hysteretic, meaning that the transitions occur after this point has been reached, going either way. With the first-order phase transition relating them, chances are good that they would sometimes coexist in a specimen, with interfaces that might well be associated with interesting microstructures, which helps keep the phases relaxed. This situation is somewhat unusual, in having twinning possible in two phases related by a first-order transition. Suffice it to say that, if a real version of our hypothetical crystal were found, it would be an interesting one to observe and study, as would other cases to be treated.
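The bookkeeping in (61)-(72) can be checked numerically. The following is a minimal sketch, assuming the quartic has the form $\psi_r = g_0 + g_1x^2 + g_2y^2 + g_3x^4 + 2g_4x^2y^2 + g_5y^4$; that form is inferred from (61)-(67) rather than quoted, and the function names and sample coefficients are illustrative only. It lists the equilibrium branches, applies the second derivative test, and prints the energies at a sample point chosen to lie on the curve (72) with (69) holding.

```python
import math

def psi(x, y, g):
    """Assumed quartic form of the reduced potential (inferred, not quoted)."""
    g0, g1, g2, g3, g4, g5 = g
    return g0 + g1*x**2 + g2*y**2 + g3*x**4 + 2*g4*x**2*y**2 + g5*y**4

def grad(x, y, g):
    """First derivatives of psi with respect to x and y."""
    _, g1, g2, g3, g4, g5 = g
    return (2*x*(g1 + 2*g3*x**2 + 2*g4*y**2),
            2*y*(g2 + 2*g4*x**2 + 2*g5*y**2))

def hessian(x, y, g):
    """Second derivatives (psi_xx, psi_xy, psi_yy)."""
    _, g1, g2, g3, g4, g5 = g
    return (2*g1 + 12*g3*x**2 + 4*g4*y**2,
            8*g4*x*y,
            2*g2 + 4*g4*x**2 + 12*g5*y**2)

def is_stable(x, y, g):
    """Second derivative test: the Hessian must be positive definite."""
    hxx, hxy, hyy = hessian(x, y, g)
    return hxx > 0 and hxx*hyy - hxy**2 > 0

def equilibria(g):
    """One representative (x >= 0, y >= 0) of each branch that exists."""
    _, g1, g2, g3, g4, g5 = g
    out = {"tetragonal": (0.0, 0.0)}                          # x = y = 0
    if -g1/(2*g3) > 0:                                        # (61)
        out["simple orthorhombic"] = (math.sqrt(-g1/(2*g3)), 0.0)
    if -g2/(2*g5) > 0:                                        # (64)
        out["base-centered orthorhombic"] = (0.0, math.sqrt(-g2/(2*g5)))
    gg = 2*(g3*g5 - g4**2)                                    # (67)
    if gg != 0:
        x2, y2 = (g2*g4 - g1*g5)/gg, (g1*g4 - g2*g3)/gg       # (66)
        if x2 > 0 and y2 > 0:
            out["simple monoclinic"] = (math.sqrt(x2), math.sqrt(y2))
    return out

# Illustrative point in region III (g1 < 0, g2 < 0) with (69) holding and
# g1^2 g5 = g2^2 g3, so that it lies on the curve (72).
g = (0.0, -2.0, -1.0, 4.0, 3.0, 1.0)
for name, (x, y) in equilibria(g).items():
    print(f"{name:28s} psi = {psi(x, y, g):+.4f}  stable: {is_stable(x, y, g)}")
```

Moving the sample point off the curve (72), for example by changing `g[1]` slightly, makes one orthorhombic branch lower in energy than the other while both still pass the test, which is the overlap described above.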
From (61), in region IV, $x \to 0$ as $g_1 \to 0$, so this phase merges to the tetragonal phase in region I, continuously. On the orthorhombic side, at equilibrium,

$$ \psi_r = g_0 - g_1^2/(4g_3), \tag{73} $$

a function of $p$ and $\theta$, interpretable as the Gibbs function used in elementary treatments of thermodynamics, for example that of Pippard [13, Chs. 8 and 9]. On the tetragonal side,

$$ \psi_r = g_0. \tag{74} $$
The two values of $\psi_r$ and $-\partial\psi_r/\partial\theta$, the entropy per unit mass, match on $g_1 = 0$, which means that there is no latent heat. The two values of $\partial^2\psi_r/\partial\theta^2$ approach different limits, which means that there is a jump in the specific heat at constant pressure. This confirms that this is a second-order phase transition, by customary criteria. Similarly, one has a second-order phase transition on $g_2 = 0$, relating the tetragonal to the other kind of orthorhombic. If one takes a control path from some point in region II to one in region IV, passing through I, the deformation thus varies continuously on it, this being a Bravais P-connection. One could, in principle, reduce the part in I to the single point $(p_0, \theta_0)$. One should be able to come close, at least, by using some ingenuity in controlling pressure and temperature; it is not so different from trying to hit a critical point in a fluid. For our two kinds of second-order phase transitions, the symmetry change involves two lattice groups such that the ratio of their orders is 2 or 1/2, depending on which ratio you use. So there is a doubling (halving) of symmetry. To physicists familiar with phase transitions, there is some agreement that this is one of the few kinds of second-order transitions that are likely to be observed. So, the ponderings have led us to a fairly definite, robust kind of pattern of equilibria, which seems to me to be thermodynamically possible and to include a path like that mentioned in the introduction. There is the other possibility, excluded by (69). A similar analysis of it shows a rather similar pattern, except that the curve bearing the first-order transition is replaced by a wedge-like region occupied by simple monoclinic configurations, all transitions being of second-order. My view is that this pattern is also thermodynamically possible. Physical experience suggests that thermodynamic potentials are likely to exhibit rather mild algebraic singularities at second-order transitions.
That is, critical exponents predicted using smooth functions can differ from experimental values. There is some possibility of taking care of this by minimizing smoothness assumptions on the temperature dependence of
$C$ gives a configuration with lattice group properly contained, so does $\lambda C$, for any positive number $\lambda$. Section IV gives some ways of picturing the behavior of some neighborhoods as subsets of ours. You can pick out various neighborhoods of lesser symmetry, and get some idea of what happens to them, near the tetragonal reference.
VII. A SIMILAR CASE

Pitteri and Zanzotto [2] constructed two other rather similar kinds of paths. Our analyses apply to one of these, with a slight adaptation. Again, this involves two different orthorhombic Bravais lattices, connected through a tetragonal. This time, the orthorhombics are of the face-centered and body-centered kinds and the tetragonal is of the centered type. The lattice groups are quite different from those encountered before, but they are similarly related. Now, we use a Pitteri neighborhood of the centered tetragonal kind, with reference lattice vectors $A_k$ given by

$$ A_1 = a(e_2 - e_1) - b e_3, \qquad A_2 = a(e_2 - e_1) + b e_3, \qquad A_3 = a(e_1 + e_2) - b e_3, \tag{75} $$
where $a$ and $b \neq a$ are positive numbers, and $e_k$ denotes the orthonormal basis referred to in the appendix. I will not record the lattice group elements, which are easy to calculate. Relative to this basis, the elements of $P^+(A_k)$ are described in the same way as before. For one thing, this means that the quadratic "energy" (39) also applies here. The same subgroups are point groups, so any one commutes with the same values of $C$ as before. So, for the two values of $C$ given by (28), say, it is immediate that these describe some kind of orthorhombic configurations. To find out what kind, one needs to apply such deformations to the lattice vectors (75) and use the Cauchy-Born hypothesis to see what is the nature of the corresponding lattice vectors. Doing this, one finds that one set satisfies

$$ |a_1| = |a_2| = |a_3|, \qquad (a_1 - a_2)\cdot(a_2 + a_3) = (a_1 - a_2)\cdot(a_1 - a_3) = 0, \tag{76} $$
along with other relations deducible from these. Such lattice vectors belong to the body-centered orthorhombic Bravais lattice. If this is not obvious to you, try picturing a unit cell. By proceeding in this way, one can make a table of name changes, to adapt the analyses and conclusions given in Section VI to this situation. This works out to be

simple tetragonal → centered tetragonal,
simple orthorhombic → body-centered orthorhombic,
base-centered orthorhombic → face-centered orthorhombic,
simple monoclinic → centered monoclinic. (77)

With this and a bit more, one gets a description of a centered tetragonal neighborhood. Briefly, values of $C$ in (30) and (31), which there described simple monoclinics, now describe centered monoclinics. In (32), what were centered monoclinics are again centered monoclinics and the triclinics are again these.
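The suggestion to picture a unit cell can be tried numerically. The sketch below takes the reference vectors (75), applies a stretch whose principal axes are the $e_k$ (an assumed, illustrative choice of orthorhombic deformation, since (28) is not reproduced here), and checks that the resulting vectors have equal lengths and span a cell with three mutually orthogonal, unequal edges whose center is itself a lattice point. Helper names and numerical values are hypothetical.

```python
def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def reference_vectors(a, b):
    """Lattice vectors A_k of (75), as components on e_1, e_2, e_3."""
    return ((-a, a, -b), (-a, a, b), (a, a, -b))

def stretch(vec, s):
    """Stretch with principal axes e_k and principal stretches s (assumed form)."""
    return tuple(si*vi for si, vi in zip(s, vec))

a, b = 1.0, 1.3                    # illustrative values with b != a
s = (1.05, 0.95, 1.10)             # three distinct stretches
a1, a2, a3 = (stretch(A, s) for A in reference_vectors(a, b))

# Edges of a conventional cell, built from integer combinations of the a_k.
u, v, w = sub(a1, a2), add(a2, a3), sub(a3, a1)
center = tuple(0.5*(p + q + r) for p, q, r in zip(u, v, w))

print("squared lengths:", [round(dot(x, x), 6) for x in (a1, a2, a3)])
print("edge products:", dot(u, v), dot(u, w), dot(v, w))
print("cell center is the lattice vector a3:",
      all(abs(c - t) < 1e-12 for c, t in zip(center, a3)))
```

Since the cell edges are orthogonal and the cell center coincides with a lattice vector, the deformed lattice in this sketch is of the body-centered orthorhombic kind.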
VIII. THE THIRD CASE

The other kind of path constructed by Pitteri and Zanzotto [2] connects the two kinds of monoclinics by passing through a base-centered orthorhombic configuration. Clearly, one cannot use the same analysis for this. However, I'll sketch a very similar way of dealing with it, using our simple tetragonal neighborhood, except that we now assume nothing about equilibria associated with the reference configuration. First, one of the types of base-centered orthorhombics given by (29) has the form

$$ C - 1 = \begin{pmatrix} \alpha & \gamma & 0 \\ \gamma & \alpha & 0 \\ 0 & 0 & \beta \end{pmatrix}, \tag{78} $$

and with the notation used in the appendix, the corresponding point group is

$$ 1, \qquad R[\pi, (e_1 \pm e_2)/\sqrt{2}], \qquad R[\pi, e_3]. \tag{79} $$

From (31), one type of simple monoclinic configuration has the form

$$ C - 1 = \begin{pmatrix} \alpha' & \gamma' & 0 \\ \gamma' & \delta' & 0 \\ 0 & 0 & \beta' \end{pmatrix}, \tag{80} $$

which reduces to the form (78), when $\delta' = \alpha'$. We are selecting monoclinics in a Pitteri neighborhood of this orthorhombic kind, at least when $\alpha' - \alpha$ and so on are small enough. Under (79), the orbit of (80) is this and the matrix obtained by interchanging $\alpha'$ and $\delta'$. Similarly, we select from (32) two centered monoclinics, one of the form

$$ C - 1 = \begin{pmatrix} \alpha^* & \gamma^* & \delta^* \\ \gamma^* & \alpha^* & \delta^* \\ \delta^* & \delta^* & \beta^* \end{pmatrix}, \tag{81} $$

the other being this, with $\delta^*$ replaced by $-\delta^*$. Clearly, these reduce to the form of (80) when $\delta^* = 0$, so they can belong to the neighborhood. Also included are some triclinics and another pair of centered monoclinics; in (81), replace $C_{13}$ by its negative to get one, then reverse the sign of $C_{13}$ and $C_{23}$ to get the other. Change the reference configuration to be the given equilibrium configuration and you have a description of a Pitteri neighborhood of the base-centered orthorhombic kind, in what might be regarded as a standard form. Next, assume that, for some values of $p$ and $\theta$, equilibrium occurs for some choice of the numbers in (78). Make the change of variables indicated by

$$ 2x = C_{11} - C_{22}, \qquad 2y = C_{13} + C_{23}, \qquad 2x_1 = C_{13} - C_{23}, $$
$$ x_2 = C_{12} - a_1, \qquad 2x_3 = C_{11} + C_{22} - a_2, \qquad x_4 = C_{33} - a_3, \tag{82} $$

choosing the constants $a_k$ so that the $x$'s and $y$ vanish at the given equilibrium configuration. If $C$ is transformed by any of the transformations in (79), each of the new variables is either
invariant or gets replaced by its negative. As before, make assumptions to justify using the implicit function theorem to solve

$$ \partial\psi/\partial x_i = 0 \tag{83} $$

for $x_i$ in terms of the remaining variables. This leads to a reduced potential of the form

$$ \psi = \psi_r(x, y, p, \theta), \tag{84} $$
and you can verify that, as before, it is an even function of $x$ and of $y$. So, it is simply a matter of reinterpreting the previous analyses. This can be done with a chart like (77). Starting with the first case analyzed, one gets

simple tetragonal → base-centered orthorhombic,
simple orthorhombic → simple monoclinic,
base-centered orthorhombic → centered monoclinic,
simple monoclinic → triclinic. (85)

Note that the property that two phases are related by a doubling of symmetry is preserved by this change of names, which is important to have second-order transitions translate to those of second-order. One thing different occurs here. In using the tetragonal neighborhood, we will get an orbit of two equilibrium patterns. Of course, if we go back to the original idea that the potential is invariant under that infinite group $G$, we should get infinitely many. To some degree, this just reflects the fact that there are infinitely many ways of choosing lattice vectors. However, the theory of microstructures uses the fact that different symmetry-related configurations can occur in one body. It would be good to have some way of predicting such numbers, or inferring from observations what the number tells us about relevant constitutive equations. I do not think that my ideas about this are better than others. Thus my conclusion is that those three kinds of paths are thermodynamically possible equilibrium paths. So, Bravais's ideas allow for some possible kinds of second-order transitions and exclude others. Our results contain examples of both kinds of situations. I interpret this as meaning that his ideas have their faults. Trying to base fundamental ideas of symmetry on prejudices about equilibria seems to me tricky. Should we allow twins, for example? Intuitively, I regard these as being in the same phase, although they are not smoothly connected, directly. At least commonly, they can be connected by a Bravais P-connection. Certainly, our thermodynamic analyses rely heavily on ideas about those 14 Bravais lattices, and other notions about crystal symmetry, but they are not purely kinematical studies. We have noted and used some facts about how configurations with different symmetries are connected, which is a matter of kinematics.
To me, this underscores the importance of understanding the structure of those Pitteri neighborhoods, as best we can. I do not really object to the idea that this should be regarded as part of the theory of crystallography. In fact, I would welcome help from expert crystallographers in improving such theory, if only to point out relevant contributions that have been missed by this outsider.
IX. NEIGHBORHOODS

For those Pitteri neighborhoods, it is not easy to get a good grasp on techniques for making them maximal. Correspondence with Zanzotto indicates that he has found ideas that seem promising for this, but I think it would be premature for me to say more. It does seem to me feasible and useful to work out a good description of how they are arranged, locally. I tried to do this in Section IV for neighborhoods of the simple tetragonal kind, but am open to suggestions for better ways of doing this. There is the point that, for one Bravais lattice, all neighborhoods are arranged in the same way, locally. That is, the associated lattice groups are conjugate to each other, as are the subgroups, which are lattice groups. It is rather obvious that a neighborhood contains configurations for which every such subgroup is a lattice group, because it only takes an infinitesimal perturbation to convert a configuration with some symmetry to one of given lesser but possible symmetry. Also, the kinds of similarity transformations associated with choosing different lattice vectors neatly map the subregions of lesser as well as greater symmetry and, as noted in the introduction, these leave the point groups invariant. These thoughts led me to focus on the values of $C$ and how they relate to point groups as you have seen. I'll try to summarize what I know about these local scenes, assuming that reference configurations are picked to fit the name. First, a triclinic neighborhood contains only triclinics. For either kind of monoclinics, a neighborhood contains triclinics and the same kind of monoclinics. For the latter, the point group is generated by a 180° rotation. Values of $C$ corresponding to the monoclinics are those for which an eigenvector is parallel to the axis of this rotation. Information given in this paper covers the two kinds of tetragonals and four kinds of orthorhombics. What I know about trigonals is covered in [5].
I have not looked hard at the hexagonals or cubics, so prefer not to comment on these, except to note that Pitteri and Zanzotto [2] include some worthwhile comments about the latter. One way or another, it should be feasible to fill in any gaps and find some fairly neat ways of displaying such results. Theory of the kind used here seems to apply to what can reasonably be regarded as elastic deformations of equilibria in monatomic crystals, when these conform to the description of Bravais lattices. This excludes hexagonal close-packed configurations, for one thing, which have a reputation for doing things that are not well described by available theory. However, such theory has also been successful in describing various phenomena observed in shape-memory alloys. In those monatomic crystals, equilibria tend to be of the face-centered cubic or body-centered kinds. In the latter, equilibria with various kinds of symmetries are observed, and it is this that makes the theory of phase transitions and associated microstructures more interesting for these. This would seem to be a more likely place to look for materials exhibiting one of our patterns, if such materials exist.
X. APPENDIX

Let R[

$$ R[\pi/2, e_3], \quad R[\pi, e_3], \quad R[3\pi/2, e_3], \quad R[\pi, e_1], \quad R[\pi, e_2], \quad R[\pi, (e_1 + e_2)/\sqrt{2}], \quad R[\pi, (e_1 - e_2)/\sqrt{2}]. \tag{86} $$
Centered tetragonals can be, and sometimes are, called body-centered or face-centered tetragonals. Experts know that these are equivalent ways of describing the same configurations, although this is certainly not true for the cubics.
NOTES
1. It is not really necessary that they be constant, but this simplifies analyses and is convenient for our purposes.
2. Here, the symbol = is shorthand for R\a^ = (mi)%a~, etc.
3. Forms of $C$ invariant under all of the point groups are presented by Truesdell [11, Ch. IV], for example.
4. Here, I admit that I do what I said we should not do in Section III, treating a material as described by one constitutive equation. Mea culpa.
REFERENCES
[1] Bravais, A.: Mémoire sur les systèmes formés par des points distribués régulièrement sur un plan ou dans l'espace. J. École Polytechnique, 19, 1-128 (1850).
[2] Pitteri, M. and Zanzotto, G.: On the definition and classification of Bravais lattices. Manuscript submitted for publication.
[3] Ericksen, J. L.: On the symmetry of deformable crystals. Arch. Rat. Mech. Analysis, 72, 1-13 (1979).
[4] Schwarzenberger, R. L. E.: Classification of crystal lattices. Proc. Camb. Phil. Soc., 72, 325-349 (1972).
[5] Ericksen, J. L.: Thermal expansion involving phase transitions in certain thermoelastic crystals. Meccanica (in press).
[6] Pitteri, M.: Reconciliation of local and global symmetries of crystals. J. Elasticity, 14, 175-190 (1984).
[7] Ball, J. M. and James, R. D.: Proposed experimental tests of a theory of fine microstructure and the two well problem. Phil. Trans. R. Soc. Lond., A338, 389-450 (1992).
[8] Gibbs, J. W.: On the equilibrium of heterogeneous substances. Trans. Conn. Acad., III, 108-248 (1875-1876) and 343-524 (1877-1878).
[9] Bhattacharya, K., Firoozye, N. B., James, R. D., and Kohn, R. V.: Restrictions on microstructure. Proc. Roy. Soc. Edinburgh, 124A, 843-878 (1994).
[10] Man, C.-S.: Material stability, the Gibbs conjecture and the first phase rule for substances. Arch. Rat. Mech. Analysis, 91, 1-53 (1985).
[11] Truesdell, C.: A First Course in Rational Continuum Mechanics, vol. 1, Academic, New York, 1977.
[12] Love, A. E. H.: A Treatise on the Mathematical Theory of Elasticity, 4th ed., Cambridge University Press, Cambridge, UK, 1927.
[13] Pippard, A. B.: The Elements of Classical Thermodynamics, Cambridge University Press, Cambridge, UK, 1957.
Journal of Elasticity 63: 61-86, 2001. © 2001 Kluwer Academic Publishers. Printed in the Netherlands.
On the Theory of the $\alpha$-$\beta$ Phase Transition in Quartz

J.L. ERICKSEN
5378 Buckskin Bob Rd., Florence, OR 97439, U.S.A.

Received 22 January 2001; in revised form 14 June 2001

Abstract. The $\alpha$-$\beta$ phase transition in quartz has features which workers have long found puzzling. My purpose is to explore using a theory for this that is rather different conceptually from those used by other workers. It does suggest different kinds of experiments, likely to shed light on aspects of the transition, and a possible modification of theory which might be helpful in understanding some subtleties.

Mathematics Subject Classification (2000): 74-XX.

Key words: phase transitions, thermomechanical theory of quartz.
1. Introduction

As our most common mineral, which is also a very useful material, quartz has been studied extensively, but its strange behavior still makes it difficult to understand. At room temperature and atmospheric pressure, it is found in the so-called $\alpha$-phase. If the temperature is raised to about 574°C, it transforms to the $\beta$-phase, which has greater crystal symmetry, and this phase transition has unusual features which have long puzzled workers. Among other things, there have been differences of opinion as to whether it is a second-order or a weak first-order transition, involving small jump discontinuities or a combination of the two. Complicating this is the fact that, often, the $\beta \to \alpha$ transition produces cracks, damaging the specimen. Those favoring the idea that it is of second-order generally consider it to be a $\lambda$-transition, as is suggested by some kinds of observations. I find a simple reason to reject this idea that seems not to have been noticed and is essentially independent of experimental measurements. I [1] proposed a continuum theory of X-ray observations, originally to deal with twinning and transition phenomena in crystals, to which the Cauchy-Born rule, hereafter abbreviated as CBR, fails to apply. After looking at some of the evidence on this transition and, particularly, comments by James [2, 3], I believe that CBR applies to the phenomena considered here, but there is another complication, associated with the fact that these are not Bravais lattices, but multi-lattices. This makes it important to use the X-ray theory, or something much like it, as a basis for analyzing the transition. The trouble is that the transition of interest involves
bifurcations associated with microstructures not describable by thermoelasticity theory. James [2, 3] suggests that a 3-lattice model seems promising for analyzing the transition. I [4] used this to analyze the well-established growth twins in quartz and, for this, it gives satisfactory results. Here, my aim is to begin to explore what it can predict, as a basis for analyzing the transition, including the effects of pressure on the transition. For this, I use a simplistic version of the theory, but I believe that it does rather well, in describing at least some relevant observations. I will begin by using a formulation of the theory not involving CBR, partly to illustrate what can be done if it does not apply. Later, I will also introduce and use theory allowing for use of CBR. Forms of constitutive equations for this situation are certainly not well established. I will emphasize using conditions based on the invariance of constitutive equations, which do not depend on any particular form of such equations. In my opinion, too little has been done with such methods in this area, although they do have their limits.

2. The X-ray Theory

The X-ray theory involves two kinds of vector fields, functions of position in space. For one, there are the lattice vectors $e_a$ and their duals, the reciprocal lattice vectors $e^a$, satisfying

$$ e^a \cdot e_b = \delta^a_b, \qquad e^a \otimes e_a = e_a \otimes e^a = 1. \tag{1} $$

I [1] excluded continuous distributions of dislocations by assuming that, at least locally, there are scalar functions $\chi^a$ such that

$$ e^a = \nabla\chi^a. \tag{2} $$
This makes the theory compatible with CBR for subsets of changes of configurations, leaving it to individual judgment to decide when to trust it. Briefly, it asserts that, if one picks a reference configuration with lattice vectors $E_a$ and subjects it to a deformation with deformation gradient $F$, the vectors $FE_a$ are a possible set of lattice vectors in the deformed crystal. For the mass density $\rho$, I assumed that mass is proportional to that of a unit cell, formalized as

$$ \rho = m\,|e^1 \cdot e^2 \wedge e^3| = \frac{m}{|e_1 \cdot e_2 \wedge e_3|}, \tag{3} $$

where $m$ is a positive constant. This is also compatible with CBR. It fails to apply to thermal expansion of zinc, according to the experiments of Balzer and Sigvaldson [5], who attribute this to changes in the distribution of vacancies and dislocations, defects which are invisible in X-ray observations. This relates to the fact that X-ray observations are of a macroscopic nature. For thermal expansion of quartz, Jay [6] found that CBR does apply, implying that (3) should hold for this, among other things, but he mentions evidence that it fails for bismuth.
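To make the compatibility with CBR concrete, the following minimal sketch builds reciprocal vectors from a set of lattice vectors, checks the duality relations $e^a \cdot e_b = \delta^a_b$, and confirms that a density of the form $\rho = m/|e_1 \cdot e_2 \wedge e_3|$ (the reading of (3) assumed here) scales by $1/\det F$ when the lattice vectors are mapped by a deformation gradient $F$. The reference vectors, $F$, the value of $m$ and all helper names are illustrative assumptions.

```python
import math

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    return sum(a*b for a, b in zip(u, v))

def triple(e):
    """Signed cell volume e_1 . (e_2 ^ e_3); also the determinant of a 3x3 row matrix."""
    return dot(e[0], cross(e[1], e[2]))

def reciprocal(e):
    """Reciprocal vectors e^a, built so that e^a . e_b = delta^a_b."""
    vol = triple(e)
    return tuple(tuple(c/vol for c in cross(e[(a + 1) % 3], e[(a + 2) % 3]))
                 for a in range(3))

def apply(F, vec):
    """Matrix-vector product F vec, with F stored as a tuple of rows."""
    return tuple(dot(row, vec) for row in F)

def density(e, m):
    """Assumed reading of (3): rho = m / |e_1 . e_2 ^ e_3|."""
    return m/abs(triple(e))

# Illustrative reference lattice vectors and deformation gradient.
E = ((1.0, 0.0, 0.0), (-0.5, math.sqrt(3)/2, 0.0), (0.0, 0.0, 1.1))
F = ((1.02, 0.01, 0.0), (0.0, 0.99, 0.0), (0.0, 0.03, 1.05))

e = tuple(apply(F, Ea) for Ea in E)      # CBR: deformed lattice vectors F E_a
e_up = reciprocal(e)

print([[round(dot(e_up[a], e[b]), 12) for b in range(3)] for a in range(3)])
print(density(e, 1.0)/density(E, 1.0), 1.0/triple(F))
```

The two numbers on the last line agree, which is the usual continuum statement that density changes by the inverse of the volume ratio.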
For any particular configuration, there are infinitely many possible choices of lattice vectors, related by transformations of the form

$$ e_a \to m_a^b e_b \;\Leftrightarrow\; e^a \to (m^{-1})^a_b e^b, \qquad m = \|m_a^b\| \in GL(3, \mathbb{Z}). \tag{4} $$

In all matrices used here, the lower index on entries labels rows. The second kind of vectors consists of the shifts, denoted by

$$ p_i, \qquad i = 1, \ldots, n-1 \tag{5} $$

for an $n$-lattice. Such a lattice consists of $n$ geometrically identical interpenetrating lattices, with the same lattice vectors. To get a set of shifts, pick one atom from each lattice and select one as a base point. Then the shifts are the position vectors relative to the base point of the other atoms selected. Obviously, there are also infinitely many ways of choosing these, and they are related by transformations of the form

$$ p_i \to \alpha_i^j p_j + l_i^a e_a, \qquad l_i^a \in \mathbb{Z}. \tag{6} $$

We will be concerned only with the theory of monatomic 3-lattices and, for these, the possible choices of $\alpha$ are

$$ \alpha_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \alpha_2 = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \alpha_3 = \begin{pmatrix} -1 & 0 \\ -1 & 1 \end{pmatrix}, $$
$$ \alpha_4 = \begin{pmatrix} 1 & -1 \\ 0 & -1 \end{pmatrix}, \qquad \alpha_5 = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}, \qquad \alpha_6 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{7} $$
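A quick consistency check on the six matrices in (7): since they arise from relabelling the three atoms that define the shifts, they should be closed under multiplication. The sketch below uses the entries as read from (7), with the lower index labelling rows; the helper names are hypothetical. It verifies closure and tabulates the order of each element.

```python
from itertools import product

# Entries as read from (7); each matrix is stored as a tuple of rows.
ALPHA = {
    1: ((1, 0), (0, 1)),
    2: ((-1, 1), (-1, 0)),
    3: ((-1, 0), (-1, 1)),
    4: ((1, -1), (0, -1)),
    5: ((0, -1), (1, -1)),
    6: ((0, 1), (1, 0)),
}

def matmul(A, B):
    """Product of two 2x2 integer matrices stored as tuples of rows."""
    return tuple(tuple(sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def order(A):
    """Smallest n >= 1 with A^n equal to the identity."""
    P, n = A, 1
    while P != ALPHA[1]:
        P, n = matmul(P, A), n + 1
    return n

elements = set(ALPHA.values())
closed = all(matmul(A, B) in elements for A, B in product(elements, repeat=2))
orders = {k: order(A) for k, A in ALPHA.items()}
commutative = all(matmul(A, B) == matmul(B, A) for A, B in product(elements, repeat=2))
print("closed:", closed, "orders:", orders, "commutative:", commutative)
```

A closed, non-commutative set of six matrices with three elements of order two and two of order three has the structure of the permutation group on three objects, as one would expect for three interchangeable atoms.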
As was mentioned earlier, the lower index on elements labels rows. Some workers, for example, Pitteri and Zanzotto [7, Chapter 4], use a different convention, making their matrices transposes of mine. For a 1-lattice, the point group $P(e_a)$ and lattice group $L(e_a)$ are finite groups, defined by

$$ P(e_a) = \{Q \in O(3) \mid Q e_a = m_a^b e_b,\ m \in GL(3, \mathbb{Z})\} \tag{8} $$

and

$$ L(e_a) = \{m \in GL(3, \mathbb{Z}) \mid m_a^b e_b = Q e_a,\ Q \in O(3)\}. \tag{9} $$

Any $n$-lattice, equipped with a choice of $e_a$, defines a skeletal lattice, a 1-lattice with these lattice vectors, so these groups are defined for $n$-lattices. In addition, there is the point group defined by

$$ P(e_a, p_i) = \{Q \in O(3) \mid Q e_a = m_a^b e_b,\ Q p_i = \alpha_i^j p_j + l_i^a e_a,\ m \in GL(3, \mathbb{Z})\}, \tag{10} $$

where, for monatomic 3-lattices, the possible choices of $\alpha$ and $l$ are as described above. This can be of lower order than $P(e_a)$, since the conditions on shifts can
exclude some elements of the latter, and this is the case for the configurations to be considered. Lattice group elements for an $n$-lattice are of the form

$$ \{m, \alpha, l\} \in L(e_a, p_i), \tag{11} $$

consisting of matrices of the kind indicated that are compatible with (10). Concerning constitutive equations, there are various equivalent formulations of those that I [1] proposed and, for most purposes, I now prefer one I [8] presented later. For this, $\varphi$, the Helmholtz free energy per unit mass, is given by a constitutive equation of the form

(12)

where $\theta$ denotes the absolute temperature and

$$ p_i^a = p_i \cdot e^a \;\Leftrightarrow\; p_i = p_i^a e_a. \tag{13} $$

As to what invariances

(14)

Here, I use the former alternative. This restriction is based on the idea that

$$ R \in SO(3). \tag{15} $$

We do have the equivalence classes of lattice vectors and shifts discussed earlier, and their equivalents for the variables now used. Physically, the value of $\varphi$ should depend on the configuration, not on how this is represented. Thus, whenever transformations of the kind indicated by (4) and (6) map an argument in the domain to others in the domain, $\varphi$ should have the same value for these. So, if the domain is suitably infinite, mapped to itself by all these transformations that are consistent with (14), we get these as an infinite invariance group, for example. It is my belief that one needs to have some understanding of what is involved in problems of interest, before one can make an intelligent choice of the domain. Thus, I will go through some exercises, assuming the domain is large enough to contain what emerges as desirable, then pick a type of domain. What will be done relies heavily on this theory of material symmetry. Various writers, including James [3], have considered minimizing energy functions or functionals based on energy functions like those in (12), but the X-ray
theory goes beyond this, in that there are constitutive equations for the Cauchy stress tensor $t$, configurational stresses $c^a$ and entropy per unit mass $\eta$, of the form

$$ t = \ldots, \qquad c^a = \ldots, \qquad \eta = -\frac{\partial\varphi}{\partial\theta}. \tag{16} $$

While configurational stresses can be useful for analyzing some kinds of singularities, I won't need them here. From (16), it is clearly important for $\varphi$ to be at least differentiable on its domain. The equilibrium equations consist of the usual equations for $t$, along with

(17)

The symmetry of $t$ indicated in (16)$_1$ is an identity implied by (15). I [1] noted another identity which, when the body force vanishes, implies that, for sufficiently smooth solutions,

$$ \nabla \cdot t = 0 \;\Leftrightarrow\; \nabla \cdot c^a = 0. \tag{18} $$

For present purposes, it is important to keep in mind that, for (16) to apply, it is not necessary that (17) be satisfied. Also, I [8] derived identities satisfied by $t$ and $\Phi$, evaluated at values of $e_a$ and $p_i^a$ with a nontrivial lattice group. These are

$$ R^T t R = t, \qquad R \in P(e_a, p_i) \cap SO(3) \tag{19} $$

and the equivalent of

$$ m\,\Phi = \Phi\,\alpha, \qquad \{m, \alpha, l\} \in L(e_a, p_i). \tag{20} $$
These do require that
u = I = [m(e1Ae2.e3)]-1, p
L
v
/J
fa =
,
f
(e1 A e 2 -e 3 ) 1 / 3
(21)
to get
(22)
It then follows that, with

$$ p \overset{\mathrm{def}}{=} -\tfrac{1}{3}\,\mathrm{tr}\,t, \tag{23} $$

one has

$$ p = -\frac{\partial\varphi}{\partial v}. \tag{24} $$

So, this gives $p$ an interpretation as a thermodynamic pressure, also fitting the common idea of a mechanical pressure. Also,

$$ t = -p\,1 + t_D, \tag{25} $$

where $t_D$, the stress deviator, is given by

(26)
3. Configurations

In considering the $\alpha$-$\beta$ phase transition, workers often use theoretical models which do not account for the fact that the $\alpha$-phase is piezoelectric. Actually, for the configurations to be introduced, piezoelectric effects should not occur, according to classical linear theory. I follow this practice, adopting the 3-lattice model used by some workers, for example, James [3], although I use the X-ray theory to add details to this. To get a simpler theory, I will consider only a restricted set of homogeneous configurations, including those likely to occur in both phases near transition, in a right-handed quartz crystal subject to modest changes in temperature and hydrostatic pressure $p$. An obvious way of generalizing the theory is to include a dependence of $\varphi$ on electric polarization to cover piezoelectric effects but, for a first look, I won't allow for this. For $e_a$ and $e^a$, I assume they are of the form

$$ e_1 = a\,i, \qquad e_2 = a\left(-\tfrac{1}{2}\,i + \tfrac{\sqrt{3}}{2}\,j\right), \qquad e_3 = c\,k, $$
$$ e^1 = \frac{1}{a}\left(i + \frac{1}{\sqrt{3}}\,j\right), \qquad e^2 = \frac{2}{\sqrt{3}\,a}\,j, \qquad e^3 = \frac{1}{c}\,k, \tag{27} $$

where $i$, $j$ and $k$ form a right-handed orthonormal basis, $a$ and $c$ being positive numbers taking values in an open set. For the two shifts, I use

$$ p_1 = \tfrac{1}{2}e_1 + \tfrac{2}{3}e_3 + \lambda(e_1 + 2e_2), \qquad p_2 = \tfrac{1}{2}e_2 + \tfrac{1}{3}e_3 + \lambda(2e_1 + e_2), \tag{28} $$

where $\lambda$ varies with $p$ and $\theta$. Here, $\lambda = 0$ in the $\beta$-phase but not in the $\alpha$-phase, and this produces a difference in crystal symmetry in the two phases. This describes
right-handed quartz. With minor changes, the same analysis applies to left-handed quartz: just replace $p_i$ by $-p_i$. In [4], I noted that, for the configurations considered, (19) implies that the possible stresses are of the form

$$ t = A\,1 + B\,k \otimes k, \tag{29} $$
(30)
and k ^ 0 =» $ ? = $ 1 = _2d>{ = -2
<&] = <$>{ = 0
(31)
Setting
(32,
»=±=^ .
p 2m Then, using the observation concerning the effect of reversing the sign of k, (21)2 and (28), one finds that
6
l = *%- ~ <
<34)
so, with (31), <& = 0 & % = 0. GFA
(35)
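Similarly,
$$\mathbf{t}_D = \mathbf{0} \iff \frac{\partial f}{\partial r} = 0, \qquad p = -\frac{\partial f}{\partial v}, \qquad \eta = -\frac{\partial f}{\partial \theta}. \tag{36}$$
This describes the simplified constitutive theory to be used in analyzing the α-β transition. If it helps, one can use CBR. Pick a particular configuration, with $a = a_0$, $c = c_0$, $\lambda = \lambda_0$ at $\theta = \theta_0$. Then, CBR can be interpreted as assuming that the deformation gradient
$$\mathbf{F} = \frac{a}{a_0}(\mathbf{i} \otimes \mathbf{i} + \mathbf{j} \otimes \mathbf{j}) + \frac{c}{c_0}\,\mathbf{k} \otimes \mathbf{k} \tag{37}$$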
is associated with changing these parameters. In the earlier analyses, I won't use this explicitly but, as will be noted, it is used implicitly. One could superpose a rigid rotation, but this serves no useful purpose here.

4. Thermodynamic Theory

We will consider two control parameters, the pressure p applied to the body and θ, the temperature of a heat bath serving as the thermal environment for the body. Of course, one wants to consider a particular crystal and, in the X-ray theory, it is difficult to describe exactly what is meant by this, what distinguishes two crystals with the same mass. By introducing CBR, one can interpret this in the way it is usually done in thermoelasticity theory, for example. So, this is where it is used, implicitly. The relevant thermodynamic potential is commonly taken as
$$\int_{\Omega} \rho\varphi\,dx + pV, \tag{38}$$
where Ω is some region which can be occupied by the crystal, V being its volume. With the common practice in this area, of considering only homogeneous configurations, this reduces to
$$M(f + pv), \tag{39}$$
where M = ρV, the total mass of the crystal, is a fixed constant. Relevant equations describing extremals involved in minimizing this are
$$\eta = -\frac{\partial f}{\partial \theta}, \qquad p = -\frac{\partial f}{\partial v}, \qquad \frac{\partial f}{\partial \lambda} = \frac{\partial f}{\partial r} = 0. \tag{40}$$
My prejudice is that it is also safe to eliminate r, by solving $(40)_3$ for it, using this to reduce f to a function of the form
$$\eta = -\frac{\partial \bar{f}}{\partial \theta}, \qquad p = -\frac{\partial \bar{f}}{\partial v}, \qquad \bar{f} = \bar{f}(v, \lambda, \theta) = \bar{f}(v, -\lambda, \theta). \tag{41}$$
I remind you that, from (36), we then have $\mathbf{t}_D = \mathbf{0}$. The first three equations in (40) then apply, with $f$ replaced by $\bar{f}$. Deliberately, I am belaboring elementary matters, because I think some have gone astray by ignoring one, to be considered next. Readers not familiar with λ-transitions can find a discussion of them by Pippard [12, Chapter 9], for example.

5. Could It Be a λ-Transition?

Some workers, for example Hosieni et al. [13], are of the opinion that the α-β transition is a particular kind of second-order phase transition, a λ-transition. As part of the description of these, the bulk modulus in one phase vanishes at transition, where λ = 0. This requires that, for this phase,
$$\frac{\partial^2 \bar{f}}{\partial v^2} = \frac{\partial^3 \bar{f}}{\partial v^3} = 0 \quad \text{at transition}, \tag{42}$$
where, as is commonly done by physicists in considering second-order transitions, I use the third derivative test for a minimum. Also, to get the observed change of crystal symmetry, we should have
$$\frac{\partial^2 \bar{f}}{\partial \lambda^2} = 0 \quad \text{at transition}, \tag{43}$$
the corresponding third derivative test and $(40)_3$ being satisfied at λ = 0 as a consequence of symmetry. Now, with λ so fixed, there are only two arguments, v and θ, which should satisfy the three equations (42) and (43). By the reasoning commonly used by physicists in such situations, it is highly unlikely that all three can be satisfied, so I draw the obvious conclusion.

It is highly unlikely that the α-β transition is a λ-transition. (44)

Even if we ignored the dependence of $\bar{f}$ on λ and related matters of symmetry, as is rather commonly done, we would still have two equations. Then, conventional reasoning would lead to the conclusion that it is unlikely that the transition would occur at more than isolated points in the p-θ plane, which is not consistent with experience. I do find it puzzling that various workers seem to have overlooked these elementary tests for a minimum, or not given a good reason for ignoring them, but I do not see another way to interpret writings about this. From the nature of the argument, there is some reason to think that some other transitions called λ-transitions might be misinterpreted. At least in part, some workers have liked the λ-transitions because of two features of these:

(a) The feature that, in either phase, $C_p$, the specific heat at constant pressure, approaches infinity at transition (45)
and

(b) The feature that, in one phase, the bulk modulus is zero at transition. (46)

Experimentally, $C_p$ does increase very rapidly near transition, although experimental evidence does not contradict the view that it is bounded. Similarly, the bulk modulus in the α-phase becomes quite small, but might or might not attain the indicated limit. So, a good theory should somehow be consistent with these facts, and it is not trivial to construct sound theory of this kind. I note that other workers, for example Dolino [14], have opined that this is a fairly weak first-order transition, involving jump discontinuities. From the references concerning this matter cited by Carpenter et al. [15], it seems unclear that there is a significant majority of workers favoring either view.

6. As a First-order Transition

Of course, the aim is to better understand minimizers of the potential $\bar{f}$ indicated in (41), for relevant values of p and θ: at zero pressure the transition occurs at about 574°C, and we want the theory to apply to some nonzero values of pressure. The real difficulty in situations like this is that one has very little relevant information about the constitutive equations. So, the practice involves relying on intuitive prejudices about them, bolstered a bit by experience with somewhat similar situations. I have tried and will continue to try to make clear what my prejudices are, so workers can assess these for themselves. After more explorations of this kind, involving shear stresses, one should be in a better position to estimate the form of constitutive equations. The next step is to get rid of v by a similar minimization. The relevant extremal equation is
$$p = -\frac{\partial \bar{f}}{\partial v}, \tag{47}$$
where p is considered as a control parameter. Solve this for v and use the Legendre transform of $\bar{f}$ in the usual way, to get the Gibbs function
$$g(p, \lambda, \theta), \tag{48}$$
satisfying
$$\eta = -\frac{\partial g}{\partial \theta}, \qquad v = \frac{\partial g}{\partial p}, \qquad g(p, \lambda, \theta) = g(p, -\lambda, \theta). \tag{49}$$
This might seem a bit riskier, in that one might encounter a place where (42) holds, producing a singular inverse. But let us make the assumption and see where it leads: experienced workers will know that this is not much of a risk.
Finally, we need to minimize g as a function of λ, and it is here that I use Landau theory, with λ as an order parameter, for the restricted set of configurations considered. As is common, this describes the weak first-order transition as a subcritical bifurcation. The theoretical picture is as follows. For p = 0, the β-phase (λ = 0) is stable at higher temperatures, changing from stable to metastable near the transition temperature. At a lower temperature, it reaches the limit of metastability for this phase, where it meets an unstable branch which moves off to higher temperatures, λ becoming nonzero. This temperature, denoted by $\theta_0$, is an important parameter in the theory. Then, at some temperature, it turns around as a smooth curve, moving toward lower temperatures. This marks the limit of metastability of the α-phase. At some lower temperature, this branch changes from metastable to stable. This is formalized in the following way. Assume that
$$\Delta g = g(p, \lambda, \theta) - g(p, 0, \theta) = \tfrac{1}{2}h\lambda^2 + \tfrac{1}{4}k_1\lambda^4 + \tfrac{1}{6}k_2\lambda^6, \tag{50}$$
where $k_1$ and $k_2$ are constants, h being an affine function of the form
$$h = k_3 p + k_4(\theta - \theta_0), \tag{51}$$
where $k_3$ and $k_4$ are constants. In itself, this is somewhat like a version of the theory discussed by Dolino [14], for example, except that I allow for the pressure dependence, and do not include a dependence on ∇λ, while he omits the terms multiplied by $k_1$ and $k_2$. I note that, in more recent work, described by Carpenter et al. [15], for example, what is used is more like (50) and (51), omitting the pressure dependence, but adding terms depending on strain and including terms of higher degree in the order parameter.

The physical interpretation of the order parameter is somewhat different. In a microscopic description, workers associate the other order parameter with a "tilting angle of the SiO4 tetrahedra", as Dolino describes it. This is described in slightly different words by Carpenter et al. I use more the idea that quartz is SiO2, with atoms in this combination lumped together as a unit. In quartz, these "atoms" do not form a Bravais lattice, but fit a 3-lattice description like that used here, at least when the crystal is well into the α- or β-phase. Physically, I have no doubt that the observed change in shifts will be associated with some change in those tetrahedra. Actually, for the methods of estimating values of the constants in this model discussed by Carpenter et al., the precise interpretation of the order parameter does not matter. They only discuss this for p = 0, so give no estimate of $k_3$. From their discussion, it is clearly difficult to get precise estimates of parameters in such models. My suggestion to interpret the order parameter as the λ associated with shifts gives a way of measuring this, more direct than methods that have been used. I do find it curious that the possibility of making such measurements is not even mentioned by these writers. Also, I know of no measurements of λ near transition. There is the theoretical question of whether an instability in shifts triggers the change in the tetrahedra, or vice versa. If there is a clear way of deciding this, I do not know of it.
For the signs of the constants, I assume that
$$k_1 < 0, \qquad k_2 > 0, \qquad k_3 < 0, \qquad k_4 > 0. \tag{52}$$
The choice of signs of $k_1$ and $k_2$ is standard in Landau theory, the others being based on more empirical considerations, as will become clear. It is then a matter of minimizing Δg, allowing for relative minima. The extremal equation gives
$$\frac{\partial \Delta g}{\partial \lambda} = 0 \quad \text{at } \lambda = 0, \tag{53}$$
and at
$$\lambda^2 = \frac{-k_1 - \sqrt{k_1^2 - 4k_2 h}}{2k_2} \quad \text{or} \quad \lambda^2 = \frac{-k_1 + \sqrt{k_1^2 - 4k_2 h}}{2k_2}, \tag{54}$$
when these give real solutions. For the β-branch (λ = 0), the limit of metastability occurs when
$$h = 0, \tag{55}$$
and $(54)_1$ then also gives λ = 0, this being the aforementioned unstable branch. The two values of $\lambda^2$ given by (54) coincide when
$$\lambda^2 = -\frac{k_1}{2k_2} > 0 \quad \text{and} \quad \frac{dh}{d\lambda} = 0. \tag{56}$$
The potentials of the α- and β-phases become equal when, in the α-phase,
$$\Delta g = 0 \;\Rightarrow\; h = \frac{3k_1^2}{16k_2}, \qquad \lambda^2 = -\frac{3k_1}{4k_2}, \tag{57}$$
marking the place where the α-phase changes from stable to metastable. So, if there is no hysteresis, this is where the phase transition occurs. However, some hysteresis is observed, as is common for first-order transitions. Now consider the entropy difference
$$\Delta\eta = \eta(p, \lambda, \theta) - \eta(p, 0, \theta) = -\frac{\partial \Delta g}{\partial \theta} = -\tfrac{1}{2}k_4\lambda^2 < 0. \tag{58}$$
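As a numerical sanity check on (54)-(57), the following sketch evaluates the two nonzero branches of $\lambda^2$ and the potential difference (50) at the three distinguished values of h. The constants `K1 = -1`, `K2 = 1` are illustrative assumptions chosen only to satisfy the signs in (52); they are not estimates for quartz, and the function names are hypothetical.

```python
import math


def lambda_sq_branches(h, k1, k2):
    """Return the two values of lambda**2 in (54), or None if they are not real.

    They are the roots x of h + k1*x + k2*x**2 = 0, which is the extremal
    equation for (50) after dividing out the factor lambda.
    """
    disc = k1 * k1 - 4.0 * k2 * h
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    return (-k1 - root) / (2.0 * k2), (-k1 + root) / (2.0 * k2)


def delta_g(lam_sq, h, k1, k2):
    """Potential difference (50), written in terms of x = lambda**2."""
    x = lam_sq
    return 0.5 * h * x + 0.25 * k1 * x ** 2 + k2 * x ** 3 / 6.0


# Illustrative constants only (assumed): k1 < 0 < k2, as in (52).
K1, K2 = -1.0, 1.0

H_BETA_LIMIT = 0.0                              # (55): beta-branch loses metastability
H_EQUAL_ENERGY = 3.0 * K1 ** 2 / (16.0 * K2)    # (57): equal potentials
H_ALPHA_LIMIT = K1 ** 2 / (4.0 * K2)            # (56): the two branches coincide

if __name__ == "__main__":
    for name, h in [("beta limit", H_BETA_LIMIT),
                    ("equal energy", H_EQUAL_ENERGY),
                    ("alpha limit", H_ALPHA_LIMIT)]:
        unstable, alpha = lambda_sq_branches(h, K1, K2)
        print(name, "h =", h, "lambda^2 (unstable, alpha) =", unstable, alpha,
              "delta_g on alpha branch =", delta_g(alpha, h, K1, K2))
```

With other admissible choices of $k_1$ and $k_2$ the same ordering of the three values of h should appear, since it follows from the algebra rather than from the particular numbers.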
Evaluating this for the α-phase, using $(54)_2$, we get
$$\Delta\eta = \frac{k_4\left[k_1 - \sqrt{k_1^2 - 4k_2 h}\right]}{4k_2}. \tag{59}$$
Obviously, this approaches a finite limit at the limit of metastability described by (56). However, for the corresponding difference in $C_p$, we get
$$\frac{\Delta C_p}{\theta} = \frac{\partial \Delta\eta}{\partial \theta} = \frac{k_4^2}{2\sqrt{k_1^2 - 4k_2 h}} \to \infty \quad \text{as } h \to \frac{k_1^2}{4k_2}, \tag{60}$$
the same limit of metastability. On the other hand, on the β-branch, $C_p$ does not necessarily have such a singularity. The analogous difference in specific volume is given by
$$\Delta v = \frac{\partial \Delta g}{\partial p} = \tfrac{1}{2}k_3\lambda^2 = \frac{k_3\left[-k_1 + \sqrt{k_1^2 - 4k_2 h}\right]}{4k_2} < 0, \tag{61}$$
on the α-branch, with
$$\frac{\partial \Delta v}{\partial p} = -\frac{k_3^2}{2\sqrt{k_1^2 - 4k_2 h}} \to -\infty \quad \text{as } h \to \frac{k_1^2}{4k_2}. \tag{62}$$
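The closed forms (59)-(62) can be checked against finite differences of Δg evaluated along the α-branch. The sketch below does this at one sample point; all constants are illustrative assumptions with the signs of (52), and the helper names are hypothetical.

```python
import math

# Illustrative constants only (assumed), with the signs required by (52).
K1, K2, K3, K4, THETA0 = -1.0, 1.0, -0.5, 1.0, 0.0


def h_of(p, theta):
    """The affine function (51)."""
    return K3 * p + K4 * (theta - THETA0)


def disc_root(p, theta):
    """sqrt(k1**2 - 4*k2*h), the square root appearing in (54) and (59)-(62)."""
    return math.sqrt(K1 ** 2 - 4.0 * K2 * h_of(p, theta))


def delta_g_alpha(p, theta):
    """Delta g of (50) evaluated on the alpha-branch, the second root in (54)."""
    x = (-K1 + disc_root(p, theta)) / (2.0 * K2)
    return 0.5 * h_of(p, theta) * x + 0.25 * K1 * x ** 2 + K2 * x ** 3 / 6.0


def delta_eta(p, theta):
    """Closed form (59)."""
    return K4 * (K1 - disc_root(p, theta)) / (4.0 * K2)


def delta_v(p, theta):
    """Closed form (61)."""
    return K3 * (-K1 + disc_root(p, theta)) / (4.0 * K2)


def central_diff(func, x, step=1e-5):
    """Second-order central difference approximation to func'(x)."""
    return (func(x + step) - func(x - step)) / (2.0 * step)


P, THETA = 0.1, 0.15  # sample point with a real alpha-branch (h = 0.1 here)

ETA_NUM = -central_diff(lambda t: delta_g_alpha(P, t), THETA)
V_NUM = central_diff(lambda q: delta_g_alpha(q, THETA), P)
CP_OVER_THETA_NUM = central_diff(lambda t: delta_eta(P, t), THETA)
DV_DP_NUM = central_diff(lambda q: delta_v(q, THETA), P)

if __name__ == "__main__":
    print("Delta eta:", ETA_NUM, delta_eta(P, THETA))
    print("Delta v:", V_NUM, delta_v(P, THETA))
    print("Delta Cp / theta:", CP_OVER_THETA_NUM, K4 ** 2 / (2.0 * disc_root(P, THETA)))
    print("d(Delta v)/dp:", DV_DP_NUM, -K3 ** 2 / (2.0 * disc_root(P, THETA)))
```

Moving the sample point toward $h = k_1^2/(4k_2)$ makes the last two quantities grow without bound while the first two stay finite, which is the behavior described in the text.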
Various ways of rearranging these equations are noted by Carpenter et al. [15]. It is observed that the volume does decrease in the β → α transition, dictating the sign of $k_3$, for example. Of course, (62) implies that the bulk modulus for the α-phase approaches zero in this limit. Thus, if the α → β transition occurs near this limit, the specific heat should grow rapidly, becoming quite large, but not become infinite, and the bulk modulus should get small, but stay nonzero. Bearing in mind (45) and (46), this transition then resembles a λ-transition in these respects. Also, we here get a description of the observed change in crystal symmetry. In the β → α transition, there is some latitude in selecting the function g(p, 0, θ), to describe the rapid change observed in the β-phase near transition, but the theory provides no reason for this, and at least some experts dislike this. Certainly, I would prefer a theory for which this emerges in a more natural way.

It is rather easy to get an estimate of $k_3$, the constant not estimated by Carpenter et al. [15]. The evidence is that, at p = 0, the transitions take place within a few degrees of $\theta_1$ = 574°C. From the theory, applying a pressure should change $\theta_1$ to $\theta_p$, where
$$k_3 p + k_4(\theta_p - \theta_0) = k_4(\theta_1 - \theta_0) \tag{63}$$
or
$$\theta_p = \theta_1 - \frac{k_3 p}{k_4}. \tag{64}$$
At least up to pressures of about 5 kb, this is in good agreement with the experiments of Coe and Paterson [16], for
$$\frac{k_3}{k_4} = -25.8\,^{\circ}\mathrm{C/kb}, \tag{65}$$
a sizable effect. As they note, other workers have obtained numbers close to this one. Interested readers might try doing a more detailed analysis of this, using the rather specific form of constitutive equation suggested by Carpenter et al. [15], to help fit these data to others.
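To make the size of this effect concrete, (64) can be evaluated with the two numbers quoted in the text, $\theta_1$ = 574°C and $k_3/k_4$ = -25.8°C/kb. The function name is hypothetical, and the linear law is only claimed in the text up to about 5 kb.

```python
def transition_temperature_c(p_kb, theta1_c=574.0, k3_over_k4=-25.8):
    """Transition temperature from (64): theta_p = theta_1 - (k3/k4) * p.

    p_kb is the pressure in kilobars; temperatures are in degrees Celsius.
    The default values are the ones quoted in the text.
    """
    return theta1_c - k3_over_k4 * p_kb


if __name__ == "__main__":
    for p_kb in (0.0, 1.0, 2.5, 5.0):
        print(p_kb, "kb ->", round(transition_temperature_c(p_kb), 1), "deg C")
```

At 5 kb this gives a shift of 129°C relative to zero pressure.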
It is well known that, often, the β → α transition produces numerous cracks in a crystal, but the α → β transition does not. This suggests that rather large tensile stresses arise in the former. James [3] suggests that the β-phase enters metastable configurations, building an excess energy, until a transition wave sweeps through the specimen, involving tensile stresses, effecting the transition. There is another possibility, that the α-phase starts to nucleate, but is unable to take over the whole crystal, and some evidence seems to favor this. For one thing, Coe and Paterson [16] mention evidence that "... the detailed behavior of the specific heat ... in the region of the transition appears to be sensitive to the state of subdivision of the sample ...", suggesting the occurrence of configurations that are not homogeneous, involving something like residual stresses. From their experiments, nonhydrostatic stresses also have a sizable effect in shifting transition temperatures, depending on orientation. From this, one might expect that the transition temperatures would depend on the size and shape of specimens, and Dolino [14] mentions that this is observed. As is noted by James [2], it is not possible for the phases to coexist and be unstressed, with the rather common assumption that the transition is displacive. So, if the two phases do coexist, this should involve what are, essentially, residual stresses, which could be large enough locally to produce some cracking. Our assumption of homogeneous configurations cannot be trusted to assess this.

The α → β transition is rather different because, typically, the volume of the crystals is split nearly equally between Dauphiné twins in the α-phase. If one thinks of their interfaces as transition layers, within which λ = 0 at some places, as Walker [17] does, it would seem that nucleation of the β-phase might well begin at such locations, when conditions might not favor this for a crystal not containing such twins. Again, this is likely to introduce complicated stresses, until the β-phase is able to take over the whole specimen. This is likely to shift the transition temperature, interpreted as that at which remnants of the α-phase disappear.

Various writers mention the existence of a nearby "incommensurate phase", said to be stable in a temperature interval of less than 2° above the transition point, passed through in the α ↔ β transition. Experimental evidence favoring this view is discussed by Dolino [14] and Carpenter et al. [15], for example. From Dolino's Figure 2, a good part of the apparent rapid change in $C_p$ is then pictured as occurring in this third phase, for example. There is still some increase in what is considered to be the β-phase, but this is attributed to "effects of fluctuations". I doubt that the "incommensurate phase" is just another way of naming what I described. Obviously, it is not easy to do realistic analyses of such phenomena, but it is hard for me to see how one can attain a good understanding of the transitions without doing so, and there is a lack of analyses helping us to evaluate speculations such as are mentioned above. While it is not likely to be feasible to describe such effects in detail, one might be able to do analyses bearing on the plausibility of such speculations. One difficulty is that it is unclear what form(s) of constitutive equations apply. Coe and Paterson do propose equations to describe the relatively simple stresses occurring in their experiments, but these seem not to correctly describe some of the observed
orientation effects. As a first step, one could try to find equations which do a better job with this. As far as I know, no one has made a serious effort to do so, although the experiments were reported in 1969. Of course, one can at least partially relieve residual stresses by reducing specimens to powder, as is commonly done in powder diffraction studies, for example. This might lead to better estimates of basic material properties, but there is the difficulty of using this to better analyze the behavior of the more sizable crystals, which seems to have received little or no attention.

Concerning values of constants occurring in (50), (51), different estimates give rather different values. For example, Carpenter et al. [15] mention estimates of $\theta_1 - \theta_0$ ranging from about 3° to [illegible]°C, with $\theta_1$ interpreted as the temperature at which the two energies become equal, when p = 0, their best guess being that it is about 7°C. Theoretically, there is no good way of determining where in the metastable regimes the transitions will actually occur. My suggestion to identify the order parameter with the λ associated with shifts provides a way of measuring it directly, and I think that it would be interesting to see how well this would mesh with other methods of estimating material constants. As is discussed by Ackermann and Sorrell [18], there are X-ray observations of the atomic arrangements near transitions, but I am not comfortable with my understanding of their implications, so prefer not to say more.

There is another matter. As interpreted here, λ is a measure of the difference in values of shift vectors relative to the β-phase. Generally, having nonzero stresses will change the symmetry of configurations, affecting the form of shift vectors, producing some different measures of the analogous differences. From this view, it seems unsound to regard the order parameter as a scalar, when more general kinds of deformations occur, as Carpenter et al. [15] do, for example. In an example to be presented, two parameters occur in a natural way, and it is easy to construct theoretical examples involving more. Of course, this presumes that my interpretation of the order parameter(s) is good.

Suppose that the β → α transition at zero stress occurs at or a little below the temperature at which the two energies are equal. Then, in the α → β transition, one expects that nucleation of the β-phase will not occur until the temperature is increased at least enough to make these energies equal. However, according to Dolino [14], it occurs a fraction of a degree below this value. One could argue that this might be due to occurrence of stresses shifting this temperature, for example. However, at least some experts believe that something is missing in this model, and I believe that this might well be the case. This is associated with the interest in that "incommensurate phase". The observations of Van Tendeloo et al. [19], using electron microscopy, showed a microstructure which they thought to consist of finely spaced Dauphiné twins. Various workers, including me, have doubts about whether what is observed in such thin specimens is representative of what occurs in bulkier samples. I do find it hard to understand how this alone would promote the observed cracking. Possibly, the "incommensurate phase" represents some approximation which becomes better in some limit as the volume of samples tends to zero.
Later, I will mention a possibility for introducing a third phase in a rather natural way which seems quite different from other ideas that have been tried. However, I do not yet understand what virtues and faults this might have. Perhaps, this is enough to give the reader some understanding of why workers find this transition difficult to analyze.

7. Alternative Formulation

Although thermoelasticity theory is not adequate to describe features of the α-β phase transition, CBR seems to apply to various deformations likely to be involved in analyses of near transition phenomena. Also, it seems reasonable to consider the domain of φ to be a Pitteri neighborhood, centered at an unstressed β-configuration at a temperature of about 574°C, which I assume to be right-handed in this discussion. It seems desirable to reformulate the basic equations to resemble those commonly used in thermoelasticity theory, differing from the latter by accounting for the shifts. For this, it is clumsy to use the $\mathbf{e}_a$ as arguments in
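$$\mathbf{e}_1 \cdot \mathbf{e}_2 \wedge \mathbf{e}_3 = \mathbf{F}\mathbf{E}_1 \cdot \mathbf{F}\mathbf{E}_2 \wedge \mathbf{F}\mathbf{E}_3 = (\det \mathbf{F})\,\mathbf{E}_1 \cdot \mathbf{E}_2 \wedge \mathbf{E}_3 > 0, \tag{66}$$
where I have used invariance under SO(3) to replace $\mathbf{e}_a$ by $\mathbf{e}_a \cdot \mathbf{e}_b$ and
$$\mathbf{C} = \mathbf{F}^T\mathbf{F} \tag{67}$$
is the usual Cauchy-Green tensor. Also, for the Pitteri neighborhood,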
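$$\mathbf{R}\mathbf{E}_a = m^b_a\mathbf{E}_b, \qquad P^a_i m^b_a = \alpha^j_i P^b_j + l^b_i, \tag{68}$$
with
$$\mathbf{R} \in P(\mathbf{E}_a, \mathbf{P}_i), \qquad \{m, \alpha, l\} \in L(\mathbf{E}_a, \mathbf{P}_i), \qquad \mathbf{P}_i = P^a_i\mathbf{E}_a, \tag{69}$$
giving the transformations for other configurations as
$$\mathbf{e}_a \to \bar{\mathbf{e}}_a = \mathbf{R}^T m^b_a\mathbf{e}_b, \qquad p^a_i \to \bar{p}^a_i = (m^{-1})^a_b\left(\alpha^j_i p^b_j + l^b_i\right), \tag{70}$$
describing the transformations leaving φ invariant. Important features of a Pitteri neighborhood are that these transformations map it to itself and that the lattice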
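group for any configuration in the neighborhood is a subgroup of that for the center. It simplifies considerations to replace $p^a_i$ by the difference
$$\pi^a_i = p^a_i - P^a_i, \tag{71}$$
which causes no problems, since the transformations (70) map $P^a_i$ to itself. Also, these quantities have the simpler transformation
$$\pi^a_i \to \bar{\pi}^a_i = (m^{-1})^a_b\,\alpha^j_i\,\pi^b_j, \tag{72}$$
making it unnecessary to include l in the accounting. Similarly, we replace $\mathbf{E}_a \cdot \mathbf{C}\mathbf{E}_b$ by a measure of strain,
$$E_{ab} = \mathbf{E}_a \cdot \boldsymbol{\varepsilon}\mathbf{E}_b, \qquad \boldsymbol{\varepsilon} = \tfrac{1}{2}(\mathbf{C} - \mathbf{1}), \tag{73}$$
transforming as
$$E_{ab} \to \bar{E}_{ab} = E_{cd}\,m^c_a m^d_b. \tag{74}$$
Obviously, these new variables vanish at the center, so are good for the commonly used Taylor expansions, used to get more definite forms of constitutive equations. The transformation (70) induces the transformations
$$\mathbf{F} \to \bar{\mathbf{F}} = \mathbf{R}^T\mathbf{F}\mathbf{R}, \qquad \boldsymbol{\varepsilon} \to \bar{\boldsymbol{\varepsilon}} = \mathbf{R}^T\boldsymbol{\varepsilon}\mathbf{R}. \tag{75}$$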
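The compatibility of (74) with (75) can be checked numerically for the hexagonal reference lattice. The sketch below takes the reference vectors in the form (27), computes the integer matrix m for a 60° rotation about k from $\mathbf{R}\mathbf{E}_a = m^b_a\mathbf{E}_b$, and compares both sides of (74) for an arbitrary symmetric strain. The lattice parameters, the strain entries and all names are assumptions made for illustration, and the relation $\mathbf{R}\mathbf{E}_a = m^b_a\mathbf{E}_b$ is the reading of the lattice group action assumed here.

```python
import math


def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


A0, C0 = 1.0, 1.1  # assumed reference lattice parameters
S3 = math.sqrt(3.0)
E = [[A0, 0.0, 0.0], [-0.5 * A0, 0.5 * S3 * A0, 0.0], [0.0, 0.0, C0]]  # E_a
E_REC = [[1.0 / A0, 1.0 / (S3 * A0), 0.0],
         [0.0, 2.0 / (S3 * A0), 0.0],
         [0.0, 0.0, 1.0 / C0]]                                           # E^a


def rotation_about_k(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def lattice_matrix(rot):
    """m[a][b] = m_a^b with R E_a = m_a^b E_b, computed as (R E_a) . E^b."""
    return [[dot(mat_vec(rot, E[a]), E_REC[b]) for b in range(3)]
            for a in range(3)]


def strain_components(eps):
    """E_ab = E_a . eps E_b, as in (73)."""
    return [[dot(E[a], mat_vec(eps, E[b])) for b in range(3)]
            for a in range(3)]


R = rotation_about_k(math.pi / 3.0)
M = lattice_matrix(R)

# An arbitrary small symmetric strain (assumed values).
EPS = [[0.010, 0.002, -0.003], [0.002, -0.020, 0.004], [-0.003, 0.004, 0.015]]
EPS_BAR = mat_mul(transpose(R), mat_mul(EPS, R))          # (75)

LHS = strain_components(EPS_BAR)                          # components of eps_bar
E_AB = strain_components(EPS)
RHS = [[sum(E_AB[c][d] * M[a][c] * M[b][d]
            for c in range(3) for d in range(3))
        for b in range(3)] for a in range(3)]             # right side of (74)

if __name__ == "__main__":
    print("m =", [[round(x) for x in row] for row in M])
    print("max mismatch in (74):",
          max(abs(LHS[a][b] - RHS[a][b]) for a in range(3) for b in range(3)))
```

The same check can be repeated for any other rotation in the point group of the reference lattice by changing `R`.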
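Partially, this replaces m by R, making it desirable to do something similar for the shifts. Changes in these are not governed by any kinematic rule, being determined by (17) or the equivalent
$$\frac{\partial \varphi}{\partial \pi^a_i} = 0. \tag{76}$$
To eliminate m, we introduce the two vectors
$$\boldsymbol{\pi}_i = \pi^a_i\mathbf{E}_a = \mathbf{F}^{-1}\mathbf{p}_i - \mathbf{P}_i, \qquad \pi^a_i = \boldsymbol{\pi}_i \cdot \mathbf{E}^a, \tag{77}$$
the transformation induced on these being
$$\boldsymbol{\pi}_i \to \bar{\boldsymbol{\pi}}_i = \alpha^j_i\,\mathbf{R}^T\boldsymbol{\pi}_j. \tag{78}$$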
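The usual practice in elasticity theory is to leave unmentioned the dependence on the essentially fixed reference lattice vectors. In this spirit, the indicated change of variables gives functions of the form
$$\varphi = \varphi(\boldsymbol{\varepsilon}, \boldsymbol{\pi}_i, \theta); \tag{79}$$
failing to apply special markings to this function should not cause confusion. However, if we take into account the dependence, we have
$$\frac{\partial \varphi}{\partial \boldsymbol{\pi}_i} = \frac{\partial \varphi}{\partial \pi^a_i}\mathbf{E}^a \;\Rightarrow\; \left(\frac{\partial \varphi}{\partial \pi^a_i} = 0 \iff \frac{\partial \varphi}{\partial \boldsymbol{\pi}_i} = \mathbf{0}\right). \tag{80}$$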
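From (3) and (66),
$$\rho = \frac{M}{\mathbf{e}_1 \cdot \mathbf{e}_2 \wedge \mathbf{e}_3} = \frac{\rho_0}{\det \mathbf{F}}, \tag{81}$$
where
$$\rho_0 = \frac{M}{\mathbf{E}_1 \cdot \mathbf{E}_2 \wedge \mathbf{E}_3} \tag{82}$$
is the (constant) reference mass density. Also, a calculation gives formulae familiar from thermoelasticity theory,
$$\mathbf{t} = \rho\,\mathbf{F}\frac{\partial \varphi}{\partial \boldsymbol{\varepsilon}}\mathbf{F}^T, \qquad \eta = -\frac{\partial \varphi}{\partial \theta}. \tag{83}$$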
However, the energy function is set up to use material coordinates as independent variables, and equilibrium equations for t use instead spatial coordinates. So, while workers often use this combination, it is rather clumsy. A neater theory is obtained by using the Piola-Kirchhoff stress tensor
$$\mathbf{T} = (\det \mathbf{F})\,\mathbf{t}\mathbf{F}^{-T} = \rho_0\frac{\partial \varphi}{\partial \mathbf{F}} = \rho_0\mathbf{F}\frac{\partial \varphi}{\partial \boldsymbol{\varepsilon}} \;\Rightarrow\; \mathbf{T}\mathbf{F}^T = \mathbf{F}\mathbf{T}^T. \tag{84}$$
Of course, φ should be invariant under the lattice group for the center, as this acts on the variables now used. This is a rather large group, of order 12, which I [4] have described in detail, including the group multiplication table. However, it will be invariant under this group if it is invariant under a set of generators. The lattice group elements for one choice are
^:
l
i
o i o o
-1 -1 0 , _\ ° l l | o o - i
_° ° o 1 u
f
°
°
-
l
(85)
and
*»•••
-1 £ j ' | : I o 1 - 1 o o !|| •
^
where the symbols on the left describe corresponding point group elements, a 180° rotation with axis i and a 60° rotation with axis k. This leads to the invariances described by
for $\mathbf{R} = \mathbf{R}^{\pi}_{\mathbf{i}}$,
(87)
and
for $\mathbf{R} = \mathbf{R}^{\pi/3}_{\mathbf{k}}$.
(88)
Workers have characterized various kinds of invariant polynomials, etc. relevant to theories of crystals. However, these functions are complicated by the fact that the transformation of the vectors is unusual, involving those α's. Further, polynomial functions need to be of rather high degree, sextic in $\boldsymbol{\pi}_1$ and $\boldsymbol{\pi}_2$ from our previous study of special cases. Thus, it would be very tedious to use the kinds of brute force methods which are commonly used for linear theories to determine forms of such invariant polynomials. Not having a clever idea for dealing with this, I leave it at this, for the present.

Of course, one can use general symmetry arguments to draw some conclusions about configurations having as a lattice group some nontrivial subgroup of that at the center, without specifying any particular form of the function φ. For such a configuration and an element in its lattice group, we will have
$$\mathbf{e}_a = \mathbf{R}^T m^b_a\mathbf{e}_b = \mathbf{F}\mathbf{E}_a = \mathbf{R}^T m^b_a\mathbf{F}\mathbf{E}_b = \mathbf{R}^T\mathbf{F}\mathbf{R}\mathbf{E}_a, \tag{89}$$
RTeR = e,
(90)
with R the corresponding point group element for the center. With the variables now used, calculations give RTTR = T <£> RTRT = T <S> RTTRT = T T
(91)
RV=aV,
(92)
and
v ^ ^ dTTi
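as equivalents of (19) and (20). Also, although I did not mention it earlier, I [8] have noted that the corresponding $p^a_i$ satisfy
$$p^a_i = (m^{-1})^a_b\left(\alpha^j_i p^b_j + l^b_i\right) \;\Rightarrow\; \pi^a_i = (m^{-1})^a_b\,\alpha^j_i\,\pi^b_j, \tag{93}$$
which translates to
$$\mathbf{R}\boldsymbol{\pi}_i = \alpha^j_i\boldsymbol{\pi}_j. \tag{94}$$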
So, one gets these various identities for each element in the subgroup. The neighborhood will contain configurations with a lattice group which is any subgroup of that for the center, so one can use these results to help picture the different symmetries occurring in the neighborhood. Also, there is the possibility of using them to check whether a particular function φ describes couplings between lattice vectors and shifts in a generic way, for example. Of course, it says nothing about the (triclinic) configurations with a trivial lattice group, which form a very sizable subset of the configurations.
8. A Third Phase?

Here, we first explore implications of symmetry for configurations with a lattice group of order three, generated by
Rk • f °1 \ 0
l
I "o o i '
° -1
O i l }
1
l
~ ' - ° ° r
(95)
which is a subgroup of the lattice groups for configurations of the unstressed α-phase type, as well as configurations having the same symmetry as the center. By a routine calculation, there is a two parameter family of deformation gradients satisfying $(90)_1$. With $a_0$ and $c_0$ denoting the values of a and c for the center, they can be put in the form
$$\mathbf{F} = \frac{a}{a_0}(\mathbf{i} \otimes \mathbf{i} + \mathbf{j} \otimes \mathbf{j}) + \frac{c}{c_0}\,\mathbf{k} \otimes \mathbf{k} \;\Rightarrow\; \mathbf{C} = \left(\frac{a}{a_0}\right)^2(\mathbf{i} \otimes \mathbf{i} + \mathbf{j} \otimes \mathbf{j}) + \left(\frac{c}{c_0}\right)^2\mathbf{k} \otimes \mathbf{k}, \tag{96}$$
implying that the lattice vectors $\mathbf{e}_a = \mathbf{F}\mathbf{E}_a$ are of the form (27). Similarly, (70) must hold with $\bar{p}^a_i = p^a_i$ for the above generator. Solving this for $p^a_i$, one gets a two parameter family which can be represented in the form
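$$\left\| p^a_i \right\| = \begin{Vmatrix} \tfrac{1}{2} + \lambda + \mu & 2\lambda & \tfrac{2}{3} \\ 2\lambda & \tfrac{1}{2} + \lambda - \mu & \tfrac{1}{3} \end{Vmatrix}, \tag{97}$$
from which
$$\left\| \pi^a_i \right\| = \begin{Vmatrix} \lambda + \mu & 2\lambda & 0 \\ 2\lambda & \lambda - \mu & 0 \end{Vmatrix} \tag{98}$$
and
$$\boldsymbol{\pi}_1 = (\lambda + \mu)\mathbf{E}_1 + 2\lambda\mathbf{E}_2, \qquad \boldsymbol{\pi}_2 = 2\lambda\mathbf{E}_1 + (\lambda - \mu)\mathbf{E}_2, \tag{99}$$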
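reducing to the form for configurations of the α-phase kind when μ = 0. Again the Cauchy stress tensor t is of the form (29), and so also is the Piola-Kirchhoff tensor T. Then, analyzing (92), one gets the restrictions on $\boldsymbol{\nu}^i$ as
$$|\boldsymbol{\nu}^1| = |\boldsymbol{\nu}^2|, \qquad \boldsymbol{\nu}^1 \cdot \mathbf{k} = \boldsymbol{\nu}^2 \cdot \mathbf{k} = 0, \qquad 2\boldsymbol{\nu}^1 \cdot \boldsymbol{\nu}^2 = -|\boldsymbol{\nu}^2|^2. \tag{100}$$
This gives a two parameter family of solutions, representable in the form
$$\boldsymbol{\nu}^1 = \mathbf{R}^{2\pi/3}_{\mathbf{k}}\boldsymbol{\nu}^2 = \alpha\left(\cos\psi\,\mathbf{E}_2 - \sin\psi\,(\mathbf{E}_1 + \mathbf{E}_2)\right), \qquad \boldsymbol{\nu}^2 = \alpha\left(\cos\psi\,\mathbf{E}_1 + \sin\psi\,\mathbf{E}_2\right), \tag{101}$$
α and ψ being the parameters.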
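Now, for this collection of configurations, φ reduces to a function of the parameters indicated by
$$f(a, c, \lambda, \mu, \theta), \tag{102}$$
and we know that φ is invariant under the lattice group for the center. Consider the element associated with $\mathbf{R}^{\pi}_{\mathbf{i}}$, given by (85), which also is included in the lattice groups for configurations of the α- and β-phase kinds. Applying this to (96), we get
$$\mathbf{R}^T\mathbf{F}\mathbf{R} = \mathbf{F} \;\Rightarrow\; \mathbf{R}^T m^b_a\mathbf{e}_b = \mathbf{R}^T m^b_a\mathbf{F}\mathbf{E}_b = \mathbf{R}^T\mathbf{F}\mathbf{R}\mathbf{E}_a = \mathbf{F}\mathbf{E}_a = \mathbf{e}_a, \tag{103}$$
implying that a and c are unchanged. Transforming $\boldsymbol{\pi}_i$ by the rule
$$\bar{\boldsymbol{\pi}}_i = \mathbf{R}^T\alpha^j_i\boldsymbol{\pi}_j, \tag{104}$$
we get
$$\bar{\boldsymbol{\pi}}_1 = (\lambda - \mu)\mathbf{E}_1 + 2\lambda\mathbf{E}_2, \qquad \bar{\boldsymbol{\pi}}_2 = 2\lambda\mathbf{E}_1 + (\lambda + \mu)\mathbf{E}_2, \tag{105}$$
the effect being to reverse the sign of μ. Thus,
$$f(a, c, \lambda, \mu, \theta) = f(a, c, \lambda, -\mu, \theta). \tag{106}$$
Similarly using the element associated with $\mathbf{R}^{\pi/3}_{\mathbf{k}}$, for which α = 1, we also get
$$f(a, c, \lambda, \mu, \theta) = f(a, c, -\lambda, \mu, \theta). \tag{107}$$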
Briefly, applying the lattice group for the center to a configuration in the neighborhood gives N "copies" of the configuration, called variants, with lattice groups which are conjugate to that of the configuration. If the order of the lattice group for the configuration is M, N = 12/M, which must be an integer, from elementary group theory. For the case at hand, M = 3 ⇒ N = 4, and the two independent sign reversals describe all of these. In the special case μ = 0, we have configurations of the α-phase kind, when λ ≠ 0, giving two variants. It is this that makes it possible to include Dauphiné twins in the neighborhood. Also, for either variant, one can consider a neighborhood of a particular configuration of this kind, and it will contain two sets of variants of configurations with μ ≠ 0, related as indicated by (106). Now, with these symmetries, we have
$$0 = \frac{\partial f}{\partial \mu}(a, c, \lambda, 0, \theta) = \left.\left(\boldsymbol{\nu}^1 \cdot \mathbf{E}_1 - \boldsymbol{\nu}^2 \cdot \mathbf{E}_2\right)\right|_{\mu = 0}. \tag{108}$$
By a simple calculation made using (101), we get
$$\mu = 0 \;\Rightarrow\; \sin\psi = 0, \tag{109}$$
with α = 0 being the equation for determining λ.
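The "simple calculation" leading to (109) can be reproduced numerically. The sketch below assumes that (101) is read with the reference lattice vectors $\mathbf{E}_1 = a_0\mathbf{i}$, $\mathbf{E}_2 = a_0(-\tfrac{1}{2}\mathbf{i} + \tfrac{\sqrt{3}}{2}\mathbf{j})$ and with a 120° rotation about k relating the two vectors; under that reading the combination in (108) works out to $-\tfrac{3}{2}\alpha a_0^2\sin\psi$, a closed form derived here rather than quoted from the text. The value of `A0` and all names are illustrative assumptions.

```python
import math

A0 = 1.0  # assumed reference lattice parameter
E1 = (A0, 0.0)
E2 = (-0.5 * A0, 0.5 * math.sqrt(3.0) * A0)


def combine(c1, c2):
    """Return c1*E1 + c2*E2 as an in-plane vector."""
    return (c1 * E1[0] + c2 * E2[0], c1 * E1[1] + c2 * E2[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def nu_pair(amp, psi):
    """The two vectors of (101), with amplitude amp and angle psi."""
    c, s = math.cos(psi), math.sin(psi)
    # cos(psi) E2 - sin(psi) (E1 + E2) = -sin(psi) E1 + (cos(psi) - sin(psi)) E2
    nu1 = combine(-amp * s, amp * (c - s))
    nu2 = combine(amp * c, amp * s)
    return nu1, nu2


def df_dmu(amp, psi):
    """The combination nu^1 . E1 - nu^2 . E2 appearing in (108)."""
    nu1, nu2 = nu_pair(amp, psi)
    return dot(nu1, E1) - dot(nu2, E2)


if __name__ == "__main__":
    for psi in (0.0, 0.3, 1.0, math.pi):
        nu1, nu2 = nu_pair(0.7, psi)
        print("psi =", psi,
              "df/dmu =", df_dmu(0.7, psi),
              "closed form =", -1.5 * 0.7 * A0 ** 2 * math.sin(psi),
              "2 nu1.nu2 + |nu2|^2 =", 2.0 * dot(nu1, nu2) + dot(nu2, nu2))
```

For a nonzero amplitude the expression vanishes only where sin ψ = 0, in line with (109), and the printed last column illustrates that the family also satisfies the third restriction in (100).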
If df/dpi = 0 for some values of \± ^ 0, it follows using (106) that the twinning equation can be satisfied with the same X, values of \i related by a reversal of sign for these twins, with the related isometry involving R^. From my [4] study of the twinning equations, it is easy to see that these are penetration twins, as are the Brazil and Dauphine twins commonly observed in quartz. One could do a bit more, in analyzing possible coexistence of different kinds of twins without saying more about the function / . However, I believe that this pretty well covers what can be inferred from general symmetry arguments, for this kind of example. For the usual view of Dauphine twins, \JL = 0 and values of X are related by a reversal of sign, these disappearing in the fi -phase. I [4] presented analyses of these and other kinds of well established twins observed in quartz, including various interactions involving different kinds, using the X-ray theory. Observations support the assumption that /i = 0, in temperature ranges where these are commonly observed, and no other possibilities satisfy these conditions. Given all the talk about the possible third phase involved in the a-/3 phase transition, I think that it is worth exploring the possibility that it might involve configurations with ii ^ 0. This involves only a rather small difference in symmetry from the configurations occurring with /x = 0. For all, the skeletal lattice group is the same, the difference in symmetry being associated with the difference in shifts. So, the various observations of a and c could apply to any of these, for example. Also, one can connect pairs of such configurations by continuous paths not including configurations of a different kind. It does seem to me reasonable to think that one will need an additional order parameter to deal with an additional phase and, here, one arises in a rather natural way. 
It seems worth noting that, concerning the tetrahedron mentioned earlier, Ackermann and Sorrell [18] write that "... the measured lattice parameters of high quartz* do not allow a regular tetrahedron with the single adjustable parameter." Of course, one could do a rather similar analysis for any of the other various subgroups of the lattice group for the center, but this is the one I like best for a study of the kind indicated. For a relatively simple theory, consider $\varphi$ to be restricted to the configurations encountered here. Of course, what is suggested is to replace the sextic in (50) by a sextic in $\lambda$ and $\mu$, to be consistent with (106) and (107), and ignore the possibility of coexistent phases. Relevant sextics are of the form
$$\Delta\varphi(\lambda,\mu,p,\theta) = a_1\lambda^2 + a_2\lambda^4 + a_3\lambda^6 + b_1\mu^2 + b_2\mu^4 + b_3\mu^6 + \delta\lambda^2\mu^2 + \varepsilon\lambda^4\mu^2 + \phi\lambda^2\mu^4, \tag{110}$$
where I have used a different labeling of some coefficients to make it easier to relate this to a discussion by Toledano and Dmitriev [20, pp. 106-109], who treat the special case $\varepsilon = \phi = 0$ (my $\lambda$ is their $\eta$, my $\mu$ their $\xi$). In accord with common interpretations of Landau theory, including theirs, $a_1$ and $a_2$ are different affine functions of $p$ and $\theta$, the remaining coefficients being constants. Also, as is rather commonly done, I assume that the highest degree part is positive,
$$\lambda \text{ and/or } \mu \neq 0 \;\Rightarrow\; a_3\lambda^6 + \varepsilon\lambda^4\mu^2 + \phi\lambda^2\mu^4 + b_3\mu^6 > 0, \tag{111}$$
for which it is obviously necessary that
$$a_3 > 0, \quad b_3 > 0. \tag{112}$$
* This is another name for the β-phase.
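It is not hard to determine the requirements on $\varepsilon$ and $\phi$. First, it is clear that
$$(112),\ \varepsilon \geq 0 \ \text{and}\ \phi \geq 0 \;\Rightarrow\; (111), \tag{113}$$
so we assume that $\varepsilon < 0$ and/or $\phi < 0$. Set
$$\lambda = \frac{\sigma}{a_3^{1/6}}, \quad \mu = \frac{\tau}{b_3^{1/6}}, \quad \varepsilon = \xi\, a_3^{2/3} b_3^{1/3}, \quad \phi = \psi\, a_3^{1/3} b_3^{2/3}, \tag{114}$$
reducing (111) to
$$\sigma^6 + \xi\sigma^4\tau^2 + \psi\sigma^2\tau^4 + \tau^6 > 0. \tag{115}$$
The possibilities that $\sigma = 0$ or $\tau = 0$ are covered by (112), so assume that $\sigma\tau \neq 0$. Then, (115) is equivalent to
$$f(x) = 1 + \xi x + \psi x^2 + x^3 > 0 \quad \text{for } x > 0. \tag{116}$$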
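The step from (115) to (116) can be spelled out in one line. This is a sketch under the assumption that $x$ denotes $(\tau/\sigma)^2$, which is the reading consistent with both displays; $\sigma$, $\tau$, $\xi$ and $\psi$ are the scaled quantities introduced in (114).

```latex
\begin{align*}
\sigma^6 + \xi\sigma^4\tau^2 + \psi\sigma^2\tau^4 + \tau^6
  &= \sigma^6\left(1 + \xi x + \psi x^2 + x^3\right),
  \qquad x = \frac{\tau^2}{\sigma^2} > 0 .
\end{align*}
```

Since $\sigma^6 > 0$ when $\sigma\tau \neq 0$, the left side is positive exactly when the cubic in $x$ is positive, and every $x > 0$ is attained by some admissible pair $(\sigma, \tau)$.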
Ignoring the restriction on $x$, this is a cubic which has at least one negative real root and, if it has three such, $\xi > 0$ and $\psi > 0$, by elementary considerations, possibilities we have excluded. The product of the roots is $-1$, so the remaining roots are either complex or real and positive. In the latter case, (116) will be violated, so they must be complex. Using the discriminant of the cubic, one gets the condition for this as
$$4(\xi^3 + \psi^3) - \xi^2\psi^2 - 18\xi\psi + 27 > 0. \tag{117}$$
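The root argument can be checked numerically. The sketch below is only an illustration, not part of the original analysis: it uses the standard discriminant of a monic cubic $x^3 + bx^2 + cx + d$, namely $18bcd - 4b^3d + b^2c^2 - 4c^3 - 27d^2$, which is negative exactly when there is a pair of complex roots. For $f$ in (116), with $b = \psi$, $c = \xi$, $d = 1$, a negative discriminant reads $4(\xi^3 + \psi^3) - \xi^2\psi^2 - 18\xi\psi + 27 > 0$. The function names and the sampling window are assumptions chosen for the example.

```python
def cubic_discriminant(b, c, d):
    """Discriminant of the monic cubic x**3 + b*x**2 + c*x + d."""
    return 18*b*c*d - 4*b**3*d + b**2*c**2 - 4*c**3 - 27*d**2


def f(x, xi, psi):
    """The cubic of (116): f(x) = 1 + xi*x + psi*x**2 + x**3."""
    return 1 + xi*x + psi*x**2 + x**3


def positive_by_criterion(xi, psi):
    """Decide f(x) > 0 for all x > 0 from the coefficients alone.

    If xi >= 0 and psi >= 0 the inequality is immediate; otherwise it
    holds exactly when the cubic has a complex pair of roots, that is,
    when its discriminant is negative.
    """
    if xi >= 0 and psi >= 0:
        return True
    return cubic_discriminant(psi, xi, 1) < 0


def positive_by_sampling(xi, psi, x_max=20.0, n=20000):
    """Brute-force check of f(x) > 0 on a finite grid in (0, x_max].

    Hypothetical helper: the window is only adequate for moderate
    |xi|, |psi|, where x**3 dominates well before x_max.
    """
    return all(f(i * x_max / n, xi, psi) > 0 for i in range(1, n + 1))


if __name__ == "__main__":
    for xi, psi in [(0, 0), (-1, 0), (-0.5, -0.5), (-3, 0), (-1.5, -1.5), (2, -1)]:
        print(xi, psi, positive_by_criterion(xi, psi), positive_by_sampling(xi, psi))
```

The borderline case $\xi = \psi = -1$ gives $f(x) = (x - 1)^2(x + 1)$, with a double root at $x = 1$ and zero discriminant, so the strict inequality fails there.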
For $\mu = 0$, I expect that (110) will agree with (50)-(52), except that the interpretation of $\theta_0$ will be slightly different. There are four possible kinds of phases. Following Toledano and Dmitriev [20, equation (1.186)], I interpret their labeling of these as the Roman numerals identified by
I. $\lambda = \mu = 0$ (β-phase),
II. $\lambda \neq 0$, $\mu = 0$ (α-phase),
III. $\lambda = 0$, $\mu \neq 0$,
IV. $\lambda\mu \neq 0$. (118)
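To make the labeling in (118) concrete, the following minimal sketch evaluates a sextic of the form (110) on a grid and reports which of the four kinds of phase contains the grid minimizer. All coefficient values are hypothetical and chosen only to exhibit each case; the function names, the grid search, and the default $\delta = \varepsilon = \phi = 0$ are assumptions of the example, not something taken from the text or from Toledano and Dmitriev.

```python
def sextic(lam, mu, a, b, delta=0.0, eps=0.0, phi=0.0):
    """Evaluate a sextic of the form (110).

    a = (a1, a2, a3) and b = (b1, b2, b3) multiply the pure powers of
    lam and mu; delta, eps, phi multiply the mixed terms.
    """
    l2, m2 = lam * lam, mu * mu
    return (a[0]*l2 + a[1]*l2**2 + a[2]*l2**3
            + b[0]*m2 + b[1]*m2**2 + b[2]*m2**3
            + delta*l2*m2 + eps*l2**2*m2 + phi*l2*m2**2)


def classify(lam, mu, tol=1e-12):
    """Return the Roman-numeral label of (118) for a configuration."""
    lam_zero, mu_zero = abs(lam) <= tol, abs(mu) <= tol
    if lam_zero and mu_zero:
        return "I"      # beta-phase
    if mu_zero:
        return "II"     # alpha-phase
    if lam_zero:
        return "III"
    return "IV"


def minimizing_phase(a, b, delta=0.0, eps=0.0, phi=0.0, width=2.0, n=80):
    """Grid-search the minimizer of the sextic and classify it.

    The sextic is even in lam and in mu, so only lam, mu >= 0 is
    searched. The grid contains 0 exactly, so the labels are exact at
    grid resolution.
    """
    grid = [i * width / n for i in range(n + 1)]
    value, lam, mu = min(
        (sextic(l, m, a, b, delta, eps, phi), l, m) for l in grid for m in grid
    )
    return classify(lam, mu), (lam, mu, value)


if __name__ == "__main__":
    # Hypothetical coefficients: the signs of a1 and b1 select the phase.
    for a1, b1 in [(1, 1), (-1, 1), (1, -1), (-1, -1)]:
        label, _ = minimizing_phase((a1, 1, 1), (b1, 1, 1))
        print(a1, b1, label)
```

With the mixed coefficients set to zero, the two order parameters decouple, so the sign of $a_1$ alone decides whether $\lambda$ vanishes and the sign of $b_1$ alone decides whether $\mu$ vanishes. Nonzero $\delta$, $\varepsilon$ or $\phi$ couple the two and can shift the boundaries.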
For the α-phase, it is well known that the point group can be generated by (95) and R^. That for phase III can be generated by (95) and R i . For the latter, the aforementioned penetration twins can be described using the same isometry that applies to the Dauphiné twins in the α-phase, R^. So these phases are quite similar, enough so that one could be confused for the other. Suppose that one takes unstressed quartz in the β-phase and cools it. Then, according to Dolino [14], what first occurs is a second-order phase transition at some temperature $\theta_2$, quite close to the nominal β → α transition temperature, but a little higher. What I consider to be a likely prospect for a third phase occurs here, described by
$$b_1(p, \theta) = 0 \ \text{at}\ p = 0,\ \theta = \theta_2, \qquad \mathrm{I} \leftrightarrow \mathrm{III} \tag{119}$$
and, for it to fit the description, one wants
$$b_1(0, \theta) > 0\ (< 0) \quad \text{for } \theta > \theta_2\ (\theta < \theta_2) \tag{120}$$
and
$$b_2 > 0. \tag{121}$$
With the assumption that $b_1$ is affine, this gives
$$b_1 = k_5(\theta - \theta_2) + k_6 p, \quad k_5 > 0, \tag{122}$$
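As a hedged illustration of why conditions of this kind describe a second-order transition, one can restrict the sextic to $\lambda = 0$ and minimize over $\mu$. This is the standard Landau argument, not a calculation taken from the text, and it assumes that $\lambda = 0$ remains the minimizer in $\lambda$ near $\theta_2$ (for instance, $a_1 > 0$ there).

```latex
\begin{align*}
\Delta\varphi\big|_{\lambda = 0} &= b_1\mu^2 + b_2\mu^4 + b_3\mu^6, \\
\frac{\partial\,\Delta\varphi}{\partial\mu}\bigg|_{\lambda = 0}
  &= 2\mu\left(b_1 + 2b_2\mu^2 + 3b_3\mu^4\right) = 0, \\
\mu^2 &= \frac{-b_2 + \sqrt{b_2^2 - 3b_1 b_3}}{3b_3}
  \;\approx\; -\frac{b_1}{2b_2}
  \;=\; \frac{k_5(\theta_2 - \theta)}{2b_2}
  \qquad (p = 0,\ \theta < \theta_2,\ |b_1| \text{ small}).
\end{align*}
```

With $b_2 > 0$ and $b_3 > 0$, a nonzero real root exists only when $b_1 < 0$, and $\mu^2$ then grows continuously from zero as $\theta$ drops below $\theta_2$, which is the signature of a second-order transition.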
where $k_5$ and $k_6$ are constants. With this, one has enough information on signs of coefficients to do a qualitative phase diagram comparable to those presented by Toledano and Dmitriev [20, Figure 43]. Nominally, this corresponds to their case (c), in the special case where $\varepsilon = \phi = 0$. Having these coefficients nonzero should have the greatest effect on phase IV. I do have reservations about their presentation. In the legend, it is asserted that their $\Delta$ is positive but, from their equation (1.184), it must be negative. Also, if I correctly interpret their phase labels, II and III should be interchanged in their sketch of this case. Further, one needs to bear in mind that, for $p = 0$, $a_1$ and $b_1$ will be restricted to a line of positive slope, and how this cuts through the phases depends on which line it is, among other things. Physically, $\varphi$ should be bounded below. If one were to consider (51) as applying to arbitrary positive values of $p$, with $k_3 < 0$ it would not be. Realistically, (51) and (122) will apply only for a limited range of pressures and temperatures so, effectively, $a_1$ and $b_1$ are bounded above and below. Judging from the sketch presented by Toledano and Dmitriev [20, Figure 43(c)], this could put the region occupied by phase IV outside the range of validity of the theory, in particular. I believe that the first-order transition II ↔ III should be included, along with the aforementioned second-order transition, when $p = 0$. For phases II and III to coexist, unstressed, puts a restriction on the transition strain. I do not think it likely that this will be satisfied, so coexistence is likely to involve residual stresses somewhat like those
discussed previously. I have not yet done a careful analysis of this situation, using the more complete theory, or looked hard at possible ways of getting more quantitative information concerning the adjustable constants, using available data. I have thought about the possibility that there is experimental evidence in conflict with this theoretical picture, but have not found any. Neither have I found clear evidence that the picture applies. Certainly, I am an amateur in this area, so could easily miss seeing something apparent to some expert. So, this should be viewed as a speculative proposal, which I find interesting. I have tried to give a fair picture of what the 3-lattice model can do for these transition problems, when used in an elementary way. Certainly, quartz is not a monatomic 3-lattice, and this model cannot describe or make use of ideas or information about relations between silicon and oxygen atoms. Experts do focus on the latter, to the extent of picturing quartz as SiO4, pretty well ignoring the fact that the silicon atoms do not form a 1-lattice, for example. So, we have two rather different pictures of what is really SiO2. One difference is that the 3-lattice model favors the view that what triggers the transformation is a bifurcation involving the shifts accounted for, whereas the SiO4 picture relates it to behavior of those tetrahedra. Evidence I have seen does not convince me which of these views is more correct. As to that incommensurate phase, one could apply the theory discussed by Dolino [14] to introduce this into the 3-lattice model. In a formal way, I understand this but, given the complications discussed earlier, I do not find it entirely clear whether we are dealing with something properly regarded as a third phase, or complex metastable configurations, or a combination of both. In dealing with complicated phenomena, I think it sensible to try to look at them from different perspectives, so I have presented one.
Acknowledgment
I thank Giovanni Zanzotto for helpful criticisms of earlier drafts of this paper.

References
1. J.L. Ericksen, Equilibrium theory for X-ray observations of crystals. Arch. Rational Mech. Anal. 139 (1997) 181-200.
2. R.D. James, Displacive phase transformations in solids. J. Mech. Phys. Solids 34 (1986) 359-394.
3. R.D. James, The stability and metastability of quartz. In: S. Antman, J.L. Ericksen, D. Kinderlehrer and I. Müller (eds), Metastability and Incompletely Posed Problems, IMA Volumes in Mathematics and its Applications 3. Springer, New York (1987) pp. 147-175.
4. J.L. Ericksen, On the theory of growth twins in quartz, to appear in Math. Mech. Solids.
5. R. Balzer and H. Sigvaldson, Equilibrium vacancy concentrations measurements on zinc single crystals. J. Phys. F: Metal Physics 9 (1979) 171-178.
6. A.H. Jay, The thermal expansion of quartz. Proc. Roy. Soc. London A 142 (1933) 237-247.
7. M. Pitteri and G. Zanzotto, Continuum Models for Phase Transitions and Twinning in Crystals. CRC/Chapman and Hall, London (2000).
8. J.L. Ericksen, Notes on the X-ray theory. J. Elasticity 55 (1999) 201-218.
9. M. Pitteri, On (ν + 1)-lattices. J. Elasticity 15 (1985) 3-25.
10. J.L. Ericksen, A minimization problem in the X-ray theory. In: Contributions to Continuum Theories, Anniversary Volume for Krzysztof Wilmanski (collected by B. Albers). Weierstraß-Institut für Angewandte Analysis und Stochastik, Report No. 18 (2000).
11. L.A. Thomas and W.A. Wooster, Piezocrescence - the growth of Dauphiné twins in quartz under stress. Proc. Roy. Soc. London A 208 (1951) 43-62.
12. A.B. Pippard, The Elements of Classical Thermodynamics. Cambridge Univ. Press, Cambridge (1957).
13. K.R. Hosieni, R.A. Howald and M.W. Scanlon, Thermodynamics of the lambda transition and the equation of state of quartz. Amer. Mineral. 70 (1985) 782-793.
14. G. Dolino, The incommensurate phase of quartz. In: R. Blinc and A.P. Levanyuk (eds), Incommensurate Phases in Dielectrics 2. Elsevier Scientific, Amsterdam (1986) pp. 207-230.
15. M.A. Carpenter, E.K.H. Salje, A. Graeme-Barber, B. Wruck, M.T. Dove and K.S. Knight, Calibration of excess thermodynamic properties and elastic constant variations associated with the a