1, the supersonic points in the hodograph plane no longer map into the plane ξ = η̄ in the complex domain. The map function f will transform the subsonic flow region into part of an ellipse with ξ = η̄ that is bounded by the sonic locus M² = 1. On the rest of this ellipse q and θ become complex, and the real supersonic flow region corresponds to a surface in the complex domain C² crossing the plane ξ = η̄ at the sonic line. In this case our boundary value problem becomes more complicated. To solve for the stream function ψ in transonic flow, we prescribe an artificial boundary condition on the portion of the ellipse beyond the sonic locus M² = 1. Along the arc of the ellipse corresponding to real subsonic flow we still impose the condition ψ = 0, but on the rest of the boundary we set Re >(£, £) + C I m »/>(£, 0 = 0, cos d>
(6.17)
where the constant C is determined empirically to improve resolution. It has been shown that problems of this kind are well posed, and computations confirm that the system of linear equations obtained by imposing boundary conditions on a linear combination of special solutions at equally spaced points on the circle \(\ = 8 is very well conditioned. As for the map function f, we
COMPLEX ANALYSIS OF TRANSONIC FLOW
Figure 1 Mach lines in the shockless flow around a symmetric airfoil designed by the method of complex characteristics. Robust performance of the new FLOW code has made it possible to obtain an exceptionally large supersonic zone over a relatively thick profile.
CHEN & GARABEDIAN
need to modify the boundary condition we formulated in the subsonic case to ensure that it works for mixed type. On the subsonic arc of the ellipse we put
y/l-M2
RAf\ = J
dq - log
e-ci
(6.18)
i-c2
as before, but elsewhere we assign the values
log
Re[f]
e-cj
t
C2
(6.19)
where once more K₁ and K₂ are empirical constants, and q = q(Re[ψ]). The method of complex characteristics that we have described here has been given an effective new implementation in the FLOW code. Fig. 1 shows a symmetric, 10% thick shockless airfoil designed by this code that has exceptionally large supersonic zones displayed by plotting the Mach lines.
6.5
Structure of the algorithm
The FLOW code has been developed from earlier work [3,4] to provide a robust implementation of the method of complex characteristics for calculation of shockless airfoils. It reads in the prescribed speed distribution q = q(s) using the arc length s along the profile as a parameter. Since the input data can be dense at some points and sparse at others, a spline routine is employed to obtain a rescaled, more evenly distributed set of values. The speed is computed iteratively as a function q = q(
((n+1 - (n)(rjn+l
~ Vn)
(6.20) of the kind employed for hyperbolic systems solves the differential equations along the characteristics, where (mVn are mesh points on paths in the complex plane. From the complete set of special solutions a discrete system of linear
Figure 2 Plot of the ellipse where the solution of the flow equations is calculated in the complex characteristic ξ-plane for the run of the FLOW code that provides the symmetric shockless airfoil shown in Fig. 1. Subsonic paths of integration are indicated by solid lines inside this ellipse. The dotted lines represent a pair of supersonic paths, and the hatchmarks on the ellipse itself are the interpolation points for the boundary value problem. The images of the stagnation point and the trailing edge are indicated by crosses at the extreme left and the extreme right, and asterisks specify the location of the dipole.
equations is obtained for the coefficients of the stream function. The real parts of the solutions are used in boundary conditions over the arcs of the ellipse where the flow is subsonic, but complex values are required on the remaining arcs. The system of linear equations that defines the coefficients of the solutions is well conditioned because we use Chebyshev polynomials in the construction. For subsonic flow our approach is not unlike more conventional procedures for solving a linear boundary value problem by superposition of special functions. In the transonic case, however, the method has a novel feature because the boundary conditions are imposed along arcs of the ellipse that correspond to nontrivial curves in the complex domain C². For any sensible choice of the prescribed speed or pressure distribution, the computation we have described results in a shockless profile when the free stream Mach number is not too large. It is usually superfluous to iterate the map function several times to fit the prescribed speed distribution better, since
perfect agreement along the supersonic arcs of the profile cannot be expected anyway. However, it becomes necessary to introduce a filter to get rid of erroneous points associated with inaccurate values of the complex square root λ = √(1 − M²). The most difficult step in the algorithm is the determination of a correct branch. Numerical experiments have shown that a valid global branch is located in the half-plane Re[λ] < −Im[λ]. Constructing paths of integration for the method of complex characteristics is so important that we devote the next section to that issue. It will be shown there how to avoid crossing the sonic locus M² = 1 where our equations become singular. To determine coordinates of the profile in the supersonic region, we lay down a set of supersonic paths of integration and calculate the solution along real characteristics. These characteristics are plotted in the real zone of supersonic flow in the transonic case. We integrate along them to determine points on the profile by looking for changes in the sign of the stream function ψ.
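The branch rule quoted in this section can be sketched in a few lines. This is only an illustration, assuming the rule is applied pointwise to the complex square root of 1 − M²; the function name `sqrt_branch` is hypothetical and is not taken from the FLOW code.

```python
import cmath


def sqrt_branch(m_squared: complex) -> complex:
    """Return the root lam of lam**2 = 1 - M**2 that satisfies Re[lam] < -Im[lam].

    The two roots are +lam and -lam, so exactly one of them has
    Re + Im < 0 unless Re + Im == 0, where the rule cannot decide.
    That includes the sonic point M**2 = 1, where both roots vanish.
    """
    lam = cmath.sqrt(1.0 - m_squared)  # principal root
    s = lam.real + lam.imag
    if s == 0.0:
        raise ValueError("branch undetermined: Re[lam] + Im[lam] = 0")
    return lam if s < 0.0 else -lam


if __name__ == "__main__":
    for m2 in (0.25, 1.44, 0.9 + 0.4j):
        print(m2, sqrt_branch(m2))
```

The sketch fails loudly on the line Re[λ] = −Im[λ] rather than guessing, which mirrors the remark that points close to the sonic locus are the ones that produce unreliable values.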
6.6
Paths of integration
Let us discuss the paths of integration needed to calculate a cascade of compressor blades. The paths for a single airfoil follow the same principle as for a cascade with only minor variations. Singularities representing the source and the sink for a cascade are located at points ξ_a and ξ_b in the characteristic ξ-plane. In the limiting case of a single airfoil the distance between ξ_a and ξ_b vanishes and the source and the sink coincide to form a dipole. Paths are drawn from ξ_a and ξ_b to an initial point ξ_c from which further paths are traced to solve characteristic initial value problems for the flow equation. The preliminary paths from ξ_a and ξ_b to ξ_c are required to handle the characteristic initial value problem for the Riemann function, which is the coefficient of the singular terms defining the source and the sink. In the subsonic case the paths have symmetry over the complex domain with respect to the real diagonal ξ = η̄, which just means that the paths in the η-plane are reflected images of the paths in the ξ-plane. Hence we need only compute finite differences on a triangle of mesh points, since values over a larger square mesh can be reflected across the diagonal. This is not possible for the transonic paths because they must circumvent the sonic locus M² = 1 in the complex domain. Consequently the transonic paths in the ξ-plane and the η-plane are not reflected images of each other, but terminate instead in arcs on the ellipse traversed in opposite directions (cf. Fig. 2). The calculation is stopped when the two paths meet each other at points on a segment perpendicular to the diagonal. These are points on the ellipse that are used later in the solution of the boundary value problem for the stream function. The value of the stream function is no longer real on arcs of the
Figure 3 Schematic diagrams of (a) the supersonic paths in the characteristic α-plane and (b) corresponding characteristics in the real hodograph plane. The dashed line is the sonic locus where, to avoid M² = 1, the paths must not cross each other.
ellipse beyond the sonic locus. It is remarkable that these points lead anyway to a well posed boundary value problem in the complex domain producing useful shockfree transonic flows. After the boundary value problem has been solved the real supersonic solution must be found separately. The real characteristics in the supersonic zone cannot be determined by crossing the sonic line directly because the canonical equations fail when M² = 1. To avoid that we choose polygonal supersonic paths that consist of a union of horizontal and vertical line segments in the characteristic α-plane, where the sonic locus lies along the imaginary axis (cf. Fig. 3). Each polygon twists around to avoid points where M² = 1 and then terminates in an arc of the sonic locus. Since the sonic locus is only a two-dimensional surface, while the calculation is being done in the four-dimensional complex domain C², it is always possible to move the
paths of integration so that they circumvent the sonic locus. The schematic diagram in Fig. 3a shows how two paths in the α-plane and the β-plane are constructed to accomplish this. The thick line indicates the path in the α-plane, while the thin line indicates the path in the β-plane and the dashed line represents the sonic locus. Fig. 3b shows the corresponding characteristics in the real hodograph plane. Pairs of points on the sonic locus coming from the two supersonic paths correspond to real points of supersonic flow in the hodograph plane. When these points move along the sonic locus, the corresponding points in the characteristic α-plane move up or down on the imaginary axis. Thus the imaginary part of α behaves like a real characteristic coordinate. Success depends on selecting the early segments of the paths so as to end up on the right branch of the solution in the complex domain. Both subsonic and supersonic paths of integration are plotted in the characteristic ξ-plane and displayed by the code as shown in Fig. 2 for an isolated airfoil. Hatchmarks on the ellipse specify mesh points where the boundary conditions are imposed for interpolation. The polygonal curves shown inside the ellipse are the paths of integration used to solve characteristic initial value problems by the method of complex characteristics. They join the dipole at ξ_a = ξ_b to an initial point ξ_c from which further segments proceed to a number of arcs around the ellipse. Supersonic paths terminate along the sonic locus, which is a curve cutting a sliver out of the edge of the ellipse. The images of the leading and trailing edges of the airfoil are indicated by large crosses at the left and at the right in the figure.
6.7
Fourier analysis of the coordinates
For transonic flow one disadvantage of the method of complex characteristics is that it introduces singularities at the sonic locus M² = 1. The transonic and supersonic paths are constructed to avoid these points, which are reflections of one another across the corresponding analytic curve M² = 1 in the diagonal plane ξ = η̄. If a sonic point occurs on the mesh at a point where the solution must be computed, trouble is encountered in calculating reliable coordinates of the profile. We have introduced a special filter that eliminates erroneous data of this kind in the x, y coordinates by exploiting complex analyticity of the solution at the boundary. First let us rescale the arc length s around the profile so that it behaves like an angle in the ring 1/5 < | (\ < 5, for x and y are regular there. We choose this angle-like variable ω so that s = s(ω) is defined by the integral
a = Ki
(1+e-
UJ)2{L> + efdw + K2,
(6.21)
Figure 4 Plot of a shockless airfoil designed on a crude mesh before the coordinates of the profile were filtered. The effect of erroneous points is clearly visible.
where ε is an input parameter less than 0.2 and K₁ and K₂ are constants adjusted to place ω between 1 and −1. The formula has been arranged to make Fourier series for x and y converge rapidly. Recalling that θ is the angle of the flow vector, we have
\[
\frac{dx}{d\omega} + i\,\frac{dy}{d\omega} = \frac{ds}{d\omega}\cos\theta + i\,\frac{ds}{d\omega}\sin\theta = \frac{ds}{d\omega}\exp(i\theta). \tag{6.22}
\]
Because the velocity components u and v are computed reliably by the FLOW code, we are able to obtain a Fourier series for the function
\[
\frac{dx}{d\omega} + i\,\frac{dy}{d\omega} = \sum_n (a_n + i b_n)\exp(i\pi n\omega)
\]
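The filtering idea can be illustrated by truncating the Fourier series of dx/dω + i dy/dω = (ds/dω) exp(iθ) and integrating it term by term. The sketch below is not the FLOW filter itself. It assumes uniform samples ω_k = −1 + 2k/N on a period of length 2, a plain discrete Fourier sum, and a hypothetical cutoff `n_max`; the function name is likewise made up for the example.

```python
import cmath
import math


def fourier_filter_profile(ds_domega, theta, n_max, z_start=0j):
    """Low-pass filter dz/domega = (ds/domega) * exp(i*theta), then integrate.

    ds_domega, theta: samples at omega_k = -1 + 2*k/N, k = 0..N-1.
    n_max: highest retained harmonic (assumed smaller than N/2).
    Returns the complex points z_k = x_k + i*y_k, with z(-1) = z_start.
    """
    n_pts = len(theta)
    omega = [-1.0 + 2.0 * k / n_pts for k in range(n_pts)]
    g = [ds_domega[k] * cmath.exp(1j * theta[k]) for k in range(n_pts)]

    # Discrete Fourier coefficients c_n of g for |n| <= n_max only.
    coeff = {}
    for n in range(-n_max, n_max + 1):
        total = sum(g[k] * cmath.exp(-1j * math.pi * n * omega[k])
                    for k in range(n_pts))
        coeff[n] = total / n_pts

    # Integrate the truncated series term by term, starting at omega = -1.
    points = []
    for w in omega:
        value = z_start + coeff[0] * (w + 1.0)
        for n, c in coeff.items():
            if n != 0:
                phase = cmath.exp(1j * math.pi * n * w) - cmath.exp(-1j * math.pi * n)
                value += c * phase / (1j * math.pi * n)
        points.append(value)
    return points
```

For a unit circle traversed once (ds/dω = π, θ = πω + π/2, starting at z = −1) only the n = 1 harmonic survives and the routine returns exp(iπω), which gives a quick check of the sign conventions.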
Figure 5 Vertically rescaled coordinates of the airfoil shown in Fig. 4. The small circles indicate the unfiltered coordinates, and the solid line represents the final, more accurate profile. The dots below are the filtered coordinates without vertical rescaling, and that shows the true shape of the shockless airfoil.
introduced enables us to obtain from relatively primitive runs of the FLOW code better coordinates for a shockless airfoil whose design properties are confirmed by a BGKM analysis. The FLOW code has produced the shockless airfoil shown in Fig. 4 whose coordinates calculated before the filtering process are seen to be quite unsatisfactory. The upper surface of the output is not smooth near the back of the supersonic zone. This is because a mesh point at the end of one of the supersonic paths falls on the sonic locus and becomes erroneous. Fig. 5 shows a vertically stretched plot of the airfoil in which we not only see the bad point, but also points further back that oscillate unevenly. The small circles in the enlarged picture represent data output before filtering, but points of the solid curve and points in the picture underneath represent more
Figure 6 Pressure distribution on the shockless airfoil of Figs. 4 and 5 calculated by the BGKM code. The hodograph method and the Euler computations agree well and serve to establish that the filtered solution is truly shockless.
accurate coordinates calculated by the filtering process. Once the filter has been included, we obtain a smooth design that has been checked by running the BGKM code. The BGKM result appears in Fig. 6, where one can see that the airfoil is shockless and that the pressure distributions calculated by the hodograph method and the Euler equations agree. McGrattan [8] has found a symmetric shockless airfoil that produces nonunique solutions of the Euler equations, but it was quite thin. We have obtained another symmetric shockless airfoil with a larger thickness-to-chord ratio of 10% which was designed by the new and more robust FLOW code exploiting the method of complex characteristics. For a symmetric airfoil at zero angle of attack we know there is a solution with zero lift. Because the
BGKM code yields a solution with nonzero lift, too, that has to be a second solution. By symmetry the reflected flow with negative lift is yet a third solution. The original shockless airfoil and its hodograph have been displayed earlier in Figs. 1 and 2. The range of free stream Mach numbers for which solutions can be found with nontrivial lift at zero angle of attack is very narrow. In practical applications one might become concerned about the nonunique solutions because an airfoil having such a design could behave unpredictably. However, these solutions occur for such a limited range of shapes, Mach numbers and angles of attack that no adverse physical consequences have been observed. Supercritical wing sections now operate successfully on many commercial aircraft. We hope that the improved FLOW code based on the method of complex characteristics will become part of this generally accepted technology. We are grateful to Frances Bauer, Margaret Bledsoe and Mark McConnell for their contributions. This work has been supported by the National Science Foundation under Grant DMS-9420499.
REFERENCES
1. F. Bauer, P. Garabedian and D. Korn, Supercritical Wing Sections, Springer-Verlag, New York, 1972.
2. F. Bauer, P. Garabedian and D. Korn, Supercritical Wing Sections II, Springer-Verlag, New York, 1975.
3. F. Bauer, P. Garabedian and D. Korn, Supercritical Wing Sections III, Springer-Verlag, New York, 1977.
4. M. Bledsoe and P. Garabedian, The method of complex characteristics for transonic airfoil design, with an application to compressors, Advances in Computational Transonics 4, ed. by W. G. Habashi, Pineridge Press, Swansea, 1985, pp. 111-129.
5. P. Garabedian, Partial Differential Equations, Chelsea, New York, 1986.
6. P. Garabedian and G. McFadden, Design of supercritical swept wings, AIAA Journal 30, 1982, pp. 289-291.
7. A. Jameson, W. Schmidt and E. Turkel, Numerical solution of the Euler equations by the finite volume method using Runge-Kutta time-stepping schemes, AIAA Paper 81-1259, June 1981.
8. K. McGrattan, Comparison of transonic flow models, AIAA Journal 30, 1992, pp. 2340-2342.
9. E. Murman and J. Cole, Calculation of plane steady transonic flows, AIAA Journal 9, 1971, pp. 114-121.
10. E. Swenson, Geometry of the complex characteristics in transonic flow, Comm. Pure Appl. Math. 21, 1968, pp. 175-185.
7
Transonic Small Transverse Perturbation Equation and its Computation Shijun Luo1, Huili Shen2 & Ping Liu3
7.1
Introduction
The small perturbation approximation has been widely accepted in subsonic and supersonic aerodynamics for computing complex configurations because of its computational efficiency. In transonic aerodynamics, the small perturbation approximation is also feasible. The classical transonic small perturbation (TSP) equation was formulated by von Karman[6]. Murman and Krupp[9] made a modification to improve the accuracy of the critical pressure expression. Another modification, which allows a large longitudinal perturbation velocity component, has been used in [7] for computing wing-body combinations. It eliminates the singularity of the TSP equation at the stagnation point. This paper consists of three parts. In the first, general properties of the transonic small transverse perturbation (TSTP) equation are described. In the second, the Murman-Cole mixed difference scheme[8] is used to compute the transonic flows around inlets. Multiple solutions of transonic flow around an airfoil were found by Steinhoff and Jameson[12] for the full potential (FP) equation and by Chen[2] for the TSP equation. In the third part, the multiple solutions of the TSTP equation for transonic flow over an airfoil and a wing are investigated.
1 Department of Aircraft Engineering, Northwestern Polytechnical University, Xi'an, China.
2 Department of Aeroengine Engineering, Northwestern Polytechnical University, Xi'an, China.
3 Engineering Development Center, Chengdu Aircraft Industrial Corp., Chengdu, China.
Frontiers of Computational Fluid Dynamics - 1998 Editors: David A. Caughey & Mohamed M. Hafez ©1998 World Scientific
LUO, SHEN, & LIU
7.2
Transonic Small Transverse Perturbation Equation
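By the assumption of small transverse perturbation, the full potential equation reduces to
\[
(1 - M^2)\,\phi_{xx} + \phi_{yy} + \phi_{zz} = 0 \tag{7.1}
\]
where
\[
1 - M^2 = \frac{1 - M_\infty^2 - (\gamma+1) M_\infty^2 \dfrac{\phi_x}{q_\infty} - \dfrac{\gamma+1}{2} M_\infty^2 \left(\dfrac{\phi_x}{q_\infty}\right)^2}{1 - (\gamma-1) M_\infty^2 \dfrac{\phi_x}{q_\infty} - \dfrac{\gamma-1}{2} M_\infty^2 \left(\dfrac{\phi_x}{q_\infty}\right)^2} \tag{7.2}
\]
x, y and z are the Cartesian coordinates with x parallel to the longitudinal axis of the body or the chord of the wing and y parallel to the wing span. M∞ and q∞ are the Mach number and the speed of the free stream, and φ is the perturbation velocity potential. Subscripts x, y, z signify the partial derivatives with respect to x, y, z respectively. γ is the ratio of specific heats. M has the meaning of local Mach number. The pressure coefficient for the TSTP flow is
\[
C_p = \frac{2}{\gamma M_\infty^2}\left\{\left[1 - \frac{\gamma-1}{2} M_\infty^2 \left(\frac{2\phi_x}{q_\infty} + \frac{\phi_x^2}{q_\infty^2}\right)\right]^{\gamma/(\gamma-1)} - 1\right\} \tag{7.3}
\]
As M = 1, from Eq.(7.2)
\[
\frac{\phi_x^2}{q_\infty^2} + \frac{2\phi_x}{q_\infty} = \frac{2\,(1 - M_\infty^2)}{(\gamma+1)\,M_\infty^2} \tag{7.4}
\]
Substituting into Eq.(7.3), the exact critical pressure coefficient is obtained
\[
C_p^* = \frac{2}{\gamma M_\infty^2}\left\{\left[\frac{2 + (\gamma-1) M_\infty^2}{\gamma+1}\right]^{\gamma/(\gamma-1)} - 1\right\} \tag{7.5}
\]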
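The claim that the sonic condition reproduces the exact critical pressure coefficient can be checked numerically. The sketch below assumes the isentropic form C_p = 2/(γM∞²){[1 − ((γ−1)/2)M∞²(2t + t²)]^{γ/(γ−1)} − 1} with t = φ_x/q∞, and the sonic condition t² + 2t = 2(1 − M∞²)/((γ+1)M∞²). Both are written here as working assumptions consistent with the surrounding definitions, and the function names are illustrative.

```python
import math


def cp_tstp(t, mach_inf, gamma=1.4):
    """Isentropic pressure coefficient keeping only the longitudinal
    perturbation; t = phi_x / q_inf."""
    base = 1.0 - 0.5 * (gamma - 1.0) * mach_inf**2 * (2.0 * t + t * t)
    return 2.0 / (gamma * mach_inf**2) * (base ** (gamma / (gamma - 1.0)) - 1.0)


def sonic_t(mach_inf, gamma=1.4):
    """Root of t**2 + 2 t = 2 (1 - M_inf**2) / ((gamma + 1) M_inf**2)
    that tends to zero as M_inf tends to 1."""
    rhs = 2.0 * (1.0 - mach_inf**2) / ((gamma + 1.0) * mach_inf**2)
    return -1.0 + math.sqrt(1.0 + rhs)


def cp_critical_exact(mach_inf, gamma=1.4):
    """Textbook critical pressure coefficient for isentropic flow."""
    ratio = (2.0 + (gamma - 1.0) * mach_inf**2) / (gamma + 1.0)
    return 2.0 / (gamma * mach_inf**2) * (ratio ** (gamma / (gamma - 1.0)) - 1.0)


if __name__ == "__main__":
    for mach in (0.70, 0.85, 1.20):
        print(mach, cp_tstp(sonic_t(mach), mach), cp_critical_exact(mach))
```

The two printed columns agree to rounding error because substituting the sonic condition turns the bracket into [2 + (γ−1)M∞²]/(γ+1), the quantity that appears in the exact expression.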
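Under the assumption of small transverse perturbation and weak shock wave, the Rankine-Hugoniot shock relation becomes
\[
\frac{1 - M_\infty^2 - (\gamma+1) M_\infty^2 \dfrac{u_1'+u_2'}{2q_\infty} - \dfrac{\gamma+1}{2} M_\infty^2 \left(\dfrac{u_1'+u_2'}{2q_\infty}\right)^2}{1 - (\gamma-1) M_\infty^2 \dfrac{u_1'+u_2'}{2q_\infty} - \dfrac{\gamma-1}{2} M_\infty^2 \left(\dfrac{u_1'+u_2'}{2q_\infty}\right)^2}\,(u_1' - u_2')^2 + (v_1' - v_2')^2 + (w_1' - w_2')^2 = 0 \tag{7.6}
\]
where (u₁′, v₁′, w₁′) and (u₂′, v₂′, w₂′) are the perturbation velocity components on the upstream and downstream sides of the shock wave. In fact, Eq.(7.6) is also the difference equation of the TSTP differential equation (7.1). For a shock wave in the free stream, u₁′ = v₁′ = w₁′ = 0, Eq.(7.6) reduces to
\[
\left[1 - M_\infty^2 - (\gamma+1) M_\infty^2 \frac{u_2'}{2q_\infty} - \frac{\gamma+1}{2} M_\infty^2 \left(\frac{u_2'}{2q_\infty}\right)^2\right] u_2'^2 + \left[1 - (\gamma-1) M_\infty^2 \frac{u_2'}{2q_\infty} - \frac{\gamma-1}{2} M_\infty^2 \left(\frac{u_2'}{2q_\infty}\right)^2\right]\left(v_2'^2 + w_2'^2\right) = 0 \tag{7.7}
\]
This is the TSTP shock polar at M∞. The corresponding shock polars at M∞ for the TSP flow and the FP flow are respectively
\[
\left[1 - M_\infty^2 - (\gamma+1) M_\infty^2 \frac{u_2'}{2q_\infty}\right] u_2'^2 + v_2'^2 + w_2'^2 = 0 \tag{7.8}
\]
and
= 0 (7.9)
Eq.(7.9) is the exact shock polar at M∞. Eqs.(7.7)-(7.9) are plotted in Fig. 1 for M∞ = 1.2, where a* is the critical speed of sound.
SMALL TRANSVERSE PERTURBATION EQUATION
Figure 1 Shock Polar at M∞ = 1.2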
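A comparison of this kind can be sketched for two of the polars. The code assumes the TSP polar in the form [1 − M∞² − (γ+1)M∞² u′/(2q∞)]u′² + v′² = 0 and, for the exact curve, the classical textbook shock polar v₂² = (u₁ − u₂)²(u₁u₂ − a*²)/(2u₁²/(γ+1) − u₁u₂ + a*²). The latter is not necessarily written the same way as Eq.(7.9). Velocities are scaled by q∞, the flow is taken as planar (w′ = 0), and all function names are illustrative.

```python
import math


def a_star_sq(mach_inf, gamma=1.4):
    """Critical speed of sound squared, in units of q_inf**2."""
    return (2.0 + (gamma - 1.0) * mach_inf**2) / ((gamma + 1.0) * mach_inf**2)


def v_exact(u_pert, mach_inf, gamma=1.4):
    """Transverse velocity on the classical shock polar; u_pert = u2'/q_inf."""
    u2 = 1.0 + u_pert
    a2 = a_star_sq(mach_inf, gamma)
    v_sq = u_pert**2 * (u2 - a2) / (2.0 / (gamma + 1.0) - u2 + a2)
    return math.sqrt(max(v_sq, 0.0))  # clip round-off at the end points


def v_tsp(u_pert, mach_inf, gamma=1.4):
    """Transverse velocity on the TSP shock polar (assumed form, see lead-in)."""
    bracket = 1.0 - mach_inf**2 - 0.5 * (gamma + 1.0) * mach_inf**2 * u_pert
    return math.sqrt(max(-bracket * u_pert**2, 0.0))


def normal_shock_u(mach_inf, gamma=1.4):
    """Perturbation velocity u2'/q_inf behind a normal shock."""
    return 2.0 * (1.0 - mach_inf**2) / ((gamma + 1.0) * mach_inf**2)


if __name__ == "__main__":
    mach = 1.2
    u_n = normal_shock_u(mach)
    for k in range(6):
        u = u_n * k / 5.0
        print(f"{u:+.4f}  exact {v_exact(u, mach):.5f}  TSP {v_tsp(u, mach):.5f}")
```

Under these assumptions both curves close at the same normal-shock point, u₂′/q∞ = 2(1 − M∞²)/((γ+1)M∞²), and differ only in between.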
7.3
Computation of Axisymmetric Inlet
The external and internal flow around an inlet is computed by the TSTP and FP equations. Only axisymmetric flow is considered.
7.3.1
Governing Equations and Boundary Conditions
In cylindrical coordinates x, r and θ, the TSTP and FP equations for axisymmetric flow are
\[
(a^2 - u^2)\,\phi_{xx} + (a^2 - \mu v^2)\,\phi_{rr} - 2\mu u v\,\phi_{xr} + \frac{a^2}{r}\,\phi_r = 0 \tag{7.10}
\]
where a is the local speed of sound,
\[
a^2 = a_\infty^2 + \frac{\gamma-1}{2}\left(q_\infty^2 - u^2 - \mu v^2\right) \tag{7.11}
\]
a∞ is the speed of sound in the free stream, u and v are the velocity components in the x and r directions respectively, and μ = 0 and 1 give the TSTP and FP equations respectively. The same boundary conditions are used for the TSTP and FP equations. The boundary condition on the external and internal surfaces of the cowl and the surface of the center body, r = f(x), is exact:
\[
\phi_r(x, f(x)) = \left[q_\infty + \phi_x(x, f(x))\right] f'(x) \tag{7.12}
\]
If the cowl lip is blunt, the above boundary condition is replaced by
\[
\phi_x = -q_\infty \tag{7.13}
\]
at the blunt lip point. The governing equation on the inlet axis, r = 0, outside of the center body is replaced by
\[
(a^2 - u^2)\,\phi_{xx} + 2a^2\,\phi_{rr} = 0 \tag{7.14}
\]
and there
\[
\phi_r = 0 \tag{7.15}
\]
The average velocity at the inlet exit is applied at the far downstream end of a cylindrical extension of the inlet as the downstream internal boundary condition. The free stream condition, φ = 0, is applied at the outer boundary of the external flow, which is taken far enough from the inlet entrance. In this paper, for M∞ < 1, the upstream and downstream boundaries are 5D (D = inlet entrance diameter) from the inlet entrance and the lateral boundary is at r = 25D; for M∞ > 1, the distance for the lateral and downstream boundaries extends to 500D, which was found necessary for numerical stability, and the upstream boundary remains 5D from the inlet entrance.
Figure 2 Generation of Non-Uniform Meshes around Blunt Lip
7.3.2
Difference Equations
The Murman-Cole non-conservative mixed difference scheme[8] is used for both TSTP and FP equations. The TSTP equation is first solved and its solution is used as the start field in solving the FP equation. By this technique, the non-rotated mixed difference scheme is stable for the FP equation. To facilitate the calculation, the grids around the blunt lip of the cowl are generated as follows. For a given sequence of Δx, a series of grid lines perpendicular to the x-axis are drawn. The corresponding sequence of Δr is determined by the intersection of the grid line and the meridian of the cowl (Fig. 2). The generated grid is non-uniform. In the neighborhood of the cowl lip, the grid stretching should not be excessive, say 1/2 < Δx_{i−1,j}/Δx_{i,j} < 2, where i and j are the grid point numbers in the x and r directions respectively. If the grid stretching is excessive, the relaxation computation may not converge. The grid is fine in the vicinity of the cowl lip and coarse near the outer boundary.
7.3.3
Relaxation Method
The successive line over relaxation (SLOR) method is used to solve the difference equations. The undisturbed flow, φ = 0, is a good start field for the relaxation computation for the external field, but not good for the internal field. For the internal field, the one dimensional flow solution is used as the start field. For the pitot type inlet with a convergent diffuser, it is found that relaxation in the r-direction alone may not be stable, whereas alternating x- and r-direction relaxation stabilizes the computation and accelerates the convergence. The r-direction relaxation propagates the disturbance of the lateral surface of the inlet rapidly into the flow field, whereas the x-direction relaxation propagates the inlet exit disturbance rapidly in the x-direction.
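The mechanics of a line relaxation sweep can be shown on a much simpler model problem. The sketch below applies SLOR to the five-point Laplace equation on a uniform grid, not to the mixed-type TSTP or FP difference equations; the relaxation factor, tolerance and function names are illustrative choices, and convergence is tested on the absolute rather than the relative increment.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system; lower[0] and upper[-1] are not used."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def slor_laplace(phi, omega=1.5, tol=1e-10, max_iter=2000):
    """Successive line over-relaxation for the 5-point Laplace equation.

    phi[i][j] carries fixed boundary values on its edges; interior values
    are updated in place. Each line (fixed i) is solved implicitly along j,
    and the sweep marches in i using the newest values of line i - 1.
    Returns the number of sweeps performed (max_iter if not converged).
    """
    ni, nj = len(phi), len(phi[0])
    m = nj - 2
    for sweep in range(1, max_iter + 1):
        max_change = 0.0
        for i in range(1, ni - 1):
            rhs = [-(phi[i - 1][j] + phi[i + 1][j]) for j in range(1, nj - 1)]
            rhs[0] -= phi[i][0]          # known boundary value at j = 0
            rhs[-1] -= phi[i][nj - 1]    # known boundary value at j = nj - 1
            line = thomas([1.0] * m, [-4.0] * m, [1.0] * m, rhs)
            for j in range(1, nj - 1):
                new = phi[i][j] + omega * (line[j - 1] - phi[i][j])
                max_change = max(max_change, abs(new - phi[i][j]))
                phi[i][j] = new
        if max_change <= tol:
            return sweep
    return max_iter
```

With i taken as the x index and j as the r index, each tridiagonal solve plays the role of an r-direction line; swapping the roles of the two indices gives the x-direction variant.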
For the inlet with a center body, the diffuser has convergent and divergent parts. Some of the x-direction relaxation lines through the exit plane are cut by the lateral surfaces of the body and thus the x-direction relaxation is not effective. Instead, a zonal multiple relaxation technique is used, in which more r-direction sweeps are carried out inside the diffuser than outside and still more r-direction relaxation sweeps in the divergent part of the diffuser than in the convergent part. In order to ensure the convergence of relaxation, under relaxation is used in the early period of relaxation. The convergence criterion for φ is that the maximum relative increment of φ in consecutive iterations for all grid points is not greater than 10⁻³. A 1% maximum relative difference in the mass flow rate among all cross sections of the inlet is also required for convergence. The multigrid technique is used to accelerate the convergence and assess the accuracy of the numerical solutions.
7.3.4
Numerical Examples
The pitot type inlets, IE and IC[10], are computed. Fig. 3 shows the flow around the blunt lip of the cowl of IE at M∞ = 0.7197, the exit Mach number, M_e = M∞ and the flow rate coefficient, C_A = 1. The FP solution has an isentropic supersonic pocket followed by a larger supersonic pocket ended with a shock wave. The TSTP equation predicts the downstream supersonic pocket well, but misses the upstream supersonic pocket. Fig. 4 shows the bow shock wave and the sonic line of the inlet IE at M∞ = 1.14, M_e = 0.7197 and C_A = 0.91. In comparison with the FP solution, the TSTP bow shock wave and sonic line are good near the axis of symmetry. Fig. 5 shows the distribution of the pressure coefficient on the external and internal surfaces of the cowl of the inlet IC at M∞ = 1.14, M_e = 0.8215 and C_A = 0.98. The TSTP solution agrees with the FP solution except near the blunt lip on the external surface. The comparisons with the available experimental data[10] and Euler solution[11] are shown in Figs. 4 and 5. For the above computations, the two grids, 41 x 41 and 70 x 80, are used. It is found that the coarse grid gives sufficient accuracy. The inlet with a center body, 60-40[13], at M∞ = 1.27, M_e = 0.5413 and C_A = 0.655 is computed. Fig. 6 gives the pressure distribution on the front portion of the center body and the external surface of the cowl, where p∞ is the pressure in the free stream. The TSTP solution agrees with the FP solution and the experimental data except near the lip of the cowl. The potential solutions on the nose of the center body agree well with the conical flow theory[3]. The two grids used are 42 x 51 and 80 x 100 and the coarse grid yields sufficient accuracy. The iteration numbers for convergence on the coarse and fine grids are 300 and 400 respectively. The corresponding computer times on the Siemens 7760, which works at a rate of 10⁶ floating point operations per second, are 500 and 1000 seconds respectively.
Figure 3 Flow around Lip of Inlet IE, M∞ = M_e = 0.7197
Figure 4 Flow around Inlet IE, M∞ = 1.14, M_e = 0.7197
Figure 5 Pressure Distribution on Cowl of Inlet IC, M∞ = 1.14, M_e = 0.8215
7.4
Multiple Solutions for Airfoil and Wing
Multiple solutions of the TSTP equation are investigated for transonic flow over an airfoil and a wing by varying the start field of the relaxation computation, the relaxation method, the relaxation parameters and the difference scheme.
7.4.1
Boundary Conditions
The Cartesian coordinates x, y and z are used. The governing equation is Eq.(7.1). Taking the coordinate plane xy close to the wing surfaces, the boundary conditions for the flow around a thin wing at a small angle of attack, α, are as follows. On the upper surface of the wing, z = F(x, y),
\[
\phi_z(x, y, 0^+) = \left[q_\infty + \phi_x(x, y, 0^+)\right] F_x(x, y) - q_\infty\alpha \tag{7.16}
\]
A similar condition holds on the lower surface. At the blunt leading edge, the exact boundary condition is used.
Figure 6 Pressure Distribution on Cowl and Center Body of Inlet 60-40, M∞ = 1.27, M_e = 0.5413
The Kutta conditions on the wing wake are
=
(7.17)
and
\[
\phi(x, y, 0^+) - \phi(x, y, 0^-) = \phi(x_t, y, 0^+) - \phi(x_t, y, 0^-) \tag{7.18}
\]
where x_t is the x coordinate of the trailing edge of the wing. On the far field boundaries,
4>* = o.
7.4.2
Relaxation Methods
The difference equations are formed with the Murman-Cole non-conservative scheme[8]. For the flow over an airfoil, the difference equations are solved by two relaxation methods. They are the SLOR method and the implicit approximate-factorization method AF-2[1]. For the flow over a wing, the plane AF-2 relaxation method is used. The relaxation plane is perpendicular to the wing span and in the plane the AF-2 method is used. In the SLOR method, the artificial time-damping term[5] is introduced and thus over relaxation can be used for both subsonic and supersonic points to accelerate the convergence. In this investigation, the convergence criterion is that the maximum relative increment of φ in consecutive iterations for all grid points is not greater than 10⁻⁵.
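The convergence criterion stated above can be written as a short test on two consecutive iterates. This is a sketch under assumptions: the increment is normalized by the magnitude of the new value at each grid point, and a small `floor` guards against division by zero where φ vanishes. Neither detail is specified in the text, and the function names are hypothetical.

```python
def max_relative_increment(phi_old, phi_new, floor=1e-30):
    """Largest |phi_new - phi_old| / max(|phi_new|, floor) over all grid points.

    phi_old, phi_new: two-dimensional lists of equal shape.
    """
    worst = 0.0
    for row_old, row_new in zip(phi_old, phi_new):
        for a, b in zip(row_old, row_new):
            worst = max(worst, abs(b - a) / max(abs(b), floor))
    return worst


def has_converged(phi_old, phi_new, tol=1e-5):
    """True when the maximum relative increment does not exceed tol."""
    return max_relative_increment(phi_old, phi_new) <= tol
```

A run would call `has_converged` after every sweep with the fields before and after the update, with `tol=1e-5` for the airfoil and wing computations described here.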
Figure 7 Multiple Solutions for Airfoil NACA 0012, M∞ = 0.85, α = 0
7.4.3
Airfoil NACA 0012
For the flow over the airfoil NACA 0012 at M∞ = 0.85 and α = 0, three different numerical solutions, A, B and C, are obtained. The distributions of the pressure coefficient, C_p, over the airfoil surface differ mainly in the shock wave location and strength as shown in Fig. 7, where c is the chord of the airfoil. These multiple solutions are obtained by different start fields of the relaxation computation with fixed relaxation parameters as given in Table 1. The computational grid is 45 x 39. In Table 1, the subscripts A and B of the start fields denote the weak and strong shock wave solutions respectively and for AF-2, the relaxation parameter, Ω = 0.8, the lowest and highest values of the acceleration convergence parameter, α_L = 0.5 and α_H = 65, and the number in the circular α-sequence, K = 5. The above multiple solutions can also be obtained by different relaxation parameters of the AF-2 method with the same start field as given in Table 2. Here the start field is φ = 0 and for AF-2, α_L = 0.5, α_H = 65 and K = 5, and Ω varies.
Table 1 Multiple Solutions by Different Start Field

Start Field               Relaxation Method    Solution
φ = 0                     AF-2 or SLOR         A
φ|M∞=.9, α=0, A           AF-2                 B
φ|M∞=.875, α=0, B         AF-2 or SLOR         C

Table 2 Multiple Solutions by Different Relaxation Parameters

Relaxation Parameter, Ω               Solution
.795, .8, .805                        A
.6, .7, .75, .79, .81, .825, .85      B
Numerical experiments show that the above multiple solutions can also be obtained by varying other parameters of the AF-2 method. Extensive numerical experiments, including the use of the refined grids, 89 x 39 and 111 x 39, have confirmed the persistence of these solutions. The present investigation finds that for the airfoil NACA 0012 at α = 0, the multiple solutions appear in the region 0.825 < M∞ < 0.900. The multiple wave drag coefficients, C_Dw0, are listed in Table 3. It is seen that the differences in C_Dw0 are significant. The Murman-Cole conservative difference scheme also yields multiple solutions for the TSTP equation.
Table 3 Multiple Wave Drag Coefficients at Zero Lift

$M_\infty$      $C_{Dw0}$ by TSTP
0.840           0.0274, 0.0292, 0.0320
0.850           0.0325, 0.0362, 0.0407
0.875           0.0480, 0.0540
0.900           0.0673, 0.0690, 0.0710
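To put a number on how much the wave drag differs between the coexisting solutions, the tabulated values can be compared at each Mach number. The sketch below groups the values of Table 3 as 3, 3, 2 and 3 entries per Mach number, which is an assumption about the table layout; the spread measure (largest minus smallest, relative to the smallest) is likewise chosen here for illustration.

```python
# Wave drag coefficients from Table 3 (TSTP equation, NACA 0012, zero lift).
# The grouping per Mach number is an assumption about the table layout.
wave_drag = {
    0.840: [0.0274, 0.0292, 0.0320],
    0.850: [0.0325, 0.0362, 0.0407],
    0.875: [0.0480, 0.0540],
    0.900: [0.0673, 0.0690, 0.0710],
}


def relative_spread(values):
    """Return (max - min) / min for a list of positive numbers."""
    return (max(values) - min(values)) / min(values)


spreads = {mach: relative_spread(cd) for mach, cd in wave_drag.items()}
for mach, s in spreads.items():
    print(f"M_inf = {mach:.3f}: spread = {100 * s:.1f}% of the smallest value")
```

Under this grouping the spread is between roughly 5 and 25 percent of the smallest drag value, with the largest spread at $M_\infty = 0.850$.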
Figure 8 Multiple Solutions by Engquist-Osher Scheme, NACA 0012, $M_\infty = 0.85$, $\alpha = 0$
7.4.4 Engquist-Osher Scheme
In contrast to the Murman-Cole difference scheme, the Engquist-Osher difference scheme [4] is entropy satisfying as well as conservative. It is therefore of interest to examine whether the Engquist-Osher scheme yields multiple solutions. In this investigation, the TSTP equation is replaced by the TSP equation, for which the Engquist-Osher scheme is established. For the NACA 0012 airfoil at $M_\infty = 0.85$ and $\alpha = 0$, the Engquist-Osher scheme does yield multiple solutions when the start field of the relaxation computation is varied, as shown in Table 4. One solution is symmetric and the other is unsymmetric. Here the grid is 45 x 39, and the relaxation method is AF-2 with $\Omega = 0.8$, $\alpha_L = 0.5$, $\alpha_H = 50$ and $K = 5$. The corresponding pressure distributions are given in Fig. 8. They differ mainly in the shock wave location and strength.
Table 4 Multiple Solutions by Engquist-Osher Scheme

Start Field                                                     Solution
$\phi = 0$                                                      Symmetric Solution
[start field illegible in the source]                           Unsymmetric Solution

Table 5 Multiple Solutions for Wing

Start Field                                                     Solution
$\phi = 0$                                                      Symmetric Solution A
$\phi|_{M_\infty = .87,\ \alpha = 0}$                           Symmetric Solution B
$\phi|_{M_\infty = .85,\ \alpha = 2^\circ,\ [illegible]}$       Unsymmetric Solution

7.4.5 Three-Dimensional Wing
To investigate the existence of multiple solutions of the TSTP equation for transonic flow over a three-dimensional wing, an untwisted rectangular wing of aspect ratio $AR = 4$ with the NACA 0012 airfoil section at $\alpha = 0$ is considered. The Murman-Cole nonconservative scheme and the plane AF-2 relaxation method are employed. The numerical experiments show that multiple solutions do exist in the range $0.84 < M_\infty < 0.90$. As an example, Table 5 and Fig. 9 give the results for the rectangular wing at $M_\infty = 0.85$ and $\alpha = 0$. In Fig. 9, SP is the semi-span of the wing. There are four different solutions: two symmetric solutions, A and B, and two unsymmetric solutions. The two unsymmetric solutions are mirror images of each other. It is found that the upper and lower surface pressure coefficients of the unsymmetric solution coincide with those of the two symmetric solutions, and that the multiple solutions differ mainly in the shock wave location and strength. The grid used is 45 x 39 x 16. For AF-2, $\Omega = 0.8$, $\alpha_L = 0.5$, $\alpha_H = 65$ and $K = 5$.
7.5 Conclusions
The transonic small transverse perturbation equation is applicable to flows in which the longitudinal perturbation velocity component is not small. It yields the exact critical pressure coefficient and the approximate shock wave relation. In the successive line over-relaxation computation of the internal and external flow around inlets, the techniques of longitudinal and lateral alternating direction relaxation and zonal multiple relaxation help to stabilize the calculation and accelerate the convergence. Numerical experience with non-uniform grids shows that in regions of large gradients, a grid stretching ratio over 2 may induce oscillations. Multiple solutions of the discrete small transverse perturbation potential equation exist not only for steady two-dimensional flow over an airfoil but also for steady three-dimensional flow over a wing. Extensive numerical experiments, including different start fields of relaxation, different relaxation methods and parameters, and grid refinement, have confirmed the persistence of these solutions. The multiple solutions differ mainly in the shock wave location and strength. The three difference schemes, namely the Murman-Cole conservative and nonconservative schemes and the Engquist-Osher scheme, all yield multiple solutions. Hence it appears likely that the multiple solutions are intrinsic to the differential equation of the potential flow and are not due to the numerical method.

Figure 9 Multiple Solutions for Untwisted Rectangular Wing, Aspect Ratio 4, Airfoil NACA 0012, $M_\infty = 0.85$, $\alpha = 0$
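The remark on grid stretching suggests a simple check before a computation is started. The sketch below is illustrative only: it takes the stretching ratio to mean the ratio of neighbouring cell sizes along one grid line (an assumption, since the chapter does not define it), and the helper names are hypothetical.

```python
def max_stretching_ratio(coords):
    """Largest ratio between neighbouring cell sizes of a 1-D grid line."""
    spacings = [b - a for a, b in zip(coords, coords[1:])]
    if len(spacings) < 2 or any(h <= 0 for h in spacings):
        raise ValueError("need at least three strictly increasing coordinates")
    return max(max(h1 / h0, h0 / h1) for h0, h1 in zip(spacings, spacings[1:]))


def geometric_grid(x0, first_spacing, ratio, n_cells):
    """Grid line whose cell size grows by a constant factor from cell to cell."""
    coords = [x0]
    h = first_spacing
    for _ in range(n_cells):
        coords.append(coords[-1] + h)
        h *= ratio
    return coords


mild = max_stretching_ratio(geometric_grid(0.0, 0.01, 1.5, 10))
severe = max_stretching_ratio(geometric_grid(0.0, 0.01, 2.5, 10))
for name, value in (("mild", mild), ("severe", severe)):
    flag = "may induce oscillations" if value > 2.0 else "within the quoted limit"
    print(f"{name}: ratio = {value:.2f} ({flag})")
```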
REFERENCES
1. Ballhaus, W.F., Jameson, A. and Albert, J., Implicit Approximate-Factorization Schemes for Steady Transonic Flow Problems, AIAA Journal 16, June 1978, pp. 573-579.
2. Chen, T.M., Start Field Problem in Finite Difference Computation of Two-Dimensional Steady Transonic Flow, Acta Aerodynamica Sinica 1, No. 3, 1982, pp. 67-73.
3. Dailey, C.L. and Wood, F.C., Computation Curves for Compressible Fluid Problems, John Wiley & Sons, New York, 1949.
4. Engquist, B. and Osher, S., Stable and Entropy Satisfying Approximations for Transonic Flow Calculations, Mathematics of Computation 34, 1980, pp. 45-75.
5. Jameson, A., Iterative Solution of Transonic Flow over Airfoils and Wings, Including Flow at Mach 1, Communications on Pure and Applied Mathematics 27, 1974, pp. 283-309.
6. von Karman, Th., Similarity Law of Transonic Flow, Journal of Mathematics and Physics 26, 1947, pp. 182-190.
7. Luo, S.J., Zheng, Y.W., Qian, H. and Wang, D.Q., Finite Difference Computation for Transonic Steady Potential Flows, Computer Methods in Applied Mechanics and Engineering 21, July 1981, pp. 129-138.
8. Murman, E.M. and Cole, J.D., Calculation of Plane Steady Transonic Flows, AIAA Journal 9, Jan. 1971, pp. 114-121.
9. Murman, E.M. and Krupp, J.A., Solution of Transonic Potential Equation Using a Mixed Finite Difference System, Proc. Second Int'l. Conf. on Num. Meth. in Fluid Dynamics, Lecture Notes in Physics 8, 1971.
10. Olstad, W.B., Transonic Wind-Tunnel Investigation of the Effects of Lip Bluntness and Shape on the Drag and Pressure Recovery of a Normal-Shock Nose Inlet in a Body of Revolution, NACA RM L56C28, 1956.
11. Rizzi, A.W. and Schmidt, W., Study of Pitot-Type Supersonic Inlet-Flow Field Using the Finite-Volume Approach, AIAA Paper 78-1115, 1978.
12. Steinhoff, J. and Jameson, A., Multiple Solutions of the Transonic Potential Flow Equation, AIAA Journal 20, Nov. 1982, pp. 1521-1533.
13. Woollett, R.R., Meleason, E.T. and Choby, D.A., Transonic Off-Design Drag and Performance of Three Mixed-Compression Axisymmetric Inlets, NASA TM X-3215, 1975.
8 Excitation of Absolutely Unstable Disturbances in Boundary-Layer Flows
Oleg S. Ryzhov¹ & E. D. Terent'ev²
Abstract A review is provided to show inconsistency between current instability concepts and experimental findings on the onset of transition in the boundary layer on a swept wing. An extended version of the triple-deck theory proves to be an appropriate means to settle the matter on the assumption that the Reynolds number takes on sufficiently large values. Upstream advancing wave packets lead to absolute instability and earlier breakdown of any three-dimensional boundary layer with crossflow. Theoretical arguments put computed results on a firm footing.
8.1 Introduction
The basic properties of transition to turbulence in boundary layers of different kinds may be summarized as follows, if we confine ourselves to wind-tunnel tests conducted under carefully controlled conditions with fairly weak artificial disturbances. The linear stage of the TS wave amplification in the two-dimensional boundary layer on a flat plate is mild and extends over a few hundred wavelengths downstream of a periodically vibrating ribbon. The TS stage shortens to several tens of wavelengths with an external source operating in the pulse mode. In general, introducing artificial disturbances into the Blasius flow dramatically changes the location of transition, which happens suddenly within a very short distance. At the onset of transition strong nonlinear effects come into play almost simultaneously with three-dimensionality evolving in the velocity field. The process as a whole is driven by convectively unstable disturbances sweeping downstream of the region where they were excited. By contrast, transition in the three-dimensional boundary layer on a swept flat plate (with a displacement body on top) is dominated by strong nonlinear interactions nearly from the chordwise position where disturbances have grown to experimentally detectable size. They can originate, depending on the test environment, in the form of stationary streamwise vortices and coexisting travelling waves. The former of these modes derives from crossflow inherent only in the three-dimensional boundary layer. The latter mode is believed to be governed by a mechanism of the TS type typical of the two-dimensional boundary layer. In addition, the flow over a swept wing is susceptible to leading-edge instability and contamination as well as centrifugal instability, which are beyond the scope of the present discussion. Most experimentalists agree that the measured frequencies and phase velocities of travelling waves and the wavelengths of stationary crossflow vortices are fairly well predicted by hydrodynamic stability theory. However, they have not reached a consensus on the group velocities and amplitude amplification rates (see for example Bippes, Müller & Wagner [1]; Kachanov [3]; Reed, Saric & Arnal [9]). Analogous open questions characterize the work on transition in the closely related boundary layer on a rotating disk (Wilkinson & Malik [13]; Corke & Knasiak [2]).

1 Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, New York 12180-3590.
2 Computing Center, Russian Academy of Sciences, 40 Vavilov Street, 117333 Moscow, Russian Federation.
Frontiers of Computational Fluid Dynamics 1998, Editors: David A. Caughey & Mohamed M. Hafez, ©1998 World Scientific
However, there is a point common to most of the experimental data available. Whatever perturbing source is used to generate pulsations, the critical Reynolds number consistently shows only a small scatter around an average value of 513 at the onset of transition. In other words, the final state of flow is found to be insensitive to the exact form of the initial natural or artificial disturbances. This observation led Lingwood [5] to argue that transition to turbulence occurs due to absolute, rather than convective, instability intrinsic to the rotating-disk boundary layer. In a subsequent paper Lingwood [6] substantiated her theoretical findings by direct measurements. The analysis set forth below is aimed at providing a rigorous proof that absolutely unstable oscillations can emerge in any three-dimensional boundary layer with crossflow, provided that the Reynolds number is large enough. This assumption makes it possible to take advantage of an asymptotic approach in the framework of the extended triple-deck scheme. The computation reveals the excess-pressure distributions downstream as well as upstream of a perturbing source in the form of highly modulated signals. Crossflow vortices are shown to give rise to an additional eigenmode bringing new physics into the overall process of transition. Wave packets capable of moving against the oncoming stream are at the heart of absolute instability triggered by the new eigenmode.
8.2 Extended triple-deck model
The base motion is supposed to be a general steady three-dimensional boundary layer with the Mach number $M_\infty < 1$ at the upper reaches. The triple-deck serves to describe lower-branch instabilities in the form of self-excited TS waves and crossflow vortices whose coupling results in a new eigenmode of upstream advancing disturbances. We choose the special system of units adopted in this asymptotic approach to scale and normalize both the independent variables and the desired functions (Stewartson [12]; Messiter [7]). Let $t$ be the time and $x, y, z$ designate Cartesian coordinates, where the $x$-axis is aligned with the direction of the local external stream, $y$ stands for the normal-to-wall distance, and $z$ points in the local spanwise direction. The corresponding velocities are denoted by $u, v, w$; the excess pressure is symbolized as $p$. In the thin viscous near-wall sublayer the initial Navier-Stokes equations reduce to a simpler set of Prandtl equations
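$$
\begin{aligned}
&\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0,\\
&\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} + v\frac{\partial u}{\partial y} + w\frac{\partial u}{\partial z} = -\frac{\partial p}{\partial x} + \frac{\partial^2 u}{\partial y^2},\\
&\frac{\partial w}{\partial t} + u\frac{\partial w}{\partial x} + v\frac{\partial w}{\partial y} + w\frac{\partial w}{\partial z} = -\frac{\partial p}{\partial z} + \frac{\partial^2 w}{\partial y^2}.
\end{aligned}
\tag{8.1}
$$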
Since the normal pressure gradient $\partial p/\partial y = 0$, we have $p = p(t, x, z)$. The interaction law relating the excess pressure to the instantaneous displacement thickness $-A(t, x, z)$ is presented in the form

[Equation (8.2) is illegible in the source.] (8.2)
in order to keep the Cauchy problem well posed in a linear approximation (Ryzhov & Terent'ev 1997). A small parameter $\varepsilon$ in (8.2) is proportional to $R^{-1/8}$, where $R$ signifies the reference Reynolds number, and the spanwise momentum thickness

$$
D = \int_0^\infty R_0(y_2)\, U_{z0}^2(y_2)\, dy_2
$$

expressed through the density distribution $R_0$ and the crossflow velocity $U_{z0}$ in most of the initial boundary layer. The triple-deck scheme as a whole remains intact within the framework of the extended approach (Ryzhov & Terent'ev [10]). Accordingly, at the upper edge of the near-wall sublayer we have

$$
u - \tau_x y \to \tau_x A, \qquad w - \tau_z y \to \tau_z A \qquad \text{as } y \to \infty.
\tag{8.3}
$$
Here $\tau_x$ and $\tau_z$ denote the normalized skin friction components, hence $\tau_x^2 + \tau_z^2 = 1$. A vibrating ribbon is as a rule used in wind-tunnel tests for introducing artificial disturbances into a boundary layer (Saric & Yeates [11]; Radeztsky, Reibert, Saric & Takagi [8]). To provide a pertinent mathematical model for these experimental set-ups we start from homogeneous initial data

$$
u = w = p = A = 0 \quad \text{at } t = 0
\tag{8.4}
$$

and specify the shape of a time-dependent obstacle by means of

$$
y = y_w =
\begin{cases}
\delta \sin(\omega_0 t)\, f(x) \cos(m_0 z), & t > 0,\\
0, & t < 0,
\end{cases}
\tag{8.5}
$$

with a function $f$ being effectively non-zero within a finite interval of the streamwise coordinate. The boundary conditions at the moving surface become

$$
u = w = 0, \qquad v = \frac{\partial y_w}{\partial t} \qquad \text{at } y = y_w,
\tag{8.6}
$$

where $y_w$ is defined in (8.5). The receptivity problem (8.4), (8.5) and (8.6) allows us to trace the birth and development of various types of disturbances excited in a three-dimensional boundary layer by the switching-on and subsequent monochromatic vibrations of a ribbon stretching in the locally spanwise direction of crossflow. However, we analyze below the wave systems radiated during the initial pulsed motion of the ribbon.
8.3 Linear analysis
With $\delta \to 0$ we arrive at a formulation typical of the receptivity process, where the amplitude of a perturbing agency is assumed to be small. Accordingly, we put

$$
(u - \tau_x y,\ v,\ w - \tau_z y,\ p,\ A) = \delta\, \mathrm{Re}\left[(\tau_x u_c,\ v_c,\ \tau_z w_c,\ p_c,\ A_c)\, e^{i m_0 z}\right]
\tag{8.7}
$$

and simplify the Prandtl equations (8.1) as well as the boundary conditions (8.3), (8.6) and initial data (8.4). The desired complex functions $u_c, v_c, w_c, p_c, A_c$ are transformed into the Laplace integral in $t$ and the Fourier integral in $x$ by means of

$$
\left[\bar u_c(\omega, k, y),\ \bar v_c(\omega, k, y),\ \bar w_c(\omega, k, y),\ \bar p_c(\omega, k),\ \bar A_c(\omega, k)\right]
= \int_{-\infty}^{\infty} dx \int_0^{\infty} e^{-(\omega t + i k x)} \left[u_c(t, x, y),\ v_c(t, x, y),\ w_c(t, x, y),\ p_c(t, x),\ A_c(t, x)\right] dt.
\tag{8.8}
$$
Substitution of (8.8) into the system of linearized Prandtl equations results in a set of homogeneous ordinary differential equations. To further simplify the analysis let us define a reduced wavenumber $K = k\tau_x + m_0\tau_z$, a new independent variable $Y = \Omega + i^{1/3} K^{1/3} y$, $\Omega = i^{-2/3}\omega K^{-2/3}$, and a function $F = k\tau_x \bar u_c + m_0\tau_z \bar w_c$ (Ryzhov & Terent'ev [10]). It satisfies the Airy equation

$$
\frac{d^3 F}{dY^3} - Y \frac{dF}{dY} = 0
\tag{8.9}
$$

for the first derivative $dF/dY$. The boundary conditions

$$
F = -\frac{\omega_0}{\omega^2 + \omega_0^2}\, K \bar f, \qquad
\frac{d^2 F}{dY^2} = i^{1/3} K^{-2/3} \left(k^2 + m_0^2\right) \bar p_c \qquad \text{at } Y = \Omega
\tag{8.10}
$$

for (8.9) are derivable from the linearized version of (8.5), (8.6) and an inhomogeneous second-order equation which determines $F$ and is omitted for brevity. Here $\bar f(k)$ is the Fourier transform of the vibrator shape $f(x)$. The limiting condition

$$
F \to K \bar A_c \qquad \text{as } Y \to \infty
\tag{8.11}
$$

comes from (8.3). The interaction law (8.2) turns into the simple relation

$$
\bar p_c = \left\{ \frac{k^2}{\left[(1 - M_\infty^2) k^2 + m_0^2\right]^{1/2}} + \varepsilon m_0^2 \right\} \bar A_c
\tag{8.12}
$$

to express $\bar p_c$ in terms of $\bar A_c$, depending on the Reynolds number through $\varepsilon \sim R^{-1/8}$. A solution to (8.9) subject to (8.10), (8.11), (8.12) reads
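$$
F = -\frac{\omega_0}{\omega^2 + \omega_0^2}\, K \bar f \left\{ 1 + \frac{i^{1/3} \left(k^2 + m_0^2\right) \left\{ k^2 \left[(1 - M_\infty^2) k^2 + m_0^2\right]^{-1/2} + \varepsilon m_0^2 \right\}}{K^{5/3}\, I(\Omega) \left[\Phi(\Omega) - Q(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)\right]} \int_\Omega^Y \mathrm{Ai}(Y)\, dY \right\}
$$

with the consequence that the complex pressure $p_c$ as defined in (8.7) is given by the following inverse Laplace-Fourier transform

$$
p_c = -\frac{1}{4\pi^2 i} \int_{-\infty}^{\infty} dk\, e^{ikx} \bar f(k) \left\{ k^2 \left[(1 - M_\infty^2) k^2 + m_0^2\right]^{-1/2} + \varepsilon m_0^2 \right\}
\times \int_{l - i\infty}^{l + i\infty} \frac{\omega_0\, e^{\omega t}}{\omega^2 + \omega_0^2}\, \frac{\Phi(\Omega)}{\Phi(\Omega) - Q}\, d\omega.
\tag{8.13}
$$

Here $\mathrm{Ai}(Y)$ designates the Airy function, $\Phi(\Omega)$ and $I(\Omega)$ are expressed through its first derivative and improper integral

$$
\Phi(\Omega) = \frac{d\mathrm{Ai}(\Omega)}{dY} \left[I(\Omega)\right]^{-1}, \qquad
I(\Omega) = \int_\Omega^\infty \mathrm{Ai}(Y)\, dY,
\tag{8.14}
$$

and the quantity $Q(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)$ stands for

$$
Q = i^{1/3} \left(k^2 + m_0^2\right) K^{-5/3} \left\{ k^2 \left[(1 - M_\infty^2) k^2 + m_0^2\right]^{-1/2} + \varepsilon m_0^2 \right\}.
\tag{8.15}
$$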
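Insofar as a continuous part is missing from the $\omega$-spectrum we may expand the inverse Laplace transform in (8.13) into a series in residues of the integrand at its poles. If we confine ourselves to time-periodic oscillations $p_{c0}$ and rapidly growing unstable disturbances $p_{c1}$, the series for the inverse Laplace transform reduces to three terms only, leading to an explicit representation $p_c = p_{c0} + p_{c1}$ where

$$
p_{c0} = -\frac{1}{4\pi i} \int_{-\infty}^{\infty} dk\, e^{ikx} \bar f(k) \left\{ k^2 \left[(1 - M_\infty^2) k^2 + m_0^2\right]^{-1/2} + \varepsilon m_0^2 \right\}
\times \left[ \frac{\Phi(\Omega_0)}{\Phi(\Omega_0) - Q(k, m_0)}\, e^{i\omega_0 t} - \frac{\Phi(-\Omega_0)}{\Phi(-\Omega_0) - Q(k, m_0)}\, e^{-i\omega_0 t} \right],
\tag{8.16}
$$

[Equation (8.17), the residue contribution $p_{c1}$ from the pole $\omega = \omega_1(k)$, is illegible in the source.] (8.17)

Here $\Omega_0 = i^{1/3}\omega_0 K^{-2/3}$, $\omega_1 = \omega_1(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z) = i^{2/3} K^{2/3}\, \Omega_1(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)$ and $\Omega_1 = \Omega_1(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)$ is the first root of the dispersion relation

$$
\Phi(\Omega) = Q(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z).
\tag{8.18}
$$

The four parameters $M_\infty, \varepsilon; \tau_x, \tau_z$ are omitted for brevity when indicating the arguments of $Q$ in (8.16), (8.17).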
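The two sides of the dispersion relation (8.18) can be evaluated with a few lines of code, which may help when locating its roots numerically. The sketch below is not the authors' procedure. It assumes that $\Phi(\Omega)$ is the derivative of the Airy function divided by the integral of $\mathrm{Ai}$ from $\Omega$ to infinity, that $Q = i^{1/3}(k^2 + m_0^2)K^{-5/3}\{k^2[(1 - M_\infty^2)k^2 + m_0^2]^{-1/2} + \varepsilon m_0^2\}$ with $K = k\tau_x + m_0\tau_z$, and that principal branches are taken for the fractional powers. The Airy function is summed from its Maclaurin series, which is adequate only for moderate $|\Omega|$; all function names are hypothetical.

```python
import cmath
import math

AI0 = 0.355028053887817    # Ai(0)
AIP0 = -0.258819403792807  # Ai'(0)


def airy_series(z, terms=60):
    """Return (Ai(z), Ai'(z), integral of Ai from 0 to z) from Maclaurin series.

    Intended for moderate |z| (roughly |z| < 6 in double precision).
    """
    z = complex(z)
    z2, z3 = z * z, z * z * z
    a, b = 1.0, 1.0        # coefficients of z**(3k) and z**(3k+1)
    p = 1.0 + 0.0j         # current power z**(3k)
    f = g = df = dg = int_f = int_g = 0j
    for k in range(terms):
        f += a * p
        g += b * p * z
        dg += b * (3 * k + 1) * p
        int_f += a * p * z / (3 * k + 1)
        int_g += b * p * z2 / (3 * k + 2)
        a_next = a / ((3 * k + 2) * (3 * k + 3))
        df += a_next * (3 * k + 3) * p * z2  # derivative of the next f-term
        a = a_next
        b /= (3 * k + 3) * (3 * k + 4)
        p *= z3
    return AI0 * f + AIP0 * g, AI0 * df + AIP0 * dg, AI0 * int_f + AIP0 * int_g


def phi(omega_scaled):
    """Phi = Ai'(Omega) / I(Omega), using I(Omega) = 1/3 - integral_0^Omega Ai."""
    _, dai, int_ai = airy_series(omega_scaled)
    return dai / (1.0 / 3.0 - int_ai)


def q_rhs(k, m0, mach, eps, tau_x, tau_z):
    """Right-hand side Q of the dispersion relation (assumed form, principal branches)."""
    big_k = complex(k * tau_x + m0 * tau_z)
    bracket = k**2 / cmath.sqrt((1.0 - mach**2) * k**2 + m0**2) + eps * m0**2
    return cmath.exp(1j * cmath.pi / 6) * (k**2 + m0**2) * big_k ** (-5.0 / 3.0) * bracket


# Mismatch Phi - Q at a trial point; a root finder would drive this to zero.
q_value = q_rhs(1.0, 4.0, 0.2, 0.01, 2.0 * math.sqrt(2.0) / 3.0, 1.0 / 3.0)
mismatch = phi(0.5 + 0.5j) - q_value
print(abs(mismatch))
```

A Newton or secant iteration on `phi(Omega) - Q` in the complex plane would be a natural next step; for large $|\Omega|$ the series should be replaced by an asymptotic or library evaluation of the Airy function.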
8.4 Computed results
Let us put $M_\infty = 0.2$ to deal with a typical low Mach-number regime and choose $\varepsilon = 0.01$, $\tau_x = 2\sqrt{2}/3$, $\tau_z = 1/3$ as characteristic values of a general three-dimensional boundary layer. The ribbon in (8.5) is fixed by $\omega_0 = 3$ and $f = \pi^{-1/2}\exp(-x^2)$, with the consequence that $\bar f = \exp(-k^2/4)$. We concentrate below on exponentially growing disturbances excited during the initial pulse-mode switching-on of the external source and leave aside, for this reason, the time-periodic oscillations ensuing from (8.16). In order to isolate the wave packets from the influence of the monochromatic motion of the ribbon, they are shown at an instant $t = 10$, which is not excessively large. Figure 1 demonstrates the real and imaginary parts of the complex pressure distribution against the streamwise coordinate for a value $m_0 = 4$ of the spanwise wavenumber. A highly modulated wave packet is found to sweep downstream at the group velocity $V_g^* = 4.2$; the swing of oscillations in the two central cycles, located at a distance $x = 40$ to $44$, turns out to be $0.5 \times 10^4$ times as large as the amplitude of the ribbon. Disturbance amplification of this type is at the heart of convective instability giving rise to conventional paths to transition in two-dimensional shear flows. However, the rapid growth of wave packets makes them dissimilar to the much more slowly developing periodic wavetrains that are as a rule exploited in wind-tunnel tests.

Figure 1 Ribbon-induced pressure distributions against the streamwise coordinate in the downstream sweeping wave packet for $m_0 = 4$.

Contrary to popular belief, which leans upon long-standing experience from observations of two-dimensional boundary layers, the computation reveals a wave packet of a completely different kind which is advancing in front of the ribbon against the oncoming stream. The group velocity $V_g^*$ can be roughly estimated to fall within the range $-0.22$ to $-0.28$. The pressure variations in the modulated signal are presented in Fig. 2. The pulsation swing in the central cycles ranges up to 0.5 in both the real and imaginary parts of the complex pressure. In comparison with the oscillation size in the downstream sweeping wave packet (Fig. 1) this magnitude is almost $10^4$ times smaller. Therefore an enlarged scale is required to make the upstream moving signal fully recognizable.

Figure 2 Ribbon-induced pressure distributions against the streamwise coordinate in the upstream advancing wave packet for $m_0 = 4$.

Despite its much smaller magnitude, the wave packet propagating against the oncoming stream can play an extremely important part in determining the location where the initial laminar flow breaks down. In line with this conjecture, nonlinear interactions in the three-dimensional boundary layer on a swept wing are found in wind-tunnel tests to become dominant nearly from the chordwise position where the disturbances have grown to experimentally detectable size (Bippes et al. [1]; Radeztsky et al. [8]; Reed et al. [9]). Figure 3 portrays the oscillation pattern for a larger value $m_0 = 9$ of the spanwise wavenumber. Here the disturbance appears in the form of a single wave packet which moves as a whole downstream at the same group velocity $V_g^* = 4.2$ as in the previous case with $m_0 = 4$. However, long-wavelength cycles at the tail-end of the highly modulated signal penetrate the region far upstream of the ribbon. Both the upstream advancing wave packet in Fig. 2 and the long-wavelength cycles in Fig. 3 grow exponentially in time in the course of their development, triggering absolute instability in a three-dimensional boundary layer of an arbitrary type and bringing new physics into the breakdown process. A decisive role in provoking this previously unknown path to transition is clearly attributable to crossflow, an integral part of the viscous fluid motion over a swept wing.
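Some of the numbers quoted in this section can be cross-checked directly. The sketch below assumes the Fourier transform convention $\bar f(k) = \int f(x)\,e^{-ikx}\,dx$ and verifies numerically that the Gaussian ribbon shape $f = \pi^{-1/2}\exp(-x^2)$ has the transform $\exp(-k^2/4)$. It also confirms that the chosen skin friction components are normalized, and that a packet travelling at the quoted downstream group velocity of 4.2 would sit at $x = 42$ when $t = 10$, inside the interval 40 to 44 mentioned above. The function names and the quadrature settings are illustrative choices.

```python
import math


def ribbon_shape(x):
    """Ribbon shape f(x) = pi**(-1/2) * exp(-x**2) used in the computations."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def fourier_transform(func, k, half_width=8.0, n=4000):
    """Trapezoidal estimate of the integral of func(x) * exp(-i k x) over [-L, L]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * func(x) * complex(math.cos(k * x), -math.sin(k * x))
    return total * h


transform_at_2 = fourier_transform(ribbon_shape, 2.0)
print(transform_at_2.real, math.exp(-1.0))   # numerical value vs exp(-k**2 / 4)

tau_x, tau_z = 2.0 * math.sqrt(2.0) / 3.0, 1.0 / 3.0
print(tau_x**2 + tau_z**2)                   # normalization of skin friction

packet_centre = 4.2 * 10.0                   # group velocity times t = 10
print(packet_centre)
```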
8.5 Theoretical arguments
The results computed above may be considered as additional data to the experimental findings set forth in the Introduction. A question arises in this regard of whether it is possible to put the concept of absolute instability on a firm footing using the same asymptotic scheme.
Figure 3 Global ribbon-induced pressure distributions against the streamwise coordinate for a large value mo = 9.
To settle the matter let us analyse the properties of the dispersion relation (8.18) with the function $\Phi(\Omega)$ and the right-hand side $Q(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)$ defined in (8.14) and (8.15), respectively. We confine ourselves to the first root $\Omega_1 = \Omega_1(k, m_0; M_\infty, \varepsilon; \tau_x, \tau_z)$ giving rise to unstable disturbances. The behaviour of the first dispersion curve in the complex frequency plane associated with this root is displayed in Fig. 4 for $m_0 = 4$. As distinct from the thoroughly studied case of two-dimensional oscillations with $m_0 = 0$, the curve falls apart into separate branches located in the upper and lower half-planes. In turn, each of the branches consists of two lobes. Both left lobes are asymptotically close at infinity to the imaginary axis; an asymptote of both right lobes passes at a distance of [expression illegible in the source] from this axis. Note that a thin closed loop elongated in the direction of the real axis forms on the lower branch to smoothly connect its right and left lobes. The global and local maxima of $\mathrm{Re}(\omega_1)$ with a local minimum in between are the salient feature of the right lobe of the lower branch. A much smaller positive mound of $\mathrm{Re}(\omega_1)$ derives from the tip of the thin loop far away from the imaginary axis; this mound belongs to the left lobe. On theoretical grounds, the excitation of the wave packets illustrated in Figs. 1 and 2 directly relates to the maxima attained by the real part of the complex frequency (see for instance Landau & Lifshitz [4]). We emphasize also that $d\,\mathrm{Im}(\omega_1)/dk < 0$ on the right lobes, whereas the same derivative $d\,\mathrm{Im}(\omega_1)/dk > 0$ on the left lobes of both branches making up the first dispersion curve. The global maximum of $\mathrm{Re}(\omega_1)$ on the right lobe of the lower branch is at the heart of the wave-packet generation downstream of the ribbon. A contribution from the neighbouring local maximum proves to be insufficient to induce a separate clear cut subpacket (it becomes discernible when using a
that leads to the excitation of weak wave packets capable of advancing against the oncoming stream. The group velocity $V_g^* = -d\,\mathrm{Im}(\omega_1)/dk$ theoretically predicted for the wave packet in Fig. 2 is about $-0.25$; as we have seen, the computation gives a broader estimate centered around the same value. Further still, the theoretical reference length of oscillation cycles within the wave packet also does not appreciably deviate from the computed results. When the spanwise wavenumber rises to a value $m_0 = 9$ the behaviour of the first dispersion curve in the complex frequency plane undergoes a significant change, illustrated in Fig. 5. The global and local maxima of $\mathrm{Re}(\omega_1)$ on the right lobe of the lower branch merge together to form a single maximum. The thin closed loop connecting the right lobe with the left one disappears from the shape of the lower branch. This modification in the run of the dispersion curve provides an explanation for the indivisible disturbance pattern in Fig. 3. Owing to the fact that the only maximum of $\mathrm{Re}(\omega_1)$ occurs on the right lobe and there is no positive mound of $\mathrm{Re}(\omega_1)$ on the left lobe, the wave packet in Fig. 3 moves as a whole downstream at the group velocity $V_g^* = -d\,\mathrm{Im}(\omega_1)/dk$, which closely agrees with the corresponding computed value of 4.2. A more than thousand-fold decline in the size of the disturbance comes about in response to an increase in the spanwise wavenumber from $m_0 = 4$ to $m_0 = 9$. This dramatic fall in the wave-packet amplitude is attributable to the fact that the magnitude of the global maximum of $\mathrm{Re}(\omega_1)$ on the right lobe of the first dispersion curve for $m_0 = 4$ is almost twice as large as the magnitude of the single maximum featuring in the first dispersion curve with $m_0 = 9$. The theoretical reference length monotonically builds up from the front part of the wave system towards its tail. The prediction correlates well with the computation in Fig. 3, where the oscillation cycles bringing up the tail of the modulated signal are seen to penetrate the region upstream of the ribbon. They are driven by a contribution from monotonic variations of $\mathrm{Re}(\omega_1)$ along the left lobes of both the lower and the upper branches of the first dispersion curve. Thus, an alternative mechanism controls the development of pulsations over and ahead of an external source.
8.6 Conclusion
The type of disturbances emitted upstream by the ribbon strongly depends on the spanwise wavenumber defining a specific viscous crossflow eigenmode. With moderate values ($m_0 = 4$), they emerge in the form of a separate wave packet advancing against the oncoming stream. Large values ($m_0 = 9$) give rise to oscillation cycles at the tail-end of an indivisible signal propagating as a whole downstream. In both cases the upstream disturbances grow exponentially in time. Their amplification is responsible for absolute instability of a three-dimensional boundary layer that brings new physics into the process of transition to turbulence.

Figure 5 The first dispersion curve in the complex frequency plane for a large value $m_0 = 9$. The thin, closed loop disappears from the lower branch.
Acknowledgements
It is a great honour for both authors to present this work as a token of their high esteem of Prof. Earll Murman's scientific activities. The authors would also like to express their gratitude to Prof. Julian D. Cole for many discussions and helpful comments. Effort sponsored by the Air Force Office of Scientific Research, Air Force Materiel Command, USAF, under grant number F49620-97-1-0141. The U.S. Government is authorized to reproduce and distribute reprints for governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the Air Force Office of Scientific Research or the U.S. Government.
REFERENCES
1. Bippes, H., Müller, B. & Wagner, M., Measurements and stability calculations of the disturbance growth in an unstable three-dimensional boundary layer, Phys. Fluids A, Fluid Dyn. 3, 1991, pp. 2371-2377.
2. Corke, T. C. & Knasiak, K. F., Cross-flow instability with periodic distributed roughness, Proc. IUTAM Symp. on Nonlinear Instability and Transition in Three-Dimensional Boundary Layers (ed. P. W. Duck & P. Hall), Kluwer Academic, 1996, pp. 267-282.
3. Kachanov, Yu. S., Experimental studies of three-dimensional instability of boundary layers, AIAA Paper 96-1978, 1996.
4. Landau, L. D. & Lifshitz, E. M., Fluid Mechanics, Pergamon, 1959.
5. Lingwood, R. J., Absolute instability of the boundary layer on a rotating disk, J. Fluid Mech. 299, 1995, pp. 17-33.
6. Lingwood, R. J., An experimental study of absolute instability of the rotating-disk boundary layer, J. Fluid Mech. 314, 1996, pp. 373-405.
7. Messiter, A. F., Boundary-layer flow near the trailing edge of a flat plate, SIAM J. Appl. Math. 18, 1970, pp. 241-257.
8. Radeztsky, R. H., Jr., Reibert, M. S., Saric, W. S. & Takagi, S., Effects of micron-sized roughness on transition in swept-wing flows, AIAA Paper 93-0076, 1993.
9. Reed, H. L., Saric, W. S. & Arnal, D., Linear stability theory applied to boundary layers, Ann. Rev. Fluid Mech. 28, 1996, pp. 389-428.
10. Ryzhov, O. S. & Terent'ev, E. D., A composite asymptotic model for the wave motion in a steady three-dimensional subsonic boundary layer, J. Fluid Mech. 337, 1997, pp. 103-128.
11. Saric, W. S. & Yeates, L. G., Experiments on the stability of crossflow vortices in swept-wing flow, AIAA Paper 85-0493, 1985.
12. Stewartson, K., On the flow near the trailing edge of a flat plate. II, Mathematika 16, 1969, pp. 106-121.
13. Wilkinson, S. P. & Malik, M. R., Stability experiments in the flow over a rotating disk, AIAA J. 23, 1985, pp. 588-595.
9 On Adjoint Equations for Error Analysis and Optimal Grid Adaptation in CFD
Michael B. Giles¹

9.1 Introduction
One challenge facing CFD is to be able to give tight error bounds so that an engineer knows the accuracy of the computed results. Leaving aside the difficult issue of assessing the magnitude of modelling errors due to turbulence and transition prediction, this requires an accurate estimate of the errors due to the discretisation of the system of p.d.e.'s. With such knowledge, one can then hope to develop a rigorous approach to optimal grid adaptation, to produce the most accurate solution for a given computational cost, or to minimise the computational cost in achieving a given level of accuracy. At present, there is still a considerable gap between mathematical theory and engineering practice. When using smooth structured grids for smooth flow fields with no singularities, the order of accuracy can be deduced from an analysis of the truncation error. The absolute magnitude of the error for a particular grid size can be estimated from past experience with grid refinement studies on test problems. However, when one starts using grid redistribution (moving grid points) to improve the resolution of flow features, the grid is no longer smooth, and if one uses local grid refinement (adding additional grid points) the grid becomes unstructured, at least from the point of view of theoretical analysis if not from the programming perspective.

1 Rolls-Royce Reader in CFD, Oxford University Computing Laboratory, Oxford, U.K.
For unstructured grids the current practice in grid refinement remains the use of heuristic methods. Some of these are based on well-founded physical reasoning, that one needs to have good resolution of features such as shocks, boundary layers, wakes and free shear layers. However, it is entirely possible that too much computational effort is put into the resolution of these features at the expense of insufficient resolution of other parts of the flow field, such as the smooth but rapid expansion over the leading edge of an airfoil. Other methods are based on the idea of reducing the magnitude of the truncation error, but this takes no account of the magnitude of the solution error caused by that truncation error. Until recently, the rigorous mathematical approach to error analysis for unstructured grids has involved the use of the Aubin-Nitsche technique to derive error bounds for finite element approximations of model problems such as the convection-diffusion equation [1]. However, when applied to approximations of hyperbolic p.d.e.'s this usually results in error bounds using negative Sobolev norms (e.g. [2]) which have little engineering significance. Therefore, this has generally been of little help to engineers, although it has led to practical grid refinement indicators [3, 4]. Very recently, however, a promising new approach to error analysis and optimal grid refinement has been introduced by Becker & Rannacher, Süli & Giles, and Paraschivoiu, Patera and Peraire. This starts from the observation that in many cases the key quantities of engineering interest are functionals of the solution, such as the lift and drag on an airfoil. Therefore, the most relevant measure of the solution error is the absolute error in these derived quantities. This leads to a mathematical analysis involving the adjoint p.d.e. with inhomogeneous terms and boundary conditions appropriate to the particular functional.
The resulting adjoint solution defines the relationship between the error in the functional and the finite element residual error, which is the extent to which the finite element solution is not the solution of the original analytic problem. Thus, an estimate of the adjoint solution together with the local finite element residual error can be used to define an optimal grid refinement strategy to obtain the most accurate prediction of lift and drag for a given computational cost.
This line of research is still in its infancy. Drawing on theory developed for elliptic p.d.e.'s in structural analysis [5, 6, 7], Becker and Rannacher developed a posteriori error estimates for the incompressible Navier-Stokes equations [8]. In addition to similar a posteriori error estimates, Süli, Giles et al. have also developed a priori error estimates for the incompressible Navier-Stokes equations, proving an interesting superconvergence property [9]. Paraschivoiu, Patera and Peraire have also developed a slightly different analysis employing adjoint solutions to obtain upper and lower bounds for functionals for elliptic p.d.e.'s [10] with the aim of proceeding to the Navier-Stokes equations.
The aim of this paper is to explain this approach to error analysis and show
how it can be used for optimal grid adaptation. The first section outlines an a priori error analysis for finite volume discretisations of the Euler equations on both structured and unstructured grids. The error estimates can be used either to improve the computed value for the functional, or as the basis for grid adaptation through redistribution or refinement. In addition, it is shown that for unstructured grids the use of a conservative discretisation ensures that the order of accuracy of the functional is one greater than the order of the truncation error of the finite volume discretisation. The second section presents the theory for a finite element discretisation of a simple elliptic model problem, discussing the superconvergence property arising from the a priori error analysis, and the use of the a posteriori error analysis for grid adaptation. This provides an introduction to the literature on finite element error analysis; references are provided for the extension of the analysis to the convection/diffusion and incompressible Navier-Stokes equations.
9.2 Finite volume analysis
The analysis begins with the discrete equations arising from a finite volume approximation of the original fluid dynamic equations,
$$R_h(U_h) = 0.$$
Here $U_h$ is the discrete flow solution; the equations come directly from a flux balance and are not normalised by the area or volume of the computational cells. The solution error $e_h$ is defined by $U_h = U + e_h$, where $U$ is the analytic flow solution. Linearising the discrete equations gives
$$\frac{\partial R_h}{\partial U_h}\, e_h = -R_h(U),$$
where $R_h(U)$ is the vector of truncation errors obtained by substituting the analytic solution into the discrete operator. This equation describes the relationship between the truncation error, which is relatively easy to estimate, and the solution error, which is the quantity of greater interest.
If $I(U)$ is the scalar functional of interest (e.g. lift or drag) based on the analytic solution, then the error in the corresponding discrete approximation, $I_h(U_h)$, can be broken into two components,
$$I_h(U_h) - I(U) = \big(I_h(U_h) - I_h(U)\big) + \big(I_h(U) - I(U)\big).$$
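The second term is the truncation error in approximating the operator $I$. The first term is due to the error in the discrete solution $U_h$ and can be approximated as follows,
$$I_h(U_h) - I_h(U) \approx \frac{\partial I_h}{\partial U_h}\, e_h = -\frac{\partial I_h}{\partial U_h} \left(\frac{\partial R_h}{\partial U_h}\right)^{-1} R_h(U) = V^T R_h(U),$$
where the vector $V$ is the solution of the adjoint flow equations,
$$\left(\frac{\partial R_h}{\partial U_h}\right)^T V + \left(\frac{\partial I_h}{\partial U_h}\right)^T = 0.$$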
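For a linear discretisation the relation between the functional error and the truncation error holds exactly, which makes it easy to check numerically. The following is a minimal sketch and is not taken from the paper: it assumes a 1D Poisson model problem $-u'' = f$ with the manufactured solution $u = \sin \pi x$, an un-normalised flux-balance residual $R_h(U) = AU - b$, and a functional $I_h(U) = g^T U$ approximating $\int u\,dx$. The names `adjoint_error_demo`, `A`, `b` and `g` are illustrative choices only.

```python
import numpy as np

def adjoint_error_demo(n=32):
    """Compare the functional error computed directly and via the adjoint.

    Hypothetical model problem (an assumption, not the paper's Euler setting):
    -u'' = f on (0, 1), u(0) = u(1) = 0, exact solution u = sin(pi x).
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]      # interior nodes
    u_exact = np.sin(np.pi * x)
    f = np.pi**2 * np.sin(np.pi * x)

    # Flux balance per cell, not divided by the cell size: R_h(U) = A U - b.
    m = n - 1
    A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    b = h * f
    g = h * np.ones(m)                           # I_h(U) = g^T U

    U_h = np.linalg.solve(A, b)                  # discrete solution, R_h(U_h) = 0
    truncation = A @ u_exact - b                 # R_h(U): analytic solution in the discrete operator
    V = np.linalg.solve(A.T, -g)                 # adjoint: (dR/dU)^T V + (dI/dU)^T = 0

    direct = g @ (U_h - u_exact)                 # I_h(U_h) - I_h(U)
    via_adjoint = V @ truncation                 # V^T R_h(U)
    return direct, via_adjoint

direct, via_adjoint = adjoint_error_demo()
print(direct, via_adjoint)
```

The two printed numbers should agree to rounding error. In this sketch one adjoint solve replaces the need to know the solution error node by node, provided the truncation error can be estimated.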
Thus the adjoint flow solution relates the errors in quantities such as lift and drag, to the underlying truncation errors in the evaluation of finite volume cell residuals. The role of the adjoint flow solution in optimal design is now well established [11, 12, 13, 14, 15]. The fact that the same adjoint solution plays a critical role in error analysis should not be surprising. In design one is concerned with the perturbation of a functional due to changes in the geometry; in error analysis one is concerned with the perturbation of the functional due to the truncation errors.
9.2.1 Structured grids
For structured grids in which one wishes to improve the accuracy through grid redistribution, moving the grid points to better resolve regions of large flow variation, this analysis can be used to define an optimal adaptation strategy. Consider the discretisation of the Euler equations in 2D or 3D on a smooth structured grid. Taylor series expansions can be used to analyse the truncation error of the discretisation. Bearing in mind that the residuals are not normalised by the area or volume of the computational cells, the truncation error is of order $h^{p+d}$, where $h$ is the length of the cell (assumed to have a fixed aspect ratio to simplify the analysis), $p$ is the order of accuracy and $d$ is the dimension of the problem. After computing the adjoint solution, and using the flow solution to estimate the higher order derivatives in the truncation error, the overall error in the functional can be expressed as a sum of the following form
$$\sum_j h_j^{p+d}\, T_j.$$
Here the index $j$ denotes the individual locations of the flow variables (usually at nodes or cell centres) and $h_j$ is the length of the associated cell. $T_j$ is a scalar
which involves the product of the adjoint solution and higher order derivatives of the flow solution coming from the truncation error estimation. A numerical estimate of the value of these higher order derivatives can be obtained from a computed flow solution by using local least-squares approximation by a high order polynomial. An adjoint flow calculation provides the adjoint solution and hence one can obtain an accurate estimate for the quantity $T_j$ at each node.
The resulting error estimate for the functional can be used in two ways. The first is to use it as a correction to the computed value of the functional, providing a more accurate value at the cost of an adjoint flow computation. If one is interested in more than one functional then each would require an adjoint flow computation, but this might still be computationally less expensive than improving the accuracy through using a finer grid for the original flow calculation.
The second option is to use the error estimate for optimal grid adaptation through redistribution, moving grid points to better resolve flow features. The objective would be to minimise
$$\sum_j h_j^{p+d}\, |T_j|.$$
Taking $h_j^d$ to be the area ($d = 2$) or volume ($d = 3$) of the cell, this summation can be approximated by the integral
$$\int h^p\, |T|\, dV,$$
where $|T|$ is a continuous approximation to $|T_j|$. Similarly, the total number of cells is given approximately by
$$\int h^{-d}\, dV.$$
Minimising the first integral while keeping the latter fixed, using a Lagrange multiplier, leads to the requirement that $h^{p+d}\,|T|$ should be uniform. In practice, one would usually be concerned with more than one functional, such as both lift and drag. To handle this, the adaptation criterion could be amended so that the strategy is to ensure that
$$h^{p+d} \sum_m |T^{(m)}|$$
is approximately uniform, where $T^{(m)}$ are the corresponding error components for the functionals of concern. In principle, the construction of each $T^{(m)}$ requires the solution of an adjoint equation. However, in practice it may
be that these adjoint solutions can be approximated sufficiently well for adaptation purposes by simple analytic functions, based on a detailed understanding of their origin and qualitative nature [16].
This strategy may not seem very different from current adaptation practices which aim to make the truncation error uniform across the grid. The crucial difference however is the inclusion of the adjoint solution, which reflects the fact that not all truncation errors are equal in their effect on the quantities of engineering interest such as lift and drag. A good example of this is truncation errors in the wake behind an airfoil. It is not uncommon for the wake to be poorly resolved a chord or more downstream of an airfoil, due to grid generation difficulties in anticipating the trajectory of the wake. However, although the resulting truncation errors may be relatively large, the adjoint flow solution for lift and drag functionals is relatively small, reflecting the fact that these errors do not significantly affect the flow near the airfoil. Thus, adaptation procedures based solely on truncation error estimates may overresolve the wake region, while those including the influence of the adjoint solution will correctly place greater emphasis on decreasing the errors in those cells close to the airfoil which have the greatest influence on the lift and drag.
9.2.2 Unstructured grids
For unstructured grids, some further analysis is required to obtain good error estimates. Consider a typical discretisation of the 2D or 3D Euler equations using node-based variables, an edge-based data structure, a standard finite volume discretisation of the nonlinear flux terms (which can also be interpreted as a Galerkin finite element discretisation) plus the addition of characteristic smoothing. Assuming all of the cells are of bounded aspect ratio, the truncation error at an interior node $j$ can be expressed in the following way as a sum over the set of edges $E_j$ coming out of node $j$,
$$R_j = \sum_{k \in E_j} h_k^{d+1}\, S_{jk}.$$
Here $h_k$ is the length of the edge and there is an implicit assumption that all of the cells are of bounded aspect ratio so that the face associated with the edge has length/area $O(h_k^{d-1})$. $S_{jk}$ consists of derivatives of the flow solution; this result comes from the standard integration error from a linear representation of the flux on the face. Note that since the area/volume of the cell is $O(h_j^d)$, where $h_j$ is the representative length scale for the cell, the local truncation error normalised by the cell area is $O(h)$. This first order accuracy for the truncation error has led some to believe that the solution accuracy may also be first order. Indeed,
if one uses the error representation obtained above,
$$V^T R_h(U) = \sum_j V_j^T R_j,$$
then since the total number of nodes is $O(h^{-d})$, where $h$ is the average value for $h_j$ over the whole grid, it appears to follow that the error is $O(h)$. However, numerical evidence suggests that such methods are second order accurate [17]. If one considers a union of neighbouring cells covering a region whose area/volume is $O(1)$, then summing over these cells, the truncation errors for interior fluxes cancel due to conservation. The boundary has $O(h^{-d+1})$ faces, each with a truncation error which is $O(h^{d+1})$, so the overall truncation error for this aggregation of cells is $O(h^2)$. Giles attempted to refine this argument using a Fourier decomposition of the truncation error and the resulting solution error, but the analysis was not rigorous [18].
To recover the second order accuracy in the error analysis using the adjoint approach requires a simple rearrangement of the error summation. It depends crucially on conservation, so that for an edge $k$ connecting nodes $i$ and $j$, the truncation error associated with the flux along the edge is equal and opposite for the two cells, i.e. $S_{ik} = -S_{jk}$. Therefore, the error summation over all cells can be rearranged into a summation over all edges $E$, with a term proportional to $\Delta V_k^T S_k$ for each edge, and a summation over all boundary nodes $B$. Here $S_k$ is the truncation error for the flux along edge $k$ and $\Delta V_k$ is the difference in the adjoint solutions at the nodes joined by the edge. Assuming that the analytic adjoint solution is differentiable, $\Delta V_k$ should be $O(h_k)$, and with the total number of edges being $O(h^{-d})$ this means that the first sum is $O(h^2)$. The number of boundary nodes is $O(h^{-d+1})$ and so the second sum is also $O(h^2)$. The conclusion is that the overall error in integral quantities, such as lift and drag, is second order even though the local truncation error is first order.
Focusing on the error contribution due to the edges, this is now of the form
$$\sum_{k \in E} h_k^{d+2}\, T_k,$$
where $T_k$ is the product of the adjoint solution gradient along the edge and higher order derivatives of the flow solution coming from the Taylor series expansions used to evaluate the flux truncation error.
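The gap between a first order local truncation error and a second order functional error can be observed on a very small example. The sketch below is an illustration under stated assumptions, not the discretisation analysed in the paper: it uses a conservative node-based flux balance for the 1D problem $-u'' = f$ with $u = \sin \pi x$, on a deliberately irregular grid whose edge lengths alternate between $1.3/n$ and $0.7/n$. The function name `irregular_grid_errors` and the `shift` parameter are hypothetical.

```python
import numpy as np

def irregular_grid_errors(n, shift=0.3):
    """Conservative node-based scheme for -u'' = f on an alternating grid.

    n must be even so that the last node stays at x = 1.
    Returns (max normalised truncation error, error in the functional sum_j h_j u_j).
    """
    j = np.arange(n + 1)
    x = (j + shift * (j % 2)) / n                  # odd nodes are moved to the right
    u = np.sin(np.pi * x)
    f = np.pi**2 * np.sin(np.pi * x)
    h_edge = np.diff(x)                            # edge lengths h_k
    h_cell = 0.5 * (h_edge[:-1] + h_edge[1:])      # dual cell size at interior nodes

    m = n - 1
    A = np.zeros((m, m))
    for i in range(m):                             # row i belongs to interior node i + 1
        A[i, i] = 1.0 / h_edge[i] + 1.0 / h_edge[i + 1]
        if i > 0:
            A[i, i - 1] = -1.0 / h_edge[i]
        if i < m - 1:
            A[i, i + 1] = -1.0 / h_edge[i + 1]
    b = h_cell * f[1:-1]                           # un-normalised source term

    residual = A @ u[1:-1] - b                     # un-normalised truncation error R_j
    U_h = np.linalg.solve(A, b)
    local = np.max(np.abs(residual / h_cell))      # truncation error normalised by cell size
    functional = abs(h_cell @ (U_h - u[1:-1]))     # error in the integral-type functional
    return local, functional

l32, f32 = irregular_grid_errors(32)
l64, f64 = irregular_grid_errors(64)
print("truncation error order:", np.log2(l32 / l64))
print("functional error order:", np.log2(f32 / f64))
```

On this test the first printed order should be close to one and the second close to two. The flux on each edge enters the two neighbouring nodes with opposite signs, which is the discrete conservation property the argument relies on.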
As with the structured grid error analysis, this error estimate can be used to improve the computed value of the functional. Alternatively, it can be used for grid adaptation through the addition of extra grid nodes, thereby reducing the cell sizes $h_k$. The greatest reduction in the error is achieved by introducing additional nodes into the region in which the average magnitude of $h_k^{d+2}\,|T_k|$ is greatest. With repeated refinement, this quantity should eventually become approximately uniform over the grid. The refinement can be continued until the error estimate is smaller than a user-defined tolerance, thereby achieving the goal of minimising the computational cost for a given level of accuracy.
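The refinement loop described above can be sketched in one dimension. This is a hedged illustration only: the error density `T` is a made-up function standing in for the product of adjoint gradient and flow derivatives, and `refine_until`, `tol` and `max_cells` are hypothetical names. The loop repeatedly bisects the cell with the largest indicator $h^{d+2}|T|$ and stops once the summed estimate falls below the tolerance.

```python
import numpy as np

def refine_until(T, tol, d=1, max_cells=5000):
    """Greedy bisection of the 1D cell with the largest indicator h^(d+2) |T|."""
    edges = np.linspace(0.0, 1.0, 5)               # four equal cells to start
    while True:
        h = np.diff(edges)
        mid = 0.5 * (edges[:-1] + edges[1:])
        indicator = h**(d + 2) * np.abs(T(mid))
        if indicator.sum() <= tol or h.size >= max_cells:
            return edges, indicator
        worst = int(np.argmax(indicator))
        # Insert the midpoint of the worst cell as a new grid node.
        edges = np.insert(edges, worst + 1, mid[worst])

# Hypothetical error density: large near x = 0.3, close to 1 elsewhere.
T = lambda x: 1.0 + 100.0 * np.exp(-200.0 * (x - 0.3)**2)

edges, indicator = refine_until(T, tol=1e-4)
h = np.diff(edges)
print(h.size, indicator.sum(), h.min(), h.max())
```

The smallest cells end up clustered where `T` is largest, and the indicators of the final cells differ from one another by a modest factor only, which is the discrete counterpart of the uniformity condition.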
9.3 Finite element analysis
To simplify the details of the analysis we restrict consideration to the 2D or 3D Poisson equation,
$$-\nabla^2 u = f,$$
on a domain $\Omega$ which is a unit square or cube, depending on the dimension, and subject to Dirichlet conditions $u = g$ on the boundary $\partial\Omega$.
9.3.1 Notation and definitions
We define two inner products, one on the domain,
$$(u,v) = \int_\Omega u\,v\,dV,$$
and the other on the boundary,
$$(u,v)_{\partial\Omega} = \int_{\partial\Omega} u\,v\,dA.$$
It is also convenient to define the following bilinear functional
$$a(u,v) = (\nabla u, \nabla v).$$
The $L_2$ norm, $H^1$ semi-norm and $H^1$ Sobolev norm are given by
$$\|u\|_{L_2}^2 = (u,u), \qquad |u|_{H^1}^2 = a(u,u), \qquad \|u\|_{H^1}^2 = (u,u) + a(u,u).$$
Semi-norms and Sobolev norms of higher degree are defined similarly.
9.3.2 Standard f.e. analysis
The standard error analysis for this problem is well-established; see the textbook of Strang & Fix for full details [1].
The function space $H^1(\Omega)$ consists of those functions $u$ for which $\|u\|_{H^1} < \infty$. The subspace $H^1_0(\Omega)$ contains those functions which, in addition, are zero on the boundary $\partial\Omega$, while the subspace $H^1_g(\Omega)$ has those functions satisfying the Dirichlet b.c. $u = g$. The weak solution of the problem is given by the function $u \in H^1_g(\Omega)$ such that
$$a(u,w) = (f,w), \quad \forall w \in H^1_0(\Omega).$$
Now let $S^h$ be a finite element subspace of $H^1(\Omega)$ consisting of continuous functions which are linear on each triangle (or tetrahedron) of a triangulation $T^h$ of the unit square (or cube), with $h$ being the maximum diameter of any individual cell. The subspace $S^h_0$ consists of those functions which are zero on the boundary. In addition we will assume that the boundary data $g$ is piecewise linear so that there also exists a subspace $S^h_g$ with functions satisfying the Dirichlet boundary conditions. The solution of the finite element problem is the function $u^h \in S^h_g$ such that
$$a(u^h,w^h) = (f,w^h), \quad \forall w^h \in S^h_0.$$
For any $w^h \in S^h_0 \subset H^1_0(\Omega)$,
$$a(u,w^h) = (f,w^h),$$
and hence we obtain the orthogonality property
$$a(u-u^h,w^h) = 0, \quad \forall w^h \in S^h_0.$$
Now, there exist positive constants $C_1$, $C_2$ independent of $h$ such that
$$a(w,w) \ge C_1\, \|w\|^2_{H^1}, \quad \forall w \in H^1_0(\Omega),$$
$$a(v,w) \le C_2\, \|v\|_{H^1} \|w\|_{H^1}, \quad \forall v,w \in H^1(\Omega).$$
Hence, using the orthogonality property,
$$\|u-u^h\|^2_{H^1} \le C_1^{-1}\, a(u-u^h,u-u^h) = C_1^{-1}\, a(u-u^h,u-w^h) \le C_1^{-1}C_2\, \|u-u^h\|_{H^1} \|u-w^h\|_{H^1},$$
for any $w^h \in S^h_g$. Thus,
$$\|u-u^h\|_{H^1} \le C_1^{-1}C_2\, \|u-w^h\|_{H^1}.$$
At this point $w^h$ is chosen to be an interpolant of $u$. Standard results concerning the accuracy of interpolation can then be used to deduce that
$$\|u-u^h\|_{H^1} \le C_1^{-1}C_2C_3\, h\, |u|_{H^2},$$
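where $C_3$ is another constant independent of $h$. This proves first order accuracy in the $H^1$ norm.
To prove second order accuracy in the $L_2$ norm requires the Aubin-Nitsche technique of using an adjoint problem. The function $v \in H^1_0(\Omega)$ is defined by the weak problem
$$a(v,w) = (u-u^h,w), \quad \forall w \in H^1_0(\Omega),$$
and by standard elliptic regularity results there exists a fourth constant $C_4$ such that
$$|v|_{H^2} \le C_4\, \|u-u^h\|_{L_2}.$$
Setting $w = u-u^h$, using the orthogonality property, and then taking $v^h$ to be an interpolant of $v$,
$$\begin{aligned}
\|u-u^h\|^2_{L_2} &= a(v,u-u^h) = a(v-v^h,u-u^h), \quad \forall v^h \in S^h_0, \\
&\le C_2\, \|u-u^h\|_{H^1} \|v-v^h\|_{H^1} \\
&\le C_1^{-1}C_2^2C_3^2\, h^2\, |u|_{H^2} |v|_{H^2} \\
&\le C_1^{-1}C_2^2C_3^2C_4\, h^2\, |u|_{H^2} \|u-u^h\|_{L_2},
\end{aligned}$$
and hence
$$\|u-u^h\|_{L_2} \le C_1^{-1}C_2^2C_3^2C_4\, h^2\, |u|_{H^2}.$$
9.3.3 Primal/dual formulations for linear functional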
Suppose now that we are interested in obtaining the value for the following functional which is a combination of inner products over both the domain and the boundary,
$$I = (u,d) + \left(-\frac{\partial u}{\partial n}, e\right)_{\partial\Omega}.$$
For an arbitrary function $w \in H^1_e(\Omega)$, that is, with $w = e$ on $\partial\Omega$, integration by parts yields
$$(f,w) = a(u,w) + \left(-\frac{\partial u}{\partial n}, e\right)_{\partial\Omega},$$
where $u$ is still the weak solution of the original (primal) problem. Hence, the problem of determining the value of the linear functional can be expressed in weak form as:
Analytic primal $\mathcal{P}$: given functions $d,e,f,g$, find $I(d,e,f,g)$ and $u \in H^1_g(\Omega)$ such that
$$I + a(u,w) - (f,w) - (u,d) = 0, \quad \forall w \in H^1_e(\Omega).$$
The weak formulation of the corresponding dual problem is:
Analytic dual $\mathcal{D}$: given functions $d,e,f,g$, find $I(d,e,f,g)$ and $v \in H^1_e(\Omega)$ such that
$$I + a(w,v) - (f,v) - (w,d) = 0, \quad \forall w \in H^1_g(\Omega).$$
The equivalence of the two problems, the fact that they yield the same linear functional $I(d,e,f,g)$, follows immediately from considering $w = v$ in the primal problem, and $w = u$ in the dual.
The finite element approximations to both the primal and dual problems are obtained by replacing $H^1_g(\Omega)$ and $H^1_e(\Omega)$ by $S^h_g$ and $S^h_e$.
Discrete primal $\mathcal{P}^h$: given functions $d,e,f,g$, find $I^h(d,e,f,g)$ and $u^h \in S^h_g$ such that
$$I^h + a(u^h,w^h) - (f,w^h) - (u^h,d) = 0, \quad \forall w^h \in S^h_e.$$
Discrete dual $\mathcal{D}^h$: given functions $d,e,f,g$, find $I^h(d,e,f,g)$ and $v^h \in S^h_e$ such that
$$I^h + a(w^h,v^h) - (f,v^h) - (w^h,d) = 0, \quad \forall w^h \in S^h_g.$$
The equivalence of the linear functionals obtained from these two finite element problems again follows immediately from considering $w^h = v^h$ in the primal problem, and $w^h = u^h$ in the dual.
9.3.4 Functional error representation
Let $u$, $v$ and $u^h$ be the solutions of the problems $\mathcal{P}$, $\mathcal{D}$ and $\mathcal{P}^h$, respectively, and let $w^h$ be an arbitrary function in $S^h_e$. Then, from the definitions of $\mathcal{P}$, $\mathcal{D}$ and $\mathcal{P}^h$, it follows that
$$\begin{aligned}
a(u-u^h,v-w^h) &= a(u,v) - a(u^h,v) - a(u,w^h) + a(u^h,w^h) \\
&= -\big(I - (f,v) - (u,d)\big) + \big(I - (f,v) - (u^h,d)\big) \\
&\quad + \big(I - (f,w^h) - (u,d)\big) - \big(I^h - (f,w^h) - (u^h,d)\big) \\
&= I - I^h.
\end{aligned}$$
Similarly, if $u$, $v$ and $v^h$ are the solutions of $\mathcal{P}$, $\mathcal{D}$ and $\mathcal{D}^h$, respectively, and $w^h \in S^h_g$ then
$$a(u-w^h,v-v^h) = I - I^h.$$
Thus we have two different representations of the error $I - I^h$ in the finite element approximation of the linear functional.
9.3.5 A priori error analysis
The a priori error analysis starts from
$$I - I^h = a(u-u^h,v-v^h).$$
Using the error bounds from the standard error analysis given before, the error in the functional when using a finite element space of piecewise linear functions is bounded by
$$|I - I^h| \le \|u-u^h\|_{H^1} \|v-v^h\|_{H^1} \le C_1^{-2}C_2^2C_3^2\, h^2\, |u|_{H^2} |v|_{H^2}.$$
If one uses a finite element space with polynomials of higher degree such that
$$\|u-u^h\|_{H^1} \le C_3\, h^s\, |u|_{H^{s+1}},$$
for some integer $s > 1$ and constant $C_3$, and the same bound holds for the dual solution, then this a priori error estimate becomes
$$|I - I^h| \le C_3^2\, h^{2s}\, |u|_{H^{s+1}} |v|_{H^{s+1}}.$$
This shows that the order of accuracy of the finite element approximation of the functional is twice as good as the accuracy of the solution $u^h$ in the $H^1$ norm. This superconvergence property is due to the fact that the leading order terms in the solution error, $u-u^h$, are orthogonal to the smooth functions $d$, $e$ in the evaluation of the linear functional.
9.3.6 A posteriori error analysis
The a posteriori error analysis starts from
$$I - I^h = a(u-u^h,v-w^h),$$
where $w^h$ will be an interpolation of the dual solution $v$. Splitting the integral into a sum of integrals over each individual triangle, or tetrahedron, and then integrating by parts yields
$$I - I^h = \sum_{K \in T^h} T_K,$$
where
$$T_K = \int_K (f + \nabla^2 u^h)(v-w^h)\,dV + \frac{1}{2} \int_{\partial K} \left[\frac{\partial u^h}{\partial n}\right] (v-w^h)\,dA,$$
in which $\left[\partial u^h/\partial n\right]$ is the jump in normal derivative across internal faces, and is defined to be zero on faces forming part of the boundary $\partial\Omega$.
When using piecewise linear finite elements, $\nabla^2 u^h$ is zero within each cell, and $\left[\partial u^h/\partial n\right]$ is easily evaluated. The interpolation error $v-w^h$ can be estimated from the second derivatives of $v$; these in turn can be estimated using second differences of the dual finite element solution $v^h$. In this way, the magnitude of the cell error $T_K$ can be accurately estimated.
For optimal grid adaptation, the strategy would be to refine cells for which $|T_K|$ is large until the bound on the error is acceptably small. In the process, $|T_K|$ would become relatively uniform across the whole grid.
9.3.7 Extensions
The analysis presented here is for a simple p.d.e. (the Poisson equation) on a simple domain (which can be triangulated exactly) and with simple boundary conditions and linear functional (corresponding to functions in the finite element space). The analysis can be extended to much harder problems. Becker and Rannacher derived the a posteriori error analysis for the incompressible Navier-Stokes equations [8], and Süli, Giles et al. developed both a priori and a posteriori analyses for the incompressible Navier-Stokes equations [9]. These analyses assume simple domain boundaries and boundary data. A forthcoming paper by Süli and Giles will show, for the convection/diffusion equation, that smooth curved boundaries and smooth boundary data can be treated in a way which does not destroy the superconvergence property for functionals; the key is an appropriate projection of the boundary geometry and data onto the finite element space.
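The a priori result of Section 9.3.5 can be checked on the simplest possible case. The sketch below is an assumption-laden illustration rather than anything from the analysis above: it solves the 1D problem $-u'' = f$ on $(0,1)$ with homogeneous Dirichlet data using piecewise linear elements, takes $u = \sin \pi x$ as a manufactured solution, and uses the functional $I = (u,d)$ with $d = 1$. The name `p1_errors` is hypothetical. With linear elements the $H^1$ error should be first order and the functional error second order.

```python
import numpy as np

GAUSS_X, GAUSS_W = np.polynomial.legendre.leggauss(3)

def p1_errors(n):
    """Linear finite elements for -u'' = f on (0, 1) with u(0) = u(1) = 0.

    Hypothetical test case: u = sin(pi x), functional I = (u, d) with d = 1.
    Returns (H1 semi-norm error of u^h, |I - I^h|).
    """
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    f = lambda s: np.pi**2 * np.sin(np.pi * s)
    du = lambda s: np.pi * np.cos(np.pi * s)

    # Stiffness matrix for the interior nodes; load vector by Gauss quadrature.
    m = n - 1
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    b = np.zeros(n + 1)
    for k in range(n):
        xq = x[k] + 0.5 * h * (GAUSS_X + 1.0)
        wq = 0.5 * h * GAUSS_W
        right = (xq - x[k]) / h                    # hat function of node k+1 on cell k
        b[k] += np.sum(wq * f(xq) * (1.0 - right))
        b[k + 1] += np.sum(wq * f(xq) * right)

    U = np.zeros(n + 1)
    U[1:-1] = np.linalg.solve(K, b[1:-1])

    # H1 semi-norm of u - u^h, cell by cell (u^h has constant slope on each cell).
    slope = np.diff(U) / h
    h1_sq = 0.0
    for k in range(n):
        xq = x[k] + 0.5 * h * (GAUSS_X + 1.0)
        h1_sq += np.sum(0.5 * h * GAUSS_W * (du(xq) - slope[k])**2)

    I_exact = 2.0 / np.pi                          # integral of sin(pi x) over (0, 1)
    I_h = h * np.sum(0.5 * (U[:-1] + U[1:]))       # exact integral of the piecewise linear u^h
    return np.sqrt(h1_sq), abs(I_exact - I_h)

e16, e32 = p1_errors(16), p1_errors(32)
print("H1 order:", np.log2(e16[0] / e32[0]))
print("functional order:", np.log2(e16[1] / e32[1]))
```

For linear elements the doubled order coincides with the usual second order $L_2$ rate, so this small test shows the doubling but not an advantage over second order methods; that would need elements of higher degree.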
9.4 Some concluding remarks
This paper has outlined the way in which the solution of an appropriate dual problem can be used to estimate the error in approximating a nonlinear functional in CFD computations. The error estimates can be used either to obtain better approximations to the functional itself, or to drive grid adaptation with the aim of achieving the most accurate answer possible for a given level of computational effort. The finite volume analysis shows that on unstructured grids discrete conservation is crucial in gaining one order of accuracy relative to the order of the local truncation error. However, the analysis outlined makes the assumption that the gradient of the dual solution is bounded. This may not be true for the Euler equations along the stagnation streamline [16], and so additional analysis may be required. The a priori finite element error analysis reveals an interesting superconvergence property, showing that the order of accuracy of the approximate linear functional is twice that of the solution itself. The lack of a similar result for the finite volume analysis may indicate a significant advantage for finite element methods, but the advantage only appears when
using methods which have better than second order accuracy.
Acknowledgments
I am very grateful to my colleague Endre Süli for his comments on this paper as well as our many discussions regarding error analysis. This research was supported by the UK Engineering and Physical Science Research Council under research grant GR/K91149.
REFERENCES
1. G. Strang and G. Fix. An Analysis of the Finite Element Method. Prentice-Hall, 1973.
2. E. Süli and P. Houston. Finite element methods for hyperbolic problems: a posteriori error analysis and adaptivity. State of the Art in Numerical Analysis. I. Duff and A. Watson, editors. Oxford University Press, 1997.
3. T. Sonar and E. Süli. A dual graph norm refinement indicator for the compressible Euler equations. To appear in Numerische Mathematik.
4. E. Süli. A posteriori error analysis and global error control for adaptive finite element approximations of hyperbolic problems. Proceedings of the 16th Biennial Conference in Numerical Analysis. D. Griffiths, editor. Pitman, 1996.
5. I. Babuska and W.C. Rheinboldt. Error estimates for adaptive finite element computations. SIAM J. Numer. Anal., 15(4):736-754, 1978.
6. I. Babuska and A. Miller. The post-processing approach in the finite element method - Part 1: calculation of displacements, stresses and other higher derivatives of the displacements. Intern. J. Numer. Methods Engrg., 20:1085-1109, 1984.
7. R. Verfürth. A posteriori error estimation and adaptive mesh-refinement techniques. J. Comput. Appl. Math., 50:67-83, 1994.
8. R. Becker and R. Rannacher. Weighted a posteriori error control in finite element methods. Technical report, Universität Heidelberg, 1994. Preprint No. 96-1.
9. M.B. Giles, M.G. Larson, J.M. Levenstam, and E. Süli. Adaptive error control for finite element approximations of the lift and drag in viscous flow. Technical Report NA97/06, Oxford University Computing Laboratory, 1997.
10. J. Peraire, M. Paraschivoiu and A. Patera. A posteriori finite element bounds for linear-functional outputs of elliptic partial differential equations. Symposium on Advances in Computational Mechanics, submitted to Comp. Meth. Appl. Engrg., 1996.
11. A. Jameson. Aerodynamic design via control theory. J. Sci. Comput., 3:233-260, 1988.
12. A. Jameson, N.A. Pierce, and L. Martinelli. Optimum aerodynamic design using the Navier-Stokes equations. AIAA Paper 97-0101, 1997.
13. O. Baysal and M.E. Eleshaky. Aerodynamic sensitivity analysis methods for the compressible Euler equations. J. Fluids Engrg., 113:681-688, 1991.
14. J. Elliot and J. Peraire. Practical 3D aerodynamic design and optimization using unstructured grids. AIAA Paper 96-4122-CP, 1996. Proceedings of 6th AIAA/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization.
15. W.K. Anderson and V. Venkatakrishnan. Aerodynamic design optimization on unstructured grids with a continuous adjoint formulation. AIAA Paper 97-0643, 1997.
16. M.B. Giles and N.A. Pierce. Adjoint equations in CFD: duality, boundary conditions and solution behaviour. AIAA Paper 97-1850, 1997.
17. D.R. Lindquist and M.B. Giles. A comparison of numerical schemes on triangular and quadrilateral meshes. In D.L. Dwoyer, M.Y. Hussaini, and R.G. Voigt, editors, Proceedings of the 11th International Conference on Numerical Methods in Fluid Dynamics, volume 323 of Lecture Notes in Physics, pages 369-373. Springer-Verlag, Berlin, 1989.
18. M.B. Giles. Accuracy of node based solutions on irregular meshes. In D.L. Dwoyer, M.Y. Hussaini, and R.G. Voigt, editors, Proceedings of the 11th International Conference on Numerical Methods in Fluid Dynamics, volume 323 of Lecture Notes in Physics, pages 273-277. Springer-Verlag, Berlin, 1989.
10 Added Dissipation in Flow Computations
Robert W. MacCormack¹
¹ Department of Aeronautics and Astronautics, Stanford University, Stanford, CA 94305-4035. Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific
10.1 Introduction
Efficient implicit numerical methods for solving the equations of compressible viscous flow have been developed and used widely during the past quarter century. These methods first approximate the governing partial differential flow equations with a matrix equation and then replace, by approximate factorization, the original matrix with a sequence of efficiently invertible matrix factors. In addition, these methods often add artificial dissipation to control numerical instability. The difference between the original equations to be solved and those actually solved numerically represents computation error. This error is introduced through
1. discretization of derivatives by finite difference quotients,
2. approximate factorization of the original matrix equation, and
3. the addition of artificial dissipation.
These methods typically operate at time steps far smaller than are required by time accuracy considerations and often require thousands of time steps for convergence to even engineering accuracy. Errors introduced by approximate factorization or decomposition are principally responsible for slow convergence. Relatively small time steps are required to contain these errors introduced during the initial transients of the flow and then many time steps must be taken to rid the solution of the garbage introduced. Discretization error and artificial dissipation are principally responsible for degrading the accuracy of the numerical solution.
This author used to distinguish artificial dissipation introduced as additional terms to the difference equations from genuine dissipation introduced by choices made, for example upwinding, in the discretization of terms actually appearing in the original set of differential equations. The former was considered to be of a lower form because human intervention could add it at will or dial up its strength until subjectively pleasing results were obtained. Computational fluid dynamics is still very much more an art than a science. On the other hand, this author would blend dissipative first order accurate upwind differencing with higher order derivative approximations in regions containing steep gradients, for example at shock waves, to control numerical oscillation. However, today this author considers even this form of genuine dissipation a cancer on the body of the numerical solution. Yet, when it is all said and done, numerical dissipation is the fundamental requirement for numerical stability and obtaining solutions. It has to be present. This paper presents a rational way to formulate and implement it.
The strategy for obtaining fast and accurate solutions to the equations governing compressible inviscid or viscous flow is to first strip away, as much as possible, sources of numerical error and then reintroduce only that which is necessary to control instability without destroying the physics of the governing equations. The strategy is as follows:
1. discretize the derivatives by finite difference approximations of high order accuracy,
2. minimize the error of approximate factorization, and
3. add frame independent artificial viscous stress and dissipation of minimal strength.
In the calculations to be described, third order accurate upwind biased difference quotients were used to approximate all inviscid spatial derivatives and central second order accurate approximations were used for the viscous terms. The numerical procedure for minimizing approximate factorization error is described by MacCormack [6]. It consists of two parts: an approximate factorization, originally used by Bardina and Lombard [1], with decomposition error much smaller than that used in the Briley and McDonald [4] or Beam and Warming [2] methods, and an additional procedure [7,8] to eliminate the approximate factorization error which is introduced. The procedure is both fast and accurate because of the minimization of numerical error. Items 1 and 2 above will not be discussed here further. The next section begins the discussion of item 3, required for numerical stability, by examining the Navier-Stokes stress tensor in cartesian and cylindrical coordinates.
On comparing the two descriptions above we can find similarities and explain differences. For example, consider the two additional terms on the left hand side of the cylindrical equations, $-\rho q_\theta^2/r$ and $+\rho q_r q_\theta/r$. The first is the centrifugal force term always adding momentum in the radial direction if the fluid has angular momentum. The second represents the "ice skater" term where angular momentum is increased if radial fluid motion is toward the axis and reduced otherwise. The physics of the flow requires these two terms to balance the system and they naturally arise in transforming the cartesian form of the momentum equations into the cylindrical system. The viscous terms given on the right hand side of the equations also show additional terms in the cylindrical system and are required to balance the flow physics. The equations of fluid dynamics are often solved numerically in generalized coordinates, some not even orthogonal, and care is taken to transform the cartesian equations, either directly or through finite volume formulations, into the computational coordinate system so that the frame independent flow physics is preserved. However, when it comes time to add dissipation, either directly or through the choice of a lower order accurate difference approximation, little thought is given to the preservation of physical balance. The desired effect of artificial viscosity is to damp numerical oscillation, which often leads to instability. However, an unanticipated side effect is the destruction of the physical balance of the overall system. It would be similar to adding a significant new term to the governing cylindrical equations that directly changes radial momentum and causes an unwanted significant "ice skater-like" reaction in angular momentum. It is hopeless to add artificial viscosity in multidimensional curvilinear coordinate systems and simultaneously preserve physical balance - with one exception.
The exception is to use the already frame independent Navier-Stokes viscous stress tensor itself as a vehicle to add numerical dissipation. Similar to the concept of an eddy viscosity used to include the effects of turbulent mixing in the Reynolds Averaged Navier-Stokes equations, the addition of numerical viscosity by augmenting the natural viscosity can be used in a frame independent manner to control numerical instability.
$$\mu \leftarrow \mu_{physical} + \mu_{numerical} \qquad (10.1)$$
10.3
The Choice of Added Numerical Viscosity
Assuming the numerical procedure has been stripped as bare as is reasonably possible of sources of numerical error and that the procedure requires added dissipation to control stability, we will illustrate the choice of numerical viscosity for two flow situations, at shock waves and subsonic convection. Numerical results will be given for each case.
ADDED DISSIPATION IN FLOW COMPUTATIONS
10.3.1
Shock Waves
Shock waves are probably the most prominent feature of compressible flow and they have received the most attention in computational fluid dynamics. Yet, they continue to be a major source of difficulty in multidimensional flow. Poor representations of them will occur if they are strong, above a normal Mach number of two, and are allowed to cut indiscriminately across a computational grid. Disturbing irregularities are often found near the axis of symmetry for blunt body calculations [7]. Mesh adaptation is not sufficient in itself to represent shock waves accurately. Numerical dissipation is required. The following choice for added numerical viscosity was used to obtain the results for Mach 5 flow past a sphere presented later. First, the locations of all shock points in each computational coordinate direction are found by determining if the normal velocity component changes from supersonic to subsonic flow in the flow direction with an accompanying increase in pressure. Consider a $\xi$, $\eta$ and $\zeta$ computational coordinate system with points 1 and 2 adjacent to each other in coordinate direction $\xi$ and $\xi_2 > \xi_1$. A shock point is present if
$$q_{n1} > c_1, \quad q_{n2} < c_2 \quad \text{and} \quad p_2 > p_1$$
or
$$-q_{n1} < c_1, \quad -q_{n2} > c_2 \quad \text{and} \quad p_1 > p_2$$
where $q_n$ is the velocity normal to the surface $\xi$ = constant, located midway between points $\xi_1$ and $\xi_2$, $c$ is the sound speed and $p$ is the pressure. If either of these conditions is met
$$\mu_{numerical,i} = \tfrac{1}{2} \min\{\Delta h_\xi, \Delta h_\eta, \Delta h_\zeta\}_i \; \rho_i c_i \qquad (10.2)$$
where $\Delta h_\xi$, $\Delta h_\eta$ and $\Delta h_\zeta$ represent the grid spacing distances in the $\xi$, $\eta$ and $\zeta$ directions and $\rho$ is the density. At both points $i = 1$ and 2 the viscosity is augmented as shown above in Eq.(10.1). Similarly, the other coordinate directions, $\eta$ and $\zeta$, are checked for the presence of shock points. The above procedure augments the natural viscosity along a narrow band, two points wide, aligned with the shock wave. This in turn can influence the calculation along a band four points wide. However, supersonic points adjacent to points 1 or 2 should be shielded from being affected by numerical viscosity. They don't need it and the upstream propagation of information within a supersonic flow through numerical viscosity is not physical.
10.3.2
Subsonic Convection
Numerical instabilities often begin with oscillations in the convection velocity within subsonic regions where they can be reinforced through self stimulation.
The following choice for added numerical viscosity was used to obtain the results for Mach 0.2 flow past an ellipse presented later. Again, consider two points 1 and 2 adjacent to each other and check for subsonic flow. If $|q_{n1}| < c_1$ and $|q_{n2}| < c_2$ then
$$\mu_{numerical,i} = 0.05 \cdot \min\{\Delta h_\xi, \Delta h_\eta, \Delta h_\zeta\} \cdot \left(\rho_1 |q_{n1}| + \rho_2 |q_{n2}|\right) \cdot \frac{\cdots}{|q_{n3}| + 3\left(|q_{n1}| + |q_{n2}|\right) + |q_{n4}| + \epsilon} \qquad (10.3)$$
where points 3 and 4 are adjacent to 1 and 2 and are ordered as $\xi_3 < \xi_1 < \xi_2 < \xi_4$ and $\epsilon \approx 10^{-9}$ is added to prevent division by zero in motionless regions. The quotient factor is bounded by one and is proportional to a derivative of $q_n$ with respect to $\xi$, and the above value of numerical viscosity is added, as in Eq.(10.1), at points 1 and 2.
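The two viscosity choices of Section 10.3 can be sketched in a few lines of Python. This is a minimal illustration and not the author's code: the function and argument names are hypothetical, and the numerator of the quotient in the subsonic formula is not legible in the text, so the sketch assumes a third difference of $q_n$. That assumption matches the 1-3-3-1 weights of the denominator and the stated bound of one, but it remains an assumption.

```python
def is_shock_point(qn1, c1, p1, qn2, c2, p2):
    """Shock test between neighbours 1 and 2, with xi2 > xi1."""
    # Flow in the +xi direction: supersonic at 1, subsonic at 2, pressure rises.
    forward = qn1 > c1 and qn2 < c2 and p2 > p1
    # Flow in the -xi direction: supersonic at 2, subsonic at 1, pressure rises.
    backward = -qn1 < c1 and -qn2 > c2 and p1 > p2
    return forward or backward


def shock_viscosity(h_xi, h_eta, h_zeta, rho, c):
    """Eq. (10.2): half the smallest local grid spacing times rho * c."""
    return 0.5 * min(h_xi, h_eta, h_zeta) * rho * c


def subsonic_viscosity(h_min, rho1, qn1, c1, rho2, qn2, c2, qn3, qn4, eps=1e-9):
    """Subsonic convection viscosity for points ordered 3 < 1 < 2 < 4."""
    if not (abs(qn1) < c1 and abs(qn2) < c2):
        return 0.0  # the formula is only applied where both points are subsonic
    # ASSUMPTION: the numerator is taken to be a third difference of q_n.
    numerator = abs(qn4 - 3.0 * qn2 + 3.0 * qn1 - qn3)
    denominator = abs(qn3) + 3.0 * (abs(qn1) + abs(qn2)) + abs(qn4) + eps
    return 0.05 * h_min * (rho1 * abs(qn1) + rho2 * abs(qn2)) * numerator / denominator
```

With the assumed numerator, a velocity profile that varies linearly across the four points receives essentially no added viscosity, while a point-to-point oscillation receives close to the full amount.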
10.4
Computational Results
The method for solving the equations of compressible viscous flow described in [6,7,8], with added numerical viscosity as just described, was applied to solve two benchmark calculations with known numerical difficulties. The first, Mach 5 viscous laminar flow past a sphere was suggested by Blottner [3], and the second, inviscid Mach 0.2 flow past an ellipse with 6 to 1 axis ratio at 5 degrees angle of attack, was suggested by Pulliam [9]. In the first problem only the added viscosity presented for shock waves, Eq.(10.2), was used. The natural viscous terms were sufficient to maintain stability in the remainder of the flow. For the second problem, containing only subsonic flow, only the artificial convection viscosity, Eq.(10.3), was used.
10.4.1
Mach 5 Viscous Flow Past a Sphere
The Mach 5 laminar flow past a sphere benchmark problem was originally chosen by Blottner in 1990 to study numerical difficulties encountered near the intersection of the axis of flow symmetry with the bow shock wave. This region is very sensitive because of the coordinate singularity caused as $r \to 0$ in terms containing factors $1/r$. See, for example, the Navier-Stokes momentum equations given earlier in cylindrical coordinates. Many papers have observed numerical difficulties in this region. Blottner did an excellent job of discussing the nature of this numerical problem and presented excellent computational results using a thin layer Navier-Stokes code. The bow shock wave was used as an outer boundary of the flow. In the present paper, a full Navier-Stokes calculation is presented and the shock wave was captured as an internal feature of the flow. The flow was solved on three different grids, a coarse mesh consisting of 22x26 points, a medium mesh of 42x50 points, and a fine mesh of 82x98 points, each spanning the same flow volume. For this flow the computational difficulty increases as points are placed closer to the axis of symmetry. Fig. 1 shows the medium and coarse meshes. An initial bow shock wave location was chosen to be aligned with a grid line of the mesh. The Rankine-Hugoniot shock-jump relations were used to initialize the flow behind the shock wave with the normal-to-the-body-surface velocity component reducing to zero as the body surface was approached. The surface temperature was held at 98.89 degrees Kelvin. Sutherland's formula was used to determine viscosity. The Reynolds number, based on sphere diameter, was $1.89 \times 10^6$. The calculation was started with a CFL number, the ratio of the time step size actually used to the maximum allowed by an explicit method, of about $6 \times 10^3$. The CFL number was then increased to $2.5 \times 10^7$ during the first 52 time steps and thereafter held fixed. This medium mesh case was run for 256 time steps. The mesh was also dynamically kept aligned with the bow shock wave and refined in the neighborhood of the shock as described in [7]. Pressure and Mach contours for the medium mesh are shown in Fig. 2 and surface pressure, skin friction and heat transfer results for all three meshes are shown in Fig. 3.
Note that, as expected, skin friction increases linearly with distance from the axis of symmetry, surface pressure and heat transfer have bell shaped curves with their maxima at the axis, and that the three solutions converge. The stagnation pressure ratios, $p_{stag}/p_\infty$, on the three meshes, from coarse to fine, are 32.5800, 32.6228 and 32.6667, which compare well to the Richardson extrapolated value of 32.6558 given by Blottner. The corresponding heat transfer values, expressed in kW/m$^2$, 103.920, 104.628 and 105.613, also compare favorably with the value 107.360 given by Blottner. Fig. 4 shows the residual versus time step. During the first 64 steps the mesh tracked the shock by relaxing the position of a chosen grid line toward the Mach 1 contour [7]. During this time the average residual reduction rate was approximately 0.86, a value larger than the rate of 0.80 [8] on a fixed grid for this same problem. From step 64 to 96 the grid was refined in the vicinity of the shock as well as tracked and thereafter the grid only tracked the shock. These results were obtained on a 32 bit per word workstation and residual reductions are limited to about five orders of magnitude.
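The quoted comparisons can be put on a common footing with a short calculation. The sketch below uses only the numbers given in the paragraph above. The helper names are hypothetical, and the step-count estimate assumes that the quoted reduction rate is a constant factor applied to the residual at every time step, which the text does not state explicitly.

```python
import math

# Reference values attributed to Blottner in the text.
P_STAG_REF = 32.6558   # Richardson extrapolated stagnation pressure ratio
HEAT_REF = 107.360     # stagnation heat transfer, kW/m^2

p_stag = {"coarse": 32.5800, "medium": 32.6228, "fine": 32.6667}
heat = {"coarse": 103.920, "medium": 104.628, "fine": 105.613}


def percent_difference(value, reference):
    """Signed difference from the reference value, in percent."""
    return 100.0 * (value - reference) / reference


def steps_for_reduction(rate, decades):
    """Steps needed to drop the residual by `decades` orders of magnitude,
    ASSUMING each step multiplies the residual by the constant factor `rate`."""
    return decades * math.log(10.0) / -math.log(rate)


for mesh in ("coarse", "medium", "fine"):
    print(mesh,
          round(percent_difference(p_stag[mesh], P_STAG_REF), 3),
          round(percent_difference(heat[mesh], HEAT_REF), 3))

print(steps_for_reduction(0.86, 5), steps_for_reduction(0.80, 5))
```

Under that assumption the difference between rates of 0.86 and 0.80 amounts to roughly 25 extra time steps to reach the five-order limit of 32 bit arithmetic.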
10.4.2
Mach 0.2 Inviscid Flow Past an Ellipse
In 1990 T.H. Pulliam presented the outstanding result that Euler flow codes were incapable of calculating low Mach number flow past the simple ellipse. "The basic result obtained here is a lifting solution for any combination of grid and/or angle of attack which is nonsymmetric." He then challenged the computational fluid dynamics community to "explain this unusual behavior" and he defined a benchmark flow problem, an ellipse with axis ratio 6:1 at 5 degrees angle of attack in Mach 0.2 inviscid flow, which exhibited "disturbing" results. The flow should produce zero lift and drag but all known numerical Euler solutions written in terms of variables $\rho$, $u$, $v$, and energy reproduced neither. Pulliam's results converged to a significant lift coefficient of 1.545. This problem has for almost seven years represented the numerical equivalent to d'Alembert's paradox. Highly accurate solutions for this problem have been obtained by Hafez and Brucker [5] for the steady incompressible and compressible form of the Euler equations. Also, Winterstein and Hafez [10] present excellent solutions for this problem with the Euler equations solved on triangular meshes. In the latter study, they fixed the rear stagnation point by varying its location until reaching the condition of zero lift. In the present study, the unsteady Euler equations were solved without any conditions on the location of the stagnation points. There is much discussion concerning multiple solutions for this inviscid flow problem, which has no sharp trailing edge and no specification of a Kutta condition. However, if the flow is initialized with no vorticity, and none is generated during the course of the calculation, the symmetric non-lifting solution should occur. Fig. 5 shows a portion of a 130x66 point mesh for this benchmark problem. The outer boundaries of the stretched mesh were approximately 5000 major axis radii away from the body.
The natural viscosity was set to zero and slip boundary conditions were implemented at the body. The far field boundaries were held to constant Mach 0.2 flow. The calculation was started impulsively by suddenly placing the body in a uniform Mach 0.2 flow. The initial CFL number was $3.0 \times 10^3$, was increased to $2.5 \times 10^9$ after 80 steps and thereafter was held fixed. The calculation was run to 2000 time steps to test convergence, but the solution was fairly steady much earlier. Figs. 6 and 7 show fairly symmetric patterns for pressure contours and streamlines. If significant lift were produced the rear stagnation point would be expected to move from the top of the trailing edge around toward the lower surface, as shown in [9]. Fig. 8 shows the coefficient of pressure along the surface of the ellipse. Again, it appears fairly, though not perfectly, symmetric, as required for small lift and drag. Fig. 9 shows lift and drag versus time step. The final values are $4.00 \times 10^{-3}$ and $1.66 \times 10^{-3}$, respectively. The lift fell to $7.40 \times 10^{-2}$ after 100 time steps. The residual versus time step is shown in Fig. 10. A reasonably accurate Euler solution has been obtained for "Pulliam's paradox" problem. This time again, as with d'Alembert's paradox, the answer appears to be viscosity, in this case too much or frame dependent numerical viscosity, which can destroy the sensitive physical balance of the Euler equations.
10.5
Conclusion
An approach to control numerical instability, similar in concept to that of an eddy viscosity used to include the effects of turbulent mixing in the Reynolds Averaged Navier-Stokes equations, is presented to add numerical viscosity directly to the flow equations by augmenting the natural viscosity. The Navier-Stokes stress tensor is itself frame independent and hence this approach preserves frame independence, believed to be important for a physical balance of the governing equations when solved numerically in arbitrary curvilinear coordinate systems. This approach has been applied, within a flow solver of minimized factorization error and numerical dissipation, to two difficult benchmark flow problems, one proposed by Blottner as a test for Navier-Stokes solvers and the other presented as a challenge by Pulliam for Euler solvers. The computed results were in good agreement with accepted computational and theoretical results.
REFERENCES
1. Bardina, J. and C.K. Lombard, "Three Dimensional Hypersonic Flow Simulations with the CSCM Implicit Upwind Navier-Stokes Method," AIAA Paper No. 87-1114, 1987.
2. Beam, R. and R.F. Warming, "An Implicit Factored Scheme for the Compressible Navier-Stokes Equations," AIAA Journal, Vol. 16, 1978, pp. 393-402.
3. Blottner, F.G., "Accurate Navier-Stokes Results for the Hypersonic Flow over a Spherical Nosetip," Journal of Spacecraft, Vol. 27, No. 2, 1990, pp. 113-122.
4. Briley, W.R. and H. McDonald, "Solution of the Multidimensional Compressible Navier-Stokes Equations by a Generalized Implicit Method," Journal of Computational Physics, Vol. 24, 1977, pp. 372-397.
5. Hafez, M. and D. Brucker, "The Effect of Artificial Vorticity on the Discrete Solution of Euler Equations," AIAA Paper No. 91-1558-CP, 1991.
6. MacCormack, R.W., "Efficient Matrix Decomposition for Implicit Algorithms," presented at The 15th International Conference on Numerical Methods in Fluid Dynamics, Monterey, CA, June 24-28, 1996, proceedings to be published in Lecture Notes in Physics, Springer-Verlag.
7. MacCormack, R.W., "Considerations for Fast Navier-Stokes Solvers," presented at the Symposium Advances of Flow Simulation Techniques, Davis, CA, May 2-4, 1997, proceedings to be published.
8. MacCormack, R.W., "A New Implicit Algorithm for Fluid Flow," AIAA Paper No. 97-2100, 1997.
9. Pulliam, T.H., "Computational Challenge: Euler Solution for Ellipses," AIAA Journal, Vol. 28, No. 10, 1990, pp. 1703-1704.
10. Winterstein, R. and M. Hafez, "Euler Solutions for Blunt Bodies Using Triangular Meshes: Artificial Viscosity Forms and Numerical Boundary Conditions," AIAA Paper No. 93-3333-CP, 1993.
Figure 1. Medium and coarse meshes about a sphere.
Figure 2. Pressure and Mach contours for supersonic flow past a sphere.
Figure 3. Surface pressure, skin friction and heat transfer.
Figure 4. Residual versus time step.
Figure 5. Mesh about an ellipse.
Figure 6. Pressure contours for subsonic flow about an ellipse.
Figure 7. Streamlines for subsonic flow about an ellipse.
Figure 8. Coefficient of pressure versus x/c.
11 A Four-Operators Conservative Scheme for the Euler Equations
Jean-Jacques Chattot¹
11.1 Introduction
The pioneering work of Murman and Cole on the solution by relaxation of the transonic small disturbance equation (TSD) [1], completed by the sequel paper of Murman [2], was a breakthrough in many ways, and foremost in computing efficiency. But it may not have been said enough that the method had two other features that are highly desirable and not commonly found in schemes for the Euler equations: there is no tuning parameter to adjust, and the method is remarkably robust. In order to prepare the ground for the Euler equations, the link between the mixed scheme for the 1-D TSD potential equation and a corresponding scheme for Burgers (inviscid) equation is now established. The governing equation for the small disturbance potential for flows near Mach one can be written as:
$$\frac{\partial}{\partial x}\left[\frac{1}{2}\left(\frac{\partial \varphi}{\partial x}\right)^2\right] = 0$$
+ boundary conditions. The perturbation potential $\varphi$ is defined at the nodes and two switches are defined to characterize the flow regime at each point:
d dx
=0
¹Department of Mechanical and Aeronautical Engineering, University of California Davis, Davis, California 95616. Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific
CHATTOT
+ initial and boundary conditions. If the following discrete relation is introduced: _q>i-
The four operators mixed scheme of Murman [2] has an exact equivalent for Burgers equation. However a slightly different scheme will be introduced, with a different sonic point operator, that eliminates expansion shocks.
11.2 A mixed-scheme for Burgers equation
Consider the following mixed scheme:

i) $u_{i-\frac{1}{2}} > 0$, $u_{i+\frac{1}{2}} > 0$:
$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + \frac{(u_i^n)^2/2 - (u_{i-1}^n)^2/2}{\Delta x} = 0, \qquad \Delta t < \frac{\Delta x}{u_{i-\frac{1}{2}}}$$

ii) $u_{i-\frac{1}{2}} > 0$, $u_{i+\frac{1}{2}} < 0$:
$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + \frac{(u_{i+1}^n)^2/2 - (u_i^n)^2/2}{\Delta x} + \frac{(u_i^n)^2/2 - (u_{i-1}^n)^2/2}{\Delta x} = 0, \qquad \Delta t < \frac{\Delta x}{u_{i-\frac{1}{2}} - u_{i+\frac{1}{2}}}$$

iii) $u_{i-\frac{1}{2}} < 0$, $u_{i+\frac{1}{2}} > 0$:
$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + \frac{u_i^n \, \dfrac{u_{i+1}^n - u_{i-1}^n}{2\Delta x}}{1 + \dfrac{\Delta t}{2\Delta x}\left(u_{i+1}^n - u_{i-1}^n\right)} = 0, \qquad \Delta t < \infty$$

iv) $u_{i-\frac{1}{2}} < 0$, $u_{i+\frac{1}{2}} < 0$:
$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + \frac{(u_{i+1}^n)^2/2 - (u_i^n)^2/2}{\Delta x} = 0, \qquad \Delta t < \frac{\Delta x}{-u_{i+\frac{1}{2}}}$$

It can be noted that the sonic point, scheme iii), is not a conservative discretization, hence it is not conservative in the transient and it can be shown that the conservation error is proportional to $\Delta x$. However, it is conservative at steady state. It is the price paid for the elimination of expansion shocks. This scheme is locally implicit and, as such, has no time step requirement, as it can be written:
$$\frac{u_i^{n+1} - u_i^n}{\Delta t} + u_i^{n+1}\,\frac{u_{i+1}^n - u_{i-1}^n}{2\Delta x} = 0$$
The implicit character is achieved through the coefficient, not the space derivative. This brings unconditional stability for the sonic point.

Some comments need to be made regarding stability. Since the governing equation is nonlinear, the study of stability has been carried out after linearization, using the Von Neumann method. The linearization for the supersonic point operator i) and subsonic point operator iv) is straightforward, with characteristic speeds $u_{i-\frac{1}{2}}$ and $u_{i+\frac{1}{2}}$ taken as constant. At the sonic point, the implicitness is recognized and the space derivative $(u_{i+1}^n - u_{i-1}^n)/(2\Delta x)$ is considered merely as a positive constant. For the shock point, scheme ii), the linearization involves two constant wave speeds with opposite signs, as:
$$\frac{(u_{i+1}^n)^2/2 - (u_{i-1}^n)^2/2}{\Delta x} = u_{i+\frac{1}{2}}\,\frac{u_{i+1}^n - u_i^n}{\Delta x} + u_{i-\frac{1}{2}}\,\frac{u_i^n - u_{i-1}^n}{\Delta x}$$
This analysis provides the above stability conditions, which have been found reliable in numerical tests [3]. It can be shown that this scheme is second order accurate at steady-state for Burgers equation with a source term modeling a quasi-one-dimensional flow in a slender nozzle [4].
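A minimal Python sketch of one time step of the four-operator scheme for Burgers equation follows. It is an illustration under stated assumptions rather than the author's code: the half-node speeds are assumed to be arithmetic means of the neighbouring nodal values (the defining relation is not legible in the text), a zero half-node speed is treated as negative because the text only gives strict inequalities, and the two end points are simply held fixed.

```python
def mixed_scheme_step(u, dt, dx):
    """Advance nodal values u by one step of the four-operator scheme."""
    new = list(u)  # boundary values are kept unchanged in this sketch
    for i in range(1, len(u) - 1):
        # ASSUMPTION: half-node speeds are arithmetic means of the neighbours.
        u_minus = 0.5 * (u[i - 1] + u[i])
        u_plus = 0.5 * (u[i] + u[i + 1])
        f_left, f_mid, f_right = 0.5 * u[i - 1] ** 2, 0.5 * u[i] ** 2, 0.5 * u[i + 1] ** 2
        if u_minus > 0 and u_plus > 0:
            # i) supersonic point: upwind flux difference from the left box
            new[i] = u[i] - dt / dx * (f_mid - f_left)
        elif u_minus > 0:
            # ii) shock point: sum of the flux differences of both boxes
            new[i] = u[i] - dt / dx * (f_right - f_left)
        elif u_plus > 0:
            # iii) sonic point: implicit coefficient, centered difference
            new[i] = u[i] / (1.0 + dt / (2.0 * dx) * (u[i + 1] - u[i - 1]))
        else:
            # iv) subsonic point: flux difference from the right box
            new[i] = u[i] - dt / dx * (f_right - f_mid)
    return new


# A stationary compression shock is left unchanged ...
shock = mixed_scheme_step([1.0, 1.0, 1.0, -1.0, -1.0, -1.0], dt=0.05, dx=0.1)
# ... while an expansion shock is immediately smoothed by the sonic operator.
expansion = mixed_scheme_step([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0], dt=0.05, dx=0.1)
```

The two sample calls contrast the behaviour the text describes: the compression shock is a steady state of the scheme, and the expansion shock is not.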
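11.3 1-D Euler Equations
The one-dimensional Euler equations can be written as:
$$\frac{\partial w}{\partial t} + \frac{\partial f(w)}{\partial x} = 0$$
where:
$$w = \begin{pmatrix} \rho \\ \rho u \\ \rho E \end{pmatrix}, \qquad f(w) = \begin{pmatrix} \rho u \\ \rho u^2 + p \\ \rho u H \end{pmatrix}, \qquad E = \frac{p}{(\gamma - 1)\rho} + \frac{u^2}{2}, \qquad H = \frac{\gamma p}{(\gamma - 1)\rho} + \frac{u^2}{2}$$
Let $A = \partial f / \partial w$ be the jacobian matrix and $a^2 = \gamma p / \rho$ the square of the speed of sound. The characteristic matrix $C$ associated with the system of the one-dimensional Euler equations has three distinct eigenvalues:
$$|C| = |A - \lambda I| = 0, \qquad \lambda^{(j)} = \{u - a;\; u;\; u + a\}, \quad j = 1, 2, 3$$
The system is totally hyperbolic. The left eigenvectors are obtained from:
$$l^{(j)} A = \lambda^{(j)} l^{(j)}, \quad j = 1, 2, 3$$
The compatibility relations are:
$$CR^{(j)}: \quad l^{(j)} \cdot \left(\frac{\partial w}{\partial t} + \frac{\partial f}{\partial x}\right) = 0, \quad j = 1, 2, 3$$
These represent interior derivatives along the characteristic lines defined by:
$$\left(\frac{dx}{dt}\right)^{(j)} = \lambda^{(j)}, \quad j = 1, 2, 3$$
The three compatibility relations are equivalent to the original system. They form the basis of the numerical scheme described below.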
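The eigenvalues quoted above can be checked numerically. The sketch below writes the jacobian of the flux with respect to the conserved variables in its standard closed form for a perfect gas, which is not printed in the text and is therefore an assumption of this example, along with the choice $\gamma = 1.4$ and the sample state. It then evaluates $|A - \lambda I|$ at $u - a$, $u$ and $u + a$.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats


def euler_jacobian(u, H, gamma=GAMMA):
    """Jacobian df/dw of the 1-D Euler flux, written in terms of u and H."""
    g1 = gamma - 1.0
    return [
        [0.0, 1.0, 0.0],
        [0.5 * (gamma - 3.0) * u * u, (3.0 - gamma) * u, g1],
        [u * (0.5 * g1 * u * u - H), H - g1 * u * u, gamma * u],
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def characteristic_determinant(A, lam):
    """Value of |A - lam I|, which vanishes when lam is an eigenvalue."""
    shifted = [[A[r][c] - (lam if r == c else 0.0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)


rho, u, p = 1.2, 0.8, 1.0                      # arbitrary sample state
a = math.sqrt(GAMMA * p / rho)                 # speed of sound
H = GAMMA * p / ((GAMMA - 1.0) * rho) + 0.5 * u * u
A = euler_jacobian(u, H)
residuals = [characteristic_determinant(A, lam) for lam in (u - a, u, u + a)]
```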
11.4 A characteristic box scheme
Consider the two boxes $[i-1, i]$, $[i, i+1]$ surrounding node $i$. The unknowns $w_i^n$ are located at the nodes. In each box, the discrete jacobian matrix, eigenvalues and eigenvectors are evaluated, using Roe averages [4]. Assuming that the flow is from left to right ($u^n > 0$), the following four cases occur:
0 u" i - a" , > 0 u" i - a" , > 0 "2
' 2
i+2
i+2
1—
2 ^
+ 1
A/ /(3)
H)
M"
11 - a " 1 > 0
1 —
2
u" 1 l+— 2
a n
.i+— 1 <0 2
- ^ , ^ - ^ l Ax
=0
r
\
/ 0
w, n+I -w;in
0
en _ en | Ji+l Ji
At
en , Ji
Ax
en Ji-l
+
Ax
At
I—
= 0
Ax
2
0
z'.(3)1
)
V
I —
.
in) un i - a" ! < 0 i— 2
B+1
0
w" ! - a" j > 0
i— 2
■
0
-^ Af
I'H—
2
„
2_
,
in— 2
A W."w w -..." ^-i ' 2At Ax
;(2)
i+±
i-±
2
2
«t+l-y»?
^f?-f!Li
At
Ax
= 0
p\
iv) «" i - a " ! <0 un i - a " ! <0 I
!
2
2
2 0 0
('+-
J+-
2
2
0
+1
*? -»>? i&i-rr Af
/(2)
Ar
Ax
Ax
= 0
Z(3),
Note that, for the characteristic $\lambda^{(1)}$ which changes sign, the scheme is analogous to the above mixed scheme, in particular as concerns the shock and sonic point operators, hence, this scheme is conservative, computes the correct shock speed, and prevents expansion shocks by forcing a smooth transition at sonic condition. The time derivatives are obtained from the solution of a 3x3 system whose matrix is made of the left eigenvectors. At supersonic points, the matrix of left eigenvectors can be replaced by the unit matrix.

11.5 Use of discrete eigenvalues and eigenvectors
The above scheme can be written in an equivalent, but slightly different form, for the compatibility relations $CR^{(2)}$ and $CR^{(3)}$, by making use of the property of the Roe averages, as:
$$l^{(j)}_{i-\frac{1}{2}} \cdot \frac{f_i - f_{i-1}}{\Delta x} = \lambda^{(j)}_{i-\frac{1}{2}} \; l^{(j)}_{i-\frac{1}{2}} \cdot \frac{w_i - w_{i-1}}{\Delta x}$$
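The property of the Roe averages invoked here is that the flux difference across a box equals the Roe-averaged jacobian applied to the state difference, so that projecting on a left eigenvector replaces the flux difference by the eigenvalue times the projected state difference. The sketch below checks the underlying matrix identity for one pair of states. The square-root-of-density weighting is the standard Roe average; the closed form of the jacobian, the value $\gamma = 1.4$ and the two sample states are assumptions of this example, not taken from the text.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats


def total_enthalpy(rho, u, p):
    return GAMMA * p / ((GAMMA - 1.0) * rho) + 0.5 * u * u


def conserved(rho, u, p):
    """State vector w = (rho, rho u, rho E)."""
    E = p / ((GAMMA - 1.0) * rho) + 0.5 * u * u
    return [rho, rho * u, rho * E]


def flux(rho, u, p):
    """Flux vector f = (rho u, rho u^2 + p, rho u H)."""
    return [rho * u, rho * u * u + p, rho * u * total_enthalpy(rho, u, p)]


def roe_jacobian(left, right):
    """Jacobian evaluated at the Roe-averaged velocity and total enthalpy."""
    sl, sr = math.sqrt(left[0]), math.sqrt(right[0])
    u = (sl * left[1] + sr * right[1]) / (sl + sr)
    H = (sl * total_enthalpy(*left) + sr * total_enthalpy(*right)) / (sl + sr)
    g1 = GAMMA - 1.0
    return [
        [0.0, 1.0, 0.0],
        [0.5 * (GAMMA - 3.0) * u * u, (3.0 - GAMMA) * u, g1],
        [u * (0.5 * g1 * u * u - H), H - g1 * u * u, GAMMA * u],
    ]


left, right = (1.0, 0.75, 1.0), (0.125, 0.0, 0.1)   # (rho, u, p) at i-1 and i
dw = [r - l for l, r in zip(conserved(*left), conserved(*right))]
df = [r - l for l, r in zip(flux(*left), flux(*right))]
A = roe_jacobian(left, right)
# Residual of A (w_i - w_{i-1}) = f_i - f_{i-1}; it should vanish to round-off.
residual = [sum(A[r][c] * dw[c] for c in range(3)) - df[r] for r in range(3)]
```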
For the sonic point operator, the eigenvalues and left eigenvectors are evaluated at the nodal state w". This insures that the scheme is not in conservation form. At the shock point, there are four characteristics flowing to point ;', with two compatibility relations CRW. The situation is identical to that of Burgers model and is treated identically, by adding the space derivatives from the two surrounding boxes, thus producing a non-consistent but conservative scheme. However, a choice for the eigenvector, denoted l,m in the above formula, needs to be made. It is proposed to carry the analogy with Burgers equation one step further, so that the Euler equations can be linearized in a similar manner as was done for the simpler model, with two wave speeds as fundamental feature. The shock point equation can be rewritten as: ,.n+l W; At
...n W:
n - + A(l)
wi+l-wt t+— Ax .. 1 A„ 2
/,(1)
,„n
i-l (1> .__}_
Ax 0
2
Ax "2 '"I One can choose the vector I such that the last two terms in the above equation cancel out. The search for such a vector is made easier by looking for the value 6 of the linear combination: i+2
i+«+2 (i)
Ax
j«=ez
efi\u
i-i"V). w '" w '-'f(i-e)/",.(A
J - A W / ) . ^ ^ W,i
=_ 0
,-,-Ax ,__ , + ,-+_ Ax i+With this choice, one can rewrite the four operators scheme for compatibility relation CRm where the switches have been omitted for compactness, in an updating form, as:
0 /(1>,.wf+1=/« w,f-^Ma) jp.wTM®
Af
w ? _ £ L A a > (w»
Ax ,+-L 2 Af 15 2Ax 1+ ^(A(D Ax ,+i
4 «
m)/P. W f +1 =Zp.
.V) / « wn+\ = fX\.
wf-i) _ w » ) _ £ Af LAa)
(wf
_wf
}
Ax j - i 2 ■W:
A0)}
,-_.L
f-^Vv^-wf)
Vf,
For compatibility relations $CR^{(2)}$ and $CR^{(3)}$ the scheme is identical to case i). This is similar to the mixed scheme for Burgers equation, hence the stability conditions are identical. Because this is a system, the left eigenvectors play the role of projection operator for the unknown vector $w$. The scheme is $L_\infty$-stable and has been found in numerical experiments to be very robust. There is no free parameter to adjust. At steady-state, the scheme is second order accurate.
11.6 Some results
The shock tube test case of Sod is simulated on a mesh with 1,001 points, at a CFL number of unity, for 500 steps. The results are presented in Fig. 1 and are in good agreement with the exact solution. The choked flow in a converging-diverging nozzle of equation $g(x) = 1 + 4(x - 0.5)^2$ with an exit pressure pexil = yp", yields a compression shock in the diverging part. The steady-state solution is compared with the Roe scheme. It can be seen in Fig. 2 that the Roe scheme, without the artificial viscosity at the sonic point, does not give a smooth transition. The last example corresponds to the physical problem of the start-up of a supersonic wind tunnel. The high pressure reservoir is connected to the atmospheric exhaust by a double throated duct. The forward converging-diverging nozzle has a unit throat area and reaches a maximum area at the test section, $A_{test} = 1.25$. The converging-diverging diffuser nozzle has a variable throat which depends on a parameter $a$. As $a$ decreases, the second throat increases. The equations of the geometry are given by:
$$g(x) = \frac{1}{4}(2x - 1)^4 - \frac{1}{2}(2x - 1)^2 + \frac{5}{4}, \qquad 0 < x < 0.5$$
$$g(x) = a\left\{\frac{1}{4}(2x - 1)^4 - \frac{1}{2}(2x - 1)^2\right\} + \frac{5}{4}, \qquad 0.5 < x$$
In order for the test section to be shock-free, the first shock must be "swallowed" by the second throat. This requires the second throat to satisfy an inequality given in [6]. Here, this corresponds to $A_2/A_1 > 1.1169$.
To simulate the start-up, the initial condition corresponds to air at rest, with the left boundary point ($i = 1$) representing the reservoir condition, the rest of the duct being at atmospheric pressure. In dimensionless form: $u_1 = 0$
P\
y+1
M,=0
2y Pi=—-pexit, y+1
. „ . i = 2,...,ix
Pi = Pexit
P\
$\gamma p_{exit} = 0.9$. The mesh has 101 points. The code is run with $A_2/A_1 = 1.117$ for 30,000 iterations with a CFL number of unity. The wind tunnel geometry is shown in Fig. 3. The evolution of pressure with time is plotted in Fig. 4 every 1,000 iterations. It can be seen that the first shock takes a long time to be "swallowed" by the second throat. The shock speed reaches a minimum just slightly above zero at the test section. Once in the converging part of the diffuser nozzle, the first shock moves quickly to the right of the second throat and coalesces with the second shock to form a single stronger shock as the flow reaches steady-state.
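The second-throat limit of 1.1169 quoted above can be recovered from standard quasi-one-dimensional gas dynamics. The sketch below is a check under stated assumptions and not the author's derivation: it takes $\gamma = 1.4$, a test section to first throat area ratio of 1.25, and the usual starting argument that the second throat must pass the flow behind a normal shock standing in the test section, so that the minimum throat ratio is the inverse of the total pressure ratio across that shock. The function names are hypothetical.

```python
GAMMA = 1.4  # assumed ratio of specific heats


def area_ratio(mach, gamma=GAMMA):
    """Isentropic area ratio A/A* as a function of Mach number."""
    g1, gp = gamma - 1.0, gamma + 1.0
    return (1.0 / mach) * ((2.0 / gp) * (1.0 + 0.5 * g1 * mach * mach)) ** (gp / (2.0 * g1))


def supersonic_mach(target, gamma=GAMMA):
    """Supersonic root of area_ratio(M) = target, found by bisection on [1, 10]."""
    lo, hi = 1.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if area_ratio(mid, gamma) < target:
            lo = mid   # A/A* grows with M on the supersonic branch
        else:
            hi = mid
    return 0.5 * (lo + hi)


def total_pressure_ratio(mach, gamma=GAMMA):
    """Total pressure ratio p02/p01 across a normal shock at the given Mach number."""
    g1, gp = gamma - 1.0, gamma + 1.0
    m2 = mach * mach
    return ((gp * m2) / (g1 * m2 + 2.0)) ** (gamma / g1) * (gp / (2.0 * gamma * m2 - g1)) ** (1.0 / g1)


m_test = supersonic_mach(1.25)                      # test section Mach number
min_second_throat = 1.0 / total_pressure_ratio(m_test)
```

Under these assumptions the test section Mach number is close to 1.6 and the minimum throat ratio agrees with the quoted value to the digits given, which is why the run at 1.117 sits only marginally above the limit and the shock takes so long to be swallowed.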
11.7 Conclusion It is only fair for the author to acknowledge the legacy of E. M. Murman to the field of computational fluid dynamics and to his personal cursus. The author started his professional career in 1971 at ONERA, France, and was in charge of developing a transonic code based on the new and original contribution of Murman. If anything, this paper attempted to show that, with some further analysis, an existing Euler scheme could be cast into a simpler and purer form, and attain a higher level of beauty: the beauty of the original.
REFERENCES
1. Murman, E.M. & Cole, J.D., Calculation of Plane Steady Transonic Flows, AIAA Journal, Vol. 9, No 1, 1971.
2. Murman, E.M., Analysis of Embedded Shock Waves Calculated by Relaxation Methods, AIAA Journal, Vol. 12, No 5, 1974, pp. 626-633.
3. Chattot, J.J., Box Schemes for First Order Partial Differential Equations, Advances in Computational Fluid Mechanics, Gordon Breach Publishers, 1995, pp. 307-331.
4. Chattot, J.J. & Malet, S., A "Box-Scheme" for the Euler Equations, Lecture Notes in Mathematics, 1270, 1987, pp. 82-102.
5. Roe, P.L., Approximate Riemann Solvers, Parameter Vectors and Difference Schemes, Journal of Computational Physics, Vol. 43, No 2, 1981, pp. 357-372.
6. Zucrow, M.J. & Hoffman, J.D., Gas Dynamics, Vol. 1, John Wiley & Sons, 1976.
Figure 1. Shock tube-Sod test case. Velocity, density, pressure and specific energy distributions
Figure 2. Choked flow in a converging-diverging nozzle. Comparison of the pressure distributions using Roe scheme and the box scheme.
Figure 3. Supersonic wind tunnel area distribution.
Figure 4. Supersonic wind tunnel start-up. Evolution of pressure with iterations.
12 Autoblocking for Wings with Split and Hinged Flaps D. Scott Eberhardt 1 & Pratomo Wibowo 1
Abstract
A technique for generating structured meshes for wings with finite-span flaps is presented. The method consists of four steps. They are: selecting fronts, determining the degree-of-guidedness, determining front-advancement priority, and then advancing the front. A hyperbolic predictor and elliptic corrector are used to advance the fronts. The final block grid is both $C^0$ and $C^1$ continuous. Fronts are automatically created where needed and merged where not needed. Front connectivity defines a parameter called degree-of-guidedness, or D.O.G., which determines the topology of the grid that will be generated. Individual fronts are advanced in an order specified by their surface topology. The method is used to generate structured meshes around wings with partial span split and hinged flaps. The examples shown are model flaps on a North American P-51 Mustang wing. The mesh generation steps are shown as well as the final mesh. The meshes generated are for illustrative purposes, and are not intended to be used in a flow solver.
1
University of Washington, Box 352400, Seattle, Washington 98195-2400 Frontiers of Computational Fluid Dynamics - 1998 Editors: David A. Caughey & Mohamed M. Hafez ©1998 World Scientific
EBERHARDT & WIBOWO
12.1
Introduction
Grid generation is an essential and critical task for successful application of CFD in engineering process. Grid generation has been, and still is, a labor-intensive and expertise-requiring task, especially for complicated configurations. Configurations for which grid generation requires an enormous amount of human time and labor can easily be found: automobiles, underwater vehicles, and airplanes. One example of further complication is that modern flying vehicles use high-lift devices, such as slats and multiple flaps, during take-off and landing. Flow fields around such complicated configurations are generally decomposed into several relatively simple sub-domains. Once the sub-domains are identified, it is relatively simple and straightforward to fill each of them with structured meshes. A sub-domain may either share a common interface with immediately adjacent sub-domains or be overlaid with another. The first strategy of using sub-domains with common interfaces is the subject of this research, and is called a multi-block grid approach. Decomposition of physical flow fields into blocks, so-called block generation, is the most challenging task in the process of multi-block grid generation. Development of an automatic block generation algorithm is one of several highly-pursued topics in the field of grid generation ([1] - [11]). Many existing block generation schemes generate block boundaries in their topological form. In other words, once the user is given those block boundaries, he/she should distribute grid points along those block boundaries and generate internal grid points for each block to obtain the final multi-block grid. One cannot ignore the push towards unstructured meshes, which solve many of the grid generation challenges. The authors acknowledge the advantages of unstructured grids, but have been stubborn in our reluctance to leave the familiarity of structured meshes. 
The method outlined in this paper generates blocks as part of the mesh generation process. The surface mesh is used as input and is separated into "fronts." A description of how fronts are generated is found in [12] and [13]. The paper then details a concept called degree-of-guidedness; although this concept is used in [12] and [13], it is not clearly described there. Advancement priority is discussed, and then the predictor-corrector algorithm used is briefly presented. Finally, some results generated about the North American P-51 Mustang are presented. Note that the approach presented here is part of a step-by-step process to generate meshes about realistic, multi-element wings. The simplest flap systems are attacked first in order to resolve technical problems one step at a time. At this point, the scheme is successful in generating meshes about split flaps and hinged flaps of infinitesimal thickness.
AUTOBLOCKING FOR WINGS
Figure 1 How fronts are generated to maintain simple edge connectivity.
12.2 Method
There are four important elements used in the grid generation process. They consist of selecting fronts, determining the degree-of-guidedness (D. O. G.) value, determining the front advancement priority, and then advancing the front. A front is a surface of the mesh that is to be advanced outward, similar to a hyperbolic mesh generation scheme. For a wing with flaps, fronts typically are assigned to the upper and lower surfaces of the wing, the upper and lower surfaces of the flap, wing tips and surfaces in the wing/flap gap, should one exist. An important detail in front advancement is that all fronts must connect at corners. This means that fronts, or blocks, may have to be split into multiple fronts to satisfy this condition, as shown in Fig. 1. This process of generating new fronts is automatic, requiring no user interaction. A more complete discussion of this can be found in [12].
12.2.1 Degree of Guidedness
Fronts can be advanced using either "free advancement" or "guided advancement." The choice of method depends on the D. O. G. value. Free advancement implies that a surface is advanced outward without regard to neighboring fronts (except for ensuring that neighboring fronts take compatible advancement steps). A simple example of free advancement is the wing surface of an un-flapped, infinite-span wing. Guided advancement means that the advancement of one front is explicitly dependent on another front. An example of this is an interior, right-angle corner as illustrated in Fig. 2. A standard hyperbolic mesh generator would attempt to make both surfaces the same grid line, which is unnatural.
Figure 2 Interior corner and "natural" grid.
A major innovation of [13] was developing the concept of degree-of-guidedness. In this work, we have refined the concept and filled in a few missing holes. Here, we will attempt to describe what a D. O. G. value is. Each front surface has four edges that are adjacent to neighbors (or a boundary). Consider, for example, a simple wing. The upper and lower surface fronts meet at the leading edge and the trailing edge. The leading edge is round, so the two fronts meet with zero angle. The two fronts can grow using free advancement, but must remain connected. The trailing edge meets with a very large angle (close to 180 degrees). These fronts should be advanced independently (options will be discussed below). The current scheme uses a default D. O. G. value depending on the angle at which the surface edges meet. Table 1 shows the D. O. G. values, when they are implemented, and which figure illustrates the front advancement method chosen. Free advancement D. O. G. values of -1 and -2 need a little more explanation. For a D. O. G. value of -1, if one front is advanced several steps,
Table 1 Illustration of D. O. G. values

D. O. G.   Default angles     Advancement type      Figure
   0       -45° to 45°        Free advancement      3
  -1       45° to 135°        Free advancement      4
  -2       135° to 180°       Free advancement      5
  -3       no neighbor        Free advancement      6
   1       -135° to -45°      Guided advancement    7
   2       -180° to -135°     Guided advancement    8
Figure 3 Degree of Guidedness = 0: (a) Initial grid blocks with angle between two fronts of 20 degrees; (b) Result after three grid advancements; (c) Final result of the grid advancement.
Figure 4 Degree of Guidedness = -1: (a) Initial grid blocks with angle between two fronts of 120 degrees; (b) Advancement of the grid for the first front; (c) Final result of the grid advancement.
its side will create an extension of the second front, as shown in Fig. 4b. When this front is advanced, the "corner" is filled in. A D. O. G. value of -2 is what one might expect on the trailing edge of a wing. The two blocks grow independently, as shown in Fig. 5b. A new front is then created by the sides of the new blocks, which is then advanced as shown in Fig. 5c. This D. O. G. value is used to handle wake regions behind the wing. A D. O. G. value of 2 is unique in that a singular line results. This D. O. G. value was introduced to fill the region between a split flap and the wing. Because the two fronts essentially collapse in a small angle, the most natural grid is a grid radiating from the singularity. A user can override the default values. For example, in Fig. 9 the user may want the grid to wrap around the sharp corner. In Fig. 9b, the default D. O. G. value of -1 is shown. The user can specify a D. O. G. value of zero to obtain
Figure 5 Degree of Guidedness = -2: (a) Initial grid blocks with angle between two fronts of 150 degrees; (b) Each grid front advances independently; (c) Final result of the grid advancement.
Figure 6 Degree of Guidedness = -3: (a) Initial grid front without any grid blocks connected to its sides; (b) Result of the grid after five advancements.
Figure 7 Degree of Guidedness = 1: (a) Initial grid blocks with angle between two fronts of -80 degrees; (b) Grid after two advancements; (c) Final result of the grid advancement.
Figure 8 Degree of Guidedness = 2: (a) Initial grid blocks with angle between two fronts of 150 degrees; (b) Grid after three advancements; (c) Final result of grid advancement.
Figure 9 Imposing a Degree of Guidedness value other than the default value: (a) Initial grid with angle between two fronts of 120 degrees; (b) Result of the grid advancement using the default value (D. O. G. = -1); (c) Result of the grid advancement using the imposed value.
the grid shown in Fig. 9c.
12.2.2 Advancement Priority
If fronts are advanced in an arbitrary order, problems can arise. During this research it was determined that front advancement can be prioritized, and this can be done using default priorities. A cavity, for example, should be filled before other fronts begin advancing. Therefore, as a rule, concave surfaces advance first. This would include fronts with D. O. G. values of 2 and those with four sides with D. O. G. values of 1. Guided fronts have the highest priority after the concave surfaces described above. Among these, the priority for advancement goes to the front with the greatest number of guided edges. Finally, once guided front advancement is completed, free advancement can begin.
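The default D. O. G. assignment of Table 1 and the priority rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the names `default_dog`, `Front`, `priority_key` and `advancement_order` are hypothetical, the treatment of angles that fall exactly on an interval boundary is an assumption, and "four sides with D. O. G. values of 1" is read here as all four edges being guided with value 1.

```python
from dataclasses import dataclass


def default_dog(angle_deg):
    """Default D. O. G. value for one edge, following Table 1.

    angle_deg is the angle between the two fronts meeting at the edge,
    in degrees, or None when the edge has no neighbor. Which interval
    owns a boundary angle (e.g. exactly 45 degrees) is an assumption.
    """
    if angle_deg is None:
        return -3
    if -45.0 <= angle_deg < 45.0:
        return 0
    if 45.0 <= angle_deg < 135.0:
        return -1
    if 135.0 <= angle_deg <= 180.0:
        return -2
    if -135.0 <= angle_deg < -45.0:
        return 1
    if -180.0 <= angle_deg < -135.0:
        return 2
    raise ValueError("angle must lie in [-180, 180] degrees")


@dataclass
class Front:
    name: str
    edge_dogs: list  # four D. O. G. values, one per edge of the front


def priority_key(front):
    """Smaller keys advance first: concave, then guided, then free."""
    guided_edges = sum(1 for d in front.edge_dogs if d > 0)
    concave = 2 in front.edge_dogs or all(d == 1 for d in front.edge_dogs)
    if concave:
        tier = 0
    elif guided_edges > 0:
        tier = 1
    else:
        tier = 2
    # Within a tier, fronts with more guided edges go first.
    return (tier, -guided_edges)


def advancement_order(fronts):
    return sorted(fronts, key=priority_key)


# Hypothetical fronts, loosely modeled on a wing with a split flap.
fronts = [
    Front("upper", [0, -2, -3, -3]),   # free advancement only
    Front("side", [1, 0, 0, -3]),      # one guided edge
    Front("gap", [2, 1, -3, -3]),      # singular edge: fill first
    Front("void", [1, 1, 1, 0]),       # three guided edges
]
order = [f.name for f in advancement_order(fronts)]
print(order)
```

A user override, as in Fig. 9, would simply replace an entry of `edge_dogs` before the fronts are sorted.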
Fronts can be advanced one step or multiple steps. If one is advancing fronts on a simple wing, it might make sense to advance the upper surface one step and then the lower surface, instead of advancing the upper surface to the outer boundary and then advancing the lower surface. In a cavity, however, it makes sense to fill in the entire cavity before advancing surrounding fronts. In some cases, it is obvious what the choice is, and the scheme will choose. But, sometimes this step is not clear, and the user must intervene. Such cases can occur at wing tips. It is sometimes best to advance the wing a few steps before allowing the wing tip front to advance. This is an area that still requires work to automate.
12.2.3 Front Advancement
The front advancement algorithm uses a combination of the Modified Advancing Front Method as a Predictor with an elliptic scheme as a corrector (MAP scheme). Kim introduced the MAP scheme for 2-D and used it for multi-block grid generation around 2-D configurations [12]. The MAP scheme advances a collection of surface grid blocks by one cell height at a time. As the name implies, the MAP scheme is a predictor-corrector approach. A front is advanced by the predictor and smoothed by the corrector. Each front goes through this predictor-corrector step until the flow field of interest is filled with multi-block grids. Predictor step: This advances a front using the Modified Advancing Front Method. The Advancing Front Method (AFM) was originally developed for unstructured, triangular meshes ([14]-[16]). In the present research, however, this method has been modified for hexahedral mesh generation. The Modified AFM not only enables simultaneous generation of a collection of hexahedral cells, but also adjusts the distance and direction of advancement of each grid point according to the surrounding situation. It first interpolates the distance and direction of advancement for each internal point of a front from those values along the boundary of the front. Then, all the points on the front are advanced, resulting in a new front, and the same number of hexahedral cells as that of quadrilateral cells along the old front are obtained. In providing a surface grid as initial fronts, the $(i, j)$ indexing of each surface block should be ordered in a consistent manner so that its outward direction can be readily identified; e.g., using the right-hand rule to have the third $k$-axis as the outward direction ($e_i \times e_j = e_k$). The new front, however, usually carries the non-smoothness, if any, of the old front, and magnifies it, making further advancement impossible or meaningless, especially for concave regions. This situation is avoided by adopting a corrector step.
Corrector step: In the corrector step, Laplace's equations are used as the elliptic equations by following Cordova's approach [17]. To apply the elliptic corrector, an image of the old front with respect to the new front is first
introduced. Using these 3 fronts, the 3-D Laplace equation is solved along the new front with the other two as fixed boundary conditions. As a solution to the elliptic equations, a smoother distribution of grid points for the new front is obtained.
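The predictor-corrector idea can be illustrated with a deliberately simplified 2-D analog, in which a front is a polyline rather than a surface. This is a sketch under stated assumptions and not the MAP scheme itself: the predictor below marches every point a fixed distance along its local normal (the Modified AFM instead interpolates distance and direction from the front boundary), and the corrector relaxes a five-point discrete Laplace equation on the new front with the old front and its image held fixed. All function names are hypothetical.

```python
import math


def predictor(front, step):
    """Advance each point of a polyline front along its local normal."""
    n = len(front)
    new = []
    for i, (x, y) in enumerate(front):
        xa, ya = front[max(i - 1, 0)]
        xb, yb = front[min(i + 1, n - 1)]
        tx, ty = xb - xa, yb - ya            # local tangent
        norm = math.hypot(tx, ty)
        nx, ny = -ty / norm, tx / norm       # normal to the left of the tangent
        new.append((x + step * nx, y + step * ny))
    return new


def corrector(old, new, sweeps=50):
    """Smooth the new front with Jacobi sweeps of a discrete Laplace equation.

    The image front is the reflection of the old front about the predicted
    new front. Old front, image front and the two end points stay fixed.
    """
    image = [(2 * xn - xo, 2 * yn - yo) for (xo, yo), (xn, yn) in zip(old, new)]
    cur = list(new)
    for _ in range(sweeps):
        nxt = list(cur)
        for i in range(1, len(cur) - 1):
            nxt[i] = tuple(
                0.25 * (cur[i - 1][k] + cur[i + 1][k] + old[i][k] + image[i][k])
                for k in range(2)
            )
        cur = nxt
    return cur


def roughness(front):
    """Sum of squared second differences, a crude smoothness measure."""
    return sum(
        (front[i - 1][k] - 2 * front[i][k] + front[i + 1][k]) ** 2
        for i in range(1, len(front) - 1)
        for k in range(2)
    )


# A concave (V-shaped) front: the predictor carries the kink forward,
# the corrector reduces it.
old_front = [(float(x), abs(float(x))) for x in range(-2, 3)]
predicted = predictor(old_front, step=0.5)
corrected = corrector(old_front, predicted)
print(roughness(predicted), roughness(corrected))
```

For a straight front the corrector leaves the predicted points unchanged, since each point is already the average of its four neighbors; only non-smooth regions such as the concave kink are modified.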
12.3 Results for Flapped Wings
The test problem selected is a little unusual from the standpoint of typical CFD activity. It reflects the authors' interest in classic and historical airplanes as well as support for a student project to design and build a home-built airplane. The wing is that from a North American P-51 Mustang. Although the wing is the original geometry, the flaps are not. The motivation was to design a new wing that gives a more docile behavior for the recreational pilot, while retaining the looks of a P-51 Mustang. The design was supposed to seat two and be 75% scale to the original.
12.3.1 Split Flap
The first example is a split flap configuration shown in Fig. 10a. In Fig. 10b the wing section is cut out on the right side to show the location of the split flap. The flap has two fronts, a top and a bottom, although it is infinitesimally thin. The lower wing surface front and the upper flap front connect on an edge that has a D. O. G. value of 2. Therefore, the gap between the wing and the flap will be filled in first, as shown in Fig. 10c. The sides of the new filled-in gap blocks form new fronts. The region between the two flaps will be advanced next, as shown in Fig. 10d. Next, fronts will advance to the wing tip, as shown in Fig. 10e, which creates a smooth grid from wing tip to wing tip. The next step is to advance the wing's upper and lower surfaces, which now include the lower surface of the tip-to-tip front created just before, as shown in Fig. 10f. Note that these fronts are advanced to the outer boundary before the tips and wake region are advanced. This was a result of user input. The tip region is generated next, Fig. 10g, and finally the wake for the final mesh shown in Fig. 10h. Although the mesh is too coarse for practical use, it illustrates the main features of the grid advancement. It also shows that the final mesh is both $C^0$ and $C^1$ continuous.
12.3.2 Hinged Flaps
The next step in creating meshes on realistic wings is to introduce hinged flaps. At this time, the hinged flap is assumed to be of infinitesimal thickness, like the split flap. The surface grid is shown in Fig. 11a. The automatic generator
Figure 10 North American P-51 Mustang wing with split flap showing stages of grid generation.
Figure 11 P-51 wing with hinged, infinitesimally thin flap.
chooses to fill in the wing void first, as shown in Fig. 11b, because it has the front with the most guided edges. Then, the problem is reduced to the split flap case of the previous section.
12.4 Conclusions
Many technical issues exist for generating structured meshes around multi-element wings. An attempt has been made to solve some of these technical issues by starting with the simple wing/flap systems: the split and the hinged flap. A method is introduced that uses advancing fronts as dictated by the D. O. G. value. Next, the priority of front advancement is given. Finally, a brief description of the MAP scheme was presented. The generation of a mesh about a P-51 wing with split and hinged flaps was shown. Although the meshes are not intended to be used in a flow solver, they show the general process. The next step in producing grids for more realistic wing/flap configurations is to introduce hinged flaps of finite thickness. This work is almost completed at this time. After that step, slotted flaps will be introduced. The technique suggested for handling slotted flaps is to use the idea of "bridging" as described in [12] and [13], which would then reduce the problem to that of the previous case.
REFERENCES
[1.] Schonfeld, T. and Weinerfelt, P., The Automatic Generation of Quadrilateral Multi-Block Grids by the Advancing Front Technique, Proc. 3rd Int'l. Conf. on Numerical Grid Generation in CFD, Barcelona, June 3-7, 1991.
[2.] Stewart, M. E. M., Domain-Decomposition Algorithm Applied to Multielement Airfoil Grids, AIAA Journal, Vol. 30, No. 6, pp. 1457-1461, June 1992.
[3.] Andrews, A. E., Knowledge-Based Flow Field Zoning, Numerical Grid Generation in Computational Fluid Dynamics '88, Pineridge Press, pp. 13-22, 1988.
[4.] Weatherill, N. P. and Shaw, J. A., Component Adaptive Grid Generation for Aircraft Configurations, AGARD AG-309, pp. 29-39, 1988.
[5.] Shaw, J. A., Georgala, J. M., and Weatherill, N. P., The Construction of Component-Adaptive Grids for Aerodynamic Geometries, Numerical Grid Generation in Computational Fluid Dynamics '88, Pineridge Press, pp. 383-394, 1988.
[6.] Georgala, J. M. and Shaw, J. A., A Discussion on Issues Relating to Multiblock Grid Generation, AGARD CP-464, 1990.
[7.] Allwright, S. E., Techniques in Multiblock Domain Decomposition and Surface Grid Generation, Numerical Grid Generation in Computational Fluid Dynamics '88, Pineridge Press, pp. 559-568, 1988.
[8.] Allwright, S. E., Multiblock Topology Specification and Grid Generation for Complete Aircraft Configurations, AGARD CP-464, 1990.
[9.] Dannenhoffer, J. F. III, A Block-Structuring Technique for General Geometries, AIAA Paper 91-0145, Reno, January 1991.
[10.] Dannenhoffer, J. F. III, A New Method for Creating Grid Abstractions for Complex Configurations, AIAA Paper 93-0428, Reno, January 1993.
[11.] Noble, S. S. and Cordova, J. Q., Blocking Algorithms for Structured Mesh Generation, AIAA Paper 92-0659, 1992.
[12.] Kim, B. and Eberhardt, S., Automatic Multi-Block Grid Generation for High-Lift Configuration Wings, NASA CP-3291, NASA Workshop on Surface Modeling, Grid Generation, and Related Issues in CFD Solutions, Cleveland, Ohio, May 1995.
[13.] Kim, B., Automatic Multi-Block Grid Generation about Complex Geometries, Ph.D. Dissertation, University of Washington, 1994.
[14.] Lo, S. H., A New Mesh Generation Scheme for Arbitrary Planar Domains, Int'l. J. Num. Meth. Engr., Vol. 21, pp. 1403-1426, 1985.
[15.] Peraire, J. et al., Adaptive Remeshing for Compressible Flow Computations, J. Comp. Phys., Vol. 72, pp. 449-466, 1987.
[16.] Lohner, R., and Parikh, P., Generation of Three-Dimensional Unstructured Grids by the Advancing Front Method, AIAA Paper 88-0515, Reno, January 1988.
[17.] Cordova, J. Q., Advances in Hyperbolic Grid Generation, 4th Int'l. Symp. on Comp. Fluid Dynamics, Davis, California, September 1991.
13 Local Preconditioning: Manipulating Mother Nature to Fool Father Time
David L. Darmofal1 & Bram van Leer2
13.1 Introduction
A common strategy for solving steady equations is to march the associated unsteady equations in time until the solution converges to a stationary result. In the case of the compressible Euler equations, this approach has the important advantage of transforming the problem from a mixed hyperbolic-elliptic problem in the steady state to a strictly hyperbolic problem in the transient stages. Unfortunately, this approach also introduces any stiffness due to the unsteady equations into the convergence process for the steady equations. For example, in a nearly incompressible flow, the speed of sound is significantly faster than the local flow speed. As a result, acoustic disturbances propagate much faster than convective disturbances (such as entropy and vorticity), which only travel with the local velocity. For explicit time-marching codes, this stiffness from disparate propagation speeds can significantly slow convergence to a steady state: the fast modes set the maximum allowable timestep while the slow modes set the number of iterations needed for a disturbance to convect out of the computational domain. In order to accelerate convergence for time-marching methods, one may
1 Aerospace Engineering, Texas A & M University, College Station, TX 77843-3141
2 Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109-2140
introduce a preconditioner into the unsteady equations,
$$\frac{\partial u}{\partial t} + P(u)\, r(u) = 0,$$
where u represents the state vector, r is the spatial residual which the calculation is attempting to drive to zero, and P is the preconditioner. In this review, we will concentrate on local preconditioners for the two-dimensional Euler equations. Local preconditioners are evaluated using purely local information from the cell or node where the spatial residual will be preconditioned. While the preconditioner modifies the unsteady equations, the steady solution of the analytic equations is unmodified assuming P is a positive definite matrix. As we will discuss, discrete implementation of a preconditioner often does require a subtle modification to the spatial residual; fortunately, this modification usually has the benefit of increasing the accuracy of the discretization. In this paper, we will detail the process of designing a local preconditioner for the 2-D Euler equations. While the examples will be specific to these equations, the basic design considerations and analysis techniques are applicable to many other systems of equations. Then, we will provide examples of what has been achieved with local preconditioning in practice. Finally, we will indicate remaining areas that still need development if local preconditioning is to achieve its full potential.
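The effect of a preconditioner on time marching can be seen in a toy problem that has nothing to do with the Euler equations beyond sharing their disparity of time scales. The sketch below assumes a linear residual r(u) = Au - b with a diagonal A whose entries play the role of a slow and a fast mode, and marches with explicit Euler steps; the function name `march` and all numerical values are illustrative choices, not taken from the chapter.

```python
def march(a_diag, b, p_diag, dt, tol=1e-10, max_iter=100000):
    """Explicit pseudo-time marching of du/dt + P r(u) = 0, r(u) = A u - b.

    A and P are diagonal and stored as lists. Returns the final state and
    the number of updates performed before the residual dropped below tol.
    """
    u = [0.0] * len(b)
    for it in range(max_iter):
        r = [a * ui - bi for a, ui, bi in zip(a_diag, u, b)]
        if max(abs(ri) for ri in r) < tol:
            return u, it
        u = [ui - dt * p * ri for ui, p, ri in zip(u, p_diag, r)]
    return u, max_iter


a_diag = [1.0, 100.0]      # slow and fast "wave speeds"
b = [1.0, 100.0]           # steady solution is u = [1, 1]

# Without preconditioning the fast mode limits the time step (dt ~ 1/100),
# and the slow mode then needs thousands of steps to converge.
u_plain, n_plain = march(a_diag, b, p_diag=[1.0, 1.0], dt=0.01)

# A positive definite P that equalizes the two time scales allows dt = 1.
u_pre, n_pre = march(a_diag, b, p_diag=[1.0, 0.01], dt=1.0)

print(n_plain, n_pre)
```

Both runs reach the same steady state, because P is positive definite and therefore r(u) = 0 is the only stationary point; only the path in pseudo-time, and hence the iteration count, differs.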
13.2 Design considerations
The first attempt at local preconditioning can probably be associated with Chorin's[6] method of artificial compressibility for incompressible flows. Although Chorin did not view his method as local preconditioning, his approach adds a time derivative of pressure to the incompressible continuity equation which is equivalent to local preconditioning. Since this initial effort at modifying unsteady flow equations to calculate steady flows, the design of local preconditioners has been continually extended and refined. Presently, the design of a local preconditioner is a balance of widely different, sometimes competing criteria. In the following, we describe the various design considerations that one faces when deriving a local preconditioner for the 2-D Euler equations. The analysis of local preconditioners for the 2-D Euler equations is considerably simplified by utilizing the symmetrizing variables[35, 1] which are defined as,
$$dw = \left[ \frac{dp}{\rho c},\ dq,\ dr,\ dS \right]^T,$$
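where $dq$ and $dr$ are the changes in the streamwise and normal components of velocity, $dp$ is the change in the static pressure, $dS = dp - c^2\, d\rho$ is proportional to the change in entropy, $\rho$ is the density, and $c$ is the speed of sound. Using this basis, the preconditioned Euler equations may be written as,
$$\frac{\partial w}{\partial t} + P A \frac{\partial w}{\partial \xi} + P B \frac{\partial w}{\partial \eta} = 0, \qquad (13.1)$$
where
$$A = c \begin{bmatrix} M & 1 & 0 & 0 \\ 1 & M & 0 & 0 \\ 0 & 0 & M & 0 \\ 0 & 0 & 0 & M \end{bmatrix}, \qquad B = c \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$
with the Mach number, $M = q/c$, the streamwise and normal coordinates, $(\xi, \eta)$, and the local speed, $q$.
13.2.1 Essential requirements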
An essential requirement on a local preconditioner is that it should not reverse the propagation direction of any waves. A preconditioner that violates this requirement would result in the incorrect application of boundary conditions since incoming and outgoing waves would reverse roles. This criterion can be met by forcing P to be positive definite (although not necessarily symmetric). In this case, the preconditioned residual is a positively weighted sum of contributions, which does not reverse the time development. Although positive definiteness is an essential criterion, it does not place severe limitations on the preconditioner. The typical approach for assuring a positive definite preconditioner is to check it after satisfying other desirable constraints. When preconditioning the Euler equations, the result should also be a well-posed, hyperbolic set of equations. A necessary condition for well-posedness is that the energy of the system remains bounded as time advances. A sufficient but not necessary condition to guarantee bounded energy is to force the preconditioned system to be symmetrizable. The preconditioned system is symmetrizable if a symmetric, positive definite matrix, Q, exists such that,
and QPA and QPB are symmetric. If a Q exists which satisfies these requirements, then the system is easily shown to be stable in the norm, $w^T Q w$, ignoring the influence of boundary conditions. Similar to positive definiteness for P, symmetrizability is usually verified at the end of the preconditioner design process. A number of important hyperbolic systems of conservation
laws, including the Euler equations, are symmetrizable. Godunov[11] showed that this implies the existence of an entropy function satisfying an additional conservation law. The preconditioned Euler equations, however, have lost the conservation form, so Godunov's result is not applicable to these. Finally, the preconditioner should also satisfy certain smoothness conditions throughout all flow regimes, especially at the break point of the steady Euler equations, the sonic point (M = 1). In the development of preconditioners, the subsonic and supersonic regimes have to be treated separately because of the qualitative change in the propagation direction of the acoustic waves. A mismatch in the preconditioner branches across M = 1 could result in poor sonic point capturing abilities and stalled convergence.
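Since positive definiteness of P is usually checked after the fact, a small numerical test is handy. The sketch below assumes the usual definition for a possibly non-symmetric real matrix, namely $x^T P x > 0$ for every nonzero $x$, which holds exactly when the symmetric part $(P + P^T)/2$ admits a Cholesky factorization. The function name and the two example matrices are illustrative and are not preconditioners from this chapter.

```python
import math


def is_positive_definite(p):
    """True if x^T P x > 0 for all nonzero x (P real, not necessarily symmetric).

    Only the symmetric part of P contributes to x^T P x, so the test is a
    Cholesky factorization attempt on (P + P^T) / 2.
    """
    n = len(p)
    s = [[0.5 * (p[i][j] + p[j][i]) for j in range(n)] for i in range(n)]
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if acc <= 0.0:
                    return False      # non-positive pivot: not positive definite
                low[i][i] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return True


# Non-symmetric but positive definite: the symmetric part is 2 * I.
p_good = [[2.0, -1.0], [1.0, 2.0]]

# Both eigenvalues equal 1, yet x = (1, -1) gives x^T P x = -1.
p_bad = [[1.0, 3.0], [0.0, 1.0]]

print(is_positive_definite(p_good), is_positive_definite(p_bad))
```

The second matrix shows why inspecting the eigenvalues of a non-symmetric P alone is not enough for this definition.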
13.2.2 Equalization of long-wave time scales
As discussed in the Introduction, the Euler equations have an inherent stiffness at low Mach numbers resulting from the disparate propagation speeds of the convective and acoustic modes. Similar arguments may be made for compressible flow near sonic conditions. Local preconditioning can be used to equalize these propagation speeds or time scales. Furthermore, for viscous flows, dissipation time scales now enter and could also be accounted for in preconditioning. Note, the analysis and design process to equalize time scales is performed at the continuous, partial differential equation level. Since the discrete numerical scheme should correctly approximate long-wave behavior, we expect the long-wave properties of the preconditioner to be reproduced in the numerical solutions. However, short waves or high frequencies are generally not correctly represented by the discretization. Thus, the equalization of wavespeeds concerns long-wave time scales only. A first step in analyzing the preconditioned wave properties is to calculate the propagation speeds, $\lambda$, for plane wave solutions of the form $w(x, y, t) = w(x\cos\phi + y\sin\phi - \lambda t)$, where $\phi$ is the angle of the wave propagation direction relative to the streamwise direction. Substitution of plane waves into Equation 13.1 results in an eigenvalue problem for $\lambda$,
$$\left[ P \left( A\cos\phi + B\sin\phi \right) - \lambda I \right] r = 0,$$
where $r$ is the corresponding right eigenvector. Without preconditioning, the 2-D Euler equations have eigenvalues
$$\lambda_{1,4} = q\cos\phi \pm c, \qquad \lambda_{2,3} = q\cos\phi.$$
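The unpreconditioned wave speeds, $q\cos\phi \pm c$ for the acoustic waves and $q\cos\phi$ (twice) for the convective waves, can be checked directly against the matrices of Equation 13.1. The sketch below takes P as the identity and assumes that A and B carry a factor of the sound speed c, so that the eigenvalues are dimensional; the eigenvectors listed in the code were worked out by hand for this check and are not quoted from the chapter. Function names are hypothetical.

```python
import math


def euler_wave_matrix(mach, c, phi):
    """c * (A cos(phi) + B sin(phi)) in symmetrizing variables (assumed scaling)."""
    cx, sx = math.cos(phi), math.sin(phi)
    a = [[mach, 1, 0, 0], [1, mach, 0, 0], [0, 0, mach, 0], [0, 0, 0, mach]]
    b = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
    return [[c * (a[i][j] * cx + b[i][j] * sx) for j in range(4)] for i in range(4)]


def euler_eigenpairs(mach, c, phi):
    """Plane-wave speeds and right eigenvectors without preconditioning."""
    q = mach * c
    cx, sx = math.cos(phi), math.sin(phi)
    return [
        (q * cx + c, [1.0, cx, sx, 0.0]),     # downstream-running acoustic wave
        (q * cx, [0.0, -sx, cx, 0.0]),        # vorticity wave
        (q * cx, [0.0, 0.0, 0.0, 1.0]),       # entropy wave
        (q * cx - c, [-1.0, cx, sx, 0.0]),    # upstream-running acoustic wave
    ]


def eigen_residual(matrix, lam, vec):
    """Largest component of (matrix @ vec - lam * vec)."""
    return max(
        abs(sum(matrix[i][j] * vec[j] for j in range(4)) - lam * vec[i])
        for i in range(4)
    )


worst = 0.0
for mach in (0.1, 0.5, 0.9, 1.3):
    for k in range(12):
        phi = 2.0 * math.pi * k / 12
        m = euler_wave_matrix(mach, 340.0, phi)
        for lam, vec in euler_eigenpairs(mach, 340.0, phi):
            worst = max(worst, eigen_residual(m, lam, vec))
print(worst)
```

At $\phi = 90°$ the convective speeds vanish for every Mach number, which is the degeneracy that makes a stiffness measure based on phase speeds unusable.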
While preconditioning design can proceed directly from the eigenvalues, $\lambda_i$, a caveat must be given with this approach. Specifically, a typical measure of stiffness from wave speeds would be to use the ratio of maximum to minimum
magnitudes, i.e.,
$$\frac{\max_i \max_\phi |\lambda_i(\phi)|}{\min_i \min_\phi |\lambda_i(\phi)|}.$$
However, as is clear from the 2-D Euler equations, this approach is fruitless because $\lambda_{2,3} = 0$ whenever $\phi = 90°$, indicating infinite stiffness regardless of the local Mach number. Similar troubles exist when considering the preconditioned system. To remedy this trouble, Van Leer et al[40] use the envelope of plane waves passing through one point at some instant; this in fact is the wavefront emitted by a point disturbance. The point-disturbance envelope is given by,
$$\begin{bmatrix} g_\xi \\ g_\eta \end{bmatrix} = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix} \begin{bmatrix} \lambda(\phi) \\ \lambda'(\phi) \end{bmatrix},$$
where $(g_\xi, g_\eta)$ is the location of the envelope. In fact, this approach is equivalent to using the group velocity instead of the phase velocity (i.e. $\lambda$), where $(g_\xi, g_\eta)$ is the group velocity vector for a plane wave of direction $\phi$. The corresponding stiffness measure is
$$\kappa_g = \frac{\max_i \max_\phi |g_i(\phi)|}{\min_i \min_\phi |g_i(\phi)|}.$$
Without preconditioning, the 2-D Euler equations have a wave-propagation stiffness given by, $\kappa_g = (M + 1)/\min(M, |M - 1|)$. A plot of $\kappa_g$ is shown in Fig. 2. As expected, $\kappa_g$ indicates stiffness problems are present for $M \to 0$ and $M \to 1$. While not the first preconditioner for compressible flows, the Van Leer-Lee-Roe preconditioner[40] is the first valid for all Mach numbers. Furthermore,
Figure 1 Wavefronts for 2-D Euler: (a) M = 0.1; (b) M = 0.5; (c) M = 0.9; (d) M = 1.3. Wavefronts have been scaled by largest wavespeed.
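for the class of symmetric preconditioners, $P = P^T$, Van Leer-Lee-Roe can be proven to produce the optimal wavefronts[17] with the smallest stiffness, $\kappa_g$, for all Mach numbers. Specifically, this preconditioner is given by,
$$P_{vlr} = \begin{bmatrix} \frac{\tau}{\beta^2} M^2 & -\frac{\tau}{\beta^2} M & 0 & 0 \\ -\frac{\tau}{\beta^2} M & \frac{\tau}{\beta^2} + 1 & 0 & 0 \\ 0 & 0 & \tau & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
where $\beta = \sqrt{|1 - M^2|}$ and $\tau = \min(\beta, \beta/M)$. For subsonic flows, the eigenvalues produced by this preconditioner are
$$\left. \begin{aligned} \lambda_{1,4} &= \pm q \sqrt{1 - M^2\cos^2\phi} \\ \lambda_{2,3} &= q\cos\phi \end{aligned} \right\} \quad \text{for subsonic } P_{vlr}.$$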
Figure 2 Wavefront stiffness, $\kappa_g$, versus Mach number for Euler (solid line), optimal preconditioner (dashed line), block-Jacobi preconditioner (dash-dotted line), and Choi-Merkle/Weiss-Smith preconditioner (dotted line). Note: Van Leer-Lee-Roe preconditioner is an optimal preconditioner.
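Two of the curves in Fig. 2 have closed forms quoted in this section: $\kappa_g = (M + 1)/\min(M, |M - 1|)$ for the unpreconditioned Euler equations, and $\kappa_g = 1/\sqrt{1 - M^2}$ for $M < 1$, $\kappa_g = 1$ for $M > 1$, for an optimal preconditioner. A short tabulation, with hypothetical function names, makes the low-Mach improvement concrete; the block-Jacobi and Choi-Merkle/Weiss-Smith curves are not reproduced because no closed form for them is given here.

```python
import math


def kappa_euler(mach):
    """Wave-propagation stiffness of the unpreconditioned 2-D Euler equations."""
    return (mach + 1.0) / min(mach, abs(mach - 1.0))


def kappa_optimal(mach):
    """Stiffness with optimal (e.g. Van Leer-Lee-Roe) preconditioning."""
    if mach < 1.0:
        return 1.0 / math.sqrt(1.0 - mach * mach)
    return 1.0


# Both formulas are singular at M = 1 from the subsonic side; Euler is
# also singular at M = 0, so those points are left out of the table.
for mach in (0.01, 0.1, 0.5, 0.9, 1.3):
    print(f"M = {mach:4.2f}  Euler: {kappa_euler(mach):8.2f}  "
          f"optimal: {kappa_optimal(mach):6.3f}")
```

At M = 0.01 the unpreconditioned stiffness is about 101 while the optimal value is essentially 1, which is the removal of low-Mach-number stiffness described in the text; near M = 1 both grow, the optimal one far more slowly.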
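For supersonic flows, the eigenvalues are
$$\left. \begin{aligned} \lambda_{1,4} &= c \left( \sqrt{M^2 - 1}\,\cos\phi \pm \sin\phi \right) \\ \lambda_{2,3} &= q\cos\phi \end{aligned} \right\} \quad \text{for supersonic } P_{vlr}.$$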
The resulting wavefronts are shown in Fig. 3 and the stiffness in Fig. 2. From these plots, it is evident that the low-Mach-number stiffness has been completely removed. In subsonic flow, the acoustic wavefront is now an ellipse centered at the origin. The minor axis of the ellipse is in the streamwise direction and has a semi-width of $q\sqrt{1 - M^2}$. Thus, as sonic conditions are reached, this ellipse collapses and the infinite stiffness still exists, although it is significantly reduced from the unpreconditioned case. In supersonic flow, the acoustic disturbances collapse on to two points that convect in the direction of the Mach angles relative to the flow, at a rate equal to the flow speed, $q$, meaning perfect preconditioning. Analytically, we find that $\kappa_g = 1/\sqrt{1 - M^2}$ for $M < 1$ and $\kappa_g = 1$ for $M > 1$. Clearly, a significant improvement in time-scale stiffness has been made by local preconditioning. Turkel[36] was among the first researchers to generalize the method of Chorin. In his analysis, he proposed preconditioners utilizing the primitive variables, $p$, $u$, $v$, plus either entropy, $S$, or density, $\rho$. In the symmetrizing
Figure 3 Wavefronts for optimal preconditioning, $P_{opt}$ (includes $P_{vlr}$): (a) M = 0.1; (b) M = 0.5; (c) M = 0.9; (d) M = 1.3. Wavefronts have been scaled by largest wavespeed.
variables, the general form of the Turkel preconditioner is,
where $\alpha_t$ and $\beta_t$ are free parameters. Turkel's preconditioner is most widely used to remove stiffness at low Mach numbers[38]. In the limit as $M \to 0$, if $\alpha_t \to 1$ and $\beta_t \to M$, then $\kappa_g \to 1$. The Turkel preconditioner can also be extended to generate the optimal wavefronts over all Mach numbers similar to the Van Leer-Lee-Roe preconditioner. For subsonic flow, the optimal wavefronts are achieved with $\beta_t = M$ and $\alpha_t = 1 + \beta_t^2$ [17]. Unfortunately, when used in conjunction with an upwind or matrix dissipation scheme, the optimal form of $P_t$ results in discrete eigenvalues which fall into the unstable right half
plane for higher subsonic Mach numbers[7, 23]. Thus, this specific form of the preconditioner cannot be used in practice for transonic flows. To avoid this problem, Turkel et al[38] define $\alpha_t$ and $\beta_t$ to return to the unpreconditioned case above a cut-off Mach number. Although the wavefronts produced by $P_{vlr}$ are known to be optimal for symmetric preconditioners, no proof exists that the optimality extends to general non-symmetric preconditioners. For supersonic flow, although other choices may exist, since $\kappa_g = 1$, these wavefronts may justly be called optimal. In subsonic flow, we are left with the possibility for improvement. However, all evidence points to these wavefronts as being optimal in subsonic flow also. For now, we will label a preconditioner as optimal when it produces the same wavefronts as $P_{vlr}$. Since $P_{vlr}$ and $P_t$ can both be made optimal, optimal preconditioning is obviously non-unique. Another illustration of preconditioner non-uniqueness is that given a preconditioner $P$ for the symmetrizing variables, the same eigenvalues are found when preconditioning by its transpose, $P^T$. Thus, given an optimal preconditioner, its transpose is also optimal. Recently, we have been able to make significant progress on deriving the general form of all optimal preconditioners. Specifically, two families of optimal preconditioners exist. One family is most closely related to the Van Leer-Lee-Roe preconditioner. The inverse of this optimal family is,
A MA - 4-\/l - M2C2 G
M(C + G)
0
0
C -
0 0
0
1
where $A$, $C$, and $G$ may be chosen to satisfy other design constraints. Obviously, the transpose of this preconditioner is also optimal. We note that the symmetric form of this preconditioner is given by $C = G = 0$ and $A = (\beta + 1)/M^2$ and results in the Van Leer-Lee-Roe preconditioner. An interesting point is that the optimal Turkel matrix is not a member of this family and, thus, must be a member of the other family of optimal preconditioners. Unfortunately, a clean algebraic description is not presently available for this second family of optimal matrices. Other preconditioners that have been developed include the D. Choi-Merkle inviscid preconditioner, the Y. Choi-Merkle viscous preconditioner[5], and the Weiss-Smith preconditioner. These preconditioners have been developed only for low Mach number flows. Transforming the preconditioners
to the symmetrizing variables gives e 0 fcm/ws ~ 0 _ 8cmi(e - l )
0 0 1 0 0 0 0
-5cmve ' 0 1 0 ' 1
where $\epsilon$ is a user-defined parameter. For the D.Choi-Merkle inviscid preconditioner, $\delta_{cmi} = 1$ and $\delta_{cmv} = 0$. For the Y.Choi-Merkle viscous preconditioner, $\delta_{cmi} = 0$ and $\delta_{cmv} = 1$. And, for the Weiss-Smith preconditioner, $\delta_{cmi} = \delta_{cmv} = 0$. In fact, the Weiss-Smith preconditioner is a member of the Turkel family of preconditioners (specifically, $\alpha_t = 0$ and $\beta_t = \sqrt{\epsilon}$). For subsonic inviscid flows, the recommended choice is $\epsilon = M^2$. The eigenvalues for these three preconditioners are identical and given by
$$\lambda_{1,4} = \tfrac{1}{2} q (1 + \epsilon) \cos\phi \mp \sqrt{\tfrac{1}{4} q^2 (1 - \epsilon)^2 \cos^2\phi + \epsilon c^2},$$
$$\lambda_{2,3} = q \cos\phi.$$
|
^
rsubgo.c c
^
^
J
Analysis of the wavefronts shows that at low Mach numbers with $\epsilon = M^2$, $\kappa_g \to (\sqrt{5} + 1)/(\sqrt{5} - 1)$. Thus, while the Choi-Merkle/Weiss-Smith preconditioner does not achieve the optimal conditioning at low Mach numbers, it does eliminate the infinite stiffness of the Euler equations at low speeds. At higher Mach numbers, the preconditioner reverts to the eigenvalues of the unpreconditioned Euler equations, although, above Mach numbers of approximately 0.4, these preconditioners have worse wavespeed conditioning than the unpreconditioned Euler equations. Note, the Choi-Merkle inviscid and Weiss-Smith preconditioners become the identity matrix as $M \to 1$ but the Choi-Merkle viscous preconditioner does not. For supersonic flows, we assume that $\epsilon = 1$; thus, the eigenvalues return to the eigenvalues of the Euler equations without preconditioning. These conclusions are evident in the wavefronts and the values of $\kappa$ shown in Figs. 4 and 2. A more classical suggestion for local preconditioning is the use of the Jacobi block[26, 27, 2, 3, 29] induced by upwind or matrix-dissipation discretizations. If an upwind scheme is used to discretize the spatial operator, the flux across a cell face can be written as,
$$f = \tfrac{1}{2}\left(f_{right} + f_{left}\right) - \tfrac{1}{2}\,|A_c|\left(U_{right} - U_{left}\right), \tag{13.3}$$
where $A_c$ is the face value of the Jacobian matrix for the conserved fluxes, and $|A_c|$ has the same eigenvectors as $A_c$, but its eigenvalues are the absolute values of the eigenvalues of $A_c$. The block-Jacobi preconditioner is the block found on the diagonal in the coefficient matrix of the linearized spatial
(a) M = 0.1
(c) M = 0.9
(b) M =0.6
(d) M = 3..3
Figure 4 Wavefronts for Choi-Merkle and Weiss-Smith preconditioning, $P_{cm/ws}$, with $\epsilon = \min(M^2, 1)$. Wavefronts have been scaled by largest wavespeed.
operator. For rectangular cells, this preconditioner is given by
where $\sigma$ is a factor that may be used to scale the size of the spatial footprint depending on the particular discretization. The timestep, $\Delta t$, is included for dimensional consistency but, upon implementation, cancels with the $\Delta t$ used in the timestep. The main advantage of the block-Jacobi preconditioner is its excellent clustering of the discrete eigenvalues (see Section 13.2.3) and its robustness (see Section 13.2.4). Unfortunately, the block-Jacobi preconditioner does little to improve the long-wave propagation speeds. The wavefronts for $P_j$ are shown in Fig. 5. At low Mach numbers, the entropy mode has been relocated to (1,0); however, the vorticity mode still sits near the origin, and, as $M \to 0$, the stiffness is still infinite. At
(a) M = 0.1 (b) M = 0.5 (c) M = 0.9 (d) M = 1.3
Figure 5 Wavefronts for block-Jacobi preconditioning, Pj. Wavefronts have been scaled by largest wavespeed.
higher Mach numbers, the block-Jacobi system improves on the stiffness by increasing the speed of the upstream-moving acoustic waves. Although not as evident at low Mach numbers, the wavefronts for the vorticity mode are now clearly seen to be non-physical, taking on a triangular shape. In contrast, the optimal wavefronts in Fig. 3 appear quite plausible - a direct consequence of the continuum-based design. Regardless, while optimal preconditioners improve the wave propagation over that of the Euler equations, they are not a substantial improvement over block-Jacobi at higher Mach numbers.
13.2.3 Discrete eigenvalue clustering
While long-wave properties of the preconditioned equations may be analyzed from the continuous partial differential equations, short waves require analysis of the discrete equations because these small wavelengths are generally not well represented by numerical schemes. The common approach for analyzing the discrete properties of a scheme is via the Fourier footprint of the discrete spatial operator, which is the locus of its eigenvalues in the complex plane. Specifically, after linearization about a constant state and substitution of a discrete Fourier mode of the form $U_{j,k} = \hat{u}\, e^{i(j\theta_x + k\theta_y)}$, the spectral amplification matrix of the numerical approximation may be found for any discrete wavenumber combination, $(\theta_x, \theta_y)$. The discrete wavenumbers have a range from $-\pi$ to $\pi$. The use of the Fourier footprint to analyze the smoothing abilities of preconditioned Euler and Navier-Stokes equations can be found in several other references[42, 19, 2, 9, 20, 3, 29]. One goal of convergence acceleration is to quickly damp as many eigenmodes of the Fourier footprint as is numerically possible. For example, an effective multigrid algorithm requires effective smoothing of all modes that do not exist on subsequently coarser grids. For full-coarsening multigrid, this requires effective smoothing of high-low, low-high, and high-high frequency combinations, where a high-frequency wavenumber lies in the range $\pi/2 \le |\theta| \le \pi$. For semi-coarsening, only the high-high frequency components need effective smoothing. As we will show, local preconditioning can often cluster the Fourier footprint, allowing significantly enhanced damping. In particular, we concentrate on the high-high frequency locus and show that without appropriate preconditioning, many eigenmodes will be poorly damped. When a continuum preconditioner such as Van Leer-Lee-Roe, Turkel, or Choi-Merkle/Weiss-Smith is applied to an upwind scheme, the dissipation must be modified such that
$$f = \tfrac{1}{2}\left(f_{right} + f_{left}\right) - \tfrac{1}{2}\, M^{-1} P^{-1}\, \left|P\left(A\cos\phi + B\sin\phi\right)\right|\, M \left(U_{right} - U_{left}\right), \tag{13.4}$$
where M is the transformation matrix from the conserved state vector to the symmetrizing variables, and
$$f = \tfrac{1}{2}\left(f_{right} + f_{left}\right) - \tfrac{1}{2}\, M^{-1} P^{-1} \left(|PA|\,|\cos\phi| + |PB|\,|\sin\phi|\right) M \left(U_{right} - U_{left}\right). \tag{13.6}$$
The high-high frequency footprint of the upwind discretized Euler equations at low Mach numbers is very poorly clustered. Figure 6 shows the high-high frequency content of the Fourier footprints for M = 0.1 with a cell-aspect ratio, $AR = \Delta x/\Delta y = 1$, and the flow aligned with the grid ($\phi = 0°$).
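The poor clustering described here can be reproduced with a short numerical sketch. It is not the authors' code: it assumes a first-order upwind discretization of the unpreconditioned equations, the standard symmetrizing-variable Jacobians for flow aligned with the x-axis (with c = 1), and hypothetical helper names (`sym_euler_matrices`, `matrix_abs`, `high_high_footprint`, `clustering_ratio`). For each high-high wavenumber pair it forms the Fourier symbol of the spatial operator and collects its eigenvalues.

```python
import numpy as np

def sym_euler_matrices(mach, c=1.0):
    """Jacobians of the 2-D Euler equations in symmetrizing variables
    (dp/(rho c), du, dv, dS) for flow along x (v = 0). Assumed standard form."""
    u = mach * c
    A = np.array([[u, c, 0, 0],
                  [c, u, 0, 0],
                  [0, 0, u, 0],
                  [0, 0, 0, u]], dtype=float)
    B = np.array([[0, 0, c, 0],
                  [0, 0, 0, 0],
                  [c, 0, 0, 0],
                  [0, 0, 0, 0]], dtype=float)
    return A, B

def matrix_abs(S):
    """|S| for a symmetric matrix: same eigenvectors, absolute eigenvalues."""
    w, V = np.linalg.eigh(S)
    return (V * np.abs(w)) @ V.T

def high_high_footprint(mach, aspect_ratio=1.0, n=17):
    """Eigenvalues of the first-order upwind operator for pi/2 <= |theta| <= pi."""
    A, B = sym_euler_matrices(mach)
    abs_a, abs_b = matrix_abs(A), matrix_abs(B)
    dx, dy = aspect_ratio, 1.0                     # AR = dx / dy
    band = np.linspace(np.pi / 2, np.pi, n)
    thetas = np.concatenate([-band[::-1], band])   # both signs of theta
    eigs = []
    for tx in thetas:
        for ty in thetas:
            # Symbol of the flux difference built from a flux of the form (13.3)
            symbol = ((1j * np.sin(tx) * A + (1 - np.cos(tx)) * abs_a) / dx
                      + (1j * np.sin(ty) * B + (1 - np.cos(ty)) * abs_b) / dy)
            eigs.append(np.linalg.eigvals(-symbol))
    return np.concatenate(eigs)

def clustering_ratio(mach):
    """Smallest over largest eigenvalue modulus in the high-high footprint."""
    lam = np.abs(high_high_footprint(mach))
    return lam.min() / lam.max()

for mach in (0.1, 0.5):
    print(f"M = {mach}: min|lambda| / max|lambda| = {clustering_ratio(mach):.3f}")
```

A small ratio means that some high-high modes sit close to the origin, where any multistage scheme has an amplification factor near one. In this sketch the ratio at M = 0.1 is well below the one at M = 0.5, which is the behaviour the text attributes to the disparate propagation speeds.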
As an illustration, the amplification-factor contours are also overlaid for an optimal two-stage multistage damping scheme as derived by Lynn[18]. Without preconditioning, a group of high-frequency Fourier modes is seen to cluster near the origin where the amplification factor must approach one. This clustering is a direct consequence of the disparate propagation speeds of the Euler equations at low Mach numbers. By comparison, the Van Leer-Lee-Roe preconditioner has eigenvalues that are well removed from the origin. While the clustering has improved, better clustering can be achieved with only a small change in the original preconditioner. We note that two distinct clusters of eigenvalues are observed with the original $P_{vlr}$. In fact, these families of eigenvalues are due to the presence of acoustic and convective waves. Lynn[18] has shown that by scaling back the propagation speeds of the acoustic waves, the Fourier footprint may be perfectly clustered for the highest frequency waves, $\theta_x = \theta_y = \pm\pi$. This scaling, developed by Lee[15, 14], can be accomplished by redefining $\tau$ as $\tau = \beta/(AR_q + \beta)$, where
$$AR_q = \frac{\sum_k \left|u\,\Delta x_k + v\,\Delta y_k\right|}{\sum_k \left|v\,\Delta x_k - u\,\Delta y_k\right|}.$$
Physically, $AR_q$ can be interpreted as the cell-aspect ratio defined in the flow direction, i.e. the ratio of the streamwise to normal cell lengths. Using this definition, high-high frequency modes for all Mach and cell-aspect-ratio combinations are clustered at the same location in the complex plane. For semi-coarsening multigrid, this improved clustering is essential since cell-aspect ratios can be extremely large. The improved clustering can be verified in Fig. 6. The high-high frequency locations for the same conditions but with a cell-aspect ratio, AR = 100, are shown in Fig. 7. At this high aspect ratio, the improvement over both the unpreconditioned Euler and the original Van Leer preconditioner is drastic and justifies the $\tau$ modification.
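As a rough illustration of the flow-aligned aspect ratio, the sketch below evaluates the ratio of summed streamwise to summed normal edge projections for a single cell. The function names, the edge-vector representation of the cell, the reading of the modified parameter as tau = beta / (AR_q + beta), and the subsonic definition beta = sqrt(1 - M^2) are all assumptions made for this example rather than details confirmed by the text.

```python
import math

def flow_aligned_aspect_ratio(u, v, edges):
    """AR_q = sum_k |u dx_k + v dy_k| / sum_k |v dx_k - u dy_k|.

    `edges` is a sequence of (dx_k, dy_k) edge vectors of one cell.
    Undefined at a stagnation point (u = v = 0), where both sums vanish.
    """
    streamwise = sum(abs(u * dx + v * dy) for dx, dy in edges)
    normal = sum(abs(v * dx - u * dy) for dx, dy in edges)
    return streamwise / normal

def modified_tau(mach, ar_q):
    """Assumed form of the modified parameter: tau = beta / (AR_q + beta)."""
    beta = math.sqrt(abs(1.0 - mach ** 2))
    return beta / (ar_q + beta)

# Rectangular cell with dx = 100, dy = 1, traversed edge by edge.
cell = [(100.0, 0.0), (0.0, 1.0), (-100.0, 0.0), (0.0, -1.0)]
print(flow_aligned_aspect_ratio(1.0, 0.0, cell))   # flow along the long side
print(flow_aligned_aspect_ratio(0.0, 1.0, cell))   # flow along the short side
print(modified_tau(0.1, 1.0))
```

For flow along the long side the result equals the geometric ratio of 100, and for flow along the short side it is the reciprocal, consistent with the interpretation of AR_q as streamwise over normal cell length.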
Allmaras[2] has performed extensive analysis of the block-Jacobi preconditioner and found it has excellent high-high frequency clustering over all ranges of cell-aspect ratio, Mach number, and flow angle. The block-Jacobi footprints have been included in Figs. 6 and 7. An important difference to note is that the block-Jacobi preconditioner has a family of modes which cluster about the real axis. This effect is also related to the large difference in propagation speeds at low Mach numbers, which the block-Jacobi preconditioner does not alleviate. In contrast, the Van Leer-Lee-Roe preconditioner spreads these modes away from the real axis such that they become better clustered with the other error modes.
13.2.4 Healthy eigenvector structure
A very important aspect of preconditioner design is maintaining a healthy eigenvector structure. The first study of the importance of eigenvectors on preconditioner design was performed by Darmofal and Schmid[7]. In this work, they show how the eigenvector orthogonality directly relates to the possibility of transient amplification for small disturbances even in systems where all eigenvalues indicate stability.
(a) No preconditioning (b) block-Jacobi (c) Van Leer with original $\tau$
Figure 6 High-high Fourier footprint for M = 0.1, AR = 1, $\phi$ = 0°. Contours of amplification factor between 0 and 1 with increments of 0.1 are overlaid.
Specifically, consider the evolution of the following linear problem,
$$\frac{du}{dt} + iLu = 0,$$
where L is an N x N matrix. In preconditioning, this would represent the Fourier transformed, preconditioned spatial operator. The solution of the above equation can be written compactly using the matrix exponential as,
$$u(t) = \exp(-itL)\, u_0,$$
(a) No preconditioning (b) block-Jacobi (c) Van Leer with original $\tau$ (d) Van Leer with modified $\tau$
Figure 7 High-high Fourier footprint for M = 0.1, AR = 100, $\phi$ = 0°. Contours of amplification factor between 0 and 1 with increments of 0.1 are overlaid.
where $u_0$ is the initial condition. The maximum energy amplification $G(t)$ at a given time can be expressed as the norm of the matrix exponential,
where R is the matrix of right eigenvectors of L and $\Omega$ is the diagonal matrix of eigenvalues. The amplification matrix can be bounded for all times by,
where $\kappa(R)$, the condition number of the eigenvector matrix, is defined as $\kappa(R) = \|R\|\,\|R^{-1}\|$, and $\omega_{max}$ is the largest positive imaginary part of all eigenvalues. For hyperbolic systems such as the Euler equations, the latter is usually zero since the eigenvalues are strictly real. Thus, the
bound on the largest amplification is given by $\kappa(R)$. For an orthogonal eigenvector structure, $\kappa(R) = 1$, and no energy growth is possible. Greater departure from orthogonality results in larger $\kappa(R)$ and larger possible transient growth. Without preconditioning, the symmetrizing variables are an orthogonal system since the matrices A and B are symmetric. However, after preconditioning, PA and PB are no longer guaranteed to be symmetric; thus, in general, preconditioning introduces the possibility of transient growth where previously none existed. Unfortunately, Darmofal and Schmid found that continuum-based preconditioners such as Turkel, Van Leer-Lee-Roe, and Choi-Merkle are all subject to severe transient growth as $M \to 0$. In fact, they found that the non-dimensionalized energy growth rate at $t = 0^+$ is proportional to $1/M$ for these preconditioners as $M \to 0$. Previously, many researchers had discovered the extreme lack of robustness at stagnation points; however, the exact cause of this instability had not been known. For example, Turkel[38] employs a cut-off in his $\beta_t$ definition similar to
$$\beta_t^2 = \max\left(M^2, \eta M_{ref}^2\right),$$
where $M_{ref}$ is some reference or freestream Mach number, and $\eta$ is a user-defined parameter typically between 1 and 3. This effectively avoids $\beta_t$ going to zero, which was shown by Darmofal and Schmid to limit the non-orthogonality. However, this fix is problem dependent and, furthermore, complicates the analysis of preconditioners by introducing a non-local parameter. Another avenue to improve the eigenvector structure of the preconditioned equations is to enforce some measure of orthogonality from the beginning of the design. The strictest requirement would be to enforce orthogonality under all conditions. This is equivalent to requiring PA and PB to be symmetric. The preconditioner that results from requiring complete orthogonality is too limited and cannot significantly alter the long-wave time scales. A weaker restriction is to require orthogonality only in the streamwise direction. Lee and Van Leer[16, 14] have pursued this approach and found an optimal subsonic preconditioner with orthogonal eigenvectors in the streamwise direction. Specifically, this preconditioner has the form
-Pt>!96 =
f
-f / 0 0
-f
|+i
0 0 where
o 0
0 / 3 0 0
0 1
2M2-/? * ~ 1 + M2 ' The preconditioner has been coined $P_{vl96}$, i.e. Van Leer 96, as it was developed by Van Leer during the summer of 1996. This preconditioner smoothly connects with the Van Leer-Lee-Roe preconditioner as $M \to 1$; thus, the
supersonic Van Leer preconditioner can be used for M > 1. Note, $P_{vl96}$ is a member of the $P_{opt}$ family of preconditioners; furthermore, $P_{vl96}$ is the only positive definite member of $P_{opt}$ which is orthogonal in the streamwise direction.
13.2.5 Accuracy preservation
As $M \to 0$, a well-known problem with many compressible finite volume techniques is an extreme loss of accuracy in addition to slow convergence[43]. The problem can be traced to a poorly balanced dissipation matrix at low Mach numbers[8]. In this analysis, the symmetrized Euler equations are first non-dimensionalized by a reference density and flow speed. Gustafsson and Stoor[12] have shown that using this non-dimensionalization, $dp/(\rho c) = O(M)$ as the Mach number decreases. Thus, the entire symmetrizing state vector behaves as $w = [O(M), O(1), O(1), O(1)]^T$. Similarly, the A and B matrices are found to be of the following order,
$$A = \begin{bmatrix} O(1) & O(1/M) & 0 & 0 \\ O(1/M) & O(1) & 0 & 0 \\ 0 & 0 & O(1) & 0 \\ 0 & 0 & 0 & O(1) \end{bmatrix}, \qquad B = \begin{bmatrix} 0 & 0 & O(1/M) & 0 \\ 0 & 0 & 0 & 0 \\ O(1/M) & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
Combining these results it is easy to show that
$$A\frac{\partial w}{\partial x} + B\frac{\partial w}{\partial y} = [O(1/M), O(1), O(1), O(1)]^T.$$
For a first-order upwind or matrix dissipation scheme, similar analysis shows that
$$\Delta x\,|A|\,\frac{\partial^2 w}{\partial x^2} + \Delta y\,|B|\,\frac{\partial^2 w}{\partial y^2} = [O(1), O(1/M), O(1/M), O(1)]^T.$$
Thus, except in the entropy equation, the Euler and dissipative terms are mismatched as $M \to 0$. When a local preconditioner is applied, a balance can be achieved between the preconditioned flux and dissipation terms. Fiterman et al[8] have shown that a proper match occurs if
$$P^{-1},\; P^{-1}|PA|,\; P^{-1}|PB| = \begin{bmatrix} O(1/M^2) & O(1/M) & O(1/M) & O(1/M) \\ O(1/M) & O(1) & O(1) & O(1) \\ O(1/M) & O(1) & O(1) & O(1) \\ O(1/M) & O(1) & O(1) & O(1) \end{bmatrix}.$$
A quick check shows that preconditioners designed for low-speed convergence acceleration generally possess this accuracy property. Specifically, $P_{vlr}$, $P_t$,
$P_{cm}$, and $P_{vl96}$ all have the proper limits as $M \to 0$ while block-Jacobi preconditioning does not (block-Jacobi preconditioning does not even require that the dissipation be modified). Finally, we also note that Reed[30, 31] has performed a more restrictive (but also more straightforward) truncation-error analysis and shown that the truncation error of the Choi-Merkle preconditioned equations is a factor of $M^2$ lower than the truncation error without preconditioning for some terms. Regardless, the general result remains the same: local preconditioning with the associated modified dissipation can cure low-Mach-number accuracy problems of common computational methods.
13.2.6 Separability
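Preconditioning is also useful when attempting to separate equations into independent blocks that would remain hyperbolic or become elliptic if the time derivatives were to be dropped. For example, given the steady Euler equations,
$$A\frac{\partial w}{\partial x} + B\frac{\partial w}{\partial y} = 0,$$
these equations may be transformed into independent hyperbolic and/or elliptic parts. For supersonic flow, the steady Euler equations can be separated into a set of four convection (i.e. hyperbolic) equations,
$$\begin{bmatrix} \beta & 0 & 0 & 0 \\ 0 & \beta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \frac{\partial}{\partial \xi} \begin{pmatrix} \beta\,dp + \rho u\,dv \\ \beta\,dp - \rho u\,dv \\ H \\ S \end{pmatrix} + \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \frac{\partial}{\partial \eta} \begin{pmatrix} \beta\,dp + \rho u\,dv \\ \beta\,dp - \rho u\,dv \\ H \\ S \end{pmatrix} = 0,$$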
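where S and H are the entropy and enthalpy. The first two equations are the convection of acoustic disturbances along Mach lines while the last two equations are the convection of enthalpy and entropy along streamlines. In subsonic flow, the convection of enthalpy and entropy remains; however, the acoustic subsystem is now elliptic and equivalent to the Cauchy-Riemann equations,
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \frac{\partial}{\partial \xi} \begin{pmatrix} \frac{\beta}{\rho u}\,dp \\ dv \\ H \\ S \end{pmatrix} + \begin{bmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \frac{\partial}{\partial \eta} \begin{pmatrix} \frac{\beta}{\rho u}\,dp \\ dv \\ H \\ S \end{pmatrix} = 0.$$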
The advantage of this splitting is that different discretization methods can now be employed on the elliptic and hyperbolic parts. This separation is not possible for the unsteady Euler equations. However, the introduction of a local preconditioner offers the freedom to separate the parts of hyperbolic and elliptic origin while keeping the time derivatives and allowing time-marching algorithms to be utilized.
Roe [33] has developed a general strategy for performing this separation of elliptic and hyperbolic parts. For the 2-D Euler equations, the general form of the preconditioner allowing this separation is, 1 1 M
sep
a3p 0
l M
CL2P
0 -a4M 0
1 0 0
0 0 0 1
where the $a_i$'s are free parameters. Combining this result with the family of optimal preconditioners, $P_{opt}$, we can find the family of preconditioners which permit elliptic/hyperbolic separation with optimal wavefronts, l+a.i9 2
J_
1 M
, V1-0 2 -; 1
M
0
M
1
0
0
M
0
a,
0
0
0
0
1
sep/opt
+
where a* is a free parameter. In particular, we note that the Van Leer-Lee-Roe matrix is a member of the $P_{sep/opt}$ family (with a* = $1/\beta$). The hyperbolic/elliptic splitting based on the Van Leer-Lee-Roe preconditioner has been used to formulate genuinely multidimensional upwind schemes for steady flows [25, 24, 28]. These schemes offer improved accuracy over standard grid-aligned upwind methods; however, they also appear to be susceptible to transient amplification of disturbances.
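The transient amplification mentioned here, and tied in Section 13.2.4 to the eigenvector condition number, can be illustrated with a small numerical sketch. It is not taken from the chapter: the two 2 x 2 matrices are arbitrary examples with real eigenvalues, the function name `amplification` is hypothetical, and the growth is measured with the 2-norm of exp(-itL), which for real eigenvalues cannot exceed the condition number of R.

```python
import numpy as np

def amplification(L, times):
    """Return ||exp(-i t L)||_2 for each t, and the condition number of R.

    Uses the eigendecomposition L = R diag(w) R^{-1}, so that
    exp(-i t L) = R diag(exp(-i t w)) R^{-1}.
    """
    w, R = np.linalg.eig(L)
    R_inv = np.linalg.inv(R)
    growth = [np.linalg.norm((R * np.exp(-1j * t * w)) @ R_inv, 2)
              for t in times]
    return np.array(growth), np.linalg.cond(R, 2)

times = np.linspace(0.0, 40.0, 401)

# Symmetric operator: orthogonal eigenvectors, so no growth is possible.
L_normal = np.array([[1.0, 0.5],
                     [0.5, 2.0]])
# Non-symmetric operator with real eigenvalues but nearly parallel eigenvectors.
L_nonnormal = np.array([[1.0, 50.0],
                        [0.0, 1.1]])

G_n, kappa_n = amplification(L_normal, times)
G_nn, kappa_nn = amplification(L_nonnormal, times)

print(f"symmetric:     max growth = {G_n.max():.3f}, kappa(R) = {kappa_n:.3f}")
print(f"non-symmetric: max growth = {G_nn.max():.1f}, kappa(R) = {kappa_nn:.1f}")
```

Both operators are neutrally stable by their eigenvalues, yet only the symmetric one keeps the amplification at one. In the non-symmetric case the growth approaches, but stays below, the condition number of the eigenvector matrix.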
13.3 Current status
In the previous section, we focused on the design of a local preconditioner, highlighting the different design criteria from which one might choose. In this section, we will concentrate on the current status of local preconditioning: specifically, what has local preconditioning achieved in practice?
13.3.1 Stiffness removal for long waves
In principle, the removal of disparate time scales for long waves should result in accelerated convergence, as all errors in the solution can now propagate at the same rate without slowly-propagating error modes staying behind. This removal of stiffness was first demonstrated for low-Mach-number flows. Merkle et al[4, 5] showed significant convergence acceleration for a variety of low-speed internal flows using an Alternating Direction Implicit algorithm with
central differencing. The first example of the benefit of local preconditioning across all Mach numbers was for the Van Leer-Lee-Roe preconditioner[40]. In this work, explicit, upwind solutions over a NACA 0012 airfoil were obtained, showing that in all cases, the application of $P_{vlr}$ results in a significant speed-up compared to the use of the unpreconditioned Euler scheme. Subsequently, many investigators have found similar results[15, 10, 9, 30, 14, 31]. In this section, we will demonstrate the convergence-acceleration effect for a single-grid calculation of the flow over a bump at $M_\infty$ = 0.1, 0.3, 0.5, and 0.8. We will only test two preconditioners, $P_{vlr}$ and $P_j$, and compare the results to unpreconditioned Euler calculations. The basic flow solver consists of a high-resolution upwind scheme[39] with Roe's[32] approximate Riemann solver. The integration in time is performed with a 2-stage optimally damping scheme from Lynn[18]. As shown in Fig. 8, convergence rates for the $M_\infty$ = 0.1 and 0.8 cases are lower than for the two moderate Mach numbers for the unpreconditioned Euler calculations. Using block-Jacobi preconditioning, all but the $M_\infty$ = 0.1 case converge very similarly, dropping approximately six orders of magnitude in about 1000 cycles (each cycle contains 4 multistage integrations, so this is actually 4000 multistage iterations). In contrast, however, the Van Leer-Lee-Roe preconditioner performs almost independently of Mach number with six orders of magnitude drop in approximately 1000 cycles for all cases. Compared to the unpreconditioned Euler results, the Van Leer-Lee-Roe preconditioner is approximately 1.5 to 3.0 times faster (in terms of cycles to converge six orders). These results agree well with the wavefront analysis discussed in Section 13.2.2 which showed that block-Jacobi preconditioning does not aid long wave propagation at low Mach numbers.
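The Mach-number dependence of the unpreconditioned results can be made plausible with a back-of-the-envelope estimate. The sketch below is a simplified, one-dimensional stand-in for the wavefront analysis, not the measure used in the chapter: it assumes the characteristic speeds in the flow direction are q - c, q and q + c, and the function name `wave_speed_ratio` is hypothetical.

```python
def wave_speed_ratio(mach):
    """Fastest over slowest characteristic speed of the unpreconditioned
    Euler equations in the flow direction, with speeds scaled by c."""
    speeds = [abs(mach - 1.0), mach, mach + 1.0]
    slowest = min(speeds)
    return float("inf") if slowest == 0.0 else max(speeds) / slowest

for mach in (0.1, 0.3, 0.5, 0.8):
    print(f"M = {mach:.1f}: fastest/slowest = {wave_speed_ratio(mach):.2f}")
```

Under this estimate the ratio is roughly 11 at M = 0.1 and 9 at M = 0.8, against about 4.3 and 3 at M = 0.3 and 0.5, which is consistent with the two extreme cases converging more slowly without preconditioning.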
Thus, a properly-designed local preconditioner can successfully remove long wave time scales and, in practice, this effect accelerates convergence for single-grid calculations.
13.3.2 Improved multigrid convergence
In addition to accelerating single grid methods, local preconditioning can have a favorable effect on multigrid methods. In fact, local preconditioning has a double benefit when used in multigrid methods: long waves are accelerated, thus improving the fine grid solver, and discrete eigenvalues of the spatial operator are clustered, thus improving the smoothing of high frequency error modes. Many authors have shown the benefits which local preconditioning has on multigrid performance[34, 19, 20, 18, 29, 38, 21, 13, 23, 22]. An important contribution in this area is the work of Tai[34] in 1-D and Lynn[18] in 2-D who showed that the combination of local preconditioning and multigrid can accelerate convergence better than either technique applied independently. To demonstrate this result, we have applied full and semicoarsening multigrid[26, 27] methods to the previous bump flow test cases.
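The smoothing requirements of the two coarsening strategies, as stated in Section 13.2.3, can be written down as a small bookkeeping sketch. The function name `needs_smoothing` and the sampling of the wavenumber plane are choices made for this example only.

```python
import numpy as np

def needs_smoothing(theta_x, theta_y, coarsening):
    """True where the smoother must damp the mode (theta_x, theta_y).

    A wavenumber counts as high-frequency when pi/2 < |theta| <= pi.
    Full coarsening: high-low, low-high and high-high modes.
    Semi-coarsening: only high-high modes.
    """
    high_x = np.abs(theta_x) > np.pi / 2
    high_y = np.abs(theta_y) > np.pi / 2
    if coarsening == "full":
        return high_x | high_y
    if coarsening == "semi":
        return high_x & high_y
    raise ValueError("coarsening must be 'full' or 'semi'")

# Sample the wavenumber plane at cell midpoints of a 64 x 64 grid on [-pi, pi].
n = 64
theta = -np.pi + (np.arange(n) + 0.5) * 2.0 * np.pi / n
tx, ty = np.meshgrid(theta, theta, indexing="ij")

frac_full = needs_smoothing(tx, ty, "full").mean()
frac_semi = needs_smoothing(tx, ty, "semi").mean()
print(f"fraction of modes left to the smoother: full = {frac_full}, semi = {frac_semi}")
```

With this sampling, three quarters of the modes fall to the smoother under full coarsening and one quarter under semi-coarsening, which is why the high-high footprint receives most of the attention in the clustering analysis.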
(a) No preconditioning (b) block-Jacobi (c) Van Leer-Lee-Roe
Figure 8 Mach number independence for convergence of unconfined bump flow with a single grid of 64 x 32 cells. Solid: $M_\infty$ = 0.1, Dash: $M_\infty$ = 0.3, Dash-dot: $M_\infty$ = 0.5, Circles: $M_\infty$ = 0.8. Note: each cycle contains 4 two-stage iterations.
Convergence histories are shown in Figs. 9 and 10 for full-coarsening and semi-coarsening, respectively. Comparing the unpreconditioned and Van Leer-Lee-Roe preconditioned results, we see that the preconditioned results are at least five times faster than the unpreconditioned results. Recalling that without multigrid, the observed speed-ups were between 1.5 and 3.0, we conclude that the combination of multigrid and local preconditioning enhances the performance of the two acceleration techniques. An interesting observation is that the block-Jacobi preconditioner is significantly slower than the Van Leer-Lee-Roe preconditioner for the $M_\infty$ = 0.1 case. This difference can be attributed to the improved long wave propagation of the Van Leer-Lee-Roe preconditioner. Figure 11 shows contour plots of the error in the solution. The error is calculated by first solving the equations to machine precision. Then, the simulations are re-run and the error is the magnitude of the difference between the current solution and the final solution. Initially, the error is distributed over the bump. After the first three cycles, the errors for block-Jacobi and Van Leer-Lee-Roe are similar, with both exhibiting grid-aligned errors along the solid wall on and downstream of the bump. In the following cycles, the Van Leer-Lee-Roe preconditioner propagates these grid-aligned error modes out of the computational domain while the block-Jacobi preconditioner does not. Thus, the block-Jacobi preconditioner must rely on smoothing to remove these errors; however, since these modes are grid-aligned, they cannot be smoothed and convergence suffers.
(a) No preconditioning (b) block-Jacobi (c) Van Leer-Lee-Roe
Figure 9 Mach number independence for convergence of unconfined bump flow with full coarsening multigrid. Fine grid 64 x 32 cells; coarse grid 8 x 4 cells. Solid: $M_\infty$ = 0.1, Dash: $M_\infty$ = 0.3, Dash-dot: $M_\infty$ = 0.5, Circles: $M_\infty$ = 0.8.
(b) block-Jacobi (c) Van Leer-Lee-Roe
Figure 10 Mach number independence for convergence of unconfined bump flow with semi-coarsening multigrid. Fine grid 64 x 32 cells; coarse grid 8 x 4 cells. Solid: $M_\infty$ = 0.1, Dash: $M_\infty$ = 0.3, Dash-dot: $M_\infty$ = 0.5, Circles: $M_\infty$ = 0.8.
13.3.3 Low Mach number accuracy
As discussed in Section 13.2.5, local preconditioning can also improve the accuracy of low Mach number flow calculations as a result of the modified dissipation matrices. This effect was first observed numerically by Van Leer et al[40], and, subsequently, by many other investigators [37, 38, 31]. A typical example of the improved accuracy at low Mach numbers is the $M_\infty$ = 0.01 flow around a NACA 0012 airfoil at 1.25° angle of attack, as shown in Fig. 12. In these plots, we compare unpreconditioned and Weiss-Smith preconditioned Euler simulations to incompressible panel methods. As can be clearly seen from both the Cp contours and surface plots, the unpreconditioned results suffer from significant inaccuracies. However, the preconditioned results have clean Cp contours and the surface pressures match well with panel computations.
(a) Initial condition. 16 error contours from 0 to 0.015.
(b) 3 cycles. 16 error contours from 0 to 0.002.
(c) 6 cycles. 16 error contours from 0 to 0.002.
(d) 9 cycles. 16 error contours from 0 to 0.002.
(e) 12 cycles. 16 error contours from 0 to 0.002.
Figure 11 Error contours for full coarsening multigrid convergence of $M_\infty$ = 0.1 bump flow with block-Jacobi and Van Leer preconditioners.
(a) No preconditioning Cp contours (b) Surface Cp (c) Preconditioned Euler Cp contours (d) Surface Cp
Figure 12 Comparison of low Mach number accuracy for unpreconditioned Euler and Weiss-Smith preconditioned solutions and comparison with panel solution. NACA 0012, M = 0.01, $\alpha$ = 1.25 degrees. 31 Cp contours are plotted from -0.7 to 0.7.
13.4 Future developments
While local preconditioning holds significant promise, the only preconditioning approach that appears robust for a wide range of problems is block-Jacobi. However, the block-Jacobi approach does not provide effective convergence acceleration or good accuracy at low Mach numbers. Thus, a major stumbling block which must be resolved is the lack of robustness at stagnation points for preconditioners with good low Mach number performance. While the work
of Darmofal & Schmid[7] and Van Leer et al[41] has resulted in a better understanding of the causes of the robustness problem, adequate solutions are still lacking. While we have concentrated on inviscid flows, the need for convergence acceleration is most strongly felt in high-Reynolds-number, viscous-flow calculations. A significant amount of work has already been accomplished in extending and applying local preconditioning to the discretized Navier-Stokes equations[5, 10, 2, 30, 18, 44, 29, 13, 14, 21]. Further developments are also needed in extending the results to other systems of equations such as those of ideal magnetohydrodynamics.
Acknowledgements
The work described above has been achieved only with the collaboration of many people. Specifically, the authors would like to acknowledge the contributions of Dohyung Lee, John Lynn, Barrett McCann, and Phil Roe. The first author would like to acknowledge the support of the NSF through NSF CAREER Award (ACS-9702435) and The Boeing Company.
REFERENCES
1. S. Abarbanel and D. Gottlieb. Optimal time splitting for two and three dimensional Navier-Stokes equations with mixed derivatives. Journal of Computational Physics, 41:1-33, 1981. (Also, ICASE Report No. 80-6, 1980).
2. S.R. Allmaras. Analysis of a local matrix preconditioner for the 2-D Navier-Stokes equations. AIAA Paper 93-3330, 1993.
3. S.R. Allmaras. Analysis of semi-implicit preconditioners for multigrid solution of the 2-D Navier-Stokes equations. AIAA Paper 95-1651, 1995.
4. D. Choi and C.L. Merkle. Application of time-iterative schemes to incompressible flow. AIAA Journal, 23(10):1518-1524, 1985.
5. Y.H. Choi and C.L. Merkle. The application of preconditioning in viscous flows. Journal of Computational Physics, 105:203-223, 1993.
6. A.J. Chorin. A numerical method for solving incompressible viscous flow problems. Journal of Computational Physics, 2:12-26, 1967.
7. D.L. Darmofal and P.J. Schmid. The importance of eigenvectors for local preconditioners of the Euler equations. Journal of Computational Physics, 127:346-362, 1996.
8. A. Fiterman, E. Turkel, and V.N. Vatsa. Pressure updating methods for the steady-state fluid equations. AIAA Paper 95-1652, 1995.
9. A.C. Godfrey. Steps toward a robust preconditioning. AIAA Paper 94-0520, 1995.
10. A.C. Godfrey, R.W. Walters, and B. van Leer. Preconditioning for the Navier-Stokes equations with finite-rate chemistry. AIAA Paper 93-0535, 1993.
11. S.K. Godunov. An interesting class of quasilinear systems. Dokl. Akad. Nauk SSSR, 139:521-523, 1961.
12. B. Gustafsson and H. Stoor. Navier-Stokes equations for almost incompressible flow. SIAM Journal of Numerical Analysis, 28(6):1523-1547, 1991.
13. D. Jespersen, T. Pulliam, and P. Buning. Recent enhancements to OVERFLOW. AIAA Paper 97-0644, 1997.
14. D. Lee. Local preconditioning of the Euler and Navier-Stokes equations. PhD thesis, University of Michigan, 1996.
15. D. Lee and B. van Leer. Progress in local preconditioning of the Euler and Navier-Stokes equations. AIAA Paper 93-3328, 1993.
16. D. Lee, B. van Leer, and J. Lynn. A local Navier-Stokes preconditioner for all Mach and cell Reynolds numbers. AIAA Paper 97-2024, 1997.
17. W.T. Lee. Local preconditioning of the Euler equations. PhD thesis, University of Michigan, 1991.
18. J. Lynn. Multigrid solution of the Euler equations with local preconditioning. PhD thesis, University of Michigan, 1995.
19. J.F. Lynn and B. van Leer. Multi-stage schemes for the Euler and Navier-Stokes equations with optimal smoothing. AIAA Paper 93-3355, 1993.
20. J.F. Lynn and B. van Leer. A semi-coarsening multigrid solver for the Euler and Navier-Stokes equations with local preconditioning. AIAA Paper 95-1667, 1995.
21. D. Mavriplis. Multigrid strategies for viscous flow solvers on anisotropic unstructured meshes. AIAA Paper 97-1952, 1997.
22. B. McCann. Evaluation of local preconditioners for multigrid solutions of the compressible Euler equations. Master's thesis, Texas A&M University, 1996.
23. B. McCann and D.L. Darmofal. Evaluation of local preconditioners for multigrid solutions of the compressible Euler equations. AIAA Paper 97-2028, 1997.
24. L.M. Mesaros. Multi-dimensional Fluctuation Splitting Schemes for the Euler Equations on Unstructured Grids. PhD thesis, University of Michigan, 1995.
25. L.M. Mesaros and P.L. Roe. Multi-dimensional fluctuation splitting schemes based on decomposition methods. AIAA Paper 95-1699, 1995.
26. W.A. Mulder. A new approach to convection problems. Journal of Computational Physics, 83:303-323, 1989.
27. W.A. Mulder. A high resolution Euler solver based on multigrid, semi-coarsening, and defect correction. Journal of Computational Physics, 100:91-104, 1992.
28. H. Paillere, H. Deconinck, and P.L. Roe. Conservative upwind residual-distribution schemes based on the steady characteristics of the Euler equations. AIAA Paper 95-1700, 1995.
29. N.A. Pierce and M.B. Giles. Preconditioning compressible flow calculations for stretched grids. AIAA Paper 96-0889, 1996.
30. C.L. Reed. Low speed preconditioning applied to the compressible Navier-Stokes equations. PhD thesis, University of Texas at Arlington, 1995.
31. C.L. Reed and D.A. Anderson. Application of low speed preconditioning to the compressible Navier-Stokes equations. AIAA Paper 97-0873, 1997.
32. P.L. Roe. Approximate Riemann solvers, parametric vectors, and difference schemes. Journal of Computational Physics, 43:357-372, 1981.
33. P.L. Roe. Compounded of many simples: Reflections on the role of model problems in CFD. In V. Venkatakrishnan, M.D. Salas, and S. Chakravarthy, editors, Barriers and Challenges in Computational Fluid Dynamics, pages 241-258. Kluwer Academic Publishers, 1998.
34. C.H. Tai. Acceleration Techniques for Explicit Codes. PhD thesis, University of Michigan, 1990.
35. E. Turkel. Symmetrization of fluid dynamic matrices with application. Mathematics of Computation, 27:729-736, 1973.
36. E. Turkel. Preconditioned methods for solving the incompressible and low speed compressible equations. Journal of Computational Physics, 72:277-298, 1987.
37. E. Turkel, A. Fiterman, and B. van Leer. Preconditioning and the limit to the incompressible flow equations for finite difference schemes. In M. Hafez and D.A. Caughey, editors, Computing the Future: Advances and Prospects for Computational Aerodynamics, pages 215-234. John Wiley and Sons, 1994.
38. E. Turkel, V.N. Vatsa, and R. Radespiel. Preconditioning methods for low-speed flows. AIAA Paper 96-2460, 1996.
39. B. van Leer. Upwind-difference methods for aerodynamic problems governed by the Euler equations. Lectures in Applied Mathematics, 22, 1985.
40. B. van Leer, W.T. Lee, and P.L. Roe. Characteristic time-stepping or local preconditioning of the Euler equations. AIAA Paper 91-1552, 1991.
41. B. van Leer, L. Mesaros, C.H. Tai, and E. Turkel. Local preconditioning in a stagnation point. AIAA Paper 95-1654, 1995.
42. B. van Leer, C.H. Tai, and K.G. Powell. Design of optimally smoothing multi-stage schemes for the Euler equations. AIAA Paper 89-1933, 1989.
43. G. Volpe. On the use and accuracy of compressible flow codes at low Mach numbers. AIAA Paper 91-1662, 1991.
44. J.M. Weiss and W.A. Smith. Preconditioning applied to variable and constant density flows. AIAA Journal, 33(11):2050-2057, 1995.
14 Relaxation Revisited: A Fresh Look at Multigrid for Steady Flows

Thomas W. Roberts¹, David Sidilkover² & R. C. Swanson¹

¹ Research Scientist, Aerodynamic and Acoustic Methods Branch, NASA Langley Research Center, Hampton, VA 23681-0001
² Senior Staff Scientist, ICASE, Hampton, VA 23681-0001

Frontiers of Computational Fluid Dynamics - 1998
Editors: David A. Caughey & Mohamed M. Hafez
©1998 World Scientific

14.1 Introduction

The year 1971 saw the publication of one of the landmark papers in computational aerodynamics, that of Murman and Cole [9]. As with many seminal works, its significance lies not so much in the specific problem that it addressed (small disturbance, plane transonic flow) but in the identification of a general approach to the solution of a technically important and theoretically difficult problem. The key features of Murman and Cole's work were the use of type-dependent differencing to correctly account for the proper domain of dependence of a mixed elliptic/hyperbolic equation, and the introduction of line relaxation to solve the steady flow equation. All subsequent work in transonic potential flows was based on these concepts. Jameson [6] extended Murman and Cole's ideas to the full potential equation with two important contributions. First, he introduced the rotated difference stencil, which generalized the Murman and Cole type-dependent difference operator to general coordinates. Second, he used the interpretation, introduced by Garabedian, of relaxation as an iteration in artificial time to construct stable relaxation schemes, generalizing the original line relaxation method of Reference [9]. The decade of the 1970s saw an explosion of activity in the solution of transonic potential flows, which has been summarized in the review article of Caughey [4].

At about the time of Caughey's survey, the main thrust of research in computational aerodynamics was moving away from the full potential equation and towards the solution of the steady Euler equations. By analogy with relaxation methods for the potential equation, solution methods for the Euler equations can be thought of as iterations in pseudo-time. Unlike methods for the potential equation, by far the most common approach has been to solve for steady flows as the long-time asymptotic solution of the unsteady equations, rather than directly solve the steady equations. This is much easier conceptually, as the unsteady Euler equations are a hyperbolic system, for which a considerable body of theory exists. Abandoning time accuracy allows considerable flexibility in the construction of the iterative scheme. To accelerate the convergence to the steady state, various types of preconditioning are used. In addition, starting with the unsteady equations leads to a straightforward extension of the iterative methods to the Reynolds-averaged Navier-Stokes equations, and considerable progress in this area has been made in recent years.

Nevertheless, there is still a great need to improve the convergence rates of existing methods. Both the line-relaxation methods for the potential equation and the time-iterative methods for the Euler and Navier-Stokes equations suffer from slow asymptotic convergence to the steady state. Interpreting the iteration as relaxation leads one naturally to consider convergence acceleration methods that have been successfully applied to classical relaxation schemes. The foremost among such methods is the multigrid algorithm.
The theory of multigrid is highly developed for elliptic equations, for which O(n) convergence rates are attainable, where n is the number of unknowns in the system. In other words, the work required to obtain a solution to the system of equations is proportional to the number of unknowns. Classical relaxation schemes for elliptic equations are extremely efficient at eliminating the short-wavelength components of the error, while the coarse grids in the multigrid process are efficient at removing the long-wavelength errors. Application of multigrid acceleration to the Euler or Reynolds-averaged Navier-Stokes equations leads one to consider temporal integration methods which also provide good damping of the short-wavelength error components. There are two main classes of multigrid methods based on the unsteady equations. One class of methods uses upwind differencing and implicit time integration as the smoother [1, 8, 17]. An alternative approach is one originally proposed by Jameson [7]. A finite-volume spatial discretization with explicit artificial viscosity is combined with a Runge-Kutta time integration as a smoother. This approach has been successfully extended to the Reynolds-averaged Navier-Stokes equations [16].

Unfortunately, these approaches have resulted in poor multigrid efficiency. When applied to high Reynolds number flows over complex geometries, convergence rates are often worse than 0.99 per multigrid cycle. Recently, significant improvements have been demonstrated by Pierce, et al. [10]. However, when one considers that for the Poisson equation on smooth domains convergence rates of nearly 0.1 per cycle are attainable in practice, it is clear that there is tremendous room for improvement of existing flow solvers.

In the remainder of this paper, a multigrid algorithm for the Euler equations which yields convergence rates comparable to those of the Poisson equation is presented. This algorithm abandons the time-marching approach to the steady state and relies instead on relaxation of the steady equations. In Section 14.2, the general principles underlying the algorithm are outlined. The mathematical formulation of the current approach is given in Section 14.3, while the solution procedure is described in Section 14.4. Results for incompressible, inviscid flow in two-dimensional channels and around airfoils are shown in Section 14.5. A brief discussion of the extension of the current method to the compressible flow equations is presented in Section 14.6, where a connection with potential flow solvers is also shown. A summary is found in Section 14.7.
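The smoothing property mentioned above (relaxation removes short-wavelength error quickly and long-wavelength error slowly) can be illustrated with a minimal sketch. It assumes a one-dimensional Laplace model problem with homogeneous Dirichlet ends, which is not the problem solved in this chapter; the function names, the grid size and the two mode numbers are chosen here purely for illustration.

```python
import math

def gauss_seidel_sweep(e):
    """One lexicographic Gauss-Seidel sweep for the 1D error equation
    e[i-1] - 2 e[i] + e[i+1] = 0 with e[0] = e[-1] = 0 (in place)."""
    for i in range(1, len(e) - 1):
        e[i] = 0.5 * (e[i - 1] + e[i + 1])

def damping_after_sweeps(mode, n=64, sweeps=3):
    """Max-norm of the error after `sweeps` sweeps, relative to its
    initial value, for an initial error sin(mode * pi * x)."""
    e = [math.sin(mode * math.pi * i / n) for i in range(n + 1)]
    e[0] = e[n] = 0.0                      # enforce the boundary values exactly
    start = max(abs(v) for v in e)
    for _ in range(sweeps):
        gauss_seidel_sweep(e)
    return max(abs(v) for v in e) / start

smooth = damping_after_sweeps(mode=1)      # long-wavelength error
rough = damping_after_sweeps(mode=48)      # short-wavelength error
print("long-wavelength error remaining :", smooth)
print("short-wavelength error remaining:", rough)
```

In this sketch the short-wavelength error is reduced strongly after three sweeps while the long-wavelength error is almost untouched, which is the behaviour that the coarse grids of a multigrid cycle are meant to compensate for.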
14.2 An Approach to Multigrid
According to Brandt [2], one of the major obstacles to achieving ideal multigrid performance for advection dominated flows is that the coarse grid provides only a fraction of the needed correction for smooth error components. This particular obstacle can be removed by designing a solver that effectively distinguishes between the elliptic, parabolic, and hyperbolic (advection) factors of the system and treats each one appropriately. The efficiency of such a solver will be limited by the efficiency of the solvers for each of the factors of the system. For instance, advection can be treated by space marching, while elliptic factors can be treated by multigrid. In this example, all components of the error associated with the advection terms are eliminated in one sweep, and the convergence rate is limited by the speed of the elliptic solver. Brandt presents an approach called "distributive relaxation" by which one can construct smoothers that effectively distinguish between the different factors of the operator. Using this approach, Brandt and Yavneh have demonstrated textbook multigrid convergence rates for the incompressible Navier-Stokes equations [3]. Their results are for a simple geometry and a Cartesian grid, using a staggered-grid discretization of the equations. In a closely related approach, Ta'asan [15] presents a fast multigrid solver for the compressible Euler equations. This method is based on a set of "canonical variables" which express the steady Euler equations in terms of
an elliptic and a hyperbolic partition [14]. Ta'asan uses this partition to guide the discretization of the equations. A staggered grid is used, with different variables residing at cell, vertex, and edge centers. In Reference [15] it is shown that ideal multigrid efficiency can be achieved for the compressible Euler equations for two-dimensional subsonic flow using body-fitted grids. One possible limitation of the use of canonical variables is that the partition of the inviscid equations is not directly applicable to the viscous equations.

Recently the authors [11] have presented an alternative scheme to Brandt's distributive relaxation and to Ta'asan's canonical variable decomposition. This scheme does not require staggered grids, but uses conventional vertex-based finite-volume or finite-difference discretizations of the primitive variables. This simplifies the restriction and prolongation operations, because the same operator can be used for all variables. A projection operator is applied to the system of equations, resulting in a Poisson equation for the pressure. By applying the projection operator to the discrete equations rather than to the differential equations, the proper boundary condition on the pressure is satisfied directly. The Poisson equation for the pressure may be treated by Gauss-Seidel relaxation, while the advection terms of the momentum equation are treated by space-marching. Because the elliptic and advection parts of the system are decoupled, ideal multigrid efficiency can be achieved. Compared to distributive relaxation and the canonical variables approaches, this method is extremely simple.
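The claim that advection can be handled by space-marching, with the advected error removed in a single sweep, can be made concrete with a small sketch. It assumes a scalar model equation a φ_x + b φ_y = 0 with constant a, b > 0 on a Cartesian grid and first-order upwind differences; the grid size, coefficients, inflow profile and function names are assumptions made here for illustration only and are not taken from the authors' codes.

```python
def make_grid(n, h):
    """Zero initial guess with prescribed values on the inflow
    (left, i = 0, and bottom, j = 0) boundaries."""
    phi = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n + 1):
        phi[0][k] = 1.0 + k * h
        phi[k][0] = 1.0 - 0.5 * k * h
    return phi

def sweep(phi, a, b, downstream=True):
    """One Gauss-Seidel sweep of the upwind scheme
    a (phi[i][j] - phi[i-1][j]) / h + b (phi[i][j] - phi[i][j-1]) / h = 0."""
    n = len(phi) - 1
    order = range(1, n + 1) if downstream else range(n, 0, -1)
    for i in order:
        for j in order:
            phi[i][j] = (a * phi[i - 1][j] + b * phi[i][j - 1]) / (a + b)

def residual_norm(phi, a, b, h):
    """Max-norm of the upwind residual over all non-inflow vertices."""
    n = len(phi) - 1
    return max(abs(a * (phi[i][j] - phi[i - 1][j]) / h
                   + b * (phi[i][j] - phi[i][j - 1]) / h)
               for i in range(1, n + 1) for j in range(1, n + 1))

n, h, a, b = 16, 1.0 / 16, 1.0, 0.5

phi_down = make_grid(n, h)
sweep(phi_down, a, b, downstream=True)     # vertices visited in the flow direction
res_down = residual_norm(phi_down, a, b, h)

phi_up = make_grid(n, h)
sweep(phi_up, a, b, downstream=False)      # vertices visited against the flow
res_up = residual_norm(phi_up, a, b, h)

print("residual after one downstream sweep:", res_down)
print("residual after one upstream sweep  :", res_up)
```

With the downstream ordering every vertex sees already-updated upstream neighbours, so one sweep solves the discrete advection problem to round-off; with the opposite ordering a single sweep leaves a large residual.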
14.3 Mathematical Formulation
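The incompressible Euler equations in primitive variables are

$$
u\,\partial_x u + v\,\partial_y u + \partial_x p = 0, \qquad
u\,\partial_x v + v\,\partial_y v + \partial_y p = 0, \qquad
\partial_x u + \partial_y v = 0,
$$

where u and v are the components of the velocity in the x and y directions, respectively, and p is the pressure. The density is taken to be one. The advection operator is defined by

$$
Q \equiv u\,\partial_x + v\,\partial_y, \tag{14.1}
$$

where $\partial_x$, $\partial_y$ are the partial differentiation operators. The Euler equations may be written as

$$
\mathbf{L}\mathbf{q} \equiv
\begin{pmatrix} Q & 0 & \partial_x \\ 0 & Q & \partial_y \\ \partial_x & \partial_y & 0 \end{pmatrix}
\begin{pmatrix} u \\ v \\ p \end{pmatrix} = 0. \tag{14.2}
$$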
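Introducing the adjoint to Q, defined by

$$
Q^*(f) = -\partial_x(uf) - \partial_y(vf), \tag{14.3}
$$

a projection operator P is defined:

$$
\mathbf{P} =
\begin{pmatrix} I & 0 & 0 \\ 0 & I & 0 \\ \partial_x & \partial_y & Q^* \end{pmatrix}. \tag{14.4}
$$

Applying the projection operator to the Euler equations yields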
$$
\tilde{\mathbf{L}}\mathbf{q} = \mathbf{P}\mathbf{L}\mathbf{q} =
\begin{pmatrix} Q & 0 & \partial_x \\ 0 & Q & \partial_y \\ 0 & 0 & \Delta \end{pmatrix}
\begin{pmatrix} u \\ v \\ p \end{pmatrix}
+ 2 \begin{pmatrix} 0 \\ 0 \\ (\partial_x v)(\partial_y u) - (\partial_x u)(\partial_y v) \end{pmatrix}, \tag{14.5}
$$

where $\Delta$ is the Laplacian. The matrix operator on the right-hand side consists of the principal part of $\tilde{\mathbf{L}}$ (i.e., the highest-order terms of the operator), and the remaining terms are the subprincipal terms. These terms arise because the coefficients u and v in the operators Q and Q* are not constant. It is important to note that the subprincipal terms can be ignored for the purpose of constructing a relaxation scheme.

The system of equations 14.5 is a higher-order system than the original Euler equations 14.2. The continuity equation, which is a first-order partial differential equation, has been replaced by a second-order differential equation for the pressure. One might expect that Eq. 14.5 would require a boundary condition on the pressure in addition to the physical boundary condition of flow tangency at the wall, which is required by Eq. 14.2. However, at the boundary of the domain, the third equation of 14.5 takes the form

$$
(-u\,\partial_y v + v\,\partial_y u + \partial_x p)\,n_x + (u\,\partial_x v - v\,\partial_x u + \partial_y p)\,n_y = 0, \tag{14.6}
$$

where $n_x$ and $n_y$ are the components of the unit normal at the wall. This is simply the equation for the momentum normal to the wall. Because the pressure equation at the wall takes the form of Eq. 14.6, which in this case may be thought of as a compatibility condition of the governing equations, no auxiliary boundary condition on the pressure is needed.

The matrix operator in Eq. 14.5 is upper triangular. Because the pressure satisfies a Poisson equation, a conventional relaxation method, such as Gauss-Seidel, can be used to solve it. Upwind differencing of the advection operator in the momentum equations and a downstream ordering of the grid vertices allows marching of the momentum equations. A collective Gauss-Seidel approach is used here, where the vertices are ordered in the flow direction. This is described more fully in the next section.
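As a check on the pressure row of Eq. 14.5, the third row of the projected system can be expanded by hand. The following is a sketch of that calculation, using only the definitions of Q and Q* in Eqs. 14.1 and 14.3; the shorthand $D = \partial_x u + \partial_y v$ for the divergence is introduced here for convenience and is not notation from the text.

```latex
\begin{aligned}
&\partial_x\!\left(Qu + \partial_x p\right) + \partial_y\!\left(Qv + \partial_y p\right) + Q^*(D) \\
&\quad = \Delta p + \left[\,Q D + (\partial_x u)^2 + 2(\partial_x v)(\partial_y u) + (\partial_y v)^2\right]
        - \left[\,Q D + D^2\right] \\
&\quad = \Delta p + (\partial_x u)^2 + 2(\partial_x v)(\partial_y u) + (\partial_y v)^2
        - (\partial_x u + \partial_y v)^2 \\
&\quad = \Delta p + 2\left[(\partial_x v)(\partial_y u) - (\partial_x u)(\partial_y v)\right].
\end{aligned}
```

The second-derivative terms in u and v cancel between the divergence of the momentum equations and Q* applied to the continuity equation, which is why only the Laplacian of the pressure survives in the principal part and the remaining products of first derivatives appear as subprincipal terms.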
Figure 1 Triangular cell of an unstructured grid.
14.4 Solution Procedure
The first step in approximating $\tilde{\mathbf{L}}$ is to discretize the Euler equations 14.2. Unlike the methods of References [3, 15], which use staggered grids, the current approach is vertex-based, where all the unknowns are stored at the vertices of the grid. Discretizations for quadrilateral structured grids and triangular unstructured grids have been coded. A great deal of flexibility in the form of the discrete approximation to the momentum equations is possible with the current method.

By way of illustration, consider a cell-vertex discretization on an unstructured triangular grid. A typical grid cell $\Omega$ is shown in Fig. 1. Cell-averaged gradients of the unknowns are found by using a trapezoidal rule integration around the boundary of $\Omega$. For example, the discrete approximation to the gradient of u on $\Omega$ is

$$
\mathbf{i}\,\partial_x^h u + \mathbf{j}\,\partial_y^h u =
\frac{\mathbf{i}}{2A_\Omega}\big(u_0(y_1 - y_2) + u_1(y_2 - y_0) + u_2(y_0 - y_1)\big)
- \frac{\mathbf{j}}{2A_\Omega}\big(u_0(x_1 - x_2) + u_1(x_2 - x_0) + u_2(x_0 - x_1)\big), \tag{14.7}
$$

where $A_\Omega$ is the area of the triangle. The superscript h is used to denote the difference approximation to the corresponding differential operator. Gradients of v and p are obtained likewise. These gradients are used to approximate Eq. 14.2 on the triangle.

An upwind approximation to Q at the vertices of the grid is obtained by distributing the cell-averaged momentum equation residuals to the vertices of each triangle appropriately. The current scheme is not tied to any particular form of the upwind discretization. One choice, which was used to obtain the unstructured grid results presented in Section 14.5, is the advection scheme of Giles, et al. [5]. In this scheme, the residuals are
distributed to the vertices of $\Omega$ using the weights

$$
w_i = 1 - \frac{\Delta n_i}{l_n}, \qquad i = 0, 1, 2,
$$

where $l_n$ is the length of the projection of $\Omega$ onto the crossflow direction, and $\Delta n_i$ is the component of the length in the crossflow direction of the edge opposite the i-th vertex. An alternative upwind discretization currently being developed by the authors is based on the multidimensional upwind formulation of Sidilkover [12].

Once the cell-averaged residuals of the continuity and momentum equations have been computed, the projection operator P is applied to these discrete equations to obtain the residual for the pressure Poisson equation of Eq. 14.5. Letting $R_{p_i}$ be the pressure equation residual at vertex i, the application of P can be written in integral form,

$$
R_{p_i} = \frac{1}{A_i} \sum_{\text{triangles}} \int_{\partial A_i \cap \Omega}
\Big[ \Big( \boxed{Q^h u + \partial_x^h p} - u\,\boxed{\partial_x^h u + \partial_y^h v} \Big)\,dy
- \Big( \boxed{Q^h v + \partial_y^h p} - v\,\boxed{\partial_x^h u + \partial_y^h v} \Big)\,dx \Big], \tag{14.8}
$$
where $A_i$ is the area of the control volume centered on the i-th vertex and the superscript h is used to denote the discrete approximations to the corresponding differential operators on the triangle as before. The summation is over all the triangles adjoining the i-th vertex. The dashed lines in Fig. 1 are the segments of the boundaries of $A_0$, $A_1$ and $A_2$ that lie in cell $\Omega$. The boxed terms in Eq. 14.8 are the cell-averaged residuals of the x and y momentum equations and the continuity equation. The contributions of the cell-averaged residuals on $\Omega$ to $R_p$ are found by evaluating Eq. 14.8 over the appropriate segment of the boundary of $A_i$ lying in $\Omega$, taking the boxed terms in Eq. 14.8 to be constant over the cell.

Applying the projection operator P at the discrete level in this way, rather than starting with the differential equations 14.5 and discretizing them, has two important advantages. First, the discrete approximation of Eq. 14.8 at boundary vertices reduces to Eq. 14.6, automatically providing the correct boundary condition for the pressure. Second, if the momentum and continuity equations are discretized on the triangles in conservation form, it is possible to obtain a fully conservative scheme. This is particularly important for compressible flows with shocks.

The multigrid algorithm uses a sequence of grids $G_K, G_{K-1}, \ldots, G_0$, where $G_K$ is the finest grid and $G_0$ the coarsest. Call the discrete approximation to the operator $\tilde{\mathbf{L}}$ on the k-th grid $\mathbf{L}_k$, and let $\mathbf{q}_k$ be the solution on that grid. This system has the form $\mathbf{L}_k \mathbf{q}_k = \mathbf{f}_k$, where the entries of $\mathbf{L}_k$ are 3 x 3 block matrices which operate on the unknowns $(u, v, p)^T$ at each grid vertex. A general iteration scheme is constructed by writing the operator $\mathbf{L}_k$ as $\mathbf{L}_k = \mathbf{M}_k - \mathbf{N}_k$, where the splitting is chosen such that $\mathbf{M}_k$ is easily inverted. Lexicographic Gauss-Seidel is obtained by taking $\mathbf{M}_k$ to be the block lower-triangular matrix resulting from ignoring the terms above the diagonal blocks of $\mathbf{L}_k$. A further simplification is obtained if the diagonal blocks of $\mathbf{M}_k$ contain only those entries corresponding to the principal part of the operator. Because the operator in Eq. 14.5 is upper triangular, the diagonal blocks of $\mathbf{M}_k$ will then be 3 x 3 upper triangular matrices. Letting $\mathbf{q}_k^n$ be the n-th iterate of the solution on the k-th grid, the relaxation iteration is
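$$
\mathbf{M}_k \mathbf{q}_k^{n+1} = \mathbf{f}_k + \mathbf{N}_k \mathbf{q}_k^{n}.
$$

The operator $\mathbf{L}_k$ is nonlinear, so $\mathbf{M}_k$ and $\mathbf{N}_k$ are functions of $\mathbf{q}_k^{n}$ and $\mathbf{q}_k^{n+1}$. Letting $\delta\mathbf{q}_k^{n} = \mathbf{q}_k^{n+1} - \mathbf{q}_k^{n}$, the iteration may be rewritten as

$$
\mathbf{M}_k\,\delta\mathbf{q}_k^{n} = \mathbf{f}_k - \mathbf{L}_k \mathbf{q}_k^{n}. \tag{14.9}
$$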
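The correction form of Eq. 14.9 can be sketched on a small linear system. This is only an illustration under simplifying assumptions: the entries are scalars rather than the 3 x 3 blocks of the text, the operator is a fixed tridiagonal matrix rather than the nonlinear flow operator, and the names `gauss_seidel_correction`, `L`, `f` and `q_exact` are hypothetical.

```python
def gauss_seidel_correction(L, q, f):
    """One sweep of M dq = f - L q, where M is the lower triangle of L
    (diagonal included). Returns the updated iterate q + dq."""
    n = len(q)
    # Residual of the current iterate: r = f - L q
    r = [f[i] - sum(L[i][j] * q[j] for j in range(n)) for i in range(n)]
    # Forward substitution, since M is lower triangular
    dq = [0.0] * n
    for i in range(n):
        dq[i] = (r[i] - sum(L[i][j] * dq[j] for j in range(i))) / L[i][i]
    return [q[i] + dq[i] for i in range(n)]

# Model operator: 1D second-difference matrix with a known solution
L = [[2.0, -1.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0],
     [0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, -1.0, 2.0]]
q_exact = [1.0, 2.0, 3.0, 4.0]
f = [sum(L[i][j] * q_exact[j] for j in range(4)) for i in range(4)]

q = [0.0] * 4
for _ in range(100):
    q = gauss_seidel_correction(L, q, f)
error = max(abs(q[i] - q_exact[i]) for i in range(4))
print("max error after 100 sweeps:", error)
```

Writing the sweep in terms of the residual f - L q has the practical advantage that the same residual routine serves both the relaxation and the multigrid restriction.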
Because $\mathbf{M}_k$ is block lower-triangular, $\delta\mathbf{q}_k^{n}$ is found by forward substitution. At each vertex, a 3 x 3 upper triangular matrix must be inverted. If the discrete approximation to the advection operator Q is fully upwind and the grid points are ordered in the flow direction, then the 3 x 3 blocks of $\mathbf{N}_k$ will have zeroes in the first two rows. In this case, lexicographic Gauss-Seidel relaxation is equivalent to space-marching of the advection terms. The advected error is effectively eliminated in one relaxation sweep and the convergence rate of the system becomes that of the Poisson equation for the pressure. It is possible to get ideal multigrid convergence rates because each component of the error is treated appropriately.

A straightforward Full Approximation Scheme (FAS) multigrid iteration is applied to the system of equations. Let $\mathbf{L}_{k-1}$ be the coarse grid operator, $I_k^{k-1}$ be the fine-to-coarse grid restriction operator, and $I_{k-1}^{k}$ be the coarse-to-fine grid prolongation operator. If $\mathbf{q}_k$ is the current solution on grid k, the residual on this grid is $\mathbf{r}_k = \mathbf{f}_k - \mathbf{L}_k \mathbf{q}_k$. This leads to the coarse-grid equation

$$
\mathbf{L}_{k-1} \mathbf{q}_{k-1} = \mathbf{f}_{k-1} = I_k^{k-1} \mathbf{r}_k + \mathbf{L}_{k-1}\big(I_k^{k-1} \mathbf{q}_k\big). \tag{14.10}
$$

After solving the coarse-grid equation for $\mathbf{q}_{k-1}$, the fine-grid solution is corrected by

$$
\mathbf{q}_k^{\text{new}} \leftarrow \mathbf{q}_k + I_{k-1}^{k}\big(\mathbf{q}_{k-1} - I_k^{k-1} \mathbf{q}_k\big). \tag{14.11}
$$

Equation 14.10 is solved by applying the same relaxation procedure that is used to solve the fine-grid equation. Multigrid is applied recursively to the coarse-grid equation. On the coarsest grid, many relaxation sweeps are performed to ensure that the equation is solved completely. A conventional V-cycle or W-cycle is used.
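The structure of Eqs. 14.10 and 14.11 can be sketched with a two-grid FAS cycle. The sketch below is a stand-in under stated assumptions: it uses the one-dimensional Poisson problem -q'' = f with Dirichlet ends instead of the flow equations, full-weighting restriction and linear interpolation as the transfer operators, and only two grid levels; all function names are hypothetical.

```python
import math

def apply_L(q, h):
    """Discrete operator -q'' at interior points (end values are boundary data)."""
    out = [0.0] * len(q)
    for i in range(1, len(q) - 1):
        out[i] = (2.0 * q[i] - q[i - 1] - q[i + 1]) / (h * h)
    return out

def relax(q, f, h, sweeps):
    """Lexicographic Gauss-Seidel sweeps for L q = f (in place)."""
    for _ in range(sweeps):
        for i in range(1, len(q) - 1):
            q[i] = 0.5 * (q[i - 1] + q[i + 1] + h * h * f[i])

def restrict(v):
    """Full-weighting restriction to a grid with half as many intervals."""
    nc = (len(v) - 1) // 2
    out = [0.0] * (nc + 1)
    out[0], out[nc] = v[0], v[-1]
    for i in range(1, nc):
        out[i] = 0.25 * v[2 * i - 1] + 0.5 * v[2 * i] + 0.25 * v[2 * i + 1]
    return out

def prolong(v):
    """Linear interpolation to a grid with twice as many intervals."""
    nc = len(v) - 1
    out = [0.0] * (2 * nc + 1)
    for i in range(nc + 1):
        out[2 * i] = v[i]
    for i in range(nc):
        out[2 * i + 1] = 0.5 * (v[i] + v[i + 1])
    return out

def residual_norm(q, f, h):
    Lq = apply_L(q, h)
    return max(abs(f[i] - Lq[i]) for i in range(1, len(q) - 1))

def fas_two_grid(q, f, h, pre=2, post=1, coarse_sweeps=200):
    """One two-grid FAS cycle with `pre` and `post` fine-grid sweeps."""
    relax(q, f, h, pre)
    Lq = apply_L(q, h)
    r = [fi - li for fi, li in zip(f, Lq)]          # r_k = f_k - L_k q_k
    r[0] = r[-1] = 0.0                              # no residual at boundary points
    q_restricted = restrict(q)                      # I q_k
    # Coarse right-hand side as in Eq. 14.10: I r_k + L_{k-1}(I q_k)
    fc = [a + b for a, b in zip(restrict(r), apply_L(q_restricted, 2 * h))]
    qc = list(q_restricted)
    relax(qc, fc, 2 * h, coarse_sweeps)             # many sweeps on the coarsest grid
    # Correction as in Eq. 14.11: q_k + P (q_{k-1} - I q_k)
    corr = prolong([a - b for a, b in zip(qc, q_restricted)])
    for i in range(1, len(q) - 1):
        q[i] += corr[i]
    relax(q, f, h, post)

n = 16
h = 1.0 / n
f = [math.pi ** 2 * math.sin(math.pi * i * h) for i in range(n + 1)]
q = [0.0] * (n + 1)
initial = residual_norm(q, f, h)
history = []
for _ in range(10):
    fas_two_grid(q, f, h)
    history.append(residual_norm(q, f, h))
print("residual reduction after 10 cycles:", history[-1] / initial)
```

For a linear operator the FAS coarse-grid problem is equivalent to the usual correction scheme; the full-approximation form is kept here because it is the one that carries over to the nonlinear operator of the text.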
Figure 2 Channel geometry.

14.5 Results
Both unstructured grid and structured grid flow solvers based on the theory in Sections 14.3 and 14.4 have been written. These codes are described in Reference [11], where extensive solutions are presented. Results illustrating the efficiency of the scheme are presented here.

Solutions for incompressible, inviscid flow in a channel have been obtained with both solvers. The channel geometry and boundary conditions are shown in Fig. 2. The shape of the lower wall for 0 < x < 1 is $y(x) = \tau \sin^2 \pi x$. For the computations shown here, the thickness ratio $\tau$ is 0.05. The flow angle and total pressure are specified at the inlet and the pressure is specified at the outlet. The flow tangency condition $\mathbf{u} \cdot \mathbf{n} = 0$ is enforced at the upper and lower walls of the channel.

Solutions were obtained on quasi-uniform quadrilateral grids. A simple shearing transformation was used in the center part of the channel to obtain boundary conforming grids. For the unstructured grid solver, the grids were triangulated by dividing each quadrilateral cell along a diagonal. A series of nested coarse grids was obtained by coarsening the fine grids by a factor of two in each coordinate direction. In all cases shown below, the coarsest grid was 7 x 3 vertices. Lexicographic Gauss-Seidel relaxation was used, with the grid vertices ordered from the lower-left to the upper-right of the channel. This resulted in downstream relaxation of the momentum equations. A V(2,1) multigrid cycle was used; that is, two relaxation sweeps were performed on each grid before restricting to the coarse grid, and one relaxation sweep was performed after the coarse-grid correction was added to the fine-grid solution.

The computed pressure on a grid of 97 x 33 vertices is shown in Fig. 3 for the unstructured grid flow solver and in Fig. 4 for the structured grid solver. Comparisons of convergence rates for different grid densities are shown in Figs. 5 and 6 for the unstructured and structured grid flow solvers, respectively. The $L_1$ norm of the pressure equation residual is shown; the
Figure 3 Pressure, contour increment Δp = 0.01, for an unstructured grid of 97 x 33 vertices.

Figure 4 Pressure, contour increment Δp = 0.01, for a structured grid of 97 x 33 vertices.
momentum equation residuals show the same behavior. The finest grid used for each flow solver contained 385 x 129 vertices, with a total of 7 grid levels. The convergence rate of the unstructured grid solver on the finest grid is approximately 0.190 residual reduction per multigrid cycle. The structured grid results are slightly better at 0.167 per cycle. These rates are comparable to the ideal rate of 0.125 per cycle for the Poisson equation. The better performance of the structured grid solver is most likely because of better restriction and prolongation operators; the unstructured flow solver performs bilinear interpolation using only the locations of a fine-grid vertex and the three vertices of the coarse-grid cell containing that vertex. What is most important is that the figures show nearly ideal multigrid convergence rates, independent of the grid spacing. This shows that convergence is achieved in order n operations. For complex geometries it may not be practical to generate a series of nested unstructured grids, and the performance of the multigrid solver may
Figure 5 Comparison of convergence rates on unstructured grids.

Figure 6 Comparison of convergence rates on structured grids.

Figure 7 Grid generated by perturbing the vertices of the 49 x 17 grid.

Figure 8 Pressure, contour increment Δp = 0.01, randomly perturbed unstructured grid of 97 x 33 vertices.
be expected to deteriorate. To show the robustness of the current method, the triangular grid solver was run for a series of non-nested coarse grids. These were generated by randomly perturbing the locations of the vertices on each of the nested grids independently. The perturbed 49 x 17 grid is shown in Fig. 7. The computed pressure on a perturbed 97 x 33 fine grid with 5 grid levels is shown in Fig. 8 and the convergence rate is shown in Fig. 9. The pressure contours are very smooth, showing no sign of the lack of grid smoothness. The asymptotic convergence rate has deteriorated to a still-respectable 0.24 per cycle. Solutions for nonlifting flow over a symmetric Karman-Trefftz airfoil have been obtained with the structured grid solver. A fine O-grid of 385 X 193 vertices was generated from a conformal mapping, and the coarse grids are nested by recursively eliminating every other vertex in each coordinate direction. The grid spacing was chosen to obtain unit aspect ratio grid cells. The outer boundary is approximately 13 chord lengths from the airfoil. Farfield boundary conditions are given by the analytic solution. At inflow points
Figure 9 Convergence rate, randomly perturbed unstructured grid of 97 x 33 vertices, 5 grid levels, V(2,1) cycle.
along the outer boundary the total pressure and flow inclination angle are specified. For outflow points the pressure is specified. On the airfoil surface the tangency condition is enforced. To obtain ideal multigrid convergence rates, it is necessary to sort the vertices in a downstream order so that the advection terms in the momentum equations are marched. This is easily done here by relaxing along the radial grid lines from the outer boundary to the airfoil surface over the forward half of the domain, and from the airfoil surface to the outer boundary over the latter half of the domain. For each case run, the coarsest grid consisted of 13 x 7 vertices. Comparisons between computed and analytic surface pressure coefficients for nonlifting flow around the Karman-Trefftz airfoil are shown in Fig. 10. A W(2,1) multigrid cycle was used for these computations. The computed solution agrees very well with the analytic solution, except for the recompression at the trailing edge. Note that there is no clustering of the grid in this region, which exacerbates the problem. A comparison of the convergence rates of the pressure equation residual for three grid densities is shown in Fig. 11. A slight deterioration of the convergence rate with increasing grid refinement is observed: on the 385 x 193 grid, the rate is 0.153 per cycle. Nevertheless, as with the channel flow results, the convergence rates are very nearly grid independent, and are very close to the ideal rate of 0.125 per cycle.
Figure 10 Surface pressure coefficient, nonlifting Karman-Trefftz airfoil, 193 x 97 grid.

Figure 11 Comparison of convergence rates for nonlifting Karman-Trefftz airfoil.
A summary of the convergence rates on the finest grids is presented in Table 1. Two sets of results are shown: the convergence rate per multigrid cycle, and the convergence rate per work unit. For the purposes of the discussion a work unit (WU) is taken to be one Gauss-Seidel relaxation sweep on the finest grid. This is essentially the cost of one residual evaluation on the finest grid. The actual convergence rates are compared to the ideal convergence rates, which are computed as follows. Let $\mu$ be the smoothing rate of the relaxation method. For a V(m, n) or W(m, n) cycle, the ideal convergence rate is $\mu^{m+n}$. Lexicographic Gauss-Seidel for the Poisson equation has a smoothing rate $\mu = 0.5$. This gives an ideal convergence rate of $0.5^3 = 0.125$ for a V(2,1) or a W(2,1) cycle.

To compute the convergence rate per work unit we use the following formula. For each cycle, there are a total of m + n fine-grid relaxation sweeps. Examination of Eq. 14.10 shows that the fine-to-coarse grid restriction requires one residual evaluation on the fine grid and an additional residual evaluation on the coarse grid. The coarse grid residual evaluation is 1/4 the cost of a fine grid residual evaluation. Because most of the cost of a relaxation sweep is in the evaluation of the residual, we have that each cycle requires a total of (m + n + 1 + 1/4) work units on the finest grid. The cost of interpolating the residuals and solutions between grid levels is neglected. For a V(m, n)-cycle, we have that

$$
\mathrm{WU}_{V\text{-cycle}} = \left(m + n + 1 + \tfrac{1}{4}\right)\left(1 + \tfrac{1}{4} + \tfrac{1}{16} + \cdots\right) = \tfrac{4}{3}\left(m + n + \tfrac{5}{4}\right).
$$

Because a W-cycle involves two coarse-grid solutions per cycle, we have

$$
\mathrm{WU}_{W\text{-cycle}} = \left(m + n + 1 + \tfrac{1}{4}\right)\left(1 + \tfrac{1}{2} + \tfrac{1}{4} + \cdots\right) = 2\left(m + n + \tfrac{5}{4}\right).
$$

These numbers yield ideal convergence rates of $\mu^{3(m+n)/(4(m+n+5/4))}$ per work unit for a V-cycle and $\mu^{(m+n)/(2(m+n+5/4))}$ per work unit for a W-cycle. The V(2,1) cycle is seen to require 5 2/3 WU per cycle. The W(2,1) cycle is 50% more expensive, requiring 8 1/2 WU per cycle. By way of comparison, one V(2,1) cycle is only slightly more work than a single time step of a 5-stage Runge-Kutta scheme on the finest grid. The ideal convergence rates for lexicographic Gauss-Seidel are 0.693 per WU for a V(2,1) cycle and 0.783 per WU for a W(2,1) cycle. The actual rates shown in Table 1 are seen to be very close to ideal.

The convergence rates in Table 1 can be used to estimate the work required to obtain a solution to the level of the discretization error on the fine grid. Let p be the order of approximation of the discrete operator and let $h_k$ be the grid spacing parameter on the k-th grid. An initial guess to the solution
Table 1 Summary of convergence rates for multigrid solver on finest grids for channel and airfoil flows, with a comparison to the ideal rates.

                                             Convergence rate     Convergence rate
                                             per cycle            per work unit
Case                                Cycle    ideal     actual     ideal     actual
channel, 385 x 129, unstructured    V(2,1)   0.125     0.190      0.693     0.746
channel, 385 x 129, structured      V(2,1)   0.125     0.167      0.693     0.729
airfoil, 385 x 193, structured      W(2,1)   0.125     0.153      0.783     0.802
on the fine grid $G_K$ is obtained by interpolating a solution computed on grid $G_{K-1}$. Assume that the solution on $G_{K-1}$ has been obtained to the level of the discretization error $\tau_{K-1} = O(h_{K-1}^p)$ on that grid. The multigrid cycle is used to reduce the error from $\tau_{K-1}$ to $\tau_K$. Letting $\mu_w$ be the convergence rate per work unit, the amount of work $W_K$ required to get the solution on $G_K$ from the initial solution on $G_{K-1}$ is

$$
W_K = \frac{\log(\tau_K / \tau_{K-1})}{\log \mu_w} = \frac{p \log(h_K / h_{K-1})}{\log \mu_w}.
$$

A Full Multigrid (FMG) cycle starts with a solution on the coarsest grid, $G_0$, and recursively generates improved solutions on the finer grids using the strategy above. For the nested grids considered here, the grid spacing parameters are related by $h_{k-1} = 2h_k$, and the amount of work on each grid is related by $W_{k-1} = W_k / 4$. The discretization is second-order accurate, i.e., p = 2. This gives the following estimate for the total work required to obtain a solution accurate to $\tau_K$:

$$
W_{\text{total}} = -\frac{2 \log 2}{\log \mu_w}\left(1 + \tfrac{1}{4} + \tfrac{1}{16} + \cdots\right) = -\frac{8}{3}\,\frac{\log 2}{\log \mu_w}. \tag{14.12}
$$
Using the values in the last column of Table 1 for $\mu_w$, we see that channel flow solutions can be obtained to the level of discretization error in approximately 6.4 WU using an FMG cycle. Airfoil solutions can be obtained in about 8.3 WU. These estimates are generally low, and in fact are less than the work of a single FMG cycle (7.6 and 11.3 WU for the channel and airfoil cases, respectively). The work computed using Eq. 14.12 also does not account for the introduction of short-wavelength errors in the interpolation of the coarse grid solutions to the fine grids. Nevertheless, Eq. 14.12 is a useful guide to the expected performance of the multigrid scheme.
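The arithmetic behind these estimates can be reproduced with a short script. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the work per cycle is taken as (4/3)(m + n + 5/4) WU for a V-cycle and 2(m + n + 5/4) WU for a W-cycle, and the total work is evaluated from Eq. 14.12 with p = 2 and a grid refinement factor of 2 in two dimensions.

```python
import math

def cycle_work_units(sweeps, cycle):
    """Work per multigrid cycle in WU, with sweeps = m + n relaxation sweeps.

    Assumed cost model: (4/3)(m+n+5/4) for a V-cycle, 2(m+n+5/4) for a W-cycle.
    """
    factor = {"V": 4.0 / 3.0, "W": 2.0}[cycle]
    return factor * (sweeps + 1.25)

def rate_per_work_unit(rate_per_cycle, sweeps, cycle):
    """Convert a convergence rate per cycle into a rate per work unit."""
    return rate_per_cycle ** (1.0 / cycle_work_units(sweeps, cycle))

def fmg_total_work(mu_w, p=2, refinement=2.0, dim=2):
    """Total FMG work: finest-grid work times the geometric sum over coarser grids."""
    w_finest = p * math.log(1.0 / refinement) / math.log(mu_w)
    geometric_sum = 1.0 / (1.0 - refinement ** (-dim))   # 1 + 1/4 + 1/16 + ...
    return w_finest * geometric_sum

if __name__ == "__main__":
    print(cycle_work_units(3, "V"), cycle_work_units(3, "W"))
    print(rate_per_work_unit(0.125, 3, "V"), rate_per_work_unit(0.125, 3, "W"))
    for mu_w in (0.746, 0.729, 0.802):   # last column of Table 1
        print(mu_w, fmg_total_work(mu_w))
```

Evaluated this way, the Table 1 rates give totals of roughly 6 WU for the channel cases and a little over 8 WU for the airfoil, in line with the approximate figures quoted in the text.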
14.6
Extension to Compressible Flow
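The scheme presented here has a straightforward extension to the compressible Euler equations. In primitive variables the equations are

$$
L\mathbf{q} \equiv
\begin{pmatrix}
Q & 0 & 0 & 0 \\
0 & \rho Q & 0 & \partial_x \\
0 & 0 & \rho Q & \partial_y \\
0 & \rho\,\partial_x & \rho\,\partial_y & c^{-2} Q
\end{pmatrix}
\begin{pmatrix} s \\ u \\ v \\ p \end{pmatrix} = 0, \qquad (14.13)
$$

where $\rho$ is the density, c is the speed of sound, and s is the entropy. The projection operator P for this system is

$$
P =
\begin{pmatrix}
I & 0 & 0 & 0 \\
0 & I & 0 & 0 \\
0 & 0 & I & 0 \\
0 & \partial_x & \partial_y & Q^*
\end{pmatrix}. \qquad (14.14)
$$

The operators Q and Q* are defined as in Eqs. 14.1 and 14.3. Applying this to Eq. 14.13 and ignoring the subprincipal terms as before yields

$$
PL\mathbf{q} =
\begin{pmatrix}
Q & 0 & 0 & 0 \\
0 & \rho Q & 0 & \partial_x \\
0 & 0 & \rho Q & \partial_y \\
0 & 0 & 0 & \Delta - M^2 \partial_{ss}
\end{pmatrix}
\begin{pmatrix} s \\ u \\ v \\ p \end{pmatrix} + \mathrm{s.p.t.}, \qquad (14.15)
$$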
where M is the Mach number, $\partial_s$ is the partial derivative in the streamwise direction, and "s.p.t." are the subprincipal terms. The most significant difference between the compressible and the incompressible equations is that a Prandtl-Glauert-like operator acts on the pressure. Note that this system approaches the system for the incompressible equations in the limit of vanishing Mach number. For subsonic flow the compressible equations can be solved by the same relaxation scheme as the incompressible equations. Unlike time marching methods, the convergence rate will not deteriorate as the Mach number approaches zero. The appearance of the Prandtl-Glauert operator in the pressure equation is significant. In effect, the problem of solving the pressure equation is no different than that of solving the full potential equation. This is a two-edged sword. On the one hand, the difficulties of relaxing the pressure equation in the transonic case are precisely those of relaxing the transonic full potential equation. This is a fundamental difficulty which is faced by any method that works directly on the steady flow equation. On the other hand, one can expect that the wealth of experience in solving potential flows can be directly applied to the current scheme for the compressible Euler equations. The treatment of the advection terms is not essentially different from the incompressible case.
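The change of type that makes the transonic case difficult can be illustrated with a few lines of code. This is a minimal sketch under the assumption that, in streamwise (s) and normal (n) coordinates, the principal part of the pressure operator takes the Prandtl-Glauert form $(1 - M^2)\,\partial_{ss} + \partial_{nn}$; the function name is hypothetical.

```python
def pressure_operator_type(mach):
    """Classify (1 - M^2) d_ss + d_nn by the sign of its discriminant.

    The operator is written as a*d_ss + b*d_sn + c*d_nn, and the usual
    second-order classification uses b^2 - 4ac.
    """
    a, b, c = 1.0 - mach * mach, 0.0, 1.0
    discriminant = b * b - 4.0 * a * c
    if discriminant < 0.0:
        return "elliptic"      # subsonic: relaxation as for a Poisson problem
    if discriminant > 0.0:
        return "hyperbolic"    # supersonic: marching character in s
    return "parabolic"         # sonic

if __name__ == "__main__":
    for mach in (0.0, 0.5, 1.0, 2.0):
        print(mach, pressure_operator_type(mach))
```

At M = 0 the operator reduces to the Laplacian, which is the incompressible limit noted above.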
14.7
Conclusions
Murman and Cole introduced type-dependent differencing and relaxation methods into computational aerodynamics; practical and efficient methods for solving nonlinear flow equations were the result. In subsequent years, the emphasis has shifted toward iterative methods based on the unsteady equations. In this paper, it has been shown that great improvements in the efficiency of flow solvers can be achieved by changing the point of view from the unsteady to the steady equations. As Murman and Cole introduced type-dependent differencing, so the current method relies on a discretization which distinguishes between the elliptic and hyperbolic parts of the system. As Murman and Cole used relaxation to solve the steady equations, so the current method applies relaxation with multigrid to the steady equations. This approach yields textbook multigrid efficiency for the steady Euler equations. It is a particularly simple approach; conventional finite-difference or finite-volume discretizations of the governing equations may be used, allowing flexibility in the choice of the underlying numerical method. Unlike time-marching approaches, but like potential flow methods, the convergence rate of the method does not degrade for low-speed flows, and the correct incompressible limit is recovered. Finally, this method can be applied to incompressible, viscous flow following the ideas of Sidilkover and Ascher [13].

There remains a great deal of work to be done before the full Reynolds-averaged Navier-Stokes equations can be solved as efficiently as the simple problems shown here. The present work addresses only one particular, but nevertheless important, aspect of the problem, namely the appropriate discretization of the governing equations. If textbook multigrid efficiency is to be achieved for compressible, viscous flow, it will likely require an approach along the general outlines presented here.
Acknowledgments
The work presented in this paper would not have started without the advocacy and encouragement of Jerry South. The authors thank him for his interest, enthusiasm, and encouragement during the course of this research.
REFERENCES
1. Anderson, W. K., Thomas, J. L., and Whitfield, D. L., "Three-Dimensional Multigrid Algorithms for the Flux-Split Euler Equations," NASA Technical Paper 2829, 1988.
2. Brandt, A., "Multigrid Techniques: 1984 Guide with Applications to Fluid Dynamics," GMD-Studie 85, GMD-FIT, 1985.
3. Brandt, A., and Yavneh, I., "Accelerated Multigrid Convergence and High-Reynolds Recirculating Flows," SIAM J. Sci. Statist. Comput., vol. 14, no. 3, pp. 607-626, 1993.
4. Caughey, D. A., "The Computation of Transonic Potential Flows," Ann. Rev. Fluid Mech., vol. 14, pp. 261-283, 1982.
5. Giles, M., Anderson, W. K., and Roberts, T. W., "Upwind Control Volumes: A New Upwind Approach," AIAA Paper 90-0104, 1990.
6. Jameson, A., "Iterative Solution of Transonic Flows over Airfoils and Wings, Including Flows at Mach 1," Comm. Pure Appl. Math., vol. 27, pp. 283-309, 1974.
7. Jameson, A., "Solution of the Euler Equations for Two Dimensional Transonic Flow by a Multigrid Method," Appl. Math. Comput., vol. 13, nos. 3 and 4, pp. 327-355, 1983.
8. Mulder, W., "Multigrid Relaxation for the Euler Equations," J. Comput. Phys., vol. 60, no. 2, pp. 235-252, 1985.
9. Murman, E. M., Cole, J. D., "Calculation of Plane, Steady Transonic Flows," AIAA J., vol. 9, no. 1, pp. 114-121, 1971.
10. Pierce, N. A., Giles, M., Jameson, A., Martinelli, L., "Accelerating Three Dimensional Navier-Stokes Calculations," AIAA Paper 97-1953, 1997.
11. Roberts, T. W., Sidilkover, D., Swanson, R. C., "Textbook Multigrid Efficiency for the Steady Euler Equations," AIAA Paper 97-1949, 1997.
12. Sidilkover, D., "A Genuinely Multidimensional Upwind Scheme and Efficient Multigrid Solver for the Compressible Euler Equations," ICASE Report 94-84, 1994.
13. Sidilkover, D., and Ascher, U. M., "A Multigrid Solver for the Steady State Navier-Stokes Equations using the Pressure-Poisson Formulation," Comp. Appl. Math., vol. 14, no. 1, pp. 21-35, 1995.
14. Ta'asan, S., "Canonical Forms of Multidimensional Steady Inviscid Flow," ICASE Report 93-34, 1993.
15. Ta'asan, S., "Canonical-Variables Multigrid Method for Steady-State Euler Equations," ICASE Report 94-14, 1994.
16. Vatsa, V., Wedan, B. W., "Development of a Multigrid Code for 3-D Navier-Stokes Equations and its Application to a Grid-Refinement Study," Computers & Fluids, vol. 18, no. 4, pp. 391-403, 1990.
17. Warren, G. P., and Roberts, T. W., "Multigrid Properties of Upwind-Biased Data Reconstructions," Sixth Copper Mountain Conference on Multigrid Methods, NASA Conference Publication 3224, Part 2, 1993.
15
Aerospace Engineering Simulations on Parallel Computers
K. Morgan,¹ N. P. Weatherill,¹ O. Hassan,¹ P. J. Brookes,¹ M. T. Manzari¹ & R. Said¹
15.1
Introduction
Aerospace companies are employing unstructured tetrahedral mesh based solution techniques for the simulation of steady compressible inviscid flows over complex geometries [4]. Implementations of the approach, on a wide variety of computer platforms, have demonstrated that acceptable accuracy can often be achieved, for this class of problems, by employing meshes consisting of the order of a few million elements [18]. However, the computer memory requirements of the approach are now preventing its extension to simulations which require the use of significantly larger meshes. This is particularly apparent when problems involving turbulent flows, or even transient inviscid flows in the presence of moving boundaries, are considered. The situation is even worse when electromagnetic scattering by aerospace vehicles at realistic wave frequencies is simulated, as the available numerical algorithms will demand the use of meshes consisting of the order of hundreds of millions of elements.

With this background, and given the current developments in computer technology, we have directed research into techniques which are based upon the use of parallel processing for the solution of computationally large problems. However, for such an approach to be successful in the current context, it will require not only a parallel implementation of the basic equation solver which is being employed but also a new approach to the pre-processing stages of mesh generation and domain decomposition. In this chapter, we will consider the parallelisation of a basic unstructured mesh solver for aerospace engineering applications and outline the approaches we are following to enable us to address the computer memory demands of the pre-processing stage. The additional difficulty associated with visualising the computed solution is being addressed in related work [6] and is not considered here.

¹ Department of Civil Engineering, University of Wales, Swansea SA2 8PP, UK.
Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific
15.2
Solution Algorithm
The solution of systems of partial differential equations, expressed in the form

$$
\frac{\partial U}{\partial t} + \frac{\partial F^j}{\partial x_j} = \frac{\partial G^j}{\partial x_j} + S \qquad (15.1)
$$

is considered, where j takes the values 1, 2, 3 in the implied summation. The solution is sought in a closed spatial domain, subject to appropriate initial and boundary conditions. With the correct definition of $F^j$, $G^j$ and S, this general form allows for the modelling of inviscid flows [12], turbulent flows with the use of an appropriate turbulence model [8] and electromagnetic scattering [10]. The starting point for the development of a solution approach based upon the use of unstructured meshes is the replacement of the classical statement of the problem by a weak variational formulation. The spatial solution domain is discretised into a general assembly of tetrahedral elements, using a Delaunay procedure with automatic point creation [17]. This basic discretisation procedure is modified when meshes appropriate for turbulent viscous flow [3] or electromagnetic scattering [11] are required. A piecewise linear variation of the approximate solution in space is assumed, and the use of a Galerkin approximate variational formulation then leads to the semi-discrete equation

$$
M_{IJ} \frac{dU_J}{dt} = M_{IJ} S_J + R_I \qquad (15.2)
$$

at each interior node I, where the implied summation extends over all nodes J in the mesh. When the mesh is represented in terms of an edge based data structure [13], the components of the right hand side vector R are evaluated, by summing individual edge contributions, according to

$$
R_I = \sum_{e} C^j_{I I_e} \left[ \left( F^j_I + F^j_{I_e} \right) - \left( G^j_I + G^j_{I_e} \right) \right] \qquad (15.3)
$$
where node I is assumed to be directly connected by edge e to node $I_e$. In this equation, the weight, $C^j_{I I_e}$, associated with the edge e and the direction $x_j$, depends upon the local mesh geometry [13]. These weights are precomputed and then stored. Stabilisation and discontinuity capturing are achieved by replacing, on each edge, the actual convective flux function by an appropriate numerical flux function [5]. Finite difference procedures are employed to discretise the time dimension. For present purposes, the solution is advanced by using either Euler or multi-stage time stepping. For steady state simulations, the resulting equation system is solved by lumping [20] the mass matrix M in equation (15.2), while explicit iteration is used for truly transient problems [1]. For transient simulations involving moving boundaries, a space/time variational formulation is adopted [9].

Table 1 Performance statistics (RSB). Steady inviscid flow.

NoP   MinEd    MaxEd    MinPo   MaxPo   RT(min)   RSU
4     469771   484465   66523   70130   650       1
8     230459   244827   33262   37357   345       1.88
16    114139   125040   16631   19674   159       4.09
32    55144    64926    8316    10609   92        7.07
64    26760    33475    4158    5930    44        17.77
128   12826    16980    2079    3239    27        24.07
256   6189     8762     1040    1822    14        46.42

Table 2 Performance statistics (RSB). Turbulent flow.

NoP   MinEd    MaxEd    TNIE     RT(min)   RSU
4     579277   579291   48069    2097      1
8     289623   289651   63502    1053      1.92
16    144774   144827   98312    548       3.84
32    72311    72417    157324   271       7.73

15.2.1
Parallel Implementation
An efficient parallel implementation of this solution algorithm requires an effective procedure for obtaining an initial decomposition of the mesh, to ensure a balancing of the computational load between processors, and also the optimisation of the necessary inter-processor communication.
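As a point of reference for what has to be parallelised, the edge loop behind equation (15.3) can be sketched in a few lines. This is only an illustrative sketch, not the authors' solver: the function name and data layout are hypothetical, boundary contributions and the numerical flux function are omitted, and the weights are assumed to be antisymmetric with respect to the two end nodes of an edge, so that each edge contribution is added at one node and subtracted at the other.

```python
def edge_residual(num_nodes, edges, weights, flux_f, flux_g):
    """Accumulate the right hand side by a single loop over the edges.

    edges[e]      = (I, Ie), the two nodes joined by edge e
    weights[e][j] = weight of edge e in direction x_j, as seen from node I
    flux_f[I][j], flux_g[I][j] = nodal values of the fluxes F^j and G^j
    """
    residual = [0.0] * num_nodes
    for (i, ie), c in zip(edges, weights):
        contribution = sum(
            cj * ((flux_f[i][j] + flux_f[ie][j]) - (flux_g[i][j] + flux_g[ie][j]))
            for j, cj in enumerate(c)
        )
        residual[i] += contribution
        residual[ie] -= contribution   # assumption: antisymmetric edge weights
    return residual

# One-dimensional illustration: three nodes on a line, unit spacing.
example = edge_residual(
    num_nodes=3,
    edges=[(0, 1), (1, 2)],
    weights=[[0.5], [0.5]],
    flux_f=[[1.0], [2.0], [4.0]],
    flux_g=[[0.0], [0.0], [0.0]],
)
```

In this small example the interior node receives (F_2 - F_0)/2, a central difference of the flux, and the contributions sum to zero over all nodes, which is the discrete conservation property that the edge-based form makes easy to preserve.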
Table 3 Performance statistics (RSB). Electromagnetic scattering.

NoP   MinEd    MaxEd    TNIE     RT(sec)   RSU
4     549429   569181   47577    4516      1
8     271198   294234   79792    2232      2
16    132911   148268   112764   1174      3.8
32    64814    76339    156772   613       7.4
64    32166    38441    209080   342       13.2
128   14971    20659    268356   176       25.7
256   7142     10701    344607   136       33.2
15.2.1.1
Domain decomposition
There are a number of different approaches available for decomposing, in serial, a given unstructured mesh [2]. For our initial studies, recursive spectral bisection (RSB) was the selected approach [15], as this generally produces well-balanced subdomains and low communication requirements. These properties should lead to good performance of the parallel equation solver.
15.2.1.2
Data structure
The RSB procedure subdivides the global mesh into a number, NoP, of subdomains by colouring the nodes in the mesh. An edge which connects two nodes of the same colour, I, is an interior edge for subdomain I and both nodes are regarded as interior nodes for this subdomain. An edge which connects a node of colour I and a node of colour J, where I < J, will be an interface edge in subdomain I. The node of colour I and the node of colour J will be interior nodes for the subdomains I and J respectively. In this case, the node of colour J is duplicated as an interface node in subdomain I. It should be observed that there is no duplication of edges. Local numbering of vertices, elements, edges and boundary faces is employed within each subdomain. The communication arrays, which are necessary to enable the transfer of information between the subdomains, are evaluated during the domain partitioning stage.
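The colouring rules above translate directly into a small classification routine. The following is a minimal sketch with hypothetical names; it keeps global node numbers for clarity, whereas the implementation described here renumbers vertices, elements, edges and boundary faces locally within each subdomain.

```python
def build_subdomains(colour, edges):
    """Split nodes and edges among subdomains according to the node colours.

    colour[n] is the subdomain (0, 1, ...) of global node n;
    edges is a list of (a, b) pairs of global node numbers.
    """
    num_parts = max(colour) + 1
    parts = [
        {"interior_nodes": set(), "interface_nodes": set(),
         "interior_edges": [], "interface_edges": []}
        for _ in range(num_parts)
    ]
    for node, c in enumerate(colour):
        parts[c]["interior_nodes"].add(node)
    for a, b in edges:
        ca, cb = colour[a], colour[b]
        if ca == cb:
            parts[ca]["interior_edges"].append((a, b))
        else:
            # The edge belongs to the lower-coloured subdomain only, and the
            # node of the higher colour is duplicated there as an interface node.
            if ca > cb:
                a, b, ca, cb = b, a, cb, ca
            parts[ca]["interface_edges"].append((a, b))
            parts[ca]["interface_nodes"].add(b)
    return parts

# Four nodes on a ring, two colours.
parts = build_subdomains(colour=[0, 0, 1, 1],
                         edges=[(0, 1), (1, 2), (2, 3), (3, 0)])
```

Because every edge is stored exactly once, the total number of interior and interface edges over all subdomains equals the number of edges in the global mesh.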
Figure 1 Variation of the memory requirements with the size of the mesh for the mesh generator (solid line), RSB (dashed line) and the equation solver (dotted line).

15.2.1.3
Solver parallelisation
The parallel implementation of the solution algorithm uses standard PVM or MPI routines for message passing and employs a single program multiple data model. At the start of a time step, the interface nodes obtain contributions from the interface edges. These partially updated interface nodal contributions are then broadcast to the corresponding interior nodes in the neighbouring subdomains. A loop over the interior edges is followed by the receiving of the interface node contributions and the subsequent updating of all interior nodal values. The sending of the updated values back to the interface nodes completes a time step of the procedure. The procedure is implemented in such a way that it attempts to allow computation and communication to take place concurrently.
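The ordering of these steps can be checked with a serial emulation. The sketch below is an assumption-laden illustration rather than the authors' code: message passing is replaced by dictionaries acting as outboxes, the edge contribution is a hypothetical antisymmetric stand-in for equation (15.3), and the final send of updated values back to the interface nodes is implicit because a single copy of the solution array is shared.

```python
def edge_contribution(u, a, b):
    """Hypothetical edge flux: added at node a, subtracted at node b."""
    return 0.5 * (u[a] + u[b])

def serial_residual(u, edges):
    r = [0.0] * len(u)
    for a, b in edges:
        f = edge_contribution(u, a, b)
        r[a] += f
        r[b] -= f
    return r

def partitioned_residual(u, edges, colour):
    num_parts = max(colour) + 1
    r = [0.0] * len(u)                           # accumulators at interior (owned) nodes
    outbox = [dict() for _ in range(num_parts)]  # partial sums held at interface nodes
    # Step 1: interface edges, processed by the lower-coloured subdomain.
    for a, b in edges:
        if colour[a] != colour[b]:
            lo, hi = (a, b) if colour[a] < colour[b] else (b, a)
            f = edge_contribution(u, a, b)
            sign_lo = 1.0 if lo == a else -1.0
            r[lo] += sign_lo * f
            box = outbox[colour[lo]]
            box[hi] = box.get(hi, 0.0) - sign_lo * f   # to be sent to the owner of hi
    # Step 2: interior edges (in a real code this overlaps with the transfer).
    for a, b in edges:
        if colour[a] == colour[b]:
            f = edge_contribution(u, a, b)
            r[a] += f
            r[b] -= f
    # Step 3: owners receive and add the interface node contributions.
    for box in outbox:
        for node, partial in box.items():
            r[node] += partial
    return r

u = [1.0, 2.0, 4.0, 8.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
colour = [0, 0, 1, 1]
```

For this small ring mesh the partitioned loop reproduces the serial result exactly; in a real code the interior-edge loop of step 2 is the work that can hide the latency of the transfer started in step 1.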
15.2.2
Examples
The parallel performance of the equation solvers is demonstrated by considering examples from different application areas. The solvers have been fully validated previously [8,10,12], so that validation is not considered further in this chapter.
15.2.2.1
Steady inviscid flow
The performance characteristics of a parallel version of the steady inviscid flow solver are demonstrated for a mesh consisting of 1 612 174 elements, 266 092 nodes and 1 912 170 edges. Table 1 illustrates the load balancing capabilities of the RSB method and also contains information on the performance of the resulting implementation on a CRAY T3D. Here MinEd, MaxEd, MinPo and MaxPo denote, respectively, the minimum number of edges, the maximum number of edges, the minimum number of points and the maximum number of points in any subdomain, RT is the run time required to compute 1000 time steps and RSU is the relative speed-up obtained compared to the time required when using four subdomains. For a mesh of this size, it is observed that a parallel efficiency of 72.5% is achieved using 256 processors.

To illustrate the performance which can be obtained when simulating turbulent flow with a k-ω turbulence model [19], a mesh of 1 966 731 elements, 331 685 nodes and 2 317 145 edges is generated. The grid partitioning statistics, together with information on the resulting performance of the parallel flow solver on the CRAY T3D, are shown in Table 2. Here TNIE denotes the total number of interface edges. The parallel performance is generally better in this case, due to the increased amount of computation being performed by the flow solver.
15.2.2.2
Electromagnetic scattering
The performance characteristics of the parallel electromagnetic scattering simulation capability are demonstrated for a mesh consisting of 1 897 844 elements, 310 813 nodes and 2 242 682 edges. The grid partitioning statistics achieved by the use of RSB, together with information on the resulting performance of the parallel solver on the CRAY T3D, are displayed in Table 3. Note that RT now denotes the run time required to compute 500 time steps. The parallel performance is generally not so good in this case, due to the simplified form which is adopted for the numerical flux function [10] and the consequent reduction in the amount of the computation being performed by the solver.
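The relative speed-up and parallel efficiency figures can be recomputed from the tabulated run times. The sketch below assumes that RSU is the four-processor run time divided by the run time on P processors, and that efficiency is RSU divided by P/4; the function names are hypothetical and the run times are copied from Tables 1 and 3.

```python
def relative_speedup(run_times):
    """Speed-up relative to the first (smallest) processor count."""
    base = run_times[0]
    return [base / t for t in run_times]

def parallel_efficiency(processors, run_times):
    """Speed-up divided by the increase in processor count."""
    p0 = processors[0]
    speedups = relative_speedup(run_times)
    return [s / (p / p0) for p, s in zip(processors, speedups)]

processors = [4, 8, 16, 32, 64, 128, 256]
rt_inviscid_min = [650, 345, 159, 92, 44, 27, 14]             # Table 1, RT(min)
rt_scattering_sec = [4516, 2232, 1174, 613, 342, 176, 136]    # Table 3, RT(sec)

eff_inviscid = parallel_efficiency(processors, rt_inviscid_min)
eff_scattering = parallel_efficiency(processors, rt_scattering_sec)
```

With the Table 1 run times the 256-processor efficiency comes out at about 0.725, matching the 72.5% quoted above, while the scattering case gives roughly 0.52. Evaluated in the same way, the 64-processor entry of Table 1 comes out near 14.8 rather than the tabulated 17.77, so that entry may have been derived from unrounded timings or may contain a misprint.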
15.3
Memory Requirements
The analyst, who is using the unstructured mesh approach to undertake the simulation of realistic problems in aerospace engineering, quickly encounters the constraints placed by the available computer resources on the size of the mesh which can be employed. On a machine such as the CRAY T3D, the implementation of the equation solver, which has been outlined above, is such that approximately 400 000 elements can be accommodated on a single processor. This means that meshes of around 200 million elements can be handled if 512 processors can be simultaneously accessed. However, the mesh generation and the domain decomposition processes require significant amounts of memory and are normally performed in serial, which means that the meshes which can be employed in practice are significantly smaller. This is illustrated in Figure 1, which shows how the memory requirements for mesh generation and domain decomposition vary with the mesh size on a typical serial computer. For interest, though not of direct relevance in this case, the serial memory requirements of the flow solver are also displayed. The significance of this figure is best demonstrated by restricting the discussion to a particular computer platform and here we choose a CRAY YMP/EL, which has around 200 MWords of available memory. From the figure, it is clear that, on this machine, the largest mesh which can be generated and decomposed will consist of around 8 million elements. Hence, for simulations involving larger meshes, alternative domain decomposition and mesh generation strategies must be employed.

Figure 2 Variation of the memory requirements with the size of the mesh for domain decomposition by RSB (dashed line) and by RBM (solid line).

15.3.1
Alternative Domain Decomposition Strategies
It is apparent from Figure 1 that the domain decomposition procedure places the most demands on the available memory. An obvious first approach is, therefore, to investigate the memory requirements, and subsequent performance, of other domain decomposition strategies. As a representative alternative, the use of a simple direct partitioning algorithm, based upon recursive bandwidth minimisation (RBM), has also been considered [2]. The reduced memory demands of this approach are apparent in Figure 2, which compares the memory requirements of RSB with those of RBM for meshes of different sizes. The figure shows that the RBM approach can be used to decompose meshes consisting of up to 16 million elements on the CRAY YMP/EL. To illustrate the parallel performance which can be achieved, the mesh employed for the electromagnetic scattering simulation detailed in Table 3 is now decomposed by RBM and the corresponding results obtained are displayed in Table 4. For this problem, the parallel performance produced following the use of these two domain decomposition methods is seen to be similar. Other methods of domain decomposition have not been investigated, as it is apparent that each will have its own mesh size limit, so that no approach is likely to allow for the direct decomposition of the very large meshes which are of interest here.

Table 4 Performance statistics (RBM). Electromagnetic scattering.

NoP   MinEd    MaxEd    TNIE     RT(sec)   RSU
4     550181   571164   57215    4610      1
8     274369   291416   92073    2325      2
16    134371   147006   133994   1070      4.3
32    64296    74412    173396   578       8
64    31607    37956    224545   332       13.9
128   15176    19607    286736   191       24.1
256   7220     10154    361971   145       31.8
15.3.2
Alternative Mesh Generation Strategies
Three approaches to providing a capability for producing larger meshes have been investigated. The first two approaches involve the application of the standard h-refinement technique to a generated mesh [7]. With h-refinement, a new node is added to each edge in the mesh and the existing tetrahedra are then sub-divided and new elements are formed by appropriate reconnection of nodal points. A disadvantage of this method is that the user forfeits the ability to determine the location of the available nodes on the finest mesh, while an additional complication is that, when boundary edges are considered, the added nodes need to be located on the boundary surface. The third approach is a new method of addressing the problem, in which a parallel approach to mesh generation is adopted.
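One common way to realise this kind of refinement is to split each tetrahedron into eight, using the six edge midpoints. The sketch below illustrates that splitting for a single element; it is an assumption about the form of the subdivision, not a description of the authors' code, and it ignores both the choice among the three possible interior diagonals and the projection of new boundary nodes onto the true surface.

```python
from itertools import combinations

def refine_tetrahedron(points):
    """Split one tetrahedron (four 3D points) into eight children."""
    def mid(i, j):
        return tuple((a + b) / 2.0 for a, b in zip(points[i], points[j]))

    p0, p1, p2, p3 = points
    m = {pair: mid(*pair) for pair in combinations(range(4), 2)}
    m01, m02, m03, m12, m13, m23 = (m[k] for k in sorted(m))
    # Four children cut off the corners of the parent element.
    corner = [(p0, m01, m02, m03), (p1, m01, m12, m13),
              (p2, m02, m12, m23), (p3, m03, m13, m23)]
    # The remaining octahedron is split around the diagonal m02-m13
    # (one of three possible choices).
    inner = [(m02, m13, m01, m12), (m02, m13, m12, m23),
             (m02, m13, m23, m03), (m02, m13, m03, m01)]
    return corner + inner

def volume(tet):
    """Unsigned volume of a tetrahedron from a scalar triple product."""
    a, b, c, d = tet
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0

parent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
children = refine_tetrahedron(parent)
```

The eight children of the unit tetrahedron each have one eighth of the parent volume, which gives a quick sanity check on the connectivity.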
Figure 3 Scattering of a plane wave of wavelength λ by a perfectly conducting aircraft of length 18λ: computed contours of the $x_1$ component of the scattered electric field.

15.3.2.1
First Approach
The first method, which allows simulations to be performed on larger meshes, is to produce the mesh by h-refinement of a generated mesh and to use RBM as the domain decomposition strategy. The size of the mesh which can be employed will then be limited by the memory demands of the RBM procedure. As an illustration of the type of problem which can be reasonably modelled by following this approach, the scattering of an electromagnetic wave by a perfect conductor is considered. The scatterer is taken to be a complete aircraft which is of length 18λ, where λ is the wavelength of the incident wave. Following the h-refinement of a directly generated mesh for this configuration, the mesh employed consists of 15 182 752 elements, 2 553 495 nodes and 17 872 347 edges. The solution is output after 36 cycles of the incident wave, with the computation requiring approximately 3 hours on a CRAY T3D using 256 processors. The computed distribution of the contours of the $x_1$ component of the scattered electric field on the aircraft surface is shown in Figure 3. It should be noted that the concerns which are frequently expressed about the quality of meshes which are produced by h-refinement do not appear to be valid in the present context, as mesh quality enhancement techniques are always employed [10]. This is demonstrated in Figure 4, in which the distributions of the element dihedral angle for a directly generated mesh of 1.8 million elements and an h-refined mesh of 2.1 million elements are seen to be fairly similar.
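The quoted mesh sizes are consistent with one uniform level of refinement of the 1 897 844-element mesh used for Table 3. That link is an inference from the numbers rather than something stated explicitly, and the check below assumes that every tetrahedron is split into eight and that exactly one new node is created per edge; the function name is hypothetical.

```python
def refined_counts(elements, nodes, edges):
    """Element and node counts after one uniform level of h-refinement.

    Assumes each tetrahedron becomes eight and each edge gains one midpoint node.
    """
    return 8 * elements, nodes + edges

coarse = {"elements": 1897844, "nodes": 310813, "edges": 2242682}
fine_elements, fine_nodes = refined_counts(**coarse)
```

Under these assumptions the refined mesh has 15 182 752 elements and 2 553 495 nodes, which are the figures given for the scattering computation.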
Figure 4 Comparison of the distribution of the variation of the element dihedral angle on a directly generated and on an h-refined mesh of approximately equal size. (Horizontal axis: angle, in degrees.)
15.3.2.2
Second Approach
Another approach, which enables simulations to be performed on large meshes, is to work initially with a mesh which is of a size which can be generated and decomposed within the limits imposed by the available computational resource. This mesh is decomposed into the same number of regions as there are processors available to perform the equation solution. Then a mesh of the desired size is produced by h-refinement of each of the decomposed regions separately. In this way, no further domain decomposition is required and the associated memory constraints are, therefore, removed. Theoretically, subject to the drawbacks of the h-refinement technique mentioned earlier, this approach could be used to produce meshes of any desired size. As the final mesh is never brought together completely, special care is needed in the evaluation of the edge weights $C^j_{I I_e}$ of equation (15.3). Different strategies can be adopted to overcome this problem but, here, this is handled by duplicating the interface edges at subdomain boundaries, associating partial weights with each interface edge, accumulating edge contributions from each subdomain separately and then summing the appropriate subdomain contributions.

The feasibility of using this approach has been demonstrated by considering the simulation of the steady inviscid flow over an aircraft configuration. An initial mesh was subdivided into 32 subdomains using RSB. A finer mesh for the configuration, consisting of 1 809 651 elements and 327 090 nodes, was produced by using a single level of h-refinement within each subdomain. The flow conditions were defined in terms of a free stream Mach number of 0.85 and an angle of attack of two degrees. The computed distribution of the contours of pressure on the aircraft surface is shown in Figure 5.

Figure 5 Steady inviscid flow over an aircraft configuration: computed contours of pressure.

15.3.2.3
Third Approach
A general approach, which will allow the simulation of problems on arbitrarily large meshes, is one in which a parallel strategy is applied to the mesh generation stage [16]. The strategy which has been adopted employs a single program multiple data model and a manager/worker structure, in which the workers execute the subroutines of the mesh generator. Starting from the discretisation of the domain boundaries, an initial Delaunay discretisation of the domain is produced. Element agglomeration, employing a volume criterion, is then employed to produce a geometrical partitioning of the domain into a set of NoP subdomains. Each subdomain can then be meshed independently using the serial version of the meshing algorithm. The manager distributes the subdomain mesh data to the workers using the PVM or MPI message passing libraries. It has been demonstrated in practice that the use of dynamic load balancing, in which there are more subdomains to be meshed than there are processors available, leads to an efficient computational implementation. It should be noted that a by-product of the approach is that arbitrarily large meshes can be generated on any computer platform, even a platform consisting of just one processor. This is because the initial problem can be reduced to a set of meshing problems, which can each be handled by any given processor.

The potential of this approach is illustrated by demonstrating the results produced when discretising a region lying between two hemispheres [14]. The triangulation of the boundary of the region consists of 279 256 triangles and 139 630 points. The Delaunay triangulation of these points produced an initial volume grid which contained 425 092 tetrahedra and this is subdivided into 16 subdomains. Using 8 processors of a parallel computer, each subdomain is discretised using the standard Delaunay method. The final grid which is produced has 17 123 533 tetrahedra and Table 5 gives information on the size of the subdomain grids. Starting from a finer initial discretisation of the boundary, a second grid, containing 49 168 881 tetrahedra, has also been produced by the same process.

Table 5 Parallel 3D grid generation with 16 subdomains.

Subdomain   Elements   Points   Faces
1           1243837    200733   34216
2           1065724    174316   37724
3           1237778    201448   40230
4           1225162    198944   38240
5           1311018    212815   40342
6           1098964    180691   42812
7           983377     163178   43730
8           1180311    192392   39774
9           977565     161659   41476
10          1073487    176100   40280
11          921642     154756   47868
12          1198717    195286   39768
13          1304066    212249   42522
14          1200735    199436   54266
15          612166     106012   43986
16          488984     93081    66130

Figure 6 Comparison of the distribution of the variation of the element dihedral angle on a serial generated, a parallel generated and an h-refined mesh of approximately equal size. (Horizontal axis: angle, in degrees.)

It is apparent from Table 5 that equalisation of the projected computational load will be required before these decompositions can be used in practice. Methods of achieving this are currently under investigation, but it is hoped that the situation will be improved by incorporating better methods for constructing the initial decomposition of the domain and by using load balancing techniques to post-process the generated sub-domains. The fact that the mesh is generated in parallel is not expected to have any major detrimental effect on the mesh quality. This is confirmed in Figure 6, which displays a comparison of the distribution of the variation of the element dihedral angle on a serial generated, a parallel generated and an h-refined mesh of approximately equal size.
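The degree of imbalance in Table 5, and the effect of handing out more subdomains than there are processors, can be quantified with a short script. This is a minimal sketch with hypothetical function names: the element counts are copied from Table 5, the cost of meshing a subdomain is assumed to be proportional to its final element count, and the manager/worker scheme is emulated by always giving the next subdomain to the worker that becomes free first.

```python
import heapq

elements = [1243837, 1065724, 1237778, 1225162, 1311018, 1098964, 983377, 1180311,
            977565, 1073487, 921642, 1198717, 1304066, 1200735, 612166, 488984]

def imbalance(loads):
    """Ratio of the largest load to the mean load (1.0 is perfectly balanced)."""
    return max(loads) / (sum(loads) / len(loads))

def dynamic_schedule(costs, workers):
    """Emulate a manager giving each free worker the next subdomain in the queue."""
    heap = [(0, w) for w in range(workers)]      # (time at which worker is free, id)
    assignment = [[] for _ in range(workers)]
    for task_id, cost in enumerate(costs):
        busy_until, w = heapq.heappop(heap)
        assignment[w].append(task_id)
        heapq.heappush(heap, (busy_until + cost, w))
    makespan = max(t for t, _ in heap)
    return assignment, makespan

total_elements = sum(elements)
table5_imbalance = imbalance(elements)
assignment, makespan = dynamic_schedule(elements, workers=8)
```

The element counts sum to the 17 123 533 tetrahedra quoted for the final grid, and the largest subdomain carries a little over 20% more elements than the average, which is the imbalance that would have to be removed before these subdomains could be used directly by the equation solver.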
15.4
Conclusion
Explicit procedures for the solution of equation systems in conservation law form, on unstructured tetrahedral meshes, have been parallelised and applied to the simulation of a number of different problem areas which are of interest in aerospace engineering. Methods which will allow the application of the approach to simulations involving very large meshes have been proposed and implemented and outstanding difficulties have been identified. For certain classes of problems, further developments in solution algorithm technology will be required before the approach can be widely employed in the design environment.
Acknowledgements
The authors wish to thank the UK Engineering and Physical Sciences Research Council for providing access to the CRAY T3D at the Edinburgh Parallel Computer Centre, under Research Grant GR/K42264, and for supporting the parallelisation activity under Research Grants GR/J12321, GR/J91234 and GR/L18860. The authors also acknowledge the support provided for viscous flow modelling by British Aerospace, Airbus and Military Aircraft, and DERA, Farnborough. R. Said would like to acknowledge the partial support provided by the K. R. S Foundation.
REFERENCES
1. Donea, J., Giuliani, S., Laval, H. & Quartapelle, L., Time-accurate solution of advection-diffusion problems by finite elements, Computer Methods in Applied Mechanics and Engineering 45, 123-146, 1984.
2. Greenhough, C. & Fowler, R. F., Partitioning methods for unstructured finite element meshes, Report RAL-94-092, Rutherford Appleton Laboratory, Didcot, 1994.
3. Hassan, O., Morgan, K., Probert, E. J. & Peraire, J., Unstructured tetrahedral mesh generation for three dimensional viscous flows, International Journal for Numerical Methods in Engineering 39, 549-567, 1996.
4. Hills, D. P., Numerical aerodynamics: past successes and future challenges from an industrial point of view, in J.-A. Desideri et al, editors, Computational Methods in Applied Sciences '96 - Invited Lectures and Special Technological Sessions, John Wiley & Sons, Chichester, 166-173, 1996.
5. Hirsch, C., Numerical Computation of Internal and External Flows, Volume 2, Wiley-Interscience, Chichester, 1990.
6. Jones, J. W. & Weatherill, N. P., ViPar: Parallel visualisation of large data sets, submitted to International Journal for Numerical Methods in Fluids, 1997.
7. Lohner, R., Morgan, K. & Zienkiewicz, O. C., Adaptive grid refinement for the compressible Euler equations, in I. Babuska et al, editors, Accuracy Estimates and Adaptive Refinements in Finite Element Computations, John Wiley, Chichester, 281-297, 1986.
8. Manzari, M. T., Hassan, O., Morgan, K. & Weatherill, N. P., Turbulent flow computations on 3D unstructured grids, Finite Elements in Analysis and Design, 1997 (in press).
9. Morgan, K., Bayne, L. B., Hassan, O., Probert, E. J. & Weatherill, N. P., The simulation of 3D unsteady inviscid compressible flows with moving boundaries, in M.-O. Bristeau et al, editors, Computational Science for the 21st Century, John Wiley, Chichester, 347-356, 1997.
10. Morgan, K., Brookes, P. J., Hassan, O. & Weatherill, N. P., Parallel processing for the simulation of problems involving scattering of electromagnetic waves, Computer Methods in Applied Mechanics and Engineering, 1997 (in press).
11. Morgan, K., Hassan, O. & Peraire, J., A time domain unstructured grid approach to the simulation of electromagnetic scattering in piecewise homogeneous media, Computer Methods in Applied Mechanics and Engineering 134, 17-36, 1996.
12. Morgan, K., Peraire, J., Peiro, J. & Hassan, O., The computation of three dimensional flows using unstructured grids, Computer Methods in Applied Mechanics and Engineering 87, 335-352, 1991.
13. Peraire, J., Peiro, J. & Morgan, K., Finite element multigrid solution of Euler flows past installed aero-engines, Computational Mechanics 11, 433-451, 1993.
14. Said, R., Weatherill, N. P., Morgan, K. & Verhoeven, N. A., Distributed Delaunay mesh generation for very large meshes, submitted to Computational Mechanics, 1997.
15. Simon, H., Partitioning of unstructured problems for parallel processing, Computational Systems Engineering 2, 135-148, 1991.
16. Verhoeven, N., Weatherill, N. P. & Morgan, K., Dynamic load balancing in a 2D parallel Delaunay mesh generator, in A. Ecer et al, eds., Parallel Computational Fluid Dynamics: Implementations and Results Using Parallel Computers, Elsevier Science, Amsterdam, 641-648, 1996.
17. Weatherill, N. P. & Hassan, O., Efficient 3D Delaunay triangulation with automatic point creation and imposed boundary constraints, International Journal for Numerical Methods in Engineering 37, 2005-2039, 1994.
18. Weatherill, N. P., Hassan, O., Morgan, K. & Marchant, M. J., Large scale computations on unstructured grids, in F. Benkhaldoun and R. Vilsmeier, editors, Proceedings of the Conference on Finite Volumes for Complex Applications, Hermes, Paris, 77-98, 1996.
19. Wilcox, D. C., Turbulence Modelling for CFD, DCW Industries Inc., La Canada, 1993.
20. Zienkiewicz, O. C. & Morgan, K., Finite Elements and Approximation, John Wiley, New York, 1983.
16 Optimizing CFD Codes and Algorithms for use on Cray Computers
Laurence B. Wigton 1
16.1 Introduction
In this paper we will discuss a rather busy year in the life of a Boeing CFD researcher. During this year much was learned about using the new Cray T90 computer and implementing new algorithms and coding techniques to improve the performance of two of our workhorse codes, namely TRANAIR [10, 4] and TLNS3DMB [8, 9]. Extensive tests were performed on the T90 computer including running codes in CAL (Cray Assembler Language) so that we could compare performance characteristics with previous generations of Cray computers. Many of our discoveries and observations concerning T90 hardware and software were passed back to Cray Research/SGI so that they can provide us and the CFD community as a whole even better support in the future. In the last year we also significantly updated the linear algebra routines for TRANAIR design since the "design sensitivity calculations" dominated the cost of a TRANAIR design/optimization run. In particular, we added block GMRES with eigenvector deflation, improved the matrix-vector multiplies using jagged-diagonal formats and sped up the triangular matrix solves using special CAL code which took full advantage of new hardware features on the T90.

1 Boeing Commercial Airplane Group.
Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific
As for TLNS3DMB, we introduced a new algorithm to more quickly compute the distance function needed by the turbulence model. We also created an out-of-core version of TLNS3DMB which we call NSOCR to take advantage of SSD on Cray computers. NSOCR can also run in core just as fast as the original TLNS3DMB. NSOCR and the fast distance calculation routines were given back to NASA so that they too can support us better in the future.
16.2 Cray Computers Including the T90
With all the emphasis being placed on distributed-memory parallel computers it is interesting to note that traditional Cray computers are still the preferred computing platform in many commercial/engineering environments. The reasons for this are that they are reliable, can handle many large jobs running simultaneously, have good software support, and, particularly with the introduction of the T90, are actually good bang-for-the-buck computers. As compared to other computers Crays have fast CPUs, excellent memory bandwidth and large enough memory and SSD so that they can accommodate the problems at hand. Despite all the advances being made with other computing platforms, Crays are still the machines by which all others are judged.

16.2.1 Overview of Traditional Cray Computers

Model   Clock Speed   Pipes/CPU   Memspeed
T90     440 MHz       2           34 or 56
C90     240 MHz       2           23
Y/MP    160 MHz       1           17
Pipes: Each pipe has 3 ports to memory and a complete set of segmented functional units. In vector mode a pipeline is set up. After an initialization cost, during each clock period a pipe can do:

2 fetches + 1 add + 1 multiply + 1 store

In Cray language the set of operations which are done by a pipe during a clock period is referred to as a "chime".

Memspeed: The dominant part of vector initialization cost is "memspeed", the number of clock periods required to fetch the first operand from memory. This is 34 for small T90s and 56 for big T90s (which have more memory board circuitry). Note that memory speeds have not kept up with CPU speeds.
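The practical effect of memspeed on short vectors can be illustrated with a rough model. The sketch below is not from the original text: it assumes, purely for illustration, that a vector operation of length n costs memspeed clock periods of startup plus n / pipes clock periods of streaming, and it takes the clock speeds, pipe counts and memspeed values from the overview table above. The dictionary and function names are hypothetical, and real timings depend on far more than this.

```python
# Rough cost model (an assumption for illustration, not a Cray formula):
#   clocks = memspeed + n / pipes
MACHINES = {
    # name: (clock in MHz, pipes per CPU, memspeed in clock periods)
    "T90 (small)": (440.0, 2, 34),
    "T90 (big)": (440.0, 2, 56),
    "C90": (240.0, 2, 23),
    "Y/MP": (160.0, 1, 17),
}


def vector_time_ns(machine, n):
    """Estimated time in nanoseconds for a one-chime vector operation of length n."""
    mhz, pipes, memspeed = MACHINES[machine]
    clocks = memspeed + n / pipes
    return clocks * 1000.0 / mhz


# Vector lengths chosen to stay within a single vector register load.
for n in (8, 32, 128):
    row = ", ".join(f"{m}: {vector_time_ns(m, n):6.1f} ns" for m in MACHINES)
    print(f"n={n:4d}  {row}")
```

Under these assumptions a big T90 is slower than a C90 for very short vectors and only pulls ahead once the vectors are long enough to amortize the startup cost, which is the qualitative point made in the text.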
16.2.2 SSD
In addition to high speed memory, Cray computers can be supported by SSD, which acts like a very high speed disk (on our T90 the SSD transfer rate is 800 Mw/sec vs. 30 Mw/sec for striped disk). Over the years SSD has become bigger, faster and cheaper. SSD allows us to solve large problems on the Cray. However, taking full advantage of SSD requires an out-of-core code, which increases coding complexity.

16.2.3 Preliminary Conjecture on T90
One might think that the T90 is just like the C90 except that it has a clock which is 1.8 times faster. This is not quite true because, as we have seen, the T90 has a slower vector startup (governed by memspeed). In addition, the T90 has a memory bandwidth limitation, a greater dependence on instruction ordering, and greater sensitivity to so-called CMR instructions. We will discuss each of these points.

16.2.3.1 Memory Bandwidth

Model   numcpus   Pipes   Memreq   banks   bankbusy   Memavail
T94     4         8       24       64      7          9
C90     16        32      96       1024    6          170
YMP/8   8         8       24       256     5          51
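The Memreq and Memavail columns of this table follow from the other columns, and the same numbers yield a crude per-CPU speed estimate for the T90 relative to the Y/MP. The sketch below recomputes them, taking Memreq = 3 * Pipes (three memory ports per pipe) and Memavail = banks / bankbusy truncated to an integer as in the table. The function names, and the rounding of the clock ratio to 3, are choices made here for illustration only.

```python
# Table columns: name -> (numcpus, pipes, banks, bankbusy)
MACHINES = {
    "T94": (4, 8, 64, 7),
    "C90": (16, 32, 1024, 6),
    "YMP/8": (8, 8, 256, 5),
}


def memreq(pipes):
    """Words per clock period that all pipes could request (3 ports per pipe)."""
    return 3 * pipes


def memavail(banks, bankbusy):
    """Words per clock period the banks can deliver, truncated as in the table."""
    return banks // bankbusy


def bandwidth_factor(name):
    """Fraction of peak rate sustainable in memory-bound code (capped at 1)."""
    _, pipes, banks, bankbusy = MACHINES[name]
    return min(1.0, memavail(banks, bankbusy) / memreq(pipes))


for name, (_, pipes, banks, bankbusy) in MACHINES.items():
    print(name, memreq(pipes), memavail(banks, bankbusy), bandwidth_factor(name))

# Per-CPU estimate for the T90 relative to the Y/MP:
# about 3x clock times 2x pipes per CPU, scaled by the T94 bandwidth factor.
cpu_gain = 3 * 2
print(cpu_gain * bandwidth_factor("T94"))
```

Only the T94 ends up with a bandwidth factor below 1 (9/24), so it is the only machine in the table for which this simple estimate predicts a memory-bound slowdown.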
Here "Memreq" is the number of memory words that could conceivably be requested by all the pipes in a clock period (Memreq = 3 * Pipes), and "banks" is the number of memory banks. After a particular memory bank is accessed it cannot be accessed again until "bankbusy" clock periods have passed. For memory intensive operations the number of memory words that could be available during a clock period is:

Memavail = banks/bankbusy

For the C90 and Y/MP, Memavail > Memreq. For the T90 we have Memavail = 9 and Memreq = 24, so for memory intensive operations the T90 could be slowed by a factor of 9/24.

Based strictly on CPU performance one might think that the T90 should be about 6 times faster per CPU than a Y/MP. After all, the T90 has a 3 times faster clock and 2 pipes/cpu. In fact, however, memory bandwidth limitations reduce this advantage to: (9/24) * 6 = 2.25. This is close to the average performance gain observed for production engineering codes when we replaced our Y/MP with a T90. This agreement is fortuitous because average production jobs do not run in peak memory-intensive mode. However, even average performance will be impacted by having more memory bank conflicts on the T90 than on a Y/MP. Increased vector start up times and increased sensitivity to instruction ordering and to CMRs also play a role.

16.2.3.2 Instruction Ordering
When practicing writing CAL codes for the BLAS routine SAXPY, it was noticed that a simple reversal of CAL instructions had a much greater impact on the T90 than on the C90:

Megaflop rate for 2 versions of SAXPY

Machine    SAXPY1   SAXPY2
T94        639      1208
NASA C90   674      697
A note about this was sent to Cray Research in the following e-mail message.

Frank Chism: (cc: Ecale, Hilmes, Whitaker, Heroux)

It appears that tailgating is not always bad on the T90. I just discovered that a minor change to the "bad" asymc code makes it run twice as fast on the T90. In the "bad" code asymc, I used chains like multiply + add + fetch + fetch + store as in:

    V2      S7*RV0      ;SA*X
    V3      V2+FV1      ;SA*X+Y
    A0      A2+A5
    V0      ,A0,1       ;Fetch more X's look 256 ahead
    A0      A3+A5
    V1      ,A0,1       ;Fetch more Y's look 256 ahead
    A0      A3
    ,A0,1   V3          ;store back into Y

If I move the store ahead of the fetches, multiply + add + store + fetch + fetch as in:

    V2      S7*RV0      ;SA*X
    V3      V2+FV1      ;SA*X+Y
    A0      A3
    ,A0,1   V3          ;store
    A0      A2+A5
    V0      ,A0,1       ;Fetch more X's look 256 ahead
    A0      A3+A5
    V1      ,A0,1       ;Fetch more Y's look 256 ahead

Then instead of running at 639 megaflops on the T90, we run at 1208 megaflops. This is almost as good as f90 code. It appears that a store will set you free so that you can start fetching into registers which are being used in the chain. On the C90 moving the store instruction also helps a little (697 megaflops versus 674) but the effect is not nearly as dramatic as it is on the T90. Some hardware guru is going to have to explain this.

Larry Wigton

16.2.3.3 Complete Memory Reference (CMR)
When doing a fetch from memory, the Cray compilers are worried about cases where the number being fetched may have been sent to the memory location by a previously issued store command but might still be in transit on the memory network. Whenever this is a possibility, the compiler generates a Complete Memory Reference (CMR) instruction just to make sure that everything has arrived into memory in accordance with the previously issued store commands. Unfortunately the compilers have a tendency to generate unneeded CMR commands. Spurious CMR commands adversely impact the T90 even more than previous generations of the Cray computer. Consider the following code for SAXPYI:

      subroutine saxpyi_test(m,a,x,indx,y)
      real x(m),y(*)
      integer indx(m)
CDIR$ IVDEP
      do 10 i = 1, m
         k = indx(i)
         y(k) = a*x(i) + y(k)
   10 continue
      return
      end
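The IVDEP directive in this routine is a promise by the programmer that the scattered stores to y(k) never collide. The Python sketch below is an illustration only, not Cray code: the function names and the chunked gather/compute/scatter model of vector execution are assumptions. It shows why the promise matters: when indx contains no repeats the vector-style and scalar loops agree, and when it contains repeats they differ.

```python
def saxpyi_scalar(a, x, indx, y):
    """Reference semantics: one element at a time, in loop order."""
    y = list(y)
    for i, k in enumerate(indx):
        y[k] = a * x[i] + y[k]
    return y


def saxpyi_vector(a, x, indx, y, vl=128):
    """Model of vector execution: gather a chunk of y, compute, then scatter."""
    y = list(y)
    for start in range(0, len(indx), vl):
        idx = indx[start:start + vl]
        gathered = [y[k] for k in idx]                                 # gather
        results = [a * x[start + j] + g for j, g in enumerate(gathered)]
        for k, r in zip(idx, results):                                 # scatter
            y[k] = r
    return y


x = [1.0, 2.0, 3.0]
y0 = [10.0, 20.0, 30.0]

# Distinct indices: both versions agree.
print(saxpyi_scalar(2.0, x, [2, 0, 1], y0), saxpyi_vector(2.0, x, [2, 0, 1], y0))

# Repeated index 0: the vector model loses the first update to y[0].
print(saxpyi_scalar(2.0, x, [0, 0, 1], y0), saxpyi_vector(2.0, x, [0, 0, 1], y0))
```

The case the author has in mind, where the user "really means" an unbounded safe vector length, corresponds to the first call: no index is repeated anywhere in indx, so no ordering between groups of stores and later fetches needs to be enforced.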
According to Cray documentation (CF90 Commands and Directives Reference Manual SR-3901 1.0, page 37) on the C90 and T90, the directive:

CDIR$ IVDEP

is equivalent to:

CDIR$ IVDEP [SAFEVL=128]

and 128 is the largest value which can be assigned to SAFEVL. Accordingly, the cf77 compiler is worried that one group of 128 y's may occupy some of the memory locations used by a previous group of y's. This forces the compiler to issue a CMR instruction after each group is processed by vector instructions. Of course the user really means that SAFEVL=infinity, so no CMR instructions should be needed. One can take the CAL code generated by the cf77 compiler and remove the unneeded CMR instructions by hand. The resulting code is 50% faster on the T90 and about 10%-15% faster on the C90. The CMR problem also existed on early versions of f90, but appears to have been fixed in f90 version 2.0.3.

16.2.4 Why We Still Like the T90

Despite the various apparent weaknesses of the T90, it still runs the average job faster than the C90. Also, it is much cheaper than the C90. In addition, Cray is rapidly improving T90 hardware and software. In particular, Cray is adjusting the compilers as we bring various deficiencies to their attention.

16.2.5 "Killer Zero"
As demonstrated by the following program, on Cray computers (not just the T90) it is possible to have:

0 + 2 = 0

      program weird
c
c Introduce Octal Representation for Killer Zero:
c
      equivalence (ia,a)
      ia = o'0411650000000000000000'
c
      b = 2.0
      c = a + b
      print *,"a,b,c= ",a,b,c
c
c Now print everything in Octal
c
      write (*,1) a,b,c
    1 format(1x,"a,b,c=",/,(o25))
      stop
      end

Execution of program "weird" under cf77 yields:

triton<35> cf77 weird.f -o weird
triton<36> weird
 a,b,c= 0., 2., 0.
 a,b,c=
 0411650000000000000000
 0400024000000000000000
 0000000000000000000000

Execution of program "weird" under f90 yields:

triton<37> f90 weird.f -o weird
triton<38> weird
 a,b,c= 0.E+0, 2., 0.E+0
 a,b,c=
 411650000000000000000
 400024000000000000000
 0

The term "Killer Zero" was coined by Frank Chism at Cray/SGI. Note that the Killer Zero prints as 0 on Cray computers but does not act like 0 in the functional units. We first encountered the Killer Zero in the out-of-core version of TLNS3DMB. The Killer Zero is generated by the Cray memory allocation routine hpalloc. This is not a problem if the user properly initializes memory allocated by hpalloc. Our problem was traced to an error in the original version of TLNS3DMB which was fixed by Veer Vatsa in later versions of the code.

The Killer Zero is a special case of an unnormalized number. On Cray computers, if a floating point number is not identically zero, then the first bit in the mantissa must be 1, otherwise the number is unnormalized and will cause problems.
16.2.5.1 Tracing Unnormalized Numbers
Ideally, the hardware should halt execution when a user attempts to use an unnormalized number. Without this hardware change it is very hard for the user to trace down the use of unnormalized numbers (the Cray is much better at handling indefinite numbers). Unfortunately, it would be very expensive for Cray to implement the needed hardware changes. We have asked Cray Research/SGI to implement software improvements:

1. Debug mode in compiler which generates instructions to check for unnormalized numbers.
2. Changes to formatted write statements, print statements and symbolic dumps which will flag unnormalized numbers.

16.2.6 Cray is still King for Code Debugging
Despite the problem with the "Killer Zero" we still consider Cray to be the leader in debugging codes. Features Crays have which are not commonly available include the ability to preset core to indefinite and to produce a symbolic dump when an error occurs. Also, unlike the Cybers which preceded them, Cray computers halt execution as soon as a floating point exception occurs, which makes it much easier to trace down errors in the code.
16.3 TRANAIR
TRANAIR [10, 4] is Boeing's general geometry full-potential plus boundary-layer code. This code is out-of-core; it runs best on a Cray with SSD. TRANAIR uses Drela's 2D+sweep/taper integral boundary-layer code. The full-potential equations are solved on an adaptively refined octree Cartesian grid. The full-potential and boundary-layer equations are coupled together and solved using an approximate Newton iteration. This is done using GMRES with ILUT preconditioning. TRANAIR design/optimization uses a direct rather than adjoint formulation. The cost is dominated by solving the linear equations with multiple right-hand sides (MRHS) associated with linearized design sensitivities (we compute how the entire flow field changes when each design variable is adjusted). We will now describe improvements which were made to TRANAIR's method for solving MRHS problems.

16.3.1 Block GMRES
When solving linear equations with MRHS, block GMRES combines all the Krylov spaces associated with the solution of each right-hand side. This automatically increases the dimension of the subspaces over which the best solution for each equation is computed. A good discussion appears in Saad's book [6]. As compared to standard GMRES, block GMRES requires more storage for the Krylov vectors. Of course more orthogonalization (SDOT and SAXPY) operations are also required. However, block GMRES converges faster, so fewer preconditioned matrix-vector multiplies are required. Also, as discussed later, additional benefits come from processing multiple right hand sides inside inner loops including reduced memory references and SSD I/O.

16.3.2 GMRES with Deflation
Often small eigenvalues slow GMRES convergence. As GMRES iterations are performed, GMRES with deflation (DGMRES) calculates small eigenvalues and corresponding eigenvectors. These eigenvectors are added to the Krylov space in a bid to speed convergence.

16.3.3 dbgmr
Under contract we had Yousef Saad and his research assistant, Andrew Chapman, produce dbgmr. This is block GMRES with eigenvector deflation. For "tough" test problems dbgmr often produced an order of magnitude improvement over standard GMRES. The improvement is not so dramatic for "easy" problems. The ILUT preconditioning for TRANAIR is quite effective so it does tend to make TRANAIR matrices easy to solve. Nevertheless for transonic TRANAIR cases dbgmr reduces the number of iterations required to solve the problems by a factor of 2.5 (block GMRES by itself gives a factor of 1.9 improvement). In addition to introducing dbgmr, we also improved the matrix-vector multiply routines and the triangular matrix solves associated with ILUT preconditioning.

16.3.4 Sparse Matrix Vector Multiplies
The old method which was used in TRANAIR to multiply a sparse matrix by a vector is essentially the segmented sum (SEGMV) procedure developed by Blelloch et al. [2]. SEGMV was independently discovered by the TRANAIR team and works well with the compressed row format normally used to store matrices in TRANAIR.

While SEGMV is faster than a naive row-by-row implementation of the matrix vector multiply, there are certain problems with it. For one thing, CAL code is needed to capture the full benefits of SEGMV and the TRANAIR team was not using CAL for this purpose. More importantly, SEGMV overwrites the contents of the matrix which makes it ill-suited to multiple right-hand side problems.

Accordingly, we decided to use the jagged diagonal (JAD) technique which is also described in Saad's book [6]. JAD requires one to reformat the matrix which is particularly nontrivial for TRANAIR since the matrix is stored out-of-core. However this reformatting cost is easily amortized over the large number of matrix vector multiplies being performed. JAD can be efficiently implemented in FORTRAN, and, as described in the next section, it allowed us to handle MRHS inside inner loops. Overall, JAD proved to be 2.3 times faster than the old TRANAIR code based on an impaired implementation of SEGMV.

16.3.4.1 Multiple Right-Hand Sides Inside Inner Loops
Handling MRHS (multiple right hand sides) inside inner loops is discussed in an e-mail message sent to Yousef Saad:

Yousef Saad:

Let me make a few more comments about putting multiple right hand sides inside the inner loops on the Cray. We always say that the Cray is a vector machine, but it really is a pipeline machine. It is able to pipeline:

    2 fetches from memory
    1 addition
    1 multiply
    1 store to memory

After an initialization cost of starting this pipeline, it is able to perform all these operations in one clock period. Cray refers to all the operations which can be pipelined together as a chime, and they like to measure the execution cost of executing a vector loop in terms of chimes. So if we look at the loop:

      do 10 i=1,n
         k=ip(i)
         y(i)=y(i)+a(i)*x(k)
   10 continue

In one chime the Cray can fetch ip(i) and a(i). At this point it can not do any multiply or add operations, so this is all that happens during the first chime. During the second chime the Cray will fetch y(i) and x(k) (2 fetches), multiply a(i) by x(k), add to y(i) and store to y(i). During the second chime it actually does 2 fetches, a multiply, an add and a store. Bottom line is the single right hand side loop does 2 floating point operations in 2 chimes. A multiple right hand side loop:

      do 20 i=1,n
         k=ip(i)
         y(i,1)=y(i,1)+a(i)*x(k,1)
         y(i,2)=y(i,2)+a(i)*x(k,2)
         y(i,3)=y(i,3)+a(i)*x(k,3)
         y(i,4)=y(i,4)+a(i)*x(k,4)
   20 continue

We do 8 floating point operations in 5 chimes. Thus multiple right hand side loop with 4 right hand sides handled simultaneously is 1.6 times as fast as single right hand side mode. If we take SSD transfer of matrix information of "a" and "ip" into account, then we have to add 2 chimes to each of these loops (one chime to read a, one chime to read ip, we will store x and y in core). In this case do 10 loop does 2 floating point operations in 4 chimes, while the do 20 loop will do 8 floating point operations in 7 chimes. In this case multiple right hand side is more efficient by a factor of (8/7)/(2/4) = 32/14 which is why I say your idea is worth a factor of 2.

Larry Wigton

Comment: Aside from the reduction in SSD usage, in actual practice processing MRHS inside inner loops leads to a reduction in CPU time by a factor of 1.3 for this case. The simple chime argument is overly optimistic; it would be more accurate (and the computer would be faster) if the Cray had more vector registers on each pipe. The FORTRAN version of JAD is inherently about 1.8 times faster than the FORTRAN implementation of SEGMV. Thus for MRHS the new TRANAIR matrix vector multiply is faster by a factor of 1.3 * 1.8 = 2.3.

16.3.4.2 "PreferVector" Directive
We would like to code multiple right hand side loops like:

      do 30 irhs = 1,nrhs
         do 20 i = 1,n
            k = ip(i)
            y(i,irhs) = y(i,irhs) + a(i) * x(k,irhs)
   20    continue
   30 continue
But this does not work well. According to Cray Research we should be able to code multiple right hand side loops like this:

CDIR$ PREFERVECTOR
      do 20 i = 1,n
         k = ip(i)
         fac = a(i)
         do 30 irhs = 1,nrhs
            y(i,irhs) = y(i,irhs) + fac * x(k,irhs)
   30    continue
   20 continue

But so far this has not worked well either. The best performance has been achieved by unrolling loops by hand as described in the e-mail message.

16.3.5 Triangular Matrix Solves
The triangular matrix solves, also known as forward/backward substitutions, were the most expensive part of TRANAIR. Dramatic improvements were made by introducing a column format rather than a row format (on Cray computers a short SAXPYI is faster than a short SDOTI), processing MRHS inside inner loops, removing spurious CMR instructions generated by the FORTRAN compiler and finally by writing a CAL code which took advantage of the new double gather hardware instruction on the T90. On the T90 the triangular matrix solve code is 4.2 times faster than the old code. Interestingly, the triangular matrix solves are 3 times faster on the T90 than on the C90. This represents a significant example of where the T90 is 3 times faster than the C90 even though the clock is only 1.8 times as fast.

16.3.6 Profile Results for TRANAIR Design
Cray profile was run on a T94 for the old and new TRANAIR codes with 61 design variables. The new code solves about 10 right-hand sides at a time.
                         Old Code      New Code
Triangular Solves        8,748,100     844,060
Matrix-Vector Multiply   2,803,201     481,792
SAXPY+SDOT               187,156       643,813
Total                    11,738,457    1,969,655
Comments:
• Profile sample rate is 512 microseconds.
• dbgmr reduces iterations (and thus number of triangular solves and matrix-vector multiplies) by a factor of 2.5.
• Triangular solves are now 4.2 times faster.
• Matrix-vector multiplies are 2.3 times faster.
• Bigger Krylov spaces induce more SDOT and SAXPY activity.
• Overall improvement is a factor of 6 which is a big win for us.
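The factors quoted in these comments can be checked against the profile table with a few lines of arithmetic. The sketch below uses only the sample counts from the table; converting counts to seconds by multiplying by the 512 microsecond sample interval is an assumption about how the profiler reports, and the variable names are illustrative. Note that the new-code rows as printed sum to 1,969,665, ten samples more than the printed total of 1,969,655; the sketch uses the row sum.

```python
SAMPLE_SECONDS = 512e-6  # sample interval quoted in the comments

# Profile sample counts: routine -> (old code, new code)
PROFILE = {
    "Triangular Solves": (8_748_100, 844_060),
    "Matrix-Vector Multiply": (2_803_201, 481_792),
    "SAXPY+SDOT": (187_156, 643_813),
}


def speedup(routine):
    """Ratio of old to new sample counts for one routine."""
    old, new = PROFILE[routine]
    return old / new


old_total = sum(old for old, _ in PROFILE.values())
new_total = sum(new for _, new in PROFILE.values())

for name in PROFILE:
    print(f"{name:24s} {speedup(name):6.2f}x")
print(f"{'Total':24s} {old_total / new_total:6.2f}x")
print(f"old: {old_total * SAMPLE_SECONDS:.0f} s, new: {new_total * SAMPLE_SECONDS:.0f} s")

# Products of the quoted factors: 2.5x fewer iterations times the per-call gains.
print(f"expected triangular: {2.5 * 4.2:.2f}x, expected matvec: {2.5 * 2.3:.2f}x")
```

The measured ratios for the triangular solves and matrix-vector multiplies come out close to the products of the quoted factors, and the SAXPY+SDOT ratio is below 1, consistent with the remark that bigger Krylov spaces induce more orthogonalization work.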
16.4 TLNS3DMB
TLNS3DMB [8, 9] is a 3D block-structured matched-grid Navier-Stokes code developed at NASA Langley. It employs Jameson-type technology including multigrid, Runge-Kutta time marching, and implicit residual smoothing. As much as possible, TLNS3DMB advances the solution as though the grid is in a single block. All blocks are divided into the same number of multigrid levels. Multigrid is done seamlessly across interface boundaries. Each stage of Runge-Kutta time marching visits every grid point in all the blocks. Only the implicit residual smoothing is done block-by-block.

16.4.1 Out-of-Core TLNS3DMB Code (NSOCR)
In order to take full advantage of SSD, an out-of-core version of TLNS3DMB which we call NSOCR was written. Much of the conversion work was accomplished by using automated tools which were written in FORTRAN. However, the algorithm for passing ghost-cell information between blocks was changed (by hand) to allow a more efficient out-of-core implementation. Test cases were run to verify that NSOCR produces results which are binary identical with those produced by TLNS3DMB. NSOCR places arrays on a data base. Small arrays are kept in core while large arrays are moved in and out of SSD as needed. The user has control over a parameter called "incore_size" which determines how big an array has to be before it is moved to SSD. When incore_size is set to a very large number, bigger than any of the array dimensions, the code operates in a pure in-core mode. In this case, NSOCR is just as fast as TLNS3DMB. When incore_size is set to a smaller value, say 10000, NSOCR operates in a mostly out-of-core mode. In this case, the time required to access SSD is about 23 percent of the time required by the CPU to do the calculations, which is considered to be very acceptable overhead. Since NSOCR operates well in both in-core and out-of-core modes, we plan to make this our only version of TLNS3DMB. NSOCR was given back to NASA.
Figure 1 Example of Boxes used in Distance Calculation
16.4.2 Distance Calculation
Recently developed turbulence models, such as Baldwin-Barth [1] and Spalart-Allmaras [7], require the user to compute the distance from each point in the field grid to the configuration under consideration. For calculations involving millions of grid points, naive methods for computing the distance function can easily consume hours of CPU time even on Cray C90 class computers. These distance calculations are so expensive that some code developers have chosen to avoid performing a proper distance calculation thus imperiling the accuracy of their codes. In this section, we wish to discuss an efficient method for computing the distance function.

Naive Algorithm: For each of the NF field grid points the naive algorithm simply calculates the distance to each of the NS surface points and selects the minimum of these distances. Cost: NF * NS

Faster Algorithm: Construct roughly sqrt(NS) boxes each containing roughly sqrt(NS) surface points. For each of the field grid points, compute distance to each box. Select closest box and compute distance to each surface point in box. If the minimum distance to points contained in closest box is smaller than distance to remaining boxes we are done. Otherwise examine second closest box etc. Experience has shown that on average one must look at 1.5 boxes. Cost: NF * sqrt(NS)

Construction of Boxes: Start with a big box containing all the surface points. Divide box in longest direction. Choose dividing plane so that half the surface points lie on each side. Proceed recursively. Stop when box has sqrt(NS) or fewer surface points. As final embellishment, once a box is constructed we shrink it down so that it just barely contains all the surface points within it. A 2 dimensional example of boxes is shown in Fig. 1. Of course the concept works just as well in 3D.
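The box construction and search described above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the vectorized FORTRAN routine given to NASA: the function names, the use of a per-point sort to rank boxes, and the random test data are all assumptions made here.

```python
import math
import random


def build_boxes(points, leaf_size=None):
    """Recursively bisect the surface points into tight bounding boxes."""
    if leaf_size is None:
        leaf_size = max(1, math.isqrt(len(points)))  # about sqrt(NS) per box
    boxes = []

    def split(pts):
        dim = len(pts[0])
        lo = [min(p[d] for p in pts) for d in range(dim)]  # shrink-wrapped box
        hi = [max(p[d] for p in pts) for d in range(dim)]
        if len(pts) <= leaf_size:
            boxes.append((lo, hi, pts))
            return
        axis = max(range(dim), key=lambda d: hi[d] - lo[d])  # longest direction
        pts = sorted(pts, key=lambda p: p[axis])
        half = len(pts) // 2                                 # median plane
        split(pts[:half])
        split(pts[half:])

    split(list(points))
    return boxes


def box_distance(p, lo, hi):
    """Distance from p to an axis-aligned box (zero if p is inside it)."""
    return math.sqrt(sum(max(l - c, 0.0, c - h) ** 2 for c, l, h in zip(p, lo, hi)))


def wall_distance(p, boxes):
    """Minimum distance from p to any surface point, nearest boxes first."""
    ranked = sorted(boxes, key=lambda b: box_distance(p, b[0], b[1]))
    best = math.inf
    for lo, hi, pts in ranked:
        if best <= box_distance(p, lo, hi):
            break  # no remaining box can contain a closer point
        best = min(best, min(math.dist(p, q) for q in pts))
    return best


random.seed(0)
surface = [(random.random(), random.random(), 0.0) for _ in range(400)]
boxes = build_boxes(surface)
for _ in range(50):
    p = (random.random(), random.random(), random.random())
    brute = min(math.dist(p, q) for q in surface)  # naive NF * NS reference
    assert abs(wall_distance(p, boxes) - brute) < 1e-12
print(len(boxes), "boxes for", len(surface), "surface points")
```

The early exit is safe because the distance to a box is a lower bound on the distance to every point inside it, so once the best point found is no farther than the next box, the remaining boxes cannot improve the answer.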
The fast distance function has been given back to NASA and is now available in TLNS3DMB [8, 9] and CFL3D [5]. The method has produced dramatic time savings. Indeed, one case that required 3 hours on the C90 now takes just 5 minutes. In practice, the method has proven to be faster than others (based on octrees and such) because it fully vectorizes. Of course, it also parallelizes. Initially the parallel code failed under both cf77 and f90. Cray has dropped support for cf77 but they immediately fixed the error in f90.
Acknowledgements
During this rather intense research effort, literally hundreds of e-mail messages were exchanged with my tireless and always ready to help "support group". Professor Yousef Saad and his research assistant, Andrew Chapman, provided key elements needed to upgrade the TRANAIR linear algebra package. My contacts at Cray Research/SGI including Frank Chism, Dave Whitaker, Mike Heroux and David Ecale were extremely responsive in handling my many questions about Cray computers and efficient coding techniques.
REFERENCES
1. Baldwin, B. S., and Barth, T. J., "A One-equation turbulence transport model for high Reynolds number wall-bounded flows," AIAA Paper 91-0610.
2. Blelloch, G.E., Heroux, M.A., and Zagha, M., "Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors", CMU-CS-93-173, Carnegie Mellon University, August 1993.
3. Chapman, A. and Saad, Y., "Deflated and Augmented Krylov Subspace Techniques," Numerical Linear Algebra with Applications, Vol. 4(1), 1997, pp. 43-66.
4. Johnson, F.T., "The TRANAIR rectangular grid approach to solving the nonlinear full-potential equation about complex configurations", Sadhana, Vol. 16, Part 2, October 1991, pp. 165-177.
5. Rumsey, C.L., Biedron, R.T., and Thomas, J.L., "CFL3D: Its History and Some Recent Applications", NASA TM-112861, May 1997.
6. Saad, Y., "Iterative Methods for Sparse Linear Systems," PWS Publishing Company, 1996.
7. Spalart, P. R., and Allmaras, S. R., "A One-equation turbulence transport model for Aerodynamic Flows," AIAA Paper 92-0439. Also published in La Recherche Aerospatiale, no. 1, 1994, pp. 5-21.
8. Vatsa, V.N., Sanetrik, M.D., and Parlette, E.B., "Development of a Flexible and Efficient Multigrid-Based Multiblock Flow Solver", AIAA Paper No. 93-0677, Jan. 1993.
9. Vatsa, V.N. and Wedan, B.W., "Development of a Multigrid code for 3-D Navier-Stokes Equations and its application to a grid-refinement study", Computers and Fluids, Vol. 18, 1990, pp. 391-403.
10. Young, D.P., Huffman, W.P., Melvin, R.G., Bieterman, M.B., Hilmes, C.L., and Johnson, F.T., "Inexactness and Global Convergence in Design Optimization", AIAA Paper 94-4386.
17 Recent Applications in Aerodynamics with NSMB Structured MultiBlock Solver

Carlos Weber 1, Cyril Gacherieu 1, Arthur Rizzi 2, Anders Ytterstrom 2, Jan Vos 3, Nathalie Duquesne 2 & Loic Tourrette 4

1 Centre Europeen de Recherche et de Formation Avancee en Calcul Scientifique, F-31057 Toulouse, France.
2 Department of Aeronautics, KTH Royal Institute of Technology, S-10044 Stockholm, Sweden.
3 Hydraulic Machines & Fluid Mechanics Institute (IMHEF), Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland.
4 Aerospatiale Avions, F-31060 Toulouse, France.

Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific

17.1 Introduction

Navier-Stokes solvers using structured multi-block grids have become a standard CFD technology for studying practical problems of aerodynamics in the aeronautical field. A very common variant of the procedure is the finite volume approach based on a cell-centered concept. This method has found favor in industry because the multi-block approach facilitates the mesh generation around complex geometries and naturally partitions the problem into smaller units for concurrent processing. The ultimate use of such solvers is for aerodynamic design. But engineers in industry do not yet routinely simulate three-dimensional transonic turbulent flows because of the high cost in CPU time and memory, along with long turn-around times. Navier-Stokes meshes place a great number of grid points in the viscous layers in order to resolve these regions properly, and this leads to turn-around times much longer than overnight, even with powerful vector computers.

The advent of massively parallel computers offered the hope of reducing the elapsed time of such flow simulations, but the great potential of these supercomputers can only be exploited efficiently if the software is especially adapted to the parallel architecture. Thus the European Union set up the EUROPORT Project to investigate whether and how this potential could be realized [19]. We participated in this project, and the overall conclusion is that even with present and future high-performance supercomputers, Navier-Stokes simulations of a realistic industrial design case are still highly CPU-time demanding. Explicit schemes simply do not converge rapidly enough for this type of problem, and new methods are needed to accelerate the convergence. Multigrid acceleration does help in some instances, but most of the time the complex geometry with its highly stretched cells in the viscous region, together with turbulence-model effects, renders this technique ineffective. Thus a more robust algorithm is needed to accelerate convergence.

From the perspective of EUROPORT, we now believe that Navier-Stokes simulations of industrial applications will advance only if there is progress in three interrelated areas: 1) more accurate physical models, i.e. turbulence, 2) faster converging and more robust algorithms and 3) better implementation and higher performance on vector/parallel supercomputers. Unfortunately, an advance in one area is usually counterproductive in the others.
For example, a more sophisticated turbulence model adds to, and may alter the character of, the system of equations to be solved, increases the number of computations to be executed and complicates the coding for a given paradigm of parallel processing. Likewise an algorithm with higher accuracy and better convergence properties may be difficult to implement and may have poorer parallel performance. The task thus is to balance all of these factors and to achieve an overall improvement in all three areas.

The purpose of this paper is to describe what we are doing in an academic-industrial consortium of partners working to bring these three items under control. The NSMB code has been developed in a joint project between three research establishments (EPFL in Lausanne, KTH in Stockholm and CERFACS in Toulouse) and two industrial partners (Aerospatiale in Toulouse and SAAB in Linkoeping).

From the beginning the NSMB code has solved the compressible Navier-Stokes equations using a finite volume method and central differences for the spatial discretization, together with the explicit multi-stage Runge-Kutta time-integration scheme. To accelerate convergence, local time stepping has been used and some implicitness has been introduced through implicit residual smoothing together with a multigrid procedure. The numerical scheme is augmented with artificial dissipation terms in order to prevent odd/even oscillations and to improve the shock resolution. Several turbulence models are implemented, ranging from the simple algebraic Baldwin-Lomax model to two-equation models and an algebraic Reynolds stress model.

Recently we have implemented the implicit LU-SGS (Lower-Upper Symmetric Gauss-Seidel) scheme for time integration. Jameson and Turkel [12] originally developed this scheme together with the centered space scheme, but we found that convergence degraded when using meshes with high cell aspect ratios. Therefore, a second order Roe scheme was implemented, ensuring diagonal dominance. Two variants have been implemented: the original scalar version using the spectral radii, and a full matrix version where the numerical fluxes are linearized as accurately as possible. Thus the user of the code now can choose from a library of different turbulence models, time integration methods and space discretization schemes. We illustrate the effects that some of the different choices produce on a number of computed examples and compare their computing performance on several vector and parallel machines. The cases include a low Mach number turbulent flow over the A-Airfoil, transonic flow over a delta wing and transonic flow around Aerospatiale's AS28G airplane configuration.
17.2 Physical Model

The Navier-Stokes equations are the model for the fluid dynamics considered here. Turbulent fluctuations in these equations are time averaged, and the system is closed by appropriate models using the Boussinesq approximation. Transition is not modeled. The flow is either considered to be fully turbulent, or the transition location is specified a priori.

17.2.1 Governing Equations
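In 3D Cartesian coordinates $(x, y, z)$, the compressible Reynolds-averaged Navier-Stokes equations for a perfect gas expressed in conservative form are

$$\frac{\partial U}{\partial t} + \frac{\partial}{\partial x}(F_{in} - F_v) + \frac{\partial}{\partial y}(G_{in} - G_v) + \frac{\partial}{\partial z}(H_{in} - H_v) = S \tag{17.1}$$

where

$$U = \begin{pmatrix} U^{mf} \\ U^{tm} \end{pmatrix} \tag{17.2}$$

$$F_{in} - F_v = \begin{pmatrix} F_{in}^{mf} - F_v^{mf} \\ F_{in}^{tm} - F_v^{tm} \end{pmatrix}, \quad G_{in} - G_v = \begin{pmatrix} G_{in}^{mf} - G_v^{mf} \\ G_{in}^{tm} - G_v^{tm} \end{pmatrix}, \quad H_{in} - H_v = \begin{pmatrix} H_{in}^{mf} - H_v^{mf} \\ H_{in}^{tm} - H_v^{tm} \end{pmatrix} \tag{17.3}$$

$$S = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & S^{tm} \end{pmatrix}^T \tag{17.4}$$

The term $U^{mf}$ represents the five conservative variables of the mean flow; the inviscid and viscous flux terms of the mean flow are respectively $F_{in}^{mf}$, $G_{in}^{mf}$, $H_{in}^{mf}$ and $F_v^{mf}$, $G_v^{mf}$, $H_v^{mf}$. The components of these vectors are well known and not repeated here. The Reynolds-averaged system is closed using the Boussinesq approximation for the turbulent Reynolds stresses and a turbulence model consisting of either an algebraic equation or one or two partial differential equations. The variables of the model are $U^{tm}$, and the inviscid and viscous flux terms in these models are respectively $F_{in}^{tm}$, $G_{in}^{tm}$, $H_{in}^{tm}$ and $F_v^{tm}$, $G_v^{tm}$, $H_v^{tm}$. Lastly, the source-term components in the model are $S^{tm}$. The turbulence-model vectors are specified in the sections below.

17.2.2 Turbulence models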
Various turbulence models of different complexity are available in NSMB, ranging from algebraic models to one- and two-equation models. Four different turbulence models have been used here: the algebraic Baldwin-Lomax [2] and Granville [10] models, the one-equation model by Spalart and Allmaras [18] and the two-equation $k$-$\varepsilon$ model by Chien [6]. The first three models have been implemented in the code by Aerospatiale, whereas the last has been implemented by the Aeronautics Department at KTH. The Baldwin-Lomax model has been implemented in a general way for multi-block computations [7].

17.2.2.1 Granville
The Granville model [10] is based on the Baldwin-Lomax model and uses the same formulation of the governing Reynolds-averaged Navier-Stokes equations (17.1). Granville introduced alternative expressions for the coefficients $C_{kl}$ and $C_{cp}$ in the Baldwin-Lomax formulation that include the Coles wake factor $\Pi$ in order to represent favorable as well as adverse pressure gradients up to separation.

Granville assumed an outer similarity law for the turbulent boundary layer, which gives $C_{kl}$ in terms of $\Pi$:

[equation (17.5) illegible in source]

A formula for $C_{cp}$ as a function of $C_{kl}$ was derived by eliminating the Coles wake factor:

$$C_{cp} = \frac{3 - 4C_{kl}}{2C_{kl}\left(2 - 3C_{kl} + C_{kl}^3\right)} \tag{17.6}$$
For the case of equilibrium pressure gradients, the Coles wake factor is constant in the streamwise direction and can be empirically correlated to the Clauser pressure-gradient parameter. Granville proposed a modified Clauser pressure-gradient parameter $\beta$:

[equation (17.7) illegible in source]

Finally, Granville derived an explicit formula for $C_{kl}$ as a function of the modified Clauser pressure-gradient parameter $\beta$:

$$C_{kl} = \frac{2}{3} - \frac{0.01312}{0.1724 + \beta} \tag{17.8}$$
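The two closed-form coefficient relations can be evaluated directly. The following is a minimal Python sketch, assuming that (17.8) reads $C_{kl} = 2/3 - 0.01312/(0.1724 + \beta)$ and that (17.6) reads $C_{cp} = (3 - 4C_{kl}) / [2C_{kl}(2 - 3C_{kl} + C_{kl}^3)]$; the function names are illustrative and are not taken from NSMB.

```python
def granville_c_kl(beta: float) -> float:
    """Coefficient C_kl from the modified Clauser parameter beta, Eq. (17.8) as assumed above."""
    return 2.0 / 3.0 - 0.01312 / (0.1724 + beta)


def granville_c_cp(c_kl: float) -> float:
    """Coefficient C_cp as a function of C_kl, Eq. (17.6) as assumed above."""
    return (3.0 - 4.0 * c_kl) / (2.0 * c_kl * (2.0 - 3.0 * c_kl + c_kl ** 3))


if __name__ == "__main__":
    # Zero pressure gradient (beta = 0) and two illustrative non-zero values.
    for beta in (0.0, 0.5, 2.0):
        c_kl = granville_c_kl(beta)
        print(f"beta={beta:4.1f}  C_kl={c_kl:.4f}  C_cp={granville_c_cp(c_kl):.4f}")
```

With these forms, $C_{kl}$ approaches 2/3 as $\beta$ grows, and $\beta = 0$ gives $C_{kl} \approx 0.59$.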
The inner eddy viscosity also has to be modified to include pressure gradients. The linear variation ($\kappa y$) needs to be modified because the total shear stress is no longer constant near the wall. Following the paper of Galbraith et al. [8], the mixing length in the vicinity of the wall becomes:

$$l_{inner} = \kappa y \left(\frac{\tau}{\tau_w}\right)^{1/2}\left[1 - \exp\left(-\frac{y^+}{A^+}\right)\right] \tag{17.9}$$

Rejecting the use of $\mu_t|\omega|$ for the determination of $\tau$, which induces a nonlinear behavior, Thomas and Hasani [20] proposed an algebraic interpolation formula for $\tau(y)$ which is reasonably accurate across the entire boundary layer:

$$\frac{\tau}{\tau_w} \approx 1 + \xi\eta - (3 + 2\xi)\eta^2 + (2 + \xi)\eta^3 \tag{17.10}$$

with $\eta = y/\delta$ and $\xi = \frac{\delta}{\tau_w}\frac{dp}{dx}$. Recently, Granville [10] suggested that the smoothest merge (with the log-law intercept and its slope) was accomplished by the following damping-constant variation:

$$A^+ = \frac{26}{(1 + b\,p^+)^{1/2}} \tag{17.11}$$

and

$$p^+ = \frac{\nu}{\rho u_\tau^3}\frac{dp}{dx} \tag{17.12}$$

with $b = 12.6$ if $p^+ > 0$ and $b = 14.76$ if $p^+ < 0$.

17.2.2.2 Spalart-Allmaras
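The one-equation model by Spalart and Allmaras [18] uses the same mean-flow equations for mass, momentum and energy as the algebraic models presented above, and one additional transport equation for the turbulent viscosity, in order to get an expression for the eddy viscosity. This gives $U^{tm} = \tilde\nu$ in the turbulence-model state vector (17.2), and the other terms in Eq. (17.1) are

$$F_{in}^{tm} = u\tilde\nu, \quad G_{in}^{tm} = v\tilde\nu, \quad H_{in}^{tm} = w\tilde\nu$$

$$F_v^{tm} = \frac{\nu + \tilde\nu}{\sigma}\frac{\partial\tilde\nu}{\partial x}, \quad G_v^{tm} = \frac{\nu + \tilde\nu}{\sigma}\frac{\partial\tilde\nu}{\partial y}, \quad H_v^{tm} = \frac{\nu + \tilde\nu}{\sigma}\frac{\partial\tilde\nu}{\partial z}$$

$$S^{tm} = c_{b1}\tilde S\tilde\nu + \frac{c_{b2}}{\sigma}(\nabla\tilde\nu)^2 - c_{w1} f_w(r)\left(\frac{\tilde\nu}{d}\right)^2$$

giving the following transport equation for the turbulent kinematic viscosity $\tilde\nu$:

$$\underbrace{\frac{D\tilde\nu}{Dt}}_{\text{convection}} = \underbrace{c_{b1}\tilde S\tilde\nu}_{\text{production}} + \underbrace{\frac{1}{\sigma}\left[\nabla\cdot\left((\nu + \tilde\nu)\nabla\tilde\nu\right) + c_{b2}(\nabla\tilde\nu)^2\right]}_{\text{diffusion}} - \underbrace{c_{w1} f_w(r)\left(\frac{\tilde\nu}{d}\right)^2}_{\text{dissipation}} \tag{17.13}$$

where $d$ is the distance to the wall. The eddy viscosity is defined as

$$\mu_t = \rho\tilde\nu f_{v1} = \rho\nu_t \tag{17.14}$$

To ensure that $\tilde\nu$ equals $\kappa y u_\tau$ in the log layer, the buffer layer and the viscous sublayer, the damping function $f_{v1}$ is defined as

$$f_{v1} = \frac{\chi^3}{\chi^3 + c_{v1}^3} \tag{17.15}$$

as a function of the totally local variable $\chi$,

$$\chi = \frac{\tilde\nu}{\nu} \tag{17.16}$$

Spalart and Allmaras recommended the use of the square of the magnitude of the vorticity $|\omega|$ to define the production function $S$. This term must be modified to maintain its log-layer behavior ($\tilde S = u_\tau/(\kappa d)$) all the way to the wall,

$$\tilde S = S^{1/2} + \frac{\tilde\nu}{\kappa^2 d^2} f_{v2}(\chi) \tag{17.17}$$

which is accomplished with the help of the function

$$f_{v2} = 1 - \frac{\chi}{1 + \chi f_{v1}} \tag{17.18}$$

The destruction term should vanish in the outer region of the boundary layer. Spalart and Allmaras proposed the function

$$f_w(r) = g\left[\frac{1 + c_{w3}^6}{g^6 + c_{w3}^6}\right]^{1/6} \tag{17.19}$$

with the argument

$$r = \frac{\tilde\nu}{\tilde S\kappa^2 d^2} \tag{17.20}$$

Both $r$ and $f_w$ equal 1 in the log layer, and decrease in the outer region. The function $g$ is merely a limiter that prevents large values of $f_w$,

$$g = r + c_{w2}(r^6 - r) \tag{17.21}$$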
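The closure functions (17.15) to (17.21) are purely algebraic and can be checked in isolation. Below is a minimal Python sketch, assuming the standard Spalart-Allmaras forms ($f_{v1} = \chi^3/(\chi^3 + c_{v1}^3)$, $f_{v2} = 1 - \chi/(1 + \chi f_{v1})$, $f_w = g[(1 + c_{w3}^6)/(g^6 + c_{w3}^6)]^{1/6}$) and the model constants quoted in the text. The function `sa_source` and its argument names are hypothetical and do not reproduce the NSMB implementation; in particular, `vorticity` is taken to be the magnitude $|\omega|$.

```python
KAPPA = 0.41
C_B1, C_B2, SIGMA = 0.1355, 0.622, 2.0 / 3.0
C_V1, C_W2, C_W3 = 7.1, 0.3, 2.0
C_W1 = C_B1 / KAPPA ** 2 + (1.0 + C_B2) / SIGMA


def f_v1(chi: float) -> float:
    """Viscous damping function, Eq. (17.15)."""
    return chi ** 3 / (chi ** 3 + C_V1 ** 3)


def f_v2(chi: float) -> float:
    """Near-wall correction of the production term, Eq. (17.18)."""
    return 1.0 - chi / (1.0 + chi * f_v1(chi))


def f_w(r: float) -> float:
    """Destruction function, Eqs. (17.19) and (17.21)."""
    g = r + C_W2 * (r ** 6 - r)
    return g * ((1.0 + C_W3 ** 6) / (g ** 6 + C_W3 ** 6)) ** (1.0 / 6.0)


def sa_source(nu_tilde, nu, vorticity, d, grad_nu_tilde_sq=0.0):
    """Production + non-conservative diffusion - destruction for one cell.

    Assumes vorticity > 0 and d > 0 (no limiting of r is applied here).
    """
    chi = nu_tilde / nu
    s_tilde = vorticity + nu_tilde / (KAPPA ** 2 * d ** 2) * f_v2(chi)
    r = nu_tilde / (s_tilde * KAPPA ** 2 * d ** 2)
    production = C_B1 * s_tilde * nu_tilde
    destruction = C_W1 * f_w(r) * (nu_tilde / d) ** 2
    return production + C_B2 / SIGMA * grad_nu_tilde_sq - destruction


if __name__ == "__main__":
    print("c_w1 =", round(C_W1, 4))
    print("f_w(1) =", f_w(1.0))
    print("source =", sa_source(nu_tilde=1e-4, nu=1.5e-5, vorticity=100.0, d=0.01))
```

The value $f_w(1) = 1$ follows directly from $g(1) = 1$, which is the log-layer condition stated above.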
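The constants in the Spalart-Allmaras model are

$$c_{b1} = 0.1355, \quad c_{b2} = 0.622, \quad c_{w2} = 0.3, \quad c_{v1} = 7.1,$$

$$\sigma = \frac{2}{3}, \quad c_{w1} = \frac{c_{b1}}{\kappa^2} + \frac{1 + c_{b2}}{\sigma}, \quad c_{w3} = 2, \quad \kappa = 0.41.$$

17.2.2.3 Chien $k$-$\varepsilon$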
The starting point for two-equation models is the Boussinesq approximation, which can be written:

$$-\rho\,\widetilde{u_i'' u_j''} = \mu_t\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} - \frac{2}{3}\frac{\partial u_k}{\partial x_k}\delta_{ij}\right) - \frac{2}{3}\rho k\,\delta_{ij}$$

where, according to the Favre averaging, $u_i$ is the mass averaged part of the instantaneous velocity and $u_i''$ is the fluctuating part.

Two-equation turbulence models use two extra transport equations to close the Favre-averaged Navier-Stokes equations (Eq. (17.1)). For the $k$-$\varepsilon$ models, the first equation determines the turbulent kinetic energy $k\ (= \tfrac{1}{2}\widetilde{u_i'' u_i''})$, and the second determines the dissipation of turbulent kinetic energy $\varepsilon$ and is used in conjunction with $k$ to define a turbulence length scale. Thus two additional terms are added to the state vector:

$$U^{tm} = \begin{pmatrix} \rho k \\ \rho\varepsilon \end{pmatrix} \tag{17.22}$$

The basic difference between this and lower order models is the definition of the turbulent viscosity. Rather than being linked to an algebraic length scale, the turbulent viscosity for a $k$-$\varepsilon$ model is determined by the relation:

$$\mu_t = C_\mu\frac{\rho k^2}{\varepsilon}$$

with $C_\mu = 0.09$. The last term in the Boussinesq approximation, $\frac{2}{3}\rho k$, corresponds to a turbulent pressure due to the turbulent motion, and is added to the pressure terms in the inviscid fluxes in the momentum and energy equations, yielding $p^* = p + \frac{2}{3}\rho k$. Due to the coupling with the $k$ and $\varepsilon$ equations, the energy equation is slightly modified and accounts for an additional diffusion term $\sigma_{k,i}$ which also appears in the $k$ equation. The last two equations, for $k$ and $\varepsilon$, written with the previous notations are:
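$$F_{in}^{tm} = \begin{pmatrix} \rho u k \\ \rho u\varepsilon \end{pmatrix}, \quad G_{in}^{tm} = \begin{pmatrix} \rho v k \\ \rho v\varepsilon \end{pmatrix}, \quad H_{in}^{tm} = \begin{pmatrix} \rho w k \\ \rho w\varepsilon \end{pmatrix}$$

and for the viscous fluxes

$$F_v^{tm} = \begin{pmatrix} \sigma_{k,1} \\ \sigma_{\varepsilon,1} \end{pmatrix}, \quad G_v^{tm} = \begin{pmatrix} \sigma_{k,2} \\ \sigma_{\varepsilon,2} \end{pmatrix}, \quad H_v^{tm} = \begin{pmatrix} \sigma_{k,3} \\ \sigma_{\varepsilon,3} \end{pmatrix}$$

The diffusion terms are written

$$\sigma_{k,i} = \left(\mu + \frac{\mu_t}{Pr_k}\right)\frac{\partial k}{\partial x_i}, \qquad \sigma_{\varepsilon,i} = \left(\mu + \frac{\mu_t}{Pr_\varepsilon}\right)\frac{\partial\varepsilon}{\partial x_i}$$

where $Pr_k$ and $Pr_\varepsilon$ are analogous to turbulent Prandtl-Schmidt numbers. This basic two-equation model is typically a "high Reynolds number" model, which does not account for the interaction between turbulence and fluid viscosity and therefore does not apply to regions near walls. To permit the integration of the turbulence equations all the way to the wall, different viscous corrections have been proposed by many researchers. In this study, the low-Reynolds number version of Chien [6] has been used. In this model, a transport equation is solved for the isotropic component of the dissipation rate. Since the isotropic component of the dissipation, instead of the dissipation itself, is transported, the source terms of $k$ and $\varepsilon$ are slightly modified by the introduction of correction terms, and are then written:

$$S^{tm} = \begin{pmatrix} P_k - \rho\varepsilon - \dfrac{2\mu k}{x_n^2} \\[2mm] C_{\varepsilon 1} f_1\dfrac{\varepsilon}{k}P_k - C_{\varepsilon 2} f_2\dfrac{\rho\varepsilon^2}{k} - \dfrac{2\mu\varepsilon}{x_n^2}\exp(-0.5\,y^+) \end{pmatrix}$$

with the production term

$$P_k = -\rho\,\widetilde{u_i'' u_j''}\,\frac{\partial u_i}{\partial x_j}$$

For correct treatment in the vicinity of the wall, the following damping functions are used:

$$f_\mu = 1 - \exp(-0.0115\,y^+), \qquad f_1 = 1, \qquad f_2 = 1 - 0.22\exp\left(-\frac{Re_t^2}{36}\right)$$

where the turbulent Reynolds number $Re_t = \dfrac{\rho k^2}{\mu\varepsilon}$ and $y^+ = \dfrac{\rho u_\tau x_n}{\mu}$ are two dimensionless variables and $x_n$ is the distance to the nearest wall. The eddy viscosity is then:

$$\mu_t = C_\mu f_\mu\frac{\rho k^2}{\varepsilon}$$

The typical constants used in Chien's model are:

$$C_\mu = 0.09, \quad C_{\varepsilon 1} = 1.35, \quad C_{\varepsilon 2} = 1.8, \quad Pr_k = 1.0, \quad Pr_\varepsilon = 1.3$$

The boundary conditions at the wall for $k$ and $\varepsilon$ are:

$$k = 0, \qquad \varepsilon = 0$$

For the calculations performed in this paper, the $k$ and $\varepsilon$ equations are solved by explicit Runge-Kutta integration.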
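The near-wall behavior of the Chien model is governed by the damping functions and the two wall correction terms. The following Python sketch evaluates them for a single cell. It assumes the standard form of Chien's source terms (a $-2\mu k/x_n^2$ term in the $k$ equation and a $-2\mu\varepsilon\exp(-0.5y^+)/x_n^2$ term in the $\varepsilon$ equation) and the constants quoted above; the function names and arguments are illustrative only.

```python
import math

C_MU, C_EPS1, C_EPS2 = 0.09, 1.35, 1.8


def f_mu(y_plus: float) -> float:
    """Wall damping of the eddy viscosity."""
    return 1.0 - math.exp(-0.0115 * y_plus)


def f_2(re_t: float) -> float:
    """Low-Reynolds-number damping of the epsilon destruction term."""
    return 1.0 - 0.22 * math.exp(-(re_t / 6.0) ** 2)


def eddy_viscosity(rho, k, eps, y_plus):
    """mu_t = C_mu * f_mu * rho * k^2 / eps."""
    return C_MU * f_mu(y_plus) * rho * k ** 2 / eps


def chien_sources(rho, mu, k, eps, p_k, x_n, y_plus):
    """Return (S_k, S_eps) for one cell; p_k is the production term, x_n the wall distance."""
    re_t = rho * k ** 2 / (mu * eps)
    s_k = p_k - rho * eps - 2.0 * mu * k / x_n ** 2
    s_eps = (C_EPS1 * eps / k * p_k
             - C_EPS2 * f_2(re_t) * rho * eps ** 2 / k
             - 2.0 * mu * eps / x_n ** 2 * math.exp(-0.5 * y_plus))
    return s_k, s_eps


if __name__ == "__main__":
    for y_plus in (1.0, 30.0, 300.0):
        print(y_plus, round(f_mu(y_plus), 4), eddy_viscosity(1.2, 0.5, 10.0, y_plus))
```

Far from the wall ($y^+ \to \infty$, $x_n \to \infty$) the damping functions tend to one and the correction terms vanish, so the sketch reduces to the high Reynolds number model described earlier.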
17.3 Numerical Model

NSMB uses the cell-centered Finite Volume (FV) approximation as spatial discretization, partly because of the importance of conservation at discontinuities such as shocks. The FV method can also handle stretched meshes around complicated geometries and singular points in the grid. The flux vectors at the cell faces consist of an inviscid part and a viscous part, where the viscous part is calculated using the gradient theorem on a shifted control volume. The user of NSMB can choose either the Jameson central differencing with second and fourth order artificial dissipation or a Roe-based upwind differencing for the space discretization. The usual explicit multi-stage Runge-Kutta time integration has been the standard method for time differencing, coupled with Jameson implicit residual smoothing and multigrid acceleration to the steady-state solution. We do not describe it here. Instead we describe the new features of the code, which include upwind space differencing and implicit time differencing. In this paper the implicit scheme has only been applied to the mean-flow equations, i.e. only algebraic turbulence models have been implemented with it.

17.3.1 Space discretization

17.3.1.1 Jameson's central scheme
The second order central scheme is augmented with the artificial dissipation model of Jameson et al. [13] using a combination of second and fourth order differences. This adaptive dissipation model has proved to reproduce discontinuities without oscillations. The fourth order artificial dissipation terms suppress odd/even oscillations which are not damped by the scheme itself. The numerical flux at the interface between cells $i$ and $i+1$ is given by:

$$F_{i+1/2} = F_{in}\left(\frac{U_{i+1} + U_i}{2}\right) - d_{i+1/2} \tag{17.26}$$

where $F_{in}$ is the convective flux and $d_{i+1/2}$ is the dissipative flux defined as:

$$d_{i+1/2} = \varepsilon^{(2)}_{i+1/2}(U_{i+1} - U_i) - \varepsilon^{(4)}_{i+1/2}(U_{i+2} - 3U_{i+1} + 3U_i - U_{i-1}) \tag{17.27}$$

The coefficients $\varepsilon^{(2)}$ and $\varepsilon^{(4)}$ are used to locally adapt the dissipative flux and are directionally scaled by the spectral radius $r(A)$ of the Jacobian matrix $A$:

$$\varepsilon^{(2)}_{i+1/2} = k^{(2)}\,r(A)_{i+1/2}\,\nu_{i+1/2}, \qquad \varepsilon^{(4)}_{i+1/2} = \max\left(0,\; k^{(4)}\,r(A)_{i+1/2} - \varepsilon^{(2)}_{i+1/2}\right) \tag{17.28}$$

The sensor variable $\nu_{i+1/2}$ controls the second order dissipation near shock waves. It is constructed using the absolute value of the normalized second order differences of the pressure:

$$\nu_i = \left|\frac{p_{i+1} - 2p_i + p_{i-1}}{p_{i+1} + 2p_i + p_{i-1}}\right| \tag{17.29}$$

The sensor is then taken as

$$\nu_{i+1/2} = \max(\nu_i, \nu_{i+1}) \tag{17.30}$$
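The switching between second and fourth order dissipation in (17.27) to (17.30) can be illustrated on a one-dimensional scalar array. The Python sketch below is a minimal illustration under stated assumptions: the spectral radius at the face is passed in as a single number `r_face`, and the default values of `k2` and `k4` are common textbook choices, not values taken from NSMB.

```python
def pressure_sensor(p):
    """Normalized second difference of the pressure, Eq. (17.29); end values stay zero."""
    nu = [0.0] * len(p)
    for i in range(1, len(p) - 1):
        nu[i] = abs(p[i + 1] - 2.0 * p[i] + p[i - 1]) / (p[i + 1] + 2.0 * p[i] + p[i - 1])
    return nu


def dissipative_flux(u, p, r_face, i, k2=0.5, k4=1.0 / 64.0):
    """Dissipative flux d_{i+1/2} of Eq. (17.27); requires 1 <= i <= len(u) - 3."""
    nu = pressure_sensor(p)
    nu_face = max(nu[i], nu[i + 1])                    # Eq. (17.30)
    eps2 = k2 * r_face * nu_face                       # Eq. (17.28), second order part
    eps4 = max(0.0, k4 * r_face - eps2)                # fourth order part, off near shocks
    return (eps2 * (u[i + 1] - u[i])
            - eps4 * (u[i + 2] - 3.0 * u[i + 1] + 3.0 * u[i] - u[i - 1]))


if __name__ == "__main__":
    # Smooth (linear) data: both differences vanish, no dissipation is added.
    print(dissipative_flux([0.0, 1.0, 2.0, 3.0, 4.0], [1.0] * 5, 1.0, 1))
    # Pressure jump: the sensor activates eps2 and switches eps4 off.
    print(dissipative_flux([1, 1, 1, 0, 0, 0], [1, 1, 1, 2, 2, 2], 1.0, 2))
```

In the second call the sensor is 0.2 at the face, so $\varepsilon^{(2)} = 0.1$ exceeds $k^{(4)} r$ and the fourth order term is switched off, which is the intended behavior near a shock.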
17.3.1.2 Roe's Upwind scheme
This scheme is a second order Total Variation Diminishing (TVD) version of Roe's scheme applying the Monotone Upwind Schemes for Conservation Laws (MUSCL) extrapolation [24]. The numerical flux at the cell side $i + 1/2$ reads:

$$F_{i+1/2} = \frac{1}{2}\left(F_{in}(U^R_{i+1/2}) + F_{in}(U^L_{i+1/2})\right) - \frac{1}{2}\left|\bar A(U^L_{i+1/2}, U^R_{i+1/2})\right|\left(U^R_{i+1/2} - U^L_{i+1/2}\right) \tag{17.31}$$

where $U^L_{i+1/2}$ and $U^R_{i+1/2}$ are the values of the conservative variables extrapolated to the cell side. The superscripts $L$ and $R$ refer to the left and right side respectively of the corresponding interface. The matrix $\bar A(U^L, U^R)$ is the Roe matrix and is based on Roe's approximate Riemann solver. The left and right states at the cell interfaces are defined as [4]:

$$U^L_{i+1/2} = U_i + \left(\frac{1+\Phi}{4}\tilde\Delta_{i+1/2} + \frac{1-\Phi}{4}\tilde\Delta_{i-1/2}\right)$$

$$U^R_{i-1/2} = U_i - \left(\frac{1+\Phi}{4}\tilde\Delta_{i-1/2} + \frac{1-\Phi}{4}\tilde\Delta_{i+1/2}\right) \tag{17.32}$$

Defining $\Delta U_{i+1/2} = U_{i+1} - U_i$, the limited slopes are calculated using the minmod function:

$$\tilde\Delta_{i+1/2} = \mathrm{minmod}(\Delta U_{i+1/2},\; \omega\,\Delta U_{i-1/2}) \tag{17.33}$$

$$\tilde\Delta_{i-1/2} = \mathrm{minmod}(\Delta U_{i-1/2},\; \omega\,\Delta U_{i+1/2}) \tag{17.34}$$

where $\omega$ is a compression parameter in the range given by:

$$1 \le \omega \le \frac{3 - \Phi}{1 - \Phi} \tag{17.35}$$
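The limited extrapolation (17.32) to (17.35) is easy to exercise on scalar data. The following Python sketch assumes the usual two-argument minmod (zero when the arguments differ in sign, otherwise the one of smaller magnitude) and, when no value is given, sets $\omega$ to its upper bound from (17.35). The function `muscl_states` is a hypothetical helper, not an NSMB routine.

```python
def minmod(a: float, b: float) -> float:
    """Return 0 if a and b differ in sign, otherwise the argument of smaller magnitude."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_states(u, i, phi=1.0 / 3.0, omega=None):
    """Return (U^L_{i+1/2}, U^R_{i-1/2}) for cell i; requires 1 <= i <= len(u) - 2, phi < 1."""
    if omega is None:
        omega = (3.0 - phi) / (1.0 - phi)              # upper bound of Eq. (17.35)
    d_plus = u[i + 1] - u[i]                           # Delta U_{i+1/2}
    d_minus = u[i] - u[i - 1]                          # Delta U_{i-1/2}
    s_plus = minmod(d_plus, omega * d_minus)           # Eq. (17.33)
    s_minus = minmod(d_minus, omega * d_plus)          # Eq. (17.34)
    u_left = u[i] + 0.25 * ((1.0 + phi) * s_plus + (1.0 - phi) * s_minus)
    u_right = u[i] - 0.25 * ((1.0 + phi) * s_minus + (1.0 - phi) * s_plus)
    return u_left, u_right


if __name__ == "__main__":
    print(muscl_states([0.0, 1.0, 2.0, 3.0], 1))       # smooth data: linear extrapolation
    print(muscl_states([0.0, 1.0, 0.0], 1))            # local extremum: first order
    print(muscl_states([0.0, 1.0, 2.0, 3.0], 1, phi=-1.0))  # fully upwind variant
```

On linear data the limiter is inactive and the face values are exact; at a local extremum both limited slopes vanish and the scheme falls back to first order, which is what makes the extrapolation TVD.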
The accuracy parameter $\Phi$ in Eq. (17.32) defines a family of high-accuracy TVD upwind schemes. Using $\Phi = -1$ results in a second order fully upwind scheme and $\Phi = 1/3$ results in a scheme based on a third-order scheme for the scalar convection equation. As an alternative to the minmod function, the Van Leer or the Superbee limiter can be used for the fully upwind scheme. In practice, the limiters are applied to characteristic variables instead of the slopes of the conservative variables. This has the advantage that the propagation characteristics of the flow are better taken into account, and different limiters can be used for different characteristic fields.

17.3.2 Convergence acceleration

17.3.2.1 Multigrid
A powerful method to accelerate the convergence to steady state is the multigrid method. Substantial savings in computing time are often possible by combining the solutions after iterations on a number of coarser grids. This idea was pioneered for compressible flow computations by Ni [17] and Jameson [14]. The multigrid version implemented in NSMB is Brandt's FAS algorithm [11], which can be written in a recursive manner. Multigrid can be combined with the explicit Runge-Kutta scheme or the implicit LU-SGS scheme. For turbulent calculations, the eddy viscosity is only calculated on the fine grid, and its value on the coarser grids is obtained by restriction.

17.3.2.2 D-ADI
The implicit residual smoothing using constant coefficients is very efficient for Euler calculations, but it is less effective for Navier-Stokes simulations. For high Reynolds number flows, it is necessary to resolve the thin shear layer near solid boundaries, which requires grids with very large cell aspect ratios. It is known that explicit schemes have difficulty converging on such grids. To overcome these problems, Caughey [3] developed a diagonal alternating direction implicit algorithm. For a curvilinear coordinate system, the Navier-Stokes equations are written as

$$\frac{\partial U}{\partial t} + \frac{\partial}{\partial\xi}(F_{in} - F_v) + \frac{\partial}{\partial\eta}(G_{in} - G_v) + \frac{\partial}{\partial\zeta}(H_{in} - H_v) = 0 \tag{17.36}$$

Integration over a control volume and discretization using an implicit scheme yields:

$$\left\{I + \frac{\Delta t}{V}\left[\delta_\xi(A^n - L^n) + \delta_\eta(B^n - M^n) + \delta_\zeta(C^n - N^n)\right]\right\}\Delta U^n = R^n \tag{17.37}$$

where $\Delta U^n = U^{n+1} - U^n$ and the (explicit) residual is equal to:

$$R^n = -\frac{\Delta t}{V}\left[\delta_\xi(F_{in}^n - F_v^n) + \delta_\eta(G_{in}^n - G_v^n) + \delta_\zeta(H_{in}^n - H_v^n)\right] \tag{17.38}$$

$A^n, B^n, C^n$ (resp. $L^n, M^n, N^n$) are the inviscid (resp. viscous) Jacobian matrices, and $\delta_\xi$, $\delta_\eta$ and $\delta_\zeta$ are the surface normals. In order for the implicit method to be an effective smoothing algorithm, the artificial dissipation fourth differences are included in the implicit operator, which leads to the inversion of pentadiagonal systems for each one-dimensional factor. To avoid the high cost of solving block pentadiagonal systems, the equations are diagonalized at each point using a similarity transformation, see Chaussee and Pulliam [5]. This has the effect of decoupling the equations and requiring the solution of five scalar pentadiagonal systems for each factor in three-dimensional problems:

$$T_\xi\left[I + \frac{\Delta t}{V}\delta_\xi\Lambda_\xi\right]T_\xi^{-1}\;T_\eta\left[I + \frac{\Delta t}{V}\delta_\eta\Lambda_\eta\right]T_\eta^{-1}\;T_\zeta\left[I + \frac{\Delta t}{V}\delta_\zeta\Lambda_\zeta\right]T_\zeta^{-1}\Delta U^n = R^n \tag{17.39}$$

where $\Lambda_\xi$, $\Lambda_\eta$, $\Lambda_\zeta$ are the diagonal matrices containing the eigenvalues of the inviscid Jacobian matrices $A$, $B$, $C$,

$$A = T_\xi\Lambda_\xi T_\xi^{-1}, \qquad B = T_\eta\Lambda_\eta T_\eta^{-1}, \qquad C = T_\zeta\Lambda_\zeta T_\zeta^{-1} \tag{17.40}$$

and $T_\xi$, $T_\eta$, $T_\zeta$ are the matrices containing the associated right eigenvectors. The resulting method has good high wave-number damping, and it is therefore a good algorithm for use in conjunction with the multigrid method. It is also computationally efficient because only scalar systems are involved. The additional computational work required to determine the local similarity transformations and to perform the matrix multiplication of the residual and of the intermediate and final corrections represents a small fraction of the work required to solve the block systems. The above algorithm has been extended to solve the Navier-Stokes equations by adding the spectral radius of the viscous Jacobians to the inviscid implicit factors, following Tysinger and Caughey [3]. In the direction $\xi$, for example, the $\Lambda_\xi$ matrix is replaced by:

$$\tilde\Lambda_\xi = \Lambda_\xi + \rho(L_\xi)\,I \tag{17.41}$$

where the spectral radius $\rho(L_\xi)$ of the viscous Jacobian matrix is given by:

[equation (17.42) illegible in source]
17.3.3 Implicit time discretization

17.3.3.1 Scalar LU-SGS scheme
The following subsection presents the LU-SGS scheme introduced originally by Yoon and Jameson [25]. It has been chosen because of its low numerical effort, low memory requirements and reasonable convergence speed. In addition, the scheme proves to be very robust. The scheme is based on a lower-upper factorization and a symmetric Gauss-Seidel relaxation. The governing equations are discretized separately in space and in time. This ensures that the steady state solution will be independent of the time discretization procedure and therefore independent of the time step. Linearizing the residual about the time level $n$ leads to the following equation:

$$\left(\frac{V}{\Delta t}I + \frac{\partial R}{\partial U}\right)\Delta U^n = -R(U^n) \tag{17.43}$$

with $I$ being the identity matrix. Equation (17.43) represents a large sparse linear system which has to be solved at each time step. The term $\partial R/\partial U$ stands symbolically for the Jacobian matrices resulting from the linearization of the fluxes. A direct method could be applied to solve the linear system, which would require the inversion of a large sparse block banded matrix. However, numerical costs and storage requirements are prohibitive using this method. Instead, iterative methods are applied and/or approximations are made to the linear system itself. The starting point is a diagonally dominant form of Eq. (17.43) in order to meet the stability requirements for the relaxation method. Yoon and Jameson use the following form:

$$\left(\frac{V}{\Delta t} + r_A\right)I\,\Delta U_i + \frac{1}{2}(A - r_A I)_{i+1/2}\,\Delta U_{i+1} - \frac{1}{2}(A + r_A I)_{i-1/2}\,\Delta U_{i-1} = -R(U^n)_i \tag{17.44}$$

with $A\,\Delta U = \Delta F$ and $r_A$ being the spectral radius of the Jacobian matrix $A$. The LU-SGS method is obtained by factorizing the linear system as follows:

$$(E + D)\,D^{-1}\,(F + D)\,\Delta U = -R^n \tag{17.45}$$

where $E$ contains only the lower triangular part, $F$ the upper triangular part and $D$ the main diagonal of the implicit operator. The linear system is now inverted by a forward and a backward sweep through the mesh on planes with $i + j + k = \text{const}$:

$$(E + D)\,\Delta U^* = -R^n, \qquad (F + D)\,\Delta U = D\,\Delta U^* \tag{17.46}$$
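The two sweeps of (17.46) can be sketched for a one-dimensional scalar operator, where $E$, $D$ and $F$ reduce to the sub-diagonal, diagonal and super-diagonal of a tridiagonal matrix. The Python sketch below is illustrative only: the array layout and function names are assumptions, and `rhs` stands for $-R^n$. A small helper also lists the cells of a structured block grouped by the planes $i + j + k = \text{const}$ that the three-dimensional sweeps traverse.

```python
def lu_sgs_sweeps(lower, diag, upper, rhs):
    """Solve (E + D) D^{-1} (F + D) dU = rhs for a 1-D scalar operator.

    lower[i] multiplies dU[i-1] (E), diag[i] multiplies dU[i] (D),
    upper[i] multiplies dU[i+1] (F).
    """
    n = len(diag)
    du_star = [0.0] * n
    for i in range(n):                       # forward sweep: (E + D) dU* = rhs
        known = lower[i] * du_star[i - 1] if i > 0 else 0.0
        du_star[i] = (rhs[i] - known) / diag[i]
    du = [0.0] * n
    for i in reversed(range(n)):             # backward sweep: (F + D) dU = D dU*
        known = upper[i] * du[i + 1] if i < n - 1 else 0.0
        du[i] = du_star[i] - known / diag[i]
    return du


def hyperplanes(ni, nj, nk):
    """Group cell indices (i, j, k) of an ni x nj x nk block by the value of i + j + k."""
    planes = [[] for _ in range(ni + nj + nk - 2)]
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                planes[i + j + k].append((i, j, k))
    return planes


if __name__ == "__main__":
    print(lu_sgs_sweeps([0.0, -1.0, -1.0], [4.0, 4.0, 4.0], [-1.0, -1.0, 0.0], [1.0, 2.0, 3.0]))
    print([len(plane) for plane in hyperplanes(2, 2, 2)])
```

In each sweep the neighbor contribution is already known when a cell is visited, so only a division by the diagonal is needed. In three dimensions all cells on one plane depend only on cells of the previously visited plane, which is why the plane ordering keeps the sweeps vectorizable.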
By sweeping on the oblique planes, the off-diagonal terms $E\,\Delta U^*$ and $F\,\Delta U$ respectively become known and are added to the right hand side. As a consequence, only a block diagonal matrix has to be inverted. If the approximation using the spectral radius is introduced, the implicit operator can be reduced even further to a scalar diagonal matrix. For viscous calculations the spectral radius of the viscous Jacobian is added to the inviscid spectral radius.

17.3.3.2 Full matrix LU-SGS scheme
It has been found that the convergence rate of the scalar LU-SGS scheme decreases significantly when simulating viscous flows using grids with a high cell aspect ratio. The reason for this is an overstabilized system due to the use of the spectral radii. The convergence rate can be improved by using a full matrix LU-SGS scheme which is based on Roe's upwind scheme [23]. Assuming the Roe matrix to be constant, the numerical flux based on the fully upwind scheme can be linearized, which leads to the following scheme:

$$\left[\frac{V}{\Delta t}I + \frac{1}{2}a\left(|\bar A_{i+1/2}| + |\bar A_{i-1/2}|\right)\right]\Delta U_i + \frac{1}{2}a\left(A(U^R_{i+1/2}) - |\bar A_{i+1/2}|\right)\Delta U_{i+1} - \frac{1}{2}a\left(A(U^L_{i-1/2}) + |\bar A_{i-1/2}|\right)\Delta U_{i-1} = -R(U^n)_i \tag{17.47}$$

The factor $a$ results from the upwind extrapolation to the interfaces when neglecting the limiters and is set equal to 1.5. In order to enhance convergence and stability of the scheme, the viscous Jacobian matrices are included.

17.3.4 Time accurate scheme
Time accurate calculations can be made using the explicit Runge-Kutta scheme, whereby the time step is equal to the minimum time step over all grid cells. For Navier-Stokes calculations with highly stretched grids, the maximum allowable time step becomes very small, and for this reason the dual time stepping technique proposed by Jameson [15] was implemented. The idea behind dual time stepping is to have an outer time stepping loop for a time accurate time step using a fully upwind scheme, and an inner time stepping loop with a fictitious time step. Local time stepping, multigrid and other convergence acceleration techniques can be used to converge the inner time stepping loop to a "steady state" in fictitious time. By using fully implicit schemes for the outer time step, large time steps can be made. The difficulty with the dual time stepping approach is to find the optimal value of the outer time step: too large a value requires a large number of inner time steps to converge to a steady state, while with too small a value the gain of the dual time stepping approach compared to the Runge-Kutta scheme is small.
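The structure of the two nested loops can be sketched on a scalar model problem $du/dt + R(u) = 0$. The Python sketch below is a simplified illustration, not the NSMB implementation: it assumes a first order backward (implicit Euler) difference for the outer step, a plain explicit relaxation in fictitious time for the inner loop, and a hypothetical `cfl` factor that sets the fictitious time step from the local slope of the residual.

```python
def dual_time_step(u_n, dt, residual, residual_slope, cfl=0.5, tol=1e-12, max_inner=10_000):
    """Advance u by one physical step dt, solving (w - u_n)/dt + R(w) = 0 by pseudo-time iteration."""
    w = u_n
    for _ in range(max_inner):
        r_star = (w - u_n) / dt + residual(w)          # unsteady residual driven to zero
        if abs(r_star) < tol:
            break
        d_tau = cfl / (1.0 / dt + abs(residual_slope(w)))   # local fictitious time step
        w -= d_tau * r_star                             # explicit step in fictitious time
    return w


if __name__ == "__main__":
    lam = 50.0                        # stiff decay rate, du/dt = -lam * u
    dt = 0.1                          # well above the explicit Euler limit 2 / lam = 0.04
    u = 1.0
    for step in range(3):
        u = dual_time_step(u, dt, lambda w: lam * w, lambda w: lam)
        print(step + 1, u)
```

Each outer step reproduces the implicit Euler result $u^{n+1} = u^n/(1 + \lambda\Delta t)$ even though $\Delta t$ exceeds the explicit stability limit, at the price of several dozen inner iterations. This is the trade-off between outer step size and inner iteration count described above.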
17.4 Program Structure and Multi Block Implementation
NSMB has been developed on top of a data base system, called MEM-COM [16]. MEM-COM is an object oriented data management system for memory and memory-to-disk data handling. The principal advantage of using a data base system is that for large scale multi block flow simulations (i.e., more than 100 blocks and over 1 Million grid points), access to the independent blocks is extremely fast, and almost independent of the block number. From the user point of view, the data base file appears as a single UNIX file, hence all the information related to the simulation can be easily saved on an archival system without possible loss of information. The MEM-COM library includes a Dynamic Memory Manager (DMM), which offers the possibility to allocate at run time the necessary storage of the arrays in NSMB [22]. When running on distributed memory computers, the DMM allocates the memory on each node of the computer. There is hardly any difference between a single block or multi block calculation for the explicit Runge-Kutta scheme used in NSMB. It is mainly on the level of the incorporation of the boundary conditions that there appears a difference. The LU-SGS implicit method has been implemented using explicit coupling between blocks which means that the implicit time stepping procedure is performed independently block by block. The explicit coupling does not introduce any additional computational costs but the global nature of the implicit scheme is reduced. The current version of NSMB assumes that grid lines are continuous across block interfaces, which means that interface algorithms are not necessary. Fictitious ghost cells on each side of the computational block are used to transfer information from the boundary conditions (including the block connectivity boundary condition) to the algorithm used to update the interior points, see [22] for more details.
17.5 High Performance Computing Strategy

Modern vector and parallel platforms, e.g. NEC SX-4, Fujitsu VX, IBM SP2, Cray J932 and T3D, now offer the user high performance if the code is adapted to the special features of the architecture. We describe the steps we have taken to achieve this with NSMB.

17.5.1 RISC/Vector optimization
NSMB was designed for running on vector computers, and the most time consuming routines were written to have vectorizable do-loops over all grid points in a block. This yields an excellent performance of over 1100 MFlops on the NEC SX-4, more than 50% of the peak performance of a single SX-4 processor. On RISC architectures the initial version of NSMB behaved rather poorly, reaching between 10 and 15% of the peak performance. The reasons for this poor performance are memory and cache contention problems, because the data loaded in cache cannot be re-used. When the problem size is reduced (i.e. when smaller blocks are used), the performance increases because all the data can fit into cache. In addition, when using do-loops over all grid points, unnecessary work is carried out in ghost cells. This is not a problem on vector computers, where the gain from using vector loops over all grid points largely compensates for this overhead, but this is not the case on RISC architectures. Several techniques were studied to increase the performance of the most time consuming routines on RISC architectures [1]. Permuting array indices of some work arrays yielded good improvement with only small modifications to the code.

17.5.2 Parallel implementation

The design of the baseline parallel NSMB version was subject to important constraints with respect to portability, minimization of changes in the code, and ease of future maintenance. Two design choices were made in the development of parallel NSMB: first, the domain partitioning is carried out before the execution of parallel NSMB, and second, the parallel implementation in NSMB is based on a master/slave paradigm. This latter choice permits an easy porting of NSMB to different parallel platforms.

17.5.2.1 Domain Partitioning Tool MB-Split
One of the most important aspects in domain decomposition, is to distribute the blocks on the parallel machine in order to have a good load balance between the different processors. The complexity of the block-splitting book keeping demands an automatic load balancing and block-splitting tool. Such a domain decomposition tool, MB-Split, [26] was developed at KTH within the Parallel Aero project. MB-Split is a program written in C + + which can read a MEM-COM database with an arbitrary number of blocks, and generates a new MEM-COM database with a new number of blocks such that a balanced calculation is obtained on a parallel computer or on a cluster of computers. MB-Split automatically corrects all boundary conditions on existing blocks, and generates the boundary condition information for the
WEBER, GACHERIEU, RIZZI ET AL
newly created blocks. Blocks are split using either the Recursive Edge Bisection or the Greedy Load balancing algorithm. There is also a modified version of the Greedy Load balancing algorithm, called GreedyXtra Load balancing, that takes work done in ghost cells into account. It is also possible to perform a load balance according to real timings for the different blocks, which are given as output from NSMB. This latter algorithm does not split any blocks; it only redistributes them for a given number of processors. There are no restrictions on the number of blocks allocated to a node, and it is not necessary that each node has the same number of blocks. It is therefore possible to load balance a mesh with a larger number of original blocks than available processors, as well as the opposite. Besides splitting blocks, MB-Split also contains an option to merge blocks back to the original block structure. Often in industrial CFD simulations, coarser grid solutions are used to initialize a fine grid calculation. Most likely, a coarse grid calculation will be made on a different number of processors, and thus with a different block decomposition, than the fine grid calculation. It was decided to use the original block topology for the interpolation of the solution from the coarse to the fine grid. This implies that MB-Split not only splits the grid, it can also split a solution saved in the MEM-COM database.

17.5.2.2 Master-Slave Implementation of Parallel NSMB
The main characteristics of the Master-Slave implementation are:
1. The parallel version of NSMB uses only the most portable message passing primitives, such as send and blocking and non-blocking receive. Calls to the message passing libraries are made through a transparent communication layer, IPM [9], a set of M4 macros that generates PVM or PARMACS instructions at compile time.
2. In parallel NSMB, the master performs all accesses to the MEM-COM database. In the computation phase (i.e. the time stepping loop) there is no communication between the slaves and the host, except for printing the convergence history.
3. The slave nodes are organized in a binary tree structure. The root node has the task of distributing and controlling the flow simulation (convergence check, evaluation of the global time step, ...). Collecting information during the computation is also done through the tree structure.
4. At the beginning of a Runge-Kutta stage, each processor sends the data for the block connectivity boundary condition to the processors holding the neighboring blocks. Blocks on the same processor exchange data without using the network. The calculation of the next Runge-Kutta stage in a block is started as soon as all the data from the neighboring blocks are received.
APPLICATIONS IN AERODYNAMICS
The same procedure applies in the LU-SGS scheme. All data exchange is performed before solving the linear systems.
5. Once every x time steps (where typically x = 10), the wall and wake information needed by the turbulence model and calculated on each block is assembled and redistributed to each processor.
6. The Master and Slave executables are generated at compilation using CPP preprocessor directives. The serial version of NSMB is obtained in a similar way, hence there is a single source code from which three executables can be generated (master, slave and serial).
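The binary tree organization of the slave nodes (item 3) can be illustrated with a short sketch. This is not NSMB code: the node numbering (the children of node i are 2i+1 and 2i+2, with node 0 as root), the function names and the sample time steps are assumptions made purely for illustration. The sketch shows how a global quantity such as the minimum time step can be collected at the root by passing partial results up the tree.

```python
def children(i, n):
    """Children of node i in a binary tree of n nodes (assumed heap numbering)."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]


def parent(i):
    """Parent of node i, or None for the root node 0."""
    return None if i == 0 else (i - 1) // 2


def tree_min(local_values, i=0):
    """Minimum over the subtree rooted at node i.

    Each node combines its own value with the partial results
    received from its children, then hands the result to its parent.
    """
    value = local_values[i]
    for c in children(i, len(local_values)):
        value = min(value, tree_min(local_values, c))
    return value


def levels_below_root(n):
    """Number of tree levels a message from the last node crosses to reach the root."""
    levels, i = 0, n - 1
    while i > 0:
        i = parent(i)
        levels += 1
    return levels


# Hypothetical local time steps on five slave nodes.
local_dt = [0.5, 0.2, 0.9, 0.1, 0.7]
print("global time step:", tree_min(local_dt))
print("levels for 64 nodes:", levels_below_root(64))
```

With such a layout the number of communication steps needed to gather a global value grows with the depth of the tree rather than with the number of slaves.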
17.6 Computed Results
Three different test cases are computed: an airfoil, called A-airfoil, a 65° sweep delta wing, and a full configuration transport aircraft, called AS28G.

17.6.1 A-airfoil
The A-airfoil is a 2D subsonic test case, often used for testing turbulence models. It is well documented and much data from wind tunnel experiments is available. The flow is a low Mach number flow with M = 0.15 and a Reynolds number of Re = 2.1 x 10^6. The angle of attack is α = 7.2°.

17.6.1.1 Baldwin-Lomax model
In the first computation the LU-SGS scheme with Roe differencing has been used together with the Baldwin-Lomax turbulence model, which is well suited for this type of flow where no or only slight separation occurs. It is also a first test case to demonstrate the performance of the LU-SGS scheme by comparing its convergence to that of the explicit Runge-Kutta scheme. Flow conditions assume a fixed transition point. The mesh used is a one block C-mesh with 256 x 64 = 16384 cells for the finest grid level. Very small cells close to the wall and the high cell aspect ratio lead to a small time step and hence to slow convergence when using the explicit Runge-Kutta time stepping scheme. A five stage Runge-Kutta scheme has been used with a CFL number of 1.0 to reduce the error norm by three decades. The CFL number has then been increased to 2.0 to accelerate convergence. When using the implicit scheme, the CFL number was initially set to 10 and was increased during the time stepping such that 10^6 was reached after approximately 170 iterations. No stability problems occurred and the algorithm proved to be extremely robust. In Fig. 1, the convergence histories in terms of iterations and CPU time on a Silicon Graphics Power Challenge are shown. The explicit scheme converges extremely slowly and needs
Figure 1 Convergence histories for the A-Airfoil.
approximately 50000 iterations for a residual reduction of 5 decades. The full matrix (third order upwind) and scalar implicit (central) versions need 550 and 7000 iterations, respectively. The CPU time could be reduced by a factor of 32 and 8, respectively. The scalar implicit version performs well for the first four decades. Then it continues to converge slowly and flattens out. This behaviour is due to the high cell aspect ratio. Using the full matrix version, this convergence degradation does not occur. Velocity profiles in the boundary layer at two locations on the airfoil are shown in Fig. 2. Both the central and the third order upwind scheme agree well with the experimental data at the location x/c = 0.7. The difference in the velocity profiles at x/c = 0.96 is due to the Baldwin-Lomax turbulence model which fails to predict the recirculation in an appropriate way. The pressure coefficient along the chord can be seen in Fig. 3. The third order upwind scheme and the central scheme give practically the same result and are both in good agreement with the measured pressure.
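The iteration counts and CPU-time factors quoted above imply a cost per iteration for each implicit variant relative to one explicit Runge-Kutta iteration. The following sketch works this out. It assumes that the quoted iteration counts (50000, 550 and 7000) all refer to the same residual reduction and that CPU time is simply iterations times cost per iteration; the function name is hypothetical.

```python
def relative_cost_per_iteration(iters_explicit, iters_implicit, cpu_speedup):
    """Cost of one implicit iteration in units of one explicit iteration.

    With CPU = iterations * cost_per_iteration and
    CPU_explicit / CPU_implicit = cpu_speedup, it follows that
    cost_implicit / cost_explicit = (iters_explicit / iters_implicit) / cpu_speedup.
    """
    return (iters_explicit / iters_implicit) / cpu_speedup


ITERS_EXPLICIT = 50000  # explicit Runge-Kutta, from the text

# Full matrix LU-SGS: 550 iterations, CPU time reduced by a factor of 32.
full_matrix = relative_cost_per_iteration(ITERS_EXPLICIT, 550, 32)
# Scalar LU-SGS: 7000 iterations, CPU time reduced by a factor of 8.
scalar = relative_cost_per_iteration(ITERS_EXPLICIT, 7000, 8)

print(f"full matrix LU-SGS: {full_matrix:.2f} explicit iterations per iteration")
print(f"scalar LU-SGS:      {scalar:.2f} explicit iterations per iteration")
```

Under these assumptions one full matrix iteration costs roughly 2.8 explicit iterations and one scalar implicit iteration roughly 0.9, so the gain comes almost entirely from the reduced iteration count.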
17.6.1.2 k-ε model of Chien
The k and ε equations together with the equations for the mean flow are integrated using the explicit five stage Runge-Kutta scheme. The mesh used is a one block C-mesh with 256 x 64 = 16384 cells, which has been demonstrated to be sufficiently fine to give grid-independent results. The average value of the wall distance y+ is about 0.4. The five stage Runge-Kutta scheme has been used with a CFL number equal to 0.5. The explicit scheme converged very slowly, and approximately 70000 iterations were necessary for a residual
Figure 2 Velocity profiles at two locations on the A-Airfoil.
reduction of 5 decades. Table 1 compares the force coefficients CL and CD that have been computed with a variety of turbulence models incorporated in NSMB with experimental measurements. The lift coefficient is overpredicted by the zero-equation models, whereas the Spalart-Allmaras model underestimates the experimental value. The drag coefficient is underestimated by the majority of the models, whereas the k-ε model shows excellent agreement for the drag coefficient as well as for the lift coefficient. Velocity profiles in the boundary layer obtained with the k-ε model are plotted for the same locations along the airfoil as for the Baldwin-Lomax model described before, see Fig. 2. The comparison between the numerical and experimental results shows good agreement near the wall at the first location, where the flow is attached (see Fig. 4). At the second location, the experimental results suggest separated flow, whereas the k-ε as well as the Baldwin-Lomax calculations suggest attached flow.
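The statements above about over- and underprediction can be checked directly from the values of Table 1. The sketch below computes the relative deviation of each model from the F2 wind tunnel measurement. It assumes that the values in the flattened table are listed in the same order as the model names; the dictionary layout and function name are illustrative only.

```python
# (CL, CD) per model, read from Table 1 in the order the rows are listed.
TABLE_1 = {
    "Baldwin-Lomax": (1.067, 0.0152),
    "Granville": (1.046, 0.0148),
    "Spalart-Allmaras": (1.002, 0.0143),
    "Chien k-epsilon": (1.032, 0.0155),
}
EXPERIMENT = (1.033, 0.0155)  # F2 wind tunnel, ONERA


def percent_deviation(computed, measured):
    """Signed deviation of a computed coefficient from the measurement, in percent."""
    return 100.0 * (computed - measured) / measured


for model, (cl, cd) in TABLE_1.items():
    d_cl = percent_deviation(cl, EXPERIMENT[0])
    d_cd = percent_deviation(cd, EXPERIMENT[1])
    print(f"{model:18s} CL {d_cl:+6.2f}%   CD {d_cd:+6.2f}%")
```

A positive value means the model overpredicts the coefficient. With the tabulated values the two algebraic models overpredict lift by about 3.3% and 1.3%, the Spalart-Allmaras model underpredicts it by about 3.0%, and the Chien model stays within 0.1% in lift and matches the measured drag.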
Figure 3 Pressure coefficient along the chord for the A-Airfoil

17.6.2 Delta wing
Although the geometry of a delta wing is simple, the simulation of the vortex-dominated flow over the wing at high angle of attack is a challenging problem due to the interaction of multiple vortices, the possible interaction of shock waves with the vortices, and the fact that the flow is most often turbulent. Here results are presented of the flow over the 65° swept delta wing with a round leading edge. Results are presented for a turbulent calculation with M∞ = 0.85, α = 12° and Re/m = 9 x 10^6 based on the root wing chord. This test case was used for 3 reasons:
• to assess the influence of the space discretization scheme on the results
Table 1 Comparison between the force coefficients computed with the different turbulence models and the ones measured on the A-airfoil

Turbulence model                                  CL       CD
Baldwin-Lomax model                               1.067    0.0152
Granville model                                   1.046    0.0148
Spalart-Allmaras model                            1.002    0.0143
Chien k-ε model                                   1.032    0.0155
Experimental results (F2 wind-tunnel ONERA)       1.033    0.0155
Figure 4 Velocity profiles at two locations on the A-airfoil
• to assess the influence of the selected turbulence model on the results
• to study different convergence acceleration methods
In addition to the turbulent calculations at Re/m = 9 x 10^6, one calculation was made for Re/m = 72 x 10^6, and one laminar calculation was made. For all calculations, an 8 block mesh was used having in total 675'840 grid points. The maximum y+ values near the wall were between 0.2 and 0.5, and there were 80 points located between the wall and the free stream boundary. The calculations using the LU-SGS implicit scheme and/or the multigrid convergence acceleration were started directly on the finest grid. For all other calculations, a coarse grid having 84480 points was used to initialize the fine grid calculation.

17.6.2.1 Influence of the Space Discretization Scheme
Calculations were made using the Baldwin-Lomax turbulence model with the space discretization schemes listed in Table 2. It appeared impossible to obtain a converged solution using the central scheme with matrix dissipation, despite numerous tests using different parameters. Figure 5 shows the Cp at X/C = 0.4 for these six space discretization schemes. The results using the matrix dissipation are not converged, and represent an intermediate solution. As can be seen in Fig. 5, the space discretization scheme has a large influence on the pressure suction peak. The standard Jameson artificial dissipation implemented in NSMB is anisotropic, and for this reason is less dissipative than the Martinelli artificial dissipation. The third order upwind scheme is the least dissipative, and predicts the smallest width of the vortex. Both calculations using the central scheme + matrix dissipation come close to the result of the third order upwind scheme, but as mentioned, these calculations did not converge. At the windward side of the delta wing, all the computations
Table 2 Summary of the calculations made using different space discretization schemes for the Baldwin Lomax calculations over the 65° deltawing.

Legend in Fig. 5       Space discretization scheme
cp-ntf-re9e6-std       central scheme + standard Jameson dissipation
cp-ntf-re9e6-mar       central scheme + Martinelli scaling of the dissipation
cp-ntf-re9e6-mxu       central scheme + Matrix dissipation based on primitive variables
cp-ntf-re9e6-mxw       central scheme + Matrix dissipation based on conservative variables
cp-ntf-re9e6-upw       upwind scheme
cp-ntf-re9e6-3upw      third order upwind scheme
fall together, except for the non-converged calculations using the central scheme + matrix dissipation. Figure 6 shows the calculated Cp at X/C = 0.4 using the Baldwin-Lomax model and the one-equation models of Spalart-Allmaras and of Edwards-McRae, together with a laminar calculation. All calculations were made using the central scheme + Jameson dissipation. As can be seen in this figure, the laminar calculation predicts a secondary separation, which is not present in the turbulent calculations. The width of the vortex is the smallest for the Spalart-Allmaras model. The one-equation turbulence models predict a lower pressure suction peak compared to the Baldwin-Lomax model. The effect of the Reynolds number is shown in Fig. 7, which shows the calculated results at two Reynolds numbers, together with the laminar calculation at Re/m = 9 x 10^6. These three calculations were made using the third order upwind scheme. As mentioned before, a secondary separation is visible for the laminar calculation. The calculation at Re/m = 72 x 10^6 yields the highest suction peak, and here too a small secondary separation is visible. The effect of the Reynolds number is an increase in suction peak, and a small shift of the vortex core towards the symmetry plane. Figure 8 shows the particle traces above the wing for the turbulent calculation at Re/m = 9 x 10^6, and the vortex is clearly visible. Several calculations were made using the Baldwin-Lomax turbulence model and different convergence acceleration methods. The L2-residues as a function of the time step are shown in Fig. 9, and the calculations are summarized in Table 3. All calculations were made on the NEC SX4 of the Swiss National Computing Centre in Manno, using local time stepping. As can be seen in
Figure 5 Cp at X/C = 0.4 using different space discretization schemes
Fig. 9, the calculations using the two upwind schemes combined with the full matrix LU-SGS implicit scheme converged the fastest in terms of number of time steps. For the central scheme, the calculation using both multigrid and the scalar LU-SGS implicit scheme yields the fastest result (again in terms of number of time steps), followed by the calculation using the scalar LU-SGS implicit scheme alone. Table 3 summarizes the different calculations made, and includes the information on the computational resources needed and the performance obtained on the NEC SX4. When calculating the cost per time step, it appears that the calculation with the central scheme + Martinelli artificial dissipation using the implicit scalar LU-SGS scheme is the least expensive, while the central scheme + Martinelli with explicit Runge-Kutta and multigrid is the most expensive. Using the cost per time step, the costs to reduce the residues to 10^-2 were determined for the different schemes. This shows that the central scheme + Martinelli artificial dissipation with the implicit scalar LU-SGS scheme requires about 2400 CPU seconds, while the central scheme + Martinelli artificial dissipation with the explicit Runge-Kutta scheme needs about 5200 CPU seconds. The upwind and third order upwind schemes using the implicit matrix LU-SGS scheme require respectively 5100 and 5760 CPU seconds. All other time integration schemes are substantially more expensive. It should be mentioned that the upwind schemes are more expensive than the central scheme with artificial dissipation, which explains partly the differences
Figure 6 Calculated results 65° deltawing using different turbulence models.
between central and upwind schemes. From Table 3, it can be seen that the calculations with the upwind schemes using the matrix LU-SGS implicit scheme require about 45% more memory than the central scheme using explicit Runge-Kutta time stepping, or the central scheme using the scalar LU-SGS scheme.

17.6.3 AS28G Full Aircraft
The AS28G test case is representative of a full aircraft configuration, which includes the engine/airframe integration where viscous effects are important, especially on the pylon, where flow separation occurs. The finest grid contains 62 blocks with 3.5 million grid points. Figure 10 shows the surface grid for this configuration. The flow condition presented here represents the cruise condition for which the pylon is optimized: Re/m = 11.16 x 10^6, α = 2.2° and M = 0.8.

17.6.3.1 Flow analysis and comparison with experiments
Centered space differencing was used here with coefficients for the second and fourth order artificial dissipation set to 0.8 and 0.015, respectively. The artificial dissipation has been damped in the boundary layer by multiplying the wall normal artificial dissipation fluxes with a function of the local Mach
Figure 7 Calculated results 65° deltawing at two Reynolds numbers.
number which tends to 0 at the wall and to 1 at the edge of the boundary layer. This damping has a large influence on the correct prediction of the boundary layer and of the shock position, as shown in [7], [21]. The calculation has been started with a free stream initial solution and a CFL number of 1, using the scalar LU-SGS scheme. The CFL number has been increased linearly up to 10^9 after approximately 700 iterations. After 5000 iterations the Li norm of the residual (ρ) has been reduced by four orders of magnitude using a 109 block mesh on the Cray T3D, a 75 block mesh on 32 IBM SP2 processors and the original 62 block mesh on the Fujitsu VX. This corresponds to a total elapsed time of approximately 30 hours on the T3D and 55 hours on the SP2. No stability problems were encountered, even when starting with a free stream initial solution and using very small values for the artificial dissipation. Lift and drag coefficient histories are given in Fig. 11 and show converged values after about 4000 iterations. For computations on the Cray T3D using 128 processors the mesh has been split into 276 blocks. The resulting number of blocks is such that at least two blocks are located on each processor. In order to obtain a reasonable load balance, the real CPU timings were measured and the blocks were redistributed using these timings. This procedure requires at least two blocks per processor. However, when using 276 blocks, the standard LU-SGS scheme with explicit coupling between blocks led to divergence. The scheme was then modified such that several sweeps are performed with an update of the block
Figure 8 Particle traces depict the leading-edge vortex over the 65° delta wing.
connectivity boundary conditions between each sweep. At convergence of the iterative method, the update ΔU corresponding to a 1 block mesh would be obtained. This method requires more communication, but as several sweeps are performed the computational work also increases, so that the ratio of communication time to elapsed time increases only slowly [23]. Performing three sweeps in the manner mentioned above, a converged solution could be obtained after approximately 2000 iterations. The convergence rates of a 276 block and a 109 block calculation using the same algorithm are shown in Fig. 12. For the first three decades of residual reduction, the two calculations converge identically. Then the convergence of the 276 block calculation seems to be somewhat degraded. However, after 2000 iterations the residual could be reduced by four decades. When looking at the pressure coefficient history, it seems that the error is damped somewhat more slowly when using 276 blocks. It was not possible to get a converged solution using the explicit Runge-Kutta scheme, but a Diagonal-ADI scheme has been used on the IBM SP2. The ADI scheme is, however, very sensitive to parameter values in the turbulence models, and divergence occurs for CFL numbers larger than 0.1-0.2. Convergence rates of the ADI and the LU-SGS schemes are shown in Fig. 13. Calculations with the ADI scheme were stopped after about 60 000 iterations. The residual had by then barely been reduced two decades, and the lift and drag coefficients, but Cx was still about 20% from the correct value. Computed lift (CL) and drag (CD) coefficients based on wind-axis are given in Table 4 and compared with wind tunnel measurements, for different computers and block splittings. All computations were made with the Baldwin-Lomax turbulence model and the LU-SGS scheme. The coefficient for the second order artificial dissipation was reduced to 0.4
Table 3 Summary of results using different time stepping methods for the Baldwin Lomax calculations over the 65° deltawing.

space discretization    time integration                         steps   CPU (sec)   memory (Mb)   MFlops
central + standard      explicit Runge Kutta                     9000    34420       334           810
central + matrix        explicit Runge Kutta                     6000    30864       339           941
central + Martinelli    explicit Runge Kutta                     6000    23091       334           810
central + Martinelli    explicit Runge Kutta + multi grid        3000    47203       448           824
central + Martinelli    implicit scalar LU-SGS + multi grid      2000    25406       476           724
central + Martinelli    implicit scalar LU-SGS                   5000    15056       339           714
upwind                  implicit matrix LU-SGS                   2000    23970       481           520
third order upwind      implicit matrix LU-SGS                   2000    22163       481           495
on the Fujitsu and to 0.25 on the T3D, compared to 0.8 originally. It was thought that reducing this coefficient would move the shock on top of the wing further forward, and thereby give a better prediction of lift and drag. However, this change had no effect on the shock position, nor on CL and CD. As expected, the splitting of the blocks had only a minor influence on the final result. Figures 14 to 16 plot the pressure coefficient, -Cp, at different positions on the wing. The NSMB results using LU-SGS and Baldwin-Lomax are compared with wind tunnel measurements, starting with station 2 at y/(2b) = η = 0.29 and ending with station 4 at y/(2b) = η = 0.57. The half-span is b/2 = 3.704 m. The Cp distribution on the lower side of the wing is very well predicted for all stations, but the shock position is slightly behind the correct position, especially for station 3 (Fig. 15) close to the pylon (at η = 0.35). Closer to the wing tip, at station 4 (Fig. 14), the shock position is much better predicted. Thus it seems that the pylon further complicates the shock-boundary layer interaction to the extent that a simple algebraic turbulence model loses accuracy and cannot predict the shock position with precision. The Baldwin-Lomax turbulence model is known to have difficulties calculating shock-boundary layer interaction correctly and in such cases usually predicts the shock position downstream of its correct position, which corresponds to the results presented here. A better prediction of the shock-boundary layer interaction would demand a more advanced turbulence model and hence a larger elapsed time, due to both more floating-point operations per time step and slower convergence. The calculated skin friction lines on the interior side of the pylon are compared with an oil flow picture in Fig. 17. On the upper part of the picture one can see the wing, on the lower right part the engine and in between the pylon. The qualitative flow behaviour has been correctly reproduced by the numerical simulation.

Figure 9 Convergence history for different time stepping methods, 65° deltawing using the Baldwin Lomax turbulence model.

17.6.3.2 HPC performance
Timings for the parallel computations using 32 and 64 processors on the IBM SP2 and 64 processors on the Cray T3D have been compared with timings using a single processor on two vector computers, NEC SX4 and Fujitsu VX. Table 5 shows timings per iteration for the different computers. The Fujitsu
Figure 10 Surface grid of the AS28G aircraft configuration.
timings are for the original 62 block mesh, while the timings for the SP2 are for a 75 block mesh using 32 processors and a 106 block mesh using 64 processors. The Cray T3D and NEC SX4 timings are for a 109 block mesh. Timings for 32 SP2 processors are for both the ADI and the LU-SGS schemes, while the 64 processor timings are for the ADI scheme only. The computational time without communication is given in the second column. The whole difference between the timings for the time stepping routine and the total elapsed time cannot be attributed to communication, but a large part of it can. All computations have been made with the Baldwin-Lomax turbulence model and all parallel computations have been made with the PVM message passing library. The performance rate for the NEC SX4 calculation is around 500 MFlops, which can be compared with the other timings. The LU-SGS and the ADI schemes take about the same time per iteration, as shown for the 32 processor SP2 cases in Table 5. The performance rate on the SP2 is very low, which can be explained by the low communication bandwidth of the public domain PVM on the SP2 (less than 10 MB/s). This is more evident for the fine AS28G mesh than for the A-airfoil, due to the larger amount of data to send. The performance rate using NSMB on the Fujitsu, for a case with a favorable vector length, can be as high as 900 MFlops, slightly higher for the NEC SX4, and around 40 MFlops for the "thin" SP2 processor. This would demand around 20 to 25 SP2 processors to get the same performance as the Fujitsu and NEC without considering communication, but the problem with load balancing and efficient parallelization makes it difficult
Table 4 Lift (CL) and drag (CD) coefficients for the AS28G, using LU-SGS and Baldwin-Lomax in NSMB.

Computation                                           CL        CD
IBM SP2, 75 blocks                                    0.5234    0.0307
Cray T3D, 109 blocks                                  0.5231    0.0307
Cray T3D, 109 blocks, Reduced Artdiss2                0.5300    0.0305
Cray T3D, 109 blocks, Reduced Artdiss2, 3 sweeps      0.5282    0.0302
Cray T3D, 276 blocks, Reduced Artdiss2, 3 sweeps      0.5289    0.0302
Fujitsu VX, 62 blocks                                 0.5247    0.0306
NEC SX4, 62 blocks                                    0.5233    0.0303
Fujitsu VX, Reduced Artdiss2                          0.5297    0.0303
Windtunnel                                            0.5145    0.0249
Computer Fujitsu (LU-SGS) NEC SX4 (LU-SGS) 32 P. SP2 (ADI) 32 P. SP2 (LU-SGS) 64 P. SP2 (ADI) 64 P. SP2 (LU-SGS) 64 P. T3D (LU-SGS)
Col
fim^f°n 0 0 44 51 48 63 33
, Total elapse .time 58 40 1100 57 860 42 30.5
Table 5 Timings for the fine AS28G mesh using the ADI and scalar LU-SGS schemes on different computers.
Figure 11 Lift (CL) and drag (CD) coefficients as a function of iteration, for the AS28G mesh, using LU-SGS and Baldwin-Lomax.
to get close to the theoretical performance, which is why vector computers like the Fujitsu and NEC achieve a good price/performance compared to modern parallel computers. Timings on the Cray T3D for the LU-SGS scheme performing 3 sweeps with update of the block boundary conditions between each sweep are given in Table 6. At first sight the speedup of approximately 2.5 comparing 128 and 64 processors is surprising. This is due to the poor load balance on 64 processors. Because of memory restrictions when using 64 processors, the mesh could not be split in an optimal way. Using 128 processors a converged solution could be obtained after 10.5 hours. This run time can be considered acceptable for aerodynamic design work, where such flow simulations must be completed in overnight runs.
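The estimate of 20 to 25 SP2 processors quoted earlier in this section follows from the ratio of the sustained rates (about 900 MFlops on the Fujitsu against about 40 MFlops per "thin" SP2 processor). The sketch below reproduces this arithmetic. The `parallel_efficiency` parameter is a hypothetical addition, not a quantity reported in the text; it only illustrates how communication and load imbalance raise the processor count.

```python
import math


def processors_needed(target_mflops, per_processor_mflops, parallel_efficiency=1.0):
    """Processors required to match a target sustained rate.

    parallel_efficiency = 1.0 neglects communication and load imbalance,
    which is the assumption behind the estimate in the text.
    """
    effective_rate = per_processor_mflops * parallel_efficiency
    return math.ceil(target_mflops / effective_rate)


# Rates quoted in the text: about 900 MFlops (Fujitsu), about 40 MFlops (thin SP2 node).
print(processors_needed(900, 40))        # ideal case, no communication cost
print(processors_needed(900, 40, 0.5))   # hypothetical 50% parallel efficiency
```

In the ideal case 23 processors are needed, within the quoted range of 20 to 25; at an assumed 50% efficiency the count doubles to 45.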
17.7 Summary and conclusions
We have presented our recent work on the three fronts of Navier-Stokes simulations, namely the one-equation Spalart-Allmaras turbulence model, the matrix LU-SGS implicit scheme and their implementation on IBM SP2, Cray T3D, NEC SX4 and Fujitsu VX. We have investigated how these advances in the code have performed in the context of three different test cases. For the testcase of the aircraft configuration we have shown that with the Baldwin-
Figure 12 Comparison of convergence histories using 109 and 276 blocks performing 3 sweeps with update of the block connectivity boundary conditions.
Computer 64 Procs 128 Procs
Coi
%$£mtion
. Total elapse .time
35 26
24.7 10.5
Table 6 Timings for the fine AS28G mesh on the Cray T3D using the LU-SGS scheme with 3 sweeps and update of block connectivity boundary conditions.
Lomax turbulence model a Navier-Stokes simulation of 3.5 M grid points converges in 2000 time steps and can be run overnight on the Cray T3D with 128 processors. The computed results agree with experiments to within about 2% in lift and about 4% in drag. The standard public domain PVM was used as message passing library for the calculations performed on the IBM SP2, which, for 32 processors, gave a communication time that was as large as the computational time because the bandwidth of the public domain PVM on the SP2 is less than 10 MB/s. Calculations with 64 processors on the Cray T3D gave a much lower communication time, due to the faster network, but suffered from a poor load balance, which gave a poor parallel performance, despite the fast network. Further partitioning the domain could be a solution, but this case did not fit on the computer due to the memory limit of the processor. To alleviate this limit, a partitioning for 128 processors was tried, but lead to
Figure 13 L2 residual based on the density, for the LU-SGS scheme and the Diagonal-ADI scheme, both using Baldwin-Lomax and 32 SP2 processors.
numerical instability due to the explicit coupling between blocks. Performing three sweeps in the LU-SGS scheme instead of one increased the coupling, proved to be stable and led to convergence in 2000 time steps. In addition, runs were made on the vector machines NEC SX4 and Fujitsu VX. The computation for the fine mesh (3.5 million points) on a single NEC SX4 processor took as long as on approximately 50 T3D processors or 45 SP2 processors. The corresponding figures for the Fujitsu VX were 34 and 32 respectively, with a code fairly well optimized for vector processing. This makes a modern vector computer a competitive choice with respect to price/performance. The implicit scheme converges rapidly and is robust: runs have been made with a CFL number of one billion. At present, however, only algebraic turbulence models have been included in the method.
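The effect of repeating the sweeps with an update of the block connectivity boundary conditions can be illustrated on a small model problem. The sketch below is not the LU-SGS scheme of NSMB: it swaps in a block Jacobi iteration with direct block solves on an assumed 1D tridiagonal system, tridiag(-1, 4, -1), split into two blocks. All names and sizes are hypothetical. It only demonstrates the property stated in Section 17.6.3.1, namely that with more sweeps the blocked result approaches the update that a one block mesh would give.

```python
OFF, DIAG = -1.0, 4.0  # assumed model matrix: tridiag(OFF, DIAG, OFF)


def thomas(rhs):
    """Direct solve of tridiag(OFF, DIAG, OFF) x = rhs (Thomas algorithm)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = OFF / DIAG
    d[0] = rhs[0] / DIAG
    for i in range(1, n):
        pivot = DIAG - OFF * c[i - 1]
        c[i] = OFF / pivot
        d[i] = (rhs[i] - OFF * d[i - 1]) / pivot
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def block_sweeps(rhs, split, sweeps):
    """Two-block solve; interface values are exchanged after every sweep."""
    ghost_left = 0.0   # left block's copy of x[split]
    ghost_right = 0.0  # right block's copy of x[split - 1]
    for _ in range(sweeps):
        rhs_left = rhs[:split]
        rhs_right = rhs[split:]
        rhs_left[-1] -= OFF * ghost_left    # frozen neighbor value moved to the RHS
        rhs_right[0] -= OFF * ghost_right
        x_left = thomas(rhs_left)
        x_right = thomas(rhs_right)
        # update of the block connectivity boundary condition
        ghost_left, ghost_right = x_right[0], x_left[-1]
    return x_left + x_right


def max_error(rhs, split, sweeps):
    """Largest difference between the blocked and the one block solution."""
    reference = thomas(rhs)
    blocked = block_sweeps(rhs, split, sweeps)
    return max(abs(a - b) for a, b in zip(reference, blocked))


rhs = [1.0] * 8
for sweeps in (1, 3, 40):
    print(sweeps, max_error(rhs, 4, sweeps))
```

A single sweep corresponds to purely explicit coupling between the blocks; each additional sweep reduces the interface error, at the price of one more exchange of connectivity data.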
Acknowledgements

The development of the parallel version of NSMB was done in the frame of the ESPRIT III Project Parallel Aero. The Swedish participation in this project was funded by NUTEK, the Swedish National Board for Industrial and Technical Development. The Swiss participation in this project was funded by the Swiss Ministry of Education and Science (OFES).
Figure 14 Cp distribution on the wing of AS28G, at station 1, 57% of half-span.
The extensive use of the computers at the Center for Parallel Computers (PDC), KTH, Sweden is hereby gratefully acknowledged. The Computer Support Group at CERFACS is acknowledged for their help on the Cray T3D. Aerospatiale, Toulouse is gratefully acknowledged for providing the AS28G mesh and wind tunnel data.
REFERENCES
1. Amestoy P.R. and Dayde M.J., Porting Industrial Codes on High Performance Computers, in: High Performance Computing in Fluid Dynamics, Editor: P. Wesseling, Kluwer Academic Publishers, ERCOFTAC Series, March 1996.
2. Baldwin B.S. and Lomax H., Thin Layer Approximation and Algebraic Model for Separated Turbulent Flows, AIAA Paper 78-257, 1978.
3. Caughey D.A., Diagonal Implicit Multigrid Algorithm for the Euler Equations, AIAA Journal, Vol. 26, No. 7, 1988, pp. 841-851.
4. Chakravarthy S.R., High Resolution Upwind Formulations for the Navier-Stokes Equations, in: VKI LS 1988-05, Computational Fluid Dynamics, 1988.
5. Chaussee D.S. and Pulliam T.H., Two-Dimensional Inlet Simulation using a Diagonal Implicit Algorithm, AIAA Journal, Vol. 19, Feb. 1981, pp. 153-159.
6. Chien K.-Y., Predictions of Channel and Boundary-layer Flows with a Low-Reynolds-Number Turbulence Model, AIAA Journal, Vol. 20, Jan. 1982, pp. 33-38.
7. Gacherieu C., Etude d'un modèle de turbulence algébrique 3D: Application au calcul Navier-Stokes de l'écoulement autour d'une installation motrice d'avion de transport, PhD thesis No. 1248, Institut National Polytechnique de Toulouse, 1996.
APPLICATIONS IN AERODYNAMICS
Figure: Cp distribution on the wing of AS28G, at station 3, 42% of half-span.
Figure 17 Skin friction lines on the interior side of the pylon
18 Incompressible Navier-Stokes Computations in Aerospace Applications and Beyond
D. Kwak1, C. Kiris2, J. Dacles-Mariani3, S. Rogers1 & S. Yoon1
Abstract
Recent progress in developing computational procedures for viscous incompressible flow is presented. Two different flow solvers are examined, one using an artificial compressibility method and the other using a pressure projection method, followed by applications to problems of engineering interest. Computed examples include pulsatile flow in a constricted channel, wing tip vortex formation and propagation, advanced impeller analysis for a liquid rocket engine, and the development of a ventricular assist device. The last example serves as an extension of this computational technology to a non-aerospace application. Numerical issues arising in these applications are discussed from both the algorithm development and the application points of view.
18.1 Introduction
From an aerodynamicist's point of view, incompressible flow can be considered a limiting case of compressible flow in which the flow speed becomes very small compared with the speed of sound. A large number of flow problems of practical importance in aerospace and beyond belong to this category, and such flows can be treated as essentially incompressible. The incompressible Navier-Stokes equations, which represent these flows, pose a special problem of satisfying the mass conservation equation because it is not coupled to the momentum equations. Physically, these equations are characterized by the elliptic behavior of the pressure waves, the speed of which is infinite. Various methods have been developed in the past, which can be classified in numerous ways depending on the choice of formulations, variables, or algorithms.

Since three-dimensional applications involving complex geometries are of primary interest, the primitive variable formulation is chosen in the present study. The primitive variables, namely the pressure and the velocities, are easier to define in real geometries than derived quantities such as stream function or vorticity. Therefore, for convenience and flexibility, primitive variable formulations were used for developing the incompressible Navier-Stokes codes (the INS3D family of codes) at NASA Ames Research Center. In the present paper, some algorithmic features of the two versions of INS3D are discussed, followed by a presentation of computed results. The solution procedures presented here are within a structured-grid framework. A general review of our work can be found in Kwak et al. [1,2]. During the past few years, a large number of review articles and books on CFD have included discussions of incompressible flow computations. For a more comprehensive review of computational methods for incompressible flow in general, readers are referred to these materials, e.g., Hirsch [3] and Hafez and Oshima [4].

1 Advanced Computational Methods Branch, NASA Ames Research Center, Moffett Field, CA 94035
2 MCAT, Inc., Mountain View, CA
3 University of California, Davis, CA
Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez.
© 1998 World Scientific
KWAK,KIRIS,DACLES-MARIANI,ROGERS & YOON
18.2 Solution Methods

In this section, two solution methods used in the development of INS3D are reviewed. The governing equations are given first, followed by a discussion of the two methods.

18.2.1 Formulation
Three-dimensional incompressible flow with constant density is governed by the following Navier-Stokes equations:

$$\frac{\partial u_i}{\partial x_i} = 0 \tag{18.1}$$

$$\frac{\partial u_i}{\partial t} + \frac{\partial (u_i u_j)}{\partial x_j} = -\frac{\partial p}{\partial x_i} + \frac{\partial \tau_{ij}}{\partial x_j} \tag{18.2}$$

where $t$ is the time, $x_i$ the Cartesian coordinates, $u_i$ the corresponding velocity components, $p$ the pressure, and $\tau_{ij}$ the viscous-stress tensor. All the variables have been nondimensionalized by a reference velocity and length scale. In generalized curvilinear coordinates, $(\xi, \eta, \zeta)$, these equations can be written as
INCOMPRESSIBLE NAVIER-STOKES
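$$\frac{\partial \hat{u}}{\partial t} = -\frac{\partial}{\partial \xi_i}\left(\hat{e}_i - \hat{e}_{vi}\right) + s = -r \tag{18.3}$$

$$\frac{\partial}{\partial \xi_i}\left(\frac{U_i}{J}\right) = 0 \tag{18.4}$$

where $\xi_i = \xi$, $\eta$, or $\zeta$ for $i = 1$, 2, or 3, and

$$\hat{u} = \frac{1}{J}\begin{bmatrix} u \\ v \\ w \end{bmatrix}, \qquad
\hat{e}_i = \frac{1}{J}\begin{bmatrix} (\xi_i)_x\, p + u\,U_i \\ (\xi_i)_y\, p + v\,U_i \\ (\xi_i)_z\, p + w\,U_i \end{bmatrix}, \qquad
\hat{e}_{vi} = \frac{\nu}{J}\,\nabla\xi_i \cdot \nabla\xi_j\, \frac{\partial}{\partial \xi_j}\begin{bmatrix} u \\ v \\ w \end{bmatrix}$$

$$U_i = (\xi_i)_t + (\xi_i)_x\, u + (\xi_i)_y\, v + (\xi_i)_z\, w \tag{18.5}$$

$J$ = Jacobian of the transformation
$\nu$ = kinematic viscosity
$s$ = source term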
The source term $s$ is used to represent centrifugal and Coriolis forces in a steady rotating reference frame. For most flow applications, this term is set to zero.

18.2.2 Method Based on Compressible Flow Algorithm: Artificial Compressibility Method

Major advances in CFD have been made in conjunction with compressible flow computations. Therefore, it is of significant interest to be able to use some of these algorithms for incompressible flow. To do this, the artificial compressibility method of Chorin [5] can be used. In this formulation, the continuity equation is modified by adding a time derivative of the pressure, resulting in

$$\frac{1}{\beta}\frac{\partial p}{\partial t} + \frac{\partial u_i}{\partial x_i} = 0 \tag{18.6}$$

where $\beta$ is an artificial compressibility parameter. Together with the unsteady momentum equations, this forms a hyperbolic-parabolic type of time-dependent system of equations. Thus, implicit schemes developed for compressible flows can be implemented. It is to be noted that $t$ no longer represents a true physical time in this formulation. Physically, this means that waves of finite speed are introduced into the incompressible flow field as a medium to distribute the pressure. For a truly incompressible flow, the wave speed is infinite, whereas the speed of propagation of these pseudo waves depends on the magnitude of the artificial compressibility parameter. In a truly incompressible flow, the pressure field is affected instantaneously by a disturbance in the flow, but with artificial compressibility, there is a time lag between the flow disturbance and its effect on the pressure field.

Ideally, the value of the artificial compressibility parameter is to be chosen as high as the particular choice of algorithm will allow, so that incompressibility is recovered quickly. This has to be done without degrading the accuracy and the stability of the numerical method implemented. On the other hand, if the artificial compressibility is chosen such that these waves travel too slowly, then the variation of the pressure field accompanying these waves is very slow. This will interfere with the proper development of the viscous boundary layer. In viscous flows, the behavior of the boundary layer is very sensitive to the streamwise pressure gradient, especially when the boundary layer is separated. If separation is present, a pressure wave traveling with finite speed will cause a change in the local pressure gradient, which will affect the location of the flow separation. This change in the separated flow will feed back to the pressure field, possibly preventing convergence to a steady state. When the viscous effect is important for the entire flow field, as in most internal flow problems, the interaction between the pseudo pressure waves and the viscous flow field is especially important.

Artificial compressibility relaxes the strict requirement of satisfying mass conservation in each step. However, to utilize this convenient feature, it is essential to understand the nature of the artificial compressibility both physically and mathematically. Chang and Kwak [6] reported details of the artificial compressibility and suggested some guidelines for choosing the artificial compressibility parameter.
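The dependence of the pseudo-wave speed on the artificial compressibility parameter can be illustrated with a one-dimensional model. The sketch below assumes the inviscid 1-D system p_t + beta*u_x = 0, u_t + (u^2 + p)_x = 0, which is a simplification chosen for illustration and not the INS3D formulation; the function name and the sample values of beta are hypothetical.

```python
import math


def pseudo_wave_speeds(u, beta):
    """Return the two characteristic speeds of a 1-D artificial compressibility model.

    Assumed model (illustrative only):
        p_t + beta * u_x = 0
        u_t + (u*u + p)_x = 0
    The flux Jacobian with respect to (p, u) is [[0, beta], [1, 2u]], so its
    eigenvalues follow from the trace (2u) and the determinant (-beta).
    """
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    half_trace = u
    det = -beta
    root = math.sqrt(half_trace * half_trace - det)  # sqrt(u^2 + beta)
    return half_trace - root, half_trace + root


if __name__ == "__main__":
    u = 1.0  # nondimensional convective speed (sample value)
    for beta in (0.1, 1.0, 10.0, 100.0):
        upstream, downstream = pseudo_wave_speeds(u, beta)
        print(f"beta = {beta:6.1f}: speeds {upstream:8.3f} and {downstream:8.3f}")
```

In this model one wave always travels upstream and one downstream, and both speeds grow roughly like the square root of beta, which is consistent with the time-lag argument given above.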
Various applications which evolved from this concept have been reported for obtaining steady-state solutions (e.g., [7-9]). To obtain time-dependent solutions using this method, an iterative procedure can be applied in each physical time step such that the continuity equation is satisfied (see [10-12]). Further discussions on the artificial compressibility approach can be found in Refs. [13,14].

18.2.2.1 Steady State Formulation

Combining equation (18.6) and the momentum equations gives the following system of equations:
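$$\frac{\partial \hat{D}}{\partial \tau} = -\frac{\partial}{\partial \xi_i}\left(\hat{E}_i - \hat{E}_{vi}\right) + s = -R \tag{18.7}$$

where $R$ is the right-hand side of the momentum equation and can be defined as the residual for steady-state computations, and where

$$\hat{D} = \frac{D}{J} = \frac{1}{J}\begin{bmatrix} p \\ u \\ v \\ w \end{bmatrix}, \qquad
\hat{E}_i = \begin{bmatrix} \beta U_i / J \\ \hat{e}_i \end{bmatrix}, \qquad
\hat{E}_{vi} = \begin{bmatrix} 0 \\ \hat{e}_{vi} \end{bmatrix} \tag{18.8}$$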
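An unfactored implicit scheme can be obtained by linearizing the flux vectors about the previous time step. For the Euler implicit case, this can be written as

$$\left[\frac{I}{J\,\Delta\tau} + \left(\frac{\partial R}{\partial D}\right)^{n}\right]\left(D^{n+1} - D^{n}\right) = -R^{n} \tag{18.9}$$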
This equation is iterated in pseudo-time until the solution converges to a steady state, at which time the original incompressible Navier-Stokes equations are satisfied. A direct inversion of equation (18.9) would become a Newton iteration for a steady-state solution. In three dimensions, however, direct inversion of the large block-banded matrix of the unfactored scheme would be impractical.

Factored schemes

To overcome the difficulties of an unfactored implicit scheme, indirect methods have been devised by many researchers. The alternating direction implicit (ADI) scheme of Beam and Warming [15] and Briley and McDonald [16] approximates the implicit operator in the unfactored scheme by a product of three one-dimensional operators. It is difficult to apply the ADI scheme to equation (18.9) in its full matrix form. Noting that at the steady state the left-hand side of equation (18.9) approaches zero, a simplified expression for the viscous terms can be used on the left-hand side. To maintain the accuracy of the solution, the entire viscous terms are used on the right-hand side. The inversion is reduced to a series of one-dimensional problems. This scheme is unconditionally stable in two dimensions. In three dimensions, with the use of numerical dissipation, this scheme becomes conditionally stable. The ADI scheme introduces a factorization error which reduces the rate of convergence. However, this was perhaps the first implicit scheme introduced to CFD, and it made viscous flow computations more feasible. With its cost reduced by diagonalization of the Jacobian matrices [17,18], this scheme has been used in the development of many flow solvers. At NASA Ames, the first incompressible Navier-Stokes solver in generalized three-dimensional coordinates, the INS3D code (Kwak et al. [19]; Chang and Kwak [6]), was developed using this approach (for the user's guide, see Rogers et al. [20]). The INS3D code was used extensively as a primary tool for an upgraded design of the Space Shuttle main engine (SSME) hot gas manifold (see [9]). This was one of the early contributions made by CFD to rocket propulsion systems development.

Several two-factor schemes have been developed to improve the stability and convergence of the ADI scheme. The LU-SGS scheme, which combines the advantages of LU factorization and symmetric Gauss-Seidel relaxation, was applied to the artificial compressibility formulation by Yoon and Kwak [21]. In this method, the derivatives of the viscous fluxes are approximated using second-order central differences. The convective flux terms are discretized using central differences, which require numerical dissipation terms for stability. By choosing different numerical dissipation models and Jacobian matrices, a variety of schemes can be constructed.
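The relation between the Euler implicit delta form of equation (18.9) and a Newton iteration can be sketched on a small algebraic system. The example below is a minimal illustration under stated assumptions: the two-variable residual, its root at (1, 2), and all function names are hypothetical stand-ins for the discretized flow residual, and the Jacobian factor J of the transformation is taken as one.

```python
def residual(d):
    """Toy steady-state residual R(D) with a root at D = (1, 2)."""
    return [d[0] ** 2 + d[1] - 3.0, d[0] + d[1] ** 2 - 5.0]


def jacobian(d):
    """Exact Jacobian dR/dD of the toy residual."""
    return [[2.0 * d[0], 1.0], [1.0, 2.0 * d[1]]]


def solve_2x2(a, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det
    return [x0, x1]


def pseudo_time_iterate(d0, dtau, tol=1e-10, max_iter=500):
    """Iterate (I/dtau + dR/dD) * dD = -R until max|R| < tol.

    Returns the converged state and the number of updates performed.
    """
    d = list(d0)
    for it in range(max_iter):
        r = residual(d)
        if max(abs(r[0]), abs(r[1])) < tol:
            return d, it
        a = jacobian(d)
        a[0][0] += 1.0 / dtau  # pseudo-time term on the diagonal
        a[1][1] += 1.0 / dtau
        delta = solve_2x2(a, [-r[0], -r[1]])
        d = [d[0] + delta[0], d[1] + delta[1]]
    raise RuntimeError("pseudo-time iteration did not converge")


if __name__ == "__main__":
    for dtau in (1.0, 10.0, 1.0e6):
        state, count = pseudo_time_iterate([1.5, 1.5], dtau)
        print(f"dtau = {dtau:9.1f}: D = ({state[0]:.8f}, {state[1]:.8f}) "
              f"after {count} updates")
```

As the pseudo-time step grows, the diagonal term vanishes and the update approaches a Newton step, so the number of updates drops sharply. In a real solver the linear solve, done here exactly, is the expensive part that factored and relaxation schemes approximate.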
Nonfactored schemes

To remove factorization error, a line-relaxation implicit scheme similar to that employed by MacCormack [22] can be formed, not through factorization of the left-hand-side matrix but through an iterative solution process. The discrete form of the matrix on the left-hand side of equation (18.9) is a banded matrix, which is solved approximately using an iterative approach. One of the three computational directions is chosen to be the implicit direction, and the sweeping through the domain proceeds in the other two directions. The algorithm is implemented so that any or all of the three computational directions can be chosen for the sweep direction. The optimum direction and number of sweeps are very much problem dependent. Experience with this algorithm has shown that for most problems it is best to use the wall-normal direction as the implicit direction, and that on the order of 10 sweeps should be used. This method was used by Rogers et al. [10] to develop a new artificial compressibility based flow solver, the INS3D-UP code. Since, in the artificial compressibility formulation, the governing equations are changed into a hyperbolic-parabolic type, the upwind differencing schemes developed for the compressible flow equations (Roe [23]) are incorporated into the flow solver. This leads to a more diagonally dominant system.

18.2.2.2 Time-Accurate Formulation

Time-dependent calculation of incompressible flow is especially time consuming due to the elliptic nature of the governing equations. Numerically, this means that in each time step, the pressure field has to go through one complete steady-state iteration. In transient flow, the physical time step usually has to be small and the change in the flow field during each time increment may be small. In this situation, the number of iterations in each time step for getting a divergence-free flow field may not be as high as that required for obtaining regular steady-state solutions. However, time-accurate computations are generally an order of magnitude more time-consuming than steady-state computations. Rogers et al. [10] developed a time-accurate method using artificial compressibility. In this formulation the time derivatives in the momentum equations are differenced using a second-order, three-point, backward-difference formula

$$\frac{3\hat{u}^{n+1} - 4\hat{u}^{n} + \hat{u}^{n-1}}{2\Delta t} = -r^{n+1} \tag{18.10}$$
where the superscript $n$ denotes the quantities at time $t = n\Delta t$ and $r$ is the right-hand side given in equation (18.3). To solve equation (18.10) for a divergence-free velocity at the $(n+1)$ time level, a pseudo-time level is introduced and is denoted by a superscript $m$. The equations are solved iteratively such that $\hat{u}^{n+1,m+1}$ approaches the new velocity $\hat{u}^{n+1}$ as the divergence of $\hat{u}^{n+1,m+1}$ approaches zero. To drive the divergence of this velocity to zero, the following iterative relation is introduced:

$$\frac{p^{n+1,m+1} - p^{n+1,m}}{\Delta\tau} = -\beta\,\nabla\cdot\hat{u}^{n+1,m+1} \tag{18.11}$$
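Combining equation (18.11) with the momentum equations, and linearizing the residual term at the $m+1$ pseudo-time level, results in the following equation in delta form:

$$\left[\frac{I_{tr}}{J} + \left(\frac{\partial R}{\partial D}\right)^{n+1,m}\right]\left(D^{n+1,m+1} - D^{n+1,m}\right) = -R^{n+1,m} - \frac{I_m}{\Delta t}\left(1.5\hat{D}^{n+1,m} - 2\hat{D}^{n} + 0.5\hat{D}^{n-1}\right) \tag{18.12}$$

where $I_{tr}$ is a diagonal matrix given by

$$I_{tr} = \mathrm{diag}\left[\frac{1}{\Delta\tau},\ \frac{1.5}{\Delta t},\ \frac{1.5}{\Delta t},\ \frac{1.5}{\Delta t}\right] \tag{18.13}$$

and $I_m = \mathrm{diag}[0, 1, 1, 1]$ is a modified identity matrix.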
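The structure of equations (18.10) to (18.12), a three-point backward difference in physical time converged by pseudo-time subiterations, can be sketched for a scalar model problem. The example below is an illustration only: the model equation du/dt = -u^2, the pseudo-time step, the tolerances, and the function names are assumptions and do not come from INS3D-UP.

```python
def r(u):
    """Model right-hand side: du/dt = -r(u) with r(u) = u**2."""
    return u * u


def dr_du(u):
    """Derivative of the model right-hand side, used in the linearization."""
    return 2.0 * u


def advance_bdf2(u_n, u_nm1, dt, dtau=0.1, tol=1e-10, max_sub=200):
    """Advance one physical step, converged by pseudo-time subiterations.

    Scalar analogue of the delta form (18.12):
        (1/dtau + 1.5/dt + dr/du) * du = -r(u_m) - (1.5*u_m - 2*u_n + 0.5*u_nm1)/dt
    Returns the new value and the number of subiterations used.
    """
    u_m = u_n  # initial guess for the new time level
    for sub in range(max_sub):
        rhs = -r(u_m) - (1.5 * u_m - 2.0 * u_n + 0.5 * u_nm1) / dt
        if abs(rhs) < tol:
            return u_m, sub
        lhs = 1.0 / dtau + 1.5 / dt + dr_du(u_m)
        u_m += rhs / lhs
    raise RuntimeError("subiterations did not converge")


def integrate(t_end, dt):
    """Integrate du/dt = -u**2 with u(0) = 1; the exact solution is 1/(1 + t)."""
    n_steps = round(t_end / dt)
    u_nm1 = 1.0             # value at t = 0
    u_n = 1.0 / (1.0 + dt)  # exact value at t = dt starts the two-step formula
    for _ in range(n_steps - 1):
        u_new, _subs = advance_bdf2(u_n, u_nm1, dt)
        u_nm1, u_n = u_n, u_new
    return u_n


if __name__ == "__main__":
    for dt in (0.04, 0.02, 0.01):
        error = abs(integrate(1.0, dt) - 0.5)
        print(f"dt = {dt:5.2f}: error at t = 1 is {error:.3e}")
```

Halving the physical time step should reduce the error by roughly a factor of four, reflecting the second-order accuracy of the backward-difference formula once the subiterations are converged.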
As can be seen, equation (18.12) is very similar to the steady-state formulation given by equation (18.9). Both systems of equations require the discretization of the same residual vector $R$. Even though this formulation has been used successfully, it is desirable to accelerate the flow solver for time-dependent problems. In the next section, a method based on a fractional step approach is discussed; this has been found to be an efficient method for obtaining time-accurate solutions.

18.2.3 Method Based on Pressure Projection

In 1965, Harlow and Welch [24] published the first primitive variable method using a Poisson equation for pressure. In this method, called the marker-and-cell (MAC) method, the pressure is used as a mapping parameter to satisfy the continuity equation. By taking the divergence of the momentum equation, the Poisson equation for pressure is obtained:

$$\frac{\partial^2 p}{\partial x_i\,\partial x_i} = \frac{\partial h_i}{\partial x_i} - \frac{\partial}{\partial t}\left(\frac{\partial u_i}{\partial x_i}\right) \tag{18.14}$$

where

$$h_i = -\frac{\partial (u_i u_j)}{\partial x_j} + \frac{\partial \tau_{ij}}{\partial x_j}$$
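The step from the momentum equation to the pressure Poisson equation can be written out explicitly. The short derivation below uses only equation (18.2) and the definition of $h_i$, and assumes the fields are smooth enough to interchange the time and space derivatives. Writing the momentum equation as

$$\frac{\partial u_i}{\partial t} = -\frac{\partial p}{\partial x_i} + h_i$$

and taking its divergence gives

$$\frac{\partial}{\partial t}\left(\frac{\partial u_i}{\partial x_i}\right) = -\frac{\partial^2 p}{\partial x_i\,\partial x_i} + \frac{\partial h_i}{\partial x_i}$$

which rearranges to equation (18.14). For an exactly divergence-free field the time-derivative term vanishes; it is kept in the discrete equations so that any divergence remaining at the current time level is driven to zero at the next one.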
The usual computational procedure involves choosing the pressure field at the current time step such that continuity is satisfied at the next time step. The original MAC method is based on a staggered arrangement on a 2-D Cartesian grid. The staggered grid conserves mass, momentum, and kinetic energy in a natural way and avoids the odd-even point decoupling of the pressure encountered in a regular grid (see [25]). Even though the original method used an explicit Euler solver, various time-advancing schemes can be implemented with this formulation. Since its introduction, numerous variations of the MAC method have been devised and successful computations have been made.

The MAC method can be viewed as a special case of the projection method (e.g., Chorin [26]) or the fractional step method (see Yanenko [27], Marchuk [28]). In this method the strict requirement of obtaining the correct pressure for a divergence-free velocity field in each step may significantly slow down the overall computation. To satisfy mass conservation in grid space, the difference form of the second derivative in the Poisson equation has to be constructed consistently with the discretized momentum equation (see Kwak [1]).

To solve for a steady-state solution, the correct pressure field is needed only when the solution is converged. In this case, the iteration procedure for the pressure can be simplified such that it requires only a few iterations at each time step. The best known method using this approach is the Semi-Implicit Method for Pressure-Linked Equations (SIMPLE) [29] (see Chen et al. [30] for more recent advances). The unique feature of this method is the simple way of estimating the velocity and the pressure correction. This feature simplifies the computation but introduces empiricism into the method. Despite its empiricism, the method has been used successfully for many steady-state computations. It is not the intention of the present paper to evaluate this method, and readers interested in this approach are referred to the references cited above.

Computational Procedure

Here, the time evolution is approximated by several steps, where operator splitting is accomplished by treating the momentum equations as a combination of convection, pressure, and viscous terms. This method is commonly applied in two steps. The first step is to solve for an auxiliary velocity field using the momentum equation in which the pressure-gradient term is computed from the pressure at the previous time step. In the second step, the pressure is computed so that it maps the auxiliary velocity onto a divergence-free velocity field. This procedure is illustrated by the following example in Cartesian coordinates:

Step 1: Calculate the auxiliary or intermediate velocity, $\hat{u}_i$, by

$$\frac{\hat{u}_i - u_i^n}{\Delta t} = \frac{1}{2}\left(3H_i^n - H_i^{n-1}\right) - \frac{\partial p^n}{\partial x_i} + \frac{1}{2\,Re}\nabla^2\left(\hat{u}_i + u_i^n\right) \tag{18.15}$$

where $H_i = -\frac{\partial}{\partial x_j}(u_i u_j)$ and $Re$ is the Reynolds number. Here, the advection terms are advanced by a second-order Adams-Bashforth method. The pressure gradient term is added to the procedure so as to minimize the pressure correction in each time step.
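Step 2: Solve for the pressure correction. In the second step, the momentum equation can be written as

$$\frac{u_i^{n+1} - \hat{u}_i}{\Delta t} = -\frac{\partial}{\partial x_i}\left(\phi^{n+1} - \phi^{n}\right) \tag{18.16}$$

where $\phi$ is related to the pressure through equation (18.18). This equation combined with the continuity equation results in the following Poisson equation for the pressure correction:

$$\nabla^2\left(\phi^{n+1} - \phi^{n}\right) = \frac{1}{\Delta t}\frac{\partial \hat{u}_i}{\partial x_i} \tag{18.17}$$

Once the pressure correction is computed, the new pressure and velocities are calculated as follows:

$$p^{n+1} = p^{n} + \left(\phi^{n+1} - \phi^{n}\right) - \frac{\Delta t}{2\,Re}\nabla^2\left(\phi^{n+1} - \phi^{n}\right) \tag{18.18}$$

$$u_i^{n+1} = \hat{u}_i - \Delta t\,\frac{\partial}{\partial x_i}\left(\phi^{n+1} - \phi^{n}\right) \tag{18.19}$$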
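The projection of equations (18.16), (18.17), and (18.19) can be sketched on a small staggered (MAC-type) grid. The example below is a minimal illustration under stated assumptions: a closed square box with zero wall-normal velocity, a uniform Cartesian grid, an arbitrary auxiliary velocity, and plain point Gauss-Seidel sweeps for the Poisson equation. All function names and parameter values are hypothetical and are not taken from INS3D-FS.

```python
import math


def make_auxiliary_velocity(nx, ny):
    """Build an arbitrary, non-solenoidal face velocity with zero wall-normal flow."""
    u = [[0.0] * ny for _ in range(nx + 1)]    # u on vertical faces
    v = [[0.0] * (ny + 1) for _ in range(nx)]  # v on horizontal faces
    for i in range(1, nx):
        for j in range(ny):
            u[i][j] = math.sin(1.3 * i + 0.7 * j)
    for i in range(nx):
        for j in range(1, ny):
            v[i][j] = math.cos(0.9 * i - 1.1 * j)
    return u, v


def divergence(u, v, h):
    """Cell-centered discrete divergence of the face velocities."""
    nx, ny = len(v), len(u[0])
    return [[(u[i + 1][j] - u[i][j] + v[i][j + 1] - v[i][j]) / h
             for j in range(ny)] for i in range(nx)]


def max_divergence(u, v, h):
    return max(abs(d) for row in divergence(u, v, h) for d in row)


def project(u_hat, v_hat, h, dt, sweeps=1000):
    """Solve the discrete form of (18.17) and apply the correction (18.19)."""
    nx, ny = len(v_hat), len(u_hat[0])
    div = divergence(u_hat, v_hat, h)
    dphi = [[0.0] * ny for _ in range(nx)]  # phi^{n+1} - phi^n at cell centers
    for _ in range(sweeps):                 # point Gauss-Seidel sweeps
        for i in range(nx):
            for j in range(ny):
                total, count = 0.0, 0
                if i > 0:
                    total += dphi[i - 1][j]
                    count += 1
                if i < nx - 1:
                    total += dphi[i + 1][j]
                    count += 1
                if j > 0:
                    total += dphi[i][j - 1]
                    count += 1
                if j < ny - 1:
                    total += dphi[i][j + 1]
                    count += 1
                dphi[i][j] = (total - h * h * div[i][j] / dt) / count
    u = [row[:] for row in u_hat]
    v = [row[:] for row in v_hat]
    for i in range(1, nx):                  # interior faces only; walls stay zero
        for j in range(ny):
            u[i][j] -= dt * (dphi[i][j] - dphi[i - 1][j]) / h
    for i in range(nx):
        for j in range(1, ny):
            v[i][j] -= dt * (dphi[i][j] - dphi[i][j - 1]) / h
    return u, v


if __name__ == "__main__":
    n, h, dt = 8, 1.0 / 8, 0.01
    u_hat, v_hat = make_auxiliary_velocity(n, n)
    u_new, v_new = project(u_hat, v_hat, h, dt)
    print("max |div| before:", max_divergence(u_hat, v_hat, h))
    print("max |div| after: ", max_divergence(u_new, v_new, h))
```

Because the discrete Laplacian here is built as the divergence of the same face gradient used in the velocity correction, the corrected field is divergence free to the level at which the Poisson equation is converged. This is the consistency requirement between the Poisson operator and the discretized momentum equation noted earlier.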
Successful methods have been developed using the fractional step approach in generalized coordinates (Rosenfeld et al. [31-33], Wesseling et al. [34]). One particular aspect of this approach requiring special care is the treatment of the intermediate boundary conditions. Rosenfeld et al. [31,32] devised a generalized scheme where physical boundary conditions can be used at intermediate steps. As with other pressure-based methods, the efficiency of the fractional step method depends on the Poisson solver. Multigrid acceleration, which is physically consistent with the elliptic field, is one possible avenue to enhance the computational efficiency [33].

Rosenfeld et al. [31] defined the dependent variables such that the pressure and the volume fluxes are at the center and on the faces of the primary cells, respectively. This selection is equivalent to a finite-difference formulation over a staggered grid with the choice of scaled contravariant velocity components as the unknowns. The viscous terms are treated by an approximate factorization. The resulting solver was successfully validated for time-dependent problems which require small physical time steps.

To relax the CFL-number restriction in three dimensions and to improve the grid-dependent robustness of the solver, Kiris and Kwak [35] implemented a relaxation scheme where both convective and viscous terms are treated implicitly. Here the first step is to solve the momentum equation for an auxiliary velocity field. A three-point backward differencing formula is used to discretize the momentum equations in time, as shown below:
$$\frac{1}{2\Delta t}\left(3\hat{u}_i - 4u_i^{n} + u_i^{n-1}\right) = -\frac{\partial p^{n}}{\partial x_i} + h(\hat{u}_i) \tag{18.20}$$

The operator splitting is formulated such that overall temporal accuracy is achieved. The resulting solver was validated under grid conditions with a metric discontinuity. It was found that a large CFL number could be used without causing an instability problem. This is a desirable feature for unsteady flow computations when the time step is not restricted by the physics of the problem. A time-accurate version, the INS3D-FS code, was developed using this approach [35].

18.2.4 Artificial Compressibility vs. Pressure Projection Methods for Time-Accurate Computations

To discuss the characteristics of the two approaches in computing time-accurate flows, pulsatile flow in a constricted channel is chosen. The computations were performed using two versions of INS3D, namely INS3D-UP and INS3D-FS, representing the artificial compressibility and the pressure projection methods, respectively. The geometry is consistent with the experimental setup of Park [36] and is shown in Fig. 1. The height of the constriction is given by a = 0.57. This is the distance from the top wall of the channel to the lowest point in the constriction. The length of the channel upstream of the constriction is given by $L_u = 7$. The length of the constriction is $L_c = 4.66$, and the downstream portion of the channel is given by
The inflow boundary for the experiment was at 100 channel heights upstream of the constriction. However, the computational inflow boundary was placed at seven channel heights upstream of the constriction with a parabolic velocity profile such that the mass flow matches that of the experimental setup. The pulsatile inflow velocity is given by the shape function shown in Fig. 2. This problem was previously computed (see Wiltberger et al. [37]) and is repeated here for the purpose of comparing the two algorithms. To obtain time-accurate solutions, INS3D-UP is subiterated in each time step until incompressibility is satisfied, while INS3D-FS is time accurate at each time step. The resulting solutions are comparable, as shown in Figs. 3 and 4. In the experiment the location of the center of the vortex along the bottom wall, defined as the B-vortex, was measured. The B-vortex grows immediately behind the constriction and is shed downstream. In Fig. 3, streamline contours generated from these computations are shown at time increments of 0.1 of the period. The results from the two codes show the same phenomena, although INS3D-FS produces a larger vortical structure throughout the entire period. The location of the B-vortex is then plotted against the experimental data in Fig. 4. Both codes compare well with the experimental results. To compare the time-accurate procedures, the iteration process is studied in detail within one time step advancement. In the INS3D-UP code, time accuracy is achieved by subiterating within each step. This subiteration is done with the governing equations modified to hyperbolic-parabolic type. Therefore, the upstream-propagating pressure wave from the exit boundary has to have enough elapsed time to balance the viscous effect within the channel. The number of iterations required to accomplish this is directly related to the magnitude of the artificial compressibility parameter, defined as $\beta$ earlier. This phenomenon is shown in Fig. 5a.
On the other hand, INS3D-FS uses a Poisson equation for pressure to map the flow field onto a divergence-free velocity field at the new time level. The Poisson iteration, which is done here using a point Gauss-Seidel iteration with GMRES convergence acceleration, does not exhibit wave-like phenomena as in the artificial compressibility case. This is shown in Fig. 5b. Subiterating an implicit inversion as in INS3D-UP is usually more expensive than a Poisson iteration. For a steady-state solution, however, INS3D-UP can take a much larger time step than INS3D-FS. In our computations, INS3D-FS required 1.5 to 2 times more CPU time for steady-state solutions. In time-dependent cases where a small time step is required to resolve the physics, INS3D-FS runs faster than INS3D-UP by a factor of three or more. The algorithmic characteristics of these two approaches will be discussed further in a future report.
18.3 Computed Results

In the previous sections, solution methods and associated codes based on primitive variable approaches have been reviewed briefly. These solvers have been applied to numerous engineering problems in the past. In the present section, computed examples are chosen from external and internal flow problems in aeronautics, followed by an application to a biofluid problem.

18.3.1 Wingtip Vortex Formation and Propagation
The study of wingtip vortices has been of major importance in many areas of fluid engineering. Its significance can be seen in problems such as aircraft spacing during landing and takeoff, blade/vortex interaction in rotorcraft performance, and tip vortex cavitation in ship propeller performance. Although a great deal of theoretical, experimental, and computational work has been done on the tip vortex, very few studies actually address the near-field detail. The present study focuses on the near-field physics, which will provide inflow conditions for simulating wake propagation in the intermediate and far-field regions.

Formation on the Wing Surface

The computational studies on the formation and roll-up process were performed using the INS3D-UP code by Dacles-Mariani et al. [38,39] in conjunction with an experimental study by Chow et al. [40]. The initial roll-up process and the near-field detail have been studied in great detail both experimentally and computationally. In the computational study, the computational domain consists of the wing and wind tunnel wall geometry, as shown in Fig. 6a, which is a close approximation to the experimental setup in the 32 in. × 48 in. low speed wind tunnel at NASA Ames. The computational domain includes a rectangular half-wing with a NACA 0012 airfoil section, a rounded wing tip, and the surrounding boundaries. The wing has an aspect ratio of 0.75 and was mounted inside the wind tunnel at 10 degrees angle of attack. The flow is turbulent with a Reynolds number of 4.6 million based on the chord length. The inflow boundary conditions for the velocity profiles were prescribed using experimental values, and the inflow pressure was computed based on the method of characteristics using a one-dimensional Riemann invariant. At the solid surfaces (the wind tunnel walls and the surface of the model), the velocity was specified to be zero. A single grid approach (see Fig. 6a) was used for this study. Resolving the physics of this problem requires a very dense grid. As a result, the CPU requirement was substantial. A total run time of 28 CPU hours on a Cray C-90 was required with the 2.5 million grid points for a five order of magnitude drop in the residual. This study focused primarily on resolving the fundamental issues of the tip vortex computation without addressing the need to reduce the total number of grid points. However, it is recognized that for more practical implementations, the grid density requirement will have to be addressed.

As shown in Fig. 6b, a vortex is formed at the tip of the wing, fed by the vorticity from the tip boundary layer. The pressure difference between the upper and lower surfaces of the wing drives the fluid in the boundary layer around the tip and towards the suction side of the wing. The discrete vortex formed becomes highly three-dimensional and complex. As the vortex moves downstream, it rolls up more and more of the wing wake until its circulation is nominally equal to that of the wing. This roll-up distance is small compared to the separation of aircraft on the approach path, but may not necessarily be small compared to the distance between interacting lifting surfaces. The flow in the near field is therefore important in its own right as well as providing a possible means of controlling the far-field vortex. As the flow progresses downstream, the maximum crossflow velocity increases in the vortex core.
An axial pressure gradient in turn develops that accelerates the fluid in the vortex core. Phillips and Graham [41] have shown that for a turbulent vortex, assuming that v'e2 = v' 2 , we get dr r dr This equation shows the dominant term which contributes to the radial pressure gradient is a term which contains ve, which in turn affects the peak axial velocity. Note, however, that the second term on the right-hand side of the equation may not be neglected because of the high near field core turbulence intensities measured in experiment by Zilliac et al. [42]. This shows that the resolution of the core properties are dependent not only on the gradient of the circumferential velocity, but also on one of the turbulent stress terms. If this term is not properly accounted for a very good comparison between measured and computed values may not be achieved. Propagation in the Near Field Once it is formed, the wake vortex almost preserves its strength for a long time due to its Euler nature in the core region. This requires special turbulence modeling as well
as high grid resolution in the core region. First the fractional step code, INS3D-FS, is used to investigate the level of error resulting from the spatial differencing and the turbulence model. Then the wake velocity profiles from INS3D-UP and INS3D-FS are compared. In this study the Baldwin-Barth [43] one-equation turbulence model was used. For the INS3D-FS computations, the computational domain extends from the trailing edge of the wing (x/c=1.0) to 0.673 chord lengths downstream of the wing, with an H-H grid topology. Extensive experimental data [44] are available at x/c=1.0, 1.12, 1.24, 1.447, and 1.673. The experimental velocity profile at the x/c=1.0 station is used as the inflow boundary condition. The pressure distributions at the boundaries are calculated from the compatibility condition. The computations are carried out using a relatively fine grid with dimensions of 36x82x82. The solution is converged to machine accuracy in 1000 iterations. The CPU time required for this computation is about 3.5 CRAY-C90 hours. Initially, the computations were carried out using a coarse grid, and it was found that the vortex core velocity peak values were underpredicted. The grid resolution was increased by doubling the number of grid points in the k and l directions, from grid dimensions of 36x42x42 to 36x82x82. The prediction of the peak values at the vortex core was improved, but not substantially. The numerical results indicated that there is an excessive amount of numerical dissipation at the vortex core as it progresses downstream. This was consistent with the findings from the tip vortex study using artificial compressibility (see [38,39]). The excessive numerical diffusion at the vortex core was reduced when the production term in the turbulence model was modified using the norm of the strain rate tensor (see Kiris and Kwak [35]). In Figs.
7a and 7b, the effects of the turbulence model and of the third- and fifth-order convective differencing schemes are compared by showing the axial progression of flow quantities along the vortex coreline. The amount of numerical dissipation is large when third-order flux difference splitting is used. In order to reduce this numerical dissipation, one can use a finer grid or increase the stencil in the upwind-biased differencing. Since the cost of increasing the accuracy of the differencing is much less than that of increasing the grid size, the fifth-order upwind differencing is used. It should be pointed out that the overall spatial accuracy of the method is second-order even though fifth-order upwind differencing is used for the convective terms. That is because the volume and surface area vectors are evaluated to second-order accuracy, a simple averaging is used for the metric terms at the half-point locations, and second-order central differencing is used for the viscous terms. However, increasing the stencil in the upwind differencing has a significant effect in reducing the amount of numerical dissipation compared to lower-order differencing. The results using the third- and fifth-order schemes are compared in Figs. 7a and 7b. As the wake propagates downstream, the core region still exhibits excessive dissipation, primarily due to high turbulent viscosity. To study the effects of the production term in the turbulence model, various combinations of the vorticity magnitude and the strain rate have been tried [35]. The effect of the turbulence model in the core region can clearly be seen in Figs. 7a and 7b. By modifying the
production term in the core to incorporate the strain rate, the error in the core region was reduced to less than 2%. Finally, the velocity magnitude and the crossflow velocity at three different wake locations are plotted in Fig. 8. As shown in the figure, the results from the two codes are reasonably close and both show the capability of capturing the vortex core quite well.
18.3.2 Application to Liquid Rocket Engine
Until recently, the high performance pump design process was not significantly different from that of three decades ago. During this time, a vast amount of experimental and operational experience has revealed that there are many important features of pump flows that are not fully accounted for in the semi-empirical design process. During that same time span, huge strides have been made in computers, numerical algorithms, and physical modeling. After the original INS3D was applied to the redesign of the SSME hot gas manifold, enhancing the performance of the turbomachinery components in advanced rocket engines became of major importance. This prompted the development of a CFD procedure for rocket pump flow simulation. The liquid fuel and oxidizer pumps of interest operate at constant speed. Therefore, rotational steady-state solutions are sought first, leading to the development of a pump simulation procedure using the INS3D-UP code in a steady rotating reference frame. Rocket pumps involve full and partial blades, tip leakage, and an exit boundary to the diffuser. In addition to the geometric complexities, a variety of flow phenomena are encountered in turbopump flows. These include turbulent boundary layer separation, wakes, transition, tip vortices, three-dimensional effects, and Reynolds number effects. In order to use CFD in the design process, the computational flow analysis tools must be validated so that designers can define the accuracy and variations of these tools. The validation of the CFD procedure for pump applications has focused on a rocket inducer. Extensive computational validations were performed by the Pump Technology Team organized by the NASA Marshall Space Flight Center (see Garcia et al. [44]). The resulting computational procedure was then applied to the flow through the SSME High Pressure Fuel Turbo-Pump impeller and to the development of an advanced pump impeller (Kiris and Kwak [45]).
The results from the advanced impeller flow analysis are presented here. In Fig. 9, a cross sectional view of an advanced impeller is shown schematically. The computational model of an advanced pump includes the impeller and the exit cavity region. Figure 10 shows the computational grid near the hub region of the impeller. The impeller design flow is 1,205 gal/min with a design speed of 6,322 rpm. The Reynolds Number for this calculation was 181,273 per inch. In Fig. 11, the meridional velocity, which is the circumferentially averaged axial velocity, is shown at the impeller discharge. A relative x-distance is measured from the shroud to the hub, where x=1.0 is the hub. The meridional velocities, Cm, were integrated
along a radial strip for each constant x-position and were nondimensionalized by the wheel tip speed of 249.5 ft/sec. The meridional velocity distributions for 5% and 10% recirculation from the exit shroud cavity were also plotted. When the exit shroud cavity has leakage to the impeller eye, the velocity peak at the impeller exit moves toward the center of the b2 width, where b2 is defined as the blade height at the impeller exit (see Fig. 9). However, the shroud leakage has only minor effects on the solution at r/rtip = 1.0275 (Fig. 11). In Fig. 11, the symbols represent experimental data by Brozowski [46], and the lines represent Cm distributions for the flow with vaneless space at the exit of the impeller. The test data show that the peak is closer to the center of the b2 width. The discrepancy between the computed results and the experimental data is partially due to the recirculating flow in the hub cavity. The leakage at the hub cavity leads to a stronger recirculation region, which shifts the velocity peak to the center of the b2 width. Since the CFD analysis did not include the leakage at the hub cavity, the predicted recirculation region in the vaneless space is not as strong as in the experimental study. Figure 12 shows blade-to-blade velocity distributions at the impeller exit, which illustrate the impeller exit flow distortion. Symbols represent the experimental data and the lines represent computed results. The jet-wake pattern, which produces an unsteady load on the diffuser vanes, was captured at both meridional locations. Overall, the numerical results compare reasonably well with the experimental data.
18.3.3 Design of Ventricular Assist Device
A Ventricular Assist Device (VAD) is a life-saving tool when the natural heart is incapable of providing sufficient blood flow. In 1989, NASA and the DeBakey Heart Center of the Baylor College of Medicine (BCM) began developing a new implantable VAD system (Fig. 13). This VAD is based on a magnetically driven axial pump requiring a blood flow rate of 5 liters per minute against a pressure of 100 mm Hg (Fig. 14). To make it implantable, the device has to be made small, which requires a very high rotational speed. Two major problems had to be resolved to make this design operational. First, the red blood cell damage had to be kept below an acceptable level, and second, blood clots, which would stop the pump from rotating, had to be prevented from forming in any local region. In addition, the blood must be washed out properly, since blood clots may form within stagnation regions. Since the size of the device is small and the operating condition is severe, instrumentation for flow measurements is extremely difficult. Therefore it became necessary to examine the flow using computational techniques. The INS3D-UP code has been used to resolve these issues and make the pump meet the requirements for clinical tests. The lessons learned from applying the flow solver to rocket engine development were reapplied to the VAD analysis and design. First, the flow through the baseline design of the VAD impeller was simulated in a steady rotating frame of reference. Zonal multiblock grids were used in this analysis with a total of 350,000 grid points. The design flow of this impeller is 5 liters per
minute and the design speed is 12,600 revolutions per minute (rpm). The problem was nondimensionalized by the tube diameter (0.472 inches) and the impeller tip velocity. The solution was considered converged when the maximum residual had dropped at least five orders of magnitude. Computer time required per grid point per iteration was about 1.5 x 10"* seconds. The total computer time required for these calculations was about 6 to 8 single processor Cray-C90 hours. A parametric study was performed to optimize the impeller blade shape and the tip clearance. Initially, three different impeller blade designs with a tip clearance of 0.009 inches were analyzed. Then, the design shown in Fig. 15a was analyzed with two tip clearances; the tip clearance of 0.0045 inches showed better hydrodynamic performance, in terms of efficiency and head coefficient, than the tip clearance of 0.009 inches. Using this design with a tip clearance of 0.0045 inches as the baseline impeller design, ideas from rocket propulsion were introduced to develop a new implantable VAD (Fig. 15b). A new design consisting of the baseline impeller plus an inducer was investigated. The hub and blade surfaces of the baseline impeller and the new impeller, shaded by nondimensionalized pressure, are shown in Fig. 15. The pressure is nondimensionalized by ρV², where ρ is the density and V is the impeller tip velocity. The pressure gradient across the blades, due to the action of centrifugal force, and the pressure rise from inflow to outflow are shown. The meridional velocity distributions along the impeller blade height are shown for various designs in Fig. 16. The final design essentially removed the backflow that existed in the earlier design. One of the critical regions for potential blood clotting is near the bearing area between rotating and non-rotating components. Clotting can be caused in the hub area by either high shear or stagnation, depending on the gap and configuration of the area.
Several cavity shapes and gap widths were analyzed. Figure 17 shows velocity vectors for the original baseline design with a cavity width of b and the final design with a gap width of 8b. The original design showed very high shear stresses near the rotating hub face and a very stagnant fluid region in the lower portion of the cavity. Increasing the cavity width to 8b increased the recirculation in the cavity. This and other modifications were incorporated into the final configuration, which passed the requirements for two-week clinical usage of the VAD. The performance comparison between the original and new design is given in Table 1. Clinical results in the table were obtained by BCM. The hemolysis index reported in this table shows the amount of hemoglobin generated by the pump in grams per 100 liters. Destruction of the red blood cells results in the release of hemoglobin. The new design shows a remarkable improvement in performance over the baseline design. There is a 22 % increase of efficiency between the old and the new design. The inducer provides a sufficient pressure rise to the flow to prevent cavitation on the impeller blades. Besides improving the pumping efficiency, the design of the VAD requires good wall washing near the solid walls, achieved by reducing the stagnation regions.
                      Original Design    New Design
Pump Efficiency       0.25               0.33
Hemolysis Index       0.02               0.003
Power Required        12.6 watts         9.8 watts
Rotation (rpm)        12,600             10,800
Thrombus Formation    Yes                No
Test Run Time         2 days             30+ days
Table 1. Performance comparison of the original and the new design of NASA DeBakey VAD.
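The relative changes implied by Table 1 can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that "improvement" means the relative change (new - old) / old of each tabulated value, and the dictionary keys are names chosen here, not taken from the text.

```python
# Values transcribed from Table 1 as (original design, new design).
table1 = {
    "pump_efficiency": (0.25, 0.33),
    "hemolysis_index_g_per_100L": (0.02, 0.003),
    "power_required_W": (12.6, 9.8),
    "rotation_rpm": (12600.0, 10800.0),
}


def relative_change(old, new):
    """Relative change (new - old) / old; an assumed definition, see lead-in."""
    return (new - old) / old


changes = {name: relative_change(old, new) for name, (old, new) in table1.items()}

for name, change in changes.items():
    print(f"{name}: {100.0 * change:+.1f}%")
```

With these tabulated values the relative change works out to roughly +32% for pump efficiency, -22% for power required, and -85% for the hemolysis index.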
Through the use of a computational approach, the performance of the VAD has been improved drastically. Furthermore, detailed flow analysis led to the development of a non-clotting pump. In July 1991, the Institute of Medicine estimated that approximately 25,000 to 60,000 patients per year in North America could benefit from an efficient left ventricular assist device. Thus, improved designs made possible by using a CFD tool will have a far-reaching impact on managing the health of patients with critical needs.
18.4 Concluding Remarks
In this paper, incompressible Navier-Stokes solvers primarily designed for three-dimensional flow simulations are discussed. Since the primitive variable formulation causes fewer complications in setting the boundary conditions, the discussion has been limited to that formulation. The computed results are presented to illustrate the numerical procedures. Even though computer speed and memory have increased substantially in the recent past, the speed and the memory requirements of a flow solver are still major factors affecting the turnaround time. INS3D-UP, an upwind finite-difference code based on an artificial compressibility approach, is being applied to a wide variety of applications for steady-state, time-accurate and rotational-steady solutions. INS3D-FS, which is based on a fractional step method using a finite volume discretization on staggered grids, is intended for solving time-dependent problems. For steady-state solutions, INS3D-FS requires about 1.5 to 2 times more CPU time than INS3D-UP. For unsteady computations, INS3D-FS is time-accurate while INS3D-UP requires subiteration at each time level; thus INS3D-FS is expected to be more economical. As these codes are applied to a variety of problems, we hope to quantify their advantages and shortcomings in more detail. Despite the limitations in algorithm speed, accuracy, and grid-dependency of the numerical solution procedures, these solvers can be of significant value to developers of modern flow devices when the range of their applicability is properly understood.
REFERENCES
1. Kwak, D., "Computation of Viscous Incompressible Flows," von Karman Institute for Fluid Dynamics, Lecture Series 1989-04. Also NASA TM 101090, March 1989.
2. Kwak, D., Kiris, C., Dacles-Mariani, J., Rogers, S., and Yoon, S., "Incompressible Navier-Stokes Solvers for Three-Dimensional Steady and Unsteady Flow Simulations," Computational Fluid Dynamics Review 1997, Hafez, M. and Oshima, K., ed., John Wiley and Sons, 1997.
3. Hirsch, C., Numerical Computation of Internal and External Flows. John Wiley & Sons, 1988.
4. Hafez, M. and Oshima, K., ed., Computational Fluid Dynamics. John Wiley and Sons, 1995 and 1997.
5. Chorin, A. J., "A Numerical Method for Solving Incompressible Viscous Flow Problems," J. Comp. Phys., Vol. 2, pp. 12-26, 1967.
6. Chang, J. L. C., and Kwak, D., "On the Method of Pseudo Compressibility for Numerically Solving Incompressible Flows," AIAA Paper 84-0252, AIAA 22nd Aerospace Sciences Meeting, Reno, NV, January 9-12, 1984.
7. Steger, J. L. and Kutler, P., "Implicit Finite-Difference Procedures for the Computation of Vortex Wakes," AIAA J., Vol. 15, No. 4, pp. 581-590, Apr. 1977.
8. Kwak, D., Chang, J. L. C., Shanks, S. P., and Chakravarthy, S., "A Three-Dimensional Incompressible Navier-Stokes Flow Solver Using Primitive Variables," AIAA J., Vol. 24, No. 3, pp. 390-396, Mar. 1986.
9. Chang, J. L. C., Kwak, D., Rogers, S. E. and Yang, R-J., "Numerical Simulation Methods of Incompressible Flows and an Application to the Space Shuttle Main Engine," Int. J. Numerical Method in Fluids, Vol. 8, pp. 1241-1268, 1988.
10. Rogers, S. E., Kwak, D. and Kiris, C., "Steady and Unsteady Solutions of the Incompressible Navier-Stokes Equations," AIAA J., Vol. 29, No. 4, pp. 603-610, April 1991.
11. Merkle, C. L. and Athavale, M., "Time-Accurate Unsteady Incompressible Flow Algorithms Based on Artificial Compressibility," AIAA Paper 87-1137, 1987.
12. Belov, A., Martinelli, L. and Jameson, A., "A New Implicit Algorithm with Multigrid for Unsteady Incompressible Flow Calculations," AIAA Paper 95-0049, 1995.
13. Temam, R., Navier-Stokes Equations. Revised Edition, North Holland, 1979.
14. Rizzi, A. and Eriksson, L.-E., "Computation of Inviscid Incompressible Flow with Rotation," J. Comp. Phys., Vol. 153, pp. 275-312, 1985.
15. Beam, R. M., and Warming, R. F., "An Implicit Factored Scheme for the Compressible Navier-Stokes Equations," AIAA J., Vol. 16, pp. 393-402, 1978.
16. Briley, W. R. and McDonald, H., "Solution of the Multidimensional Compressible Navier-Stokes Equations by a Generalized Implicit Method," J. Comp. Phys., Vol. 24, No. 4, pp. 372-397, 1977.
17. Pulliam, T. H., and Chaussee, D. S., "A Diagonal Form of an Implicit Approximate-Factorization Algorithm," J. Comp. Phys., Vol. 39, pp. 347-363, 1981.
18. Rogers, S. E., Chang, J. L. C., and Kwak, D., "A Diagonal Algorithm for the Method of Pseudocompressibility," J. Comp. Phys., Vol. 73, No. 2, pp. 364-379, 1987.
19. Kwak, D., Chang, J. L. C., Shanks, S. P., and Chakravarthy, S., "An Incompressible Navier-Stokes Flow Solver in Three-Dimensional Curvilinear Coordinate System Using Primitive Variables," AIAA Paper 84-0253, January 1984.
20. Rogers, S. E., Kwak, D. and Chang, J. L. C., "INS3D-An Incompressible Navier-Stokes Code in Generalized Three-Dimensional Coordinates," NASA TM 100012, November 1987.
21. Yoon, S. and Kwak, D., "LU-SGS Implicit Algorithm for Three-Dimensional Incompressible Navier-Stokes Equations with Source Term," AIAA Paper 89-1964, 1989.
22. MacCormack, R. W., "Current Status of Numerical Solutions of the Navier-Stokes Equations," AIAA Paper 85-0032, 1985.
23. Roe, P. L., "Approximate Riemann Solvers, Parameter Vectors, and Difference Schemes," J. Comp. Phys., Vol. 43, pp. 357, 1981.
24. Harlow, F. H. and Welch, J. E., "Numerical Calculation of Time-Dependent Viscous Incompressible Flow with Free Surface," Phys. Fluids, Vol. 8, No. 12, pp. 2182-2189, 1965.
25. Gresho, M. P. and Sani, R. L., "On Pressure Boundary Conditions for the Incompressible Navier-Stokes Equations," Int. J. Numerical Methods in Fluids, Vol. 7, pp. 1111-1145, 1987.
26. Chorin, A. J., "Numerical solution of Navier-Stokes equations," Mathematics of Computation, Vol. 22, No. 104, pp. 745-762, 1968.
27. Yanenko, N. N., The Method of Fractional Steps, Springer-Verlag, Berlin, 1971.
28. Marchuk, G. M., Methods of Numerical Mathematics. Springer-Verlag, 1975.
29. Patankar, S. V. and Spalding, D. B., "A Calculation Procedure for Heat, Mass and Momentum Transfer in Three-Dimensional Parabolic Flows," Int. J. Heat and Mass Transfer, Vol. 15, pp. 1787-1806, 1972.
30. Chen, Y. S., Shang, H. M. and Chen, C. P., "Unified CFD Algorithm with a Pressure Based Method," 6th Int'l. Symposium on Comp. Fluid Dyn., Sept. 4-8, 1995, Lake Tahoe, NV.
31. Rosenfeld, M., Kwak, D. and Vinokur, M., "A Fractional Step Solution Method for the Unsteady Incompressible Navier-Stokes Equations in Generalized Coordinate Systems," J. Comp. Phys., Vol. 94, No. 1, pp. 102-137, May 1991.
32. Rosenfeld, M., and Kwak, D., "Time-Dependent Solutions of Viscous Incompressible Flows in Moving Coordinates," Intl. J. Num. Methods in Fluids, Vol. 13, pp. 1311-1328, 1991.
33. Rosenfeld, M., and Kwak, D., "Multigrid Acceleration of a Fractional-Step Solver in Generalized Curvilinear Coordinate Systems," AIAA J., Vol. 31, No. 10, pp. 1792-1800, October 1993.
34. Wessling, P., Segal, A., van Kan, J. J. I. M., Oosterlee, C. W. and Kassels, C. G. M., "Finite Volume Discretization of the Incompressible Navier-Stokes Equations in General Coordinates on Staggered Grids," Computational Fluid Dynamics Journal, Vol. 1, pp. 27-33, April 1992.
35. Kiris, C. and Kwak, D., "Numerical Solution of Incompressible Navier-Stokes Equations Using a Fractional-Step Approach," AIAA Paper 96-2089, June 1996.
36. Park, D. K., "The Biofluid Mechanics of Arterial Stenoses," M.Sc. thesis, Lehigh University, Bethlehem, Pennsylvania, 1989.
37. Wiltberger, N. L., Rogers, S. E. and Kwak, D., "A Comparison of Two Incompressible Navier-Stokes Algorithms for Unsteady Internal Flow," NASA TM 108794, November 1993.
38. Dacles-Mariani, J. S., Rogers, S., Kwak, D., Zilliac, G., and Chow, J., "A Computational Study of a Wingtip Vortex Flowfield," AIAA Paper 93-3010, July 1993.
39. Dacles-Mariani, J., Kwak, D., and Zilliac, G., "Accuracy Assessment of a Wingtip Vortex Flowfield in the Near-Field Region," AIAA Paper 96-0208, Jan. 1996.
40. Chow, J. S., Zilliac, G. G., and Bradshaw, P., "Near-Field Formation of a Turbulent Wingtip Vortex," AIAA 93-0551, 31st Aerospace Sciences Meeting, Reno, NV, Jan. 11-14, 1993.
41. Phillips, W. R. C. and Graham, J. A. H., "Reynolds-Stress Measurement in a Turbulent Trailing Vortex," J. Fluid Mech., Vol. 147, pp. 353-371, 1984.
42. Zilliac, G. G., Chow, J. S., Dacles-Mariani, J. and Bradshaw, P., "Turbulent Structure of a Wingtip Vortex in the Near Field," AIAA Paper 93-3011, July 1993.
43. Baldwin, B. S. and Barth, T. J., "A One-Equation Turbulence Transport Model for High Reynolds Number Wall-Bounded Flows," AIAA Paper 91-0610, 1991.
44. Garcia, R., McConnaughey, P., and Eastland, A., "Activities of MSFC Pump Stage Technology Team," AIAA Paper No. 92-3232, 1992.
45. Kiris, C. and Kwak, D., "Progress in Incompressible Navier-Stokes Computations for the Analysis of Propulsion Flows," NASA CP 3282, Vol. II, Advanced Earth-to-Orbit Propulsion Technology, 1994.
46. Brozowski, L. A., Ferguson, T. V., and Rojas, L., "Impeller Flow Field Laser Velocimeter Measurements," Proceedings of the Fifth International Symposium on Transport Phenomena and Dynamics of Rotating Machinery, May 9-11, 1994, Kuanupulu, Maui.
47. Kiris, C., Kwak, D. and Benkowski, R., "Computational Flow Analysis of a Left Ventricular Assist Device," Sixth Intl. Symp. on Comp. Fluid Dyn., Lake Tahoe, Nevada, September 4-8, 1995.
Figure 1 Physical dimensions of computational model of a constricted channel. Computational grid size: 241 x 63.
Figure 2 Inflow velocity magnitude versus time for one period for constricted channel. Re = 131.9, based on inflow velocity.
Figure 3 Streamlines at 0.1 T to 1.0 T: (a) INS3D-UP results; (b) INS3D-FS results.
Figure 4 Location of B-vortex versus period.
Figure 5 Pressure correction within one time step: (a) INS3D-UP subiteration; (b) INS3D-FS Poisson iteration.
Figure 6 Wing tip vortex study: (a) schematic of the wind tunnel test section and the computational domain; (b) particle traces showing tip vortex rollup and core movement; Re = 4.6 x 10^6.
Figure 7 Details of flow near tip vortex: (a) axial progression of velocity magnitude along vortex coreline; (b) axial progression of pressure coefficient magnitude along vortex coreline.
Figure 8 Comparison of velocity magnitude and crossflow velocity across wake vortex at three different positions.
Figure 9 Schematic view of an advanced pump impeller cross-section.
Figure 10 Advanced pump impeller computational grid on the hub surface.
Figure 11 Comparison of circumferentially averaged meridional velocity at the ... Plot title: averaged Cm vs. relative x, r/rtip = 1.0275.
Figure 12 Comparison of blade-to-blade meridional velocity at the impeller exit.
Figure 13 Schematic of an axial flow ventricular assist device (VAD) implantation.
Figure 14 Schematic of the NASA-DeBakey VAD.
Figure 15 Hub and blade surfaces of the original and new VAD impeller, shaded by pressure.
Figure 16 Meridional velocity distribution along impeller blade height of various designs.
Figure 17 Velocity vectors inside the original and new bearing geometries.
19 Pros & Cons of Airfoil Optimization
Mark Drela
Department of Aeronautics and Astronautics, M.I.T., Cambridge, MA 02139
Frontiers of Computational Fluid Dynamics - 1998, Editors: David A. Caughey & Mohamed M. Hafez, ©1998 World Scientific
19.1 Introduction
Optimization has long been considered as a means to solve the aerodynamic design problem in a formal and general manner. Early work by Hicks, Murman, and Vanderplaats [1] investigated this possibility for transonic airfoil flows, with encouraging results but also some rather unexpected outcomes and difficulties. These early efforts were characterized by relatively few design parameters being used, primarily due to the computational costs of black-box gradient calculation via finite-differencing and the limited available computer resources. Recent advances in parameter gradient calculation methods, such as the adjoint method [2] and the Newton-based direct method [3], and the relentless increases in computer speed and memory capacity, have largely removed the limitation on the number of design parameters which can be employed. At first glance, this should allow "truly optimum" designs to be computed and thus lay the issue of the effectiveness of optimization to rest. The author's recent experience indicates that this is not the case, since unforeseen difficulties and surprising results arise as the number of parameters increases. The purpose of this paper is to investigate the behavior of constrained optimization solutions with relatively large numbers of free design parameters present. The examples will be restricted to two-dimensional viscous airfoil optimization. The difficulties which appear even in such a seemingly simple aerodynamic design problem are quite illustrative of the advantages as well as shortcomings of formal optimization as an aerodynamic design method.
19.2 Method Summary
The present paper focuses on the effectiveness of optimization itself, rather than on analysis and optimization algorithms. Hence, this section will be restricted to only a brief summary of the analysis and optimization methods used for the application examples.
19.2.1 Analysis method
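The present design/optimization method employs the viscous/inviscid MSES code as the underlying analysis solver [4]. The overall equation system
$$ R(U;\, \alpha, M, G_k) = 0 \qquad (19.1) $$
consisting of the interior steady Euler equations, the boundary layer equations, and the necessary coupling and boundary conditions, is solved for the flow solution U as a fully-coupled system by a direct application of the Newton method.
$$ \delta U = -\left[ \partial R / \partial U \right]^{-1} R \,, \qquad U \leftarrow U + \delta U \qquad (19.2) $$
After the Newton cycle is converged, the factored Jacobian is re-used to compute flowfield sensitivities to the design parameters Gk and flow parameters α, M.
$$
\begin{aligned}
\partial U / \partial G_k &= -\left[ \partial R / \partial U \right]^{-1} \left\{ \partial R / \partial G_k \right\} \\
\partial U / \partial \alpha &= -\left[ \partial R / \partial U \right]^{-1} \left\{ \partial R / \partial \alpha \right\} \\
\partial U / \partial M &= -\left[ \partial R / \partial U \right]^{-1} \left\{ \partial R / \partial M \right\}
\end{aligned}
\qquad (19.3)
$$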
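As a rough illustration of (19.2) and (19.3), the sketch below runs a Newton iteration on a small toy system and then solves for the parameter sensitivities with the LU factors left over from the last Newton step. The 2x2 residual, the function names, and the parameter values are all assumptions made for this example; they stand in for the MSES equations, which are not reproduced here.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve


def residual(U, G):
    """Toy 2x2 nonlinear system standing in for R(U; G) = 0 (not the MSES equations)."""
    return np.array([U[0] ** 2 + U[1] - G[0],
                     U[0] + U[1] ** 2 - G[1]])


def dR_dU(U, G):
    """Jacobian of the toy residual with respect to the unknowns U."""
    return np.array([[2.0 * U[0], 1.0],
                     [1.0, 2.0 * U[1]]])


def dR_dG(U, G):
    """Jacobian of the toy residual with respect to the parameters G."""
    return -np.eye(2)


def newton_with_sensitivities(U0, G, tol=1e-10, max_iter=50):
    U = np.array(U0, dtype=float)
    for _ in range(max_iter):
        factors = lu_factor(dR_dU(U, G))         # factor the Jacobian for this Newton step
        dU = lu_solve(factors, -residual(U, G))  # dU = -[dR/dU]^-1 R
        U = U + dU                               # U <- U + dU
        if np.linalg.norm(dU) < tol:
            break
    # Re-use the last factorization: one back-substitution per parameter column.
    dU_dG = lu_solve(factors, -dR_dG(U, G))
    return U, dU_dG


G = np.array([3.0, 5.0])
U, dU_dG = newton_with_sensitivities([1.5, 1.5], G)
print(U)
print(dU_dG)
```

Because the factors come from the final Newton step, the Jacobian they represent differs from the converged one only by an amount of the order of the last update, which is below the tolerance.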
This re-use involves only back-substitutions, and allows computation of flowfield sensitivities to a very large number of design parameters at negligible cost. Hundreds of design parameters can be handled in interactive calculations on a modest workstation.
19.2.2 Geometry parameterization
A suitable parameterization for the airfoil shape, defined in terms of the fractional arc length $s/s_{side}$ on each side of the airfoil, is a summation of sinusoidal basis functions $g_k$ which perturb the airfoil by a distance $\Delta n$ normal to its current surface. The design parameters $G_k$ are the mode amplitudes.
$$ \Delta n(s) = \sum_{k=1}^{K} G_k \, g_k(s) \,, \qquad g_k(s) = \frac{1}{k} \sin\left( k \pi s / s_{side} \right) \qquad (19.4) $$
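A minimal sketch of the sine basis in (19.4) follows, written in terms of the fractional arc length s/s_side on one side of the airfoil. The function names and the sample grid are assumptions for illustration. The two checks at the end evaluate the peak slope of each mode with respect to s/s_side and the mutual inner products of the modes.

```python
import numpy as np


def sine_modes(s_frac, K):
    """Rows are g_k = sin(k*pi*s_frac)/k for k = 1..K, with s_frac = s/s_side in [0, 1]."""
    k = np.arange(1, K + 1)[:, None]
    return np.sin(k * np.pi * s_frac[None, :]) / k


def normal_perturbation(s_frac, G):
    """Delta n = sum_k G_k g_k for mode amplitudes G (design parameters)."""
    G = np.asarray(G, dtype=float)
    return G @ sine_modes(s_frac, G.size)


def trapezoid(values, s):
    """Trapezoidal integral of values along the last axis."""
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * np.diff(s), axis=-1)


s = np.linspace(0.0, 1.0, 2001)
modes = sine_modes(s, K=5)

# Peak slope of each mode with respect to s/s_side (numerical derivative).
peak_slopes = np.abs(np.gradient(modes, s, axis=1)).max(axis=1)

# Gram matrix of inner products between modes; off-diagonal entries vanish.
gram = trapezoid(modes[:, None, :] * modes[None, :, :], s)

print(peak_slopes)
print(np.round(gram, 6))
```

Each mode has a peak slope close to pi regardless of k, and the Gram matrix is diagonal, which is the orthogonality property of the sine basis.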
The 1/k scaling factor makes all the basis functions have the same maximum slope, which theoretically does not affect the optimum solution, but does appear to greatly improve the behavior of an optimization descent sequence. Other geometry bases can of course be defined, and in fact many are better suited for specific problems. The useful features of the sine basis are a guaranteed mutual orthogonality, and a uniform and predictable increase of geometric resolution with added modes, making it quite suitable for the parameter-count investigation in this paper.
19.2.3 Optimization method
The voluminous solution sensitivity output generated by the MSES Newton solver is applied to the optimization problem in the interactive LINDOP program [3]. This allows the designer to use the sensitivity information to interactively try out various objective functions, constraints, and design parameter sets, and to generate linearized predictions resulting from explicit parameter changes, imposed pressure distributions, or objective-function descents. The linearized predictions are displayed for all the operating points being considered, thus giving visual warning when the optimization is headed for trouble. One MSES/LINDOP cycle constitutes one descent step in design space. The well-known BFGS method [5] is used to generate the sequence of descent directions. The descent is continued until the objective function refuses to decrease further to within 0.00005 in CD. The number of descents required for this is typically comparable to the number of free design parameters.
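The descent loop described above can be sketched as a textbook BFGS iteration that stops once a step no longer lowers the objective by more than 0.00005. This is not the LINDOP implementation: the backtracking line search, the two-parameter quadratic `toy_objective`, and all names below are assumptions chosen only to make the loop runnable.

```python
import numpy as np


def bfgs_descent(f, grad, x0, f_tol=5e-5, max_steps=200):
    """Minimal BFGS loop that stops when f no longer drops by more than f_tol."""
    x = np.array(x0, dtype=float)
    identity = np.eye(x.size)
    H = identity.copy()                  # running estimate of the inverse Hessian
    fx, g = f(x), grad(x)
    history = [fx]
    for _ in range(max_steps):
        p = -H @ g                       # quasi-Newton descent direction
        if g @ p >= 0.0:                 # safeguard: fall back to steepest descent
            H, p = identity.copy(), -g
        t = 1.0
        # Backtracking line search with the Armijo sufficient-decrease test.
        while f(x + t * p) > fx + 1e-4 * t * (g @ p) and t > 1e-12:
            t *= 0.5
        x_new = x + t * p
        f_new, g_new = f(x_new), grad(x_new)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-12:                # curvature condition keeps H positive definite
            rho = 1.0 / (s @ y)
            H = ((identity - rho * np.outer(s, y)) @ H @ (identity - rho * np.outer(y, s))
                 + rho * np.outer(s, s))
        drop = fx - f_new
        x, fx, g = x_new, f_new, g_new
        history.append(fx)
        if drop < f_tol:                 # objective refuses to decrease further
            break
    return x, history


def toy_objective(x):
    """Hypothetical smooth objective with its minimum at (1.0, -0.5)."""
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 0.5) ** 2


def toy_gradient(x):
    return np.array([2.0 * (x[0] - 1.0), 8.0 * (x[1] + 0.5)])


x_opt, history = bfgs_descent(toy_objective, toy_gradient, [0.0, 0.0])
print(x_opt, len(history) - 1)
```

In the actual method each objective and gradient evaluation would correspond to one MSES solution with its sensitivities, so the number of descent steps, not the cost of the BFGS update, dominates the run time.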
19.3 Low Reynolds Number Airfoil Application
The DAE-11 airfoil was designed by the author in 1987 specifically for the Daedalus human powered aircraft, using traditional inverse-method and direct geometry manipulation techniques. It is a second-generation airfoil, redesigned from its precursor airfoils whose performance was partially verified in flight tests [6]. Hence, it makes an interesting candidate for possible further improvement via numerical optimization. The key requirement for a human powered aircraft airfoil is to achieve minimum drag at the design flight lift coefficient. It is constrained by structural thickness requirements at the spar locations, and by a number of other minor geometric requirements which influence the weight of the wing's secondary structure. Here, the focus will be on the primary requirements.
DRELA
Figure 1 Baseline DAE-11 airfoil, and partially-optimized unconstrained airfoil.
19.3.1
One-point optimization
The following optimization problem embodies the low drag requirement.
minimize $\mathcal{F}(G_k, \alpha) = C_D$ (19.5)
Using 40 Gk DOFs with no geometric constraints present (only CL is held fixed), a physically unrealizable airfoil soon results as optimization descent steps are taken, as shown in Figure 1. The airfoil becomes very thin, with the first trouble sign being the appearance of a re-entrant trailing edge. This type of result from an unconstrained optimization is of course entirely expected. Suitable thickness constraints must be imposed based on structural considerations, and in this case the trailing edge and leading edge angles must be explicitly constrained as well. The following constraints have been found suitable for this problem after a number of optimization attempts.
$C_L = 1.25, \quad C_M = -0.133, \quad \theta_{TE} = 6.25^\circ, \quad \theta_{LE} = 180^\circ, \quad (t/c)_{0.33c} = 0.128, \quad (t/c)_{0.90c} = 0.014$ (19.6)
The CM constraint has also been found necessary because the optimizer tends to strongly drive it more negative, which then has a large detrimental impact on the wing structural weight in a human-powered aircraft. The specified angles, thicknesses, and CM are the same as those of the starting DAE-11 airfoil. The leading edge angle θLE = 180° constraint simply matches the surface slopes between the top and bottom surface at the airfoil nose, and is necessary to prevent the appearance of a sharp chisel-type leading edge. The CL constraint effectively eliminates α as a degree of freedom, while the CM constraint and each of the four geometric constraints removes one geometric
AIRFOIL OPTIMIZATION
Figure 2 One-point optimized airfoil geometry and objective function versus number of design DOFs.
degree of freedom. The total effective number of free design parameter DOFs is therefore K - 5. 100 geometric DOFs are typically required to generate a practically arbitrary airfoil, although a smaller number may be adequate if the starting airfoil is reasonably close to the final design, as in this case. All constraints are imposed explicitly in LINDOP by augmenting the objective function F using Lagrange multipliers. The net effect is to project all changes in the design space onto the admissible constrained subspace. Figure 2 shows the optimized airfoil CD and geometry which result from using 5, 10, 20, and 40 net degrees of freedom. Both the airfoil shape and the CD appear to asymptote, indicating that the 40-DOF result is in fact the true solution to the optimization problem as posed, at least in the local minimum sense. The CD is reduced from 0.0994 in the original DAE-11 airfoil to 0.0836 in the optimized airfoil, a rather large 13% reduction. However, calculation of the entire drag polar of each airfoil tells a very different story. As the number of DOFs is increased, the drag reduction is attained over an ever-narrower CL range. Figure 3 shows that the polar curve takes on a cusped form, so that the benefit range shrinks to nearly zero. Hence, the "optimized" airfoils actually get considerably worse in a practical sense with increasing number of DOFs. In retrospect, this dramatic behavior is not surprising. Airfoil design, and in fact most aerodynamic design, is fundamentally driven by tradeoffs, which almost always include off-design performance. If such tradeoffs are not considered, as in this 1-point optimization example, very poor results are almost certain to occur, by chance if anything: the number of poor airfoils vastly exceeds the number of good airfoils!
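The projection onto the admissible subspace mentioned above can be illustrated with a small sketch. It assumes that each constraint has been linearized and is represented by its gradient vector in design space, and it uses Gram-Schmidt orthogonalization rather than an explicit Lagrange multiplier solve; for a step measured in the ordinary Euclidean norm the two give the same projected step. The function names and the sample numbers are hypothetical and are not taken from LINDOP.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: orthonormal basis for the span of the constraint gradients."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > eps:  # skip linearly dependent constraints
            basis.append([wi / norm for wi in w])
    return basis


def project_step(step, constraint_grads):
    """Remove from `step` every component that changes a constraint to first order."""
    p = list(step)
    for q in orthonormalize(constraint_grads):
        c = dot(p, q)
        p = [pi - c * qi for pi, qi in zip(p, q)]
    return p


# Hypothetical example: 4 design variables and 2 linearized constraints.
constraint_grads = [[1.0, 1.0, 0.0, 0.0],
                    [0.0, 1.0, 1.0, 0.0]]
raw_step = [-0.3, 0.2, 0.5, -0.1]
step = project_step(raw_step, constraint_grads)

# Each independent constraint removes one degree of freedom.
free_dofs = len(raw_step) - len(orthonormalize(constraint_grads))
```

The same bookkeeping gives the K - 5 count in the text: K geometry modes plus the angle of attack, less the six constraints on CL, CM, and the four geometric quantities.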
In this case, the optimizer raises a bump on the surface to "fill" the transitional separation bubble, effectively reducing the mixing and associated drag penalty which occurs when a bubble undergoes transition and reattachment [6]. However, the bubble location changes with CL, so this "optimized" airfoil shape is effective only for the sampled CL point seen by the optimizer. For lower CL values, the shear
Figure 3 Polars for original DAE-11 airfoil and 1-point optimized airfoils.
layer does not undergo transition at the bump, but instead separates off the bump in a laminar state, forming an even bigger and more lossy separation bubble than would occur without the bump. For higher CL values, transition runs forward ahead of the bump much faster than it would otherwise, very quickly precipitating a rapid rise in skin friction drag, which then causes a rapid thickening of the downstream boundary layer and quickly precipitates stall. The real deficiency here is not the optimization technique, which gives the demonstrably correct answer, but rather with the formulation of the optimization problem itself. The simple 1-point drag minimization, even with a number of real geometric constraints determined by considerable trial and error, still does not embody the real design requirements of the airfoil. Unfortunately, this shortcoming is not at all obvious at the outset.
19.3.2
Two-point optimization
To remedy the obvious deficiency in the 1-point design, a 2-point optimization is formulated by replacing the objective function (19.5) with
so that the two ends of the expected operating CL range are now sampled. The 1:2 weighting between the two operating points has been determined to be necessary so that the upper part of the drag polar is not compromised excessively by the less important lower part. The same geometric
Figure 4 Polars for original DAE-11 airfoil and 2-point optimized airfoils.
constraints (19.6) are used, but of course two separate CL constraints are used for the two operating points. The objective function now decreases less quickly with number of DOFs than in the 1-point case, and 60 DOFs are needed to asymptote reasonably well to the true optimum solution. The evolution of the drag polars with increasing DOFs is shown in Figure 4. The striking feature immediately apparent is that the peaky local-optimization polar shape persists, but this now occurs at the two sampled CL values. The airfoil geometry now has two bumps, each one at the bubble location at one of the sampled operating points. The bumps are not as pronounced as in the 1-point example, but they still show a significant drag penalty away from the sampled CL values as the scalloped drag polar in Figure 4 shows. Again, the "optimized" airfoil is inferior in a practical sense to the starting DAE-11 airfoil, although its deficiencies are less severe than in the 1-point optimized airfoil.
19.3.3
Six-point optimization
Carrying the multi-point optimization concept further, a 6-point objective function is defined as
so that the sampled CL points 0.8 ... 1.6 now span slightly beyond the expected 1.0 ... 1.5 operating range of the airfoil to give some margin at the
Figure 5 Six-point optimized airfoil geometry and objective function versus number of design DOFs.
ends. The evolution of the design with added modes is now even more gradual, and 90 modes are required to nearly asymptote to the optimum design as shown in Figure 5. The polar of this 6-point optimized airfoil is compared with the starting DAE-11 polar in Figure 6. The peaky behavior around each sampled point is still present, but to a much lesser degree than before. The geometry of this airfoil is rather striking, however. Figure 7 shows the geometries for the 1-point, 2-point, and 6-point optimized airfoils, showing one, two, and six bumps at the separation bubble locations at the sampled operating points. The inviscid Cp distribution of the 6-point airfoil in Figure 8 shows the severity of its six bumps. Note also that the inviscid Cp is somewhat "noisy" over most of the airfoil due to slight surface curvature irregularities. This is typical in problems with many geometric DOFs, since such small-scale irregularities have virtually no aerodynamic penalty and hence are invisible to the optimizer.
In the 1-point, 2-point, and 6-point optimization examples above, the optimizer manipulated the geometry at the smallest physical scale which has a significant impact on the objective function, in this case the transitional separation bubble. If presented with sufficient geometric DOFs, the extent to which the optimizer performs such manipulation is startling. The 6-point airfoil with the six distinct bumps is surely not what was expected, but the result makes perfect sense in retrospect. The viscous Cp distribution shown in Figure 9 for one of the sampled points at CL = 1.5 shows that the bumps in the laminar region are mostly submerged under the free shear layer. Each bump protrudes just enough from the surface to barely reach the shear layer, and is positioned along the surface so that it "catches" the shear layer as it undergoes transition and allows it to reattach without significant drag-producing mixing of the stagnant fluid under the shear layer. The computed velocity profiles
Figure 6 Polars for original DAE-11 airfoil and 6-point optimized airfoils.
at the bumps show this clearly in Figure 10. At CL values midway between the sampled points, there is no bump to catch the shear layer, which must reattach by instead mixing out the laminar bubble in the valley, which carries a drag penalty. This is responsible for the "scalloping" in the drag polar in Figure 6. Surely, this rather silly multi-point optimum solution could not have been foreseen by even the most experienced aerodynamicist, and illustrates the almost devious cleverness exhibited by an optimizer which is armed with a large number of design DOFs. Given the likelihood that the localized bumps can be eliminated with denser operating point sampling in the CL range, optimization appears to have produced a slight improvement over the baseline DAE-11 airfoil over the lower part of the polar, as Figure 6 indicates. Away from the bumps, the airfoils shown in Figure 7 are clearly converging to a unique shape. However, the geometric details of this shape are not particularly compatible with the structural techniques employed in human-powered aircraft. The highly cambered trailing edge, the distinct concavity on the upper surface near x/c = 0.7, and the bottom-surface inflection points would all be quite difficult to implement. Since there was no way to predict the appearance of these particular features at the outset, a new optimization problem with the appropriate constraints would likely have to be constructed and solved. The conclusion is that in an engineering setting, using optimization for airfoil design is still an iterative cut-and-try undertaking. But compared to the traditional inverse techniques, the cutting-and-trying is not on the geometry, but rather on the precise formulation of the optimization problem.
Figure 7 Geometries of optimized airfoils, showing surface bumps.
Figure 8 Inviscid Cp distribution for 6-point optimized airfoil.
19.4
Transonic Airfoil Application
In this example, the well-known RAE-2822 airfoil is optimized, starting from the baseline Case 13a of reference [7]. This case is partially into the drag rise Mach range, but short of shock-induced separation. Hence it is in an inefficient but not unrealistic flight condition, and significant improvement might be expected upon optimization redesign. The optimization calculations are all performed at the same CL = 0.733 and Re = 2.7 x 10^6 as the starting RAE-2822 case, and the same M = 0.74 is used for the 1-point optimization case.
Figure 9 Viscous Cp for 6-point optimized airfoil at sampled point CL = 1.5
Figure 10 Velocity profiles for 6-point optimized airfoil at CL = 1.5, showing second bump "catching" reattaching shear layer.
19.4.1
One-point optimization
The objective of reducing drag is embodied by the following plausible optimization problem, again using the sine-mode coefficients Gk and angle of attack α as the design parameters. The leading and trailing edge angles and a local thickness are imposed as reasonable geometric constraints.
minimize $\mathcal{F}(G_k, \alpha) = C_D$ (19.9)
subject to $C_L = 0.733, \quad \theta_{TE} = 8.62^\circ, \quad \theta_{LE} = 180^\circ, \quad (t/c)_{0.35c} = 0.1206$ (19.10)
Figure 11 shows the baseline analysis solution compared with experimental data, and the 1-point optimized airfoil. Although a 40% drag reduction is obtained and the local thickness has been maintained, the airfoil is not suitable for a jet transport application since the average thickness over the typically
Figure 11 Cp distributions for baseline RAE-2822 airfoil, and 1-point optimized airfoil with local thickness constraint at x/c = 0.35.
wide spar box has decreased considerably. The local thickness constraint therefore does not represent the real structural requirements, although again this was not obvious at the outset. In lieu of the local t/c constraint in (19.10), the r.m.s. strain $\epsilon_{rms}$ (per unit bending moment and unit material modulus) is imposed to more realistically account for the requirements of a wide spar box.
The integrals are taken around the airfoil perimeter, and are weighted by the local skin thickness t. The local bending-related unit extensional strain ε is defined in terms of the local airfoil surface point y and the local bending inertia and bending centroid location y0.
The skin thickness t is chosen to be nonzero only over the chordwise extent of the spar box 0.15 ≤ x/c ≤ 0.65. In effect, the r.m.s. strain constraint enforces roughly an average airfoil thickness over the spar box, and the remainder of the airfoil is structurally irrelevant. Figure 12 shows the airfoil which results from the optimization using a net 40 DOFs. In contrast to the local thickness constraint, the r.m.s. strain constraint now enforces a reasonable thickness over the extent of the spar box, but a severe bottom surface concavity is produced in front of the spar, and a lesser one in back. Although it adds aerodynamically advantageous bottom loading, the front concavity may not be feasible in a practical sense,
Figure 12 Cp distributions for 1-point optimized airfoil with r.m.s. strain constraint over 0.15 ≤ x/c ≤ 0.60, and optimized airfoil compared with baseline RAE-2822 airfoil.
so that the optimization problem as posed still may not embody the actual requirements of the design problem. Again, this shortcoming was not apparent at the outset. In any case, the optimized airfoil shows fully isentropic flow with no shock wave and no wave drag. The net CD has been reduced from 0.0178 to 0.0112 - a 37% reduction. This is partially illusory, however. The entire Mach number sweep in Figure 13 shows that the drag reduction is realized primarily in the vicinity of the sampled operating point at M = 0.74, although here the degree of local optimization is not as extreme as in the low Reynolds number airfoil case. The drag is increased considerably at lower Mach numbers, a behavior which is well known in shockfree transonic airfoils. The geometric feature responsible for this behavior is the slight bump centered roughly at x / c = 0.50 which produces a gradual recompression over the aft part of the supersonic zone and thus eliminates the shock. Associated with the bump is a flattened forward upper surface which produces a strong expansion and a strong shock at lower Mach numbers, thus increasing drag. At Mach numbers above the sampled M = 0.74, the flow expands strongly past the bump and results in a strong shock and rapid drag rise. The behavior of the optimizer in this example is in some ways similar to its behavior in the low Reynolds number optimization case. The optimizer exploits the flow features at the smallest geometric scale which is resolvable by the available geometry perturbation modes. In this case the physical scale manipulated is the shock/boundary-layer interaction region.
Figure 13 Mach sweep for 1-point optimized and baseline RAE-2822 airfoils.
19.4.2
Two-point optimization
To reduce the point-optimized nature of the 1-point design, a 2-point optimization is defined by replacing the objective function (19.9) with
so that some reasonable range of Mach numbers is now sampled. The same CL, r.m.s. strain, and geometric angle constraints are used. The resulting airfoil is shown in Figure 14, and the entire Mach sweep is shown in Figure 15. Some "localized" optimization is clearly evident. Two precompression bumps are now present on the airfoil, one at the shock foot location at each of the two sampled Mach numbers. No improvement has occurred above the M = 0.74 sampled point.
19.4.3
Four-point optimization
To improve the localized nature of the 2-point design, a 4-point optimization problem is defined as follows.
$\mathcal{F}(G_k, \alpha) = \cdots + \ldots\, C_D|_{M=\ldots} + \ldots\, C_D|_{M=\ldots}$
A denser sampling over the Mach number range is now used, with larger weights applied to the upper part of the range to reflect its greater importance. Note also that a higher Mach of M = 0.76 is now sampled in an attempt to reduce the drag above the highest M = 0.74 point sampled previously.
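A weighted multi-point objective of this kind can be sketched as follows. The sketch is illustrative only: apart from M = 0.74 and M = 0.76, which are named in the text, the sampled Mach numbers, the weights, and the drag values below are hypothetical placeholders and are not results from this study.

```python
def normalized(weights):
    """Scale the point weights so that they sum to one."""
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def multipoint_objective(cd_at_point, weights):
    """Weighted sum of drag coefficients over the sampled operating points."""
    if set(cd_at_point) != set(weights):
        raise ValueError("every sampled point needs exactly one weight")
    return sum(weights[m] * cd_at_point[m] for m in weights)


# Hypothetical 4-point sampling, with the upper Mach range weighted twice as heavily.
weights = normalized({0.68: 1.0, 0.71: 1.0, 0.74: 2.0, 0.76: 2.0})

# Placeholder drag coefficients at each sampled Mach number.
cd = {0.68: 0.0100, 0.71: 0.0105, 0.74: 0.0112, 0.76: 0.0130}

F = multipoint_objective(cd, weights)
```

Each entry of `cd` stands for one converged flow solution, so the cost of evaluating the objective grows in proportion to the number of sampled points.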
Figure 14 Cp distributions for 2-point optimized airfoil at the two sampled Mach numbers, and comparison of optimized geometry with baseline RAE-2822 airfoil.
Aside from the front underside concavity, the rather thin trailing edge for x/c > 0.9, and the more-negative CM, the results in Figure 17 show that this 4-point optimized airfoil appears to be an attractive improvement over the baseline RAE-2822. The increase in the drag-divergence Mach number is most significant. Although the airfoil contour in Figure 16 appears to be quite smooth on the upper surface, the finite number of sampled points still results in significant bumps at each of the four sampled-point shock locations. This is vividly displayed on the Mach wave plot shown in Figure 18, with an expansion fan emanating from the three forward bumps which are fully contained in the supersonic region at the fourth operating point sampled at M = 0.76. These shock-managing bumps employed by the optimizer are clearly much less severe than the reattachment-managing bumps in the low Reynolds number DAE-11 example. Nevertheless, they are clearly artifacts of the finite number of sampled operating points, and would not be present in the ideal case where the Mach number is sampled in a nearly continuous manner.
19.5
Conclusions
Figure 15 Mach sweep for 2-point optimized and baseline RAE-2822 airfoils.
The examples presented clearly illustrate the numerous pitfalls which can easily appear in even simple-looking airfoil optimization problems. The appearance of small-scale geometric irregularities in both the low Reynolds number airfoil and the transonic airfoil examples could be suppressed by increasing the number of sampled operating points, so that additional bumps appear at the intermediate locations, ultimately all blending into a smooth surface. This argument supports the hypothesis that for a smooth geometry, it is necessary to have
$\#\,\text{operating points sampled} = \mathcal{O}(\#\,\text{design parameters})$
since an increase in the number of design parameters reduces the length scale at which the optimizer can exploit the flow, which then must be matched by a proportional increase in the number of sampled points to control this exploitation. Of course, geometric constraints such as on surface curvature could instead be imposed to control this exploitation, but this is in effect a reduction in the effective number of free geometric design parameters. Since the cost increases linearly with the number of sampled operating points, the ability to perform optimization with large numbers of design DOFs is an extremely expensive proposition. It stresses the need to reduce the effective number of design DOFs to an absolute minimum. These conclusions have direct implications not only for 2-D airfoil optimization, but also for large-scale 3-D optimization problems with numerous design degrees of freedom. There is no reason to hope that 3-D problems will be immune to such difficulties. Based on the present optimization examples and the author's prior experience with airfoil design via optimization, the following observations are offered.
19.5.1
Airfoil Optimization - Cons
• The objective function and constraints which effectively embody a practical design problem are not knowable at the outset without an extensive experience base for closely related problems.
• If presented with sufficient design mode resolution, an optimizer will readily (and annoyingly) manipulate and exploit the flow at the smallest significant physical scales present. The examples exhibited such manipulation on the scale of a transitional separation bubble, and on the scale of a shock/boundary-layer interaction zone.
• Manipulation at the smallest scales tends to produce improved performance only near the sampled operating conditions. The point-optimized airfoil often shows a possibly severe degradation in off-design performance.
• Manipulation at the smallest scales can be discouraged by multi-point optimization. Small mobile flow features like separation bubbles and shocks thus appear "smeared" to the optimizer and are exploited less severely.
• Increasing the net number of geometric DOFs appears to require a corresponding increase in the number of operating points sampled by the objective function. Near-continuous sampling of the operating space may be required in the theoretical limit of a general airfoil design problem with a very large number of DOFs, a very expensive proposition.
• The most suitable operating points to be actually sampled in multi-point optimization are not apparent a priori. From limited experience, sampling somewhat beyond the expected operating range appears to be best.
• The point weights used in multi-point optimization are arbitrary, and their appropriate values cannot be easily estimated without prior experience.
• Optimized aerodynamic shapes are usually "noisy" and usually require a posteriori smoothing.
Figure 16 Cp distributions for 4-point optimized airfoil at the four sampled Mach numbers, and comparison of optimized geometry with baseline RAE-2822 airfoil.
19.5.2
Airfoil Optimization - Pros
Optimization gives a rapid indication of possible directions for improvement when intuition with traditional inverse techniques is exhausted, or when
Figure 18 Mach waves for 4-point optimized airfoil at M = 0.76, with indicated bumps at shock positions at lower Mach number conditions.
6. M. Drela. Low-Reynolds number airfoil design for the MIT Daedalus prototype: A case study. Journal of Aircraft, 25(8):724-732, Aug 1988.
7. P.H. Cook, M.A. McDonald, and M.C.P. Firmin. Aerofoil RAE 2822 pressure distributions and boundary layer and wake measurements. In Experimental Data Base for Computer Program Assessment, AR-138. AGARD, 1979.
20
Towards Industrial Strength Navier-Stokes Codes - A Revisit
Wen-Huei Jou 1
20.1
Introduction
The paper we authored in 1992 [1] assessed Navier-Stokes technology. The focus of that paper was on the accuracy of Navier-Stokes codes. Since then, CFD technology has improved and the range of application has broadened, so it is appropriate to revisit these topics. Based on our experience and the projected applications, we can help identify algorithm issues for further research. We often hear CFD researchers touting their codes as being used by industry. It seems that a clarification of the application of CFD codes, at least in a commercial airplane company, may be in order. There are two ways of using CFD codes in industry. The preferred characteristics of a code differ considerably between the two modes. The most important use of codes is, of course, during the development of an airplane. Here, the accuracy of the code required by the engineering tasks must be validated extensively to the comfort level of the users before the application. The ability to turn a result around in a very short time is also important. In the heat of the battle, meeting the schedule is a very high priority. Codes which take hundreds of hours to prepare inputs and/or hundreds of CPU hours to run one analysis have no place in the process of developing an airplane. The second use of CFD codes is in research. For
1
Boeing Commercial Airplane Group, P. O. Box 3707, MS 67-LL, Seattle, Washington 98124-2207. Frontiers of Computational Fluid Dynamics - 1998 Editors: David A. Caughey & Mohamed M. Hafez ©1998 World Scientific
JOU
sorting out cause-and-effect of some complex aerodynamic issues, CFD may be used in research mode during "peace time" to gain knowledge which can be indirectly applied during the development of an airplane. The capability of the code to resolve the aerodynamic phenomenon of interest is probably the over-riding factor for this application. It is less demanding as far as the schedule is concerned. However, in order to sort out technical issues, many cases may need to be investigated, so the cost of computing and turn-around time are still important issues for codes used in this mode. Finally, there are also people in industry who will take a code from researchers to "play with." These efforts should not be even considered as industrial applications. This paper addresses both aspects of industrial use of CFD codes with more attention paid to the first mode. We review the state of applications in several key areas of aerodynamic engineering, and attempt to identify opportunities for code and algorithm improvement. Because the truly effective CFD tools are almost invariably the three dimensional codes, we do not put much effort into reviewing two-dimensional codes, although we recognize that working in two dimensions is often useful in sorting out algorithm issues.
20.2
Cruise Design
For commercial airplane development, careful cruise configuration design is by far the most important aerodynamic task. Its success is the key to satisfying the payload and range requirements, to providing operating economics and to supporting the performance guarantees to the airlines. For a subsonic airplane, the shape of the wing is largely determined by transonic cruise conditions, while the shape of a high-lift wing more or less has to comply with the geometrical constraints given by the cruise shape. The cruise wing configuration is determined in two steps: planform definition and the subsequent detailed aerodynamic shaping under the constraints imposed by planform definition. The first step involves very complex multi-disciplinary trade-offs and system considerations, and the second step is essentially an aerodynamics function. CFD codes are mostly used in the second step at the present time. Because of its importance, transonic cruise wing design and analysis has been the focus of CFD development for the last twenty years. The flow is primarily attached. Therefore, both potential flow approximation coupled with an appropriate boundary layer, and Navier-Stokes methods are equally applicable and being practiced at Boeing and in many other commercial airplane companies. The ability to define the shape of the wing to achieve design objectives is a unique capability of CFD which is not replaceable by any wind tunnel testing method. Therefore, it is the most sought after capability by the aerodynamic
INDUSTRIAL STRENGTH NAVIER-STOKES CODES
engineering community of a commercial airplane company. Several generations of design tools have been developed mainly based on the pressure matching method [2]. Recently, industrial CFD has reached a pinnacle with the success of multiple-point, constrained optimization using the TRANAIR code [3]. This new capability allows wing design in the presence of other airplane components such as struts, nacelles, and fairings. Through multiple point optimization, the new method simultaneously considers several design conditions of most importance to the operation of the airplane. It incorporates structural and manufacturing constraints as part of the optimization problem, as opposed to the post processing steps in the pressure matching method. The manual iteration among multiple design points and the post processing of the designed geometry when a pressure matching method is used, often lead to lower performance and long flow time to achieve design closure. This is particularly true when the configuration is complex and the simple sweep approximation of an advanced technology airfoil does not approximate the three-dimensional flow. The method has also been applied to optimize the shape of the cruise wing and body of a supersonic transport airplane, resulting in dramatic performance improvement. It is not an exaggeration to state that the optimization method is one of the key enabling technologies for the development of supersonic transport aircraft. The technology is new and the exploration of how to exploit this method has just begun, but it has already shown its potential for reducing the design cycle time as well as for improving the product performance. Multi-point, constrained optimization is becoming the preferred method for designing the cruise configuration. The practice of wing design using Navier-Stokes methods is confined to the pressure matching approach using the DISC method [4]. 
A Jameson-Technology (J-T) based TLNS solver [5,6] integrated with an in-house grid generation code [7] is used quite extensively at Boeing in the product development phase because of its quick turn-around using our in-house Triton computer. A multi-block version of the code is also used quite often for pre-test analysis [8]. In the last two years, the OVERFLOW code [9] using over-set grid strategy has developed into an effective tool. These two codes are used in the research environment in order to sort out CFD accuracy issues. Whether in the analysis mode or in the design mode, the most desirable characteristic of a CFD code for cruise aerodynamics is its capability to predict drag. For engineering purposes, the prediction of drag increments due to geometry change or due to changes in flow parameters such as Reynolds number, is a very useful capability. The prediction of absolute drag number will be highly valued, but is more difficult to achieve. For initial validation of this capability, it is desirable to use a simple geometry for which dense grids can be generated to understand their effects on such a sensitive quantity as drag. Therefore, a simple transonic wing-body combination was chosen for
Figure 1 Wing-Body in C-grid Topology
this purpose. Figure 1 shows a grid configuration we used to investigate the accuracy of TLNS3D drag prediction. The grid has a C-H topology around the wing-body configuration. Within the constraints of this grid configuration, a dense grid of 297 x 97 x 97 was generated. The computed polar shape agrees well with the test data, if we shift the drag polar by subtracting 25 drag counts. This shift is very large and can not be explained by any possible experimental error. One likely source of error in the computations is inadequate resolution near the front part of the fuselage because of the grid topology. Near the front attachment point, the flow varies rapidly. The C-grid provides good resolution for the wing, but may not provide adequate resolution near the nose of the fuselage. The overset grid technology provides a flexibility in the grid generation
INDUSTRIAL STRENGTH NAVIER-STOKES CODES
that allows us to investigate this problem further. The overset grid shown in Fig. 2 was generated, and an OVERFLOW solution was obtained. This grid configuration covers the attachment region with a higher grid density than the C-H grid while providing only slightly higher grid density on the wing. The drag polar shift from the test data was reduced to approximately 10 counts, still beyond the possible experimental error band. Further refinement of the grid by 50% reduces the drag discrepancy to 8.3 counts. The largest source of this change in drag due to grid changes is the pressure drag on the fuselage. Further study will be needed to understand whether this gap can be closed further by increasing numerical accuracy. What this example tells us is that we have no a priori knowledge of where to concentrate our grid to obtain a given piece of engineering information. It is often too tempting to quickly blame wind tunnel testing, turbulence modeling, and factors other than numerical accuracy for the discrepancy between test data and computed results. An automatic grid adaptation capability with proper adaptive criteria will certainly offer more consistent results. In principle, we need accurate drag prediction to be able to use CFD codes for performance optimization. However, if the error is systematic and has a fixed value for a configuration that is undergoing a detailed design of wing shape, only the prediction of increments due to variation in shape is important. Design decisions are usually made by examining the increment from the baseline. Our experience so far indicates that CFD predictions of drag increments due to configuration changes seem to be reasonably accurate. This is illustrated in Fig. 3, which shows both the computed and measured increments between two wing-body configurations. The agreement is good enough to provide a basis for engineers to make design decisions.
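The argument that a fixed systematic error cancels in drag increments can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the drag coefficients and the 10-count offset are hypothetical numbers chosen for the example, not data from this study. One drag count is 0.0001 in drag coefficient.

```python
# Illustrative only: all drag values below are hypothetical, not data from the text.
COUNT = 1.0e-4  # one drag count corresponds to 0.0001 in drag coefficient


def to_counts(cd):
    """Convert a drag coefficient (or a difference of two) to drag counts."""
    return cd / COUNT


def increment_counts(cd_new, cd_baseline):
    """Drag increment of a new configuration relative to the baseline, in counts."""
    return to_counts(cd_new - cd_baseline)


# Hypothetical "true" drag of a baseline and a modified wing-body.
cd_true_baseline = 0.0250
cd_true_modified = 0.0243

# Assumption: the computation carries the same fixed offset for both shapes.
systematic_error = 10 * COUNT
cd_cfd_baseline = cd_true_baseline + systematic_error
cd_cfd_modified = cd_true_modified + systematic_error

true_increment = increment_counts(cd_true_modified, cd_true_baseline)
cfd_increment = increment_counts(cd_cfd_modified, cd_cfd_baseline)

# The absolute levels differ by 10 counts, but the increments agree.
print(round(true_increment, 3), round(cfd_increment, 3))
```

The cancellation only holds as long as the error really is the same for both configurations; an error that depends on the shape change would appear directly in the increment.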
To predict the absolute performance level of an airplane including strut, engine and all the empennages, a large grid size will be needed. It is estimated that 15 to 20 million grid points may be required to analyze a flaps-up configuration. With such a large size grid, a fast converging scheme will be required for building an effective code to be used in a product development environment. The Navier-Stokes code predicts accurately the pressure distributions on the wing surfaces. Figure 4 shows the result of a multi-block Navier-Stokes calculation of a wing-body-strut-nacelle configuration with 5.1 million grid points in 16 blocks. The execution time on the CRAY C-90 is 15 CPU hours to obtain pressure distributions with adequate engineering accuracy. The code is also used to monitor the health of the boundary layer during shape design in order to prevent premature boundary layer separation. Here, the grid spacing distribution within the boundary layer at various locations on the surface of the airplane is estimated, based on the given Reynolds number. The confidence level of the computed results will be greatly enhanced if a reliable solution adaptive method can be developed.
Figure 2 Overset grid for wing-body: panels (a) and (b)
Figure 3 Drag Increments Between Two Wing-Body Configurations
20.3 High-Lift Wing Analysis and Design
The definition of the high-lift wing configuration is probably the second most important aerodynamic task, after the cruise design. Three-dimensional Navier-Stokes codes of today have not yet been applied to this task in the airplane development process. Even their use in the research environment is very limited. The CFD technology is simply not ready. As in cruise wing design, the high-lift wing design process involves two steps: the definition of the planform configuration and the detailed shaping of the leading edge and trailing edge devices. The first step in many cases involves discrete configuration changes, such as different leading edge and trailing edge devices. Optimization methods based on continuous variation of the configuration do not apply in this case. Therefore, we envision that the initial application of CFD to a high-lift wing will be to evaluate the relative merits of different configurations. This will require a code which can turn around very rapidly. For the second step of the process, a constrained design capability is desirable. Because of the importance of the cruise configuration, the high-lift wing geometry is highly constrained. Only a few places on the wing surface and a few geometric parameters are allowed to change at this stage. In addition, the kinematics of the flap support mechanisms and other factors further constrain the shape of the wing. Design codes must deal with these constraints. For a high-lift wing, it is estimated that between 35 million and 50 million
Figure 4 Pressure Distributions on a Wing-Body-Strut
grid points may be required to obtain a solution of engineering value. Based on this number and on the current performance of the JT-TLNS and OVERFLOW codes, approximately 400 CPU hours on a Cray C-90 computer are required to complete one analysis. As an example, a configuration decision on whether to use a slat or a Kruger leading edge device for a new airplane would require a minimum of 32 runs. That corresponds to about 13,000 hours on a Cray C-90. In addition, the task needs to be done in one month for a timely configuration decision. One also has to recognize that this is but one of the many planform decisions to be made. Therefore, CFD codes of today are quite inadequate to perform the expected functions simply from a throughput point of view. Of course, computing speed and convergence acceleration are not the only issues for CFD applications to high-lift wings. Flow separations are invariably present on a high-lift wing and need to be modeled. Unfortunately, flow separation from a smooth surface under a continuous adverse pressure gradient is notoriously sensitive to small variations in flow conditions. This sensitivity causes problems not only for CFD, but also for wind tunnel modeling of the vehicle in flight. The subject of modeling the flow physics in the context of a Reynolds-Averaged Navier-Stokes (RANS) formulation probably deserves a full paper in its own right. Our contention is that unless we have a code that resolves the flow well numerically, any conclusions on the deficiencies of any RANS model are tentative at best. For example, whether a one-equation turbulence model, such as the Spalart-Allmaras model [10], can adequately predict three-dimensional separated flows for high-lift
configurations cannot be resolved until we are sure that numerical inaccuracy does not contaminate the results. Geometry as complex as a high-lift wing always presents a challenge to the grid generation process. With the current structured-grid generation system, it takes approximately 200 days to generate an overset or a multi-block grid. For variations of similar configurations, the grid generation time can be reduced to a few weeks. A possible solution to this grid generation problem may be the use of an unstructured grid algorithm, which presents fewer obstacles to developing an automatic grid generation process.
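The throughput argument made earlier in this section for the slat versus Kruger decision is simple arithmetic, sketched below. The run count and CPU hours per analysis are taken from the text; the 30-day month and the assumption of perfectly utilized, dedicated processors are simplifications introduced here for illustration.

```python
# Figures quoted in the text.
cpu_hours_per_analysis = 400  # one high-lift analysis on a Cray C-90
runs_for_decision = 32        # minimum runs for the slat vs. Kruger decision

# Assumption for illustration: one month is 30 days of continuous wall-clock time.
hours_in_month = 30 * 24

total_cpu_hours = cpu_hours_per_analysis * runs_for_decision

# Number of fully dedicated processors needed to finish within the month,
# assuming perfect utilization and no queueing or rerun overhead.
processors_needed = total_cpu_hours / hours_in_month

print(total_cpu_hours)               # 12800, i.e. "about 13,000 hours"
print(round(processors_needed, 1))   # 17.8
```

Under these assumptions a single configuration decision would tie up the equivalent of roughly eighteen dedicated processors for a month, which is the sense in which the text calls current codes inadequate from a throughput point of view.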
20.4 Prediction of Handling Quality and Control Effectiveness
The aerodynamic stability discipline is involved in two aspects of engineering during the development of an airplane. One is the prediction of airplane handling characteristics, so that a good configuration can be defined. The other is providing a database of approximately 200,000 cases for loads, flight control, and flight simulators. If one intends to use CFD codes to generate the required database, the throughput of the code must be able to handle this large number of cases within a few months. The current codes and CFD technologies do not meet these requirements. Very few applications of CFD are mature enough to be used in this phase of the airplane development process. Only the capability of the codes to predict some critical characteristics is being explored in the research environment. Here, the prediction of airplane pitch characteristics is taken as a research example. Figure 5 shows the pitch characteristics of a wing-body combination. The Navier-Stokes code with the Spalart-Allmaras turbulence model seems to be able to predict shock-induced flow separation reasonably well. The next question is whether the transonic pitch-up of the entire airplane, including the tails, can be accurately predicted. Figure 6 shows the predicted downwash at the tail location as a function of angle of attack for a wing-body combination, compared with the wind tunnel measurements. For this calculation, there is no solution-adaptive capability to capture the wake of the wing, and the wake is very much diffused by the coarse grid cells behind the wing. It seems that the circulation of the longitudinal component of the wake system is conserved by the numerical scheme, and the downwash is predicted with surprising accuracy. Thus, it appears that we have a good chance of using CFD to predict the pitch-up of the entire airplane. The question following that would be whether we can predict the effects of vortex generators on the pitch characteristics.
This is ambitious for the near term, but is needed in the airplane development process to reduce the testing of the configurations of vortex generators. The best technical approach to developing
Figure 5 Pitch characteristics for a Wing-Body
such a capability is still unclear. If resolving each vortex generator is to be avoided, a "turbulence model" that includes the spanwise-averaged effects of the vortex generators must be developed. This is not an easy task and may not even be possible. If we decide to resolve each vortex generator, it will greatly increase the number of grid points required and will certainly require grid adaptive capability to capture free vortices. Even if this is possible, we still need a turbulence model that applies to a vortex immersed in a boundary layer. Once one clears the lower hurdles in CFD, the bar is immediately raised!
20.5 Conclusions
CFD for cruise applications is highly developed. In contrast, Navier-Stokes codes for application to separated flows are in their infancy. For engineering applications, two aspects of algorithmic research will be very valuable: convergence acceleration and a grid adaptive capability for complex geometry. An evaluation of current structured-grid technologies indicates that grid adaptation is very difficult to implement in the context of either block-structured or overset methods. It appears that the unstructured grid method is a better vehicle for implementing these new algorithms. Current CFD codes have spectral radii of greater than 0.99. With this
Figure 6 Prediction of Down-Wash of a Wing Body at the Horizontal Tail Location
rate of convergence, approximately 400 iterative cycles are required to reach a four-order-of-magnitude reduction in residuals. Recent advances in multi-grid and other iterative methods indicate that there is a potential for large improvements. It is reasonable to expect that algorithm improvement will contribute a factor of 5 reduction in CPU hours within the next 5 years. Our projection is that advances in computer hardware will give us a 5-fold increase in throughput every 5 years while holding the cost constant. This is a reasonable expectation. With this projection, there will be a 125-fold increase in throughput in the next ten years. A high-lift wing analysis will then take about 3 hours to compute and can become an effective tool in a product development environment. In order to ensure the engineering accuracy of the Navier-Stokes solution for a separated flow, grid adaptation to the pressure gradient, and to both wall-bounded and free shear layers, is needed. For high Reynolds number, this implies a directional adaptive algorithm. The criteria for adaptation in a complex three-dimensional flow constitute an important subject of algorithm research. When implementing grid adaptation, the surface geometry description, the grid generation and the solver can be quite intimately connected, and they require an integrated approach as early in the code development as possible.
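The convergence and throughput figures in this section can be related with a short calculation. The sketch below rests on an idealization that is not stated in the text: the residual is assumed to shrink by one constant factor per cycle, equal to the spectral radius. The function names are introduced here for illustration only.

```python
import math


def cycles_for_reduction(rate, orders=4):
    """Cycles needed to cut the residual by `orders` decades,
    assuming a constant reduction factor `rate` per cycle."""
    return math.ceil(orders * math.log(10.0) / -math.log(rate))


def rate_for_cycles(cycles, orders=4):
    """Average per-cycle reduction factor implied by reaching
    `orders` decades in `cycles` cycles."""
    return 10.0 ** (-orders / cycles)


cycles_at_099 = cycles_for_reduction(0.99)  # 917 cycles
rate_for_400 = rate_for_cycles(400)         # about 0.977

# Throughput projection from the text: a factor of 5 from algorithms,
# and a factor of 5 from hardware in each of two 5-year periods.
algorithm_gain = 5
hardware_gain_per_5_years = 5
total_gain = algorithm_gain * hardware_gain_per_5_years ** 2  # 125

# Projected cost of a high-lift analysis that takes 400 CPU hours today.
projected_hours = 400 / total_gain  # 3.2

print(cycles_at_099, round(rate_for_400, 4), total_gain, projected_hours)
```

Under this idealized model a factor of 0.99 corresponds to roughly 900 cycles, while 400 cycles corresponds to an average factor near 0.977, so the quoted figures are best read as order-of-magnitude estimates. The projection gives about 3 hours per high-lift analysis, consistent with the few-hour estimate in the text.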
Acknowledgements
This paper draws from the work of the research team at Boeing Commercial Airplane Group, particularly that of Drs. Steve Allmaras, John Bussoletti, Hoa Cao, Dinesh Naik, Philippe Spalart, Bernard Su, Larry Wigton, Jong Yu, and Mr. T. J. Kao. The author is particularly indebted to Drs. Jong Yu, Venkat Venkatakrishnan, and Dinesh Naik for providing the figures, for reviewing the draft and for making valuable suggestions.
REFERENCES
[1] Wen-Huei Jou, Laurence B. Wigton, Steven R. Allmaras, Philippe Spalart & Jong Yu, Towards Industrial-Strength Navier-Stokes Codes, in Numerical and Physical Aspects of Aerodynamic Flows V, ed. T. Cebeci, Springer-Verlag, January 1992.
[2] Goldhammer, M. I. & Steinle, F. W., Design and Validation of Advanced Transonic Wing Using CFD and Very High Reynolds Number Wind Tunnel Testing, ICAS Proceedings, 1990, pp. 1028-1042.
[3] Huffman, R. P., Melvin, B. G., Young, D. P., Johnson, F. T., Bussoletti, M. B., Bieterman, M. B. & Hilmes, C. L., Practical Design and Optimization in Computational Fluid Dynamics, AIAA Paper 93-3111, July 1993.
[4] Yu, N. J. & Campbell, R. L., Transonic Airfoil and Wing Design Using Navier-Stokes Codes, AIAA Paper 92-2651, 1992.
[5] Vatsa, V. N. & Wedan, B. W., Development of an Efficient Multi-grid Code for 3-D Navier-Stokes Equations, AIAA Paper 89-1791, 1989.
[6] Vatsa, V. N., Sanetrik, M. D. & Parlette, E. B., Development of a Flexible and Efficient Multigrid-based Multi-block Flow Solver, AIAA Paper 93-0677, 1993.
[7] Kao, T. J., Su, T. Y. & Yu, N. J., Navier-Stokes Calculations for Transport Wing-body Configurations with Nacelles and Struts, AIAA Paper 93-2945, 1993.
[8] Yu, N. J., Su, T. Y. & Wilkinson, W. M., Multiblock Grid Generation Process for Complex Configuration Analysis Using Navier-Stokes Solvers, AIAA Paper 96-1995, 1996.
[9] Buning, P. G., et al., OVERFLOW Users' Manual, Version 1.6ap, NASA Ames Research Center, Moffett Field, CA, 1994.
[10] Spalart, P. R. & Allmaras, S. R., A One-equation Turbulence Model for Aerodynamic Flows, La Recherche Aerospatiale, No. 1, 1994, pp. 5-21.
21 What Have We Learned from Computational Fluid Dynamics Research on Train Aerodynamics?
Kozo Fujii¹ and Takanobu Ogawa²
21.1 Introduction
Applications to aerospace vehicles have accelerated computational fluid dynamics technologies for the last three decades. Now the technologies are spreading to other application areas. One such application is train aerodynamics. In recent years, the speed of trains has rapidly increased. Some high-speed trains now run at 300 km/h, which corresponds to a Mach number of roughly 0.25. The compressibility effect may not be too important in this Mach number range, but it becomes critically important when trains go into a tunnel. Figure 1 shows the schematic picture of the physical
Fig. 1 Mechanism of the booming noise generation
¹ The Institute of Space and Astronautical Science, Sagamihara, Kanagawa, 229, JAPAN
² Shimizu Corporation, Etchujima 3-4-17, Koto-Ku, 135, JAPAN
Frontiers of Computational Fluid Dynamics — 1998. Editors: David A. Caughey & Mol
FUJII & OGAWA
phenomenon that occurs when a train goes into a tunnel. When a train enters a tunnel, it acts like a piston and increases the pressure inside the tunnel. This increased pressure creates a compression wave in front of the train, which propagates down to the exit of the tunnel. Due to the nonlinear effect of the compression wave, the gradient of this wave becomes steeper and steeper as it propagates. Thus, a rapid pressure change is created near the exit of the tunnel. If the tunnel is long enough, a shock wave may be established. When this compression wave goes out from the exit, it becomes a pulse-like wave. It is called "Booming Noise" and is considered to be one of the important social problems associated with the development of high-speed trains. This "Booming Noise" is a low-frequency noise, and it strongly shakes the window panes of the houses near the tunnel exit. Ozawa [10] clarified that the strength of the booming noise is proportional to the pressure gradient, at the tunnel exit, of the compression wave created by the train. Therefore, the booming noise would be effectively alleviated by slowing the pressure increase at the tunnel entrance. To see how the pressure increases at the entrance and to find out how to reduce the gradient, it is necessary to investigate the transient flow field inside the tunnel created by the train entry. In the past, such flow fields were investigated mainly by field measurements for real trains. The details of the transient flow field were not discussed because most of the field measurements gave only a limited amount of information. With regard to laboratory experiments, axisymmetric bodies were fired into a duct and the pressure increase inside the duct was investigated. These types of experiments implicitly assume that the pressure increase inside the duct depends only on the change of the cross-sectional area and that three-dimensional effects are not important.
Using recent CFD technology, the investigation of the flow field can be carried out much more easily. With moving overset grid methods, changing the train geometry is an easy task, and detailed data for the three-dimensional transient flow fields can be obtained. We have been investigating the flow fields using this technology for the last several years [5-9]. In the present paper, we would like to summarize what we have done and try to show the basic important features of the aerodynamics of a train entering a tunnel. Based on the flow simulations, a simple theory is developed for the estimation of the compression wave created inside the tunnel. The process of the development may be a good example of how to use CFD technology for the analysis of engineering problems.
21.2 Numerical Method
21.2.1 Basic Equations
The basic equations are the three-dimensional compressible Navier-Stokes equations written in the generalized coordinate system.
CFD FOR TRAIN AERODYNAMICS
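\[
\partial_\tau Q + \partial_\xi E + \partial_\eta F + \partial_\zeta G = Re^{-1}\,\partial_\zeta G_v \tag{21.1}
\]
where
\[
E = \frac{1}{J}\begin{bmatrix} \rho U \\ \rho u U + \xi_x p \\ \rho v U + \xi_y p \\ \rho w U + \xi_z p \\ (e+p)U - \xi_t p \end{bmatrix},\quad
F = \frac{1}{J}\begin{bmatrix} \rho V \\ \rho u V + \eta_x p \\ \rho v V + \eta_y p \\ \rho w V + \eta_z p \\ (e+p)V - \eta_t p \end{bmatrix},\quad
G = \frac{1}{J}\begin{bmatrix} \rho W \\ \rho u W + \zeta_x p \\ \rho v W + \zeta_y p \\ \rho w W + \zeta_z p \\ (e+p)W - \zeta_t p \end{bmatrix},
\]
\[
G_v = \frac{1}{J}\begin{bmatrix} 0 \\ \mu m_1 u_\zeta + (\mu/3)\, m_2 \zeta_x \\ \mu m_1 v_\zeta + (\mu/3)\, m_2 \zeta_y \\ \mu m_1 w_\zeta + (\mu/3)\, m_2 \zeta_z \\ \mu m_1 m_3 + (\mu/3)\, m_2 (\zeta_x u + \zeta_y v + \zeta_z w) \end{bmatrix} \tag{21.2}
\]
where
\[
U = \xi_t + \xi_x u + \xi_y v + \xi_z w,\quad
V = \eta_t + \eta_x u + \eta_y v + \eta_z w,\quad
W = \zeta_t + \zeta_x u + \zeta_y v + \zeta_z w
\]
and
\[
m_1 = \zeta_x^2 + \zeta_y^2 + \zeta_z^2,\quad
m_2 = \zeta_x u_\zeta + \zeta_y v_\zeta + \zeta_z w_\zeta,\quad
m_3 = \tfrac{1}{2}\,(u^2 + v^2 + w^2)_\zeta + Pr^{-1}(\gamma - 1)^{-1}(a^2)_\zeta
\]
The pressure is expressed by the equation of state for an ideal gas,
\[
p = (\gamma - 1)\left[ e - \tfrac{1}{2}\rho\,(u^2 + v^2 + w^2) \right] \tag{21.3}
\]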
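As a minimal sketch of how two of the quantities above are evaluated, the following assumes the ideal-gas relation p = (gamma - 1)[e - rho(u^2 + v^2 + w^2)/2] and the contravariant velocity U = xi_t + xi_x u + xi_y v + xi_z w. The function names, the value gamma = 1.4 for air, and the sample state are assumptions made for this illustration and are not taken from the authors' code.

```python
def pressure(rho, u, v, w, e, gamma=1.4):
    """Ideal-gas pressure from density, velocity and total energy per unit volume."""
    return (gamma - 1.0) * (e - 0.5 * rho * (u * u + v * v + w * w))


def total_energy(rho, u, v, w, p, gamma=1.4):
    """Inverse relation: total energy per unit volume from the pressure."""
    return p / (gamma - 1.0) + 0.5 * rho * (u * u + v * v + w * w)


def contravariant_velocity(metric_t, metric_x, metric_y, metric_z, u, v, w):
    """U = xi_t + xi_x u + xi_y v + xi_z w; the time metric xi_t carries the grid motion."""
    return metric_t + metric_x * u + metric_y * v + metric_z * w


# Round trip on a sample non-dimensional state (hypothetical values).
rho, u, v, w = 1.0, 0.2, 0.0, 0.0
p_in = 1.0 / 1.4
e = total_energy(rho, u, v, w, p_in)
p_out = pressure(rho, u, v, w, e)

# Assumed example: a Cartesian grid translating with the train at speed 0.221,
# xi = x - 0.221 t, so xi_t = -0.221 and xi_x = 1. In still air (u = 0) the
# contravariant velocity is the air speed relative to the moving grid.
U_rel = contravariant_velocity(-0.221, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0)

print(p_out, U_rel)
```

The second example shows why the time metrics cannot be dropped for a moving grid: without xi_t the flux through the cell faces of the train grid would be computed as if the grid were at rest.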
Since the computations require grid movements, it is necessary to include the time metric terms. The motion of the grid is expressed by the terms $\xi_t$, $\eta_t$, and $\zeta_t$. In some of the computations, the axisymmetric Euler equations are used instead of the three-dimensional Navier-Stokes equations. The basic equations are similar, except that some terms are simplified and no viscous terms are included. These equations are integrated in time, but the basic equations are modified to allow the
information exchange among the grid zones. The interface method used in the present study is called the "Fortified Solution Algorithm" [2], and it is briefly explained in the next section.
21.2.2 Fortified Solution Algorithm - Zonal Interface Method
The flow field is decomposed into several zones using an overset grid method. Although the detailed zonal structure is slightly different for each case, a typical grid structure for the three-dimensional computation is schematically shown in Fig. 2. The train has its own grid that moves at the train speed. It is called the "train zone". For the geometrical treatment, another moving grid is prepared for the flow field underneath the train. This is called the "bottom zone". The tunnel has its own grid, and the entrance region has its own grid. Since there is a topological singularity at the corner of the tunnel entrance, a patched grid is prepared to avoid the singularity. It is called the "collar zone". An intermediate grid region covering the train grid is prepared to save the computer time necessary for the interpolation of the data transferred from one zone to another. This grid region has x = const. grid sections and moves with the train grid. The coefficients for the interpolation between the train grid and this intermediate grid do not change in time because both grids move at the same speed. Since the tunnel grid and the intermediate grid both have x = const. grid sections, the interpolation between the intermediate grid and the tunnel grid is essentially two-dimensional, and the use of this intermediate grid region saves computer time for the search and the interpolation. This intermediate grid region also serves to enhance the local grid resolution.
Fig. 2 The flow field and the zonal grid structure
To transfer the information among the zones, Eq. (21.1) is modified to include a source term as
\[
\partial_\tau Q + \partial_\xi E + \partial_\eta F + \partial_\zeta G = Re^{-1}\,\partial_\zeta G_v + |\chi|\,(Q_f - Q) \tag{21.4}
\]
The second term on the right-hand side is the forcing term. When the absolute value of $\chi$ is set to be sufficiently large, the solution is forced to be $Q_f$. When the absolute value of $\chi$ is zero, the ordinary Navier-Stokes equations are solved. The Fortified Solution Algorithm (FSA) method uses the $\chi$ function as a flag to set the
region where the solution should be enforced (fortified) to be the values given. In the present zonal computations, the solution that is obtained by the grid zone having a higher priority is given as $Q_f$, and the $\chi$ values are set to be sufficiently large in that region. The details of the FSA zonal method can be found in Ref. 7. The computational grid may go into the body geometry. Since the physical values at the grid points inside the body do not have any meaning, they should not be used for the computation. In the present FSA method, $\chi$ is set to a large negative value inside the body, and the program automatically discards the meaningless data that should not be used. There are several advantages in applying the overset zonal moving grid system to the present study. Obviously, the geometrical complexity is alleviated, and a solution code for structured grids can be easily applied to complex flow configurations. Problems of the current type frequently require parametric studies. In the overset zonal approach, only the train grid has to be changed when a different train configuration is considered. Local grid resolution can be easily enhanced by putting a fine-grid zone in an appropriate region, with a smaller increase in the total number of grid points.
21.2.3 Discretizations
The LU-ADI algorithm [4] is used for the time integration, with the convective terms discretized by Roe's flux difference splitting [11]. Higher-order accuracy is achieved by the MUSCL interpolation [13] using primitive variables. Viscous terms are discretized by central differencing. The algebraic turbulence model of Baldwin & Lomax [1] is used since the Reynolds number is on the order of $10^7$ and no strong separations are expected.
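The effect of the forcing term in Eq. (21.4) can be seen on a scalar model problem, dq/dt = R + |chi| (q_f - q), where R stands for the ordinary flow residual. The sketch below is not the authors' implementation: the function name, the single-step update, and the implicit treatment of the forcing term (chosen here so that the step stays stable for arbitrarily large |chi|) are assumptions made for illustration.

```python
def fortified_update(q, residual, q_f, chi, dt):
    """One time step of dq/dt = residual + |chi| * (q_f - q).

    The forcing term is evaluated at the new time level:
        (q_new - q) / dt = residual + |chi| * (q_f - q_new)
    which gives the closed-form update below.
    """
    a = abs(chi) * dt
    return (q + dt * residual + a * q_f) / (1.0 + a)


# Hypothetical values for a single grid point.
q_old, residual, q_forced, dt = 1.0, 0.5, 3.0, 0.1

# chi = 0: the ordinary update, the point is solved as usual.
free = fortified_update(q_old, residual, q_forced, chi=0.0, dt=dt)

# |chi| very large: the point is overwritten by the value supplied
# from the zone with higher priority.
forced = fortified_update(q_old, residual, q_forced, chi=1.0e12, dt=dt)

print(free, forced)
```

With chi acting as a per-point flag, the same solver loop can run unchanged over every zone: interior points advance normally, and interface points are driven to the interpolated solution from the neighboring zone.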
21.2.4 Boundary and Initial Conditions
Since it is difficult to simulate the whole tunnel region, and it has been shown that the strength of the booming noise depends on the slope of the compression wave near the entrance, the tunnel is cut where the compression wave is fully established. At this boundary, a one-dimensional non-reflecting boundary condition is imposed to avoid unphysical wave reflections. To avoid the initial disturbance due to the train acceleration, the train should initially be placed sufficiently far from the tunnel entrance. However, this would require enormous computer time for the train moving outside the tunnel, which is not of interest here. The approach we have taken is to compute the steady-state flow field over the train and use the solution as an initial condition for the flow field near the train. The effect of the tunnel entrance is approximately taken into account by setting the boundary condition at the right end of the computational region for this steady-state computation [9]. For the viscous computations, the no-slip boundary condition is imposed on the train body surface because flow separations may alter the effective train blockage and thus may change the strength of the pressure wave. In other words, the boundary layer developing on the train body is resolved. On the other hand, the slip boundary
Fig. 3 Experimental Apparatus by Maeda et al. [13]
condition is imposed on the tunnel wall and the ground surface even in the viscous computations. For the inviscid computations of the axisymmetric cases, slip boundary conditions are imposed on all the boundary surfaces. The pressure on the solid wall is obtained by solving the normal momentum equation.
21.3 Results and Discussion
21.3.1 Axisymmetric Computations - Validation and Parametric Studies
There have been several analytical and numerical studies on the tunnel entry problem. Yamamoto proved that the gradient of the compression wave is proportional to $U^3$, where $U$ is the velocity of the train [15]. Laboratory experiments justified this theory up to 200 km/h. However, trains now run at speeds in the range of 300 km/h, and the speed of the MAGLEV (magnetic levitation) train under development by Japan Railway Central will be 550 km/h. In this speed range, nonlinear effects may cause deviations from the theory. Most of the numerical work so far has used one-dimensional equations in which the train entry is modeled as a change of the tunnel cross section in time. It has been shown that the level of the pressure increase is well predicted by the one-dimensional simulations. However, the gradients of the compression waves were not necessarily well predicted (as will be shown later). In the present section, axisymmetric computations are conducted to identify the applicability of the existing theories. There is a laboratory experiment conducted by Maeda et al. [13]. Figure 3 shows their experimental apparatus. Axisymmetric bodies were fired into the duct, and the pressure change on the duct surface was measured. Corresponding to the experiment, three nose shapes are considered: elliptic, parabolic and conical, as shown in Fig. 4. The conditions used in the experiment are summarized as follows: the blockage ratio of the train to the tunnel is $R_t$ = 0.116, the train speed is $U$ = 230 km/h (Mach number 0.188), and the aspect ratio a/b (see Fig. 4) is 5.0. Note that the blockage ratio of the typical bullet trains in Japan is 0.22.
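The speed figures quoted above can be checked with a short calculation. The sketch below assumes a speed of sound of 340 m/s, which is not stated in the text, and applies the $U^3$ law attributed to Yamamoto purely as a scaling illustration; as the text notes, the law has only been verified up to 200 km/h, so the extrapolated ratios are not predictions.

```python
SPEED_OF_SOUND = 340.0  # m/s, assumed ambient value (not given in the text)


def mach_number(speed_kmh, a=SPEED_OF_SOUND):
    """Mach number of a train running at `speed_kmh` kilometers per hour."""
    return speed_kmh / 3.6 / a


def gradient_ratio(speed_kmh, reference_kmh):
    """Ratio of compression-wave gradients implied by the U^3 law."""
    return (speed_kmh / reference_kmh) ** 3


print(round(mach_number(230.0), 3))   # 0.188, the experimental condition
print(round(mach_number(300.0), 3))   # 0.245, "roughly Mach 0.25"

# Relative to the 200 km/h upper limit of the experimental verification:
print(gradient_ratio(300.0, 200.0))            # 3.375
print(round(gradient_ratio(550.0, 200.0), 1))  # 20.8
```

The steep cubic growth is what makes the higher speed ranges a concern: going from 200 to 300 km/h would more than triple the wave gradient if the law continued to hold.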
The corresponding computations were carried out using the axisymmetric Euler equations. The viscous effect will be discussed in a later section. The preliminary computations showed that the non-dimensional spatial scale of the compression wave is on the order of 5 (note that the tunnel height is taken as unity) and that at least 20 grid points are required to resolve the gradient of the compression wave. The grid size in the x direction (the direction of the train movement) was therefore taken to be 0.059, which is much less than 5/20. The number of grid points is summarized in Table I. Since only Euler computations were done, the two-step Runge-Kutta time integration scheme was used instead of the LU-ADI implicit time integration method. In the following, non-dimensional time is counted with 0.0 being the time when the train nose enters the tunnel. The time step was set to 0.005 after checking the effect of the spatial and temporal grid resolutions. Figure 5 shows the time history of the tunnel wall pressure compared with the experiment. The solid lines are the computed results and the marks are the experimental data. The location of the measurement is 6.80 non-
Fig. 5 Time history of the tunnel wall pressure - comparison with the experiment -
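The grid-resolution requirement used for the axisymmetric computations amounts to a simple check, sketched below with the numbers quoted in the text (wave scale of order 5, at least 20 points across the wave, chosen spacing 0.059). The variable names are introduced here for illustration.

```python
wave_scale = 5.0         # non-dimensional length of the compression wave front
min_points = 20          # minimum number of points needed across the wave
dx = 0.059               # grid spacing chosen in the direction of train motion

max_dx = wave_scale / min_points      # largest acceptable spacing: 0.25
points_across_wave = wave_scale / dx  # points actually placed across the wave
margin = max_dx / dx                  # how much finer than the minimum

print(max_dx, round(points_across_wave, 1), round(margin, 1))
```

With roughly 85 points across the wave front, the chosen spacing is about four times finer than the stated minimum.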
The results so far seem to indicate that one-dimensional analysis can predict the dependency of the pressure gradient on the two important parameters: the train speed and the blockage ratio. Now, to see the capability of the one-dimensional simulations, two problems are considered. One is the effect of the blockage ratio with the gradient of the cross section of the train kept constant. The one-dimensional analysis treats the movement of the train as an area change of the tunnel, and therefore the pressure increase occurs due to the time-dependent variation of the cross section of the tunnel. If the train speed is the same, the gradient of the compression wave would depend only on the time-dependent variation of the cross section of the train. Figure 8 shows the four types of trains considered. The cross sections are plotted in Fig. 9. As can be seen, the derivatives of the cross-sectional area of the nose region are the same for all four trains.
Fig. 7 Maximum time derivative of the compression wave vs. tunnel blockage ratio $R_t$
Fig. 8 Train shapes with different blockage ratios but the same gradient (Type 1: $R_t$ = 0.250; Type 2: $R_t$ = 0.116; Type 3: $R_t$ = 0.060; Type 4: $R_t$ = 0.030)
The computed results are shown in Fig. 10. The results of both the axisymmetric and the one-dimensional simulations are plotted. In the one-dimensional analysis, the pressure is constant until the train enters the tunnel, since the pressure starts to increase only when the effective cross-sectional area starts to change. As has been mentioned, the time history of the pressure increase is the same for all four cases, although the maximum pressure levels are different. In other words, the gradient of the compression wave is the same for all the cases and does not depend on the blockage ratio. In the axisymmetric analysis, by contrast, the pressure increase has already occurred before the train enters the tunnel, which is physically reasonable. As Figure 10 shows, the time-dependent behavior of the pressure increase (the profile of the wave) would never be accurately simulated by the one-dimensional analysis. In addition, the blockage ratio does not influence the gradient of the compression wave in the one-dimensional analysis. In reality, the blockage ratio is one of the important parameters for the gradient of the compression wave, and a multi-dimensional analysis is necessary for an accurate evaluation of the gradient, although the one-dimensional analysis is still useful for a rough estimation, as Fig. 10 indicates.
Fig. 9 Distribution of the cross-sectional area of the train nose
Fig. 10 Time history of the tunnel wall pressure: x = 6.8
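The reason the one-dimensional analysis gives the same wave gradient for all four trains can be sketched with a simple area model. In the sketch below, the blockage ratios are the values read from the Fig. 8 labels; the linear growth of the nose cross section, the slope value, and the unit train speed are assumptions introduced only for illustration and do not represent the actual nose shapes of Fig. 9.

```python
def nose_area(x, slope, blockage):
    """Train cross section (as a fraction of the tunnel area) at distance x
    behind the nose tip: linear growth, then constant at the blockage ratio."""
    return min(max(x, 0.0) * slope, blockage)


def area_rate(x, slope, blockage, speed):
    """Rate of area change seen at a fixed tunnel station while the nose
    passes: dA/dt = speed * dA/dx, zero once the full section has arrived."""
    return speed * slope if 0.0 <= x * slope < blockage else 0.0


blockages = [0.250, 0.116, 0.060, 0.030]  # Types 1 to 4
slope = 0.05   # hypothetical dA/dx, identical for all four trains
speed = 1.0    # same non-dimensional train speed for all four trains

# During nose passage every train imposes the same rate of area change ...
rates = [area_rate(0.1, slope, b, speed) for b in blockages]

# ... but the change lasts longer, and ends higher, for larger blockage.
nose_lengths = [b / slope for b in blockages]
final_areas = [nose_area(100.0, slope, b) for b in blockages]

print(rates)
print(nose_lengths)
print(final_areas)
```

In a model where the pressure rise is driven only by this area history, the four trains share the same slope and differ only in the final level, which matches the one-dimensional results described above and explains why that analysis cannot capture the dependence of the gradient on blockage ratio.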
Fig. 11 Locations of the measurement of the tunnel wall by Shimbo et al. [12]
21.3.2 Three-dimensional Analysis - Viscous Flow Simulations for a Practical Train Configuration
In this section, the computed results of the viscous flow simulations for a practical train configuration are discussed based on a comparison with the field measurement for the series 300 Shinkansen train called "Nozomi". The field measurement was carried out by Shimbo et al. [12] in December 1991 at the "Katakura" tunnel in Japan. The tunnel is 1266 m long and double-tracked.
Fig. 12 One view of the computational grids
The 300 series train (Nozomi, 16 cars) runs into this tunnel at a speed of 270 km/h (Mach 0.221) on the left track. The unsteady pressure was measured at several locations on the tunnel wall near the entrance. Figure 11 shows the locations. The details of the measurement can be found in Ref. 12. The Reynolds number, defined with the train speed and the tunnel height (7.6 m) as the representative speed and length, is $4.0 \times 10^7$. The computational grids are shown in Fig. 12. Since we are interested in the formation of the compression wave near the entrance, the tunnel length was taken to be 40.0 (312.0 m), and the non-reflecting boundary condition was applied at the right boundary to avoid unphysical wave reflections. The train length was also shortened, but was kept at 20.0 (156 m) so that the rear nose does not influence the formation of the compression wave. The numbers of grid points are summarized in Table II. Initially, the train is located 3.0 in front of the entrance. One simulation required roughly 30 hours on 1 Processing Element of the Fujitsu VPP500 vector parallel
(b) rear portion
(a) front nose
Fig. 13 Near-surface streamline
FUJII & OGAWA
406
Fig. 14 Time evolution of the pressure contour plots
supercomputer. The computations were stopped when the train nose comes to 90m (52.0 nondimensional time). Figure 13 shows the near-surface streamlines over the train before the train goes into the tunnel. There is no separated region observed in the front nose. In the rear portion, flow separation is clearly seen. However, the train is long and the effect of the separation to the formation of the Table II Numbers of the grid points for the three-dimensional simulations
Train Zone Bottom Zone IntermediateZone Tunnel Zone Entrance Zone Collar Zone
101x67x25 101x27x15 101x41x41 141x35x21 51x35x21 35x35x9
In total
532,000
compression small.
wave
is
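The total quoted in Table II can be cross-checked against the six zone dimensions. The short sketch below only multiplies out the listed dimensions; the dictionary keys are shorthand chosen here, and the assumption is that no overlap points are subtracted between zones.

```python
# Zone dimensions as listed in Table II (points in each index direction).
zones = {
    "Train": (101, 67, 25),
    "Bottom": (101, 27, 15),
    "Intermediate": (101, 41, 41),
    "Tunnel": (141, 35, 21),
    "Entrance": (51, 35, 21),
    "Collar": (35, 35, 9),
}

# Points per zone and the overall sum (assumes zones are simply added).
points = {name: ni * nj * nk for name, (ni, nj, nk) in zones.items()}
total = sum(points.values())

for name, n in points.items():
    print(f"{name:12s} {n:8d}")
print(f"{'Total':12s} {total:8d}")
```

The sum agrees with the rounded figure of 532,000 given in the table.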
Figures 14(a)-(d) show the sequence of the pressure contour plots during the train entry. The pressure in the tunnel increases slightly before the train enters the tunnel, as shown in Fig. 14(a). The pressure on the front nose then increases, and a stronger pressure wave is created in front of the train. Once the wave detaches from the train region, it becomes a planar wave and propagates toward the exit. Due to the effect of the stagnation region, the pressure on the left wall is higher during this process, although this is difficult to recognize in the figure. The pressure increase stops after a certain time, and the flow field near the train reaches a steady state inside the tunnel. Figure 15 shows the time sequence of the pressure distributions at three circumferential points on the tunnel wall. The sharp high and low peaks on the left are the pressure variation around the train, and the formation of the compression wave can be seen as the pressure increase on the right. Initially, the pressure in front of the train increases; once it reaches its maximum value, the gradient of the compression wave becomes steeper and the wave propagates away from the train. To alleviate the gradient of the compression wave, any treatment should be placed before the point where the compression wave is fully formed. In this case, that distance is 2.3 on the nondimensional scale. The figure also shows that the pressures are almost the same at all three points except when the train passes by, which indicates that the compression wave is one-dimensional. It should be noted, however, that the flow field near the train is fully three-dimensional and that it has an important effect on the strength and the gradient of the compression wave.
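The nondimensional quantities quoted in this section can be converted back to physical units with a few lines of arithmetic. The sketch below is an inference rather than something stated in the text: it assumes that one length scale underlies both quoted pairs (40.0 and 312.0 m, 20.0 and 156 m) and that time is scaled by that length divided by the speed of sound. All variable names are chosen here for illustration.

```python
# Length scale implied by the quoted pairs (assumption: both use the same scale).
L_ref = 312.0 / 40.0                      # metres per nondimensional unit
assert abs(L_ref - 156.0 / 20.0) < 1e-12  # the train-length pair gives the same value

mach = 0.221                  # train Mach number quoted in the text
train_speed = 270.0 / 3.6     # 270 km/h converted to m/s

# Assumption: time is scaled by L_ref / c, so distance = Mach * t * L_ref.
t_end = 52.0
distance_travelled = mach * t_end * L_ref   # metres, to compare with the quoted 90 m

formation_distance = 2.3 * L_ref            # metres, distance for full wave formation
sound_speed = train_speed / mach            # m/s implied by the two quoted speeds

print(f"L_ref = {L_ref:.2f} m")
print(f"distance travelled = {distance_travelled:.1f} m")
print(f"formation distance = {formation_distance:.1f} m")
print(f"implied speed of sound = {sound_speed:.1f} m/s")
```

Under these assumptions the train travels about 90 m in 52.0 time units, consistent with the stopping point quoted for the computation, and the nondimensional distance of 2.3 corresponds to roughly 18 m.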
Fig. 15 Time history of the tunnel wall pressures (solid line: left wall; dashed line: right and upper walls)
Fig. 18 show the difference clearly. The maximum gradient is much higher for the train running in the center. The result indicates that the location of the track changes the flow field inside the tunnel and is also an important factor for the gradient of the compression wave. Therefore the axisymmetric flow simulations are not sufficient for an accurate estimation of the compression wave for the practical train configurations. The reason will be discussed in the next section associated with a new one-dimensional estimation method.
21.4 A New Quasi-One-Dimensional Prediction Method

The study so far indicates that the inviscid flow assumption is valid as long as no strong separation occurs over the train surface. Therefore, we assume inviscid flow when developing a new method. The change of the instantaneous streamlines in time is drawn schematically in Fig. 19. We may consider that the tunnel wall approaches the train, instead of the train moving closer to the tunnel entrance. When the tunnel comes closer to the train, it shrinks the streamtube near the train. Let us consider the streamtube along the top wall of the tunnel. The streamtube becomes narrower as the tunnel wall comes closer. The pressure increase can be understood to occur due to this narrowed streamtube. In the following, we develop a new one-dimensional theory based on this idea.

Fig. 19 The flow field of the tunnel entry observed in the coordinate moving with the train

The unsteady one-dimensional Euler equations with the area change considered are written as

$$\partial_t (QA) + \partial_x (FA) = G, \tag{21.5}$$

$$Q = \begin{bmatrix} \rho \\ \rho u \\ e \end{bmatrix}, \quad F = \begin{bmatrix} \rho u \\ p + \rho u^2 \\ (e+p)u \end{bmatrix}, \quad G = \begin{bmatrix} 0 \\ p\,\partial_x A \\ -p\,\partial_t A \end{bmatrix}. \tag{21.6}$$

These equations are rewritten as three independent equations with the three characteristic variables as unknowns:

$$\begin{aligned} \partial_t w_0 + u\,\partial_x w_0 &= 0, \\ \partial_t w_+ + (u+c)\,\partial_x w_+ &= -\frac{c}{A}\frac{DA}{Dt}, \\ \partial_t w_- + (u-c)\,\partial_x w_- &= \frac{c}{A}\frac{DA}{Dt}. \end{aligned} \tag{21.7}$$

Here,

$$\delta w_0 = \delta\rho - \frac{\delta p}{c^2}, \quad \delta w_+ = \delta u + \frac{\delta p}{\rho c}, \quad \delta w_- = \delta u - \frac{\delta p}{\rho c}. \tag{21.8}$$
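The step from Eq. (21.5) to the characteristic form of Eq. (21.7) can be sketched as follows. This is a reconstruction under stated assumptions, not necessarily the authors' own derivation: a perfect gas, smooth flow, a momentum source equal to the wall pressure force $p\,\partial_x A$, and $DA/Dt = \partial_t A + u\,\partial_x A$. Under those assumptions the conservation form reduces to the primitive-variable system

$$\begin{aligned}
\frac{D\rho}{Dt} + \rho\,\partial_x u &= -\frac{\rho}{A}\frac{DA}{Dt}, \\
\frac{Du}{Dt} + \frac{1}{\rho}\,\partial_x p &= 0, \\
\frac{Dp}{Dt} + \rho c^2\,\partial_x u &= -\frac{\rho c^2}{A}\frac{DA}{Dt}.
\end{aligned}$$

Dividing the third equation by $\rho c$ and adding it to the second gives

$$\left[\partial_t u + (u+c)\,\partial_x u\right] + \frac{1}{\rho c}\left[\partial_t p + (u+c)\,\partial_x p\right] = -\frac{c}{A}\frac{DA}{Dt},$$

which is the $w_+$ equation with $\delta w_+ = \delta u + \delta p/(\rho c)$. Subtracting instead of adding gives the $w_-$ equation with the opposite sign on the right-hand side, and subtracting the third equation divided by $c^2$ from the first removes the area term entirely, leaving $w_0$ simply advected with the flow.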
We are interested in the compression wave created in front of the train. Therefore, only the equation for the positive eigenvalue, namely the second one of Eq. (21.7), is considered. We assume that the change of the flow variables is small compared to the steady state, and introduce the small disturbance formulation. After some manipulations, this equation becomes

$$\partial_t w'_+ + (u+c)\,\partial_x w'_+ = -\frac{c}{A}\frac{DA}{Dt}. \tag{21.9}$$

Here, the small disturbance terms are denoted with a prime. We have assumed that the space derivative of the cross-sectional area of the train is small and have neglected the higher-order terms. From the study so far, the compression wave to be formed is a planar wave. Therefore, the disturbance of the velocity is directly related to the disturbance of the pressure by

$$\delta u = \frac{\delta p}{\rho c}. \tag{21.10}$$

Then the disturbance of the characteristic variable $\delta w'_+$ can be replaced by the disturbance of pressure as

$$\delta w'_+ = \frac{2}{\rho c}\,\delta p'. \tag{21.11}$$

Putting this into Eq. (21.9),

$$\partial_t p' + (u+c)\,\partial_x p' = -\frac{\rho c^2}{2A}\frac{DA}{Dt} = -\frac{1}{2}\frac{\gamma p}{A}\frac{DA}{Dt} = -\frac{\gamma p u}{2A}\frac{\partial A}{\partial x}. \tag{21.12}$$
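The first two right-hand sides of Eq. (21.12) are the same quantity for a perfect gas, because the speed of sound satisfies c^2 = gamma p / rho. The sketch below checks this numerically. The ambient values and the streamtube numbers are assumed for illustration only, and the function names are hypothetical.

```python
import math

gamma = 1.4          # ratio of specific heats for air (assumed)
p = 101325.0         # Pa, assumed ambient pressure
rho = 1.225          # kg/m^3, assumed ambient density

c = math.sqrt(gamma * p / rho)     # perfect-gas speed of sound


def rhs_from_density(area, dA_dt):
    """-(rho c^2 / (2A)) DA/Dt, the first form of the right-hand side."""
    return -rho * c**2 / (2.0 * area) * dA_dt


def rhs_from_pressure(area, dA_dt):
    """-(gamma p / (2A)) DA/Dt, the second form of the right-hand side."""
    return -gamma * p / (2.0 * area) * dA_dt


# Hypothetical streamtube: area 10 m^2, shrinking at 0.5 m^2/s.
a, dadt = 10.0, -0.5
print(rhs_from_density(a, dadt), rhs_from_pressure(a, dadt))
```

A shrinking streamtube (negative DA/Dt) gives a positive right-hand side, that is, a pressure rise along the characteristic.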
Equation (21.12) gives the pressure change in time, and it tells us that the pressure change is directly related to the area change of the streamtube in time. In other words, we can compute the pressure change using this equation if the area change of the streamtube in time is given. Because the speed of the train is constant, the substantial derivative on the right-hand side has been replaced by the space derivative. As shown in Fig. 20, we consider the control volume for the streamtube and apply the conservation of mass:

$$\frac{\partial}{\partial x}(\rho u A) = \oint \rho\,\mathbf{u}\cdot\mathbf{n}\,dl. \tag{21.13}$$

Fig. 20 Control volume for the mass conservation

Here, the effect of the multi-dimensionality is included in the integral on the right-hand side. Equation (21.13) can be approximately rewritten as

$$A\frac{\partial}{\partial x}(\rho u) + \rho u\frac{\partial A}{\partial x} = \rho u_n L. \tag{21.14}$$

Here, $L$ is the length of the tunnel wall for the integral, and $u_n$ is the velocity component normal to the tunnel wall. With additional assumptions that are reasonable [8], the following equation is obtained:

$$u\frac{\partial A}{\partial x} = u_n L. \tag{21.15}$$

Putting this into Eq. (21.12),

$$\partial_t p' + (u+c)\,\partial_x p' = \begin{cases} -\dfrac{\gamma p}{2A}\,v_{wall} & (t \ge -x/U_t) \\ 0 & (t < -x/U_t) \end{cases} \tag{21.16}$$
Here, we have defined the integral of the outward velocity component normal to the tunnel wall as v_wall. This equation indicates that the pressure increase, in other words the gradient of the compression wave, is proportional to v_wall, as the schematic picture in Fig. 21 shows. We call this newly developed theory the "v_wall theory". The v_wall can be computed from the steady-state flow field over the train outside the tunnel. Using this theory, the gradient of the expected compression wave can be predicted by computing the velocity component normal to the imaginary tunnel-wall location in the steady-state flow field over the train, without complicated flow simulations that include the tunnel effect. Maeda et al. [3] conducted an experiment for the four axisymmetric bodies shown in Fig. 22. The bodies differ in the cut-off region of the conical nose. The experiment showed that the maximum gradient of the pressure is almost constant as long as the cut-off region is within a certain limit. The gradient suddenly increases if the cut-off region becomes too long, namely in case 7C. Figure 23 shows the dp/dt computed by the theory, together with the experimental results. Remarkable agreement is obtained. The theory clearly captures the sudden increase of dp/dt for case 7C. Additional computations for different body configurations were also carried out, and all the results indicated that the v_wall theory can predict the dp/dt distributions (a complete derivative now, as the location is fixed) accurately. It should be noted that the v_wall theory requires only the steady-state computation for the body in the freestream. Such computations require a small amount of computer time.
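To make the idea concrete, the sketch below evaluates a v_wall distribution for an axisymmetric case. It is an illustration under loud assumptions, not the authors' procedure: the steady flow over the nose is replaced by a hypothetical stand-in (a single point source of strength Q, with the uniform stream contributing nothing normal to a coaxial cylinder), and the imaginary tunnel wall is a cylinder of radius R. The function name and all numbers are invented for the example.

```python
import numpy as np


def v_wall_axisymmetric(x, wall_radius, source_strength):
    """Perimeter integral of the outward normal velocity at radius R.

    Hypothetical stand-in for a steady flow solution: a point source of
    strength Q at the origin, whose radial velocity on a cylinder of
    radius R is Q R / (4 pi (x^2 + R^2)^(3/2)).
    """
    r2 = x**2 + wall_radius**2
    u_n = source_strength * wall_radius / (4.0 * np.pi * r2**1.5)
    return u_n * 2.0 * np.pi * wall_radius      # u_n times the perimeter L


x = np.linspace(-200.0, 200.0, 40001)           # axial stations, in wall radii
R, Q = 1.0, 1.0
vw = v_wall_axisymmetric(x, R, Q)

peak_location = x[np.argmax(vw)]                # where the predicted gradient peaks
total_flux = np.sum(0.5 * (vw[1:] + vw[:-1]) * np.diff(x))   # trapezoid rule

print(peak_location, vw.max(), total_flux)
```

In this toy model the peak of v_wall sits at the source location and equals Q/(2R), and integrating v_wall along the axis recovers the source strength, which is a useful sanity check for any real steady-state solution substituted for the stand-in.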
Fig. 22 Four nose configurations: case 7 (a/b=7.0), case 7A, case 7B and case 7C (dimensions marked 0.809 a, 0.597 a and 0.401 a)
Next, we examine the accuracy of the v_wall theory for practical three-dimensional problems. In three-dimensional cases, v_wall is defined as an averaged value of the surface integral of the normal component along the tunnel wall. The velocity distributions at one cross section are shown as a representative example in Fig. 24. The cases considered are those for the Series 300 "Nozomi". Both the left track and the center track are computed. The dp/dt distributions are shown in Fig. 25. The solid lines are the plots already shown in Fig. 18 and are the results of the costly simulation including the tunnel. The marks are the results of the v_wall theory. The agreement is fairly good. It is, however, noteworthy that the v_wall theory clearly captures the higher dp/dt for the left track case. To see the reason for the difference, the local values of v_wall are plotted in Fig. 26. Since v_wall is larger near the tunnel wall close to the train, the averaged v_wall becomes higher for the left track case. This result indicates that a treatment for effectively alleviating the booming noise should be placed on the closer side of the tunnel. The v_wall theory requires only the steady-state computation, which requires a small amount of computer time. The two cases above require only one computation, because the v_wall distribution requires the steady-state solution without the tunnel effect and is obtained by assuming the location of the tunnel.

Fig. 23 Pressure gradient histories inside the tunnel (x=6.8): v_wall theory and experiments (case 7, 7A, 7B and 7C)

Fig. 24 Representative v_wall distribution

Fig. 25 Time history of the pressure gradients - comparison of the v_wall theory and the computations

The result above indicates that the train shape that is optimum for the booming noise can be designed by minimizing the maximum value of v_wall. A simple design algorithm based on the panel method was developed for axisymmetric flows based on this idea. The parabolic shape, which is believed to be a good shape for the booming noise, was taken as the initial train configuration, and a new train configuration was designed. It was shown that the derivative of the compression wave was reduced by 30% with this simple design method. The details of the algorithm and the result are given in Ref. 8.
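The effect of the track location on v_wall can be illustrated with a toy model. The sketch below is only a hedged illustration and not the authors' computation: the train nose is replaced by a hypothetical point source, the imaginary tunnel wall is a circle of radius R at the cross section containing the source, and the perimeter integral of the outward normal velocity is compared for a centered and an offset source. All names and numbers are invented for the example.

```python
import numpy as np


def v_wall_offset(x, wall_radius, offset, source_strength, n_theta=720):
    """Perimeter integral of u.n on a circular wall for a source off the axis."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    # Vector from the source (at y = offset, z = 0, axial position 0) to the wall point.
    dy = wall_radius * np.cos(theta) - offset
    dz = wall_radius * np.sin(theta)
    dist3 = (x**2 + dy**2 + dz**2) ** 1.5
    # The outward normal of the circular wall is (cos(theta), sin(theta)).
    u_n = source_strength * (dy * np.cos(theta) + dz * np.sin(theta)) / (4.0 * np.pi * dist3)
    d_len = 2.0 * np.pi * wall_radius / n_theta   # arc length per sample
    return np.sum(u_n) * d_len


centered = v_wall_offset(0.0, 1.0, 0.0, 1.0)     # source on the tunnel axis
off_axis = v_wall_offset(0.0, 1.0, 0.3, 1.0)     # source shifted toward one wall

print(centered, off_axis)
```

In this model the off-axis source produces the larger perimeter integral at the cross section of the source, because the normal velocity grows rapidly on the nearer wall. That trend is in line with the statement that v_wall is larger near the tunnel wall close to the train, although a point source is a very crude substitute for a real train shape.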
Fig. 26 v_wall distributions for left-running and center-running trains
21.5 Conclusions

Recent Computational Fluid Dynamics technology has been applied to the aerodynamic problem of a train entering a tunnel. Computations were carried out for both axisymmetric and fully three-dimensional flow fields. The comparison with the existing theories, experiments and the field measurement showed that the present approach is useful for understanding the mechanism of the formation of the compression wave inside the tunnel. The results showed the application limit of the existing one-dimensional theory and that multi-dimensional analysis is necessary for an accurate prediction of the gradient of the compression wave. The computations for a fully three-dimensional configuration indicated the effect of the track location on the gradient of the compression wave. Based on the simulation results, a new theory called the v_wall theory was developed. This theory can predict the formation of the compression wave accurately based only on the steady-state flow field of the train without the tunnel effect, and is therefore very efficient, although it still requires three-dimensional flow simulations. The capability of the theory was confirmed by the comparison with both the experiment and the simulation. This newly developed theory can be a good engineering tool for alleviating the booming noise of high-speed trains.

CFD technology is believed to be in its first matured stage. With the rapid progress of computer performance, anyone with some experience can simulate complex flow fields. However, to bring CFD to its second matured stage, understanding the flow field and deriving an essential theory are both required. That would make CFD a real tool of engineering. We presented the analyses of the train aerodynamics associated with tunnels as an example of such efforts.
REFERENCES

1. Baldwin, B. S. and Lomax, H., Thin Layer Approximation and Algebraic Model for Separated Turbulent Flows, AIAA Paper 78-257, 1978.
2. Fujii, K., Unified Zonal Method Based on the Fortified Solution Algorithm, Journal of Computational Physics, Vol. 118, pp. 92-108, 1995.
3. Maeda, T., Matsumura, T., Iida, M., Nakatani, K. and Uchida, K., Effect of Shape of Train Nose on Compression Wave Generated by Train Entering Tunnel, The International Conference on Speedup Technology for Railway and Maglev Vehicles, Yokohama, Japan, Nov., 1993.
4. Obayashi, S., Matsushima, K., Fujii, K. and Kuwahara, K., Improvement in Efficiency and Reliability for Navier-Stokes Computations Using the LU-ADI Factorization Algorithm, AIAA Paper 86-0338, 1986.
5. Ogawa, T. and Fujii, K., Numerical Simulation of Compressible Flow Induced by a Train Moving in a Tunnel, AIAA Paper 93-2951, July, 1993.
6. Ogawa, T. and Fujii, K., Computational Approach to the Aerodynamics of a Train Moving into a Tunnel with Guideways, The International Conference on Speedup Technology for Railway and Maglev Vehicles, Yokohama, Japan, Nov., 1993.
7. Ogawa, T. and Fujii, K., Numerical Simulation of Compressible Flows Induced by a Train Moving into a Tunnel, Computational Fluid Dynamics Journal, Vol. 3, No. 1, pp. 63-82, April, 1994.
8. Ogawa, T. and Fujii, K., Prediction and Alleviation of a Booming Noise Created by a High-speed Train Moving into a Tunnel, The ECCOMAS Computational Fluid Dynamics Conference, Sept., 1996.
9. Ogawa, T. and Fujii, K., Numerical Investigation of Three Dimensional Compressible Flows Induced by a Train Moving into a Tunnel, Computers & Fluids, to appear in 1997.
10. Ozawa, S., Study on the Booming Noise at the Tunnel Exit, JR Research Report No. 1121, 1979, (in Japanese).
11. Roe, P. L., Characteristic-Based Schemes for the Euler Equations, Annual Review of Fluid Mechanics, pp. 337-365, 1986.
12. Shimbo, Y. and Hosaka, S., Steady and Unsteady Pressure Measurement on High Speed Train, The International Conference on Speedup Technology for Railway and Maglev Vehicles, Yokohama, Japan, Nov., 1993.
13. Thomas, J. L., van Leer, B. and Walters, B. W., Implicit Flux-Split Schemes for the Euler Equations, AIAA Paper 85-1680, 1985.
14. Watanabe, R., Fujii, K. and Higashino, F., One-Dimensional Numerical Simulation of a Compression Wave Induced by a Train Entering a Tunnel, Journal of Japan Society of Mechanical Engineering B, Vol. 61, No. 592, 1995, (in Japanese).
15. Yamamoto, A., Aerodynamics of the Train and the Tunnel, JR Research Report, No. 1230, 1983, (in Japanese).
22

On The Pursuit of Value with CFD

Paul E. Rubbert¹

Summary

The key finding of the paper is that the value of CFD is directly related to its contribution to RATE OF LEARNING during the process of designing an airplane. Higher rates of learning lead to better designs. Rate of learning is comprised of the product of two terms, namely (i) learning per design cycle, multiplied by (ii) the number of design cycles that can be executed in a given amount of time. Earlier developments in CFD tended to focus on the former and to ignore or discount the latter. But the teachings of the 1990s created a greater focus on the latter, with the result that the processes in use for designing airplanes today are improving at a rate that is unprecedented.

22.1 Introduction
In addressing the pursuit of value with CFD, one first has to figure out how to define the meaning of value. That is best done by examining the way that airlines judge value in an airplane. One common measure of value is a single number, called TAROC (total airplane related operating cost). A typical TAROC breakdown for a generic wide body airplane is given below:

Percent   Item                        Group
24        Interest                    Ownership Cost, 39 %
14        Depreciation
1         Insurance
12        Fuel                        Cash DOC, 29 %
9         Flight Crew
5         Airframe Maintenance
3         Engine Maintenance
9         Cabin Crew                  Airplane Related IOC, 32 %
8         General and Admin.
4         Ground Handling
4         Landing Fees
3         Control & Communication
3         Gr. Prop Main & Dep
1         APU Fuel

¹ The Boeing Company.
Frontiers of Computational Fluid Dynamics - 1998. Editors: D. A. Caughey & M. M. Hafez. ©World Scientific
It can be observed that there are a lot of things in addition to aerodynamic efficiency that enter into value. Within the past year I was privileged to hear a presentation by a United Airlines executive, Gordon McKinzie, concerning the process they used to select from among wide body airplane offerings by Boeing, Airbus and Douglas. They divided their assessment into eleven areas, which are listed below in ranked order.

Rank                        Weightings
1. Mission Capability       20
2. Interiors                15
3. Maintenance              12
4. Facilities               10
5. Cargo                    8
6. Ground Handling          7
7. Flight Operations        6
8. Training                 6
9. Product Support          6
10. Technical Issues        5
11. Cockpit                 5
Totals                      100
The message again seems to be that value is comprised largely of things that have little apparent relationship to aerodynamics. However, the connection between aerodynamic design processes and these other factors is a lot stronger than might be apparent. This comes about from the fact that designing and bringing an airplane to market is, first and foremost, an exercise in large scale system integration. Designing and producing a vehicle as complex as a modern jet transport is a mind-boggling challenge of the first rank. This means that concepts of value associated with individual technical disciplines, such as aerodynamics, must extend far beyond the traditional, discipline-specific measures such as drag, lift, etc. The broader measure of value associated with specific disciplines has grown to also include the ability of those disciplines and discipline processes to contribute effectively to the practice of large scale system integration.
So it is with aerodynamics. We live in an information age where all serious competitors have access to the same technology. We all use the same kinds of computers and wind tunnels. We use the same turbulence models. And many of the same computational algorithms. And all serious contenders in the business are capable of designing to comparable levels of mission performance. That is looked at as a "given". It is merely the ante that qualifies one to be in the business. The ability to stay in the business, and to prosper, is increasingly dependent on being able to do other things very well, with large scale system integration being near the top of the list. And so a meaningful way of defining the value of CFD is to measure how it contributes to the ability of the company to best carry out the process of large scale system integration. This is the way that I have chosen to define the value of CFD in this paper. I assume that the ability to achieve state-of-the-art aerodynamic performance is a given. Boeing knows how to do that. Airbus knows how. Douglas knows how. And others. That is not the key issue. The real measure of value lies in how we do it, measured in terms of how effective we are in integrating the process of aerodynamic design within the broader framework of the larger scale system integration and business practices.
22.2 What Aerodynamics Design Processes Must Strive For

I have composed a short list of needs, desires, and goals that define how well or how poorly aerodynamics engineering can support the practice of large scale system integration. That list is the following:

22.2.1 To provide aerodynamic lines that deliver state-of-the-art performance. This is a given.

22.2.2 To provide a multiplicity of aerodynamic design solutions. It is not enough to design one set of lines that accomplish the aerodynamic objective. One needs to be able to provide all possible aerodynamic design solutions in order to arrive at design decisions that best support the overall airplane system integration. For example, in wing/nacelle integration it is desirable that the aerodynamicist be able to eliminate interference by means of wing changes, nacelle changes, strut changes, or any combination thereof. By that means it becomes possible to arrive at a design that strikes the best balance between aerodynamics, structures, fuel volume, manufacturing cost, weight, and systems.

22.2.3 To provide rapid visibility and accommodation of trades between performance, manufacturability, maintainability, nonrecurring cost, time to market,

22.2.4 To provide rapid determination of the cost of constraints.

22.2.5 To be comprised of a number of very rapid design cycles. This is what allows aggressive pursuit of maximum performance and "pushing of the design envelope". It provides a means of rapid recovery from failure. When the number of available design cycles is too small, failure is not an option. Recovery would not be possible. That forces conservative designs and the added cost of things that are done to ensure that failure won't happen.

22.2.6 To provide flawless prediction of full scale flight performance.

22.2.7 To be nimbly adaptable to new and different design constraints and opportunities.

22.2.8 To do all of the above very quickly, such that aerodynamic design is not a pacing item in either the business acquisition process or in the actual process of designing and integrating an airplane.
The above constitutes a tall order, one in which vast opportunities for improvement yet remain. The common theme that is implicit within all of the items on the list is the requirement for rapidly achieving a very high level of aerodynamic knowledge concerning whatever it is that is being done. One would like to know all plausible design solutions, the aerodynamic consequences of a large number of trades, and the aerodynamic cost of various constraints, all accurately expressed in terms of full scale flight performance. That is a lot of knowledge, and the time available for generating that knowledge is very short. We cannot do all of that today. But it is becoming increasingly clear to me that the quality of a design solution is governed largely by the amount that can be learned during the time available.
22.3 The Overarching Metric Leading To Value

If the quality of a design solution is governed largely by the amount that can be learned during the time available, then the overarching metric must be rate of learning. All other things being equal, the best designs will be produced by processes that maximize the rate of learning! And so the value of CFD in today's world of airplane design can be expressed in terms of its contribution to the rate at which the engineering design team learns about whatever it is they are trying to do.
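22.4 The Equation That Governs Design

The preceding ideas can be expressed in the form of the following equation, which first appeared in reference 1.

$$\left(\text{Amount Learned}\right) = \left(\frac{\text{Learning}}{\text{Cycle}}\right) \times \left(\frac{\text{Cycle}}{\text{Time}}\right) \times \left(\text{Time Allotted}\right) \tag{22.1}$$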
The business objective requires that we: (i) learn enough to be able to arrive at a very high quality design, and (ii) minimize the time allotted to do it, which reduces cost and time to market. The only way that those objectives can be met is by working to maximize the product of the first two terms appearing on the right hand side of equation (22.1). To make this very clear, the terms can be rearranged in the following manner.

$$\underbrace{\left(\frac{\text{Learning}}{\text{Cycle}}\right)\left(\frac{\text{Cycle}}{\text{Time}}\right)}_{\text{Rate of Learning}} = \frac{\left(\text{Amount Learned}\right)}{\left(\text{Time Allotted}\right)} \approx \frac{\left(\text{Quality of Product}\right)}{\left(\text{Time to Market}\right)} \tag{22.2}$$
The left hand side of this equation expresses rate of learning as the product of two terms, namely (i) the amount that is learned during a design cycle and (ii) the number of cycles that can be executed in a given amount of time. This is what counts. It is not one or the other. It is the product of the two that defines the value that is associated with a given design process. This equation also gives us a means for quantifying the value of incremental changes to a process, such as might arise from using a new CFD code. Let us define:

$\Delta\left(\frac{\text{Learning}}{\text{Cycle}}\right)$: The added information or learning per cycle that is provided by a given change

$\Delta\left(\frac{\text{Cycle}}{\text{Time}}\right)$: The additional number of cycles that can be executed in a given amount of time, arising from the given change

The value of a given change can then be expressed as:

$$\left(\frac{\text{Learning}}{\text{Cycle}} + \Delta\frac{\text{Learning}}{\text{Cycle}}\right) \times \left(\frac{\text{Cycle}}{\text{Time}} + \Delta\frac{\text{Cycle}}{\text{Time}}\right) = \left(\text{Value} + \Delta\text{Value}\right) \tag{22.3}$$
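Neglecting higher order terms and nondimensionalizing, this becomes

$$\frac{\Delta\left(\frac{\text{Learning}}{\text{Cycle}}\right)}{\left(\frac{\text{Learning}}{\text{Cycle}}\right)} + \frac{\Delta\left(\frac{\text{Cycle}}{\text{Time}}\right)}{\left(\frac{\text{Cycle}}{\text{Time}}\right)} = \frac{\Delta\text{Value}}{\text{Value}} \tag{22.4}$$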
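Equation (22.4) can be tried out with a few lines of arithmetic. The sketch below compares the first-order estimate with the exact relative change implied by equation (22.3). The function name and the numbers are hypothetical and serve only to show how a gain in learning per cycle can be outweighed by a loss in cycles per unit time.

```python
def relative_value_change(learning_per_cycle, cycles_per_time,
                          d_learning, d_cycles):
    """Return (first-order, exact) relative change in rate of learning."""
    # First-order estimate: sum of the two relative increments, as in (22.4).
    first_order = d_learning / learning_per_cycle + d_cycles / cycles_per_time
    # Exact change from the product form, as in (22.3).
    new_value = (learning_per_cycle + d_learning) * (cycles_per_time + d_cycles)
    old_value = learning_per_cycle * cycles_per_time
    exact = new_value / old_value - 1.0
    return first_order, exact


# Hypothetical numbers: a new tool adds 10% learning per cycle
# but slows each cycle, cutting cycles per unit time by 20%.
approx, exact = relative_value_change(1.0, 10.0, 0.10, -2.0)
print(approx, exact)
```

With these made-up inputs the first-order estimate is a 10% loss in value and the exact figure is a 12% loss; the difference is the neglected higher order term.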
A commonly encountered dilemma is that the introduction of a new CFD tool that adds to LEARNING/CYCLE is frequently accompanied by the addition of time required to execute a cycle, thereby reducing CYCLE/TIME. That can change the sign of ΔVALUE from positive to negative!

22.5 Impact or Value of CFD Prior to 1990²

When I entered the airplane business, there was no CFD. The process of defining a candidate set of aerodynamic lines didn't take very long, because there wasn't a whole lot of analysis that could be done. Wind tunnel model fabrication didn't take much time either, because there was little need for surface pressure instrumentation (no CFD to compare it with). Model surface definition requirements were more relaxed also, since the lines definition process was incapable of differentiating the aerodynamic consequences of small variations in surface fidelity. And so design cycles were executed swiftly. I remember early airplane programs that had model wing numbers in the 80s, meaning that many cycles of design had been executed. Returning now to equation (22.1), designers in those early years didn't learn as much as we do now in a given cycle, because we now have both CFD and extensive wind tunnel model instrumentation from which to learn. Their LEARNING/CYCLE, the first term on the right hand side of equation (22.1), was low. But their CYCLE/TIME, the number of cycles they could perform in a given time, was very high, and so those early designers were able to learn a lot in a given amount of time. DC-8s, DC-9s, 707s, first generation 737s, and yes, the early versions of the still magnificent 747, were designed that way, with little or no CFD but with a number of rapidly executed aerodynamic design cycles. And the TIME ALLOTTED, the last term on the right hand side of equation (22.1), was also less in those early years. The 747 was brought to market in about 3 years, rather than the 5 years we took recently to develop the 777.

I find this to be rather humbling. What has happened over much of the past history of CFD is that every time we made another significant advance, we increased the amount that could be learned during each cycle. The first term on the right hand side of equation (22.1) got larger. But, each new advance in CFD brought with it a decrease in CYCLE/TIME. The second term on the right hand side became smaller. I well remember the era of the early 1970s when I was involved in a program to design a very fast, area-ruled subsonic transport. This was the first airplane program at Boeing to employ CFD in wing design. The consequence was that we added about six weeks to the time required to define the lines, and we added months to the wind tunnel model fabrication time because we needed extensive surface pressure measurements to calibrate the CFD. At the time I thought we had made great strides because we could produce designs that had good aerodynamics "right out of the box". I ignored in my thinking the fact that it took us 6 months to design and test a configuration. During the same period, Dr. Richard Whitcomb of NASA LaRC, who was our real leader in teaching us what could be done, focused instead on increasing the second term on the right hand side in equation (22.1). He developed a process of personally wielding a file and changing the wind tunnel model (redesigning the model) in a matter of hours, rather than 6 months. It is well documented that the final designs he produced were ahead of ours in time and performed just as well aerodynamically. In retrospect, I now realize that he was learning much faster than we were, as evidenced by the fact that he was the leader and we were the followers.

² This section is quoted almost verbatim from Reference 1.

22.6 ERA of the 1990s

In recent years the industry adopted a new paradigm that recognizes cycle time, variation and nimbleness to be of key importance. In responding to that new paradigm, we and others redirected our investments in CFD to focus on that new set of metrics. That caused a lot of consternation and hand-wringing among those who didn't think of the world in the terms expressed by equations (22.1)-(22.4). But enough time has elapsed that we can now begin to observe the consequences of that redirection.
The consequences that I am observing are that progress and improvements, by every measure, are occurring at a rate that has never before been seen. The aerodynamic performance of recent designs is outstanding. We predicted the full scale flight performance of both the 777 and the next-generation 737 with a degree of accuracy that is almost unbelievable. But more importantly, we carried out the process of aerodynamic designing in ways that contributed markedly to the ability of Boeing to do a superior job of large scale system integration in the creation of our new airplane products. The net result is a Boeing 777 product line that is dominant in its market, and a next-generation 737 product line that has broken all records for number of sales prior to first delivery.
22.7 The Key Enablers

The key enablers that in my judgment contributed the most value during the 1990s included the following items and accomplishments.

22.7.1 Contributions to Learning/Cycle

• Optimization Via CFD
The deployment of optimization methodology within our aerodynamic design processes has provided us with much more information per cycle, especially with respect to sensitivities and cost of constraints.

• Integration of Manufacturability Considerations with CFD-Based Design
We no longer find out after the fact that a designed shape cannot be manufactured. And we actively examine the trades between aerodynamic performance and manufacturing cost.

• Drag Decomposition with CFD
Our growing ability to calculate drag increments that can be believed has markedly improved the value and effectiveness of CFD.

• Uncertainty Management
CFD has played a major role in enabling us to define and to effectively manage the uncertainties associated with predicting full scale flight performance.

• Discipline To Only Develop CFD Tools That Could Be Used Within The Time Available

22.7.2 Contributions to Cycle/Time

• Early, Strategic Investment Decisions in Fast and General CFD
Key decisions made in the 1970s and 1980s were that:
(i) grid generation for "standard" configuration arrangements (such as wing/body) would be fully automated.
(ii) complex and arbitrary geometry arrangements in flows that do not involve large areas of separated flows would be dealt with using rectangular, nonsurface-fitted grids employing solution adaptivity.
Those strategic decisions were of critical importance in giving us CFD tools that can be used within the available time. We do not yet have a comparable, fast-to-set-up capability for complex viscous flows, and our present inability to support the design of high lift flap systems with the degree of effectiveness that CFD enjoys in the cruise regime reflects that fact.
• Integration of Manufacturability Considerations with CFD-Based Design. By reducing the need to iterate between aerodynamic lines definition and manufacturability we have shortened the design cycle.
• Uncertainty Management. This has reduced the cycle time for cruise configuration design by minimizing the need to wait for wind tunnel data before committing a design to production. Increasingly we observe designs being committed to production before acquiring wind tunnel data for the final set of lines. Cycling in high speed design is today done almost exclusively with CFD.
• Six-Day Wind Tunnel Models. One of the most effective improvements to cycle time has been to develop the capability to design and build wind tunnel models in a week or two rather than six months. We exploited that capability last year by cycling an entire wing design within the confines of a single wind tunnel test entry.
In summary, the value arising from past developments has been split between increasing the amount that can be learned within a single design cycle, and increasing the number of cycles that can be executed within a given time. In my judgment, CFD has been the principal enabling technology factor in all except the six-day wind tunnel model capability.
22.8 Future Directions
Referring again to equation (22.1), I observe that we will eventually reach a point of diminishing return with respect to AMOUNT LEARNED. Once one has learned enough to make a properly informed design decision, there is less payoff in acquiring more knowledge. We are not at that point yet, but we are rapidly approaching it for design in the cruise regime. The real challenge confronting us now is the need to achieve dramatic reductions in the TIME ALLOTTED. If we can reduce the time-to-market for a new airplane from 5 years to 2.5, our products will have significantly greater value in the marketplace. We can only accomplish that through corresponding increases in
$$\left(\frac{\text{Learning}}{\text{Cycle}}\right)\left(\frac{\text{Cycle}}{\text{Time}}\right)$$
And we must resist the temptation to reduce the number of cycles. Yes, we can probably design a pretty good wing in one cycle. But in so doing we deprive ourselves of the ability to benefit from the learning that takes place, since there is no second opportunity to incorporate that learning into the design of another wing.
A more serious consequence of a single-cycle design approach is that failure cannot be allowed. It has to work well the first time, which forces the design approach to be more conservative and to carry with it various costly and time consuming "insurance policies". That makes you vulnerable to a competitor who is a rapid learner, a lesson that Team New Zealand taught the rest of the yacht racing fraternity during the last America's Cup competition when they showed their competitors the stern of their boat Black Magic in 43 of the 44 races they sailed! The following quote sums up how they did it.
"The speed of Team New Zealand's rapid design cycle was the critical factor that made its iterative process and exploration of thousands of options possible. Manufacturing companies that accelerate their design cycles likewise can beat their competition to market. But they also can follow the Kiwi's lead and use the additional time to increase the quality, performance and customer fit of their products."
When I examine my crystal ball for insight as to how we will accomplish the required increases in rate of learning, I see the following images:
• CFD that is fast and effective for the low speed flight regime, for the perimeter of the flight envelope that relates to structural loads and stability and control, and for the unsteady phenomena associated with flutter and noise.
• Combined CFD and experimental methodology for giving us greater knowledge and understanding of wake vortices, and a better ability to deal with the issues related to wake vortices.
• More contributions from optimization methodology.
• A more surgical approach to wind tunnel testing where we measure only what we need at the time and forego our old habits of "measuring everything just in case we might need it".
• Systemic changes to infrastructure and process. One of the things we learned from the 6-day wind tunnel model project was that we are our own worst enemy. The solution lies not so much in inventing new technology for further improving cycle time. Rather, the solution lies mostly in changing the bureaucracy that governs our decision processes, our budgeting processes, our reward and recognition systems, and more.
Those are the things that are responsible for much of the cycle time in today's world.
• Learning how to exploit a global, 24-hour day. Phil Condit, our CEO, has challenged us to become global. One element of that global vision pictures the stuff we do being worked on 24 hours a day, 7 days a week, following the sun. Imagine, at the end of our day, handing off our work to associates who are just arriving at work a third of the way around the world. They in turn would hand it off to others who are 2/3 of the way around the world. That way, when we arrive at work the next morning, we would find that 16 hours of progress had been added to whatever we handed off the day before. It is not difficult to imagine what this could do to cycle time. And it is not difficult to imagine information systems that would render this possible. They already exist. The primary difficulty is in learning how to develop the types of human relationships that create the levels of trust and understanding that are needed to make it work.
22.9 Concluding Remarks
The pursuit of value with CFD has led to the recognition of RATE OF LEARNING as a key measure of value. RATE OF LEARNING is the product of two components. One is the LEARNING/CYCLE. The other is CYCLE/TIME. Throughout much of the history of CFD, we focused primarily on what CFD contributed to LEARNING/CYCLE. Each new development was touted in terms of its ability to reveal things that were not revealed before. Researchers tended to ignore or discount the impact on CYCLE/TIME, which was frequently negative. But in the 1990s we were confronted with a new value system, one that recognized the importance of cycle time. That led to investments in CFD (and wind tunnel processes) that were targeted at improving cycle time. The result is that the process of designing airplanes is now improving at a rate that is unprecedented, as measured by the value of the products of that process in the global marketplace.
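The product form of RATE OF LEARNING can be illustrated with a few lines of arithmetic. The sketch below is only an illustration: the function name and all numbers (learning per cycle in arbitrary units, cycle lengths of 26 weeks and 2 weeks) are assumptions chosen for the example, not data from the design programs described in this chapter.

```python
def rate_of_learning(learning_per_cycle, cycles_per_week):
    """RATE OF LEARNING = (LEARNING/CYCLE) * (CYCLE/TIME)."""
    return learning_per_cycle * cycles_per_week


# Hypothetical baseline: one unit of learning per cycle, one cycle every 26 weeks.
baseline = rate_of_learning(1.0, 1.0 / 26.0)

# Hypothetical faster process: same learning per cycle, one cycle every 2 weeks.
faster = rate_of_learning(1.0, 1.0 / 2.0)

# Hypothetical tool that doubles the learning per cycle but also doubles
# the cycle length: the two factors cancel and the rate is unchanged.
richer_but_slower = rate_of_learning(2.0, 0.5 / 26.0)

print("speed-up from shorter cycles:", faster / baseline)
print("rate with richer but slower cycles:", richer_but_slower / baseline)
```

The third case is the point made above about developments that improve LEARNING/CYCLE while degrading CYCLE/TIME: the product, not either factor alone, measures the gain.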
REFERENCES
1. Rubbert, P. E., "The Use of CFD in Airplane Design", CFD 97, Fifth Annual Conference of the CFD Society of Canada, Victoria, B.C., May 25-27, 1997.
2. Rubbert, P. E., "AIAA Wright Brothers Lecture: CFD and the Changing World of Airplane Design", ICAS-94-0.2, September 1994.
CFD at a Crossroads: An Industry Perspective
Pradeep Raj*
23.1 Introduction
Computational fluid dynamics (CFD) has made significant progress over the thirty-year period from the mid-sixties to the mid-nineties. Two factors are largely responsible for this progress: a phenomenal growth of four or more orders of magnitude in speed and memory of digital computers, and impressive advances in numerical algorithms and software. CFD is now an integral part of all science and engineering disciplines where fluid dynamic interactions play an important role. For scientific research, the importance of CFD is obvious from the critical role it plays in developing a better understanding of turbulence [1-3]. From an engineering perspective, CFD is now widely recognized as a crucial enabling technology to support the design of not only fixed wing aircraft, the principal focus of this paper, but also rotorcraft, spacecraft, turbomachinery, and automobiles. The breadth and scope of engineering applications of CFD is indeed enormous.
However, in the author's opinion, we are at a stage where the full benefits of CFD for aircraft design are easy to demonstrate but much harder to realize. CFD is at a crossroads. Incremental steps along the road we have followed for CFD development over the past thirty years will not be sufficient to meet the demands of tomorrow's design challenges. We need to carefully examine the available avenues and aggressively pursue the most promising ones so that CFD can more effectively
* Technical Fellow, Lockheed Martin Aeronautical Systems, Marietta, Georgia 30063-0685, U.S.A. © 1997 Lockheed Martin. All rights reserved.
Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez
© 1998 World Scientific
serve the future aircraft design needs. The remainder of this paper addresses issues related to these observations.
The paper is organized along the following lines. Some thoughts on the role of CFD in aircraft design follow in Section 23.2. Issues related to CFD effectiveness for aircraft design are outlined in Section 23.3. Progress made over the past thirty years is summarized in Section 23.4. This is followed in Section 23.5 by a discussion of the transformations taking place in the design processes and the demands they place on CFD. Some concluding remarks are given in Section 23.6.
23.2 Role of CFD in Aircraft Design
In this section, the role of CFD in aircraft design is briefly outlined. Readers are strongly encouraged to read other interesting and thought-provoking articles on this subject. They include, but are not limited to, a survey paper on application of CFD to airplane design by Miranda [4], the Lanchester Memorial Lecture by Hancock [5], a paper by Miranda [6] on challenges and opportunities for CFD in fighter design, a discussion of issues in aerospace application of CFD by Cosner [7], and the Wright Brothers Lecture by Rubbert [8] on the role of CFD in the changing world of airplane design.
A typical design process is usually divided into three sequential phases: (i) conceptual, (ii) preliminary, and (iii) production. In most cases, there is a preconceptual phase where the customer requirements are mapped into a design space, and targets are established for design variables that characterize the overall size, shape, performance, weight, cost, etc. Several candidate configurations are then generated in the conceptual phase. Following trade-off studies, a single configuration is usually selected for further development and validation in the preliminary design phase. The designs typically undergo numerous modifications during the conceptual and preliminary phases.
The goal of the design team is to create an "optimum" design that satisfies all customer requirements. In the production design phase, the final layout and more extensive validation are carried out prior to releasing the design for manufacturing. The myriad of activities that take place in all design phases can be categorized as synthesis or analysis. Synthesis covers defining, refining, and altering concepts and configurations; analysis encompasses methods, tools and expertise that produce useful data to evaluate concepts and configurations. In most aircraft design projects, CFD is treated primarily as a tool—much like the wind tunnel—to produce aerodynamic data. Accurate estimation of aerodynamic data is essential to every aircraft design project. Force and moment data are needed to evaluate performance and flying qualities; surface pressures provide inputs for structural design; and flow-field data facilitate systems integration, such as the integration of engines or weapons with the
airframe. For generating the required aerodynamic data, a judicious mix of wind tunnels and CFD has now evolved as a preferred approach in most design projects; the paper by Bangert et al. [9] on F-22 tactical fighter design is a case in point.
It cannot be overemphasized that schedule and cost constraints are central to all phases of aircraft design. It is also very important to note that decisions made in the early stages of design have far-reaching consequences for the life-cycle cost of the final design. It has been variously estimated that 70% to 90% of the life-cycle cost of an airplane is locked in during the early stages of design. Of course, aerodynamic data generated using CFD and wind tunnels contribute to the decisions made by the design teams.
23.3 CFD Effectiveness for Aircraft Design
CFD can be an effective tool for aircraft design only if it can satisfy the desires and expectations of the design teams. The teams naturally want engineering data of highest fidelity within schedule and budget. In order to be fully effective, CFD must simultaneously meet three key requirements: (i) rapid turnaround (or short analysis cycle time), (ii) reliable accuracy and (iii) affordable cost. The rationale behind these requirements is presented by the author in Reference 10 and will not be repeated here. However, a brief synopsis is included for the sake of completeness.
Rapid Turnaround. Turnaround is defined as the time span from the initial go-ahead to the final delivery of data. "Minimizing calendar time of CFD analyses" emerged at the top of the list when CFD applications to F-22 design were recently reviewed by Bangert et al. [9]. A typical CFD application process consists of three steps. The first step involves acquisition of geometry and setting up of a computational model (or grid generation). The second step is to run the flow solver. The third and final step is to extract the desired aerodynamic data from the flow solver output.
As discussed in the next section, turnaround times vary considerably across the spectrum of CFD codes.
Reliable Accuracy. Although reducing turnaround time is crucial, producing aerodynamic data of reliable accuracy is at least as important. Data of reliable accuracy must have a known and acceptable level of error. Most CFD analyses are deficient in providing estimates of error. Numerical schemes and physical models both contribute to errors in CFD solutions. It must be noted that when CFD is exercised in a design environment, the analysts generally do not have the luxury of performing extensive parametric studies by varying numerical schemes or physical models to evaluate the level of accuracy of their predictions.
Affordable Cost. The requirement of affordable CFD analyses is very important to aircraft design teams. Cost includes both labor and computer expenses. Labor expenses are associated with geometry acquisition/modeling, grid generation, monitoring of flow solver runs, and processing of flow solver output. High-speed
computers with large memory are usually needed to support the design data needs; hundreds of analysis runs are typically made to generate the required amounts of data. Such machines can be quite expensive since computer costs are closely linked to processor speed and memory.
23.4 CFD Progress: The Past Thirty Years
Figure 1. Four levels of CFD methods
A wide variety of CFD methods have been developed over the past thirty years. They vary considerably in their effectiveness to support aircraft design needs. The methods can be broadly categorized into four levels shown in Figure 1. This categorization is principally based on mathematical formulations, capabilities and limitations, and the timeframe of introduction of the methods to aircraft design.
It has long been known that Level IV codes, based on the Navier-Stokes (N-S) equations, can simulate nearly all flow phenomena of interest for which the continuum assumption is valid. (The Boltzmann equations based on the kinetic theory of gases are needed to model molecular flows; the related numerical methods will not be covered here.) However, adequate computer power and efficient numerical algorithms to solve the N-S equations were not available in the 1960s. This forced researchers to explore alternatives based on inviscid approximations to the N-S equations; the first three levels correspond to codes based on a hierarchy of inviscid approximations. The lowest level codes, introduced in the late 1960s, are now widely used and accepted; use of the highest-level codes, introduced in the late 1980s, has also been increasing at a rapid pace. Basic features of each level of CFD methods and their effectiveness are highlighted in this section.
Level I: Linear Potential. The linear potential methods are based on the Prandtl-Glauert or Laplace equations which form the lowest level inviscid approximation to the N-S equations. Most of the codes employ the boundary integral approach. The governing partial differential equations (PDEs) along with the boundary conditions are cast in a surface-integral form using Green's theorem. The solution is constructed by discretizing the geometry into small elements and assigning a singularity (sources, doublets, or vortex filaments) to each element.
The singularity strengths are determined by satisfying the no-normal-flow condition at a control point on each element. Depending upon the approximations used in surface discretization (mean surface or actual surface) and the type and functional form of singularities (constant source, constant doublet, linear doublet, etc.), codes with different characteristics [4] can be developed. The simplest codes, widely known as
vortex-lattice methods, employ mean-surface representation of geometry and vortex filaments as singularities. The VORLAX code [11] is a representative example. When the actual surface geometry is discretized, the methods are commonly called panel methods. Low-order singularity distributions, constant on each element, have been employed in QUADPAN [12] and higher-order ones, linear or quadratic, in PANAIR [13].
The simplicity of the mathematical formulation of the linear potential codes inherently restricts their validity to purely subsonic and supersonic attached flows. In spite of these restrictions, the codes are quite extensively used in design efforts due to their ease of use, computational efficiency, and relatively high level of confidence built upon years of use. An experienced user can set up a computational model in a matter of hours even for relatively complex configurations like a complete aircraft. Figure 2 shows a QUADPAN model of the P-3 Orion aircraft with a rotodome for airborne early warning and control [14]. The computational times for Level I codes range from a few seconds on supercomputers to a few minutes on workstations. However, user expertise and experience are crucial to ensuring proper interpretation of the solutions.
The vortex-lattice methods and panel methods generally provide good estimates of forces, moments and distributed airloads for steady flight conditions. The data form the basis for performance and weight estimations in the early stages of design. Some of the codes also offer a design option that can be used to determine geometric characteristics (like twist and camber of a wing) for a prescribed set of aerodynamic parameters. To meet the aerodynamic data needs of the aeroelastic and flutter disciplines, versions of the doublet-lattice method [15] are the codes of choice.
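The construction described for Level I codes (one singularity per element, strengths fixed by the no-normal-flow condition at one control point per element) can be sketched in two dimensions. The example below is a minimal lumped-vortex model of a flat plate in incompressible flow, not a reproduction of VORLAX, QUADPAN, or PANAIR. The quarter-chord vortex and three-quarter-chord control point placement, the function names, and the small dense solver are assumptions made to keep the sketch self-contained.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def flat_plate_lift_coefficient(alpha_rad, n_panels=20, chord=1.0, u_inf=1.0):
    """Lift coefficient of a flat plate from a lumped-vortex discretization."""
    dx = chord / n_panels
    # One point vortex per element (quarter point) and one control point
    # per element (three-quarter point); this placement is an assumption.
    x_vortex = [(j + 0.25) * dx for j in range(n_panels)]
    x_control = [(i + 0.75) * dx for i in range(n_panels)]
    # Downwash at control point i per unit clockwise circulation at vortex j.
    influence = [[1.0 / (2.0 * math.pi * (xc - xv)) for xv in x_vortex]
                 for xc in x_control]
    # No-normal-flow: induced downwash balances the normal freestream component.
    rhs = [u_inf * math.sin(alpha_rad)] * n_panels
    gamma = solve_linear(influence, rhs)
    # Kutta-Joukowski: lift per unit span = rho * U * sum(gamma).
    return 2.0 * sum(gamma) / (u_inf * chord)


alpha = math.radians(5.0)
cl = flat_plate_lift_coefficient(alpha)
print("computed cl:", cl, " thin-airfoil value:", 2.0 * math.pi * math.sin(alpha))
```

For this particular geometry the discrete lift matches the thin-airfoil value 2π sin α; real configurations need three-dimensional elements and wake modeling, which is where the production codes named above differ.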
The linear potential codes were first introduced into the aircraft design environment in the late 1960s and the entire class of codes reached a high level of maturity in the early '80s. With the possible exception of the oscillatory aerodynamic codes [16, 17], very little effort is presently going into research and development of this level of codes.
Figure 2. P-3 AEW&C QUADPAN model
Level II: Nonlinear Potential. The nonlinear potential methods are based on either transonic small perturbation (TSP) equations or full-potential equations (FPE). Their ability to model transonic flows with shocks is the most significant benefit over the Level I codes. However, this benefit comes at the expense of added complexity stemming from the need to resort to a field approach to solve the nonlinear PDEs. The field approach requires that a region surrounding a given configuration be
divided into small elementary volumes; it is no longer sufficient to just divide the surface.
Considerable progress was made in the 1970s towards the development of practical transonic-flow analysis methods [18]. The progress had its genesis in the landmark paper of Murman and Cole [19]. The TSP code of Boppe [20] and the FLO-series of FPE codes of Jameson and Caughey [21] were in widespread use by the late 1970s. In practice, TSP codes are easier to use than FPE codes, especially for complex geometries, because of the differences in the boundary condition treatment. The TSP approach permits a simplified treatment based on the application of the no-normal-flow condition at a mean surface. In contrast, the FPE approach requires application at the actual surface. Consequently, Cartesian field-grid systems suffice for the TSP codes whereas the FPE codes need boundary-conforming grids. Cartesian grids are considerably easier to set up compared to the boundary-conforming grids. Of course, the TSP codes suffer from limitations on the class of geometries and flow conditions that they can model accurately, a direct result of their simplified boundary-condition treatment.
The promise and excitement of the newly-found ability of computing transonic flows were so strong in the '70s that even wing design procedures [22] were developed while the analysis methods were still evolving. Since transonic flows are particularly susceptible to viscous effects associated with shock/boundary-layer interaction, considerable research was done in coupling inviscid TSP and FPE codes with boundary-layer codes. Aeroelastic analysis capabilities based on the TSP formulation [23] were also developed.
Although the Level II codes provided the much needed capability of modeling transonic flows, they were not as effective as the Level I codes. A variety of factors contributed to this situation.
For example, the level of grid generation technology was not mature enough to support routine application of the FPE methods to anything more complicated than a wing or a wing-body. Applications of the codes confirmed, as one might have suspected, that solution accuracy deteriorated for flows containing strong shock waves or large regions of vorticity (e.g., leading-edge vortices). Usefulness of the codes was therefore severely limited, especially for fighter design since the performance of a typical fighter aircraft is greatly influenced by the leading-edge vortices.
In the author's opinion, nonlinear potential codes were basically taken over by the rapid pace of advances in Euler codes in the early eighties. The TRANAIR code [24] was an exception to this trend. TRANAIR adopts an unconventional hybrid approach combining the flexibility of panel methods to handle complex geometries with the ability of FPE formulations implemented on Cartesian grids to handle nonlinearities of transonic flows.
Level III: Euler. The Euler equations, which form the basis of the Level III codes, represent the highest-level inviscid approximation to the N-S equations.
They are applicable throughout the subsonic to hypersonic flight regime. This, combined with their demonstrated ability to automatically capture rotational flow regions (such as wakes shed behind wings and vortices emanating from sharp leading edges of fighter-type wings), requiring no explicit a priori definition of such regions, renders them significantly more useful than the Level I or II codes. However, the enhanced capability comes at the expense of additional computational cost of solving at least four and generally five coupled first-order PDEs instead of one second-order PDE.
Two factors at the dawn of the eighties convinced most researchers to shift their focus to Euler equations: projected growth in computer power and development of more efficient numerical algorithms to solve the Euler equations [25, 26]. In addition, the accelerated pace of boundary-conforming grid generation technology concomitant with the introduction of the finite-volume concept to decouple flow solvers from grid mappings held considerable promise for realizing a CFDer's dream of routinely analyzing complete aircraft geometries. A synopsis of the impressive progress made is presented here; details can be found in many publications including References 27 and 28.
Two distinct development paths can be identified for Euler codes: one based on hexahedral structured grids and the other on tetrahedral unstructured grids. During the early part of the eighties, most researchers focused their energies on structured-grid methods but the thrust shifted towards unstructured-grid methods from the mid-eighties onwards. The shift was prompted by the realization that unstructured grids afforded greater flexibility in handling complex geometries and promised to "automate" the grid-generation process.
Structured-grid advocates pursued multiblock grids to overcome the difficulties encountered in handling complex geometries; codes based on patched or overset multiblock grids evolved to a high degree of sophistication. Both structured- and unstructured-grid methods can now provide steady-state solutions on aircraft configurations in a matter of hours using high-performance computers.
Figure 3. F-22 surface pressure correlation
Two examples are presented here which illustrate the capabilities of a typical patched multiblock Euler code, TEAM, to support F-22 design needs. The first one involves the complete aircraft airloads prediction [9]. Nearly 370 cases were analyzed covering six Mach numbers, several angles of attack and some yaw angles. Effects of leading- and trailing-edge flap deflections, horizontal tail deflections and rudder deflections were simulated using the surface transpiration concept [29]. A correlation of computed and
measured surface pressures at 0.9 Mach number and 8 degree angle of attack is shown in Figure 3. The second example involves prediction of the time history of hammershock overpressure in the inlet duct at a supersonic flight condition [30]. In Figure 4, a snapshot of duct pressures and the Mach field around the forebody is presented.
Figure 4. F-22 inlet hammershock overpressure simulation
In spite of considerable advances in structured grid generation techniques, constructing multiblock grids for complex geometries continues to be a labor-intensive and time-consuming task. Unstructured grid generation is not yet sufficiently automated although it certainly requires less time and effort. The recent resurgence in Cartesian-grid methods [31, 32] offers an attractive alternative because they essentially dispense with the difficulties of grid generation leading to considerable reductions in time and effort. An example of the Cartesian-grid SPLITFLOW code application to F-16 store-carriage/separation modeling [32] is shown in Figure 5. The free-stream Mach number was 0.95 and the angle of attack was 12 degrees.
Two other aspects of Euler codes development deserve mention. First, shock-capturing rather than shock-fitting has become the preferred approach; both upwind and central-difference with adaptive-dissipation schemes have enjoyed a great deal of success. Second, most codes solve the time-dependent form of Euler equations even for modeling steady flows. Convergence acceleration techniques, such as local time stepping and multigrid, are employed to obtain time-asymptotic steady-state solutions in a computationally efficient manner. Both explicit and implicit time-marching schemes have been effectively utilized. Due to the use of time-dependent equations, extending the codes to model unsteady flows is relatively straightforward. This aspect has been exploited to develop dynamic aeroelastic analysis methods [33].
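The idea behind local time stepping can be shown on a scalar model problem. The sketch below is an assumption-laden illustration, not an Euler solver: it marches the one-dimensional equation u_t + a u_x = s(x) to its steady state with a first-order upwind scheme on a stretched grid, once with a single global time step set by the smallest cell and once with a separate step for each cell at the same Courant number. The grid, source term, tolerance, and function name are all hypothetical choices made for the example.

```python
import math


def march_to_steady_state(dx, source, a=1.0, cfl=0.9, local=True,
                          tol=1e-10, max_iter=100000):
    """Explicit upwind pseudo-time marching; returns (solution, iterations)."""
    n = len(dx)
    if local:
        dt = [cfl * h / a for h in dx]        # each cell at its own limit
    else:
        dt = [cfl * min(dx) / a] * n          # one step set by the smallest cell
    u = [0.0] * n
    for iteration in range(1, max_iter + 1):
        new_u = u[:]
        change = 0.0
        for i in range(n):
            upstream = u[i - 1] if i > 0 else 0.0   # inflow boundary value u = 0
            residual = a * (u[i] - upstream) / dx[i] - source[i]
            new_u[i] = u[i] - dt[i] * residual
            change = max(change, abs(new_u[i] - u[i]))
        u = new_u
        if change < tol:
            return u, iteration
    return u, max_iter


# Hypothetical stretched grid: 40 cells, each 10% larger than its neighbor.
dx = [0.01 * 1.1 ** i for i in range(40)]
x = []
position = 0.0
for h in dx:
    position += h
    x.append(position)
source = [math.cos(xi) for xi in x]

u_local, iters_local = march_to_steady_state(dx, source, local=True)
u_global, iters_global = march_to_steady_state(dx, source, local=False)
print("iterations with local steps:", iters_local)
print("iterations with a global step:", iters_global)
```

Both runs reach the same discrete steady state, because the time step multiplies a residual that vanishes at convergence; only the path to it changes. That is also why local time stepping sacrifices time accuracy and is used for steady problems only.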
Recent attempts at developing inverse design [34] and aerodynamic design optimization [35] methodologies are also noteworthy.
Figure 5. F-16 SPLITFLOW
Whereas Euler codes are superior to nonlinear potential codes in modeling strong shocks, the solutions do not necessarily simulate the transonic flows more accurately because they cannot model shock/boundary-layer interaction effects. Some researchers have combined Euler codes with boundary-layer codes to correct this deficiency. The Euler codes do have an edge over potential-flow methods in
capturing leading-edge vortices. But the location and strength of the primary vortices may not be accurate in cases where the secondary and/or tertiary vortices exert considerable influence. Also, the codes cannot provide an estimate of total drag (including skin-friction) or model flow separation from smooth surfaces. It is, therefore, not surprising that the development of N-S codes has been aggressively pursued in parallel.
Level IV: Navier-Stokes. Navier-Stokes codes have a great deal in common with Euler codes. In practice, a single code usually serves the need of solving both Euler and N-S equations. This follows directly from the similarities between the two sets of equations. Elimination of diffusion terms readily converts the N-S equations to the Euler equations; they both share a common set of convective terms. However, the practical implications of this seemingly minor difference are enormous. In order to accurately resolve the diffusion terms, highly clustered grids are required close to solid surfaces (as well as in other regions where viscous stresses are large). The N-S analyses pose significant challenges for grid generation, numerical accuracy, computational resources, etc.
With appropriate grid clustering, one can use the N-S equations to simulate laminar flows in a relatively straightforward fashion. But using these equations to directly model even simple turbulent flows stretches the current supercomputers to their limits [3]. At present, the Reynolds-averaged Navier-Stokes (RANS) equations are used almost exclusively to simulate turbulent flows on aircraft configurations. For a large majority of problems, the thin-layer approximation to the RANS equations is employed to reduce the problem to a manageable size. But these simplifications impose a heavy toll; we now require a turbulence model! A variety of turbulence models has emerged in recent years ranging from relatively simple algebraic models to more sophisticated Reynolds-stress models.
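Why averaging creates the need for a model can be seen in a short derivation. The sketch below assumes incompressible flow with constant density $\rho$ and viscosity $\mu$ purely for brevity (the codes discussed here solve the compressible equations), and the symbols $\bar{u}_i$, $u_i'$, $\mu_t$, and $k$ are standard notation introduced for this illustration rather than taken from the text. Each velocity component is split into a mean and a fluctuation:

$$u_i = \bar{u}_i + u_i', \qquad \overline{u_i'} = 0 .$$

Averaging the momentum equations then gives

$$\rho\left(\frac{\partial \bar{u}_i}{\partial t} + \bar{u}_j \frac{\partial \bar{u}_i}{\partial x_j}\right) = -\frac{\partial \bar{p}}{\partial x_i} + \frac{\partial}{\partial x_j}\left[\mu\left(\frac{\partial \bar{u}_i}{\partial x_j} + \frac{\partial \bar{u}_j}{\partial x_i}\right) - \rho\,\overline{u_i' u_j'}\right].$$

The term $-\rho\,\overline{u_i' u_j'}$ (the Reynolds stress) is not determined by the mean flow, so the averaged system is not closed. An eddy-viscosity model closes it with the Boussinesq approximation

$$-\rho\,\overline{u_i' u_j'} \approx \mu_t\left(\frac{\partial \bar{u}_i}{\partial x_j} + \frac{\partial \bar{u}_j}{\partial x_i}\right) - \frac{2}{3}\,\rho k\,\delta_{ij}, \qquad k = \tfrac{1}{2}\,\overline{u_k' u_k'} ,$$

where an algebraic model supplies $\mu_t$ from local mean-flow quantities, while a Reynolds-stress model instead solves transport equations for the components of $\overline{u_i' u_j'}$.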
The models have been applied to produce impressive results with multiblock structured-grid methods, both patched [36] and overset [37]. Unstructured-grid methods for turbulent-flow analysis are also advancing at an accelerated pace [38]. Approaches range from anisotropic tetrahedral grids to hybrid grids combining prisms close to solid surfaces and tetrahedral or Cartesian meshes elsewhere. In general, experiences in simulating turbulent flows have been rather mixed. There have been many successes in turbulent-flow simulation using simple models and many failures using the more sophisticated ones. Attempts at refining existing models and developing new, improved ones continue unabated [39]. Considerable effort has also been devoted to developing models for laminar to turbulent transition, an area of crucial importance to accurate viscous flow simulation. While progress is being made, the CFD practitioner's dilemma is quite clear. There are many aerodynamic problems in aircraft design where viscous effects dominate and they can be properly simulated only by solving the RANS equations. Internal flow problems (inlets, diffusers, nozzles, etc.) and high-lift systems (multi-element
and reliability of the solutions continue to be subject to the inadequacies of turbulence models. Limitations of some of the popular models in accurately simulating shock/boundary-layer interaction on a wing are illustrated in Figure 6. As pointed out by Marvin and Huang [39], eddy viscosity models can be used with confidence to predict pressure distributions as long as the pressure gradients are small and shock waves are weak; predicting skin friction and heat transfer to engineering accuracy requires careful attention to grid refinement and free-stream boundary condition influence. In the author's opinion, the prospects of a simple universal turbulence model are rather bleak; capturing the complex nature of turbulence into a model with a few free parameters is a long shot indeed. Probably the best rationale for continuing to use Level IV codes—in spite of their limitations—may be taken from Bradshaw [40]: "...we cannot calculate all flows of engineering interest to engineering accuracy. However, the best modern methods allow almost all flows to be calculated to higher accuracy than the best informed guess, which means that the methods are genuinely useful even if they cannot replace experiments."
This brief overview of Level IV codes would not be complete without mentioning Digital Physics™ technology [41] which simulates viscous flows without resorting to conventional turbulence modeling! The technology is claimed to have a fundamental advantage over the RANS codes because it is free from the artifice of discretization. Depending upon the success in extending the technology to compressible flows and additional validations/demonstrations, Digital Physics™ could provide an attractive alternative to the RANS methods.
Summary.—The discussion above clearly points to the tremendous progress that has been made over the past thirty years. Many more examples can be cited to demonstrate how CFD can be applied to simulate flow about complex configurations. The multitude of applications to date has helped us learn to exploit the complementary nature of CFD and wind tunnels. When intelligently used in conjunction with wind tunnels, CFD provides an improved understanding of flow fields and aerodynamic characteristics, and thereby contributes significantly to
design risk reduction and quality improvement. Wind tunnels will, however, continue to be a necessity for generating aerodynamic data for the full design envelope and for final design validation. Further enhancements in CFD capabilities will expand its role in aircraft design in the coming years. However, meeting the challenges of aircraft design in the future will require more than just incremental progress as discussed in the next section.
23.5 Whither CFD? The Next Twenty Years
In this section, we examine some of the challenges that lie ahead for CFD and the avenues that need to be pursued in order to tackle the challenges more effectively.
Changing Design Environment.—Since the mid to late 1980s, the U.S. aerospace industry has been seriously concerned about maintaining its leadership position in the increasingly competitive global marketplace. For the aerospace industry in general, and military airframers in particular, the aircraft design challenge has dramatically shifted from higher performance at any cost to high quality at affordable cost. The challenge is not either quality or affordability, but both simultaneously. Many studies conducted jointly by industry and government led to a widely accepted conclusion that industry must transition to an integrated product and process development (IPPD) environment. IPPD is characterized by integration of all aspects of product development including design, marketing, manufacturing, and product support. The IPPD approach relies on considering all requirements and constraints from the start rather than altering a design in its later stages to facilitate manufacturing or accommodate product support needs. Proper trade-offs can therefore be made early and the need for design changes later on is considerably reduced. The result is improved quality and increased productivity. In the IPPD context, design is viewed as an integrated multidisciplinary process. It should not be construed as an "automated process." There is no substitute for human creativity and unique synthesis ability. What is intended is to shorten the design cycle time by expeditiously providing design teams with data needed to make more informed decisions in the early stages and thereby alleviate the serious shortcomings of conventional design processes. Historical databases have traditionally served as the centerpiece of early design studies.
However, such databases are of little value when the new configurations and associated processes (manufacturing, support, etc.) depart substantially from the older ones—an increasingly common occurrence in military aircraft designs. New designs are typically driven by more and more stringent mission requirements. Considerable research is being done on cost-effective ways of implementing truly integrated IPPD design environments. For example, the approach proposed by DeLaurentis et al [42] uses statistical techniques for more flexibility in searching a design space by representing large amounts of knowledge from advanced
computational codes (or physical experiments) via response surface equations. The focus is on providing more knowledge about designs to influence decision making in the early design phases. One of the key requirements for the success of IPPD design processes is fast, accurate and cost-effective means of generating data for each contributing discipline as well as for multidisciplinary interactions.
Pivotal Role of CFD.—In the author's opinion, CFD will play a pivotal role in the IPPD design process. A combination of CFD and advanced computational methods from other disciplines (e.g., structures, controls, propulsion, acoustics) offers the only practical means of simulating multidisciplinary interactions which are a cornerstone of the IPPD design process. Wind tunnels are just not suited to producing the desired information in a timely and cost-effective fashion. In addition, CFD affords a powerful means of computationally defining and/or refining geometric shapes to produce specified flow characteristics while satisfying prescribed constraints; this is not practical in a wind tunnel. But CFD effectiveness will have to increase dramatically in order to fulfill the expectations of the IPPD design process. Although considerable attention is being paid to improving CFD effectiveness, many challenges lie ahead for turnaround time, accuracy and cost.
Turnaround Time.—The turnaround time is strongly influenced by the level of CFD code, i.e., Level I, II, III or IV. Lower-level codes offer rapid turnaround whereas higher-level codes take longer. Due to the limitations of the lower-level codes in modeling many flows of interest (see Section 23.4), it is absolutely essential that the turnaround time of the higher-level codes be made comparable to that of the lower-level codes. It is expected that, by the turn of the century, a steady aerodynamic analysis of a full aircraft configuration would be performed in less than 24 hours using unstructured-grid RANS codes.
However, even this level of speedup is insufficient. When a large number of runs are made on a single model to produce the amount of aerodynamic data required to meet design needs, the total span of time reaches unacceptable proportions. An even more challenging situation arises when the configuration geometry also changes and multiple analyses have to be performed on each variation. This is precisely what the IPPD design environment demands! It must also be noted that the higher-level methods have been rather ineffective to date in supplying unsteady aerodynamics data in a timely and cost-effective fashion; most design projects currently rely on linear potential methods. The challenge for the CFD community is clear: develop appropriate technologies and integrate them in a manner that brings the turnaround time for each analysis to a matter of minutes. The list of potential enabling technologies includes streamlined interfaces to computer-aided design (CAD) systems; standard data-exchange protocols; "automated" grid generation; parallel processing of flow solver software; and intelligent systems for data analysis and management, to name a few. Ongoing research and development gives considerable hope and encouragement to CFD practitioners that the target will be achieved over the next ten to fifteen years.
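The turnaround argument above is essentially arithmetic, and a rough sketch may help make it concrete. The run counts below are hypothetical and chosen only for illustration; the 24-hour figure is the one quoted above, and 15 minutes stands in for "a matter of minutes." The function name and the assumption that runs are independent and split evenly across machines are likewise illustrative.

```python
def campaign_days(n_conditions, n_geometries, hours_per_run, concurrent_runs=1):
    """Wall-clock days needed to finish every run in an analysis campaign.

    Assumes (for illustration only) that all runs take the same time and are
    independent, so they can be spread evenly over `concurrent_runs` machines.
    """
    total_runs = n_conditions * n_geometries
    # Ceiling division: a partly filled final batch still takes a full run time.
    batches = -(-total_runs // concurrent_runs)
    return batches * hours_per_run / 24.0


# Hypothetical campaign: 200 flow conditions on each of 10 geometry variants.
slow = campaign_days(200, 10, hours_per_run=24.0)   # one day per analysis
fast = campaign_days(200, 10, hours_per_run=0.25)   # 15 minutes per analysis

print(f"24 h per run:   {slow:.0f} days")
print(f"15 min per run: {fast:.1f} days")
```

Under these assumed numbers, a one-day analysis that looks acceptable in isolation turns into a multi-year campaign, whereas minutes per run bring the same matrix down to a few weeks on a single machine.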
Accuracy.—Accuracy of computed solutions is one of the biggest concerns. As pointed out by Bangert et al [9], the F-22 design team relied heavily on wind-tunnel data due to the limitations of current CFD codes in modeling viscous effects, especially when applied to full aircraft geometries and the full speed, altitude, and maneuver flight envelope. More than 40,000 hours of wind-tunnel testing has been done to support the F-22 development. Solution accuracy has two components: numerical and physical. A solution may be considered numerically accurate if it shows little or no sensitivity to changes in grids as well as numerical parameters related to the algorithm. (It is assumed that the code in question has been verified as to the adequacy of its numerical formulation in solving the governing equations.) At present, systematic parametric studies are about the only means of estimating the effects of grid resolution and numerical parameters such as dissipation and dispersion. Schedule and cost constraints of a typical design effort do not permit extensive investigations to determine "optimal" grids and numerical parameters. CFD analysts end up relying upon previous experience and expertise. As we enter the era of fewer aircraft programs spaced farther apart, heavy reliance on past experience may become too risky. What we need is built-in means of quantifying the level of accuracy. An assessment of the level of accuracy and associated error bounds of the solutions is even more critical when CFD methods are combined with those from other disciplines to simulate multidisciplinary interactions. Error estimation is admittedly a difficult problem but a solution is urgently needed if CFD is to be utilized effectively in the IPPD environment. Incorporation of solution-adaptive techniques is one approach to minimizing numerical errors. An example of solution sensitivity to grid adaption is shown in Figure 7. 
Values of lift coefficient for a wing-body-centerline tail model, computed using two unstructured-grid Euler codes, are compared with each other and with experimental data at 0.4 Mach number. The model has a chined forebody and sharp-edged wing and vertical fin. The Cartesian-grid SPLITFLOW code which uses solution adaption predicts a much larger value of lift at 30-degrees angle of attack than the tetrahedral-grid USM3D code which at present does not have an adaptive-grid capability. The difference is most likely due to better vortical-flow resolution afforded by the SPLITFLOW code compared to the USM3D code. At higher angles of attack, the leading-edge vortices move away from the configuration to regions where the USM3D mesh is too sparse to adequately resolve them. Additional details for these analyses can be found in References 43 and 44.
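One common way to put a number on grid sensitivity, offered here only as an illustration and not as a method described in this chapter, is Richardson extrapolation over three systematically refined grids. The sketch below assumes a constant refinement ratio and solutions that converge monotonically; the lift-coefficient values are hypothetical and are not taken from Figure 7.

```python
import math


def richardson(f_fine, f_medium, f_coarse, r):
    """Observed order of convergence and extrapolated grid-independent value.

    r is the constant grid refinement ratio (> 1) between successive grids.
    Valid only when the three solutions converge monotonically.
    """
    d_fine = f_medium - f_fine        # change between the two finest grids
    d_coarse = f_coarse - f_medium    # change between the two coarsest grids
    if d_fine == 0 or d_coarse / d_fine <= 1:
        raise ValueError("solutions are not converging monotonically")
    p = math.log(d_coarse / d_fine) / math.log(r)
    f_extrapolated = f_fine + (f_fine - f_medium) / (r**p - 1.0)
    return p, f_extrapolated


# Hypothetical lift coefficients on coarse, medium and fine grids (r = 2).
order, cl_limit = richardson(f_fine=1.095, f_medium=1.080, f_coarse=1.020, r=2.0)
error_estimate = abs(cl_limit - 1.095)  # estimated error left in the fine-grid value

print(f"observed order  = {order:.2f}")
print(f"extrapolated CL = {cl_limit:.4f}")
print(f"fine-grid error = {error_estimate:.4f}")
```

An estimate of this kind is only as good as the assumption that all three grids are fine enough for the error to shrink at a steady rate; on grids too sparse to resolve the vortices it would not apply.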
Figure 7. Sensitivity of unstructured-grid Euler solutions to grid adaption
Even if a code produces a numerically accurate solution, it is not trivial to determine how well the solution stacks up against the real flow—a measure of its physical accuracy. To date, the CFD community has advocated and conducted extensive "validation" exercises to generate computed vs. experimental data correlations, and used them to substantiate claimed levels of physical accuracy. However, the exercises have contributed more to the proliferation of databases than to increasing the knowledge base and confidence level needed to produce highly accurate aerodynamic data for aircraft design. Extensive correlations on geometries and flow conditions that differ substantially from those being considered by the design teams are of little value. This situation is particularly relevant to military aircraft projects since new designs are generally quite different from the older ones (for which CFD correlations might exist). The traditional approach to code validation is fraught with difficulties. How many test cases, what combination of flow conditions for each test case, and what turbulence models must we consider before a code can be declared as fully validated? A matrix of runs using any reasonable set of test cases and flow conditions quickly grows into a monumental task. Even if we assume that adequate resources as well as high-quality measured data are available for carrying out such a task, we run against the tide of technology dynamics. The rapid pace of advances in hardware, numerical algorithms, and modeling of turbulence and transition fosters an environment where codes are never quite "finished." Sometimes the changes are nominal, many times not. Any rational cost/benefit assessment of a plan that allocates huge resources to validating a code that might be superseded the next day by a "new and improved" method does not support the traditional validation approach.
A much more valuable approach might be to select a standard set of test cases, i.e., geometries and aerodynamic databases. The database may include theoretical (or datum numerical) solutions and/or measured data of known accuracy. Code developers could perform comprehensive parametric studies on their codes using the standard test cases to develop guidelines for grid distributions, numerical parameters, etc. Standard boundary-layer and shear-flow test cases could be used to determine the number and distribution of grid points, numerical dissipation parameters, turbulence model, etc., that produce solutions of acceptable accuracy. For example, a recent numerical/experimental study of near-field wing-tip vortex flow by Dacles-Mariani et al [45] found that their structured-grid code required a
minimum of 15 points across vortex cores with a grid cell aspect ratio of one and a fifth-order scheme to adequately resolve the flow field. Users will then have appropriate guidelines for "correctly" setting up their computational problems. The chances of generating meaningful data in a design environment will thereby improve. The proposed approach will also provide a sound basis for comparing the overall effectiveness of different codes. With more third-party vendors offering CFD software, the need for adopting the proposed approach is even greater.
Continuing advances in large eddy simulation (LES) and direct numerical simulation (DNS) offer very attractive options for developing a CFD capability that can be truly predictive unlike the RANS methods which will continue to suffer from turbulence and transition model uncertainties. However, even rough order of magnitude estimates of the required computational resources for a full aircraft analysis at flight Reynolds number are staggering as shown in Figure 8. Due to the Re³ computational work barrier, it appears that DNS will not be practical for several decades. However, LES could become practical in about twenty years if computer memory and speed increase by about four orders of magnitude. Considering the ten-fold increase in speed and memory every 7 years since 1975, it is conceivable that the required computational resources will become available by 2015 to 2020. It is imperative that more effort be directed at several pacing items such as sub-grid scale modeling; numerical algorithms; boundary conditions; grid generation; and tools for analyzing, visualizing and managing extremely large amounts of data.
Figure 8. CFD computing challenge
Cost.—Costs associated with CFD application include both labor and computing. At present, labor expenses are mainly connected with pre- and post-processing steps. For higher-level CFD codes, the labor expenses are still beyond the acceptable range.
For example, the use of structured-grid methods requires several person-weeks to produce the first solution whereas a desirable value is closer to a few person-hours. Unstructured-grid methods in general, and Cartesian grids in particular, appear to be quite promising in reducing the level of effort. Progress in developing streamlined interfaces between grid-generation methods and CAD systems is crucial to reducing the geometry acquisition/modeling time. These improvements will also help in evaluating design changes in an inexpensive manner. As a matter of fact, technologies needed to reduce labor hours are essentially identical to those for reducing turnaround time.
Computing costs mainly relate to running the flow solver and may include grid generation in some instances. Computers with high processing speeds and large memory are typically needed to produce the desired amount of data on schedule. The IPPD design process will most likely require that the full range of data be generated over a matter of days and not months. Computing expenses to suit this kind of timeframe must not be so large that the total product development cost increases rather than decreases. Consequently, cost and computational efficiency of the entire hardware and software system are very important considerations for effective CFD use in the future. Strategies for increasing computational efficiency and reducing cost will have to be an integral part of all future plans for CFD development and application. Various competing computer architectures ranging from a single workstation to shared memory processors to distributed memory processors are in the market; new ones will undoubtedly be introduced over the coming years. Matching software and hardware characteristics will be one of the more challenging aspects of effective CFD development and application in the future.
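The computing-resource estimate quoted earlier in this section for LES and DNS rests on two simple scalings, which can be sketched as follows. The growth rate (ten-fold every 7 years), the four-orders-of-magnitude target, and the cubic dependence of DNS work on Reynolds number are taken from the text; the baseline years are assumptions, since the text does not say which year's machines the estimate starts from.

```python
import math


def years_to_gain(factor, tenfold_period_years=7.0):
    """Years needed for computing capability to grow by `factor`,
    assuming a steady ten-fold gain every `tenfold_period_years`."""
    return tenfold_period_years * math.log10(factor)


def dns_work_ratio(reynolds_ratio, exponent=3.0):
    """Relative growth in DNS work when Reynolds number rises by
    `reynolds_ratio`, using the Re**3 scaling quoted in the text."""
    return reynolds_ratio ** exponent


years = years_to_gain(1e4)  # four orders of magnitude
print(f"years for a 10^4 gain: {years:.0f}")

# The arrival date depends on the assumed starting point (hypothetical here).
for baseline in (1990, 1997):
    print(f"baseline {baseline} -> about {round(baseline + years)}")

# A ten-fold increase in Reynolds number under the cubic scaling.
print(f"DNS work ratio for 10x Re: {dns_work_ratio(10.0):.0f}")
```

Four orders of magnitude at this rate take about 28 years, so the projected date is sensitive to the baseline: a starting point around 1990 lands inside the 2015 to 2020 window, while a later one falls somewhat beyond it.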
23.6 Concluding Remarks
In this paper, the current state of CFD has been examined from an industry perspective which regards CFD as an engineering tool for producing aerodynamic data to support aircraft design needs. CFD capabilities have advanced from the simplified geometry, simplified physics models of the late sixties and early seventies to the complex geometry, realistic physics models used routinely today. These advances were motivated by the need to offer a cost-effective alternative to wind-tunnel testing. The cost of testing had been growing steadily over the same period that the cost of computing (in terms of dollars per floating point operation) was dropping rapidly. CFD has now advanced to a stage that no aircraft project depends solely on wind tunnels for all its aerodynamic data. However, CFD is not an alternative to the wind tunnel, just a perfect complement. Considerable attention is currently being paid to increasing the overall effectiveness of the more advanced Euler/N-S methods. Strategies are being employed to decrease turnaround time, lower cost and increase accuracy. The strategies include streamlining of interfaces with CAD systems, increasing efficiency of numerical algorithms, developing smart tools for data processing, and improving turbulence models to more accurately represent the physics of complex flows. However, incremental changes will not be sufficient to meet the challenges ahead. Recent changes in the aerospace business have triggered dramatic changes in the aircraft design process. The evolving process is expected to be much more integrated, with a focus on providing the requisite data to design teams as early as possible so that more informed decisions can be made in early stages. It will demand
and expect much more from CFD than what we would be able to deliver if we continue along the path that we have followed for the past thirty years. We need to make truly dramatic reductions in both turnaround time and cost. The issue of producing solutions of reliable and acceptable accuracy needs to be tackled head on. In selecting the "right" technologies for building future CFD capabilities, it is essential to evaluate their impact on cycle time, accuracy and cost. The challenge facing the CFD community today is to channel their efforts and resources in a manner that makes CFD fully responsive to the future aircraft design needs. Numerous benefits will accrue from incorporating advanced CFD methods into the design processes. Using advanced methods that offer rapid turnaround capability will reduce design cycle time. Design teams can then explore a wider spectrum of alternatives within the schedule and cost constraints of a typical product development effort than is currently feasible. Fast, accurate and affordable methods will increase productivity and reduce the number of expensive tests needed to support design data needs. The use of advanced methods may also reduce the number of cycles required for design closure. Design teams will be able to conduct extensive trade-offs needed to guide the evolution of a configuration in a direction that minimizes both acquisition and life-cycle costs. Improved understanding of component interactions will permit design changes to be made early and thereby reduce risk and increase the probability of meeting all customer requirements.
REFERENCES
1. Landahl, M., "CFD and Turbulence," ICAS-90-0.1, Stockholm, Sweden, September 1990.
2. Rai, M., Gatski, T., and Erlebacher, G., "Direct Simulation of Spatially Evolving Compressible Turbulent Boundary Layers," AIAA Paper 95-0583, January 1995.
3. Moin, P. and Mahesh, K., "Direct numerical simulation: a tool in turbulence research," CTR Manuscript 166, Center for Turbulence Research, Stanford University, Stanford, California, April 1997.
4. Miranda, L.R., "Application of Computational Aerodynamics to Airplane Design," Journal of Aircraft 21, June 1984, pp. 355-370.
5. Hancock, G.J., "Aerodynamics: the role of the computer," Aeronautical Journal, August/September 1985, pp. 269-279.
6. Miranda, L.R., "Transonics and Fighter Aircraft: Challenges and Opportunities for CFD," NASA CP 3020, Vol. I, Part 1, April 1988, pp. 153-173.
7. Cosner, R.R., "Issues in Aerospace Applications of CFD Analysis," AIAA Paper 94-0464, January 1994.
8. Rubbert, P.E., "CFD and the Changing World of Airplane Design," ICAS-94-0.2, September 1994.
9. Bangert, L.H., Johnston, C.E., and Schoop, M.J., "CFD Applications in F-22 Design," AIAA Paper 93-3055, July 1993.
10. Raj, P., "Requirements for Effective Use of CFD in Aerospace Design," NASA CP 3291, May 1995, pp. 15-28.
11. Miranda, L.R., Elliott, R.D., and Baker, W.M., "A Generalized Vortex Lattice Method for Subsonic and Supersonic Flow Applications," NASA CR 2865, December 1977.
12. Youngren, H.H., Bouchard, E.E., Coopersmith, R.M., and Miranda, L.R., "Comparison of Panel Method Formulations and Its Influence on the Development of QUADPAN, an Advanced Low-Order Panel Method," AIAA Paper 83-1827, July 1983.
13. Magnus, A.E., Ehlers, F.E., and Epton, M.A., "PANAIR: A Computer Program for Predicting Subsonic or Supersonic Linear Potential Flow About Arbitrary Configurations Using a Higher-Order Panel Method," NASA CR 3251, April 1980.
14. Johnston, C.E., Youngren, H.H., and Sikora, J.S., "Engineering Applications of an Advanced Low-Order Panel Method," SAE Paper 851793, October 1985.
15. Giesing, J.P., Kalman, T.P., and Rodden, W.P., "Subsonic Steady and Oscillatory Aerodynamics for Multiple Interfering Wings and Bodies," Journal of Aircraft 9, Oct. 1972, pp. 693-702.
16. Chen, P.C., Lee, H.W., and Liu, D.D., "Unsteady Subsonic Aerodynamics for Bodies and Wings with External Stores Including Wake Effects," Journal of Aircraft 30, May 1993, pp. 618-628.
17. Chen, P.C. and Liu, D.D., "Unsteady Supersonic Computations of Arbitrary Wing-Body Configurations Including External Stores," Journal of Aircraft 27, Feb. 1990, pp. 108-116.
18. "Transonic Aerodynamics," Progress in Astronautics and Aeronautics, Vol. 81, American Institute of Aeronautics and Astronautics, Washington, D.C., 1982, Nixon, D. (ed.)
19. Murman, E.M. and Cole, J.D., "Calculation of Plane Steady Transonic Flows," AIAA Journal 9, January 1971, pp. 114-121.
20. Boppe, C., "Transonic Flowfield Analysis for Wing Fuselage Configurations," NASA CR 3243, 1980.
21. Caughey, D.A., and Jameson, A., "Progress in Finite-Volume Calculations for Wing-Fuselage Combinations," AIAA Journal 18, Nov. 1980, pp. 1281-1288.
22. Hicks, R.M. and Henne, P.A., "Wing Design by Numerical Optimization," AIAA Paper 77-1247, 1977.
23. Borland, C.J. and Rizzetta, D.P., "Transonic Unsteady Aerodynamics for Aeroelastic Applications, Vol. I, Technical Development Summary for XTRANS3S," AFWAL-TR-80-3107, 1982.
24. Samant, S.S., Bussoletti, J.E., Johnson, F.T., Burkhart, R.H., Everson, B.L., Melvin, R.G., Young, D.P., Erickson, L.L., Madson, M.D., and Woo, A.C., "TRANAIR: A Computer Code for Transonic Analysis of Arbitrary Configuration," AIAA Paper 87-0034, January 1987.
25. Jameson, A., Schmidt, W. and Turkel, E., "Numerical Solution of the Euler Equations by Finite-Volume Methods Using Runge-Kutta Time-Stepping Schemes," AIAA Paper 81-1259, June 1981.
26. Rizzi, A., "Damped Euler Equation Method to Compute Transonic Flow About Wing-Body Combinations," AIAA Journal 20, Oct. 1982, pp. 1321-1328.
27. "Applied Computational Aerodynamics," Progress in Astronautics and Aeronautics, Vol. 125, American Institute of Aeronautics and Astronautics, Washington, D.C., 1990, Henne, P.A. (ed.)
28. "Computational Aerodynamics Based on the Euler Equations," AGARD-AG-325, September 1994, Slooff, J.W. and Schmidt, W. (eds.)
29. Raj, P. and Harris, B.W., "Using Surface Transpiration with an Euler Method for Cost-effective Aerodynamic Analysis," AIAA Paper 93-3506, August 1993.
30. Goble, B.D., King, S., Terry, J. and Schoop, M., "Inlet Hammershock Analysis Using a 3-D Unsteady Euler/Navier-Stokes Code," AIAA Paper 96-2547, July 1996.
31. Tidd, D.M., Strash, D.J., Epstein, B., Luntz, A., Nachshon, A., and Rubin, T., "Application of an Efficient 3-D Multigrid Euler Method (MGAERO) to Complete Aircraft Configurations," AIAA Paper 91-3236-CP, 1991.
32. Karman, S., "SPLITFLOW: A 3D Unstructured Cartesian/Prismatic Grid CFD Code for Complex Geometries," AIAA Paper 95-0343, January 1995.
33. Guruswamy, G.P., "Unsteady Aerodynamics and Aeroelastic Calculations for Wings Using Euler Equations," AIAA Journal 28, 1990, pp. 461-469.
34. Campbell, R.L. and Smith, L.A., "A Hybrid Algorithm for Transonic Airfoil and Wing Design," AIAA Paper 87-2552, 1987.
35. Reuther, J.J., Jameson, A., Alonso, J.J., Rimlinger, M.J., and Saunders, D., "Constrained Multipoint Aerodynamic Shape Optimization Using an Adjoint Formulation and Parallel Computers," AIAA Paper 97-0103, January 1997.
36. Ghaffari, F., "Navier-Stokes, Flight, and Wind Tunnel Flow Analysis for the F/A-18 Aircraft," NASA Technical Paper 3478, December 1994.
37. Gomez, R.J. and Ma, E.C., "Validation of a Large Scale Chimera Grid System for the Space Shuttle Launch Vehicle," AIAA Paper 94-1859-CP, 1994.
38. Venkatakrishnan, V., "A Perspective on Unstructured Grid Flow Solvers," AIAA Paper 95-0667, 1995.
39. Marvin, J.G. and Huang, G.P., "Turbulence Modeling—Progress and Future Outlook," Proc. 15th International Conference on Numerical Methods in Fluid Dynamics, Monterey, California, June 1996.
40. Bradshaw, P., "Turbulent Secondary Flows," Annual Review of Fluid Mechanics 19, 1987, pp. 53-74.
41. "Digital Physics," Machine Design, Vol. 66, Dec. 1994, pp. 96-100.
42. DeLaurentis, D., Mavris, D.N. and Schrage, D.P., "System Synthesis in Preliminary Aircraft Design Using Statistical Methods," ICAS 96-3.4.4, September 8-13, 1996.
43. Kinard, T.A., Finley, D.B. and Karman, S.L., "Prediction of Compressibility Effects Using Unstructured Euler Analysis on Vortex Dominated Flow Fields," AIAA Paper 96-2499, June 1996.
44. Raj, P., Kinard, T.A. and Vermeersch, S.A., "Vortical Flow Simulation Using An Unstructured-Grid Euler Method," ICAS-96-1.4.5, September 1996.
45. Dacles-Mariani, J., Zilliac, G.G., Chow, J.S. and Bradshaw, P., "Numerical/Experimental Study of a Wingtip Vortex in the Near Field," AIAA Journal 33, Sep. 1995, pp. 1561-1568.
Aerospace Engineering 2000: An Integrated, Hands-On Curriculum
A. Richard Seebass¹ & Lee D. Peterson¹
24.1 "Largest Gift Ever ..."
"Frustrated by the state of engineering education, a New York City foundation is committing $200 million to build a college from scratch" [41]. This would seem to represent a high level of frustration. Lawrence W. Milas, the F. W. Olin Foundation's president, is quoted in this New York Times article: "We are fully committed to getting this institution up and running and with a strong endowment of somewhere between $300 million and $400 million, ... We'll emphasize an interdisciplinary approach, low student-to-faculty ratio, mentoring and a hands-on kind of experience." Bravo! Such an approach is not new, of course, but this gift reflects a concern that engineering education is not measuring up to the nation's needs. This new engineering college will be a notable endeavor, perhaps a Massachusetts analog of Harvey Mudd, perhaps something unique. Is the criticism implied by this article warranted? The aerospace curriculum at MIT, as but one example, has long provided an integrated approach in its sophomore year [34], and while Earll Murman was chair there, its Aeronautics and Astronautics curriculum was further strengthened in many ways, including the adoption of this systems approach to the full curriculum and their emphasis on the important "implicit curriculum" [21]. As but one other example, at the University of Colorado, we planned, raised the funds for, built, and on April 24, 1997, dedicated our Integrated Teaching and Learning Laboratory.
¹ Aerospace Engineering Sciences, University of Colorado, Boulder, Colorado, 80309-0429.
Frontiers of Computational Fluid Dynamics -- 1998. ©1998 World Scientific. Editors: David A. Caughey & Mohamed M. Hafez.
Engineering education has been examined for over a century including, among others, the Wickenden Report [52], the Hammond Reports [53-54], the Grinter Report [8], the Haddad and Pister Reports [44-45], and ASEE's Engineering Education for a Changing World [9]. As Monteith reminds us, what Burr said in 1893 remains true today: "[T]he first and fundamental requisite in the ideal education of young engineers, a broad, liberal education in philosophy and arts, [is a] precedent to the purely professional training. ... The main purpose should be such a cultivation of human qualities as will subsequently enable engineers to meet men as well as matter." [16, 38]. On this subject see also the books by Florman and Moulakis [32, 39]. Engineering education is not our most serious educational problem; K-12 education is. On this subject there are divergent views; one view that parallels in several ways what we plan for our curriculum is that of Adler [3]. College education in general has its critics and its assessors. Perhaps the assessment of most repute is due to Boyer [15]. The economic difficulties which this nation faced in the 1970s and 80s (now somewhat in the background because of a robust economy, low unemployment and inflation, and a weaker yen), largely remain. We are now the largest debtor nation, and we continue to import $7-$10 billion more in goods each month than we export (12 month moving average). We in aerospace take pride in the fact that among manufactured goods, aerospace products have provided the largest annual trade surplus, which in 1992 was over $31 billion [5]. Our economic difficulties seem to stem more from our management practices than from failures in engineering education, although the latter must not be excused from contributing [26-28, 33, 35]. The Japanese attribute their success, at least in part, to what they learned from the American engineer and statistical physicist, W. Edwards Deming.
After the June 1980 NBC News White Paper, "If Japan Can ... Why Can't We?", corporate leaders in this country started to learn what Deming had to teach them. Many others, the first author among them, also learned much from this man.
24.2 "Best efforts won't do!"
During the 1980s, engineering deans and department chairs, and the Engineering Deans Council of the American Society for Engineering Education (ASEE), became concerned that while we may be producing very good engineers, they may not be good enough to maintain our economic well-being. We began to ask, as others asked later [38], "How shall we train our engineers so that they will be the best in the world?" We didn't then, and don't yet, know. We were concerned, however, that the criteria for accredited degrees of the Engineering Accreditation Commission of the Accreditation Board for Engineering and Technology (ABET) did not encourage the needed change. As long as we satisfied ABET's detailed criteria, accreditation was generally assured. Some schools displayed the imagination to depart from the ABET criteria, putting their future accreditation at risk.
ABET criteria represented a "best effort" toward assuring this nation a satisfactory engineering education. But satisfactory wasn't likely to suffice. As W. Edwards Deming was wont to say, "Best efforts won't do!" And they won't. Some thought they knew what we needed to do; others simply knew we needed to improve. Under pressure from various engineering deans' groups (Big Ten Plus, Engineering Deans Group, ASEE Engineering Deans Council), the Engineering Accreditation Commission rethought its criteria and, much to its credit, developed new criteria. These are being studied and improved through pilot accreditation in 1996-97, will be implemented on a voluntary basis in 1998, and become effective in 2001 [1-2]. The new criteria offer programs the flexibility needed to implement experimentation and reform, but they demand much more of us. Properly applied, they offer the possibility of continuous improvement toward the best in engineering education.
24.3 What do our customers and others say?
Industry often thinks of itself as our customer. They hire our students. But try sending them the bill for the education of those they hire. You will soon learn that they are not our customers. Our students, and their parents, are our customers. For public education, society is also our customer. So we should ask our students about their education. Here we have a large selection to consider, including those students who terminate their studies with us (dissatisfied customers, so to speak) [51]. But do our customers know what we should do? No, they do not. But they surely expect us to know. To this end perhaps we should listen most to those who lead major industries yet are thinking about the needs of society and of the students, as well as their own needs; and we should listen to those educators who have thought deeply on this subject [7, 10-11, 13, 25, 38, 40, 56], as well as to our graduates.
In academe, we have long recognized that corporate recruiters and corporate leaders have different perspectives on what they seek in our graduates. So it is encouraging to note that industry now aspires to formulate a comprehensive view of the desirable attributes of our graduates [14, 36-37]. Beyond the problems that we have long and, to some extent, successfully addressed (such as the greater participation of minorities and women in engineering education, the need to keep our costs under control, and the need for our graduates to understand that they must embark on lifelong learning), there is general agreement on what we need to do. Foremost, we must continue to do what we now do well: emphasize the fundamentals. In addition, our students need to develop better team and communication skills, acquire the ability to integrate diverse disciplines, and understand engineering in its humane, global, and political contexts. Most especially, our students need more hands-on experience, a stronger ability to integrate theory,
analysis, and experiment, and participation in the design, fabrication, and testing of real devices. Some also feel that the time has come to move to the master's degree as the first professional degree in engineering [see, e.g., 10]. This is an old, and sound, idea whose time may have come. The ramifications of such a formal change are profound, and it has long been resisted by industry and academe. Industry resists because it makes engineering education less accessible and engineers more expensive; academe resists because it means a longer time to degree and perhaps thereby fewer students. Even without a formal change, this shift is largely occurring anyway through more emphasis on students continuing their university studies through the MS degree, and through their pursuit of master's studies early in their professional careers.
24.4 Integrated Teaching and Learning Laboratory
In the early 1990s our college's faculty developed their second five-year strategic plan, and independently planned the college's Centennial Campaign. From our strategic planning it became clear that we needed to do much to improve our undergraduate education, and that in order to finance this improvement, we must make undergraduate curricular reform the cornerstone of our centennial fundraising campaign [22-24]. We were especially concerned with the limited hands-on, laboratory, and design experience among our undergraduates. We were, of course, also concerned about improving the quality of the students we attracted, with more than a third of our first-year students coming from out of state despite the very considerable nonresident tuition we charged. This led to the appointment in 1992 of a team of faculty under the direction of Larry Carlson of mechanical engineering and Jacquelyn Sullivan of civil engineering to: "... pioneer a multidisciplinary learning environment that integrates theory with practice and promotes creative, team-oriented problem-solving skills." To accomplish this change, we planned, built, and on April 24, 1997, dedicated our $11 million Integrated Teaching and Learning Laboratory (ITLL) [55]. This facility comprises 34,400 sq. ft. and is equipped with 30 LabStations of our own design and some $6 million in computers, instrumentation and networking (largely from Hewlett-Packard, to mention the largest of ITLL's many corporate supporters), with the building and its systems serving as a living laboratory.
24.4.1 ITLL Curricular Components
The curricular components of ITLL are shown in the "ITLL Wheel" above. They are a first-year projects course, lecture demonstrations, hands-on homework, experimental modules, computer simulations, discipline-based laboratory courses, and the interdisciplinary design projects that form the spokes of this wheel. These are all served by an extensive network of computers. The principal components of ITLL are: two large (4,000 sq. ft.) laboratory plazas, each with 15 LabStations that altogether can accommodate 60 to 90 students working in two- to three-person teams, with adjacent breakout areas; a simulation laboratory with 25 UNIX workstations; four design studios for ITLL senior projects; two first-year design studios; a manufacturing center; an electronics center; an active learning center; two academic "living rooms" (an idea adopted from the new physics building at the University of Washington); ten group study rooms with PCs and Internet connections; a kinetic sculpture gallery illustrating many scientific and engineering principles; and a student lounge. The first-year project
classrooms, the active learning center and the two breakout areas are "smart spaces" equipped with network access and video projectors. The building itself is well instrumented. All components share common data acquisition and analysis software.
LabStations
The LabStations, designed by a team led by the second author, accommodate stand-alone experimental modules: sequence-independent experiments that require minimal supervision and are suitable for open-ended exploration. Each LabStation contains two HP Pentium PCs with Windows NT, LabVIEW, MATLAB, Microsoft Office, and data acquisition and signal conditioning hardware, connected to a 100 MB/sec network. They provide an interface for HP benchtop instrumentation (comprising an oscilloscope, function generator, and multimeter), and also various levels of AC/DC power and a compressed air supply. Experimental modules on carts are easily connected to the LabStations, which allows the reprogramming of the LabPlazas several times each day [17].
Modules
Modules are mobile experiments that may be used with the LabStations for in-class demonstrations and for hands-on homework assignments. Thirty-five experiments have been developed to date. They span measurements and instrumentation, electronics and microprocessors, dynamic systems and controls, fluid mechanics, heat transfer, and structures and materials.
Hands-on homework
ITLL provides faculty the opportunity to assign "hands-on" homework to their students, who use simple apparatuses and materials to demonstrate to themselves concepts that they have learned in class.
Building as Lab
The building itself is a laboratory with partially exposed and instrumented systems, including, among other features, various structural elements, the heating and ventilating system, and glazing instrumented for solar gain and heat loss [17].
First-Year Engineering Projects Course
As part of the college-wide reformation of our departmental curricula, we have introduced many new courses. One that fully capitalizes on ITLL is the first-year projects course. Introduced in 1994 on a prototype basis, it has been continuously improved. Early results indicate a 50% improvement in retention of entering
students participating in this course in engineering studies. Our aerospace curriculum expects aerospace majors to take this course. Our college-wide, no-fault first year, however, means that we do not require this course. Here we have been guided by Maryland's experience with first-year design [25, 48-49]. The course begins with an ice-breaking "mystery artifact" challenge, continues with a design project, and concludes with a reverse engineering project. It introduces the student to working in teams, methods of oral and written communication, project management, and low-level computing skills that are part of the implicit engineering curriculum. It includes workshops on team dynamics, learning styles and group communications, AutoCAD, basic electronics, and the use of hand tools. We aspire to utilize this projects course as a means of encouraging our students to draw upon their growing knowledge of calculus, chemistry and physics, and to help them understand that this material is integral, not extraneous, to engineering. That this is possible has been demonstrated by a similar course at the University of Maryland. Other notable curricular changes in the freshman year include Drexel's E4 curriculum (which integrates fundamentals, design, and laboratory with the mathematics and science needed for the study of engineering [20, 46]) and Rose-Hulman's integrated first year [50]. On the subject of freshman design see also Ercolano [30]. The trend in first-year engineering courses is reviewed by Fitzhorn and Johnson [31].
24.5 Aerospace Engineering Sciences Curriculum 2000
The aerospace industry has experienced challenging times. Defense outlays in constant dollars, and industry employment, are at half the levels they were a decade ago. Yet today job opportunities abound.
We are in a transition from military and NASA space programs to their enrichment many times over with commercial space programs, from our repeated presence in space to our (indeed, the world's) permanent presence there, from large, expensive and long-delayed space science experiments to frequent space access with more university engineering involvement. Commercial aircraft and gas turbine engine production continues to grow. New, large transports and a supersonic transport are both under study, and there is a renaissance in general aviation. Our industry provides the nation with more than $30 billion each year in an aerospace trade surplus. We must do our part to ensure its continued health.
In conjunction with the planning of ITLL, Aerospace Engineering Sciences' Curriculum and Teaching Committee, under the direction of the second author, made a thorough review of our curriculum. In the 1995-1996 academic year, in anticipation of the full occupancy of ITLL in the fall of 1997, and with the intent of fully capitalizing on this facility, the Curriculum and Teaching Committee proposed an entirely new sophomore year and a different upper division curriculum. We will be implementing the new sophomore curriculum this fall for our graduating class of
2000. We are now defining the upper division curriculum for this and successive classes.
24.5.1 Goals
The goal of this curriculum reform is to improve the quality of the education we provide our undergraduates in order to ensure that they have the most successful careers possible in industry, government, and academe. To accomplish this goal we concluded that we must:
• Establish a core curriculum
• Integrate the material in this core
• Make the curriculum relevant to applications
• Make it experiential, i.e., "hands-on"
• Integrate into all courses the development of communication and teamwork skills
• Provide more curricular choice at the upper division
• Implement continuous improvement procedures
We also recognized that with the increasing knowledge required to contribute effectively as an engineer, the MS/ME degree is becoming the de facto professional degree. In order to encourage the top quarter of our students to remain with us for a one-academic-year MS or ME degree, we are implementing a five-year BS/MS degree program. A previous reform of our graduate curriculum, coupled with our undergraduate curriculum reform, makes this possible for students who are able enough to take two graduate courses in their senior year.
24.5.2 Actions
To accomplish our educational reform we have:
• Delineated the transition for current and incoming students
• Developed 1997-98 through 1999-2000 teaching schedules
• Identified and initiated the faculty development and training required
• Subjected our curriculum plans to industry design reviews
• Worked closely with our students to ensure that they value the changes we are making on their behalf
We must now complete our definition of the upper division to meet our educational goals and to satisfy the AIAA's program-specific criteria for aerospace programs [6]. We also must develop assessment methods for all our educational endeavors to ensure that we may continuously improve them following Deming's implementation of the Shewhart cycle [26].
24.5.3 Environment
To be successful in this endeavor we must ensure our faculty an environment that:
• Provides time for curriculum and course development
• Provides additional time for student interaction and advising
• Adds additional resources for the support of instruction
• Provides time for faculty development (e.g., learning LabVIEW, Mathematica, MATLAB, etc.)
• Ensures equitable teaching responsibilities
• Ensures time for the further improvement of our research and graduate programs
• Rewards their participation in educational improvement
24.5.4 The New Sophomore Year
The reformation of our sophomore year derives much from MIT's sophomore core curriculum "Unified Engineering" [34]. Our undergraduates, while very able for a state university, are not of MIT's uniform high quality, so our experience may be different from MIT's. Their success with an integrated sophomore year has been long standing. And while it was initiated with small enrollments, it has withstood the test of large enrollments as well. When we committed to our new sophomore year, our enrollments were declining. Now we find that more of the College's incoming freshmen have selected aerospace than any other degree program for their studies. Such is the world of aerospace engineering. Our sophomore core curriculum comprises four five-credit-hour courses:
• Introduction to statics, structures, and materials
• Introduction to thermodynamics and aerodynamics
• Introduction to dynamics and systems
• Introduction to aerospace vehicle design and performance
Each of these teaches the fundamentals in the context of aerospace applications; they enable students to do aerospace engineering early in their studies, both to provide the context for the field as a whole and to motivate them for in-depth studies in the upper division. The courses are team taught, with two classes (not lectures) that engage the students, scheduled in the ITLL active learning classroom, and two labs in the ITLL LabPlazas each week. This is augmented by a limited amount of homework. Homework will combine analysis, design and computation. Lab work will emphasize laboratory and team skills, as well as written and oral communications (see, e.g., Ercolano [29]). Each course is taught by two faculty, so the core engages eight (or one third) of our faculty. It is our purpose that each course will
draw on the others. To ensure that our upper division courses draw upon the content of these sophomore courses, all of the faculty will participate in team teaching these courses over a six-year period.
24.5.5 Upper Division
As this paper is being prepared, we are still wrestling with the degree to which we will specify the upper division courses. In part this will be determined by the aerospace-specific criteria developed by the AIAA. These criteria allow for a division along aeronautical and astronautical lines, such as that adopted by Maryland [4], the discipline "pillar" approach developed by MIT [21], or the integrated course approach adopted by Cincinnati [57].
Senior Projects Course
Our new curriculum also includes a new comprehensive senior project implemented in a year-long, 10-credit sequence. The old curriculum included several senior design courses: a 3-credit paper study that encompassed the preliminary design of a spacecraft or aircraft, and a 6-credit, year-long Senior Design Laboratory. The new 10-credit course combines the content of these and goes one step further. In the new senior projects course, students working in integrated product development teams will execute a complete design/build/test of an aerospace-related product or device. Students will follow the international standard phased procurement cycle, which includes Phases A, B, C, D, and E, familiar to aerospace professionals. Project definition and design (Phases A and B) will span the first semester, and Phases C/D/E will span the second semester. One of the most important concepts learned during this course will be "requirements-driven design," which is a disciplined approach to efficient product development widely used in the aerospace industry, but often ignored in undergraduate design projects.
24.5.6 Resources
The College's Centennial Campaign raised some $55 million, with nearly a third of these funds in support of ITLL.
We share this unique facility with the other undergraduate degree programs in the College. It provides the basis for our curriculum reform, which we could not have contemplated without this facility. Our estimate of the cost of this reformulation of our curriculum, including the development of a senior projects design facility, is about $1 million plus the consequent educational cost of a reduction in the richness of the upper division electives currently offered. Experience elsewhere leads us to conclude that the new curriculum, when fully developed, will increase the faculty's teaching load 25-30%. This can only be accommodated by reducing the number of upper division electives we offer.
Fortunately, Colorado's Commission on Higher Education selected Aerospace Engineering Sciences as a state-wide Program of Excellence in 1997 in recognition of our research excellence and of our collaboration with the Laboratory for Atmospheric and Space Physics on one of the three USRA Student Educational Discovery Explorers, the Student Nitric Oxide Explorer. This five-year grant, augmented by a similar grant from Lockheed Martin, provides the resources needed for our curriculum revision and a Lockheed Martin / K. D. Wood Senior Projects Laboratory.
24.6 Two Warnings
There is much in what we execute in our curriculum reform that will make engineering more fun to study and harder to teach. Computers and the Internet lend themselves to this fun as well [47]. Unless this fun inspires harder and smarter work from our students, and thereby greater accomplishments on their part, we run the risk of displacing fundamentals with fun. And, unless we preserve for the faculty the time they need to be leaders in their field, we run the risk of them becoming poor teachers:
"...I have seen a teacher hold a hundred and fifty students spellbound, teaching what is wrong. His students rated him as a great teacher. In contrast, two of my own greatest teachers in universities would be rated poor teachers... Then why did people come from all over the world to study with them, including me? For the simple reason that these men had something to teach. They inspired their students to carry on further research. They were leaders of thought... Their works will remain classic for centuries."
W. Edwards Deming [26]
24.7 Conclusion
The College's new Integrated Teaching and Learning Laboratory will begin full-time operation in the fall. With this college-wide facility to aid us, and with significant support from the Colorado Commission on Higher Education and the Lockheed Martin Corporation, Aerospace Engineering Sciences has instituted a first-year projects course, a new sophomore curriculum comprising four courses that integrate theory, analysis and experiments (as well as the disciplines) to provide a systems perspective, and capstone, two-semester senior design laboratories supported by a new facility planned for the department. This undergraduate reform was paralleled by adding specificity to our graduate degree requirements and the adoption of a five-year BS/MS co-terminal degree. We are now completing our plans for the upper division curriculum. What we have implemented, or will soon implement, is responsive to what we hear from corporate and educational leaders, including two National Academy of
Engineering Chairmen, Norman Augustine and Richard Morrow. They would both ask, however, what we are doing to encourage our graduates to become more involved in the political process. Surely they are right that we need to reform our political leadership, and that engineers are not making enough of a contribution, let alone the contribution of which they are capable, to our governance. Our college has made some strides in this direction with a senior seminar entitled "Entrepreneurship, Leadership and Ethics in the Real World." Citizenship is part of the implicit curriculum adopted at MIT under Earll Murman's leadership. These are steps in the right direction, but we must all do more or our nation will be at risk.
Acknowledgement
The authors thank their colleagues, Brian Argrow, Penny Axelrad, Larry Carlson and Jacquelyn Sullivan, for their most valuable comments on this paper, and the Colorado Commission on Higher Education, the Lockheed Martin Corporation, and John R. Woodhull for their support of this endeavor.
REFERENCES
1. ABET, An Engineering Look Forward: New Decade, New Century, New Millennium, Proceedings of the ABET National Meeting, 1990.
2. ABET, Engineering Criteria 2000 (draft), Engineering Accreditation Commission of the Accreditation Board for Engineering and Technology, 1997.
3. Adler, M. J., Reforming Education, New York: Macmillan, 1988.
4. Akin, D. L., Barlow, J. B., and Schmidt, D. K., Designing an Aerospace Engineering Curriculum for the Next Century: Experiences at the University of Maryland, ASEE Annual Conference Proceedings, 1994, pp. 27-35.
5. AIA, Aerospace Facts and Figures, 1993-1994, Aerospace Industries Association of America, 1993.
6. AIAA, Program Criteria for Aerospace and Similarly Named Programs, AIAA Web Page, 1997.
7. Armstrong, J. A., What Should Industry Expect from Academic Engineering Research?, Forces Shaping the U.S. Academic Engineering Research Enterprise, Washington, D.C.: National Academy Press, 1995, pp. 59-66.
8. ASEE (Grinter report), Report of the Committee on Evaluation of Engineering Education, Journal of Engineering Education, 46, 1955, pp. 26-60.
9. ASEE, Engineering Education for a Changing World, October 1994.
10. Augustine, N. R., Socioengineering (And Augustine's Second Law Thereof), The Bridge, Fall 1994, pp. 3-14.
11. Augustine, N. R., Rebuilding Engineering Education, Chronicle of Higher Education, May 24, 1996, pp. B1-B2.
12. Bedard, A. J., Jr. and Meyer, D. G., Hands-On Engineering Homework: A New Approach to Out-of-Class Learning, ASEE Annual Conference Proceedings, 1996, Session 1626.
13. Bordogna, J., Fromm, E., and Ernst, E., Engineering Education: Innovation Through Integration, Journal of Engineering Education, 82(1), 1993, pp. 3-8.
14. Bowman, D., Lang, J. and McMasters, J., The Roundtable for Enhancing Engineering Education - An Update, AIAA Paper 97-0844, 1997.
15. Boyer, E. L., College: The Undergraduate Experience in America, New York: Harper & Row, 1987.
16. Burr, W. M. H., The Ideal Engineering Education, Engineering Education, Vol. 1, 1893, pp. 17-49.
17. Carlson, L. E. and Brandemuehl, M. J., A Living Laboratory, ASEE Annual Conference Proceedings, 1997, Session 3226.
18. Carlson, L. E., Peterson, L. D., Lund, W. S. and Schwartz, T. L., Facilitating Interdisciplinary Hands-On Learning Using LabStations, ASEE Annual Conference Proceedings, 1997, Session 2659.
19. Carlson, L. E., et al., First Year Engineering Projects: An Interdisciplinary, Hands-On Introduction to Engineering, ASEE Annual Conference Proceedings, 1995, pp. 2039-2043.
20. Carr, R., et al., Mathematical and Scientific Foundations for an Integrative Engineering Curriculum, Journal of Engineering Education, 84(2), 1995, pp. 137-150.
21. Crawley, et al., Reform of the Aeronautic and Astronautics Curriculum at MIT, Journal of Engineering Education, 83(1), 1994, pp. 47-56.
22. CUE, Preparing Our Students for the 21st Century, CUEngineering, 8, 1991, pp. 2-5.
23. CUE, Engineering Education, The Undergraduate Experience, Undergraduate Education, CUEngineering, 7, 1990, pp. 6-33.
24. CUE, A Strategic Plan for Engineering Excellence, CUEngineering, 9, 1992/1993.
25. Dally, J. W. and Zhang, G. M., A Freshman Engineering Design Course, Journal of Engineering Education, 82(2), 1993, pp. 83-91.
26. Deming, W. E., Out of the Crisis, Cambridge, Massachusetts: MIT Center for Advanced Engineering Study, 1982.
27. Deming, W. E., The New Economics, Cambridge, Massachusetts: MIT Center for Advanced Engineering Study, 1993.
28. Dertouzos, M. L., Lester, R. K., Solow, R. M. and the MIT Commission on Industrial Productivity, Made in America: Regaining the Productive Edge, Cambridge, Massachusetts: The MIT Press, 1989.
29. Ercolano, V., Learning Through Cooperation, ASEE PRISM, November 1994, pp. 26-29.
30. Ercolano, V., Designing Freshmen, ASEE PRISM, April 1996, pp. 20-25.
31. Fitzhorn, P. A. and Johnson, G. R., Is There Convergence in First Year Engineering Courses?, ASEE Annual Conference Proceedings, 1994, pp. 813-819.
32. Florman, S. C., The Civilized Engineer, New York: St. Martin's Press, 1987.
33. Halberstam, D., The Reckoning, New York: Morrow, 1986.
34. Hollister, W. M., Crawley, E. F. and Amir, A. R., Unified Engineering: A Twenty Year Experiment in Sophomore Aerospace Education at MIT, Journal of Engineering Education, 85(1), 1995, pp. 13-19.
35. Ladesic, J. G. and Hazen, D. C., A Course Correction for Engineering Education, Aerospace America, May 1995, pp. 22-27.
36. McMasters, J. H. and Lang, J. D., Enhancing Engineering and Manufacturing Education: Industry Needs, Industry Roles, ASEE Annual Conference Proceedings, 1995, Session 2502.
37. McMasters, J. and Matsch, L., Desired Attributes of an Engineering Graduate - An Industry Perspective, AIAA Paper 96-2241, 1996.
38. Monteith, L. K., Engineering Education - A Century of Opportunity, Journal of Engineering Education, 83(1), 1994, pp. 22-25.
39. Moulakis, A., Beyond Utility: A Liberal Education for a Technological Age, Columbia, Missouri: University of Missouri Press, 1994.
40. Morrow, R. M., Issues Facing Engineering Education, Journal of Engineering Education, 83(1), 1994, pp. 15-18.
41. New York Times, $200 million, Largest Gift Ever, Endows New Engineering College, June 6, 1997, p. A18.
42. NRC, The Competitive Status of the U.S. Industry, Washington, D.C.: National Academy Press, 1985.
43. NRC, The Competitive Status of the U.S. Civil Aviation Manufacturing Industry - An Overview, Washington, D.C.: National Academy Press, 1985.
44. NRC (Haddad report), Engineering Education and Practice in the United States: Foundations of our Techno-Economic Future, National Research Council, Washington, D.C.: National Academy Press, 1985.
45. NRC (Pister report), Engineering Education: Designing an Adaptive System, National Research Council, Washington, D.C.: National Academy Press, 1995.
46. Quinn, R. G., Drexel's E4 Program: A Different Professional Experience for Engineering Students and Faculty, Journal of Engineering Education, 82(4), October 1993, pp. 196-202.
47. Panitz, B., A Cyberskeptic's View, ASEE PRISM, May-June 1997, pp. 18-20.
48. Regan, T. M. and Minderman, Jr., P. A., Redesigning Freshman Engineering, Chemical Engineering Progress, 90(4), 1994, pp. S1-S4.
49. Regan, T. M. and Minderman, Jr., P. A., Engineering Design for 600 Freshman - A Scale-Up Success, Proceedings, Frontiers in Education, 1993, pp. 56-60.
50. Rogers, G. and Winkel, B. J., Integrated First-Year Curriculum in Science, Engineering and Mathematics, Nature, Evolution, and Evaluation, ASEE Conference Proceedings, 1993, pp. 186-191.
51. Seymour, E. and Hewitt, N. M., Talking about Leaving: Factors Contributing to High Attrition Rates Among Science, Mathematics, and Engineering Undergraduate Majors, Final Report to the Alfred P. Sloan Foundation, Boulder, Colorado: Bureau of Sociological Research, University of Colorado, 1997.
52. SPEE (Wickenden report), Report of the Investigation of Engineering Education, Urbana, Illinois: Society for the Promotion of Engineering Education, 1930.
53. SPEE (Hammond report I), Aims and Scope of Engineering Curricula, Journal of Engineering Education, 30(7), 1940, pp. 555-566.
54. SPEE (Hammond report II), Report of the Committee on Engineering Education After the War, Journal of Engineering Education, 34(9), 1944, pp. 589-614.
55. Sullivan, J. and Etter, D. M., The Integrated Teaching and Learning Program: Meeting the 21st Century Challenge for Engineering Education, The Interface, IEEE, 1, 1996, pp. 1-4.
56. Van Valkenberg, M. E., Preparing the Engineers of the 21st Century, Engineering Education, 78(10), 1988, p. 103.
57. Walker, B. K., Wade, J. E., Orkwis, P. D., Jeng, S.-M., Khosla, P. K., and Slater, G. L., Development of a Proposed Aerospace Engineering Curriculum for the Twenty-First Century, AIAA Paper 97-0737, 1997.
58. White, R. M., Science, Engineering and the Sorcerer's Apprentice, The Bridge, 21(1), 1991, pp. 13-20.
25 A Computer-based Textbook for Introductory Fluid Mechanics
David A. Caughey* and James A. Liggett*
*Cornell University, Ithaca, New York 14853-7501.
Frontiers of Computational Fluid Dynamics - 1998. Editors: David A. Caughey & Mohamed M. Hafez. ©1998 World Scientific
Abstract
An interactive textbook that uses the power of the personal computer to teach introductory fluid mechanics has been developed by the authors. This mode of presentation integrates hypertext navigational and search features, the presentation of videos and animations to illustrate phenomena and concepts, and computation to allow the presentation of results for a variety of parameter values and the solution of nonlinear problems without the tedium of table look-up or iteration on the part of the student. The authors' experience using an early version of the book to teach junior-level students in mechanical engineering and in civil and environmental engineering indicates that the students appreciate the increased understanding that comes with dynamic figures, the easy access to data, the ability to locate quickly definitions and specific material, and, most of all, the computational facilities.
25.1 Introduction
Fluid mechanics is an engineering science of fundamental importance to most branches of engineering, including aerospace, chemical, civil, environmental, and mechanical engineering, as well as to some aspects of electrical engineering and materials engineering. Fluid mechanics typically is taught to engineering students in curricula in the above fields, starting with one or more courses in the junior year. In spite of the fact that we spend our lives surrounded by, and immersed in, fluids, this course is considered difficult by most students, largely because of the abstract nature of the formulations of many problems in fluid mechanics, for which the typical student has not developed an intuitive feel, and the frequency with which
CAUGHEY & LIGGETT
nonlinearity is a factor in the formulation of even the most common engineering problems.
The authors have developed a textbook [3] that uses the power of the personal computer to try to address issues of visualization of phenomena, the connection between fluids phenomena and their mathematical description, and the inherent difficulty of meaningful computation. The book is designed as a stand-alone text, not as a supplement to an existing text. Although a paper version will be available to accompany the electronic form, the book is designed to be read on a computer, where it integrates hypertext features, animations and video sequences that illustrate kinematic and dynamic phenomena, graphics to present data dynamically, and computational facilities for the solution of complex problems without the tedium associated with classical (graphical or tabular iterative) methods.
The navigational features include an extensive glossary of key words for which definitions and related key words can be found in context, a history list of previously visited pages, and several types of tables of contents that allow the student to jump quickly to any particular portion of the text. Animations illustrating key concepts or complex derivations are included as "QuickTime" movies, as are video clips to illustrate various fluid flow phenomena. All numerical computation is done within MATLAB,* but data are input and results are presented through Graphical User Interfaces, so no programming is required of the student. This underlying computational power allows the incorporation of dynamic figures in which the student can change the value of one or more parameters and see immediately the effect in plotted results, and an "Active Equations" utility that allows the student to plot any meaningful dependent variable as a function of any meaningful independent variable for many equations.
*MATLAB is a versatile numerical analysis and graphics software package developed and distributed by MathWorks, Inc. of Natick, Massachusetts. An executable version of MATLAB will be distributed with the book, but will be available only to run the utilities that the authors have developed.
More than a dozen computational utilities are integrated with the text, including a units conversion program, a general utility for solving systems of nonlinear or transcendental equations, a one-dimensional energy equation solver for pipe-flow problems with friction, both superposition and boundary-integral equation methods for two-dimensional potential flow problems, spreadsheet implementations of the compressible flow functions that normally are presented in tabular form, and a number of others. These computational utilities enable the student to do enough exercises that he/she can develop some intuition for the behavior of fluid systems.
The present paper provides an overview of the features of this new form of textbook and some of the experiences of the authors using a preliminary version of the text to teach junior-level courses in mechanical and civil engineering. The navigational
Fig. 1 Typical page (or screen) of the interactive textbook.
features are described in the next section, followed by a description of the animation and video features. The computational features are described using several examples. Finally, we summarize observations on how the new textbook is likely to change the nature of the teacher/student interaction, together with our experiences using preliminary versions of the text to teach junior-level courses to students in civil and environmental engineering and mechanical engineering.
25.2 Navigation and Integration
The visual presentation of the textbook on the computer screen is designed to mimic the appearance of a conventional (paper) textbook. The screens are page-oriented (rather than scrolled), so that visual memory of where objects appear on the page is preserved, and a number of visual cues are provided to remind the student of the current location in the book. The appearance of a typical page is illustrated in Fig. 1. The basic navigational controls are located in the lower left corner of the page; the "Next" and "Previous" buttons take the student to the next, or previous, linear page in the text at the current level.* The "Back" button steps the student back through the pages most recently visited (i.e., through a "History" list), regardless of their location relative to the current page. The history list of pages visited most recently is accessible from the "History" menu, and the student can jump to any selected page on the history list.
*The sections in the text are organized into three levels: Level 1 contains overview material, Level 2 contains all Level 1 material plus that required to understand how to do engineering applications, and Level 3 contains all Level 2 material plus detailed background on complex derivations or material of a more specialized nature. The current level is indicated by the radio buttons in the upper right corner of the page. Although Levels 1 and 2 "hide" some material from the reader, that material is easily available by changing levels or by using the table of contents.
The location of the current page in the chapter is indicated by the location of the red rectangle in the slider bar near the bottom of the page; the page shown in Fig. 1 is approximately 20% of the way through Chapter 11. The numbers to the left and right of the slider bar indicate other chapters in the book (with only odd numbers displayed because of space limitations; even-numbered chapters are denoted by the bullet markers); clicking on a chapter number (or marker) takes the student to the first page of the corresponding chapter. A detailed table of contents for the current chapter can be viewed at any time by clicking on "Sections" in the upper left corner of the page, and the table of contents for any chapter can be viewed from the book table of contents, which lists all the chapters. The student also can search the entire book (or any selected subset of chapters) for any figures (including animations, videos, etc.) using the "Figure Search" item under the Table of Contents menu, which also can find only those figures containing a particular text string in their captions if desired.
The textbook provides most of the functionality associated with hypertext presentations. This includes the ability to call up short definitions of most important technical terms directly from their appearance in the text, to call up definitions of related terms from within the Glossary, and to jump to pages in the text where the term or related terms are first introduced. Glossary terms that are treated as "hot text" in this way are denoted using bold face type: the terms transition, separation point, laminar, and turbulent appearing on the page in Fig. 1 are examples. Equations and figures from earlier pages that are referred to on the current page also can be viewed in a similar way. The student also can write "Notes" on any page, or place a bookmark on any page so that it can be found (and returned to) easily using the "Bookmark" menu.
25.3 Illustrations and Data
Three types of illustrations are provided in the text: (1) static figures, (2) dynamic animations/videos, and (3) active graphs. Static figures are similar to figures in a
Fig. 2 Single frame of a video animation illustrating the distinction between path lines, streak lines, and streamlines for the ideal, incompressible flow past a circular cylinder whose radius varies with time.
conventional paper text. Simple figures are displayed full size in the left margin of the page. More complex figures, such as the one appearing on the page in Fig. 1, are represented by a thumbnail sketch in the margin carrying a "magnifying glass" icon. Clicking on the thumbnail sketch of such a figure causes it to be displayed full size on the screen for detailed study.
Dynamic animations or videos are used to illustrate dynamic features, to provide video displays of actual flow phenomena, and to provide several types of introductory information. This introductory information includes overviews of some of the chapters (appearing on the first page of the chapter) and tutorial sessions illustrating many of the computational utilities included as part of the text. These animations and videos are represented by thumbnail sketches in the left margin of the page and are identified by a "filmstrip" icon. An animation or video is launched as a "QuickTime" movie by clicking on its identifying thumbnail sketch. The "QuickTime" movie can be played forward or backward, can be halted at any frame, or can be stepped one frame at a time, in either the forward or backward direction. An example of an animated sequence illustrating the distinction between path lines, streak lines, and streamlines for the unsteady flow past a circular cylinder of time-varying radius is shown in Fig. 2. Unfortunately, in the format of this paper, only a single frame of the video can be shown.
Fig. 3 Page having an active graph. The "calculator" icon indicates that the thumbnail sketch can be clicked on to launch the active graph.
Active graphs are used to display quantitative results that depend on one or more parameters that might be changed by the student. They are identified by the active "calculator" icon, as shown on the page illustrated in Fig. 3. Clicking anywhere within the thumbnail sketch for the graph launches the active version of the graph. In most plots of quantitative data presented in the textbook, the region near a particular point on any curve can be zoomed for more accurate reading of values. A click of the mouse button on any point in the plot doubles the scale in the vicinity of that point, or a "zoom box" can be defined by clicking and dragging between opposite corners to define an area to be enlarged to the full plot area. A double click of the mouse button anywhere in the figure area zooms the plot out to the original scale.
The active graph on the page shown in Fig. 3 presents the pressure ratio and shock angle for the turning of supersonic flow through a given angle. The result depends upon both the upstream Mach number and the ratio of specific heats ($k = c_p/c_v$) of the gas. When the student changes either the Mach number or the ratio of specific heats, the plot is re-drawn automatically to correspond to the new values.
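The calculation behind an active graph of this kind can be sketched in a few lines. The example below is not the authors' MATLAB routine; it is a minimal Python illustration that assumes the standard oblique-shock (theta-beta-Mach) relation for a perfect gas and returns the weak-shock solution. The function names `turning_angle`, `weak_shock_angle`, and `pressure_ratio` are hypothetical.

```python
import math


def turning_angle(beta, mach, k):
    """Flow turning angle (rad) produced by an oblique shock at wave angle beta (rad)."""
    normal = (mach * math.sin(beta)) ** 2 - 1.0
    denom = mach ** 2 * (k + math.cos(2.0 * beta)) + 2.0
    return math.atan(2.0 * normal / (math.tan(beta) * denom))


def weak_shock_angle(theta, mach, k):
    """Weak-solution wave angle (rad) for a turning angle theta (rad)."""
    mu = math.asin(1.0 / mach)  # Mach angle: zero turning
    # Ternary search for the wave angle giving the maximum turning angle.
    lo, hi = mu, 0.5 * math.pi
    for _ in range(200):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if turning_angle(a, mach, k) < turning_angle(b, mach, k):
            lo = a
        else:
            hi = b
    beta_max = 0.5 * (lo + hi)
    if theta > turning_angle(beta_max, mach, k):
        raise ValueError("turning angle too large: the shock is detached")
    # Bisection on the weak branch, where turning increases with wave angle.
    lo, hi = mu, beta_max
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if turning_angle(mid, mach, k) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pressure_ratio(beta, mach, k):
    """Static pressure ratio across the shock, based on the normal Mach number."""
    return 1.0 + 2.0 * k / (k + 1.0) * ((mach * math.sin(beta)) ** 2 - 1.0)


if __name__ == "__main__":
    mach, k = 2.0, 1.667  # the values quoted for the active plot in the text
    for deg in (5.0, 10.0, 15.0):
        beta = weak_shock_angle(math.radians(deg), mach, k)
        print(deg, math.degrees(beta), pressure_ratio(beta, mach, k))
```

Changing `mach` or `k` and re-running the loop plays the role of re-drawing the plot; a turning angle beyond the maximum for an attached shock raises an error instead of returning a value.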
Fig. 4 An active graph presenting the pressure ratio as a function of turning angle for supersonic flow. The figure illustrates the curves for a single upstream Mach number and ratio of specific heats, but the student can enter any values for these parameters to see the corresponding plot.
The active version of this plot is shown (for upstream Mach number M = 2.0 and ratio of specific heats k = 1.667) in Fig. 4. When the pressure ratio is selected as the dependent variable (rather than the shock angle), the corresponding curve for isentropic flow also is plotted. The student can choose to have angles displayed either in radians or degrees by clicking on the appropriate radio button. Active graphs also are used to display fluid properties, such as coefficient of viscosity or thermal conductivity, and can be launched by selecting the relevant fluid property from the "Data" menu. Figure 5 shows the dependence of coefficient of thermal conductivity on temperature for several common liquids. As a final example, Fig. 6 shows the lift, drag, and pitching moment characteristics of several airfoils tested by the National Advisory Committee for Aeronautics (NACA), the forerunner of NASA. Once the active graph has been launched, the student can display the lift, drag, and pitching moment characteristics of several different airfoil shapes.
Fig. 5 An active graph presenting the dependence of coefficient of thermal conductivity on temperature for several common liquids. The student can read values by clicking on the desired curve or by entering the temperature value in the corresponding data box and reading the resulting thermal conductivity once a particular curve has been selected. The curves are computed using semi-empirical formulas and experimental values from Reid, Prausnitz & Poling [5].
25.4 Computation and Utilities
Computational utilities to solve a variety of engineering problems that may require the solution of nonlinear equations (or systems of nonlinear equations) are integrated into the textbook. Historically, these problems were solved by graphical methods, by iteration on tabulated functions requiring repeated interpolation, or by requiring the student to implement numerical methods. The interactive textbook provides Graphical User Interfaces (GUIs) to a number of routines written in MATLAB that allow the student to solve a greater variety and number of such problems in the time, or with the effort, once required to solve a single problem. Such GUI-based utilities are provided for solving:
• Compressible flows with area change for isentropic flows and flows with normal shock waves;
• Potential flow problems within arbitrary two-dimensional geometries for any mix of Neumann and Dirichlet boundary conditions;
• Dimensional analysis problems to find dimensionless groups of variables for any set of dimensional input variables;
Fig. 6 Lift, drag, and moment coefficients for a variety of NACA airfoil shapes. (a) Lift and moment coefficients are plotted as functions of the angle of attack; (b) the drag coefficient is plotted as a function of the lift coefficient. After Abbott & von Doenhoff [1].
• Solutions of steady-state pipe network flows;
• Compressible flows in constant area ducts with friction;
• Numerical integration of functions;
• Incompressible pipe flow problems to determine pipe sizes, flow rates, head loss, etc., for a variety of formulations of the frictional losses for turbulent flow;
• One- and two-dimensional plotting of functions and data with integration to determine volumes under surface plots;
• Planar and axisymmetric incompressible potential flow problems by superposition of elementary solutions (sources, sinks, vortices, doublets, and uniform streams);
• Solutions of open channel profile problems;
• Compressible flows in constant area ducts with heat addition;
• Any system of nonlinear or transcendental equations;
• Properties in the standard atmosphere;
• Units conversion problems from any system to virtually any other;
• Waterhammer problems in elastic pipes.
Here, we will provide brief descriptions of three of the utilities to give an indication of the features.
Fig. 7 Moody chart for the friction factor in fully-developed, turbulent pipe flow. The friction factor is a function of the nondimensional pipe roughness and the Reynolds number.
25.4.1 Pipe Flow
The head loss $h_f$ due to friction for fully-developed turbulent flow in a length $L$ of circular pipe having diameter $d$ is given by the relation
$$h_f = f \, \frac{L}{d} \, \frac{V^2}{2g}, \qquad (25.4.1)$$
where $V$ is the mean flow velocity, $g$ is the acceleration of gravity, and $f$ is the (Darcy-Weisbach) friction factor. The friction factor is a function only of the nondimensional roughness $\varepsilon/d$ of the pipe surface and the Reynolds number $Re_d = Vd/\nu$, where $\nu$ is the kinematic viscosity of the fluid. The formula due to Colebrook [2] provides an interpolation between the friction laws for smooth and rough pipes:
$$\frac{1}{f^{1/2}} = -2.0 \log_{10}\left( \frac{\varepsilon/d}{3.7} + \frac{2.51}{Re_d \, f^{1/2}} \right). \qquad (25.4.2)$$
A plot of the friction factor as a function of Reynolds number, usually called a Moody Chart [4], is shown in Fig. 7.
Use of the Moody Chart is relatively straightforward for some problems. For example, the head loss for a given flow rate through a pipe of specified length and diameter can be found directly. For this case, the Reynolds number can be computed directly from the given quantities, and the friction factor determined either
(by iteration) from Eq. (25.4.2) or directly from the Moody Chart. The friction factor can then be used to determine the viscous head loss using Eq. (25.4.1).
On the other hand, some problems are much more difficult. For example, finding the diameter for a specified head loss $h_f$ and volumetric flow rate $Q$ through a pipe of given length $L$ requires the solution of simultaneous, nonlinear equations. In this case, the Reynolds number must be determined as part of the solution, since neither the average flow velocity $V$ nor the pipe diameter $d$ is known a priori. Equations (25.4.1) and (25.4.2), recast in terms of the volumetric flow rate $Q$, must be solved simultaneously to determine the velocity $V$ and diameter $d$ corresponding to the given $h_f$. Since no general techniques are available to solve such systems of nonlinear equations, one usually resorts to simple iterative schemes in which one makes an initial estimate for the pipe diameter, then computes the Reynolds number according to
$$Re_d = \frac{4Q}{\pi d \nu}. \qquad (25.4.3)$$
The Moody Chart (or, equivalently, Eq. (25.4.2)) can then be used to compute the corresponding friction factor $f$, and a new approximation for the pipe diameter can be computed from the head loss by solving the equation
$$h_f = f \, \frac{L}{d} \, \frac{1}{2g} \left( \frac{4Q}{\pi d^2} \right)^2$$
for the diameter
$$d = \left( \frac{8f}{\pi^2} \, \frac{L Q^2}{g h_f} \right)^{1/5}. \qquad (25.4.4)$$
If the flow contains entrance or exit losses, minor losses, or a pump or turbine, the equations become even more complex. This process is quite tedious if the Moody Chart or Eq. (25.4.2) must be used to determine the friction factor for the Reynolds number at each iteration, but it can be automated quite easily on the computer. The solution to this problem, and many others, can be found using the GUI-based utility PipeFlow. PipeFlow solves the one-dimensional mechanical energy equation, the Darcy-Weisbach equation (25.4.1), and the Colebrook equation (25.4.2), either singly or in combination, for the friction factor $f$ plus another variable, allowing for various boundary conditions and minor losses. It can choose roughness from a pipe type (e.g., concrete), look up water properties (or allow specification of the properties of other fluids), and find real (instead of nominal) pipe dimensions from a table of commercial pipe sizes. The main screen of PipeFlow used to specify these problems is shown in Fig. 8. The screen shown in Fig. 8 illustrates the solution of one example of the type of problem just described. The solution is displayed for the pipe diameter required for a fully-developed turbulent flow of $Q = 1.0$ m$^3$/s in a pipe having length $L = 100$ m, driven by a pressure difference of $\Delta p = 10{,}000$ Pa.
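The iterative scheme described above is short enough to sketch directly. The following Python fragment is only an illustration of that scheme, not PipeFlow itself: it ignores minor losses, pumps, and commercial pipe sizes, and the function names, the fluid properties, and the roughness value are assumptions made for the example.

```python
import math

G = 9.81  # acceleration of gravity, m/s^2


def colebrook(reynolds, rel_roughness):
    """Friction factor from Eq. (25.4.2) by fixed-point iteration on x = f**-0.5."""
    x = 7.0  # starting guess, corresponds to f of about 0.02
    x_new = x
    for _ in range(200):
        x_new = -2.0 * math.log10(rel_roughness / 3.7 + 2.51 * x / reynolds)
        if abs(x_new - x) < 1e-13:
            break
        x = x_new
    return 1.0 / x_new ** 2


def pipe_diameter(head_loss, flow_rate, length, nu, roughness, d_guess=1.0):
    """Diameter giving the specified frictional head loss; returns (d, f, Re)."""
    d = d_guess
    for _ in range(100):
        reynolds = 4.0 * flow_rate / (math.pi * d * nu)      # Eq. (25.4.3)
        f = colebrook(reynolds, roughness / d)               # Eq. (25.4.2)
        d_new = (8.0 * f * length * flow_rate ** 2
                 / (math.pi ** 2 * G * head_loss)) ** 0.2    # Eq. (25.4.4)
        if abs(d_new - d) <= 1e-10 * d_new:
            return d_new, f, reynolds
        d = d_new
    raise RuntimeError("diameter iteration did not converge")


if __name__ == "__main__":
    # Flow rate, length, and pressure difference are those quoted in the text.
    # Density, viscosity, and roughness are assumed values (water, steel pipe).
    rho, nu, roughness = 998.0, 1.0e-6, 4.6e-5
    head_loss = 10000.0 / (rho * G)
    d, f, reynolds = pipe_diameter(head_loss, 1.0, 100.0, nu, roughness)
    print(d, f, reynolds)
```

Because the friction factor varies slowly with diameter and Eq. (25.4.4) involves a fifth root, the outer loop typically settles in a handful of passes, which is why the hand calculation is tedious rather than difficult.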
Fig. 8 Main panel for PipeFlow, a GUI-based utility for solving a variety of fluids problems involving turbulent flow losses in pipes.
25.4.2 Compressible Flow with Area Change
Compressible flow in ducts of varying cross-sectional area is another class of problems for which an inexpensive personal computer makes otherwise tedious solutions practical. The Mach number $M$ for isentropic flow in a nozzle is related to the cross-sectional area $A$ by
$$\frac{A}{A^*} = \frac{1}{M} \left( \frac{2 + (k-1)M^2}{k+1} \right)^{\frac{k+1}{2(k-1)}}, \qquad (25.4.5)$$
where $k$ is the ratio of specific heats and $A^*$ is the effective sonic area for the nozzle (which may, or may not, correspond to the minimum area of the nozzle under consideration, depending upon the operating conditions); see, e.g., Shapiro [6]. Once the Mach number is known, other flow variables can be determined directly from the isentropic relations; e.g., the pressure $p$ is given by
$$\frac{p}{p_0} = \left( 1 + \frac{k-1}{2} M^2 \right)^{-\frac{k}{k-1}}, \qquad (25.4.6)$$
where $p_0$ is the isentropic stagnation pressure.
For completely isentropic flow, the problem of determining the flow properties as a function of nozzle cross-sectional area is simply one of solving Eq. (25.4.5) or, equivalently, interpolating tabulated values of this function, to determine the Mach
number $M$, and then using formulas such as Eq. (25.4.6) to determine the flow properties from the Mach number. If shock waves are present in the nozzle, the problem becomes more complex. A typical problem, usually too tedious even to assign as a homework exercise, is to determine the shock location in a nozzle of given geometry (i.e., for a given $A(x)$) for a given exit pressure.* This problem, like many of the turbulent pipe-flow problems described in the previous section, requires iteration. For a given shock location, the jump in pressure can be determined from the normal shock relations,
$$\frac{p_2}{p_1} = 1 + \frac{2k}{k+1} \left( M_1^2 - 1 \right), \qquad (25.4.7)$$
where the subscripts $(\ )_1$ and $(\ )_2$ refer to the states immediately upstream and downstream of the shock, respectively. The Mach number immediately downstream of the shock is given by
$$M_2^2 = \frac{k + 1 + (k-1)\left( M_1^2 - 1 \right)}{k + 1 + 2k\left( M_1^2 - 1 \right)}, \qquad (25.4.8)$$
and this can be used with Eq. (25.4.5) to determine the new effective $A^*$, while
$$\frac{p_{0_2}}{p_{0_1}} = \left( \frac{(k+1)M_1^2}{2 + (k-1)M_1^2} \right)^{\frac{k}{k-1}} \left( \frac{k+1}{2kM_1^2 - (k-1)} \right)^{\frac{1}{k-1}} \qquad (25.4.9)$$
can be used to determine the new stagnation pressure. Equation (25.4.9), along with the area ratio from the shock to the exit, can be used with the formulas above to compute the pressure at the nozzle exit. Comparing this computed pressure with the desired exit pressure suggests whether the shock needs to be moved upstream or downstream to more nearly match the required exit pressure.
The GUI-based utility AreaFlow, which provides easy access to the complete set of formulas for isentropic flow in ducts of varying cross-sectional area and the normal shock relations, makes sequences of calculations such as that described above less tedious. The data are entered and presented in the form of a spreadsheet that is updated whenever a value is entered or changed so that it always is internally consistent. The AreaFlow display panel is shown in Fig. 9, showing the values computed to determine the shock strength required to produce an exit pressure to upstream stagnation pressure ratio of 0.5 when the exit area is twice the nozzle throat area. This solution was computed in about half a dozen iterations and required less than 3 minutes of user interaction time from start to finish.
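The manual iteration that AreaFlow supports can also be automated. The sketch below is a hypothetical Python version, not the authors' spreadsheet: it uses the isentropic and normal-shock relations of Eqs. (25.4.5), (25.4.6), (25.4.8), and (25.4.9) and replaces the user's trial-and-error with bisection on the area at which the shock stands (the position then follows from the known $A(x)$). All function names are assumptions, and the inputs in the usage lines are illustrative rather than the values behind Fig. 9.

```python
import math


def area_ratio(m, k):
    """A/A* for isentropic flow, Eq. (25.4.5)."""
    base = (2.0 + (k - 1.0) * m * m) / (k + 1.0)
    return base ** ((k + 1.0) / (2.0 * (k - 1.0))) / m


def mach_from_area(ratio, k, supersonic):
    """Invert Eq. (25.4.5) by bisection on the subsonic or supersonic branch."""
    lo, hi = (1.0, 50.0) if supersonic else (1e-8, 1.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # A/A* grows with M when supersonic and shrinks with M when subsonic.
        if (area_ratio(mid, k) < ratio) == supersonic:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pressure_ratio(m, k):
    """p/p0 for isentropic flow, Eq. (25.4.6)."""
    return (1.0 + 0.5 * (k - 1.0) * m * m) ** (-k / (k - 1.0))


def downstream_mach(m1, k):
    """Mach number just behind a normal shock, Eq. (25.4.8)."""
    s = m1 * m1 - 1.0
    return math.sqrt((k + 1.0 + (k - 1.0) * s) / (k + 1.0 + 2.0 * k * s))


def stagnation_ratio(m1, k):
    """Stagnation pressure ratio p02/p01 across a normal shock, Eq. (25.4.9)."""
    a = ((k + 1.0) * m1 * m1 / (2.0 + (k - 1.0) * m1 * m1)) ** (k / (k - 1.0))
    b = ((k + 1.0) / (2.0 * k * m1 * m1 - (k - 1.0))) ** (1.0 / (k - 1.0))
    return a * b


def exit_pressure(shock_area, exit_area, k):
    """p_exit/p01 for a normal shock where A/A_throat equals shock_area."""
    m1 = mach_from_area(shock_area, k, supersonic=True)
    p0_ratio = stagnation_ratio(m1, k)
    # The effective sonic area behind the shock grows by the factor p01/p02.
    m_exit = mach_from_area(exit_area * p0_ratio, k, supersonic=False)
    return p0_ratio * pressure_ratio(m_exit, k)


def shock_area_for_exit_pressure(target, exit_area, k):
    """Area ratio A/A_throat at the shock that gives p_exit/p01 = target."""
    lo, hi = 1.0, exit_area
    if not exit_pressure(hi, exit_area, k) <= target <= exit_pressure(lo, exit_area, k):
        raise ValueError("no normal shock inside the nozzle gives this exit pressure")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        # Moving the shock downstream (larger area) lowers the exit pressure.
        if exit_pressure(mid, exit_area, k) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    k, exit_area, target = 1.4, 2.0, 0.7  # illustrative inputs only
    area = shock_area_for_exit_pressure(target, exit_area, k)
    m1 = mach_from_area(area, k, supersonic=True)
    print(area, m1, downstream_mach(m1, k))
```

Bisection is used throughout because each relation is monotonic on the branch of interest, which keeps the sketch free of derivative formulas at the cost of more function evaluations than a Newton iteration would need.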
*An analogous problem exists in open channel flow where the location of a hydraulic jump is to be determined for specified boundary conditions and channel geometry. The channel profile solver in combination with the transcendental equation solver provides an easy method of solution for that problem.
Fig. 9 Data panel for AreaFlow, a GUI-based utility providing access to the equations of compressible flow.
25.4.3 Incompressible Potential Flow
Our final example of the computational utilities integrated into the textbook is PotFlow, which plots selected streamlines, lines of constant velocity potential, and lines of constant pressure for two-dimensional potential flows (either planar or axisymmetric), generated by the superposition of any number of sources, sinks, vortices, doublets, and uniform streams. The main control panel for PotFlow is shown in Fig. 10. Elements are defined and added to the flow using the controls in the upper right corner of the panel, the properties of existing elements can be altered using the controls in the lower right corner of the panel, and the properties of the current elements are displayed in the upper left portion of the panel. The controls in the lower left portion of the panel are used to clear the data space, to read element data from a file, and to select the form of results to be plotted.
Figure 11 shows a plot, generated by PotFlow, of the streamlines and selected contours of constant pressure coefficient for the flow past a discrete vortex representation of a flat plate airfoil at 20 degrees angle of attack. The strengths of 32 vortices, spaced uniformly along the flat plate, were computed using thin-airfoil theory for the case of a flat plate airfoil (a "Level 3" topic in the textbook).
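The superposition idea behind PotFlow can be illustrated with a small sketch for the planar case. This is not the PotFlow implementation: the element constructors below (`uniform`, `source`, `vortex`, `doublet`) and the `superpose` helper are hypothetical names, and the sign conventions (counterclockwise vortex for positive circulation, doublet axis pointing upstream) are choices made for this example. Each element contributes its stream function, and the combined flow is simply their sum.

```python
import math


def uniform(speed, angle=0.0):
    """Uniform stream of the given speed, inclined at angle (rad) to the x axis."""
    return lambda x, y: speed * (y * math.cos(angle) - x * math.sin(angle))


def source(strength, x0=0.0, y0=0.0):
    """Source (strength > 0) or sink (strength < 0); atan2 places the branch cut."""
    return lambda x, y: strength / (2.0 * math.pi) * math.atan2(y - y0, x - x0)


def vortex(circulation, x0=0.0, y0=0.0):
    """Point vortex, counterclockwise for positive circulation."""
    return lambda x, y: (-circulation / (4.0 * math.pi)
                         * math.log((x - x0) ** 2 + (y - y0) ** 2))


def doublet(strength, x0=0.0, y0=0.0):
    """Doublet whose axis points upstream for a stream in the +x direction."""
    return lambda x, y: (-strength / (2.0 * math.pi) * (y - y0)
                         / ((x - x0) ** 2 + (y - y0) ** 2))


def superpose(*elements):
    """Stream function of the combined flow: the sum of the element contributions."""
    return lambda x, y: sum(element(x, y) for element in elements)


if __name__ == "__main__":
    # Uniform stream plus doublet gives flow past a circle of the chosen radius;
    # adding a vortex at the center keeps the circle a streamline.
    speed, radius = 1.0, 0.5
    psi = superpose(uniform(speed),
                    doublet(2.0 * math.pi * speed * radius ** 2),
                    vortex(1.0))
    for deg in (0, 45, 90, 135, 180):
        t = math.radians(deg)
        print(deg, psi(radius * math.cos(t), radius * math.sin(t)))
```

Streamlines are contours of the returned function, so evaluating it on a grid and passing the result to any contour plotter reproduces the kind of picture shown in Fig. 11; the value printed on the circle should be the same at every angle.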
Fig. 10 Main control panel for PotFlow, a GUI-based utility for studying two-dimensional potential flows generated by superposition of sources, vortices, doublets, and uniform streams.
Fig. 11 PotFlow graphics showing streamlines and contours of constant pressure for a discrete vortex approximation to the flow past a flat plate airfoil at 20 degrees angle of attack.
25.5 Teaching Experiences
Both authors have used preliminary versions of the interactive textbook to teach parts, or all, of introductory courses in fluid mechanics to junior-level students in civil and environmental engineering and mechanical engineering. The students generally are enthusiastic about the graphics and computational utilities provided by the new medium, but are less enthusiastic about reading large amounts of text and equations from the screen. This is not really surprising, since computation and video are things that the computer is good at. These dynamic interactions are engaging and encourage active learning. We all are reluctant to spend very much time reading text from a computer screen, given an alternative. This experience has led us to provide a paper version of the text to complement the electronic form, but we still feel the integration of the graphical presentations and computation with the textual material is important.
We have experimented with several ways to use the textbook in lecture and recitation sections. In lectures, we have found that it is most effective to have computer projection equipment available, and to use the features of the textbook in a limited way to illustrate topics, rather than to base the entire lecture on projected pages, illustrations, and utilities. Increasingly, computer projection equipment is available in lecture halls, and the ease with which it is possible, in most cases, to switch back and forth between the conventional blackboard mode, video and animation sequences, and computation bodes well for the increased incorporation of tools such as this into lecture presentations. An instructor can use bookmarks to locate animations, movies, graphs, and active equations easily. The student can use these same features on his/her own machine at home for further study and the solution of exercises.
We have found that it is effective to hold at least some recitation sections, especially early in the course, in a facility in which each student has access to a computer. In this way the students can get early experience with the new medium in the presence of a teacher or teaching assistant who is experienced in its use. (Instruction in the use of the text may become less important as the textbook becomes more robust and students become more comfortable with the new medium.) The "studio" environment in which students are encouraged to explore exercises using the various computational utilities under the watchful eye of a roving instructor also has proven effective.
Acknowledgements
The textbook was written using MacroMedia AuthorWare and will be distributed on a CD-ROM that will run under the Macintosh, Windows 3.1, Windows 95, and Windows NT operating systems. Video segments and animations are run as QuickTime movies, and all numerical computation is done within MATLAB. The authors would like to take this opportunity to thank David Dresia, Managing Director of Publications for the American Society of Civil Engineers, for his continued support of this project. We also express our profound gratitude to the personnel of the MultiMedia Courseware Studio at Cornell University for their contributions. In particular, the project could not have been completed without the dedication and expertise of Rob Levine, Kate Mink, Melanie Swain, Mike Tolomeo, Kenny Unice, Dave Wickstrom, John Wolf, and the numerous undergraduate students at Cornell University who have made significant contributions to the success of this project. Particular gratitude is expressed to Kate Mink and John Wolf, who have been associated with the project since its inception, and whose extraordinary efforts largely are responsible for its timely completion.
REFERENCES
1. I. H. Abbott & A. E. von Doenhoff, Theory of Wing Sections, Dover, New York, 1959.
2. C. F. Colebrook, Turbulent Flow in Pipes, with Particular Reference to the Transition between the Smooth and Rough Pipe Laws, J. Inst. Civ. Eng. London, Vol. 11, pp. 133-156, 1938-39.
3. J. A. Liggett & D. A. Caughey, Fluid Mechanics: An Interactive Text, American Society of Civil Engineers, 1998.
4. L. F. Moody, Friction Factors for Pipe Flow, ASME Trans., Vol. 66, pp. 671-684, 1944.
5. R. C. Reid, J. M. Prausnitz, & B. E. Poling, The Properties of Gases and Liquids, McGraw-Hill, New York, 1987.
6. A. H. Shapiro, Dynamics and Thermodynamics of Compressible Fluid Flow, Vol. 1, Ronald, New York, 1953.
Alphabetical List of Authors
P. J. Brookes Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
David A. Caughey Sibley School of Mechanical and Aerospace Engineering Cornell University Ithaca, New York 14853-7501
Chapter 1, 25
Jean-Jacques Chattot Department of Mechanical and Aeronautical Engineering University of California at Davis Davis, California 95616
Chapter 11
Connie K. Chen Courant Institute of Mathematical Sciences New York University New York, NY 10012
Chapter 6
H. K. Cheng Department of Aerospace Engineering University of Southern California Los Angeles, California 90089-1191
Chapter 5
Julian D. Cole Department of Mathematical Sciences Rensselaer Polytechnic Institute Troy, New York 12180-3590
Chapter 2
L. Pamela Cook Department Mathematical Sciences University of Delaware Newark, DE 19716
Chapter 4
J. Dacles-Mariani University of California at Davis Davis, California
Chapter 18
David L. Darmofal Department of Aerospace Engineering Texas A & M University College Station, TX 77843-3141
Chapter 13
Mark Drela Department of Aeronautics and Astronautics M.I.T. Cambridge, MA 02139
Chapter 19
Nathalie Duquesne Department of Aeronautics KTH Royal Institute of Technology S-10044 Stockholm Sweden
Chapter 17
D. Scott Eberhardt University of Washington Box 352400 Seattle, Washington 98195-2400
Chapter 12
Kozo Fujii The Institute of Space and Astronautical Science Sagamihara, Kanagawa 229 Japan
Chapter 21
Cyril Gacherieu Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique F-31057 Toulouse France
Chapter 17
Paul R. Garabedian Courant Institute of Mathematical Sciences New York University New York, NY 10012
Chapter 6
Michael B. Giles Rolls-Royce Reader in CFD Oxford University Computing Laboratory Oxford, U.K.
Chapter 9
Mohamed M. Hafez Department of Mechanical and Aeronautical Engineering University of California at Davis Davis, California 95616
Chapter 1, 5
O. Hassan Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
Wen-Huei Jou Boeing Commercial Airplane Group P. O. Box 3707 MS 67-LL Seattle, Washington 98124-2207
Chapter 20
C. Kiris MCAT, Inc. Mountain View, California
Chapter 18
D. Kwak Advanced Computational Methods Branch NASA Ames Research Center Moffett Field, California 94035
Chapter 18
James A. Liggett School of Civil and Environmental Engineering Cornell University Ithaca, New York 14853-7501
Chapter 25
Ping Liu Engineering Development Center Chengdu Aircraft Industrial Corp. Chengdu, China
Chapter 7
Shijun Luo Department of Aircraft Engineering Northwestern Polytechnical University Xi'an, China
Chapter 7
Robert W. MacCormack Department of Aeronautics and Astronautics Stanford University Stanford, CA 94305-4035
Chapter 10
M. T. Manzari Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
K. Morgan Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
Elsa Newman Department of Mathematics Marymount University Arlington, VA 22207
Chapter 4
Takanobu Ogawa Shimizu Corporation Etchujima 3-4-17 Koto-Ku, Japan
Chapter 21
Lee D. Peterson Aerospace Engineering Sciences University of Colorado Boulder, Colorado 80309-0429
Chapter 24
Pradeep Raj Lockheed Martin Aeronautical Systems Marietta, Georgia 30063-0685
Chapter 23
Arthur Rizzi Department of Aeronautics KTH Royal Institute of Technology S-10044 Stockholm Sweden
Chapter 17
Thomas W. Roberts Aerodynamic and Acoustic Methods Branch NASA Langley Research Center Hampton, VA 23681-0001
Chapter 14
S. Rogers Advanced Computational Methods Branch NASA Ames Research Center Moffett Field, California 94035
Chapter 18
Paul E. Rubbert The Boeing Company P. O. Box 3707 Seattle, Washington 98124-2207
Chapter 22
Oleg S. Ryzhov Department of Mathematical Sciences Rensselaer Polytechnic Institute Troy, New York 12180-3590
Chapter 8
R. Said Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
Donald W. Schwendeman Department of Mathematical Sciences Rensselaer Polytechnic Institute Troy, New York 12180-3590
Chapter 2
A. Richard Seebass Aerospace Engineering Sciences University of Colorado Boulder, Colorado 80309-0429
Chapter 24
Huili Shen Department of Aeroengine Engineering Northwestern Polytechnical University Xi'an, China
Chapter 7
David Sidilkover ICASE NASA Langley Research Center Hampton, VA 23681-0001
Chapter 14
Helmut Sobieczky DLR German Aerospace Research Establishment Bunsenstr. 10 D-37073 Göttingen
Chapter 3
R. C. Swanson Aerodynamic and Acoustic Methods Branch NASA Langley Research Center Hampton, VA 23681-0001
Chapter 14
E. D. Terent'ev Computing Center Russian Academy of Sciences 40 Vavilov Street 117333 Moscow, Russian Federation
Chapter 8
Loic Tourrette Aerospatiale Avions F-31060 Toulouse France
Chapter 17
Susan A. Triantafillou Department of Mathematical Sciences Rensselaer Polytechnic Institute Troy, New York 12180-3590
Chapter 2
Bram van Leer Department of Aerospace Engineering University of Michigan Ann Arbor, MI 48109-2140
Chapter 13
Jan Vos Hydraulic Machines & Fluid Mechanics Institute (IMHEF) Swiss Federal Institute of Technology (EPFL) CH-1015 Lausanne Switzerland
Chapter 17
N. P. Weatherill Department of Civil Engineering University of Wales Swansea SA2 8PP, UK
Chapter 15
Carlos Weber Centre Europeen de Recherche et de Formation Avancee en Calcul Scientifique F-31057 Toulouse France
Chapter 17
Pratomo Wibowo University of Washington Box 352400 Seattle, Washington 98195-2400
Chapter 12
Laurence B. Wigton Boeing Commercial Airplane Group P. O. Box 3707 Seattle, Washington 98124-2207
Chapter 16
S. Yoon Advanced Computational Methods Branch NASA Ames Research Center Moffett Field, California 94035
Chapter 18
Anders Ytterstrom Department of Aeronautics KTH Royal Institute of Technology S-10044 Stockholm Sweden
Chapter 17