Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.fw001
Minicomputers and Large Scale Computations
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Minicomputers and Large Scale Computations Peter Lykos, EDITOR
Illinois Institute of Technology
A symposium sponsored by the ACS Division of Computers in Chemistry at the Second Joint Conference of the Chemical Institute of Canada and the American Chemical Society, Montreal, Canada, June 1, 1977.
ACS SYMPOSIUM SERIES 57
AMERICAN CHEMICAL SOCIETY
WASHINGTON, D.C. 1977
Library of Congress Data
Minicomputers and large scale computations. (ACS symposium series; 57 ISSN 0097-6156) Includes bibliographical references and index. 1. Chemistry--Data processing--Congresses. 2. Minicomputers--Congresses. I. Lykos, Peter George, 1927. II. American Chemical Society. Division of Computers in Chemistry. III. Joint Conference of the Chemical Institute of Canada and the American Chemical Society, 2nd, Montreal, Quebec, 1977. IV. Series: American Chemical Society. ACS symposium series; 57.
QD39.3.E46M56 542'.8 77-15932
ISBN 0-8412-0387-3 ACSMC8 57 1-239
Copyright © 1977 American Chemical Society
All Rights Reserved. No part of this book may be reproduced or transmitted in any form or by any means, graphic, electronic, including photocopying, recording, taping, or information storage and retrieval systems, without written permission from the American Chemical Society. The citation of trade names and/or names of manufacturers in this publication is not to be construed as an endorsement or as approval by ACS of the commercial products or services referenced herein; nor should the mere reference herein to any drawing, specification, chemical process, or other data be regarded as a license or as a conveyance of any right or permission, to the holder, reader, or any other person or corporation, to manufacture, reproduce, use, or sell any patented invention or copyrighted work that may in any way be related thereto.
PRINTED IN THE UNITED STATES OF AMERICA
Society Library
1155 16th St. N.W., Washington, D.C. 20036
ACS Symposium Series
Robert F. Gould, Editor
Advisory Board
Donald G. Crosby, Jeremiah P. Freeman, E. Desmond Goddard, Robert A. Hofstader, John L. Margrave, Nina I. McClelland, John B. Pfeiffer, Joseph V. Rodricks, Alan C. Sartorelli, Raymond B. Seymour, Roy L. Whistler, Aaron Wold
FOREWORD The ACS SYMPOSIUM SERIES was founded in 1974 to provide a medium for publishing symposia quickly in book form. The format of the SERIES parallels that of the continuing ADVANCES IN CHEMISTRY SERIES except that in order to save time the papers are not typeset but are reproduced as they are submitted by the authors in camera-ready form. As a further means of saving time, the papers are not edited or reviewed except by the symposium chairman, who becomes editor of the book. Papers published in the ACS SYMPOSIUM SERIES are original contributions not published elsewhere in whole or major part and include reports of research as well as reviews since symposia may embrace both types of presentation.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.pr001
PREFACE

This symposium on "Minicomputers and Large Scale Computations" brings together a representative set of reports of concrete experiences, including cost analyses, in which computer users have turned to so-called minicomputers to handle computational problems which just a few years ago could have been handled only by large scale scientific computers. This book should be viewed as a snapshot of a dynamic situation changing fairly rapidly in time. The chapters have been arranged in sequence from the smallest instrument (a hand-held programmable calculator) to the largest (a dual large scale, or super, minicomputer).

Several superposed trends are operating, and it is important to sort them out so that one can intelligently analyze how best to approach a particular set of computational needs. In its first manifestation with widespread use (the DEC PDP-8) the minicomputer was physically small (made to fit in a standard instrument rack), slow in cycle time, and small in main memory size; it had a short list of machine instructions, a short word length, minimal software support, and virtually no peripherals except a teletypewriter. Its target users were experimenters interested in automated data collection and reduction and those concerned with real-time control applications, sometimes in nonfriendly physical environments.

Gradually the minicomputer evolved in several directions, including toward the large scale or "super" minicomputer typified by the last four chapters (13-16). The super minicomputer class includes machines with 16-, 24-, and 32-bit word-based architectures, fast floating-point arithmetic (achieved in different ways), virtual memories, a full range of peripheral devices (mass storage, printers, card readers, etc.), and sophisticated multi-user supporting operating systems, compilers, interpreters, and data-base management systems.
Indeed the PRIME 400 even has a super-speed small (or cache) component of the main fast memory similar to the IBM 370/195. Thus the full power of the superscientific computer of 10 years ago is now available at an order of magnitude less in the costs of purchase, maintenance, and operation. In addition, the space and air conditioning requirements have been reduced to those of an ordinary small research laboratory.

Even the modern laboratory minicomputer, similar in many respects to the venerable PDP-8, is being pressed into service as a scientific calculator. Chapters 2, 3, 6, 7, 8, 10, and 11 present examples in which the computer program was reorganized compared with the way in which it would have been done for a large scale scientific computer. The small main memory forced the users to (a) make more, and more clever, use of disk storage where available, (b) search for non-conventional algorithms, in some cases more specifically problem-oriented, and (c) in two cases minimize the need for floating-point operations by scaling, and by table searching and interpolation. Of special concern here, because of the long (wall clock) running times, is the finite probability of machine failure.

Chapters 4, 5, 6, and 16 explore the trade-offs involved in using minicomputers for portions of the calculations and conventional large scale computers for the remainder. Indeed Chapter 4 introduces APL (a mathematically oriented language not so widely used by chemists as the ubiquitous FORTRAN) and also a feature of the IBM 5100 APL processor which permits the unsophisticated (i.e., higher level language) programmer to build in details of communication protocol easily where optimal distribution of computing tasks among several processors is sought.

Another trend is toward the design of special-purpose processors intended to be enslaved to conventional processors. The array processor AP-120B, as an add-on to the Harris 6024/4 at the National Astronomy and Ionosphere Center, has handled highly organized floating-point operations at 12.4 megaflops (millions of floating-point operations per second), which has been compared with the 5-megaflop CDC 7600 and the 15-megaflop ILLIAC IV (see Wolin, L., "Procedure Evaluates Computers for Scientific Applications," Computer Design (1976) 15, 93 for a more detailed comparison of minicomputers and current large scale scientific systems). Chapters 5, 8, 10, and 12 use specialized hardware to hardware-tailor a computer system to the requirements of a specific class of problems.
The quantum of computational power is shrinking in physical size and cost to the point where the choice, as well as the computer, is in the hands of the individual user. The microprocessor has burst upon the scene. The mushrooming of over 400 retail computer hobby outlets has been sparked by the large scale integrated circuit (LSI) computer-on-a-chip and the growing personal computing market. The hand-held electronic calculator has decreased in physical size to the limit that conventional computer input-output can tolerate, namely the resolving power of the human eye and the physical size of human fingers. Chapter 1 illustrates attache-case-portable programmable computers with off-line storage and built-in printer capability. However, a parallel limiting process also becomes evident, i.e., the decreasing level of software support and the need for programs in machine language.

For the conventional computer (the serial processor) the increasing sophistication of LSI chip circuitry and the decreasing cost per bit of corresponding large scale non-electromechanical mass memory make it more and more likely that the large-scale conventional computer system of today will be replaced by a small inexpensive package that can support today's complex software (see Turn, Rein, "Computers in the 1980's," Columbia University Press, 1974). But by the time that happens, who will want it?

Because of the decreasing size and cost of individual processors, computer designers can contemplate highly concurrent multiprocessor devices. However, such devices, with so many degrees of freedom available in their design, must be problem-oriented. In addition, the algorithms developed to solve problems on conventional serial processors are no longer optimal for more complex computer systems. The recent symposium on High Speed Computer and Algorithm Organization (proceedings to be published by Academic Press, late 1977) revealed that the surface has hardly been scratched in that regard. Furthermore, the computer designer is severely restricted because historically the user has accepted the designer's product passively and adapted his problems and algorithms to the computer rather than vice versa.

Perhaps the most important trend of all is that the awesome computer mystique is gradually being supplanted by a more healthy attitude on the part of a computer-acculturated and increasingly demanding community of users who are discovering the Golden Rule, namely, "He who has the gold . . . rules."

Chicago, Illinois
September 1977
PETER LYKOS
1
Microcomputer Plus Saul'yev Method Solves Simultaneous Partial Differential Equations of the Diffusion Type with Highly Nonlinear Boundary Conditions
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch001
R. KENNETH WOLFE, DAVID C. COLONY, and RONALD D. EATON
University of Toledo, Toledo, OH 43606

Important today is the ability to answer rapidly and inexpensively the complex questions posed by an increasingly complex society. Mathematics has played an important role in scientific problem solving. Practical solutions today rely heavily on computerized numerical approaches.

This paper extends for use with the Hewlett-Packard 67/97 a numerical method due to Saul'yev (1,2). His method is very similar to the popular method of Schmidt (3) used in graphical, numerical and computer computations to study transient heat conduction problems. This paper will illustrate the use of a small minicomputer (microprocessor) to apply the Saul'yev approach to a simple case and also to a more complex case. The complex case is that of a hot solid slab bounded on one side by a cooler semi-infinite solid and exposed at the hot surface to solar radiation, cloud cover and forced or free convective heat losses to air.

A Simple Case

Consider a solid cylinder with faces fixed at two different temperatures. The sides of the cylinder are insulated. Temperature and time are then related through the extension of Fourier's law to the parabolic partial differential equation:

1)  $$\alpha \frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}$$

$T = T(x,t)$ = temperature at a point $x$ and a time $t$
$x$ = distance, in feet
$t$ = time in hours
$\alpha = k/\rho C$ = thermal diffusivity
$k$ = thermal conductivity, BTU/hr ft$^2$ °F/ft
$\rho$ = density, lb$_m$/ft$^3$
$C$ = heat capacity, BTU/lb$_m$ °F
Equation 1) is the diffusivity equation which applies to heat transfer as well as to the transport of matter. Assume that the cylinder has an initial constant temperature of $T_1$ and that at time zero one face is instantaneously brought to a temperature $T_s$. The time-temperature relationship can then be determined analytically by any of several methods to be:

2)  $$T(x,t) = (T_1 - T_s)\,\operatorname{erf}\!\left(\frac{x}{2\sqrt{\alpha t}}\right) + T_s$$

erf(z) = error function

$$\operatorname{erf}(z) = 1 - \frac{\exp(-z^2)}{\sqrt{\pi}}\left[\frac{1}{z} - \frac{1}{2z^3} + \frac{1\cdot 3}{2^2 z^5} - \frac{1\cdot 3\cdot 5}{2^3 z^7} + \cdots\right]$$
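Equation 2) is easy to evaluate wherever an error function routine is available. The following is a minimal sketch, not the authors' program; the function name `slab_temperature` is hypothetical, and the inputs (diffusivity 0.08 ft²/hr, solid initially at 100°F, face held at 300°F, one hour elapsed) are illustrative values chosen to match the semi-infinite solid example treated in this paper.

```python
import math

def slab_temperature(x_ft, t_hr, alpha, t_initial, t_surface):
    """Equation 2): temperature at depth x_ft (feet) after t_hr (hours)."""
    z = x_ft / (2.0 * math.sqrt(alpha * t_hr))
    return (t_initial - t_surface) * math.erf(z) + t_surface

# Illustrative inputs (assumed): alpha = 0.08 ft^2/hr, 100 F solid, 300 F face.
depths_in = [2, 4, 6, 8, 10]
profile = [slab_temperature(x / 12.0, 1.0, 0.08, 100.0, 300.0) for x in depths_in]
for x, temp in zip(depths_in, profile):
    print(f"x = {x:2d} in   T = {temp:6.1f} F")
```

The library routine `math.erf` is used here in place of the series form, which converges well only for larger values of z.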
Equation 2) is developed in most heat transfer texts. Two good references which treat this problem are Chapman (4) and Carslaw and Jaeger (5). But direct application of this analytical treatment is not often feasible. Other methods of solution are required. Approximations of the error function are given in Abramowitz and Stegun (6) which are highly accurate and very fast on computers.

Schmidt's Numerical Method

Consider, as shown in Figure 1, a cylinder that is divided into hypothetical elements which are frequently called nodes in heat transfer literature. To develop Schmidt's numerical method, an energy balance around an element i is written:

[Heat flow from i-1] + [Heat flow from i+1] = [Heat accumulation in i]

3)  $$\frac{kA\,(T_{i-1} - T_i)}{\Delta x} + \frac{kA\,(T_{i+1} - T_i)}{\Delta x} = \frac{\rho C A\,\Delta x\,(T_i' - T_i)}{\Delta t}$$

$A$ = area perpendicular to flow, ft$^2$
$\Delta x$ = element length, ft
$\Delta t$ = small increment of time, hours
$T_i'$ = temperature of element i at time $t + \Delta t$
$T_i$ = temperature of element i at time $t$

Rearranging equation 3) gives:

$$\frac{\alpha\,\Delta t}{\Delta x^2}\left[(T_{i-1} - T_i) + (T_{i+1} - T_i)\right] = T_i' - T_i$$
Figure 1. Imaginary division of slab with finite depth and large surface dimensions in relation to the depth. (The slab of depth D is divided along isothermal lines into elements of length $\Delta x$, with temperatures $T_{i-1}$, $T_i$, and $T_{i+1}$ at adjacent nodes.)
4)  $$\alpha N\,T_{i-1} + (1 - 2\alpha N)\,T_i + \alpha N\,T_{i+1} = T_i'$$

$\alpha$ = thermal diffusivity, $N = \Delta t/\Delta x^2$
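One explicit time step of equation 4) can be sketched as follows. This is an illustration only; the function name `schmidt_step`, the fixed end temperatures, and the starting row are assumptions made for the example. Note that the new row is built in separate storage, because every new value depends on old neighbors on both sides, and that the coefficient $(1 - 2\alpha N)$ goes negative once $\alpha N$ exceeds 1/2.

```python
def schmidt_step(temps, alpha_n):
    """One explicit step of equation 4); the two end values are held fixed."""
    if alpha_n > 0.5:
        raise ValueError("equation 4) needs alpha*N <= 1/2 to keep coefficients non-negative")
    new = list(temps)  # separate storage for the new row
    for i in range(1, len(temps) - 1):
        new[i] = (alpha_n * temps[i - 1]
                  + (1.0 - 2.0 * alpha_n) * temps[i]
                  + alpha_n * temps[i + 1])
    return new

# Hypothetical start: face at 300 F, twelve interior/base nodes at 100 F.
row = [300.0] + [100.0] * 12
row = schmidt_step(row, 0.24)
print([round(v, 1) for v in row[:4]])
```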
For numerical stability, the coefficients of all temperature variables must be non-negative. As far as equation 4) is concerned, this means that N must be selected such that $\alpha N \le 1/2$.

The Saul'yev Method

The Saul'yev method was originally created to conserve computer memory for computerized numerical methods. The method is based upon an interpretation of the second partial derivative $\partial^2 T/\partial x^2$ which results from evaluating the finite partial derivatives first in one direction and then in the other. This process is called alternating direction. The alternating direction process is made clearer, perhaps, by the following:

5)  $$\frac{\partial^2 T}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial T}{\partial x}\right)$$

Forward Difference

$$\frac{\partial^2 T}{\partial x^2} \approx \frac{1}{\Delta x}\left[\left(\frac{\partial T}{\partial x}\right)_{x,\,t} - \left(\frac{\partial T}{\partial x}\right)_{x-\Delta x,\,t+\Delta t}\right]$$

Backward Difference

$$\frac{\partial^2 T}{\partial x^2} \approx \frac{1}{\Delta x}\left[\left(\frac{\partial T}{\partial x}\right)_{x,\,t+2\Delta t} - \left(\frac{\partial T}{\partial x}\right)_{x-\Delta x,\,t+\Delta t}\right]$$

Returning to equation 4) and using the Saul'yev concept:

Forward Difference

6)  $$\alpha N\left[(T_{i-1,\,t+\Delta t} - T_{i,\,t+\Delta t}) + (T_{i+1,\,t} - T_{i,\,t})\right] = T_{i,\,t+\Delta t} - T_{i,\,t}$$

Backward Difference

7)  $$\alpha N\left[(T_{i-1,\,t+\Delta t} - T_{i,\,t+\Delta t}) + (T_{i+1,\,t+2\Delta t} - T_{i,\,t+2\Delta t})\right] = T_{i,\,t+2\Delta t} - T_{i,\,t+\Delta t}$$
Algebraically manipulating 6) and 7) to solve for $T_i'$, the temperature at the end of the time step, gives:

Forward Difference

8)  $$\alpha N\,(T_{i-1}' + T_{i+1}) + (1 - \alpha N)\,T_i = (1 + \alpha N)\,T_i'$$

Backward Difference

9)  $$\alpha N\,(T_{i-1} + T_{i+1}') + (1 - \alpha N)\,T_i = (1 + \alpha N)\,T_i'$$
Equations 8) and 9) are unconditionally stable with no restrictions for stability on the numerical values of $\alpha N$. As an illustration, consider a semi-infinite solid with a temperature of 100°F. At time zero, the face is brought instantaneously to 300°F. Assume that we desire the temperature of the face at t = one hour. The solid has the following thermophysical properties:

k = 2, ρ = 100, C = .25, α = .08

Equations 8) and 9) can be used to compute a table of temperatures at different times. For these computations use the following parameters:

Δt = 5 min. = 1/12 hr.
Δx = 2 inches = 1/6 ft.
α = .08
αN = (.08 × 1/12)/(1/6)$^2$ = 0.240
The total depth is 24 inches and the total number of nodes is made equal to 13. Table I gives the computed results. For this computation, the temperature at 24 inches is considered constant at 100°F. This assumption is confirmed by comparing the computed results with the analytical solution. The analytical results are also shown in Table I. The maximum difference between the analytical and the numerical results is 2°F at t = 1.0 hr. This difference seems in itself to represent a quite acceptable approximation, but it should be noted that its magnitude is exaggerated by rounding off to the nearest degree.

The rows of numerical results are labeled by nΔt = t. For odd values of n, equation 8) is used, while for even values, equation 9) is required. The numbers (temperatures) in each row correspond to the memory registers in the HP 67/97 computer. By examining equations 8) and 9), it can be seen that new registers are not required to store new results. This is one major advantage of the Saul'yev method. Other methods, such as Schmidt's, require the use of new registers for this purpose.

Table I. Numerical Time-Temperature Results for Saul'yev Method on HP 67/97 Compared with Analytical Results

Numerical Results (Equations 8 and 9), table entries are in °F

node i           1    2    3    4    5    6    7    8    9   10   11   12   13
x, inches        0    2    4    6    8   10   12   14   16   18   20   22   24
t=0            300  100  100  100  100  100  100  100  100  100  100  100  100
t=1Δt          300  139  107  101  100  100  100  100  100  100  100  100  100
t=2Δt          300  165  113  102  100  100  100  100  100  100  100  100  100
t=3Δt          300  181  124  106  102  100  100  100  100  100  100  100  100
t=4Δt          300  194  132  109  102  101  100  100  100  100  100  100  100
t=5Δt          300  203  141  114  104  101  100  100  100  100  100  100  100
t=6Δt          300  211  149  118  106  102  100  100  100  100  100  100  100
t=7Δt          300  216  156  123  108  103  101  100  100  100  100  100  100
t=8Δt          300  222  162  127  110  103  101  100  100  100  100  100  100
t=9Δt          300  225  167  131  113  105  102  101  100  100  100  100  100
t=10Δt         300  230  172  135  115  106  102  101  100  100  100  100  100
t=11Δt         300  232  177  139  118  107  103  101  100  100  100  100  100
t=12Δt = 1 hr  300  235  181  143  120  109  103  101  100  100  100  100  100

Analytical Results (Equation 2), °F

t=1 hr         300  235  181  142  119  107  102  101  100  100  100  100  100

The agreement between the analytical results and the numerical results is excellent. Since the Saul'yev method is an alternating direction method, an even number of rows is required for accurate results. Each pair of rows is called a pass; the total number of passes, P, is equal to t/(2Δt).

Programming the HP 67/97 to obtain the results in Table I, with printout of intermediate results, required 56 program steps out of 224 available. Two memory registers were used to store αN and to maintain a count of the number of rows computed.
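The in-place character of equations 8) and 9) can be sketched in a few lines. This is an illustrative reconstruction in Python, not the authors' HP 67/97 program; the function names are hypothetical. Both sweeps apply the same arithmetic, and only the sweep direction decides which neighbor already holds its new value, so a single list (one register per node) suffices.

```python
def saulyev_forward(temps, alpha_n):
    """Equation 8): sweep from the face inward, overwriting values in place."""
    for i in range(1, len(temps) - 1):
        # temps[i - 1] is already the new value; temps[i + 1] is still old.
        temps[i] = (alpha_n * (temps[i - 1] + temps[i + 1])
                    + (1.0 - alpha_n) * temps[i]) / (1.0 + alpha_n)

def saulyev_backward(temps, alpha_n):
    """Equation 9): sweep from the deepest node toward the face, in place."""
    for i in range(len(temps) - 2, 0, -1):
        # temps[i + 1] is already the new value; temps[i - 1] is still old.
        temps[i] = (alpha_n * (temps[i - 1] + temps[i + 1])
                    + (1.0 - alpha_n) * temps[i]) / (1.0 + alpha_n)

def saulyev_rows(temps, alpha_n, n_rows):
    """Rounded rows for n = 1..n_rows; odd n uses 8), even n uses 9)."""
    rows = []
    for n in range(1, n_rows + 1):
        if n % 2 == 1:
            saulyev_forward(temps, alpha_n)
        else:
            saulyev_backward(temps, alpha_n)
        rows.append([round(v) for v in temps])
    return rows

alpha_n = (0.08 * (1.0 / 12.0)) / (1.0 / 6.0) ** 2   # 0.24, as computed in the text
temps = [300.0] + [100.0] * 12                        # 13 nodes, 2 inches apart
rows = saulyev_rows(temps, alpha_n, 12)               # 12 rows = 6 passes = 1 hr
for n, row in enumerate(rows, start=1):
    print(n, row[:8])
```

Under these assumptions the first two rows round to 139, 107, 101 and 165, 113, 102 at nodes 2 through 4, in agreement with Table I.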
A More Complex Problem

In this section we examine a more complex case and develop some extended formulas. Figure 2 shows a recent situation where the authors (7) needed to provide a method for field engineers to predict time-temperature cooling curves for hot asphaltic surfaces placed during highway construction, in order to support decisions on whether or not to allow paving work during marginal weather conditions. Figure 2 also shows the various factors which influence the heat transfer and the temperature-time history of a hot pavement layer. The mathematical model which is applicable to this situation is summarized below:

Governing Equation for Hot Layer

10)  $$\alpha_1 \frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}$$

Governing Equation for Cold Base

11)  $$\alpha_2 \frac{\partial^2 U}{\partial x^2} = \frac{\partial U}{\partial t}$$

Surface Energy Balance

12)  $$k_1 \frac{\partial T(0,t)}{\partial x} = -aMH + 0.65\,V^{0.8}\,\bigl(T(0,t) - T_{air}\bigr) + \varepsilon\sigma\,\bigl(T(0,t) + 460\bigr)^4$$

Hot Layer - Cold Layer Interface

13)  $$T(\ell,t) = U(\ell,t)$$  (contact condition)

14)  $$k_1 \frac{\partial T(\ell,t)}{\partial x} = k_2 \frac{\partial U(\ell,t)}{\partial x}$$  (energy balance condition at medium A to medium B interface)
Figure 2. Hot slab on cold semi-infinite base with surface radiation, convection, and insolation. (The figure labels the surface heat flows $q_{radiation} = \varepsilon\sigma A\,(T_1 + 460)^4$, $q_{convection} = hA\,(T_1 - T_{air})$, and $q_{solar} = aMWH$, where H = solar flux, M = cloud transmission factor, .15 for clouds and 1.0 for no clouds, and W = fraction of cloud cover, visually estimated. Medium A, the hot layer, has an initial uniform temperature $T_A$ at time zero, and medium B, the cold base, has an initial temperature $T_B$ at time zero. The medium to medium change occurs at $T_6$ in this case, and the step size in the deeper part of the cold base is $b\Delta x = 9\Delta x$. Temperatures $T_i$ are specified at all node points at time zero.)
Initial Conditions

15)  $$T(x,0) = T_0, \qquad 0 \le x < \ell$$

16)  $$U(x,0) = U_0, \qquad \ell \le x < \infty$$
The above set of equations must be considered simultaneously. Analytical solution of such a set is impossible. Among other reasons, analytical solution is rendered impossible by the presence of the $T^4$ radiation term. To obtain results useful in practical application, more manageable methods are mandatory. Numerical methods exist for the above set of equations based upon Schmidt's conditionally stable, explicit method. In order to utilize an HP 67/97 computer, the Saul'yev method is the method of choice. To apply the Saul'yev approach, his method must be extended to handle the interface conditions 13) and 14), and the surface energy balance 12). Also, a way to extend the total depth is required.

Generalized Change

This section develops the equation to describe a change in $\Delta x$, a change in medium, or a change in both conditions.

(Diagram: node $T_i$ lies on the boundary between Medium A, with properties $\alpha_1$, $\rho_1$, $C_1$, $k_1$ and spacing $a\Delta x$ to node $T_{i-1}$, and Medium B, with properties $\alpha_2$, $\rho_2$, $C_2$, $k_2$ and spacing $b\Delta x$ to node $T_{i+1}$. The element around node i extends $\tfrac{1}{2}a\Delta x$ into Medium A and $\tfrac{1}{2}b\Delta x$ into Medium B.)

The above diagram depicts the generalized change interface. The energy balance equation is:

17)  $$\frac{k_1 A\,(T_{i-1} - T_i)}{a\,\Delta x} + \frac{k_2 A\,(T_{i+1} - T_i)}{b\,\Delta x} = \tfrac{1}{2}\,(a\rho_1 C_1 + b\rho_2 C_2)\,A\,\Delta x\,\frac{T_i' - T_i}{\Delta t}$$

18)  $$F\Bigl[\underbrace{\frac{k_1}{a}\,(T_{i-1} - T_i)}_{\text{(Forward)}} + \underbrace{\frac{k_2}{b}\,(T_{i+1} - T_i)}_{\text{(Backward)}}\Bigr] = T_i' - T_i$$

19)  $$F = \frac{2\,\alpha_1 \alpha_2\,N}{a\,\alpha_2 k_1 + b\,\alpha_1 k_2}, \qquad N = \frac{\Delta t}{\Delta x^2}$$

Now give 18) a backward and forward interpretation:

Forward

20)  $$F\left(\frac{k_1}{a}\,T_{i-1}' + \frac{k_2}{b}\,T_{i+1}\right) + \left(1 - \frac{F k_2}{b}\right) T_i = \left(1 + \frac{F k_1}{a}\right) T_i'$$

Backward

21)  $$F\left(\frac{k_1}{a}\,T_{i-1} + \frac{k_2}{b}\,T_{i+1}'\right) + \left(1 - \frac{F k_1}{a}\right) T_i = \left(1 + \frac{F k_2}{b}\right) T_i'$$

When medium A and medium B are identical and a=b, equations 20) and 21) reduce to equations 6) and 7).
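The reduction just stated can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, and the test values (identical media with α = 0.08, k = 2, N = 3, a = b = 1) are chosen so that Fk should equal αN = 0.24.

```python
def interface_factor(alpha1, alpha2, k1, k2, a, b, n):
    """Equation 19): F = 2*alpha1*alpha2*N / (a*alpha2*k1 + b*alpha1*k2)."""
    return 2.0 * alpha1 * alpha2 * n / (a * alpha2 * k1 + b * alpha1 * k2)

def interface_forward(t_prev_new, t_i, t_next, f, k1, k2, a, b):
    """Equation 20): new T_i from the updated T_{i-1} and the old T_{i+1}."""
    return (f * (k1 / a * t_prev_new + k2 / b * t_next)
            + (1.0 - f * k2 / b) * t_i) / (1.0 + f * k1 / a)

def interface_backward(t_prev, t_i, t_next_new, f, k1, k2, a, b):
    """Equation 21): new T_i from the old T_{i-1} and the updated T_{i+1}."""
    return (f * (k1 / a * t_prev + k2 / b * t_next_new)
            + (1.0 - f * k1 / a) * t_i) / (1.0 + f * k2 / b)

# Assumed check values: identical media, equal spacing on both sides.
alpha, k, n = 0.08, 2.0, 3.0
f = interface_factor(alpha, alpha, k, k, 1.0, 1.0, n)
print(f * k, alpha * n)   # the two numbers should agree
```

A second useful sanity check is that a uniform temperature field stays uniform for any choice of a, b, and material properties, since the coefficients on the left of 20) and 21) sum to the coefficient on the right.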
Medium to Medium Change

Equations 20) and 21) can be specialized for the case where there is a change in the media but the step size, $\Delta x$, is maintained identical on both sides of the interface. For this case a=b=1 and the appropriate backward-forward equations are:

22)  $$F = \frac{2\,\alpha_1 \alpha_2\,N}{\alpha_1 k_2 + \alpha_2 k_1}$$

Forward

23)  $$F\,(k_1 T_{i-1}' + k_2 T_{i+1}) + (1 - F k_2)\,T_i = (1 + F k_1)\,T_i'$$

Backward

24)  $$F\,(k_1 T_{i-1} + k_2 T_{i+1}') + (1 - F k_1)\,T_i = (1 + F k_2)\,T_i'$$

Change in Step Size Δx

To improve accuracy of the computation, the step size and the time increment $\Delta t$ can be made smaller. Where there is a rapid change in the temperature, it is wise to use small values of $\Delta t$ and $\Delta x$. In cases where there is a relatively mild change in temperature, the step size $\Delta x$ can be made larger. In cases where the problem includes both rapid changes and mild changes it is appropriate to change the step size. For the problem depicted in Figure 2, the change in $\Delta x$ will occur in the colder medium with $\alpha_1 = \alpha_2 = \alpha$ and $k_1 = k_2$. For this situation, with a = 1, equations 20) and 21) become:

Forward

25)  $$\frac{2\alpha N}{1+b}\left(T_{i-1}' + \frac{T_{i+1}}{b}\right) + \left(1 - \frac{2\alpha N}{b\,(1+b)}\right) T_i = \left(1 + \frac{2\alpha N}{1+b}\right) T_i'$$

Backward

26)  $$\frac{2\alpha N}{1+b}\left(T_{i-1} + \frac{T_{i+1}'}{b}\right) + \left(1 - \frac{2\alpha N}{1+b}\right) T_i = \left(1 + \frac{2\alpha N}{b\,(1+b)}\right) T_i'$$

27)  b = the step size multiplier, $\Delta x_{new} = b\,\Delta x_{old}$
Interface Temperature

When a hot solid and a cold surface are placed together at time zero, an interface temperature must be assigned. The literature on this subject generally recommends an arithmetic average. This value is in general incorrect and a better assignment is required. The literature (4,5) contains the mathematical solution for this case, which is:

28)  $$T(x,t) = U_1 + \frac{T_1 - U_1}{1 + r}\,\bigl(1 + r\,\operatorname{erf}(z)\bigr)$$

$T(x,t)$ = temperature at x and t of solid 1
$T_1$ = initial temperature of solid 1
$U_1$ = initial temperature of solid 2
$r = \dfrac{k_2}{k_1}\left(\dfrac{\alpha_1}{\alpha_2}\right)^{1/2}$, subscripts 1 and 2 refer to solids 1 and 2
$z = \dfrac{x}{2\,(\alpha_1 t)^{1/2}}$
erf(z) = see equation 2
x = 0 at the solid to solid interface

As $t \to 0$,

$$T(0,\,t \to 0) = \frac{T_1 + r\,U_1}{1 + r}$$

When the two media have the same thermophysical properties, r = 1 and

$$T(0,0) = \frac{T_1 + U_1}{1 + 1} = \tfrac{1}{2}\,(T_1 + U_1)$$

It is clear from the foregoing equation that the arithmetic average correctly represents the interface temperature only when the thermophysical properties of the hot and cold media are equal.
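A short numerical sketch shows how far the contact temperature can depart from the arithmetic average. The function name `interface_temperature` is hypothetical, and the inputs are illustrative values taken from the pavement example treated in this paper (hot solid at 300°F with k = 1.5 and α = 0.04; cold solid at 70°F with k = 3 and α = 0.08).

```python
import math

def interface_temperature(t1, u1, k1, alpha1, k2, alpha2):
    """Contact temperature (T1 + r*U1)/(1 + r), r = (k2/k1)*sqrt(alpha1/alpha2)."""
    r = (k2 / k1) * math.sqrt(alpha1 / alpha2)
    return (t1 + r * u1) / (1.0 + r)

# Illustrative inputs (assumed): hot layer over a more conductive cold base.
t_contact = interface_temperature(300.0, 70.0, 1.5, 0.04, 3.0, 0.08)
t_average = 0.5 * (300.0 + 70.0)
print(f"contact = {t_contact:.1f} F, arithmetic average = {t_average:.1f} F")
```

With these inputs r is about 1.41, so the contact temperature falls roughly 20°F below the arithmetic average of 185°F.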
Surface Equations

The surface of a hot laid asphaltic concrete pavement layer is acted upon by several environmental influences. These influences include solar radiation (insolation) and cloud cover along with air velocity and turbulence. Solar radiation varies in intensity with time of day and the season of the year, while wind conditions are even more variable. Solar radiation can be measured with a pyrheliometer, but in practice, the equations and tables in the ASHRAE Handbook of Fundamentals (8) can be used to accurately determine the solar flux.

The effect of cloud cover is perhaps more difficult to evaluate since height, thickness, water droplet size and percentage of cloud cover all influence the transmission of solar energy. For the present purposes, the cloud cover is assumed either to exist or not to exist. If it exists, solar radiation is reduced by 85% in all computations.

The following diagram depicts the surface element and node construction appropriate to the problem under discussion:

(Diagram: node 1 at the surface, exposed to air, and node 2 a distance $\Delta x$ below it. One-half element is assigned to $T_1$.)

The energy balance at the surface element yields:

- radiation loss + solar gain + gain from $T_2$ - convective loss = energy gain/loss in $T_1$ element

29)  $$-\varepsilon\sigma A\,(T_1 + 460)^4 + aMAH + \frac{kA\,(T_2 - T_1)}{\Delta x} - hA\,(T_1 - T_{air}) = \tfrac{1}{2}\,A\,\Delta x\,\rho C\,\frac{T_1' - T_1}{\Delta t}$$

Rearranging terms, with $T_0$ denoting the air temperature $T_{air}$, gives:

30)  $$\frac{2h\alpha N \Delta x}{k}\,(T_0 - T_1) + 2\alpha N\,(T_2 - T_1) + \frac{2a\alpha N M \Delta x\,H}{k} - \frac{2\varepsilon\sigma\alpha N \Delta x}{k}\,(T_1 + 460)^4 + T_1 = T_1'$$
Forward and backward interpretations of the above are:

Forward

31)  $$\frac{2\alpha N \Delta x}{k}\,\bigl(hT_0 + aMH - \varepsilon\sigma\,(T_1 + 460)^4\bigr) + 2\alpha N\,(T_2 - T_1) + T_1 = \left(1 + \frac{2h\alpha N \Delta x}{k}\right) T_1'$$

Backward

32)  $$\frac{2\alpha N \Delta x}{k}\,\bigl(hT_0 + aMH - \varepsilon\sigma\,(T_1 + 460)^4\bigr) + \left(1 - \frac{2h\alpha N \Delta x}{k}\right) T_1 + 2\alpha N\,T_2' = (1 + 2\alpha N)\,T_1'$$

In equations 31) and 32) the radiation term, $(T_1 + 460)^4$, has not been applied in a forward and backward sense, since the solution for $T_1'$ would otherwise be unduly complicated. Tests show this omission to yield negligible errors. The largest discrepancy occurs in $T_1$ only during the early minutes after time zero.
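Equations 31) and 32) can each be solved directly for the new surface temperature. The following is a sketch under stated assumptions, not the authors' HP 67/97 program: the function names are hypothetical, `solar` stands for the product aMH, and the radiation term is evaluated at the old $T_1$ in both sweeps, as the text describes.

```python
def surface_forward(t1, t2, t_air, alpha_n, dx, k, h, solar, eps_sigma):
    """Equation 31): new surface temperature, convection taken at the new time."""
    g = 2.0 * alpha_n * dx / k
    radiation = eps_sigma * (t1 + 460.0) ** 4      # uses the old T1
    rhs = g * (h * t_air + solar - radiation) + 2.0 * alpha_n * (t2 - t1) + t1
    return rhs / (1.0 + g * h)

def surface_backward(t1, t2_new, t_air, alpha_n, dx, k, h, solar, eps_sigma):
    """Equation 32): new surface temperature, conduction taken at the new time."""
    g = 2.0 * alpha_n * dx / k
    radiation = eps_sigma * (t1 + 460.0) ** 4      # uses the old T1
    rhs = (g * (h * t_air + solar - radiation)
           + (1.0 - g * h) * t1 + 2.0 * alpha_n * t2_new)
    return rhs / (1.0 + 2.0 * alpha_n)

# Assumed equilibrium check: solar gain exactly offsets radiation at 70 F,
# and slab, surface, and air are all at 70 F, so nothing should change.
eps_sigma = 1.644e-9
balanced_solar = eps_sigma * (70.0 + 460.0) ** 4
print(surface_forward(70.0, 70.0, 70.0, 0.24, 1.0 / 24.0, 1.5, 4.10,
                      balanced_solar, eps_sigma))
```

The equilibrium case is a convenient check because any sign error in the convection, solar, or radiation terms would move the surface temperature away from 70°F.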
Solution of Problem in Figure 2

The above equations provide a means for solving the problem depicted in Figure 2. To further illustrate this problem, the following data are hypothesized:

Environmental conditions
  Solar radiation H = 200 BTU/hr (obtained from ASHRAE Handbook of Fundamentals tables (8))
  M = 1 or .15; assume cloud cover with M = .15
  Air velocity = 10 MPH
  Air temp. = 80°F
  h = convection coefficient = .65 v^0.8 = .65(10)^0.8 = 4.10
  Air temperature = 70°F
  Surface radiation = εσ(T+460)^4 = .95 × 1.731·10^-9 (T+460)^4 = 1.644·10^-9 (T+460)^4

Hot Solid
  Absorptivity a for solar flux = .85
  Emissivity for solar radiation = .95
  Initial temperature = 300°F
  k = 1.5
  ρ = 150
  C = .25
  α = 0.04
  Δx = 0.5" = ft.

Cold Solid
  Initial temperature = 70°F
  k = 3
  C = .25
  ρ = 150
  α = 0.08
  Δx from node 5 to node 10 = .5 in
  Δx from node 10 to node 13 = 9 × .4 = 3.6 inches

Elapsed time = 15 minutes
Δt =
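The derived constants quoted in the data above can be checked with a few lines of Python. This is a worked arithmetic sketch only; the variable names are illustrative, and the units of k, ρ and C are assumed to be those given in the Notation section (BTU, ft, hr, lbm, °F).

```python
# Convection coefficient from the default correlation h = 0.65 * v**0.8
v_mph = 10.0
h = 0.65 * v_mph ** 0.8             # about 4.10

# Emissivity times the experimental Stefan-Boltzmann constant
eps_sigma = 0.95 * 1.731e-9         # about 1.644e-9

# Thermal diffusivity alpha = k / (rho * C) for each solid
alpha_hot = 1.5 / (150.0 * 0.25)    # 0.04 for the hot solid
alpha_cold = 3.0 / (150.0 * 0.25)   # 0.08 for the cold solid

# Radiation loss per unit area from a surface at the initial 300 F
radiation_loss_300F = eps_sigma * (300.0 + 460.0) ** 4
```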
Total depth of base = 1.6 + 10.8 = 12.4 inches. Table II gives the results of computations using the HP 67/97 at one-minute increments up to 15 minutes. These results have been compared with highly accurate results from an IBM 360, and they agree within 2°F at all points.

Program

The program to obtain the results in Table II took 223 steps out of an available 224 steps. The authors have made several hundred computations using the HP67 or HP97. To these authors, the results are extremely satisfactory. The computations are convenient and the method is generally superior to other approaches. The programs are appended to this paper. Current purchase price of an HP67 computer is $400 and that of an HP97 is $750. No hardware other than one of the foregoing was needed to perform the complex heat transfer calculations which have been described. Monthly maintenance cost of these instruments can be considered negligible.

Conclusion

Examples have been presented which demonstrate the usefulness of a small, handheld computer for performing numerical solutions of simultaneous partial differential equations of the diffusion type. Some mathematical development, or extension, of a standard numerical solution method was required to adapt the method to a small computer. But the results obtained compare very closely to those yielded by an IBM 360 computer; and the use of a small computer makes possible rational decisions on the site in real time by a construction project engineer. It has not been possible, hitherto, to support "go - no go" decisions at a paving site with such detailed analysis of environmental data.

Acknowledgements

The authors wish to thank Mr. Leon Talbert and Mr. Willis Gibboney of the Department of Transportation, State of Ohio, for their assistance and encouragement. The research reported in this paper was supported in part by the Department of Transportation, State of Ohio, and the U.S. Department of Transportation, Federal Highway Administration.
Table II. Temperature Profile of Problem in Figure 2 as a Function of Time
[Table: rows are time in minutes (0 through 15); columns are inches from surface (0.0", 0.4", 0.8", 1.2", interface at 1.6", 2.0", 2.4", 2.8", 3.2", 3.6", 7.2", 10.8", 14.4"); entries are temperatures in °F.]
APPENDIX A

User Instructions: SAUL'YEV INPUT/OUTPUT PROGRAM

INSTRUCTIONS (with INPUT DATA/UNITS)

Input side 1 and side 2.
Enter problem data and constants:
  Time span (seconds), t
  Medium 1 thermal diffusivity (ft²/hr), α1
  Medium 1 thermal conductivity (BTU/ft-hr-°F), k1
  Medium 2 thermal diffusivity, α2
  Medium 2 thermal conductivity, k2
  Step size (inches), Δx
  Node number of medium to medium change + 10
  Air velocity (MPH), V
  Solar flux times cloud factor (BTU/hr-ft²), MH
  Number of back-and-forth passes, P
  Temperature of air (°F), T_air
  Temperature of asphalt (°F), T_asphalt
  Temperature of base (°F), T_base
Set F2 if h = 0.53 v^0.8; default h = 0.65 v^0.8.
Calculate parameters required for Saul'yev program.
Allocate temperatures to R10 to R23.
Change to Saul'yev program.
Input side 1 and side 2 after using Saul'yev program.
Determine average temperature of Medium A layer (output: T_avg).
Print T_0, T_1, T_2, ...; printing stops with error displayed.
Program Listing: SAUL'YEV INPUT/OUTPUT PROGRAM

[Program listing: HP-67/97 steps, key entries and key codes.]

Comments in the listing:
  SEQUENTIAL DATA ENTRY: enter R#, press A, R# displayed; enter data into Rx, press R/S, next R# displayed; continue.
  COMPUTE CONSTANTS FOR SAUL'YEV PROGRAM
  h = 0.65 v^0.8 or, if F2 is on, h = 0.53 v^0.8
  MH (as entered); M = cloud cover factor = 1.0 or 0.15
  P = number of passes
  (node number + 10) of solid-solid interface
  TEMPERATURES DISTRIBUTED FROM R10 to R23
  COMPUTE AVERAGE TEMPERATURE OF LAYER 1 USING TRAPEZOIDAL RULE
  PRINT TEMPERATURES FROM R10 to R23; display node number
APPENDIX B

User Instructions: SAUL'YEV PROGRAM

INSTRUCTIONS

Put in data with Saul'yev input/output program.
Read Saul'yev program.
Start.
Output from Saul'yev input/output program.
Program Listing: SAUL'YEV PROGRAM

[Program listing: HP-67/97 steps, key entries and key codes.]

Comments in the listing:
  START: calculate T_1. MAIN PROGRAM.
  Next T. T at interface? Step size change? Surface T?
  Prepare for backward pass, set flag 0. Set RI to negative values.
  END OF BACKWARD PASS: reduce number of passes left by one. End? Start next pass.
  Routine to keep RI negative if F0 on.
  SURFACE TEMPERATURE: εσ = 1.644·10^-9; solar absorptivity = 0.85
  MEDIUM TO MEDIUM TEMPERATURE
  INTERFACE; STEP SIZE CHANGE: b = step size multiplier = 9; b + 1 = 10; b² = 81
Notation

a = solar absorptivity, dimensionless
A = area, ft²
a, b = multiplier used to change step size, Δx
k\
evj
=
k -|(j Ax
0
C_p, C = heat capacity, BTU/lbm-°F
D = depth of hot slab, inches or feet
F = $\dfrac{2\,\alpha_{asphalt}\,\alpha_{base}}{\alpha_{asphalt}\,k_{base} + \alpha_{base}\,k_{asphalt}}$
h = convective heat transfer coefficient between hot surface and air, BTU/ft²-hr-°F
H = solar heat flux, BTU/hr-ft²
k = thermal conductivity, BTU/ft²-hr-(°F/ft)
k_a = thermal conductivity of hot asphalt mat, BTU/ft²-hr-(°F/ft)
k_base = thermal conductivity of base upon which hot layer is placed, BTU/ft²-hr-(°F/ft)
L = length of a side of a square, ft
M = cloud cover factor = 1.0 with no clouds, = 0.15 with clouds
N = Δt/Δx², in sec/in² or hr/ft², used in finite difference equations
P = number of backward and forward passes used in Saul'yev method (one forward computation plus one backward computation equal one pass)
q = rate of heat flow, BTU/hr-ft²
base asphalt a s p h a l t b a s e t = time, h r s . , or seconds
r
=
k
/k
a
/a
T = temperature, °F or °R
T_i = temperature at node i, °F
T_i' = temperature at node i at time t = t + Δt, °F
T_a = temperature of air, °F
T_0 = temperature at hot surface or the initial hot layer temperature, °F
T_ab = temperature at interface of medium A and medium B, °F
T_s = temperature of surface, °F
U_i = initial temperature of base, a semi
v = velocity of wind, miles per hour
W = percentage of cloud cover
x = distance (usually from asphalt surface), inches or feet
z = $x/\sqrt{\alpha t}$, dimensionless, used in analytical equations
α = thermal diffusivity = k/ρc, ft²/hr
α_a = thermal diffusivity of hot layer, ft²/hr
α_base = thermal diffusivity of cold base, ft²/hr
ε = emissivity, dimensionless
σ = Stefan-Boltzmann constant = 1.713 × 10^-9 BTU/hr-ft²-°R⁴ (theoretical) = 1.731 × 10^-9 BTU/hr-ft²-°R⁴ (experimental)
ρ = density, lbm/ft³
Literature Cited

1. Saul'yev, V. K., "Integration of Equations by the Method of Nets," Macmillan Company, 1964.
2. Carnahan, Brice; Luther, H. A.; Wilkes, James O., "Applied Numerical Methods," John Wiley & Sons, Inc., 1969.
3. Dusinberre, G. M., "Numerical Analysis in Heat Flow," McGraw-Hill Book Company, Inc., New York, 1961.
4. Chapman, Alan J., "Heat Transfer," Macmillan Company, New York, 1967.
5. Carslaw, H. S.; Jaeger, J. C., "Conduction of Heat in Solids," Clarendon Press, Oxford, 1959.
6. Abramowitz, Milton; Stegun, Irene A., "Handbook of Mathematical Functions," National Bureau of Standards, Dover Publications, Inc., New York, 1972.
7. Wolfe, R. K.; Colony, D. C., "Final Report, Asphalt Cooling Rates: A Computer Simulation Study, Project 2844," for Ohio Department of Transportation and U.S. Department of Transportation, Federal Highway Administration, 1976.
8. ASHRAE, "ASHRAE Handbook of Fundamentals," American Society of Heating, Refrigeration and Air-Conditioning Engineers, Inc., 1972.
2 Conjugate Gradient Methods for Solving Algebraic Eigenproblems
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch002
J. C. NASH
Research Division, Economics Branch, Agriculture Canada, Ottawa K1A 0C5, Canada

S. G. NASH
Mathematics Department, University of Alberta, Edmonton, Alberta, Canada

The problem which is addressed in this paper is that of finding eigensolutions (e, x) which satisfy

$A x = e\,B x$ (1)

where A and B are real symmetric matrices and B is positive definite. The particular case where B is the identity is also of interest. This study will focus on methods using the conjugate gradient algorithm because this is very frugal of storage when used to solve linear equations [1] or function minimization problems [2]. The algorithm can be used in either fashion to solve the eigenproblem (1):

1) By solution of the equations

$(A - kB)\,y_i = B x_i$ (2)

with

$x_{i+1} = y_i / \|y_i\|$ (3)

where the double vertical lines mean "the norm of", it is possible to find the eigenvector x (or y) having eigenvalue e closest to the shift k. This is the process of inverse iteration [3,4].

2) By minimization of the Rayleigh quotient

$R = x^T A x / x^T B x$ (4)

with respect to x, where the superscript T denotes transposition, the eigenvector x corresponding to the most negative eigenvalue e is found [5]. R takes on the value e at its minimum. Note that if B is the identity, minimization of

$R' = x^T (A - kI)^2 x / x^T x$ (5)

gives the eigenvector corresponding to the eigenvalue which is closest to the shift k. Alternatively, root-shifting or orthogonalization as discussed by Shavitt et al. [6] may be used to find some of the higher eigensolutions.
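As a small illustration of the Rayleigh quotient (4), the sketch below evaluates R for a 2 by 2 example in plain Python. The helper names (`matvec`, `dot`, `rayleigh_quotient`) and the test matrix are illustrative assumptions, not part of the authors' programs. It shows that R equals the eigenvalue at an eigenvector and does not depend on the scaling of x.

```python
def matvec(M, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def rayleigh_quotient(A, B, x):
    """R = x^T A x / x^T B x, as in equation (4)."""
    return dot(x, matvec(A, x)) / dot(x, matvec(B, x))


# Example matrix with eigenvalues 1 (vector [1, -1]) and 3 (vector [1, 1]).
A = [[2.0, 1.0], [1.0, 2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]

r_low = rayleigh_quotient(A, B, [1.0, -1.0])     # most negative eigenvalue
r_high = rayleigh_quotient(A, B, [1.0, 1.0])     # other extreme eigenvalue
r_scaled = rayleigh_quotient(A, B, [5.0, -5.0])  # same as r_low
```

Any other vector gives a value between the two extremes, which is why minimizing R locates the lowest eigensolution.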
The approaches above suggest ways of finding one eigenvalue and vector at a time. No deflation or orthogonalization technique is needed for problems with distinct eigenvalues. Only a few vectors of storage are required and the programs we have written run in very small memories and could be used on very small machines such as the programmable desk top calculators. The aim of the work reported here was to find a reliable yet simple method for solving the symmetric matrix eigenproblem when the matrices involved would not necessarily fit in main memory in the computing device. In this regard, the methods of Nesbet [7] and Shavitt [6] had proved unreliable, while that of Davidson [8] was too complicated for the target machines.

Conjugate Gradients
Beale [9] has given a short but lucid derivation of the conjugate gradients (cg) family of algorithms. The fundamental idea is to generate a set of linearly independent search directions by means of a recurrence relationship. That is, given a function S(x) and its gradient g(x) together with the j'th search direction $t_j$, the next search direction is defined as

$t_{j+1} = \alpha\,t_j - g$ (6)

where α is some parameter, the formula for which determines the method. Once search directions exist, provision of a mechanism for actually performing a line search completes the specification of an algorithm except for start or restart conditions. Within this work we chose always to set

$t_0 = -g$ (7)

and if the algorithm had not converged in n steps, where n is the order of the problem, to restart in the same way from the current point or vector x. Minimization of the particular function

$S(x) = x^T C x + 2 x^T w + \text{(any scalar)}$ (8)

where C is a given positive definite symmetric matrix and w is a given vector solves the linear equations

$C x = -w$ (9)

Moreover, in this case it can be shown that the algorithm should converge in no more than n steps, while the line search simplifies to the computation of the step length

$h_j = g^T g / t_j^T C\,t_j$ (10)
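The recurrence (6), the start (7) and the step length (10) can be put together into a short linear-equation solver. The Python sketch below is a minimal illustration under stated assumptions, not the authors' program: g is taken as the residual Cx + w (half the gradient of S in (8)), the parameter α in (6) is chosen by the Fletcher-Reeves ratio of successive values of g^T g, and the names `cg_solve`, `matvec` and `dot` are hypothetical. No restart logic is included.

```python
def matvec(M, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def cg_solve(C, w, tol=1e-30):
    """Solve C x = -w for symmetric positive definite C by conjugate gradients."""
    n = len(w)
    x = [0.0] * n
    g = list(w)                      # g = C x + w with x = 0
    t = [-gi for gi in g]            # equation (7): first direction is -g
    gg = dot(g, g)
    for _ in range(n):               # at most n steps in exact arithmetic
        if gg <= tol:
            break
        Ct = matvec(C, t)
        h = gg / dot(t, Ct)          # equation (10): step length
        x = [xi + h * ti for xi, ti in zip(x, t)]
        g = [gi + h * ci for gi, ci in zip(g, Ct)]
        gg_new = dot(g, g)
        alpha = gg_new / gg          # assumed Fletcher-Reeves choice for (6)
        t = [alpha * ti - gi for ti, gi in zip(t, g)]
        gg = gg_new
    return x


# Example: C x = -w with C = [[4, 1], [1, 3]] and -w = [1, 2]
x_sol = cg_solve([[4.0, 1.0], [1.0, 3.0]], [-1.0, -2.0])
```

Only the vectors x, g, t and the product C t are held in storage, which is the frugality the text refers to; the matrix itself is needed only through the product C t.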
Similar simplifications exist for the Rayleigh quotient (4). These are dealt with below. Both the specialized and general-purpose cg minimization algorithms are notoriously sensitive to implementation details which bedevil any comparisons. These details are discussed at greater length in [10] where the algorithms are presented in step-description form. Chapter 19 of [10] is partly founded on the present study.
Direct Rayleigh Quotient Minimization

As a result of other research, a conjugate gradients program for general function minimization was available [11] and our initial hope was to apply this directly to the minimization of (4) or (5). The major difficulty this reveals is that both these functions are homogeneous of degree zero, implying that the scale or normalization of x is arbitrary. This causes the matrix of second partial derivatives, or Hessian, of the function R(x) to be only positive semidefinite, violating some of the assumptions which underlie the cg algorithm. Bradbury and Fletcher [5] adopted a strategy which restricts x to lie on a convex surface, thus removing the troublesome degree of freedom at the expense of some extra work within the program. One could also fix one of the elements of x and allow all the others to vary, except that the chosen component should perhaps be very small compared to the rest, resulting in very large alterations in these at each step of the algorithm. Rather than employ specific constraints, we tried various penalty functions to impose a normalization on x. These are summarized in Table I. Unfortunately we must report that none of these techniques can be considered generally useful due to the slow convergence to the minimum of the function.
Our tests were primarily made using the biharmonic matrix defined by Ruhe [12] with B the identity. Primarily the general minimization is inefficient because the line search is inexact. For the Rayleigh quotient without any penalty function to maintain normalization this can be corrected since the gradient of (4) is

$g = 2\,(A x - R\,B x) / x^T B x$ (11)

so that the one dimensional minimization of R(x + h t) with respect to the step length h is accomplished by the solution of a quadratic equation. This is straightforward, though not to be underestimated (see Acton [13]). Apart from the search, the Rayleigh quotient minimization can be organized so the recurrence (6) generates search directions which are conjugate with respect to the local Hessian matrix. This avoids search directions which cause large growth in the elements of x. The approach is due to Geradin [14]. However, he does not specify the details of his implementation. That which we have adopted is given in full by Nash [10].
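The quadratic equation for the step length can be made explicit. Along the line x + h t both the numerator and denominator of R are quadratics in h, and setting dR/dh = 0 leaves a quadratic because the cubic terms cancel. The Python sketch below is one way to carry this out under stated assumptions (symmetric A and B, hypothetical function names, the textbook quadratic formula); it is not the implementation of Geradin or of Nash [10], and a careful code would guard the root computation against cancellation, as the reference to Acton suggests.

```python
import math


def matvec(M, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def rayleigh_line_search(A, B, x, t):
    """Return (h, R) minimizing R(x + h t) over the stationary points."""
    At, Bt = matvec(A, t), matvec(B, t)
    a0, a1, a2 = dot(x, matvec(A, x)), dot(x, At), dot(t, At)
    b0, b1, b2 = dot(x, matvec(B, x)), dot(x, Bt), dot(t, Bt)

    def R(h):
        return (a0 + 2.0 * a1 * h + a2 * h * h) / (b0 + 2.0 * b1 * h + b2 * h * h)

    # dR/dh = 0 reduces to u h^2 + v h + w0 = 0
    u = a2 * b1 - a1 * b2
    v = a2 * b0 - a0 * b2
    w0 = a1 * b0 - a0 * b1
    if u == 0.0:
        if v == 0.0:
            return 0.0, R(0.0)       # R is constant along this line
        roots = [-w0 / v]            # single stationary point (may be a maximum)
    else:
        s = math.sqrt(max(v * v - 4.0 * u * w0, 0.0))
        roots = [(-v + s) / (2.0 * u), (-v - s) / (2.0 * u)]
    h = min(roots, key=R)            # keep the root giving the smaller R
    return h, R(h)


h_best, R_best = rayleigh_line_search(
    [[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [0.0, 1.0]
)
```

In this example the two roots are h = -1 and h = +1, corresponding to the vectors [1, -1] and [1, 1]; the search keeps the first, where R equals the lowest eigenvalue.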
Inverse Iteration by Conjugate Gradients

Except by the use of orthogonalization or projection techniques Rayleigh quotient minimization cannot find other than the extreme eigensolutions of (1). However, inverse iteration can find an eigensolution with eigenvalue closest to any prescribed shift k provided a starting vector not totally deficient in the required eigenvector is given. (In some cases this is a nontrivial provision. Furthermore, for multiple eigenvalues we can only find one eigenvector from the relevant subspace unless some orthogonalization scheme is employed.) Comparing equations (2) and (9) shows that to use the cg method we should make the identification

$C = (A - kB); \quad w = -B x_i; \quad x = y_i$ (12)
However, C was supposed to be positive definite, which is now impossible if k is greater than the smallest (most negative) eigenvalue of our problem (1). Then one may try to solve the least squares problem

$(A - kB)^T (A - kB)\,y_i = (A - kB)^T B x_i$ (13)
which does have a positive semidefinite coefficient matrix but which increases the amount of work and worsens the numerical condition of the equation system. Following Ruhe and Wiberg [4] we have opted to ignore the requirement that C be positive definite, thereby discarding the convergence results which have been proven for the cg algorithm applied to linear equations. To compensate for this, we must check that the step length computed in (10) is not too large, since this would imply that the shift k is too close to an eigenvalue. This poses the dilemma:

1) For rapid convergence of inverse iteration it is desirable to have k as close to an eigenvalue as possible.

2) For the cg algorithm to work without overflow, k should not be too close to an eigenvalue.
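A minimal Python sketch of cg-based inverse iteration with a step-length check is given below. It is an illustration under assumptions, not the authors' INVIT program: the function names are hypothetical, g is taken as the residual C y + w, the parameter in (6) uses the Fletcher-Reeves ratio, a fixed number of outer iterations is used, and the step-length limit is assumed to be the reciprocal of the square root of the machine precision, the tolerance quoted in this section.

```python
import math
import sys


def matvec(M, x):
    """Product of a dense matrix (list of rows) with a vector."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def cg_solve_guarded(C, w, h_max, tol=1e-30):
    """Solve C y = -w by cg, halting if the step length (10) is too large."""
    n = len(w)
    y = [0.0] * n
    g = list(w)
    t = [-gi for gi in g]
    gg = dot(g, g)
    for _ in range(n):
        if gg <= tol:
            break
        Ct = matvec(C, t)
        tCt = dot(t, Ct)
        if tCt == 0.0 or abs(gg / tCt) > h_max:
            raise ArithmeticError("step length too large: shift too close to an eigenvalue")
        h = gg / tCt
        y = [yi + h * ti for yi, ti in zip(y, t)]
        g = [gi + h * ci for gi, ci in zip(g, Ct)]
        gg_new = dot(g, g)
        t = [(gg_new / gg) * ti - gi for ti, gi in zip(t, g)]
        gg = gg_new
    return y


def inverse_iteration(A, B, k, x, iterations=15):
    """Iterate (A - kB) y_i = B x_i, x_{i+1} = y_i / ||y_i||; return (e, x)."""
    n = len(x)
    C = [[A[i][j] - k * B[i][j] for j in range(n)] for i in range(n)]
    h_max = 1.0 / math.sqrt(sys.float_info.epsilon)
    for _ in range(iterations):
        w = [-bi for bi in matvec(B, x)]           # identification (12)
        y = cg_solve_guarded(C, w, h_max)
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
    e = dot(x, matvec(A, x)) / dot(x, matvec(B, x))  # Rayleigh quotient (4)
    return e, x


e_near, x_near = inverse_iteration(
    [[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]], 0.5, [1.0, 0.0]
)
```

In this example the shift 0.5 lies below both eigenvalues (1 and 3), so C happens to be positive definite and the iteration settles on the eigenvalue 1. Moving the shift ever closer to an eigenvalue speeds the outer iteration but makes the inner step lengths grow, which is the dilemma stated above.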
Ruhe and Wiberg [4] nevertheless used cg-inverse iteration to refine eigenvectors, using shifts very close to the eigenvalue, afterwards refining the eigenvalue itself by means of the Rayleigh quotient. We have applied their method to the more difficult case where the starting vector may contain only a small component of the desired eigenvector and only a crude estimate of the eigenvalue may be available. In the case where the step length (10) exceeds some tolerance (the reciprocal of the square root of the machine precision) our program halts, since we have yet to be convinced that a suitable automatic shifting procedure can be devised. As a control on the progress of the inverse iteration we compute the residuals

$r = (A - sB)\,x$ (14)
where s is the current estimate of the eigenvalue. The sum of squares $r^T r$ provides a convenient convergence test. Unfortunately different normalizations arise in the implementations of this algorithm and that of Geradin, so that the convergence criteria are not directly comparable. We have so far chosen not to perform the extra computation needed to make the convergence criteria identical.

Hybrid Algorithms
Having now several programs at our disposal, it is tempting to consider, say, combining a function minimization with inverse iteration to refine the vector. Early though quite extensive trials with the general minimization algorithm using function D of Table I followed by inverse iteration as in the previous section showed that this combination could be more effective than function minimization alone. However, if only the extreme eigensolutions are desired, the Geradin algorithm can in our experience supply an accurate solution more efficiently than any combination method, including Geradin with inverse iteration.

Discussion and Examples

Table II lists the programs either developed or used for comparison purposes within this study. We tentatively recommend the algorithms which underlie those called GER and INVIT. The reader is advised that this can be no more than a tentative recommendation since work is still going on in this area. For instance, the authors understand that Prof. G. W. Stewart of the University of Maryland has been developing algorithms for the sparse eigenproblem. His bibliography [16] provides a wealth of possible directions for research. At the time of writing, we are still investigating the cg methods of Fried [17].

There remain to be settled the questions of (1) convergence criteria and (2) starting values for the shift and the initial vector. For the program GER convergence is assumed when the gradient norm squared ≤ [...]. For an initial vector we suggest simply a column of ones.
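The inverse iteration scheme discussed above, with the residual of equation (14) as a progress check and a column of ones as the starting vector, might be sketched as follows. This is a minimal illustration and not the authors' BASIC program INVIT: it assumes B is the identity, uses a shift below the smallest eigenvalue so that the shifted matrix is positive definite, and the function names, tolerances and the small test matrix are our own choices.

```python
import math

def matvec(A, x):
    """Dense matrix-vector product A x (A is a list of rows)."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cg_solve(A, shift, b, tol=1e-28):
    """Conjugate gradients for (A - shift*I) y = b, starting from y = 0."""
    n = len(b)
    y = [0.0] * n
    r = list(b)              # residual b - (A - shift*I) y
    p = list(r)              # first search direction
    rr = dot(r, r)
    for _ in range(2 * n):
        if rr <= tol:
            break
        Mp = [ap - shift * pj for ap, pj in zip(matvec(A, p), p)]
        alpha = rr / dot(p, Mp)
        y = [yi + alpha * pi for yi, pi in zip(y, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return y

def inverse_iteration(A, shift=0.0, tol=1e-20, max_outer=50):
    """Inverse iteration with B = I; returns (eigenvalue, vector, r^T r, iterations)."""
    n = len(A)
    x = [1.0 / math.sqrt(n)] * n          # normalized column of ones
    for iterations in range(1, max_outer + 1):
        y = cg_solve(A, shift, x)
        norm = math.sqrt(dot(y, y))
        x = [yi / norm for yi in y]
        Ax = matvec(A, x)
        s = dot(x, Ax)                    # Rayleigh quotient (x has unit length)
        r = [axi - s * xi for axi, xi in zip(Ax, x)]   # residual (14) with B = I
        rss = dot(r, r)                   # sum of squares used as convergence test
        if rss < tol:
            break
    return s, x, rss, iterations

# Hypothetical 3 x 3 test matrix; its smallest eigenvalue is 2 - sqrt(2).
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
eigenvalue, vector, rss, iterations = inverse_iteration(A)
print(eigenvalue, rss, iterations)
```

The Rayleigh quotient is used here both as the refined eigenvalue and as the value s in the residual; a program that keeps the shift and the eigenvalue estimate separate would differ in that detail.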
Table I: Functions which may be minimized to solve the eigenproblem $A x = e x$.

Function | Comments
A) $x^T A x / x^T x$ | Smallest eigenvalue only. Prone to produce large vector elements since no normalization imposed.
B) $x^T A x / x^T x + \zeta (x^T x - 1)^2$ for $\zeta > 0$ | Usually finds smallest eigenvalue and normalized vector.
C) $x^T (A - kI)^2 x + (x^T x - 1)^2$ | Least squares. This is not a Rayleigh quotient. Difficulties in finding certain solutions unless starting vector "good".
D) $x^T (A - kI)^2 x / x^T x + (x^T x - 1)^2$ | Extremely slow when eigenvalues poorly separated.
E) $x^T x / (x^T (A - kI)^2 x) + (x^T x - 1)^2$ | An unsuccessful attempt to separate close eigenvalues. Slow convergence.

Table II: Programs appearing in the study.

Program | Description
GER | The Geradin algorithm as implemented in [10].
INVIT | Inverse iteration using conjugate gradients solution of the linear equations (2).
TQL | A BASIC version of TQL1 in [15] for the eigenvalues only of a symmetric tridiagonal matrix.
SSE | Eigenvalues only of a symmetric tridiagonal matrix using Sturm sequences [3].
NES | Nesbet's algorithm [7].
MOR | Method of optimal relaxation [6].
CGT(B) | General conjugate gradients function minimizer using penalized Rayleigh quotient (B) of Table I with ζ = 1.
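To make the remarks on normalization concrete, the sketch below evaluates the plain Rayleigh quotient and a penalized form with a $(x^T x - 1)^2$ term, as in functions A and B of Table I. The 2 x 2 matrix and the function names are hypothetical choices for illustration only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def rayleigh_quotient(A, x):
    """Plain Rayleigh quotient x^T A x / x^T x."""
    return dot(x, matvec(A, x)) / dot(x, x)

def penalized_rayleigh_quotient(A, x, zeta=1.0):
    """Rayleigh quotient plus a penalty zeta * (x^T x - 1)^2 on the vector length."""
    return rayleigh_quotient(A, x) + zeta * (dot(x, x) - 1.0) ** 2

# Hypothetical matrix with eigenvalues 1 and 3; (1, -1) is an eigenvector for 1.
A = [[2.0, 1.0], [1.0, 2.0]]
x = [1.0, -1.0]
unit = [c / 2.0 ** 0.5 for c in x]

print(rayleigh_quotient(A, x), rayleigh_quotient(A, [10.0, -10.0]))
print(penalized_rayleigh_quotient(A, x), penalized_rayleigh_quotient(A, unit))
```

The plain quotient is unchanged when the vector is rescaled, so a minimizer has nothing to stop the elements from growing, whereas the penalized form is smallest at the unit-length eigenvector.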
The gradient norm used in the GER convergence test is not normalized. In some cases, however, a starting vector of ones is lacking a component in the direction of the desired eigenvector, especially in cases where there are symmetries (other than that about the principal diagonal) in the matrices or the matrix elements are integers. In such cases we have used a pseudo-random number generator to produce elements in the interval (-0.5, 0.5) if the vector of ones seemed inappropriate.
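The difficulty with a vector of ones can be seen on a very small example. The sketch below is an illustration under our own assumptions (a hypothetical 2 x 2 matrix and helper names, and Python's standard generator rather than whatever generator the authors used); it measures the component of each starting vector along the eigenvector belonging to the smallest eigenvalue.

```python
import random

def start_vector(n, use_random=False, seed=1):
    """Column of ones, or pseudo-random elements in [-0.5, 0.5)."""
    if not use_random:
        return [1.0] * n
    rng = random.Random(seed)
    return [rng.random() - 0.5 for _ in range(n)]

def component_along(x, v):
    """Projection of x on the direction of v."""
    norm_v = sum(c * c for c in v) ** 0.5
    return sum(a * b for a, b in zip(x, v)) / norm_v

# For the symmetric matrix [[2, 1], [1, 2]] the smallest eigenvalue is 1,
# with eigenvector (1, -1); the matrix is also symmetric about its anti-diagonal.
v_min = [1.0, -1.0]
ones = start_vector(2)
rand = start_vector(2, use_random=True)

print(component_along(ones, v_min))   # the vector of ones has no such component
print(component_along(rand, v_min))
```

Because the vector of ones is orthogonal to the wanted eigenvector here, an iteration started from it can only reach that eigenvector through rounding error, which is the situation the random start avoids.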
The algorithms have been tested in BASIC in a 6000 byte partition of a Data General NOVA operating in 23 bit binary floating point arithmetic. This corresponds quite closely to some of the small desk-top computers in size of memory. Some of the programs (GER, INVIT, and CGT( )) were also tested on an IBM 370/168 and/or Amdahl V6 in single precision (six hexadecimal digits).

GER and INVIT are quite short, being approximately 100 to 150 lines of BASIC depending on the code for initialization, normalization and printing of results. GER was implemented and tested once in less than two hours from a step-description recipe. That is, less than two hours elapsed from the moment the first line of BASIC was written until the computer printed correct results for a test problem. INVIT is slightly longer in lines of program, but requires only 1918 bytes of storage compared to 2022 for GER. Moreover, it needs only 6 instead of 7 working vectors, so that the order 100 problem reported in Table III took 4962 bytes to run with GER but only 4426 with INVIT. Both programs require sub-programs to compute the results of the multiplications of a vector $x$ by $A$ and by $B$, that is, $v = A x$ and the corresponding product $B x$ (15).

By way of illustration, consider the tridiagonal matrix defined by

$$ A_{ii} = -2 + 1/[(n+1)^2 + i^2] \quad \text{for } i = 1, 2, \ldots, n $$

$$ A_{i,i+1} = A_{i+1,i} = 1 \quad \text{for } i = 1, 2, \ldots, n-1 \qquad (16) $$
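The matrix of equation (16) and the kind of matrix-vector sub-program mentioned in (15) can be sketched as below, together with a Sturm-sequence bisection for one extreme eigenvalue. This is an illustration under stated assumptions and not one of the programs of Table II: the function names are ours, only the diagonal and off-diagonal are stored, and we assume that the positive values tabulated for this matrix correspond to the eigenvalue of A nearest zero with its sign reversed (equivalently, the smallest eigenvalue of -A).

```python
def froberg(n):
    """Diagonal and off-diagonal of the tridiagonal matrix of equation (16)."""
    diag = [-2.0 + 1.0 / ((n + 1) ** 2 + i ** 2) for i in range(1, n + 1)]
    off = [1.0] * (n - 1)
    return diag, off

def tridiag_matvec(diag, off, x):
    """v = A x for a symmetric tridiagonal A stored as (diag, off)."""
    n = len(diag)
    v = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        v[i] += off[i] * x[i + 1]
        v[i + 1] += off[i] * x[i]
    return v

def count_below(diag, off, lam):
    """Sturm count: number of eigenvalues strictly less than lam."""
    count = 0
    q = diag[0] - lam
    if q < 0.0:
        count += 1
    for i in range(1, len(diag)):
        if q == 0.0:
            q = 1e-300            # avoid division by zero
        q = diag[i] - lam - off[i - 1] ** 2 / q
        if q < 0.0:
            count += 1
    return count

def largest_eigenvalue(diag, off, tol=1e-12):
    """Bisection between Gershgorin bounds for the algebraically largest eigenvalue."""
    n = len(diag)
    lo = min(diag) - 2.0
    hi = max(diag) + 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) == n:
            hi = mid              # every eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

diag, off = froberg(4)
lam = largest_eigenvalue(diag, off)
print(-lam)   # magnitude of the eigenvalue of A nearest zero, order 4
```

For order 4 the printed magnitude should be close to the value near 0.35014 reported for this matrix in Table III, which is what motivates the sign assumption above.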
We call this the Froberg matrix. B will be set to the identity. On the NOVA we computed the smallest eigensolution of this matrix for orders n = 4, 10, 50, and 100. The results, with comparable values for other programs, are given in Table III. Very similar results were found on the IBM 370. They illustrate what we have observed in all cases to date, that is, that when only the smallest (or largest) eigenvalue is desired the program GER is the most efficient yet produces accurate solutions. Using this program with the matrix

$$ (A - kI)^2 \qquad (17) $$

to obtain a solution with eigenvalue closest to k causes a very substantial slow-down in the rate of convergence. This is hardly surprising since we have squared the condition number of the matrix as well as doubling the work in each matrix-vector product step of the program. The general function minimizer CGT(B) performs reasonably well on these matrices but uses roughly five times as much effort. Inverse iteration (INVIT) is almost as tedious. However, the rate of convergence here is governed by the ratio

$$ \rho = (e_1 - k)/(e_2 - k) \qquad (18) $$

where $e_1$ is the eigenvalue closest to k and $e_2$ the next closest. INVIT in its major iteration converges as fast as powers of ρ tend to zero. Fortunately, as the eigenvector begins to dominate
Table III: Minimal eigensolution of Froberg's matrix.

Program | Order | Eigenvalue | Rayleigh quotient | Matrix products
GER | 4 | | 0.350144 | 5
INVIT | 4 | 0.350142 | 0.350144 | 27 (7)
TQL | 4 | 0.350145 | |
SSE | 4 | 0.350144 | |
CGT(B) | 4 | | 0.350144 | 36
NES | 4 | | 0.350144 | 10
MOR | 4 | | 0.350505 | >1000
GER | 10 | | 7.44406E-2 | 11
INVIT | 10 | 7.44379E-2 | 7.44407E-2 | 43 (6)
TQL | 10 | 7.44434E-2 | |
SSE | 10 | 7.44405E-2 | |
CGT(B) | 10 | | 7.44407E-2 | 57
NES | 10 | | failed |
MOR | 10 | | 7.48813E-2 | >1000
GER | 50 | | 3.48733E-3 | 26
INVIT | 50 | 3.48618E-3 | 3.4873E-3 | 147 (5)
TQL | 50 | 3.49247E-3 | |
SSE | 50 | 3.48753E-3 | |
CGT(B) | 50 | | 3.48748E-3 | 123
GER | 100 | | 8.89216E-4 | 51
INVIT | 100 | 8.86582E-4 | 8.89202E-4 | 242 (4)
TQL | 100 | 9.64072E-4 | |
SSE | 100 | 8.89313E-4 | |

The figure in parentheses after the number of matrix products is the number of inverse iterations. In all cases the shift k = 0.
$x$, the cg linear equations solution is usually accomplished in many fewer than n matrix-vector products. This has also been observed by Ruhe and Wiberg [4]. The number of inverse iterations, starting from a shift k = 0, has been included in parentheses behind the number of matrix-vector products in Table III.

Acknowledgements

While this study used very minimal computing resources, we wish to acknowledge time provided on the Data General NOVA and IBM 370 (Datacrown Ltd.) at Agriculture Canada and the Amdahl V6 at the University of Alberta.

Literature Cited

[1] Hestenes M.R. and Stiefel E., J. Res. Nat. Bur. Standards Section B (1952) 49, 409-436.
[2] Fletcher R. and Reeves C.M., Computer Journal (1964) 7, 149-154.
[3] Wilkinson J.H., "The algebraic eigenvalue problem", Clarendon Press, Oxford, 1965.
[4] Ruhe A. and Wiberg T., BIT (1972) 12, 543-554.
[5] Bradbury W.W. and Fletcher R., Numer. Math. (1966) 9, 259-267.
[6] Shavitt I., Bender C.F., Pipano A., and Hosteny R.P., J. Computational Physics (1973) 11, 90-108.
[7] Nesbet R.K., J. Chem. Phys. (1965) 43, 311-312.
[8] Davidson E.R., J. Computational Physics (1975) 17, 87-94.
[9] Beale E.M.L., in Lootsma F.A., "Numerical methods for nonlinear optimization", 39-44, Academic Press, London, 1972.
[10] Nash J.C., "Compact numerical methods: linear algebra and function minimization", to be published, probably in 1978.
[11] Nash J.C., "Function minimization with small computers", submitted to ACM Trans. Math. Software, 1976.
[12] Ruhe A., in Collatz L., "Eigenwerte Probleme", 97-115, Birkhäuser Verlag, Basel, 1974.
[13] Acton F.S., "Numerical methods that work", 58-59, Harper & Row, New York, 1970.
[14] Geradin M., J. Sound Vib. (1971) 19, 319-331.
[15] Bowdler H., Martin R.S., Reinsch C., and Wilkinson J.H., Numer. Math. (1968) 11, 293-306.
[16] Stewart G.W., in Bunch J.R. and Rose D.J., "Sparse matrix computations", 113-130, Academic Press, New York, 1976.
[17] Fried I., J. Sound Vib. (1972) 20, 333-342.
3 Large Scale Simulation with a Minicomputer
Β. E. ROSS, PAULA JERKINS, and JAMES KENDALL
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch003
College of Engineering, University of South Florida, Tampa, FL 33620
This paper describes the alteration and development of an Interdata 7/16 minicomputer to perform large-scale computations. Substantial savings in cost and overall turnaround time resulted from performing the calculations with the minicomputer instead of the previously used IBM 360.

Introduction

Since 1969, students and faculty of the College of Engineering at the University of South Florida have been developing a co-ordinated set of digital computer models for environmental simulation. Calculations of the hydrodynamical, chemical and biological aspects of estuarine areas are included. Large-scale physical areas are simulated over long real time periods and the complexity of the interactions of the models results in large-scale computations. A natural consequence of performing the simulations with a general purpose IBM 360 is rapid execution but very delayed turnaround time due to system priorities and other user demands.

The alternative of adapting and upgrading an existing Interdata computer was studied in detail. The developments in minicomputer technology in recent years have increased the obtainable operating speed of the central processing units. Large-scale main memories and large-scale auxiliary memories have become available for economical minicomputer development.

The computer programs which are implemented with data and become the simulation models are carefully co-ordinated into subroutines which lend themselves to overlay techniques. The principal numerical schemes involved are explicit so that core requirements and run time are flexible and interchangeable. The combination of subroutines and explicit solution greatly simplified the transition of the programs from IBM to Interdata.
The Simulation Problem

The tasks to be performed by the computer involve the numerical solution of the vertically integrated equations of motion, continuity, and mass transport with chemical and biological interactions in two dimensions. The basic equations are as follows:

$$ \frac{\partial U}{\partial t} + \frac{U}{D}\frac{\partial U}{\partial x} + \frac{V}{D}\frac{\partial U}{\partial y} - \Omega V = -gD\frac{\partial H}{\partial x} - f\,Q\,U\,D^{-2} - \frac{D}{\rho}\frac{\partial P}{\partial x} \qquad (1) $$

$$ \frac{\partial V}{\partial t} + \frac{U}{D}\frac{\partial V}{\partial x} + \frac{V}{D}\frac{\partial V}{\partial y} + \Omega U = -gD\frac{\partial H}{\partial y} - f\,Q\,V\,D^{-2} - \frac{D}{\rho}\frac{\partial P}{\partial y} \qquad (2) $$

$$ \frac{\partial H}{\partial t} + \frac{\partial U}{\partial x} + \frac{\partial V}{\partial y} = 0 \qquad (3) $$

$$ \frac{\partial C_i}{\partial t} + u\frac{\partial C_i}{\partial x} + v\frac{\partial C_i}{\partial y} = \frac{1}{D}\frac{\partial}{\partial x}\left(D E_x \frac{\partial C_i}{\partial x}\right) + \frac{1}{D}\frac{\partial}{\partial y}\left(D E_y \frac{\partial C_i}{\partial y}\right) + N_i \qquad (4) $$

where
U = ud = transport in the x direction
V = vd = transport in the y direction
D = local water depth
Ω = Coriolis parameter
H = local water surface elevation
f = local friction factor
Q = $[U^2 + V^2]^{1/2}$
P = atmospheric pressure
$C_i$ = concentrations of water quality parameters or biota
$N_i$ = interaction process and sources or sinks
ρ = mass density
$E_x$, $E_y$ = dispersion coefficients
For the solution scheme the equations are reduced to finite difference form and solved on a square grid matrix. An example of the application of the model to Hillsborough Bay, Florida, is shown in Figure One. Figure One shows the distribution of dissolved oxygen in the Bay resulting from the discharge of pollutants from industrial, municipal and natural sources, and the interaction of biota.

Figure 1. Hillsborough Bay, Florida
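As an illustration of what an explicit finite difference form on a square grid can look like, the sketch below advances the water surface elevation H by one time step from a continuity balance between the time change of H and the divergence of the transports U and V. The differencing is our assumption (forward in time, centered in space, interior nodes only); the source states only that the schemes are explicit, and the grid spacing and test values are hypothetical.

```python
def continuity_step(H, U, V, dt, dx):
    """One explicit step of dH/dt = -(dU/dx + dV/dy) on the interior of a square grid.

    H, U and V are lists of rows indexed [i][j], with i along x and j along y.
    Boundary nodes are left unchanged in this sketch.
    """
    nx, ny = len(H), len(H[0])
    H_new = [row[:] for row in H]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            dUdx = (U[i + 1][j] - U[i - 1][j]) / (2.0 * dx)
            dVdy = (V[i][j + 1] - V[i][j - 1]) / (2.0 * dx)
            H_new[i][j] = H[i][j] - dt * (dUdx + dVdy)
    return H_new

# Hypothetical 3 x 3 test: transport U grows linearly in x, V is zero.
dx, dt = 100.0, 90.0
H = [[1.0] * 3 for _ in range(3)]
U = [[0.5 * i] * 3 for i in range(3)]
V = [[0.0] * 3 for _ in range(3)]
H_next = continuity_step(H, U, V, dt, dx)
print(H_next[1][1])
```

Because each new value depends only on values from the previous step, the grid can be processed a block at a time, which is the property that makes overlay techniques and limited core workable.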
The number of grid elements involved in the simulation is 90 x 36, or 3240 elements. There are 10 variables involved with each element. Calculations are performed for the hydraulic program in intervals of 90 seconds of real time for 24 hours. Numerous auxiliary calculations updating coefficients must be performed. The hydraulic portion of this simulation (the solution of equations 1, 2, and 3) is the set of calculations that was chosen for the computer comparison purposes.
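A rough sizing of the problem follows directly from these figures. The sketch below is back-of-envelope arithmetic under our own assumptions: every variable is taken to occupy one 32 bit word, and "65K bytes" of core is read as 65 x 1024 bytes.

```python
GRID_NX, GRID_NY = 90, 36
VARIABLES_PER_ELEMENT = 10
BYTES_PER_VALUE = 4            # assumption: one 32 bit word per stored value
TIME_STEP_SECONDS = 90
SIMULATED_HOURS = 24
CORE_BYTES = 65 * 1024         # assumption about the meaning of "65K bytes"

elements = GRID_NX * GRID_NY
stored_values = elements * VARIABLES_PER_ELEMENT
storage_bytes = stored_values * BYTES_PER_VALUE
time_steps = SIMULATED_HOURS * 3600 // TIME_STEP_SECONDS
fits_in_core = storage_bytes <= CORE_BYTES

print(elements, stored_values, storage_bytes, time_steps, fits_in_core)
```

Under these assumptions the full set of grid variables alone would exceed the core memory, which is consistent with the emphasis on overlay techniques and disc storage.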
The Interdata Computer

Basically the Interdata machine was a part of an experimental data reduction system. The cpu handles numbers at 16 bits, two bytes at a time internally. The word length is 32 bits so the roundoff error is the same as that for the IBM. The original machine had 8K bytes of memory and a magnetic tape unit. After an examination of the problem involved with overlay techniques, the possibility was found that adequate storage could be obtained by using 65K bytes of core and a 50 M byte disc. Both of these units were available for the Interdata. The configuration of Interdata computer finally implemented for the simulation comparisons is described as follows:

Supplier | Unit | Price
INTERDATA | M71-012 7/16 CPU | 3,700
INTERDATA | M71-101 Binary Display Panel | 350
INTERDATA | M71-103 Automatic Loader | 400
INTERDATA | M71-104 Power Fail/Auto Restart | 400
INTERDATA | M71-105 Signed Multiply/Divide | 950
INTERDATA | M71-106 High Speed ALU | 5,000
INTERDATA | M46-004 ASR-33 Teletype | 1,950
INTERDATA | M48-024 Current Loop Interface | 400
INTERDATA | M46-500 9 Track 800 BPI Magtape Interface | 2,950
INTERDATA | M46-501 9 Track 800 BPI Magtape Transport | 6,000
INTERDATA | M47-102 RS-232 Interface | 500
 | Subtotal | $22,600
BALL COMPUTER | BD-50 50M Byte 3330 Type Disc Drive | 7,000
MINI-COMPUTER | TDC-803 3330 Disc Interface | 1,900
PUSHPA | PM9800 65K Byte Memory | 4,000
 | Subtotal | $35,500
HAZELTINE | 2000 Video Terminal & Printer | 4,000
 | Total | $39,500

The hardware selected is supported by a Disc Operating System (DOS) supplied by Interdata. This is not the most sophisticated operating system available but is sufficient to meet the immediate needs. The installed version has capabilities as outlined below:

D.O.S. (Disc Operating System)
I. System Utility:
   A. Copy Files
   B. Compress/Decompress
   C. Disc Backup
II. Disc File Management:
   A. Allocate Files
   B. Delete Files
   C. Protect Files
   D. List Files
   E. Position to a Subfile
III. Software:
   A. FORTRAN V Level 1
   B. Extended Basic
   C. OS Editor
   D. OS Library Loader
   E. OS Assembler
   F. OS Aids, Debugger

The language used in the simulation programs is FORTRAN V, level 1, which is a high level compiler supporting the requirements of ANSI standard FORTRAN and includes significant extension in both language and subroutine library to support process control applications and multi-tasking programs.

Features of FORTRAN V which may be of major importance to the FORTRAN programmer are:
(a) Mixed mode arithmetic is allowed
(b) Array initialization and implied DOs in DATA statements are provided for
(c) Multiple entries into FORTRAN subroutines are provided
(d) Hollerith constants may be declared using the apostrophe as a delimiter

Features which may be of interest to a FORTRAN programmer using FORTRAN V as a process control language are:
(a) The use of in-line assembly language is allowed
(b) Encode/decode statements are allowed
(c) Hexadecimal and character constants may appear as arguments in expressions as well as in DATA statements and call parameter lists
(d) Analog input in a sequential order is allowed
(e) Analog input in any sequence is allowed
(f) Analog output in any sequence is allowed
(g) Logical functions intended to support the Instrument Society of America/Purdue Standards are available

FORTRAN V contains several features to simplify and expedite program debugging such as:
(a) Over 60 compile-time diagnostics are provided
(b) 35 run-time error messages are provided
(c) Run-time trace capability is provided
(d) Optional compilation is provided which facilitates insertion of the programmer's diagnostics and allows these to be easily deleted from a program without physical removal.
Standard Operating Procedures with the Interdata

The main programs exist in source form on the 50 M Byte disc. These can be called by an operator. Basic input data are entered by the operator in an interactive mode. The program goes immediately to the compile and run modes. Intermediate calculated data such as hydraulic velocities and water depths are stored on the disc temporarily and written to magnetic tape at selected real time intervals. The total quantity of calculated data in this step is too great to be stored intact upon the 50 M Byte disc. Upon completion of the calculation of the hydraulics of a bay, the water quality program is called. The water quality program uses the hydraulic data from the magnetic tape to calculate chemical and biological results. Longer real time intervals are used in the water quality calculations, thus less data are calculated. The calculated results are now stored upon disc for printout, or for writing onto another magnetic tape.
Comparisons of Costs and Time

The simulation calculations were alternately made by use of an IBM 360, with 3 Megabyte active core and almost unlimited disc space. However, this system is in a University and hosts many languages and supports diverse users' needs so that much of the capability is not available to a user. The IBM system is supported by the usual IBM operating system. The usual debugging programs and some professional assistance are available by appointment.

Comparisons are made for comparable computer environmental simulations. Two times are important. These are cpu time and turnaround time. Another parameter of interest is cost. The results indicate that the test program utilized 4612.65 cpu seconds in the IBM machine. The machine elapsed time was 3 hours. The best turnaround for the simulation was 24 hours. Usual turnaround times are on the order of 72 hours. The cost of the test calculation on the IBM system was $303.43.

A comparable run was performed on the Interdata 7/16 before the hardware high speed floating point arithmetic unit was installed. The cpu time was 74 hours which was also the turnaround time. Costs of computation were based on the following factors:

Factor | Cost per year
$34,300 amortized 4 years | $8,575
Interest 1st year | 3,087
1/3 Technician time for general maintenance | 4,000
Field repairs by Interdata per year | 500
Total | $16,162

If the computer is used 50% of the time, this is $88.56/day or $3.69/hr. Thus, based on 74 hours the cost of this long run was $273.06.

Analysis of the cpu usage during the long Interdata run indicated that 85% of the time was spent in software operations involving floating point arithmetic. A high speed floating point arithmetic unit (HSALU) was investigated. Analysis showed that if the same simulation was performed with HSALU the results studied would indicate a cpu time of 5.69 hours which is also the turnaround time. The costs are adjusted to reflect an additional investment of $5,200 and the results indicate a cost of $4.09/hour. The cost of this hypothetical run is $23.28. Installation of the HSALU and subsequent simulation confirmed the expected results. Thus, the new run at high speed saved $280.15 and 18 hours of turnaround time. The results are summarized in Figure Two where time and dollars have been rounded to the nearest integer.

Figure 2. Comparisons for runs yielding identical results (cpu time and turnaround time in hours, and cost, for the IBM 360, the Interdata with HSALU, and the Interdata without HSALU; HSALU = High Speed Arithmetic Unit)
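The cost figures quoted above can be retraced with a few lines of arithmetic. The sketch below is our reconstruction and not the authors' worksheet: it assumes a 365 day year, and it infers an interest rate of 9% from the stated first-year interest of $3,087 on $34,300; the same rate and the same maintenance figures are assumed to apply after the additional $5,200 investment.

```python
def hourly_rate(investment, years=4, interest_rate=0.09,
                technician=4000.0, field_repairs=500.0, utilization=0.5):
    """Return (annual cost, cost per day, cost per hour) for a machine used part time."""
    annual = investment / years + interest_rate * investment + technician + field_repairs
    per_day = annual / (365 * utilization)
    return annual, per_day, per_day / 24

annual, per_day, rate = hourly_rate(34300.0)
run_cost = 74 * round(rate, 2)                 # 74 hour run without the HSALU

annual_hs, per_day_hs, rate_hs = hourly_rate(34300.0 + 5200.0)
saving = 303.43 - 23.28                        # IBM cost minus the quoted HSALU run cost

print(round(annual), round(per_day, 2), round(rate, 2), round(run_cost, 2))
print(round(rate_hs, 2), round(saving, 2))
```

With these assumptions the sketch reproduces the annual total, the daily and hourly rates, the cost of the 74 hour run and the $4.09 hourly rate. Multiplying 5.69 hours by the rounded rate gives a value within a cent or two of the quoted $23.28, presumably because the quoted cpu time is itself rounded.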
Conclusions

The conclusions are that large-scale computations can be accomplished on modern minicomputers with savings in time and money. Accuracy is not sacrificed in 32 bit word machines and maintenance and reliability appear to be realistic in cost. Programs and numerical techniques must be compatible with the size of the machine chosen.
4 APL Level Languages in Analysis
A Host-Microcomputer-Instrument Hierarchy in Light Scattering Spectroscopy
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch004
J. ADIN MANN, ROBERT V. EDWARDS, THOMAS GALL, H. M. CHEUNG, F. COFFIELD, C. HAVENS, and P. WAGNER
Department of Chemical Engineering, Case Western Reserve University, Cleveland, OH 44106

So far as we know this is the first report in the open literature of a laboratory experiment instrumented totally within the context of APL as the computer language driving each level of a hierarchy. The significance of our report is the demonstration that such a high level language reduces by orders of magnitude the effort required to implement complicated experiments involving elements of control, data acquisition, data processing, and modelling of complex phenomena. It is especially easy to retain the degree of human interaction required by the experiment. All of these desirable results can be accomplished by persons with no formal training in computer science.

Certainly other languages and combinations of languages have been used to write interactive systems for data collection and analysis. We have had considerable experience with FORTRAN and BASIC as well as assembly languages. Our experience with operating systems for experimental work has been limited to the DEC PDP 11/40 DOS and the DEC PDP 11/45 RSX11D operating systems. Unequivocally, APL and our hierarchy have proved to be an order of magnitude more effective in reducing first concepts to producing results with experimental equipment. The APL language and the concepts of the APLSV or VSAPL implementations have provided an integrity of design and ease of coding that from our experience is far ahead of FORTRAN and BASIC oriented systems. Certainly the human engineering that has gone into APL implementations, e.g. the IBM 5100, is a factor in such a significant improvement, but the major element is that the structure of the language itself and the notation are much closer to the mathematics involved in experimental work than any computer language known to us.

The penalty one pays in using a high level language is that of execution times for certain operations and, perhaps, cost of the hardware. These disadvantages were more than balanced by reduction in the time necessary to integrate the hierarchy into the experiment. Should the measurements become routine, it may be useful to have professional hardware and software technicians implement the APL algorithms in the appropriate assembly language. In that case, the APL functions serve as an unequivocal description of the "operating system" needed for the experiment.

We have chosen to describe our methods in the context of automating a specific experiment involving the determination of the diffusion coefficient of particles in a fluid by the analysis of fluctuations in scattered light. A description of the physics of the experiment is required in order to put the instrumental analysis in context. The details of the interconnection of the hierarchy will be described and excerpts of code will be given to illustrate methods. Finally, performance will be outlined.

Light Scattering Spectroscopy: Particle Diffusion
Only a b r i e f i n t r o d u c t i o n w i l l be given here to a c l a s s of experiments that i n v o l v e the time s e r i e s a n a l y s i s of scattered light. Consult the books by Chu [1] and by Berne and Pecora [2] for d e t a i l s . The object of the experiment i s the study of the response of a system to small f l u c t u a t i o n s over a l a r g e frequency range. When the frequency region i s below perhaps 1MHz, the "response function" w i l l y i e l d measures of macroscopic c o n s t i t u t i v e c o e f f i c i e n t s such as diffusion coefficients, viscosity coefficients, elastic c o e f f i c i e n t s , r a t e constants and o t h e r s . For frequencies l a r g e r than about 10GHz we observe r e l a x a t i o n effects a s s o c i a t e d with molecular d i s t o r t i o n s . We w i l l s p e c i a l i z e to the low frequency region below 1MHz and further outline only the problem of determining the d i f f u s i o n coefficient of macroscopic p a r t i c l e s i n s o l u t i o n . The d i f f u s i o n c o e f f i c i e n t s of macroscopic p a r t i c l e s (lOé-nm d ^ 2000 nm, d i s the diameter of a p a r t i c l e ) are of interest for a number of p r a c t i c a l and t h e o r e t i c a l reasons. D i s p e r s i o n s of p a r t i c l e s are used by the medical p r o f e s s i o n , paint i n d u s t r y , the p r i n t i n g i n d u s t r y as w e l l as analyzed i n the context of environmental safety and i n human medicine. The theory of c o l l o i d s t a b i l i t y can be s t u d i e d d i r e c t l y as can the i m p l i c a t i o n s of the theory of f l u i d s [3]. Consider a suspension of small p a r t i c l e s , d 200 nm, each p a r t i c l e w i l l execute Brownian random motion that r e s u l t s from the very frequent c o l l i s i o n s of the Brownian p a r t i c l e with the small molecules of the surrounding s o l v e n t . 
When the number density of the Brownian particles is small, the particles can be treated as individuals so that collisions between these large particles can be ignored. The desired information about diffusion can be calculated from the intermediate scattering function

$$F_s(q,\tau) = \int_0^\infty p(D)\, e^{-q^2 D \tau}\, dD$$
where p(D) is the distribution of diffusion coefficients, D, and q is the scattering vector. A complication due to the variation of scattering cross-section with size is included as part of the data analysis.

The intermediate scattering function is relatable to the results of a light scattering measurement in the following way. When the incident beam has an electric field E, the scattered field, $E_s$, will be modulated by the particle motion so that at the detector the intensity will be $i = \beta |E_s|^2$ and the current autocorrelation function produced by the detector will be

$$R_i(\tau) = \langle i(\tau)\, i(0) \rangle = A\,F_s^2(q,\tau) + B\,F_s(q,\tau) + C \qquad (1A)$$

where A, B and C are constants for a given experiment. Experimentally, $R_i(\tau)$ is computed from a time series produced by the detector that is essentially the photocurrent as a function of time. When the incident flux of the scattered light is sufficiently high, the photocurrent is put through an analog to digital conversion before the computation of the correlation function by the rule that

$$R_i(\tau_k) = \frac{1}{N} \sum_{j=1}^{N} i_j\, i_{j+k} \qquad (1B)$$

where $i_j$ is the j-th sample of the photocurrent, $\tau_k = k\,\Delta\tau$ is the k-th delay and N is the number of products averaged.
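As an illustration only (it is not the algorithm wired into the correlator hardware), the averaging rule just described can be sketched in a few lines of Python. The function name `autocorrelation`, the decision to average the same number of lagged products for every delay, and the sample values are assumptions made for this sketch.

```python
def autocorrelation(samples, n_channels):
    """Estimate R(k) = (1/N) * sum_j i_j * i_(j+k) for k = 0 .. n_channels-1.

    `samples` is the digitized photocurrent (or photon counts per sample
    time); N is chosen so that every delay uses the same number of products.
    """
    n = len(samples) - n_channels + 1
    if n <= 0:
        raise ValueError("time series shorter than the number of channels")
    return [
        sum(samples[j] * samples[j + k] for j in range(n)) / n
        for k in range(n_channels)
    ]


# Hypothetical photon counts, one per sample time.
counts = [2, 0, 1, 3, 1, 0, 2, 1]
r = autocorrelation(counts, 3)
```

Each entry of `r` corresponds to one correlator channel; a real run would average over a very much longer time series than this toy input.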
When the incident flux is small, photon counting is done directly but a similar formula holds with the definition that $i_j$ is the number of photoelectrons detected during a period $\Delta\tau$ around $j\,\Delta\tau$. A computer is used for these calculations.

In practice, the relaxation times range from a few tenths of a microsecond to a few tens of milliseconds. Relaxation times shorter than 1 msec require measurements of the time series in the 10 to 100 μsec range. The accuracy requirement for $i_j$ is modest, eight bits is often sufficient, but averaging must be done with respect to a large number of time series. Even when direct memory access is fast enough, the memory of a conventional computer will be filled before a large enough time series has been collected.

We have been using the SAICOR Model 42 and 43 machines for preprocessing the time series data. While they are not programmable, control of function can be done by a host computer. The result of a determination of $R_i$ is a set of 16 bit numbers, one for each correlator channel. As the device's memory is read, these bits are available in two's complement code on 16 pins mounted on the back panel of the correlator.

The analysis can take a number of forms and a convenient one involves the computation of cumulants. Essentially,
$$\log F_s(q,\tau) = \sum_{m=1}^{M} K_m \frac{(-\tau)^m}{m!} \qquad (2A)$$

where the power series is truncated after the Mth term. Then

$$K_1 = q^2 \int_0^\infty D\, p(D)\, dD = q^2 \langle D \rangle \qquad (2B)$$

Since for $\tau > 0$

$$\log\left(R_i(\tau) - C\right) = 2 \log F_s(q,\tau) + \text{constant} \qquad (3)$$
it is obvious that a polynomial must be fit to what amounts to the log of the correlation function produced by the SAICOR hardware. The coefficients of the polynomial can be interpreted physically as the cumulants of the diffusion coefficient of polydispersed particles. The determination must be repeated often in the course of an experiment. The sequence of events must be:

1. Computation of $R_i$ goes on in real time for a selected period of time and may involve 10M bytes of information on the photocurrent.
2. The 100 to 400 element, 16 bit $R_i$ vector must be sent to a computer and transformed to numbers from a two's complement code.
3. The $R_i$ vector must be subjected to a least squares analysis and the cumulants calculated.
4. The $R_i$ vector must be archived with ID data and the cumulants made available for analysis.
5. Repeat this sequence many times for each experiment.
6. The data files for the experiment must be catalogued.
7. Repeat this entire sequence for many experiments by different users.
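Steps 2 and 3 of this sequence can be sketched as follows. The Python below is an illustrative sketch under stated assumptions, not the APL used in the laboratory: it assumes the baseline C is known, that log(R − C) is a polynomial in the delay whose m-th coefficient equals 2 K_m (−1)^m / m!, and that delays are expressed in units that keep the fit well conditioned. The names `to_signed16`, `polyfit` and `cumulants` are hypothetical, and the data are synthetic.

```python
import math


def to_signed16(word):
    """Interpret a 16-bit two's complement word (0..65535) as a signed integer."""
    return word - 0x10000 if word & 0x8000 else word


def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    m = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (r + c) for x in xs) for c in range(m)]
        + [sum(y * x ** r for x, y in zip(xs, ys))]
        for r in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(m):
            if r != col:
                f = rows[r][col] / rows[col][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[col])]
    return [rows[r][m] / rows[r][r] for r in range(m)]


def cumulants(taus, r_values, baseline, order=2):
    """Fit log(R - C) and return [K_1, ..., K_order]."""
    ys = [math.log(r - baseline) for r in r_values]
    coeffs = polyfit(taus, ys, order)
    return [
        coeffs[m] * math.factorial(m) / (2 * (-1) ** m)
        for m in range(1, order + 1)
    ]


# Synthetic correlation function with K1 = 1.5 and K2 = 0.2 (delays in ms).
taus = [0.1 * k for k in range(1, 21)]
r_values = [1000.0 * math.exp(2 * (-1.5 * t + 0.2 * t * t / 2)) + 50.0 for t in taus]
k = cumulants(taus, r_values, baseline=50.0, order=2)
```

The mean diffusion coefficient then follows from the first cumulant as K_1 / q², with q fixed by the scattering geometry.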
We have found the VSAPL or APLSV level of APL implementations to be highly facile for handling these tasks effectively. A hierarchy with an APL host was devised for performing the data acquisition, computations and control required. Before the details are described, a description of APL is necessary.

APL (A Programming Language)
APL is an array processing language for manipulating sets of numbers or sets of characters of quite general shapes. The formal syntax of APL is based on the mathematical concepts of function and functions of functions or operators. Much of the power of the language derives from the extensive set of
primitive functions and operators as well as the notation that represents their behavior. Defined functions can be constructed simply by writing sequences of primitive functions that lead to the desired result. The defined function has the same syntax as the primitive functions. Several examples will be sufficient to illustrate the use of the language. Suppose that the following double summation must be evaluated.

$$\sum_i \sum_j Y_i\, B_{ij}\, J_j$$

In APL:

      Y+.×B+.×J

The linear least-squares algorithm or one step in an iterative nonlinear least-squares algorithm would require evaluation of the following matrix problem:

$$K = (M^T M)^{-1} M^T C$$

While FORTRAN requires DO looping and a call to a subroutine for computing the inverse matrix, APL does the entire operation as follows:

      K←C⌹M

It has been our experience that in general APL code is more compact by a factor of ten to 100 than the equivalent FORTRAN code and takes roughly a tenth of the time to produce and debug on a computer. The large set of primitives as well as the syntax of the language allows for a surprisingly large redundancy in the ways one may code a particular calculation. This is an advantage for a number of reasons, not the least of which is that the language is very forgiving for the inexperienced programmer. A simple subset of the primitives is sufficient for handling most computations that an inexperienced programmer may want to do. As he gains experience, he will naturally take to exploring some of the sophisticated primitives allowed in the language. In our experience, APL has been far easier to teach to inexperienced programmers than any of the other languages commonly in use.

The APL language itself is indifferent to its implementation. The language has most often been implemented for time-sharing, but there is no reason to exclude a real-time implementation. Since about 1972, new APL systems based on the shared variable concept have been written for the IBM 370 series computers. This approach allowed the APL processor to communicate to the external world easily. Since shared variables have exactly the same structure as any other variable in APL, defined functions could be written that use
data passed back and forth between the APL processor and external processors easily. It is possible, therefore, to consider an experimental apparatus as an external processor that communicates with the APL processor through shared variables.

The difference between implementations centers on whether or not shared variables are used and whether or not certain systems functions and systems variables are defined. In practice, there is a degree of portability in user functions that is beyond most other languages. The Appendix includes a table of a number of APL systems that are supported on larger machines. The list is probably not complete.

We are of the opinion that a small addition to the set of systems functions and systems variables would provide all of the resources needed for doing real time operation entirely within the context of an APL machine. The implementation of such a proposal is beyond the expertise that we have within the department. However, we have found an attractive alternative to a full APL real-time machine. A block diagram of the interfacing schemes that have been used successfully in our laboratory for the last year is shown and described in a later section as Figures 2 and 3.

The Hosts

The APL hierarchy is structured so that any machine running with the equivalent of APLSV can be attached as an efficient host. See table (1) in the Appendix. In particular, the IBM 5100 and the Xerox Sigma 7 machines have been used as hosts extensively. The IBM 5100 with APLSV features was considerably easier to use than the Xerox APL. However, the Xerox APL is sufficient for the purpose even though awkward by modern standards.
The various IBM 370 machines running either VSAPL or APLSV are entirely able to handle the host responsibilities. The most effective interaction of the host with the hierarchy does require communication rates above 300 baud. Since the IBM 5100 can transmit and then receive at rates programmable up to 9600 baud, it was a superior host. Although our experience is limited, our trials show that the Hewlett Packard 3000 Series II machines are also suitable as hosts. In fact, the terminal ports for the HP 3000 II can work to 2400 baud and the I/O bus to ca. 300 Kbytes/sec or faster. Use of that I/O bus for the hierarchy requires both hardware and software that do not exist.

The IBM 5100 architecture was described by Roberson [4] and will not be repeated here. This APL system is small and portable and includes a CRT display as well as a tape cartridge drive. The cartridges have a capacity of about 220,000 bytes and the system performs tape write and checking at about 900 bytes/sec and tape read at about 2500 bytes/sec. Our work required a printer as well as the auxiliary tape drive for efficiency. It
was necessary to use the maximum memory storage of 64K bytes which gives an actual workspace size of about 57K bytes. The serial I/O adapter was required in order to attach the IBM 5100 to the hierarchy. The cost of this system is about $23,000.

The IBM 5100 was physically attached to the hierarchy through its Serial I/O port. The APL processor allows the definition of shared variables that "appear" in both the APL workspace and the I/O interface. This is done by invoking an APL systems function for generating a shared variable offer, ⎕SVO, as in the expression

      1 ⎕SVO 'MICRO'

where the variable MICRO is now shared with the serial I/O processor so that assigning MICRO to another APL variable will cause information to be transferred from the I/O processor into the APL processor. Assigning to MICRO will cause information to be transferred from the APL processor to the I/O processor. One or more variables can be declared as "shared" by ⎕SVO.

The I/O processor must have some information about the data to be transferred and that is given by assigning literal strings of control information to the shared variable. Three classes of strings must be assigned to the shared variable before I/O communications occur. Firstly, when the literal string 'OUT 31001 TYPE=I' is assigned to the shared variable, the serial I/O processor is put into command mode as designated by the 'device number' 31001. If this is in fact done, the vector 0 0 is assigned to the shared variable by the Serial I/O processor. A non-zero value implies an error and that condition can be checked by simple APL code.
Similarly, 'IN 33001' when assigned to the shared variable informs the I/O processor to prepare for input from device address 33 (input), file 001. Lastly, the assignment of 'OUT 32001 TYPE=I' states that an output operation will occur for device 32, file 001 and the data type is specified.

After the command device is opened by assigning 'OUT 31001 TYPE=I' to the shared variable, the next assignment to the shared variable is the specification of the device characteristics in the form of a character string. The input and output buffer sizes may be specified along with the data rate (0.5 baud steps). Such aspects as the prompting character, new-line character, end-of-buffer character, parity, number of stop bits and changes in the I/O translation tables can be specified at any time including during the execution of defined functions. The device characteristics that can be included are sufficient to handle any handshake protocol of the machines we have used. In fact, one may use 5, 6, 7 or 8 bit I/O code so that, for example, the IBM 5100 can be interfaced to EBCDIC or ASCII devices easily.

It is convenient to define a small set of functions that handle the opening and closing of the Serial I/O "devices" automatically. The monadic function ΔCOMMAND requires as a right argument the literal string of device specifications, ΔOUT outputs a literal string right argument, while ΔIN does not require an argument but can be used to assign whatever is in the input buffer to a variable. Each function checks the return code and reports error conditions in detail. The ease with which the details of the communication protocol could be built into defined functions was an important factor in producing code quickly.

The Device Control Processor
The microcomputer chosen for this study was the Motorola 6800 built up with the components listed on Figure 1. The microprocessing unit (MPU) was built up on one card with the MC 6800 as the processor. Off of the common address bus and data bus leading to the MPU were several types of I/O adapters. These chips provided I/O for two modes of terminal operation as well as interfacing to the APL host. The third mode of operation is that of asynchronous communications through a special chip called the MIKBUG ROM. This ROM provides an asynchronous program, a loader program, and a diagnostic program for use with the MPU. Two Kbytes of memory were built up from ICs on a memory card attached to the address bus and the data bus. Memory could be expanded simply by adding additional cards to the bus.

Communications to the instruments required interfacing, part of which was organized on a channel card as shown in Figure 1. The peripheral interface adapter (PIA) was used for this purpose. The PIA allows eight bit bidirectional communication with the MPU and two bidirectional eight bit buses for interfacing to peripherals. Handshake control logic for input and output peripheral operation is also included in the chip. We used the two eight-bit buses together for the input of 16-bit parallel I/O from the lowest level of the hierarchy, the SAI 42 or 43 correlators (Honeywell - SAICOR). The channel card of the DCP had a simple layout based on the Motorola PIA chip. We designed each channel to be of similar structure and only small adaptations, if any, had to be made in order to complete the interface.
Our intent is to place all of the special interfacing in the instrument and keep the channel card as clean and ubiquitous as possible. The specification of six channels does not represent a design restriction, but reflects our estimate of what is needed for the laser light scattering experiment.

The boards and power supply of the DCP were built up by Hexagram, Inc., Cleveland, Ohio for a total cost of about $2,000 including labor. Software development for this particular version of our system was done by Hexagram, Inc. and brought the entire cost of the microcomputer to $4,000. Hexagram, Inc. produced a competent design and implementation for us and in the process taught us a fair amount of the technology needed for constructing the systems. We are planning to implement additional microcomputer systems in house at a savings.

Physically, a terminal and instruments are plugged into the various channels using conventional telecommunication connectors.
The host is connected into either the MIKBUG channel or the asynchronous communications interface adapter (ACIA) channel, depending on whether or not the purpose is to load object code, including the initialization of the program for the operating system. Once the operating system has been initialized, the host is switched into an ACIA channel for the remainder of the session. Of course, the program for the operating system could be entered through other media but we very quickly learned that it was easy and fast to download a processing module as a hexadecimal character array. This was especially convenient to do with the IBM 5100 as the host since one could initiate a program load through assignment of the character vector to the shared variable. Transfer was accomplished at rates of 1200 baud and could be done considerably faster than that if desired.

The first operating system written for the device control processor was based on a set of commands for interacting with the correlator. The system was to be compatible with APL rules. This was easy to do once an APL function was written to emulate the behavior expected of the DCP. The )) is executed by the DCP while ) is executed by the APL processor. The following function defines the DCP.

      ∇ DCPΔP;STR;CMNAME;CMARG
      ⍝ EMULATION OF THE DEVICE CONTROL PROCESSOR OF FIG 1.
      ⍝ NAMEΔP IS A PROCESSOR
      ⍝ NAMEΔF IS A FUNCTION
      ⍝ BLANKSΔF STRIPS OFF BLANKS AND ':',6⍴' '
      ⍝ NAME IS A VARIABLE OR LABEL
      ⍝ CM IS SHORT FOR COMMAND
      ⍝ CMARG IS THE ARGUMENT OF A COMMAND
      ⍝ ⍞ REPRESENTS THE TERMINAL I/O TO THE DCP.
      ⍝
     L1:STR←BLANKSΔF ⍞,0⍴⍞←':',6⍴' '
      →(∧/'))'=2↑STR)/MPU
      STR←APLΔP STR
      →(∧/'))'=2↑,STR)/MPU
      →L1
     MPU:CMNAME←TAKECMΔF STR←2↓STR
      →(~CHECKΔF CMNAME)/ERROR
      CMNAME DOCMΔF CMARG←(⍴CMNAME)↓STR
      →L1
     ERROR:⍞←'IMPROPER COMMAND'
      →L1
      ∇
The DCPΔP function, in the line labeled L1, simulates input from a terminal connected to the appropriate ACIA port of the DCP. The analysis for special control characters is performed in the next line. If the )) pattern is found the branch to MPU is made. The DCPΔP parses STR for a command and argument and then executes that command in DOCMΔF, if possible. Errors are detected and in case the instruction does not match a table of instructions, that error is put out to the terminal. Control then goes back to L1 and the terminal. If the system command characters, )), are not detected in the string, then the assumption is that the string contains an executable APL statement. The function APLΔP then attempts to execute whatever may be in the string and transmits back to the terminal the result of the execution step. Should the host execute a function that outputs leading )) characters, the DCPΔP recognizes that it must execute a command. Again a branch to MPU occurs.

DCPΔP emulates exactly the operation of the DCP in the hierarchy. The function APLΔP which is meant to emulate the operation of the APL host does so through the ⍎ function. However, APL systems commands cannot be executed in this way and an APL error detected in execution will suspend DCPΔP as well as APLΔP. The DCP microcomputer is distinct from the APL host so that all functions execute properly in the real hierarchy.

The function DCPΔP assumes that the DCP is connected into the host through a terminal port on the host. In that case the terminal is connected into the DCP and the communication pathway is from the terminal into the DCP and from there to the host and back through the DCP for output from the host, see Figure 2.
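The routing rule described above can also be written as a short Python sketch. This is an illustration under stated assumptions rather than the DCP firmware or the APL emulator itself: the function `dcp_step`, the command table, the `START` command and the stand-in host are all hypothetical names introduced here.

```python
def dcp_step(line, commands, host_execute):
    """Route one input line the way the text describes the DCP loop.

    Lines starting with '))' are DCP commands; anything else is passed to
    the host, whose reply may itself begin with '))' and trigger a command.
    """
    text = line.strip()
    if not text.startswith("))"):
        text = host_execute(text).strip()
        if not text.startswith("))"):
            return text  # ordinary host output goes back to the terminal
    name, _, arg = text[2:].strip().partition(" ")
    handler = commands.get(name)
    if handler is None:
        return "IMPROPER COMMAND"
    return handler(arg.strip())


# Hypothetical command table and stand-in host for demonstration.
commands = {"START": lambda arg: "STARTED " + arg}


def fake_host(text):
    # Pretend that the host function RUN emits a DCP command string.
    return "))START 42" if text == "RUN" else "HOST:" + text


replies = [
    dcp_step("))START 7", commands, fake_host),
    dcp_step("2+2", commands, fake_host),
    dcp_step("RUN", commands, fake_host),
    dcp_step("))BOGUS", commands, fake_host),
]
```

The third call shows the case in which a function executing in the host produces a string with leading )) characters, so that the command is carried out by the DCP instead of being echoed to the terminal.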
This mode of operation in which the terminal and host communicate with each other through the DCP is satisfactory for communicating with large APL machines such as the Sigma 7 or the IBM 370/145 that we have used on this project. The large machines, though, have a disadvantage in that the size of their input buffer and output buffer and the characters used to signify the end of a line and the end of a buffer are not simply subject to user control. We found that the Sigma 7 host, though, could be interfaced through APL readily to the DCP with the I/O going back and forth as "blind I/O". In this way the Sigma 7 did not insert control characters into the software character string sent across the telephone lines. This did mean that a short and simple APL program had to be executing in the host for the communications to work properly. This was easy to do and not a hindrance in execution. There is no doubt that the problem could be handled in the software of the DCP, should we choose to do so.

We found that we needed to go more deeply into the telecommunications protocol between computer systems than we had really expected. And, in fact, in subsequent editions of our DCP
we will include all of the telecommunications protocol that is required for a particular host. It is most unfortunate that the computer industry does not have a well-defined set of standards for the handshake protocols used by computer systems. The RS 232C specification really only stipulates the hardware aspects of telecommunications and says nothing at all about the software protocol for establishing communications links.

We have found that the IBM 5100 is an easy host to use in our hierarchy. The communications link is very much simplified in that the IBM 5100 includes its own terminal input device. While we used a terminal hooked into the DCP in order to monitor the data transfer between various points on the hierarchy, in fact all of the commands as well as executing APL functions were initiated from the keyboard of the IBM 5100. It was possible to use the IBM 5100 as a terminal and operate with it as the principal host for the data acquisition and preliminary processing steps in performing the experiment, but then transfer the data set over to the larger host for somewhat longer calculations.

The hierarchy based on the DCP has worked exceedingly well for the light scattering experiment as will be documented later in this paper. The architecture including the software design can be generalized to a much more complicated experiment. The key was to recognize that the DCP was setting up implicit shared variables between the processors on the hierarchy. The shared variable concept has been exploited in the APLSV and VSAPL systems as developed and marketed by IBM for their 370 series computers.
A paper by Lathwell [5] goes into the system formulation in some detail. We have developed the architecture for what we feel will be a hierarchy that exploits the shared variable extensively. Four levels will be used: HOST, SHARED VARIABLE PROCESSOR, DEVICE CONTROL PROCESSOR, and INSTRUMENTS. The SVP and DCP will be plug compatible M6800 microcomputers but with quite different operating systems. The DCP will be a command based system with real time capabilities. The SVP will have commands, systems functions and systems variables as well as primitives necessary for handling the logic and array shaping required of data acquisition and control. In the end the SVP might have the APL capability of the IBM 5100 with about 10 Kbytes of memory. However the SVP must be able to link to several devices rather than just one. The shared variables protocol can be used directly with any number of serial I/O ports that fits the hardware limitations. We do anticipate the need for an interrupt structure now missing from APL implementations as well as some form of multi-tasking. These questions are under study.
[Figure 1. The Device Control Processor. Block diagram; labels recoverable from the drawing: MPU card (MPU MC6800, MIKBUG MCM6830L7, PIA MC6820, MIKBUG I/O, ACIA MC6850 for terminal I/O, ACIA MC6850 for host I/O); memory card (2 Kbytes, IM7552-IC); channel card (two PIA MC6820, each with 16 bit parallel I/O and 4 I/O control lines); address bus; data bus.]
The Hierarchy

Two hierarchies were used extensively and are blocked out in Figures 2 and 3.

[Figure 2. Block diagram; labels recoverable from the drawing: Sigma 7 APL (Datalogics Inc., Cleveland, Ohio), acoustic coupler, 300 baud telephone line, acoustic coupler, DCP, terminal, correlator, signal conditioning, photomultiplier tube, scattered light.]
In general, the DCP accepts strings of characters from either the Sigma 7 host or terminal and passes them on to the target. The DCP is invisible to the user's dialog with the host. Only when the characters '))' appear at the start of a string will the DCP look for an executable command to process. This operation is described in the previous section. Note that these commands can be organized for sequential transmission to the DCP as literal strings by APL functions resident in the host. The APL functions are initiated from the terminal while the APL system is in calculator mode. Timing can be introduced by an appropriate combination of a function that delays execution and comparison with what amounts to a time stamp variable. However, an interrupt structure was not available.

The data transmitted from the DCP buffer were taken into the Sigma 7 as quartets of hexadecimal characters followed by a blank, all in a binary format. The dyadic ⍳ function against the atomic vector gave the location of each character in the atomic vector. The decode function produced the desired set of numbers to a precision of 16 bits as transmitted by the correlator. In fact, two data sets could be transmitted from the correlator, one giving the most significant 16 bits of the 24 bit correlator word and the second the least significant 16 bits. A one-line APL function compared the common bits for register, masked the common bits of the array having the least significant bits of the correlator words and catenated the two arrays, producing a single array containing the entire dynamic range of the correlator. While such precision is unnecessary in conventional light scattering work, it was most interesting to discover that bit level manipulations could be handled easily in APL.

The 300 baud transmission rate was in fact uncomfortably slow when more than 100 element data arrays (100 × 5 = 500 characters) had to be transferred to the host and processed. The SAICOR Model 43 produces 400 element data arrays or 2000 characters. The delay was especially uncomfortable when the DCP program for its operating system was down-loaded. These problems were eliminated by using the IBM 5100 as the host computer. The hierarchy that used the IBM 5100 has the following configuration:
Figure 3. (Diagram; labels: Bidirectional Printer; 3000 miles; IBM 370/145 VSAPL; Correlator)
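The hexadecimal decoding and the merging of the two overlapping 16-bit transfers, described above for the Sigma 7 host, can be sketched outside APL. The Python below is only an illustration and not the authors' one-line APL function; the function names are hypothetical, and the bit layout (high transfer carrying bits 23 to 8, low transfer carrying bits 15 to 0) is inferred from the description of a 24 bit correlator word.

```python
def decode_quartets(text):
    """Convert blank-separated quartets of hexadecimal characters to integers."""
    return [int(quartet, 16) for quartet in text.split()]


def merge_words(high16, low16):
    """Rebuild 24-bit words from two overlapping 16-bit arrays.

    high16 holds bits 23..8 of each word and low16 holds bits 15..0,
    so bits 15..8 appear in both arrays and must agree.
    """
    words = []
    for high, low in zip(high16, low16):
        if (high & 0xFF) != (low >> 8):
            raise ValueError("common bits of the two transfers disagree")
        # Mask the shared bits out of the low array, then catenate.
        words.append((high << 8) | (low & 0xFF))
    return words


# Hypothetical transfers of two correlator channels.
high = decode_quartets("1234 00FF ")
low = decode_quartets("3456 FF01 ")
print(merge_words(high, low))
```

Checking the shared bits before merging mirrors the register comparison mentioned in the text; a mismatch would indicate that the two transfers came from different correlator contents.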
The IBM 5100 has a number of advantages in interfacing to the DCP. The size of the I/O buffers for the IBM 5100 can be changed dynamically and easily under program control, the end of line and end of buffer characters can be specified, and an input prompt can be specified. Incorporating this level of I/O control into simple APL functions allows a direct dialog
between the correlator and the IBM 5100. We have found it easy to write APL functions that check for certain prompts generated by the DCP (e.g., the MIKBUG prompts) so as to detect completion of load conditions as well as operator errors. Clearly the necessary programs could have been written in assembler for the DCP, but it was far easier to use the IBM 5100 shared variable processor through APL to accomplish the same behavior. Since the IBM 5100 allows the transmission rate to be set up to 9600 baud, transfer of data is no longer limiting in collecting replicate runs. We presently run at 1200 baud and may go to 3600 baud in the future. Further, the down-loading of microcomputer code is rapid and is totally interactive.

A second host arrangement was established between the IBM 5100 and an IBM 370/145 in order to handle larger least squares fitting. A set of APL functions kindly provided by Gussin [6] was extensively rewritten to include a command structure for transmitting and fetching objects between the IBM 5100 and an IBM 370/145. Data sets were transmitted back and forth so that the data type and name of each array would be the same. Functions were transformed into canonical representations (⎕CR), transmitted, and fixed (⎕FX) in the new environment automatically, without special conversion functions resident in the host. Therefore, data sets could be read from magnetic tape archives and transmitted to the large host for data processing using functions sent over to the host.
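The idea of shipping a function as text and re-establishing it on the other machine can be illustrated by analogy. The Python below is a loose sketch, not the APL mechanism itself: `canonical` and `fix` are hypothetical names that only imitate the roles of the canonical representation and fix operations, and a dictionary stands in for the receiving workspace.

```python
def canonical(source):
    """Return the source as equal-length rows, loosely like a canonical form."""
    lines = source.splitlines()
    width = max(len(line) for line in lines)
    return [line.ljust(width) for line in lines]


def fix(matrix, namespace):
    """Re-create the function in `namespace` from its rows; return its name."""
    text = "\n".join(row.rstrip() for row in matrix)
    exec(text, namespace)
    header = matrix[0]
    return header[len("def "):header.index("(")].strip()


source = "def mean(values):\n    return sum(values) / len(values)"
rows = canonical(source)          # what would be sent over the line
remote = {}                       # stands in for the receiving workspace
name = fix(rows, remote)
print(name, remote[name]([1, 2, 3]))
```

Because only character rows cross the link, the receiving side needs no conversion code beyond the ability to fix a function from text, which is the property the paragraph above relies on.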
The DCP was programmed to use ASCII code, so it could not serve as a three way simultaneous interface between the IBM 370/145 (EBCDIC), the IBM 5100 (ASCII or EBCDIC) and the correlator. However, that mode of operation is certainly possible with appropriate programming of the DCP.

We found that the IBM 5100 hierarchy was flexible yet easy to use and entirely reliable. The down time of the IBM 5100 amounted to a total of one day in six months. Adjustments to the tape drive accounted for all of the down time. Further, the system was available at all times and did not depend on the scheduling of a time sharing vendor.

An Operating System

The attributes of the hardware, commands, and the high-level language are integrated in a way that assists interactive experimentation. This was done by writing a set of APL functions that worked together as a processor. Based on previous experience with a PDP 11/45 machine running under RSX 11D, we estimated that such a processor would require at least eight months to write and debug. Many of the subroutines would have been written in assembly language. In contrast, one first year graduate student (T. Gall) wrote and debugged the processor in three months. This was his first serious effort with APL so
that he also learned the language during that period. We next outline the operation of the processor and give an example of APL functions that were written. Others will be sent upon request.

The IBM 5100 serves four purposes in our experiments. It gives instructions to the operator. It controls the instrument. It archives the data and results on tape. It calculates results from the data. The goal was to write an operating system that could be used by technicians who did not know the language. It was very easy to write APL functions that allow interaction between the experimentalist and the functions controlling the instrument.

A small set of commands, thought of as niladic APL functions, was developed for controlling the interaction between the host, the microcomputer, and the correlator. The list of commands that provided the necessary function in the DCP is: MEMORY, START, FETCH, TRANSMIT, INITIATE, and BEGIN. Each function was coded in assembly language as a subroutine called by a control program. A cross assembler produced the machine code as a string of hexadecimal characters. These strings could be transmitted from the IBM 5100 through the MIKBUG port. Initialization data were transmitted as part of the string. Once this transmission was completed, the microcomputer would operate as a DCP.

When the command ))MEMORY is sent from either the terminal or the computer, the microcomputer prepares for the memory allocation for each channel. This allocation is done by an APL function from the host or interactively from the terminal.
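The way the DCP separates commands from ordinary traffic can be sketched as follows. This Python is a hypothetical illustration of the rule stated in the text (strings beginning with '))' are commands, everything else is passed through); the function name `route` and the tuple results are assumptions, and the real DCP was coded in M6800 assembly language.

```python
# Command names are taken from the list in the text.
COMMANDS = {"MEMORY", "START", "FETCH", "TRANSMIT", "INITIATE", "BEGIN"}


def route(line):
    """Classify one string arriving at the DCP from the host or terminal."""
    if not line.startswith("))"):
        # Ordinary dialog: the DCP stays invisible and forwards the string.
        return ("pass", line)
    name, _, argument = line[2:].strip().partition(" ")
    if name not in COMMANDS:
        return ("error", name)
    return ("command", name, argument.strip())


for text in ["A+B", "))START", "))TRANSMIT 3", "))BOGUS"]:
    print(route(text))
```

Keeping the command test to a two-character prefix is what lets the same serial line carry both the user's dialog with the host and the instrument control strings.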
The command string ))START invokes a subroutine in the microcomputer that produces the necessary output signal levels to start the correlator operation. When the correlator finishes taking a data set, the microcomputer is informed by a flag. ))FETCH calls a subroutine to read the correlator memory and store the hexadecimal representation of the numbers in the DCP memory. ))TRANSMIT <argument> causes the microcomputer to transmit information to either the host or the CRT. The command string ))BEGIN is a microcomputer call to the subroutine START and then TRANSMIT; it causes the correlator to run and, when that step is completed, causes the transfer of the data up into the host. The command ))INITIATE invokes a subroutine in the microcomputer that makes the necessary links for a host to interact with the hierarchy.

The host must be able to handle block reception of the data being transmitted from the microcomputer. The time required for the 5100 to change from output to input mode is of the order of 100 to 200 milliseconds, which was slow compared to the microcomputer. However, the 5100 can be instructed to take the last character of the output string as an input prompt. The result was that the last character of the string was not sent until the IBM 5100 had switched from output to input. With this technique it was impossible for the 5100 to
miss any of the data strings being sent from the microcomputer. Examples of APL/5100 functions to communicate with the M6800 microcomputer are available from the authors.

Several levels of protocol are involved: bit level handshaking, shared variable conditioning of the serial I/O interface, and the APL defined functions. However, in the end only the APL defined functions need be used by the experimentalist in writing the functions for handling the various aspects of running an experiment.

Data, instructions and functions can be stored on magnetic tape. Usually, data storage on tape was organized with one file for each data set. A file name for each data set is stored in an array called LIBRARY, specific to each workspace executing the control functions. The number of lines in LIBRARY tells the system how many files there are on that tape. The size of the files should be known in advance so that the proper size can be marked on the tape. If the size of the data sets to be stored is unknown, the files are marked with the size of the largest data set expected. A number of functions were coded in APL to handle the archiving of data automatically.

After each run or, alternately, after a series of runs, APL functions can be invoked to perform data workup calculations. One such calculation that must be done is the conversion of the hexadecimal code transmitted from the microcomputer into the IBM 5100 internal representation for numbers. After that, the least squares curve fit of a run or a series of runs can be made automatically to the cumulant polynomial, eq. (2). A part of the code is shown below.

K*C CUM Ν B+(NRiSIG)*R-C (N+l) pR)pNRiSIG)x( K+-MA +0 9
( (pi?) N+1)pi , * Î \N)χ(-Γ) 9
° .*0, xN
⍝ THE SET OF DELAY TIMES, T, IS COMPUTED GIVEN ΔT.
⍝ THE FIRST ELEMENT OF R WAS DROPPED.
⍝ THE WEIGHTED B VECTOR IS COMPUTED FROM THE LOG OF
⍝ THE CORRELATION FUNCTION SUBTRACTING OUT THE BASE LINE.
⍝ ALL OF THE DERIVATIVES ARE COMPUTED AND ASSIGNED TO A.
⍝ A IS AS LONG AS THE DATA SET AND AS WIDE AS THE NUMBER
⍝ OF CUMULANTS TO BE CALCULATED, N PLUS ONE.
⍝ B⌹A COMPUTES THE CUMULANTS.
⍝ CUM IS PART OF A SHORT FUNCTION THAT CONTROLS THE ITERATION.
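The comments of the listing describe a weighted linear least squares fit of the logarithm of the baseline-corrected correlation function. A minimal Python sketch of that kind of fit is given here for orientation. It is not a transcription of CUM: eq. (2) is not reproduced in this section, so the model ln(R - C) = sum over n of K_n (-T)^n / n! is an assumption (the usual cumulant expansion), the weights default to one, and the names `solve` and `cumulants` are hypothetical.

```python
from math import exp, factorial, log


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def cumulants(r, baseline, dt, order, weights=None):
    """Weighted least squares fit of log(r - baseline) to the assumed polynomial."""
    t = [dt * (i + 1) for i in range(len(r))]          # delay times from the spacing
    w = weights if weights is not None else [1.0] * len(r)
    a = [[wi * (-ti) ** n / factorial(n) for n in range(order + 1)]
         for wi, ti in zip(w, t)]                      # one row per data point
    b = [wi * log(ri - baseline) for wi, ri in zip(w, r)]
    cols = range(order + 1)
    ata = [[sum(row[i] * row[j] for row in a) for j in cols] for i in cols]
    atb = [sum(row[i] * bi for row, bi in zip(a, b)) for i in cols]
    return solve(ata, atb)                             # normal-equation solution


# Synthetic data built from known cumulants (illustrative values only).
known = [0.0, 2.0, 0.3]
times = [0.01 * (i + 1) for i in range(50)]
data = [1.0 + exp(sum(k * (-t) ** n / factorial(n) for n, k in enumerate(known)))
        for t in times]
fitted = cumulants(data, 1.0, 0.01, 2)
print(fitted)
```

The sketch solves the normal equations explicitly, whereas APL's matrix divide performs the least squares solution in a single primitive, which is one reason the original function is so short.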
To make memory available for large calculations, a system was designed whereby all the data acquisition programs were themselves stored on tape as canonical representations of the functions. Only the first few lines of a calling function are
resident in the workspace. The rest are stored. The data handling functions are invoked by the calling program and are cleared from memory by the ⎕EX function when it is finished. Storing the functions and data on tape and having two tape drives expands the available workspace memory to approximately 400 K bytes. Obviously, the execution time is slowed down when functions have to be read from tape.

Even though the IBM 5100 execution time is slower on individual functions than a larger time sharing system, it does not have the transmission and sharing delays that a large system requires. Operating at 1200 bits per second transmission, the 5100 can take and store a 400 point data set every 15 sec. In contrast, the Xerox Sigma 7 as a host, exchanging data at 300 bits per second, required an average of 5 minutes to take and store a data set. Even though the microcomputer is slower at executing individual functions, its dedication to one user and one experiment makes its overall throughput at least 5 to 10 times that of a conventional time sharing machine.

Results and Conclusions

The hierarchical system has been operating reliably for six months and has been used by roughly a dozen different experimenters. The effectiveness of the system is demonstrated by a simple problem. In working up the autocorrelation function data to diffusion coefficients (see the first section), the weight function for the cumulant analysis must be estimated. Edwards [7] has constructed a model of the process that predicts a certain variation of the standard deviation of the correlation function (σ_R) with delay time.
A simple propagation-of-error analysis will show that the values of the cumulants are sensitive to the variation of the weight function when very accurate estimates of the diffusion coefficient (<2%) and higher cumulants are required. The form of the σ_R function was to be studied experimentally. Replications of the correlation function had to be collected under closely controlled conditions. Histograms were collected for each of 400 points of the correlation function. Fifty replications gave reasonable distribution functions from which σ_R could be calculated. The data collection proceeded rapidly since the signal to noise ratios were large enough that averaging required only ca. 1 min. for each run. A few seconds were required for writing each data set to a tape file. All of these steps were handled by APL defined functions sequenced appropriately by an APL defined function. The machine executed without operator control during this tedious part of the experiment. The operator then wrote several lines of APL code in a defined function in order to fetch the data sets from files, build up
the arrays from which the error data could be calculated and examined in plots of various kinds. In one day, an experiment could be conceived and completed that with older techniques would require weeks of effort. Just the coding and debugging of the analysis program would have required about one man day of effort using FORTRAN and a batch system. In APL the coding and debugging was trivial.

We believe that the APL hierarchy has enhanced the productivity of the light scattering experiment by several orders of magnitude. Just automating the experiment accounts for an order of magnitude in the productivity of the experiment. Certainly, that improvement has been experienced by others with conventional computing systems and interfaces. However, the use of APL has cut the effort required to produce functions by an order of magnitude as well. This property of APL was especially appreciated when devising analysis algorithms.

The use of a hierarchy which had an IBM 5100 for most of the work and a large machine occasionally as hosts kept certain costs of operating the experiment in line. A large host can handle the role of the 5100 so long as telecommunication rates are above 300 baud. However, connect time is charged on most commercial systems at rates around $15 per hour and higher. The IBM 5100 requires only capitalization and maintenance. Further, commercial time sharing rates make the use of the large host in place of the IBM 5100 costly for data acquisition. However, the IBM 5100 executes large number crunching jobs slowly enough that they can be effectively done on a large host.
Fortunately, it is easy to use the IBM 5100 for transferring data and functions to the host as well as fetching results from the host. The IBM 5100 can be a very bright terminal to the large host. It is able to make efficient use of the number crunching power of the large host since function development can be done in the small machine.

We believe that a hierarchical approach based on APL as the driver language leads to the most effective compromise between the cost of hardware and the cost of writing software. This has proven to be true for our experimental work and there is evidence that the hypothesis is true generally for laboratory work. The very high cost of producing reliable software may make the hypothesis true for a much wider range of applications.

Acknowledgments

This work was supported by grants from NIH and NASA. We are happy to acknowledge the support of DATALOGICS, Cleveland, Ohio for time to develop several models for the operation of the DCP. We gratefully acknowledge the role that Drs. P. Friedl and J. Beaumont, IBM Palo Alto Scientific Center, have had in developing the IBM 5100 hierarchy as part of a joint study.
APPENDIX

Table 1
Summary Information of Some APL Systems

Computer: IBM 5100
Execution Speed: Slow
Response Time to Entry: Good. A fine 1 person machine. )SAVE is slow since mag. tape is used.
Language Execution: Full APL. No known bugs.
Workspace Size: 64 kbytes maximum available. This is minimum size for our applications.
Shared Variables: Yes
Sign ON Protocol: Switch on power.
Human Engineering of system: Very good. First class visual fidelity is used.
Suitability as a host for data acquisition: Very good.

Computer: IBM 370/145
Execution Speed: Good
Response Time to Entry: Good. 1 sec.
Language Execution: Full APL. No known bugs. Very effective.
Workspace Size: Virtual system default to 330 kbytes or larger.
Shared Variables: Yes
Sign ON Protocol: Simple
Human Engineering of system: Good
Suitability as a host for data acquisition: Good. Block length limited to about 2 kbytes on input from DCP.
Execution Speed
Workspace Size
NO comment.
Must use ⎕WA and the maximum is about 256 kbytes. Subset of APL.
NO
NO
Very poor.
Very poor.
*The Univac APL processors were written at the University of Maryland or by Univac based on the University of Maryland processor. In our experience these processors are not up to modern standards.
OK
reasonable.
time.
Poor during our one session.
OK
NO
Sign ON Protocol
No comment No comment.
Engineering
UNIVAC* 1100/11 UNIVAC APL
Full APL.
Suitability
NO. But there is an extensive file system. NO comment.
Shared Variables
It exists but we have not been able to use a system for benchmarking at this
Had to be adjusted through ⎕WA
Full APL.
Language Execution
Burroughs APL/700
Good
Has not been benchmarked.
Response Time
CDC (CYBERNET Services). Slow. Machine was loaded. A CYBER 73 machine was on line during benchmark.
DEC 10 APLSF
Computer
Table 2
Comparison of Execution Times

The benchmark simulates the least squares fitting of 400 data points to a five parameter model. The timing was measured using the ⎕TS system variable or the equivalent function. The following function was executed on the IBM machines, the HP3000 and the CDC machine.

    TIMING
    T←¯4↑⎕TS
    A←?400 5⍴100
    B←?400⍴100
    Z←B⌹A
    1E¯3×-(1 60 60 1000⊥T)-1 60 60 1000⊥¯4↑⎕TS

Machine: Timing
IBM 370/158 (Princeton University): 0.6-2 sec.
CDC (Cybernet Services machine. A Cyber 73 machine was probably on line during the benchmark trial.): 1-17 sec.
SIGMA 7 (Datalogics, Cleveland, Ohio): 2-5 sec.
HP3000 Ser II Mod 5: 23.5 sec.
IBM 5100: 90 sec.
UNIVAC Systems: Would not execute. Either gave a workspace full or dropped out of APL.
Comments
(heavy loading)
(heavy loading)
Moderate loading. One user at the time of the benchmark. System was not fine-tuned for APL operation.
Benchmark was attempted on several systems.
Literature Cited

1. Chu, B., "Laser Light Scattering," Academic Press, New York, 1975.
2. Berne, B. J. and Pecora, R., "Dynamic Light Scattering," John Wiley and Sons, Inc., New York, 1976.
3. Mann, J. A. and McGregor, T. R., Colloid and Interface Science, V. III, 349, Academic Press, Inc., New York, 1976.
4. Roberson, D. A., Proceedings of the IEEE (1976), 64 (6), 994.
5. Lathwell, R. H., IBM J. Res. Develop. (1973) July, 353.
6. Gussen, N., IBM, Palo Alto Scientific Center, Private Communication.
7. Edwards, R. V., Private Communication.
5 A Distributed Minicomputer System for Process Calculations
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch005
A. I. JOHNSON, D. N. KIDD, and K. L. ROBERTS
University of Western Ontario, London, Canada
During the past decade one of the authors (Johnson) and his faculty colleagues and students have been developing, applying, and evaluating the modular approach to the steady state and dynamic behaviour of process systems. These studies have led to two executive systems, GEMCS (1) for steady state and DYNSYS (2) for dynamic systems studies, which have academic and industrial use. More recently a large timesharing computer, a DECsystem 10, has been used to create an integrated system for process analysis and design, INSYPS. This system offers optimization facilities and interfaces to the process designer or analyst through a graphics console (3). With the INSYPS system on the DECsystem 10 the process engineer can call, from one facility, upon the full range of executive programs for handling the various aspects of complex designs and greatly enhance his productivity and creativity.

INSYPS on a Large Computer

The original INSYPS deals with two concepts: the use of interactive graphics for input and output, and automatic linkage between independent packages. The integration of packages is carried out in a modular fashion to enable one to add new systems or modify existing ones without affecting the entire structure. The areas considered for INSYPS are:

1. Computer Graphics
2. Steady State Simulation
3. Optimization
4. Dynamic Simulation
In theory one could make all four of these communicate with each other. Figure 1 shows the system, illustrating both the possible links and those actually implemented. Communication between packages is either a single step linkage or an iterative link. A single step linkage transfers information once per execution, as in the plotting of the results from a dynamic simulator. It can usually be accomplished by a data file. The iterative linkage, on the other hand, deals with a continuous flow of information between packages, such as the link between a simulator and the optimization packages. This type of linkage can only be carried out by interfacing programs specially designed for the packages. Some of the characteristics of such a program would be:

1. Easy 'hook-up': that is, making a subsystem readily available for an application area with a flexible procedure and associated implementation technique.
2. Efficiency: this is required in running time, and is particularly desirable in iterative systems.
3. Automatic operation: once the link has been established the resulting system should be automatic in nature. The user should not be required to know the details of the linkage of the subsystem.
Figure 1. Solid line: existing links; dotted line: possible extensions to the system
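The difference between the two kinds of linkage can be made concrete with a small sketch. The Python below is a hypothetical illustration, not INSYPS code: `single_step_link` writes results once to a data file (an in-memory stream stands in for the disk file), while `iterative_link` is a golden-section search that calls a simulator on every iteration. The toy simulator and all names are assumptions chosen for the example.

```python
import io
import json


def single_step_link(results, stream):
    """Pass results once per execution through a data file."""
    json.dump(results, stream)


def iterative_link(simulate, low, high, tolerance=1e-6):
    """Golden-section search that calls the simulator on every iteration."""
    ratio = (5 ** 0.5 - 1) / 2
    while high - low > tolerance:
        left = high - ratio * (high - low)
        right = low + ratio * (high - low)
        if simulate(left) < simulate(right):
            high = right          # minimum lies in [low, right]
        else:
            low = left            # minimum lies in [left, high]
    return (low + high) / 2


calls = []


def toy_simulator(x):
    """Hypothetical plant cost with its minimum at x = 2."""
    calls.append(x)
    return (x - 2.0) ** 2 + 1.0


best = iterative_link(toy_simulator, 0.0, 5.0)
data_file = io.StringIO()                       # stands in for a disk file
single_step_link({"setting": best, "runs": len(calls)}, data_file)
print(data_file.getvalue())
```

The simulator is called many times by the optimizer but the file is written only once, which is why the iterative link benefits most from the running-time efficiency listed as the second characteristic.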
INSYPS was built with all of the above criteria in mind. It consists of four self-standing subsystems, four linking programs, and six data files. Figure 2 describes their layout. Since the subsystems are independent of each other they are not loaded together, except when the function to be optimized requires plant simulation.

Application of INSYPS

A chemical process is usually represented by a process flow sheet, which is made up of processing units joined by lines. The lines or streams represent flow of material or information. In general the flow sheet and the programs which simulate it deal with the equipment (or processing units) and the stream connections. One builds up the process by putting together the desired units, the shape and behaviour of which are pre-defined. A set of graphical symbols represents the unit computations, and a diagram can be built to represent the information flow. Usually there is a one-to-one correspondence between the graphical symbol and the processing unit. Each graphical symbol corresponds to a unit computation subprogram in the library of the simulation system. In general the arrangement of the module symbols will be similar to that on the process flow diagram.

The designer constructs the flow sheet by selecting processing units from the menu. Each unit he selects is given a number and parameters. Stream connections are also defined by picking the appropriate stream functions.
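A flow sheet of numbered units joined by numbered streams maps naturally onto a small data structure. The Python below is a hedged sketch of one possible representation, not the INSYPS file format: the class and field names are hypothetical, the stream fields follow the quantities the designer supplies for every stream (number, source and destination units, total flow, temperature, pressure, vapour fraction, component concentrations), and the numerical values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    number: int
    kind: str
    parameters: dict = field(default_factory=dict)


@dataclass
class Stream:
    number: int
    source: int            # unit number, 0 for a feed from outside
    destination: int       # unit number, 0 for a product leaving
    flow: float = 0.0
    temperature: float = 0.0
    pressure: float = 0.0
    vapour_fraction: float = 0.0
    composition: dict = field(default_factory=dict)
    material: bool = True  # False for an information stream


def topology(units, streams):
    """Map each unit number to the stream numbers entering and leaving it."""
    links = {u.number: {"inputs": [], "outputs": []} for u in units}
    for s in streams:
        if s.source in links:
            links[s.source]["outputs"].append(s.number)
        if s.destination in links:
            links[s.destination]["inputs"].append(s.number)
    return links


units = [Unit(1, "vaporizer"), Unit(2, "controller", {"set_point": 1.0})]
streams = [
    Stream(1, 0, 1, flow=1700.0),            # liquid feed (placeholder value)
    Stream(2, 1, 2, material=False),         # measurement sent to the controller
    Stream(3, 1, 0, vapour_fraction=1.0),    # vapour product
]
print(topology(units, streams))
```

Deriving the topology from the stream list alone reflects the point that drawing aids on the diagram are not passed to the simulator; only units, streams, and their parameters are.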
For every stream the designer gives i t s number, i t s s o u r c e a n d d e s t i n a t i o n u n i t s , i t s t o t a l flow, temperature, p r e s s u r e , vapour f r a c t i o n , and t h e component concentrations. By convention a material f l o w s t r e a m i s shown b y a s o l i d line, an information flow stream by a dotted l i n e . The o t h e r d r a w i n g o r w r i t i n g o p e r a t i o n s p e r f o r m e d a r e mnemonic a i d s o n l y a n d are not passed t o the s i m u l a t o r . On completion of the diagram the program understands t h e p r o c e s s t o p o l o g y - *-^e i n t e r c o n n e c t i o n s of t h e components nd has a record of the i n p u t / o u t p u t stream parameters f o r each p r o c e s s i n g u n i t in the diagram. T h e e n t i r e p r o c e s s i n f o r m a t i o n c a n be s a v e d on t h e d i s k b y e x e c u t i n g t h e SAVE f u n c t i o n on t h e menu. The o t h e r a s p e c t o f t h e g r a p h i c s s u b s y s t e m i s t h e graphical output. Since a person can e v a l u a t e a graph more q u i c k y t h a n he c a n a l o n g l i s t o f numbers, this aspect of INSYPS greatly facilitates the study of t r a n s i e n t svstems. The d e s i q n e r c a n d i s p l a y a single
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
_1
OUTPUT
i
OF
!
-,
)
VARIABLES
(DYNGRP
DESIRED
PLOTTING
I
I NUMERICAL [
(DYNSYS)
SIMULATOR
DYNAMIC
D Y N S Y S DATA F I L E
L
C R E A T I O N OF D A T A F I L E FOR D Y N S Y S (DYNDAT)
DOTTED
DOTTED LINE
80X
i TOPOLOGY
DATA
FILES
|
BOX -
SOLID L I N E -
SOLID
Figure 2
(OPTISEP)
PACKAGE
OPTIMIZATION
AND
PROCESS
(GRSPM )
- FLOW OF DATA
-
DIAGRAM
INPUT
PROCESS
CALLS
SOFTWARE
LNKSO
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch005
FOR
FILE
OUTPUT
I SIMULATION '
I
J
OUTPUT
I
l
ÎOPTIMIZED J
(GEMCS )
SIMULATOR
STATE
DATA
STEADY
DATA
GEMCS
OF
{GMCDATÎ
J GEMCS
FILE
CREATION
JOHNSON E T A L .
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch005
5.
A Distributed Minicomputer System
69
variable o r a g r o u p o f them t o g e t h e r , a n d c a n p o i n t a t any l o c a t i o n on t h e g r a p h a n d h a v e i t s v a l u e t y p e d o u t . He can also h a v e t h e g r a p h f r o m t h e CRT p l o t t e d o n a conventional plotter for better resolution or larger scale. The d e s i g n e r o f t h i s s y s t e m would start o f f by graphically creating the process flowsheet. ( T h e same s t r a t e g y i s used f o r both steady state and dynamic systems.) He i s e x p e c t e d t o know t h e u n i t computations corresponding to h i s process. Any o r a l l information on the process diagram ( i . e . changes i n equipment or streams or their parameters) can be updated graphically. On completion of the input, the designer is provided with an i n f o r m a t i o n f l o w d i a g r a m and a d a t a f i l e to run the d e s i r e d simulator. The results from the dynamic simulator a r e p u t on a d i s k f i l e , and s e l e c t e d i n f o r m a t i o n f r o m t h i s c a n be g r a p h e d . Figure 3 illustrates the i n t e r a c t i o n of the user with the computer programs o u t l i n e d above. C o n t r o l o f a Simple E v a p o r a t o r System The problem presented was to control a one-component total vaporizer with two mode controllers. The study required determination of optimum s e t t i n g s and s t e a d y s t a t e o p e r a t i o n . A diagram o f the p r o c e s s , presented i n f i g u r e 4, shows the control configuration applied. The
Figure 3. Diagram labels: user, creation of flow diagram, changes to input file, input file, output file, plotting, and graph display system.
Figure 4. Diagram labels: vapour, steam, liquid input.

evaporator has a capacity for 800 lbs. of liquid operating at normal level conditions. The equipment is designed to handle 1700 lbs/hr of any component at steady state operation. Three main variables have to be controlled in order to control the entire process: the level of liquid in the vaporizer, the output flow of the vapour, and the pressure in the vessel. These variables are controlled by manipulating, respectively, the liquid flow into the vaporizer, the steam through the coil, and the flow of vapour leaving the equipment.

Once the unit modules were designed and the process configuration defined, the information flow diagram for the simulation was created. Figure 5 shows the diagram as it appeared on the screen after it was drawn, using the programs of the graphics subsystem. The information provided in the streams represents the initial conditions at time zero. All the given values represent steady state operation conditions, except for the values of the pressure in the vessel (stream 5) and liquid level (stream 2), which will produce a step change in the simulation of those variables. This can be recognized by comparing the mentioned values against the set point value assigned to the control they feed. The first simulation run was started and the results displayed on the screen as graphs. There were five variables under observation: 1) Liquid flow, 2) Steam flow, 3) Vapour flow, 4) Liquid level, and 5) Vessel pressure. The basic criteria for control settings were: recovery from perturbation to normal steady
Figure 5. Control of an evaporator: information flow diagram as drawn on the graphics screen, with level control, flow control, and pressure control blocks, steam and vapour streams, and the on-screen command menu.

Figure 6. Simulation results, V(I) plotted against time. Panels: level behavior after an impulse in level control (constant gain 5.5, reset values as indicated); liquid flow behavior after an impulse in level control (constant gain 5.5, reset values as indicated); pressure behavior after an impulse in level control (constant reset = 0.02, gain values as indicated); vapor flow behavior after an impulse in level control (constant reset = 0.05, gain values as given).
state as fast as possible, minimum amplitude of oscillation during recovery. Sample comparative results for each of the variables mentioned are shown in Figure 6. The graphs describe the results of this example and at the same time illustrate the general procedure already mentioned. It is evident that in every set of curves representing specific values there is one that can be considered as the best in performance. A designer has to compromise at this point as to whether he wants faster response, at the cost of larger oscillation amplitudes and a higher number of oscillations, or slower response but in a smoother form.

An Integrated Process Analysis and Design Syntax on a Distributed Computer Assembly

Since INSYPS has demonstrated a potential opportunity to enhance the creativity and productivity of the process and design engineer, a distributed computer system is being developed. The system is tentatively named GRAMPS. The analytical capabilities of this new system will be similar to those provided by INSYPS, but several performance benefits are expected:

1. Graphic operations response speedup. The new system incorporates a programmable graphics processor capable of performing all flow diagram editing operations, graph labelling, and the like.
2. Calculational response speedup. Process calculations will be performed on a dedicated minicomputer, providing better and more predictable response time than a large timeshared system.
3. Portability. The system is compact, and can be moved on-site for a period of intensive use by a chemical company's own engineering personnel.
4. Economy of Operation. The operating costs are relatively fixed, and known in advance; they are insensitive to the amount of analytical work done.

Configuration and Operation

The GRAMPS system (Figure 7) consists of two minicomputers and various peripherals and communications links.

Graphics Computer: This is a 32K minicomputer which is programmed in GRAPPLE (4), a graphics
Figure 7. Diagram labels: graphics computer with hard copy output; process calculation computer; large fast computer (disc and tape storage, card input, printing, plotting); low cost disc storage; teletype for hard copy (alphanumeric); TV monitors; TV tape recorder (image and voice).
programming language similar in style to ALGOL. This system is manufactured by Systems Approach, Ltd. in Ottawa on license from Bell Northern Research, Ltd. It includes four floppy disk drives for local storage, and a Tektronix 4015 storage tube graphics terminal with plotter for hard-copy output or digitized input.

Process Calculations Computer: This is a 32K PDP-11/34 running under RT-11 and programmed in FORTRAN. It presently has two floppy disk drives for local storage and a teleprinter for hard-copy printout. Current plans call for this system to be expanded to 64K memory with addition of a floating point processor and cartridge disk storage.

The GRAMPS system hardware cost is about $100,000. The PDP-11 is the nucleus of the system; it has communication links with the GRAPPLE console, the DECsystem-10, and a teleprinter, and is responsible for overall system control and file routing.

When the INSYPS capabilities have been fully implemented in GRAMPS, the process flowsheet editing function will reside on the GRAPPLE console. When the flow sheet and parameters have been specified, the PDP-11 will be passed a file giving the process topology and the associated values. One or more analysis operations may be carried out on the PDP-11, with the results being written to files. Output files may then be transferred to the teleprinter for listing and/or to the GRAPPLE console for display as graphs.
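
The hand-off described above, in which the graphics console passes the calculation computer a file giving the process topology and the associated values, can be pictured with a small sketch. This is an illustration only: the source does not give the GRAMPS file layout, so the `Stream` and `Flowsheet` classes, the field names, the unit names, and the use of JSON as the file format are all assumptions made for the example (the actual system is programmed in GRAPPLE and FORTRAN).

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Stream:
    number: int
    source: str        # unit the stream leaves ("" for a feed); hypothetical field
    destination: str   # unit the stream enters ("" for a product); hypothetical field
    values: dict = field(default_factory=dict)  # associated values, e.g. a flow rate


@dataclass
class Flowsheet:
    units: list
    streams: list

    def to_text(self) -> str:
        """Serialize topology and values so they can be transferred as a file."""
        return json.dumps(asdict(self), indent=2)

    @staticmethod
    def from_text(text: str) -> "Flowsheet":
        """Rebuild the flowsheet on the receiving computer."""
        raw = json.loads(text)
        return Flowsheet(raw["units"], [Stream(**s) for s in raw["streams"]])

    def inputs_of(self, unit: str) -> list:
        """Stream numbers entering a unit, as needed to order the calculations."""
        return [s.number for s in self.streams if s.destination == unit]


# Hypothetical two-unit fragment of an evaporator flowsheet.
sheet = Flowsheet(
    units=["LEVEL_CONTROL", "EVAPORATOR"],
    streams=[
        Stream(1, "", "LEVEL_CONTROL", {"flow_lb_per_hr": 1700.0}),
        Stream(2, "LEVEL_CONTROL", "EVAPORATOR", {"flow_lb_per_hr": 1700.0}),
    ],
)
received = Flowsheet.from_text(sheet.to_text())
```

Keeping the topology (which stream connects which units) separate from the stream values mirrors the description of a single file carrying both, and lets the receiving side determine which streams feed each unit before any analysis is run.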
At present the various analysis packages for steady state simulation, optimization, and dynamic simulation are operational on the PDP-11, and current emphasis is on the communications link between the GRAPPLE console and the PDP-11.

Research and Development Needs

The development of a modular process analysis and design system with a range of capabilities (e.g. steady state simulation, dynamic simulation, optimization, design and synthesis) on a distributed computing system has created some exciting opportunities for redesign of the system executive programs and for research into efficient data and program storage. Low cost graphics terminals have been shown to greatly enhance the creativity and productivity of the design engineer. Yet much needs to be learned about the quantity and quality of information to be presented on the screen, and the differing needs of a range of users. While all are familiar with graphical representation of data, new needs and opportunities for the presentation and manipulation of complex systems await development.
In the system under development the GRAPPLE console has a significant analysis capability. The best use of this requires further research. It is apparent that easy communication between and among the computers of a distributed system is a key for their effective use. The units should be essentially parallel processors, taking full advantage of the independence and differing strengths of the computer systems. The communication needs are closely tied to the optimum use of the diskette and cartridge disk storage available. The diskettes provide convenient low cost, personal aids to program development and evaluation. Cartridge disks provide long-term, high-volume storage and faster access. The process calculation computer must serve as a communication link to remote computers when these can solve complex problems outside the capabilities of the local system.

Distributed minicomputer-based systems open new dimensions for process analysis and design and encourage reconsideration of the existing design methodology.

Abstract

This paper describes a distributed minicomputer-based system for simulation and optimization studies of chemical process systems. The system provides an integrated analysis environment for the process engineer, including graphical input of flow sheets and display of performance curves. The system was initially developed on a large timesharing system and is now being re-developed on a distributed minicomputer system.

Literature Cited

1. 'GEMCS - General Engineering and Management Computation System', (1971), A. I. Johnson and Associates, The University of Western Ontario, Faculty of Engineering Science, London, Canada.
2. 'DYNSYS User's and Systems Manuals', (1976), A. I. Johnson, J. Barney, R. S. Ahluwalia, available from SACDA, The University of Western Ontario, London, Canada.
3. Ahluwalia, R. S., Lopez, J., Johnson, A. I., Millares, R., 'Integrated Computer Aided Design System for Process Design', Proceedings of the IFAC Symposium on Large Scale Systems Theory and Applications, Udine, Italy, June 1976.
4. Woolsey, L. G., 'Design for a High Level Graphics Language Machine', Infor, vol. 13 (1975), pp. 248-259.
6
A Computer Data Acquisition and Control System for an Atmospheric Cloud Chamber Facility
D. E. HAGEN, K. P. BERKBIGLER,* J. L. KASSNER, JR., and D. R. WHITE
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch006
Graduate Center for Cloud Physics Research, University of Missouri, Rolla, MO 65401
* Present address: Sandia Laboratories, Livermore, Ca. 94550.

The Graduate Center for Cloud Physics Research is a multi-disciplinary research center devoted primarily to the study of the microphysical processes active in cloud and fog. The research tools from the disciplines of physics, chemistry, mechanical engineering, and electrical engineering are employed in this effort. Special emphasis is placed on laboratory experiment and theoretical work, complemented by some field measurement activity. The purpose of this paper is to describe the hybrid mini/macro computer system that is used to support one of our experimental laboratory facilities, the cloud simulation chamber. The minicomputer is dedicated to serve the data acquisition and control needs of the chamber and is not used for general purpose batch job processing. It is just one of many peripheral subsystems used to support the cloud simulation chamber. The chamber and its peripherals are devoted to "classic" academic research and as a result are in a continuing state of evolution. This discussion will emphasize the present state of development of the computer/chamber system, with some discussion given to the near future plans for the system.

The cloud simulation facility which is supported by the computer system is shown in the block diagram in Fig. 1 and in the photograph in Fig. 2. At the heart of the system is the cloud simulation chamber. It is an expansion cloud chamber, one of the longest used and more important tools in the cloud physics laboratory. In this device a sample of moist aerosol
Figure 1. Cloud simulation chamber facility block diagram, distinguishing air flow from data flow and control. Blocks: room air, air preparation, humidifier, thermal regulation, aerosol generation, simulation chamber, control chamber, optical observation systems, air exhaust, and the NOVA and IBM computers.
laden air is cooled by expansion. The air becomes supersaturated with respect to water and the water vapor condenses out on certain categories of the particulates to form a cloud which can then be studied.

The expansion chamber is supported by a variety of peripheral systems. An air preparation system provides a supply of clean dry air. A bank of humidifiers raises the vapor content of the air to 100% relative humidity at a precisely known temperature. Aerosol generators provide a stable and predictable aerosol to serve as condensation nuclei. The control chamber allows for extensive analysis of the sample gas and its aerosol at the initial simulation chamber conditions during the time the cloud is being formed in the simulation chamber. The cloud in the simulation chamber can be observed by several optical systems: light attenuation, laser doppler shift (1), Mie scattering, photography, and visual observation by telescope. The aerosol characterization (critical activation supersaturation spectrum) is accomplished with a variety of techniques: electric aerosol analyzer (2), continuous flow thermal
Figure 2. Photograph of the cloud simulation facility.
diffusion chamber (3), Laktinov Chamber (4), and a Gardner counter.

A major limitation of expansion cloud chambers results from wall effects (5). The ordinary Wilson expansion cloud chamber cools and supersaturates the gas by means of a rapid expansion which is approximately adiabatic, but the chamber walls, whose heat capacity is very high in comparison to the gas, remain at their initial temperature. Heat from the walls flows into the gas and reduces the supersaturation, destroys the adiabaticity of the expansion, and leads to errors in one's knowledge of the evolving thermodynamics. Furthermore, evaporation occurs at all wet surfaces. Our simulation chamber has the unique feature of cooling the chamber walls in unison with the gas to remove this wall effect. The wall cooling is accomplished by thermoelectric modules (6) sandwiched between the chamber's interior wall and an external heat sink.

Control of the gas and wall temperatures is one of the major real-time tasks of the computer system. The control loop is complicated by the fact that the gas temperature cannot be directly measured with sufficient accuracy (5), (7). Instead it is calculated from thermodynamics (5) and drop growth theory and measurements of pressure and liquid water content. The set of equations governing these processes is given below:
$$\dot{T} = \frac{(nRT/P)\,\dot{P} - (L + \beta T)\,\dot{r}}{c_p + s\,(r_0 - r)} , \qquad (1)$$

$$\dot{a}_j = \frac{D\,\rho_{eq}\,[S - S_j^*(a_j)]}{\rho_L\,(a_j + \ell)} , \qquad \text{(ref. 8)} \qquad (2)$$

$$r = r_0 - \frac{4}{3}\,\pi\,\rho_L \sum_j N_j\,(a_j^3 - a_{j0}^3) , \qquad (3)$$

where

$T$ denotes temperature,
$n$ denotes the number of moles of gas,
$R$ denotes the universal gas constant,
$P$ denotes pressure,
$L$ denotes the latent heat of vaporization of water; it is temperature dependent,
$\beta$ denotes the coefficient in the linear dependence of the specific heat of moist air on its mixing ratio, i.e. $\beta = dc_p/dr$,
$r$ denotes the mixing ratio for the moist air, i.e. the number of grams of water vapor contained in one gram of dry air,
$r_0$ denotes the initial mixing ratio,
$c_p$ denotes the specific heat of moist air; it is temperature, pressure, and mixing ratio dependent,
$s$ denotes the specific heat of liquid water; it is temperature dependent,
$a_j$ denotes the radius of a cloud drop in family $j$,
$j$ denotes the drop family; the cloud is broken down into a set of families based on the amount of condensation nuclei material contained within the drops,
$D$ denotes an effective diffusion constant; it is temperature and pressure dependent,
$\rho_{eq}$ denotes the equilibrium vapor density of water; it is temperature dependent,
$S$ denotes the ambient supersaturation ratio; it depends on temperature, pressure, and mixing ratio,
$S_j^*$ denotes the equilibrium supersaturation ratio for a drop in family $j$ with radius $a_j$; it is temperature dependent,
$\ell$ denotes the kinetic coefficient in droplet growth theory; it is temperature and pressure dependent,
$\rho_L$ denotes the density of liquid water,
$N_j$ denotes the number of droplets of family $j$ in our sample (those contained in one gram of dry air),
$a_{j0}$ denotes the initial radius of drops in family $j$.

The dot denotes differentiation with respect to time. Eq. (1) describes the thermodynamic evolution of the system, and Eq. (2) describes the diffusional growth of the cloud drops. Eq. (2) represents a set of equations, one for each cloud drop family. Normally a system of 10 to 30 families is included in the problem. Equations (1) and (2) are coupled through Eq. (3). The gas temperature calculation involves the numerical solution of this set of coupled differential equations.
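
One way to picture the numerical solution of this coupled set is the sketch below. It is not the authors' program: the source does not name the integration method, so a simple forward-Euler step is used purely for illustration (droplet growth equations can be stiff, and a working code would likely need a more robust integrator). The function names, the coefficient dictionary `coeffs` and its keys, the constant placeholder values, the linear pressure record, and the constant supersaturation functions are all assumptions; in the real calculation the coefficients depend on temperature, pressure, and mixing ratio as listed above, and measured liquid water content is also used. The rate of change of the mixing ratio is obtained by differentiating Eq. (3) with respect to time. Units are chosen so that energies are in joules per gram of dry air and lengths in centimeters.

```python
import math


def rates(T, radii, P, dPdt, c):
    """Evaluate Eqs. (1)-(3); return (dT/dt, [da_j/dt], r)."""
    # Eq. (3): mixing ratio from the liquid water taken up by the drops.
    liquid = sum(N * (a**3 - a0**3)
                 for N, a, a0 in zip(c["N"], radii, c["a0"]))
    r = c["r0"] - (4.0 / 3.0) * math.pi * c["rho_L"] * liquid
    # Eq. (2): diffusional growth rate of each drop family.
    S = c["S"](T, P, r)
    da = [c["D"] * c["rho_eq"] * (S - c["S_star"](a, T))
          / (c["rho_L"] * (a + c["ell"])) for a in radii]
    # Time derivative of Eq. (3).
    drdt = -4.0 * math.pi * c["rho_L"] * sum(
        N * a * a * d for N, a, d in zip(c["N"], radii, da))
    # Eq. (1): thermodynamic evolution of the gas temperature.
    dT = ((c["n"] * c["R"] * T / P) * dPdt
          - (c["L"] + c["beta"] * T) * drdt) / (c["c_p"] + c["s"] * (c["r0"] - r))
    return dT, da, r


def integrate(T0, P_of_t, c, t_end, steps):
    """Forward-Euler integration; P_of_t(t) stands in for the measured pressure."""
    dt = t_end / steps
    T, radii = T0, list(c["a0"])
    for k in range(steps):
        t = k * dt
        P = P_of_t(t)
        dPdt = (P_of_t(t + dt) - P) / dt
        dT, da, _ = rates(T, radii, P, dPdt, c)
        T += dt * dT
        radii = [a + dt * d for a, d in zip(radii, da)]
    return T, radii


def pressure(t):
    """Hypothetical pressure record: linear drop from 1000 to 900 units in 10 s."""
    return 1000.0 - 10.0 * t


# Placeholder coefficients (assumed constant here; one drop family).
coeffs = {
    "n": 1.0 / 28.97,   # moles of gas per gram of dry air (approximate)
    "R": 8.314,         # J / (mol K)
    "L": 2450.0,        # J / g
    "beta": 0.87,       # J / (g K)
    "c_p": 1.005,       # J / (g K)
    "s": 4.18,          # J / (g K)
    "D": 0.25,          # cm^2 / s
    "rho_eq": 1.7e-5,   # g / cm^3
    "rho_L": 1.0,       # g / cm^3
    "ell": 1.0e-5,      # cm
    "r0": 0.0145,       # g / g
    "N": [8.0e4],       # drops per gram of dry air
    "a0": [1.0e-4],     # cm
    "S": lambda T, P, r: 1.01,    # placeholder ambient supersaturation ratio
    "S_star": lambda a, T: 1.0,   # placeholder equilibrium ratio
}
T_end, radii_end = integrate(293.0, pressure, coeffs, t_end=10.0, steps=1000)
```

With the supersaturation term switched off (S equal to S_j* for every family) the drops do not grow and Eq. (1) reduces to the dry adiabatic relation, which gives a convenient check on any implementation; with growth switched on, the released latent heat leaves the gas warmer than the dry adiabatic value.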
The steps involved in a typical experiment are as follows. First a large sample of moist aerosol laden air is prepared and used to thoroughly flush the chamber. The chamber is sealed and a stilling period is allowed for air motion to die down and equilibrium to be reached. During this time the initial temperature, pressure, relative humidity, and aerosol characteristics are determined. After the stilling period a slow expansion is performed, which simulates the expansion experienced by ascending air parcels in the real atmosphere, and the evolution of the resulting cloud is observed. The observations are then compared with theory during the post-mortem analysis.

In the near future our computer system will be used to support a second cloud simulation chamber facility as well as the present one. NASA is planning to put an Atmospheric Cloud Physics Laboratory (ACPL) on board Skylab in 1980. The ACPL will be patterned after our cloud simulation chamber facility. In support of the ACPL NASA has constructed a prototype facility, called the Science Simulator, which contains most of the hardware shown on Fig. 1. The NASA Ground Based Functional Science Simulator will be located at our research center and will be supported by the same computer system that services our simulation chamber. The Science Simulator will be used to: train the astronauts who will operate the ACPL, test new pieces of equipment before they are incorporated into the ACPL, and aid in the preparation of experiments for the ACPL.

Data acquisition is one of the computer's primary real-time duties. The simulation chamber and its peripheral subsystems generate a variety of analog and digital data. Approximately 100 temperature points are measured throughout the system, with analog temperature signals derived from transistor thermometers (9), thermocouples, and thermistors. Other analog signals are generated by air flow meters, a pressure transducer, a valve position indicator, a photomultiplier tube, a silicon photodiode, and several voltages from the wall temperature controllers. Digital data sources are a Hewlett Packard quartz crystal thermometer, a laser scattering counter, optical particle counters, and an external digital clock. In total the computer system services 152 analog data inputs, 128 from the UMR simulation chamber, and 32 from the NASA Science Simulator; and 80 bits of digital input data, 48 from the UMR system, and 32 from the NASA system.
Control is the computer's other real-time responsibility. Numerous air flow control valves are involved in the system shown in Fig. 1, and these are all digitally controlled. The NASA chamber's analog data input hardware has an external multiplexer built into it, and it is digitally controlled through our digital output unit. Both chambers have camera/flash optical systems under digital control. The continuous flow diffusion chamber requires analog temperature control and digital control for its optical particle counters.

During the expansion we want the gas and wall temperatures to track each other accurately in order to minimize wall effects. Causing this to happen is our most difficult control problem. Because of its high heat capacity and slow response time the wall temperature is made to track a pre-determined time profile. The desired time dependent wall temperature signal is generated by the NOVA and output through its D/A. The entire wall is broken down into 28 individually controlled sections. The sectionalization of the chamber aids in reducing thermal differences from place to place on the surface of the chamber. These are isolated from one another with 28 linear analog isolators.
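
The idea of generating a pre-determined wall temperature profile and sending it out through a D/A converter can be sketched as follows. Everything specific in this sketch is hypothetical: the linear cooling ramp, its rate and limits, the 12-bit converter, the temperature span mapped onto it, and the function names are illustrative assumptions, since the source gives none of these details. Only a single setpoint signal is shown; its distribution to the 28 wall sections is not modeled.

```python
def wall_setpoint(t, T_start=20.0, rate=-0.1, T_floor=10.0):
    """Hypothetical pre-determined profile: linear cooling in deg C, t in s."""
    return max(T_floor, T_start + rate * t)


def to_dac_code(temp, T_lo=0.0, T_hi=25.0, bits=12):
    """Map a temperature onto an unsigned D/A code, clamped to the span."""
    full_scale = (1 << bits) - 1
    fraction = (temp - T_lo) / (T_hi - T_lo)
    fraction = min(1.0, max(0.0, fraction))
    return round(fraction * full_scale)


def schedule(duration, dt):
    """Return (time, setpoint, code) triples at each output instant."""
    steps = int(round(duration / dt))
    table = []
    for k in range(steps + 1):
        t = k * dt
        setpoint = wall_setpoint(t)
        table.append((t, setpoint, to_dac_code(setpoint)))
    return table


# One minute of setpoints at one output per second (assumed rate).
profile = schedule(duration=60.0, dt=1.0)
```

Because the wall responds slowly, computing the whole setpoint table ahead of the expansion, as here, is one natural way to realize a profile that is fixed in advance rather than derived from feedback.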
During the experiment we then measure the wall temperature and the thermodynamic parameters that determine the gas temperature, calculate the gas temperature, and then control the gas pressure so as to keep the gas at the wall temperature at all times.

We have two distinct systems available for the gas temperature control during the expansion. The first is a hybrid digital/analog controller (10). Here the gas temperature is approximated by the sum of a dry adiabatic temperature term plus a latent heat term due to condensed cloud droplets. The dry adiabatic temperature is calculated via a small analog computer. The latent heat term is calculated digitally and the result is output through a D/A channel. The two temperature components are then summed with an analog summation amplifier and given analog compensation, and then the result is used to drive a three-way (chamber, high pressure reservoir, low pressure reservoir) rotary valve driven by a servo motor. This hybrid system was the earliest to be put into operation; however, it suffers from the usual difficulties with analog systems, drift and inaccuracy. Also its range of accurate operation limited it to relatively small expansions, those with a temperature change of 2°C or less. Much larger expansions are desired.
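
The decomposition used by the hybrid controller, a dry adiabatic term plus a latent heat term, can be illustrated numerically. The sketch below is not the controller of reference (10), whose exact expressions are not given here. It assumes the standard dry adiabatic relation T = T0 (P/P0)^κ with κ ≈ 0.286 for dry air, and it assumes the latent heat correction takes the simple form L w / c_p, where w is the condensed liquid water per gram of air; the function names and the constant values chosen for L and c_p are illustrative.

```python
KAPPA = 0.2857         # R / c_p for dry air (dimensionless)
LATENT_HEAT = 2450.0   # J/g, approximate latent heat of vaporization near 20 deg C
C_P = 1.005            # J/(g K), specific heat of dry air


def dry_adiabatic_temperature(T0, P0, P):
    """Temperature (K) after a dry adiabatic pressure change from P0 to P."""
    return T0 * (P / P0) ** KAPPA


def latent_heat_term(liquid_water):
    """Warming (K) from condensing liquid_water grams of water per gram of air."""
    return LATENT_HEAT * liquid_water / C_P


def approximate_gas_temperature(T0, P0, P, liquid_water):
    """Sum of the two terms, mirroring the analog summation described above."""
    return dry_adiabatic_temperature(T0, P0, P) + latent_heat_term(liquid_water)


# A pressure drop of roughly 2.4 percent from 20 deg C cools dry air by about 2 K.
T_after = dry_adiabatic_temperature(293.15, 1000.0, 976.3)
```

Under these assumptions the 2°C operating range quoted for the hybrid system corresponds to only a few percent of pressure change, which is consistent with the remark that much larger expansions are desired.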
The second gas temperature controller, now being completed, is a purely digital system, with the computer doing all of the pressure-temperature calculations and control compensation. Fig. 3 contains a block diagram of the control scheme. P denotes pressure, T_wall denotes the measured wall temperature, and the X's denote other measured parameters which vary from experiment to experiment. A discrete time optimal tracking technique is used. The control signal is sent through the digital output system to a stepper motor driving a three-way rotary valve. The number from the digital output is sent to a specially designed interface which then emits that number of pulses at 900 Hz to the stepper motor.

Several reasons were behind the decision to use digital data acquisition and control rather than the combination manual-analog systems used on previous fast expansion chambers. First of all, the accuracies required in the control related computations are beyond those available with analog transfer function generators. In order to take full
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch006
T
OPERATOR
NOVA 56K
MULTIPLEXED
846
A/D
WORDS
STEPMOTOR INTERFACE
STEPMOTOR
8-WAY ROTARY VALVE
VALVE POSITION
CLOUD SIMULATION CHAMBER
Figure S.
P
' HAU.' T
X
1'
X
2'
···
Expansion control loop block diagram
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch006
6.
HAGEN
E T A L .
Atmospheric Cloud Chamber Facility
85
advantage o f t h e a c c u r a c i e s o f t h e w a l l t e m p e r a t u r e and t h e chamber p r e s s u r e measurements, t h e gas tempe r a t u r e s h o u l d be c a l c u l a t e d t o an a c c u r a c y o f 0.001 °C. A n a l o g d e v i c e s c o u l d n o t meet t h i s r e q u i r e m e n t . Furthermore, the c o s t o f the l e s s accurate analog system i s comparable t o t h a t o f t h e d i g i t a l system i f n o t more e x p e n s i v e . The c o s t o f t h e d i g i t i z i n g equipment must be i n c l u d e d i n e i t h e r approach s i n c e the d a t a has t o be i n d i g i t a l form i n o r d e r t o be s t o r e d and a n a l y z e d . Moreover t h e m u l t i p l e x i n g capa b i l i t y o f t h e d i g i t a l system r e d u c e s t h e o v e r a l l cost considerably. A n o t h e r major advantage o f t h e d i g i t a l system is i t s v e r s a t i l i t y . The s i m u l a t i o n chamber c a n be used f o r a wide v a r i e t y o f e x p e r i m e n t s . W i t h a comp u t e r based c o n t r o l and d a t a a c q u i s i t i o n system t h e n a t u r e o f t h e e x p e r i m e n t c a n be changed s u b s t a n t i a l l y by making o n l y s i m p l e s o f t w a r e changes t o t h e computer program, w i t h l i t t l e o r no hardware changes. Such changes a r e f a s t e r and l e s s e x p e n s i v e than a n a l o g hardware changes and r e a d j u s t m e n t s . T h i s f e a t u r e g r e a t l y i n c r e a s e s the v e r s a t i l i t y o f the chamber and i n c r e a s e s t h e number o f e x p e r i m e n t s w h i c h t h e chamber c a n p e r f o r m p e r u n i t t i m e . I t a l s o makes i t e a s i e r f o r more t h a n one e x p e r i m e n t e r to use t h e chamber. Computer
System
Figure 4 shows a block diagram of the computer system used for cloud chamber support. Fig. 5 shows a photograph of the computer system. The NOVA 840 system is dedicated to our two cloud chamber facilities and is not used for the batch processing of general jobs. The cable link is a hardwire 19.2 K-baud link to the UMR IBM 360/50 through the UMR mini-network (12). The IBM 360/50 features 524,288 bytes of core storage, 5 IBM 2314 disk drives, 2 IBM 2415 magnetic tape units, 2 IBM 1403N1 printers, an IBM 2540 card reader/punch, an IBM 2501 card reader, and an IBM 2701 transmission control unit for data transmission lines to the University of Missouri IBM 370/168 and 370/158 computers. The IBM 370 is not used for the direct real time support of our cloud simulation chamber effort. It is used for general modelling, software development, and the experiment's post-mortem analysis. Our NOVA minicomputer system was acquired over a period of several years at a total cost of $73,500.
Figure 4. Computer system block diagram. Components shown:

NOVA 840: 56K memory; memory management; 800 nsec cycle time; real time clock; hardware multiply/divide; auto program load
I/O expander: Computer Products RTP 7410
High level analog input: 32 channel A/D, 15-bit, 20 kHz, ±10 volt; Computer Products RTP 7460
Wide range analog input: 128 channel A/D, 15-bit, programmable gain amplifier, ±2.5 mV to ±10 volts in 13 ranges, 40 Hz; Computer Products RTP 7480
Digital I/O: 7 16-bit inputs, 5 16-bit outputs, TTL compatible; Computer Products RTP 7430
Optical isolator: manual override; LED status lights; source or sink I/O
Analog output: 8 channel D/A (3 14-bit, 0-10 volt; 2 12-bit, 0-10 volt; 1 14-bit, ±10 volt; 2 12-bit, ±10 volt)
Disc storage: 1.2M words, Diablo 31
Tape cassette: 800 words/sec, 50K words
Cable link to UMR mini-network: 8 bit, serial asynchronous interface
Teletype 35 KSR
Alphanumeric display (CRT)
Digital plotter: Tektronix 4662
Line printer: Tally T-1120
Maintenance is handled by our Center's electrical engineering staff at a cost of approximately $100/month. The computer system has proven quite reliable. In its 2½ years of operation it has rarely caused the shutdown of the cloud chamber facility. Our campus operates ten of the NOVA 800 series minicomputers. A staff member in the electrical engineering department handles the more sophisticated repair and interfacing problems for all the NOVAs on campus. The optical isolators attached to the digital I/O are homemade units that cost about $3500.

The NOVA's operating system is RDOS (Real time disc operating system). The memory management and protection option divides the computer into three partitions that are hardware protected from each other. The operating system resides in one, and two independent programs can be run in a time sharing mode in the other two partitions. Hence, we can run the two cloud chambers simultaneously, or run one chamber and edit programs for the other, etc. The partitioning feature has proven quite useful. Our cloud chamber support programming is done in Fortran, Basic, and Assembler. Most of the chamber operation programs are in Fortran; all of the subroutines which handle our peripherals (A/D, D/A, etc.) are in Assembler and are Fortran callable.
The IBM 360/50 services the entire UMR campus and handles both batch jobs and real time service of the mini-network. It is a $1,000,000 machine with a $2,000/month maintenance cost. It uses the OS 360 MVT Rel. 21 operating system and the mini-network is accessed with the BTAM method. All of our usage of the IBM 360 is done under Fortran.

Computer System Role in Expansion Control

There is a distinct division of labor between the NOVA and the IBM 360 during an experiment. The 360 is involved in chamber operation because we need to calculate the gas temperature (the gas temperature cannot be directly measured with sufficient accuracy due to condensation on the sensors (5,7)) in order to properly control the experiment (make the gas temperature track the wall temperature). The gas temperature is calculated via a numerical cloud model that solves a set of simultaneous differential equations (Eqs. (1-3)), describing the thermodynamics and cloud droplet growth. This cloud model is too large and runs too slowly for the NOVA to handle.
Inputs to the cloud model are the initial temperature (T_0), pressure (P_0), supersaturation ratio (S_0), the aerosol's critical supersaturation spectrum, and the pressure vs. time (P(t)) profile during the experiment. The output from the cloud model is the time dependent temperature, T(t), plus drop sizes and other thermodynamic information. Ideally the cloud model should be run in real time using the initial conditions and aerosol characterization measured just before the expansion plus the measured real time pressure profile as inputs. However, this is not practical for a machine of the NOVA's size and speed.

Instead we use a "perturbation" type of approach. The NOVA collects the initial thermodynamic and aerosol starting conditions. We know in advance the approximate pressure profile that will occur during the experiment since our goal is to make the gas track the wall temperature, and this defines a pressure profile. The wall temperature profile is predetermined to the extent that we can make the walls track a desired temperature profile. The cloud model (Eqs. (1-3)) is run on the 360 for the measured initial conditions and the expected pressure profile. The resulting temperature profile is approximately what will occur during the actual experiment. All we need to calculate in real time is the deviation from this anticipated behavior due to deviations in the other measured parameters during the experiment.
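As a rough illustration of how the tracking goal defines a pressure profile, the sketch below inverts the dry adiabatic relation to obtain the pressure that would hold a dry gas at a prescribed wall temperature. It deliberately ignores the latent heat released by droplet growth, which the authors' cloud model accounts for, so it is only a first approximation. The constants, function names, and the linear cooling profile are assumptions chosen for illustration.

```python
# Textbook constants for dry air (assumed; not taken from the chapter).
R_D = 287.05   # gas constant of dry air, J/(kg K)
C_P = 1004.0   # specific heat of dry air at constant pressure, J/(kg K)


def linear_cooling(t0_kelvin, rate_k_per_s, duration_s, step_s):
    """Hypothetical desired wall temperature profile: constant cooling rate."""
    steps = int(round(duration_s / step_s))
    return [t0_kelvin - rate_k_per_s * i * step_s for i in range(steps + 1)]


def expected_pressure_profile(t0_kelvin, p0, wall_temperatures):
    """Pressure at each step that keeps a dry gas at the wall temperature.

    Inverts T = T0 * (P / P0) ** (R_D / C_P) for P.
    """
    exponent = C_P / R_D
    return [p0 * (t / t0_kelvin) ** exponent for t in wall_temperatures]


if __name__ == "__main__":
    # Hypothetical run: start at 20 C and 1000 mb, cool 10 K over 100 s.
    walls = linear_cooling(293.15, 0.1, 100.0, 10.0)
    for t_wall, p in zip(walls, expected_pressure_profile(293.15, 1000.0, walls)):
        print(round(t_wall, 2), round(p, 2))
```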
To accomplish this the IBM 360 performs a functional fit; the gas temperature is expressed as a simple function of measured parameters. Various options are available for which parameters are used (liquid water content, mixing ratio, light attenuation, pressure, etc.) depending on the type of experiment. These functional fits are found to be quite accurate (on the order of 0.001 °C) but are valid only for the exact experiment for which they were calculated. They must be recalculated for each experiment. The 360 transmits the results of the functional fit back to the NOVA. This information is then used in the NOVA for its real time control calculations. Since simple functions were used in the functional fit, they can be evaluated in real time on the NOVA. The following list summarizes the steps taken by the two linked computer systems in order to accomplish an experiment.
1) The NOVA oversees the preparation of the experiment and the measurement of the initial conditions of the system (T_0, P_0, S_0, and aerosol).
2) The information taken in step 1 plus the expected pressure profile, P(t), is transmitted from the NOVA to the IBM 360.
3) The cloud model is then run on the IBM 360, yielding the temperature profile T(t).
4) A functional fit is performed on the IBM 360, yielding T(P, x_1, x_2, ...), i.e., temperature as a function of the measurable parameters P, x_1, x_2, .... P denotes pressure, and x_1, x_2, ... the other measurable parameters selected for this experiment.
5) The function T(P, x_1, x_2, ...) is transmitted from the IBM 360 to the NOVA.
6) The NOVA uses the function T(P, x_1, x_2, ...) for its real time control calculations during the experiment, as shown in Fig. 3.

Steps 3 and 4 are run on the IBM 360 during the stilling period following the flushing of the chamber with the aerosol-laden moist air.
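Steps 3 through 6 can be sketched in miniature as follows. This is an assumption-laden illustration rather than the authors' code: the cloud model of Eqs. (1-3) is replaced by a stand-in formula (dry adiabat plus latent heating), the only extra parameter x_1 is liquid water content, the fitted form is a quadratic in pressure plus a linear term in x_1, and all names and sample values are hypothetical. The point is only that an expensive model is run once on the large machine, and the small machine then evaluates a handful of coefficients.

```python
# Stand-in constants (textbook values, assumed for this sketch).
R_D, C_P, L_V = 287.05, 1004.0, 2.5e6
T0, P0 = 293.15, 1000.0          # hypothetical initial state: kelvin, millibar


def stand_in_cloud_model(p_mb, water_g_per_kg):
    """Placeholder for the real cloud model: dry adiabat + latent heating."""
    dry = T0 * (p_mb / P0) ** (R_D / C_P)
    return dry + (L_V / C_P) * water_g_per_kg * 1.0e-3


def basis(p_mb, water_g_per_kg):
    """Simple basis functions; pressure is rescaled to keep the fit well behaved."""
    x = (p_mb - P0) / 100.0
    return [1.0, x, x * x, water_g_per_kg]


def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_temperature_function(samples):
    """Least squares fit ("large machine" side): returns the coefficients."""
    n = len(basis(P0, 0.0))
    ata = [[0.0] * n for _ in range(n)]
    aty = [0.0] * n
    for p, w, t in samples:
        row = basis(p, w)
        for i in range(n):
            aty[i] += row[i] * t
            for j in range(n):
                ata[i][j] += row[i] * row[j]
    return solve_linear_system(ata, aty)


def evaluate_fit(coeffs, p_mb, water_g_per_kg):
    """Cheap real time evaluation ("small machine" side)."""
    return sum(c * b for c, b in zip(coeffs, basis(p_mb, water_g_per_kg)))


def make_samples():
    """Model output over the expected pressure range and a few water contents."""
    pressures = [P0 - 5.0 * i for i in range(21)]      # 1000 mb down to 900 mb
    waters = [0.0, 0.5, 1.0]                            # g/kg
    return [(p, w, stand_in_cloud_model(p, w)) for p in pressures for w in waters]


if __name__ == "__main__":
    samples = make_samples()
    coeffs = fit_temperature_function(samples)          # steps 3 and 4
    worst = max(abs(evaluate_fit(coeffs, p, w) - t) for p, w, t in samples)
    print("coefficients sent to the small machine:", coeffs)   # step 5
    print("worst fit error (K):", worst)
    print("real time evaluation at 950 mb, 0.3 g/kg:",
          evaluate_fit(coeffs, 950.0, 0.3))             # step 6
```

As in the text, such a fit is only meaningful over the range of conditions it was computed for and would have to be redone for each experiment.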
Cloud Simulation Chamber Program
In this section we describe the computer program that runs on the NOVA and oversees the cloud chamber operation during an experiment. The program is written in Fortran and makes extensive use of Data General Fortran's multitasking feature. During an experiment various activities (take temperature data, take laser data, output wall temperature control signal, etc.) must be done periodically. The multitasking feature allows each activity to be designated as a task, each with its own priority and frequency of execution. Various tasks can be active simultaneously. They compete for system resources based on need and priority.

The overall simulation chamber software is too large to fit into the available core, so it is broken down into four programs corresponding to four consecutive phases of experimental activity: preparation of the chamber, closing of the chamber, the expansion, and the post-expansion clean-up. Chaining is used to automatically transfer from one program to the next; upon completion of its duties the phase one program deletes itself and causes the phase two program to be loaded into core from disc and to begin execution, and so on down the line.

PART 1 - Preparation. This program collects input information from the operator via the CRT keyboard that defines the experiment to be done. Questions are displayed on the CRT and the operator responds. Then the program reads disc files containing the desired wall temperature profile and the cloud model functional fit result, T(P, x_1, x_2, ...). Up to this point in time a manual chamber flush with dry air has been in progress. All of our valves that control air flow have a manual override. The computer now switches valve control digital output bits to the dry flush position and asks to have all the valves changed from manual to computer control. Then at the operator's signal, the computer initiates a chamber flush with moist aerosol-laden air, and it takes periodic data readings (temperature, light scattering, etc.), stores the results on the disc, and displays them on the CRT. When the operator decides that the flush is sufficient, he signals the computer and it then chains to the phase 2 program.

PART 2 - Closing of the chamber. Here the computer takes one set of thermometer readings, stores them on the disc, and displays them on the CRT.
Then the temporal length of the wet flush is recorded, the chamber valves are closed, and the stilling period begins. During this time the initial readings (T_0, P_0, S_0, aerosol, etc.) are taken and transmitted to the IBM 360. When the 360's cloud model results are received and the operator signals that he is ready for the expansion to begin, the program chains to the phase 3 program.

PART 3 - Expansion. This phase of the experiment is handled with multitasking. PART 3 activates five tasks and then simply waits for the duration of the expansion to pass. TASK 1 takes periodic wall temperature and gas pressure readings, TASK 2 takes periodic laser (Mie and Doppler) light scattering data, TASK 3 displays current information on the CRT, TASK 4 outputs an updated wall temperature control signal, and TASK 5 outputs an updated gas temperature control signal. When the time allotted for the expansion elapses, the computer aborts all five tasks and chains to the phase 4 program.

PART 4 - Post-expansion. Under this phase the computer re-opens the chamber and begins a chamber flush with clean dry air for a given amount of time. Disc data files resulting from this experiment are secured for the post-experiment analysis.

Independent programs are then run on the NOVA after the experiment to perform some analysis on the raw data stored on disc files. They produce written listings and graphical output that record the experimental data in permanent form. Then some of the raw data files are transmitted to the IBM 360 via the mini-network link and are stored on cards. The subsequent post-mortem analysis using the cloud model and other physics or chemistry models for the process under study is performed on the IBM 360 or the IBM 370 as batch jobs.

Acknowledgement

This work was supported by the Office of Naval Research, ONR-N00014-75-C-0182, and by the National Aeronautics and Space Administration, NAS8-31849.

Abstract

The Graduate Center for Cloud Physics Research operates an experimental cloud simulation chamber facility designed for the study of atmospheric microphysics and chemistry. The Marshall Space Flight Center is constructing a miniaturized version of this facility as a ground based science simulator in support of an Atmospheric Cloud Physics Laboratory which is planned for the Space Shuttle. These two facilities are supported by a hybrid NOVA 840 IBM 360/50 computer system for data acquisition and control purposes.

Literature Cited

1. Hagen, D. E., Hale, M. H., and Carter, J., Proc. Electro-Optical Systems Design Conf., Anaheim, CA, 1975, p. 373.
2. Whitby, K. T., Liu, B. Y. H., Husar, R. B., and Barsic, N. J., J. Colloid Interface Sci., (1972), 39, 136.
3. Sinnarawalla, A. M. and Alofs, D. J., J. Appl. Meteor., (1973), 12, 831.
4. Laktinov, A. G., English Translation Atmos. and Oceanic Phys., (1972), 8, 382.
5. Kassner, J. L. Jr., Carstens, J. C., and Allen, L. B., J. Atmos. Sci., (1968), 25, 919.
6. "Thermoelectric Handbook," Cambridge Thermionic Corp., Cambridge, Mass., 1972.
7. Kassner, J. L. Jr., Carstens, J. C., and Allen, L. B., J. Recherches Atmospheriques, (1968), 3, 25.
8. Carstens, J. C., Podzimek, J., and Saad, A., J. Atmos. Sci., (1974), 31, 592.
9. Pease, R. A., Instruments and Control Systems, (1972), 45, (6), 80.
10. Hagen, D. E., Tebelak, A. C., and Kassner, J. L. Jr., Rev. Sci. Instrum., (1974), 45, 195.
11. Allen, L. B., "An Experimental Determination of the Homogeneous Nucleation Rate of Water Vapor in Argon and Helium," Ph.D. Dissertation, University of Missouri-Rolla, 1968.
12. Beistel, D. W., Mollenkamp, R. A., Pottinger, H. J., de Good, J. S., and Tracey, J. H., in "Computer Networking and Chemistry," ACS Symposium Series 19, (1975), ed. by P. Lykos, p. 118.
7 The Minicomputer and X-Ray Crystallography
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch007
ROBERT A. SPARKS
Syntex Analytical Instruments, Inc., Cupertino, CA 95014
One of the first analytical instruments automated was the single crystal x-ray diffractometer. Because of the rather complex angular settings required (for each diffraction measurement 3 or 4 angles must be set with an accuracy of about 0.01-0.02°) and because of the length of the experiment (typically 24 hours per day for 2-7 days), early diffractometers were punched card-controlled or paper tape-controlled. The controlling equipment consisted primarily of electrical relays. With the advent of the minicomputer the diffractometer became computer-controlled. Early computer-controlled diffractometers were built at IBM Research (1) and at the Oak Ridge National Laboratory (2). Today all commercially available diffractometers are controlled by small minicomputers. With the availability of higher-level languages (primarily FORTRAN) and the development of some new algorithms the computer-controlled single crystal x-ray diffractometer has become a very flexible tool. A review of the algorithms for the diffractometer was presented by Sparks (3).

The computer requirements for the control of the diffractometer are: 4-8K of 16-bit core memory and an IBM compatible magnetic tape drive for programs which have been written in assembly language. For a FORTRAN version, 24K of 16-bit core memory, a 1.25M word disk, and an IBM compatible magnetic tape drive are required. Because the diffractometer is a slow device the Central Processing Unit does not have to be very fast and floating point hardware is not necessary.

Minicomputer for Structure Determination Calculations

Traditionally the crystallographer has taken the data produced by the diffractometer to a large computer and has then used the large computer for all the necessary structure determination calculations. A few crystallographers have used the small computer for some crystallographic calculations (4). However, until recently that calculation (least squares refinement of atomic parameters) which requires 80% of all computer time for crystal structure determination (5) could not be done on the small computer.

Recent developments in the minicomputer industry have changed this situation markedly. Fast floating point hardware and FORTRAN compilers which generate efficient code are now readily available for minicomputers. In 1972 (6) Syntex announced that they were developing a Structure Determination System based on the Data General Nova 1200 computer, which is the same computer used to control the Syntex P2_1 Diffractometer. The first Syntex XTL Structure Determination System was delivered in early 1974. Since then Syntex has developed an E-XTL Structure Determination System based on the Data General Eclipse computer. Three other commercial manufacturers offer similar systems. The four companies and the minimum computer configurations are as follows.

TABLE I

Manufacturer                        Minimum computer configuration
Syntex Analytical Instruments       a) 24K Nova 1200 or 24K Nova 800; 1.25M word disk; 12.5 ips magnetic tape drive
                                    b) 32K Eclipse; two 1.25M word disks; 12.5 ips magnetic tape drive
Enraf-Nonius (Holland)              28K PDP 11/40 or 11/45 or 11/35; 1.25M word disk; 12.5 ips magnetic tape drive
Philips (Holland)                   Philips PM 855
Computer-Systemtechnik (Germany)    32K Eclipse; 5M word disk
All of the computers have a 16-bit word length. A very useful option is a line printer-plotter. Syntex provides a Versatec 11" printer-plotter. For the Syntex system, prices include all of the structure determination software and range from about $62,000 to $100,000. Syntex has delivered more than thirty systems and I would estimate that, including a few individually built systems, there are now over fifty crystallographic structure determination systems on minicomputers throughout the world.

To compare the performance of the minicomputer and the large computers Sparks (1) used the FORTRAN benchmark shown in Table II. The code from statement 30 to statement 5001 forms the normal equation matrix, A, from the derivative vectors, DV, one reflection at a time. This or similar code is used in all the crystallographic full matrix least squares routines on large computers. For reasons which will be discussed, this algorithm is not used on any of the minicomputers mentioned above. The benchmark is meant to compare the efficiency of hardware and code generated by the various FORTRAN compilers. The execution times for this benchmark for various computers are shown in Table III.

TABLE II
FORTRAN test program (N=64, NREF=100)

      COMMON A(2500), DV(100)
  101 FORMAT(1H ,7E10.4)
      DO 10 I=1,2500
   10 A(I)=0.0
      ACCEPT "N,NREF,IPR=",N,NREF,IPR
      M=N+1
      MM=M+1
      DO 6001 IP=1,NREF
      DO 20 I=1,M
   20 DV(I)=I*IP*0.9
   30 K=1
      DO 5001 J=1,N
      B=DV(J)
      IF(B.NE.0) GO TO 5002
      K=K+MM-J
      GO TO 5001
 5002 DO 5003 L=J,M
      A(K)=A(K)+DV(L)*B
 5003 K=K+1
 5001 CONTINUE
 6001 CONTINUE
      K=K-1
      IF(IPR.EQ.0)GO TO 6002
      WRITE(10,101)(A(I),I=1,K)
      STOP
 6002 WRITE(10,101)A(K)
      STOP
      END

TABLE III
Comparison of time for least squares inner loop

Computer                          Compiler            Time (secs)
CDC 7600                          FTN-OP2               0.175
CDC 7600                          FTN-OP1               0.402
CDC 6600                          OP2                   0.93
CDC 6600                          OP1                   1.164
IBM 370/155                       -                     7.5
Eclipse                           FORTRAN V             7.7
PDP 11/50 MOS memory              FORTRAN IV PLUS      11.0
PDP 11/45 core memory             FORTRAN IV PLUS      16.6
Syntex XTL (NOVA 800)             FORTRAN IV           28.4
HP 2100A                          FORTRAN IV           29.0
Enraf Nonius SDP (PDP 11/45)      FORTRAN IV-DOS 9     35.0
Syntex XTL (NOVA 1200)            FORTRAN IV           41.0
PDP 11/40                         FORTRAN IV           59.8
PDP 8E - floating hardware        FORTRAN IV           64.0

The computer programs for the dedicated minicomputer system are similar to those for the large computer. Most of the major programs for the Syntex XTL are modifications of the programs written for the large computers. All programs are written in FORTRAN. Because of the limited amount of core, the programs are more disk oriented than are those on the large computer. All program files, data files and scratch files reside on disk. Because more than one data set can reside on disk at any one time, the data files contain as part of their names a four letter code identifying the structure to which they belong. A list of the data files and their contents is shown in Table IV.

Special care was exercised to make user input as simple and error-free as possible. XTL and E-XTL programs are conversational and utilize unformatted (or free-form) input. The user is prompted on each input by a print-out of possible responses. An automatic updating of the primary data file minimizes user instruction and eliminates redundant user inputs and calculation by individual programs. Frequently used input parameters (e.g. cell dimensions, space group information, etc.) need only be entered once. Since files are rewritten automatically as new information is available, calculations made by one program can be utilized by the others. The user rarely needs to intervene in going from one program to the next. In most cases, the specification of the program name, the data file identifier, and a small number of input parameters is all that is required to run any program in the system. Extensive discussions among all the programmers (who were also crystallographers) were held on all input and output procedures and formats. An example of the input and output is shown in Table V.

The XTL program file interaction is shown in Table VI. The names of the major programs are shown in rectangles, the files in ovals. Large programs like MULTAN and the Full Matrix Least Squares (FMLS) must be divided into many overlays so that each part will fit in 24K of core memory. MULTAN requires 15 segments and FMLS requires 5 on the Nova 1200 and 7 on the Eclipse. To minimize the number of disk transfers (which are time consuming) a very thorough understanding of the program is necessary in order to make an efficient division into segments. FMLS requires a large storage area for the normal equation matrix.
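For readers more comfortable with a modern language, the accumulation loop of the Table II benchmark (statements 30 through 5001) can be sketched in Python as below. This is a transliteration made for illustration, not part of the original benchmark: the function name is an assumption, the arithmetic is in double precision rather than the word lengths of the machines in Table III, and timings from it are not comparable with that table. It shows how the upper triangle of the normal equation matrix, augmented by one extra column, is stored row by row in a one-dimensional array.

```python
import time


def accumulate_normal_matrix(n, nref):
    """Accumulate the packed normal equation matrix as in Table II.

    Row J (1-based) holds columns J..M with M = N + 1, so the packed array
    has N * (2 * M - N + 1) / 2 elements (2144 for N = 64).
    """
    m = n + 1
    a = [0.0] * (n * (2 * m - n + 1) // 2)
    for ip in range(1, nref + 1):
        # DV(I) = I * IP * 0.9 for I = 1..M: synthetic derivative vector.
        dv = [i * ip * 0.9 for i in range(1, m + 1)]
        k = 0
        for j in range(n):                 # J = 1..N in the FORTRAN listing
            b = dv[j]
            if b == 0.0:
                k += m - j                 # skip the whole row, as K=K+MM-J does
                continue
            for l in range(j, m):          # L = J..M
                a[k] += dv[l] * b
                k += 1
    return a


if __name__ == "__main__":
    start = time.perf_counter()
    packed = accumulate_normal_matrix(64, 100)
    elapsed = time.perf_counter() - start
    print("elements:", len(packed))
    print("last element:", packed[-1])
    print("elapsed seconds:", elapsed)
```

The zero test on each derivative mirrors the listing: when a derivative is zero, its entire row contributes nothing and the index is advanced past it in one step.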
As mentioned above, programs which run on large computers form the normal equations as shown in the benchmark. All of
TABLE IV. List of the data files and their contents

File                      Name        File Contents
------------------------  ----------  ------------------------------------------
Primary Data File         D0USER.DA   Title, cell dimensions, space group
                                      equivalences, scattering factor types,
                                      overall scale and temperature factors,
                                      weighting scheme, atomic coordinates, etc.
Reflection Data File      D1USER.DA   h, k, l, E, F_o, A_c, B_c, scattering
                                      factors, etc.
E phase Data File         D2USER.DA   E phases for each MULTAN solution
Peak File                 D3USER.DA   Peak information generated by Fourier,
                                      E-map, or Patterson
H,K,L File                D6USER.DA   File of h, k, l values for P2_1 FORTRAN
                                      programs
P2_1 Parameter File       D7USER.DA   Orientation matrix, wavelengths, centering,
                                      indexing, collection, etc. parameters for
                                      P2_1 FORTRAN programs
Check Reflection File     D8USER.DA   h, k, l, intensity, time for check
                                      reflections
Raw Intensity Data File   D9USER.DA   h, k, l, intensity, σ
Scattering Factor File    SAIASF.TB   Cromer-Weber scattering factor tables
Normal Equation File      NORMAL.TM   Full-matrix normal equations
TABLE V

PRIME   S.A.I. XTL PROGRAM
PRIMARY DATA FILE SETUP (2)

ENTER DATA FILE ID (4 CHAR.): TEST
ENTER COMPOUND NAME (30 CHAR. MAX): NEW TEST CRYSTAL
WHAT KIND OF X-RADIATION? (CU,MO,AG,FE,CR): MO
ENTER CELL CONSTANTS TO AS MUCH PRECISION AS POSSIBLE
WITH LENGTHS IN ANGSTROMS AND ANGLES IN DEGREES
ORDER - A,B,C,ALPHA,BETA,GAMMA - SEPARATE WITH COMMA'S
:13.712,15.241,10.334,90,92.17,90
ENTER STD. DEV. FOR CELL DIMENSIONS
:.001,.002,.001,0,.02,0
IS THIS SECTION CORRECT? (YES OR NO): YES

ENTER SPACE GROUP SYMBOL: P 21/C
CRYSTAL CLASS: MONOCLINIC
LAUE GROUP: P 2/M
CENTROSYMMETRIC
NUMBER OF GENERAL POSITIONS: 4
POLAR AXIS: NONE
UNIQUE 2-FOLD AXIS: Y
EQUIVALENT POSITIONS:
     X,       Y,       Z,
    -X,   1/2+Y,   1/2-Z,

ENTER ATOMIC NAMES IN STANDARD FORMAT. GIVE THE NUMBER OF EACH
IN THE ASYMMETRIC UNIT WHEN REQUESTED. FINISH WITH "END".
ATOM TYPE: FE+2     HOW MANY? 2
ATOM TYPE: C        HOW MANY? 18
ATOM TYPE: O        HOW MANY? 3
ATOM TYPE: H        HOW MANY? 8
ATOM TYPE: END
IS THIS SECTION CORRECT? (YES OR NO): YES

ATOM LIST
COMPOUND: NEW TEST CRYSTAL          IDENT: TEST

                    A        B        C      ALPHA   BETA    GAMMA   VOLUME
CELL DIMENSIONS:  13.712   15.241   10.334   90.00   92.17   90.00   2158.1

ATOM    NUMBER/   ATOMIC    ATOMIC    WEIGHT   ABSORPTION
        CELL      NUMBER    WEIGHT    (%)      COEFF.
FE+2      8.       26.      55.85     29.1     13.2
C        72.        6.      12.01     56.3      0.4
O        12.        8.      16.00     12.5      0.2
H        32.        1.       1.01      2.1      0.0
TABLE V (cont)
NUMBER OF ATOMS IN THE UNIT CELL =   124.
UNIT CELL SCATTERING F(000)      =   768. ELECTRONS
UNIT CELL VOLUME                 =  2158.1 A3
UNIT CELL MASS                   =  1535.8 AMU
CALCULATED DENSITY               =    1.18 G/CM3
ABSORPTION COEFFICIENT           =   13.9 CM(-1)

PRIMARY DATA FILE SETUP COMPLETE     FILENAME: D0TEST.DA

Underlined items are the user input.
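The derived quantities printed at the end of Table V follow from the cell constants and atom counts typed in by the user. The sketch below is not the XTL code; it is a minimal Python illustration, assuming the general cell-volume formula, a conversion factor of 1.66054 g/cm3 per amu/A3, and F(000) summed from atomic numbers (which is what the printed total of 768 implies). The function and variable names are hypothetical.

```python
import math

# Conversion factor: (amu per cubic angstrom) -> (g per cubic cm).
AMU_A3_TO_G_CM3 = 1.66054


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general unit cell (lengths in angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def cell_summary(cell, atoms, general_positions):
    """Return (volume, atoms per cell, F(000), mass in amu, density in g/cm3).

    atoms maps a label to (count in asymmetric unit, atomic number, atomic weight).
    F(000) is summed from atomic numbers, as the Table V totals imply.
    """
    volume = cell_volume(*cell)
    per_cell = {k: n * general_positions for k, (n, _, _) in atoms.items()}
    n_atoms = sum(per_cell.values())
    f000 = sum(per_cell[k] * z for k, (_, z, _) in atoms.items())
    mass = sum(per_cell[k] * w for k, (_, _, w) in atoms.items())
    density = mass * AMU_A3_TO_G_CM3 / volume
    return volume, n_atoms, f000, mass, density


# Values typed in during the Table V session (space group P 21/c, 4 positions).
CELL = (13.712, 15.241, 10.334, 90.0, 92.17, 90.0)
ATOMS = {
    "FE+2": (2, 26, 55.85),
    "C": (18, 6, 12.01),
    "O": (3, 8, 16.00),
    "H": (8, 1, 1.01),
}

if __name__ == "__main__":
    v, n, f000, m, rho = cell_summary(CELL, ATOMS, 4)
    print(f"volume={v:.1f} A3  atoms={n}  F(000)={f000}  mass={m:.1f} amu  density={rho:.2f} g/cm3")
```

With the Table V input this gives a volume of 2158.1 A3, 124 atoms, F(000) = 768, a mass of 1535.8 amu and a density of 1.18 g/cm3, in agreement with the atom list and summary.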
the normal equation matrix (really only the upper or lower triangular part) resides in the core memory of such computers. Traditionally, crystallographers have measured the size of their central computer by the size of the normal equation matrix that can be handled. Thus, Ibers (8) states that no more than 240 variables can be refined on his CDC 6400. The maximum size for the CDC 7600 at the Lawrence Radiation Laboratory in Berkeley is 832 parameters, although a more practical limit is 722 parameters (9).

In the Syntex XTL, the normal equation matrix is stored in a large contiguous block on disk. Since typically 40-50% of the execution time of a full matrix least squares program (or 32-40% of all crystallographic computing time) is spent in formation of the normal equation matrix, whatever effort is spent on optimizing this part of the code will pay large dividends. For the Nova 1200 and Nova 800 XTL systems special floating point hardware was designed to execute Data General's FORTRAN IV efficiently and to make it possible to write a very fast machine language subroutine for this part of the program.

Instead of the algorithm indicated in the benchmark the following algorithm was developed. Derivative vectors for three reflections are generated and stored in core. Then one half-track (768 elements) of the normal equation matrix is read into core from disk. As soon as the first element arrives in core the three corresponding products from the three derivative vectors are added to it. The next element is then processed in the same way. The arithmetic operations (three floating point multiplies and adds) are slower than the disk transfer rate. As soon as the disk has transferred the first half-track it reads the second half-track into a second array in core. At the end of reading the second half-track the processing for the first half-track will be finished and it can be written back onto the disk. Finally, at the end of processing the second half-track it also is written back onto the disk. The floating point processor operates independently of the central processing unit and the program is written so that the time for the address arithmetic calculations (done by the CPU) is almost completely overlapped by the floating point processor time. Thus, the floating point processor is busy about 80% of the time.

The Eclipse floating point processor is considerably faster than the special processor designed for the Nova 1200 and Nova 800; however, the disk transfer rate is the same for all of these computers. Therefore, for the E-XTL it is necessary to process more than three derivative vectors at a time. The algorithm used in the E-XTL is the following. Derivative vectors for many reflections (several hundred) are written onto a disk file. When this file is filled, a block of the normal equation matrix (3072 elements) is transferred to core. This block is then updated with contributions from all of the vectors in the derivative file. Then that block of the normal equation matrix is written back on disk. The derivative vector file must be read for each 3072-element block of the normal equation matrix. Depending on the total number of reflections the derivative vector file may have to be written more than once, and each time the normal equation matrix must be updated. To minimize disk seek times, it is necessary to have the derivative vector file and the normal equation file on different disks.

On the Nova 1200 version of the Syntex XTL two cycles of a 217 parameter, acentric problem with four equivalent positions and 1152 reflections took 117 minutes. It took 96 minutes on the Nova 800 XTL and 44 minutes on the Eclipse E-XTL. Up to 500 parameters can be handled by the current XTL and E-XTL Full Matrix Least Squares programs.
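The E-XTL scheme described above can be sketched in a few lines. This is not the Syntex code: it is a minimal Python illustration in which in-memory lists stand in for the two disk files, the triangular matrix is assumed to be packed row by row, and the names `packed_size`, `n_blocks` and `accumulate_blocked` are hypothetical. Only the 3072-element block size is taken from the text.

```python
import math

BLOCK = 3072  # elements of the matrix held in core at one time (E-XTL figure)


def packed_size(n_params):
    """Number of stored elements in the triangular half of an n x n matrix."""
    return n_params * (n_params + 1) // 2


def n_blocks(n_params, block=BLOCK):
    """How many core-sized blocks the packed matrix occupies on disk."""
    return math.ceil(packed_size(n_params) / block)


def accumulate_blocked(derivs, weights, n_params, block=BLOCK):
    """Accumulate sum_r w_r * d_r[i] * d_r[j] one block at a time.

    derivs stands in for the derivative-vector file and disk_matrix for the
    normal-equation file; both are plain lists here rather than disk files.
    """
    size = packed_size(n_params)
    # (i, j) pairs in packed lower-triangle order: (0,0), (1,0), (1,1), ...
    pairs = [(i, j) for i in range(n_params) for j in range(i + 1)]
    disk_matrix = [0.0] * size
    for start in range(0, size, block):
        core = disk_matrix[start:start + block]      # read one block into core
        for d, w in zip(derivs, weights):            # re-read the derivative file
            for off, (i, j) in enumerate(pairs[start:start + block]):
                core[off] += w * d[i] * d[j]
        disk_matrix[start:start + len(core)] = core  # write the block back
    return disk_matrix
```

Under the packing assumed here, the 217-parameter problem quoted above has 23,653 stored elements and so occupies 8 blocks, which means the derivative file is read 8 times per pass; a 500-parameter problem would need 41 blocks.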
Almost all of the programs normally run by crystallographers in connection with structure determination of molecules (other than protein and other macromolecules) are included in the XTL and E-XTL and in the systems provided by other manufacturers listed above. The few programs not now included can easily be added to the various packages.

Protein and Other Macromolecular Crystallographic Calculations

The commercially available systems described above do not now have very many programs which can be used for large molecules. However, one crystallographic group which has a Syntex Nova 800 XTL has written many programs including structure factor calculations, phase angle refinement, and structure refinement by difference Fourier methods. Except for the calculation and plotting of the Fourier maps they do all of their protein calculations on the Nova 800. Much development work (mostly on large computers) on algorithms for protein structure refinement is now taking place. I am confident that all or almost all protein calculations will be performed on minicomputers not very different from those described above.
Time-sharing of the Diffractometer with Structure Determination Calculations
Although a number of Syntex customers have systems that will run either the diffractometer or the XTL Structure Determination System, they cannot run both simultaneously. Enraf-Nonius does offer a time-sharing system. This system has a PDP 8 computer operating the CAD-4 diffractometer and a PDP 11 operating the Structure Determination Package. The PDP 8 and PDP 11 are tied together and the PDP 11 sends diffractometer commands to the PDP 8 and collects data from the PDP 8. There is a slight degradation of the structure determination calculations when the diffractometer is operating in this system. The cost differential between this system and a stand-alone diffractometer plus a stand-alone Structure Determination Package is small (approximately $16,000 for an extra magnetic tape drive and disk drive, compared to an approximate cost, depending on accessories, of $130,000 for the time-shared system). The stand-alone approach offers the advantage that the diffractometer can still be operational even if the PDP 11 is not.

Comparison of the Large Computer with the Minicomputer

Because of differences in FORTRAN compilers, transportability of large programs from one large computer to another has at times been a problem. However, it is much more difficult to take a large program written for a large computer and make it run efficiently on a small computer. A very thorough understanding of the algorithm is necessary in order to determine how the program must be segmented into overlays and to decide which arrays can be put on disk without vastly increasing execution time.

Some minicomputers have a virtual memory scheme which makes the user think that he has a very large core memory when actually the computer is using a paging technique to swap parts of the program between core and disk memories. Still the user must be cautious that the way the program accesses elements in large arrays is not causing excessive and very inefficient paging. Clearly, these problems are much more serious for the minicomputer programmer than for his counterpart on the large computer.

Debugging new programs written originally for a minicomputer is a less difficult problem. Debugging aids are still not as good as those found on the best large computers. From my own experience the most efficient debugging I have ever done has been on a time-sharing system on a large computer.

With a dedicated minicomputer it is possible to choose computer hardware and design algorithms so that the computation will be done in the most cost effective way. It is also possible to design special hardware to make certain critical parts of the programs execute as fast as possible. It is much more difficult to do these things on a large general purpose computer.
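The warning above about array access patterns under virtual memory can be made concrete with a small simulation. The sketch below is only illustrative: it assumes a least-recently-used replacement policy, FORTRAN-style column-major storage, and arbitrary page and core sizes, none of which are taken from a particular minicomputer. The function names are hypothetical.

```python
from collections import OrderedDict


def page_faults(addresses, page_size, frames):
    """Count page faults for an address trace under LRU replacement."""
    resident = OrderedDict()  # pages currently in core, least recently used first
    faults = 0
    for addr in addresses:
        page = addr // page_size
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            resident[page] = True
            if len(resident) > frames:
                resident.popitem(last=False)  # evict least recently used page
    return faults


def column_major_addresses(rows, cols, by_rows):
    """Address trace for visiting every element of a FORTRAN-style array.

    Element (i, j) is stored at address i + j * rows. With by_rows=True the
    second index varies fastest, which strides through memory.
    """
    if by_rows:
        return [i + j * rows for i in range(rows) for j in range(cols)]
    return [i + j * rows for j in range(cols) for i in range(rows)]


if __name__ == "__main__":
    for by_rows in (False, True):
        trace = column_major_addresses(64, 64, by_rows)
        print("by rows" if by_rows else "by columns", page_faults(trace, 256, 4))
```

For a 64 x 64 array, 256-word pages and four pages of core, visiting the array in storage order touches each of the 16 pages once, while visiting it with the wrong index varying fastest causes 1024 page faults for the same arithmetic.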
One must always be aware of the fact that pressures from other users of the large system can cause the system to change either in hardware, operating system, or charging algorithms in such a way that one's programs are no longer cost effective and in extreme cases no longer run at all.

Very large problems can be run on the minicomputer. As indicated above, the largest number of parameters that can be handled in the Full Matrix Least Squares program on the CDC 7600 is 833. With an addition of 16K of core and a change of some of the dimension statements the E-XTL can handle more than 1000 parameters. One cycle of Full Matrix Least Squares on such a problem would take about 50 hours. Of course, it would be possible to program the CDC 7600 to use peripheral memory for the normal equations in the same fashion as was done for the E-XTL.

The major consideration between the choice of a dedicated minicomputer or the use of a large computer must be cost. The cost of crystallographic computing varies greatly throughout the world. A survey (10) of all institutions showed that the average computer cost was $5,400 per structure (real money - not phoney money as used in some central facilities). The cost of the minimum dedicated minicomputer system is about equivalent to the cost of computer time for 12 structures. The cost of the maximum system is equivalent to about 19 structures. Two computer controlled x-ray diffractometers can collect 100 data sets per year (one laboratory in the USSR is currently collecting this much data). The Syntex E-XTL can easily keep up with this volume. Service and operating costs are not very different from those of the computer controlled diffractometer.

Most of the dedicated minicomputer systems for crystallography are located outside the United States. The reasons are, first, that in many parts of the world the minicomputers are as powerful as or more powerful than the central large computer facilities available and, secondly, that in the United States restrictive policies by the national funding agencies and central computing facilities have prevented the widespread use of dedicated minicomputers. In any case, the dedicated minicomputer promises to play an increasingly important role in all crystallographic calculations.
Literature Cited

1. Cole, H., Okaya, Y. & Chambers, F.W., Rev. Sci. Instr., (1963) 34, 872.
2. Busing, W.R., Ellison, R.D. & Levy, H.A., Abstracts of the American Crystallographic Association, (1965) 59.
3. Sparks, R.A., "Trends in Minicomputer Hardware and Software Part I", pp 452-467, Munksgaard, Copenhagen, 1976.
4. Shino, R., "Crystallographic Computing", pp 312-315, ed. F. R. Ahmed, Munksgaard, Copenhagen, 1970.
5. Hamilton, W., "Computational Needs and Resources in Crystallography", pp 9-17, Washington, D.C.: National Academy of Sciences, 1973.
6. Sparks, R.A., "Computational Needs and Resources in Crystallography", pp 66-75, Washington, D.C.: National Academy of Sciences, 1973.
7. Ibid., pp 66-75.
8. Ibers, J.A., "Computational Needs and Resources in Crystallography", pp 18-27, Washington, D.C.: National Academy of Sciences, 1973.
9. Zalkin, A., private communication, 1977.
10. Hamilton, W., loc. cit.
8

Description of a High Speed Vector Processor

J. N. BÉRUBÉ and H. L. BUIJS
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch008
Bomem, Inc., 2371 Nicolas Pinel, Ste.-Foy, Québec, Canada
A high speed vector processor has been developed which, coupled with a low level minicomputer or microprocessor, provides an efficient data reduction facility for Fourier Transform Spectrometry. The vector processor performs the dot product of two arrays at high speed and very high precision. Since in many data reduction applications the computation of dot products presents the greatest load to the data processing system, the vector processor will be found useful for a wide range of tasks. In Fourier Transform Spectroscopy the vector processor is used to perform high speed numerical filtering and Fourier transformation.

Numerical Filter

Modern spectrometric applications sometimes demand very high spectral resolution over a relatively large optical bandwidth. A model DA3.003 Fourier Transform Spectrometer manufactured by Bomem Inc. is capable of providing spectral resolution of 0.003 cm⁻¹ over the visible to millimeter wavelength region. A large amount of data (millions of elements) is produced when such resolution is recorded over a large optical bandwidth. Fortunately, it is seldom required that such large bandwidths be analysed completely at one time; in the vast majority of cases it is preferred to limit the analysis to a succession of rather narrow portions of the spectrum which have been judiciously chosen because of their particular information content. In such cases the interferogram vector, the raw data from the instrument, may be numerically filtered such that only the information content of the selected portion of the spectrum is retained. This results in a reduced number of data points to be Fourier transformed in order to produce the spectrum.
In applications where limited time is available for measurement, such as in analysing substances in chemical reaction, the rate of information generated may become greater than the storage rate normally available using data storage devices such as magnetic tape or flexible disc systems. Real time numerical filtering can be applied to reduce the data storage rate to that compatible with the storage system. Numerical filtering also reduces the volume of data storage required for a given set of experiments.

The high speed vector processor, developed for the above mentioned application, permits real time numerical filtering to be performed with input data rates up to 200,000 data points per second. The numerical filtering process used consists of numerically convolving the input data with the impulse response of a filter function having near unity gain over the desired spectral region $(\sigma_0 \pm \sigma_r/2)$ and high signal rejection outside this region. Such a filter is generally classified as "non-recursive" or "finite impulse response". The general input-output relationship is of the form:

$$Y_n = \sum_{k=0}^{N-1} a_k \, X_{n-k} \qquad (1)$$

where $\{X_n\}$ is the input data sequence, $\{Y_n\}$ the output data sequence, and $\{a_k\}$ the coefficients of the filter (i.e. the impulse response function).

It may be noted here that non-recursive numerical filtering presents several unique characteristics which are of use to Fourier Transform Spectroscopy and other applications. Primarily, the filter function is applied to data in the digital sampled domain; the transfer function therefore operates on the signal as determined by the sample source. In the modern Fourier Transform Spectrometer, sampling is controlled very precisely from a reference interferogram generated by a stable monochromatic light source. Since the numerical filter operates on the sampled data, errors due to phase shifts and time frequency variations are not injected during the filter process. This attribute would be applicable to other applications where signal variation, and therefore sampling interval, are functions of parameters other than time. A second characteristic of use to Fourier Transform Spectroscopy is the ability to shift the output sampling function with respect to the signal. This allows centering of the output sample function with respect to singular signals.

Fourier Transform
The raw data from the interferometer portion of a Fourier Transform Spectrometer is the autocorrelation function of the incident radiation. Once the filtering process, bandlimiting and sample function centering have been performed, the spectrum may be computed. Fast Fourier Transform (FFT) algorithms are often used for this since they provide means for transforming spectra with a minimum of numerical operations. General computing facilities with fast hardware multiply/divide and random access data storage capacity comparable to the vector length can perform the FFT quite rapidly.

Consider however the general equation for the Discrete Fourier Transform (DFT):
$$A_r = \sum_{k=0}^{N-1} X_k \, \exp(-2\pi j r k / N) \qquad (2)$$
where $A_r$ is the $r$th coefficient of the transformed vector (spectrum) and $X_k$ is the $k$th sample of the input vector (filtered interferogram). One can see that the structure of the DFT is identical to that of the numerical filter. If a high speed vector processor is needed for numerical filtering, the same processor can be used to perform Fourier transformation by using the DFT algorithm.

Discrete Fourier Transformation has several advantages with respect to Fast Fourier Transformation. Some examples particularly related to spectroscopy follow:

1. Fourier Transformation by DFT involves only sequential access to the data to be transformed whereas the FFT algorithm requires repeated access to different portions of the original vector and to intermediate results. The DFT can therefore be efficiently performed from sequential access devices such as digital magnetic tape or flexible disc whereas for efficient application of the FFT, either large random access memory or very high speed discs must be used.

2. Using the DFT any portion of the total spectral range may be computed and the spectral interval and sample positions may be chosen at will. It is sometimes possible, therefore, to compute a portion of the spectrum and begin plotting that portion while the computation of other portions of the spectrum continues.

3. Computation of one spectral point or several isolated points is also possible; this is sometimes useful for monitoring chemical reactions or mixtures. The processing format of the FFT, on the other hand, is relatively fixed; in order to compute one point in the spectrum, all of the points must be computed, and the spectral interval and sampling positions are also fixed.

4. Calculation of the sine or cosine transformation using the DFT is performed in one-fourth the operations required for complex transformation. This is not true when the FFT is used.
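The observation that equations (1) and (2) share one structure can be illustrated by writing both in terms of a single dot-product routine, which plays the role of the vector processor. The following is a minimal Python sketch, not the authors' implementation; the `step` argument (computing only every few output samples, to reduce the data rate) and all function names are assumptions made for illustration.

```python
import cmath
import math


def dot(u, v):
    """Multiply-and-accumulate over two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def fir_filter(x, a, step=1):
    """Equation (1): Y_n = sum_k a_k * X_(n-k), for n = N-1, N-1+step, ...

    Only outputs for which all N input samples exist are produced.
    """
    n_taps = len(a)
    out = []
    for n in range(n_taps - 1, len(x), step):
        window = x[n - n_taps + 1:n + 1][::-1]  # X_n, X_(n-1), ..., X_(n-N+1)
        out.append(dot(a, window))
    return out


def dft_points(x, rs):
    """Equation (2), evaluated only for the requested coefficient indices r."""
    n = len(x)
    return [
        dot(x, [cmath.exp(-2j * math.pi * r * k / n) for k in range(n)])
        for r in rs
    ]
```

For example, `fir_filter([1, 3, 5, 7], [0.5, 0.5])` averages neighbouring samples, and `dft_points(x, [r])` returns a single spectral element without computing the rest of the spectrum, which is the case described in item 3 above.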
Performance for DFT

As mentioned previously, the Fast Fourier Transform is often chosen for spectrum computation since it provides a minimal number of required operations ($n \log_4 n$ complex multiplications and additions in a radix-4 system) in order to arrive at a complete transformed spectrum. The DFT requires $n^2$ operations to fully transform a real vector using the sine or cosine transform and $4n^2$ operations for transformation of a complex asymmetric vector of $n$ points. Clearly, when long vectors are to be totally transformed the number of operations becomes very large and the time required increases. The relatively high speed of the vector processor is used to keep the processing time to a reasonable value for vectors of commonly used length (plotting restrictions often limit the length of the output vector) and, where very wide spectral intervals are to be transformed, the numerical filter is used to first separate the interval into sections such that computing time is held within acceptable limits. Table I illustrates the computation times for direct transformation using the high speed vector processor. 32 bit, floating point, spectral data is assumed and I/O is assumed to be via standard IBM compatible Floppy Disc.
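The operation counts above translate into rough time estimates. The sketch below is a back-of-envelope model only: it assumes each multiply-accumulate takes 0.5 µs (the figure quoted for 40-bit products in the Hardware section) and ignores input/output and control overhead, so it gives lower bounds rather than the measured values of Table I. The function names are hypothetical.

```python
import math

MAC_SECONDS = 0.5e-6  # assumed time per multiply-accumulate (40-bit product)


def dft_operations(n_points, n_spectral=None):
    """Multiply-accumulates for a real sine or cosine DFT.

    One operation per input point per spectral element; a full transform
    (n_spectral equal to n_points) therefore needs n squared operations.
    """
    if n_spectral is None:
        n_spectral = n_points
    return n_points * n_spectral


def dft_seconds(n_points, n_spectral=None):
    """Arithmetic time only, under the MAC_SECONDS assumption."""
    return dft_operations(n_points, n_spectral) * MAC_SECONDS


def fft_radix4_operations(n_points):
    """The n log4 n complex operations quoted in the text for a radix-4 FFT."""
    return n_points * math.log(n_points, 4)


if __name__ == "__main__":
    for k in (4, 8, 16, 32, 64, 128, 256):
        n = k * 1024
        print(f"{k:>3}K  2048 elements: {dft_seconds(n, 2048):8.1f} s   full: {dft_seconds(n):9.1f} s")
```

For a 4K interferogram this model gives about 4.2 s for 2048 spectral elements and about 8.4 s for a full transform, close to the first row of Table I; for longer vectors the tabulated times run somewhat above the model.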
Hardware

The high speed vector processor is configured for ease of interface to most minicomputers or higher-level microprocessors. Its function is to perform high speed, high precision multiplication and accumulation in one or more registers. The control of the system is performed by the host computer with a minimum number of control transfers. Serial data input in 16 bit floating point format may be accommodated directly from an external source or may be transferred via the host computer. Output data is normally in 32 bit floating point format for DFT and 16 bit floating point format when numerical filtering is performed. A functional block diagram of the system is shown in Figure 1. The hardware is composed of the following:

- a high speed, high precision multiplier, 0.5 µs for 40 bit products, 0.8 µs for 64 bit products
- one programmable length accumulator of up to 64 bits
- floating to fixed point input converter
- fixed point to floating point output converter
- 4K x 16 bit memory organized as FIFO (first-in-first-out)
- 4K x 16 bit memory for storage of filter coefficients
- sine/cosine generator for up to 64K different values
- control logic for automatic operation as determined by host computer
- host computer including at least 8K bytes of read/write memory, DMA capability, and peripheral storage unit such as floppy disc or digital magnetic tape recorder.

The control information for the vector processor can be readily generated using any programming language. Computation of the filter coefficients is also a straightforward task for scientific programmers.

For use with Fourier Transform Spectrometers the Digital Equipment Corporation LSI-11 has been found quite useful. The LSI-11 has sufficient computing power to be useful for general tasks such as control of other instruments, scientific computation, process control, etc. It supports the time-tested DEC RT-11
Figure 1. Vector processor block diagram. (Blocks shown: incoming data; FIFO 4K x 16; interface; processor initiate and control parameters; coefficient RAM 4K x 16; data storage; cosine generator; floating to fixed converters; multipliers and shift registers; fixed to floating converter; DACs for XY plotter; user's terminal; DMA interface.)
Table I. Computation times for direct transformation using the high speed vector processor.

Points in        1) time to compute      2) time to compute    3) time to compute same
vector to be     at least one and up     2048 spectral         number of spectral
transformed      to 200 spectral         elements              elements as
                 elements*                                     interferogram points
-------------    -------------------     ------------------    -----------------------
4K               1 s                     4.2 s                 8.5 s
8K               2 s                     8.5 s                 34 s
16K              4 s                     17 s                  2 min 30 s
32K              8 s                     34 s                  9 min 40 s
64K              16 s                    1 min 10 s            36 min 30 s
128K             32 s                    2 min 20 s            2 h 30 min
256K             64 s                    4 min 40 s            10 h 5 min

* This computation time is limited by floppy disc input/output time.

The first group of computation times is useful for quick turnaround monitoring of a very restricted spectral region. The second group of computation times indicates the rate at which reasonable sized plotter page sets of data may be generated, and the third group gives times of maximum computing effort for a given size interferogram.
operating system and F o r t r a n compiler. These have been used to supply c o n t r o l to the high speed v e c t o r processor. "While the system was p r i m a r i l y designed to provide s u p e r l a t i v e p r o c e s s i n g accuracy and r e l a t i v e l y high speed i t a l s o provides an economical means f o r accomplishing these t a s k s . Conclusions The high speed v e c t o r processor has been shown to be a high performance device f o r implementation of numerical f i l t e r i n g and f o r performing F o u r i e r Transformation by DFT. Since both of these f u n c t i o n s are r e p e t i t i v e a p p l i c a t i o n s of dot product com p u t a t i o n the system may f i n d a p p l i c a t i o n whenever c a l c u l a t i o n of a dot product i s r e q u i r e d . Examples are c o r r e l a t i o n a n a l y s i s , s i g n a l c o n v o l u t i o n , a u t o c o r r e l a t i o n , s p e c t r a l signature a n a l y s i s , and of course general v e c t o r operations.
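The dot product view of the DFT described in the conclusions can be illustrated with a short sketch. The Python below is only an illustration and is not the processor's control code: the function names are hypothetical, and the timing estimate assumes one multiply-accumulate per interferogram point per spectral element at the 0.5 µs product time quoted in the hardware list, ignoring any input/output time.

```python
import math

def dft_element(samples, k):
    """One spectral element computed as two dot products (cosine and sine terms)."""
    n = len(samples)
    cos_vec = [math.cos(2.0 * math.pi * k * j / n) for j in range(n)]
    sin_vec = [math.sin(2.0 * math.pi * k * j / n) for j in range(n)]
    real = sum(x * c for x, c in zip(samples, cos_vec))
    imag = -sum(x * s for x, s in zip(samples, sin_vec))
    return real, imag

def estimated_seconds(points, elements, product_time=0.5e-6):
    """Rough estimate: one multiply-accumulate per point per spectral element (assumption)."""
    return points * elements * product_time

# A pure cosine at frequency index 3 gives a single strong element at k = 3.
signal = [math.cos(2.0 * math.pi * 3 * j / 16) for j in range(16)]
print(dft_element(signal, 3))
print(f"4K points, 2048 elements: about {estimated_seconds(4096, 2048):.1f} s")
```

Under these assumptions the estimate for 4K points and 2048 spectral elements comes out close to the 4.2 s entry in Table I; the larger entries in the third column run somewhat above the simple product count.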
9

The Potential of 16 Bit Minicomputers for Molecular Orbital Calculations

MARIE C. FLANIGAN and JAMES W. McIVER, JR.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch009
Department of Chemistry, State University of New York at Buffalo, Buffalo, NY 14214
The number cruncher is being squeezed out of University computing centers. The dramatically increasing pressure of small student jobs and the concomitant shift of policy to relieve this pressure have made it so. The number cruncher sits endlessly in queues at the bottom of the priority pile. He can raise his priority by paying more "real" money but the resources he needs for his mammoth jobs (compared to undergraduate student jobs) are just not available. His turnaround time is appalling. He can run fewer jobs each year, so his research output is affected. The good old days of number crunching in Universities are gone. The formation of the National Computing Laboratory in Chemistry is an implicit admission of this, as well as a recognition of the importance of computing in chemistry. However, one other alternative is also being explored, namely the use of a minicomputer for number crunching. This symposium is a testimony to the importance of this possibility.

By minicomputer we mean a relatively inexpensive system (less than $100,000) with a 16 bit or smaller word length. Minicomputers can support a small number of users interactively. More often they are found interfaced to one or more instruments where they are used for data acquisition and control of experiments. They are not designed for the multiple precision, compute-bound jobs that characterize number crunching.

Much of the number crunching in Chemistry is concerned with quantum mechanical calculations of molecular properties. Indeed, one often finds in the experimental literature reports of molecular orbital calculations together with other chemical and structural data. Although the usefulness of these calculations is sometimes questioned, there is no denying their ubiquity. Can minicomputers be used for this type of calculation? If so, will they be cost effective?
To address this question, we prepared a benchmark consisting of a semi-empirical SCF molecular orbital program and two supplementary programs designed to separately mimic the numerical and mass storage features of the molecular orbital program. The benchmark was then run for several molecules, for which computing times and accuracy were reported. The runs were made on the State University of New York at Buffalo's Control Data Cyber 173 system. The same runs were also carried out on a Data General S/200 ECLIPSE minicomputer kindly made available to us by Dr. Stanley Bruckenstein for this purpose. This paper describes the benchmark, some of our experiences in developing it, and the rather surprising results of our study.
Cyber and Eclipse Hardware and Software Differences

The CDC CYBER 173 hardware consists of the Central Processing Unit (CPU), 131,072 decimal words of 0.4 microsecond Central Memory (CM), fourteen Peripheral Processing Units (PPU), and associated peripheral equipment such as card readers, printers, etc. The CPU has no I/O capability but communicates with the external world through the CM. The peripheral processors read and write to CM and provide communication paths between the CPU and individual peripheral equipment. This is in contrast to the ECLIPSE, where the I/O bus, with the I/O devices attached to it, is directly connected to the CPU. Both machines have disk drives for random access mass storage, with the CYBER's PPUs handling the transfer of data between the CM and the disks. Thus, a significant hardware distinction between the two machines is that the ECLIPSE's single processor must perform the functions of both CPU and PPU of the CYBER.

Perhaps the most critical difference between the CYBER and the ECLIPSE is the difference in word length. The CYBER has a 60 bit word. Floating point single precision numbers have a sign bit, an 11 bit biased exponent field and a 48 bit mantissa field. This gives a range of about \(10^{-293}\) to \(10^{322}\) with approximately 14 significant decimal digits. A 60 bit word can hold any positive decimal integer up to \(2^{60} \approx 10^{18}\). This is the number of words of memory one can directly address. The ECLIPSE, however, has a 16 bit word of which only 15 bits can be used for direct addressing. Thus only 32,768 (=32K) decimal words of memory can be directly addressed. In addition to the 32K direct access memory, the particular ECLIPSE we used has 16,384 (=16K) words of "extended memory" and a hardware device allowing access to this memory. The total amount of extended memory can reach up to 131,072 (=128K) words.

In contrast to the 60 bit word on the CYBER, the ECLIPSE "single precision" floating point number is made up of two 16 bit words with a sign bit, a 7 bit biased exponent field (hexadecimal) and a 24 bit mantissa field. A "double precision" floating point number on the ECLIPSE is made up of four 16 bit words (64 bits total) with a sign bit, a 7 bit biased exponent field and a 56 bit mantissa. The range of both single and double precision words is approximately \(5 \times 10^{-79}\) to \(7 \times 10^{75}\). There are 6-7 significant decimal digits in single precision whereas double precision gives 13-15 significant digits. The maximum number of directly addressable single and double precision numbers is 16,384 and 8,192 respectively. Actual space available to the user is even less since normally the core resident portion of the Real Time Disk Operating System (RDOS) demands on the order of 11-15K words out of the 32K available. However, if the Mapped RDOS system (MRDOS) is used, which requires one 16K unit of extended memory (bringing the total to 48K), then the core resident portion of the operating system resides in the upper 16K, leaving 32K directly addressable words available to the user.

Since the total memory requirements of a program very often exceed the amount of user address space, various means are available under RDOS and MRDOS to segment user programs and extend the amount of user address space. Unmapped systems (RDOS) make use of program swaps and chains, and user overlays. Mapped systems (MRDOS), in addition to these three methods of program segmentation, also employ virtual overlaying and window mapping. These require the use of extended address space. For both methods, it is the address at which data is found that is changed; no true data transfer occurs, resulting in a substantial increase in speed over normal disk I/O while performing essentially the same functions. Virtual overlaying differs from normal overlaying in that the overlays are loaded into extended memory and there is no reading and writing of programs to and from disk as in normal overlaying. Window mapping makes use of extended memory in much the same way, only for data instead of programs. One or more 1K blocks in the lower 32K portion of memory are reserved for the window, which is then used to slide up and down the extended memory in such a way that the contents within the window change. This situation is illustrated in Figure 1, using a 2 block window. Here the window originally contains blocks 0 and 1 of "extended" memory.
A REMAP operation changes the position of this window to have it contain blocks 3 and 4. Thus, while the window mapping feature does not allow the direct addressing of more than 32K words, it does allow the user to select (via the FORTRAN call to the REMAP operation) which 32K of the total memory is to be accessed. In FORTRAN, access to the window is through a COMMON statement.

Figure 1. Window mapping on the ECLIPSE. (Labels shown: window, REMAP, user's program.)

An alternative means of handling large amounts of data is, of course, reading and writing to and from disk, although this introduces more overhead. This overhead can be minimized, however, if contiguous disk files are used rather than random or sequential ones, to reduce seek time on the disk. The range for this seek time is 15-135 milliseconds. The user disk to which we are referring is a Model 33, 2200 BPI, 12 sector/track disk. A complete revolution of the disk cartridge requires 40 milliseconds and a disk block read requires 3.33 milliseconds.

The remaining hardware items of the ECLIPSE used in the benchmark are a CRT terminal, hardware multiple precision floating point arithmetic units, a TALLY 300 line per minute printer and a Diablo dual disk drive. The compiler used on the ECLIPSE was the optimizing FORTRAN 5 compiler (Revision 4.01 - Prerelease). The FORTRAN 5 language has quite a few attractive features such as conversational I/O, selective compilation (or exclusion) of certain code, PAUSE statements, etc. However, in the interest of machine transferability and in order to keep both versions of the benchmark as similar as possible, standard FORTRAN IV was used. With regard to the reliability of the operating system, we experienced problems with system bugs in earlier versions of the operating system. However, this situation appears to have rectified itself with the current version of MRDOS (Rev. 5). The CYBER version of the benchmark was compiled under the FTN (FORTRAN EXTENDED), OPT=2 mode which optimizes the code. The current operating system on the CYBER 173 is NOS 1.2.

Cost Analysis and Depreciation

The following is a cost breakdown of the Data General S/200 ECLIPSE minicomputer used in the benchmark:

Basic configuration                                             Cost
CPU, 32K words (16 bit), dual disk, hardware floating point
(basic math functions), CRT (software included)                 $42,000
Additional units:
  Memory Map Unit                                               $1,000
  Line Printer                                                  $3,500
  16K Extended Memory                                           $5,000
Total purchase price                                            $51,500

To establish a general cost per hour of this configuration, the machine was depreciated over five years. Since the standard maintenance contract is about 10% of the purchase price per year, this becomes, over five years, approximately 1.5 times the original purchase price or $77,250. To account for system maintenance, preventative maintenance on the hardware and actual down time for repairs, the system was assumed to be available to the user 80% of the time over the five years. Therefore, the five year price divided by 80% of the number of hours in five years gives a cost per hour of $2.20 or $0.00061 per second. This we refer to as plan A.
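The plan A arithmetic can be retraced in a few lines. This is a sketch only: the prices and the 80% availability figure come from the text, while the function name and the use of a 365 day year (leap days ignored) are assumptions made for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap days ignored (assumption)

def hourly_cost(total_cost, years, availability):
    """Cost per hour the machine is actually available to the user."""
    available_hours = HOURS_PER_YEAR * years * availability
    return total_cost / available_hours

# Itemized prices from the cost breakdown above.
purchase_price = 42_000 + 1_000 + 3_500 + 5_000
# Purchase price plus five years of a maintenance contract at about 10% per year.
five_year_cost = purchase_price * 1.5

plan_a = hourly_cost(five_year_cost, years=5, availability=0.80)
print(f"Purchase price: ${purchase_price:,}")
print(f"Plan A: ${plan_a:.2f}/hour, ${plan_a / 3600:.5f}/second")
```

Changing `availability` shows how sensitive the hourly figure is to the assumed fraction of time the system is up.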
Because the system actually used was maintained by in-house personnel and therefore required no maintenance contract, a somewhat revised cost analysis would be more realistic for our installation. The actual cost of replacement parts over the two years the machine has been in service was $250 which, when projected over five years, becomes $625. This figure plus the purchase price gives $52,125 as the labor-free cost of the machine. In two years the down time for repairs totaled approximately 60 hours. Preventative hardware maintenance (running test programs) averages two hours per month and software maintenance requires about one eight-hour day per month. Therefore, the total time in one year that the machine is not available to the user is 150 hours, or 750 hours over five years. Dividing $52,125 by 43,050 (total hours in 5 years minus 750) gives a cost per hour of $1.21 (\(3.36 \times 10^{-4}\) dollars per second). This is cost plan B.

Cost plans intermediate between A and B can easily be constructed by noting that machine repairs were carried out by a trained electronics technician and that software maintenance and preventative hardware maintenance were carried out by graduate students. Also, in both cost plans A and B, it has been assumed that the machine will not be idle, but will always be busy running users' jobs. One can account for any deviation from this assumption by simply dividing the cost by the fraction of available time the machine will be used.

The cost of running the benchmark on the CYBER was obtained directly from the University Computing Center's charging algorithm, which was designed to "reflect, as accurately as possible, the actual cost of running the job". Since this cost includes that of personnel salaries, and since about one half of the center's budget is personnel, an alternative price can be obtained by dividing by two the calculated cost obtained from the charging algorithm.

The CYBER rate schedule divides the computer cost/hour into contributions from various sources. The major contribution of interest with respect to the benchmark is the $200/hour charge for CPU time. Field length usage during the time a job is active in central memory costs $0.75 per central memory kiloword-hour, and mass storage charges add an additional $0.015 per kilopru of disk I/O. There are, of course, additional charges for cards, tape mounts, plotter use, etc., which are irrelevant to the benchmark in question and so will not be discussed here. An analysis of individual job costs for execution of the benchmark as well as cost ratios for the two machines will be discussed in a later section.
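Cost plan B, the idle-time correction, and the three CYBER rates quoted above can be put side by side in a small sketch. The function names, the `utilization` and `salary_free` parameters, and the sample job in the last lines are hypothetical; only the dollar figures and hour counts are taken from the text, and the Computing Center's real charging algorithm may well contain terms not listed here.

```python
HOURS_IN_FIVE_YEARS = 5 * 365 * 24  # 43,800 hours; leap days ignored (assumption)

def plan_b_hourly_cost(utilization=1.0):
    """Labor-free ECLIPSE cost per hour; utilization is the fraction of available time in use."""
    parts_cost = 250 / 2 * 5                 # $250 over two years, projected to five
    total_cost = 51_500 + parts_cost         # purchase price plus replacement parts
    # Hours lost per year: repairs, monthly hardware tests, monthly software day.
    downtime_per_year = 60 / 2 + 2 * 12 + 8 * 12
    available_hours = HOURS_IN_FIVE_YEARS - 5 * downtime_per_year
    return total_cost / available_hours / utilization

def cyber_job_charge(cpu_seconds, kiloword_hours, kilopru, salary_free=False):
    """Charge built only from the three rates quoted in the text."""
    charge = 200.0 * cpu_seconds / 3600 + 0.75 * kiloword_hours + 0.015 * kilopru
    return charge / 2 if salary_free else charge

print(f"Plan B: ${plan_b_hourly_cost():.2f}/hour")
print(f"Plan B at 50% utilization: ${plan_b_hourly_cost(0.5):.2f}/hour")
# Hypothetical job: 18 CPU seconds, 0.2 kiloword-hours, 10 kilopru of disk I/O.
print(f"CYBER charge: ${cyber_job_charge(18, 0.2, 10):.2f}")
```

Setting `salary_free=True` applies the divide-by-two alternative price described above.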
Description of the Benchmark

A proper evaluation of the efficiency of a computer should involve running the actual programs used. For this reason, the semi-empirical SCF molecular orbital program MINDO/2 (1) (with options for executing INDO (2)) was programmed on both the ECLIPSE and the CYBER. The starting point for the general SCF molecular orbital problem is the molecular electronic Schroedinger equation

\[ H\Psi = E\Psi \qquad (1) \]

an eigenvalue problem with the set of \(\Psi\)'s and \(E\)'s to be determined. The majority of ways of generating solutions to (1) involve basis set expansions of the \(\Psi\)'s with variationally determined coefficients. The net effect is the transformation of (1) into a (usually non-linear) matrix pseudoeigenvalue problem.

\[ F(C)\,C = S\,C\,E, \qquad C^{\dagger} S\,C = 1 \qquad (2) \]
A typical calculation can generally be analyzed in two steps. The first involves the initial evaluation of integrals used in the construction of F and S and the second involves the solution of (2), usually by iterative methods. In terms of computational effort, various methods divide themselves into two classes: ab initio methods, in which steps 1 and 2 take very roughly equal amounts of computer time, and semi-empirical methods, in which step 1 consumes negligible computer time relative to step 2. The solution of Eq. (2) (step 2) for semi-empirical methods (for which S = 1) involves principally the repeated diagonalization of F(C) until self-consistency is obtained. This iterative portion of the program requires the most time and memory. As a result, this part of the semi-empirical SCF program MINDO was the most critical from the programming standpoint.

Because of the limited amount of extended memory (16K) on the minicomputer available, the window mapping and virtual overlaying options for data and program storage could not be tested in the actual SCF benchmark. Therefore, normal user overlays and contiguous reads and writes to and from disk were used. However, a short window mapping test program was executed to verify the performance of this feature.

The SCF program consists of a very small main program which contains a few short COMMON blocks and calls to the nine overlays. Only the main program and any one overlay are core resident at any given time. A brief description of each overlay follows:
1. Input Overlay. Introduces cartesian coordinates and atomic numbers for each atom in the molecule, sets print options and specifies type of calculation (MINDO or INDO).
2. Initialization Overlay. Sets up various tables of constants used in other overlays and stores them on disk.
3. Coulomb Overlay. Calculates two-electron integrals and writes them to disk.
4. Overlap Overlay. Evaluates overlap integrals and stores them on disk.
5. Guess Overlay. Generates an initial guess of the density matrix which is written to disk.
6. Core Hamiltonian Overlay. Generates the one electron part of the F matrix and writes it to disk.
7. SCF Overlay. This iteratively solves Eq. (2).
8. Overlap Derivative Overlay. Calculates the derivatives of the overlap integral with respect to the x, y, and z coordinates of each atom and writes them to disk. Although it is more efficient to evaluate these with the overlap integrals themselves, space limitations precluded this.
9. Gradient Overlay. Calculates the total energy and its derivatives with respect to the x, y and z coordinates of each atom.

The entire program consists of 62 subroutines, each of which must be compiled separately on the ECLIPSE. The slow, optimizing compiler of the ECLIPSE required several hours for the task. We found that a hobby such as embroidery is useful while waiting for the routines to compile.

The SCF program was dimensioned to 50 orbitals and run for several molecules of various sizes. Execution times were tabulated for both the CYBER and the ECLIPSE. At first sight, the measurement of execution times on either machine would seem to be straightforward. This is not the case. On the ECLIPSE, the only time available is that of a real time clock, i.e., the elapsed time one would ideally obtain with a stop-watch. The elapsed time obtained in this benchmark is the sum of the time the system spends performing two functions: the numerical calculations, and the mass storage transfers of data and programs (overlay overhead). No effort was made to overlap these two functions. The time spent in mass storage transfers was measured by running, for each molecule, that part of the benchmark that duplicates the mass storage transfers without performing the calculations. This was run only on the ECLIPSE and will be referred to as the mass storage benchmark.

On the CYBER, the only time easily available is the central processor execution time, i.e., the time the CPU spends executing a program. The mass storage transfers are handled by peripheral processors, during which time the central processor is performing housekeeping operations or working on another program. The actual "time on the machine" or turnaround time on the CYBER is difficult to obtain and varies so widely, depending on the load on the machine, as to be worthless as a performance measure.
Finally, in order to obtain a reliable estimate of actual execution time with a minimum of overhead from the operating system as well as from the sources described above, a simple benchmark designed to mimic the most time consuming portion of the typical SCF calculation was run on the ECLIPSE, the CYBER 173, and the CDC 6400. As mentioned earlier, for semi-empirical methods the solution of Eq. (2) involves principally the repeated diagonalization of F(C) until self-consistency is obtained. Thus, a benchmark involving the repeated diagonalization of F plus some matrix multiplications should closely resemble step 2. This execution benchmark, then, consisted of the diagonalization of F, which was an array filled with real numbers ranging from \(-10^{5}\) to \(+10^{5}\) with magnitudes varying from \(10^{-5}\) to \(10^{5}\), followed by the back-transformation of F to recover the original matrix. We refer to this as the diagonalization benchmark. The numerical precision for various word lengths (i.e., single vs. double precision) was determined by subtracting the value of each element in the back-transformed matrix F from the original matrix F. The largest value of this difference matrix then gave an estimate of the machine accuracy.

Execution times for the diagonalization benchmark were measured for varying array sizes. The purpose of varying the dimension of the matrices was to eliminate the effects of operating system overhead and to provide a basis for extrapolating execution times to larger matrices. The entire diagonalization benchmark was placed in a FORTRAN DO loop and executed 100 times for each array size. The time at the start of execution and the time at the end of execution were printed. Printing of the arrays was suppressed in this series of runs so that I/O time was not a factor.
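The structure of the diagonalization benchmark can be sketched as follows. This is not the authors' FORTRAN: the paper does not say which diagonalization routine was used, so a cyclic Jacobi method is swapped in here purely for illustration, and the function names, the random fill, the seed and the tolerance default are all assumptions.

```python
import math
import random
import time

def random_symmetric(n, seed=0):
    """Symmetric test matrix: random signs, magnitudes spread between 1e-5 and 1e5."""
    rng = random.Random(seed)
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            f[i][j] = f[j][i] = rng.choice((-1.0, 1.0)) * 10.0 ** rng.uniform(-5.0, 5.0)
    return f

def jacobi_eigen(f, tol=1e-8, max_sweeps=50):
    """Cyclic Jacobi diagonalization; returns (eigenvalues, eigenvector matrix V)."""
    n = len(f)
    a = [row[:] for row in f]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(i)))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                a[p][q] = a[q][p] = 0.0  # element annihilated by this rotation
                for k in range(n):  # accumulate the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def diagonalization_benchmark(n, repeats=1):
    """Diagonalize, back-transform, and report elapsed time and the largest error."""
    f = random_symmetric(n)
    start = time.perf_counter()
    for _ in range(repeats):
        e, v = jacobi_eigen(f)
        back = [[sum(v[i][k] * e[k] * v[j][k] for k in range(n)) for j in range(n)]
                for i in range(n)]
    elapsed = time.perf_counter() - start
    max_error = max(abs(back[i][j] - f[i][j]) for i in range(n) for j in range(n))
    return elapsed, max_error

for size in (5, 10, 20):
    seconds, error = diagonalization_benchmark(size, repeats=3)
    print(f"n = {size:2d}: {seconds:.3f} s, largest error {error:.2e}")
```

As in the text, repeating the whole diagonalize and back-transform cycle and timing only the loop keeps printing and start-up overhead out of the measurement.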
Results and Discussion

In this section we seek answers to the following:
1. Can a more or less "standard" molecular orbital program with the ability to handle 50 basis functions be made to run at all on a minicomputer? If so, how much trouble is it to "bring up" such a program?
2. Will execution times or, more appropriately, turnaround times be reasonable?
3. Will the costs be competitive with alternative computing sources?

The first of these three questions has been answered in the previous section. By using program overlaying and disk mass storage for temporarily holding arrays, a 50 orbital basis semi-empirical molecular orbital program can be made to fit in 32K of 16 bit words. Of course, one can effectively increase the size of the program by making even more extensive use of the disk. But this would entail a drastic change in the program and, as we shall see shortly, would increase the mass storage overhead to intolerable limits.

As was discussed earlier, the iterative part of the SCF problem is the most time consuming and has the largest core requirements for both data and code. The size of the largest overlay is of particular importance on the ECLIPSE since the area in core reserved as an overlay area is preset at load time to the size needed to contain the largest overlay in the corresponding segment of the user overlay file. Unlike on the CYBER, this area is reserved in core throughout execution regardless of whether succeeding overlays are smaller or not. The size of the SCF overlay, which determined the size of the overlay area, was 26.6K, which left a little over 5K for the main overlay and run-time stack (expandable area for temporary storage of variables and intermediate results, e.g., non-common variables). The final program required 31,778 words out of the 32,768 available under MRDOS to load.

A word might be said here regarding the internal clocks provided by the ECLIPSE operating system. The clock interrupt frequency can be selected at 10, 100, or 1000 hertz. Although only a 6% variation of elapsed time was found between the 10 hertz and the 1000 hertz clocks, this perturbation of the system argues in favor of using an external clock for time measurements. This "clock frequency effect" was not observed for the mass storage benchmark since very little processor time is used.

In preparing the benchmark, we encountered an interesting and important difference in the way array indexing is handled by the two computers. The ECLIPSE FORTRAN manual emphatically cautions the user to ensure that the first index of arrays within a nest of DO loops corresponds to the index of the innermost loop of the nest. Complying with this, however, would have entailed a major revision of the molecular orbital program. We investigated this problem by executing each of the four programs shown in Table I 10,000 times and recording the execution times on each machine. (The statements in the square brackets were not included in the 10,000 executions of each program.) Comparing the ECLIPSE execution times for programs I and III, it is seen that the warning in the ECLIPSE manual is justified. The ECLIPSE results for programs II and IV show that this problem can be easily circumvented by referencing the array with a single subscript and handling the double subscript indexing in FORTRAN with the aid of the "look up table" INJ. When implemented in the diagonalization benchmark, a 30% reduction of execution time occurred. It was thus included in the SCF overlay of the molecular orbital benchmark.

The ECLIPSE results in Table I can be understood when it is recognized that the compiler computes a single index address by the equivalent of the formula

      IJ = I + 50 * (J-1)

and that it "optimizes" the code by removing constant expressions from loops.
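The effect of that address formula on the multiplication count can be seen in a small sketch. The Python below is an illustration only, not compiler output: the function names are hypothetical, the counters simply tally how often the product with 50 would have to be formed, and the `- 1` offsets are there because Python lists are 0-based while the FORTRAN subscripts are 1-based.

```python
N = 50  # leading dimension, as in DIMENSION A(50,50)

def linear_index(i, j, n=N):
    """1-based position of A(i,j) in FORTRAN storage: IJ = I + N*(J-1)."""
    return i + n * (j - 1)

def fill_second_index_inner():
    """Inner loop runs over the second subscript, so N*(J-1) changes on every pass."""
    a, multiplications = [0.0] * (N * N), 0
    for i in range(1, N + 1):
        for j in range(1, N + 1):
            a[linear_index(i, j) - 1] = 1.0
            multiplications += 1
    return a, multiplications

def fill_first_index_inner():
    """Inner loop runs over the first subscript, so N*(I-1) can be formed once per outer pass."""
    a, multiplications = [0.0] * (N * N), 0
    for i in range(1, N + 1):
        k = N * (i - 1)
        multiplications += 1
        for j in range(1, N + 1):
            a[k + j - 1] = 1.0
    return a, multiplications

print(fill_second_index_inner()[1], "multiplications with the second index innermost")
print(fill_first_index_inner()[1], "multiplications with the first index innermost")
```

Both loops fill the same 2500 elements; only the number of times the offset has to be recomputed differs.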
Thus the optimized revision of program III would be DO 1 I = 1,50 Κ = 50 * (1-1) DO 1 J = 1,50 1 A(K + J) = 1.0 which requires only 50 multiplications (the slowest operation in the program) rather than the 2500 of program I. Programs II and IV require no multiplications, with program IV requiring 2450 fewer table look-ups than II. The CYBER results are puzzling in this context since the CYBER compiler also optimizes the source code and uses the same formula for the single subscript IJ as does the ECLIPSE. The results of II and IV are nearly twice as long as I and III(which are now comparable) on the CYBER as on the ECLIPSE. The explana-
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
122
MINICOMPUTERS AND
LARGE SCALE
COMPUTATIONS
TABLE I COMPARISON OF INDEXING METHODS EXECUTION TIME (SEC)
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch009
FORTRAN CODE
II.
ECLIPSE
CYBER
[DIMENSION A(50,50)] DO 1 I = 1,50 DO 1 J = 1,50 1 A(I,J) = 1.0
785
66
[DIMENSION A(2500),INJ(50) DO 5 Κ = 1,50 5 INJ(K) = 50*(K-1)]
625
120
DO 1 DO 1 IJ = 1 A(IJ)
I = 1,50 J = 1,50 I + INJ(J) = 1.0
III.
[DIMENSION A(50,50)] DO 1 I = 1,50 DO 1 J = 1,50 1 A(J,I) = 1.0
549
70
IV.
[DIMENSION A(2500),INJ(50) DO 5 Κ = 1,50 5 INJ(K) = 50*(K-1)]
550
103
DO 1 DO 1 JI = 1 A(JI)
I = 1,50 J = 1,50 J + INJ(I) = 1.0
In Minicomputers and Large Scale Computations; Lykos, P.; ACS Symposium Series; American Chemical Society: Washington, DC, 1977.
9.
123
tion for this l i e s in the fact that the multiplications 50*(I-1) are e f f e c t i v e l y eliminated by the compiler. This is accomplished by using the princple that any positive integer can be expressed as a linear combination of powers of 2 with coefficients of plus or minus one. Thus, in our example, the integer 50 (which is known to the compiler from the DIMENSION statement) can be w r i t ten as 2 - 2 + 2 . Multiplication of any integer ( Ι - Ί ) by 2 can be very rapidly carried out on the CYBER by simply s h i f t i n g the bits of the integer η spaces to the l e f t . Thus the m u l t i p l i c a t i o n by 50 i s replaced by three b i t s h i f t s , an integer additon and an integer subtraction. The impact of using singly dimensioned arrays on the CYBER version of the molecular orbital benchmark can be estimated from the fact that the diagonalization benchmark took 10% more time to execute when the arrays were made 1inear. The double precision 64 b i t word on the ECLIPSE gave a noticeable improvement in precision compared to the 32 b i t single precision floating point word. The largest error in the diagonalization benchmark (with tolerence set to 10" ) was 0.2 χ 10 for the double, precision word and 0.0004 for the single precison version, the CYBER (60 b i t word) gave an error of 6.0 χ 1 0 " . Various execution times for the molecular orbital benchmark are shown in Table II. As discussed e a r l i e r for the ECLIPSE, the total execution time is the "real time", i . e . , the sum of pro cessing time and mass storage transfer time. The contribution of overlay overhead to the l a t t e r averages about 6 seconds and is a constant independent of both the molecule and the number of i t e r a t i o n s . Therefore, disk 1/0 in the form of data transfers accounts for the balance of the mass storage time. The ratio of total ECLIPSE execution time to CYBER CPU time shown in Table II varies from 25 for Z W to 7.3 for C H where i t appears to be leveling off. 
The decline in the ratio of total ECLIPSE time to CYBER CPU time is a result of the diminishing importance of the mass storage transfer time for the larger molecules. The mass storage overhead could easily be eliminated if sufficient extended memory were available to us. Results of a short test program using window mapping for data transfers indicated that 20,000 REMAP operations using a 2K window can be performed in three seconds. The number of data transfers to disk for the molecules executed in the SCF program varied between 43 and 51 depending on the number of iterations. Therefore, using REMAPs instead of disk reads and writes would entail almost zero overhead. The mass storage overhead then would be due solely to overlaying and this is a known constant (6 seconds).

One can easily estimate the SCF ECLIPSE execution time which would be the equivalent of the CYBER CPU time by subtracting the total mass storage times obtained from the mass storage benchmark from the total ECLIPSE execution times. These times are listed in Table II as Δ. The ratio of Δ to CYBER CPU time varies from 3.8 for C2H4 to 4.7 for C6H8. These can be compared to the ECLIPSE/CYBER execution ratios obtained in the diagonalization benchmark (which has no mass storage calls), thus providing an independent check of our assumption that Δ corresponds to the CYBER CPU time. The diagonalization benchmark was executed for both singly and doubly subscripted arrays because, although single subscripting was shown to be faster than double, the SCF program used a combination of both. For single subscripting, the ratios obtained in the diagonalization benchmark varied from 3.7 (15 × 15 matrix) to 3.8 (35 × 35 matrix) while those for double varied from 4.6 (15 × 15 matrix) to 5.0 (35 × 35 matrix). The ratios of Δ to CYBER CPU time given above fall well within the range 3.7 to 5.0.

TABLE II
COMPARISON OF EXECUTION TIMES

MOLECULE                        C2H4    C3H6    C4H6   C5H10    C6H8
# ORBITALS                        12      18      22      30      32
# ITERATIONS IN SCF                9      13      11      10      13

ECLIPSE
  TOTAL TIME (SEC)                72      98     106     151     188
  MASS STORAGE TIME               61      68      65      62      68
  Δ (TOTAL - MASS STORAGE)        11      30      41      89     120

CYBER CPU TIME (SEC)            2.88    6.73    9.06   18.81   25.70

ECLIPSE/CYBER RATIOS
  TOTAL TIME/CPU TIME           25.0    14.5    11.7     8.0     7.3
  Δ/CPU TIME                     3.8     4.5     4.5     4.7     4.7

The evaluation of the cost effectiveness of anything is beset with many difficulties and the effectiveness of carrying out molecular orbital calculations on a minicomputer is no exception. Some of these difficulties have been discussed in the section on cost analysis. We venture no further discussion here other than to remark that the results presented in this section must be regarded as crude. Table III shows the CYBER cost (as given by the University's charging algorithm) for each molecule and the ratios of the CYBER costs to the ECLIPSE costs (as computed under the cost plans A and B described earlier). Table III also includes the estimated ratios of CYBER costs to the costs obtained on an ECLIPSE with 65K of extended memory. This is sufficient memory to eliminate the disk I/O from the molecular orbital benchmark. This estimate was obtained by first modifying cost plans A and B to include the cost of the additional memory (and its maintenance) in the total five year price of the machine. The execution times on this hypothetical ECLIPSE were estimated by adding six seconds (the fixed cost of the disk overlaying overhead) to the Δ's of Table II.
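The arithmetic behind Table II and the hypothetical extended-memory run times is simple enough to check directly. The sketch below copies the timings from Table II and recomputes Δ, the two ECLIPSE/CYBER ratios, and the estimated time with the six second overlay overhead added back. The variable and function names are illustrative only, and the molecule labels are taken to be those listed in Table III.

```python
# (molecule, ECLIPSE total s, ECLIPSE mass storage s, CYBER CPU s,
#  printed total/CPU ratio, printed delta/CPU ratio), copied from Table II.
TABLE_II = [
    ("C2H4", 72, 61, 2.88, 25.0, 3.8),
    ("C3H6", 98, 68, 6.73, 14.5, 4.5),
    ("C4H6", 106, 65, 9.06, 11.7, 4.5),
    ("C5H10", 151, 62, 18.81, 8.0, 4.7),
    ("C6H8", 188, 68, 25.70, 7.3, 4.7),
]

OVERLAY_SECONDS = 6  # fixed overlay overhead quoted in the text


def derive(total, mass_storage, cyber_cpu):
    """Return (delta, total/CPU, delta/CPU, estimated extended-memory time)."""
    delta = total - mass_storage
    return (delta, total / cyber_cpu, delta / cyber_cpu, delta + OVERLAY_SECONDS)


if __name__ == "__main__":
    for name, total, mass, cpu, _, _ in TABLE_II:
        delta, total_ratio, delta_ratio, extended = derive(total, mass, cpu)
        print(f"{name:6s} delta={delta:4d}s  total/CPU={total_ratio:5.2f}  "
              f"delta/CPU={delta_ratio:4.2f}  extended={extended:4d}s")
```

The recomputed ratios agree with the printed ones to within rounding of the last digit, which suggests the published figures were derived from timings carried to slightly higher precision than the table shows.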
According to the results shown in Table III it is far cheaper to use an ECLIPSE for these calculations than the CYBER. Even in the worst case shown, the CYBER is nearly ten times more expensive to use than the ECLIPSE. Moreover, the results also show that the additional extended memory on the ECLIPSE is well worth the extra investment, although the differences for the two ECLIPSE configurations are not great for the larger molecules.

TABLE III
COMPARISON OF COSTS

MOLECULE                     C2H4    C3H6    C4H6   C5H10    C6H8
CYBER CHARGE                $0.41   $0.74   $0.91   $1.76   $3.28

CYBER/ECLIPSE
  PLAN A                      9.3    12.3    14.0    19.0    28.5
  PLAN B                     16.9    22.4    25.5    31.6    51.9

CYBER/EXTENDED ECLIPSE
  PLAN A                     32.4    26.0    24.     23.4    33.0
  PLAN B                     59.1    47.5    44.     42.8    60.1

Conclusions

The surprising aspect of this work is not that a molecular orbital program could be run on a 16 bit minicomputer. Given a suitable length floating point word, such programs can be highly overlayed and, at the worst, the array dimensions can be lowered. We believe that even ab initio programs can be made to run on the ECLIPSE, although the basis set size might be limited and mass storage overhead somewhat high. What was surprising to us was the sheer power of the ECLIPSE. The results of the diagonalization benchmark showed that the ECLIPSE is only 4 to 5 times slower than the CYBER 173, even though the ECLIPSE uses a 64 bit floating point word. For a further comparison, we found via the diagonalization benchmark that the CYBER 173 is 30 to 50% faster than the CDC 6400.

Because a minicomputer used in this fashion is a dedicated "hands on" machine, with the only delay due to printing, the turn-around time will often be much better than that of the CYBER. Provided that the usage demand is sufficiently high, we believe that the use of a minicomputer is both a highly convenient and cost effective alternative to using University computer centers for the type of calculation described in this paper.

Acknowledgement

We are very grateful to Dr. Stanley Bruckenstein for the use of his ECLIPSE. We also thank Mr. Greg Martinchek as well as other members of Dr. Bruckenstein's group for their valuable assistance in using this machine.

Literature Cited

(1) Dewar, M. S. and Haselbach, E., J. Amer. Chem. Soc. (1970) 92, 1285.
(2) Pople, J. A., Beveridge, D. L. and Dobosh, P. A., J. Chem. Phys. (1967) 47, 2026.
10 A Minicomputer Numbercruncher A. LINDGÅRD, P. GRAAE SØRENSEN, and J. OXENBOLL
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch010
Chemistry Laboratory III, H. C. Ørsted Institutet, University of Copenhagen, Universitetsparken 5, DK-2100 København Ø
Due to the low price of small minicomputer configurations they have become very popular in computerized instrumentation for chemistry and physics. Large scale scientific computing has not been much influenced by this, but has mostly been done on large computers. An example of use of a monoprogrammed minicomputer for quantum chemistry is found at Berkeley (Miller and Schaefer, 1973), but the configuration with plenty of main memory, backing store etc. is not typical for minicomputer systems. On the other hand the small stand alone systems are not suited for program development due to the lack of powerful peripherals. Only interpreters like BASIC can be used with a reasonable turnaround time for program development, and BASIC is certainly not suited for anything but small programs.

A problem with the large machines is that they are very expensive for jobs having large cpu-time requirements. Monte-Carlo calculations in statistical mechanics can often require weeks of cpu-time, but do not require much backing store or use of peripherals. Considering these cpu-bound problems it became clear that a dedicated minicomputer with a reasonable amount of fast store would be sufficient to do these types of calculations at a very low cost, the only problem being how to develop programs and get data in and out of the memory.

At the H. C. Ørsted Institute there was a need for handling problems in statistical mechanics and chemical kinetics requiring weeks to months of cpu-time.
Core requirements for these jobs are low. These jobs could of course run on our medium size multiprogrammed RC4000 computer (Brinch Hansen, 1967), but not in a reasonable way. Either other users would have problems getting a decent turnaround time for their computational jobs, and the RC4000 would be less attractive for doing small jobs like editing, compiling and running small programs from a terminal, or the turnaround time for the time consuming job would have been so long that it could not have been realized.
System design.
The purpose of the system is to make long cpu-bound computations feasible. Typically a program will run for a few hours before it needs attention from the RC4000 for storing away data. The program will then go on making a new computation. This cycle may continue for weeks. From the point of view of the minicomputer the RC4000 is a backing store. The computed data may be stored by an RC4000 control program on the backing store, i.e. the disc. The final security of the data is assured by the security dump of the whole backing store on a magnetic tape, which is done once every day.
Selecting the minicomputer.
The primary criteria used in selecting the minicomputer for this project were processing rate, instruction repertoire and cost. It was decided that hardware multiply/divide was essential for most applications, but that floating point arithmetic would be used in a few cases only. It was expected that a large amount of processing time would be used for bit manipulation and memory addressing, and an advanced addressing scheme with easy use of index registers was important. The Texas Instruments 980A was selected as a reasonable compromise between the above-mentioned requirements. For instance, the shift instruction can handle a variable number of positions and the hardware multiply/divide is not too difficult to use for multilength integer arithmetic. Further, the protection system of the TI980A was considered an advantage. Software support from the manufacturer was not considered, because we already have a general assembler for any minicomputer and microcomputer, and program development should not be done on the minicomputer.
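To illustrate what multilength integer arithmetic on top of a single-word hardware multiply involves, here is a minimal Python sketch of a schoolbook multiplication over 16 bit words. It is not TI980A code. It assumes the hardware multiply delivers a double-length (32 bit) product from two 16 bit operands, which the text does not state explicitly, and all names are hypothetical.

```python
WORD_BITS = 16
WORD_MASK = (1 << WORD_BITS) - 1


def to_words(value, count):
    """Split a non-negative integer into `count` 16-bit words,
    least significant word first."""
    return [(value >> (WORD_BITS * i)) & WORD_MASK for i in range(count)]


def from_words(words):
    """Reassemble an integer from 16-bit words, least significant first."""
    return sum(word << (WORD_BITS * i) for i, word in enumerate(words))


def multiply_words(a, b):
    """Schoolbook product of two multi-word integers."""
    result = [0] * (len(a) + len(b))
    for i, a_word in enumerate(a):
        carry = 0
        for j, b_word in enumerate(b):
            # One 16 x 16 multiply; the sum below never exceeds 32 bits.
            partial = a_word * b_word + result[i + j] + carry
            result[i + j] = partial & WORD_MASK
            carry = partial >> WORD_BITS
        result[i + len(b)] = carry
    return result
```

Each inner step uses one single-word multiply plus two additions, so a product of two n-word integers costs n squared hardware multiplies in this scheme.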
Connecting the minicomputer to the RC4000.
The minicomputer may be connected either as an independent machine having a terminal for the user and only using the RC4000 as backing store, or as a slave computer completely controlled by the RC4000, with no other peripherals. We favor the last solution as it makes hardware and software simpler. A slave computer is like any other completely controlled peripheral. The difference is that a general purpose minicomputer can do very complex data transformations while other peripherals generally cannot.

The minicomputer should not have any character oriented peripherals connected. Character input/output requires a lot of software. If a terminal had been connected, users would furthermore have felt inclined to use the minicomputer for developing, editing and assembling of programs. This requires a command interpreter and some program to determine whether this could be done locally or involve the RC4000. We would certainly use the same command language on the minicomputer as on the RC4000, which implies that we would have had to develop a lot of software. It is much simpler to force the user to use the RC4000 for editing, assembling and loading programs and have no conventional peripherals on the minicomputer.

A slave computer is simple to handle. It can always be put into a well defined state, and it cannot harm the RC4000 as it cannot do anything on its own but has to ask the RC4000 to do it, by sending a signal.

The TI980A controller.
Communication between the RC4000 and the TI980A takes place via the low-speed and the high-speed (DMA) data channels of the RC4000, but only via the DMA port of the TI980A. Besides the DMA capability, this port has an instruction controlled output feature and an interrupt input. These features made it easy to build the TI980A interface, because it was only necessary to implement one peripheral device to the minicomputer, namely the RC4000 through the DMA port.

The interface can logically be divided into two parts, a control system and a DMA data transfer system. In the control system the TI980A is connected to the instruction controlled low-speed data channel of the RC4000 and functions as a slave computer. The RC4000 uses five instructions to control the minicomputer: 1) reset, 2) stop, 3) start, 4) single instruction execution and 5) interrupt. Implementation of the first four instructions is performed by connecting the output of the RC4000 controller to the front panel board of the minicomputer and then simply simulating the front panel switches. The interrupt instruction is connected to the DMA port.

The DMA system controls the data transfers between the two computers. A word is loaded from the memory of one computer through its DMA port, stored temporarily in a one word buffer, and then the second computer is requested to store this word in its memory through its DMA port. In one RC4000 24 bit word only one 16 bit TI980A word is stored. No attempt has been made to make a more efficient packing, because it would complicate both software and hardware. The DMA transfer can only be initialized by the RC4000, which has four instructions for this purpose: 1) load the RC4000 address counter for input, 2) load the RC4000 address counter for output, 3) load the TI980A address counter and 4) load the word number counter. Execution of the last instruction also starts the data transfer, which is now controlled by the interface.

The automatic transfer instruction (ATI) of the minicomputer is used for activating the instruction controlled output at the DMA port. This instruction is normally used to initialize a DMA transfer to a peripheral device (e.g. a disc) when the minicomputer is used in a stand alone system. Here the output is used for a low speed communication from the TI980A to the RC4000.
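The word-by-word DMA transfer described above can be modelled in a few lines of Python. This is a toy simulation, not a description of the actual interface logic: the class and method names are hypothetical, "input" is assumed to mean data flowing into the RC4000, and it is assumed that only the low 16 bits of an RC4000 word reach the TI980A.

```python
TI_WORD_MASK = 0xFFFF      # 16 bit TI980A word
RC_WORD_MASK = 0xFFFFFF    # 24 bit RC4000 word


class DmaInterface:
    """Toy model of the interface counters; all names are hypothetical."""

    def __init__(self, rc_memory, ti_memory):
        self.rc_memory = rc_memory
        self.ti_memory = ti_memory
        self.rc_address = 0
        self.ti_address = 0
        self.into_rc = True

    def load_rc_address_for_input(self, address):
        # Assumption: "input" means data flowing into the RC4000.
        self.rc_address, self.into_rc = address, True

    def load_rc_address_for_output(self, address):
        self.rc_address, self.into_rc = address, False

    def load_ti_address(self, address):
        self.ti_address = address

    def load_word_count(self, count):
        # Loading the word counter also starts the transfer.
        for _ in range(count):
            if self.into_rc:
                # One 16 bit word passes through the one-word buffer
                # and occupies a whole 24 bit RC4000 word.
                buffer = self.ti_memory[self.ti_address] & TI_WORD_MASK
                self.rc_memory[self.rc_address] = buffer
            else:
                buffer = self.rc_memory[self.rc_address] & RC_WORD_MASK
                # Assumption: only the low 16 bits reach the TI980A.
                self.ti_memory[self.ti_address] = buffer & TI_WORD_MASK
            self.rc_address += 1
            self.ti_address += 1
```

Storing one 16 bit word per 24 bit word uses two thirds of the RC4000 storage bits, which is the price paid for keeping both hardware and software simple.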
The DMA port does not have an input feature, so low-speed communication the opposite way is not implemented. The ATI instruction can load two 16 bit TI980A words to peripheral registers, and sends at the same time an interrupt to the RC4000. The RC4000 can read the two registers by sense instructions. The remaining bits which can be read by a sense instruction are used for status.

Software.

The communication and control software developed for this project consists of the following programs:

1. A handler as a part of the RC4000 monitor (Brinch Hansen, 1973) which together with a process description is the peripheral process "ti980a".
2. Initialisation code in the RC4000. This is only executed at system restart in the RC4000.
3. A monitor in the TI980A. This includes a handler for the RC4000 known as "rc4000".
The TI980A monitor provides a control and communication structure similar to that of the RC4000 monitor. The TI980A user area and register file dump is conceptually a process similar to the internal process of the RC4000 monitor (Brinch Hansen, 1973). The process it may communicate with is the peripheral process "rc4000" (see figure 1) and it does so using a message buffer technique equivalent to that in the RC4000 system. Thus multibuffering of input/output is a built-in feature. The structure allows us to implement multiprogramming on the TI980A without changing external conventions and with a relatively small effort in software development.
Communication in the TI980A.
Figure 1. Structure of a simple job using the TI980A showing the communication and control paths. Rectangular boxes are interface hardware; circles are peripheral processes.

When the TI980A user program wants the attention of the RC4000 user program it sends a message. This is done by calling a procedure "send message". A buffer within the TI980A monitor is selected and the message is copied from the user program to the message buffer. The buffer address is returned to the TI980A user program. The latter may send a new message or may wait for an answer to the message sent (see figure 2b for an example). Calling the TI980A monitor procedure "wait answer" delays the TI980A user program until the RC4000 has sent an answer back to the TI980A. The answer from the RC4000 arrives in the same message buffer as used by "send message" and is copied by "wait answer" into an answer area in the TI980A user program. The TI980A user program can call the TI980A monitor to examine whether an answer has arrived.
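The message buffer discipline just described can be sketched as a small Python model. It is only an illustration of the protocol, not the monitor's implementation: the class and method names are hypothetical, and where the real "wait answer" delays the caller this sketch simply raises an error if no answer has arrived.

```python
class MessageBufferPool:
    """Toy model of the monitor's message buffers; names are hypothetical."""

    FREE, PENDING, ANSWERED = "free", "pending", "answered"

    def __init__(self, buffer_count):
        self.state = [self.FREE] * buffer_count
        self.content = [None] * buffer_count

    def send_message(self, message):
        """Copy the message into a free buffer and return its address."""
        for address, state in enumerate(self.state):
            if state == self.FREE:
                self.state[address] = self.PENDING
                self.content[address] = list(message)
                return address
        raise RuntimeError("no free message buffer")

    def deliver_answer(self, address, answer):
        """Host side: the answer arrives in the buffer that held the message."""
        if self.state[address] != self.PENDING:
            raise RuntimeError("no message pending in this buffer")
        self.state[address] = self.ANSWERED
        self.content[address] = list(answer)

    def answer_arrived(self, address):
        """Let the user program poll without waiting."""
        return self.state[address] == self.ANSWERED

    def wait_answer(self, address):
        """Copy the answer out and free the buffer.
        A real monitor would delay the caller; this sketch raises instead."""
        if not self.answer_arrived(address):
            raise RuntimeError("answer has not arrived yet")
        answer = self.content[address]
        self.state[address] = self.FREE
        self.content[address] = None
        return answer
```

Because `send_message` returns immediately with a buffer address, a user program can have several messages outstanding at once, which is the multibuffering the monitor design provides.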
Communication in the RC4000.
When the RC4000 user program has loaded and started the TI980A, the RC4000 user program may send a message to the handler telling the handler to queue up the message buffer until a message arrives from the TI980A user program. When it arrives the message is copied from the TI980A to the selected message buffer in the RC4000. The RC4000 user program will get the TI980A message copied into its answer area by executing "wait answer". An answer to the message from the TI980A can be sent by the RC4000 user by executing a new "send message", "wait answer" sequence (see figure 2a).

Control.

The RC4000 user program is an operating system for the TI980A user program. It can do block input/output to the user area at any time. It can start and stop the TI980A user program and when finished remove the TI980A user program. This is done by sending messages to the handler. The TI980A user program and even the TI980A monitor can do nothing to harm the RC4000 and the activities therein. The control both in hardware and in software is exclusive to the RC4000.

Survival.

For long term computations, it would be convenient if the minicomputer could survive most kinds of troubles in the host system, irrespective of whether they are caused by hardware malfunctioning or by new development of hardware and basic software. In hardware the TI980A is protected against the RC4000.
The communication channel is separated both from the RC4000 data channels and from the TI980A data channel through two controllers. The TI980A can run even when there is no power on the controller in the RC4000. In the design of the TI980A monitor and the RC4000 handler it was possible to design a safe strategy to keep the TI980A going independent of system deadstarts in the RC4000. This is done by having a copy of all state variables in both the TI980A monitor and the RC4000 peripheral process. At system deadstart in the RC4000 the initialization code reads the state variables from the TI980A monitor into the RC4000 peripheral process. Loading the TI980A monitor is a privileged operation which a normal user cannot execute.

The major problem that arises when handling a survival problem lies in the multiprogrammed RC4000 computer. After deadstart only the person who has last reserved the TI980A should be allowed to control it again, if a TI980A user program is running. The reservation scheme is extended as follows. When the TI980A is free, any RC4000 process may reserve the TI980A. The name of the reserving process is moved to the TI980A monitor. At system deadstart it is copied from the TI980A monitor into the peripheral process. Only an RC4000 process with the same name as the original reserver can now reserve the TI980A. The disadvantage of this scheme is that the RC4000 user process explicitly has to release the ti980a process. A user can take advantage of the automatic process start up facility in one of the operating systems (Graae Sørensen and Lindgård, 1973). The RC4000 user program can easily examine the state of the TI980A and the program therein, making the coding of a start up mechanism relatively easy.

rcusercomm 1 27 4 76
 1 begin
 2   comment rc4000 user program for control and
 3   communication with ti980a;
 4   integer i;
 5   integer array M,A(1:8),image(1:256),register(1:9);
 6
 6   comment fetch translated ti user program from disc;
 7   careaproc(<:tiusercomm:>);
 8   M(1):=3 shift 12; comment input operation;
 9   M(2):=firstaddr(image);
10   M(3):=M(2)+2*256-2;
11   M(4):=1; comment relative segment for code;
12   waitanswer(sendmessage(<:tiusercomm:>,M),A);
13   comment the ti user program is now in image;
14
14   comment reserve ti980a and move code to ti980a;
15   reserveproc(<:ti980a:>,0);
16   M(1):=5 shift 12; comment output;
17   comment M(2) and M(3) are unchanged;
18   M(4):=0; comment first address in ti980;
19   waitanswer(sendmessage(<:ti980a:>,M),A);
20
20   comment set registers and start ti980a;
21   register(8):=register(9):=0;
22   comment TI program counter:=TI status register:=0;
23   M(1):=5 shift 12+2; comment set registers and start;
24   M(2):=firstaddr(register);
25   waitanswer(sendmessage(<:ti980a:>,M),A);
26
26   comment wait for 5 messages and generate answers;
27   for i:=1 step 1 until 5 do begin
28     M(1):=14 shift 12; comment wait message(<:ti980a:>);
29     waitanswer(sendmessage(<:ti980a:>,M),A);
30     comment a message has arrived, generate an answer;
31     M(1):=10 shift 12; comment send answer(<:ti980a:>);
32     M(2):=A(2); comment copy TI buffer address;
33     waitanswer(sendmessage(<:ti980a:>,M),A);
34   end loop;
35
35   comment remove program and release ti980a;
36   M(1):=16 shift 12;
37   waitanswer(sendmessage(<:ti980a:>,M),A);
38 end
algol end 15

Figure 2a. Model operating system written in the ALGOL6 dialect (Lauesen, 1969). The program reads the translated TI980A user program from the RC4000 backing store and moves it to the user area of the TI980A (lines 6-19). The TI980A register file is loaded and the minicomputer started (lines 20-25). A number of messages and answers are exchanged (lines 26-34). The minicomputer is released (lines 35-37).
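The reservation rule from the survival discussion can be sketched as follows. This Python model is an illustration under stated assumptions, not the monitor code: the class and method names are hypothetical, and a host deadstart is modelled simply by constructing a fresh host-side object from the state kept in the minicomputer.

```python
class TI980AMonitor:
    """State kept inside the minicomputer, which survives a host deadstart."""

    def __init__(self):
        self.reserver = None


class PeripheralProcess:
    """Host-side view of the TI980A; rebuilt from the monitor after a deadstart."""

    def __init__(self, monitor):
        self.monitor = monitor
        # Initialization code copies the state variables from the monitor.
        self.reserver = monitor.reserver

    def reserve(self, process_name):
        """Succeed if the TI980A is free or already held under this name."""
        if self.reserver in (None, process_name):
            self.reserver = process_name
            self.monitor.reserver = process_name  # name is moved to the monitor
            return True
        return False

    def release(self, process_name):
        """Only the reserving process can release, and it must do so explicitly."""
        if self.reserver != process_name:
            return False
        self.reserver = None
        self.monitor.reserver = None
        return True
```

For example, if a process named "job1" reserves the TI980A and the host is then restarted, a new `PeripheralProcess` built from the same monitor refuses a reservation by "job2" but accepts one by "job1". The same example shows the stated disadvantage: until "job1" releases explicitly, nobody else can take over.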
How to use the TI980A.
In figure 2a is given a simple example of an operating RC4000 algol program and in figure 2b a TI980A assembly language program. The operating program fetches the assembled code as generated by the general assembler (Bang, 1974). The TI980A is loaded with the program after reservation has taken place and the TI980A user program is started. The TI980A user program and the RC4000 user program exchange a number of messages. Finally the RC4000 user program releases the TI980A. The control model program in figure 2a is a short program and it is easy to extend it to a realistic control program by including some input/output and test of the communications. Such a program will only be a few pages long and rather trivial to write.

The TI980A model program in figure 2b is very short. It has indeed been the aim of the design to make life easy for the programmer when handling communications. Assembly language coding should be kept at a minimum. Although the communication primitives look different in the TI980A they work basically the same way as in the RC4000. In an RC4000 assembly language program, the comments would have been the same.

tiuser
 0
 0  rep:            ldx=message        ;
 1                  trap 3             ; send message(<:rc4000:>,message);
 2                  ste bufferaddress  ; bufferaddress:=
 3
 3                  comment computations and/or other communications
 3                  may take place here;
 3
 3                  ldx=answer         ;
 4                  lde bufferaddress  ; wait answer(bufferaddress,answer);
 5                  trap 4             ;
 6                  bru rep            ; goto rep;
 7
 7  bufferaddress:  0
 8  message:        0,r.               ; message area
14  answer:         0,r.               ; answer area
20

Figure 2b. Model program for the TI980A showing how to communicate with the RC4000
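The message words built in the figure 2a listing follow a simple pattern that can be mirrored in Python. The sketch below is an interpretation of the listing, not documented RC4000 behaviour: it assumes that `shift` binds tighter than `+`, so that `5 shift 12+2` places an operation code above a 12 bit mode field, and it reads `2*256-2` as two address units per word. The function and constant names are hypothetical.

```python
def operation_word(operation, mode=0):
    """Build a first message word as in `M(1):=5 shift 12+2`."""
    return (operation << 12) + mode


def split_operation_word(word):
    """Recover (operation, mode) from a word built by operation_word."""
    return word >> 12, word & 0xFFF


def last_address(first_address, word_count):
    """Mirror `M(3):=M(2)+2*256-2` for a buffer of word_count words."""
    return first_address + 2 * word_count - 2


# Operation codes as they appear in the comments of figure 2a.
INPUT, OUTPUT, SEND_ANSWER, WAIT_MESSAGE, REMOVE = 3, 5, 10, 14, 16
```

Under these assumptions `operation_word(OUTPUT, 2)` is the "set registers and start" word of line 23, while `operation_word(OUTPUT)` is the plain block output of line 16.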
Discussion. Starting out with the model operating system (figure 2a) i t i s a r a t h e r t r i v i a l task to write an o p e r a t i n g system f o r a s p e c i f i c application. The system has a l r e a d y been s u c c e s f u l l y used to solve scientific problems in statistical mechanics ( R o t n e a n d H e i l m a n n , 1976) f o r p o l y m e r s on a g r i d . The c o s t p e r r u n i s v e r y low compared with the cost on a large, fast machine c o n s i d e r i n g the dif ference in speed. The stability of the system is extremely good. The T I 9 8 0 A h a s n o t f a i l e d a t l e a s t i n the p a s t y e a r . The t o t a l h a r d w a r e d e v e l o p m e n t c o s t i s a r o u n d h a l f the p r i c e of the minicomputer. The basic software d e v e l o p m e n t c o s t was a r o u n d two p e r s o n m o n t h s . Acknowledgement.
The grant from Statens Naturvidenskabelige Forskningsråd to purchase the TI980A is gratefully acknowledged. Jorgen Bang designed and implemented the general assembler. Heinrich Bjerregaard implemented the basic software.
Abstract.
The low price of minicomputers makes them attractive for time-consuming jobs which are only cpu-bound, like Monte Carlo simulations. At the H. C. Ørsted Institute a minicomputer with 12 k main memory has been connected to the multiprogrammed RC4000 computer. All program development is done on the RC4000 and so is the control of the minicomputer.
Literature cited.
Bang, J. (1974), Report 74/15, Datalogisk Institut, København.
Brinch Hansen, P. (1967), BIT 7, 191-199.
Brinch Hansen, P. (1973), Operating Systems Principles, Prentice-Hall, Englewood Cliffs, N.J.
Graae Sørensen, P. and Lindgård, A. (1973), Computers in Chemical Research and Ed. Hadzi, Elsevier, Amsterdam.
Lauesen, S. (1969), ALGOL5 User's Manual, RCSL 55-D42, Regnecentralen, København.
Miller, W.H. and Schaefer, H.F. (1973), Quarterly Reports, Department of Chemistry, University of California, Berkeley, California.
Rotne, J. and Heilmann, O.J. (1976), Proc. VIIth International Congress on Rheology, Gothenburg.
11
Molecular Dynamics Calculations on a Minicomputer
PAUL A. FLINN
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch011
Physics and Metallurgy Departments, Carnegie-Mellon University, Pittsburgh, PA 15213
Since its introduction by Rahman in 1964 (1), the technique of computer simulation of motion in liquids, generally known as "molecular dynamics", has played a vital role in increasing our understanding of the real nature of the liquid state. Applications of the technique to various liquids have been reviewed by McDonald and Singer (2), Rahman (3), and Fisher and Watts (4). The results of the calculations have been in excellent agreement with a variety of experimental measurements of the properties of liquids: the equation of state, the radial distribution function, inelastic scattering of neutrons, and diffusion. The molecular dynamics results also provide valuable tests of the adequacy of various approximate analytic theories of liquids. A major limitation of the technique has been economic: the calculations have required large amounts of time on large, expensive computers. Fortunately, it is possible to carry out useful molecular dynamics calculations at greatly reduced cost on a minicomputer (or microcomputer); much more widespread use of the technique, including instructional use, should now be possible.
The calculation is, in principle, quite simple; it consists of numerical integration of the simultaneous nonlinear differential equations of motion for a number of particles constituting a small sample of the liquid. In the original work the Newtonian form of the equations of motion was used:

$$m \frac{d^2 \mathbf{r}_i}{dt^2} = \sum_{j \neq i} \mathbf{f}(r_{ij})$$

where $m$ is the particle mass, $\mathbf{r}_i$ is the position of the ith particle, $r_{ij}$ is the distance between the centers of particles i and j, and $\mathbf{f}$ is the force acting between particles i and j. For the work described here, it was more convenient to use the Hamiltonian equations:

$$\frac{d\mathbf{p}_i}{dt} = \sum_{j \neq i} \mathbf{f}(r_{ij}), \qquad \frac{d\mathbf{r}_i}{dt} = \frac{\mathbf{p}_i}{m}$$

where $\mathbf{p}_i$ is the momentum of the ith particle. The potential used for a given liquid is generally of a form suggested by theoretical arguments, but with parameters obtained from experimental data on the material. To illustrate the method we use the case of argon, with a Lennard-Jones potential:

$$V(r) = 4\epsilon \left[ (\sigma/r)^{12} - (\sigma/r)^{6} \right]$$

and parameter values $\epsilon = 1.65 \times 10^{-21}$ J, $\sigma = 3.4$ Å taken from the work of Rahman.
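As a short worked step, and assuming the conventional $4\epsilon$ prefactor in the Lennard-Jones potential, the radial force that the calculation has to evaluate follows by differentiation:

```latex
f(r) = -\frac{dV}{dr}
     = \frac{24\,\epsilon}{r}
       \left[ 2\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right],
\qquad
f(r) = 0 \ \text{at}\ r = 2^{1/6}\sigma \approx 1.122\,\sigma \approx 3.82\ \text{\AA}
\quad (\sigma = 3.4\ \text{\AA}).
```

With this sign convention a positive $f$ is repulsive; the force is repulsive inside $2^{1/6}\sigma$ and attractive beyond it, and its magnitude grows as $r^{-13}$ at short range.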
Basic Principles of Minicomputer Use. The characteristic features of most minicomputers are small word size (16 bits), limited memory, and reasonable speed for integer arithmetic. Floating point operations usually require subroutines and are quite slow. The usefulness of minicomputers for molecular dynamics calculations results from the fact that the range of values of the variables needed is sufficiently limited that integer arithmetic can be used, and 16 bit precision is adequate. The only potential difficulties arise in connection with the interatomic force function, which may be of complicated form, and has an unbounded magnitude. These problems can be fairly easily circumvented: the interatomic force function is evaluated and tabulated at the beginning of the calculation; in the body of the calculation determination of the force is simply a look-up operation. The wide range of the magnitude of the force does not represent any real problem, since the force becomes inconveniently large only at distances considerably shorter than those which actually exist when the liquid is at or near equilibrium. We can, therefore, truncate the magnitude of the force to a constant value for distances less than some $r_s$. We also, as is customary, limit the range of interaction by setting the force equal to zero for distances beyond the cutoff distance $r_c$. Our force law then has the form:

$$f(r) = \begin{cases} f(r_s), & r < r_s \\ f(r), & r_s \le r \le r_c \\ 0, & r > r_c \end{cases}$$

and the lookup table need cover only the range $r_s \le r \le r_c$. For this calculation, $r_s$ was taken as 2.82 Å, and $r_c$ as 5.07 Å.
System Hardware. The minicomputer used for this work was a Texas Instruments 960A, borrowed from its normal use as a Mössbauer spectrometer and data processor (5). The system includes 8192 words (16 bit) of semiconductor memory, an interface to a teletype with paper tape punch and reader, and a CRT display. The optional extended instruction set of the 960A includes the following hardware operations: multiply two 16 bit words, 32 bit (double word) product; divide double word by single word, single word quotient and single word remainder; double word add; double word subtract; double word left and right shift operations. The display is provided by a Tektronix 603 storage display unit, driven by two Datel DAC k^lOB 10 bit digital-to-analog converters, interfaced through the communications register unit (CRU) of the computer. Alphameric display is provided by software; no character generating hardware is used. The original cost of the computer was about $7000 (1972). The 960A is no longer made, but equipment with similar performance can now be obtained at a much lower cost. For integer arithmetic, the 960A is only moderately slower than typical large computers, such as the Univac 1108. The times in microseconds for some typical instructions are:

    Instruction      960A      1108
    Add              3.583     1.50
    Subtract         3.583     1.50
    Multiply         8.583     3.125
    Divide          10.417     3.875
    Load             3.333     1.50
    Store            3.583     1.50

System Software. The operating system used for this work was one originally written for the Mössbauer spectrometer application, and described in more detail elsewhere (5). It consists of a monitor, i/o routines, and a floating point arithmetic package. The monitor provides for the loading of programs from paper tape, initiation of execution, recovery from error traps, dump, patch, and debug facilities. The i/o routines provide for teletype input and output of decimal, hexadecimal, and alphanumeric data, and CRT display of alphanumeric data by software character generation. The floating point package includes addition, subtraction, multiplication, division, integer to floating point, and floating point to integer conversion. A fixed point square root routine was written and included for this calculation. The system
software occupies 2128 words, with an additional 82 words for the square root routine. All programming was done in assembly language, and converted to object code by a macro instruction processor and a cross assembler run on an IBM 360/67.
Molecular Dynamics Program. The working program consists of several parts: initialization, cooling, integration, calculation of statistics, CRT display, and teletype output. The program storage requirements, in 16 bit words, are: initialization, 210; cooling, 38; integration, 192; statistics, 100; display and output, 48; total, 588. The data storage requirements are: force table, 1024; position, momentum, initial position and initial momentum, 384 each; correlation functions, 1024; total data 3584. The overall space required is 4172 words.
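The storage figures above can be checked with a few lines of arithmetic. The sketch below only re-adds the quoted word counts; the variable names are illustrative, and the particle count is an inference that assumes one 16 bit word per coordinate.

```python
# Word counts quoted in the text (16 bit words).
program_words = {
    "initialization": 210,
    "cooling": 38,
    "integration": 192,
    "statistics": 100,
    "display and output": 48,
}
data_words = {
    "force table": 1024,
    "position": 384,
    "momentum": 384,
    "initial position": 384,
    "initial momentum": 384,
    "correlation functions": 1024,
}
SYSTEM_WORDS = 2128 + 82   # system software plus square root routine
MEMORY_WORDS = 8192        # semiconductor memory of the machine

program_total = sum(program_words.values())
data_total = sum(data_words.values())
overall = program_total + data_total
free_words = MEMORY_WORDS - SYSTEM_WORDS - overall

# Assumption: three coordinates per particle, one word each.
particles = data_words["position"] // 3

print(program_total, data_total, overall, free_words, particles)
```

Under these assumptions the program, data, and system software together leave a little under two thousand words of the 8192 word memory unused.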
Units and Scaling. It is customary in molecular dynamics calculations to use a system of units based on the properties of the system under study: following the choice of Tsung and Maclin (6), we take the particle mass as the unit of mass, the parameter σ as the unit of length, and a conveniently short time ($10^{-13}$ sec.) as the unit of time. With this convention, the momentum is numerically equal to the velocity. In order to avoid unnecessary loss of precision in the integer arithmetic, it is necessary to scale the variables of the problem into proper integer units. We consider first the length scale: we have a system of N particles, with a volume V per particle, for a total volume NV. To use the full precision of the computer we represent the edge length of this volume by $2^{16}$. We choose our unit of time for one integration step as 6.25 femtoseconds; this is scaled as 1/16 of an integer unit, since multiplication by Δt is accomplished by a shift of 4 binary places to the right. One integer unit of time is therefore 0.1 picosecond. This choice of distance and time scales fixes the velocity (and momentum) scales. We define the "temperature" of the system in terms of the kinetic energy:

$$T = \frac{m \langle v^2 \rangle}{3k}$$

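The scaling rules above can be illustrated with a minimal sketch. The function names are hypothetical and the code is not taken from the original assembly program; it assumes only what the text states: the sample edge maps to 2**16 integer units, one integer unit of time is 0.1 ps, and multiplication by the time step is a right shift of 4 places.

```python
LENGTH_BITS = 16      # sample edge length is represented by 2**16 integer units
DT_SHIFT = 4          # multiplying by dt = 1/16 is a right shift by 4 places
TIME_UNIT_PS = 0.1    # one integer unit of time, in picoseconds


def step_in_ps():
    """Physical length of one integration step (1/16 of an integer time unit)."""
    return TIME_UNIT_PS / (1 << DT_SHIFT)


def length_to_units(x, edge):
    """Convert a length x to integer units, given the sample edge in the same physical unit."""
    return round(x / edge * (1 << LENGTH_BITS))


def times_dt(value):
    """Multiply an integer quantity by dt using an arithmetic right shift."""
    return value >> DT_SHIFT


print(step_in_ps())                # 0.00625 ps, i.e. 6.25 femtoseconds
print(length_to_units(9.0, 18.0))  # half the edge maps to 2**15
print(times_dt(160), times_dt(-160))
```

Note that a right shift rounds toward minus infinity, so small negative values do not shift to zero; this is how Python's `>>` behaves, and an arithmetic shift in hardware does the same.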
Startup of Calculation.
The first step in the calculation is the generation of the force lookup table for the range of R needed (10234 to 18426). Since the memory space available was quite limited, steps of 8 were used, so that only 1024 locations were required. Interpolation from the table was planned for intermediate values of R, but proved to be unnecessary.
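The table generation just described can be sketched as follows. This is an illustration under stated assumptions rather than the original code: the integer distance 10234 is assumed to correspond to 2.82 Å, the Lennard-Jones force is used with σ = 3.4 Å, and the integer force scale (`FORCE_MAX`, chosen so that the truncated force fills a signed 16 bit word) is hypothetical.

```python
R_S, R_C = 10234, 18426           # integer distances quoted in the text
STEP_SHIFT = 3                    # table step of 8 integer units
ANGSTROM_PER_UNIT = 2.82 / R_S    # assumption: R_S corresponds to 2.82 angstrom
SIGMA = 3.4                       # angstrom
FORCE_MAX = 32767                 # hypothetical scale: f(r_s) fills a signed word


def lj_force(r):
    """Lennard-Jones force -dV/dr in units of 24*epsilon per angstrom (positive = repulsive)."""
    s6 = (SIGMA / r) ** 6
    return (2.0 * s6 * s6 - s6) / r


def build_table():
    """Tabulate the scaled force at steps of 8 integer units between R_S and R_C."""
    size = (R_C - R_S) >> STEP_SHIFT
    f_s = lj_force(R_S * ANGSTROM_PER_UNIT)
    table = []
    for k in range(size):
        r = (R_S + (k << STEP_SHIFT)) * ANGSTROM_PER_UNIT
        table.append(round(FORCE_MAX * lj_force(r) / f_s))
    return table


def lookup(table, R):
    """Look up the force for integer distance R; distances below R_S use the first entry."""
    index = max(R - R_S, 0) >> STEP_SHIFT
    return table[min(index, len(table) - 1)]


table = build_table()
print(len(table), table[0], table[-1])
```

The table has 1024 entries, starts at the truncated (largest) repulsive value, changes sign near the potential minimum, and ends with a small attractive value just inside the cutoff.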
Next, the initial configuration of the system is constructed by assigning random values to the position and momentum coordinates of the particles. Random numbers are generated by the multiplicative congruence method: repeated multiplication by 3125 and retention of the least significant half of the double word product. This initial configuration has, of course, an extremely high energy. It is necessary to "cool" the system by gradually removing kinetic energy. This is done by periodically reducing the magnitude of each component of momentum of each particle by some fraction of its value. The cooling must be done gradually to avoid freezing in a nonequilibrium state; the cooling rate should not exceed the rate at which the initially extremely high potential energy of the system can be converted into kinetic energy, so that approximate equipartition is maintained.
Main Calculation Loop. The calculation proper is carried out in a nest of three loops. The outermost (T loop) is a time loop; each execution corresponds to one time integration step. The intermediate loop (J loop) is over all particles; one execution corresponds to a calculation of the net force on one particle, and an updating of the position and momentum of that particle. The innermost loop (I loop) is also over all particles; one execution corresponds to a calculation of the force on one particle due to one other particle.
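The startup steps described above (random initial values by multiplicative congruence, then gradual cooling) can be sketched as below. The multiplier 3125 and the 16 bit word are from the text; the seed, the function names, and the cooling fraction of 1/16 are illustrative assumptions.

```python
MULTIPLIER = 3125
WORD_MASK = 0xFFFF   # keep the least significant 16 bit half of the product


def next_random(x):
    """One step of the multiplicative congruence generator on a 16 bit word."""
    return (x * MULTIPLIER) & WORD_MASK


def to_signed(word):
    """Interpret a 16 bit word as a signed integer coordinate."""
    return word - 0x10000 if word & 0x8000 else word


def cool(momenta, shift=4):
    """Reduce each momentum component by the fraction 1/2**shift of its value (assumed fraction)."""
    return [p - (p >> shift) for p in momenta]


x = 1                      # assumed odd seed
x = next_random(x)         # 3125
y = next_random(x)         # low half of 3125 * 3125
print(x, y, to_signed(0x8000))
print(cool([160, -160]))   # [150, -150]
```

Repeated calls to `cool` at intervals, rather than one large reduction, correspond to the gradual cooling the text calls for.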
We use the following notation to describe the calculation:
X1(I), X2(I), X3(I): position coordinates of the Ith particle.
P1(I), P2(I), P3(I): momentum coordinates of the Ith particle.
DX1, DX2, DX3: components of the vector from particle J to particle I; e.g., DX1 = X1(I) - X1(J).
R: the distance from particle J to particle I.
F: the force exerted by particle I on particle J.
F1, F2, F3: the components of the net force on particle J; this is calculated as a running sum over the I particles in the inner loop.
The calculation proceeds as follows:
Zero the time register and enter the T loop.
Initialize the J register and enter the J loop.
Clear F1, F2, F3 to zero.
Initialize the I register and enter the I loop.
Test and skip if I = J.
Calculate RR = DX1**2 + DX2**2 + DX3**2 as a double word sum.
Test and skip if RR > RRC. (Separation beyond cutoff range).
Calculate R = SQRT(RR) and store.
Form R-RS and set equal to zero if negative.
Shift right 3 places (divide by 8) and use as index to look up F.
Form the components of F: (F*DX1/R), (F*DX2/R), (F*DX3/R), and add to F1, F2, F3.
Increment I and continue.
On exit from I loop calculate momentum changes as F1, F2, F3 shifted right 4 places (divided by 16, corresponding to Δt = 1/16) and update P1(J), P2(J), and P3(J).
Calculate position changes as P1(J), P2(J), and P3(J), shifted right 4 places, and update X1(J), X2(J), and X3(J).
Display new position.
Increment J and continue.
On exit from J loop, store panel switches and test for exit to monitor, cooling, or temperature calculation.
Increment T register and continue.
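The J and I loops listed above can be sketched in integer arithmetic as follows. This is a reading of the steps, not the original assembly code, and it rests on labeled assumptions: coordinates are taken as signed 16 bit words whose overflow wraps around (which would act as a periodic boundary), division truncates toward zero, and the table is assumed to hold the force with a sign such that a negative entry is repulsive, so that adding F*DX/R with DX pointing from J to I pushes J away from I.

```python
import math

WORD = 1 << 16
R_S, R_C = 10234, 18426      # truncation and cutoff distances, integer units
RRC = R_C * R_C              # squared cutoff, compared as a double word


def wrap16(v):
    """Wrap to a signed 16 bit word (assumption: overflow acts as a periodic boundary)."""
    return (v + 0x8000) % WORD - 0x8000


def div_trunc(a, b):
    """Integer division truncating toward zero (assumed hardware behavior)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def time_step(X, P, table, dt_shift=4):
    """One pass of the J loop: update momentum and position of every particle in turn."""
    n = len(X)
    for j in range(n):
        F = [0, 0, 0]                       # clear F1, F2, F3
        for i in range(n):
            if i == j:
                continue                    # test and skip if I = J
            dx = [wrap16(X[i][k] - X[j][k]) for k in range(3)]
            rr = sum(d * d for d in dx)
            if rr > RRC or rr == 0:
                continue                    # beyond cutoff (or coincident, a guard added here)
            r = math.isqrt(rr)
            index = min(max(r - R_S, 0) >> 3, len(table) - 1)
            f = table[index]
            for k in range(3):
                F[k] += div_trunc(f * dx[k], r)
        for k in range(3):
            P[j][k] += F[k] >> dt_shift     # momentum change = F * dt
            X[j][k] = wrap16(X[j][k] + (P[j][k] >> dt_shift))


# Usage: two particles on the x axis and a constant repulsive table (hypothetical values).
X = [[0, 0, 0], [12000, 0, 0]]
P = [[0, 0, 0], [0, 0, 0]]
time_step(X, P, [-1600] * 1024)
print(P, X)
```

As in the steps above, each particle is moved as soon as its net force is known, so later particles in the J loop see the already updated positions of earlier ones.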
Display and Output of Results. The calculations produce two sorts of results: particle positions as a function of time, and statistical functions of the system. The particle positions as a function of time are displayed on the storage CRT. For reasons of clarity, only those particles in the first octant (all components of position positive) are displayed. After the new position of the Jth particle is calculated, the three components of position are tested, and, if all are positive, the x and y components of position are transmitted through the CRU to the 10 bit digital-to-analog converters which drive the CRT display unit. We thus display the projection on the x-y plane of the content of the first octant. Placing the display unit in storage mode results in the development of traces of the paths of the centers of the particles. Some typical traces after varying lengths of time (0.36 ps, 0.9 ps, 1.8 ps) are shown in Figures 1, 2 and 3. Such displays are quite valuable for visualising the nature of a liquid (it rapidly becomes obvious that a liquid is neither gas-like nor solid-like), but, obviously, some quantitative characteristics are needed. Two widely used statistics of a liquid are the mean square displacement function, and the velocity autocorrelation function. We take the mean square displacement function, $x^2(t)$, as the ensemble average:

$$x^2(t) = \langle |\mathbf{x}_i(t) - \mathbf{x}_i(0)|^2 \rangle_i$$

It is, of course, equal to the time average for any particle:

$$x^2(t) = \langle |\mathbf{x}(t + \tau) - \mathbf{x}(\tau)|^2 \rangle_\tau$$

but the first form is more convenient here. To evaluate it, we choose a starting time after the system has reached equilibrium, as determined by the constancy of the "temperature". We take this time as t = 0, and store the values of X1, X2, and X3 for all the
Figure 1. Projection on the x-y plane of the traces of the motion of the centers of simulated argon atoms in the first octant of the system. Temperature is 90 K; elapsed time 0.36 picoseconds; computation time, 2 minutes.
Figure 2. Projection of traces of motion of argon atoms as in Figure 1, but after total elapsed time of 0.9 picoseconds; computation time, 5 minutes.
Figure 3. Projection of traces of motion of argon atoms as in Figures 1 and 2, but after total elapsed time of 1.8 picoseconds; computation time, 10 minutes.
particles as XS1, XS2, and XS3. At each time step, after completion of the J loop, the current value of $x^2(t)$ is evaluated by summing (X1(I)-XS1(I))**2 + (X2(I)-XS2(I))**2 + (X3(I)-XS3(I))**2 over all particles and dividing by N. The resulting function can be displayed at any time on the CRT or punched out on paper tape at the conclusion of a run for plotting on a pen and ink plotter. A typical plot is shown in Figure 4. The normalized velocity autocorrelation function is calculated in a similar way. It is defined as:

$$\phi(t) = \frac{\langle \mathbf{v}_i(0) \cdot \mathbf{v}_i(t) \rangle}{\langle v_i^2(0) \rangle}$$

At the t = 0 chosen as described above, we store the values of P1, P2, P3 for all the particles as PS1, PS2, PS3. After each time step we form P1*PS1 + P2*PS2 + P3*PS3 for each particle, sum over all particles, and normalize by division by the sum of PS1**2 + PS2**2 + PS3**2 for all particles. This function also can be displayed on the CRT or punched out for external plotting. A typical plot of this function is shown in Figure 5.
Comparison with Standard Calculations. The results obtained in this investigation are consistent with those obtained in conventional large machine calculations, but the cost per computation is very much lower. The qualitative features seen in Figures 1-5 are the same as those reported for the standard calculations; a quantitative test is provided by the diffusion coefficient, which is a sensitive test of the technique. Levesque and Verlet (6) have summarized the results of their calculations for argon with the empirical formula in reduced units:

$$D = 0.006423\, T/\rho^2 + 0.0222 - 0.0280\, \rho.$$

Converted to SI units, this becomes:

$$D = 5.639 \times 10^{-5}\, T/\rho^2 + 8.270 \times 10^{-9} - 6.207 \times 10^{-12}\, \rho.$$

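As a check on the SI form, the formula can be evaluated at the conditions quoted for the simulation, T = 113 K and a density of 1.446 × 10³ kg/m³. The sketch below assumes the coefficients exactly as given above and SI units throughout (D in m²/s); the function name is illustrative.

```python
def diffusion_si(temperature_k, density_kg_m3):
    """Empirical diffusion coefficient of argon in m^2/s (SI form of the formula above)."""
    return (5.639e-5 * temperature_k / density_kg_m3 ** 2
            + 8.270e-9
            - 6.207e-12 * density_kg_m3)


d_formula = diffusion_si(113.0, 1.446e3)
d_simulated = 2.77e-9                      # value obtained from the minicomputer run
ratio = d_simulated / d_formula

print(f"{d_formula:.3e}")                  # about 2.34e-09
print(f"{ratio:.2f}")                      # simulated value is roughly 18 percent higher
```

The result depends on a near cancellation between the last two terms, so it is sensitive to the third significant figure of the coefficients.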
For the conditions corresponding to the data shown in Figure 4, T = 113 K, and ρ = $1.446 \times 10^{3}$ kg/m³, their equation predicts D = $2.34 \times 10^{-9}$ m²/s. The data of Figure 4 correspond to a D = $2.77 \times 10^{-9}$ m²/s. This difference is of the same order as the scatter of standard calculations, and the discrepancy between calculation and experiment. The speed of the calculation was quite reasonable: 282 time steps in 10 minutes, or 1692 steps per hour. Each second of machine time corresponds to $3 \times 10^{-15}$ seconds in argon. For comparison, the reported rate achieved on a large machine, a CDC 6600, was 1500 steps per hour for a somewhat larger system (864 particles) (2). With proper programming, the calculation
Figure 4. Mean square displacement of simulated argon atoms at 113 K as a function of time (horizontal axis: time in picoseconds)
Figure 5. Normalized velocity autocorrelation function of simulated argon atoms at 113 K (horizontal axis: time in picoseconds)
time varies approximately as N. With this allowance, it appears that calculation on a minicomputer is slower by roughly a factor of 6 than on a large machine. To estimate the relative cost of computation, we take the initial cost of the system and distribute it over three years (a conservative procedure, since our machine has been in continuous use for five years with no maintenance contract and negligible servicing). This corresponds to a cost of $6.40 a day or $0.27 per hour. If we assume a large machine cost of about $100 per hour, the cost of equivalent calculations is lower on the small machine by a factor of about 50.
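The speed and cost comparison above can be retraced with a few lines of arithmetic. The sketch uses only figures quoted in the text, except for the minicomputer particle count of 128, which is an inference from the 384 words reserved per coordinate array and should be read as an assumption.

```python
# Throughput in particle-steps per hour, assuming time scales linearly with N.
mini_rate = 282 * 6 * 128        # 282 steps in 10 minutes, 128 particles (assumed)
large_rate = 1500 * 864          # CDC 6600 figure quoted in the text
slowdown = large_rate / mini_rate

# Cost of the minicomputer spread over three years of continuous use.
cost_per_day = 7000 / (3 * 365)
cost_per_hour = cost_per_day / 24

# Cost advantage for equivalent work, with the large machine at $100 per hour.
cost_advantage = 100 / (cost_per_hour * slowdown)

print(round(slowdown, 1), round(cost_per_day, 2), round(cost_per_hour, 2))
print(round(cost_advantage))
```

With these inputs the slowdown comes out close to 6 and the cost advantage at roughly 60, the same order as the factor of about 50 quoted above.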
Future Prospects. The use of currently available hardware, instead of the obsolete 960A, would make possible both greater savings and much more ambitious calculations. In particular, a three dimensional array of microprocessors (such as the TI 9900), each assigned a portion of the volume under study, could be used to increase the speed of calculation by more than an order of magnitude for a cost increase of about a factor of two.
Literature Cited.
(1) Rahman, A., Phys. Rev. (1964), 136, A405.
(2) McDonald, I. R. and Singer, K., Quart. Rev. (1970), 24, 238.
(3) Rahman, A., in "Interatomic Potentials and Simulation of Lattice Defects", ed. by Gehlen, P. C., Beeler, J. R., Jr., and Jaffee, R. I., p. 233, Plenum, N.Y., 1972.
(4) Fisher, R. A. and Watts, R. O., Aust. J. Phys. (1972), 25, 529.
(5) Flinn, P. A., in "Mössbauer Effect Methodology", vol. 9, ed. by Gruverman, I. J., Seidel, C. W., and Dieterly, p. 245, Plenum, N.Y., 1974.
(6) Levesque, D. and Verlet, L., Phys. Rev. (1971), A2, 2514.
12
Many-Atom Molecular Dynamics with an Array Processor
KENT R. WILSON
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch012
Department of Chemistry, University of California, San Diego, La Jolla, CA 92093
"The change of motion is proportional to the motive force impressed; and is made in the direction of the right line in which the force is impressed."
Sir Isaac Newton, Philosophiae Naturalis Principia Mathematica, 1687.
I. Introduction and History
A. Theoretical Instruments. We chemists traditionally have built specialized instrumentation for experimental studies. We are now beginning also to build specialized instrumentation for theory (1). While we are accustomed to designing and building, for example, special spectrometers or molecular beam machines to efficiently probe the experimental side of a particular class of chemical questions, it is now becoming clear that with comparable effort we can also design and build specialized computational systems which will efficiently probe particular classes of theoretical problems. The reasons for building specialized instrumentation in either case are similar: we want to explore chemical questions beyond the range of what we can learn using general purpose commercial instrumentation, which must sacrifice specific efficiency to generalized applicability.
B. Plastic Hardware. We are accustomed to thinking of computer software as plastic, malleable; employed to adapt a general purpose computer to our specific needs. The advance of computer science and technology has now softened hardware as well, making it also plastic, moldable to effectively fit the task at hand. But while hardware is plastic, it still has restraints. It flows more easily in some directions than in others. Thus, the initial task is to find those chemical problems which are best suited to this natural direction of hardware flow. For example, it is now cheaper to replicate many identical hardware units than to produce even a few different units. Therefore, one direction of hardware flow is toward structures composed of many identical units, working in parallel (2-4). The
congruent chemistry involves those theoretical problems which can be cast into forms involving many simultaneous parallel streams of computation.
C. Mechanical Molecules. One such chemical area is the classical mechanical treatment of how nuclei, or roughly speaking atoms, interact on a Born-Oppenheimer potential surface. The idea that the forces among a collection of particles determine both their static configuration (molecular structure) and their motions (molecular dynamics) is an old one. Newton, in the 17th century, already understood the fundamental concepts of classically interacting particles and considered that macroscopic properties might result from microscopic interactions. By the 19th century, with the acceptance of the atomic theory, the view that chemistry should ultimately be an exercise in mechanics became a popular one. The nature of the underlying mechanics became apparent fifty years ago with the development of quantum mechanics; it is now clear that what the electrons are doing is inherently a quantum problem, but that, given a potential surface derived either from a theoretical quantum computation of electronic energy or from a fit to experimental measurements, what the nuclei are doing both in terms of molecular structure and molecular dynamics can be handled in most cases reasonably well by that approximate form of quantum mechanics called classical mechanics. (In a sense this is unfortunate, for chemistry would be an even more subtle and interesting puzzle if Planck's constant were larger.)
We will thus concentrate here on the advantages which computer hardware plasticity can bring to classical molecular dynamics. (Molecular statics or molecular structure will be viewed in this context as that subset of molecular dynamics for which the energy has been reduced to a global minimum.) The structure of the computation is exceedingly simple, a desirable situation for a first essay into a different mode of solution. Given N atoms, we have, from Newton's Second Law,

$$\mathbf{F}_i = m_i \frac{d^2 \mathbf{r}_i}{dt^2}, \qquad i = 1, \ldots, N \tag{1}$$

$$\mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N) = -\nabla_i V(\mathbf{r}_1, \ldots, \mathbf{r}_N) \tag{2}$$

in which $\mathbf{F}_i$, the force on the ith atom, located at $\mathbf{r}_i$, is a function of the positions $\mathbf{r}_1, \ldots, \mathbf{r}_N$ of the set of atoms, whose masses are $m_1, \ldots, m_N$, and V is the Born-Oppenheimer potential surface seen by the nuclei.
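Eq. (2) can be illustrated with a minimal numerical sketch: the force on each atom is obtained as the negative gradient of a potential surface by central differences. The pairwise-additive form of V, the harmonic pair potential in the usage example, and all function names are assumptions made for illustration; Eq. (2) itself does not require V to be a sum of pair terms.

```python
import math


def total_potential(positions, pair_potential):
    """V(r_1, ..., r_N) as a sum over pairs (an assumed, simplified form of the surface)."""
    v = 0.0
    n = len(positions)
    for a in range(n):
        for b in range(a + 1, n):
            v += pair_potential(math.dist(positions[a], positions[b]))
    return v


def forces(positions, pair_potential, h=1e-5):
    """F_i = -grad_i V, estimated by central differences with step h."""
    result = []
    for i in range(len(positions)):
        f_i = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            dv = total_potential(plus, pair_potential) - total_potential(minus, pair_potential)
            f_i.append(-dv / (2.0 * h))
        result.append(f_i)
    return result


# Usage: two atoms joined by a hypothetical harmonic bond of rest length 1, stretched to 2.
def harmonic(r):
    return 0.5 * (r - 1.0) ** 2


F = forces([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], harmonic)
print(F)   # the stretched bond pulls the two atoms toward each other
```

Each force evaluation here costs many potential evaluations; analytic gradients would normally be used, and the finite-difference form is shown only because it follows Eq. (2) most directly.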
D. Two Molecular Dynamics. Strangely, the application of this viewpoint, that chemistry may be understood as the detailed mechanics of atomic motions, has led to two quite distinct fields, each called by the same name, molecular dynamics, which have remained quite separate for twenty years. Both fields, which are compared in Table I, grew up in the late 1950's, one (5) out of statistical mechanics (SM), largely (but not exclusively) concerned with equilibrium and steady state properties, usually of fluids composed of many simple particles: hard spheres, atoms or simplified molecules. The breakthrough which triggered the development of the field was computational, the ability provided by the electronic computer to actually calculate the trajectories of many interacting particles.
TABLE I. Comparison of two fields called molecular dynamics

Category                  Molecular Dynamics (SM)        Molecular Dynamics (CK)
historical antecedents    statistical mechanics          chemical kinetics
initiating breakthrough   computational                  experimental
major application         equilibrium and steady state   chemical reactions
number of atoms           many                           few
major state               liquid                         vacuum (isolated molecules)
The other molecular dynamics (6, 7) grew out of chemical kinetics (CK) and has been concerned with understanding the detailed mechanics of the mechanisms of chemical reactions, usually involving relatively few atoms, smaller molecules colliding and reacting in isolation, the "vacuum" phase. The development of the field was initiated by experimental advances, the ability provided by molecular beam and infrared chemiluminescence techniques to measure the results of individual chemical reaction events.

What we are now attempting is a synthesis drawing from both fields of molecular dynamics, a computational advance which will allow through mechanics the study of the detailed mechanisms of chemical reactions involving many atoms, often occurring in solution.
E. Difficulties and Directions. Given that the structure of Eqs. (1) and (2) is so simple, why isn't the detailed mechanism of many-atom chemical reactions routinely studied by computing the trajectories of the atoms? Three major difficulties are as follows.

1. Potential surface. In reality, we know quantitatively relatively little about the basic determinant of molecular structure and dynamics, the forces among atoms. If we had to compute from first principles the potential surface to chemical accuracy separately for each large molecule of interest, along with all the interactions with surrounding solvent molecules, the problem would seem insurmountable. Our chemical experience, conceptualization, nomenclature and system of cataloging of molecules, however, are based on the faith that molecules can be analyzed into functional groups which retain their approximate identity and nature from molecule to molecule. Thus the force functions, $\mathbf{F}_i(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, to a first approximation should be decomposable into i) local force functions which describe chemical functional groups and which are approximately transferable from molecule to molecule and ii) terms which describe the interaction among functional groups.
This transferable force function approach has been extensively developed in vibrational spectroscopy (8), organic chemistry (9-12) and biochemistry (13, 14) and the wide extent of its applicability is stressed in a recent review by Warshel (15), who describes both the usual type of fully empirical potential surface treatment and a version in which π electrons are treated in a formulation derived from semiempirical quantum mechanics. Thus a reasonable approach to potential surfaces is the patient collection of a library of force functions, refined against theoretical calculations and by comparison of computed to measured parameters, which should be at least approximately transferable from molecule to molecule.
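The decomposition into transferable local terms plus interaction terms can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the harmonic bond form, the Lennard-Jones form of the interaction term, the dictionary layout, and above all the numbers, which are arbitrary placeholders and not fitted or published parameters.

```python
import math

# Hypothetical library of transferable local force functions, keyed by the
# pair of bonded elements. All numbers are arbitrary placeholders in
# arbitrary units, not fitted or published parameters.
BOND_LIBRARY = {
    ("C", "H"): {"k": 5.0, "r0": 1.0},
    ("C", "C"): {"k": 4.0, "r0": 1.5},
}
# Placeholder parameters for the interaction term between non-bonded atoms.
NONBONDED = {"eps": 0.01, "sigma": 3.0}


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bond_params(el_a, el_b):
    """Look up a bond type in either element order."""
    if (el_a, el_b) in BOND_LIBRARY:
        return BOND_LIBRARY[(el_a, el_b)]
    return BOND_LIBRARY[(el_b, el_a)]


def potential_energy(elements, positions, bonds):
    """Sum of (i) local harmonic bond terms and (ii) non-bonded pair terms."""
    energy = 0.0
    bonded = set()
    for i, j in bonds:
        p = bond_params(elements[i], elements[j])
        r = distance(positions[i], positions[j])
        energy += 0.5 * p["k"] * (r - p["r0"]) ** 2
        bonded.add((min(i, j), max(i, j)))
    n = len(elements)
    for i in range(n):
        for j in range(i + 1, n):
            if (i, j) in bonded:
                continue  # bonded pairs are covered by the local term
            s6 = (NONBONDED["sigma"] / distance(positions[i], positions[j])) ** 6
            energy += 4.0 * NONBONDED["eps"] * (s6 * s6 - s6)
    return energy
```

The point of the layout is that the same `BOND_LIBRARY` entry is reused for every C-H bond in every molecule, so that only the library, not each molecule, has to be parameterized.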
2. Computational speed. If one wishes to study the detailed molecular dynamics of reactions of even simple molecules in solution, one must consider at least a single solvation shell around each molecule, and thus at least the order of 100 atoms. Given x, y and z components for Eqs. (1) and (2), one must solve the order of 300 coupled differential equations, integrating forward for thousands or perhaps millions of time steps. The number of arithmetic operations involved is therefore inevitably large. If one wishes to interact with the on-going calculations, viewing the trajectories of the atoms and seeing the results of modifications of parameters within a reasonable waiting time, the processing system must be a rapid one even by today's large computer standards. This difficulty, however, is overshadowed by an even more demanding and subtle one.
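The scale of the arithmetic can be made concrete with a short count. The sketch assumes that every pair of atoms interacts at every time step, with no cutoff, which the text does not state; `md_cost` is a hypothetical helper.

```python
def md_cost(n_atoms, n_steps):
    """Count equations and pair-force evaluations for an all-pairs simulation."""
    equations = 3 * n_atoms                  # x, y and z component per atom
    pairs = n_atoms * (n_atoms - 1) // 2     # distinct atom pairs per step
    return equations, pairs, pairs * n_steps


equations, pairs, evaluations = md_cost(100, 1_000_000)
print(equations, pairs, evaluations)  # 300 4950 4950000000
```

Under that assumption, 100 atoms followed for a million steps require about five billion pair-force evaluations, each costing several floating point operations.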
3. Initial conditions. Unfortunately, we usually do not know in advance where to start, which set of initial positions and velocities for the atoms will lead, as time proceeds, to the chemical process of interest. For most chemical reactions we can't just assemble our molecules and allow them to rattle around toward equilibrium, for on the time scale of internal molecular motion most chemical reactions of interest will almost never occur in an equilibrium system. Thus a random approach doesn't solve the problem.

A quick calculation shows that a brute force systematic approach won't solve it either. Consider a systematic search through just 10 different initial position vectors and 10 different initial velocity vectors for each of 100 atoms. This would give $100^{100} = 10^{200}$ (a number greater than the estimated number of atoms in the universe) different initial phase space points, each of which would have to be integrated forward in time to decide if it did indeed lead to the reaction of interest. Such a brute force approach is now and will always remain infeasible.

If neither random nor brute force systematic approaches are generally feasible, what can be done? One possible approach is the development of techniques to automatically identify critical configurations or saddle points (or more precisely surfaces or regions in phase space (16) through which reaction trajectories must pass). If one can identify such a phase space region, one can then integrate both forward and backward in time to trace out the entire trajectory, and one can explore neighboring trajectories as well. This approach can be straightforward for systems with sufficient symmetry, such as defect jumps in crystals (17), and its extension to more complex molecular systems can also be expected to be pursued.

Another alternative, perhaps complementary to the above, is to try to use the human chemist's accumulated understanding of the mechanisms of chemical reactions to guide the machine's calculations. We chemists at least think we have some knowledge of the way to orient two molecules relative to one another and how to shove them at one another to get them to react. We think we have some feeling for the reaction pathway from reactants to products, for the bonds which must change and for the critical configurations (transition states, activated complexes) which must be traversed. Unfortunately, this chemist's understanding is largely pictorial and intuitive, but our computers need numerical guidance as to positions and velocities in order to proceed. This need to bring together the chemist's non-numerical mechanistic understanding of the reaction pathway with the machine's ability to calculate forward and backward along the reaction trajectory once given the potential surface and the atomic positions and velocities at any given point on the trajectory has led us to work on techniques of closer man-machine interaction.
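The idea of integrating both forward and backward in time from a critical configuration can be sketched on a one-dimensional model. The double-well potential V(x) = (x^2 - 1)^2, with its barrier top at x = 0, and the velocity Verlet integrator are assumptions chosen for illustration; the function names are hypothetical.

```python
def force(x):
    # Assumed model potential V(x) = (x**2 - 1)**2, so F = -dV/dx
    return -4.0 * x * (x * x - 1.0)


def integrate(x, v, dt, steps, mass=1.0):
    """Velocity Verlet; a negative dt runs the same equations backward in time."""
    traj = [(x, v)]
    f = force(x)
    for _ in range(steps):
        v += 0.5 * dt * f / mass
        x += dt * v
        f = force(x)
        v += 0.5 * dt * f / mass
        traj.append((x, v))
    return traj


def trajectory_through(x0, v0, dt, steps):
    """Full trajectory passing through (x0, v0): past branch, then future branch."""
    forward = integrate(x0, v0, dt, steps)
    backward = integrate(x0, v0, -dt, steps)
    return backward[::-1] + forward[1:]


# Start on the barrier top with a small velocity toward the right-hand well.
path = trajectory_through(0.0, 0.5, 0.01, 100)
print(path[0][0] < 0.0 < path[-1][0])  # reactant side in the past, product side later
```

A single phase space point on the barrier thus yields the whole reactive trajectory, from the left-hand well in the past to the right-hand well in the future, without any search over initial conditions in the wells themselves.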
The first need is vision. In order to comprehend the molecular dynamics of reactions involving a hundred or more atoms, it is imperative to be able to watch the motions, the three-dimensional (3D) trajectories of the atoms involved. Fortunately this is a well-solved problem, with several specialized display systems now being commercially available which make feasible the visualization of the 3D motions of hundreds or even thousands of atoms in real time (human, not molecular) and even in color and/or stereo, if desired. In addition, films can easily be made using even relatively simple display terminals which can allow the off-line visualization of molecular dynamics.

We quickly discovered, however, that vision alone is insufficient. We want to manipulate atoms, fragments within molecules or entire molecules which are closely surrounded by other atoms, fragments and molecules, in order to arrive at some point on a reaction-path phase-space trajectory. To do this we must remain within the energy range which is thermally allowed. However, in a dense system, as is well known in Monte Carlo calculations (18), almost all randomly chosen new configurations are energetically inaccessible, because the atoms are almost all already up against hard repulsive walls (19) and a random displacement will almost always send the energy too high.
Thus, just as potential surface referenced importance sampling (18) is used to guide the choice of new configurations in Monte Carlo calculations, some feedback from the potential energy surface is needed to guide the human chemist in manipulating atoms, fragments and molecules to reach a point on the reaction path. We have found that vision is a poor feedback tool for maneuvering on a multidimensional potential surface and we believe that this is at least in part because touch rather than vision is the natural human sense when forces and torques are to be perceived. This has led us to the development of man-machine touch interfaces (1, 20) to link man and machine more closely than is possible with vision alone.
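The way the potential surface filters trial configurations in Monte Carlo work can be sketched with the standard Metropolis acceptance rule. Using this particular rule is an assumption; the text cites reference (18) for importance sampling without spelling out the scheme, and the function names below are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng=random):
    """Accept a trial move with probability min(1, exp(-delta_e / kT))."""
    if delta_e <= 0.0:
        return True  # downhill or level moves are always accepted
    return rng.random() < math.exp(-delta_e / kT)


def acceptance_fraction(delta_es, kT, rng=random):
    """Fraction of trial moves accepted for a list of energy changes."""
    accepted = sum(metropolis_accept(d, kT, rng) for d in delta_es)
    return accepted / len(delta_es)
```

A displacement that drives an atom into a repulsive wall raises the energy by many times kT, so its acceptance probability is vanishingly small (for an increase of 50 kT it is exp(-50), about 2e-22). This is the numerical counterpart of the resistance a user should feel through a touch interface.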
F. Goal. Our goal thus is to develop and use an "instrument for theory" which we call NEWTON, a closer man-machine symbiosis focused on the understanding of the molecular dynamics of many-atom chemical reactions, a machine which opens a window to the microscopic world of the 3D trajectories of moving atoms, visualized as we wish, elements labeled, bonds shown. We wish to be able to build up the system of interest from atoms, fragments and molecules, adjusting the positions and velocities to correspond to our understanding of mechanism, reaction path and critical configuration in order to initiate the desired chemical reaction. We want to control energy, temperature and pressure by the turn of knobs and to display the calculated values as the process proceeds. Our viewpoint (angle and zoom) should be variable, as well as which atoms are to be
displayed. One should be able to control the speed of passage of computed time: increasing (up to the computational limit), decreasing, freeze framing, or backing up and then readjusting parameters and restudying. One would like to calculate and display derived parameters such as bond lengths and angles, progress along a defined reaction coordinate or computed spectra to compare with measured spectra. In addition, a record of the run, including all input parameters and atomic trajectories, should be stored for future additional analysis. As we will see in the following section, most of these instrumental goals have been achieved, at least in a preliminary fashion.
II. Instrumentation
Two versions of NEWTON have now been built and tested, the earlier version able to handle a few atoms and the present one a hundred or more atoms.

A. Initial Version. The first implementation of the NEWTON concept is shown schematically in Figure 1. As it is described elsewhere (1), it will only briefly be mentioned here. The equations of motion are integrated in a minicomputer, the moving atoms are displayed on an Evans and Sutherland (E&S) Picture System and the user can control the position and velocity of any selected atom by using the "Touchy-Feely" touch interface, feeling the forces imparted by neighboring atoms. This system served to show that such an instrument could be built, but was only adequate to handle a few interacting atoms and manipulate them atom by atom.

B. Present Version. The current system, which can handle a hundred interacting atoms fast enough for interactive use (at approximately 10 integration time steps per second), is shown as a block diagram in Figure 2 and as a photograph in Figure 3. Several hundred atoms can be handled at reduced speed. The equations of motion are integrated in a Floating Point Systems (FPS) AP120B Array Processor which runs for our application at a throughput of several floating point operations per microsecond and which forms, with the help of its host, essentially a general-purpose processor capable of several simultaneous operations, with parallel and pipelined floating point adder and multiplier. At present, it lacks direct higher level language capability.
Its approximate relative power may be judged by comparisons indicating a speed 3 to 4 times slower (21) than a Control Data Corporation (CDC) 7600 and 10 to 50 times faster (22) than a Data General (DG) Eclipse under Fortran V. It should be realized that all such comparisons are a function of program mix and efficiency of coding. Visual interaction with the user is through a dynamic 3D
[Block diagram; labeled blocks: IBM 1800, three CAMAC crates, Meta-4, visual processor, TTY, visual interface, touch interface.]

Figure 1. Block diagram of system used to test crudely the concept of NEWTON. The touchstone of the touch interface drives the central carbon atom of a methane molecule, allowing it to be moved and the forces on it from the other atoms to be felt by the user. The molecule is displayed on the Evans & Sutherland (E&S) Picture System, and the differential equations are integrated in real (human) time by the Digital Scientific Meta-4 computer to give the trajectories displayed on the Picture System. The Meta-4 is linked through three CAMAC crates and an IBM 1800 to the California Data Processors (CDP) 135 emulating a Digital Equipment Corporation (DEC) PDP 11/40 which in turn runs symbiotically with the Picture System processor.
Figure 2. Block diagram of present NEWTON instrument designed for interactive study of the molecular dynamics of chemical reactions involving a hundred or more atoms. The user interacts with NEWTON by setting parameters such as temperature, pressure, and time step through knobs and teletype, by watching the motion of the atoms and the values of calculated parameters on the screen of the E&S Picture System and by adjusting the positions and velocities of atoms with the touch interface. The coupled differential equations (Newton's Second Law) are integrated in the Floating Point Systems (FPS) Array Processor to calculate the atomic trajectories. Other parts of the Chemistry Department Computer Facility (into which NEWTON is integrated) which are used as part of NEWTON include a CDP 135 emulating a DEC PDP 11/40 which serves as host for the Array Processor and the Picture System and a Varian 72 which handles disk management.
Figure 3. Photograph of NEWTON showing E&S Picture System screen on the left, control knobs and FPS Array Processor in the background.

Figure 4. Schematic of "Touchy-Twisty" designed for force-torque-position-orientation man-machine communication, a touch interface to assemble and manipulate three-dimensional objects. A handball containing force-torque vector sensors is driven to position and orientation by three nested computer-driven rotational stages carried by three nested computer-driven translational stages.
display using an Evans and Sutherland Picture System which allows the motions of the appropriately labeled atoms to be seen as they are calculated. A California Data Processors (CDP) 135 emulating a Digital Equipment Corporation (DEC) PDP 11/40 serves as host for both the Array Processor and the Picture System. Binocular stereo and color presentations are available by visual fusion of spinning-disk controlled sequential images, but in practice are only rarely used. Rotation of the system of molecules is a better depth cue and labeling of atoms is a sufficient identifier. Orientation of view, angular velocity of rotation and zoom are all controllable by knobs and buttons.

Temperature is varied by knob-controlled, mass-weighted viscosity which removes energy as viscosity is increased or adds energy if viscosity is formally made negative. External pressure is controlled by changing the size of an elastic-walled boundary cube. Other boundary conditions, for example periodic repetition or a free-floating drop, are also possible. Temperature and pressure are calculated from atomic velocities, forces and positions and are displayed on the Picture System screen.

NEWTON is integrated into the Chemistry Department Computer Facility, which includes a dozen processors interconnected through a system based on the CAMAC convention.
Others of these processors which are used in conjunction with NEWTON include a Varian 72 which handles disk management, an IBM 1800 which controls peripherals and a second CDP 135 emulating a DEC PDP 11/40 which runs a UNIX time-shared operating system used for program editing and file manipulation.

C. Touch Interface. We wish to build up our chemical systems of interest not just atom by atom, but from fragments and whole molecules and we wish also to be able to reach into the simulated volume and guide fragments and molecules into the desired coordinates and velocities to arrive at a point along the reactive trajectory for the chemical process of interest. The atom by atom touch interface described above, involving force and position (1, 20), is no longer sufficient if we wish to assemble and manipulate three dimensional objects such as fragments and molecules involving force, torque, position and orientation. Therefore we are building (20) what we call a "Touchy-Twisty" which is shown in Figures 4-6. A ball for the user's hand (the handball) is driven by three nested computer-controlled translational stages carrying three nested computer-controlled rotational stages to follow the x, y, z position of the center of mass as well as the orientation of three defined axes within a designated fragment or molecule. The force and torque vectors exerted by the user on the handball will be sensed by internal flexing members with strain gauge pickups (see Figure 6) and will be added appropriately to the forces already exerted by surrounding atoms on each atom of the designated molecule, and will therefore affect the on-going
Figure 5. Photograph of "Touchy-Twisty" partially constructed
Figure 6. Photograph of force-torque resolver inside handball, under construction. Strain gauges will be mounted on the flexing members to pick up components of force and torque.
calculation of that molecule's trajectory. Thus, as the user tries to translate or rotate the handball-molecule in a way which matches chemical possibility as described by the potential surface, it will move relatively freely, being unhindered by opposing forces from surrounding atoms. Conversely, if one tries to translate or rotate the handball-molecule so that repulsive walls of surrounding atoms are impinged upon, it will move only with difficulty, as these atoms must be shoved out of the way to proceed. This type of touch interface is designed specifically to interact with a dynamic system, as its communication with the user is intimately linked to the computer's ability to simultaneously integrate the equations of motion of the objects involved in the dynamic simulation.
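Two of the controls described in this section act by adding terms to the force on each atom: the knob-controlled, mass-weighted viscosity used for temperature, and the force the user applies through the touch interface. A minimal sketch of both follows. The drag form F = -gamma m v and the mass-weighted sharing of the user's force over the atoms of the designated molecule are assumptions; the text says only that the user's force is "added appropriately". Torque is omitted, and `added_forces` is a hypothetical name.

```python
def added_forces(masses, velocities, gamma, user_force=(0.0, 0.0, 0.0), selected=()):
    """Per-atom forces to add to the forces derived from the potential surface.

    gamma > 0 removes kinetic energy, gamma < 0 adds it (assumed drag form).
    user_force is shared over the atoms in `selected` in proportion to mass,
    so every selected atom receives the same acceleration.
    """
    extra = [[-gamma * m * v[k] for k in range(3)]
             for m, v in zip(masses, velocities)]
    if selected:
        total_mass = sum(masses[i] for i in selected)
        for i in selected:
            weight = masses[i] / total_mass
            for k in range(3):
                extra[i][k] += weight * user_force[k]
    return extra
```

With this drag form the rate of change of kinetic energy due to viscosity is -2 gamma times the kinetic energy, so every atom is damped at the same rate regardless of mass, and a formally negative gamma heats the system.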
III. Chemical Applications
While the mechanical molecule approach to the molecular dynamics of many-atom chemical reactions is in principle applicable to almost any chemical reaction, our lack of sufficient general quantitative knowledge of interatomic forces makes it wise to concentrate, at least initially, on cases in which the many-atom complexity arises largely from the repetition of simple units, for example polymers in which the monomer is the repeated unit and reactions of smaller molecules in solution in which the solvent molecule is repeated, so that the number of force parameters to be determined remains manageable. Two of our current interests are therefore dynamic approaches to vibrational spectra in solution and to the microscopic understanding of solvation.

A. Dynamic Approach to Vibrational Spectra. If we observe a small molecule, the vibrational spectrum (infrared or Raman) is a series of well-defined lines, and we know how to invert such spectra to gain information on the potential surface near the equilibrium geometry (8, 23). If we go to many-atom systems, i.e. large molecules or collections of closely interacting molecules as in a liquid, instead of well-defined lines we find broad continuous bands and we can no longer invert to the potential surface in the same direct way. However, we can still proceed in the opposite direction, calculating the vibrational spectrum from the potential energy surface.
(Such an approach was perhaps better known before the day of modern computers when actual mechanical models of molecules were constructed from springs and masses and driven by an eccentric disk on a motor whose speed was varied to find the resonances corresponding to the normal frequencies (24, 25).) For example, we can use linear response theory (26-31) to relate the spectrum of the natural fluctuations of a parameter in a system at equilibrium to the response spectrum we would find if we drove that parameter with a weak external
perturbation. Thus we can start with a potential surface $V(\mathbf{r}_1, \ldots, \mathbf{r}_N)$, calculate the trajectories $\mathbf{r}_1(t), \ldots, \mathbf{r}_N(t)$ of atoms upon it at equilibrium at a chosen temperature, calculate (for example, in the first approximation by assigning partial atomic charges) the time varying dipole moment $\boldsymbol{\mu}(t)$ from the trajectories, and then calculate the infrared spectrum from the power spectrum or from the Fourier transform of the time correlation of the dipole moment (29, 30).

$$V(\mathbf{r}_1, \ldots, \mathbf{r}_N) \;\rightarrow\; \mathbf{r}_1(t), \ldots, \mathbf{r}_N(t) \;\rightarrow\; \boldsymbol{\mu}(t) \;\rightarrow\; A(\omega) \tag{3}$$
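The last two arrows of Eq. (3) can be sketched in a few lines. The fragment below builds the dipole moment from fixed partial charges and takes its power spectrum with a direct discrete Fourier transform. It is a sketch under stated assumptions: the trajectory is taken as already computed, frequency-dependent prefactors and any quantum corrections are omitted, and the function names and data layout are hypothetical.

```python
import cmath
import math


def dipole_series(charges, trajectory):
    """mu(t) = sum_i q_i r_i(t); trajectory[t][i] is the (x, y, z) of atom i."""
    return [
        tuple(sum(q * r[k] for q, r in zip(charges, frame)) for k in range(3))
        for frame in trajectory
    ]


def power_spectrum(mu, dt):
    """Unnormalized power spectrum of the dipole fluctuations (direct DFT)."""
    n = len(mu)
    mean = [sum(m[k] for m in mu) / n for k in range(3)]
    freqs, power = [], []
    for j in range(n // 2 + 1):
        p = 0.0
        for k in range(3):
            s = sum(
                (mu[t][k] - mean[k]) * cmath.exp(-2j * math.pi * j * t / n)
                for t in range(n)
            )
            p += abs(s) ** 2
        freqs.append(j / (n * dt))  # cycles per unit time
        power.append(p / n)
    return freqs, power
```

For a single unit charge oscillating along x at 5 cycles per unit time, sampled 200 times at intervals of 0.01, the spectrum peaks in the bin at frequency 5, as expected. A fast Fourier transform would replace the direct sum for trajectories of realistic length.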
Similarly, by assigning an approximate relationship between polarizability and atomic coordinates, one should be able to compute Raman spectra.

For example, we have used the Lemberg-Stillinger potential (32) for water to calculate infrared spectra at approximately room temperature for equilibrated isolated water molecules and then for larger and larger clusters. The spectrum shifts smoothly from the gas-phase line spectrum toward the broad bands characteristic of the liquid phase, the bending (scissors) vibration moving up in energy and broadening as expected and the asymmetric and symmetric stretches moving down in energy and melding together to form what in the liquid is a single broad peak. The Lemberg-Stillinger potential was designed for somewhat different ends, and by its nature as a central force approximation, a sum of two body terms, $V_{HH}$, $V_{OH}$ and $V_{OO}$, it cannot accurately reproduce the isolated molecule spectrum. Nonetheless, it is instructive to see that the expected gas to liquid shifts are taking place as the cluster size grows. Similar calculations with more realistic potentials are in preparation for several liquids.

There are two purposes to such calculations. The first is to improve our knowledge of interatomic forces, in particular non-bonded and intermolecular forces, which we need for further molecular dynamics studies. For example, we can set up a parameterized potential function which is constrained by what we know, such as equilibrium bond lengths and angles and dissociation energies, but which contains adjustable parameters such as those describing non-bonded interactions. Then we can iteratively change the adjustable parameters to try to gain better agreement between calculated and measured spectra, hopefully converging on an improved potential surface.

The second purpose is to try to change our present understanding of liquid state vibrational spectra, which is mainly qualitative, into quantitative understanding based on potential surfaces and molecular dynamics. For example, if we believe we have a reasonable potential surface, we should be able to assign
spectral features by driving the simulated system of molecules with a simulated electric field oscillating at the frequency of the spectral feature and then watching and analyzing the actual computed trajectories which the atoms follow in response to this perturbation. Such an approach is not really a new one, as it resembles the technique (33) used forty years ago to analyze stroboscopically the normal motions of molecules modelled mechanically by masses and springs and driven by an external mechanical oscillatory perturbation. The advent of systematic procedures for the analysis of vibrational line spectra (8, 23) has made such mechanical molecule approaches unnecessary for few atom systems, but has not solved the problem for many-atom systems. With the present availability of very fast computing systems such as our array processor and our ability to visually recognize complex motions with the aid of dynamic computer graphics, we can now apply this mechanical molecule approach in a new form to many-atom spectra, in particular spectra in solution.

B. Dynamics of Solvation. A second area of application is the understanding of solvation in terms of the trajectories of the atoms. Most reactions of interest to chemists and most of the chemistry in living systems occur in solution, yet we understand very little of solvation, and even less of chemical reactions in solution, in terms of a quantitative microscopic picture involving atomic motions.
The modelling of the molecular dynamics of solvation in isolated droplets of up to hundreds of solvent molecules is relatively straightforward; the large difficulty comes in trying to match the properties of bulk solutions with calculations involving finite numbers of molecules. The key to the latter appears to be in the boundary conditions: whether to choose, for example, periodic boundary conditions, a dielectric-surrounded cavity or a surface layer which is fixed in the configuration of bulk solvent (34). In the illustrations shown in Figures 7-9 we have chosen the easy way out, by modelling isolated droplets. These stereo pairs, which may be seen by most people in depth by a slight crossing of the eyes, represent individual frames from the calculated time history of a water cluster, the solvation of a chloride ion in water and the process of dissolution and solvation of an ultracrystallite of NaCl in water. The water potential is again Lemberg-Stillinger (32) with electrostatic interactions and approximate repulsive cores for the interactions with and among the ions.

Figure 7. A time-step in the evolution of a cluster of 31 water molecules
Figure 8. A time-step in the history of a chloride ion solvated in an isolated water droplet
Figure 9. A time-step in the dissolution and ion solvation of a crystallite of NaCl

IV. Some Thoughts on the Future

A. Future Applications. The author suspects that in the long run, the most interesting many-atom molecular dynamics is likely to be found in biomolecular reactions. While up to the present, biochemistry and molecular biology have concentrated on statics, i.e. the relationship of structure and function, it seems clear that the functioning of at least many of the most interesting biomolecules must be understood in terms of dynamics, their time evolution. A very long period of selection has undoubtedly moulded many biomolecules into very efficient machines whose dynamics as yet is largely speculative. Examples of such biomachinery are to be found in enzymic action (35) and allosteric effects, muscle contraction, membrane transport (particularly active transport), aspects of drug-receptor interaction, and biomolecular self-assembly. Perhaps as the past twenty years have seen such great progress in the understanding of biomolecular structure-function relationships, the next twenty years may see similar progress in understanding the more complete picture of biomolecular structure-dynamics-function. While some molecular dynamic calculations on biomolecules are already in progress in batch mode, for example retinal photoisomerization (36), water around a dipeptide to study the difference in dynamics near hydrophilic and hydrophobic sites (37) and motions of a simplified small protein, pancreatic trypsin inhibitor (38), such calculations are severely hindered by limits to available computational speed. How can such limits be transcended?

B. Faster Computation. With a few more orders of magnitude in computer speed, the mechanism of most reactions of interest to chemists would be accessible to study by many-atom molecular dynamics. How can such speed increases be achieved?
Two directions are apparent: more powerful elements (integrated circuits) and the interconnection of these elements in architectures which more efficiently match the problem to be solved. It is thought that there is another factor of 30 still to be realized in linear shrinkage in metal oxide semiconductor (MOS) technology before fundamental physical limits are reached (39). This translates into a 30² increase in packing density on a chip and another factor of 30 in speed, for a total gain of perhaps four orders of magnitude. Thus we can look forward to continuing substantial gains in computational power per element by this and probably by other routes as well.

A complementary approach is the architecture of interconnecting the elements. The classical mechanics of a set of interacting particles is a problem particularly amenable to specialized computer architecture because i) the algorithms are relatively simple and highly repetitive and ii) the computation can be split into parallel streams which need communicate only once (or perhaps a few times with more complex integration schemes) for each integration time step. Thus, instead of an array processor we can consider arrays of processors or even arrays of array processors (1). When one considers such computational systems composed of so many active elements, several similarities between computer architecture and molecular architecture become evident (39). The basic determinant of structure becomes not the logical elements (atoms) themselves, but rather their interconnections (bonds) and these now become the focus of design (39) as shown in Table II.

* Such increases in specialized computer power are in progress in other areas as well (1). Examples include the Parallel Element Processing Ensemble (PEPE) for missile tracking being constructed for the Army Advanced Ballistic Missile Defense Agency which is designed (4) to run many times faster than any existing general purpose processor as well as the special aerodynamic computer (40) being considered by NASA which would be two orders of magnitude faster than existing general purpose machines.
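The remark that the computation splits into parallel streams which need communicate only once per integration time step can be illustrated with a short sketch. This is not the NEWTON implementation: the function names, the toy spring-like pair force and the simple explicit update are all assumptions chosen for brevity. Each "stream" owns a subset of atoms and works from one shared snapshot of the positions, which stands in for the single exchange per step.

```python
def forces_on(indices, positions, k=1.0):
    """Toy pair force (an assumption): every atom pulls on atom i like a spring."""
    out = {}
    for i in indices:
        fx = fy = fz = 0.0
        xi, yi, zi = positions[i]
        for (xj, yj, zj) in positions:
            fx += k * (xj - xi)
            fy += k * (yj - yi)
            fz += k * (zj - zi)
        out[i] = (fx, fy, fz)
    return out


def step(positions, velocities, streams, dt=0.01, mass=1.0):
    """Advance one time step; every stream reads the same position snapshot."""
    snapshot = list(positions)   # the one exchange of data per time step
    new_pos = list(positions)
    new_vel = list(velocities)
    for indices in streams:      # each pass could run on a separate processor
        f = forces_on(indices, snapshot)
        for i in indices:
            v = tuple(vc + dt * fc / mass for vc, fc in zip(velocities[i], f[i]))
            new_vel[i] = v
            new_pos[i] = tuple(pc + dt * vc for pc, vc in zip(snapshot[i], v))
    return new_pos, new_vel


positions = [(float(i), 0.0, 0.0) for i in range(8)]
velocities = [(0.0, 0.0, 0.0)] * 8

# The same step computed as one stream and as four independent streams.
serial = step(positions, velocities, [list(range(8))])
parallel = step(positions, velocities, [[0, 1], [2, 3], [4, 5], [6, 7]])
```

Because no stream reads anything another stream writes during the step, the partition into streams does not change the result; only the snapshot has to be shared.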
Table II. Evolution of emphasis of computer architecture from logical elements to interconnections (39).

Characteristics            Past               Future
large, slow, expensive     logical elements   interconnections
small, fast, cheap         interconnections   logical elements
Because computer architecture can now be constructed containing so many elements and interconnections, the same problems in human conceptualization arise as in systems composed of many atoms and bonds, that no one person can possibly understand all the relationships among the individual detailed parts of the system. In response, the same approach of emphasizing the symmetry of the situation becomes useful.

For example, one obvious way of interconnecting processors in parallel is a single bus, as shown in Figure 10. To a chemist this is a linear polymer and shares its symmetry. If one branches the bus, it's a branched polymer, or one can make cyclic systems, etc.

Figure 10. The symmetry of an array of processors connected by a bus, or equivalently the symmetry of a linear polymer

A very appealing solution for a problem such as molecular dynamics which is to be solved in terms of Cartesian space is to map the 3D problem space onto a 3D space of an array of processors (39), an example of which is shown in Figure 11. Two ways of carrying out such a mapping for our case are as follows. First, one could map each atom onto a processor and then "dynamically reallocate processors" so as to maintain near neighbor relationships as atoms move about on their trajectories. A key question to investigate is whether there is a local reallocation algorithm which will efficiently maintain a satisfactory mapping by querying only other processors in the vicinity, and then exchanging assignments of processors to atoms. A second approach is to map regions of 3D coordinate space onto specific processors; in other words, to divide all the space in which the atoms move into volumes such that each processor takes care of all atoms which happen to be in that volume. When an atom crosses the boundary of that volume, it would be reassigned to the processor handling the adjacent volume. In considering such a scheme, it is important to note that the force on any given real atom is only a function of the positions of other atoms within some finite volume about that atom (1) and thus that each processor only need communicate with a localized set of other processors. Thus in the limit of a very large number N of atoms, the number of arithmetic operations required, if done properly, to solve the molecular dynamics increases only proportionally to N, in contrast to widely held opinion (shared until recently by the author) that it must rise faster than N. This is true both in force calculation from a realistic potential surface in classical mechanics, in that in reality all interatomic forces in dense systems are damped out at some distance by intervening movable and polarizable atoms as well as in quantum mechanics in that integrals among orbitals sufficiently separated can be ignored.

Figure 11. The symmetry of a simple cubic 3D array of processors, or equivalently of a 3D simple cubic crystal lattice

If we consider 3D arrays of processors, we chemists already know all the possible different symmetries of how to build the processor array (39), the "crystal computer" (41).
The possible symmetries within each unit composing the array are just the symmetries of crystal unit cells and the symmetries with which the units can be stacked or interconnected into 3D arrays are just the lattice symmetries, the 14 Bravais lattices, the grand total of all combined unit cell and lattice symmetry possibilities being the 230 space groups (42). If we restrict ourselves to building from symmetric, identical units which stack into a space-filling 3D array, the possibilities are even more limited and in fact we can refer back to the Greeks for the solid tessellations. Out of the regular and Archimedean polyhedra there are only 5 which are space filling: the cube, triangular prism, hexagonal prism, rhombic dodecahedron and truncated octahedron (43).

C. Other Instruments for Theory. One can imagine other instruments for other theories. Instead of a NEWTON for classical mechanics, one could consider building a machine for quantum calculations, a SCHRODINGER or a HEISENBERG. One can again map 3D configuration space onto a 3D array of processors, either orbital(s) or atom(s) to processor or volume of space to processor. And again, as the number N of atoms grows large enough, one region of space will no longer directly affect another and the arithmetic operations involved in the calculation will scale, in the limit of quite large N, proportionally as N.
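The volume-to-processor mapping discussed above, and the reason the work grows only in proportion to N, can be sketched in a few lines. The following is an illustrative cell decomposition under stated assumptions, not a description of any machine in the text: the cubic cells, the function names and the sharp cutoff distance are hypothetical. Each cell plays the role of one processor, and an atom is compared only with atoms in its own cell and the 26 adjacent cells.

```python
from collections import defaultdict
from itertools import product


def build_cells(positions, cell_size):
    """Assign each atom index to the cubic cell (one 'processor') that contains it."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(positions):
        key = (int(x // cell_size), int(y // cell_size), int(z // cell_size))
        cells[key].append(i)
    return cells


def neighbor_pairs(positions, cutoff):
    """Return all pairs (i, j) with i < j closer than `cutoff`.

    The cell edge equals the cutoff, so any interacting partner of an atom
    lies in the atom's own cell or in one of the 26 adjacent cells.
    """
    cells = build_cells(positions, cutoff)
    pairs = set()
    for (cx, cy, cz), members in cells.items():
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            # .get() avoids creating empty cells while iterating
            other = cells.get((cx + dx, cy + dy, cz + dz), [])
            for i in members:
                for j in other:
                    if i < j:
                        d2 = sum((a - b) ** 2
                                 for a, b in zip(positions[i], positions[j]))
                        if d2 < cutoff ** 2:
                            pairs.add((i, j))
    return pairs
```

At fixed density the number of atoms per cell is bounded, so the work per cell is bounded and the total grows with the number of cells, that is, with N. An atom crossing a cell boundary simply appears under a different key the next time `build_cells` is called, which corresponds to the reassignment between processors described in the text.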
Lastly, one might want to build a SEMI, a semiclassical instrument for solving quantum mechanically (either ab initio or semiempirically) for the electronic wavefunction and using this wavefunction to analytically derive (44-46, 8, 15) on the fly a force function for the nuclei whose trajectories are being integrated classically. For a system of very large N it is no longer feasible to calculate and store a potential function in advance on a 3N - 6 dimensional mesh. For semiclassical dynamics, all one needs anyway are the forces at those relatively few points actually sampled by the sequence of nuclear coordinate sets generated by the classical numerical integration of the nuclear trajectories.

It should be noted that all of the instruments for theory described above could be implemented as the same 3D array of stored-program processors.
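The contrast drawn above, between tabulating a potential in advance on a 3N - 6 dimensional mesh and evaluating forces only at the points a trajectory actually visits, can be made concrete with a small sketch. Everything here is an assumption for illustration: the velocity Verlet integrator is a standard scheme and not necessarily the one used by the author, and `toy_force` is a harmonic stand-in for a force derived from an electronic wavefunction.

```python
def mesh_points(n_atoms, points_per_axis=10):
    """Number of entries in a potential table over 3N - 6 internal coordinates."""
    return points_per_axis ** (3 * n_atoms - 6)


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate a trajectory, calling `force` only at the points visited."""
    calls = 0
    f = force(x)
    calls += 1
    for _ in range(n_steps):
        v_half = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]
        x = [xi + dt * vh for xi, vh in zip(x, v_half)]
        f = force(x)             # one on-the-fly force evaluation per step
        calls += 1
        v = [vh + 0.5 * dt * fi / mass for vh, fi in zip(v_half, f)]
    return x, v, calls


def toy_force(x, k=1.0):
    """Hypothetical stand-in for a force derived from an electronic wavefunction."""
    return [-k * xi for xi in x]


x_end, v_end, n_calls = velocity_verlet([1.0], [0.0], toy_force,
                                        mass=1.0, dt=0.01, n_steps=1000)
print("force evaluations along the trajectory:", n_calls)
print("mesh entries for 10 atoms at 10 points per axis:", mesh_points(10))
```

The number of force evaluations grows with the length of the trajectory, while the size of a precomputed mesh grows exponentially with the number of atoms, which is the point the text makes for very large N.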
V. Summary
While we chemists have long built specialized instruments for experimental studies, we are now discovering that we can also build specialized instruments for theory, computational apparatus designed to efficiently solve particular classes of chemical problems. An example is NEWTON, an instrument we have constructed to study the detailed mechanism, i.e. the molecular dynamics, of many-atom chemical reactions, particularly in solution. NEWTON allows the chemist to control the state of a simulated system of interacting molecules: selection of the particular molecules, initial conditions of position and velocity, parameters of the potential surface, temperature and pressure. In response, atomic trajectories are classically integrated on the interatomic potential surface in a very fast processor. The chemist can watch the evolving molecular dynamics on a 3D display and interact with the molecules through knobs, keyboard and touch interface. Applications in progress include dynamic studies of vibrational spectra in solution and the dynamics of the solvation process. With increased computer speed, much of biochemistry might become accessible; the relation among structure, dynamics and function for example in enzymic action, active transport and biomolecular self-assembly. Hope for such speed increases lies in two directions: more power per computational unit and the adaptation of over-all computer architecture to match the structure of the problem to be solved. A particularly appealing route is the mapping of calculations in three dimensional configuration space onto a three dimensional array of parallel processors, a route which can be applied equally to classical, semi-classical and quantum calculations, all of which can be shown to scale only proportionally to the number N of atoms in the limit of very large N.

Acknowledgement

The vibrational spectra and dynamics of solvation are by
Peter Berens. Thanks to John Cornelius and the staff of the Chemistry Department Computer Facility for their help, to Sylvia Francl for aid on vibrational spectra, and to the Division of Computer Research of the National Science Foundation and to the Division of Research Resources, National Institutes of Health (RR-00757) whose support has made this work possible.

Literature Cited

1. Wilson, K. R. in "Computer Networking and Chemistry," Lykos, P., ed., American Chemical Society, Washington, D. C., 1975, p. 17.
2. Murtha, J. C., Adv. Computers (1966) 7, 1.
3. Lorin, H., "Parallelism in Hardware and Software," Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1972.
4. Comptre Corporation, Enslow, Philip H., Jr., ed., "Multiprocessors and Parallel Processing," John Wiley & Sons, New York, 1974.
5. Berne, B. J., ed., "Statistical Mechanics, Part B: Time-Dependent Processes," Vol. 6 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
6. Levine, R. D., and Bernstein, R. B., "Molecular Reaction Dynamics," Oxford University Press, New York, 1974.
7. Miller, W. H., ed., "Dynamics of Molecular Collisions, Parts A & B," Vols. 1 & 2 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1976.
8. Califano, S., "Vibrational States," John Wiley, London, 1976.
9. Williams, J. D., Stand, P. J., and Schleyer, P.v.R., Ann. Rev. Phys. Chem. (1968) 19, 531.
10. Kitaigorodsky, A. I., "Molecular Crystals and Molecules," Academic Press, New York, 1973.
11. Hopfinger, A. J., "Conformational Properties of Macromolecules," Academic Press, New York, 1973.
12. Shipman, L. L., Burgess, W., and Sheraga, H. A., Proc. Nat. Acad. Sci. USA (1975) 72, 543.
13. Blout, E. R., Bovey, F. A., Goodman, M., and Lotan, N., eds., "Peptides, Polypeptides and Proteins," John Wiley & Sons, New York, 1974.
14. Momany, F. A., McGuire, R. F., Burgess, A. W., and Sheraga, H. A., J. Phys. Chem. (1975) 79, 2361.
15. Warshel, A. in "Semiempirical Methods of Electronic Structure Calculation, Part A: Techniques," Segal, G. A., ed., Vol. 7 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
16. Bunker, D. L., "Theory of Elementary Gas Reaction Rates," Pergamon Press, Oxford, 1966, Sections 2.2 and 3.2.
17. Bennett, C. H., in "Diffusion in Solids: Recent Developments," Burton, J. J., and Nowich, A. S., eds., Academic Press, New York, 1975.
18. Valleau, J. P., and Whittington, S. G., in "Statistical Mechanics, Part A: Equilibrium Techniques," Berne, B. J., ed., Vol. 5 of "Modern Theoretical Chemistry," Plenum Publishing, New York, 1977.
19. Weeks, J. D., Chandler, D., and Andersen, H. C., J. Chem. Phys. (1971) 54, 5237.
20. Atkinson, W. D., Bond, K. E., Tribale, G. L. III, and Wilson, K. R., Comput. & Graphics (1977) 2, 97.
21. Sutherland, G., Lawrence Livermore Laboratories, Livermore, California, private communication.
22. Park, T. C., Loma Linda University, Loma Linda, California, private communication.
23. Wilson, E. B. Jr., Decius, J. C., and Cross, P. C., "Molecular Vibrations," McGraw-Hill, New York, 1955.
24. Kettering, C. F., Shutts, L. W., and Andrews, D. H., Phys. Rev. (1930) 36, 531.
25. Herzberg, G. H., "Molecular Spectra and Molecular Structure II. Infrared and Raman Spectra of Polyatomic Molecules," D. Van Nostrand, Princeton, New Jersey, 1945.
26. Kubo, R. in "Lectures in Theoretical Physics Vol. 1," Brittin, W. F., and Dunham, L. G., eds., Interscience Publishers, New York, 1959.
27. Kadanoff, L. P., and Martin, P. C., Ann. Phys. (1963) 24, 419.
28. Felderhof, B. U., and Oppenheim, I., Physica (1965) 31, 1441.
29. Gordon, R. G., Advan. Magn. Resonance (1968) 3, 1.
30. Berne, B. J., in "Physical Chemistry, An Advanced Treatise, Vol. VIIIB, Liquid State," Henderson, D., ed., Academic Press, New York, 1971.
31. Kampen, N. G. van, Physica Norvegica (1971) 5, 10.
32. Lemberg, H. L., and Stillinger, F. H., J. Chem. Phys. (1975) 62, 1677.
33. Andrews, D. H., and Murray, J. W., J. Chem. Phys. (1934) 2, 634.
34. Warshel, A., University of Southern California, Los Angeles, California, private communication.
35. Warshel, A., and Levitt, M., J. Mol. Biol. (1976) 103, 227.
36. Warshel, A., Nature (1976) 260, 679.
37. Karplus, M., and Rossky, P. J., "Abstracts of Papers," Chemical Institute of Canada and American Chemical Society, Montreal, 1977, phys. 66.
38. Levitt, M., MRC Laboratory of Molecular Biology, Cambridge, U. K., private communication.
39. Sutherland, I. E., California Institute of Technology, Pasadena, California, private communication.
40. Datamation (March, 1977) 23, 150.
41. O'Leary, G., Floating Point Systems, Portland, Oregon, private communication.
42. Henry, N. F. M., and Lonsdale, K., eds., "International Tables for X-Ray Crystallography Vol. 1," Kynoch Press, Birmingham, England, 1952.
43. Cundy, H. M., and Rollett, A. P., "Mathematical Models," Oxford University Press, London, 1961.
44. Gerratt, J., and Mills, I. M., J. Chem. Phys. (1968) 49, 1719.
45. Pulay, P., Molec. Phys. (1969) 17, 197.
46. Pulay, P., and Török, F., Molec. Phys. (1973) 25, 1153.
13 Theoretical Chemistry via Minicomputer
*
PETER K. PEARSON, ROBERT R. LUCCHESE, WILLIAM H. MILLER, and HENRY F. SCHAEFER III **
***
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch013
Department of Chemistry, University of California, Berkeley, CA 94720
Certainly one of the most important and far-reaching developments in chemistry over the past decade has been the emergence of theory as a predictive tool of semi-quantitative reliability. This statement is in no way meant to detract from the pre-1960 theoretical chemistry that provided, through the work of men such as Linus Pauling, Robert Mulliken, and Henry Eyring, the modern foundations of valence theory and chemical kinetics. Contemporary theoretical research is obviously built upon the achievements of these pioneers. However the distinguishing feature of modern theoretical chemistry is the ability not only to correlate existing experimental data (and make rough qualitative predictions), but also to provide an a priori description of chemical phenomena that allows precise predictions to be tested by experiment. The most striking example of this new age of theory is the understanding that the single-configuration self-consistent-field (SCF) approximation for electronic wave functions provides equilibrium geometries in very close agreement with available experimental data (1). If one defines chemistry as the union of structure, energetics, and dynamics on the molecular level, then it seems fair to say that theory has a firm grasp on at least one third of this branch of science. Furthermore, since SCF theory may now be applied fairly routinely (2) to systems as large as TCNQ-TTF (Figure 1) the range of applicability is clearly rather broad.
A second major insight gleaned over the past decade is the realization that the detailed dynamics of chemical reactions are well described by ordinary classical mechanics, i.e., by classical trajectory studies (3). Although most theoretical studies to date have dealt with the canonical A + BC -> AB + C reaction (for which the most detailed experimental data is available) (4), systems as large as the methyl isocyanide reaction CH3NC -> CH3CN are readily accessible (5). In fact it is reasonable to assume that much of the future research in this area will be directed toward a theoretical understanding of model organic reactions.

Figure 1

The link between the above two branches of theory is clear: electronic structure theory has as a principal aim the elucidation of the potential energy surface(s); while the theory of dynamics or collision processes begins with the same potential energy surface(s). The present research project had its genesis in collaborative studies between WHM (dynamics) and HFS (electronic structure).

Here we have assembled a "final" (only in the sense of a rapidly approaching deadline) report on our use of a minicomputer for research in modern theoretical chemistry. At the outset we should state that we have already written many words on this subject, and repetition of these would not appear to serve a purpose. A modified version of the original proposal has been published in Computers and Chemistry. That proposal goes into the justification and economic motivation for this pilot project. Secondly, Appendix I contains four interim reports describing in detail our experiences with the new machine. We strongly encourage the reader to go over these documents carefully. Finally we note that the proposal for a National Resource for Computation in Chemistry (NRCC) has brought squarely to the attention of the chemical community the need for improved computational facilities. We therefore also urge the reader to give serious consideration to the reports of the Wiberg (6) and Bigeleisen (7) committees.
The Economic Argument

The minicomputer chosen was the Datacraft 6024/4, which was fully assembled at Berkeley on March 13, 1974. Thus our experience spans a period of roughly three years. Although the same machine is still in production (ours is machine #3 of about 200 produced to date), several company changes have occurred and our minicomputer is now called the Harris Corporation Slash Four. The cost of the machine was essentially $130,000, including California state sales tax. No overhead on the purchase price was required. Assuming amortization over a four year period, this amounts to $2708 per month. The other large cost is that of maintaining the service contract, currently $1715/month ($1280 to the Harris Corporation and $435 to the UC Berkeley overhead). On this basis the total cost is $4423/month or $7.30 per hour if we assume 20 hours of usage per day, as shown to be realistic in Appendix I. As noted by one of the reviewers, this cost might be further reduced in a chemistry department where there is already a technical staff member with extensive digital hardware expertise. Of course the insurance aspects of the maintenance contract would be lost in this case.

Extensive timing comparisons (Appendix I) have shown the minicomputer to be 25-30 times slower than the Control Data Corporation (CDC) 7600. Thus the minicomputer generates the equivalent of 1 hour of 7600 central processor (cpu) time per $200. For comparison, we cite the charge structure of the Lawrence Berkeley Laboratory (LBL) CDC 7600. This machine is generally available to NSF grantees and offers 7600 machine time at prices roughly five times less expensive than commercial rates. Nevertheless the LBL rates range from roughly $350 to $900 per hour of cpu time. The former figure refers to weekend deferred priority time.
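The cost figures quoted above can be checked with a few lines of arithmetic. The dollar amounts, the four year amortization, the 20 hours of use per day and the 25-30 times speed ratio are taken from the text; the average month length used to convert a monthly cost to an hourly one is an assumption, so the hourly figure comes out close to, rather than exactly at, the quoted $7.30.

```python
purchase_price = 130_000.0        # dollars, from the text
amortization_months = 4 * 12      # four year amortization, from the text
service_contract = 1715.0         # dollars per month, from the text
hours_per_day = 20.0              # usage assumed in the text
days_per_month = 365.25 / 12      # assumption: average calendar month

amortization = purchase_price / amortization_months
monthly_total = amortization + service_contract
hourly_cost = monthly_total / (hours_per_day * days_per_month)

# Cost of the minicomputer time that matches one CDC 7600 cpu hour,
# given that the minicomputer is 25-30 times slower.
equiv_low = 25 * hourly_cost
equiv_high = 30 * hourly_cost

# LBL CDC 7600 rates quoted in the text, dollars per cpu hour.
lbl_low, lbl_high = 350.0, 900.0

print(f"amortization per month: ${amortization:.0f}")
print(f"total per month:        ${monthly_total:.0f}")
print(f"cost per hour:          ${hourly_cost:.2f}")
print(f"per 7600-equivalent hour: ${equiv_low:.0f} to ${equiv_high:.0f}")
print(f"LBL rate / $200: {lbl_low / 200:.2f} to {lbl_high / 200:.2f}")
```

The quoted $200 per 7600-equivalent hour falls inside the range computed from the 25-30 times speed ratio.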
On t h i s b a s i s , then, one concludes that the m i n i computer i s *\> 2-4 times more economical than the 7600. However, as we d i s c u s s i n d e t a i l i n the o r i g i n a l proposal and i n Appendix I , the above f i g u r e s i n c l u d e input-output charges ( e s p e c i a l l y d i s k accesses) f o r the H a r r i s machine, but these are a d d i t i o n a l charges (often r a t h e r severe) on the CDC 7600. Thus as i s seen i n Appendix I , the cost e f f e c t i v e n e s s of the m i n i computer sometimes exceeds that o f the 7600 by a f a c t o r of s i x or seven. In a l l f a i r n e s s , the minicomputer does not provide the q u a l i t y of s e r v i c e of the LBL CDC 7600, a smoothly f u n c t i o n i n g p r o f e s s i o n a l l y operated computer c e n t e r . Much of the savings made i s simply a consequence o f the f a c t that our o p e r a t i o n involves no paid employees other than graduate students and postdoctorals. Research Accomplishments The u l t i m a t e t e s t of the present proposal i s undoubtedly whether the chemistry research completed j u s t i f i e s the NSF funds expended. Since t h i s document i s intended f o r p e r u s a l by academic and i n d u s t r i a l research chemists, we leave t h i s judgment to you. A v a i l a b l e upon request i s a l i s t of seventy p u b l i c a t i o n s based on
research carried out using the Harris Slash Four minicomputer. In several cases the research was carried out in collaboration with theorists from other institutions. When such studies made use of machines in addition to the minicomputer, this is indicated by an asterisk. Papers in the course of publication will be provided on request.

Not wishing to be entirely impartial, we add the opinion that the minicomputer has allowed us to make a number of important contributions both to theory and to chemistry. With this machine, our choice of problems has been primarily based on chemical intuition and scientific inclination, rather than the pressing economic circumstances many theoretical chemists regrettably face.
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch013
Developmental Work in Progress

As mentioned in the introduction, we are just now beginning to take full advantage of the Harris machine. Bruce Garrett, a student of Professor Miller's, is continuing work on the development of a quantum mechanical transition state theory. Cliff Dykstra has developed (8) and is continuing to work on a Theory of Self Consistent Electron Pairs (TSCEP), a fundamentally new approach to the correlation problem (9). Also in Professor Schaefer's group, Robert Lucchese, Jim Meadows, Bill Swope, and Bernie Brooks are working together to develop a new system of programs for large scale configuration interaction (CI) studies of electron correlation in molecules. The latter programs are described in some detail elsewhere (10).

Thus, although this report is officially labeled "final", there is much work yet to be done in the development of new theoretical methods and computational techniques. It is in such cases, where original programs have been written specifically for the minicomputer, that its advantages become most clearly apparent. In this regard it is noteworthy that most students who have taken the time (perhaps one month) to familiarize themselves with the mini actually prefer it to the CDC 7600.

Qualms

A balanced view requires us to admit that all is not sweetness and light. We have already noted that there is no convenient computer center staff to operate the machine. When problems occur we not only must call the customer engineer, but also point him rather carefully in the direction of the problem.
As one of the reviewers has pointed out, this is at least in part a result of the fact that the support services of the Harris Corporation are substantially less than those of IBM or CDC. An absolute necessity is the presence of one very bright, knowledgeable, and responsible computer expert in the group. The Lord has blessed us with two such individuals, Dr. Peter Pearson (who went on to
greater things in September of 1974) and more recently Mr. Robert Lucchese. This sort of individual is required to make system changes and updates, determine whether the machine is really sick or just out of shape, and show the customer engineer exactly which machine instruction is failing when a definite problem is located.

Also, debugging a large program is much more difficult than on the CDC 7600. Programmers always blame most of their mistakes on the computer and this can be especially true when a mini is involved. Occasionally one finds a student who is simply unwilling to go through the exhaustive checking that is necessary to debug a large scale program on a machine such as the Harris Slash Four.

Successful utilization of the machine requires the physical presence of one student at any given time. For some individuals the idea of spending the night with a computer is not a pleasant one. We have found that the only satisfactory solution to this problem is to have a sufficient number of students (at least 10) using the machine that they simply cannot afford to risk the possibility of being absent in the event of a machine halt.

Two additional weaknesses of the mini relative to a large machine such as the 7600 are (a) the smaller memory and (b) the large amounts of elapsed time required to complete a given job. The former limitation restricts us, for example, to using about 80 contracted gaussian functions in electronic structure calculations. Although Cliff Dykstra has developed a method of increasing this limit to 120 contracted functions, such a computation might run into trouble on the second point.
That is, about 24 hours is the practical limit for a single job. In general, the other users become quite hostile if a job requires even this long. In addition, 24 hours is about the mean time interval between machine failures if the machine is running a single job. It should be noted that this time restriction (to about 1 hour of 7600 time per job) would be a serious barrier in accomplishing some of the goals set out for the NRCC (6, 7).

Interfacing with Experiments

A question we are frequently asked is "Could you handle three or four on-line experiments at the same time?" The answer to this question, at least for the Harris Slash Four, is an unequivocal no. The cost effectiveness of machines such as ours is in part a result of its somewhat restricted capabilities. If one wants the flexibility of an IBM 370 system, tied in to 43 teletypes, one should probably be willing to pay ten times more to carry out a particular task in computational chemistry. Our system is ideally suited to batch operations, where only one job runs at a time. In fact if a particular job is long and not restartable (many of our programs are now restartable) it is better not even to read in another job during execution. Thus the possibility of on-line experiments is definitely slim.
However, there are all kinds of experimental chemists who rely on computers for number crunching jobs designed to aid in the analysis of their data. Such jobs are well suited to a machine such as the Slash Four and could very well provide a major part of the justification for a proposal to the NSF.

In light of several reviewers' comments, we feel compelled to note that the newer Harris machines (especially the Slash Seven) now have virtual memory, which allows genuine time-sharing. Having observed the Slash Seven at the International Engineering Company in San Francisco we must conclude that the simultaneous processing of three or four users is now a reality on the Slash Seven. Although virtual memory is an additional expense (perhaps $20,000) it would certainly be worthwhile in situations where on-line data acquisition is a primary task.

Environmental Impact

Until quite recently, the primary medium for the dissemination of the results of this minicomputer experiment has been personal contact. After the original proposal was submitted, copies were mailed to ~25 prominent theoretical chemists. The interim reports have been distributed on request, of which we have had ~50 from research chemists. Another ~50 visitors, including an NSF review team, have toured the Berkeley facility. A slightly modified version of the original proposal was published (Volume 1, pages 85-90) in the new journal Computers and Chemistry. Professor Schaefer presented an invited paper "Are Minicomputers Suitable for Large Scale Scientific Computation" in September 1975 at the Eleventh Annual IEEE Computer Society Conference in Washington, D.C.
The same lecture was given earlier at the IBM Research Laboratory, San Jose. The trade journal Computerworld published a popular description of the experiment in its March 8, 1976 issue. A number of recent papers have mentioned the Berkeley mini experiments. Most recent and perhaps the most interesting is that of Isaiah Shavitt (11), entitled "Computers and Quantum Chemistry." Finally, the American Chemical Society's Division of Computers in Chemistry, under the direction of Professor Peter Lykos, has organized the present symposium (June, 1977 in Montreal) on "Minicomputers and Large Scale Computations".

Several research groups (perhaps 20) have expressed serious interest in acquiring their own minicomputer for purposes comparable to our own. However, to our knowledge the only group to actually do so is that of the late Professor Don L. Bunker of the University of California at Irvine. Although the Hewlett-Packard machine purchased by Professor Bunker with NSF support was much less expensive (and proportionally slower) than the Harris Slash Four, he found it to be adequate for his research in dynamics and a vast improvement over his former dependence on an incompetent campus computer center.
Since the draft version of this final report was prepared, two groups of theoretical chemists have ordered Harris Slash Sevens. These are the groups headed by Professor Phillip Certain at the University of Wisconsin and by Drs. John Tully and Frank Stillinger at Bell Telephone Laboratories. These and other implications of our research have been noted in recent semi-popular reviews in Science (12) and Nature (13).
The Future

The controversy concerning the relative merits of minicomputers and large machines is likely to continue for some time. At present both the Slash Four and CDC 7600 appear to be relatively economical alternatives. The real losers in such comparisons are the machines between these two extremes (11). For example, several universities and research institutes (e.g. the University of California, the University of Washington, Colorado State University, and Battelle, Columbus) are currently using the CDC 6400. Although the 6400 is only about 1.5 times faster than the Harris Slash Four, the cost of using this machine can be as high (at Berkeley) as $420/hour. This is clearly an absurd state of affairs, and we would encourage the abused supporters of such machines to consider their alternatives.

Since our original proposal, several developments have occurred in the minicomputer area. At that time the Harris Slash Four was by far the fastest machine available in our price range. Since then at least four machines of nearly comparable speed have appeared: the Data General Eclipse, the Varian V75, the System Engineering Laboratories (SEL) 32/55, and the Interdata 8/32. We have been especially interested in the SEL 32 since it is a true 32 bit machine and might be significantly faster than the Harris Slash Four if a powerful 64 bit floating point processor were available. In fact, such a fast floating point processor appears to be a real possibility for SEL in the near future. In addition the new Harris Slash Seven is about 30% faster than our Slash Four machine.
Another encouraging development is the fact that memory prices have now come down by nearly a factor of two relative to our purchase price for the 64K of Datacraft 24 bit core memory. Thus it seems quite reasonable that future mini purchasers will not be required to restrict themselves to small memory machines.

Certainly the most spectacular technological achievement of the last three years is the introduction by Floating Point Systems (Portland, Oregon) of their high speed array processor. At a cost of ~$40,000, this device is able to carry out 38 bit floating point operations at essentially the speed of the 7600. Professor Kent Wilson of UC San Diego has already purchased the FPS array processor for use in simulating the classical dynamics of biological systems (14). We have studied this device carefully and while very enthusiastic about it, have some reservations. First the 38 bit
word, corresponding to 8 plus significant figures, is not quite adequate for our type of theoretical computations. As we have emphasized on many occasions, the 48 bit word of the Harris machine is ideal for our purposes. Secondly, interfacing the FPS device to a standard mini is going to be quite a challenge, and hand coding must be done whenever the array processor is to be used. Since the array processor is so much faster than the host mini, the FPS must be used very judiciously to avoid its degradation.

In short we do not feel that the FPS array processor is suitable at present for general large scale computations. The use of such a specialized device would tend to "freeze" one into a particular theoretical approach, with future options severely limited. However, the mere fact that FPS can manufacture a device of this speed for only $40,000 is certainly a remarkable achievement. We look forward to the further development of this concept.

Finally it must be noted that a very important development has also occurred in the large scale machine area. This is the introduction of the CRAY machine, which is at least a factor of five faster than the 7600 and will be sold at essentially the same price (~$10 million). At present CDC has legally succeeded in stalling the official delivery of the first CRAY, but this should not be allowed to continue indefinitely. Our personal opinion is that by the time the CRAY machine becomes commercially available, both Harris and SEL will have introduced machines about five times the speed of the Harris Slash Four.
Thus it seems likely that the present relative economic comparisons will be valid for perhaps another five years.

After completion of our draft report, we learned of the introduction of the PDP 11T55 machine by the Digital Equipment Corporation. Although timing and pricing information is still incomplete, this new DEC mini claims to exceed the speed of the Harris Slash Four. We are skeptical that a corporation as large and "respectable" as DEC will be competitive with Harris or SEL, but this announcement is certainly welcome. At the very least it will force Harris and SEL to accelerate the development and release of their new faster machines.

Recommendations

The greatest challenge presently before the NSF (and ERDA) with respect to computation in chemistry is the above mentioned NRCC. We strongly recommend that these bodies agree as quickly as possible on a procedure for implementing the NRCC (hopefully for Fiscal 1978). One conclusion drawn from our investigations is that the ultimate goal of the NRCC should not be the acquisition of its own 7600, but rather of the much more powerful and economical CRAY machine. Although initial implementation will probably involve some fraction of a 7600, the CRAY alternative should be kept in the forefront of consideration.
At the same time the NSF should continue to carefully monitor new developments in the minicomputer area. A reasonable procedure would involve the funding of two such minis per year for the next five years.

Our Harris Slash Four has remained fortuitously current during the four plus years since the submission of our proposal. However, as discussed in the previous section, the winds of change are now beginning to blow. Personally we intend to submit a new proposal to NSF as soon as a reliable manufacturer meets the following specification: for less than $200,000 complete (including California sales tax) a machine four times the speed of the Slash Four. Our current opinion is that innovations of a less comprehensive nature are not worth the tribulations (see Appendix) inherent in breaking in a new machine. Since the Harris Slash Four will still be a very desirable machine (especially with its resident programs, including POLYATOM, GAUSSIAN 70, SCEP, and BERKELEY), we would leave it to the discretion of the NSF to find a suitable new owner.

Appendix I

Interim Reports on the Berkeley Minicomputer Project.

Quarterly Report No. 1, December 14, 1973
Notice was received on June 15, 1973 that the proposal "Large Scale Scientific Computation via Minicomputer" had been funded to the extent of $129,600 by the National Science Foundation. At this point final negotiations with the Datacraft Corporation were entered into. The University of California was represented by Mr. R. J. Brilliant of the Purchasing Office, while Datacraft was represented by Mr. Don Faltings, of their Walnut Creek office. A final agreement was reached on October 5, 1973. The primary change relative to the proposed system was the substitution of a 56,000,000 byte disk for the original 28,000,000 byte disk.

In a parallel development, we received a letter on June 28, 1973 from Professor D. R. Willis, Assistant to the Chancellor-Computing. On behalf of the Campus Advisory Committee on Computing, Professor Willis requested that we advise him on how progress reports could best be made, on a regular and continuing basis. On August 30, 1973, we agreed to file quarterly reports, one or two typewritten pages long, to the Berkeley Campus Computing Committee. These quarterly reports will also be sent to Dr. W. H. Cramer, Program Director for Quantum Chemistry, National Science Foundation.

The majority of the 6024/4 system was delivered at Berkeley on November 14, 1973. As discussed with Datacraft, the scientific arithmetic unit (floating point hardware) and 56 megabyte disk did not appear. These items are scheduled to be delivered in early January, 1974. In the meantime, a temporary 11 megabyte disk was supplied by Datacraft.
The Datacraft engineer, Mr. Mike Crumbliss, arrived in Berkeley on November 19 and proceeded to connect the system. Several early problems were cleared up during the first week. For example, an inability to plug in the final 8,000 words of memory was traced to a misadjustment in the power supply.

Within the first week the machine was able to diagonalize a 50 × 50 matrix in single precision (six significant figures). This calculation was done using the benchmark program HDIAG discussed in our NSF proposal. However, the machine was unable to properly diagonalize the same 50 × 50 matrix in double precision. This error, which as of today still occurs, was traced back to trouble in the square root routine, which in turn fails due to an error in the floating point divide operation. The specific problem is that the quantity (1.0 - 2^(…))/1.0 is computed to give 1.0 - 2^-38 - 2^(…). The Datacraft engineers are working on this problem now and indicate that it should be resolved shortly.

Despite the peculiar divide problem outlined above, Professor Miller's classical trajectory programs appear to execute properly in both single and double precision. The complex-valued trajectories run only in single precision, since the floating point hardware is required for double precision complex operations.

The first electronic structure program we are attempting to set up is HETINT, Professor Schaefer's diatomic molecular integrals program. The program has been rearranged to fit in memory without overlaying, but does not yet execute properly due to the divide error discussed above.
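A fault of the kind described above can be screened for with a very small identity test: dividing any operand by 1.0 must return it unchanged. The sketch below is written in Python purely for illustration (the original benchmark, HDIAG, was not written this way). The function name is hypothetical, and the operands 1.0 - 2^-k are assumed test values chosen to sit just below 1.0; they are not taken from the report.

```python
import math


def divide_identity_failures(exponents=range(1, 50)):
    """Return every k for which (1 - 2**-k) / 1.0 differs from 1 - 2**-k.

    On correctly working floating point hardware the returned list is empty.
    """
    failures = []
    for k in exponents:
        # 1 - 2**-k is exactly representable in a Python float for these k.
        x = 1.0 - math.ldexp(1.0, -k)
        if x / 1.0 != x:
            failures.append(k)
    return failures


if __name__ == "__main__":
    bad = divide_identity_failures()
    print("divide identity holds" if not bad else f"divide fails for k = {bad}")
```

A check like this isolates the divide operation from the square root and diagonalization routines built on top of it, which is the direction in which the error was traced.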
In general we have found the double precision software to execute 150-200 times slower than the CDC 7600. This is about as expected, and a factor of 3-4 from the floating point hardware will put us in the speed range discussed in the proposal.

In our research groups, the individual most knowledgeable about computers and computing is Mr. Peter K. Pearson, and he has taken over responsibility for the care of the system and dissemination of necessary information to the other research students. At least four other students have a good grasp of the system. In that most of us know a great deal more about computers than we did one month ago, it appears that our minicomputer experiment has had considerable educational value already.

On March 29, 1974 our installation will be visited by a special National Science Foundation committee, tentatively composed of Drs. W. H. Cramer (NSF), O. W. Adams (NSF), J. C. Browne (University of Texas), and P. G. Lykos (Illinois Institute of Technology).
Quarterly Report No. 2, March 27, 1974

Our first quarterly report documented the arrival of most of the Datacraft 6024/4 system, described in our NSF proposal. This proposal has now been modified slightly so as to be suitable for
publication, and will appear in the new journal "Computers and Chemistry".

At the time of our first report, the 6024/4 had been unable to successfully complete our 50 × 50 matrix diagonalization benchmark in double precision, due to an error in the floating point divide subroutine. Shortly thereafter this error was further traced by Peter Pearson to a machine instruction, the AMD instruction, Add Memory Double. We should point out here that the Datacraft engineers (or those of any other data processing manufacturer) can usually solve a problem only after it has been traced to a specific machine instruction failure. In the present case, the AMD instruction did function properly when one of the central processor byte slice boards was put on an extender board. This being the case, the error was eliminated by positioning a piece of copper foil between the two offending cpu byte slice boards. The matrix diagonalization then executed properly at a speed 166 times slower than the CDC 7600. With the floating point hardware, however, we expect (see original proposal) the benchmark to execute at a speed 49 times slower than the 7600.

With the AMD instruction corrected, we return to the problem of implementing HETINT, Professor Schaefer's diatomic molecular integrals program. Although the program did execute, incorrect answers were obtained. Peter Pearson eventually traced this difficulty to improper treatment of exponents by the system's arithmetic routines in underflow cases. In fact, he had to modify the floating point subroutines for double precision add, subtract, and multiply.
This was a particularly difficult job, since at that time we did not have the source program listings for the software floating point subroutines. With these corrections made, HETINT executed properly on December 19, 1973. This program executes at a speed roughly 105 times slower than the CDC 7600.

The next major program to be implemented was the Ohio State-Cal Tech-Berkeley version of POLYATOM, a general molecular program for the computation of multiconfiguration SCF wave functions (the original version of POLYATOM was developed by Jules Moskowitz and co-workers at NYU). To this end, Dean Liskow began an intensive effort on the first of the year. One of the most serious difficulties was the setup of the overlay structure, consisting of three levels with seven segments. Success was achieved on January 11 when a proper self-consistent-field wave function for the water molecule was obtained. Comparison with the 7600 results showed an accuracy of between 9 and 10 significant figures for the total energy.

Toward the end of January, we began to run POLYATOM on a production basis. One of the first problems tackled was the possible existence of two isomers of the NO2- ion. A (9s 5p/5s 3p) gaussian basis was centered on each atom, and the three geometrical parameters optimized for the nonsymmetric form. A complete calculation at a single geometry required between four and six hours of elapsed time. This is about a factor of 85
times slower than the CDC 7600. During the same period Gretchen Schwenzer used the 6024/4 for a thorough preliminary study of H2S and the two hypothetical hypervalent molecules SH4 and SH6. Similar 7600 timing comparisons were obtained.

Due to the floating point software's inability to perform complex operations in double precision (11 significant figures), we have thus far been unable to implement Professor Miller's semiclassical programs involving complex-valued trajectories (i.e., generalized tunneling). Several real-valued trajectory programs (rotational excitation of He + H2 and trajectory "surface-hopping" calculations for O(1D) + N2 → O(3P) + N2) initially ran successfully but numerical irreproducibilities began occurring. This was a source of much frustration, and was perhaps due to crosstalk between several additional byte slice boards. To correct this problem several additional sheets of copper foil were positioned in the central processor one week ago.

The large disk (56 megabyte) and scientific arithmetic unit (SAU) arrived at Berkeley on March 13, 1974. This was two months after the promised delivery date and a source of considerable frustration. It is important to note here that none of the time comparisons made heretofore utilized the SAU (floating point hardware). The individual hardware floating point add, subtract, multiply, and divide instructions execute at speeds 6-14 times faster than the software subroutines.
Realistically, however, we expect the SAU to result in an overall increase in speed of a factor of 2 to 3. This would put us within our original estimate of being a factor of 64 slower than the 7600.

As of the time of writing of this report, neither the large disk nor the SAU is yet fully operational. The Datacraft engineers are hopeful, however, that the complete system will be functional within a week.
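The gap between the 6-14 times speedup of individual instructions and the expected overall factor of 2 to 3 is what one would estimate if only part of the running time is spent in floating point arithmetic. The sketch below uses the standard Amdahl's-law estimate, which is not invoked in the original report and is brought in here only as an illustration. The function names are hypothetical, and the fractions it produces are back-of-envelope values, not measurements.

```python
def overall_speedup(fp_fraction, fp_speedup):
    """Overall speedup when a fraction of run time is accelerated by fp_speedup."""
    return 1.0 / ((1.0 - fp_fraction) + fp_fraction / fp_speedup)


def implied_fp_fraction(overall, fp_speedup):
    """Fraction of run time that must be floating point to reach `overall`."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / fp_speedup)


if __name__ == "__main__":
    # Instruction-level speedups (6 and 14) and overall factors (2 and 3)
    # are the ranges quoted in the text.
    for overall in (2.0, 3.0):
        for fp_speedup in (6.0, 14.0):
            frac = implied_fp_fraction(overall, fp_speedup)
            print(f"overall {overall:.0f}x, instruction {fp_speedup:.0f}x: "
                  f"floating point fraction {frac:.2f}")
```

Under this simple model, an overall factor of 2 to 3 corresponds to programs that spend very roughly one half to four fifths of their software-arithmetic run time in floating point operations.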
Quarterly Report No. 3, June 17, 1974
A well-respected book describing the first twelve months of infancy makes a statement to the effect that the third month of your child's life makes the first two seem bearable in retrospect. In a remarkably analogous manner, the frustrations of the first two quarters with our Datacraft 6024/4 minicomputer were more than compensated by the successes of the third quarter, just completed.

Our second quarterly report left off with the machine inoperative due to the recent arrival of the 56 megabyte disk and floating point processor [referred to by Datacraft as the scientific arithmetic unit (SAU)]. Since the NSF visitation committee (Drs. W. H. Cramer, O. W. Adams, J. C. Browne and P. G. Lykos) was to arrive on March 29, one might describe the situation on March 27 as being on the verge of panic. From Fort Lauderdale Datacraft flew out Mr. Russell Patton, director of field service. Working through the night, he and Mr. Ron Platz
installed a separate power supply for the SAU and located and corrected a problem with the disk automatic block controller. With these modifications implemented, Peter Pearson was able to execute the 50 x 50 matrix diagonalization benchmark program. In the last quarterly report, we noted that, without the floating point hardware (SAU), this program executes at a speed 166 times slower than the CDC 7600. With the SAU and the standard Datacraft 6024 compiler a ratio of 59 was found. Using the "optimizing" compiler (actually still a rather crude compiler), the diagonalization executed at a speed 43 times slower than the 7600. This result is consistent with the factor of 49 predicted in the original proposal, a modified version of which has now been accepted for publication in the new journal Computers and Chemistry.

The NSF visitation provided the framework for a thorough discussion of the machine's progress through March 29. Bill Cramer and Bill Adams stressed the importance of keeping an accurate record of machine utilization, a key factor in the economic analysis central to this experiment. Peter Lykos suggested we calibrate the 6024/4 using the MFLOPS (measure of floating point operations per second) benchmark. A copy of MFLOPS has been obtained and an analysis will be presented in the next quarterly report. Jim Browne gave us many useful insights from his experience at the University of Texas as both chemist and computer scientist. Don Faltings of Datacraft was on hand to answer a number of questions from the committee and briefly discuss some new features (including virtual memory) of the Datacraft line.
Finally, it was agreed that a second visitation would be advisable, after the machine is fully operational and its characteristics thoroughly documented.

Steady progress was made during the first 10 days of April. That is, a number of programs were modified to run on the complete system, including SAU and large disk. However, several nagging problems persisted, one being that 5.4 volts, 0.4 above the recommended level, were required to sustain the central processor. When this minimum functioning voltage increased to 5.5 volts, Datacraft advised us to turn the machine off. After a week of investigation, this surprisingly subtle problem was located and quickly eliminated on April 22 by replacement of an integrated circuit on the memory timing and control board. As it turned out, this small machine defect had been responsible for many of the problems encountered during the first five months of operation. Not only did the machine run properly at 5.0 volts, but it was also possible to remove the pieces of copper foil previously necessary (see Quarterly Reports 1 and 2) to shield the different boards from each other.

Since April 22, the 6024/4 has been running quite smoothly. Some occasional parity errors were put to rest by replacement of a memory board chip on May 16. A current problem with the add memory to double (AMD) instruction has been temporarily relieved
by a sheet of copper foil between byte slice boards 2 and 3. However, these are minor problems, and overall we have been very pleased with the operation of the machine during this quarter.

With technical problems pushed into the background, we were able to turn to the central goal of the experiment, the evaluation of the performance of the 6024/4 relative to the CDC 7600. For this purpose we report the results of two direct comparisons, one involving electronic structure theory and the other molecular collision theory. It is to be emphasized that the programs used are by no means optimally efficient. However, of primary interest here are the relative speeds of the two machines, and for this purpose our comparisons should be valid.

The first test case arose in Dean Liskow's study of the chemisorption of hydrogen by clusters of beryllium atoms. For the Be5H system, a double zeta basis set was adopted: Be(4s 2p), H(2s 1p). The modified POLYATOM program was used to compute self-consistent-field wave functions for this open shell doublet. The results are summarized below:

                                                 Times (seconds)
                                                 6024/4     7600     Ratio
Generate list of unique nonzero integrals          3091      174      17.8
Compute unique integrals (total of 476,000)        2506      119      21.5
SCF (time per iteration)                           1548       36      43.5

This comparison indicates that the SCF iterations show the 6024/4 in the worst light. We intend to correct this weakness by recoding this section of POLYATOM in machine language. However, all our direct comparisons with the 7600 must of necessity employ the same FORTRAN programs. The complete calculation, including 17 SCF iterations, required 0.25 cpu hours on the 7600 and cost $243. The identical calculation required a total of 8.86 hours of 6024/4 time, or an overall factor of 35 longer than the 7600.

The second test case arose from George Zahr's study of the quenching of O(1D) by N2. Assuming a simple analytical potential energy surface, classical trajectories were performed within the surface-hopping model of Preston and Tully. 330 such trajectories required 480 minutes on the 6024/4 and 18.2 minutes on the 7600. The 7600 cost was $193. The Datacraft machine is seen to be a factor of 26 slower. Note that this computation involves virtually no input/output operations.

Both of the above comparisons show the Datacraft minicomputer to be significantly faster than the factor of 64 predicted in our original proposal. There we concluded that the total monthly cost (including amortization over four years) of the 6024/4 would be $4156. Experience has shown this figure, which we now round to
$4200, to be realistic. Yet to be firmly established is the average number of hours of computing attained per day at our installation. We will discuss this point in detail in our next quarterly report. However, if we take the pessimistic view that only 12 hours of computing per day are achieved, 360 hours per month translates into a cost of $11.67/hour. Thus the Be5H job cited above costs $103, as opposed to $243 for the 7600. The O(1D) + N2 job by the same criterion cost $93, as opposed to $193 for the 7600. Again it is only fair to remark that the cited 7600 costs at the Lawrence Berkeley Laboratory include only operational expenses and completely neglect the initial purchase price of the machine.

Added in proof: A surprisingly simple reordering of the POLYATOM file structure (no changes in the FORTRAN program) by Peter Pearson has resulted in nearly a factor of two increase in the SCF speed cited above.
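The timing and cost arithmetic of this report can be retraced in a few lines. The sketch below uses only figures quoted in the text (step times, 17 SCF iterations, $4200/month, 12 hours/day); the 30-day month is implied by the quoted 360 hours, and the variable and function names are illustrative only.

```python
SCF_ITERATIONS = 17

# Seconds per step for the beryllium cluster job: (6024/4, 7600)
STEP_SECONDS = {
    "generate integral list": (3091, 174),
    "compute integrals": (2506, 119),
    "SCF, per iteration": (1548, 36),
}


def total_hours(machine_index):
    """Total job time in hours: the two one-off steps plus 17 SCF iterations."""
    seconds = (STEP_SECONDS["generate integral list"][machine_index]
               + STEP_SECONDS["compute integrals"][machine_index]
               + SCF_ITERATIONS * STEP_SECONDS["SCF, per iteration"][machine_index])
    return seconds / 3600.0


def hourly_rate(monthly_cost=4200.0, hours_per_day=12.0, days_per_month=30):
    """Cost per hour of minicomputer time under the pessimistic usage assumption."""
    return monthly_cost / (hours_per_day * days_per_month)


if __name__ == "__main__":
    mini_hours, big_hours = total_hours(0), total_hours(1)
    rate = hourly_rate()
    print(f"6024/4: {mini_hours:.2f} h, 7600: {big_hours:.2f} h, "
          f"ratio {mini_hours / big_hours:.0f}")
    print(f"rate ${rate:.2f}/hour")
    print(f"cluster job: ${mini_hours * rate:.0f} on the mini vs $243 on the 7600")
    print(f"trajectory job: ${480 / 60 * rate:.0f} on the mini vs $193 on the 7600")
```

The totals agree with the quoted 8.86 hours, 0.25 hours, and overall factor of 35, and with the quoted job costs of $103 and $93.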
Report No. 4, March 30, 1975
By the time of writing of our last report, it had become clear that the Datacraft 6024/4 minicomputer was meeting or exceeding the goals that had been set for it. The past nine months have served to substantially strengthen that conclusion. A particularly crucial test has been passed in that it is now apparent that relatively little maintenance of the machine is required. Typically, it is necessary to call the computer engineer once or twice per month, and repair "down time" for a typical month is roughly one day. In fact, the service contract is necessary primarily as an insurance policy, since we would otherwise be unprotected against disasters, e.g., if for some mysterious reason the entire memory were burned out.

In this regard it should be noted that the Datacraft Corporation was swallowed up by the Harris Corporation during this period. Thus our minicomputer is now marketed as the Harris Slash Four. The only effect (on us) of this takeover was the increased cost of the service contract, for which Harris proposed a price of $1500/month. This suggestion was particularly distressing to us since (a) it represented a large increase over the $1155/month we had budgeted for and (b) the University of California has during the past year changed its policy and we now pay 34% overhead on the service contract. After some delicate negotiations Harris lowered the service contract price to $1280/month and we accepted it.

Before leaving the subject of maintenance, it should be mentioned that most of our problems requiring service involve the teletype and line printer.
It turns out that neither of these devices was intended for the sort of full time usage they are receiving. Incidentally, the teletype is not covered under the new service contract, but is instead serviced by University of California personnel.
In one respect, the minicomputer has proved less expensive to operate than we had predicted. In the original proposal, $300/month was allocated for "electricity, cards, paper, etc." As it turns out, although we do pay the above-mentioned overhead of $437/month on the service contract, the University pays our electrical bill, and the cost of cards, paper, etc., is less than $50/month. Thus this savings of $250/month partially cancels the high cost of the service contract.

During this period we have from time to time run programs to gather statistics on the utilization of the minicomputer. These data suggest that the machine is busy for about 90% of the time it is not being serviced. Thus the overall (including repair down time) utilization is in excess of 85%, a figure considered quite acceptable for large scale machines. This utilization rate is also remarkably close to the 20 hours/day estimated in our original proposal. It is necessary, however, to point out that such a rate could not be achieved (without a paid operator) without an aggressive and hard working group of eleven active users (students and postdoctorals). Since each user zealously guards his or her approximately 13 hours/week, he/she is quite likely to be on hand should any temporary machine problem interrupt his/her job. Due to machine demand, it should be noted that jobs rarely run longer than 13 hours; and the thought (raised in our original proposal) of jobs running consecutively for one month has been long since abandoned. A typical job now runs for about two hours.
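The utilization figures quoted above can be cross-checked with simple arithmetic. This sketch assumes a 30-day month (not stated in the text) and takes the "roughly one day" of monthly repair time at face value; all names are illustrative.

```python
DAYS_PER_MONTH = 30            # assumption, not stated in the text
DOWN_DAYS_PER_MONTH = 1        # "roughly one day" of repair down time
BUSY_WHEN_AVAILABLE = 0.90     # busy about 90% of the time it is not being serviced
USERS = 11
HOURS_PER_USER_PER_WEEK = 13
HOURS_PER_WEEK = 7 * 24


def overall_utilization():
    """Busy fraction of all hours, counting repair down time as idle."""
    available = (DAYS_PER_MONTH - DOWN_DAYS_PER_MONTH) / DAYS_PER_MONTH
    return BUSY_WHEN_AVAILABLE * available


def allocated_fraction():
    """Share of the week taken up by the per-user allocations alone."""
    return USERS * HOURS_PER_USER_PER_WEEK / HOURS_PER_WEEK


if __name__ == "__main__":
    print(f"overall utilization: {overall_utilization():.1%}")
    print(f"proposal estimate (20 h/day): {20 / 24:.1%}")
    print(f"user allocations: {USERS * HOURS_PER_USER_PER_WEEK} of "
          f"{HOURS_PER_WEEK} h/week = {allocated_fraction():.1%}")
```

Under these assumptions the measured utilization, the 20 hours/day proposal estimate, and the eleven 13-hour weekly allocations all land in the same 83 to 87 percent range.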
Scheduling the computer turned out to be more of a problem than we initially anticipated. It was clear in July, 1974 that the machine had become sufficiently popular that "good will" would not be a sufficient deterrent to squabbles. The system that has now been settled upon involves giving each active user 13 hours of machine time per week. In addition four hours (10 AM - 2 PM) per weekday are available on a first-come, first-served basis for debug jobs, with a time limit of ten minutes. The time is signed up for on Thursday afternoons for the week beginning Saturday. The order of sign-up is a regular one, with the user having first choice one week being demoted to last choice the following week. A final restriction is that the block of time between 2 AM and 8 AM cannot be subdivided. That is, a single user takes the entire block. Although this scheduling system will probably be slightly revised on occasion, it seems to be working reasonably well at present.

Two major program conversion efforts were undertaken since the third report. The first, involving the Gaussian 70 programs of Hehre, Lathan, Ditchfield, Newton, and Pople, is now completed. The second, involving the polyatomic configuration interaction (CI) program of Charles F. Bender, began very recently and has been implemented thus far only in a restricted version. The Gaussian 70 conversion was deemed especially important since it now appears that this program will become significantly more widely distributed than any previous ab initio program for single determinant SCF studies. Thus the times we report with this program may serve as
a basis for comparison with many other types of computers. The major difficulty in the implementation of Gaussian 70 was the relatively complicated (for the 6024/4) overlay structure.

One of the earliest studies undertaken using Gaussian 70 involved the C6H6-Cl2 molecular complex. Using the standard STO-3G basis set (162 primitive gaussians, 54 contracted functions), a complete calculation at one geometry, including 8 SCF iterations, required 64 minutes of 6024/4 elapsed time. Thus it is clear that the study of reasonably complicated organic systems using minimum basis sets is quite feasible with the minicomputer. Using analogous minimum basis sets, computations have been carried out on (CH3O)2PO2 Ca Cl (67 contracted functions; 59 minutes for integrals plus 86 minutes for 13 SCF iterations) and the Be13 cluster (65 contracted functions; 80 minutes for integrals, 250 minutes for 20 SCF iterations).

We continue to investigate a large number of systems using the Berkeley-Cal Tech-Ohio State version of POLYATOM. Advantages of this program are that it yields exact spin eigenfunctions for open-shell systems and can perform limited MCSCF computations. One of the larger systems studied was the NH3-ClF charge transfer complex. A basis set of size Cl(12s 9p 1d/6s 4p 1d), N,F(9s 5p 1d/4s 2p 1d), H(4s/2s) was used, totaling 62 contracted functions. A list of nonzero unique integrals is generated in 40 minutes (this process need be done only once for the entire potential curve), integral computation required 130 minutes, and 11 SCF iterations consumed 50 minutes.
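The basis set sizes quoted for the minimum basis calculations can be reproduced by counting functions per atom. The sketch below assumes the usual STO-3G shell structure (1 function on H, 5 on first-row atoms, 9 on second-row atoms, 13 on Ca, three primitives per function) and assumes the three systems are benzene-chlorine, a dimethyl phosphate unit with Ca and Cl, and a 13-atom beryllium cluster; the compositions and helper names are illustrative, chosen because they match the quoted counts.

```python
# Contracted functions per atom in a minimal STO-3G basis (standard shell counts)
STO3G_FUNCTIONS = {"H": 1, "Be": 5, "C": 5, "O": 5, "P": 9, "Cl": 9, "Ca": 13}
PRIMITIVES_PER_FUNCTION = 3

# Assumed atom counts for the three systems discussed in the text
SYSTEMS = {
    "benzene-chlorine complex": {"C": 6, "H": 6, "Cl": 2},
    "dimethyl phosphate + Ca + Cl": {"C": 2, "H": 6, "O": 4, "P": 1, "Ca": 1, "Cl": 1},
    "Be13 cluster": {"Be": 13},
}


def contracted_functions(atoms):
    """Number of contracted basis functions for a dict of atom counts."""
    return sum(STO3G_FUNCTIONS[element] * count for element, count in atoms.items())


def unique_two_electron_integrals(n_functions):
    """Upper bound on unique two-electron integrals (index symmetry only)."""
    pairs = n_functions * (n_functions + 1) // 2
    return pairs * (pairs + 1) // 2


if __name__ == "__main__":
    for name, atoms in SYSTEMS.items():
        n = contracted_functions(atoms)
        print(f"{name}: {n} functions, {n * PRIMITIVES_PER_FUNCTION} primitives, "
              f"at most {unique_two_electron_integrals(n)} unique integrals")
```

The counts come out to 54, 67, and 65 contracted functions, matching the text, and the integral bound shows why integral generation dominates as the basis grows.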
A study of trimethylene methane which we had earlier found exceedingly difficult to finish on the 7600 (due to cost considerations) has now been completed on the 6024/4. Using a double zeta basis set (120 primitive functions contracted to 52), 74 minutes were required for integral generation. Twenty SCF iterations on the 3A2 ground state (two SCF hamiltonians) devoured 220 minutes of elapsed time.

During the past nine months, a series of production runs was made on the glyoxal molecule (HCO)2 using a standard double zeta basis set. Since a number of runs were also made on the 7600, a comparison of the costs for the entire project is possible. The POLYATOM timing comparisons are seen in Table I.
The ratio of elapsed 6024/4 time to 7600 CPU time is 25.0, a very encouraging figure. Since the cost of machine time on the mini is about $8/hour (including amortization of the purchase price over four years), the total minicomputer cost of the project was less than $3500.

An interesting development has been the increasing use of the mini in an interactive mode. This is especially helpful in SCF calculations. The total energy is printed on the teletype after each SCF iteration and the user has four options: a) continue; b) go to a weighted averaging of orbitals; c) go to an extrapolation scheme; d) stop. Use of this interactive feature can remarkably improve the rate of convergence for certain types of molecular systems.
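Why a weighted averaging step can rescue a poorly converging iteration is easy to see on a toy problem. The sketch below is not an SCF procedure and does not reproduce the authors' averaging scheme, which the text does not specify: it applies simple damping (mixing the new value with the old one) to a hypothetical scalar fixed-point map that oscillates and diverges when iterated directly.

```python
def iterate(update, start, weight=1.0, tolerance=1e-8, max_iterations=50):
    """Fixed-point iteration with optional weighted averaging (damping).

    weight = 1.0 is the plain iteration; 0 < weight < 1 mixes the new
    value with the previous one. Returns (value, iterations, converged).
    """
    value = start
    for iteration in range(1, max_iterations + 1):
        new_value = weight * update(value) + (1.0 - weight) * value
        if abs(new_value - value) < tolerance:
            return new_value, iteration, True
        value = new_value
    return value, max_iterations, False


def toy_update(x):
    """Hypothetical stand-in for one cycle: fixed point at 1.0, slope -1.5,
    so the undamped iteration oscillates with growing amplitude."""
    return 2.5 - 1.5 * x


if __name__ == "__main__":
    for label, weight in [("plain", 1.0), ("averaged", 0.5)]:
        value, count, converged = iterate(toy_update, 0.0, weight=weight)
        status = "converged" if converged else "did not converge"
        print(f"{label}: {status} after {count} iterations, value {value:.6g}")
```

With equal weighting of old and new values the effective slope of the map drops from -1.5 to -0.25, so the averaged iteration settles on the fixed point while the plain one does not; an interactive user watching the energy oscillate would be making the same kind of switch by hand.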
Table I. POLYATOM timing comparisons for glyoxal.

                                                      7600 Job           6024/4 Minutes
                                                CPU-Seconds    Cost       Elapsed Time
Lister                                                35          $8             9
Integrals (cis/trans)                                215         $34            46
Integrals (gauche)                                   240         $61            86
SCF - ground state, per iteration                      6       $2.75           2.5
SCF - excited states, per iteration                   17       $7.70             7

Glyoxal Project:
  3 listers                                          105         $24            27
 70 cis/trans integrals                             8050       $2380          3220
 60 gauche integrals                               14400       $3660          5160
130 SCF - ground state (a)                          7020       $3250          3510
 40 SCF - excited states, vertical (b)             13600       $6160          5800
 60 SCF - excited states, geometry search (c)       7140       $3245          3240

Total                                             50,315     $18,719 (d)    20,957
                                            (14.0 hours)                (349.3 hours)

a) Based on nine SCF iterations for convergence.
b) Based on twenty SCF iterations for convergence.
c) Based on seven SCF iterations for convergence.
d) If run exclusively on weekends and holidays, cost reduced to $9360. If in addition run at deferred priority, cost falls to $7488.
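The project totals in Table I can be checked by summing the rows. The sketch below takes the six project rows as read from the table; the assignment of 3245 to the cost column and 3240 to the minutes column for the geometry search row is inferred from the printed totals, and the $8/hour rate is the figure quoted in the text.

```python
# (description, 7600 CPU seconds, 7600 cost in dollars, 6024/4 elapsed minutes)
PROJECT_ROWS = [
    ("3 listers", 105, 24, 27),
    ("70 cis/trans integrals", 8050, 2380, 3220),
    ("60 gauche integrals", 14400, 3660, 5160),
    ("130 SCF ground state", 7020, 3250, 3510),
    ("40 SCF excited states, vertical", 13600, 6160, 5800),
    ("60 SCF excited states, geometry search", 7140, 3245, 3240),
]
MINI_DOLLARS_PER_HOUR = 8.0   # approximate rate quoted in the text


def totals(rows):
    """Column sums: (CPU seconds, dollars, elapsed minutes)."""
    cpu_seconds = sum(row[1] for row in rows)
    dollars = sum(row[2] for row in rows)
    minutes = sum(row[3] for row in rows)
    return cpu_seconds, dollars, minutes


if __name__ == "__main__":
    cpu_seconds, dollars, minutes = totals(PROJECT_ROWS)
    big_hours = cpu_seconds / 3600.0
    mini_hours = minutes / 60.0
    print(f"7600: {cpu_seconds} s = {big_hours:.1f} h, ${dollars}")
    print(f"6024/4: {minutes} min = {mini_hours:.1f} h")
    print(f"elapsed/CPU ratio: {mini_hours / big_hours:.1f}")
    print(f"6024/4 cost at ${MINI_DOLLARS_PER_HOUR:.0f}/h: "
          f"${mini_hours * MINI_DOLLARS_PER_HOUR:.0f}")
```

The sums reproduce the printed totals (50,315 CPU seconds, $18,719, 20,957 minutes), the quoted ratio of 25.0, and a minicomputer cost comfortably under the quoted $3500.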
Much of WHM's current research involves numerically computed classical trajectories. In "classical S-matrix" calculations, for example, classical trajectories, and the action integral along them, are used to construct quantum mechanical S-matrix elements for specific collision processes. Also, a newly formulated quantum mechanical version of transition state theory, which correctly incorporates non-separability of the transition state, uses trajectories (periodic trajectories in imaginary time) to determine the net rate constant for reaction. Although the calculation of classical trajectories themselves is fairly standard nowadays, these novel kinds of theory usually involve search
procedures, i.e., they require particular classical trajectories rather than a Monte Carlo average over them all. The ability to operate the minicomputer "hands on" has greatly facilitated the application of these new kinds of theoretical models. Also, this type of work requires a great deal of new program debugging, and the 6024/4 has proved quite adequate in this regard, even though the diagnostics are not as comprehensive as those produced by the LBL 7600.

Our final fairly typical timing comparison concerns a three-dimensional phase space integral calculation. To obtain the rate constant for D + H2 at 200°K, 237 classical trajectories (both real and imaginary) were computed. The minicomputer required 60 minutes for this job, while the 7600 used 2.42 minutes of CPU time. Thus the 7600 was a factor of 25 quicker than the 6024/4. The cost of the 7600 job was $20.57. This comparison puts the large machine in a relatively favorable light since there are essentially no 7600 input/output charges associated with trajectory-oriented jobs of this type. In closing we note that this factor of 25 is characteristic of trajectory studies, which involve the numerical integration of ordinary differential equations.

Acknowledgments

We wish to sincerely thank Drs. W. H. Cramer and O. W. Adams of NSF for their support of this project, especially during its early and more controversial stages. We also thank Professors Jim Browne, Edward Hayes, Maurice Schwartz, Don Secrest, Stanley Hagstrom, and Peter Lykos for their thoughtful comments on the draft version of this report.

* Supported by the National Science Foundation, Grant GP-39317.
** Present address: Lawrence Livermore Laboratory, University of California, Livermore, California 94550.
*** Address after September 15, 1977: Arthur Amos Noyes Laboratory of Chemical Physics, California Institute of Technology, Pasadena, California 91125.

Literature Cited
1. Pople, J. A., "Modern Theoretical Chemistry", Vol. IV, ed., H. F. Schaefer, Plenum, New York, 1977.
2. Cavallone, F., and Clementi, E., J. Chem. Phys. (1975), 63, 4304.
3. Miller, W. H., Advances in Chemical Physics (1974), 25, 69.
4. Herschbach, D. R., Faraday Discussion Chem. Soc. (1973), 55, 233.
5. Bunker, D. L., Accounts of Chemical Research (1974), 7, 195.
6. Wiberg, K. B., "A Study of a National Center for Computation in Chemistry", National Academy of Sciences, Washington, D.C., March, 1974.
7. Bigeleisen, J., "The Proposed National Resource for Computation in Chemistry: A User-Oriented Facility", National Academy of Sciences, Washington, D.C., June, 1975.
8. Dykstra, C. E., Schaefer, H. F., and Meyer, W., J. Chem. Phys. (1976), 65, 2740, 5141.
9. Schaefer, H. F., "The Electronic Structure of Atoms and Molecules: A Survey of Quantum Mechanical Results", Addison-Wesley, Reading, Massachusetts, 1972.
10. Lucchese, R. R., Brooks, B. R., Meadows, J. H., Swope, W. C., and Schaefer, H. F., J. Computational Phys., in press.
11. Shavitt, I., paper presented at the Third ICASE Conference on Scientific Computing, Williamsburg, Virginia, April 1-2, 1976.
12. Robinson, A. L., Science (1976), 193, 470.
13. Richards, G., Nature (1977), 266, 5597, 18.
14. Wilson, K. R., "Multiprocessor Molecular Mechanics", in Computer Networking and Chemistry, Peter Lykos, editor (American Chemical Society, Washington, D.C., August, 1975).
14 Large Scale Computations on a Virtual Memory Minicomputer
Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch014
JOSEPH M. NORBECK and PHILLIP R. CERTAIN
Theoretical Chemistry Institute and Department of Chemistry, University of Wisconsin, Madison, WI 53706

In October, 1976, the Chemistry Department at the University of Wisconsin-Madison installed a Harris SLASH 7 computer system. The SLASH 7 is a virtual memory minicomputer and is equipped with 64K of high speed, 24 bit, memory; an 80 Mbyte disc storage module; a 9 track tape drive; and a high speed paper tape punch and reader. Other peripherals include two interactive graphics terminals, a 36" CALCOMP plotter, a 3'x4' data digitizing tablet, remote accessing capability for terminals and other departmental minicomputers and remote job entry (RJE) capability to the campus UNIVAC 1110.

The SLASH 7 is a departmental resource for the faculty, staff and graduate students as an aid in their research. The computational and data processing needs of the department can be grouped into four main categories:
(1) Real time data acquisition
(2) Data reformatting and media conversion
(3) Interactive data processing and simulation
(4) Batch computing, including large scale number crunching.

In this paper we discuss the performance of the Harris computer to date and the role it plays with respect to the categories mentioned above. The information provided is based on less than six months of operation, with much of this time used for program conversion and user education. Consequently we focus most of our attention on category (4) (batch computing and large scale number crunching), since it is in this area that cost analyses and performance criteria with respect to larger machines have been concentrated. It is also the easiest area to assess in a short period of time.

Although we concentrate on the number crunching capabilities of the computer in this paper, we first briefly discuss the role of the computer in the other categories.
There are presently 16 minicomputers in the department associated with either departmental facilities or instruments dedicated to the research groups of individual staff members. These minis are generally adequate only for control of instrumentation and data acquisition, while the
necessary processing of data in the past has been carried out at the university computing center. Nearly all of these minis are equipped with paper tape I/O, with five having magnetic tape units. In the future, we expect that much of the processing of data will be carried out on the SLASH 7. In addition, the SLASH 7 is equipped with a direct memory access device which is capable of providing a direct link between the other departmental minis and the SLASH 7. At the present time, three minis are being hardwired to the SLASH 7 through RS232C interfaces and will be capable of data transfer of up to 9600 baud. Although direct control of experimental instruments by the departmental computer is not contemplated, these direct links to the SLASH 7 will provide fast turn-around for the processing of experimental data.

A large number of instruments in the department produce graphic output. This includes a variety of spectrometers, electrochemical instrumentation, chromatographs and custom devices. Since most do not provide digital output, the Harris computer is equipped with a large data digitizing tablet, which is tied to a high quality plotter through an interactive graphics terminal. These peripherals facilitate the processing of graphic data via curve fitting, integration, and so on.

Turning now to number crunching, after a brief description of the computer hardware and the virtual memory structure, benchmarks and stand-alone run times are reported for several programs currently in use in the department. One of the most important items of information obtained to date is the extent to which "paging" of the virtual memory degrades job through-put. This has been evaluated by investigating the CPU time to Wall Clock (WC) time ratio under different operating conditions. The CPU/WC time ratios are given for jobs alone in the computer and mixed with others.
Stand-alone benchmarks correspond to the optimum conditions for each job and give the most favorable CPU/WC time ratio. This provides a bound with respect to CPU time, which is used in a cost analysis with respect to larger, "hard cash" computer facilities. We discuss this point in more detail later in the paper.

Virtual Memory and Paging

The Harris SLASH 7 Virtual Memory System (VMS) involves both hardware and software to control the transfer of user programs and data in 1K word (K=1024) segments, called "pages", between main memory and an external mass storage device, which in our case is the 80 MB disk. This operation, termed "paging", allows a program's memory area to be noncontiguous and even nonresident, and provides a maximum utilization of available memory. This permits the computer (1) to run programs larger than the physical memory (the SLASH 7 has an 18 bit effective memory address that selects one of 256 pages, thus allowing for a maximum individual program size of 262,144 words) and (2) to "page" to disc a low
priority task (e.g. a long running number cruncher) to provide faster turn-around for shorter, high priority jobs. The paging feature is obviously a great advantage in a multi-user environment. For the individual user, the virtual memory system allows programs to directly address up to 256K words, thus avoiding the necessity of explicit overlaying.

The disadvantages of the virtual memory system are that (1) the operating system occupies approximately 27K of high speed memory at all times, (2) even small programs that do not page incur a paging "overhead", and (3) it is possible to create a "thrashing" situation if the demand for paging becomes greater than a critical value. The mean seek time for a disc read is 30 milliseconds, so that more than about 30 paging operations per second will stall the system.

Our present SLASH 7 has approximately 37 user pages available for programs and data storage. Since many jobs which are executed on our system require significantly more storage area than this, we have paid particular attention to how paging affects program run-times, and to programming techniques that minimize paging.

To give an example of a thrashing situation, we present in Table I the CPU and WC times required to calculate all eigenvalues of various real, symmetric matrices. These programs were executed in double precision on our SLASH 7 with the subroutine GIVENS, distributed by the Quantum Chemistry Program Exchange. Note that as long as the matrix can be stored in 37K (the number of user pages available) the program is CPU bound. For larger matrices (dimension 200x200 and above in Table I) the CPU/WC time ratio drops significantly due to thrashing. The reason thrashing occurs in the present example is that the matrix is stored in upper triangular form by columns, while the program code processes the matrix by rows.
Consequently, depending on the row being processed, it is possible to require that a new page be brought into memory with each new matrix element which is referenced. To eliminate thrashing in the present example, it is necessary to modify the code to process the matrix in the same order in which it is stored. The run times with the modified code are given in Table I by the entries marked with an asterisk. With the modified program, the CPU/WC time ratio still decreases as the dimension of the matrix increases, but at a more acceptable rate.

In the course of converting programs to execute on the SLASH 7, we have found it necessary to modify several codes to minimize paging. In all cases encountered thus far, the changes were straightforward and required little program reorganization. Thrashing would be difficult to eliminate in a program that required rapid and random access of a large data set, but we have not encountered this problem.
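The effect of access order on paging can be illustrated with a small simulation. The sketch below is not the GIVENS code and does not model the SLASH 7 operating system; it assumes least-recently-used page replacement, two words per double precision element, a single sweep over the matrix, and that all 37 user pages are available to the matrix. The function names are hypothetical.

```python
from collections import OrderedDict

PAGE_WORDS = 1024        # page size given in the text (1K words)
WORDS_PER_ELEMENT = 2    # assumption: one double precision value occupies two words


def packed_index(i, j):
    """Position of element (i, j), i <= j, in upper-triangular storage packed by columns."""
    return j * (j + 1) // 2 + i


def count_page_faults(element_indices, frames):
    """Count page faults for a sequence of element indices under LRU replacement (assumed policy)."""
    resident = OrderedDict()
    faults = 0
    for index in element_indices:
        page = index * WORDS_PER_ELEMENT // PAGE_WORDS
        if page in resident:
            resident.move_to_end(page)        # mark page as most recently used
        else:
            faults += 1
            resident[page] = True
            if len(resident) > frames:
                resident.popitem(last=False)  # evict the least recently used page
    return faults


def row_order(n):
    """Visit the upper triangle row by row (against the storage order)."""
    return (packed_index(i, j) for i in range(n) for j in range(i, n))


def column_order(n):
    """Visit the upper triangle column by column (in storage order)."""
    return (packed_index(i, j) for j in range(n) for i in range(j + 1))


n, frames = 200, 37
row_faults = count_page_faults(row_order(n), frames)
col_faults = count_page_faults(column_order(n), frames)
print("row-order faults:", row_faults, " column-order faults:", col_faults)
```

In storage order each of the 40 pages of the 200x200 matrix is brought in once. In row order the sweep repeatedly cycles through more pages than fit in memory, so the fault count is far larger. For a 175x175 matrix (31 pages) both orders fault only once per page, consistent with the CPU-bound behavior reported for the smaller cases.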
TABLE I. BENCHMARK RESULTS FOR GIVENS.

Matrix       Program Size   CPU Time           Wall Clock Time    CPU/WC
Dimension    (Words)                                              Time Ratio
 50            2.5K         3 sec.             12 sec.            .25
 75            5.7K         8 sec.             19 sec.            .42
100           10.1K         18 sec.            36 sec.            .50
125           15.8K         35 sec.            51 sec.            .69
150           22.6K         59 sec.            1 min. 18 sec.     .76
175           30.8K         1 min. 32 sec.     1 min. 57 sec.     .79
200           40.2K         1 min. 56 sec.     90 min. 53 sec.    .02
200*          40.2K         2 min. 24 sec.     3 min. 41 sec.     .65
290*          84.4K         7 min. 11 sec.     15 min. 57 sec.    .45
400*         160.4K         18 min. 31 sec.    47 min. 24 sec.    .39
* Modified GIVENS routine, see text.

Benchmarks

In this section we present results of programs which were run alone on our SLASH 7. Where available we also present the run times for other machines and, in particular, the UNIVAC 1110, which is the computer at the Madison Academic Computing Center. The following is a short description of each job, the purpose of running the particular task, and the results.

(1) CRUNCHER. This program is a small CPU bound job which includes four arithmetic operations plus exponentiation. The main purpose of running this program is to establish the expected accuracy of the Harris 48-bit double precision word (39 bit mantissa) and to compare machine speeds. In Figure 1 we give the Fortran code for this program and in Table II the results of CRUNCHER are compared with runs obtained on an IBM 370/195 and the UNIVAC 1110. Each computer has a different word length and the precision of the final result is as expected. The "correct" answer is 2.0. For this particular benchmark the SLASH 7 is approximately 15 times slower than the IBM 370/195 and about one-half the speed of the UNIVAC 1110.
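The dependence of the CRUNCHER answer on word length can be explored with a short simulation. The sketch below repeats the arithmetic of the benchmark and optionally truncates every intermediate result to a chosen number of mantissa bits. Truncation toward zero is an assumption (the rounding behavior of the Harris hardware is not stated in the text), the helper names are hypothetical, and the subtraction of zero in the original listing is omitted because it has no effect. It is not expected to reproduce the digits reported for any of the machines.

```python
import math


def chop(x, bits):
    """Truncate x toward zero to a `bits`-bit binary mantissa; bits=None leaves x unchanged."""
    if bits is None or x == 0.0:
        return x
    mantissa, exponent = math.frexp(x)                  # x = mantissa * 2**exponent
    kept = math.trunc(math.ldexp(mantissa, bits))       # keep the leading `bits` bits
    return math.ldexp(kept, exponent - bits)


def cruncher(iterations=1_000_000, bits=None):
    """Accumulate sqrt(2)*sqrt(2)/sqrt(2) `iterations` times, then square the mean."""
    root = chop(math.sqrt(2.0), bits)
    total = 0.0
    for _ in range(iterations):
        term = chop(chop(root * root, bits) / root, bits)
        total = chop(total + term, bits)
    mean = chop(total / iterations, bits)
    return chop(mean * mean, bits)


full_precision = cruncher()            # native 64-bit floating point (53-bit mantissa)
mantissa_39 = cruncher(bits=39)        # 39-bit mantissa, truncated arithmetic (assumed)
print("53-bit mantissa:", full_precision)
print("39-bit mantissa:", mantissa_39)
```

With a shorter mantissa the low-order bits of each term are lost as the running sum grows, so the final value falls below 2.0. Under the truncation assumption, the loss accumulates over the one million additions.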
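C     CRUNCHER BENCHMARK. THE EXACT ANSWER IS 2.0
      IMPLICIT DOUBLE PRECISION (A-H,O-Z)
      ROOT=DSQRT(2.0D0)
      SUM=0.D00
C     ACCUMULATE ROOT*ROOT/ROOT ONE MILLION TIMES
      DO 5 I=1,1 000 000
    5 SUM=SUM+ROOT*ROOT/ROOT -0.D00
C     SQUARE THE MEAN TO RECOVER 2.0
      SUM=(SUM/1 000 000.0D0)**2
      WRITE(6,9)SUM
    9 FORMAT(5H TWO=,E30.20)
      STOP
      END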
Figure 1. FORTRAN listing of benchmark CRUNCHER

TABLE II. BENCHMARK RESULTS FOR CRUNCHER

                      Harris/7           IBM 370/195     UNIVAC 1110        UNIVAC 1110
Word Size             48 Bits            64 Bits         36 Bits            72 Bits
Answer (Exact=2.0)    1.999 992 774 8    (illegible)     1.993 870 154 0    (illegible)
CPU Time              27.2 sec.          1.75 sec.       12.8 sec.          15.5 sec.

TABLE III. BENCHMARK RESULTS FOR MATMUL

                      Harris/7           UNIVAC 1110
CPU Time              5 min. 14 sec.     2 min. 18 sec.

TABLE IV. BENCHMARK RESULTS FOR SCFPGM

                      Harris/7           UNIVAC 1110
CPU Time              36 min. 3 sec.     26 min. 46 sec.
Wall Clock Time       54 min. 26 sec.    N/A
CPU/WC Time Ratio     .66                N/A
(2) MATMUL. In this program two 60x60 matrices are multiplied together 50 times. In Table III the results are given for the Harris computer and the UNIVAC 1110. Both runs are in double precision.

(3) SCFPGM. This program package calculates the one- and two-electron molecular integrals needed for an ab initio electronic structure calculation and subsequently performs a restricted self-consistent-field (SCF) calculation using the integrals. The benchmark calculation involved a gaussian lobe basis set of 39 contracted functions appropriate to the carbon monoxide molecule. In this particular run more than 5 x 10^6 gaussian integrals were calculated and the SCF program ran for 20 iterations. In Table IV we present the results of the benchmark. This program is CPU bound, with a 66% CPU/WC time efficiency.
(4) LEAST SQUARES. This program, which was provided by Dr. J. C. Calabrese of our department, does a least-squares analysis of x-ray crystallographic data and is one part of a large x-ray data analysis package. Such calculations are responsible for a substantial portion of the CPU utilization of our computer. In Table V the times for a typical LEAST SQUARES run are given for both the Harris/7 and the UNIVAC 1110. This program is also CPU bound.

TABLE V. BENCHMARK RESULTS FOR LEAST SQUARES.

                      Harris/7           UNIVAC 1110
CPU Time              26 min. 22 sec.    15 min. 9 sec.
Wall Clock Time       36 min. 39 sec.    N/A
CPU/WC Time Ratio     .72
(5) TPROB. This program, which was provided by Professor C. F. Curtiss and Mr. R. R. Woods of our department, calculates atom-diatom rotational excitation cross sections. This program was selected as a benchmark because (1) it requires 193K words of core, which is considerably larger than our 37K physical memory (the program's paging rate is approximately 10 page requests/sec); (2) the program does a considerable amount of mixed-mode and complex arithmetic, so these functions of the Harris Fortran compiler could be tested; (3) for each set of input parameters the program requires approximately 7 hours of CPU time. In normal operation, this program runs at the lowest priority to soak up unused CPU cycles. The CPU and Wall Clock times for this run on the Harris/7 are given in Table VI. Although this program requires more than 5 times the available core, the program received a 57% CPU/WC time ratio.
TABLE VI. BENCHMARK RESULTS FOR TPROB.

CPU Time              448 min.
Wall Clock Time       942 min.
CPU/WC Time Ratio     .57
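The relative speed of the two machines can be worked out directly from the CPU times reported for MATMUL, SCFPGM and LEAST SQUARES in Tables III to V. The following sketch does that arithmetic; the helper and variable names are illustrative only, and the choice of a simple arithmetic mean is an assumption made here, not a statistic used by the authors.

```python
def seconds(minutes, secs):
    """Convert a 'min., sec.' table entry to seconds."""
    return 60 * minutes + secs


# (Harris/7 CPU time, UNIVAC 1110 CPU time) in seconds, copied from Tables III-V
cpu_times = {
    "MATMUL": (seconds(5, 14), seconds(2, 18)),
    "SCFPGM": (seconds(36, 3), seconds(26, 46)),
    "LEAST SQUARES": (seconds(26, 22), seconds(15, 9)),
}

# How many Harris/7 CPU seconds correspond to one UNIVAC 1110 CPU second
slowdown = {job: harris / univac for job, (harris, univac) in cpu_times.items()}
mean_slowdown = sum(slowdown.values()) / len(slowdown)

for job, ratio in slowdown.items():
    print(f"{job:14s} Harris/UNIVAC CPU time ratio = {ratio:.2f}")
print(f"mean ratio = {mean_slowdown:.2f}")
```

On these three programs the Harris/7 takes between roughly 1.3 and 2.3 times the UNIVAC 1110 CPU time.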
Batch-Run Benchmarks

In this section we report the results of mixing the programs described in the previous section in order to determine the extent to which the virtual memory system can handle several jobs running simultaneously. For each run in Table VII, the total size for all programs greatly exceeds the 36 user pages of physical memory. The results in Table VII are representative of a larger set of statistics for numerous job mixes run at various priorities. These jobs are typical for our department. For each run in Table VII, the final job was aborted when the penultimate job was complete.

Based on our experience to date, we make the following observations:

(1) For most job mixes (e.g. example 1 in Table VII) the total CPU/WC time ratio is close to the combined result obtained when the jobs were run alone. In fact, for some mixes (e.g. example 2) there is an overall improvement in through-put.

(2) Two or more large jobs running at the same priority result in a decrease in through-put (e.g. example 3). This occurs because in this situation the operating system time-slices the available CPU cycles by alternating between the two programs. The result is more paging and less CPU utilization.

(3) If two moderately paging jobs are mixed at the same priority, it is possible to generate a thrashing situation. For example, if two unmodified GIVENS jobs for 175x175 matrices are run alone or at different priorities, they are CPU bound. If they are run at the same priority they thrash. To correct the problem, it is necessary to suspend one job until the other is finished.

(4) Programs execute faster if the paging feature of the operating system, rather than explicit disk I/O commands, is used to reference data sets. This is not possible if the total code and data is greater than 256K.
(5) For large data sets, where it is necessary to use explicit disk I/O commands, it is inefficient to read into memory in a single command more pages of data than there are available user pages in the physical memory. For example, if there are 4 user pages available and 6 user pages are read from disk, the first two pages will be read, but then might be paged back to disk in order to generate space for the last two pages. A subsequent program reference to the first two pages results in additional swapping.
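Observation (3) can be illustrated with a toy model of two equal jobs sharing the physical memory. The sketch below assumes least-recently-used replacement over a shared pool of 37 pages, treats each 175x175 GIVENS job as a repeated sweep over 31 pages (its matrix size from Table I, ignoring program code), and takes one full sweep as the time slice. All of these are simplifying assumptions, and the names are hypothetical.

```python
from collections import OrderedDict


def lru_faults(page_refs, frames):
    """Count page faults for a reference string under LRU replacement (assumed policy)."""
    resident = OrderedDict()
    faults = 0
    for page in page_refs:
        if page in resident:
            resident.move_to_end(page)
        else:
            faults += 1
            resident[page] = True
            if len(resident) > frames:
                resident.popitem(last=False)
    return faults


def sweep(job, pages):
    """One pass of a job over all of its pages; pages are tagged with the job name."""
    return [(job, p) for p in range(pages)]


def one_after_the_other(pages, sweeps):
    """Job A runs to completion, then job B (one job suspended until the other finishes)."""
    return sweep("A", pages) * sweeps + sweep("B", pages) * sweeps


def time_sliced(pages, sweeps):
    """Jobs A and B alternate, one sweep per time slice (same priority)."""
    refs = []
    for _ in range(sweeps):
        refs += sweep("A", pages) + sweep("B", pages)
    return refs


FRAMES, PAGES_PER_JOB, SWEEPS = 37, 31, 10
sequential = lru_faults(one_after_the_other(PAGES_PER_JOB, SWEEPS), FRAMES)
sliced = lru_faults(time_sliced(PAGES_PER_JOB, SWEEPS), FRAMES)
print("faults, run one after the other:", sequential)   # 62
print("faults, time-sliced:", sliced)                   # 620
```

Each job fits in memory on its own, so running them in turn costs one fault per page. When they alternate, the combined working set of 62 pages exceeds the 37 available, and in this model every reference after a switch is a fault.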
TABLE VII. BENCHMARK RESULTS FOR BATCH RUNS.

Job                      Priority   CPU Time            Wall Clock Time     CPU/WC Time Ratio
1. SCFPGM                6          27 min. 8 sec.
   GIVENS (400)          4          18 min. 44 sec.
   TPROB                 0          11 min. 22 sec.
   TOTAL (Size=396K)                57 min. 14 sec.     95 min. 18 sec.     .60
2. TPROB                 4          36 min. 54 sec.*
   GIVENS (400)          0          20 min. 2 sec.
   TOTAL (Size=359K)                56 min. 56 sec.     100 min. 25 sec.    .57
3. GIVENS (400)          0          18 min. 42 sec.
   SCFPGM                0          13 min. 37 sec.*
   TOTAL (Size=205K)                32 min. 19 sec.     71 min. 55 sec.     .45
* Aborted before completion, see text.

Cost-Effectiveness

We enter a discussion of this topic with reluctance, since the real cost of operating either our departmental computer or the central university computer is difficult to establish with precision. For the present, we restrict consideration to estimating the dollar cost to the Chemistry Department of computations performed on the SLASH 7, compared to the cost of using the central computing center (MACC). Based on our records of actual charges, the effective cost at MACC is approximately $380 per hour at normal rates. This is a composite charge which includes CPU and memory utilization, I/O operations, and data and program storage. Most of the number-crunching calculations are run at a variety of reduced rates (overnight or weekend), so we adopt an average cost of $210/hour.

We next estimate the number of UNIVAC 1110 hours that we can generate on the SLASH 7. Based on our experience thus far, we expect to be able to achieve a maximum of 12 to 14 hours of SLASH 7 CPU time per day. This figure allows for the estimated paging time and maintenance time. (The paging overhead could be reduced by adding more core.) Thus, at saturation we expect approximately 350 hours per month; this is a conservative estimate. The equivalent UNIVAC 1110 time is 175 hours per month, at a total cost of $450,000 per year.
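The saturation estimate above can be retraced in a few lines. The sketch below uses only figures quoted in the text. The factor of two between SLASH 7 and UNIVAC 1110 hours is the one implied by converting 350 hours to 175 hours, and the variable names are illustrative.

```python
SLASH7_HOURS_PER_MONTH = 350      # saturation estimate from the text
SLASH7_HOURS_PER_UNIVAC_HOUR = 2  # implied by 350 SLASH 7 hours -> 175 UNIVAC 1110 hours
MACC_RATE_DOLLARS_PER_HOUR = 210  # average reduced rate adopted in the text

univac_hours_per_month = SLASH7_HOURS_PER_MONTH / SLASH7_HOURS_PER_UNIVAC_HOUR
annual_macc_cost = univac_hours_per_month * 12 * MACC_RATE_DOLLARS_PER_HOUR

print(f"equivalent UNIVAC 1110 time: {univac_hours_per_month:.0f} hours/month")
print(f"equivalent MACC charge: ${annual_macc_cost:,.0f} per year")
```

The product comes to $441,000 per year, which the text rounds to $450,000.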
In the three months since the SLASH 7 has been in full operation, we have obtained an average of approximately 150 CPU hours per month, or approximately 40% of saturation. This corresponds to a cost at MACC of $180,000 per year. Interestingly, this is close to the purchase price of our SLASH 7 ($152,000).

The direct costs to the Chemistry Department for operating the SLASH 7 are approximately $40,000 per year, which includes the salary of the systems manager, the on-call/complete service contract ($1160 per month), and supplies. If the system cost is amortized to zero value over a five-year period ($30,400 per year), the total cost of the SLASH 7 is approximately $71,000 per year, irrespective of the degree of utilization of the computer. Thus, at the present rate of usage, the cost effectiveness of the SLASH 7 is approximately 5:2, while at saturation it will be approximately 6:1. We emphasize that we consider this to be a conservative estimate of the effectiveness of the SLASH 7.

Discussion

After less than six months of full operation, we feel that the SLASH 7 has been an effective departmental resource for research-oriented computing. Departmental users have had little trouble in converting programs to the new machine. At present, a complete set of ab initio electronic structure programs (including configuration interaction), a complete x-ray data analysis package, the MINITAB statistical package, MINDO3, CNDO, Xα and other semiempirical programs, an NMR spectra-simulation package, and other chemistry codes are operating on the SLASH 7. For most applications the 48-bit double precision word-length has provided sufficient accuracy. After an initial shake-down period of about four months, the hardware has proved reliable. All standard programming languages are included in the operating system, with FORTRAN and BASIC receiving the most use.
A significant by-product of the departmental computer has been greatly increased interaction among experimental and theoretical research groups. With more than ten groups actively using the computer, a stimulating research environment has been created in which expertise and ideas are shared across the boundaries of specialization and field. This perhaps will be the most significant and long-lasting benefit of our departmental computer.

Acknowledgements

Total funding for our SLASH 7 was provided by the University of Wisconsin-Madison. Professors Richard F. Fenske and John C. Schrag, together with the Departmental Computer Committee, were instrumental in the acquisition of the computer.
15

Computation in Quantum Chemistry on a Multi-Experiment Control and Data-Acquisition Sigma 5 Minicomputer

A. F. WAGNER, P. DAY, and R. VANBUSKIRK
Chemistry Division, Argonne National Laboratory, Argonne, IL 60439

ARNOLD C. WAHL
Science Applications, Inc., Rolling Meadows, IL 60008

Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch015

There has been considerable effort in the past few years to lower the cost of performing quantum chemistry computations. An alternative that we have examined is the utilization of a computer system whose primary task is the provision of real-time support for the experimentalist in the laboratory. There are several reasons why such a system is bound to have resources available for execution of a program on an 'as-time-is-available' basis. The usage of system resources required by many on-line experiments is usually not constant. The system is usually scaled to provide service for worst case conditions. Effective response to real-time events requires that the sum of the 'event-driven' tasks should be less than 100 percent of the system's capacity. This incidental 'free' time may then be used for doing useful work, such as quantum chemistry computations. In a way our facility provides a service to the computationally oriented user in the same way that a mini provides the service when connected to a network where some of the minis are involved with instrument control and other minis support the computational operations. The difference is that we perform all the tasks on a single computer of somewhat larger capability than a mini-computer. For those installations interested in both greater experimental automation and quantum chemistry computing at nominal cost, our experience suggests that bootlegging batch computations on a computer dedicated to experimental control is an attractive and feasible alternative to a collection of dedicated mini-computers.
System Overview

Our chemistry division of about 120 research scientists is involved in basic research, requiring highly flexible instrument automation, experiment control and experiment analysis. In addition, there is a strong program of ab initio calculations, performed mostly on Argonne's central IBM 370/195. Frequent instrument replacement and enhancements require rapid and efficient modifications to the associated computer programs and services.
In 1967, before the proliferation of low-cost minis, a careful study of our diverse laboratory automation needs led us to the conclusion that a central computer could support all of the real-time needs of the current and projected instruments and, on the average, have enough left-over resources to support a useful amount of theoretical computation [1]. A suitable hardware configuration would require an operating system to provide effective protection, fast real-time response and efficient data transfer. An SDS Sigma 5 computer satisfied all our hardware criteria. However, it was necessary to design and write our own operating system [2]. Services include program generation, experiment control, real-time analysis, interactive graphics, batch processing and long-term computation (hundreds of hours).

Our system is currently providing real-time support for 26 concurrently running experiments (see Fig. 1), including an automated neutron diffractometer, a pulsed NMR spectrometer, ENDOR and ESR spectrometers, infrared spectrophotometers and nuclear multi-particle detection systems [3]. It guarantees the protection of each user's interests and dynamically assigns core memory, disk space and 9-track magnetic tape usage. Multiplexor hardware capability allows the transfer of data between a user's device and assigned core area at rates of up to 100,000 bytes/sec. Real-time histogram generation for a user can proceed at rates of 50,000 points/sec.
The facility has been self-running (without computer operator) for seven years with a mean time between failures of 11 days and an uptime of 99% of a weekly schedule of 160 hours.

Figure 1. Sigma 5 layout. (Diagram of the centralized real-time computing facility: a XEROX Sigma 5 computer with 230K bytes of memory providing experimental control, data acquisition and data analysis for 21 remotely located experiments. Labeled peripherals: graphic display for data viewing and manipulation, magnetic tape and data storage units (2 drives and 3 drives), teletypes for experiment communication, and a card reader for batch processing. Labeled experiments: electronic spectra of molten salts; infrared spectroscopy; molecular beam research; pulse radiolysis at electron linac; pulsed proton and 13C NMR spectroscopy; ENDOR (electron nuclear double resonance); nuclear particle counting in Chemistry Building, Tandem Van de Graaff and Cyclotron; low temperature laboratory; low-level radioactivity counting facility; neutron diffraction at CP-5 research reactor; high temperature laboratory; mass spectrometers.)

Foreground Tasks. Serving the foreground tasks is the highest priority function of the system. These tasks consist of the execution of programs associated with each of the on-line instruments. A software priority is associated with each program controlling an interfaced instrument. Upon receipt of a request for execution (e.g., a data buffer is full), the user's real-time program will commence execution within about 160 microseconds if it is the highest priority "ready-to-run" job; otherwise it will commence running when all higher priority tasks are completed. Since foreground service cycles typically complete in less than 100 milliseconds (maximum allowed is one second), the lowest priority foreground task seldom remains in the "ready-to-run" state for more than a fraction of a second.

Non-Resident Program Execution. Real-time computational requirements vary over a wide range. The pulsed NMR spectrometer may require scan averaging a 16K word histogram every 300 milliseconds, taking about 100 milliseconds per update. Other experiments may require the execution of a 25K word histogram transformation program (correlated nuclear fission particles) every minute, taking about 10 seconds. Still other users require this type of execution every 10 minutes with execution times ranging from a few seconds to 30 seconds.

To satisfy this variety of demand without requiring an inordinate amount of core memory, the operating system provides for the time-shared execution of non-resident programs (not always resident in core) in the background core area (where batch and long-term jobs are executed). These programs are disk-resident core images of relatively large programs required infrequently and without severe time constraints. Two queues for this type of service are provided: one with a 1 second and the other with a 32 second time limit. These programs are usually written in FORTRAN by the individual users.

Batch Processing. An open-shop batch-processing capability is supported by the system. Queuing jobs through the card reader provides the casual user with immediate feedback for the rapid debugging of programs. Although the on-line user has the option of performing extensive analysis of an experiment from a remote terminal, the batch level is often used where large amounts of output are required or for the transferring of file data between magnetic tape and disk file storage. The batch level is also used extensively to generate and debug code for the control of on-line experiments and for performing most of the computations described in this paper.

The batch level may use all CPU cycles not used by higher priority processes: foreground execution, non-resident execution, and system loading functions. Under normal daytime loading, the foreground usage requires about 10 percent of the CPU cycles and the non-resident execution about another 40 percent.
Thus, it appears to the batch user that his program is executing on a computer with about half the speed of a Sigma 5 computer dedicated to batch-processing.

Long Term Computation. Utilization of the CPU seldom exceeds 40 percent in a 24 hour period, even with considerable batch usage. The remaining CPU cycles are made available for executing very long (hours to weeks) batch-type computations running at a priority level below batch processing. These jobs differ from batch jobs in that they only have access to disk files, not the batch peripherals. Once initiated (from the card reader), the job is read into batch core memory from the disk anytime there is sufficient space and higher priority usage permits. The daily saving of disk files on magnetic tape also copies the current core image of the long term job along with its files. Automatic file (and long-term) restoration at system boot-in supports execution extending over long periods. Table I indicates the distribution of long-term jobs that might be performed during a busy week.
Table I. Long-Term Job Length Distribution

LENGTH (HOURS)    JOBS PER WEEK
<1                15
1-2               4
10-15             2
>100              0.3
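The CPU shares quoted above and the job mix in Table I can be combined into a rough weekly budget. The sketch below is illustrative arithmetic only. The representative job lengths (0.5, 1.5, 12.5 and 100 hours) are assumptions chosen from within or at the edge of each range in Table I, and the text does not say whether those lengths are CPU hours or elapsed hours.

```python
# Daytime CPU shares quoted in the text
FOREGROUND_SHARE = 0.10
NON_RESIDENT_SHARE = 0.40
batch_share = 1.0 - FOREGROUND_SHARE - NON_RESIDENT_SHARE   # what a batch job sees

# Table I, with an assumed representative length (hours) for each range
long_term_jobs = [
    # (assumed length in hours, jobs per week)
    (0.5, 15),      # "<1"
    (1.5, 4),       # "1-2"
    (12.5, 2),      # "10-15"
    (100.0, 0.3),   # ">100", lower bound used
]
long_term_hours_per_week = sum(length * count for length, count in long_term_jobs)

# Weekly schedule and the 40 percent utilization ceiling quoted in the text
SCHEDULE_HOURS_PER_WEEK = 160
idle_hours_per_week = SCHEDULE_HOURS_PER_WEEK * (1.0 - 0.40)

print(f"batch share of CPU under daytime load: {batch_share:.0%}")
print(f"long-term job hours in a busy week (assumed lengths): {long_term_hours_per_week:.1f}")
print(f"hours left over at 40% utilization: {idle_hours_per_week:.0f}")
```

Under these assumptions a busy week of long-term jobs amounts to roughly 70 hours, against about 96 hours that remain when utilization is at 40 percent of the 160 hour schedule.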
Queuing Low Priority Tasks. As the system is required to provide real-time support, the batch processing suffers. Since many of the batch jobs are I/O bound, consideration is being given to spooling all batch I/O. This would overlap the I/O with foreground and non-resident executions and thus increase the apparent execution speed of the batch job. As a further enhancement to batch execution, consideration is also being given to including batch in the non-resident execution queue. This would further enhance batch processing speed and at the same time eliminate the priority advantage of the time-share user. As implemented, the long-term queue consists of starting the next job from the card reader after the previous long-term job is completed. A queue is going to be set up to execute jobs in a cyclic manner, with more execution time being given to the shorter jobs.

Quantum Chemistry Computations

The usefulness of the Sigma 5 system for the quantum chemist depends on the scale of the calculations. Broadly speaking, we may distinguish large scale calculations, requiring tens of minutes on the equivalent of a fourth generation computer, and small scale calculations requiring fewer resources. Large scale work generally involves the ab initio calculation of wave functions for either the bound motion of electrons and nuclei in structure studies or for the unbound motion of particles on potential energy surfaces in dynamic studies. Such calculations are most conveniently performed by either a large computer (e.g., fourth generation) or a dedicated minicomputer.
The Sigma 5 system is neither sufficiently powerful nor sufficiently dedicated to be conveniently used for large scale calculations.

Small scale calculations are varied and not readily categorized. They include the rigorous calculation of relatively simple wavefunctions (e.g., for diatomic nuclear motion or for atom-atom elastic scattering), the approximate calculation of wavefunctions or their informational equivalent (e.g., Hückel theory or semiclassical trajectory studies), the reduction of the wavefunction to observable quantities (e.g., equilibrium dipole moments or differential cross sections), the curve or surface fitting of wavefunction information at discrete system geometries (e.g., the dipole moment curve or the potential energy surface), and the
graphics display of the results of all the above calculations. Such calculations require a flexible but only moderately powerful computer such as the Sigma 5. In what follows we will describe several general features of FORTRAN programming for the Sigma 5 system in the batch and long term mode. Then we will review several small scale quantum chemistry programs now in operation.

FORTRAN Programming. A FORTRAN program can be written in two ways: a deck of cards can be keypunched or card images can be entered on an interactive display terminal. The latter alternative makes use of a page editing system, TEXTEDIT, which permits the rapid typing and editing of card images followed by transmittal to a disk file. The file can be accessed with a batch job and the card images listed and punched. Three types of terminals are available: Lear-Siegler 7700, Tektronix 4023, and Tektronix 4010. TEXTEDIT also can be used with a teletype.

There are several system routines which allow the FORTRAN programmer the use of exceptionally useful I/O instructions. For reading data from cards, the system routine READ causes the FORTRAN statement CALL READ(A,B,C,...) to instruct the computer to read in a format-free mode A, B, C, etc., on a single card provided at least one blank space separates each member of the argument list. For reading from or writing on the disk, no JCL is required. A program may have up to two files open at one time (file pointers are core resident).
A file may be private, in which case it is defined by the statement CALL DEFDSK(NAME,NSEC), where NAME is the address of a 20 character EBCDIC file name and NSEC is the number of sectors (256 words) in the file. Up to 120 private files may be defined by each user. A scratch file is also available, and it can be accessed by the statement CALL OPNSCR. Disk I/O can involve reading, writing, or write-reading fixed or variable records. The write-read option allows one logical record to be written on the disk and the next logical record to be read into the same core occupied by the first record, thereby saving one disk revolution period (25 ms). As an example of a FORTRAN call for disk I/O, a variable record read occurs with the execution of the statement CALL DISKR(ARRAY,N,ISEC,IOVER), where ARRAY is the name of the first element in the record, N is the number of words per logical record, ISEC is the disk sector number, and IOVER is a file overflow indicator.

For magnetic tape I/O, labeled and unlabeled tape may be directly referenced by the standard FORTRAN I/O statements READ (U,F) and WRITE (U,F), where U is the unit number of one of two tape drives and F is the format statement number. Prior to execution, the relevant magnetic tapes must be reserved and mounted. A tape drive can be reserved by a single JCL assign card, for example,
!ASSIGN 111=LMT TAPELABEL
where a labeled tape (LMT) with the label TAPELABEL is reserved for drive 111.

The JCL for executing a batch or long term job is particularly simple. This is illustrated by the examples given in Fig. 2. In example A, a subroutine or complete program is stored as an object module in a private library under the name of the program. Card 1 in the example is the job card, which is the first card in every batch or long term submission. It gives the user's ID number (XXX) and name. The last card in the example is the end-of-data card, which ends every batch submission. In example B, the main program START is to be executed. Any unresolved external references in START lead to a single pass search through subsequent entries in the private library. Then the public library is searched for the referenced utility programs (e.g., DSQRT, ABS, etc.). In this way individual subroutines stored as members of the private or public library are selected and linked together to form an executable module. In example C, the program stored in the private library under the entry MIDDLE will be executed in the long term mode. LT on card 2 identifies the mode and NNN is the estimated CPU time required in minutes. All input and output for a long term job must be via disk I/O, so there can be no data deck. Preceding and following batch jobs read in any input and print, punch, tape, or plot any output. System routines permit identification of any long term job current in the computer. The examples given in Fig. 2 all deal with executing jobs via a private subroutine library. Many other ways of running jobs are possible and all have a JCL as simple as the examples in Fig. 2.
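The library search described for example B can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the data layout (an ordered list of entries, each with a set of external references), the function name link, and the member names POTEN and SPLINE are hypothetical, and the actual loader may differ in detail, for instance in how it treats references made by public library routines.

```python
def link(main, private_library, public_library):
    """Select library members for an executable module (illustrative only).

    private_library: ordered list of (name, set_of_external_references).
    public_library:  set of utility routine names.
    Returns (selected_members, references_left_unresolved).
    """
    names = [name for name, _ in private_library]
    start = names.index(main)
    selected = [main]
    unresolved = set(private_library[start][1])

    # Single forward pass over the entries stored after the main program.
    for name, refs in private_library[start + 1:]:
        if name in unresolved:
            unresolved.discard(name)
            selected.append(name)
            unresolved.update(r for r in refs if r not in selected)

    # Whatever is still unresolved is looked for in the public library.
    missing = []
    for name in sorted(unresolved):
        if name in public_library:
            selected.append(name)
        else:
            missing.append(name)
    return selected, missing


# Hypothetical private library: START calls POTEN, which calls SPLINE.
private = [("START", {"POTEN", "DSQRT"}),
           ("POTEN", {"SPLINE", "ABS"}),
           ("SPLINE", set())]
public = {"DSQRT", "ABS"}
print(link("START", private, public))
```

In this sketch the order of entries matters: if SPLINE were stored ahead of POTEN, the single pass would go by it before POTEN's reference to it became known, and the reference would be left unresolved.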
The various FORTRAN programming features we have just described have all been used to assemble a library of operational small scale quantum chemistry programs. We will now discuss several members of this library under the loose categories of structure studies, dynamic studies, and graphics. Several of these programs have been run on an IBM 360/195. While precise comparisons are not available, our experience indicates that the Sigma 5 is roughly 30 times slower than the IBM 360/195 for jobs that are not I/O bound.

Example A.
!JOB XXX MYNAME
!FORTRAN LS PROGRAM ROM 1ST
(FORTRAN DECK)
!EOD

Example B.
!JOB XXX MYNAME
!LOAD START
(DATA DECK)
!EOD

Example C.
!JOB XXX MYNAME
!LOAD MIDDLE LT NNN
!EOD

Figure 2. JCL examples described in text

Structure Studies. The program POTFIT will least squares fit Morse and Hulburt-Hirschfelder potential functions to a set of diatomic potential energies as a function of vibrational stretch. The nonlinear least squares code used in the fit is an adaptation of STEPIT (QCPE program #66) [4]. The method involves a pattern search for the nearest minimum in the least squares expression, starting from an initial guess of parameter values. The input to POTFIT consists of the option for a Morse or Hulburt-Hirschfelder fit, the masses of the atoms, the initial guess of the parameter values, and the set of data to be fit. All input is format free. The output consists of a listing of the input, the final parameter values, the accuracy of the fit, the spectroscopic constants derived from the parameter values, and the resulting vibrational levels. Figure 3 reproduces the last two pages of output for a Morse fit to a set of ab initio calculated potential energies for H2. POTFIT runs in 25K bytes and typical execution times are about two minutes.

The program CR360 is another nonlinear least squares fitting routine. CR360 will fit a supplied functional form to a set of functional values for one, two, or three independent variables, i.e., CR360 will produce curves, surfaces, or hypersurfaces. The method used in fitting is to distinguish linear from nonlinear parameters, to solve the linear least squares problem for a supplied grid of nonlinear values, and to display maps of the sum of the squares of the errors on the grid. There is no automatic search for the minimum nearest the initial guess. The user, through examination of the maps, must select the next set of nonlinear parameter values to search through. The program was designed for problems where there is the possibility of many minima and the location and display of all the minima are important. Prior to execution of CR360, the user must insert into the private library a subroutine that, for any given set of fitting parameters and constants, will calculate the functional form for any combination of independent variables in the data set.
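The grid strategy used by CR360 can be illustrated with a small Python sketch. It is not the CR360 code: the model y = c1*exp(-a*x) + c2, with one nonlinear parameter a and two linear parameters c1 and c2, is an assumption chosen for brevity, as are the function names. For each grid value of a the linear problem is solved exactly through its 2 x 2 normal equations, and the sum of squared errors is recorded so that a map over the grid can be examined.

```python
import math


def solve_linear_part(x, y, a):
    """Best c1, c2 and sum of squared errors for a fixed nonlinear value a."""
    g1 = [math.exp(-a * xi) for xi in x]      # basis function exp(-a*x)
    s11 = sum(g * g for g in g1)              # normal equation matrix
    s12 = sum(g1)
    s22 = float(len(x))
    b1 = sum(g * yi for g, yi in zip(g1, y))  # normal equation right side
    b2 = sum(y)
    det = s11 * s22 - s12 * s12               # vanishes for a = 0
    c1 = (b1 * s22 - s12 * b2) / det
    c2 = (s11 * b2 - s12 * b1) / det
    sse = sum((c1 * g + c2 - yi) ** 2 for g, yi in zip(g1, y))
    return c1, c2, sse


def error_map(x, y, grid):
    """One row (a, c1, c2, sse) per grid value of the nonlinear parameter."""
    return [(a,) + solve_linear_part(x, y, a) for a in grid]


# Synthetic data generated with a = 1.5, c1 = 2.0, c2 = 0.5.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
y = [2.0 * math.exp(-1.5 * xi) + 0.5 for xi in x]
for row in error_map(x, y, [0.5, 1.0, 1.5, 2.0, 2.5]):
    print("a = %.1f  c1 = %8.4f  c2 = %8.4f  sse = %.3e" % row)
```

As in CR360, nothing here searches automatically; the user reads the map and chooses the next, finer grid around whichever minima are of interest.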
At execution, the input for CR360 consists of the number of independent variables, any bias and scaling to be applied to the data, the data and the weight that is to be attached to each data point, the grid of nonlinear parameter values, and the map resolutions for the maps of the sum of the squares of the errors over the grid. The output consists of the listing of the input data and the data biased and scaled, a listing of the final parameter values and the fitting errors, and the display of up to ten maps of different resolutions for the residuals over the grid. The program runs in 120K bytes and its execution time is strongly dependent on the amount of data and the number of fitting parameters. For 150 data points, 30 linear fitting parameters, and 200 nonlinear parameter grid points, CR360 takes between 5 and 10 minutes.

A rather specialized program used in conjunction with large scale ab initio wavefunction calculations is STVTWC, a program modified from one by Hagstrom (QCPE program #9) [5]. This program calculates selected diatomic one-electron integrals for a given basis set of atom-centered Slater type orbitals (STO). The integrals that can be requested are the overlap, the kinetic energy, the nuclear attraction, and the z-moment. For a given basis set of STO's, the selected integral for every pair of orbitals is computed to give a matrix of results. The integration is performed by expanding the STO's in elliptical orbitals followed by analytic integration.
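The simplest case of such an integral can be worked through in a few lines of Python. This sketch is not taken from STVTWC and covers only the overlap of two normalized 1s STO's with the same exponent on different centers; the function names are hypothetical. In elliptical coordinates (xi, eta) the sum of the two electron-nucleus distances is R*xi, so the integrand separates and the xi integrals reduce to the auxiliary functions A_n(rho), the integral from 1 to infinity of xi**n * exp(-rho*xi), which obey a simple upward recursion.

```python
import math


def auxiliary_a(n_max, rho):
    """A_0 .. A_n_max, where A_n = integral_1^inf xi**n * exp(-rho*xi) d(xi)."""
    values = [math.exp(-rho) / rho]
    for n in range(1, n_max + 1):
        # Integration by parts: A_n = (exp(-rho) + n * A_(n-1)) / rho
        values.append((math.exp(-rho) + n * values[n - 1]) / rho)
    return values


def overlap_1s_1s(zeta, distance):
    """Overlap of two 1s STO's with exponent zeta separated by distance > 0."""
    rho = zeta * distance
    a = auxiliary_a(2, rho)
    # The eta integral of (xi**2 - eta**2) from -1 to 1 is 2*xi**2 - 2/3;
    # normalization and the volume element supply the factor rho**3 / 4.
    return rho ** 3 / 4.0 * (2.0 * a[2] - (2.0 / 3.0) * a[0])


rho = 1.0 * 1.4
closed_form = math.exp(-rho) * (1.0 + rho + rho ** 2 / 3.0)
print(overlap_1s_1s(1.0, 1.4), closed_form)
```

The two printed numbers agree, since carrying out the recursion by hand gives the familiar closed form exp(-rho)*(1 + rho + rho**2/3). Orbitals with higher quantum numbers or unequal exponents require further auxiliary integrals that this sketch does not attempt.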
The input is format free and consists of the internuclear distance, the charge on the two nuclei, the selection flag for the integral desired, the number of STO's, and the quantum numbers and zeta value for each STO. The printed output consists of a list of the input followed by a listing of the calculated
Figure 3. The last two pages of printed output from a typical run of POTFIT. The printout gives the Morse curve fit parameters (binding energy in hartrees, eV, and kcal/mole; RE in bohrs and angstroms; asymptotic energy; beta parameter in inverse bohrs and inverse angstroms), the derived spectroscopic constants in hartrees and wavenumbers, the fitted and input energies at each R, and a vibrational analysis listing levels 0 through 17 in hartrees and wavenumbers.
integral matrix. The program runs in 25K bytes and takes on the order of 3 minutes to execute a typical job.

A final program that relates to structure studies is FCF, a routine to calculate the Franck-Condon factors connecting the vibrational states of two different Morse potentials. The calculation consists of the numerical determination of the Morse vibrational wavefunctions followed by Simpson integration of the product. The input consists of the reduced mass followed by the identification title, the Morse parameters, and the maximum vibrational level of each electronic state. The integration range and grid size complete the input. At most 1000 grid points and 15 vibrational states in each electronic state are allowed. The output consists of a listing of the input and the calculated Franck-Condon factor matrix. There is an option to punch the matrix if desired. The program runs in 30K bytes and requires about five minutes for a typical case.

Dynamic Studies. The program PHASE will calculate the elastic cross section and differential cross section as a function of collision energy for an atom-atom collision system. This is done by calculating the quantum phase shift for an input interaction potential for each angular momentum quantum number of importance at the given collision energy.
The phase shift calculation can be done either rigorously by finite difference solution of the Schroedinger equation or approximately by a JWKB solution involving special quadrature formulas to handle the classical turning point singularity in the JWKB integrand. Once all the phase shifts are obtained at a given energy, the cross section and differential cross section are obtained by standard formulas. Before the program can be executed, the private subroutine library must contain the appropriate routine to read and display the parameters of the desired interaction potential and to calculate the potential and its derivative at any point in space. Routines already available include those for Lennard-Jones and EXP-6 potentials as well as a spline potential for a numerically calculated set of potential points. Given the potential routine in the library, the input to PHASE consists of the reduced mass, the energy (or velocity) spectrum, the potential parameters, and parameters governing the finite difference or quadrature solution. The output consists of a listing of the input followed by a list, for each energy, of the phase shift as a function of orbital angular momentum. Along with each phase shift, the program also lists the potential at the turning point, the centrifugal potential at the turning point, the contribution of the phase shift to the cross section, and the accumulated cross section from all the preceding phase shifts.
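The standard formulas referred to above are the partial wave sums, and a Python sketch of them follows. It is not the PHASE code; the function names and the sample phase shifts are hypothetical, and the wavenumber k and the phase shifts are assumed to be supplied in consistent units. Each partial wave contributes (4*pi/k**2)*(2l+1)*sin(delta_l)**2 to the total cross section, and the differential cross section is the squared modulus of the scattering amplitude.

```python
import cmath
import math


def elastic_cross_section(k, phase_shifts):
    """Rows of (l, delta_l, contribution, accumulated cross section)."""
    rows = []
    total = 0.0
    for l, delta in enumerate(phase_shifts):
        contribution = 4.0 * math.pi / k ** 2 * (2 * l + 1) * math.sin(delta) ** 2
        total += contribution
        rows.append((l, delta, contribution, total))
    return rows


def legendre_values(l_max, x):
    """P_0(x) .. P_l_max(x) by the three-term recurrence."""
    values = [1.0, x]
    for n in range(1, l_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[:l_max + 1]


def scattering_amplitude(k, phase_shifts, theta):
    """f(theta) = (1/(2ik)) * sum_l (2l+1) (exp(2i*delta_l) - 1) P_l(cos theta)."""
    p = legendre_values(len(phase_shifts) - 1, math.cos(theta))
    total = sum((2 * l + 1) * (cmath.exp(2j * d) - 1.0) * p[l]
                for l, d in enumerate(phase_shifts))
    return total / (2j * k)


def differential_cross_section(k, phase_shifts, theta):
    return abs(scattering_amplitude(k, phase_shifts, theta)) ** 2


k = 1.7
deltas = [0.8, 0.3, -0.1]          # hypothetical phase shifts for l = 0, 1, 2
for row in elastic_cross_section(k, deltas):
    print("l = %d  delta = %6.3f  contribution = %8.4f  accumulated = %8.4f" % row)
```

A convenient check on any such calculation is the optical theorem: the accumulated total must equal (4*pi/k) times the imaginary part of the forward amplitude f(0).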
At the end of the phase shift list, the total cross section and its log are printed and punched if desired. Optional printout consists of a listing of the differential cross section over an input range of scattering angle, a listing of the extrema in the differential cross section, and a line printer plot
of the log of the differential cross section versus scattering angle. Figure 4 reproduces such a plot from a study of the Ar-H elastic scattering. The program runs in 100K bytes. Execution times per energy vary with the energy, the collision system, and the solution method (rigorous or JWKB). Rigorous calculations take longer and vary from 20 seconds to ten minutes, with a typical time on the order of two minutes.

Another operational dynamics program is TRAJ3D, a three dimensional classical trajectory routine. The program is an adaptation of Muckerman's routine in QCPE (program #229) [6]. Given the potential energy surface, any three atom collision system can be studied. For a given energy, the standard semiclassical initial conditions are used for each trajectory and the calculated final conditions are analyzed according to the bin method. After calculating the desired number of trajectories, the program analyzes the bins for the nonreactive, reactive, and dissociative cross sections and differential cross sections. The method of calculation is a combination of a Runge-Kutta and an 11th order predictor-corrector solution to Hamilton's equations. Before the program can be executed, the private library must contain a package of routines to read and display the potential energy surface parameters and to calculate the potential energy and its derivative at any point in space.
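The role of the user-supplied potential routine in the integration can be seen in a much reduced Python sketch. It is not the TRAJ3D algorithm: it uses a fourth order Runge-Kutta step alone, with no predictor-corrector stage, for a single coordinate q and momentum p, and the function names are hypothetical. Hamilton's equations are dq/dt = p/m and dp/dt = -dV/dq, so the integrator needs only the derivative of the potential at arbitrary points.

```python
import math


def rk4_step(q, p, dt, mass, dvdq):
    """Advance (q, p) by one fourth order Runge-Kutta step."""
    def derivatives(q_, p_):
        return p_ / mass, -dvdq(q_)           # Hamilton's equations

    k1q, k1p = derivatives(q, p)
    k2q, k2p = derivatives(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = derivatives(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = derivatives(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new


def integrate(q, p, dt, steps, mass, dvdq):
    """Propagate a trajectory for the given number of steps."""
    for _ in range(steps):
        q, p = rk4_step(q, p, dt, mass, dvdq)
    return q, p


# Test potential V = q**2 / 2 (unit mass and force constant): after one
# period of 2*pi the trajectory should return to its starting point.
period = 2.0 * math.pi
print(integrate(1.0, 0.0, period / 1000, 1000, 1.0, lambda q: q))
```

Multistep predictor-corrector methods of the kind named in the text need fewer potential evaluations per step than Runge-Kutta, but they need several starting values, which a single-step method such as this one can supply.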
Given this package, the input consists of the reduced mass, the collision energy, the initial state of the diatomic molecule, the range of impact parameters to be studied, the initial separation of the reactants, the number of trajectories, parameters relating to the method of calculation, and parameters relating to the analysis and display of the results. The output consists of the above mentioned cross sections and differential cross sections as well as the translational energy loss as a function of scattering angle and the correlation of rotational and vibrational energy gain or loss. TRAJ3D runs in 100K bytes. The execution time varies with the energy and the collision system. A typical time per trajectory is about 1 minute. TRAJ3D is not a good program to run in the batch mode, as the usual trajectory study would involve up to a few thousand trajectories, i.e., a number of hours of CPU time. However, the long term mode is ideal for trajectory studies. Under ordinary circumstances the initial and final conditions of each trajectory would be saved for any additional analysis desired later. Thus TRAJ3D can be readily decomposed into an input program that places all input on the disk, a central program that reads the input, calculates the trajectories, and stores the information for each trajectory on the disk, and finally an analysis program that reduces the trajectory information to measurable quantities.
The central program is run on long term, and in this way several thousand trajectories can typically be run overnight between two work days.
Figure 4. Line printer display of a differential cross section produced by PHASE
Graphics. The program WSPLOT fits cubic spline polynomials to sets of data and plots the resulting curves in page size (8 1/2 x 11) figures. There is an option to make the X or Y axis 8 inches long with the other axis 6 inches long. Tick marks automatically occur every inch on both axes. The input consists of the limits of the X and Y axes, the figure title, the axes titles, and the input for each set of data. This data input consists first of a selection of a solid or dashed line with or without symbols marking the data points, or no line at all with the data marked by diamonds. Then the actual data can be submitted in two ways: one format free data card for each abscissa-ordinate pair, or under the control of a subroutine placed in the private library prior to execution of WSPLOT. Options allow for the internal biasing and scaling of both the ordinate and the abscissa. The data points must be arranged in order of increasing abscissa. Up to two hundred data points can be accommodated in a single set. Each set of data produces one curve on the figure. The printed output consists of a listing of all the input, a listing of the biased and scaled data, and a listing of the spline fit to the data points to test for any numerical errors in the spline fit. The plot output is on the plot disk file. In a separate job, a system routine will direct the Calcomp plotter to plot what is on the file; this separate job requires only the job card (see Fig. 2) followed by a plot card:
!LOAD PLOT
As many figures, and as many curves on each figure, as desired can be run in a single job.
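A cubic spline fit of the kind WSPLOT performs can be sketched in Python as follows. The end conditions used by WSPLOT are not stated in the text; this sketch assumes the natural spline (zero second derivative at both ends), and its function names are hypothetical. The second derivatives at the data points follow from a tridiagonal system, after which the curve can be evaluated anywhere between the first and last abscissa.

```python
from bisect import bisect_right


def natural_spline_second_derivatives(x, y):
    """Second derivatives m[i] of the natural cubic spline through (x, y)."""
    n = len(x)
    m = [0.0] * n                              # natural ends: m[0] = m[-1] = 0
    if n < 3:
        return m
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    # Row j of the tridiagonal system belongs to interior point j + 1.
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n - 1)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n - 1)]
    for j in range(1, n - 2):                  # forward elimination
        factor = h[j] / diag[j - 1]
        diag[j] -= factor * h[j]
        rhs[j] -= factor * rhs[j - 1]
    for j in range(n - 3, -1, -1):             # back substitution
        m[j + 1] = (rhs[j] - h[j + 1] * m[j + 2]) / diag[j]
    return m


def spline_value(x, y, m, t):
    """Evaluate the spline at t, with x in increasing order."""
    i = min(max(bisect_right(x, t) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    a = x[i + 1] - t
    b = t - x[i]
    return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h)
            + (y[i] / h - m[i] * h / 6.0) * a
            + (y[i + 1] / h - m[i + 1] * h / 6.0) * b)


x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 0.0]
m = natural_spline_second_derivatives(x, y)
print(spline_value(x, y, m, 0.5))              # 0.6875
```

Evaluating the spline back at the data points and comparing with the input, as in the printed listing WSPLOT provides, is a quick test for numerical errors in the fit.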
The program runs in 25K bytes and takes about 30 secs to process a typical curve.

Another graphics program, one that displays surfaces instead of curves, is KPLOT, which makes a contour plot of any function of the polar coordinates (R, theta). The titling in KPLOT assumes that what is being plotted is the potential energy surface of an atom approaching a diatom frozen at a fixed vibrational stretch. However, the contour plot itself can be for any surface. Prior to the execution of KPLOT, a surface subroutine must be stored in the private library. This routine must handle all information regarding the surface to be plotted, i.e., it must read and display all surface parameters and calculate the surface at any arbitrary point. KPLOT then searches for contour values along given radial vectors. When a desired contour value is discovered, it is numerically traced and the resulting curve is stored in the plot file. If the trace is lost due to kinks in the surface that are missed, an error message is given. The input to KPLOT consists first of title cards for the figure, for the radial scale insert in the figure, and for the chemical symbols of the AB+C potential energy surface assumed in the titling. Then come the maximum and minimum radial values within which the contours will be plotted and the dimensions of the figure. Next are given the angles for which radial vector searches for contour values are to be performed. The
second to last piece of information is the number of positive contours, the largest contour, the fraction relating adjacent contours, and the percent fit of the computed contour trace to the actual contour. Both positive and negative contours are searched for and traced. Finally any surface parameters are submitted under the control of the subroutine discussed above. The printed output consists of a listing of the input and a digest of the trace information for each contour. The plot output consists of the figure and, as an option, to the right side of the figure, a summary of the contours found and where they were found. As always the plot information is placed in a plot disk file to be accessed in a second batch job by the system routine PLOT. Figure 5 reproduces a figure produced by KPLOT; the plotted surface is the potential energy surface for Li + H2 with H2 frozen at 1.4 bohrs. The program runs in 30K bytes and takes about 3 minutes to execute the plot in Fig. 5.

Assessment for Quantum Chemists

The major advantages of the Sigma 5 system are its power, flexibility, simplicity of operation, and nominal cost. Most FORTRAN programs for small scale quantum chemistry calculations require little reworking to become operational on the system. The JCL, as illustrated by Fig. 2, is exceedingly simple and direct. The system is open shop and thus each person directly runs his own job without the delay of working through an intermediate staff of computer operators.
The nominal cost of the batch and long term computations is due to the fact that these calculations use extra capability unavoidable in achieving the primary mission of direct experimental control.

The disadvantages of the system for the quantum chemist come in two forms: foreground interference and peer pressure. Foreground interference of background batch and long term jobs occurs whenever the foreground tasks and non-resident program executions for experimental control assert their priority in the use of the CPU. On the average, this interference ties up the CPU 50% of the time during regular working hours (8 AM to 5 PM, Monday through Friday). It is also highly variable, ranging from no interference to as much as 55 minutes of interference per hour during regular hours. After regular hours, foreground interference is not a substantial problem. As described earlier, spooling, to permit I/O during foreground interference, and time sharing batch with certain foreground jobs will alleviate some of the pressure of foreground interference. However, foreground interference is a fundamental feature of the system.

Figure 5. A plot produced by KPLOT (plot title: LI + H2, RHS = 1.4)

Peer pressure constrains batch or long term usage because, in the open shop system, the length of time one user can tie up the batch or long term facilities is inversely proportional to the number of people in line waiting for the same facilities. Since there are 120 research scientists in the division, this is a substantial problem. During regular working hours, the number of batch users per hour ranges from 1 to 20 with an average of 12. In practice, a job requiring more than 5 or 10 minutes generally attracts a crowd of users waiting to run jobs of shorter duration. After regular working hours, this is much less of a problem. For the long term mode of operation, a week's usage has been given in Table I. In practice, a long term job in the system for longer than 24 hours during the work week would cause others with shorter long term jobs to complain. As described earlier, the establishment of a queue would loosen the constraints of peer pressure by allowing very long long term jobs to run with reduced priority relative to shorter long term jobs.

Foreground interference and peer pressure make it inconvenient at best and impossible at worst to run large scale quantum chemistry calculations on the Sigma 5 system. Such large scale computing requires access to either a standard large computer or a dedicated minicomputer. However, as our examples indicate, the Sigma 5 system is very well suited for small scale quantum chemistry calculations. It has a power, flexibility, and simplicity of operation, all at nominal cost, that would be difficult and expensive to match with dedicated minicomputers. Thus for those laboratories interested in both greater experimental automation and a wide range of small scale quantum chemistry computations, our experience suggests that bootlegging batch and long term computing on a system dedicated to experimental control is a feasible alternative to a collection of minicomputers.

Abstract

Computation in quantum chemistry and dynamics is being performed in batch and long term mode on a Sigma 5 computer whose primary task is to provide real-time instrument control, data acquisition, and final analysis for 26 on-line experiments.
A brief discussion will be given of the multi-programming operating system which provides, in order of priority, real-time interaction with a large number of concurrently running instruments, interactive graphics, time-sharing, batch and long term computation. The efficacy of this facility in three areas of computational chemistry will be reviewed. First, the analysis of wavefunctions and associated energies will be considered with several examples involving property calculations, analysis of potential curves, and least-squares fitting routines for potential energy surfaces. Next, dynamics programs for quantum elastic scattering and three body trajectory studies will be examined. Last, graphics (Calcomp Plots) programs will be discussed in regard to the display of potential energy curves and surfaces. The use of both batch and long term modes will be illustrated and several typical calculations discussed.
Acknowledgements

The primary programmer for POTFIT, STVTWC, WSPLOT, and KPOT was Dr. Walter J. Stevens, now of the National Bureau of Standards in Boulder, Colorado. The program FCF was a minor adaptation of a program written by Dr. Patricia Dehmer of the Physics Division at Argonne National Laboratory.

Literature Cited

[1] Day, P. and Krejci, H., Proc. AFIPS FJCC (1968) 33, 1187-1196.
[2] Day, P. and Hines, J., Operating Systems Review (1973) 7 (4) 28-37.
[3] Day, P., Computer Networking and Chemistry, ACS Symposium Series 19, Peter Lykos, ed., (1975) 85-107.
[4] Chandler, J. P., Program #66 in QCPE Catalogue and Procedures (1974), X, 29.
[5] Hagstrom, Stanley, Program #9 in QCPE Catalogue and Procedures (1974), X, 19.
[6] Muckerman, J. T., Program #229 in QCPE Catalogue and Procedures (1974), X, 85.
16

An Effective Mix of Minicomputer Power and Large Scale Computers for Complex Fluid Mechanics Calculations

R. J. FREULER* and S. L. PETRIE**

The Aeronautical and Astronautical Research Laboratory, Ohio State University, 2300 West Case Road, Columbus, OH 43220

Publication Date: June 1, 1977 | doi: 10.1021/bk-1977-0057.ch016

*Senior Computer Specialist, Member AIAA.
**Professor, Associate Director, AARL.

A "hybrid" computer system employing minicomputers linked to various large scale computers has been implemented to perform varied, complex calculations in fluid mechanics. The requirement for and application of such a system is not unique to the area of fluid mechanics. Similar motivation exists in theoretical chemistry and advanced physics. Basically, the increased power of computing equipment, the exponential rise in costs associated with experimental analyses, and the greater versatility of numerical experimentation have led researchers to turn to various computing techniques to examine physical phenomena. Although costs are also rising in many areas of the computer industry, the fact that computational speed has been increasing much faster than computational cost explains the trend of increased use of computers for research based on theoretical computations. The purpose of the present paper is to describe a unique "hybrid" computing system which has been assembled at the Aeronautical and Astronautical Research Laboratory (AARL) of The Ohio State University to perform numerical experimentation with fluid flows.

System Description and Background

The computing system is configured with two minicomputer mainframes: a Harris Corporation SLASH 5 and a Harris SLASH 4 connected in a non-redundant dual processor arrangement. Synchronous communication devices are utilized to link the SLASH computers with the desired large scale machine. AARL typically employs dial-up Remote Job Entry (RJE) to an IBM System/370 Model 168, although other types of mainframes such as those from Control Data Corporation (CDC) or Sperry Univac may be called upon as needed. The communication is accomplished with conventional dial-up modems so that the optimum connection of the minicomputers to a large scale machine can be obtained for the particular problem at hand.

A detailed description of the two SLASH computers system of AARL is included as Appendix A to this paper. It should be noted that the description in Appendix A makes reference to a Harris SLASH 6, not a SLASH 4. The SLASH 4 was made available to AARL until the SLASH 6 delivery could be effected. As a result, the work reported here deals mainly with observations about and comparisons between the Harris SLASH 4 and the previously mentioned IBM System/370 Model 168. Some direct comparisons between the SLASH 4 and the SLASH 6 have been made, however, and will be reviewed later. It is notable that the Harris SLASH 4 has been selected by others for use in large scale computations (1), particularly in theoretical chemistry (2).

In 1973, AARL established its Digital Computer and Data Acquisition System after an extensive survey and benchmarks by Petrie (3). The original intent of the computer system was to provide an on-line real-time data acquisition and reduction facility utilizing a digital computer based system, replacing an analog computer of limited capability and questionable maintainability. This digital system, which is used to acquire and reduce experimental data from the varied wind tunnel testing facilities at AARL (4), is based on the Harris SLASH 5 and was installed for approximately $130,000 (in 1973), including the analog signal conditioning equipment and real-time peripheral gear.

As is so often the case when the first digital computer is installed at a site, the routine use of the computer system was expanded into new areas at AARL. Soon, the real need for the availability of extensive computer power in support of theoretical fluid mechanics and other large scale numerical calculations was recognized. This need is being satisfied by the addition of a Harris SLASH 6 processor configured in the dual processor arrangement with the SLASH 5. The addition of the SLASH 6 including an 80 Mbyte disc drive, another 9 track magnetic tape drive, a 36 inch drum plotter, and several other peripherals amounts to approximately $180,000. Thus the entire system represents investments totaling $310,000 in two phases.
SLASH 4 vs. IBM System/370 Model 168

As related to large scale computations, the AARL SLASH computers are currently employed to examine the performance of aerodynamic surfaces, usually airfoil sections or wings, under varying fluid mechanical conditions. These analyses are accomplished with a library of some two dozen or more computer programs, each one of which can perform specific types of calculations. All of these programs are coded in the FORTRAN language. Some are derivatives and modifications of earlier versions and undoubtedly contain portions of inefficient and dead code. Most, including those used for comparisons to be made here, are more or less typical of FORTRAN based large scale computational codes written by researchers first and computer programmers second. The codes are typified by moderate to large memory requirements due to heavy usage of arrays, and they require substantial floating point processing power. The input/output (I/O) requirements are moderate in most cases, usually consisting of a few card images input and a couple of thousand lines printed output. Since the programs vary greatly in their memory requirements, numerical stability, and run times, no one machine can be expected to perform in an optimum way for all programs which might be used in the analysis of an airfoil or wing section.
The SLASH 4 however, with its 48 bit floating point word with a 39 bit mantissa providing 10+ digit accuracy, is well suited to most fluid mechanics calculations. On the other hand, the IBM System/370 single precision floating point word length of 32 bits with 24 bit mantissa is often not long enough, requiring use of double precision (64 bit floating point word length with 56 bit mantissa) with an accompanying increase in memory needs and run times. Maximum use of the SLASH computers is employed since the cost per run is generally less than that for the larger computer, even though the run times are longer with the SLASH 4.

A representative list of execution time comparisons between the SLASH 4 and the IBM 370/168 for several cases of four different airfoil analysis codes is presented in Table I. Each code is identified by a unique letter and each case executed by the code is summarized. The execution times indicated are for case execution only; compile and link-edit or catalog time is not included. As mentioned earlier, all the codes listed are written in FORTRAN and draw on a manufacturer supplied library of FORTRAN arithmetic support routines (SIN, COS, ALOG, etc.). All programs executing on the IBM machine were run in IBM single precision. On the SLASH 4, although simple arithmetic operations are always performed with a 39 bit mantissa, the arithmetic support library offers either single precision routines with a 24 bit mantissa accuracy, or double precision routines with the full 39 bit mantissa accuracy. It was determined that results produced by Programs C and K would be improved by using 39 bit mantissa accuracy for all calculations. On the SLASH 4, the change from single precision to double precision arithmetic routines is a simple matter of selecting a compile option, and so this was done. Since the floating point word on a Harris SLASH computer is always 48 bits, the resultant increase in memory requirements as a result of selecting this double precision compile-time option is very small and is caused by the slightly longer double precision arithmetic routines. The mechanism here is that the 24 bit mantissa accuracy in the single precision library routines is extended to the full 39 bit accuracy of the normal Harris SLASH series floating point word.

TABLE I
HARRIS SLASH 4 AND IBM SYSTEM/370 MODEL 168 COMPARISONS

Program   Case      Execution Time in CPU sec.    Percent
Code      Number    IBM 370/168      SLASH 4      Slower*
C**       I              114.30       728.20       537.1%
C**       II              82.19       498.26       506.2%
C         III            149.84      1912.55      1176.4%
E         I               18.89       157.28       732.6%
E         II              18.78       172.86       820.5%
E         III             26.49       296.26      1018.4%
E         IV              45.47       427.05       839.2%
K         I              136.28      1207.26       785.9%
K         II              69.12       732.79       960.2%
N         I              123.53      1031.71       735.2%
Totals/Average           784.86      7164.22       812.8%

*SLASH 4 is slower than IBM 370/168 by X%, based on IBM 370/168.
**Compiler used on IBM for this case was FORTRAN G1. All other cases used FORTRAN H Extended on IBM with the maximum optimization level.

The programs were run in an overlay structure on the SLASH 4 because of their memory requirements. No overlay structure was used on the IBM machine. No attempt has been made to adjust execution times to account for non-overlaying on the IBM 370/168 and overlaying on the SLASH 4. Overlaying on the SLASH 4 is the difference between being able to obtain results or not being able to run at all for these programs. A fairer comparison of what is required in terms of CPU seconds for each of the computers results from not adjusting for such factors as non-overlay vs. overlay. It would be expected that overlay structures usually require more disc I/O operations and longer wall clock execution times, but have only a slight effect on actual CPU seconds. Referring to Table I, it can be seen that the SLASH 4 runs only about 8 times slower than the IBM System/370 Model 168 on the average for the cases presented.
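The Percent Slower column of Table I follows from its footnoted definition: the difference in CPU time taken relative to the IBM 370/168 time. As a minimal sketch, the Python below recomputes that figure for a few rows and for the printed totals. The function name `percent_slower` and the choice of rows are illustrative and do not come from the paper.

```python
def percent_slower(slow_sec, fast_sec):
    """Percent by which slow_sec exceeds fast_sec, relative to fast_sec."""
    return (slow_sec - fast_sec) / fast_sec * 100.0

# (code, case): (IBM 370/168 CPU sec, SLASH 4 CPU sec), values from Table I
rows = {
    ("C", "III"): (149.84, 1912.55),
    ("E", "I"): (18.89, 157.28),
    ("N", "I"): (123.53, 1031.71),
}
totals = (784.86, 7164.22)  # printed Totals row of Table I

for (code, case), (ibm, slash4) in rows.items():
    print(code, case, round(percent_slower(slash4, ibm), 1))

overall = percent_slower(totals[1], totals[0])
ratio = totals[1] / totals[0]
print("overall:", round(overall, 1), "percent slower; time ratio", round(ratio, 2))
```

Note that "about 8 times slower" in the text corresponds to the percent-slower figure divided by 100; expressed as a plain ratio of run times, the SLASH 4 total is a little over 9 times the IBM total.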
Since the same FORTRAN Compiler was used for all cases on the SLASH 4, it is notable that the SLASH 4 runs only about 5 times slower than the IBM 370/168 when the FORTRAN G1 Compiler is used on the IBM machine. Alternatively, it would appear that the G1 Compiler generates much less efficient machine code than the H Extended Compiler. Worst case comparison points out that the difference between the SLASH 4 and IBM 370/168 is only a factor of 12.

The Effective Mix
Because the SLASH 4 compares favorably with the IBM 370/168, maximum use of the SLASH computer is employed since the cost per run is generally less than for the IBM machine, even though the run-times are longer by an average factor of 8 with the SLASH 4. Program development and initial checkout are conducted with the SLASH computers with a resultant lowering of program development costs. The Harris interactive alphanumeric editing package combined with the FORTRAN IV extended compiler running under control of the Harris Disc Monitor System (DMS) provide an excellent means for program development including program source editing and updating, program compiling for syntactical error correcting, and program execution for debugging and checkout purposes. Because the computer utilization charging rates are considerably lower for the smaller AARL computing system than for most large mainframes, program development costs have been reduced appreciably.

While many of the calculations can be conducted with the SLASH computers system, there are still a few of the programs within the fluid mechanics library which have excessive memory requirements and/or very long run times. In these cases, each program is optimized for whichever large mainframe produces the best cost-performance. Usually, this requires minor re-writing of the program code to provide the best tradeoffs between execution speed and storage requirements. This involves taking advantage of hardware or software features of the particular computer system on which the program is being run. It has been found that programs which do not suffer from numerical significance problems will show marked improvement in performance when fine-tuned for use on an IBM System/370 Model 168 as compared to a version for use on the available CDC Cyber 73 machine and will result in a lowered cost per run. These results stem directly from the differences in word size employed for single precision arithmetic between the IBM 370/168 and the CDC Cyber 73, and also from the performance differences between the two mainframes.
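The word-size differences referred to here, together with the mantissa lengths quoted earlier (24 and 56 bits for IBM System/370 single and double precision, 39 bits for the Harris SLASH series), can be put in decimal terms with the usual rule that an n-bit binary mantissa carries about n log10(2) decimal digits. The Python sketch below applies that rule. The 48-bit mantissa for the CDC machine is an assumption based on the CDC 6000-series floating point format and is not stated in this paper, and the estimate ignores the coarser normalization of IBM's hexadecimal format.

```python
import math

def decimal_digits(mantissa_bits):
    """Approximate decimal digits carried by a binary mantissa."""
    return mantissa_bits * math.log10(2)

# Mantissa lengths in bits; the CDC entry is an assumption, see lead-in.
formats = {
    "IBM System/370 single (32-bit word)": 24,
    "IBM System/370 double (64-bit word)": 56,
    "Harris SLASH (48-bit word)": 39,
    "CDC Cyber 73 single (60-bit word, assumed)": 48,
}

for name, bits in formats.items():
    print(f"{name}: {bits} bits ~ {decimal_digits(bits):.1f} decimal digits")
```

On this estimate the 39 bit SLASH mantissa carries between 11 and 12 decimal digits, consistent with the "10+ digit accuracy" quoted earlier, while IBM single precision carries only about 7.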
The large mainframe fine-tuning or optimization process is most often applied to versions of programs running on the IBM 370/168. The major optimization is performed automatically by using the IBM FORTRAN H Extended Compiler. The comparison cases reviewed earlier demonstrated a significant performance increase obtained by using H Extended instead of G1. The extra time and cost required for an H Extended compile step is repaid, sometimes several times over, by the savings obtained in the resultant execution. The H Extended Compiler is routinely used for "production" versions of programs and is even helpful during final stages of program checkout by offering good diagnostics and a cross reference capability.

Minor rewriting of the code is also performed to allow the compiler optimization process to be carried to the fullest possible extent. Increased execution efficiency results when input/output operations are performed on variables stored in contiguous storage locations, such as a COMMON block. The number of separate COMMON blocks is kept as small as possible and the variables ordered such that the largest arrays occur last in the block. This allows fewer base registers and less frequent base register loads, improving performance. In general, references to higher-dimensional arrays are slower than references to lower-dimensional arrays. Thus use of several one-dimensional arrays is more efficient than a single two-dimensional array if the two-dimensional array can logically be treated as a set of one-dimensional arrays. The use of EQUIVALENCE statements is avoided where possible since equivalenced variables weaken the optimization processes. Finally, on IBM machines, a logical IF statement will generate equivalent or better machine code than the corresponding arithmetic IF statement for simple comparisons. These considerations then form the basis for fine-tuning programs for an IBM machine.

A philosophy has been developed and is being refined for the routine use of interactive graphic display devices for the viewing of the numerical results. The results of, say, a pressure distribution calculation over a complex aerodynamic surface can be viewed more effectively with graphic displays rather than in conventional tabular forms. Such review of the data usually mandates that a minicomputer system be available; the general paucity of interactive graphics capability and the high cost of such graphic operations in large central systems are well known to central system users.
The SLASH computers system outlined here includes a large drum plotter, a high speed storage type CRT display with graphic capability, and several interactive terminals of either the teletype or alphanumeric CRT variety. AARL is in the process of fully implementing a software system which can access data returning from either a remote host site or from the in-house SLASH computers. The mechanism here is that a dynamically created disc file on the SLASH computers system is used to save the data, regardless of which machine was used to generate the results. This file can then be accessed by a post-processing program operating in the minicomputer system for the purposes of previewing the results, usually displayed in a plotted form, prior to committing the data to hard copy on the drum plotter. This allows the researcher to have a closer interaction with his program, which usually operates in a batch environment because of its memory and/or run time requirements. If the previewed results indicate that perhaps the program input specifications should be modified, this can be done by editing the input parameters at the interactive terminal and then resubmitting the job to whichever machine (i.e., locally to the SLASH 4 or via RJE to the large mainframe) is being used for the calculations.
The operating system of the SLASH computers allows any interactive terminal to submit jobs to the local batch stream or to the RJE queue. By this "interactive batch" technique, the results from large scale computations can be stored, saved, previewed, and optionally committed to hard copy at a significant cost savings over that incurred by using only the capabilities of a large central system mainframe.
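The "interactive batch" cycle described above (run locally or via RJE, save the results to a disc file, preview them, then either commit to hard copy or edit the inputs and resubmit) can be outlined in a few lines. The Python below is only a schematic of that control flow under stated assumptions: `run_job`, `preview_ok`, the halving of the `scale` input, and the file name are all hypothetical stand-ins, and nothing here reflects the actual Harris DMS or RJE software interfaces.

```python
import json
import os
import tempfile

def run_job(params, target):
    """Hypothetical stand-in for a batch run; target is 'local' or 'rje'."""
    # Toy "calculation": the result depends only on the inputs,
    # not on which machine produced it.
    return {"target": target, "values": [params["scale"] * x for x in range(5)]}

def save_results(results, path):
    """Save results to a disc file, whichever machine generated them."""
    with open(path, "w") as f:
        json.dump(results, f)

def preview_ok(path, limit):
    """Stand-in for the on-screen preview: accept if all values are in range."""
    with open(path) as f:
        values = json.load(f)["values"]
    return max(values) <= limit

def interactive_batch(params, target, limit, max_tries=5):
    """Run, save, preview; edit the inputs and resubmit until acceptable."""
    path = os.path.join(tempfile.mkdtemp(), "results.json")
    for attempt in range(1, max_tries + 1):
        save_results(run_job(params, target), path)
        if preview_ok(path, limit):
            return attempt, path  # ready to commit to hard copy
        params = dict(params, scale=params["scale"] / 2)  # edited input
    raise RuntimeError("no acceptable result within max_tries")

attempts, result_file = interactive_batch({"scale": 8.0}, "rje", limit=10.0)
print(attempts, result_file)
```

The point of the sketch is that the preview step reads only the saved file, so the same post-processing path serves results from either machine.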
The SLASH Family and a SLASH 6 vs. SLASH 4 Comparison
The Harris family of computers began with the Datacraft 6024/1 processor announced in 1968. (Datacraft Corporation became a division of the Harris Corporation in 1974.) The 6024/1 was a 600 nanosecond machine, an excellent processor for large scale computations. The 6024/3 (1 µsec.) followed in 1970, with the SLASH 5 (950 nsec.) coming in 1971, the SLASH 4 (750 nsec.) with virtual memory in 1973, and the SLASH 7 (400 nsec.) in 1975. Each new machine basically offered a central processor architecture similar to its predecessor while incorporating some available new technology and adding new features. The SLASH 6, announced in June 1976, is based on a completely different processor architecture, utilizing a microprogrammed asynchronous CPU with a 48 bit central system bus structure. Additional SLASH 6 information appears in Appendix A.

The earliest information about the Harris SLASH 6 processor indicated that perhaps it might be as much as 20% faster than the SLASH 4. As shown by Table II, this is clearly not the case as the SLASH 6 runs slightly but consistently slower for all the jobs listed in the table. The times shown are for the total job time for identical jobs on the two SLASH computers. The versions of the FORTRAN Compiler, Assembler, Cataloger, and support library were the same on the SLASH 6 as on the SLASH 4 with one exception as noted, and compiler options were identical. The jobs reflected a range from simple compile and catalog (JOB5, JOB6) to a compile, catalog and execution of one of the airfoil analysis codes (JOB4) requiring heavy floating point operations.

TABLE II
Harris SLASH 4 and Harris SLASH 6 Comparisons

Job Stream        Job Time (sec)   Job Time (sec)   Percent
Identification    SLASH 4          SLASH 6          Slower+
Job 1                  80.591           85.008       5.481%
Job 2                  15.450           16.157*      4.576%
Job 3                  19.360           20.983       8.383%
Job 4                1174.862         1210.234       3.011%
Job 5                   4.676            4.981       6.523%
Job 6                   5.337            5.714       7.064%
Totals               1300.276         1343.077       3.292%**

NOTE: All comparisons are for FORTRAN Compiler Version 24 except as indicated below.
*Version 26 Compiler.
**Reflects mostly time of Job 4.
+SLASH 6 is slower than SLASH 4 by X%, based on SLASH 4.

AARL has tested a single program on several computers and the results are given in Table III.

TABLE III
Comparisons of Several Computers for a Single Program

Computer        Hardware Floating Pt.   Word Size   Execution Time
Tested          Available/Used          (Bits)      (Seconds)
IBM 370/165     Yes/Yes                 32             13.24
IBM 370/168     Yes/Yes                 32             11.39
CDC 6400        Yes/Yes                 60             68.43
CDC Cyber 73    Yes/Yes                 60             52.28
DEC PDP-15      Yes/Yes                 18            340*
GA SPC-16/65    No/No                   16            970*
SLASH 1         Yes/No                  24            146.65
SLASH 3         Yes/Yes                 24             90**
SLASH 3         Yes/No                  24            244**
SLASH 5         No/No                   24            244.29
SLASH 4         Yes/Yes                 24             60.11
SLASH 4         Yes/No                  24            183.22
SLASH 6         Yes/Yes                 24             64.75

NOTE: Time determined by timing subroutine unique for each machine, except as indicated below.
*Timing mechanism unknown.
**Timing determined by stop watch.

This "benchmark" program was used in evaluating the candidate computer systems for the first phase of the AARL Digital Computer and Data Acquisition System in 1973. More recently, it has been run on the SLASH 4, the SLASH 6, an IBM 370/168, and a CDC Cyber 73. While there is always a question as to what any given single benchmark program actually tests, the speed of execution
indicates the efficiency of the instruction set and how well the compiler optimizes the coding. This FORTRAN program was constructed with no particular machine in mind, and it should not be used as an overall recommendation or condemnation of any specific computer. The execution times listed are for the case solution time only; no compile time or link-edit time is included. The program has no required inputs and produces little printed output; it is therefore compute bound and the results reflect mostly compiler generated code efficiency and central processor speed differences.
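To make the comparisons in Table III easier to read, the execution times can be normalized to a reference machine. The Python sketch below does this for a subset of the rows, using the IBM 370/168 as the reference, and also shows the gain from using the floating point hardware on the SLASH 4. The dictionary layout, the row labels, and the variable names are illustrative only.

```python
# Execution times in seconds, taken from Table III. Hardware floating
# point was used unless the label says otherwise.
times = {
    "IBM 370/168": 11.39,
    "CDC Cyber 73": 52.28,
    "SLASH 4": 60.11,
    "SLASH 4 (hardware FP not used)": 183.22,
    "SLASH 6": 64.75,
}

reference = times["IBM 370/168"]
relative = {name: t / reference for name, t in times.items()}

for name, ratio in relative.items():
    print(f"{name}: {ratio:.1f}x the IBM 370/168 time")

# Gain from the floating point hardware, and SLASH 6 vs. SLASH 4.
fp_speedup = times["SLASH 4 (hardware FP not used)"] / times["SLASH 4"]
slash6_vs_4 = (times["SLASH 6"] - times["SLASH 4"]) / times["SLASH 4"] * 100
print(f"SLASH 4 hardware FP speedup: {fp_speedup:.2f}")
print(f"SLASH 6 slower than SLASH 4 by {slash6_vs_4:.1f}%")
```

On this single benchmark the SLASH 6 is again a few percent slower than the SLASH 4, in line with the job stream timings of Table II.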
Maintenance and Operating Costs
Both software and hardware maintenance on the dual SLASH processor system are done in-house. A computer technician devotes approximately 80% of full time to corrective and preventative maintenance and to the design of new device interfaces. A single system analyst spends approximately 50% of full time on software maintenance and development related to the overall system (i.e., work that cannot be attributed to a single research project). These efforts plus the nominal cost of expendable supplies result in an average monthly maintenance and operating cost of approximately $1800. This cost appears to be relatively independent of the system size. That is, our operating costs did not change appreciably when the SLASH 4 processor subsystem was added.
The computer facility was financed by The Ohio State University. The capital equipment and implementation costs are being recovered with connect rate charges to all users. Since the system is used for data acquisition as well as straight numerical computations, wall-clock time accounting cannot be used. Instead, the concept of connect time is employed, where users are charged if they are connected to the system. For example, a user who is connected to an A/D converter must be charged for use of the system, even though data acquisition is not in progress, since his connection precludes use of that portion of the system by others.
The charging scheme is designed to recover the installation costs over a five-year period. For the original SLASH 5 system this required an average recovery of $2000 per month at a connection rate charge of $45/hour. For the dual processor configuration, a recovery of $5200 per month is required with a connection rate charge of $76/hour. Over the last year, we have had little difficulty in meeting the scheduled cost recovery.
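The recovery targets and connect rates quoted above imply a minimum number of billable connect hours per month. The short Python sketch below works that arithmetic out. The function and dictionary names are illustrative only, and the calculation assumes that the monthly target is met entirely from connect charges at the single quoted rate.

```python
def required_connect_hours(monthly_recovery_dollars, rate_dollars_per_hour):
    """Billable connect hours per month needed to meet the recovery target."""
    return monthly_recovery_dollars / rate_dollars_per_hour


# Figures quoted in the text: (monthly recovery target in $, connect rate in $/hour).
configurations = {
    "SLASH 5 (original)": (2000.0, 45.0),
    "dual processor": (5200.0, 76.0),
}

hours_by_config = {
    name: required_connect_hours(target, rate)
    for name, (target, rate) in configurations.items()
}

for name, hours in hours_by_config.items():
    print(f"{name}: about {hours:.1f} billable connect hours per month")
```

Under these assumptions the original system needed roughly 44 connect hours per month and the dual processor configuration needs roughly 68.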
Conclusions
The AARL approach to complex numerical experimentation allows great flexibility in the choice of computer to conduct a set of specific calculations. The dial-up capability to other computing systems, combined with the in-house computing power available, has provided significant advantages over more conventional arrangements which employ either a large, single mainframe computer or a dedicated minicomputer system. In the dial-up mode of operation utilizing modems operating over standard telephone lines, the AARL computing system can interact with any host computing site which can support communications from a remote terminal. The SLASH computers used at AARL offer a cost-effective alternative to large scale machines for a large majority of the fluid mechanics calculations performed at AARL. The Harris SLASH 4 or the Harris SLASH 6 are well suited for the varied tasks of program development, "production" running, and Remote Job Entry communications with the large machines. While the approach described has been used for numerical experiments in fluid mechanics, it can be applied to any discipline requiring extensive numerical calculations.
Appendix A. The Digital Computer and Data Acquisition System.
The AARL Digital Computer and Data Acquisition System is an example of state-of-the-art techniques in the computer and electronics fields applied to experimentally and theoretically oriented research. The major components of the data acquisition and reduction portion of the system and the inter-relationships among the devices and the central processing units are shown schematically in Figure App-1 below. The system can be broken into four groups of components for descriptive purposes: (1) the analog front end consisting of various analog and signal conditioning devices; (2) the central processing units (CPU); (3) the various input and output peripheral devices (I/O devices) to handle assorted I/O functions associated with more typical computer systems; and (4) the Remote Job Entry (RJE) subsystem which enables communication with any remote host computer in a dial-up mode of operation.
The analog front end serves to interface continuous analog signals to the data acquisition central processing unit in digital (discrete) form. Analog signals enter the central patch panel where they may be routed
[Figure 1. The AARL digital computer and data acquisition system. Block diagram of the SLASH 5 and SLASH 6 central processing units with their memory, disc and magnetic tape storage, analog-to-digital and digital-to-analog converters, terminals, plotter, and RJE synchronous controller.]
as desired to the various signal conditioning devices and/or converters. Two analog-to-digital (A/D) converter systems are available: (1) a high speed multiplexed A/D converter system which accepts up to 128 differential high-level inputs with a full scale voltage of +10V, provides a resolution of 10 bits, has an input impedance of 100 megohms, and has a throughput rate of 100 kHz; (2) a medium speed multiplexed converter system which accepts up to 64 differential low-level inputs with a full scale voltage of +1000 millivolts, provides a resolution of 12 bits, and has a throughput rate of 8 kHz. The latter system has 8 program-controllable gain ranges and is transformer coupled to allow high common mode voltages. Eight pre-amplifiers are currently available for signal conditioning, and ten bridge balance and span control units are included to accommodate strain gage bridge type sensors. A 5 channel digital-to-analog (D/A) system is also available which can be used for transmission of various control signals. Part of the D/A system is presently used to drive an analog X-Y plotter. Also included are 16 digital relay outputs, 8 discrete input switches, and a programmable 10 kHz interval timing unit.
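The converter figures above can be turned into two working numbers: the size of one quantization step and the sampling rate available to each channel when the multiplexer scans every input. The Python sketch below is illustrative only. It assumes that the quoted throughput is the aggregate rate shared by all scanned channels and that the quantized span equals the quoted full scale magnitude (a bipolar range would double the step size). The class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiplexedADC:
    name: str
    channels: int            # maximum number of differential inputs
    bits: int                # converter resolution
    full_scale_volts: float  # quoted full scale magnitude
    throughput_hz: float     # assumed aggregate conversions per second

    def levels(self):
        """Number of distinct output codes."""
        return 2 ** self.bits

    def step_volts(self):
        """One quantization step, assuming span == full_scale_volts."""
        return self.full_scale_volts / self.levels()

    def per_channel_rate_hz(self, active_channels=None):
        """Sampling rate per channel when scanning `active_channels` inputs."""
        n = self.channels if active_channels is None else active_channels
        if not 1 <= n <= self.channels:
            raise ValueError("active_channels out of range")
        return self.throughput_hz / n


HIGH_SPEED = MultiplexedADC("high speed", 128, 10, 10.0, 100_000.0)
MEDIUM_SPEED = MultiplexedADC("medium speed", 64, 12, 1.0, 8_000.0)

for adc in (HIGH_SPEED, MEDIUM_SPEED):
    print(adc.name, adc.levels(), adc.step_volts(), adc.per_channel_rate_hz())
```

With all inputs in use, these assumptions give 781.25 samples per second per channel on the high speed system and 125 on the medium speed system. Scanning fewer channels raises the per-channel rate proportionally.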
The AARL Digital Computer and Data Acquisition System utilizes two central processing units which are operated in a non-redundant dual processor configuration. One processor is assigned the on-line data acquisition and reduction tasks while the second generally is assigned most other tasks including but not limited to off-line data reduction, program development and maintenance, and heavy floating point scientific calculations. The processors are directly connected via a CPU-to-CPU link and, in addition, they share a disc cartridge mass storage device.
The data acquisition central processing unit is a Harris Corporation SLASH 5 and consists of an arithmetic unit, control unit, interface elements for the planar core memory, and the input-output channel interface. The processor has a 950 nanosecond full cycle time and a fixed word length of 24 bits plus parity. There are over 120 generic instruction types available at the assembly language level. The SLASH 5 operates on and from 24 bit data and instruction words. The SLASH 5 employs a multi-access bus structure, fully parallel binary arithmetic, fully buffered I/O channels, and single address capability direct to 96 K bytes and indirect and/or indexed to 192 K bytes.
There are five general purpose registers, three of which may be used for indexing, which is performed without a speed penalty. Memory may be accessed at the word, double word, and byte levels. Memory size in this processor is 32,768 words (32K) and is expandable to 64K words.
The second central processing unit is a Harris Corporation SLASH 6, the newest member of the Harris SLASH Series family. The SLASH 6 offers total software compatibility with the SLASH 5 but offers a microprogrammed architecture asynchronous CPU with a central system bus structure. Other state-of-the-art SLASH 6 features include MOS memory with error correction, bipolar microprocessor Arithmetic-Logic Unit (ALU), and microcode execution PROMS. The ALU is comprised of six high-speed microprocessor chips, each representing a 4 bit logic slice. The auxiliary PROMS are utilized for instruction decoding and subsequent microcode execution, resulting in program and feature compatibility with the Harris SLASH 5 processor. Memory size in this processor is 65,536 words (64K) and is expandable to 256K words via a demand paging virtual memory option.
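The word counts and byte limits quoted for the two processors can be cross-checked with a little arithmetic. The Python sketch below is a consistency check only; it assumes 8-bit bytes (so that each 24 bit word holds three bytes, parity excluded) and that "K" means 1024. The names used are illustrative.

```python
WORD_BITS = 24          # fixed word length quoted in the text (parity excluded)
BYTE_BITS = 8           # assumption: 8-bit bytes
BYTES_PER_WORD = WORD_BITS // BYTE_BITS

ALU_SLICES = 6          # microprocessor chips in the SLASH 6 ALU
BITS_PER_SLICE = 4


def words_to_kbytes(words):
    """Convert a count of 24 bit words to K bytes (K = 1024)."""
    return words * BYTES_PER_WORD / 1024


memory_words = {
    "SLASH 5 installed": 32_768,
    "SLASH 5 maximum": 64 * 1024,
    "SLASH 6 installed": 65_536,
    "SLASH 6 maximum (virtual memory option)": 256 * 1024,
}

memory_kbytes = {name: words_to_kbytes(w) for name, w in memory_words.items()}
alu_width_bits = ALU_SLICES * BITS_PER_SLICE

for name, kbytes in memory_kbytes.items():
    print(f"{name}: {kbytes:.0f} K bytes")
print(f"ALU width from bit slices: {alu_width_bits} bits")
```

Under these assumptions 32K words correspond to the 96 K byte direct addressing limit and 64K words to the 192 K byte indirect or indexed limit, while six 4 bit slices give the 24 bit word width.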
The input-output system, exclusive of the devices which comprise the analog front end, consists of the following peripheral devices: (1) a removable pack disc system with 70.5 megabytes formatted capacity and a 342.7 kHz transfer rate; (2) a cartridge disc system including one fixed disc platter and one removable disc cartridge, each with a 5.4 megabytes formatted capacity and an 89.5 kHz transfer rate; (3) a dual density 800/1600 bits/inch 9 track industry compatible magnetic tape drive with a nominal tape speed of 75 inches/second and vacuum column tape handling; (4) an 800 bits/inch 9 track magnetic tape drive with a nominal tape speed of 45 inches/second; (5) a 300 cards/minute card reader; (6) a 135 characters/line, 400 lines/minute line printer; (7) an ASR-33 standard teletype with paper tape facilities; (8) a Tektronix 4010 cathode ray tube (CRT) operating over an asynchronous interface at a 9600 baud rate providing graphic as well as alphanumeric display capabilities; and (9) a four pen 36 inch drum type plotter offering 0.0025 inch resolution and drawing speeds of up to 8 inches/second on a major axis.
The Remote Job Entry (RJE) subsystem, which is shown schematically in Figure App-2, supports communication with a remote host computer. Such communications are carried out concurrently with other computer tasks including real-time data acquisition and reduction. Included in the RJE subsystem are: (1) a synchronous controller with baud rate to 9600 bits/second; (2) a Bell system compatible modem with dial-up telephone data set; and (3) a CRT display device providing 24 lines with 80 characters/line of display.
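As a rough check on how the RJE link compares with the devices that feed it, the Python sketch below converts the quoted card reader and line printer speeds to characters per second and sets them against the 9600 bits/second controller. Several values are assumptions not stated in the text: 80 columns per card, 8 bits per transmitted character, every print line filled to 135 characters, and no protocol overhead. The names are illustrative.

```python
LINK_BITS_PER_SECOND = 9600   # synchronous controller rate quoted in the text
BITS_PER_CHARACTER = 8        # assumption: 8 bits per transmitted character
CARD_COLUMNS = 80             # assumption: standard 80-column cards


def chars_per_second(units_per_minute, chars_per_unit):
    """Peak character rate of a unit-record device."""
    return units_per_minute * chars_per_unit / 60.0


card_reader_cps = chars_per_second(300, CARD_COLUMNS)   # 300 cards/minute
line_printer_cps = chars_per_second(400, 135)           # 400 lines/minute, 135 chars/line
link_cps = LINK_BITS_PER_SECOND / BITS_PER_CHARACTER    # raw link capacity

print(f"card reader : {card_reader_cps:.0f} characters/second")
print(f"line printer: {line_printer_cps:.0f} characters/second")
print(f"RJE link    : {link_cps:.0f} characters/second")
```

Under these assumptions the raw link capacity is above the peak rate of either device; real throughput would be lower once protocol overhead and line turnaround are included.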
[Figure 2. Remote job entry subsystem of AARL digital computer and data acquisition system]
The CRT is used for RJE operator communications but may be utilized in an interactive fashion when RJE is not in progress. The card reader and line printer are used to support RJE activities as needed. The RJE subsystem can operate under three disciplines:
* CDC UT-200 for the 6000/7000 series
* IBM 2780/3780 for the System 360/370 series
* UNIVAC 1004 for the 1100 series
Each processor in the AARL Digital Computer and Data Acquisition System is under the control of the Harris Series 6000 Disc Monitor System (DMS). DMS is a real-time operating system that provides foreground multiprogramming concurrent with sequential batch processing in the background. The foreground is designed for application-related programs which could control a wind tunnel, process real-time data from an acoustic experiment, or interact with multiple terminal users in either a local or remote fashion. These programs receive highest priority and their requirements are met first. Batch processing is conducted in the background and is never time-critical, so that background is serviced when processor time and memory space are available.
Salient features of the DMS are:
* Dynamic loading of foreground programs
* Dynamic memory allocation services
* Dynamic spooled I/O for any list output device
* Optional spooled job stream input from any input device
* Full file security for every user including read, write and delete protection modes with optional password
* Re-entrant foreground program capabilities
* Program priority structure that governs the allocation of memory, disc files, and processor time; 255 priority levels
* Time slicing among programs executing at the same priority
* Program communications via a special Common area, initiation parameter passing, or a program switch word
* Timer scheduling of periodic foreground programs
* Automatic checkpointing and reloading of the background memory area as required by activation of non-resident foreground programs
* Re-entrant editor package for terminal users
* Complete memory protection of all inactive programs from currently active programs
* Concise job control language for batch processing
* Complete operator control over the system environment via the console typewriter or CRT
* System file manager that maintains program and data files in source and object formats
* FORTRAN IV interface routines for foreground services
* Sequential, indexed sequential, and direct random access methods for data files on disc
* Optional automatic disc file compression and blocking
* Overlay Link Cataloger that prepares and stores programs on disc in a format designed for rapid loading and relocation
* RJE subsystem protocol interpreter which performs most of the normal operator functions automatically but does not require a dedicated terminal
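Two of the DMS features listed above, a program priority structure and time slicing among programs at the same priority, can be illustrated with a toy scheduler. The Python sketch below is not the DMS implementation. It assumes that a larger number means higher priority, that levels run from 1 to 255, that all programs are ready at the start, and that the background batch job is simply the lowest-priority entry. The function and program names are hypothetical.

```python
from collections import deque


def run_schedule(programs, time_slice):
    """Return the order in which programs receive processor time.

    programs: list of (name, priority, work_units) tuples.
    Highest priority always runs first; programs sharing a priority
    take turns, each getting at most `time_slice` units per turn.
    """
    queues = {}
    for name, priority, work in programs:
        if not 1 <= priority <= 255:
            raise ValueError("priority must be in 1..255 (assumed numbering)")
        queues.setdefault(priority, deque()).append((name, work))

    trace = []
    while queues:
        top = max(queues)                    # highest priority still waiting
        name, remaining = queues[top].popleft()
        ran = min(time_slice, remaining)
        trace.append((name, ran))
        if remaining > ran:                  # unfinished: back of its own queue
            queues[top].append((name, remaining - ran))
        if not queues[top]:
            del queues[top]
    return trace


programs = [
    ("batch", 1, 3),              # background job, lowest priority
    ("tunnel_control", 200, 4),   # foreground programs at equal priority
    ("acoustic_data", 200, 2),
]
trace = run_schedule(programs, time_slice=2)
for name, units in trace:
    print(f"{name} runs for {units} unit(s)")
```

In this run the two foreground programs alternate until both finish, and only then does the batch job receive processor time, which mirrors the statement that background work is serviced when processor time is available. Preemption by newly arriving programs, memory allocation, and checkpointing are deliberately left out.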
The computer system is operated in an "open-shop" mode. All users have full access at both the hardware and software levels to the majority of the features of the system. The computer system is used extensively in on-line, real-time, interactive data acquisition and reduction.
Literature Cited
1. Robinson, A. L.: "Computational Chemistry: Getting More from a Minicomputer", Science, (1976), 193, pp. 470-472.
2. Schaeffer, H. F.: "Are Minicomputers Suitable for Large Scale Scientific Computation?", paper presented to 11th Annual IEEE Computer Society Conference, Washington, D. C., September 1975.
3. Petrie, S. L.: "Design of a Digital Data Acquisition System", paper presented to the 39th Semi-Annual Meeting of the Supersonic Tunnel Association, Bethesda, Maryland, March 1973. (Referenced by author's permission).
4. Freuler, R. J.: "State of the Art Data Acquisition and Reduction Techniques for Transonic Airfoil Testing", paper presented to 6th International Congress on Instrumentation in Aerodynamic Simulation Facilities (ICIASF), Ottawa, September 1975.