PROBABILITY (ID
EXPERIMENTAL ERRORS IN
SCIENCE
u«
G
m
\\
Probability and
Experimental Errors in
Science
Scie...
136 downloads
878 Views
20MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
PROBABILITY (ID
EXPERIMENTAL ERRORS IN
SCIENCE
u«
G
m
\\
Probability and
Experimental Errors in
Science
Science Editions®
JOHN WILEY
and SONS, INC.
NEW YORK
PROBABILITY
AND EXPERIMENTAL ERRORS IN
SCIENCE An elementary survey
LYMAN
G.
PARRATT
Professor of Physics
Chairman of the Department of Physics Cornell University Ithaca,
New York
COPYRIGHT
r
1961
BY
JOHN WILEY
All Rights
& SONS, INC.
Reserved
This book or any part thereof must not
be reproduced
in
any form without the
written permission of the publisher.
Library of Congress Catalog Card Number: 61-15406 Printed in the United States of America First Science Editions printing 1966 Science Editions Trademark Reg. U.S. Pat.
Off.
DEDICATION This book
is
dedicated
to those timeless intellectuals
who
have so shaped our cultural pattern
that experimental science can live as a part of a science that seriously
tampers with
the plaguing and hallowed uncertainty in
man's comprehension of and of the universe.
his
gods
it,
"He
that
is
unaware of
his
ignorance will be
misled by his knowledge."
WHATELY
Preface
Although
the concepts of probability
everything in science, the student in
haphazard fashion. His
first
is
and
too often
statistics left to
underlie practically
acquire these concepts
contact with quantitative probability
may
be in extracurricular gambling games, and his second in some requirement of a laboratory instructor
who
insists arbitrarily that
a
±
number should
follow a reported measurement. In his undergraduate training, he
may be
introduced in a social science to sampling procedures for obtaining data, in a biological science to formulas for the transmission of genes in inherit-
ance characteristics, and in a physical science to Heisenberg's uncertainty
which he intuitively concludes is essentially nonsense. Such good as far as they go (except, of course, any excess in gambling and the conclusion as to nonsense), are left woefully disconnected and do not prepare him adequately to understand the intrinsic "open-ended" feature of every measurement and concept in science. Probability is the lens that brings science into philosophic focus. Without a fairly clear comprehension of this fact, the scientist cannot be really "at home" in his own field. And, at a time when as never before the results of science rush on to overwhelm society, the scientist, for lack of focus, is a poor ambassador. Not only is he spiritually uncomfortable in his own field, but science itself, as he portrays it, cannot fit comfortably in the society of other human activities and knowledge. In a very humble way, we are attempting at Cornell University to principle,
experiences,
introduce the undergraduate student to the unifying concepts of probability
Preface
vim
and
statistics as
they apply in science. This
The
a difficult task at this level of
is
broad scope is no doubt the most mature and sophisticated one with which man has ever struggled. But it is believed that the best time to instill a general attitude in a student he will have less trouble later in maturing properly. is when he is young This is admittedly the objective of a teacher rather than of a book. But experience shows the impracticality of trying to teach undergraduate students without a book. The present volume has been assembled to fill the student's development.
subject in
its
—
this
pedagogical need, at least to
fill it
be a base from which the teacher
A
is
patterned to
to discuss further aspects,
deepen and broaden the understanding of
especially those aspects that science.
The book
in part.
may go on
few suggestions are given of such excursions in the under-
standing of science, particularly in the
first
chapter.
comments on the different meanings of probability, then goes into the classical games of chance as examples of the classical or a priori meaning. Although these games have almost nothing to do with science, they provide a convenient framework for the teaching The book begins with
brief
of basic principles, for example, of combinatorial analysis which
is
funda-
probability reasoning, and of the sampling process inherent in
mental in
all
scientific
measurements.
The games
are also remarkably successful in
arousing the undergraduate's interest in the subject, and in providing
numerous problems
to help
quantitative concepts.
we
him develop
Once
his feelings for probability into
the basic principles are well established,
turn to their applications in problems
more
serious than gambling
games. In the bulk of the book, emphasis
is
placed on the experimental definition
of probability rather than on the classical definition. After the ideal games,
and
after
bility,
comments on
the role in science played by both kinds of proba-
namely, classical and experimental, the discussion
ments and to the general
maximum
statistical concepts.
shifts to
measure-
These concepts include
likelihood, rules for the propagation of errors, curve fitting,
several applications of the least-squares method, consistency tests, a
little
on the analysis of variance, a little on correlation, and so on. Then, the normal (Gauss) and the Poisson models of mathematical probability are explored both analytically and with typical problems. The normal and the Poisson models are given about equal weight, this weighting being roughly commensurate with their invocations in modern scientific measureof statistics our discussion is very an undergraduate student in an experimental science. But in both statistics and probability, the point of view taken in this book is somewhat different from that of the professional ments.
Especially in
elementary
the
subject
—
just the essentials for
statistician or
mathematician.
Preface
ix
Numerous problems
are given in each of the five chapters.
Many
of
these problems are intended to provoke discussion,
and the instructor should look them over carefully before he assigns them to the student. The most commonly used equations in statistics and probability are gathered together for convenience and placed at the end of the book, just before the index.
am
I
K.
I.
pleased to express
my
indebtedness and thanks to Professor
Greisen of Cornell University for reading the manuscript, for checking
and for making numerous helpful general suggestions. And, needless to say, practically all that I know about the subject I have learned from others.
the problems,
When 'Omer smote his bloomin' lyre, 'E'd 'eard men sing by land and sea; An' what he thought 'e might require, 'E went an' took the same as me!
—
Kipling In partial acknowledgment, the following books are listed and
them 1.
I
recommend
for collateral reading.
T. C. Fry, Probability
and
Its
Engineering Uses (D.
Van Nostrand
Co.,
New
York,
1928). 2.
3.
4.
G. Hoel, Introduction to Mathematical Statistics (John Wiley & Sons, New York, 1954), 2nd ed. A. G. Worthing and J. Geffner, Treatment of Experimental Data (John Wiley & Sons, New York, 1943). H. Cramer, The Elements of Probability Theory and Some of Its Applications P.
(John Wiley
& Sons, New
5.
A. M. Mood, Introduction York, 1950).
6.
B.
7.
E. B. Wilson,
York, 1955). to Theory of
Statistics
(McGraw-Hill Book Co.,
W. Lindgren and G. W. McElrath, Introduction (Macmillan Co., New York, 1959).
to Probability
and
New
Statistics
Jr., An Introduction to Scientific Research (McGraw-Hill Book Co., York, 1952). R. B. Lindsay and H. Margenau, Foundation of Physics (John Wiley & Sons, New York, 1943). William Feller, An Introduction to Probability Theory and Its Applications (John Wiley & Sons, New York, 1957), 2nd ed. R. D. Evans, The Atomic Nucleus (McGraw-Hill Book Co., New York, 1955), Chapters 26, 27, and 28.
New 8.
9.
10.
11.
Emanuel Parzen, Modern
New Ithaca,
May
Probability
and
Its
Applications (John Wiley
&
Sons,
York, 1960),
New
1961
York
LYMAN G. PARRATT
Contents
Chapter
I
EARLY DEVELOPMENTS: IDEAL GAMES A.
B.
INTRODUCTION,
1
1-1.
"Three" Meanings of Probability, 2
1-2.
Historical Perspective, 6
CLASSICAL (A PRIORI) PROBABILITY, 1-3.
Definition of Classical Probability, 8
1-4.
Probability Combinations, 9
8
Mutually exclusive events, 10 Independent events, 10
Compound
events: general addition theorems, 11
Conditional probability
:
multiplication
theorem, 12 1-5.
Inferred Knowledge, 17
1-6.
Problems, 20
1-7.
Combinatorial Analysis, 23 Permutations, 23
24 Sampling without replacement, 26 Sampling with replacement, 26
Stirling's formula,
Combinations: binomial
coefficients,
Binomial distribution formula, 30
Multinomial
coefficients,
35
Multinomial distribution formula, 37 XI*
27
Contents
xii
Sampling from subdivided populations without lottery problem and bridge replacement: hands, 39 1-8.
Classical Probability
and Progress
in
Experimental
Science, 41
Applications in statistical mechanics, 42 Classical statistics, 43
Quantum 1-9.
C.
bosons and fermions, 43
statistics:
Problems, 45
EXPERIMENTAL 1-10. Definition
(A POSTERIORI) PROBABILITY, 40
of Experimental Probability, 49
Number of "equally probable
outcomes''''
meaningless, 51
Chapter 2
1-11.
Example: Quality Control, 51
1-12.
Example: Direct Measurements
in Science, 52
DIRECT MEASUREMENTS: SIMPLE STATISTICS A.
B.
MEASUREMENTS IN SCIENCE: ORIENTATION, 2-1.
The Nature of
2-2.
Trial
2-3.
Random
a Scientific Fact, 55
Measurements and
Statistics,
2-4.
Probability Theory in Statistics, 61
2-5.
Computed Measurements, 62
2-6.
Conclusions, 63
BASIC DEFINITIONS: FIGURES, ETC., 63 2-7.
56
Variation, 59
ERRORS, SIGNIFICANT
Types of Errors, 64
Random
(or accidental) error, 64
Systematic error, 67 Precision and accuracy, 68
Discrepancy, 69 Blunders, 69 2-8.
C.
Significant Figures
and Rounding of Numbers, 69
FREQUENCY DISTRIBUTIONS AND PRECISION INDICES, 2-9.
71
Typical Distributions, 72
Terminology: types of distributions, 72 2-10. Location Indices,
76
Median, 76
Mode Mean
(most probable value), 76 (arithmetic average)
m
and
/i,
76
55 55
Contents
xiii
2-11. Dispersion Indices, 79
Range, 79 Quantile, 79
Deviation {statistical fluctuation), 79
Mean
{average) deviation, 80
Experimental standard deviation
s,
82
Moments, 84 Variance a2 : "universe" or "parent"" standard deviation a, 86
Degrees offreedom, 89 Variance: binomial model
Standard deviation error) s m
,
distribution, 91
mean {standard
in the
92
Skewness, 94
Other dispersion
indices,
95
Conclusions, 97 2-12. Problems, 98
Chapter
3
OF MEASUREMENTS FUNCTIONAL RELATIONSHIPS
STATISTICS
3-1.
IN 101
Method of Maximum Likelihood,
103
p
in the
binomial distribution, 105
//
and a
in the
/<
in the
Poisson distribution, 107
normal
distribution,
106
Instrumental parameter, 107 Precision in the
maximum
likelihood
estimate, 108
Standard error a m 109 ,
3-2.
Propagation of Errors, 109
Nonindependent errors: systematic
Random errors, 1 Mean {and fractional mean)
errors,
110
1
deviation, 113
Standard {and fractional standard) deviation, 114
Sum
or difference,
1 1
Product or quotient: factors raised powers,
1
to various
16
Other functions, 118 3-3.
Different Means, 118
Weighted mean,
1 1
Weighted dispersion
indices,
120
Consistency of two means: the
Comparison of precisions ments: the
F test,
123
in
t test,
120
two sets of measure-
Contents
xiv 3-4.
Curve Best
Fitting: fit
Least-Squares Method, 126
of a straight
line,
127
Straight line through origin,
1
3
Best fit of parabola, 132 Best fit of a sine curve, 133 Criterion for choice offunctional relation, 133 3-5.
Justification of Least-Squares
Maximum
Method from
Likelihood, 135
3-6.
Data Smoothing, 139
3-7.
Correlation, 140 Correlation coefficient, 140
Covariance, 143 Interpretation, 144 3-8.
Inefficient Statistics, 146
Location index, 147 Dispersion indices, 147
Standard deviation
in the
mean (standard error), 148
Examples, 148 3-9.
3-10.
Conclusions and Design of Experiments, 148 Summary, 150
3-11. Problems, 150
Chapter 4
NORMAL PROBABILITY DISTRIBUTION 4-1.
Derivation
Normal (Gauss)
of the
156
Probability
Density Function, 157
Shape of the normal frequency
curve, 161
Normalization, 161 4-2.
Errors in the Normal Approximation, 163
4-3.
Significance
of the
Bernoulli
Trials
in
Actual
Measurements, 164 Elementary errors, 164
Mechanical analog for Bernoulli-type elementary errors in continuous sample space, 166 Characteristics of elementary errors, 168 4-4.
The Error Function, 169
4-5.
Precision Indices, 170
Standardized variables,
Mean
69
deviation, 170
Standard deviation, Probable error,
1
Confidence limits 4-6.
1
1
72
73 in general,
1
73
Probability for Large Deviations, 175 Rejection of a
"bad" measurement, 175
Chauvenet's criterion for rejection, 176 4-7.
Test of a Statistical Hypothesis: Example, 178
Contents
xv 4-8.
Test of Goodness of Fit of a Mathematical Model, 180 Graphical comparison offrequency curves, 181
Graphical comparison of cumulative distribution functions: probability paper, 181
Skewness and kurtosis, 183 The x2 test, 184 4-9.
Conclusions, 191
4-10. Problems, 192
Chapter 5
POISSON PROBABILITY DISTRIBUTION 5-1.
Introduction, 195
5-2.
Derivation of the Poisson Frequency Distribution Function, 197
Rare
195
events, 196
Shapes of Poisson frequency distributions, 198 Poisson to normal distribution, 199 Normalization, 201 5-3.
5-4.
Errors in the Poisson Approximation, 201 Precision Indices, 202
Standard deviation, 202 Fractional standard deviation, 203
Standard deviation
in
a single measurement, 204
Probable error, 206 Skewness, 207 5-5.
Significance of the Basic Bernoulli Trials, 207
5-6.
Goodness of Fit,
5-7.
Examples of Poisson Problems, 213 Deaths from the kick of a mule, 213
Two mechanical
analogs, 209
21
Spatial distribution, 212
Radioactive decay, 215
Counts per unit time: precision, 218 examples, 221 Composite of Poisson Distributions, 224 Measuring with a background, 224
More
5-8.
Precision, 225 5-9.
Interval Distribution, 227
Dispersion indices, 229
Resolving time: lost counts, 229
Coincidence counting, 230 Conclusions, 231 5-10. Problems, 231
SUMMARY
239
GLOSSARY INDEX
241
249
"For there was never yet philosopher That could endure the toothache patiently,
However they have writ the style of gods And make a pish at chance and sufferance." WILLIAM SHAKESPEARE
"Life
is
a school of probability."
I
Developments:
Early Ideal
A.
WALTER BAGEHOT
Games
of
Chance
INTRODUCTION
Every fact in science, every law of nature as devised from observations, "open-ended," i.e., contains some uncertainty and is subject to future improvement. This may sound harsh but it is simply the way of things, and this book is written to help the student of science to understand it. is
intrinsically
The
subject of probability has
introductions.
many
facets
Sections 1-1 and 1-2 of the present chapter,
The student
is
and needs many
This book has three introductions: advised to read them
progresses into Section
and
all in
(3)
different
(1) the Preface, (2)
Section 2-1 of Chapter
the order
named
2.
before he
1-3.
Although the primary objective of the book with the modern philosophic focus in science, probability, the elementary tools (methods
is
to acquaint the student
viz.,
through the lens of
and formulas) of
statistics
and
of the popular mathematical models of probability are also discussed to
about the extent commonly needed in the practice of an experimental science.
Actually,
most of the pages are devoted
to these
more or
less
conventional practical topics.
The treatment of the
subject presumes that the reader has studied
some
Elementary and intermediate algebra suffice as the mathematics background for the bulk of the book, but full comprehension of a few of the formulas requires just a little knowledge of calculus. experimental science for about one year.
2
Probability
and Experimental Errors
in
Science
"Three" Meanings of Probability
l-l.
The concept of probability is of great antiquity. Its first development was no doubt intuitional. Indeed, one of the "three" general meanings of probability in acceptable use today is a subjective or an intuitional one. This meaning refers to a qualitative state of mind or intensity of conviction, a meaning that is not intended to be, or else cannot be, quantitatively measured. Examples of this meaning are found in the statements "She probably would have won the beauty contest if she had entered," "Shakespeare's plays were probably written by Bacon," and "Life probably exists on Mars." In many examples of intuitional probability, some cogent arguments, even some quantitative arguments, may be marshaled in support of the intensity of the conviction. When an appreciable amount of quantitative support is arrayed, this meaning of probability emerges as one of the two quantitative meanings discussed in this book. Actually, of course, there exists a whole gamut of partially quantitative meanings. The spread is according to the degree of quantitativeness. Let us illustrate one end of this gamut. Suppose the doctor tells his patient that he has a mild attack of what appears to be appendicitis and that the probability is good that he will recover without immediate surgery. The doctor's appraisal of the situation is based on his experience, both direct and indirect, which involves a large number of more or less similar cases. If pressed,
he
may
give a quantitative value of the recovery probability,
in mind an uncertainty of say ±0.3 or so, but he would be very reluctant (and properly so) to state any such quantitative values. He knows that there is still a large amount of nonquantitative "art" in his meaning of probability. At the other end of the gamut, a
such as
0.6,
and have
mathematician, when he speaks of probability, refers (unless he says otherwise) to a quantity having an exact value, an axiomatic or "classical"
but
it is
ledge
may not be known, anyway, obtaining the numerical know-
In real-life situations, the exact numerical value
value.
is
presumed
to exist, and,
On the other
not the mathematician's problem.
works with a
large
amount of
rather precise quantitative value of probability.
good
the data are equally
(or have
known
an "internally consistent" analysis, he or subjective evaluation that the physician.
The
statistician
is
is
hand, a statistician
data and deduces therefrom a
real-life
He presumes
that
injects
none of the nonquantitative
inherent, for example, in the practice of
generally rather closetothemathematician
in the high degree of quantitative intent in his probability, but he
that his value
data,
is
numerous
not
100%
all
weightings) and, concerned with
precise (as
is
knows
the mathematician's) because his
to be sure, are only a limited sample of
all
possible data.
3
Introduction
Let us turn next to the scientist. He is in the business of both making measurements and interpreting them in terms of the laws of nature. In a sense, he is more concerned with a sort of "external consistency" than is the statistician.
This
returns to plague
is
because the
him when more
reporting a measurement, he states
numerical
±
view of the laws of nature
scientist's
measurements are made. In the numerical value and then gives a precise
value to indicate (implicitly)
quantitative reliability or
its
measurement is "correct." The ± value is typically deduced from a combination of (a) a more or less careful statistical analysis of his trials and (b) a guess based on his experience in setting up and performing measurements of this sort, and perhaps from a few unrecorded test measurements made in the process of adjusting the apparatus. Also, he indulges in the guess because he suspects that his measurement contains some undetermined amount of systematic error. His statistical analysis the probability that the
is
often a bit careless for another reason,
in
mind
that
if,
the interpretation he has
viz.,
measurement may not require greater
care. But note well by the design of the experiment, he judges the systematic error
for the
to be relatively small and/or the interpretation to
he must then make a very careful
demand
a high degree of
Hence, depending upon the particular measurement and its interpretation, the experimental scientist works in the range of quantitative probability somewhere between the physician and the statistician. But the scientist reliability,
statistical analysis.
when he is conjecturing about nature in the absence of pertinent measurements, resorts perforce to the classical type of probability of the also,
mathematician.
This he does in desperation, as discussed later in this
chapter.
The
bility is
improved as soon as actual pertinent measurements become
tentative description of nature in terms of classical proba-
available.
To make
clearer the distinction
between the probability of the mathe-
matician (or of the desperate theoretical scientist) and that of the cian (or of the scientist in his eventual description of nature), briefly
one of the points made above.
This
is
let
statisti-
us amplify
beyond doubt the most
subtle point in the concept of quantitative probability.
A
quantitative measure of anything always implies
operational definition of the thing being measured.
(hence meaning) of probability in a contains a certain
amount of
real-life
or
always an arbitrariness that
scientific situation
inherent arbitrariness,
exists in addition to the effect
an unambiguous But the definition
of the subjective guess in evaluating the
systematic error just mentioned. This stems from the fact that, in a given situation,
happen.
bound
we can never imagine or evaluate all As a matter of practical experience, we
the things that might are usually content to
the situation with a closed group of separately
and individually
Probability and Experimental Errors in Science
4 evaluated possibilities.
If the
amount of
is
arbitrariness
often
arbitrariness,
group
small.
It is
is
reasonably comprehensive, the
primarily this feature of inherent
small but always partially unknown,
that
philosophical (and theological) fascination into the subject.
mentary treatment, we
shall for the
infuses
In our ele-
most part side-step the conventional
But we point out now that an attempt to reduce the residual arbitrariness in the definition of probability in real-life situations that we progress from the first to the second of the two so-called quantitative meanings. We shall return to different aspects of this arbitrariness in Sections 1-5 and 1-10, and generally throughout the book. The first of the two quantitative meanings is called the classical or a priori probability. The classical meaning, discussed with examples in Part B of this chapter, is based on the presumption that the probability (for the occurrence of a specified event) can be determined by an a priori analysis in which we can recognize all of the "equally possible" events. The usefulness of this meaning is limited to a rather specialized class of "mathematical" probability situations, such as those encountered in the ideal gambling games (e.g., dice, cards, coin tossing, etc.), and to real-life situations where we are desperate for lack of reliable objective knowledge philosophical (and theological) implications.* it is
really in
(measurements).
The second of
the
two quantitative meanings
or a posteriori probability.
is
called the experimental
This probability concept, having by far the
and understanding of nature, has which the number of "equally possible" events has admittedly not been or cannot be determined. This is the case in all
greater profundity in the description to
do with
real-life It is
our
situations in
probability situations.
of
little
failure to
situations
is
or analytic
consequence in the limited discussion
make
book whether
due to a limitation (temporary or inherent) of our knowledge or is due to an inherent characteristic of the events under
ability,
consideration. f
In either case, the causal factors in the individual events in
such probability situations are not understood us to
in this
the a priori determination of probability in real-life
make
in sufficient detail to
reliable individual-event predictions;
said to be at the
whim
or caprice of nature. But
that in these probability situations
considered by the methods of
each such event
it is
especially
allow
may
be
noteworthy
a large number of events are an amazing degree of regularity
when
statistics
* An introduction to probability from the philosophical point of view is given, e.g., by E. Nagel, Principles of the Theory of Probability (International Encyclopedia of Unified Science, vol. 1, no. 6, Chicago, 1939). t We may, if we like, include as part of the event characteristic the perturbation introduced by the observation itself (Heisenberg's uncertainty principle).
5
Introduction
becomes apparent
—the
capriciousness of nature
is
limited.
With an
experimental or a posteriori definition of probability, predictions of specified future events
can be made, and, more, a significant quantitative
degree of reliability can be assigned to each prediction
The term "random mass-phenomena"
is
often applied to a class of
events that are amenable to analysis by statistics and experimental-probability theory,
a whole.*
events that are unpredictable in detail but are predictable as
Practically every intellectual discipline
with quantitative relationships
war
—economics,
science, biology, chemistry, physics, engineering,
medicine, to mention a few
—
is
replete with
whose content deals
sociology, political science,
commercial business,
phenomena
that are effectively
random. Statistics and probability theory are rapidly being extended to more and more facets of all of these subjects. In particular, the substance of any experimental science is measurement, and practically all measurements are in the class of essentially random mass-phenomena. Measurements are so classed because of one or both of the following reasons: (a) each measurement is merely a sample of a treated as
large
number
(essentially infinite in
some
instances) of possible measure-
ments that differ more or less slightly among themselves, i.e., the experimenter is unable to control completely all the experimental factors involved, and (b) the property being measured may contain a degree of randomness as an inherent characteristic. And, equally important, an increasing fraction of modern scientific theories (i.e., generalizations drawn from measurements) are based on a view of the statistical behavior of nature. Examples of these features of measurements and of theories are discussed later.
For the student of any quantitative science, early recognition and underrandom mass-phenomena are imperative. Obviously, such recognition and understanding require an early grasp of the fundamentals of statistics and probability theory. This book attempts to impart these standing of
fundamentals.
With
this objective, the
book
is
largely concerned with the third of the
above. However, for historical and pedagogical reasons, and because of the use of axiomatic probability by desperate theoretical scientists (desperate in the sense mentioned above),
"three"meanings of probability as
set forth
the next few sections of this chapter are devoted to a review of selected
concepts and arguments that were developed in connection with the classical or a priori probability.
This a priori meaning
one to grasp and to use to the extent that * This
is
our
first
it is
applicable.
much
It
is
given in Section 2-3.
the easier
should be kept
use of the term "random," a term or concept that
present in any discussion of statistics and probability.
concept of random
is
is
inextricably
Further discussion of the
Probability and Experimental Errors in Science
6 in
mind
that all of these concepts
and arguments (excepting, of course,
the definition of probability itself) are also applicable to
phenomena
in
random mass-
which the experimental meaning of probability
is
the
appropriate one.
1-2.
Historical Perspective
and probability as we know it today is dually games of chance and (b) accumulated records such as those of human life and death. Both of these roots, in really recognizable form, date from about the middle of the seventeenth century. The concepts of classical (a priori) probability grew mainly from the first root (ideal games of chance), and the experimental (a posteriori) concepts, based on statistics, grew mainly from the second. In most of this chapter, the development from the first root is reviewed, but, as was just mentioned, most of these
The
subject of statistics
rooted in
(a) ideal
aspects have general applicability.
we launch
into the classical part of our study, a few more comon historical perspective. Around 1650, gambling was very popular in fashionable circles of French society. Games of dice, cards, coin tossing, roulette, etc., were being rather highly developed. As personal honor and increasing amounts of money were involved, the need was felt for some formulas with which gambling chances could be calculated. Some of the influential gamblers, like de Mere, sought the help of the leading mathematicians of the time, such as Pascal, Fermat, and, later, d'Alembert and de Moivre.* Fortunately, the mathematicians accepted the problems as their own, and soon the subject of classical probability took shape. The a priori definition of probability was formulated in correspondence between Pascal and Fermat in 1654. Huygens published the first treatise on the subject in 1657. The famous Bernoulli theorem and the binomial distribution were introduced in 1713. The general theorem known as the probability multiplication rule was proposed by de Moivre in 1718, and de Moivre also published the first indication of the normal probability
Before
ments are
in order
first
special case of the powerful "central limit
in 1733 to 1738.
Further development of the normal distribu-
distribution (and the
theorem")
made by Gauss, whose name
is often attached to it, and it was soon used by Gauss and by Laplace independently in the analysis of errors of measurements in physical and astronomical observations. The important principle of least squares was formulated by Legendre at about this time, and the "theory of errors" was well on its way.
tion
was
later
* See, e.g., Todhunter's History of the Mathematical Theory of Probability from the time of Pascal to that of Laplace (Macmillan, Cambridge and London, 1865).
Introduction
Laplace
7
of 1812,* a treatise in which the a priori
in his classical treatise
type of probability holds supreme sway,| gives a rather complete
summary
of the mathematical theory of games of chance. Soon after 1812, contact with classical mathematicians was almost
lost.
Continued development of the subject was made by statisticians in various fields such as in actuarial work, in certain branches of social, biological, and physical sciences in the treatment of errors in measurements, and in theoretical physics in what is called statistical mechanics. In its
and the root of experimental probaand death records published by Gaunt in England in 16624 Such records and interpretations were significantly extended by Halley a few years later, and Halley is sometimes called the father of statistics. Statistics flourished for some 200 years without much present form, the subject of statistics bility
apparently began in the
life
further progress in probability theory, except for the
new
definition.
Vigorous contact with mathematicians was re-established in the 1920's, and basic development along the lines of mathematics is today continuing
apace with the multitude of new applications. maticians
is
The work of
the mathe-
again closely akin to the classical (a priori) concepts in the
sense that probability
theorems are
valid,
is
taken as axiomatic.
The new mathematical
of course, irrespective of the method whereby the
numerical value of probability
is
obtained.
Philosophical aspects of probability continued to develop as scholars in * Pierre S.
de Laplace, Theorie analytique des probability (1812), and the companion
treatise Essai philosophique sur les probabilites (1814).
For
selections
from
several
contributors to probability theory, including Laplace, see The World of Mathematics (Simon and Schuster, New York, 1956), ed. James R. Newman.
Why
t
—along with most of the leading mathematicians and —so uncritical of the a priori definition of probability Such questions are
was Laplace
of his time
scientists
?
interesting but of course difficult to answer. Perhaps it was because of the intellectual fashion in those days to believe that complete knowledge was available to man, that
of the "equally possible" outcomes in any situation were knowable.
This was and to what later became the principle of determinism. This philosophical fashion was dominant in intellectual circles throughout the nineteenth century, and did not yield until the advent of quantum physics early in the twentieth century. With Heisenberg's uncertainty principle, the whole of science was recognized as ultimately based philosophically on the concepts of experimental all
related to the philosophy of sufficient reason,
probability. X
There
exist very early records of a type
of insurance business in the protection of
Greek merchants from the maritime hazards of ships at sea, but these records do not indicate a very good statistical basis on which the insurance "premiums" were determined. To marine insurance, the Romans added health and burial insurance, again with rather inadequate statistical records. In the year 1609, Count Anton Gunther in
Germany fire;
refused the request of his people for financial protection against the hazard of he refused for fear of "tempting the gods." As a matter of history, it appears that
the English started the
first fire
insurance
company
in 1680.
Probability and Experimental Errors in Science
8
general devoted themselves to the subject.
Activity in
philosophical
probability has been particularly intense since the widespread dissemination of Heisenberg's uncertainty principle of 1927.
based
in
quantum
The general of
(a)
This principle
is
physics.*
subject of probability at the present time
mathematics, (b) measurements or
is
a combination
statistical data, (c)
theory of
and (d) theory of knowledge itself. Typically, the student begins the study from some one rather specialized approach, but he is soon obliged to broaden his view to include all of them. This, perhaps more than any other, is a humbling subject, because it is so all-inclusive. nature,
CLASSICAL (A PRIORI) PROBABILITY
B.
1-3.
Definition of Classical Probability
In a simple ideal game, the chance of winning
heads or
is
if
comes, and
one out of
For
;
The
will settle either
is
honestly cast, there are six possible out-
number (a specified event) is Also, we may readily see that the
the chance for a particular face
six, i.e.,
the probability
is \.
probability of drawing the ace of spades sV
it
tails.
ordinary (six-sided) ideal die
is
easily deduced.
an ideal coin is There are two possible outcomes of the toss and the chance the same for either outcome. The chance for either a head or a tail one out of two, i.e., the probability is \. By similar argument, if an
example,
is
is
honestly tossed into the air
from an ordinary deck of 52 cards
and the probability of drawing a spade
is
||
=
\.
underlying conditions for such simple calculations of probability
are that (1) every single trial must lead to one of a definite known number of outcomes or events, and (2) every possible outcome must have an equal chance.
be defined as the number of events recognized as "win," and let n be the total number of equally possible events. Then, the probability of
Let
w
winning, p,
is
given simply by the ratio
P
^
(i-i)
n
and the probability of
losing, q,
is
f
given by
=
"-^ n
*
W.
Heisenberg, Z. Physik, 43, 172 (1927).
d-2)
9
Classical (a priori) Probability
Equation
of classical or a priori probability,
1-1 constitutes the definition
a quantitative definition that was no doubt generally understood and used
very early in our history. Nowadays, this definition because
was so
it
a priori definition since
it
sometimes called the Laplace
is
by him
well formulated
(in 1812).
allows determination of p before the
It is
an
game has
been played. Application of the classical definition required increasingly critical thought as games of somewhat greater complexity were considered. The definition does not give a criterion for determining the values of w and
of n, especially of n, and the proper values become rather obscure as the complexity of the game increases. As an example, consider a game in which a penny
is
game is won if a head appears T. One argument says that viz., HH, HT, TH, and TT, and
tossed twice in succession and the
Denote heads by
at least once.
H and
tails
there are four equally possible events,
by
won in each of the first three events. Hence, p = f But an argument says that the events and HT are winners on the Acfirst toss and that the second toss in these two cases is not made. cordingly, there are only three equally possible events, viz., H, TH, and TT, and the first two events are winners. The proponents of this argument concluded that p = § instead of |. Which is correct?* The student should recognize that in the second argument the three outcomes are the
game
is
.
HH
alternative
not equally probable. It
was paradoxes such as
this
(and
this is
a very simple example) that
caused the gamblers to go to the famous mathematicians of the time. The
mathematicians were soon able to resolve the ambiguities in the numbers
w and n
shall take
1-4.
games of chance by the methods known as and combinatorial analysis, subjects that we
in Eq. 1-1 for the ideal
probability combinations
up
briefly
now.
Probability Combinations
For convenience
in terminology, let A, B, C,
•
•
•
stand respectively for
the various possible specified outcomes or events in a probability situation.
A may
be heads in a penny toss, or it may be red in the draw of a card from an ordinary deck B may be tails in the penny toss, or it may be a ;
face card in the card draw;
spades
;
etc.
Or one of the
7 in the cast of
two
events are very
dice, or
p
—
|
second argument
is is
it
commonly
* Curiously, the early
leading to
C may not even exist, or it may be the deuce of events
may
may
be more complex,
be either a 7 or an
specified,
11, etc.
and the determination of
mathematicians were divided on
the correct one;
may be a Compound
e.g., it
the paradox
given twice the weight of either
is
this question.
resolved
TH or
TT.
if
their
The argument
the event
H in
the
and Experimental Errors
Probability
10
Science
in
Two
probabilities requires in general very careful analysis indeed.
rather
simple combinations involve either mutually exclusive or independent
component
events.
Mutually exclusive events. Two events A and B are mutually excluone of them can occur in a single trial. For example, a head and a tail are mutually exclusive in a penny toss. The probability of occurrence of an unspecified member of a set of mutually exclusive events is the sum of the component probabilities. This statement is conveniently sive if only
written as
A
p(either
follows that the
It
exclusive events
is
sum of
unity,
=
or B)
p(A)
+
p(B)
of
the probabilities
(1-3)
all possible
mutually
i.e.,
p(A)
+
Independent Events. Events rence or nonoccurrence of
+
p(B)
A
in
p(C)
+
A and B no way
•
•
•
=
(1-4)
1
are independent
if
the occur-
affects the probability
of occur-
rence of B. Examples of independent events are found in successive tosses
of a penny, or in successive samplings of differently colored balls from a jar
if
is
is replaced and the jar shaken before the next sample is The probability of occurrence of two or more independent events product of the component probabilities,
each sample
taken. the
p(A and
The
B and C
•
•
=
•)
p(A)
probability of tossing a penny three heads in a
the three specified events being independent.
two heads in a row and then a tail tossing two heads and a tail in three events
p(B)
is
not specified,
is
3
(f )
+
3
(£)
have a combination of mutually
The
•
p(C)
row is
•
\
•
•
•
(1-5)
\
\
=
3
(|)
,
probability of tossing
But the probability of also (f)3 particular sequence of the tosses, if
is
+
.
3
(I)
=
exclusive
3(£)
3 .
In the latter case,
and independent
we
events.
two pennies are tossed together three times, the probability of seeing two matches and one mismatch in any (unspecified) sequence is 3 3(|) since the outcome of the first penny is never specified but in each of the three tosses the outcome of the second penny is specified (with/? = ^), and there are three independent ways (different possible sequences) in which the winning outcomes may appear. The three independent ways refer, of course, to the fact that the mismatch may follow the two matches, come between them, or precede them.* Another example of the combined probabilities of independent events is found in the following. Suppose an incident is witnessed by one person Similarly, if
,
* is
Another view of this problem is to consider that the outcome of each penny toss Then, the probability of each outcome is (£)\ and there are 24 different
specified.
Classical (a priori) Probability
who
describes
it
II
to another person
who
in turn transmits
people are in the chain before the incident
is
it
related to you,
on.
20
If
and
if
the
component probability for truth is 0.9 per person, what is the probability that you are told the truth? This combined probability is (0.9) 20 an 0.1, and you should not put much credence in the story as you hear it. We might inquire as to the number N of such people in the chain before the combined probability has dropped to 0.5. The answer is given implicitly
=
in the equation (0.9)^
0.5.f
made commonly
In connection with the independence of events, a remark must be in regard to the
quoted
in
popular "law of averages." This so-called law
support of a fallacious belief that a run of bad luck presages a
run of good luck such that the two runs course, merely wishful nonsense
the
is
component
probabilities
if
will
remain unchanged.
or nonfacetious interpretation of this "law of averages"
number of trials
is
if
is
the one implied
By
this
determined only after an extremely large
identical trials (an infinite
either kind of luck
of
is
of
The only acceptable
in the experimental definition of probability as discussed later. definition, the probability
is,
and
average or cancel. This
the events are truly independent
number
in the limit),
and a run of
number
eventually of negligible consequence as the
continues to increase (assuming that the probability so defined
unique, an assumption also discussed
Compound
events:
probability of a general
general addition theorems.
compound
is
later).
event
when
the
Now
consider the
component events are
overlapping or have what are called overlapping probabilities.
Over-
lapping component events are not entirely mutually exclusive. For example, if
one card
is
drawn from each of two ordinary decks of 52
ways of having two matches and one mismatch. In The 24 ways are seen from the table :
HH
this view,
p =
24(£)
6
cards,
=
3(J)
3 .
Probability
12
what is
is
and Experimental Errors
the probability that at least one of
-
viz.,
\+
5
-£$,
probability that both cards are the the
sum of
the probabilities for (a)
on the second and
(b)
(i.e.,
either
Science
one or both)
sum of the component because this sum includes twice the ace of spades. The correct answer is an ace on the first draw and anything
the ace of spades? This probability
event probabilities,
them
in
no ace on the
is
not just the
first
draw and an ace on
the second,
viz.,
-1 5
2~
T
1
I
1
"
51
_1_
52"
1 1 — 52+52
_,„ Or
52
'
t 1
\h2
1 \
'
h2)
where the term in parentheses corrects for the probability of simultaneous appearance (or overlap) of both component events. This and other examples of probabilities of compound events made up of overlapping independent component (nonmutually exclusive) events are illustrated
by the following equations
p(neither
A
nor B)
= [1 - p{A)~\ [1 - />(#)] = 1 - p(A) - p(B) + p(A) p(B) (1-6) = p(A) [1 - p(B)] + p(B) [1 - p(A)] = p(A) + p{B) - 2p(A) p(B) (1-7) = p(A) p(B) + ^(either A or B, not both) = 1 — ^(neither A nor B) = p(A) + p(B) - p(A) p(B) (1-8) •
^(either
A
^(either
or B, not both)
A
or
B
or both)
•
•
•
•
Equations
1-6, 1-7,
theorems.
Equation
and 1-3
1-8 are is
commonly known
as the general addition
the special case for mutually exclusive indepen-
dent events.
The concept of "sample space"
is
a very convenient one.
In a given
probability situation all possible outcomes comprise sample space.
For
example, in the drawing of one card from a deck of 52 cards, there are 52 points in sample space. (Sample space
may be
visualized as points appro-
on a sheet of paper.) The convenience of the concept is readily seen in the idea of overlapping component events. Consider the 52-point space just cited: of the four points representing aces, two also are found among the 26 points representing red cards. Other examples are given in the problems of Sections 1-6 and 1-9. priately arranged
Conditional
multiplication
probability:
multiplication theorem, of
which
Eq. 1-5
is
theorem.
branch of the subject
general
This leads to a
events, involves the ideas of partially dependent events.
known
The
the special case for independent
as conditional probability.
If
event
B cannot
occur unless some condition is imposed, say unless event A has occurred, then the probability for B must include the probability that the condition
Classical (a priori) Probability is
13
In this simple example, the probability for the
satisfied.
event (A and B)
may
compound
be written
=
p(A and B)
p(A)pA (B)
(1-9)
on the assumption that A has p A (B) is written as p(B A). Equation 1-9, in its general form in which B depends upon more than one condition, is known as Bayes' theorem. (Bayes' theorem is usually stated a little differently, viz., as the probability that B was preceded by the where p A (B)
to be read as the probability,
is
B
already occurred, that
specified events
Ax A2 ,
Often,
will occur.
•
•
•
,
;
this is also
|
known
as inverse probability.)
Incidentally, the definition for the independence of events
p(B
|
A)
=
and
p(B)
p(A
=
B)
\
A and B
is
p(A)
Consider the following example. Three white balls and one black ball and one white ball and two blacks are placed in an
are placed in a jar identical jar.
If
withdrawn from
We
one of the two jars it,
what
selected at
is
argue that on the condition that the
f ; and that the white probability is |.
probability
is
on the condition Either jar
Hence, the probability that a white
and from the second white ball
is
the
jar, \
random and one
ball
the probability that this ball will be white?
is
•
The
\.
first
jar
chosen the white
is chosen chosen with a probability of \.
is
drawn from
is
is
that the second jar
the
first
over-all probability for
jar
is
\
•
f
drawing the
sum A)(^l)
= 2*4+2'3 =
24
The
subscript on p indicates that this is the probability based on our knowledge before any ball has been drawn; we make use of the subscript
notation
later.
Another example of conditional probability is the following. Suppose that in a jar are two balls of which either is black (B) or white (W), and suppose that we have no additional a priori information about the particular color complex of the balls in the jar. What is the probability that the first ball drawn will be white? In the absence of any additional a priori information, it is customary to presume that there is an equal
p that each of the possible hypotheses is the correct one. We might further presume that there are three hypotheses, viz., two whites in probability
the jar, a white
we would
and a black, and two blacks. With these two presumptions,
write
p (Hyp
WW) =
p (Hyp
WB)
= j^Hyp BB) =
Accordingly, the over-all probability for a white on the
by the sum Po\ "l)
=
3
'
I
+
3
'
2
+
'
3.
t
=
2
±
first
draw
is
given
Probability and Experimental Errors in Science
14
since the three hypotheses are mutually exclusive.
chosen at random, It is
in the
In this problem, the
argument to the two jars, one of which
three hypotheses are similar in the
is
preceding problem.
to be emphasized that the presumption of equal probabilities for
the three different hypotheses
is
really
made
in
desperation since
information on which to form a better judgment. All we (1) that the probability for
range of
to
hypotheses
is
1
and
each possible hypothesis
sum of
(2) that the
is
in fact are
somewhere
the probabilities for
all
in the
possible
unity.
Depending upon our view
This example has a further complication.
of the conditions under which the balls were placed or the jar,
we have no
know
we might presume
somehow
got into
that there are four, instead of three, equally
probable hypotheses. According to this view we would write
WW) =
p (Hyp
For our purpose now, no
Hyp BW,
made between
=\ Hyp WB and
would be assigned a probability of
\ instead of \
p (Hyp
but as a unit
it
WB)
=
p (Hyp
BW) =
distinction need be
p (Hyp BB)
of being the correct one. Accordingly, the over-all probability for a white
on
the
first
draw
is
a probability that
given by the
is
the
equally likely hypotheses. the ball replaced
sum
same as But
if
that determined
the color of the
on the basis of three ball drawn is noted,
first
and the jar shaken, and then a second
ball
is
to be drawn,
it can be easily shown that the numerical value of the white probability number of equally likely hypotheses assumed 2 ) is dependent upon the Pi( at the time of the first draw. (As an exercise, the student should show this
W
dependence.) as to the "proper" number of a priori equally probable an inherent part of problems of this sort. This is one non-
The ambiguity hypotheses trivial
is
aspect of the arbitrariness inherent in the concept of probability as
mentioned in Section 1-1. For practice, let us rephrase the problem and extend it, knowing however that the numerical values from here on depend upon the initial number of equally probable hypotheses assumed. Consider a thin metal disk which we propose to toss as a true coin. Suppose that there are only three hypotheses: (a) that the disk has a mark on both sides that we may call heads, Hyp HH; (b) that it has a different mark on both sides that we may call tails, Hyp 7T; and (c) that it has heads on one side and tails on the other, Hyp HT.* With no a priori information as to which of these *
Again,
owing likely
if
Hyp
in any way from Hyp HT, e.g., we would start with four equally
77/ were recognized as different
to the process of manufacture of the disk,
hypotheses instead of three.
Classical (a priori) Probability
three hypotheses
is
the correct one, suppose that
an equal probability,
we
write
p H h(H\) as trie Hyp HH is
condition that toss
first
is
correct, then toss
first
is
Po(tfi)
/>
(Hyp HT)
we may
=
p (Hyp TT)
probability that the correct,
Pht(^i)
heads on the condition that
as the probability that the is
we assume
in
desperation
viz.,
HH) -
p (Hyp If
IS
first
toss
is
Hyp
first
=
toss
}
is
heads on the
as tne probability that the
HT
is
correct,
and p TT {H^)
heads on the condition that
Hyp TT
write the expression for the probability that the
heads as
=
Po(Hyp
HH) p HB(HJ + p (Hyp HT)
+
Wi)
p (Hyp TT) pTT (H x )
where the subscript on p refers to the probability before the outcome of any toss is known. Substituting the known quantities as discussed above, /><>(# i)
= i-i +
as expected.
Next,
we
Now we
toss the thin metal disk
W + i-o-i
and observe
that the
outcome
is
heads.
have some information about the relative probabilities of the
three hypotheses; viz.,
we know
that
A(Hyp TT)
=
and, furthermore, that* Pl
(HypHH)>p (UypHT)
The subscript 1 onp refers to the probability after the outcome of one toss is known. Again let us point out that all we really knew before noting the outcome of the first toss was that each hypothesis had a probability somewhere in the range to 1, and it was only in desperation that we guessed the same probability, J, that each hypothesis was correct. The quantitative evaluation of each of the remaining two hypotheses is made by the same *
We
might
arguing that
if
fail
outcome of the were
in
fact
to recognize the second feature of the additional information,
the other side of the disk were in fact tails first toss.
tails,
This argument would be valid
it
if
would not be altered by the we knew that the other side
but this we do not know; we are concerned merely with the
Only the Omniscient knows for sure that it is tails, if it is. two remaining hypotheses before and after the toss, i.e., that the relative probabilities are unaltered by the outcome of the first toss, implies belief in unaltered probabilities after many tosses regardless of their outcomes. That this belief is untenable, unless revealed by the Omniscient, is easily seen. Suppose that a long sequence of heads were to appear; even a novice gambler would suspect that the disk were HH. And an expert gambler or logician would be a little bit suspicious even probability that
it is tails.
Belief in equal probabilities for the
after the first toss.
Probability and Experimental Errors in Science
16
type of reasoning as before, using respective probabilities instead of events in Eq. ft (Hyp
Thus,
1-1.
Po(HypHH)- PHH (lH)
HH) = p (Hyp
HH)
p HH (lH)
+
HT) p HT (W)
p (Hyp
•
and Pl (Hyp
is
the heads observed in the
^^
HH) =
x
3
Now,
•
p (Hyp HH) p HH (\H) + p (Hyp HT)
\H
where the event
p,(Hyp
^PoiHypHT) p HT (lH)
HT) =
^(Hyp HT)
^
= 3
= Pl (Hyp HH) p HH (H + ^(Hyp HT) —-2.1L 3 _ — 56 •
1
I
3
We
and
-'
2
Hence,
toss.*
=
l^32
x
the probability that the second toss will be heads, event
Pl (H 2)
The
2)
•
J
H
2,
is
p HT (H 2 )
1
.
2
i
second time and observe that
toss the disk a
heads.
=3
first
p HT {\H)
•
HH hypothesis
is
again comes out
it
further strengthened at the expense of the
HT hypothesis. We have observed event 2H, viz., two heads in a row, and we
write
p 2 (Hyp
HH) Phh(2H) (Hyp HT) p HT (2H) p (Hyp HH) p HH (2H) + p (Hyp
HH)
•
/>
i-l p 2 (Hyp
+
i-4
5
Po(Hyp
HT) = p (Hyp
l
3
l
HH) 4
3
•
HT)
p HH (2H)
+
p HT (2H)
•
pQ (Hyp
HT)
p HT (2H)
•>
These expressions may be "derived" with the following type of argument. Let us make tosses, using an disk Np (Hyp HH) times and an HT disk Np (Hyp HT) times. This arrangement ensures that the probability is/? (Hyp HH) that a disk chosen at random is an disk. In the ./V tosses, the number of heads that we obtain, on the *
HH
N
HH
average,
is
A^ (Hyp HH)
-p HB {\H)
+ Np
(Hyp HT) p BT (\H)
Np (Hyp HT)-
BT (\H) are with the
HT
probability that any one toss, chosen at
random from among
the
Of
these observed heads,
outcome
is
heads with an
/>
HT disk
(Hyp
the
first
Then the whose
tosses,
is
p (HypHT)p HT (\H) HH) PlIB {\H) + p (Hyp HT) p BT (\H)
This can be stated alternatively as the probability that e.g.,
disk.
N
toss in a series of tosses, the disk will be
in
an
any one toss giving a head, i.e.^^Hyp HT).
HT disk,
17
Classical (a priori) Probability
and the probability that the
third toss will be heads
p 2 (H 3 )
The outcomes of
We may
all
generalize
JH
)
|
+
1
•
i
•
i
=A
tosses are, of course, assumed to be independent. and write the expression for the probability that the
wth toss will be heads, p
=
is
n
if all
—
1
=
tosses were heads, as
1
—^
+
•
-
=
^J-
(1-10)
(remember, for n = 1, we had three hypotheses instead of two).* After observing n heads in a row (n > 0), the probability
for
>
any integer «
for the
1
HH hypothesis, viz., Pn
(HypHH)=^-
(1-11)
i
rapidly approaches unity as n increases, but the hypothesis never
completely certain
—after n
tosses, the (n
+
l)th toss
may be
tails
becomes and the
HH probability would drop abruptly to zero. 1-5.
Knowledge
Inferred
The above examples of conditional feature of inferred knowledge.
This
probability illustrate also the basic
is
the real substance of science.
In the last example, as only //'s appeared in n
trials,
we became
the probability
is
small for
tails
on the
(/?
+
and, in this event, the reliability of the
tails,
abruptly to zero. Such
is
l)th toss,
it
may
HH hypothesis
indeed be
would drop
the possible fate of any inferred knowledge,
Any
of any knowledge based on a limited number of observations.
knowledge
is
actually a hypothesis which, as our confidence in
with experience,
may
rather
But, even though
certain by inference that the disk consisted of two heads.
it
i.e.,
such
increases
be dignified by being called a theory or a generaliza-
tion.!
Any and
all
knowledge
in
an experimental science
is
inferred
from a
problem were initially stated in terms of a disk having unlike sides or like good coin or a bad coin, the expression for/>(Hyp bad) would be the same as here given for/>(Hyp HH) and the restriction n ^ 1 would be removed. This is the case in the "sunrise" example discussed presently. t C. S. Pierce, The Collected Papers of Charles Sanders Pierce (Cambridge, Mass., 1931-35), ed. Hartshorne and Weiss, said in effect, "All beliefs and all conclusions, however arrived at, are subject to error. The methods of science are more useful than old wives' gossip for achieving stable and reliable conclusions, but science offers no * If the
sides,
i.e.,
a
access to perfect certitude or exactitude.
We can
never be absolutely sure of anything."
"There is no absolute certainty" is itself inconsistent, Pierce answered "If I must make any exception, let it be that the assertion 'Every assertion but this is fallible" is the only one that is absolutely infallible."
Then
to the objection that the proposition
Probability and Experimental Errors in Science
18 limited
number of observations.
a scientific generalization rests
than one experiment
is
Usually, however, the evidence on which
complex that the outcome of more
so
is
needed to topple
it
Rather than
completely.
when confronted with an unexpected outfurther developed) to include the new infor-
toppled, a well-based theory,
come, is usually altered (i.e., mation as "expected." Such is the progress of inferred knowledge of any sort; and such is the central feature of experimental probability asdiscussed later.
upon the "proper" number of equally an inherent part of each problem of inferred knowledge. Let us explore this a little further and ask the queswas
It
said earlier that deciding
probable a priori hypotheses
What
is
tomorrow? One two hypotheses to be considered cither assumption is the sun will rise or it will not rise analogous to the two outcomes in the toss of a coin. These two hypotheses are presumed in desperation to be
tion,
the probability that the sun will rise
is
—
that there are only
—
equally probable, each probability being t at the start of things,
before the
sun
first
Some
sunrise.
will rise again, after
having risen n days in a row,
obviously erroneous (notice
an increase
in
i.e.,
people argue that the probability that the
how
small
it
is!)
because
it
is
(|)
n+1 but this ,
the sunrise probability as experience accumulates.
other people argue that, after the
is
does not allow for So,
and more sunrise observations, the
first
the probability decreases that the hypothesis "the sun will not rise"
the
is
correct one. This argument is identical to the one in the thin metal disk problem as discussed above, and the desired probability is (2" + 1)/ n (2 + 2). As a third and last argument, we might consider that at the dawn of history or at whatever time n = 1, all hypotheses in the entire are equally probable. This assumption to range of probabilities from infinite number of hypotheses is again each of an of equal probabilities for 1
a desperation-in-ignorance type of assumption. universe was chosen at
random from
It is
to the effect that
our
a collection of universes in which
all
conceivable universes in regard to the sunrise probability were equally probable.
On
bility is
+
*
(/?
This result
this \)\{n
may
argument we would conclude that the desired proba+ 2).* Laplace advanced this last argument in 1812,
be derived along lines of conditional probability without specifi-
cally evaluating hypotheses as such.
white balls such that the
/th jar
Imagine
contains
/
N+
1
jars,
black and
each containing
N—
i
white
balls,
N
black and
i
taking on
to N. A jar is chosen at random and n balls drawn one by one with integer values from replacement after each draw. Suppose event (nB) has occurred, i.e., that all n balls are black. What is the probability that the next ball drawn from the jar will also be black? If
we choose
the
;'th
jar, the probability for
(nB)
is
p,(nB)
=
(i/N)".
Therefore,
Classical (a priori) Probability
19
and the expression for the probability
(n
+
+
l)/(n
2)
is
called the Laplace
law of succession. Laplace offered publicly to bet anyone 1,826,214 to that the sun would rise tomorrow (he reckoned n as 5000 yearsf).
1
These three arguments conclude with quite different numerical values of the desired probability, aside from the question of the proper value of n,
and
test
serve to illustrate the inherent difficulty in the development
of the
when
reliability
a bit of
of knowledge. The problem
new knowledge
is
is,
and
of course, most acute
just being conceived.
At
this time,
what
are the equally likely hypotheses, or what are the particular hypotheses
Think of the plight of the observer, who was born during the night 5000 years ago and who has never seen or heard of the sun or of a tomorrow, contemplating the prospect that the sun will rise tomorrow, or, if he has seen it just once, contemplating the probability that it has regular habits. Of course, now, with our accumulated experience, confidence in our knowledge that the sun will rise tomorrow is great, and the difficulties in the origin of this knowledge may be amusing. But the alert student will see immediately many modern examples of such even worth considering?
inherent difficulties in new hypotheses or theories and of the inherent arbitrariness in the probability or reliability of a prediction in terms of a
—
new theory or, indeed of any theory, old Further comment is in order in regard This assumption possible tion
is
or new. to the desperation assumption.
with no information whatsoever, each of only two
that,
outcomes should be assigned a probability of \. say that the probability of "life on Mars"
we would
since choices of jars are mutually exclusive events.
=
pUiB) ^
+
!»
+
1
jn+i
The required
+
l)B)
+
N"(N
and, likewise, the probability that n
p((n
2"
=
probability, viz., that («
balls
+
+
assumpwould
We
AT"
in a
row
are
all
black
is
t-N«+i
-j
N n+1 (N + +
this
-|.
1)
drawn
2»+i
is
Then,
•
+
On
1)
1)5 occurs after we
know
that
nB has
occurred,
is
p({n
+
\)B)
lim jy_*oo
p(nB)
=
n n
+ +
1
2
equivalent to evaluating and using the approexample with the thin metal disk, but this may not be obvious.] t The modern scientist would find numerous inferred reasons" for believing that the sun had been rising regularly for many years before 5000 years ago. For example, we may invoke generalizations about planetary motion, interpretations of archeological and geological records of sun-dependent life preceding the earliest records of man, of the
[Dividing by p{nB) in the
last step is
priate hypotheses in the
time involved
in the
evolution of stars,
etc.
Probability and Experimental Errors in Science
20
on Mars
also say that the probability of cats
of every individual form of
the probability of at least one form
is §,
of elephants
is
indeed,
.>,
N
life is J. is
If there are different forms of life, A which, if 1 is large, is very (A)
—
N
What
near certainty and
much
wrong? Nothing
wrong. As soon as we know or profess to know that
there
we
is
a reasonable probability for
is
no longer
are
greater than the
Nor
question.
is
in
answer of
first
more than one form of
\.
life
such complete desperation as to answer
on Mars,
to the
\
is
first
our ignorance quite so profound as for us to answer
any of the additional questions, although we are admittedly rather when confronted with such questions. We must be very careful making the complete desperation assumption. There are numerous in classical "paradoxes" that have been expounded on this point. Additional knowledge always modifies a probability, except for the Omniscient for \ for
disturbed
Whom
the answer to any probability question
is
always either zero or
unity.
1-6.
Problems
Note:
A
numerical answer to a problem
is
not a complete answer;
the
student should justify the application of the equation(s) he uses by giving an
how
analysis of the problem, pointing out
conditions on which each equation
is
the problem meets satisfactorily the
based.
To develop
his "intuition," the
student should contemplate the comparison of the correct answer with his
To
a priori expectation. 1.
What
this end,
answers are given to most problems.
the probability of drawing at
is
jar containing 3 red
and
random each of
the following
from a
5 black balls:
(ans. 2-g)
(a) 2 red balls simultaneously,
(b) 3 red balls in successive
draws with replacement
after each
draw, and (ans. 6-jt-2 )
2 reds and a black in a single draw of 3 balls?
(c)
Ten people are arranged
2.
at
random
the probability that 2 given people will be
by
1
person between them?
3.
Two
(i)
in a ring.
next to each other, and
[ans. (a)
cards are drawn
(ans. 5^)
(a) in a row, and (b)
(i) |, (ii)
(ii)
0.178; (b)
simultaneously from a 52-card deck.
What
is
separated
(i) f, (ii)
What
is
f]
the
probability that
one of them is a spade, and an ace and the other a black ten?
(ans. 34)
(a) at least
(b) they are both red, (c)
one
is
4.
With two dice
(a)
an even number on each
(b) either a 7 or (c)
neither a
1
2,
cast together,
an
II,
nor an
what
is
(ans. x*%\ (ans. 8 f 3 )
the probability for (ans. ])
die,
and 1
1
,
nor a 7
(ans. |) in the first cast,
and a 7
in the
second cast
?
(ans. i)
Classical (a priori) Probability
What
5.
is
the probability that, in tossing a penny,
heads appear
(a) 5
21
(ans. 3A2 )
in the first 5 tosses,
second head appears on the fifth toss, and 5 tosses, a head appears exactly twice?
(ans. |)
(b) the in
(c)
In
6.
how many throws of
die
1
is
there a probability of less than 0.2 of (ans.
(a) seeing a 5,
(b) seeing a 5 for the first time in the last throw,
7.
odd,
<2)
1, i.e.,
and (ans. 9)
not seeing a 5 ?
(c)
^6 )
(ans.
Two dice are cast B the event of at
Let
together. least
1
A be
the event that the
sum of the
faces
is
Are these 2 events independent? Mutually
ace.
What are the probabilities that A and B occur, (b) either A or B or both occur, (c) A and not B occurs, and (d) B and not A occurs? Note that the sum of answers (a), (c), and (d) exclusive?
both
(a)
8.
A
coin
succession.
9.
tossed until for the
What
A
answer to
gives the
(b).
is
toss,
and
(ans. jf)
required?
(ans. §)
Cornell student parked illegally overnight on 100 randomly selected
He
received a total of only 12 "tickets"
or Friday nights.
nothing further
If
nights for checking parked cars, if
he parks
(a) next
Monday
a ticket
-3)
the probability that
is
an even number of tosses
nights.
(ans.
(ans. 3-6 )
time the same result appears twice in
first
experiment ends before the sixth
(a) the
(b)
is
(ans. i) (ans. f f
what
and
all
these
on
either
Monday
is
known about
is
the probability of the student's getting
the police's selection of
night,
(ans. 0.42)
on Monday and Friday nights of next week, and (ans. 0.24) (ans. 0.30) (c) on no more than two unspecified nights of next month If it is known that the police check on random nights, what is the probability that the student, parking on random nights, will get the next 12 tickets all on (d) Mondays and Fridays, and (ans. 2(y) 12 ) (ans. 21(f) 12 ) (e) on no more than two unspecified nights of the week? (b)
10.
On
placed at
an empty chessboard of 64 squares, the white and black queens random except that they cannot occupy the same square. What is
probability that the 2 queens are in striking position, horizontal, vertical, or diagonal 11.
A
marksman
hits a target
the probability of a hit (a) exactly
4 times
is
in
i.e.,
the
first
In what sense 12.
What
(ans. ^f)
on the average 4 times out of
constant.
What
is
Assume
(ans. 0.41) fifth
shot?
(ans. 0.082)
an a priori probability problem? In what sense
is
same calendar month,
that
(ans. 0.41)
the probability that the birthdays of 12 people" assuming equal probabilities for all months, fall (a) in the
5.
the probability that he hits the target
4 shots,
and 4 times but misses on the
is this
on the same
row?
(b) exactly 4 times in 5 shots, (c)
are
are the
is it
not?
randomly
selected,
(ans. 1.34
x 10~ 12 )
Probability and Experimental Errors in Science
22
and
(b) in January, (c)
in 12 different
13.
A man
(ans. 1.12
calendar months?
(ans. 5.37
belongs to a club of 10 members.
making a
to dine with him,
Every day he invites
members
each day.
different party
(a)
For how many days can he do this?
(b)
How many parties will each man attend? A closet contains 9 pairs of shoes with no
14.
5
x 10-13 ) x 10 -5 )
(ans. 126) (ans. 70, 126)
2 pairs alike.
4 shoes are
If
random, what is the probability that there will be no complete pair among them? Generalize the answer for the case of n pairs of shoes with 2r chosen
at
<
shoes chosen at random (2r
(ans.
n).
Suppose that 6 white and 9 black
15.
(a) If the color
of the
not replaced, what
is
first ball
drawn
™)
balls are in a jar.
at
random
is
not
the probability that the next ball
known and
drawn
will
this ball
is
be white? (ans. f)
(b) If the first ball
is
known
probability that the second (c) is
If
the
first ball is
known
not known, and neither
drawn (d)
16.
Why
not replaced, what
be white?
will
is
the
(ans.
/4 )
to be white but the color of the second ball
replaced,
what
drawn
the probability that the third ball
is
is
the answer to part (a) the
same
as the probability for
drawing the
white?
Suppose that the 15 balls of Problem 15 are divided between 2 jars with 4 black in one and with 2 white and 8 black in the other. If a jar is selected at random and a single ball drawn, what is the proba-
white and (a)
is
is
be white?
will
first ball
draw
and
to be white
bility that
1
it
be white?
will
(b) If this ball
is
(ans. 1)
not replaced and
its
color
is
not known, what
is
the proba-
next ball drawn (from a randomly selected jar) will be white? answer be expected to be the same as the answer to part (a)? (c) If the first ball is not replaced but is known to be white, what is the probability that the second ball drawn (from a randomly selected jar) will be white? (ans. 0.47) bility that the
Should
(d)
this
What
is
the probability for the successive drawing of 2 white balls without
replacement from randomly selected jars? 17.
Suppose
that
someone has
million of being bad,
i.e.,
a penny that
(ans. 0.235)
you suspect has one chance in a The penny is then
of having heads on both sides.
it comes up heads. would n have to be before your suspicion that the penny is chance in 2? (ans. n = 20) must n be if you were initially certain that the penny was and
tossed n times and every time (a)
How
large
bad increases to (b)
How
large
remains good?
1
(ans. In this case experimental evidence
is
not relevant)
18. Referring to Problem 9, calculate the ticket probability for next Wednesday night. Assume that only two hypotheses are reasonable, viz., the police check either (A) randomly among the 7 days of each week or (B) randomly on only Mondays and Fridays; and further assume, before the Monday-Friday 12-ticket experience, that the two hypotheses are equally probable (the usual -8 guess in desperation before actual observation). (ans. 3.55 x 10 )
23
Classical (a priori) Probability
Combinatorial Anal/sis
1-7.
As stated in Sections 1-2 and 1-3, methods for determining the total number of equally possible events in somewhat complicated ideal games of chance were soon worked out by the mathematicians. These methods they give formulas for
constitute the subject of combinatorial analysis:
computing the number of permutations and the number of combinations. These formulas are discussed next. Permutations.
Consider a
set
of n objects having different colors or
make each object different from any other. some order, e.g., along a line. This arrangement
some
characteristics that
Arrange the n objects in called a permutation of the n objects. If two objects are interchanged in their positions, a different permutation results. Now select k of the n objects and arrange them in order, e.g., along a line. How many permutais
tions are possible as
k objects are selected from the population of n
objects?
Suppose for simplicity that the k objects are selected one by one. Any one of the // objects may be chosen to fill the first of the k positions; hence there are n possible choices in the selection of the first object. Then, n — objects are left from which one is chosen for the second 1
Since there are n possible choices in selecting the
position.
first
object
— possible choices in selecting the second object, there are a total — 1) possible choices in selecting both objects. Continuing, there of are n — 2 objects left from which the third is selected. In selecting all three objects, there are n(n — 1)(/? — 2) possible choices. In general, the A-th object selected from (n — k + 1) objects, and the total number of possible choices in selecting all k objects is n(n — \)(n — 2) (n — k + 1)and n
1
/?(//
is
•
•
•
Since each possible choice represents one permutation, the total
permutations
is
,A =
n(n
-
lXn
-
2)
•
-
(n
•
k
+
1)
=
7-^77. (/?
The symbol
n
Pk
is
commonly read
things taken k at a time.*
In case
n *
The phrase "«
"taken" rather than the lation
11.
Pn
=
=
n,
//
-
number of permutations of n number of permutations is
the
(1-13)
n\
commonly used
In a given sample,
things, but the total
(1-12)
fe)!
it
is
in the
discussion or
the k things that are
number of permutations
is
equal to the
size
k that can be taken from the popu-
In this sense, the n things are taken as
one "boundary condition" and the in the sampling process.
number of
sample
as the
k
things taken k at a time,"
permutations, requires amplification. total
number of
given by
size k
is
different ordered
samples of
taken as a second "boundary condition"
Probability and Experimental Errors in Science
24 This
consistent with the convention
is
=
0!
As implied above, a permutation sample of a
to or less than the total population,
to
//.
defined as one possible ordered
is
The sample size may be equal number of objects in the
of nonidentical objects.
set
algebra of taking
factorial
in
Of course, k can be any number from
1.*
i.e.,
the total
set.
not necessary that the k objects be selected one by one; they
It is
be withdrawn simultaneously and subsequently placed
all
in order.
may It
is
Suppose that we have n bins, and that we wish to distribute k objects among these bins, no more than one object to a bin. In placing the first of the k objects, we have n possible possible choices; in placing the second object, we have // problem from
instructive to consider the
this view.
—
The total number of the same as given by Eq.
1
choices;
etc.
possible choices in distributing
k objects
is
1-12.
As
a simple example, consider the
when four
arise />
4
2
=
and
—
4!/(4
are a, b,
two
=
2)!
4
3
•
=
12,
that
may
In this case,
at a time.
and these permutations,
can be written as ab, ba, ac,
c, d,
if
the
letters
ca, ad, da, be, cb, bd, db, cd,
dc.
formula.
Stirling's
often inconvenient.
Numerical evaluation of a factorial number is Help is frequently found in the use of Stirling's
formula, z!
=
where z
any
is
12z
^
9,
The
integer.
2%
mation, good to
for z
— + -J- - -^- + )
+
x/2^ (z/eyll \
z
number of permutations
different letters are taken
all
first
=
5,
288z 2
term of
and
518402
this
is
1
(1-14)
/
expression
to better than
the straight factorial evaluation
3
is,
%
usually the
as an approxi-
for z
>
9.
For
more convenient,
and the practical use of Eq. 1-14 is usually limited to the first term. The essential part of Stirling's formula can be derived as follows. The natural logarithm of z!
=
log (2!)
Consider.a graph of y log
2,
log
3,
log 4,
is
=
etc.,
log 2
log
+
2,
log 3
+
log 4
as in Fig. 1-1,
and abscissa values
+
•
•
•
+
log z
and mark ofTordinate values 1, 2, 3,
etc.
Then, log
(2!) is
clearly equal to the sum of the areas of the rectangles indicated in the figure, each rectangle having unit width and having a height equal to
log *
2,
or log
This
n\jn, and,
0!
=
1.
is
3,
•
•
•
,
or log
2.
often stated as follows:
if
n
is
This area By
is
approximately equal to the area
the definition of a factorial
taken as unity in this expression, n\ n
=
1.
It
number,
follows that
(/;
(1
—
1)! 1)!
= =
Classical (a priori) Probability
2S
log 7 log
log
r^T
6
log 5
—
.y
=
logz
4
l/|
^~A
log 3
\/ log 2
tT
i
l|
-3ZJZ^\
I
I
I
I
I
I
I I
I
I
I
i
i
i
I
i
i
i
i
I
I
I
I
I
~
i
i
i
i
i
~ —
J I
i
i
I
I
0123456789
-
10
z
Fig. 1-1.
Graphical interpretation of
under the smooth curve y = log z out to as z increases. Hence, for large z, log
z,
log
(2!) ph
Stirling's formula.
the approximation improving
2
dz
Jo
and, integrating by parts,
—z+
log (2!) £a z log 2
Putting this in exponential form,
^
1
2
log 2
—
the
first
2
we have
&
2!
(z/eY
which, with the insertion of the factor V2ttz,
is
term of
Stirling's
formula, Eq. 1-14. The additional terms enter in the complete derivation to take account of the excess area between the step curve
and the smooth
curve in Fig. 1-1.* *
Mathematicians commonly define
z\
as XI
x ze
dx
(1-15)
J* z is integral or not. Later, we shall use the factorial of a negative terms of Eq. 1-15 that such factorials can be shown to be infinite. Interpreted graphically, Eq. 1-15 refers to the area under the curve y = z ze~*. The
which applies whether integer.
It is in
integration indicated in Eq. 1-15 can be carried out by parts,
+
i
e -*dx
=
z{z
-
(1-16)
1)!
Jo Further discussion of
this integration, as well as
formula, can be found is
called the
gamma
cussed under this
in
function of
title.]
of the complete derivation of Stirling's
standard textbooks of mathematics. z,
written T(z),
and
Stirling's
[Incidentally, (z
formula
is
—
1)!
often dis-
Probability and Experimental Errors in Science
26
Sampling without replacement.
Either procedure just described in
number of permutations of /? things taken k at a time, or, other words, the number of different ways k objects can be placed in
arriving at the in
is also called sampling without replacement. With this terminology, „Pk is read as the number of different samples of size k that can be taken from a population of n elements. In this case, the term "sample" refers
n bins,
to a particular permutation of the k objects.*
The
a priori probability of drawing, or of otherwise realizing, a partic-
from a population
ular permutation or properly specified sample of size k of/; elements
lj n
is
Pk assuming each ,
possible permutation or sample to
have an equal probability.
Sampling with replacement. ]f the first object selected from the n is noted and then replaced before the next selection is made, we have sampling with replacement. A given object may be selected more than once; repetitions are permitted. It follows that in selecting two objects
objects, the total
three objects, the
the total
number of possible total number is n n
number of possible choices
ordered arrangements of
is
choices
=
n
n
//
We
fc .
is
3 ;
n
n
and
=
/r;
in selecting
in selecting
say that n k
is
the
size k, repetitions permitted, that
k objects,
number of
can be taken
from a population of/? different objects. Note that in this type of sampling no restriction is placed on the magnitude of A:; the sample size k may be smaller or larger than the population
Tossing coins, casting dice, ment.
/?.
are examples of sampling with replace-
In tossing a coin, the population n
the sample size
is
When random made
etc.,
the
number of
sampling
is
order of heads and bility for this
is
\jn
k .
sample
(five
all
heads
in a
independent events as discussed
problems involve,
For example, what
selected digits are
tails,
and
different?
is
the
in
is
is
equally
size k, repetitions
In the tossing of a coin,
tails is specified as, say, five
interesting probability
sampling.
each of the possible samples
Hence, for each specified ordered sample of
permitted, the probability
Some
heads and
done with replacement, the assumption
in a priori probability that
probable.
is 2, viz.,
tosses.
if
the sequential
row, the proba-
earlier)
is
1/2
5 .
a sense, both kinds of
probability that five
randomly
In this problem, although a selected digit
* The term "sample" has various meanings: as a verb it refers to a process, as a noun it refers to something selected; here, it means a permutation; in the next paragraph, it means an ordered arrangement of k objects that are not necessarily all different and in "combination" problems discussed presently, sample refers to different objects but without regard to their particular sequential order. The term "arrangement" is also a general one, like "sample," and is also often used in special senses without the proper qualifying phrases. Some authors use the term "combination" in a similar general way, but usually it has the restricted meaning assigned presently.
27
Classical (a priori) Probability
not "replaced," there are assumed to be an
is
of
digit in the
infinite
imaginary jar from which selection
number of each type is made; hence the
problem is really one of sampling with replacement. It is immaterial whether sampling is one by one or is a single group of five together. We note that, in this example, n = 10 and k = 5, and that there are therefore 10 5 possible ordered arrangements, repetitions permitted, each to be equally probable. There are
ments without
repetitions).
10
Pb
(i.e.,
=
5
An
Consider another problem.
and
elevator has seven passengers
we
empties with stops at ten floors, and
same
at the
is
^ 0.3024
™*J 10
assumed
ordered arrange-
Hence, the answer to the question
p
no two passengers leave
permutations
ask,
floor?
What
is
the probability that
Assuming equal probability
for each of the 10 7 ordered arrangements with repetitions permitted, find the
we
answer to be
p
=
io^z
10
7
w 0.06048
Suppose now that the k by sequential drawings from n different objects or by individual positions in an /7-place bin, are considered to be a single unordered group. Such an unordered group is a type of sample called a combination. Thus, in the case of the four letters a, b, c, and d, the twoletter sample ab is the same combination as ba, but is different from ac, etc. The total number of combinations possible in selecting k objects from n different objects may be denoted by n Ck or, in more modern notation, Combinations:
binomial coefficients.
objects, instead of being ordered
by 1,1.
Either of these two symbols
read as the total number of
is
combinations of n things taken k at a time, or as the total number of combinations of size/: from a population of n elements. Again, < k < n.
The expression
for the total
The k
as follows.
number of combinations, 1,1,
is
obtained
group make one combination,
objects as an unordered
whereas one particular order of the k objects makes one permutation.
The k\,
total
number of permutations
and the
total
in
one combination of k objects
number of permutations
in
Thus,
I
^
,
)
'
combinations
is
(
^
is
A:!.
,
J
'
••=(;) k\ from which, with Eq.
1-12,
(1-17)
kf
k\
k\(n-k)\
28
and Experimental Errors
Probability
It is
apparent from the denominator of the
—
k\ and (n
k)\
can be interchanged.
:)
The symbol
I
:
(.
of
last part
this
Science
expression that
follows that
J
called the binomial coefficient because
is
I
-
It
in
it
appears in
Newton's binomial expansion
+ by =
(a
n \^
ja"« »
n „n-2L2 + ^^""^
n-lu + [{ja^b + '
rt
»
(
,
\
.
,
coefficient
I
•••
n
l.
+
(") b"
arranged in an interesting way in
is
,
i
•
+ The
+
.
•
0-19) Pascal's
J
'
^
triangle:
=0
n
1
11 12 13 3
= =2 =3 1
14
=4 = = =
5
6
6
1
7
10
20
15
35
21
7
1
1
4
6
10
5
1
1
1
5
1
6
15
35
1
21
7
1
etc.
Entries in the
/7th
row are the
respective values of
I
given by successive
,
J
Note
values of k.
that the basic idea of the triangle
(except for the unused apex) ally
above
it;
i.e.,
is
the
is
that each entry
sum of the two immediately and diagon-
.
,
1
("t
=(:)+(,->
)
(As an exercise, the student should prove Familiarity with combinations let
is
this equality.)
best achieved
from
specific
examples;
us consider a few examples.
The number of selected
two-letter combinations (sample size
from the four
letters a, b, c,
and d (population 4!
,k!
and these are
Ml =
ab, ac, ad, be, bd,
and
=
k /;
= 2) that can be = 4) given by is
6
2!(4-2)! cd.
Consider further the example about a random sample of discussed
in
different, in
five digits
the previous section, where the problem specified five digits,
an ordered group or permutation.
Now
let
us ask, In
all
how
29
Classical (a priori) Probability
many ways can
five different digits
be selected
the sequential order
if
is
of
I, i.e., reduced from no consequence? The answer to this question is ^ ^ factor 5 10 P5 by the Another example is found in the problem, How many different single bridge hands can be dealt from a 52-card deck? A hand contains 13 cards and the particular order of the cards in a hand is of no consequence. So I
the question can be rephrased,
How many
combinations of
taken from a population of 52? The answer
And
number of permutations,
the
specified,
is
Among
i.e.,
is
I
^
I
can be
size 13
635,013,559,600.*
with the sequential order in the deal
even greater by the factor (13)!
all
'"
these
cards? Each hand
1
is
I
how many have
possible bridge hands,
a sample of size 13 cards from the population of 26 .
hand contains only red cards
probability that a
only red
is
_
I
.
The
then
26 13
52 13 It is
perhaps instructive to solve this
problem by the long-hand method of
writing the product of the probabilities of successive independent single-
card deals. Thus, the probability that the
second
is
are red
is
that the 13 cards
25
24
14
52
51
50
40
In general, coefficients,
it
i.e.,
trial
acquitted
first
is
etc.,
26!
26!
13!
(26-13)!
52!
52!
39!
(52-13)!
26! 13! (26
/26I
-13)!
!
-
13)!
method of
to think in terms of a sample size
\13
/52
52! 13 (52
simpler to use the shorthand
is
A bridge player
At the
red
is -§£;
;
26
*
card
and the probability of the event that all is ff product the of the component events, viz., given by
red
U3 the binomial
k from a population of
immediately after looking at his hand rose up and shot the dealer.
he pleaded that the dealer had obviously stacked the deck, and he was
when he proved that if the deck had not been stacked the probability of his hand of cards would have been only 1 in 635,013,559,600. What
getting that particular
should have been the counterargument of the prosecuting attorney? t It
is
not valid to argue that the sample in this problem
is 1
3
cards from a population
of two colors, giving 2 13 red hands, because this improperly presumes sampling with replacement.
Probability and Experimental Errors in Science
30 size n, is
although
example the most apparent simplification
in this particular
notation.
in the
Another example: What
the probability that a poker
is
number of
cards contains five different face values? The
(combinations)
is
(combinations) the
from a
=
I
I
five
would be
1
different
ways
cards having different face values can be selected
single suit of 13cardsis
the answer
The number of
2,598,960.
hand of five hands
different
I
/
If a suit
I.
I
I,
I
but no
were specified in the problem,
suit is specified.
The number of
ways (arrangements with repetitions permitted) in which the can be arranged among the four suits is 4 5 Hence, the answer to the problem, assuming all possible hands to be equally different
specified five cards
probable,
.
is
,
45
p
=
5 ,
«s 0.5071
'
(?) Note that in each of the last two examples, component parts of each problem are analyzed in terms of different population numbers, and, in the last example, one component sampling is done with replacement. Different types of populations in a given problem are common. Binomial distribution formula. there are only
In probability situations in which
two possible outcomes (outcome population
observation or sampling with replacement
Obvious examples of Bernoulli
trials
is
called a
=
2),
Bernoulli
each trial.
are the successive tossings of a coin,
the casting of a die for a six vs. a nonsix, etc.
The
essential features of
Bernoulli trials are:
must be independent, each trial must be determined entirely by chance, and outcome of the the probability for any specified outcome (called "success") must be
(1) successive trials (2) (3)
constant for Let
p be
"failure."
all trials.
the probability for "success,"
What
is
The problem posed by If
and
let
q be the probability for
the probability for exactly k successes in n trials? this
question
the sequential order of successes
is
known
and
as the Bernoulli problem.
failures
were important, the
—
k failures would be probability of exactly k successes and of exactly n pkqn-h s nce successes and failures, being mutually exclusive, are necesj
But since the sequential order is of no conse~k k by the number of combinations of the quence, we must multiply p q n sarily
independent events.
Classical (a priori) Probability trial
31
—
population n taken k (or n
equivalent to adding up
This multiplication
k) at a time.
the mutually exclusive combinations.
all
the answer to Bernoulli's problem
given by
is
n
k„ n-k n,p)=\ JpY-' k
B(k;
is
Hence,
>
c
(1-20)
and may be read as the Bernoulli (or the binomial, see below) probability k successes out of n trials with a success probability p. Since p + q = 1, it follows by the use of Eq. 1-19 that for exactly
=
1 .
(P
+ *)" =
re
1
4
P
I
Ip"' *
j
+
+
n I
2
---
-2
)p
+
q k n
\
k )p
_ *= I 5(fc; 1= I (£W K =Q k2ft k=0 \
The
term
in the
n successes in n
trials;
n
—
etc.
up
1
first
-k
q
+ '-'+q n
«,p)
(1-21)
(1-22)
/
sum of Eq.
1-21 gives the probability for exactly
the second term gives the probability for exactly
successes; the third term the probability for exactly n
—
2 successes;
the last term, q n gives the probability for zero successes. The series k to and including the p term represents the probability that the event ;
,
happens at least k times in n trials. The probability for at one success is 1 — q n Because of its general form, Eq. 1-21 or 1-22 is called the binomial formula, and a graph of B(k; n,p) vs. k is called the binomial distribution of probabilities. Such distribution graphs are shown in Fig. 1-2 for two
called success least
.
particular pairs of values of n and/?.
for the cases in which p
=
The
\ (symmetry
distribution
is
asymmetric except be proved
in these special cases will
in the next chapter).
Examples of binomial probability are found in the tossing of a coin n k heads (and n — k tails), for which the head probability per trial is known, e.g., is | the casting of a die for which p for a particp ular face value is \, or for an even number is i; the casting of two dice for which p for a double ace is gV; the hitting of a target k out of n times times, seeking
;
if
the hit probability It
is
known;
etc.
follows from the definition of the success probability/?, Eq. 1-1, that,
in n Bernoulli trials, the a priori expected
the product np.
expectation value possible values of
by Eq. 1-20 or
This product np is
number of successes k
is
given by
The when all
called the expectation value.
equal to the arithmetic
k are weighted by
1-21.
is
mean
value of k
their respective probabilities as given
Probability and Experimental Errors
32
in
Science
Expectation value
C^
p = np = 50*1-
0.08
16.66'
o fl 0.06 -a;
03
0.04 0.02 !
o.oo
•
2
4
*
»
8
6
10
12
16
14
20
18
22
24
26
28
k Fig. 1-2.
Two
The asymmetry decreases
binomial probability distributions.
as the
expectation value, np, increases.
The most probable value of
k, viz.,
k
,
that value for
is
nomial probability is a maximum. In some cases k two values differing by unity [see Problem 14(o) of Section differs from np by more than unity.* is
*
This can be shown as follows. By the definition of k B((k
+
1);
n,p)< B(k
B((k
-
1);
n,p)
;
,
in
which the
bi-
double-valued, the 1-9].
k never
general
n,p)
and also
By expressing each B
B(k
\
n,p).
as the appropriate term in Eq. 1-22, these inequalities can also be
written as
n
1
—
p
1
—
p
p
33
Classical (a priori) Probability
Before proceeding,
us recapitulate a
let
for the binomial distribution formula
Bernoulli
trials,
we
little.
In writing the expression
use n to represent the
number of
whereas, in the discussion that led to the binomial co-
number of unlike objects from which random k are made. It is important to realize that, although the terminology is different and the "unlikeness" of the "objects" is rather specialized, the two /7's refer to the same thing. Each binomial coefficient is the number of combinations or ways in which an unordered sample of size k can be taken from the n unlike objects. The binomial probability for k successes is proportional to this coefficient, i.e, to the number of ways efficients,
n refers to the
selections of size
trials can be taken k successes at a time. The k successes are of course in an unordered sequence with n — k failures mixed in. The specialized
n
unlikeness of the objects in this instance
is
that the population of n trials
made up of only two different kinds of things, viz., successes and failures, and we do not know which is success and which is failure until after the is
But
trial.
does not impair the argument.
this
The binomial can be analyzed
is
a very important distribution since so
in
terms of basic Bernoulli
many problems
Indeed, the normal
trials.
(Gauss) and the Poisson mathematical models of probability, the two most
important models science,
may
of measurement in any experimental
in treating errors
(but not necessarily) be considered to be special cases of the
binomial model. The normal case
is that for which n becomes very large and p is sufficiently large that np > 1 (infinite in the In practice the normal approximation is fairly good as long as when p < \, and nq > 5 when p > \. The formula for what is
(infinite in the limit) limit).
np
>
5
called the probability density in this distribution
G(z;
fc)
=
A
e
is
-»V
(1-25)
V
ITT 7
where h
=
1/V 2npq and '
z
=
np
—
Equation 1-25
k.
is
derived in Chapter
The Poisson special case obtains when n becomes very large (infinite in the limit) and p becomes very small (zero in the limit) but in such fashion
4.
that the product np remains
moderate
The Poisson formula (derived
in
P(fc ;A
where//
The
=
viz., that
np <^
\ n.
5) is
= £iL-
d-26)
np.
special conveniences of Eqs. 1-25
measurements are discussed ately apparent convenience, is
magnitude,
in
Chapter
in detail in
and 1-26 and
Chapters 4 and
when compared with
that evaluation of large factorials
is
avoided.
their application to 5,
but one immedi-
the binomial formula,
34
Probability
When
n
excessive.
Eq. 1-25
mation
is
is
is
small, Eq. 1-20
When
n
used.
And when
used.
is
large
Thus,
is
is
is
large
and p
pennies.
not
is
small, the Poisson approxi-
experiment to check
it.
it
is
instructive
Toss five "honest"
Pick up those that are not heads and toss them a second time.
Finally, toss
any that are not heads Table
For Pl
=
,
1
(2
« r = h;
l-l.
i
Pz
this
time a third time.
Values of 8(k; 32,
lt = d -\r 16807
Now,
p.)
243 0.237;
1024
* -0-W- 32678 ~°- 313 k
is
have been considered.
Because of the importance of the binomial distribution little
Science
not small, the normal approximation
all possibilities
for the student to perform a
in
used directly and the computation
and p n
and Experimental Errors
and
after the
35
Classical (a priori) Probability
understanding of the values of/? listed at the head of the table,* and then check a few of the binomial probabilities as listed in the table. Use Stirling's formula where it is accurate enough and convenient. The numt
and
bers n
np, at least in the third-toss case, are sufficiently large that fair
accuracy can be achieved with the simpler normal approximation, Eq. 1-25 also, B(k; 32, p^) may possibly be satisfactorily represented by the Poisson approximation, Eq. 1-26. Try these approximations in a few instances in test their accuracy and their
each of the one-, two-, and three-toss cases to relative convenience.
Then compare
the observed
numbers of successes
with the calculated frequencies and discuss the discrepancies.
Later, in
Chapter 4, we discuss the so-called % 2 test for determining whether or not the observed discrepancies are reasonably compatible with the assumption that the five pennies are "honest" (i.e., that each has a head probability of
|).
We
have discussed permutations and combinations of n different objects taken k at a time. In many problems the n objects are not all different from one another. Thus, in an ordinary
Multinomial coefficients.
Suppose that k x + k z + k3 + there are r different kinds of objects; then n + kr r < n. The expression for the number of permutations n Pk k k can be 52-card deck, 4 cards are aces, 13 are spades, 26 are red.
=
•
•
•
,
...
same type of reasoning as was used
arrived at by the
in obtaining
I
J
from
n
Pk The k x
similar objects in a single combination represent
.
kx
permutations of the n\ total number of permutations that would exist
Likewise, the k 2 similar objects in a single
n objects were different.
all
combination represent k 2
permutations
\
\
if
if all
n objects were different.
By continuation of this argument, the number of permutations n Pk k k when multiplied by k 1 \k 2 k r gives the total number of permutations ...
l
*
Calculation of p x
show heads
to
first
=
p2
is/» 3
is
—
5
J)
By
.
= d -i)
p 2 may on the
-
To
easy.
twice in succession
or the second try (1
-
the
-
\
calculate is
J;
p2
— J). The probability that all five pennies will do this is same type of argument, the probability of success in three tries
is (1
5 -
also be calculated as follows.
The
—
n tails
5!
—
2" "h! (5 Thereafter, 5
we
probability for n heads and 5
first try is
1 5 1/2 "".
note that the chance of a penny failing
,
hence the probability for heads on either the
—
If these
obtain
p2
.
ii)'!
The chance that they are all heads and summed over n from n = to n =
n pennies are tossed again.
is
two
5,
factors are multiplied
This sort of argument also gives ,5,
A
5!
1
P*= „f 2,= 1o™t2 s
«! (5
-
(5-h)!
1
»)!
2 5 ""
(5
-
n
-
m)\
1
2s
Probability and Experimental Errors in Science
36 that
would be present
if all
1
may
Hence, we
n objects were different.
write
=1
where the symbol n means the product. This is to be read as the total number of ways in which n objects can be divided into r groups of which the first contains k x objects, the second contains k 2 objects, etc., when the order of the r groups is preserved. Only the order within each k group is
making
relaxed, It is
it
a combination.
instructive to derive Eq. 1-27
by a different route,
this
time following
the line of reasoning that led to the expression for „Pk directly rather than
II
to the expression for
As
as above.
a
step, consider n to
first
divided into just two groups of like objects in each group.
kx
+
k2
.
Then, n
be
=
In this case, after the k x like objects have been selected from the
— k x objects left. We can write nPk # as the product of the number of ways k x objects can be selected from n objects and the number of ways k 2 objects can be selected from n — k x objects. Thus,
n objects, we have n
^=(;J("; 4=t)=^=^7
o*>
2
k2
since
=
—
n
k x and
M =
I
which there are three groups of
same argument
as in the
Now
like objects,
P
1
k2
(n
kx
—
kx
—
-
.k 2 .k 3
k2
—
«
=
Ar
x
+
Ar
2
+
kz
-
k 2 )\
By
.
the
2
k3
)\
n\
k x \{n
consider the next step in
first step,
_lA(n-k\ln-k -k
^*"Wl
since (n
1.
kx )\ k 2 \(n
-
k.V.
-k x
(/i
-
fci
-k -k -
k 2 )\ k 3 \(n
x
2
k3 )\
3 .
k 3 )\
Pkk
n^ = k
0!
=
=
1.
By 5!
generalizing,
we
= -5L»
=
see that
(1-29)
l
It is perhaps clearer in this derivation than in the former that the order is preserved among all the r groups, each group being a single combination.
37
Classical (a priori) Probability
The symbol
Pk
n
k
...
appears as the coefficient of each term in the
k
+
algebraic expansion of (a x '
+
a2
•
•
+
•
a r) n
and
,
for this reason
is
it
that it is called a multinomial (or polynomial) coefficient. Consider a few illustrative examples. First, how many different ways can five letters be arranged if three of the letters are x and two are p.
The answer
is
5^3,2
=
=
5!/3!2!
and these ten arrangements are and jjxxx. Note
10,
xxxjj, xxjxj, x/'xxj, jxxxj, xx/jx, xjxjx, jxxjx, xj'jxx, jxjxx,
again that the group sequential order
from
jjxxx, although the order
is
important;
In an earlier example, we inquired as to how bridge hands could be dealt from a 52-card deck.
many
permutations are possible
normal way.
We
when
xxxjj
e.g.,
not preserved in the
is
z's
many
is
different
or in the
y's.
different single
Now
let
us ask
how
the four hands are dealt in the
are concerned with permutations because, in the
game
of bridge, the sequential order of North, East, South, and West is important. There are four combinations, each of size 13 cards, to be selected
from a population of 52 cards. Hence, n
52!
P*!,*8 ,*3,*4 =
13! 13 13! 13 1
which
is
a very large number,
viz.,
(5.3645
outcomes,
to the case in
i.e.,
is
.
distribution for-
which the object population or outcome
subdivided into more than two groups of like elements.
For example, a
may
die has six different sides, a jar
different colors than two, a
in a gas
28
)10
•
be generalized to the case of more than two possible
easily
population
•
The binomial
Multinomial distribution formula.
mula can
•
contain balls of
deck of cards has four different
have many different values of velocity,
Ax A2 A3 ,
,
•
•
Ar
•
,
,
Let
.
more
molecules
etc.
Consider a probability situation in which there are possible outcomes, viz.,
suits,
r
mutually exclusive
p be
the probability
t
outcome A occurs at a trial and let n independent trials be made. The probability that outcome A x occurs exactly k 1 times, that outcome A 2 occurs exactly k 2 times, etc., is calculated in a manner identical to that used in deducing the binomial formula. The probability/? of obtaining a partickr ular sequence of outcomes is pfrp^p** p r and if we are not interested in the sequential order in which the A outcome occurs in the k times it is observed, and if we do wish to preserve the sequential order in which the various groups of like outcomes occur, we must multiply by the multinomial coefficient n Pk k 1-29. Thus, k from Eq. that
t
'
'
'
,
i
x
^(particular sequence)
=
:
«!
=
M[(/c x
;
n, p^){k 2
;
./c
n,
2
kr
.
p2 )
•
•
Pi^pf*
'
' :
Pr
lr
.
(k r
;
n,
p r)~\
(1-30)
Probability and Experimental Errors in Science
38
which may be read as the probability that in n independent trials A 1 A 2 occurs exactly k 2 times, etc., when the respective outcome probabilities are p v p 2 etc. Here, k is any integer from to n
occurs exactly k x times,
,
t
=
=
1. with the condition, of course, that ^; = i^, «• Also, of course, ]£• x=1 p in Eq. 1-30 stands for multinomial. It can be shown easily The symbol i
M
that the
sum over
all
values of
A:
Eq. 1-30 gives the expression for the
in
multinomial (ft
and Eq. 1-30
may
is
known
+ ft +
all
*
*
+
PrY
=
(1-31)
1
as the multinomial formula.
be put in graphical form
a graph for
'
if
the graph has r
+
1
Equation 1-30
dimensions; such
values of k represents the multinomial distribution of
probabilities.
An
understanding of the multinomial coefficients and distributions
imperative
if
is
the student seeks an understanding of the kinetic theory of
gases or, indeed, of any physical theory involving statistical mechanics.
Note
well that such theories are of increasing importance
in
the
all
physical sciences.
We may
point out in the interests of general perspective that
the analysis of errors of experimental measurements, we
later, in
shall conceive
of some probability distribution as being the subdivided population of "objects" from which a sample, i.e., a single measurement or a limited
number of trial measurements,
is
sample, with replacement, from the
same
taken.
Each measurement or
trial set is
a
a rather specially subdivided population of
our considerations of the multinomial and the probability per outcome, e.g., the probability of a
sort as that described in
coefficients,
particular measurement,
is
given by the population distribution probability.
This distribution, for which n
remain unknown or
it
is
very large, infinite in
may be assumed to be known. Commonly assumed parent
some It is
instances,
may
also called the
an Poisson and the experimental science are the normal (Gauss) distribution distribution, both of which may be considered as special cases of the binomial distribution as stated earlier in this section. The statistical problem in experimental measurements is generally to infer from a limited number of trial measurements (a) what is the most appropriate parent probability distribution, e.g., normal or Poisson, and (b) what are the quantitative values of its descriptive parameters. Help in the answer to (a) is usually afforded from a priori experience and the particular type of measurement or from statistical analysis from a rather large number of trials; obtaining the answer to (b) is often solely an a posteriori problem. The features of the measurement problem should become clear as the "parent" distribution.
reader progresses in this book.
distributions in
Classical (a priori) Probability
39
Sampling from subdivided populations without replacement: problem and bridge hands. A basic condition in the binomial distribution and in the multinomial distribution is that the component lottery
probability p be constant for
This condition restricts applications
all trials.
to sampling with replacement. But, the use of the binomial coefficient as
giving the
extended
number of combinations can be
to, a
common
replacement from a subdivided population. in
further illustrated by, or
type of problem in which sampling
is
done without is one
This type of problem
which we ask for the probability that a random sample of size/ contains i elements of a specified type k from a population of n elements
exactly
subdivided into n
— k + k2 + x
'
+k
* '
r,
with
r
<
n,
when
the sampling
done without regard to sequential order among the i elements. Suppose, first, that there are only two different subdivisions in n, viz., n = Ar x + k 2 To make the problem concrete, let k x be the number of winning tickets in a lottery in which n tickets have been sold,/ be the number of tickets we have bought, and i be the number of winning tickets that we hold. Then, the desired probability is given by* is
.
(M(.M p(exactly
i)
=
;
v
)'
"
(1-32)
(;) Here,
|
}
I
distributed
is
the
among
number of ways all
which our winning
in
the outstanding winning tickets,
and
tickets
k2
/ I
.
can be \ is
.
the
J
number of ways our
losing tickets can be distributed
standing losing tickets.
among
all
the out-
Since any single combination of winning tickets
can occur with any single combination of losing
tickets, the
product of
the two
numbers gives the total number of ways that our tickets, both winning and losing, can be arrived at from the total number of outstanding tickets. The denominator of Eq. 1-32 is simply the number of combinations of n tickets taken j at a time.
To make this example numerical, suppose that there are 400 tickets sold, we bought ten of them, and that there are four prizes. Then, the probability that we will win exactly one prize is that
396 ^(exactly 1)
=
.
.
9 ,
-
^
0.0934
/400\ I *
Equation 1-32
of p(i)
vs.
i
is
is
also
known
10/
as the hypergeometric probability formula;
called the hypergeometric distribution of probabilities.
a graph
Probability and Experimental Errors in Science
40 Incidentally,
should be obvious that the probability that we
it
or more prizes
is
win one
will
given by
p(\
k( or more) = '"
3 •
_ ) - % 0.0967
) (
(D since the events of winning exactly
mutually exclusive. ability
of winning
Of
all
one
prize, exactly
we bought
course, had
two
400
all
prizes, etc., are
prob-
tickets, the
four prizes would be given by
Kall4)
=
W
400 - 4; \ /400\
=
l
',400/
As another easy example with n = k x + k 2 consider the combination problem given earlier about the number of all-red hands possible in a single hand of bridge. This problem can be discussed in terms of n = 52 cards divided into k 1 = 26 red and k 2 = 26 black cards. An all-red hand corresponds to i = 13 and j = 13. The answer to the problem can be ,
written as
'
v
,
/>(13 red,
black)
=
,
.
26 \ 26W26\ 13/ \ 0//
/26
=
\13
"(3 Or, the probability that a single bridge hand will contain one ace to be
p(\ ace, 12 nonace)
Next, consider an example in which n like elements.
Let us ask,
What
is
similarly for the other suits. in the
hand
is
I
_ I.
in
The
is
subdivided into four groups of
hand of
the probability that a bridge
13 heart cards, h total
<
13, is
I
,
number of combinations
So the desired probability
K t '*A number h specified -fi~4 suit) -rt p(specined in each •
seen
4W48 = 1/U2
cards consists of s spades, h hearts, d diamonds, and c clubs? The
of ways h hearts can be found
is
I
;
= \s)\h)\d)\c)
3
and
possible
is
-
13
number
Classical (a priori) Probability
41
if s = 5, h = 4, d = 3, and c — 1, the answer is 0.00538 problem did not specify the particular suit distribution, the probability that a hand consists of, say, five of one suit, four of another, three of a third, and one of the fourth suit is 4! (= 24) times greater, viz.,
Numerically, If the
0.1293
•••-.
As a
final
groups and
example, consider a case in which n
is
subdivided into four
which the particular sequential order of the four groups is important. We may ask, What is the probability that each of the four bridge hands contains one ace? First, how many ways can four aces be arranged into four ordered groups each of size 1 ? The answer to this ace question
in
is
Second,
how many ways can
the four ordered hands?
4'
_ |- 441
—
p
48*1,1,1,1-
-
m!1!1
the remaining 48 cards be distributed
The answer
to this question
among
is
48! 48*12,12,12,12
12!12!12!12!
Then, since each permutation of the aces and each permutation of the
nonaces are mutually exclusive, we must add up all the separate ways in which they can occur together, i.e., we must multiply the two respective
numbers of permutations
Hence,
to obtain the required probability.
this probability is
=
4!48!/(12!) 52!/(13!)
4
=
24-48!-(13) 4
4
^
Q
m
52!
This, except for problems in Section 1-9,
is
as far in this
book
as
we
shall
pursue the arguments in ideal games of chance. These games have served well in helping us to get acquainted not only with classical or a priori
probability but with the basic principles of probability combinations
and
of combinatorial analysis. The principles are, of course, just as applicable
when 1-8.
the experimental or a posteriori definition of probability
Classical Probability
and Progress
in
is
used.
Experimental
Science
As implied above, upon
the progress of any experimental science
the constant repetition of three steps:
or a conception of the behavior of nature as best calculation of the a priori probabilities
is
based
(1) the invention of a model
we understand
it, (2) a such a of conception, on the basis
and, then, (3) a comparison of the a priori probabilities with actual
measurements,
i.e,
with the experimental or a posteriori probabilities.
42
and Experimental Errors
Probability
in
Science
Confidence in the a priori conception increases to the degree that the comparison is favorable; and the conception is modified to the degree that the comparison is unfavorable. Then, more a priori calculations are made with the new model, more measurements taken, etc. Measurements are always the final arbiters, and in this sense the experimental or a posteriori meaning of probability is the more significant one. But note well that both types of probability are essential in scientific progress. In fact, we have already seen in our discussions of conditional probability and of inferred knowledge some easily recognized examples of the modification of conceptions of nature (hypotheses or theories),
modification of our degree of confidence in them as actual observations
becomes
These are good,
available.
examples of the elements of progress
arguments
science,
in
us
let
reference to statistical mechanics.
albeit simple,
in science.
Applications in statistical mechanics. ability
and of the
new experience of
To
illustrate further the
prob-
amplify very briefly the earlier
The following
illustrations
of binomial
and multinomial probability are extremely cryptic; if the reader has had no previous introduction to the subject of statistical mechanics he may be well advised to skip directly from here to Section 1-9. In the subject of statistical mechanics we are generally interested in specifying the simultaneous values of six properties of each of a large
number of
interacting particles, viz., the three mutually perpendicular
components of position and of momentum. These properties are commonly expressed in a six-dimensional "phase space."
To
express the frequency
any given instant of time for N particles, we need a seven-dimensional graph, the seventh dimension being the number of particles (and if time is also a variable we need eight dimensions). (As a special case, in considering only the velocity distribu-
distribution of these six properties at
tion of molecules of a
components of
monatomic gas we are interested in onlv the three and then we deal with the conventional three-
velocity,
We imagine that phase
dimensional space and a four-dimensional graph.) space the
is
subdivided into numerous regions called
same shape and small numerical
frequency distribution of the being assigned to the distribution
may
be
rth cell.
made
is
W= where some
known
particles
cells,
The problem
among
all
If there are r cells, the
each is
cell
having
to specify the
the cells, k
t
particles
number of ways
this
given by
s Pkl *
2
,..* r
=-^—
(1-33)
W
is may have no particles and some more than one. "thermodynamic probability" of the system if the system
cells
as the
N
size.
Classical (a priori) Probability
can be defined particles
because
it is
manner, and the most probable distribution of which W^is a maximum. (W is not a true probability
in this
that for
is
greater than unity, but
incidentally, the logarithm of
Classical
43
The
statistics.
W
is
it is
proportional to probability; and,
proportional to the entropy.)
so-called
Maxwell-Boltzmann
based on the hypothesis that both the position and particle
can be simultaneously exactly specified.
This
is
is
of a
a reasonable
hypothesis in view of our experience in macroscopic mechanics.
initial
Each point
a
in
of position and is
statistics
momentum
cell in
phase space corresponds to a possible specification
momentum of any one particle. A second initial hypothesis
that every cell
is
equally probable for every particle, subject to the
boundary conditions imposed by the conservation of energy and of the total number of particles. In this case, placing dW\{dt) = 0, where t refers to time, and imposing the boundary conditions, we find (although we shall not prove it now) the Maxwell-Boltzmann distribution k = (NlZ)e~^Wi where Z and /? are constants of the system and w is the energy of a particle ,
t
{
in the /th cell.
Quantum
bosons and fermions.
statistics:
accumulates, the Heisenberg uncertainty principle
and momentum of a given
Then, tells
as
experience
us that the position
cannot be exactly specified at any given volume of a cell in phase space cannot be arbitrarily small. The smallest volume must now be taken as h 3 where h is Planck's constant. The cell of interest is much larger than h3 so we particle
time, and, as a consequence, the
,
,
make
are obliged to
a finer subdivision of each
cell;
now we
reckon n
compartments, each of volume h 3 in each cell. This greatly increases the magnitude of but does not alter the relative probabilities. ,
W
Another new feature of quantum
statistics is that
of the indistinguish-
of identical particles. In classical statistics, if two identical particles exchange cells, we count the new arrangement as a different permutation ability
but in quantum or
cells,
statistics, if
we count
the
reduces the magnitude of
A
two
exchange compartments same permutation. This both
identical particles
new arrangement
as the
W and alters the relative probabilities.
new development
in quantum statistics, a result of the further accumulation of experience and the modification of former hypotheses, is the new property of each type of particle called its spin. Spins are quantized angular momenta and the quantum numbers are of only two
third
kinds
:
integer values (0,
1
,
2,
•
•
•)
and
half-integer values (|,
f
,
•
•
•)•
Particles having integer spin are called bosons (photons, neutral atoms, a-particles, 7r-mesons, etc.);
particles
having half-integer spin are called
fermions (electrons, protons, neutrons, neutrinos, //-mesons, is
no
limit to the
number of bosons
that can
etc.) There occupy a given compartment
Probability and Experimental Errors in Science
44 in
phase space; but the number of fermions per compartment
limited by the Pauli exclusion principle, this limit being is
\,
the magnitude of the spin
quantum number. The
sq the occupancy by electrons
is
limited to
U+
is
severely
1,
where J
spin of an electron
is
(Incidentally, the Pauli
2.
exclusion principle also governs the arrangements of electrons in atoms
where the other quantum numbers enter into the picture; the coordinates of the h 3 volume in phase space correspond to the other quantum numbers.) In the case of bosons, since there are n compartments in the z'th cell, there — 1)! ways of distributing the A^ particles. But of these are n{n + { ways, many represent indistinguishable ways in which particles are merely interchanged between different compartments and between different cells.
N
The
number of
net
distinguishable distributions
w=
n(n
+
-
N,
1)!
In
=
nlNtl
and of the system as a whole,
\
all cells
l\
t
I
t
+ N;-
n
i
is
+N N
considered,
w=Uw = u( This
is
1
(1-35) )
the basic formula for the Bose-Einstein statistics.
In the case of fermions of spin
|,
=
the
maximum number of available
sites
3
where v is the volume of the cell. In the z'th cell, which contains In sites, k of the sites are occupied and 2/z — k are empty. The thermodynamic probability for the z'th cell is given by the number of distinguishable ways In sites can be divided into in
each
cell in
phase space
is 2/<
2r///
,
t
t
two groups,
viz.,
occupied and empty, and
this is
W,
:
=
I
,
,
much
simpler
J
than
in the
boson
case.
The general expression
for the system
is
2
W=f[( f] This
is
(1-36)
the basic formula for the Fermi-Dirac statistics.
In each of these kinds of statistics, the frequency or probability distribution
is
obtained by maximizing
conditions as stated above.
W
consistent with the
boundary
The respective distributions, although not
derived here, are k,
:
=
—— Be pu
>
and k
i
=
—— Be
where B and
ft
'
+ —
for fermions
(1-37)
for bosons
(1-38)
1
1
are constants of the system.
These expressions are to be
45
Classical (a priori) Probability
compared with that given earlier for the Maxwell-Boltzmann statistics; the classical Maxwell-Boltzmann expression is the special case of the boson
<
expression for which (kjn)
1.
mechanics provides an excellent example of the progression of hypotheses or theories from the desperation-in-ignorance type of guess to rather complex conceptions that are more consistent with nature's behavior Statistical
as our experimental knowledge of that behavior accumulates.
claim with impunity that our knowledge
is
No
at all yet complete.
one can Inferred
knowledge of phase space and of the occupancy probabilities is still way to new information and to a better theory
growing, each theory giving as the science develops.
Further pursuit of this subject, and indeed a proper understanding of the features lightly touched
upon
Problems
1-9.
Note the "instructions" preceding 1.
11(b)
beyond the scope of this book.*
here, are
Solve Problems 1(a) and
and
(c), 12(a), (b),
and
3(a)
(c),
(c),
the problems in Section 1-6.
13(a)
and and
and (c), 6(a), and 14 of Section
(b), 5(b)
(b),
(b),
1-6
and
(c),
by using
equations of combinatorial analysis. 2.
What
is
the probability that
among
9
random
digits the digit 7
appears
(a) exactly 3 times,
(b) 3 times or less, (c)
A
3.
and
more than once? horseshoe contains 8
(a) In
(b) If
how many
nails.
different orders
may
they be driven?
shoe were to be attached in a different way to each horse, and
1
horse were shoed in
own
its
6-ft stall,
how many
miles of
stalls
if
each
would be
required ?
How
4.
long must a series of random digits be so that there
is
a probability of
0.9 that the digit 7 appears (a) at least once,
and
(b) at least twice?
How many
5.
(a)
<5
(b)
>0.5 for
for
Make
6.
no
dice ace,
must be and
at least
1
cast together in order for the probability to be (ans. 7)
pair of aces?
the histogram,
(ans. 9)
and indicate the position of the expectation
value, for
the binomial distributions
B(k;
B(k;l,h)-
*
F.
and
(a)
(b)
6, £),
For an elementary treatment of these features of
W.
(Addison-Wesley Publishing Co.,
New
statistical
of Gases, and York, 1955), 2nd ed.
Sears, Thermodynamics, The Kinetic Theory
mechanics, see, Statistical
e.g.,
Mechanics
Probability and Experimental Errors in Science
46 7.
What
the probability that in 7 tosses of a coin the
is
odd
tosses will
show
heads and the even tosses tails? 8.
in
If
birthdays are
random among n people
same birthday? An answer 9.
The
in
10.
11.
The
of the
letters
is
"peep"?
A
set
What
in order.
word "pepper"
What
The cards
is
the probability that at least
now
What
10 years old.
of them will
(a) exactly 3
them none of them will
live to
(b) at least 3 of
will live to
(c)
live to
13.
is
this
living at the
age of
Each of
5
the probability that
be 21, be 21, and
be 21
?
How would you plot the frequency distribution of a trinomial distribution ?
14. (a)
Show
random
digits (this
if
(n
that
if
+
\)p
is
a decanomial distribution).
the binomial probability /(£)
/12
-k\
=
B{k;
12, \),
then
1
(XT!
) 2
/<*>•
the general expression for the binomial probability/^
is
is
equal to an integer.
f (k + » = What
is
that, in a binomial distribution, the most probable value k
Show
double-valued
(c)
of the 2
an a posteriori rather than an a priori probability problem?
Discuss the distribution of 800
(b)
is
1
the probability that neither of the
is
According to a table of mortality, of 100,000 persons is
are then
the probability that the
is
10 years, 91,900 on the average are living at the age of 21 years.
Why
are then
of dominoes runs from double blank to double N.
1
(b) If 2 dominoes are drawn, what numbers on either of them is /V?
children
n
the probability that the
is
are written on cards.
in order.
domino is drawn, what numbers on it is N? (a) If
12.
is
"oral"?
is
thoroughly shuffled, and 4 drawn result
large
The cards
of the word "tailor" are written on cards.
letters
how
0.5 that 2
;
thoroughly shuffled, and 4 drawn result
at a certain party,
and only 2 of the people have the the form of an equation suffices.
order that the probability be
+
1)
for
any n and for any p ? 15.
What
are the probabilities
(a) that (i)
(b) that
(i)
1
1
ace and
(ii)
at least
double ace and
(ii)
1
ace will appear in a cast of 4 dice, and
at least
1
double ace
will
appear
in
24 casts of
2 dice? [That the (a) probability in each case
Mere's paradox.
De Mere
argued
is
known as de number of number of possible
greater than the (b)
that, since
4
is
is
to 6 (the total
possible events in a cast of 1 die) as 24 is to 36 (the total events in a cast with 2 dice), the respective probabilities should be equal.] 16.
What
is
the probability that in a sample of 10
random
equal? Compare the answer with that obtained by using
digits
no 2 are
Stirling's formula.
47
Classical (a priori) Probability that the ratio of male to female children
Assuming
17.
find the probability
is 1,
that in a family of 6 children
be of the same sex,
(a) all children will
4 oldest children
(b) the
will
(ans.
be boys and the 2 youngest
will
be
3i>)
and
girls,
(ans. 6\)
the children will be boys.
(c) exactly half
(ans. re)
A
box contains 90 good and 10 defective screws. If 10 screws are selected as a sample from the box, what is the probability that none in the sample is 18.
defective
if
(ans. 0.330 ••
sampling is done without replacement, and (b) with replacement? (a)
How many
19. (a)
different
(ans. 0.348
outcomes (permutations) are possible
k
=
12,
appearing twice
)
(ans. 6*0
what
(i.e.,
the probability for the event of every face
is
number
(ans. 0.003438
2 aces, 2 deuces, 2 treys, etc.)?
How many
20. In a lottery of 10,000 tickets there are 100 prizes.
must a person buy so that the probability of his winning 50%? An approximate answer suffices.
at least
1
•
tickets
prize will
(ans: 69)
exceed 21.
•
••
in the cast
of k dice together ? (b) If
•
A certain professor always carries 2 match boxes, each initially containing
25 matches. Every time he wants a match he selects a box at random. Inevitably a moment occurs when, for the first time, he finds a box empty. Then, what is the probability that the other
box contains
r
=
0, 1, 2,
•
•
•
matches?
22. In a laboratory experiment, projectiles (small steel balls) are shot (at
random
The screen
times) through a screen (the spokes of a rotating wheel).
(spokes) deflects or scatters
some and allows
others to pass through undeflected.
Suppose 8 projectiles are shot. Suppose that the probability of each passing through undeflected is 0.8. Compute and plot the probability distribution for traversals without deflection. If the experiment were to be repeated many times, what proportion of the trials would yield results within ±2 of the mean value? This
is
a typical "scattering cross-section" experiment in which, usually, the is determined from the observed numbers of undeflected
basic event probability/?
When
projectiles. it is
the experiment
is
performed for the purpose of determining/?,
a typical a posteriori probability experiment.
23.
Among
(a) If
N
two
TV different keys,
probability that the lock will be (b)
What
will
open a certain
100 and half of the keys are selected at
is
is
lock.
random
to try,
opened?
what
is
the
(ans. 0.752)
the limiting value of this probability as
N increases
indefinitely? (ans. |)
(c) If TV is 100,
how many
keys should be selected to try in order that there
should be just more than an even chance of opening the lock? 24. (a) If the
on any
odds are k to
particular day,
show
1
(ans. 35)
against a machinist's meeting with an accident
that the
odds are
(1
+
\jk) n
—
1
to
1
against
escaping injury for n days. (b) If
k
=
1000,
escaping injury?
what
is
the greatest value of n giving favorable odds for (ans. 693)
48
Probability
The A gene
25.
Aa have
A
the
is
and Experimental Errors
dominant, the a recessive;
characteristic,
i.e.,
and of type aa the a
Science
in
A A and Assume {, i,
organisms of types
characteristic.
and \ to be the a priori probabilities for the gene types AA, Aa, and aa (Assume Aa = aA). (a) If both parents are A, what is the probability that an offspring will be a?
respectively.
(ans. £)
(b) If all
4 grandparents and both parents are A, what
the second generation will be 26. In testing
ESP
is
the probability that
A?
(ans. if)
(extrasensory perception), an experiment
is
conducted
with 4 red and 4 black cards. The cards are thoroughly shuffled and placed face
down on
the table.
black cards, but he
The person A to be tested is told that there are 4 red and 4 knows nothing as to their arrangement. Person B draws a
card and, without either looking at If
A
answers "red,"
B places
it
it
on one
himself or showing
it
side of the table
A
on the other side. This process drawn. Let us assume A has no ESP. places
(a)
it
What
is
is
;
if
repeated until
the probability that there will be just
1
to A, asks
its
color.
all
cards have been
black card in the "red"
pile?
A)
(ans.
(b) If the first card to
appear
is
black but
is
called red,
what
is
the probability
that there will be exactly 3 red cards in the "red" pile at the
experiment? the
(c) If
end of the 4
(ans. first
card
is
called correctly,
what
is
B
answers "black,"
3 5)
the probability of having
exactly 3 correct cards in each pile?
(ans. §§)
27. In the game of craps, the person casting 2 dice wins if he gets a 7 or an II on the first cast or, alternatively, if the first sum is a 4, 5, 6, 8, 9, or 10 and the same sum reappears before a 7 appears in any cast after the first. (a) What is the win probability when the game is defined in this way? (ans. 0.49293
•
•
)
Sometimes the game is defined so that the player does not automatically lose if he casts a 3 on the first throw, and 3 is then added to the winning sums for succesive throws. What is the win probability in this case? (ans. 0.50682 (b)
•
28.
A
poker hand of
5 cards
is
dealt
•
from an ordinary 52-card deck. What
•
is
the probability for each of the following: (ans. 0.422)
(a) a single pair,
(ans. 0.0476)
(b) 2 pairs, (c) 3
of a kind,
(ans. 0.0211)
(d) straight (5-card sequence, ace permitted at either end, including a flush), (ans. 0.00394) (e) flush (5 (f) full
(g)
cards in a single
suit,
including a straight),
4 of a kind,
(ans. 0.00024)
(h) straight flush (including a royal flush), (i)
royal flush, and
(j)
"opener"
(a pair
(ans. 0.00198) (ans. 0.00144)
house,
(ans. 0.0000155) (ans. 0.0000015)
of jacks or better)?
(ans. 0.206)
49
Experimental (a posteriori) Probability
29. Consider the "pros" and "cons" of the following system of betting: Suppose in successive games, in each of which the odds are 50-50, you bet SI. At any time that you win, you pocket the winnings and start betting again at SI At any time that you lose, you bet double the amount on the next game. No matter how long the series of consecutive losses, when you win you are $1 ahead as though the losses had not occurred. (a) If you were the owner of a gambling house, under what conditions would you allow a client to use this system? (b) How would you alter the system if the odds were known to be 75-25?
(In considering both parts of this problem, ignore the usual bias
the house in
C.
its
own
EXPERIMENTAL 1-1 0.
imposed by
interest.)
(A POSTERIORI) PROBABILITY
Definition of Experimental Probability
Suppose that for some reason we wish to check the
classical (a priori)
idea that the probability for observing a head with a tossed coin
The obvious
thing to do
keep a record of the
results.
We
moderately large number n obs of independent ratio
u' ot)S /tf obs is,
|.
We
trials.
say that the
for this value of « obs the best experimental value of the ,
probability for heads in any single toss,
e.g.,
in this value increases as n oX)S is increased.
experimental probability
is
fluctuate rather erratically
probability steadies
down
the next toss.
Indeed,
if
Our confidence
the value of this
plotted as a function of « obs
when n ohs
is
small, but, as
/?
,
it
is
Fig. 1-3.
By
the
definition, the experi-
mental probability (sometimes called the frequency probability)
becomes
seen to
obs increases,
to an apparently constant equilibrium value.
A typical graph of this sort is shown in this ratio as « obs
is
number of times and to observe heads u obs times after some
to toss the coin a large
is
is
simply
indefinitely large, viz.,
pobs
=
limit
^
(1-39)
nobs— 00 Hobs the outcome of each trial (toss) is (a) independent of all preceding trials, and (b) determined entirely by chance. There are four difficulties with this definition. First, how can we be sure that all the trials are independent? The practical problem here is that the coin may wear out asymmetrically or that the person (or device) tossing the coin gradually but inadvertently acquires a "system" which favors a particular outcome. It should be noted here that we do not require the absence of a "system," but merely that if it is present it must
if
Probability and Experimental Errors in Science
50 Heads
Tails
400
Fig. 1-3. Experimental probability (frequency ratio for "heads") steadies
apparently equilibrium constant value as n
bs increases.
down
to
an
(Note the logarithmic abscissa
scale.)
Second,
remain constant. trial
is
how can we
be sure that the outcome of each
determined entirely by chance?
related to the
The
practical
one for the independence of successive
problem here
trials.
is
Third, the
limit n obs -*> oo is obviously impractical. In this regard, we substitute a conceptual extension of the experiment after « obs has become "satis-
However, the value of p obs for any large but finite n obs it as a consequence of the fact that n obs not strictly converge mathematically no ratio does finite. Fourth, the is matter how large « obs becomes. This is because, after any specified n obs
factorily" large.
contains some small uncertainty in
,
there
The
is
a
finite
chance that a long run of heads (or of
tails) will
occur.
experimentalist points out that as n obs increases, such a run must be of
increasing length to have a given effect in the value of p ob9
,
and
that after
verv ' ar g e trie probability for having a significant effect of this sort >*obs This has been proved mathematically in is so small as to be negligible. terms of the so-called strong law of large numbers. It is important, i
s
nevertheless, that n obs be very large indeed if p obs is to be expressed with very high precision. Later we shall show that the standard deviation,
a measure of the
statistical uncertainty, in the
proportional to the square root of n oba
.
measure of p obs
is
inversely
Experimental (a posteriori) Probability
SI
Even with these difficulties, the experimental definition is the one that must be invoked to "prove" that the coin is "satisfactorily honest," i.e., that the a priori probability is reasonably valid, or sometimes even to prove that a very complex combinatorial analysis is indeed correct.
Number of "equally probable outcomes" meaningless. Outside the realm of ideal games numerous probability situations exist in which the number of equally probable outcomes is entirely meaningFor these situations the classical probability, Eq. 1-1, cannot be Examples are legion: A marksman shoots at a target; evaluated. what is the probability of a hit? What is the probability that a particular person of given age will die within one year? What is the probability that a given house will be ravaged by fire within a specified time? If a baby is to be born, what is the probability that it will be a boy? What is the probability that John Doe, a candidate for public office, will be elected? What is the probability that the next measurement of cosmic-ray intensity will differ by a given per cent from the immediately preceding measurement? What is the probability that two different measurements of the velocity question of the
less.
of light agree within the experimental errors? In such probability situations
we
are at a complete loss in trying to apply the classical definition for
the probability.
Rather than rely on "armchair" reasoning, or make a
basic desperation-in-ignorance guess,
we may experiment, make
actual
measurements, and use the experimental definition of probability. I-II.
Example: Quality Control
Determining the experimental probability of a specified outcome generally involves rather intricate statistical reasoning in order to achieve
satisfactory numerical value with a
minimum of
the heads probability in tossing a coin
is
effort.
very simple.
a
The example of
To
illustrate the
problem discussed problem, a random sample of limited
typical complexity, let us consider the lottery type of in the last part
sizey
was
of Section
selected
from a
1-7.
In this
large population n subdivided into n
We
=
kx
+
k2
how many elements of the kind k x may we expect to have in the sample j. Suppose now that we alter this problem as follows. The numerical value of n is known but the division of n between k 1 and k 2 is not known, and we wish, from
with
all
numerical values known.
inquired then as to
an observation of the number of k x elements in j, to determine the ratio kjn. This is a typical problem in what is called "quality control." It is instructive to consider this type of problem a little further because it illustrates one essential feature of the measurement problem in an experi/'
mental science.
A factory turns out a very large number n of supposedly identical items,
Probability and Experimental Errors in Science
52 but some
unknown
fraction are defective,
whether or not
infer
can be discussed
in
this fraction
and we wish, by sampling,
to
exceeds a specified value. The problem
terms of the equation for the probability for having
i
defectives in sample j, viz.,
" (M ( '~ M //l; /; n
KexactlyO='
(;)
As
a
first
approximation,
this expression for /^(exactly
"equal" to the observed ratio
i/j,
i)
may
be placed
and a value of the defective fraction kjn
deduced therefrom. The difficulty is that a different value of kjn is obtained from each different sample ratio i/j. Of course, the reliability of the deduced value of kjn increases as the sample size increases or as the
number of independent samplings increases to provide a more reliable mean value of i/j. The problem really is to determine, for preassigned reliability in k x jn, the optimum sample size and number of samples commensurate with a minimum of effort in examining the samples. There are various statistical arguments in treating quality control problems of this sort, and discussion of them is beyond the scope of this
But one approach to this problem, in case n is very much larger mentioned now because it ties together some of the concepts discussed earlier in this chapter. In this case, the problem can be approxibook.
than
y, is
mated by one of sampling with replacement. Then, the binomial equation can be used, Eq.
1-20, viz., ]-i
\\)l
and the problem becomes one of determining the parameter p (= kjri) from the observed ratio i/j. Suppose that a guess as to the true value of p puts it in the range 2 to 4%, and that it suffices to know/? to one significant figure.
hypotheses,
One procedure then is to make five mutually exclusive 2, 3, 4, or 5 % and to guess initially (in desperation) p = all equally likely, i.e., probability \. The binomial probability
viz.,
that they are
1
,
j, p) may be calculated for each value of p, and comparison with the outcomes of successive independent samples serves to
distributions B(i;
increase the probability that one of the hypotheses
is
to be favored over
the others.
1-12.
Example: Direct Measurements
Now
let
in
Science
us extend the quality control problem so as to
to a typical
measurement problem.
make
it
similar
This illustrates a most significant
Experimental (a posteriori) Probability
S3
application of the multinomial probability distribution in the science of
measurements. As a
first
step in this extension, suppose that the definition
of "defective" involves an upper and a lower limit of tolerance in the pertinent aspect of quality, e.g., in a linear dimension such as the diameter of ball bearings. With n
=
+
kx
k2
+
The problem,
k3
if
n
this extension, n r
i.e.,
,
=
3,
subdivided into three categories,
is
with category k 2 being "nondefective."
very large compared toy, becomes one of multinomial
is
p x p 2 pz unknown. In this optimum sample size and number of the compromise between reliability and
probabilities with the respective probabilities
determination of the
the
case,
samples, with consideration for
,
,
even more complicated than in the case in which n was divided
effort, is
two
into only
categories,
and we
shall
not attempt
it
further here.
Next, suppose that the n elements are subdivided into a
much
larger
number of different categories. Suppose that these categories are ordered in terms of some numerical characteristic of the elements, perhaps the diameter of the ball bearings. Our objective is to infer the average or arithmetic mean value of the entire population of n possible values from a sample of
In the determination of a length
size j.
(e.g.,
the balls as measured with a pair of calipers or with instrument),
we
take a
number j of independent
a sample of size/, from an essentially
From i
z
+
•
'
,
h
-,
i
r
'
K
=j
and
the arithmetic mean.
+
^1
+h+
^2
of k x
,
i
2
of k2 ,
large that adjacent numerical
limit
is
'
'
kr
<
n
mean
value, a valid inference r in
n
is
in this case so
we can read some other instrument indeed, if
k values are as
the vernier caliper scale or the scale of
no upper or lower
'
We infer that the length being measured
reasonably large. The number of subdivisions
infinite
ij
trials.
,
has a value reasonably close to the measured
be
i.e.,
population of possible
of k r where
+h+ "+
we calculate if / is
fine
measurements,
the variety of values observed in this sample, viz.,
of k3
h
infinite
trial
the diameter of
some other
closely spaced as
;
imposed on the
size
of the ball bearings,
r
may
even though the effective least count in the measurement scale
is
finite.
The
quality control problem in simple
form
is
also seen to be identical
to the problem of determining the experimental probability of heads in a
coin toss from a sample size « obs (=/) taken from an infinite population n. In this case, the number of subdivisions r of n is only two, viz., heads and
The problem in slightly extended form is also similar to the one of determining the experimental probability for a deuce in the cast of a sixtails.
= co and r = 6. However, as implied above, in a typical measurement problem, n is infinite but the number of subdivisions,
sided die, n scientific
although
infinite in principle,
may
be limited in practice by the effective
Probability and Experimental Errors in Science
54
count of the measurement scale and by the largest and smallest possible measurements.
least
As mentioned it
turns out
direct
in
earlier in this chapter
an experimental science
and again that, in
measurements, the very large population n
real-life
in
many
aspect of the
Chapters 4 and
5,
of the problems of
(infinite)
can be taken as
subdivided according to either the normal (Gauss) distribution or the
Poisson distribution.
This knowledge makes for great simplification
in
determining the experimental probability with reasonable precision by
invoking one of the mathematical models that
is
based on axiomatic or
classical probability concepts.
But before we discuss these features let us introduce in the next chapter some basic notions about measurements in general and about elementary statistics.
"We of
are
in
having
the ordinary position of scientists
we
with
piecemeal
make several but we cannot make anything
improvements: clearer,
content
be
to
can
things clear."
FRANK PLUMPTON RAMSEY
2
"Probability
is
a
measure of the importance
of our ignorance."
THORTON
C.
FRY
Direct Measurements:
Simple
Statistics
MEASUREMENTS
A.
2-1.
The Nature
Most people
IN SCIENCE:
ORIENTATION
of a Scientific Fact
are strongly inclined to the idea that a so-called "fact"
immutable, an absolute truth, and that science especially yields such truths. But as we study science and its philosophical implications, we is
is entirely foreign to science. It becomes necessary two kinds of facts: (1) those known by the Omniscient, and (2) those devised by man. Only the former are absolute.* Complete certainty is never the mark of a scientific fact, although it is the business of scientific endeavor to reduce the uncertainty as much as
find that absolute truth
to distinguish
In
possible.
many
instances, the residual uncertainty
and some people may be is
inclined to say that
not valid in principle. Scientific knowledge
it is
is
is
very small indeed,
negligible.
Such neglect
always inferred knowledge,
knowledge based on a limited number of observations. But it is to be emphasized that, as our experience accumulates and our interpretations
i.e.,
*
The
mean
choice.
Wendell Holmes wrote: "When I say a thing is true, I cannot help believing it. I am stating an experience as to which there is no
late Justice Oliver
that
I
But ...
inabilities
I
do not venture
of the universe.
and leave absolute truth Doubts,"
Illinois
Law
I
to
assume that
my
inabilities in the
way of thought are
therefore define the truth as the system of
for those
who
are better equipped."
Rev., 10 (1915).]
55
my
limitations,
[From "Ideals and
Probability and Experimental Errors in Science
56
become more critical and more and more reliable.*
Some
objective, scientific
knowledge becomes steadily
people have argued that science, more than any other subject,
responsible for the progressive emancipation of men's minds the philosophy of absolute truths.
Whether science
absolutes, toward absolutes, or neither,
or ethics, for
meaning.
it is
only
in
Science, per
concepts of science,
is
directs
is
away from away from
a matter of one's religious faith
such faith or cultural patterns that absolutes have se,
like the
is
necessarily silent
on
this question. f
The
concept of probability and for essentially the
same reason, are "open-ended" concepts. Whether or not these concepts may be the basis of a new type of religion or ethics is also a matter of opinion which we shall not discuss in this book. The feature that distinguishes scientific knowledge is not only that there is
a clear recognition of uncertainty but that the degree of uncertainty can
usually be rather well determined.
This determination
is
carried out by
and probability theory. As we accumulate scientific facts, including knowledge of their intrinsic uncertainty, our philosophical intuition grows, and we are no longer dismayed that scientific the
methods of
facts are if
statistics
"merely" probable; indeed, we
the probable error
by man) because
it
is
known,
is
most
realize that reliable
probable knowledge,
of all knowledge (devised
includes a realistic self-appraisal.
The fundamental
truth or fact in an experimental science, e.g., in physics,
chemistry, biology, engineering,
ment.
the
etc., is
always an observation, a measure-
Prediction, or.a generalized description of the behavior of nature,
an important goal of the science, but the degree of reliability of the is no better than the measurements upon which it is based. Careful analysis of the reliability of measurements therefore is necessarily an early step in achieving scientific maturity. is
prediction or of the distilled principle
Measurements and
2-2. Trial
As an example of
a measurement, consider the direct determination of
the length of a piece of wire.
measurements with a *
During the
last half
Statistics
Suppose that we have made 12 independent count (smallest scale division)
ruler having a least
of the nineteenth century, Newtonian mechanics and classical all the observations that
electromagnetic theory were able to "explain" just about
mechanics and electricity. Then, some measurements were made with some experiments were carried out in a new domain, viz., atomic physics. It was immediately necessary to modify the then-current theories to encompass the new observations. Examples of the need for, and the process of, increasing scientific reliability still continue in every active facet of science, and this
had been made
in
greater precision, and
situation will persist indefinitely so long as that facet of science remains active. t
This fact
is
in
no sense a belittlement of the
social
and
attending absolute truths; see the "Dedication" of this book.
intellectual
problems
Measurements
in Science:
Orientation
Table 2-1. Typical Set of Measurements
Measured Value Trial
1
(mm
units)
57
Probability
58
t
and Experimental Errors
in
Science
Measurements
in Science:
Orientation
The branch of applied mathematics or as nearly ideal
A
that treats
data as possible,
trial
fundamental postulate
is
and
interprets trial data,
called statistics.
in statistics is that the variations in a set
ideal (or near-ideal) trial data are strictly
chance.
59
(The concept of random
is
random,
i.e.,
of
are due entirely to
discussed in the next section.)
It is
assumed, of course, that the property being measured does not change during the measurements. The bulk of the problems in statistics deal with data for which the actual degree of approximation to random ideal trial data re-
and
quires careful thought
test (e.g.,
ages of people at the time of death, sex
distribution in a population, accidents
ravaged by
on the highway, number of houses
votes in an election, nutritive value of milk from cows on
fire,
etc.). But in a set of painstaking measurements in an experimental science, especially in laboratory physical science in which the subjects are inanimate, the random ideal trial nature of the measurements can often be safely assumed. Often, however, it is desirable
unintentionally different diets,
to carry out a test to check specifically for the presence or constancy of
systematic (nonrandom) errors in the measurements. It is well known that, in comparison with the physical scientist who works under the "controlled" conditions of a laboratory, either the biologist or the nonlaboratory physical scientist* must typically put up
with certain extraneous factors that controlled.
With
minimize the in
relative
approximating
ments.
The
diligent design
make
his
experiments
and performance of
less
well
his experiments to
importance of these factors, he also often succeeds
satisfactorily the conditions of
social scientist,
on the other hand,
is
random
trial
measure-
frequently confronted
with such extraneous factors that the majority of his measurements are
perhaps better described as investigations than as
scientific
experiments.
Certain methods in statistics have been developed to give special attention
and with these complications the subject of In this elementary book for the student of experimental science, we shall for the most part pass over the statistical treatments of extraneous factors; when the student needs them, he can to the extraneous factors,
statistics is necessarily intricate.
find these treatments in the literature.
2-3.
Random
Variation
The mathematician
defines
random
(or stochastic) as the adjective
modifying a variable whose value depends on the outcome of a random experiment.
A
random experiment
is
one whose possible outcomes are
all
equally probable, or, better, for which each possible outcome (each point *
Examples of nonlaboratory physical sciences are astronomy, meteorology, cosmic geomagnetism, cosmology, etc.
rays,
60
Probability and Experimental Errors in Science
in
sample space) has a fixed probability. Idealized penny tossing, drawing
a number from a
hat, etc., are often cited as such "experiments."
Also,
random numbers are digits arranged in a random manner. The phrase "random numbers" is short for "randomly generated numbers."* The experiment or the process of generation of random numbers in real life is left to the scientist to devise; and, confronted with
to the mathematician,
this nontrivial task,
even the scientist would prefer to be a mathematician
(or an armchair philosopher).f It is
The
impossible to give a rigorous operational definition of random.
subjective meaning, however,
may
statements
A
set
is
not
The following
difficult to grasp.
be helpful.
of generally nonidentical numbers has one essential feature of
randomness
if,
as the
numbers are
successively revealed, the next
number
in the series has an a priori equal chance of being larger or smaller than the median valued of the already revealed numbers. Another necessary
condition of randomness errors, the first
moment
absence of inconstant systematic
that, in the
is
of the
set is
zero
when taken about
the arithmetic
mean value, i.e., moments is discussed presently.) Identification of randomness in terms of the mean value is not really practical because inconstant systematic errors the random deviations must add up
are never completely absent;
because
it
the
A
(The concept of
attempt has significance, however,
emphasizes consideration of the
sented by the individual numbers
to zero.
sizes
of the deviations repre-
as well as the respective algebraic signs.
single event in a set of generally different events,
whether each event
numerical measurement or some other type of observation,
is
random
is
a
if it
has an a priori constant chance of occurring regardless of the position of this event in the
ordered
depend on the position
set
(although the magnitude of the chance
in the set).
The
adjective "a priori"
is
used
in
may
two of
these statements, and, strictly speaking, the a priori chance cannot be
determined
—
it
can merely be inferred.
The concept of random
an especially interesting one. In science, it It is properly defined in terms is intrinsically an "open-ended" concept: of the chance happening of a future event, always about the unknown. Hence, it does not properly apply in the description of a number or of a set of numbers (or of an event or of a set of events) that has already been is
* A book of "random" numbers has been published, A Million Random Digits, by The Free Press, Glencoe, 111. t And the mathematician said "Let there be random numbers," and lo and behold it came to pass: there were random numbers. Only a mathematician can get away
with X
this.
The median, discussed
later in this chapter,
is
defined as the middle value, or as the
interpolated middle value, of a set of ordered numbers. If the histogram of the is
symmetrical, the median and the arithmetic
mean have
the
same
value.
numbers
Measurements
Nor can any
revealed.
of a
Orientation
61
a posteriori test of the
randomness of a number or
of numbers (or events) be completely satisfactory. The best we can
set
do is
in Science:
from the already revealed numbers whether or not the next-tonumber may be expected to be a random member of the set.* This past-vs.-future aspect of random has philosophical fascination, and some people say (erroneously) that the inherent arbitrariness of any operational definition of random prevents the subject of probability from to infer
be-revealed
being properly a part of a science.
Actually, every operational definition
most cases) arbitrariness or unknowledge is complete, no measure-
in science has a residual (albeit small in
certainty in
ment It
inasmuch as no
it
scientific
exact.
should be mentioned that, as regards the experimental concept of
random, the terms "equal chance" and "constant chance" in the respective statements above have significant meaning only in terms of a very large set of observations. The set must be sufficiently large that the statistical pattern of the variation, including the median or the arithmetic mean value of the already revealed numbers, has taken on an essentially equilibrium value (see the definition of experimental probability, Eq. 1-39). It
apparent that the terms "trial" and "random" are somewhat
is
An
related. refers to
interesting distinction between them is the following: trial an experimental process (although an impractical one since an
actual process never has quite the perfection required in the strict definition of trial
trial),
whereas random
is
a mathematical condition.
implies a striving for a real-life perfection, whereas
a kind of perfection by acclamation;
this is
In a sense,
random
refers to
a characteristic difference
between a science and mathematics.
As was
stated before, simple statistical methods treat sets of trial data which the variations are assumed to be satisfactorily random. And, fortunately, in an experimental science, the assumption of random variations in successive trial measurements is often satisfactory per se. in
2-4.
Probability Theory in Statistics
In treating
random trial data, it is often possible to invoke a mathematical
model of the variations among the it
trials.
model
If this
enables the statistician or the experimenter to
limited
number of
trials,
is
not too complex,
quickly,
i.e.,
with a
very significant computations about such pro-
perties of the set as (a) the best value
and
its reliability, (b)
with which a particular result or measurement *
make
may
the frequency
be expected to occur
Comment on the following story The doctor shook his head as he finished examin"You have a very serious disease," he said. "Nine out of ten people :
ing the patient.
having this disease die of it. But you are lucky because you came to me. had nine patients all of whom died of it."
I
have already
Probability
62
and Experimental Errors
in
Science
when a certain number of trials are made, (c) the number of trials that need be made for a specified precision in the best value, etc. The branch of statistics that applies and/or develops mathematical models for random trial data is called probability theory. The simplest mathematical model is the binomial distribution. This was initially model, whose formula was derived and discussed in Chapter 1
devised for the simple ideal games of chance dice, dealing cards, etc.).
(e.g.,
,
tossing coins, casting
The two mathematical models of outstanding
an experimental science are the normal (or Gauss) distribution and the Poisson distribution, both of which, mentioned briefly in Chapter 1, may be considered as limiting cases of the binomial distribution. importance
It
in
turns out that one or the other of these two models very often satis-
measurements,* and only a rudimentary knowledge of the subject is necessary to enable the experimenter to decide which of these two models is the one of interest. Procedures for testing the degree of fit are discussed later. These models do not involve
factorily "fits" a set of actual direct
advanced mathematics (beyond elementary calculus), and tions are of
and then 2-5.
in
immense help in designing the experiment analyzing and interpreting the results.
their applica-
in the first place
Computed Measurements
The type of measurements discussed measurements.
in
the last section are direct
computed or
Usually, at least in the physical sciences,
derived quantities, also called "measurements," are
more
frequently the
An example of a computed measurement is that of which is obtained from the directly measured quantities of distance and time. Other examples are legion. After the direct measurements have been recorded and the best value and its reliability determined, we apply the appropriate statistical formula for the propagation of errors and determine the reliability of the computed result. The probability model, if it exists, for computed results is generally different from that applicable to the direct measurements, and usually no simple model applies.! Hence, with little or no prospect of finding a satisfactory model for computed measurements, we must be content with the more limited
focus of attention. velocity
No mathematical model distribution conforms any set of experimental measurements. Whether or not the actual degree of "misfit" and the consequences thereof are serious depends upon the care and pains that the measurements justify or that the experimenter is willing to take. *
"Fits" needs further comment.
strictly to
t
Note
that
if
the direct measurements are
made on
a scale that
respect to the errors themselves, then generally no simple in
such a case that a better
fitting
than for the direct measurements.
model may be found
for
is
nonlinear with
model applies; it is possible the computed measurements
63
Basic Definitions: Errors, Significant Figures, Etc. potentialities
of the
statistical precision indices as
discussed in Part
C
of
measurements
in
an
this chapter.
Conclusions
2-6.
With the
special characteristics of
experimental science,
we
most of the
direct
usually assume satisfactory compliance with the
two assumptions:
random
independent measurements carried and for which there is a constant characteristic chance that any particular possible measurement will occur as the next measurement, and (2) a simple mathematical model of the variations. (1)
trial
measurements,
i.e.,
out under identical conditions,
Then, the pertinent principles and
details of the generally
complicated
and probability theory are not very formidable even to the beginner. The reliability of computed measurements, and of any direct measurements that are not satisfactorily fitted by a simple mathematical model, may be obtained by the statistical precision indices without resort to any model. The general objectives in the statistical treatment of measurements are (1) the determination of the best (or sometimes the median or the most probable) value from a limited number of trials, (2) the specification of and
specialized subject of statistics
the reliability of the best value, (3) a statement of the probability that the
measurement would have a particular value, and (4) assistance and performance of the experiment so as to obtain a desired degree of reliability (expressed as an error) with a minimum of effort. next
trial
in the design
B.
BASIC DEFINITIONS: ERRORS, SIGNIFICANT FIGURES, ETC. There are
many
different specific aspects of the error concept.
discuss these aspects
second, in Part
ments.
C
first
of this chapter, as they apply to
sets
of
Errors in the latter case are especially interesting
tion observed
among
We
as they apply to individual measurements
the trials
is fitted
trial
when
shall
and
measurethe varia-
by a simple mathematical model,
but in the remaining two parts of the present chapter individual measure-
ments and sets of trial measurements are discussed without regard to any mathematical model. Discussion of the models and of the probability predictions will be delayed until later chapters.
Some
basic concepts
and
definitions have already
been given or implied
Probability and Experimental Errors
64 in Part
A
ments,
random
of
this chapter, viz., the
nature of a scientific fact,
in
trial
Science
measure-
variations in measurements, histogram, frequency distri-
bution, frequency curve, statistics as a general subject, probability theory, direct
and computed measurements,
etc.
We now explore some additional
terms, principally those dealing with errors, with elementary statistics,
and with precision of
measurements.
Types of Errors
2-7.
It
direct
is
convenient to subdivide the general concept of error into three
broad types, viz., random errors, systematic errors, and blunders. In our present discussion, blunders should be immediately dismissed with the appropriate embarrassment, but a few examples are mentioned briefly below. In general, the term experimental error is some additive function of all three.
Random
(or accidental) error.
concern in the
statistical analysis
of random error are in
(1)
A
Random
errors are of the greatest
of measurements. Four separate meanings
common
use as follows:
deviation or statistical fluctuation
(Eq.
2-5) is the
difference
between a single measured value and the "best" value of a set of measurements whose variation is apparently random. The "best" value is defined
mean of all the actual trial measurements. random it is necessary that the systematic errors
for this purpose as the arithmetic
[For the deviations to be
(mentioned presently) be either absent or not change as the trial set is obtained.] For a symmetrical frequency distribution that is unimodal (has only one
maximum
in
it),
the arithmetic
mean
is
obviously the "best"
value and also the most probable value; for an asymmetrical distribution, the
mean
somewhat arbitrary but is supported by shown later. For an asymmetrical distridepends upon the intended use, and sometimes the
as the "best" value
is
the principle of least squares as
bution, "best" really
median or the most probable value of a deviation the mean (2)
Random
is
is
preferable; but for the determination
conventionally used.
error sometimes refers to the difference between the arith-
mean as determined from a certain number of random trials and the mean that is determined from a larger number of trials. Often the latter mean is the hypothetical or theoretical "true" value that we believe would be obtained from an infinite number of trials it is often called the "parent" metic
;
or "universe" mean.
This error with respect to the hypothetical "true"
value does not have experimental significance but
more advanced
statistics
and
in
is
of great interest
in
any discussion with a mathematical model.
Basic Definitions: Errors, Significant Figures, Etc.
65
20
15
Time (sec) Fig. 2-3.
Record of "zero-reading"
deflections of a very sensitive torsion balance.
These irregular fluctuations are due to elementary errors, perhaps dominated by Brownian motion in this case.
(3) A more difficult concept of random error has it as some one of numerous so-called elementary errors that are merely imagined to exist.
According to the theory, these elementary errors conspire to be observed as a deviation or statistical fluctuation in certain types of measurements.
Examples of the imagined random elementary errors, and measurement process, are discussed in Chapter 4 in connection with the normal (Gauss) mathematical model of probability. In this theoretical interpretation of deviations in real-life measurements, an elementary error may indeed be either a random error or an inconstant systematic error. The term "inconstant" refers either to the magnitude of the elementary error or to its algebraic sign as it adds into the sum to give the deviation or it may refer to a time dependence in those measurements that are made in a time sequence, as most measurements are made. See Fig. 2-3.
their role in the
;
(4) Finally, reliability
mean This
random
error
value, determined
is
indices.
may
refer to a quantitative statement of the
of a single measurement or of a parameter, such as the arithmetic
from a number of random trial measurements. is one of the so-called precision
often called the statistical error and
The most commonly used
reliability
indices, usually in reference to the
of the mean, are the standard deviation, the standard error (also
called the standard deviation in the mean),
and the probable
error.
Precision indices are defined and discussed later.
The student may
well have difficulty in immediately perceiving
some of the
above distinctions, but they should become clearer later. It is important to note that the algebraic sign of a random error is either positive or negative with an equal probability when the error is measured with respect to the median value. In the fourth meaning listed, fine points in the
Probability and Experimental Errors in Science
66 the precision index
always measured with respect to the "best" value
is
of the parameter under consideration, and the best value, as stated above, is
the arithmetic
mean
rather than the
value (the mode). Also note that
asymmetrical
(as, e.g., in
distributions), the
when
median or the most probable
the distribution of measurements
the cases of the binomial
median, the mean, and the most probable values are
In this case, the precision index of reliability
all different.
is
and Poisson model is
properly
(although in practice only occasionally) expressed, not as plus or minus
some symmetrical error magnitude, but rather as plus one error magnitude and minus another error magnitude. Sometimes we distinguish between (a) the random errors that are introduced specifically as a part of the measurement process and (b) the
random phenomena
that are inherent in the statistical nature of the
property being measured.
one but
is
us elaborate
on
is perhaps not a fundamental measurements themselves. Let
This distinction
significant in certain types of this a little.
The former case
typically refers to the role
of the elementary errors as they conspire to produce an observed deviation in a
measurement
in a very fine-grained (perhaps continuous)
sample
space, whereas the latter very often (but not necessarily) refers to a
which the sample space is obviously discrete. Examples in measurements of a length or of a time; examples of the latter are found in so-called counting measurements (e.g., in counting nuclear disintegrations, in measuring with a counter the intensity of cosmic rays, or in the quality control problem of the last chapter in which "defec-
measurement
in
of the former are found
is unambiguous). A count is an integer with no uncertainty in it, but whether or not a count occurs in a specified sample (in a selected batch or in a selected time interval, etc.) is due to chance which is itself due to an unfathomed array of elementary errors. In a counting experiment, we measure the probability that a count will occur in the selected sample
tive"
with
all
the elementary errors
lumped
of answer, and the probability sought as
is
into a discrete "yes" or is
some
"no" type
characteristic of nature just
the length of a piece of wire.
The central feature of all statistical measurements, of whatever type, is that we seek to determine a measure, from a limited sample (number of trials),
of a property of a large (often
mean of
infinite)
population. In whatever type,
measurements increases in statistical fashion as the size of the sample increases. Only in the event that the "sample" is the entire population (often an infinite number of trials) are the random sampling errors reduced to zero. Distinction between the properties of a sample and of the entire ("parent" or "universe") population is basic in the second of the four meanings of random error as set forth the reliability of the
above, and
is
the
the subject of several later discussions.
67
Basic Definitions: Errors, Significant Figures, Etc.
Systematic error.
An
error that always has or tends to have the
same algebraic sign, either an additive or subtractive quantity introduced in the measurement process, is called a systematic error. If the magnitude of this error does not change with time,
appears as a constant error in
it
median and arithmetic mean values. If it changes in magnitude, it introduces some skew (asymmetry) in the observed histogram. If the changes in magnitude occur in some irregular fashion, it is especially a most unpleasant and insidious error contribution. In any case, since systematic errors are not generally amenable to statistical treatment, they impair the reliability of the mean to a degree which can only be estimated and often not very well.* The observed errors in every instance probably include both random and of the measurements and also
all
in the
systematic errors.f
Examples of systematic errors are those caused by: (1) Incorrect (or
friction or
electrostatic
an unjustifiably assumed) calibration of an instrument,
moving parts of an instrument (as in a "sticky" meter), charge on the glass front of a meter, failure to correct for
wear
in
the "zero" reading, etc. (2)
Constructional
in
faults
apparatus,
the
e.g.,
misaligned parts,
thermal electromotive forces from poorly chosen materials, screw errors, etc.
(3)
Inadequate regard to constancy of experimental conditions and
imperfect measurement techniques,
e.g.,
changes
in
dimensions owing to
thermal expansion, one-sided illumination of a scale, nonvertical position of a liquid manometer, alteration of the property being measured as in
chemical contamination or spilling part of a liquid sample, (4)
Failure to
make
necessary corrections,
e.g.,
etc.
for the effect of atmos-
pheric pressure or of the variation of gravity with elevation or latitude in
determinations of mass by weighing, meniscus corrections in a liquid *
Examples of
classical situations involving large systematic errors are:
to 1920, atomic weight determinations were, as
was
later
shown,
afflicted
(1) Prior
with unsus-
pected systematic errors that averaged fully ten times the stated experimental errors, and 10 esu (note the (2) in 1929 the accepted value of the electronic charge was 4.7700 x 10"
was changed to 4.80294 x 10 -10 esu. Examples of large unsuspected systematic errors are numerous in the scientific literature, far more numerous than is generally suspected. This is the "problem" in the apparent inconsistencies in the fundamental constants (e.g., see Cohen, DuMond, Layton, and Rollett, Rev. Mod. Phys., 27, 363 (1955), and Bearden and Thomsen, Nuovo Cimento, Suppl. (Ser. 10), 5, no. 2,
number of
significant
figures),
and
(Significant figures are discussed
in
later
it
Section 2-8.)
267 (1957). t
And
behold
it
the pseudo scientist said, "Let there be
came
be so sure.
to pass: there were
no systematic
no systematic errors," and lo and Only the pseudo scientist can
errors.
Probability and Experimental Errors in Science
68
barometer, "stem" correction
in
a
common
mercury-glass thermometer,
etc.
(5) Bias by the observer, e.g., more or less constant parallax, supposed improvement in technique in the midst of a set of measurements, desire
for the "right" result, etc.
This
not intended to be exhaustive, merely illustrative.
list is
To justify the we often attempt
application of a mathematical model of variability, as to do,
we assume randomness
may
observed variability,
in the
an assumption whose
validity
systematic errors.
generally imperative that every effort be
and
detect
It is
be jeopardized by the presence of
to eliminate systematic errors.
dependent, recognition of it
If a
may come from
systematic error
is
made
to
not time-
greater care or by comparison
with the results from other apparatus or from some more ingenious method
of measurement. If the error changes with time, a study of the time dependence of the mean value may reveal the error's presence. Tests of correlation
and of consistency of means, discussed
Sections 3-3 and 3-7,
in
are helpful.
Elimination of systematic errors often strains the ingenuity, judgment,
and patience of
After exhausting
the best of experimenters.
all
metiiods,
even though he does not believe them to be entirely absent, he resigns himself to their presence but assumes the variability to be
purpose of
random
for the
treatment of the deviations.
statistical
However, note in and accuracy.
this
regard that a distinction
is
made between
pre-
cision
Precision
and accuracy.
Precision in a
to the reciprocal of the statistical error is
small; accuracy
"high"
is
if
and
is
mean
value
"high"
is
proportional
if the statistical error
the net systematic error
is
small.*
Usually,
but not necessarily, high accuracy implies a small statistical error as well. *
As
errors
one
is
stated above, objective numerical determination of the residual systematic
not practical.
for precision, but
An arbitrary
its
The
error for accuracy
numerical value
procedure, often used,
is
is
is
usually appreciably greater than the
perforce
left
to the observer's best judgment.
to estimate the equivalent statistical error caused
by estimated component systematic errors. This estimate is usually deliberately very chance that the "true" value lies outside of the say about a 1 or a 5 limits given by the equivalent error. Some experimenters go further: With an assumed
conservative
—
%
%
mathematical model of the "histogram" of the component systematic errors, e.g., the normal (or Gauss) distribution, such an estimated error is converted by the formulas of the model to the same "confidence limit" used
in the statistical error;
often this
is
the
probable error. The assumption of a mathematical model for these component errors is admittedly highly ad hoc and is not to be much trusted; but there is apparently no better general procedure.
(See next chapter.)
Basic Definitions: Precision
69
Errors, Significant Figures, Etc.
and accuracy are not interchangeable terms. * Statistical methods measure of precision, not of accuracy
give specifically a quantitative
(however, see discussion of the consistency of means, next chapter).
The difference between two measured values, e.g., values reported by two different observers, or the difference between a value by an observer and an "accepted" value as listed in a handbook, is Discrepancy.
This difference
called a discrepancy.
is
the need of a statement of error, both
not an error, although statistical
it
and systematic,
implies in
each
value to provide a basis for interpreting the discrepancy.
Blunders.
These are outright mistakes.
A
measurement known to
contain one or more blunders should be corrected or discarded.
Blunder
errors include the following effects: (1)
Misunderstanding what one
is
doing, incorrect logic.
(2) Misreading of an instrument. (3) Errors in transcription of data. (4)
Confusion of
(5)
Arithmetical mistake, "slide-rule error."
(6)
Misplaced decimal point. Listing of an improper number of significant
(7)
2-8. Significant
units.
Figures and Rounding of
figures, etc.
Numbers
Significant figures are the digit figures necessary to express a measure-
ment so
some
as to give immediately
ment. There
is
idea of the accuracy of the measure-
no uniformly accepted
rule for deciding the exact
number
by measurement of, say, 63.7cm indicates a "high" probability for a value between 63.55 and 63.85 cm but a "possible" range 62.2 to 65.2. Another popular practice is to retain the last digit that is uncertain by 10 units or less. In case either procedure is followed, it is recommended to include one more figure but to set it down slightly below the line of the significant figures. If this additional subfigure, with of digits to use.
more than
1
One popular
practice!
*s
t0
drop a ^
digits uncertain
5 units. Accordingly, a
was desired to determine as precisely as possible the height was unthinkable to bother theemperorwith a direct measurement. So, the investigator, knowing about statistics and being of diligent mind, conducted an extensive poll. He selected at random a thousand, nay, a million Chinamen from all parts of the nation. Each was asked to give his opinion as to the height of the emperor, and was then sworn to secrecy. The average of all the numbers provided a very precise determination. But none of the Chinamen concerned had ever even seen *
According to legend,
of the emperor of China.
it
It
the emperor! t
Recommended by The American of Data (ASTM, Philadelphia,
tation
Society for Testing Materials, 1937),
2nd printing,
p. 44.
Manual on Presen-
Probability and Experimental Errors in Science
70
more than 10 (or 15) units, were to be included as a it would erroneously imply in some cases that the preceding figure was uncertain by less than 10 (or 15) units. "Uncertain" in a single measurement refers to the observer's best guess as to the "sum" of the random and systematic errors. Sometimes this guess is taken its
uncertainty of
significant figure,
simply as the effective measuremental least count,
i.e.,
either as the smallest
measurement or as the observer's estimate of a meaningful interpolation. In the mean measurement, the precision part of the uncertainty is set by, say, the standard deviation or by the standard
division
error
on the
scale of
the standard deviation in the mean).
(i.e.,
mean and
Significant figures in the
some precision indices are illustrated in Table 2-2 which is introduced and discussed in Section 2-11, and also in some of the problems in
and answers
in Section 2-12.*
in determining the proper number of significant figures with As which to express the precision of a mean determined from seven or more equally weighted measurements, the mean should have one more significant figure than has each measurement. In general, justification of this rule, and indeed the proper number of figures for the mean in any case, is indicated by the magnitude of, say, the standard deviation or, better, of the standard deviation in the mean. As stated, the proper use of significant figures provides a rough method of expressing accuracy. f However, because it is only rough and involves an arbitrary criterion of uncertainty, it is by no means a good substitute
a guide
for the assignment of the appropriate statistical error (e.g., standard
and also of a separate estimate of the net systematic error. Rounding of numbers is the process of dropping one or more significant
deviation, standard error, or probable error),
*
A man
letter to
named Babbage read Tennyson's The
the poet:
"In your otherwise beautiful
moment
'Every
dies a
Every moment one It
must be manifest
standstill.
that,
were
is
Every speaking
this
in the line,
but
Strictly
get
it
Comment on
is I
not correct.
believe
1
,'„
man, born.'
slightly in excess of that of death.
poem you have
'Every
is
of Sin, and wrote the following is a verse which reads:
there
population of the world would be at a
this true, the
In truth, the rate of birth
that in the next edition of your
Vision
poem
moment moment The
will
it
dies a 1
x
\
I
would suggest
read:
is
man, born.'
actual figure
is
a decimal so long that
be sufficiently accurate for poetry.
A physics student
I
I
am,
cannot etc."
in a mathematics class received which ten equally weighted questions had been asked. He inquired of the instructor whether the examination was graded on a basis of 10 as perfect or on a basis of 100. The instructor insisted that the question was f
a grade of zero
the following story
on an examination
pointless, saying that zero
:
in
was zero regardless of the
basis of scoring.
Frequency Distributions and Precision Indices
7/
when the measurement is used in computations for a computed The rules for the rounding of numbers are rather well developed
figures result.
and are
When
stated below.
these rules are followed consistently, the
errors due to rounding largely cancel one another.
To round of ten) less
off to n figures, discard
(i.e.,
replace with zeros or with powers
If the discarded number than half a unit in the nth place, leave the nth digit unchanged; if it all digits
to the right of the /7th place.
greater than half a unit in the nth place,
discarded
number
nth digit unaltered
is
add
to the nth digit.
1
is is
If the
exactly half a unit in the nth place, then leave the
if it is
an even number but increase
it
by
1
if it is
an odd
number. In multiplication and division (indeed, in
all
computations except
addition and subtraction) with numbers of unequal accuracy and of equal
weights in the final result, a generally safe rule
is
:
Retain from the begin-
ning one more significant figure in the more accurate numbers than contained in the least accurate number, then round off the
is
final result to
same number of significant figures as are in the least accurate number. unequal weights are involved, adjust the respective number of significant
the If
figures accordingly (weights are discussed later).
In the case of addition or subtraction, retain in the
more accurate
numbers one more decimal digit than is contained in the least accurate number. A decimal digit is a figure on the right of the decimal point regardless of the number of figures on the left. The above rules presume that all the measurements are independent. If the measurements are at all correlated (see Section 3-7), the rules are not applicable and we must proceed with caution.
C.
FREQUENCY DISTRIBUTIONS AND PRECISION INDICES The
variations in successive trial measurements are completely repre-
sented by a detailed graph of the frequency distribution of the measure-
ments. The idea of a frequency distribution was introduced and discussed very briefly in Chapter 1 and also in connection with Figs. 2-1 and 2-2. It
was mentioned
that
some
distributions are symmetrical in shape
and
that others are asymmetrical.
we shall comment on a few of the some typical empirical frequency distributions, discuss some easily obtained numerical measures of
In the remainder of this chapter, qualitative features of
and then we
shall
the shapes of or the types of variations in distributions.
We
shall treat
general distributions, symmetrical or otherwise, and regardless of any possible
fit
of a mathematical model.
These numerical measures, except
Probability and Experimental Errors
72
in
Science
for the location values (e.g., the arithmetic mean), are called precision indices of dispersion. 2-9.
Typical Distributions
Most
actual sets of observations or measurements have frequency
distributions
skewed
somewhat
like the bell shape,
and
bell shape (Figs. 2-1
may have
almost any conceivable shape.
shown shown
Fig. 2-4(a)
in
in this figure
through
or like a more or
But
2-2).
possible that a distribution
A
variety of distributions are
Each of the
(g).
first
three distributions
has a rather large classification interval.
cation interval in (c) were
much
less drastically
it is
smaller than
it is,
If the classifi-
the distribution
would
undoubtedly drop to a low frequency value at very small ages and contain a narrow or strongly peaked maximum. As another example (not shown) a frequency distribution of the wealth of the average person in New York State according to age
would probably be
similar to (b) or (c) but with
reversed skewness.
Comment zero
is
is
in
order on a few additional features.
displaced to the
left,
off the page.
In (a), the abscissa
In the (a), (b), (c),
and
(g)
examples, measurements are presumed to be possible that would have a nonuniform distribution within each indicated classification interval.
The indicated
interval size
is
by perhaps the resolving power and (c) by perhaps some decision of hand. The problem in (c) evidently did not
imposed
of the particular apparatus used,
convenience for the problem
at
in (a)
in (b)
include finding the most probable age or the shape of the distribution for
very small ages, or perhaps the investigator judged that a smaller
Note
cation interval would not be meaningful. cation intervals;
the interval size
is
in (g) the
unequal
classificlassifi-
varied so as to avoid the vanishingly
small frequencies per interval that would obtain in the higher intervals
if
a
constant class size and a linear abscissa scale were used.
types of distributions. Referring to Fig. 2-4, we an example of a discrete distribution, "discrete" because only in (d) integer numbers of colonies of bacteria are possible by the nature of the
Terminology:
have
observations.
In (e), (/"),
and the
fitted
curve
in (g), the distributions are
continuous since they are calculated from the respective functional theoretical relations
which are continuous.
The binomial mathematical models of curves,
Fig. 2-2,
may
well be included
Fig.
among
binomial and Poisson distributions are discrete; is
1-2,
and the
distribution
those of Fig. 2-4.
The
the normal distribution
The shape and "smoothness" of a model distribution are though the number of observations was infinite.
continuous.
always as
Another general feature of interest is the range or bounds. For example, the normal distribution curve extends from — oo to +oo, the distribution
Frequency Distributions and Precision Indices of
(e) is
bounded by
±A,
etc.
at
(c)
type; frequency diagram for the (d)
and frequency curve for the
(e)
or (/) type. In this terminology, the fitted curve is a frequency curve.
basic graph in (g)
A
often reserved for the block-area type of distri-
is
or
bution such as in the type;
+ oo, and the distribution of (/) is bounded
and by
The term histogram
73
is
(a), (b),
a histogram; the
histogram approaches a frequency curve as the classification interval
goes to zero and the number of
goes to
trials
In this
infinity.
use the term frequency distribution to refer to
all
book we
of these types, either
for actual measurements or for the mathematical models of probability distribution.
Another distinction tribution has a unit it
is
in terminology
sum
if it is
a continuous distribution,
ordinate value
is
is
A
often made.
by
probability dis-
and a unit area
a discrete distribution
An
greater than unity.*
if
hence no
definition of probability;
observed frequency distribution
has ordinate values greater than unity unless the observed values are
"normalized,"
i.e.,
divided by the total
normalized distribution it
number of
sometimes called the
is
If
is a mathematical model it is called the frequency function. Finally, we often have occasion to refer to the sum or to the area under
the distribution between specified abscissa limits. quencies, either observed or to
The
observations.
relative distribution.
some
specified value
is
relative,
The sum of
from an abscissa value of
the fre-
(or of
called the cumulative distribution, or the
bution function, or the error function
if
the distribution
is
— oo)
distri-
continuous and
expressed analytically. After a set of measurements has been classified,
appropriate class intervals, and next task in statistics
is
to devise
the particular distribution.
amount of
The
grouped into the
some simple numerical
The features of primary
location index of the "center" of the distribution,
the spread or dispersion.
i.e.,
plotted as a frequency distribution, the
and
descriptions of
interest are (1) a (2)
a measure of
simplest descriptions (simple in terms of the
Of
arithmetic involved) give the least information.
we had
would be completely described if mathematical formulation and the necessary parameters. But
the distribution
experimental distribution, such a formulation
is
its
course,
analytical
in a typical
impractical.
We
proceed to discuss some descriptions of the distributions, location indices as well as measures of dispersion, that are applicable to any set of experimental data (or to any mathematical model).
The measures of
dispersion are called the precision indices. *
The ordinate
probability in a continuous distribution
interval (e.g., see Eq. 4-10
and discussion);
if
altered extended interval, the numerical value
is
proportional to the abscissa
the probability
may
is
expressed in terms of an
exceed unity.
74
Probability and Experimental Errors in Science
a O .
u
« ~*
c
<-
*-
.
-
o >
O u J?
1
o i-
o
r~
is |c 3
«->
<«
S
>
D
Frequency Distributions and Precision Indices
75
Probability and Experimental Errors in Science
76 2-10.
The
Location Indices
commonly used
three
When
the mean.
mode, and and is symvalue. Almost all
location indices are the median, the
the distribution has a single
metrical, these three location indices are
all
maximum
identical in
in
it
unimodal (have a single maximum) if the number of trial measurements is large enough for the meaningful application of statistical methods. But the condition of symmetry is less often realized in practice. practical cases are
Median. The is
compute is the median. This odd number of measurements
easiest location index to
defined as the middle measurement of an
(all
ordered as to magnitude), otherwise as the interpolated middle value.
For some
statistical
problems the median This
three location indices.
is
the most significant of the
when
apt to be the case
is
the distribution
is
For example, we may be more interested in the median wage than in the mean wage of workers in a certain large industry for which the top executives' wages may give an undesirable distortion. However, even in a strongly skewed distribution of measurements in an experimental strongly skewed.
science, the
mean
is
almost universely accepted as
statistically the best
location index.*
Mode
Of
(most probable value).
metrical distributions
is
especial interest in
some asym-
the abscissa value at the peak position.
the measurement, or the interpolated value, having the
This
maximum
is
fre-
quency of occurrence. It is of much less interest in actual measurements in mathematical model distributions because of the great difficulty
than
of its accurate determination with great
when
Mean
(arithmetic average)
data; this difficulty
real-life
the classification interval
m
is
is
especially
very small.
and
By
most important location index is the mean. The experimental mean is denoted by w; the hypothetical ("parent" or "universe") mean for an infinite number of measurements, or the mathematical model mean, is denoted by /a. m is -, x •, x defined for /; trial measurements, x x x 2 xz n by the relationf ,
m =
Xl "*
^
"I"
X2 "
^
"1"
X3 "
fx.
,
,
"f" T
'
'
•
*
By is
•
•
t
•
,
,
2
Xi
+ Xn _I_r2?=i^_' '
(2-1)
n
n index
far the
the conventional criterion for statistical efficiency, the median as a location
considerably less "efficient" than
is
the mean.
In the normal distribution, for
example, the median is only 64% as efficient as the mean (although both indices in this case have the same numerical "parent" values). The criterion for statistical efficiency is
described in the discussion of the
mentioned
in
Section 3-1.
A
mean
deviation in the next section and
is
also
few additional "inefficient statistics" are mentioned
in
Section 3-8. t In this chapter,all
The mean
is
measurements
x,
are assumed to be individually equally weighted.
defined in Section 3-3 in terms of unequally weighted measurements.
Frequency Distributions and Precision Indices
77
5000 Fig. 2-5. Typical variation of the experimental mean as the number of (Note the logarithmic abscissa scale.)
trials increases.
one or more values of a; are observed more than once, say xt is observed < n) different x values in the n trials. In this
If
fi times, there will be ri (ri
case,
we may weight each x by i
_ J\ X
\
~\~
J2 X2
"t"
'
'
'
frequency of occurrence j], and write
its
+ Ji x +
'
'
i
+ Jn' X
'
Hfi x
i
n'
__ i=\
n
m = xfi + xfi + n n
=
where, of course, n for observing to infinity
.
+
xfi
+
•
•
•
+
Xn
U= |
n
2fli/f
x^* The sum
if it is
.
.
,
(2-21
n
and/?j
n
xiPi
(2-3)
t=i
=fjn is the experimental probability
in Eqs. 2-2
and
understood that for some
2-3 could just as well be taken
Vsf may be t
zero.
Often we are interested in a comparison of two means determined on different days, or with different observers.
measurements taken with different apparatus, or by statistics of such a comparison is discussed in
The
Section 3-3.
The hypothetical or mathematical mean /u is defined by a relation similar would obtain if n = oo. Of course, if x is a point in continuous sample space, the summation in Eq. 2-3 must be replaced by an integral, /u is also called the universe or parent mean, and m the sample mean. (In slightly more advanced statistics, m is called an "estimator" to Eq. 2-3 that
i
of the parent parameter
mined, but
As
its
illustrated in Fig. 2-5,
but
it
gradually settles
* Just for practice,
the
/u.)
mode
is 9,
the
Of course,
ju
cannot be experimentally deter-
value can be approached as n
m may
down is 8,
made
to a steady value as n
check that for the numbers
median
is
larger
and
larger.
fluctuate rather violently for n small,
and the mean
is
becomes
large.
The
2, 3, 5, 6, 7, 8, 8, 9, 9, 9, 11, 11,
7.69.
12
and Experimental Errors
Probability
78
in
Science
usual objective in the statistical interpretation of measurements with a
number of trials is the best determination of /n, not of m, and this some very interesting statistical arguments. Of course, in many
limited
calls for
sets
of measurements the difference between m and /j, may be believed to be It turns out that for any given number of trials n the corre-
very small.
sponding
m from Eq.
2-1
is
generally the best available value of [x.
presently define a precision index that gives the reliability of
given
We shall
m
for
any
n.
mean of any mathematical model distribution, we necessarily refer to /u. If we are convinced that a model fits the experimental measurements, we approximate li by m and proceed to take In speaking of the
advantage of calculations and predictions by the model. By the term "mean" without further qualification we always refer to the arithmetic mean. (Other possible means are the geometric, root-meansquare, etc.)
The mean
is
the abscissa value of the center of area of the
frequency distribution.
The phrase expectation value is often used [x in a model distribution.
to refer to the
mean
value m,
or to
mean
Calculation of the
working mean.
m
usually rather laborious
is
where w, the working mean, chosen to be somewhere near invariant significant figures in venience, chosen such that x/
ments
at
w
is
carried
t
=w+
is
m all
=
x/
an arbitrary constant value of x. w is magnitude, chosen to include all the the actual measurements, and, for con-
in
for
some one value of
/'.
A
convenient
readily selected after a visual inspection of the measure-
hand. Then,
J
t=i
and
it is
it
x
value for
when
can generally be simplified by the device known as the To effect this simplification, write for each measurement
out directly, but
xf
= 2 (w + i=i
*{)
=
nw
+ ^
x
t
i=i
let
from which
m = w+A The convenience of
make
the working
calculation of A, which
than calculation of m.
mean
is
A
is
mean
is
(2-4)
realized
if vv is
so chosen as to
small in magnitude, appreciably easier
simple example of a calculation with the working
given in Table 2-2, which
is
introduced later in Section 2-11.
Frequency Distributions and Precision Indices
79
Dispersion Indices
2-11.
There are several indices of the spread (dispersion) of the measurements about the central value. The dispersion, as stated above, is a measure of precision.
As with
the location indices, the
amount of information con-
tained in each dispersion index increases roughly with the arithmetical
labor involved.
Range. The simplest index of dispersion
the range.
is
It is
equal to
the difference between the largest and the smallest measurements.
For
obvious reasons, the magnitude of the range generally increases with the
number of trial measurements. Hence, whenever
the range
should be accompanied by a statement of the number of
is
specified,
it
trials.
Quantile. Suppose that all the n measurements are ordered as to magnitude and then divided into equal intervals, each interval having an intervals, each interval is equal number of measurements. If there are called a quantile or an Af-tile, or sometimes a fractile. The quartiles refer to the quarter divisions of the total number of measurements, there being two quartiles on each side of the median. Deciles refer to the division by tenths, etc. The dispersion information given by the quantiles increases as increases, but it disregards the distribution of values within each
M
M
quantile.
For the
Deviation (statistical fluctuation). set
of n
trials,
the deviation
zi
From
the definition of z t
it
denoted as
is
=
zt or
-m=
x
i
rth
measurement is denned as
in a
dx { and
6xt
(2-5)
readily follows that
= JU =
i
(2-6)
l
The mean m is used as the reference value in this definition in order that the sum of the squares of all deviations shall be a minimum, as shown in the next paragraph. All the precision indices that have to do with the shape or dispersion of the frequency distribution are defined in terms of deviations.
Let z/
is
S be
the
sum of the squares of the deviations when each deviation some unspecified reference value m. Then,
defined in terms of
S=Iz/ = i
To we
2
l
=I i
=l
fo,
-
m')
2
= I x? =l
find the particular value of m', call differentiate
S
2m'
i
it
mm
',
that
|=
1
xt
+
nm' 2
1
makes S a minimum,
with respect to m', place the derivative equal to zero,
and Experimental Errors
Probability
80
and solve for
mm
'.
Since
all
t
n
X*
n
—dS- — — 2 J dm
which
t
+
2nm m
=
'
mental deviations
'
n
=i
point which
and as used
is a minimum.* we encounter over and over again
dealing with experimental deviations
such as
does not
fx,
deviations,
value
li
ments
i.e.,
in
Eq. 2-5. There-
in dispersion indices
any other reference value,
that
make S a minimum. Only when we
refer to all possible
minimum. But note that the objective of the measurewe really wish the precision dispersion index to m. The best we can do with a limited number of observa-
a
not m, and
not to
/u,
is
to the "universe" (infinity) of deviations, does the reference
make S
is /a,
refer to tions,
and
= m m = ^~
or
m is the reference value for which the sum of the squares of the experi-
fore,
A
xf
the mean, m, as defined in Eq. 2-1
is
Science
the x values are constants their derivatives
Then,
are zero.
in
however,
is
practical in
m and a dispersion index with respect to m. many
It is
possible
situations to introduce additional information
from more than one sample of n measurements each, or from samples of different sizes, or from a reasonably fitting mathematical model. Use of such additional information,
into the problem, as, for example,
introduces factors such as Vnj{n
—
1),
Vn(n
—
1), etc.,
in the quantitative
dispersion indices; these factors are discussed later.
When we deal S to be
value for
Mean
with a mathematical model distribution, the reference a
minimum
is /u.
(average) deviation.
This index
is
defined without regard to
the algebraic signs of the individual deviations.
defined as
The mean
deviation z
is
n
x
=
I
—kl
t -^ 1
(2-7)
n
A
small numerical value of
closely grouped,
value of
z
and
z
means
that the individual
that the distribution
is
measurements are
The amount by mean value m.
rather sharply peaked.
also provides a sort of numerical guess as to the
which the next measurement is likely to differ from the The use of this measure of dispersion, z, is rather widespread in scientific work, but it is not much used by statisticians. It is what the statisticians call an "inefficient" measure of dispersion, and thisfor the following reason. Suppose that a large number of trial measurements are made from which one application of the famous principle of least squares; it is really by this mean of any distribution is said to be the "best" location value. The basis for this principle is discussed in Chapter 3. *
This
is
application that the
Frequency Distributions and Precision Indices the
mean
deviation
81
computed. Then suppose these measurements to be many small subsets and the mean deviation zf of
is
arbitrarily divided into
computed with respect to w the The subset values z j will show a rather large
each subset computed. (Each
mean of scatter
the
y'th subset.)
about the value
measure of dispersion
z
z
i
in zi is
3
,
forthegrandtotalofthemeasurements. Anefficient
is
one that shows small
scatter,
i.e., is
one that allows
a statistically reliable estimate of the precision based on just one subset. Efficiency refers specifically to the reciprocal of the square of the standard deviation of the zi distribution about the central value z. (Standard deviation
is
defined in the next section.)
In a practical sense, efficiency
to the inverse of the number of measurements required for a given
refers
statistical precision:
number
the smaller the
the greater the efficiency.
that to have a given degree of precision in a set of measure-
Gauss showed ments from a parent population having a normal distribution 14% more measurements are required if z, rather than the standard deviation, is used as the precision index. Any set of measurements in real life is just one subset of a much larger number of possible measurements. Judged by this statistical efficiency criterion, the mean deviation does not justify the widespread use to which
However,
it is
it
has been put by
scientists.
nonetheless a very useful index of dispersion
if
the larger
deviations in the measurements are believed not to justify a higher weighting than the first power. (The standard deviation weights according to the
work. Suppose Another characteristic of the mean that a large set of measurements is divided into subsets of two measurements in each subset, and suppose that the value zi is computed for each subset. The average of these two values of z i is generally less than, statis-
Such measurement problems do
second power.)
deviation
tically
only 0.707
of,
the
mean
is
arise in scientific
the following.
deviation of the parent
set.
It is
true that
the larger the number of measurements in each subset the closer the average of the subset values becomes to the value of the parent set. For subsets of three measurements each, the statistical average of the mean deviations z i
is
0.816 of the parent
for subsets of five measurements
z;
0.943. Most actual sets of and the mean deviation gives measurements in real life are not precision. This bias is also found an unduly optimistic measure of the each case can be corrected and in in the experimental standard deviation,
each,
it is
0.894; for subsets of ten each,
it is
very large,
—
by multiplying by the factor Vn/(n
The
fractional
mean
deviation
is
1)
fractional z
and
is
as discussed later.
defined as
usually expressed in per cent.
=— m
(2-8)
Probability and Experimental Errors in Science
82
Although the value of the fractional value depends
upon
measurements of temperature
fractional : in a set of
numerical
z is dimensionless, its
the choice of zero of the x scale. is
For example, different if
x
is
expressed in degrees centigrade or in degrees Kelvin. Hence, the fractional value
x
is
scale
usually used only in those measurements for which the zero of the
This restriction also applies to
the physically significant zero.
is
any fractional dispersion index, e.g., the fractional standard deviation as mentioned presently. In Eqs. 2-7 and 2-8 the value of z is that obtained with the mean m, not ju, as the reference value.* Hence, z and fractional z are experimental quantities. The corresponding indices for a model distribution are not used because of their inefficiency and bias as stated above. Experimental standard deviation persion of the frequency distribution limited
number
Another measure of the disWith a
s. is
the standard deviation.
n of trial measurements, the experimental or sample
standard deviation
s is defined asf
=
s
\
—
=\ ^L—J
This expression can also be written, in case x t
2
where s
is
ri
<
n and
p
{
=1
is
(2- 9 )
l
/
is
observed
sum makes
t
-(5(*-«Fa)
the dimensions
and
units of s
(2-10)
xt The quantity Taking the square the same as of x.
the probability of occurrence of
also called the root-mean-square (or rms) deviation.
root of the
times, as
f
.
(Again, the sums in Eq. 2-10 could just as well be taken to
Note that the deviations summation is made; this
in
infinity.)
Eq. 2-9 are individually squared before the
assigns
more weight to the large deviations. is more sensitive to large deviations
Hence, as a measure of the dispersion, s than
is
the
mean
deviation
z.
It
follows that, of two frequency distributions
having the same mean deviation, the distribution with the relatively higher tails has the greater standard deviation. In a series of measurements, a large deviation *
Sometimes
is
always cause for concern;
in statistics
a
"mean
reference value. This, however, f
This definition presumes
all
is
deviation"
very
is
uncommon
its
appearance increases our
reckoned with the median as the work.
in scientific
individual measurements x, to be equally weighted, a
typical presumption for a set of
random measurements. The standard
defined in Section 3-3 in terms of unequally weighted measurements.
deviation
is
Frequency Distributions and Precision Indices
83
a priori belief that another large deviation will soon appear. deviation
more
a
is
efficient
reckons efficiency, than
Of all
the
is
The standard
measure of the precision, as the
mean
statistician
deviation.
the precision dispersion indices, the standard deviation
And
in widest use in statistics. science, but
nowadays
it
is
the probable error, which
is
deviation, seems to be about equally popular.
discussed in Chapters
3, 4,
and
is
the one
also widely used in experimental
based on the standard The probable error is
5.
Because of the squaring operations
in
Eq. 2-9, s does not allow distinc-
and gives no asymmetry of the distribution. For a symmetrical distribution, s can be indicated with meaning as ±5 values on the graph of the distribution; for an asymmetrical distribution, it cannot be so indicated and it remains of mathematical significance only. The standard deviation provides, as does any dispersion index, a numerical guess as to the likely range of values into which the next measurement may fall. With this interpretation, s is sometimes called the standard deviation of a single measurement rather than of the distribution itself. As we shall see in Chapter 4, if the normal model fits the experimental parent distribution of mean fx and standard deviation a, the probability is 0.675 that the next measurement will fall within ju ± <** which is very nearly the same as m ± s. Often we are interested in a statistical comparison of the precision of the means determined on different days, or with different apparatus, etc. This comparison is discussed in Chapter 3. The fractional standard deviation is defined as tion between the algebraic signs of the individual deviations
indication of the degree of
fractional s
This index
is
5 =— m
(2-11)
extensively used in problems dealing with the propagation
of error, discussed in Chapter
3.
However, as with the fractional mean
deviation, the fractional standard deviation
cases
when
the zero of the x scale
Computation of carefully done,
inaccuracy
is
is
s
is
is
not especially useful in
not physically significant.
by Eq. 2-9 or 2-10
is
often tedious and, unless very
apt to be somewhat inaccurate.
a consequence of the fact that
be used in the value of
m
more
than in each value x
t
.
One reason
for the
significant figures
To
must
reduce this difficulty
can be shown that, in any distribution, the probability/* of observing a deviation 2 is p < l/k where k is any positive number. (This is known as Tchebycheff's theorem or inequality.) * It
2
>
ks
,
Probability and Experimental Errors in Science
84
convenient expression for
in the calculations, a
(x (
— mf in
Eq. 2-9 and by using Eq.
=
5
I
(*<
;
\\4/n
In /
2-1
-
/
obtained by expanding
\Vi
n
I ^ 2
w) 2 \
s is
thus,
2m
^
*i
+ nm2 \
—
X? — nm (2 •^-
A
further simplification
in Eq.
2-4, viz.,
vr
is
achieved by using the working
=m—
From
A.
2
\
(2-i2,
;
mean introduced
a visual inspection of the set of
measurements at hand, or better, of the frequency distribution graph, a working mean w is chosen which may be a convenient round number, if the classification interval is bounded by round numbers, and which contains For convenience, w is all the invariant significant figures in all the x/s. chosen to be close in value to m. Then, in terms of (x — w), which may be i
called an adjusted deviation,
I=
-
fo,
w
-
A) 2
2
- "? - »^2
to
1
2
(2-13)
where, as for Eq. 2-4,
2
(
x
i
-
w)
A = -^ !
A
(2-14)
can be made relatively quickly by choosing a different value of the working mean and by performing an check of the calculation for
s
2
independent calculation.
An example ally, for the is
of the shortened computational method for 5 [and, incident-
mean
and for the
given in Table 2-2.
observed value xit the is
m is
In this
large
.
number of
skewness (discussed
expressed with one more
is
This
is
.r/s,
and
later)],
table,/, the frequency of occurrence of the
used to shorten the tabulation. Note that
computed value of aw
each value x t
coefficient of
justified
in
the table
significant figure than
because the mean
this justification
is
value of the standard deviation in the mean,
is obtained from such a confirmed by the numerical
5,„,
a precision index intro-
duced presently.
Moments. The square of the standard deviation is also known as the moment about the mean. Since moments higher than the second are also mentioned presently, it is worthwhile now to become acquainted with moments in general. second
Frequency Distributions and Precision Indices
85
Table 2-2. Sample Calculations for Mean and Standard Deviation (Using the
Measured
Frequency
Value xt
Observation
(cm)
fi
125
"Working Mean") Adjusted Deviations and Moments with Working Mean w = 128 cm
of (x (
-
w)
(cm)
ffa (cm)
w)
ffa - wf (cm2 )
fi(xt
-
w) 3
(cm 3 )
Probability and Experimental Errors in Science
86
The kth moment about
dk
2
-
(*,
= «^—
m
6km ,
mean m,
the
mffi
- I
n
from which, with Eq.
is
i*i
-
mfPi
(2-17)
2-10,
=
s
2
(2-18)
as stated above. If the
mean
deviation were defined differently than
on
specifically, if the absolute brackets
it is
in
Eq. 2-7,
i.e.,
were removed and regard were
zi
maintained for the algebraic signs of the individual deviations, then this 2 would be the first moment about the mean. But the first moment about the mean, is
A
equal to zero by the definition of the mean, and this
viz., Oj™, is
the reason that
z is
defined without regard to algebraic signs.
moments about the origin shown by essentially the same
useful relation exists between the second
and about the mean expansion made
in
K=I
Eq. 2-12,
(**
-
and by
= I *V, - 2m 2 xiPi + m 2 £ i=i i=i i=i
*nfPi
—
u2
This expression
is
we have
m
w2
the desired relation
m2
(2-19)
'"
the well-known formula
identical to
inertia of a
axes a distance
Pi
- 2mB» + w 2
° 2
substituting Eq. 2-16 for 6^,
moments of
is
viz.,
i=i
=
This
respectively.
body of
unit
mass about two
relating
the
different parallel
apart, with one axis through the center of mass.
"universe" or "parent" standard deviation a. If all the infinite population of possible measurements in the "universe" were known (a purely hypothetical situation), would be known and we could use /n instead of m as the reference value in computing each deviation. The standard deviation of a set of n deviations, as n —> co, with ju as the Variance a 2 :
/li
reference value
is
t
is
known
=
denoted by
—
l
as the variance,
a,
= a
and
matical model.
it
is
square
= 2
*=*
is
standard deviation. The variance distribution whether
its
fo.
- n?Pi
(2
-
20 )
also called the "universe" or "parent" is
a parameter of the universe or parent
of the "imagined
Incasethe parent distribution
real-life" is
type or
is
a mathe-
continuous, thesummation
Frequency Distributions and Precision Indices
may
Eq. 2-20
in
87
be replaced by an integration.* Thus, using/as the con-
tinuous frequency function from
to oo,
fxffdx Jo
(2-21)
f
fdx
Jo
The
integral in the
denominator
that the frequency function
function
included in this expression to be sure
is
normalized;
is
when a model
used, the integral in the denominator
is
The variance a 2
,
i.e.,
the second
moment about //,
is
probability
unity by definition.
is statistically
the
most
important parameter in describing the dispersion of any universe or parent distribution, including any mathematical model distribution. With either
the
"real-life"
distribution,
imagined universe
we must assume
that
//,
is
The value of [x, hence of a 2 can never be ,
set
We
of measurements.
* In
distribution
or
the
model a2
known
in order to calculate
exactly
known from any
.
real-life
use the "best" estimate that can be obtained.
going from a discrete to a continuous frequency distribution,
we
use the basic
For some students, it may be helpful to review this argument. Consider the 12 measurements listed in Table 2-1 and graphed in the discrete distribution of Fig. 2-1. Or consider any list of a large number n of measurements. The range of the abscissa or x axis of interest can be arbitrarily divided into a large number N of equal increments Ax,. The number of measurements that fall into the ith interval is «,, and the normalized frequency (or the experimental probability) with which a measurement is observed within this interval is argument of
calculus.
N i
The normalized frequency distribution
the graph of/>, vs. x,, where x, is the coordinate taken as the average of the measurements within this interval. of course, discrete so long as Ax, is finite in size.
of the interval Ax, and This distribution
is,
n
I-, =i
is
is
We wish now to approximate the discrete frequency distribution/?,(x,) by a continuous function /(x), defined by
i.e.,
one for which Ax,
means of the
->-
in the limit as n
-*
oo.
This function can be
relation
which says that the value of/(x) at x = x, is to be made such that the product of this value and the width of the interval Ax, is equal to the normalized frequency of the observed measurements within this interval. Actual real-life measurements are far too few in number in any given situation to determine /"(x) in fine detail (i.e., Ax, small, zero in the limit) by direct use of this definition. We are usually content to guess the "way/"(x) should go" reasonably consistent with the actual measurements at hand. This gives a continuous function that approximates not only the actual discrete frequency distribution but also the presumed-to-becontinuous parent distribution.
Probability and Experimental Errors in Science
88
For a
set
of n measurements, as stated above, the best value of
generally taken as the experimental value
m; and
to a 2 that can be
of n measurements
deduced from a
set
/x is
the best approximation is
generally
taken asf n
9
o
(2-22) 1
one estimator of a, but Vn/(n — \)s is generally considered to be a better estimator of a because the radical factor corrects for a bias inherently present in s. This bias was mentioned [The sample standard deviation s
earlier, in
connection with the
is
mean
deviation,
and
is
discussed again
below.]
Combining Eqs.
2-9
and
2-22,
we note
that
(n
\
'-^1
n
M (2-23)
/
—
/
1
In practice, this is commonly put in the form of a guess that a particular known continuous function satisfactorily "fits" the finite set of actual measurements; in other words, the guess is made that more measurements, were they to be taken, would merely increase our satisfaction that the analytic function fits and describes the parent distribution.
Then, the
common problem becomes one
of determining the best guesses
as to the important parameters of the continuous function.
of the sample and of the parent distribution respectively,
For example, for the means
we
write
N
m = i—L_ = n
^ />.(*)>
tl
wm
>
ancl
f*
=
f(x) d" Jo
i=i
and for the sample and the continuous-parent k moments about the mean
(see Eq. 2-17),
/*00
«*/(*) dx
N ek m
=£(*<-/*)*/>,(*<)
(experimental)
»'=l
and
Bk "
=^
(parent)
f(x)dx Jo
| That these are the "best" values is discussed with more or less general statistical arguments in the appendix of the paper by R. H. Bacon, Am. J. Phys., 21, 428 (1953). 2 Later, in Chapter 3, we show that taking these as the best estimators of /< and a in the normal distribution is consistent with the method of maximum likelihood; see Eqs. 3-12, 3-14, and 3-97 and the discussions attending these equations. In general, of course, other estimators exist for the parent parameters // and a but they are of no great interest to us at our present level of discussion. But it is worth
mentioning that there is no rigorous proof that m and Vn/(n — l)s are in fact the best Such a proof requires the introduction of some condition or criterion in addition to the conventional theory of probability. Maximum likelihood is such a
estimators.
condition, and
it
leads to useful estimators that are generally taken as best for
practical purposes in experimental science.
all
Frequency Distributions and Precision Indices
and
this is the best practical
89
formula (or estimator) for the universe or
Note
that in this expression for a in terms of m, the denominator is n — 1 instead of n. Often, no distinction need be made between the numerical values of s and a since neither is numerically very significant unless n is reasonably large, and then the differences between m and ju, and between n and n — 1, are relatively small. For
parent standard deviation.
— 1) which corrects for the bias whenever the standard deviation of the parent distribution is desired. And, regardless of the size of n, the difference in principle between s and a (and between m and p) is fundamental in statissmall n, say
<10
or 15, the factor Vn/(n
must be applied to
tical
s
theory and, indeed, in the philosophy of science.
It is
of importance that Eq. 2-22 or 2-23 not be interpreted in any sense
refers to the parent population and the equations means of estimating the value of a. The "approximately equals" sign in Eqs. 2-22 and 2-23 approaches
a
as a definition of a;
here merely provide a
"equals" as n
—>
oo.
On
the average, even for n rather small (but of course
greater than unity), the expression
note that, for n in this case;
proper in
=
1,
is
very nearly exact.
Eq. 2-23 gives 0/0 and a
and, for n
=
1,
is
Eq. 2-9 gives s
this case although, here also,
It is
interesting to
indeterminate, as
=
0, as is
is
proper
mathematically
"indeterminate" expresses a more
appropriate meaning.
Degrees of freedom. The factor n/(n —1) enters in Eq. 2-22 by an argument in statistical theory, and we shall encounter it again in Chapter 3 in connection with various tests and in Chapter 4 in connection with the 2 X test of the fit of a model distribution to an experimental distribution. So let us take a moment to see at least its plausibility. This has to do with the concept of the "degrees of freedom," which has the same meaning here that it has in geometry and in mechanics. The argument can be stated in terms of the number of "restraints" imposed on the universe or parent distribution. We seek to know the parameter a of the parent distribution of infinite population of which the n measurements are a sample. The only thing we know about a, or indeed of the parent distribution at all, is that the experimental measurements are a sample. When we insist that the parent distribution have a characteristic, any characteristic, which is determined from the sample, we impose a restraint on the parent distribution. For example, insisting that the parent distribution have a mean value equal to the sample value m imposes one restraint, and the price we pay for this restraint is a sacrifice in s as being the best estimate of
a.
In other words, the effective
number of
measurements useful in the determination of the best estimate of a is reduced from n to (about) n — 1, as in Eq. 2-22; one view is that (about)
90
and Experimental Errors
Probability
in
Science
one of the n measurements is used to determine the mean and the remaining — 1 measurements are left to determine the errors or the dispersion. The best estimate of a is a little greater than the value of s. n
In another view, the reason that the best estimate of a is
population
is
a
minimum
only
if //,
not m,
greater than s
for the universe
used as the reference value in
is
sum
calculating each deviation, whereas the
minimum
is
The sum of the squares of the deviations
the following.
for the n sample deviations
is
m, not /u, is the reference value. This argument is equivalent to the one expressed in the previous paragraph. The restraint mentioned above, viz., that m = ju, is always unavoidably a
only
if
imposed whenever we try to match a parent distribution, either a model distribution or the one believed to exist in real life (even though it may not be fitted by any model). And it is fairly obvious that a second restraint is imposed if and when we insist that the parent parameter a be given in terms of
—
as n
s.
1.
—
Then, the factor n
We
2 enters the statistical formulas as well
shall discuss situations
shall see that the
x
2
test
of this sort in Chapter
3,
and we
of model match in Chapter 4 involves just this
type of argument.
An
way of looking at the restraint is in terms of the ideas of The solution of a single equation is said to be unrestrained, but when we require the simultaneous solution of an additional equation we impose a restraint; we impose two restraints with three simultaneous independent equations; etc. The attempt to determine a equivalent
simultaneous equations.
from the n experimental measurements, as
in Eq. 2-23, involves three
simultaneous equations n
a2
=
I
Urn n-*co
^
n
n
te
- /")
2
ii
,
=
fl
2
lim
x '-^—
n-*oo
n
i ,
m=
2
xi
^—
(2-24)
n
two of which are independent and from which we eliminate /z and solve known quantities m, n, and the x/s. This solution can be carried out statistically but it is not an easy one. The feature of the argument is readily seen when it is pointed out that for n = 1 we have no deviation; for n = 2 we have two deviations but they are identical (except for sign) and therefore are not independent, i.e., we have only 2 — independent deviations. We can generalize the argument to say that for n measurements we have n — independent deviations; and the proof of this conclusion lies in the fact that any one of the x values is given by Eq. 2-1 in terms of the other z values and the value of the mean m. The argument for the n — factor in the denominator of Eq. 2-23 is a for a in terms of the
1
1
t
t
1
little
more involved than
just
the
number of independent
deviations
Frequency Distributions and Precision Indices
9/
because, with the aj/s as only a sample of the parent distribution,
not possible from Eq. 2-24 to determine either
it
is
or a exactly.
/u,
Another example of the concept of degrees of freedom is found in curve A curve may be made to go through all the points if the equation of the curve has as many constants as there are points. With the curve so fitted, no point deviates at all from the curve, and the sum of the squares of the deviations (i.e., the numerator of the expression for a) is zero. This is also the same as saying that one degree of freedom is lost for each point fitted, and therefore the number of degrees of freedom left for the determination of error (i.e., the divisor) is zero. The standard deviation a there is no information concerning is thus 0/0 and is indeterminate: dispersion left in the set of points; there is no freedom left. Finally, the remark may be made that division in Eq. 2-23 by the number of degrees of freedom, instead of by n, insures that we have the property mentioned earlier, viz., that the standard deviations based on small subsets have statistically the same average value as that computed for the fitting.
parent
set.
binomial model distribution.
Variance:
The mean and
the variance
of the binomial model distribution are readily written as the respective
moments. Using Eq.
1-20 for
=
a
^ The k =
factor is
k
p
= V
0« l
from k nominator by
k,
to k = with k >
ii
Next, n and
p
—
k\(n-k)\
sum
1.
is
(2-25)
us that the term for
by changing the lower
0,
pY~ k
are factored out, giving
I,
and
7]
=
/"
n
=
(n
= i (k
—
1
nP
— ;
-
see that the
summation
fc
—
_! n _ fc
k)\
then K<
2
is
1)!
1)!(h
k=o k\{t\
and we
tells
not altered
= I
jt
k
k n ~k
pq
Then, by dividing both numerator and de-
*
=
find
numerator of the summation
in the
=
we
k
t=o
zero, so the value of the
limit
Let k
in Eq. 2-3,
t
—
^B(k;
P
f
K
k)\
r\,
p) which, by Eq. 1-22,
is
unity.
Hence, ju
=
np
(2-26)
Probability and Experimental Errors in Science
92
The variance of
the binomial distribution can be written, with the help
of Eq. 2-19, as
V = ^ -K=I
=
a"
w! k*
Now we
—
substitute [&(£
write
+
1)
k] for & 2
„
,ft-»
—
Ac!(n
+ /«-V
/c)!
—
1)
factor in the summation, the
as well begin with k
=
2.
factor out n(n
o2
=
&(/;
—
\)p
2 ,
Then,
if
we
=
k
we have
2)! - dp 2 i = 2(/c — ;; 2)!(/t — and v = n — 2; then (/
2
a2
=
»(n
k)\
-^ A d!(v — i
-
l)p
2
2=
a
1
a Substituting np for
ju
=
2
n{n
v
"5
TTT^
o
+
" y"2 :
A*
d)!
and, again, the summation £5(<3;
r, /?)
—
l)p
2
= +
so
1,
—
fx
/j?
from Eq. 2-26, we have a2
=
In the binomial distribution,
The
^-y-» + /• -V
n(n
—
summation may just — 1), and
cancel the factor k(k
a
Let 6
(2-27)
and, making use of Eq. 2-25
,
"•
s=o
Because of the
2
/,
A:)!
,
= iKk-i)
o2
pV"* -
—
k!(h
a=o
np(l 2
<j
is
—
p)
always
=
npq
less
(2-28)
than np
(=
//).
fractional standard deviation, also called the coefficient of variation,
can be written for the binomial model distribution as
=
Inpjl-p)} np
ju
which
is
)
'
1
a
*
2
=
np
J
(i \/u
especially convenient in those cases for
_
S
(2 . 29)
2
w
which n
>
/a,
as in the
Poisson approximation. Since the normal
and the Poisson model
distributions are special cases
of the binomial, Eqs. 2-26, 2-28, and 2-29 also apply to them.
Equations
2-26 and 2-28 are derived again in Chapters 4 and 5 specifically for these distributions.
Standard deviation
mean (standard error) s m If we were measurements, the second value of the mean
the
in
to record a second set of
//
.
would in general differ from the first value. But the difference between the two means would be expected to be less than the standard deviation in either set. This expectation could be checked, of course, by a large number TV of repetitions of the set of measurements and the frequency distribution
93
Frequency Distributions and Precision Indices of the
N means analyzed. We
deviation in the
would write
/ S
where
m
is
for the experimental standard
mean
the grand
willing to record the
»
"
|
- mf\
(m,
V^H
<2 " 30>
/
mean of the TV values. Very few experimenters Nn measurements required in order to obtain
are the
value of s m from Eq. 2-30. Fortunately, statistical theory provides a satisfactory formula for sm from the n measurements of a single set. In Chapter 3 we show in simple
fashion that sm
=
(2-31) -J=-
In reference to the parent distribution, the formula rather than for s m
,
is
«
=
.
\
-=
(2-32)
yjn
I
Combining Eq. 2-32 with Eqs. 2-22 and
°m
=
am
and
N
Either sm or a m
of course for a m
is
2-9,
we have
^
(2-33)
/
often called the standard error in (or of) the mean, or,
in experimental sciences, simply the standard error *
the reliability of the mean,
it
standard deviation because
it
number of measurements
As
a measure of
has more direct significance than has the includes
more vigorously
the effect of the
n.
In one theoretical derivation of the expression for o m (see Eq. 3-26) the
approximation
made
is
with vVvery large,
is
that the hypothetical distribution of the TV means,
almost a normal distribution irrespective of the shape
of the particular parent distribution of which the samples. tion
is
distribution *
comes
Unfortunately,
closer to being normal.
many
are
is very good even when from normal, and it improves as the parent
This approximation
significantly different
Nn measurements
the parent distribu-
One consequence of
investigators use the term "standard error" as
this
synonomous
with "standard deviation" without specifying the particular random variable involved. The phrase "standard deviation in the mean" is awkward; and if the ambiguity in
"standard error" persists, a new term should be agreed upon or else the qualifying phrases "in the measurements" and/or "in the mean" respectively must be added.
Probability and Experimental Errors in Science
94
approximation /u
by
less
teristic
The error,
than
is
any one sample mean
that the chance that
±o m
is
about 0.683 since
this
of the normal distribution (see Table
fractional s m
55
=— =
—
is
from
a charac-
4-5).
=— =
fractional a m
,
differs
mean, or the fractional standard
fractional standard deviation in the is
numerical value
—o
-=
(2-34)
Equations 2-31, 2-33, and 2-34 are formulas that should be
working knowledge of
scientist's
in
every
statistics.
Skewness. The first and second moments about the origin and about mean have been discussed above. Moments about the mean are
the
And
especially useful as dispersion indices.
it
was pointed out
that the
moment about the mean the greater is the relative weighting of the large deviations. None of the dispersion indices discussed so far gives a measure of the asymmetry of the distribution. Now we introduce higher the
the dispersion index called the coefficient of skewness, which in
terms of the third
moment about
the mean.
This coefficient
is
defined
is
n
skewness
=
—
(2-35)
3 /is
(experimental)
and
skewness
=
—
(2-36)
no
(universe)
(For a continuous universe distribution, the summation should be replaced by an integral, or a3
etc.,
as in Eq. 2-21.)
in
Eq. 2-36
The
factor 5 3
makes skewness a dimensionless quantity used. The coefficient of skewness of the measure-
the denominator
in
independent of the scale
ments in Table 2-2
is
tion in Fig. 2-A(e)
is
+0.05], but the coefficient of skewness of the distribuabout +21. Positive skewness means that more than half of the deviations are on the left (negative) side of the mean but that the majority of the large deviations are on the right (positive) side. Because the skewness is so sensitive to large deviations, its numerical
value varies value
is
initially rather
restricts its practical use,
upon
widely as n increases, and a reasonably stable
not generally realized until n
it is
but when
its
is
rather large.
This sensitivity
experimental value can be relied
a very powerful aid in the fitting of a
model
distribution.
This
is
Frequency Distributions and Precision Indices particularly true since
generally
most
so sensitive in the
it is
difficult to
check, and since
it
95
tail
regions where the
fit is
does allow a comparison of
asymmetry.
The expression
for experimental skewness in terms of the working
=
skewness
2 —
(*i
-
w) 3
-
nA 3
3A (2-37)
ns
(experimental)
s
where w and A are defined in Eqs. 2-4 and 2-14. The binomial model distribution has a third central moment, the mean, given by 8
"
=
np{\
-
-
p){\
=
6>
*
3 -%
=
=
p)(l
2p)
2p)
(1
-.,
(2-39)
1
«"
(binomial)
in deriving
- — — —"^ =— ^ [*» - V)Y W np(l-
about
(2-38)
easily
skewness
i.e.,
2p)
proved by an extension of the argument used Eq. 2-28. The binomial skewness is
which can be
mean
n -^t
(2 .40)
no Equation 2-38 or 2-39 shows that the binomial distribution only in case/? = \, as was mentioned in Chapter 1.
Other dispersion is
indices.
The fourth
is
symmetrical
moment, divided by and is written
central
called the coefficient of peakedness or of kurtosis,
s4 ,
as
n
2
peakedness
=—
(*<
-m
4 )
(2-41)
"s
(experimental)
and
peakedness (universe)
(Again,
if
Eq. 2-42
is
=—
(2-42)
na
to apply to a continuous universe distribution the
summation should be replaced by an integral.) The peakedness, like skewness, is dimensionless and is independent of the scale. The fourth moment, even more so than the third, is restricted in its usefulness with actual measurements unless a very large number of trials have been made. If n
is
not large, the value of the peakedness
because of
its
distribution.
is
numerically unreliable
high sensitivity to fluctuations in the
tail
regions of the
Probability and Experimental Errors in Science
96
Combinations of precision indices are sometimes combination
useful.
One such
is
3(skewness)
2
—
2(peakedness)
+
6
is zero for a normal distribution, positive for a binomial or a Poisson distribution, and negative for distributions that are extremely
which
peaked. Additional indices of dispersion can be defined and evaluated but their An exception may be is generally not very great.
practical significance
the universe standard deviation in the sample standard deviation,
a s which,
if
,
the
a can be approximated by
number of
deviation
s.
where 0/
is
s, is
which to express the standard
significant figures with
may be
This index
written in general form as
moment about
the fourth
viz.,
useful as a guide in determining
the universe mean.
In special cases,
Eq. 2-43 simplifies as
a for a
normal distribution (2-44)
and
=
as
/2a
2
1
I
An
\
where the It is
effect
evident that
distribution
6/
is
,
a2
+
3a4 expressed
in
a
units.
the use of Eq. 2-44 for a
case, suppose that n = 8;
more than two
for a Poisson distribution
1
of the particular shape of the distribution can be seen. " normal distribution is 3a4 and for the Poisson 4 for the
As an example of normal
+
H l\
then \/V2n
normal or approximately
=
0.25, and, therefore, not
significant figures should be used in expressing the standard
and most likely only one. It has been mentioned that the probable error index in scientific work, and it may be worthwhile
deviation
s,
is
a popular precision
to write the expression
for the probable error in the probable error, viz.,
pe pe
where the numerical generally a
little
=
0.675
-^ = 0.48-^
(2-45)
normal distribution and is any other distribution (e.g., for a Poisson
coefficient applies for a
different for
distribution see Table 5-2).
As
a final
comment,
all
the central
moments
are useful
when we
are
Frequency Distributions and Precision Indices
97
dealing with mathematical or theoretical distributions,
i.e.,
with distribu-
whose shapes can be expressed exactly. For example, this is the case with the distributions (e) and (/) in Fig. 2-4. However, for some very interesting distributions the central moments do not exist unless the tails tions
of the distributions are arbitrarily cut
Cauchy
this is the case for the so-called
off;
distribution, viz.,
/(*)
= 7r[l -f-~7 + (* ~ yU)vn]
(2 " 46)
2
This expression appears in theoretical physics rather frequently, classical dispersion in physical optics, in the shapes of lines, etc.
In actual experimental work, distributions whose
as x2 or less rapidly than x2 (the square ,
moments
diverge), are not
As
Conclusions.
is
e.g., in
atomic spectral tails
drop off
the limiting rate for which the
uncommon.
we would
stated earlier, the information
to have in the description of a set of measurements
is
really like
the complete mathe-
matical equation for the distribution histogram or frequency curve. But, in real
life,
distribution
is
with a limited number n of actual measurements, the
at best defined with
some
finite
obtaining the exact mathematical formulation
vagueness. is
Then, since
we must Each index
impractical,
content ourselves with one or more of the precision indices.
has
its
own advantages and
disadvantages in providing the information
we
want.
The standard deviation, i.e., the square root of the second central moment, is the index in widest use and the one on which most statistical theory is based. The probable error, the evaluation of which first requires is rather widely used among experiThere are three excellent reasons for the popularity of
evaluation of the standard deviation,
mental
scientists.
the standard deviation: reliable; (2) the rules
(1) it is statistically efficient
and experimentally
of operation in the propagation of errors as based on
the standard deviation (rules discussed in the next chapter) are conveniently
simple; and (3) the tively unreliable.
moments higher than
the second are usually quantita-
Occasionally, a measurement situation
is
encountered
which the larger deviations are suspect for nonstatistical reasons, and then even the second moment overweights the larger deviations and the mean deviation is used as the dispersion index. The mean deviation, howin
ever,
is statistically less efficient
The general
than
is
the standard deviation.
and precision indices discussed in of measurements having any type of
definitions, concepts,
the present chapter apply to sets
frequency distribution.
We
shall continue with the treatment of empirical
data for another chapter, introducing a theory, before taking
little more advanced up the mathematical models.
statistical
Probability
98 2-12.
and Experimental Errors
in
Science
Problems
Note:
A
numerical answer to a problem
is
not a complete answer;
the
student must justify the application of the equation(s) he uses by giving an analysis of the problem, pointing out how the problem meets satisfactorily the
on which the equation
conditions 1.
From
is
based.
the measurements of Table 2-1, determine (ans. 31.7 to 32.3
(a) the range,
(b) the (c)
(ans. 31.9
median, (ans. 31.8 to 31.9
the middle 2 quartiles,
(d) the
mm;
31.9 to 32.0 (ans. 31.9
mode,
(e)
the arithmetic
(f)
the
mean with and without
the device of the working mean, (ans. 31.92
mean
(ans. 0.11
deviation,
(g) the fractional
mean
mm) mm) mm) mm)
mm) mm)
(ans. 0.0036 or 0.36%) and without the device of the working mean,
deviation,
(h) the standard deviation with
(ans. 0.15 (i)
the fractional standard deviation,
(j)
the standard error, and
mm)
(ans. 0.004 7 or 0.4 7 (ans. 0.045
%)
mm)
(ans. 1.5) (k) the skewness with the working mean. Express each answer with the proper number of significant figures and indicate
the proper dimensions. 2.
Make
the histogram for the following frequency distribution
x
=
5
83.304
Probability and Experimental Errors in Science
100
Why (i)
do we
prefer
moment
the second central
rather than the fourth central
moment
in
describing experimental dispersion, (ii)
the arithmetic
mean
rather than the
rms mean
as the best location value,
and (iii)
in
some
cases the
median rather than mean as a location value?
13. Discuss the relative
advantages and disadvantages of the
mean
deviation
the standard deviation as a dispersion index in 8 measurements that approxi-
vs.
mate a normal
distribution except that they are
(a) consistently
somewhat higher
in the tail regions,
(b) inconsistently higher in the tail regions
owing to inconstant systematic
errors, or (c)
made with
a rather high "background" (as in the intensity measurements
of a component line in emission spectra or in a nuclear cross-section measurement).
"The true
world
logic of this
is
the calculus
of probabilities."
JAMES CLERK There
numerus:
"Defendit
is
MAXWELL safety
in
numbers."
AUTHOR UNKNOWN
3
"Maturity
capacity
to
endure un-
JOHN
Statistics of in
the
is
certainty."
FINLEY
Measurements
Functional Relationships:
Maximum
Likelihood,
Propagation of Errors,
Consistency Tests,
Curve
The
sole
Fitting, Etc.
purpose of a measurement in an experimental science
is
to
influence a hypothesis.
Science progresses, as stated earlier, by the constant repetition of three steps: (1) conception of
some aspect of nature,
i.e.,
a hypothesis, based
on
all experience available, (2) calculation of the a priori probabilities for
certain events based
on
this hypothesis,
priori probabilities with actual probabilities.
The comparison
thesis or a rational basis for
The
and
measurements,
modifying
with experimental
is
that
it is
we would we would measure, are
define,
101
hypo-
perforce formulated
the intrinsic difficulty with
that they are unavoidably veiled in errors.
particular property of nature that
quantity of nature that
i.e.,
it.
with only a limited amount of experience, and is
comparison of the a
yields either a confirmation of the
intrinsic difficulty with a hypothesis
actual measurements
(3)
and each
Each
particular
at best only probabilities.
Our
task as scientists
hypotheses are correct,
in
and Experimental Errors
Probability
102
to increase the
is
i.e.,
in
Science
probability that our current
are confirmed by objective experience.
Elementary but fundamental examples of the function of measurements improving scientific hypotheses were pointed out in Sections 1-4 and 1-5.
These examples arose in the discussion of conditional probability and of the reliability of inferred knowledge. In Chapter 1, however, there was no ambiguity as to the outcome of each experiment; the veil of errors enshrouding each measurement was not involved. line
of argument begun in Chapter
with emphasis
1
Let us continue the
now on
the measure-
mental uncertainties. First,
we continue
this
argument with very general
specific line of
we introduce
equations for a page or two, and then in Section 3-1
maximum
powerful and useful method of the veil of errors.
some
the
likelihood for dealing with
These discussions are intended to give the reader though upon first read-
insight into or feeling for the subject even
ing he
may
in these
not understand every step in detail.
If
he has
much
trouble
pages he should not be discouraged at this time but should pro-
ceed in cursory fashion on to Section
Consider
first
3-2.
a discrete set of possible hypotheses; in particular, con-
two hypotheses A and B.
Suppose that pin(A) and pm(B) are the A and B, respectively, are correct. pm(A) + pin(B) = 1. Suppose that an experiment is performed •, x •, x Suppose also and a set of measurements xx x 2 it n is obtained. sider
initial
or a priori probabilities that hypotheses
•
,
that
pA (x,) is
•
•
,
•
the probability, on the assumption that
A
is
correct, that the
would be observed, and that /^(x ) is the probaThen, as the bility, if B is correct, that the set x would be observed. consequence of the experiment, the a priori probabilities p-m{A) and p m {B) that applied before the experiment are now modified; they become particular set of values x
i
t
Pmod(A)
=
(J-1)
Pin(A)p A (x n
l
These expressions were written "/?
heads
in
)
+
p in (B)p Ii (xi )
Pin(B)p l{ (x,)
P\n(A)p (x
"head" or
t
in
l
)
+
Chapter
a row" appeared
p in (B)p Ji (x I
i
where the
in the
)
single observation
case of the penny-toss
in place of x if and where "the sun rose" or "the sun rose n row" appeared in place of xt Insofar as the experiment yielding x is well chosen and well performed, our confidence in A (or B) is increased at the expense of our confidence in B{or A), the increase in confidence being
experiment times
in
a
.
i
proportional to, say, the difference /j„io.iM) — p\n(A). Of course, this value /?mo.iM) becomes p\ n {A) for the next experiment to be performed, etc.
of Measurements
Statistics It
in
Functional Relationships
often happens that the possible hypotheses
than a discrete
set rather
As mentioned
set.
in
make up
Chapter
1,
the problem of the sun's rising as involving a continuous
a continuous
Laplace viewed set,
whereas the
analogy to the coin-tossing experiment presumed a discrete In the continuous case, the experiments serve to sort the numerical
argument set.
103
in
values of each of the/Jmod's into a continuous distribution, and presumably this distribution
has in
a single
it
maximum which
article
of faith in experimental science.) The
corresponds to the
(This presumption
hypothesis deserving our greatest confidence.
maximum becomes
is
an
increas-
good additional experiments are performed and interpreted. Whether the set of possible hypotheses is discrete or continuous, we are confronted in the evaluation of each/? m0 d with the problem of what to do with the variety of x values yielded by the experiment. The calculation of pA (z ) requires that the hypothesis A be expressible in some analytic functional form. Sometimes the hypothesis refers to the functional form itself, and sometimes it involves the property being measured as either a ingly sharp as
i
t
In any case, suppose that the
variable or a parameter in the function.
functional form
observed
is
is
cj>(A,
Then
xt).
the probability
equal to the product of the n factors
a single experimental observation.
pjm) =
trials
are
all
II (A,
x)
is
t
independent (see Eq.
and
A and
B,
is
x { ), each factor being
n #a *d
(3-3)
=1
written with the assumption that the n 1-5).
A similar expression may be written To compare the
for each of the other possible hypotheses. different hypotheses
that set x {
pA (x^)
Thus,
1
where the product
cf>(A,
we may
write the ratio
reliability
from Eqs.
of two
3-1, 3-2,
3-3,
Equation 3-4
is
likelihood ratio,
pA (x t)
is
PmodM)
p in (A)
Pmoa{B)
pm(B)
U(f>(A,
x^
U<j>{B,
x)
(3-4) •
t
recognized as the "betting odds" and
and
Ucf>(A,
ar ) 2
is
a normalized probability as
known all
often called the
is
as the likelihood function
proper probabilities
cal evaluation of the likelihood function
is
straightforward
are].
if
[if
Numeri-
the functional
form of each hypothesis is known. And, as mentioned in Chapter 1, if we have no a priori knowledge favoring A or B, we usually resort to the desperation-in-ignorance guess that pm(A) = pin(B). 3-1.
Method
of
Maximum
Likelihood
As an example of the use of the likelihood ratio, Eq. we wish, from n independent trial measurements of x,
3-4,
suppose that
to find the
most
Probability and Experimental Errors
104
in
Science
g of a true parameter y in a known matheAssume that there is only one parameter x n) °f tne up some function g = g{%x, #2
likely estimate (or estimator)
matical functional form
We
to be determined.
y).
(f>(x;
set
>
'
'
'>
values of x from which the estimate
g is to be deduced. There are several methods for setting up such g functions, and each method gives a different degree of goodness of estimate ofg. The statisticians rate these methods in terms of their relative efficiencies. As stated in the discussion of the mean deviation in Section 2-11, the relative effitrial
ciency
is
defined as follows.
If
N sets of samples each
of size n are taken
from the parent population, N different values of g are obtained. These N values of g themselves form a distribution, and let us say that the standard deviation of this g distribution is noted. This process is repeated methods for estimating g, and the standard deviation ob-
for each of the
tained with the respective different
methods
deviations.
is
Of many
g
most
The
noted.
is
relative efficiency
method having
possible methods, that
standard deviation has said to be the
method
its
the smallest
values clustered most closely together and
g
Also, with any method,
efficient.
if
the
mean of
distribution for a sample size TV tends to a value different
estimate
is
of two
taken as the inverse ratio of the squares of the standard
said to be biased.
If the estimate
g converges
from
is
the
y, the
to the true value
y as N —* co, the estimate is said to be consistent, i.e., free of bias as the sample size increases without limit. (An example of bias is mentioned presently.) For scientific work, it is generally agreed that a good estimate must have zero (or at most small) bias as well as reasonably high efficiency.
For most parametric estimation problems, the method of estimation as the method of maximum likelihood is the most efficient, and, if n is large, the estimate is usually satisfactorily consistent. The likelihood
known
function, the product of
L(x x x2 ,
,
•,
all
n values of
x n y) ;
=
{x x
;
>(:r
y), is written
t ;
y)(x 2
;
y)
•
•
•
0(x„; y)
(3-5)
Especially in the case of a discrete population, certain values of x i are
observed with a frequency/, which
is
greater than unity.
actual frequency/, appears as an exponent total
number of
factors
is
r
with
r
<
on the factor
In this event, the <£(#,; y),
and the
n.
which there is a continuum of possible The relative is a continuous variable. probability of any two different values of g is given by the likelihood ratio, Eq. 3-4, in which the likelihood functions are of the form given in Eq. 3-5 with one value ofg in place of y for the numerator of the likelihood ratio and with the other value of g in place of y for the denominator. The ratio Consider the general case
values for g,
i.e.,
p-m{A)lp m {B),
if
in
a parameter that
nothing
is
otherwise
known about
to unity as the desperation-in-ignorance guess.
it,
may
We
be taken as equal imagine each of
N
Statistics
of Measurements
possible values of g, viz.,
and each of the
Functional Relationships
in
g lt g2
•
,
L
•
•,
gp
•
•
gN
•,
,
105
inserted in the
L
function,
computed. These TV values of Lj form a distribution which, as TV— oo, can be shownf to approach a normal distribution and whose mean value at the maximum of the distribution of
TV values
;
corresponds to the desired estimate g. To find the value of L j that makes
L a maximum, we differentiate L with respect to y and set the derivative equal to zero. Since Lisa maximum when log L is a maximum, we may use the logarithmic form when it is more sum than with
convenient to deal with a / aiogiA
=<>=£/< |- log #*<;$)
L
a Oy dv
\
,
ly=g
and we seek a solution of
Then,
a product.
(3-6)
og
=l
i
this expression for g.
This value for
g
is
the
of y (but as we shall see it is not always an unbiased estimate). Solution of Eq. 3-6 is often explicit and easy without multiple
most
likely estimate
roots;
in case of multiple roots the
The procedure can be there
is
most
significant root
generalized to treat
is
chosen.
more parameters than one;
one likelihood equation for each parameter.
The maximum
method^ is generally considered to be about approach to the majority of measuremental problems
likelihood
the best statistical
encountered in experimental science. This method uses all of the experimental information in the most direct and efficient fashion possible to
unambiguous
give an
estimate.
functional relationship must be
p
in
by a binomial
distribution,
and
let
that "success"
is
observed
likelihood function
is,
w
us find the
'-=
more
known
to be fitted
maximum
likelihood
Call/?* this estimate of/7.
times and "failure" n
from Eq.
that the
the above example
of n measurements that are
estimate of the success probability/?.
is
or assumed.
To make
the binomial distribution.
specific let us consider a set
principal disadvantage
Its
known
—
w
Suppose
times, so that the
1-20,
Ijpo
-
h
p)
Then,
— logL op t See, e.g.,
/p=p*
=0 = --p*
H. Cramer, Mathematical Methods of
1
—
(3-8)
p*
Statistics (Princeton University
Press, Princeton, 1946). % This method was used in special applications as early as 1880 by Gauss, but was developed for general applications by Fisher in 1912 to 1935. See, e.g., R. A. Fisher,
Statistical
Methods for Research Workers (Oliver and Boyd, Edinburgh,
1950), 11th ed.
Probability and Experimental Errors in Science
106
and the most
likely estimate
of the true value of p
is
=~
P*
(3-9)
n
which
is
the experimental probability as defined in Eq. 1-39.
This method also leads to a
and a
fi
Vnpq;
Problem 27
see
To
the normal distribution.
in
maximum
of
=
in Section 3-11.
method and the standard
illustrate further the
mean
likelihood, let us estimate the
fi
deviation a of a normal distribution from a sample set of size the estimates
m
and
n.
place of \/hV2, Eq. 4-8, the likelihood function
is
L=n^=exp(-^) 2a i
= i a^J27T
Following the procedure of the
Call
In this case, using Eq. 1-25 with a in
s respectively.
maximum
likelihood
method
first
0=|^i
=
(AlogL)
(3-.0)
1
\
for
ju,
(3 .U)
from which
m=
-fx n =
which agrees with Eq.
To
(3-12)
i
i
i
2-1.
estimate a,
(-U^)
=0 = 2
f.ogL)
(3-13)
from which s
=
2
-
I
- f*f
(*,
0-14)
n t=i which,
if
t In the if
we
is
replace
/u
with our best estimate,
method of maximum
retain
estimate
As
we
u
it
likelihood,
v
s'
a
is
s'
2
=
——
"
1
2, (x =
-
not necessary to "replace" s'
=
2
"
1
n
—
; 1
V
(x,
—
fi
2-9.
with m;
mY, and
this
>-,
s' is
not an unbiased estimate of
t
~m
2 )
'
s
an unbiased estimate of a 2
Vo = 2
- HM/i -
the parent distribution
is
normal.
—
We
,
1
a.
Rather, the unbiased estimate of
given by a/ v i"
if
m, agrees with Eq.
unbiased. See Eq. 3-98 and the attending discussion.
a fine point, although
=
is
can be shown that the estimate
i
2
it
viz.,
777-; r(j»)
l)]
s
shall generally ignore this fine point.
Statistics
of Measurements
Functional Relationships
in
In the case of the mean, the estimate
estimate of /.i; this
But
in the case
is
of the standard deviation, s
the fraction it is
V(n
—
it is
always
in this case, correction for
as n
—*
size
even though
fi,
in
an unbiased and consistent mean of any distribution.
a biased estimate of
is
than a;
less
it is
l)/n as stated in Eq. 2-22.
a consistent estimate since
5 is
is
true for the estimate of the
a negative bias because
as
m
107
it
When
the bias
is
s
has
less
by
known,
can be made. However, the estimate
it
converges on the true value asymptotically
oo (but s does not converge
N ->
a.
on the average
on a
if
we take
N samples each of small
oo).
Suppose that the variable k is known mean /u unknown. The m, can be obtained from the likelihood function
the Poisson distribution.
to have a Poisson distribution, Eq. 1-26, with the
estimate of
by
//,
viz.,
differentiating with respect to
[x
and
setting the derivative equal to zero,
=0=i4^-l)
flogl)
(3-16)
and solving for m, 1
-Ik f = -Xx n i=i n
™=
i
i
i
(3-17)
i=i
in
agreement with Eq.
Can mean is
2-1.
the statement that the
this
be interpreted in any sense to confirm
statistically the best location
value for an
asymmetric distribution ? Instrumental parameter.
method of maximum
As
a final example in this section of the
likelihood, suppose that
/ is
the time interval between
counts in a Geiger counter measurement of the intensity of cosmic rays,
and suppose that the frequency function for
where 6
is
=
6e-
t is
et
of the form (3-18)
some unknown instrumental parameter whose value we wish
Suppose that a sample set of n measurements of / has been made. The maximum likelihood estimate O e is given by the procedure of setting up the likelihood function
to determine.
L=0"exp -02'* i
=l
(3-19)
Probability and Experimental Errors in Science
108
and of finding the value of
=
(!£)
e
(Incidentally, in this
0, viz.,
°
6 e , that
makes L a maximum,
= K^-^H-^X, +
n)
(3-20)
=^-
(3-21)
example the estimate 6 e
is
mean
the reciprocal of the
of the recorded times.) Precision in the
maximum
lihood function L(g) in the set
As
vs.
g
of measurements xi that
graph of the
like-
pertinent to the possible values ofg.
is
discussed above, the estimate g* of the true value y corresponds in
the graph to the
maximum
depends upon the
details
may be As
A
likelihood estimate.
gives all the experimental information contained
value of L;
the precision in this estimate
of the spread of L(g) about g*.
This precision
stated as, say, the standard deviation of the L(g) distribution.
N—* oo, the L(g) distribution becomes normal in shape, and the standard with N limited and rather easily deduced but in real
deviation
is
life,
;
may be
usually rather small, L(g)
problem of finding the precision approximation, however,
it
may
For an assumed normal L(g)
greatly different
reliably
is
from normal, and the
more complicated. As a
be treated as though distribution,
we may
it
first
were normal.
write the spread of
L(g) about the best estimate g* as
L(g) oc e
where the standard deviation
4-8.
=
^
L= — h\g —
g*f
Then, log
(3-22)
is
"•
from Eq.
- h2(9 - 9 * )2
(3 " 23)
+
constant
2 ^logL=-2fc 2
(3-24)
dg
Combining Eqs. 3-23 and
3-24,
we
a °=
/
find the following very convenient
expression,
Standard error a m
.
~T2
1
As an example of
\* (3-25)
the use of Eq. 3-25,
we may
point out that differentiating the logarithmic form of Eq. 3-10 twice with respect to
\x
allows us to write immediately the expression for the standard
Statistics
of Measurements
Functional Relationships
in
error (standard deviation in the
mean) for the normal
g
distribution, viz.,
=^
*m-l-rr\ In this example,
of the
refers to a possible value
(3-26)
mean
is
an exact expression even
if
sample set of can be shown
in a
n measurements from a normal parent population, but that Eq. 3-26
109
it
the parent distribution
is
not
normal. In
many
L cannot
by trying several
successive value of
g
of g and by using Eq. 3-5 with each
different values
These several values of L(g) If it is normal,
written in place of y.
and the general shape of L(g) sketched in. the same everywhere if it is not normal but
are then plotted 2
(d /dg
ent
be determined analy-
In such cases the distribution L(g) can be approximated numeri-
tically.
cally
actual problems, (d 2 /dg 2 ) log
2 )
log
L is
;
from normal, the average value
The method of maximum
2
(d /dg
2
likelihood,
precision of the estimate so obtained,
is
)
log
and
L may be its
is
not far differ-
extension to give the
applicable in
problems in which the functional relationship
is
used in Eq. 3-25.
known.
many
We
statistical
shall
have
occasion to use this method in later discussions.
We
have in the above discussion referred several times to the normal
distribution
and
to
details; we shall continue to do so For a proper background, the reader is the first part of Chapter 4.
some of
its
throughout the present chapter. well advised to read at least 3-2.
Propagation of Errors
In Chapter
maximum
2,
and
in
most of the
specific
examples of the method of
likelihood of Section 3-1, the statistics of direct measurements
was considered. But interest in precision in experimental science extends, of course, beyond direct measurements. The best value (the mean) of each of several direct measurements is very often used in the computation of the value of another property. For example, velocity is derived from the direct measurements of a distance and of a time, a computation involving simple division. (A critical examination of the measurement of a distance or of a time shows it to be the really a difference between two location points in space or time observed "direct" fluctuations include the component fluctuations in each of the two location points.) Examples of derived properties from more
—
—products, quadratics, —are very easy to
complicated relationships metric functions,
etc.
exponentials, trigono-
find.
In this section, the rules are discussed for the determination of the precision or reliability of the
computed "measurement"
in
terms of the
Probability
110
measured property. This
precision of each directly
as
tfie
and Experimental Errors
Science
in
the subject
is
known
propagation, or compounding, of errors.
Suppose that the derived property u is related to the and y by the functional relation
directly
measured
properties x
= /(*,
"
where the bar
= f(x
ut
y),
it
mean
u
yt),
= fix,
(3-27)
y)
and where the function may be combination of assumed to be regular as regards continuity and
signifies the
value,
additive, multiplicative, exponential or otherwise, or a
The function
these.
is
derivability. First,
we must
decide whether the measurements of x are entirely
independent of those of
many
In
y.
answer
cases, the
is
For
obvious.
example, the distance and time measurements for velocity are independent.
The measurements of the two sides of a rectangular table for its area, if in each measurement the observer uses a ruler having a wrong calibration and/or makes the same systematic parallax error, are dependent to the extent that both contain the same systematic error. Such measurements and their errors are also said to be partially correlated (i.e., as discussed later, they have a correlation coefficient between and ±1). In this case, and in most or all actual cases, the independence is only partial the errors are of both the random and the systematic types. Both types are obviously present, for example, in the measurement of the area of the table when the parallax errors are random but the calibration error is
—
systematic.
In
many
actual problems, the functional relation involves parameters
that are not independent of each other.
involve two or
more of
For example, the
the "fundamental constants"
relation
(e.g.,
may
electronic
charge, electronic mass, velocity of light, Avogadro's number, Planck
Boltzmann constant, Faraday constant, Rydberg constant, Bohr magneton, fine structure constant, etc.), and the errors in these parameters from previous experiments propagate along with those in x
constant,
in the present experiment.
Nonindependent errors: dependent error
is
that
its
systematic errors.
systematic errors cause
i.e.,
otherwise
partially correlated as stated above.
propagate to yield the error
=
Au
in u
all
in u.
the component The symbol A
direct in
characteristic of a
independent deviations to be
(See Section 2-7.)
Dependent errors
according to the relation
— Ax + — A*/+ ••• dx
for
A
algebraic sign generally tends to be the same;
(3-28)
By
measurements
x, y,
-
•
that
may
Eq. 3-28, in contrast with d which
be involved is
used for
of Measurements
Statistics
in
Functional Relationships
supposedly independent or random errors,
we
are
now
III
intended to indicate that
is
dealing with a clearly recognized systematic type of error.
In practice the dependent errors are not clearly recognized;
usually inextricably mixed in with the
random
they are
and both types of
errors,
make up the observed frequency distribution of x or y Sometimes one can make a test for the presence of dependent
errors together
or
•
\
The
errors.
test uses the basic expression for correlation (discussed later).
This expression, for n measurements of each variable, with the variables
taken a pair at a time, say x and
y, is
i(te ay,)«0
(3-29)
(
4
=
where dx €
—
xt
dy {
x,
=
=1
yi
independent, each term in the and, as «
—»- oo,
sum
the
is
—
sum
is
not very large, the
may
nearly zero, and the deviations dx it dy t interpretation
Of course,
becomes a
if
will
it
not be detected by
sum may be
be independent;
still
the
statistical one.
a systematic error
a dependent error,
are
as likely to be positive as negative,
If n is
zero.
sum
deviations in the
If the
y.
cause a
is
present in x and/or
shift in the
this correlation test.
the inconsistency of different
mean we
Later,
means by which,
y, etc.,
x or y or discuss
in
some
•
•
is
not
and
will
but •,
some
tests for
instances, the
presence of those systematic errors that depend on the variable or on time
may
Usually, however, small systematic errors can only be
be detected.
suspected,
When
and the magnitude of each component only estimated.
dealing with systematic errors in general, the assumption that the
algebraic signs of
This
justified.
ponents
is
is
the
all
component
errors are the
especially so if the total
greater than, say, three.
If the
same
is
not
statistically
number of independent comnumber of components is suffi-
ciently large to justify treating the individual systematic errors as having a
own, with a meaningful mean and standard deviation, then these errors may be treated as though they were random. But if this is not the case, we are generally left with no satissignificant frequency distribution of their
factory procedure for reckoning the propagation of systematic errors.
Strenuous efforts should be large systematic error.
tinguishable from the in the
mean
made
to recognize
and to eliminate any
Residual systematic errors are perforce indis-
random
errors, but, if their net effect
can be estimated
value, this estimate should be stated as well as, say, the ob-
served standard deviation.
Random
errors.
treated as random.
values of x and n" W; is
Suppose that
all
errors are independent
In the general case, in using Eq. 3-27, trial
values of y.
It is
most
and may be
we have
likely that ri 7^ ri
not computed from an associated pair of xi and y t
.
ri trial
and
that
So we imagine an
Probability and Experimental Errors in Science
112
and n values of y such that each of the n computed from a corresponding imagined pair x{ yt The pairing may be done at random. The imagined xi or y is equivalent to the actual measured values of x or y if the frequency distribution of the imagined set is the same as of the actual set. With this equivalence, we choose n, the number of "trial" values of u, as large as we please. The argument is easily extended to include more than two independent equivalent set of n values of x
values of u
is
.
,
t
{
t
variables.
Next we assume that all the deviations 6x = xi — x and by Then, we define the deviation Su as t
are relatively small.
=
{
y
t
—
y
t
S Ui
=
Ui
- u w — dx + — dy t
ox
(3-30)
f
oy
when
which follows from the definition of the
partial differential
for small increments. Or, better, Eq. 3-30
may be obtained from the Taylor
written
expansion from elementary calculus; thus ui
=f([x
+
= /(*,
V)
+
dx l [y i
+
-^ & x ox
i
dyj)
+
-^ oy
fyi
and du t which,
if
— dy
= u -u = — 6x + f
t
ox
(3-31)
t
oy
the higher order terms are neglected,
is
the
same
as Eq. 3-30.
=
we have
here taken u a.sf(x, y) rather than as u (2?=i M i)/ w but these two definitions of u are essentially the same //all deviations are
Note
that
small.*
Note
that the partial derivatives are taken at x
=
x,
»
y
=
y,
hence
are constants. *
For example, suppose
we can show
=
u(x)
that this relation x, 2
=
is
x2
not true in general that x-
It is
.
true in the limit as dx t -> 0.
+ dx y =
(x
+
x2
t
2x dx
t
+ 6xi
We
=
z2
.
write
2
Then, by definition of the mean square, n
1
x2
i
<5x,
2
term.
But
it
i
=l
follows from the definition of the 1
"
y _ X2
as 6x t -* 0.
(5x,>
.*".
=\
-
Hence
2
*
t
n neglecting the
n
1
y x = -n y (x + 2x ~.
=-
<5x,
=
=
X2
o
mean
that
However,
of Measurements
Statistics
in
Functional Relationships
113
Equation 3-30 or 3-31 indicates the procedure by which individual deviations propagate to give an individual deviation in
We
u.
learn
from these equations that the effect in u of a deviation in x (or in y, etc.) is multiplied by dujdx (or by du\dy, etc.), hence that, if x (or y, etc.) appears in u with an exponent much greater than unity, the effect may be rather large. In planning an experiment and apportioning relative effort in the measurement of each of the various component quantities, it behooves us to note the exponents on the respective variables. As an example, consider the relation for the acceleration due to gravity in terms of the length / and the period P of a simple pendulum, viz., 2
A
/
for which
dg ^
=
d6 gi
<5/.
+
fractional effect
P2
dP
dl
The
^ dPi = — dg
dl {
-
~P
6P
t
3
due to the component fractional deviations dgi
= dk_
g
I
2
SP1
P
where we write / for / and P for P since the differentiation these values. Note the factor —2 in the last term.
We
are generally
when each
error
is
more
these errors
Mean (and
M
.
—
u
fractional
J independent
deviation, as a standard deviation, is
not useful in
\u,
ta =
+ dy<
—
this case
because
signs.
mean) deviation. The equations that govern and fractional mean deviations are
variables,
fractional
carried out at
deviations
du
=
=z„
mean
-\dx for
mean
Equation 3-31
do not have meaningful algebraic
the propagation of
is
interested in the rules of propagation of errors
expressed as a
or as a probable error.
is
u\
r J
idu
2 If
(3-32)
and
=
fractional z„
wa M V" M \
= \-j
= x\OX i]/ )
2
xK
(3-33)
Equation 3-33 usually simplifies to a very convenient expression because
The basic
of cancellation. squares,
is
deviations,
and
that the x, y,
same
type.
relation, the square root
of the
sum of
the
derived presently for the case of propagation of standard
•
this relation applies for •
•,
and u frequency
mean
deviations to the extent
distributions are all of essentially the
In this event, the simple numerical constant relating the
mean
deviation and the standard deviation cancels out. This numerical constant
Probability and Experimental Errors in Science
114
depends, of course, on the particular shape of the distribution; 0.80 for a normal distribution,
As was pointed out
Chapter
in
z
i.e.,
however, as mentioned in Chapter
some measurement
On
computational labor. standard deviation
the
2,
2,
it is
about
not as
statis-
0.80s.
mean
deviation
may
give too
situations,
is
The standard
standard deviation.
tically efficient as is the
deviations in
($t
and
much
deviation,
weight to large
also entails a
little
more
the basis of statistical arguments alone, the
preferable.
is
Standard (and fractional standard) deviation. square of the standard deviation in
u,
written s u
,
By
the
definition,
is
IW
2
_ =
2
s,r
t
=l
(3-34)
Squaring the binomial'in Eq. 3-30 gives
Then, placing
o
As
this expression in
Eq. 3-34, we have
+ 2 £ ^ + (f*)W \ax/ ox oy sum
(W
+
ox oy
\oxi
(fs) \oy/
as likely
. 2
it
(3-35)
a? and y are comany particular product dx dy { to be positive as negative. Then, since
n increases, the
2(<5z t 6y { ) goes to zero if
t
pletely independent (uncorrected) because is
W
2(<^.)
t
2
and
t
s
2
=
%r
follows that
[(duV
2
. (duY
J (3-36)
Note that n does not appear in Eq. 3-36 and, as may be expected, s u does not depend upon either the actual or the imaginary number of x's or the actual or the imaginary number of t/'s. Generalization of the derivation of Eq. 3-36 to cases where more than two variables are involved is obvious. The general expression for J independent variables
may s„
be written
=
(3-37) i
=
l\OX j /
Statistics It
of Measurements
Functional Relationships
in
IIS
from Eq. 3-37 and from arguments leading to Eq. 3-45 that
also follows
the standard deviation in the
mean
w.
or the standard error,
is
(3-38)
where
ss is the
standard deviation
in the
mean
value of they'th component
property.
The
fractional standard deviations in u,
j
=— =
fractional s u
I L
and then
in
it,
are written as
du
dx
(3-39)
ii
c-
=— =
fractional s«
(3-40)
ii
and are usually expressed tion,
in per cent.
As with
the fractional
mean
devia-
Eq. 3-39 or 3-40 should not be used unless the zero of each of the
x j and the u scales
is
Equations 3-39 and 3-40
physically significant.
generally simplify to very convenient expressions.
Equations 3-36 through 3-40 are the really fundamental equations in and are well worth memorizing.
the subject of propagation of errors It
was mentioned in Chapter 2
that scientists frequently use the "probable
For any given
error," pe, as a precision index. error, discussed in detail in
to the standard deviation.
pe
«a 0.6755;
different.
Chapters 4 and
distribution, the probable 5, is
linearly proportional
For a normal or near-normal
distribution,
for any other distribution the constant of proportionality
But irrespective of the numerical value of the constant,
it
is
follows
that -
Pe
j
/
5u
-VA
\2
u= I\T-)pe
2 ;
Xi
(3-41)
and
//all the
fractional pe u
=
fractional
=
pe,-,
(3-42)
u and the xi distributions are of the same type so that the constant
of proportionality cancels out.
Probability and Experimental Errors in Science
116
Sum
or difference.
Referring to Eq. 3-27, u
then
Bu and, by Eq. 3-37,
=
x
±
y
±
•
let
of Measurements
Statistics
in Functional Relationships
and, by Eq. 3-37, 5U
= VflV (a_
W
+ bvy<
b
-i
117
v
(3-48)
Because the partial differentiation in the Taylor expansion is carried out x and y, the values of x and y in this expression are taken as
at the values
the
mean
pe u for
values x and y. Likewise, the probable error
pex for
su ,
sx ,
pe y for
sy
,
and
may be written with
also the standard error s a
and the
meanpe a may be written with the appropriate changes assuming, of course, that whence is used the proportionality
probable error in the in notation,
constant cancels out.
The expression probable error)
is
for the fractional standard deviation (or fractional especially convenient,
fractional s u
=
fractional s a
=
—
b*s
+
f-
2
f
^+ ^x
\
2
y
(3-49)
2
and fractional
(^L + ^^,V!
^ = (I
fractional pe a
y
= [tj^L + x2
\
if,
&WY y
2
2
(3 . 50 )
I
in the formulas involving pe, all the frequency distributions are of the
same
type.
It is
obvious, since a and b are squared in Eqs. 3-49 and 3-50,
that the sign of a or b in Eq. 3-47
may
be either positive or negative
and Eqs. 3-49 and 3-50 are unaltered. Thus we have treated the case of either a product or a quotient.
As a
special case of Eq. 3-47, consider
u
where
A
is
su
Note
a constant. With Eq. 3-48,
=
xA4
that
aV (a_1 V = Aax
We
case in which
with
2
we have assumed
independent.
it
is
2
a
(3-51)
we
not
write
-\;
fractional s u
1
,
this
assumption
satisfied is the following one.
we
find su
=
^
(3-52)
components are
that the errors in all the
Eq. 3-43 we obtain su
.
=
x
must be careful that
= x From A = 2 and a =
where x x
= Axa
= V2sx
2s x which ,
is
.
is
satisfied.
Let u
=x + l
A x2
,
But from Eq. 3-52,
different
from the
vious value by the factor
pre-
V2. The latter calculation is the correct one; the former improperly presumes 6xx to be independent of 6x whereas in 2 ,
Probability and Experimental Errors in Science
118
same quantity. Inclusion of the correlation term
fact they are the
in
Eq.
3-35 corrects for the dependence and brings the results into agreement.
Other functions.
If the functional relation
u
we
=
write du/dx
x
(3-53)
B\x\ then, with Eq. 3-37, the standard deviation in u
= (~ s x
su
= B In
that of a logarithm,
is
2
\x 2
Y= ^ X
=A
u is
hot near nj2.
whose
first
terms
in the
It is
sin
(3-54)
xu
/
As another example, consider a trigonometric
where x
^
=
fractional s u
;
is
relation,
x
(3-55)
important to realize that for any function
derivative in the region of interest
very small, the higher
is
Taylor expansion cannot be neglected.
In the present trigo-
nometric example, su
=
As x cos
x;
fractional s u
=
sx
cot x
(3-56)
In expressing s x note that the units should be radians, not degrees, since ,
the trigonometric functions
3-3. It
and the derivatives are
in radians.
Means
Different
often happens that in a series of direct measurements the individual
values are not
than others
values should be given
determination of the mean; otherwise the
And
the "best" value.
means
Some
equally good.
all
in the
often
we
more weight mean is not
are interested in comparing two different
for consistency.
Weighted mean. weighted mean m"'
If
measurement x
w
assigned a weight
is
t
the
written as
is
n
mw
__
H' 3' 1
1
+
WX
W 2 X2
+
W2
+
+ +
W
;
X,.
+
5* •
•
+
•
W„-r„
N»
hwH
K'j is
s^
35?
,
=
l
i
w
t
is
replaced by
1/s,
2 ,
a relation
from the principle of least squares (if all the the grand weighted mean of N different means is
presently to follow
errors are small). desired,
'
of each measurement x known, the weight
properly expressed in terms of st ;
shown
'
yw i
With the standard deviation
X
\V
,T 1
=
we may
Or,
if
write the weight of the /th
weighted grand mean
=
xw
mean
=
as
^ 1
=
1
l/.s>
2 .
Then,
(3-58)
Statistics
of Measurements
To show
that
of least squares values of xt
h>,
in Functional Relationships
should be replaced by 1/s? according to the principle
(if all
the errors are small), consider for simplicity only
_ W& + w Wj + Write u = m w and, a;
+ wx 1 + w
xx
__
2
vv 2
=
w
w 2 jw 1
.
\
We
=
0;
This argument is
we
is
z
with Eq. 3-37,
a+ (1
w)2 J
proceed to solve for the particular value of w,
dsmwfdw
two
Then,
.
2
where
119
call it
wmln
for which
,
find
easily generalized to
inversely proportional to s?,
wlSl 2
=
w2 s22
show
that each of
many
weights
w
t
i.e.,
=
w3s32
=
•
•
w
•,
t
oc
—
(3-59)
2
s,-
The value of (/?e,-) 2 can
alternatively be used in place of sf.
One immediate consequence of Eq. 3-57 or Eq. 3-58, whether for the weighted mean of a simple set of n measurements or for the weighted grand mean of different component means, is that, if one or more of the standard deviations is much smaller than the others, the less precise values of xi or of x i can be neglected. This
is
a feature of great interest in
designing experiments in which several different direct measurements are
involved to yield a computed result. large deviations, x t
Note
—
x,
(This does not say, however, that
can be neglected;
see Section 4-6.)
from placing dsm w/dw equal to zero and this is equivalent to finding the value of w, for which the sum
that Eq. 3-59 followed
solving for
w
;
-iw^-m") n =
2
i
is
a minimum.
Thus, the weighting 1/5/
weighted
least squares.
principle
is
or
if all
i
Strictly speaking,
is
based on the principle of
we should point out
valid only if the frequency distribution in question
that this
is
normal
the deviations are sufficiently small that the higher powers in the
Taylor expansion are negligible. It is instructive to apply the method of maximum likelihood to the case of the normal distribution. The likelihood function is written as in Eq.
and Experimental Errors
Probability
120
now
3-10 except that different for
the standard deviation
each measurement xt
Science
considered as being possibly
is
Thus,
.
(-^-
2
L= l
1
ex H P
= a j2TT &aJ2n i
i
in
\
iy
2cr 2a.
)
2
/
and
=0=|^-— m
logL)
2
from which
m« =
2*W i±l
t
=
(3-60)
l
Comparison of Eq. 3-60 with Eq. 3-57 shows that for the normal case the weight
oc
h'j
I/a,-
The
.
may now be
Equation 3-45 ing.
2
reliability (in
ments, each of unit weight, (the
mean) having weight s
2
derived directly from an argument of weight-
terms of the standard deviation) with n measureis
n.
the
same as the
reliability
of one measurement
Thus,
= nsj
or
s
m
= -= v»
This derivation the condition
Weighted
wJCLw^
is
really equivalent to the
was made
one leading to Eq. 3-45
that all deviations
dispersion
indices.
It
is
must be
in
which
small.
more or
less
obvious
that
equivalent to the relative frequency, fjn, in the expression for the mean, Eq. 2-2. Also, with this change in notation, the weighted is
standard deviation
is
written
= / Swfo - m«ry = if
/ sw,fo
- mn
2
y
(3 _ 61)
Sw^ = n. pe w is given by the product of the appro-
the weights are so chosen that
The weighted probable
error
w the appropriate constant being 0.675 in the case
priate constant times sx
,
of a normal distribution, as stated
earlier.
Note that by allowing different weights to be assigned to individual measurements x we distort the condition of random trials, and if a t
,
weighted location index or precision index such.
However,
different
is
used
it
should be stated as
means may be weighted with impunity.
Consistency of two means:
the
existence of a systematic error in a
t
test.
mean
One important
value
is
test for the
afforded by comparison
Statistics
of Measurements
with another
mean
Functional Relationships
in
111
value determined on a different day, or by a different
This comparison
observer, or with modified or different apparatus.
is
an
involved statistical problem because each mean, even in the absence of systematic errors, possible means.
is
only a sample of the infinite parent population of
The two samples are expected
to disagree to a certain
extent simply because they are merely samples;
decide whether or not the observed disagreement basis of
and the problem is to "reasonable" on the
is
random sampling.
The simplest is that two means may be considered consistent (no "evidence" for a significant systematic error in either mean) if they agree within about the sum of their standard deviations. The criterion for "agreement" is entirely arbitrary; some conservative investigators take twice the sum of the standard Several criteria for consistency have been proposed.
deviations.
A if
better criterion for consistency
—
the difference between them, x x
in the difference.
two means are consistent
that the
is
x2
,
is less
This standard deviation
is
than the standard deviation
given with the aid of Eq. 3-43
as s
= ^5 ^ 2 +
(*i-* 2 )
2
s*
(3-62)
2
In case the two means have equal precision, s {f _ £ is
>
= V2s£
.
(This test
further developed in Section 5-8 as a quantitative test to determine
whether or not a measured signal is "really" above the background noise.) An even better criterion for consistency is the following one. Assume that the two sets of measurements, n x of x 1 and « 2 of x 2 are consistent and ,
are pooled. Then, the best estimate of the standard deviation of the parent
population, of which the n x
sample,
+
n % measurements are a
compound random
is
where n x + n z — 2 is the number of degrees of freedom. parameter in this test, which is called the t test, we write t
= h^I?
In the special case for which n x t
On
n 1 =n z
our assumption that the
=
=
sets
l^h!h^\
As a working
A (3 .64)
n z Eq. 3-64 becomes ,
1
^""
2
l~\
V2 n
x x and x% are consistent, the value of
as different pairs of sample sets are considered,
is
t,
expected to fluctuate
for the reason that they are sample sets each of finite size.
If
an
infinite
Probability and Experimental Errors in Science
122 0.4
_4
_5 Fig. 3- 1
.
/
-2
distribution curves for different numbers of degrees of
the distribution
number of /
-3
is
pairs of
constitute a
t
sample
sets are
imagined, the corresponding values of
x4 and y are
the
if
t
parent distribution, viz., as
f{t)
where
oo
This distribution can be expressed in rather
distribution.
simple analytic form (not easily derived, however)
from a normal
=
freedom v. For v
normal.
c is a constant
=
/
c[l
,2\-[^(v + l)]
+
(3-65)
'-)
chosen to make the integral of/(/) equal unity, and
number of degrees of freedom. This t distribution, illustrated in and, for v < oo, Fig. 3-1, is symmetrical in shape about the mean t = Knowing distribution. normal the is relatively higher in the tails than is probability the can calculate we the analytic form of the t distribution, v
is
the
that the value of
range,
e.g.,
is
of the next sample pair of sets
outside a range set by the values
made of ±t c to bound
tion
/
±t c
will fall outside a specified
this
range
the calculated probability
is
is
arbitrary.
0.05,
i.e.,
This calcula-
in Fig. 3-1.
The
by integrating Eq. 3-65 over the set range.
Commonly,
t
c
specification
chosen so that
is
that out of 100 sample values of
/
only 5 on the average will fall outside the bounds of ±t c Note that the calculated probability is based on the assumption that x\ and x2 are consistent. If it turns out that the magnitude of the experimental value of t as deduced from Eq. 3-64 is larger than the magnitude of t c this fact does not prove that the two means are inconsistent but it argues rather strongly in favor of a suspicion that they are inconsistent. The argument is even .
,
Statistics of
stronger
Measurements
if t c is set
Inconsistency
at
any
in
Functional Relationships
limit less
than the
5%
123
limit, e.g., the
would be caused by the presence of any
1
%
limit.
significant
systematic error affecting the observed deviations differently in one set
of measurements than in the other Values of if
no
/
set.
that are exceeded only
significant
1
(or 5) in 100 times
on the average,
nonconstant systematic errors are present, are
Table 3-1 for several different numbers of degrees of freedom n x Table n1
+
«2
—
2
3-1.
Values of
t
c
in
the
t
Test,
I
%
and
5%
Limits
+
listed in
n2
—
2.
Probability and Experimental Errors in Science
124
like to pool them so as to increase the total number of measurements and thereby increase the reliability of the mean according to Eq. 3-45. The pooling of such sets of measurements requires that the sets be
we would
consistent not only in regard to their
Or,
standard deviations.
we may wish
means but
also in regard to their
to test the internal consistency of
the precision of two sets of measurements (rather than of merely the means)
recorded on different days, with different apparatus, or by different observers. Again, the standard deviations of the various sets or samples are expected to
differ
somewhat among themselves, even
because of the fact that they are merely samples.
they are consistent,
if
We
seek a
test,
proba-
of course, for consistency of standard deviations.
bilistic
In the
test for
/
consistency of two different means, as just described,
that both sample sets of measurements are from (from a normal population if Eq. 3-65 is used). the same population of the validity of this assumption is tested. We shall Then the probability
the assumption
made
is
assumption again, but this time use the so-called F ratio as the working parameter which is defined in terms of the standard deviations of
make the
this
two
sets.
means.)
(The
t
parameter was defined in terms of the difference
Incidentally,
strictly
F
speaking,
(or
/)
in the
a "statistic" not a
is
"parameter."
Suppose that s x and sx are the respective sample standard deviations Then, in the nx measurements of x 1 and in the n 2 measurements of x 2 ox and ax are the best estimates of the standard deviations of the parent .
populations.
The
F ratio
is
defined as "i
_
a
F = -=;2 -
"2
n2
method of the
in the
sets
This
is
c
=
in the
e.g.,
infinite
number of
^~
2)
pairs of
sample
constitute an
V2)]
vx
/?
x
1
F
/
test,
F of the
we
Fig. 3-2.
in Fig. 3-2 for a typical pair
x
.
n2
—
1
are the
of values of
\\
and
v2
.
can calculate with Eq. 3-68 the probability that the
next sample pair of sets will
This calculation
F and F2
(3-68)
shape of the /"distribution
fall
outside a specified range,
outside the range set by the arbitrary values
limits
F
+ ri F)- [!i(Vl + = — and v2 =
(v 2
numbers of degrees of freedom. The
asymmetric, as shown
value of
cF H
a constant and where
is
respective
As
*X 2 1
a continuous distribution
f(F)
is
an
—
2
whose analytic form, if the of measurements are from the same normal parent distribution, is
distribution.
where
test,
2
imagined, and the corresponding values of
sets are
two
/
5 *i
(3-67)
a *2
As
j
is
F and F2 x
indicated in
an integration of Eq. 3-68 with the particular
of Measurements
Statistics
Fig. 3-2.
F distribution
Now, however, define
F
in
Functional Relationships
curves for different pairs of numbers of degrees of freedom
since f{F)
is
not symmetric,
as being greater than unity,
i.e.,
values of
F
is
the arbitrary value 0.05,
If
it
i.e.,
is
is
chosen so
that out of 100 sample
F2
.
Note
that the
based on the assumption that ax and ax are
comon parent population. Fas deduced from Eq. 3-67 is
estimates of the true value a x of a
turns out that the experimental value of
larger than
F2
Suppose
i.e.,
only 5 on the average will be larger than
calculated probability consistent,
convenient further to
it is
to have the larger standard
deviation always in the numerator of Eq. 3-67. that the probability
I2S
F2 we may ,
say that, statistically, the standard deviations are
not consistent.
Table 3-2 limit are 5
lists
%, for
the limits for F, different
if
the chances that
it
will
numbers of degrees of freedom
not exceed the in
determining
Probability and Experimental Errors in Science
126 Table
"l(=«l -1) Denominator
(for
in
Eq. 3-67)
3-2.
Limits for F
in
the F Test,
5%
Level
of Measurements
Statistics
Functional Relationships
in
127
between the dependent and the the problem often justifies considerable effort in the analysis of the measurements. If there are K constants to be determined and if there are K pairs of measurements, then there are K simultaneous equations which may be solved for each constant. This is a so-called "exact" determination with no degrees of freedom for the evaluation of the precision. But if more than K pairs of measurements are available, the constants are said to be "overdetermined"; the errors in the measured values of the variables prevent an "exact" determination but do provide a basis for evaluating the in a specified functional relationship
The importance of
independent variables.
precision.
The usual procedure
is
to
make a graph of the measured quantities and we can. If we rely chiefly upon the eye
to "fit" a curve to the data as best
making this fit, there is a strong tendency to give undue weights to the end points. As a matter of fact, the end points are often the least reliable
in
because of experimental factors in the extremes of the range. By the method of least squares, however, we can give either equal or unequal weights, as desired, to the various points of the graph.
The method of least squares does not functional relationship;
it
does
constants appearing in the equation.
between two Best
Also,
different functional relations, as
of a straight
fit
us in a practical
tell
line.
Many
way
the best
us precisely the best values of the
tell
it
does allow us to choose
is
seen presently.
functional forms can be expressed
as a linear relation,
y
The
=
a
+
photoelectric equation cited above
bx is
(3-69)
an example of
this
The
form.
constant that relates the variations in electrical resistivity p with the temperature Tis given by the expression a (T/p) (dp/dT), and this expres-
=
= A a log T. The Cauchy equation for the refractive index of a substance is n = a + 6/A 2 which is seen to be linear when x is written for 1/A2 The exponential decay law, / e -/iX can be rephrased to be log / / jux. log I sion can be put in the linear
+
form log p
,
.
=
It
=
,
usually turns out that, of the
3-69,
we
are
more
—
two general constants a and b
interested in b than in a, but this
graph and a the intercept. Consider the graph of measured values of x and
is
in Eq.
not always
so.
b
gives the slope of the
and a
y,
such as in Fig.
3-3,
straight line
yQ
=
a
+
bx
such that the sum of the squares of the deviations from
minimum.
(3-70) it
shall
be a
In what direction should the deviations be reckoned ? Ideally,
Probability and Experimental Errors in Science
128
(a)
Fig. 3-3.
fitted
by a curve:
(a)
by a
straight line,
by a parabola.
(b)
only
if
(b)
Graphs of experimental points to be
random
and
errors are present in both x
the deviations should be
y,
reckoned perpendicular to the straight line. But the arithmetic involved in the determination of the constants a and b is rather formidable in this case and, in general, the result depends upon the choice of the scale of each correction for even this effect
coordinate axis;
The usual procedure
laborious.
is
is
possible but
very
is
to choose either the x or the y direction
for the deviations, recognizing that the price paid for the simpler arith-
metic
is
a sacrifice, usually negligibly small, in the accuracy of the best
of the
fit
The choice between
line.
the x
and the y direction
is
favor of that direction in which the larger standard deviation
made
is
in
found;
comparison in the same dimensions and units, s y In almost all cases in experimental science, x is taken as the independent variable whose values are selected with practically negligible error, and in these cases the deviations are reckoned along the
make
this
compared with
bs x
in order to is
y
.
axis.
We
shall
along the y
assume
in the following text that all the deviations are taken 2
axis, i.e., that b sx
the exact value of y,
viz.,
<5&
Graphically, dy t
+
is
y
=
.
Vi
2
<s y 2
Then, a
.
+
bx
i
Accordingly, a deviation
-
=
y
Vt
-
(a
+
the length of a vertical line
is
always taken as
is
written as
bx ) t
(3-71)
drawn between y and t
^
bx ) at the x abscissa position. (Remember that if 6x 0, by is not an observed deviation in y, i.e., yt — y; rather, 6y t may be greater or less than the observed deviation because 6y also includes the observed
(=
y
a
t
t
t
t
{
deviation in x t .)
Let us assume
initially that all
dy/s are equally weighted.
with the principle of least squares,
we
In accord
seek those values of a and b that
Statistics
of Measurements
make
sum of
the
in
Functional Relationships
the squares of the n deviations dy t a
|(W=i(%-a-K) 2
129
minimum. Thus,
2
=1
-
- 22 y, +
:
IbUxi
=
(3-72)
da
2E(*t y,)
=
(3-73)
from which
a
=
2x,-y,
«
fe
=
2av
2x 2 2y ?
7:
2a:i2(a;t yi)
y2x 2 i
-
zSfoy,.)
Probability and Experimental Errors in Science
130
must be noted that when a nonlinear relation is put in linear form by in one or both variables, e.g., an exponential equation to logarithmic form, etc., such a change always involves an alteration of scale which is equivalent to a gradual change of weights along the scale. Such a change It
a change
of weights, unless properly corrected
for,
always results in a different value
(An example is given as Problem 12 in Section 3-11.) The "best" values of a and b depend on the "best" form of the relationship between x and y. It is important to note that, in case the y values for each of a and
b.
x are normally distributed, the proper correction is made automatically, irrespective of the functional form, //weight I Is? is consistently assigned to the ith point; this is shown by the method of maximum likelihood (and is also part of Problem 12, Section 3-11). The standard deviations in the values for a and b may be deduced as follows. Draw the n straight lines through the n points x y and the common point x, y. This gives n values of a and n values of b These n values, however, are not equally weighted; the standard deviation in a and in 6 is proportional to l/(x — x), and thus the weight is proportional to (x t — x) 2 The standard deviation in a and b is determined with Eq. 3-37, for which J = 1, viz., t ,
t
t
t
.
t
t
t
.
t
sh
,
da
{
dy
(
s„,
=
s„
=
dbi
=
x,-
t
—
x
Then, with the weighting factors
— 1
oc
we can
and
-
write, using Eq. 3-61, if all the sv
wh 's
oc
.
are the
same
(all
y/s equally
weighted),
Xwai ( ai - a?Y*
(I Y,(xt
-
xf(a t
Zte
-
-*« - afj (3-78)
xf
and
=
ti
a,
(b i
<
t
(
t
or explicitly by a.
(3-79)
%»»
and b are t
bt
2
\n
where
-bn
= n xixt-xftbi-by y l.(x, — x) Swv \n 2j(#j given implicitly by y = a x + b and y =
zw
= X;
—
and X
and a and b by Eqs. 3-74 and
3-75.
b
t
=
Vi
-
y
ax t
t
+
b
t,
Statistics
of Measurements
Functional Relationships
in
Or, each standard deviation
may be
131
expressed in terms of the standard
deviation s u from equally weighted y values, i
= ( *» -
s,
-
'
"*->
T
(3-80)
sy
(3-81)
*
<3 - 82)
viz.,
s-
Sx 2
=
\^-V^rnXx - (2^.)
2
2
and 5
These expressions
may
U;-(W
-
'
^
be derived by using Eq. 3-37,
viz.,
and
Substitute a
and £ from Eqs. 3-74 and 3-75 and take
all
j
's
to be the
same
For example, in the b case, db
nx
_
dy
riLzf
t
— Ear, - (Lx^2 •
{
and »
y (?£f=
riLxf-XLxtf
i=i\dyj
from which Eq. 3-82 readily
follows.
The argument
give a y
n were replaced by n
is
essentially the
same
for Eq. 3-81.
Note
that Eq. 3-80
would
ber of degrees of freedom
if
—
2; the
num-
here two less than n because of the two arbit-
is
rary parameters, a and b, in the definition of each deviation.
The probable
error in a or b
by the constant 0.675
if
is
given by multiplying Eq. 3-81 or 3-82
the distribution of the weighted values of a or b
is
normal. Straight line through origin. the straight
line,
Eq. 3-69,
is
If the intercept a in the equation for
known
to be zero,
differentiate with respect to a, as in Eq. 3-72.
for b
is
it is
not meaningful to
In this case, the expression
obtained directly from Eq. 3-73 with a put equal to zero.
b
J***h Ea-,
2
or
bW= ?WWi ? Sw,x, ii
Then,
( 3.83)
The expression on
and Experimental Errors
Probability
132
Science
in
for the correlation coefficient, defined presently,
is
based
Eq. 3-83.
Best
of a parabola. The equation of a parabola
fit
y Again,
+
bx
ex
(3-84)
the number n of pairs of measurements xt y is greater than three, number of constants now to be determined, the method of least ,
squares gives the best
The argument that
+
a
if
the
i.e.,
=
all
is
2
the error
is
is
fit
the in
t
of the parabola.
same as for the case of the straight line. Assume yit and write the deviation of each point from the
best parabolic curve as fyi
The
=
y,
—
a
-
best values of the constants a, b,
vp*wn =
apw
bx
and 2
=
]
t
-
(3-85)
c are those for
which
3PW] =
db
da
ex?
dc
The algebra for the simultaneous solution of a, b, and c is straightforward.
these three equations for
Likewise, if the x ( y measurements are not equally weighted, and if the weights are known, the weighted values of a"\ b w and c w can be calculated. ,
t
,
The expressions for the weighted precision indices also follow. Perhaps the most important aspect of the method of least squares curve stants,
fitting is that
and
this part
it
in
allows determination of the precision of the con-
of the problem must not be neglected; the procedures
are similar to those detailed above in the discussion of the best
fit
of a
straight line. It
should be pointed out that calculations of
simplified
if
a, b,
and
c are greatly
the values of x are chosen to have equal interval spacings
Ax and
to be symmetrical with respect to the median, and if the x variable changed so that the zero of the scale is at the median value and is measured in Ax units. In this case, all summations of odd powers of the new variable x are equal to zero. Denoting this case with primes also on a, b, and c, we write* is
y r '2
y-r' 4
- (Z*/ 2 2 - (Zx/ 2 f nS*,' 4 -y^ n^ nLx^xf* - (Lx/ 2
hS*/ 4
See, e.g.,
>
riLzt*
)
t
c
l
2
nZx* -
(£z/ 2 ) 2
- (Z*/2)
,
g6) )'
v Y*x' 2 Vi '
G. C. Cox and M. Matuschak,
/.
^ Xi nZ*/ 4
'
- (Lx/ 2
2
Zis Vi
)
Phys. Chem., 45, 362 (1941)
of Measurements
Statistics
In these equations, x'
ifV
integers
=
(x t
-
is
in
Functional Relationships
133
measured from the median value and x values are if a' = 2(x - xmedian)/Aa;
* median)/A:r for n even and
t
for n odd.
The expression for a forms the basis of the least-squares parabolic method for the smoothing of data, as discussed later. Best
of a sine curve. The function
fit
=
y
a sin
(
-
b)
(3-87)
cannot be converted to the form of a straight line or to a power series in which a and b appear as coefficients of the power terms. The fitting of this type of function illustrates additional interesting features.
From let 6
=
Then,
a graph of the measurements y t fa, estimate a value e for b and e, which must be rather small. e, and A Let 6 \\a. (f> ,
—
b
sin
(cf>
—
=
—
b)
=
sin (6
—
d),
=
and, after expanding sin (0
—
d),
Eq.
3-87 becomes
Ay — Assume
all
the error in the
sin 6
z'th
+
=
point to be S(Ay t) and write the deviation as
KAyd = Ay — t
sin Bt
+
6 cos
0,
approximation to the best values of A and d, hence for a and b, that for which '^{Ay i — sin d + d cos 6 ) 2 is a minimum, i.e., for which
The
is
d cos 6
first
t
AlLyf v4E?/t cos
=
f
t
- E& sin
— 2 cos
+ <5E& cos 6 = = sin fy + (3D cos = 0. If the approximation Bt
t
2
i
2
and then d/dd given by the solution of these two simultaneous equations is not satisfactory, a better estimate of e is then found by making d smaller and repeating the
by writing d/dA
process.
Likewise, the weighted values a w and b w and the precision indices sa , sb sa w and sb w may be computed with the same type of argument that ,
,
was given for the straight-line case. The cases of the straight line, the parabola, and the illustrative
of the curve-fitting aspect of
sine curve are
statistical precision;
the reader
is
referred to the literature for the fitting of functional relationships that
cannot be expressed
in
one of these three forms.*
Criterion for choice of functional relation. least squares in curve fitting does not
method of
As tell
stated earlier, the
us directly the best
functional relation y =f(x), only the best values of the constants. it
can be used to *
For example,
New
York, 1959).
tell
which of two relations
is
But
the better. Thus, for example,
see F. S. Acton, Analysis of Straight-Line
Data (John Wiley
& Sons'
Probability
134
we may choose between a
and Experimental Errors
straight-line relation
in
Science
and a parabolic
relation,
or between a parabolic relation and a power series which has terms higher
than the second power. Let the two relations to be compared be a and
t
jS
be exact,
let
Then, calculate
Q.
=
^ -" n
The
having a number of
ft,
For each value of x which is assumed to value of be the or y computed from the respective relation. y, y^
constants c\ and c respectively.
relation
y* f
and
,
2 q,, **-*?
n-
ca
which has the smaller value of
Q. is the
c
(3-89)
p
one that better
the measurements.
Table
x
xl
y
3-3.
A
Differences Table A/y
A 2y
A 3#
A4
*/
fits
of Measurements
Statistics
in Functional Relationships
their differences), a function consisting
mation
is
135
of an «th order power series approxi-
required.
The scheme works for all functional forms that can be converted to a power series by a change of variable, * even though the highest power is unity as in the case of the straight line.f 3-5. Justification
Method from Maximum
of Least-Squares
Likelihood
We
have encountered so far several applications of the principle of least The first was met in the definition of the mean with the deviations
squares.
equally or unequally weighted (whichever applies) as the location index for which the
sum of
the principle
was implied (but not
statistical efficiency
the squares of the deviations
on the sum of the squares of the deviations t,
of the least-squares values of constants in the principle
was used
it
;
was
minimum
;
then,
high
also implied (but not
F, and, later, the
method was used
tency; the least squares
finally,
a
of the standard deviation as a dispersion index based
discussed) in the high efficiency of
and,
is
specifically discussed) in the
x
2
tests for consis-
explicitly in the
known
determination
functional relationships
in the criterion for
choosing the better
of two different functional forms for the representation of a ments. *
A
set of measurepopular method of data smoothing (mentioned presently) is
Note that a change of variable implies a change of relative weights, as was discussed fit of a straight line; see Problem 12, Section 3-11. Incidentally, if we ignore errors and precision altogether, the average of the
in connection with the best
t
"constant" values of the wth-order difference allows a determination of the value of the intercept in the
power
with a
a second-order power
series,
trivial bit
y
we may
Furthermore,
series relation.
may be approixmated
=
if
Ax
is
small, the other constants
of computation. For example,
a
+
bx
+
ex 2
Ax)
+
c(x
if
the function
is
write
y
where, since
Ax
is
+
Ay
=
Ay
= b Ax + c Ax +
a
+ b(x +
2
constant, a'
and
Ay
+
Ax
2cx
Ax) 2
=
a'
+ b'x
Repetition of this procedure gives
b' are constants.
+A
2
y
=
a'
+
b'(x
A
2
y
=
b'
Ax
=
+
Ax)
b"
Hence, knowing Ax, we can determine b' from the second-order difference "constant"; then, knowing b', the average \(x + x i+1 ), and Ay we can compute a'; then, from a' and b', we can compute b and c; then, knowing y and xu we can compute a. The difficulty with these approximations is that we must assume that a A.y, and a pair x it y { are exactly known. Of course, what appears to be the best values are chosen. As stated in the first sentence of this footnote, this procedure ignores errors and all questions of precision. t
t ,
{
Probability
136 also based
on
Science
in
And, errors propagate according to convenviz., based on the simple
this principle.
based on the sum of the squares,
ient rules
addition rule for the variances, in the
and Experimental Errors
if all
the errors are sufficiently small that,
Taylor expansion, the terms of powers higher than the
may
first
be
neglected.
method
Strictly, the least-squares
valid only in case the errors are
is
normally distributed, but the method
is frequently invoked for nearnormal distributions and indeed for distributions in general in which the errors are all small. The validity in the case of normal errors is easily
shown by
method of maximum
the
likelihood.
Suppose that we have a single independent variable x and that we are concerned with values of y given by
= f(z, a,
y
where
a,
Suppose
•
ft,
•
•
•
•)
•
ft,
whose values we wish to determine.
are fixed parameters
that the deviations are essentially all in y, viz., that fyi
=
- f( x )
Vi
was discussed in connection with Eq. 3-71, and are normally distributed. and of x the probability of a measurement For given values of a, ft, yielding a result between y and y + dy is given by as
•
•
•
f,
t
t
t
^e p(-<M
2
Pto,
The y n in
yi
+ ^) =
)
probability of observing y x in the interval dy±, y 2 in dy 2 dy n is the likelihood function
L-ft if
(3-9.)
X
• ,
<«.«,(- 2«2)
'
•
-,
and
(3-92)
the deviations are independent. It is
convenient to think of "error space" which
space in which the
z'th
coordinate
is
zi
=
is
an ^-dimensional
yja^ Then, Eq. 3-92 can be
written as
L
-
-^
exp
[- |S(^) 2 ]
dv
(3-93)
(2tt)
where dv is the elementary volume in error space. The relative likelihood of any set of errors or deviations in dv depends only on 2(&z,.) 2 i.e., on the square of the radius vector R in error space. It is immediately apparent •) that corresponds to the maximum value of that the value of a (or of ft, ,
•
L
is
This
is
•
which the sum of the squares of the deviations is a minimum. a proof of the validity of the principle of least squares for normally
that for
distributed errors.
Statistics
Note
of Measurements
that
it
^[(dy,-)
is
in
2
2 /^,-
Functional Relationships that
]
is
137
minimized where a?
to be
is
the
weighting factor to be associated with each y (Such weighting was discussed earlier for the special case of the weighted mean in connection t
.
with Eq. 3-60.) The derivation of the principle of least squares, therefore, requires that, in application of the principle, each value 6y t be properly
weighted.
In
many measurement
problems,
that
all
known about a
is
t
same for all i; in such cases all deviations are properly equally weighted and we write simply a for a If many different samples each of size n are taken from the infinite is
that
its
expected value
is
the
t
.
parent y population, many different values of £(<3z,) 2 (= R 2 ) are obtained. Analysis of the R 2 distribution gives further interesting information
regarding the principle of least squares.
We
can rewrite Eq. 3-93 in terms of R, with
L(R) dR
C
where the constant
= RdR;
and
sum of all
dR
=
dv, as
= Ce^^R"- dR
(3-94)
can be evaluated as follows. Let
let
R
L(u) du Since the
_1
1
=
u
then du
i?"
n~2
=
=
Ce-
u
IR
(2w)^
=1=
(3-95)
- 2)
(n
;
then,
2^ n ~ 2)u 1A{n - 2)
probabilities L(u)
L(«) du
2
du
must equal
C[|(n
-
2)]! 2
unity,
H(n " 2)
J
and therefore
C=
1
2^»-2)[i(„_2)]!
Then,
=
L(u) du
e
-u
v*(n-2)
du
(3 _ 96)
[Kn-2]! Equation 3-96 forms the basis of the % 2 distribution and of the x 2 test for the goodness of fit of a mathematical model frequency distribution to a set of actual measurements.
of the probability that the fall in
an
limits
of integration. This
Specifically,
sum of the
Eq. 3-96 allows calculation
squares of n observed deviations will
arbitrarily selected range, a range that is
is
by the choice of the
set
discussed further in Chapter
The derivation of Eq. 3-96 has not
4.
specifically included the effect of the
•)• If there are q a, /S, in the function y = f(x, a, /?, parameters whose values are to be estimated from the n measurements at hand, then proper account of the effect of these parameters in the mini-
parameters
mization of/? 2 n
—
•
[i.e.,
•
•
of
•
(<5z ) 2
q degrees of freedom
2
in
left
Eq. 3-93] reduces n to n
•
—
q;
i.e.,
there are
for the estimation of the characteristics of
138
and Experimental Errors
Probability
R2
the
may
(a, /?,•••
distribution,
Science
in
refer to functional constants, instru-
mental parameters, distribution characteristics such as y and a v The pertinent error space then has n — q dimensions.
R2
In regard to the likely value
of R
maximum
of
2 ,
viz.,
distribution,
R* 2
Bm
R*
likely value is
=
R* 2 was
#
'/
\dR
It
most from Eq. 3-94 by the method
interesting to note that the
Thus,
likelihood.f
and the m©st
it is
readily obtained
is
,
n
-
=
R* 2
or, generally,
1,
n
-q -
(3-97)
1
normal distribution of errors, that the best estimate 2 is [«/(« — \)]s Equation 3-97 for the case in which no additional parameters a, /?, are
asserted, for a
of the variance of the parent distribution reiterates this
.
•
n-l
R 2 = ±-l (dy f = -s 2 = t
Also, the
mean
=
By
R2 is
R2
=
n,
viz.,
,
R 2 may mean
is
=
\U
=
u
e
\n.
known specifically about
= n
The most
likely value
=
u,
R2 =
n
-q
(3-99)
the proper weight to be assigned
R 2 [=
,
2(<5y 1 /o I )
2 ]
tell
us
-q
R can also be obtained from L(u). To do this we note that V2u dR, and we must use the function V2uL(u), not L(u)
of
L(u)
d
log
value of
Hence,with Eq. 3-95,
or, generally,*
o?'av
R 2 = R**
-u H(n-2) du M
to each y t value, Eq. 3-99 and the definition of that the proper average weight is
L(R) dR = L(u) du alone. Then,
•
be obtained from either
,
definition of the
\UL(U) du
which, upon integration,
In case nothing
i
value of
Eq. 3-94 or Eq. 3-96.
U
•
Write
to be estimated.
t
etc.)
,
V2uL(u)\
= 0=J--\+l(n-2)\
from which
R" = %
The function
involved.
2«*
=
n
-
1
L(u), rather than 2uL(u), suffices here because
no
differentiation
is
Statistics
3-6.
of Measurements
in
Functional Relationships
139
Data Smoothing
Let us discuss briefly another problem in the treatment of experimental data, a problem closely related to curve fitting.
This
is
known
as data
Data smoothing always presumes some knowledge of the analytic form of the best-fitted curve y = f(x), although this presumption is not always recognized by investigators who try to smooth their data. It must be emphasized at the beginning of this discussion that, in general, we should not indulge in data smoothing unless we have a clear a priori assurance of the order of magnitude of at least the first and second derivsmoothing.
= f(x) throughout the region of Furthermore, because of the inherent arbitrariness in any datasmoothing operation, it is practically impossible to treat the measurements atives of the appropriate function y
interest.
statistically after
they have been smoothed, and any quantitative inter-
is likewise open to some question. mention two popular methods of data smoothing. In the first method, smoothing is accomplished graphically, a portion of the data at a time, by arbitrarily drawing a smooth curve "through" the experimental points. This method is improved if the first-order differences between adjacent x values are smoothed, and is further improved if second-order differences are smoothed. (First- and second-order differences have been
pretation of the results
We
shall
described earlier in connection with Table 3-3.) this
method
is
By
successive applications,
capable of very satisfactory results (except for the end points)
but the procedure repetition that this
is usually slow and inefficient. The fact deserves and any method of data smoothing require that the
=
unknown
relation y f(x) be a slowly varying function over each small portion of the curve being treated otherwise, any short-range real "struc;
would be undesirably smoothed out. second smoothing method is based on the principle of
ture" in the curve
A
least squares.
This method, for convenience, presumes that the values of x are chosen with equal interval spacings Ax, that the error
is
entirely in the y values,
and that each small portion, e.g., four adjacent intervals each of size Ax, of the unknown/(x) curve agrees reasonably well with a small portion of a parabola. Consider the five measured values y_ 2 */_ x y y+i, y+i cor,
responding to the values x
We
2Ax, x
—
Ax, x
,
,
smoothed value, to replace y x be replaced by x', defined as
wish to find
First, let
—
y,
the
x'
x
,
+
Ax, x Q
+
2Ax.
.
= ^-—^°
(3-100)
Ax as used also in Eq. 3-86, so that x
is
the central value
and the
unit of x'
Probability and Experimental Errors in Science
140
Ax.
is
If the
parabolic relation
fits
y
=
a
'
+
unknown we seek is
the
well over the range 4Ax, the value of y b'x'
+
c'x'
2
(3-101)
and, because of the shifted zero of the x scale to x is
just
The value of
a'.
relation satisfactorily
,
this value
of y we seek
given by Eq. 3-86, which for five points
a' is
becomes
=
y*
a
'
=
+
™\- ll yo
This method of smoothing
is
12 (2/+i
+
y-i)
-
3 (y +2
+
y- 2 )]
used by starting at one end of the range of
%iS and working systematically and consecutively through the values of x to the other end. The method does not do well at the ends, and the end t
regions should be repeated, working toward the respective end.
important that the
unknown
It is
also
function have no real "structure," of a
comparable to the range 4Ax, which should not be smoothed
size
out.*
Correlation
3-7.
So
we have considered relations y = f(x) in which x Additional variables are often involved, and if so
far in this chapter
the only variable.
is
we
say that y
Or,
we
The
to x.
is
correlated to x
say that y
is
and correlated
to each of the other variables.
functionally related stochastically rather than exactly
cases of correlation
most frequently encountered in practice known; in fact,
are those for which the other variables are not specifically their existence
is
suspected only because the observed fluctuations in y
for a given value of x are too great to be attributed to experimental error alone.
Examples of correlated properties are
(1) the heights
of mothers and of
adult daughters in the United States, (2) the grades of college students in
mathematics courses and
in physics courses, (3) the longevity
age 60 years and their weight
age 50 years, and
of
men
past
mass of the atomic nucleus and the number of nuclear fragments into which the nucleus breaks in fission by slow-neutron bombardment. Correlation coefficient.
at
(4) the
Let the graph of Fig. 3-4 of a set of student
grades in mathematics and physics represent two correlated properties.
We
shall define
This coefficient * It
is
an index or coefficient to express the degree of correlation. is
such that
possible, in this
component
in
operation
is
like a filter
others.)
would be zero
if
the points were distributed
method, to accentuate rather than to smooth a periodic
the experimental "jitter" of the measurements
or less than about 4A.r.
some
it
if
this
period
suspected, try a different x interval.
is
equal to
(The smoothing that attenuates waves of some frequencies and resonates with If this is
Statistics
of Measurements 1UU
in
Functional Relationships
141
Probability and Experimental Errors in Science
142
Assume first that all the points xu y are equally weighted. assumption the equation of the straight line is t
With
this
n
V'si
=
=
bx>
i^—
(3-102)
x\
Introduce the quantity
us
S,
which
is
a
(m^lf
(3 . 10 3)
measure of the dispersion of the measured values
to the least-squares fitted straight line.
of estimate.) The standard deviation
(S^-
is
y\
with respect
called the standard error
in the y\ values is
V = (^) The
correlation coefficient
(3-104)
defined in terms of S,/
r is
rs
and
s„-
as follows:
U--HJ
(3-105)
s~
\
y
From Eq. 3-105, the value of r may be anywhere in the range 1 to — 1 by convention, the negative values correspond to an inverse correlation, i.e., to a situation in which the general trend of the points is such that the slope of the straight line
For
=
y' gl
and
=
r
0,
Sy
=
-
0, that (1)
(2) the
s
y
-,
is
and
negative (see also Eq. 3-107 and Fig. 3-5).
this
is
possible only
if y'sl
=
0.
It
follows, if
the least-squares straight line, Eq. 3-102, has zero slope;
£a^
sum
is
zero, which, as discussed in Section 3-1,
the
is
condition encountered in the propagation of errors that the deviations
bx
t
(= x
The
i
—
x
=
Sy
x'{)
and by (= y t
i
—
y
=
y\)
be entirely independent.
increases, and this ratio is y sometimes called the alienation coefficient. Computation of the correlation coefficient r is generally easier with a modified form of Eq. 3-105 obtained as follows. From Eqs. 3-102 and ratio
js
.
decreases in magnitude as
r
3-103,
n 2 x'}
hence, noting also that sy
2
-( using Eq. 3-83.
=
(Zx'. 2
)/n,
¥f- ^J-=b
5 -£
(3-107)
Or, better, in terms of the weighted means xw and r
=
*"<* ~ *"»* ~
//
.
>•> (3-108)
Statistics
of Measurements
in Functional Relationships
143
Fig. 3-5. Scatter diagrams illustrating different correlation coefficients
and regression
lines.
Covariance.
The term covariance
is
commonly
samples.
It
may
observations
be written oxy
This
used.
is
a
which x and y are individual and the best evaluation of it from n actual
characteristic of the parent population of
t
i
is
n
The covariance may be divided by
-
1
n
-
(3-109) 1
the product of ax
.
and ay to give the -
best determination of the parent or universe correlation coefficient.
Probability and Experimental Errors in Science
144 It is
often convenient in statistics to speak of the experimental covariance
which
s
from Eq.
H=
n
= "A'
(3-HO)
3-107.
The
Interpretation.
help
—
given by
is
usefulness of the correlation coefficient
provides in answering such questions as:
it
mathematics
what grade may he expect
75,
is
such a question can be worked out simply
The answer to and b are known.
in physics?"
if r, y, x,
The equation of
the least-squares fitted straight line
This
M indicated
sy , is
(3-111)
V'si=bx' is
the line
given value of
Sy
The expected value of y
in Fig. 3-4.
readily obtained
for a
from Eq. 3-111. Then, the value of
computed from Eq. 3-105 as
is
'
is
a;
in the
is
"If a student's grade in
SvU and the answer
to the question
y
= s Jl y
is
r
2
(3-112)
simply
+ M«- ±
syU
(3-H3)
where the plus or minus value is the standard deviation in the expected physics grade. If the x and y' frequency distributions are normal, the i.e., the chances are is 0.6755^ 50-50 that the student's physics grade would be y + [y's i\ x ± 0.6755^. and mark the 50-50 Lines A and B in Fig. 3-4 are parallel to the line
probable error in the expected value
,
-
M
limits;
calculation of lines
distribution
is
A and B
independent of
also
presumes that the parent y
x.
In the question as posed, the student's grade in mathematics
To make x
=
70, sv
9 if
the calculated answer numerical, suppose that r >
=
and b
10,
+ M*'= = 5
the reliability
V
is
80
=
y/x.
Then,
+
f(75
-
70)
±
10Vl
-
(0.60)
2
=
0.60,
=
85.7
y
±
is
75.
=
80,
8
expressed as the standard deviation, and as
+ M*<= 5 =
85.7
±
0.675
x
8
=
85.7
±
5.4
and y' distributions are normal. (If they are not normal, the factor 0.675 must be changed appropriately; and if the distributions are asymmetrical, lines such as A and B in Fig. 3-4 do not have the same graphical significance as
if
the reliability
is
expressed as the probable error and
if
the x
for symmetrical distributions.)
Another useful feature of the correlation If r
=
60%
0.60,
we may
of the net
effect
coefficient
infer that the effect of the
of
all
is
x variable
the following.
may be about
the correlated variables including x.
However,
of Measurements
Statistics
Fig. 3-6.
distributions for
r
different values of
hood
an
maximum
145
two
likeli-
r*.
interpretation of this sort
the
Functional Relationships
in
common
effect in
x and y
is
very risky because of the fact that
may
of
all
be due to other variables.
Each of these physical interpretations of the correlation coefficient presumes that the unspecified as well as specified correlated variable have distributions that are
somewhere near normal.
Calculation of the standard deviation in r can be impractical procedure of recording
many sample
ments, computing r t for each sample
and then
set,
made by
the rather
of x t y measureand the mean value of r, viz., sets
,
t
manner with an equaform of Eq. 2-9. The distribution of all the sample values of r is essentially normal about the mean r if r is small, but the distribution is increasingly skewed as r approaches ±1, i.e., as x and y become strongly r,
calculating sr in the usual experimental
tion of the
t
correlated either positively or negatively.
No
This
is
illustrated in Fig. 3-6.
simple formula, therefore, can be given for the standard deviation in a
measurement of
single
r for
any value of
r.
In the above discussion of correlation, a straight line least-squares fitted curve y
which t
all
the points
For example,
in the state of
would
is
taken as the
=f(z), the so-called regression^ curve, lie if
on
the correlation coefficient were unity.
and liquor consumption have quite a positive correlation as studied over the past 90
tuition fees for students at Cornell University
New York
years. t The line M, or in general the curve y =/(*), is often called the regression line. This term comes originally from studies of the correlation of heights of sons and heights of fathers: It was found that sons' heights, compared with fathers' heights, tended to regress toward the symmetrical or mean line M. Regression is another word to express
the fact that additional uncontrolled variables are involved.
The viz., y,
The
slope of the regression line
regression curve discussed here
the error in x.
M,
viz.,
the constant b,
and the mean value of
y,
are sometimes called the regression coefficients.
is
in y
;
is
based on the assumption that x is exact and all is obtained if error is allowed
a slightly different regression line or curve
Probability and Experimental Errors in Science
146
often happens that the straight line
It
but that a curve of a higher power
not the most reasonable curve,
is
or whatnot,
series,
is
example, the points on a graph of "distance in feet to stop"
random
miles per hour" for a
better. vs.
For
"speed in
selection of 100 automobiles are better
by a parabola by the argument attending Eq. 3-89, and it is preferable to use the squared relation rather than to change the variable to make the relation linear.* (This point is treated in Problem 29 of Section 3-11.) In this nonlinear event, the quantity Sy is, as before, the measure of the y spread about the curve in general, and a correlation coefficient (often called the correlation index p to distinguish it from r for a straight line), But the magnitude of p is different from r, and is defined by Eq. 3-105. fitted
-
the physical interpretation of the numerical value of correlation in the
nonlinear case
is
more complicated than
in the linear case.
It is,
of course,
important in speaking of correlation to specify the particular least-squares regression line involved.
Inefficient Statistics
3-8.
Throughout
this
book, emphasis
placed on the procedures that use
is
of the experimental information;
all
referred to as efficient statistics.
of
it is
often convenient to use
The quick-and-easy formulas them almost no attempt to use
desired feature of the measurements.
inefficient statistics frankly
much
However,
obtain quickly at least an order of magnitude of
inefficient statistics to
some
such procedures are often loosely
have
in
of the imformation actually available in the measurements.
In this
quick-and-easy interest, their discussion may logically have appeared earlier in this book, somewhere in the latter part of Chapter 2. But the
understanding of the concepts of efficiency considerable degree of
vs.
statistical sophistication.
inefficiency involves
We
a
have in the present
chapter at least touched upon the types of arguments basic to this under-
However,
standing.
rehearsed: (1) the
Suffice
in this brief section, these
to say that in each case
it
numerical efficiency of a location value
arguments
mentioned is
from a parent normal *
Each case of
this sort
parabolic relation e.g.,
is
and
(2)
now be belowf
relative to the efficiency
of the mean, and the efficiency of a dispersion index the standard deviation,
not
tersely
will
is
relative to that of
the measurements are presumed to be
distribution.
must be judged on
its
own
merits.
In the example cited, the
preferable because of the role of the other correlated variables,
variables such as the weight of the car, the size
time of different drivers,
and type of brakes and
tires,
reaction
etc.
W. Dixon and
F. Massey, Jr., Introduction to Statistical Analysis(McGrawYork, 1951), for more complete discussion of inefficient statistics. + For an asymmetrical parent distribution, the quantitative use of these particular inefficient statistics depends of course upon the degree of asymmetry. t See, e.g.,
Hill
Book
Co.,
New
of Measurements
Statistics
Functional Relationships
in
147
The median has been described as being 64% effimeasurements whose parent distribution is normal.
Location index.
cient in a large set of
36% of the information contained in the data is ignored if the median taken as an estimate of the value of the mean. But, when the number n
Thus, is
measurements is very large, this inefficiency may not be serious. median increases as n becomes smaller; for example, the efficiency is 69% for /? = 5, and 74% for n = 3. In general, however, the median is more efficient as a location value than is the mid-range which approaches 0% for very large n, and is 54% of
trial
The
efficiency of the
for n
~
10,
77%
for n
=
5,
92
%
=
for n
3,
two of the large number n of measurements are to be averaged to give an estimate of the mean, again if the parent distribution is approximately normal, the best two measurements for this purpose are those at the points 29% and 71 % along the range. This estimate of the mean is 81 % efficient. The 25% and 75% points are practically as good and are easier to remember. If three measurements are used, the average of those at the 20%, 50%, and 80% points in the range give an 88 % efficiency in estimating the mean value, and these three are also about the best three. When n is only about 7 to 10, the average of the third measurement from each end of the range gives an estimate of the mean with an efficiency of about 84%. If only
Dispersion indices.
normal
that, in a is
88% efficient. When the number
mean
deviation, as a dispersion index,
only
population is
Attention has already been called to the fact
distribution, the
n of measurements
is
very large and the parent
approximately normally distributed, the standard deviation
is
estimated with about 65 s
_
%
from
efficiency
(93%) point
- (7%
point) (3 114)
3
or with about
80%
(97%
_
efficiency
point
from
+ 85% point - 15% point - 3% point
These respective points are the best
if
(3-115)
only two or four points are to be
used.
When
n
is
small, 2 to 20 or so
the standard deviation
is
from a near-normal parent
distribution,
estimated simply from the range as s
^
—
range =-
V"
,
.
(3-116)
Probability and Experimental Errors in Science
148
with an efficiency that
As n
increases,
3.5 for n //
=
100.
=
falls
we should
The
efficiency,
Standard deviation
in
the simplest estimation
use, instead
=
n
15, 3.7 for
99%
from 20, 4.
1
for n
=
3 to about
85%
at n
=
10.
Vn in Eq. 3-116, the following: = 30, 4.5 for n = 50, and 5.0 for
of
for n
however, progressively decreases as n increases. the
mean (standard
error).
When
n
is
small,
is
s„~^
(3-117,
n
where, again, for n
y
numerical values given
Examples.
10,
n should be replaced with the respective
in the previous
As examples with
paragraph.
the measurements listed in Table 2-2,
the inefficient estimates of the mean, of the standard deviation,
and of the
standard error are compared with the
3-4.
efficient values in
Table
Table 3-4. Comparison of Inefficient Estimates with the Efficient Values from Measurements of Table 2-2
Statistics
of Measurements
in
Functional Relationships
149
the need of a quantitative analysis of precision as free as possible
ambiguity. This need
may come
of the precision of his
in the stating
from
own
measurements or in the interpretation of another's. This need is more frequently and more acutely felt as the pertinent facet of science progresses and the residual uncertainties in the pertinent scientific facts and concepts
become smaller. The continuing tests of a generalization or of a theory depend more and more upon small differences. After systematic errors have been eliminated as far as possible, the remaining inherent ambiguity in precision and in its interpretation is reduced to a
minimum
only by a statistical argument.
The design of an experiment, statistics, refers
as the phrase
is
used in the subject of
generally to the process of determining a priori the
most
economical sample size or sizes to give a specified precision, where economical refers to both time and financial cost. It usually does not involve any changes in the apparatus or in the general measurement operations; but it does specify the order of taking measurements and the groupings of measurements. One purpose of the order or grouping has to
do with
the possible detection of sources of constant or systematic error
measurements.
Design features that are obviously pertinent are broad general terms:* (1) The number of measurements must be sufficiently large to give the necessary number of degrees of freedom for the determination of the in the
easily stated in
desired precision. (2) If subsets
of measurements (including controls or background) are
involved, they should be of approximately equal size
a
way
and grouped
in
such
as to reveal inconstant effects (such as a zero drift in the apparatus).
(3) If
measurements or subsets are of unequal precisions, each should
be weighted inversely as the square of the standard deviation (or, alternatively, if the distributions allow, as the square of the probable error). (4)
Before subsets recorded at different times, with different apparatus,
or by different observers, are pooled to form a grand
set,
the subsets
should be tested for consistency. (5) If
one or more measurements or subsets has a high precision,
the measurements or subsets having low precision s as a consequence of weighting
if,
say, s
t
5*
4s h
t
may
sh ,
be neglected
.
measurements to be used in computing the value of a derived property should have precision roughly in accord with the relative magnitude of the final propagated error in the computed value it serves little purpose to spend time and effort improving the (6) All
component
direct
—
W. G. Cochran and G. M. Cox, Experimental York, 1950), and R. A. Fisher, The Design of Experiments (Oliver and Boyd, Edinburgh, 1949), 5th ed. *
For
fuller discussions, see, e.g.,
Designs (John Wiley
&
Sons,
New
Probability and Experimental Errors in Science
ISO precision of one
component
if
the precision of the derived
measurement
is
heavily dominated by another component.
And,
(7)
the fluctuations in the direct measurements, or in the
finally,
must be as nearly random as possible; this often on the part of the experimenter in order to avoid systematic errors and bias. subsets of measurements,
involves a rather rigid discipline
Summary
3-10.
The
topics discussed in
many
parts of this chapter
and
in
Chapter 2
serve the pedagogical purpose of emphasizing the difference between
the properties of a sample set of measurements and those of the parent
For example, the mean and
or universe frequency distribution.
cision are at best related stochastically to the corresponding
precision of the parent distribution. reliability
This
is
essentially the
its
pre-
mean and
problem
in the
of inferred knowledge of any sort, including measurements and
hypotheses, the basic
phenomena of any experimental
science.
This chapter has been concerned with direct or derived measurements that have parent distributions of
commonly found
unknown
shape.
This
is
the situation
for derived measurements, but, fortunately,
many
of
measurements in experimental science have parent distributions of one of two general types, and each of these types is fitted reasonably well by a simple mathematical model. With the model, once its paramthe direct
eters are satisfactorily
determined
(albeit
only stochastically since the
by means of the sample), predictions of future measurements can be made and the reliability of any characteristic of the parent distribution can be established with many fewer measurements than is Some possible when we must rely on the sample knowledge alone. determination
examples of
is
this
convenience, in the case of specialized types of derived
quantities (e.g., the fitted
means and
precisions of successive subsets) that are
reasonably well by the normal distribution, have been shown in the
many of the topics of this chapter. In Chapter 4, the normal model is explored in some detail, and then in the final chapter the Poisson model and typical Poisson measurements are discussed.
discussions of
3-11.
Problems
The student should
test his intuitional feeling
with the correct answer to each
problem. 1.
Given the measurements:
gm
10.01
±
10.00
±0.12
9.97
± ±
±
9.96
±0.15
9.99
0.25
0.05
9.98
0.04 0.06
gm
of Measurements
Statistics
Find the weighted mean
(a) (i)
values,
151
the weights are taken as inversely proportional
if
to the squares of the
(ii)
±
Find the standard error (standard deviation
(b) is
±
to the
Functional Relationships
in
values as they should be. in the
mean)
if
each
±
value
interpreted as the standard deviation.
The standard deviation
2.
in the reading of a given voltmeter
corresponding quantity for a given ammeter
is
is
What
0.015 amp.
0.20 v; the are the per-
centage standard deviations of single determinations of wattages of lamps
operated at approximately their rated wattages, obtained from readings on these instruments for the case of (a)
a 500-w, 115-v lamp,
(b)
a 60-w, 115-v lamp, a 60-w, 32-v lamp, and
(c)
(d) a 60-w, 8-v 3.
With what
0.39%) 2.9%) (ans. 1.0%) (ans. 2.5%)
(ans.
(ans.
lamp?
precision
may
the density of a 10-g steel ball bearing of approxi-
mate density 7.85 g/cm3 be obtained tion of its average radius 4.
What is
is
the standard deviation of the determina-
if
mm, and of its mass,
0.015
the standard deviation in
where u
u,
=
0.05
mg ?
(ans. 0.67
%)
3x, in terms of the standard
deviation in x ? 5.
One of the
radiation constants
is
given by the formula
2n*k*
"15^
"
where k h c
= = =
where the
1.38049 x 10~16 (1 ± 0.000,05) erg/(molecule °K), 6.6254 x 10-27 (1 ± 0.000,2) erg-sec,
2.997928 x
±
lO^l ±
0.000,004) cm/sec,
values are probable errors.
(a)
Solve for a and express
(b)
What is
its
it
with the proper number of significant figures.
probable error expressed with the proper number of significant
figures?
What is the standard deviation in u, where u = 3x + 5y2 from the measure-
6.
,
ments
x y (a)
(b)
= =
12
13
11
12
10
14
13
12
14
13
12
35
37
34
37
34
37
36
35
38
34
35
when x and y are assumed to be completely independent, and when they are recognized as being partially dependent ?
What What is
the correlation coefficient in Problem 6 ?
7. (a)
is
(b)
the equation, with constants evaluated, of the linear regression
line? 8.
p
is
the pull required to
lift
/>(lb)
w
a weight
w by means
made
following measurements are
(lb)
= =
12
15
21
25
50
70
100
120
(a)
Find a linear law of the form p
(b)
Compute/? when w
=
150
lb.
=
a
+
bw.
of a pulley block, and the
Probability and Experimental Errors in Science
152
Find the sum of the deviations. Find the sum of the squares of the deviations of the given values of from the corresponding computed values. Note significant figures in all parts of this problem. (c)
(d)
p
In a determination of h/e by the photoelectric method, the following stop-
9.
ping potentials were found, after correction for the contact potential difference,
corresponding to the various wavelengths of incident
A(A)= V(y) =
3126 -0.385
2535
+0.520
light:
4047 -1.295
3650 -0.915
Using the least-squares method, determine h/e and
Assume
V only,
errors in
a
R is the resistance to motion of a + b V 2 from the following data
weighted equally, and
(b)
weighted in proportion to the speed V:
K(mi/hr) = R (lb/ton) =
The
standard deviation.
car at speed V, find a law of the form
(a)
11.
its
5461
-2.045
a fractional standard deviation of 0.5 %.
10. If
R =
4339 -1.485
10
20
30
40
50
8
10
15
21
30
a-ray activity of a sample of radon, expressed in terms of
measured
its initial
each succeeding 24-hr interval to be: 0.835, 0.695, 0.580, 0.485, 0.405, 0.335, 0.280, and 0.235. On the assumption that the activity obeys an exponential decay law, find the equation that best represents activity as unity,
the activity,
is
after
and determine the decay constant and the (ans.
12.
What
is
y
=
Solve this problem
(ii)
(iii)
and
E = olT4 In E = In
a
=
a
£/(r 4 )
+
4
In
E=
oiT 4 ,
from n
pairs of
measurements?
in
each of the following forms
T
give a qualitative reason for the differences in the answers.
ans. (b) Solve the
for
half-life.
/day, 0.1815/day, 3.82 days)
without knowledge of the precision of the measure-
first
ments by writing the relation (i)
1815<
the expression for the best value of a in the blackbody law relating
radiant energy and temperature, (a)
1.000,36-°
all
XEiT+KXTf),
(i)
problem
in
(ii)
ln"1
^ In E
t
- 42 In T )jn, t
(iii)
-
2 -i
n
l
i
_
terms of the standard deviations sE and s T constant
pairs of measurements.
13. Calculate the value of the correlation coefficient r for the following data
on the heights x x
y
= =
(in inches)
and weights y
63
72
70
68
124
184
161
164
66 140
(in
pounds) of 12 college students:
69
74
70
63
72
65
71
154
210
164
126
172
133
150
Are the stars that are easily visible with the naked eye randomly distributed sky? Divide the entire sky into many equal small solid angles, and discuss a method for finding the answer in terms of sample means and standard devia14.
in the
tions.
of Measurements
Statistics
=
—
4x
error of
The
16.
What
2/x.
viscosity
mean
deviation in y corresponding to an
is
% when x
1
=
when x
large, oo
is
1/V2)
calculated using Poiseuille's formula for the
through a cylindrical tube of length
/
and of radius
deviation,
(c) the fractional
in terms
r\
mean
under a pressure difference p. Write the expression for
t
(a) the
of a liquid
?]
Q flowing mean
(b) the fractional
in
the percentage
depends upon x; about
quantity of liquid in time
is
% in x ?
1
(ans.
a
IS3
A quantity y is expressed in terms of a measured quantity x by the relation
15.
y
Functional Relationships
in
deviation,
standard deviation
of the errors in the measured quantities Q,
a,
I,
and p, where
npafit
17.
The
viscosity
so that
G —
=
ri
-
(\ I
is measured by a rotation viscometer. The and a torque G is applied to the rotating cylinder
of a liquid
->]
cylinders are of radii a
and
— — —\
b,
l
I
(a) the fractional
mean
,
where
to is
the angular velocity of rotation. Calculate
and
deviation,
(b) the fractional standard deviation r\ when a = 4 cm and b = 5 cm, and when the mean deviation in both a and b is 0.01 cm and the standard deviation in both a and b is 1.25 x 0.01 cm, assuming that the error in G/oj may be neglected.
in
18.
A
coil
point on
its
of n turns of radius r carries a current
axis at a distance
error in measuring x (a)
(b)
when when
x from
its
center
is
find the value of x for
is e,
e is the
standard deviation, and
e is the
mean
2
in
H
is
a
If the
greatest
(ans. r/2)
deviation.
The mean of 100 observations is 2.96 cm and 0.12 cm. The mean of a further 50 observations is
the standard deviation 2.93
cm
is
with a standard
Find
deviation of 0.16 cm.
mean,
(b) the standard deviation, (c)
field at
+ x 2 )~%.
2
which the error
19.
(a) the
The magnetic
/.
H — 2nr I(r
and
the standard error
for the
two
sets
of observations taken together as a single set of 1 50 observations.
20. Derive Eq. 3-36. 21.
Prove that the frequency function of the variable
the frequency function of the normal variable
degrees of freedom
->•
v
oo.
Assume
z,
/,
Eq. 3-65, approaches
Eq. 1-25, as the number of
that the constant approaches 1/
^2n.
accompanying data on the yield of corn in bushels per plot on 22 experimental plots of ground, half of which were treated with a new type 22. Consider the
of
fertilizer.
Does
the fertilizer increase the yield ?
Treated
6.2
5.7
6.5
6.0
6.3
5.8
5.7
6.0
6.0
5.8
Untreated
5.6
5.9
5.6
5.7
5.8
5.7
6.0
5.5
5.7
5.5
IS4
Statistics
Samples of and s 2 =
25. sx
=
of Measurements 10
sizes
12
18.
in
Functional Relationships
1
55
and 20 taken from two normal populations give
Test the hypothesis that the standard deviations are
internally consistent. 26. The curve to be fitted is known to be a parabofa. There are 4 experimental points at x = -0.6, -0.2, 0.2, and 0.6. The experimental y values are 5 ± 2, 3 ± 1, 5 ± 1, and 8 ± 2. Find the equation of the best fitted curve. [ans. y(x)
=
(3.685
±
0.815)
+
(3.27
±
1.96).r
27. Differentiate (d/dp) logZ., Eq. 3-8, with respect to
solve for (p
— p*) 2
bution
is
by the method of
equivalent to writing a w
is
ap
=
maximum
= ^ npq,
\ p*{\
±
(7.808
4.94).r 2 ]
and, using Eq. 3-9,
and then show that the standard deviation
reference value as the estimate p*, result obtained
p
+
in p,
— p*)jn. Show
with the that this
likelihood for the binomial distri-
Eq. 2-28, where
w
is
the
number of
"wins." 28. in air 29.
Smooth
the measurements given in Table 3-5 of the sparking potentials between spheres of 75-cm diameter as a function of sphere separation.
Measurements were made on the distance-to-stop as a function of speed
with a group of 50 different automobiles of various manufacture and with different drivers. The measurements are given in Table 3-6. The speed is presumed to have been very accurately known in each case, the parent y distribution is presumed to be independent of x, and all the distance measurements to
have equal weighting. Which of the following relations best represents the measurements and what are values of the constants: 2 (a) y = ax, (b) y = b + ex, (c) y = dx or (d) y = ex + fx2 l ,
(ans. e
=
1.24,/
=
0.082).
"Everybody believes of errors;
think
it
in
the exponential law
the experimenters because they
can be proved by mathematics; and
the mathematicians because they believe
4 Normal
it
has been established by observations." E.
T.
WITTAKER
Probability Errors
The normal (Gauss) probability distribution is the mathematical model most commonly invoked in statistics and in the analysis of errors. For example, as was pointed out in the last chapter, it can be demonstrated analytically that this model fits very well the distribution of each of certain special parameters of empirical distributions:
(a) the likelihood functions
Lj of Section 3-1 approach a normal distribution;
consistency of Section 3-3, the degrees of freedom
is
t
distribution
very large; and
(c)
is
of
(b) in the tests
normal
if
the
number of
the ratios s^/s^, can be
assumed
to be normally distributed.
And
it
was pointed out
that the analysis of the errors in direct measure-
if it can be assumed that the parent distribution Examples of such simplification are (a) the weight to be assigned to each measurement a;, is llsx 2 if ther-'sare normally distributed; (b) the method of least squares is strictly valid if the normal distribution applies; and (c) errors propagate according to convenient rules if the
ments is
is
greatly simplified
normal.
parent distributions are normal. Indeed, the "theory of errors" as developed during the early years of the subject,
i.e.,
the theory of probability as applied to direct measure-
ments, was based almost exclusively on the assumption that the normal distribution
fits
Nowadays, however,
the measurements.
of the popularity of measurements
Chapter
5),
made
the Poisson distribution takes
especially in view
with Geiger counters, its
distribution in the analysis of errors in direct measurements.
when
the expectation value
//
in the
etc. (see
place alongside the normal
But even
Poisson distribution (Eq. 1-26)
large, the simpler algebraic expression that describes the
is
normal case
convenient and satisfactory approximation to the Poisson expression. 156
so,
rather is
a
Normal 4-1.
It is
Probability Errors
Derivation of the Function
Normal (Gauss)
Probability Density
apparent in the binomial distribution, Eq. 1-21, that,
probability
p
is
constant, the expectation or
boring region of interest
of
157
trials
n increases.
shift to larger
This
is
shown
and
mean
(1)
the
value,
When
k
=
0),
and
0.4
0.3
0.2
0.1
0.0 0.3
i.e.,
number
n becomes very
mean value ju becomes of we become much more
from the origin between adjacent k values becomes the distribution loses much of its practical
(2) the unit interval
relatively very small,
the "success"
and the neigh-
i.e.,
interested in the deviations than in the values of A: reckoned (at
if
fx
larger values of k as the
in Fig. 4-1.
large, two significant features appear: dominant importance as the reference
value
Probability and Experimental Errors in Science
IS8
Fig. 4-2.
Binomial probability B(k;
discrete character.
k\n\p constant, n varied.
n, %) vs.
In regard to the second feature,
the unit k interval to the standard deviation a
and the
becomes continuous
distribution
adjacent values of the deviation variable z
can be written, Eq. 4-1, k
f)
is
in
=
k
—
[X
=
k
—
(= Vnpq) approaches
zero,
In other words,
in the limit.
z,
rip
n -> cc, the ratio of
if
where
&
k
—
k
(4-1)
by the differential dz. In k as is discussed in connection
the limit, as separated
the most probable value of
— up = 0. normal distribution is the special case of the binomial distribution when n becomes infinite and p remains of moderate value.
with Eq. 1-24; and, in the limit, k
As we It is
shall see, the
instructive to plot binomial distributions, as
(instead of k) as the abscissa values.
curves of Fig. 4-2, the
mean
/;
In such plots, as
value of k\n
is
constant
increases, with kjn
shown
if/? is
the width of the curve decreases and the individual steps
To
in the step
constant, but
become
derive the formula for the normal probability distribution,
smaller.
we
use
Normal
Probability Errors
the
first
term of
factorial
number
IS9
approximation, Eq. 1-14, to represent each
Stirling's
in the binomial expression.
=
B(k; n,p)
k
k\(n-k)\
pq
Thus,
n-k
n n J27rnn e~ fc
V27rfcA;
n
/
\27rk(n <2irk(
and, writing n n
=
n kn n
n
n - kVY kHn -k)
n
—
k
now change
=
nq
—
(4-2)
k n .k
(4-3)
n - kPq
~k
\27Tk(nLet us
k n~k
- k)(n - k) n ~ k e - (n_fc) p q
e"V277(n
the variable
k to
z (the latter follows
k)I
z
\kJ \n-kJ
according to Eq. 4-1 and also write
from Eq.
+q=
and from p
4-1
1).
Then, Eq. 4-4 becomes /
i
\np + z/
1
\nq — z
i
B(z; n,p)
rnpq(l+-)ll--)
n large
27 L
npi
\
Since, for n large, the quantities z\{np) unity,
-
1
nq/
\
np>
and
nq'
zj{nq) are small
compared
to
we may neglect them in the first factor; but they cannot be neglected
in the parenthesized factors that are raised to high powers.
the parenthesized factors,
it
is
In treating
convenient to rewrite the expression in
logarithmic form so as to take advantage of the power series expansion
>x> - If -
of a logarithm. This expansion, for any x such that 2 log e x
= {x-\)-\(x-
If
+
\{x
0, is
(4-5)
Hence, log, B(z
;
n,
p) at
-
-
log, (27rnpq)
- (np + z)
n large
+ {nq - z) (i) log
M
,
2nV
Lnp
^+ Inq '
2
2«V
+
3«V Z 3
3n g 3
_£(l + l) + 5(i-i) 12n i \p#i
+ a*) q
(4-6)
Probability and Experimental Errors in Science
160
For n small, several approximations have been made up to this point in most serious approximation is the one to be made now, viz, that the derivation; but for n large, the
z
z (q
-
2
p
2 Z _ Ap + q
2 )
6nW or, since
a
= Vnpq from
)
12nW
Eq. 2-28, that
*y-P )_*y+^ 2
6a4
(4 . 7)
12a 6
Which of these two terms is the more important depends on the values of p and q. With this approximation, it is seen that, with p of moderate value, we neglect the net effect of all z terms of powers higher than the second in Eq. 4-6. Then, as an approximation, we change B(z; n,p) to G(z; n, p), with
G
symbolizing "Gauss", note that/?
log e G(z;
n,p)= -(-)
log, (2-nnpq)
+q=
1,
and write
- -^— 2npq
\2/
and
=
G(z; n, p)
*
e
.
-l
*
l2npQ)1
yjl-nnpq
An
important feature of
this expression
is
some
simplification
is
and q Hence,
that the parameters n, p,
of the binomial distribution appear always as a
triple
product.
afforded by writing
^jlnpq
Oyjl
and then G(z;
k)-A re -*W
(4-9)
or G(z;
h)bz =-^= e _7lV ^z
(4-10)
Equation 4-9 is the normal (Gauss) probability density function. It is also normal differential probability distribution, or the law of the
called the
normal frequency
distribution.
is only one value in a continuum of values, and G(z; h) does not have the significance of probability until it is multiplied by Az, as in Eq. 4-10. G{z; h) Az is the probability of observing a
As
seen in Eq. 4-9, G(z; h)
deviation z within the small interval Az.
The function
»
-^ yJTT
«T* J -oo
V dz = (D(z)
(4-11)
Normal
Probability Errors
161
h G(z\h)
=
1.41,
a=
0.40
=
Fig. 4-3. Normal (Gauss) density function (normalized frequency distribution) for each of three values of the parameter h.
normal (Gauss) probability distribution function or the (Note the difference between the terms "probability density function" and "distribution function.") Equation 4-11 gives the probability of observing a deviation z in the range called the
is
cumulative probability distribution.
— oo to z. Shape of the normal frequency curve. Three graphs of the normal density function, Eq. 4-9, are shown in Fig. 4-3. The normal (Gauss) curve is symmetrical, since z appears to the second power only, and the curve approaohes the z axis asymptotically at both extremes. The curve is shown as continuous as is appropriate for the normal distribution which is have been based on the assumption that the limits n —> oo and Az -> reached. The maximum ordinate and the shape of the curve are determined by the
single
parameter
h.
is
A/vrr, and
is
a precision
The peak ordinate value
the relative width of the curve increases as h decreases,
h
index that varies inversely with the standard deviation a (Eq.
Normalization. is
In order that Eq. 4-10 or 4-11 give a probability,
necessary that the probability function be normalized,
of
all possible
4-8).
i.e.,
that the
it
sum
outcomes equal unity h_ f«
V *
IT
J-
dz=
1
(4-12)
(
This normalization has been assured in the derivation inasmuch as
Probability and Experimental Errors in Science
162
=
n >P)
2*=o^(^
1»
Eq. 1-22. Since the area under the curve
ized, the inverse relation
the curve
between the
maximum
is
normal-
ordinate and the width of
obvious.
is
We shall have need later for the type of integration indicated in Eq. 4-12, so
us carry
let
=
same time check the normalization. is based on geometrical considerathe z, y plane, and a similar function
at the
for this integration
Consider y
tions.
y
now and
out
it
The usual method
=
G(z;
h) in
G(x; h) in the perpendicular
y plane from z y plane from x =
the curve in the
curve in the
x,
z,
A=
2
4=
f
x,
y plane.
=
to oo,
to oo.
also the area under the
Thus,
V*v dz = 2 ~= f Y» v dx
/ show that the coefficient 2(/7/v -n-) is such x and z variables are independent,
and we wish Since the
to
H = A-
f
7T
Jo
(4-13)
yJTT Jo
yj7T Jo
A*
Let A/2 be the area under
and
that the area
V»V * fV»V dz = 4 h- f" fVV** Jo Jo Jo
A = 1.
<**
IT
Evaluating the double integral corresponds to determining the volume of h) curve about the y axis. To convenient to change to polar coordinates
the solid obtained by rotating the G(z;
perform
this integration,
in the x, z plane.
becomes
r
dd
dr,
it is
So, place r 2
and
=
z2
+
this is (7r/2)r dr in
x 2 The element of area dz dx one quadrant. Hence, .
i
hr rdr
I JO
TT
and the integration is in an easy form. Integration gives A 2 = 1, which proves that the normal probability density function is normalized as it is written in Eqs. 4-9 and 4-12. In a measurement problem, we are concerned with an experimental frequency
distribution
distribution scale
by
n.
tribution values,
is
is
of n
may An alternative
trial
measurements.
This
experimental
be normalized by dividing the ordinate (frequency) procedure,
if
use of the normal frequency dis-
involved in the analysis or predictions of the experimental
to multiply G(z;
h) Az,
G(z;h)dz, by
or
n.
The normal
Jz x
expression, multiplied by distribution; but
normalized.
this
With the
n,
product
is is
properly called the normal frequency it is no longer and procedure of normalization well
not a probability because
significance
usually not explicitly
understood, careful distinction
is
the terms frequency distribution
and probability
made between
distribution.
Normal
Probability Errors
4-2.
Errors in the
The
essential
163
Normal Approximation
approximation made
Eq. 4-9 was the
in the derivation of
neglect of the terms of powers higher than the second in the logarithmic
expansion.
n
is
This approximation, represented by Eq. 4-7,
very large or
if all
the deviations are very small.
of Eq. 4-7 shows that the deviations should be at
±3c
if
n
Bernoulli
~
10,
or about ±4
if
n
~ 100.
A
k
Errors
in
B(k; 10,0.2)
G(z; h)
valid only if
least smaller
(Note that n
trials.)
Table 4-1.
is
quick inspection
Az
G(z; h) Az
is
the
than about
number of
Probability and Experimental Errors
164
Table 4-2.
k
Errors
in
G(z; h)
Az
in
Science
Normal
Probability Errors Table 4-3.
Errors
in
165 G(z; h)
Az Independent
of Skewness
All odd-order terms in Eq. 4-6 are zero
k
Ak
=3
Probability and Experimental Errors in Science
166 errors, all
presumed now
to be
random and independent,
is
very large
indeed.
With measurements is
continuous sample space, each elementary error
in
to be identified with a basic Bernoulli trial.
of each elementary error negative, in the actual
is
measurement;
in discrete
assume that the
we assume
errors so conspire that the observed deviation
measurements
We
effect
a very small increment, either positive or
sample space, as
is
in
that the elementary
their algebraic
sum. With
counting experiments, the
elementary errors are grouped into bundles that correspond to "yes, a
observed" and "no, a count
is
not observed" in a specified sample.
In this case, as discussed in Chapter
5,
the basic Bernoulli trial
count
is
with "a count"
vs.
is
Mechanical analog for Bernoulli-type elementary errors tinuous sample space.
Referring
to
identical small spherical steel balls are
from the nozzle
identified
"no count."
at the top.
Each
Fig.
4-4,
dropped one
ball filters
in
suppose that
con-
many
at a time vertically
down through
the symmetrical
array of steel pins (represented by solid circles in the figure) and comes to
Normal rest in it
Probability Errors
167
one of the many identical bins
at the
bottom. Whether the
determined entirely
ball, as
presumed to be by chance. As drawn, the pins are arranged in an
encounters a pin, goes to the right or to the
left is
array of quincunxs (one side of a face-centered cubic structure). ball
is
If
each
only slightly smaller than the clearance distance between pins,
falls practically
it
head on to the next pin and has a constant chance, perhaps
a 50-50 chance, of being deflected to the right or to the
left.
It is
not
must be constant, the same for each pin encountered a noneven chance might be realized in this model if pins of an asymmetrical shape were used. There are numerous possible paths through the array and some balls will find their way to each of many necessary that this chance be 50-50, but
it
;
different bins.
In analogy, each operation of a ball filtering through to a bin
is
a
measurement. The horizontal position of the nozzle represents the supposedly "true" value for the case of the 50-50 pin chance; each deflection caused by a pin encountered in the
filtering
process corresponds to a
and the position of the bin into which the ball finally comes to rest represents the measured value. * If the chance is 50-50, then the central bins, directly under the nozzle, have the best chance of receiving the largest number of balls, and the frequency with which a ball enters a particular bin decreases with the distance of the bin from the central position. If the right-left deflection small elementary error;
chance per pin to one side;
constant but
is
is
not 50-50, the shape of the histogram
same and symmetrical but
essentially the
this
its
maximum
ordinate
is
is
shifted
corresponds to the effect of a systematic error.
Chapter 3, there is no practical way of distinguishing between an undetected systematic error and one or more random elemen-
As mentioned
in
tary errors since, in real life,
To be
we never truly know the "position of the nozzle."
a good analog for the normal distribution, the number of hori-
and the number of balls dropped must be increased and the geometrical size of the balls, the pin spacing, and the bin size must be reduced indefinitely (conditions that, in combination, correspond to the infinite number of Bernoulli trials and to continuous
zontal rows of pins indefinitely,
sample space).
These extensions can be easily imagined.!
* The "true" value is merely supposed because, unfortunately, our best view of it is through the thick veil of elementary errors. Also, in some measurements, the property itself is altered by the very process of measurement, a complication enshrined in Heisenberg's uncertainty principle. In any case, the "true" experimental value is usually taken as the mean value resulting from the measurements. t The ball filtering down through the array of pins is solving the popular randomwalk problem in one dimension. If the deflection chance is 50-50, the ball performs a symmetric random walk. The physicist takes this as the simplest model for onedimensional diffusion.
Probability and Experimental Errors in Science
168
Characteristics of elementary errors. In the derivation of the normal distribution we assumed that the magnitude of the increment per elementary error is constant (only two possible outcomes of a Bernoulli trial, viz., positive and negative), and that the probability p that the magnitude is positive is constant for all elementary errors. In an actual measurement, it is most unlikely that all the elementary errors contribute in accord with these two assumptions. However, those elementary errors making extremely small incremental contributions are presumed to be less important than those making larger contributions. In essence, then, we assume the existence of a very large number n of important elementary errors all of about the same incremental size, and all of about the same positive sign probability p. p may be reasonably presumed to be \ but this value
not necessary.
is
In support of the just-mentioned relaxation of the rigid Bernoulli
we may point out that the normal on the basis of elementary errors characteristics from those of the Bernoulli
requirements of the elementary errors, distribution function can be derived
having somewhat different
our derivation.*
trials in
Two
other sets of characteristics are as follows:
(1)
If
p
=
\,
incremental contributions need not be rigidly constant in magnitude for
elementary errors;
if
order of magnitude.
they are very small, they (2) If the sizes
of
all
may be
more, the standard deviation error may be large or small. In conclusion,
random
it
is
may
the possible increments of a
number of
be either large or small, and, further-
in the distribution
due to any one elementary
reasonable to suppose that numerous elementary exist and are measurements reasonable to suppose that these
errors of the various imagined causes actually
indeed responsible for the observed variations in the
continuous sample space.
in
all
merely of the same
given elementary error are themselves normally distributed, the errors n need not be specified, n
the
And
it
is
do
trial
trial measurenormal (Gauss)
elementary errors conspire in such fashion as to cause the
ments to
fit
in
more or
distribution, even characteristics. set
less
good approximation
though we are not able to
Also,
it is
of trial measurements
fix in detail
It
is
their special
reasonable that the region of greatest misfit of a
is
in the tails,
say
\z\
y
which a few elemennonnormal shape would
2a, for
tary errors of relatively large contributions but of
have the greatest
to the
effect.
significant that, at least to the author's
knowledge, no experi-
mental situation leads to a truly normal distribution, and that the theory of the proof of the soSee H. Cramer, Mathematical Methods of Statistics (Princeton University Press, Princeton, 1946), pp. 213-232. * All these derivations are special cases in probability
called central limit theorem.
Normal
Probability Errors
169
deviations from normal are greatest in the
tail
regions beyond about
±2(7.
In the application of the normal distribution function to actual measuren, p, and q have no individual beyond the concepts of the elementary errors. These paramalways appear as the product npq and this product, which does
ments, the Bernoulli-trial parameters significance eters
have practical significance, we refer to
=
or a [a
terms of the precision index h
in
=
Vnpq]. In the application of the normal distribution, we shall generally determine (best estimate ) a and h from the actual measurements themselves rather than from the Bernoulli parameters. Having no further need of n as a symbol of the number of Bernoulli trials, we use it as the symbol for the number of actual trial measurements. It is hoped that this double use of the symbol n will not be confusing. 4-4.
l/(/z/V2)
The Error Function
probability of observing a deviation z in the range from —z x to where z x and z 2 are arbitrarily chosen, is found by integrating the normal density function, Eq. 4-9, between these particular limits. This integration is carried out by expanding the exponential term in a power series and by integrating term by term, but it is a tedious process. Fortunately, integral values for most problems can be found in reference
The
+z 2
,
tables.
from
In the tables,
either
to 2 or
we
generally find the integral value as integrated
from
—z
+z
to
the parameter of the table), and
it is
(where the numerical value of
necessary to
make
z is
simple additions or
subtractions to deduce the integral value between two arbitrary limits of
and
done with comprehension if any two limits is the area under that part of the normal density curve bounded by the two limits. integration such as z x
we remember
z2
.
This
is
easily
that the integral between
Standardized variables. in a satisfactory
The function 0(z) given
form for general tabular
listing
numerical value for each different specific convenient to standardize the variable,
i.e.,
set
in Eq. 4-11
is
not
because h has a different
of measurements.
to use either hz or zja
(=
It
is
V2 hz)
instead of just z; then, in terms of either of these forms of the variable, the
error function
is
invariant to different values of h (or of a).
The two most popular forms of the
invariant function for computational
purposes are
»
= 4= \e~ x%dx
(4-14)
y/TT JO
where x
=
hz,
and erf (0
=
erf (-j
=
-L
+ I
V
<2/2
dt
(4-15)
no
Probability and Experimental Errors in Science
where
/
=
= \2 hz.
z\a
in reference to Eq. 4-15.
The term "error function"
To
out that, in Eq. 4-14 where x
=
4-15* where x
Eq. 4-15,
if
=
=
z\a
is
used specifically
we
aid in the ready use of the tables,*
Vlhz,
=
O(x)
hz,
erf(f)
=
the integration limits are
=
0.8427 for x
0.6827 for to
=
=
t
1;
=
1
and also
—
instead of from
/
point
in Eq.
;
t
to
in
+
f,
(Note that x here is not the (/) value of a single measurement, as in Chapters 1, 2, and 3, but is a standardized deviation.) Table 4-4 lists some values of \ erf (/) from to /. \ erf
4-5.
0.3413,
viz.,
0.6827/2, for
t
1.
Precision Indices
To use Eq. 4-14 or 4-15 in a typical measurement problem, we must know two parameters. First, we must know the central location value, the value at which z = 0. This is usually taken as at the arithmetic mean of the set of n observed trial measurements. Then, we must know one or more of
For example,
the dispersion precision indices.
deviation s
is
known from
if
the standard
the n observed measurements, a satisfactory
estimate of the universe standard deviation a
obtained from the
is
relation
"fcF as discussed in connection with Eqs. 2-22 ical value
of a we
standard variable
make
may proceed zjo,
s
and
3-98.
Knowing
the numer-
with the change of the variable
or to zh since
we know
that
a =
z to the
l/(/rV2),
and
use of Eq. 4-15 or 4-14 respectively.
Dispersion indices other than a and h are
common,
e.g.,
mean
the
and the probable error (or some other confidence limit). For a mathematical model of the frequency distribution, such as the normal distribution, a simple numerical relation exists between each pair of
deviation
z
the various dispersion indices.
Mean
deviation.
The mean deviation
z is
taken without regard to
the algebraic sign of the individual deviations, as discussed in Chapter 2 * B.
O. Peirce,
128, uses the
A
Short Table of Integrals (Ginn
form of Eq.
&
Co., Boston, 1956), 4th ed., p.
4-14.
H. B. Dwight, Tables of Integrals (Macmillan Co.,
New
York, 1957), 3rd
ed., p.
275, uses the form of Eq. 4-15.
The Handbook of Chemistry and Physics (Chemical Rubber Publishing Co., 1956), 38th ed., uses the form of Eq. 4-15 with the integration from
—
/
to
/
instead of
from
to +t.
Tables of Probability Functions, Vol.
1
(Federal
Works Agency, Work
Projects
Administration, 1941, sponsored by the National Bureau of Standards)," uses Eq. 4-14.
Normal
Probability Errors
Table 4-4. Error Function
171
},
erf
(t)
from
to
G(t)"=(l/\ 2^) e -<-/•;
t
and Ordinate Values
Probability and Experimental Errors in Science
172 this case
because we already
know
easy one to evaluate, and
we
that the probability distribution
The remaining
properly normalized, Eq. 4-12.
integral in Eq. 4-16
is
an
is
find that
2 tn
— =— h^n
0.564
1
=
=
(4-17)
h
Our
best estimate as to the value of z lh
usually taken to be
from
in
Chapter
2,
mean
the
used by experimenters, not only because calculate.
It
was
inefficient index.
the
mean
the experimental value,
is
2
2th*H—^— j As mentioned
z,
-
,,
,
5
(4-18)
deviation z it
is
rather
commonly
an easy dispersion index to
is
also stated in Chapter 2 that the mean deviation is an And, as mentioned before, in addition to its inefficiency,
deviation does not lead to a useful general rule for the propa-
The standard deviation
gation of errors in a derived measurement.
is
generally a preferable index.
Standard deviation. The square root of the mean squared deviation normal distribution is written as
for the
(4-19)
where, again, the indicated normalization This expression
case.
may
be altered
and the integration performed by Then,
parts.
first
2
J
integration,
°°
2
= \-Tr
term on the right vanishes at both
definite integral
=
we encountered
we have a
=
z
~ 2
The
(Write u
,-
,
«
not actually necessary in this
is
slightly,
in
— /
=
N /2
limits;
Eq. 4-13.
=
»
and dv
=
ze~ h
*
z2
dz.)
-,,i-2
r
the second term
is
the
After carrying out the
(4-20)
h
This expression we already knew, Eq. 4-8, from the derivation of the normal
frequency distribution;
its
derivation here merely checks the various
arguments and gives us practice
in their use.
Normal
Fig.
tion
Probability Errors
The
4-5.
known
173
particular devia-
as the probable error,
pe, divides the area £:£:}.
— pe Probable error.
The probable
deviation that divides the
±pe
ing a deviation within
=
This
is
indicated in Fig. 4-5.
In other
the particular value of z for which
is
\, viz.,
erf(z)
=
A1
=
2
This integral
and we
Thus, the probability of observ-
parts.
is \.
words, for a normal distribution,/^ erf(z)
defined as the particular
is
(or right) half of the area under a frequency
left
two equal
distribution curve into
error, pe,
+pe
easily evaluated
is
*x a = li(ve)
2
f
-v/7T
JO
4
V'
from a
x e~ "dx
table of values of error functions,
find
pe
The probable
error,
=
0.4769
=
0.6745(7
(4-21)
having equal positive and negative magnitudes,
is
a
dispersion index that can be indicated on the graph of symmetrical distributions only.
It is
hardly the
less useful,
although not pictorial, as a
dispersion index for asymmetrical distributions.
Indeed, the probable an index rather commonly used by experimental scientists, although statisticans always prefer the standard deviation. It is important to note
error
is
that the numerical relation between the probable error
and any other
index depends specifically upon the shape of the distribution; the numerical relation in
Eq. 4-21 between pe and a holds specifically for the normal
distribution.
Confidence limits the
90%
confidence limit,
the probable error.
0.90
=
The probable
in general.
fidence limit by definition;
may
Jtt Jo
error
is
the
50%
other confidence limit,
cone.g.,
be deduced in the same manner as that for
Thus, for the
—
Any
see Fig. 4-5.
90%
e~
x
limit,
dx,
90%c.l.
=
-^
(4-22)
h
In terms of confidence limits in the normal distribution, the precision indices correspond to the per cent limits as listed in Table 4-5.
174
Probability and Experimental Errors in Science
Table 4-5. Numerical Relationships between Various Dispersion Indices
and the Confidence Limits for the Normal Distribution Dispersion Index
Normal 4-6.
Probability Errors
ITS
Probability for Large Deviations
In a normal distribution the probability that a deviation
served equal to or greater than
>
G(\z\
This probability
\h\)
some
=
particular deviation
-f=\
e~
.2 2 hz
will
\z\
\z
x
\
is
be ob-
given by
dz
may be more easily evaluated, depending on the particular
tables available, if the limits of integration are
written as G{\z\
>
\z \) x
=1-
~ ?h
changed and the expression
<.
fZ*
e~
h
2
2
dz
>
(4-23)
Jtt Jo
A few calculations with Eq.
4-23 are listed in Table 4-6.
convenience, the independent variable
is listed
as \zja\-
In this table, for
Note from the
Table 4-6. Probability for Large Deviations
Odds
Odds G(\ z \zJo\
0.6745
\
> N)
(%)
against,
tol
G(\z\
\zJo\
>
(%)
\z
x \)
-
against,
tol
Probability and Experimental Errors in Science
176
the standard deviation).
do "justice" to measurement be rejected? In order to
The experimenter
faced with the question:
is
measurements as a whole, should the "bad'
his
Before seriously considering rejection, the experimenter should do the following.
make
he should
First,
additional measurements
if
at
all
possible so as to lessen the relative influence of the divergent value or else
to reveal
more convincingly
it
as being "bad."
Second, he should
make
every effort to find a blunder or a transient systematic error that might be responsible for the discordant value.
Many
been made
reasons for the divergence beyond
that
in searches for possible valid
owing to randomness. There
is,
important discoveries have
for example, the
discovery of argon by Lord Rayleigh.
He noted
famous case of the
a discrepancy between
from air and that of a sample produced chemically. It would have been easy for him to reject immediately one of his results as having been caused by some unidentified mistake. Sometimes, confronted with the question of what to do with a divergent value, the experimenter uses the median instead of the mean as the better location value and also as the reference value in computing the dispersion index, e.g., the "standard deviation from the median." However, a price the median is less efficient is paid for the safety afforded by this scheme the density of a sample of nitrogen prepared
—
than the mean, precision.
i.e.,
more measurements
Also, this procedure
same
are needed to obtain the
very unconventional in experimental
is
may be misunderstood. is so common that, as a general policy, some investigators take the mean of all but the highest and lowest values in each set of trial measurements. To resort to this science,
and
if it is
used the reported measurements
This problem of what to do with a large deviation
device
is
obviously
less
than honest, and, in
fact,
it
denies the fundamental
basis of statistical interpretations of precision.
Chauvenet's criterion for hunch or of general fear is not criterion
tive
proposed,
may set
all
is
better than
at all satisfactory,
trials shall
Many
none.
of them arbitrary.
The one due
be rejected
if its
to
Chauvenet
is
old but in
a is
This
larger does not exceed l/(2«).
significance level in rejection
script
sort of objec-
deviation (reckoned from the mean)
such that the probability of occurrence of
If the
and some
objective criteria have been
This criterion states that a measurement
serve as an example.
of n
Rejection on the basis of a
rejection.
parent distribution
is
is
is
all
deviations equally large or
not a good criterion because the
too sensitive to the sample
normal, the
size n.
critical rejection size z ch (sub-
"ch" for Chauvenet) can be computed for any value of n from the G(\z\
>
|zch|)
= -^ y/7r J*ch
e
'
dz
=
T~ 2n
(
4 " 24 >
Normal For
Probability Errors
is computed (from s) before the measurement in The need for the factor 2 in the coefficient of the
h
this calculation,
question
rejected.
is
integral in Eq. 4-24
as follows:
The
177
is
readily recognized
if
the rejection criterion
deviation, to be acceptable,
must
fall
is
restated
with the range
bounded by ±z ch if it falls ouside of this range, on either side, it is rejected. Note that as n increases the critical rejection size z ch also increases, and, for very large n, rejection of any measurement becomes very improbable, as it should. The dependence of z ch on n up to 500 is shown in Table 4-7. ;
Table 4-7. n
5
Dependence on //Zch
Zchjo
n of Chauvenet's Limiting Values hz ch z ch /a, z ch /pe ,
Zchlpe
n
hz^
z^\a
z cti lpe
Probability and Experimental Errors in Science
178
for different types of measurements. arbitrary at best,
is
of measurements. |2.5cr| is
As
a consequence, the criterion,
generally arbitrary in a different
way
We should especially note that the
for different types
region beyond about
any a priori case, we lose confidence that an adequate description of the parent population. important that the experimenter who rejects one or more
just the region where, in
the normal distribution Finally,
it is
is
measurements, and intends
results
his
to
be
significant,
statistically
should report very carefully the detailed conditions of the measurements, the total number of trials, the particular measurement(s) rejected, and the criterion of rejection,
4-7.
all this
as part of the reported final results.
Test of a Statistical Hypothesis: Example
The type of arguments made in the objective test for rejection of a "bad" measurement is also involved in the test for rejection of a statistical hypothesis. An understanding of this type of argument is essential in the statistical interpretation of the significance of almost any type of observation or theory. For practice, let us discuss now a simple example of a test of a statistical hypothesis. Also, this example will better prepare us to understand the y 2 test of the next section. Consider that a die has been cast 3 5,672 times and that either a 5 or a 6 1
appeared 106,602 times.
The hypothesis
to be tested
is
that the die
is
"true." In this example,
experiment
fit
we wish
to find out
satisfactorily a
whether or not the outcomes of this
binomial distribution where
//
=
315,672
and, according to the hypothesis, p = \. The binomial expectation value for success, i.e., for either a 5 or a 6, is np = 315,672 \ = 105,224. This •
is different from the one actually observed by a relatively small amount, viz., 1378 about \\ %. This difference does not seem to be very much, but the question is, Is it more than we should expect on the basis of purely random outcomes of each cast of a perfectly true die? We can answer this question with rather satisfactory reliability by the following
value
—
argument. If many experiments, an infinite number in the limit, were to be performed with a perfectly true die, each experiment consisting of 315,672 casts of the die, there would be many different numbers of successes; in fact, with an infinite number of such experiments, the frequency
distribution
Eq. 1-20.
a
is
just the binomial distribution B(k\
The standard deviation
= Jnpq =
Jnp(l
Now, we may compare
-
p)
//,/?)
in this distribution
=
x/3 15,672
X
i
x
=
B(k; 315,672,
\),
is
5
=
264.9
the deviation of the result obtained with the actual
Normal
Probability Errors
with the standard deviation with a perfectly true die,
die, viz., 1378,
264.9.
179
We may
=
experimental standardized deviation (1378/264.9)
Our next 5.20(7, is
task
is
a reasonable one owing to statistical fluctuations alone on the
=
£,
is
true.
We
of having a deviation
315,672 casts?
is
k
true.
number of
=
What
is
the binomial probability,
is
or larger in a single
very small,
we
shall
"unreasonably" large and that the die
not true; but, on the other hand,
"probably"
ask,
this large
If this probability
the deviation 5.20a
the
5.20c
to determine whether or not this observed deviation,
assumption that the die with/?
viz.,
express this comparison conveniently by writing the
To determine
if this
probability
this probability,
successes outside the limits
is
"probably"
not small, the die
is
we must sum over
±5.20(7,
i.e.,
distribution
is
write, with n
B(np-
a very
=
1378
sum
is
simplified
good approximation
315,672 and/?
< k<
np
+
=
is
all
greater than
The
105,224 and less than 103,846, in the binomial distribution.
arithmetic in performing this
of
trial set
conclude that
by noting that the normal
to the binomial.
Then we may
\,
1378;
n, p)
= -==
e~
t/2
dt
tJItt Jo.20
= 0.000,000,2 using Eq. 4- 1 5 in which the standardized variable
is zja.
Hence, the chance
that a true die will give a result of 106,602 (or more) successes 1
in 10,000,000,
and we conclude that
either the die
is
is
about
not true or
else a
most unexpected event has occurred.* Since, as we have just shown, it is not reasonable for p to be -], it is instructive to extend this example to include the question, What is the reasonable value and range of values for
p
as judged
from the 106,602
The most reasonable value for/? is simply the experimental value 106,602/315,672 = 0.3377. The proper number of significant figures successes?
We must decide is our next concern. what numerical deviations are reasonable. One commonly employed criterion, an easy one to use, is that reasonable deviations must not exceed ±o\ The value of a is not sensitive to the actual value of/? in this example, and we may take it as 265. Hence, the limiting expectation values are 106,602 ± 265 and the "reasonable" limiting range of values of/? is ±0.000,84 as deduced from the calculation (106,602 ± 265)/3 15,672 with which to write this value of/?
=
0.3377
±
0.000,84.
Another commonly used criterion is that the limiting deviations are those for which the probability exceeds 0.01 (or sometimes 0.05). With *
This example could be rephrased to demonstrate the law of large numbers
effective
in
convergence of the expression for the experimental probability, Eq. 1-39.
the
Probability and Experimental Errors
180
this criterion, the limiting deviation
-%= V 277
dt
>
0.01
value 0.01,
we
find
'
<2/2
is
x
\
critical
of the error function.
(or
It
This
0.0022.
sometimes 0.05)
(4-25)
\z
x
=
\
2.516a from a table of values
follows that the limiting range of values of p
±0.0022 from the calculation (106,602
±
given implicitly by the expression
J *io
"
Using the
\z
Science
in
±
265 x 2.576)/3 15,672
the "reasonable" limiting range
is
if
we
=
is
0.3377
say that on the
average 99 out of 100 random observations (each observation consisting
of 315,672
trials)
are reasonable,
picions to the extent that
The
we
and
declare
that
it
criterion mentioned, viz.,
first
\z
that about 68 out of 100 are reasonable
This
a
is
much more
1
out of 100 arouses our sus-
to be unreasonable. x
\
=
±o,
is
equivalent to saying
and 32 out of 100 are unreasonable.
stringent requirement.
Let us quickly review what we have just done. In this binomial example, the position of the actual deviation, viz., 1378,
was found
to
lie
in the
of the binomial distribution for p = \. For this reason, it is unreasonable to say that the deviation is due to random fluctuations alone,
remote
tails
and so the
statistical
hypothesis that p
=
\
was
rejected.
Then, assuming
that the binomial model, with the experimental value of p, does
observations, In this
we found
example as
it
fit
the actual
the "reasonable" range of values of p. is
given, the statistical hypothesis as to the value
test. Suppose now that, instead of having only two possible outcomes of each cast, viz., a 5 or a 6 on the one hand and a 1, 2, 3, or 4 on the other hand, there has been recorded the number of
of/?
is
the only one
we can
times each of the six different sides of the die appeared.
Now we
can
test
whether or not the multinomial distribution agrees reasonably well with the observations as well as test whether or not each of the six values of/?
The test for each value of/? would proceed in the same manner as above described, the problem being treated as a binomial one. But the test of the hypothesis as to the model parent distribution is more involved in just the same way that the multinomial distribution is more complex is J.
than
is
the binomial distribution.
The
test
of a hypothesis that a particular
model distribution fits a given experimental distribution test for "goodness of fit" and is discussed next. 4-8.
Test of Goodness of
The frequency
distribution
Fit of
is
known
as the
a Mathematical Model
of a small number n of measurements
generally provides us with only sketchy information as to the parent distribution
may
suggest
whose characteristics we seek. The experimental distribution more than one maximum, will generally show asymmetry,
Normal and
Probability Errors
may
it
If n
is
may
not.
181
suggest either higher or lower
increased, these
nonnormal
The problem now
is
tails
than the normal distribution.
characteristics
may
disappear or they
to decide, having only a small n, whether or
not the experimental distribution can be satisfactorily assumed to be a
sample from a normal parent distribution. We shall mention two qualitative graphical of the goodness of
tive tests
sion here
is
fit
tests
and then two quantita-
of the normal curve. Although the discus-
normal model, the general methods of the any model.
specific for the
tests are applicable to
Graphical comparison of frequency curves. The observed frequency measurements is plotted with the normal-
distribution curve of the n trial
i.e., with each observed frequency divided by n. Then and the standard deviation 5 are computed. The model value of// is taken equal to m. The value of the model index h is obtained from a which, in turn, is taken as given by Eq. 2-22; then,
ized ordinate scale,
the
mean
m
sVl With plotted
\
n
I
/a and h known, the normal frequency curve is calculated and on the same graph paper as the experimental curve. The normal
A visual is of course centered about the experimental mean m. comparison of the experimental points relative to the normal curve affords the first test of goodness of fit. Figure 4-7 shows a typical example of the graphical comparison of an experimental histogram and the fitted normal curve. This comparison is sometimes extended so as to express the discrepancy as the percentage of "excess" or "deficiency" of the normal curve at each curve
experimental value of x or at the center of each classification interval.
By
this
extension the test becomes, in a sense, quantitative.
age discrepancies are large, the but
if
fit
the discrepancies are small,
of the model curve
is
If the
percent-
obviously poor;
we need some further arguments to help may be merely the fluctuations
us decide whether or not these discrepancies to be expected in a sample size n
normal model the x 2
distribution.
from a parent population of the assumed
The additional arguments
are
made
later in
test.
Graphical comparison of cumulative distribution functions: probapaper. The second qualitative test compares summations of the observed values with corresponding integrals of the normal curve. bility
The observed deviations order of
size,
z
with respect to the
mean
m
the largest negative value at the top of the
positive value at the bottom.
The
are listed in the
list
and the
largest
entire range of observed deviations
is
Probability
182
Fig. 4-7.
Normal curve
and Experimental Errors
fitted to
in
Science
an experimental histogram.
Fig. 4-8. fit.
Ogive curve for goodness of
Normal
183
Probability Errors
divided into
M intervals, where M
sary that
intervals be of the
all
about 10 or 20 or
is
same
size;
in fact,
so.
It is
not neces-
make
usually best to
it is
extreme ends of the range of deviations.
relatively large the intervals at the
No interval should be so small as to contain only one or two observations. •••,/,•• •, M. •, j, These intervals are numbered consecutively 1, 2, They'th interval has (/obs); observed values in it, and, of course, 2j^i(/obs)i •
=
the
n,
plotted,
number of measurements
in the set.
where
= 2 = ;
z l is
the deviation reckoned
the /th interval, large
This
/.)
is
The
Now the
points y ohs vs. z l are
i
2/obs
and
•
(z, is
C/obs),1
from the mean
large negative for small
(at z /
and
= is
0) to the center
of
large positive for
plot consists of points such as are illustrated in Fig. 4-8.
called the ogive curve.
frequencies normalized,
It
convenient to have the observed
is
divided by n;
i.e.,
in this case the
normalized
to 1 ordinate scale (fohs)jln goes from Then, on the same graph paper, the corresponding quantities from the fitted
normal distribution curve are plotted. These quantities are Z
yth
= JL\ e~^
dz
yJTT J -oo
where z is a continuous variable reckoned from the experimental mean m, and the parameter /; is also determined from the experimental measurements as stated above. Comparison of the experimental points with the theoretical curve allows a qualitative test of the goodness of fit. This comparison attempts to minimize the rapid fluctuations and to
show trends of agreement or disagreement over extended regions of the distributions. But in the tail regions, where our concern is often most acute, the ordinate scale
is
too crowded, and,
not satisfactorily sensitive in these regions.
way
scale is stretched in such a nonlinear
a straight line, then the test stretched ordinate scale
about the y
=
=
0.5 line
much
consequence, this
However,
better.
is
test is
the ordinate
if
that the y xh vs. z curve
becomes
Graph paper having such a
called probability paper;
is
and
is
in
it
is
symmetrical
from The comparison
linear in units of z\a in both directions
Probability paper
Fig. 4-9.
is illustrated in y between the observed points and the normal curve on probability paper
0.5.
can be made readily
in the tails
of the curve.
Probability paper can also be used conveniently to determine the
experimental values of the
from the center and the
fit
Skewness and of the normal
mean
m
and the standard deviation
s directly
slope, respectively, of the best-fitted straight line.
kurtosis.
Useful quantitative tests of the goodness of
distribution can be
made by comparing
the numerical
Probability
184 99.99
and Experimental Errors 50
P 6d
o
in
Science 0.01
Normal
Probability Errors
185
of about 30 measuremeuts. The x 2 test gives a single numerical measure of the over-all goodness of fit for the entire range of deviations.
minimum
The observed deviations intervals all be of the
that
it
least
same
but
size,
classified as in the
Again,
now no
it
is
procedure
not required that the
interval should be so small
contains less than about five measurements, and there should be at
about
(fobs)j
and
are ordered
for the ogive curve described above.
m
six
In this test, the observed frequency
or eight intervals.
the interval (Az) 3
compared with
is
the theoretical
model value
(/th ) ; (Az) corresponding to the center of the interval. (/th is given by the product of the model probability and the number of trials n.) If the interval 3
is
that/th cannot be assumed
so large
to be constant throughout (Az) 3
,
the frequency function should be integrated over the range of the interval,
but this
usually not necessary except in the
is
first
and
last intervals.
To maximize the statistical efficiency of the test, the square of the difference,
-
i-e-, [(/th),
quantity x
2
is
2
(/obs),]
>
is
and
taken,
2
x
this quantity is divided
by (/th),. The
sum
defined as the
= y 3
[(/°bs)j
=1
~
(f^)i]
(A-26)
(/th),-
Exact fit of the model or theoretical frequency curve to the experimental measurements would correspond to x 2 = 0, an extremely fortuitous event in any real-life situation because of statistical fluctuations even if the model fits perfectly the parent distribution of which the actual set of measurements Increasingly large values of x 2 involve the probability arguments concerning the question of what statistical fluctuations are reason-
is
a sample.
able even If the
may be
if
the
model
model does
distribution
fit
is
perfectly.
simply a uniform
2
X
= 2 = 3
in case
(flat)
distribution, Eq. 4-26
written as
no attempt
is
made
to
(Xj 1
~
m)2
m
group the n measurements into
(4-27)
intervals.
In a general view, the x 2 test determines the probability that a purely random sample set of measurements taken from the assumed model
show better agreement with the model than is shown by the actual set.* The type of arguments involved is essentially the same as was encountered in Section 3-3, in the / test and in the F test for consistency. Also, the parent distribution would
The x 2
test was introduced by Karl Pearson in 1900, and it is often called Pearson's However, in the form of Eq. 4-27 it was first used by Lexis in 1877, and is sometimes called the Lexis divergent coefficient when written in this form. *
X
2
test.
Probability
186
arguments are those involved of maximum likelihood, and
and Experimental Errors
in the derivation
in
Science
of Eq. 3-96 by the method
are very similar to those
made
in the rejection
of a "bad" measurement and in the test of a statistical hypothesis. As stated before, these arguments are fundamental in almost every statistical interpretation for significance of actual data.
But, although essentially
arguments as they are stated now the same difficult to follow through the first and more are a little more complex and review if necessary, the general time. The reader should keep in mind, examples. "philosophy" of the previous In order to obtain the theoretical frequency values/,,, devoid of random intervals, we imagine an infinite number of fluctuations, for each of the trials or sets of measurements, all known to be sample sets from a multias in the previous instances, the
M
In each trial the
nomial model parent distribution.
outcomes
very large; this
is
large but finite
if
distribution, or
normal
/-
number
is
r in Eq. 1-30.
number of
possible
taken as generally
r is
the y 2 test concerns a model having a discrete frequency may be infinite if the model is continuous such as the In
distribution.
any
case,
/•
is
subdivided into the same
M intervals
grouping of the experimentally observed frequencies. Then, from Eq. 1-30 with n very large, the multinomial frequency for the center of they'th interval, (/t)l ) ; is computed. This computation requires knowledge used
in the
,
of the
M different values of the probabilities p, in the multinomial distribu-
knowledge is obtained from the experimental measurements, from the mean m and from the standard deviation s. By coupling the respective observed and theoretical intervals, we determine the frequency difference (/obs — /„,), for each interval, and then the value tion,
and
this
specifically
The subscript "obs" is attached to this value of from the theoretical value discussed next, which is based
of £* hs from Eq. 4-26. 2 y to distinguish
exclusively
it
upon the model.
2 the exclusively theoretical value of y i.e., the effect of purely fluctuations alone, we look mere closely at the parent multi-
To deduce random
,
nomial frequency distribution during its analytical formation, e.g., as number of trial sets of hypothetical measurements builds up and
the
becomes infinite. The multinomial probability p i for the same value as was used above in determining hs
^
may be determined
value of this probability, as
of
trial sets
of hypothetical measurements.
values about this mean.
We
shall not
prove
they'th interval ;
this
is
is
mean number
the
with a very large
But there exists a spread of it
here, but this spread itself
has essentially a normal frequency distribution. We make a single random theoretical trial set of measurements, then determine the difference
between
this
and the mean and then finally determine random theoretical trial set of measurements. The
random frequency value
in they'th interval
theoretical frequency value in this interval,
the value of y l for this
Normal
187
Probability Errors
Fig. 4-10.
2 x distribution for various degrees of freedom
v.
value of x 2 is, of course, also one member of a frequency distribution that is spelled out and takes on its equilibrium shape as the number of random theoretical trial sets of
measurements becomes
The derivation of the % 2 frequency
variations in each of the possible outcomes,
multinomial distribution.
infinite.
distribution i.e.,
is
in
made from each
M
the
normal
interval, in the
The arithmetic involved becomes
and
tedious,
approximations are made similar to those in our derivation of the normal distribution. These approximations are reasonably good if a parameter v (defined
below)
is
greater than about
5,
and
if
the
number of measure-
5. These conimposed for similar reasons to those placed on the use of the normal approximation to the binomial distribution, viz., np and nq each greater than about 5. v is related to M, as is mentioned presently.
ments
in
each classification interval
is
greater than about
ditions are
The expression
for the x 2 frequency distribution is*
/(rw), =
(#^ 2
2
\\v
-
d(x
2
(4-28)
)
1)!
The form of this expression was derived in Section 3-5 where R 2 (= 2w) is written in place of x 2 and n in place of v; see Eqs. 3-95 and 3-96. The shapes of x 2 distributions for a few values of v are illustrated in Fig. 4-10. As stated above, the significance of the actual value of ;q, s is found in a
comparison with the theoretical value of x 2 f° r the appropriate * if
Note the close similarity of this expression to that for the Poisson distribution 1 and £t> ^> 1, the approximation to a Gaussian is also very good.
2 x ;>
This
v.
;
and
188
Probability and Experimental Errors
comparison is made with the particular value x under the x 2 frequency curve in a specified way, (or 95 to
5).
We
say that
^
bs is 2
for which the probability P(x
2
>
Science
which divides the area
e.g., in
"unreasonable"
in
the ratio 99 to
1
greater than x 2 '% (or 5%). By
if it is
2
) is less than "unreasonable" we mean of course that the mathematical model used in computing xlus probably does not "reasonably" fit the actual measure-
ments.
And
x
note that, for a given value of
a large probability
P
v,
a small value of
as thus defined.
Table 4-8. Values of Xc 2 where P
=
2
f( x
)
d{y})
^
bs
means
Normal Probability Errors
189
number of theoretical frequencies considered number of experimental values. This constraint, viz.,
the fact that the n,
the
i
fobs l
expressed as
= 2 = j
/th
=
n
1
A second constraint is introduced when model frequency curve. This constraint is in the model distribution be equal to the the condition that
inherent of course in the % 2 we locate the center of the is
limited to
M
M
2=
is
test.
/j,
mean
experimental
A
value m.
third constraint
is
introduced
when we
deduce and use the universe value of the standard deviation from the
These three constraints are usually all goodness of fit of a model distribution, and, if
experimental standard deviation. that are
made
in testing the
=
M—
3.
so, v
However, if the total number of measurements is not sometimes worthwhile to impose a fourth restraint, one in and the interval size are so chosen that the number of measureeach interval is constant and nearly as large as M. This condition
very large,
which ments
M in
it is
allows about the greatest validity of the % 2 approximations, viz., that v be greater than about 5 and that the number of measurements in each interval be greater than about
7.
The
interval sizes
however, without introducing a constraint
no
if
can be adjusted,
size is influenced
by any
equation or approximate relation involving the experimental values.
As an example of the % 2 of
Table 4-9
light.
test,
consider 233 determinations of the velocity
the frequency
lists
obs
the respective classified deviation interval.
measurement occurs
in
In this case the deviation
is
that a
reckoned for arithmetic convenience from an arbitrary reference value of 299,000 km/sec, although the origin of the normal distribution curve placed
deviation in
the
at
mean, 299,773.85 km/sec.
14.7 km/sec.
is
Table 4-8 to a
P
The .^ bs value of 29.10
for v
=
13 corresponds
probability of only about, 0.005, and, since this
we
is
The experimental standard is
less
normal distribution is not a good However, if those intervals containing a fit to the actual measurements. small number of measurements are grouped together (as indicated by the braces in Table 4-9), reducing v from 13 to 8, the #obs is 18.52 and the P than, say, 0.01 (or 0.05
probability
is
larger,
does not
fit.
prefer), the
about 0.018. This
of the approximations the normal curve
if
may
made
latter value
in the derivation
be said to
The formal fit may
fit
—
is
more
reliable in
view
of the/(;r) distribution, and
at least
we cannot be
very sure
possibly be even further improved by
it
more
This example emphasizes the arbitrariness of the an unambiguous "yes" or "no" answer is not possible. Finally, it must be pointed out that the x 2 test can be applied in the test of the goodness of fit of any type of mathematical model of probability. The/th of Eq. 4-26 must be calculated, of course, on the basis of the model
appropriate grouping. criterion of
fit;
Probability and Experimental Errors in Science
190
Application of the %z Test of Goodness of Fit of the Normal Distribution to Measurements of the Velocity of Light
Table 4-9.
Deviation
Normal
Probability Errors
191
arranged into different groups of intervals either by their relative size or by their total number. Indeed, illustration of this fact was just given in the velocity of light example.
This
the measurements themselves
1
partly because of the
random nature of
partly a consequence of the approxi-
the derivation of the x 2 frequency distribution. largely for such reasons that we set the "rejection ratio" so low, say
mations that are made It is
is
and
in
or 5 out of 100, instead of near the ratio corresponding to the standard
deviation, about 32 out of 100. Statisticians
of
fit,
4-9.
have developed additional quantitative
tests
of goodness
but they are seldom used by investigators in experimental science.
Conclusions
There are
many
direct trial
measurements
in
experimental science that
are fitted "rather well" by the normal (Gauss) distribution. These include
the host of measurements that differ one from another in an essentially
continuous manner as a consequence, apparently, of a large number of small elementary errors.
The
fit
is
typically less
good
in the tails of the
presumably due to the fact that the idealized set of conditions on which the normal distribution (central limit theorem) is based is not quite realized in real life. However, to the extent that the fit This
distribution.
is
satisfactory,
is
i.e,
that the parent distribution
form of the distribution function allows the probability that any measurement,
(a)
is
normal, the analytic
very convenient predictions of
either past or future, will have a
value within a specified range, (b) simple and convenient rules for the
propagation of errors
in a derived
or computed measurement,
(c) rules for
assigning weights to measurements, (d) convenient equations for curve fitting, etc.
Such predictions, calculations,
etc.,
are of such great conven-
ience in the statistical interpretation of measurements that there
is
a rather
strong tendency for scientists to accept uncritically the normal distribution as the ansver to their prayers.
A
loud note of caution must be sounded
typically not very
good
But
is
in
pointing out that the
fit
is
Almost any set of trial measurements is generally bell shaped in the central region, and if interest in the statistics of the set is not very quantitative the normal approximation suffices. if
the interest
in the tails.
precise or if the
tail
regions are of special concern (as
"bad" measurement"), a specific test of goodness of fit must be made and the reliability of the normal approximation judged accordin rejecting a
ingly.
In addition to its degree of quantitative its
use therein, the normal distribution
is
fit
to direct
measurements and and valid
the one of most general
use in statistical theory in dealing with certain parameters of empirical distributions.
This application of the normal distribution has been noted
Probability
192 in the first
now add 4-10. 1.
2.
this
in
chapter; to this earlier paragraph,
the x 2 test for the goodness of
Science
we must
fit.
Problems
Show
points at
paragraph of
and Experimental Errors
z
that the curve of the
—
normal distribution formula has
inflection
±a.
The times recorded by 37 observers of
to the nearest 0.1 sec as follows:
Observers
a certain
phenomenon
are classified
Normal Probability Errors
193
As an example in the normal curve approximation, suppose that the marksman will hit a target is ^ and that he takes 12 shots. Compare the binomial probability with the normal probability that he will score (ans. about J % discrepancy) (a) at best 6 hits, and 6.
probability that a
(b) exactly 6 hits. 7.
with h
—
0.447 reciprocal seconds, assume a normal distribution and find
(a) the probability
1.0
about 5 % discrepancy)
(ans.
In a series of observations of an angle taken to tenths of a second of arc
and
1.1 sec,
that the next observation will have an error between
and (ans.
depends on interpretation of "between", e.g., 0.0204) than ±3 sec.
(b) the probability that the error will not be greater
(ans. 0.9421) 8.
If \\h is
2ft and the least count
is
in.,
1
what
is
the probability that 3
randomly chosen measurements, regardless of the order of their taking, will have deviations of 8 in., 16 in., and —4 in.? What is the probability if the order is specified? Assume a normal distribution and assume that the mean is at the center of the least count interval.
[ans. P(8)
=
i>(_4)
Show
=
0.023,
0.022, i»(16)
P^Pa =
8.1
=
0.016,
x 10~ 6 ]
normal distribution is equal to 3. quoted for the rest mass of the electron m =9.1154 x 10" 28 (1 ± 0.00018) g, of which ±0.00018 has the significance of a fractional probable error. Determine the probability that the value quoted (a) is correct to within 0.0005 x 10~ 28 g, (ans. 0.162) 9.
10.
A
(b)
is
(c) is
that the kurtosis of the
value
is
correct to within 0.00010 x 10~ 28 g,
and
(ans. 0.0325)
not correct to within 0.001 x 10~ 28 g.
11. In a breeding experiment,
it
(ans. 0.682)
was expected that ducks would be hatched
in
duck with a white bib to each 3 ducks without bibs. Of 86 ducks hatched, 17 had white bibs. Are these data compatible with expectation? Do the observations prove that the expectation was correct? 12. Should the last recorded observation in the data listed in Problem 2 the ratio of
1
be rejected according to Chauvenet's criterion ? 13. (a)
Compare on
distributions for B(k; (b)
Why
14.
From
probability paper the binomial and normal probability 100, 0.3)
and G(z;
hi)
as listed in Table 4-2.
not practical for a bookstore to stock probability paper for other model distributions than the normal ? is it
past experience, a certain machine properly operating turns out
items of which 5
%
are defective.
On
a certain day, 400 items were turned out,
30 of which were defective. (a) If a
normal distribution
is
assumed
ordinates of the plot of the distribution (b)
What
is
in this
problem, what are the co-
?
the probability that the machine
was operating properly on
this
day? According to Mendelian inheritance theory, certain crosses of peas should and green peas in the ratio 3:1. In an experiment 176 yellow and 48 green peas were obtained. 15.
give yellow
Probability and Experimental Errors in Science
194 (a)
Do
these
conform
to theory?
peas conforms to the theory
if it is
Assume
that the observation of 176 yellow
within 2a of the expected value. (ans.
(b)
Show
that about 95
%
of the normal area
bounded by
is
conforms)
2a.
2 16. (a) Apply the x test to the fit of the normal curve to the following 500 observations of the width of a spectral band of light:
/obs =5 /th =5
12
43
61
105
103
89
54
19
7
2
14
36
71
102
109
85
50
21
7
2
2/obs 2/th
Here/th denotes the fitted normal curve frequencies obtained by mean and the standard deviation from the actual measurements, (b) What is the significance of the difference (2/ h — 2/obs )?
= =
500 502
estimating the
t
How
would you determine whether or not 100 given measurements (each measurement expressed, e.g., with 5 significant figures) (a) are approximately random, and (b) fit as members of a normal (Gauss) distribution ? 17.
"Lest
5
men
suspect your tale untrue,
Keep probability
in
view."
JOHN GAY
Poisson Probability Distribution
5-1.
Introduction
In the preceding chapter the
normal (Gauss)
distribution, Eq. 4-9,
was
discussed as an approximation to the exact binomial distribution, Eq. 1-20.
A more or less paralleling discussion is to be made for the Poisson distribution, so First,
we shall quickly review the line of argument involved. remember that the normal distribution plays a role of
fundamental importance application to
many
sets
in
statistical
theory in addition to
of direct measurements.
This
is
its
great direct
not so for the
Poisson distribution, and our discussion in this chapter exclusively with applications to those sets of direct satisfy
The
reasonably well the Poisson conditions. algebraic expression for the
the particular conditions that the
and
is concerned measurements that
normal distribution was derived under
number n of Bernoulli
trials is
very large
p in each trial remains constant. The first practical advantage of the normal approximation in dealing with direct measurements is in the greatly simplified arithmetic when n is large the factorial numbers, the fractions raised to high powers, and the tremendous algebraic summations are avoided. But the most significant advantage in that the success probability
—
dealing with direct measurements
and q appear
first
that the Bernoulli parameters n, p, in the product np, the location value, and then in
the triple product npq.
is
This triple product, generally considered to be the
only parameter in the normal expression, distribution,
i.e.,
npq
=
a
2 .
An
is
equal to the variance of the
estimate of the parameter a
from the standard deviation of the
set
significance of the individual Bernoulli parameters
unidentified
and
is
obtained
of actual measurements; and the is
then relegated to the
little-understood elementary errors that are believed to be 1
95
Probability and Experimental Errors in Science
196
unavoidably present with different net ments.
effects in successive trial
With a evaluated, the simple normal formula
measure-
of inestimable
is
aid to the investigator in designing his experiment, in allowing predictions
of the probability of future measurements, and in judging the "reasonableness" of past measurements.
The Poisson
distribution
may
an approximation to
also be derived as
Again, a single parameter
the binomial distribution.
is
involved whose
direct experimental evaluation, without regard to the values of the separate n, p, and q, allows very useful application in the measurements and in the design of experiments. In this case, however, we can often recognize the basic Bernoulli trials and evaluate n, p, and q from them; but often these basic trials remain hypothetical, as they are in the normal case. When the basic Bernoulli trials are recog-
binomial parameters
analysis of
nized, their characteristics
may justify
Poisson formulas; otherwise a
test
immediately the application of the
of goodness of
fit,
such as the x 2
test,
must be made.
Rare events.
The Poisson approximation holds when
the following
three conditions in the binomial distribution are satisfied: (1)
//
very large, infinite in the limit,
(2)
p
very small, zero in the limit, and
(3) the
product np moderate in magnitude,
Thus, on the average,
many
called success appears.
known
i.e.,
that np
< Vn.*
Bernoulli trials are required before the event
For
this
reason the Poisson distribution
is
often
as the formula for the probability of rare events.!
There are statistical
many examples
analysis,
of rare events for which we wish to
make
and arguments
as to
predictions of future events,
As illustrations, we may mention such classical problems as the fractional number of soldiers who die each year from the kick of a mule, the number of atoms that spontaneously decay per unit time in a reasonableness.
man of age 25 will die at uncommon noncommunicable disease
radioactive sample, the chance that an average
a specified age, the incidence of an (such as polio) and
its
response to large-scale vaccination treatment,
and the number of houses per thousand burned by fire per year. Typical rare-event problems are discussed in detail after we understand, first, the essential Poisson equations and, second, the order of magnitude of the errors involved in the approximation. *
k
A
better statement of the third condition
~ np, the
left
side need not be
much
less
is
that k 2
+
(np)-
<
n,
and then,
if
than the right side.
t The Poisson distribution is often improperly number of successes need not be small when n is
called the law of small numbers.
very large.
case of a "spatial distribution" of events, as pointed out
This
later.
is
The
generally so in the
197
Poisson Probability Distribution
5-2.
Derivation of the Poisson Frequency Distribution Function
The derivation of the Poisson function may probability equation, Eq. 1-20.
(Eq. 2-26),
we
B(k;
write Eq. 1-20 in
n,
P)
=
Noting that/? the form
start
+q
(jW- = "-f^M" -
2)
=
with the binomial 1
=
and that np
•••(«-
fc
+
[j,
1)
k>.
/ff(,
'H)H)-(<-^) k\
1
--
v
i
(-3(«-i)
11
-
-
(5 -°
-J Under
the Poisson conditions, viz., n very large,
product np of moderate magnitude, the unity,
and the
last
this exponential
power
first
p
very small, and the
fraction in Eq. 5-1
is
essentially
factor can be approximated as an exponential.
To show
approximation, write an exponential in the form of a
series,
^ = l+A + l + - + - + 2
0!
1!
2!
and write out the binomial expansion,
e
-- =1 nj
—J
+
3! e.g.,
4-i)er 2!
+
•••
: --
(5-2)
4!
from Eq.
4
1-21,
-;)(> 3!
± n'
-m
Probability and Experimental Errors in Science
198
where the sign of the last term depends on whether n By comparison, it is apparent that B
lim
(
1
«/f) nl
n-oo \
With
the
first
=1
2
3
2!
3!
even or odd.
is
_£ + AL _AL + ^---=e-" 1!
(5-3)
4!
and with the last factor becomes the Poisson prob-
fraction of Eq. 5-1 written as unity
as an exponential, the binomial probability ability, viz.,
P{k;
^—
t
x)=
f
(5-4)
k!
Equation 5-4
is
called the Poisson frequency distribution or "density"
Note that
function.
it
contains only one parameter,
The Poisson cumulative distribution function following sum at the desired value of k:
a-
=o
fc=o
shown
It is
!i
1
is
—
;
e~
for ll \
n\
presently that this sum, as written,
probability for observing fxe~
2!
1!
fc!
two
successes,
no "success" (/u
2
e~'')l2\;
is
is
simply e for
etc.;
forexactly k successes, Eq. 5-4;
/x.
given by stopping the
is
etc.
equal to unity. - '';
The
for one success,
more than one
Stirling's
success,
formula, Eq. 1-14,
when k is greater than about 9. number N of trials in an experiment may be
often a help in evaluating k\
In actual measurements, a
made, each trial involving n basic Bernoulli trials. Equations 5-4 and 5-5 give the normalized probability or probabilities, as is expected of a model, and comparison of the calculations with the observed frequency or frequencies requires that either the observed values be normalized by dividing by TV or that the calculated values be multiplied by N. But note the difference between the definitions of
N and
of
n.
Shapes of Poisson frequency distributions. The shapes of the frequency distributions (or histograms) represented by Eq. 5-4 are essentially the same as the shapes of the binomial distributions for /; large and p small. (Binomial distributions are illustrated in Figs. 1-2, 4-1, and 4-2, and
also in
Problem 6 of Section
but approaches symmetry as
ju
1-9.)
The shape has
increases.
When
fx
a positive skewness, is
rather large, the
between the Poisson and the normal distributions; in this region we may generally approximate the binomial by using either Eq. 5-4 or Eq. 4-10, whichever offers the simpler arithmetic. The most probable value of A:, viz., k is generally less than the expectation value /x but never differs from it by more than unity. This was proved
distribution
is
in the transition region
,
in
a footnote in Chapter
double-valued,
i.e.,
1,
p. 32.
If (n
+
\)p
adjacent k values at
/x
is
equal to an integer, k
and
at
/x
—
1
is
have equal
Poisson Probability Distribution
Problem 14 of Section from Eq. 5-4 that
probabilities [see It
199 1-9].
readily follows
p< + '*
* = -jl.
fc
+
k
P{k\fi)
(5-6)
l
which indicates conveniently the rate with which P(k; everywhere in the distribution. Poisson
to
normal
distribution.
It
fi)
instructive
is
varies with
derive
to
k
the
normal density distribution from the Poisson equation, Eq. 5-4. To do this we may define the normal variable z in terms of the Poisson variable as k,ifk 1 and ft > 1
>
,
-k-
z
k ** k
-
-
(fi
(5-7)
\)
and by Eq. 5-7 Note we match in location the most probable value of/: with the normal maximum. The means of the two distributions thus differ on the average by \. If k > 1 and fi 1, the term \ is of no practical consequence (except for k very close to ft). With Eq. 5-7 we may write that the Poisson distribution
is
intrinsically discrete,
>
Wo +
+ Zp-"
//"O
= ?,
*); /*)
log, P((k
(kQ
+
,,
=
.*
+ z)\
=
z); fi)
Pfro;
a«)
(*b
P(k
log,
+
fi)
;
+
+ 2)"-(*b +
lX*b
log,
z
z
-
) (f \kJ
log, (l \
»)
+ f) k n
-
z
1+-
-•••-log, \
A:
and fi are very large compared with unity; by expanding the logarithms in a power series, Eq. 4-5, neglecting terms 2 and higher powers, we find [(ft — k )lk ]
In the normal case, k, k
+ 2);
log, P((k
/.)
,
«
log,
P(k
;
fi)
+
Z{fi
~
K)
- ^-^-
}
2k
k
Then, with the definition of
+
P((/c
By
z); //)
z in
w
terms of
P(fe
the property of normalization,
C
2=
e"
?2/2A'°
;
as given by Eq. 5-7,
ft
u>- j2/2i
»
/
C may
=C
e~
= Ce^2*
be determined as follows, z2/2k0
dz
=
1
fc
and, since q ph
1,
and by Eqs. 1-24 and
4-8,
£
1
/i
yj2TTk Q
yJlT
C = -=L= = -4
«s* 77/N7
=
1/2//
2 ,
(5-8)
Probability
200
and Experimental Errors
in
0123456789 Fig. 5-1. Binomial
and Poisson distributions for n
=
12,
p
=
10 \.
123456789 Fig. 5-2. Binomial
and Poisson distributions
for n
=
96,
p
Science
=
10 1/24.
Poisson Probability Distribution
We
201
conclude that
+ z);
P((/c
p) t*
P((fx
+ 2);
//)
^
G(z; h)
=
-£= e-*
v which
is
when Az
the
=
same
v
(5-9)
77
normal density function, or as Eq. 4-10
as Eq. 4-9 for the
1.
Normalization.
That the Poisson distribution is normalized, i.e., outcomes in the Poisson distribution is unity as n —> co, is assured by the fact that it is derived from the normalized binomial distribution, but this assurance depends upon the validity of the approximations made in the derivation. However, it is easy to show directly that the
sum of all
possible
that this distribution
T
n
P(k; u)
=
rt%
As
indeed exactly normalized.
is
n i
e~>
J fc
k
=
£-
L0!
n -> oo, the factor in brackets
lim n-»oo
5-3.
Errors
When
n and
£ + £ + £+£ + ...+-£"
£?""
=ofc!
Write Eq. 5-5 as
is
'
1!
'
2!
'
3!
' '
of the same form as Eq.
]T P(k; n) =
= e'^ =
n!
5-2.
Hence, (5-10)
1
fc
in
the Poisson Approximation
p
are both
finite,
made in using the relatively cumbersome but accurate of course, when the Poisson conditions
the errors
simple Poisson expression, instead of the
binomial expression, are significant, are poorly approximated. Table 5-1.
k
B{k; 5,|)
For example,
Errors
in
B{k;
if
n
=
12 and/?
=
\,
the errors
the Poisson Approximation 10, to)
B(k; 100,
j^)
P(k;
1)
Probability and Experimental Errors in Science
202
=
with np
11
=
in all three cases.
1
the Poisson approximation the order of 100 or rule of
thumb
stated as k 2
—
as (k
To
evident from these examples that
more and
if/? is less
most applications
than about 0.05.
This
but, as noted earlier, the condition for small errors
+
(up) 2
n.
n
if
is
is
of
good
a
better
is
can be seen that the relative error increases
It
fx)fn increases.
Precision Indices
5-4.
must
It is
satisfactory for
is
predict probability first
know
from the Poisson
relations, Eqs. 5-4
We
the magnitude of the parameter p.
method of maximum
5-5,
we
likelihood that in the binomial distribution this
parameter, the arithmetic mean,
mean of
that in the Poisson case
show
for practice let us
k values
all
We
equal to the product np.
is
assumed above, with perfect validity, and is also the arithmetic mean. Just from Eq. 5-5 that p is the mean. Let the arithmetic
and
have shown by the
have np
=
p
directly
m th
Eq. 5-5 be denoted by
in
.
Then, by definition,
lk = ^" I— = *~> 2 — 2— — n
n
n
m th = 2
=
kP(k; p)
e -»
i=o
=o
a-
k=i
fc!
H*
n
k
i
k\
k
_l
= i(k
1)!
(5-11)
where the lower
limit
sum
of the
changed from
is
presence of the factor k in the numerator (the
first
to
1
because of the
term, for k
=
equal
0, is
and where, in the last step, we divided through by k (k > 0) and substituted p p'~ l for p k * Then, as n -> oo, the sum is equal to the exponential e as may be seen when e is written as a power series, Eq. 5-2. to zero),
1
1'
',
Hence, Wth
=
lim n -»
'
2=
/,-
kp ( k
'>
/")
= e'^p =
(5-12)
p
o
This conclusion was also reached by the method of
maximum
likelihood
Section 3-1, Eq. 3-17.
in
Standard deviation. distribution, a
since q «*
1,
=X
npq.
We know Hence,
*
Eq.
The argument
gamma 5-1
1
is
in
is
sometimes made
binomial
that, in the
the Poisson distribution, a t&
and, by Eq. 5-12 and the relation
o
the
from Eq. 2-28
p
=
= ^p
that, since
X np
np,
(5-13) (
— 1)! = x
function, see discussion attending Eq. 1-16), the
equal to zero and that, hence, the lower limit of the
changed from to 1. But this argument would have through by 0, which is not allowed.
(which can be shown by
first
term of the
last
us, for the
last
sum may k
=
sum
in
as well be
term, divide
Poisson Probability Distribution
where the
show By
it
=
sign
2
= I (* - vfP{k; *=o A-
—
>-
strict
Poisson conditions. But we shall
definition,
=| since
used for the
is
directly.
o
«
203
(A:
2
^
2
2
P(Ar;
)
p)
=
= f W(fc; =
Zkp
+
-
pt)
/r)P(/<;
A«)
2
(5-14)
/<
fc
^£% =Q kP(k; p) oo
-
= 2 (k t=o
ft)
=
by Eq. 5-10.
by Eq.
/u,
and
5-12,
since
2fc=o^°(^
(Incidentally, this expression
A*)
=
f° r
1
one form of the
is
general relation, irrespective of the type of distribution,
a2 where k
2
mean of
the arithmetic
is
= ie-^
arithmetic mean.
This expression
and the next time
as Eq. 2-19.)
(5-15)
the squares,
first
appeared
Equation 5-14 can also be written with (k(k
k
=o
k\
fc=2
—
Because of the factor k(k
1) in
and
ju
in this
2
is
—
1)
+
(k
—
2)!
the numerator of the
the square of
book
as Eq. 2-12,
k) for k 2
first
;
then
sum, the lower
sum may just as well be written as k = 2, as is done in the second sum. Then the sum is equal to unity (for n —> oo) by the property
limit of this
of normalization, Eq. 5-10.
a
=
2
Note that the
2
ft
+
In conclusion,
ju
—
2 /li
=
and
ju
a
=
yjfi
(5-16)
and negative values of a have no simple interasymmetry of the Poisson distribution, a is an rms deviation having an asymmetrical graphical interpretation. The rms of all positive deviations is greater than the rms positive
pretation as actual deviations because of the
of
all
negative deviations.
To show by
the
method of maximum
likelihood that a
posed as a problem for the reader.
(It is
= V npq
in the
27 in Section 3-11, that a
Fractional standard deviation. the most
commonly used
The
shown by
this
= A up = \ /u is method, Problem
binomial distribution.) fractional standard deviation
is
precision index in those "counting" measure-
ments for which the Poisson model is a satisfactory fit and for which is rather large by the design or analysis of the experiment. The fractional jli
standard deviation in per cent fractional a in
is
defined as
%=
- x 100
= -^
x 100
(5-17)
204 It is
Probability and Experimental Errors in Science the simple inverse relation with
V/u that makes the fractional a so
popular.
When single
/u is
moderately large, not
measurement k s
(fractional a) of, say,
much
1
error
is
introduced by writing a
In this case, in order to have precision
in place of//.
%, we must observe k s
=
10,000 "successes."!
Standard deviation in a single measurement. Even when the single measurement is not large the question often arises. What is the precision in this single measurement? This question may be stated another way: If a highly precise value of m (sa //) is known, as from many trial measurements, what is the probability that the single measurement k s and [x will differ by a given amount? To answer this question, we may consider m, instead of the single measurement k s to be the variable in P(k m). Note that m is a continuous variable although k is discrete. The likelihood function L(m) for a single measurement k s is ,
;
Urn)
=
^—
(5-18)
kg.
Following the procedures of the method of lined in
Chapter
we
3,
log
/dm
maximum
likelihood as out-
write
L(m)
[log L(m)-]
=
m — m —
k s log
=-m
log k s
(5-19)
\
s
(5-20)
1
and
^ [logL(m)]=--^m dm 21
To
(5-21)
2 l
m* from Eq. 5-20, measurement k s we write
find the best estimate
from a
single
i.e.,
the
most probable value
,
^-[logL(m)] m=m .
=
m*
=
(5-22)
dm This result is
is
the one
we would
ks
(5-23)
expect for a symmetrical distribution, but
not necessarily the expectation for an asymmetrical distribution;
remember, our only knowledge
To |
is
from a
find the standard deviation in a single
Such a large k or
/<
the Poisson distribution
but
measurement, A ,. measurement, a,. we combine
single
,
does not invalidate the Poisson condition that up be moderate; is
not one of small numbers, only one of rare events.
205
Poisson Probability Distribution
Eqs. 5-21 and 3-25 with the reasonable assumption that the values of L(m) are normally distributed about
a*
=
1
m* (= k from s
Eq. 5-23):
Probability and Experimental Errors in Science
206
Probable error.
By
definition, the
probable error, either positive or
marks the + and — limits in any distribution such that the number of measurements between the respective limit and the mean is equal to the number beyond the limit, i.e., the 50% confidence limit on either side of the mean. The median value is always enclosed within these ±pe limits, and, in most distributions of interest, the mean value m or fi also lies somewhere between +pe and —pe. In a symmetrical distribution, the positive and negative values ofpe are equal in negative, viz., ±/?e,
is
Table 5-2.
that error that
Numerical Coefficients pe/o for Various
Values of
/<
the Poisson Distribution
in ft
20
peja
207
Poisson Probability Distribution
We
can also write
-L
(5-30)
the appropriate numerical coefficient from Table 5-2. the fractional probable error fractional pe k tv
^^
«
ak
±0.6745
P
x
K
with a similar qualification as to the magnitude of the numerical coefficient.
Skewness. The Poisson distribution is intrinsically skewed. The asymmetry or skewness decreases as the expectation value p increases. It is interesting to see the relationship between the skewness and p. By definition, the third moment about the mean is
f
(k
- pfP{k;
= SCfc3 -
p)
3k 2p
+
3/c/r
- pz)P{k\
p)
k=
= £/c 3P(/c; By
-
3pLk P{k\ p)
we
—
separate the quantity k{k
1)
sum from the k property of normalization. Then the
change the
first
term
in the
p*
+
z 3p ZkP(k; p)
+
- p3
in deriving the expression for a,
same arguments as were used
the
Eq. 5-16, the
p)
2
3SA: 2 P{k- p)
-
—
(k
=
2) out
of the
first
term,
k = 3 term, and use term becomes
to the
first
2LkP(k;
p).
We make use of the relations that Hk P(k; p) = a2 + p2 from Eq. 5-15, that a 2 = p from Eq. 5-16, and that T,kP(k; p) = p from Eq. 5-12. Substi2
tuting these quantities in the expression for the third
moment about
the
mean, we get
|=
(k
- pfP(k;
p)
=p
(5-31)
A
Finally, the skewness,
divided by a 3
by
definition,
is
the third
moment about
the
mean
,
skewness
= l (K-M^k:,) = JL = ±_ = a k=o p
N
p
i a
(5 .32,
a very simple relation that holds for every Poisson distribution.
The derivation of
the expression for kurtosis
is
assigned to Problem
11, Section 5-10.
5-5.
Significance of the Basic Bernoulli Trials
Since P(k; p), Eq. 5-4, contains a single parameter, p, each of the p is known. In the derivation
Poisson probabilities can be calculated once
of P(k;
ju), p is taken as equal to the product np, where n and/? are the parameters in the binomial distribution and therefore are characteristics
of basic Bernoulli
trials.
However,
tional purposes whether or not
//
it is
of no consequence for computa-
and/? are separately known. (However,
208
Probability
note that knowledge of n and
and Experimental Errors
p may
in
Science
establish immediately the appli-
\i by m from a and then to use Eq. 5-4 or 5-5 in predictions and in calculations of precision and of reasonableness.* This, however, presumes that the Poisson model satisfactorily fits the actual measurements. The question of goodness of fit is discussed
cability of the Poisson model.)
rather large
number of
It suffices
to approximate
actual measurements by Eq. 2-2,
later.
In practice, Poisson problems can be divided into two general classes: (a)
sampling with replacement and
In the are
first
class there
presumed
limited.
to
is
come;
no known in the
(b)
sampling without replacement.
limit to the supply
second
class, the
from which events
supply
is
known
to be
In sampling with or without replacement, the experiment or
measurements must be such that the basic success probability, or the Poisson probability for a given value of k, must remain constant for
all
samplings.
An example of the first class is in the number k of defective
screws that a
machine turns out in a given batch of known size n if the average performance of the machine is known to be ju (^ m) defectives per batch of this size; p (= /////) is the probability, assumed to be constant, that any one screw will be defective. We do not inquire into the machine factors that make a screw defective it is presumed that the machine never changes in its performance in this regard. It is also presumed that our knowledge of n and p is such as to justify the assumption that the problem is indeed Poissonian. Another example of this class is in the number k of cosmic rays from outer space that appears in a specified solid angle in a specified time interval at noon of each day, or in the number k of X rays emitted per second from an X-ray tube under steady operation. In this example, the basic Bernoulli parameters n and p are not a priori known, but a special argument can be invoked to show that the parent distribution is satisfactorily Poissonian. This argument is the one of "spatial distribution" to be elaborated presently. A Poisson example of sampling without replacement is the classical one of radioactive decay the number k of atoms in a given radioactive specimen that decays in a specified time. To be Poissonian, this example must include the specification that the lifetime for decay must be long compared with the time interval of observation, and the number of atoms in the specimen must be rather large. In this example, the basic Bernoulli trial is whether or not a given atom decays during the interval of observation. certain
;
—
made in Chapter 4 regarding the normal Gauss distribution approximate the normal mean // (for the position at which 2 = 0) and the parameter h from the experimental measurements. The parameters /;, p, and q are not *
it
This argument was also
suffices to
separately evaluated.
209
Poisson Probability Distribution Clearly, the supply of possible events
of atoms in the specimen.
is
limited, viz.,
is
the total
number
These and other examples are discussed
in
detail later.
Two mechanical analogs. It was stated or implied in Sections 2-7 and 4-3 that the distinction between "success" and "failure" in a Poisson problem may be thought of as due to an unfathomable array of elementary errors. According to this view, a mechanical analog of "success" and "failure" is illustrated in Fig. 5-3. This mechanical model is a simplification of the one used for the normal distribution, Fig. 4-4, and the interpretation ball as
it
is
In both cases, however, the deflection of the
quite different.
encounters a pin corresponds to an elementary error.
By the relative size and position of the "success" bin, only a small number k of the total number n of balls dropped manages to reach this bin. The Poisson probability p refers to the chance that a given ball will do so. This probability is constant if the (hypothetical) pins and their array are appropriately designed. If the n balls are gathered up and again
•
• •
•
•
Fig. 5-3.
A
•
•
•
•
possible mechanical analog of "successes"
probability problem. after n balls
•
••••• ••••••
• •
• •
A
and "failures"
single trial consists in observing
have been dropped from the nozzle at the top.
k balls
in the
in
a Poisson
"success" bin
Probability
210
and Experimental Errors
in
Science
Lasassa 'Successes"
k
k<£n Another possible mechanical analog of "successes" and "failures" in a The angles of bounce of the dropped balls are random. A large number of ball drops gives only one Poisson measurement, viz., the number A' of balls Fig.
5-4.
Poisson problem.
in
the "success" bin.
Poisson Probability Distribution
211
dropped through the nozzle, as another Poisson trial, a generally different number k of successes is observed. A graph of the number of times each to n, as the number of Poisson possible value of k is observed, from
N
trials
increases indefinitely, gives the Poisson distribution.
geometry, k 2
+
[By the
(np) 2 << ».]
In this analog, each dropped ball, not each pin, corresponds to the basic
Bernoulli
trial.
And
a single
elementary error Bernoulli It is
trial,
(i.e.,
measurement
the
is
In the analog for the
position of a single ball.
number
normal
k,
not the final
distribution, each
the deflection at each pin) corresponds to a basic
but not so in this Poisson analog.
instructive to consider another possible mechanical analog for the
one
Poisson case.
This
into a
deflection
single
is
in
which
all
uncertainty.
the elementary errors are
lumped
Consider the model shown
in
Fig. 5-4.
dropped irregularly one by one from and bounce off the rotating rough central pin. The pin is rotating and rough so as to insure that the angles of bounce are essentially random. On one side of the cylindrical box is a hole, somewhat In this model, n identical balls are
the nozzle so as to strike
larger than a ball, leading to the "success" bin.
and the radius of the
cylindrical
box
By having
this hole small,
large, the "success" probability
is
small.
This model, with a large
ment
is
number
n of balls dropped, illustrates only one
relatively small number k number n of basic Bernoulli trials. If the experirepeated with the same number n of basic trials, the value of k
measurement, of "successes"
will in general
viz.,
the
measurement of the
in a large
be different;
if
the experiment
is
repeated
N times,
many
The Poisson frequency distribution is the graph of the number of times each k value is observed vs. the k values themselves as p -> and TV— oo. [Again by the geometry, k 2 + (npf <«.] different
5-6.
k values are observed.
Goodness of
Fit
Before the Poisson equations
may
be properly applied to calculate the
probability of the value of the next measurement, to assess the precision,
or to judge the reasonableness of past measurements, the investigator must
be assured that the measurements are members of a parent distribution that
is
satisfactorily Poissonian.
There are two general ways of determining the goodness of fit of the Poisson distribution, viz., by a priori analysis and by test. Analysis of the experiment usually allows the investigator to see whether or not the conditions of the basic Bernoulli
trials,
hence, of the Poisson
Probability and Experimental Errors in Science
212
approximations, are satisfactorily met. These conditions, to recapitulate, are: (1)
only two possible outcomes,
(2)
each outcome determined entirely by chance,
number of
(3) the
product
each basic
k, in
trial,
basic trials, n, very large,
(4) the success probability, p,
(5) the
k and not
viz.,
constant and very small, and
+
such that k 2
tip
(tip)
2
n.
In experiments in which the Bernoulli parameters are directly recognizable,
or are seen to be adjustable as
a spatial distribution (examples are
in
discussed presently), the analysis for goodness of priori that the parent distribution
infrequently indicates
it
is
case in which for any reason there
necessary to carry out a
The
qualitative
is
and
In the transition case, or in a
a serious question of goodness of
fit,
test.
and quantitative
tests
described in detail in Section 4-8
method
for the normal distribution are also applicable in
case or for any model distribution.
for the Poisson
Discussion of these tests will not be
Of them, the quantitative y 2 test is number of trial measurements.
repeated here.
often indicates a
to be in a transition region between binomial
Poisson or between normal and Poisson.
it is
fit
satisfactorily Poisson, but also not
generally the best but
requires the largest
Spatial
distribution.
problems,
let
argument, (infinite
Before
proceeding
examples
to
of Poisson
us elaborate the argument of spatial distribution.
many
By
this
experimental problems of sampling with replacement
supply of events) are recognized a priori as being satisfactorily
Poissonian.
Consider a problem in which events occur randomly along a time axis, and consider a unit time interval /. (In this case, "spatial" refers to time; in
other cases
it
may
terms of Bernoulli
time interval.
refer to space.)
trials,
each
trial
In each observation,
and "one or more events"
is
We may
first
analyze the problem
in
being an observation during the unit
"no event"
is
declared to be failure
declared to be success.
We may
repeat the
observation for TV successive unit intervals of time. The observed distribu-
numbers of successes is represented by the binomial distribution NB(k' ri p'), where p is the success probability and k' is the number of "one or more events" observed out of ri Bernoulli trials. In most problems, we are not satisfied with success defined as "one or more events"; we wish it to be only "one event." We may then imagine
tion of ;
',
the size of the time interval reduced until the probability for observing
more than one event Suppose
that, in
doing
the time interval in the
in the this,
we
reduced time interval
is
negligibly small.
divide the unit time interval by an integer n;
new Bernoulli
trial is
now At
=
l/n.
Since division
Poisson Probability Distribution
213
by n is purely imaginary, we may make n as large as we please. As n becomes very large (infinite in the limit) the probability p n that At contains even one event
is
very small (zero in the
limit).
We now have
n imaginary
Bernoulli trials that by definition in the limit satisfy exactly the Poisson
Then, for each of the
conditions.
TV actual observations, the probability
/ is B(k; n,p n ) = and the predicted frequency of observing exactly k events in N trials is NP(k; ju ). The point of special significance in the spatial distribution is that the basic Bernoulli trials on which the justification for the Poisson approximation is based are hypothetical trials having no actual
for observing exactly k events in the unit time interval
P(k;
fj, t ),
t
measuremental
significance.
t, we may start with any arbitrary time interand then reduce the Bernoulli time interval by dividing by nT instead of by n. The argument to be followed is the same as before except that now, to have the same success probability p n we have Tn basic Bernoulli trials. The Poisson predicted frequency distribution is written with T/u instead of ju viz., NP(k; T/u ). Then we may write /u for Tfi
Instead of the unit interval
val T,
,
t
t
.
,
t
important to realize that
It is
(x
t
is
t
a physical constant determining
oin or of measurement during the unit
the density of events along the time axis independent of the value
nT.
(x
is
t
the expectation value in an actual
time interval,
/u,
the expectation value during the time interval T,
determined from the viz.,
ph
fx
m=
N
Yufk k\N.
is
experimental measurements, each over time T,
For convenience
in
choosing an appropriate T. This adjustment
practice,
is
we
adjust
fj.
by
merely a k scale factor of
convenience. 5-7.
Examples of Poisson Problems
Perhaps actual examples provide the best way
(a) to
become acquainted
with the a priori analysis in determining whether or not a given problem Poissonian, and (b) to
become
is
better acquainted with the Poisson expres-
sions in general.
Deaths from the kick of a mule. tioned in
almost
all
This
is
a classical example men-
textbook discussions of
Poisson
probabilities.
Records of the number of soldiers dying per year from the kick of a mule were kept for a Prussian army unit of 200 men for a time of 20 years. It is assumed that this army unit was kept at full strength and at essentially the same routine activities throughout all of the 20 years. The records for one year are shown in Table 5-3. The mean or expectation value m (en fi) 200) observations is given by the ratio (Z/*A:)/(E/ ) = 122/200 and the Poisson probability can be readily computed. This probability, multiplied by N, is also listed in the table. Our first conclusion is that by direct comparison of the experimental and the model frequency
from
=
N(=
0.61,
fr
214
Probability Table
k
5-3.
and Experimental Errors
Deaths from the Kick of
fk
a
Mule
NP(k; 0.61)
in
Science
Poisson Probability Distribution
or a mule dies he or
it is
21
replaced immediately by another
insofar as soldier-mule interaction
is
who
is
equivalent
concerned, or else the army unit
is
large that one (or a few) deaths has a negligible effect in changing
so
pn
;
argument of spatial distribution we must have sampling with replacement (or a very good approximation thereto). Consider next as a basic Bernoulli trial the observation of one "statistically average" soldier, instead of the army as a unit, for a period of one year. In this case, the number of possible outcomes per trial is decreased effectively to only two, viz., one death or no death; the number of trials n is increased from one by a factor about equal to the number of average soldiers in the army; and p is reduced to a satisfactorily small value. p n is kept constant in this case by the mere assumption that all soldiers, in regard to mule contacts, are equivalent. Replacement of casualties need not be immediate but if not, to approximate the Poisson conditions, in order to use the
it is
required that the lifetime of a soldier be long,
army
unit be large.
This analysis
In discussing this problem,
is
we
i.e.,
that the size of the
the best of the three tried.
first
considered the simplest possible
Then we explored the and found that, along the time axis, the Bernoulli probability p n could not be assumed to be constant. Finally, our best analysis was made in terms of the basic Bernoulli trial as the
basic Bernoulli trial
argument of
and found
unsatisfactory.
it
spatial distribution
observation of a single soldier, with the assumption that identical.
In spite of the excellent
fit
all soldiers
are
of the Poisson calculations to the
reported measurements, Table 5-3, this problem barely qualifies as a
Poisson example.
Radioactive
Perhaps the best known
decay.
Poisson distribution in physics
is
a-particles in radioactive decay, counts of
counts of visible light photons,
application
in "counting" experiments:
etc.
And
cosmic rays, counts of X rays, commonest of these experi-
the
ments are those for which the source intensity constant over the time of measurement.
is
safely
Consider a radioactive substance emitting a-particles. is
placed behind a
set
of the
counts of
assumed to be This substance
of diaphragms which screens off all but a small solid
The unscreened rays fall upon a sensitive counting device, such as a Geiger counter tube combined with an amplifiersealer arrangement and a count register. The number of counts is recorded angle of the emitted rays.
for each of
N(=
2608) time intervals of
T (=
7.5) sec each.
Table 5-4
k counts were recorded in an actual fk experiment. The total number of counts is 2.fk k = 10,094, and the average counting rate is 10,094/2608 = 3.870 per 7.5 sec. The calculated Poisson
lists
the frequency
that exactly
distribution, multiplied
by
N to make
it
frequency instead of probability,
216 is
Probability
also listed in the table.
frequencies shows rather
The
fit
and Experimental Errors
in
Science
Direct comparison of the observed and calculated
good agreement.
of the Poisson model to the observed data
may
be judged
m
quantitatively by application of the x 2 test tms example we have enough measurements to use this test to advantage. This test tells us that 5
comparable cases would show worse agreement than appears owing to chance fluctuations alone. We conclude that the good Poisson fit establishes that the a-emission from this long-lived substance is not only governed by chance but is well represented by the Poisson model. 17 out of 100
in
Table
5-4,
Table 5-4.
k
Radioactive Disintegrations
fk
NP(k; 3.870)
Poisson Probability Distribution
217
example has many things in common with the one dealing with the kick of a mule, and it has two essential differences. The first difference is that, if the source intensity
is
satisfactorily constant during all of the
measurements
long atomic lifetime and large number of atoms in the specimen), may safely invoke the argument of spatial distribution. In this argu-
(i.e.,
we
ment, the time interval T (= 7.5 sec for convenience in the experiment) is imagined as subdivided into n equal subintervals of time, and n may be very large indeed. The second difference is that all atoms of a given species are strictly identical, as we know from the success of the principle of indistinguishability in
we may
quantum
statistical
alternatively analyze the
mechanics. With this knowledge
problem
in terms of the observation of each atom for 7.5 sec as the basic Bernoulli trial, the number n of such trials being equal to the number of atoms in the specimen. In terms
of either
The
set
of basic Bernoulli
trials,
the problem
is
clearly Poissonian.
analysis in terms of individual atoms, instead of the subintervals
of time,
is
in general the
more
because, often, the lifetime
logical
is
one
in
problems of radioactive decay
too short, or the number of atoms
is
too
small, to justify the assumption of constant source intensity
and thus of constant Bernoulli probability. This type of problem was mentioned earlier as an example of sampling without replacement, and only if the source intensity does remain constant can the spatial distribution argument be invoked, i.e., can the problem be considered as one of sampling with replacement. Note also that if the lifetime is short and the number of atoms is small, the problem is a binomial one in which the basic Bernoulli trial is an observation of each atom. A practical limit on the intensity of the source, or, rather on the maximum counting rate, is imposed by the measuring equipment. All detector devices have an inherent resolving time, i.e., a "dead time" immediately following a count. If one or more a-particles arrive at the detector during this dead time, it or they will not be counted. The Poisson distribution cannot "exactly" counts is
is
fit
the measurements, of course, unless the
negligible.
about 2-
sec,
The dead time of the eye
of a typical Geiger counter about 120
proportional counter about
1
/usee,
about 10 -7 sec as determined
number of "lost"
(as in counting scintillations) //sec,
of a typical
of a typical "fast" scintillation counter
now
(in the
year 1960) by the associated
electronic circuits.
Another
practical matter in regard to the source
"conditioned"
—in
the early runs
is
that
it
must have been
of data, loose molecular aggregates
become detached from the emitting substance atomic recoils from disintegrations within the
as a consequence of the
aggregates.
During the
conditioning process, the value of % 2 decreases successively as the source intensity becomes more nearly constant.
Probability and Experimental Errors
218
Counts per unit time: radioactive decay, typifies
many
precision.
The measurement of
in
Science
the rate of
the source intensity remains satisfactorily constant,
if
Poisson measurement problems
—most of the
problems
nuclear physics, cosmic rays, radioactive tracer work, carbon dating fact,
in
— in
any Poisson measurement amenable to analysis as a spatial distribuA feature of especial interest in such measurement problems
tion of events. is
precision, not yet discussed in these examples.
Consider the relative precision
in
counts per minute
among the
following
three measurements (a)
(b) (c)
one measurement 100 counts in lOmin one measurement: 1000 counts in 100 min ten successive measurements (N = 10) each for 10 min, the :
number of counts being The counting
total
1000.
assumed to vary randomly on a short-time scale on the average, to be constant, i.e., constant as measured on a long-time scale (e.g., days). The average counting rate is given experimentally by the mean m which is our best estimate of the desired parent mean /lc; m is the same in all of the three cases. Consider each of two dispersion indices of the parent (Poisson) distribution, viz., the standard deviation a and the fractional standard deviation a/ju in per cent; and consider also the standard deviation in the mean (standard error) a m These indices, for the three measurement situations,
(e.g.,
rate
is
minutes), but,
.
are:
(«)
Eq.
5-24
Eq.
5-28
a
=
-
= 4=
yjks
Jk
fi
=
^/lOO c/10 min
X 100
=
s
=*
=
-^2L V100
cpm
= cpm 1
10^
2-32 Eqs. 3-26
am
= —a=l =
Vl00 v
yjN
Vl
3-45
Eq.
5-24
Eq.
5-28
(b)
a
=
-
= JL
(i
V/c s
2-32 Eqs. 3-26
3-45
a
yJk s
=
,_
.
min
=
Vl00 cpm
=
.
1
cpm
10
x/l000c/100min
X 100
= —a— = JN
/1A c/10
=
-122= Viooo
w
..__ ^1000 ^ c/100min
^/i
'
= ^2°. C pm m 0.32 cpm 3.2% _ ._ ^ = ^1000 cpm ^0.32 cpm
100
Poisson Probability Distribution , u c Eq. 5-16
g
5-17
Eq.
/= Ju v
(7
(c)
->
y/ft
=
"1=
JN
~~F=
J 10
^
/1000
=
100
=
°"m m
-.^
/
V N
VlOOO/N
]
Eqs. 3-26
/1000
an
= lpO^
[x
t-
219
=
/
V
=
.
10 c/10
min
1
cpm
10
.
.
100
10
.
/b
71000/10
3.2 c/10
mm =
0.32
cpm
3-45
Or, better, the standard deviation a in case
Eq. 5-24 with
all
100 min. Then,
(O
may
(c)
be computed from
1000 counts considered as a single measurement over
we have
Eq.
5-24
a
= Jk =
Eq.
5-28
-
=
am
Eqs. 3-26
1
=
-J= x 100
—a
= —= JN
V1000
y.
10
=0.32 cpm
ion ~4= ^ 3.2%
71000
V/c s
ix
2-32
= ^ 100° c/10min
,/lOOb c/100 min
s
_ .
&
_
.
;1 3.2 c/10
min
=
. ,_ 0.32
cpm
Jl
3-45 First,
compare the precision of (a) with
that of (b).
It is
a characteristic
of Poisson problems that the precision in counts per unit time becomes greater as the mean m increases, but only {/"the effective number n of basic Bernoulli trials in each measurement also increases.
example, which involves hypothetical Bernoulli
T
of the observational interval
trials
must be stated
distribution argument, this characteristic
In the present
because of the spatial in
terms of the size
along the time axis instead of in terms of
number of hypothetical Bernoulli trials; the two statements are seen be equivalent when it is realized that the gain is in the product np, not
the to
merely in n alone.*
This
made
is
clear in a
penny-tossing example
presently.
Second, compare
remains the same, *
Note
that
if
(a)
with
observation time per measurement
If the
(c).
T=
viz.,
10 min, the values of a
the counts for each of the 10 trials (each for 10 min) are
use Eq. 2-23 for
o, viz.,
a
=
^(k —
I
t
/
in
Eq. 2-32 for a m
,
viz.,
am
=
I
of any mathematical model. But
m) 2 /9
|
10
^ (&i — w) /90
,
the Poisson
this
value of
\H .
These expressions apply regardless
J
model
is
known
to
fit
the parent distri-
bution, the Poisson equations lead to a greater precision with a given
measurements.
are the
known, we may
and then we may use
2
if
affx
VA
/ 10
a
and of
number of
220
Probability
same, respectively,
by
and Experimental Errors
Science
That these precision indices are unaltered measurements is expected from the statement in the
in (c) as in (a).
repetition of the
previous paragraph relating precision to the
having a given success probability.
trials
in
number of
basic Bernoulli
Note, however, that the addi-
tional nine trial measurements in (c) have improved the precision in the mean, the standard error a m On the other hand, if (c) is considered as a single measurement of 1000 counts with the observational time interval T = 100 min, the precision as measured by all three indices is improved; in this view, (b) and (c') are equivalent. .
Third, consider further (b) tially
case
N,
made above, but one
(c'), it is
i.e.,
This comparison has already been par-
= a/V N for
imperative that the value of a be consistent with the value of
the value of
We
measurements. the
vs. (c').
point remains. In the expression a m
same time take
or
must correspond
cannot take a
= Vk =
N=
would
this
\0;
deduced from the
to the value s
VlOOO
=
yield a m
N
cpm and at cpm but is not
en 0.32 0.1
valid.
Finally,
it
should be emphasized that, although we
may
at first
be
inclined to regard the subdivided data in (c) as thereby containing addi-
somehow yield an improvement in precision, no additional information (except in checking systematic errors as mentioned below, and in checking the applicability of the mathematical model). We have assumed that all the fluctuations in successive 10-min trials are random; and all of the information about the dispersion of the parent distribution is contained in the product np of the basic Bernoulli trials it does not matter how this information is obtained; it may be obtained in subdivided measurements or all at once in a single measurement. tional information that should
there
is
in fact
—
The
situation
is
somewhat
similar to that encountered in tossing pennies.
If five pennies are tossed together,
tions of heads
and
tails is
binomial probability
a
= Vnpq =
distribution
vf, and
some one of the possible five combinaThe five combinations form a parent
observed.
a/ju
=
B(k;
which
for
5, |)
(1/V5) x 100%.
If
/.i
=
np
=
experiment
the
f, is
repeated ten times, the experimental frequency distribution begins to take
shape and we have additional information for the estimate m of the mean i.e., a m is improved, but the parent distribution \0B(k; 5, £) is un-
/u;
altered (except for the normalization factor of 10);
and a Ip, are
just the same.
the values of
/*,
a,
But, instead of repeating the experiment ten
times, suppose that 10 x 5 = 50 pennies are tossed together; in this new experiment a single observation is made which is a member of the parent
distribution B(k;
pennies,
50, |), a different distribution
and one whose parameters are
/x
=
from that with
np
=
5 °, 2
a
=
\
just five \°,
and
Poisson Probability Distribution
221
a I[A = (1/v 50) x 100%. Of course, the new experiment could be performed by tossing the 50 pennies in 10 lots of 5 each, calling all 50 pennies r.
a single observation; this
is equivalent to subdividing the total number of The dispersion of B(k; 50, ^) is unaffected by any mere treatment of the data. The crucial point is in recognizing the parent distribution whose dispersion we would specify, i.e., the number of basic Bernoulli trials. It is apparent that a in the new experiment with 50
basic Bernoulli
trials.
is only VlO, instead of 10, times larger than a in the old experiment with 5 pennies, and, on a "per penny" basis, the new experiment is the more precise. (The "per penny" basis is somewhat artificial here but illustrates the point. Had we started with a very large number of pennies, say 10,000, instead of 5, the "per certain large number of pennies" would make more sense.*) A real advantage of subdivision of the data is that any significant
pennies
misbehavior (change
in
systematic errors) of the experiment
may
be
recognized, such as bursts of spurious background counts or a slow drift
magnitude of the background or perhaps of the average value of If enough measurements and subdivisions are made, the % % test can be used to check whether or not the systematic errors are constant, i.e., whether or not the observed fluctuations are purely random. Such a test in the radioactive decay example revealed that a changing systematic error was present until the radioactive specimen had in the
the quantity being measured.
become "conditioned." The above conclusions as to precision, equally valid in many Poisson measurement situations, illustrate the importance of applications of probability theory and statistics in the design and performance of experiments.
More examples. The
soldier-mule example and, strictly speaking,
the radioactivity example discussed above illustrate Poisson sampling
But, the army was presumed to be kept up to And, with the condition that the source intensity remain constant, the radioactivity example was also discussed as though the supply of events were unlimited. Let us discuss some more Poisson
without replacement. strength at
all
times.
examples.
Suppose that the volume of blood in an adult human is 6 liters, and that 3 volume C bacteria. A sample of size 0.6
there are distributed in this
mm
* The situation with the pennies does not allow a convenient analog to the argument of spatial distribution, but this argument is not really involved in the present discussion.
However, such an analog would require an adjustable reciprocal relation between the head probability/?' and the number /;' of pennies tossed for each observation with the product n'p' remaining constant.
222 of blood
What
taken for examination.
is
k bacteria
in the
sample?
This
imagine, as a basic Bernoulli
bacterium.
number of
is
in
Science
the probability that there are
is
a Poisson problem because
we may
the observation of one particular
trial,
bacterium in the sample? The success probability
Is this
the ratio of volumes, 0.6/(6
is
and Experimental Errors
Probability
C
10 6)
x
=
10~ 7 and the
number of trials n
,
p is
volume of blood. If the criterion answer to the problem is P(k; 10" 7 C). If many samples are taken from the same person we have another example of sampling without replacement; it is essentially the same as the radioactivity case in which the observation is whether or not a given atom the
+
k2
<
(np) 2
n
bacteria is
in the total
satisfied, the
many identical persons is we have sampling with replacement; but in this case identity of different persons may be suspect. It should be noted that if value of k we select in this problem is close to the expectation value
decays in a specified time interval. If each of used for each sample, the the
=
np
10
_7
C, the standard deviation in the value k
10~ 7/2 C-.
is
Also sampling with replacement: How many radio tubes in a new sample batch of 800 are reasonably expected to be defective if the manufacturing process
trial is
p
=
is
either
The success 0.03.
good or
may
i.e.,
the average,
3%
defectives?
not in bunches, the basic Bernoulli
The number of
defective.
probability
These Bernoulli
distribution
on
to put out,
— the inspection of each of the 800 radio tubes where
obvious here
each tube 800.
known
is
appear randomly,
If the defectives
p
is
Bernoulli trials n
is
also given as part of the problem, viz.,
trial characteristics
are such that the parent
be assumed to be Poisson. The
by the expectation value
np.
But
this
first question is answered answer should be accompanied by a
statement of precision, say, by the standard deviation
±Vnp.
Thus, a
reasonably complete answer is 0.03 x 800 ± Vo.03 x 800 = 24 ± 4.9. Another question may be asked, What is the probability that, in the batch, less than 30 tubes are defective? The answer is ^fLoP(k', 24). Or the question may be, What is the probability that at least one tube will be defective? To this question the answer is
P(>1;
24)
=
1
-
P(0; 24)
=
}
K
=
1
-
e~
2i
«
1
Consider another simple problem. Suppose that at a certain telephone switchboard 50,000 connections are made and that 267 of them are wrong. From a random sample of 25 connections, what is the probability that one be wrong? Again, in this problem if the wrong connections occur randomly, the basic Bernoulli trial is the observation of each connection.
will
The 0.
1
best determination of the expectation value
335.
The answer to the question is P(
1
;
0.
1
335)
=
25 x 267/50,000 en
is
0.
1
335e>-°
1335
^0.117.
Poisson Probability Distribution
As
223
a variation of this problem, suppose that at another switchboard
two wrong connections are made in 500 trials. What is the probability that one or more wrong connections will occur in the next 35 trials? This is left as an exercise for the reader. As a final example, suppose that one night the "enemy" dropped 537 bombs in an area of 576 square blocks in the central region of Washington, D.C. We study the array of bomb hits and ask the question, Were the Table 5-5.
k
Bombs "Dropped" on Washington, D.C.
fk
NP(k; 0.9323)
Probability
224
and Experimental Errors
in
Science
that your house, occupying 1/100 of a block, will be directly hit?
The
subdivision of the pertinent area into blocks was for convenience in the
problem dealing with the size of the White House or the Capitol. Since your house is now the interesting unit of area, we subdivide into 57,600 units, each an equally likely target. The answer to the question posed is approximately 537/57,600. Why "approximately"? 5-8.
Composite of Poisson Distributions
The observed fluctuations in any set of trial measurements are generally due to more than one source. For example, as an extreme case, the normal density function, Eq. 4-9, was derived as the composite of numerous elementary distributions. Now we wish to examine briefly the effect of a small
number of sources of
fluctuations in a typical Poisson problem.
Measuring with a background.
measurement involvthe most common second source of fluctuations is the ubiquitous background "noise." Whether or not the primary source under study is introduced, this background is present and is producing a measurement effect. A background is generally defined as those residual observations made when all deliberate sources are removed. Thus, a background is separately In almost any
ing the counting of individual photons, a-particles,
etc.,
measurable.*
A typical background in a measurement with a Geiger counter or with an ionization chamber is caused by radioactive contaminations in the air or materials of the laboratory, by cosmic rays, by "spurious" counts inherent in the counter tube or ion chamber, or by the "noise" in the associated electronic circuits. The type of background we shall discuss is one that can be assumed to be constant on the average and random in detail. Since the background can be separately measured, its constancy on the average can be experimentally checked with the x 2 test by assuming measured means and by using Eq. 4-27. Generally, found to be not only constant on the average but Poissonian in distribution. The randomness and distribution can be checked experimentally with the x 2 test by assuming a Poisson distribution of individual measurements of the background. It can be shown (and this is put as Problem 14, Section 5-10) that if both the background and the distribution of events we would measure are Poissonian, then the observed composite distribution is also Poissonian. a
flat
the
distribution of
background
* In
is
measuring the background
—
in practice,
the deliberate sources are preferably not
removed this would also remove any secondary effects (such as spurious scattering) in the background itself. It is often better to stop, by an appropriate shield, the direct source radiation from reaching the detector. physically
225
Poisson Probability Distribution
Thus,
I where k b
is
the
P(k b
;
background count,
the observed composite count,
ju
the
kb
ju b is
the
x is
we are usually mean composite count.
activity (the quantity is
-
P(k
fx b )
p x)
;
=
k
P(k;
(5-33)
/x)
the mean background count, k is mean value of the primary source
trying to measure),
The observed composite count
Precision.
=
K+
kb
and
/u
=
/u b
+
fi x
is
'
(5-34)
and the primary source count can be written kx
By Eq.
==
k
kb
kx
3-43, the standard deviation in sx
2
=
s
2
+
is 2
(5-35)
sb
(Note the sum, not a difference, in Eq. 5-35.) By Eq. 5-16, or Eq. 5-24 if we are dealing with a single measurement in each case, Eq. 5-35 becomes sx
2
w
(k x
+
+
k b ')
kb
(5-36)
where k b is a specific measurement of the background in the absence of k x k b and k b are generally different because of statistical fluctuations. .
The
conclusion
first
is
obvious,
viz., if
the signal
is
relatively strong,
background is not very important. But if the signal is relatively weak, the k b and k b terms in Eq. 5-36 are important, and considerable time must be spent in measuring the background. Suppose that a time t b is spent in measuring the background rate precision in determining the
bit
=
B.
Then, Bt b counts are recorded. Our best estimate of the mean rate B and its standard deviation is
background
B= The
precision in
B is
»>'»
In this case
mean
rates
(X
+
t
(*'»>*J
= B ± (»V
(5-37)
inversely proportional to the square root of the time
spent in determining B.
suppose that a time
*
x is
Then, with the source under study introduced, + k b ')jt = + B'
X
spent in measuring the rate (k x
B')t x counts are recorded,
and our
best estimate of the
is
X+ B = (X + B')t ± x
[(X
+
B')*J*
Probability and Experimental Errors
226
By
and using Eq.
similar arguments,
The
5-35,
£a'
X
w
{
_
X is approximately
w !s_ x 100
jl
X+
B'
Science
we obtain
fractional standard deviation in per cent in
lx±w +
in
(5-40)
_B
h
tx
Equations 5-39 and 5-40 could have been written directly from Eqs.
and k b had been divided by the times involved But writing X and B as rates, independent of the particular choice of observational times, is general and shows clearly, by Eqs. 5-37 and 5-38, that the precision in each case is inversely proportional to the square root of the time of measurement. A common practical problem in precision is the following one. If the time t x + t b is to be constant, what is the most efficient division of time? The most efficient division refers to that value of the ratio tjt b such that the standard deviation s x is a minimum. It can be shown (Problem 15, 5-36
and 5-34
if
the counts k
in their respective
measurements.
Section 5-10) that this ratio
is
-
t
-t b Js x = min
(5 - 4i >
(H*f B
Another practical problem in precision occurs when the mean background counting rate B is comparable in magnitude with, or greater than, the
mean primary source X.
In such a case, great care
is
required in
measurement X + B'. The desired signal X, obtained by subtracting B from X + B', may be confused with a fluctuation peak in the background. Help in this interpretation is afforded by the tests for significance of the difference between two means, in this case between the interpreting the
means
X+
B' and B. Such
tests,
including the useful
t
test,
were discussed
in Section 3-3.
The type of argument involved in the / test may be applied a little more in two means as follows. Let m and m h be the means in question. Then our best estimate of the difference in the parent means is written, with the standard deviations, as
simply to the difference
P If
we imagine
—f*i>
('"
-
rn b )
±
s lm -
mb )
(5-42)
number of independent values of m — m b to be can be shown (by the central limit theorem) to be
a very large
obtained, these values
-
Poisson Probability Distribution essentially
227
normally distributed about the parent mean value
a standard deviation s {m_ m
By Eq.
>.
=
<:
<\
—
[x
/u b
with
3-43, 2
4-
'V
«>
2
and, for the Poisson case, Jc (m-
m b)
= -+
—
(5
"
43 )
by Eqs. 3-45 and 5-16, where n and n b are the numbers of trial measurements used in determining m and m b respectively. Again, if only a single measurement is used in either case, n or n h = 1, Eq. 5-24 should be used instead of Eq. 5-16. Hence, s {m _ m can be evaluated, and with the single parameter h of the normal distribution thus known by Eq. 4-8, calculation can be made by Eq. 4-10 of the probability that /;; — m b will differ from — ju b by any specified value. Now, if the two parent means /u and [x b are assumed for the moment to be the same, i.e., if there is no difference >
fji
X+
between
B' and B, there
cally greater than s (m _ m >. normal distribution curve,
A 32%
chance
is
a 32
B.
68%
Fig. 4-6, lies within
h
will
be numeri-
of the area under the
^(standard deviation).
usually considered to be too large to be a satisfactory
is
X
criterion that the desired signal
background
% chance that m — m
In other words,
It
is
be declared to be clearly above the
customary to take
5%
as the "significance level."
measurement m — m b is so far out on the tails that it is beyond the 5% limits, then the assumption that /n = fi h is thereby declared to be unreasonable, and consequently the desired signal is clearly above the background. One important general conclusion of the above discussion of the highbackground problem is that the useful sensitivity of a measuring instrument depends greatly upon the background. This dependence is not specified by the response of the instrument to a source of standard strength divided by the background, but the standard response must be divided by the magnitude of the fluctuations in the background. A very large background is perfectly acceptable if it is perfectly constant in magnitude. If the actual
5-9.
An
Interval Distribution interesting extension can be
that involve a spatial distribution. intervals
made
easily in those Poisson
problems
This refers to the sizes of individual
between adjacent success events.
The discussion
is
usually of
events that occur along a time axis, and the intervals are time intervals.
As
discussed so
as independent.
questions
—
it
far,
the Poisson distribution treats
all
these intervals
But the so-called interval distribution answers further
gives the probability for the occurrence of an interval of a
228
and Experimental Errors
Probability
specified
size.
This distribution
is
in
Science
of course, whenever the
realized,
Poisson conditions are satisfied for events that occur randomly along some spatial axis, such as time.
A
spatial-distribution Poisson
problem can be
considered as one of measuring intervals instead of one of measuring
We should be acquainted
events; the two views are obviously related.
both views and choose the more convenient one
From
with
problems.
in individual
Eq. 5-4, the probability that there will be no event per unit time
u)=£J— =
P(0;
e
-"
is
(5-44)
0!
where
fx is
the average
number of events per
time interval of interest be this interval
is /at,
unit time.
Let the size of the
then the average number of events during
/;
and P(0;
fxt)
=
£>-"'
(5-45)
The
probability that an event will occur in the interval dt
This
is
the probability for a single event
probability for no event during is
the product fie~'
lt
t
and
Equation 5-46
is
<
dt
1.
is
simply
/u
dt.
Then, the combined
for one event between
Designate this probability as
dt.
/(/;
/
and / + dt and write
ju) dt,
= fie'^dt
l{t; fi)dt
sizes
if [x
(5-46)
the probability density function for the distribution of
of intervals occurring between random rare events.
It is
immediately
evident that small intervals have a higher probability than large intervals;
hence, the interval distribution
is
asymmetrical.
measurement of k events in time T is accompanied by K intervals, where K = k ± depending upon whether we start and/or end the time T with a count or with an interval. These K intervals are of many different sizes. The number n, of intervals having sizes greater than t Y but smaller than t 2 is given by
A
1
t
n tuU
=
KI(t;
ju)
n hh In particular, with
t
x
=
0,
=K =
/ue'
Kie-"*
Eq. 5-47 or 5-48
-
1
is
11
dt
(5-47)
e~" h )
(5-48)
'
the cumulative probability
distribution function. If
t
2
>
taken as
T, the time of the actual observation for the infinite,
n t>h
The average
k events,
t2 ,
may
be
and then
interval
is
larger than the average
simply
= Ke~^
1/ju,
(5-49)
so the fraction of intervals that are
is
0.37
(5-50)
229
Poisson Probability Distribution
As a second interesting limiting case, suppose that t x = 0. Then, by Eq. 5-48, the fraction of intervals that are smaller than any specified interval
t
t is
=
- e~^
1
K Dispersion indices. is
(5-51)
In the interval distribution, the
mean
deviation
defined as
Jo =^ t-r _ /
dt
(5-52) 00
jue'^dt Jo
where t
is
mean we have
written for the
integration of Eq. 5-52,
t
t
interval,
=
\ffx.
After performing the
- t = — fv 0.7358t
(5-53)
e
Equation 2-21 defines the variance for any continuous distribution, such as the interval distribution, and, accordingly,
a2
P° (t = J-±-^
write
rffie-^ dt (5-54)
°°
11
dt
lie-'
|
we
Jo
The standard
deviation in the size of the intervals between randomly
distributed events
is,
from Eq.
5-54,
=
a
r
=
-
(5-55)
just equal to the average interval.
Resolving time:
lost counts.
It is
when
physically impossible to measure
the interval between
them becomes
two
spatially separated events
less
than the resolving ability of the measuring instrument.
axis, the limiting resolving ability
time, or the
of the instrument
"dead time." Because short
long intervals, Eq. 5-46, the
finite
is
intervals are
On
a time
called the resolving
more probable than
resolving time reduces artifically the
to become artifically smaller than unity, where n is the number of Poisson become less % measurements; see Problem 30(b), Section 5-10).
dispersion of Poisson events; (i.e.,
2
/n to
it
causes x
2,
230
and Experimental Errors
Probability
Science
in
which events are "received" by the instrument,
If R, the rate at
and a small fraction of the counts observed rate R c is given by small,
c
w
R(l
R
as
R
R
-R
c
rr)
(or intervals)
for
R
rr
<
1
for
R c tt
<
1
c
is
rather
then the
is lost,
(5-56)
from which where r r
is
c
+R
(l
c
tt)
In this case, r r
the resolving time.
R- R
=
RR and,
if
If
R
R
is
usually written as
c
C
known, r r can be measured. becomes rather large, we must distinguish between two is
types of counters.
Equation 5-56
is
satisfactory
counts do not themselves extend the dead time; then value \JT r
.
But
if
we must
=
when Rrr
1,
R
= R e - RT
c
*
—
c
e
declines.
(5-59) v '
er r
At the maximum counting
rate,
only \\e
= 37%
of
R
increases
instead of the events being
randomly
This type of counter,
the received events are counted.
It is
(5-58)
maximum
reaches a
c
R 'max = - =
indefinitely,
c
(lost)
approaches the
write
R
and then
R
each unrecorded count does extend the dead time, then,
instead of Eq. 5-56,
In this case,
different
unrecorded
if
if
becomes completely paralyzed.
interesting to note that
if,
spaced, they are uniformly spaced, as are "pips" from an oscillator, and
if
recovery were abruptly complete after the dead time interval, the second type of counter described in the preceding paragraph would follow faithfully until the rate equals l/r r
events.
An
oscillator
electronic circuit, e.g.,
is
and
,
this
is
just e times
/?
Cmax
for
random
often used to determine the resolving time of an
one which
some allowance must be made Coincidence counting.
is
part of a counter system, but usually
for incomplete recovery.
There are a great many interesting problems
that arise in practical applications of the Poisson statistics.
We
shall
mention just one more. Suppose that, in the radioactive decay of ^Al, we wish to show that the emission of the /?- and y-rays is very nearly simultaneous in the decay process,
Two in
i.e.,
simultaneous within the resolving time of the equipment.
counters are used and arranged so that both must respond together
order for the event to count. Since each counter has a
that
random coincidences
finite
will
This
is
called coincidence counting.
background, there
is
a certain probability
be counted that are unrelated to the 13AI
Poisson Probability Distribution
Furthermore, in
decay.
231
experiment, since each counter subtends only
this
a small fraction of the A-n solid angle of the
and y-emission, a
ft-
ft
(or a y)
may
be received by one counter and the related y (or ft) go in such a direction as to miss the other counter; perhaps it also enters the first
counter.
This situation greatly complicates the background for each
counter;
the
total
background includes the unrelated rays from the
We shall not work out We have discussed in
decaying 13AI. Conclusions.
examples of Poisson type problems
the details here. this
chapter only a few selected
— starting
with the. simple and pro-
ceeding to the more complex
—problems that are typical
problems are legion, but
us say that they lead us beyond the scope of
let
in science.
Other
book.
this
Problems
5-10.
A deck of 52 cards
1.
cards are turned up
The
1
shuffled
is
and placed face down on a
player calls each card without looking at
calls.
Show by
it
it is
Then
the
examined.
and promptly forgets what he and the Poisson
the parameters of the basic Bernoulli trials
conditions that the probability distribution for the
expect to call correctly
is
number of cards he may
essentially Poissonian.
Suppose that the weather records show
2.
table.
each card being discarded after
at a time,
that,
on the average,
5
out of the
30 days in November are snowy days. (a)
What
is
the binomial probability that next
November
will
have at most
4 snowy days ?
What is the Poisson probability for the same event? Make the histogram of the following numbers of seeds area on damp filter paper:
(b) 3.
unit
k = fk =
2
1
6
20
28
3456789
12
germinating per
10
6
8
and (b) a Poisson distribution to these measurements and plot two frequency distributions on the same graph as the experimental histo-
Fit (a) a binomial
these
gram. (Hint: In the Poisson calculation, use Eq.
5-6.)
Suppose the number of telephone calls an operator receives on Tuesday mornings from 9:00 to 9:10 is fitted by a Poisson distribution with p =3. (a) Find the probability that the operator will receive no calls in that time 4.
interval next Tuesday. (b)
Find the probability that
a total of (c)
1
call in that
in the
next 3 Tuesdays the operator will receive
time interval.
Find the probability that the
call
1
of part (b) will be in the
first
Tuesday. 5.
A
book of 600 pages
contains,
on
the average 200 misprints.
the chance that a page contains at least 3 misprints. estimate.
[ans.
p
(3
Estimate
Discuss the reliability of this or more) rs 0.29; a ^ 2.36 x 10~ 2 ] fl
Probability and Experimental Errors in Science
232
A
6.
company has From a mortality
life-insurance
people at age 25.
25, 88,314 are alive at age 26.
1000 policies, averaging S2000, on
lives of found that, of 89,032 alive at age Find upper and lower values for the amount
table
it is
which the company would reasonably be expected to pay out during the year on these policies. 7. (a)
What are the binomial and the Poisson probabilities that exactly 3 random sample of 500, have birthdays on Christmas? (Assume all
people, in a
days of each year to be equally probable as birthdays.)
What is What is
(b) (c)
the expectation value of the
number of birthdays on February 29?
the precision of the answer to part (b)?
number of random samples of 500 people per sample were
(d) If a large
in-
vestigated an experimental probability could be obtained that "exactly" 3 people
out of 500 have birthdays on Christmas. Mention a few factors that would it
unlikely that this experimental probability
binomial probability for
How
8. (a) girl
make
would agree with the calculated
this event.
would you determine the probability that an unspecified college
has red hair, assuming no ambiguity in color?
your determination.
Assume
in the
remainder of
Discuss the reliability of
this
problem that
this
proba-
bility is 0.05.
What
(b)
the probability that in a
is
random sample of 20
college girls 4 will
have red hair?
What
(c)
ment of
the probability that, of 4 girls in a physics class having an enroll-
is
30, only
How
1
has red hair?
must a random sample be it tne probability of its containing at least 1 red head is to be 0.95 or more? (e) List the Bernoulli and the Poisson conditions separately, and write "good," Si fair," or "poor" by each condition to indicate the degree to which it is "satisfied" in part (b), in part (c), and in part (d). (d)
9.
A
large
long-lived radioactive source emits particles at
(a)
What
is
(b)
What
is
the expectation
number of
an average
particles observed in 10
rate of 10/hr.
min? (ans. 1.67)
the probability that in a 10-min run
no
particles are
observed?
(ans. 0.188) (c) If 20 measurements are made, each for 10 min, what can you say about a measure of fluctuations relative to the mean value?
(d)
What
(e)
If
is
the precision of a single 10-min observation?
the average counting rate were 300/hr instead of 10/hr,
answers be for parts 10.
If
is
the difference great
11.
enough
in
one hour, then 265 counts
finally
i.e.,
concluding as to significance.
Derive the expression for kurtosis
in the
to indicate a significant time variation
cosmic-ray intensity? Treat the data three different ways,
arguments, before
what would the
and (d)?
246 cosmic-ray counts are observed
next hour, in the
(a), (b), (c),
in the
Poisson distribution.
by different
Poisson Probability Distribution
233
»
Consider the distribution B(k\ 100,0.05)
12.
P(k;
What
5).
the value of
is
mean, the most probable value,
(a) the
(b)
standard deviation,
(c) the
mean
(d) the standard deviation in the (e)
the skewness,
(f)
the kurtosis
The
13.
(standard error),
and
?
mean value in almost any probability Show that P{fi /u) = l/(27r,«)^, using Stirling's
probability of observing the
distribution
surprisingly small.
is
;
formula.
counting with a constant background, when the source
14. (a) Verify, for
rate
=
kjt
is
k\t
-
k b /t b
that
,
k
2
P(k b
fi b
;
)P{k -
kb
/i ) x
;
=
P(k;
/i
+
6
n x)
=
P{k;
/<)
(b) Reconcile this identity with the statement (implied in Section 4-3) that a
large
number of component Poisson
distributions gives a composite normal
distribution. 15. If
+
/
tb
is
constant, where
cluding the background) and
show and
16.
mean
most
that the
kit
such that
is
is
tb
/
is
X
the time spent in counting
rays (in-
the time spent in determining the background,
efficient division
of time between measurements of k b /t b
(klk b fK
t/t b *t
Show, using Eq. 5-4, that the probability of observing one less than the value is the same as the probability of observing the mean value in a
Poisson distribution. 17.
As an
inspector of an
enormous quantity of some manufactured gadget,
determine the sample size and the acceptance number so that
should be
(a) there
than
less
1
chance
in 10 that a lot with
5°
defectives
is
accepted,
is
(b) there
should be
rejected,
and
(c) that the
18.
less
than 5 chances in 100 that a
combination
(a)
and
lot
with only 2
% defectives
(b) obtains.
Consider the chances of a bomber pilot surviving a series of statistically which the chance of being shot down is always 5 %.
identical raids in (a)
From an
survive
1, 5,
original
group of 1000 such pilots, how many are expected to and 100 raids? Plot the survival curve.
10, 15, 20, 40, 80,
(b)
What
(c)
In a single raid of 100 planes, what are the chances that 0,
will
is
the
mean
life
of a pilot in
number of
(ans. 20 raids)
raids?
1, 5,
or 10 planes
be lost ?
19.
Ten cm3 of a
material
is
(a) that
Each of
liquid contain 30 bacteria.
inoculated with
only 1 of the
test
1
cm 3
of
this solution.
tubes shows growth,
(b) that the first test tube to be inoculated
i.e.,
10 test tubes of a nutrient
What
is
the probability
contains at least
shows growth,
1
bacterium,
and Experimental Errors
Probability
234 of the
(c) that all 10
test
is
show growth, and show growth ?
a multinomial problem, but can any part or parts of
How many
be conveniently
it
problem?
treated as a Poisson 20.
Science
tubes
(d) that exactly 7 test tubes
This
in
stars
must there be randomly distributed
in the sky all
around
the earth in order that there be a 50-50 chance of having (a) a "north" polar star, i.e., within, say, 2° of the axis,
both a "north" and a "south" star, one or both? (d) What would the answer be to part buted in the sky? (b)
(c) either
What
21.
additional information,
if
(a) if the stars
any,
were uniformly
do you need
distri-
in order to determine
each of the following parts whether or not it is a probability distribution, and, if it is, which of the 3 distributions emphasized in this book does it most
in
closely
approximate? Give as many a
each answer as you
priori reasons for
can.
New York
in
State according to financial
year.
Number
(b)
men
of adult
(a) Classification
income per
of defective lamp bulbs in each of
from a factory. (c) Repeated
measurements of
trial
(i)
many
large sample batches
very feeble and
(ii)
very intense light
intensity.
(d) One hundred measurements of the winning time at a horse race, each measurement by a different observer using his own stop watch. (e) Values of the height of the emperor of Japan from a poll of every tenth adult resident of Japan, the residents being randomly selected.
Number
(f)
vs. deflection
through a thin metallic
foil,
beam of protons
angle of a there being
process per scattered proton processes per scattered proton
on the average
single scattering),
(i.e.,
and
than
(ii)
1
scattering
100 scattering
(Scattering
multiple scattering).
(i.e.,
scattered in passing
(i) less
is
due to
the proton-nucleus electrical repulsion).
Suppose that the average number of
22.
of Ithaca
1.5/yr.
is
occurring in the city
within the range of numbers
(a) Is this
number
may be (b) What
considered reasonable on the basis of chance alone?
that
is
(4) reasonable,
the critical
prompt, say, an increase
owing
it
fall
fatal accidents in
of the police force
5%
any one year that should if
the criterion
that this or a greater
number
is
set that
will
occur
is
entirely possible in a multinomial probability
problem
to
have n
and np moderate, where p, is the basic Bernoulli probability p k successes and i is any one of the possible outcomes in Eq. 1-20. Show
large,
for
does
chance alone?
to
23. It
i.e.,
number of in the size
the probability shall be less than
t
small,
t
/-
(
= 3 the How would
that for r
multiple Poisson distribution
is
normalized.
you design the experiment to distinguish between randomness direction and randomness in time in the emission of a-particles from polonium?
24. in
fatal accidents
In a particular year 4 fatal accidents occur.
Poisson Probability Distribution
235
Discuss your choice of classification intervals and of the total number of measurements of counting rate. 25.
A
proportional counter
is
used in the measurement of
X rays
of constant
average intensity. (a)
A
count (source plus background) of 8000 is observed in 10 min. X rays removed, 10 min gives a total of 2000 background counts. the average X-ray intensity in counts per minute, and what is the standard total
Then, with the
What
is
deviation in this value? (b) if
What
is
the
optimum fraction of time to spend measuring make measurements is fixed?
the background
the total time to
26. In the measurement of y-rays, a counter is used with a measured average background of 120 cpm. If the y-rays enter the counter at an average rate of 240/min,what must be the duration of an observation of the y-rays if the measurement of the number of y's per minute is to have a probable error of 2 % ? 27.
Show by
the
method of maximum
likelihood that a
= Vnp =
/n
in the
Poisson distribution. (See Problem 27, Section 3-11.) 28. In expressing the dispersion of a set of
members of a Poisson distribution, monly than the mean deviation. (a) State as
many
(b) Outline in
deviation 29.
A
is
measurements believed to be
the standard deviation
is
used more com-
reasons for this as you can.
some
measurement situation in which the mean (Assume that no common probability model "fits").
detail a real-life
preferable.
college class of 80 students meets 267 times during the year.
number of class meetings fk with k absences Table 5-6. Absenteeism
k 0-2
fk
in
is
listed in
Table
5-6.
College Class Meetings
267P(k; 8.74)
The
Probability and Experimental Errors in Science
236
you arrange the data and proceed to determine whether or not the absenteeism on days of out-of-town football games was due to random fluctuations alone?
Make
30. (a)
the y} test of the goodness of
of the Poisson model to the
fit
observations in Table 5-4. (b)
Show
that, for a perfect
fit,
x
2
=
n,
where n
is
the
number of trial measure-
ments. 31. In successive
5-min intervals the background with a certain counter is 5. A radioactive source of long half-life is brought
310, 290, 280, 315,315, 275, 3
up
to the counter.
1
The increased counting
rate for successive
5-min intervals
is
720, 760, 770, 780, 710, 780, 740, 740. (a)
for
(i)
(b)
Calculate in counts per minute the average value and the probable error the background,
Show
ground can
(ii)
the background plus source,
and
(iii)
the source alone.
quantitatively whether or not the data with the source plus backsafely be considered to be
randomly
32. In counting a-particles, the average rate
distributed.
is
30
a's per hour.
What
is
the
fraction of the intervals between successive counts such that they are (a) longer
than 5 min,
(b) longer than 10 min, (c)
shorter than 30 sec.
a- and /J-rays are emitted from a certain radioactive sample. Assume and ^-emissions are independent, i.e., from different noninteracting atoms. The observed counts are A a's per minute and B /3's per minute. What is the combined probability that a particular interval between two successive a's will have a duration between t and t + dt and will also contain exactly x /Ts?
33.
Both
that the a-
34. (a)
and
Perform the integrations of Eqs. 5-52 and 5-54 for the mean deviation
for the standard deviation in the interval distribution.
(b)
Derive the expression for the skewness of the interval distribution.
35. In a certain experiment, a counter system gives counts per in
Table
5-7.
The parent Table
5-7.
distribution
is
Observed Counts Trial
1
minute as
listed
expected to be Poissonian. The internal in a
k
Certain Experiment
237
Poisson Probability Distribution
37. There are more positively charged cosmic rays at sea level than negatively charged ones. In a given experiment, 2740 positive ones and 2175 negative ones were detected during the same time interval. How should the ratio of positive to negative particles be reported ? 38.
One g
of radioactive material of atomic weight 200
which has a 5.00% registered in 30 min.
is
exposed to a counter 787 counts are
efficiency for detecting disintegrations;
What
is
the
mean
lifetime of the radioactivity?
What
is
the most probable lifetime of single radioactive nuclei? 39.
A
piece of metal
is
exposed to neutron irradiation and therafter
is
placed
near a counter than can detect the induced radioactivity. During the first minute after irradiation, 256 counts are recorded; during the second minute there are
49 counts. Ignore background. Assuming that only one kind of radioactivity was produced, determine the decay constant and the standard deviation in its determination. (Assume, of course, exponential decay). 40.
A cosmic-ray
of 0.02 steradian
is
"telescope" with a sensitive area of 100
pointed in a fixed direction.
different times the following
In
1
cm 2 and
numbers of counts are recorded
276
an aperture
hr intervals at various
Summary
Throughout
this
book we have devoted
the majority of the pages to the
practical mechanics of analyzing observations
and measurements with an
eye to the proper formulas for use in determining experimental errors and probability precision. But the discussion has been deliberately cast in the
framework of the
scientist rather
than of the mathematician or of the
statistician. It is
hoped that this book leaves the reader with a better understanding is a complex of observations, measurements, theoretical conand predictions that are all essentially probabilistic, that all "facts"
that science cepts,
of science are probabilistic;
that the "exact" or deterministic views of
and indeed held by many pseudo scientists open-ended views. And thus can science continue
science of the 19th century,
today, have given
way to
and grow and be philosophically endeavors of man.
to live tual
239
at
home among
the other intellec-
Glossary
The equations
numbered
as in the
of equally probable
outcomes:
listed in the five parts of this glossary are
text; the pages in the text are also indicated.
I.
CLASSICAL PROBABILITY
Definition of p;
w = number
Independent events
A and
p(A and B) = p(A)-p(B)
=
p(B\A)
Compound
A
^(either
^
^(either
A
or B)
=
nor B)
1
-
or
B
or both)
=
p. 13
+ p{B)
p{A)
-
p(A)
p{B)
+ p(A)-p(B)
=
p(A)
+ p{B) -
=
£(4)
+ £(B) -
or B, not both)
A
10
component events; additive theorems:
^(either
^(neither
p.
p(A\B) = p{A)
and
p(B)
events; independent
(1-3)
(1-8)
p. 8
n
B:
(1-5)
(1-7)
= number
w =-
p
(1-1)
(1-6)
n
of "wins,"
2p(A)-p(B) />(;4)
-p{B)
.
p.
10
p.
12
p.
12
p.
12
Partially dependent events; conditional probability: (1-9)
/>04
Permutations, total number
of,
and 5) = £(4) £(B|;4)
n
n!
Pk = (n
Stirling's
formula
(for
any
of,
—
—
p.
23
«)!
n objects taken k at a time (the k objects unordered)
(n\
n Pfc
factorial
number ,
„-,«)
13
n objects taken k at a time (the & objects ordered)
(1-12)
Combinations, total number
p.
n\
z!):
>"V=(|) (i+s*s ? --) 247
242 II.
Probability
and Experimental
Errors in Science
MEASUREMENTS IN SCIENCE: SIMPLE STATISTICS (REGARDLESS OF 'TIT" OF ANY MATHEMATICAL MODEL)
Definition (experimental) of p;
w ba = number
of
'win" observations,
rc
bS
= number
of identical trials:
(1-39)
pobs
=
, .
Mean m
(sample,
real-life
data)
;
^obs
.
limit ra
p.
49
obs~*°° ^obs
r different
values of
x, xi
observed /, times,
in
n
trials:
n
,
.
(2-1)
(2-2)
(2-3)
m =
Xl
flXl m =
+x
Xn
2 -\
=
i= l
n
+/a*2
p.
76
p.
77
p.
77
n
-\
frXr
=
i=l
Glossary
243
Second moment; variance
s
2 :
(2-19)
s
Universe variance a
2 ,
E =
for
m =
-
(*<
2
°
- m2
p.
86
p.
86
p.
87
p.
88
p.
93
p.
94
discrete universe distribution:
ju,
2
m)
E
=
-
lim
n
n-»o
Same
62
with universe mean
a2
(2-20)
=
2
~
lim ( xi n—>0 i=l
tfpi
continuous universe distribution
-
(x
I
2
n)
Jo
px dx
(2-21) I
Jo
p x dx
Practical ("best") estimate of a:
E
°<^)
(2-22, 2-23)
Standard deviation
in the
mean sm
n
—
m) 2
1
data and universe:
real-life
,
'A
-
(xi
n
(2-31
2-32 2-33)
(2-34)
— V«
-
,,<
fractional sm
Standard deviation
in
=
;
m
E
-—
m =
=
Vn
\
-
(xi
-
1=L
\
w (w
fractional
mVn
m) 2
-
!)
m =
—= M
the standard deviation
for
/xV«
approximately normal
(bell-
shaped) distribution: (2-44)
Probable error
/>e,
50%
«
a-/\/2n
p.
96
confidence limits; for approximately normal (bell-shaped)
distribution
pe
(4-21)
Probable error
in
pep.
Skewncss, coefficient
~
0.65^/ a/2w
= 0.46£e/Vw
p.
of:
E
skewness
(*i
-
E
m) 3
i=l
=
Peakedness, coefficient
of:
n
peakedness
(
xi
lim sample
(2-41, 2-42)
pp. 173, 206
0.655
the probable error pe pc for approximately normal distribution:
(2-45)
(2-35, 2-36)
»
=
E(*»i=i
n—>»
7Z<x
~
M)
96
244
and Experimental
Probability
PROPAGATION OF ERRORS
III.
Propagation of random errors,
in function
(3-31)
dUi
= f(x,
u
du
du — dy
« —Sxi + dx
Mean
=
deviation zu in u ,
(3-33)
,
in
with
f(x, y)
fraCtl0naI
Standard deviation su
u
and
zx
%
(3-36, 3-39)
Same
112
2 -i
^
^=LU)^ U)7]
= /(x,
with S* and
y)
= [(£)
in
p.
'du\ 2
Sz
>dy/
sy
/3m\2
+ (y)
2
= /(x,
u
known:
^ pp. 114, 115
V*]
and
with £e z and
y)
y
—
=
su
for s„ with s^ written for s x
Probable error pe u (or pen),
same as
Syi
+
fractional
(3-38, 3-40)
y)
known:
zy
2
IV.
Errors in Science
Sy for s y
/>e v
(or £&g
p.
and
pe$)
115
known:
for s u (or for Sg) with £e replacing 5 throughout.
MORE
STATISTICS
Weighted mean
mw
,
each
x,-
with weight W{, n
trials:
n
ww =
(3-57)
-
x^
X —
wx i
i
p.
n
118
i=i
Weighted grand mean J" (i.e.,
wxi
oc
of iV
component means each weighted by inverse variance
\/sxi "):
J2
xw
(3-58)
Weighted standard deviation
sx
2
Xi/s±
=— t Us 1=1
p.
118
p.
120
'
Xl
w :
n
£
-
Wi{xi
x w) 2
i=i (3-61)
t
/
in
the
/
test for consistency of
means i\
(3-64)
/=ff
—
of
=l
two
ii /
-(
sets of
«i«2
Ul+l2) \«i
\
V-^-J n-i) +
measurements: P 121
245
Glossary
F
the
in
F
standard deviations of two sets of measurements:
test for consistency of
«i
2
2
F =
(3-67)
-\=
p.
n2
2
M X
2
in
the chi-square test for goodness of
and model subdivided into
zl
<
—
1
mathematical model; actual data
of a
fit
124
2
M intervals, / = frequency of measurements (actual,
obs, or universe, th) in thej'th interval:
..__,
(*-26)
If
x
the model distribution
2
£
=
(4-27)
X
„ 2
~
2
(/th),]
TT-\
?=1
(Jth)j
uniform
is
[(/obs)/
2_,
=
p.
n total measurements:
(flat),
E~ -m m)~ »
185
2
(Xi
P 185
j=l
Curve
fitting,
y
=
a
+ bx,
values of a and b:
a
(3-74)
2x<2 2yi
—
=
-= 2
nZxi (3-75)
b
=
- Sxi2(xiyi) -^ - (ZXf) 2
—
g
For weighted values, see Eqs. 3-76 and
3-77.
For standard deviations, see Eqs. 3-78 and In case a
Curve
=
fitting,
0,
y
\*
.
3-79.
see^Eq. 3-83.
=
a
+ bx + ex
2 ,
values of
see Eq. 3-86.
a, b, c:
p.
129
p.
129
p.
1
p.
130
p.
131
p.
132
29
Correlation coefficient r for straight-line regression curve through the origin at x> y: 2jXiy{ Sx
Sx = =bSy _
(3-107)
Covariance
r
s xy for straight-line
y—
iAj
p.
142
Sy
regression curve through the origin at x, y
and with
correlation coefficient r: Sxy
_ —
VXjyj
_ —
TSxSy
n
V.
MATHEMATICAL MODELS OF PROBABILITY
All the equations listed in Parts II, III
and ful
and IV above, except Eqs.
2-44, 4-21,
apply also to all the mathematical models. In some instances more powerexpressions apply specifically to the models; only the specifically more powerful 2-45,
expressions are listed below.
Binomial
Model
Probability (distribution, discrete) for k successes, n (1-20)
B(k; n, p)
trials,
= (") pk q n ~i
p
-f q
=
1
p. 31
246
and Experimental
Probability
Cumulative probability (from k =
£ B(k;
(1-22)
Expectation value
is
q
=
\
P- 31
1
n'=n
np.
p. 31
ko in
is>
(1-24)
-
np\
m
= np
|£
Mean
k n ~k
k=0
Jt=0
Most probable value
k'):
P £ (?) \«/
=
n, p)
=
to k
Errors in Science
^
1.
32
p.
n:
(2-26)
Standard deviation
variance a 1
a;
p. 91
:
= vnpq;
(2-28)
a
(2-29)
fractional a
=
2
=
=
(
Multinomial probability (distribution) for each , k r observations, n trials, pi k\, kn, p2
M[(h;
of
n,
;
(k r
p2)
;
more than two
—
+
n, pi)(k 2
-\
•
+ pr n
p r)\ =
n,
p.
92
p.
92
Model
Multinomial
(1-30)
)
«/
\m
M
•
rc£g
=
possible outcomes;
1:
\
,
,fti*W 2
£ r*'
P-
37
Normal (Gauss) Model
=
Probability (density or distribution, continuous), z
G(z;h)
(4-9)
=
deviation, h
—-r " ft
=
l/
160
p.
160
y/ir
Cumulative probability (from z (4-11)
= —»
With standardized
r
h
=
*(*)
=
to z
2
z')\
'
e~" 2z2 dz
y/H J —x
variable: /
x
(4-14)
=
hz\
=
(*)
\
e~
x2
dx
p.
169
p.
169
y/^ Jo 2
/*'
1
/=-;
(4-15)
erf(0=— 7=1
_/2/2 e
dt
V2)r J -f
Mean
deviation
5 (universe)
(4-16)
Standard deviation
1
=
Z
=
0.564 — —
p. 171
a:
(4-20)
1
=
<j
=
0.707 p.
172
p.
173
P-
173
h
hy/2 Probable error pc (50% confidence limits): (4-21)
pe
=
0.4769
=
0.6/4.->cr
h
90%
confidence limits:
(4-22)
90%
c.l.
=
—
247
Glossary
Model
Poisson
—
Probability (distribution, discrete) for k successes, n
value np
=
fx
moderate
[k
+
2
2
(np)
=
Cumulative probability (from k
=
to k
fc=0
variance
0,
expectation
p.
198
P-
198
P-
202
p.
203
p.
205
p.
206
p.
207
2 o-
:
a
(5-17)
fractional
in a single
>
«!
/c=0
(5-13)
Standard deviation
—
= Z^rr-
)
a;
p
Jcg—lt
k'
£ P(k; M
Standard deviation
trials,
k')
V
(5-5)
<x>
—
=
P(k;n)
(5-4)
»
=
Vm =
=
measurement,
:
s
= y/T,
aks
(5-24)
a/,
—
Probable error pe
«
pe
Skewness, coefficient
0.65
of:
=
skewness
(5-32)
<x
Optimum time moved)
B
for
superposed on a background, obobserved background rate (signal re-
ratio for counting a signal rate
served combined rate
time
X+B
time
for
tx ,
fa:
+B
JX
t
-x =
(5-41)
a
p.
B„
M
tb
226
Model
Interval
Probability (density or distribution, continuous) for size or duration of intervals
randomly distributed between rare events, mean number size
/ is fit,
mean
interval
is
r
(
=
(5-46) t
=
to
—— = fit
(5-51)
Mean
l//x)
/(/; n)
Cumulative probability (from
deviation
t
—
t
= =
M e-"
ne-*
I
1
e
t')
with
dt
=
ci
st'
K
/
of events in interval of
1
p.
K -
events observed
«-"*'
in
228
time T:
p.
229
p.
229
p.
229
Jq
t:
(5-53)
t
-
t
=
—2
~
0.7358
n
fie
Standard deviaton a: (5-55)
a
=
M
Ind ex
statistics, 44 Bridge hand probabilities, 29, 37, 40
Bose-Einstein
Accuracy, 68 see also Precision vs.
Acton,
accuracy
F., 133
Addition theorems, 11
Cauchy
Alienation coefficient, 142
Central limit theorem,
Analysis of variance,
F
test, 123,
e.g.,
t
test,
134
see also
see
Chance, see Probability Chavenet's criterion for rejection of a "bad" measurement, 176
Mean
Bacon, R., 88 Bayes' theorem, conditional probability,
Chi-square (x 2 ), 185 frequency distribution, 187
goodness of 184-191
13
test for
Bernoulli theorem, 6
fit
of a
model,
dependence on the chosen group
30
conditions
for,
binomial probability, 32
in
normal probability, 164
in Poisson probability,
in-
tervals, 190
30
in
probabilities
P(x 2
>
xc
2 )-
table of,
188
example of, 190 Choice of functional relation for least squares fit, 133 F test for, 134
207
Bernoulli probability, 31 see also Binomial probability "Best" (location) value, 64, 88 see also
"Best" value; Mean; Median;
Mode
2
Average (arithmetic),
168
6,
Central (location) value, 73, 79
120
X test, 184 measuring with a background, 227
trial,
distribution, 97
Mean
Class (classification) interval, 72, 183, 186 Classical probability, 2, 8
Betting odds, 103 Bias in estimate, 81, 88, 104, 107
conditions
for,
8
Binomial coefficients, 27 Binomial probability, 31 distribution, 30 mean, and expectation value in, 3 1 91 most probable value in, 32 skewness in, 95 standard deviation and variance in, 92 graphs of, 32, 157, 158 in normal approximation, 33 in Poisson approximation, 33
Cochran and Massey, 146
Birge, R., 123
Composite Poisson distributions, 224 see also Measuring with a background
Coefficient of variation, 92 see also Standard deviation Coincidence counting, 230
,
Combinations, 23, 27 Combinatorial analysis, 9, 23 combination (unordered group) in, 27 permutation (ordered group) in, 23 Comparison of precisions in two sample sets,
F
Blunders, 69
249
123-126
test for, 123
250
Index
Compounding
of errors, 109
Dispersion indices, 73, 79
Propagation of errors Conditional probability, 12 see also
knowledge
see also Inferred
Confidence
limits, 173
deviation; Peakedness;
Range; Skewness; Standard deviation; and Variance Distribution function, see Frequency disQuantile;
see also Levels of significance
tribution
Consistency of two means, 120-123 /
Mean
see also
test for, 120
Dixon and Massey, 146 Dwight, H., 170
Correlation, coefficient of, 142 inverse, 142
Elementary
of errors, 111
Equally probable outcomes (events), 50 Errors, 64 blunders, 64, 69
two or more variables stochastically related, 140
Covariance, 143
Cox and Matuschak, 132
errors, see Errors
examples
of,
Cramer, H., 105, 168 Craps, game of, 48
by
Curve
elementary, 65
fitting,
examples of, 165 in normal probability theory, 164 mechanical analog of, 166
choice of functional relation, 133 differences, table for, 134
parabola, 132
experimental, 64
sine curve, 133
nonindependent, 110 propagation of, 62, 109-118
straight line, 127 for,
129
standard deviation for, 130
in
random
parameters
(accidental, independent), 64,
111
straight line through origin, 131
systematic, 64, 67
examples d'Alembert,
J.,
69
significant figures, 69
dispersion, see Dispersion
126
parameters
theory
6
of,
of, 6,
67
156
Data smoothing, 139 by first-order differences, 139 to a parabola by least squares, 139
Error function, 169
Degrees of freedom, 89
Estimator, 77, 88, 89, 104
in
curve
in
dimensions of error space, 138 standard deviation, 89
in
in the chi-square (x in the
F test, t
test,
table of values of, 171
Error space, 138
127
fitting,
in the
bias in, 81, 88, 104, 107
2 )
test,
consistent, 104, 107
Events, compound,
189
independent,
124
Expectation value, 31, 78 in binomial distribution, 31, 91 in Poisson distribution, 198
error propagation, 113
149
see also
Mean
Experimental probability,
Deviation, 64, 79
Mean
11
mutually exclusive, 10 overlapping (nonindependent), 11 rare, 196
de Moivre, A., 6 Design of an experiment, 148 statistical features in,
9,
10, 13
individual, 4
121
de Mere, Chevalier, 6 de Mere's paradox, 46
see also
8,
Skewness; Standard and Variance Discrepancy, 69
4,
49
conditions for and difficulties
deviation; Peakedness;
in,
deviation;
F
test for consistency, in
standard deviations, 123-126
49
251
Index
F test for
consistency,
curves
of,
F distribution,
124
125
for choice of functional relation, 134
Factorials, see Stirling's formula
146-148
for dispersion indices, 147 for location index, 147
beginning progress
of,
in,
19
with experience,
41
18,
see also Science
Interval probability, 227
181
of,
10, 13
Inefficient statistics,
Inferred knowledge, 17, 55
Fermat, P., 6 Fermi-Dirac statistics, 44 Fit of a mathematical model, 62, 180 chi-square (x 2 ) test of, 184-191 graphical visual test
Independent events,
cumulative distribution function, 228
ogive curve, 183 probability paper, use
of,
density function, 228
183
skewness and kurtosis, in test
of,
183
distribution, 227
mean
Fisher, R., 105, 123, 126
Frequency distribution,
in,
229
standard deviation
58, 71
in,
229
resolving time, lost counts, 229
cumulative, 73
continuous type, 72
curve
of, 58,
diagram
of,
73
Kurtosis, see Peakedness
73
discrete type, 72 relative (normalized), 73
Mathematical model Frequency function, 73 see also
Laplace,
Galton's quincunxes, 166
Games
Gamma Gauss,
function, 25
C,
parent, 175
Law
of chance, 6, 8
P., 6, 7, 9, 18
law of succession of, 19 Large deviations, probability for, 175 rejection of a "bad" measurement, 175 table of probabilities for, with normal of averages, 11
Least squares, see Principle of least
6
probability density formula, 33, 160
squares Legendre, A., 6
probability model, 33, 161
Levels of significance, in
see also
Gossett,
W.
Normal probability
in
(Student), 123
in chi-square (x
Goodness of fit, test of, see Fit of a mathematical model Grand mean, 118, 123 weighted, 118
Gaunt, J. A., 7 Gunther, Count
F
t
test,
120
123
test,
2 )
184
test,
measuring with a background, 224 see also Confidence limits
in
Lexis divergent coefficient, 185 see also Chi-square (x
2 )
Likelihood function, 104 A., 7
Maximum
see also
likelihood
Likelihood ratio, 103 Halley, E., 7
Heisenberg, W., see Uncertainty principle
Histogram, 57, 73 see also
Frequency distribution
History, highlights
of,
of probability,
6
1,
P.,
Lottery problem, 39
6
of statistics, 6, 7
Hoel,
Location indices, 76 see also "Best" value; Mean; Median; Mode; and Most probable value
188
Holmes, Justice O., 55 Huygens, C, 6 Hypergeometric probability, 39
Mathematical model
of probability, 61,
63, 72 see also
Binomial; Multinomial; Inter-
val;
Normal; Poisson;
F,
t,
and
chi-square (x 2 ) probability distributions
252
Index
Maximum
of,
105-108
precision
in,
108
Maxwell-Boltzmann
Mean
Normal
likelihood, 103
examples
statistics,
Normal frequency
43
deviation, 80
see also Normal probability Normal cumulative probability
normal distribution, 170
distribu-
tion, 161
use in science, 81, 97
see also
Normal likelihood, 106, 107
Normal probability
probability, 156
density function, 157, 199
sample (experimental), 76 universe (parent, theoretical), 77, 78
weighted, 118
approximations 160, 163-165
binomial)
(to
of,
distribution, 6, 160
working, 78
dispersion indices
Measurements, 5, 56 as a multinomial probability problem,
in a graph,
mean
38
in,
in,
160,
170-174
174
by maximum
likelihood,
106
as a quality control problem, 52
computed
standard deviation likelihood, 106
(derived), 62
direct, 52, 62
elementary errors
in science, 3, 41, 52, 56, 101
signal,
Bernoulli
trials,
168 fit of,
to actual data, 180-191 of,
graphs, 161, 174
large deviations in,
"bad" measure-
frequency curves ments, 175
errors,
166, 209,
efficiency of, 76 of least squares, see Principle of
least squares
Mode, 76 Model of probability, see Mathematical model Moments, 84 about the mean (central moments), 86 about the
by maximum
different allowed characteristics of,
225
Mechanical analogs of 210 Median, 76
in,
in,
164
random, trial, 56, 58 Measuring with a background, 224 chi-square (x 2 ) test of flatness and randomness of background, 224 precision in measuring a superposed
Method
(or density function)
distribution, 160
fractional, 81
Mean, 64, 76 by maximum
160
Normal probability
see also
efficiency of, 81
in
differential probability distribu-
tion,
origin, 85
Most probable value (mode),
32,
76,
mechanical analog of, 166 wide use in statistical theory, 156
Observation, see Measurement
Ogive curve, 183 Ordered group, see Permutation Over-determined (vs. exact) constant, 127
Parent distribution, see Universe bution
distri-
Pascal, B., 6
Pascal's triangle, 28
198 Multiplication
theorem,
probabilities,
12
Multinomial
coefficients, 35
probability distribution, 37
Peakedness (experimental and universe), 95 Pearson, K., 185
Permutations, 23
Mutually exclusive events, 10
Phase space, 42 Philosophy of sufficient reason,
Nagel, E., 4
Pierce, B., 170
Newman,
Pierce,
J., 7
C,
17
7
253
Index Poisson frequency distribution (density
Principle of least squares, justification of,
Poisson cumulative distribution func-
quantitative, 2
198
a priori, axiomatic, exact, mathematical, 2, 4, 8
classical,
Poisson probability, 33, 156, 195 Bernoulli trials
in,
207
conditions
density function, 198
approximation
3, 4,
(to binomial) of,
201
Probability combinations, 9
by maximum
likelihood, 107
standard deviation
in,
distributions
202
to actual data, 211
models,
mathematical, 61,
63, 72 see also
measurement, 204
illustrating,
Probability distributions, see Frequency Probability
fractional, 203
problems
49
theological, 4, 56
202
probable error in, 206 skewness in, 207
fit of,
8
philosophical, 4
distribution, 198
in a single
for,
experimental, a posteriori, scientific,
conditions for, 196, 212
in,
Mathematical model
Probability paper, 183
Probability theory, 62
213
Probable error,
normal distribution, 173
bacteria in blood samples, 221
in
bombs "dropped" on Washington,
in Poisson distribution, 206
223
popularity in science, reasons
deaths from the kick of a mule, 213 defective units in manufacturing process, 222
radioactive decay, 215 precision,
2
of,
intuitional, non-quantitative, 2
see also Poisson probability
mean
likelihood, 135
Probability, several meanings
see also Poisson probability
tion,
maximum
from
function), 198
subgroups
in the
148
Propagation of errors in computed measurements, 109M18 in a sum or difference, a product or quotient, a logarithm, a sine func-
in total
counts, 218
shapes of frequency distribution,
for,
probable error, 96
of
198, 200
tion, 116-118 nonindependent (systematic) errors,
110
spatial distribution, 212
of
telephone switchboard, 222
two mechanical analogs of, 209, 210 Poker hand probabilities, 30, 48
random errors, mean deviation
111 in,
113
probable error in, 115 standard deviation in, 114
Polynomial, see Multinomial Precision indices, 65, 71, 170 see also
Location indices and Disper
Quality control, 51 Quantile, 79
sion .indices
Quantum
Precision vs. accuracy, 68 Prediction,
5,
in
curve
in
data smoothing, 139
fitting,
Random mass-phenomena, Random numbers, 60 Random variation, 59
126
in definition of deviation,
80
in regression curve, 145 in various applications, 135
weighting by standard deviation, 119
43
56
Principle of least squares
in
statistics,
5
mathematical meaning
of,
scientific (operational)
meaning
Range, 79 Rare events,
60 of,
see Poisson probability
Rejection, see Large deviations
60
254
Index Standard deviation, reasons for popularity
Regression curve, 145 Restraints, see Degrees of freedom
Root-mean-square deviation,
see
ard deviation
Rounding
of
97
of,
sample (experimental), 82
Stand-
fractional, 83
numbers, 69
universe (parent), 86
Standardized variable in error function, 169
Sample space, 12
Normal probability
see also
Sampling, 26
Statistical bias, see Bias
random, 26
Statistical efficiency, 76, 81, 104
with replacement, 26
Statistical fluctuation, 64
measurements, central feature
without replacement, 26 Scatter diagram, 143
Statistical
Science, progress in, 41, 55, 101
Statistical mechanics,
of,
purpose of a measurement in, 101 roles of classical and experimental probability in, 42 see also Inferred
knowledge and Meas-
urements Scientific fact,
history of, 6 definition of, 59 Stirling's formula, (t)
24
distribution, see
t
test
"Sun rise" problem, 18 System of betting, 49
55
see also Inferred
42-45
Statistics, 7, 56, 101, 146
Students
1,
66
knowledge
Systematic error, 64
Sears, F., 45
examples
Significance, see Levels of significance
of,
67
Significant figures, 69
Skewness, 72, 94 coefficient of, 94
t
experimental, 94
t
universe, 94 in
curves
Spread, see Dispersion indices
Standard error of estimate, 142 Standard error (standard deviation mean), 92, 93
in the
likelihood, 109
fractional,
94
inefficient,
148
Standard deviation,
82, 86
"best" estimate of (variance), 88, 89,
of,
122
ity, 83 Test of a hypothesis, e.g., fit of binomial model, 178 "true coin" example, 14
extended to include veil of errors in measurements, 102 "true" die, 178 see also Fit of a mathematical model
Theory
of errors, 6, 156
Thermodynamic
138
Todhunter,
calculation of, 85
measurement, 83, 204 mathematical models of probability,
in a single in
consistency of two means, 120
distribution, 122
Tchebycheff's (or Chebyshev's) inequal-
Poisson distribution, 207
by maximum
test,
I.,
probability, 42
6
"True" value, 64, 167 "True coin" hypothesis,
test of, 14
e.g.,
binomial, 92
normal, 172 Poisson, 203 in the
mean (standard
in
error), 92, 93
94 the standard deviation, 96, 120
fractional,
inefficient, 147
Uncertainty principle (Heisenberg), 8, 43 Universe distribution, 58
mean
in, 76,
78
in, 86-89 Mathematical models
standard deviation see also
4, 7,
255
Index
Weighted mean, weighted grand mean,
Variance, 86 best estimate in
of,
88
118
binomial model distribution, 91
see also
Standard deviation
Weighted standard deviation, 120 Working mean, 78 for skewness, 95
Weighted mean, 118 by standard deviation, 119 by maximum likelihood,
for standard deviation, 84
120, 138
X
2 ,
see
Chi-square
SCIENCE
Probability
and
Experimental Errors in Science Lyman G.
What is tion
is
Parratt
The answer to this quesmodern science and its relaThis book supplies the answer.
the nature of scientific "meaning"?
essential to
an understanding
of
tionship to other intellectual activities. It brings
home
the base of all
the significance of a conceptual revolution that lies at science — the replacement of the "exact" or
modern
"absolute" scientific meanings of the 19th century by the "probabilistic" meanings of the 20th. Written by a man who is an experienced teacher and a distinguished physical scientist, the book fully clarifies the relationship of statistics and probability theory to probin, as well as the philosophy of, modern physical science. At the same time, it teaches the mechanics involved in applying proper formulas to scientific situations.
lems
Both the experimental and the
classical definitions of probability are
covered, but primary emphasis
is
given to the experimental. Early in
the book, classical games of chance are introduced in order to arouse
him develop his feeling for probability The discussion then shifts to measurements in science and to general statistical concepts (maximum likelihood, curve fitting, consistency tests, etc.). The normal (Gauss) and the reader's interest and help into quantitative concepts.
the Poisson models of mathematical probability are explored both analytically and through typical problems; both types of models are given about equal weight. In line with the purpose of the book, the discussion is elementary and concentrates on essentials. Numerous prob-
lems are included, some with answers.
cover design:
mike mciver