INTRODUCTION TO STATISTICAL INFERENCE
by
JEROME C. R. LI
Chairman, Department of Statistics, Oregon State College
Distributed by Edwards Brothers, Inc., Ann Arbor, Michigan
1957
Copyright 1957 by JEROME C. R. LI. All rights reserved.
International Copyright 1957 by JEROME C. R. LI. All foreign rights reserved.
Composed by The Science Press, Inc., Lancaster, Pennsylvania, U.S.A. Printed by Edwards Brothers, Inc., Ann Arbor, Michigan, U.S.A. Published by JEROME C. R. LI. Distributed by Edwards Brothers, Inc., Ann Arbor, Michigan, U.S.A.
PREFACE

This book is essentially a non-mathematical exposition of the theory of statistics written for experimental scientists. It is an expanded version of lecture notes used for a one-year course in statistics taught at Oregon State College since 1949. Students in this course come from the experimental sciences. Unfortunately, however, as students they are usually deficient in mathematical training, understandably so, since it is unrealistic at present to require a student to study four or five years of college mathematics for the sole purpose of learning statistics. Calculus, which is usually taught in the sophomore year in college, does not by itself provide sufficient background for comprehending the mathematical proofs of the theorems in statistics. However, though hard to prove mathematically, these theorems can be verified empirically, and it is this empirical verification, not mathematical proof, which is used as the key to statistics in this book. Students in the experimental sciences are accustomed to acquiring knowledge through
experiments, that is, through induction. It is fitting, therefore, that these students should acquire an understanding of statistics by experimental, or inductive, means rather than by mathematical, or deductive, means. It is my hope that instructors who adopt this book as a text will require their students to conduct at least some of these experiments. The time required in performing such experiments is negligible if desk calculators are available. It is not so formidable a job as it seems. A table of 1000 random samples, each consisting of 5 observations, drawn from a normal population with mean equal to 50 and variance equal to 100, is given in Table 1 of the Appendix. If sampling equipment is not available, the samples in this table may be used for performing the experiments. Of course, the number of samples does not have to be 1000 and the sample size need not be 5.
For example, 500 samples, each consisting of 10
observations, may be made from the 5000 observations. The 5000 tabulated observations are already randomized and may be systematically parceled out to the students for computation. The experiments are valuable not only as a substitute for mathematics, but also as an aid in grasping the physical meaning of the theorems. They are helpful to beginning students in statistics regardless of their mathematical background. Examples are used to illustrate the possible applications of the principles presented. They are selected mainly for their simplicity and can be understood without specialized knowledge in a subject-matter field. They might be called common-sense examples. Though the most frequently used examples refer to fertilizers, insecticides, teaching methods, and children's heights, one does not need to be an agronomist, an entomologist, an educator, or a parent to comprehend them. The more technical examples are given as exercises at the end of each chapter. The book is roughly divided into three parts by the two review chapters, 13 and 20. The first part is devoted to the basic concepts of statistical inference and to the introduction of the four distributions, namely the normal, t, χ², and F distributions. The second part is devoted essentially to the analysis of variance, and the third part to the non-parametric methods. These topics are joined by repeated references to the analysis of variance; even the non-parametric methods are explained in terms of this topic. The relations among various methods are stressed, and numerous cross-references are used. While the material of the early chapters is essential to the understanding of the later chapters, the use of algebra is confined mainly to special sections developing computing methods suitable for use with the desk calculator. The student may omit these sections without interrupting the continuity of the book. If a short course is desired, I recommend that Chapters 16, 17, and 19 be omitted as a unit or that Chapters 21, 22, 23, and 24 be omitted as a unit. Of course, in an even shorter course, one may omit both units.
The importance of the individual degree of freedom is stressed. Rightly or wrongly, I feel that this topic has not received the attention it deserves.
Its use as a means to increase the power of a test has not been sufficiently emphasized in statistics books. The notations used are quite consistent throughout the entire book. Even though some notations are traditional, such as μ for mean and σ² for variance, I did not hesitate to deviate from tradition in order to achieve uniformity of notation. For example, an observation is denoted by y rather than the traditional x. Since the x commonly used in the analysis of variance is the y in regression, a unified notation y can dispel a great deal of confusion in the mind of a beginning student when he encounters the two topics simultaneously in the test of linearity of regression or in the analysis of covariance. The methods of presentation are not exactly traditional, nor are they entirely my own invention. The distinguishing features of this book are the result of a complex of influences, many of them not now traceable. However, a word about those which are traceable. The method of illustrating the distributions of the sample mean and the difference between sample means through the use of small populations (Sections 5.3 and 10.1) I learned from Dean Walter Bartky at the University of Chicago. From Professor George W. Snedecor of Iowa State College I first obtained the idea of using sampling experiments as an aid in teaching statistics, and Professor W. J. Dixon, of the University of California at Los Angeles, convinced me of the value of the sampling experiment as a teaching aid. To all these men I am deeply grateful. The subject matter itself is of course not original, with the possible exception of the correction term given in Equation 2, Section 24.4. I suspect that my interpretation of the non-parametric methods in terms of the analysis of variance, given in the last four chapters of the book, has long been known to many statisticians, but I have not been able to find it in textbooks. I may have learned this interpretation from Professor W. G. Cochran, of the Johns Hopkins University, under whom I had the
privilege of studying statistics at Iowa State College fifteen years ago. The exercises at the end of each chapter were supplied by many people. The problems on agriculture came from the staff of the Oregon Agricultural Experiment Station. Those on biology came from Dr. Robert L. Stearman, National Institute of Arthritis and Metabolic Diseases. Those on psychology came from Dr. Horace M. Manning, Psychology Department, Oregon State College. Those on the physical sciences came from Dr. Richard F. Link, Statistics Department, Oregon State College. I am much indebted to all these able men. I also wish to thank Professor E. S. Pearson for his kind permission to reproduce Tables 1, 2, 4, 5, 6, and 7 of the Appendix from Biometrika, and Professors David Duncan and Gertrude M. Cox for their permission
to reproduce Table 8 from Biometrics; and Professor G. W. Snedecor and the Iowa State College Press for their permission to reproduce Table 9 from Statistical Methods. I am also indebted to Sir Ronald A. Fisher, Cambridge, to Dr. Frank Yates, Rothamsted, and to Messrs. Oliver and Boyd Ltd., Edinburgh, for permission to reprint Tables 10 and 11 from their book Statistical Tables for Biological, Agricultural and Medical Research. I am further indebted to Professor Richard L. Anderson, North Carolina State College, and Dr. Lyle D. Calvin, Oregon State College, for their criticism of the manuscript. Indeed, by relieving me of many of my consulting duties, my colleague Dr. Calvin enabled me to find the time needed to write this book. My thanks also to my clerical staff: Miss Cathy Olsen, Mrs. Sally Strause, Miss Janet M. Grexton, and Miss Sherry Lee Holbrook did the typing, computing, and proof-reading. Mr. Ernest Defenbach drew all the graphs. Last but not least, I am grateful to my editor, Professor James W. Groshong, English Department, Oregon State College, who spent almost as much time on the book as I did. Sometimes I wondered which of us was writing this book.

Jerome C. R. Li
August, 1956
CONTENTS

PREFACE .... iii

CHAPTER 1. INTRODUCTION
 1.1 Statistical Inference .... 1
 1.2 Reference System .... 2
 References .... 2

CHAPTER 2. DESCRIPTIVE STATISTICS
 2.1 Mathematical Notations .... 3
 2.2 Mean .... 3
 2.3 Variance and Standard Deviation .... 4
 2.4 Effect of Change in Observations on Mean and Variance .... 5
 2.5 Frequency Table .... 6
 2.6 Histogram and Frequency Curve .... 8
 2.7 Remarks .... 10
 Exercises .... 11
 Questions .... 13

CHAPTER 3. THE NORMAL DISTRIBUTION
 3.1 Some Properties of the Normal Curve .... 14
 3.2 Table of the Normal Curve .... 15
 3.3 Normal Probability Graph Paper .... 19
 3.4 Probability .... 21
 Exercises .... 21
 Questions .... 22
 References .... 22

CHAPTER 4. SAMPLING EXPERIMENTS
 4.1 Description of Population .... 23
 4.2 Drawing of Samples .... 26
 4.3 Computation .... 27
 4.4 Parameter and Statistic .... 28
 4.5 Purpose of Sampling Experiments .... 28
 References .... 28

CHAPTER 5. SAMPLE MEAN
 5.1 Sampling Scheme .... 29
 5.2 Distribution of Sample Means .... 31
 5.3 Mean and Variance of Sample Means .... 35
 5.4 Notations .... 36
 5.5 Reliability of Sample Means .... 36
 5.6 Experimental Verification of Theorems .... 38
 5.7 Remarks .... 39
 Exercises .... 41
 Questions .... 41
 References .... 42

CHAPTER 6. TEST OF HYPOTHESIS
 6.1 Hypothesis .... 44
 6.2 Two Kinds of Errors .... 45
 6.3 Level of Significance .... 46
 6.4 Type II Error .... 49
 6.5 Sample Size .... 50
 6.6 Summary .... 52
 6.7 The u-Test .... 53
 6.8 Assumptions .... 54
 6.9 Procedures .... 55
 6.10 Remarks .... 55
 Exercises .... 56
 Questions .... 57
 References .... 58

CHAPTER 7. SAMPLE VARIANCE - χ²-DISTRIBUTION
 7.1 Purposes of Studying Sample Variance .... 59
 7.2 Sample Variance .... 59
 7.3 Unbiased Estimate .... 63
 7.4 Computing Method for Sample Variance .... 64
 7.5 χ²-Distribution .... 65
 7.6 Distribution of u² .... 70
 7.7 Distribution of SS/σ² .... 71
 7.8 Algebraic Identities .... 76
 7.9 Analysis of Variance .... 77
 7.10 Test of Hypothesis .... 78
 7.11 Procedures of Test of Hypothesis .... 82
 7.12 Applications .... 83
 7.13 Remarks .... 84
 Exercises .... 84
 Questions .... 85
 References .... 86

CHAPTER 8. STUDENT'S t-DISTRIBUTION
 8.1 Description of t-Distribution .... 87
 8.2 Experimental Verification of t-Distribution .... 89
 8.3 t-Table .... 92
 8.4 Test of Hypothesis .... 92
 8.5 Procedures .... 94
 8.6 Applications .... 95
 8.7 Paired Observations .... 96
 8.8 Remarks .... 99
 Exercises .... 99
 Questions .... 104
 References .... 104

CHAPTER 9. VARIANCE-RATIO - F-DISTRIBUTION
 9.1 Description of F-Distribution .... 105
 9.2 Experimental Verification of F-Distribution .... 107
 9.3 F-Table .... 108
 9.4 Test of Hypothesis .... 109
 9.5 Procedures .... 110
 9.6 Weighted Mean of Sample Variances .... 111
 9.7 Relation Between F-Distribution and χ²-Distribution .... 114
 9.8 Remarks .... 115
 Exercises .... 115
 Questions .... 118
 References .... 118

CHAPTER 10. DIFFERENCE BETWEEN SAMPLE MEANS
 10.1 Distribution of Difference Between Sample Means .... 119
 10.2 Experimental Verification of Distribution of Difference Between Sample Means .... 125
 10.3 u-Distribution .... 126
 10.4 Student's t-Distribution .... 127
 10.5 Experimental Verification of t-Distribution .... 129
 10.6 Test of Hypothesis - Procedure .... 131
 10.7 Advantages of Equal Sample Size .... 133
 10.8 Application .... 134
 10.9 Randomization .... 135
 Exercises .... 136
 Questions .... 140
 References .... 140

CHAPTER 11. CONFIDENCE INTERVAL
 11.1 Inequality .... 141
 11.2 Estimation by Interval .... 141
 11.3 Confidence Interval and Confidence Coefficient .... 142
 11.4 Confidence Interval of Mean .... 145
 11.5 Confidence Interval of Difference Between Means .... 147
 Exercises .... 148
 Questions .... 149
 References .... 150

CHAPTER 12. ANALYSIS OF VARIANCE - ONE-WAY CLASSIFICATION
 12.1 Mechanics of Partition of Sum of Squares .... 151
 12.2 Statistical Interpretation of Partition of Sum of Squares .... 155
 12.3 Computing Method .... 159
 12.4 Variance Components and Models .... 163
 12.5 Test of Hypothesis - Procedure .... 167
 12.6 Relation Between t-Distribution and F-Distribution .... 171
 12.7 .... 172
 12.8 .... 173
 12.9 Specific Test .... 175
 12.10 Unequal Sample Sizes .... 175
 12.11 Advantages of Equal Sample Size .... 179
 Exercises .... 181
 Questions .... 186
 References .... 187

CHAPTER 13. REVIEW
 13.1 All Possible Samples .... 188
 13.2 Relation Among Various Distributions .... 189
 13.3 Tests of Hypotheses .... 190
 13.4 Significance .... 192
 13.5 Sample Size .... 192
 13.6 Simplified Statistical Methods .... 193
 13.7 Error .... 194
 Questions .... 195
 References .... 195

CHAPTER 14. RANDOMIZED BLOCKS
 14.1 Randomized Block Versus Completely Randomized Experiment .... 196
 14.2 Mechanics of Partition of Sum of Squares .... 198
 14.3 Statistical Interpretation of Partition of Sum of Squares .... 200
 14.4 Computing Method .... 205
 14.5 Test of Hypothesis - Procedure and Example .... 207
 14.6 Paired Observations and Randomized Blocks .... 208
 14.7 Missing Observation .... 209
 14.8 Experimental Error .... 212
 14.9 Models .... 214
 Exercises .... 215
 Questions .... 219
 References .... 220

CHAPTER 15. TESTS OF SPECIFIC HYPOTHESES IN THE ANALYSIS OF VARIANCE
 15.1 Linear Combination .... 221
 15.2 Distribution of Linear Combinations .... 222
 15.3 Individual Degree of Freedom .... 226
 15.4 Least Significant Difference (LSD) .... 233
 15.5 New Multiple Range Test .... 238
 Exercises .... 241
 Questions .... 243
 References .... 243

CHAPTER 16. LINEAR REGRESSION - I
 16.1 Fundamental Notions .... 244
 16.2 Description of a Population .... 248
 16.3 Estimation of Parameters .... 250
 16.4 Partition of Sum of Squares .... 255
 16.5 Distribution of Sums of Squares .... 259
 16.6 Estimate of Variance of Array .... 263
 16.7 Test of Hypothesis .... 263
 16.8 Correlation Coefficient .... 265
 16.9 Algebraic Identities and Computing Methods .... 266
 Exercises .... 268
 Questions .... 273
 References .... 273

CHAPTER 17. LINEAR REGRESSION - II
 17.1 Sampling Experiment .... 274
 17.2 Distribution of Sample Mean .... 276
 17.3 Distribution of Sample Regression Coefficient .... 278
 17.4 Distribution of Adjusted Mean .... 283
 17.5 Variance Components .... 288
 17.6 Contrast Between Linear Regression and Analysis of Variance .... 290
 17.7 Test of Linearity of Regression .... 295
 17.8 Individual Degree of Freedom .... 298
 17.9 Remarks .... 302
 Exercises .... 303
 Questions .... 307
 References .... 308

CHAPTER 18. FACTORIAL EXPERIMENT
 18.1 Description of Factorial Experiment .... 309
 18.2 Mechanics of Partition of Sum of Squares .... 311
 18.3 Interaction .... 315
 18.4 Computing Method .... 316
 18.5 Statistical Interpretation - Fixed Model .... 319
 18.6 Models - Tests of Hypothesis .... 324
 18.7 Tests of Specific Hypotheses .... 325
 18.8 Hierarchical Classification .... 326
 18.9 Sampling Error .... 331
 Exercises .... 333
 Questions .... 342
 References .... 343

CHAPTER 19. ANALYSIS OF COVARIANCE
 19.1 Test of Homogeneity of Regression Coefficients .... 344
 19.2 Analogy Between Mean and Regression Coefficient .... 349
 19.3 Sampling Experiment on Regression Coefficients .... 351
 19.4 Test of Homogeneity of Adjusted Means .... 353
 19.5 Sampling Experiment on Adjusted Means .... 356
 19.6 Individual Degree of Freedom .... 359
 19.7 Test of Adjusted Means with Equal Regression Coefficient .... 363
 19.8 Test of Adjusted Means for Randomized Block Experiment .... 366
 19.9 Relation Between Analysis of Covariance and Factorial Experiment .... 370
 Exercises .... 378
 Questions .... 381
 References .... 383

CHAPTER 20. REVIEW II
 20.1 Analysis of Variance .... 384
 20.2 Individual Degree of Freedom .... 386
 20.3 Randomized Block Experiment .... 387
 20.4 Units of Measurement .... 388
 20.5 Applications of Statistical Methods .... 388
 20.6 Power of a Test .... 389

CHAPTER 21. SAMPLING FROM BINOMIAL POPULATION
 21.1 Binomial Population .... 390
 21.2 Sample Mean and Sample Sum .... 393
 21.3 Minimum Sample Size .... 397
 21.4 Test of Hypothesis Concerning Mean .... 402
 21.5 Confidence Interval of Mean .... 405
 21.6 Difference Between Two Means .... 407
 21.7 Test of Homogeneity of Means .... 413
 21.8 Analysis of Variance Versus χ²-Test .... 416
 21.9 Individual Degree of Freedom .... 420
 21.10 Summary and Remarks .... 423
 Exercises .... 427
 Questions .... 430
 References .... 431

CHAPTER 22. SAMPLING FROM MULTINOMIAL POPULATION
 22.1 Multinomial Population .... 432
 22.2 Test of Goodness of Fit .... 432
 22.3 Individual Degree of Freedom for Goodness of Fit .... 435
 22.4 Fitting Frequency Curve .... 436
 22.5 Test of Independence .... 438
 22.6 An Example of Test of Independence .... 439
 22.7 Individual Degree of Freedom for Test of Independence .... 442
 22.8 Computing Shortcut for χ² .... 442
 Exercises .... 444
 Questions .... 446
 References .... 446

CHAPTER 23. SOME COMMONLY USED TRANSFORMATIONS
 23.1 Angular Transformation .... 447
 23.2 An Example of Transformation .... 450
 23.3 Square Root Transformation .... 454
 23.4 Logarithmic Transformation .... 458
 23.5 Normal Score Transformation .... 459
 23.6 Summary and Remarks .... 462
 Exercises .... 464
 Questions .... 468
 References .... 468

CHAPTER 24. DISTRIBUTION-FREE METHODS
 24.1 Median .... 469
 24.2 Hypothesis Concerning Median .... 471
 24.3 Completely Randomized Experiment .... 472
 24.4 Randomized Block Experiment .... 474
 24.5 Sign Test .... 478
 24.6 Remarks .... 482
 Exercises .... 483
 Questions .... 484
 References .... 484

APPENDIX
 Table 1 Table of Random Normal Numbers with Mean Equal to 50 and Variance Equal to 100 .... 487
 Table 2 Table of Random Sampling Numbers .... 507
 Table 3 Area Under the Normal Curve .... 517
 Table 4 Percentage Points of the χ²-Distribution .... 518
 Table 5 Percentage Points of χ²/n .... 519
 Table 6 Percentage Points of the t-Distribution .... 520
 Table 7a 5% Points of the F-Distribution .... 521
 Table 7b 2.5% Points of the F-Distribution .... 523
 Table 7c 1% Points of the F-Distribution .... 525
 Table 7d 0.5% Points of the F-Distribution .... 527
 Table 8a Significant Studentized Ranges for a 5% Level New Multiple Range Test .... 529
 Table 8b Significant Studentized Ranges for a 1% Level New Multiple Range Test .... 530
 Table 9a 95% Confidence Intervals for Mean of Binomial Population .... 531
 Table 9b 99% Confidence Intervals for Mean of Binomial Population .... 532
 Table 10 Transformation of Percentages to Degrees .... 533
 Table 11 Normal Scores for Ranks .... 533

INDEXES
 Index to Theorems .... 537
 Index to Tables .... 538
 Index to Figures .... 540
 Index to Subject Matter .... 541

SYMBOLS AND ABBREVIATIONS
 Lower Case Letters .... 549
 Capital Letters .... 550
 Greek Letters .... 551
CHAPTER 1
INTRODUCTION

Broadly speaking, the term statistics means the collection and tabulation of data and the drawing of conclusions from those data. More specifically, the collection and tabulation of data and the calculation of various indices such as averages and percentages is called descriptive statistics, while the systematic drawing of a conclusion or conclusions from those data is called statistical inference. In statistical inference, the computation of averages and percentages, etc., is considered incidental. It is a means to an end rather than the end itself.

1.1 Statistical Inference

Much of the statistician's technical vocabulary is made up of common
words given special meanings. For example, an observation in statistics is a recording of information, such as a person's height. A population is a set of observations, for example the height of every student in a college. (Note that the technical application of the term population refers to the collection of heights rather than to the collection of people.) A sample is a collection of observations drawn from the population. If the heights of all the students of a college constitute a population, then the heights of some of the students of that college constitute a sample. A random sample is a sample drawn in such a way that all the potential observations of the population have an equal chance of being selected. The process of statistical inference takes place when a conclusion about a population is drawn from a given sample of that population. If the population is already available, there may not be any need for a sample. If the population is not available, what is said to be true of the population is usually derived from a known portion of it. For example, if a manufacturer of light bulbs wishes to know the average life span of his product, he cannot burn out his entire production for the sake of an answer. He must be contented with testing a sample. Therefore, if that manufacturer claims for his product an average life span of 2,000 hours, the buyer should realize (a) that the test sample never reaches the market and (b) that the marketed product has never been tested for life span. The inference about his entire production of light bulbs (population) has been drawn from a small, but carefully tested, part of that production (sample). Statistical inference has very broad applications in science and technology. Scientists who work with animals and plants are primarily interested not in the particular animal or plant which they examine (sample), but in reaching a conclusion which may be applied to a large group (popu-
lation) which the particular plant or animal represents. Scientists themselves call this process induction. Statisticians call it statistical inference. Thus modern statistics may be regarded as the technology of the scientific method. It is an indispensable tool for the worker in the experimental sciences.
1.2 Reference System
A great many cross-references are used in this book.
The reader's attention is frequently called to a certain section, theorem, table, or figure. For easy reference, indexes of theorems, tables, and figures, accompanied by their page numbers, are given at the end of the book. The number given a theorem, a table, or a figure is the same as that given the section in which it appears. The first, second, or third theorem in a single section is designated by a, b, c, etc. Both Theorems 2.4a and 2.4b, for example, appear in Section 2.4, or Section 4 of Chapter 2.
REFERENCES

Bross, Irwin D. J.: Design for Decision, The Macmillan Company, New York, 1953.
Wilson, E. Bright, Jr.: An Introduction to Scientific Research, McGraw-Hill Book Company, Inc., New York, 1952.
CHAPTER 2
DESCRIPTIVE STATISTICS

This chapter is devoted to some descriptive measures of a population. These measures are used throughout the book.
2.1 Mathematical Notations

One of the most frequently used notations in statistics is the capital Greek letter sigma, Σ, which means "the sum of." The observations are denoted by y's, y₁ being the first observation, y₂ the second observation, and so forth. If the 5 observations 3, 2, 1, 3, 1 are designated by y₁, y₂, y₃, y₄, y₅ respectively, the sum of the 5 observations, which is 10, is designated by Σy, that is,

    Σy = y₁ + y₂ + y₃ + y₄ + y₅ = 3 + 2 + 1 + 3 + 1 = 10.

Besides Σy, two other expressions, Σy² and (Σy)², are commonly used. The expression Σy² means the sum of the squares of the observations, that is,

    Σy² = y₁² + y₂² + y₃² + y₄² + y₅² = 9 + 4 + 1 + 9 + 1 = 24.

The square of the sum of the observations is written as (Σy)², that is,

    (Σy)² = (y₁ + y₂ + y₃ + y₄ + y₅)² = (3 + 2 + 1 + 3 + 1)² = 10² = 100.

There is no profound reasoning behind the notations. They are merely shorthand symbols. The important thing is to know what each symbol stands for. The quantities Σy² and (Σy)² look very much alike, but they are entirely different. If each number is squared individually and then the squares added together, the resulting quantity is Σy². If the numbers are added together and then the total squared, the resulting quantity is (Σy)².
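The distinction is easy to verify by direct computation. The following sketch, in Python (an editorial illustration; nothing like it appears in the original text), evaluates the three quantities for the five observations above:

```python
# Verify Σy, Σy², and (Σy)² for the observations 3, 2, 1, 3, 1.
y = [3, 2, 1, 3, 1]

sum_y = sum(y)                      # Σy: add the observations
sum_y_sq = sum(v * v for v in y)    # Σy²: square first, then add
sq_sum_y = sum(y) ** 2              # (Σy)²: add first, then square

print(sum_y, sum_y_sq, sq_sum_y)    # prints: 10 24 100
```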
2.2 Mean

The mean of a population is merely the average of its observations. Suppose the lengths of three sticks are 3, 1, and 8 inches respectively. The mean length is (3 + 1 + 8)/3 or 12/3 or 4 inches. In general, the mean is the sum of the observations divided by the number of observations. Symbolically, the mean of a population may be expressed as

    μ = (y₁ + y₂ + ... + y_N) / N = Σy / N,

where the Greek letter mu, μ, is the mean, the y's are the observations,
the letter N is the number of observations, and the three dots stand for all the terms in between.

2.3 Variance and Standard Deviation

The variance of a population is a measure of the variation of the observations within that population. If one has 4 populations, each containing 3 observations: (a) 4, 4, 4; (b) 2, 3, 7; (c) 1, 2, 9; and (d) 0, 1, 11, the mean of each of the four populations is equal to 4, but the amount of variation of the observations within each population is not the same. There is no variation in set (a). The variation is larger in (b), still larger in (c), and largest in (d). The most readily understood measure of variation is the range, which is the difference between the largest and the smallest observations. The ranges of the populations (a), (b), (c), and (d) are 0, 5, 8, and 11 respectively. There are many measures of variation besides the range. One of the most important measures is called the variance, which is defined as

    σ² = [(y₁ − μ)² + (y₂ − μ)² + ... + (y_N − μ)²] / N = Σ(y − μ)² / N,    (1)

where σ² is the variance, μ the mean, the y's the observations, and N the number of observations. The variances for the 4 populations are as follows:

    (a) σ² = [(4 − 4)² + (4 − 4)² + (4 − 4)²] / 3 = 0
    (b) σ² = [(2 − 4)² + (3 − 4)² + (7 − 4)²] / 3 = (4 + 1 + 9) / 3 = 14/3
    (c) σ² = [(1 − 4)² + (2 − 4)² + (9 − 4)²] / 3 = (9 + 4 + 25) / 3 = 38/3
    (d) σ² = [(0 − 4)² + (1 − 4)² + (11 − 4)²] / 3 = (16 + 9 + 49) / 3 = 74/3
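These hand computations can be checked by machine. The sketch below (Python; an editorial addition, not part of the original text) applies the definition to the four populations and recovers 0, 14/3, 38/3, and 74/3. Exact fractions are used so the results match the text rather than rounded decimals.

```python
# Variance of each of the four populations (a)-(d) of Section 2.3.
from fractions import Fraction

populations = {"a": [4, 4, 4], "b": [2, 3, 7], "c": [1, 2, 9], "d": [0, 1, 11]}

for name, y in populations.items():
    n = len(y)
    mu = Fraction(sum(y), n)                  # population mean μ
    var = sum((v - mu) ** 2 for v in y) / n   # σ² = Σ(y − μ)²/N
    print(name, mu, var)                      # a: 0, b: 14/3, c: 38/3, d: 74/3
```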
The square root of the variance is called the standard deviation, which is also used as a measure of variation. The concept of variance is familiar to most people even though the method of computation may be new. The climates of different regions are often described as extreme or temperate. In an extreme climate, the temperature varies a great deal from summer to winter; the variance of the 365 daily temperature readings is large. If the temperatures do not vary much from season to season, then, technically speaking, the variance is small. For another illustration of the principle of variance, one may consider the grading of eggs. In a dozen eggs chosen at random before they are
graded for size, some will be large and some will be small. The variance of the weights of the 12 eggs will be large. After being graded for size and boxed for sale, however, the eggs within each box will be of more or less the same size. The variance of the weights of the eggs taken from any particular box will then be small; or, if all the eggs are of the same weight, the variance equals zero. All of these remarks about the variance apply also to the standard deviation. The standard deviation, being the square root of the variance, increases and decreases with the variance. If the variance is equal to zero, the standard deviation is also equal to zero.

2.4 Effect of Change in Observations on Mean and Variance

The mean and variance are calculated from the observations. If changes are made in the observations, the mean and variance, of course, will be changed. For example, in the three observations 11, 16, 15, the mean is equal to 14. If 10 is added to each of the observations, the new observations will be 21, 26, and 25, and the new mean will be 24, which is 10 more than the old mean. If 5 is subtracted from each of the observations, the mean will be reduced by 5. This kind of change in the observations does not affect the variance, and consequently does not affect the standard deviation. The variance of 11, 16, 15 is

    σ² = [(11 − 14)² + (16 − 14)² + (15 − 14)²] / 3 = (9 + 4 + 1) / 3.

If 10 is added to each of the observations, the mean of the new observations 21, 26, 25 will be 24 and the variance will be

    σ² = [(21 − 24)² + (26 − 24)² + (25 − 24)²] / 3 = (9 + 4 + 1) / 3,

which remains unchanged. The variance is calculated from the deviations of the observations from their mean. Both observation and mean are changed by the same amount; therefore the deviation is not changed. It can be seen from the above example that 11 − 14 = 21 − 24, 16 − 14 = 26 − 24, and 15 − 14 = 25 − 24. Therefore, the variance is not affected by such a change in the observations. This result may be summarized in the following theorem:

Theorem 2.4a If a fixed amount is added to or subtracted from each of the observations, the mean will be increased or decreased by that amount, but the variance and standard deviation are not affected.

If each of the three observations 11, 16, 15 is multiplied by a fixed quantity, say 10, the new observations 110, 160, 150 will be 10 times
their corresponding old observations, and the new mean, 140, is also 10 times the old mean. The variance of the old observations is

    σ² = [(11 − 14)² + (16 − 14)² + (15 − 14)²] / 3 = (9 + 4 + 1) / 3,

and that of the new observations is

    σ² = [(110 − 140)² + (160 − 140)² + (150 − 140)²] / 3 = (900 + 400 + 100) / 3.

This shows that the new variance is 100 times the old one, and the new standard deviation, being the square root of the variance, is 10 times the old one. The deviations 11 − 14, 16 − 14, and 15 − 14 are each multiplied by 10; therefore, each of the squares of the deviations is multiplied by 100. This discussion concerning a constant multiplier can be extended to a constant divisor, because division by 10 is simply multiplication by 1/10. This result can be summarized in the following theorem:

Theorem 2.4b If each of the observations is multiplied by a fixed quantity m, the new mean is m times the old mean, the new variance is m² times the old variance, and the new standard deviation is m times the old standard deviation.

If every observation is multiplied by 2, the new mean is twice as large as the old one, the new variance is four times as large as the old one, and the new standard deviation is twice as large as the old one. If every observation is divided by 2, which is the same as multiplying by 1/2, the new mean is 1/2 as large as the old one, the new variance is 1/4 as large as the old one, and the new standard deviation is 1/2 as large as the old one.
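Both theorems can be verified empirically, in the spirit of the book. The sketch below (Python; an editorial illustration, not the author's) applies them to the observations 11, 16, 15:

```python
# Check Theorems 2.4a and 2.4b on the observations 11, 16, 15.
def mean_and_variance(y):
    n = len(y)
    mu = sum(y) / n
    return mu, sum((v - mu) ** 2 for v in y) / n

y = [11, 16, 15]
print(mean_and_variance(y))                    # mean 14.0, variance 14/3 ≈ 4.67
print(mean_and_variance([v + 10 for v in y]))  # mean shifts to 24.0; variance unchanged
print(mean_and_variance([v * 10 for v in y]))  # mean 140.0; variance multiplied by 10² = 100
```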
2.5 Frequency Table

This section describes a method of presenting the observations in tabular form. Suppose the ages of ten persons are 21, 20, 21, 22, 20, 22, 21, 20, 23, and 21. There are 3 persons 20 years old, 4 persons 21 years old, 2 persons 22 years old, and 1 person 23 years old. Now the ten persons are divided into four age groups. Each group is called a class. The number of persons in a class is called the frequency of that class. In general, the frequency is the number of times an observation occurs. The table which shows the class frequencies is called a frequency table. The total number of observations, which is the sum of all the class frequencies, is called the total frequency. The frequency table of this example is shown in Table 2.5a.

TABLE 2.5a

    Age    Frequency    Relative Frequency    Relative Cumulative Frequency
     y         f               r.f.                      r.c.f.
    20         3               30%                        30%
    21         4               40%                        70%
    22         2               20%                        90%
    23         1               10%                       100%
    Total     10              100%
The stated age refers to the nearest birthday: the age 21 means that the true age is somewhere between 20.5 and 21.5, 21 being located at the middle of the class. The relative frequency of a class is the frequency of that class expressed as a percentage of the total frequency. For instance, in Table 2.5a, the frequency of age 21 is 4 and the corresponding relative frequency is 40%, because 4 out of 10 is 40%. The relative cumulative frequency (r.c.f.) of a certain point is the sum of the relative frequencies up to that point. The r.c.f.'s 30%, 70%, 90%, and 100% in Table 2.5a indicate the percentages of observations that fall below 20.5, 21.5, 22.5, and 23.5 respectively.

The mean can be found from a frequency table. The mean is the sum of the observations divided by the total number of observations. The observation 20 occurred three times and has to count three times; 21 occurred four times and has to count four times, and so forth. Therefore the sum of the observations is not 20 + 21 + 22 + 23, but (20 × 3) + (21 × 4) + (22 × 2) + (23 × 1), which is equal to 211. The total number of observations, which is called the total frequency, is 10. The mean is 211/10 or 21.1. Symbolically, the mean is

    μ = Σyf / N.    (1)

The variance can also be found from a frequency table. The deviation squared (20 − μ)² has to be multiplied by 3 because the observation 20 occurred 3 times. In general, each (y − μ)² has to be multiplied by f. Symbolically, the variance is

    σ² = Σ(y − μ)²f / N.    (2)

The details of the calculation of the mean and the variance of the 10 observations are given in Table 2.5b.

TABLE 2.5b

     y     f     yf    (y − μ)    (y − μ)²    (y − μ)²f
    20     3     60     −1.1        1.21        3.63
    21     4     84     −0.1         .01         .04
    22     2     44      0.9         .81        1.62
    23     1     23      1.9        3.61        3.61
    Sum   10    211                             8.90

    μ = Σyf/N = 211/10 = 21.1
    σ² = Σ(y − μ)²f/N = 8.90/10 = .890
    σ = √.890 = .943
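Formulas (1) and (2) translate directly into a short computation. The sketch below (Python; an editorial addition) reproduces the results of Table 2.5b:

```python
# Mean and variance from the frequency table of Tables 2.5a and 2.5b.
ages = [20, 21, 22, 23]
freq = [3, 4, 2, 1]

N = sum(freq)                                                 # total frequency, 10
mu = sum(y * f for y, f in zip(ages, freq)) / N               # μ = Σyf/N = 21.1
var = sum((y - mu) ** 2 * f for y, f in zip(ages, freq)) / N  # σ² = Σ(y − μ)²f/N = 0.890
sd = var ** 0.5                                               # σ ≈ 0.943

print(N, mu, var, sd)
```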
The detaila of the calculation of the mean and the variance of the 10 observations are given in Table 2.Sb. 2.6 Histogram and Frequency Curve A graph of a frequency table is called a histogram. The histogram of the frequency table of the last section (Table 2.5&) is shown in Fig. 2.6&. It should he noted from the figure that each rectangle is constructed on the base which represents the interval an observation stands for. The vertical scale which represents the frequency always start. from zero.
Fig. 2.6a (histogram of Table 2.5a: frequency f, from 0 to 5, on the vertical axis against age y, 20 to 23, on the horizontal axis)
The height of a rectangle is equal to the frequency of that class. Since the bases of the rectangles are of the same length, the areas of the rectangles are proportional to the frequencies. For instance, the frequency of the 21-year-old class is twice as large as that of the 22-year-old class, and the areas of the two corresponding rectangles maintain the same relation. The ratio of the area of a rectangle to the total area of all the rectangles is the relative frequency of that class. For example, the area of the rectangle representing the 23-year-old class is 10% of the total area.
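The area bookkeeping is simple arithmetic, as the following sketch (Python; an editorial addition) shows for Table 2.5a; with equal class widths, each rectangle's share of the total area is exactly the relative frequency of its class:

```python
# Area of each histogram rectangle as a share of the total area.
freq = {20: 3, 21: 4, 22: 2, 23: 1}   # Table 2.5a
width = 1.0                           # every class is one year wide

total_area = sum(f * width for f in freq.values())
for age, f in freq.items():
    print(age, f"{f * width / total_area:.0%}")   # 30%, 40%, 20%, 10%
```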
Fig. 2.6b (histogram with one rectangle per one-year class, ages 20 to 23; vertical axis f, horizontal axis y)
A problem involving one million persons with ages ranging from 20 to 23 would still necessitate only 4 rectangles if each interval is chosen as one year (Fig. 2.6b). If age is measured by the month, each year-class becomes 12 month-classes, and consequently the histogram will contain a total of 48 rectangles (Fig. 2.6c). If the process is continued and age is measured by the day, there will ordinarily be (4 × 365) + 1 or 1461 days in the four years and the same number of rectangles in the histogram. As the unit of measurement becomes smaller, the width of the rectangles becomes smaller, while the number of rectangles becomes larger, but the total area of the rectangles remains the same (Fig. 2.6b and Fig. 2.6c). As this process goes on, the tops of the rectangles become more and more like a curve. The resulting curve is called a frequency curve. No part of a frequency curve can get below the horizontal axis, because the frequency, being the number of observations, cannot be negative.
Fig. 2.6c (the same histogram with age measured by the month: 48 narrower rectangles over ages 20 to 23; vertical axis f, horizontal axis y)
As in the histogram, the relative frequency is represented by the area between the curve and the horizontal axis. But unlike the histogram, the frequency curve is used only to represent the relative frequency. Therefore the total area under the curve, being the total of the relative frequencies, is equal to 100%.

2.7 Remarks

The disconnected topics presented in this chapter have little direct application to experimental data. Scientists are no longer content to use only descriptive measures such as mean and variance on their data. But one must know something about descriptive measures before he undertakes the study of statistical inference, just as one must know the alphabet of a foreign language before he can learn to read, write, and speak the language itself. In a sense, the material of this chapter is the statistician's alphabet. The symbols must be recognized on sight, and the theorems, which will be repeatedly quoted in later chapters, must be known thoroughly. The usefulness of the Greek symbols μ and σ² will become increasingly clear as the subject is developed. Although in this chapter a set of observations is not always labeled "sample" or "population," from Chapter 4 on this distinction is extremely important and is made constantly. The Greek letters μ and σ² are used to denote the mean and variance of a population. The corresponding quantities of a sample are denoted by the Latin letters ȳ and s² respectively. Confusion between a sample and a population can be avoided in this way.
EXERCISES

(1) A population consists of the following observations: 1.8, 2.0, 1.8, 1.9, and 2.0.
    (a) Find the mean. (μ = 1.90)
    (b) Find the variance. (σ² = 0.008)
    (c) Subtract 1.0 from each of the given observations and find the mean and variance of the new observations, and thus verify Theorem 2.4a.
    (d) Multiply each of the original observations by 5 and find the mean and variance of the new observations, and thus verify Theorem 2.4b.
(2) A population consists of the following observations: 17.2, 17.1, 17.0, 17.1, 16.9, 17.0, 17.1, 17.0, 17.3, 17.2, 17.1, 17.0, 17.1, 16.9, 17.0, 17.1, 17.3, 17.2, 17.4, 17.1.
    (a) Find the mean. (μ = 17.105)
    (b) Find the variance. (σ² = 0.016475)
    (c) Find the standard deviation.
    (d) Make a frequency table showing frequency, relative frequency, and relative cumulative frequency (r.c.f.).
    (e) Make a histogram.
(3) Find the mean and variance from the frequency table of Exercise 2(d). Note that the values obtained are the same as those obtained in Exercise 2, parts (a) and (b).
(4) Consider the following set of observations as a population:

    24  25  23  25  25  26  27  25  27  25
    25  24  25  26  26  25  24  25  25  26

    Find: (a) the mean, (b) the variance. Make: (c) a frequency table, (d) a histogram.
(5) Find the mean and variance of the 500 observations given in Table 4.1a. (μ = 50, σ² = 100)
(6) Subtract 50 from each of the 500 observations given in Table 4.1a. Find the new mean and the new variance.
(7) Divide each of the 500 observations given in Table 4.1a by 10. Find the new mean and the new variance.
(8) For the 500 observations given in Table 4.1a, subtract 50 from each of the observations and divide each difference by 10. Then the relation between a new observation u and an old observation y is

    u = (y − 50) / 10.

    Find the mean and variance of the new observations u.
(9) In a study of the number of bacteria in rat feces, a direct microscopic count was attempted. Five pellets were suspended in 100 ml. of water and were finally diluted to approximately 1/500 with a sterile 0.01 per cent aqueous solution of agar. A film was made by spreading 0.01 ml. of the final dilution over an area of 1 square centimeter on a thoroughly cleaned glass slide. The film was dried at room temperature, fixed, and stained with crystal violet. Twenty-five random fields were examined with a microscope and the number of bacteria in each field was counted. The following are the results obtained:

    19  21  13  25  33
    35  33   9   0   8
    37  17  23  15   1
    25  32  35  10  39
     9  24   7   1   7

    Compute the mean, variance, and standard deviation. (Wallace, R. H.: "A Direct Method for Counting Bacteria in Feces," Journal of Bacteriology, Vol. 64, 1952, pp. 593-594.)
(10) The following data were obtained in a study of the precision of radioactivity measurements. A constant source (essentially) of radiation was counted repeatedly using the Tracerlab Autoscaler (SC-1B) employing the automatic sample changer (SC-6A) and a printing interval timer (SC-5A). The scale setting used was 4,096 counts. The following table shows the counts per minute (CPM) obtained along with their frequencies. Find the mean and variance.

     CPM      f        CPM      f
    9,035     1       9,563   112
    9,069     0       9,600    91
    9,102     1       9,638    88
    9,136     1       9,676    90
    9,170     2       9,714    64
    9,204     3       9,752    65
    9,239     8       9,791    37
    9,274    20       9,830    31
    9,309    22       9,870    14
    9,344    31       9,910    10
    9,380    45       9,950     6
    9,416    54       9,990     1
    9,452    79      10,031     3
    9,489    81      10,072     4
    9,526    85      10,114     1

    Total 1,050

    (Courtesy of Dr. Robert L. Stearman, Johns Hopkins University)
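For a table of this size the frequency-table formulas of Section 2.5 are well suited to machine computation. The sketch below (Python; an editorial addition outlining a solution, not a worked answer from the book) sets up μ = Σyf/N and σ² = Σ(y − μ)²f/N for the CPM data:

```python
# Mean and variance of the CPM data of Exercise 10 from its frequency table.
cpm = [9035, 9069, 9102, 9136, 9170, 9204, 9239, 9274, 9309, 9344,
       9380, 9416, 9452, 9489, 9526, 9563, 9600, 9638, 9676, 9714,
       9752, 9791, 9830, 9870, 9910, 9950, 9990, 10031, 10072, 10114]
freq = [1, 0, 1, 1, 2, 3, 8, 20, 22, 31,
        45, 54, 79, 81, 85, 112, 91, 88, 90, 64,
        65, 37, 31, 14, 10, 6, 1, 3, 4, 1]

N = sum(freq)                                                # 1,050 counts in all
mu = sum(y * f for y, f in zip(cpm, freq)) / N               # μ = Σyf/N
var = sum((y - mu) ** 2 * f for y, f in zip(cpm, freq)) / N  # σ² = Σ(y − μ)²f/N
print(N, mu, var)
```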
QUESTIONS

(1) If the observations are numbers of inches, what is the unit of measurement for (a) the mean, (b) the variance, (c) the standard deviation?
(2) The mean and standard deviation of a set of 500 observations are 50 and 10 respectively. What are the new mean and new standard deviation if (a) 8 is added to each of the observations, (b) 9 is subtracted from each of the observations, (c) each observation is multiplied by 3, (d) each observation is divided by 5, (e) 50 is subtracted from each of the observations and then each difference is divided by 10?
CHAPTER 3
THE NORMAL DISTRIBUTION

Of the infinitely many frequency curves in statistics, one of the most important is the normal curve. The word "normal" is used to identify rather than to describe the curve, and the term "normal curve" does not mean that other frequency curves are abnormal. If one describes a particular kind of educational institution as a "normal" school, one does not intend to imply that other kinds of educational institutions are "abnormal." A normal curve is thus a particular frequency curve as distinguished from many other frequency curves.
3.1 Some Properties of the Normal Curve

The graph of the normal curve is shown in Fig. 3.1a. It can be seen from the graph that it is a bell-shaped curve. (Some other curves are also bell-shaped.) It is symmetric with respect to the line drawn perpendicular to the horizontal axis at the mean μ. If the graph paper is folded along the line C in Fig. 3.1a, the left and the right branches of the curve will coincide. Actually, the normal curve is not one curve, but a family of curves. Fig. 3.1a shows a normal curve with mean equal to 50 and standard deviation equal to 10.
Fig. 3.1a (normal curve with mean μ = 50 and standard deviation σ = 10; horizontal axis y marked at 50, 60 = μ + σ, and 70 = μ + 2σ; vertical axis relative frequency, up to about .4)
The mean μ and the standard deviation σ have to be specified before a particular normal curve is identified. For example, Fig. 3.1b shows three normal curves with their means equal to 40, 50, and 60 respectively. The shapes of the three curves are identical; the difference is in their location with respect to the horizontal axis. The one with mean equal to 60 is at the right of the one with mean equal to 50, which in turn is at the right of the one with mean equal to 40. Therefore, if the mean is given, the location of the curve is fixed. The standard deviation σ determines the spread of the curve. Fig. 3.1c shows three normal curves with equal means but unequal standard deviations. The fact that all three curves center around 55 shows that the mean of each of the curves is equal to 55. Curve A, which has the largest spread, has the largest standard deviation, and curve C the smallest. The three curves in Fig. 3.1b have the same standard deviation, as shown by the fact that they all spread out to the same extent.

Fig. 3.1b (three normal curves of identical shape with means 40, 50, and 60)

Fig. 3.1c (three normal curves A, B, and C with equal means of 55 but unequal standard deviations)

As in all frequency curves, the total area under the normal curve is equal to 100% (Section 2.6). The symmetric property of the normal curve (Fig. 3.1a) indicates that 50% of the observations fall below the mean and 50% fall above it. In the interval μ to μ + σ lie approximately 34% of the observations. For example, the normal curve shown in Fig. 3.1a has the mean μ equal to 50 and standard deviation σ equal to 10. Thirty-four percent of the observations are greater than 50 and less than 60. Then in the interval μ − σ to μ + σ, or from 40 to 60, lie 34% + 34% or 68% of the observations. Since 50% of the observations lie below the mean μ and 34% lie between μ and μ + σ, 50% + 34% or 84% of the observations lie below μ + σ. In terms of the above example, 84% of the observations have values less than 60 and consequently 16% of the observations have values greater than 60.

3.2 Table of the Normal Curve

An example of a normal curve with mean equal to 50 and standard deviation equal to 10 is given in the last section. The relative frequency of the observations between 50 and 60 is given as 34%. This percentage is not obtained by observing the graph, but by complicated computation. If one wishes to know the relative frequency of the observations between any two points, such as 51.9 and 62.7, he can, if he wishes, compute the percentage by the process called numerical integration. But it is tedious work even if he possesses the knowledge and tools to do so. As a result, tables are made for the convenience of the public, and a needed value can be readily obtained from a table. The ordinates (heights of the curve) at various points along the horizontal axis, as well as the area under the curve between various points, are extensively tabulated. There are many such tables available. An abbreviated table is given in Table 3, Appendix.
The tables are made for the normal curve with mean equal to 0 and standard deviation equal to 1. However, they can be used for a normal distribution with any mean and any standard deviation if changes are made in the observations (Theorems 2.4a and 2.4b). For example, suppose the distribution of the observations of a population follows the normal curve with mean equal to 50 and standard deviation equal to 10. If 50 is subtracted from each observation, then the mean of (y − 50) is equal to zero and the standard deviation of (y − 50) is still equal to 10. If each (y − 50) is then divided by 10, the new mean is equal to 0/10, which is 0, and the new standard deviation is equal to 10/10 or 1. Then the transformed observation u = (y − 50)/10, or in general,

    u = (y − μ) / σ,    (1)

will have the mean equal to 0 and the standard deviation equal to 1. For example, the u-value of the observation 60 is (60 − 50)/10 or 1, and that of 45 is (45 − 50)/10 or −0.5. This is just another way of saying that 60 is one standard deviation above the mean 50, and that 45 is one-half a standard deviation below the mean 50. Table 3 of the Appendix gives the area under the curve from −∞ (minus infinity) to various values of u. This area represents the relative cumulative frequency from −∞ to the given point. The relative cumulative frequencies for u = 1 and u = −0.5 are given as 84.13% and 30.85% respectively. This means that the relative frequency between −∞ and u = 1, or one standard deviation above the mean, is 84.13%, and that between −∞ and u = −0.5, or one-half standard deviation below the mean, is 30.85%. Consequently, the relative frequency of the observations falling between u = −0.5 and u = 1 is 84.13 − 30.85 or 53.28%. These relative frequencies, and not the general shape of the curve, identify the normal distribution. Because many frequency curves are bell-shaped, casual observation of the graph of a frequency curve will not enable one to tell whether it is normal or not.

Certain values shown in Table 3 of the Appendix are frequently used in statistical writing without explanation. One of these values is 1.960 and the other is 2.576 (or 2.5758). Within the interval μ − 1.960σ to μ + 1.960σ lie 95% of the observations of a normal distribution; within the interval μ − 2.576σ to μ + 2.576σ lie 99% of the observations. For example, if the mean and standard deviation of a normal population are 50 and 10 respectively, then within the interval 50 − 1.960(10) to 50 + 1.960(10), or 30.40 to 69.60, lie 95% of the observations of that population, and within the interval 50 − 2.576(10) to 50 + 2.576(10), or 24.24 to 75.76, lie 99% of the observations. It may also be said that 95% of u's fall between −1.960 and 1.960 and 99% of u's fall between −2.576 and 2.576.
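The table lookups of this section can be reproduced with the error function, to which the normal r.c.f. is related by Φ(u) = ½[1 + erf(u/√2)]. The sketch below (Python; an editorial addition, not part of the original text) recovers the 84.13%, 30.85%, and 53.28% figures and the 95% and 99% intervals:

```python
# Relative cumulative frequencies of the normal curve via math.erf.
from math import erf, sqrt

def rcf(u):
    """Area under the normal curve (mean 0, σ = 1) from −∞ to u."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

mu, sigma = 50.0, 10.0
u60 = (60 - mu) / sigma                  # u = (y − μ)/σ = 1.0
u45 = (45 - mu) / sigma                  # u = −0.5
print(rcf(u60), rcf(u45))                # ≈ 0.8413 and 0.3085
print(rcf(u60) - rcf(u45))               # ≈ 0.5328, the 53.28% of the text
print(mu - 1.960 * sigma, mu + 1.960 * sigma)   # 30.40 to 69.60: middle 95%
print(mu - 2.576 * sigma, mu + 2.576 * sigma)   # 24.24 to 75.76: middle 99%
```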
3.3 Normal Probability Graph Paper
A special kind of graph paper can be used for determining whether a set of observations is distributed normally. If the values of Table 3 of the Appendix are plotted on ordinary graph paper (Fig. 3.3a), with the horizontal scale showing either y or u, and the vertical scale showing the relative cumulative frequencies corresponding to y or u, the curve is S-shaped. If the curve were drawn on a sheet of rubber, it is conceivable that it could be stretched into a straight line, as in Fig. 3.3b. When the curve is stretched into a straight line, the vertical scale becomes badly distorted. A special graph paper in effect stretches the curve into a straight line by means of a distorted vertical scale. This kind of graph paper is called normal probability graph paper.
Fig. 3.3a. Relative cumulative frequency (r.c.f.) of the normal distribution plotted against y (or u) on ordinary graph paper: an S-shaped curve running from μ − 3σ to μ + 3σ.
Fig. 3.3b. The same relative cumulative frequencies plotted on the distorted (probability) scale, from 0.1% to 99.9%: a straight line, with the 50% point at μ and the 84% point at μ + σ.
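The effect of the distorted scale can be imitated numerically: replacing each relative cumulative frequency by its u-value before plotting straightens the S-shaped curve. A minimal sketch in Python, assuming the SciPy and Matplotlib libraries are available:

    # Sketch: imitating normal probability graph paper.  Converting each
    # relative cumulative frequency to its u-value distorts the vertical
    # scale exactly as the special paper does, so a normal distribution
    # plots as a straight line.
    import matplotlib.pyplot as plt
    from scipy.stats import norm

    # r.c.f. of a normal curve (mu = 50, sigma = 10) at mu - 3 sigma, ..., mu + 3 sigma
    y   = [20, 30, 40, 50, 60, 70, 80]
    rcf = [0.0013, 0.0228, 0.1587, 0.5000, 0.8413, 0.9772, 0.9987]

    u = [norm.ppf(p) for p in rcf]      # the "stretched" vertical positions
    plt.plot(y, u, "o-")
    plt.xlabel("y")
    plt.ylabel("u")
    plt.show()
    # The points fall on a straight line; the y at u = 0 estimates the
    # mean, and the y at u = 1 estimates mu + sigma.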
After the relative cumulative frequencies are plotted on it, a ruler can be used to determine whether the points are on a straight line. If the points are on a straight line, the set of observations follows the normal distribution, and the mean and standard deviation of this normal distribution can be read off the graph (Fig. 3.3b). The 50% point on the vertical scale corresponds to the mean μ on the horizontal scale, and the 84.13% point corresponds to μ + σ. After μ and μ + σ are determined, σ can be obtained by subtraction. The use of normal probability graph paper is illustrated in the next chapter. The importance of the normal distribution in present-day statistics and the value of knowing whether a set of observations follows the normal distribution will be shown as the subject develops.

3.4 Probability
The term probability has many definitions, but in statistics it is another expression for the term relative frequency. For example, in terms of relative frequency one might say that 34% of the observations of a normal distribution fall inside the interval μ to μ + σ. In terms of probability, one would say that the probability of an observation of a normal distribution falling inside the interval μ to μ + σ is .34. The relative frequency, which may be expressed either as a percentage or a decimal, such as 34% or .34, is used in connection with all the observations collectively; while the probability, which is usually expressed as a decimal, is used in connection with a single observation. It must be realized from the above example that the probability of an occurrence is based on the relative frequency. If the relative frequency is 50%, the probability is .50. In other words, any statement of probability originates from the relative frequency.
EXERCISES
(1) If a normal population has a mean equal to 10 and a variance (not standard deviation) equal to 4,
(a) what percentage of the observations fall between 9 and 14? (66.88%)
(b) what percentage of the observations fall between 13 and 15? (6.06%)
(c) within what range will the middle 95% of the observations fall? (6.08 to 13.92)
(d) within what range will the middle 99% of the observations fall? (4.85 to 15.15)
(2) For the same population given in Exercise (1), find the relative cumulative frequencies less than 6, 7, 8, 9, 10, 11, 12, 13, and 14. Plot the nine points on the normal probability graph paper. Read the mean and the standard deviation off the graph. Check the answers in Exercise (1) graphically.
(3) If a normal population has a mean equal to 400 and variance equal to 10,000,
(a) what percentage of observations fall between 200 and 300?
(b) what percentage of observations fall below 250?
(c) within what range will the middle 50% of the observations fall?
(d) within what range will the middle 95% of the observations fall?
(4) The test scores of a large number of students follow the normal distribution with mean equal to 500 and standard deviation equal to 100. Suppose 5% of the students receive the grade A, 20% receive B, 50% receive C, 20% receive D, and 5% receive F. What are the dividing scores of the 5 grades?
(5) The test scores of a large number of students follow the normal distribution with mean equal to 500 and standard deviation equal to 100.
(a) The top 25% of the students have scores above what value?
(b) The bottom 25% of the students have scores below what value?
QUESTIONS
(1) The normal curve is really a family of curves. What uniquely identifies a particular curve?
(2) Is a bell-shaped curve always a normal curve?
(3) What is the method discussed in this chapter for determining whether a distribution is normal or not?
(4) What are the mean and standard deviation of u?
(5) If 10 is added to each of the observations of a normal distribution, what is the effect on the frequency curve?
(6) If every observation of a normal distribution is multiplied by 3, what is the effect on the frequency curve?
(7) What is the value 1.960?
REFERENCES
Pearson, Karl (Editor): Tables for Statisticians and Biometricians, Part I, Tables I-III, Biometric Laboratory, University College, London, 1930.
U.S. National Bureau of Standards: Tables of Probability Functions, Vol. II, 1942.
CHAPTER 4
SAMPLING EXPERIMENTS
The theorems given in the following chapters are verified by sampling experiments. To avoid the repetition of the detailed description of the experiments in each chapter, the equipment and procedure used in the entire series of sampling experiments are described here. The results needed to verify the theorems are presented in later chapters.
4.1 Description of Population
A basketful of 500 round, metal-rimmed, cardboard tags is the only equipment needed for the sampling experiments. A two-digit number, which is the observation, is written on each of the tags, making a population of 500 observations. The frequency table of the population is shown in Table 4.1a, and the histogram is shown in Fig. 4.1a. The observation y is the two-digit number written on the tag and the frequency f is the number of tags bearing that number. This population of 500 observations is constructed in such a way that it follows approximately the normal distribution with mean equal to 50 and standard deviation equal to 10 (Fig. 4.1a). The more condensed frequency table showing the relative cumulative frequencies is given in Table 4.1b.
Table 4.1a

 y   f      y   f      y   f      y   f
20   1     36   7     51  20     66   6
21   0     37   9     52  19     67   5
22   0     38  10     53  19     68   4
23   1     39  11     54  18     69   3
24   1     40  12     55  18     70   3
25   1     41  13     56  17     71   2
26   1     42  14     57  16     72   2
27   1     43  16     58  14     73   1
28   2     44  17     59  13     74   1
29   2     45  18     60  12     75   1
30   3     46  18     61  11     76   1
31   3     47  19     62  10     77   1
32   4     48  19     63   9     78   0
33   5     49  20     64   7     79   0
34   6     50  20     65   6     80   1
35   6
Fig. 4.1a. Histogram of the tag population of 500 observations, which follows approximately the normal distribution with mean 50 and standard deviation 10.
Table 4.1b
y              f    c.f.   r.c.f.(%)
Below 30.5    13     13       2.6
30.5 to 35.5  24     37       7.4
35.5 to 40.5  49     86      17.2
40.5 to 45.5  78    164      32.8
45.5 to 50.5  96    260      52.0
50.5 to 55.5  94    354      70.8
55.5 to 60.5  72    426      85.2
60.5 to 65.5  43    469      93.8
65.5 to 70.5  21    490      98.0
Above 70.5    10    500     100.0
Total        500
Fig. 4.1b. The relative cumulative frequencies of Table 4.1b plotted on normal probability graph paper; the nine points fall on a straight line.
The relative cumulative frequencies are plotted against the values 30.5, 35.5, ..., 70.5 on the probability graph paper (Fig. 4.1b). The fact that the nine points are on a straight line indicates that the population is normal. The 50% point on the vertical scale corresponds to y = 50 on the horizontal scale. This shows that the mean μ is equal to 50. The 84.13% point on the vertical scale corresponds to y = 60 on the horizontal scale. This shows that μ + σ = 60, or σ = 10 because μ = 50. Because a substantial part of present-day statistical methods is based on random sampling from the normal population, the tag population is deliberately made normal and the samples are drawn at random. This tag population is referred to quite often both in the text and in the exercises in later chapters.

4.2 Drawing of Samples
After the 500 tags in the basket are thoroughly shuffled, one tag is drawn and the observation is recorded. The tag is then replaced in the basket and the whole basketful of tags is thoroughly shuffled before another tag is drawn. This process is repeated until 5000 observations are drawn. As the tags are being drawn, the observations are tabulated in groups of five (Table 1, Appendix). Then each group of 5 observations is considered a sample. Thus 1000 samples, each consisting of 5 observations, are obtained. The purpose of drawing one tag at a time and shuffling the tags between drawings is to insure random sampling. Because of the shuffling between drawings of individual tags, no observation is in any way influenced by any other observation. If a tag were not replaced before the next one is drawn, this tag could not be obtained in the next drawing, and thus the value of the next observation would be limited. If the tag were replaced but the tags were not shuffled before the next drawing, the first tag, being on top of the pile of tags, would very likely be drawn again and thus cause adjacent observations to be similar. Therefore, the replacement of a tag and the shuffling of the tags between drawings are to insure that, for each drawing, the 500 tags of the population have an equal chance of being drawn. The observations obtained in the manner just described are said to be independent. A sample consisting of independent observations is called a random sample. By this method of drawing tags, the samples of 5 tags are also made independent of each other. The replacement and shuffling of tags prevent the observations of one sample from influencing those of another. All the theorems which are derived from the sampling experiment and given in later chapters require that the samples be random and independent. Therefore, it is important to know this sampling scheme at this stage. An example of 4 random samples, each consisting of 5 observations, is shown in Table 4.2.
Table 4.2
Explanations                             Sample No.
appear in:     Quantity             1          2          3          4

               Observations y      50         55         67         61
                                   57         44         57         52
                                   42         37         71         68
                                   63         40         55         50
                                   32         52         46         46

Chapter 5      Σy                 244        228        296        277
               ȳ (mean)          48.8       45.6       59.2       55.4

Chapter 7      (Σy)²           59,536     51,984     87,616     76,729
               (Σy)²/n       11,907.2   10,396.8   17,523.2   15,345.8
               Σy²             12,506     10,634     17,920     15,665
               SS               598.8      237.2      396.8      319.2
               s²               149.7       59.3       99.2       79.8

Chapters       s²/n              29.94      11.86      19.84      15.96
8 to 17        √(s²/n)           5.472      3.444      4.454      3.995
               Σ(y−μ)²/σ²         6.06       3.34       8.20       4.65
               ȳ − μ             −1.2       −4.4        9.2        5.4
               (ȳ−μ)/√(s²/n)     −0.219     −1.278      2.066      1.352
               ȳ − 8.8           40.0       36.8       50.4       46.6
               ȳ + 8.8           57.6       54.4       68.0       64.2
               2.7764√(s²/n)     15.2        9.6       12.4       11.1
               ȳ − 2.7764√(s²/n) 33.6       36.0       46.8       44.3
               ȳ + 2.7764√(s²/n) 64.0       55.2       71.6       66.5
               (further quantities) ...
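The drawing procedure of Section 4.2 can also be imitated by machine. The following is a minimal sketch in Python (standard library only); the frequencies are those of Table 4.1a, and random.choices draws with replacement, which corresponds to replacing the tag and reshuffling between drawings.

    # Sketch of the sampling experiment in Python.
    import random

    # f(51), f(52), ..., f(80) from Table 4.1a; the population is
    # symmetric about 50, so these also give f(49), f(48), ..., f(20).
    upper = [20, 19, 19, 18, 18, 17, 16, 14, 13, 12, 11, 10, 9, 7, 6,
             6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 1, 1, 0, 0, 1]
    freq = {50: 20}
    for k, f in enumerate(upper, start=1):
        freq[50 + k] = f
        freq[50 - k] = f
    population = [y for y, f in freq.items() for _ in range(f)]
    assert len(population) == 500        # the basketful of 500 tags

    # Drawing with replacement: every tag has an equal chance at every
    # drawing, so the observations (and the samples) are independent.
    rng = random.Random(1)
    samples = [rng.choices(population, k=5) for _ in range(1000)]
    print(samples[0])                    # one random sample of 5 observations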
4.3 Computation
Various quantities, such as the sum, Σy, and the mean, ȳ (Table 4.2), were calculated for each of the 1000 samples. The other calculated quantities shown in Table 4.2 are not self-explanatory, but are explained in later chapters as they are needed to verify certain theorems.
The number of the chapter in which the quantities are explained appears in the first column of Table 4.2.

4.4 Parameter and Statistic
A parameter is a quantity, such as the mean, calculated from a population. For example, the mean 50 of the tag population (Section 4.1) is a parameter. It is one of the characteristics of the population. A statistic is a quantity, such as the mean, calculated from a sample. For example, the means of the four samples given in Table 4.2 are 48.8, 45.6, 59.2, and 55.4 respectively. These are the values of a statistic. The population mean is designated by μ and the sample mean by ȳ. A parameter such as the population mean μ is a fixed quantity, but the sample mean ȳ is not. From a particular population, various samples can be drawn, each sample having its own mean. Therefore, the sample mean ȳ ordinarily changes from sample to sample. The fluctuation of a statistic (such as the sample mean) from sample to sample is an important concept. An example of this kind of fluctuation can be seen in Table 4.2. The four sample means, ȳ, listed there are all different.

4.5 Purpose of Sampling Experiments
The most important objective of statistics, to draw conclusions about the population from the information obtained from a sample, is accomplished not by guesswork, but by attaining a sure knowledge of the relation between a population and its samples. The first step in developing a statistical method is to determine what kind of samples can be obtained from a given population. Once this is known, one hopes that a given sample will enable him to reach a conclusion about the population. The purpose of the sampling experiment, then, is to show the relation between the population and its samples. Table 4.2 shows that none of the four sample means is equal to 50 even though the population mean is equal to 50. Yet, as shown in Chapter 5, the fluctuation of the sample means follows a consistent pattern.

REFERENCES
Dixon, Wilfrid J. and Massey, Frank J., Jr.: Introduction to Statistical Analysis, McGraw-Hill Book Company, New York, 1951.
Knowles, Elsie A. G.: "Experiments with Random Selector as an Aid to the Teaching of Statistics," Applied Statistics, Vol. 3, 1954, pp. 90-103.
CHAPTER 5
SAMPLE MEAN
One of the most important topics with which statistics deals is the relation between a population and its samples. The relation can be from a population to its samples or, inversely, from a sample to its parent population. This chapter deals exclusively with the relation from a population to its samples. The characteristics of the means of the samples drawn from a given population are discussed. An important factor in the discussion is the number of observations in a sample, which for convenience is called the size of the sample and denoted by the letter n. If a sample consists of 5 observations, the size of the sample is said to be 5, or n = 5.

5.1 Sampling Scheme
From a given population, many samples can be drawn. The relation between a population and its samples refers to all the samples and not merely to some of the samples drawn from the population, no matter how large or how small the total number of samples. How all the samples can be drawn from a population is illustrated by the following example: The three observations 2, 4, and 6 are considered a population whose histogram is as follows:
(a histogram with a bar of height 1 at each of y = 2, 4, and 6)
If a sample consists of only one observation, only three samples can be drawn from this population. The samples are 2, 4, and 6, and the means of these samples are also 2, 4, and 6. If a sample consists of a single observation, the mean of that sample is that observation itself. If each sample consists of two observations, there are 9 possible samples. The first observation of a sample can be either 2, 4, or 6. After the first observation is made, the second observation can be either 2, 4, or 6. The 9 samples are 2,2; 2,4; 2,6; 4,2; 4,4; 4,6; 6,2; 6,4; 6,6. These samples with their sums Σy and means are given in Table 5.1a. From this table, it can be seen that each of the three first observations has three branches. The total number of branches is 3 × 3 or 3² or 9. This sampling scheme is more like rolling dice than drawing cards. If the observations 2, 4, 6 are written on three separate cards and samples of two cards are drawn, there are only three possible samples instead of
nine. The samples are 2,4; 4,6; 2,6. The fact that there exist only three possible samples can also be seen in a different way. If two out of three cards are drawn, there is only one left behind. Since there are only three possible ways of leaving one out of three cards behind, there must be only three possible samples of two cards. If 2 and 4 are drawn, 6 is left behind. If 4 and 6 are drawn, 2 is left behind. If 2 and 6 are drawn, 4 is left behind. Therefore, there could be only three possible samples.

Table 5.1a
1st obs.   2nd obs.   Sample   Σy    ȳ
   2          2        2,2      4    2
              4        2,4      6    3
              6        2,6      8    4
   4          2        4,2      6    3
              4        4,4      8    4
              6        4,6     10    5
   6          2        6,2      8    4
              4        6,4     10    5
              6        6,6     12    6
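The nine samples of Table 5.1a, and the branching that generates them, can be enumerated mechanically. A minimal sketch in Python:

    # Sketch: enumerating all possible samples of size 2, with
    # replacement, from the population 2, 4, 6 (Table 5.1a).
    from itertools import product

    population = [2, 4, 6]
    for sample in product(population, repeat=2):     # 3 x 3 = 9 samples
        print(sample, sum(sample), sum(sample) / 2)  # sample, sum, mean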
In rolling dice, the situation is different. In the first roll, the observation could be either 1, 2, 3, 4, 5, or 6. After the first observation is recorded, the second roll still could produce either 1, 2, 3, 4, 5, or 6. Then the combinations 1,1; 2,2; etc. become possible. The result is 6 × 6 or 6² or 36 possible samples. It is this dice-rolling sampling scheme, which is technically called sampling with replacement, that is used here. It should be realized that this sampling scheme insures independence among the observations (Section 4.2). Regardless of what the first observation may be, every observation of the population still has an equal chance of being drawn in the second drawing. When this dice-rolling sampling scheme is used, the population of the three observations 2, 4, 6 yields 3³ or 27 possible samples, if n = 3. The 27 samples are tabulated in Table 5.1b. Each of the observations 2, 4, 6 has three branches, each of which in turn has three sub-branches. This makes 3 × 3 × 3 or 3³ or 27 samples. It can be seen that there are 3⁴ or 81 samples if n = 4. It should be recalled that 3 is the number of observations in the population and that the exponent 4 is the number of observations in a sample. In general, the total number of samples is Nⁿ, where N is the number of observations in the population and n is the size of the sample or the number of observations in a sample. The number Nⁿ increases very rapidly with N or n.
Table 5.1b

Sample   Σy    ȳ        Sample   Σy    ȳ        Sample   Σy    ȳ
2,2,2     6   2.00      4,2,2     8   2.67      6,2,2    10   3.33
2,2,4     8   2.67      4,2,4    10   3.33      6,2,4    12   4.00
2,2,6    10   3.33      4,2,6    12   4.00      6,2,6    14   4.67
2,4,2     8   2.67      4,4,2    10   3.33      6,4,2    12   4.00
2,4,4    10   3.33      4,4,4    12   4.00      6,4,4    14   4.67
2,4,6    12   4.00      4,4,6    14   4.67      6,4,6    16   5.33
2,6,2    10   3.33      4,6,2    12   4.00      6,6,2    14   4.67
2,6,4    12   4.00      4,6,4    14   4.67      6,6,4    16   5.33
2,6,6    14   4.67      4,6,6    16   5.33      6,6,6    18   6.00
In the sampling experiments described in Chapter 4, N = 500 and n = 5. These are not large numbers; yet the total number of samples is (500)⁵ or 31,250,000,000,000. The direct enumeration of so many samples, without proper tools, is almost impossible. To avoid such a tremendous undertaking, a special branch of mathematics called probability has been used to cope with such problems. The purpose of using probability on these types of problems is to replace tedious labor by mathematical skill. However, the giant high-speed computing machines built during the last decade make the direct enumeration a feasible approach.

5.2 Distribution of Sample Means
The preceding section shows that 9 possible samples can be drawn from the population 2, 4, 6, if the sample size is 2, and 27 samples if the sample size is 3. The mean ȳ can be calculated for each of these samples. These two sets of sample means are given in Tables 5.1a and 5.1b respectively. For the case n = 4, there are 3⁴ or 81 sample means, and for the case n = 8, there are 3⁸ or 6561 sample means. (For the cases n = 4 and n = 8, the individual samples and their means are not shown in the book, but they may be determined by the reader as exercises.)
Table 5.2

Sample size n = 1:   ȳ:  2  4  6
                     f:  1  1  1                                  (3 samples)

Sample size n = 2:   ȳ:  2  3  4  5  6
                     f:  1  2  3  2  1                            (9 samples)

Sample size n = 4:   ȳ:  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0
                     f:   1    4   10   16   19   16   10    4    1   (81 samples)

Sample size n = 8:   ȳ:  2.00 2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00 4.25 4.50 4.75 5.00 5.25 5.50 5.75 6.00
                     f:    1    8   36  112  266  504  784 1016 1107 1016  784  504  266  112   36    8    1   (6561 samples)
The frequency tables of the sample means for n = 1, 2, 4, and 8 are shown in Table 5.2 and the histograms are shown in Figs. 5.2a, b, c, and d. It can be observed that the histogram becomes more and more like the normal curve as n increases. This phenomenon is described in the following theorem:
Fig. 5.2a, Fig. 5.2b, Fig. 5.2c. Histograms of the distributions of the sample means for n = 1, n = 2, and n = 4.
Theorem 5.2a As the size of the sample increases, the distribution of the means of all possible samples of the same size drawn from the same population becomes more and more like a normal distribution, provided that the population has a finite variance.
The provision of the finite variance need not cause much concern to readers who are not interested in the mathematical aspect of statistics. Neither the meaning nor the reason for this provision is explained here. It is stated here to make the theorem complete. For almost all practical problems, this condition of finite variance will be fulfilled. As long as the observations of a population have a finite range, the population will always have a finite variance.
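The theorem can be watched at work by complete enumeration. A minimal sketch in Python that tallies the means of all 3ⁿ possible samples from the population 2, 4, 6, reproducing the frequencies of Table 5.2:

    # Sketch: the frequency tables of sample means for n = 1, 2, 4, 8
    # (Table 5.2), obtained by complete enumeration.
    from collections import Counter
    from itertools import product

    population = [2, 4, 6]
    for n in (1, 2, 4, 8):
        means = Counter(sum(s) / n for s in product(population, repeat=n))
        print(n, sorted(means.items()))
    # For n = 8 the frequencies 1, 8, 36, 112, ..., 1107, ... pile up
    # around the population mean 4 in an increasingly bell-shaped curve.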
Fig. 5.2d. Histogram of the distribution of the sample means for n = 8.
This theorem is called the Central Limit Theorem. It is actually the most useful special case of a more generalized theorem which will not be given here. The Central Limit Theorem applies to any population, but if
the population is normal to begin with, the situation is even simpler and the result is stated in the following theorem:
Theorem 5.2b If the population is normal, the distribution of sample means follows the normal distribution exactly, regardless of the size of the sample.
This theorem is verified by a sampling experiment in Section 5.6.

5.3 Mean and Variance of Sample Means
In the preceding section, the shape of the distribution curve of the sample means is discussed. This section considers the mean and variance of these sample means. Again the population of the three observations 2, 4, and 6 may be used as an example. The mean of the population is equal to 4 and the variance (Equation 1, Section 2.3) is

    σ² = [(2−4)² + (4−4)² + (6−4)²]/3 = (4 + 0 + 4)/3 = 8/3.
If the sample size is 2, there are 9 possible samples as shown in Table 5.1a. The frequency table of the 9 sample means is given in Table 5.2. The details of the calculation of the mean and variance of these sample means are shown in Table 5.3.
Table 5.3

ȳ      f    fȳ    ȳ − μ_ȳ   (ȳ − μ_ȳ)²   (ȳ − μ_ȳ)²f
2      1     2      −2          4             4
3      2     6      −1          1             2
4      3    12       0          0             0
5      2    10       1          1             2
6      1     6       2          4             4
Sum    9    36                               12

    μ_ȳ = 36/9 = 4

    σ_ȳ² = 12/9 = 4/3 = (8/3)/2
The mean of all the sample means is equal to 36/9 or 4, which is the population mean. The variance of the sample means is equal to 12/9 or 4/3, which is the population variance 8/3 divided by the sample size 2. The mean and variance of the sample means are denoted by μ_ȳ and σ_ȳ² respectively. The example given here is to illustrate the following theorem:
Theorem 5.3 The mean of the means of all possible samples of the same size drawn from the same population is equal to the mean of that population; that is,

    μ_ȳ = μ.    (1)
The variance of these sample means is equal to the population variance divided by the size of the sample; that is,

    σ_ȳ² = σ²/n.    (2)
Equation (1) gives the relation between the mean of all sample means and that of the population. Equation (2) gives the relation between the variance of all sample means and that of the population. The relation between the standard deviation of all sample means and that of the population can be obtained by extracting the square roots of both sides of Equation (2). The resulting equation is

    σ_ȳ = σ/√n,    (3)

which says that the standard deviation of the means of all possible samples of the same size drawn from the same population is equal to the population standard deviation divided by the square root of the size of the sample. This standard deviation of the means of all possible samples of the same size drawn from the same population is called the standard error of the mean.

5.4 Notations
So far three kinds of means have been considered: the population mean, the sample mean, and the mean of sample means. There are two kinds of variances, namely, the population variance and the variance of sample means. Later the sample variance s² will be added. To avoid possible confusion in the future, the notations are tabulated below:
                                                   Standard
                                 Mean   Variance   Deviation   Number of Items
Population                        μ       σ²          σ        N (size of population)
Sample                            ȳ       s²          s        n (size of sample)
Distribution of sample means      μ_ȳ     σ_ȳ²        σ_ȳ      Nⁿ (number of all possible samples)
5.5 Reliability of Sample Means
Either the standard error of the mean or the variance of sample means may be used as a measure of reliability of the sample means. The word "reliability" has many different meanings. A punctual person may be said to be reliable. One who always meets his commitments is also said to be reliable. When used in connection with the sample means, however,
the word "reliability" refers to the closeness of the sample means to the population mean. Since the mean of all sample means is equal to the population mean, the sample means cluster around the population mean. Since the variance of the sample means measures the variation among the sample means, a reduction of its magnitude indicates that the sample means hug the population mean more closely. The purpose of having a sample mean is to estimate the population mean. Therefore, it is desirable to have it close to, if not equal to, the population mean. Consequently, the variance of the sample means should be reduced whenever possible. This reduction can be accomplished in only two ways. Since

    σ_ȳ² = σ²/n,    (1)
the variance of sample means can be reduced either by increasing the sample size n, or by reducing the population variance σ², or by both. When a scientist conducts his experiment in a more or less homogeneous environment with homogeneous material, he is reducing the population variance and consequently reducing the variance of sample means. If the population variance is equal to zero, that is, if all the observations are the same, the sample mean is equal to the population mean regardless of the sample size. For example, in a population consisting of nothing but 9's as its observations, the mean is equal to 9 and the variance is equal to zero. No matter how many observations are taken, every sample mean will be 9. However, if the population variance cannot be further reduced by various devices such as air-conditioning of a laboratory, purification of chemicals, selection of a field of uniform fertility, or selection of people of the same I.Q., the size of the sample has to be increased to make the sample means more reliable. It should be noted that the term "reliability of sample means" refers to all the sample means collectively rather than to a particular one individually. Table 5.2 shows the distributions of the sample means from the population consisting of the observations 2, 4, 6 for various sample sizes. In every case, whether n is equal to 1, 2, 4, or 8, one of the sample means is equal to 2 (Table 5.2). This sample mean underestimates the population mean 4 to the extreme extent. But as the size of the sample increases, the relative frequency of the sample mean 2 decreases from 1/3 for n = 1 to 1/6561 for n = 8. In other words, such an undesirable sample mean is less likely to be drawn when the sample size is increased. However, it is possible that a particular sample mean based on 1 observation may be closer to the population mean than a particular sample mean based on 8 observations. It can be observed from Table 5.2 that the sample mean 4 in the case n = 1 gives a perfect estimate of the population mean, while the sample mean 2 in the case n = 8 is the worst obtainable underestimate.
5.6 Experimental Verification of Theorems
The experimental verification of previously stated theorems is given in this section. Theorem 5.3 states that the mean of the means of all possible samples of the same size drawn from any population is equal to the mean of that population, and that the variance among the sample means is equal to the population variance divided by the size of the sample. Theorems 5.2a and 5.2b further state that the distribution of the sample means from a non-normal population approaches the normal distribution as the sample size increases, and that the distribution of the sample means from a normal population is always normal regardless of the sample size.
Table 5.6

Sample Means   Theoretical r.f.(%)   Observed r.f.(%)   Observed r.c.f.(%)
Below 39.5            1.0                   .7                  .7
39.5-42.5             3.7                  4.0                 4.7
42.5-45.5            11.0                 12.6                17.3
45.5-48.5            21.2                 19.5                36.8
48.5-51.5            26.2                 24.2                61.0
51.5-54.5            21.2                 22.2                83.2
54.5-57.5            11.0                 12.3                95.5
57.5-60.5             3.7                  3.7                99.2
Above 60.5            1.0                   .8               100.0
Total               100.0                100.0

(This sampling experiment was done cooperatively by about 90 students at Oregon State College in the Fall of 1949.)
The implication of these theorems is that the relative frequency of the sample means within any particular interval is already known before a single sample from a given population is drawn, provided that the mean and variance of that population are known. For example, if all possible samples of size 5 are drawn from the normal population with μ = 50 and σ = 10, the distribution of sample means will follow the normal distribution exactly, with mean equal to 50 and standard deviation equal to σ/√n or 10/√5 or 4.47; and 95% of the sample means will fall within the interval 50 − 1.96(4.47) to 50 + 1.96(4.47), or within the interval 41.2 to 58.8. All the above predictions are deducible as natural consequences of the theorems, before a single sample is drawn. These consequences are verified here by the sampling experiment described in detail in Chapter 4. Briefly, the experiment consists of 1000 random samples, each containing 5 observations, drawn from a normal population with mean equal to 50 and standard deviation equal to 10. The mean was calculated for each sample. The frequency distribution of the
1000 sample means is shown in Table 5.6. The theoretical relative frequencies are the would-be relative frequencies if all possible samples of size 5 were drawn. They are obtained from a table of normal distribution such as Table 3 in the Appendix. The observed relative frequencies are the relative frequencies based on the 1000 samples drawn. For example, the observed relative frequency for the class 39.5 to 42.5 is 4.0%, which indicates that 40 out of 1000 sample means fall within the interval 39.5 to 42.5. The corresponding theoretical relative frequency is 3.7%. Table 5.6 shows that the theoretical and observed relative frequencies fit very closely, but that they are not identical. This discrepancy illustrates the basic principle of statistics. The theoretical relative frequency is based on all the samples, while the observed relative frequency is based on some of the samples (1000 to be exact). In the terminology of statistics, one is based on the whole population of sample means, while the other is based on only a sample of sample means. It is not expected that a sample will agree with the population perfectly. The observed relative cumulative frequency is plotted on the probability graph paper as shown in Fig. 5.6. The fact that the points are almost on a straight line indicates that the distribution of the sample means is approximately normal. The mean of the sample means as read from the graph is 50.2, in contrast to the theoretical value of 50. The value corresponding to the 84% point is 54.7. Therefore the standard deviation of the sample means is equal to 54.7 − 50.2 or 4.5, as against the theoretical value of σ/√n or 10/√5 or 4.47. This completes the experimental verification of the theorems.
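The 1949 experiment can be repeated by machine in a few seconds. The following is a minimal sketch in Python (standard library only); the drawn frequencies, like those of Table 5.6, agree closely but not perfectly with the theoretical ones.

    # Sketch: 1000 random samples of size 5 from a normal population
    # with mu = 50 and sigma = 10, as in the sampling experiment.
    import random

    rng = random.Random(1949)
    means = [sum(rng.gauss(50, 10) for _ in range(5)) / 5
             for _ in range(1000)]

    # Observed relative frequency for the class 39.5 to 42.5 (theory: 3.7%)
    print(sum(1 for m in means if 39.5 <= m < 42.5) / 1000)

    # About 95% of the sample means should fall between 41.2 and 58.8
    print(sum(1 for m in means if 41.2 < m < 58.8) / 1000)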
5.7 Remarks
The application of these theorems has been deliberately postponed. Thirty years ago, scientists had to be content with the computation of the mean and the standard error of the mean of a series of observations. This computation provided some measure of reliability of the mean, but the statistical analysis had to end there. Nowadays, however, more recent developments in statistical methods enable one to do more than this. Of course, the sample mean and its standard error are still as useful as ever, but in a different capacity. Now they are simply elements or intermediate steps in the process of drawing inferences about the population. In the test of a hypothesis and in the determination of a confidence interval as presented in later chapters, all the results developed here are used. In other words, the material in this chapter is still preliminary information. Therefore, the exercises in this chapter are designed to acquaint the reader with the meaning of the theorems rather than with their direct application to practical problems.
Fig. 5.6. The observed relative cumulative frequencies of the 1000 sample means plotted on normal probability graph paper; the points fall almost on a straight line.
EXERCISES
(1) Draw all possible samples of size 2, with replacement, from the population which consists of the observations 1, 2, 3, 4, 5. Find the mean for each sample. Make a frequency table for the sample means.
(a) Draw the histogram for the population and also that for the distribution of the sample means and compare the shapes of the histograms. Which theorem does this comparison illustrate?
(b) Find the mean of the population and that of the sample means. What is the relation between them?
(c) Find the variance of the population and that of the sample means. What is the relation between them?
(2) The frequency distributions of sample means drawn from the population 2, 4, 6 are given in Table 5.2.
(a) For the case n = 4, what should the mean and variance of the sample means be? Calculate the mean and variance of the sample means directly from Table 5.2 and see if your answers are correct.
(b) For the case n = 8, do the same as in (a).
(3) If all possible samples of size 25 are drawn from a normal population with mean equal to 20 and standard deviation equal to 4, within what interval will the middle 95% of the sample means fall?
(4) The size of a sample is 25 and the standard error of the mean is 2.4. What must the size of the sample be if the standard error is to be reduced to 1.5?
(5) Repeat Exercise (1) with the sample size changed from 2 to 4.
(6) Find the relative cumulative frequencies of the distribution of sample means, with n = 4, shown in Table 5.2, and plot these frequencies against 2.25, 2.75, ..., etc. on the normal probability graph paper. Are the points on a straight line? Which theorem does this graph verify? Read the mean and standard error of the sample means from the graph, and check these values with those obtained in Exercise 2a.
(7) Find the relative cumulative frequencies of the distribution of sample means, with n = 8, shown in Table 5.2, and plot these frequencies against 2.125, 2.375, ..., etc. on the normal probability graph paper. Are the points on a straight line? Which theorem does this graph verify? Read the mean and standard error of the sample means from the graph and check these values with those obtained in Exercise 2b.
(8) Plot the distribution of the 625 sample means obtained in Exercise 5 on the normal probability graph paper, and thus verify Theorems 5.2a and 5.3.
QUESTIONS
(1) What is the distribution of sample means?
(2) What is the "standard error of the mean"?
(3) The standard error of the mean is supposed to measure the reliability of the sample mean.
(a) If the standard error is reduced, will the sample mean be more reliable or less reliable?
(b) What is the meaning of reliability when referred to the sample mean?
(c) Does "reliability" refer to a particular sample mean or to all the sample means of the same size?
(4) Under what conditions are the sample means normally distributed?
REFERENCES
Kendall, Maurice G.: The Advanced Theory of Statistics, Vols. I & II, Charles Griffin & Company, London, 1943 & 1946.
Mood, Alexander M.: Introduction to the Theory of Statistics, McGraw-Hill Book Company, New York, 1950.
CHAPTER 6
TEST OF HYPOTHESIS
It has been shown that statistics deals with the relation between a population and its samples. That relation may be one of two kinds, according to the direction of the relation. If the direction is from the population to the samples,

    Population → Samples,

one can enumerate all possible samples which can be obtained from a given population. This process is called deduction, a process of reasoning from the general to the particular. If, on the other hand, the direction is from a sample to the population,

    Samples → Population,
one judges the population on the basis of what is already known about the sample. This is called induction, a process of reasoning from the particular to the general. In other words, if one knows a sample and attempts to characterize the population on the basis of his knowledge of the sample, he is reasoning inductively. If one knows the population and attempts to characterize all possible samples on the basis of his knowledge of the population, he is reasoning deductively. The distinction between the two processes may be remembered easily by recalling that the prefix de- means "from," or "away from" (the population), while the prefix in- means "in," or "into" (the population). Whereas the material of the preceding chapter emphasizes deduction, that is, reasoning from the population to all possible samples, this chapter emphasizes induction, reasoning from a sample to the population. More specifically, the preceding chapter deals with the characteristics of the means of all possible samples of the same size drawn from a given population, while this chapter deals with the drawing of a conclusion
about the population with the aid of knowledge derived from a single sample which consists of some but not all of the observations of that population.

6.1 Hypothesis
A hypothesis is a contention, based on preliminary observation of what appear to be facts, which may or may not be true. The test of a hypothesis is the comparison of the contention thus formulated with newly and objectively collected facts. If these newly collected facts can be shown to agree with the contention, the contention is retained, that is, the hypothesis is accepted. If the contention and facts do not agree, the contention is discarded, that is, the hypothesis is rejected. The formulating and testing of the hypothesis as described in this way appear to be too simple and too easy. However, when one's self-interest or pride is involved, one finds it hard to discard his contention and accept the facts. For example, a man may have a fixed opinion that women are poor drivers. No amount of evidence can change his belief. If he is shown good women drivers, he says they are simply exceptions. If he is shown poor men drivers, he says they drive like women. He wants all good women drivers and poor men drivers to be excluded as evidence. Indeed, if all contradictory evidence is ignored, all men are good drivers and all women are poor drivers. But if one wishes to be objective about this question, he should establish the hypothesis that on the average men and women are equally good drivers. This hypothesis concerns all the drivers, a population. Then he can check the hypothesis with the facts. No matter how and where the facts are obtained, they are obtained from a sample, because not all the drivers are observed all the time. Associated with the hypothesis about this population there may be one or two alternative hypotheses. In the case of two alternative hypotheses, one is that men on the average are better drivers, and the other is that women are better drivers. After the hypothesis is checked with the facts, and if the evidence is warranted, one may accept the hypothesis and reach the conclusion that, on the average, men and women are equally good drivers. If one rejects the hypothesis, he may adopt one of the two alternative hypotheses and reach the conclusion that men are better drivers, or that women are better drivers, depending on the evidence observed. However, if it were somehow known beforehand that women on the average cannot possibly be better drivers than men, the hypothesis still states that men and women are equally good drivers. Thus the only alternative is that women are worse drivers than men. The establishing of one or two alternative hypotheses must be based on known facts and not on prejudice.
6.2 Two Kinds of Errors
Checking a contention against the facts is a simple matter if all the facts (a population) are known, but it is a substantial problem if only some of the facts (a sample) are known. The purpose of a test of hypothesis is to check the contention against some of the facts; therefore, the conclusion thus reached is not always correct. One of two kinds of errors, called Type I error and Type II error, may be committed. For example: Smith is drinking coffee with his friend and the two agree to toss a penny to decide who is going to pay for the two cups of coffee. The very fact that Smith gambles with his friend indicates that he trusts him. In other words, Smith acts on the hypothesis that his friend is honest and will not cheat him. If this tossing is repeated the next day and Smith loses both times, he will probably not doubt his friend's honesty. But if he loses 1000 times in a row, he may suspect that his friend is cheating him and decide to stop using this method of determining who is to pay for the coffee. Smith may now reject the hypothesis that his friend is honest. If the original hypothesis were true, it would be extremely unlikely that Smith would lose 1000 times in a row. But because he did lose 1000 times consecutively, Smith rejects the hypothesis. Smith's rejection of the hypothesis may be justifiable: his friend may really be dishonest. On the other hand, the original hypothesis may be correct: it is theoretically possible for the friend to win 1000 times consecutively without cheating. Therefore, if the friend is actually honest, but Smith decides from the evidence that he is not honest, Smith is committing a Type I error: rejection of a hypothesis that is actually true. A different situation may also arise. If, in tossing for coffee, Smith's friend wins six out of ten times, of course Smith will not suspect his friend's honesty. In other words, he accepts the hypothesis that his friend is honest. This conclusion may be correct. But it is also possible that his friend is such a dishonest and scheming fellow that he deliberately loses occasionally but continues to win dishonestly most of the time. If the friend is actually dishonest but Smith continues to believe he is honest, Smith is committing a Type II error: accepting as true a hypothesis that is actually false. The situation may be summarized in the following table:
                      Acceptance            Rejection
True hypothesis       Correct conclusion    Type I error
False hypothesis      Type II error         Correct conclusion
It should be noted from the above table that the two kinds of errors cannot be committed simultaneously. If a hypothesis is accepted, only
a Type II error can be committed. If the hypothesis is rejected, only a Type I error can be committed. From the preceding example, it appears that Smith is liable to error either in thinking his friend honest or in thinking him dishonest. The fact that he must decide whether or not to continue their friendship puts him in an unpleasant position. Any decision he might make may be wrong; yet he must make it. Now to the bystander this may be no problem at all. He can say: "With the evidence against him, Smith's friend appears to be dishonest, but on the other hand, he might be an honest man. You never can tell." Or, "That is not enough evidence on which to accuse a man of being dishonest, but, on the other hand, he certainly can be. It is hard to tell." As long as one is not forced to make a decision, he can always avoid errors by indecision, inaction, and the use of evasive words such as "maybe," "perhaps," and "probably." This is a privilege, however, which can be enjoyed only by the person who is not involved in the problem. Smith must make a decision, and careful examination of the evidence, though unpleasant, is a necessary preliminary to making a decision. Only decisions and the actions which follow the decisions can result in error.

6.3 Level of Significance
The probability of committing the Type I error is called the level of significance, which can be illustrated by the test of the hypothesis that the mean of a population is equal to a given value. The hypothesis, which is merely a contention which may or may not be correct, must be checked with the facts. The facts are the n observations drawn from the population. For example, the hypothesis is that the population mean is equal to 50, that is, μ₀ = 50. The notation μ₀ is used to denote a hypothetical population mean as distinguished from the true population mean μ. One alternative hypothesis is that the population mean is less than 50, and the other is that the population mean is greater than 50. To simplify the problem, it is assumed that the standard deviation of the population is known to be 10. A random sample of, say, 16 observations is drawn from the population and the sample mean is computed. Now the problem is to decide whether the population mean is equal to 50, less than 50, or greater than 50. The decision is to be based on the knowledge that σ = 10, n = 16, and the sample mean ȳ. However, an inductive inference about the population mean cannot be drawn from one sample without the knowledge of the distribution of all sample means. But the knowledge of the distribution of sample means can be deduced from the theorems developed in Chapter 5 without actually drawing a single sample. It is known that (a) the mean of the means of all possible samples of the same size is equal to the population mean (Theorem 5.3);
(b) the standard deviation of all sample means is equal to the population standard deviation divided by the square root of the size of the sample (Theorem 5.3); and (c) the sample means follow the normal distribution (Theorems 5.2a and 5.2b). Therefore, if the hypothesis is true and all possible samples of size 16 were drawn from this population, the sample means would follow the normal distribution with mean equal to 50 and standard deviation equal to σ/√n or 10/√16 or 2.5. This distribution of sample means is shown in Fig. 6.3a.
Fig. 6.3a. The distribution of the sample means under the hypothesis: a normal curve centered at 50, extending from about 45 to 55.
If the sample mean ȳ turns out to be 60, rejection of the hypothesis is justified, because if the population mean is 50, it is possible but extremely unlikely for the mean of a random sample to be 60 (Fig. 6.3a). The disagreement between the facts and the hypothesis leads to the rejection of the hypothesis. Then the conclusion is that the true population mean is greater than 50. (How much it is greater than 50 is another problem, which is discussed in Chapter 11.) Similarly, if the sample mean turns out to be 40, the conclusion is that the population mean is less than 50. However, if the sample mean turns out to be 50.5, it is judged to be so close to 50 (Fig. 6.3a) that the hypothesis is accepted, and the conclusion is that the population mean is equal to 50. In this case, the hypothesis is accepted on the ground that the evidence obtained from the sample does not refute the hypothesis. Now, if the sample mean is 50.5, the hypothesis is accepted, and if the sample mean is 40 or 60, the hypothesis is rejected. These decisions intuitively seem easy to make, because they are more or less clear-cut cases. But what decision should be made if the sample mean is 51, 52, 53, etc.? A line must be drawn somewhere, beyond which the hypothesis is rejected. The tail-ends of the distribution of sample means (Fig. 6.3b) are marked off for this purpose. These two marked-off portions are called the critical regions. When a sample mean falls inside either critical region, the hypothesis is rejected.
Fig. 6.3b. The critical regions: the 2.5% of sample means below 45.1, or μ₀ − 1.96σ/√n, and the 2.5% above 54.9, or μ₀ + 1.96σ/√n.
The words "inside" and "outside," when used in reference to the critical regions, often cause confusion when the topic of testing a hypothesis is first introduced. A sample mean is said to be inside a critical region if it falls into either one of the two tail-ends of the distribution curve. If it falls into the middle portion of the distribution curve, it is outside the critical regions. When one is inside a classroom, he is outside the hallway. When he is inside the hallway, he is outside the classroom. The words "inside" and "outside" refer to the critical regions and not to the middle portion of the distribution curve. The size of the critical regions is arbitrarily chosen. The regions may be marked off in such a way that 2.5% of the sample means fall into the left critical region and 2.5% of the sample means fall into the right critical region (Fig. 6.3b). A sample mean which is less than μ₀ − 1.96σ/√n, or 50 − 1.96(10/√16) or 45.1, is inside the left critical region, and a sample mean which is greater than μ₀ + 1.96σ/√n, or 50 + 1.96(10/√16) or 54.9, is inside the right critical region. When the sample mean falls inside the left critical region, the conclusion is that the population mean is less than 50. When it falls inside the right critical region, the conclusion is that the population mean is greater than 50. When the sample mean falls outside the critical regions, the conclusion is that the population mean is equal to 50. It must be realized that the distribution of the sample means shown in Fig. 6.3a is made on the condition that the hypothesis is correct, that is, that the population mean is really equal to 50. Then, if the critical regions are marked off as described above, 5% of all possible samples will fall inside the critical regions and thus lead to the erroneous rejection of a true hypothesis (Type I error). This percentage (relative frequency) of all possible samples leading to the com-
mitting of a Type I error is called the significance level. Consequently the significance level is the probability (Section 3.4) that a Type I error may be made on the basis of a single sample. The significance level of 5% does not imply that 5% of all conclusions based on tests of hypotheses are incorrect. The discussion in this section refers only to the case in which the hypothesis is correct. The case in which the hypothesis is false is discussed in the following section. Theoretically the significance level is arbitrarily chosen, but in practice 5% and 1% significance levels are usually used. In the example cited above, many choices of significance level are possible. If the value of 1.960 is replaced by 2.576 in determining the critical regions, the significance level is changed from 5% to 1%. The values given in Table 3 of the Appendix offer many choices of significance level. But for other problems presented in later chapters, the tables offer only a limited choice. It is impossible to use 4.9% as a significance level for many tests of hypothesis unless one makes such a table for himself. Henry Ford once said that you can buy a Ford in any color as long as it is black. The situation involved in choosing a significance level is not that bad, but actually it is not much better. The theory of statistics offers complete freedom of choice of a significance level, but the existing tables place a limitation on that choice. As a result, only 5% and 1% are usually used, and levels such as 3.8%, 4.1%, etc., are seldom, if ever, used. The consequence of choosing a high or low significance level is discussed in the following section.
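The decision rule described in this section can be written out compactly. A minimal sketch in Python; the function name is illustrative, not standard.

    # Sketch of the test described above: hypothesis mu0 = 50, known
    # sigma = 10, n = 16, 5% significance level (multiplier 1.960).
    import math

    def test_mean(ybar, mu0=50.0, sigma=10.0, n=16, multiplier=1.960):
        half = multiplier * sigma / math.sqrt(n)   # 1.960(2.5) = 4.9
        if ybar < mu0 - half:                      # left critical region (below 45.1)
            return "reject: population mean less than the hypothetical value"
        if ybar > mu0 + half:                      # right critical region (above 54.9)
            return "reject: population mean greater than the hypothetical value"
        return "accept: population mean equal to the hypothetical value"

    print(test_mean(50.5))   # accept
    print(test_mean(60.0))   # reject: population mean greater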
6.4 Type II Error
A Type II error, as stated before, is the error committed by accepting a false hypothesis. The existence of this kind of an error and its relation to the level of significance are to be discussed by using the same example given in the preceding section. The hypothesis is that the population mean is equal to 50, and the population standard deviation is known to be 10. If the mean of the 16 observations falls outside the critical regions, that is, if it is greater than 45.1 and less than 54.9 (Fig. 6.4), the hypothesis is to be accepted. The Type II error can be committed only when the population mean is not equal to 50 but is equal to some other value. For the sake of discussion, let the true population mean be 58 while the hypothetical mean is still 50. Now there are two distribution curves of sample means. One is hypothetical and false, and the other is true. The true distribution of sample means, which centers around 58, is labeled B in Fig. 6.4. The hypothetical but false distribution curve is labeled A. It can be observed in Fig. 6.4 that a substantial percentage of the sample means of Curve B is less than 54.9.
Fig. 6.4. The hypothetical distribution of sample means (Curve A, centered at μ₀ = 50) and the true distribution (Curve B, centered at μ = 58), with the critical boundaries C₁ at 45.1, or μ₀ − 1.960σ/√n, and C₂ at 54.9, or μ₀ + 1.960σ/√n; the portion of Curve B to the left of C₂ is the probability of a Type II error. (n = 16; 5% significance level.)
If one of these sample means is drawn, the hypothesis that the population mean is equal to 50 will be erroneously accepted. Therefore, this percentage (relative frequency) of all possible samples is the probability (Section 3.4) for a single sample to commit the Type II error. It is represented by the portion of Curve B to the left of line C₂ and shaded by horizontal lines in Fig. 6.4. It can be seen from Fig. 6.4 that the farther apart the true mean μ and the hypothetical mean μ₀, the smaller is the probability of committing the Type II error. In other words, if the true population mean and the hypothetical population mean are close together, one is likely to accept the false hypothesis and thus commit a Type II error. If they are far apart, one is unlikely to accept the false hypothesis. It can also be seen that if the significance level (probability of committing a Type I error) is reduced from 5% to 1%, line C₁ moves towards the left and line C₂ moves towards the right. As this is being done, the probability of committing a Type II error will increase. Any attempt to reduce the significance level will increase the probability of committing a Type II error. It is a matter of robbing Peter to pay Paul. If the probability of committing one kind of error is increased, the probability of the other is reduced and vice versa, provided the sample size is unchanged.
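The shaded portion of Curve B can be computed directly from the normal table. A minimal sketch in Python, assuming the SciPy library is available:

    # Sketch: the probability of a Type II error when mu0 = 50,
    # the true mean is 58, sigma = 10, n = 16, 5% significance level.
    import math
    from scipy.stats import norm

    mu0, mu_true, sigma, n = 50.0, 58.0, 10.0, 16
    se = sigma / math.sqrt(n)                        # 2.5
    c1 = mu0 - 1.960 * se                            # 45.1
    c2 = mu0 + 1.960 * se                            # 54.9

    # A Type II error occurs when a mean from Curve B falls between c1 and c2
    beta = norm.cdf(c2, mu_true, se) - norm.cdf(c1, mu_true, se)
    print(beta)                                      # about 0.11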
6.5 Sample Size
The sample size plays an important role in the test of a hypothesis because both Type I and Type II errors can be reduced only by increasing the sample size. It can be observed from Fig. 6.4 that after the significance level is chosen and the critical regions are fixed, the probability of committing the Type II error depends on the extent to which the
two curves A and B overlap. If the size of the sample is increased from 16 to 100, the standard deviation of sample means is reduced from 10/√16 or 2.5 to 10/√100 or 1. Then the dispersion of each of the two curves is reduced (Fig. 6.5) and consequently the extent of overlapping is reduced. If the 5% significance level is used, the line C₂ is located at 50 + 1.960 or 51.960. If the 1% significance level is used, the line C₂ is located at 50 + 2.576 or 52.576.
/.
u
47.4
I
1'0 - 2.576 (llv n
50
52.6
t 1
1'0
58
t
r
I'
1'0 + 2.576 (llv n
n .. 100; I ~ Signilicanc(' levr.l
Fig. 6.5
Even the latter quantity is more than 5 standard deviations (σ_ȳ = 1 for n = 100) below the true mean 58. Then the probability of committing a Type II error is almost zero even when the 1% significance level is used (Fig. 6.5). In other words, a large sample permits the reduction of both kinds of errors. If the significance level (probability of committing a Type I error) is unchanged, a larger sample will result in the reduction of the probability of committing a Type II error. The above discussion of course presupposes that the hypothesis is false, because a Type II error can be committed only when the hypothesis is false. If the hypothesis is correct, the arbitrarily chosen significance level determines what percentage of the sample means leads to the
rejection of the correct hypothesis, that is, the probability of committing a Type I error. If the significance level is unchanged and the hypothesis being tested is true, a larger sample has no advantage over a smaller sample, simply because a Type II error cannot be committed. Therefore, the purpose of using a large sample is to reduce the probability of committing a Type II error, if the significance level is kept constant. The above discussion is based on the fact that the standard deviation of the distribution of sample means will be reduced if the sample size is increased, and the overlapping parts of the Curves A and B in Fig. 6.5 are thereby reduced. Since

\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}},    (1)
it is obvious that σȳ can be reduced by increasing n. It is equally obvious that σȳ can be reduced by decreasing the population standard deviation σ. Either increasing n or decreasing σ has the same effect on the test of hypothesis. The decreasing of σ can be accomplished by the experimental scientists themselves. The refinement of technique cannot be overemphasized in conducting experiments. However, statistics also offers various devices for reducing σ. They will be discussed in later chapters. After the significance level is fixed, the reduction of the probability of committing a Type II error can be accomplished either by increasing the sample size n or by reducing the population standard deviation σ. The importance of reducing this probability cannot be overemphasized because of the nature of the test of hypothesis. The whole idea of the test of hypothesis is to try to produce evidence to refute the hypothesis. If the hypothesis is not rejected, it may be because the evidence that could refute the hypothesis is not produced. This lack of evidence could result either from a sample of insufficient size or from an experiment of excessive error. An ill-designed experiment based on only two or three observations can end with the acceptance of almost any hypothesis. Only after one has examined a great deal of evidence which does not refute the hypothesis can he have faith in the correctness of the hypothesis.
6.6 Summary
The interlocking relations among the significance level (Type I error), the Type II error, and the sample size can be summarized as follows: (1) A greater difference between the true mean μ and the hypothetical mean μ₀ will result in a lower probability of committing a Type II error, if the same sample size and the same significance level are used. (2) For the same sample size, the reduction of the significance level (Type I error), such as from 5% to 1%, will result in an increase of the probability of committing a Type II error.
(3) The way of reducing the probabilities of committing both kinds of errors is to increase the sample size, or to reduce the population variance, or both. (4) If the significance level is fixed, both the improvement of experimental technique and the increase of sample size can reduce the probability of committing a Type II error. Poor experimental technique combined with an insufficient number of observations is very likely to lead one to accept a hypothesis whether it is true or not.
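These interlocking relations are easy to see numerically. The following is a minimal simulation sketch, not part of the original text and assuming Python with NumPy is available; it estimates both error rates for the example of this chapter (μ₀ = 50, σ = 10, n = 16, true mean 58).

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, true_mu, sigma, n, trials = 50.0, 58.0, 10.0, 16, 100_000
se = sigma / np.sqrt(n)  # standard deviation of sample means: 2.5

for z, label in [(1.960, "5% level"), (2.576, "1% level")]:
    # Type I error: the hypothesis is true (mu = 50) but is rejected.
    means_true = rng.normal(mu0, se, trials)
    type1 = np.mean(np.abs(means_true - mu0) > z * se)
    # Type II error: the hypothesis is false (mu = 58) but is accepted.
    means_false = rng.normal(true_mu, se, trials)
    type2 = np.mean(np.abs(means_false - mu0) <= z * se)
    print(f"{label}: Type I about {type1:.3f}, Type II about {type2:.3f}")
```

Reducing the significance level from 5% to 1% drops the Type I rate from about .05 to about .01 but raises the Type II rate from roughly .11 to roughly .27, which is the robbing-Peter-to-pay-Paul effect of summary point (2).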
6.7 The u-Test
The test of the hypothesis that the population mean is equal to a given value, while the population standard deviation is known, is called the u-test in this book. There is no universal name for this test, and it is called the u-test here simply for the convenience of reference. In previous sections the location of a sample mean in the distribution curve has to be expressed in terms of the number of standard deviations from the population mean. This number of standard deviations is
u = \frac{\bar{y} - \mu}{\sigma_{\bar{y}}} = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}}.    (1)

For example, consider a sample of size n = 25 with mean ȳ = 53 taken from a population with mean μ = 50 and standard deviation σ = 10. The standard deviation of all sample means is equal to σ/√n or 10/√25 or 2. The distance from ȳ to μ is 53 − 50 or 3, which is equal to 1.5 standard deviations of sample means. The value 1.5 is exactly

u = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}} = \frac{53 - 50}{10/\sqrt{25}} = \frac{3}{2} = 1.5.
Another version of u is given in Equation (1) of Section 3.2 as

u = \frac{y - \mu}{\sigma}.    (2)
In Equation (2), the observations y follow the normal distribution with mean equal to μ and standard deviation equal to σ. After each y is transformed to u, the transformed observations u follow the normal distribution with mean equal to zero and standard deviation equal to one (Section 3.2). In Equation (1) the sample means follow the normal distribution with mean equal to μȳ (or μ by Theorem 5.3) and standard deviation equal to σȳ (or σ/√n by Theorem 5.3). After a similar transformation, the transformed sample means u also follow the normal distribution with
mean equal to zero and standard deviation equal to one. These two u's in Equations (1) and (2) are not the same quantities, but are designated by the same notation because they have the same distribution curve. However, Equation (2) may be considered a special case of Equation (1). If n = 1, the sample mean ȳ is the single observation y itself. Then Equation (1) becomes Equation (2).
It is much simpler to deal with u than with ȳ. If the 5% significance level is used, the critical regions are ȳ < μ₀ − 1.960σ/√n and ȳ > μ₀ + 1.960σ/√n. (The notations < and > stand for "less than" and "greater than" respectively.) They may be simply stated as u < −1.96 and u > 1.96. The inequality u < −1.96 implies that
\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} < -1.96.
As both sides of the above inequality are multiplied by σ/√n, the resulting inequality is
\bar{y} - \mu_0 < -1.960\sigma/\sqrt{n}.
As μ₀ is added to both sides of the inequality, the result is
\bar{y} < \mu_0 - 1.960\sigma/\sqrt{n}.
Therefore, u < −1.96 is the same critical region as ȳ < μ₀ − 1.960σ/√n. Similarly, u > 1.96 is the same critical region as ȳ > μ₀ + 1.960σ/√n. In similar situations hereafter, the quantity u is calculated for the sample, and the critical regions will be stated as u < −1.96 and u > 1.96 if the 5% significance level and two alternative hypotheses are used. Each alternative hypothesis corresponds to a critical region. A test of hypothesis with two alternative hypotheses and consequently two critical regions is called a two-tailed test. If there is only one alternative hypothesis (Section 6.1) and consequently only one critical region, the test of hypothesis is called a one-tailed test.
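The equivalence of the two statements of the critical regions can be checked directly. Below is a minimal sketch in Python (not from the text; the numbers are the example of this section: n = 25, ȳ = 53, μ = 50, σ = 10), showing that the decision is the same whether it is made on the ȳ scale or on the u scale.

```python
import math

n, ybar, mu0, sigma = 25, 53.0, 50.0, 10.0
se = sigma / math.sqrt(n)  # standard deviation of sample means: 2.0

# Critical regions on the ybar scale.
lower, upper = mu0 - 1.960 * se, mu0 + 1.960 * se   # 46.08 and 53.92
reject_on_ybar = ybar < lower or ybar > upper

# The same critical regions on the u scale.
u = (ybar - mu0) / se                               # (53 - 50)/2 = 1.5
reject_on_u = u < -1.960 or u > 1.960

print(u, reject_on_ybar, reject_on_u)               # 1.5 False False
```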
6.8 Assumptions
The assumptions are the conditions under which a test of hypothesis is valid. In the example used in Sections 6.3 and 6.4, the most important assumption is that the sample is random. If the sample is deliberately selected so that the sample mean is close to or far from the hypothetical population mean, in order to accept or reject the hypothesis, the objectivity and therefore the validity of the test are completely destroyed. A random sample is a sample drawn from the population so that every observation in the population has an equal chance of being drawn. Another assumption is that the population is normal. Only a normal population produces normally distributed sample means (Theorem 5.2b).
But the sample size of 16 is large enough to insure the approximate normal distribution of the sample means (Theorem 5.2a) even if the population is not normal. Therefore this assumption is minor as compared to the assumption that the sample is random.

6.9 Procedures
The procedures of a test of hypothesis may be illustrated by an example. A random sample of 25 observations is used to test the hypothesis that the population mean is equal to 145 at the 1% level. The population standard deviation is known to be 20. The procedures are as follows:
(1) Hypothesis: The hypothesis is that the population mean is equal to 145, that is, μ₀ = 145.
(2) Alternative hypotheses: The alternative hypotheses are that (a) the population mean is less than 145, and (b) the population mean is greater than 145.
(3) Assumptions: The assumptions are that (a) the sample is random (important), (b) the population is normal (minor), and (c) the population standard deviation is known.
(4) Level of significance: The chosen significance level is 1%.
(5) Critical regions: The critical regions are where (a) u < −2.576 and (b) u > 2.576 (Table 3, Appendix).
(6) Computation of statistic:
    μ₀ = 145; n = 25 (given); Σy = 3,471 (the 25 observations are given but are not listed here); ȳ = 138.84; ȳ − μ₀ = −6.16; σ = 20 (given); √n = 5; σ/√n = 4;
    u = (ȳ − μ₀)/(σ/√n) = −6.16/4 = −1.54,
    which is outside the critical regions.
(7) Conclusion: The population mean is equal to 145.
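The seven steps lend themselves to a short routine. The sketch below is illustrative only (plain Python; the function name is chosen here, not taken from the text) and reproduces the computation of this section from the given summary figures.

```python
import math

def u_test(ybar, mu0, sigma, n, z):
    """Two-tailed u-test; z is the critical value (2.576 at the 1% level)."""
    u = (ybar - mu0) / (sigma / math.sqrt(n))
    if u < -z:
        return u, "population mean is less than the hypothetical value"
    if u > z:
        return u, "population mean is greater than the hypothetical value"
    return u, "population mean is equal to the hypothetical value"

ybar = 3471 / 25                          # sum of the 25 observations is 3,471
print(u_test(ybar, 145, 20, 25, 2.576))   # u = -1.54; hypothesis accepted
```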
6.10 Remarks
The statistical test of hypothesis can be used only on numerical data representing either measurements, such as the temperature of a room, height of a person, etc., or counts, such as the number of insects on a
leaf or the number of books on a shelf. Before any statistical methods can be used, the information obtained must be expressed in numbers. The scientist's first problem is to devise ways of measurement. Before the thermometer was invented, temperature was described by words such as hot, warm, cold, and cool. The intangible is made tangible by the thermometer. The I.Q. is the psychologist's attempt to make intelligence tangible. As human knowledge advances, more and more intangible qualities are being made tangible quantities. The test of hypothesis is widely used even without expressing the information in numbers. Legal procedure in the United States and in many other countries is an example of a test of hypothesis. During a trial, the defendant is considered innocent; that is, his presumed innocence is merely a hypothesis which is subject to rejection.
As a matter
of fact, perhaps, the police, the district attorney, and the grand jury may already consider him guilty. An alternative hypothesis, therefore, is that he is guilty. To extend the analogy, the witnesses and exhibits are the observations of a sample. How much evidence is sufficient to convict a person depends on the jury's judgment. In other words, the jury determines the critical region. When the trial is completed, the jury has to decide, after deliberation, whether the defendant is innocent or guilty, that is, whether to accept or reject the hypothesis. If an innocent man is found guilty, the Type I error is committed. If the jury wants a great deal of evidence to convict a defendant, the probability of committing a Type I error is reduced, but because of this, a guilty person may escape punishment and thus a Type II error is committed. If the jury convicts the defendant on flimsy evidence to prevent a possibly guilty person from escaping punishment, an innocent person may be convicted and thus the Type I error is committed. The probability of committing both kinds of errors can be reduced only by increasing the sample size, which means the presentation of more evidence in court. With this analogy in mind, the reader may achieve better understanding of the two kinds of errors if he will read this chapter again.
EXERCISES
(1) A random sample of 16 observations was drawn from the basketful of tags, which is a normal population with mean equal to 50 and standard deviation equal to 10. The observations of the sample are as follows:

62  43  60  49  72  36  45  46
37  56  41  43  56  45  56  49
Let us pretend that the population mean is unknown. Using the 5% significance level, test the hypothesis that the population mean is equal to (a) 40 (b) 49 (c) 50 (d) 51 and (e) 60. This is a two-tailed test. Following the procedures given in Section 6.9, write a complete
report for (a) only. For (b), (c), (d), (e), just compute u and state the conclusions. Since the population mean is actually known to be 50, it can be determined whether a conclusion is right or wrong. For each of the five cases state whether the conclusion is correct, or a Type I error is made, or a Type II error is made. This exercise is intended to show that a Type II error is likely to be committed if the hypothetical population mean is close to the true population mean. [(a) u = 3.90; no error. (b) u = 0.30; Type II. (c) u = −.10; no error. (d) u = −.50; Type II. (e) u = −4.10; no error.]
(2) A random sample of 2,500 observations was drawn, with replacement, from the basketful of tags. The sample mean is 49.9. Using the 1% significance level, test the same five hypotheses as in Exercise (1). Are the conclusions different from those obtained in Exercise (1)? This exercise is intended to show that the probability of committing both Type I and Type II errors can be reduced at the same time by increasing the sample size.
(3) A random sample of 25 observations was drawn from the basketful of tags (μ = 50, σ = 10). The sample mean was found to be 54. Using both the 1% and 5% significance levels, test the hypothesis that the population mean is equal to (a) 50 and (b) 49. There are four tests altogether. For each test, compute u and state the conclusion. Since the population mean is actually known to be 50, it can be determined whether the conclusion is right or wrong. For each of the four cases, state whether the conclusion is correct, or whether a Type I error or a Type II error is made. This exercise is intended to show the fact that a change in the significance level without a change in the sample size gains something and also loses something. The use of the 5% significance level is more likely to lead to a Type I error and less likely to lead to a Type II error than the use of the 1% significance level.
QUESTIONS
(1) Define the following terms: (a) Hypothesis (b) Assumption (c) Type I error (d) Type II error (e) Significance level (f) Critical region.
(2) The quantities 5% and 1% are used repeatedly in this chapter. They refer to the percentages of what?
(3) What are the mean and variance of the u's?
(4) What is the consequence of reducing the significance level from 5% to 1% without changing the sample size?
(5) What is the consequence of increasing the sample size without changing the significance level?
(6) What is the consequence of using a zero percent significance level? What does this mean in terms of the analogy given in Section 6.10?
(7) Does one need a large or a small sample to reject a false hypothesis which is very close to the true one? Why?
(8) If a hypothesis is already rejected by a sample of 10 observations, is it likely to be rejected or accepted by a sample of 100 observations? Why?
(9) When one tests a hypothesis with the same significance level, is a large sample or a small sample more likely to cause the rejection of the hypothesis? Why?
(10) Regardless of the sample size, one can test a hypothesis. Why does one prefer a large sample?
CHAPTER 7
SAMPLE VARIANCE - χ²-DISTRIBUTION

Chapters 5 and 6 collectively deal with the deductive and inductive relations between a population and its samples. The deductive relation is shown in Chapter 5, which describes the characteristics of the sample means drawn from a population. The direction is from the population to the samples. Chapter 6, on the other hand, in showing how a single sample can be used to test a hypothesis about the population, illustrates the inductive relation between a population and its samples: the direction is from a sample to the population. Furthermore, in Chapters 5 and 6 the center of discussion is the mean. Now, in Chapter 7, all of what is described above is repeated, but the center of discussion is shifted to the variance.
7.1 Purposes of Studying Sample Variance
The most obvious reason for the study of the sample variance is to acquire knowledge about the population variance. A second reason, however, is at least as important as the first. It is that the knowledge of the variance is indispensable even if one's interest is in the mean. The u-test is introduced in Section 6.7 to test the hypothesis that the population mean is equal to a given value, where

u = \frac{\bar{y} - \mu}{\sigma/\sqrt{n}}.    (1)

It can be seen from the above equation that the population standard deviation σ must be known before this test can be used. But σ usually is unknown, and this test therefore has a very limited use. To remove this limitation, it is essential to find a way to estimate σ or σ² from a sample. In other words, whether one's interest is in the population mean or in the population variance, the knowledge of the sample variance is indispensable.
7.2 Sample Variance
The first problem in studying sample variance is to determine what the sample variance should be to provide a good estimate of the population variance. A hint can be obtained from the sample mean. The population mean is

\mu = \frac{\Sigma y}{N} = \frac{y_1 + y_2 + \cdots + y_N}{N}    (1)
and the population variance is

\sigma^2 = \frac{\Sigma(y - \mu)^2}{N} = \frac{(y_1 - \mu)^2 + (y_2 - \mu)^2 + \cdots + (y_N - \mu)^2}{N}.    (2)
To estimate "" a sample of n out of N observations is drawn from the population, tile n observations are added together, and the sum is divided by n; thus the sample mean is (3)
For example, the tag population (Section 4.1) consists of 500 (or N) observations. The mean μ is the sum of the 500 observations divided by 500. The mean of a sample consisting of 10 (or n) observations is the mean of only 10 (or n) out of 500 (or N) observations. In order to estimate σ², one would be tempted to do the same thing by using n out of the N (y − μ)² terms and dividing the sum of these n terms by n. In other words, the sample variance is

V_1 = \frac{(y_1 - \mu)^2 + (y_2 - \mu)^2 + \cdots + (y_n - \mu)^2}{n},    (4)
where V₁ is the sample variance and the y's are the n observations drawn
from the population. This intuitive approach turns out to be a correct method of estimating σ², and its validity can be verified by the example of the population which consists of the three observations 2, 4, and 6 (Section 5.1). The mean and the variance of this population are equal to 4 and 8/3 respectively. If the size of the sample is equal to 2, or n = 2, there are nine possible samples that can be drawn from this population. The samples are 2,2; 2,4; etc. The nine samples are listed in Column 1 of Table 7.2. The variance of the first sample is

V_1 = \frac{(2 - 4)^2 + (2 - 4)^2}{2} = 4,

and that of the second sample is

V_1 = \frac{(2 - 4)^2 + (4 - 4)^2}{2} = 2.
The nine sample variances are listed in Column 3 of Table 7.2. The sum of these nine V₁-values is equal to 24. The average of these nine values is equal to 24/9 or 8/3, which is the population variance σ². It should be noted from Column 3, Table 7.2, that none of the nine V₁-values is a correct estimate of σ². This is similar to using ȳ to estimate μ. The individual sample means are not necessarily equal to the popula-
tion mean μ, but the mean of all of the sample means is equal to μ (Column 2, Table 7.2). Here the individual V₁ is not necessarily equal to σ², but the average of all V₁-values is equal to σ². Therefore this is a correct method of estimating the population variance. Yet it is as useless as it is correct. Equation (4) shows that in order to calculate any V₁, the population mean μ must be known. Further, to test the hypothesis that the population mean is equal to a given value, the population standard deviation must be known. One obvious way to break this vicious circle is to replace μ in Equation (4) by ȳ. Then the modified version of the sample variance is
V_2 = \frac{\Sigma(y - \bar{y})^2}{n} = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{n}.    (5)
The modified variance of the first sample is

V_2 = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2}{2} = \frac{(2 - 2)^2 + (2 - 2)^2}{2} = 0,
TABLE 7.2

Sample    (1)       (2)    (3)    (4)    (5)     (6)
No.       Samples    ȳ     V₁     V₂     s²       s
1          2,2       2      4      0      0      0.000
2          2,4       3      2      1      2      1.414
3          2,6       4      4      4      8      2.828
4          4,2       3      2      1      2      1.414
5          4,4       4      0      0      0      0.000
6          4,6       5      2      1      2      1.414
7          6,2       4      4      4      8      2.828
8          6,4       5      2      1      2      1.414
9          6,6       6      4      0      0      0.000

Total               36     24     12     24     11.312
Average              4    8/3    4/3    8/3      1.257
Parameter        μ = 4  σ² = 8/3  σ² = 8/3  σ² = 8/3  σ = 1.633
and that of the second sample is

V_2 = \frac{(2 - 3)^2 + (4 - 3)^2}{2} = 1.
The V₂-values of the nine samples are listed in Column 4, Table 7.2. It can be readily seen that V₂ is not a good estimate of σ². None of the nine individual V₂-values is equal to σ²; their average is not even equal to σ². The average of the nine V₂-values is equal to 12/9 or 4/3, while σ² = 8/3. In other words, V₂ underestimates σ². Therefore this method also seems
to be defective. However, the value V₂ is easily obtained because the knowledge of the population mean is not required. The fact that it underestimates σ² does not make it useless. The defect of underestimation can be remedied. The most commonly used version of sample variance is the corrected version of V₂. The correction is achieved by using n − 1 instead of n as the denominator of Equation (5). The reason for using a smaller denominator is to boost V₂ so that it will not underestimate σ². The modified version of V₂ is

s^2 = \frac{\Sigma(y - \bar{y})^2}{n - 1} = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{n - 1},    (6)

where y₁, y₂, ..., y_n are the n observations of a sample and ȳ is the sample mean. The s² of the first sample of Table 7.2 is

s^2 = \frac{(2 - 2)^2 + (2 - 2)^2}{2 - 1} = \frac{0}{1} = 0,

and that of the second sample is

s^2 = \frac{(2 - 3)^2 + (4 - 3)^2}{2 - 1} = \frac{2}{1} = 2.
The 9 values of s² are listed in Column 5 of Table 7.2. The average of the 9 values is equal to 24/9 or 8/3, which is the population variance. The replacement of n by n − 1 in the denominator (Equations 5 and 6) corrects the underestimation. Hereafter, in this text only one version, s² of Equation (6), is used as the sample variance. The advantages of this s² are that the average of all sample variances is equal to σ², and that the population mean need not be known. The following theorem will be helpful for future reference.
Theorem 7.2 If all possible samples of size n are drawn from a population, the mean of all sample variances s², where

s^2 = \frac{\Sigma(y - \bar{y})^2}{n - 1},    (7)

is equal to the population variance σ².
Another point of view shows that it is reasonable to use n − 1 instead of n as the divisor in finding s². If a sample consists of only one observation, the observation y is the sample mean ȳ. This sample mean is not a reliable estimate of the population mean, but it expresses some idea of the population mean. If the weight of an animal is two tons, that animal is likely to be an elephant rather than an insect. So one observation does show the approximate magnitude of the population mean. But a sample of one observation contains no information concerning the popula-
tion variance, because variance measures the variation among the observations. One observation cannot show to what extent it differs from the other observations. When an attempt is made to estimate the population variance by a sample of one observation, it is interesting to see what kind of estimate can be obtained. The sample variance of one observation y is
s^2 = \frac{\Sigma(y - \bar{y})^2}{n - 1} = \frac{(y - \bar{y})^2}{1 - 1} = \frac{0}{0}.
The quantity 0/0 is indeterminate. It can be equal to 1, because 0 = 0 × 1. But it also can be 12, because 0 = 0 × 12. In fact, it can be any value. In other words, the formula gives a correct answer, which is "I don't know," simply because a single observation does not provide any information concerning the population variance.
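The whole of Table 7.2, and the behavior of V₁, V₂, and s², can be reproduced by enumerating the nine samples. A minimal sketch, assuming plain Python:

```python
from itertools import product

population = [2, 4, 6]
mu = sum(population) / 3                                  # 4
sigma2 = sum((y - mu) ** 2 for y in population) / 3       # 8/3

v1, v2, s2 = [], [], []
for sample in product(population, repeat=2):              # the nine samples
    ybar = sum(sample) / 2
    v1.append(sum((y - mu) ** 2 for y in sample) / 2)     # uses mu, divisor n
    v2.append(sum((y - ybar) ** 2 for y in sample) / 2)   # uses ybar, divisor n
    s2.append(sum((y - ybar) ** 2 for y in sample) / 1)   # divisor n - 1

print(sigma2, sum(v1) / 9, sum(v2) / 9, sum(s2) / 9)
# 2.67 2.67 1.33 2.67 -- V2 underestimates; V1 and s2 average to sigma2
```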
7.3 Unbiased Estimate
If the mean of all possible values of a statistic is equal to a parameter (Section 4.4), the statistic is called the unbiased estimate of that parameter. For example, the sample mean ȳ is an unbiased estimate of the population mean μ, because the mean of the means of all possible samples of a given size is equal to the population mean (Theorem 5.3). Another unbiased estimate is s², because the mean of the variances s² of all possible samples of a given size is equal to the population variance σ² (Theorem 7.2). The assertion that an estimate is unbiased refers to the method of estimation and not to the end-product computed from a particular sample. The method of computing ȳ is
\bar{y} = \frac{\Sigma y}{n} = \frac{y_1 + y_2 + \cdots + y_n}{n},    (1)
which enables each sample to produce an estimate of the population mean. The mean of the means of all possible samples of the same size is equal to the population mean, and therefore the sample means, on the average, are unbiased. But an individual sample mean may or may not be equal to the population mean. A similar situation is found in the sample variance s². The method of computing s² is

s^2 = \frac{\Sigma(y - \bar{y})^2}{n - 1},    (2)
which enables each sample to produce an estimate of the population variance. The mean of all these sample variances (Column 5, Table 7.2) is equal to the population variance, and therefore the sample variances, on the average, are unbiased. An individual sample variance may or may not be equal to the population variance. None of the sample variances
shown in Column 5, Table 7.2, is equal to the population variance. Whenever the word "bias" is used in statistics, it is used in this sense. When an estimate is called unbiased, it is for the reason that the average of the estimates produced by all possible samples of a given size is equal to the parameter, rather than that the estimate produced by a particular sample is equal to the parameter.

7.4 Computing Method for Sample Variance
The sample variance is defined in Equation (6), Section 7.2, as

s^2 = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{n - 1} = \frac{\Sigma(y - \bar{y})^2}{n - 1}.    (1)
This formula is given as the definition of the sample variance but is not intended as a method of computing s². This section introduces a short-cut computing method which is especially adaptable to desk calculators. The tedious work involved in computing s² is in obtaining the numerator of Equation (1). Therefore, the short-cut method concerns only the numerator, which is the sum of the squares of the deviations of the observations from their mean. In this book this lengthy description of the numerator is replaced by the abbreviation SS, that is,

SS = \Sigma(y - \bar{y})^2.    (2)
The short-cut method of computing SS is

SS = \Sigma y^2 - \frac{(\Sigma y)^2}{n}.    (3)
The fact that both versions of SS give the identical result is illustrated by the five observations 3, 2, 1, 3, 1, with the details of the computation shown in Table 7.4. The value of SS computed by Equation (2) is equal to 4, and that computed by Equation (3) is also equal to 4, that is,

SS = \Sigma y^2 - \frac{(\Sigma y)^2}{n} = 24 - \frac{(10)^2}{5} = 24 - 20 = 4.
The equation

\Sigma(y - \bar{y})^2 = \Sigma y^2 - \frac{(\Sigma y)^2}{n}    (4)

is an algebraic identity which holds true for any set of numbers. Therefore, the two expressions of SS shown in Equations (2) and (3) are used interchangeably in later chapters. Despite the two different methods of computation, the meaning of SS remains the same. The value SS does not acquire a new meaning by acquiring a new algebraic expression. The algebraic proof of this identity is given in Section 7.8.
After the SS-value is computed, the sample variance s² can be obtained by dividing n − 1 into SS. For this example, s² = 4/4 = 1. It is interesting to observe in Column 2, Table 7.4, that the sum of the deviations of the observations from their mean is equal to zero, that is,

\Sigma(y - \bar{y}) = 0.    (5)
This is also an algebraic identity which holds true for any set of numbers. This identity is used quite often in later chapters to develop short-cut computing methods. Its algebraic proof is given in Section 7.8.

TABLE 7.4

   y     y − ȳ    (y − ȳ)²     y²
   3       1         1          9
   2       0         0          4
   1      −1         1          1
   3       1         1          9
   1      −1         1          1

  10       0         4         24
  Σy    Σ(y − ȳ)  Σ(y − ȳ)²   Σy²

n = 5; ȳ = Σy/n = 2; Σ(y − ȳ)² = 4; s² = Σ(y − ȳ)²/(n − 1) = 1;
Σy² − (Σy)²/n = 24 − (10)²/5 = 24 − 20 = 4
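The identities of Equations (4) and (5) are easy to confirm for the five observations of Table 7.4. A minimal sketch, assuming plain Python:

```python
ys = [3, 2, 1, 3, 1]
n = len(ys)
ybar = sum(ys) / n                                          # 2.0

ss_by_definition = sum((y - ybar) ** 2 for y in ys)         # Equation (2): 4.0
ss_by_shortcut = sum(y * y for y in ys) - sum(ys) ** 2 / n  # Equation (3): 4.0
deviation_total = sum(y - ybar for y in ys)                 # Equation (5): 0.0

print(ss_by_definition, ss_by_shortcut, deviation_total)
s2 = ss_by_shortcut / (n - 1)                               # sample variance: 1.0
```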
7.5 The χ²-Distribution
The normal distribution is described in Chapter 3 as one of the most important distributions in statistics. In this section another important frequency distribution, called the χ²-distribution, is described. This distribution is closely related to the normal distribution and will be introduced through reference to the normal distribution. If all possible samples of size n are drawn from a normal population with mean equal to μ and variance equal to σ², a sample mean ȳ can be computed from each sample. Theorem 5.2b states that the distribution of these sample means follows the normal distribution. However, from each sample one can compute not only the mean, but also other statistics, such as the sum Σy and the variance s². If, for each sample, the statistic

\Sigma u^2 = \Sigma\left(\frac{y - \mu}{\sigma}\right)^2 = \frac{\Sigma(y - \mu)^2}{\sigma^2}

(Equation 1, Section 3.2) is computed, the value of Σu², like any other statistic, will change from sample to sample. The fluctuation of the values of Σu² can be illustrated by the four random samples given in
Table 4.2. These samples are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100 (Section 4.1). The first sample consists of the observations 50, 57, 42, 63, and 32, and the value of Σu² of this sample is

\Sigma u^2 = \left(\frac{50-50}{10}\right)^2 + \left(\frac{57-50}{10}\right)^2 + \left(\frac{42-50}{10}\right)^2 + \left(\frac{63-50}{10}\right)^2 + \left(\frac{32-50}{10}\right)^2
          = \frac{(50-50)^2 + (57-50)^2 + (42-50)^2 + (63-50)^2 + (32-50)^2}{100}
          = \frac{0 + 49 + 64 + 169 + 324}{100} = \frac{606}{100} = 6.06.
The second sample consists of the observations 55, 44, 37, 40, 52, and the value of Σu² is

\Sigma u^2 = \frac{(55-50)^2 + (44-50)^2 + (37-50)^2 + (40-50)^2 + (52-50)^2}{100}
          = \frac{25 + 36 + 169 + 100 + 4}{100} = 3.34.
If all possible samples are drawn from a normal population, each sample will have its own value of Σu². The distribution of these values of Σu² is called the χ²-distribution. The value of Σu² is influenced not only by the change of observations from sample to sample, but also by n, the sample size. For example, if the first sample consists only of its first two of the five observations, the value of Σu² is

\Sigma u^2 = \frac{(50-50)^2 + (57-50)^2}{100} = .49,
instead of 6.06 for the five observations. If the second sample consists only of its first two observations instead of five, the value of Σu² is

\Sigma u^2 = \frac{(55-50)^2 + (44-50)^2}{100} = .61
instead of 3.34. On the average, Σu² is larger if the sample size is larger. Therefore, for each sample size, there will be a different χ²-distribution, the mean of which increases with the sample size. Consequently, the χ²-distribution is not represented by a single frequency curve but by a family of curves. What uniquely identifies a particular curve is the mean of the distribution. This mean, which is denoted by ν, is called the number of degrees of freedom (for which the abbreviation is
d.f. or DF) of the χ²-distribution. The reason for adopting this name for the mean is not explained here, but its meaning will be revealed as the subject develops. The curves for the χ²-distributions with 1, 4, and 5 degrees of freedom are shown in Fig. 7.5a.

Fig. 7.5a. The χ²-distribution curves with 1, 4, and 5 degrees of freedom.
The discussion in this section may be summarized in the following theorem:
Theorem 7.5 If all possible samples of size n are drawn from a normal population with mean equal to μ and variance equal to σ², and for each sample Σu² is computed, where

\Sigma u^2 = \frac{\Sigma(y - \mu)^2}{\sigma^2},    (2)
the frequency distribution of Σu² follows the χ²-distribution with n degrees of freedom (that is, ν = n).
Elaborate mathematics must be used to derive the χ²-distribution from the normal population. The theorem, however, can be verified experimentally by reference to the sampling experiment described in Chapter 4. Briefly, 1000 random samples, each consisting of five observations, are drawn from the tag population, which is a normal population with mean
equal to 50 and variance equal to 100. For each sample, the value of Σu² is computed. The values for four such samples are shown in Table 4.2. The frequency table of the 1000 values of Σu² is given in Table 7.5, where both the theoretical and the observed relative frequencies are shown. The theoretical relative frequency is the would-be frequency if all possible samples of size 5 were drawn, and the observed relative frequency is that obtained from the 1000 samples. It can be seen from
TABLE 7.5

 Σu²      Observed          Theoretical   Mid-pt.
            f     r.f.(%)     r.f.(%)        m        mf
 0-1        35      3.5         3.7          .5       17.5
 1-2       109     10.9        11.3         1.5      163.5
 2-3       148     14.8        14.9         2.5      370.0
 3-4       171     17.1        15.1         3.5      598.5
 4-5       139     13.9        13.4         4.5      625.5
 5-6       106     10.6        11.1         5.5      583.0
 6-7        77      7.7         8.6         6.5      500.5
 7-8        64      6.4         6.4         7.5      480.0
 8-9        53      5.3         4.7         8.5      450.5
 9-10       28      2.8         3.4         9.5      266.0
10-11       18      1.8         2.4        10.5      189.0
11-12       20      2.0         1.7        11.5      230.0
12-13       11      1.1         1.1        12.5      137.5
13-14        6       .6          .8        13.5       81.0
14-15        8       .8          .5        14.5      116.0
Over 15      7       .7         1.0        18.1      126.7

Total     1000    100.0       100.1                 4935.2

Mean of Σu² = Σmf/Σf = 4935.2/1000 ≈ 4.9

(This sampling experiment was done cooperatively by about 75 students at Oregon State College in the Fall of 1949.)
Table 7.5 that the theoretical and observed frequencies fit closely but not exactly. The theoretical frequency is based on all possible samples of size 5, while the observed frequency is based on only 1000 samples. Therefore the theoretical and observed frequencies are not expected to agree perfectly. The close agreement between the theoretical and observed frequencies can also be seen in Fig. 7.5b, which shows the histogram of the distribution of the 1000 Σu²-values and the theoretical frequency curve of the χ²-distribution with 5 degrees of freedom. The fact that the mean of Σu² is equal to the sample size n can also be shown by Table 7.5. The mean of the 1000 values of Σu² could be easily found. Unfortunately, however, the individual value loses its identity after the frequency table is made. Even so, the approximate mean can be found from
Fig. 7.5b. Histogram of the 1000 values of Σu² and the theoretical χ²-curve with 5 degrees of freedom.
the frequency table by considering any value of Σu² in the class 0-1 as .5, any value in the class 1-2 as 1.5, etc. These mid-points of the various classes are designated by m in Table 7.5. For the class "over 15," the mean 18.1 of the 7 values of Σu² in that class is used for m. Then the approximate mean of Σu² is

\frac{\Sigma mf}{\Sigma f} = \frac{4935.2}{1000} = 4.9,
which is approximately equal to the sample size 5. If all possible samples were drawn from the normal population, it can be proved mathematically that the mean of the distribution is exactly 5. This completes the experimental verification of Theorem 7.5.
Like the tables showing the relative cumulative frequency for the normal distribution (Section 3.2), tables for the χ²-distribution are also available. An abbreviated table is shown in Table 4, Appendix. Each line of the table represents a different number of degrees of freedom (d.f.) such as 1, 2, ..., 100, shown in the extreme left column. Each column shows a different percentage. For example, the tabulated value corresponding to 10 d.f. and 5% is 18.3070. This means that 5% of the χ²-values with 10 d.f. are greater than 18.3070. The χ²-value corresponding to 5 d.f. and 5% is 11.0705. This means that 5% of the χ²-values with 5 d.f. are greater than 11.0705. This value may be compared with that obtained by the sampling experiment. From Table 7.5, it can be seen that 5.2% (i.e., 2.0 + 1.1 + .6 + .8 + .7) of the 1000 values of Σu² are greater than 11. This percentage is approximately equal to 5% as expected.
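The sampling experiment behind Table 7.5 is easy to repeat by machine. The sketch below is an illustration added to this edition (it assumes NumPy is available): it draws 1,000 samples of size 5 from a normal population with μ = 50 and σ² = 100, computes Σu² for each, and checks that the mean is near 5 and that about 5% of the values exceed 11.0705.

```python
import numpy as np

rng = np.random.default_rng(1949)
mu, sigma, n, n_samples = 50.0, 10.0, 5, 1000

samples = rng.normal(mu, sigma, size=(n_samples, n))
sum_u2 = ((samples - mu) ** 2 / sigma ** 2).sum(axis=1)  # one value per sample

print(sum_u2.mean())              # close to n = 5, as Theorem 7.5 states
print((sum_u2 > 11.0705).mean())  # close to .05 (Table 4, Appendix, 5 d.f.)
```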
7.6 Distribution of u²
It can be deduced from Theorem 7.5 that u², being Σu² when n = 1, follows the χ²-distribution with 1 degree of freedom. The distribution curves of u and u² are shown in Fig. 7.6a and 7.6b. The distribution of
Fig. 7.6a. The normal distribution of u, with 2.5% of the area below −1.96 and 2.5% above 1.96.
u is the normal distribution with mean equal to 0 and variance equal to 1. The distribution of χ² with 1 degree of freedom, being u², is the doubled-up version of u (Fig. 7.6a and 7.6b), because (−u)² = u². The square of any value between 0 and 1 is a value between 0 and 1. Likewise the square of any value between 0 and −1 is a value between 0 and 1. For example, (.5)² = .25 and (−.5)² = .25. The 68% of u-values lying between −1 and 1 yield the 68% of u²-values between 0 and 1. The square of any value greater than 1.96 is greater than (1.96)² or 3.84, and the square of any value less than −1.96 is also greater than 3.84. For example, either 2² or (−2)² is 4, which is greater than 3.84. Since there are 2.5% of u-values greater than 1.96 and 2.5% of u-values less than −1.96, a total of 5% of u²-values are greater than 3.84. Therefore, there are 68% of χ²-values
Fig. 7.6b. The χ²-distribution with 1 degree of freedom, with 5% of the area beyond 3.84 = (1.96)².
between 0 and 1, and 5% of χ²-values greater than 3.84 (cf. Table 4, Appendix). It should be noted from Fig. 7.6a and 7.6b that the middle portion of the curve of u becomes the left tail of χ² with 1 d.f. The fact that u² follows the χ²-distribution with 1 d.f. is frequently recalled in later chapters. The above discussion about u and u² may be summarized in the following theorem:
Theorem 7.6 If a statistic u follows the normal distribution with mean equal to zero and variance equal to 1, u² follows the χ²-distribution with 1 degree of freedom (that is, ν = 1).
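Theorem 7.6 can be checked the same way. A minimal sketch (again assuming NumPy) squares a large number of standard normal values and recovers the two percentages quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(1_000_000)
u2 = u ** 2                          # chi-square with 1 degree of freedom

print((u2 < 1).mean())               # about .68, the doubled-up middle portion
print((u2 > 3.84).mean())            # about .05, since 3.84 = (1.96) squared
```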
7.7 Distribution of SS/σ²
In the two preceding sections it is shown that

\Sigma u^2 = \frac{\Sigma(y - \mu)^2}{\sigma^2}    (1)

follows the χ²-distribution with n degrees of freedom and u² follows the
χ²-distribution with 1 degree of freedom. This section shows that SS/σ² follows the χ²-distribution with n − 1 degrees of freedom, where

\frac{SS}{\sigma^2} = \frac{\Sigma(y - \bar{y})^2}{\sigma^2} = \frac{(y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2}{\sigma^2}.    (2)
The two quantities Σu² and SS/σ² are not the same. It can be observed from Equations (1) and (2) that the former quantity deals with the deviations of the observations of a sample from the population mean, while the latter quantity deals with the deviations of the observations from the sample mean. Thus they are not the same, yet they are related. The relation is as follows:

\frac{\Sigma(y - \mu)^2}{\sigma^2} = \frac{\Sigma(y - \bar{y})^2}{\sigma^2} + \frac{n(\bar{y} - \mu)^2}{\sigma^2}.    (3)
The algebraic proof of the above identity is given in Section 7.8. In this section, the relation is verified numerically. For example, a sample of five observations 50, 57, 42, 63, 32 is drawn from a normal population with mean μ equal to 50 and variance σ² equal to 100. Each of the three quantities in Equation (3) can be calculated for this sample, with the details of the computation shown in Table 7.7a. The result is that

\frac{606}{\sigma^2} = \frac{598.80}{\sigma^2} + \frac{5(1.44)}{\sigma^2}.
Since 606 = 598.80 + 7.20, this example verifies the identity shown in Equation (3).
TABLE 7.7a

   y     y − μ    (y − μ)²    y − ȳ     (y − ȳ)²
  50       0          0         1.2        1.44
  57       7         49         8.2       67.24
  42      −8         64        −6.8       46.24
  63      13        169        14.2      201.64
  32     −18        324       −16.8      282.24

 244                606          0       598.80
  Σy            Σ(y − μ)²    Σ(y − ȳ)       SS

n = 5; ȳ = 48.8; μ = 50; (ȳ − μ)² = 1.44
The left side of Equation (3) is shown to follow the χ²-distribution with n degrees of freedom in Section 7.5. The term on the extreme right
of Equation (3) is

\frac{n(\bar{y} - \mu)^2}{\sigma^2} = \left(\frac{\bar{y} - \mu}{\sigma/\sqrt{n}}\right)^2 = u^2    (4)
(Equation 1, Section 6.7), which is shown to follow the χ²-distribution with 1 degree of freedom in Section 7.6. Then it is reasonable to expect that the middle term
\frac{\Sigma(y - \bar{y})^2}{\sigma^2} = \frac{SS}{\sigma^2}    (5)
of Equation (3) follows the χ²-distribution with n − 1 degrees of freedom. This expectation can be verified by the sampling experiment described in Chapter 4. Briefly, 1000 random samples, each consisting of five observations, are drawn from a normal population with mean equal to 50 and variance equal to 100. For each sample, the value SS is calculated by the method given in Section 7.4. The SS-values of four such samples are given in Table 4.2. Since σ² is equal to 100, it is an easy matter to obtain SS/σ², once SS is calculated. The frequency table of the 1000 values of SS/σ² is given in Table 7.7b. The theoretical relative frequency given in the same table is that of the χ²-distribution with 4 degrees of freedom.
TABLE 7.7b

 SS/σ²    Observed          Theoretical   Mid-pt.
            f     r.f.(%)     r.f.(%)        m        mf
 0-1        93      9.3         9.0          .5       46.5
 1-2       181     18.1        17.4         1.5      271.5
 2-3       189     18.9        17.8         2.5      472.5
 3-4       152     15.2        15.2         3.5      532.0
 4-5       116     11.6        11.9         4.5      522.0
 5-6        86      8.6         8.8         5.5      473.0
 6-7        64      6.4         6.3         6.5      416.0
 7-8        35      3.5         4.4         7.5      262.5
 8-9        38      3.8         3.1         8.5      323.0
 9-10       21      2.1         2.1         9.5      199.5
10-11        8       .8         1.4        10.5       84.0
11-12        7       .7          .9        11.5       80.5
12-13        3       .3          .6        12.5       37.5
Over 13      7       .7         1.1        16.4      114.8

Total     1000    100.0       100.0                 3835.3

Mean of SS/σ² = Σmf/Σf = 3835.3/1000 = 3.8

(This sampling experiment was conducted cooperatively by about 80 students at Oregon State College in the Fall of 1950.)
Fig. 7.7. Histogram of the 1000 values of SS/σ² and the theoretical χ²-curve with 4 degrees of freedom.
The histogram of the 1000 values of SS/σ² and the χ²-curve with 4 degrees of freedom are given in Fig. 7.7. It can be observed from Table 7.7b or Fig. 7.7 that the agreement between the observed and the theoretical frequencies is close. This shows that the values of SS/σ² which were calculated from 1000 samples, each consisting of 5 observations, follow the χ²-distribution with 4 degrees of freedom, and thus verifies the contention that SS/σ² follows the χ²-distribution with n − 1 degrees of freedom. The approximate mean of the 1000 values of SS/σ² is found to be 3.8 (Table 7.7b), which is close to the number of degrees of freedom, 4. This further verifies the contention that SS/σ² follows the χ²-distribution with n − 1 degrees of freedom. The following theorem summarizes the discussion in this section:
Theorem 7.7a If all possible samples of size n are drawn from a normal population with variance equal to σ², and for each sample the value Σ(y − ȳ)²/σ² is computed, the values of Σ(y − ȳ)²/σ² follow the χ²-distribution with n − 1 degrees of freedom (that is, ν = n − 1).
To avoid the lengthy statement given in the above theorem, in practice it is usually said that Σ(y − ȳ)² or SS has n − 1 degrees of freedom. Since
this number of degrees of freedom is also the divisor used in obtaining s², that is,

s^2 = \frac{SS}{n - 1} = \frac{SS}{\nu},    (6)
(Equation 6, Section 7.2), the sample variance s² is the SS divided by its number of degrees of freedom. The number of degrees of freedom of a sum of squares can also be interpreted as the least number of deviations which have to be known in order that the remaining ones can be calculated. The quantity SS is the sum of the squares of the n deviations (y − ȳ), that is,

SS = \Sigma(y - \bar{y})^2 = (y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2.    (7)
But it is known that the sum of these deviations is equal to zero, that is, Σ(y − ȳ) = 0 (Equation 5, Section 7.4). Therefore, when n − 1 of these deviations are known, the remaining one becomes automatically known. So the SS has n − 1 degrees of freedom. For example, the mean of the five observations 3, 2, 1, 3, 1 is 2 and the five deviations from the mean are 1, 0, −1, 1, −1. If any four of the five deviations are known, the remaining one will be known, because the sum of these five deviations is equal to zero. The number of degrees of freedom is used in connection with every method presented in later chapters. Whenever the number of degrees of freedom is used, it is used directly or indirectly in connection with the χ²-distribution. If one states that an s² has ν degrees of freedom, he means that the statistic νs²/σ² = SS/σ² follows the χ²-distribution with ν degrees of freedom. For example, if one says s² has 8 degrees of freedom, he means that 8s²/σ² follows the χ²-distribution with 8 degrees of freedom. In later chapters, the number of degrees of freedom is often used without reference to the χ²-distribution, although it is taken for granted that the χ²-distribution is the point of origin. From the distribution of SS/σ², the distribution of s²/σ² can be deduced, because s² = SS/ν. In the case of n = 5, or ν = 4, SS/4 = s². In other words, SS is 4 times as large as s². When a value of SS falls between 0 and 4, the corresponding value of s² will fall between 0 and 1. From Table 7.7b it can be seen that 93 out of 1000 samples have values of SS/σ² falling between 0 and 1. Without a further sampling experiment, it can be deduced that the same 93 samples will have values of s²/σ² falling between 0 and .25. In other words, if the classes of Table 7.7b are changed to 0-.25, .25-.50, etc., the new frequency table will be that of s²/σ². Consequently, the statistic s²/σ² follows the distribution of χ²/ν.
Table 4, Appendix, shows the relative cumulative frequencies of the χ²-distribution, which is the distribution of SS/σ². If each line of the
values of the same table is divided by ν, the number of degrees of freedom, the resulting table, which is Table 5, Appendix, gives the relative frequency of χ²/ν, which is the distribution of s²/σ². The mean of the distribution of s²/σ² can be deduced from that of SS/σ². Since SS/σ² is ν times as large as s²/σ², the mean of SS/σ² is also ν times as large as that of s²/σ². But the mean of SS/σ² is equal to ν, and therefore, the mean of s²/σ² is ν/ν or 1. This same result can be obtained by a different method. Theorem 7.2 states that the mean of the variances s² of all possible samples is equal to σ². If each of the s²-values is divided by σ², then the new mean is equal to the old mean divided by σ²; that is, the mean of s²/σ² is equal to σ²/σ² or 1 (Theorem 2.4b). For convenience of future reference, the result is stated in the following theorem:
Theorem 7.7b If all possible samples of the same size are drawn from a given population with variance equal to σ², and the variance s² is computed for each sample, the mean of all the ratios s²/σ² is equal to 1.
After the mean of s²/σ² is considered, it is also of interest to know how the variation of s²/σ² is affected by the sample size. If the size of the sample is increased, the sample variance s² becomes a more reliable estimate of the population variance σ², and the values of s² of all possible samples of a given size cluster more closely around the population variance σ², or the values of s²/σ² of all possible samples hug the value 1 more closely. As the sample size n approaches infinity, every s² becomes σ². Then the value of s²/σ² becomes 1 for every sample. This phenomenon can be observed in Table 5, Appendix. For example, both the 97.5% point and the 2.5% point converge toward 1 as the number of degrees of freedom increases.
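Theorems 7.7a and 7.7b can both be verified with a short simulation, sketched below under the same assumptions as the earlier snippets (NumPy available):

```python
import numpy as np

rng = np.random.default_rng(1950)
mu, sigma2, n, n_samples = 50.0, 100.0, 5, 1000

samples = rng.normal(mu, np.sqrt(sigma2), size=(n_samples, n))
ybars = samples.mean(axis=1, keepdims=True)
ss = ((samples - ybars) ** 2).sum(axis=1)  # SS for each sample

print((ss / sigma2).mean())            # near n - 1 = 4 (Theorem 7.7a)
print((ss / (n - 1) / sigma2).mean())  # s2/sigma2 averages near 1 (Theorem 7.7b)
```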
7.8 Algebraic Identities
The algebraic identity given in Equation 4, Section 7.4, can be proved as follows:

\Sigma(y - \bar{y})^2 = \Sigma(y^2 - 2\bar{y}y + \bar{y}^2) = \Sigma y^2 - 2\bar{y}\Sigma y + n\bar{y}^2
                     = \Sigma y^2 - \frac{2(\Sigma y)^2}{n} + \frac{(\Sigma y)^2}{n} = \Sigma y^2 - \frac{(\Sigma y)^2}{n}.    (1)
The algebraic identity given in Equation 5, Section 7.4, can be proved as follows:

\Sigma(y - \bar{y}) = (y_1 - \bar{y}) + (y_2 - \bar{y}) + \cdots + (y_n - \bar{y}) = \Sigma y - n\bar{y} = \Sigma y - n\frac{\Sigma y}{n} = 0.    (2)
The algebraic identity given in Equation 3, Section 7.7, can be proved as follows:

\Sigma(y - \mu)^2 = \Sigma[(y - \bar{y}) + (\bar{y} - \mu)]^2
                 = \Sigma[(y - \bar{y})^2 + 2(y - \bar{y})(\bar{y} - \mu) + (\bar{y} - \mu)^2]
                 = \Sigma(y - \bar{y})^2 + 2(\bar{y} - \mu)\Sigma(y - \bar{y}) + n(\bar{y} - \mu)^2.

Since \Sigma(y - \bar{y}) = 0, it follows that 2(\bar{y} - \mu)\Sigma(y - \bar{y}) = 0, and therefore

\Sigma(y - \mu)^2 = \Sigma(y - \bar{y})^2 + n(\bar{y} - \mu)^2.    (3)
7.9 Analysis of Variance
It has been verified numerically (Section 7.7) and proved algebraically (Section 7.8) that

\Sigma(y - \mu)^2 = \Sigma(y - \bar{y})^2 + n(\bar{y} - \mu)^2.    (1)
It has been shown that Σ(y − μ)²/σ² follows the χ²-distribution with n degrees of freedom (Theorem 7.5), that Σ(y − ȳ)²/σ² or SS/σ² follows the χ²-distribution with n − 1 degrees of freedom (Theorem 7.7a), and that n(ȳ − μ)²/σ² or u² follows the χ²-distribution with 1 degree of freedom (Theorem 7.6). In other words, the sum of squares (of the deviations of the observations from the population mean) on the left side of Equation (1) is partitioned into two components on the right side of the equation. This method of partitioning the sum of squares into components is called the analysis of variance. The partitioning is purely an algebraic process, whatever its physical meaning. But the fact that each of the two components and also the sum of the two components follow the χ²-distribution is of great importance in the methods given in later chapters. Not only the sum of squares is partitioned into components, but also the number of degrees of freedom. The sum of squares Σ(y − μ)² has n degrees of freedom; its components Σ(y − ȳ)² and n(ȳ − μ)² have n − 1 and 1 degree of freedom respectively, and n = (n − 1) + 1. Each of the three terms of Equation (1) divided by its respective num-
ber of degrees of freedom is an unbiased estimate (Section 7.3) of σ². The sum of the squares on the left side of Equation (1) divided by its degrees of freedom n is

V_1 = \frac{\Sigma(y - \mu)^2}{n}    (2)

(Equation 4, Section 7.2), which was shown to be an unbiased estimate of σ² (Section 7.2). The component Σ(y − ȳ)² or SS divided by its degrees of freedom n − 1 is
s^2 = \frac{\Sigma(y - \bar{y})^2}{n - 1},    (3)
which is an unbiased estimate of σ² (Section 7.3). The quantity n(ȳ − μ)² seems to be in a different category, because it involves means rather than observations. But all possible sample means may be considered a population with mean equal to μ and variance equal to σ²/n. If some of the sample means are drawn from this entire population of sample means, the variance V₁ of these sample means is an unbiased estimate of σ²/n. The quantity (ȳ − μ)² is the V₁ of a single sample mean and is thus an unbiased estimate of σ²/n; consequently n(ȳ − μ)² is an unbiased estimate of σ². The fact that Σ(y − ȳ)² has n − 1 degrees of freedom is often stated as: "the sum of squares has lost one degree of freedom." This loss is due to the replacement of the population mean μ by the sample mean ȳ in the sum of squares. The analysis of variance is a method used very frequently in later chapters, so it is introduced as early as possible. What is given in this section is the simplest case of the analysis of variance. Its practical use is shown later.
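The partition of Equation (1), and of its degrees of freedom, can be verified for the sample of Table 7.7a. A minimal sketch, assuming plain Python:

```python
ys = [50, 57, 42, 63, 32]
mu, n = 50.0, len(ys)
ybar = sum(ys) / n                              # 48.8

total = sum((y - mu) ** 2 for y in ys)          # 606,    n d.f.
within = sum((y - ybar) ** 2 for y in ys)       # 598.80, n - 1 d.f.
between = n * (ybar - mu) ** 2                  # 7.20,   1 d.f.

print(total, within + between)                  # 606.0 606.0
# Each component divided by its degrees of freedom estimates sigma^2 = 100;
# the individual estimates vary from sample to sample.
print(total / n, within / (n - 1), between / 1)
```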
7.10 Test of Hypothesis
The distribution of SS/σ² (Section 7.7) can be used in testing the hypothesis that the population variance is equal to a given value. For example, the value of SS of a random sample of 5 observations is 480. The problem is to test the hypothesis that the population variance is equal to 100, that is, σ₀² = 100, where σ₀² is the hypothetical variance. By Theorem 7.7a, it is known that SS/σ² follows the χ²-distribution with n − 1 = 4 degrees of freedom. If the hypothesis is true, that is, σ² = σ₀² = 100, from Table 4, Appendix, it can be found that 2.5% of the χ²-values with 4 d.f. are less than .484419 and 2.5% of the χ²-values are greater than 11.1433.
Therefore, the critical regions are where χ² < .484419 and χ² > 11.1433 (Fig. 7.10a), if the 5% significance level is chosen. The statistic SS/σ₀² is equal to 480/100 or 4.80 with 4 degrees of freedom. The value 4.80 is outside the critical regions; therefore, the hypothesis is accepted and the conclusion is that σ² = 100. If it were in the left critical region, the conclusion would be that the true population
Fig. 7.10a. The χ²-distribution with 4 degrees of freedom, with critical regions χ² < .484419 and χ² > 11.1433 (5% significance level).
variance is less than 100. If it were in the right critical region, the conclusion would be that the true population variance is greater than 100. If the hypothesis is true and this procedure of testing the hypothesis is followed, 5% of all possible samples will lead to the erroneous conclusion that the population variance is not equal to 100. Therefore, the Type I error (shaded by vertical lines in Fig. 7.10a) is 5%. The reason for risking the Type I error is the ever-present possibility that the hypothesis might be false. If the hypothetical variance is smaller than the true variance σ², e.g., σ₀² = (½)σ², the statistic SS/σ₀² is 2(SS/σ²), and consequently SS/σ₀² does not follow the true χ²-distribution, but a distorted one such as the curve on the right shown in Fig. 7.10b.
r. {
. •4
.3
.2
Dlalarte4 )C. - c _
.I WIll1.E/~"::~: ~· === 2 _
6
10
.484419
f
12 \ 11.1433
t.r 1
Crillcal "'1100
14
)C.
•rC FoG
Fig. 7.10 b
If the hypothetical variance is larger than the true variance σ², e.g., σ₀² = 2σ², the statistic SS/σ₀² is (½)(SS/σ²), and consequently SS/σ₀² does not follow the true χ²-distribution, but a distorted one such as the curve on the left shown in Fig. 7.10c. In the case where σ₀² = (½)σ², more than 2.5% of the values of SS/σ₀² are greater than 11.1433 (Fig. 7.10b); and in the case where σ₀² = 2σ², more than 2.5% of the values of SS/σ₀² are less than .484419 (Fig. 7.10c). In other words, when the statistic SS/σ₀² is large enough or small enough to fall inside one of the critical regions, the event is due to one of two reasons. One is that the hypothesis is true, or σ² = σ₀², and the sample is one of the 5% of all possible samples which have the values of SS/σ² in the critical regions (Fig. 7.10a). The other reason is that the hypothesis is false, e.g., σ₀² = (½)σ² or σ₀² = 2σ², and the χ²-value is made too large or too small by using σ₀² as the divisor in the statistic SS/σ₀², as in the portions of the distorted curves not shaded by horizontal lines in Fig. 7.10b and 7.10c. Whenever the conclusion that σ² > σ₀² or σ² < σ₀² is reached because SS/σ₀² is inside one of the critical regions, the second reason overrides the first. In terms of the three curves given in Fig. 7.10a, 7.10b, and 7.10c, whenever SS/σ₀² is inside either one of the critical regions, the sample is considered drawn from a distorted χ²-distribution rather than from the true χ²-distribution. But because the sample could come from the true χ²-
distribution (shaded by vertical lines in Fig. 7.10a), a Type I error may be committed. When the statistic SS/σ₀² falls outside the critical regions and the conclusion that σ² = σ₀² is reached, it is still possible that the sample may have come from a distorted χ²-distribution (the portion of the distorted curve shaded by horizontal lines in Fig. 7.10b and 7.10c). In other words, when a hypothesis is accepted, a Type II error may be committed. The probability of committing a Type II error is
Fig. 7.10c. The true χ²-curve with 4 d.f. and the distorted curve of SS/σ₀² when σ₀² = 2σ², which lies farther to the left.
represented by the portion of a distorted curve shaded by horizontal lines. All the principles concerning the test of a hypothesis given in Section 6.6 apply here. As the sample size n increases, and consequently the number of degrees of freedom, n − 1, increases, the overlapping area of the true χ²-curve and a distorted χ²-curve will decrease, and thus the probability of committing a Type II error will decrease. As the ratio σ²/σ₀² drifts away from 1, e.g., σ₀² = 10σ² instead of σ₀² = 2σ², or σ₀² = σ²/10 instead of σ₀² = σ²/2, the overlapping area of the true χ²-curve and a distorted χ²-curve will also decrease, and thus the probability of committing a Type II error will decrease. In other words, if the significance level remains unchanged, either the increase of sample size or the drifting away of σ₀² from σ² will reduce the probability of committing a Type II
error. The Type II error shown in Fig. 7.10b or 7.10c is so large because (a) the sample size is small and (b) the ratio σ²/σ₀² is not greatly different from 1 (σ²/σ₀² = 2 in one case, and σ²/σ₀² = ½ in the other).
The distribution of s²/σ² can also be used in testing the hypothesis that the population variance is equal to a given value. For illustration, the same example of SS = 480 and n = 5 is used. Since s² = 480/4 = 120, the statistic s²/σ₀² = 120/100 or 1.20. The critical regions can be determined with the aid of Table 5, Appendix. They are where χ²/ν < .121105 and χ²/ν > 2.7858, where ν is the number of degrees of freedom. These two values which determine the critical regions are one-fourth as large as their corresponding values obtained from the χ²-table, and the statistic s²/σ₀² is also one-fourth as large as SS/σ₀². Therefore the conclusions reached by the two methods should always be the same. There is really no need for both methods. The reason for introducing the statistic s²/σ₀² is that it provides the common-sense way of looking at the test of the hypothesis that the population variance is equal to σ₀². The statistic s² is an estimate of the true variance σ². If s²/σ₀² is too much greater than 1, the indication is that the true variance is greater than the hypothetical variance being tested. If s²/σ₀² is too much less than 1, the indication is that the true variance is less than the hypothetical variance being tested.
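The test of this section is mechanical enough to express as a routine. The sketch below is illustrative only; it assumes SciPy is available for the χ² percentage points in place of Table 4, Appendix, and applies the routine (with a function name chosen here) to the example of SS = 480, n = 5, σ₀² = 100.

```python
from scipy.stats import chi2

def variance_test(ss, n, sigma0_sq, level=0.05):
    """Two-tailed test that the population variance equals sigma0_sq."""
    df = n - 1
    stat = ss / sigma0_sq
    lower = chi2.ppf(level / 2, df)       # .484419 for 4 d.f. at the 5% level
    upper = chi2.ppf(1 - level / 2, df)   # 11.1433 for 4 d.f. at the 5% level
    if stat < lower:
        return stat, "variance is less than the hypothetical value"
    if stat > upper:
        return stat, "variance is greater than the hypothetical value"
    return stat, "variance is equal to the hypothetical value"

print(variance_test(480, 5, 100))         # (4.8, 'variance is equal ...')
```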
7.11 Procedures of Test of Hypothesis
The procedures in the test of the hypothesis that the population variance is equal to a given value can be illustrated by an example. A sample of 10 observations is drawn from a population. The observations are tabulated as follows:

4.8  3.2  4.7  4.8  6.1
5.6  3.6  5.3  5.1  7.6
The problem is to determine whether the variance of the population is equal to 4. The test procedure is as follows:
1. Hypothesis: The hypothesis is that the population variance is equal to 4, that is, σ² = σ₀² = 4.
2. Alternative hypotheses: The alternative hypotheses are (a) that the population variance is less than 4, that is, σ² < 4, and (b) that the population variance is greater than 4, that is, σ² > 4.
3. Assumptions (conditions under which the test is valid): The sample is a random sample drawn from a normal population.
4. Level of significance: The chosen significance level is 5%.
5. Critical regions: The critical regions are where (a) χ² < 2.70039 and (b) χ² > 19.0228. (Values obtainable from Table 4, Appendix; χ² with 9 d.f. An example of a one-tailed test is given in Section 7.12.)
6. Computation of statistic:

   σ₀² = 4 (given)
   n = 10
   Σy = 50.8
   Σy² = 271.80
   (Σy)² = 2580.64
   (Σy)²/n = 2580.64/10 = 258.064
   SS = 271.80 - 258.064 = 13.736   [SS = Σy² - (Σy)²/n (Section 7.4)]
   χ² = SS/σ₀² = 13.736/4 = 3.4340 with 9 d.f.

7. Conclusion: Since the computed χ²-value, 3.4340, is outside the critical regions, the hypothesis is accepted. The conclusion is that the population variance is equal to 4. (If the computed χ²-value were less than 2.70039, the conclusion would be that the population variance is less than 4. If the computed χ²-value were greater than 19.0228, the conclusion would be that the population variance is greater than 4.)

7.12 Applications
The test of the hypothesis that the population variance is equal to a given value is extensively used in industry. Manufacturers ordinarily require that their products have a certain degree of uniformity in length, weight, etc. Consequently, the permissible variance (or standard deviation) of a certain measurement of a product is usually specified as a production standard. Samples are taken periodically and the hypothesis that the production standard is being maintained is tested. For instance, the average drained weight of a can of cherries may be specified as 12 ounces, and the standard deviation may be specified as ¼ ounce. As long as whole cherries are canned, it is almost impossible to make the drained weight of every can exactly 12 ounces. One more cherry might cause overweight and one less might cause underweight; therefore, it is expected that the drained weights of canned cherries will vary from can to can. The problem, however, is to prevent the variation from becoming too great. A random sample of n, say 10, cans is inspected periodically. The cherries are drained and weighed. The weights are the observations, y. The hypothetical variance is (¼)², or σ₀² = 1/16. The procedures given in Section 7.11 may be followed in testing this hypothesis. If the objective is to prevent the population variance from becoming too large, the one-tailed test should be used. In other words, there is only one critical region, on the right tail of the χ²-distribution. The two critical regions where χ² < 2.70039 and χ² > 19.0228, as given in the preceding section, should be replaced by one critical region where χ² > 16.9190. If the hypothesis is accepted after the statistical test, the production standard is maintained and no action is needed. If the hypothesis is rejected, the conclusion is that the drained weight varies more from can to can than the specified standard allows. Then some action is needed to correct the situation. The advantage of testing the hypothesis periodically is
that any defects in the manufacturing process are in this way revealed before a great deal of damage is done. This example of what is called quality control is an illustration of the general principles of the application of statistics in industry. The details may be found in the many books written on the subject. It should be realized that the term "quality control" used here is really quantity control. It has nothing to do with the quality of the cherries.

7.13 Remarks
The purpose of this chapter, like that of the preceding chapters, is to introduce some of the basic principles of statistics. The application is incidental. Even though the test of the hypothesis that the population variance is equal to a given value is extensively used in industrial quality control, its use among research workers is quite limited. Examples of its usefulness to research workers can be found; by and large, however, its importance as compared to the methods presented in later chapters is not very great. Nevertheless, principles presented in this chapter are repeatedly used in developing more useful methods. The χ²-test is used in Chapters 21, 22, and 24 to test a wide variety of hypotheses other than the one mentioned in this chapter.
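For readers with access to a computer, the two-tailed test of Section 7.11 can be reproduced by machine. The following is a minimal sketch, assuming Python with the scipy library available; the variable names are illustrative only.

    # A sketch of the two-tailed chi-square test of Section 7.11 (assumes scipy).
    from scipy.stats import chi2

    y = [4.8, 5.6, 3.2, 3.6, 4.7, 5.3, 4.8, 5.1, 6.1, 7.6]
    var0 = 4.0                                     # hypothetical variance
    n = len(y)
    ss = sum(v * v for v in y) - sum(y) ** 2 / n   # SS = 13.736
    chi_sq = ss / var0                             # 3.4340 with 9 d.f.
    lo = chi2.ppf(0.025, n - 1)                    # 2.70039
    hi = chi2.ppf(0.975, n - 1)                    # 19.0228
    # Accept the hypothesis sigma^2 = 4 when chi_sq lies between lo and hi.
    print(round(chi_sq, 4), lo < chi_sq < hi)

Since 3.4340 lies between the two boundaries, the program reports acceptance, in agreement with the conclusion of Section 7.11.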
EXERCISES
(1) Draw all possible samples of size 2, with replacement, from the population which consists of the observations 2, 3, 4, 5. Find the mean ȳ and variance s² for each sample. Show that (a) ȳ is an unbiased estimate of μ and (b) s² is an unbiased estimate of σ².
(2) A random sample of 8 observations is drawn from the tag population which is a normal population with mean equal to 50 and standard deviation equal to 10. The observations of the sample are as follows:
62  43  60  37  56  41  49  43.
Pretend that the population mean and variance are unknown. Using the 5% significance level, test the hypothesis that the population variance is equal to (a) 10, (b) 99, (c) 100, (d) 101, and (e) 1000. This is a two-tailed test. Following the procedure given in Section 7.11, write a complete report for (a) only. For (b), (c), (d), (e), simply compute the χ²-values and state the conclusions. For each of the five cases state whether the conclusion is correct, or whether a Type I error or a Type II error is made. The purpose of this exercise is to acquaint the student with the basic principles of the test of a hypothesis. It is not an application to a practical problem: in practice one does not test various hypotheses
with the same sample. [(a) χ² = 61.8875; no error. (b) χ² = 6.2513; Type II. (c) χ² = 6.1888; no error. (d) χ² = 6.1275; Type II. (e) χ² = 0.6189; no error.]
(3) Using the 8 observations given in Exercise 2, verify the three identities proved in Section 7.8 (μ = 50, σ = 10). Do not drop the decimals in your computations.
(4) The drained weights, in ounces, of 12 cans of cherries are:
11.9  12.7  12.6  11.9  12.3  11.3  11.8  12.0  12.1  11.8  11.5  12.1.
The specified standard deviation is ¼ ounce. Is this specification being met? The purpose of this test of hypothesis is to detect the possibility that the standard deviation may become too large; therefore, the one-tailed test should be used. Use the 1% significance level.
(5) Find the variance s² for each of the 25 samples of Exercise 1, Chapter 5. Then show that the mean of s² is equal to σ².
(6) Repeat Exercise 1 with the sample size changed from 2 to 4.
(7) Draw all possible samples of size 3, with replacement, from the population which consists of the observations 3 and 6. Find the mean ȳ and variance s² for each sample. Show that (a) the mean of ȳ is equal to μ and (b) the mean of s² is equal to σ².
(8) The specified standard deviation of a certain machine part is allowed to be 0.010 inches. A sample of 10 parts is measured, and the measurements, in inches, are recorded as follows:

1.011  0.975  0.998  0.995  0.980  0.970  1.021  1.000  1.025  1.031.
Test the hypothesis that the population standard deviation is equal to 0.010, at the 5% level, with the intention to detect the possibility that the standard deviation may become too large.

QUESTIONS
(1) One thousand random samples, each consisting of 5 observations,
are drawn from the tag population which is a normal population with mean equal to 50 and variance equal to 100. If the statistic SS/σ² were calculated for each sample, (a) what distribution does this statistic follow? (b) what is the mean of this distribution?
(2) If the sampling experiment were done as described in Question (1) except that each sample consists of 10 instead of 5 observations, (a) what distribution would the statistic SS/σ² follow? (b) what would be the mean of this distribution?
(3) If the sampling experiment were done as described in Question (1) except that 2000 instead of 1000 samples were drawn, (a) what distribution would the statistic SS/σ² follow? (b) what would be the mean of this distribution?
(4) If 10 were added to each of the observations in the population before the sampling experiment of Question (1) were carried out, (a) what distribution would the statistic SS/σ² follow? (b) what would be the mean of this distribution?
(5) If the statistic concerned is Σ(y - μ)²/σ² instead of Σ(y - ȳ)²/σ², what are the answers to Questions (1)-(4)?
(6) What is the relation between the distribution of SS/σ² and that of s²/σ²?
(7) (a) What is u? (b) What distribution does u² follow?
(8) What is the analysis of variance?
(9) What is an unbiased estimate?
(10) What can one do to reduce the probability of committing a Type II error, if the significance level is fixed at 5%?
(11) What are the assumptions underlying the χ²-test?
(12) The χ²-test may be used in testing the hypothesis that the population variance is equal to a given value. If the hypothesis is accepted, the χ²-value must be in the neighborhood of what value?
REFERENCES
Kendall, Maurice G.: Advanced Theory of Statistics, Vol. II, Charles Griffin & Company, London, 1946.
Mood, Alexander M.: Introduction to the Theory of Statistics, McGraw-Hill Book Company, New York, 1950.
Peach, Paul: An Introduction to Industrial Statistics and Quality Control, Edwards & Broughton Co., Raleigh, N.C., 1947.
Pearson, Karl (Editor): Tables for Statisticians and Biometricians, Part I, Table XII, Biometric Laboratory, University College, London, 1930.
CHAPTER 8
STUDENT'S t-DISTRIBUTION
This chapter introduces another important frequency distribution called Student's t-distribution, named after W. S. Gosset, who used the pseudonym "Student" in his statistical writings. First developed by Gosset early in this century, this distribution was subsequently modified by R. A. Fisher. It is this modified version which is presented in this chapter.
8.1 Description of t-Distribution
The u-test is introduced in Section 6.7 to test the hypothesis that the population mean is equal to the given value μ₀, where

    u = (ȳ - μ₀) / √(σ²/n).    (1)

However, the u-test has limited practical use, because the population variance σ² is usually unknown. The t-distribution is developed to overcome this difficulty. If the population variance σ² in Equation (1) is replaced by the sample variance s², the resulting statistic is

    t = (ȳ - μ₀) / √(s²/n).    (2)
The purpose of introducing t, therefore, is to remove the restrictive condition that the population variance must be known. The variance s² in Equation (2) can be computed from a sample. Thus even though u has limited practical use, it is instrumental in introducing t. Since the mean of the means of all possible samples of the same size is equal to the population mean μ, it is conceivable that the mean of t is equal to 0, or is the same as that of u. It is also conceivable that the variance of t is greater than that of u (variance of u = 1, Section 6.7). The statistic u is made of four elements, namely, ȳ, μ, σ², and n. Of these four elements only ȳ changes from sample to sample. But while the statistic t is also made of four elements, namely, ȳ, μ, s², and n, two of these elements, ȳ and s², change from sample to sample. As a result, it is expected that t will fluctuate more from sample to sample than u will. Therefore, it is expected that the variance of t is greater than 1, which is the variance of u.
The t-distribution is not a single frequency curve, but a family of curves. The number of degrees of freedom of s² uniquely identifies a particular t-curve. The variation, from sample to sample, of s² diminishes as the number of degrees of freedom increases. As a result, the variation of t also diminishes as the number of degrees of freedom of s² increases. As the number of degrees of freedom approaches infinity, s² approaches σ²; consequently t approaches u. Thus u becomes a special case of t. The number of degrees of freedom of s² is also called the number of degrees of freedom of t. In other words, a particular t-distribution is identified by the number of degrees of freedom of s² in Equation (2).

Fig. 8.1 Student's t-distributions with 1, 4, and ∞ degrees of freedom.

From the graphs of the t-distributions, with 1, 4, and ∞ degrees of freedom, given in Fig. 8.1, it can be seen that a t-curve is bell-shaped and looks very much like the normal curve. Therefore, the casual observation of the graphs will not enable one to distinguish them. It is the relative frequency, not the general appearance, that distinguishes one frequency curve from another. The t-distribution with ∞ degrees of freedom shown in Fig. 8.1 is the u-distribution, which is the normal distribution with mean equal to 0 and variance equal to 1. The above discussion can be summarized in the following theorems:
Theorem 8.1a If all possible samples of size n are drawn from a normal population with mean equal to μ, and for each sample the statistic t, where

    t = (ȳ - μ) / √(s²/n),    (3)
is calculated, the frequency distribution of the t-values follows the Student's t-distribution with ν degrees of freedom, where ν is the number of degrees of freedom of s² (ν = n - 1 in this case).
Theorem 8.1b As the number of degrees of freedom of s² approaches infinity, the Student's t-distribution approaches the normal distribution with mean equal to zero and variance equal to 1, that is, t approaches u, as ν approaches infinity.
The experimental verification of Theorem 8.1a is given in the following section.
8.2 Experimental Verification of t-Distribution
Theorem 8.1a can be verified experimentally. The details of the sampling experiment are described in Chapter 4. Briefly, 1000 random samples, each consisting of 5 observations, are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. For each sample, the statistic t is calculated. As an example, the computing procedure of t for the sample consisting of the observations 50, 57, 42, 63, 32 is shown as follows:
   n = 5
   Σy = 244
   ȳ = 244/5 = 48.8
   (Σy)² = (244)² = 59,536
   (Σy)²/n = 59,536/5 = 11,907.2
   Σy² = 12,506
   SS = 12,506 - 11,907.2 = 598.8 (Section 7.4)
   s² = 598.8/4 = 149.7 (Section 7.4)
   s²/n = 149.7/5 = 29.94
   √(s²/n) = √29.94 = 5.472
   ȳ - μ = 48.8 - 50 = -1.2
   t = -1.2/5.472 = -0.219.
For each of the 1000 samples, the t-value is calculated as shown above. An example of the t-values of four samples is given in Table 4.2. This is not so formidable a computing project as it seems. The values of ȳ and SS are already computed for previous sampling experiments, and very little additional computation is needed to obtain a t-value for each sample. Since s² has n - 1 or 4 degrees of freedom, t also has 4 degrees of freedom. The frequency table of these 1000 t-values is given in Table 8.2. The theoretical frequency given in that table is that of the t-distribution with 4 degrees of freedom. The histogram of the 1000 t-values with the superimposed t-curve with 4 degrees of freedom is shown in Fig. 8.2.
TABLE 8.2

t               Mid-pt.   Observed            Theoretical
                m         f        r.f.(%)    r.f.(%)       mf
Below -4.5      -5        8        .8         .5            -40
-4.5 to -3.5    -4        6        .6         .7            -24
-3.5 to -2.5    -3        23       2.3        2.1           -69
-2.5 to -1.5    -2        85       8.5        7.1           -170
-1.5 to -0.5    -1        218      21.8       21.8          -218
-0.5 to 0.5     0         325      32.5       35.6          0
0.5 to 1.5      1         219      21.9       21.8          219
1.5 to 2.5      2         80       8.0        7.1           160
2.5 to 3.5      3         25       2.5        2.1           75
3.5 to 4.5      4         4        .4         .7            16
Above 4.5       5         7        .7         .5            35
Total                     1000     100.0      100.0         -16

Mean of t = Σmf/Σf = -16/1000 = -.016
(This sampling experiment was conducted cooperatively by about 75 students at Oregon State College in the Fall of 1952.)
It can be seen either from Table 8.2 or Fig. 8.2 that the observed frequency and the theoretical frequency fit very closely. The observed frequency of t is based on the t-values of 1000 samples, while the theoretical frequency is based on all possible samples of size 5. They do not exactly agree. The mean of the 1000 t-values could be found easily if they were available. Unfortunately, however, the identity of each individual t-value is lost after the frequency table is made. Yet the approximate value of the mean can be found by using the mid-point m of a class to represent all the t-values in that class. For example, the class -.5 to .5 is represented by 0 and the class .5 to 1.5 is represented by 1. The two extreme classes do not have definite class limits. They are arbitrarily assigned the values -5 and 5 respectively. Then the approximate mean of the 1000 t-values is
    Σmf/Σf = -16/1000 = -.016
which is very close to 0 as expected. It can also be observed from Table 8.2 that the variance of t is larger than that of u. For example, the relative frequency of u beyond -3.5 and +3.5 is almost equal to 0 (Table 3, Appendix). But in Table 8.2, it can be seen that 1.4% of the 1000 t-values are less than -3.5 and that 1.1% of the t-values are larger than 3.5. This shows that the variance of t must be larger than that of u. This experiment verifies the t-distribution and confirms the speculation
(Section 8.1) that the mean of t is equal to zero and that the variance of t is larger than 1.

Fig. 8.2 Histogram of the 1000 t-values with the superimposed t-curve with 4 degrees of freedom.

After the verification of Theorem 8.1a, the theorem itself may appear to have been awkwardly stated. If every sample has n observations, s² must have n - 1 degrees of freedom, and consequently t has n - 1 degrees of freedom. Certainly there is no need to say that t has ν degrees of freedom, where ν is the number of degrees of freedom of s². However, the reason for stating the theorem in such an apparently awkward manner is that ȳ and s² in

    t = (ȳ - 50) / √(s²/n)    (1)

need not come from the same sample, and not even from samples of the same size. For example, if 2000 random samples were drawn from the tag population, with the size of the first, third, ..., 1999th samples being 4 and the size of the second, fourth, ..., 2000th samples being 9, one t-value could be calculated from a pair of samples of sizes 4 and 9 respectively. The sample mean ȳ could be calculated from the first sample with 4 observations, and the sample variance s² could be calculated from the second sample with 9 observations. If the sampling experiment were conducted this way, a t-value
could be calculated for each pair of samples. The quantity n in Equation (1) in this case would be 4, which is the number of observations from which ȳ is calculated, and the number of degrees of freedom of s² would be 8. Then the frequency distribution of these 1000 t-values would follow the t-distribution with 8 degrees of freedom. Although it may be difficult at this stage to see why ȳ and s² should be calculated from different samples, as the subject develops the desirability of doing so becomes obvious.
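The sampling experiment of this section can be imitated on a computer. The following sketch, assuming Python with numpy available and an arbitrarily chosen seed, draws 1000 samples of size 5 from a normal population with mean 50 and variance 100 and computes a t-value for each.

    # A sketch of the Section 8.2 sampling experiment (assumes numpy).
    import numpy as np

    rng = np.random.default_rng(1952)          # arbitrary seed
    t_values = []
    for _ in range(1000):
        y = rng.normal(50, 10, 5)              # one sample of size 5
        s2 = y.var(ddof=1)                     # s^2 with 4 d.f.
        t_values.append((y.mean() - 50) / np.sqrt(s2 / 5))
    t_values = np.array(t_values)
    print(t_values.mean())                     # near 0, as in Table 8.2
    print((np.abs(t_values) > 3.5).mean())     # tails heavier than those of u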
8.3 t-Table
The relative cumulative frequency of the t-distribution for various numbers of degrees of freedom is given in Table 6, Appendix. Each line of the t-table represents a particular number of degrees of freedom. For example, for 4 degrees of freedom, 2.5% of the t-values are greater than 2.776. Since the t-curve is symmetrical, this tabulated value also indicates that 2.5% of the t-values are less than -2.776. As the number of degrees of freedom increases, the tabulated values in the column labeled 2.5% in the t-table become smaller and reach 1.960 as the limit. This shows that t approaches u as the number of degrees of freedom approaches infinity, and also shows that the variance of t decreases as the number of degrees of freedom increases.

8.4 Test of Hypothesis
The preceding three sections deal with the deductive relation between a population and its samples, or more specifically, the distribution of the t-values of all possible samples of the same size drawn from a given normal population. This section deals with the drawing of inductive inferences about the population from a given sample, or more specifically, the test of the hypothesis that the population mean is equal to a given value. It is necessary to establish the t-distribution before the test of hypothesis is discussed, because the t-values which are needed to establish the critical regions come from the t-table, which is made according to the distribution of the t-values of all possible samples of the same size drawn from the same normal population. The use of t is similar to that of u in testing the hypothesis that the population mean is equal to a given value. The only difference is that, in the t-test, the sample variance s² is used, while in the u-test, the population variance σ² is used. Since the two distributions are not the same, the critical regions are also different. If the 5% significance level is used, the boundary values of the critical regions for a two-tailed u-test are -1.96 and 1.96, while in the t-test these values are replaced by the corresponding values in the t-table with the appropriate number of degrees of freedom. For 4 degrees of freedom, these values are -2.776 and 2.776 (Table 6, Appendix).
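The boundary values just quoted can also be computed rather than read from the t-table; a sketch, assuming scipy:

    # Percentage points of the t-distribution (assumes scipy).
    from scipy.stats import t as t_dist

    print(t_dist.ppf(0.975, 4))       # 2.776, the 2.5% point for 4 d.f.
    print(t_dist.ppf(0.975, 10**6))   # practically 1.960, the u-value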
All the discussions of significance level, Type II error, and sample size concerning the u-test as given in Sections 6.4, 6.5, 6.6, and 6.7 apply to the t-test. To avoid duplication, only the rudiments of these discussions are repeated here. The test of hypothesis can be explained in terms of the sampling experiment of Section 8.2. It is important to realize that, in verifying the t-distribution with 4 degrees of freedom, the true population mean 50 is used in computing the 1000 t-values, that is,

    t = (ȳ - 50) / √(s²/n).

However, in testing the hypothesis that the population mean is equal to 50, the critical regions are where t < -2.776 and where t > 2.776, if the 5% significance level is used. Since 5% (relative frequency) of all possible samples of size 5 yield t-values falling inside these regions, and thus lead to the erroneous conclusion that the population mean is not equal to 50, the probability of one sample, drawn at random, committing the Type I error is .05. The reason for risking the Type I error is the ever-present possibility that the hypothetical mean being tested may be false. For the sake of discussion, consider the true population mean μ to be 50 and the hypothetical mean μ₀ to be 60. In testing a hypothesis, the hypothesis is considered true until proved false. Therefore, the computed t is

    t′ = (ȳ - μ₀) / √(s²/n) = (ȳ - 60) / √(s²/n).    (1)

Since the statistic

    t = (ȳ - 50) / √(s²/n)    (2)

follows the t-distribution, t′ in Equation (1) cannot follow the t-distribution, because the wrong value of the population mean is used. The effect of using 60 instead of 50 is to make t′ less than t. When a computed t-value is small (large negative number) enough to fall inside the left critical region, it could be because the hypothetical mean μ₀ is larger than the true mean μ; or it could be because the sample is unusual while μ₀ and μ are really equal. But the decision is to reject the hypothesis; that is, the former reason overrides the latter one. The conclusion is that the true mean μ is less than the hypothetical mean μ₀. If, for fear
of committing a Type I error, the hypothesis is accepted no matter how small or how large a t-value is, the large t-value, derived from the fact that μ₀ < μ, or the small t-value, derived from the fact that μ₀ > μ, will escape detection. Consequently, any hypothesis, correct or false, will be accepted. In other words, one risks a Type I error to make possible the rejection of the hypothesis if it seems false. If the significance level is made low, the probability of committing a Type I error is reduced but that of committing a Type II error is increased if the sample size remains the same.

8.5 Procedures
The procedures of the t-test may be illustrated by an example, in which one-digit observations are used to make the computing procedures easy to follow. The observations of a given sample are 5, 3, 1, 4, 2. A two-tailed test, with 5% significance level, is used to test the hypothesis that the population mean is equal to 5.
1. Hypothesis: The hypothesis is that the population mean is equal to 5, that is, μ₀ = 5.
2. Alternative hypotheses: The alternative hypotheses are that (a) the population mean is less than 5 or (b) the population mean is greater than 5.
3. Assumptions: The given sample is a random sample drawn from a normal population.
4. Level of significance: The 5% significance level is chosen.
5. Critical regions: The critical regions are where t < -2.776 and where t > 2.776.
6. Computation of t:
   μ₀ = 5
   n = 5
   Σy = 15
   ȳ = 3
   Σy² = 55
   (Σy)² = 225
   (Σy)²/n = 45
   SS = 55 - 45 = 10
   s² = 10/4 = 2.5
   s²/n = 2.5/5 = .5
   √(s²/n) = .7071
   ȳ - μ₀ = 3 - 5 = -2
   t = -2/.7071 = -2.83 with 4 d.f.
7. Conclusion: Since t is inside the left critical region, the conclusion is that the population mean is less than 5. (If the t-value were between -2.776 and 2.776, the conclusion would be that the population mean is equal to 5. If the t-value were greater than 2.776, the conclusion would be that the population mean is greater than 5.)
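The same test is available as a library routine; a sketch, assuming scipy, using the observations above:

    # One-sample two-tailed t-test of Section 8.5 (assumes scipy).
    from scipy.stats import ttest_1samp

    y = [5, 3, 1, 4, 2]
    result = ttest_1samp(y, popmean=5)   # two-tailed test of the mean
    print(result.statistic)              # -2.83 with 4 d.f.
    print(result.pvalue < 0.05)          # True: the hypothesis is rejected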
It should be noted that t, like u, has no unit of measurement. If the observations y are measured in inches, ȳ is a number of inches; μ₀ is a number of inches; s² is a number of square inches; n has no unit of measurement. Therefore, the unit of t is

    (ȳ inches - μ₀ inches) / √(s² sq. in./n) = (number of inches)/(number of inches),

a pure number. Consequently, the unit used in the observations has no bearing on the t-value. Similarly, if the same quantity is subtracted from or added to all the observations, the t-value again is unaffected. If 32 is added to each of the observations, ȳ is increased by 32; μ₀ is also increased by 32; consequently the numerator ȳ - μ₀, and with it the t-value, remains unchanged.
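This invariance is easily checked numerically. A sketch, assuming scipy, using the centigrade-to-Fahrenheit conversion of Exercise (4):

    # A linear change of units leaves t unchanged, provided the
    # hypothetical mean is converted as well (assumes scipy).
    from scipy.stats import ttest_1samp

    y_c = [5, 3, 1, 4, 2]                    # readings in centigrade
    y_f = [1.8 * v + 32 for v in y_c]        # the same readings in Fahrenheit
    print(ttest_1samp(y_c, 5).statistic)             # -2.83
    print(ttest_1samp(y_f, 1.8 * 5 + 32).statistic)  # -2.83, unchanged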
8.6 Applications
The t-test may be used in industrial quality control. For example, the average drained weight of canned cherries is fixed at 12 ounces as a production standard (Section 7.12). A sample of n, say 10, cans is inspected periodically. The cherries of each can are drained and weighed and the weight is recorded. With this sample of 10 observations, the hypothesis that the population mean is equal to 12 ounces can be tested. If the hypothesis is accepted, the conclusion is that the production standard is being met. If the t-value is inside the left critical region, the indication is that the average weight of the cherries is below the standard and corrective action must be taken. If the t-value is inside the right critical region, the indication is that the average weight of the cherries is above the standard. This example illustrates the basic principle of controlling the mean of a certain product in industry. Yet in practice the t-test is seldom used for this purpose, largely because computation of t is time consuming. Therefore a much simpler method is used, although the principles remain the same.
Another illustration of the application of the t-test may be seen in the comparison of the scores of a standard achievement test of the graduates of a certain high school with the national average. The scores of the graduates of this particular high school are the observations of a sample. The ȳ and s² can be calculated from the scores of the n students. Even when the examination is given to all graduates of that school, the set of
scores is still considered a sample, for the population theoretically consists of all the scores that could be produced by the potential students of this school. If the national average score is 65, the hypothetical mean of this potential population is 65, that is, μ₀ = 65. Since μ₀, ȳ, s², and n are available, t can be calculated. If t falls outside the critical regions, the indication is that this school is up to the national standard. If t falls inside the left critical region, the indication is that this school is not up to the national standard. It should be noted that the scores of all potential graduates of that school are considered here as a population, and that the scores of this particular group of students constitute only a sample from that population.

8.7 Paired Observations
The possible applications mentioned in the preceding section do not greatly concern experimental scientists. Of more direct interest to them is the application of the t-test on paired observations, the meaning of which is clarified by the following example. An experiment was conducted in Eastern Oregon in 1950 to determine the effect of nitrogen fertilizer on the yield of sugar beets. A field was divided into 10 blocks of equal size. Each block was divided into two equal plots, making 10 pairs of plots (Fig. 8.7). One plot of each pair (block) was selected at random, as by tossing a coin, and the fertilizer was applied to that plot at the rate of 50 pounds of available nitrogen per acre. No fertilizer was applied on the other plot of the pair (block). The field map of the 20 plots is shown in Fig. 8.7. The yield of sugar beets, in pounds, from the 20 plots is given in Table 8.7.
Fig. 8.7 Field map of the 10 blocks, each divided into a pair of plots; one plot of each pair, chosen at random, received the fertilizer (0 = no fertilizer; 50 = 50 lbs. of available nitrogen per acre).
The two plots of a block were placed side by side so that the variation in natural soil fertility would have the least possible effect on yields from the treated and untreated plots. The plot to be treated with fertilizer was selected at random to simulate the assumption of random sampling (Section 8.5, item 3). Because each block of two plots was an experiment by itself, the experiment was actually done 10 times.
TABLE 8.7

             Fertilizer
Block No.    (a) 0 lbs.    (b) 50 lbs.    Difference y = (b) - (a)
1            140.4         170.5          30.1
2            174.7         207.4          32.7
3            170.2         215.9          45.7
4            174.6         209.0          34.4
5            154.5         171.6          17.1
6            185.0         201.2          16.2
7            118.9         209.9          91.0
8            169.8         213.3          43.5
9            174.7         184.1          9.4
10           176.7         220.4          43.7
Total        1639.5        2003.3         363.8

   μ₀ = 0
   n = 10
   Σy = 363.8
   ȳ = 36.38
   (Σy)² = 132350.44
   (Σy)²/n = 13235.044
   Σy² = 17973.30
   SS = 4738.26
   s² = SS/9 = 526.473
   s²/n = 52.6473
   √(s²/n) = 7.2558
   t = (36.38 - 0)/7.2558 = 5.014 with 9 d.f.

(This is a small part of an extensive experiment conducted by Dr. Albert S. Hunter, Oregon Agricultural Experiment Station.)
The difference y between the yield of the fertilized plot and that of the unfertilized plot of each of the 10 blocks is given in Table 8.7. It should be noted that the difference changes from block to block, making a set of 10 differences. These 10 differences constitute a sample of 10 observations from the population consisting of infinitely many potential observations. That is, future experiments of the same kind are in effect further samples drawn from this infinite population. The problem is, therefore, to draw an inference about this population from this sample of 10 observations. The hypothesis being tested is that the population mean is equal to 0, or that, on the average, the fertilizer does not increase the
yield of sugar beets. The rest of the procedure in the test of hypothesis is the same as that outlined in the preceding section. The two-tailed test is used because it is possible that the fertilizer may increase or decrease the yield. The critical regions are where t < -2.262 and where t > 2.262 (9 d.f.), if the 5% significance level is used. The computation of t is shown in the lower half of Table 8.7. The t-value is 5.014 with 9 degrees of freedom. Since the t-value is inside the right critical region, the conclusion is that the application of the fertilizer, at the rate of 50 pounds of available nitrogen per acre, increases the yield of sugar beets. Of course, this is a foregone conclusion after an examination of the data even without any statistical test. However, it is not wise for an inexperienced person to draw a conclusion solely from an examination of the data, because whereas the conclusion drawn from the t-test is based on four elements, namely ȳ, μ₀, s², and n, an inexperienced person very frequently ignores s² and n, and draws his conclusion only from the difference ȳ - μ₀.
The use of the paired observations is not limited to field experiments. It can be used in any experiment consisting of only two treatments. Two kinds of feed may be given to two groups of animals to determine the relative merits of the feeds as shown by the weights gained by the animals after a feeding period. The animals are matched before the experiment is started. The animals may be two pigs from the same litter or two steers of the same initial weight or age. One of the matched pair is assigned to one treatment at random, while the other one is given the other treatment. As another example, the method of paired observations can also be used in the comparison of two teaching methods. A group of 2n, say 40, children may be divided into two classes, each of 20 children. The children are matched beforehand in pairs by some criterion such as I.Q. Each child of the pair with similar I.Q.'s is assigned to one of the two classes at random. Both classes are taught the same subject by the same teacher with two different methods, for instance, one with visual aids and the other without visual aids. At the end of the term, the same examination is given to both classes. From the 40 scores, 20 differences are obtained. Therefore, the sample size is 20. A difference between the scores of a pair of children may be either positive or negative. The + or - sign cannot be ignored. An experiment of this kind determines which one of the two methods is better suited to a particular teacher, rather than the relative merits of the two teaching methods. The method of paired observations can be used only if there are but two treatments. If there are more than two treatments, such as four kinds of feeds or three teaching methods involved in an experiment, a different experimental design called randomized blocks (Chapter 14) may be used. This design can accommodate any number of treatments. The
method of paired observations is, therefore, a special case of the randomized block design (Section 14.6). When the number of treatments is two, either method may be used. The conclusions reached by the two methods are always the same. Any discrepancy is due to mistakes in computation. The advantage of pairing the children, plots, or animals is to minimize the experimental error. The variation from one pair to another is eliminated by the pairing procedure. For example, if 10 is added to the two observations in Block 1 of Table 8.7 and 20 is added to the two observations in Block 2, 30 to Block 3, and so forth, the value of t is not affected, because the 10 differences are still the same. If the variation in soil fertility causes this to happen, the accuracy of the experiment is not affected. Similarly, the I.Q.'s of the children may vary a great deal from one pair to another, yet the accuracy of the experiment is not affected. Moreover, if the teaching methods are tried on children of different I.Q.'s, the experiment has the advantage of having a wide inductive basis. A conclusion based on children of various I.Q.'s is more apt to be generally valid than one based on children with a particular I.Q.
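The test on the paired observations of Table 8.7 can be reproduced with a library routine; a sketch, assuming scipy:

    # Paired-observation t-test on the sugar-beet data of Table 8.7 (assumes scipy).
    from scipy.stats import ttest_rel

    no_fert = [140.4, 174.7, 170.2, 174.6, 154.5, 185.0, 118.9, 169.8, 174.7, 176.7]
    fert = [170.5, 207.4, 215.9, 209.0, 171.6, 201.2, 209.9, 213.3, 184.1, 220.4]
    result = ttest_rel(fert, no_fert)    # t-test on the 10 differences
    print(result.statistic)              # 5.014 with 9 d.f., as in Table 8.7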
8.8 Remarks
One of the important developments of this chapter is the use of ȳ and s² as elements in the process of drawing an inference about the population mean. The t-test involves four elements, ȳ, μ₀, s², and n. The combined contribution of these elements enables one to reach a conclusion. The elements ȳ and s² are not computed for estimating μ and σ² as the final goal, but they are intermediate steps in computing the t-value which enables one to reach a conclusion about the population mean.
EXERCISES
(1) A random sample of 10 observations was drawn from the tag population which is a normal population with mean equal to 50 and variance equal to 100. The observations of the sample are as follows: 45, 55, 68, 55, 51, 44, 42, 45, 53, 37. Assuming that the population mean is unknown and using the 5% significance level, test the hypothesis that the population mean is equal to (a) 40, (b) 49, (c) 50, (d) 51, and (e) 60 by the two-tailed t-test. Following the procedures given in Section 8.5, write a complete report for (a) only. For (b), (c), (d), (e), compute the values of t only and state the conclusions. Since the population mean is actually known to be 50, it can be determined whether a conclusion is right or wrong. For each of the five cases, state whether the conclusion is correct, a Type I error is made, or a Type II error is made. This exercise is intended to show that a Type II error is likely to be committed if the hypothetical population mean is close to the true
population mean. [(a) t = 3.4051; no error. (b) t = 0.1792; Type II. (c) t = -0.1792; no error. (d) t = -0.5377; Type II. (e) t = -3.7636; no error.]
(2) The drained weights in ounces of a random sample of 12 cans of cherries are listed as follows:
12.1  11.9  12.4  12.1  12.4  12.3  11.9  12.4  11.9  12.3  12.1  12.0.
Test the hypothesis at the 1% level that the production standard of 12-ounce average drained weight per can of cherries is being maintained. (t = 2.5729 with 11 d.f.)
(3) Forty-six high school freshman girls in a home-making class were divided into 23 pairs of almost equal I.Q. One of each pair was selected at random and assigned to a section in which the lecture method of teaching is used, while the other was assigned to a section in which the discussion method is used. Thus each section had 23 girls. Both sections were taught by the same instructor. The same examination was given to all girls at the end of the term. The scores on the examination are listed in the following table. Test the hypothesis, at the 5% significance level, that both methods are equally suited to the instructor. (t = 0.9908 with 22 d.f.)
Pair No.   Lecture   Discussion
1          84        72
2          88        78
3          63        81
4          88        84
5          78        75
6          75        63
7          59        72
8          75        75
9          56        53
10         72        75
11         94        69
12         81        84
13         84        69
14         84        72
15         81        84
16         78        69
17         66        78
18         59        69
19         78        69
20         59        84
21         59        63
22         72        53
23         56        38

(Courtesy of Miss Emilia Tschanz, Oregon State College)
(4) Consider the 10 observations given in Exercise (1) as temperature readings in centigrade of various parts of a kiln. The hypothesis to be tested is that the mean temperature of the kiln is 50°C. The t-value is already computed in Exercise (1). Now change each reading into Fahrenheit (F = 1.8C + 32), and recompute the t-value; note that the two t-values are identical and that, therefore, the conclusions are identical. This exercise is intended to show that the t-test is independent of the unit of measurement.
(5) The theoretical relative frequency of the t-distribution with 4 degrees of freedom is given in Table 8.2. Find the relative cumulative frequencies less than -3.5, -2.5, ..., 3.5. Plot the 8 points on the normal probability graph paper and observe that the points are not on a straight line. What does this show?
(6) The following results were obtained for the control animals in a study of the changes in creatine phosphate during chromatolysis of Nissl bodies in the anterior horns of the spinal cord following section of the sciatic nerve. The results are given in terms of milligrams of phosphorus per 100 grams of tissue, for the cervical cord.

Animal number   Normal left   Normal right
A71?            18.2          16.8
A719            9.7           10.2
A754            5.8           6.2
A730            16.6          15.2
A66?            12.8          10.6
A77?            13.7          14.4
A806            14.3          14.0
A774            22.9          14.2
A773            7.9           10.1
Test the hypothesis that the average difference in phosphocreatine content between the left and right sides in normal animals is equal to zero, at the 5% level. (Bodian, David: "Nucleic Acid in Nerve-cell Regeneration," Symposia of the Society for Experimental Biology, No. 1 Nucleic Acid, pp. 163-178, Cambridge University Press, London, 1947.)
(7) The following results were obtained from a study of the changes in creatine phosphate during the chromatolysis of Nissl bodies in the anterior horns of the spinal cord following section of the sciatic nerve. The left sciatic nerve of 10 monkeys was sectioned. The phosphocreatine content (mg. P/100 g. tissue) of the anterior horns of the cervical cord was determined for both the regenerating left side and the normal right side for each animal.
Animal number   Regenerating left   Normal right
A738            5.6                 7.4
A739            4.3                 8.0
A797            12.5                10.9
A778            8.9                 20.6
A779            4.6                 16.8
A780            6.1                 31.8
A677            5.0                 15.9
A684            18.0                22.6
A559            9.8                 17.6
A688            10.6                15.2
Test the hypothesis that the average difference in phosphocreatine content between regenerating and normal cells is equal to zero, at the 5% level. (Bodian, David: "Nucleic Acid in Nerve-cell Regeneration," Symposia of the Society for Experimental Biology, No. 1 Nucleic Acid, pp. 163-178, Cambridge University Press, London, 1947.)
(8) Two methods of evaluating the octane rating of gasoline are available. One method of evaluation is by laboratory analysis, the other by direct test on a standard motor. The following are some data gathered by these two methods.
Sample No.   Laboratory   Motor
1            104          106
2            99           97
3            91           89
4            96           84
5            85           68
6            74           79
7            81           63
8            66           66
9            72           88
Do these two methods give the same results on the average? If not, does this mean that one of the methods of testing is faulty? Use the 5% significance level.
(9) Since certain current educational toys are similar to some of the tasks involved in intelligence tests, a study was conducted to investigate the effects of familiarity with some of these toys on children's test performance. A number of nursery school children were divided into 2 groups of 40 each, matched by pairs for sex and I.Q. score on an intelligence test recently given. Treatments differed in that one group was then given daily use of an assortment of these toys during the
regular morning sessions. At the end of 3 weeks, the children were all retested on an alternate form of the same test. The I.Q. scores of each pair are listed in the accompanying table.
Pair No.   Without Toys   With Toys
1          117            130
2          102            103
3          81             87
4          108            103
5          112            120
6          114            115
7          94             95
8          117            107
9          110            116
10         128            136
11         99             89
12         104            106
13         101            107
14         102            97
15         141            142
16         95             99
17         91             87
18         97             92
19         116            112
20         127            125
21         120            120
22         86             82
23         139            140
24         121            122
25         107            95
26         139            134
27         106            134
28         115            118
29         90             115
30         94             106
31         103            107
32         117            111
33         90             99
34         110            104
35         130            107
36         108            124
37         106            122
38         134            113
39         91             126
40         121            129
Test the hypothesis that the average scores of the two groups of children are the same, at the 5% level.
QUESTIONS
(1) What is the mean of the t-distribution?
(2) Both the t-curve and the normal curve have about the same general appearance. How are they distinguished from one another?
(3) The t-test is introduced to take the place of the u-test. Why does the u-test need to be replaced?
(4) If the population variance is known, you may use either the t-test or the u-test. Which one is the better method? In what sense is it better? Why?
(5) The t-test can be used to test the hypothesis that the population mean is equal to a given value as long as the sample size is equal to or greater than 2. What, then, is the advantage of having a large sample? (Merely to say "a large sample is more accurate" is not an adequate answer.)
(6) Under what condition does the t-distribution become the u-distribution?
(7) How is the number of degrees of freedom of t determined?
REFERENCES
Cochran, William G. and Cox, Gertrude M.: Experimental Designs, John Wiley & Sons, New York, 1950.
Fisher, R. A.: Statistical Methods for Research Workers, 11th edition, Hafner Publishing Company, New York, 1950.
Fisher, R. A.: The Design of Experiments, 6th edition, Hafner Publishing Company, New York, 1951.
Peach, Paul: An Introduction to Industrial Statistics and Quality Control, Edwards & Broughton Co., Raleigh, N.C., 1947.
CHAPTER 9
VARIANCE-RATIO-F-DISTRIBUTION
In the discussion of the characteristics of the sample variance s² in Chapter 7, both the deductive and inductive relations between the population variance σ² and the sample variances s² are explored. In this chapter the subject of the sample variance is expanded to include not merely one population and its samples, as in Chapter 7, but a pair of populations and their respective samples. More specifically, in this chapter the emphasis is not on the variance s² of a sample but on the ratio s₁²/s₂² of a pair of sample variances s₁² and s₂². Furthermore, to cope with the variance-ratio s₁²/s₂², this chapter introduces a new frequency distribution, called the F-distribution, originally developed by R. A. Fisher and subsequently modified by Snedecor, who named the modified version F in honor of Fisher. The original version, called the z-distribution, is not presented in this book.

9.1 Description of F-Distribution
The discussion in this section involves two populations. To avoid possible confusion of the two populations and their respective samples, the subscript 1 or 2 is attached to all notations such as σ², s², n, and ν. For example, σ₁² is the variance of the first population and σ₂² the variance of the second population. The F-distribution can be described through a sampling experiment. From a population with variance equal to σ₁², all possible samples of the size n₁ can be drawn and, for each sample, the variance s₁² can be computed. From another population with variance equal to σ₂², all possible samples of the size n₂ can also be drawn and, for each sample, the variance s₂² can be calculated. Then there are two sets of s²-values, one from each of the two populations. Every s₁² of the first set is divided by every s₂² of the second set to form the variance-ratio s₁²/s₂². If there are 9 possible samples (Section 5.1) drawn from the first population and 25 possible samples drawn from the second population, there are 9 × 25 or 225 possible pairs of samples, and, consequently, the same number of variance-ratios. Since the values of both s₁² and s₂² change from sample to sample, the ratio s₁²/s₂² changes from one pair of samples to another pair. The frequency distribution of these variance-ratios is called the F-distribution, if the two populations are normal and their variances are equal. The F-distribution has two numbers of degrees of freedom, ν₁ and ν₂, which are
the numbers of degrees of freedom of s₁² and s₂² respectively. The first one of a pair of numbers of degrees of freedom of F always refers to that of the numerator and the second one always refers to that of the denominator of the ratio. The F-distribution is a family of frequency curves. A particular F-distribution is uniquely identified by a pair of numbers of degrees of freedom. The graph of three F-curves with 1 and 4, 4 and 4, and 4 and 25 degrees of freedom is shown in Fig. 9.1.

Fig. 9.1 Three F-curves with 1 and 4, 4 and 4, and 4 and 25 degrees of freedom.

The discussion of this section can be summarized in the following theorem:
Theorem 9.1 From a normal population with variance equal to σ₁², all possible samples of size n₁ are drawn and, for each sample, the variance s₁² with ν₁ = n₁ - 1 degrees of freedom is computed. From another normal population with variance equal to σ₂², all possible samples of size n₂ are drawn and, for each sample, the variance s₂² with ν₂ = n₂ - 1 degrees of freedom is computed. The frequency distribution of all possible ratios

    F = s₁²/s₂²    (1)

follows the F-distribution with ν₁ and ν₂ degrees of freedom, if σ₁² = σ₂².
The expression "all possible ratios" as stated in the above theorem implies that each s₁² of all possible samples drawn from the first population has an equal chance to be divided by every s₂² of all possible samples
drawn from the second population. This is an important condition under which the theorem is valid. When this condition is satisfied, the sample variances s₁² and s₂² are said to be independently distributed. Theorem 9.1 is verified by a sampling experiment in the following section.
9.2 Experimental Verification of F-Distribution
The details of the sampling experiment are given in Chapter 4. Briefly, 1000 samples are drawn from the tag population, which is the normal population with mean equal to 50 and variance equal to 100. For each of these 1000 samples the variance s² is computed. Now consider that 500 of the 1000 samples are drawn from one normal population, while the other 500 samples are drawn from another normal population. Since the two imagined populations are really the same tag population, they must have the same variance, and thus the condition of Theorem 9.1 is satisfied. The 500 variance-ratios are obtained by dividing the variance of the first sample by that of the second, the third by the fourth, and so on. An example of two such F-values is given in Table 4.2. The frequency distribution of these 500 variance-ratios is given in Table 9.2. The theoretical frequency shown in that table is that of the F-distribution with 4 and 4 degrees of freedom.
TABLE 9.2

F          Observed             Theoretical
           f        r.f.(%)     r.f.(%)
0-1        241      48.2        50.0
1-2        125      25.0        24.1
2-3        50       10.0        10.3
3-4        27       5.4         5.2
4-5        16       3.2         3.0
5-6        10       2.0         1.9
6-7        8        1.6         1.2
7-8        5        1.0         .9
8-9        4        .8          .6
9-10       2        .4          .5
10-11      2        .4          .3
11-12      2        .4          .3
12-13      1        .2          .2
13-14      0        .0          .2
14-15      0        .0          .2
15-16      1        .2          .1
Over 16    6        1.2         1.0
Total      500      100.0       100.0

(This sampling experiment was conducted cooperatively by about 80 students at Oregon State College in the Fall of 1950.)
Fig. 9.2 Histogram of the 500 F-values superimposed on the F-distribution with 4 and 4 degrees of freedom.
The histogram of the 500 F-values superimposed on the F-distribution with 4 and 4 degrees of freedom is shown in Fig. 9.2. In both the histogram and the table, the theoretical frequency and the observed frequency fit closely. This verifies Theorem 9.1.
In the tag population of 500 observations (Section 4.1), there are (500)⁵ possible samples of size 5 (Section 5.1). This one population generates that many sample variances s₁². A similar population will generate the same number of sample variances s₂². If every s₁² is divided by every s₂², (500)¹⁰ or 9,765,625 × 10²⁰ variance-ratios will result. The 500 variance-ratios involved in this sampling experiment are only a tiny fraction of all possible variance-ratios. It should be noted that in the sampling experiment the samples are independent (Section 4.2). This is an important condition of Theorem 9.1.
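The experiment can be imitated on a computer; a sketch, assuming Python with numpy available and an arbitrarily chosen seed:

    # A sketch of the Section 9.2 sampling experiment (assumes numpy).
    import numpy as np

    rng = np.random.default_rng(1950)      # arbitrary seed
    s2 = [np.var(rng.normal(50, 10, 5), ddof=1) for _ in range(1000)]
    F = np.array(s2[0::2]) / np.array(s2[1::2])   # 500 ratios with 4 and 4 d.f.
    print((F <= 1).mean())     # near .50, as in Table 9.2
    print((F > 16).mean())     # near .01; the 1% point of F(4,4) is 15.977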
9.3 F-Table
The percentage points of the F-distribution with various combinations of numbers of degrees of freedom are given in Table 7 of the Appendix. The numbers of degrees of freedom listed on the top of the table are those of the numerator of the variance-ratio F. The numbers of degrees of freedom listed in the first column of the table are those of the denominator. Table 7 consists of four separate tables, one for each of the four percentage points 5%, 2.5%, 1%, and .5%. In the 5% F-table, for 4
and 4 degrees of freedom, the tabulated value is 6.3883. This says that 5% of the F-values with 4 and 4 degrees of freedom are greater than 6.3883. Reference to the 1% table shows that, for 5 and 10 degrees of freedom, the tabulated value is 5.6363. This says that 1% of the F-values with 5 and 10 degrees of freedom are greater than 5.6363. The 1% F-table shows that 1% of the F-values with 4 and 4 degrees of freedom exceed 15.977. The sampling experiment (Table 9.2) shows that 1.2% of the 500 F-values exceed 16. These two percentages, 1 and 1.2, check very closely.
Each of the four percentage points of the F-table is on the right tail of the distribution. Though not tabulated, the percentage points on the left tail of the F-distribution, such as the 99.5% and 97.5% points, can be obtained from the existing table by a simple computation. The 97.5% point of F with ν₁ and ν₂ degrees of freedom is equal to the reciprocal of the 2.5% point of F with ν₂ and ν₁ degrees of freedom. For example, the 97.5% point of F with 6 and 4 degrees of freedom is 1/6.2272 or .16059, where 6.2272 is the 2.5% point of F with 4 and 6 degrees of freedom. A similar relation exists between the 99.5% and .5% points. The 99.5% point of F with 5 and 10 degrees of freedom is 1/13.618 or .073432, where 13.618 is the .5% point of F with 10 and 5 degrees of freedom.

9.4 Test of Hypothesis
In the three preceding sections, the deductive relation from two populations to their samples is considered. In this section the inductive relation, from two samples to their respective populations, is discussed. Theorem 9.1 may be used in testing the hypothesis that two population variances σ₁² and σ₂² are equal. From each of two populations, a random sample is drawn. The variances s₁² and s₂² are computed, and the numbers of degrees of freedom are determined. The variance-ratio F is s₁²/s₂² with ν₁ and ν₂ degrees of freedom. If the F-value is close to 1, the indication is that σ₁² = σ₂². If the F-value is too much smaller than 1, the indication is that σ₁² is less than σ₂². If the F-value is too much larger than 1, the indication is that σ₁² is greater than σ₂². Of course, how small is too small, or how large is too large, depends on the significance level. For example, a sample variance is equal to 700 with 4 degrees of freedom and another sample variance is 50 with 5 degrees of freedom. In testing the hypothesis that σ₁² = σ₂², the value of F is s₁²/s₂² = 700/50 = 14 with 4 and 5 degrees of freedom. If the 5% significance level is used, the right critical region is where F > 7.3879 (2.5% point of the F-distribution with 4 and 5 degrees of freedom); and the left critical region is where F < 1/9.3645 = 0.10679 (9.3645 is the 2.5% point of the F-distribution with 5 and 4 degrees of freedom). Since the F-value 14 is in the right critical region, the conclusion is that
σ₁² is greater than σ₂². But theoretically 2.5% of the F-values are greater than 7.3879 if the hypothesis is true, that is, σ₁² = σ₂². Yet when F falls inside the critical region, the hypothesis is rejected. This decision is made because of the possibility that the hypothesis is false and σ₁² is greater than σ₂². All the discussions on the interlocking relation between Type I error, Type II error, and sample size given in Sections 6.3, 6.4, 6.5, and 6.6 apply to the test of the hypothesis that σ₁² = σ₂².
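The percentage points of Section 9.3, including the left-tail points obtained by the reciprocal relation, can be computed directly; a sketch, assuming scipy:

    # F percentage points and the reciprocal relation (assumes scipy).
    from scipy.stats import f

    print(1 / f.ppf(0.975, 4, 6))   # .16059, the 97.5% point with 6 and 4 d.f.
    print(f.ppf(0.975, 4, 5))       # 7.3879, the right critical boundary above
    print(1 / f.ppf(0.975, 5, 4))   # .10679, the left critical boundary above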
9.5 Procedures
The procedures in the test of the hypothesis that σ₁² = σ₂² can be illustrated by the following example. The observations of the first sample are 2, 3, 7 and those of the second sample are 8, 6, 5, 1. The procedures are as follows:
1. Hypothesis: The hypothesis is that the two population variances are equal, that is, σ₁² = σ₂².
2. Alternative hypotheses: The alternative hypotheses are that (a) σ₁² < σ₂² and (b) σ₁² > σ₂².
3. Assumptions: The given samples are random samples drawn from normal populations.
4. Level of significance: The 5% significance level is chosen.
5. Critical regions: The critical regions for F with 2 and 3 degrees of freedom are where F < .025533 (.025533 = 1/39.165, and 39.165 is the 2.5% point of the F-distribution with 3 and 2 d.f.) and where F > 16.044.
6. Computation of F: The details of the computation of F are given in Table 9.5. The F-value is 7/8.667 = .8077 with 2 and 3 degrees of freedom.
TABLE 9.5

Sample No.            1          2
Observations          2, 3, 7    8, 6, 5, 1
Σy                    12         20
(Σy)²                 144        400
n                     3          4
(Σy)²/n               48         100
Σy²                   62         126
SS = Σy² - (Σy)²/n    14         26
s² = SS/(n - 1)       7          8.667
7. Conclusion: Since F is outside the critical regions, the conclusion is that the population variances are equal, that is, σ₁² = σ₂². (If F were less than .025533, the conclusion would be that σ₁² < σ₂². If F were greater than 16.044, the conclusion would be that σ₁² > σ₂².)
It should be noted that F is a pure number. If the observations are measured in inches, s² is a number of square inches. Then F = s₁²/s₂², which is a number of square inches divided by another number of square inches, is a pure number.
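The procedure of this section can be reproduced by machine; a sketch, assuming scipy, with the sample variances of Table 9.5:

    # Variance-ratio test of Section 9.5 (assumes scipy).
    from scipy.stats import f

    s1_sq, v1 = 7.0, 2               # from the sample 2, 3, 7
    s2_sq, v2 = 8.667, 3             # from the sample 8, 6, 5, 1
    F = s1_sq / s2_sq                # .8077 with 2 and 3 d.f.
    lo = 1 / f.ppf(0.975, v2, v1)    # .025533
    hi = f.ppf(0.975, v1, v2)        # 16.044
    print(lo < F < hi)               # True: the hypothesis is accepted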
9.6 Weighted Mean of Sample Variances
In the example of Section 9.5, s₁² = 7 with 2 degrees of freedom, and s₂² = 8.667 with 3 degrees of freedom. The conclusion drawn from the test of the hypothesis is that the two population variances are equal. Then both s₁² and s₂² are estimates of the variance σ² which is common to both populations. The problem is to combine s₁² and s₂² to form a single estimate of σ². The average of s₁² and s₂² could be used for this purpose, but the average ½(s₁² + s₂²) ignores the fact that s₂², with 3 degrees of freedom, is a more accurate estimate of σ² than s₁², with 2 degrees of freedom. Therefore, in order to give more weight to the more accurate estimate, the weighted average sₚ² of s₁² and s₂² is used. Using the numbers of degrees of freedom as weights,

    sₚ² = (ν₁s₁² + ν₂s₂²) / (ν₁ + ν₂)    (1)

where ν₁ is the number of degrees of freedom of s₁² and ν₂ is the number of degrees of freedom of s₂². The weighted average, sₚ², of s₁² and s₂² is also called the pooled estimate of σ². For example, where s₁² = 7 with 2 d.f. and s₂² = 8.667 with 3 d.f., the pooled estimate of σ² is

    sₚ² = [(2 × 7) + (3 × 8.667)] / (2 + 3) = (14 + 26)/5 = 8.

It should be noted that νs² = SS, because SS/ν = s². Therefore the pooled estimate of σ² is actually

    sₚ² = (SS₁ + SS₂) / (ν₁ + ν₂)    (2)

with (ν₁ + ν₂) d.f., where SS₁ is the SS of the first sample with ν₁ degrees of freedom, and SS₂ is the SS of the second sample with ν₂ degrees of freedom. From Table 9.5, it can be seen that SS₁ = 14 and SS₂ = 26. The numerator of sₚ² is (14 + 26) or 40 and the denominator is (2 + 3) or 5. The pooled variance sₚ² is equal to 40/5 or 8 with (2 + 3) or 5 degrees of freedom. This method of obtaining the pooled estimate of σ² stems from the following theorem, which is verified experimentally.
This method of obtaining the pooled estimate of σ² stems from the following theorem, which is verified experimentally.

Theorem 9.6 If the statistic SS₁/σ² follows the χ²-distribution with ν₁ degrees of freedom, the statistic SS₂/σ² follows the χ²-distribution with ν₂ degrees of freedom, and SS₁/σ² and SS₂/σ² are obtained from independent samples, the statistic

    (SS₁ + SS₂)/σ²

follows the χ²-distribution with (ν₁ + ν₂) degrees of freedom.

This theorem can be explained and also verified by the sampling experiment. In Section 7.7 it is shown that the values of SS/σ² of the 1000 samples follow the χ²-distribution with 4 degrees of freedom. Five hundred pairs of samples can be made from the 1000 samples. The first and second samples form a pair; the third and fourth form a pair, and so on. An example of the SS-values of 4 random samples, each consisting of 5 observations, is given in Table 4.2. The SS-values are 598.8, 237.2, 396.8, and 319.2 respectively. The value of (SS₁ + SS₂)/σ² for the first pair is (598.8 + 237.2)/100 or 8.360, and that for the second pair is (396.8 + 319.2)/100 or 7.160. Such a value is obtained from each of the 500 pairs of samples. It should be noted that the 1000 samples are independent samples (Section 4.2). This is an important condition for the validity of Theorem 9.6. The frequency distribution of the 500 values of (SS₁ + SS₂)/σ² is given in Table 9.6. The theoretical frequency given in the table is that of the χ²-distribution with 8 degrees of freedom. The histogram of the 500 values of (SS₁ + SS₂)/σ², with the superimposed χ²-curve with 8 degrees of freedom, is shown in Fig. 9.6. Both the table and the graph show that the observed and the theoretical frequencies fit closely.

TABLE 9.6

    (SS₁+SS₂)/100   Observed f   Observed r.f.(%)   Theoretical r.f.(%)   Mid-pt. m      mf
       0- 1              1            0.2                 0.2                .5            .5
       1- 2              6            1.2                 1.7               1.5           9.0
       2- 3             29            5.8                 4.7               2.5          72.5
       3- 4             32            6.4                 7.7               3.5         112.0
       4- 5             54           10.8                10.0               4.5         243.0
       5- 6             54           10.8                11.0               5.5         297.0
       6- 7             56           11.2                11.1               6.5         364.0
       7- 8             58           11.6                10.3               7.5         435.0
       8- 9             52           10.4                 9.1               8.5         442.0
       9-10             30            6.0                 7.7               9.5         285.0
      10-11             29            5.8                 6.3              10.5         304.5
      11-12             31            6.2                 5.1              11.5         356.5
      12-13             19            3.8                 3.9              12.5         237.5
      13-14             12            2.4                 3.0              13.5         162.0
      14-15             13            2.6                 2.3              14.5         188.5
      15-16              5            1.0                 1.7              15.5          77.5
      16-17              7            1.4                 1.2              16.5         115.5
      17-18              3             .6                  .9              17.5          52.5
      18-19              2             .4                  .6              18.5          37.0
      19-20              3             .6                  .5              19.5          58.5
      Over 20            4             .8                 1.0              23.1          92.4
      Total            500          100.0               100.0                          3942.4

    Mean = Σmf/Σf = 3942.4/500 = 7.9

(This sampling experiment was done cooperatively by about 75 students at Oregon State College in the Fall of 1952.)
Fig. 9.6 Histogram of the 500 values of (SS₁ + SS₂)/σ² with the superimposed χ²-curve with 8 degrees of freedom.
Since the mean of the χ²-distribution is its number of degrees of freedom, the mean of the 500 values of (SS₁ + SS₂)/σ² is expected to be close to 8. The approximate mean of these values can be found from Table 9.6, using the mid-point m of a class to represent all the values in that class. The class 0-1 is represented by .5, the class 1-2 by 1.5, and so forth. The class "over 20" has no upper limit. The value 23.1, which is the mean of the 4 values falling inside that class, is assigned to that class. The approximate mean of the 500 values of (SS₁ + SS₂)/σ² is
    Σmf/Σf = 3942.4/500 = 7.9,
which is close to 8 as expected. This completes the verification of Theorem 9.6 and thus justifies the assignment of (ν₁ + ν₂) degrees of freedom to the pooled variance sₚ², where ν₁ and ν₂ are the numbers of degrees of freedom of s₁² and s₂² respectively.

The pooled estimate of σ² is obtained by Equation (2) only if the two populations have the same variance σ² but different means. If the two populations have the same mean and the same variance, the observations of the two samples should be lumped together to form a single sample of n₁ + n₂ observations; and the s² of this single sample, with (n₁ + n₂ − 1) degrees of freedom, is the pooled estimate of σ².

The pooled estimate, sₚ², of the population variance σ² can be obtained from any number of sample variances. The use of this method is not limited to two samples. In general, if a random sample is drawn from each of k populations with the same variance σ², the pooled estimate of σ², based on the k samples, is

    sₚ² = (sum of the k SS-values)/(sum of the k numbers of d.f.).    (3)
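A reader without sampling equipment can imitate the class experiment with pseudo-random numbers. A minimal sketch, assuming Python; the seed and the counts are arbitrary choices, not from the text:

    import random

    random.seed(1)  # arbitrary seed, so that the run is repeatable

    def ss(obs):
        m = sum(obs) / len(obs)
        return sum((y - m) ** 2 for y in obs)

    values = []
    for _ in range(500):  # 500 pairs of independent samples of 5
        s1 = [random.gauss(50, 10) for _ in range(5)]  # variance 100
        s2 = [random.gauss(50, 10) for _ in range(5)]
        values.append((ss(s1) + ss(s2)) / 100)         # (SS1 + SS2)/sigma^2

    # the mean of a chi-square variable is its number of degrees of freedom,
    # here (5 - 1) + (5 - 1) = 8
    print(sum(values) / len(values))  # close to 8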
9.7 Relation Between F-Distribution and χ²-Distribution
It is shown in Section 7.7 that the statistic SS/σ² follows the χ²-distribution with ν (ν = n − 1) degrees of freedom, and in Section 9.2 that the statistic s₁²/s₂² follows the F-distribution with ν₁ and ν₂ degrees of freedom. Since s² and SS are related, that is, s² = SS/ν, the two distributions F and χ² must be related. In fact, they are related in more ways than one. The relation demonstrated in this section has the most practical importance. Since s² is an estimate of σ², s² approaches σ² as the number of degrees of freedom approaches infinity. In other words, if infinitely many observations are included in a sample, the sample becomes the population itself, and s² becomes σ². Therefore, the statistic F = s₁²/s₂² becomes s₁²/σ² if ν₂ approaches ∞ (infinity). But this statistic s²/σ² also follows the distribution of χ²/ν (Section 7.7). This relation between χ² and F is stated in the following theorem.

Theorem 9.7 If a statistic χ² follows the χ²-distribution with ν degrees of freedom, the statistic χ²/ν follows the F-distribution with ν and ∞ degrees of freedom.

The relation stated in Theorem 9.7 can be observed from Tables 5 and 7 of the Appendix. For example, the 5% point of χ²/ν with 10 degrees of freedom is 1.8307, which is also the 5% point of F with 10 and ∞ degrees of freedom. In fact, the whole column of 5% points of χ²/ν with various numbers of degrees of freedom is the bottom line of the 5% F-table. The column of 1% points of χ²/ν is the bottom line of the 1% F-table. This relation holds for all the percentage points. Because of this relation, the test of the hypothesis that the population variance is equal to a given value (Section 7.10) can be performed by three different methods. Either
the statistic χ² = SS/σ² with n − 1 degrees of freedom, or the statistic χ²/(n − 1) = s²/σ² with n − 1 degrees of freedom, or the statistic F = s²/σ² with n − 1 and ∞ degrees of freedom, may be used. Of course, the three seemingly different methods are really the same method, and the conclusions reached through these methods should always be the same.
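The relation stated in Theorem 9.7 can also be checked numerically. A minimal sketch, assuming Python with the scipy library; a very large second number of degrees of freedom stands in for infinity:

    from scipy.stats import chi2, f

    nu = 10
    print(chi2.ppf(0.95, nu) / nu)  # 1.8307, the 5% point of chi-square/nu
    print(f.ppf(0.95, nu, 10**9))   # the 5% point of F with 10 and (near-)infinite d.f.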
9.8 Remarks

The test of the hypothesis that two population variances are equal can be applied in comparing the uniformity of a product made by two different processes. For example, two manufacturing processes are used to make a machine part with a specified length of 1.5 inches. With the average length of the machine parts being equal to 1.5 inches, the process which can produce parts with more uniform length is the more desirable process. A part will not fit into a machine if it is too long or too short. In other words, small variance is a desirable feature of the machine parts. To compare the variances of the length of the machine parts made by the two processes, one would obtain a random sample of the product made by each of the two processes, and measure the length of each machine part. Then, following the procedures given in Section 9.5, he can test the hypothesis that σ₁² = σ₂². If one accepts the hypothesis, the conclusion is that the two processes can produce the machine parts with equal uniformity. If one rejects the hypothesis, the conclusion is that the process with the smaller variance is the more desirable one. Scientific research workers may use a similar test to compare the refinement of two experimental techniques. The technique which produces observations with the smaller variance is the more refined one.

Despite the examples cited above, the direct applications of this test to experimental data by research workers are limited. The main purpose of introducing the F-distribution at this stage is to prepare for a future topic called the analysis of variance (Chapter 12), and not to test the hypothesis that two population variances are equal. The F-table is awkward to use for a two-tailed test, because the percentage points of the left tail of the F-distribution are not tabulated. However, the F-table is not made for this purpose, but is largely for use in connection with the analysis of variance, where the F-test is a one-tailed test. The two-tailed F-test is not used in the remainder of this text.

EXERCISES

(1) Two random samples are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. After the samples are drawn, 10 is subtracted from each observation of the first sample and 10 is added to each observation of the second sample. Then the first sample is actually a random sample drawn
from a normal population with mean equal to 40 and variance still equal to 100. The second sample is a random sample drawn from a normal population with mean equal to 60 and variance equal to 100. The observations of the two samples are tabulated as follows:

    Sample 1: 41  46  27  53  45  37  57  46
    Sample 2: 58  72  81  67  67  72  59  63  59
Pretending that the sources of the two samples are unknown, test the hypothesis that the two population variances are equal, at the 5% level. Following the procedure given in Section 9.5, write a complete report. Since the population variances are actually known to be equal, state whether your conclusion is correct or whether you made a Type I error. Note that a Type II error cannot be made, because the hypothesis being tested is true. The purpose of changing the observations of the samples is to illustrate that the means of the populations have no bearing on this test. (F = 1.48 with 7 and 8 d.f.)

(2) If the hypothesis is accepted in Exercise (1), find the pooled estimate of the variance σ² which is common to both populations. Find the number of degrees of freedom of this pooled variance sₚ². (sₚ² = 71.6147 with 15 d.f.)

(3) Lump the observations of the two samples of Exercise (1) together to form a single sample of 17 observations. Find the sample variance s² of this sample. Compare this s² with sₚ² obtained in Exercise (2). Which quantity is greater? Why? Under what condition is s² the appropriate estimate of σ²? Why? Under what condition is sₚ² the more desirable estimate of σ²? Why?

(4) Multiply each observation of the second sample of Exercise (1) by 1.1, and recalculate s₂². Test the hypothesis that σ₁² = σ₂² at the 5% level. When each observation of the second sample is multiplied by 1.1, this sample becomes a sample drawn from the population with variance equal to (1.1)² × 100 or 121 (Theorem 2.4b). Thus σ₁² = 100 and σ₂² = 121. Since the population variances are actually known, state whether your conclusion is correct or whether you have made a Type II error. A Type I error cannot be made when the hypothesis being tested is false. The purpose of this exercise is to illustrate that a false hypothesis is likely to be accepted with small samples if the hypothesis is not too far from the truth.
(5) Multiply each observation of the second sample of Exercise (1) by 5, and recalculate s₂². Test the hypothesis that σ₁² = σ₂² at the 5% level. When each observation of the second sample is multiplied by 5, this sample becomes a sample drawn from the population with variance equal to 2500 (Theorem 2.4b). Thus σ₁² = 100 and σ₂² = 2500. Since the population variances are actually known, state whether your conclusion is correct or whether you have made a Type II error. A Type I error cannot be made when the hypothesis being tested is false. The purpose of this exercise is to illustrate that a false hypothesis can be rejected with relatively small samples if the hypothesis is far enough from the truth.

(6) For the data of Exercise 4, Chapter 7, test the same hypothesis by the F-test.

(7) For the data of Exercise 8, Chapter 7, test the same hypothesis by the F-test.

(8) Two methods are available for packing frozen strawberries. The packer wishes to know which one gives him packages of more uniform weights. The weights of the packages produced by the two methods are as follows:
    Method A: 16.1  16.2  15.9  16.1  16.1  15.9  16.0  16.2  16.0  16.2  16.1  16.0
    Method B: 15.8  16.4  16.3  16.2  16.4  15.7  16.0  16.1  16.5  16.0  16.1  16.3
Test the hypothesis that the two population variances are equal, at the 5% level.

(9) In a study of the method for counting bacteria in rat feces, a new method and the old method were compared. Films were made by both methods, and the resulting slides were fixed and stained with crystal violet. Twenty-five random fields were examined with a microscope, and the number of bacteria in each field was counted:

    Old: 12 11  4  7  0  5  6  0 15  7 13  9 15  0  7 11  0 13 15 49  0  4 44 32 10
    New: 16 28 20 23 21 21 26 27 26 29 31 21 24 41 23 20 28 27 28 17 42 27 29 28 21

Test the hypothesis that the variances are equal, at the 5% level. (Wallace, R. H.: "A Direct Method for Counting Bacteria in Feces," Journal of Bacteriology, Vol. 64, 1952, pp. 593-594.)
QUESTIONS

(1) Either the χ²-test or the F-test can be used in testing the hypothesis that the population variance is equal to a given value. Are the conclusions reached by these two tests always the same?
(2) How are the χ²-distribution and the F-distribution related?
(3) Four random samples, each consisting of 7 observations, are drawn from populations with equal variance. How many degrees of freedom does the pooled variance sₚ² have?
(4) Three random samples, consisting of 4, 5, and 6 observations respectively, are drawn from populations with equal variance. How many degrees of freedom does the pooled variance sₚ² have?
(5) What is the smallest value F can have?
(6) If the observations are measured in ounces, what is the unit of F?
(7) What is indicated by the statement that s² is equal to zero?
(8) What are the assumptions of the F-test?
(9) The notations ν and n are used in this chapter. What does each represent?
(10) How does one compute the quantity νs²? Must one find s² and ν first and then multiply them together?
CHAPTER 10
DIFFERENCE BETWEEN SAMPLE MEANS

Chapters 5, 6, and 8 collectively deal with the deductive and inductive relations between a population and the means of its samples. The deductive relation is shown in Chapter 5, which describes the characteristics of the sample means drawn from a population. The inductive relation is shown in Chapter 6, which describes the test of the hypothesis about the population mean through the use of a single sample and introduces the u-test. Chapter 8 is essentially a repetition of Chapters 5 and 6, except that it replaces the population variance in the statistic u with the sample variance and introduces the t-test. Now, in Chapter 10, all of what is described in Chapters 5, 6, and 8 is repeated, with the discussion centering not on the sample mean but on the difference between two sample means.
10.1 Distribution of Difference Between Sample Means

The problem to be considered involves two populations. To avoid possible confusion between the two populations as well as between their respective samples, the subscript 1 or 2 is attached to all notations such as μ, σ², ȳ, s², N, and n. For example, the mean of the first population is designated by μ₁, and that of the second population by μ₂. The problem is illustrated by the following example. The first population consists of three (N₁ = 3) observations 2, 4, and 6. The mean μ₁ is 4 and the variance (Section 2.3) is

    σ₁² = [(2−4)² + (4−4)² + (6−4)²]/3 = 8/3.    (1)
The second population consists of two (N₂ = 2) observations, 3 and 6. The mean μ₂ is 4.5 and the variance σ₂² is 2.25. The information concerning the two populations may be summarized as follows:

    Population 1: 2, 4, 6;   N₁ = 3;   μ₁ = 4;     σ₁² = 8/3
    Population 2: 3, 6;      N₂ = 2;   μ₂ = 4.5;   σ₂² = 2.25
From the first population, all possible samples of size 2 (n₁ = 2) are drawn (Section 5.1) and, for each sample, the mean ȳ₁ is computed. From the second population, all possible samples of size 3 (n₂ = 3) are drawn
and, for each sample, the mean ȳ₂ is computed. The two sets of samples and their means are given in Table 10.1a. The frequency tables of the two sets of sample means are shown in Table 10.1b. The mean of the 9 sample means drawn from the first population is 4, which is μ₁; and that of the 8 sample means drawn from the second population is 4.5, which is μ₂. The variance of each set of sample means can be obtained by Theorem 5.3, that is,

    σ_ȳ² = σ²/n.    (2)
The variance of the first set of sample means is

    σ_{ȳ₁}² = σ₁²/n₁ = (8/3)/2 = 4/3,    (3)
TABLE 10.1a

    From Population 1          From Population 2
    Samples     ȳ₁             Samples      ȳ₂
    2, 2         2             3, 3, 3       3
    2, 4         3             3, 3, 6       4
    2, 6         4             3, 6, 3       4
    4, 2         3             3, 6, 6       5
    4, 4         4             6, 3, 3       4
    4, 6         5             6, 3, 6       5
    6, 2         4             6, 6, 3       5
    6, 4         5             6, 6, 6       6
    6, 6         6
    Sum         36             Sum          36
    Mean         4             Mean          4.5
TABLE 10.1b

    ȳ₁      f                  ȳ₂      f
    2       1                  3       1
    3       2                  4       3
    4       3                  5       3
    5       2                  6       1
    6       1
    Total   9                  Total   8

    n₁ = 2                     n₂ = 3
    μ_{ȳ₁} = μ₁ = 4            μ_{ȳ₂} = μ₂ = 4.5
    σ_{ȳ₁}² = σ₁²/n₁ = (8/3)/2 = 4/3
    σ_{ȳ₂}² = σ₂²/n₂ = 2.25/3 = 3/4
and that of the second set of sample means is

    σ_{ȳ₂}² = σ₂²/n₂ = 2.25/3 = .75 = 3/4.    (4)
None of this is new. Both the sampling scheme and the characteristics of sample means are given in Sections 5.1 and 5.2. But now a new problem arises: how does one determine the characteristics of the differences (ȳ₁ − ȳ₂)? There are 9 sample means from the first population and 8 sample means from the second population. If each sample mean in the first set is compared with every sample mean in the second set, there will be 9 × 8 or 72 differences. The 72 differences, (ȳ₁ − ȳ₂), are given in Table 10.1c.

TABLE 10.1c

    ȳ₁, ȳ₂    ȳ₁ − ȳ₂    f          ȳ₁, ȳ₂    ȳ₁ − ȳ₂    f
    2, 3        −1       1          4, 5        −1       9
    2, 4        −2       3          4, 6        −2       3
    2, 5        −3       3          5, 3          2      2
    2, 6        −4       1          5, 4          1      6
    3, 3          0      2          5, 5          0      6
    3, 4        −1       6          5, 6        −1       2
    3, 5        −2       6          6, 3          3      1
    3, 6        −3       2          6, 4          2      3
    4, 3          1      3          6, 5          1      3
    4, 4          0      9          6, 6          0      1

The frequency table of the differences is given in Table 10.1d. In reading these two tables, one should consider the difference (ȳ₁ − ȳ₂) as one quantity. Then the mean and variance of the 72 differences can be calculated. The details of the computation are given in Table 10.1d. The mean of the 72 differences, (ȳ₁ − ȳ₂), is

    μ_{ȳ₁−ȳ₂} = Σ(ȳ₁ − ȳ₂)f / 72 = −36/72 = −.5,    (5)
TABLE 10.1d

    (ȳ₁−ȳ₂)     f    (ȳ₁−ȳ₂)f    (ȳ₁−ȳ₂+.5)    (ȳ₁−ȳ₂+.5)²    (ȳ₁−ȳ₂+.5)²f
      −4        1       −4         −3.5           12.25           12.25
      −3        5      −15         −2.5            6.25           31.25
      −2       12      −24         −1.5            2.25           27.00
      −1       18      −18         − .5             .25            4.50
       0       18        0           .5             .25            4.50
       1       12       12          1.5            2.25           27.00
       2        5       10          2.5            6.25           31.25
       3        1        3          3.5           12.25           12.25
    Total      72      −36                                       150.00

    μ_{ȳ₁−ȳ₂} = −36/72 = −.5 = 4 − 4.5 = μ₁ − μ₂
    σ²_{ȳ₁−ȳ₂} = 150/72 = 25/12 = 4/3 + 3/4 = σ₁²/n₁ + σ₂²/n₂
which is the difference between the population means, μ₁ − μ₂ or 4 − 4.5 or −.5. This relation between the mean of the differences (ȳ₁ − ȳ₂) and the two population means is stated in the following theorem.

Theorem 10.1a The mean of the differences between two sample means is equal to the difference between the means of the two populations from which the two sets of samples are drawn, that is,

    μ_{ȳ₁−ȳ₂} = μ₁ − μ₂.    (6)
The variance of the 72 differences (ȳ₁ − ȳ₂) is

    Σ[(ȳ₁ − ȳ₂) − (μ₁ − μ₂)]²f / 72 = Σ(ȳ₁ − ȳ₂ + .5)²f / 72 = 150/72 = 25/12,
but the sum of the variances of the two sets of sample means is also equal to 25/12, that is,

    σ₁²/n₁ + σ₂²/n₂ = (8/3)/2 + 2.25/3 = 4/3 + 3/4 = 25/12.
It should be emphasized here that the above relation between the variance of the differences (ȳ₁ − ȳ₂) and the respective variances of ȳ₁ and ȳ₂ is the direct consequence of the sampling scheme, which specifies that all possible samples are drawn from each of the two populations and that the set of differences (ȳ₁ − ȳ₂) consists of all possible differences. When this sampling scheme is used, the sample means ȳ₁ and ȳ₂ are said to be independently distributed. The independence refers to the fact that every sample mean of the first set has an equal chance of being compared with every sample mean of the second set.
Fig. 10.1 Histograms of Population 1, of Population 2, and of the distribution of the differences (ȳ₁ − ȳ₂).
The discussion of the variance of the differences between sample means can be summarized in the following theorem.
Theorem 10.1b The variance of the differences between two sets of independent sample means is equal to the sum of the variances of the two respective sets of sample means, that is,

    σ²_{ȳ₁−ȳ₂} = σ_{ȳ₁}² + σ_{ȳ₂}² = σ₁²/n₁ + σ₂²/n₂,    (7)

and consequently, the standard error of the difference between two independent sample means is

    σ_{ȳ₁−ȳ₂} = √(σ₁²/n₁ + σ₂²/n₂).    (8)
The histograms of the two populations and that of the distribution of the differences between sample means are given in Fig. 10.1. It can be observed that the differences (ȳ₁ − ȳ₂) show a tendency to follow the normal curve, despite the fact that neither population is normal and neither sample size is large. This fact is stated in the following theorem:
Theorem 10.1c The distribution of the differences (ȳ₁ − ȳ₂) between sample means approaches the normal distribution as the sample sizes n₁ and n₂ increase, if the two populations from which the samples are drawn have finite variances. If the two populations are normal at the outset, the differences (ȳ₁ − ȳ₂) follow the normal distribution exactly, regardless of the sample sizes.

Theorems 10.1a, b, c are already verified for the two populations which consist of the observations 2, 4, 6 and 3, 6 respectively. They are verified again in the next section by a sampling experiment.

TABLE 10.2

    ȳ₁ − ȳ₂              f      c.f.    r.c.f.(%)
    Below −10.5         24       24        4.8
    −10.5 to −7.5       33       57       11.4
    − 7.5 to −4.5       58      115       23.0
    − 4.5 to −1.5       85      200       40.0
    − 1.5 to  1.5       83      283       56.6
      1.5 to  4.5       95      378       75.6
      4.5 to  7.5       63      441       88.2
      7.5 to 10.5       33      474       94.8
    Above 10.5          26      500      100.0
    Total              500

(This sampling experiment was done cooperatively by about 75 students at Oregon State College in the Fall of 1952.)
10.2 Experimental Verification of Distribution of Difference Between Sample Means

Theorems 10.1a, b, and c can be verified by a sampling experiment. The details of the experiment are described in Chapter 4. Briefly, 1000 random samples, each consisting of 5 observations, are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. For each sample, the mean is calculated. An example of the means of four such samples is given in Table 4.2. Then the sample means are paired to obtain the differences. The first and second sample means form a pair, the third and fourth form a pair, and so forth.
Fig. 10.2 The relative cumulative frequencies (r.c.f.) of the 500 differences (ȳ₁ − ȳ₂) plotted on normal probability graph paper.
Then 500 pairs are made out of the 1000 sample means, and the difference between the sample means, ȳ₁ − ȳ₂, of each pair is calculated. Now there are 500 differences between sample means. For example, the means of the four samples given in Table 4.2 are 48.8, 45.6, 59.2, and 55.4. The two differences of the two pairs of sample means are 48.8 − 45.6 or 3.2 and 59.2 − 55.4 or 3.8 respectively. Both of these differences, 3.2 and 3.8, are positive, but if the first mean of a pair of sample means is less than the second one, the difference is negative. The 1000 samples are independent samples (Section 4.2). The first sample of each pair is to be visualized as a sample from the first population and the second sample from the second population. Since the two populations of this sampling experiment are really the same population, the mean of the differences, ȳ₁ − ȳ₂, should be equal to μ₁ − μ₂ = 50 − 50 = 0 (Theorem 10.1a), and the variance of the differences, ȳ₁ − ȳ₂, should be equal to (Theorem 10.1b)

    σ₁²/n₁ + σ₂²/n₂ = 100/5 + 100/5 = 40.

Since the populations are normal, the distribution of the differences (ȳ₁ − ȳ₂) will follow the normal distribution (Theorem 10.1c). The frequency table of the 500 differences (ȳ₁ − ȳ₂) is given in Table 10.2. The relative cumulative frequencies (r.c.f.) are plotted on the normal probability graph paper (Fig. 10.2). The fact that the points are almost on a straight line indicates that the differences (ȳ₁ − ȳ₂) follow the normal distribution (Section 3.3). The mean, which is the 50% point of this distribution, as read from the graph, is .25, which is as near zero as expected. The mean plus one standard error, which is the 84% point, as read from the graph, is 6.7. Then the standard error is 6.7 − .25 or 6.45, which is close to √40 or 6.3245. This completes the verification of the theorems given in Section 10.1.
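The same verification can be carried out with pseudo-random numbers in place of the tag population. A minimal sketch, assuming Python; the seed and the counts are arbitrary choices:

    import random

    random.seed(2)  # arbitrary seed

    def mean(obs):
        return sum(obs) / len(obs)

    diffs = []
    for _ in range(500):  # 500 pairs of samples of 5 from a normal population
        y1 = mean([random.gauss(50, 10) for _ in range(5)])
        y2 = mean([random.gauss(50, 10) for _ in range(5)])
        diffs.append(y1 - y2)

    m = mean(diffs)
    v = sum((d - m) ** 2 for d in diffs) / len(diffs)
    print(m)  # near mu1 - mu2 = 0        (Theorem 10.1a)
    print(v)  # near 100/5 + 100/5 = 40   (Theorem 10.1b)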
10.3 u-Distribution

Theorems 10.1a, b, c show that the differences (ȳ₁ − ȳ₂) follow the normal distribution with mean equal to (μ₁ − μ₂) and variance equal to (σ₁²/n₁ + σ₂²/n₂). Then the statistic

    u = [(ȳ₁ − ȳ₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)    (1)

follows the normal distribution with mean equal to 0 and variance equal
to 1. In Section 6.7 the statistic u is given as

    u = (ȳ − μ)/√(σ²/n),    (2)

while in Section 3.2 it is given as

    u = (y − μ)/σ.    (3)

The three versions of u given above are not the same quantities, but they all follow the same distribution, the normal distribution with mean equal to 0 and variance equal to 1. For this reason they are all designated by the same letter u. In Equations (1), (2), and (3), each of the three statistics (ȳ₁ − ȳ₂), ȳ, and y follows the normal distribution. The quantity u, in each case, is obtained by subtracting the mean of a statistic from that statistic and dividing the resulting difference by the standard deviation of that statistic. With so many similarities among the three versions of u, it is conceivable that they are related to each other. Equation (3) may be regarded as a special case of Equation (2), which in turn may be regarded as a special case of Equation (1). When n₂ approaches infinity, ȳ₂ approaches μ₂ and σ₂²/n₂ approaches zero. Then Equation (1) becomes Equation (2). When n = 1, ȳ becomes y itself; then Equation (2) becomes Equation (3).

The statistic u may be used in testing the hypothesis that the difference between the two population means, (μ₁ − μ₂), is equal to a given value, if the two population variances σ₁² and σ₂² are known. Since the variances are seldom, if ever, known, the population variances have to be replaced by sample variances. This replacement leads to the t-distribution (Section 8.1).
10.4 Student's t-Distribution

The statistic u given in Equation (1), Section 10.3, has very limited use, because the two population variances are usually unknown. When the population variances are replaced by the sample variances, u becomes t (Section 8.1). There are two versions of t, depending on whether the population variances are equal or unequal. These two cases are considered separately.

Case I: σ₁² = σ₂². If the two population variances are equal, the variance common to both populations may be denoted by σ². The subscript 1 or 2 is no longer needed. Then the statistic u in Equation (1), Section
10.3, becomes

    u = [(ȳ₁ − ȳ₂) − (μ₁ − μ₂)] / √(σ²(1/n₁ + 1/n₂)).    (1)

The variance σ² can be estimated by the pooled variance sₚ² (Section 9.6). If the two samples are of n₁ and n₂ observations respectively, s₁² has n₁ − 1 degrees of freedom and s₂² has n₂ − 1 degrees of freedom. Then sₚ² has (n₁ − 1) + (n₂ − 1) or (n₁ + n₂ − 2) degrees of freedom. When sₚ² is used to replace σ² in Equation (1), the resulting statistic follows Student's t-distribution with n₁ + n₂ − 2 degrees of freedom. Note that the number of degrees of freedom of t is always that of the s² involved in the t (Section 8.1). The above discussion may be summarized in the following theorem:

Theorem 10.4a If two given populations are normal and have the same variance, the statistic

    t = [(ȳ₁ − ȳ₂) − (μ₁ − μ₂)] / √(sₚ²(1/n₁ + 1/n₂))    (2)

follows the Student's t-distribution with n₁ + n₂ − 2 degrees of freedom, where sₚ² is the pooled estimate of the common variance of the two populations and ȳ₁ and ȳ₂ are the means of two independent samples. (This theorem is verified by a sampling experiment in Section 10.5.)

Case II: σ₁² ≠ σ₂². If the two population variances are not equal, no method available at the present time (1956) can be used to test the hypothesis that the difference between two population means is equal to a given value. Theorem 10.4a is usually used without knowing whether the two population variances are equal. However, this seemingly indiscriminate use of Case I is not to be condemned. Box (1954) found that no serious consequences will result from this practice, if the two population variances are only moderately different and the two sample sizes are equal. If the two population variances are very different, as indicated by the fact that F = s₁²/s₂² exceeds the 1% point of F, and, at the same time, the two sample sizes are also different, one may use an approximate method which is stated in the following theorem:
Theorem 10.4b If the populations are normal and have different variances, the statistic

    t = [(ȳ₁ − ȳ₂) − (μ₁ − μ₂)] / √(s₁²/n₁ + s₂²/n₂)    (3)

follows approximately the Student's t-distribution with ν degrees of freedom, where

    ν = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1)].
The statistic t in Theorem 10.4b does not actually fit Student's t-distribution with any number of degrees of freedom. The reason for computing ν, the number of degrees of freedom, in Theorem 10.4b by a complicated process is to select the true t-distribution, with a particular number of degrees of freedom, which will best fit the distribution of this t. The computed number of degrees of freedom may not even be a whole number. It must be rounded off to the nearest integer when used in determining the critical regions. This approximate method should be used only as a last resort. In conducting an experiment, efforts should be made to equalize the sample sizes, so that Theorem 10.4a may be used whether the two population variances are equal or not. Besides the simplification of the computation of t and its number of degrees of freedom, an additional advantage of equalizing the sample size is shown in Section 10.7.
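The computation of t and ν in Theorem 10.4b is tedious by hand but entirely mechanical. A minimal sketch, assuming Python; the function name welch_t is illustrative, not from the text:

    def welch_t(y1, y2):
        # t of Theorem 10.4b (for the hypothesis mu1 - mu2 = 0)
        # and its approximate number of degrees of freedom nu
        n1, n2 = len(y1), len(y2)
        m1, m2 = sum(y1) / n1, sum(y2) / n2
        s1_sq = sum((y - m1) ** 2 for y in y1) / (n1 - 1)
        s2_sq = sum((y - m2) ** 2 for y in y2) / (n2 - 1)
        a, b = s1_sq / n1, s2_sq / n2
        t = (m1 - m2) / (a + b) ** 0.5
        nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
        return t, nu  # nu is rounded to the nearest integer before the table is read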
10.5 Experimental Verification of t-Distribution

Theorem 10.4a can be verified by a sampling experiment. The details of the experiment are given in Chapter 4. Briefly, 1000 random samples, each consisting of 5 observations, are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. An example of four such samples is given in Table 4.2. For each sample, ȳ and SS are computed. Then 500 pairs of samples are made from the 1000 samples. The first and second samples constitute a pair, the third and fourth constitute a pair, and so forth. The pooled variance, sₚ², of two samples is the sum of the two SS-values divided by [(n₁ − 1) + (n₂ − 1)], that is,

    sₚ² = (SS₁ + SS₂) / [(n₁ − 1) + (n₂ − 1)]

(Section 9.6). For the first pair of samples of Table 4.2,

    sₚ² = (598.8 + 237.2)/8 = 836.0/8 = 104.5.
The t-value of the same pair of samples is

    t = (48.8 − 45.6) / √(104.5(1/5 + 1/5)) = 0.495

with 8 degrees of freedom. The reason that (μ₁ − μ₂) disappeared from the above equation is that μ₁ − μ₂ = 50 − 50 = 0. The t-value was calculated for each of the 500 pairs of samples. The frequency table of these 500 t-values is given in Table 10.5. The theoretical frequency of Student's t-distribution with 8 degrees of freedom is also given in the table. The histogram of the 500 t-values with superimposed t-curve is shown in Fig. 10.5. Both the frequency table and the histogram show that the observed and theoretical frequencies fit closely.

TABLE 10.5

         t          Observed f   Observed r.f.(%)   Theoretical r.f.(%)   Mid-pt. m     mf
    Below −4.5           0             0.0                  .1               −5            0
    −4.5 to −3.5         1             0.2                  .3               −4           −4
    −3.5 to −2.5         3             0.6                 1.4               −3           −9
    −2.5 to −1.5        37             7.4                 6.8               −2          −74
    −1.5 to − .5       112            22.4                23.0               −1         −112
    − .5 to   .5       178            35.6                36.8                0            0
      .5 to  1.5       124            24.8                23.0                1          124
     1.5 to  2.5        37             7.4                 6.8                2           74
     2.5 to  3.5         5             1.0                 1.4                3           15
     3.5 to  4.5         2              .4                  .3                4            8
    Over 4.5             1              .2                  .1                5            5
    Total              500           100.0               100.0                            27

    Mean of t = Σmf/Σf = 27/500 = .054

(This sampling experiment was done cooperatively by about 75 students at Oregon State College in the Fall of 1952.)
The approximate value of the mean of the 500 t-values can be obtained by considering that all the t-values in a class are equal to the mid-point m of that class. Then the mean of t is

    Σmf/Σf = 27/500 = .054,

which is as near 0 as expected. This completes the verification of Theorem 10.4a.
Fig. 10.5 Histogram of the 500 t-values with the superimposed t-curve with 8 degrees of freedom.
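The experiment behind Table 10.5 can likewise be imitated with pseudo-random numbers. A minimal sketch, assuming Python; the seed is an arbitrary choice:

    import random

    random.seed(4)  # arbitrary seed

    def mean_and_ss(obs):
        m = sum(obs) / len(obs)
        return m, sum((y - m) ** 2 for y in obs)

    t_values = []
    for _ in range(500):  # 500 pairs of samples of 5
        m1, ss1 = mean_and_ss([random.gauss(50, 10) for _ in range(5)])
        m2, ss2 = mean_and_ss([random.gauss(50, 10) for _ in range(5)])
        sp_sq = (ss1 + ss2) / 8  # pooled variance with 8 d.f.
        t_values.append((m1 - m2) / (sp_sq * (1/5 + 1/5)) ** 0.5)

    print(sum(t_values) / 500)  # near 0, as in Table 10.5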
10.6 Test of Hypothesis-Procedure

The first five sections of this chapter deal with the deductive relation between the two populations and their respective samples. The t-distribution is obtained from all possible samples of given sizes drawn from the two populations. Now the drawing of inductive inferences about the two population means from two random samples is considered. Theorems 10.4a and 10.4b may be used in testing the hypothesis that the difference between two population means is equal to a given value. The most commonly tested hypothesis is that the difference between two population means is equal to zero, that is, μ₁ − μ₂ = 0 or μ₁ = μ₂. The use of Student's t-distribution in testing a hypothesis is discussed in detail in Chapter 8. In testing the hypothesis that two population means are equal, the statistic t (Theorem 10.4a) becomes

    t = (ȳ₁ − ȳ₂) / √(sₚ²(1/n₁ + 1/n₂))    (1)

with (n₁ + n₂ − 2) d.f. If t is approximately equal to zero, the indication is that μ₁ = μ₂. If t is too much larger than zero, the indication is that μ₁ is greater than μ₂. If t is too much smaller than zero, the indication is that μ₁ is less than μ₂. How large is too large or how small is too small is determined by the significance level. The procedure of testing the hypothesis that two population means are equal is illustrated by an example: A sample consists of three observations 2, 7, 3 and another sample consists of two observations 9, 7. The
problem is to determine whether the two population means are equal. The procedure is as follows:

1. Hypothesis: The two population means are equal, that is, μ₁ = μ₂.
2. Alternative hypotheses: The alternative hypotheses are that (a) μ₁ < μ₂ and (b) μ₁ > μ₂.
3. Assumptions: The two samples are random samples drawn from their respective normal populations, and the two populations have the same variance.
4. Level of significance: The 5% significance level is chosen.
5. Critical regions: The critical regions for t with 3 (that is, n₁ + n₂ − 2) degrees of freedom are where t < −3.1825 and t > 3.1825.
6. Computation of t: The details of the computation of t are given in Table 10.6. The t-value is −1.90 with 3 degrees of freedom.

TABLE 10.6

                       Sample No.
                        1        2     Combination    Explanation
    Observations        2        9
                        7        7
                        3
    Σy                 12       16
    n                   3        2
    (Σy)²             144      256
    (Σy)²/n            48      128
    Σy²                62      130
    SS                 14        2         16          pooled SS
    d.f.                2        1          3          pooled d.f.
                                        5.333          sₚ² = 16/3
    1/n             .3333    .5000      .8333          1/n₁ + 1/n₂
                                        4.444          sₚ²(1/n₁ + 1/n₂)
                                        2.108          √(sₚ²(1/n₁ + 1/n₂))
    ȳ                   4        8
                                           −4          ȳ₁ − ȳ₂
                                        −1.90          t

7. Conclusion: Since t is outside the critical regions, the conclusion is that the two population means are equal. (If t were inside the
left critical region, the conclusion would be that μ₁ < μ₂. If t were inside the right critical region, the conclusion would be that μ₁ > μ₂. Note that t is a pure number and is not influenced by a change of the unit of measurement.)

The example given above refers to the case in which σ₁² = σ₂². For the case σ₁² ≠ σ₂², only the procedure in the computation of the t-value and the number of degrees of freedom needs modification; the rest of the procedure of the test of hypothesis is the same.
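The arithmetic of Table 10.6 can be checked with a few lines of code. A minimal sketch, assuming Python; the variable names are illustrative:

    sample1 = [2, 7, 3]
    sample2 = [9, 7]

    n1, n2 = len(sample1), len(sample2)
    ss1 = sum(y * y for y in sample1) - sum(sample1) ** 2 / n1  # 14
    ss2 = sum(y * y for y in sample2) - sum(sample2) ** 2 / n2  # 2
    sp_sq = (ss1 + ss2) / (n1 + n2 - 2)                         # pooled variance 16/3

    ybar1, ybar2 = sum(sample1) / n1, sum(sample2) / n2         # 4 and 8
    t = (ybar1 - ybar2) / (sp_sq * (1 / n1 + 1 / n2)) ** 0.5
    print(t, n1 + n2 - 2)  # -1.90 with 3 d.f.; compare with -3.1825 and 3.1825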
10.7 Advantages of Equal Sample Size

In conducting an experiment, efforts should be made to equalize the sample sizes. One of the advantages, as stated in Section 10.4, is in the minimization of the effect of the inequality of the two population variances on the t-test. If the sample sizes are equal, Theorem 10.4a may be used in testing the hypothesis that two population means are equal, even though the two population variances may not be equal.

Another reason for equalizing the sample size is of even greater importance than the one mentioned above. The variance of the differences between two independent sets of sample means is (Theorem 10.1b)

    σ₁²/n₁ + σ₂²/n₂.

If σ₁² = σ₂² = σ², the variance is

    σ²(1/n₁ + 1/n₂).

For a given total number of observations for the two samples, the quantity

    1/n₁ + 1/n₂

reaches a minimum when n₁ = n₂. For example, suppose the total number of observations for the two samples is 100, that is, n₁ + n₂ = 100. The two sample sizes n₁ and n₂ may be 1 and 99, 2 and 98, ..., 50 and 50. For the case n₁ = 1, n₂ = 99, the variance is

    σ²(1/1 + 1/99) = 1.0101σ².

For the case n₁ = n₂ = 50, the variance is

    σ²(1/50 + 1/50) = 0.04σ².

In the case n₁ = 1 and n₂ = 99, the variance is more than 25 times as large as in the case n₁ = n₂ = 50, even though the total number of observations in both cases is 100. A larger variance indicates that the differences (ȳ₁ − ȳ₂) fluctuate more from one pair of samples to another. The consequence is that the probability of committing a Type II error is
greater in the case of unequal sample sizes than in the case of equal sample sizes, even though the total number of observations remains the same and the significance level also remains the same. Therefore, efforts should be made to equalize the sample sizes. If the sample sizes cannot be made exactly equal, they should be made as close to each other as possible. If n₁ is fixed, the mere increase in n₂ does not substantially decrease the probability of committing a Type II error. For example, if n₁ = 25, the variance of the differences (ȳ₁ − ȳ₂) is

    σ²(1/25 + 1/n₂),

which is greater than .04σ², regardless of the size of n₂. In other words, if n₁ is fixed at 25, the variance of the differences (ȳ₁ − ȳ₂) is always greater than in the case n₁ = n₂ = 50, regardless of the size of the second sample. The variance of the differences (ȳ₁ − ȳ₂) cannot be further decreased unless the size of the first sample is increased.
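The effect of the allocation of a fixed total number of observations is easy to tabulate by machine. A minimal sketch, assuming Python:

    # multiplier of sigma^2 in the variance of (ybar1 - ybar2) when n1 + n2 = 100
    for n1 in (1, 2, 10, 25, 50):
        n2 = 100 - n1
        print(n1, n2, 1 / n1 + 1 / n2)  # smallest, 0.04, at n1 = n2 = 50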
10.8 Applications

The test of the hypothesis that two population means are equal is commonly used in various fields of science. In comparing the relative merits of two kinds of feed for steers, a number of steers, say 50, may be divided, at random, into two groups of 25 animals each. Each group is fed a different ration. Each animal is weighed at the beginning and also at the end of the feeding period. The weight gained by an animal during the feeding period is the observation. Then each of the two groups has 25 observations (i.e., n₁ = n₂ = 25). The two sets of observations must be regarded as samples, because the observations would be different if the feeding experiment were repeated with different animals. The problem is to determine whether there is a true difference between the fattening ability of the two rations. In other words, the problem is to determine whether the two population means are equal with only two samples given. The hypothesis to be tested is that the two population means are equal. The procedure is given in Section 10.6. If the hypothesis is accepted, the conclusion is that both rations are equally good. If the hypothesis is rejected, the conclusion is that one ration is better than the other.

The t-test (Theorem 10.4a) described in this chapter and the t-test of paired observations (Section 8.7) are applied to the same type of problem. The difference between the two tests is in the method of randomizing the experimental material, such as the 50 animals mentioned above. In the method of paired observations, the 50 animals are deliberately matched and thus result in 25 pairs of similar animals. The two animals
of a pair are assigned to two different rations at random. In other words, two animals are randomized at a time. On the other hand, the t-test described in this chapter is used in the case in which the animals are completely randomized without matching. The method of randomization determines which t-test is to be used. The choice is not at all arbitrary. The relative merits of the two methods of randomization are discussed in Chapter 14.
10.9 Randomization

Section 10.8 emphasizes the fact that the method of randomizing the experimental material determines which t-test is the appropriate one to apply. This section explains how the randomization can be carried out. The same example of 50 animals is used here as an illustration. To randomize the 50 animals completely, the animals are first arbitrarily numbered 1, 2, 3, ..., 50. These numbers serve as identifications of the animals. Then a random number table (Table 2, Appendix) is read in any direction, vertically, horizontally, or diagonally, and any two-digit numbers between 1 and 50 are selected. The numbers selected may be 03, 16, 12, 33, 18, etc. Any number already picked or exceeding 50 is discarded. This selection of numbers continues until 25 distinct numbers are obtained. The animals which bear these numbers are fed one ration, and the remaining 25 animals are fed another ration.

The process of selecting the random numbers described above needs modification at times. From 00 to 99, there are 100 two-digit numbers. But in the described method any number exceeding 50 is discarded. Therefore, only 50 of the 100 numbers are used; the other 50 numbers are wasted. However, the wasted numbers can be utilized by subtracting 50 from any number over 50. For example, 51 is considered 1, 96 is considered 46, and 00 is considered 50. In this way each of the random numbers will be used. If the total number of animals is 38, the numbers from 1 to 38 can be used, and the numbers 39 to 76 can also be used by subtracting 38 from each number. The remaining 24 numbers must be discarded. However, to simplify the subtraction, the numbers 51 to 88 may be used instead of 39 to 76. Since any number which has already occurred must be rejected, it is important to keep an account of the numbers that have already occurred. This can be accomplished by writing down all the numbers, say from 1 to 50, in consecutive order, and striking out each one as it is selected. Then at a glance one can determine whether a newly selected number should be retained or rejected. The numbers in a random number table appear to be two-digit numbers. They are printed this way for easy reading. Yet they may be regarded as one-digit numbers or three-digit numbers, or numbers of any number of digits.
The random number table given in Table 2, Appendix, is an abbreviated table. The sources of some of the more extensive random number tables are given in the references at the end of this chapter. The explanation of the use of the random number table should serve to emphasize the fact that a selection at random has a special meaning in statistics. Positive action is needed to select, at random, 25 out of 50 animals. Arbitrary or haphazard grouping of 50 animals does not constitute a selection at random.
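On a computer, the random number table can be replaced by a pseudo-random shuffle. A minimal sketch of the complete randomization of the 50 animals, assuming Python; the seed is an arbitrary choice:

    import random

    random.seed(3)                # arbitrary seed; omit it for a fresh randomization
    animals = list(range(1, 51))  # identification numbers 1 to 50
    random.shuffle(animals)
    ration_a = sorted(animals[:25])  # 25 animals fed one ration
    ration_b = sorted(animals[25:])  # the remaining 25 fed the other ration
    print(ration_a)
    print(ration_b)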
EXERCISES

(1) Two random samples were drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. The observations of the two samples are tabulated as follows:

    Sample 1: 69  65  38  35  49  37
    Sample 2: 48  47  39  59  60  37  50
Pretend that the source of the samples is unknown and test the hypothesis that the two population means are equal, at the 5% level. Following the procedure given in Section 10.6, write a complete report. Since it is actually known that the two samples were drawn from the same population, state whether your conclusion is correct or a Type I error is committed. A Type II error cannot be committed in this case, because the hypothesis being tested is true. (t = .0392 with 11 d.f.)

(2) Subtract 1 from each of the six observations of Sample 1 and add 1 to each of the seven observations of Sample 2 of Exercise (1). Then Sample 1 becomes a sample drawn from a population with mean equal to 49, and Sample 2 becomes a sample drawn from a population with mean equal to 51. Pretend that the source of the samples is unknown and test the hypothesis that the two population means are equal, at the 5% level. Since it is actually known that μ₁ = 49 and μ₂ = 51, state whether your conclusion is correct or a Type II error is made. Note that a Type I error cannot be made, because the hypothesis that μ₁ = μ₂ is false. The purpose of this exercise is to demonstrate that a Type II error is likely to be made if the sample sizes are small and the hypothesis is not too far wrong. (t = −0.2601 with 11 d.f.)

(3) Subtract 20 from each of the six observations of Sample 1 and add 20 to each of the seven observations of Sample 2 of Exercise (1).
Then Sample 1 becomes a sample drawn from a population with mean equal to 30, and Sample 2 becomes a sample drawn from a population with mean equal to 70. Pretend that the source of the samples is unknown and test the hypothesis that μ₁ = μ₂, at the 5% level. Since it is actually known that μ₁ = 30 and μ₂ = 70 and, therefore, μ₁ < μ₂, state whether your conclusion is correct or a Type II error is made. Note that a Type I error cannot be made, because the hypothesis that μ₁ = μ₂ is false. The purpose of this exercise is to demonstrate that a false hypothesis can be rejected with relatively small samples if the hypothesis is far enough from the truth. (t = −5.9474 with 11 d.f.)

(4) Hops are perennial plants which are usually asexually propagated. The problem is to determine whether the cuttings from high-yielding plants will produce high yield when they reach maturity. Cuttings from each of 10 high- and 10 low-yielding plants (based on their 1948 yields) were planted in 1949. The locations of the 20 plots were completely randomized in a field. The cuttings were given four years to mature. The 1953 yields, in pounds, of the 20 plots are given below:

    Low:  37.55  49.65  52.00  35.50  35.85  44.25  46.95  40.50  31.50  43.00
    High: 49.50  46.55  49.85  41.90  45.65  33.05  43.30  42.25  39.90  49.60

(This is a small part of an extensive experiment conducted by Dr. Kenneth R. Keller, with whose permission the above data are published.)
With the given experimental evidence and the aid of the t-test, state whether the cuttings from the high-yielding plants produce higher yield than the cuttings from the low-yielding plants. (t = −0.9229 with 18 d.f.)

(5) The following data were obtained in a study of the effect of induced maltase formation on the free glutamic acid content of yeast cells.

    Inductor Absent:  10.8  10.5  10.6  10.1  10.8  10.7
    Inductor Present: 11.7  11.6  11.4  11.6  12.0  11.8

(α-methyl-glucoside was used to induce maltase formation. Six separate flasks were used for each induction condition. Each flask contained equal weights of yeast cells suspended in phosphate-succinate buffer at pH 4.5. Both the control (noninduced) cells and the experimental (induced) cells received equal amounts of glucose.
The experimental cells received in addition equal quantities of α-methyl-glucoside. Following incubation, the free glutamic acid content of the cells was determined.)

(6) The following observations were obtained with two doses of an antihelminthic, one of the doses being 0.063 cc:

    421  462  207  17  400  412  378  413  74  116

Test the hypothesis that the difference in effectiveness between the two doses is equal to zero at the 5% level. Because of the great difference in the two variances, Theorem 10.4b should be used. (Whitlock, J. H. and Bliss, C. I.: "A Bioassay Technique for Antihelminthics," The Journal of Parasitology, Vol. 29, 1943, pp. 48-58.)

(7) The following data are measurements of the breaking strength of two types of coated cloth. Do these two types of cloth differ in average breaking strength? Use the 5% significance level.
    Type 1: 58  49  42  43  50
    Type 2: 73  71  70  60  72  66
(8) The following measurements (SO₂ ppm) of atmospheric pollution were found in two locations.

    Country (Coast):        .8   2.1   1.2    .5    .1
    City (Heavy Industry):  25.0  15.0  5.0  14.0  5.0  22.0  2.0  17.0

What can you conclude about the relative amount of pollution in the two areas? Use Theorem 10.4b, because of the unequal variances. Use the 5% significance level.
(9) Two groups of white rats, each containing 12 animals, were tested daily in a T-maze. All subjects were 23 hours hungry at the time of each session, and only one of the 2 goal boxes of the T-maze contained food. All subjects were trained and tested until each had chosen the correct turn in the maze for 3 days in succession. Treatments of the 2 groups were the same except that the correct goal box for Group B had walls of a different color from the rest of the maze and a wire mesh over its floor. The number of days required by each subject to reach the criterion of learning is given in the following table.

    A: 8  8  9  9  10  10  10  10  10  10  11  11
    B: 5  5  7  7   7   7   7   7   7   7   8   8
Did the additional cues of the goal box for Group B make a difference in the average speed of learning? Use the 5% significance level.

(10) The speed of cars entering a curve is studied with each of two different warning signs. The license numbers of the cars are recorded, and the data on those belonging to drivers living in the vicinity are rejected, in an effort to limit the study to those less familiar with the road who would be dependent upon the warning signs. The observations (miles per hour) of the two groups of unfamiliar drivers are as follows:
    Group A                            Group B
    22.2  26.3  24.6  22.2  23.8       30.8  21.7  27.9  21.2  35.6
    38.5  25.2  22.1  22.1  18.1       22.0  23.5  44.5  19.4  36.0
    31.3  18.9  19.2  31.7  21.0       62.1  18.4  33.3  17.4  21.2
    20.7  27.5  23.0  38.8  22.9       18.7  22.2  20.6  19.8  33.2
    26.3  22.2  22.0  30.4  22.1       37.1  43.9  18.9  19.6  20.3
    55.9  42.8  26.0  21.8  48.0       18.3  21.6  22.1  24.6  35.7
    34.4  22.0  30.6  18.8  23.3       20.1  31.0  32.7  54.5  34.2
    22.4  25.5  27.4  20.2  25.2       25.6  48.0  30.4  23.7  38.3
    23.5  44.8  33.1  46.9  44.7       38.9  29.9  38.0  21.4  23.7
    20.5  26.0  29.1  23.5  36.8       24.2  22.6  21.3  22.7  24.6
Does the difference in warning signs affect the average driving speed? Use the 5% significance level.
(11) A relatively unselected group of young men and women were tested on their speed of assembly of a small standardized task of the erector-set variety. The numbers of seconds required are given as follows:
    Women: 36  29  26  27  27  28  29  35  27  25
           25  31  29  27  30  28  23  27  26  23
           23  30  31  30  21  28  24  36  31  31

    Men:   41  24  36  27  32  28  24  24  21  16
           23  28  30  31  30  23  29  23  22  22
           32  30  30  23  29  28  30  27  25  30
           33  38  36  35  28  31  32  37  26  24
           29  30  14  38  36  41  24  35  34  31
           27  35  29  39  38  35  24  32  27  30
           36  21  35  37  31  26  31  31  36  32
Is there a difference between men and women on this task? Use the 5% significance level.
QUESTIONS

(1) What are the assumptions underlying the t-test?
(2) Whether the two sample sizes are equal or not, the t-test can be used to test the hypothesis that two population means are equal. Why are the two sample sizes usually made equal?
(3) What is the difference between the t-test described in this chapter and the one described in Section 8.7?
(4) How do you divide 24 children into two groups at random?
(5) In Exercise (4), it is stated that the locations of 20 plots were completely randomized in a field. How do you do it?
(6) If the hypothesis is true and the 5% significance level is used in the t-test, 5% of all possible pairs of samples lead to an erroneous conclusion. Then why is the 1% or even 0% significance level not always used in testing a hypothesis?
(7) In Exercise (4), the observations are numbers of pounds of hops. What is the unit of t?
REFERENCES

Box, G. E. P.: "Some Theorems on Quadratic Forms Applied in the Study of Analysis of Variance Problems, I. Effect of Inequality of Variance in the One-Way Classification," Annals of Mathematical Statistics, Vol. 25 (1954), pp. 290-302.
Fisher, R. A. and Yates, F.: Statistical Tables for Biological, Agricultural, and Medical Research, Table XXXIII, Oliver & Boyd, London, 1938.
Kendall, M. G. and Smith, B. Babington: Tables of Random Sampling Numbers, Tracts for Computers, No. XXIV, Cambridge University Press, Cambridge, 1946.
Pearson, Karl (Editor): Tables of the Incomplete Beta-Function, Biometrika Office, University College, London, 1934.
Welch, B. L.: "The Generalization of 'Student's' Problem When Several Different Population Variances are Involved," Biometrika, Vol. 34 (1947), pp. 28-35.
CHAPTER 11
CONFIDENCE INTERVAL

In the preceding chapters, only one kind of statistical problem is considered, the test of hypothesis. This chapter deals with a different kind of statistical problem, the estimation of a parameter by an interval.

11.1 Inequality
Inequality, a topic treated in almost all elementary algebra textbooks, is used quite frequently in this chapter. Some of the principles concerning inequalities are summarized here for convenience of reference. (1) The direction of the sign of an inequality remains the same if a quantity is added to or subtracted from both sides of the inequality. For example, 2 is less than 3, that is, 2 < 3. If 10 is added to both sides of the inequality, the resulting inequality is 12 < 13, which is still true. If 10 is subtracted from both sides of the inequality 2 < 3, the new inequality is −8 < −7, which is still true. (2) The direction of the sign of an inequality remains the same if both sides of the inequality are multiplied or divided by the same positive number. For example, when both sides of the inequality 2 < 4 are multiplied by 5, the resulting inequality is 10 < 20, which is still true. When both sides of the inequality 2 < 4 are divided by 2, the resulting inequality is 1 < 2, which is still true. (3) The direction of the sign of an inequality is reversed if both sides of the inequality are multiplied or divided by a negative number. For example, when both sides of the inequality 2 < 4 are multiplied by −1, the resulting inequality is −2 > −4 and not −2 < −4. When both sides of the inequality 2 < 4 are divided by −2, the resulting inequality is −1 > −2 and not −1 < −2.

11.2 Estimation by Interval

The problem of the estimation of a parameter is already considered in the preceding chapters. For example, the sample mean ȳ is used to estimate the population mean μ. But a sample mean is very seldom equal to the population mean. The four samples, each consisting of 5 observations, shown in Table 4.2, are random samples drawn from the tag population whose mean is equal to 50, but none of the four sample means is equal to 50. It is obvious that a sample mean is not in general an adequate estimate of the population mean. Therefore, this chapter introduces a new approach to the problem of estimation, the use of an interval to estimate the population mean.
It is a common practice in everyday life to use an interval for the purpose of estimation. For example, in estimating a person's age, one might say that Jones is in his fifties, that is, Jones' age is somewhere between 50 and 60. One would never estimate Jones' age as 52 years 5 months and 14 days. The former estimate by the interval 50 to 60 appears to be rough, but it is likely to be correct, while the latter estimate is precise, but it is likely to be wrong. The longer the interval, the greater the chance that the estimate will be correct. For example, one can estimate Jones' age as somewhere between 0 and 200. This interval is bound to include Jones' correct age. One has 100% confidence in the correctness of this estimate. But the interval is so long that it becomes useless, because anybody's age is between 0 and 200. But if the interval is kept short, for example in the estimate that Jones' age is between 52 and 53, one's confidence in the correctness of this estimate is not nearly so great as is his confidence in the correctness of the estimate that Jones is from 0 to 200 years old. Therefore, the length of an interval and the degree of confidence in the correctness of an estimate are of primary interest in the estimation of a population mean by an interval.

11.3 Confidence Interval and Confidence Coefficient

In this section, the estimation of the population mean is used as an illustration of the terms confidence interval and confidence coefficient. It is known that the statistic
    u = (ȳ − μ)/√(σ²/n)

follows the normal distribution with mean equal to zero and standard deviation equal to 1 (Section 6.7). If all possible samples of size n are drawn from a population, and the u-value is calculated for each sample, 95% of these u-values fall between −1.96 and 1.96; that is, the inequality

    −1.96 < u < 1.96    (1)

holds for 95% of all possible samples of size n. Two intervals can be derived from Inequality (1). When each element of Inequality (1) is multiplied by √(σ²/n) and then μ is added to each element, the interval, expressed by the resulting inequality, is

    μ − 1.96√(σ²/n) < ȳ < μ + 1.96√(σ²/n).    (2)
When each element of the original Inequality (1) is multiplied by -√(σ²/n), the resulting inequality is

1.96 √(σ²/n) > -ȳ + μ > -1.96 √(σ²/n).

Then ȳ is added to each of the elements of the above inequality. The interval, expressed by the resulting inequality, is

ȳ + 1.96 √(σ²/n) > μ > ȳ - 1.96 √(σ²/n), or

ȳ - 1.96 √(σ²/n) < μ < ȳ + 1.96 √(σ²/n).    (3)
Both intervals (2) and (3) are derived from the same Inequality (1) and therefore hold for only 95% of all possible samples of size n. Since the two intervals come from the same source and also have the same general appearance, they are often confused by beginners in statistics. Nevertheless, it is of utmost importance that they be differentiated at this stage. Among the quantities involved in Inequalities (2) and (3), only ȳ changes from sample to sample. In Inequality (2), the limits are
μ - 1.96 √(σ²/n)  and  μ + 1.96 √(σ²/n),

and these limits do not change from sample to sample. For the sampling experiment described in Chapter 4, where μ = 50, σ² = 100, and n = 5, the limits are

50 - 1.96 √(100/5)  and  50 + 1.96 √(100/5),

or

50 - 8.8  and  50 + 8.8,
or 41.2 and 58.8. The 4 sample means given in Table 4.2 are 48.8, 45.6, 59.2, and 55.4. Only the third sample mean, 59.2, falls outside this interval. If the means of all possible samples of size 5 are calculated, 95% of these means fall inside this interval. Interval (3) is not a single interval, but a collection of intervals. The limits
ȳ - 1.96 √(σ²/n)  and  ȳ + 1.96 √(σ²/n),

or

ȳ - 8.8  and  ȳ + 8.8,
change from sample to sample, because ȳ changes from sample to sample. Consequently, there is an interval for each sample. For the first sample in Table 4.2, the interval is 48.8 - 8.8 to 48.8 + 8.8, or 40.0 to 57.6. The intervals for the other three samples in Table 4.2 are 36.8 to 54.4, 50.4 to 68.0, and 46.6 to 64.2. These intervals, which are calculated from samples and used to estimate a parameter, are called confidence intervals. The interval 50.4 to 68.0 missed the population mean 50. The other three intervals caught the population mean 50. If all possible samples of size 5 are drawn from the population and a confidence interval is calculated from each sample, 95% of these intervals will include the population mean 50. The percentage of all possible samples (of a given size) yielding the confidence intervals which catch the parameter is called the confidence coefficient. In this example, the confidence coefficient is 95%. The end-points, such as
ȳ - 1.96 √(σ²/n)  and  ȳ + 1.96 √(σ²/n),

are called confidence limits. The length of a confidence interval is the difference between its confidence limits. In this example, the length of the interval is

2(1.96) √(σ²/n) = 2(1.96) √(100/5) = 17.5.
Now the contrast between the two intervals (2) and (3) should be clarified. In Interval (2), the limits are fixed, but the center element ȳ changes from sample to sample, and 95% of the sample means fall inside this interval. In this respect the sample mean resembles a basketball; it may fall inside or outside the fixed basket which resembles the fixed Interval (2). Interval (3), on the other hand, is a collection of confidence intervals, each of which is calculated from a sample. The confidence limits change from sample to sample, but the center element μ is a fixed quantity. Some confidence intervals catch the population mean μ, and some of them miss it. The confidence intervals in a sense resemble horseshoes which catch or miss the fixed peg which resembles the population mean. Those confidence intervals calculated from the first, second, and fourth samples of Table 4.2 caught the population mean 50, while that calculated from the third sample missed the population mean 50. An individual confidence interval thus may either include or miss the population mean. The percentage of all possible samples, of the same size, having confidence intervals that catch the population mean, is the confidence coefficient. One may use another analogy in sports and say that the confidence coefficient is like the batting average of a baseball player. It is a performance rating based on many times at bat.
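The horseshoe analogy can be checked on a machine. The following sketch, a minimal simulation in Python (not part of the original text; the seed and the number of repetitions are arbitrary choices), draws repeated samples of 5 observations from a normal population with mean 50 and variance 100, as in the Chapter 4 experiment, and counts how often the interval ȳ ± 1.96 √(σ²/n) catches the population mean; the proportion should be close to the 95% confidence coefficient.

    import numpy as np

    rng = np.random.default_rng(1)            # arbitrary seed for reproducibility
    mu, sigma2, n, reps = 50.0, 100.0, 5, 10_000
    half_width = 1.96 * np.sqrt(sigma2 / n)   # 1.96(4.472...) = 8.8

    samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
    ybar = samples.mean(axis=1)

    # An interval "catches the peg" when mu lies between its two limits.
    caught = (ybar - half_width < mu) & (mu < ybar + half_width)
    print(caught.mean())                      # close to 0.95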
For each individual time at bat, the player may either hit or miss the ball. The batting average is the percentage of times that he made hits. The confidence coefficient, like the batting average, is a performance rating based on the confidence intervals computed from all possible samples of the same size. The confidence interval computed from a particular sample may either include or miss the population mean. If a confidence coefficient of .95 is attached to a particular interval, this indicates that the interval is selected at random from a group of intervals with a performance rating of .95. The confidence coefficient is arbitrarily chosen. It need not be 95%. When the value 1.960 is replaced by 2.576 in calculating the confidence limits, the confidence coefficient would be 99% and the length of the interval would be

2(2.576) √(σ²/n) = 2(2.576) √(100/5) = 23.0.
Now it can be seen that the increase in the confidence coefficient automatically lengthens the confidence interval. When the confidence coefficient is increased from 95% to 99%, the confidence interval is lengthened from 17.5 to 23.0. The only way of keeping the confidence coefficient high and the confidence interval short is to reduce the standard error of the mean, σ/√n. This can be accomplished either by reducing the population standard deviation σ (Section 5.5), or by increasing the sample size n, or by both. It should be noted that a confidence interval may either catch the population mean or miss it. It is a simple matter of right or wrong. There is no such thing as a Type I error or a Type II error associated with a confidence interval. The Type I and Type II errors are the possible errors that may be committed in testing a hypothesis. Since a confidence interval does not involve any hypothesis, these errors cannot be committed.

11.4 Confidence Interval of Mean

A method of finding the confidence interval of a population mean is
given in the preceding section as an illustration of the confidence interval and the confidence coefficient. Yet that method, which is based on the u-distribution, is almost useless, because the population variance σ² is usually unknown. A more practical method of finding the confidence interval of a population mean can be derived from the t-distribution. It is known that if all possible samples of size n are drawn from a normal population with mean equal to μ and, for each sample, the statistic t is calculated, 95% of the t-values fall between -t.025 and t.025, where t.025 is the 2.5% point of the Student's t-distribution with n - 1 degrees of freedom (Theorem 8.1a).
In other words, the inequality

-t.025 < (ȳ - μ)/√(s²/n) < t.025    (1)

is true for 95% of all possible samples of size n. When each of the three terms of the above Inequality (1) is multiplied by -√(s²/n), the resulting inequality is

t.025 √(s²/n) > -ȳ + μ > -t.025 √(s²/n).

When ȳ is added to each of the three terms, the resulting inequality is

ȳ - t.025 √(s²/n) < μ < ȳ + t.025 √(s²/n).    (2)
The above inequality specifies the confidence interval of μ with a confidence coefficient of 95%. It should be realized that the confidence interval specified by Inequality (2) is not one interval, but a collection of intervals. For each sample of n observations, ȳ and s² can be calculated and t.025 can be obtained from the t-table, and therefore a confidence interval of μ can be calculated. If all possible samples of size n are drawn from a normal population, 95% of the samples yield confidence intervals which will include the population mean. The sampling experiment described in Chapter 4 may be used to demonstrate the meaning of this confidence interval and its confidence coefficient. The four samples, each consisting of 5 observations, given in Table 4.2 are random samples drawn from a normal population with mean equal to 50. The values of ȳ and √(s²/n) are already computed for each of the samples. The value of t.025 is 2.7764. For the first sample, the confidence interval of the population mean, with a confidence coefficient of 95%, is
48.8 - (2.7764 × 5.472) < μ < 48.8 + (2.7764 × 5.472), or

33.6 < μ < 64.0.

Since the population mean is known to be 50, this sample yields a confidence interval which includes the population mean. The confidence intervals calculated from the other three samples are 36.0 to 55.2, 46.8 to 71.6, and 44.3 to 66.5 respectively. Each of these three confidence intervals also includes the population mean 50. If all possible samples of size 5 are drawn and, for each sample, a confidence interval is calculated, 95% of all the intervals will include the population mean 50.
The confidence interval of the population mean can be applied to the same kind of problems described in Sections 8.6 and 8.7. In those sections, the problems are tests of hypotheses. In this section, there is no hypothesis and the problem is to estimate the population mean. The sugar beet experiment described in Section 8.7 may be used to differentiate the two kinds of problems. In testing the hypothesis that the population mean is equal to zero, the objective is to determine whether the use of fertilizer increases the yield of sugar beets. In finding the confidence interval of the population mean, the objective is to determine how much the yield of sugar beets is increased by the use of fertilizer. In an application, only one sample, which may consist of many observations, is available and consequently only one interval is calculated. Before the confidence interval can be calculated, the confidence coefficient is arbitrarily chosen. If the 95% confidence coefficient is chosen, t.025 is used in computing the interval. If the 99% coefficient is chosen, t.005 is used in computing the interval. The confidence interval computed from a given sample may or may not include the population mean. Since the population mean is unknown, there is no way to determine whether the interval actually includes the population mean or not. The confidence coefficient is attached to the confidence interval as a performance rating based on the confidence intervals computed from all possible samples of the same size. The method of computing the confidence limits of a population mean is quite simple. The method of computing ȳ and √(s²/n) is given in Section 8.5. When these two quantities are computed, the intervals are obtained by Inequality (2). The confidence intervals of the population mean are already computed from the four samples given in Table 4.2. The reader may familiarize himself with the computing procedure by recomputing these intervals. The length of the confidence interval given in Inequality (2) is
2 t.025 √(s²/n),
which changes from sample to sample, because s² changes from sample to sample. But the average length of the confidence intervals computed from all possible samples of size n will be decreased by increasing the sample size.

11.5 Confidence Interval of Difference Between Means

The confidence interval of the difference between two population means can be obtained from Theorem 10.4a. The algebraic manipulation involved is the same as that given in the preceding section and therefore
is omitted. The limits of the 95% confidence interval of μ₁ - μ₂ are

(ȳ₁ - ȳ₂) ± t.025 √[sp²(1/n₁ + 1/n₂)].    (1)

If the confidence coefficient of 99% is desired, the 2.5% point is replaced by the .5% point of the t-distribution with n₁ + n₂ - 2 degrees of freedom. The method of computing

(ȳ₁ - ȳ₂)  and  √[sp²(1/n₁ + 1/n₂)]
is given in Table 10.6. After these two quantities are calculated, the confidence limits can be obtained very easily. The 95% confidence interval determined by the pair of samples given in Table 10.6 is
-4 - (3.1825 × 2.108) < μ₁ - μ₂ < -4 + (3.1825 × 2.108)
-4 - 6.7 < μ₁ - μ₂ < -4 + 6.7
-10.7 < μ₁ - μ₂ < 2.7.

In other words,
the difference between the two population means μ₁ - μ₂ is somewhere between -10.7 and 2.7. Both the confidence interval and the test of hypothesis are applications of the same theorems (10.4a or 10.4b), but the objectives of these applications are different. The objective of the test of hypothesis as described in Sections 10.6 and 10.8 is to determine whether two population means are the same. The objective of finding a confidence interval of μ₁ - μ₂ is to estimate the magnitude of the difference between two population means. It is shown in Section 10.1 that, for a given total number of observations, the variance of the difference between two sample means
reaches a minimum when the two sample sizes are equal. It can be seen from Inequality (1) that the advantage of equalizing the sample sizes is to shorten the average length of the confidence intervals of the difference between two population means.
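The limits in Inequality (1) can likewise be computed in a few lines. The sketch below is an illustration only; the two samples are hypothetical, and the pooled variance is computed as the pooled SS divided by the pooled degrees of freedom, as in the text.

    import numpy as np
    from scipy import stats

    y1 = np.array([83.0, 76.0, 90.0, 81.0, 85.0])  # hypothetical first sample
    y2 = np.array([88.0, 92.0, 84.0, 95.0, 91.0])  # hypothetical second sample
    n1, n2 = len(y1), len(y2)

    # Pooled variance: pooled SS divided by (n1 - 1) + (n2 - 1).
    pooled_ss = ((y1 - y1.mean())**2).sum() + ((y2 - y2.mean())**2).sum()
    sp2 = pooled_ss / (n1 + n2 - 2)

    t025 = stats.t.ppf(0.975, df=n1 + n2 - 2)    # 2.5% point of t
    half_width = t025 * np.sqrt(sp2 * (1/n1 + 1/n2))
    diff = y1.mean() - y2.mean()
    print(diff - half_width, diff + half_width)  # 95% limits of mu1 - mu2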
EXERCISES

(1) In Section 8.7, a sugar beet experiment is described. Find the 95% confidence interval of the average increase in yield. (19.97 to 52.79)

(2) For the data given in Exercise (2), Chapter 8, find the 95% confidence interval of the average drained weight of the canned cherries. (12.02 to 12.28)
(3) For the data given in Exercise (1), Chapter 10, find the 99% confidence interval of the difference between two population means. Since the source of the two samples is known, state whether the estimate is correct. (-20.5 to 21.0)

(4) For the data given in Exercise (3), Chapter 10, find the 95% confidence interval of the difference between the two population means. Since μ₁ - μ₂ is known to be -40, state whether the estimate is correct.

(5) For the data given in Exercise (4), Chapter 10, find the 95% confidence interval of the difference between the average yields of the cuttings from the high- and low-yielding hops.

(6) For the data of Exercise 6, Chapter 8, find the 95% confidence interval of the average difference in phosphocreatine content.

(7) For the data of Exercise 7, Chapter 8, find the 95% confidence interval of the average difference in phosphocreatine content.

(8) For the data of Exercise 8, Chapter 8, find the 95% confidence interval of the average difference in the octane ratings determined by the two methods.

(9) For the data of Exercise 9, Chapter 8, find the 95% confidence interval of the average difference in scores between the two groups of children.

(10) For the data of Exercise 5, Chapter 10, find the 95% confidence interval of the difference between two population means.

(11) For the data of Exercise 6, Chapter 10, find the 95% confidence interval of the difference between two population means.

(12) For the data of Exercise 7, Chapter 10, find the 95% confidence interval of the difference between two population means.

(13) For the data of Exercise 8, Chapter 10, find the 95% confidence interval of the difference between two population means.

(14) For the data of Exercise 9, Chapter 10, find the 95% confidence interval of the difference between two population means.

(15) For the data of Exercise 10, Chapter 10, find the 95% confidence interval of the difference between two population means.
QUESTIONS

(1) In estimating a parameter by a confidence interval, how many kinds of errors can be committed? What are they?

(2) Define the confidence interval and the confidence coefficient.

(3) What is the advantage of having a large sample in determining the confidence interval of a population mean?

(4) Why do the lengths of the confidence intervals of a population mean change from sample to sample even if the sample size remains the same?
(5) The 95% confidence interval of the population mean as obtained from a given sample is 49 to 51. Is it correct to say that, 95 out of 100
times, the population mean falls inside the interval 49 to 51? Whether your answer is yes or no, state the reason.
CHAPTER 12
ANALYSIS OF VARIANCE-ONE-WAY CLASSIFICATION

The analysis of variance, as introduced in Section 7.9, is the process of partitioning the sum of squares into components. One of the objectives of the process is to test the hypothesis that a number of population means are equal. The method of testing the hypothesis that two population means are equal is given in Chapter 10. The analysis of variance, therefore, may be regarded as an extension of the t-test.

12.1 Mechanics of Partition of Sum of Squares

In this section only the mechanics of the partition of the sum of squares are described. The interpretation is given in the following section. The problem involves a number of samples, say k, each consisting of n observations. The samples are referred to as the first, second, ..., and kth sample. Their respective means are denoted by ȳ₁, ȳ₂, ..., and ȳₖ. The mechanics are described through the example shown in Table 12.1a. This example involves 3 samples.
TABLE 12.1a

Sample No.       (1)    (2)    (3)
                  3      9      1
                  7     12      2
Observations      7     11      6
                  6      8      4
                  2      5      7
Total T          25     45     20
Mean ȳ            5      9      4

Grand total G = 90; general mean y̿ = 6.
Now the summation sign Σ becomes inadequate. If, as just stated, T = Σy and G = ΣT, the two summation signs have different meanings. In the case T = Σy = y₁ + y₂ + ... + yₙ, the sign Σ indicates the sum of n
terms. In the case of G = ΣT = T₁ + T₂ + ... + Tₖ, the sign Σ indicates the sum of k terms. In order to make this distinction, a quantity is attached to the sign Σ to indicate the number of terms to be added together. For example,

T = Σⁿy  and  G = ΣᵏT = Σᵏⁿy.

The mean of the kn observations is called the general mean, which is denoted by y̿. Then

y̿ = G/kn = 90/15 = 6.
It should be noted that the general mean y̿ is also the mean of the k sample means, that is,

y̿ = (ȳ₁ + ȳ₂ + ... + ȳₖ)/k = (5 + 9 + 4)/3 = 6.

Each of the 15 (kn = 15) observations can be partitioned into three components, that is,

y = y̿ + (ȳ - y̿) + (y - ȳ).    (1)
The above equation is an algebraic identity. After simplification, it can be seen that the equation is y = y. For example, the first observation of Table 12.1a is 3; the first sample mean is 5 and the general mean is 6. Then by Equation (1),

3 = 6 + (5 - 6) + (3 - 5) = 6 - 1 - 2.
The quantity (-1) is the deviation of the first sample mean from the general mean and the quantity (-2) is the deviation of the observation from the first sample mean. The components of each of the 15 observations are shown in Table 12.1b.

TABLE 12.1b

Sample No.         (1)          (2)          (3)
                6 - 1 - 2    6 + 3 + 0    6 - 2 - 3
Components      6 - 1 + 2    6 + 3 + 3    6 - 2 - 2
of              6 - 1 + 2    6 + 3 + 2    6 - 2 + 2
observations    6 - 1 + 1    6 + 3 - 1    6 - 2 + 0
                6 - 1 - 3    6 + 3 - 4    6 - 2 + 3
Sample mean     6 - 1 + 0    6 + 3 + 0    6 - 2 + 0

Each entry is y̿ + (ȳ - y̿) + (y - ȳ).
The purpose of breaking down each observation into components is to explain the algebraic identity that

Σ(y - y̿)² = Σ(ȳ - y̿)² + Σ(y - ȳ)².    (2)
The sum of squares Σ(y - y̿)² in Equation (2) is called the total SS, which is the SS of the composite sample of the kn or 15 observations. For the given example (Table 12.1a),

Σᵏⁿ(y - y̿)² = (3 - 6)² + (7 - 6)² + ... + (7 - 6)² = 148.
The middle term of Equation (2) is the sum of squares of the middle components of the observations (Equation 1 and Table 12.1b), that is,

Σᵏⁿ(ȳ - y̿)² = n Σᵏ(ȳ - y̿)² = 5[(-1)² + 3² + (-2)²] = 70.    (3)
The last term of Equation (2) is the sum of the squares of the last components of the observations, that is,

Σᵏⁿ(y - ȳ)² = (-2)² + 2² + ... + 3² = 78.

It should be noted that 148 = 70 + 78.
This numerical example explains and also verifies Equation (2).
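The verification can also be done mechanically. The sketch below, a minimal illustration in Python (not part of the original text), recomputes the three sums of squares for the 15 observations of Table 12.1a.

    import numpy as np

    # The three samples of Table 12.1a, one row per sample.
    samples = np.array([[3, 7, 7, 6, 2],
                        [9, 12, 11, 8, 5],
                        [1, 2, 6, 4, 7]], dtype=float)
    k, n = samples.shape

    grand_mean = samples.mean()              # 6
    sample_means = samples.mean(axis=1)      # 5, 9, 4

    total_ss = ((samples - grand_mean)**2).sum()                # 148
    among_ss = n * ((sample_means - grand_mean)**2).sum()       # 70
    within_ss = ((samples - sample_means[:, None])**2).sum()    # 78
    print(total_ss, among_ss + within_ss)    # both equal 148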
The algebraic proof of Equation (2) is quite simple. However, students with a limited mathematical background may find that the use of complicated notations in proving the identity obscures rather than reveals the meaning of the proof. For them, the following explanation of the proof may be more useful than the demonstration of the proof. The values of (ȳ - y̿) and (y - ȳ) of each of the 15 (kn = 15) observations are given in Table 12.1b. The quantity (ȳ - y̿) is the middle component and (y - ȳ) is the last component. Therefore Equation (2) says that the total SS is equal to the sum of the squares of the middle components plus the sum of the squares of the last components of the 15 observations. When y̿ is subtracted from both sides of Equation (1), the resulting equation is

(y - y̿) = (ȳ - y̿) + (y - ȳ).

After both sides of the above equation are squared, the result is

(y - y̿)² = (ȳ - y̿)² + (y - ȳ)² + 2(ȳ - y̿)(y - ȳ).
For the first observation of the first sample, the equation is

(3 - 6)² = (-1)² + (-2)² + 2(-1)(-2), or 9 = 1 + 4 + 4.

For the second observation of the first sample, the equation is

(7 - 6)² = (-1)² + 2² + 2(-1)(2), or 1 = 1 + 4 - 4.

For each of the 15 (kn = 15) observations, there is such an equation. When all 15 equations of this kind are added together, the result is
Σ(y - y̿)² = Σ(ȳ - y̿)² + Σ(y - ȳ)² + 2Σ(ȳ - y̿)(y - ȳ).

The difference between the above equation and Equation (2) is the term 2Σ(ȳ - y̿)(y - ȳ). Now the problem is to show that Σ(ȳ - y̿)(y - ȳ) = 0. Since (ȳ - y̿) and (y - ȳ) are the second and third components of an observation, the sum of the products Σ(ȳ - y̿)(y - ȳ) can be obtained from Table 12.1b. This quantity is the sum of the products of the second and third components of the 15 (kn = 15) observations, that is,

Σ(ȳ - y̿)(y - ȳ) = (-1)(-2) + (-1)(+2) + ... + (-2)(3),

which is also equal to

(-1)(-2 + 2 + 2 + 1 - 3) + 3(0 + 3 + 2 - 1 - 4) - 2(-3 - 2 + 2 + 0 + 3).

But the sum of the deviations of the observations from their mean is equal to zero (Sections 7.4, 7.8). Therefore,

Σ(ȳ - y̿)(y - ȳ) = (ȳ₁ - y̿)(0) + (ȳ₂ - y̿)(0) + ... + (ȳₖ - y̿)(0) = 0.
This completes the proof of the identity

Σᵏⁿ(y - y̿)² = Σᵏⁿ(ȳ - y̿)² + Σᵏⁿ(y - ȳ)²,

which can also be written as (Equation 3)

Σᵏⁿ(y - y̿)² = n Σᵏ(ȳ - y̿)² + Σᵏⁿ(y - ȳ)².    (5)

The above three sums of squares measure the variation of the kn observations in different ways. The total SS, Σᵏⁿ(y - y̿)², which is the SS of the composite sample of the kn observations, measures the overall variation of the kn observations. The total SS is equal to zero only if all the observations are the same (Table 12.1c). The sum of squares
n Σᵏ(ȳ - y̿)², which is called the among-sample SS, measures the variation among the k sample means. The among-sample SS is equal to zero if all the k sample means are equal, but the observations within a sample do not have to be the same (Table 12.1d). The sum of squares Σᵏⁿ(y - ȳ)², which is called the within-sample SS, measures the variation of the observations within the samples. The
TABLE 12.1c

(1)    (2)    (3)
 6      6      6
 6      6      6
 6      6      6
 6      6      6
 6      6      6
within-sample SS is equal to zero if all the observations within each of the k samples are the same, but the k sample means themselves do not have to be the same (Table 12.1e). The within-sample SS is actually the sum of the SS-values of the k samples. It is also called the pooled SS (Section 9.6), that is,

Σᵏⁿ(y - ȳ)² = Σⁿ(y - ȳ₁)² + Σⁿ(y - ȳ₂)² + ... + Σⁿ(y - ȳₖ)²    (6)

or,

within-sample SS = SS₁ + SS₂ + ... + SSₖ = pooled SS.    (7)
TABLE 12.1d

           (1)    (2)    (3)
            4      6      3
            8      9      4
            8      8      8
            7      5      6
            3      2      9
Total T    30     30     30
Mean ȳ      6      6      6

TABLE 12.1e

           (1)    (2)    (3)
            5      9      4
            5      9      4
            5      9      4
            5      9      4
            5      9      4
Total T    25     45     20
Mean ȳ      5      9      4
12.2 Statistical Interpretation of Partition of Sum of Squares

The mechanics of the partition of the sum of squares are described in the previous section. The total SS, which is the SS of the composite of k samples, each consisting of n observations, can be partitioned into two components, namely, the among-sample SS and the within-sample SS. This section gives the statistical interpretation of this partitioning.
Now the k samples are considered random samples drawn from the same normal population. Therefore, it is justified to lump all the kn observations of the k samples together to form a composite sample. If all possible samples of kn observations are drawn from a normal population, and for each sample, the SS-value is calculated, the distribution of SS/σ² follows the χ²-distribution with kn - 1 degrees of freedom (Theorem 7.7a). In other words, (total SS)/σ² follows the χ²-distribution with kn - 1 degrees of freedom. The distribution of the among-sample SS can also be deduced from Theorem 7.7a. The distribution of the means of all possible samples of size n may be regarded as a population. This population has ȳ, instead of y, as its observations. Since the parent population is normal, the sample means follow the normal distribution (Theorem 5.2b). The mean of this population of sample means is equal to μ and the variance is equal to σ²/n (Theorem 5.3). Then the k sample means, ȳ₁, ȳ₂, ..., ȳₖ, become a sample of k observations from this population. If all possible samples of k sample means are drawn from this population of ȳ, the SS can be calculated for each sample of k sample means. The distribution of

Σᵏ(ȳ - y̿)²/(σ²/n) = n Σᵏ(ȳ - y̿)²/σ² = (among-sample SS)/σ²

must follow the χ²-distribution with k - 1 degrees of freedom (Theorem 7.7a). The distribution of the within-sample SS can also be deduced from theorems developed in the preceding chapters. The within-sample SS is the pooled SS of the k samples. It is known that the statistic SS/σ² follows the χ²-distribution with n - 1 degrees of freedom (Theorem 7.7a). If the k samples are independent samples, the statistic

(pooled SS)/σ² = (SS₁ + SS₂ + ... + SSₖ)/σ² = (within-sample SS)/σ²

follows the χ²-distribution with (n₁ - 1) + (n₂ - 1) + ... + (nₖ - 1) degrees of freedom (Section 9.6). Since all the sample sizes are equal in this case, the number of degrees of freedom of the pooled SS, or the within-sample SS, is k(n - 1). Now it can be seen that not only the total SS is partitioned into components, but also the number of degrees of freedom of the total SS. The relation among the three SS-values and among their respective numbers of degrees of freedom is summarized in the following equations:
SS:    Total  =  Among-sample  +  Within-sample
d.f.:  kn - 1 =     (k - 1)    +    k(n - 1)
Since the sample variance s², which is the SS divided by its number of degrees of freedom, is an estimate of the variance of the population from which the sample is drawn, the variance of a sample of k sample means, which is

sȳ² = Σᵏ(ȳ - y̿)²/(k - 1),    (1)

is an estimate of the variance of the population of sample means. Therefore, sȳ² is an estimate of σ²/n. Then n times sȳ² is an estimate of σ², that is,

n sȳ² = n Σᵏ(ȳ - y̿)²/(k - 1) = (among-sample SS)/(k - 1)    (2)
is an estimate of σ². From Section 9.6, it is already known that the pooled SS divided by the pooled numbers of degrees of freedom, that is,

sp² = (pooled SS)/k(n - 1) = Σᵏⁿ(y - ȳ)²/k(n - 1) = (within-sample SS)/k(n - 1),
is also an estimate of σ². Then one would be tempted to say that the ratio n sȳ²/sp² follows the F-distribution with k - 1 and k(n - 1) degrees of freedom. This intuition happens to be correct. The fact that this ratio follows the F-distribution can be verified by a sampling experiment, the details of which are described in Chapter 4. The 1000 random samples, each consisting of 5 observations, are grouped into 500 pairs. The first and the second sample form a pair, the third and the fourth form a pair, and so forth. For each pair of samples, the among-sample SS and the within-sample SS are calculated. The among-sample SS has k - 1 or 1 degree of freedom and the within-sample SS has k(n - 1) or 2(5 - 1), or 8 degrees of freedom. An example of two pairs of samples is given in Table 4.2. For the first pair of samples, the means are 48.8 and 45.6 respectively and the SS-values are 598.8 and 237.2 respectively. The general mean is ½(48.8 + 45.6) = 47.2. The among-sample SS, which has 1 degree of freedom, is

n Σᵏ(ȳ - y̿)² = 5[(48.8 - 47.2)² + (45.6 - 47.2)²] = 5[(1.6)² + (-1.6)²] = 25.6.

The within-sample SS, which has 8 degrees of freedom, is the sum of the two SS-values, that is, 598.8 + 237.2 = 836.0. Then n sȳ² = 25.6/1 = 25.6, and sp² = 836.0/8 = 104.5; the variance-ratio n sȳ²/sp² = 25.6/104.5 = 0.245 with 1 and 8 degrees of freedom. Such a ratio is computed for each of
the 500 pairs of samples. The frequency table of the 500 ratios is shown in Table 12.2. The theoretical frequency given in the table is that of the F-distribution with 1 and 8 degrees of freedom. The histogram of the 500 ratios and the superimposed F-curve with 1 and 8 degrees of freedom are shown in Fig. 12.2.

TABLE 12.2

F          Observed       Observed     Theoretical
           Frequency f    r.f. (%)     r.f. (%)
0-1           332           66.4          65.1
1-2            72           14.4          15.4
2-3            37            7.4           7.4
3-4            16            3.2           4.1
4-5            11            2.2           2.5
5-6            15            3.0           1.6
6-7             5            1.0           1.1
7-8             2             .4            .7
8-9             4             .8            .5
9-10            1             .2            .4
10-11           0             .0            .3
11-12           1             .2            .1
Over 12         4             .8            .8
Total         500          100.0         100.0

(This sampling experiment was done cooperatively by about 75 students at Oregon State College in the Fall of 1952.)
It can be observed in either the frequency table or the histogram that the observed frequency and the theoretical frequency fit closely. This experiment verifies the contention that the statistic

F = [(among-sample SS)/(k - 1)] ÷ [(within-sample SS)/k(n - 1)]
  = [n Σᵏ(ȳ - y̿)²/(k - 1)] ÷ [Σᵏⁿ(y - ȳ)²/k(n - 1)] = n sȳ²/sp²    (4)

follows the F-distribution with k - 1 and k(n - 1) degrees of freedom. The applications of this result are given in later sections. The variance n sȳ² is called the among-sample mean square and the pooled variance sp² of k samples is called the within-sample mean square. Neither of these variances acquires any new meaning because of its new name. Each is given its new name for convenience of reference. The abbreviation MS is usually used to represent the mean square. The among-sample SS is also called the treatment SS and the within-sample SS is also called the error SS. These names are acquired because of the applications of the analysis of variance which are discussed in Section 12.8.
[Fig. 12.2. Histogram of the 500 variance-ratios with the superimposed F-curve with 1 and 8 degrees of freedom; relative frequency is plotted against F.]
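The sampling experiment itself is easy to repeat by machine. The sketch below is an illustration (the seed is an arbitrary choice): it draws 500 pairs of samples of 5 observations from a normal population with mean 50 and variance 100, computes the variance-ratio n sȳ²/sp² for each pair, and compares the proportion of ratios beyond the 5% point of F with the theoretical 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1952)   # arbitrary seed
    k, n, pairs = 2, 5, 500

    obs = rng.normal(50, 10, size=(pairs, k, n))
    ybar = obs.mean(axis=2)                        # the k sample means per pair
    grand = ybar.mean(axis=1, keepdims=True)       # the general mean per pair

    among_ms = n * ((ybar - grand)**2).sum(axis=1) / (k - 1)
    within_ms = ((obs - ybar[..., None])**2).sum(axis=(1, 2)) / (k * (n - 1))
    ratio = among_ms / within_ms                   # follows F with 1 and 8 d.f.

    f05 = stats.f.ppf(0.95, dfn=k - 1, dfd=k * (n - 1))   # 5% point, about 5.32
    print((ratio > f05).mean())                    # close to 0.05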
The physical meaning of the number of degrees of freedom can be observed in Table 12.1b. The among-sample SS is n times the sum of squares of the quantities (ȳ - y̿) (Equation 3, Section 12.1). These quantities are -1, 3, and -2, the sum of which is equal to zero. Therefore, if two of these quantities are known, the third one becomes automatically known, and consequently the number of degrees of freedom of the among-sample SS is 2. In general, if k - 1 of the k quantities are known, the remaining one becomes automatically known. Therefore, the number of degrees of freedom is k - 1. The within-sample SS is the sum of squares of (y - ȳ) for all the k samples. But for each sample, Σⁿ(y - ȳ) = 0 (Table 12.1b). Therefore, if n - 1 of the deviations (y - ȳ) are known, the remaining one becomes automatically known; consequently Σⁿ(y - ȳ)² has n - 1 degrees of freedom for each sample. Then the pooled value Σᵏⁿ(y - ȳ)² for the k samples has k(n - 1) degrees of freedom.
12.3 Computing Method

The mechanics of the partition of the sum of squares are shown in detail in Section 12.1. In that section the example used, involving mostly one-digit numbers, was deliberately chosen to avoid complicated computation.
The purpose of that section is to show the meaning of partitioning the total SS into the among-sample SS and within-sample SS, but the method of doing so is too tedious to be of practical value. In this section, a short-cut method is presented for computing the total SS, among-sample SS, within-sample SS, and also the F-value. The short-cut method is developed entirely from the identity (Equations 3 and 4, Section 7.4)

SS = Σ(y - ȳ)² = Σy² - (Σy)²/n    (1)

and the procedure is devised mainly to suit a desk calculator. The notations used in this section are the same as those used in Section 12.1, that is,
k -- number of samples;
n -- number of observations of each sample;
y -- an observation;
T -- sample total, e.g., T₁, T₂, ..., Tₖ;
ȳ -- sample mean, e.g., ȳ₁, ..., ȳₖ;
G -- grand total;
y̿ -- general mean.
Since the general mean y̿ is the mean of the kn observations, the total SS is the SS of the composite sample of the kn observations. From Equation (1), it can be seen that

total SS = Σᵏⁿ(y - y̿)² = Σᵏⁿy² - G²/kn.    (2)
Since y̿ is also the mean of the k sample means ȳ, the application of Equation (1) to the sample means leads to the result that

among-sample SS = n Σᵏ(ȳ - y̿)² = n[Σᵏȳ² - (Σᵏȳ)²/k].

But ȳ is the sample total T divided by n. Therefore,

among-sample SS = ΣᵏT²/n - G²/kn.    (3)
The SS for each sample is

Σⁿy² - (Σⁿy)²/n = Σⁿy² - T²/n.

The pooled SS for the k samples is the sum of the k SS-values, that is,

within-sample SS = Σᵏⁿy² - ΣᵏT²/n.    (4)
Now the computation of the total SS, among-sample SS, and within-sample SS narrows down to the computation of the following three quantities:

(I) G²/kn;    (II) ΣᵏT²/n;    (III) Σᵏⁿy².

Then, from Equations (2), (3), and (4), it can be seen that

total SS = (III) - (I)    (5)
among-sample SS = (II) - (I)    (6)
within-sample SS = (III) - (II).    (7)
Now it is apparent, from the above equations, that the total SS is equal to the among-sample SS plus the within-sample SS. The basic principle of the short-cut method is to replace the means by the totals in all steps of computation. The general mean y̿ is replaced by the grand total G and the sample mean ȳ is replaced by the sample total T. The observation y remains intact. With this principle in mind, it is easy to remember of what particular combination of the three quantities (I), (II), and (III) a certain SS consists. The total SS, being the sum of the squares of the deviations (y - y̿), is equal to (III) - (I). The quantity (III) involves y and the quantity (I) involves G, which is associated with y̿. The among-sample (treatment) SS, being the sum of squares of the deviations (ȳ - y̿), is equal to (II) - (I). The quantity (II) involves T, which is associated with ȳ, and the quantity (I) involves G, which is associated with y̿. The error SS, being the sum of the squares of the deviations (y - ȳ), is equal to (III) - (II). The quantity (III) involves y and the quantity (II) involves T, which is associated with ȳ. With these associations in mind, the short-cut method becomes meaningful and easy to remember. In the analysis of variance calculations, the sample totals T and the grand total G are obtained first. The rest of the computation can be arranged in the tabular form shown in Table 12.3a.
Note that the sample mean ȳ and the general mean y̿ are not needed in computing the total SS and its components. The quantity (Table 12.3a)

ΣT² = T₁² + T₂² + ... + Tₖ²

can be obtained in one continuous operation on a desk calculator. Of course, Σy² can also be obtained in the same way.

TABLE 12.3a

Preliminary Calculations

(1)            (2)        (3)        (4)                 (5)
Type of        Total of   No. of     No. of Observations Total of Squares
Total          Squares    Items      per Squared Item    per Observation
                          Squared                        (2) ÷ (4)
Grand          G²         1          kn                  (I)
Sample         ΣT²        k          n                   (II)
Observation    Σy²        kn         1                   (III)

Analysis of Variance

Source of        Sum of          Degrees of    Mean Square    F
Variation        Squares SS      Freedom       MS
Among-sample     (II) - (I)      k - 1         n sȳ²          n sȳ²/sp²
Within-sample    (III) - (II)    kn - k        sp²
Total            (III) - (I)     kn - 1
The three SS-values in the lower half of Table 12.3a are obtained by subtracting one item from another item of column (5) of the upper half of the table. The number of degrees of freedom can be obtained by performing the corresponding subtraction among the items of column (3) in the upper half of the table. For example, the total SS is obtained by subtracting (I) from (III), and its number of degrees of freedom, kn - 1, can be obtained by subtracting the first item, 1, from the third item, kn, of column (3). The total SS and its components of the example given in Table 12.1a were already found in Section 12.1. Now the short-cut method is used on the same 15 observations. The details of the computation are shown in Table 12.3b. It should be noted that the three SS-values obtained by the short-cut method given in Table 12.3b are the same as those obtained in Section 12.1 by the long method.
TABLE 12.3b

Preliminary Calculations

(1)            (2)        (3)        (4)                 (5)
Type of        Total of   No. of     No. of Observations Total of Squares
Total          Squares    Items      per Squared Item    per Observation
                          Squared                        (2) ÷ (4)
Grand          8,100      1          15                  540
Sample         3,050      3          5                   610
Observation    688        15         1                   688

Analysis of Variance

Source of        Sum of         Degrees of    Mean Square    F
Variation        Squares SS     Freedom
Among-sample      70             2            35.0           5.38
Within-sample     78            12             6.5
Total            148            14
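The arithmetic of Table 12.3b translates directly into code. The sketch below, an illustration only (not part of the original text), computes the quantities (I), (II), and (III) and the three SS-values for the 15 observations of Table 12.1a.

    import numpy as np

    samples = np.array([[3, 7, 7, 6, 2],
                        [9, 12, 11, 8, 5],
                        [1, 2, 6, 4, 7]], dtype=float)
    k, n = samples.shape
    T = samples.sum(axis=1)        # sample totals: 25, 45, 20
    G = T.sum()                    # grand total: 90

    q1 = G**2 / (k * n)            # (I):   8100/15 = 540
    q2 = (T**2).sum() / n          # (II):  3050/5  = 610
    q3 = (samples**2).sum()        # (III): 688

    print(q2 - q1, q3 - q2, q3 - q1)   # among 70, within 78, total 148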
For this particular example, the advantage of using the short-cut method is not obvious, because the total number of observations is small and the observations are mostly one-digit numbers. When one compares the two methods on a practical problem, however, one soon realizes that the short-cut method indeed deserves its name.

12.4 Variance Components and Models

In Section 12.2, it is shown that, if the k samples of size n are drawn from the same population, the variance among the k sample means
sȳ² = Σᵏ(ȳ - y̿)²/(k - 1)    (1)

is an estimate of σ²/n. The average of sȳ² of all possible sets of k samples of size n drawn from the same population is equal to σ²/n (Theorem 7.2). It is of interest to know the average of sȳ² if the k samples are drawn from populations with different means but with the same variance. The sampling experiment of Section 12.2 may be used to determine this average. In this sampling experiment, where k = 2, n = 5, all 1000 samples are
drawn from the same normal population with mean equal to 50 and variance equal to 100. However, the samples could be drawn from different populations. For example, the first sample of each of the 500 pairs of samples could be drawn from a population with mean equal to 60 and the second sample drawn from a population with mean equal to 40. If 10 is added to each of the 5 observations of the first sample of each pair and 10 is subtracted from each of the 5 observations of the second sample, then these samples, with changed observations, become samples drawn from normal populations with means equal to 60 and 40 respectively. But both variances are still equal to 100 (Theorem 2.4a). The effect of these changes in the observations on the variance among sample means can be seen from the example of four samples given in Table 4.2. The means of the first pair of samples are 48.8 and 45.6 respectively. These changes in observations, and the resulting changes in population means, will change the two sample means into 58.8 and 35.6 respectively, but with the general mean, y̿ = 47.2, unchanged. When the two samples are drawn from the same population with mean equal to 50, the variance of the sample means is

sȳ² = [(48.8 - 47.2)² + (45.6 - 47.2)²]/(2 - 1) = 5.12.
Now when the first sample is drawn from the population with mean equal to 60 and the second sample drawn from the population with mean equal to 40, the variance of the sample means is

sȳ² = [(58.8 - 47.2)² + (35.6 - 47.2)²]/(2 - 1) = 269.12.
Because of the variation among population means, the variance among sample means is increased from 5.12 to 269.12, with a net gain of 264. On the average, the gain is equal to the variance of the two population means. In general, the average of sȳ² for all possible sets of k samples drawn from k populations is equal to σ²/n + σμ², where σμ², the variance of the k population means, is defined as

σμ² = [(μ₁ - μ̄)² + (μ₂ - μ̄)² + ... + (μₖ - μ̄)²]/(k - 1).    (2)

The notation μ̄ in the above equation is the mean of the k population means, μ₁, μ₂, ..., μₖ. The sampling experiment may be used to verify this point. The 1000 samples of the experiment are all drawn from the
same population with mean equal to 50. The variance among the sample means is

sȳ² = ½(ȳ₁ - ȳ₂)².    (3)

If the two samples of each of the 500 pairs of samples are drawn from populations with means equal to 60 and 40, the effect is that ȳ₁ is increased by 10 and ȳ₂ is decreased by 10, and consequently the difference ȳ₁ - ȳ₂ is increased by 20. The variance of the new sample means is then

½[(ȳ₁ - ȳ₂) + 20]² = ½(ȳ₁ - ȳ₂)² + 20(ȳ₁ - ȳ₂) + 200.

The effect of the population means being 60 and 40, instead of being equal, is that the variance of a pair of sample means is increased by 20(ȳ₁ - ȳ₂) + 200. This increase changes from one pair of samples to another, because ȳ₁ and ȳ₂ change from one pair of samples to another. As in Theorem 10.1a, the mean of all possible differences (ȳ₁ - ȳ₂) is 50 - 50 = 0. Therefore, the average increase in sȳ² is 200, and the variance of the population means is also

σμ² = [(60 - 50)² + (40 - 50)²]/(2 - 1) = 200.
This illustrates the fact that the average of sȳ² of all possible sets of k samples, each consisting of n observations, drawn from k populations with different means is equal to σ²/n + σμ². Since the among-sample mean square is n sȳ² (Equation 2, Section 12.2), the average among-sample MS is n(σ²/n + σμ²) or σ² + nσμ². The addition of 10 to each of the observations or the subtraction of 10 from each of the observations does not affect the sample variance (Theo-
rem 2.4a) and consequently does not affect the pooled variance or within-sample mean square. The above discussion may be summarized in the following equations:

(a) Average of among-sample MS = σ² + nσμ²    (4)
(b) Average of within-sample MS = σ².    (5)
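Equations (4) and (5) can be checked by simulation. The sketch below is an illustration (the seed and the number of repetitions are arbitrary): pairs of samples of 5 observations are drawn from normal populations with means 60 and 40 and common variance 100, so that σ² + nσμ² = 100 + 5(200) = 1100; the average among-sample mean square should settle near 1100 and the average within-sample mean square near 100.

    import numpy as np

    rng = np.random.default_rng(7)     # arbitrary seed
    k, n, reps = 2, 5, 20_000
    mus = np.array([60.0, 40.0])       # population means; their variance is 200

    obs = rng.normal(mus[:, None], 10.0, size=(reps, k, n))
    ybar = obs.mean(axis=2)
    grand = ybar.mean(axis=1, keepdims=True)

    among_ms = n * ((ybar - grand)**2).sum(axis=1) / (k - 1)
    within_ms = ((obs - ybar[..., None])**2).sum(axis=(1, 2)) / (k * (n - 1))
    print(among_ms.mean(), within_ms.mean())   # near 1100 and 100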
There are two possible interpretations of the set of k population means: μ₁, μ₂, ..., μₖ. One is that they are fixed quantities. The other is that they change from one set of samples to another. For example, there are 4 (k = 4) herds of cattle. From each herd, a sample of 20 animals (n = 20) is drawn. The weight of each animal is determined. If one's interest is in these 4 herds only, the four herds are the four populations and their average weights are μ₁, μ₂, μ₃, and μ₄. If repeated samples were drawn, they would be drawn from the same four populations. On the other hand, the four herds may be considered only a sample of a much larger number of herds. If so, the set of four population means μ₁, μ₂, μ₃, and μ₄ is only a sample of a much larger number of population means. If repeated samples were drawn, they might be drawn from four different populations. The former interpretation of the k population means is called the linear hypothesis model or the fixed model of the analysis of variance. The latter interpretation of the k population means is called the component of variance model or the random variable model of the analysis of variance. In this chapter only the linear hypothesis model is considered, that is, the different sets of k samples are drawn from the same k populations. This model can be illustrated by the sampling experiment, in which there are only two populations whose means are 60 and 40. The first sample of every pair is drawn from the population with mean equal to 60 and the second sample of every pair is drawn from the population with mean equal to 40. In other words, different pairs of samples are drawn from the same pair of populations. Symbolically, each observation may be expressed in the following equation
y = μ̄ + (μ - μ̄) + (y - μ),    (6)

where y is an observation, μ̄ the mean of the k population means, and μ the mean of the population from which y was taken. For the sampling experiment, μ₁ = 60, μ₂ = 40, and μ̄ = ½(60 + 40) = 50. The observation 36 from the second population is

36 = 50 + (40 - 50) + (36 - 40) = 50 + (-10) + (-4).

Note that (μ - μ̄) is equal to 10 for the first population and -10 for the second population, and these quantities do not change with repeated samplings from the same populations.
The quantity μ̄ is estimated by the general mean y̿. The quantity (μ - μ̄), which is called the treatment effect, is estimated by (ȳ - y̿), and the quantity (y - μ), which is called the error, is estimated by (y - ȳ). As a result, the among-sample SS, which is the sum of squares of the estimated treatment effects, is also called the treatment SS. The within-sample SS, which is the sum of squares of the estimated errors, is also called the error SS.
12.5 Test of Hypothesis-Procedure
The preceding sections deal with the deductive relation between k populations and their respective samples. Now, in this section, the topic to be considered is the drawing of inductive inferences about the k population means from k samples. It is shown in Section 12.2 that, if all possible sets of k samples, each consisting of n observations, are drawn from the same normal population, the statistic

F = (among-sample mean square)/(within-sample mean square) = (treatment MS)/(error MS)

follows the F-distribution with k - 1 and k(n - 1) degrees of freedom. It is further shown that the among-sample mean square is an estimate of σ² + nσμ² (Equation 4, Section 12.4) and the within-sample mean square is an estimate of σ² (Equation 5, Section 12.4), where σ² is the variance common to the k populations and σμ² is the variance of the k population means. These results may be used in testing the hypothesis that k population means are equal, if the k populations are normal and have the same variance σ². When k random samples, each consisting of n observations, are drawn from k populations, the F-value can be calculated (Section 12.3) to test the hypothesis that the k population means are equal. The magnitude of the F-value enables one to decide whether the hypothesis is to be accepted or rejected. If F is in the neighborhood of 1, the indication is that σ² + nσμ² = σ², or σμ² = 0, or more explicitly, the k population means are equal, or all the treatment effects are equal to zero. If F is too large, the indication is that σ² + nσμ² > σ², or that σμ² > 0, or more explicitly, that the k population means are not equal, or not all the treatment effects are equal to zero. Since the variance σμ² of the k population means cannot be negative, σ² + nσμ² cannot be less than σ². Therefore, the hypothesis is rejected only because F is too large, and never because F is too small. In other words, this F-test is a one-tailed test, and the critical region is at the right tail of the F-distribution. There is also a common-sense way of seeing why this F-test is a one-tailed test. The smallest value that F can assume is zero. But if F is equal to zero, the among-
sample SS is equal to zero, or the k sample means are equal. The fact that the k sample means are equal is certainly no basis for rejecting the hypothesis that the k population means are equal. The F-test is used in Section 9.4 to test the hypothesis that σ₁² = σ₂². The quantities (σ² + nσμ²) and σ² in this F-test are equivalent respectively to σ₁² and σ₂² of the previous F-test. Then in this F-test, the hypothesis is that σ² + nσμ² = σ², that is, σμ² = 0. It should be noted that, if the hypothesis is true, that is, σμ² = 0, then σ² + nσμ² = σ², no matter how large the sample size n is. If the 5% significance level is used, 5% of all possible sets of k samples will lead to the erroneous conclusion that the k population means are not equal, regardless of the sample size. But if the hypothesis is false, that is, σμ² is greater than zero, nσμ² can be magnified into any size, if n is sufficiently large. Thus σ² + nσμ² can be made as large as one pleases by increasing the sample size n. If σ² + nσμ² is made much larger than σ², the average F-value will be much larger than 1, and the hypothesis is likely to be rejected. Therefore, the advantage of large samples is to make the rejection of the false hypothesis more certain, or to reduce the probability of committing a Type II error. For example, suppose the population means are 30, 40, 50, 60, and 70 respectively. The variance of each of the 5 populations is σ² = 100, and the variance of the 5 population means is

σμ² = [(30 - 50)² + (40 - 50)² + (50 - 50)² + (60 - 50)² + (70 - 50)²]/(5 - 1) = 250.

Then

σ² + nσμ² = 100 + n(250)  and  σ² = 100.
If n = 3, σ² + nσμ² is 8.5 times as large as σ², and if n = 6, σ² + nσμ² is 16 times as large as σ². The larger n becomes, the larger is the average value of F. Thus the false hypothesis is more likely to be rejected. In other words, if the significance level remains the same, the increase in sample size reduces the probability of committing a Type II error. However, if σμ² is smaller, it will take larger sample sizes to reduce the probability of committing a Type II error to the same extent. For example, if the population means are 40, 45, 50, 55, and 60 respectively, the variance of the population means is

σμ² = [(40 - 50)² + (45 - 50)² + (50 - 50)² + (55 - 50)² + (60 - 50)²]/(5 - 1) = 62.5;

then it requires n to be 24 to make σ² + nσμ² = 100 + 24(62.5) = 1600.
The quantity σ² + nσμ² remains the same in either case, whether the condition is σμ² = 250 and n = 6 or the condition is σμ² = 62.5 and n = 24. That which influences the probability of committing a Type II error is the product of n and σμ². Hence, if the k population means are widely apart, relatively small samples will enable one to reject the hypothesis that the k population means are equal. However, if the k population means are almost equal, the same hypothesis can be rejected only with the aid of large samples. It should be remembered that the statistic

F = (among-sample mean square)/(within-sample mean square) = (treatment MS)/(error MS)
follows the F-distribution only if the hypothesis is true, that is, σμ² = 0, or σ² + nσμ² = σ². Now if σ² + nσμ² is 16 times as large as σ², the F-values are, on the average, 16 times as large as the true F-values and consequently do not follow the true F-distribution. This distorted distribution and the true F-distribution with 4 (i.e., k - 1 = 4) and 25 (i.e., kn - k = 30 - 5) degrees of freedom are shown in Fig. 12.5. The critical region for the 5% significance level is where F > 2.7587. When an F-value is less than 2.7587, the hypothesis that the k population means are equal is accepted and thus a Type II error is committed. It can be seen from Fig. 12.5 that the probability of committing a Type II error, as shown by the shaded area of the distorted F-distribution, is very small. The procedure of the test of hypothesis, illustrated by the example given in Table 12.1a, is summarized as follows:

1. Hypothesis: The hypothesis is that the three population means are equal, that is, μ₁ = μ₂ = μ₃. The hypothesis can also be stated as σμ² = 0.

2. Alternative hypothesis: The alternative hypothesis is that the three population means are not all the same, or that σμ² > 0.

3. Assumptions: The assumption is that the three samples are random samples drawn from normal populations with the same variance.

4. Level of significance: The 5% significance level is used.

5. Critical region: The critical region is where F > 3.8853. (Since k = 3 and n = 5, the numbers of degrees of freedom of F are 2 and 12. This F-test is a one-tailed test; therefore, the 5% table is used for the 5% significance level.)

6. Computation of F: The details of the computation of F are shown in Table 12.3b. The F-value is equal to 5.38, with 2 and 12 degrees of freedom.

7. Conclusion: Since the F-value is inside the critical region, the hypothesis is rejected, and the conclusion is that the three populations do not have the same means. (If the F-value were less than 3.8853, the conclusion would be that the three populations do have the same means.)
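The seven steps can be reproduced by machine. The sketch below is an illustration (not the book's worksheet): it tests the hypothesis for the data of Table 12.1a and recovers the F-value 5.38 and the 5% point 3.8853.

    import numpy as np
    from scipy import stats

    samples = [np.array([3, 7, 7, 6, 2], float),
               np.array([9, 12, 11, 8, 5], float),
               np.array([1, 2, 6, 4, 7], float)]

    f_value, p_value = stats.f_oneway(*samples)   # one-way analysis of variance
    critical = stats.f.ppf(0.95, dfn=2, dfd=12)   # 5% point: 3.8853

    print(round(f_value, 2), round(critical, 4))  # 5.38 and 3.8853
    print(f_value > critical)                     # True: the hypothesis is rejected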
[Fig. 12.5. The true F-distribution with 4 and 25 degrees of freedom and the distorted distribution, with the critical region F > 2.7587 at the right tail.]
12.6 Relation Between t-Distribution and F-Distribution
If two samples are available, the t-test may be used in testing the hypothesis that the two population means are equal (Section 10.6), but the F-test (analysis of variance with k = 2) can also be used for the same purpose. It is interesting to see whether the two tests will always lead to the same conclusion. In the t-test, if n₁ = n₂ = n,

t = (ȳ₁ - ȳ₂)/√[sp²(1/n₁ + 1/n₂)] = (ȳ₁ - ȳ₂)/√(2sp²/n),    (1)

with (n₁ + n₂ - 2) or 2n - 2 degrees of freedom. In the analysis of variance,
F = n sȳ²/sp²    (2)

(Equation 4, Section 12.2) with 1 (k = 2, k - 1 = 1) and 2(n - 1) degrees of freedom. After some algebraic manipulation, it can be shown that t² = F. The quantity sȳ² is equal to (1/2)(ȳ₁ - ȳ₂)² (Equation 3, Section 12.4). Hence

F = n sȳ²/sp² = n(ȳ₁ - ȳ₂)²/(2sp²) = (ȳ₁ - ȳ₂)²/(2sp²/n) = t²,

and t has 2(n - 1) degrees of freedom, while F has 1 and 2(n - 1) degrees of freedom. The t-distribution is symmetrical, with zero in the center of the distribution (Fig. 8.1). The F-distribution is asymmetrical with zero as the lower limit (Fig. 9.1). A t-value may be either negative, zero, or positive, but t² can be only zero or positive, because the square of a negative number is positive. For t with 10 degrees of freedom, 2.5% of all t-values are less than -2.2281 and 2.5% of all t-values are greater than 2.2281, but (-3)² and 3² are both equal to 9 and greater than (2.2281)². Therefore, a total of 5% of all t²-values are greater than (2.2281)² or 4.96. This value 4.96 is the 5% point of F with 1 and 10 degrees of freedom. The F-distribution with 1 and ν degrees of freedom is then a doubled-up version of the t-distribution with ν degrees of freedom. The two tails of t are folded into the right tail of F. The center portion of the t-distribution becomes the left tail of the F-distribution. The square of the 2.5% point of t with ν degrees of freedom is the 5% point of F with 1 and ν degrees of freedom. This relation can be observed from the t-table and F-table. The 2.5% points of t for various numbers of degrees of freedom are 12.706, 4.3027, 3.1825, etc. The squares of these values
are 161.44, 18.513, 10.128, etc., which are the values given in the first column of the 5% F-table. Therefore, in testing a hypothesis that two population means are equal, either the two-tailed t-test or the one-tailed F-test (analysis of variance) may be used, and the two tests always yield the same conclusion. The fact that t² = F can be further substantiated. In Section 7.9, it is shown that
SS:    Σ(y - μ)² = n(ȳ - μ)² + Σ(y - ȳ)²    (4)
d.f.:      n     =     1      +   (n - 1)    (5)

Then

F = [n(ȳ - μ)²/1] ÷ [Σ(y - ȳ)²/(n - 1)] = n(ȳ - μ)²/s² = (ȳ - μ)²/(s²/n)    (6)
n-l
follows the F-distribution with 1 and n - 1 degrees of freedom. In Theorem 8.1a, it is shown that
(7)
follows the t-distribution with n - 1 degrees of freedom. It can be seen from Equations (6) and (7) that t l - F. The relation between the t-distribution and the F-distribution can be summarized in the following theorem:
Theorem 12.6 If a statistic t follows the Student's t-distribution with degrees of freedo~ t l follows the F-distribution with 1 and II degrees of freedom. II
12.7 Assumptions

The assumptions are the conditions under which a test of hypothesis is valid. In the analysis of variance, the assumptions are the conditions under which the statistic
F = (among-sample mean square)/(within-sample mean square) = (treatment MS)/(error MS)    (1)

follows the F-distribution. These conditions, which are already demonstrated in the sampling experiment of Section 12.2, are listed below:

(a) The k samples are random samples drawn from k populations.
(b) The k populations are normal.
(c) The variances of the k populations are equal.
Of course, equality of the k population means is another necessary condition under which the statistic F follows the F-distribution. But this condition is the hypothesis being tested. The hypothesis is considered true before the test, but it is subject to rejection after the test. The other three conditions listed above must be true, before and after the test. If the assumptions are not satisfied, the statistic F given in Equation (1) will not follow the F-distribution, and consequently the percentage points given in the F-table will not be correct. Then the 5% point is not really the 5% point but a different percentage point. Therefore, when the 5% point in the F-table is used in determining the critical region, the actual significance level is not exactly 5%, but more or less than 5%. Hence, the consequence of using the existing tables without satisfying the assumptions underlying a test of hypothesis is that the significance level is disturbed. Another conceivable consequence is that the probability of committing the Type II error may also be affected. The consequences of not satisfying the assumptions have been investigated by many workers. The results of these investigations are briefly summarized as follows:

(a) Randomness of samples. The randomness of the samples can be achieved by randomizing the experimental material (Section 10.9). The random number table may be used for this purpose. In other words, one avoids the consequences of non-randomness by making the samples random, instead of creating non-randomness and then worrying about the consequences.

(b) Normality of populations. Non-normality of the populations does not introduce serious error in the F-test or in the two-tailed t-test. If the F-table and t-table are used in determining the critical regions, the true significance level is actually larger than the one being specified. For example, if the 5% point of the F-table is used to determine the critical region, the true significance level is actually larger than 5%. Therefore, if the hypothesis is true, the rejection of it is more likely than the significance level indicates.

(c) Homogeneity of variances. If the variances of the k populations are not too much different, the F-test and the two-tailed t-test are not seriously affected. The effect of heterogeneity of variances can be reduced by using samples of the same size (Section 10.7).

In conclusion, a slight departure from the assumptions will not cause serious error in the F-test and the two-tailed t-test. In Chapter 23, methods are presented for correcting the departure from the assumptions.
12.8 Applications

The analysis of variance is a very comprehensive topic in statistics. What is presented in this chapter is only the simplest case. Originally,
the analysis of variance was developed by R. A. Fisher for use by agriculturists in analyzing data from their field experiments. At the present time, its applications have been extended to various experimental sciences, some examples of which are given in this section. A manufacturer has three (k = 3) different processes of making fiber boards and wishes to determine whether these processes produce equally strong boards. A random sample of, say, 20 (n = 20) boards is to be obtained from each of the products manufactured by the three processes. The strength (observation) of each of the 60 boards is determined. Then the analysis of variance may be used to test the hypothesis that the average strengths of the boards produced by the three different processes are the same. The analysis is as follows (the case of multiple observations for each board is discussed in Section 18.9):
Source of Variation        Degrees of Freedom
Among processes                 2 (k - 1)
Within processes               57 (kn - k)
Total                          59 (kn - 1)
Five different kinds of feed may be compared for their fattening ability. A total of 50 animals may be divided into 5 groups, at random, with the aid of a random number table (Section 10.9). The animals are individually fed, but each group of 10 animals is fed a different ration. The animals are weighed both before and after a feeding period. The gain in weight of each animal is an observation. Then the analysis of variance may be used in testing the hypothesis that the 5 groups of animals, on the average, gained the same amount of weight. The analysis is as follows:
Source of Variation        Degrees of Freedom
Among rations                   4 (k - 1)
Within rations                 45 (kn - k)
Total                          49 (kn - 1)
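The arithmetic behind such an analysis of variance may be sketched as follows. The sketch assumes numpy and scipy; the weight gains are simulated placeholders rather than data from the text, but the degrees of freedom, 4 and 45, are those of the table above:

```python
# A sketch of the one-way analysis-of-variance arithmetic for the feeding
# example: k = 5 rations and n = 10 animals per ration.
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
k, n = 5, 10
gains = rng.normal(60, 8, size=(k, n))       # hypothetical weight gains

grand_mean = gains.mean()
ration_means = gains.mean(axis=1)
among_ss = n * ((ration_means - grand_mean) ** 2).sum()     # k - 1 = 4 d.f.
within_ss = ((gains - ration_means[:, None]) ** 2).sum()    # kn - k = 45 d.f.

F = (among_ss / (k - 1)) / (within_ss / (k * n - k))
print(F, f.ppf(0.95, k - 1, k * n - k))      # F-value and the 5% point
```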
Two examples of possible applications of the analysis of variance are given in this section. The applications seem to be quite straightforward once the basic principles of the analysis of variance are mastered. However, it must be realized that these examples are deliberately chosen for their simplicity and, therefore, are deceiving. In general, the application of statistics involves the translation of the abstract idea of sample and population into a practical problem. This translation has never been proved easy. It is baffling even to the experts at times. In order to make an abstract idea (mathematical model) agree as closely as possible with a practical problem, the method of collecting data becomes all-important. The design of an experiment, which is a process of collecting data, is no longer an amateur's job. At the present time, research
organizations usually have consulting statisticians on their staffs. Their job is to recommend methods of collecting data to suit existing statistical methods, to modify the old statistical methods to fit practical problems, or to develop new statistical methods. An experimental scientist consults the statistician for this purpose before he starts his experiment and not after the experiment is completed, much as one consults an architect when he wishes to build a house. The purpose of consulting an architect is to obtain his advice before a house is built and not his blessing after the house is finished. An experimental scientist consults a statistician at the planning stage of an experiment mainly to take advantage of the statistician's knowledge. However, the limitation of the statistician's knowledge is also a reason for early consultation. Modern statistics is a very new field, as witnessed by the fact that most of the outstanding contributors to its development are still living. Therefore, even though a vast amount of statistical knowledge has been accumulated during the last 50 years, the number of statistical methods which can be directly applied to practical problems is still quite limited. In general, experiments are designed within the limits of existing methods. But if there is no existing method for dealing with a particular set of experimental data at hand, a statistician may be unable, or unwilling, to help the experimental scientist. The production of a tailor-made method usually requires both time and talent, and even the best brains do not produce drastically new methods in rapid succession. Moreover, a theorist, in statistics as in any field of science, usually prefers to select problems to suit his own talents and may not wish to spend his time on an assigned problem which he may or may not be able to solve. Therefore, the experimental scientist should consult a statistician before an experiment is started, to make sure that a statistical method is available, or that a method can be developed, to handle the experimental data.
12.9 Specific Tests

There are many tests associated with the analysis of variance. The F-test described in Section 12.5 is a general one. It only tests the hypothesis that the k population means are equal. The other tests are designed to test more specific hypotheses concerning the k population means. Some of these tests are presented in Sections 15.3, 15.4, 15.5, and 17.8. The advantages of these tests over the general F-test are discussed in those sections.
12.10 Unequal Sample Sizes

So far, the analysis of variance considered is the case of k samples, each consisting of n observations. In this section, the case of
unequal sample sizes is considered; that is, the k samples consist of n₁, n₂, ..., n_k observations respectively. The hypothesis and the assumptions are the same for both cases. The difference is mainly in the computing method, which is illustrated by the example given in Table 12.10a. The 3 (k = 3) samples consist of 2, 2, and 4 observations respectively; that is, n₁ = 2, n₂ = 2, n₃ = 4, and Σn = 8. It should be noted here that the total number of observations is Σn instead of kn. Each of the 8 observations can be partitioned into three components, that is,
"If "If ... ,
Y-
r+ (f - y) + (y - y).
(1)
The components of the observations are shown in Table 12.10b. The among-sample SS is

Σn(ȳ - ȳ̄)² = n₁(ȳ₁ - ȳ̄)² + n₂(ȳ₂ - ȳ̄)² + n₃(ȳ₃ - ȳ̄)² = 2(3)² + 2(1)² + 4(-2)² = 36,   (2)

with k - 1 or 3 - 1 or 2 degrees of freedom. The within-sample SS, which is the pooled SS of the k samples, is

[(-1)² + (1)²] + [(1)² + (-1)²] + [(0)² + (1)² + (-1)² + (0)²] = 6.

TABLE 12.10a

                      Sample No.
                   (1)    (2)    (3)
Observations         7      7      3
                     9      5      4
                                   2
                                   3
Sample Total, T     16     12     12      40  grand total, G
Sample Size, n       2      2      4       8  Σn
Sample Mean, ȳ       8      6      3       5  general mean, ȳ̄
T²/n               128     72     36     236  Σ(T²/n)
TABLE 12.10b

                       Sample No.
                  (1)        (2)        (3)
Components of    5+3-1      5+1+1      5-2+0
observations     5+3+1      5+1-1      5-2+1
                                       5-2-1
                                       5-2+0
Sample mean      5+3+0      5+1+0      5-2+0

y = ȳ̄ + (ȳ - ȳ̄) + (y - ȳ)
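The partition shown in Tables 12.10a and 12.10b can be verified with a short computation; the sketch below assumes numpy and uses the 8 observations of the example:

```python
# A sketch verifying the partition of the example of Tables 12.10a and
# 12.10b: among-sample SS = 36, within-sample SS = 6, total SS = 42.
import numpy as np

samples = [np.array([7, 9]), np.array([7, 5]), np.array([3, 4, 2, 3])]
grand_mean = np.concatenate(samples).mean()      # the general mean, 5

among_ss = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
within_ss = sum(((s - s.mean()) ** 2).sum() for s in samples)
total_ss = ((np.concatenate(samples) - grand_mean) ** 2).sum()

print(among_ss, within_ss, total_ss)             # 36.0, 6.0, 42.0
```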
The number of degrees of freedom of the within-sample SS is

(n₁ - 1) + (n₂ - 1) + ··· + (n_k - 1) = Σn - k,   (3)

instead of kn - k or k(n - 1). In this example, the number of degrees of freedom is 1 + 1 + 3 = 5. The general mean ȳ̄ is no longer the mean of the k sample means, but the weighted mean of the k sample means, with the sample sizes being the weights; that is,

ȳ̄ = (n₁ȳ₁ + n₂ȳ₂ + ··· + n_kȳ_k)/(n₁ + n₂ + ··· + n_k) = (T₁ + T₂ + ··· + T_k)/Σn = G/Σn.   (4)
It can be seen from the above equation that the general mean ȳ̄, being the grand total divided by the total number of observations, is still the mean of the composite of the k samples. In the short-cut method of computing F, the sample totals and the grand total are computed as in the case of equal sample size. But, in addition to these totals, the quantity T²/n must be computed for each sample (Table 12.10a). The details of the short-cut computing method of F for the given example are shown in Tables 12.10c and 12.10d.

TABLE 12.10c
Preliminary Calculations

(1)            (2)        (3)            (4)                   (5)
Type of        Total of   No. of Items   No. of Observations   Total of Squares per
Total          Squares    Squared        per Squared Item      Observation, (2) ÷ (4)
Grand            G²          1               Σn                G²/Σn       (I)
Sample           -           k               n                 Σ(T²/n)     (II)
Observation      Σy²         Σn              1                 Σy²         (III)

Analysis of Variance

Source of Variation    Sum of Squares SS    Degrees of Freedom    Mean Square MS    F
Among-sample           (II) - (I)               k - 1
Within-sample          (III) - (II)             Σn - k
Total                  (III) - (I)              Σn - 1
The average of the within-sample mean square is still σ², which is the variance common to the k populations (Equation 5, Section 12.4), but the average of the among-sample mean square is no longer σ² + nσμ² (Equation 4, Section 12.4), simply because n no longer exists.
TABLE 12.10d
Preliminary Calculations

(1)            (2)        (3)            (4)                   (5)
Type of        Total of   No. of Items   No. of Observations   Total of Squares per
Total          Squares    Squared        per Squared Item      Observation, (2) ÷ (4)
Grand           1,600        1                8                 200   (I)
Sample            -          3                -                 236   (II)
Observation      242         8                1                 242   (III)

Analysis of Variance

Source of Variation    SS    Degrees of Freedom    MS     F
Among-sample           36            2             18.0   15.0
Within-sample           6            5              1.2
Total                  42            7
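The short-cut computation of Tables 12.10c and 12.10d may be sketched as follows, using only the sample totals, the grand total, and the sums of squares; numpy is assumed:

```python
# A sketch of the short-cut method: the quantities (I), (II), and (III)
# of Table 12.10d, and the resulting analysis of variance.
import numpy as np

samples = [np.array([7, 9]), np.array([7, 5]), np.array([3, 4, 2, 3])]
T = [s.sum() for s in samples]                 # sample totals: 16, 12, 12
n = [len(s) for s in samples]                  # sample sizes:  2, 2, 4
G = sum(T)                                     # grand total: 40

q1 = G ** 2 / sum(n)                           # (I):   1600/8 = 200
q2 = sum(t ** 2 / m for t, m in zip(T, n))     # (II):  236
q3 = sum((s ** 2).sum() for s in samples)      # (III): 242

among_ss, within_ss = q2 - q1, q3 - q2         # 36 and 6
F = (among_ss / 2) / (within_ss / 5)           # 18.0 / 1.2 = 15.0
print(F)
```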
The sample sizes are now n₁, n₂, ..., n_k respectively. The number n is replaced by a number designated by n₀, where

n₀ = [Σn - Σn²/Σn] / (k - 1).   (5)

In the example given in Table 12.10a, n₁ = 2, n₂ = 2, n₃ = 4, and k = 3. Then

n₀ = (1/(3 - 1)) [(2 + 2 + 4) - (2² + 2² + 4²)/(2 + 2 + 4)] = (1/2)(8 - 24/8) = 2.5.   (6)
The value "0 is approximately equal to, but less than, the average sample size. For this example, the average sample size is 8/3 or 2.67. Since EquatioD (5) is given without proof, at least it should be verified for the case of equal sample size. If the sizes of the k samples are equal, it is known that "0 should be equal to ". Now, if the sample sizes are equal, that is, "t" "a - ••• III:: "lc - ", aDd " is substituted in Equation (5), the result is
"0 = k--1-1 ~k" -
r
t
k2na - k"a] k"a] 1 k,,2(k - -1= -k" k- 1 k" k- 1 k"
1)J
= " (7)
which is the number expected. This section shows that there is no difference in basic principles between the case of equal sample sizes and that of unequal sample sizes. But equalizing the sample sizes has definite advantages, which are discussed in the next section.
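Equation (5) is easily turned into a small routine. The sketch below (the helper name n0 is merely illustrative) reproduces Equation (6) and the verification of Equation (7):

```python
# A sketch of Equation (5): the quantity n0 for samples of unequal sizes.
def n0(sizes):
    k, total = len(sizes), sum(sizes)
    return (total - sum(m ** 2 for m in sizes) / total) / (k - 1)

print(n0([2, 2, 4]))      # 2.5, as in Equation (6)
print(n0([10, 10, 10]))   # equal sizes give n0 = n = 10, as in Equation (7)
```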
12.11 Advantages of Equal Sample Size
In the analysis of variance, the use of equal sample sizes has several advantages over the use of unequal sample sizes. These advantages are listed below:

1. The most obvious advantage of having equal sample sizes is the ease of computation. Calculation of the F-value requires a little more time if the sample sizes are unequal, but not much. The computation of the quantity Σ(T²/n) seems to require some time, but it can be obtained in one continuous operation on a desk calculator, after the values of T and n are obtained. However, the ease of computation is a definite advantage, not in the computation of the F-value, but in the specific tests mentioned in Section 12.9. This advantage will be seen in later chapters where these tests are discussed.

2. The second advantage is that the equality of sample size minimizes the effect of the heterogeneity of population variances. One of the assumptions of the analysis of variance is that the k population variances are equal (Section 12.7). If the sample sizes are equal, the consequence of not satisfying this assumption is not so serious as it is when the sample sizes are unequal (Section 10.7).

3. The third advantage of equal sample size is that the probability of committing a Type II error is minimized. The probability of committing a Type II error in the analysis of variance is influenced by the ratio
(σ² + nσμ²)/σ².   (1)
The larger this ratio, the smaller is the probability of committing a Type II error. The effect of n and σμ² on a Type II error is discussed in Section 12.5. Now the effect of replacing n by n₀ (Section 12.10) is considered. In the analysis of variance, the number of samples is k and the total number of observations is Σn. The distribution of the Σn observations among the k samples determines the value of n₀, which reaches a maximum when the sample sizes are equal, that is, when n₁ = n₂ = ··· = n_k = Σn/k. For example, let k = 4 and Σn = 40. When n₁ = n₂ = n₃ = n₄ = 10, n₀ = 10. When n₁ = 2, n₂ = 5, n₃ = 15, n₄ = 18, and Σn = 40,
n₀ = (1/(4 - 1)) [40 - (2² + 5² + 15² + 18²)/40] = (1/3)(40 - 578/40) = 8.52,
which is less than 10. In both cases, the total number of observations is 40. But the value of n₀ is greater in the case of equal sample sizes than in the case of
unequal sample sizes. If n₀ is greater, the ratio (σ² + n₀σμ²)/σ² is greater, and the F-value, on the average, is greater, if σμ² > 0. Thus the hypothesis that σμ² = 0 is more likely to be rejected if the hypothesis is false. Therefore, the probability of committing a Type II error is reduced by equalizing the sample sizes. However, if the hypothesis is true, that is, σμ² = 0, the value of n₀ has no effect on the ratio, which is equal to 1. Consequently, the probability of committing a Type I error is not affected by the value of n₀. Therefore, equalizing the sample sizes has every advantage. It can be proved algebraically that

n₀ = Σn/k - sₙ²/Σn = n̄ - sₙ²/Σn,   (2)
which indicates that n₀ is always less than or equal to the average sample size n̄. The magnitude of the difference between n₀ and n̄ depends on the variance, sₙ², of the sample sizes, where

sₙ² = Σ(n - n̄)²/(k - 1) = [Σn² - (Σn)²/k]/(k - 1).   (3)
If the total number of observations is the same, the larger the variation among the sample sizes, the larger is the probability of committing a Type II error. The algebraic proof of Equation (2) is as follows:

n₀ = [Σn - Σn²/Σn] / (k - 1)
   = Σn/k - Σn/k + [Σn - Σn²/Σn] / (k - 1)
   = Σn/k - [kΣn² - (Σn)²] / [k(k - 1)Σn]
   = n̄ - sₙ²/Σn.
Since equality in the sample sizes has so many advantages over inequality in the sample sizes, a scientist should make an effort to equalize the sample sizes in conducting his experiments.
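Equation (2) and the example of this section can be checked numerically with a sketch such as the following (again, the helper name is only illustrative):

```python
# A sketch checking Equation (2): n0 equals the average sample size minus
# the variance of the sample sizes divided by the total number of observations.
def n0(sizes):
    k, total = len(sizes), sum(sizes)
    return (total - sum(m ** 2 for m in sizes) / total) / (k - 1)

sizes = [2, 5, 15, 18]
k, total = len(sizes), sum(sizes)
mean_n = total / k
var_n = sum((m - mean_n) ** 2 for m in sizes) / (k - 1)   # Equation (3)

print(n0(sizes))                # about 8.52, as computed in Section 12.11
print(mean_n - var_n / total)   # the same value, via Equation (2)
```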
EXERCISES

(1) Four samples, each consisting of two observations, are given in the
following table:

(1)   (2)   (3)   (4)
 2     9     3     5
 4     5     9     3
(a) Express each of the 8 observations as the sum of three components, that is, y = ȳ̄ + (ȳ - ȳ̄) + (y - ȳ).
(b) Compute the total SS, among-sample SS, and within-sample SS directly from these components, and show that the total SS is the sum of the among-sample SS and the within-sample SS.
(c) Compute the three SS-values by the short-cut method and show that the values obtained are the same as those obtained in (b).
(2) Four random samples are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. The observations of the four samples are tabulated as follows:
(1)   (2)   (3)   (4)
 47    55    38    52
 42    65    46    67
 33    57    45    50
 60    74    45    42
 64    67    42    41
 49    48    56    47
 56    49    67    54
Assuming that the source of the samples is unknown, test the hypothesis that the four population means are equal, at the 5% level. Following the procedure given in Section 12.5, write a complete report. Since it is actually known that the four samples are drawn from the same population, state whether your conclusion is correct or a Type I error is committed. A Type II error cannot be committed in this case, because the hypothesis being tested is true. (F = 0.22 with 3 and 24 d.f.)
(3) Add 2 to each of the observations of Sample 1; add 3 to each of the observations of Sample 2; add 4 to each of the observations of Sample 3; and add 5 to each of the observations of Sample 4 of Exercise (2). Then the resulting samples become samples drawn from four populations with means equal to 52, 53, 54, and 55
respectively. Assuming that the source of the samples is unknown, test the hypothesis that the four population means are equal, at the 5% level. Since the four population means are actually known, state whether your conclusion is correct or a Type II error is committed. Note that a Type I error cannot be committed because the hypothesis is false. The purpose of this exercise is to demonstrate that a Type II error is likely to be made if the sample sizes are small and the hypothesis is not too far wrong. (F = 0.14 with 3 and 24 d.f.)
(4) Add 10 to each of the observations of Sample 1; add 30 to each of the observations of Sample 2; add 50 to each of the observations of Sample 3; and add 70 to each of the observations of Sample 4 of Exercise (2). Then the resulting samples become samples drawn from four populations with means equal to 60, 80, 100, and 120 respectively. Assuming that the source of the samples is unknown, test the hypothesis that the four population means are equal, at the 5% level. Since the population means are actually known, state whether your conclusion is correct or a Type II error is made. Note that a Type I error cannot be made, because the hypothesis being tested is false. The purpose of this exercise is to demonstrate that a false hypothesis is likely to be rejected with relatively small samples, if the hypothesis is far enough from the truth. (F = 37.60 with 3 and 24 d.f.)
(5) Since the sources of the samples of Exercises (2), (3), and (4) are known, it is possible to find the average of the among-sample mean square and that of the within-sample mean square (Equations 4 and 5, Section 12.4). What are these averages for each of the three exercises?
(6) For the two given samples, test the hypothesis that the two population means are equal. Use both the t-test (Section 10.6) and the F-test (analysis of variance) and show that t² = F.
1    2
4    6
5    8
4    9
7    7
5    5
(7) Thirty-four steers were divided into three groups at random. Each animal was individually fed and each group was given a different feed. All the animals had exactly the same feeding period. The initial and final weights in pounds for each steer are tabulated as below.
     Control            1/5 Molasses         1/3 Molasses
Init. wt. Final wt.  Init. wt. Final wt.  Init. wt. Final wt.
   632      940         688      890         683      866
   732     1045         697      903         708     1007
   729     1060         757      961         761     1002
   766     1017         760     1040         757      924
   752      998         766     1021         778      957
   790     1090         796     1052         787     1025
   783     1024         792      995         795     1035
   846     1110         801      985         806     1007
   844     1095         850     1068         852     1060
   855     1024         868     1019         858     1080
   859     1177         897     1154
   944     1234         911     1150
(Courtesy of Dr. W. W. Heinemann, Washington State College). Use the gain in weight (final minus initial weight) as the observation. Test the hypothesis that the three rations are equally good as cattle feeds. Use the 5% significance level. (A more elaborate method may be used on this set of data; see Exercise 9, Chapter 19.) (F = 6.78 with 2 and 31 d.f.)
(8) Twenty-five animals were divided into five groups at random. The effectiveness of carbon tetrachloride as an antihelminthic was tested on the animals. Each group of five animals was tested at a different time. Each animal was given an injection containing 500 Nippostrongylus muris larvae. On the tenth day after injection, the rats were killed and the adult worms recovered and counted. The numbers are given below:
Test Number
  1      2      3      4      5
 279    378    172    381    297
 338    275    335    346    274
 334    412    335    340    300
 198    265    282    471    213
 303    286    250    318    199
Is there a difference in the average numbers of adult worms recovered among the different tests? (Whitlock, J. H. and Bliss, C. I.: "A Bioassay Technique for Antihelminthics," The Journal of Parasitology, Vol. 29, 1943, pp. 48-58.)
(9) In a study of the effectiveness of carbon tetrachloride as an antihelminthic, each of seventeen albino rats received an injection of 500 Nippostrongylus muris larvae. On the eighth day after the
injection, the rats were treated with varying amounts of carbon tetrachloride dissolved in mineral oil via a stomach tube. On the tenth day after the injection the rats were killed and the worms recovered and counted. The numbers are given in the following table:
Dose per rat of carbon tetrachloride
Control   0.016 cc   0.032 cc   0.063 cc   0.126 cc
  279       328        229        210         63
  338       311        274        285        126
  334       369        310        117         70
  198
  303
Test the hypothesis that the dosage of carbon tetrachloride does not affect the average number of adult worms recovered. (Whitlock, J. H. and Bliss, C. I.: "A Bioassay Technique for Antihelminthics," The Journal of Parasitology, Vol. 29, 1943, pp. 48-58.)
(10) The following results were obtained in a study on the effect of oxidative rancidity in unsaturated fatty acids on the germination of bacterial spores.
The three treatments used were control medium (Yesair's pork infusion agar), control medium plus rancid oleic acid (Kreis +), and control medium plus oleic acid which was not rancid (Kreis -). Ten plates were made for each treatment using equal volumes of a spore suspension. After incubation, the number of bacterial colonies was counted for each plate. The organism used was a putrefactive anaerobe (strain P. A. 9679).
Control   Oleic, Rancid   Oleic, Not Rancid
   13          180               173
   36          173               212
   12          160               205
   41          149               180
   22          158               192
    6          193               151
   20          155               167
   11          160               201
   30          175               177
   28          188               179
Test the hypothesis that the average number of colonies is not affected by the treatments. (Roth, Norman G., and Halvorson, H.
O.: "The Effect of Oxidative Rancidity in Unsaturated Fatty Acids on the Germination of Bacterial Spores," Journal of Bacteriology, Vol. 63, 1952, pp. 429-435.)
(11) The following data are the per cent porosity of bricks fired at four temperatures.

Firing Temperature
2600°F   2675°F   2750°F   2800°F
 16.7     19.4     15.7     14.4
 14.4     14.8     13.2     11.8
 12.3     14.7     11.7      9.7
 12.7     12.7     11.8     11.4
 13.7     15.7     11.3     11.3
Is the average porosity of bricks affected by the firing temperature?
(12) The following data are measures of the work necessary to compress each of four samples of three materials.
Wool   Cotton   Spun Nylon
 14       5         13
 16       9         10
 11       8         10
 13       6         12
Test the hypothesis, at the 5% level, that the three materials require the same amount of work to be compressed.
(13) The following data are measurements of the sulfate resistance of five types of cement.
Type of Cement
  I      II     III     IV      V
.028   .036   .025   .008   .014
.039   .048   .033   .010   .019
.048   .058   .038   .013   .023
Do the five types of cement differ in their sulfate resistance?
(14) The following data are measurements of the tendency of oil to separate from grease, measured for six types of grease. What can you conclude from these data?
Grease Type
  A      B      C      D      E      F
1.98   2.16   1.65   1.32   1.45   1.49
2.15   1.98   1.63   1.24   1.43   1.65
2.28   2.02   1.67   1.27   1.41   1.50
(15) The following data are the densities of bricks fired at four temperatures.
Firing Temperature
2600°F   2675°F   2750°F   2800°F
 2.14     2.19     2.22     2.26
 2.20     2.20     2.23     2.25
 2.15     2.16     2.15     2.14
 2.34     2.33     2.34     2.34
 2.26     2.30     2.34     2.33
Is the average density of bricks affected by the firing temperature? (16) The strengths of five types, A, B, C, D, and E, of fiber-board were being compared. Four boards of each type were used. The modulus of rupture of eacb of the 20 boards is given in the following table:
  A      B      C      D      E
 519    424    392    462    465
 574    525    494    559    568
 633    549    541    565    613
 622    601    557    611    611
(Courtesy of Mr. A. D. Hofstrand, Oregon Forest Product Laboratory)
Test the hypothesis that the five types of fiber-board are equally strong.
QUESTIONS

(1) What is the hypothesis of the F-test of the analysis of variance?
(2) (a) Under what condition is F equal to zero? (b) If F = 0, should the hypothesis be rejected or accepted? Why? (3) If the k sample means are equal, what is the F-value? (4) If the k population means are equal, what is the F-value? (5) Under what condition is the within-sample SS equal to zero? (6) Write down the algebraic expressions of the among-sample SS, within-sample SS, and total SS. Define the notations used.
(7) The among-sample mean square is an estimate of what parameter or parameters? (8) The within-sample mean square is an estimate of what parameter or parameters? (9) Why is the F-test in the analysis of variance always a one-tailed test? (10) If the observations are measurements in inches, what is the unit of F? (11) What is the relation between the t-distribution and the F-distribution? (12) Which quantity in the analysis of variance table is called the pooled variance s² in the t-test? (13) In testing the hypothesis that two population means are equal, we may use either the t-test or the F-test (analysis of variance). If contradictory conclusions are reached by these tests, which one should be trusted? Why? (14) What are the advantages of having samples of the same size in the analysis of variance? (15) What are the assumptions underlying the analysis of variance?
REFERENCES
Bartlett, M. S.: "The Use of Transformations," Biometrics, Vol. 3 (1947), pp. 39-52.
Box, G. E. P.: "Some Theorems on Quadratic Forms Applied in the Study of Analysis of Variance Problems, I. Effect of Inequality of Variance in One-Way Classification," Annals of Mathematical Statistics, Vol. 25 (1954), pp. 290-302.
Cochran, W. G.: "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied," Biometrics, Vol. 3 (1947), pp. 22-38.
Crump, S. Lee: "The Estimation of Variance Components in the Analysis of Variance," Biometrics, Vol. 2 (1946), pp. 7-11.
Eisenhart, Churchill: "The Assumptions Underlying the Analysis of Variance," Biometrics, Vol. 3 (1947), pp. 1-21.
Irwin, J. O.: "Mathematical Theorems Involved in the Analysis of Variance," Journal of the Royal Statistical Society, Vol. 94 (1931), pp. 285-300.
CHAPTER 13
REVIEW

This chapter presents a condensed, comprehensive review of the basic principles and methods of statistics discussed in the preceding chapters. These preceding chapters show that statistics deals with the relation between populations and their samples and that the ultimate objective of statistics is to draw inductive inferences about the populations from given samples. Therefore statistical methods may be regarded as tools for making decisions based on incomplete information (sample). In terms of scientific experiments, these methods enable scientists to reach a general conclusion from the experimental data (sample). For generations, scientists in many fields have used mathematics as an aid in deduction. Now statistics may be used as an aid in induction.

13.1 All Possible Samples

One of the basic principles of statistics is that a statistic, such as the sample mean, changes from sample to sample, while a parameter, such as the population mean, is a fixed quantity. When all possible samples are drawn from a given population, and for each sample the value of the statistic is calculated, the frequency distribution of the statistic can be obtained. The purpose of drawing all possible samples is to establish the frequency distributions of various statistics, such as u, χ², t, and F. From these frequency distributions the percentage points can be tabulated, such as those given in the Appendix, and these tabulated values are used in computing the confidence interval and in determining the critical region in a test of hypothesis. All possible samples and the distributions of statistics are means to an end. In applied problems, only these tabulated percentage points are used. In estimating a parameter, only one confidence interval is calculated. In testing a hypothesis, only one value of a statistic is calculated. But in both cases the tabulated percentage points are needed. The purpose of performing the sampling experiments is not to show how statistics is applied, but to verify the distributions of various statistics such as u, χ², t, and F, and, consequently, to explain the meaning of the tables given in the Appendix. In the sampling experiments, only 1000 random samples are drawn instead of all possible samples. As a result, only 500 or 1000 values of a statistic are calculated. The 1000 values of t and 500 values of F are sufficient to show the general shapes of the distribution curves, but these numbers of values are not sufficiently large to determine accurately the 5% or 1% points of the distributions.
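The sampling experiment described above can be imitated with a pseudo-random number generator in place of the tag population; the sketch below, which assumes numpy, draws 1000 samples of size 5 from a normal population with mean 50 and variance 100 and computes a value of t from each:

```python
# A sketch of the sampling experiment: 1000 random samples of 5 observations
# from a normal population with mean 50 and variance 100 (standard deviation 10).
import numpy as np

rng = np.random.default_rng(0)
t_values = []
for _ in range(1000):
    sample = rng.normal(50, 10, size=5)
    ybar, s = sample.mean(), sample.std(ddof=1)
    t_values.append((ybar - 50) / (s / np.sqrt(5)))

# The 1000 values trace out the t-distribution with 4 degrees of freedom,
# though the tail percentage points are fixed only roughly by 1000 samples.
print(np.percentile(t_values, [2.5, 97.5]))
```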
It should be noted that the percentage points obtained from the sampling experiments are not exactly the same as those given in the tables. Therefore, the sampling experiments serve to demonstrate the meaning of the theorems rather than to prove them.

13.2 Relation Among Various Distributions

Various frequency distributions and various statistical methods are presented in the preceding chapters. Frequently more than one method is available for testing the same hypothesis and, at the same time, one test can be used in testing different hypotheses. For example, in testing the hypothesis that two population means are equal, either the Student's t-test (Theorem 10.4a) or the F-test (analysis of variance with k = 2) may be used. On the other hand, the hypotheses that two population variances are equal and that k population means are equal can both be tested by the F-test. All the various distributions, being generated through sampling from normal populations, are interrelated. The relations are as follows: (1) The relation between u and χ² is given in Theorem 7.6. If a statistic u follows the normal distribution with mean equal to zero and variance equal to 1, u² follows the χ²-distribution with 1 degree of freedom. (2) The relation between u and t is given in Theorem 8.1b. The t-distribution approaches the u-distribution as the number of degrees of freedom approaches infinity. (3) The relation between χ² and F is given in Theorem 9.7. If a statistic χ² follows the χ²-distribution with ν degrees of freedom, χ²/ν follows the F-distribution with ν and ∞ degrees of freedom. (4) The relation between t and F is given in Theorem 12.6. If a statistic t follows the Student's t-distribution with ν degrees of freedom, t² follows the F-distribution with 1 and ν degrees of freedom. (5) The relation between u and F can be deduced from the relation between u and t and that between t and F. If u follows the t-distribution with ∞ degrees of freedom, u² must follow the F-distribution with 1 and ∞ degrees of freedom. It should be noted that each of the three distributions, u, t, and χ², is a special case of the F-distribution. The various relations are summarized in Table 13.2, which is made in the form of the F-table. If Table 13.2 is the 5% F-table, the figures in the first column of the table are the squares of the 2.5% points of the t-distribution (Section 12.6). The figures in the last line of the table are the 5% points of χ²/ν (Table 5, Appendix). The lower left corner of the table is u². The 5% point of F with 1 and ∞ degrees of freedom is (1.960)² or 3.842. The four distributions are introduced in the order of u, χ², t, and F. In terms of Table 13.2, the introduction is started from the lower left corner (u²) and then extended to the whole bottom line. Then it is extended to the whole first column, and finally the whole table is covered.
TABLE 13.2
F-table

  ν₂ \ ν₁      1         2     3    ···       ∞
     1         t²
     2         t²              F
     3         t²
     ···
     ∞         u²            χ²/ν             1
As compared with the F-distribution, the other three distributions are relatively unimportant; but they are instrumental in introducing the statistical concepts. To simplify the subject of statistical methods, it might seem more practical to merely list a collection of methods, each for a particular purpose, and say nothing of the relations among the various methods. Yet anyone who reads research papers in the fields where statistics is used is seriously handicapped if he does not have some knowledge of the relations among the various distributions. For example, if one knows only that the t-test can be used to test the hypothesis that two population means are equal, he may become bewildered when he encounters the analysis of variance being used for the same purpose. In fact, it is quite common among beginners to use the analysis of variance to test the hypothesis that two population means are equal after the t-test is already performed, hoping that at least one test will produce a favorable conclusion (Section 12.6).
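The relations summarized in Table 13.2 can be verified from tabulated percentage points. In the sketch below, the scipy library stands in for the tables of the Appendix, and a very large denominator degrees of freedom stands in for ∞:

```python
# A sketch verifying the relations of Table 13.2 through percentage points.
from scipy.stats import chi2, f, norm, t

v = 10
big = 10 ** 7                                        # stands in for infinity
print(t.ppf(0.975, v) ** 2, f.ppf(0.95, 1, v))       # t squared = F(1, v)
print(chi2.ppf(0.95, v) / v, f.ppf(0.95, v, big))    # chi-squared/v = F(v, inf)
print(norm.ppf(0.975) ** 2, f.ppf(0.95, 1, big))     # u squared = F(1, inf): 3.842
```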
13.3 Tests of Hypotheses

The preceding section shows that the various distributions are introduced systematically. This section shows that the methods of testing various hypotheses are also introduced systematically. The methods are listed in Table 13.3. It should be noted that the F-test can be used in testing every hypothesis listed in the table. All four statistics, u, χ², t, and F, listed in Table 13.3 are pure numbers. Regardless of the unit of measurement used in the observations, the values of these four statistics remain unaffected. The length may be measured in inches or in centimeters; the yield of crops may be measured in pounds per plot or bushels per acre; the temperature may be measured in Fahrenheit or centigrade; whatever the unit of measurement, the values of u, χ², t, and F remain unchanged. As a result, the conclusions reached by the tests of hypothesis remain the same, even though the unit of measurement is changed.
The hypotheses listed in Table 13.3 are very simple.

TABLE 13.3

Hypothesis              No. of Parameters    Statistic       Section No.
σ² = σ₀²                       1             χ², χ²/ν, F     7.10 & 7.11; 7.10; 9.7
μ = μ₀                         1             u, t, F         7.9; 8.4 & 8.5; 12.6
σ₁² = σ₂²                      2             F               9.4 & 9.5
μ₁ = μ₂                        2             t, F            10.6; 12.5
μ₁ = μ₂ = ··· = μ_k            k             F               12.5

If two populations are available, it is a simple matter to determine which population has the greater mean. If μ₁ is known to be 49.2 and μ₂ is known to be 49.1, anyone can see that μ₁ is greater than μ₂ without committing any type of error and, furthermore, anyone can see that μ₁ - μ₂ = .1 without the aid of a confidence interval. Such simple arithmetic as this becomes a complicated statistical method when the populations are not available and samples must be relied upon. If ȳ₁ = 49.2 and ȳ₂ = 49.1, it cannot be concluded that μ₁ - μ₂ = .1, because the sample mean changes from sample to sample. A conclusion concerning the population means must be reached through the use of statistics. There are only two conditions under which statistics is not needed. Either the sample sizes are so large that the samples become the populations themselves, or the population variances are equal to zero. When all the observations of one population are the same, and, at the same time, all the observations of another population are the same (that is, both population variances are equal to zero), the difference between μ₁ and μ₂ can be determined with certainty by drawing one observation from each population. For example, all the observations in one population are equal to 49.2, and all the observations in another population are equal to 49.1.
Then the two population
means are 49.2 and 49.1 respectively. As a result, only simple arithmetic is needed to determine the difference between the two population means, and a statistical method is not needed for this purpose. Whenever conclusions about the populations are reached without using statistics, one or both conditions mentioned above must be fulfilled. However, a problem in which the variance is equal to zero frequently ceases to be of interest to scientists. For example, sociologists may be interested to know the income distribution of a country. But if the in-
come of every person is the same, immediately the problem itself disappears.

13.4 Significance

The word significance has a technical meaning in statistics. In general, it is used in connection with the rejection of a hypothesis. The meaning of the word significance depends on the hypothesis tested. This is the reason why various tests of hypotheses are presented before the word significance is introduced. In testing the hypothesis that the population mean is equal to a given value, say 60, the sample mean is said to be significantly different from 60 if the hypothesis is rejected. If the hypothesis is accepted, the sample mean is said to be not significantly different from 60. In testing the hypothesis that two population means are equal, the two sample means are said to be significantly different if the hypothesis is rejected, that is, if the conclusion is reached that the two population means are different. The mere conclusion that two population means are different does not imply that there is a substantial difference between them. The magnitude of the difference must be estimated by a confidence interval. The result of the analysis of variance is said to be significant if the conclusion is reached that the k population means are not equal. The word significance is used only in connection with statistics and never with parameters. Two sample means may be significantly different or not significantly different, depending on the rejection or acceptance of the hypothesis that the two population means are equal. But the word significantly is not used to modify the difference between two population means.

13.5 Sample Size

The sample size plays an important role in statistical methods. It seems intuitively obvious that the larger the sample size, the more accurate the result. However, the specific benefits to be derived from a large sample are not so obvious. Two types of errors are involved in testing a hypothesis. An increase in the accuracy of a test always means the reduction of the probability of committing one or both types of error. The probability of committing a Type I error is called the significance level, which may be fixed as large or as small as one wishes, regardless of the sample size. The advantage of having a large sample is to reduce the probability of committing a Type II error, after the significance level is fixed. If the hypothesis being tested is true, only a Type I error, if any, can be made. As long as the significance level remains the same and the
hypothesis being tested is true, a large sample has no advantage over a small one. If the hypothesis being tested is false, only a Type II error, if any, can be made. If the significance level remains the same, the probability of committing a Type II error will be decreased by an increase in sample size. In other words, a large sample is more likely to cause the rejection of a false hypothesis than is a small sample. This is the advantage of having a large sample in a test of hypothesis. In estimating a parameter by a confidence interval, the confidence coefficient and the length of the interval are of primary importance. The higher the coefficient, the more likely it is that an interval will catch the parameter. But the confidence coefficient is arbitrarily chosen and has nothing to do with the sample size. The advantage of having a large sample is to reduce the length of the interval, after the confidence coefficient is chosen. For example, the 95% confidence interval of μ is
ȳ - t.025 √(s²/n)   to   ȳ + t.025 √(s²/n).

The length of this interval is 2t.025 √(s²/n). As ȳ changes from sample to sample, the center of the interval will change. As s² changes from sample to sample, the length of the interval will change, even if the sample size remains the same. The increase in sample size will reduce the average length of such intervals.

13.6 Simplified Statistical Methods

The various hypotheses, and the methods of testing them listed in Table 13.3, are presented in the preceding chapters. Yet they are not the only methods which can be used in testing these hypotheses. During the last decade many simplified methods have been developed for testing the same hypotheses. Some of these methods, which are unfortunately named inefficient statistics, are usually presented in books on industrial quality control, but none of them is given in this book. The advantage of these simplified methods lies in the simplicity of computation. For example, the quantity SS, which is needed in every statistic listed in Table 13.3, is never needed for the simplified methods. However, for the same given observations drawn from a normal population, the probability of committing a Type II error is greater if the simplified methods are used instead of those listed in Table 13.3. This is the reason why the simplified methods are called inefficient statistics. In contrast, the methods listed in Table 13.3 may be called efficient statistics. But it should be realized that the increase in sample size will decrease the probability of committing a Type II error. Therefore, this
probability may be equalized by using larger sample sizes for the inefficient statistics than for the efficient statistics. The choice of a method of testing a hypothesis depends on the cost of computation and that of collecting observations. If the cost of collecting observations is low, larger samples may be obtained and the inefficient statistics may be used, to avoid complicated computation. If the observations are costly to collect, they should be treated carefully with efficient statistics. The analogy is like an orange juice squeezer. If oranges are sold at a penny a dozen, there is no need to buy an expensive and efficient squeezer to squeeze an orange dry. If oranges are sold at one dollar apiece, a good squeezer is needed to squeeze every drop of juice out of an orange. In addition to cost, human inertia also plays an important part in the choice of a method. One naturally likes to use a familiar method rather than to struggle with a strange one. At the present time, inefficient statistics are usually used in industrial plants, where observations are collected easily and cheaply. Efficient statistics are usually used by scientists in analyzing experimental data, which are usually collected at great effort and expense.

13.7 Error

The word error is used in several distinct senses in statistics. The computing error, the standard error, the Type I error, and the Type II error are really different kinds of errors. These different kinds of errors are listed below:
(1) A mistake made in computation is usually called an error. It may be due to the failure either of the calculator or of the operator. The seriousness of this kind of mistake usually is not realized by beginners in statistics. A mistake in computation, if undetected, may lead to a wrong conclusion, no matter how elaborate the statistical method applied to the experimental data.
(2) The term standard error does not imply that any mistakes are made. The sample mean changes from sample to sample. The standard error of the mean (Section 5.3) measures the deviations of the sample means from the population mean. The fact that sample means deviate from the population mean is a natural phenomenon.
(3) The word error in the error SS of the analysis of variance also does not imply that any mistakes are made. An error here is the deviation of an observation from the sample mean.
(4) The Type I and the Type II errors are actual mistakes; but they are not mistakes due to human or mechanical failure. They are natural consequences of drawing conclusions about populations while only samples are available.
QUESTIONS
(1) If the significance level remains the same, what is the advantage of
increasing the sample size in testing a hypothesis? (2) In testing a hypothesis with the same significance level, is a large sample or a small sample more likely to cause a rejection of the hypothesis? Why? (3) In testing the hypothesis that the population mean is equal to a given value, only one sample is needed. Why then are all possible samples dragged into the discussion? (4) What is the 1% point of F with 1 and ∞ degrees of freedom? Why? (5) What are the 1% and 5% points of F with ∞ and ∞ degrees of freedom? Why?
REFERENCES
Dixon, W. J. and Massey, F. J.: Introduction to Statistical Analysis, McGraw-Hill Book Company, New York, 1951.
Mosteller, Frederick: "On Some Useful 'Inefficient' Statistics," Annals of Mathematical Statistics, Vol. 17 (1946), pp. 377-408.
CHAPTER 14
RANDOMIZED BLOCKS

The term randomized blocks designates a particular design for an experiment. The design itself and possible applications of it are the same as those used in the method of paired observations (Section 8.7). However, in the latter method, only two treatments are involved, while in the randomized block design, any number of treatments may be involved in the experiment. As a result, the analysis of variance must be used, instead of the t-test, in testing the hypothesis that the treatment effects are equal.

14.1 Randomized Block Versus Completely Randomized Experiment

The randomized block design may be illustrated by the feeding experiment (Section 12.8) which involves 5 rations and 50 animals. The 50 animals are divided at random, with the aid of a random number table, into 5 groups, each consisting of 10 animals. Then each group is fed a different ration. Or this method may be modified by dividing the animals, before they are randomized, into 10 groups, each consisting of 5 animals. The grouping is done in such a way that the animals within a group are as nearly alike as possible in initial weight, age, etc. If the animals are pigs or mice rather than steers, each group may very well be animals of the same litter. Each group or litter of animals is called a block. Then the 5 animals in a block are assigned at random to the 5 different rations. An experiment of this kind, in which the randomization is preceded by deliberate selection, is called a randomized block experiment. If the 50 animals are assigned at random to the 5 different rations, without previous grouping into blocks, the experiment is said to be completely randomized. The applications of the analysis of variance, with one-way classification, as given in Chapter 12, are all completely randomized experiments. In the completely randomized experiment, the 50 animals are assigned at random to the 5 different rations, without any restriction. In the randomized block experiment, the animals are deliberately divided into 10 blocks and then the randomization is restricted within the blocks. This restriction of randomization differentiates these two designs of experiments. Many books have been written about the design of experiments, which is a specialized field of statistics. Long lists of designs are available for use by experimental scientists. One design differs from another according to the restrictions imposed on the randomization.
The sugar beet experiment described in Section 8.7 is an example of a randomized block experiment. The field map showing the relative positions of the plots is given in Fig. 8.7. This example of two treatments, with and without fertilizer, is to show the possible application of the method of paired observations. However, if the number of treatments is increased from two to four (for example, if 0, 50, 100, and 150 pounds of available nitrogen per acre are used in the experiment), each block will include 4 plots instead of 2, and the 4 treatments are then assigned at random to the 4 plots within a block. When the random number table is used, the treatments are given code numbers: 1, 2, 3, and 4. The plots within a block are numbered consecutively according to their geographic
4
PI .. No.
4
t
4
2
Fig. 14.1
order (Fig. 14.1). The treatment whose code number occurs first in the random number table is assigned to the first plot; the treatment whose code number occurs second in the table is assigned to the second plot, and so forth. The randomization of the treatments is repeated for each block. A field map of 4 treatments with 3 blocks is given in Fig. 14.1 as an illustration. Although the block and plot numbers are given separately in Fig. 14.1, in actual practice the two numbers are usually combined into one. For example, the third plot of the second block is called Plot No. 23, 2 for the block and 3 for the plot. The term block was first used in field experiments, where the word "block" meant a block of land. It was appropriate and descriptive. But it subsequently became a technical term used in experimental designs, and now even a litter of animals may be called a block. In the randomized block experiment, every treatment occurs once in each block. Then each block is really a complete experiment by itself. Therefore, the randomized block experiment is also called the randomized complete block experiment, as distinguished from the more complicated designs which are called incomplete block experiments. The number of complete blocks is also called the number of replications of an experiment. The experiment shown in Fig. 14.1 is called a randomized block
experiment with 4 treatments (4 different quantities of fertilizers) and 3 replications (blocks). The advantage of the randomized block experiment over the completely randomized experiment is discussed in Section 14.8.
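The randomization within blocks may be sketched as follows; a pseudo-random number generator, assumed here in place of the random number table, produces a fresh assignment of the 4 treatments for each of the 3 blocks:

```python
# A sketch of randomizing 4 treatments within each of 3 blocks, as in Fig. 14.1.
import numpy as np

rng = np.random.default_rng(0)
treatments = [1, 2, 3, 4]                     # code numbers of the treatments
for block in range(1, 4):
    order = rng.permutation(treatments)       # a fresh randomization per block
    for plot, treatment in enumerate(order, start=1):
        print(f"block {block}, plot {plot}: treatment {treatment}")
```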
14.2 Mechanics of Partition of Sum of Squares
For the randomized block experiment, the analysis of variance is used in testing the hypothesis that the treatment effects are equal. The mechanics of the partition of the sum of squares are illustrated by an example of 3 treatments and 4 replications given in Table 14.2a. The data may be supposed to have come from a feeding experiment consisting of 3 feeds and 4 litters of animals, or from a variety trial consisting of 3 varieties of wheat planted in 4 blocks in the field.

TABLE 14.2a

                  Treatment         Rep.         Rep.        Rep. effects
Replication     1     2     3     totals Tᵣ    means ȳᵣ      ȳᵣ - ȳ̄
     1          6     7     8        21           7             1
     2          8     7     9        24           8             2
     3          7     2     3        12           4            -2
     4          7     4     4        15           5            -1
Treatment
  totals Tₜ    28    20    24        72 (G)      24             0
Treatment
  means ȳₜ      7     5     6        18         ȳ̄ = 6
Treatment
  effects
  ȳₜ - ȳ̄        1    -1     0         0
The notations used here are the same as those used in Chapter 12. The letter k is the number of treatments and n the number of replications. In the given example, k = 3 and n = 4. The general mean and the grand total are still designated by ȳ̄ and G respectively. However, the notations ȳ and T need modification. In Chapter 12, the analysis of variance is the case of one-way classification. The sample mean is the treatment mean. The sample total T is the treatment total. However, in the randomized block experiment, the analysis of variance is the case of two-way classification: one for replication and the other for treatment. To distinguish these two classifications, the subscripts r or t are attached to ȳ and T. The ȳᵣ and Tᵣ are the mean and total of a replication, and ȳₜ and Tₜ are the mean and total of a treatment. The replication effect is (ȳᵣ - ȳ̄) and the treatment effect is (ȳₜ - ȳ̄). It can be observed from Table 14.2a that the sum of the replication effects is equal to zero and that the sum of the treatment effects is also equal to zero. Each of the kn or 12 observations can be expressed as the sum of the (1) general mean, (2) replication effect, (3) treatment effect, and (4) error,
that is,

y = ȳ̄ + (ȳᵣ - ȳ̄) + (ȳₜ - ȳ̄) + (y - ȳᵣ - ȳₜ + ȳ̄).   (1)
The above equation is an algebraic identity. After simplification, it can be reduced to y = y. The error term (y - ȳᵣ - ȳₜ + ȳ̄) appears to be complicated, but it is simply the left-over portion of an observation after the general mean, replication effect, and treatment effect are accounted for. For example, the observation in the third replication and the second treatment (Table 14.2a) can be expressed as

2 = 6 + (-2) + (-1) + (-1),

where 6 is the general mean, (-2) is the effect of the third replication, and (-1) is the effect of the second treatment. The error (-1) is the left-over portion of the observation after the general mean, the replication effect, and the treatment effect are accounted for. In other words, the error (-1) is the quantity needed to balance the equation.
TABLE 14.2b
y = general mean + replication effect + treatment effect + error

                     Treatment
Rep.         1            2            3
 1       6+1+1-2      6+1-1+1      6+1+0+1
 2       6+2+1-1      6+2-1+0      6+2+0+1
 3       6-2+1+2      6-2-1-1      6-2+0-1
 4       6-1+1+1      6-1-1+0      6-1+0-1
The 12 observations, each expressed as the sum of the four components, are shown in Table 14.2b. It can be observed in the table that the replication effects are the same for each of the treatments and that the treatment effects are the same for each of the replications. The sum of the replication effects and that of the treatment effects are both equal to zero. The sum of the errors, for each replication and for each treatment, is equal to zero. It can be shown algebraically that these characteristics of the components shown in Table 14.2b are present for any set of numbers, no matter what physical interpretation, such as treatment and replication, is attached to the kn observations. These characteristics reveal the physical meaning of the numbers of degrees of freedom. Since the sum of the k treatment effects is equal to zero, if (k - 1) treatment effects are known, the remaining one is automatically determined. Therefore, the treatments have k - 1 degrees of freedom. For example, the 3 treatment effects are 1, -1, and 0. If any two of them are known, the remaining one can be determined, because the sum of the three treatment effects is equal to zero. Therefore the treatments have 2 degrees
of freedom. Similarly, the n replications have n - 1 degrees of freedom. Since the sum of the errors, for each replication and for each treatment, is equal to zero, if the errors for (n - 1) replications and (k - 1) treatments are known, the remaining terms can be automatically determined. Therefore, the errors have (k - 1)(n - 1) degrees of freedom. For example,
the errors of the first 2 treatments and the first 3 replications of Table 14.2b are these:

-2    1    ?
-1    0    ?
 2   -1    ?
 ?    ?    ?
The errors of the last treatment and the last replication are not given but are designated by question marks. Since the sum of the errors is equal to zero for every replication and for every treatment, these missing errors can be determined. Therefore, the errors have (k - 1)(n - 1) or (3 - 1)(4 - 1) or 6 degrees of freedom. The sum of squares of the kn or 12 treatment effects,
nΣ(ȳₜ - ȳ̄)² = 4[1² + (-1)² + 0²] = 8,

is the treatment SS with (k - 1) or 2 degrees of freedom (Equation 3, Section 12.1). The sum of squares of the kn or 12 replication effects,
kΣ(ȳᵣ - ȳ̄)² = 3[1² + 2² + (-2)² + (-1)²] = 30,

is the replication SS with (n - 1) or 3 degrees of freedom. The sum of squares of the kn or 12 error terms,

(-2)² + (-1)² + ··· + (-1)² = 16,

is the error SS with (k - 1)(n - 1) or 6 degrees of freedom. The total SS, which is the SS of the kn observations, is
Σ(y - ȳ̄)² = (6 - 6)² + (8 - 6)² + ··· + (4 - 6)² = 54,

which is the sum of the replication SS, treatment SS, and error SS, that is,

54 = 30 + 8 + 16.
The algebraic proof of this identity is similar to that given in Section 12.1 and, therefore, is omitted here.
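Though the algebraic proof is omitted, the identity can be verified numerically for the data of Table 14.2a; the sketch below assumes numpy:

```python
# A sketch verifying the partition for Table 14.2a:
# replication SS + treatment SS + error SS = 30 + 8 + 16 = 54 = total SS.
import numpy as np

y = np.array([[6, 7, 8],       # rows: replications 1-4
              [8, 7, 9],       # columns: treatments 1-3
              [7, 2, 3],
              [7, 4, 4]])
n, k = y.shape
gm = y.mean()                                           # general mean, 6

rep_ss = k * ((y.mean(axis=1) - gm) ** 2).sum()         # n - 1 = 3 d.f.
trt_ss = n * ((y.mean(axis=0) - gm) ** 2).sum()         # k - 1 = 2 d.f.
error = y - y.mean(axis=1, keepdims=True) - y.mean(axis=0) + gm
error_ss = (error ** 2).sum()                           # (k-1)(n-1) = 6 d.f.

print(rep_ss, trt_ss, error_ss, ((y - gm) ** 2).sum())  # 30, 8, 16, 54
```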
14.3 Statistical Interpretation of Partition of Sum of Squares

The meaning of the partition of the sum of squares with two-way classification is to be illustrated by the tag population, which is a normal
population with mean equal to 50 and variance equal to 100. Six samples, each consisting of a single observation, are drawn from the tag population. The observations or samples are arranged in the form of 3 treatments and 2 replications as shown in Table 14.3a.

TABLE 14.3a

                Treatment          Rep.
Rep.         1      2      3       mean
 1          41     57     37        45
 2          47     55     57        53
Treatment
 mean       44     56     47      ȳ̄ = 49

Since all of the 6 observations
where Il is the mean of any of the 6 populations; Ii is the mean of the 6 population means; IL, is the mean of the 3 population means in the same replication; and IL, is the mean of the 2 population means in th e same treatment. The 6 population means, expressed in the form of Equation 0), are given in Table 14.3c. Note that the 6 population means given in Tables 14.3b and 14.:k are the same even though they are expressed in different forms. The quantity IL, - Ii is called the replication effect. It can be observed from Table 14.3c that the two replication effects are 5 and -5 which are originally introduced by adding 10 to each of the observations of the first replication and nothing to the second replication.
~
S
TABLE 14.3b Treatment
-
-- . I
Rep. 2 Treatment mean /-If
1 --
I
50 + 10 + 10 = 10 50 + 0 + 10 =60
-
Replication Mean
Replication Effect
2
3
p,.
JIr - ji.
50 + 10 + 20 = 80 50 + 0 + 20 = 10
50 + 10 + 30 = 90 50+ 0+30=80
80 10
5 -5
75
85
p=15
:
65
- - -
~ o
~
N ~ o ~
o n
1'0::
Treatment
effect /-If - ji
0
::lei
en
-10
0
10
0
-----
.cr ("")
'"""
~
14.3
INTERPRETATION OF PARTITION OF SUM OF SQUARES
203
TABLE 14.3c Treatmeat Rep.
--
1 2
1
2
-
--
75 + 5 -10 = 70 75 - 5 -10 =60
3
== =---=====-- =
75 + 5 + 0 =80 75 -5 +0 =70
75 + 5 + 10 = 90 75 - 5 + 10 =80
This modification of observations causes the average yield of the two replications to increase by 10 + 0) or 5. The effect 5 of the first repliclition is obtained by 00 - 5) and the effect -5 of the second replication is obtained by (0 - 5). In other words, an individual replication effect is defined in terms of the deviation from the average of all replication effects. The same is true for the treatment effects (Il t - ~). The treatment effects are originally introduced by adding 10, 20, and 30 to the observations of the first, second, and third treatments respectively. The average increase is 20 and the three treatment effects are -10, 0 and 10 respectively. After the population means are cbanged by introducing the replication effects, the observations will be changed correspondingly. >\n observation 4.3, from a population whose mean is 50, becomes 63, if the population mean is increased to 70. The error is defined as (y - 11) which is the deviation of an observation y from its population meaD /1. The error of the observation 43 is (43 - 50) or -7. If one adds (y - 11) to both sides of ~quation 0), the resuiting equation is
t(
Y = ji +
(2)
The above equation is the assumption regarding the kn population means of the randomized blocks. When Equation (2) is satisfied, it is said that the replication effect and treatment effect are additive. It can be seen in Table 14.3c that the effect of the first replication is 5 and that the effect of the first treatment is -10. The combined effect of the first replication and the first treatment is the sum of 5 and -10. Then the population mean of the first replication and the first treatment is (75 -5) or 70, where 75 is the mean of the 6 population means. This is what is meant by the statement that replication and treatment effects are additive. The word "additivity" refers to the relation among the population means. The replication effects and treatment effects discussed in this section
are parameters; while the same terms are used in the preceding section to describe the statistics whicb are the estimates of these parameters. To avoid possible confusion in later sections, the parameters and the corresponding sta tistics are listed in Ta ble 14.3d. It is shown in the preceding section that the replication SS, treatment SS and error SS can be obtained from the kn observations. If all possible samples of kn observations are drawn, all of the 1. 55-values will change
204
Ch. 14
RANDOMIZED BLOCKS TABLE 14.3d
Parameter -
-IJ.
General mean Replication effect Treatment effect Error
IJ., - -IJ. IJ., - IJ. Y-IJ.
Statistic
1 1,-' 1,-, Y-1,-1,+'
from sample to sample. The averages of each of the mean squares (Equations 4 and 5, Section 12.4) are given below: (a) Average of replication MS = (13 + kq~ (b) Average of treatment "S = (13 + nu~ (c) Average of error MS =~ The variance is the variance of the replication effects, that is,
u!
U
The variance
2 k(lt,. - iD2 = n-l ,
(3) (4) (5)
(6)
q! is the variance of the treatment effects, that is, 2
q,
k(It, - iDa =.
k-l
(7)
The variance q2 is the variance common to all kn populations. In testing the hypothesis that the treatment effects are the same, that is, q~ = 0, the statistic (Section 12.5)
F = treatment mean square error mean square with (k -1) and (k - 1)(n - 1) degrees of freedom, may be used. The assumptions involved in this test are already demonstrated in the example shown in Table 14.30. They coo be summarized as follows: 0) The replication and treatment effects are additive, that is, the len population means maintain the relation given in Equation (1). (2) The kn populations are normal. (The nonnality is demonstrated by the use of the tag population.) (3) The len populations have the same variance. (The equality of variances is demonstrated by the use of the tag population with only the mean changed.) (4) The kn samples, each consisting of a single observation, are drawn at random. The first assumption is new. The rest of the assumptions are the same as those in the analysis of variance with one-way classification. The consequences of not satisfying the assumptions (2), (3), and (4) are discussed in Section 12.7, but the consequence of not satisfying the first assumption is not well known at the present time (} 951).
14.4
205
COMPUTING METHOD
14.4 Compadag Method The mechanics of partitioning the total SS into replication SS, treatment SS, and error SS are shown in detail in Section 14.2. In this section, a short-cut method is presented for computing the various S.5-values and also the F-value. The basic principle of the short-cot method is to replace the means by the totals in all steps of computation. The details of the development of the short-cut method are already given in Section 12.3 and therefore, are not repeated here. TABLE 14.4a Preliminary CalcwatiOJls
....-(1)
(2)
(3)
(4)
(5)
Type of Total
Total of Squares
No. of Items Squared
No. of ObservaliollS per Squared Item
Total of Squares per Observalion (2) + (4)
Grand Replication Treatment Observation
I~ I~
ca
~
kll Ie
I n Ie
II
len
I
calkn I~rlle
I71/n
II
(I) (II) (Ill)
(IV)
Analysis of Variance Source of VariatiOJl
Sam of Squares
Replication Treatment Error Total
(II) - (I) (Ill) - (I)
SS
(I V) - (III) - (II)
(IV) -(I)
Degrees of Freedom DF
Mean Square MS
F
n-I Ie-I
+ (I)
(Ie - 1)(,.-1)
len -I
The notations used in this section are the same as those used in Section 14.2, that is, "---nomber of treatments n---number of replications 'Y---an observation ~, treatment total 'Y, treatment mean T replication total -' 'Y..c---replication mean (I ---grand total ' Y---general mean
-
206
Ch.14
RANDOMIZED BLOCKS
In the analysis of variance calcnlations, the len observations are first tabulated by replication and treatment aa shown in Table 14.3a. Then the replication totals, treatment totals, and the grand total can be obtained by addition. The means are not needed for computing the SS. values. The details of further computation are shown in Table 14.4a. The four S~values in the lower half of Table 14.48 are obtained by combining the items of column (5) of the upper half of the table. The numbers of degrees of freedom can be obtained by combining the corresponding items of column (3) of the upper half of the table. For ex· ample, the error SS is the combination (IV) - (lIn - (II) + (I) (column 5, Table 14.4a) and its number of degrees of freedom is len - Ie - n + 1 or (Ie - 1)(n - 1). How to combine the four quantities (I), (II), (III), and (IV) to make a certain SS can be made easy to remember by associating the means with their corresponding totals (Section 12.3). For example, the error is (y - Yt - Yr + y) (Equation 1, Section 14.2). The observation y is associated with the quantity (IV), which involves only individual observations. The treatment mean Y, is associated with the quantity (lin which involves treabDent totals. The replication mean r is associated with the quantity (II) which involves replication totals. The general mean is associated with the quantity (I) which involves the grand total. So the error SS is (IV) - (III) - (II) + <0.
r
r
TABLE 14.4b Pre liminary Calcu lations (3) (4)
(1)
(2)
Type of Total
Total of Squares
No. of Items Squared
Grand Replication Treatment Observation
5,184 1,386 1,760 486
1 4 3 12
I
(5)
No. of Observations per Squared hem
Total of Squares per Observation (2) + (4)
12 3 4 1
432 462 440 486
Analysis of Variance Source of Variation
Sum of Squares
55
Degrees of Freedom
Replication Treatment Error Total
30 8 16 54
3 2 6 11
DF
Mean Square MS
F
10.00 4.00 2.67
1.50
14.5
TEST OF HYPOTHESIS-PROCEDURE AND EXAMPLE
207
The total SS and its components of the example given in Table 14.2a are already found in Section 14.2. Now the short-cut method is used on the same 12 observations. The details of the computation are shown in Table 14.4h. It should be noted that the four Ss..values obtained by the short-cut method given in Table 14.4h are the same as those obtained previously in Section 14.2. The short-cut method does not introduce any new meaning to the analysis of variance. It is simply an easy way of obtaining the various Ss..values on a desk calculator. 14.5 Tesl of Hypo&besis--Proeeclure ad Example To test the hypothesis that the treatment effects are equal, that is, = 0, one may use the statistic
o!
F = treatmeut mean square error mean square with (Ie - 1) and (Ie - 1)(n - 1) degrees of freedom (Section 14.3). The procedure of the test, illustrated by the example given in Table 14.2a, is summarized as follows: 1. Hypothesis: The hypothesis can be stated in 3 different ways: (a) the population means of the treatments are the same, that is, Il, is the same for all treatments; (b) the treatment effects are equal to zero, that is, III - ii = 0, for every treatment; = 0 (Equation 7, Section 14.3). (c) 2. Alternative hypothesis: The alternative hypothesis is that the population means of the treatments are not the same, or that the treatment effects are not all equal to zero, or simply that O. 3. Assumptions: The assumptions are listed at the end of Section 14.3. 4. Level of Significance: The 5% significance level is used. 5. Critical region: The critical region is that F > 5.1433. (The numbers of degrees of ueedom of F are 2 and 6, as shown in Table 14.4h. The F-test of the analysis of variance is always a onetailed test.) 6. Computation of F: The details of the computation of F are shown in Table 14.4h. The F-value is equal to 1.50 with 2 and 6 degrees of ueedom. 7. Conclusion: Since the F-value is outside the critical region, the hypothesis is accepted. The conclusion is that the population means of the treatments are the same. It may also be stated that there is no significant difference among the treatment means (y). Examples of the applications of the randomized blocks are given in section 14.1. An actual field experiment is to be described here as another illustration.
o!
o!>
208
Ch. 14
RANDOMIZED BLOCKS TABLE 14.5a Replication Variety
1
2
A. BODDY Best B. John Baer c. SioDX D. Stokesdale E. T-5 F. T-l7
32.6 38.4 65.1 54.2 83.7 77.1
41.0 39.4 59.9 46.4 37.9 70.8
-
3
4
5
17.9 37.1 41.9 43.6 69.3 57.7
23.8 42.8 36.1 35.1 63.8 51.1
19.6 21.8 21.1 43.4 SO.7 46.5
(CoDrtesy of Mr. Thomas P. Davidson. Oregon State College) TABLE 14.5b Sonrce of Variation
Snm of Squares SS
DF
Mean Square MS
F
Replication Variety Error Total
1,987.5187 4,541.6840 2,033.4093
4 5 20 29
496.8797 908.3368 101.6705
8.93
8,56~.6120
An experiment was carried out to compare the yields of 6 varieties of tomatoes. The randomized block design with 5 replications was used. The yields (numbers of pounds of marketable fruit) of the 30 plots are given in Table 14.5a. The analysis of variance is given in Table 14.5b. Since the computed F-value of 8.93 wi th 5 and 20 degrees of freedom is greater than 2.7109 (5% point of F with 5 and 20 d.f.), the conclusion is that the 6 varieties of tomatoes do not have the same yield. To rank the varieties by the yields of tomatoes, further tests must be used (Sections 12.9 and 15.5). 14.6 Paired ObservatiOils aDd Randomized Blocks The method of the paired observations (Section 8.7) is a special case of the randomized blocks. In an experiment with 2 treatments and II replications, either method may be used, and the conclusions reached by these methods are always the same. Despite the apparent difference in tbe computing methods, the F-value of the randomized blocks is the square of the t-value of the paired observations. The numbers of degrees of freedom of F are k - 1 and (k - 1)(11 - 1) or 1 and II - 1, because k = 2. The number of degrees of freedom of t is II - 1. Therefore, the two-tailed t-test is exactly the same as the ODe-tailed F-test (Theorem 12.6). The t-value of the .sugar beet experiment of Section 8.7 is 5.014 with 9 degrees of freedom (Table 8.7). As an evidence of t 2 -= F, the F-value of the same set of data is computed. The details of the computation are given in Table 14.6. The F-value, with 1 and 9 degrees of freedom, is
14.7
209
MISSING OBSERVATION TABLE 14.6 Preliminary Calculatloa.s (1)
(2)
(3)
(4)
(5)
Type of Total
Total of Squares
No. of Items S._ecl
No. of Observetloa.s per Squared Item
Total of Squares per Observation (2) + (4)
Grud Block Fertilizer Observatioa.
13,269,991.94 1,335,673. ~ 6,701,171.14 1576,823.62
1 10 2 20
20 2 10 1
663,499.592 667,836.970 670,117.114 676,823.620
AIlalysis of Variaa.ce Source of Variatioa.
Sam of Squares
Block Fertilizer Error Total
4,337.378 6,617.522 2,369.128 13,324.028
SS
Degrees of Freedom
Meaa. Square MS
F
9
1 9
6,617.5220 263.2364
25.14
19
equal to 25.14 which is the square of 5.014. The 2.5% point of t with 9 degrees of freedom is 2.~22. The square of this number is 5.1174, which is the 5% point of F with 1 and 9 degrees of freedom. Therefore, the conclusions reached by the two methods must always be the same. 14.7 Missiag Observatloa While an experiment is being conducted, accidents may occur, especially when the experiment extends over months or years. Animals or plants might be killed, not by the treatments but by accidents. As a result, the original design used in the experiment would be destroyed and consequently a special computing method would be needed to cope with the missing observations. This section describes a method for the analysis of variance calculations, when one observation is missing from a randomized block experiment. The analysis of variance with unequal numbers of observations usually requires complicated computation except for the case of one-way classification (Section 12.10). If one observation is lost from a randomized block experiment, the usual method of computation (Section 14.4) is not applicable, because the numbers of observations for the " treatments are not equal. The exact computing method for the randomized block experiment with missing observations is quite complicated and time consuming. As a result, an approximate method is more commonly used. The method
210
Ch. 14
RANDOMIZED BLOCKS
given in this section is the approximate method. The basic principle of this method is to substitute a dummy value for the missing observation and still use the same compnting method as if no observation were missing. Consequently, the extra computing work involved is in determining the dummy value. Before the method of detennining the dummy value is given, the prin. ciple involved is to be illustrated by an example: A sample is originally planned to have 5 observations, but one is missing. The observations are 8, 4, 5, ~, and 3, where ~ is a missing observation. If the missing observation is i goored, the sample mean y = 20/4 = 5, and the SS is (8 - 5)2 + (4 - 5)2 + (5 - 5)2 + (3 - 5)1
= 14,
with 3 degrees of freedom. Now if y is used as a dummy to repl ace the missing observation, the new mean is 25/5 = 5 and the new SS is
(8 - 5)2 + (4 - 5)1 + (5 - 5)2 + (5 - 5)1 + (3 - 5)1 = 14. In other words, the use of the dummy value changes neither the mean nor the SSe The value is indeed a dummy. The number of degrees of freedom of the SS is still 3, because the two SS.values are really the same one. A dummy is not an observation. Four observations plus a dummy are still 4 observations and not 5, just as 4 persons plus 8 dummy are still 4 persons and not 5. The determination of the dummy value of the random· ized blocks is based on the same principle. The dummy value is a value which, when substituted for the missing observation, enables one to obtain by the usual computing method the same value for the error SS that could be obtained by a more complicated method. Even though the error SS is well cared for by the dummy value, the treatment SS is not and is slightly greater than that obtained by the complicated method. Therefore, the approximate method leads to an F·value slightly greater than it should be. This is the price one pays for using the approximate method. The dummy value, d, of a randomized block experiment, with one missing observation, is
d=
leT + "R - S (Ie - 1)(" - 1)
,
(1)
where T = sum of ,,- 1 observations of the treatment with the missing observation; R = sum of Ie - 1 observations of the replication with the missing observation; S = sum of Ie" - 1 observations.
An example of a randomized block experiment, with 3 treatments and 4 replications, is given in Table 14.7. The observation in the second treatment and the third replication is missing. First, the missing obser-
14.7
MISSING OBSERVATION
211
TABLE 14.7 Treatment Rep.
1
2
3
Total
13
1
5
4
2
4
5
3 4
6
-
2 4
8
14 =R
3
fj
6
15
Total
18
15 T
20
11
53 =5
vation is considered zero, and the treatment totals, replication totals grand total are obtained as usual. The total of the treatment with missing observation is T = 15. The total of the replication with missing observation is R 14. The grand total is S = 53. Then dummy value is :0:
d=
3(15) + 4(14) - 53 (3 - 1)(4 - 1)
and the the the
48
= - = 8. 6
Then this dummy value 8 is inserted in the data and treated as if it were an observation in further computation. The numbers of degrees of freedom of the replication SS and treatment SS are still (II -1) and (k - 1) respectively, because the numbers of replications and treatments are not changed by a missing observation. The number of degrees of freedom for the total SS is one less than the usual number of kll - 1. Since one observation is missing, the total number of observations is kll - 1 to start with; the number of degrees of freedom for the total SS is expected to be kll - 2. Then the number of degrees of freedom for the error SS has to be one less than the usual number (k - 1)(11 - 1). The formula for computing the dummy value for a missing observation is given without any justification. Now an example is used to show that such a formul a is not unreasonable to expect. Suppose every observation in Table 14.7 were equal to 3. Then it is conceivable that the computed dummy value should also be 3. If every observation is equal to 3, then T = 9, R = 6 and S = 33. The dummy value is
d
= 3(9) + 4(6) -
33
(3 -1)(4 - 1)
18 -=3 6
which is the value expected. If more than one observation is missing, the method still can be used in determining the dummy values. For example, suppose two observations are missing. For the first missing observation, an arbitrary value approximately equal to the other observations is inserted in the position
212
RANDOMIZED BLOCKS
Ch. 14
where the first missing observation occurs. Then this arbitrary value is cODsidered an observation, and the dummy value for the second missing observation is computed by Equation (1). Then this computed dummy value is inserted in the position where the second missing observation occurs, and a dummy value for the first missing observation is computed. This process of leaving one dwnmy value in the data and computing the other is repeated until neither dummy value is changed by further computation. Then these last two dummy values are inserted in the data and the analysis of variance is carried out as if the two dummy values were observations. The nombers of degrees of freedom for the replication SS and trcabncnt SS arc still (n - 1) and (k - 1) respectively, but the Dumbers of degrees of freedom for the error SS and total SS, are each two less than the usual numbers of degrees of freedom. If too many observatious of one replication or treatment are lost, that replication or treatment should be omitted from the experimental data. No dummy values should be computed. It should be borne in mind that when an observation is lost, it is lost forever. The computed dummy value does not in any way estimate the lost observation. The method of determining the dummy value is nothing more than a device to replace the more complicated computation. For further tests beyond the F-test (Section 12.9), the inequality of the uumbers of observations of the" treatments cannot be ignored.
14.8 Experl.ealal Error For the completely randomized design, the experimental emJr is the variation among the observations receiving the same treatment. In statistical terms, it is the deviation of an observation from its population mean (Section 14.3). It is measured by the population variance (71. However, the estimates of (71 are also referred to as experimental error. This is the reason why the within-sample mean square is also called the elTor mean square. In this section, the advantage of having a smaller experimental error is discussed. This discussion is centered around the two experimental designs-completely randomized and randomized bloclt experiments-but
the implication of the discussion on the advantage of having a smaller experimental error is quite general. The completely randomized experiment and the randomized block experiment are both valid methods of experimentation in the sense that the ratio of the treatment mean square to the elTor mean square, in each case, follows the F-distribution. But the randomized block experiment has a distinct advantage over the completely randomized one, if the experimental material or environment is heterogeneous. For example, 10 litters, each consisting of 5 animals, are used in a feeding experiment with 5 rations (Section 14.1). The litter mates are usually more alike
14.8
213
EXPERIMENTAL ERROR
than the non-litter mates. If the randomized block design is used, the 5 animals in a litter are given 5 different rations. In other words, the different rations are tested on very similar animals and therefore the difference in fattening ability among the rations will be more easily revealed, while in the completely randomized experiment, the 5 different rations may or may not be fed to animals of the same litter, and consequently the difference in rations will be partially concealed by the difference in the animals themselves. So the advantage of using the randomized blocks in this case can be seen intuitively. However, the advantage can be even more readily seen from Table 14.8 which shows the analyses of variance of the two designs. The basic difference in the two analyses is that the within-ration SS, with 45 degrees of freedom, of the completely randomized experiment, is partitioned into two components, namely, the litter (replication) SS, with 9 dewees of freedom, and the error SS, with 36 degrees of freedom, of the randomized blocks. If there is no difference among the litters (replicatioos), that is O'~ = 0, all the three mean squares are the estimate of the parameter O'J. However, if there is more > 0, the difference among the litters than within the litters, that is, variance O'J in the completely randomized design is the variance among all the animals of the experiment; while the variance O'J is the variance among the animals within litters. Therefore, the variance in the randomized block design is smaller than that in the completely randomized design. In other words, if there is more difference among the litters than within litters, the advantage of the randomized block design over the completely randomized design lies in the reduction of experimental error. And the reduction of experimental error leads to the reduction of the probability of committing a Type II error. The reason for this conse-
O'!
TABLE 14.8 Completely Randomized Source of VariatioD
d./.
AmoDg-ratioD WithiD-ratioD Total
4 45 49
Average Mean Square (Eq. 4 & 5. SectioD 12.4) O'J
+ l00'~(O'~::: O'~
lT J
Randomized Block Source of VariatioD Litter (rep.) RatioD Error Total
d./.
Avera~e Mean Square (Eq. 3. 4. & 5. Sectioa 14.3)
9
~
"
O'J O'J
36 49
+ 50'~ + 10~
214
RANDOMIZED BLOCKS
Ch. 14
quence can be seen from the fact that the average F-value is influenced by the ratio (Section 12.5)
Since n. is the number of animals used for each feed and u~ the variance among the fattening ability of the feeds, these two quantities are the same if the same feeds and the same animals are used in both designs. Both a smaller u 2 in the randomized blocks will result in a larger ratio. For example, if nu! == 200, and u2 == 100, the ratio == (100 + 200)/100 or 3. But if u 2 is reduced to 10, the ratio becomes (10 + 200)/10 or 21. The increase in the ratio will result in, on the average, a larger F-value; thus the hypothesis that u! == 0 is more likely to be rejected. In other words, when the experimental error is reduced, the probability of committing a Type II error is also reduced. This is an example of reducing the experimental error by using the proper experimental desifSD. However, the experimental error can also be reduced by physical means such as using an air conditioned laboratory to reduce the variation in temperature. Scientists in each field usually have their own techniques of reducing experimental error. Whatever method is used in reducing the experimental error, the consequence is the same. 14.9 Models
The two models of the analysis of variance-variance component and linear hypothesis-are described in Section 12.4. The randomized block design described in this chapter is the linear hypothesis model. In other words, the replication effects 11, - ii and the treatment effects 11, - ji (Equation 1, Section 14.3) are considered fixed parameters which do not change from sample to sample. In terms of the feeding experiment in the preceding section, the implication is that the conclusion reached concerns only the particular 10 litters of animals and the particular 5 rations. The liuear hypothesis model is quite realistic so far as the rations are concerned, because an experimenter is usually interested in certain particular rations and not in a random sample of many rations. But the linear hypothesis model is extremely unrealistic as far as the replications are concerned. Very seldom, if ever, is an experimenter interested only in particular litters of animals. These animals will die in time and the information obtained on them will die with them. So the 10 litters of animals involved in the experiment may be considered a random sample of animals from a breed. Then the conclusion reached through the experiment will apply to all animals of the same breed. In the field experiment, an experimenter is usually interested in certain varieties of a crop rather than in a random sample of all varieties of the seme crop. Thus, the linear hypothesis model is quite realistic. Yet
MODELS
14.9
215
the model is unrealistic so far as the replications are concerned. Few, if any, experimenters are interested only in reaching a conclusion solely concerned with a particular field. The experimental field is usually considered a random sample of many fields in the same region, and therefore the conclusion will bave an application mucb wider tban tbe particular experimenter's own field. For that reason it is more realistic to use the variance component model (Section 12.4) for the replication effects and the linear hypotbesis model for the treatment effects. In other words, a mixed model is more desirable. However, when the linear hypothesis model of the randomized blocks is cbanged to the mixed model, the F-test, where
= treatment mean
F
square
error mean square is still valid. Thus an experimenter may cboose eitber model to suit his experiments.
EXERCISES (1) (a) Express eacb of the 6 following observations as the sum of the
general mean, replication effect, treabnent effect, and error (Table 14.2b). Treatment Rep. 1 2
-
1
2
3
7 7
5 9
0
-
2
---'--.
(b) Find the replication SS, treatment SS, and error SS directly (rom the various effects. (c) Show that the total SS = replication SS + treatment SS + error SS. (d) Calculate the same 4 SS-values by the short-cut method (Section 14.4) and note that the values thus obtained are the same as those obtained iD (b). (2) The following 20 observations are drawn at random from the tag population and are tabulated in tbe form of 4 replications and 5 treatments: Treatment 1
2
3
4
40 70 46 : 53
54 38 52 45
54 46 54 62
44 53 29 28
Rep.
5 -
1
2 3 4
I
U
53 47 53
216
(3)
(4)
(5)
(6)
RANDOMIZED BLOCKS
Ch. 14
Pretending that the source of the observations is unknown, test the hypothesis that the population means of the treatments are equal, at the 5% level. Since it is actually known that the observations are drawn &om the same population, state whether your conclusion is correct or a Type [ error is committed. A Type II error CaDDot be committed iu this case, because the hypothesis being tested is true. (F = 1.64 with 4 and 12 d.f.) Add 10 to each of the 5 observations of the second replication and 20 to each of the 5 observations of the third replication of Exercise (2) and leave the first and fourth replications as they are. Then the revised observations are no longer samples drawn from the same population. Now test the same hypothesis as in Exercise (2). Note that the F-value in this exercise is the same as that of Exercise (2). Why? The purpose of this exercise is to illustrate that the variation among the replications, such 88 different blocks in the field or different litters of animals, can be eliminated in a randomized block experiment. (F = 1.64) Subtract 40 from each of the 4 observations of the lirst treatment and add 40 to each of the 4 observations of the fifth treatment of the data of Exerci se (3), and leave the second, third, and fourth treatments alone. Now test the hypothesis that the population means of the treatments are equal, at the 5% level. Since the population means of the treatments are not equal, state whether your conclusion is correct or a Type II error is committed. The Type I error cannot be committed, because the hypothesis being tested is false. (F = 34.52) Since originally (Exercise 2) the 20 observations are drawn from the tag population whose mean is equal to 50 and whose variance is equal to 100, the means of the 20 populations of Exercise (4) are actually known. (a) Find the mean of the 20 populations (Table 14.3b). (b) Express each population mean in the form of Equation (1), Section 14.3 (Table 14.3c). (c) Find the average replication mean square if all possible samples of the same size are drawn (Equation 3, Section 14.3). (558) (d) Find the average treatment mean square if all possible samples of the same size are drawn (Equation 4, Section 14.3). (3300) (e) Find the average error mean square if all possible samples of the same size are drawn (Equation 5, Section 14.3). (100) Suppose the data given in Exercise (4) are mistaken for those of a completely randomized experiment. Carry out the analysis of variance calculation by the method given in Section 12.3. Compare the F-value thus obtained with that in Exercise (4). What is the effect
217
EXERCISES
of the mistaken identity on the F-vslue? The purpose of this exercise is to show that the probability of committing a Type II error is 8l'eatly increased if the one-way analysis of variance is used by mistake on a randomized block experiment. (7) The method of "paired observations" is identical to the analysis of variance of 2 treatments with n replications (Sections 8.7 and 14.6). Use both paired t and the analysis cl variance on the following data to test the hypothesis that the population means of the two treatments are equal. Show that t' = F. Give the numbers of degrees of freedom for both t and F. Treatment
Rep.
1 2 3 4 5
1
2
50 57 42 63 32
55 44 37 40 52
(8) Thirteen strains of hops were planted in a yield trial in a randomized block experiment with 5 replications. The yields (dry weight in pounds) of the 65 plots are given in the following table: Strain of Hops Fuggles Walker 0-104 0-105 0-107 0-110 0-201 0-203 0-207 0-211 0-214 0-307 0-407
Replication 1
2
3
4
5
11.50
11.25 21.02 24.76 21.31 19.64 13.46 23.22 17.45 15.60 18.45 18.00 16.85 19.31
13.36 23.80 23.81 21.20 21.21 10.49 19.54 22.39 16.90 19.01 22.52 26.15 18.54
8.76 13.29 17.30 16.41 13.• 21 9.38 15.66 14.88 9.51 16.62 16.82 14.58 14.25
10.28 11.11 18.62 16.36 17.30 9.00 15.50 10.10 10.06 12.86 17.35 14.58 12.38
15.78 24.29 24.65 25.35 15.12 16.65 16.91 17..96 27.15 19.37 16.00 21.95
(This experimcilt was conducted by Dr. K. R. Keller with whose permission these data are published.)
Test the hypothesis that the yields of the 13 strains of hops are the same, at the 5% level. (F = 7.94 with 12 and 48 d.f.)
218
Ch. 14
RANDOMIZED BLOCKS
(9) An industrial concern has been plagued with a high accident rate among its drill press operators. Among other factors, the method of lighting and painting the machines was suspected as being a cause. Two firms of color consultants snggested somewhat different schemes, so both were tried. Of three similar sections of the plant, one is left the same, one is modified according to Plan A, and the third according to Plan B. From each section, 46 operators were chosen on the basis of matching for accident rate and production rate, i.e., we have 46 matched trios of subjects. Results are expressed in terms of the difference in half days lost due to accidents between the six~onth period under observation and the previous six months for the same worker. Negative values mean fewer accidents, etc. What do· you conclude about the effectiveness of the changes? Plan A
No Change
Plan B
2 0 5
-6 -4 -2
7 -1
3 0
4
8
-12 3
-2
0
-6
-4
-5 8
-2 -9
10 -10
-7
-13 -1 -5 -5 -2 -10
-5
-4 2
6
0 -5
-2 0
-8 4
2
-6 -4 -8
-6
-3
1
-8 -11 4
-4 -5 2 13
-4 2
-s
4
-6
0 -2 10 9
-3 3
0
-6
8 1
-2
-5
-2
0 14
-1 -6
-6
-2
-3 1 -1 1 -2 -5 5
0
4 -4 4 -6 -6
-3 -1
5
4 4
-10
5
-5
-6
-I -9
-1
1
-4
10
-8
4 4 5 -10 2 0
-4
-8 -3
-7
0 -8
-4 -4 6 -5 5 2
-2
-6 0 5 -9
0 -8 4 -4 -2 -5
1
(10) In a study of bacterial counls on milk by the plate method, twelve plates were counted by each of ten observers. The twelve plates were arranged along a bench illuminated by daylight. The position of each plate was kept fixed in order so that each plate should be counted under approximately the same lighting conditions by each observer. The observers moved from plate to plate in rotation. The following results were obtained.
219
QUESTIONS
.-
-Observer Treatment
A B
C D E
F G H I
J
Plate (Replication)
1
2
3
4
5
6
7
8
298 222 244 323 268 274 334 282 288 212
209 141 175 202 183 198 214 178 190 232
95 63 65 80 65 75 68 69 66 65
250 249 131 198 194 150 198 144 204 149
401 292 301 323 408 464 504 399 330 352
91 79 79
145 161 122 181 152 171 161 133 159 129
251 397 342 416 452 424 436 318 385 226
97
70 77
78 67 78 73
9
10
109 112 118 93 87 78 139 112 110 73 98 78 121 74 117 77 108 74 97 90
11
12
101 94 85 98 76 87 92 86 70 79
146 163 136 161 154 191 192 154 180
ISO
Test for a difference among the observers. (Wi lson, G. S.: "The Bacteriological Grading of Milk," His \fajesty's Stationery Office, London, 1935.) (11) The following are observed pH readings from the top, middle and bOllom of six core samples of soil. Sample No.
Top
Middle
Bottom
7.5 7.2 7.3 7.5 7.7 7.6
7.6 7.1 7.2 7.4 7.7 7.7
7.2 6.7 7.0 7.0 7.0 6.9
f---
1 2 3 4
5 6
Does the average pH reading vary with the depth of the soil?
QUESTIONS (l) A randomized block experiment consists of 8 treatments and 5 repli-
cations. The replication mean square, treatment mean square, and error mean square are the estimates of what parameters? Use numbers rather than k and n in the answers. Define all the mathematical notations used. (2) What is the meaning of the statement that the replication effects and treatment effects are additive? (3) What are the two models of the analysis of variance? Why is the mixed model more realistic for the randomized block experiment? (4) A randomized block experiment consists of 6 replications and 10 treatments. What are the numbers of degrees of freedom for the replication, treatment, and error? (5) In testing the hypothesis that the population means of two treatments are equal, either the t-test or the analysis of variance may be used.
220
RANDOMIZED BLOCKS
Ch. 14
The relation between these two methods is that t;I = F. There are, however, two different versions of t-paired and unpaired observations. If each of the two treatments consists of 20 observations, what are the numbers of degrees of freedom of t for the two versions and what are the nombers of degrees of freedom for the corresponding F -values? (6) What is the consequence of misusing the one-way analysis of variance on the data of a randomized block experiment? Is the conseryuence serious? (7) What is the consequence of misusing the two-way analysis of variance on a completely randomized experiment? Is the consequence serious? (8) What is the advantage of the randomized block experiment over the completely randomized experiment? (9) A computed dummy value is usually inserted in the experimental data to take the place of a missing observation. Is the information lost by the missing observation fully or partially recovered by the dummy value? (10) What is the advantage of reducing the experimental error in a test of hypothesis?
REFERENCES Anderson, R. L. and Bancroft, T. A.: Statistical Theory in Research, McGrawHill Book Company, New York, 1952. Cochran, W. G. and Cox, G. M.: E"perimental Designs, John Wiley & Sons, New York, 1950. Cochran, W. G.: "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied," Biometrics, Vol. 3 (1947), pp. 22-38. Crump, S. Lee: "The Estimation of Variance Components in the Analysis of Variance," Biometrics, Vol. 2 (1946), pp. 7-11. Crump, S. Lee: "The Present Status of Variance Components," Biometrics, Vol. 7 (1951), pp. 1-16. Eisenhart, Churchill: liThe Assumptions Underlying the Analysis of Variance," Biometrics, Vol. 3 (1947), pp. 1-21. Fisher, R. A.: The Design of E"periments, 6th Edition, Hafner Publishing Company, New York, 1951. Irwin, J. 0.: "Mathematical Theorems Involved in the Analysis of Variance," Journal of Royal Statistical Society, VuJ. 94 (J93l), PI" 285-300.
CHAPTER IS
TESTS OF SPECIFIC HYPOTHESIS IN THE ANALYSIS OF VARIANCE For an experiment with Ie treatments, the statistic treatment mean square
F ---------error mean square
is used in testing the hypothesis that all treatment means Il, are equal. This chapter describes some methods of testing various specific hypotheses concerning the treatment means, in addition to the general one that all treatment means are equal. 15.1 Li.ear ColDbl.aUo. A lilleaT combi1l4lioll of II
11
observations, y" y» ••• Y", is defined as
= I.My ., M11, + MaYa + ••. + M"y"
(1)
where the M's are constant multipliers. For example, the observations are 20, 50, 40 and 30 and the corresponding multipliers are 2, 3, 4 and 5 respectively. Then the linear combination of the 4 observations is II -
(2)20 + (3)50 + (4)40 + (5)30 .. 500.
One of the most familiar linear combinations of the 11 observations is the sample total I.y. When all the multipliers are equal to 1,the linear combination of the n observations is (2)
When the same 4 observations 20, 50, 40 and 30 are used this linear combination is II -
8S
an example,
(1)20 + (1)50 + (1)40 + (1)30 - 20 + 50 + 40 + 30 - 140
which i. the total of the 4 observations. Another familiar linear combination of the 11 observations is the sample mean f. When all the multipliers are equal to 1/n, this linear combination of the n observations is (3)
221
222
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch. 15
In terms of the example of the 4 ohservations 20, 50, 40, and 30, this linear combination is
v .. ~(20) + 1-<50) + ~(40) + ~(30) .. ~(20 + 50 + 40 + 30)
= 35.
which is the mean of the 4 observations. A linear combination is not limited to combining observations. It can also be applied to other items such as sample means. For example, the difference between two samples means fa and fa is B linear combination of the two sample means, with the two multipliers being 1 and -1 respectively, that is, (4)
r
The general mean (Equation 4, Section 12.10) is a linear combination of the k sample means, with the multipliers being n~n, n/"i.n, ••• , n Ic~n, that is,
"1) -
("a)
v= (!on Yl+ l':n ra+"'+
("Ic' InJYIc=
"Jl+"aYa+" . +"lcflc =y. = In
(5)
The above examples of linear combinations are cited for four purposes: (a) To illustrate the meaning of linear combinations. (b) To show that the idea of linear combinations is not new. (c) To generalize many statistics previously treated, such as Iy, f, etc. and also many statistics treated in later chapters such as the regression coefficient (Chapter 16). (d) To show that the multipliers are not arbitrarily chosen, but are purposefully selected.
r,
15.2 Distribution of Linear CombinatioDs
A linear combination v of n observations (Equation 1, Section 15.1) changes from sample to sample, because the observations themselves change from sample to sample. This section deals with the properties of the distribution of the linear combination v. These properties are summarized in the following theorem:
Theorem 15.2 If all fDssible samples of n independent observations Yl' Ya' ••• Yn are drawn from n normal populations with means equal to Ilu Ila ••• , Il n and variances equal to uL u~ ... , u! respectively, and for each sample, the value of the linear combination v
= IMy = MtYl + Mara + ••• + MnYn
(1)
is computed, the distribution of the v-values follows the normal distribution with the mean equal to IlI1
=:
MiIl' + Malla + ... + Mnlln
(2)
15.2
223
DISTRIBUTION OF LINEAR COMBINATIONS
and the variance eqU4/. to (11.II
= M2..1. f' , + !tI~2 r 2 + ••• + M2(11 ra rae
(3)
This theorem is to be verified by a sampling experiment, the details of which are given in Chapter 4. Briefly, 1000 random samples (that is, observations are independent; Section 4.2), each consisting of 5 observations, are drawn from the tag population which is a normal population with mean equal to 50 and variance equal to 100. For each sample, the linear combination
v .. (-2)y, + (-1)Y2 + (O)y. + (1)y. + (2)y.
(4)
.. -2y, -Y2 + Y. + 2y.
is calculated. The first sample shown in Table 4.2 consists of the 5 observations 50, 57, 42, 63, and 32 and the value of v is
v
= -2(50) - 57 + 63 + 2(32) = -30.
The second sample consists of the 5 observations 55, 44, 37, 40, and 52, and the value of v is
v - -2(55) - 44 + 40 + 2(52) .. -10. A value of v is calculated for each of the 1000 samples. The frequency distribution of these 1000 v-values is shown in Table 15.2. TABLE 15.2 II
Theoretical r.f. (%)
Observed r.f. (%)
Observed r.c.f. (%)
.6 2.4 7.6 16.2 24.0 23.1 16.1 7.2 2.3 .5
.5 2.9 7.5 17.1 23.7 22.B 14.5 7.6 2.B .6
.5 3.4 10.9 28.0 51.7 74.5 89.0 96.6 99.4 100.0
100.0
100.0
-
Below -79.5 -59.5 -39.5 -19.5 0.5 20.5 40.5 60.5 Above
-79.5 to -59.5 to -39.5 to -19.5 to 0.5 to 20.5 to 40.5 to 60.5 to BO.S 80.5
Total
(This sampling experiment was conducted by about 75 students at Oregon State College in the Spring of 1953).
Since the 5 observations of each sample are drawn from the same population, the 5 population means must be equal, that is, Il, ., Il2
= Il3 .. Il. = Il.
R::
50,
224
HYPOTHESIS IN THE ANALYSIS OF VAKIANCE
Ch. 15
and the 5 population variances must be equal, that is,
ut- u~ - u~ -u: -u: = 100. The 5 multipliers are -2, -1, 0, 1, and 2 respectively. Therefore, the mean of v (Equation 2) is 1'-11 - (-2)50 + (-1)50 + (0)50 + (1)50 + (2)50 - 0,
and the variance of v (Equation 3) is
u! - (-2)2].00 + (-1)1100 + (O)'}OO + (1)1100 + (2)2100 _ [(_2)' + (_1)' + (0)' + (1)' + (2)'] 100 - 1000,
and the standard deviation of v is ,,'t000 or 31.6. The distribution of the 1000 u-values is plotted on the normal probability graph paper (Fig. 15.2). The fact that the points are almost on a straight line indicates that v 99.S 99
r.<./.
90
I
, , ,
----------------
~,
I
I I I
--------,-----------___
so
10
. . . .--.. . . .
_1IO---_60---_~40-----
-...:...----'----80"""-~O.S
Fig. 15.2
IS.2
22S
DISTRIBUTION OF LINEAR COMBINATIONS
II
which follows the nonnal distribution (Section 3.3). The value of corresponds to the SO% point is the mean of It can be observed from the graph (Fig. IS.2) that the mean of the 1000 u-values is almost exactly equal to 0 a. expected. The value of f} which correspond. to the 84% point is 1'-" + (I,,' It can be observed &om the graph (Fig. 15.2) that this value i. almost exactly equal to 31.6 as expected. The sraph of Fig. IS.2 thus verifie. Theorem IS.2. The properties of the distribution of sample means (Theorems S.2b and S.3) can be deduced &om Theorem IS.2. When all the multipliers (M) are equal to l/n, the linear combination f} of the n observations of a sample is f(Equation 3, Section IS.1), that is,
II.
I: -
II = (~)rl + (~)rl + ••• + (~)rn _
f.
Therefore, by Theorem 15.2, the sample mean f follows the normal distribution. Since all the observations are drawn &om the population with mean" and variance (I', the mean of f) is
~- (~)" + (~)I'- + •.• + (~)I'- _n(~)p. and the variance of II is "" _
(I! _(I; _( ;;1)1
(12
a
1'-,
+ (1)2 ;;- (13 + .•• + (1)2 ;;- (12 - n (1)2 ;; (13 _ (I' ;-.
These results, which are derived from Theorem lS.2, are the same as those given in Theorems S.2b and S.3. The properties of the distribution of the difference between two independent sample means, f. - fl' (Theorems 10.la, 10.lb, IO.Ie) can also be deduced from Theorem IS.2. When the multipliers are 1 and -1, the linear combination of the two independent sample means is
II = OW. + (-1)r2 = f. - rl' Since it is known that (Theorem 5.3)
then, by Theorem IS.2, and
226
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch.ls
Furthermore, by Theorem 15.2, v = fl - f2 follows the normal distribution. These results, which are derived from Theorem 15.2, are the same as those given in Theorems 10.1a, 10.1b and 10.lc. The examples cited above show that Theorem 1s~2 is a very useful theorem which generalizes many theorems in statistics. IS-SI.dlvid... Degree of Freedom In the analysis of variance described in Chapters 12 and 14, each of the Ie sample (treatment) meaDS rl' Ie is based OD II observatioDs which are drawn at random from k normal populations with equal variance ql (Section 12.7). An individual degree of freedom is a linear combination of these treatment means. It is used to test various hypotheses concerning the population means of the k treatments. In Chapters 12 and 14, the only hypothesis tested is that the k population means are equal. It is known (Theorem 5.3) that the mean of all possible sample means is equal to the population mean, that is,
ra, ... , r
ILyI .. Ill; IlYa = Il~ ••• ; #Lyle - Ille
(1)
and that the variance of all possible sample means is equal to the population variance divided by the sample size, that is, q2 _q2 _
Y
n'
If the linear combination of the k sample (treatment) means is v = IMy a Msfl + M52 + ... + M,;yIe'
(2)
then v follows the normal distribution (Theorem 15.2) with mean equal to Ilv - MsIll + M~2 + ... + M"""
(3)
and variance equal to q2 q2 .. M~-+ v n
q'
q2
q2
M!-+ ... + Mi-- n n n
(~.Jla).
Consequently, the statistic (5)
follows the normal distribution with mean equal to zero and variance equal to 1 (Section 6.7). However, the statistic u is almost useless in testing a hypothesis concerning a linear combination of the k population means (Equation 3), because the population variance ql is usually unknown. If q2 of Equation (5) is replaced by s~, the error mean square of
15.3
227
INDIVIDUAL DEGREE OF FREEDOM
the analysis of variance, the resulting statistic is
IMy- IMp.
'-~ P
n
which follows Student's t-distribution with .., degrees of freedom, where .., is the number of degrees of freedom of Sl (Theorem 8.1a). The number of degrees of freedom for the error mea: square Sl is k(n -1) for the completely randomized experiment and (k-1)(n-l) Pfor the randomized block experiment. Since F (Theorem 12.6), the statistic
,I ..
(IMf - IMp.)1 ,I=F _____ _
(7)
slIMI P
n
follows the F-distribution with 1 and.., degrees of freedom. the hypothesi. that P.v - IMp. - M,p.1 + MfIlI + ... + M,ple - 0,
In testiDg (8)
the statistic shown in Equation (7) becomes
F-
(IMy)1 (M tYl + M51 + .•. + MtYle)1 slIMI p
----------=--~ Sl
(M: + M: + •.. + MIle) .
(9)
~p~-------
n
n
When both the numerator and denominator of the above equation are multiplied by nl, the resulting equation ill F -
(M IT I + MIT. + .•. + .vie TIe)a
n s~(M: + M: + ... + AI~)
,
(10)
where T - ny ill a treatment total. The quantity
(.lI,T, + MaTI + ..• MIeT Ie)· QI __ ----------~~n(M~ + M~ + ••• + M~)
(11)
ill called an individual degree of freedom of the treatment SSe Then the statistic F in Equation (10) is
QI F---
(12)
Sl P
which follows the F-distribution with 1 and.., degrees of freedom, where
228
HYPOTHESIS IN THE ANALYSIS
or VARIANCE
Ch. 15
is the error mean square, ·with 11 degrees of freedom, of the analysis of variance. To illustrate the application of the F-test of Equation (12), the example of a completely randomized feeding experiment consisting of 3 treatments, each with 5 animals, is aaed. The treatments are as follows: Treatment 1: 100% regular feed. Treatment 2: 85% regular feed + 15% wood molasses. Treatment 3: 80% regular feed + 15% molasses + 5% soybean meal. The fictitious data given in Table 12.1a are considered the numbers of pounda gained by the animals during the feeding period. The treatment totals are 25, 45, and 20 respectively (Table 12.1a) and the analysis of variance is shoWD in Table 12.3b. The statistic F with 2 and 12 degrees of freedom is used in testing the hypothesis that the population means of the three treatments are equal, that is, IL, = ILl = IL.. But the individual degree of freedom may be used to test the hypothesis (Equation 8) that 51
1MIL - M,p., + M,p,1 + M.IL. - O.
If M, - 1, MI
- -
~ M. - -~ the hypothesis becomes 1
",. - 21f. -
or
1 r"" -0
or
2IL, -
1L2 - Il, -
o.
In tenns of the feeding experiment, this hypothesis is that the regular feed (treatment 1) is as good as the average of the two substitute feeds (treatment 2 and 3). In testing this hypothesis, the individual degree of freedom (Equation 11) is
(21, - 1 1- 1 JI [2(25) - 45 Q: - 5[(2)' + (_1)1 + (-1)2) 30
20]1
-7.5.
(13)
The statistic (Equation 12)
Q:
7.5 F-- .. --l.15 S'p 65 •
(14)
is less than 4.7472 which is the 5% point of F with 1 and 12 degrees of freedom. Therefore the conclusion is that the fattening ability of the regular feed is the same as the average fattening ability of the two substitute feeds. Then the two substitute feeds can also be compared by the method of individual degree of freedom. The hypothesis in this case
15.3
229
INDIVIDUAL DECREE OF FREEDOM
is III - III and the individual degree of freedom is
Q: -
(T a - T JI (45-20)1 5[(1)1 + (-1)S] 10 = 62.5.
The multipliers used in
Q: are Ma -
0, MI
-
1 and MI
-
(is)
-1. The statistic
Q:
62.5 F- - - 9.62 Sl 65 p
(16)
•
is greater than 4.7472. Therefore, the contlusion is that the second feed is more fattening than the third feed. It muet be realized that the data are fictitious and that therefore the conclusions are also fictitious. The analy.is of variance is ShOWD in Table 15.3a. It can be seen from the above example that very little additional work is involved in computios an individual degree of freedom, after the aoalysis of variance calculations are completed. The treatment totals and the error mean square are already computed during the course of the analysis of variance calculations. However, if the Ie treatments do not have the same number of observations, the computing method of an individual degrees of freedom is much more difficult and is given in Section 19.6. TABLE 15.3a Source of variation
55
DF
M5
F
Feed Regular vs. substitutes Between substitutes Error Total
70.0 7.5 62.5 78.0 148.0
2
35.0 7.5 62.5 6.5
5.38 1.15 9.62
!
1 12 14
When the sum of (Ie -1) individual degrees of freedom is equal to the treatment SS, that is, treatment SS - Q~ + Q: + .•. + Q~ _ l '
(17)
the set of Q2 is called an orthogonal set of individual degrees of freedom. The example of the feeding experiment shows an orthogonal set of individual degrees of freedom. The treatment SS (Table 15.3a) is 70 with 2 degrees of freedom. The value of Q~ is 7.5 with 1 degee of freedom and that of Q: is 62.5 with 1 degree of freedom. But 70 - 7.5 + 62.5. Therefore and constitute an orthogonal set of individual degrees of freedom. The F-value of 1.1:5 (Table 15.3a) is used to test the hypothesis that
Q:
Q:
(a) 2p., - Il2 - IL1 - 0
(18)
230
Ch. 15
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
and the F-value of 9.62 is used to test the hypothesis that (19)
(b) p., - III == O.
These two hypotheses collectively imply the hypothesis that (20)
III == Ila - III
which ill tested by the F-value of 5.38. If the hypothesis (b) is correct, that is, Ila" Ill" 50, say. then III has to be equal to 50, because by hypothesis (a), 2111 - 50 - 50 - O. Therefore, all three population means are equal. Conversely, if the general hypothesis that III - /L, - III is correct, the two component hypotheses (a) and (b) must be correct. Therefore, the purpose of using an individual degrees of freedom is not to change the hypothesis but to test a more specific hypothesis concerning the population means rather than the general one that all the population means are equal. When different values are assigned to the multipliers (M) in Equation (8), a wide variety of hypotheses concerning the population means can be tested. A set of individual degrees of freedom need not be orthogonal. However, if an orthogonal set is desired, it can be obtained by selecting appropriate multipliers. A set of individual degrees of freedom will be orthogonal if the multipliers (M) of Equation (8) are selected to satisfy the following two conditions: (a) The sum of the multipliers for each individual degree of freedom is equal to zero. (b) The sum of the products of the corresponding multipliers of any two individual degrees of freedom is equal to zero. These two conditions may be illustrated by the example of the feeding experiment. The multipliers for Q~ and are listed below:
Q:
Ma 2
o
-1 1
-1 -1
6 2
The sum of the multipliers for Q~ is 2 - 1 - 1 - 0 and that for Q= is 1 1:1 O. Therefore, the condition (a) is satisfied. The sum of the products of corresponding multipliers of the two individual degrees of freedom is
o+ 1 -
(2) (0)
+ (-1) (1) + (-1)
(-1)
=0 -
1 + 1 - O.
Therefore, the condition (b) is also satisfied. As a consequence, this set of individual degrees of freedom is orthogonal.
.
15.3
231
INDIVIDUAL DEGREE OF FREEDOM
Incidentally, the individual degree of freedom (Equation 18) which tests the hypothesis that
2111 - 113 - Il,
=0
also tests the hypothesis that the first treatment effect III zero. The hypothesis that
Ii is
equal to
is the same hypothesis that
because
Therefore, the test of a hypothesis concerning a treatment effect can be done through the use of an individual degree of freedom. TABLE 15.3b Set 1
Individual
Set 2
d·l·-
M,
Ma
M.
M.
Qf QI QJ
-I -I 1 -I
1 0
1 0 1
0
0
-I
I.M 3
M. 1 -I -I 1 -I -I 1 -I -I !rI,
4
2 2
Ma
M. 1 1 1
~M2
4 4 4
The treatment SS with k -1 degrees of freedom may be partitioned into many different sets of individual degrees of freedom. The set chosen must be meaningful and purposeful. As further illustrations, the multipliers of two orthogonal sets of individual degrees of freedom are given in Table IS.3b. The multipliers in both sets satisfy the conditions of orthogonality. For the first set, it can be observed from Table lS.3b that the sum of the multipliers for each individual degree of freedom is equal to zero. The sum of the products of the corresponding multipliers of and Q= is
Q:
(-1)
(-0 + (-0 (1) + 0) (0) + (1) (0)
and that for Q~ and
1-1 + 0 + 0 - 0,
Q! is
(-0 (0) + (-1) (0) + 0) (-1) + 0) and that for Q~ and
0::
(1) - 0 + 0 - 1 + 1 - 0,
Q: is
(- 1) (0) + 0) (0) + (0) (-1)
+ (0) (1) - 0 + 0 + 0 + 0 - O.
Therefore the conditions of orthogonality are satisfied.
232
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch. 15
The fact that the treatment SS is equal to the sum of an orthogonal set of individual degrees of freedom can be proved algebraically. To avoid the algebraic manipnlations, the numerical verification is used here. Suppose the 4 treatment totals Tl , Ta, T" and T. of a completely randomized experiment are 20, 30, 40 and 50 respectively. F.ach treatment has 5 observations. The treatment 55 with 3 degrees of &eedom i8
l:Ta
ca
5400
19600
n
len
5
20
- - - = - - ---1080 - 980 -100.
When the first set of multipliers is used, the components of the treatment SS are as follows:
(40)1 Q:- (-T,- Ta4n+ T, + TJa ----80; 20 Q~z
Q:-
(- Tl + Ta)a 2n
10
(-T. + TJ2 2n
(10)2 .. ---10; (10)2
- ---10. 1'0
It can be observed that the sum of these components is equal to the treatment SSe It should be noted that the value of Q2 is unchanged if its multipliers are multiplied or divided by a positive or negative constant. For example, the multipliers of Q~ are -1, -1, 1, and 1. When these multipliers are multiplied by -2, the new multipliers are 2, 2, -2, and -2. Now if these new multipliers are used.
Q: -
(2T, + 2T2 - 2T.- 2T J2 (-80)2 5[(2)2 + (2)2 + (_2)2 + (-2)2] - 80 .. 80
which is the same quantity obtained by the old multipliers. Therefore, fractional multipliers can be avoided by multiplying each M of a particular Qa by the common denominator of the M's. For example, each of the 4 multipliers
311
1
4"'-3"'-6"'-4" may be multiplied by 12 and become 9, -4, -2, -3. Both sets of multipliers produce the same value of Q2. The latter set is usually used to simplify the computation. It is easy to select the multipliers to satisfy the condition that the sum of the multipliers must be equal to zero for each individual degree of freedom. But it is not so easy to satisfy the condition that the sum of
15.4
LEAST SIGNIFICANT DIFFERENCE (LSD)
233
products of corresponding multipliers must be equal to zero for any two individual degrees of freedom. In the case of 4 treatments, there are 3 such sums of products to be made equal to zero. In the general case of k treatments, the number of sums of products to be made equal to zero is t(k - l)(k - 2). If k = 4, the Dumber i. !(3)(2) = 3. 1£ k = 5, the Dumber is t(4)(3) =6. If k = 10, the number is t(9)(S) = 36. It seems extremely difficult to select the multipliers 80 that all these restrictive conditions are satisfied. Indeed, it is very difficult to do so, if one relies on the method of trial and error. However, if the selection of these multipliers is guided by the physical meaning of the experiment, the difficulty usually dissolves. As an illustration, the physical meaning may be attached to the 4 treatments of Table 15.3b, making the treatments the following teaching methods: Treatment 1: Lecture Treatment 2: Lecture + demonstration Treatment 3: Lecture + visual aid Treatment 4: Lecture + demonstration + visual aid If the first set of the multipliers in Table 15.3b is adopted, the quantity Q: is used to test the hypothesis that III + 112 - III - 114 = o. In terms of the teaching method, this individual degree of freedom is used to determine whether the visual aid is of any value in teaching, regardless of whether the demonstration is used or not. Note that the minus signs are attached to the teaching methods with visual aid, while the positive signs are attached to the teaching methods without visual aid. Furthermore, the lecture and demonstration are used on both the minus and plus sides. Therefore, the compairson is with vs. without visual aid. The quantity Q~ is used to test the hypothesis that III -112 = o. In terms of the teaching methods, this test is to determine whether the demonstration is of any value in teaching. The quantity Q: is used to test the hypothesis that III - 114. This test is also to determine whether the demonstration is of any value. But the difference between Q~ and Q: is that, in the former case, no visual aid is involved and, in the latter case, the visual aid is used whether the demonstration is used or not. The examples given in this section are those of completely randomized experiments. But the method of individual degree of freedom can be applied similarly to the randomized block experiment. 15.4 Least Significant Difference (LSD) The difference between any two treatment means must exceed a certain quantity before it is considered significant. This quantity is called the least significant difference (LSD) between two treatment means. This least significant difference can be determined (Section 10.4) from the fact
234
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch. 15
that the statistic
follows Student's t-distribution, if 1'& = 1'2' with II degrees of freedom, where II is the number of degrees of freedom of s~. When used in connection with the analysis of variance, the statistic t which is given in Theorem 10.4& needs some modifications. First, the denominator IS simplified. When each of the k treatments has n o~ servations,
1
-+
nl
1 -p nl
1
1
2
n
n
n
-+ -- -.
Second, a more accurate estimate of the population variance is available. In Theorem 10.4a, the pooled variance is based on the observations oE two samJfles (treatments), while in the analysis of variance, a more accurate pooled variance s~ is available in the form of the error mean square which has k(n-1) degrees of freedom for the completely randomized experiment, and (k - 1)(n - 1) degrees of freedom for the randomized block experiment. In testing the hypothesis that two population means are equal at the 5% level, the two sample (treatment) means are considered significantly different, that is, 1'1 rF Ill' if t is greater than t. 025 or t is less than -'.025' where '.025 is the 2.5% point of the t-distribution. In other words, the difference between two treatment (sample) means is significant at the 5% level, if the absolute value of t (t-value with + or - sign ignored; denoted by t is greater than '.025' that is
I I)
Iy,- fal > '.025~.
(2)
The quantity on the right-hand side of the above inequality is called the lea.st significant difference (LSD .os) between two treatment means. When the difference between two treatment means exceeds this quantity, it is significant, that is, III ~ Ilao When both sides of the inequality (2) are multiplied by n, the resulting inequality is (3)
15.4
LEAST SIGNIFICANT DIFFERENCE (LSD)
235
where T, and T 2 are the treatment totals (T - ny). The quantity on the right-hand side of the inequality (3) is called the least significant difference between two treatment totals. When the difference between two treatment totals exceeds this quantity, the difference between two sample (treatment) means is significant, that is, Il, ,. IIp These two versions of the least significant difference are really the same test. The computing work involved is about the same for both versions, but the latter version does not require the computation of treatment meaDS from the totals. Whichever version is used, the computation of the least significant difference requires very little additional work, after the analysis of variance calculations are completed. The quantity t. 025 is obtained from the t-table without computation. The quantity n is the number of observations in each treatment and also needs no computation. The quantity s~ is the error mean square which is already computed in the analysis of variance calculations. None of the three quantities t. 025 ' n and s~ is attached to any particular treatments. Therefore, the least significant difference may be used in testing the equality of any two population means. The subscripts 1 or 2 attached to y, T and Il are used to differentiate one treatment from another and do not necessarily denote the first and second treatments. The method of the least significant difference is really the same test as that of the individual degree of freedom. When the individual degree of freedom is used in testing the hypothesis that Il, =1l2' the statistic used (Equations 10, 16, and 17, Section 15.3) is (T,- T2)2
F- ns 2[(l)2 + (-1)2)= p
The hypothesis Il, co 112 is rejected when the F-value in Equation (4) exceeds the 5% point of F, that is,
F=
(T, - T2)2
2ns2
p
>F .05
or (5)
It should be noted that the above inequality is the inequality (3) with both sides squared, because t 2 • 025 0:::: F.05 (Theorem 12.6). Therefore, the method of the least significant difference and that of the individual degree of freedom always lead to the same conclusion in a test of hypothesis. However, the least significant difference can be used only in testing the hypothesis that two population means are equal; while the individual degree of freedom can be used in testing various hypotheses
236
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch. 15
concerning the k population means (Section 15.3). Hence the former method may be regarded as a special case of the latter method. The use of the least significant difference is illustrated by the example given in Table 12.1a. The three sample (treatment) means, each based on 5 observations, are 5, 9, and 4 respectively. The complete analysis of variance is given in Table 12.3b, which shows that the error mean square is equal to 6.5 with 12 degrees of freedom. The 2.5% point of the t.odistribution with 12 degrees of freedom is 2.1788. Then the least significant difference between two means at the 5% significance level (Inequality2) is LSD.os .. t. 02S
,f2s!"
V ~= 2.1788,j2{6.5) V S--5-'" 3.5.
(6)
91
The difference between the first and the second sample means is 15 or 4 which is greater than the LSD 3.5. Therefore, the conclusion is that III < 1l2" The. difference between the first and the third sample means is (5 - 4) or 1 which is less than the LSD 3.5. Therefore the conclusion is that III = Ill. When the LSD is used, any significance level may be chosen. If the 1% level is chosen, the .5% point of t is used in calculating the LSD instead of the 2.5% point used in this example. The least significant difference is very commonly used and also frequently misused. The use and misuse of the LSD are illustrated by a sampling experiment (Chapter 4). One thousand random samples, each consisting of 5 observations, are drawn from the tag population, which is a normal population with mean equal to 50 and variance equal to 100. The random samples are separated into groups of 10 samples each, according to the order in which the samples are drawn. For example, the first 10 samples constitute a group, the second 10 samples constitute another group, and so forth, to make 100 groups, each consisting of 10 samples. Then each group may be regarded as the data of a completely randomized experiment which consists of 10 treatments and 5 replications. The means are computed for each of the 1000 samples. For each group of 10 samples, the least significant difference (Inequality 2) for two treatment means can be computed, but the value of the least significant difference will change from group to group, because the pooled variance S2 changes from group to group. To simplify the computation, the popup . lation variance is used in computing the LSD, and consequently the number of degrees of freedom for t is 00 (Section 8.3). Then the least significant difference between two treatment means (Inequality 2) is 1
LSD.os "" 1.960 The hypothesis that III
~
,
j2('IOOl
V -;;-"" 1.960 V S---5 - - 12.4.
= III is
(7)
tested by comparing the difference between
15.4
LEAST SIGNIFICANT DIFFERENCE (LSD)
f,
237
and f. for each group with the LSD 12.4. When the difference between the first and the fifth sample means exceeds 12.4 in absolute value, the difference is declued significant. Since all the samples are drawn from the same population, the hypothesis being tested is true, and therefore it is expected that, in 5% of all possible groups, the hypothesis is rejected erroneously. In this sampling experiment of 100 groups, approximately 5 greater than 12.4. (This sampling groups produce differences experiment has been performed cooperatively by the students at Oregon State College yearly since 1950. The percentages obtained ranged from 3 to 8.) Then the difference between the largest and the smallest sample means within each group is determined. This difference is again compared with the LSD 12.4. This time approximately 60 of the 100 groups produce differences exceeding 12.4. (This sampling experiment has been performed cooperatively by the students at Oregon State College yearly since 1950. The percentages obtained ranged from 58 to 67.) Therefore the actual significance level is about 60% rather than the intended 5%. This result is not really surprising. The LSD is designed to test the significance of the difference between two particular treatment means such as the first and the fifth. As an illustration, for a particular group the two means may be 48.4 and 51.2, and the difference between them 2.8, which is less than 12.4. If the other 8 sample means are brought into consideration, the difference between the least and the greatest among the 10 sample means is still 2.8, only if all the 8 other sample means lie between 48.4 and 51.2. If one or more of the 8 sample means fall outside this range, the difference between the extreme sample means will be greater than 2.8. In other words, the difference between two sample means can be increased and never reduced by bringing the 8 other sample means into consideration. As a result, when the difference between two particular treatment means is distorted into the difference between two extreme sample means, the least significant difference is very misleading and becomes more so, as the number of treatments increases. From the results of the sampling experiment, it can be concluded that the least significant difference is used only when a comparison is inspired by the treatments such as comparing the yield of a local variety with each of the introduced varieties of a certain crop; and that it should never be used when the comparison is inspired by the relative magnitude of the sample means. In a yield trial involving a large number of varieties, the difference between the highest yield and the lowest yield is almost always significant if the LSD is used as a yardstick. All these discussions about the use and misuse of the least significant difference also apply to the individual degree of freedom. The hypothesis of the individual degree of freedom should be formulated without the benefit of the knowledge of the treatment (sample) means, for otherwise, the significance level is disturbed.
1f, - f.1
238
HYPOTHESIS IN THE ANALYSIS OF VARIANCE
Ch. 15
The purpose of the sampling experiment is to demonstrate that the least significant difference should not be used in testing the hypotheses inspired by the magnitudes of the treatment (sample) means. It is not the intention of the sampling experiment to show that SDch hypotheses themselves are undesirable. It is perfectly natural for an experimenter to select the high yielding ones among many different varieties of a certain crop with the aid of the experimental data. Rut such problems need special methods, one of which is given in the following section.
15.5 New Multiple Range Test. In the analysis of variance, if the hypothesis that the k population means are equal (Section 12.5) is accepted, no further test is necessary. However, if the hypothesis is rejected, the conclusion is that the k population means are not all equal. This, of course, does not imply that all the population means are different from one another. In order to determine which of the k population means are equal and which are different, further tests are necessary. Although the methods of individual degree of freedom and least significant difference, described in the preceding sections, seem to be the tests for this purpose, they are not. The sampling experiment in Section 15.4 shows that the hypotheses being tested by these methods should be inspired by the treatments and not by the relative magnitudes of the sample means; for otherwise the conclusion reached would be extremely misleading. To test the hypotheses inspired by the sample means, several methods are available, of which only one, the new multiple range test, is presented in this section. The new multiple range test was developed by Duncan in 1955. The principle is quite similar to that of the least significant difference (Section 15.4); and, therefore, this test may be explained in terms of the LSD. The least significant difference between two treatment means at the 5% level is
(Inequality 2, Section 15.4), where t. 025 is the tabulated t-value, s~ is the error mean square of the analysis of variance, and n is the number of observations each treatment mean is based on. In the multiple range test, where more than two treatment means are compared, the quantity corresponding to the LSIl is called the shortest significant range (SSR); and the multiplier v'2t. 025 of the estimated standard error yspn is called the significant studentized range. The word "studentized" has not yet appeared in dictionaries. It is derived from the Student's t-
15.5
239
NEW MULTIPLE RANGE TEST
distribution, where the estimated variance sa is used instead of the population variance (7a. The significant studentized ranges for 5% and 1% significance levels are given in Tables Sa and 8b, respectively, of the Appendix. The letter g of the tables designates the number of means in a group to be compared, and Va is the number of degrees of freedom of s~, the error mean square of the analysis of variance. It should be noted that, for g .. 2, the significant studentized ranges for the 5% level are equal toy'2 t .025 and those for 1% level are equal to ..j2 t.oos' For example, t. 025 with 10 degrees of freedom is 2.2281 and y2 is equal to 1.4142. The product of these quantities is equal to 3.15 which is the significant studentized range for g - 2 and Va - 10 (Table 8a, Appendix). The workings of the new multiple range test may be illustrated by the tomato experiment given in Table 14.5a. To perform this test the means of the six varieties are necessary. They are computed from Table 14.5a and are given below:
A
B
D
c
27.00
35.90
44.54
44.82
F
E
60.64 61.08
Note that the six average yields are arranged according to their magnitudes. The standard error of the mean is needed in computing the shortest significant range. The standard error is
Sy-.;sam - Yl01.67/5 - 4.51 where n is the number of replications, and sa is the error mean square, with 20 degrees of freedom, of the analysis of variance given in Table 14.5b. If the 5% significance level is chosen, the significant studentized ranges are obtained from Table 8a, .~ppendix, with "a - 20 and g eo 2, 3, 4, 5, 6. The last uumber 6 is k, which is the number of varieties or treatments in the experiment. Then each of the tabulated values 2.95, 3.10, 3.18, 3.25, and 3.30 is multiplied by the standard error 4.51 to form the SSR which are given below:
g: SSR:
2
3
4
5
6
13.30
13.9R
14.34
14.66
14.88
Now the differences between means may be tested in the following order: the largest minus the smallest, the largest minus the second smallest, and so on, ending with the second smallest minus the smallesL For the tomato experiment, there are 15 such comparisons as shown in Table 15.5a. In general there are ik(k -1) comparisons for the k treatments. Then the difference between two means is compared with the corresponding SSR. The difference is significant if it exceeds the
240
Ch. 15
HYPOTHESIS IN THE ANALYSIS OF VARJANCE
corresponding SSR; otherwise it is not significant. For example, the difference between the means of the varieties E and A is M.08 (Table 15.5a) which is greater than 14.88, the SSR for g - 6. Therefore, the difference between the two means is significant. The reason wby this difference of 34.08 is compared with the SSR with g = 6 is that E - A is the range of 6 means. TABLE 15.5a g
Varieties
DiffereDce
SSR
CouclusioD
6 5
34.08 25.18 16.54 16.26
14.88 1••66 14.3. 13.98 13.30
Significant
3 2
E-A E-B E-D E-C E-F
5 4 3 2
F-A F-B F-D F-C
33.64 2•• 74 16.10 15.82
4 3
17.82 8.92
2
C-A C-B C-D
3 2
D-A D-B
17.54
2
B-A
8.90
•
.«
.
.. .... "
Not Significant Significant
14.66 14.34
13.98 13.30
"
14.34
Significant Not SignificaDt
13.98
"
do Dot test
.
I
13.98
Significant Not Sipificant
I
13.30
Not SignificaDt
do Dot test
There is a limit to the rule given above. The limit is that no difference between two means can be declared significant if the means concerned are botb cODtaiDed in a sub-group of means wbicb baa a Don-
significant range. Because of this limit, as soon as a non-significant difference is found, it is convenient to group the two means and all of the intervening means together by underscoring them with a line as shown in Table IS.Sb. The remaining differences between all members of a sub-group underscored in this way are not significant and they need not, and should not, be tested against the SSR. For example, C - B is not significant (Tables IS.Sa and IS.Sb), and this sub-group includes C, 0, and B. Therefore, C - 0 and 0 - B should not be tested. TABLE 15.5b Varieties,
MeaDS:
A
B
D
C
F
27.00
35.90
44.54
44.82
60.64
E 61.08
The result of the new multiple range test is shown in Table IS.Sb. The varieties E and F have significantly higher yields than C and D.
241
EXERCISES
The variety A has a significantly lower yield than E, F, C, and D. The position of 8 is not clearly determined. It may belong to the same group as A or the same group as D and C. Indeed it may be a group by itself. Only further experimental evidence, that is, an increased PI, will clarify the situation. The new multiple range test looks very tedious, especially when the number of treatments is large. However, after becoming familiar with the procedure, one can devise many short-cuts to speed up the procedure.
EXERCISES (1) The sampling experiment of Section 15.2 verifies the fact that the
linear combination
" - 2r + r r 2r. I
(2)
(3)
(4) (5)
I -
4 -
follows the normal distribution wi th mean equal to 0 and variance equal to 1000. (a) Find the mean and variance of the distribution of (" + 200). (b) Find the mean and variance of the distribution of (" + 200)/10. (Hint: see Section 2.4). The totals of 4 treatments are 145, 131, 169, 172 respectively and each treatment has 8 observations. Find the treatment 55 and its components by each of the two sets of multipliers given in Table IS.3b. Note that, for each set, the treatment 55 is equal to the sum of its components. For the same treatment totals given in Exercise 2, find the value of QI, if the multipliel'8 are -1, 3, -1, -1 respectively. Now divide each multiplier by -3 and find the value of QI. Note that the value of QI is unchanged. Make another orthogonal set of multipliers for 4 treatments besides the two sets given in Table IS.3b. A randomized block experiment consisting of 5 treatments and 8 replications i. conducted to study the effect of the rate aod the time of application of the fertilizer on the seed yield of tum ips. The rate of application of fertilizer is expressed in terms of number of pounds per acre and the seed yields is in terms of pounds per plot. The individual observations are not given here, but the treatment totals are given as follows: No. 1 2 3 4 5
Treatment Check (no fertilizer) 100 lbe. in fall 50 lbe. in fall; 50 lbe. in epring 200 lbe. in fall 100 Ibe. in fall; 100 Ibe. in epring
Total 60.2 73.5 82.8 93.3 92.7
242
(6)
(7)
(8)
(9)
(10) (11) (12) (13)
HYPOTHF.SIS IN THE AN AL YSIS OF VARIANCE
Ch.15
The treatment SS with 4 degrees of freedom is to be partitioned into 4 components for the following comparisons: (a) With vs. without fertilizer. (F - 14.13) (b) 100 Ibs. vs. 200 Ibs. of fertilizer, regardless of time of application. (F .. 6.05) (c) For 100 Ibs. of fertilizer, once vs. twice in application. (F o::r 1.19) (d) For 200 lbs. of fertilizer, once vs. twice in application. (F - 0.00) For this exercise, the following procedures may be followed: (A) Make a table of multipliers as shown in Table IS.3b. (B) Make an analysis of variance table (Table 15.3a) showing the treatment SS and its 4 components. The replication SS is 89.6541 and the error SS is 127.6235. State what hypothesis each of the 5 F-values is testing. Write a short report on the conclusions in terms of the fertilizer experiment. Avoid statistical terms in the report. Find the least significant difference between two treatment totals for the data given in Exercise 5. Use the 5% significance level. (LSD = 17.5) For the data given in Exercise 7, Chapter 12, test the hypotheses (a) that the control feed is as good as the two experimental feeds, and (b) that the two experimental feeds are equally good, by the individual degrees of freedom, at the 5% level. For the data given in Exercise 10, Chapter 12, break down the treatment SS into two individual degrees of freedom. Previous experiments with lard show that rancid lard inhibited the germination of the spores while the control and non-rancid lard gave plate counts which were not significantly different. Test the above data at the 5% level to see whether the same results hold for oleic acid. For the data of Exercise 9, Chapter 14, compare (a) the merits of the old color scheme and the new color schemes, and (b) the merits of the two new schemes, at the 5% level, by the individual degrees of freedom. For the data given in Exercise 8, Chapter 12, rank the means by the new multiple range test, at the 5% level. For the data given in Exercise 12, Chapter 12, rank the means by the new multiple range test, at the 5% level. For the data given in Exercise 13, Chapter 12, rank the means by the new multiple range test, at the 5% level. For the data given in Exercise 14, Chapter 12, rank the means by the new multiple range test, at the 5% level.
REFERENCES
243
(14) For the data given in Exercise 16, Chapter 12, rank the means by the new multiple range test, at the 5% level. (15) For the data given in Exercise 10, Chapter 14, rank the means of the ten observers by the new multiple range test, at the 5% level. (16) For the data given in Exercise 11, Chapter 14, rank the average pH-values of the three depths of Boil, at the 5% level.
QUES110NS
(1) The pooled variance
and
s:' that is
s: <Section 9.6) is a linear combination of s:
What are the M's? (2) The sample mean is a linear combination of the n observations. What are the M's? (3) The general mean is a linear combination of the " treatment means, that is,
What are the M's? (4) What is the purpose of partitioning the treatment SS into individual degrees of freedom? (5) In the sampling experiment of Section 15.4, there are 100 groups, each consisting of 10 samples. The difference between the greatest and the second greatest sample means within each group may be compared with the LSD 12.4. Do you expect less than 5%, exactly 5%, or more than 5% of the groups to produce differences greater than 12.4? Why? (6) In comparing two particular treatment means, either the individual degree of freedom or the least significant difference may be used. Will the two methods always lead to the same conclusion? Why? (7) Under what condition is the least significant difference appropriate to use? (8) What is the consequence of misusing the LSD?
REFERENCES Cochran, W. G. and COlt, G. M.: uE""erimental Design," JohD Wiley & SoDB, New York, 1950. Danca a, David B.: "Multiple Range and Multiple F tests." Biomdrica. Vol. 11 (1955), pp. 1-42. SDedecor, G. W.: Sc~iadcal Mdhods, 4th EditioD, Iowa State College Press, Ames, 1946.
CHAPTER 16
LINEAR REGRESSION-I Regression is an extensive topic in statistics. Of the many sub-topics which may be considered under regression, this chapter treats only onelinear regression. Regression, like the analysis of variance, has to do with a family of populations, but with an additional restrictive condition imposed upon the population means. Therefore regression is not treated in this chapter as a new topic, but as an extension of the analysis of variance with one-way classification (Chapter 12). The word "regression" should be regarded as a technical term. It is used to identify but not to describe this topic of statistics. The word "linear," however, is descriptive and indicates that all the population means being considered are on a straight line.
16.1 F ..d. .ental Notloas Regression has to do with the relation between the treatments and the population means of those treatments. For example, the treatments may be different rates of application of a fertilizer, such as 0, SO, 100 and 1'2, 150 pounds of fertilizer per acre and the treatment means are the average yields of a crop from the plots recei ving these treatments. The quantity of fertilizer, such 88 0, 50, 100, 150 is denoted by oX, while the yield of a plot is denoted by y 88 usual. In other words, regression in statistics is concerned with problems involving the relation between x, the quantity of fertilizer, and the mean of y, the average yield of a crop from the plots receiving that quantity (x) of fertilizer. A common problem is to determine the amount of the increase in the average yield when the quantity of fertilizer applied is increased by one pound. In regression, the treabnents must be quantitative so that a treatment can be represented by a value of x. If the treatments are qualitative, that is, if they involve different kinds of fertilizers rather than different quantities o£ the same fertilizer, the treatments cannot be represented by different values of x. As a result, the method of regression cannot be used. When the treatments are qualitative, one must rely on the analysis of variance, where the treatments may be either qualitative or quan ti tati ve. As a further illustration of regression one may consider the relation between age (x) of children and their heights (y). For two-year-old children, the height varies from one child to another, and the heights constitute a population. For three-year-old children, the height also varies from one child to another, and the heights constitute another population.
"'U
244
"'I' ,.,.,.,
16.1
245
FUNDAMENTAL NOTIONS
If five age groups, ranging from 2 to 6 are considered, there will be five populations of heights. Thus the variation in the age s generates a family of populations of heights y. An additional concept pertains to the different age groups. In this concept, the heights of all children between the ages of 2 ad 6 constitute a single population and the heights of a particular age group constitute a sub-population which is called an array. In the concept of a family of populations, a sub-ciivision of the family is a population itself. In the concept of a single population, a sub-ciivision of the population is an array. These concepts suggest by analogy the political organization of the United States. The states may be regarded 88 sovereign nations and the United States as a collection of sovereign nations, or the states may be regarded as sub-divisions of a single nation. In this cbapter, both concepts are used, but the fundamental notations of regression are explained in terms of the latter concept. In this concept, the heights of children of all age groups constitute a single population and the heights of a parti cular age group cons ti tute an array. Each array of y has its own mean; that is, for the children of a particular age s, there is an average height. The mean of an array, such as the average height of a particular age group, is denoted by The average height in inches (data fictitious) of the children of various ages is given in Table 16.1.
"'y.".
TABLE 16.1 Measurements Age" Av. ht. Py."
Popuiatioo
Total
Mean
2
3
4
5
6
20
4
%
34
37
40
43
46
200
40
~
It can be observed from Table 16.1 that different x-arrays (age groups) have different means of y (heigbt); tbat is, the average beigbt varies witb the age s. The relation between the age s and the average height "'y." can be expressed by the following equation:
"'y."
"'YoS
= 40
+ 3(s - 4).
(1)
For different values of s, that is, for different age groups, the average height, can be calculated from the above equation. For example,
"'y."
At At At At At
s = 2, s = 3, s = 4, s - 5, s = 6,
= 40 '" y. 3 = 40 p. y ••~ - 40 '" y. 5 - 40 "'y.6 ... 40 "'y.2
+ 3(2 - 4) - 34; + 3(3 + 3(4 + 3(5 + 3(6 -
4) - 37; 4) = 40; 4) - 43; 4) - 46.
246
Ch. 16
LINEAR REGRESSION-I
The above values of average height obtained from Equation (1) are the same as those given in Table 16.1. In other words, the table can be adequately described by the equation. Yet there remains a question about bow the equation is obtained. The quaotity 4 in Equation (1) is the average i of all the children. The quaotity 40 is the average height of the children of this average age 4 and also the average height of all children regardless of age. The quaotity 3 is the rate at which the average height changes. From the table, it cao be observed that the average height increases at the rate of 3 inches per year.
50
p. 45
,..s = 40 + 3(% -
4)
lit. (in.)
40
35
2
4
3
5
6
Age (year)
Fig. 16.1
The relation between the age and the average sented graphically. The graph of Equation (1), is shown in Fig. 16.1. It can be observed from chaop;e of the average height with respect to age
height can also be reprewhich is a straight line, the graph that the rate of is the slope of the line.
16.1
FUNDAMENTAL NOTIONS
247
In general, an equation which describes the relation between x and /ly.,,' the mean of y of an x-array, is called the rpgression equation of y on x, and the graph of this equation is called the curve of regression. In geometric terms, the curve of regression is the locus of the meanA of the arrays of y. The curve of regression can be of any shape, but if it happens to be a straight line, it is called the line of regression. If this is the case, the regression of y on x is said to be linear. In this chapter, only linear regression is considered. The line of regression of y on x can be expressed by the following equation:
p.
'1." = ex + fJ(x - i}.
It can be observed that Equation (l) is a linear regression equation with ex = 40, f3 = 3 and i = 4. The meaning of these quantities is already explained in terms of age and heights of children. In general, i is the mean of the %-Values; ex is the mean of the array of r at x = i, that is, ex=p. -
'1."
(3)
or the mean of the means of all arrays, that is,
P. +/l'1." +"'+p. y'''n ex= '1." a • 1
n The quantity fJ is the rate of change of /ly." with respect to x and is called the regression coefficient of y on x. In geometric terms, it is the slope of the line of regression. It should be realized that the line of regression is not really a straight line in the geometric sense, but rather a segment of a line. In geometry, a straight line has an infinite length, but the line of regression usually has a finite length for a practical problem. It starts at the smallest value of x and ends at the largest value of x. For example, the line of regression of height on age shown in Fig. 16.1 starts at x = 2 and ends at x = 6. The advantage of having a regression equation is that the means of all arrays become known, if the quantities ex and f3 are known. In the example of the regression of height on age, there are 5 means (average heights) for the 5 different arrays (age groups). In statistical terms, the means ry." II of the 5 arrays are parameters and usually unknown. But if the two parameters ex and fJ of the regression equation are known, the values of all 5 parameters p.y." can be obtained from the regression equation. Therefore the regression equation reduces the number of parameters from 5 to 2. In general, the number of parameters p.y.", no matter how large, is always reduced to 2, ex and fJ, by the regression equation, if the regression is linear. In Section 16.3, when the estimation of the parameters is considered, the effort is devoted to the estimation of ex and fJ.
248
Ch. 16
LINEAR REGRESSION-I
Once the estimates of IX and f3 are obtained, the estimates of the means p.y.s of the arrays of r can be obtained indirectly through the estimated regression equation.
16.2 DescriptioD of a PopalalioD All the theorems given in this chapter are concerned with sampling from a specific population which has the following characteristics: (a) Each x-array of r follows the normal distribution. (b) The regression of on % is linear. (c) All the %-arrays of y have the same variance (II. 10 terms of the example of the regression of height (y) on age (%) of the preceding section, the specifications or assumptions are that (a) the heights (y) of the chi ldren of the same age groups (x-arrays) follow a normal distribution, (b) the relation between age and the average height of different age groups can be represented by a straight line (Fig. 16.1), and (c) the variances of the heights of all age groups are the same. lt should be realized that these assumptions about the population are similar to those of the analysis of variance (Section 12.7) with the exception that the regression is linear. In the regression, the arrays are assumed to be normal and to have equal variances; while in the analysis of variance, the populations are assumed to be normal and have equal variances. Arrays and populations are really the same things from different points of view (Section 16.1). Therefore, these two assumptions are the same. However, linear regression specifies that the means of the arrays are on a straight line; while in the analysis of variance, no specification of any kind is made about the relations among the population means. This specific population of linear regression can be represented by a three-dimensional model which is shown in Fig. 16.2. The whole model represents the population. Each section of the model is an array. It can be observed that each section is shaped like a normal curve. The fact that all sections are of the same shape indicates that the variances of all arrays are equal. The bar which crosses the y-axis is the line of regression. This model is made flexible, that is, the angle between the bar and the sections of the model can be adjusted at will. Figure 16.2 show the same model adjusted to three different angles. The upper one shows that the slope of the line of regression is positive, that is, the regression coefficient f3 is positive, or the mean of an array of increases with %. The middle figure shows that the line of regression is horizontal, that is, f3 ." 0, or the mean of an array of remains the same, regardless of the value of %. The lower figure shows that the slope of the line of regression is negative, that is, f3 is negative, or the mean of an array of decreases with %.
r
r
r
r
16.2
DESCRIPTION OF A POPULATION
249
{3>O
Fig. 16.2
fJ '"
0
fJ <
0
250
Ch. 16
LINEAR REGRESSION-I
16.3 Es&lmalioD of Parameters
In the preceding sections the discussion is confined to a population whose regression equation of y on x (Equation 2, Section 16.1) is I'-
..
" = cx+ f:J{x - i).
(1)
In this section, the method of estimating the parameters ex and fj is considered. The linear regression equation has two parameters CX and f:J. If these two parameters are known, the means I'of all arrays of y can be detennined from the regression equation. ii~wever, when the population is not available, the parameters must be estimated from a sample. The sample estimates of ex, f:J, and 1'-" •• are denoted by a, b, and y., respectively. Then the estimated regression equation is
r. - a +
b(x -
(2)
i).
It can be seen from the above equation that when the estimates II and b are obtained from the sample, y. can be obtained for any value of x through the sample regression equation. Therefore, the estimation of p.. y •• is accomplished through the estimation of CX and f:J. The example of the regression of height (y) on age (x) of the preceding sections is again used as an illustration of the methods of estimating ex. and p. The age lind height of each child of a random sample of 5 children are shown in Table 16.3a. It can be observed from the table tbat, for each child, there is a pair of numbers, age and height (x, y), instead of a single observation (y) as in the preceding chapters. The sample size n, being the number of children or the number of observations y, is 5 and not 10. The pairs of numbers are denoted by (xu Y.), (xa, Y2)' .•• , (x n' y n)' For this example, (x., Y.) is (2, '35) and (XI' Y2) is (3, 36) and so forth. The first number in the parenthesis is always the value of x and the second one is the value of y. TABLE 16.3a Measurements Age:
%
Height: y
Sample
Total
MeaD
2
3
4
5
6
20
I.x
4
35
36
41
45
44
201
ljy
40.2
:i
.,
Despite the fact that two numbers are associated with each child, the height y is still considered the observation and the age x is used to indicate from which age group (array) the observation y is obtained. The heights of 5 children, regardless of age, constitute a sample and the children of the same age constitute a sub-sample. The sample of Table 16.3a consists of 5 sub-samples, because the 5 children are all of differ-
16.3
251
ESTIMATION OF PARAMETERS
ent ages. In general, the II observations collectively constitute a sample, and the observations drawn from the same array (with the same x-value) constitute a sub-sample. The mean of the sample is still designated by y and the mean of a suh1ample is designated by ys. The reason for the subscript % is to indicate that the sub-sample mean ys varies with %, that is, the ys changes from one sub-sample to another. The estimate of ex. is
- l::r
a-y--=
1,+11+"'+1,. .
II
(3)
II
For the sample given in Table 16.3a, the estimate of ex. is a - 1
=
35 + 36 + 41 + 45 + 44
5
201
=-
5
- 40.2,
while ex. itself is equal to 40 (Equations 1 and 2, Section 16.1). The reason for using y as the estimate of ex. can be observed from Equation (1). When the values of % are substituted into this equation. the values of p.y.s are as follows: For For
The sum of the above
II
% -
%"
% -
%2'
p..,"Sa - ex. + (J(%, - i). P. y.St - ex.+ ,., R(%2 - i).
(5)
equations is
II. y.s, + P. y.s. + • • • + P. y.s,. -
II
ex. + (JI(% - i).
Since the sum of the deviations of the observations from their mean is equal to zero (Equation 5, Section 7.4), that is, I(% - i) - 0, therefore, ex. is equal to the mean of the means of all arrays, that is,
+p. +"'+p. ex. _ P.-,---y._S..:,.I_ _y_.---'sl=---_ _~y_._s~,.
(6)
II
(d. Equation 4, Section 16.1). For the population given in Table 16.1, ex. =
;l4 + 37 + 40 + 43 + 46
5
200
=-- 40. 5
Since ex. is the mean of the means of all arrays, then it is reasonable to estimate ex. by the mean of the observations from these arrays (Equation 3).
252
LINEAR REGRESSION-I
The estimate of
b.. ..
f3
Ch. 16
is
I(x - x)(y - y) SP I(x _i)2 -SS (XI - i)(y I -
1> + (x 2"-
(XI - i)2
i)(Y2 -
1> + •• • + (xra -
x)(y ra
+ (xa - i)2 + ••• + (x ra - X)2
y)
(7)
•
In the above algebraic expression of b, the numerator, being the sum of the products of the deviations of the x.-values and the y-values from their respective means, is designated by SP; and the denominator, being the SUID of the squares of the deviations of x.-values from their means, is designated by SS". For the sample given in Table 16.3a, the estimate of f3 is
27.0
b = - - = 2.7, 10
(8)
TABLE 16.3b
"
2 3 4 5 6
35 36 41 45 44
20
201
I"
C" -
y
C" _%)2
%)
-2 -I 0 I 2
4 I
0 I
4
0
10 55"
Ie" -x)
Ir
% = I,,/n
= 20/5 = 4;
-1>
(y
C" -
%)(y -
-5.2 -4.2 0.8 4.8 3.8
10.4 4.2 0.0
0.0
27.0 5P
y)
4.8
7.6
Icy-y)
r =Ir/n =----------201/5 = 40.2
with the details of the calculations shown in Table 16.3b. The reason for using b as the estimate of {3 can be justified from Equation (5), which may be expressed as follows:
{3 = p. '1·-"1 - _ ex. = p. Y·-"a - _ex._ X 1 -%
••• _
p.
%a - X
y.
x,. _- ex. .
(9)
X ,. -%
For the population given in Table 16.1, the regression coefficient is
{3 =
34 - 40 2-4
=
37 - 40 3-4
=
40 - 40 4-4
=
43 - 40 5-4
=
46 - 40 6-4
-
3.
(10)
The middle one of the above 5 ratios is % which, being indeterminate, supplies no information regarding the value of {3. The other 4 ratios are all equal to 3 which is the correct regression coefficient of the population. From Equation (9), it can be seen that if p. '1." and ex. are replaced by their estimates, the estimate of {3 can be obtained. The only information
16.3
253
ESTIMATION OF PARAMETERS
available regarding the meao of an array is the ohservation y drawn from that array. Therefore, y may be used to estimate p. y •• • The estimate of ex. is y which is given in Equation (3). Then from each of the II pairs of observations, ao estimate of f3 cao be obtained. The II estimates which are denoted by C are as follows:
-
-
-
t.
_-
CI .y, -Yo C2 .YI-Y, ... ' C .Y. -Y _, _, x , -x x2 -x x n -x
(11)
These estimates are reasonable to expect. In terms of the example of the regressioo of height on age of children, y, - is the deviation of the first child's height from the average height of the children in the sample, and x, - i is the deviation of the same child's age from the average age of the children in the sample. If his height is 4 inches above average aod his age is 2 years above average, then the rate of increase in height is 4/2 or 2 inches per year. By this method of estimating {3, each pair of observations (x, y) provides ao estimate C. Then a sample of II pairs of observations would provide II estimates, which must be combined in some way into one comprehensive estimate of {3. The method of combining these n estimates is the weighted average of them, with the weight being (x - i)1. If a child's age is 2 years above average, that is, (x - i) - 2, his estimate of f3 carries the weight of 22 or 4. If a child's age is 1 year above the average, that is, (x - i) • 1, his estimate of {3 carries the weight of 12 or 1. If a child's age is equal to the average, that is, (x - i) = 0, his estimate of f3 carries no weight. In other words, the greater a child's age deviates from the average age, the more weight his estimate of {3 carries. This method of weighting seems to be reasonable. After all, if every child is of the average age, certainly the regression coefficient {3, which is the rate of chaoge of the average height with respect to age, cannot be estimated. The weighted average of the C-values, with (x - i)2 being the weight, is the sample regression coefficient b in Equation (7), that is,
r
b=
(x, - i)2C, + (x, - i)2C Z + ... + (x
- i)2C II
(x, -
i)2 + (x 2
-
i)2 + ••• + (x n - i)2
n.
(12)
When C is replaced by (y - y)/(x - i) (Equation 11), the above equation becomes Equation (7). The method of estimating ex. and {3 is so far justified only by intuition, but the method has its theoretical basis. It is known as the method of least squares. The name of least squares is derived from the fact that if this method is used in determining a and b, the sum of the squares of the deviations of the y-values from the estimated meaa of the arrays is a minimum, that is,
254
Ch.16
LINEAR REGRESSION-l
is a minimum. The fact that the above sum of squares, which is called the residual SS, is a minimum can be illustrated by the sample given in Table 16.3a. To obtain the residual SS, the values of yshave to be obtained from the estimated regression equation (ElJUation 2) which contains a, b, and i. The values of a and b for this sample are given in Equations (4) and (8) respectively, and the value of i is 4. Therefore the estimated regression equation is
Ys = 40.2 + 2.7(% -
(14)
4).
From the above equation, ys can he obtained for each value of %. For example, when %= 2, Ys = 40.2 + 2.7(2 - 4) - 34.8. From these values of ys and the values of y given in the sample, the residual SS can be calculated and is equal to 9.90. The details of the calculations are given in Table 16.3c. The fact that the residual SS is a minimum can be illustrated graphically. The 5 pairs of observations of Table 16.3c are plotted as the 5 points and the estimated regression equation (Equation 14) is plotted as the straight line (estimated line of regression) shown in Fig. 16.3. Graphically the deviation y - ys is the vertical distance between a point and the line, with the distance being positive if the point is above the line, the distance being zero if the point is exactly on the line, and the distance being negative if the point is below the line. The SUID of the squares of all the n distances is the residual SSe Then it can be seen from Fig. 16.3 that the value of the residual SS is affected by the position of the line relative to the points, which, representing the n pairs of observations, are fixed for a given sample. The position of the line is determined by the values of a and b of the regression equation. The least squares method produces such values of
G
and b that the residual SS thus produced reaches a minimum. The discussion about the minimum residual SS is to justify the name of the least squares method. The advantages of this method are discussed in Sections 17.2 and 17.3. TABLE 16.3c Estimated Mean of array
Sample 1-:,
"
Y
2
35
3
36
4 5 6
41 45 44
34.8 37.5 40.2 42.9 45.6
20
201
201.0
Ix
Ir
~s
__== ~'._
0:
--
Ys
-
(y
y -Ys
.-
_yJI
.
.2 -1.5 .8 2.1 -1.6 0.0 I
.04 2.25 .64 4.41 2.56
9.90 I
!
16.4
255
PARTITION OF SUM OF SQUARES
50 7
1
• ~ 40.2 + 2.7(. -
41
~~-------L2----------~3----------~4----------~--------~6 Fig. 16.3
The line of regression estimated by the least squares method always passes through the point (i, (Fig. 16.3). The reason for this is that when % .. i, s = a = y (Equations 2 and 4), and consequently the estimated line of regression passes through the point (i, The method of the least squares has other interesting characteristics. It can be seen from Table 16.3c that
r
r>
y>.
05) and (6)
The above two equations are algebraic indentities which hold for any sample. Equation (16) indicates geometrically that the sum of the distances of the n points from the estimated line of regression (Fig. 16.3) is equal to zero, that is, that the sum of the positive distances is equal to the sum of the negative distances. (A distance is considered positive or negative depending on whether the point is above or below the line.)
16.4 Partitioa of Sam of Squares The sum of squares ~y - y)2 can be partitioned into two components, and the residual SS, ~y - Yx)2, namely, the regression SS, I(y s -
r>2,
256
Ch. 16
LINEAR REGRESSION-I
that is, (1)
or total SS
I:
regression SS + residual SS,
where y is an observation, y the sample mean, and ys the soh-sample mean. This partitioning of the sum of squares can be explained by the example given in Table 16.3a. The 5 pairs of observations of the example are plotted in Fig. 16.4 as points. The estimated line of regression (Equation 14, Section 16.3), (2)
30~----~----~2------~3----~4------~S--~--6L-~~7
.. Fig. 16.4
16.4
PARTITION OF SUM OF SQUARES
257
is represented by the slanted line; and the sample mean
y - 40.2
(3)
(Equation 4, Section 16.3) is represented by the horizontal line. The three deviations involved in Equation (1) can be explained in terms of these points and lines (Fig. 16.4). The deviation (y - 1> is the vertical distance between a point and the horizontal line. Therefore the total SS, ~ - y)l, measures the overall dispersion of the points about the horizontal line. The deviation (y", - y) is the vertical distance between the line of regression and the horizontal line. This distance is influenced by the regression coefficient b, which is the slope of the line. It can be seen from Equations (2) and (3) that the two lines would coincide if the regression coefficient were 0 instead of 2.7. Therefore the regression SS, l:(y", - na, is influenced by the regression coeHicient b and measures the discrepancy between the two lines. The regression SS is equal to zero when the two lines coincide. The deviation (y - y",) is the vertical distance between a point and the line of regression. Therefore the residual SS measures the dispersion of the points about the estimated line of regression. When all the points are exactly on the line, the residual SS is equal to O. Equation (1) can be verified by the example given in Table 16.3a. The sample mean y is equal to 40.2 and the sub-sample ~ean y! can be o~ tained from Equation (2) for any value of x. From y, y, and y"" the three SS-values can be calculated. The details of the calcnlation are given in Table 16.4 which shows that the total SS 82.80 is equal to the sum of the regression SS 72.90 and the residual SS 9.90. Table 16.4 is used to demonstrate the meaning of the partition of the sum of squares, but not to demonstrate the method of computing the various SS-values. The short-cut method of computing the total SS is the familiar method of computing SS, that is, Total SS ... l:yl - (l:y)l/n (Equation 4, Section 7.4). The regression SS can be obtained very easily by the algebraic identity
l:(y", - y)2 .. b2 l:(x - i)2 _ (SP)I ISS ",'
(5)
This identity can be verified by the example given in Table 16.4. The value of b of this example is 2.7 (Equation 2) and the value of SS", is 10. Therefore. by Equation (5), the regression SS is (2.7)2(10) = 72.90 which is the same value obtained by the tedious method shown in Table 16.4. After the total SS and regression SS are obtained, the residual SS can
~
CI1
CO
TABLE 16.4
"
y
Y.
(y-1)
(Y. -1)
(y -y.>
(y -f)J
(Y. - y)J
(y _y.>J
2 3 4 5 6
35 36 41 45 44
34.8 37.5 40.2 42.9 45.6
-5.2 -4.2 .8 4.8 3.8
-5.4 -2.7 .0 2.7 5.4
.2 -1.5 .8 2.1 -1.6
27.04 17.64 .64 23.04 14.44
29.16 7.29 .00 7.29 29.16
.04 2.25 .64 4.41 2.56
20
201
201.0
0.0
0.0
0.0
I.y
I.r.
:L:
I.(y -y)
I.(y. -Y)
I.(y -f.>
82.80
72.90
9.90
I.(y _y)J
I.(f. _y)J
I.(y _y.)J
Total
Regression
Residual
r
ipj >
=
= ~o pj
Cl
:z
L
.ncr 0\ "'"
16.S
2S9
DISTRIBUTION OF SUMS OF SQUARES
be obtained by subtraction; that is, the residual SS is equal to the total SS minus the regression SS.
16.5 Dilllribulioa
or Sa.s or Squares
The distributions of the three sums of squares can be verified by a sampling experiment, the details of which are given in Chapter 4. Briefly, 1000 random samples, each consisting of S observations, are drawn from the tag population which is a normal population with mean equal to SO and variance equal to 100. Now the tag population is CODsidered to consist of S identical arrays, from each of which a single observation is drawn. Then the arbitrary x-values of 1, 2, 3, 4, and S are attached to the S observations (Y) of each sample. Consequently, the S observations YI' Ya' y" Y., and Y. become (1, YI)' (2, Ya), (3, y,), (4, Y.), and (S, Y.) respectively. As an illustration, the observations of the first two samples of Table 4.2 are listed in Table 16.Sa with their corresponding %-values. Since all the S arrays have the same mean, SO, the mean I'y.. of an array does not vary with x. Therefore, the regression coefficient fl, being the rate of change of I'y •• with respect ot x, is equal to zero and the regression equation is
I'y ••
(1)
= SO.
For each of the 1000 samples, the total SS, regression SS and residual SS can be calculated. Then the distribution of each of these three sums of squares can be obtained. The computing work involved seems treTABLE l6.5a
Any Sample
First Sample
SecoDd Sample
%
1
%
1
%
1
1
11 1a 1, 1. 1,
1
50 57 42 63 32
1
55
2 3
37
4
40
5
52
-
2 3 4
5
2 3
4 5
44
mendous, but fortunately, most of the computation is already done in connection with previous sampling experiments. For instance, the total SS, I.(y - y)2, is already calculated (or the sampling experiment of section 7.7 and the fact that !.(y - y)~ /0'2 follows the ~-distribution with n - 1 degrees of freedom is also demonstrated there. The computation o( the regression SS can be done by the short-cut method (Equation S, Section 16.4), that is, regression SS = (SP)2/SS".
260
Ch. 16
LINEAR REGRESSION-I
Since the x-values are 1, 2, 3, 4, and 5 for all samples, the values of (x - i) are -2, -1, 0, 1, and 2, and consequently, the value of SS" is 10 for each of the 1000 samples. It can be observed nom Table 16.5b that
SP - I(x - i)(y =:
1> .. 1:(x -
i)y
(2)
-21, - 1a + 1. + 21. TABLE 16.5b
-
'1
(" -~
5
'1, '12 '11 '1. '1.
-2 -1 0 1 2
15
Iy
0
"
1 2 3
"
..
('1
-1>
(" - %)('1 - y) -2'1, + 2Y -'12 + '1 0 '1. -1 2'1. - 2Y
'1,-y '12-Y '11 -y '1.-'1 '1.-'1
-2y, - '11 + '1. 5P
0
-
+ 2y.
(" - '1)2
"1 0 1
"
10
55"
which is the v of Equation (4) of Section 15.2. The values of (x - i) of the SP are the values of M of the linear combination v. Then the regression SS becomes (SP)I vi (3) --=SS 10
"
which can be readily computed, because the values of v are already computed in connection with the sampling experiment of Section 15.2. The distribution of the regression SS can be obtained from the sampling experiment of Section 15.2. Since v follows the normal distribution with mean equal to zero and variance equal to 1000 or 10 ql, then the statistic (Section 6.7)
    u = (v - 0)/√(10σ²)    (4)
follows the normal distribution with mean equal to 0 and variance equal to 1; and consequently
    u² = v²/(10σ²)    (5)
follows the χ²-distribution with 1 degree of freedom (Theorem 7.6). Since the regression SS is equal to v²/10 (Equation 3),
    u² = (regression SS)/σ²    (6)

follows the χ²-distribution with 1 degree of freedom.
After the total SS and the regression SS are computed for each sample, the residual SS can be obtained by subtraction; that is, the residual SS is equal to the total SS minus the regression SS. For example, the total SS for the first sample of Table 4.2 is 598.8 (labeled SS), which is used in Section 7.7, and the regression SS is v²/10 = (-30)²/10 = 90.0 and therefore the residual SS is 598.8 - 90.0 = 508.8. The values of the residual SS for four samples are given in Table 4.2. Since the total SS has n - 1 degrees of freedom and the regression SS has 1 degree of freedom, intuitively one would expect that the residual SS has n - 1 - 1 or n - 2 degrees of freedom. If this intuition is correct, the values of (residual SS)/σ² would follow the χ²-distribution with 5 - 2 or 3 degrees of freedom. The frequency table of the 1000 values of (residual SS)/σ² is shown in Table 16.5c with the theoretical frequencies of the χ²-distribution with 3 degrees of freedom. The corresponding histogram, superimposed on the χ²-curve with 3 degrees of freedom, is shown in Fig. 16.5.

TABLE 16.5c
    Σ(y-ȳ_x)²/σ²   Observed      Observed    Theoretical   Mid-pt.      mf
                   Frequency f   r.f.(%)     r.f.(%)         m
    0-1              213          21.3         19.9           .5       106.5
    1-2              218          21.8         22.9          1.5       327.0
    2-3              197          19.7         18.1          2.5       492.5
    3-4              124          12.4         13.0          3.5       434.0
    4-5               85           8.5          8.9          4.5       382.5
    5-6               58           5.8          6.0          5.5       319.0
    6-7               39           3.9          4.0          6.5       253.5
    7-8               21           2.1          2.6          7.5       157.5
    8-9               17           1.7          1.7          8.5       144.5
    9-10               7            .7          1.0          9.5        66.5
    10-11              8            .8           .7         10.5        84.0
    Above 11          13           1.3          1.2         14.1       183.3
    Total           1000         100.0        100.0                   2950.8

    Mean of Σ(y-ȳ_x)²/σ² = Σmf/1000 = 2950.8/1000 = 2.95
(This sampling experiment was conducted by Cathy Olsen at Oregon State College in the summer of 1954.)
From either the table or the figure, one can see that the frequencies of the 1000 values of (residual SS)/σ² fit the theoretical frequencies very well. Therefore this sampling experiment verifies the contention that (residual SS)/σ² follows the χ²-distribution with n - 2 degrees of freedom. The mean of the 1000 values of (residual SS)/σ² can be obtained from Table 16.5c by considering every value in a class to be equal to the
[Fig. 16.5. Histogram of the 1000 values of (residual SS)/σ², superimposed on the χ²-curve with 3 degrees of freedom; the vertical axis is relative frequency (r.f.).]
mid-point m of that class. The values in the class 0-1 are represented by .5; those in the class 1-2 are represented by 1.5 and so forth. The class "above 11", which has no finite mid-point, is represented by the
average 14.1 of the 13 values in that class. The mean of the 1000 values thus obtained is equal to 2.95 which is close to 3, the number of degrees of freedom. Therefore, this sampling experiment verifies the contention that (residual SS)/σ² follows the χ²-distribution with n - 2 degrees of freedom.
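The sampling experiment of this section can be repeated on a computer. The following sketch, in Python (an addition for modern readers, not part of the original text; the variable names are arbitrary), draws the 1000 samples, partitions each total SS by the short-cut method, and checks that the mean of (residual SS)/σ² is close to n - 2 = 3 and that the mean of (regression SS)/σ² is close to 1:

    import random

    random.seed(1)                            # any seed; results vary slightly
    xs = [1, 2, 3, 4, 5]
    x_bar = sum(xs) / 5
    ss_x = sum((x - x_bar) ** 2 for x in xs)  # 10 for every sample
    resid_ratios, reg_ratios = [], []
    for _ in range(1000):
        ys = [random.gauss(50, 10) for _ in range(5)]   # tag population
        y_bar = sum(ys) / 5
        sp = sum((x - x_bar) * y for x, y in zip(xs, ys))
        total_ss = sum((y - y_bar) ** 2 for y in ys)
        reg_ss = sp ** 2 / ss_x                         # short-cut method
        resid_ratios.append((total_ss - reg_ss) / 100)  # sigma^2 = 100
        reg_ratios.append(reg_ss / 100)
    print(sum(resid_ratios) / 1000)   # close to 3, as in Table 16.5c
    print(sum(reg_ratios) / 1000)     # close to 1, the mean of chi-square with 1 d.f.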
16.6 Estimate of Variance of Array

The sample estimate of the variance σ² of an array is

    s² = (residual SS)/(n - 2) = Σ(y - ȳ_x)²/(n - 2).    (1)
This method of estimating σ² follows the familiar pattern of dividing an SS by its number of degrees of freedom. It can be shown that the s² is an unbiased estimate of σ² from the fact that the mean of the values of (residual SS)/σ² for all possible samples is equal to n - 2 (Section 16.5). If the value of (residual SS)/σ² for each sample is multiplied by σ²/(n - 2), the mean of

    s² = (residual SS)/(n - 2) = [(residual SS)/σ²][σ²/(n - 2)]

is σ²/(n - 2) times (n - 2), or σ². Therefore, s² is an unbiased estimate of σ².
The reason for using the residual SS as the numerator of s² can be seen from Fig. 16.4 which shows that the residual SS measures the variation of the observations (points) about the estimated line of regression. Since the variance σ² measures the variation of the observations about the means of the arrays which are on the line of regression of the population, the residual SS is the appropriate SS to use as the numerator of s². The number of degrees of freedom, n - 2, for the residual SS can also be seen from Fig. 16.4. It takes two points to determine a straight line. If n = 2, the estimated line of regression must pass through both points and therefore the residual SS must be equal to zero regardless of the value of σ². Only when n is greater than 2 will the magnitude of the residual SS be affected by σ². Therefore, it is reasonable to expect the divisor of s² to be n - 2 (Section 7.2).
16.7 Test of Hypothesis

The statistic

    F = (regression SS/1)/(residual SS/(n - 2)) = Σ(ȳ_x - ȳ)²/s²    (1)
follows the F-distribution with 1 and n - 2 degrees of freedom and is used in testing the hypothesis that the population regression coefficient β is equal to zero. This F-distribution may be justified from the fact that (regression SS)/σ² follows the χ²-distribution with 1 degree of freedom (Equation 6, Section 16.5), or the F-distribution with 1 and ∞ degrees of freedom (Theorem 9.7), where ∞ is the number of degrees of freedom of σ². Now if σ² is replaced by the s² with n - 2 degrees of freedom, the resulting F, which is the F shown in Equation (1), will follow the F-distribution with 1 and n - 2 degrees of freedom. The reason for using this statistic F in testing the hypothesis that β = 0 can be seen intuitively from the fact that the regression SS is equal to b²SSx (Equation 5, Section 16.4). When β is large, b, being an estimate of β, will be large. When b is large, the regression SS will be large and consequently the F-value will be large. Therefore the indication of a large F-value is that the population regression coefficient β is either greater than or less than zero; and on the other hand, the indication of a small F-value is that β = 0. This F-test then is a one-tailed test. The hypothesis that β = 0 is rejected because the F-value is too large, never because it is too small. It should be realized that when β = 0 the means of all arrays are equal and the regression equation

    μ_y·x = α + β(x - x̄)

becomes

    μ_y·x = α

which says that the mean of every array is equal to α regardless of the value of x. The sample given in Table 16.3a is used as an illustration of the test of hypothesis. A random sample of 5 children is obtained and the problem is to determine whether the average height of the children varies
with age. The value of SS" of this sample is 10, that of SP is 27.0 (Table 16.3b) and that of residual SS is 9.90 (Table 16.3c). Then the regression SS, being (SP)2/SS" (Equation 5, Section 16.4), is (27)2/10 72.9, and S2, being (residual SS)/(n - 2) (Equation 1, Section 16.6), is 9.90/3 = 3.30. Then F = 72.9/3.30 = 22.1, with 1 and 3 degrees of freedom. Since 22.1 is greater than 10.128, the 5% point of F with 1 and 3 degrees of freedom, the hypothesis that f3 = 0 is rejected and the conclusion is that the average height of children does vary with age. As another example, the x may be the amount of fertilizer applied in a plot and r the yield of a crop of that plot. The problem is to determine whether the average yield is a££ected by the amount of fertilizer applied. The hypothesis being tested here is that the average yield remains the same regardless of the amount of fertilizer applied.
16.8 Correlation Coefficient

The correlation coefficient between x and y is defined as

    r = Σ(x - x̄)(y - ȳ)/√[Σ(x - x̄)² Σ(y - ȳ)²] = SP/√(SSx·SSy).    (1)
The meaning of the correlation coefficient r can be seen through the square of r, that is,

    r² = (SP)²/(SSx·SSy).    (2)

Since the regression SS is equal to (SP)²/SSx (Equation 5, Section 16.4) and SSy, Σ(y - ȳ)², is the total SS, the square of r is

    r² = regression SS/total SS    (3)
which must be a number between 0 and 1, because the total SS is equal to the regression SS plus the residual SS. Thus r² ranges from 0 to 1 and consequently r itself ranges from -1 to 1. The + or - sign of r is determined by that of the sample regression coefficient b, because both b and r have the same numerator SP. Therefore, the sign of r indicates whether the estimated line of regression has a positive or negative slope. The correlation coefficient r may be used as an index measuring the closeness of fit of the n observed points to the estimated line of regression. It can be seen from Equation (3) that

    regression SS = r²(total SS)    (4)

and

    residual SS = (1 - r²)(total SS).    (5)
The larger the absolute value of r, the closer the points will fit the line. If r = ±1, the residual SS will be equal to zero and every point will be exactly on the line. If r = 0, then b = 0, the line will be horizontal, and ȳ_x = ȳ.
The statistic F of the preceding section, where

    F = (regression SS)/(residual SS/(n - 2)),    (6)

can be expressed in terms of r and n. When Equations (4), (5), and (6)
are combined, the statistic F can be expressed as

    F = r²(total SS)/[(1 - r²)(total SS)/(n - 2)] = r²(n - 2)/(1 - r²).    (7)

Since F has 1 and n - 2 degrees of freedom, the square root of F (Theorem 12.6),

    √F = t = r√(n - 2)/√(1 - r²),    (8)
follows Student's t-distribution with n - 2 degrees of freedom. Therefore, the hypothesis that the means of all arrays are equal can be tested by either t or F, when the values of r and n are known. However, the percentage points of r, like those of t and F, are tabulated (R. A. Fisher and F. Yates: Statistical Tables for Biological, Agricultural and Medical Research, Table VII), and consequently the hypothesis can be tested through r directly, without calculating t or F. But the conclusions reached through the use of t, F, or r are always the same. For example, the 5% point of r with 10 degrees of freedom is 0.5760, and that of F with 1 and 10 degrees of freedom is 4.965. But

    r²(n - 2)/(1 - r²) = (0.5760)²(10)/[1 - (0.5760)²] = 3.317760/0.668224 = 4.965 = F.

Therefore if r is greater than 0.5760, F must be greater than 4.965; and consequently the conclusions reached through the use of r or F must always be the same. The letter n given in Fisher and Yates' table is the number of degrees of freedom of the residual SS and is, therefore, equal to n - 2, where n is the sample size.
16.9 Algebraic Identities and Computing Methods

Many algebraic identities are involved in regression. Even though they do not contribute much to one's understanding of statistics, they are nevertheless indispensable in developing short-cut methods of computation. One of the most useful identities is

    SP = Σ(x - x̄)(y - ȳ) = Σxy - (Σx)(Σy)/n.    (1)

This identity is explained and verified by the numerical example given in Table 16.9a which shows that the value of SP obtained by either algebraic expression is equal to -3. One of the reasons for using this identity for computational purposes is to reduce the number of subtractions. To obtain the deviations (x - x̄) requires n subtractions and to obtain the deviations (y - ȳ) requires another n subtractions; but these 2n subtractions are reduced to one subtraction by Equation (1). Another reason for
TABLE 16.9a

     x     y     xy    x - x̄   y - ȳ   (x - x̄)(y - ȳ)
     3     5     15      0       1          0
     2     4      8     -1       0          0
     3     6     18      0       2          0
     4     1      4      1      -3         -3

    12    16     45      0       0         -3
    Σx    Σy    Σxy   Σ(x-x̄)  Σ(y-ȳ)      SP

    n = 4;  x̄ = 12/4 = 3;  ȳ = 16/4 = 4

    SP = Σxy - (Σx)(Σy)/n = 45 - (12)(16)/4 = 45 - 48 = -3
using the algebraic identity is to postpone the divisions which are required in finding x̄ and ȳ and thus avoid the accumulation of the rounding-off errors in the deviations (x - x̄) and (y - ȳ). For convenience of reference, the algebraic identities which are explained either in this section or in the preceding sections are listed below:
    SP = Σ(x - x̄)(y - ȳ) = Σ(x - x̄)y = Σxy - (Σx)(Σy)/n    (2)

    SSx = Σ(x - x̄)² = Σx² - (Σx)²/n    (3)

    SSy = total SS = Σ(y - ȳ)² = Σy² - (Σy)²/n    (4)

    b = Σ(x - x̄)(y - ȳ)/Σ(x - x̄)² = SP/SSx    (5)

    r = Σ(x - x̄)(y - ȳ)/√[Σ(x - x̄)² Σ(y - ȳ)²] = SP/√(SSx·SSy)    (6)

    Regression SS = Σ(ȳ_x - ȳ)² = b²SSx = (SP)²/SSx    (7)

    r² = (SP)²/(SSx·SSy) = regression SS/total SS    (8)

    Residual SS = Σ(y - ȳ_x)² = total SS - regression SS    (9)

    s² = residual SS/(n - 2)    (10)
The last algebraic expression of each of the above equations is usually the most adaptable one for computation on desk calculators. A quantity such as Σxy, Σx², or Σy² can be obtained in one continuous operation on desk calculators. The bulk of the computing work involved in regression lies in the obtaining of SP, SSx, and SSy. After these three quantities are obtained, the rest of the quantities can be found readily through the identities listed above. The procedures in computing the various quantities may be illustrated by the following example:
The details of the computation are shown in Table 16.9b.

TABLE 16.9b

    n = 6
    Σx = 18          Σy = 24          Σxy = 64
    x̄ = 3            ȳ = 4            (Σx)(Σy)/n = 432/6 = 72
    (Σx)² = 324      (Σy)² = 576      SP = 64 - 72 = -8
    (Σx)²/n = 54     (Σy)²/n = 96
    Σx² = 58         Σy² = 118
    SSx = 58 - 54 = 4                 SSy = 118 - 96 = 22

    Regression coefficient:   b = SP/SSx = -8/4 = -2
    Regression equation:      ȳ_x = ȳ + b(x - x̄) = 4 - 2(x - 3)
    Correlation coefficient:  r = SP/√(SSx·SSy) = -8/√((4)(22)) = -.85
    Regression SS:            Σ(ȳ_x - ȳ)² = (SP)²/SSx = (-8)²/4 = 16
    Residual SS:              Σ(y - ȳ_x)² = SSy - Reg. SS = 22 - 16 = 6
    Estimated variance:       s² = (residual SS)/(n - 2) = 6/4 = 1.5
    Test of hypothesis β = 0: F = Reg. SS/s² = 16/1.5 = 10.67 with 1 & 4 d.f.
EXERCISES

(1) Consider the following ten pairs of observations as a population:

    x:   4   4   5   5   6   6   7   7   8   8
    y:   8  10  10  12  12  14  14  16  16  18
(a) Find the mean of y for each of the five x-arrays. (b) Find the regression equation of y on x. (μ_y·x = 2x + 1) (c) Find the variance of y for each of the five x-arrays. (σ²_y·x = 1) (d) What is the value of the regression coefficient of y on x? (β = 2)
(2) Verify the algebraic identity that the total SS is equal to the re-
gression SS plus the residual SS (Table 16.4) by the following sample of six pairs of observations:
(3) For the data given in Exercise (2), compute the total SS, regression SS and residual SS by the short-cut method (Table 16.9b) and see that the values obtained are the same as those of Exercise (2).
(4) A random sample of 20 observations (y) is drawn from the tag population, which is a normal population with mean equal to 50 and standard deviation equal to 10. To each of the 20 observations an arbitrary value of x is attached. The 20 pairs of values are listed below:

    x    y     x    y     x    y     x    y
    2   59     4   47     6   51     8   60
    4   41     6   47     8   65     2   39
    6   35     8   53     2   52     4   64
    8   43     2   52     4   56     6   50
    2   48     4   48     6   39     8   36
Since for each x-array of y the mean is 50, the regression equation of y on x is

    μ_y·x = 50

and therefore the regression coefficient β is zero. Pretend that the source of the sample is unknown and test the hypothesis that β = 0 at the 5% level. (F = 0.0208 with 1 and 18 d.f.) Since it is actually known that β = 0, state whether your conclusion is correct or a Type I error has been committed. A Type II error cannot be committed in this case, because the hypothesis being tested is true.
(5) To the data of Exercise (4), add 1 to each of the observations (y) with x = 2; add 2 to each of the observations with x = 4; add 3 to each of the observations with x = 6; and add 4 to each of the observations with x = 8. The resulting sample is as follows:

    x    y     x    y     x    y     x    y
    2   60     4   49     6   54     8   64
    4   43     6   50     8   69     2   40
    6   38     8   57     2   53     4   66
    8   47     2   53     4   58     6   53
    2   49     4   50     6   42     8   40
The means of the four x-arrays of y are as follows:

    x:      2    4    6    8
    mean:  51   52   53   54

As x increases by 2 units, the mean of y increases by 1 unit. Therefore, the regression coefficient β is equal to 0.5. Pretend that the source of the sample is unknown and test the hypothesis that β = 0 at the 5% level. (F = 0.1685 with 1 and 18 d.f.) Since it is actually known that β = .5, state whether your conclusion is correct or a Type II error has been committed. Note that a Type I error cannot be committed because the hypothesis is false. The purpose of this exercise is to demonstrate that a Type II error is easily made, if the sample size is small and the hypothesis is not too far wrong.
(6) To the data of Exercise (4), add 100 to each of the observations (y) with x = 2; add 200 to each of the observations with x = 4; add 300 to each of the observations with x = 6; and add 400 to each of the observations with x = 8. The resulting sample is as follows:

    x    y      x    y      x    y      x    y
    2  159      4  247      6  351      8  460
    4  241      6  347      8  465      2  139
    6  335      8  453      2  152      4  264
    8  443      2  152      4  256      6  350
    2  148      4  248      6  339      8  436
The means of the four arrays of y are as follows:

    x:       2     4     6     8
    mean:  150   250   350   450
As x increases by 2 units, the mean of y increases by 100 units. Therefore, the regression coefficient β is equal to 50. Pretend that the source of the sample is unknown and test the hypothesis that β = 0 at the 5% level. (F = 3,062 with 1 and 18 d.f.) Since it is actually known that β = 50, state whether your conclusion is correct or a Type II error has been committed. Note that a Type I error cannot be committed because the hypothesis is false. The purpose of this exercise is to demonstrate that a false hypothesis is likely to be rejected with a relatively small sample, if the hypothesis is far enough from the truth.
(7) The following data were obtained from the determination of a standard curve for the colorimetric measurement of phosphorus, using an Evelyn Colorimeter. Varying known amounts of a standard phos-
phate solution were placed in colorimeter tubes, the colorimetric reaction set up, and the optical density of the resulting color was recorded.

    Micrograms        Optical
    Phosphorus x      Density y
       0.00            0.000
       2.28            0.056
       4.56            0.102
       6.84            0.174
       9.12            0.201
      11.40            0.268
      13.68            0.328
      15.96            0.387
      18.24            0.432
      22.80            0.523
      27.36            0.638

(Courtesy of Dr. Robert L. Stearman)
Find (a) the estimated regression equation of optical density on micrograms of phosphorus, (b) the estimated variance of array, and (c) the correlation coefficient.
(8)
    Liver weight     Testes fat weight
    x grams          y milligrams
    1.384                312
    1.342                275
    1.233                428
    1.331                318
    1.231                415
    1.482                208
    1.133                351
    1.483                196
    1.399                223
    1.834                145
    1.544                285
    1.491                181
    1.301                219
    1.150                326
    1.498                373
    1.425                187
    1.319                176
    1.404                299
    1.531                193
    1.438                183
    1.438                245
The preceding data on liver weight and testes fat weight were gathered incidental to a study of the effect of social rank on adrenal size in mice. It was thought that the weight of testes fat should be inversely related to liver weight. Find the correlation coefficient and test for its significance (Equation 7 or 8, Section 16.8).
(9) The college entrance examination score (x) and the final examination score (y) in a freshman course for a group of students are listed in the accompanying table. (a) Find (i) the estimated line of regression, (ii) the estimated variance of array, and (iii) the correlation coefficient. (b) Test the hypothesis that the population regression coefficient is equal to zero at the 5% level.
    x: 4 2 9 8 5 5 9 2 6 2 10 10 10 8 7 9 5 7 8 2 9 6 7 3 9 4 3 5 10
    y: 65 72 95 84 65 66 96 70 80 68 83 100 90 84 86 80 66 69 91 84 67 72 80 66 90 74 73 63 91

    x: 7 9 5 8 8 3 1 9 10 5 3 7 7 6 6 3 10 6 5 8 7 9 5 8 4 1 10 7 2
    y: 80 72 72 94 91 66 68 83 103 64 80 66 87 81 86 76 85 78 74 100 73 85 82 73 100 57 83 61 75

    x: 4 9 4 10 9 8 6 9 7 6 2 3 4 4 3 10 2 10 4 9 6 6 1 9 10 9 9 5 9 3 6 3 6 5 2 3 7 6 6 1 9 5 8 3 9 9 2 5 9 4 6 8 4 3 9 1 6 4
    y: 80 74 66 93 86 95 60 87 [?] 83 80 77 83 77 97 88 65 74 93 80 76 70 51 64 72 76 60 77 90 75 68 75 58 90 87 61 82 72 92 54 94 70 100 76 95 80 64 92 86 79 50 80 60 71 82 76 86 75
(10) The modulus of rupture (y) and the specific gravity (x) of 20 fiber-boards are listed below:

    x        y       x        y       x        y       x        y
    0.930    519     1.021    574     1.026    633     1.016    622
    0.902    424     0.958    525     0.989    549     1.026    601
    0.870    392     0.963    494     0.981    541     0.987    557
    0.914    462     0.968    559     0.977    565     1.040    611
    0.935    465     0.991    568     1.026    613     1.020    611
(a) Test the hypothesis that the average modulus of rupture is not affected by the specific gravity of fiber-board, at the 5% level. (b) Find the estimated line of regression of the modulus of rupture on the specific gravity of fiber-boards. (c) Find the correlation coefficient.
QUESTIONS

(1) The regression equation of y on x gives the relation between what and what?
(2) What is the regression coefficient β?
(3) If x is the number of inches and y the number of pounds, what is the unit of the regression coefficient?
(4) What method is used in estimating the regression coefficient β?
(5) What is the meaning of the statement that the regression of y on x is linear?
(6) Indicate graphically the following deviations: (a) y - ȳ  (b) ȳ_x - ȳ  (c) y - ȳ_x
(7) What hypothesis does the statistic

    F = (regression SS)/(residual SS/(n - 2))

test? Is this test a one-tailed or two-tailed test?
(8) Under what conditions is the correlation coefficient equal to 1, -1, and 0?
(9) What does the sign (+ or -) of the correlation coefficient indicate?
(10) If the regression SS is equal to 25 and the residual SS is equal to 75, what is the value of the correlation coefficient?
REFERENCES

Anderson, R. L. and Bancroft, T. A.: Statistical Theory in Research, McGraw-Hill Book Company, New York, 1952.
Mood, Alexander M.: Introduction to the Theory of Statistics, McGraw-Hill Book Company, New York, 1950.
CHAPTER 17

LINEAR REGRESSION-II

This chapter is a continuation of the preceding chapter. All notations and terminology are exactly as used before.
17.1 Sampling Experiment

In Section 16.5 a sampling experiment with the regression coefficient β equal to zero is used to verify the distributions of the sums of squares. In this section, another sampling experiment, with β different from zero, is described. This experiment is used in the following sections to verify the distributions of ȳ, b, and ȳ_x and also to determine the variance components (Section 12.4) of the regression SS. The sampling experiment described in Chapter 4 involves the drawing of 1000 random samples, each consisting of 5 observations, from the tag population. But in regression, a population must contain at least two arrays, while the tag population is a single array which follows the normal distribution with mean equal to 50 and variance equal to 100. However, 5 arrays with different means can be created by adding 5 different numbers, 0, 20, 40, 60, and 80 to the 5 respective observations of each of the 1000 random samples. Then the 5 modified observations of each sample become a sample drawn from a population with the means of the arrays equal to 50, 70, 90, 110, and 130 respectively. The 5 arrays are identified by the x-values 1, 2, 3, 4, and 5 respectively. For x = 1, the mean of y is 50; for x = 2, the mean of y is 70, and so forth. The means of y of the 5 x-arrays are listed below:
    x:        1    2    3    4    5
    μ_y·x:   50   70   90  110  130
From the above table, it can be seen that x̄ = 3 and μ̄_y·x = 90 (Equation 3, Section 16.1) and that the mean of the array increases by 20 units while x increases by 1 unit, that is, β = 20. Therefore, the regression equation of this population is
p. y'"
= 90 + 20(x-3).
0)
It should be realized that the addition of the same number to all observations of a normal population does not alter the population in any way except to change its mean (Theorem 2.4a and Section 3.1). Therefore the population, consisting of 5 arrays which are created by adding 0, 20, 40,
60, and 80 to the 5 respective observations of each of the 1000 samples, is a population as described in Section 16.2; that is, in the population thus created, (a) each x-array of y follows the normal distribution (the tag population is normal); (b) the regression of y on x is linear (Equation 1); (c) all the x-arrays of y have the same variance (σ² = 100). The conversion of the observations can be illustrated by an example. Four samples, drawn from the tag population, are shown in Table 4.2. The observations of the first sample are 50, 57, 42, 63, and 32. When the numbers 0, 20, 40, 60, and 80 are added to the 5 original observations, the modified observations are 50, 77, 82, 123, and 112. To avoid possible confusion between the original and the corresponding modified observations, y' is used to denote an original observation and y is used to denote the corresponding converted observation. Then the relation between y' and y is
    y₁ = y₁' + 0;  y₂ = y₂' + 20;  y₃ = y₃' + 40;  y₄ = y₄' + 60;  y₅ = y₅' + 80.    (2)
By the above equations, the 5 observations y' are converted into 5 pairs of observations (x, y), with x being 1, 2, 3, 4, and 5. The original observations and the converted pairs of observations for the first two samples given in Table 4.2 are listed in Table 17.1. It should be noted from Table 17.1 that the x-values remain the same for all samples. This is an important condition in developing the theorems in the following sections. There are reasons for converting the observations of a previous sampling experiment instead of performing a new one. The most obvious reason is to save the effort of drawing new samples, but the most important reason is to link linear regression with the topics discussed in preceding chapters such as the t-test, analysis of variance, and individual degree of freedom.

TABLE 17.1

          Any Sample               First Sample         Second Sample
     x     y'     y                x    y'    y         x    y'    y
     1    y₁'    y₁' +  0          1    50    50        1    55    55
     2    y₂'    y₂' + 20          2    57    77        2    44    64
     3    y₃'    y₃' + 40          3    42    82        3    37    77
     4    y₄'    y₄' + 60          4    63   123        4    40   100
     5    y₅'    y₅' + 80          5    32   112        5    52   132

    Sum   Σy'    Σy' + 200        Sum  244   444       Sum  228   428
    Mean  ȳ'     ȳ' + 40          Mean 48.8  88.8      Mean 45.6  85.6
The conditions under which the sampling experiment is conducted can be summarized as follows:
1. The samples are drawn at random.
2. Each array of y of the population follows the normal distribution.
3. The variances of all the arrays of the population are equal.
4. The regression of y on x is linear.
5. The x-values remain the same for all samples and do not change from sample to sample.
These conditions are assumed when linear regression is applied to practical problems.
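Under these conditions the experiment itself is easily reproduced. The following Python sketch (a modern substitute for the tag population, not part of the original text) creates the converted samples and computes ȳ and b for each; the results anticipate the theorems of the next two sections:

    import random

    random.seed(1)
    shifts = [0, 20, 40, 60, 80]          # added to y' to create the 5 arrays
    y_bars, bs = [], []
    for _ in range(1000):
        ys = [random.gauss(50, 10) + s for s in shifts]   # converted y
        y_bars.append(sum(ys) / 5)
        sp = sum((x - 3) * y for x, y in zip([1, 2, 3, 4, 5], ys))
        bs.append(sp / 10)                # SS_x = 10 for every sample
    mean_b = sum(bs) / 1000
    var_b = sum((b - mean_b) ** 2 for b in bs) / 1000
    print(sum(y_bars) / 1000)             # close to alpha = 90
    print(mean_b, var_b)                  # close to beta = 20 and sigma^2/SS_x = 10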
17.2 Distribution of Sample Mean

The sample mean ȳ, which is the least squares estimate of α (Equation 3, Section 16.3), will change from sample to sample. The distribution of the sample mean drawn from a single array (population) is discussed in Chapter 5. Now the distribution of ȳ of samples drawn from a population with many arrays is considered in this section. The distribution of the values of the sample mean is to be obtained from the sampling experiment described in the preceding section. It can be seen from Table 17.1 that the mean ȳ of the 5 converted observations is equal to the mean ȳ' of the 5 original observations plus 40, that is,
    ȳ = ȳ' + 40.    (1)
From the above equation, the distribution of ȳ can be derived from that of ȳ'. The distribution of ȳ', which is verified experimentally in Section 5.6, follows the normal distribution (Theorem 5.2b) with mean equal to μ or 50 and variance equal to σ²/n or 100/5 (Theorem 5.3). Because of the relation expressed in Equation (1), it can be concluded (Theorem 2.4a and Section 3.1), without further experimentation, that ȳ follows the normal distribution with mean equal to 50 + 40 or 90, which is α (Equation 1, Section 17.1), and variance also equal to σ²/n or 100/5. The result of this sampling experiment verifies the following theorem:
Theorem 17.2 If all possible samples of size n are drawn from the population with the regression equation

    μ_y·x = α + β(x - x̄)

and the specifications given in Section 16.2, the statistic ȳ or a, which is the least squares estimate of α, follows the normal distribution with mean equal to α and variance equal to σ²/n, where σ² is the variance common to all arrays.

The above theorem can also be derived from Theorem 15.2. The estimate of α,

    a = ȳ = (1/n)(y₁) + (1/n)(y₂) + ... + (1/n)(yₙ),    (2)
is a linear combination (Section 15.1) of the observations y. If all possible samples are drawn from the population, the mean of the values of y₁, which are the observations drawn from the first array, is equal to the mean of the first array, and the mean of the values of y₂ is equal to the mean of the second array and so forth. The variance of y is equal to σ² for all arrays. Therefore, by Theorem 15.2, the statistic a follows the normal distribution with mean equal to
r"
1 II
ro
-
-II
n ry.
1 1 + -II + . .. + -II x, n rye x. n ry. x n
-
<X
(3)
(Equation 6, Section 16.3) and variance equal to
Note that the result derived from Theorem 15.2 is the same as that given in Theorem 17.2. There are reasons for using the least squares method of estimating α. First, the estimate a or ȳ is an unbiased estimate (Section 7.3) of α (Equation 3). Second, the least squares estimate has the smallest variance as compared to all other kinds of estimates which are also based on linear combinations of the observations y. The physical meaning of the fact that the variance of a or ȳ decreases as the sample size increases (Equation 4) can be seen from Fig. 16.3. The variation of ȳ from sample to sample moves the estimated line of regression up and down without changing its slope. When the number of points, which is the sample size n, is increased, the line is held to a more stabilized position, that is, the variance of ȳ is decreased. From Theorem 17.2, it follows that the statistic

    u = (ȳ - α)/√(σ²/n)    (5)
follows the normal distribution with mean equal to 0 and variance equal to 1. If the σ² of Equation (5) is replaced by s², where s² is the residual SS divided by (n - 2) (Equation 1, Section 16.6), the resulting statistic

    t = (ȳ - α)/√(s²/n)    (6)
follows Student's t-distribution with n - 2 degrees of freedom. This statistic t may be used in testing the hypothesis that α is equal to a given value α₀ and also in finding the confidence interval (Section 11.4)
of α, that is,

    ȳ - t.₀₂₅√(s²/n) < α < ȳ + t.₀₂₅√(s²/n)    (7)
with the confidence coefficient being .95. The sample of ten pairs of observations given in Table 17.6b may be used as an illustration of the computing procedure. The details of the computation are given in Table 17.6c which shows that n = 10, ȳ = 6.2, and s² = 5.2. In testing the hypothesis that the parameter α is equal to 5.0, the statistic is

    t = (6.2 - 5.0)/√(5.2/10) = 1.2/.721 = 1.66
with 8 degrees of freedom. If the 5% significance level is used, the critical regions are t < -2.306 and t > 2.306 (Table 6, Appendix). In estimating the parameter α, the 95% confidence limits are

    6.2 ± 2.306√(5.2/10)  or  6.2 ± 2.306(.721)  or  6.2 ± 1.7  or  4.5 and 7.9.
17.3 Distribution of Sample Regression Coefficient

The distribution of the sample regression coefficient b can be obtained from the sampling experiment described in Section 17.1. For each of the 1000 samples, the regression coefficient (Equation 7, Section 16.3) can be calculated and thus the frequency distribution of the 1000 values of b can be obtained. But judging from the algebraic expression of b, this operation would appear to be a tremendous undertaking. The complicated computation must be preceded by converting the 5000 original observations y' into y (Equation 2, Section 17.1). However, a short-cut method can be found to make the computation of the 1000 values of b almost effortless. The short-cut method, which is shown in Table 17.3, is not only a device to save time and effort in conducting the sampling experiment but also a device to show the theory of the distribution of the sample regression coefficient. Since the x-values are 1, 2, 3, 4, and 5 for all samples, the values of (x - x̄) are -2, -1, 0, 1, and 2, and consequently the value of SSx is 10 for each of the 1000 samples (Table 17.3). Therefore, the denominators of the 1000 values of b need no com-
putation. The effort of computing the numerators of the values of b is also drastically reduced as shown in Table 17.3. The short-cut method reduces the computation of b to

    b = SP/SSx = (-2y₁' - y₂' + y₄' + 2y₅' + 200)/10.    (1)

TABLE 17.3

     x    y           x - x̄   (x - x̄)²   y - ȳ             (x - x̄)(y - ȳ)
     1    y₁' +  0     -2        4       (y₁' +  0) - ȳ    -2y₁' -   0 + 2ȳ
     2    y₂' + 20     -1        1       (y₂' + 20) - ȳ     -y₂' -  20 +  ȳ
     3    y₃' + 40      0        0       (y₃' + 40) - ȳ      0
     4    y₄' + 60      1        1       (y₄' + 60) - ȳ      y₄' +  60 -  ȳ
     5    y₅' + 80      2        4       (y₅' + 80) - ȳ     2y₅' + 160 - 2ȳ

    15   Σy' + 200      0       10                         -2y₁' - y₂' + y₄' + 2y₅' + 200
    Σx   Σy           Σ(x-x̄)    SSx                         SP or Σ(x - x̄)(y - ȳ)
It can be seen from the above equation that the original observations y' need not be converted into y for the computation of the value of b. For each sample, one simply uses the original observations and carries out the following simple operations:
(1) Subtract the first observation twice.
(2) Subtract the second observation once.
(3) Ignore the third observation.
(4) Add the fourth observation once.
(5) Add the fifth observation twice.
(6) Add 200.
(7) Place a decimal point (divide by 10) in the sum thus obtained.
It should be noted from Equation (1) that

    b = v/10 + 20,  or  b - 20 = v/10    (2)
where v is the same v given in Equation 4, Section 15.2. (The y in Section 15.2 is an observation drawn from the tag population without conversion and is therefore the y' in this sampling experiment.) The sampling experiment in that section shows that v follows the normal distribution with mean equal to 0 and variance equal to 1000. When each of the 1000 values of v is divided by 10 (Theorem 2.4b), the mean of (b - 20) or v/10 is 0/10, or still 0, and the variance of (b - 20) is 1000/10² or 10. Therefore, the statistic b follows the normal distribution with mean equal to 20 or β and variance equal to 10 (Theorem 2.4a), or 100/10
or σ²/SSx. The result of this sampling experiment verifies the following theorem:

Theorem 17.3 If all possible samples of size n are drawn from the population with the regression equation

    μ_y·x = α + β(x - x̄)

and the specifications given in Section 16.3, the statistic b, which is the least squares estimate of β, follows the normal distribution with mean equal to β and variance equal to σ²/SSx, that is,

    σ²_b = σ²/SSx = σ²/Σ(x - x̄)²,

where σ² is the variance common to all arrays.

The above theorem can also be derived from Theorem 15.2, because the sample regression coefficient b can be expressed as a linear combination of the observations y. It can be observed from Table 17.3 that
    SP = Σ(x - x̄)(y - ȳ) = Σ(x - x̄)y    (3)
because Σ(x - x̄) is equal to zero, and consequently ȳΣ(x - x̄) = 0. Equation (3) is an algebraic identity which holds for any n pairs of observations and not only for the observations of this sampling experiment. The purpose of pointing out this identity is to show that

    b = SP/SSx = [(x₁ - x̄)/SSx]y₁ + [(x₂ - x̄)/SSx]y₂ + ... + [(xₙ - x̄)/SSx]yₙ    (4)
is a linear combination (Section 15.1) of the observations y. If all possible samples are drawn from the population, the mean of the values of y₁, which are the observations drawn from the first array, is equal to the mean of the first array, the mean of the values of y₂ is equal to the mean of the second array, and so forth. The variance of y is equal to σ² for all arrays. Therefore, in accordance with Theorem 15.2, the statistic b follows the normal distribution with mean equal to

    μ_b = [(x₁ - x̄)/SSx]μ_y·x₁ + [(x₂ - x̄)/SSx]μ_y·x₂ + ... + [(xₙ - x̄)/SSx]μ_y·xₙ    (5)

and variance equal to

    σ²_b = [(x₁ - x̄)²/(SSx)²]σ² + [(x₂ - x̄)²/(SSx)²]σ² + ... + [(xₙ - x̄)²/(SSx)²]σ².    (6)
The algebraic expression for the mean of b can be simplified. When the equation

    μ_y·x₁ = α + β(x₁ - x̄)

is multiplied through by (x₁ - x̄)/SSx, the resulting equation is

    [(x₁ - x̄)/SSx]μ_y·x₁ = [(x₁ - x̄)/SSx]α + β(x₁ - x̄)²/SSx.

By changing the subscript 1 of x into 2, 3, ..., n, a set of n similar equations can be obtained. The mean of b is the sum of n such equations (Equation 5). When the n equations are added together, the resulting equation is

    μ_b = [Σ(x - x̄)/SSx]α + β[Σ(x - x̄)²/SSx] = β    (7)

because Σ(x - x̄) = 0 and Σ(x - x̄)² is SSx.
because ~(x-i) = 0 and I(x-iP is SS s • The algebraic expression of the variance of b can also be simplified. The common factor of equation (6) is u 2leSS s)2. When the n quantities (X-i)2 are aJJed together, the resulting sum is SS s. Therefore,
u2 b
uJ
=
Ufo
--(SS ) _ (SS)2 S SS
Ufo
-I:: - - -
s
s
I(X-i)2'
(8)
Note that the above result derived from Theorem 15.2 is the same as that given in Theorem 17.3. The reasons for using the least squares method of estimating β are the same as those given for estimating α. First, the estimate b is an unbiased estimate (Section 7.3) of β and second, the least squares estimate has the smallest variance as compared to other kinds of estimates which are also based on linear combinations of the observations y. The physical meaning of the fact that the variance of the values of b decreases as the value of SSx increases (Equation 8) can be seen from Fig. 16.3. The value b is the slope of the estimated line of regression. The variation of b results in a rotation of the line with the point (x̄, ȳ) as the center. The increase in the number of points, which is the sample size n, does not necessarily prevent the line from rotating. The positions of the points are also important. The additional points placed at the center of rotation, that is, x = x̄, or (x - x̄) = 0, do not stabilize the line. However, if the additional points are placed at the ends of the line, that is, where (x - x̄) assumes the greatest values, they are much more effective in preventing the line from rotating. Therefore, the accuracy of a sample regression coefficient does not entirely depend on the sample size. The
combination of the sample size and the spread of the x-values, as indicated by the magnitude of SSx, determines the accuracy of b. This point, which has an important bearing on application, is stressed again in Section 17.5. It follows from Theorem 17.3 that the statistic

    u = (b - β)/√(σ²/SSx)    (9)
follows the normal distribution with mean equal to zero and variance equal to one. If the variance σ² of Equation (9) is replaced by s², where s² is the residual SS divided by (n - 2) (Equation 1, Section 16.6), the resulting statistic

    t = (b - β)/√(s²/Σ(x - x̄)²)    (10)
follows Student's t-distribution with n - 2 degrees of freedom. This statistic t may be used in testing the hypothesis that β is equal to a given value β₀, and also in finding the confidence interval (Section 11.4) of β, that is,

    b - t.₀₂₅√(s²/SSx) < β < b + t.₀₂₅√(s²/SSx)    (11)

with the confidence coefficient being .95.
In testing the hypothesis that β is equal to zero, the statistic t in Equation (10) becomes

    t = b/√(s²/SSx)    (12)

with n - 2 degrees of freedom. Then the statistic

    t² = F = b²/(s²/SSx) = b²SSx/s²    (13)

follows the F-distribution with 1 and n - 2 degrees of freedom (Theorem 12.6). It should be noted that the F in Equation (13) is exactly the same F of Equation (1), Section 16.7, because b²SSx is the regression SS (Equation 7, Section 16.9) and s² is the residual SS divided by n - 2
(Equation 1, Section 16.6). Then either the t-test or the F-test may be used in testing the hypothesis that β is equal to zero and the conclusions reached by these two tests are always the same. However, the statistic t in Equation (10) has the advantage of testing the hypothesis that β is equal to a value other than zero. In addition, the confidence interval of β can be obtained from the same t. The sample of ten pairs of observations given in Table 17.6b may be used as an illustration of the computing procedure. The details of the computation are given in Table 17.6c which shows that n = 10, b = -2, SSx = 14.5, and s² = 5.2. In testing the hypothesis that the regression coefficient β is equal to zero, the statistic is
    t = (-2 - 0)/√(5.2/14.5) = -2/.599 = -3.34

with 8 degrees of freedom. If the 5% significance level is used, the critical regions are t < -2.306 and t > 2.306 (Table 6, Appendix). Note that the square of the t-value, (-3.34)², is equal to 11.15, which is the F-value shown in Table 17.6c. In estimating the parameter β, the 95% confidence limits are
    -2 ± 2.306√(5.2/14.5)  or  -2 ± 2.306(.599)  or  -2 ± 1.4  or  -3.4 and -0.6.
17.4 Distribution of Adjusted Mean

The mean μ_y·x of y for any array can be determined from the regression equation

    μ_y·x = α + β(x - x̄)    (1)

of the population. However, if the population is not available, the mean μ_y·x of an array must be estimated through the line of regression of a sample. For a particular array where x = x', the estimated mean of y is (Equation 2, Section 16.3)

    ȳ_x' = ȳ + b(x' - x̄) = ȳ - b(x̄ - x').    (2)

Since ȳ_x' is obtained by subtracting b(x̄ - x') from ȳ, the former is called the adjusted mean while the latter, by contrast, is called the unadjusted mean.
Both ȳ_x and ȳ are estimated means; ȳ_x, being the estimated mean of an array, is the sub-sample mean and ȳ, being the estimated mean of the entire population, is the sample mean. Since both ȳ and b vary from sample to sample, ȳ_x will vary from sample to sample. The distribution of the sub-sample mean or the adjusted mean ȳ_x is considered in this section. It can be seen from Equation (2) that ȳ_x is a linear combination (Section 15.1) of ȳ and b with the multipliers (M) being 1 and (x' - x̄) respectively. Therefore, the distribution of the sub-sample mean ȳ_x can be derived from Theorem 15.2. Since the sample mean ȳ follows a normal distribution with mean equal to α and variance equal to σ²/n (Theorem 17.2), and the sample regression coefficient b follows a normal distribution with mean equal to β and variance equal to σ²/SSx (Theorem 17.3), the linear combination ȳ_x of ȳ and b also follows a normal distribution with mean equal to (Equation 2, Section 15.2)
    μ_ȳx' = μ_ȳ + (x' - x̄)μ_b = α + (x' - x̄)β = μ_y·x'    (3)

(Equation 1) and variance equal to (Equation 3, Section 15.2 and Theorem 17.3)

    σ²_ȳx' = σ²[1/n + (x' - x̄)²/Σ(x - x̄)²].    (4)
The above result about the distribution of the sub-sample mean ȳ_x is derived from Theorem 15.2 which requires that ȳ and b be independent. So far the independence has not been established. Therefore, the above result must be considered tentative and requires experimental verification. The sampling experiment is described in detail in Section 17.1. Briefly, 1000 random samples, each consisting of 5 observations, are drawn from the population with the regression equation being

    μ_y·x = 90 + 20(x - 3),    (5)

x-values being 1, 2, 3, 4, 5, and the variance of every array being 100. The values of ȳ and b are already calculated for each sample (Sections 17.2 and 17.3). Therefore, the sub-sample mean ȳ_x for each sample can be calculated from Equation (2). The values of ȳ_x of the 1000 samples are used to verify the tentative result about the distribution of ȳ_x. In this sampling experiment only one sub-sample mean is calculated for each sample. At x = 5, the sub-sample mean ȳ₅ is

    ȳ₅ = ȳ + b(5 - 3) = ȳ + 2b.    (6)
For the four samples given in Table 4.2, the values of ȳ (Equation 1, Section 17.2) are 88.8, 85.6, 99.2, and 95.4, and the values of b are 17.0, 19.0, 15.6, and 16.8 (Equation 2, Section 17.3). Therefore, the values of ȳ₅ for the four samples are 88.8 + 2(17.0) = 122.8 for the first sample; 85.6 + 2(19.0) = 123.6 for the second sample; 99.2 + 2(15.6) = 130.4 for the third sample; and 95.4 + 2(16.8) = 129.0 for the fourth sample. For each sample, the value of ȳ₅ can be similarly computed. Then the distribution of ȳ₅ can be obtained. If the result derived from Theorem 15.2 is correct, ȳ₅ will follow the normal distribution with mean equal to (Equations 3 and 5)

    μ_y·5 = 90 + 20(5 - 3) = 130    (7)

and variance equal to (Equation 4)

    σ²_ȳ₅ = 100[1/5 + (5 - 3)²/10] = 60.    (8)
The frequency distribution of the 1000 values of ȳ₅ is shown in Table 17.4 and the relative cumulative frequency (r.c.f.) plotted on normal probability graph paper is shown in Fig. 17.4. The fact that the points are almost on a straight line indicates that ȳ₅ follows the normal distribution. The mean or the 50% point of the values of ȳ₅ is about 130.5, which is close to 130 (Equation 7). The 84% point, which corresponds to the mean plus one standard deviation of ȳ₅, is about 138.2, and consequently the standard deviation of ȳ₅ is 138.2 - 130.5 or 7.7. But from
TABLE 17.4

    Estimated Mean of an Array ȳ₅    Observed Frequency f    r.c.f.(%)
    Below 116.5                              32                  3.2
    116.5 - 119.5                            47                  7.9
    119.5 - 122.5                            73                 15.2
    122.5 - 125.5                           106                 25.8
    125.5 - 128.5                           142                 40.0
    128.5 - 131.5                           153                 55.3
    131.5 - 134.5                           156                 70.9
    134.5 - 137.5                           104                 81.3
    137.5 - 140.5                            85                 89.8
    140.5 - 143.5                            53                 95.1
    Above 143.5                              49                100.0
(This sampling experiment was conducted by Cathy Olsen at Oregon State College in the summer of 1954.)
[Fig. 17.4. The relative cumulative frequencies (r.c.f.) of the 1000 values of ȳ₅ plotted on normal probability graph paper; the 50% and 84% points are marked.]
Equation (8) it can be calculated that the standard deviation of ȳ₅ is √60 or 7.746. The values 7.7 and 7.746 are very close. Now the sampling experiment has verified the tentative result obtained from Theorem 15.2 and the characteristics of the distribution of ȳ_x can be summarized in the following theorem.

Theorem 17.4 If all possible samples of size n are drawn from the population with the regression equation

    μ_y·x = α + β(x - x̄)
and the specifications as given in Section 16.2, the sub-sample mean
ȳ_x' (which is the least squares estimate of μ_y·x') follows the normal distribution with mean equal to μ_y·x' and the variance equal to

    σ²_ȳx' = σ²[1/n + (x' - x̄)²/SSx]    (9)

where σ² is the variance common to all arrays and x' is any particular value of x.

It can be seen from the above theorem that the sub-sample mean ȳ_x is an unbiased estimate (Section 7.3) of the mean μ_y·x of an array, and the variance of ȳ_x is influenced by three factors, namely n, (x' - x̄)², and SSx. The presence of the factors n and SSx is expected. These are the very factors which influence the variances of ȳ and b (Theorems 17.2 and 17.3) and, therefore, they also influence ȳ_x which is a linear combination of ȳ and b. However, the factor (x' - x̄)² is new. Equation (9) shows that the magnitude of the variance of ȳ_x increases with the difference between x' and x̄. In terms of the example of the regression of height on age of children (Section 16.1), the implication of Equation (9) is that the estimated average heights for various age groups are not of equal accuracy. The variance of the estimated average height is the smallest for the 4-year-old group. As the ages of the children deviate farther from this average age of 4, the variance of the estimated average height increases. It follows from Theorem 17.4 that the statistic

    u = (ȳ_x' - μ_y·x')/√(σ²[1/n + (x' - x̄)²/Σ(x - x̄)²])    (10)

follows the normal distribution with mean equal to zero and variance equal to one. If the variance σ² of Equation (10) is replaced by s², where s² is the residual SS divided by (n - 2) (Equation 1, Section 16.6), the resulting statistic

    t = (ȳ_x' - μ_y·x')/√(s²[1/n + (x' - x̄)²/SSx])    (11)
follows Student's t-distribution with n - 2 degrees of freedom. The sample of ten pairs of observations given in Table 17.6b may again serve as an illustration, with

    ȳ = 6.2,  b = -2,  SSx = 14.5,  s² = 5.2,

and the regression equation is

    ȳ_x = 6.2 - 2(x - 2.5).    (12)
Suppose one wishes to test the hypothesis that the population mean of y is equal to 4.0 at x = 2. For this value of x, ȳ₂ = 7.2 (Equation 12). The statistic is

    t = (7.2 - 4.0)/√(5.2[1/10 + (2 - 2.5)²/14.5]) = 3.2/.781 = 4.10
with 8 degrees of freedom. If the 5% significance level is used, the critical regions are t < -2.306 and t > 2.306 (Table 6, Appendix). In estimating the parameter μ_y·2, the 95% confidence limits are

    7.2 ± 2.306√(5.2[1/10 + (2 - 2.5)²/14.5])  or  7.2 ± 2.306(.781)  or  7.2 ± 1.8  or  5.4 and 9.0.
17.5 Variance Components

In Section 16.7, the statistic

    F = regression SS/s²    (1)
is suggested for testing the hypothesis that the means of all arrays are equal or that the population regression coefficient β is equal to zero. In this section the average value of the regression SS of all possible samples is determined, which further justifies that the statistic F is the appropriate one to use in testing this hypothesis. The value of the regression SS, of course, changes from sample to sample. The average value of this SS can be determined by a sampling experiment. However, it can also be obtained by deduction from the equation

    regression SS = b²SSx    (2)

(Equation 5, Section 16.4), which can be expressed as

    b²SSx = [(b - β) + β]²SSx = [(b - β)² + 2β(b - β) + β²]SSx.    (3)
In the above expression of the regression SS, it can be observed that the values of (b - β) and (b - β)² vary from sample to sample because the
value of b varies from sample to sample. But the population regression coefficient β and the value of SSx remain unchanged for repeated samples. For instance, in the sampling experiment of Section 17.1, β is equal to 20 and SSx is equal to 10. Neither of these quantities varies with the sample. Therefore, the determination of the average value of the regression SS boils down to the determination of the average of (b - β) and that of (b - β)². Since the mean of the values of b is equal to β (Theorem 17.3), the mean of the values of (b - β) is equal to zero. This can be explained in terms of the sampling experiment of Section 17.1. The mean of the 1000 values of b is equal to 20 which is β. If 20 is subtracted from each value of b, then the new mean is equal to zero (Theorem 2.4a). The mean of (b - β)² is the variance of b. This can be explained in terms of the sampling experiment of Section 17.1. Since the mean of b is 20, the variance of the 1000 values of b is

    Σ(b - 20)²/1000

which can be regarded as the average of the 1000 values of (b - 20)² or (b - β)². Therefore, the average value of (b - β)² is equal to σ²/SSx which is the variance of b (Theorem 17.3). When the average value of (b - β) and that of (b - β)² are substituted in Equation (3), that is, when (b - β) is replaced by 0 and (b - β)² is replaced by σ²/SSx, the resulting equation is
=0
[~2
+ f32] SS ~ _ u 2 + f32SS~.
(4)
~
The regression SS is equal to the regression mean square, because this SS has 1 degree of freedom. The average value of the numerator of F (Equation 1) is σ² + β²SSx and that of the denominator s² is σ² (Section 16.6). These two average values are the respective equivalents of σ²₁ and σ²₂ of Theorem 9.1. Then the hypothesis of the F-test shown in Equation (1) is

    σ² + β²SSx = σ²,    (5)

that is, β²SSx is equal to zero or β is equal to zero, because SSx is always positive. If SSx were equal to zero, all the values of x would be the same and consequently there would be no problem of regression at the outset. If the hypothesis is true, that is, β is equal to zero, the magnitude of SSx will not affect the F-value. However, if β is different from zero, the quantity β²SSx can be magnified to any desired size by increasing the value of SSx. If the quantity β²SSx is large, the F-value, on the
average, will be large and the hypothesis that β = 0 is likely to be rejected. In other words, the increase in the value of SSx decreases the probability of committing the Type II error. The magnitude of SSx is influenced by both the sample size and the spread of the values of x. For example, the values of x of a sample of five observations are 1, 2, 3, 4, 5 and SSx is equal to 10. If the sample size were doubled without changing the values of x, the ten values of x would be 1, 1, 2, 2, 3, 3, 4, 4, 5, 5 and the SSx would be 20, which doubles the original value of SSx. However, if the ten values of x were five 1's and five 5's, the value of SSx would be

    5[(1 - 3)² + (5 - 3)²] = 40,

which is twice as large as 20, even though in both cases the number of observations is ten. This shows that the increase in sample size and the increase in the spread of the values of x reduce the probability of committing a Type II error in testing the hypothesis that the means of all the arrays are equal or that the population regression coefficient is equal to zero. This point should be kept in mind in planning an experiment and will be discussed further in the following sections.

17.6 Contrast Between Linear Regression and Analysis of Variance

Regression and the analysis of variance have been developed by different workers at different times. As a result, different notations and terminology are used in connection with these two topics. However, these topics are really different versions of the same general topic, and the difference between them is more apparent than real. The purpose of this section is to point out, in preparation for the study of the two following sections, the similarities and differences between these methods. The discussion here is limited to linear regression and the analysis of variance with one-way classification (Chapter 12). Much of the difference between regression and the analysis of variance stems from different points of view about the population. In regression, one perceives a population consisting of a number of sub-populations or arrays, each of which follows the normal distribution and all of which have the same variance; while in the analysis of variance, one perceives a collection of k populations, each of which follows the normal distribution, and all of which have the same variance (Section 16.1). So far the difference is superficial and only a matter of terminology, but the real difference between the two topics is in the presence or absence of the regression equation. In the linear regression, the means of all the arrays are bound together by the regression equation

    μ_y·x = α + β(x - x̄)

and consequently only α and β are the independent parameters, regardless of the number of arrays in the population. On the other hand, the means μ₁, μ₂, ..., μ_k of the k populations of the analysis of variance are k independent parameters, which are not bound together by a regression equation.

The hypothesis of the F-test of the linear regression (Section 16.7) is the same as that of the analysis of variance (Section 12.5). In the linear regression, the hypothesis is that the regression coefficient β is equal to zero or that the means of all arrays are equal; while in the analysis of variance, the hypothesis is that the means of all the populations are equal. The two hypotheses are really the same. But it should be realized that, despite the similarity in the hypotheses, the alternative hypotheses for the two cases are entirely different. In the case of linear regression, the rejection of the hypothesis leads to the conclusion that β is greater than or less than zero with the means of the arrays still lying on the line of regression. In the analysis of variance, on the other hand, the rejection of the hypothesis leads to the conclusion that the population means are not all the same, with the means of the populations having the freedom of being different in every possible way. The effect of this difference in the alternative hypotheses on the probability of committing a Type II error is discussed in Section 17.8.

The difference in perceiving the populations involved in linear regression and in the analysis of variance is also reflected in the samples of the two cases. In regression, the n observations are considered a single sample consisting of a number of sub-samples, each of which consists of the observations with the same value of x. The notation ȳ is used to denote the mean of all observations and ȳ_x the mean of a sub-group or a sub-sample of observations. But in the analysis of variance, the Σn or kn observations are considered k samples, each of which consists of a number of observations. The notation ȳ̄ is used to denote the mean of all observations and ȳ the mean of a sub-group or a sample of observations. Therefore, the ȳ and ȳ_x of regression are the equivalents of the ȳ̄ and ȳ of the analysis of variance respectively. The difference in the notations stems from the way in which the observations are conceived, as a single sample or a collection of k samples. It is obvious that ȳ (or a) in regression and ȳ̄ in the analysis of variance are equivalents, because they are both means of all the observations whether they are considered one sample or k samples; but it is not so obvious that ȳ_x and ȳ are equivalents, because they are computed in different ways. The difference in the methods of obtaining ȳ_x and ȳ stems from the presence or absence of the regression equation. In regression, where the regression equation is present, the mean of an array is not estimated by the observations drawn from that array but through the regression equation
    ȳ_x = a + b(x - x̄)
which in turn is estimated by the observations drawn from all arrays (Section 16.3). In the analysis of variance, the mean of a population is estimated only by the observations drawn from that population, because there is not the advantage of having a regression equation. The equivalence in means leads to the equivalence in the sums of squares. The total SS, regression SS, and residual SS of the linear regression are the respective equivalents of the total SS, treatment SS, and error SS of the analysis of variance. The partition of the sum of squares of the linear regression as shown in Equation (1), Section 16.4, is
l:(r-yr' -l:<1" _,.)1 + l:(r-r.)1
(1)
and that of the analysis of variance as shown in Equation (2), Section 12.1, is (2) These partitions of the sums of squares are comparable if one recognizes the equivalence of y, 1", r of the linear regression to Y, r of the analysis of variance. ' The numbers of degrees of freedom for the total SS are the same for both cases, despite the difference in notations-n -1 in regression and l:n -1 (or kn -1, if all the sample sizes are equal) in the analysis of variance. The difference in notations stems from the difference in conceiving the observations as one sample or as a collection of k samples. In both cases, the number of degrees of freedom is the total number of observations minus one. The difference in the numbers of degrees of freedom for the regression SS and the treatment SS is real and it stems from the fact that in linear regression there are two parameters ex. and {3 and that in the analysis of variance there are k parameters, 1'1' 1' •••• , p./c' But in both cases, the number of degrees of freedom is the number of parameters minus one; that is, the regression SS has 2 -1 or 1 degree of freedom, while the treatment SS has k -1 degrees of freedom. Therefore, when k - 2, linear regression and the analysis of variance merge into one topic (Exercise 3 at the end of this chapter). The numbers of degrees of freedom for the residual SS and the error SS are n-2 and l:n-k (or kn-k if all the sample sizes are equal) respectively. Despite the difference in notations, they both represent the total number of observations, n or l:n, minus the number of parameters, 2 or k. From the above discussions it can be concluded that the only difference between the linear regression and the analysis of variance is in the presence or absence of the regression equation. The other differences are a matter of points of view. It seems to be much simpler to ignore
y,
17.6
LINEAR REGRESSION AND ANALYSIS OF VARIANCE
293
ODe point of view aDd adhere to the other consistently. Yet, unfortunately, both points of view are needed, but at different places. ID the two following sectioDs, the arrays of a population of the regression are cODsidered the equivaleDts of the k populations of the analysis of variance. But in the analysis of co-variance (Chapter 19), each of the k populations consists of a number of arrays, and consequently a different point of view is Deeded. The cODtrast between linear regression and the analysis of variance with one-way classification is listed in Table 17.6a. In applicatioD, both linear regression and the analysis of variance have their advantages. In an experiment which determines the effect of various rates of the same fertilizer on the yield of a certain crop, the rates of application of fertilizer are the values of % in regression and are the treatments in the analysis of variance. The yield is designated by r in both cases. However, while the % in the regression must be a quantitative measurement, the treatments iD the analysis of variance can be either quantitative, such as the various rates of application of the TABLE 17.6a
Analysis of Variance. one-way classification
Linear Regression
,
1
s, quantitative
Treatment, quantitative or qualitative
2
r
r
3 4
Population Array or sub-population Array follows normal distribution
Collection of k populations Population Population follows normal distribution /J.a, I'-a. • •• , 1'-" No specified relation among lis k parameters lis. I-'a••••• 1-'" 1-', - I-'J = ... = 1-'" 0" of all populations are equal
5 6
7
I'-y." I'- cx+ fJ(s - i) y. "
fJ
8 9 10
2 parameters cxand
11
One sample A sub-sample may have any number of observations
12
fJ-O 0"
of all arrays are equal
13 14
R
15 16 17 18 19 20 21
~east squares estimates of cx and
No notation used
r r
~tal 55 Regression 55 Residual 55 ,,2
k samples (treatments) At least I sample has more than I observation 4a .. R. + RJ + ••• + R" R
fJ
!-east squares estimates of lis
r r
Total 55 Trea tment 55 Error 55 .~ (error mean square)
I
294
Ch.17
LINEAR REGRESSION-II
same fertilizer, or qualitative, such as the different kinds of fertilizers or different varieties of crops. Therefore, the analysis of variance has a wider application th~n regression. Yet the advantage is not entirely one-sided. If the treatments are qualitative, one has no choice but to use the analysis of variance. But if the treatments are quantitative, one has a choice of method. The criterion by which one chooses the more efficient method is discussed in the following two sections. The 10 pairs of observations given in Table 17.6b may be used to illustrate the contrast between linear regression and the analysis of variance. First, the observations are considered a sample and the linear regression is used to test the hypothesis that the means of remain the same resardless of the value of %. The details of the computation are shown in Table 17.6c and the F-value is equal to 11.15 with 1 and 8 degrees of freedom. Since 11.15 is greater than 5.32, which is the 5% point of F with 1 and 8 degrees of freedom, the hypothesis is rejected or the result is significant. Now the same set of data is rearranged as shown in Table 17.6d and the analysis of variance (Section 12.10) is used to test the same hypothesis, with the details of computation shown in Table 17.6e. This time the F-value is equal to 4.23 with 3 and 6 degrees of fre'edom. Since 4.23 is less than 4.76, the 5% point of F with 3 and 6 degrees of freedom, the hypothesis is accepted or the result is not significant. Now it can be seen' from this example that the two methods can actually lead to contradictory conclusions. Therefore, the choice of method is of great importance. However, the criterion of
r
TABLE 17.6b
n
10 25 2.5
4: %
(l:x)2
1)25
(Ix)l/ n
IxI 55 s
(Ix) (l:y)
62.5 77 14.5
l
(Iy)/n Ixy 5P
1.550
155 126 -29
(Iy)1
b = SP /S5" = -29/14.5 =-2 = y + b(x - x) = 6.2 - 2(% - 2.5) RegressioD 55 = (5P)I/55 s - 58.0 Residual 55 = 99.6 - 58.0 = 41.6 S2 = (residual 55)/(n-2) = 41.6/8 = 5.2 F = 58.0/5.2 = 11.15 with 1 aDd 8 d. {.
f"
.
(~x)
Iy Y
5S y
62 6.2 3,844
384.4 484 99.6
17.7
295
TEST OF LINEARITY OF REGRESSION TABLE 17.6d 2
1
r T II
-r
1'2
-II
8 12 10
8 4
30 3 10 300
12 2 6 72
"
3
-
4 .
2 6
2 6 4
8 2 4 32
•
12 3 4 48
C
,
62 10 6.2 452
In
L(~2)
TABLE 17.6e Preliminary Calculations (1)
Type of Total Grand Sample Observation
(2)
(5)
(4)
(3)
Total of No. of Items No. of Observations per Squared Item Squares Squared 3,844
Total of Squares per Observation (2) + (4) 384.4 452.0 484.0
10
1
I
Analysis of Variaace Source of Variation
Sum of Squares SS
Degrees of Freedom
Mean Squares MS
Treatments Error Total
67.6 32.0 99.6
3 6 9
22.53 5.33
F 4.23
.
choosing the method is discussed in the two following sections. At this stage, this numerical example serves only to illustrate the contrasts between the two methods as shown in Table 17.6a. For example, nand f in linear regression (Table 17.6c) are exactly Ion and yof the analysis of variance (Table 17.6d).
17.7 Tesl or LiDeartly or RegressiOD This section describes a method by which one may discover whether the regression in the pop~lation is linear. If the population is available, it is an easy matter to determine whether the regression of on " is linear. All one must do is to see if the means of all the arrays are on a straight line. However, if only a sample is available, a special method is needed for this purpose. The method involves using both linear
r
296
Ch.17
LINEAR REGRESSION-II
regression and the analysis of variance on the same set of data. This is one of the reasons why these two topics are deliberately linked together in the preceding section. The method of testing linearity can be illustrated by the example given in Table 17.6b. The steps of computation are as follows: {l} The observations r are arranged in columns according to the xvalues, as shown in Table 17.6d. (2) The analysis of variance calculations are carried out on the data as shown in Table 17.6e. The method given in Section 12.3 or that in 12.10 may he used, depending on whether the sample sizes are equal or unequal. (3) The regression SS is obtained by the method shown in Table 17.6c. (Short-cut methods for this step are given in the following section.) (4) The treatment SS with k -1 degrees of freedom is partitioned into the regression SS with 1 degree of freedom and the remaining SS with k-2 degrees of freedom, as shown in Table 17.7a. The remaining SS is obtained by suhtracting the regression SS from the treatment SS. TABLE 17.7a Source of Variation Treatment Linear Regression Deviation from Linearity Error Total
55
DF 3
67.6 58.0 9.6 32.0 99.6
1
2 6 9
M5
F
22.53 58.00 4.80 5.33
4.23 10.88 .90
The reason why the remaining SS is labeled "deviation from linearity" in Table 17. 7a can be seen from Fig. 17.7, where the 10 pairs of observatioDS of Table 17.6b (or Table 17.6d) are represented by the 10 dots; the four means (y') of Table 17.6d ate represented by four circles; and the regression equation
r" .. 6.2 -
(1)
2(x-2.5}
(Table 17.6c) is represented by the slanted line. The vertical distance between a circle and the line is labeled (m -r). This m is the mean of r with the same value of x. It is used in place of the of Table 17.6d, because the notation has different meanings depending on whether it is used in connection with the linear regression or with the analysis of variance. The sum of squares 9.6 due to the deviation from linearity (Table 17.7a) is the weighted sum of squares of the deviations of these circles from the line, with the numbers of observations in each treatment being the weights. The direct computation of this sum of squares is
r
r
17.7
297
TEST OF LINEARITY OF REGRESSION
14
r
12
•
10
o
8
•
•
6
o
•
4
•
o
2
2
Fig. 17.7
•
0---r
(m - r,,) ___ l
•
•
3
4
5
"
shown in Table 17.7b. The values of n, x, and m in this table are obtained from Table 17.6d and the values of Y" are obtained from the estimated regression equation (F:quation 1). If the means m of the subsamples are exactly on a straight line, that is, m = Y", this sum of squares
298
Ch. 17
LINEAR REGRESSION-II
is equal to zero. It must be realized that even if the regression is linear in the population, the means m of the sub-samples do not necessarily lie on a straight line. Therefore, a test of hypothesis is needed to determine whether the regression in the population is linear, when only a sample is available. TABLF~
n
3 2 2 3
" 1 2 3 4
m
1"
10 6 4 4
9.2 7.2 5.2 3.2
SS due to deviation from linearity
17.7b
(m-y)
(m-y,l
n(m-y)2
-1.2 -1.2 .8
.64 1.44 1.44 .64
1.92 2.88 2.88 1.92
" .8
"
9.60
The F-test shown in Table 17.7a is used to test the hypothesis that the regression in the population is linear. This is a one-tailed test; that is, the hypothesis is rejected only because F is too large and never because it is too small. If F is large enough to fall inside the critical region, the conclusion is that the regression in the population is not linear. If it is small enough to fall outside the critical region, the conclusion is that the regression is linear. The F-value of .90 with 2 and 6 degrees of freedom, shown in Table 17. 7a, warrants the conclusion that the regression is linear. The test of linearity may be regarded as an aid in deciding whether the linear regression or the analysis of variance should be used on a given set of experimental data. If the regression is linear, the method of linear regression should be used. Otherwise, the analysis of variance should be used. The reason for making such a choice is discussed in the followi~g section. It must be realized, however, that this discussion is limited to these two methods only. The other possible choice of curvilinear regression is not discussed in this book. After the test of linearity is made, little more computation is involved in testing the hypothesis that the means of all the arrays or populations are equal. 1£ the linear regression is to be used, one simply finds the F-value of 10.88 with 1 and 6 degrees of freedom (Table 17.7a). On the other hand, if the analysis of variance is to be used, one finds the Fvalue of 4.23 with 3 and 6 degrees of freedom (Table 17.7a). Note that this F -value is the same as that shown in Table 17.6e. 17.8 Individual Degree of Freedom Linear regression and the analysis of variance are often used simultaneously under the disguise of an individual degree of freedom (Section 15.3). Table 17. 7a shows that the treatment S5 with 3 degrees of freedom
I7.B
INDIVIDUAL DEGREE OF FREEDOM
299
is partitioned into (a) the SS due to the linear regression with 1 dewee of freedom and (b) the SS due to the deviation from linearity with 2 degrees of freedom; that is, the regression SS is expressed as an individual degree of freedom of the treatment SSe However, it is not obvious that the regression SS and an individual degree of freedom are really the same, because of the apparent difference in the computing methods used in obtaining these quantities. The method of computing the regression SS is
l:(y" _y}a
= (SP)2/SS ,,'
(I)
while that of an individual degree of freedom is
(M aT a + MaT a + ••• + M,l k)2 Qa _________________ ~~
n(.~~ + M~
+ ••• + Mt)
(2)
(Equation 11, Section IS.3). There seems to be no resemblance between the two algebraic expressions. But here again the difference is more apparent than real, and the regression SS can be expressed in the form of Qa, that is, a
(SP)a [(%a-i)T a + (%a-i}T a + ••• + (%k-i)T k)2
Q .. SS" -
na(%a-iP+n2(%2-iP+ .... +nk(xk-i)a·
(3)
The above algebraic identity can be derived from the identity
SP
= l:(% -
:%)(y -
y)
==
!(% - i)y
(F:quation 2, Section 16.9). However, instead of an algebraic proof, the identity shown in F:quation (3) is verified by the example given in Table 17.6d. The values of % of this example are 1, 2, 3, and 4 and the value of x is 2.5, (x is not obtained by averaging I, 2, 3, and 4, but from Table I7.6c), and consequently, the deviations (% - x) are -1.5, - .5, .5, and 1.5. The treabnent totals are 30, 12, 8, 12 and the numbers (n) of observations of these treatments are 3, 2, 2, and 3 (TaLle I7.6d). When these values are substituted into Equation (3), the value of the individual degree of freedom is
(n
[(-1.5) (30) + (- .5) (12) + (.5) (B) + (1.5) (12»)2 0 2 =__________________________________ __ 3(-1.5)2 + 2(- .5)2 + 2(.5)2 + 3(1.5)2
\:
(-29.0)2
841
- - - - == -
14.5
14.5
(5)
= 5B.0
which is the regression SS given in Table I7.6c. The values -29.0 and 14.5 in the above equation are the values of SP and SS" respectively (Table 17.6c). Therefore, the regression SS is an individual degree of freedom of the treatment SSe When the sample sizes n are equal, that is
300
Ch. 17
LINEAR REGRESSION-II
[(x,-i)T, + (x a-i)T 2 + ••• + (x/c-i)T/cP
Q2_------------------------~~--~ n[(x l -i)2 + (X a -i)2 + ••• + (x/c-i)2) ,
(6)
it becomes more obvious that the regression SS is an individual degree of freedom of the treatment SS. The multipliers M of Equation (2) are simply the values of (x-x) of Equation (6). If the x-values are equally spaced, for example 0, 1, 2, 3, etc. or 4, 8, 12, 16, etc., the computation of the regression SS can be further simplified. The deviations (x-i) can be replaced by positive and negative integers (Section 15.3). For example, the deviations -1.5, -.5, .5, and 1.5 of Equation (5), when multiplied by 2, become -3, -1, 1, and 3, and the value of the regression SS is unchanged. Therefore, if the x-values are equally spaced and the sample sizes n are equal, the regression SS can be computed by the method given in Equation (2) with the multipliers M being integers. The values of M for various numbers (k) of treatments up to 7 are listed in Table 17.8, which shows the pattern of obtaining the values of M for any number of treatments. TABLE 17.8 No. of treatments
Multipliers (M) -1
3 -3
4
-2
5 6 7
-5 -3
-1
-1
1
3
1
3
1
0
2
1
0 -1
-3 -2
0 -1
1
5
2
3
In terms of a planned experiment, the equally spaced x-values imply that the quantitative treatments are equally spaced, such as in the application of fertilizer at the rate of 0, 50, 100, and 150 pounds per acre. For this example, the values of M for computing the individual degree of freedom due to linear regression are -3, -1, 1, and 3 (Table 17.8). These methods may be used in connection with either the completely randomized (Chapter 12) or the randomized block (Chapter 14) experiments. After the regression SS is obtained, the SS due to the deviation from linearity can be obtained by subtracting the regression SS from the treatment SS. Then the test of linearity can be carried out (Section 17.7). So far this section has added nothing new to the preceding section except the simplified method of computing the regression SS for a planned experiment, where the numbers of observations of the treatments are the same and the quantitative treatments are equally spaced. However, this simplified method enables one to test the linearity of regression with very little additional computation after the analysis of variance calculations are completed, and thus enables one to decide whether the linear
17.8
INDIVIDUAL DEGREE OF FREEDOM
301
regression or the analysis of variance is the appropriate method for testing the hypothesis that all the treatment effects are equal. If the regression is linear, the method of linear regression is superior to the analysis of variance in testing the hypothesis that the treatment effects are equal. This can be seen hom the comparison of the average mean squares.
For the analysis of variance, the average mean squares
(Equations 4 and 5, Section 12.4) are average treatment MS - u a + TI(J J.L2 and
average error MS -= u 2,
(7) (8)
and those of the linear regression (Section 16.6 and Equation 4, Section 17.5) are (9) average regression MS ... u 2 + f32SS" average residual MS
a:
U
I.
(l0)
The sampling experiment of Section 17.1 may be used to show that for the case of more than two treatments the average treatment MS is les8 than the average regression MS or that TI(J J.L2 is less than f32SS ,,' The five population (or array) means of the sampling experiment are as follows:
In terms of the analysis of variance, u l is equal to 100 and the variance of the 5 population means (Equation 2, Section 12.4) is u
2
+ {70-90)1 + {90-90)2 + {1l0-90)2 + (l30-90~ = {50-90)1 _____________________ _ 4
J.L
=
4000 --1000. 4
If all the sample sizes are equal to n, the total number of observations is 5n and the average treatment mean square is 100 + n{lOOO). On the other haud, in terms of linear regression, f3 is equal to 20 and the values of " are 1, 2, 3, 4, and 5, each repeated n times, because n obser-
vations are drawn from each x-array. Then
SS
"
= n[{l-3)2 + {2-3)2 + {3-3)2 + {4-3)2 + (5-3)2] -IOn.
Therefore, the average regression mean square is
100 + (20)Zn(l0) - 100 + n(4000).
302
LIN EAR REGRF.SSION-Il
Ch.17
no:
The quantity fJ2SS % is 4 times as large as for the population of the particular sampling experiment. It can be proved algebraically that the number 4 is actually k-l, where k is the number of treatments. Therefore, as long as there are more than two treatments, the average regression mean square is always greater than the average treatment mean square. Consequently, the F-value in the linear regression is, on the average, greater than that of the analysis of variance if fJ is not equal to zero or the hypothesis is false. A larger F-value will be more likely to lead to the rejection of the false hypothesis. Hence, if the regression is linear, use of the method of linear regression will result in a smaller probability of committing a Type II error than use of the method of the analysis of variance. The discussions of this section can be summarized as follows: (1) In a planned experiment, such as the completely randomized experiment (Chapter 12) and the randomized block experiment (Chapter 14), the analysis of variance is used to test the hypothesis that all treatment means are equal. But if the treatments are quantitative, the regression, linear or curvilinear, is also used in the form of individual degrees of freedom of the treatment SS. (2) If the quantities of the k treatments, such as the rates of application of fertilizer, are equally spaced and the numbers of observations for the treatments are equal, the multiplier given in Table 17.8 may be used to find the regression SS with 1 degree of freedom. The SS with k-2 degrees of freedom due to the deviation from linearity, can be obtained by subtracting the regression SS with 1 degree of freedom from the treatment SS with k -1 degrees of freedom. Then the test of linearity can be carried out. (3) If the regression is not linear, the analysis of variance is used; that is, one tests the treatment mean square against the error mean square. (The choice of curvilinear regression is not considered in this book.) (4) If the regression is linear, the SS due to linear regression with 1 degree of freedom is tested against the error mean square. This test has a smaller Type II error than the test of the treatment mean square with k -1 degrees of freedom against the same error mean square, even though both tests are valid.
17.9 Remarks (1) Linear regression is much more frequently used than curvilinear regression, not because linear regression exists in abundance in nature, but because a short segment of a curve can be approximated by a straight line. As a result, it is always risky to extrapolate an estimated line of regression.
EXERCISES
303
(2) The method of testing linearity is given in Section 17.7 as an aid in deciding whether the method of linear regression should be used. The method of linear regression can lead to misleading conclusions if the regression is actually non-linear. If the test of linearity is not performed, at least the " pairs of observations should be plotted on graph paper for assurance that the regression is not obviously curvilinear. (3) The fact that fJ is different from zero does not imply that % is the cause of y. For example, if % is the length of one's left arm and y the length of his right arm, since one who has a long left arm usually has a long right arm, the regression coefficient {3 is greater than zero. But one can hardly say that Smith's right arm is long because his left arm is long. However, cause or no cause, for the purpose of prediction, the regression equation is useful. If one has a long left arm, the chances are that his right arm is also long. But it should be realized that for a particular value of %, the value of y calculated uom the estimated regression equation is fft' the estimated mean of an array, and not an observation y (Section 17.4). In terms of the example of the regression of height (y) on age (%), the height calculated uom the regression equation i. the predicted average height of a particular age group and Dot the predicted height of a particular child.
EXERCISES (1) (a) For the data given in Exercise (6), Chapter 16, test the hypothesis that {3 =0 by the t.-test (Section 17.3) at the 5% level. Note that the value of F of Exercise (6), Chapter 16, is the square of the value of t of this exercise. (t - 55.33 with 18 d.f.) (b) For the same data find the confidence interval of {3 with a confidence coefficient of .95. Since {3 is knowD to be 50, state whether your estimate is correct. (47.98 to 51.76) (c) For the same data find the confidence interval of Ill" 6' the mean of the array of y at % - 6, with a confidence coefficient of .95. Since the mean of this array is known to be 350, state whether your estimate is correct. (344.5 to 353.8) (d) The population uom which the sample of Exercise (6), Chapter 16, is drawn is described in that exercise. Suppose all possible samples of the same size and of the same set of x-values are drawn uom this population. What are the average values of the regression MS and residual MS? (250,000 and 1(0) (e) For the same data, test the hypothesis that the regression of y on % is linear at the 5% level. Since the regression is known to be linear, state whether your conclusion is correct or a Type I error has been committed. A Type II error cannot be committed in this case, because the hypothesis is true. (F = 0.99 with 2 and 16 d.f.)
304
Ch. 17
LINEAR REGRESSION-II
(f) For the same data, calculate the regression SS by the method of
individual degree of freedom. Note that the value thus obtained is the same as that obtained previously. (2) Repeat the above exercise for the data given in Exercise (5), Chapter 16. (3) (a) For the following two samples, each consisting of five observations, test the hypothesis that the two population means are equal by the analysis of variance. Sample 1
42,
5-4,
61,
51.
55.
Sample 2
53,
43.
62.
43,
56
(b) To the observations of sample 1, attach an arbitrary %-value of 10; and to the observations of sample 2, attach an %-value of 20. The same data may be tabulated as follows: y
"
10 10 10 10 10
y
"
42 54 61 51 55
20 20 20 20 20
53 43 62 43 56
Test the hypothesis that f3 = 0 by the F-test. Note that the value of F is the same as that of (a). Why? The error SS of (a) is the same as the residual SS of (b). Why? (4) The following data are obtained from those of Exercise (4), Chapter 16, by adding 100 to all observations (y) with % = 2 or 8.
" 2 4
G
8
2
y
159 41 35 143 148
"
4 6 8 2 4
y
"
47 47 153 152
6 8 2 4
48
6
y
51 165 152 56 39
"
8 2 4 6 8
y
160 139 64 50 136
The means of the four arrays of y are as follows:
" JJ.y
."
2
4
6
8
150
50
SO
150
The above table shows that the means of the four arrays are not equal and the regression of y on % is not linear.
305
EXERCISES
Test the hypothesis that fJ - 0 at the 5% level. (F - 0.00056 with 1 and 18 d.{.) What is your conclusion concerning the means of the four arrays? Then rearrange the data as follows:
" Y
(5)
(6)
(7)
(8)
(9)
(10)
2
4
6
8
159 148 152 152 139
41 47 48
35 47 51 39 50
143 153 165 160 136
56 64
Use the analysis of variance to test the hypothesis that the four population means are equal at the 5% level. (F = 217.49 with 3 and 16 d.f.) What is your conclusion concerning the means of the four populations (arrays)? The purpose of this exercise is to show that the use of linear regression on non-linear data may result in a misleading conclusion. For the data given in Exercise (4), partition the among-sample 55 with 3 degrees of freedom into the 55 due to linear regression with 1 degree of freedom and that due to deviation from linearity with 2 degrees of freedom. Then test the hypothesis that the regression of y on x is linear at the 5% level. (F = 326.23 with 2 and 16 d.f.) Since the regression is known to be non-linear, state whether your conclusion is correct or a Type II error has been committed. A Type I error cannot be committed in this case, because the hypothesis is false. For the data of Exercise 7, Chapter 16, find (a) the 95% confidence interval of the regression coefficient of y on x, and (b) the 95% confidence interval of the adjusted mean of y at x-i". For the data of Exercise 8, Chapter 16, find (a) the 95% confidence interval of the regression coefficient of y on x, and (b) the 95% confidence interval of the adjusted mean of r at " - i". For the data of Exercise 9, Chapter 16, find (a) the 95% confidence interval of the regression coefficient of r on '" and (b) the 95% confidence interval of the adjusted mean of 'Y at " - %. For the data of F.xercise 10, Chapter 16, find (a) the 95% confidence interval of the regression coefficient of r on '" and (b) the 95% confidence interval of the adjusted mean of r at " - i". The following data were obtained from the determination of a standard curve for the colorimetric measurement for phosphorus, using a
306
Ch.17
LINEAR REGRESSION-II
Coleman Junior Spectrophotometer. Varying known amounts of a standard phosphate solution were placed in colorimeter tubes with two tubes per level of phosphorus, the colorimetric reaction was set up, and the optical density of the resulting color was recorded for each tube. x
Micrograms Phosphorus
0 2.24 4.48 6.72 8.96 11.20 13.44 17.92 22.40 33.60 44.80
y Optical DeDsity
1
2
0 0.036 0.066 0.086 0.125 0.149 0.187 0.252 0.328 0.456 0.638
0 0.027 0.061 0.097 0.125 0.155 0.194 0.244 0.310 0.481 0.638
(Courtesy of Dr. Roebert L. StearmaD)
(a) Considering the x-values as the treatments, test the hypothesis that the means of '1 are not affected by x, by the analysis of variance (Section 12.3), at the 5% level. (b) Test the hypothesis that the regression of '1 on x is linear, at the 5% level. (c) Test the hypothesis that the regression coefficient of '1 on x is equal to zero, at the 5% level. Tabulate the results of your calculations in the form of an analysis of variance table (Table 17. 7a) and summarize your conclusions. (I1) It is generally known that houseflies emerge faster at a higher temperature. The accompanying table shows the numbers ('1) of days required for houseflies (M. domestica, Corvallis strain) to emerge at various temperatures (x). BOOF
84 OF
80°F
75°F
70°F
65°F
8 9 9 10 8 10
9 10 10
12 12 10 10
15 15 13 13 13 14
20 18 18 19 21 19
28 27 26 24 28 25
11 11
10
11
10
(Courtesy of Dr. Russell E. Siverly, OregoD State College)
QUESTIONS
307
(a) Test the hypothesis that the regression of the number of days (y) on the temperature (z) is linear, at the 5% level. (b) Use the reciprocal of a number of days as an observation, that is, transform 8 into 1/8 or 0.125, 9 into 1/9 or 0.111, etc., and test the hypothesis that the regression of the new observations on the temperatures is linear, at the 5% level. (Transformation is discussed in Chapter 23.) (c) Find the 95% confidence interval of the regression coefficient of the new observations on the temperatures. For an increase of one degree in temperature, on the average, how much faster do the flies emerge? (12) The following data are similar to those of Exercise 11, except that the flies are of a different strain. Repeat Exercise 11 with this set of data. 88°F
84°F
80°F
75°F
70°F
65°F
8
9 10 10 10
10
15 14 13 14
18 20 21 21 19
28 30 29 27 28
20
27
9 8 9 9 10
11 11
11 11
10 12
H 15
11
(Courtesy of Dr. Russell E. Siverly, Oregon State College)
(13) For the data of Exercise 9, Chapter 12, test the hypotheses that (a) the regression of the number of adult worms on the amount of carbon tetrachloride is linear and (b) the regression coefficient is equal to zero, at the 5% level. (14) For the data of Exercise 11, Chapter 12, test the hypotheses that (a) the regression of porosity of bricks on the firing temperature is linear and (b) the regression coefficient is equal to zero, at the 5% level. (15) For the data of Exercise 15, Chapter 12, test the hypotheses that (a) the regression of density of bricks on the firing temperature is linear and (b) the regression coefficient is equal to zero, at the 5% level.
QUESTIONS (1) In testing the hypothesis that f3 - 0, either the t-test or F-test may be used. If contradictory conclusions are reached by these tests, which test is to be trusted? Why? (2) Suppose that the regression or y on
%
is known to be linear. You are
allowed to make 120 pairs of observations withifl the limit x-I to
308
(3) (4) (5) (6)
LINEAR REGRESSION-II
Ch.17
x - 120 to estimate the population regression coefficient by a 95% confidence interval. You may let x-I, 2, ••• , 120 and observe their corresponding y-values or you may let x - 10, 20, .•• , 120 and observe 10 y-values for each x-value. There are many alternative methods of doing the experiment. (a) What is the best way of doing it? (b) Why i8 it the best way? What do you think is the reason that linear regression is more commonly used than curvilinear regres.ion? Why is it desirable to test the linearity of regression before linear regression is used? If the regression coefficient b is found to be significantly different from zero, is x the cause of y? What are the advantages of using the method of least squares in estimating ex and {3?
REFERENCES Anderson, R. L. and Bancroft, T. A.: StatisCical Theory ill ReuCITcIa, McGrawHill Book Company, New York, 1952. Mood, AleJ:8nder M.: ilitroducCioli 10 'he Theory of Statis'ics, McGraw-Hill Book Company, New York, 1950.
CHAPTER 18
FACTORIAL EXPERIMENT In the experiments described in Chapters 12 and 14 the effect of only one factor is studied. By contrast, a factorial e"Perimen' is one in which two or more factors are studied simultaneously. It is a number of onefactor experiments superimposed on one another. Therefore, the material presented in this chapter is an extension of that given in Chapters 12 and 14. 18.1 Description of Factorial Experiment The factorial experiment may be described through an illustration. For example, an experiment may involve two kinds of fertilizers A and 8, each of which is applied at different rates. Fertilizer A may be applied at the rates of 0, 50, 100, and ISO pounds per acre, and fertilizer 8 may be applied at the rates of 0, 100, and 200 pounds per acre. When one experiment is superimposed on the other, there are 4 x 3 or 12 treatments as shown in Table 18.1. TABLE 18.1
liZ 0 100
:mo
ISO
0
SO
100
1 2
4 5 6
7
10
8 9
11
3
12
The numbers 1 to 12 shown in Table 18.1 are the code numbers of the 12 treatments. For example, treatment No.1 is the check; treatment No. 2 contains no fertilizer A, but 100 pounds of fertilizer 8; treatment No. 12 contains 150 pounds of A and 200 pounds of 8; and so forth. For each of the 12 treabnents, there are n ohservations, the yields of a crop. Such
an experiment is called a 4 x 3 factorial experiment. The factor A is said to have 4 levels and 8 to have 3 levels. The factorial experiment was originally developed for testing the response to different fertilizers in crop yield. But now it is used in many fields of study. For example, a food technologist may wish to study the effects of three different kinds of sugar and four concentrations of syrup on the drained weight (y) of canned peaches. The three levels of factor 8 may be three kinds of sugars, such as sucrose, ~luco8e, and corn
309
310
FACTORIAL EXPERIMENT
Ch. 18
syrup, and the four levels of factor A may be the four concentrations of syrup such as 25%, 30%, 35%, and 40%. Then the treatment No. 1 is 25% sucrose and the treatment No. 12 is 40% corn syrup. The factorial experiment has distinct advantages over a series of onefactor or simple experiments. Suppose each of the 12 treatments in Table 18.1 has 10 observations. For the comparisons among the four levels of A, each level has 30 observations. For the comparison among the three levels of B, each level has 40 observations. Then each of 120 observations of the experiment is used for the comparison among the levels of A as well as for the comparison among those of B. Thus, every observation supplies information on both factors studied and one factorial experiment serves the purpose of two simple experiments of the same size. The result is economy in effort and experimental material. The second advantage of the factorial experiment is that a series of simple experiments throws no light on the interaction of the two factors. If, for example, the yield (y) of a crop increases with the rate of application of fertilizer A in the absence of B, but the yield does not respond to the different rates of A while B is applied at the rate of 200 pounds per acre, one could hope to learn this fact only by carrying out a factorial experiment where factors A and B are studied simultaneously. The third advantage of the factorial experiment over a series of simple experiments is that tIle conclusion reached through a factorial experiment is more general. For example, the conclusion that the yield increases with the rate of application of fertilizer A, at various levels of B, has a wider application than the same conclusion reached at a particular level of B. Thus a factorial experiment may provide a wider inductive basis for a conclusion. (A more comprehensive discussion of factorial experiments is given in Chapter VI, "The Factorial Design in Experimentation," of R. A. Fisher's The Design of Experiments.) Although the factorial experiment is not limited to two factors, in this chapter only the two-factor case is presented. To make the discussion more general, the two factors are identified by A and B. The numbers of levels of these factors are designated by a and b respectively. Therefore, the experiment under disc ussion is an a x b factorial experiment with n replications. The total number of treatments is k = a x b and the total number of observations is kn or abn. For the example given in Table 18.1, a = 4 and b = 3. A factorial experiment may be either completely randomized or in raDdomized blocks. The methods of analysis are identical to those given in Chapters 12 and 14. But what is new in the factorial experiment is the hreakdown of the treatment SS, with k - 1 or ab - 1 degrees of freedom, into three components, namely main effects A, B, and interaction AB. The idea is similar to that of individual degrees of freedom and the detailed process is given in the following section.
18.2
311
MECHANICS OF PARTITION OF SUM OF SQUARES
18.2 Mechanics of Partition of Sum of Squares The mechanics of the partition of the sum of squares are to be illustrated by the completely randomized 4 x 3 factorial experiment with two replications shown in Table 18.2a, that is, a = 4, b = 3, and n = 2. The factors A and B may be regarded as two kinds of fertilizers; and the levels 1, 2, etc., are different quantities of the fertilizers applied. Then the 24 observations such as 5, 9, etc., are the yields of a crop. TABLE 18.2a
~
Total T8
Mean Y8
B effect Y8-Y
2 4
32
4
-2
2 2
2 4
40
5
-1
11
3 7
10
72
9
3
60
36
18
30
Mean 1A
10
6
3
5
A Eff~ct YA -Y
4
0
-3
-1
1
2
1
5 9
4 4
1 3
2
10 12
2 6
3
12 12
9
Total T A
3
4
R
0
G= 144
1=6 0
The notations used here are the same as those used in Chapter 12. The letter k, the number of treatments, is replaced by abo The general mean and the treatment mean are still designated by rand y respectively, and grand total and treatment total are still designated by G and T respectively. The difference y is still called the treatment effect. However, some new notations are introduced to cope with the two factors A and B. The new notations and their definitions are listed below:
r
T A -The total of nb observations belonging to a particular level of A.
T 8 -The total of of B. fA-The mean of A. 18 -The mean of B. effect A y_The _ The effect 18 -
r -
r-
na observations belonging to a particular level
of nb observations belonging to a particular level of na observations belonging to a particular level of a particular level of factor A. of a particular level of factor B.
The numerical example of the above notations is given in Table 18.2a, where it can be observed that the sum of the A effects and that of the B
FACTORIAL EXPERIMENT
312
Ch. 18
effects are both equal to zero. The effects of the ab or 12 treatments, are given in Table 18.2b. The sum of these effects is also equsl to zero. Each of the nab or 2 x 4 x 3 or 24 observations (y) can be expressed as the sum of the (1) general mean, (2) A effect, (3) B effect, (4) interaction AB, and (5) error, that is,
Y-
y
y,
= Y+ (YA
-
1> + (YB - y) + [(f - 1> - (fA - Y) - (y 8 - .y)] + (y -
y). (1)
The above equation is an algebraic identity. After simplification, it can be reduced to y = y. The fourth term of the right-hand side of Equation 1 may be defined as the interaction AB; that is, interaction AB = (Y -
1> - (fA - 1> - (Y8
-
y)=
f-
YA ... f8 +
y.
(2)
TABLE 18.2b
~
1
2
3
4
1
1
-2
-4
-3
-8
2
5
-2
-4
-3
-4
3
6
4
-1
3
12
Total
12
0
-9
-3
0
Total
TABLE 18.2c
~
1
2
1
-1
0
1
0
0
2
2
-1
0
-1
0
3
-1
1
-1
1
0
0
0
0
0
Total
3
4
Total
For example, the effect of treatment No.1 is 1 {Table 18.2b}; the corresponding A effect and B effect are 4 and -2 respectively (Table 18.2a). Then the effect of interaction of this treatment is 1 - 4 - (-2) = -1. Here the interaction is defined as a quantity. Its physical meaning is explained in the following section. The 12 interaction terms are given in Table 18.2c. It should be noted that the sum of every row and every column of the interaction terms is equal to zero. The five components (Equation 1) of each of the 24 observations are given in Table 18.2d. The sum of the five components of any observation is equal to that observation; that is, for each line of Table 18.2d.
18.2
313
MECHANICS OF PARTITION OF SUM OF SQUARES
TABLE 18.2d (1)
(2)
Treatment
Observation
AB
y
(3) General
5
~8D
Y
(4)
(5)
(6)
A
B
AB
Interaction .:ffe~ .:ffe~ = YA -Y Y~-Y Y-YA-YB+Y
(7) Error
y-y
9
6 6
4 4
-2 -2
-1 -1
-2 2
10 12
6 6
4 4
-1 -1
2 2
-1 1
12 12
6 6
4 4
3 3
-1 -1
0 0
2 1
4 4
6 6
0 0
-2 -2
0 0
0 0
22
2 6
6 6
0 0
-1 -1
-1 -1
-2
23
9 11
6 6
0 0
3 3
1 1
-1
31
1 3
6 6
-3 -3
-2 -2
1
1
-1 1
32
2 2
6 6
-3 -3
-1 -1
0 0
0 0
3
6 6
-3 -3
3 3
-1
-2
7
-1
2
41
2 4
6 6
-1 -1
-2
-2
0 0
-1 1
42
2 4
6 6
-1 -1
-1 -1
-1 -1
-1 1
43
8 10
6 6
-1 -1
3 3
1 1
-1 1
11 12 1 3
33
2 1
Sum of Squares
1192
Lr
£;2/24
156 ASS
112 B SS
24 AB SS
36 E"or SS
Degrees of Freedom
24 na6
1 1
3 a-I
2 6-1
6 (a-})(6-1)
12 a6(n-l)
864
the number in Column (2) is equal to the sum of the numbers in Columns (3) to (7). The partition of the total SS can be observed in Table 18.2d. The sum of the squares of the 24 numbers of each column of the table is given in the last line. The total SS, which is
I(y - y)2
= Iy2 - ca /24 = 1192 -
864 - 328,
314
FACTORIAL EXPERIMENT
Ch.18
is equal to the sum of ASS, B SS, interaction A R SS, and erTor SSe The algebraic proof of this identity is similar to that given in Section 12.1 and, therefore, is omitted. The treatment SS described in Section 12.1 is equal to the sum of A SS, B SS, and AB SSe The 12 treatment effects are given in Table 18.2b. The treatment SS is equal to n times the sum of the squares of the 12 treatment effects. Therefore, the mechanics of the partition of the sum of squares presented here are very similar to those given in Section 12.1. The extra complication is due to the partitioning of the treatment SS into three components, namely ASS, B SS, and AB SSe Each of the various SS measures a different kind of variation. The total SS measures the over-all variation of all observations. It is equal to zero, if all the nab observations are the same. The A SS measures the variation among the levels of the factor A. It is equal to zero, if the means of all levels of A are the same. Similarly the B SS measures the variation among the levels of the factor B. The erTor SS measures the variation of the observations receiving the same treatment. It is equal to zero, if the n observations are the same for each of the ab treatments. The interaction SS is explained in the following section. The numbers of degrees of freedom of the various SS may also be observed from Table 18.2d. The quantity l:y2 can be found only if all the nab or 24 observations are knowD; therefore, its number of degrees of freedom is nab or 24 {column 2}. The general mean is the same for all observations {column 3}. When it is known for one observation, it becomes known for all observations. Therefore nab or G2/nab has only one degree of freedom. Thus the total SS, which is Iy2 - G2/24. has nab - 1 or 23 degrees of freedom. The number of degrees of freedom for the A SS is equal to a -lor 3. This is due to the fact that the sum of the four A effects is equal to zero (Table 18.2a). If only three of them are known, the remaining one will become known. Similarly, the number of degrees of freedom for the B SS is equal to b - 1 or 2. The interaction SS has (a -1)(b -I) or 6 degrees of freedom. This can be observed from Table 18.2c. The sum of every row and every column of the interaction terms is equal to zero. If the last row and the last column are deleted, all the missing terms can be reconstructed. Therefore, if (a -1) (b - 1) terms are known, all the ab terms will become known. The number of degrees of freedom for the error SS is ab(n - I} or 12, 8S explained in Chapter 12. The number ab(n -1) is obtained from ken - 1), because k = abo The number of degrees of freedom for treatment. as expected, is equal to the sum of those for A. B. and AR; that is,
y2
ab - 1 = (a - I) + (b - 1) + (a - l)(b - I).
18.3
315
INTERACTION
F or the example under consideration, the number of degrees of freedom for treatment is 11; that is, 12 - 1
= 3 + 2 + 6.
The result of the partitioning of the sum of squares and its numbers of degrees of freedom is summarized in Table 18.2e. TABLE 18.2e Source of Variation
Sum of Squares
A B AB Error Total
156 112 24 36 328
Mean Squares MS
DF
SS
52 56 4 3
3 2 6 12 23
18.3 Interaction The interaction of the factors A and B is defined in the preceding section as
{f -
1> - - (1B - y).
The meaning of the above expression can be interpreted from the absence of interaction, that is, the interaction term being equal to zero, or
or
f=Y+{'A
-Y> +('8 -y).
It would be interesting to observe the characteristics of a set of treatment means so that all the interaction terms are equal to zero or the interaction 55 is equal to zero. From the general mean, A effects, and B effects given in Table 18.2a, one can create such a set of treatment means. The 12 means thus created are given in Table 18.3, which shows that the corresponding elements of any two rows or any two columns TABLE 18.3
I~ 1 2 3 mean
1
2
3
4
meaD
6+4-2=8 6+4-1=9 6 + 4 + 3 = 13
6+0-2=4 6+0-1=5 6+0+3=9
6-3-2=1 6-3-1=2 6-3+3=6
6-1-2=3 6-1-1=4 6 .... 1+3=8
4 5 9
10
6
3
5
6
316
FACTORIAL EXPERIMENT
Ch. 18
maintain the same difference. The four means of the Eirst row are 8,4,1,3, and those of the second row are 9,5,2,4. The diEEerence between the corresponding means is 1; that is, 9 - 8 = 5 4 =: 2 - 1 = 4 - 3 c 1. This is true for any two rows or any two columns. The implication of the absence of the interaction may be seen from an example. Suppose the 4 levels of the factor A are 4 difEerent quantities of the fertilizer A and the 3 levels of the factor B are 3 different quantities of the fertilizer B. Then the 12 treatment means are the average yields. Since the means 8,4,1,3 and 9,5,2,4 maintain a constant difference of 1, the interpretation is that the second quantity of the fertilizer B enables the crop to yield, on the average, one unit more than the first quantity, regardless of the level of fertilizer A used. Another example of absence of interaction mlly help to clarify further the meaning of interaction. Suppose the 4 levels of the factor A are the 4 classes of high school, senior, junior, sophomore, and freshman and the 3 levels of the factor B are 3 different schools. Each of the classes of the 3 schools has n students. The 12 treatment means are the average scores for a given examination. The absence of interaction implies that the students of school No. 2 scored, on the average, one point higher than those of school No.1, regardless of the grade of the students. Any departure from this condition means the existence of interaction between the schools and the years. The interaction SS measures this departure quantitatively. It is equal to zero only if all the interaction terms are equal to zero (Column 6, Table 18.2d). The interaction discussed here is a descriptive measure of the date. The statistical inference about the interaction is given in Section 18.5. 18.4 Compating Method The mechanics of the partition of the sum of squares are shown in detail in Section 18.2. In that section the simple example, involving mostly one-digit numbers, is deliberately chosen to avoid complicated computation. The purpose of that section is to show the meaning of partitioning the total SS into various components, but the method of doing so is too tedious to be of practical value. In this section, a short-cut method is presented for the computation of the analysis of variance. The
18.4
317
COMPUTING METHOD
TABLE 18.4a
X
I
1
~
2
...
a
Total
T,
,TB
TA
G
2
·· ·b Total
basic principles behind the short-cut method are discussed in Sections 12.3 and 14.4 and will not be repeated here. Furthermore, this section is mainly concerned with the partitioning of the treatment SS, with" -lor ab - 1 degrees of freedom, into three components, namely the ASS, B SS, and AB SSe The method of obtaining the treatment SS, error SS, and total SS for a completely randomized experiment is given in Section 12.3 and that of obtaining the replication SS, treatment SS, error SS, and total SS for a randomized block experiment is given in Section 14.4. For both cases, the method of partitioning the treatment SS into the three components is the same. The notations used here are the same as those defined in Section 18.2. The first step in the computation is to find the TABLE 18.4b Preliminary Calculations (1)
(2)
(3)
(4)
(5)
Type of Total
Total of Squares
No. of Items Squared
No. of Observations per Squared Item
Total of Squares per Observation (2) + (4)
Grand Factor A Factor B
G2
1
ITiA
a
nob nb
n
6
no
UI
06
n
IV
Treatment
Irs Ir:
I
Analysis of Variance Source of Variation
Sum of Squares
A B AB
III -I
0-1 b-l
I - II - III + IV IV - I
(0 - l)(b - 1) 06-1
Treatment
II-I
Degrees of Freedom
Mean Square
F
318
Ch. 18
FACTORIAL EXPERIMENT
ab treatment totals, T I' and then arrange these totals in a two-way table such as Table 18.4a. The rest of the computing procedure is sbown in Table 18.4b. This table, however, does not give the complete computing procedure. It should be used in conjunction with Table 12.3a for a completely randomized experiment and with Table 14.4a (or a randomized block experiment. TABLE 18.4c
~
I
2
3
4
Ta
1 2 3
14 22 24
8 8 20
4 4 10
6 6 18
32 40 72
TA
60
36
18
30
144
The computing procedure may be illustrated by the example given ir Table 18.2a. The 12 treatment totals are shown in Table 18.4c and the analysis o( variance computation is sbown in Table 18.4d. It should be noted that the values o( the various Ss..values are the same as those shown in Table 18.2e.
TABLE 18.4d Preliminary Calculations (1)
(2)
(3)
(4)
(5)
Type of Total
Total of Squares
No. of Items Squared
No. of Observations per Squared Item
Total 01 Squares per Observation (2) +.(4)
Grand Factor A Factor B Treatment Observation
20,736 6,120 7,808 2,312 1,192
1 4 3 12 24
24 6 8 2 1
864 1,020 976 1,156 1,192
Analysis of Variance Source of Variation
A B AB Error Total
.-
Sum of Squares
Degrees of Freedom
Mean Square
156 112 24 36 328
3 2 6 12 23
52 56 4 3
F
STATISTICAL INTERPRETAnON-FIXED MODEL
18.5
319
18.5 Statistical Interpretation-Fixed Model The nab observations of a completely randomized a x b factorial experiment with n replications are considered ab random samples, each consisting of n observations, drawn from ab normal populations with the same variance (II. An example of six population means of a 3 x 2 facTABLE 18.5a
~
1
2
3
Total
1 2
64 50
51 47
53 47
168 144
Total Mean I1A A effect
114 57 5
98
100 50 -2
312
49
-3
Mean
118
56 48
B effect 118-11
4 -4 0
11 = 52
0
I1A -11
torial experiment is shown in Table 18.5a. Each population mean #l is made of the general mean ~ A effect #lA - ~ B effect #lB - ~ and the interaction AB effect (11
-;D -
(I1A -
Ii) -
(118 -
iL) = 11-I1A
-118 + ii,
that is, (1)
The above equation is an algebraic identity. After simplification it becomes 11 = 11· The method of finding the general mean, A effect, B effect, and interaction AB effect is the same as that given in Section 18.2. The only difference is in the notations used. Such differences in notations are introduced to differentiate the parameters and statistics. The A effects and B effects are ~iven in Table 18.5a. The interaction effects are given TABLE 18.5b
~ 1 2
Total
Total
1
2
3
3 -3
-2 +2
-1
+1
0 0
0
0
0
0
in Table 18.5b. The six population means, expressed in terms of their components (Equation 1) are shown in Table 18.5c. The contrasting notations for the parameters and statistics are listed in Table 18.5d.
Ch.18
FACTORIAL EXPERIMENT
320
TABLE 18.5c
~
1
2
3
1 2
52 + 5 + 4 + 3 52+5-4-3
52 - 3 + 4 - 2 52- 3 - 4 + 2
52- 2 + 4-1 52 - 2 - 4 + 1
TABLE 18.5d Statistic
Parameter
General Mean
Il IlA -ji IlB -Ii 1l-IlA -IlB+ji
A Effect B Effect AB Effect
There are three variances connected with the ab population means ot a two-factor factorial experiment. The variance of the A effects is defined os
that of the B effects is defined as b
~(IlB-pY
I
uB -
b-1
(3)
,
and that of the interaction effects is defined as ab U
I AB
=
L: (Il -
Il A - Il B + ~)I (a _ 1){b - 1) .
In terms of the example given in Tables 18.5 a, b, c, (5)1 + {_3)1 + (_2)1
ul
A
-
= B
ul
3-1 (4)Z + {_4)1
2-1
38 --
2
= 19,
32 =-=32,
1
and Note that the divisors for the three variances are not the number of items squared but the number of degrees of freedom.
STATISTICAL INTERPRETATION-FIXED MODEL
18.5
321
The variance of the k or ob population means is defined as (Equation 2, Section 12.4) lib
2 0'
J.I.
1: (,.,. - ji.)2• =
(5)
Gb-l
For the example given in Table 18.5a,
(64 - 52)2 + ••• + (47 - 52)2 0'2
=
6 -1
J.I.
200 D-=
5
40.
This variance of the k or ob population means is a linear combination of the three variances o'~, o'~, and O'~B' More specifically, it is the weighted average of bu~, aO'~, and O'~B' with the numbers of degrees of freedom being the weights. In symbolic form, the relation is 0'2
= (0 -l)bO'A + a(b -l)O'~ + (0 -l)(b -l)O'AB
(6)
ab-l
J.I.
which can be verified by the example under consideration, that is, (3 - 1) (2) (19) + (3) (2 -1) (32) + (3 -1) (2 -1) (14)
40 =
200 =-.
(3 x 2) -I
5
The average mean squares for treatment and error are given in Equations (4) and (5) of Section 12.4 as: Average treabnent MS =
O'J
+
no;
Average error AlS = 0'2
(7) (8)
The average mean squares for A, B, and AB are given as follows:
= 0'2 + nbu A B &IS = 0'2 + nGq~
Average A MS Average
Average AB MS = 0'2 + no AB
(9) (10) ( 11)
The above equations are given without any justification, but they are reasonable to expect. Tile treatment MS is the weighted average of the three mean squares due to A, B, and A B; that is, treatment !tiS
(0 -1)(A MS) + (b -1)(B MS) + (0 -I)(b -1)(AB MS)
=- - - - - - - - - - - - - - - - - - - ob -1
(A SS) + (B SS) + (AB SS)
=----------ob -1 treatment 55 =----ab -1
FACTORIAL EXPERIMENT
322
Ch.18
The average treatment mean square is also the weighted average of the three average mean squares, with the numbers of degrees of freedom being the weights. This can be verified by the example given in Table 18. Sa. It is already determined that a = 3, b = 2, u~ = 19, = 32, u~B = 14, and u~ = 40. Suppose n = 10, u 2 = 100. Then the values of the average mean squares are:
ui
Treatment:
u 2 + 1k1~ = 100 + 10(40) - 500
A: u 2 + nbu~ = 100 + 10(2)(19) = 480 B: u 2 + nQ(1~ = 100 + 10(3)(32) = 1060 AB: u 2 + rJC1~B = 100 + 10(14) = 240 It can be observed that
500 =
2(4RO) + 1(1060) + 2(240) 2500
=-.
5
5
The algebraic proof of the stated relation can be carried out similarly by utilizing the equation that
ab - 1
= (a
- 1) + (b - 1) + (a - l)(b - 1)
and also Equation (6). The average means squares given in Equations (7) to (11) can be explained in terms of a sampling experiment. A group of 6 random samples,
TABLE 18.5e
~ 1
1
2
3
55 72 52
55 62 38 47
38 54 67 44 41 51 33 47
58
73
46
66
46 55 40 49 58
74 79 54 89 2
61 58
42 39 59 40 42 43 71
42
38 67 32 69 48 53 41 52 50 36
71
56 49 64
67 47 56 51 43 50 41 44
323
STATISTICAL INTERPRETATION-FIXED MODEL
18.5
TABLE 18.5f
X
1
2
3
Total
1 2
672 497
496 486
502 512
1,670 1,495
Total
1,169
982
1,014
3,165
each consisting of 10 observations, may be drawn from the tag population (Chapter 4), which is a normal population with mean equal to 50 and variance equal to 100. Then 14 is added to each of the 10 observations of the first sample, 1 is added to each of the 10 observations of the second sample, and so forth, so that the 6 population means would be equal to those given in Table 18.5a. An example of such a group of 60 observations is given in Table 18.5e. The treatment (sample) totals are given in Table 18.5f, and the mean squares are shown in Table 18.5g. If another group of 6 samples of the same size were drawn from the same 6 populations, a different set of mean squares would be obtained. When infinitely many groups of samples are drawn, there will be infinitely many sets of values for the mean squares. An average mean square is the average of the infinitely many values of a particular mean square. It should be noted from Table 18.5g that the mean squares computed from one group of 6 samples do not necessarily agree with their respective average mean squares.

TABLE 18.5g
Preliminary Calculations

 (1)             (2)            (3)            (4)                (5)
 Type of         Total of       No. of Items   Observations per   Total of Squares per
 Total           Squares        Squared        Squared Item       Observation (2) ÷ (4)

 Grand          10,017,225           1              60               166,953.75
 A               3,359,081           3              20               167,954.05
 B               5,023,925           2              30               167,464.17
 Treatment       1,694,953           6              10               169,495.30
 Observation       175,981          60               1               175,981.00

Analysis of Variance

 Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   Average MS
 A                         1,000.30             2               500.15         480
 B                           510.42             1               510.42        1060
 AB                        1,030.83             2               515.42         240
 Error                     6,485.70            54               120.11         100
 Total                     9,027.25            59
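The sampling experiment itself is easy to imitate on a computer. The sketch below, again in Python, draws one group of 6 samples of 10 observations and computes the four mean squares. Only the first two population means (64 and 51) follow from the text; the remaining four are invented for illustration, so the printed values will merely scatter about the average mean squares.

```python
# One group of 6 samples for the 3 x 2 factorial sampling experiment.
import random

a, b, n = 3, 2, 10
# hypothetical population means for the 6 treatment combinations,
# indexed [level of A][level of B]; only 64 and 51 come from the text
mu = [[64, 51], [50, 49], [52, 53]]

y = [[[random.gauss(mu[i][j], 10) for _ in range(n)]
      for j in range(b)] for i in range(a)]

G = sum(sum(sum(cell) for cell in row) for row in y)
I   = G * G / (a * b * n)                                              # grand
II  = sum(sum(sum(cell) for cell in row) ** 2 for row in y) / (b * n)  # A totals
III = sum(sum(sum(y[i][j]) for i in range(a)) ** 2
          for j in range(b)) / (a * n)                                 # B totals
IV  = sum(sum(y[i][j]) ** 2 for i in range(a) for j in range(b)) / n   # treatments
V   = sum(obs * obs for row in y for cell in row for obs in cell)      # observations

A_ms   = (II - I) / (a - 1)
B_ms   = (III - I) / (b - 1)
AB_ms  = (IV - II - III + I) / ((a - 1) * (b - 1))
err_ms = (V - IV) / (a * b * (n - 1))
print(A_ms, B_ms, AB_ms, err_ms)   # one set of mean squares; their long-run
                                   # averages are the average MS of Table 18.5g
```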
18.6 Models--Test of Hypotheses

In the preceding section, the ab population means are considered fixed quantities. This interpretation is called the linear hypothesis model or the fixed model (Section 12.4) of the analysis of variance. However, other interpretations of the population means are also possible. The μ_A's and μ_B's may be considered samples of large numbers of μ_A's and μ_B's. This interpretation of the population means is called the component of variance model or the random variable model. It is also possible to mix these two models; that is, one of the two factors fits the linear hypothesis model, while the other fits the component of variance model.

An example at this stage may help to distinguish the various models. The data of the 3 × 2 factorial experiment given in Table 18.5e can serve this purpose. The three levels of the factor A may be considered three high schools, and the two levels of the factor B two teaching methods used in instructing a certain subject. An observation y is the test score of a student. If one's interest is only in comparing these three particular schools and these two particular teaching methods, the linear hypothesis model is the appropriate interpretation. On the other hand, the component of variance model is the appropriate interpretation if one is interested in knowing whether, in general, schools and teaching methods would affect the students' scores, and only incidentally selected these three schools and two teaching methods for experimentation. If one is interested only in the comparison of these two particular teaching methods and incidentally selected the three schools out of many, the mixed model is the appropriate interpretation.

The reason for discussing the various models of the analysis of variance is that the average mean squares are different for the different cases. Those given in Equations (9), (10), and (11) of Section 18.5 are correct for the linear hypothesis model only. As a contrast, the average mean squares for the linear hypothesis model and the component of variance model are listed in Table 18.6. The three variances σ²_A, σ²_B, and σ²_AB for the linear hypothesis model are defined in Equations (9), (10), and (11), Section 18.5. The three variances σ'²_A, σ'²_B, and σ'²_AB for the component of variance model constitute a different set of quantities. They are the variances of all the population means rather than of the particular population means involved in the experiment. The prime is used to differentiate these two kinds of variances. The purpose of listing the average mean squares for the two models, however, is not to show the difference in the variances with or without the prime, but to furnish a
guide in finding the F-value in testing a particular hypothesis. In the linear hypothesis model, each of the A, B, and AB mean squares should be divided by the error mean square to test the three hypotheses that (1) the factor A has no effect on the mean of y, (2) the factor B has no effect on the mean of y, and (3) there is no interaction between factors A and B. But in the component of variance model, to test the same three hypotheses, the A and B mean squares should be divided by the AB mean square (Table 18.6), and the AB mean square should be divided by the error mean square. The test procedure for a mixed model is the same as that for the component of variance model. Like all analyses of variance, the F-test is a one-tailed test. The numbers of degrees of freedom of F, as usual, are those of the mean squares used as the numerator and the denominator in finding the F-value.

TABLE 18.6

                              Average Mean Square
 Component         Linear Hypothesis       Component of Variance
 Factor A          σ² + nbσ²_A             σ² + nσ'²_AB + nbσ'²_A
 Factor B          σ² + naσ²_B             σ² + nσ'²_AB + naσ'²_B
 Interaction AB    σ² + nσ²_AB             σ² + nσ'²_AB
 Error             σ²                      σ²
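The choice of denominators described above can be stated compactly in code. This is a sketch of the stated rules, not part of the original text; the mean squares entered are those of Table 18.5g.

```python
# Form the F-ratios for the three hypotheses under the two models.
ms = {"A": 500.15, "B": 510.42, "AB": 515.42, "error": 120.11}

def f_ratios(ms, model):
    """F-values for the A, B, and AB hypotheses under the given model."""
    if model == "fixed":                 # linear hypothesis model
        denom = ms["error"]
    else:                                # component of variance or mixed model
        denom = ms["AB"]
    return {"A":  ms["A"] / denom,
            "B":  ms["B"] / denom,
            "AB": ms["AB"] / ms["error"]}  # AB is always tested against error

print(f_ratios(ms, "fixed"))    # A: 500.15/120.11 = 4.16, etc.
print(f_ratios(ms, "random"))   # A: 500.15/515.42 = 0.97, etc.
```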
The number of degrees of freedom for the error MS depends on the design used for the experiment. For a completely randomized experiment, the number of degrees of freedom is k(n - 1) or ab(n - 1). For a randomized block experiment, the number of degrees of freedom is (k - 1)(n - 1) or (ab - 1)(n - 1), where n is the number of replications (blocks).
18.7 Tests of Specific Hypotheses

The methods of testing the three general hypotheses are given in the preceding section. In addition to these methods, the individual degree of freedom (Section 15.3), the least significant difference (Section 15.4), the linear regression (Section 17.8), and the multiple range test (Section 15.5) can also be used in connection with the factorial experiment for testing more specific hypotheses. Furthermore, these methods can be used not only on the treatment means, each based on n observations, as described previously, but also on the means of the a levels of the factor A and the means of the b levels of the factor B. However, one must be aware that each of the A means is based on nb observations and that
each of the B means is based on na observations. In using the individual degree of freedom, the letter n in Equation (11), Section 15.3, and Equation (6), Section 17.8, should be interpreted as the number of observations from which a total T is computed. The example given in Table 18.5e may be used as an illustration. If an individual degree of freedom is used on the 6 treatment totals, n is equal to 10. If it is used on the 3 A-totals, n is replaced by nb or 20; if it is used on the 2 B-totals, n is replaced by na or 30. The use of the least significant difference between means also follows the same rule. The least significant difference, as given in Inequality (2), Section 15.4, is

LSD = t.025 √(2s²/n).    (1)

It may be used on the treatment means, A-means, or B-means, but the letter n is to be interpreted as the number of observations from which a mean is computed. This principle also applies to the new multiple range test. In finding √(s²/n), the letter n is subject to the same interpretation.
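As an illustration of this rule, the following sketch computes the least significant difference for the three kinds of means of the example. The t-value 2.005, roughly the 2.5% point with 54 degrees of freedom, is supplied from a t-table; the error mean square 120.11 is that of Table 18.5g.

```python
# LSD for treatment means, A-means, and B-means of the 3 x 2 example.
import math

s2, t_025 = 120.11, 2.005     # error MS (54 d.f.) and tabled t-value
a, b, n = 3, 2, 10

def lsd(n_per_mean):
    # n_per_mean is the number of observations behind each mean compared
    return t_025 * math.sqrt(2 * s2 / n_per_mean)

print(lsd(n))        # treatment means, n  = 10
print(lsd(n * b))    # A-means,         nb = 20
print(lsd(n * a))    # B-means,         na = 30
```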
18.8 Hierarchical Classification

An experiment involving multiple factors is not necessarily a factorial one, which requires that factors A and B crisscross to form the treatment combinations. If the factors do not crisscross, but one factor nests inside another, the experiment is called a hierarchical one. For example, samples of iron ore may be sent to four different chemical laboratories for analysis, and each laboratory may assign three technicians to carry out the chemical analysis, each technician making two determinations of the percentage of the iron content in the ore. Such a set of data may be tabulated as shown in Table 18.2a, where the four levels of the factor A are the laboratories, the three levels of the factor B are the technicians, and the two observations in each cell are the determinations. Such a tabulation indeed resembles the factorial experiment; however, the data are really of the hierarchical classification. The three technicians of the first laboratory may be Jones, Smith, and Brown; those of the second may be White, Johnson, and Miller; those of the third may be Carpenter, Robinson, and Riley; and those of the fourth may be Anderson, Howard, and Walker. These people are 12 individual technicians employed by four independent laboratories. They are not three sets of quadruplets crossing the laboratory lines. Instead of calling the laboratories and technicians the factors A and B, to call them tiers A and B would be
more appropriate. So this set of data has three tiers, namely laboratory, technician, and determination. Another illustration may help to clarify further the meaning of the hierarchical classification. A college may be divided into a number of schools, such as the school of science and the school of engineering. The school of science may have such departments as mathematics, physics, and chemistry; the school of engineering may have the departments of civil engineering, mechanical engineering, and electrical engineering; and within the departments are the faculty members. Even though both schools are subdivided into departments, there is no one-to-one correspondence between the departments of the school of science and those of the school of engineering. The schools and departments are two tiers rather than two factors. The difference between the factorial and hierarchical experiments can be illustrated by the following diagrams:
[Diagrams: in the factorial experiment, each of the levels 1, 2 of B crosses all four levels 1, 2, 3, 4 of A; in the hierarchical experiment, each level of A carries its own separate levels of B.]
In the hierarchical classification, each observation may be broken down into (1) the general mean, (2) A effect, (3) B effect within tier A, and (4) error within tier B; that is,

y = μ̄ + (μ̄_A - μ̄) + (μ_AB - μ̄_A) + (y - μ_AB).    (1)
The notations used here have exactly the same definitions as given in Section 18.2. In comparing the above equation with Equation (1), Section 18.2, one will notice that the general mean, A effect, and error are the same for both equations. The only difference is that the B effect and
AB effect of the factorial experiment are added together to form the B effect within tier A of the hierarchical experiment. As a result, the total SS, A SS, and error SS are the same for both cases. The B SS within tier A is equal to the sum of the B SS and AB SS. Therefore, the details of the partition of the sum of squares are omitted here. The short-cut method of computation is given in Table 18.8a. The letter a represents the number of levels of the tier A; the letter b is the number of levels of the tier B within each level of A; the letter n is the number of observations within each level of tier B. As a further illustration, the data of Table 18.2a may be considered a hierarchical experiment and analyzed as such. The result is given in Table 18.8b. The purpose of analyzing the same set of data by two different methods, factorial and hierarchical, is to show the numerical relation between these two methods; for example, the B SS within the tier A is equal to the sum of the B SS and AB SS.

Table 18.8a
Preliminary Calculations

 (1)              (2)          (3)            (4)                (5)
 Type of          Total of     No. of Items   Observations per   Total of Squares per
 Total            Squares      Squared        Squared Item       Observation (2) ÷ (4)

 Grand            G²                1             nab                    I
 Tier A           ΣT²_A             a             nb                     II
 B within A       ΣT²_B             ab            n                      III
 Observation      Σy²               nab           1                      IV

Analysis of Variance

 Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F
 Tier A                    II - I               a - 1
 B within A                III - II             a(b - 1)
 Error (within B)          IV - III             ab(n - 1)
 Total                     IV - I               nab - 1

One should not acquire the
impression that the method of analysis for a given set of data is arbitrarily chosen. The method of analysis is determined by the method by which the experiment is carried out, or by the physical meaning of the experiment, and not by the appearance of the tabulation of the data. As long as two-dimensional paper is used for tabulation, all tables have similar appearances. The figures are arranged either in columns, in rows, or in columns and rows. Therefore, the appearance of a table can hardly
be used as a guide in selecting an appropriate statistical method.

The average mean squares are given in Table 18.8b. The variances σ² and σ²_A have the same definitions as given in Section 18.5, while σ²_B(A) is the variance of the B effects within tier A. In terms of the example of the iron ore, σ² is the variance among determinations made by a technician, σ²_B(A) is the variance among the means of the determinations made by different technicians of a laboratory, and σ²_A is the variance among the means of the determinations made by different laboratories. The average mean squares (Table 18.8b) may be used as a guide in selecting the appropriate denominator for the F-value in testing a hypothesis. The statistic

F = (Tier A MS) / (B within A MS)

is used in testing the hypothesis that σ²_A = 0. The statistic

F = (B within A MS) / (Error MS)

is used in testing the hypothesis that σ²_B(A) = 0.
Table 18.8b
Preliminary Calculations

 (1)             (2)          (3)            (4)                (5)
 Type of         Total of     No. of Items   Observations per   Total of Squares per
 Total           Squares      Squared        Squared Item       Observation (2) ÷ (4)

 Correction      20,736            1              24                    864
 Tier A           6,120            4               6                  1,020
 B within A       2,312           12               2                  1,156
 Observation      1,192           24               1                  1,192

Analysis of Variance

 Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   Average Mean Square
 Tier A                     156                 3                 52        σ² + nσ²_B(A) + nbσ²_A
 B within A                 136                 8                 17        σ² + nσ²_B(A)
 Error (within B)            36                12                  3        σ²
 Total                      328                23
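The whole analysis of Table 18.8b can be reproduced from the five preliminary totals alone. The following minimal sketch does so and forms the two F-values just described.

```python
# Hierarchical analysis of variance from the preliminary totals of
# Table 18.8b (a = 4 laboratories, b = 3 technicians, n = 2).
a, b, n = 4, 3, 2
I, II, III, IV = 864, 1020, 1156, 1192

ss = {"tier A": II - I, "B within A": III - II, "error": IV - III}
df = {"tier A": a - 1, "B within A": a * (b - 1), "error": a * b * (n - 1)}
ms = {k: ss[k] / df[k] for k in ss}

print(ss, ms)                           # SS 156, 136, 36; MS 52, 17, 3
print(ms["tier A"] / ms["B within A"])  # F for sigma2_A = 0:    3.06
print(ms["B within A"] / ms["error"])   # F for sigma2_B(A) = 0: 5.67
```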
The analysis of variance with hierarchical classification does not present any difficulty in computation, even if the numbers of subdivisions are not the same for the different levels of a tier. Since the hierarchical classification is a one-way classification (Chapter 12) within another, the computing method for single classification with unequal numbers of observations may be used repeatedly at the different tiers. The basic principles of this method are given in Section 12.10 and therefore are not repeated here. The difference in the computing method between the equal and unequal numbers of subdivisions is in replacing

(T₁² + T₂² + ⋯)/n    by    T₁²/n₁ + T₂²/n₂ + ⋯,
Table 18.8c

 Tier A   Tier B        y                  y²                    T     n     T²/n
    1        1       19, 23           361, 529                  42     2      882
             2       17, 15, 16       289, 225, 256             48     3      768
             3       18               324                       18     1      324
    2        1       12, 16           144, 256                  28     2      392
             2       20, 24           400, 576                  44     2      968
    3        1       21, 23, 22       441, 529, 484             66     3    1,452
             2       18, 21, 17, 16   324, 441, 289, 256        72     4    1,296
             3       28               784                       28     1      784
             4       19, 15           361, 225                  34     2      578

 Totals of tier A:   (1) T = 108, n = 6, T²/n = 1,944;  (2) T = 72, n = 4, T²/n = 1,296;
                     (3) T = 200, n = 10, T²/n = 4,000.
 Grand total:        T = 380, n = 20, T²/n = 7,220.

                               Sum      No. of Totals
   IV   Σy²                   7,494          20
   III  Σ(T²/n), tier B       7,444           9
   II   Σ(T²/n), tier A       7,240           3
   I    G²/Σn                 7,220           1
where T is any kind of total and n is the number of observations in that total. The rest of the computing procedure is the same for both cases. An example of computation for the hierarchical classification with unequal numbers of subdivisions is shown in Table 18.8c. The first two columns identify an observation, and the observation (y) itself is shown in the third column. The rest of the table shows the details of computation. The procedure involves finding the totals (T) and counting the numbers (n) of observations for the different levels of the different tiers. Then one finds the quantity Σ(T²/n) for each tier and counts the number of totals in each such quantity, as shown at the bottom of the table. The sums of squares and their numbers of degrees of freedom can be obtained by finding the differences between two adjacent terms, that is, II - I for A, III - II for B within A, IV - III for error within B, and IV - I for total. The analysis of variance table showing these components is given in Table 18.8d.
Table 18.8d

 Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square     F
 Tier A                      20                 2               10.00       0.29
 B within A                 204                 6               34.00       7.47
 Error                       50                11                4.55
 Total                      274                19
The amount of computing work involved is not nearly so great as it seems. Many of the intermediate steps shown in Table 18.8c may be omitted. Each of the quantities Σy² and Σ(T²/n) may be obtained in one continuous operation on a desk calculator. Therefore the columns for y² and T²/n for the various tiers are not really necessary.
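For readers who replace the desk calculator by a computer, the sketch below carries out the entire procedure of Tables 18.8c and 18.8d on the same 20 observations, with the data held as nested lists so that unequal numbers of subdivisions cause no difficulty.

```python
# Hierarchical classification with unequal subdivisions (Table 18.8c):
# data nested as tier A -> tier B -> observations.
data = [
    [[19, 23], [17, 15, 16], [18]],                    # tier A, level 1
    [[12, 16], [20, 24]],                              # tier A, level 2
    [[21, 23, 22], [18, 21, 17, 16], [28], [19, 15]],  # tier A, level 3
]

obs = [y for A in data for B in A for y in B]
G, N = sum(obs), len(obs)

I   = G * G / N
II  = sum(sum(y for B in A for y in B) ** 2 / sum(len(B) for B in A)
          for A in data)
III = sum(sum(B) ** 2 / len(B) for A in data for B in A)
IV  = sum(y * y for y in obs)
print(I, II, III, IV)                 # 7,220  7,240  7,444  7,494

k_A = len(data)                       # 3 tier-A totals
k_B = sum(len(A) for A in data)       # 9 tier-B totals
ss = (II - I, III - II, IV - III)     # 20, 204, 50 (Table 18.8d)
df = (k_A - 1, k_B - k_A, N - k_B)    # 2, 6, 11
ms = [s / d for s, d in zip(ss, df)]  # 10.00, 34.00, 4.55
print(ms[0] / ms[1], ms[1] / ms[2])   # F = 0.29 and 7.47
```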
18.9 Sampling Error

The hierarchical classification of the analysis of variance presented in the preceding section is often used in combination with other experimental designs. An example of the application of the analysis of variance given in Section 12.8 may be used as an illustration. A manufacturer has three different processes of making fiber boards and wishes to determine whether these processes produce equally strong boards. A random sample of 20 boards is to be obtained from each of the products manufactured by the three processes. The strength (observation) of each of the 60 boards is determined. Then the analysis of variance may be used to test the hypothesis that the average strengths of the boards
produced by the three different processes are the same. The analysis, as shown in Section 12.8, is as follows:

 Sources of Variation     Degrees of Freedom
 Among processes                   2
 Within processes                 57
 Total                            59
However, one may take more than one observation for each board. Suppose 4 measurements of strength are made on each board. There will be 240 observations. The analysis (Table 18.8a) is as follows:

 Sources of Variation                                   Degrees of Freedom
 Among processes                                                 2
 Among boards, within processes (experimental error)            57
 Within boards (sampling error)                                180
 Total                                                         239
In this experiment, a board is called a sampling unit. The variation among the sampling units is called the experimental error (Section 14.8), while the variation within the sampling units is called the sampling error. The tomato experiment of Table 14.5a may be used as an illustration of the use of the hierarchical classification in a randomized block experiment. There are 6 varieties and 5 replications and, therefore, 30 plots in this experiment. Here a plot is a sampling unit. The error with 20 degrees of freedom (Table 14.5b) is the experimental error. If the weights of tomatoes of individual plants were recorded, and supposing there were 10 plants in each plot, the analysis would be as follows:

 Sources of Variation     Degrees of Freedom
 Replication                       4
 Variety                           5
 Experimental error               20
 Sampling error                  270
 Total                           299

The procedure of testing a hypothesis in this experiment is the same as that described in the preceding section. The item variety is interpreted as tier A, the experimental error as tier B within A, and the sampling error as within tier B.
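The bookkeeping of the degrees of freedom in such an experiment can be checked by a few lines of code; the sketch below uses the fiber-board example, with the symbols k, b, and m our own.

```python
# Degree-of-freedom partition for k processes, b boards per process,
# and m measurements per board.
k, b, m = 3, 20, 4
N = k * b * m                        # 240 observations

df_process  = k - 1                  # among processes:             2
df_exp_err  = k * (b - 1)            # among boards, within proc.: 57
df_samp_err = k * b * (m - 1)        # within boards:             180
assert df_process + df_exp_err + df_samp_err == N - 1   # total:  239
print(df_process, df_exp_err, df_samp_err, N - 1)
```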
EXERCISES

1. Make three sets of fictitious treatment (sample) means for a 3 × 4 factorial experiment so that (i) A SS is equal to zero, but B SS and AB SS are not equal to zero; (ii) B SS is equal to zero, but A SS and AB SS are not equal to zero; (iii) AB SS is equal to zero, but A SS and B SS are not equal to zero.

2. The following data are those of a completely randomized 3 × 2 factorial experiment.

[Table: the 12 observations, two in each of the 6 cells (factor A at 3 levels, factor B at 2 levels); the values are too badly scrambled in this copy to be reproduced.]
(i) Express each of the 12 observations as the sum of the general
mean, A effect, B effect, AB effect, and error. (ii) Find the total SS, A SS, B SS, AB SS, and error SS from the components of the observations. (iii) Find the same set of SS-values by the short-cut method and note that the values thus obtained are the same as those obtained in (ii).
3. For the following set of 10 population means, find σ²_A, σ²_B, and σ²_AB.

  B\A     1     2     3     4     5
   1     63    84    45    78    60
   2     75    92    57    46    85
4. Twelve random samples, each consisting of two observations, are drawn from the tag population with mean equal to 50 and variance equal to 100. The data are tabulated as follows:

[Table: the 24 observations, two in each of the 12 cells (factor A at 4 levels, factor B at 3 levels); the values are too badly scrambled in this copy to be reproduced.]
Test the hypotheses, at the 5% level, that (i) the population means of the four levels of the factor A are the same (F = 1.16), (ii) the population means of the three levels of the factor B are the same (F = 0.36), and (iii) the interaction between the factors does not exist (F = 0.33). Since the sources of these samples are known, state whether your conclusions are correct or whether errors (Type I or Type II) have been made.

5. Add appropriate numbers to the observations of Exercise 4 so that the 12 population means are as follows:
  B\A     1     2     3     4
   1      50    50   150   150
   2      50    50    50    50
   3     150   150    50    50
Which of the three hypotheses are correct? Test the hypotheses at the 5% level and see if your conclusions are correct (Exercise 7).

6. Add appropriate numbers to the observations of Exercise 4 so that the 12 population means are as follows:
~"<
IT
1
2
3
4
350 250 250
150 50 50
250 150 150
150 50 50
Which of the three hypotheses are correct? Test the hypotheses at the 5% level and see if your conclusions are correct (Exercise 7).

7. The mean squares for Exercises 4, 5, and 6 are listed below:

 Component      Ex. 4        Ex. 5         Ex. 6
 A              80.00        80.00      51,080.00
 B              24.50     5,991.16      26,691.16
 AB             22.83     7,189.50          22.83
 Error          69.00        69.00          69.00

The value of a mean square may or may not be the same for the three exercises. Why? Explain each component separately. Hint: find σ²_A, σ²_B, and σ²_AB for the population means of Exercises 4, 5, and 6.
8. The data of a completely randomized 2 × 2 factorial experiment with 4 replications are given in the following table. The number in parentheses is the treatment number.

  B\A            1                   2
   1      (1) 52 51 62 31     (2) 57 40 29 48
   2      (3) 85 29 59 62     (4) 36 54 38 49

(i) Find the A SS, B SS, AB SS, and error SS. (ii) Find the treatment SS and the error SS only. Then break down the treatment SS into three individual degrees of freedom by the multipliers given in Set 2, Table 18.3b. Observe that the three individual degrees of freedom are the A SS, B SS, and AB SS respectively. Why?

9. Consider the data given in Exercise 4 those of a randomized block experiment, with the first observation in each cell belonging to the first block and the second observation in each cell belonging to the second block. Test the same three hypotheses of Exercise 4. Since the source of the data is known, state whether your conclusions are correct.

10. Consider the data of Exercise 8 those of a hierarchical classification. (i) Find the A SS, B SS within A, and error SS by the method given in Table 18.8a. Note that the B SS within A is equal to the sum of the B SS and AB SS in Exercise 8, Part (i). (ii) Repeat Part (i) by the method given in Table 18.8c and note that the values obtained by the two methods are the same. (iii) Break down the treatment SS of Part (ii), Exercise 8, into three individual degrees of freedom so that one degree of freedom is for A and the sum of the two other individual degrees of freedom is for B within A.

11. Express each of the 20 observations (y) given in Table 18.8c as the sum of the general mean, A effect, B effect within A, and error. Then from these components find the various SS-values.

12. The purpose of this experiment is to determine the effect of 4 different fertilizers and 3 different weed killers on the yield of Alta Fescue (grass). The 12 treatment combinations are randomized within each of the 4 blocks (replications). The yields, in grams, of the 48 plots are tabulated as follows:
                                          Block
 Weed Killer    Fertilizer         1      2      3      4
 Blank          Blank             50     91     85     82
                NuGreen          266    258    234    261
                (NH₄)₂SO₄        303    243    240    239
                CaCN₂            175    252    227    114
 IPC            Blank             75     56     90     65
                NuGreen          317    173    251    238
                (NH₄)₂SO₄        303    288    245    238
                CaCN₂            281    265    241    209
 CIPC           Blank            152    103    179    154
                NuGreen          461    383    391    339
                (NH₄)₂SO₄        403    466    387    388
                CaCN₂            344    295    388    274
Using the 5% significance level, test the hypotheses that (1) there is no interaction between fertilizers and weed killers; (2) the different fertilizers have the same effect on the yield of grass; and (3) the different weed killers have the same effect on the yield of grass. After the analysis, summarize your conclusions in a short paragraph.
13. The following table gives the percentage shrinkage during dyeing of four types of fabrics at four different dye temperatures.

                              Temperature
 Fabric       210°F       215°F       220°F       225°F
   I        1.8  2.1    2.0  2.1    4.6  5.0    7.5  7.9
   II       2.2  2.4    4.2  4.0    5.4  5.6    9.8  9.2
   III      2.8  3.2    4.4  4.8    8.7  8.4   13.2 13.0
   IV       3.2  3.6    3.3  3.5    5.7  5.8   10.9 11.1
This is a completely randomized 4 × 4 factorial experiment with 2 replications. Test the main effects and the interaction at the 5% level.

14. An experiment was conducted to investigate the effects of (a) the date of planting and (b) the application of fertilizer on the yield of soybeans. The randomized block design with four replications was used. The yields of the 32 plots are given in the following table:
                                        Replicate
 Date of Planting   Fertilizer      1       2       3       4
 Early              Check         28.6    36.8    32.7    32.6
                    Aero          29.1    29.2    30.6    29.1
                    Na            28.4    27.4    26.0    27.7
                    K             29.2    28.2    32.0    29.3
 Late               Check         30.3    32.3    31.6    30.9
                    Aero          32.7    30.8    31.0    33.8
                    Na            30.3    32.7    33.0    33.9
                    K             32.7    31.7    31.8    29.4
Test the various hypotheses at the 5% level and write a summary of the findings of the experiment.

15. The following data were obtained in a study of the effectiveness of benzaldehyde 3-thiosemicarbazone and two of its analogs against vaccinia virus in chick embryos. Six eggs were used at each of three virus dilutions for each compound tested. The entire experiment was done twice at different times (replications). The following table shows the mean reciprocal survival times (the mean of the values of 10⁴/time in hours) obtained for each group of six eggs.

                                     Substituent
                   Unsubstituted        p-amino          p-methoxy
 Virus Dilution    Rep. 1  Rep. 2    Rep. 1  Rep. 2    Rep. 1  Rep. 2
 10^-4.0             87      90        82      71        72      77
 10^-4.3             79      77        80      73        72      70
 10^-4.6             66      81        72      68        62      61
Test the main effects and the interaction at the 5% level. (Hamre, Dorothy, Brownlee, K. A., and Donovick, Richard: "Studies on the Chemotherapy of Vaccinia Virus, The Activity of Some Thiosemicarbazones," The Journal of Immunology, Vol. 67, 1951, pp. 305-312.)

16. In a study of bacterial counts by the plate method, twelve observers counted each of twelve plates three times. The first time the plates were counted the plates were labelled 1 to 12. They were then taken away and re-numbered 13 to 24 in an order different from that previously used. For the third count they were re-numbered 25 to 36, again in a different order. While the same twelve plates were counted three times by each observer, the impression was given that 36 different plates were being provided. Each observer entered the count on a slip of paper; this was removed after each series of twelve plates so that, if suspicion was aroused that the same plates
were being re-counted, no reference could be made to the previous figures obtained. The results of the counts are given in the accompanying table. Does the average bacterial count vary with the observer?

[Table: the triplicate counts of the twelve plates (R1, R4, R6, R7, R10, R12, P2, P3, P5, P8, P9, P11) made by each of the twelve observers A. through L.; the body of this large table is too badly scrambled in this copy to be reproduced.]
(Wilson, G. S.: "The Bacteriological Grading of Milk," His Majesty's Stationery Office, London, 1935)
17. The purpose of this experiment is to study the effect of 4 different baking temperatures and 5 different recipes on the size of the cake, whose cross-section is measured in square inches. The 40 cakes used in this experiment were individually mixed and baked. The areas of the cross-sections of the cakes are given below:
                                         Recipe
 Temperature   Plain Cake    3% GMS      6% GMS      3% Aldo     6% Aldo
 218°C         4.26 4.49    5.35 5.39   5.67 5.67   5.30 5.67   5.52 5.80
 190°C         4.59 4.45    4.75 5.10   5.30 5.57   5.00 5.02   5.41 5.29
 163°C         4.63 4.63    4.56 4.91   4.80 4.86   4.79 4.88   4.65 4.80
 149°C         4.01 4.08    3.87 3.74   4.13 4.03   3.98 4.11   4.16 4.35
Using the 5% significance level, test the hypotheses that (1) there is no interaction between recipe and temperature, (2) the recipe has no effect on the size of the cake, and (3) the temperature has no effect on the size of the cake. If the temperature is significant, use the new multiple range test (Section 15.5) to rank the temperatures according to the average cake sizes. Test the specific hypotheses concerning the recipes by the individual degrees of freedom (Exercise 5, Chapter 15). After the analysis, list all the conclusions.

18. In an effort to reduce labor absenteeism in a large plant, 2 variables were studied. Once hired, applicants were assigned at random to one of 6 groups. These groups were in different parts of the plant, and conditions were as nearly as possible alike except for the 3 different lengths of work week and the presence or absence of music. This is a completely randomized 3 × 2 factorial experiment with 30 replications. The results, in terms of numbers of half days absent during the succeeding quarter, are given in the accompanying table. Do the length of work week and the presence of music affect absenteeism?
                         Length of Work Week
       35-hour            40-hour            48-hour
   Mus.   No Mus.     Mus.   No Mus.     Mus.   No Mus.
     0       0          0       0          0       0
     0       0          0       0          0       0
     0       0          0       0          0       0
     0       0          0       0          0       0
     0       1          0       0          0       0
     0       2          0       0          1       1
     1       2          1       1          1       1
     1       2          1       1          1       2
     1       2          1       1          2       2
     1       2          1       1          2       2
     1       2          1       2          2       2
     2       3          2       2          2       2
     2       3          2       2          2       3
     2       3          2       2          2       4
     3       4          2       2          2       4
     3       4          2       2          3       6
     4       4          2       2          3       6
     4       5          2       3          5       6
     4       5          2       3          8       6
     4       5          3       3          8       6
     4       6          4       4         10       6
     5       7          4       4         10       7
     8       8          4       4         10      10
    12      10          5       4         16      10
    12      10          6       6         16      11
    13      12          6       6         20      12
    20      14         10       6         22      14
    20      14         10       6         23      14
    21      16         12      10         30      29
    29      38         16      28         30      34
19. A group of 60 auto mechanics working at several different garages, all of which are agencies for the same car manufacturer, are chosen as the subjects in a study of the effectiveness of the company's on-the-job refresher courses. First, all are given a trade examination. They are then classified into three groups of 20 each, namely the top, middle, and bottom third according to their performance on the test. Each group of 20 is then divided at random into two subgroups. One subgroup of 10 mechanics attended the refresher course; the other did not. At the end of the course, all 60 mechanics are given another examination. The scores of the second examination are as follows:
                                  First Examination
                          Bottom             Middle               Top
 Training Course      13 14 16 16 16     26 29 30 30 33     35 36 38 38 40
                      17 18 20 20 22     33 34 35 35 37     41 41 44 44 46
 No Training Course   17 17 18 18 20     24 26 26 28 28     29 33 34 36 36
                      22 22 24 25 26     28 32 32 33 34     38 41 41 43 47
This is a completely randomized 3 × 2 factorial experiment with 10 replications. (a) Test the hypothesis that the interaction is absent, at the 5% level. What is the meaning of interaction in this experiment? (b) Test the hypothesis that the training course is valueless, at the 5% level.

20. The following data were obtained in a study of the effect of induced maltase formation on the free glutamic acid depletion of nitrogen-replenished yeast cells.
  Flasks     Inductor Absent     Inductor Present
    1          19.6  20.4           10.3  10.1
    2          17.9  17.2           10.9  10.8
    3          17.2  18.0            9.9   9.9
    4          18.9  19.6           11.1  11.5
    5          17.3  17.5           12.0  11.8
α-methyl-glucoside was used to induce maltase formation. Cells were nitrogen-starved for 80 minutes in a synthetic medium, replenished with NH₄Cl for 15 minutes, centrifuged, washed, and resuspended in buffer. Equal aliquots were placed in 10 flasks containing glucose with or without the inductor. Following incubation, two determinations per flask were made of the free glutamic acid content of centrifuged washed cells, obtained manometrically by the decarboxylase method. Test for a difference in the average glutamic acid between the two treatments, at the 5% level. (Halvorson, Harlyn O., and Spiegelman, S.: "Net Utilization of Free Amino Acids During the Induced Synthesis of Maltozymase in Yeast," Journal of Bacteriology, Vol. 65, 1953, pp. 601-608.)

21. This is exactly the same experiment as Exercise 20. Arginine instead of glutamic acid was measured. Test for a difference in arginine between the two treatments, at the 5% level.
  Flasks     Inductor Absent     Inductor Present
    1           3.2  3.2             1.7  1.7
    2           2.4  2.4             1.9  2.3
    3           2.6  2.9             1.8  1.6
    4           2.8  3.0             1.9  1.8
    5           2.6  2.6             2.0  1.9
QUESTIONS

1. What is a factorial experiment?
2. What are the advantages of a factorial experiment over a series of simple experiments?
3. What are the sources of variation and their numbers of degrees of freedom for a 5 × 7 factorial experiment with 4 replications, if (a) the completely randomized design is used and (b) the randomized block design is used?
4. What are the sources of variation and their numbers of degrees of freedom for a 4 × 6 factorial experiment with 10 replications, if (a) the completely randomized design is used and (b) the randomized block design is used?
5. In a factorial experiment, the factor A consists of three schools and the factor B consists of two teaching methods, and an observation y is the test score of a student. (a) What is the meaning of the presence of interaction? (b) What is the meaning of the absence of interaction?
6. In a factorial experiment, three different teaching methods are used on the students of five different levels of I.Q. An observation is the test score of a student. (a) Suppose it is found that one method is best suited to students with a high I.Q. and that another method is best suited to those with a low I.Q. Would you say that there is an interaction or no interaction between the teaching methods and the levels of I.Q. of the students? (b) Suppose the three teaching methods have the same relative merits for students of all levels of I.Q. Would you say that there is interaction or no interaction?
7. What is the difference between the experimental error and the sampling error?
8. In a factorial experiment, one may test the hypothesis that the ab treatment means (μ) are equal. (a) Then why should one bother to test the three component hypotheses? (b) What are the three component hypotheses?
9. Both factorial and hierarchical experiments are cases of the analysis of variance with multiple classifications. What is the difference between them?
10. What hypotheses can be tested by the hierarchical experiment?
CHAPTER 19
ANALYSIS OF COVARIANCE

The distributions of the regression coefficient and the adjusted mean are presented in Sections 17.3 and 17.4, but the discussion is limited to a single sample. In this chapter, the discussion is extended to k samples. Therefore, the material presented here is essentially an extension of the subject of linear regression. The new technique used here, called the analysis of covariance, involves the partitioning of the sum of the products of x and y into various components. In many ways, this technique is similar to the analysis of variance, which involves the partitioning of the sum of squares. Incidentally, this chapter also integrates linear regression and the analysis of variance.
19.1 Test of Homogeneity of Regression Coefficients

Section 17.3 shows that the regression coefficient b varies from sample to sample even though all the samples are drawn from the same population. Therefore, a mere inspection of the values of the sample regression coefficients will not enable one to tell whether the k population regression coefficients, β, are equal. In this section a method is given to test the hypothesis that the k population regression coefficients are equal, under the assumption that the samples are drawn at random from the populations described in Section 16.2.

The physical meaning of the regression coefficient is extensively discussed in Chapters 16 and 17. The age (x) and the height (y) of children is one of the illustrations used. The regression coefficient in this example is the rate of growth of the children. If one wishes to know whether the children of different races grow at the same rate (β), one may obtain a random sample of children from each race and test the hypothesis that the growth rates are equal. The rejection of this hypothesis implies that the growth rate of children varies from one race to another. Geometrically, the hypothesis that all β's are equal means that the regression lines of the k populations are parallel.

The test of homogeneity of means (analysis of variance, Chapter 12) is accomplished by comparing the variation among the sample means (among-sample MS) with that among the observations (within-sample MS). The test of homogeneity of regression coefficients is based on the same principle, except that the comparison is between the variation among the sample regression coefficients and that among the observations. Since these two tests have so much in common, one test can be explained in terms of the other.
The mean ȳ and the regression coefficient b, different though they are, yet have many properties in common. This fact should not be surprising, for both ȳ and b are linear combinations of the n observations (y) of a sample (Equation 3, Section 15.1, and Equation 4, Section 17.3), and consequently (Theorem 15.2) both follow the normal distribution if the population is normal. Furthermore, the similarities do not end here. In fact, the analogy can be carried to the numerators and denominators of these two statistics. The mean ȳ is equal to T/n, and b is equal to SP/SS_x. The two numerators, T and SP, and the two denominators, n and SS_x, also play corresponding roles. For example, the variance of ȳ is equal to σ²/n (Theorem 5.3) and that of b is equal to σ²/SS_x (Theorem 17.3). The general mean ȳ̄ is the weighted mean of the k sample means ȳ, with the sample sizes n as the weights (Equation 4, Section 12.10), that is,

ȳ̄ = (n₁ȳ₁ + n₂ȳ₂ + ⋯ + n_kȳ_k) / (n₁ + n₂ + ⋯ + n_k) = (T₁ + T₂ + ⋯ + T_k) / Σn = G/Σn.    (1)
Then it is to be expected that the mean b̄ of the k regression coefficients b₁, b₂, ..., b_k should be the weighted mean of the b's, with the values of SS_x of the k samples as the weights, that is,

b̄ = (SS_x1·b₁ + SS_x2·b₂ + ⋯ + SS_xk·b_k) / (SS_x1 + SS_x2 + ⋯ + SS_xk) = (SP₁ + SP₂ + ⋯ + SP_k) / ΣSS_x = ΣSP/ΣSS_x,    (2)

where SS_x1 is the SS_x for the first sample, SS_x2 is the SS_x for the second sample, and so forth. The fact that b·SS_x is equal to SP is derived from the equation b = SP/SS_x (Equation 5, Section 16.9). The among-sample SS is (Equation 2, Section 12.10)

Σn(ȳ - ȳ̄)² = n₁(ȳ₁ - ȳ̄)² + n₂(ȳ₂ - ȳ̄)² + ⋯ + n_k(ȳ_k - ȳ̄)²    (3)
           = T₁²/n₁ + T₂²/n₂ + ⋯ + T_k²/n_k - G²/Σn.    (4)

Then the corresponding SS for the regression coefficients is

ΣSS_x(b - b̄)² = SS_x1(b₁ - b̄)² + SS_x2(b₂ - b̄)² + ⋯ + SS_xk(b_k - b̄)²    (5)
             = (SP₁)²/SS_x1 + (SP₂)²/SS_x2 + ⋯ + (SP_k)²/SS_xk - (ΣSP)²/ΣSS_x,    (6)

which is the sum of the k separate regression SS minus the regression SS
due to the pooled SP and pooled SS_x. Both SS-values given in Equations (3) and (5) have k - 1 degrees of freedom. The within-sample SS is the pooled SS_y of the k samples with Σ(n - 1) or Σn - k degrees of freedom. Then the corresponding SS for the regression coefficients is the pooled residual SS with Σ(n - 2) or Σn - 2k degrees of freedom. This analogy stems from the fact that s², or SS/(n - 1), is an estimate of σ², while in linear regression s², or residual SS/(n - 2) (Equation 1, Section 16.6), is an estimate of the same parameter. In testing the hypothesis that the k population means are equal, the statistic used is

F = [among-sample SS / (k - 1)] / [pooled SS / (Σn - k)]    (7)

with k - 1 and Σn - k degrees of freedom. Then, by analogy, the statistic used in testing the hypothesis that the k population regression coefficients are equal should be

F = [ΣSS_x(b - b̄)² / (k - 1)] / [pooled residual SS / (Σn - 2k)]    (8)

with k - 1 and Σn - 2k degrees of freedom. This F-value can be visualized with the help of a numerical example.
TABLE 19.1a

   Sample 1       Sample 2       Sample 3
    x    y         x    y         x    y
    4   10         1    5         4   10
    2    5         3    9         2    5
    6   15         2    5         4   10
    4    8         1    1         6   15
    4   12         3    5         2    8
                                  6   12

Table 19.1a shows three (k = 3) samples. For each sample, the values of SS_x, SP, SS_y, b, regression SS, and residual SS are computed by the procedure given in Section 16.9. The intermediate steps are given in Table 19.1b, and the results are given in the first three lines of Table 19.1c.
TABLE 19.1b

                                     Sample No.
 Item                          1          2          3         Total
 n                             5          5          6          16

 x:
 T_x                          20         10         24          54  = G_x
 T_x²                        400        100        576
 T_x²/n                       80         20         96         196  (II)
 Σx²                          88         24        112         224  (III)
 SS_x                          8          4         16          28  = ΣSS_x, pooled SS_x (III - II)
 G_x²/Σn = (54)²/16 = 182.25  (I);  general mean of x = 54/16 = 3.375

 y:
 T_y                          50         25         60         135  = G_y
 T_y²                      2,500        625      3,600
 T_y²/n                      500        125        600       1,225  (II)
 Σy²                         558        157        658       1,373  (III)
 SS_y                         58         32         58         148  = ΣSS_y, within-sample SS (III - II)
 G_y²/Σn = (135)²/16 = 1,139.0625  (I)

 xy:
 T_xT_y                    1,000        250      1,440
 T_xT_y/n                    200         50        240         490  (II)
 Σxy                         220         58        268         546  (III)
 SP                           20          8         28          56  = ΣSP, pooled SP (III - II)
 G_xG_y/Σn = 54(135)/16 = 455.625  (I)
The weighted mean b̄ of the three regression coefficients (Equation 2) is

b̄ = [8(2.50) + 4(2.00) + 16(1.75)] / (8 + 4 + 16) = (20 + 8 + 28)/28 = 56/28 = 2.00.

Note that the numerator 56 of b̄ is the pooled SP, while the denominator 28 is the pooled SS_x. The weighted sum of squares among the b's (Equation 5) is

ΣSS_x(b - b̄)² = 8(2.50 - 2.00)² + 4(2.00 - 2.00)² + 16(1.75 - 2.00)² = 2 + 0 + 1 = 3.
TABLE 19.1c

 Sample    Sample                                Regression         Residual
   No.     Size n    SS_x    SP    SS_y    b        SS     DF         SS    DF
    1         5        8     20     58    2.50      50      1          8     3
    2         5        4      8     32    2.00      16      1         16     3
    3         6       16     28     58    1.75      49      1          9     4
  Sum        16       28     56    148             115      3         33    10
  Pooled                                  2.00     112      1
  Difference                                         3      2
By the method of computation shown in Equation (6), the same SS is equal to the sum of the k separate regression SS minus the regression SS computed from the pooled SP and pooled SS_x, that is (Table 19.1c),

ΣSS_x(b - b̄)² = 50 + 16 + 49 - 112 = 115 - 112 = 3,

with 2 degrees of freedom. The two methods always give identical results, but the former shows the meaning of this SS, while the latter provides a short-cut method for computation. It can be readily seen from Equation (5) that this SS is equal to zero if all the b's are equal. The pooled residual SS is simply the sum of the values of the residual SS of the k samples, that is (Table 19.1c),

pooled residual SS = 8 + 16 + 9 = 33,

with Σ(n - 2) or (5 - 2) + (5 - 2) + (6 - 2) or 10 degrees of freedom. Then the statistic

F = (3/2) / (33/10) = 0.45    (9)

has 2 and 10 degrees of freedom. The statistic F is used to test the hypothesis that the k population regression coefficients, β, are equal. This is, of course, a one-tailed test. The hypothesis is rejected only because F is too large, never because it is too small. The F-value is equal to zero if all the b's are equal (Equation 8). A large F-value indicates that a great deal of variation exists among the b's and consequently leads to the conclusion that the β's are not all equal. The results of the computation given in Tables 19.1b and 19.1c may be summarized in an analysis of variance table as shown in Table 19.1d.
This table provides a slightly different point of view with respect to the test of homogeneity of regression coefficients. In the analysis of variance of y, x being ignored, the among-sample SS has k - 1 degrees of freedom and the within-sample SS has Σn - k degrees of freedom (Section 12.10). Now the within-sample SS itself is partitioned into three components (Table 19.1d). This partition can be traced to the analysis of variance in linear regression, which shows that the SS_y, with n - 1 degrees of freedom, of a sample may be partitioned into the regression SS with 1 degree of freedom and the residual SS with n - 2 degrees of freedom (Section 16.4). Now, with k samples available, the sum of these k regression SS-values, with k degrees of freedom, is partitioned into two components, namely, the regression SS due to b̄, with 1 degree of freedom, and the SS due to the variation among the b's, with k - 1 degrees of freedom (Table 19.1d). The pooled residual SS, with Σn - 2k degrees of freedom, as the name implies, is simply the sum of the residual SS-values of the k samples.

Table 19.1d shows two F-values. In both cases, the pooled residual mean square is used as the denominator. The purpose of the lower F-value has already been explained, and that of the upper one is almost self-evident. The regression SS due to b̄ is equal to (ΣSP)²/ΣSS_x, and b̄ is equal to ΣSP/ΣSS_x. Therefore, the regression SS is equal to b̄²ΣSS_x.

TABLE 19.1d

 Source of Variation     Sum of    Degrees of        Mean      F       Hypothesis
                         Squares   Freedom           Square
 Regression due to b̄       112     1                 112.0    33.94    β = 0
 Variation among b's         3     k - 1 = 2           1.5     0.45    β₁ = β₂ = ⋯ = β_k
 Pooled residual            33     Σn - 2k = 10        3.3
 Within sample             148     Σn - k = 13
Consequently, the F-value will be large if b̄ deviates greatly from zero. Thus a large F indicates that β is different from zero. In other words, the hypothesis being tested is that β = 0.
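The entire test of this section can be reproduced in a few lines of code. The sketch below (in Python; the function and variable names are our own) recomputes the F-value 0.45 from the raw observations of Table 19.1a.

```python
# Test of homogeneity of regression coefficients, Table 19.1a data.
samples = [
    [(4, 10), (2, 5), (6, 15), (4, 8), (4, 12)],
    [(1, 5), (3, 9), (2, 5), (1, 1), (3, 5)],
    [(4, 10), (2, 5), (4, 10), (6, 15), (2, 8), (6, 12)],
]

def components(pairs):
    n = len(pairs)
    sx = sum(x for x, _ in pairs); sy = sum(y for _, y in pairs)
    ss_x = sum(x * x for x, _ in pairs) - sx * sx / n
    ss_y = sum(y * y for _, y in pairs) - sy * sy / n
    sp   = sum(x * y for x, y in pairs) - sx * sy / n
    return ss_x, sp, ss_y

k = len(samples)
parts = [components(s) for s in samples]
pooled_ss_x = sum(p[0] for p in parts)                 # 28
pooled_sp   = sum(p[1] for p in parts)                 # 56
among_b_ss  = (sum(p[1] ** 2 / p[0] for p in parts)
               - pooled_sp ** 2 / pooled_ss_x)         # 115 - 112 = 3
pooled_resid = sum(p[2] - p[1] ** 2 / p[0] for p in parts)   # 33
df1, df2 = k - 1, sum(len(s) - 2 for s in samples)           # 2 and 10
F = (among_b_ss / df1) / (pooled_resid / df2)
print(F)    # 0.45
```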
19.2 Analogy Between Mean and Regression Coefficient

As pointed out in the preceding section, the mean ȳ and the regression coefficient b have many properties in common. For easy reference the corresponding quantities are listed side by side in Table 19.2. If one can identify the sample total T with SP and the sample size n with SS_x, the identification of the rest of the corresponding terms follows automatically.
TABLE 19.2

 Item                            Mean                       Regression Coefficient
 Statistic                       ȳ                          b
 Numerator of statistic          T                          SP
 Denominator of statistic        n                          SS_x
 Mean of statistic               μ                          β
 Variance of statistic           σ²/n                       σ²/SS_x
 Weighted mean                   ȳ̄                         b̄
 Weights for weighted mean       n                          SS_x
 Numerator of weighted mean      G = ΣT                     ΣSP
 Denominator of weighted mean    Σn                         ΣSS_x
 Hypothesis tested               μ₁ = μ₂ = ⋯ = μ_k          β₁ = β₂ = ⋯ = β_k
 Test used                       F                          F
 Numerator of F                  Σn(ȳ - ȳ̄)²/(k - 1)        ΣSS_x(b - b̄)²/(k - 1)
 Denominator of F                pooled SS/pooled d.f.      pooled residual SS/pooled d.f.

Purely by analogy, a theorem concerning the difference between two regression coefficients can be derived from Theorem 10.4a; that is, the statistic

t = [(b₁ - b₂) - (β₁ - β₂)] / √[s²(1/SS_x1 + 1/SS_x2)]    (1)
follows Student's t-distribution with n₁ + n₂ - 4 degrees of freedom, where s² is the pooled residual SS of the two samples divided by n₁ + n₂ - 4. The above equation is obtained from the equation of Theorem 10.4a by replacing every term of the latter equation with its corresponding term given in Table 19.2. The statistic t of Equation (1) may be used in testing the hypothesis that β₁ - β₂ is equal to zero or any other value. If the hypothesis is β₁ - β₂ = 0, or β₁ = β₂, either the F or the t test may be used. The conclusions thus reached are always the same, because t² = F (Theorem 12.6). The confidence interval of the difference between two regression coefficients can also be found by Equation (1). The 95% confidence interval of β₁ - β₂ is (Equation 1, Section 11.5)
b₁ - b₂ ± t.025 √[s²(1/SS_x1 + 1/SS_x2)].    (2)
If a confidence coefficient of 99% is desired, the 2.5% point is replaced by the 0.5% point of t with n₁ + n₂ - 4 degrees of freedom.

As a numerical example, samples 1 and 2 of Table 19.1a may be used, assuming that sample 3 does not exist. The quantities computed from these samples are shown in Table 19.1c. The pooled estimate s² of σ² is the pooled residual SS divided by n₁ + n₂ - 4, that is,

s² = (8 + 16) / (3 + 3) = 4

with 6 degrees of freedom. If the hypothesis to be tested is that β₁ - β₂ = 1, the statistic is

t = [(2.50 - 2.00) - 1] / √[4(1/8 + 1/4)] = -0.50/1.225 = -0.408

with 6 degrees of freedom. The 95% confidence interval of β₁ - β₂ is

(2.50 - 2.00) ± 2.447 √[4(1/8 + 1/4)],

or 0.50 ± 3.00; that is,

-2.50 < β₁ - β₂ < 3.50.

In other words, the difference β₁ - β₂ is somewhere between -2.50 and 3.50. In order to obtain a narrower interval, the values of SS_x of the two samples have to be increased, either by increasing the sample sizes or by placing the x-values more widely apart, or by doing both. This point has already been discussed in detail in Section 17.3.
19.3 Sampling Experiment on Regression Coefficients

A sampling experiment may be used to verify the fact that the statistic F for testing the homogeneity of regression coefficients really follows the F-distribution. This experiment requires very little computation if the results of previous sampling experiments are utilized. To compute F, the residual SS and b are needed. But the residual SS is already computed for each of the 1000 random samples in Section 16.5, and the regression coefficient b is computed for the same samples in Section 17.3. The values of the residual SS, Σ(y - ŷ)², and the regression coefficient b for four such samples are given in Table 4.2. The 1000 samples, each consisting of 5 observations, are drawn at random from the population with β equal to 20 and σ² equal to 100. To
compute F, the samples may be grouped into 500 pairs (i.e., k = 2). Samples 1 and 2 constitute a pair; samples 3 and 4 constitute another pair; and so forth. For each pair, the F-value can be obtained by the method described in the preceding section. The computing procedure is quite simple, because the value of SS_x is equal to 10 for all samples (Section 17.3). Then

b̄ = (SS_x1·b₁ + SS_x2·b₂) / (SS_x1 + SS_x2) = (10b₁ + 10b₂)/20 = (b₁ + b₂)/2    (1)

and

ΣSS_x(b - b̄)² = 10(b₁ - b̄)² + 10(b₂ - b̄)²
             = 10[(b₁ - (b₁ + b₂)/2)² + (b₂ - (b₁ + b₂)/2)²]
             = 5(b₁ - b₂)².

Therefore, for each pair of samples, the F-value is

F = [5(b₁ - b₂)²/(2 - 1)] / [pooled residual SS/6] = 30(b₁ - b₂)² / pooled residual SS.    (2)

For the first pair of samples (Table 4.2),

F = 30(17.0 - 19.0)² / (508.8 + 227.2) = 120/736 = 0.163.

For the second pair of samples,

F = 30(15.6 - 16.8)² / (203.2 + 216.8) = 43.2/420 = 0.103.

After the 500 F-values are computed, it can be observed that approximately 25, or 5%, of the 500 F-values will be greater than 5.99, the 5% point of the F-distribution with 1 and 6 degrees of freedom. (A sampling experiment, conducted cooperatively by about 50 students at Oregon State College in the Spring of 1954, showed that 31 of the 500 F-values were greater than 5.99.) This sampling experiment can also be used in verifying the fact that the statistic t given in Equation (1), Section 19.2, follows Student's t-distribution. Since all the samples are drawn from the same population, the two population regression coefficients are equal, that is, β₁ = β₂.
Then the statistic t for a pair of samples is

t = (b₁ - b₂) / √[(pooled residual SS/6)(1/10 + 1/10)] = (b₁ - b₂) / √(pooled residual SS/30).    (3)

It should be noted that t² is exactly the F of Equation (2). For the first pair of samples, t = -0.4038; for the second, t = -0.3207. After the 500 t-values are computed, one would expect that about 2.5% of the t-values are less than -2.447 and another 2.5% greater than 2.447. (A sampling experiment, conducted cooperatively by about 50 students at Oregon State College in the Spring of 1954, showed that 14 t-values were less than -2.447 and 17 t-values greater than 2.447, out of 500 pairs of samples.)
354
Ch.19
ANALYSIS OF COVARIANCE TABLE 19.4
Line No. 1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18
Quantity
n -x
1
2
.625 2.50 1.56250 10 8.43750
5 2 -1.375 2.00 -2.75000 5 7.75000
.390625 1.953]25 8 40 9.953125 4.018838
1.890625 9.453125 4 20 13.453125 1.486643
5
4
-x-x =
b b(x-%)
1.
y£
(x-i)a n(x-%)a 55" n55" 55" + n(x - Xla If
33.908946 286.106732
Wyj' wya y (IWyy>a
11.521483 89.291493
Total
3 6 4
16
.625 1.75 1.09375 10 8.90625
:
.390625 2.343750 16 96 18.343750
28 10.738871
5.233390
46.609880 415.119244
(IwYrY/IW 550fyy
92.040309 790.517469 8,471.418481 788.855596 I 1.661873
may also be used as an example. Suppose four (k = 4) different methods are used in teaching a certain subject. The four groups of students may have different I.Q.'s (x), and consequently their test scores (y) may be influenced by both the teaching methods and their I.Q.'s. However, one may adjust the mean scores so that I.Q. is equal to 100 (x' = 100) and compare the adjusted average scores (ȳ_x') of the different methods. In other words, the experimenter wishes to determine the relative merits of the methods for students with I.Q. equal to 100. Of course, the value of x' need not be 100. Any meaningful value of x may be chosen as x'. But the most frequently chosen value is the general mean of the I.Q.'s of all the students of the four groups under observation.

The method of testing the homogeneity of the adjusted means is similar to that of the test of homogeneity of the unadjusted means (analysis of variance, Chapter 12) and also to that of the homogeneity of regression coefficients (Section 19.1). Therefore, the method may be developed by analogy. In Theorem 17.4, the variance of ȳ_x' is given as

σ²_ȳx' = [1/n + (x' - x̄)²/SS_x] σ² = {[SS_x + n(x' - x̄)²] / (nSS_x)} σ².    (1)
Then the quantity W, where

W = nSS_x / [SS_x + n(x' - x̄)²],    (2)

plays the role of n for the mean ȳ and that of SS_x for the regression coefficient b (Section 19.2). Then the weighted mean ȳ̄_x' of the k adjusted means ȳ_x' is

ȳ̄_x' = ΣWȳ_x' / ΣW    (3)

and the weighted sum of squares due to the variation among the adjusted means is

ΣW(ȳ_x' - ȳ̄_x')² = ΣWȳ_x'² - (ΣWȳ_x')²/ΣW.    (4)

Note that the above equation is similar to Equations (3) and (5), Section 19.1, with only the weights changed.

A numerical example based on the data in Table 19.1a can be used to clarify the many awkward notations used in this section. The means and regression coefficients for these three samples are already computed. The general mean of all the x-values is chosen as x'; that is, x' = 3.375 (Table 19.1b). Then the adjusted mean of a sample is (Equation 2, Section 17.4)

ȳ_x' = ȳ - b(x̄ - x').    (5)

The three adjusted means are shown in Line 7, Table 19.4, with the details of the computation given in the first six lines of the table. The weights W of the three samples are computed by Equation (2) and given in Line 13 of Table 19.4, with the details of computation shown in Lines 8 to 12. The weighted sum of squares due to the variation among the adjusted means is equal to 1.66, with the details of the computation shown in Lines 14 to 18. The pooled residual SS is equal to 33, with 10 degrees of freedom (Table 19.1d). The statistic used to test the hypothesis that the k adjusted population means (μ_y.x') are equal is

F = (1.66/2) / (33/10) = 0.25    (6)
356
Ch. 19
ANALYSIS OF COVARIANCE
with 2 and 10 degrees of free~om, or in general, of freedom.
"-1 and l:n-2k degrees
,
;-"=,, I I
I I
"
Fig. 19.4
19.5 Sampll81 Expertme8t 08 Adjusted MellDll A sampling experiment may be used to verify the fact that the statistic F for testing homogeneity of adjusted means actually follows the Fdistribution. This sampling experiment requires very little computation if the results of Sections 16.5, 17.3, 17.4, and 19.3 are utilized. The 1000 random samples may be grouped into 500 pairs (i.e. " - 2). Samples 1 and 2 constitute a pair; samples 3 and 4 constitute another pair; and so forth. For each pair of samples, an F-value can be computed, where F-
r
r
,)1 + IV ki '2 ~I " " S2 ""
IV.("f ~1
(1)
The adjusted mean at % .. 5 is already ohtained lor each of the 1000 samples (Section 17.4). The weight If has the same value for all the samples, because, for each sample, n = 5, 55" - 10, i' .. 3, %' - 5; and consequently, Jr' = nS5" _ 5(10) .., ~. (2) 55 + n(%'_ill 10 + 5(5 _3)1 3
"
19.5
TEST or HOMOGENEITY
or ADJUSTED MEANS
357
Then the weighted average of the two adjusted means is
5
5
-31.'1 + -31.'2
=
Y,,'·
5 5 -+3 3
-
-
Y.'1 + Y.'2 2
(3)
and
5
"" - {f "1 6·
-
1 '2)2. "
Therefore, for each pair of samples, the F-value is
~(1" 6.1
r ' )2 .2
r
2- 1 S(y '1 - '2)2 (5) ." " pooled residual SS pooled residual SS 6 with 1 and 6 degrees of freedom. Because the pooled residual SS is already computed in Section 19.3, the computation of F is further simplified. As examples, the F-values of the two pairs of samples given in Table 4.2 are computed. For the first pair,
F-
F..
5(122.8 - 123.6}2 508.8 + 227.2
3.2 • - •. 004 736
For the second pair, F • 5(130.4 - 129.0)2 • 9.8 203.2 + 216.8 420
= .023._
After the 500 F-values are computed, it can be observed that approximately 25 or 5% of the 500 values will be greater than 5.99, the 5% point of the F-distribution with 1 and 6 degrees of freedom. (A sampling experiment conducted cooperatively by about 50 students at Oregon State College in the Spring of 1954, showed that 28 out of 500 F-values were greater than 5.99). This sampling experiment can also be uaed in verifying the fact that the statistic (6)
358
AN AL YSIS OF COVARIAN CE
Ch. 19
follows Student's t-distribution with In-2k degrees of freedom. Since all the samples are drawn from the same population, the two adjusted means of the populations are equal. Then the statistic t for a pair of samples is
It should be noted that t 2 is exactly the F of Equation (5). For the first pair of samples, t = -0.066; for the second pair, t = 0.153. After the 500 t-values are computed, one would expect about 2.5% of the t-values to be less than -2.447 and another 2.5% greater than 2.447, the 2.5% point of the t-distribution with 6 degrees of freedom. (A sampling experiment, conducted cooperatively by about 50 students at Oregon State College in the Spring of 1954, showed that 12 t-values were less than -2.447 and 16 t-values greater than 2.447, out of 500 pairs of samples.) The statistic t shown in Equation (6) may be used ir testing the hypothesisthat the difference between two adjusted means, Ily. ,,'1- #Lt. "'2' is equal to zero or any other value. If the hypothesis is that the difference is equal to zero, either the F or the t test may be used. The conclusions thus reached are always the same, because t 2 = F (Theorem 12.6). The confidence interval of the difference between two adjusted means can also be found by Equation (6). The 95% confidence interval of #L y • "'1 - #L y • ,,'2 is (Equation 1, Section 11.5)
where W is defined in Equation (2), Section 19.4. If the confidence coefficient of 99% is desired, the 2.5% point of t is replaced by the 0.5% point with n 1 + n 2 - 4 degrees of freedom. As a numerical example, the samples 1 and 2 of Table 19.1a may be used. The adjusted means are given in Line 7 and the weights are given in Line 13 of Table 19.4. The pooled estimate S2 of a 2 is the pooled ~esidual SS divided by n 1 + n 2 - 4, that is, S2
=
8 + 16
3+3
=4
'
(Table 19.1c) with 6 degrees of freedom. If the hypothesis to be tested
19.6
359
INDIVIDUAL DEGREE OF FREEDOM
is that the first adjusted mean exceeds the second one by 3, the statistic IS
(8.4375 -7.7500) - 3
t
-2.3125
= -;=;==:=======::=====r = 1 1 1.91988
-
-1. 20
----+----
1.486643
with 6 degrees of freedom. " '1 - " The 95% confidence interval of ry'" ry'" '2 is
8.4375 -7.7500 ± 2.447
1 '(
1
V~ 4.018838
+
1 ) 1.486643
or 0.6875 ± 4.6979, that is,
-4.01 < ry'" " '1
- r'Y'" " '2
< 5.39.
In other words, the difference between the adjusted means of the two populations is somewhere between -4.01 and 5.39, at the array where oX = 3.375. In order to obtain a narrower interval, the weights must be increased and consequently the sample size n and the value of SS" must be increased (Equation 2, Section 19,4.). The value of SS" may he increased either by increasing n, or by placing the x-values more widely apart, or by doing both.
19.6 ladividaal Degree of Freedom The tests of (a) homogeneity of means (analysis of variance), (b) homogeneity of regression coefficients and (c) homogeneity of adjusted means are used to test the general hypothesis that k parameters are equal, whether the parameter in question is Il' {3, or Il y • ,,' For a more specific hypothesis concerning the parameters, the individual degree of freedom may be used in each of these three cases. The method of obtaining Q2, the sum of squares due to an individual degree of freedom, is given in Section 15.3. However, the method described is applicable only to the means with equal sample size. In this section, a more general method is introduced. It is equally applicable to means, regression coefficients, and adjusted means. To make the discussion general, the method is described in terms of the estimates obtained from the k samples, whether an estimate is Yo b, or y . " are designated by E l' E 2' ••• , and E k The estimates of the k samples and their corresponding wei ghts by If'" If' 2' ••• , and Wk respectively. If E is y-, If' is n; if E is b, lfi is SS (Section 19.2); if E is Y " the weight " " is nSS
w= SS
"
" xP
+ n(x' -
360
ANALYSIS OF COVARIANCE
Ch.19
(Equation 2, Section 19.4). For this general case, the rules for obtaining an orthogonal set of multipliers, M, as described in Section 15.3, need only slight modification. The modified rules are: (a) The weighted sum of the multipliers M for each individual degree of freedom is equal to zero, with If's as the weights. (b) The 8um of the weighteJ products of the corresponding multipliers M of any two individual degrees of freedom is equal to zero, with the If's 88 the weight8.
These two rules can be more precisely stated in symbolic forms. For the example given in Table 19.6a, rule (a) states that (i) If 1M u + W/rI12 + If aM II + If )114 .. 0 (i i) If aM II + If tV u + If.}lII + If)f 24 - 0 (1) (Hi) If 1M.. + WJ.f u + If.Mn + VI )1,4" 0; and rule (b) states that (i) VI PII-f aa + VlI,fI.JJU + 1f,M.aMu + If )/ 1 )124" 0 (ii) VI,M liM II + 1' -;'1,:)1 u + WaM ,aM sa + If )t ,)I,I. .. 0 (2) (iii) VI,MU M,. + VI/d2Jr1I2 + IfsM2,Mn + VI)t2)1,. = O. Note that these are exactly the same rules stated in Section 15.3, if the weights, IV's, are equal. TABLE 19.6a Sample Item Estimate Weight Multipliers for Multipliers for Multipliers for
2
3
4
E,
E2
E, W.
WI.
1
"'I
1l'2
Of
Mil
QI QI
Mil Mil
Mn Mu M,a
E4
Mil
MIl.
Mil
Mal.
Mu
M,l.
The sum of squares due to the first individual degree of freedom is
Q!
(I.MWE)2 (M lllf' ,E I + M12"' aE 2 + ••• + Mu lf "E "P = IIf'M2 = If'IM!1 + If aM!2 + ... + If' "M~"
If the estimate E is the sample mean
r and the
.
(3)
weight If is the sample
size n, Equation (3) reduces to
(4) because WE .. nf = T, the sample total. When the n's are equal, Equation (4) becomes Equation (11). Section 15.3.
19.6
361
INDIVIDUAL DECREE OF FREEDOM
In the case of regression coefficients, the estimate E is the sample regression coefficient 6 and the weight , is the 55.. Then Equation (3) becomes I. (M u 5P 1 + M,~P 1 + ••• + Mu5P ,,)1
Ql SS ,,;lM!, + 55 dAl~2 + .•. + 55 dM~,,'
(5)
because WE - (55 )6 - 5P • • For the above example, where " is only equal to 4, the task of determining the multipliers seems to be unsurmountable. For a larger number of samples, it seems almost impossible to find the sets of multipliers to satisfy so many restrictions. However, an experienced person, guided by the physical meaning of the " treatments, usually can do 80 without difficulty. As examples, an orthogonal set of multipliers for 3 samples is given in Table 19.6b and another set for 4 samples is given
"-1
TABLE 19.6b Sample No.
Item
t-----
Estimate Weight 1 \'s. 2 (I & 2) vs. 3
1
2
3
E, Wl W2 W,
E2
E, W,
If'2
-,'l
W.
0 -(Wl
+ W.)
TABLE 19.6c Sample No.
Item I
2
3
4
Estimate Weight
E,
IV,
E. W.
E, W,
E. W.
1 vs. 2
W.
-Wl
0
0
0
0
1', + W.
W,+ W.
W.
-WI
-
3 va. 4 (1 & 2) vs. (3 & 4)
-(Wl + Wa)
-(Wl + IV.>
in Table 19.6c. The multipliers M's in these tables are expressed in terms of the weights, W's. It can be easily verified that both sets of multipliers satisfy Equations (1) and (2). For further clarification, the multipliers given in Table 19.6b are used on the numerical example of Table 19.1a. The multipliers for the means is given in Table 19.6d. Then the sums of squares for the two individual degrees of freedom
362
Ch.19
ANALYSIS OF COVARIANCE
(Equation 4) are:
Q2 = [5(50) - 5(25)12 = 62.5 5(5)2 + 5(-5)2
1
~
=
[6(50) + 6(25) - 10(60)]1 5(6)1 + 5(6)1 + 6(-10)1
-
23.4375.
The among sample 55 is 1.225 - 1,139.0625 or 85.9375 ("total" column and middle section of Table 19.1b) which is exactly the sum of Q! and
Q:' TABLE 19.6d Sample No.
Item
r
Estimate W(,ight n WE-nf- T y 1
VII.
1
2
3
10 5 50
5 5 25
10 6 60
5 6
-5 6
0 -10
2
(1 & 2)
VII.
3
TABLE 19.6e Sample No.
Item Estimate b Weight 55" WE - 55"b .. 5P 1
VII.
(1 &
2 2)
VII.
3
1
2
3
2.50 8 20
2.00 4 8
1.75 16 28
"
-8 16
0 -12
16
The sum of squares due to the variation among the regression coefficients can be partitioned in a similar manner. The multipliers for the regression coefficients are given in Table 19.6e and the values of band 55" are obtained from Table 19.1c. Then the sums of squares for the two individual degrees of freedom (Equation 5) are:
QI _ [4(20) - 8(8)]1 _ 0.6667 1
8(4)2 + 4(-8)2
Q22'" [16(20) + 16(8) - 12(28)]2 = 2.3333. 8(16)2 + 4(16)2 + 16(-12)2
19.7
ADJUSTED MEANS WITH EQUAL REGRESSION COEFFICIENTS
363
The sum of these two quantities is equal to 3.0000 which is the sum of squares due to the variation among the regression coefficients (Table 19.1d). The sum of squares due to the variation among the adjusted means can be partitioned in the same way. The adjusted means and their weights are given in Lines 7 and 13 of Table 19.4. The multipliers are given in Table 19.6f. Then the sums of squares for the two individual degrees of freedom (Equation 3) are:
Q! co 2
[1.486643(33.9089) - 4.018838(11.5215»)2 4.018838(1.486643)2 + 1.486643(-4.018838)2
= 0.5129
[5.233390(33.9089) + 5.233390(11.5215) - 5.505481{46.6099)]2
Q2 = 4.018838(5.233390)2 + 1.486643(5.233390)2 + 5.233390(-5.505481)2
= 1.1490. The sum of these two quantities is equal to 1.6619 which is the sum of squares due to the variation among the adjusted means (Line 18, Table 19.4). TABLE 19.61 Semple No.
Item
1
2
3
Estimate'.'
8.43750 4.018838 33.9089
7.75000 1.486643 11.5215
8.90625 5.233390 46.6099
1.486643 5.233390
-4.018838 5.233390
0 -5.505481
Weight W
WE
=:
wr.'
1 va. 2 (1 8r 2) vs. 3
The plD'poses and applications of the individual degrees of freedom have been extensively discussed in Section 15.3 and therefore are not repeated here. This section presents only a more general, and consequently more complicated, method of computation, but no new principles are involved.
19.7 Test or Adjusted
MeaDS
with EqII" Regresatoa Coerrtcieats
The method of testing the homogeneity of the adjusted means as described in Section 19.4 is a general one but not very commonly used. The more popular version of the test is for the special case where the k population regression coefficients are the same and the means are adjusted to % .. i, the mean of all the %-values of the k samples. The computing WOl"k for this special case is not nearly so complicated as for the general one.
364
ANALymSOFCOVARUNCE
Ch. 19
For this special case, the adjusted mean of a sample becomes
1; - r - b
(1)
that is, the regression coefficient b of a sample (Equation 5, Section 19.4) i8 replaced by the weighted average regression coefficient;; of the " 8amples, because the " population regression coefficients are equal. For the three samples of Table 19.1a, the adjusted means are: Sample 1: 1, - 10 - 2.00(4 - 3.375) - 8.75 Sample 2: 1; - 5 - 2.00(2 - 3.375) - 7.75 Sample 3: 11 .. 10 - 2.00(4 - 3.375) - 8.75 The 8um of 8quares due to the variation among the adjusted means for this special ca8e i8 much easier to obtain than for the general cue. The computing procedure i8 8ummarized in Table 19.7. The various component8 of SSy are th08e of the analY8is of variance of y (Section 12.10). Similarly, the component8 of SSs are those of the analysi8 of variance of ~. The components of SP are computed in the 8ame way, with the products replacing the square8, that i8, (;I is replaced by C sC y' 'fI by T STy and I"a by ~. The details of the computation are aliown in Table 19. lb. The among-sample component is (In - (I) and the within-sample component is (Ill) - (II). Comparing Tables 19.7 and 19.1c, one will notice that the wi thin-sample SSIe' SP, and SS y are equal to the pooled SS s' SP, and SS respectively, a8 they should be. The residual SS for the "within-samp'e" and "total" are obtained, as usual, by subtracting the regression SS, (SP)I ISS S' from SSy, but the residual SS for the "amOllg-eample" component is obtained by subtracting the withinsample residual SS from the total residual SSe The F-value with 2 and 12 degrees of freedom is used to test the hypothesi8 that the adjusted means (p. .) of the Ie populations are equal, under the special condition that all p~psulatiOD regression coefficients are equal and % - So TABLE 19.7 Residual Source of VanatioD
Among Sample Within Sample Total
55s
5P
13.75 34.375
55y
DF
55
DF
85.9375
2 13 15
2.3054 36.0000 38.3054
2 12 14
28.00 56.000 148.0000 41.75 90.375 233.9375
M5
F
1.1527 0.38 3.0000
It should be noted that this F has 2 and 12 instead of 2 and 10 degrees of freedom as given in Equation (6), Section 19.4. This discrepancy stems &om the fact that the SS due to the variation Dong the b's with
19.7
ADJUSTED MEANS WITH EQUAL REGRESSION COEFnCIENTS
365
2 degrees of freedom is combined with the pooled residual SS with 10 degrees of freedom (Table 19.1d). The justification of such a procedure is that the population regression coefficients are the same. The test of homogeneity of regression coefficients may be oed as a preliminary step towards the test of bomogeneity of adjoted means. If the preliminary test indicates that all {1's are equal, the method described here should be used; otherwise, that of Section 19.4 should be uaed. If the teat indicates that P .. O. in addition to the It {J'. being equal, neither method is needed. Then the ~values should be ignored and the straightforward analysis of variance should be oed on the observation (Y). The variance of the difference between two adjusted means has two different expressions depending on whether the fl's are equal. The variance of an adjusted mean,. is given in Theorem 17.4 as
u;.,- ulL;r1 + (s'-;)I1 SS. J.
(2)
Then, by Theorem 15.2, it follows that the variaoce of the difference between the adjusted means of two independent samples is
[1
S,)J_u b-1+ -IJ
1
a 2 t.;, - SO)' +-+ (s' uoy, -V, -u -+ • 1 • 2 ", SSsl "a SS.2
a
If,'a
. (3)
where , is defined in Equation (2), Section 19.4. This version of the variance is ueed in Equation (6), Section 19.5. However, when the fl's are equal, this variance will acquire a different expression, because both adjusted means involves the same regression coefficient h. Since in this case the adjoted means becomes , .'1 -
r1 - b
, .'2 -
12 - b
and
the difference between them is
'.'1 -1.'2 -'1
-1 2 -
'b(i l
-x 2)·
Then, by Theorem 15.2, the variance of this difference is
u'-
-
(is---ia)&J -uI ~l - +1- + -
:r .' 1 - :r.' 2
",
",
ISS.
(5)
where ISS. is the pooled SS. of the two samples. Therefore, wb en the
366
Ch. 19
ANALYSIS OF COVARIANCE
{J's are equal, the statisti c
(y , t-
(6)
~~~~=-~~~~~~
follows Student's t distribution with II, + "I - 3 degrees of freedom. Note that the number of degrees of freedom of this t is one more than that of Equation (6), Section 19.5. This difference stems from the way in which the Sl is obtained. When the two {J's are different, the Sl is the pooled residual SS divided by [(II, - 2) + (III - 2)]. On the other hand, when the two {J's are equal, the sI is obtained from the pooled residual SS plaa the SS due to the variation between the two b's. Therefore the uumber of degrees of freedom is (II, - 2) + (lla - 2) + 1 or II, + 3. In other words, the extra degree of freedom comes from the variation between the two b's, while the two {J's are equal. When there are only two samples, either the t or the F test may be used. The square of the t of Equation (6) is exactly the F obtained by the method shown in Table 19.7. The numerical verification of this statement is given as an exercise at the end of this chapter (Exercise 7).
"I -
19.8 Teat of AdJ.a&ed Mea. for R..dolDlzed Block Expert...t The test of homogeneity of adjusted means can be used for any kind of experimental design under the assumption that the " {J's are equal. Table 19.8a shows a set of data from a randomized block experiment with 4 treatmeuts and 5 replications. lu terms of a field experiment, the treatments may be different kinds of fertilizers, and the replications the different blocks of land in a field. The observation 1 may be the yield of a crop grown in a plot and x the number of plants in that plot. Since the yield (,) may be affected by the number (x) of plants, the average yield (1) of a treatment may need adjustment. The adjusted mean (y;) is then the average yield of a treatment after it is adjusted to the average number of plants. TABLE 19.8a
~ "'" Replicatioll
1 2 3
1
" 4
2
2 '!
10 5
"
1 3 2 1
3 '!
5
"
9
2 6
6 2 20
4
8
4
6
5
4
15 12
3.
5 1 5
Total
20
50
10
25
4
Total
4
'!
"
'!
"
3 1
5
4
6
2
9
8
1
3
10 12 14 15 10
45
11
30
61
5 12 10 10
7
'!
25 33 29
35 28
150
19.8
367
ADJUSTED MEANS FOR RANDOMIZED BLOCK EXPERIMENT
The test of homogeneity of adjusted means is often used on feeding experiments on animals. In this case, the treatments are the different feedst the replications may be different litters; % is the initial weight of an animal; aDd 'Y is the final weight of that animal. The final weight (r) of an animal may be affected by its initial weight (%). Therefore, the treatment mean y, the average final weight of the animals receiving the same feed, may need adjustment so that the treatment means will be on a comparable basis. TABLE 19.8b PrelimiDary Calculatiolls (2)
(1)
(3)
(4)
Total No. of of Items Squares or Squared or Products Multiplied
(5)
No. of Total of Squares Observeor Products tiODS per per Observation (2) ;. (4) Item
Variable
Type of Total
3,721 765 1,021 239
1 5 4 20
20
''''
Grand ReplicatioD Treatment ObservatioDs
4 5 I
186.05 191.25 204.20 239.00
"'Y
Grand ReplicatioD Treatment ObservatioDs
9, ISO 1,857 2,480 543
1 5 4 20
20 4 5 I
457.50 464.25 496.00 543.00
22,SOO 4,564 6,050 1,348
I 5 4
20
'Y'Y
Grand ReplicatioD TreatmeDt Observatiolls
1,125.00 1,141.00 1,210.00 1,348.00
" 5 1
20
Analysis of CovariaDce AdjuBted (Residual) Source of V.iatioD
SS"
SP
SS"
DF
SS
DF
MS
F
Total ReplicatioD TreatmeDt Error Treatment + Error
52.95 5.20 18.15 29.60 47.75
85.SO 6.75 38.50 40.25 78.75
223.00 16.00 85.00 122.00 207.00
19 4 3 12 15
9.85 67.27 77.12
3 11 14
3.28 6.12
0.54
The details of the computation are shown in Table 19.8b. The components of SS" are the results of the analysis of variance performed on the ~values (Section 14.4) and those of SS" are results obtained from the analysis of variance performed on the 'Y-values. The components of SP are computed by the same procedure, except that all squares are replaced
368
ANALYSIS OF COVARIANCE
Ch.19
,P
ca is re placed by G y; ~ra is replaced by replaced by~. So far the computing procedure is the same as that described in Section 14.4. However, the lat line of Table 19.8b is new. The SS , SP, and SS for this line are obtained by adding the corresponding ter:'s of the tre~tment and error components. The adjusted SS for error and "treatment + error" are, 811 usual, obtained by subtracting spa ISS" &om SSy; and the adjusted SS for treatment is obtained by subtracting the adjusted error SS &om the adjusted "treatment + error" SSe Then the statistic F with 3 and 11 degrees of freedom is used to test the hypothesis that the adjusted means of the" populations are equal. In general, the nmnber of degrees of &eedom for F is " -1 and (II - 1)(k - 1) - 1 degrees of freedom, where II is the number of replications and c the number of treatments. It should be noted that this computing method is much simpler than the one given in the lat section. The advantage is derived &om the equal sample size. The test of bomogeneity of adjusted means is often 11Ied on the "before and alter" type experiment. For exampe, in a feeding experiment, ~ may be the initial weight and y the final weight of an animal. Similar examples can be found in almost any field. In comparing the relative merits of several methods of teaching spelling, ~ may be a child's score before, and y bis score after the training period. In experiments of this type. a question often arises 811 to what sbould be the observation y. Should it be the rlDal weight or the gain in weight? Should it be the "after" score alone or the "after" score minus the "before" score? These two interpretations of y seem to be equally logical; tberefore, it is bard for one to decide. Fortunately, tbis problem turns out to be no problem at all. Both cboices bappen to lead always to the same conclusion. Therefore one cboice is as good as tbe other. As evidence, the data of Table 19.8a are revised and given in Table 19.8c. The s-values are the same for both seta of data, but the revalues of Table 19.8c are the values of (y -~) of Table 19.8a. The details of the computation on the revised data are sbown in Table 19.8d. It can be observed that the adjusted S5-values in both cases (Tables 19.8b and 19.8d) are the same. by products. For example,
~T"T y; and
Iy is
TABLE 19.8c
~ Replication 1
2
3
"5 Total
1
"
2 y
"2 63 "9 6" 8 " 20 30
3
"1
y
3 2 1 3
6 3 0 2
10
15
"
"
2 6
Total
4 y
3 6 6
"
3 1
y
6 2
6
2 1
2 6 2 7 2
20
25
11
19
" " "
"
y
10 12 14 15 10
15 21 15 20 18
61
89
19.8
ADJUSTED MEANS FOR RANDOMIZED BLOCK EXPERIMENT
369
TABLE 19.8d
Adjuated Source of Vmation
SS.
SP
SSy
DF
Total Replication Treatment Error Treatment + Error
52.95 5.20 18.15 29.60 47.75
32.55 1.SS 20.35 10.65 31.00
104.95 7.70 25.15 71.10
19 4 3 12 15
97.25
SS
DF
MS
F
9.85 67.27
3 11 14
3.28 6.12
0.54
77.12
There lore, the concl1l8ions reached by these two dillerent delinitions ol
r are always the same.
The adjusted mean ol a treatment is,
88
usual, (1)
where
-6
...
Error 51' 40.25 ----1360 Error SS. 29.60 •
(Table 19.8b). The general mean of justed treatment means are:
%
(2)
is % 61/20 - 3.05. Then the ad0:::
Treatment 1: 10 + 1.360(4..0 - 3.05) - 11.292
Treatment 2: 5 + 1.360<2.0 - 3.05) - 3.572 Treatment 3: 9 + 1.360(4.0 - 3.05) - 10.292 Treatment 4: 6 + 1.360(2.2 - 3.05) - 4.844. The estimated variance of the dillerence between two adj1l8ted means is I
S
YII - YU
as
2
- -)1] [ (Error 2 n
%. -
%a
-+=-_~~
SS.
(3)
with (n - 1)(k - 1) -1 degrees of &eedom. The above equation is obtained &om Equation (5), Section 19.7. The variance S', adj1l8ted error mean square, is an estimate of u ' ; the expression 2/" is obtained &010 1/". + 1/"1 when ". - ", - "; the error SS. is equivalent to ISS., the pooled S5.. For the numerical example under con8ideration, the variance is 6.12
- 2>'J - 3.275. ~5-2 + (429.60
Then the standard error of the dillerence between the first and the second adjusted treatment means is "';3.275 - 1.810. It should be noted that the standard error varies &om one pair of treatments to aaother, because
370
ANALYSIS OF COVARIANCE
Ch. 19
the means i of % varies from treatment to treatment. Therefore, there is DO LSD for the adjusted means. Before the test of ~omogeneitl. of the adjusted means, one should test the hypothesis that {3 = o. If f3 - 0, under the assumption that all the {3' s are equal, the adjustments will not be needed. Then the x-values may be ignored and the analysis of variance may be used on the observations (y). To test this hypothesis, one may use the statistic
F = Regression MS Residual MS (Equation 1, Section 16.7), with both mean squares obtained from the error componenL For the example given in Table 19.8b, 54.73 F ... (40.25)2/29.60 =--=8.94 6.12 6.12 with 1 and 11 degrees of freedom. The F-value indicates that T3 is greater than zero, and therefore the adjustments on the treatment means are necessary. All the discussion in this section is applicable to a completely randomized experiment with only slight modifications. The computing method shown in Table 19.8b remains the same, except that the lines labeled "replication" are eliminated (Section 12.3). The procedure shown in Table 19.1b is complicated because the sample sizes are not equal. When sample sizes are equal, however, both computing methods lead to the same result. Equations (1), (2), (3), and (4) are applicable to both completely randomized and randomized block experiments. For the completely randomized experiment, the error SP, SS. and SSy are the pooled SP, SS. and SSy respectively. Of course, the numbers of degrees of freedom for the error are different. They are k(II - 1) for one case and (k - 1)(11 - 1) for the other.
19.9 Relation Between Aaalysis of Cov.nance and Factorial Experiment The analysis of covariance and the factorial experiment are seemingly different topics, yet in fact they are very closely related. The relationship can be illustrated by the set of population means given in Table 19.9a. From the point of view of the analysis of covariance, the table shows three populations, each with 5 arrays. The 15 tabulated values are the means (J.L '1 ••) of the arrays. On the other hand, from the point of view of the factorial experiment, the 3 populations are the 3 levels of factor A and the 5 x-values are the 5 levels of factor B. Then the tabulated values are 15 population means. This difference in point of view, however, is superficial. It stems from the fact that an array may be re-
19.9
ANAL YSIS OF COVARIANCE AND FACTORIAL EXPERIMENT
371
garded either 88 a population or as a sub-population (Section 16.1). Fundamentally, the hypotheses being tested by these two methods are the same, if the regression is linear. The hypothesis that k {3' s are equal is the same as that the interaction (Section 18.3) is absent, provided that the regression of y on x is linear. TABLE 19.911 (B)
PopaladoD (Factor A)
"
1
2
3
2
3 4
50 50 50
50 100
5 6
50 50
50 60 70 80 90
150 200 250
TABLE 19.9b (B)
PopalatioD (Factor A)
"
1
2
50 60 70 80 90
3 4 5 6
2
3
50 60 70
50 60 70
80
80
90
90
For the &ret population of Table 19.9a, the means of the arrays remain the same, regardless of the value of Xi therefore, the regression coefficient {3 is equal to zero. F or the second population, {3 is equal to 10, because the mean of the array increases at the rate of 10 units per unit of x. Similarly, it can be observed that, for the third population, {3 is equal to so. Therefore, the {3's are not equal. But in terms of the factorial experiment this situation means the existence of interaction between the factors A and B (Section 18.3). Now Table 19.9b shows three populations with the same regession coefficient 10. It also shows that the interaction AS is abeent. Therefore, the two hypotheses that the {3's are equal and that the illteraction AB is absent are really the same, provided that the regression is linear. The hypothesis that the average repssion coefficient is equal to zero is the same as that ~ is equal to zero (Equation 3, Section 18.5). Table 19.9c shows a set of population means which satisfied both hypotheses. The three regression coefficients are 5, 15, and - 20 and their average is zero. At the same time, the means of all the 5 levels of factor B are equal to 40. This, of course, is not a coincidence. After all, the regression coefficient 8 is the rate of change of the mean #ly •• of the
P
372
Ch. 19
ANAL YSIS OF COVARIANCE
array with respect to %. The average rate ~ being equal to zero implies that the rates of increases and decreases nullify each other. Therefore, as % increases, the means of the 5 levels of factor B remain the same. TABLE 19.9c (B)
Population (Factor A)
"
1
2 3 4 5 6
10 15
Mean
20
20
25 30
2
3
m
90
35 50 65
70
Mean
80
30 10
40 40 40 40 40
50
50
40
50
The hypothesis that the adjaated means are equal is the same as that (7~ is equal to zero (Equation 2, Section 18.5). From Table 19.9c, it can be observed that the %'s of all three populations and also i are equal to 4. In this array, the means of the populations are 20, SO, and 50 respectively; so are the means (,,) of the three populations. Therefore, the two hypotheses are the same, if the regression is linear. The SUIDS of squares due to the variation among the adjusted means and that among the unadjusted means are the same, if the " samples have the same set of x-values. The adjusted mean is
Y; =Y- b(i - i).
(1)
Since i is the same for all samples, i - i, or i - i - O. Then the adjusted and the unadjusted means become equal. At the same time, the weights for the two kinds of means also become equal. The weight If for the adjusted mean is (Equation 2, Section 19.4)
If _
1&5Ss . SSs + n(i _i)2
i-x
Since = 0, W becomes n, which is the weight of the unadjusted mean (Table 19.2). Therefore, the sum of squares due to the variation among the adjusted means is the ordinary among-sample SS (Equation 3, Section 12.3). Table 19.9d shows a set of data for a completely randomized 3 x 5 factorial experiment with 2 replications. The 30 observations are originally drawn at random from the tag population which is a normal population with mean equal to 50 and variance equal to 100. They are subsequently altered by adding appropriate numbers to them so that the 15 population means agree with those of Table 19.9a. The same set of data is first treated as that of a factorial experiment and then as that of the analysis of covariance. The results are shown in Tables 19.ge and 19.9f
19.9
373
ANALYSIS OF COVARIANCE AND FACTORIAL EXPEIlIMENT TABLE 19.9d Samples ('actor A)
(Factor B)
Total
"
1
2
3
2 2
49
86
"
.&5
58 61
Sam
96
131
48 41
3
3 Sam
128
89
Sam 5 5
83 80 119
Sam
74 76
Sam Total
426
330
612
413
662
203 210 150
99 82 102
50 46
6 6
2D9 170 160
163
42 57
346
99 110
67 61
78 41
4 4
119
235
264
96
184
499
779
499
756
1,570
2,825
TABLE 19.ge So1ll'ce of VariatiOD
Sam of Squares
A
62,522.87 20,782.67 27,222.13 2,456.50 112,984.17
B AB
Error Total
Degrees of Freedom
Mean Square
2 4 8 15
31,261.43 5,195.67 3,402.77 163.77
F 190.89 31.73 20.78
29
respectively. Comparing these two tables, one will notice that the top line of one is identical to that of the other except the F-values. This is due to the fact that the .values are the same for all the samples, and consequently the adjusted and anadjusted means are equal. The bottom lines of the two tables are alao identical. This, of course, is to be expected, because the two analyses are made on the same set of data. The middle three lines are not the same for the two tables, but they are correspoDding qullDtities, in the sense that they are 1UIed to test the same hypotheses. The middle three components of Table 19.9f are the same ODes shoWD in Table 19.1d and may be computed by the same method. Therefore, the detaila of the computation are omitted. Actually these components can be readily computed by the method of the individual degree of &eedom
374
Ch.19
ANALYSIS OF COVARIANCE TABLE 19.91
Source of Variad_
Sam of S.........
ne. .a of Freedom
S. .pl.. (Factor A) Re. . .aloD dae to 7i VarlatloD . .OD8 II. Pooled realda" Total
62,522.87 20,240.07 27,048.93 3,172.30 112,.4.17
2 1 2 24 20
Me.
Squ....
F
31,251.43 20,240.07 13,524.47 132.18
236.51 153.13 102.32
(EquatioD 6, SeCtiOD 17.8), becaase the .....ple sizes are the same. The values of the regeaaion SS are as followa:
5.00 (3) (SP)I [-2(131) -1(128) +0(163) + 1(150) + 2(184)]1 (128)1 Sample 2: - - - --SS. 2[( -2)1 +{ -1)1 +(0)1 +(1)1 +(2)1) 20 819.20
(4)
Sam Ie 3: (SP)I • [-2(119)-1(209)+0(330)+1(413)+2<499»)1 _ (964)1 _ p SS. 2[( -2)1 +{ -1)1 +(0)1 +(1)1 +(2)1) 20 46,464.80 (5) P
1 d (ISP)I (10 + 131 +964)1 (1,102)' e : ~SS - 20,240.07
00
•
20+20+20
(6)
60
The reweaaioD 55 of EquatioD (6) ia the SS due to i (aee Table 19.91). The sum of the resressioD 55 of EquatioDs (3), (4), 8Dd (5) miDas that of EquatioD (6) ia the 55 due to the variatioD amoDS 6'a (Table 19.1c), that is,
5.00 + 819.20 + 46,464.80 - 20,240.07 - 27,048.93.
(1)
Thia reaaltiDS qaaatity is the aame as that shown iD Table 19.91. Thia method of computatioD ia Dot only time-aaviDs, but it alao iDdicatea that the 55 due to variatioD amODS the 6' s is a .,..t of the iDteractioD 55, becauae both 55-values are comp.ted from the same set of totals whether they are called aUHample totals or treatmeDt totals. The regeaeioD SS due to i ia aD individual degree of freedom of the 55 due to the factor B. This CaD be aeeD from the fact that the qaaDtity of
19.9
ANALYSIS OF COVARIANCE AND FACTORIAL EXPERIMENT
375
Equation (6) may be computed directly from the totals of the 5 levels of the factor B (s) 88 follows:
This is an additional evidence that ~ being equal to zero is the same aa being equal to zero, if the regression of y on s (factor B) is linear. The 4 degrees of freedom due to factor B really consist of 1 degree of freedom due to liDear regression and 3 due to the deviation &om linearity (Section 17.7). Therefore, Tables 19.ge and 19.91 can be combined into one table (Table 19.9g), regarding the analysis of covariance aa a method of further partitioDing the SS of the factorial experiment. The factor B SS and interaction SS are each broken down into two parta. One part is already computed and the other part, which is the SS due to the deviation from linearity, may be obtained by subtraction. As stated before, the observations are drawn from the population with means equal to those given in Table 19.9 which shows that the regression is linear, but (7~, (7, and (7~B are all greater than zero. Now, if the 5% significance level is used, all the conclusions reached from Table 19.9g are correct, whether one regards the data aa those of a factorial (7~
exporimoDt or as thoso of the aualysis of covariauce. But at times the conclusions reached by these two methods may not be the same. The factor B SS is not distributed evenly over the 4 degrees of freedom but concentrates on the single degree of freedom due to linear regression. By isolating this one useful degree of freedom, one boosts the F-value from 31.73 to 123.59. Similarly the F-value for testing the interaction is boosted from 20.78 to 82.58. As a result, the probability of committing a Type n error i. reduced. Therefore, in choosing between the two methods, the analysis of covariance is preferred, if the regression is linear. It extracts the useful part and discards the waate product of the analysis of the factorial experiment. However, the analysis of covariance cannot always be used. If the regression is not linear, the analysis of covariance may throwaway the useful information. Therefore, the indiscriminate application of the analysis of covariance may produce misleading conclusions. Thus to play safe one should, whenever possible, test the linearity of regression of y on s before he uses the analysis of covariance. If the regression is linear, one haa a more powerful tool at his disposal; if not, he still can rely on the factorial experiment. The above discussion has important bearing on designing an experiment. Whenever possible, the values of s should be controlled. As an illustration, the example of teaching methode may be used. The three samples (factor A) of Table 19.9d may be regarded as three different
w ~
TABLE 19.9g Source of V.ri.tiOD F.ctor A (samples) F.ctor B (s) LilLear regresslOll Devl.tloa from liaearity Iat••ctiOll V.riatiOD amcag b's Devi.tlGll from liaearity Error Tot.l ----_._---
Dep_ of Freedom
Sam of Squ....
- -
.
2
62.522.87 20.182.67
1 3
20.240.07 542.60 8
21.222.1-3
2 6
27.048.93 113.20 15
2."SCS.50 112.98".17 ---
---
-
Mea Square 31.261.43 5,195.61 20.240.01 180.87 3,"02.77 13.52.....7 28.87 163.17
F 190.89 31.13 123.59 1.10
2O.'M 82.58 0.18
Hypothesis
~-O 01.-0
J-O
Liaear rep••lolL ~B-O A. - 4- ~s Liaear rep_lOll
~
~
en
~
8
~
~n
till
29 -------
'---
n
D'"
•
..~
19.9
377
ANAL YSIS OF COVARIANCE AND FACTORIAL EXPERIMENT
teaching methods. The ~values (factor B) are the I.Q.' s of the students with 2 representing 102, 3 representing 103, and so fortb. The observa-
tions (1) are the test scores. Then each clus bas ten students with two in each I.Q. group, and consequently all three classes have students of similar I.Q.'s. In this example, the I.Q.'s are controlled; nature is not allowed to take its own course. The control of x-values enahles one to test the linearity of regression and to decide whether the analysis of
.,
Pop. 3
70
Pop. 2
60
Pop. I
so
•
2
4
3
5
6
Factor B
Fig. 19.9
covariance should be used. If the regression is not linear, one may use the analysis of the factorial experiment which may be regarded as the curvilinear analysis of covariance. Instead of dealing with three lines, the factorial experiment deals with three curves of regression (Fig. 19.9). The connection between the curves of regression and the factorial experiment can be shown by an example. Table 19.9h gives a set of 15
378
ANALYSIS OF COVARIANCE
Ch.19
treatment means for a 3 x 5 completely randomized factorial experiment. The factor B, which is quantitative, is interpreted as the ~ It is evident from the table that the interaction between factors A and B does not
TABLE 19.9b (8)
Population (Factor A)
8 MeaDs
"
1
2
3
2 3 4 5 6
40 42 52 60
45 47 51 57 65
53 55 59 65 73
66
A Means
48
53
61
54
46
46
48 52 58
exist. Now the treatment means are plotted in Fig. 19.9 88 15 points and the means of the 5 levels of the factor B are plotted 88 5 crosses. The 5 pointe belonging to the same level of factor A are connected by the curve of regression. It can be observed from the figure that the geometric interpretation of the hypothesis that O'~B - 0 is that the curves are "parallel." The interpretation of the hypothesi. that O'~ - 0 i. that the 5 crosses are on a horizontal line. The in~rpretation of the hypothesis that O'~ - 0 is that the three curves occupy the same general position in the graph. If there is no interaction, the hypothesis that O'~ - 0 ~mplies that the three curves are the same curve, the one which links the 5 crosses. Combinations of the two topics, factorial experiment and the analysis of covariance, causes a conflict in the use of notations. For example, the letters II and b represent different quantities at different places. In the factorial experiment, II is the number of observations in each of the CJ x b treatments; b is the number of levels of the factor B. In the analysis of covariance, II is the number of observations in each sample (level of factor A); b is the regression coefficient of r on ~ (factor 8). To clarify the possible confusion, one may try to identify these letters for the example given in Table 19.9d. Here, in terms of the factorial experiment, CJ - 3, b - 5, II = 2; but in terms of the analysis of covariance, " - 3, 11-10.
EXERCISES (1) Four random samples, each consisting of five observations, are drawn from the tag population which is a normal population with mean equal to 50 and variance equal to 100. An arbitrary x-value
EXERCffiES
379
is assigned to each observation (y). below:
The observations are listed
Sample No.
1
3
2
l:
Y
l:
Y
2
3 6 1 2
54
4
52 39 40 41
6
48
4
50
7
5
49 50 58
"5 9 4 3 2
4
Y
l:
Y
63 50
3 4 6 8
64
7
68
64
44 59
55 54
58
The mean of every array of every population is equal to 50. Therefore, the adjusted means of the four populations are all equal to SO, and the population regression coefficients are equal to zero and Consequently the average of the population regression coefficients is also equal to zero. Pretending that the source of the samples is unknown, test the following hypothesis at the 5% level: (a) The four population regression coefficients are equal. (b) The average of the populatioD resreaaioD coe££icieDte ia equal to zero. (c) The adjusted means, at x - 5, of the four populations are the same. Since the source of the samples is actually known, state whether your conclusions are COlTect or Type I elTors have been made. A Type II error cannot be made in this cue, because all the hypotheses being tested are correct. (2) Ignore samples 3 and 4 and use only samples 1 and 2 of the data of Exercise (1). (a) Test the hypothesis that f3. - f3a by the t-test at the 5% levels. State whether your conclusion is correct or a Type I error is made. (b) Test the same hypothesis by the F-test and note that tI - F. (c) Find the 95% confidence interval of f3. - f3. and state whether your estimate is correct. (3) The samples given in this exercise are obtained from the data of Exercise (1) by adding appropriate numbers to the observations (y) so that the four regression equations are as follows:
(m
"Y." - 50 + 0 x Populations 3 & 4: "y.,. - 50 + 10 x
Populations 1 & 2:
380
Ch. 19
ANALYSIS OF COVARIANCE Sample No.
2
1
"
Y
2 7 5 4
52 39 40 41 48
6
"
4
3 Y
3 6
54 49
1 2 4
58 50
50
"
5 9 4
Y
3
113 140 104 74
2
79
" 3
4 6
8 7
Y
94 95 114 138 138
The four population regression coefficients are, 0, 0, 10, and 10 and the four adjusted means are equal to 50, 50, 100, and 100 at % = 5. Pretending that the source of the samples are unknown, test the same three hypotheses of Exercise (1). State whether your conclusions are correct or a Type n error has been made. A Type I error cannot be made, because all the hypotheses being tested are false. (4) Find the 95% confidence interval of /3. - /32 of the data of Exercise (3). Use the pooled residual mean square of the four samples as the S2. Since /3. - /32 is known to be 10, state whether your estimate is correct. (5) Find the 95% confidence interval of the difference of the adjusted mean or populatioDs 3 and 2 or Exercise (4) at % - 5. Use the pooled residual mean square of the four samples 88 the S2. Since the difference is known to be 100 - 50 or 50, state whether your estimate is correct. (6) (a) Break down the 55 due to the variation among the regression coefficients of Exercise (3) into three individual degrees of freedom, namely, (n 1 vs. 2; (m 3' VB. 4; and (iii) 1 and 2 vs. 3 and 4; and test each of these three hypotheses. Since it is known that /31 /32 = 0 and /3. - /3 .. - 10, state whether your conclusions are correct. (b) Break down the 55 due to the variation among the adjusted means of Exercise (3) into similar individual degrees of freedom, and test each of the hypotheses. Since it is known that the adjusted means of the first two populations are equal to 50 and the last two equal to 100, state whether your conclusions are correct. (7) Ignoring samples 1 and 2 and using only samples 3 and 4 of Exercise (3), test the hypothesis that the two adjusted means are equal at % - x by the F-test (Table 19.7) and also by the t-test (Equation 6, Section 19.7). Note that f • F. The two /3's are both equal to 10. Therefore, the use of the methods given in Section 19.7 is justified. (8) Consider the data of Exercise (1) as those of a randomized block experiment with 4 treatments (samples) and 5 replications, and test the same three hypotheses of Exercise (1). Then test the hypothesis 1I - and see if the adjustments are necessary.
°
QUESTIONS
381
(9) For the data of Exercise (1), Chapter 12, let ~ be the initial weight aod , the final weight and test the hypothesis that the adjusted meaos at ~ - ¥ are equal. Then let ~ still be the initial weight, but let , be the gain in weight aod repeat the aoalysis. Observe that the adjusted mean squares are the same for both cases. (10) For the data of Exercise (5), Chapter 18, make a complete analysis as showu in Table 19.9g. The %-values (factor 8) are 1, 2, aod 3. Note that the regression of , on s is not linear aod that consequently some of the conclusions reached by the analysis of covariance are misleading. (11) The specific gravity (~) and modulus of rupture (,) of four fiberboards of each of 5 types A, B, C, D, and E are given below: Type of Board
B
A
"
0.930 1.021 1.026 1.016
'1
519 574 633 622
"
0.902 0.958 0.989 1.026
C '1
424 525 549 601
"
0.870 0.963 0.981 0.987
E
D '1
392 492 541 557
"
0.914 0.968 0.977 1.040
r 462 559 565 611
"
0.935 0.991 1.026 1.020
'1
465 568 613 611
(a) Test the hypothesis that the regression coefficients of , on % of the five types of boards are equal, at the 5% level. (b) Test the hypothesis that the adjusted means of , at ~ - % are the same for the five types of boards, at the 5% level. (12) (a) For the data of Exercise 13, Chapter 18, test the hypothesis that the regression coefficients of shrinkage (,) on the dye temperature (~) for the four fabrics are the same, at the 5% level. If the regression is linear, this hypothesis is equivalent to which one in a factorial experiment? one in a factorial experiment? (13) (a) For the data of Exercise 17, Chapter 18, test the hypothesis that the regression coefficients of cake size (,) 011 the baking temperature (~) are the same for the five recipes, at the 5% level. If the regression is linear, this hypothesis is equivalent to which one in a factorial experiment? (b) For the same data, test the hypothesis that the adjusted meaDS of , at ~ .. % are the same for the five recipes, at the 5% level. If the regression is linear, this hypothesis is equivalent to which one in a factorial experiment?
QUESflONS (1) What hypotheses can be tested by the analysis of covariance? (2) The assumptions underlying the analysis of covariance are those of
382
ANALYSIS OF COVARIANCE
Cho 19
the analysis of variance plus those of linear regression. Enumerate these assumptions with the duplications eliminated. (3) What is the geometric interpretation of the hypothesis that the regression coefficients are equal? (4) What is the geometric interpretation of the hypothesis that the adjusted means are equal at x • %'1 (5) For the weighted average, what is the weight for the (a) mean, (b) regression coefficient, and (e) adjusted mean? (6) The mean and the regression coefficient have many properties in common. TJ;ley are both linear combinations of the n observations (y), that is,
E - MsYl + MU't
+000
+ M,.r II.
(a) If E is the mean y, what are the multipliers M? (b) If E is the regression coefficient b, what are the multipliers M? (7) [n dealmg with means, the following notations are used: (a) T; (b) n; (c) y; (d) 11/1,; (e) y; (l) Gt /I.n. What are their counterparts in dealing with the regre.sion coefficients? (8) The analysis of covariance refers to the partitioning of the sum of products. For what purpose is the 5P broken down into the components, such as replications, treatment, error, and total? (9) In dealing with means, the sample sizes are preferably equal (Section 12.11). In dealing with regression coefficients what quantities are preferably equal? (10) If the means are adjusted to ~ - %, and the b's are not equal to zero, under what condition are the adjusted and the unadjusted means equal? (11) How do you select the x-values for the " treatments so that (a) the sizes n are the same for all the samples, (b) the values of 55 s are the same for all the samples, and (c) the mean % of the x-values are the same for all the samples? (12) After you have selected the x-values to satisfy the three conditions of Question (11) what kind of experiment do you have? (13) What are the three general hypotheses of the factorial experiment? What are the equivalent ones of the analysis of covariance? (14) Under what condition do the weights of the adjusted means become the sample sizes? (15) What is the purpose of using an individual degrees of ueedom on the means, regression coefficients, or adjusted means? (16) Under what condition should the analysis of covariance be used as an individual degrees of freedom for the factorial experiment?
REFERENCES
383
REFERENCES Cochran, W. C. aDd Cox, C. M.: Esperlllleralol Dulgu, Jolm Wiley Ir Sona, New York, 1950. Mood, Alexander M.: /ntrotlUCUOfl Company, New York, 1950.
&0
&M Theory 0/ SIGaIa&"'., McCraw-Bill Book
Ostle, Bernard: S&GIiauc. in ReuMcIa, Iowa State Collep Pre.., Amea, 1954. SDedecor, C. W.: Seou.&ical MeeAod•• Iowa State Collep Preaa, Amea, 1946. Steel, R. C. D.: ·-which dependent variate? Y or Y-X?" MiJlUlo Serle •• BU-54-M, Biometrics Unit. Department of Plant BreecliDs. Cornell University, 1954.
CHAPTER 20
REVIEW .. The material covered so far deals essentially with sampling from normal populations with equal variances. It is arbitrarily divided into chapters for no other reason than the convenience of presentation. The different methods are actually various aspects of the same general topic. To make effective use of these methods, it is necessary for one to acquire a comprehensive view of them. This chapter affords such an opportunity. 20.1 ADalysls
or VBriaaee
The principles of the analysis of variance are introduced very early in this book. As early as Section 7.9, the sum of the squares of the deviations of the n observations from the population mean is partitioned into two components, that is, JI
JI
2:(1-,,)1 - [:(1-;-")1 + n(1"-,,)1
(1)
(see Equation 1, Section 7.9). The three components have n, n..l, and 1 degree of freedom respectively. Then the statistic
n\1-,,)1
F.
1(1_1>1
(f_,,)1
n\1-,,)1
1 =
n-l
Sl
=
Sl
(2)
n
follows the F..distribution with 1 and n-l degrees of freedom. This F is obviously the square of e (Theorem 8.1a) with n-l degrees of freedom. For the paired observations, the hypothesis to be tested is p. .·0; therefore,
which is the square of e showin in Table 8.7 Equation (1) is linked with one of the most commonly used computing methods. This algebraic identity holds for any value of p.; therefore, it 384
385
ANALYSIS OF VARIANCE
20.1 holds for p. -
o.
Then Equation (1) becomes.
1:y2 - 1:(, -1>' + nY' or
T'
1:(y-y)l- I,J- -,
(3)
n
because , - TIn. It can be observed that Equation (3) gives the familiar method of computing SS. From Equation (3), one can alao aee wby the SS haa n-l degrees of &eedom. It i. because 1:,' haa originally n degrees of &eedom and fI In haa 1 degree of freedom. The quantity fl/n is an individual degree of &eedom of the n observations. When all the multipliers are equal to lin, the SS is
(4)
(Equation 11, Section 15.3). This is, of course, an individual degree of freedom of the n observations rather than that of " treatment means. The regression SS is another individual degree of freedom of the n observations (,), that is,
Q:..
[(~l-~)'l + (~,-~, + ••• + (~n-%)yrl
Q:
I[(~l-%)'
+ (~._~I + .•• + (~n-~']
(SP)' SS.
=--.
(5)
Q!
Furthermore, and are orthogonal, because the sum of the products of the multipliers is equal to zero; that is,
1 n
1 n
-(~l-~ i'-(~I-~
1
1 n
+ ..• +-(~ -'il --1:(~-%) - O. lin
This is the reason wby the residual SS,
(SP)'
fI
(SP),
55
n
55
5Sy--=~----
•
=1:". -Q:-Q!,
(6)
•
has n-2 degrees of freedom. The assumption that the population variances are equal governs many practices in statistics. Among these practices are the pooling of S5,. and the pooling of the residual 5S from " samples to obtain an estimate s' of uI. However, it haa become known recently (Section 12.7) that this practice does not have serious consequences so long as the population variances are not too much different.
386
REVIEW II
Ch.20
The basic assumption8 of the analY8is of variance are that (1) the sample8 are random, (2) the population8 are nonnsl, and (3) the variances are equal (Section 12.7); the basic function is to compare the means. The many ramification8 of thi8 topic 8tem from the different conditions imp08ed on the population mean8. For example, an observation may be expre88ed as the 8nm of the general mean, treatment effect, and error in a completely randomized one-factor experiment; but it may be expres8ed as the sum of the general mean, replication effect, A effect, B effect, interaction, and error in a two-factor randomized block experiment. The8e conditione are really regre8sion equations of a more complicated kind. The treatments, if quantitative, of the analysis of variance are actually the ~values of the re~ssion. It bas been repeatedly 8hown that the (linear) regression 55 is an individual degree of freedom of the treatment 55. The analysis of variance and the regression, linear or otherwise, are the two most &equently used statistical methods at the present time. Though treated as two different methode, they are really different aspects of the same general method. Unfortunately, the analysis of variance is often 88Hciated with planned experiment, and in contr88t the regression is associated with uncontrolled s-values. Actually, for a planned experiment, the linear regres8ion used aa an individual degree of freedom is very useful indeed (Sections 17.8). To make efficient use of regression, the experiment neede to be planned, that is, s-values should be controlled (Sections 17.5 & 19.9). When the regres8ion is nsed on lID unplanned experiment, it is by necessity rather than by choice. Sometimes the ~values cannot be controlled. 2O.2ladlvldaal Degree
or Freedom
The use of the individual degree of &eedom 88 a method of testing a specific hypothesis concerning " parameters haa been repeatedly emphuized. The advantage of this method lies in its lower probability of making a Type D error as compared with the general F-test. The basic principle of this method may be illustrated by an analogy. The treatment 55 with "-1 degrees of &eedom may be regarded aa "-1 test tnbes. The hypothesis to be tested is that they all contain distilled water. The alternative hypothesis is that some or all of the tubes may contain salt. To test this hypotheses one may taste the solution. If the water contains salt and the taster cannot detect its pre8ence, the re8ult is a Type D error. The problem now is how to taste the 801ution with a minimum risk of malting a Type n error, that is, how to detect the presence of salt with the greate8t ease. Shall one mix the solution &om all tnbes and taste it (general F-test), or 8hall one taste the content of each tube singly (individual degree of freedom)? If the salt i8 evenly distributed among the tubes, one method is as good as the other. But if only one tube contains
20.3
RANDOMIZED BLOCK EXPERIMENT
387
the salt, and tbe otbers are filled with pure water, tbe situation would be different. One may be able to detect the salt, if be tute. tbe content of the tubes one at a time. However, if the contente of the tube containing the salt are mixed with tbe diatilled water of tbe other tube. before the mixture is tuted, one may not be able to detect the salt from the dilnted .olution. Aa one can see, the larger tbe number ~f tubes involved, the more important it is to tute the contents singly. This is mo true for the individual degree of hedom, whicb become. more valuable a the number of treatmente increues. 20.8 Radollllzed Block Experlme.,
There is a great deal of similarity between tbe one-factor randomized block experimeat and the two-factor completely randomized factorial experiment. They are both special caea of the snaly.is of variance. The treatments are equivalent to the levela of factor A, and the replications are equivalent to those of factor B. Indeed the former cue is hqaently calletl the two-way cl88sification with a single ob.ervation, while the latter is called the two-way clusification with multiple obHrvations. It seemll that the only difference between them is the number of obaervations in a cell. This is, of course, the obvious difference which CaD be observed from the tabnlated data (Tables 14.2a and 18.2&). Whetherthere is any additional difference depends on which model is being considered (Section 18.6). For the variance component model and the mixed model, there is no additional differeace, bnt for the linear hypothesi. model, it must be 8118umed that the interaction between the replicatioDa and treatmente does not exist. If it did, tbere would be no error term to test any bypothesis. For example, an experimeBt consiste of 3 treatment. and 10 replicatioDS; and the numbers of degrees of freedom for the treatment, replication, and error are 2, 9, and 18 respectively. From the point of view of the factorial experiment, this error is really the interaction. Therefore, there is no error term present. However, it can be observed from Table 18.6 that, for the variance component model (or the mixed mode!), this third component, whether it is called interaction or error, may be used as the denominator for aD F -test. Therefore, for this model the one-factor randomized block experiment is the completely randomized two-factor experiment, the only difference being in tbe number of observations in a cell. But for the linear hypothesis model, q2A B must be equal to zero before the third component can be used u the denominator of an F-test (Table 18.6). If a 2 AB were greater than zero, more than one observation would be needed in each cell to provide the fourth component, whicb would be used u the denominator. Otherwise the error mean square (third component) will contain 0 2 A B. and consequently tbe F -values are made too small and a Type n error will be made too often. Therefore, the fixed levels of a factor cannot be designated u replicati.oDS; only a
random sample of many levels can be so designated. For example, in a feeding experiment, the litters of animals of the same breed may be designated as replications, because there are many litters in a breed, and the few litters under observation are regarded as a random sample from a population of many litters. On the other hand, a few particular breeds should be regarded as different levels of a new factor rather than as replications, unless it is known that the feeds (factor A) and breeds (factor B) do not interact (σ²_AB = 0).

20.4 Units of Measurements

All the statistics, such as u, t, χ², and F, are independent of the units of measurements. A change of unit for the observations, such as from numbers of pounds to numbers of kilograms, will not affect the value of any of the final statistics which are used in testing various hypotheses. Therefore, the conclusion reached by a statistical method remains the same no matter what unit of measurement is used. This is the reason why no special importance has been attached to the units in the discussion of the methods. However, the change of unit does affect some of the intermediate steps. If y is the number of feet, the SS is the number of square feet; if y is the number of inches, the SS is the number of square inches. When the unit is changed from foot to inch, the numerical value of the mean will be 12 times as large and that of the SS will be 12² or 144 times as large. This principle applies to all kinds of SS including those of the analysis of covariance. For example, the regression coefficient may be expressed as the number of feet (unit for y) per pound (unit for x). Then for b², the unit would be feet squared divided by pounds squared. But when b² is multiplied by SSx, which has the unit of pounds squared, the SS due to the variation among b's will have the unit of feet squared again. This is more evidence showing why SSx is an appropriate weight to use for the regression coefficient. For a confidence interval, the units of the limits are always the same as the unit of the parameter being estimated. For example, if y is the number of feet and x the number of pounds, the confidence limits of the mean μy or the adjusted mean μy.x are expressed in terms of the numbers of feet, while those of a regression coefficient are expressed in terms of the numbers of feet per pound.
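The unit-invariance just described can be verified directly. The following sketch, a minimal modern illustration in Python rather than part of the original exposition (NumPy and SciPy are assumed available, and the height data are hypothetical), computes a two-sample t statistic on the same observations expressed first in feet and then in inches: the SS grows by the factor 144, but the t-value is unchanged.

```python
import numpy as np
from scipy import stats

# Hypothetical heights of two groups, in feet.
a_ft = np.array([4.9, 5.1, 5.4, 5.0, 5.2])
b_ft = np.array([5.3, 5.6, 5.5, 5.7, 5.4])

# The same observations converted to inches (1 ft = 12 in).
a_in, b_in = 12 * a_ft, 12 * b_ft

# The SS scales by 12 squared, or 144 ...
ss_ft = np.sum((a_ft - a_ft.mean()) ** 2)
ss_in = np.sum((a_in - a_in.mean()) ** 2)
print(ss_in / ss_ft)          # approximately 144

# ... but the t statistic is identical in both units.
t_ft, p_ft = stats.ttest_ind(a_ft, b_ft)
t_in, p_in = stats.ttest_ind(a_in, b_in)
print(t_ft, t_in)             # same value
```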
20.5 Applications of Statistical Methods

The material presented in this book is centered around statistical methods, for each of which a number of examples of applications are given. However, these examples serve only as illustrations and are by no means the typical applications. They are selected mainly for their simplicity, and they can be understood without specialized knowledge in a subject matter field. These might be called common-sense examples.
Though the most frequently used examples refer to fertilizers, teaching methods, and children's heights, one does not need to be a farmer, a teacher, or a parent to comprehend them. However, in general, to understand applications, one must have the knowledge of statistics as well as that of a subject matter field. It is only natural for a statistics book to use the methods as centers of discussion and to treat the problems of applications as incidental. But in application a problem itself becomes the center and one calls for whatever statistical methods are available to accomplish his purpose. As a result, a series of methods, rather than a single method, is usually used in an applied problem. For example, the analysis of variance, useful as it is, is seldom the final analysis of a set of data. It is frequently accompanied by other methods such as the individual degree of freedom. Unfortunately, and this creates a dilemma for the beginner, there is no one-to-one correspondence between methods and problems. The methods are not one-purpose gadgets. They are used wherever they are needed. The application of statistics is not unlike the use of arithmetic in everyday life. The four operations (addition, subtraction, multiplication, and division) are used in any combination and in any order as they are needed, even though they are taught in school, one at a time, in a particular sequence, and accompanied by specific examples of applications.

20.6 Power of a Test

The power of a test is the probability that a particular false hypothesis will be rejected. It is equal to one minus the probability of committing a Type II error. For the same sample size and the same significance level, a test is said to be more powerful than another one if its probability of committing a Type II error is smaller. For example, the individual degree of freedom is a more powerful test than the general F-test of the analysis of variance. This is the reason for the emphasis given the individual degree of freedom throughout this book.
CHAPTER 21
SAMPLING FROM BINOMIAL POPULATION

All the statistical methods presented in the preceding chapters deal with normal populations. In this chapter parallel methods are developed for a different population which is called the binomial population. Traditionally, the sampling from a binomial population is treated as an independent topic with its own terminology and notations. However, this chapter departs from that convention and the presentation depends very much on the theorems developed previously.
2L 1 BIDO.I" Pop".I.,. A binomial populatioD, as the Dame implies, is a populatioD which has a dichotomous or two-sided character: aDimals, male or female; appl~s, good or bad; iDsecta, dead or alive; egp, hatched or unhatched; meals, hot or cold; rooms, light or dark; places, rural or arban. An observatioD 88SUllleS ODe or the other side of the character. To distiDgaish the two opposite categories, an observation is techDically called either a success or a failure. But these terms are used for designative rather than for descriptive purpose. No connotative meaDiDg should be attached to them. Thus the questioD of whether the male or the female sex sbould be designated 88 a "success" is wholly irrelevanL Dichotomous characters are usually cODsidered qualitative characters as agaiDst the quantitative ODes which can be measared Damerically. There is actually DO fundameDtal differeDce betweeD the qualitative o. aervatioDs beiDg cODsidered DOW and the quantitative oheenadoDS CODaidered previously. The temperature was ODce cODsidered qualitative and described as hot or cold. But, with the iDveDtioD of the thermometer, it became a quantitative character. EveD sex is Dot eDtirely qualitative. Some womeD are more femiDiDe than others, while some meD are more lIasculiDe thaD others. Many characters cODsidered qualitative today may become quantitative tomorrow with the iDveDtioD of measariDg devices. However, the traDsformatioD Deed DOt be from qualitative to quantitative. A quantitative character can also ChaDge into a qualitative ODe. For example, if people six feet or more iD height are cODSidered tall aDd those below six feet are cODsidered short, height becomes a qualitative character. If the observatioDS greater than, or equal to, the meaD of a populatioD are cODsiured large, and those below the mean are cODsidered small, any populatioD becomes a biDomial populatioD. The two categories, success and failure, of the observatioDs of a binomial popalatioD may be expressed iD quantitative terms by an appropriate 390
definition of the observation. The observation y is defined as the number of successes for each observation. If the observation is a success, y is equal to 1; otherwise, y is equal to 0. For example, one wishes to observe insects in terms of their being dead or alive. The observation y may be defined as the number of dead insects for each insect observed. If an insect is dead, there is one dead insect; therefore, the observation y is 1. If an insect is alive, there is no dead insect; therefore, the observation y is 0. In tossing a coin, the number of heads obtained for each trial may be defined as the observation y. If the head appears, there is one head; the observation y is 1. If the tail appears, there is no head; the observation y is 0. Therefore, from this point of view, the binomial population consists of only 0's and 1's as its observations. The mean and variance of a binomial population can be obtained in the same manner as for any other population. For example, a binomial population consists of the following five observations: 0, 1, 0, 1, 1. These observations may be interpreted as three dead and two living insects. The mean of this population is
μ = (0 + 1 + 0 + 1 + 1)/5 = 3/5 = 0.6,    (1)
and the variance is (Equation 1, Section 2.3)
σ² = [(0−.6)² + (1−.6)² + (0−.6)² + (1−.6)² + (1−.6)²]/5
   = (.36 + .16 + .36 + .16 + .16)/5 = 0.24.    (2)
The above computation can be greatly simplified by organizing the observations into a frequency table as shown in Table 21.1.

TABLE 21.1

  y       f       (y−μ)     (y−μ)²     (y−μ)²f
  0       2        −.6        .36         .72
  1       3         .4        .16         .48

  Sum     5                              1.20
          N                          Σ(y−μ)²f

  μ = Σyf/N = 3/5 = .6
  σ² = Σ(y−μ)²f/N = 1.20/5 = 0.24
All binomial populations consist of the same observations 0 and 1. What differentiates one population from another are the frequencies of these observations (Table 21.1). Therefore, the frequencies are of primary importance to a binomial population. The mean and variance are both related to f0 and f1, the frequencies of 0 and 1. Whatever values the frequencies may assume, the sum of them is equal to N, the total frequency; that is,

f0 + f1 = N;    (3)
and the sum of the relative frequencies is equal to 1 or 100%; that is,
f0/N + f1/N = 1,    (4)

or

f0/N = 1 − f1/N.    (5)

The physical meaning of Equations (3), (4), and (5) is very simple. What Equation (3) says is that the number of living insects plus that of dead insects is equal to the total number of insects observed. Equation (4) says that the percentage of living insects plus that of dead insects is equal to 100%. Equation (5) says that the percentage of living insects is equal to 100% minus the percentage of dead insects. In terms of the example of Table 21.1, Equation (3) says that

2 + 3 = 5;

Equation (4) says that

2/5 + 3/5 = 1 or .4 + .6 = 1;

Equation (5) says that

2/5 = 1 − 3/5 or .4 = 1 − .6.
The mean and variance can be obtained in terms of the frequencies f0 and f1. The mean (Equation 1, Section 2.5) is

μ = Σyf/N = [(0)f0 + (1)f1]/N = f1/N,    (6)

the relative frequency of the observation 1. The variance (Equation 2, Section 2.5) is

σ² = (0−μ)²(f0/N) + (1−μ)²(f1/N) = μ²(f0/N) + (1−μ)²(f1/N).    (7)
But, by Equation (6), μ = f1/N; by Equation (5), 1 − μ = f0/N; therefore, the variance becomes

σ² = (f1/N)²(f0/N) + (f0/N)²(f1/N) = (f0·f1/N³)(f1 + f0) = f0·f1/N² = (f0/N)(f1/N),    (8)
which is the product of the relative frequencies of the observations 0 and 1. As an illustration, one may refer to Table 21.1 which shows a population with two failures and three successes. The relative frequencies of 0 and 1 are 2/5 or 0.4 and 3/5 or 0.6 respectively. Therefore, the mean of the population is 0.6 and the variance is (.4)(.6) = 0.24 (Table 21.1). As one can see, there is practically no computation involved in obtaining the mean and variance. Furthermore, the binomial population has only one parameter, the mean μ or the relative frequency of successes. When the mean is known, the variance becomes automatically known. For example, if the mean is equal to 0.2, the variance is equal to (.2)(.8) or 0.16; if the mean is equal to 0.3, the variance is equal to (.3)(.7) or 0.21. It is needless to say that the mean, being a relative frequency, cannot be greater than 1 or less than 0. The discussion in this section may be summarized in the following theorem:

Theorem 21.1 The mean of a binomial population is equal to the relative frequency of successes and the variance is equal to the product of the relative frequencies of successes and failures. The mean is the only parameter of a binomial population.
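Theorem 21.1 is easy to verify by direct computation. The following minimal Python sketch (a modern illustration, not part of the original text) applies the general definitions of the mean and variance to the five-observation population of Table 21.1 and checks the result against μ(1−μ).

```python
# Binomial population of Table 21.1: two failures (0) and three successes (1).
pop = [0, 1, 0, 1, 1]
N = len(pop)

mu = sum(pop) / N                            # relative frequency of successes
var = sum((y - mu) ** 2 for y in pop) / N    # population variance, by definition

print(mu, var)          # 0.6 0.24
print(mu * (1 - mu))    # 0.24 again: the variance is mu(1 - mu)
```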
21.2 Sample Mean and Sample Sum
The mean μ, or the relative frequency of successes, of a binomial population is a fixed quantity, but the sample mean ȳ and the sample sum T are subject to sampling fluctuation. In this section, the properties of the distributions of ȳ and T are presented. The properties of the distribution of sample means are extensively discussed in Chapter 5. There is really no need to repeat the discussion for such a simple population as the binomial, which consists of only 0's and 1's. However, this extreme simplicity sometimes creates undue confusion. Therefore, a sampling experiment is performed here to clarify the physical meaning of the terminology and the notations. The equipment used for this experiment consists of a bagful of 4000 beads, 40% of them white and 60% red. The white bead is considered a
"success" and the red a "failure." Then the population mean is equal to 0.4 and the variance is equal to (.4)(.6) .. 0.24 (Theorem 21.1). From this population 1000 random samples, each consisting of 20 beada, are drawn. For each sample, the sum T and mean y are computed. However, the procedure is so simple that there is practically no work involved. For example, a sample of 11 white and 9 red beads, when translated into quantitative terms, may look like this: 0,.1,0, 1, 1, 1, 0, 0, 1,0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1. The sample sum, Iy or T, is equal to 11, and the sample mean y is equal to Tin or 11/20 or .55. Therefore, the sample sum is obtained by merely counting the number of white beads in a sample; while the sample mean y is obtained by dividing the number of white beads by the sample size. In general, the sum T is the number of successes in a sample and the mean y is the relative &equency of successes in a sample. Therefore, in sampling hom a binomial population, one may think either in terms of &equency and relative frequency or in terms of sum and mean. The sample mean or the relative frequency of the white beads is expected to follow the normal distribution approximately (Theorem 5.2a), with mean equal to I" and variance equal to u*ln (Theorem 5.3). For this population of 4000 beads, the mean is equal to 0.4 and the variance is TABLE 21.2.
T
r
f
r.c.r.(")
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50 0.55 0.60 0.65 0.10 0.15 0.80 0.85 0.90 0.95 1.00
0 0 3 14 53 70 121
0.0 0.0 0.3 1.7 7.0 14.0 26.7
Tot.l
181
44.8
155 161 116 68 32 15
60.3 76.4 88.0 94.8 98.0 99.5 99.9 100.0 100.0 100.0 100.0 100.0 100.0
4
1 0 0 0 0 0 1000
equal to (.4)(.6) or 0.24, because 40% of the beads are white (Theorem 21.1). Therefore, the sample means approximately follow the normal distribution with mean equal to 0.4 and variance equal to 0.24/20 = 0.012. The frequency distribution of the 1000 sample means is shown in Table 21.2a. The column T gives the sample sum which is the number of white
" ,.c.,. 90
•,
,,I ,, , ,, I
------- ---------.--- -- --- ------ --
50
10
.20
.25
.30
.3S
.40
.4$
.50
.60
Fig. 21.2
beads. Since each sample consists of 20 observations, T ranges from 0 to 20. The column ȳ gives the sample mean which is the relative frequency of white beads. The mean ȳ, being equal to T/20, ranges from 0 to 1. Column f shows the number of times for which a particular mean occurred. For example, at ȳ equal to 0.20, f = 53. This indicates that 53 out of 1000 samples, each consisting of 20 observations, have a mean of 0.20. In plainer English, this says that 53 samples out of 1000 contain 4 white beads. The relative cumulative frequencies (r.c.f.) of the means are plotted on the normal probability graph paper as shown in Fig. 21.2. The fact that the points are almost on a straight line indicates that the mean follows the normal distribution very closely. The mean and stand-
ard error of ȳ can be read from the graph as 0.395 and 0.11 respectively. These values are approximately equal to 0.4 and √0.012 as expected. The distribution of the sample sum, or the number of white beads, can be obtained from that of the sample mean. It can be observed from Table 21.2a that the frequency distribution remains the same whether one is interested in the sum or in the mean. The fact that there are 53 samples with mean equal to 0.20 also indicates that there are the same number of samples with the sum equal to 4. Therefore, the same histogram fits both sum and mean. However, the differences occur in the mean and variance of these two distributions. A sum is n times its mean. Therefore, the mean of the sums is equal to nμ and the variance is equal to n²(σ²/n) or nσ² (Theorem 2.4b). Consequently, the sample sum, or the number of successes in a sample, follows approximately the normal distribution with mean equal to nμ and variance equal to nμ(1−μ), because σ² = μ(1−μ). For this sampling experiment, the average number of white beads for the 1000 samples should be equal to 20(0.4) = 8 and the variance equal to 20(.4)(.6) = 4.8. In conducting the sampling experiment described above, the bagful of beads is not the essential equipment. The same experiment can be carried out with the aid of a random number table. Table 2 of the Appendix may be used for this purpose. Each line of the table has 40 digits. The first 20 digits constitute one sample and the remaining 20 digits constitute another sample. To make μ = 0.4, one may consider 1, 2, 3, 4 as successes and 5, 6, 7, 8, 9, 0 as failures. For the following sample of 20 observations (digits)

16  74  85  63  79  48  96  34  80  54,
the number of successes is 7 and ȳ is equal to 7/20 or 0.35. Table 2, Appendix, can provide 500 such samples. If more samples are desired, a more extensive table may be used. The Table of Random Numbers, Tracts for Computers, No. XXIV, Cambridge University Press, can provide 5000 samples of size 20. There are three distinct distributions associated with this sampling experiment, namely, (a) the binomial population, (b) the distribution of the sample mean or the relative frequency of successes in a sample, and
(c) the distribution of the sample sum or the number of successes in a sample. The last of the three distributions is called the binomial distribution, which is not to be confused with the binomial population. The means and variances of these distributions are listed in Table 21.2b, where n denotes the sample size and μ the relative frequency of successes in the population.

TABLE 21.2b

  Distribution                            Mean     Variance
  Binomial population                      μ        μ(1−μ)
  Sample mean                              μ        μ(1−μ)/n
  Sample sum (binomial distribution)       nμ       nμ(1−μ)
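The bead experiment of this section can be repeated on a computer. The sketch below, a minimal Python illustration rather than part of the original text (NumPy assumed available; the seed is arbitrary), draws 1000 samples of 20 observations from a binomial population with μ = 0.4 and compares the observed means and variances of T and ȳ with the values listed in Table 21.2b.

```python
import numpy as np

rng = np.random.default_rng(1957)      # arbitrary seed
mu, n, reps = 0.4, 20, 1000

# Each row is one sample of n observations, each 0 or 1.
samples = rng.binomial(1, mu, size=(reps, n))
T = samples.sum(axis=1)                # sample sums (binomial distribution)
ybar = T / n                           # sample means

print(T.mean(), T.var())               # close to n*mu = 8 and n*mu*(1-mu) = 4.8
print(ybar.mean(), ybar.var())         # close to mu = 0.4 and mu*(1-mu)/n = 0.012
```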
21.3 Minimum Sample Size

The sample means from a binomial population do not follow the normal distribution closely unless the sample is large (Theorem 5.2a). Therefore there is always the question of how large a sample must be to allow the distribution of sample means to be considered a normal distribution. Actually the answer to this question depends on the sample size as well as on the population mean. The commonly used working rule is that both nμ and n(1−μ) should be greater than or equal to 5 before the normal approximation can be used. This working rule can be explained in terms of a series of sampling experiments. The equipment used consists of one bagful of white beads
and one bagful of red beads. The white bead is considered as a success and the red bead as a failure. The beads of different colors may be mixed in any proportion to provide different values of μ. To make the population mean equal to 0.5, 2000 beads of each color are mixed. To make μ equal to 0.4, 1600 white and 2400 red beads are mixed. To make μ equal to 0.9, 3600 white and 400 red beads are mixed. In each case, the number of beads in a population is 4000. Then random samples of various sizes may be drawn from these populations.

TABLE 21.3  Relative Frequency of White Beads

    ȳ      Expt. 1       Expt. 2       Expt. 3       Expt. 4
           n=10; μ=.5    n=10; μ=.4    n=20; μ=.4    n=20; μ=.9
   .00         1             2             0             0
   .05                                     0             0
   .10        12            54             3             0
   .15                                    14             0
   .20        49           145            53             0
   .25                                    70             0
   .30       118           221           127             0
   .35                                   181             0
   .40       211           221           155             0
   .45                                   161             0
   .50       222           185           116             0
   .55                                    68             0
   .60       202           117            32             0
   .65                                    15             1
   .70       117            47             4            10
   .75                                     1            31
   .80        59             6             0            84
   .85                                     0           191
   .90         9             2             0           278
   .95                                     0           270
  1.00         0             0             0           135

  Total     1000          1000          1000          1000

The four experiments conducted are as follows:
" .. 10 " ... 5
I 250
200
150
100
50
.00
.10
.20
.30
.40
.50
Fig. 21.38
.60
.70
.80
.90
1.00
.,
[Fig. 21.3b: Experiment 2 (n = 10, μ = .4); histogram of the 1000 sample means with a superimposed normal curve.]
Experiment 1. One thousand random samples, each consisting of 10 beads, are drawn from the binomial population with mean equal to 0.5. The frequency distribution of the 1000 sample means (ȳ) is shown in Table 21.3. The mean and variance of the sample means should be approximately equal to μ (or 0.5) and μ(1−μ)/n (or 0.025) respectively. The histogram of the sample means, with a superimposed normal curve with the same mean and variance, is shown in Figure 21.3a.

Experiment 2. One thousand random samples, each consisting of 10 beads, are drawn from the binomial population with mean equal to 0.4.
The frequency distribution of the 1000 sample means (ȳ) is shown in Table 21.3. The histogram, with a superimposed normal curve with mean equal to 0.4 and variance equal to 0.024, is shown in Figure 21.3b.

Experiment 3. One thousand random samples, each consisting of 20 beads, are drawn from the binomial population with mean equal to 0.4. The frequency distribution of the 1000 sample means (ȳ) is shown in Table 21.3. The histogram, with a superimposed normal curve with mean equal to 0.4 and variance equal to 0.012, is shown in Figure 21.3c.
[Fig. 21.3c: Experiment 3 (n = 20, μ = .4); histogram of the 1000 sample means with a superimposed normal curve.]
Experiment 4. One thousand random samples, each consisting of 20 beads, are drawn from the binomial population with mean equal to 0.9. The frequency distribution of the 1000 sample means (ȳ) is shown in Table 21.3. The histogram, with a superimposed normal curve with mean equal to 0.9 and variance equal to 0.0045, is shown in Figure 21.3d.

Whether a normal curve is a good approximation of the distribution of the sample means drawn from a binomial population depends not only on the sample size, but also on the population mean. If μ is equal to 0.5, 10 observations constitute a sample large enough to insure normality. As μ deviates farther from 0.5, a larger sample is required. The reason is this: Both the population mean μ and the sample mean ȳ, being relative
[Fig. 21.3d: Experiment 4 (n = 20, μ = .9); histogram of the 1000 sample means with a superimposed normal curve.]
frequencies, can only assume the values between 0 and 1. Since μ is the mean of the ȳ's, the ȳ's are clustered around μ. If μ = 0.5, the ȳ's have an equal amount of room to spread out on both sides of 0.5 as illustrated by Figure 21.3a. It can be seen that the normal curve is a good approximation of the histogram. When μ = 0.4, the ȳ's cluster around 0.4. The sample means do not have as much room to spread out toward 0 as toward 1 (Figure 21.3b). As a result, the ȳ's are more crowded on the left tail of the distribution. Comparing Figures 21.3a and 21.3b, one will notice that the normal curve is a better approximation for the case μ = 0.5 than for the case μ = 0.4, even though the sample size is 10 in both cases. When μ = 0.9, the normal approximation is wholly inadequate (Figure 21.3d). Even the larger sample size of 20 does not do much good. The contrast among Figures 21.3a, 21.3b, and 21.3d shows the influence of μ on the shape of the distribution of sample means. The farther μ deviates from 0.5, the worse the normal approximation, provided the sample size remains the same. The distribution of sample means can be brought closer to a normal distribution by increasing the sample size, even if μ is not equal to 0.5. The reason is this: As the sample size increases, the variance of the sample means, being σ²/n, decreases. The ȳ's hug the μ so closely that they do not need much room to spread out on either side of μ. This situation is illustrated by Figure 21.3c, which shows that the normal curve is a good approximation of the histogram. The contrast between Figures 21.3b and 21.3c reveals the influence of the sample size on the shape of the distribution of the sample means. For the same population mean, the larger the sample size, the better the normal approximation. In summary, both μ and n influence the shape of the distribution of sample means drawn from a binomial population. The working rule that both nμ and n(1−μ) should be equal to or greater than 5 is derived from this principle. Figures 21.3a and 21.3c satisfy this rule, while Figures 21.3b and 21.3d do not. Observing these figures, one will notice that the histograms of the former two figures fit the normal curve more closely than the latter two. However, this number 5 is not a sacred number. Obedience to the working rule is not "right," nor is the violation of it "wrong." Like many rules in many fields, this one is intended to be a guide.

21.4 Test of Hypothesis Concerning Mean

It is shown in Section 21.2 that the sample mean approximately follows the normal distribution, with mean equal to μ and variance equal to μ(1−μ)/n. Then the statistic (Equation 1, Section 6.7)

u = (ȳ − μ)/√(σ²/n) = (ȳ − μ)/√[μ(1−μ)/n]    (1)
approximately follows the normal distribution, with mean equal to 0 and variance equal to 1. This statistic may be used in testing the hypothesis that the population mean is equal to a given value. For a two-tailed test, the critical regions are u < −1.96 and u > 1.96, if the 5% significance level is used. A similar statistic,

u = (T − nμ)/√[nμ(1−μ)],    (2)
may be obtained from the distribution of the sample sum (Table 21.2b). It should be noted that the two statistics, even though derived from different sources, are actually the same. The second u may be obtained from the first one by multiplying both the numerator and the denominator by n. An example of application may help to clarify the purpose of the u-test. The manufacturer of a certain insecticide claims that a particular dosage
can kill 80 percent of certain insects. To test this claim, an investigator conducted an experiment and found that the drug killed 300 out of 400 insects, a percentage of 75. The question is: Would this evidence justify the allegation that the manufacturer had made a false claim? Here the u-test may be applied. The hypothesis is that μ = 0.8 and the sample mean is 0.75. Therefore, the statistic is

u = (0.75 − 0.80)/√[(0.8)(0.2)/400] = −0.05/0.02 = −2.5,    (3)

which is less than −1.960. If the 5% significance level is used, one can conclude that the insecticide is not so effective as the manufacturer claims. The conclusion is the same if the second version of the u-test is used. In this case,

u = [300 − 400(0.8)]/√[400(.8)(.2)] = −20/8 = −2.5,    (4)

which is exactly the same value obtained previously.
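Both versions of the u-test can be sketched in a few lines of plain Python (a modern illustration of Equations 3 and 4, not part of the original text); each reproduces the value −2.5 obtained above.

```python
from math import sqrt

n, T, mu0 = 400, 300, 0.8       # sample size, number killed, hypothetical mean
ybar = T / n                    # 0.75

# First version, based on the sample mean (Equation 3).
u1 = (ybar - mu0) / sqrt(mu0 * (1 - mu0) / n)

# Second version, based on the sample sum (Equation 4).
u2 = (T - n * mu0) / sqrt(n * mu0 * (1 - mu0))

print(u1, u2)    # both -2.5; reject at the 5% level since -2.5 < -1.96
```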
It is wholly irrelevant whether a dead or a living insect is considered a success. The u-test just carried out considers the dead insect as a success. Now, suppose the living insect is considered a success. The hypothesis is then that 20 percent of the insects will survive the drug. The experiment showed that 25 percent remained living. Then the statistic is
u = (.25 − .20)/√[(.2)(.8)/400] = 0.05/0.02 = 2.5,    (5)
which is the same value obtained previously with only the sign changed. Therefore, the effect of interchanging the definitions of success and failure is the interchanging of the left and right critical regions, but the conclusions reached by both definitions are always the same. For a normal population, the u-test is very seldom used, because the variance σ² is usually unknown. Here, for a binomial population, the u-test can be used, because the variance σ² can be obtained through the hypothesis being tested. If the hypothetical mean is equal to 0.8, the hypothetical variance is bound to be equal to (0.8)(.2) or 0.16. In testing the hypothesis that the population mean is equal to a given value, besides the u-test, one may use an alternative method which is called the chi square test of the goodness of fit. This method involves finding the discrepancies between the observed frequencies f and the hypothetical frequencies h. The observed frequencies are the numbers of successes and failures in a sample of size n. The hypothetical frequencies are the numbers of successes and failures divided among the n
observations according to the hypothesis; that is, the hypothetical frequencies are nμ and n(1−μ) or n − nμ respectively. For the example of the insecticide, the observed frequencies of the experiment (sample) are 300 dead and 100 living insects. The hypothetical frequencies, by the hypothesis that μ = 0.8, are 320 dead and 80 living. The statistic used to test this hypothesis is

χ² = Σ (f−h)²/h = (300−320)²/320 + (100−80)²/80 = 6.25,    (6)
with 1 degree of freedom. The details of the computation are shown in Table 21.4. From Equation (6), it can be observed that the value of χ² is affected by the discrepancies, f−h. When the observed frequencies are too much different from their corresponding hypothetical frequencies, the hypothesis will be rejected. This is obviously a one-tailed test.

TABLE 21.4

  Observation   Observed       Hypothetical    f−h    (f−h)²   (f−h)²/h
                Frequency f    Frequency h
  Dead             300             320          −20     400      1.25
  Alive            100              80           20     400      5.00

  Total            400             400            0              6.25
No sampling experiment is needed to show that this statistic really follows the χ²-distribution. The χ² of Equation (6) is exactly the square of the u of Equation (3), that is, 6.25 = (−2.5)². By Theorem 7.6, the χ², which is u², is bound to follow the χ²-distribution with 1 degree of freedom. Therefore, there is no need to develop the χ²-distribution anew. The algebraic proof of u² = χ² is very simple. It is mostly a matter of identifying the terminology and notations. The observed frequency of successes is the sample sum T; and therefore the observed frequency of failures should be n − T. The hypothetical frequencies are nμ and n(1−μ) respectively. Then the statistic
χ² = Σ (f−h)²/h
   = (T − nμ)²/(nμ) + [(n − T) − n(1−μ)]²/[n(1−μ)]
   = (T − nμ)²/(nμ) + (−T + nμ)²/[n(1−μ)]
   = (T − nμ)² [(1−μ)/(nμ(1−μ)) + μ/(nμ(1−μ))]
   = (T − nμ)²/[nμ(1−μ)] = u²    (7)
(Equation 2). The physical meaning of the number of degrees of freedom is shown in Table 21.4. The two discrepancies, f−h, are always numerically the same with the signs different. When one discrepancy is known, the other is bound to be known. Therefore there is only one degree of freedom. A hypothetical frequency is really the mean of the observed frequencies of repeated samples, if the hypothesis is true. For example, the frequency of successes of a sample is the sample sum T, and the corresponding hypothetical frequency is nμ. It is shown in Section 21.2 that nμ is the mean of the distribution of the sample sum T (Table 21.2b). The χ²-test, as the u-test, is valid only if the sample size is large. Since the two tests are actually identical, the same working rule is applicable to both cases. The rule is that both hypothetical frequencies should be greater than or equal to 5 (Section 21.3).
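The goodness-of-fit computation of Table 21.4, and its identity with u², can be checked numerically. A minimal sketch follows (a modern illustration with SciPy assumed available; its chisquare function computes exactly the statistic of Equation 6).

```python
from scipy.stats import chisquare

observed = [300, 100]        # dead, alive
expected = [320, 80]         # n*mu and n*(1 - mu) under the hypothesis mu = 0.8

chi2, p = chisquare(observed, f_exp=expected)
print(chi2)            # 6.25, with 1 degree of freedom
print((-2.5) ** 2)     # 6.25 as well: the square of the u of Equation (3)
```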
21.5 Confidence Interval of Mean

The 95% confidence interval of the population mean is given as

ȳ − 1.96√(σ²/n) < μ < ȳ + 1.96√(σ²/n)    (1)
in Inequality (3), Section 11.3. But one will face difficulty in applying this method to a binomial population, because σ² = μ(1−μ). The very purpose of finding the confidence interval is to estimate μ. Now the confidence limits themselves are expressed in terms of μ. Therefore, in order to estimate a parameter by this method, the parameter itself must be known. If it were known, there would be no problem of estimation. This is undoubtedly a vicious circle. One way of breaking this circle is to use ȳ(1−ȳ) to replace σ². Then the 95% confidence interval of μ becomes
ȳ − 1.96√[ȳ(1−ȳ)/n] < μ < ȳ + 1.96√[ȳ(1−ȳ)/n],    (2)

which constitutes an approximation of Inequality (1). This approximate method is usable, especially when μ is close to 0.5 and n is large. As an illustration of application, the example of the insecticide given in the preceding section may be used. The experiment showed that the insecticide killed 300 out of 400 insects, or 75%, while the manufacturer claimed that his product was 80% effective. The u-test showed that the claim is not justified. Now the natural question one would ask is, what, then, is the correct percentage if it is not 80? In this case, one may use Inequality (2) to estimate μ, the true percentage of insects that would be killed. The sample mean ȳ is equal to 300/400 = .75, and the confidence limits are

.75 − 1.96√[(.75)(.25)/400]  and  .75 + 1.96√[(.75)(.25)/400],  or  0.71 and 0.79.    (3)
Therefore, the conclusion is that the true percentage killed is somewhere between 71 and 79, with a confidence coefficient of 0.95. If one desires a confidence coefficient of 0.99, the value of 1.960 is replaced by 2.576 in computing the confidence limits. Of course, Inequality (3) also implies that the confidence interval of the survival rate is 0.21 to 0.29, where 0.21 is 1−0.79 and 0.29 is 1−0.71. The method of finding the confidence interval given above is an approximate one. A more refined and more complicated method is also available. However, the confidence intervals, based on this refined method, are already tabulated and given in Tables 9a and 9b of the Appendix. For example, if 7 out of 20 observations are successes, the interval of μ is given as 0.15 to 0.59, with a confidence coefficient of 0.95 (Table 9a). However, the limits cannot always be read off the table directly. If ȳ = .75 and n = 400, as in the example of the insecticide, the
interval is not given; but the table can be used anyway. The fact that ȳ exceeds .50 does not create any difficulty. All one has to do is to find the confidence interval of the survival rate rather than that of the fatality rate. For the survival rate, ȳ = 1 − 0.75 = 0.25. The confidence intervals, as shown in Table 9a, are 0.20 to 0.31 for n = 250, and 0.22 to 0.28 for n = 1000. The interval for n = 400 is bound to be narrower than the former and wider than the latter. Therefore, the interval for n = 400 should be about 0.21 to 0.29. The limits of the interval for the fatality rate can be obtained by subtracting each of the two limits 0.21 and 0.29 from 1. Hence the 95% confidence interval for the fatality rate is 0.71 to 0.79, which is exactly the same as that obtained by the approximate method (Inequality 3). This shows that the approximate method is good for such a large sample. But when n is small and μ is far from 0.5, the result obtained by the approximate method and that obtained from Tables 9a and 9b do not agree closely.
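The approximate interval of Inequality (2) is a short computation. A minimal Python sketch follows (plain arithmetic, offered as a modern illustration; 1.960 is the 95% normal multiplier, and 2.576 would give the 99% interval).

```python
from math import sqrt

T, n = 300, 400
ybar = T / n                          # 0.75
se = sqrt(ybar * (1 - ybar) / n)      # estimated standard error, about 0.0217

lower = ybar - 1.960 * se
upper = ybar + 1.960 * se
print(round(lower, 2), round(upper, 2))    # 0.71 0.79
```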
21.6 Difference Between Two Means

It is shown in Section 21.2 that the sample mean approximately follows the normal distribution, with mean equal to μ and variance equal to μ(1−μ)/n. Then the statistic (Equation 1, Section 10.3)

u = [(ȳ1 − ȳ2) − (μ1 − μ2)]/√(σ1²/n1 + σ2²/n2)
  = [(ȳ1 − ȳ2) − (μ1 − μ2)]/√[μ1(1−μ1)/n1 + μ2(1−μ2)/n2]    (1)
approximately follows the normal distribution, with mean equal to 0 and variance equal to 1. This statistic, with some modification, may be used in testing the hypothesis that the difference between two population means is equal to a given value. The necessary modification is that the unknown variances σ1² and σ2² of Equation (1) must be replaced by their estimates. How the population variances should be estimated is determined by the hypothesis being tested. In a binomial population, the variance is equal to μ(1−μ). If two population means are equal, the population variances are bound to be equal. Therefore, in testing the hypothesis that μ1 − μ2 is equal to a value other than zero, the population variances are considered to be different. Then the estimates of σ1² and σ2² are ȳ1(1−ȳ1) and ȳ2(1−ȳ2) respectively. Thus the revised statistic is

u = [(ȳ1 − ȳ2) − (μ1 − μ2)]/√[ȳ1(1−ȳ1)/n1 + ȳ2(1−ȳ2)/n2].    (2)

On the other hand, if the hypothesis is μ1 − μ2 = 0, the two population variances are considered equal. Then the pooled estimate ȳ(1−ȳ) should
be used to replace both σ1² and σ2² of Equation (1). Therefore, in testing the hypothesis that μ1 = μ2, the appropriate statistic to use is

u = (ȳ1 − ȳ2)/√[ȳ(1−ȳ)(1/n1 + 1/n2)],    (3)
where ȳ is the general mean of the two samples. The application of Equation (2) may be illustrated by an example. Suppose a manufacturer claims that his new formula for a certain insecticide can kill 8% more insects than his old formula. In statistical terms, the hypothesis is that μ1 − μ2 = 0.08. To see whether this claim is justified, an investigator conducted an experiment and found that the new formula killed 320 out of 400 insects, while the old one killed 60 out of 100 insects. The data are given in Table 21.6a in the form of a two-way frequency table.

TABLE 21.6a

  Observation                 Treatment              Total
                         New           Old
  Killed                 320            60            380
  Not killed              80            40            120

  Sample size            400           100            500
Considering a killed insect as a success, the mean for the new formula is 320/400 or 0.80, and the mean for the old formula is 60/100 or 0.60. The hypothesis being tested here is μ1 − μ2 = 0.08, without specifying the values of μ1 and μ2. The value of u is (Equation 2)

u = [(.80 − .60) − .08]/√[(.80)(.20)/400 + (.60)(.40)/100] = .12/.0529 = 2.27,    (4)

with the details of computation shown in Table 21.6b.

TABLE 21.6b

  Item                New         Old         Combination
  T                   320          60
  n                   400         100
  ȳ                .800000     .600000        .200000 (−)
  1−ȳ              .200000     .400000
  ȳ(1−ȳ)           .160000     .240000
  ȳ(1−ȳ)/n         .000400     .002400        .002800 (+)
  Standard error                               .052915 (√)

Since 2.27 is
greater than 1.96, the hypothesis should be rejected, if the 5% significance level is used. In plain English, the conclusion is that the margin of superiority of the new formula over the old one is wider than what the manufacturer claims. The confidence interval of μ1 − μ2 can be obtained from Equation (2). For a confidence coefficient of 0.95, the limits are (Chapter 11)
(ȳ1 − ȳ2) ± 1.960 √[ȳ1(1−ȳ1)/n1 + ȳ2(1−ȳ2)/n2].    (5)
For the data given in Table 21.6a, the limits are
(.80 − .60) ± 1.960(.0529),  or  .20 ± .10,

with the details of the computation shown in Table 21.6b. Therefore the difference μ1 − μ2 is somewhere between 0.10 and 0.30, with a confidence coefficient of 0.95. In other words, the new formula can kill somewhere between 10 to 30 percent more insects than the old one can. In order to obtain a more exact estimate of the difference μ1 − μ2, that is, to narrow the interval, larger sample sizes are required. The sample sizes of the data given in Table 21.6a are deliberately made different to show the general method of computation. However, in conducting an experiment, efforts should be made to equalize the sample sizes. Instead of using 400 and 100 insects for the two treatments, 250 should be used for each. For the same total number of observations, the equalization of sample sizes reduces the standard error of the difference between two sample means (Section 10.7). As a result, in the test of a hypothesis the probability of making a Type II error is reduced; and in the estimation of a parameter, the confidence interval is narrowed. The computing method for Equation (3) can be illustrated by the data given in Table 21.6a. The new item involved here is the general mean ȳ, which is the grand total G divided by the total number of observations. For the given set of data, the general mean is equal to 380/500 or 0.76 (Table 21.6a). Of course, ȳ is also the weighted mean of ȳ1 and ȳ2, with the sample sizes as the weights; that is,
ȳ = G/Σn = (n1ȳ1 + n2ȳ2)/(n1 + n2).    (6)

The physical meaning of this general mean is that 76 percent of the insects are killed for the two treatments combined. With the general
mean computed, one can proceed to find the value of u, which is (Equation 3)

u = (.80 − .60)/√[(.76)(.24)(1/400 + 1/100)] = .20/.04775 = 4.188,    (7)
with the details of the computation shown in Table 21.6c. Since 4.188 is greater than 1.960, the conclusion is that μ1 is greater than μ2, if the 5% significance level is used. Of course, this is a foregone conclusion, because it is already established by the confidence interval that μ1 − μ2 is somewhere between 0.10 and 0.30. However, there is a reason for using this same set of data here as an illustration. The purpose is to show that the estimated standard error of ȳ1 − ȳ2 is not the same, when σ1² and σ2² are estimated in different ways. The standard error is equal to 0.0529 (Table 21.6b), if the two variances are estimated individually; and is equal to 0.0477 (Table 21.6c), if the two variances are estimated jointly.
TABLE 21.6c

  Item                     New         Old         Combination
  T                        320          60          380 = G (+)
  n                        400         100          500 = Σn (+)
  ȳ                     .800000     .600000         .760000 = G/Σn
  ȳ1 − ȳ2                                           .200000 (−)
  1 − ȳ                                             .240000
  ȳ(1−ȳ)                                            .182400
  1/n                   .002500     .010000         .012500 (+)
  ȳ(1−ȳ)(1/n1 + 1/n2)                               .002280
  Standard error                                    .047749 (√)
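Both ways of estimating the standard error, and the corresponding u-values, can be sketched in plain Python (a modern illustration of Equations 2 and 3 using the data of Table 21.6a); the sketch reproduces the standard errors 0.0529 and 0.0477 and the values u = 2.27 and u = 4.19.

```python
from math import sqrt

T1, n1 = 320, 400            # new formula: number killed, sample size
T2, n2 = 60, 100             # old formula
y1, y2 = T1 / n1, T2 / n2    # 0.80 and 0.60

# Equation (2): variances estimated individually (hypothesis mu1 - mu2 = 0.08).
se_ind = sqrt(y1 * (1 - y1) / n1 + y2 * (1 - y2) / n2)     # 0.0529
u_ind = ((y1 - y2) - 0.08) / se_ind                        # 2.27

# Equation (3): pooled estimate (hypothesis mu1 = mu2).
ybar = (T1 + T2) / (n1 + n2)                               # general mean, 0.76
se_pool = sqrt(ybar * (1 - ybar) * (1 / n1 + 1 / n2))      # 0.0477
u_pool = (y1 - y2) / se_pool                               # 4.188

print(u_ind, u_pool)
```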
An alternative method of testing the hypothesis that μ1 = μ2 is the chi square test of independence. This method involves finding the discrepancies between the observed frequencies f and the hypothetical frequencies h. The observed frequencies, as the name implies, are the numbers of successes and failures of the two samples. The hypothetical frequencies are the numbers of successes and failures required to satisfy the hypothesis that two means are equal. The two kinds of frequencies, in numerical as well as in notational terms, are shown in Table 21.6d. The numerical values of the observed frequencies are those of Table 21.6a and the notations are the same ones used all through this chapter. The number of successes of a sample, consisting of n observations, is the sum T of that sample. Therefore, the number of failures is n − T. The total of the two sample sums is the grand total G, the total number of successes for the two samples. As for the hypothetical frequencies, they
are derived from the observed frequencies in such a way that both sample means are equal to the general mean ȳ, with the sample sizes unchanged.

TABLE 21.6d

  Observed Frequencies, f

  Observation    Sample 1          Sample 2           Total
  Success        320  (T1)          60  (T2)          380  (G)
  Failure         80  (n1−T1)       40  (n2−T2)       120  (Σn−G)
  Total          400  (n1)         100  (n2)          500  (Σn)

  Hypothetical Frequencies, h

  Observation    Sample 1           Sample 2           Total
  Success        304  (n1ȳ)          76  (n2ȳ)         380  (G)
  Failure         96  (n1(1−ȳ))      24  (n2(1−ȳ))     120  (Σn−G)
  Total          400  (n1)          100  (n2)          500  (Σn)
When two hypothetical frequencies are added vertically, it is obvious that

nȳ + n(1−ȳ) = n.    (8)

When two hypothetical frequencies are added horizontally, it follows from Equation (6) that

n1ȳ + n2ȳ = (n1 + n2)ȳ = G,    (9)

and

n1(1−ȳ) + n2(1−ȳ) = (n1 + n2)(1−ȳ) = Σn − (Σn)ȳ = Σn − G.    (10)
Therefore, the marginal totals for the observed and hypothetical frequencies are the same. This fact may be utilized as a checking device for computation.

TABLE 21.6e

  Treatment   Observation     f       h      f−h     (f−h)²    (f−h)²/h
  1           Success        320     304      16      256        .8421
              Failure         80      96     −16      256       2.6667
  2           Success         60      76     −16      256       3.3684
              Failure         40      24      16      256      10.6667

  Total                      500     500       0               17.5439
The statistic for testing the hypothesis that μ1 = μ2 is

χ² = Σ (f−h)²/h,    (11)
with 1 degree of freedom. For the data of Table 21.6a, the value of χ² is equal to 17.54, with the details of computation shown in Table 21.6e. As expected, 17.54 is equal to the square of 4.188, the value of u given in Equation (7). This is not a coincidence. It is an algebraic identity that the square of the u of Equation (3) is equal to the χ² of Equation (11). Therefore, in testing the hypothesis that μ1 = μ2, either the u-test or the χ²-test may be used; the conclusions reached are always the same, because u² follows the χ²-distribution with 1 degree of freedom (Theorem 7.6). Of course, the u is a two-tailed test and the χ² is a one-tailed test. For the 5% significance level, the critical regions for u are u < −1.96 and u > 1.96; and the critical region for χ² is χ² > 3.84 (Section 7.6). The physical meaning of the number of degrees of freedom can be noted
from the hypothetical frequencies of Table 21.6d. The four marginal totals, n1, n2, G, and Σn − G, are fixed by those of the observed frequencies. When one of the four hypothetical frequencies, h, is determined, the other three are automatically determined. Therefore, the number of degrees of freedom is 1. The meaning of the number of degrees of freedom can also be noted from the discrepancies, f−h, of Table 21.6e. The four values are all equal to 16, with the only difference being in the sign. When one value is determined, the other three are automatically determined. If the first discrepancy were 9, the others would be −9, −9, and 9. Therefore, the number of degrees of freedom is 1. There is a short-cut method of computing the value of χ². The method is
χ² = [T1(n2 − T2) − T2(n1 − T1)]²(n1 + n2) / [(n1)(n2)(G)(Σn − G)].    (12)
For the data given in Table 21.6a,
χ² = [320(40) − 60(80)]²(500) / [(400)(100)(380)(120)] = 32,000,000,000/1,824,000,000 = 17.54.    (13)
The value obtained above is the same as that shown in Table 21.6e. The two versions of χ², as shown in Equations (11) and (12), are algebraically identical, despite their difference in appearance. Equation (12), however, is a computing short-cut, because it does not need the hypothetical frequencies. Large samples are required for the statistic u to follow the normal distribution, and consequently large samples are also required for u² to follow the χ²-distribution. To insure validity of the u-test or the χ²-test, the sample sizes should be large enough so that none of the four hypothetical frequencies is less than 5. This is the same working rule given in Section 21.3.
21.7 Test of Homogeneity of Means

The method of testing the hypothesis that two population means are equal is given in the preceding section. In this section, a method is developed to cope with the case of k samples. The hypothesis is that k population means are equal. This method serves the binomial populations in the same way that the analysis of variance serves the normal populations. As shown in Section 21.2, the sample means drawn from a binomial population follow approximately the normal distribution with mean equal to μ and variance equal to μ(1−μ)/n (Table 21.2b). Then the k sample means may be considered a sample of k observations drawn from a normal population with mean equal to μ and variance equal to σ²/n (Section 12.2). From this point of view, it can be seen that the statistic
χ² = Σ(ȳi − ȳ)²/(σ²/n) = nΣ(ȳi − ȳ)²/σ² = among-sample SS/[μ(1−μ)]    (1)
follows the χ²-distribution with k−1 degrees of freedom (Theorem 7.7a and Equation 2, Section 12.2). If the sample sizes are not equal, the statistic

χ² = Σ n(ȳi − ȳ)²/σ² = among-sample SS/[μ(1−μ)]    (2)
follows the χ²-distribution with k−1 degrees of freedom. The numerator

n1(ȳ1 − ȳ)² + n2(ȳ2 − ȳ)² + ... + nk(ȳk − ȳ)²
(3)
is still the among-sample SS (Equation 2, Section 12.10). The distributions of the statistics given in Equations (1) and (2) are derived from Theorem 7.7a; but these statistics cannot be used to test any hypotheses until the population variance is replaced by its estimate. Since the variance of a binomial population is equal to μ(1−μ), the quantity ȳ(1−ȳ) may be used as a pooled estimate of σ² (Equation 3, Section 21.6). This estimate is a good one, if the sample sizes n are large, or if the number of samples, k, is large, or if both the n's and k are large, because ȳ is an estimate based on Σn observations. With μ(1−μ) replaced by ȳ(1−ȳ), the new statistic is
"",t
2 X
n( f-1)2 = among-sample SS
y(1-f)
y(1-1)
(4)
which follows approximately the χ²-distribution, with k−1 degrees of freedom. This statistic may be used in testing the hypothesis that k population means are equal. An example may be used as an illustration of the purpose of the χ²-test. Table 21.7a shows the frequencies of the successes and failures of 4 samples. In application, the samples are the treatments; the successes and failures are the observations. Therefore, the numbers of successes and failures may be the numbers of insects killed and not killed; the 4 samples may be 4 kinds of insecticides, or 4 dosages of the same insecticide. In terms of germination rate, the numbers of successes and failures are the numbers of seeds germinated and not germinated;
the 4 samples may be 4 varieties of grass. In terms of medicine, the numbers of successes and failures are the numbers of patients responding favorably and unfavorably; the 4 samples may be 4 kinds of drugs. One can think of numerous examples of this sort, and the χ²-test can be used to determine whether the percentages of successes of the k treatments differ significantly. In statistical terms, the hypothesis being tested is that the k μ's are equal.

TABLE 21.7a

  Observation           Sample No.                        Total
                  1        2        3        4
  Success        154      172      146      244            716  (G)
  Failure         46       28       54      156            284
  Sample size    200      200      200      400          1,000  (Σn)

  Computation
  Item            1        2        3        4            Total
  Mean ȳ         .770     .860     .730     .610          .716000  (G/Σn)
  T²           23,716   29,584   21,316   59,536        512,656  (G²)
  T²/n         118.58   147.92   106.58   148.84        512.656  (G²/Σn)

  1−ȳ = .284000     ȳ(1−ȳ) = .203344
  Σ(T²/n) − G²/Σn = 9.264     χ² = 45.56

The short-cut methods of calculating the among-sample SS given in Sections 12.3 and 12.10 can be utilized in computing the value of χ². The among-sample SS is equal to

Σ(T²/n) − G²/Σn,    (5)
if the sample sizes are not the same; it reduces to
(T1² + T2² + ... + Tk²)/n − G²/(kn),    (6)
if the sample sizes are the same. The computing procedure is illustrated by the data given in Table 21.7a. The details of the computation are shown in the lower half of the table. The notations used here are the same ones used in the preceding sections. The number of successes in a sample remains the sum T of that sample. The among-sample SS is
Σ(T²/n) − G²/Σn = 118.58 + 147.92 + 106.58 + 148.84 − 512.656 = 9.264.
The general mean ȳ is equal to 716/1000 or 0.716. Therefore, the value of χ² is (Equation 4)

χ² = 9.264/[(.716)(.284)] = 9.264/.203344 = 45.56
with 3 degrees of freedom. Because of the magnitude of 45.56, the hypothesis is rejected and the conclusion is that the percentages of successes of the four treatments are not the same. Needless to say, this χ² is a one-tailed test. The hypothesis is rejected only because χ² is too large. If there are only two treatments, that is, k = 2, the χ² given in Equation (4) is exactly the same as those given in Equations (11) and (12) of Section 21.6. For the data of Table 21.6a,

χ² = [(320)²/400 + (60)²/100 − (380)²/500] / [(380/500)(120/500)] = 3.2/.1824 = 17.54,
which is the same value shown in Table 21.6e and Equation (13), Section 21.6. Thus another computing method is introduced for the two-sample case. Another method of testing the hypothesis that k population means are equal is the chi square test of independence, which is described in the preceding section. The computing method is the same as that given in Tables 21.6d and 21.6e, except that the number of samples is changed from 2 to k. For the data given in Table 21.7a the value of χ² computed by this method is 45.56 (Table 21.7b), which is the same as that obtained previously (Table 21.7a). Therefore, these two χ²-tests are really the same one, despite their difference in terminology, notations, and computing methods.
TABLE 21.7b

  Sample No.   Observation     f        h        f−h      (f−h)²     (f−h)²/h
  1            Success        154     143.2      10.8     116.64       0.8145
               Failure         46      56.8     −10.8     116.64       2.0535
  2            Success        172     143.2      28.8     829.44       5.7922
               Failure         28      56.8     −28.8     829.44      14.6028
  3            Success        146     143.2       2.8       7.84       0.0547
               Failure         54      56.8      −2.8       7.84       0.1380
  4            Success        244     286.4     −42.4   1,797.76       6.2771
               Failure        156     113.6      42.4   1,797.76      15.8254

  Total                      1000    1000         0.0                 45.5582
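Both computing methods of this section can be sketched in Python (a modern illustration; SciPy assumed available); each yields the value 45.56 for the data of Table 21.7a.

```python
from math import fsum
from scipy.stats import chi2_contingency

T = [154, 172, 146, 244]        # successes in each sample
n = [200, 200, 200, 400]        # sample sizes
G, total = sum(T), sum(n)
ybar = G / total                # general mean, 0.716

# Equation (4): among-sample SS divided by ybar(1 - ybar).
among_ss = fsum(t * t / m for t, m in zip(T, n)) - G * G / total
print(among_ss / (ybar * (1 - ybar)))      # 45.56, with k - 1 = 3 degrees of freedom

# The chi-square test of independence on the 2 x 4 table gives the same value.
table = [T, [m - t for t, m in zip(T, n)]]
chi2, p, df, expected = chi2_contingency(table, correction=False)
print(chi2, df)                            # 45.56, 3
```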
21.8 Analysis of Variance Versus χ²-Test

The choice of a statistic in testing the hypothesis that k population means are equal depends on the kind of populations involved. For the normal populations, the statistic used is

F = [Among-sample SS/(k−1)] / [Within-sample SS/(Σn−k)] = Among-sample MS/Within-sample MS    (1)
with k−1 and Σn−k degrees of freedom; for the binomial populations, the statistic used is

χ² = Among-sample SS/[ȳ(1−ȳ)]    (2)
with k−1 degrees of freedom. There is undoubtedly a great deal of similarity between these two statistics. For the purpose of comparison, the χ² may be expressed in terms of F (Theorem 9.7); that is,

F′ = [Among-sample SS/(k−1)] / [ȳ(1−ȳ)] = Among-sample MS/[ȳ(1−ȳ)]    (3)
with" - 1 and 00 degrees of freedom. Note that a prime is attached to the above F to distinguish it from the F of the analysis of variance. Observing Equations 0) and (3), one will notice that the numerators of F and F' are the same; the denominators, though different, are both pooled estimates of the population variance. For normal populations, q2 is estimated directly by Sl, the error mean square; for binomial populations, q2 is estimated through the general mean y, because q2 :::: ,.,. (1 - ,.,.). Therefore, the basic principles underlying these two tests, )(2 and F, are the
same. The only difference between them is the manner in which σ² is estimated: directly through s² for normal populations and indirectly through ȳ for binomial populations. It would be interesting to see what would happen if the variance σ² of the binomial populations is estimated by s² instead of by ȳ(1−ȳ). The relation between s² and ȳ(1−ȳ) can be readily seen through a numerical example. Table 21.8a shows 4 samples, with the observations given individually rather than in the familiar form of a frequency table. The analysis of variance of the data is shown in the lower part of the table.
TABLE 21.8a

  Item                          Sample No.
                 (1)          (2)             (3)             (4)       Total
  Observations   1, 1, 0, 0   1, 0, 1, 1, 0   1, 0, 0, 1, 0   1, 1, 0, 0
  T               2            3               2               2          9  (G)
  n               4            5               5               4         18  (Σn)
  T²              4            9               4               4         81  (G²)
  T²/n           1.0          1.8             0.8             1.0        4.5 (G²/Σn)

  Analysis of Variance

  Source of Variation      SS      DF       MS
  Among-sample             0.1       3     0.0333
  Within-sample            4.4      14     0.3143
  Total                    4.5      17     0.2647
The computation is very simple, because y² = y for y = 0 or 1. The quantity Σy² for all Σn observations is simply equal to the grand total G. By the computing method given in Section 12.10, the among-sample SS is
Σ(T²/n) − G²/Σn = 1.0 + 1.8 + 0.8 + 1.0 − 4.5 = 0.1;    (4)

the within-sample SS is

Σy² − Σ(T²/n) = G − Σ(T²/n) = 9 − 1.0 − 1.8 − 0.8 − 1.0 = 4.4;    (5)
the total SS is

Σy² − G²/Σn = G − G²/Σn = 9 − 4.5 = 4.5.    (6)
It is through these components of SS that the relation between s² and ȳ(1−ȳ) can be established. The total mean square, which has never been used in the analysis of variance, is approximately equal to ȳ(1−ȳ). When the total SS is divided by Σn instead of Σn − 1, its number of degrees of freedom, the result is (Equation 6)
(1/Σn)(G − G²/Σn) = G/Σn − (G/Σn)² = ȳ − ȳ² = ȳ(1−ȳ).    (7)
Therefore, the total mean square is only slightly greater than ȳ(1−ȳ). For the data of Table 21.8a, ȳ(1−ȳ) = .5(1−.5) = 0.25 as compared with 0.2647, the total mean square shown in Table 21.8a. The difference between 0.25 and 0.2647, though small, is still misleadingly large, because Σn is only equal to 18. The difference in these numbers is induced by the difference in the divisors, Σn and Σn − 1. Therefore, the larger the total number of observations, the smaller the difference between the total mean square and ȳ(1−ȳ). Since the χ²-test is used only for large samples and never for samples as small as those shown in Table 21.8a, the total mean square and ȳ(1−ȳ) may be considered equal for all practical purposes. Now the relation between s² and ȳ(1−ȳ) is established. The former is the within-sample mean square, while the latter is approximately the total mean square of the analysis of variance. Therefore the χ²-test is equivalent to the analysis of variance, with the
total mean square being used as the error term. The practice of using the total mean square as the error term in the analysis of variance may seem shocking at first, but it is not nearly so bad as it appears. The consequences of this practice are illustrated by Table 21.8b, which shows the analysis of variance of the data given in Table 21.7a. The total mean square is the weighted average of the among-sample and within-sample mean squares, with their numbers of degrees of freedom being the weights.

TABLE 21.8b

  Source of Variation       SS        DF       MS        F        F′
  Among-sample              9.264       3     3.0880   15.84    15.17
  Within-sample           194.080     996      .1949
  Total                   203.344     999      .2035
In terms of the mean squares given in Table 21.8b, the relation among them is

$$0.2035 = \frac{3(3.0880) + 996(.1949)}{3 + 996}. \qquad (8)$$
The above equation is obviously true, because a mean square multiplied by its number of degrees of freedom is equal to the SS. Because of this relation, the magnitude of the total mean square must lie somewhere between those of the among-sample and within-sample mean squares. But the total mean square is much closer to the within-sample mean square, because its number of degrees of freedom, Σn - k, is much larger than k - 1, and therefore the within-sample mean square carries more weight in the weighted average (Table 21.8b). If F is greater than 1, that is, if the among-sample mean square is greater than the within-sample mean square, the total mean square, being between the two values, is also greater than the within-sample mean square. Therefore F' (Equation 3), being obtained from a larger divisor, is less than F (Table 21.8b). But the difference is slight for large samples, because large samples make Σn - k much greater than k - 1, and consequently make the total mean square and the within-sample mean square more alike (Table 21.8b). For samples not very large, the difference in F and F' is further compensated by the difference in their critical regions. The critical region of F', which has ∞ degrees of freedom for the denominator, is determined by the values given in the bottom line of the F table; while that of F is determined by the larger values given in the upper lines. Therefore, to be significant, F needs a larger value than F' does.
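As a sketch of this comparison (the helper below and its name are my own; grouped 0-1 data are assumed to be given as (successes, sample size) pairs), both F and F' can be computed from the SS identities used above. Running it on the four samples of Table 21.9a shows F' falling below F, as argued here.

    # Compute F (within-sample MS as error) and F' (total MS as error).
    def f_and_f_prime(groups):
        """groups: list of (T, n), with T successes out of n trials."""
        k = len(groups)
        G = sum(t for t, n in groups)
        N = sum(n for t, n in groups)
        among = sum(t * t / n for t, n in groups) - G * G / N
        total = G - G * G / N        # sum of y^2 is G for 0-1 data
        within = total - among
        ms_among = among / (k - 1)
        return ms_among / (within / (N - k)), ms_among / (total / (N - 1))

    # Example: the four samples of 200 of Table 21.9a.
    F, F_prime = f_and_f_prime([(124, 200), (96, 200), (108, 200), (82, 200)])
    print(F, F_prime)   # F exceeds F'; with 800 observations they are close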
All the discussions so far indicate that the χ²-test and the analysis of variance usually yield the same conclusion in testing the hypothesis that k population means are equal. Due to the fact that F' is less than F when F is greater than 1, the χ²-test will not reject the hypothesis as frequently as the analysis of variance will. Therefore, the χ²-test seems to have a slightly higher probability of committing a Type II error than the analysis of variance does. However, one must realize that the F-test is not beyond reproach, because the populations under consideration are binomial and not normal. It is known that the analysis of variance tends to reject the true hypothesis more frequently than the significance level specified, if the populations are not normal (Section 12.7). Therefore, the F-test seems to have a higher probability of committing a Type I error than the χ²-test does. Now the discussion about the χ²-test versus the F-test can be boiled down to this: For binomial populations, either method may be used in testing the hypothesis that k population means are equal. The conclusions reached by the two methods are usually the same. On rare occasions when they do not agree, one test may be as much at fault as the other. After all, both of them are approximate tests in the sense that the validity of the χ²-test requires large samples and the validity of the F-test requires normal populations. Neither of these requirements is entirely fulfilled in sampling from
binomial populations. Therefore, one has no basis for favoring one over the other. The discussion about F and F' is equally applicable to t and u, where

$$t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

(Equation 1, Section 10.6), and

$$u = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\bar{y}(1-\bar{y})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

(Equation 3, Section 21.6). Here the contrast is also between s² and ȳ(1 - ȳ). This is, of course, to be expected, because t² = F and u² = F' (Theorems 7.6 and 9.7).
21.9 Individual Degree of Freedom

The χ²-test described in the preceding sections is used in testing the general hypothesis that the k population means are equal. For testing a more specific hypothesis an individual degree of freedom may be used. The among-sample SS (Equation 4, Section 21.7) may be broken down into k - 1 components, each of which is to be divided by ȳ(1 - ȳ). As an illustration, consider the data of Table 21.9a, in which the four samples are the answers, affirmative or negative, given to a certain question by four groups of 200 students:

1: College girls
2: College boys
3: High school girls
4: High school boys

By the method of individual degrees of freedom, one can determine (1) whether there is a significant difference in the percentages of the affirmative answers between the boys and girls; (2) whether there is a significant difference in the percentages of the affirmative answers between college and high school students; (3) whether the interaction is significant. In short, this is a 2 × 2 factorial experiment. The absence of interaction, in this case, implies that the difference in the percentages of affirmative answers between the boys and girls remains the same, regardless of whether they are college or high school students.
TABLE 21.9a

                      Sample No.
Observation        1      2      3      4     Total
Success          124     96    108     82      410
Failure           76    104     92    118      390
Sample Size      200    200    200    200      800

Preliminary Calculation

Item               1         2         3         4        Total
Mean ȳ            .62       .48       .54       .41       .5125
T²             15,376     9,216    11,664     6,724     168,100 (= G²)
T²/n            76.88     46.08     58.32     33.62     210.125 (= G²/Σn)

χ²-Tests

Source of Variation          SS       DF     χ²      χ².05
Among-sample               4.7750      3    19.12     7.81
College vs. high school    1.1250      1     4.50     3.84
Boys vs. girls             3.6450      1    14.59     3.84
Interaction                0.0050      1      .02     3.84
ȳ(1 - ȳ)                   0.2498

Source of Variation          SS       DF     χ²      χ².05
Among-sample               4.7750      3    19.12     7.81
Linear regression          3.2490      1    13.01     3.84
Deviation from linearity   1.5260      2     6.11     5.99
ȳ(1 - ȳ)                   0.2498
The details of the computation are shown in Table 21.9a. The among-sample or treatment SS is

$$\sum\frac{T^2}{n} - \frac{G^2}{\sum n} = 76.88 + 46.08 + 58.32 + 33.62 - 210.125 = 4.775. \qquad (1)$$
The SS for college versus high school students is (Equation 11, Section 15.3)

$$Q_1 = \frac{(124 + 96 - 108 - 82)^2}{200[(1)^2 + (1)^2 + (-1)^2 + (-1)^2]} = \frac{(30)^2}{800} = 1.125. \qquad (2)$$

The SS for boys versus girls is

$$Q_2 = \frac{(124 - 96 + 108 - 82)^2}{200[(1)^2 + (-1)^2 + (1)^2 + (-1)^2]} = \frac{(54)^2}{800} = 3.645. \qquad (3)$$

The SS for interaction is

$$Q_3 = \frac{(124 - 96 - 108 + 82)^2}{200[(1)^2 + (-1)^2 + (-1)^2 + (1)^2]} = \frac{(2)^2}{800} = 0.005. \qquad (4)$$
The sum of the SS-values due to the three individual degrees of freedom is equal to the among-sample SS. Then four χ²-values can be computed by dividing each of the four SS-values by ȳ(1 - ȳ). Therefore, the computing procedure is the same as that for the analysis of variance. The only difference is that the error term is ȳ(1 - ȳ) instead of s². The conclusions drawn from Table 21.9a are that (1) a higher percentage of girls than boys are in favor of the issue; (2) a higher percentage of college students than high school students are in favor of the issue; (3) there is no interaction between the sex and the level of the students.

The method of individual degree of freedom, of course, includes the linear regression, if the treatment is quantitative. The data of Table 21.9a may be used as an example. Suppose the samples 1, 2, 3, 4 represent the responses of freshmen, sophomores, juniors, and seniors among the college students. One may determine whether the percentage of the affirmative answers is related to the grade (x) of the students. Here one may test the hypothesis that β = 0. Another interpretation may be imposed on the same set of data. The four samples may represent grass seeds of 1, 2, 3, and 4 years of age. The successes and failures are the germinated and non-germinated seeds. Here one may determine the effect of age (x) of the seed on its germination rate. Whatever the interpretation of the data, the among-sample SS may be broken down into two components, the SS due to linear regression and that due to deviation from linearity. The advantage and purpose of this practice are discussed in Section 17.8 and therefore are not repeated here. Since the x-values are equally spaced, the SS due to linear regression is (Equation 3, Section 17.8, and Table 17.8)
$$Q = \frac{[-3(124) - 1(96) + 1(108) + 3(82)]^2}{200[(-3)^2 + (-1)^2 + (1)^2 + (3)^2]} = \frac{(-114)^2}{4000} = 3.249. \qquad (5)$$
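The same component formula, with the equally spaced multipliers -3, -1, 1, 3 of Table 17.8, reproduces Equation (5); a brief sketch:

    # SS due to linear regression via the individual degree of freedom.
    T = [124, 96, 108, 82]
    M = [-3, -1, 1, 3]
    Q = sum(m * t for m, t in zip(M, T)) ** 2 / (200 * sum(m * m for m in M))
    print(Q)   # 3.249; dividing by 0.2498 gives the chi-square value 13.01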
The SS due to deviation from linearity is obtained by subtraction. With the components computed, one can make the χ²-tests as shown in the bottom of Table 21.9a. If the 5% significance level is used, the χ²-value of 13.01 indicates that the percentage of success decreases with x (SP is negative). The value of 6.11 indicates that the rate of decline is not linear. In terms of the age of the grass seeds, the conclusion is that the germination rate decreases as the seeds become older, but that the decline is not the same from one year to the next.

It is not uncommon to use the straightforward linear regression on a similar set of data. The ages, 1, 2, 3, and 4, of the seeds are considered the values of x. Either the numbers (T) or the percentages of the germinated seeds are considered the values of y. Then the hypothesis
that β = 0 is tested. This practice can hardly be condoned, because it throws away valuable information. As evidence, the regression analysis is carried out on the data of Table 21.9a, with the age of seeds (sample No.) as x and the number of germinated seeds (T) as y. The results of the analysis are given in Table 21.9b. Comparing the analysis of variance given in this table with that of the bottom of Table 21.9a, one will note that the three components of Table 21.9b are 200 times the corresponding ones of Table 21.9a. The corresponding components can be identified by their numbers of degrees of freedom, and the factor 200 is the sample size n. Therefore, the use of straightforward linear regression amounts to ignoring the ȳ(1 - ȳ) term and using the deviation from linearity as the error term. This practice is not bad, if the number (k) of samples is large and the regression is really linear. But here, k is equal to 4; the number of degrees of freedom for the residual SS is only 2. In addition, the regression is not linear. As a result, the F-value of 4.26 (Table 21.9b) misleads one to the conclusion that the germination rate does not change with the age of the seeds. The Type II error is made, not because the data are inadequate, but because the information available is not fully utilized. This example constitutes another evidence of the usefulness of the method of individual degree of freedom.
TABLE 21.9b

k              4
Σx            10          ΣT            410
(Σx)²        100          (ΣT)²     168,100
(Σx)²/k       25          (ΣT)²/k    42,025
Σx²           30          ΣT²        42,980
SS of x        5          SS of T       955
(Σx)(ΣT)    4100
(Σx)(ΣT)/k  1025
ΣxT          968
SP           -57

The notations k and T are equivalent to the n and y of Table 16.9b.

Analysis of Variance

Source of Variation     SS      DF     MS      F      F.05
Regression            649.8      1   649.8    4.26    18.5
Residual              305.2      2   152.6
Total                 955.0      3
The computing methods used here for the individual degrees of freedom are quite simple, because the sample sizes are equal. If they were not equal, the method given in Section 19.6 would have to be used, and the computation would become more complicated.

21.10 Summary and Remarks

In sampling from normal populations, the basic assumptions are that (1) the samples are random; (2) the populations are normal; and (3) the
population variances are equal (Section 12.7). In this chapter, the only assumption underlying the various methods is that the samples are random. The other two assumptions are not required. A binomial population is recognizable on sight. There is no guesswork involved. Furthermore, a population is often deliberately made binomial by the experimenter himself. Then he certainly knows what he is dealing with. Therefore, one never needs to assume a population to be binomial. The use of the pooled estimate of the population variance is dictated by the hypothesis, rather than by an arbitrary assumption. When the hypothesis states that the means of the k populations are equal, the variances are bound to be equal, because in a binomial population σ² = μ(1 - μ). Therefore, there is no assumption regarding the variances. In conclusion, the only assumption underlying the χ²-tests presented in this chapter is that the samples are random.

The χ²-tests presented in this chapter are similar to the analysis of variance in many respects. Both methods are used to test the same hypotheses. The only difference between them is that the χ²-tests deal with binomial populations while the analysis of variance deals with normal populations. As to computation, the now familiar procedures of the analysis of variance are equally applicable to the χ²-test, if one keeps two principles in mind: (1) The number of successes in a sample is the sample sum T, which has been called the "treatment total" in the analysis of variance. (2) The error term used is ȳ(1 - ȳ), rather than the error mean square of the analysis of variance. The subject matter in this chapter is limited to the completely randomized one-factor experiments. But it can be easily extended to other topics, such as the factorial experiment, hierarchical classification, and the analysis of covariance. So long as one remembers the two principles listed in the preceding paragraph, he will not run into any difficulty in computation. If the sample sizes are not equal, the computing procedure becomes necessarily more complicated. But this is also true for the analysis of variance. Since the χ²-test and the analysis of variance are so similar, there is really no need to repeat the various topics for binomial populations.

In conducting an experiment, an effort should be made to equalize the sample sizes. The advantages of this practice are already given in Section 12.11. However, most of the examples given in this chapter are of unequal sample sizes. The reason for using them is to illustrate the general computing procedure. If one can handle the data of unequal sample sizes, he is bound to be able to handle those of equal sample sizes. These examples do not constitute a recommendation to use unequal sample sizes deliberately in planned experiments.

The methods presented in this chapter require large samples. The sample size should be large enough so that all the hypothetical frequencies are greater than or equal to 5. When this condition is not satisfied,
other methods are required. Some of these methods are given in Chapter
23.

The computing methods of this chapter require many decimal places. The maximum value of ȳ is only 1, and that of ȳ(1 - ȳ) is 0.25. One practically deals with nothing but decimal places. Perhaps because of the habit acquired in dealing with dollars, many people are reluctant to carry more than two decimal places in any computation. But here one cannot afford to be stingy: to insure some accuracy in the final value of χ², the value of ȳ or ȳ(1 - ȳ) must be carried to at least six decimal places.

The sampling from the binomial population is one of the oldest topics in statistics. In fact, it is so well known that even many college algebra textbooks have a chapter on this topic, under the heading of "probability." Since it is chronologically developed first, it is also traditionally presented at the beginning of a statistics book. As a result, the more advanced statistical methods, such as the factorial experiment and the individual degree of freedom, are not often associated with the sampling from binomial populations. This book reverses the traditional order of presentation and explains the topic in terms of the analysis of variance. Thus it becomes possible to give the topic a comparatively more thorough treatment in a limited space. Furthermore, this method of presentation integrates two major topics of statistics, the sampling from normal and from binomial populations. The link between these topics is generally known among statisticians; yet somehow this knowledge is not usually made available to beginners.

With the advantage of this unconventional method of presentation goes a disadvantage. The notations and terminology used in this chapter are those of the analysis of variance, rather than those traditionally associated with the binomial population. This departure from convention saves the beginner the trouble of learning a new set of notations, terminology, and even concepts, but it may also handicap him in reading other books, or even the later chapters of this book. It is to preclude this handicap that the traditional notations and terminology are introduced in this section.

The population mean μ is called the probability of obtaining a success in a single trial. It is frequently represented by the letter p. The quantity 1 - μ is called the probability of obtaining a failure in a single trial. Then q = 1 - p and p + q = 1. The binomial population is usually ignored. The attention is centered around the binomial distribution, the distribution of the sample sum T. The mean and variance of this distribution are nμ and nμ(1 - μ), which, in traditional notations, are np and npq respectively. The sample mean ȳ is represented by a different version of p, such as p̂ or p̄. Then it follows that 1 - ȳ is represented by q̂ or q̄. However, to
the annoyance of beginners, these notations are by no means universally used. As to distinguishing μ from ȳ, the situation is really chaotic. The parameter and the statistic are represented by a pair of letters, such as p and p̂ or p and p̄. Sometimes the representation is reversed; that is, μ and ȳ are represented by p̂ and p or p̄ and p. Therefore, the first thing for one to do in reading anything about the binomial distribution is to distinguish the parameter and the statistic. Unfortunately, however, this suggestion is not always helpful. Some writers use the same letter p to represent both μ and ȳ. In recent years there seems to be a tendency to represent μ and ȳ by π and p. This is decidedly an improvement. This pair of notations, π and p, rather than μ and ȳ, is most likely to become popular in the future, because this choice of notations does not discard the time-honored p, yet π and p are clearly distinguishable. Furthermore, the choice of π and p follows the tradition of using a Greek letter for the
parameter and a Latin letter for the statistic. The difference in the two systems of notations, as typified by μ and π, is more than just the difference in the use of letters. There is also a difference in concepts involved. The letter μ represents the population mean; the letter π represents the relative frequency of successes of the population. It is only because of the way in which the observation y is defined that μ and π become the same. All the methods presented in the preceding chapters deal with parameters such as μ and σ²; most of the methods presented in the following chapters deal with relative frequencies. This chapter is an amphibious one. It is used as a transition from parameters to frequencies. Therefore, it seems to be desirable to use dual notations for sampling from the binomial population. The corresponding quantities and notations for the two systems are listed in Table 21.10. It is fairly easy to identify the corresponding notations of the two systems. As long as μ and ȳ are known to be equivalent to π and p, the

TABLE 21.10

                      Mean        Frequency
Population mean       μ           π          r.f. of successes of population
Population variance   μ(1 - μ)    π(1 - π)
Sample total          T           np         no. of successes in a sample
Sample mean           ȳ           p          r.f. of successes in a sample
Grand total           G           Σnp        total no. of successes in k samples
General mean          ȳ           p̄          r.f. of successes in k samples combined
other equivalent terms become known automatically. To cite a few examples: 1 - ȳ is 1 - p; the general mean ȳ is p̄; and its complement 1 - ȳ is 1 - p̄. The pooled variance ȳ(1 - ȳ) is p̄(1 - p̄). The number of successes in a sample, or the sample sum, T or nȳ, becomes np. The number of failures in a sample, n - T, becomes n(1 - p). The hypothetical frequencies for the test of goodness of fit, nμ and n(1 - μ), become nπ and n(1 - π). Those for the test of independence, nȳ and n(1 - ȳ), become np̄ and n(1 - p̄). Incidentally, the hypothetical frequencies are more commonly called expected frequencies or theoretical frequencies. All of these terms are equally adequate and equally misleading. As in any field of science, the non-technical meaning of the words must be ignored, and one must pay especially close attention to the exact technical definition of the term, not because scientists are queer, but because of the way language works. When a technical term is first introduced, it usually has a simple specific reference. But as knowledge of the field develops, the term acquires additional meaning or meanings and tends to become ambiguous. In short, the principles of semantics operate in technical as well as in non-technical language. In every-day language, the word "car" means automobile to almost everybody, if not everybody, in the United States. This word was coined centuries before the automobile was invented and originally meant something like our word "cart;" but nowadays few, if any, Americans use the two terms interchangeably. The expert in semantics could cite many other examples. The surnames Smith and Carpenter, for instance, do not remotely suggest the trades practiced by the original bearers of these names. And the term "lady" no longer means one who kneads bread, as it originally did. These examples explain why scientific terms sometimes do not say what they mean. Therefore it is often desirable to ignore the non-technical meanings of the technical terms.
EXERCISES

(1) A population consists of the following observations:

1, 1, 0, 0, 1, 1, 1, 0, 1, 0.

(a) Find the mean μ and variance σ² in the usual manner (Equation 1, Section 2.2; Equation 1, Section 2.3). (b) Organize the observations into a frequency table. Find the mean and variance in terms of the frequencies. Note that the values obtained here are the same as those of Part (a). (c) Draw a histogram for the population. The purpose of this exercise is to verify Theorem 21.1.

(2) A random sample of 18 red and 22 white beads is drawn from a population consisting of 50% red beads. (a) Pretending the source of the sample is unknown, test the hypoth-
esis that μ or π is equal to 0.2 at the 5% level. Is your conclusion correct? (b) Use both the u-test and the χ²-test of goodness of fit. Show that u² = χ². (c) Find the 95% confidence interval of μ.

(3) In a public opinion survey on a college campus, 51 persons answered "yes" and 49 persons answered "no" to a certain question. Is this sufficient evidence to claim that the majority of the students are in favor of the issue?

(4) A certain chemical killed 123 out of 200 insects. Find the 95% confidence interval of the survival rate. State the conclusion in plain English.

(5) The seeds of garden peas may be smooth or wrinkled. Crosses between the varieties of smooth and wrinkled seeds produce all smooth seeds. Subsequent crosses between these smooth-seeded plants produced 518 smooth and 170 wrinkled seeds. Test the hypothesis that the numbers of smooth and wrinkled seeds are in the ratio of 3 to 1, that is, the relative frequency of wrinkled seeds is equal to 0.25.

(6) Sample No. 1 consists of 16 successes and 14 failures. Sample No. 2 consists of 10 successes and 15 failures. Test the hypothesis that the two samples are drawn from the same population at the 5%
level. (a) Use both the u-test and the χ²-test of independence. Show that u² = χ². (b) Find the 95% confidence interval of μ₁ - μ₂.

(7) In an effort to reduce the cost, both to individuals and the company, of failures on a job training course, a brief screening examination was developed. It was given to the next 300 candidates for the training program. No attention was paid to their scores in determining their entrance. The performance of the candidates in the screening examination and the training program can be summarized as follows:

                           Training Program
Screening Examination     Passing     Failure
Passing                     217           3
Failure                      38          42
Is the examination score related to the training performance?

(8) In evaluating a new treatment for a disease, as compared to the standard one, a number of patients are divided, at random, into two groups. The new treatment is applied to one group and the standard one to the other. Out of 150 patients receiving the new treatment, 122 responded favorably. Out of the same number of patients receiving the standard treatment, 119 responded favorably. Is the new treatment more effective than the standard one?
(9)
                  Sample No.
Observation     1     2     3     4
Success        24    15    19    22
Failure        26    35    31    28
(a) Four random samples are drawn from four different populations. Test the hypothesis that the percentages of successes of the populations are the same. Use the 5% level and the chi-square test of independence. Give the critical region and state your conclusion. (b) Test the same hypothesis by the analysis of variance. Give the critical region and state your conclusion. (c) Compute the chi-square value by dividing the among-sample SS by ȳ(1 - ȳ). Show that the value thus obtained is the same as that obtained in Part (a).

(10) Each of a group of cigarette smokers who regularly smoke different ones of 4 well-known brands is blindfolded and given 4 cigarettes, one of each of these brands, and asked to decide which of these cigarettes is the one he regularly smokes. Each smoker is told what the possible brands are, but not their order, which is randomized. The result is as follows:

Brand Regularly      Judgements
Smoked             Right    Wrong
A                    13       30
B                    17       22
C                    20       37
D                     7       26
Is there a significant difference in judgemental accuracy depending on which brand one regularly smokes?

(11) A manufacturer is making an effort to reduce the percentage of the defective articles he produces. To evaluate his effort, a sample of 400 articles was inspected on five successive days. The result is as follows:

                            Day
Observation      1st    2nd    3rd    4th    5th
Defective         41     36     32     27     22
Non-defective    359    364    368    373    378
(a) Test the hypothesis that the defective rate remained the same all through the five days at the 5% level. What is your conclusion? (b) Test the same hypothesis by the individual degree of freedom due to linear regression. Use the multipliers -2, -1, 0, 1, and 2. What is your conclusion? (c) Through his effort, the manufacturer knows that the defective rate cannot increase. Therefore, the one-tailed u-test can also be used. The statistic u is the square root of the χ² of Part (b). The critical region of the one-tailed test is u < -1.645 for the 5% significance level. (d) If the three tests yield different conclusions, which one do you trust? Why? Which one of the three tests is the most powerful one? (e) Estimate the difference in the defective rates between the first and the fifth day by a 95% confidence interval.

(12) The defective rates of the products made by the manufacturers A, B, C, and D are given in the following table.

Manufacturer    Defective    Non-defective
A                   13            289
B                   28            270
C                   40            265
D                   15            289
(a) Test the hypothesis that the defective rates of the products made by the four manufacturers are the same. (b) Find the 95% confidence interval of the difference between the defective rates of A and B. (c) Find the 95% confidence interval of the defective rate of the product made by the manufacturer A.
QUESTIONS

(1) Suppose a population consists of 1600 white and 2400 red beads; what is the mean and variance of this population? What did you use as the definition of the observation y?
(2) What are the minimum and maximum values of the mean and variance of a binomial population?
(3) How many parameters does a binomial population have? What are they?
(4) What is a binomial population? How does it differ from a binomial distribution?
(5) A dead insect is considered a success in the examples given in the text. If the living insects were considered a success, what is the effect of this change of definition on the value of χ²? Why?
(6) What is the sample sum in terms of the frequencies?
(7) (a) What is the hypothesis of the chi-square test of goodness of fit? (b) How many samples are involved in this test?
(8) (a) What is the hypothesis of the chi-square test of independence? (b) How many samples are involved in this test?
(9) What is the purpose of using the individual degree of freedom?
(10) If the notations μ and ȳ are changed into π and p, what are the equivalents of ȳ(1 - ȳ) for a single sample and for the k samples combined?
REFERENCES

Cochran, W. G.: "The χ²-Test of Goodness of Fit," Annals of Mathematical Statistics, Vol. 23, pp. 315-345, 1952.
Cochran, W. G.: "Some Methods for Strengthening the Common χ² Tests," Biometrics, Vol. 10, pp. 417-451, 1954.
Snedecor, G. W.: Statistical Methods, Fourth Edition, 1946.
CHAPTER 22

SAMPLING FROM MULTINOMIAL POPULATION

This chapter is essentially an extension of the preceding one. Instead of dealing with the binomial population which consists of only two kinds of observations, this chapter discusses the multinomial population whose observations can be classified in more than two categories. However, the basic techniques remain the same. The χ²-test of goodness of fit is used on the one-sample case, while the χ²-test of independence is used on the k-sample case.

22.1 Multinomial Population

A multinomial population is a set of observations which can be classified into a finite number of categories. The number of categories may be designated by the letter r. If r = 2, the multinomial population becomes a binomial population. For example, the five grades A, B, C, D, and F given to students at the end of a term constitute a multinomial population with r = 5. If the grades consist only of passing and failing, the population becomes binomial. In answering a question, if the answers "yes", "no", and "I don't know" are permitted, the answers constitute a multinomial population with r = 3. If the answers are restricted to only "yes" and "no," the population becomes binomial. There is an abundance of examples of multinomial populations in everyday life. Cars may be classified by their makes, by their body styles, or by their colors. Houses may be classified by their architectural styles, or by their constructional material. Skilled workers may be classified by their trades. Army officers may be classified by their ranks. People may be classified by their races. Whenever the observations, such as the different makes of cars, can be divided into a finite number of categories, they constitute a multinomial population.

A multinomial population is described by the relative frequencies of the observations in the r categories. These relative frequencies are designated by π₁, π₂, ..., and πᵣ. Suppose 5% of the students receive the grade A, 20% receive B, 50% receive C, 20% receive D, and 5% receive F. Then π₁ = .05, π₂ = .20, π₃ = .50, π₄ = .20, and π₅ = .05. The sum of these relative frequencies π is equal to 1. If r = 2, the relative frequencies π₁ and π₂ are equivalent to π and (1 - π) of a binomial population.
22.2 Test of Goodness of Fit

The test of goodness of fit is first introduced in Section 21.4 to test the hypothesis that the relative frequency of successes of a binomial
population is equal to a given value. The same test may be used in testing the hypothesis that the r relative frequencies, π₁, π₂, ..., πᵣ, of a multinomial population are equal to certain specified values. With a random sample of n observations, the statistic used to test this hypothesis is

$$\chi^2 = \sum\frac{(f - h)^2}{h} = \frac{(f_1 - h_1)^2}{h_1} + \frac{(f_2 - h_2)^2}{h_2} + \cdots + \frac{(f_r - h_r)^2}{h_r}. \qquad (1)$$
For large samples, this statistic approximately follows the χ²-distribution with r - 1 degrees of freedom, if the hypothesis is correct. The f's, which are the numbers of observations falling into the r categories, are called the observed frequencies. The sum of these frequencies is equal to n. The h's, which are equal to nπ₁, nπ₂, ..., and nπᵣ, are called the hypothetical frequencies. The sum of the h's is also equal to n, because the sum of the π's is equal to 1.

A sampling experiment conducted by the author may be used to verify the fact that the statistic χ² given in Equation (1) approximately follows the χ²-distribution with r - 1 degrees of freedom. A multinomial population consists of 4000 beads, of which 1600 are red, 1200 are blue, and 1200 are white. The observations are the 3 colors, red, blue, and white. The relative frequencies of the three categories (colors) are 0.4, 0.3, and 0.3, the sum of which is equal to 1. From this population, 1000 random samples, each consisting of 20 observations, are drawn. For each sample, the numbers of red, blue, and white beads are recorded. The average numbers of red, blue, and white beads for the 1000 samples are 7.98, 5.99, and 6.03, respectively. These values are approximately equal to 8, 6, and 6, which are nπ₁, nπ₂, and nπ₃ respectively. If all possible samples were drawn, the average frequencies would be exactly 8, 6, and 6.

For each of the 1000 samples, the statistic χ² is computed. Suppose f₁, f₂, and f₃ are the numbers of red, blue, and white beads in a sample. The statistic for that sample is
$$\chi^2 = \frac{(f_1 - 8)^2}{8} + \frac{(f_2 - 6)^2}{6} + \frac{(f_3 - 6)^2}{6}. \qquad (2)$$
For example, the first sample consists of 10 red, 4 blue, and 6 white beads. The statistic for this sample is
$$\chi^2 = \frac{(10 - 8)^2}{8} + \frac{(4 - 6)^2}{6} + \frac{(6 - 6)^2}{6} = 1.17. \qquad (3)$$
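The sampling experiment can be imitated on a computer. The sketch below is my own; the seed and implementation details are arbitrary choices, the book's experiment having used physical beads. It draws 1000 samples of 20 from the population with relative frequencies .4, .3, .3 and computes Equation (2) for each.

    import random

    PI = [0.4, 0.3, 0.3]
    H = [20 * p for p in PI]          # hypothetical frequencies 8, 6, 6
    random.seed(1)

    def one_chi_square():
        counts = [0, 0, 0]
        for _ in range(20):           # one sample of 20 observations
            counts[random.choices([0, 1, 2], weights=PI)[0]] += 1
        return sum((f - h) ** 2 / h for f, h in zip(counts, H))

    values = [one_chi_square() for _ in range(1000)]
    print(sum(values) / len(values))                      # near 2, the d.f.
    print(sum(v > 5.99 for v in values) / len(values))    # near 0.05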
After the 1000 χ²-values are computed, a frequency table and a histogram may be made to show the shape of the distribution. The mean of the 1000
χ²-values is approximately equal to 2. This indicates that the distribution has r - 1 or 3 - 1 or 2 degrees of freedom. Out of the 1000 χ²-values, 60 or 6% exceed 5.99, the 5% point of the χ²-distribution with 2 degrees of freedom. The discrepancy between 6% and 5% is not excessive for an experiment consisting of only 1000 samples. If a larger number of samples were drawn, the discrepancy would be expected to diminish.

From Equations (2) and (3), it can be seen that the statistic χ² is a measurement of the deviations of the observed frequencies of a sample from the true average frequencies, nπ₁, nπ₂, ..., and nπᵣ. If these average frequencies are incorrect because of a wrong hypothesis concerning the values of the π's, the result is that the χ²-value of a sample will be affected. In general, it is increased rather than decreased. For example, the χ²-value for the sample consisting of 10 red, 4 blue, and 6 white beads is 1.17 (Equation 3). In testing the hypothesis that π₁ = .2, π₂ = .4, and π₃ = .4, the same sample yields a statistic of
$$\chi^2 = \frac{(10 - 4)^2}{4} + \frac{(4 - 8)^2}{8} + \frac{(6 - 8)^2}{8} = 11.5.$$
This large value of χ² enables one to reject the wrong hypothesis. Therefore, in testing the hypothesis that the π's are equal to a given set of values, a large χ²-value indicates that the hypothesis might be wrong. Since the χ²-value can also be large sometimes, even if the hypothesis is correct, there is always the possibility of committing a Type I error in rejecting a hypothesis. On the other hand, a wrong hypothesis can sometimes yield a small value of χ². In testing the hypothesis that π₁ = .5, π₂ = .2, and π₃ = .3, the sample consisting of 10 red, 4 blue, and 6 white beads yields a χ²-value of zero. On the basis of the value of χ², this wrong hypothesis would be accepted. Therefore, in accepting a hypothesis there is always the chance of making a Type II error.

The rolling of dice may be used as an illustration of the application of the test of goodness of fit. If a die is well balanced, the six faces should occur with equal frequency. In terms of statistics, the six π's are all equal to 1/6. To test this hypothesis, the die is rolled 200 times. The frequencies of occurrence of the six faces are given in Table 22.2. The six hypothetical frequencies given in the table are all equal to nπ or 200/6. The value of χ² is equal to 10.60 with 5 degrees of freedom. If the 5% significance level is used, the conclusion is that the die is a well-balanced one, because 10.60 is less than 11.07, the 5% point of the χ²-distribution with 5 degrees of freedom.

TABLE 22.2

No. of    Observed       Hypothetical
Spots     Frequency f    Frequency h     f - h    (f - h)²    (f - h)²/h
1            30             33.33        -3.33      11.09        .333
2            26             33.33        -7.33      53.73       1.612
3            45             33.33        11.67     136.19       4.086
4            43             33.33         9.67      93.51       2.806
5            27             33.33        -6.33      40.07       1.202
6            29             33.33        -4.33      18.75        .563
Total       200            199.98          .02               χ² = 10.602
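A minimal goodness-of-fit sketch (the function name is mine) reproduces the die computation of Table 22.2:

    # Chi-square test of goodness of fit from observed frequencies and
    # hypothetical relative frequencies.
    def chi_square_gof(observed, pi):
        n = sum(observed)
        return sum((f - n * p) ** 2 / (n * p) for f, p in zip(observed, pi))

    faces = [30, 26, 45, 43, 27, 29]           # 200 rolls of the die
    print(chi_square_gof(faces, [1 / 6] * 6))  # 10.60, below the 5% point 11.07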
22.3 Individual Degree of Freedom for Goodness of Fit

The test of goodness of fit is used in the preceding section to test the general hypothesis that the π's are equal to a given set of values. In testing a more specific hypothesis concerning the π's, an individual degree of freedom may be used. An individual degree of freedom of χ² is

$$\chi^2 = \frac{[M_1 f_1 + M_2 f_2 + \cdots + M_r f_r]^2}{n[M_1^2\pi_1 + M_2^2\pi_2 + \cdots + M_r^2\pi_r]}. \qquad (1)$$

The multipliers, M's, are chosen so that

$$M_1\pi_1 + M_2\pi_2 + \cdots + M_r\pi_r = 0, \qquad (2)$$

which is the specific hypothesis to be tested. The application of an individual degree of freedom may be illustrated by the example of dice rolling given in the preceding section. By examining the shape of the die, or by knowing the nature of the game, one may suspect that certain sides of the die may turn up more frequently than others. Suppose, before the die is rolled, one suspects that 3 and 4 do not occur as frequently as 1, 2, 5, and 6. To confirm or disprove this suspicion, one may test the hypothesis that

$$-\pi_1 - \pi_2 + 2\pi_3 + 2\pi_4 - \pi_5 - \pi_6 = 0. \qquad (3)$$
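Equation (1) is easily mechanized. The sketch below (the function name is mine) applies the multipliers of Equation (3) to the die data and reproduces the value derived in Equation (4) below.

    # Individual degree of freedom for the goodness-of-fit test.
    def individual_df(multipliers, observed, pi):
        n = sum(observed)
        num = sum(m * f for m, f in zip(multipliers, observed)) ** 2
        den = n * sum(m * m * p for m, p in zip(multipliers, pi))
        return num / den

    f = [30, 26, 45, 43, 27, 29]
    M = [-1, -1, 2, 2, -1, -1]
    print(individual_df(M, f, [1 / 6] * 6))   # 10.24, beyond the 5% point 3.84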
For the data given in Table 22.2,
$$\chi^2 = \frac{[-30 - 26 + 2(45) + 2(43) - 27 - 29]^2}{200[(-1)^2 + (-1)^2 + (2)^2 + (2)^2 + (-1)^2 + (-1)^2](1/6)} = \frac{(64)^2}{400} = \frac{4096}{400} = 10.24, \qquad (4)$$
which is much larger than 3.84, the 5% point of the χ²-distribution with 1 degree of freedom. The conclusion is that the sides 3 and 4 of the die occur more frequently than the sides 1, 2, 5, and 6. This example illustrates the desirability of testing a specific hypothesis by the method of individual degree of freedom. For the same set of data and the same sig-
nificance level, this specific test rejects the hypothesis while the general one accepts it (Section 22.2). The method of individual degree of freedom is truly a more powerful test (Section 20.6) than the general one described in the preceding section.

The individual degree of freedom really amounts to the reduction of the multinomial population to a binomial one. Suppose the sides 3 and 4 are considered successes and the sides 1, 2, 5, and 6 are considered failures. The specific hypothesis is that π = 2/6 or 1/3. For the data of Table 22.2, n is 200; f, the number of successes, is 45 + 43 or 88. The value of χ² is (Equation 2, Section 21.4)
$$\chi^2 = \frac{(f - n\pi)^2}{n\pi(1 - \pi)} = \frac{(88 - 200/3)^2}{200(1/3)(2/3)} = \frac{(64)^2/9}{400/9} = 10.24, \qquad (5)$$
which is the same value given in Equation (4). The χ²-test of the individual degree of freedom is equivalent to a two-tailed u-test (Section 7.6). At times, the one-tailed u-test is more desirable. For example, because of the nature of the game, one suspects that the sides 3 and 4 of the die occur more frequently, and not less frequently, than 1, 2, 5, and 6. In this case, the one-tailed u-test should be used. For the example of Table 22.2,
$$u = \frac{f - n\pi}{\sqrt{n\pi(1 - \pi)}} = \frac{88 - 200/3}{\sqrt{200(1/3)(2/3)}} = 3.20, \qquad (6)$$
which is the square root of χ² or 10.24. However, the difference in the one- or two-tailed u-tests is not in the value of u, but in the critical regions. For the 5% significance level, the critical regions for a two-tailed test are u < -1.960 and u > 1.960; the critical region for a one-tailed test is u > 1.645. If u is greater than 1.645, the hypothesis is rejected by a one-tailed test; while in a two-tailed test, u must exceed 1.960 before the hypothesis is rejected. Therefore, the one-tailed test is more powerful than the two-tailed one, if the sides 3 and 4 are suspected to occur only more frequently than the other sides.

22.4 Fitting Frequency Curves

Many applications of the test of goodness of fit can be found in this book. Various distributions of the statistics, such as the t, F, and χ²-distributions, are verified by sampling experiments consisting of 1000 samples. Such an empirical distribution may be tested for agreement with the theoretical one. The sampling experiment of Section 8.2 is used here as an example. The observed and hypothetical frequencies of the
1000 t-values are given in Table 22.4a. The observed frequencies are transferred from Table 8.2. The theoretical relative frequencies of Table 8.2 are the values of the π's. The hypothetical frequencies are equal to 1000 times the theoretical relative frequencies. The computation of Table 22.4a shows that χ² = 11.59 with 10 degrees of freedom. Therefore, the conclusion is that the result of the sampling experiment does not refute the theoretical t-distribution.

In fitting the empirical data to a theoretical frequency curve, the value of χ² and the number of degrees of freedom are affected by the manner in which the observations are grouped. In Table 22.4a, the t-values are arbitrarily classified into 11 groups. But the same 1000 t-values can also be classified into 3 groups as shown in Table 22.4b. For this grouping, χ² = 4.23 with 2 degrees of freedom. This example illustrates the effect of the grouping of observations on the χ²-test of goodness of fit. Sometimes the conclusions reached through different groupings may even be different. Comparing Tables 22.4a and 22.4b, one sees that the latter grouping obscures the possible discrepancies between the observed and hypothetical frequencies at the tail ends of the t-distribution.
TABLE 22.4a

      t          Observed       Hypothetical
                 Frequency f    Frequency h    f - h    (f - h)²    (f - h)²/h
Below -4.5            8              5             3         9         1.800
-4.5 to -3.5          6              7            -1         1          .143
-3.5 to -2.5         23             21             2         4          .190
-2.5 to -1.5         85             71            14       196         2.761
-1.5 to -0.5        218            218             0         0          .000
-0.5 to  0.5        325            356           -31       961         2.699
 0.5 to  1.5        219            218             1         1          .005
 1.5 to  2.5         80             71             9        81         1.141
 2.5 to  3.5         25             21             4        16          .762
 3.5 to  4.5          4              7            -3         9         1.286
Above  4.5            7              5             2         4          .800
Total              1000           1000             0                χ² = 11.587
TABLE 22.4b

      t          Observed       Hypothetical
                 Frequency f    Frequency h    f - h    (f - h)²    (f - h)²/h
Below -0.5          340            322            18       324         1.006
-0.5 to 0.5         325            356           -31       961         2.699
Above 0.5           335            322            13       169          .525
Total              1000           1000             0                χ² = 4.230
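A short sketch, not from the text, reproduces both computations from the same frequencies, showing how pooling the 11 classes of Table 22.4a into the 3 classes of Table 22.4b changes χ²:

    # Effect of grouping on the goodness-of-fit chi-square.
    fine_f = [8, 6, 23, 85, 218, 325, 219, 80, 25, 4, 7]
    fine_h = [5, 7, 21, 71, 218, 356, 218, 71, 21, 7, 5]

    def chi_square(f, h):
        return sum((fi - hi) ** 2 / hi for fi, hi in zip(f, h))

    print(chi_square(fine_f, fine_h))        # 11.59 with 10 d.f.

    # Pool the 11 classes into 3: below -0.5, -0.5 to 0.5, above 0.5.
    coarse_f = [sum(fine_f[:5]), fine_f[5], sum(fine_f[6:])]
    coarse_h = [sum(fine_h[:5]), fine_h[5], sum(fine_h[6:])]
    print(chi_square(coarse_f, coarse_h))    # 4.23 with 2 d.f.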
In general, a more minute grouping of observations is more useful in detecting the difference in the shapes of the empirical and the theoretical frequency distributions. However, the number of groups cannot be too large either. If the observations are divided into minute groups, some of the hypothetical frequencies may be too small. The time-honored working rule is that all the hypothetical frequencies must be at least 5 (Section 21.3). But this working rule need not be strictly observed. As long as 80% of the hypothetical frequencies are equal to or greater than 5, and the other 20% are not less than 1, the test of goodness of fit still can be used. The reason for compromising the working rule is to make the test more sensitive to the possible discrepancies at the tail ends of the empirical and theoretical distributions.

22.5 Test of Independence

The test of independence is first introduced in Section 21.6 to test the hypothesis that two samples are drawn from the same binomial population. The same test can be used in testing the hypothesis that k samples are drawn from the same r-categoried multinomial population. The statistic used to test this hypothesis is
$$\chi^2 = \sum^{kr}\frac{(f - h)^2}{h}, \qquad (1)$$
where f is an observed frequency and h the corresponding hypothetical frequency. For large samples, this statistic follows the χ²-distribution with (k - 1)(r - 1) degrees of freedom.

The sampling experiment of Section 22.2 can be used to verify the distribution of the statistic χ². The multinomial population consists of 4000 colored beads, 40% red, 30% blue, and 30% white. One thousand random samples, each consisting of 20 beads, are drawn from this three-categoried population. The numbers of red, blue, and white beads are recorded for each of the 1000 samples. Then the 1000 samples are organized into 500 pairs of samples. The first and the second samples constitute a pair; the third and the fourth constitute another pair; and so forth. One such pair of samples is shown in Table 22.5a. For each pair of samples, a χ²-value can be computed.

TABLE 22.5a

                Sample No.       Total        Pooled Relative
Observation      1       2       Frequency    Frequency
Red             10       7          17           .425
Blue             4       8          12           .300
White            6       5          11           .275
Sample Size     20      20          40          1.000
The end result is 500 χ²-values. The purpose of this sampling experiment is to show that the statistic χ² approximately follows the χ²-distribution with (k - 1)(r - 1) or (2 - 1)(3 - 1) or 2 degrees of freedom.
TABLE 22.5b

Sample                 Observed       Hypothetical
No.     Observation    Frequency f    Frequency h    f - h    (f - h)²    (f - h)²/h
1       Red               10              8.5          1.5      2.25        .2647
        Blue               4              6.0         -2.0      4.00        .6667
        White              6              5.5           .5       .25        .0455
2       Red                7              8.5         -1.5      2.25        .2647
        Blue               8              6.0          2.0      4.00        .6667
        White              5              5.5          -.5       .25        .0455
Total                     40             40.0           .0               χ² = 1.9538
The details of computation for the pair of samples of Table 22.5a are shown in Table 22.5b. The hypothetical frequencies given in the table are not n times the π's as in the test of goodness of fit, but n times the pooled relative frequencies of the r categories. The hypothetical frequencies 8.5, 6.0, and 5.5 are equal to 20 times .425, .300, and .275 (Table 22.5a) respectively. Judging from Table 22.5b, it seems extremely tedious to calculate 500 χ²-values. However, a computing short-cut can be used to reduce the amount of work. For the pair of samples given in Table 22.5a,
$$\chi^2 = \frac{(10 - 7)^2}{10 + 7} + \frac{(4 - 8)^2}{4 + 8} + \frac{(6 - 5)^2}{6 + 5} = 1.954. \qquad (2)$$
The numerators are the squares of the differences between the two observed frequencies, and the denominators are the sums of the same two frequencies. The χ²-value obtained this way is the same as that given in Table 22.5b. Even though this short-cut is applicable only to the case of two samples of the same size, it does save a tremendous amount of time for this sampling experiment. After the 500 χ²-values are computed, the empirical distribution may be checked against the theoretical χ²-distribution with 2 degrees of freedom. Out of the 500 χ²-values computed, 27 or 5.4% exceed 5.99, the tabulated 5% point of the χ²-distribution with 2 degrees of freedom. The discrepancy between 5.4% and 5.0% for a sampling experiment involving only 500 χ²-values is not excessive.
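A sketch verifying the short-cut of Equation (2) against the long computation of Table 22.5b; the code and its names are mine:

    # Chi-square for two equal-sized samples, two ways.
    sample1 = [10, 4, 6]    # red, blue, white
    sample2 = [7, 8, 5]

    shortcut = sum((a - b) ** 2 / (a + b) for a, b in zip(sample1, sample2))

    # The long way, through pooled hypothetical frequencies:
    longway = 0.0
    for a, b in zip(sample1, sample2):
        h = (a + b) / 2     # n times the pooled relative frequency (n = 20)
        longway += (a - h) ** 2 / h + (b - h) ** 2 / h

    print(shortcut, longway)   # both 1.954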
22.6 An Example of Test of Independence

The data given in Table 22.6a may be used as an illustration of the application of the test of independence. The table shows the grade distribution of 693 students in a freshman chemistry class. The students are classified by the grades they received and also by the school in which they are registered. Such a two-way frequency table is called a contingency table. Table 22.6a is a 5 × 6 contingency table. The purpose of compiling such data is to determine whether the percentages of students receiving the five different grades vary from school to school of the same college. If the five percentages remain the same for all the six schools, the grade distribution is said to be independent of the schools. In terms of statistics, the 5 grades may be interpreted as the 5 categories of a multinomial population. The 6 schools may be considered as 6 random samples drawn from 6 multinomial populations. The hypothesis is that the 6 populations have the same set of relative frequencies; that is, the grade distribution is independent of the schools. Moreover, another interpretation can be given the data. The 6 schools may be interpreted as the 6 categories of a multinomial population and the 5 grades as 5 samples drawn from the multinomial populations. It does not matter which interpretation is adopted. The final conclusion remains the same.

The computing procedure for this set of data is the same as that described for the sampling experiment. The 30 cells of Table 22.6a contain the observed frequencies. The pooled relative frequencies for the 5 grades are given in the bottom of the table. The hypothetical frequencies are equal to n times these pooled relative frequencies.

TABLE 22.6a

                                    Grade
School                 A        B        C        D        F      Sample Size
Agriculture           22       59       49       38       23         191
Engineering           28       66       41       23       17         175
Science                8       11       21       16        7          63
Home Economics        15       27       16        9        2          69
Pharmacy               9       34       40       15        6         104
Unclassified           4       20       38       19       10          91
Total frequency       86      217      205      120       65         693
Relative frequency  .124098  .313131  .295815  .173160  .093795     .999999

(This table is published with the permission of Mr. Dennis Krzyzan, instructor in Chemistry, South Dakota State College.)
For example, the hypothetical frequencies for the school of agriculture are equal to 191 times the pooled relative frequencies; for the school of engineering, they are equal to 175 times the same set of relative frequencies. The 30 hypothetical frequencies thus computed are given in Table 22.6b. The value of χ² is equal to 47.82, which exceeds 31.41, the 5% point of the χ²-distribution with 20 degrees of freedom. If the 5% significance level is used, the conclusion is that the percentages of the stu-
TABLE 22.6b

                         Observed       Hypothetical
School           Grade   Frequency f    Frequency h     f - h     (f - h)²    (f - h)²/h
Agriculture        A        22             23.70         -1.70      2.8900       .1219
                   B        59             59.81          -.81       .6561       .0110
                   C        49             56.50         -7.50     56.2500       .9956
                   D        38             33.07          4.93     24.3049       .7350
                   F        23             17.91          5.09     25.9081      1.4466
Engineering        A        28             21.72          6.28     39.4384      1.8158
                   B        66             54.80         11.20    125.4400      2.2891
                   C        41             51.77        -10.77    115.9929      2.2405
                   D        23             30.30         -7.30     53.2900      1.7587
                   F        17             16.41           .59       .3481       .0212
Science            A         8              7.82           .18       .0324       .0041
                   B        11             19.73         -8.73     76.2129      3.8628
                   C        21             18.64          2.36      5.5696       .2988
                   D        16             10.91          5.09     25.9081      2.3747
                   F         7              5.91          1.09      1.1881       .2010
Home Economics     A        15              8.56          6.44     41.4736      4.8450
                   B        27             21.61          5.39     29.0521      1.3444
                   C        16             20.41         -4.41     19.4481       .9529
                   D         9             11.95         -2.95      8.7025       .7282
                   F         2              6.47         -4.47     19.9809      3.0882
Pharmacy           A         9             12.91         -3.91     15.2881      1.1842
                   B        34             32.57          1.43      2.0449       .0628
                   C        40             30.76          9.24     85.3776      2.7756
                   D        15             18.01         -3.01      9.0601       .5031
                   F         6              9.75         -3.75     14.0625      1.4423
Unclassified       A         4             11.29         -7.29     53.1441      4.7072
                   B        20             28.49         -8.49     72.0801      2.5300
                   C        38             26.92         11.08    122.7664      4.5604
                   D        19             15.76          3.24     10.4976       .6661
                   F        10              8.54          1.46      2.1316       .2496
Total                      693            693.00           .00                χ² = 47.8168
dents receiving the 5 different grades are not the same for all schools. In other words, the grade distribution of the freshman chemistry course varies with the schools of the college. An examination of Table 22.6b reveals the meaning of the test of independence. The hypothetical frequencies are computed from the pooled relative frequencies of the 5 grades. If the 6 schools should have identical grade distributions, the hypothetical frequencies would be equal to their corresponding observed frequencies. Then (f - h) would be equal to zero. Consequently χ² would be equal to zero. The examination of the column (f - h) reveals that the school of agriculture has a deficit of high grades as compared to the college as a whole; while the school of home
economics has an excess of high grades. Therefore, the statistic χ² may be regarded as a measure of the deviations of the grade distributions of the 6 schools from that of the college as a whole. However, the examination of the data, illuminating as it may be, does not replace the χ²-test. The quantities (f - h) describe this particular set of data (samples); while the use of the χ²-test is an attempt to reach a more general (populations) conclusion which may hold for years to come for this college.

22.7 Individual Degree of Freedom for Test of Independence

In testing specific hypotheses, the χ² with (k - 1)(r - 1) degrees of freedom may be partitioned into components, each of which has 1 degree of freedom. The elaborate method of doing so is not presented here. A simple method is introduced instead. The simple method amounts to carving a 2 × 2 contingency table out of a k × r contingency table. For practical purposes, the simple method works just as well. The example of the grade distribution of the preceding section may be used as an illustration. Suppose one wishes to know whether the percentage of failures in freshman chemistry is the same for the students in science and engineering and the students in other schools. To test the hypothesis that the percentages of failures for both groups of students are the same, a 2 × 2 contingency table, as shown in Table 22.7, may be constructed out of Table 22.6a. The χ²-value, which can be computed by one of several methods given in Section 21.6, is equal to 0.21. The conclusion is that the percentages of failures in the freshman chemistry course for the two groups of students are not found to be different.

TABLE 22.7

                                  Grade
School                   Passing (A, B, C, D)   Failing (F)   Total
Science & Engineering            214                24          238
Other Schools                    414                41          455
Total                            628                65          693
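A sketch of the 2 × 2 computation; the fourfold-table formula used below is a standard equivalent of the methods of Section 21.6, not necessarily the one the book has in mind:

    # Chi-square for a 2 x 2 contingency table (Table 22.7).
    a, b = 214, 24      # science & engineering: passing, failing
    c, d = 414, 41      # other schools: passing, failing

    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    print(chi2)   # about 0.21, far below 3.84, so no difference is shown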
22.8 Computing Short-cut for χ²
The x"'-tests, as presented in this chapter, are used either in the test of goodness of fit or in the test of independence. Both cases involve finding the hypothetical frequencies. This section presents some computing short-cuts so that the hypothetical frequencies need not be computed. The short-cuts are derived from algebraic identities which are irrelevant to the meaning of the statistical tests and therefore are not Riven here.
For the goodness of fit, the short-cut method of computing χ² is as follows:

$$\chi^2 = \frac{1}{n}\left[\frac{f_1^2}{\pi_1} + \frac{f_2^2}{\pi_2} + \cdots + \frac{f_r^2}{\pi_r}\right] - n. \qquad (1)$$
The f's of the above equation are the observed frequencies; the π's are the given hypothetical relative frequencies; Σf = n; Σπ = 1. There are two reasons for calling this method a short-cut. One is that the procedure is similar to the now-familiar procedure of the analysis of variance. The other is that the quantity

$$\frac{f_1^2}{\pi_1} + \frac{f_2^2}{\pi_2} + \cdots + \frac{f_r^2}{\pi_r} \qquad (2)$$
can be obtained in one continuous operation on a desk calculator.

As an illustration, the first sample of the sampling experiment described in Section 22.2 may be considered. The sample consists of 10 red, 4 blue, and 6 white beads. The corresponding values of π are 0.4, 0.3, and 0.3. By the short-cut method,

$$\chi^2 = \frac{1}{20}\left[\frac{(10)^2}{.4} + \frac{(4)^2}{.3} + \frac{(6)^2}{.3}\right] - 20 = 1.17, \qquad (3)$$

which is the same value given in Equation (3), Section 22.2.
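The short-cut lends itself to a one-line program; a sketch (the function name is mine):

    # Goodness-of-fit chi-square without hypothetical frequencies.
    def chi_square_shortcut(f, pi):
        n = sum(f)
        return sum(fi * fi / p for fi, p in zip(f, pi)) / n - n

    print(chi_square_shortcut([10, 4, 6], [0.4, 0.3, 0.3]))   # 1.17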
In a test of independence, the statistic χ² is computed in a somewhat similar manner. The data given in Table 22.6a may be used as an illustration. For each of the 6 schools, a quantity R is computed. For agriculture,

$$R = \frac{(22)^2}{86} + \frac{(59)^2}{217} + \frac{(49)^2}{205} + \frac{(38)^2}{120} + \frac{(23)^2}{65} = 53.55337. \qquad (4)$$
The numerators are the squares of the observed frequencies for the 5 grades of a particular school. The denominators are the total frequencies of the corresponding grades. The quantity R can be obtained in one continuous operation on a desk calculator. The values of R for the 6 schools are given in Table 22.8 with the n's, which are the numbers of students in the 6 schools. Ordinarily, this table is not needed; the R-values can be listed in an extra column provided in the original contingency table.

TABLE 22.8

School              n         R
Agriculture        191    53.55337
Engineering        175    46.24450
Science             63     6.34019
Home Economics      69     7.96104
Pharmacy           104    16.50277
Unclassified        91    13.62006
Total              693

From these R-values, χ² can be computed as follows:
$$\chi^2 = (\Sigma n)\left[\frac{R_1}{n_1} + \frac{R_2}{n_2} + \cdots + \frac{R_k}{n_k} - 1\right] \qquad (5)$$

$$= 693[1.069005 - 1] = 47.82,$$

which is the same value given in Table 22.6b. The quantity Σ(R/n) can also be obtained in one continuous operation. Therefore, besides listing the values of R, the short-cut method requires practically no writing. A whole table, such as Table 22.6b, is eliminated.
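The whole short-cut can be sketched as follows (the data structure and names are mine); it reproduces the χ² of Table 22.6b from the contingency table alone:

    # R short-cut for the test of independence (Equation 5).
    table = {
        "Agriculture":    [22, 59, 49, 38, 23],
        "Engineering":    [28, 66, 41, 23, 17],
        "Science":        [8, 11, 21, 16, 7],
        "Home Economics": [15, 27, 16, 9, 2],
        "Pharmacy":       [9, 34, 40, 15, 6],
        "Unclassified":   [4, 20, 38, 19, 10],
    }

    col_totals = [sum(row[j] for row in table.values()) for j in range(5)]
    N = sum(col_totals)   # 693

    sum_R_over_n = 0.0
    for row in table.values():
        n = sum(row)
        R = sum(f * f / t for f, t in zip(row, col_totals))
        sum_R_over_n += R / n

    print(N * (sum_R_over_n - 1))   # 47.82, with (6 - 1)(5 - 1) = 20 d.f.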
EXERCISES

(1) The frequencies of the ten numbers, 0 to 9, that appeared on a page of a random number table are given as follows:

Number       0     1     2     3     4     5     6     7     8     9
Frequency   241   255   249   254   246   250   254   253   250   248
(a) Test the hypothesis that the ten numbers appear equally frequently, at the 5% level. (b) Test the hypothesis that even numbers appear as frequently as odd numbers by an individual degree of freedom, at the 5% level. For this exercise, 0 is to be considered as even.

(2) For the data given in Table 5.6, test the hypothesis that the sample means follow a normal distribution, at the 5% level. Note that the total number of samples is 1000.

(3) For the data given in Table 7.5, test the hypothesis that Σu² follows the χ²-distribution with n degrees of freedom (Theorem 7.5), at the 5% level.

(4) For the data given in Table 7.7b, test the hypothesis that SS/σ² follows the χ²-distribution with n - 1 degrees of freedom (Theorem 7.7a), at the 5% level.

(5) For the data given in Table 9.2, test the hypothesis that the statistic s₁²/s₂² follows the F-distribution (Theorem 9.1), at the 5% level.

(6) For the data given in Table 9.6, test the hypothesis that the statistic

$$\frac{SS_1 + SS_2}{\sigma^2}$$
22.8
EXERCISES
follows the ~-distribution with Rl + R2 - 2 degrees of freedom (Theorem 9.6), at the 5% level. (7) (a) For the data given in Table 12.5, test the hypothesis that the statistic t follows the Students's t-distribution, at the 5% level. (b) For the same data. teat the hypothesis that the relative frequency of t greater than 1.5 is equal to that of t less than -1.5, by an individual degree of freedom, at the 5% level. (8) The grade distributions of five different instructors who taught in the same department for a period of one year are given in the following table.
              A    B    C    D    F
Freshman     33   89   66   19    3
Sophomore    13   50   66    8    1
Junior       18   38   31    0    0
Senior       16   24   13    0    0
(Courtesy of General J. H. Berry, Oregon State College.)
Did the instructors give significantly different percentages of the five grades? The frequencies of the F grade are too low. Combine the grades D and F and call it "D or below" before any calculation. Use the 5% significance level for the test (χ² = 27.20 with 9 d.f.).
(9) A poll conducted among the students on a university campus shows the following result:

Year          Yes    No opinion    No
Freshman       98         8        42
Sophomore      42        25        70
Junior         25        41        33
Senior         19        65        29
Graduate       10        54        52

(a) Test the hypothesis that the percentages of the three different opinions are the same for the five groups of students, at the 5% level. (b) By an individual degree of freedom, test the hypothesis that the percentage of "yes" is the same for the graduate and undergraduate students, at the 5% level.
(10) The common mistakes in typing are addition, omission, substitution, and transposition. The number of mistakes of each kind is recorded
for each of three groups of equal numbers of student-typists. The data are as follows:

                      Group
Mistake            I      II     III
Addition           23     18     24
Omission           43     19     22
Substitution      223    191    210
Transposition      31     25     29

Test the hypothesis that the three groups of typists make the same percentage of errors on the four types of mistakes, at the 5% level.
QUESTIONS
(1) What is a multinomial population?
(2) Under what condition does a multinomial population become a binomial population?
(3) What do the parameters π₁, π₂, ..., π_k stand for?
(4) What is the hypothesis being tested by a test of goodness of fit?
(5) How do you find the hypothetical frequencies for a test of goodness of fit?
(6) What is the purpose of using an individual degree of freedom in connection with a test of goodness of fit?
(7) What is a test of independence?
(8) What do you mean by independence?
(9) How do you find the hypothetical frequencies for a test of independence?
(10) The methods presented in this chapter require large samples. What are the minimum sample sizes?
REFERENCES
Cochran, W. G.: "The χ² Test of Goodness of Fit," Annals of Mathematical Statistics, Vol. 23 (1952), pp. 315-345.
Cochran, W. G.: "Some Methods for Strengthening the Common χ² Tests," Biometrics, Vol. 10 (1954), pp. 417-451.
Irwin, J. O.: "A Note on the Subdivision of χ² into Components," Biometrika, Vol. 36 (1949), pp. 130-134.
Kimball, A. W.: "Short-Cut Formulas for the Exact Partition of χ² in Contingency Tables," Biometrics, Vol. 10 (1954), pp. 452-458.
CHAPTER 23
SOME COMMONLY USED TRANSFORMATIONS

The analysis of variance is one of the most useful methods in statistics. Unfortunately it requires that the populations be normal and that the variances be equal. Even though a slight departure from these requirements does not result in serious consequences (Section 12.7), a drastic departure may need corrective action. This chapter introduces several devices for normalizing the populations and equalizing the variances.

23.1 Angular Transformation
It is shown in Section 21.3 that the sample means drawn from a binomial population do not follow the normal distribution closely, unless the sample size is large and μ is not too far from 0.5. This section offers a device by which the non-normality can be remedied. The device is to transform the sample mean ȳ, which ranges from 0 to 1, into an angle, which ranges from 0 to 90 degrees. The geometric relation between the two scales is shown in Figure 23.1a. The vertical axis represents √ȳ and the horizontal axis represents √(1 − ȳ). A unit circle is constructed with the center at the origin. As the point P moves along the arc of the circle, its projection Q moves along the horizontal axis.

Fig. 23.1a  The geometric relation between ȳ and the angle φ on the unit circle.
Fig. 23.1b  The numerical relation between the sample mean (100ȳ%) and the angle φ.
Thus various values of ȳ can be represented geometrically. The new scale of measurement is the angle φ as indicated in the figure. When ȳ = 0, that is, when the points P and Q coincide, the angle φ is equal to 0 degrees. When ȳ = 1, that is, when the line PQ and the vertical axis coincide, the angle φ is equal to 90 degrees. Thus for every value of ȳ, there is a corresponding value of φ. The numerical relation between the two scales is shown in Figure 23.1b. From this figure, it can be observed that the center portion of the ȳ-scale is contracted, while the two ends are expanded by the angular transformation. It is conceivable that the

TABLE 23.1

Relative          No. of      Expt. 3: n = 20, μ = .4    Expt. 4: n = 20, μ = .9
Frequency of      Degrees
White Beads ȳ        φ           f      r.c.f.(%)           f      r.c.f.(%)
    .00             0.0           0        0.0               0        0.0
    .05            12.9           0        0.0               0        0.0
    .10            18.4           3        0.3               0        0.0
    .15            22.8          14        1.7               0        0.0
    .20            26.6          53        7.0               0        0.0
    .25            30.0          70       14.0               0        0.0
    .30            33.2         127       26.7               0        0.0
    .35            36.3         181       44.8               0        0.0
    .40            39.2         155       60.3               0        0.0
    .45            42.1         161       76.4               0        0.0
    .50            45.0         116       88.0               0        0.0
    .55            47.9          68       94.8               0        0.0
    .60            50.8          32       98.0               0        0.0
    .65            53.7          15       99.5               1        0.1
    .70            56.8           4       99.9              10        1.1
    .75            60.0           1      100.0              31        4.2
    .80            63.4           0      100.0              84       12.6
    .85            67.2           0      100.0             191       31.7
    .90            71.6           0      100.0             278       59.5
    .95            77.1           0      100.0             270       86.5
   1.00            90.0           0      100.0             135      100.0
   Total                       1000                       1000
congestion of ȳ's toward 0 or 1, because μ is too small or too large, can be relieved by this transformation of scale. The more detailed numerical relation between ȳ and φ is given in Table 10, Appendix. Here the value of ȳ is expressed in terms of the percentage of successes in a sample and the value of φ is the number of degrees of the angle. For example, if there are 7 successes in a sample of 20 observations, ȳ = 7/20 = 0.35. In terms of percentages, the 7 successes constitute 35% of the 20 observations in the sample. For any given percentage, the corresponding number of degrees is listed in the table. The number of degrees corresponding to 35% is given in the row marked 30 and the column marked 5. The value of φ thus located is 36.3. Similarly, the values of φ corresponding to 40%, 51%, and 62% are 39.2, 45.6, and 51.9 respectively. The number of degrees corresponding to 100% is understood to be 90. This pair of values is not given in the table.
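The geometric relation above (sin φ = √ȳ on the unit circle) means that the table look-up amounts to computing φ = arcsin √ȳ and expressing the angle in degrees. A minimal sketch, with an illustrative function name:

    import math

    def angular_transform(p):
        # phi = arcsin(sqrt(p)) in degrees; this reproduces the
        # entries of Table 10 of the Appendix.
        return math.degrees(math.asin(math.sqrt(p)))

    for percent in (35, 40, 51, 62):
        print(percent, round(angular_transform(percent / 100), 1))
    # prints 36.3, 39.2, 45.6, 51.9, the tabulated values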
Fig. 23.1c  The relative cumulative frequencies (%) of Experiments 3 and 4 plotted against φ on normal probability graph paper.
To show the effectiveness of the angular transformation, the result of Experiments 3 and 4 of Section 21.3 may be used as evidence. The two distributions of sample means originally given in Table 21.3 are transferred to Table 23.1, where ȳ is transformed to φ. The relative cumulative frequencies for both experiments are plotted against φ on the normal probability graph paper as shown in Figure 23.1c. The fact that the plotted points cluster about straight lines indicates that the two distributions of φ are normal. Furthermore, the fact that the two lines are almost parallel indicates that the variances of these two distributions are equal. As to the values of the mean and standard deviation of φ, it can be read from the graph that they are equal to 37.0 and 6.5 for Experiment 3; 70.5 and 6.0 for Experiment 4. What this transformation has accomplished is really marvelous. Originally the means of the two populations are quite different. For Experiment 3, μ = 0.4; for Experiment 4, μ = 0.9. Since the mean and variance of ȳ are μ and μ(1 − μ)/n respectively, the two distributions of ȳ have different means as well as different variances (Fig. 21.3c and 21.3d). In addition, the distribution of ȳ for Experiment 4 is very much skewed. But after transformation, the distributions of φ become normal and their variances become equal. The one feature that the means are different is still maintained. In terms of ȳ, Experiment 4 has a larger mean than Experiment 3. After transformation, that is, in terms of φ, Experiment 4 still has a larger mean. Therefore, the comparison among ȳ's can be carried out through the comparison among φ's, by the analysis of variance, with the assumptions satisfied. The variance of φ is theoretically equal to 821/n (reason not given here), which depends only on n and not at all on μ. For n = 20, as in this example, the theoretical standard deviation of φ should be √(821/20) or 6.4. The two values 6.5 and 6.0, which are read from the graphs, agree with the theoretical value fairly closely.
23.2 An Example of Transformation
The result of a sampling experiment is given in this section, for the dual purpose of (a) illustrating the application of the angular transformation, and (b) establishing the connection between the angular transformation and the methods presented in the preceding two chapters. The sampling experiment involves 4 binomial populations, with means equal to 0.08, 0.16, 0.10, and 0.18 respectively. (The technique of creating such populations is described in Section 21.3.) From each population, 5 samples, each consisting of 50 observations, are drawn. The numbers of successes and failures are recorded. The data are given in Table 23.2a. To make the figures more meaningful, the data are interpreted as those of an experiment on the germinating rate of grass seeds. The 4 populations are considered as 4 varieties of grass. The 5 samples are
considered as 5 replications. The failure and success are considered as germinated and ungerminated seeds. In reality the replications might be 5 different germinators (incubators for seeds), or one germinator used at 5 different times. Of course, it is irrelevant whether a germinated seed is considered a success or a failure. In this example, a germinated seed is considered a failure so that, in computation, one deals mostly with one-digit numbers. In terms of the grass seed, the objective of this experiment is to determine whether the germinating rates of the seeds of the 4 varieties of grass are the same. In terms of statistics, the hypothesis being tested is that the four population means are equal. Among the several ways of testing this hypothesis are these: (1) the χ²-tests for homogeneity of means of binomial populations, (2) the analysis of variance for a 4 × 5 factorial experiment of the mixed model (Section 18.6), with observations equal to 0 or 1, and (3) the angular transformation for the 20 percentages of a randomized block experiment.
TABLE 23.2a

                                     Variety of Grass
Replication   Observation          1      2      3      4     Total
1             Ungerminated         2     11      5      7      25
              Germinated          48     39     45     43     175
2             Ungerminated         4      9      7     12      32
              Germinated          46     41     43     38     168
3             Ungerminated         1     11      7      9      28
              Germinated          49     39     43     41     172
4             Ungerminated         2     10      3      7      22
              Germinated          48     40     47     43     178
5             Ungerminated         5      9      1     11      26
              Germinated          45     41     49     39     174
Total         Ungerminated        14     50     23     46     133
              Germinated         236    200    227    204     867
Because the first and the second methods are so much alike (Section 21.8), they can be treated together. The germinated and ungerminated seeds are designated by 0 and 1 respectively. Thus the numbers of ungerminated seeds, such as 2, 4, 1, etc., become the treatment totals T. Of course, the "treatment" here is the replication-variety combination. There are 20 such combinations altogether. Each combination has 50 observations. The method of computation is explained in Section 21.8 and the details of the computation are shown in Table 23.2b. The F-
TABLE 23.2b

Preliminary Calculations
(1)             (2)         (3)          (4)              (5)
Type of         Total of    No. of       Observations     Total of Squares
Total           Squares     Items        per Squared      per Observation
                            Squared      Item             (2) ÷ (4)
Grand            17,689         1         1,000               17.689
Replication       3,593         5           200               17.965
Variety           5,341         4           250               21.364
Rep. × Var.       1,131        20            50               22.620
Observation         133     1,000             1              133.000

Analysis of Variance
Source of       Sum of      Degrees of    Mean
Variation       Squares     Freedom       Square       F        χ²
Replication       0.276         4         0.0690      0.61      2.39
Variety           3.675         3         1.2250     10.88     31.87
Rep. × Var.        .980        12         0.0817       .73      8.50
Error           110.380       980         0.1126
Total           115.311       999         0.1154

ȳ(1 − ȳ) = (0.133)(0.867) = 0.1153
values are the various mean squares divided by the error mean square, while the χ²-values are the various sums of squares divided by ȳ(1 − ȳ). Since the source of the samples is known, one can see that the conclusions reached by either the F-test or the χ²-test are correct. However, in this mixed model, if the replication × variety interaction is significant, this interaction mean square should be used as the denominator in computing the F-values for replication and variety. The reason is that both replication and variety mean squares contain the interaction component (Table 18.6). Therefore, if the interaction is significant, the analysis of variance is reduced to only three useful components, namely, replication, variety, and interaction, with the interaction performing the function of the error term. To distinguish the interaction used as the error term from the original error mean square, the former is often called the experimental error (Section 14.8) and the latter the sampling error (Section 18.9). In terms of the grass seed experiment, the experimental error is the variation among the batches of seeds within a variety, while the sampling error is the variation among the individual seeds within a batch of 50. Needless to say, when the interaction is significant, the χ²-tests cannot be used, because the denominator ȳ(1 − ȳ) plays the same role as the error mean square. Then the χ²-tests must be abolished in favor of the
analysis of variance, which has only three components: replication, variety, and experimental error (interaction). But this analysis of variance is exactly that of the 20 numbers of ungerminated seeds given in Table 23.2a. If these 20 numbers are considered as observations and the analysis of variance is carried out, the sums of squares and mean squares thus obtained are n or 50 times as large as those given in Table 23.2b. Consequently the F-values for replication and variety are not affected. Therefore, in experiments of equal sample sizes such as this, it is not uncommon for one to record only the number of germinated or ungerminated seeds and use the analysis of variance on these numbers, because the experimental error (interaction mean square) is usually significantly larger than the sampling error (error mean square). In such an analysis, the sample sum T is regarded as an observation drawn from the binomial distribution, the distribution of the sample sums. This is a very popular point of view. Whenever the number of successes is referred to as an observation and is designated by y, or even x, the binomial distribution is being considered as the population. In dealing with the number of successes, whether T is regarded as a sample sum from a binomial population, or as an observation from a binomial distribution, one may face the possibility that T may not follow the normal distribution closely. If this is the case, the angular transformation may be used to remedy the non-normality. Thus the situation leads to the third method of analysis of the same set of data.
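The first method, the χ² test for homogeneity of binomial means (Section 21.8), can be sketched briefly in code. The snippet below is an editorial illustration using the variety totals of Table 23.2a; the variable names are invented.

    # Chi-square for homogeneity of binomial means, from the variety
    # totals of Table 23.2a (ungerminated seeds per variety, out of
    # 250 seeds per variety).
    ungerminated = [14, 50, 23, 46]
    n_per_variety = 250
    grand_n = 4 * n_per_variety          # 1000 observations in all
    T = sum(ungerminated)                # 133

    ss_variety = (sum(t * t for t in ungerminated) / n_per_variety
                  - T * T / grand_n)
    y_bar = T / grand_n
    chi_sq = ss_variety / (y_bar * (1 - y_bar))
    print(round(ss_variety, 3), round(chi_sq, 2))   # 3.675 and about 31.87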
TABLE 23.2c

                       Variety of Grass
Replication       1       2       3       4      Total
1               11.5    28.0    18.4    22.0      79.9
2               16.4    25.1    22.0    29.3      92.8
3                8.1    28.0    22.0    25.1      83.2
4               11.5    26.6    14.2    22.0      74.3
5               18.4    25.1     8.1    28.0      79.6
Total           65.9   132.8    84.7   126.4     409.8
As an example of the application of the angular transformation, the percentages of ungerminated seeds of Table 23.2a are transformed into numbers of degrees. For example, 2 ungerminated seeds out of 50 constitute 4%, which is in turn transformed to 11.5 degrees (Table 10, Appendix). The transformed data are shown in Table 23.2c and the analysis of variance is shown in Table 23.2d. Comparing Tables 23.2b and 23.2d, one will notice that the F-values for replication and variety are very similar for the original and transformed data. Therefore, the three different methods of analysis yield substantially the same result. As a
TABLE 23.2d

Preliminary Calculations
(1)             (2)            (3)          (4)              (5)
Type of         Total of       No. of       Observations     Total of Squares
Total           Squares        Items        per Squared      per Observation
                               Squared      Item             (2) ÷ (4)
Grand           167,936.04        1            20               8,396.802
Replication      33,774.74        5             4               8,443.685
Variety          45,129.70        4             5               9,025.940
Observation       9,287.52       20             1               9,287.520

Analysis of Variance
Source of       Sum of       Degrees of     Mean
Variation       Squares      Freedom        Square       F
Replication      46.883          4          11.721      0.66
Variety         629.138          3         209.713     11.72
Error           214.697         12          17.891
Total           890.718         19
word of precaution, it should be realized that in using the angular transformation, the percentages must be based on the same number of observations.

23.3 Square Root Transformation
The angular transformation may be replaced by the square root transformation, when the mean μ of the binomial population is small. Before the use of the analysis of variance, the value of ȳ is replaced by √ȳ, rather than by the number of degrees of an angle. The reason why the square root transformation can replace the angular transformation for small μ is that the two transformations maintain approximately the same scale for small values of ȳ. The evidence is given in Table 23.3a. For six small values of ȳ, the corresponding numbers of degrees and the corresponding square roots are listed in the table. The fact that the two scales are approximately proportional indicates that the square root transformation may be used to replace the angular transformation when the values of ȳ are small.

TABLE 23.3a

   ȳ        φ        √ȳ       φ ÷ √ȳ
  .00      0.0      .0000
  .01      5.7      .1000      57.0
  .02      8.1      .1414      57.3
  .03     10.0      .1732      57.7
  .04     11.5      .2000      57.5
  .05     12.9      .2236      57.7

TABLE 23.3b

                        Variety of Grass
Replication       1         2         3         4       Total
1               1.414     3.317     2.236     2.646      9.613
2               2.000     3.000     2.646     3.464     11.110
3               1.000     3.317     2.646     3.000      9.963
4               1.414     3.162     1.732     2.646      8.954
5               2.236     3.000     1.000     3.317      9.553
Total           8.064    15.796    10.260    15.073     49.193
TABLE 23.3c

Preliminary Calculations
(1)             (2)             (3)          (4)              (5)
Type of         Total of        No. of       Observations     Total of Squares
Total           Squares         Items        per Squared      per Observation
                                Squared      Item             (2) ÷ (4)
Grand           2,419.9512         1            20               120.9976
Replication       486.5372         5             4               121.6343
Variety           647.0046         4             5               129.4009
Observation       133.0083        20             1               133.0083

Analysis of Variance
Source of       Sum of       Degrees of     Mean
Variation       Squares      Freedom        Square       F
Replication       .6367          4           .1592       .64
Variety          8.4033          3          2.8011     11.31
Error            2.9707         12           .2476
Total           12.0107         19
As a rule, in using the square root transformation, the sample sum T is regarded as an observation drawn from a binomial distribution. The square root is extracted from T rather than from ȳ. Regardless of whether one deals with T or ȳ, however, the final conclusion remains the same. But in dealing with T, the number of successes, one can avoid the trouble of finding ȳ from T.
As an example of application, the square root transformation may be used on the data of Table 23.2a. The square roots of the 20 numbers of the ungerminated seeds are given in Table 23.3b. The analysis of variance on the transformed data is given in Table 23.3c. Comparing Tables 23.3c and 23.2d, one will notice that the F-values obtained through the square root and the angular transformations are substantially the same.
So far the square root transformation is introduced as a substitute for the angular transformation when the percentages of success are small. Yet actually the square root transformation is designed to be used on a particular distribution called the Poisson distribution. The general shape of this distribution is exhibited in Figures 23.3a and 23.3b, which show the histograms of Poisson distributions with means equal to 2 and 5 respectively. One of the characteristics of the Poisson distribution is that its mean and variance are equal. The counts T, such as the number of bacterial colonies in plates of the same size or the number of diseased plants in plots of similar dimensions, frequently follow this distribution. In fact, the numbers of successes of a binomial distribution follow the Poisson distribution very closely if μ is small. For example, when μ = 0.002 and n = 1000, the mean of the binomial distribution is nμ or 2 and the variance is nμ(1 − μ) or 1.996 (Table 21.2b). For all practical purposes, the mean and variance may be considered equal. Therefore, the binomial distribution may be approximated by the Poisson distribution, when μ, the relative frequency of successes of a binomial population, is very small. This is the reason why the Poisson distribution is sometimes called the distribution of rare events. This is also the reason why the observation, or count, of the Poisson distribution is designated by T. Thus the same notation T has an identical meaning for both binomial and Poisson distributions. From this unified notation T, it follows that the mean μ_T of the Poisson distribution is equivalent to the mean nμ of the binomial distribution.

Fig. 23.3a  Histogram of a Poisson distribution with mean μ_T = 2 (relative frequency plotted against the count T).

Fig. 23.3b  Histogram of a Poisson distribution with mean μ_T = 5.

The main purposes of using the square root transformation are (1) to normalize the Poisson distributions and (2) to equalize the population variances. But this transformation has a tendency to overdo both jobs, when μ_T is small. The Poisson distribution is a skewed distribution with the long tail on the right (Fig. 23.3a and 23.3b). The distribution of √T is not symmetrical, but has the long tail on the left. As to the variances, they are not exactly equalized. Suppose two Poisson distributions have different variances. After transformation, the two variances do not become equal, but the one which originally has a larger variance will actually have a slightly smaller variance on the square root scale. However, when μ_T is larger than 10, the transformation is very effective.
The square root transformation is designed for a population whose mean and variance are equal. To determine whether this condition exists, a number of counts must be acquired under similar environment. With this information, the hypothesis that the mean and variance are equal may be tested. The appropriate statistic to use is

    χ² = Σ(T − T̄)² / T̄,    (1)

where the n T's are the counts and T̄ the mean of the counts. This method is derived essentially from Theorem 7.7a. Since the hypothesis is that μ_T = σ_T², the population variance of T is replaced by T̄. This statistic approximately follows the χ²-distribution with n − 1 degrees of freedom. This is a two-tailed test. If the value of χ² is too small, the indication is that the variance of T is less than the mean of T. If the value of χ² is too large, the indication is that the variance is greater than the mean. In either event, as long as the hypothesis is rejected, the square root transformation is not an appropriate one to use.
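The dispersion test of Equation (1) is easily carried out in code. The sketch below is an editorial illustration; the ten counts are hypothetical, invented only to show the computation.

    # Test of the hypothesis that the mean and variance of counts are
    # equal (Equation 1): chi-square = sum((T - Tbar)^2) / Tbar, with
    # n - 1 degrees of freedom.
    counts = [3, 7, 2, 5, 6, 4, 1, 5, 3, 4]   # hypothetical counts
    n = len(counts)
    t_bar = sum(counts) / n
    chi_sq = sum((t - t_bar) ** 2 for t in counts) / t_bar
    print(n - 1, round(chi_sq, 2))   # 9 degrees of freedom, 7.50

    # The value is referred to the chi-square table with n - 1 degrees
    # of freedom; both very small and very large values contradict the
    # hypothesis, since the test is two-tailed.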
23.4 Logarithmic Transformation
The logarithmic transformation, like the square root transformation, is used on data which consist of counts. The counts T are replaced by the logarithm of T before the analysis of variance is used. When the counts include 0, the logarithm of (T + 1) may be used instead of log T. The purpose of this shift is to avoid the logarithm of 0. For example, the counts may be 2, 0, 20, ..., etc. These observations are transformed to log 3, log 1, log 21, ..., etc., which are 0.477, 0.000, 1.322, ..., etc. Then the analysis of variance is used on the transformed observations.
Theoretically, this transformation is used on the populations whose means are proportional to their standard deviations. It is a more drastic

TABLE 23.4

    T        √T        log(T + 1)     √T ÷ log(T + 1)
    0        .000       0.00000
    1       1.000       0.30103             3.3
    2       1.414       0.47712             3.0
    3       1.732        .60206             2.9
    4       2.000        .69897             2.9
    5       2.236        .77815             2.9
    6       2.449        .84510             2.9
    7       2.646        .90309             2.9
    8       2.828        .95424             3.0
    9       3.000       1.00000             3.0
   10       3.162       1.04139             3.0
   11       3.317       1.07918             3.1
  100      10.000       2.00432             5.0
 1000      31.623       3.00043            10.5
transformation than the square root transformation. The long right tail of a distribution of the counts is pulled closer toward the center and the short left tail is pushed out farther away from the center. On the transformed scale, the variances of different populations become equal, while the means are still different, if they were different originally. The logarithmic transformation is not wholly unrelated to the square root transformation. For small counts, the square root of T is about three times as large as the logarithm of (T + 1). The evidence is shown in Table 23.4. For T greater than 10, the two scales drift farther and farther apart. When T = 100, the square root of T becomes about five times as large as the logarithm of (T + 1). This evidence is not presented to show that one transformation is as good as the other, but to show that these transformations are not effective for small counts. Yet for large counts each goes its own way, does its own job, and does it very well.
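The comparison of Table 23.4 can be regenerated in a few lines; this is an editorial sketch, not part of the original text.

    import math

    # For small counts the square root of T is roughly three times
    # log10(T + 1); for large counts the two scales drift apart.
    for T in [1, 2, 3, 4, 5, 10, 100, 1000]:
        ratio = math.sqrt(T) / math.log10(T + 1)
        print(T, round(ratio, 1))
    # The ratio stays near 3 up to about T = 10, then grows to 5.0
    # at T = 100 and 10.5 at T = 1000, as in Table 23.4.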
23.5 Normal Score Transformation
The normal score transformation is designed for ranked data. Many characteristics of various objects cannot yet be measured numerically, but they can be ranked in an orderly sequence. For example, in judging ice cream of various flavors, one may not be able to express his preference in a quantitative measure, but he can usually rank the different flavors. If he likes vanilla best, chocolate next, and strawberry third, the ranks 1, 2, 3 may be assigned to these flavors, with 1 being the best liked. Before the analysis of variance is carried out, these ranks are replaced by quantities called normal scores which are listed in Table 11, Appendix. The scores for the ranks 1, 2, and 3 are 0.85, 0.00, and −0.85. If there are 4 objects to be ranked, the scores are 1.03, 0.30, −0.30, and −1.03, with the highest score for the highest ranking.
The origin of these normal scores can be explained in terms of a sampling experiment. Table 1 of the Appendix shows 1000 random samples, each consisting of 5 observations, drawn from a normal population with mean equal to 50 and standard deviation equal to 10. The observations of a sample such as 46, 37, 62, 59, 40 may be arranged in a descending order, as
    62, 59, 46, 40, 37.
This is done for each of the 1000 samples. Then the average of the largest observations of the 1000 samples is computed. Similarly, the averages of the second, third, fourth, and fifth largest observations are also computed. These averages for the 1000 samples given in Table 1, Appendix, are
    61.750, 55.128, 50.376, 45.628, 38.945.
If the population mean were 0 instead of 50, the averages would be
    11.750, 5.128, 0.376, −4.372, −11.055.
If the population standard deviation were 1 instead of 10, the averages would be
    1.17, 0.51, 0.04, −0.44, −1.11.
These values are approximately equal to the normal scores for 5 ranked objects. The scores given in Table 11, Appendix, are
    1.16, 0.50, 0.00, −0.50, −1.16.
One will notice that the values computed from the sampling experiment agree with the tabulated scores very closely. The reason why the computed values are not exactly equal to the normal scores is that the tabulated scores are obtained from all possible samples rather than from 1000 samples. With their origin established, it can be seen that the normal scores are the averages of the ranked observations of all possible samples of size n, drawn from the normal population with mean equal to 0 and variance equal to 1. Then through the normal scores, any population is transformed into a normal population with mean equal to zero and variance equal to 1.

TABLE 23.5a

                     Type of Sugar
Judge       Sucrose    Puritose    Corn Syrup
Smith          1           3            2
Jones          1           2            3
Brown          2           1            3
White          1           2            3
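The sampling experiment that explains the origin of the normal scores is easy to repeat by simulation. The sketch below is an editorial illustration; the seed and the number of trials are arbitrary choices.

    import random

    # Estimate the normal scores for n = 5 by averaging the ordered
    # observations of many standard normal samples (the tabulated
    # scores come from all possible samples).
    random.seed(1)
    n, trials = 5, 100000
    sums = [0.0] * n
    for _ in range(trials):
        sample = sorted((random.gauss(0, 1) for _ in range(n)), reverse=True)
        for i, value in enumerate(sample):
            sums[i] += value
    print([round(s / trials, 2) for s in sums])
    # approximately [1.16, 0.50, 0.00, -0.50, -1.16]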
Food tasting serves as a good illustration of the application of the normal score transformation. A food processor may wish to know whether different types of sugar used will affect the taste of frozen fruit. He can let judges rank his products in the order of their preference for these products. The data acquired may be tabulated as shown in Table 23.5a. Then the ranks are transformed into normal scores as shown in Table 23.5b. The analysis of variance of the scores is given in Table 23.5c. If the 5% significance level is used, the conclusion is that no difference in preference among the three sugars used has been detected, because the F-value of 3.86 is less than 5.14, the 5% point of the F-distribution with 2 and 6 degrees of freedom.

TABLE 23.5b

                     Type of Sugar
Judge       Sucrose    Puritose    Corn Syrup    Total
Smith         .85        −.85          .00        .00
Jones         .85         .00         −.85        .00
Brown         .00         .85         −.85        .00
White         .85         .00         −.85        .00
Total        2.55         .00        −2.55        .00
The computing method for the analysis of variance of the normal scores is very simple. Somehow, this extreme simplicity often creates undue confusion. This is the reason why the details of calculation for the food tasting example shown in Table 23.5c are arranged in the conventional form of a randomized block experiment, with the judges as replications. Actually most of the steps could be omitted. The fact that the grand total is equal to zero eliminates most of the subtractions. Since all the replication totals are equal to zero, the component of replication is completely eliminated. The total sum of squares is nothing but 8(.85)². As to the numbers of degrees of freedom for the various components, they are the same as those of a randomized block experiment. However, since the component of replication is eliminated, the number of degrees of freedom for the total SS is correspondingly reduced. In using the normal score transformation, duplicate ranks are permitted.

TABLE 23.5c

Preliminary Calculations
(1)             (2)         (3)          (4)              (5)
Type of         Total of    No. of       Observations     Total of Squares
Total           Squares     Items        per Squared      per Observation
                            Squared      Item             (2) ÷ (4)
Grand                 0         1           12                  0
Judges                0         4            3                  0
Sugar           13.0050         3            4              3.25125
Observation      5.7800        12            1              5.78000

Analysis of Variance
Source of       Sum of       Degrees of     Mean
Variation       Squares      Freedom        Square       F
Judges                0          0
Sugar           3.25125          2          1.62562     3.86
Error           2.52875          6           .42146
Total           5.78000          8
462
SOME COMMONLY USED TRANSFORMATIONS
Ch.23
If a judge likes two products equally well, he may split the ranks. Instead of ranking two products 2 and 3, he can rank both of them as 2.5. In transforming the split ranks, the average of the two corresponding normal scores may be used. For example, among three products a judge may like one of them best, but has no preference between the other two. The ranks for the three products should be 1, 2.5, 2.5 and the normal scores are 0.85, −0.425, and −0.425, where −0.425 is the average of 0.00 and −0.85.
The application of the normal score transformation is not limited to one-factor experiments. It can be used on factorial experiments just as well. For example, each of the three sugars may be used in four different quantities on the frozen fruit. Then the experiment becomes the factorial type. Since one of the two factors is quantitative, regression may also be used. In short, all the methods dealing with normal populations may be used on the transformed scores. However, in food tasting this method is not so useful as it seems. For a 3 × 4 factorial experiment, a judge has to rank 12 products. It is generally known among food technologists that few people can rank more than four products effectively. Therefore, the limitation of this method is not because of the transformation itself, but because of the difficulty in obtaining the ranks in the first place.
The application of the normal score transformation is by no means limited to food tasting. Preference for different types of clothing, preference for different kinds of sports, and preference for different models or makes of cars can be ranked. To cite a few more examples, students may rank their preference for different courses, different extra-curricular activities, and even different instructors. As long as the data are expressed in ranks, the normal score transformation can be used. In fact, this transformation can also be used on quantitative data. The application of the normal score transformation on quantitative data is discussed in Section 24.4.
23.6 Summary and Remarks
Many users of statistics look with disfavor upon transformations which, they feel, unjustifiably twist their data of natural measurements. Sometimes they even feel that there is something crooked about the whole idea of transformation. It is true that the transformation is unnatural; but so are all kinds of measurements. There is no natural way of measuring any character. When one is familiar with a certain kind of measurement, he feels that it is "natural." When one encounters a new measurement, he may react violently toward it. The acidity of a solution is usually expressed in terms of pH value, which is the negative logarithm of the concentration of hydrogen ion in a solution. If the concen-
tration itself were more commonly used, the pH value would be considered the logarithmic transformation of a "natural" measurement. Since the pH value is actually more commonly used, the unnatural logarithmic transformation becomes a natural measurement; while the original concentration of the hydrogen ion becomes an awkward measurement. Therefore, the feeling of "naturalness" in choice of scale of measurement is a matter of habit rather than anything else.
One does not have to go to science, such as chemistry, to find an example of transformation. Transformation is actually used in everyday life. For example, the price of certain articles may be expressed as 2 for a dollar, 4 for a dollar, and 8 for a dollar. The same prices can be expressed differently, such as 50 cents apiece, 25 cents apiece, and 12½ cents apiece. This is reciprocal transformation, through which the observations 2, 4, and 8 are transformed into 50, 25, and 12.5. Transformation or no transformation, a more expensive article is still more expensive before or after the transformation. Another example of daily use of transformation is in expressions of speed. Suppose a man can run a mile in five minutes. His speed can also be expressed as 12 miles an hour. Gasoline consumption in a car is another familiar example. In the United States, the consumption rate is usually expressed as the number of miles per gallon, such as 16 miles per gallon and 20 miles per gallon. This same rate of gas consumption can also be expressed as 0.0625 and 0.05 gallons per mile, or 6¼ and 5 gallons per 100 miles. In whatever form the gas consumption is expressed, the car with a more economical rate of gas consumption still maintains this characteristic. The data are not twisted by a transformation.
Various transformations are used to make the assumptions of the analysis of variance come true. The basic technique of developing a transformation is to find a method which will equalize the variances. Fortunately, when the variances are equalized, the other assumptions are usually also satisfied. One important characteristic of all transformations is that the relative rank order of the observations, and consequently of the means of the observations, is still maintained. The number 4 is larger than 2. The square root of 4 is larger than the square root of 2. The logarithm of 4 is also larger than the logarithm of 2. If two numbers are equal, their transformed values are still equal. Therefore, a transformation does not twist the population means around. The function of the analysis of variance on the transformed observations is still to compare the means, which are by now expressed in a different scale. Of course, a transformation not only enables one to use the analysis of variance, but also enables him to use many other methods associated with the analysis of variance, such as the individual degree of freedom and the multiple range test (Chapter 15).
The choice of an appropriate transformation often presents a difficult problem. The normal score transformation is easy to use, because the nature of the data enables one to decide that this transformation is an appropriate one to use. But the use of the square root or the logarithmic transformation often becomes a problem. Unless a great deal of data are accumulated under a uniform environment, it is frequently hard to decide which one of the two, or any other transformation, is an appropriate one to use. The use of the angular transformation is not easy either. Even if the population is binomial, there is no assurance that the sample sum T will follow the binomial distribution, because the samples may not be random or the samples may not be drawn from the same population.
There are many other transformations besides the ones presented in this chapter. They are usually associated with particular fields of application. In psychology, the test scores are usually normalized. The original data are called raw scores and the transformed ones are called standard scores. Since these kinds of transformations are discussed so extensively in books on psychology, they are omitted here. In the field of plant breeding, the experimenters usually take advantage of the central limit theorem (Theorem 5.2a) to normalize the population. The yield of individual corn plants may not be normally distributed, but the sum or mean of the yield of five plants will follow the normal distribution much more closely. This is the reason, or rather one of the reasons, why the yield of a single plant is seldom recorded and regarded as an observation by plant breeders.
Theoretically, any population can be transformed into a normal population if the original population is known. But in practice it is very hard to determine what the original population is. Therefore, there is a definite need for statistical methods which do not depend on any knowledge of the population one is dealing with. Such methods are called distribution-free methods or non-parametric methods. These methods are discussed in the following chapter.
EXERCISES
(1) Seventy-two subjects with normal eyesight were divided at random into 9 groups of 8 subjects each. All were required to report the reading of a mock dial which was exposed for a brief time. The lighting, distance to the dial, and viewing time were the same for all subjects. However, the effects of (A) dial size and (B) angular distance between dial calibrations were appraised. The sizes of the dials were 1, 2½, and 5 inches and the angles were 10, 25, and 50 degrees. Each group of 8 subjects was assigned, at random, to one of the 9 treatment combinations. Each subject made 20 readings. The percentages of erroneous readings made by the 72 subjects are given in the following table:
                                    Dial Size
Angle         1"                         2½"                       5"
10°    ... 10 15 15 20 25 25 50    0 0 0 5 5 10 15 25      0 5 5 5 15 15 15 25
25°    0 0 10 15 15 15 15 30       0 0 5 5 10 20 20 25     0 5 5 10 15 15 15 20
50°    0 5 5 5 5 15 15 25          5 5 5 10 10 15 15 15    10 10 10 10 15 15 20 20

Use the angular transformation and test the main effects and interaction, at the 5% level.
Use the angular transformation and test the main effects and interaction, at the 5% level. (2) A field experiment was conducted to evaluate the effectiveness of two kinds of fungicides in controlling a plant disease. A randombred block design with 10 replications was used. Two fungicides plus the control make 3 treabDente. Each of the 30 plots consisted of 50 plants. The numbers of diseased plants observed in the plots are given in the following table: FlIIlgiclde Replication
1 2 3 4 5 6 7 8
None
A
B
40 32 42
12
38
11 9
9 8 17 10 8 3 7 7
28
m
9 15 4 8
9
36 39 41
11
9
10
36
7
8
6
466
SOME COMMONLY USED TRANSFORMATIONS
Ch.23
Use the angular transformation and test (a) the hypothesis that the fungicides are completely ineffective and (b) the hypothesis that the two fungicides are equally effective, by individual degrees of freedom, at the 5% level.
(3) One hundred and fifty house flies were divided, at random, into 15 groups of 10 flies each. Each group of 10 flies was kept in a bottle. The 15 bottles were divided, at random, into three sets, each of which consisted of 5 bottles. Three different insecticides were applied to the three sets of bottles. The numbers of dead flies in the 15 bottles are given in the following table:
            Insecticide
Bottle     A      B      C
1          1      5     10
2          2      4      9
3          2      6     10
4          3      4      7
5         ...     5      8
Use the angular transformation and test the hypothesis that the three insecticides are equally effective in killing flies, at the 5% level.
(4) Use the square root transformation on the data of Exercise 8, Chapter 12, and test the same hypothesis at the same significance level. Is the conclusion changed by the transformation?
(5) Use the square root transformation on the data of Exercise 9, Chapter 12, and test the same hypothesis at the same significance level. Is the conclusion changed by the transformation?
(6) Use the square root transformation on the data of Exercise 10, Chapter 12, and test the same hypothesis at the same significance level. Is the conclusion changed by the transformation?
(7) Use the logarithmic transformation on the data of Exercise 10, Chapter 14, and test the same hypothesis at the same significance level. Is the conclusion changed by the transformation?
(8) Use the logarithmic transformation on the data of Exercise 16, Chapter 18, and test the same hypothesis at the same significance level. Is the conclusion changed by the transformation?
(9) Since one of the main criteria of performance of a razor blade is its user's satisfaction with it, and since differences in physical properties are difficult to measure, a group of 24 judges who regularly used blade razors were asked to evaluate samples of 10 brands. The brand marks were obscured, but each was identified by a code letter from A to J so that the subjects could label them. Each subject tried them all in his regular shaving routine for some time, and then
each ranked them in order of preference. The data are given in the following table:
Judge                        Rank Awarded
            1    2    3    4    5    6    7    8    9    10

[The body of this table, giving each of the 24 judges' rank order of the 10 brands A to J, is illegible in the source.]
Use the normal score transformation and test the hypothesis that the 10 brands of blades are equally good, at the 5% level. If the result is significant, rank the brands by the new multiple range test (Section 15.5).
(10) Four flavors of ice cream were evaluated by 10 judges. Each judge ranked the flavors 1, 2, 3, or 4, with 1 being the most preferred. The data are given in the following table:
             Flavor
Judge     A    B    C    D
 1        2    1    4    3
 2        1    2    4    3
 3        2    1    3    4
 4        3    2    1    4
 5        2    4    3    1
 6        2    4    1    3
 7        1    4    3    2
 8        2    1    3    4
 9        2    3    4    1
10        3    2    4    1
Use the normal score transformation and test the hypothesis that the four flavors of ice cream are equally preferred, at the 5% level. If the result is significant, use the new multiple range test (Section 15.5) to rank the flavors.
QUESTIONS
(1) What are the purposes of using a transformation?
(2) What is the angular transformation? Where is it used?
(3) What is the square root transformation? Where is it used?
(4) What is the logarithmic transformation? Where is it used?
(5) How are the angular and the square root transformations related?
(6) How are the square root and logarithmic transformations related?
(7) Between the square root and the logarithmic transformations, which one is more drastic?
(8) What is the normal score? What kind of data may require the normal score transformation?
REFERENCES
Bartlett, M. S.: "The Use of Transformations," Biometrics, Vol. 3 (1947), pp. 39-52.
Blom, Gunnar: "Transformations of the Binomial, Negative Binomial, Poisson and χ² Distributions," Biometrika, Vol. 41 (1954), pp. 302-316.
Fisher, R. A. and Yates, F.: Statistical Tables for Biological, Agricultural and Medical Research, Oliver and Boyd, Edinburgh, 1938.
Stevens, W. L.: "Tables of the Angular Transformation," Biometrika, Vol. 40 (1953), pp. 70-73.
CHAPTER 24
DISTRIBUTION-FREE METHODS

A substantial part of present-day statistical methods deals with normal distributions. If the distributions are known to be normal or approximately normal, these methods can be used directly. If the distributions are non-normal, but their shapes are known, these methods still can be used through appropriate transformations (Chapter 23). Unfortunately, however, at times the distributions are far from normal and their exact shapes are unknown. Then the statistical methods based on normality, such as the analysis of variance, are not of much use. To cope with this situation, a host of methods has been developed, mostly during recent years. These methods are called distribution-free or non-parametric methods. As the name implies, the knowledge of the shapes of the distributions is not a prerequisite for using these methods. There are many distribution-free methods. For a given hypothesis, there are frequently several parallel methods available. In this chapter, only one is presented for each purpose. The selection is based on the number of new principles involved in the methods. The methods presented here are those which need practically no introduction of new principles. More specifically, this chapter deals essentially with the applications of the chi square tests on quantitative data.
24.1 Median
The 50% point of a distribution is called the median of that distribution. When the N observations of a population are arranged in an ascending order according to their magnitudes, the median is the middle observation of this arrangement, if N is odd; it is the average of the two middle observations, if N is even. For example, a population consists of the following 9 observations:
    31, 59, 58, 37, 12, 86, 77, 95, 89,
which may be rearranged as follows:
    12, 31, 37, 58, 59, 77, 86, 89, 95.
The fifth largest observation 59 is the median. If the population consists of 8 observations, such as
    2, 2, 5, 0, 4, 8, 5, 9,
which may be rearranged as
    0, 2, 2, 4, 5, 5, 8, 9,
the median is 4.5, the average of the two middle observations 4 and 5. Strictly speaking, any value between 4 and 5 could be called the median, which divides the N observations into the upper and lower halves.
So far, the median is defined in terms of a population. Actually, the median of a sample is defined in the same way. The sample median is an estimate of the population median. Furthermore, the use of the median is not limited to a population or a sample. The distribution of a statistic also has a median. The 50% point of the F-distribution is the median of F. The 50% point, 0, of the t-distribution is the median of t.
The median has several important properties. If a distribution is symmetrical in shape, the mean is equal to the median. For example, 50% of the observations of a normal distribution fall below the mean μ and the other 50% fall above it. Therefore, for a normal distribution, the mean is equal to the median. This is also true for Student's t-distribution which is symmetrical in shape. Both mean and median of t are equal to 0. Another important property of the median is that it is transformable. For example, the median of the following 5 observations,
    14, 15, 26, 100, 125,
is 26 and the mean is 56. Suppose the square root transformation is used: the corresponding transformed observations are
    3.74, 3.87, 5.10, 10.00, 11.18.
By the transformed scale, the median is 5.10, which is the square root of the original median 26; but the mean is 6.78, which is not the square root of the original mean 56. The fact that the median is transformable is true for any transformation. For example, if the logarithmic transformation is used, the logarithm of the original median is the median of the transformed observations; but the logarithm of the mean is not necessarily the mean of the transformed observations.
Like the mean, the median is a descriptive measure of the location of a distribution (Fig. 3.1b). Even though it is introduced late in this book, the median is actually the center of discussion for practically every method presented. For a normal population, the mean is equal to the median. Therefore, the analysis of variance and regression and any other method dealing with the means of normal populations actually deal with the medians in disguise. The transformations are also disguises for the median. When a population is transformed into a normal population, the mean and median of the transformed observations become the same. As shown above, the mean of the transformed observations is not the transformed mean of the original observations; but the median of the transformed observations actually is the transformed median of the original observations. Therefore, the comparisons among the means of the transformed observations are the
comparisons among the transformed medians, and consequently among the original medians themselves. Thus, when one uses a transformation accompanied by the analysis of variance, he is actually making comparisons among the medians on the original scale. For the distribution-free methods, the median is used directly. It no longer wears any disguise.
The reason for neglecting the sample median heretofore is that the sample mean has a smaller standard error (Sections 5.3 and 5.5) than the sample median, as an estimate of μ of a normal population. Therefore, the sample mean is preferred. Now, the distribution-free methods deal with any kind of population, which may not be normal or may not even be symmetrical. Then the mean and the median of a population may not be the same. Therefore, the sample median comes into the open and is no longer overshadowed by the sample mean. The fact that the sample mean has a smaller standard error than the sample median as an estimate of μ of a normal population can be verified by a sampling experiment. Table 1, Appendix, shows 1000 random samples, each consisting of 5 observations drawn from the normal population with mean equal to 50 and variance equal to 100. The variance of the 1000 sample means is approximately equal to σ²/n (Theorem 5.3) or 100/5 or 20; while that of the 1000 sample medians is approximately equal to 29. Consequently, for a given sample size, the sample mean is a more accurate estimate of μ of a normal population than the sample median.
The median is useful in transforming any population into a binomial population. Some of the distribution-free methods are based on such a transformation. The observations of a population can be divided into two groups depending on whether they fall above or below the median. An observation which falls above the median may be considered a success; otherwise it is a failure. In this way, any population can be transformed into a binomial population. The use of such a transformation is shown in the following two sections.
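The sampling comparison just described can be repeated by simulation. This is an editorial sketch; the seed is an arbitrary choice, and the simulated samples stand in for Table 1 of the Appendix.

    import random, statistics

    # 1000 samples of 5 from a normal population with mean 50 and
    # variance 100; compare the variances of the sample means and
    # sample medians.
    random.seed(1)
    means, medians = [], []
    for _ in range(1000):
        sample = [random.gauss(50, 10) for _ in range(5)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    print(round(statistics.variance(means), 1))    # near 100/5 = 20
    print(round(statistics.variance(medians), 1))  # near 29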
24.2 Hypothesis Concerning Median
The test of the hypothesis that the population median is equal to a given value can be accomplished by transforming the population into a binomial population. The n observations of a sample can be classified into two groups. All the observations above the hypothetical population median are considered as successes and those below as failures. Then, in terms of the transformed observations, the hypothesis becomes that π = 0.5. In testing this hypothesis, either the u-test or the χ²-test may be used (Section 21.4). Strictly speaking, this method is applicable only to the case in which none of the n observations is equal to the hypothetical population median. But in practice the method still can be used, even though some of the observations are equal to the hypothetical population median. Such observations may be eliminated from the sample. The sample size should be at least 10 after the eliminations (Section 21.3). For example, a random sample consists of the following 12 observations:
    26, 28, 27, 21, 23, 29, 28, 24, 21, 27, 25, 24,
which may be rearranged as
    21, 21, 23, 24, 24, 25, 26, 27, 27, 28, 28, 29.
The hypothesis being tested is that the population median is equal to 28. There are 9 observations less than 28 and only 1 observation greater than 28 in the sample. The sample size is, therefore, reduced to 10. In terms of the χ²-test, the statistic is
    χ² = (9 − 5)²/5 + (1 − 5)²/5 = 6.4,    (1)

with 1 degree of freedom. In terms of the u-test, the statistic is
"-
T-""
9-5
",,;;(1 - ,,) -- YlO(.5)(.5) • 2.53.
(2)
>t
Of course, it is expected that'" (Theorem 7.6). If the 5" significance level is ueed, the cODcluioD is that the populatioD median is greater than 28. As to the fUDctioD of these testa, they serve the same purpose .. the Hest which testa the hypothesis that". - lie (SectioD 8.4). The oDly differeDce is that the "or >t-test described here is applicable to any populatioD iDstead of to a Dormal populatioD oDly. 24.3 c.o.ple&ely R.doa.ud. Experta.,
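The whole test takes only a few lines of code. The sketch below is an editorial illustration of the example just worked; the variable names are invented.

    import math

    # Median test on the 12 observations above, hypothetical median 28.
    y = [26, 28, 27, 21, 23, 29, 28, 24, 21, 27, 25, 24]
    below = sum(1 for v in y if v < 28)      # 9
    above = sum(1 for v in y if v > 28)      # 1; the two 28's are dropped
    n = below + above                        # 10
    e = n / 2                                # expected frequency if pi = 0.5
    chi_sq = (below - e) ** 2 / e + (above - e) ** 2 / e
    u = (below - e) / math.sqrt(n * 0.5 * 0.5)
    print(chi_sq, round(u, 2), round(u * u, 2))   # 6.4, 2.53, 6.4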
24.3 Completely Randomized Experiment
To test the hypothesis that the medians of k populations are equal, the binomial transformation may be used. First the median of the pooled observations of the k samples is determined. Then all the observations falling above this pooled median are considered successes and the rest failures. After the transformation, the hypothesis becomes that the k π's are equal (Sections 21.7 and 21.10). Strictly speaking, this hypothesis is not exactly equivalent to the one that the k population medians are equal. The reason is that the observations which are equal to the pooled median of the k samples are classified, with those below the median, as failures. But for practical purposes, these two hypotheses may be considered the same. After the transformation, all the methods dealing with binomial populations may be used. If some of the percentages of successes are too close to 0 or 100, the binomial data can be transformed one step farther into normal data by the angular transformation. Through the two successive transformations, a bridge is built between the distribution-free methods and the normal-distribution methods.
TABLE 24.3a

                       Treatment No.
       1                     2                     3
61 60 64 53 53        60 67 61 56 54        55 51 60 51 52
54 60 65 52 51        61 70 56 59 55        54 71 55 51 58
58 59 60 57 58        64 56 51 51 55        54 54 62 56 55
53 54 58 60 55        56 56 60 54 58        54 54 62 53 54

TABLE 24.3b

                                      Treatment
Frequency                         1      2      3     Total
Greater than median              12      9      5      26
Less than or equal to median      8     11     15      34
Sample size                      20     20     20      60
The interpretation of the data is not at all complicated by these transformations. A treatment which has a large average angle also has a large percentage of observations above the pooled median. This fact, in turn, implies that the treatment has a large median.
As an illustration of the binomial transformation, the data of Table 24.3a may be used. The three treatments given in the table are actually three random samples, each consisting of 20 observations, drawn from a normal population, with the lower half of the distribution cut off. Therefore, the population from which these samples are drawn is decidedly not normal and not even symmetrical. Now suppose that the source of these samples is unknown. The hypothesis that the three population medians are equal is tested. The first step is to rearrange the observations according to their magnitudes and to determine the pooled median of the three samples. Among the 60 observations given in Table 24.3a, 26 are above 56; 6 are equal to 56; 28 are below 56. Therefore, the pooled median is 56. The next step is to count the number of observations above 56 for each of the three samples. These frequencies are shown in Table 24.3b. The among-sample SS is (Equation 5, Section 21.7)

    (12)²/20 + (9)²/20 + (5)²/20 − (26)²/60 = 1.2333;    (1)

the value of ȳ(1 − ȳ) is equal to

    (26/60)(34/60) = 0.2456;    (2)
the value of χ² is

    χ² = 1.2333/.2456 = 5.02,    (3)
with 2 degrees of freedom. If the 5% significance level is used, the conclusion is that no difference has been detected among the three population medians, because 5.02 is less than 5.99, the 5% point of the χ²-distribution with 2 degrees of freedom. Of course, the hypothesis being tested here is a general one. For more specific hypotheses, the individual degrees of freedom and linear regression may be used. Since the detailed description of these methods is given in Section 21.9, it is omitted here.
The binomial data can be transformed further, if necessary, into normal data by the angular transformation. The samples given in Table 24.3a are drawn from the same population. Consequently, the percentages of successes, or the percentages of the observations exceeding the pooled median, are all fairly close to 50. If the samples were drawn from different populations, some of the percentages of successes might be close to 0 or 100. Should this occur, the angular transformation might be used. The advantage of such an additional transformation is that it enables one to take advantage of the numerous methods developed for normal populations, including the multiple range test (Section 15.5).
The method presented in this section is basically the binomial transformation. Therefore, in using this method, large samples are required. The sizes of the k samples should all be equal to or larger than 10 (Section 21.3).
24.4 Randomized Block Experiment
The normal score transformation (Section 23.5) may be used in a randomized block experiment with k treatments and n replications. The observations in a replication may be ranked according to their magnitude and subsequently replaced by the corresponding normal scores. Then the analysis of variance may be used on the transformed observations. The procedure of computation may be illustrated by the data given in Table 24.4a. The original observations of the 5 treatments and 10 replications are shown in the upper half of the table, while the transformed ones are shown in the lower half. The details of the analysis of variance are given in Table 24.4b. It should be noted that the median of every replication is replaced by 0. A positive or negative normal score indicates that the original observation is above or below its replication median. Therefore, the hypothesis being tested is that the medians of the k treatments are equal.
TABLE 24.4a

Original observations of the 5 treatments and 10 replications (upper half) and their normal scores (lower half). [The alignment of the individual entries could not be recovered from the scan.] The treatment totals of the normal scores are −5.14, −1.82, 2.66, 3.98, and 0.32.
TABLE 24.4b

Preliminary Calculations

(1)             (2)           (3)              (4)                 (5)
Type of       Total of      No. of Items   Observations per   Total of Squares per
Total         Squares       Squared        Squared Item       Observation (2) ÷ (4)
Treatment      52.7504          5               10                  5.27504
Observation    31.9120         50                1                 31.91200

Analysis of Variance

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square     F
Treatment                  5.27504              4              1.3188     1.78
Error                     26.63696             36               .7399
Total                     31.91200             40
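The transformation itself may be sketched as follows. This Python fragment is an editorial illustration, not part of the original text; the function name is assumed and ties are not treated. The five scores −1.16, −.50, .00, .50, and 1.16 are those appearing in the lower half of Table 24.4a, with the replication median replaced by zero.

```python
# Normal scores used when each replication contains 5 observations.
SCORES_FOR_FIVE = [-1.16, -0.50, 0.00, 0.50, 1.16]

def normal_scores(replication):
    # Replace each observation of one replication by the normal score
    # corresponding to its rank within the replication.
    order = sorted(range(len(replication)), key=lambda j: replication[j])
    out = [0.0] * len(replication)
    for rank, j in enumerate(order):
        out[j] = SCORES_FOR_FIVE[rank]
    return out

# An illustrative (hypothetical) replication of 5 observations:
print(normal_scores([46, 50, 69, 48, 44]))   # [-0.5, 0.5, 1.16, 0.0, -1.16]
```

The analysis of variance is then applied to the transformed observations exactly as in Table 24.4b.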
The positive and negative signs of the normal scores also suggest that the binomial transformation may be used.
TABLE 24.4c

Frequency                            Treatment                   Total
                                1     2     3     4     5
Greater than median             2     3     6     5     4          20
Less than or equal to median    8     7     4     5     6          30
No. of replications            10    10    10    10    10          50
Indeed, if the numbers of replications and treatments are both large, the binomial transformation is an alternative to the normal score transformation. The observations which exceed their respective replication medians are considered successes; those which do not are considered failures. Then, for each of the k treatments, the numbers of successes and failures can be obtained. Such a 2 × 5 contingency table, which is obtained from the data of Table 24.4a, is given in Table 24.4c. The value of χ² is (Section 21.7)

    χ² = [(2² + 3² + 6² + 5² + 4²)/10 − (20)²/50] / .24 = 1/.24 = 4.17,    (1)
with 4 degrees of freedom. The binomial transformation for a randomized block experiment can be used only if both k and n are large. If k, the number of treatments, is small, the χ²-value needs to be corrected. The corrected value is

    χ²c = ((k − 1)/k) χ².    (2)
In terms of the example under consideration, the corrected χ² is

    χ²c = (4/5)(4.17) = 3.34    (3)
with 4 degrees of freedom. This correction term originated from the relation between the chi-square test of independence and the analysis of variance (Section 21.8). The χ²-test value for a completely randomized experiment is approximately equal to the treatment SS divided by the total mean square and exactly equal to the treatment SS divided by p(1 − p) (Table 21.10). The total mean square is the total SS divided by kn − 1, and p(1 − p) is the same SS divided by kn, the total number of observations (Section 21.8). Since the ratio of kn − 1 to kn is almost equal to 1 for large values of kn, the conclusions reached through the analysis of variance and the χ²-test are almost always the same.
TABLE 24.4d

Transformed (0 and 1) observations of the 5 treatments and 10 replications. Every replication total is 2; the treatment totals are 2, 3, 6, 5, and 4; the grand total is 20. [The individual entries could not be recovered from the scan.]
TABLE 24.4e

Preliminary Calculations

(1)             (2)           (3)              (4)                 (5)
Type of       Total of      No. of Items   Observations per   Total of Squares per
Total         Squares       Squared        Squared Item       Observation (2) ÷ (4)
Grand           400             1               50                    8
Replication      40            10                5                    8
Treatment        90             5               10                    9
Observation      20            50                1                   20

Analysis of Variance

Source of Variation   Sum of Squares    DF    Mean Square
Replication                  0            0
Treatment                    1            4
Error                       11           36
Total                       12           40        .30
p(1 − p)                    12           50        .24

    χ² = 4.1667        χ²c = 3.3333
However, with the randomized block experiment, the situation is different. The difference can be shown through the example given in Table 24.4a. Basically, what the binomial transformation does is to replace all observations above their respective replication medians by 1 and the others by 0. The data thus transformed are given in Table 24.4d. The analysis of variance of these transformed observations is given in Table 24.4e. Since the replication totals are always the same, the replication SS is always equal to zero.
As a result, the 9 or n − 1 degrees of freedom for replication disappear completely. Consequently, the number of degrees of freedom for the total SS is correspondingly reduced from 49 or kn − 1 to 40 or (k − 1)n. The uncorrected chi square is the treatment SS divided by p(1 − p), which is the total SS divided by 50 or kn. The corrected chi square χ²c is the treatment SS divided by the total mean square. Therefore, the correction term is simply the ratio of the total number of degrees of freedom to the total number of observations; that is,

    χ²c / χ² = (k − 1)n / kn = (k − 1)/k.
As long as k, the number of treatments, is large, this correction need not be used. However, when the number of treatments is small, such as k = 2, this correction is absolutely essential (Section 24.5).
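The whole procedure, including the correction, fits in a few lines. The following Python sketch is an editorial illustration, not part of the original text, and the function name is assumed; it computes the uncorrected χ² as the treatment SS divided by p(1 − p) and then applies the factor (k − 1)/k.

```python
def block_median_chi_square(successes, n, k):
    # successes[j] = number of replications (out of n) in which treatment j
    # exceeds its replication median; k = number of treatments.
    grand = sum(successes)
    ss = sum(t * t for t in successes) / n - grand * grand / (k * n)
    p = grand / (k * n)
    chi_sq = ss / (p * (1 - p))            # uncorrected, k - 1 d.f.
    return chi_sq, (k - 1) / k * chi_sq    # (uncorrected, corrected)

# Table 24.4c: 2, 3, 6, 5, and 4 successes in n = 10 replications, k = 5.
chi_sq, corrected = block_median_chi_square([2, 3, 6, 5, 4], 10, 5)
print(round(chi_sq, 4), round(corrected, 4))   # prints 4.1667 3.3333
```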
24.5 Sign Test
The sign test is a method which deals with the randomized block experiment with 2 treatments and n replications (Sections 8.7 and 14.6). Since the general method for the randomized block experiment is presented in the preceding section, there is little need to emphasize the two-treatment case. However, the computing method for this case is simplified to such an extent that the sign test deserves some special attention. The use of the sign test may be illustrated by the data of a randomized block experiment with 2 treatments and 12 replications given in Table 24.5a.
TABLE 24.5a

Rep.    Treatment     Sign of        Rep.    Treatment     Sign of
No.      1      2    Difference      No.      1      2    Difference
 1      18     37        −            7      30     86        −
 2       4     56        −            8      84     81        +
 3      18     52        −            9      75     66        +
 4      76     25        +           10      41      5        +
 5       1     71        −           11      51     91        −
 6      35     83        −           12      50     75        −
A plus or minus sign is given to each of the 12 replications, depending on whether the observation of the first treatment is greater or less than the observation of the second treatment. If there is no difference between the effects of the two treatments, there should be approximately the same number of plus and minus signs. If the effect of the first treatment is greater than that of the second treatment, there is an excess of plus signs; otherwise, a deficit of plus signs. Therefore, the hypothesis that the two treatment effects are equal is the same as the hypothesis that the relative frequency of plus signs is equal to 0.5, or π = 0.5.
TABLE 24.5b

Frequency               Treatment        Total
                        1       2
Greater than median     4       8          12
Less than median        8       4          12
No. of replications    12      12          24
Here again, a distribution-free method is essentially the binomial transformation. To test the hypothesis that π = 0.5, either the u-test or the χ²-test (Section 21.4) may be used, provided that the number of replications is greater than or equal to 10 (Section 21.3). For the example given in Table 24.5a, there are 4 plus signs and 8 minus signs. By the test of goodness of fit, the two hypothetical frequencies are both equal to nπ or n/2 or 6. Therefore,

    χ² = (4 − 6)²/6 + (8 − 6)²/6 = 8/6 = 1.33    (1)
with 1 degree of freedom. By the u-test, the statistic is
u ..
4-6
T-nTl
ylnnO - TI)
=
ylI2{.S)(.5)
-2 - -
{3
.. -1.15.
(2)
Of course, it is expected that u² = χ² (Theorem 7.6). The u-test is a two-tailed test; the χ²-test is a one-tailed test. If the 5% significance level is used, the conclusion is that no difference between the two treatment effects is detected, because 1.33 is less than 3.84 and −1.15 is greater than −1.96. Strictly speaking, the sign test is applicable only to the case in which all the n signs are either positive or negative. But in practice the two observations of a replication are sometimes equal. When this occurs, such a replication may be excluded from the test.
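The simplicity of the sign test is evident from the following Python sketch, an editorial illustration of the procedure rather than part of the original text (the function name is assumed). The pairs listed reproduce the 12 replications of Table 24.5a.

```python
from math import sqrt

def sign_test(pairs):
    # Tied pairs are excluded, as noted above.  Returns the chi-square
    # statistic (1 degree of freedom) and the u statistic.
    plus = sum(1 for a, b in pairs if a > b)
    minus = sum(1 for a, b in pairs if a < b)
    n = plus + minus
    e = n / 2                              # hypothetical frequency n(pi)
    chi_sq = (plus - e) ** 2 / e + (minus - e) ** 2 / e
    u = (plus - e) / sqrt(n * 0.5 * 0.5)
    return chi_sq, u

# The 12 replications of Table 24.5a, as (treatment 1, treatment 2):
pairs = [(18, 37), (4, 56), (18, 52), (76, 25), (1, 71), (35, 83),
         (30, 86), (84, 81), (75, 66), (41, 5), (51, 91), (50, 75)]
chi_sq, u = sign_test(pairs)
print(round(chi_sq, 2), round(u, 2))       # prints 1.33 -1.15
```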
The χ²-value of the sign test is exactly the corrected chi square for the randomized block experiment with 2 treatments and n replications of the preceding section. This relation can be shown through the example given in Table 24.5a. The 2 × 2 contingency table showing the numbers of observations greater than or less than their respective replication medians is given in Table 24.5b. In testing the hypothesis that the two treatment medians are equal,

    χ² = [4²/12 + 8²/12 − (12)²/24] / [(.5)(.5)] = 8/3.    (3)
Since the correction term is equal to (k − 1)/k or 1/2, the corrected chi square is

    χ²c = (1/2)(8/3) = 4/3 = 1.33,    (4)

which is the same value given in Equation (1).
This relation can also be shown algebraically. The median of a replication is the average of the two observations in that replication. A plus sign implies that the first observation is greater than the second one in that replication (Table 24.5a). Therefore, the number of plus signs is the number of observations greater than their replication medians for the first treatment and also the number of observations less than their replication medians for the second treatment. Therefore, the 2 × 2 contingency table is as follows:

                        Treatment 1    Treatment 2    Total
Greater than median          T            n − T         n
Less than median           n − T            T           n
Total                        n              n          2n
The letter T in the above table is the number of plus signs. By the sign test,

    χ² = u² = [(T − n/2) / √(n(.5)(.5))]² = (2T − n)²/n.    (5)

By the method for the randomized block experiment,

    χ²c = [T²/n + (n − T)²/n − n²/(2n)] / (1/2),    (6)
which can be reduced to the same expression given in Equation (5).
The sign test is also closely related to the normal score transformation. By the normal score transformation, the larger one of the two observations in a replication is replaced by 0.56 (Table 11, Appendix) and the smaller one is replaced by −0.56. Then the t-test may be used on the n differences to test the hypothesis that the population mean of the differences is equal to zero (Section 8.7); or the transformed scores may be analyzed as a randomized block experiment with 2 treatments and n replications. The two methods are equivalent (Section 14.6). For the example of Table 24.5a,

    t = −1.1726    (7)
with 11 degrees of freedom;

    F = 1.3750    (8)
with 1 and 11 degrees of freedom. The numerical values of t and F are slightly greater than those of u and χ² of the sign test (Equations 1 and 2). But the conclusions reached through the sign test and the normal score transformation are usually the same.
The similarity of the sign test and the normal score transformation stems from the fact that the normal score transformation also amounts to the binomial transformation when the number of treatments is two. If the observation in the first treatment is greater than that in the second treatment, the difference between the two normal scores is 0.56 − (−0.56) or 2(.56). Similarly, if the first observation is less than the second one, the difference is −2(.56). So long as there are no ties, the difference between a pair of normal scores can only assume the value of either 2(.56) or −2(.56). Therefore, the normal score transformation, like the sign test, also amounts to the binomial transformation. The difference between the sign test and the normal score transformation stems from the fact that the sign test uses the hypothetical population variance π(1 − π) or (.5)(.5) or 0.25 as the divisor in obtaining the χ²-value, while the normal score transformation uses the estimated variance

    s² = (T/n) · (n − T)/(n − 1)    (9)
as the divisor. The letter T in the above equation is the number of plus signs and n − T is the number of minus signs in the n replications. If the factor (n − 1) in the above equation were replaced by n, s² would be equal to p(1 − p). This is another example showing the relation between the analysis of variance and the χ²-test (Section 21.8). In general, the ratio of the χ²-value of the sign test to the F-value of the normal score transformation is equal to the ratio of s² to 0.25; that is,

    χ²/F = s²/0.25.

For the example of Table 24.5a,

    s² = (4/12)(8/11) = .242424.
The ratio χ²/F = 1.3333/1.3750 = .9697, while the ratio s²/0.25 is also equal to .9697.
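The agreement of the two ratios can be checked in a line or two. The following fragment is an editorial illustration using the figures of the example above.

```python
T, n = 4, 12                               # 4 plus signs in 12 replications
s_sq = (T / n) * ((n - T) / (n - 1))       # estimated variance, Equation (9)
chi_sq, F = 8 / 6, 1.3750                  # sign-test and normal-score values
print(round(chi_sq / F, 4), round(s_sq / 0.25, 4))   # prints 0.9697 0.9697
```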
In view of the fact that the various distribution-free methods dealing with the randomized block experiment with 2 treatments and n replications are practically alike, it does not make much difference which method is used. However, because of its simplicity in computation, the sign test deserves some special consideration. During recent years, it has rightfully become one of the most commonly used statistical methods. The purpose of this section is not only to introduce the sign test, but also to show that this simple method is as good as other seemingly more refined methods.
24.6 Remarks
No examples of applications are given in this chapter, because the distribution-free methods presented here are applied to the same types of problems for which the analysis of variance is used. The only difference is that the distribution-free methods are used to deal with populations which are far from being normal.
In this chapter the distribution-free methods are explained in an unorthodox fashion, in terms of transformations and the analysis of variance. The main purpose of developing the distribution-free methods is to free statistics from the shackles of the normal populations. Transformations and the analysis of variance are not ordinarily considered indigenous to the subject of distribution-free methods. Yet, because both techniques are presented in the earlier portion of this book, it is expedient to use them in explaining the distribution-free methods. Moreover, as a by-product, this method of presentation bridges the gap between distribution-free methods and normal-distribution methods.
All methods presented in this chapter are called large-sample methods. Before one can make use of these methods, he must have large samples. The minimum sample sizes associated with the various methods are derived from the working rule given in Section 21.3.
Computation in the distribution-free methods often appears to be simple. Actually, what is simple and what is complicated depend entirely upon what tools are available. If only pencil and paper are available, the distribution-free methods are decidedly simpler than the analysis of variance. On the other hand, if desk calculators are available, the analysis of variance is actually much simpler than the distribution-free methods. The determination of the pooled median for k samples is a tedious job when the sample sizes are large. But if more elaborate tools, such as punch-card machines, are available, the distribution-free methods become simpler than the analysis of variance. These machines enable one to rearrange and classify the observations and tabulate the frequencies very efficiently, while the analysis of variance is a little more time-consuming because more complicated computations are required. Obviously, therefore, it is difficult to evaluate a statistical method as to its simplicity without specifying what tools are available.
In general, a method considered simple usually implies that it is simple when no tools other than pencil and paper are used.
The distribution-free methods can be used on any kind of population. By contrast, the normal-distribution methods, such as the analysis of variance, seem to have very limited use. The distribution-free methods appear to be able to replace the normal-distribution methods completely. However, in reality, this is not true. By analogy, the normal-distribution methods are like suits made to fit a tailor's dummy. People who are too fat or too thin cannot wear these suits. Thus the transformations are devices for changing people's weight to fit them into the ready-made suits. Besides, even without transformations, these suits are flexible enough to fit many people. They are not so rigid as they were once thought to be (Section 12.7). As a result, a large number of people can wear them. In terms of the analogy, then, the distribution-free methods are like loose coats. They are made so big that they can cover anybody but fit nobody in particular. Choosing the more desirable kind of clothes depends very much on a man's size. Therefore, in using the distribution-free methods, one may have something to gain and also something to lose, depending on what kind of population he is dealing with. If the normal-distribution methods are used on populations far from being normal, the significance level may be seriously disturbed. If the population is nearly normal, the distribution-free methods are less powerful than the normal-distribution methods.
EXERCISES
(1) For the following sets of data, test the hypothesis that the median of the population is equal to the given value: (a) Exercise 1, Chapter 8. (b) Exercise 2, Chapter 8.
(2) Use the sign test on the following exercises of Chapter 8: (a) Exercise 3. (b) Exercise 6. (c) Exercise 7. (d) Exercise 8. (e) Exercise 9.
(3) Do the following exercises of Chapter 12 by the distribution-free method given in Section 24.3: (a) Exercise 2. (b) Exercise 3. (c) Exercise 4. (d) Exercise 8. (e) Exercise 9. (f) Exercise 10. (g) Exercise 11.
(h) Exercise 12. (i) Exercise 13. (j) Exercise 14. (k) Exercise 15. (l) Exercise 16.
(4) Do the following exercises of Chapter 14 by the distribution-free method (based on the median) given in Section 24.4: (a) Exercise 2. (b) Exercise 3. (c) Exercise 4. (d) Exercise 8. (e) Exercise 10. (f) Exercise 11.
(5) Repeat Exercise 4 by the normal score transformation.
QUESTIONS
(1) What is a distribution-free method?
(2) What is a median?
(3) The mean and median of a normal population are both equal to μ. Why is the sample mean more desirable than the sample median as an estimate of μ?
(4) Practically all the methods presented in this chapter are based on a particular transformation. What is it?
(5) What are the advantages and disadvantages of the distribution-free methods as compared with the normal-distribution methods?
REFERENCES
Dixon, W. J. and Mood, A. M.: "The Statistical Sign Test," Journal of the American Statistical Association, Vol. 41 (1946), pp. 557-566.
Dixon, W. J.: "Power Functions of the Sign Test and Power Efficiency for Normal Alternatives," Annals of Mathematical Statistics, Vol. 24 (1953), pp. 467-473.
Mood, A. M.: Introduction to the Theory of Statistics, McGraw-Hill Book Company, New York, 1950.
APPENDIX
TABLE 1*
Table of Random Normal Numbers with Mean Equal to 50 and Variance Equal to 100

[The body of this table (columns 1-10, pages 487-505) could not be recovered from the scan: the individual values survive, but their row-and-column alignment does not.]

*This table is derived, with the permission of Professor E. S. Pearson, from Tracts for Computers, No. XXV, Department of Statistics, University College, University of London.
TABLE 2*
Table of Random Sampling Numbers

[The scan contains the First, Second, and Tenth Thousand of the random digits (pages 507-516); the digits could not be recovered in their original row-and-column arrangement.]

*This table is reproduced with the permission of Professor E. S. Pearson from Tracts for Computers, No. XXIV, Department of Statistics, University College, University of London.
TABLE 3
Area Under the Normal Curve

  u     Area   |   u     Area   |   u     Area
−3.0   .0013   | −1.0   .1587   |  1.0   .8413
−2.9   .0019   | −0.9   .1841   |  1.1   .8643
−2.8   .0026   | −0.8   .2119   |  1.2   .8849
−2.7   .0035   | −0.7   .2420   |  1.3   .9032
−2.6   .0047   | −0.6   .2743   |  1.4   .9192
−2.5   .0062   | −0.5   .3085   |  1.5   .9332
−2.4   .0082   | −0.4   .3446   |  1.6   .9452
−2.3   .0107   | −0.3   .3821   |  1.7   .9554
−2.2   .0139   | −0.2   .4207   |  1.8   .9641
−2.1   .0179   | −0.1   .4602   |  1.9   .9713
−2.0   .0228   |  0     .5000   |  2.0   .9772
−1.9   .0287   |  0.1   .5398   |  2.1   .9821
−1.8   .0359   |  0.2   .5793   |  2.2   .9861
−1.7   .0446   |  0.3   .6179   |  2.3   .9893
−1.6   .0548   |  0.4   .6554   |  2.4   .9918
−1.5   .0668   |  0.5   .6915   |  2.5   .9938
−1.4   .0808   |  0.6   .7257   |  2.6   .9953
−1.3   .0968   |  0.7   .7580   |  2.7   .9965
−1.2   .1151   |  0.8   .7881   |  2.8   .9974
−1.1   .1357   |  0.9   .8159   |  2.9   .9981
               |                |  3.0   .9987

    u       Area   |     u      Area
−2.5758    .005    |  2.5758   .995
−1.9600    .025    |  1.9600   .975
−1.6449    .050    |  1.6449   .950
−0.6745    .250    |  0.6745   .750
TABLE 4*
Percentage Points of the χ²-Distribution

ν d.f.     99.5%           97.5%          5%        2.5%       1%        .5%
   1   392704 × 10⁻¹⁰  982069 × 10⁻⁹    3.84146   5.02389   6.63490   7.87944
   2   0.0100251       0.0506356       5.99147   7.37776   9.21034  10.5966
   3   0.0717212       0.215795        7.81473   9.34840  11.3449   12.8381
   4   0.206990        0.484419        9.48773  11.1433   13.2767   14.8602
   5   0.411740        0.831211       11.0705   12.8325   15.0863   16.7496
   6   0.675727        1.237347       12.5916   14.4494   16.8119   18.5476
   7   0.989265        1.68987        14.0671   16.0128   18.4753   20.2777
   8   1.344419        2.17973        15.5073   17.5346   20.0902   21.9550
   9   1.734926        2.70039        16.9190   19.0228   21.6660   23.5893
  10   2.15585         3.24697        18.3070   20.4831   23.2093   25.1882
  11   2.60321         3.81575        19.6751   21.9200   24.7250   26.7569
  12   3.07382         4.40379        21.0261   23.3367   26.2170   28.2995
  13   3.56503         5.00874        22.3621   24.7356   27.6883   29.8194
  14   4.07468         5.62872        23.6848   26.1190   29.1413   31.3193
  15   4.60094         6.26214        24.9958   27.4884   30.5779   32.8013
  16   5.14224         6.90766        26.2962   28.8454   31.9999   34.2672
  17   5.69724         7.56418        27.5871   30.1910   33.4087   35.7185
  18   6.26481         8.23075        28.8693   31.5264   34.8053   37.1564
  19   6.84398         8.90655        30.1435   32.8523   36.1908   38.5822
  20   7.43386         9.59083        31.4104   34.1696   37.5662   39.9968
  21   8.03366        10.28293        32.6705   35.4789   38.9321   41.4010
  22   8.64272        10.9823         33.9244   36.7807   40.2894   42.7956
  23   9.26042        11.6885         35.1725   38.0757   41.6384   44.1813
  24   9.88623        12.4011         36.4151   39.3641   42.9798   45.5585
  25  10.5197         13.1197         37.6525   40.6465   44.3141   46.9278
  26  11.1603         13.8439         38.8852   41.9232   45.6417   48.2899
  27  11.8076         14.5733         40.1133   43.1944   46.9630   49.6449
  28  12.4613         15.3079         41.3372   44.4607   48.2782   50.9933
  29  13.1211         16.0471         42.5569   45.7222   49.5879   52.3356
  30  13.7867         16.7908         43.7729   46.9792   50.8922   53.6720
  40  20.7065         24.4331         55.7585   59.3417   63.6907   66.7659
  50  27.9907         32.3574         67.5048   71.4202   76.1539   79.4900
  60  35.5346         40.4817         79.0819   83.2976   88.3794   91.9517
  70  43.2752         48.7576         90.5312   95.0231  100.425   104.215
  80  51.1720         57.1532        101.879   106.629   112.329   116.321
  90  59.1963         65.6466        113.145   118.136   124.116   128.299
 100  67.3276         74.2219        124.342   129.561   135.807   140.169

*This table is reproduced with the permission of Professor E. S. Pearson from Biometrika, vol. 32, pp. 188-189.
TABLE 5*
Percentage Points of χ²/ν

ν d.f.     99.5%           97.5%          5%        2.5%       1%        .5%
   1   392704 × 10⁻¹⁰  982069 × 10⁻⁹    3.84146   5.02389   6.63490   7.87944
   2   0.0050126       0.0253178       2.99574   3.68888   4.60517   5.29830
   3   0.0239071       0.0719317       2.60491   3.11613   3.78163   4.27937
   4   0.0517475       0.1211048       2.37193   2.78583   3.31918   3.71505
   5   0.0823480       0.1662422       2.21410   2.56650   3.01726   3.34992
   6   0.1126212       0.2062245       2.09860   2.40823   2.80198   3.09127
   7   0.1413236       0.2414100       2.00959   2.28754   2.63933   2.89681
   8   0.1680524       0.2724663       1.93841   2.19183   2.51128   2.74438
   9   0.1927696       0.3000433       1.87989   2.11364   2.40733   2.62103
  10   0.2155850       0.3246970       1.83070   2.04831   2.32093   2.51882
  11   0.2366555       0.3468864       1.78865   1.99273   2.24773   2.43245
  12   0.2561517       0.3669825       1.75218   1.94473   2.18475   2.35829
  13   0.2742331       0.3852877       1.72016   1.90274   2.12987   2.29380
  14   0.2910486       0.4020514       1.69177   1.86564   2.08152   2.23709
  15   0.3067293       0.4174760       1.66639   1.83256   2.03853   2.18675
  16   0.3213900       0.4317288       1.64351   1.80284   1.99999   2.14170
  17   0.3351318       0.4449518       1.62277   1.77594   1.96522   2.10109
  18   0.3480450       0.4572639       1.60385   1.75147   1.93363   2.06424
  19   0.3602095       0.4687658       1.58650   1.72907   1.90478   2.03064
  20   0.3716930       0.4795415       1.57052   1.70848   1.87831   1.99984
  21   0.3825552       0.4896633       1.55574   1.68947   1.85391   1.97148
  22   0.3928509       0.4991955       1.54202   1.67185   1.83134   1.94525
  23   0.4026270       0.5081957       1.52924   1.65547   1.81037   1.92093
  24   0.4119263       0.5167125       1.51730   1.64017   1.79083   1.89827
  25   0.4207880       0.5247880       1.50610   1.62586   1.77256   1.87711
  26   0.4292423       0.5324577       1.49558   1.61243   1.75545   1.85730
  27   0.4373185       0.5397519       1.48568   1.59979   1.73937   1.83870
  28   0.4450464       0.5467107       1.47633   1.58788   1.72422   1.82119
  29   0.4524517       0.5533483       1.46748   1.57663   1.70993   1.80468
  30   0.4595567       0.5596933       1.45910   1.56597   1.69641   1.78907
  40   0.5176625       0.6108275       1.39396   1.48354   1.59227   1.66915
  50   0.5598140       0.6471480       1.35010   1.42840   1.52308   1.58980
  60   0.5922433       0.6747950       1.31803   1.38829   1.47299   1.53253
  70   0.6182171       0.6965371       1.29330   1.35747   1.43464   1.48879
  80   0.6396500       0.7144150       1.27349   1.33276   1.40411   1.45401
  90   0.6577367       0.7294067       1.25717   1.31262   1.37907   1.42554
 100   0.6732760       0.7422190       1.24342   1.29561   1.35807   1.40169
  ∞    1.0000000       1.0000000       1.00000   1.00000   1.00000   1.00000

*This table is obtained from Table 4.
TABLE 6*
Percentage Points of the t-Distribution

ν d.f.     5%       2.5%      1%       .5%
   1     6.314    12.706    31.821   63.657
   2     2.920     4.303     6.965    9.925
   3     2.353     3.182     4.541    5.841
   4     2.132     2.776     3.747    4.604
   5     2.015     2.571     3.365    4.032
   6     1.943     2.447     3.143    3.707
   7     1.895     2.365     2.998    3.499
   8     1.860     2.306     2.896    3.355
   9     1.833     2.262     2.821    3.250
  10     1.812     2.228     2.764    3.169
  11     1.796     2.201     2.718    3.106
  12     1.782     2.179     2.681    3.055
  13     1.771     2.160     2.650    3.012
  14     1.761     2.145     2.624    2.977
  15     1.753     2.131     2.602    2.947
  16     1.746     2.120     2.583    2.921
  17     1.740     2.110     2.567    2.898
  18     1.734     2.101     2.552    2.878
  19     1.729     2.093     2.539    2.861
  20     1.725     2.086     2.528    2.845
  21     1.721     2.080     2.518    2.831
  22     1.717     2.074     2.508    2.819
  23     1.714     2.069     2.500    2.807
  24     1.711     2.064     2.492    2.797
  25     1.708     2.060     2.485    2.787
  26     1.706     2.056     2.479    2.779
  27     1.703     2.052     2.473    2.771
  28     1.701     2.048     2.467    2.763
  29     1.699     2.045     2.462    2.756
  30     1.697     2.042     2.457    2.750
  40     1.684     2.021     2.423    2.704
  60     1.671     2.000     2.390    2.660
 120     1.658     1.980     2.358    2.617
  ∞      1.645     1.960     2.326    2.576

*This table is reproduced with the permission of Professor E. S. Pearson from Biometrika, vol. 32, p. 311.
TABLE 7a. 5% Points of the F-Distribution
TABLE 7b. 2.5% Points of the F-Distribution

[Each table gives the percentage points of F for ν₁, the degrees of freedom for the numerator, and ν₂, the degrees of freedom for the denominator; the entries could not be recovered from the scan. Both tables are reproduced with the permission of Professor E. S. Pearson from Biometrika, vol. 33.]
CI1
><
C
-
~
Z
> ."
-
..
Q
i
0
-r!•
." II
0
S
Q
.. oS
8 II
a 'jj
•
S
- -
00
120
60
30 40
20 21 22 23 24 25 26 27 28 29
10 11 12 13 14 IS 16 17 18 19
5 6 7 8 9
1 2 3 4
i~
va
II
2.5437 2.4935 2.4499 2.4117 2.3779 2.3479 2.3210 2.2967 2.2747 2.2547 2.2365 2.2197 2.2043 2.1900 2.1768 2.1646 2.0772 1.9926 1.!»105 1.8307
_ _ _ _ _ L-_
2.0921 2.0035 1.9174 1.8337 1.7522
2.1649 2.1479 2.1323 2.1179 2.1045
2.2'Z76 2.2504 2.2258 2.2036 2.1834
I
I
2.2033 2.1757 2.1508 2.1282 2.1077 2.0889 2.0716 2.0558 2.0411 2.0275 2.0148 1.9245 1.8364 L7505 1.6664
2.8450 2.7186 2.6169 2.5331 2.4630 2.4035 2.3522 2.3077 2.2686 2.2341
2.9130 2.7876 2.6866 2.6037 2.5342 2.4753 2.4247 2.3807 2.3421 2.3080
I
4.6188 3.9381 3.5108 3.2184 3.0061
4.6777 3.9999 3.5747 3.2840 3.0729
4.7351 4.0600 3.6365 3.3472 3.1373 2.9782 2.8536 2.7534 2.6710 2.6021
245.95 19.429 8.7029 5.8578
15
243.91 19.413 8.7446 5.9117
12
241.88 19.396 8.7855 5.9644
10
L9317 1.8389 1.7480 1.6587 1.5705
2.0075 1.9898 1.9736 1.9586 1.9446
2.1242 2.0960 2.0707 2.0476 2.0267
2.3275 2.2756 2.2304 2.1906 2.1555
2.7740 2.6464 2.5436 2.4589 2.3879
4.5581 3.8742 3.4445 3.1503 2.9365
248.01 19.446 8.6602 5.8025
20
1.8874 L7929 L7001 L6084 1.5173
2.0825 2.0540 2.0283 2.0050 L9838 L9643 L9464 1.9299 1.9147 1.9005
2.7372 2.6090 2.5055 2.4202 2.3487 2.2878 2.2354 2.1898 2.1497 2.1141
4.5272 3.8415 3.4105 3.1152 2.9005
249.05 19.454 8.6385 5.7744
24
L8409 L7444 L6491 1.5543 L4591
1.9192 L9010 1.8842 L8687 L8543
2.0391 2.0102 L9842 1.9605 L9390
2.2468 .2.1938 2.1477 2.1071 2.0712
2.6996 2.5705 2.4663 2.3803 2.3082
4.4957 3.8082 3.3758 3.0794 2.8637
250.09 19.462 8.6166 5.7459
30
De,ree. of Freedom for Numer.tor
5" Point. of the F-Dimibutioa
TABLE 7. (eonalllleel)
1.7918 L6928 L5943 1.4952 1.3940
LB718 L8533 L8361 L8203 1.8055
L9938 L9645 L9380 1.9139 1.89Z
2.6609 2.5309 2.4259 2.3392 2.2664 2.2043 2.1507 2.1040 2.0629 2.0264
4.4638 3.7743 3.3404 3.0428 2.8259
25L14 19.471 8.5944 5.7170
40
L7396 L6173 1.5343 L4290 1.3180
L9464 L9165 L8895 LB649 L8424 L8217 LB027 1.7851 L7689 L7537
2.6211 2.4901 2.3842 2.2966 2.2230 2.1601 2.1058 2.0584 2.0166 L9796
252.Z 19.479 8.5720 5.6878 4.4314 3.73. 3.3043 3.0053 2.7872
60
L6223 1.5089 L3893 L2539 1.0000
1.7110 L6906 1.6717 L6541 L6377 1.7684 1.7488 1.7307 L7138 L6981 1.6835 L5766 L4673 Ll519 L2214
L8432 1.8117 1.7831 1.7570 1.7331
2.5379 2.4045 2.2962 2.2064 2.1307 2.0658 2.0096 L9604 1.9168 L8780
4.3650 3.6688 3.2298 2.9276 2.7067
254.32 19.496 8.5265 5.6281
00
1.8963 1.8657 L8380 1.8128 1.7897
2.1141 2.0589 2.0107 1.9681 1.9302
4.3984 3.7047 3.2674 2.9669 2.7475 2.5801 2.4480 2.3410 2.2524 2.1778
253.25 19.487 8.5494 5.6581
120
TABLE 7b* (continued). 2.5% Points of the F-Distribution
Degrees of freedom for numerator (columns): 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞. Degrees of freedom for denominator (rows): 1-30, 40, 60, 120, ∞. [Tabulated critical values omitted.]
TABLE 7c*. 1% Points of the F-Distribution
Degrees of freedom for numerator (columns): 1-9. Degrees of freedom for denominator (rows): 1-30, 40, 60, 120, ∞. [Tabulated critical values omitted.]

*This table is reproduced with the permission of Professor E. S. Pearson from Biometrika, vol. 33, pp. 84-85.
TABLE 7c* (continued). 1% Points of the F-Distribution
Degrees of freedom for numerator (columns): 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞. Degrees of freedom for denominator (rows): 1-30, 40, 60, 120, ∞. [Tabulated critical values omitted.]
TABLE 7d*. 0.5% Points of the F-Distribution
Degrees of freedom for numerator (columns): 1-9. Degrees of freedom for denominator (rows): 1-30, 40, 60, 120, ∞. [Tabulated critical values omitted.]

*This table is reproduced with the permission of Professor E. S. Pearson from Biometrika, vol. 33, pp. 86-87.
TABLE 7d* (continued). 0.5% Points of the F-Distribution
Degrees of freedom for numerator (columns): 10, 12, 15, 20, 24, 30, 40, 60, 120, ∞. Degrees of freedom for denominator (rows): 1-30, 40, 60, 120, ∞. [Tabulated critical values omitted.]
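The four parts of Table 7 tabulate the upper 5%, 2.5%, 1%, and 0.5% points of the F-distribution. Any of these critical points can also be computed directly; the following minimal sketch assumes Python with the SciPy library, which is this note's assumption and not part of the book's own apparatus.

```python
# A minimal sketch, assuming Python with SciPy is available.
# stats.f.ppf gives the quantile (inverse CDF) of the F-distribution,
# so the upper alpha-point is the (1 - alpha) quantile.
from scipy import stats

def f_point(upper_tail_pct, df_numerator, df_denominator):
    """Upper-tail critical point of F, e.g. upper_tail_pct = 2.5 for Table 7b."""
    return stats.f.ppf(1 - upper_tail_pct / 100.0, df_numerator, df_denominator)

print(round(f_point(2.5, 1, 1), 2))    # 647.79, the first entry of Table 7b
print(round(f_point(5.0, 15, 1), 2))   # 245.95, the first entry of the
                                       # numerator-15 column of Table 7a (cont.)
```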
TABLE 8a*. Significant Studentized Ranges for a 5% Level New Multiple Range Test
Number of means in the group (columns): p = 2-10, 12, 14, 16, 18, 20, 50, 100. Degrees of freedom for error (rows): 1-20, 22, 24, 26, 28, 30, 40, 60, 100, ∞. [Tabulated ranges omitted.]

*This table is reproduced from David B. Duncan, "Multiple range and multiple F tests," Biometrics, Volume 11 (1955), p. 3, with the permission of the author of the article and Professor Gertrude M. Cox, the editor of Biometrics.
TABLE 8b*. Significant Studentized Ranges for a 1% Level New Multiple Range Test
Number of means in the group (columns): p = 2-10, 12, 14, 16, 18, 20, 50, 100. Degrees of freedom for error (rows): 1-20, 22, 24, 26, 28, 30, 40, 60, 100, ∞. [Tabulated ranges omitted.]

*This table is reproduced from David B. Duncan, "Multiple range and multiple F tests," Biometrics, Volume 11 (1955), p. 4, with the permission of the author of the article and Professor Gertrude M. Cox, the editor of Biometrics.
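Tables 8a and 8b are applied by converting each tabulated significant studentized range into a shortest significant range, SSR = (tabulated range) × s, where s = √(error mean square / n) is the standard error of a treatment mean; two means differ significantly when the range of the group of p ordered means spanning them exceeds the SSR for that p. The sketch below illustrates the arithmetic in Python; the treatment means and the error mean square are hypothetical, and the three range values are read from Table 8a for 20 error degrees of freedom.

```python
# A sketch of the arithmetic of the new multiple range test; the data are
# hypothetical and the r_p values are read from Table 8a (5% level) for
# 20 degrees of freedom for error.
import math

r_p = {2: 2.95, 3: 3.10, 4: 3.18}     # significant studentized ranges
error_ms, n = 100.0, 5                # error mean square, replicates per mean
s_mean = math.sqrt(error_ms / n)      # standard error of a treatment mean

means = sorted([42.0, 50.5, 57.0, 58.2])   # hypothetical treatment means
for p, r in sorted(r_p.items()):
    ssr = r * s_mean                  # shortest significant range for p means
    span = means[-1] - means[len(means) - p]   # range of the p largest means
    print(p, round(ssr, 2), "significant" if span > ssr else "not significant")
```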
APPENDIX TABLE 9a*. 95% Confidence Interval of Percentage of Successes of Binomial Population
For each number of successes T (0-50) and sample size n = 10, 15, 20, 30, 50, 100, the table gives the lower and upper 95% confidence limits, in percent; for n = 250 and 1000 the entries are indexed by the relative frequency of successes, T/n = .00-.50. [Tabulated limits omitted.]

*This table is reproduced from Statistical Methods, fourth edition (1946), pp. 4-5, with the permission of Professor George W. Snedecor and Iowa State College Press.
APPENDIX TABLE 9b*. 99% Confidence Interval of Percentage of Successes of Binomial Population
For each number of successes T (0-50) and sample size n = 10, 15, 20, 30, 50, 100, the table gives the lower and upper 99% confidence limits, in percent; for n = 250 and 1000 the entries are indexed by the relative frequency of successes, T/n = .00-.50. [Tabulated limits omitted.]

*This table is reproduced from Statistical Methods, fourth edition (1946), pp. 4-5, with the permission of Professor George W. Snedecor and Iowa State College Press.
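Tables 9a and 9b give exact confidence belts for the binomial percentage. One standard way of computing such exact limits uses quantiles of the beta distribution (the Clopper-Pearson construction); the sketch below, which assumes Python with SciPy and is offered only as an illustration, yields limits that should agree with the tabulated ones to within rounding.

```python
# A minimal sketch, assuming SciPy: exact confidence limits, in percent,
# for the percentage of successes of a binomial population, given T
# successes in a sample of n observations (Clopper-Pearson construction).
from scipy import stats

def binomial_limits(T, n, confidence=0.95):
    alpha = 1.0 - confidence
    lower = 0.0 if T == 0 else stats.beta.ppf(alpha / 2, T, n - T + 1)
    upper = 1.0 if T == n else stats.beta.ppf(1 - alpha / 2, T + 1, n - T)
    return 100.0 * lower, 100.0 * upper

print(binomial_limits(1, 10))         # compare with the n = 10 column of Table 9a
print(binomial_limits(1, 10, 0.99))   # compare with Table 9b
```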
APPENDIX TABLE 10*. Transformation of Percentages to Degrees

  %     0     1     2     3     4     5     6     7     8     9
  0     0    5.7   8.1  10.0  11.5  12.9  14.2  15.3  16.4  17.5
 10   18.4  19.4  20.3  21.1  22.0  22.8  23.6  24.4  25.1  25.8
 20   26.6  27.3  28.0  28.7  29.3  30.0  30.7  31.3  31.9  32.6
 30   33.2  33.8  34.4  35.1  35.7  36.3  36.9  37.5  38.1  38.6
 40   39.2  39.8  40.4  41.0  41.6  42.1  42.7  43.3  43.9  44.4
 50   45.0  45.6  46.1  46.7  47.3  47.9  48.4  49.0  49.6  50.2
 60   50.8  51.4  51.9  52.5  53.1  53.7  54.3  54.9  55.6  56.2
 70   56.8  57.4  58.1  58.7  59.3  60.0  60.7  61.3  62.0  62.7
 80   63.4  64.2  64.9  65.6  66.4  67.2  68.0  68.9  69.7  70.6
 90   71.6  72.5  73.6  74.7  75.8  77.1  78.5  80.0  81.9  84.3

*Table 10 is reprinted from Table XII of Fisher and Yates: Statistical Tables for Biological, Agricultural, and Medical Research, published by Oliver and Boyd Ltd., Edinburgh, by permission of the authors and publishers.
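Every entry of Table 10 is the angle, in degrees, whose sine is the square root of the given proportion: 50% transforms to arcsin √0.50 = 45.0, and 99% to arcsin √0.99 = 84.3. The short sketch below, in plain Python (an assumption of this note), reproduces any entry.

```python
# A sketch reproducing Table 10: a percentage p is replaced by the angle
# (in degrees) whose sine is the square root of p/100.
import math

def percent_to_degrees(p):
    return math.degrees(math.asin(math.sqrt(p / 100.0)))

for p in (1, 11, 50, 99):
    print(p, round(percent_to_degrees(p), 1))   # 5.7, 19.4, 45.0, 84.3 as tabulated
```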
TABLE 11*. Normal Scores for Ranks (zero and negative values omitted)

Number of objects 2 to 10:

Rank     2     3     4     5     6     7     8     9    10
 1      .56   .85  1.03  1.16  1.27  1.35  1.42  1.49  1.54
 2                  .30   .50   .64   .76   .85   .93  1.00
 3                              .20   .35   .47   .57   .66
 4                                          .15   .27   .38
 5                                                      .12

Number of objects 11 to 20:

Rank    11    12    13    14    15    16    17    18    19    20
 1     1.59  1.63  1.67  1.70  1.74  1.76  1.79  1.82  1.84  1.87
 2     1.06  1.12  1.16  1.21  1.25  1.28  1.32  1.35  1.38  1.41
 3      .73   .79   .85   .90   .95   .99  1.03  1.07  1.10  1.13
 4      .46   .54   .60   .66   .71   .76   .81   .85   .89   .92
 5      .22   .31   .39   .46   .52   .57   .62   .67   .71   .75
 6            .10   .19   .27   .34   .39   .45   .50   .55   .59
 7                        .09   .17   .23   .30   .35   .40   .45
 8                                    .08   .15   .21   .26   .31
 9                                                .07   .13   .19
10                                                            .06

*Table 11 is abridged from Table XX of Fisher and Yates: Statistical Tables for Biological, Agricultural, and Medical Research, published by Oliver and Boyd Ltd., Edinburgh, by permission of the authors and publishers. For a large number of objects, up to 50, see the original table.
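The normal scores of Table 11 are the expected values of the order statistics of a sample from the normal population with mean 0 and variance 1. The exact values require numerical integration, but Blom's approximation, Φ⁻¹((k − 3/8)/(n + 1/4)) for the kth smallest of n observations, reproduces the two-decimal tabulation to within about 0.01; the sketch below uses that approximation together with SciPy, both of which are this note's assumptions rather than the book's procedure.

```python
# A sketch approximating Table 11 with Blom's formula; the tabulated values
# are exact expected normal order statistics, which this only approximates.
from scipy import stats

def normal_score(rank, n):
    """Approximate normal score of the rank-th largest of n (rank 1 = largest)."""
    k = n + 1 - rank                   # position counted from the smallest
    return stats.norm.ppf((k - 0.375) / (n + 0.25))

print(round(normal_score(1, 10), 2))   # 1.55 here; the exact tabulated value is 1.54
print(round(normal_score(5, 10), 2))   # 0.12, agreeing with the table
```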
INDEXES
INDEX TO THEOREMS

Theorem, Page: 2.4a, 5; 2.4b, 6; 5.2a, 33; 5.2b, 35; 5.3, 35; 7.2, 62; 7.5, 67; 7.6, 71; 7.7, 74; 7.7b, 76; 8.1, 88; 8.1b, 89; 9.1, 106; 9.6, 112; 9.7, 114; 10.1a, 122; 10.1b, 124; 10.1c, 124; 10.4, 128; 10.4b, 128; 12.6, 172; 15.2, 222; 17.2, 276; 17.3, 280; 17.4, 286; 21.1, 393.
INDEX TO TABLES

Table, Page: 2.5a, 7; 2.5b, 7; 4.1a, 23; 4.1b, 25; 4.2, 27; 5.1a, 30; 5.1b, 31; 5.2, 32; 5.3, 35; 5.6, 38; 7.2, 61; 7.4, 65; 7.5, 68; 7.7, 72; 7.7b, 73; 8.2, 90; 8.7, 97; 9.2, 107; 9.5, 110; 9.6, 112; 10.1a, 120; 10.1b, 120; 10.1c, 121; 10.1d, 124; 10.2, . . .; 10.5, 130; 10.6, 132; 12.1a, 151; 12.1b, 152; 12.1c, 155; 12.1d, 155; 12.1e, 155; 12.2, 158; 12.3a, 162; 12.3b, 163; 12.10, 176; 12.10b, 176; 12.10c, 177; 12.10d, 178; 13.2, 190; 13.3, 191; 14.2a, 198; 14.2b, 199; 14.3, 201; 14.3b, 202; 14.3c, 203; 14.3d, 204; 14.4a, 205; 14.4b, 206; 14.5, 208; 14.5b, 208; 14.6, 209; 14.7, 211; 14.8, 213; 15.2, 223; 15.3a, 229; 15.3b, 231; 15.5a, 240; 15.5b, . . .; 16.1, 245; 16.3, 250; 16.3b, 252; 16.3c, 254; 16.4, 258; 16.5a, 259; 16.5b, 260; 16.5c, 261; 16.9a, 267; 16.9b, 268; 17.1, 275; 17.3, 279; 17.4, 285; 17.6a, 293; 17.6b, 294; 17.6c, 294; 17.6d, 295; 17.6e, 295; 17.7, 296; 17.7b, 298; 17.8, 300; 18.1, 309; 18.2a, 311; 18.2b, 312; 18.2c, 312; 18.2d, 313; 18.2e, 315; 18.3, 315; 18.4a, 317; 18.4b, 317; 18.4c, 318; 18.4d, 318; 18.5a, 319; 18.5b, 319; 18.5c, 320; 18.5d, 320; 18.5e, 322; 18.5f, 323; 18.5g, 323; 18.6, 325; 18.8a, 328; 18.8b, 329; 18.8c, 330; 18.8d, 331; 19.1a, 346; 19.1b, 347; 19.1c, 348; 19.1d, 349; 19.2, 350; 19.4, 354; 19.6a, 360; 19.6b, 361; 19.6c, 361; 19.6d, 362; 19.6e, 362; 19.6f, 363; 19.7, 364; 19.8, 366; 19.8b, 367; 19.8c, 368; 19.8d, 369; 19.9a, 371; 19.9b, 371; 19.9c, 372; 19.9d, 373; 19.9e, 373; 19.9f, 374; 19.9g, 376; 19.9h, 378; 21.1, 391; 21.2a, 394; 21.2b, 396; 21.3, 397; 21.4, 404; 21.6a, 408; 21.6b, 408; 21.6c, 410; 21.6d, 411; 21.6e, 411; 21.7, 414; 21.7b, 416; 21.8, 417; 21.8b, 418; 21.9, 421; 21.9b, 423; 21.10, 426; 22.2, 434; 22.4a, 437; 22.4b, 437; 22.5a, 438; 22.5b, 439; 22.6a, 440; 22.6b, 441; 22.7, 442; 22.8, 443; 23.1, 448; 23.2a, 451; 23.2b, 452; 23.2c, 453; 23.2d, 454; 23.3a, 454; 23.3b, 455; 23.3c, 455; 23.4, 458; 23.5a, 460; 23.5b, 461; 23.5c, 461; 24.3a, 473; 24.3b, 473; 24.4a, 475; 24.4b, 475; 24.4c, 476; 24.4d, 477; 24.4e, 477; 24.5a, 478; 24.5b, 479.
INDEX TO FIGURES

Figure, Page: 2.6a, 8; 2.6b, 9; 2.6c, 10; 3.1a, 14; 3.1b, 16; 3.1c, 17; 3.3a, 19; 3.3b, 20; 4.1a, 24; 4.1b, 25; 5.2a, 32; 5.2b, 33; 5.2c, 33; 5.2d, 34; 5.6, 40; 6.3a, 47; 6.3b, 48; 6.4, 50; 6.5, 51; 7.5a, 67; 7.5b, 69; 7.6a, 70; 7.6b, 71; 7.7, 74; 7.10a, 79; 7.10b, 80; 7.10c, 81; 8.1, 88; 8.2, 91; 8.7, 96; 9.1, 106; 9.2, 108; 9.6, 113; 10.1, 123; 10.2, 125; 10.5, 131; 12.2, 159; 12.5, 170; 14.1, 197; 15.2, 224; 16.1, 246; 16.2, 249; 16.3, 255; 16.4, 256; 16.5, 262; 17.4, 286; 17.7, 297; 19.4, 356; 19.9, 377; 21.2, 395; 21.3a, 398; 21.3b, 399; 21.3c, 400; 21.3d, 401; 23.1a, 447; 23.1b, 448; 23.1c, 449; 23.3a, 456; 23.3b, 457.
INDEX TO SUBJECT MATTER

A
Additive components, 203
Adjusted mean, 283: distribution of, 283; factors affecting, 287; mean of, 284; randomized block, 366; sampling experiment, 284, 356; test of homogeneity of, 353; with equal regression coefficients, 363; with unequal regression coefficients, 353; variance of, 284; weight of, 355, 359
Advantages of equal sample sizes, 133, 148, 179
Advantages of large sample, 147, 168
Among-sample mean square, 158: average value of, 166
Among-sample SS, 155
Analysis of covariance, 344: relation with factorial experiment, 370
Analysis of variance, 77, 151: applications, 173; assumptions, 172; computing method for equal sample sizes, 159; computing method for unequal sample sizes, 177; model of one-way classification, 163; models, 163, 214; relation with linear regression, 290; relation with χ²-test, 416; summary, 384; tests of specific hypotheses, 175, 221, 226, 233, 238, 298; unequal sample sizes, 175
Angular transformation, 447: example, 450; sampling experiments, 450
Applications of statistics, 388
Array, 245: mean of, 245
Assumptions, 54

B
Binomial distribution, 397: in contrast with binomial population, 396, 397; mean, 396; minimum sample size, 397; variance of, 396
Binomial population, 390: in contrast with binomial distribution, 396, 397; mean, 391, 393; notations, 425, 426; sampling experiments, 393, 397; variance, 391, 393
Binomial transformation, 471, 472, 481
Block, 196, 197

C
Central Limit Theorem, 34
Chi square, see χ²
Class, 6
Complete block, 197
Completely randomized experiment, 196: average values of mean squares, 166; distribution-free methods, 472; factorial experiments, see Factorial experiments; one-way classification, 151; vs. randomized blocks, 196
Components of variance, see Mean of mean squares
Components of variance model, 166, 324
Confidence coefficient, 142, 144
Confidence interval, 142, 144: effect of sample size on, 147, 193; length of, 144, 145, 147, 193; of adjusted mean, 287; of difference between adjusted means, 358; of difference between means, 147; of difference between regression coefficients, 350; of mean of binomial population, 405; of population mean, 145, 278; of regression coefficient, 282, 283
Confidence limits, 144
Contingency table, 440: 2 × 2, 408, 442; 2 × k, 414, 438; r × k, 438, 440
Correlation coefficient, 265
Critical region, 47
Curve of regression, 247: connection with factorial experiment, 377
Curvilinear regression, 298
D
Degrees of freedom, 66: physical meaning, 159, 405
Descriptive statistics, 1, 3
Dichotomous, 390
Distribution-free methods, 469, 482: advantages and disadvantages, 483; completely randomized experiment, 472; randomized block experiment, 474; sign test, 478
Distribution of rare events, 457
Dummy value for missing observation, 210
Duncan's test, see New multiple range test

E
Equally spaced x-values, 300
Error, 167, 194
Error SS, 158, 167
Estimation, by interval, 141
Expected frequency, see Hypothetical frequency
Expected value, see Mean of
Experimental error, 212, 331, 452

F
Factorial experiment, 309: average values of mean squares, 321, 325; computing method, 316; description of, 309; model, 324; partition of sum of squares, 311; relation with analysis of covariance, 370; tests of specific hypotheses, 325; vs. hierarchical classification, 326
Failure, 390
F-distribution, 105: description of, 105; relation with t-distribution, 171; relation with χ²-distribution, 114
Fitting frequency curves, 436
Fixed model, 166, 324
Frequency, 6
Frequency curve, 9: fitting to data, 436
Frequency table, 6

G
General mean, 152, 177
Grand total, 151

H
Hierarchical classification, 326
Histogram, 8
Hypothesis, 44
Hypothetical frequency, 404, 410, 433, 438

I
Incomplete block, 197
Independent sample, 26, 122
Individual degree of freedom, 226: advantage of, 302, 386; binomial population, 420; linear regression, 298; multinomial population, 442; orthogonal set, 229, 230, 360; relation with least significant difference, 235; summary, 386; use and misuse, 237; χ²-test of goodness of fit, 435
Induction, 2, 43
Inefficient statistics, 193
Inequality, 141
Interaction, 315: relation with homogeneity of regression coefficients, 371

L
Large sample method, 482
Least significant difference, 234, 326: between two treatment means, 234, 326; between two treatment totals, 235; relation with individual degree of freedom, 235; relation with new multiple range test, 238; sampling experiment, 237; use and misuse, 236
Least squares, 253: advantages of, 277, 281
Length of confidence interval, 144, 147
Level of significance, 46
Linear combination, 221: distribution of, 222; mean of, 222; sampling experiment, 223; variance of, 223
Linear hypothesis model, 166, 324
Linear regression, 244, 247: algebraic identities, 267; as individual degree of freedom, 298; assumptions, 248; computing methods, 268; distribution of adjusted mean, 283; distribution of regression SS, 260; distribution of regression coefficient, 278; distribution of residual SS, 261; distribution of sample (unadjusted) mean, 276; estimation of parameters, 250; estimate of variance of array, 263; line of regression, 247; model, 248; partition of sum of squares, 255; planned experiment, 300, 302, 375; relation with analysis of variance, 290; sampling experiments, 259, 274, 278, 284; spacing x-values, 282, 375; test of hypothesis, 264; test of linearity, 295; variance component, 288
Line of regression, 247
Logarithmic transformation, 458

M
Mean, 3: vs. median, 470
Mean of: adjusted mean, 284; binomial distribution, 396; binomial population, 391, 393; difference between means, 124; mean squares, 166, 204, 289, 321, 325; population, 3; regression coefficient, 281; sample, 28; sample mean, 31, 276
Mean square, 158
Mechanics of partition of sum of squares, 151, 198
Median, 469: test of hypothesis, 471; vs. mean, 470
Minimum sample size for binomial distribution, 397
Missing observation, 209
Model, 166, 324
Multinomial population, 432: sampling experiment, 433

N
New multiple range test, 238: relation with least significant difference, 238
Non-parametric methods, 469
Normal curve, 14
Normal probability graph paper, 19
Normal scores, 459
Normal score transformation, 459: quantitative data, 474; ranked data, 459; relation with sign test, 481

O
Observation, 1
Observed frequency, 433
Observed relative frequency, 39
One-tailed test, 54
Orthogonal, 229, 230, 360

P
Paired observations, 96: relation with randomized blocks, 208; sign test, 478
Parameter, 28
Parameters and statistics of randomized blocks, 204
Poisson distribution, 456: test of hypothesis, 458
Pooled estimate of variance, 114
Pooled SS, 155: sampling experiment, 112
Pooled variance, 114
Population, 1: mean, 3; variance, 4
Population mean, 3: confidence interval of, 145, 278
Power of a test, 389
Probability, 21, 31
Probability of obtaining a failure on a single trial, 425
Probability of obtaining a success on a single trial, 425
Q
Quality control, 84

R
Random number table, 135, 396
Randomization, 135
Randomized blocks, 196: binomial transformation, 472; computing method, 205; distribution-free methods, 474; models, 214; normal score transformation, 459; relation with paired observations, 208; sign test, 478; summary, 387; vs. completely randomized experiment, 196
Random sample, 1, 26
Random variable model, 166, 324
Range, 4
Ranked data, 459
Regression, 244: curve of, 247; line of, 247; linear, see Linear regression
Regression coefficient, 247: confidence interval of, 282, 283; distribution of, 278; factors affecting, 281; in contrast with mean, 344, 349; mean of, 280; relation with interaction, 371; sampling experiment, 278, 351; test of homogeneity, 344; test of hypothesis β = β₀, 282; test of hypothesis β₁ = β₂, 344, 350; variance of, 281; weight of, 345, 350, 359
Regression equation, 247
Regression SS, 260: distribution of, 260; mean of, 289
Relation among various distributions, 189
Relative cumulative frequency, 7
Relative frequency, 7, 21
Reliability of sample mean, 36
Replication, 197
Replication effect, 201, 204
Replication mean square, 204: average value of, 204
Residual SS, 254, 255: distribution of, 261

S
Sample, 1
Sample mean, 28: distribution of, 31, 276; in contrast with regression coefficient, 344, 349; mean of, 35, 276; variance of, 36, 276; weight of, 177, 345, 350, 359
Sample size, 50, 147, 192: advantage of equal sample sizes, 133, 179; advantage of large sample, 147, 168, 192; effect on confidence interval, 147, 193; minimum for binomial distribution, 397
Sample sum, 393
Sample variance, 62: computing method, 64; distribution of, 75; linear regression, 263; weighted mean of, 111
Sampling error, 331
Sampling experiments, 23: adjusted mean, 284, 356; angular transformation, 450; binomial population, 393, 397; description, 23; difference between means, 125; F-distribution, 107, 157; least significant difference, 237; linear combination, 223; multinomial population, 433; normal score, 459; pooled SS, 112; regression coefficient, 278, 351; regression SS, 259; residual SS, 261; sample mean, 38; χ²-distribution, 66, 73; t-distribution, 89, 129
Sampling unit, 332
Sampling with replacement, 30
Shortest significant range, 238
Sign test, 478: as binomial transformation, 481; relation with corrected χ², 479, 481; relation with normal score transformation, 480
Significance, 192
Significance level, 46, 192
Significant studentized range, 238
Single degree of freedom, see Individual degree of freedom
Size of the sample, see Sample size
Square root transformation, 454
Standard deviation, 4
Standard error, 36, 194
Statistic, 28
Statistical inference, 1
Snedecor's F-distribution, see F-distribution
Student's t-distribution, see t-distribution
Sub-sample, 250
Success, 390

T
Table of random numbers, 135, 396
t-distribution, 87, 127: description, 87; relation with F-distribution, 171; sampling experiment, 89, 129
Test of homogeneity of: adjusted means for randomized block experiment, 366; adjusted means with equal regression coefficients, 363; adjusted means with unequal regression coefficients, 353; means of binomial populations, 413; means, see Analysis of variance; regression coefficients, 344; relation with interaction, 371
Test of hypothesis, 43, 78, 92, 109, 190
Test of linearity of regression, 295
Test of significance, see Test of hypothesis
Tests of specific hypotheses, 175, 221, 226, 233, 238, 298: factorial experiment, 325
Theoretical frequency, 427
Theoretical relative frequency, 39
Tier, 326
Total frequency, 6
Total SS, 153
Transformation, 447: angular, 447; binomial, 471, 472, 481; logarithmic, 458; normal score, 459; square root, 454
Treatment effect, 167, 203, 204
Treatment SS, 158, 200
Two-tailed test, 54
Type I Error, 45, 192
Type II Error, 45, 190

U
Unadjusted mean, 283
Unbiased estimate, 63
Units of measurements, 388
u-test, 53

V
Variance, 4
Variance component model, 166, 324
Variance components, see Mean of mean squares
Variance of: adjusted mean, 284; binomial distribution, 396; binomial population, 391, 393; difference between means, 124; population, 4; population means, 164; regression coefficient, 281; sample, 62; sample mean, 36, 277

W
Weighted mean of: adjusted means, 355, 359; means, 177, 345, 350, 359; regression coefficients, 345, 350, 359; variances, 111
Within-sample mean square, 158: average value of, 166
Within-sample SS, 155

χ²
χ²-distribution, 65: mean of, 66; relation with F-distribution, 114; correction, 476, 479
χ²-test, 78: relation with analysis of variance, 416
χ²-test of goodness of fit, 404, 432: computing short-cut, 443; fitting frequency curves, 436; individual degree of freedom, 435
χ²-test of independence, 410, 438: computing short-cut, 2 × 2, 412; 2 × k, 414; r × k, 443; individual degree of freedom, 442

Z
z-distribution, 105
SYMBOLS AND ABBREVIATIONS

LOWER CASE LETTERS

a = (1) The mean of a sample which consists of several arrays, or the least squares estimate of α, in connection with linear regression. Page 250. (2) The number of levels of the factor A of a factorial experiment. Page 310.
b = (1) The regression coefficient of a sample or the least squares estimate of β. Page 250. (2) The number of levels of the factor B of a factorial experiment. Page 310.
b_k = The regression coefficient of the kth sample, e.g., b₁, b₂, etc. Page 345.
b̄ = The weighted mean of the regression coefficients of several samples. Page 345.
d = The dummy value for a missing observation. Used in Section 14.7 only. Page 210.
d.f. = Degrees of freedom. Page 67.
f = Frequency, as used in Table 2.5a. Page 7.
f₀ = The number of failures in a binomial population. Page 392.
f₁ = The number of successes in a binomial population. Page 392.
g = The number of treatment means in a group. Page 239.
h = The hypothetical, theoretical, or expected frequency in the χ²-test of goodness of fit and the χ²-test of independence. Pages 404, 410, 433, 438.
k = The number of samples or treatments. Pages 151, 198.
m = (1) A number. Used only in Theorem 2.4b. Page 6. (2) The mid-point of a class in a frequency table, such as Table 7.7b, page 73; Table 8.2, page 90; Table 9.6, page 112; Table 10.5, page 130; Table 16.5c, page 261. (3) The mean of the observations (y) with the same x-value. Used only in Section 17.7. Page 296.
n = (1) The sample size or the number of observations in a sample. Page 29. (2) The number of replications in a randomized block experiment. Page 198.
n₀ = Defined in Equation (5), Section 12.10. Page 178.
n̄ = The average sample size. Page 180.
n_k = The size of the kth sample, e.g., n₁, n₂, etc. Page 119.
p = The relative frequency of successes of a sample. Table 21.10, page 426.
p̄ = The relative frequency of successes of several samples combined. Table 21.10, page 426.
r = (1) Correlation coefficient. Used in Chapters 16 and 17 only. Page 265. (2) The number of categories of a multinomial population. Used in Chapter 22 only. Page 432.
r.f. = Relative frequency. Table 2.5a, page 7.
r.c.f. = Relative cumulative frequency. Table 2.5a, page 7.
s = The standard deviation of a sample, or the square root of s². Page 62.
s² = The variance of a sample, or the unbiased estimate of σ², the variance of a population. Pages 62, 263.
s_k² = The variance of the kth sample, e.g., s₁², s₂², etc. Page 105.
s̄² = The weighted mean of several sample variances, or the pooled estimate of a population variance. Pages 111, 114.
s_ȳ² = The variance of the means of several samples of the same size. Page 157.
t = A statistic which follows the Student's t-distribution. Pages 88, 128, 227, 277, 282, 287, 353, 357, 366.
t.025 = The 2.5% point of the Student's t-distribution. Page 145.
t.005 = The 0.5% point of the Student's t-distribution. Page 147.
u = A statistic which follows the normal distribution with mean equal to zero and variance equal to one. Pages 18, 83, 126, 226, 277, 282, 287, 402, 403, 407, 408.
v = A linear combination of a number of observations. Page 221.
x = A value associated with an observation y, or a quantitative treatment. Page 244.
x' = A particular value of x. Page 283.
x̄ = The mean of the x-values of a sample. Page 246.
x̿ = The mean of the x-values of several samples. Page 354.
y = An observation. Page 3.
y' = An original observation in the sampling experiment described in Section 17.1. Page 275.
ȳ = The mean of a sample. Page 28.
y̿ = The mean of several samples combined, or the weighted mean of several sample means. Page 152.
ȳ_A = The mean of the observations belonging to a particular level of the factor A of a factorial experiment. Page 311.
ȳ_B = The mean of the observations belonging to a particular level of the factor B of a factorial experiment. Page 311.
ȳ_k = The mean of the kth sample, e.g., ȳ₁, ȳ₂, etc. Page 119.
ȳ_r = The mean of a replication of a randomized block experiment. Page 198.
ȳ_t = The mean of a treatment of a randomized block experiment. Page 198.
ȳ_x = An adjusted sample mean or a sub-sample mean. Page 250.
ŷ_x' = The estimated mean of y at x = x'. Page 283.
ŷ_x̿ = The estimated mean of y at x = x̿. Page 355.
CAPITAL LETTERS AB C
DF E
Ek F
F.05 G
e"
.. The interaction between the factors A and B of a factorial experiment. Page 312. = Defined in Equation (1U, Section 16.3. Used in this section only. Page 253. == Degrees of freedom. Page 67. == A sample estimate of a parameter, such as i b, and~. Used in Section 19.6 only. Page 359. == An estimate oE a parameter obtained from the kth Il8D1ple, e.g., E 10 E 20 etc. Page 359. = A statistic which follows the F-distribution. Pages 106, 158, 204, 227, 263. == The 5% point of the F-distribution. Page 235. == The grand total. Page 151. == The grand total of ,,-values. Page 347.
SYMBOLS AND ABBREVIA nONS
= The
G
551
grand total of y-values. Page 347.
LSD = The least significant difference between two means. Page 233.
LSD.05 = The LSD with the significance level equal to 5%. Page 234.
M = A multiplier used in obtaining an individual degree of freedom of the sum of squares. Page 221.
M_k = The kth multiplier used in obtaining an individual degree of freedom, such as M₁, M₂, etc. Page 221.
MS = Mean square. Page 158.
N = The number of observations in a population. Page 4.
Q = An individual degree of freedom of the treatment sum of squares, e.g., Q₁, Q₂, etc. Pages 227, 228.
R = (1) The sum of the observations of the replication which has a missing observation. Used in Section 14.7 only. Page 210. (2) A quantity used in the short-cut method of computing the chi square value for a k × r contingency table. Used in Section 22.8 only. Page 443.
S = The grand total of a randomized block experiment with a missing observation. Used in Section 14.7 only. Page 210.
SP = The sum of products. Page 266.
SP_k = The sum of products for the kth sample, e.g., SP₁, SP₂, etc. Page 345.
SS = The sum of squares. Page 64.
SS_x = The SS for the x-values. Page 267.
SS_xk = The SS for the x-values of the kth sample, e.g., SS_x1, SS_x2, etc. Page 345.
SS_y = The SS for the y-values. Page 267.
SSR = The shortest significant range. Page 238.
T = A sample total, or the sum of the observations of a sample. Page 151.
T_A = The sum of the observations belonging to a particular level of the factor A of a factorial experiment. Page 311.
T_B = The sum of the observations belonging to a particular level of the factor B of a factorial experiment. Page 311.
T_k = The total of the kth sample, such as T₁, T₂, etc. Page 151.
T_r = A replication total, or the sum of the observations belonging to a particular replication of a randomized block experiment. Page 198.
T_t = A treatment total, or the sum of the observations belonging to a particular treatment of a randomized block experiment. Page 198.
T_x = The sum of the x-values of a sample. Page 364.
T_y = The sum of the y-values of a sample. Page 364.
U = A version of the sample variance defined in Equation (4), Section 7.2. Page 60.
V = A version of the sample variance defined in Equation (5), Section 7.2. Page 61.
W = The weight used in obtaining a weighted mean. Page 359.
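As a reminder of how several of the quantities above are computed, the usual forms are sketched below (assuming, for the LSD, two treatment means each based on n observations and a pooled variance s²_p; the book's own equations on the cited pages govern):

\[
SS = \sum (y - \bar{y})^2, \qquad SP = \sum (x - \bar{x})(y - \bar{y}), \qquad LSD_{.05} = t_{.025}\sqrt{\frac{2s_p^2}{n}}
\]

A pair of treatment means whose difference exceeds the LSD is declared significantly different at the 5% level.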
GREEK LETTERS
α = The mean of a population with several x-arrays. Page 247.
β = The population regression coefficient of y on x. Page 247.
β̄ = The mean of several population regression coefficients. Page 349.
β_k = The regression coefficient of the kth population, e.g., β₁, β₂, etc. Page 349.
μ = The mean of a population. Page 3.
μ̄ = The mean of several population means. Page 166.
μ' = The hypothetical value of a population mean. Page 46.
μ_a = The mean of the statistic a. Page 277.
μ_A = The population mean of the factor A of a factorial experiment, or the mean of the statistic ȳ_A. Page 319.
μ_b = The mean of the statistic b. Page 280.
μ_B = The population mean of the factor B of a factorial experiment, or the mean of the statistic ȳ_B. Page 319.
μ_k = The mean of the kth population, e.g., μ₁, μ₂, etc. Page 119.
μ_r = The population mean of a replication of a randomized block experiment, or the mean of the statistic ȳ_r. Page 201.
μ_t = The population mean of a treatment of a randomized block experiment, or the mean of the statistic ȳ_t. Page 201.
μ_ȳ = The mean of the statistic ȳ. Page 35.
μ_v = The mean of the statistic v. Page 222.
μ_y·x = The mean of the observations y within an x-array of a population. Page 245.
μ_y·x' = The mean of y at x = x'. Page 247.
μ_y·x₁ = The mean of y at x = x₁. Page 247.
μ_ŷ = The mean of the statistic ŷ. Page 284.
μ_(ȳ₁−ȳ₂) = The mean of the statistic ȳ₁ − ȳ₂, which is the difference between two sample means. Page 121.
ν = The number of degrees of freedom. Page 66.
ν₁ = (1) The number of degrees of freedom of the sum of squares of the first sample. Page 106. (2) The number of degrees of freedom of the numerator of the statistic F. Page 105.
ν₂ = (1) The number of degrees of freedom of the sum of squares of the second sample. Page 106. (2) The number of degrees of freedom of the denominator of the statistic F. Page 105.
π = The relative frequency of successes of a binomial population. Table 21.10, page 426.
π_r = The relative frequency of the observations of the rth category of a multinomial population, e.g., π₁, π₂, etc. Page 432.
σ = The standard deviation of a population. Page 4.
σ² = The variance of a population. Page 4.
σ'² = The hypothetical value of a population variance. Page 78.
σ²₁ = The variance of the first population. Page 105.
σ²_a = The variance of the statistic a. Page 277.
σ²_A = (1) The variance due to the factor A for the linear hypothesis model, as defined in Equation (2), Section 18.5. Page 320. (2) The variance due to the factor A for the variance component model. Page 324.
σ²_AB = (1) The variance due to the interaction AB for the linear hypothesis model, as defined in Equation (4), Section 18.5. Page 320. (2) The variance due to the interaction AB for the variance component model. Page 324.
σ²_b = The variance of the statistic b. Page 280.
σ²_B = (1) The variance due to the factor B for the linear hypothesis model, as defined in Equation (3), Section 18.5. Page 320. (2) The variance due to the factor B for the variance component model. Page 324.
σ²_r = The variance due to replications. Page 204.
σ²_t = The variance due to treatments. Page 204.
σ²_v = The variance of the statistic v. Page 223.
σ_ȳ = The standard error of the mean, or the standard deviation of the statistic ȳ. Page 36.
σ²_ȳ = The variance of the sample mean ȳ. Page 36.
σ²_ŷ = The variance of the statistic ŷ. Page 284.
σ²_μ = The variance of a number of population means, as defined in Equation (2), Section 12.4. Page 164.
σ_(ȳ₁−ȳ₂) = The standard error of the difference between two sample means. Page 124.
σ²_(ȳ₁−ȳ₂) = The variance of the difference between two sample means. Page 124.
σ²_(ȳ'₁−ȳ'₂) = The variance of the difference between two adjusted means. Page 365.
φ = An angle to which a percentage is transformed. Page 448.
χ² = A statistic which follows the chi square distribution. Pages 65, 404, 405, 412, 413, 433, 435, 438, 443, 444.