STATISTICS TABLES for mathematicians, engineers, economists and the behavioural and management sciences
H.R.Neave
London and New York
First published 1978 by George Allen & Unwin
Routledge is an imprint of the Taylor & Francis Group
This edition published in the Taylor & Francis e-Library, 2009.
To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.
© 1978 H.R.Neave
All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.
British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library
ISBN 0-203-01167-8 (Master e-book)
ISBN 0-203-15532-7 (Adobe ebook Reader Format)
ISBN 0-415-10485-8 (Print Edition)
Preface

For several years I have been teaching a first-year undergraduate Statistics course to students from many disciplines including, amongst others, mathematicians, economists and psychologists. It is a broad-based course, covering not only probability, distributions, estimation, hypothesis testing, regression, correlation and analysis of variance, but also non-parametric methods, quality control and some simple operations research, especially simulation. No suitable book of tables seemed to exist for use with this course, and so I collected together a set of tables covering all the topics I needed; this, after various improvements and extensions, has developed into the current volume. I hope it will now also aid other teachers to extend the objectives of their own applied Statistics courses and to include topics which cannot otherwise be covered very meaningfully except with practice and use of some such convenient set of tables.

In preparing this book, I have recomputed the majority of the tables, and have thus been able to extend several beyond what has normally been previously available. I have also attempted some kind of consistency in such things as the choice of quantiles (or percentage points) given for the various distributions. Complete consistency throughout has unfortunately seemed unattainable because of the differing uses to which the various tables can be put.

Two particular conventions should be mentioned. If, in addition to providing critical values, a table can be used for forming confidence intervals (such as tables of normal, t, χ2 and F distributions), then quantiles q are indicated, i.e. the solutions of F(x)=q, where F( ) represents the cumulative distribution function of the statistic being tabulated, and x the tabulated values. If a table is normally used only for finding critical values (such as tables for non-parametric tests, correlation coefficients, and the von Neumann and Durbin-Watson statistics), then significance levels are quoted which apply to two-sided or general alternatives. To make clear which of the two conventions is relevant, quantiles are referred to in decimal format (0·025, 0·99, etc.) whereas significance levels are given as percentages (5%, 1%, etc.).

I would like to express my gratitude to a number of people who have helped in the production of this book: to Dr D.S.Houghton, Dr G.J.Janacek, Cliff Litton, Arthur Morley and Peter Worthington, who have ‘vetted’ the work at various stages of its progress; to the staff of the Cripps Computing Centre at Nottingham University, who have helped in very many ways; to Tonie-Carol Brown for help with preparing data for some of the computer programs; and to Betty Page for typing the text. Responsibility for any errors is mine alone. There should not be many, as the tables have been subjected to many hours of checking and cross-checking, but I would be grateful to anyone who points out to me any possible mistakes, whether they be substantiated or merely suspected. I would also greatly appreciate any suggestions for improvements which might be incorporated in subsequent editions.

HENRY NEAVE
Nottingham University, June 1977
Preface to the second impression

The only errors of any substance found in the first impression were in Charts 1.2(a, b), which were drawn a little inaccurately because of a tracking error on an incremental graph plotter; these charts have now been redrawn. Improved versions of Charts 6.1(a, b) and of Tables 6.4 and 6.5 have also been included, and a few minor amendments made to the text and the layout. Those who purchased the first impression are welcome to write to the author at the Department of Mathematics, University of Nottingham, Nottingham NG7 2RD, England for correct copies of Charts 1.2(a, b) and details of any other alterations.
Acknowledgements

Most of the tables have been newly computed for this publication; the exceptions are cited below. But I would like to give a general note of acknowledgement to all those who have previously published sets of tables, for their ideas concerning content and layout, which have clearly contributed indirectly to this volume. I am indebted to the publishers for permission to include material in full or in part from the following journals: the Annals of Mathematical Statistics (Tables 2.5(a), (b)), Biometrics (Table 4.1), The Statistician (Table 4.4), the Journal of the American Statistical Association (Table 5.4) and Statistica Neerlandica (Table 6.4). Tables 4.2 and 5.2(c) respectively have been derived from Selected Tables in Mathematical Statistics Vol. 3 (1975), pp. 329–84 and Vol. 1 (1970), pp. 130–70 with permission of the publisher, the American Mathematical Society. Part of Table 2.3(a) has been derived from Biometrika Tables for Statisticians Vol. 2, Table 1 with permission of the Biometrika Trustees. Tables 6.7 and 6.8 have been taken from On the theory and application of the general linear model by J.Koerts and A.P.J.Abrahamse (Rotterdam University Press) by permission of the publisher and authors. The redrawn Poisson probability chart (Chart 1.3(b)) has been included by permission of H.W.Peel & Co. Table 8.1(a) has been abridged from Table 13 of Attribute Sampling by H.Burstein (McGraw-Hill) by permission of the publisher and author.
Contents

SECTION 1: DISCRETE PROBABILITY DISTRIBUTIONS
1.1 the binomial c.d.f.
1.2(a), (b) charts giving 95% and 99% confidence intervals for p
1.3(a) the Poisson c.d.f.
1.3(b) Poisson probability chart

SECTION 2: THE NORMAL DISTRIBUTION
2.1(a) the c.d.f. of the standard normal distribution
2.1(b) extreme values of the standard normal c.d.f.
2.2 ordinates of the standard normal density function
2.3(a) quantiles (percentage points) of the standard normal distribution
2.3(b) the inverse normal function
2.4(a) normal scores (expected values of normal order statistics)
2.4(b) sums of squares of normal scores
2.5(a) moments of the range distribution
2.5(b) quantiles of the range distribution

SECTION 3: CONTINUOUS PROBABILITY DISTRIBUTIONS
3.1 the Student t distribution
3.2 the χ2 (chi-squared) distribution
3.3 the F distribution

SECTION 4: ANALYSIS OF VARIANCE
4.1 factors for Duncan’s multiple-range test
4.2 the Kruskal-Wallis test
4.3 the Friedman test
4.4 quick multi-sample tests

SECTION 5: NON-PARAMETRIC TESTS
5.1 Wilcoxon signed-rank test
5.2(a) Kolmogorov-Smirnov one-sample test
5.2(b) Kolmogorov-Smirnov asymptotic critical values
5.2(c) Kolmogorov-Smirnov two-sample test
5.3 Mann-Whitney/Wilcoxon/rank-sum two-sample test
5.4 improved Tukey quick test
5.5 Wald-Wolfowitz number-of-runs test

SECTION 6: CORRELATION
6.1(a), (b) charts giving 95% and 99% confidence intervals for ρ
6.2 critical values rn, α of the linear correlation coefficient r
6.3(a) the Fisher z-transformation
6.3(b) the inverse z-transformation
6.4 critical values of the Spearman rank correlation coefficient
6.5 critical values of the Kendall rank correlation coefficient
6.6 critical values of the multiple correlation coefficient
6.7 critical values of the von Neumann ratio
6.8 bounds for critical values of the Durbin-Watson statistic

SECTION 7: RANDOM NUMBERS
7.1 random digits
7.2 random numbers from the standard normal distribution
7.3 random numbers from the exponential distribution with unit mean

SECTION 8: QUALITY CONTROL
8.1(a) the construction of single-sampling plans
8.1(b) the operating characteristic of single-sampling plans
8.2(a) one-sided tolerance factors for normal distributions
8.2(b) two-sided tolerance factors for normal distributions
8.3(a) sample sizes for one-sided non-parametric tolerance limits
8.3(b) sample sizes for two-sided non-parametric tolerance limits
8.4 control chart constants

SECTION 9: MISCELLANEOUS
9.1 reciprocals, squares, square roots, cubes, cube roots
9.2 binomial coefficients
9.3(a) factorials
9.3(b) logarithms of factorials
9.4(a) the exponential function
9.4(b) the negative exponential function
9.5 natural logarithms
9.6(a) common logarithms
9.6(b) antilogarithms
Useful constants
Asymptotic (large-sample) distributions (inside back cover)
Descriptions of the Tables
(References to appropriate books or articles are given for the less familiar tables)

Section 1: Discrete probability distributions
The quantity tabulated in Table 1.1 (pp. 20–26) is

$$F(x) = \sum_{k=0}^{x} \binom{n}{k} p^k (1-p)^{n-k}$$
which is the probability of obtaining x or less ‘successes’ in n independent trials of an experiment, where the probability of a success at each trial is p. Individual probabilities are easily obtained using P(0)=F(0) and P(x)=F(x)−F(x−1) for x>0. The table covers all n≤20 and p=0·01(0·01)0·10(0·05)0·50. For values of p>½, probabilities may be calculated by reversing the roles of ‘success’ and ‘failure’.

Charts 1.2 (pp. 27–28) give (a) 95% and (b) 99% confidence intervals for p on the basis of a binomial sample of size n in which X ‘successes’ occur. If f=X/n≤½, locate the value of f on the bottom horizontal axis, trace up to the two curves labelled with the appropriate value of n, and read off the confidence limits on the left-hand vertical axis; if f>½, use the top horizontal axis and the right-hand vertical axis. For each value of n, the appropriate points have been plotted for all possible values of f and these points joined by straight lines to aid legibility. The charts may also be used ‘in reverse’ to provide (a) 5% and (b) 1% two-sided critical regions for a hypothesis test of H0: p=p0 against H1: p≠p0, or equivalently (a) 2½% and (b) ½% one-sided critical regions. Results for values of n not included may be obtained by interpolation.

The quantity tabulated in Table 1.3(a) (pp. 29–32) is

$$F(x) = \sum_{k=0}^{x} \frac{e^{-\mu}\mu^k}{k!}$$

this being the cumulative distribution function (c.d.f.) of the Poisson distribution with mean µ. Individual probabilities may be found as in Table 1.1. For µ>2·0, the c.d.f. occupies two or more rows of the table, the first row giving F(0) to F(9), the second row giving F(10) to F(19), etc. The Poisson probability chart, Chart 1.3(b) (p. 33), gives Prob (X≥c)=1−F(c−1), where X has the Poisson distribution with mean µ. The value of µ, ranging from 0·1 to 100, is found on the horizontal axis and the probabilities are read on the left-hand vertical axis. There is a curve for each of the following values of c: 1(1)25(5)100(10)150. The horizontal axis has a logarithmic scale and the vertical axis a normal probability scale.
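For readers who want to check or extend Tables 1.1 and 1.3(a) by computer, the following Python sketch (mine, not part of the book; it assumes scipy is installed) evaluates the two c.d.f.s, recovers individual probabilities, and applies the success/failure reversal for p>½.

```python
# A sketch of mine (scipy assumed) reproducing the quantities just described.
from scipy.stats import binom, poisson

n, p = 20, 0.30
F = [binom.cdf(x, n, p) for x in range(n + 1)]   # Table 1.1: F(x)
P3 = F[3] - F[2]                                 # P(X = 3) = F(3) - F(2)

# For p > 1/2, reverse the roles of 'success' and 'failure':
# P(X <= x | n, p) = 1 - F(n - x - 1 | n, 1 - p).
assert abs(binom.cdf(4, n, 0.7) - (1 - binom.cdf(n - 5, n, 0.3))) < 1e-12

mu, c = 2.0, 3
F_poisson = poisson.cdf(c - 1, mu)               # Table 1.3(a)
prob_at_least_c = 1 - F_poisson                  # Chart 1.3(b): Prob(X >= c)
```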
Section 2: The normal distribution
Table 2.1(a) (pp. 34–5) gives values of the standard normal c.d.f. Φ(z) for z=−4·00(0·01)3·00, expressed to four decimal places (4 d.p.) and with proportional parts for the third decimal place of z, and also for z=3·00(0·01)5·00 to 6 d.p. Note that proportional parts are subtracted if z<0. Clearly by symmetry one can also obtain values of Φ(z) to 4 d.p. with proportional parts for z=3·00(0·01)4·00 and to 6 d.p. for z=−5·00(0·01)−3·00. If F(x) is the c.d.f. of N(µ, σ2), i.e. the normal distribution with mean µ and variance σ2, then F(x)=Φ((x−µ)/σ). Table 2.1(b) (p. 36) gives values of 1−Φ(z)=Φ(−z) for a range of values of z from 3·0 to 200·0.

Table 2.2 (p. 36) is a brief table of the ordinates of the standard normal density function. Table 2.3(b) (p. 37) gives the inverse normal function, i.e. it provides the values of z satisfying Φ(z)=q=1−p for q=0·500(0·001)1·000. Table 2.3(a) (p. 36) is essentially a selection of the more useful values from Table 2.3(b), but with some extra extreme points included. Further, six particularly important values are given to 10 d.p.

Table 2.4(a) (pp. 38–9) gives expected values of normal order statistics (normal scores) for sample sizes n≤50, i.e. the values E[Z(i)], where Z(1), Z(2),…, Z(n) represents a sample of size n from N(0, 1) arranged in ascending order. The values tabulated are those in the upper half of the sample; the remaining values may be obtained by symmetry, viz. E[Z(i)]=−E[Z(n+1−i)] (i=1, 2,…, ½n). These quantities are useful in formulating some particularly powerful non-parametric tests (see Section 5); the variances of such test statistics usually involve the sums of squares ∑E[Z(i)]2, and these are given in Table 2.4(b) (pp. 38–9).

Tables 2.5(a) and (b) (p. 40) respectively give moments and quantiles of the distribution of the range R = maximum value − minimum value of samples from normal distributions, for sample sizes up to 20. Denoting the expected (mean) range by E[R] and the central moments of R by rk=E[(R−E[R])k], the five columns of Table 2.5(a) give E[R]/σ, √r2/σ, r2/σ2, r3/σ3 and r4/σ4, where σ2 is the variance of the normal distribution. In particular the first and second columns give the mean and standard deviation of R in units of σ. Table 2.5(b) gives six quantiles at both sides of the distribution of R. Further reading: Table 2.4, Bradley (1968, §6.2); Table 2.5, Lindgren (1976, §7.2.1).
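The entries of this section are also easy to recompute. Below is a sketch of mine (assuming numpy and scipy; the function name normal_score is my own) which evaluates Φ and its inverse, and obtains a normal score E[Z(i)] by numerically integrating against the density of the i-th order statistic.

```python
# A sketch, not the book's code: Phi, its inverse, and normal scores.
import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.special import comb

print(norm.cdf(1.96))    # Phi(z), cf. Table 2.1(a): ~0.9750
print(norm.ppf(0.975))   # inverse normal function, cf. Table 2.3(b): ~1.9600

def normal_score(i, n):
    """E[Z(i)]: mean of the i-th order statistic of an N(0,1) sample of size n."""
    dens = lambda z: (n * comb(n - 1, i - 1)
                      * norm.cdf(z) ** (i - 1)
                      * (1 - norm.cdf(z)) ** (n - i)
                      * norm.pdf(z))
    return quad(lambda z: z * dens(z), -np.inf, np.inf)[0]

scores = [normal_score(i, 5) for i in range(1, 6)]
assert abs(scores[0] + scores[4]) < 1e-8   # symmetry: E[Z(1)] = -E[Z(5)]
print(sum(v * v for v in scores))          # sum of squares, cf. Table 2.4(b)
```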
Section 3: Continuous probability distributions
Table 3.1 (p. 41) gives 13 quantiles tv,q of the Student t distributions for degrees of freedom v covering 1(1)40, 45, 50(10)100, 120, 150, ∞. The quantiles are all in the right-hand half of the distributions; values in the left-hand half may be obtained by symmetry: tv,q=−tv,1−q. Table 3.2 (pp. 42–3) gives 25 quantiles of the χ2 (chi-squared) distributions for degrees of freedom v covering 1(1)40, 45, 50(10)100, 120, 150, 200. Quantiles are given in both the left-hand and right-hand halves of the distributions, since χ2 distributions are not symmetrical.
Table 3.3 (pp. 44–7) gives six right-hand quantiles Fv1,v2,q of the Snedecor F distributions for the ‘numerator’ degrees of freedom v1=1(1)10, 12, 15, 20, 30, 50, ∞ and the ‘denominator’ degrees of freedom v2=1(1)25(5)50, 60, 80, 100, 120, ∞. The six quantiles for a particular choice of (v1, v2) are given in a block for easy reference, rather than using a separate page for each value of q. Critical regions for tests in analysis of variance etc., and for tests of H0: σ12=σ22 against H1: σ12>σ22 using the statistic F=s12/s22 (where s1 and s2 are the adjusted standard deviations of the two samples; see p. 88), are of the form F≥Fv1,v2,q, so in these cases the tables immediately give the required critical values. When testing H0 against H1: σ12≠σ22, use F=s12/s22 if s1≥s2, or F=s22/s12 if s2>s1 (in which case the values of v1 and v2 must be interchanged), and then F≥Fv1,v2,q corresponds to a rejection of H0 at the significance level 2(1−q). If required, left-hand quantiles may be obtained from

$$F_{v_1,v_2,q} = \frac{1}{F_{v_2,v_1,1-q}}.$$
Section 4: Analysis of variance
Clearly, Table 3.3 (pp. 44–7) could be regarded as part of this section, as the ratios of variance estimators used in analysis of variance have F-distributions under the appropriate null hypotheses.

Table 4.1 (p. 48) permits a more detailed analysis of the effects of different levels of a factor. Suppose that x̄1, x̄2,…, x̄k represent the mean values of the observations at the k levels of a factor in an analysis of variance experiment, there being n observations at each level. Then if the residual (error) unbiased estimate of the variance is s2, based on v degrees of freedom, the critical value of the range of any m adjacent means is Rm,v,α=rm,v,α s/√n, where rm,v,α is read from the table and α is the significance level 5% or 1%; i.e. if the actual range exceeds Rm,v,α one may reject the hypothesis that those m levels are indistinguishable.

Tables 4.2 and 4.3 (p. 49) give critical values for well-known non-parametric tests for the one-way and two-way classification analysis of variance situations. For the Kruskal-Wallis test, suppose there are random samples of sizes n1, n2,…, nk from k populations. If the N=∑ni observations are ranked (ordered) from 1 to N and the k rank-sums are R1, R2,…, Rk, the Kruskal-Wallis statistic is

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1).$$

Table 4.2 gives 5% and 1% critical values for various sample sizes for k=3, 4, 5. For the Friedman test, we assume there are N=nk observations, one from each combination of one of k treatments with one of n blocks. The statistic for testing for differences between the treatments is formed by ranking the k treatments from 1 to k within each block, letting R1, R2,…, Rk denote the treatment rank-sums, and computing

$$S = \frac{12}{nk(k+1)} \sum_{i=1}^{k} R_i^2 - 3n(k+1).$$

Table 4.3 gives 5% and 1% critical values for k=3, 4, 5, 6 and various values of n (the range of n is smaller for higher k because the null distributions are then much more difficult to compute). Differences between blocks may be tested similarly. Asymptotically both H and S have the χ2 distribution with k−1 degrees of freedom.

Table 4.4 (p. 50) gives 5%, 1% and 0·1% critical values for some quick non-parametric tests for slippage (shift in location or mean) of one population away from the others. There are four tests depending on the form of the alternative hypothesis: slippage of (A) a specified population, or (B) any population, in (1) a specified direction, or (2) either direction. The test statistic in each case is the number of observations from one sample which are greater than (or less than, as appropriate) all the observations from all the other samples. In case (A1), if the specified direction is upwards, the statistic TA1 is the number of observations (possibly 0) from the specified sample exceeding the maximum observation from all other samples; if downwards, then it is the number of observations less than the minimum from all other samples. In case (A2), the statistic TA2 is the larger of the two possibilities for case (A1). In case (B1), the statistic TB1 is as in (A1) except that one chooses the sample containing the overall maximum (or minimum) observation (so the value must be at least 1), and in case (B2), one takes TB2 as the larger of the two possibilities in (B1). The tables assume a common sample size n, and cover k=3(1)10, 15, 20 samples and all values of n. E.g. for test (A2), the 0·1% critical values for k=3 are 5 for n=5, 6 for n=6 to 12, and 7 for n≥13, these being indicated in the table by 5(5), 6(6–12), 7(13 ↑). Further reading: Table 4.1, Miller and Freund (1965, §13.4); Table 4.2, Siegel (1956, ch. 8); Table 4.3, Siegel (1956, ch. 7); Table 4.4, Neave (1972).
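A sketch of my own (assuming numpy, and scipy for the ranking) computing H and S exactly as defined above, for comparison with Tables 4.2 and 4.3:

```python
import numpy as np
from scipy.stats import rankdata

def kruskal_wallis_H(samples):
    """samples: a list of 1-D arrays, one per population."""
    pooled = np.concatenate(samples)
    ranks = rankdata(pooled)                   # ranks 1..N (midranks for ties)
    N = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        R = ranks[start:start + len(s)].sum()  # rank-sum for this sample
        total += R * R / len(s)
        start += len(s)
    return 12.0 / (N * (N + 1)) * total - 3.0 * (N + 1)

def friedman_S(data):
    """data: an (n blocks) x (k treatments) array."""
    n, k = data.shape
    ranks = np.apply_along_axis(rankdata, 1, data)   # rank within each block
    R = ranks.sum(axis=0)                            # treatment rank-sums
    return 12.0 / (n * k * (k + 1)) * (R * R).sum() - 3.0 * n * (k + 1)
```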
Section 5: Non-parametric tests
To test the hypothesis H0 that m is the median of a symmetrically-distributed population from which a random sample of size n is drawn, subtract m from each observation, rank the resulting differences from 1 to n without regard to sign, and if W+ and W− are the rank-sums of the positive and negative differences respectively, form W=min (W+, W−). Table 5.1 (p. 51) gives 5% and 1% critical values for W with n≤100. The test is often used in matched-pairs situations, when one uses the differences of the matched observations. One-sided tests at 2½% and ½% significance levels are based on W+ or W− as appropriate.

The one-sample Kolmogorov-Smirnov test tests the null hypothesis that a random sample of size n comes from a population having a specified c.d.f. F(x). Denoting by Fn(x) the empirical distribution function of the sample, and letting

$$D_n^+ = \sup_x \{F_n(x) - F(x)\}, \qquad D_n^- = \sup_x \{F(x) - F_n(x)\},$$

Table 5.2(a) (p. 51) gives 5% and 1% critical values for Dn=max (Dn+, Dn−) with n≤100. One-sided tests may be based on Dn+ or Dn− as appropriate. (The entries in the table are actually based on the null distribution of the one-sided statistic, but when used for the two-sided statistic, there are only occasional discrepancies in the fourth d.p.) Asymptotic critical values for Dn are obtained from Table 5.2(b) (p. 51) by multiplying the tabulated values, a, by (1/n)½.
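The statistic Dn is simple to compute once the sample is sorted. A sketch of mine (the function name is my own; numpy and scipy assumed):

```python
import numpy as np
from scipy.stats import norm

def ks_one_sample(x, cdf):
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    F = cdf(x)
    d_plus = np.max(np.arange(1, n + 1) / n - F)   # sup of F_n(x) - F(x)
    d_minus = np.max(F - np.arange(0, n) / n)      # sup of F(x) - F_n(x)
    return max(d_plus, d_minus)                    # compare with Table 5.2(a)

# e.g. testing an N(0,1) null hypothesis: ks_one_sample(sample, norm.cdf)
```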
The remaining four tables give 5% and 1% critical values for two-sample non-parametric tests, mainly appropriate for location differences, though the Kolmogorov-Smirnov and runs tests have power for other types of differences also. In each case a full table for unequal sample sizes nS<nL≤25 is given (S=Smaller sample, L=Larger sample), and equal sample sizes are then dealt with in a separate table. For unequal sample sizes, 5% critical values are given above the main diagonal (nL is read on the top horizontal and nS on the right-hand vertical), and 1% critical values are found below the main diagonal (nL is read on the left-hand vertical and nS on the bottom horizontal). Critical values for equal sample sizes n are given for n≤100 in Tables 5.2(c) and 5.5, n≤50 in Table 5.3, and all n in Table 5.4. For sample sizes greater than those included in the tables, some asymptotic (large-sample) null distributions are given on the inside back cover.

Critical values for the two-sample Kolmogorov-Smirnov test are obtained from Table 5.2(c) (p. 52). If D represents the maximum absolute difference between the empirical c.d.f.s of the two samples, the main table gives critical values for nSnLD and the equal-sample-size table gives critical values for nD. Critical values for D with large samples may be obtained from Table 5.2(b) (p. 51) by multiplying the tabulated values, a, by {(1/nS)+(1/nL)}½ or (2/n)½.

The Mann-Whitney statistic U may be defined as follows. Naming the two samples as A and B, compare each member of A in turn with each member of B. Let UAB denote the number of pairs in which the A-value is less than the B-value and UBA the number in which the B-value is less than the A-value (count ½ for ties). Then U=min (UAB, UBA), and critical values are given in Table 5.3 (p. 53). Alternatively, UAB or UBA may be computed by subtracting ½m(m+1) from the sum of the ranks of observations from one of the samples when the samples are jointly ordered, where m is the size of that sample. A related test statistic C is produced by performing a similar ‘rank-sum’ calculation but where the ranks are replaced by normal scores (Table 2.4(a)). The asymptotic null distribution (see inside back cover) is fairly accurate even for quite small samples. (See Bradley (1968, §6.2).)

In 1959, Tukey proposed the following quick test. Suppose the overall minimum and maximum observations come from different samples, say A and B respectively (otherwise set the statistic T to 0). Count the number of A-values less than the minimum B-value, and the number of B-values exceeding the maximum A-value, and add to give the statistic T. Tukey showed that if the sample sizes are not very small and not very unequal, the 5%, 1% and 0·1% critical regions are T≥7, T≥10 and T≥13. The test’s power may however be improved by first deleting a single observation, chosen so as to maximise the resulting value of T, denoted by TN. (Exceptionally, if all the As are less than all the Bs, define TN=T, not T−1.) Critical values of TN are generally about 3 greater than for T, and are given in Table 5.4 (p. 54).

Finally, if the observations in samples A and B are jointly ordered, the number-of-runs statistic R, for which critical values are given in Table 5.5 (p. 55), is the total number of runs of consecutive As and Bs in this ordering (a ‘run’ may consist of just a single observation). The Kolmogorov-Smirnov, Mann-Whitney and improved Tukey tests are easily adaptable to one-sided alternative hypotheses, in which case the tabulated critical values correspond to 2½% and ½% significance levels. Further reading: Table 5.4, Neave (1966); all other tables, Siegel (1956, chs 4–6).
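For concreteness, here is a direct Python transcription (my code, not the book's) of the Mann-Whitney counting definition and of Tukey's original statistic T as described above; the improved statistic TN would additionally search over single-observation deletions.

```python
# My own transcription of the definitions above (pure Python, no imports).
def mann_whitney_U(a, b):
    u_ab = sum(0.5 if x == y else float(x < y) for x in a for y in b)
    u_ba = sum(0.5 if x == y else float(y < x) for x in a for y in b)
    return min(u_ab, u_ba)          # compare with Table 5.3: reject if small

def tukey_T(a, b):
    # T is nonzero only when the overall min and max fall in different samples.
    if min(a) < min(b) and max(a) < max(b):     # A holds the min, B the max
        return (sum(x < min(b) for x in a) +
                sum(y > max(a) for y in b))
    if min(b) < min(a) and max(b) < max(a):     # roles reversed
        return tukey_T(b, a)
    return 0
```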
Section 6: Correlation
If (X1, Y1), (X2, Y2),…, (Xn, Yn) denotes a random sample from a bivariate normal distribution with correlation coefficient ρ, the sample linear correlation coefficient

$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$$

is used for estimating and testing hypotheses about ρ. Charts 6.1 (pp. 56–7) give (a) 95% and (b) 99% confidence intervals for ρ for a selection of sample sizes: find the points of intersection of the two curves labelled with the sample size n with the vertical through the obtained value of r, and read off the confidence limits for ρ on the vertical axes. The charts may also be used ‘in reverse’ to provide (a) 5% and (b) 1% two-sided critical regions for tests of H0: ρ=ρ0 against H1: ρ≠ρ0, or equivalently (a) 2½% and (b) ½% one-sided critical regions. Results for values of n not included may be obtained by interpolation.

Table 6.2 (p. 58) gives critical values for r in testing H0: ρ=0. Denoting a typical table entry by rn,α, critical regions for testing at significance level α against the alternative hypotheses H1: ρ≠0, ρ>0, ρ<0 are |r|≥rn,α, r≥rn,2α and r≤−rn,2α respectively. Inferences about ρ may also be made by using the Fisher z-transformation:
$$z(r) = \tfrac{1}{2}\ln\!\left(\frac{1+r}{1-r}\right) = \tanh^{-1} r,$$

which is approximately normally distributed with mean z(ρ) and variance 1/(n−3). Table 6.3(a) (p. 58) gives z(r) for r=0·00(0·01)0·900(0·001)0·999, negative values being obtainable by symmetry. Table 6.3(b) (p. 59) gives the inverse z-transformation, calculating r given z for z=0·00(0·01)3·00 with proportional parts for the third d.p., and for z=3·0(0·1)7·9, these values being given to 6 d.p.

If in a bivariate sample (X1, Y1),…, (Xn, Yn) we rank the Xis and Yis in turn from 1 to n, denote these ranks by (x1, y1),…, (xn, yn), and let di=xi−yi, Spearman’s rank correlation coefficient ρs is defined as

$$\rho_s = 1 - \frac{6\sum d_i^2}{n^3 - n}.$$

Denoting the entries in Table 6.4 (p. 60) by ρn,α, a critical region of the form ρs≥ρn,2α is appropriate for testing at significance level α that the Xis and Yis are similarly ranked, i.e. that they tend to increase together; further, ρs≤−ρn,2α tests at level α that the Xis and Yis are contrarily ranked, i.e. that the Yis tend to decrease as the Xis increase. Values of ⅙(n3−n) are also given.

Kendall’s rank correlation coefficient performs a similar task. For any i≠j, if Xi−Xj has the same sign as Yi−Yj then the (i, j) pair is said to be concordant, otherwise it is discordant. Considering all ½n(n−1) pairs with, say, 1≤i<j≤n, let NC and ND be the numbers of concordant and discordant pairs. Kendall’s rank correlation coefficient τ is (NC−ND)/½n(n−1). As with ρs, values near +1 indicate similar rankings, values near −1 indicate contrary rankings. Denoting the entries in Table 6.5 (p. 60) by τn,α, critical regions of the forms τ≥τn,2α and τ≤−τn,2α test at significance level α that the Xis and Yis are similarly or contrarily ranked respectively. Values of ½n(n−1) are also given. With both the Spearman and Kendall coefficients, critical regions for two-sided tests are obtainable by combining two one-sided regions.
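A sketch of mine (function names are my own; numpy and scipy assumed) implementing the z-transformation interval and the two rank correlation coefficients exactly as defined above:

```python
import math
import numpy as np
from scipy.stats import rankdata, norm

def fisher_z_interval(r, n, level=0.95):
    z = math.atanh(r)                            # z(r) = arctanh r
    h = norm.ppf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - h), math.tanh(z + h)    # compare with Charts 6.1

def spearman_rho(x, y):
    d = rankdata(x) - rankdata(y)
    n = len(x)
    return 1 - 6 * np.sum(d * d) / (n**3 - n)

def kendall_tau(x, y):
    n = len(x)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                nc += 1                          # concordant pair
            elif s < 0:
                nd += 1                          # discordant pair
    return (nc - nd) / (n * (n - 1) / 2)
```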
In a multiple linear regression Y=a0+a1X1+…+akXk, where a0, a1,…, ak are estimated by least squares, the multiple correlation coefficient R measures the goodness of fit of the model. Given a sample of n values of the vector (Y, X1,…, Xk), R is the positive square root of

$$R^2 = 1 - \frac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2},$$

with Ŷi denoting the fitted values and Ȳ representing the mean of the Y-values. If k=1, R=|r|. In general R is also the linear correlation coefficient of (Y, Ŷ). Denoting the values in Table 6.6 (p. 61) by Rk,n,α, the critical region R≥Rk,n,α tests at level α that R is significantly greater than 0, i.e. that the regression model is useful.

Next, if Y1, Y2,…, Yn denote observations taken at equidistant intervals of, say, time, the von Neumann ratio

$$V = \frac{\sum_{i=1}^{n-1} (Y_{i+1} - Y_i)^2 / (n-1)}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2 / n}$$

tests for the presence of serial correlation, indicating trends, cycles etc. Denoting the entries in Table 6.7 (p. 62) by Vn,α, the critical regions for tests at significance level α are V≤Vn,α. The Durbin-Watson statistic similarly tests for serial correlation, but in the residuals after a regression. The statistic is

$$d = \frac{\sum_{i=2}^{n} (r_i - r_{i-1})^2}{\sum_{i=1}^{n} (r_i - \bar{r})^2},$$

where r1,…, rn are the n residuals and r̄ is their mean. It is only possible to give bounds for the critical values: in Table 6.8 (pp. 62–3), d≤dL and d≤dU give critical regions corresponding to significance levels which are at most α and at least α respectively. k represents the number of regression variables, as in Table 6.6. Further reading: Tables 6.4 and 6.5, Siegel (1956, ch. 9); Table 6.6, Freund (1973, §15.5); Tables 6.7 and 6.8, Koerts and Abrahamse (1969, chs 4–6).
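Both serial-correlation statistics are short computations given the data. In this sketch (my code; the divisors n−1 and n in V follow the classical von Neumann definition, which I am assuming here), V is compared against Table 6.7 and d against the bounds of Table 6.8:

```python
import numpy as np

def von_neumann_V(y):
    y = np.asarray(y, dtype=float)
    n = len(y)
    num = np.sum(np.diff(y)**2) / (n - 1)   # mean square successive difference
    den = np.sum((y - y.mean())**2) / n     # variance (divisor n)
    return num / den                        # reject if V is small (Table 6.7)

def durbin_watson_d(residuals):
    r = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(r)**2) / np.sum((r - r.mean())**2)
```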
Section 7: Random numbers
Table 7.1 (pp. 64–5) gives 5000 random digits, arranged in blocks of five for ease of reading. Random digits may be used to simulate random samples from any probability distribution. First, if one takes random digits n at a time (where usually n=3, 4 or 5) and precedes them with a decimal point, then in essence random numbers U from the uniform distribution on (0, 1) are being generated (strictly speaking one should add on 0·5×10−n). If F( ) is the c.d.f. of a continuous distribution, the solution of U=F(X), i.e. X=F−1(U), yields random numbers from that distribution; this may be accomplished graphically or algebraically. Random numbers X from a discrete distribution with c.d.f. F( ), expressed to n d.p.s., may be generated by finding the smallest value of X for which F(X)>U.

Table 7.2 (p. 66) gives 500 random numbers from the standard normal distribution N(0, 1). To convert them to random numbers from the normal distribution with mean µ and variance σ2, N(µ, σ2), multiply by σ and add µ. Table 7.3 (p. 67) gives 500 random numbers from the exponential distribution with mean 1, having probability density function f(x)=e−x (x≥0). They may be converted to random numbers from the exponential distribution with mean µ by multiplying by µ. Further reading: Freund (1973, §§8.1, 8.3).
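The inverse-c.d.f. method described above is easy to express in code. A minimal sketch of mine, using Python's own uniform generator in place of the random digits of Table 7.1:

```python
import math
import random

u = random.random()              # U ~ uniform on (0, 1)

# Continuous case: the exponential with mean 1 has F(x) = 1 - e**(-x), so
# X = F^(-1)(U) = -ln(1 - U); multiply by mu for mean mu (cf. Table 7.3).
x_exp = -math.log(1 - u)
x_exp_mean5 = 5.0 * x_exp

# Normal case: scale and shift a standard normal value (cf. Table 7.2).
z = random.gauss(0, 1)
x_norm = 10.0 + 2.0 * z          # N(mu = 10, sigma**2 = 4)

# Discrete case: the smallest X with F(X) > U, here for a fair die.
F = [k / 6 for k in range(1, 7)]
x_die = next(k + 1 for k, Fk in enumerate(F) if Fk > u)
```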
Section 8: Quality control
A large batch (population) has a proportion p of defective items. A random sample of size n is drawn from the batch, and if the number of defectives found in the sample exceeds the acceptance number c the batch is rejected; otherwise it is accepted. It is desired to choose n and c such that if p≤p1 the probability is at most α that the batch is rejected, and if p≥p2 the probability is at most β that the batch is accepted. Table 8.1(a) (p. 68) enables the construction of sampling plans approximately satisfying these conditions for α, β=10%, 5%, 1%. First calculate S=ln (1−p2)/ln (1−p1) and, in the column corresponding to the required values of α and β, find the entry nearest to S. Read off the corresponding values of c and m, and calculate n as the nearest integer to (m/p2)−½(m−c). E.g. if α=5%, β=10%, p1=0·01, p2=0·04, then S=4·061, c=4, m=7·9936 and n=198. Thus a sample of size 198 would be drawn, the batch being accepted if no more than four defectives are found. (If a lower level of accuracy is sufficient, take S=p2/p1 and n=m/p2.)

Table 8.1(b) (p. 69) enables the operating characteristic of such a single-sampling plan to be drawn, i.e. a graph of L(p), the probability of a batch being accepted, against p. For each of 13 values of L(p), the corresponding values of p are found by dividing the numbers in the row corresponding to the appropriate acceptance number c by the sample size n. Thus for the plan constructed above, L(p)=0·900 corresponds to p=2·433/198=0·0123. (Use of this table depends on the Poisson approximation to hypergeometric or binomial distributions, and is thus only accurate for small p.)

A tolerance interval (tL, tU) is an interval within which one may assert, with a degree of confidence γ, that a proportion of at least P of a population lies; i.e. if the c.d.f. of the population distribution is F( ), then there is a probability of at least γ that F(tU)−F(tL)≥P. In Tables 8.2(a) and 8.2(b) (pp. 70–1) the population distribution is presumed to be normal; suppose a sample of size n has mean x̄ and adjusted standard deviation s (see p. 88). Denoting a typical value from Table 8.2(a) by k, the form of the appropriate tolerance interval is either (−∞, x̄+ks) or (x̄−ks, ∞). In Table 8.2(b), again with k denoting a typical value, the interval is of the form (x̄−ks, x̄+ks). (The values in Table 8.2(b) are ‘strong’ tolerance limits in the sense that, e.g. with γ=95% and P=0·98, there is a probability of at least 95% that no more than 1% of the population lies to the left of x̄−ks and also that no more than 1% lies to the right of x̄+ks.)

In Table 8.3(a) (p. 72), the tolerance interval is again taken in the form (−∞, tU) or (tL, ∞), but now tL=Xmin and tU=Xmax, where Xmin and Xmax are simply the minimum and maximum values in a sample of size n. The table gives minimum values of n satisfying the conditions for a variety of values of γ and P. No assumption is necessary concerning the population distribution. Table 8.3(b) (p. 72) gives the sample sizes necessary for tolerance intervals of the form (tL, tU), where again tL=Xmin and tU=Xmax and tL and tU are ‘strong’ tolerance limits in the above sense.

Table 8.4 (p. 73) enables the construction of various types of control charts: for sample means x̄, sample ranges R, and (unadjusted) sample standard deviations S (see p. 88). If the population distribution is normal, warning limits correspond to the 0·025 and 0·975 quantiles, and action limits to the 0·001 and 0·999 quantiles. Variability of the observations is characterised either by the population standard deviation σ or by the average range R̄ or average standard deviation S̄ in pilot samples of the same size n as those to be plotted. For x̄-charts, the central line C is placed at the population mean µ or at the overall mean of the pilot observations; warning and action limits are placed symmetrically about C, at distances obtained by multiplying the tabulated warning and action constants by whichever measure of variability is being used. The values r and s give E[R/σ] and E[S/σ] respectively, and their reciprocals are also listed. These columns facilitate changes between the three variability characteristics: e.g. if σ has been specified, then E[R]=rσ and E[S]=sσ; on the other hand, R̄/r and S̄/s provide unbiased estimators for σ. Given R̄, the R-chart has central line R̄, with warning and action limits obtained by multiplying R̄ by the tabulated lower and upper warning and action constants (L and U indicating Lower and Upper respectively).
Given σ instead, the central line should be placed at rσ, with the warning and action limits at the corresponding tabulated multiples of σ. For S-charts, replace R and r by S and s throughout the last paragraph. Further reading: Table 8.1(a), Burstein (1971, ch. 3); Table 8.2(a), Owen (1965); Table 8.2(b), Owen (1964); Table 8.3, Wilks (1942); Table 8.4, Moroney (1965, ch. 11).
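As a check on the worked example above, the sketch below (my code) evaluates the plan n=198, c=4 under the same Poisson approximation the table relies on, and also computes the sample size that Table 8.3(a) tabulates (the closed form n≥ln (1−γ)/ln P is a standard consequence of the Wilks approach which I am assuming here, not quoting from the book):

```python
import math
from scipy.stats import poisson

n, c = 198, 4
p1, p2 = 0.01, 0.04
L = lambda p: poisson.cdf(c, n * p)   # L(p): probability of accepting the batch

print(1 - L(p1))   # producer's risk at p1: approximately 0.05
print(L(p2))       # consumer's risk at p2: approximately 0.10

# One-sided non-parametric tolerance limit (cf. Table 8.3(a)): the smallest
# n with 1 - P**n >= gamma, no distributional assumption required.
gamma, P = 0.95, 0.99
n_min = math.ceil(math.log(1 - gamma) / math.log(P))   # 299
```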
Section 9: Miscellaneous
These tables need little explanation. Table 9.1 (p. 74) presents reciprocals, squares, square roots, cubes and cube roots of integers n up to 100. In Table 9.2 (p. 75) the binomial coefficients

$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$

are given for n≤30, n being read in the left-hand column and r along the horizontal (r≤15). For r>15, use

$$\binom{n}{r} = \binom{n}{n-r}.$$

The factorials n! in Table 9.3(a) (p. 76) are given exactly for n≤30 and to six significant figures for 31≤n≤150. Table 9.3(b) (pp. 77–9) gives logarithms (to base 10) of factorials for n≤850. Table 9.4(a) (p. 80) gives e^x for x=0·00(0·01)5·00 and x=5·0(0·1)10·9. Table 9.4(b) (p. 81) gives e^−x for x=0·00(0·01)3·99 with proportional parts (which are subtracted) for the third d.p. of x, and for x=4·0(0·1)10·9 to at least four significant figures. Table 9.5 (pp. 82–3) gives natural logarithms loge (x)≡ln (x). Tables 9.6(a) (pp. 84–5) and 9.6(b) (pp. 86–7) are standard tables of common logarithms and antilogarithms.
References
(An asterisk indicates a reference of above-average difficulty, intended more for the teacher than the student.)
Bradley, J.V. 1968. Distribution-free statistical tests. Englewood Cliffs, N.J.: Prentice-Hall.
Burstein, H. 1971. Attribute sampling. New York: McGraw-Hill.
Freund, J.E. 1973. Modern elementary statistics, 3rd edn. Englewood Cliffs, N.J.: Prentice-Hall.
*Koerts, J. and A.P.J.Abrahamse 1969. On the theory and application of the general linear model. Rotterdam: Rotterdam University Press.
*Lindgren, B.W. 1976. Statistical theory, 3rd edn. London: Macmillan.
Miller, I. and J.E.Freund 1965. Probability and statistics for engineers. Englewood Cliffs, N.J.: Prentice-Hall.
Moroney, M.J. 1965. Facts from figures, 2nd edn. Penguin Books Ltd.
Neave, H.R. 1966. A development of Tukey’s quick test of location. J. Am. Statist. Assoc. 61, 949–64.
Neave, H.R. 1972. Some quick tests for slippage. The Statistician 21, 197–208.
*Owen, D.B. 1964. Control of percentages in both tails of the normal distribution. Technometrics 6, 377–87.
*Owen, D.B. 1965. A special case of a bivariate non-central t-distribution. Biometrika 52, 437–46.
Siegel, S. 1956. Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.
*Wilks, S.S. 1942. Statistical prediction with special reference to the problem of tolerance limits. Ann. Math. Statist. 13, 400–9.
THE TABLES
1·1 the binomial c.d.f.
1·2(a) chart giving 95% confidence intervals for p
1·2(b) chart giving 99% confidence intervals for p
1·3(a) the Poisson c.d.f.
1·3(b) Poisson probability chart: Prob (X≥c)
2·1(a) the c.d.f. of the standard normal distribution
2·1(b) extreme values of the standard normal c.d.f.
2·2 ordinates of the standard normal density function
2·3(a) quantiles (percentage points) of the standard normal distribution
2·3(b) the inverse normal function
2·4(a) normal scores (expected values of normal order statistics)
2·4(b) sums of squares of normal scores
2·5(a) moments of the range distribution
2·5(b) quantiles of the range distribution
3·1 the Student t distribution
3·2 the χ2 (chi-squared) distribution
3·3 the F distribution
4·1 factors for Duncan’s multiple-range test
4·2 the Kruskal-Wallis test
4·3 the Friedman test critical region: S≥tabulated value
4·4 quick multi-sample tests
5·1 Wilcoxon signed-rank test critical region: W≤tabulated value
5·2(a) Kolmogorov-Smirnov one-sample test critical region: Dn≥tabulated value
5·2(b) Kolmogorov-Smirnov asymptotic critical values
5·2(c) Kolmogorov-Smirnov two-sample test
5·3 Mann-Whitney/Wilcoxon/rank-sum two-sample test
5·4 improved Tukey quick test
5·5 Wald-Wolfowitz number-of-runs test
6·1(a) chart giving 95% confidence intervals for ρ
6·1(b) chart giving 99% confidence intervals for ρ
6·2 critical values rn, α of the linear correlation coefficient r
6·3(a) the Fisher z-transformation
6·3(b) the inverse z-transformation r=r(z)=tanh z
6·4 critical values of the Spearman rank correlation coefficient
6·5 critical values of the Kendall rank correlation coefficient
6·6 critical values of the multiple correlation coefficient
6·7 critical values of the von Neumann ratio critical region: V≤tabulated value
6·8 bounds for critical values of the Durbin-Watson statistic critical region: d≤tabulated value
7·1 random digits
7·2 random numbers from the standard normal distribution
7·3 random numbers from the exponential distribution with unit mean
8·1(a) the construction of single-sampling plans
8·1(b) the operating characteristic of single-sampling plans
8·2(a) one-sided tolerance factors for normal distributions for tolerance intervals (−∞, x̄+ks) or (x̄−ks, ∞)
8·2(b) two-sided tolerance factors for normal distributions for tolerance intervals (x̄−ks, x̄+ks)
8·3(a) sample sizes for one-sided non-parametric tolerance limits for tolerance intervals (−∞, Xmax) or (Xmin, ∞)
8·3(b) sample sizes for two-sided non-parametric tolerance limits for tolerance intervals (Xmin, Xmax)
8·4 control chart constants
9·1 reciprocals, squares, square roots, cubes, cube roots
9·2 binomial coefficients
9·3(a) factorials n!=n(n−1)…2·1
9·3(b) logarithms of factorials
9·4(a) the exponential function: e^x
9·4(b) the negative exponential function: e^−x
9·5 natural logarithms: loge (x)≡ln (x)
9·6(a) common logarithms: log10 (x)
9·6(b) antilogarithms: 10^x
Useful constants
Sample standard deviations
If X1, X2,…, Xn is a random sample from a distribution having variance σ2 then

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

is an unbiased estimator for σ2, X̄ being the sample mean. s is called the adjusted standard deviation of the sample. The square root S of

$$S^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$$

is called the unadjusted standard deviation of the sample; its only specific use in these tables is in the construction of some control charts (see Table 8.4).
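In code, the distinction between the two is just the divisor. A minimal sketch, assuming numpy:

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 5.0, 7.0])
s = x.std(ddof=1)   # adjusted: divisor n - 1, so s**2 is unbiased for sigma**2
S = x.std(ddof=0)   # unadjusted: divisor n; used for some control charts (Table 8.4)
```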
Asymptotic (large-sample) distributions
A number of statistics included in these tables are asymptotically normally distributed; their means and variances are listed below. In the case of a single sample, n denotes the sample size; in the case of two samples, n1 and n2 denote the sample sizes, and N=n1+n2.