Probability and Statistics in Engineering
Fourth Edition
William W. Hines
Professor Emeritus, School of Industrial and Systems Engineering, Georgia Institute of Technology

Douglas C. Montgomery
Professor of Engineering and Statistics, Department of Industrial Engineering, Arizona State University

David M. Goldsman
Professor, School of Industrial and Systems Engineering, Georgia Institute of Technology

Connie M. Borror
Senior Lecturer, Department of Industrial Engineering, Arizona State University
WILEY
John Wiley & Sons, Inc.
Acquisitions Editor: Wayne Anderson
Associate Editor: Jenny Welter
Marketing Manager: Katherine Hepburn
Senior Production Editor: Valerie A. Vargas
Senior Designer: Dawn Stanley
Cover Image: Alfredo Pasieka/Photo Researchers
Production Management Services: Argosy Publishing
This book was set in 10/12 Times Roman by Argosy Publishing and printed and bound by Hamilton Printing. The cover was printed by Phoenix Color. This book is printed on acid-free paper.
To order books or for customer service please call 1(800)-CALL-WILEY (225-5945).

Library of Congress Cataloging in Publication Data:

Probability and statistics in engineering / William W. Hines ... [et al.]. - 4th ed.
p. cm.
Includes bibliographical references.
1. Engineering - Statistical methods. I. Hines, William W.
TA340 2002
519.2 - dc21    2002026703

ISBN 0-471-24087-7 (cloth: acid-free paper)

Printed in the United States of America
10 9 8 7 6 5 4 3 2
PREFACE to the 4th Edition
This book is written for a first course in applied probability and statistics for undergraduate students in engineering, physical sciences, and management science curricula. We have found that the text can be used effectively as a two-semester sophomore- or junior-level course sequence, as well as a one-semester refresher course in probability and statistics for first-year graduate students. The text has undergone a major overhaul for the fourth edition, especially with regard to many of the statistics chapters. The idea has been to make the book more accessible to a wide audience by including more motivational examples, real-world applications, and useful computer exercises. With the aim of making the course material easier to learn and easier to teach, we have also provided a convenient set of course notes, available on the Web site www.wiley.com/college/hines. For instructors adopting the text, the complete solutions are also available on a password-protected portion of this Web site.

Structurally speaking, we start the book off with probability theory (Chapter 1) and progress through random variables (Chapter 2), functions of random variables (Chapter 3), joint random variables (Chapter 4), discrete and continuous distributions (Chapters 5 and 6), and the normal distribution (Chapter 7). Then we introduce statistics and data description techniques (Chapter 8). The statistics chapters follow the same rough outline as in the previous edition, namely, sampling distributions (Chapter 9), parameter estimation (Chapter 10), hypothesis testing (Chapter 11), single- and multifactor design of experiments (Chapters 12 and 13), and simple and multiple regression (Chapters 14 and 15).
Subsequent special-topics chapters include nonparametric statistics (Chapter 16), quality control and reliability engineering (Chapter 17), and stochastic processes and queueing theory (Chapter 18). Finally, there is an entirely new chapter on statistical techniques for computer simulation (Chapter 19), perhaps the first of its kind in this type of statistics text.

The chapters that have seen the most substantial evolution are Chapters 8-14. The discussion in Chapter 8 on descriptive data analysis is greatly enhanced over that of the previous edition. We also expanded the discussion on different types of interval estimation in Chapter 10. In addition, an emphasis has been placed on real-life computer data analysis examples. Throughout the book, we incorporated other structural changes. In all chapters, we included new examples and exercises, including numerous computer-based exercises.

A few words on Chapters 18 and 19. Stochastic processes and queueing theory arise naturally out of probability, and we feel that Chapter 18 serves as a good introduction to the subject, normally taught in operations research, management science, and certain engineering disciplines. Queueing theory has garnered a great deal of use in such diverse fields as telecommunications, manufacturing, and production planning. Computer simulation, the topic of Chapter 19, is perhaps the most widely used tool in operations research and management science, as well as in a number of physical sciences. Simulation marries all the tools of probability and statistics and is used in everything from financial analysis to factory control and planning. Our text provides what amounts to a simulation minicourse, covering the areas of Monte Carlo experimentation, random number and variate generation, and simulation output data analysis.
We are grateful to the following individuals for their help during the process of completing the current revision of the text. Christos Alexopoulos (Georgia Institute of Technology), Michael Caramanis (Boston University), David R. Clark (Kettering University), J. N. Hool (Auburn University), John S. Ramberg (University of Arizona), and Edward J. Williams (University of Michigan-Dearborn) served as reviewers and provided a great deal of valuable feedback. Beatriz Valdes (Argosy Publishing) did a wonderful job supervising the typesetting and page proofing of the text, and Jennifer Welter at Wiley provided great leadership at every turn. Everyone was certainly a pleasure to work with. Of course, we thank our families for their infinite patience and support throughout the endeavor.
Hines, Montgomery, Goldsman, and Borror
Contents
1. An Introduction to Probability 1
1-1 Introduction 1
1-2 A Review of Sets 2
1-3 Experiments and Sample Spaces 5
1-4 Events 8
1-5 Probability Definition and Assignment 8
1-6 Finite Sample Spaces and Enumeration 14
1-6.1 Tree Diagram 14
1-6.2 Multiplication Principle 14
1-6.3 Permutations 15
1-6.4 Combinations 16
1-6.5 Permutations of Like Objects 19
1-7 Conditional Probability 20
1-8 Partitions, Total Probability, and Bayes' Theorem 25
1-9 Summary 28
1-10 Exercises 28

2. One-Dimensional Random Variables 33
2-1 Introduction 33
2-2 The Distribution Function 36
2-3 Discrete Random Variables 38
2-4 Continuous Random Variables 41
2-5 Some Characteristics of Distributions 44
2-6 Chebyshev's Inequality 48
2-7 Summary 49
2-8 Exercises 50

3. Functions of One Random Variable and Expectation 52
3-1 Introduction 52
3-2 Equivalent Events 52
3-3 Functions of a Discrete Random Variable 54
3-4 Continuous Functions of a Continuous Random Variable 55
3-5 Expectation 58
3-6 Approximations to E[H(X)] and V[H(X)] 62
3-7 The Moment-Generating Function 65
3-8 Summary 67
3-9 Exercises 68

4. Joint Probability Distributions 71
4-1 Introduction 71
4-2 Joint Distribution for Two-Dimensional Random Variables 72
4-3 Marginal Distributions 75
4-4 Conditional Distributions 79
4-5 Conditional Expectation 82
4-6 Regression of the Mean 85
4-7 Independence of Random Variables 86
4-8 Covariance and Correlation 87
4-9 The Distribution Function for Two-Dimensional Random Variables 91
4-10 Functions of Two Random Variables 92
4-11 Joint Distributions of Dimension n > 2 94
4-12 Linear Combinations 96
4-13 Moment-Generating Functions and Linear Combinations 99
4-14 The Law of Large Numbers 99
4-15 Summary 101
4-16 Exercises 101

5. Some Important Discrete Distributions 106
5-1 Introduction 106
5-2 Bernoulli Trials and the Bernoulli Distribution 106
5-3 The Binomial Distribution 108
5-3.1 Mean and Variance of the Binomial Distribution 109
5-3.2 The Cumulative Binomial Distribution 110
5-3.3 An Application of the Binomial Distribution 111
5-4 The Geometric Distribution 112
5-4.1 Mean and Variance of the Geometric Distribution 113
5-5 The Pascal Distribution 115
5-5.1 Mean and Variance of the Pascal Distribution 115
5-6 The Multinomial Distribution 116
5-7 The Hypergeometric Distribution 117
5-7.1 Mean and Variance of the Hypergeometric Distribution 118
5-8 The Poisson Distribution 118
5-8.1 Development from a Poisson Process 118
5-8.2 Development of the Poisson Distribution from the Binomial 120
5-8.3 Mean and Variance of the Poisson Distribution 120
5-9 Some Approximations 122
5-10 Generation of Realizations 123
5-11 Summary 123
5-12 Exercises 123

6. Some Important Continuous Distributions 128
6-1 Introduction 128
6-2 The Uniform Distribution 128
6-2.1 Mean and Variance of the Uniform Distribution 129
6-3 The Exponential Distribution 130
6-3.1 The Relationship of the Exponential Distribution to the Poisson Distribution 131
6-3.2 Mean and Variance of the Exponential Distribution 131
6-3.3 Memoryless Property of the Exponential Distribution 133
6-4 The Gamma Distribution 134
6-4.1 The Gamma Function 134
6-4.2 Definition of the Gamma Distribution 134
6-4.3 Relationship Between the Gamma Distribution and the Exponential Distribution 135
6-4.4 Mean and Variance of the Gamma Distribution 135
6-5 The Weibull Distribution 137
6-5.1 Mean and Variance of the Weibull Distribution 137
6-6 Generation of Realizations 138
6-7 Summary 139
6-8 Exercises 139

7. The Normal Distribution 143
7-1 Introduction 143
7-2 The Normal Distribution 143
7-2.1 Properties of the Normal Distribution 143
7-2.2 Mean and Variance of the Normal Distribution 144
7-2.3 The Normal Cumulative Distribution 145
7-2.4 The Standard Normal Distribution 145
7-2.5 Problem-Solving Procedure 146
7-3 The Reproductive Property of the Normal Distribution 150
7-4 The Central Limit Theorem 152
7-5 The Normal Approximation to the Binomial Distribution 155
7-6 The Lognormal Distribution 157
7-6.1 Density Function 158
7-6.2 Mean and Variance of the Lognormal Distribution 158
7-6.3 Properties of the Lognormal Distribution 159
7-7 The Bivariate Normal Distribution 160
7-8 Generation of Normal Realizations 164
7-9 Summary 165
7-10 Exercises 165

8. Introduction to Statistics and Data Description 169
8-1 The Field of Statistics 169
8-2 Data 173
8-3 Graphical Presentation of Data 173
8-3.1 Numerical Data: Dot Plots and Scatter Plots 173
8-3.2 Numerical Data: The Frequency Distribution and Histogram 175
8-3.3 The Stem-and-Leaf Plot 178
8-3.4 The Box Plot 179
8-3.5 The Pareto Chart 181
8-3.6 Time Plots 183
8-4 Numerical Description of Data 183
8-4.1 Measures of Central Tendency 183
8-4.2 Measures of Dispersion 186
8-4.3 Other Measures for One Variable 189
8-4.4 Measuring Association 190
8-4.5 Grouped Data 191
8-5 Summary 192
8-6 Exercises 193

9. Random Samples and Sampling Distributions 198
9-1 Random Samples 198
9-1.1 Simple Random Sampling from a Finite Universe 199
9-1.2 Stratified Random Sampling of a Finite Universe 200
9-2 Statistics and Sampling Distributions 201
9-2.1 Sampling Distributions 202
9-2.2 Finite Populations and Enumerative Studies 204
9-3 The Chi-Square Distribution 205
9-4 The t Distribution 208
9-5 The F Distribution 211
9-6 Summary 214
9-7 Exercises 214

10. Parameter Estimation 216
10-1 Point Estimation 216
10-1.1 Properties of Estimators 217
10-1.2 The Method of Maximum Likelihood 221
10-1.3 The Method of Moments 224
10-1.4 Bayesian Inference 226
10-1.5 Applications to Estimation 227
10-1.6 Precision of Estimation: The Standard Error 230
10-2 Single-Sample Confidence Interval Estimation 232
10-2.1 Confidence Interval on the Mean of a Normal Distribution, Variance Known 233
10-2.2 Confidence Interval on the Mean of a Normal Distribution, Variance Unknown 236
10-2.3 Confidence Interval on the Variance of a Normal Distribution 238
10-2.4 Confidence Interval on a Proportion 239
10-3 Two-Sample Confidence Interval Estimation 242
10-3.1 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Known 242
10-3.2 Confidence Interval on the Difference between Means of Two Normal Distributions, Variances Unknown 244
10-3.3 Confidence Interval on the Difference of Means for Paired Observations 247
10-3.4 Confidence Interval on the Ratio of Variances of Two Normal Distributions 248
10-3.5 Confidence Interval on the Difference between Two Proportions 250
10-4 Approximate Confidence Intervals in Maximum Likelihood Estimation 251
10-5 Simultaneous Confidence Intervals 252
10-6 Bayesian Confidence Intervals 252
10-7 Bootstrap Confidence Intervals 253
10-8 Summary 255
10-9 Exercises 255

11. Tests of Hypotheses 266
11-1 Introduction 266
11-1.1 Statistical Hypotheses 266
11-1.2 Type I and Type II Errors 267
11-1.3 One-Sided and Two-Sided Hypotheses 269
11-2 Tests of Hypotheses on a Single Sample 271
11-2.1 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Known 271
11-2.2 Tests of Hypotheses on the Mean of a Normal Distribution, Variance Unknown 278
11-2.3 Tests of Hypotheses on the Variance of a Normal Distribution 281
11-2.4 Tests of Hypotheses on a Proportion 283
11-3 Tests of Hypotheses on Two Samples 286
11-3.1 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Known 286
11-3.2 Tests of Hypotheses on the Means of Two Normal Distributions, Variances Unknown 288
11-3.3 The Paired t-Test 292
11-3.4 Tests for the Equality of Two Variances 294
11-3.5 Tests of Hypotheses on Two Proportions 296
11-4 Testing for Goodness of Fit 300
11-5 Contingency Table Tests 307
11-6 Sample Computer Output 309
11-7 Summary 312
11-8 Exercises 312

12. Design and Analysis of Single-Factor Experiments: The Analysis of Variance 321
12-1 The Completely Randomized Single-Factor Experiment 321
12-1.1 An Example 321
12-1.2 The Analysis of Variance 323
12-1.3 Estimation of the Model Parameters 327
12-1.4 Residual Analysis and Model Checking 330
12-1.5 An Unbalanced Design 331
12-2 Tests on Individual Treatment Means 332
12-2.1 Orthogonal Contrasts 332
12-2.2 Tukey's Test 335
12-3 The Random-Effects Model 337
12-4 The Randomized Block Design 341
12-4.1 Design and Statistical Analysis 341
12-4.2 Tests on Individual Treatment Means 344
12-4.3 Residual Analysis and Model Checking 345
12-5 Determining Sample Size in Single-Factor Experiments 347
12-6 Sample Computer Output 348
12-7 Summary 350
12-8 Exercises 350

13. Design of Experiments with Several Factors 353
13-1 Examples of Experimental Design Applications 353
13-2 Factorial Experiments 355
13-3 Two-Factor Factorial Experiments 359
13-3.1 Statistical Analysis of the Fixed-Effects Model 359
13-3.2 Model Adequacy Checking 364
13-3.3 One Observation per Cell 364
13-3.4 The Random-Effects Model 366
13-3.5 The Mixed Model 367
13-4 General Factorial Experiments 369
13-5 The 2^k Factorial Design 373
13-5.1 The 2^2 Design 373
13-5.2 The 2^k Design for k >= 3 Factors 379
13-5.3 A Single Replicate of the 2^k Design 386
13-6 Confounding in the 2^k Design 390
13-7 Fractional Replication of the 2^k Design 394
13-7.1 The One-Half Fraction of the 2^k Design 394
13-7.2 Smaller Fractions: The 2^(k-p) Fractional Factorial 400
13-8 Sample Computer Output 403
13-9 Summary 404
13-10 Exercises 404

14. Simple Linear Regression and Correlation 409
14-1 Simple Linear Regression 409
14-2 Hypothesis Testing in Simple Linear Regression 414
14-3 Interval Estimation in Simple Linear Regression 417
14-4 Prediction of New Observations 420
14-5 Measuring the Adequacy of the Regression Model 421
14-5.1 Residual Analysis 421
14-5.2 The Lack-of-Fit Test 422
14-5.3 The Coefficient of Determination 426
14-6 Transformations to a Straight Line 426
14-7 Correlation 427
14-8 Sample Computer Output 431
14-9 Summary 431
14-10 Exercises 432

15. Multiple Regression 437
15-1 Multiple Regression Models 437
15-2 Estimation of the Parameters 438
15-3 Confidence Intervals in Multiple Linear Regression 444
15-4 Prediction of New Observations 446
15-5 Hypothesis Testing in Multiple Linear Regression 447
15-5.1 Test for Significance of Regression 447
15-5.2 Tests on Individual Regression Coefficients 450
15-6 Measures of Model Adequacy 452
15-6.1 The Coefficient of Multiple Determination 453
15-6.2 Residual Analysis 454
15-7 Polynomial Regression 456
15-8 Indicator Variables 458
15-9 The Correlation Matrix 461
15-10 Problems in Multiple Regression 464
15-10.1 Multicollinearity 464
15-10.2 Influential Observations in Regression 470
15-10.3 Autocorrelation 472
15-11 Selection of Variables in Multiple Regression 474
15-11.1 The Model-Building Problem 474
15-11.2 Computational Procedures for Variable Selection 474
15-12 Summary 486
15-13 Exercises 486

16. Nonparametric Statistics 491
16-1 Introduction 491
16-2 The Sign Test 491
16-2.1 A Description of the Sign Test 491
16-2.2 The Sign Test for Paired Samples 494
16-2.3 Type II Error (beta) for the Sign Test 494
16-2.4 Comparison of the Sign Test and the t-Test 496
16-3 The Wilcoxon Signed Rank Test 496
16-3.1 A Description of the Test 496
16-3.2 A Large-Sample Approximation 497
16-3.3 Paired Observations 498
16-3.4 Comparison with the t-Test 499
16-4 The Wilcoxon Rank Sum Test 499
16-4.1 A Description of the Test 499
16-4.2 A Large-Sample Approximation 501
16-4.3 Comparison with the t-Test 501
16-5 Nonparametric Methods in the Analysis of Variance 501
16-5.1 The Kruskal-Wallis Test 501
16-5.2 The Rank Transformation 504
16-6 Summary 504
16-7 Exercises 505

17. Statistical Quality Control and Reliability Engineering 507
17-1 Quality Improvement and Statistics 507
17-2 Statistical Quality Control 508
17-3 Statistical Process Control 508
17-3.1 Introduction to Control Charts 509
17-3.2 Control Charts for Measurements 510
17-3.3 Control Charts for Individual Measurements 518
17-3.4 Control Charts for Attributes 520
17-3.5 CUSUM and EWMA Control Charts 525
17-3.6 Average Run Length 532
17-3.7 Other SPC Problem-Solving Tools 535
17-4 Reliability Engineering 537
17-4.1 Basic Reliability Definitions 538
17-4.2 The Exponential Time-to-Failure Model 541
17-4.3 Simple Serial Systems 542
17-4.4 Simple Active Redundancy 544
17-4.5 Standby Redundancy 545
17-4.6 Life Testing 547
17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution 548
17-4.8 Estimation with the Exponential Time-to-Failure Distribution 548
17-4.9 Demonstration and Acceptance Testing 551
17-5 Summary 551
17-6 Exercises 551

18. Stochastic Processes and Queueing Theory

19. Statistical Techniques for Computer Simulation
19-3.3 Steady-State Simulation Analysis 590
19-4 Comparison of Systems 591
19-4.1 Classical Confidence Intervals 592
19-4.2 Common Random Numbers 593
19-4.3 Antithetic Random Numbers 593
19-4.4 Selecting the Best System 593
19-5 Summary 593
19-6 Exercises 594

Appendix 597
Table I Cumulative Poisson Distribution 598
Table II Cumulative Standard Normal Distribution 601
Table III Percentage Points of the Chi-Square Distribution 603
Table IV Percentage Points of the t Distribution 604
Table V Percentage Points of the F Distribution 605
Chart VI Operating Characteristic Curves 610
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance 619
Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance 623
Table IX Critical Values for the Wilcoxon Two-Sample Test 627
Table X Critical Values for the Sign Test 629
Table XI Critical Values for the Wilcoxon Signed Rank Test 630
Table XII Percentage Points of the Studentized Range Statistic 631
Table XIII Factors for Quality-Control Charts 633
Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals 634
Table XV Random Numbers 636

References 637
Answers to Selected Exercises 639
Index 649
Chapter 1

An Introduction to Probability

1-1 INTRODUCTION

Since professionals working in engineering and applied science are often engaged in both the analysis and the design of systems where system component characteristics are nondeterministic, the understanding and utilization of probability is essential to the description,
design, and analysis of such systems. Examples reflecting probabilistic behavior are abundant, and in fact, true deterministic behavior is rare. To illustrate, consider the description of a variety of product quality or performance measurements: the operational lifespan of mechanical and/or electronic systems; the pattern of equipment failures; the occurrence of natural phenomena such as sunspots or tornadoes; particle counts from a radioactive source; travel times in delivery operations; vehicle accident counts during a given day on a section of freeway; or customer waiting times in line at a branch bank.

The term probability has come to be widely used in everyday life to quantify the degree of belief in an event of interest. There are abundant examples, such as the statements that "there is a 0.2 probability of rain showers" and "the probability that brand X personal computer will survive 10,000 hours of operation without repair is 0.75." In this chapter we introduce the basic structure, elementary concepts, and methods to support precise and unambiguous statements like those above.

The formal study of probability theory apparently originated in the seventeenth and eighteenth centuries in France and was motivated by the study of games of chance. With little formal mathematical understructure, people viewed the field with some skepticism; however, this view began to change in the nineteenth century, when a probabilistic model
(description) was developed for the behavior of molecules in a liquid. This became known as Brownian motion, since it was Robert Brown, an English botanist, who first observed the phenomenon in 1827. In 1905, Albert Einstein explained Brownian motion under the hypothesis that particles are subject to the continual bombardment of molecules of the surrounding medium. These results greatly stimulated interest in probability, as did the emergence of the telephone system in the latter part of the nineteenth and early twentieth centuries. Since a physical connecting system was necessary to allow for the interconnection of individual telephones, with call lengths and interdemand intervals displaying large variation, a strong motivation emerged for developing probabilistic models to describe this system's behavior.
Although applications like these were rapidly expanding in the early twentieth century, it is generally thought that it was not until the 1930s that a rigorous mathematical structure for probability emerged. This chapter presents basic concepts leading to and including a definition of probability, as well as some results and methods useful for problem solution. The emphasis throughout Chapters 1-7 is to encourage an understanding and appreciation of the subject, with applications to a variety of problems in engineering and science. The reader should recognize that there is a large, rich field of mathematics related to probability that is beyond the scope of this book.
Indeed, our objectives in presenting the basic probability topics considered in the current chapter are threefold. First, these concepts enhance and enrich our basic understanding of the world in which we live. Second, many of the examples and exercises deal with the use of probability concepts to model the behavior of real-world systems. Finally, the probability topics developed in Chapters 1-7 provide a foundation for the statistical methods presented in Chapters 8-16 and beyond. These statistical methods deal with the analysis and interpretation of data, with drawing inference about populations based on a sample of units selected from them, and with the design and analysis of experiments and experimental data. A sound understanding of such methods will greatly enhance the professional capability of individuals working in the data-intensive areas commonly encountered in this twenty-first century.
1-2 A REVIEW OF SETS

To present the basic concepts of probability theory, we will use some ideas from the theory of sets. A set is an aggregate or collection of objects. Sets are usually designated by capital letters, A, B, C, and so on. The members of the set A are called the elements of A. In general, when x is an element of A we write x ∈ A, and if x is not an element of A we write x ∉ A. In specifying membership we may resort either to enumeration or to a defining property. These ideas are illustrated in the following examples. Braces are used to denote a set, and the colon within the braces is shorthand for "such that."
The set whose elements are the integers 5, 6, 7, and 8 is a finite set with four elements. We could denote this by enumeration as

A = {5, 6, 7, 8}.

Note that 5 ∈ A and 9 ∉ A are both true.
If we write V = {a, e, i, o, u} we have defined the set of vowels in the English alphabet. We may use a defining property and write this as

V = {*: * is a vowel in the English alphabet}.
If we say that A is the set of all real numbers between 0 and 1 inclusive, we might also denote A by a defining property as

A = {x: x ∈ R, 0 ≤ x ≤ 1},

where R is the set of all real numbers.
The set B = {-3, +3} is the same set as

B = {x: x ∈ R, x² = 9},

where R is again the set of real numbers.
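As an aside, the two ways of describing a set illustrated above, enumeration and a defining property, correspond directly to a set literal and a set comprehension in a programming language. The following Python sketch is purely illustrative (the variable names and the search range are ours, not the text's):

```python
# A set given by enumeration, and the same set given by a defining property.
A_enumerated = {5, 6, 7, 8}
A_by_property = {x for x in range(20) if 5 <= x <= 8}
print(A_enumerated == A_by_property)          # True: membership alone defines a set

# B = {x in R: x^2 = 9}, restricted here to a finite range of integers to search.
B = {x for x in range(-10, 11) if x * x == 9}
print(sorted(B))                              # [-3, 3]

print(5 in A_enumerated, 9 in A_enumerated)   # True False
```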
In the real plane we can consider points (x, y) that lie on a given line A. Thus, the condition for inclusion in A requires (x, y) to satisfy ax + by = c, so that

A = {(x, y): x ∈ R, y ∈ R, ax + by = c},

where R is the set of real numbers.
The universal set is the set of all objects under consideration, and it is generally denoted by U. Another special set is the null set or empty set, usually denoted by ∅. To illustrate this concept, consider a set

A = {x: x ∈ R, x² = -1}.

The universal set here is R, the set of real numbers. Obviously, set A is empty, since there are no real numbers having the defining property x² = -1. We should point out that the set {∅} ≠ ∅.

If two sets are considered, say A and B, we call A a subset of B, denoted A ⊂ B, if each element in A is also an element of B. The sets A and B are said to be equal (A = B) if and only if A ⊂ B and B ⊂ A. As direct consequences of this we may show the following:

1. For any set A, ∅ ⊂ A.
2. For a given U, any A considered in the context of U satisfies the relation A ⊂ U.
3. For a given set A, A ⊂ A (a reflexive relation).
4. If A ⊂ B and B ⊂ C, then A ⊂ C (a transitive relation).

An interesting consequence of set equality is that the order of element listing is immaterial. To illustrate, let A = {a, b, c} and B = {c, a, b}. Obviously A = B by our definition. Furthermore, when defining properties are used, the sets may be equal although the defining properties are outwardly different. As an example of the second consequence, we let A = {x: x ∈ R, where x is an even, prime number} and B = {x: x + 3 = 5}. Since the integer 2 is the only even prime, A = B.

We now consider some operations on sets. Let A and B be any subsets of the universal set U. Then the following hold:
1. The complement of A (with respect to U) is the set made up of the elements of U that do not belong to A. We denote this complementary set as Ā. That is,

Ā = {x: x ∈ U, x ∉ A}.
2. The intersection of A and B is the set of elements that belong to both A and B. We denote the intersection as A ∩ B. In other words,

A ∩ B = {x: x ∈ A and x ∈ B}.

We should also note that A ∩ B is a set, and we could give this set some designator, such as C.

3. The union of A and B is the set of elements that belong to at least one of the sets A and B. If D represents the union, then

D = A ∪ B = {x: x ∈ A or x ∈ B (or both)}.
These operations are illustrated in the following examples.
Let U be the set of letters in the alphabet, that is, U = {*: * is a letter of the English alphabet}; and let A = {*: * is a vowel} and B = {*: * is one of the letters a, b, c}. As a consequence of the definitions,

Ā = the set of consonants,
B̄ = {d, e, f, g, ..., x, y, z},
A ∪ B = {a, b, c, e, i, o, u},
A ∩ B = {a}.
If the universal set is defined as U = {1, 2, 3, 4, 5, 6, 7}, and three subsets, A = {1, 2, 3}, B = {2, 4, 6}, C = {1, 3, 5, 7}, are defined, then we see immediately from the definitions that

Ā = {4, 5, 6, 7},    B̄ = {1, 3, 5, 7} = C,
A ∪ B = {1, 2, 3, 4, 6},    A ∪ C = {1, 2, 3, 5, 7},
A ∩ B = {2},    A ∩ C = {1, 3},
B ∪ C = U,    B ∩ C = ∅.
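The computations in this example can be checked with a short sketch, here in Python (illustrative only, not part of the text), where the set difference `U - A` plays the role of the complement with respect to U:

```python
# The universal set and three subsets from the example above.
U = {1, 2, 3, 4, 5, 6, 7}
A = {1, 2, 3}
B = {2, 4, 6}
C = {1, 3, 5, 7}

A_bar = U - A                        # complement of A with respect to U
B_bar = U - B

print(sorted(A_bar))                 # [4, 5, 6, 7]
print(B_bar == C)                    # True
print(sorted(A | B))                 # union: [1, 2, 3, 4, 6]
print(sorted(A | C))                 # union: [1, 2, 3, 5, 7]
print(A & B, sorted(A & C))          # intersections: {2} [1, 3]
print((B | C) == U, (B & C) == set())  # True True
```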
The Venn diagram can be used to illustrate certain set operations. A rectangle is drawn to represent the universal set U. A subset A of U is represented by the region within a circle drawn inside the rectangle. Then Ā will be represented by the area of the rectangle outside of the circle, as illustrated in Fig. 1-1. Using this notation, the intersection and union are illustrated in Fig. 1-2.

The operations of intersection and union may be extended in a straightforward manner to accommodate any finite number of sets. In the case of three sets, say A, B, and C, A ∪ B ∪ C has the property that A ∪ (B ∪ C) = (A ∪ B) ∪ C, which obviously holds since both sides have identical members. Similarly, we see that A ∩ B ∩ C = (A ∩ B) ∩ C = A ∩ (B ∩ C). Some important laws obeyed by sets relative to the operations previously defined are listed below.

Identity laws:       A ∪ ∅ = A,   A ∪ U = U,   A ∩ U = A,   A ∩ ∅ = ∅.
De Morgan's laws:    the complement of A ∪ B is Ā ∩ B̄;  the complement of A ∩ B is Ā ∪ B̄.
Associative laws:    A ∪ (B ∪ C) = (A ∪ B) ∪ C,   A ∩ (B ∩ C) = (A ∩ B) ∩ C.
Distributive laws:   A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),   A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

The reader is asked in Exercise 1-2 to illustrate some of these statements with Venn diagrams. Formal proofs are usually more lengthy.
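As a quick empirical check (not a proof), these laws can be tested on randomly generated subsets of a small universal set. The Python sketch and its helper names below are ours, purely for illustration:

```python
import random

random.seed(1)
U = set(range(10))                       # a small universal set, {0, 1, ..., 9}

def complement(s):
    """Complement of s with respect to U."""
    return U - s

def random_subset():
    """Each element of U included independently with probability 1/2."""
    return {x for x in U if random.random() < 0.5}

for _ in range(1000):
    A, B, C = random_subset(), random_subset(), random_subset()
    # De Morgan's laws
    assert complement(A | B) == complement(A) & complement(B)
    assert complement(A & B) == complement(A) | complement(B)
    # Associative laws
    assert A | (B | C) == (A | B) | C
    assert A & (B & C) == (A & B) & C
    # Distributive laws
    assert A | (B & C) == (A | B) & (A | C)
    assert A & (B | C) == (A & B) | (A & C)

print("all identities held on 1000 random triples of subsets")
```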
Figure 1-1 A set in a Venn diagram.
Figure 1-2 The intersection and union of two sets in a Venn diagram. (a) The intersection shaded. (b) The union shaded.
In the case of more than three sets, we use a subscript to generalize. Thus, if n is a positive integer, and E₁, E₂, ..., Eₙ are given sets, then E₁ ∩ E₂ ∩ ... ∩ Eₙ is the set of elements belonging to all of the sets, and E₁ ∪ E₂ ∪ ... ∪ Eₙ is the set of elements that belong to at least one of the given sets.

If A and B are sets, then the set of all ordered pairs (a, b) such that a ∈ A and b ∈ B is called the Cartesian product set of A and B. The usual notation is A × B. We thus have

A × B = {(a, b): a ∈ A and b ∈ B}.
Let r be a positive integer greater than 1, and let A₁, ..., Aᵣ represent sets. Then the Cartesian product set is given by

A₁ × A₂ × ... × Aᵣ = {(a₁, a₂, ..., aᵣ): aⱼ ∈ Aⱼ for j = 1, 2, ..., r}.
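These product sets can be enumerated directly with Python's `itertools.product`; the small sets A and B below are made-up examples of ours, not from the text:

```python
from itertools import product

A = {1, 2}
B = {'a', 'b', 'c'}

# A x B = {(a, b): a in A and b in B}
AxB = set(product(A, B))
print(len(AxB))                   # 6, i.e., n(A) * n(B)
print((1, 'a') in AxB)            # True

# The r-fold product A1 x A2 x ... x Ar, here with A1 = A2 = A3 = {0, 1}
triples = set(product({0, 1}, repeat=3))
print(len(triples))               # 8 = 2**3 ordered triples
```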
Frequently, the number of elements in a set is of some importance, and we denote by n(A) the number of elements in set A. If the number is finite, we say we have a finite set. Should the set be infinite, such that the elements can be put into a one-to-one correspondence with the natural numbers, then the set is called a denumerably infinite set. A nondenumerable set contains an infinite number of elements that cannot be enumerated. For example, if a < b, then the set A = {x ∈ R: a ≤ x ≤ b} is a nondenumerable set.

A set of particular interest is called the power set. The elements of this set are the subsets of a set A, and a common notation is {0, 1}^A. For example, if A = {1, 2, 3}, then

{0, 1}^A = {∅, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}.
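For a small finite set, the power set can simply be enumerated. The following Python sketch (illustrative only; the function name power_set is ours, not the book's) generates all 2ⁿ subsets of an n-element set:

```python
from itertools import chain, combinations

def power_set(a):
    """Return every subset of the finite set a, from the empty set to a itself."""
    items = sorted(a)
    return [set(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

subsets = power_set({1, 2, 3})
print(len(subsets))  # 8, since a 3-element set has 2^3 subsets
```

Note that both the empty set and the set itself appear in the enumeration, matching the listing above.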
1-3 EXPERIMENTS AND SAMPLE SPACES

Probability theory has been motivated by real-life situations in which an experiment is performed and the experimenter observes an outcome. Furthermore, the outcome may not be predicted with certainty. Such experiments are called random experiments. The concept of a random experiment is considered mathematically to be a primitive notion and is thus not otherwise defined; however, we can note that random experiments have some common characteristics. First, while we cannot predict a particular outcome with certainty, we can describe the set of possible outcomes. Second, from a conceptual point of view, the experiment is one that could be repeated under conditions that remain unchanged, with the outcomes appearing in a haphazard manner; however, as the number of repetitions increases, certain patterns in the frequency of outcome occurrence emerge.

We will often consider idealized experiments. For example, we may rule out the outcome of a coin toss when the coin lands on edge. This is more for convenience than out of necessity. The set of possible outcomes is called the sample space, and these outcomes define the particular idealized experiment. The symbols ℰ and 𝒮 are used to represent the random experiment and the associated sample space. Following the terminology employed in the review of sets and set operations, we will classify sample spaces (and thus random experiments). A discrete sample space is one in which there is a finite number of outcomes or a countably (denumerably) infinite number of outcomes. Likewise, a continuous sample space has nondenumerable (uncountable) outcomes. These might be real numbers on an interval or real pairs contained in the product of intervals, where measurements are made on two variables following an experiment. To illustrate random experiments with an associated sample space, we consider several examples.
ℰ₁: Toss a true coin and observe the "up" face.
𝒮₁: {H, T}. Note that this set is finite.

ℰ₂: Toss a true coin three times and observe the sequence of heads and tails.
𝒮₂: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

ℰ₃: Toss a true coin three times and observe the total number of heads.
𝒮₃: {0, 1, 2, 3}.

ℰ₄: Toss a pair of dice and observe the "up" faces.
𝒮₄: {(1, 1), (1, 2), ..., (6, 6)}.

ℰ₅: Each weld in a car door is inspected, and the total number of defective welds is counted.
𝒮₅: {0, 1, 2, ..., X}, where X = the total number of welds in the door.
ℰ₆: A cathode ray tube is manufactured, put on life test, and aged to failure. The elapsed time (in hours) at failure is recorded.
𝒮₆: {t: t ∈ R, t ≥ 0}. This set is uncountable.

ℰ₇: A monitor records the emission count from a radioactive source in one minute.
𝒮₇: {0, 1, 2, ...}. This set is countably infinite.

ℰ₈: Two key solder joints on a printed circuit board are inspected with a probe as well as visually, and each joint is classified as good, G, or defective, D, requiring rework or scrap.
𝒮₈: {GG, GD, DG, DD}.

ℰ₉: In a particular chemical plant the volume produced per day for a particular product ranges between a minimum value, b, and a maximum value, c, which corresponds to capacity. A day is randomly selected and the amount produced is observed.
𝒮₉: {x: x ∈ R, b ≤ x ≤ c}.

ℰ₁₀: An extrusion plant is engaged in making up an order for pieces 20 feet long. Inasmuch as the trim operation creates scrap at both ends, the extruded bar must exceed 20 feet. Because of the costs involved, the amount of scrap is critical. A bar is extruded, trimmed, and finished, and the total length of scrap is measured.
𝒮₁₀: {x: x ∈ R, x > 0}.

ℰ₁₁: In a missile launch, the three components of velocity are measured from the ground as a function of time. At 1 minute after launch these are printed for a control unit.
𝒮₁₁: {(vₓ, v_y, v_z): vₓ, v_y, v_z are real numbers}.

ℰ₁₂: In the preceding example, the velocity components are continuously recorded for 5 minutes.
𝒮₁₂: The space is complicated here, as we have all possible realizations of the functions vₓ(t), v_y(t), and v_z(t) for 0 ≤ t ≤ 5 minutes to consider.
Chapter 1 An Introduction to Probability
All these examples have the characteristics required of random experiments. With the exception of Example 1-12, the description of the sample space is straightforward, and although repetition is not considered, ideally we could repeat the experiments. To illustrate the phenomenon of random occurrence, consider the coin toss of ℰ₁. Obviously, if ℰ₁ is repeated indefinitely, we obtain a sequence of heads and tails. A pattern emerges as we continue the experiment. Notice that since the coin is true, we should obtain heads approximately one-half of the time. In recognizing the idealization in the model, we simply agree on a theoretically possible set of outcomes. In ℰ₁, we ruled out the possibility of having the coin land on edge, and in ℰ₆, where we recorded the elapsed time to failure, the idealized sample space consisted of all nonnegative real numbers.
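The stabilization of the relative frequency of heads is easy to observe numerically. A short simulation sketch (illustrative only; the seed and number of repetitions are arbitrary choices, not from the text):

```python
import random

random.seed(1)                  # fixed seed so the run is reproducible
n = 10_000
# Each draw below 0.5 is counted as "heads" for a true (fair) coin.
heads = sum(random.random() < 0.5 for _ in range(n))
print(heads / n)                # relative frequency of heads; close to 1/2
```

Repeating the run with larger n shows the relative frequency settling ever closer to 0.5, which is the pattern referred to above.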
1-4 EVENTS

An event, say A, is associated with the sample space of the experiment. The sample space is considered to be the universal set, so that event A is simply a subset of 𝒮. Note that both ∅ and 𝒮 are subsets of 𝒮. As a general rule, a capital letter will be used to denote an event. For a finite sample space, we note that the set of all subsets is the power set, and more generally, we require that if A ⊂ 𝒮, then A′ ⊂ 𝒮, and if A₁, A₂, ... is a sequence of mutually exclusive events in 𝒮, as defined below, then A₁ ∪ A₂ ∪ ... ⊂ 𝒮. The following events relate to experiments ℰ₁, ℰ₂, ..., ℰ₁₀ described in the preceding section. These are provided for illustration only; many other events could have been described for each case.

ℰ₁, A: The coin toss yields a head, {H}.
ℰ₂, A: All the coin tosses give the same face, {HHH, TTT}.
ℰ₃, A: The total number of heads is two, {2}.
ℰ₄, A: The sum of the "up" faces is seven, {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.
ℰ₅, A: The number of defective welds does not exceed 5, {0, 1, 2, 3, 4, 5}.
ℰ₆, A: The time to failure is greater than 1000 hours, {t: t > 1000}.
ℰ₇, A: The count is exactly two, {2}.
ℰ₈, A: Neither solder joint is defective, {GG}.
ℰ₉, A: The volume produced is between a and c, where a > b, {x: x ∈ R, a ≤ x ≤ c}.
ℰ₁₀, A: The scrap does not exceed one foot, {x: x ∈ R, 0 < x ≤ 1}.
Therefore, to adequately detect a significant change in the proportion of nonconforming rubber belts, random samples of at least n = 348 would be needed.

11-7 SUMMARY

This chapter has introduced hypothesis testing. Procedures for testing hypotheses on means and variances are summarized in Table 11-8. The chi-square goodness-of-fit test was introduced to test the hypothesis that an empirical distribution follows a particular probability law. Graphical methods are also useful in goodness-of-fit testing, particularly when sample sizes are small. Two-way contingency tables for testing the hypothesis that two methods of classification of a sample are independent were also introduced. Several computer examples were also presented.
11-8 EXERCISES

11-1. The breaking strength of a fiber used in manufacturing cloth is required to be at least 160 psi. Past experience has indicated that the standard deviation of breaking strength is 3 psi. A random sample of four specimens is tested and the average breaking strength is found to be 153 psi.
(a) Should the fiber be judged acceptable with α = 0.05?
(b) What is the probability of accepting H₀: μ ≥ 160 if the fiber has a true breaking strength of 165 psi?

11-2. The yield of a chemical process is being studied. The variance of yield is known from previous experience with this process to be 5 (units of σ² = percentage²). The past five days of plant operation have resulted in the following yields (in percentages): 91.6, 88.75, 90.8, 89.95, 91.3.
(a) Is there reason to believe the yield is less than 90%?
(b) What sample size would be required to detect a true mean yield of 85% with probability 0.95?
11-3. The diameters of bolts are known to have a standard deviation of 0.0001 inch. A random sample of 10 bolts yields an average diameter of 0.2546 inch.
(a) Test the hypothesis that the true mean diameter of bolts equals 0.255 inch, using α = 0.05.
(b) What size sample would be necessary to detect a true mean bolt diameter of 0.2551 inch with a probability of at least 0.90?
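As a computational illustration of the fixed-significance-level z-test that underlies Exercises 11-1 through 11-3 (a sketch of ours, not part of the original text), the statistic Z₀ = (x̄ − μ₀)/(σ/√n) for the fiber data of Exercise 11-1 can be evaluated with only the standard library:

```python
from math import sqrt, erf

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Exercise 11-1: n = 4 specimens, known sigma = 3 psi, sample mean 153 psi,
# testing H0: mu >= 160 against H1: mu < 160.
n, sigma, xbar, mu0 = 4, 3.0, 153.0, 160.0
z0 = (xbar - mu0) / (sigma / sqrt(n))
p_value = phi(z0)        # lower-tailed test
print(round(z0, 2))      # -4.67
print(p_value < 0.05)    # True, so H0 is rejected at alpha = 0.05
```

The same pattern (compute Z₀, compare its tail probability with α) applies to any test on a mean with known variance.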
11-4. Consider the data in Exercise 10-39.
(a) Test the hypothesis that the mean piston ring diameter is 74.035 mm. Use α = 0.01.
(b) What sample size is required to detect a true mean diameter of 74.030 with a probability of at least 0.95?
Table 11-8 Summary of Hypothesis Testing Procedures on Means and Variances

11-5. ... Test the hypothesis that the mean life of the light bulbs is 1000 hours. Use α = 0.05.

11-6. Consider the data in Exercise 10-41. Test the hypothesis that mean compressive strength equals 3500 psi. Use α = 0.01.

11-7. Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The filling processes can be assumed normal, with standard deviations σ₁ = 0.015 and σ₂ = 0.018. Quality engineering suspects that both machines fill to the same net volume, whether or not this volume is 16.0 ounces. A random sample is taken from the output of each machine.

Machine 1: 16.03, 16.04, 16.05, 16.05, 16.02, 16.01, 15.96, 15.98, 16.02, 15.99
Machine 2: 16.02, 15.97, 15.96, 16.01, 15.99, 16.03, 16.04, 16.02, 16.01, 16.00
Chapter 11 Tests of Hypotheses
(a) Do you think that quality engineering is correct? Use α = 0.05.
(b) Assuming equal sample sizes, what sample size should be used to assure that β = 0.05 if the true difference in means is 0.075? Assume that α = 0.05.
(c) What is the power of the test in (a) for a true difference in means of 0.075?

11-8. The film development department of a local store is considering the replacement of its current film-processing machine. The time in which it takes the machine to completely process a roll of film is important. A random sample of 12 rolls of 24-exposure color film is selected for processing by the current machine. The average processing time is 8.1 minutes, with a sample standard deviation of 1.4 minutes. A random sample of 10 rolls of the same type of film is selected for testing in the new machine. The average processing time is 7.3 minutes, with a sample standard deviation of 0.9 minutes. The local store will not purchase the new machine unless the processing time is more than 2 minutes shorter than that of the current machine. Based on this information, should they purchase the new machine?

11-9. Consider the data in Exercise 10-45. Test the hypothesis that both machines fill to the same volume. Use α = 0.10.

11-10. Consider the data in Exercise 10-46. Test H₀: μ₁ = μ₂ against H₁: μ₁ > μ₂, using α = 0.05.

11-11. Consider the gasoline road octane number data in Exercise 10-47. If formulation 2 produces a higher road octane number than formulation 1, the manufacturer would like to detect this. Formulate and test an appropriate hypothesis, using α = 0.05.

11-12. The lateral deviation in yards of a certain type of mortar shell is being investigated by the propellant manufacturer. The following deviations have been observed over ten rounds: 11.28, −9.48, −10.42, 1.95, 6.25, 10.11, 6.47, ... Test the hypothesis that the mean lateral deviation of these mortar shells is zero. Assume that lateral deviation is normally distributed.

11-13. The shelf life of a photographic film is of interest to the manufacturer. The manufacturer observes the following shelf life for eight units chosen at random from the current production: 108, 128, 134, 163, 124, 159, 116, 134 days. Assume that shelf life is normally distributed.
(a) Is there any evidence that the mean shelf life is greater than or equal to 125 days?
(b) If it is important to detect a ratio of δ/σ of 1.0 with a probability 0.90, is the sample size sufficient?

11-14. The titanium content of an alloy is being studied in the hope of ultimately increasing the tensile strength. An analysis of six recent heats chosen at random produces the following titanium contents (%): 8.0, 9.9, 9.9, 7.7, 11.6, 14.6. Is there any evidence that the mean titanium content is greater than 9.5%?

11-15. An article in the Journal of Construction Engineering and Management (1999, p. 39) presents some data on the number of work hours lost per day on a construction project due to weather-related incidents. Over 11 workdays, the following lost work hours were recorded: 8.8, 8.8, 12.5, 5.4, 12.8, 12.2, 13.3, 6.9, 9.1, 2.2, 14.7. Assuming work hours are normally distributed, is there any evidence to conclude that the mean number of work hours lost per day is greater than 8 hours?

11-16. The percentage of scrap produced in a new finishing operation is hypothesized to be less than 7.5%. Several days were chosen at random and the percentages of scrap were calculated: 5.51%, 7.32%, 6.49%, 6.46%, 5.37%, 8.56%, 8.81%, 7.46%.
(a) In your opinion, is the true scrap rate less than 7.5%?
(b) If it is important to detect a ratio of δ/σ of 1.5 with a probability of at least 0.90, what is the minimum sample size that can be used?
(c) For δ/σ = 2.0, what is the power of the above test?
11-17. Suppose that we must test the hypotheses

H₀: μ ≥ 15,
H₁: μ < 15,

where it is known that σ² = 2.5. If α = 0.05 and the true mean is 12, what sample size is necessary to assure a type II error of 5%?
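Sample-size questions like Exercise 11-17 use the relation n = ((z_α + z_β)σ/δ)² for a one-sided test on a mean with known variance. A sketch of that computation (the bisection-based quantile helper is our own construction, not from the text):

```python
from math import sqrt, erf, ceil

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def z_upper(p):
    """Upper p-quantile of the standard normal, found by bisection."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if phi(mid) < 1.0 - p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Exercise 11-17: sigma^2 = 2.5, alpha = beta = 0.05, delta = 15 - 12 = 3.
alpha, beta, sigma, delta = 0.05, 0.05, sqrt(2.5), 3.0
n = ceil(((z_upper(alpha) + z_upper(beta)) * sigma / delta) ** 2)
print(n)  # 4
```

Rounding up to the next integer is the usual convention, since a fractional observation cannot be taken.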
11-18. An engineer desires to test the hypothesis that the melting point of an alloy is 1000°C. If the true melting point differs from this by more than 20°C he must change the alloy's composition. If we assume that the melting point is a normally distributed random variable, α = 0.05, β = 0.10, and σ = 10°C, how many observations should be taken?

11-19. Two methods for producing gasoline from crude oil are being investigated. The yields of both processes are assumed to be normally distributed. The following yield data (%) have been obtained from the pilot plant: 24.2, 26.6, 25.7, 21.0, 22.1, 21.8, 20.9, 22.4, 22.0.
(a) Is there reason to believe that process 1 has a greater mean yield? Use α = 0.01. Assume that both variances are equal.
(b) Assuming that in order to adopt process 1 it must produce a mean yield that is at least 5% greater than that of process 2, what are your recommendations?
(c) Find the power of the test in part (a) if the mean yield of process 1 is 5% greater than that of process 2.
(d) What sample size is required for the test in part (a) to ensure that the null hypothesis will be rejected with a probability of 0.90 if the mean yield of process 1 exceeds the mean yield of process 2 by 5%?

11-20. An article that appeared in the Proceedings of the 1998 Winter Simulation Conference (1998, p. 1079) discusses the concept of validation for traffic simulation models. The stated purpose of the study is to design and modify the facilities (roadways and control devices) to optimize efficiency and safety of traffic flow. Part of the study compares speeds observed at various intersections and speeds simulated by a model being tested. The goal is to determine whether the simulation model is representative of the actual observed speed. Field data is collected at a particular location, and then the simulation model is implemented. Fourteen speeds (ft/sec) are measured at the location, and fourteen observations are simulated using the proposed model. The data are:

Field: 53.33, 53.33, 53.33, 55.17, 55.17, 55.17, 57.14, 57.14, 57.14, 61.54, 61.54, 61.54, 69.57, 69.57
Model: 47.40, 49.80, 51.90, 52.20, 54.50, 55.70, 56.70, 58.20, 59.00, 60.10, 63.40, 65.80, 71.30, 75.40

Assuming the variances are equal, conduct a hypothesis test to determine whether there is a significant difference between the field data and the model-simulated data. Use α = 0.05.

11-21. The following are the burning times (in minutes) of flares of two different types, ten of each type: 63, 81, 82, 64, 68, 72, 63, 57, 66, 59, 75, 73, 83, 74, 82, 82, 59, 65, 56, 82.
(a) Test the hypothesis that the two variances are equal. Use α = 0.05.
(b) Using the results of (a), test the hypothesis that the mean burning times are equal.

11-22. A new filtering device is installed in a chemical unit. Before its installation, a random sample yielded the following information about the percentage of impurity: x̄₁ = 12.5, s₁² = 101.17, and n₁ = 8. After installation, a random sample yielded x̄₂ = 10.2, s₂² = 94.73, n₂ = 9.
(a) Can you conclude that the two variances are equal?
(b) Has the filtering device reduced the percentage of impurity significantly?

11-23. Suppose that two random samples were drawn from normal populations with equal variances. The sample data yield x̄₁ = 20.0, n₁ = 10, Σ(x₁ⱼ − x̄₁)² = 1480, x̄₂ = 15.8, n₂ = 10, and Σ(x₂ⱼ − x̄₂)² = 1425.
(a) Test the hypothesis that the two means are equal. Use α = 0.01.
(b) Find the probability that the null hypothesis in (a) will be rejected if the true difference in means is 10.
(c) What sample size is required to detect a true difference in means of 5 with probability at least 0.80 if it is known at the start of the experiment that a rough estimate of the common variance is 150?
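For two-sample problems like Exercise 11-22, the pooled t statistic can be computed directly from the summary statistics. A sketch (ours, not the book's; the conclusion still requires comparing t₀ with a t-table value on n₁ + n₂ − 2 degrees of freedom):

```python
from math import sqrt

# Summary statistics from Exercise 11-22: percentage impurity before and
# after installation of the filtering device.
n1, xbar1, s1_sq = 8, 12.5, 101.17
n2, xbar2, s2_sq = 9, 10.2, 94.73

# Pooled variance and the two-sample t statistic, assuming equal variances.
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
t0 = (xbar1 - xbar2) / sqrt(sp_sq * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(round(t0, 3), df)  # 0.479 15
```

The small value of t₀ relative to typical t critical values already hints at the answer to part (b), although the formal comparison is left to the exercise.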
11-24. Consider the data in Exercise 10-56.
(a) Test the hypothesis that the means of the two normal distributions are equal. Use α = 0.05 and assume that σ₁² = σ₂².
(b) What sample size is required to detect a difference in means of 2.0 with a probability of at least 0.85?
(c) Test the hypothesis that the variances of the two distributions are equal. Use α = 0.05.
(d) Find the power of the test in (c) if the variance of one population is four times the other.

11-25. Consider the data in Exercise 10-57. Assuming that σ₁² = σ₂², test the hypothesis that the mean rod diameters do not differ. Use α = 0.05.

11-26. A chemical company produces a certain drug whose weight has a standard deviation of 4 mg. A new method of producing this drug has been proposed, although some additional cost is involved. Management will authorize a change in production technique only if the standard deviation of the weight in the new process is less than 4 mg. If the standard deviation of weight in the new process is as small as 3 mg, the company would like to switch production methods with a probability of at least 0.90. Assuming weight to be normally distributed and α = 0.05, how many observations should be taken? Suppose the researchers choose n = 10 and obtain the data below. Is this a good choice for n? What should be their decision?

16.630 grams   16.628 grams
16.631         16.622
16.624         16.627
16.622         16.623
16.626         16.618

11-27. A manufacturer of precision measuring instruments claims that the standard deviation in the use of the instrument is 0.00002 inch. An analyst, who is unaware of the claim, uses the instrument eight times and obtains a sample standard deviation of 0.00005 inch.
(a) Using α = 0.01, is the claim justified?
(b) Compute a 99% confidence interval for the true variance.
(c) What is the power of the test if the true standard deviation equals 0.00004?
(d) What is the smallest sample size that can be used to detect a true standard deviation of 0.00004 with a probability at least of 0.95? Use α = 0.01.

11-28. The standard deviation of measurements made by a special thermocouple is supposed to be 0.005 degree. If the standard deviation is as great as 0.010, we wish to detect it with a probability of at least 0.90. Use α = 0.01. What sample size should be used? If this sample size is used and the sample standard deviation s = 0.007, what is your conclusion, using α = 0.01? Construct a 95% upper confidence interval for the true variance.

11-29. The manufacturer of a power supply is interested in the variability of output voltage. He has tested 12 units, chosen at random, with the following results:
5.34   5.00   5.07   5.25
5.54   5.44   4.61   5.35
4.76   5.65   5.55   5.35

(a) Test the hypothesis that σ² = 0.5. Use α = 0.05.
(b) If the true value of σ² = 1.0, what is the probability that the hypothesis in (a) will be rejected?
11-30. For the data in Exercise 11-7, test the hypothesis that the two variances are equal, using α = 0.01. Does the result of this test influence the manner in which a test on means would be conducted? What sample size is necessary to detect σ₁²/σ₂² = 2.5, with a probability of at least 0.90?

11-31. Consider the following two samples, drawn from two normal populations.

Sample 1: 4.34, 5.00, 4.97, 4.25, 5.55, 6.55, 6.37, 5.55, 3.76
Sample 2: 1.87, 2.00, 2.00, 1.85, 2.11, 2.31, 2.28, 2.07, 1.76, 1.91, 2.00

Is there evidence to conclude that the variance of population 1 is greater than the variance of population 2? Use α = 0.01. Find the probability of detecting σ₁²/σ₂² = 4.0.
11-32. Two machines produce metal parts. The variance of the weight of these parts is of interest. The following data have been collected.

Machine 1: n₁ = 25, x̄₁ = 0.984, s₁² = 13.46
Machine 2: n₂ = 30, x̄₂ = 0.907, s₂² = 9.65

(a) Test the hypothesis that the variances of the two machines are equal. Use α = 0.05.
(b) Test the hypothesis that the two machines produce parts having the same mean weight. Use α = 0.05.
11-33. In a hardness test, a steel ball is pressed into the material being tested at a standard load. The diameter of the indentation is measured, which is related to the hardness. Two types of steel balls are available, and their performance is compared on 10 specimens. Each specimen is tested twice, once with each ball. The results are given below. Test the hypothesis that the two steel balls give the same expected hardness measurement. Use α = 0.05.

11-34. Two types of exercise equipment, A and B, for handicapped individuals are often used to determine the effect of the particular exercise on heart rate (in beats per minute). Seven subjects participated in a study to determine whether the two types of equipment have the same effect on heart rate. The results are given in the table below.

Subject   A     B
1         162   161
2         163   187
3         140   199
4         191   206
5         160   161
6         158   160
7         155   162

Conduct an appropriate test of hypothesis to determine whether there is a significant difference in heart rate due to the type of equipment used.
11-35. An aircraft designer has theoretical evidence that painting the airplane reduces its speed at a specified power and flap setting. He tests six consecutive airplanes from the assembly line before and after painting. The results are shown below.

           Top Speed (mph)
Airplane   Painted   Not Painted
1          286       289
2          285       286
3          279       283
4          283       288
5          281       283
6          286       289

Do the data support the designer's theory? Use α = 0.05.

11-36. An article in the International Journal of Fatigue (1998, p. 537) discusses the bending fatigue resistance of gear teeth when using a particular prestressing or presetting process. Presetting of a gear tooth is obtained by applying and then removing a single overload to the machine element. To determine significant differences in fatigue resistance due to presetting, fatigue data were collected. A "preset" tooth and a "nonpreset" tooth were paired if they were present on the same gear. Eleven pairs were formed and the fatigue life measured for each. (The final response of interest is ln[(fatigue life) × 10⁻⁶].)
Conduct a test of hypothesis to determine whether presetting significantly increases the fatigue life of gear teeth. Use α = 0.10.

11-37. Consider the data in Exercise 10-66. Test the hypothesis that the uninsured rate is 10%. Use α = 0.05.

11-38. Consider the data in Exercise 10-68. Test the hypothesis that the fraction of defective calculators produced is 2.5%.
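Exercises 11-33 through 11-36 are paired-comparison problems. As an illustrative sketch (ours, not the book's worked solution), the paired t statistic for the airplane data of Exercise 11-35 can be computed as follows; note that we read the sixth "not painted" speed as 283 mph, since the 183 in some scans appears to be a misprint:

```python
from math import sqrt

# Top speeds (mph) from Exercise 11-35. The sixth "not painted" value is
# taken as 283; the 183 appearing in some printings looks like a misprint.
painted     = [286, 285, 279, 283, 281, 286]
not_painted = [289, 286, 283, 288, 283, 289]

# Paired t statistic on the differences d_j = painted_j - not_painted_j.
d = [p - q for p, q in zip(painted, not_painted)]
n = len(d)
dbar = sum(d) / n
s_d = sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
t0 = dbar / (s_d / sqrt(n))
print(round(t0, 2))  # -5.2, on n - 1 = 5 degrees of freedom
```

Pairing removes the airplane-to-airplane variability, which is why the differences, not the raw speeds, are analyzed.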
11-39. Suppose that we wish to test the hypothesis H₀: μ₁ = μ₂ against the alternative H₁: μ₁ ≠ μ₂, where both variances σ₁² and σ₂² are known. A total of n₁ + n₂ = N observations can be taken. How should these observations be allocated to the two populations to maximize the probability that H₀ will be rejected if H₁ is true and μ₁ − μ₂ = δ ≠ 0?

11-40. Consider the union membership study described in Exercise 10-70. Test the hypothesis that the proportion of men who belong to a union does not differ from the proportion of women who belong to a union. Use α = 0.05.

11-41. Using the data in Exercise 10-71, determine whether it is reasonable to conclude that production line 2 produced a higher fraction of defective product than line 1. Use α = 0.01.
11-42. Two different types of injection-molding machines are used to form plastic parts. A part is considered defective if it has excessive shrinkage or is discolored. Two random samples, each of size 500, are selected, and 32 defective parts are found in the sample from machine 1, while 21 defective parts are found in the sample from machine 2. Is it reasonable to conclude that both machines produce the same fraction of defective parts?
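A sketch of the two-proportion z statistic for Exercise 11-42 (ours, for illustration; the pooled proportion is used because H₀ asserts p₁ = p₂):

```python
from math import sqrt, erf

# Exercise 11-42: defectives found in samples of 500 parts from each machine.
x1, n1 = 32, 500
x2, n2 = 21, 500

p1, p2 = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)             # pooled proportion under H0
z0 = (p1 - p2) / sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z0) / sqrt(2))))  # two-sided
print(round(z0, 2))    # 1.55
print(p_value > 0.05)  # True: no significant difference at alpha = 0.05
```

Because the test is two-sided, the p-value doubles the upper-tail probability of |Z₀|.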
11-43. Suppose that we wish to test H₀: μ₁ = μ₂ against H₁: μ₁ ≠ μ₂, where σ₁² and σ₂² are known. The total sample size N is fixed, but the allocation of observations to the two populations such that n₁ + n₂ = N is to be made on the basis of cost. If the costs of sampling for populations 1 and 2 are C₁ and C₂, respectively, find the minimum-cost sample sizes that provide a specified variance for the difference in sample means.

11-44. A manufacturer of a new pain relief tablet would like to demonstrate that her product works twice as fast as her competitor's product. Specifically, she would like to test

H₀: μ₁ = 2μ₂,
H₁: μ₁ > 2μ₂,

where μ₁ is the mean absorption time of the competitive product and μ₂ is the mean absorption time of the new product. Assuming that the variances σ₁² and σ₂² are known, suggest a procedure for testing this hypothesis.

11-45. Derive an expression similar to equation 11-20 for the β error for the test on the variance of a normal distribution. Assume that the two-sided alternative is specified.

11-46. Derive an expression similar to equation 11-20 for the β error for the test of the equality of the variances of two normal distributions. Assume that the two-sided alternative is specified.

11-47. The number of defective units found each day by an in-circuit functional tester in a printed circuit board assembly process is shown below.

Number of Defectives   Times Observed
6-10                   6
11-15                  11
16-20                  16
21-25                  28
26-30                  22
31-35                  19
36-40                  11
41-45                  4

(a) Is it reasonable to conclude that these data come from a normal distribution? Use a chi-square goodness-of-fit test.
(b) Plot the data on normal probability paper. Does an assumption of normality seem justified?

11-48. Defects on wafer surfaces in integrated circuit fabrication are unavoidable. In a particular process the following data were collected.

Number of      Number of Wafers
Defects i      with i Defects
0              4
1              13
2              34
3              56
4              70
5              70
6              58
7              42
8              25
9              15
10             9
11             3
12             1

Does the assumption of a Poisson distribution seem appropriate as a probability model for this process?

11-49. A pseudorandom number generator is designed so that integers 0 through 9 have an equal probability of occurrence. The first 10,000 numbers are as follows:

Digit:   0     1      2     3      4      5     6      7     8      9
Count:   967   1008   975   1022   1003   989   1001   981   1043   1011

Does this generator seem to be working properly?

11-50. The cycle time of an automatic machine has been observed and recorded.
(a) Does the normal distribution seem to be a reasonable probability model for the cycle time? Use the chi-square goodness-of-fit test.
(b) Plot the data on normal probability paper. Does the assumption of normality seem reasonable?
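The chi-square goodness-of-fit arithmetic for the generator data of Exercise 11-49 can be sketched directly (ours, for illustration; under the uniform model every digit has expected count 10,000/10 = 1,000):

```python
# Observed counts of the digits 0-9 among the first 10,000 numbers
# produced by the generator in Exercise 11-49.
observed = [967, 1008, 975, 1022, 1003, 989, 1001, 981, 1043, 1011]
expected = sum(observed) / len(observed)   # 1000 for every digit

chi2 = sum((o - expected) ** 2 / expected for o in observed)
print(round(chi2, 3))   # 4.724
# The critical value chi2_{0.05, 9} = 16.92, so uniformity is not rejected:
print(chi2 < 16.92)     # True
```

With 10 cells and no estimated parameters, the statistic has 9 degrees of freedom.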
11-51. A soft drink bottler is studying the internal pressure strength of 1-liter glass nonreturnable bottles. A random sample of 16 bottles is tested and the pressure strengths obtained. The data are shown below. Plot these data on normal probability paper. Does it seem reasonable to conclude that pressure strength is normally distributed?

226.16 psi   202.20   219.54   193.73
208.15       195.45   193.71   200.81   ...

11-52. A company operates four machines for three shifts each day. From production records, the following data on the number of breakdowns are collected.

         Machines
Shift   A    B    C    D
1       41   20   12   16
2       31   11   9    14
3       15   17   16   10

Test the hypothesis that breakdowns are independent of the shift.

11-53. Patients in a hospital are classified as surgical or medical. A record is kept of the number of times patients require nursing service during the night and whether these patients are on Medicare or not. The data are as follows:

            Patient Category
Medicare    Surgical   Medical
Yes         46         52
No          36         43

Test the hypothesis that calls by surgical-medical patients are independent of whether the patients are receiving Medicare.

11-54. Grades in a statistics course and an operations research course taken simultaneously were as follows for a group of students.

Statistics   Operations Research Grade
Grade        A    B    C    Other
A            25   6    17   13
B            17   16   15   6
C            18   4    18   10
Other        10   8    11   20

Are the grades in statistics and operations research related?

11-55. An experiment with artillery shells yields the following data on the characteristics of lateral deflections and ranges. Would you conclude that deflection and range are independent?

                   Lateral Deflection
Range (yards)      Left   Normal   Right
0 - 1,999          6      14       8
2,000 - 5,999      9      11       4
6,000 - 11,999     8      17       6

11-56. A study is being made of the failures of an electronic component. There are four types of failures possible and two mounting positions for the device. The following data have been taken.

                     Failure Type
Mounting Position    A    B    C    D
1                    22   46   18   9
2                    4    17   6    12

Would you conclude that the type of failure is independent of the mounting position?

11-57. An article in Research in Nursing and Health (1999, p. 263) summarizes data collected from a previous study (Research in Nursing and Health, 1998, p. 285) on the relationship between physical activity and socio-economic status of 1507 Caucasian women. The data are given in the table below.

                        Physical Activity
Socio-economic Status   Inactive   Active
Low                     216        245
Medium                  226        409
High                    114        297

Test the hypothesis that physical activity is independent of socio-economic status.
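The expected-frequency arithmetic behind these contingency-table exercises can be sketched for the 2 × 2 table of Exercise 11-53 (our illustration; each expected count is row total × column total / n):

```python
# Exercise 11-53: rows are Medicare (yes, no); columns are patient
# category (surgical, medical).
table = [[46, 52],
         [36, 43]]

n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]

# Chi-square statistic against the expected counts under independence.
chi2 = sum(
    (table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
    / (row_tot[i] * col_tot[j] / n)
    for i in range(2) for j in range(2))
df = (len(table) - 1) * (len(table[0]) - 1)
print(round(chi2, 3), df)  # 0.033 1
```

The same loop generalizes to any r × c table, with (r − 1)(c − 1) degrees of freedom, and so applies to Exercises 11-52 through 11-59 as well.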
11-58. Fabric is graded into three classifications: A, B, and C. The results below were obtained from five looms. Is fabric classification independent of the loom?

       Number of Pieces of Fabric
       in Fabric Classification
Loom   A     B    C
1      185   16   12
2      190   24   21
3      170   35   16
4      158   22   7
5      185   22   15
11-59. An article in the Journal of Marketing Research (1970, p. 36) reports a study of the relationship between facility conditions at gasoline stations and the aggressiveness of their gasoline marketing policy. A sample of 441 gasoline stations was investigated, with the results shown below obtained. Is there evidence that gasoline pricing strategy and facility conditions are independent?

                Condition
Policy          Substandard   Standard   Modern
Aggressive      24            52         58
Neutral         15            73         86
Nonaggressive   17            80         36

11-60. Consider the injection molding process described in Exercise 11-42.
(a) Set up this problem as a 2 × 2 contingency table and perform the indicated statistical analysis.
(b) State clearly the hypothesis being tested. Are you testing homogeneity or independence?
(c) Is this procedure equivalent to the test procedure used in Exercise 11-42?
Chapter 12

Design and Analysis of Single-Factor Experiments: The Analysis of Variance

Experiments are a natural part of the engineering and management decision-making process. For example, suppose that a civil engineer is investigating the effect of curing methods on the mean compressive strength of concrete. The experiment would consist of making up several test specimens of concrete using each of the proposed curing methods and then testing the compressive strength of each specimen. The data from this experiment could be used to determine which curing method should be used to provide maximum compressive strength.

If there are only two curing methods of interest, the experiment could be designed and analyzed using the methods discussed in Chapter 11. That is, the experimenter has a single factor of interest (curing methods), and there are only two levels of the factor. If the experimenter is interested in determining which curing method produces the maximum compressive strength, then the number of specimens to test can be determined from the operating characteristic curves in Chart VI (Appendix), and the t-test can be used to determine whether the two means differ.

Many single-factor experiments require more than two levels of the factor to be considered. For example, the civil engineer may have five different curing methods to investigate. In this chapter we introduce the analysis of variance for dealing with more than two levels of a single factor. In Chapter 13, we show how to design and analyze experiments with several factors.
12-1 THE COMPLETELY RANDOMIZED SINGLE-FACTOR EXPERIMENT
Table 12-1 Tensile Strength of Paper (psi)

Hardwood                      Observations
Concentration (%)    1    2    3    4    5    6    Totals   Averages
5                    7    8   15   11    9   10      60       10.00
10                  12   17   13   18   19   15      94       15.67
15                  14   18   19   17   16   18     102       17.00
20                  19   25   22   23   18   20     127       21.17
                                                    383       15.96
has six observations, or replicates. The role of randomization in this experiment is extremely important. By randomizing the order of the 24 runs, the effect of any nuisance variable that may affect the observed tensile strength is approximately balanced out. For example, suppose that there is a warm-up effect on the tensile tester; that is, the longer the machine is on, the greater the observed tensile strength. If the 24 runs are made in order of increasing hardwood concentration (i.e., all six 5% concentration specimens are tested first, followed by all six 10% concentration specimens, etc.), then any observed differences due to hardwood concentration could also be due to the warm-up effect.

It is important to graphically analyze the data from a designed experiment. Figure 12-1 presents box plots of tensile strength at the four hardwood concentration levels. This plot indicates that changing the hardwood concentration has an effect on tensile strength; specifically, higher hardwood concentrations produce higher observed tensile strength. Furthermore, the distribution of tensile strength at a particular hardwood level is reasonably symmetric, and the variability in tensile strength does not change dramatically as the hardwood concentration changes.
Graphical interpretation of the data is always a good idea. Box plots show the variability of the observations within a treatment (factor level) and the variability between treatments. We now show how the data from a single-factor randomized experiment can be analyzed statistically.
12-1.2 The Analysis of Variance

Suppose we have $a$ different levels of a single factor (treatments) that we wish to compare. The observed response for each of the $a$ treatments is a random variable. The data would appear as in Table 12-2. An entry in Table 12-2, say $y_{ij}$, represents the $j$th observation taken under treatment $i$. We initially consider the case where there is an equal number of observations $n$ on each treatment. We may describe the observations in Table 12-2 by the linear statistical model

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n,$$   (12-1)

where $y_{ij}$ is the $(i,j)$th observation, $\mu$ is a parameter common to all treatments, called the overall mean, $\tau_i$ is a parameter associated with the $i$th treatment, called the $i$th treatment effect, and $\epsilon_{ij}$ is a random error component. Note that $y_{ij}$ represents both the random variable and its realization. We would like to test certain hypotheses about the treatment effects and to estimate them. For hypothesis testing, the model errors are assumed to be normally and independently distributed random variables with mean zero and variance $\sigma^2$ [abbreviated NID(0, $\sigma^2$)]. The variance $\sigma^2$ is assumed constant for all levels of the factor.

The model of equation 12-1 is called the one-way-classification analysis of variance, because only one factor is investigated. Furthermore, we will require that the observations be taken in random order so that the environment in which the treatments are used (often called the experimental units) is as uniform as possible. This is called a completely randomized experimental design.

There are two different ways that the $a$ factor levels in the experiment could have been chosen. First, the $a$ treatments could have been specifically chosen by the experimenter. In this situation we wish to test hypotheses about the $\tau_i$, and conclusions will apply only to the factor levels considered in the analysis. The conclusions cannot be extended to similar treatments that were not considered. Also, we may wish to estimate the $\tau_i$. This is called the fixed-effects model. Alternatively, the $a$ treatments could be a random sample from a larger population of treatments. In this situation we would like to be able to extend the conclusions (which are based on the sample of treatments) to all treatments in the population, whether they were explicitly considered in the analysis or not. Here the $\tau_i$ are random variables, and knowledge about the particular ones investigated is relatively useless. Instead, we test hypotheses about the variability of the $\tau_i$ and try to estimate this variability. This is called the random-effects, or components-of-variance, model.
Table 12-2 Typical Data for One-Way-Classification Analysis of Variance

Treatment     Observation                                    Totals        Averages
1             $y_{11}$  $y_{12}$  $\ldots$  $y_{1n}$         $y_{1.}$      $\bar{y}_{1.}$
2             $y_{21}$  $y_{22}$  $\ldots$  $y_{2n}$         $y_{2.}$      $\bar{y}_{2.}$
$\vdots$
$a$           $y_{a1}$  $y_{a2}$  $\ldots$  $y_{an}$         $y_{a.}$      $\bar{y}_{a.}$
                                                             $y_{..}$      $\bar{y}_{..}$
In this section we will develop the analysis of variance for the fixed-effects model, one-way classification. In the fixed-effects model, the treatment effects $\tau_i$ are usually defined as deviations from the overall mean, so that

$$\sum_{i=1}^{a} \tau_i = 0.$$   (12-2)

Let $y_{i.}$ represent the total of the observations under the $i$th treatment and $\bar{y}_{i.}$ represent the average of the observations under the $i$th treatment. Similarly, let $y_{..}$ represent the grand total of all observations and $\bar{y}_{..}$ represent the grand mean of all observations. Expressed mathematically,

$$y_{i.} = \sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{i.} = y_{i.}/n, \qquad i = 1, 2, \ldots, a,$$   (12-3)

$$y_{..} = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}, \qquad \bar{y}_{..} = y_{..}/N,$$

where $N = an$ is the total number of observations. Thus the "dot" subscript notation implies summation over the subscript that it replaces.

We are interested in testing the equality of the $a$ treatment effects. Using equation 12-2, the appropriate hypotheses are

$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0,$$
$$H_1: \tau_i \neq 0 \text{ for at least one } i.$$   (12-4)
That is, if the null hypothesis is true, then each observation is made up of the overall mean $\mu$ plus a realization of the random error $\epsilon_{ij}$.

The test procedure for the hypotheses in equation 12-4 is called the analysis of variance. The name "analysis of variance" results from partitioning total variability in the data into its component parts. The total corrected sum of squares, which is a measure of total variability in the data, may be written as

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2,$$   (12-5)

or

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2 + 2\sum_{i=1}^{a}\sum_{j=1}^{n} (\bar{y}_{i.} - \bar{y}_{..})(y_{ij} - \bar{y}_{i.}).$$   (12-6)

Note that the cross-product term in equation 12-6 is zero, since

$$\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.}) = y_{i.} - n\bar{y}_{i.} = y_{i.} - n(y_{i.}/n) = 0.$$

Therefore we have

$$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2.$$   (12-7)
Equation 12-7 shows that the total variability in the data, measured by the total corrected sum of squares, can be partitioned into a sum of squares of differences between treatment means and the grand mean and a sum of squares of differences of observations within treatments and the treatment mean. Differences between observed treatment means and the grand mean measure the differences between treatments, while differences of observations within a treatment from the treatment mean can be due only to random error. Therefore, we write equation 12-7 symbolically as

$$SS_T = SS_{\text{treatments}} + SS_E,$$

where $SS_T$ is the total sum of squares, $SS_{\text{treatments}}$ is the sum of squares due to treatments (i.e., between treatments), and $SS_E$ is the sum of squares due to error (i.e., within treatments). There are $an = N$ total observations; thus $SS_T$ has $N - 1$ degrees of freedom. There are $a$ levels of the factor, so $SS_{\text{treatments}}$ has $a - 1$ degrees of freedom. Finally, within any treatment there are $n$ replicates providing $n - 1$ degrees of freedom with which to estimate the experimental error. Since there are $a$ treatments, we have $a(n - 1) = an - a = N - a$ degrees of freedom for error.

Now consider the distributional properties of these sums of squares. Since we have assumed that the errors $\epsilon_{ij}$ are NID(0, $\sigma^2$), the observations $y_{ij}$ are NID($\mu + \tau_i$, $\sigma^2$). Thus $SS_T/\sigma^2$ is distributed as chi-square with $N - 1$ degrees of freedom, since $SS_T$ is a sum of squares in normal random variables. We may also show that $SS_{\text{treatments}}/\sigma^2$ is chi-square with $a - 1$ degrees of freedom, if $H_0$ is true, and $SS_E/\sigma^2$ is chi-square with $N - a$ degrees of freedom. However, all three sums of squares are not independent, since $SS_{\text{treatments}}$ and $SS_E$ add up to $SS_T$. The following theorem, which is a special form of one due to Cochran, is useful in developing the test procedure.

Theorem 12-1 (Cochran)
Let $Z_i$ be NID(0, 1) for $i = 1, 2, \ldots, v$ and let

$$\sum_{i=1}^{v} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s,$$

where $s \le v$ and $Q_i$ has $v_i$ degrees of freedom ($i = 1, 2, \ldots, s$). Then $Q_1, Q_2, \ldots, Q_s$ are independent chi-square random variables with $v_1, v_2, \ldots, v_s$ degrees of freedom, respectively, if and only if

$$v = v_1 + v_2 + \cdots + v_s.$$

Using this theorem, we note that the degrees of freedom for $SS_{\text{treatments}}$ and $SS_E$ add up to $N - 1$, so that $SS_{\text{treatments}}/\sigma^2$ and $SS_E/\sigma^2$ are independently distributed chi-square random variables. Therefore, under the null hypothesis, the statistic

$$F_0 = \frac{SS_{\text{treatments}}/(a-1)}{SS_E/(N-a)} = \frac{MS_{\text{treatments}}}{MS_E}$$   (12-8)

follows the $F_{a-1,\,N-a}$ distribution. The quantities $MS_{\text{treatments}}$ and $MS_E$ are mean squares. The expected values of the mean squares are used to show that $F_0$ in equation 12-8 is an appropriate test statistic for $H_0: \tau_i = 0$ and to determine the criterion for rejecting this null hypothesis. Consider
$$E(MS_E) = E\left[\frac{SS_E}{N-a}\right] = \frac{1}{N-a}E\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \bar{y}_{i.})^2\right].$$

Substituting the model, equation 12-1, into this equation we obtain

$$E(MS_E) = \frac{1}{N-a}E\left[\sum_{i=1}^{a}\sum_{j=1}^{n}(\epsilon_{ij} - \bar{\epsilon}_{i.})^2\right].$$

Now on squaring and taking the expectation of the quantities within brackets, we see that terms involving $\epsilon_{ij}^2$ and $\left(\sum_{j=1}^{n}\epsilon_{ij}\right)^2$ are replaced by $\sigma^2$ and $n\sigma^2$, respectively, because the errors are independent with mean zero. This yields

$$E(MS_E) = \sigma^2.$$

Using a similar approach, we may show that

$$E(MS_{\text{treatments}}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\tau_i^2}{a-1}.$$

From the expected mean squares we see that $MS_E$ is an unbiased estimator of $\sigma^2$. Also, under the null hypothesis, $MS_{\text{treatments}}$ is an unbiased estimator of $\sigma^2$. However, if the null hypothesis is false, then the expected value of $MS_{\text{treatments}}$ is greater than $\sigma^2$. Therefore, under the alternative hypothesis the expected value of the numerator of the test statistic (equation 12-8) is greater than the expected value of the denominator. Consequently, we should reject $H_0$ if the test statistic is large. This implies an upper-tail, one-tail critical region. Thus, we would reject $H_0$ if

$$F_0 > F_{\alpha,\,a-1,\,N-a},$$
where $F_0$ is computed from equation 12-8.

Efficient computational formulas for the sums of squares may be obtained by expanding and simplifying the definitions of $SS_{\text{treatments}}$ and $SS_T$ in equation 12-7. This yields

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} y_{ij}^2 - \frac{y_{..}^2}{N}$$   (12-9)

and

$$SS_{\text{treatments}} = \sum_{i=1}^{a}\frac{y_{i.}^2}{n} - \frac{y_{..}^2}{N}.$$   (12-10)

The error sum of squares is obtained by subtraction:

$$SS_E = SS_T - SS_{\text{treatments}}.$$   (12-11)
The test procedure is summarized in Table 12-3. This is called an analysis-of-variance table.

Example 12-1
Consider the hardwood concentration experiment described in Section 12-1.1. We can use the analysis of variance to test the hypothesis that different hardwood concentrations do not affect the mean tensile strength of the paper. The sums of squares for the analysis of variance are computed from equations 12-9, 12-10, and 12-11 as follows:

$$SS_T = (7)^2 + (8)^2 + \cdots + (20)^2 - \frac{(383)^2}{24} = 512.96,$$

$$SS_{\text{treatments}} = \frac{(60)^2 + (94)^2 + (102)^2 + (127)^2}{6} - \frac{(383)^2}{24} = 382.79,$$

$$SS_E = SS_T - SS_{\text{treatments}} = 512.96 - 382.79 = 130.17.$$

The analysis of variance is summarized in Table 12-4. Since $F_{0.01,\,3,\,20} = 4.94$, we reject $H_0$ and conclude that hardwood concentration in the pulp significantly affects the strength of the paper.
12-1.3 Estimation of the Model Parameters

It is possible to derive estimators for the parameters in the one-way analysis-of-variance model.
Table 12-3 Analysis of Variance for the One-Way-Classification Fixed-Effects Model

Source of Variation          Sum of Squares              Degrees of Freedom    Mean Square                 $F_0$
Between treatments           $SS_{\text{treatments}}$    $a-1$                 $MS_{\text{treatments}}$    $MS_{\text{treatments}}/MS_E$
Error (within treatments)    $SS_E$                      $N-a$                 $MS_E$
Total                        $SS_T$                      $N-1$
Table 12-4 Analysis of Variance for the Tensile Strength Data

Source of Variation        Sum of Squares    Degrees of Freedom    Mean Square    $F_0$
Hardwood concentration         382.79                3                127.60      19.61
Error                          130.17               20                  6.51
Total                          512.96               23
An appropriate estimation criterion is to estimate $\mu$ and $\tau_i$ such that the sum of the squares of the errors, or deviations, $\epsilon_{ij}$ is a minimum. This method of parameter estimation is called the method of least squares. In estimating $\mu$ and $\tau_i$ by least squares, the normality assumption on the errors $\epsilon_{ij}$ is not needed. To find the least-squares estimators of $\mu$ and $\tau_i$, we form the sum of squares of the errors
$$L = \sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \mu - \tau_i)^2$$   (12-12)

and find values of $\mu$ and $\tau_i$, say $\hat{\mu}$ and $\hat{\tau}_i$, that minimize $L$. The values $\hat{\mu}$ and $\hat{\tau}_i$ are the solutions to the $a + 1$ simultaneous equations

$$\frac{\partial L}{\partial \mu}\bigg|_{\hat{\mu},\hat{\tau}_i} = 0, \qquad \frac{\partial L}{\partial \tau_i}\bigg|_{\hat{\mu},\hat{\tau}_i} = 0, \qquad i = 1, 2, \ldots, a.$$

Differentiating equation 12-12 with respect to $\mu$ and $\tau_i$ and equating to zero, we obtain

$$-2\sum_{i=1}^{a}\sum_{j=1}^{n}(y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0$$

and

$$-2\sum_{j=1}^{n}(y_{ij} - \hat{\mu} - \hat{\tau}_i) = 0, \qquad i = 1, 2, \ldots, a.$$

After simplification these equations become

$$N\hat{\mu} + n\hat{\tau}_1 + n\hat{\tau}_2 + \cdots + n\hat{\tau}_a = y_{..},$$
$$n\hat{\mu} + n\hat{\tau}_1 = y_{1.},$$
$$n\hat{\mu} + n\hat{\tau}_2 = y_{2.},$$
$$\vdots$$
$$n\hat{\mu} + n\hat{\tau}_a = y_{a.}.$$   (12-13)

Equations 12-13 are called the least-squares normal equations. Notice that if we add the last $a$ normal equations we obtain the first normal equation. Therefore, the normal equations are not linearly independent, and there are no unique estimates for $\mu, \tau_1, \tau_2, \ldots, \tau_a$. One way to overcome this difficulty is to impose a constraint on the solution to the normal equations. There are many ways to choose this constraint. Since we have defined the treatment effects as deviations from the overall mean, it seems reasonable to apply the constraint

$$\sum_{i=1}^{a}\hat{\tau}_i = 0.$$   (12-14)

Using this constraint, we obtain as the solution to the normal equations

$$\hat{\mu} = \bar{y}_{..}, \qquad \hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}, \qquad i = 1, 2, \ldots, a.$$   (12-15)
This solution has considerable intuitive appeal, since the overall mean is estimated by the grand average of the observations and the estimate of any treatment effect is just the difference between the treatment average and the grand average.

This solution is obviously not unique, because it depends on the constraint (equation 12-14) that we have chosen. At first this may seem unfortunate, because two different experimenters could analyze the same data and obtain different results if they apply different constraints. However, certain functions of the model parameters are estimated uniquely, regardless of the constraint. Some examples are $\tau_i - \tau_j$, which would be estimated by $\hat{\tau}_i - \hat{\tau}_j = \bar{y}_{i.} - \bar{y}_{j.}$, and $\mu + \tau_i$, which would be estimated by $\hat{\mu} + \hat{\tau}_i = \bar{y}_{i.}$. Since we are usually interested in differences in the treatment effects rather than their actual values, it causes no concern that the $\tau_i$ cannot be estimated uniquely. In general, any function of the model parameters that is a linear combination of the left-hand side of the normal equations can be estimated uniquely. Functions that are uniquely estimated, regardless of which constraint is used, are called estimable functions.

Frequently, we would like to construct a confidence interval for the $i$th treatment mean. The mean of the $i$th treatment is

$$\mu_i = \mu + \tau_i, \qquad i = 1, 2, \ldots, a.$$

A point estimator of $\mu_i$ would be $\hat{\mu}_i = \hat{\mu} + \hat{\tau}_i = \bar{y}_{i.}$. Now, if we assume that the errors are normally distributed, each $\bar{y}_{i.}$ is NID($\mu_i$, $\sigma^2/n$). Using $MS_E$ as an estimator of $\sigma^2$, we can base the confidence interval on the $t$ distribution. Therefore, a 100(1 − $\alpha$)% confidence interval on the $i$th treatment mean $\mu_i$ is

$$\bar{y}_{i.} - t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}} \le \mu_i \le \bar{y}_{i.} + t_{\alpha/2,\,N-a}\sqrt{\frac{MS_E}{n}}.$$   (12-16)

A 100(1 − $\alpha$)% confidence interval on the difference between any two treatment means, say $\mu_i - \mu_j$, is
$$\bar{y}_{i.} - \bar{y}_{j.} - t_{\alpha/2,\,N-a}\sqrt{\frac{2MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + t_{\alpha/2,\,N-a}\sqrt{\frac{2MS_E}{n}}.$$

Example 12-2
We can use the results given previously to estimate the mean tensile strengths at different levels of hardwood concentration for the experiment in Section 12-1.1. The mean tensile strength estimates are
$$\bar{y}_{1.} = \hat{\mu}_{5\%} = 10.00 \text{ psi},$$
$$\bar{y}_{2.} = \hat{\mu}_{10\%} = 15.67 \text{ psi},$$
$$\bar{y}_{3.} = \hat{\mu}_{15\%} = 17.00 \text{ psi},$$
$$\bar{y}_{4.} = \hat{\mu}_{20\%} = 21.17 \text{ psi}.$$

A 95% confidence interval on the mean tensile strength at 20% hardwood is found from equation 12-16 as follows:

$$\left[21.17 \pm (2.086)\sqrt{6.51/6}\right],$$
$$\left[21.17 \pm 2.17\right].$$
The desired confidence interval is

$$19.00 \text{ psi} \le \mu_{20\%} \le 23.34 \text{ psi}.$$

Visual examination of the data suggests that mean tensile strength at 10% and 15% hardwood is similar. A confidence interval on the difference in means $\mu_{15\%} - \mu_{10\%}$ is

$$\left[17.00 - 15.67 \pm (2.086)\sqrt{2(6.51)/6}\right],$$
$$\left[1.33 \pm 3.07\right].$$

Thus, the confidence interval on $\mu_{15\%} - \mu_{10\%}$ is $-1.74 \le \mu_{15\%} - \mu_{10\%} \le 4.40$. Since the confidence interval includes zero, we would conclude that there is no difference in mean tensile strength at these two particular hardwood levels.
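The interval arithmetic in Example 12-2 can be sketched directly in Python; the tabled value $t_{0.025,20} = 2.086$ is hardcoded from a $t$ table rather than computed, and the variable names are ours:

```python
from math import sqrt

MS_E, n = 6.51, 6
t_crit = 2.086                     # t_{0.025, 20}, tabled value (assumed input)

# 95% CI on the 20% hardwood mean, equation 12-16:
ybar_4 = 21.17
hw = t_crit * sqrt(MS_E / n)
ci_mean = (ybar_4 - hw, ybar_4 + hw)

# 95% CI on the difference mu_15% - mu_10%:
diff = 17.00 - 15.67
hw_diff = t_crit * sqrt(2 * MS_E / n)
ci_diff = (diff - hw_diff, diff + hw_diff)
# ci_diff straddles zero, matching the conclusion in the text.
```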
12-1.4 Residual Analysis and Model Checking

The one-way-model analysis of variance assumes that the observations are normally and independently distributed, with the same variance in each treatment or factor level. These assumptions should be checked by examining the residuals. We define a residual as $e_{ij} = y_{ij} - \bar{y}_{i.}$, that is, the difference between an observation and the corresponding treatment mean. The residuals for the hardwood percentage experiment are shown in Table 12-5.

The normality assumption can be checked by plotting the residuals on normal probability paper. To check the assumption of equal variances at each factor level, plot the residuals against the factor levels and compare the spread in the residuals. It is also useful to plot the residuals against $\bar{y}_{i.}$ (sometimes called the fitted value); the variability in the residuals should not depend in any way on the value of $\bar{y}_{i.}$. When a pattern appears in these plots, it usually suggests the need for transformation, that is, analyzing the data in a different metric. For example, if the variability in the residuals increases with $\bar{y}_{i.}$, then a transformation such as $\log y$ or $\sqrt{y}$ should be considered. In some problems the dependency of residual scatter on $\bar{y}_{i.}$ is very important information. It may be desirable to select the factor level that results in maximum $\bar{y}_{i.}$; however, this level may also cause more variation in $y$ from run to run.

The independence assumption can be checked by plotting the residuals against the time or run order in which the experiment was performed. A pattern in this plot, such as sequences of positive and negative residuals, may indicate that the observations are not independent. This suggests that time or run order is important, or that variables that change over time are important and have not been included in the experimental design.
A normal probability plot of the residuals from the hardwood concentration experiment is shown in Fig. 12-2. Figures 12-3 and 12-4 present the residuals plotted against the treatment number and the fitted value $\bar{y}_{i.}$. These plots do not reveal any model inadequacy or unusual problem with the assumptions.

Table 12-5 Residuals for the Tensile Strength Experiment

Hardwood
Concentration                    Residuals
5%        -3.00    -2.00     5.00     1.00    -1.00     0.00
10%       -3.67     1.33    -2.67     2.33     3.33    -0.67
15%       -3.00     1.00     2.00     0.00    -1.00     1.00
20%       -2.17     3.83     0.83     1.83    -3.17    -1.17
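The entries of Table 12-5 follow directly from the definition $e_{ij} = y_{ij} - \bar{y}_{i.}$; a minimal Python sketch (the data are from Table 12-1, the helper name is ours):

```python
# Observations from Table 12-1, one list per hardwood concentration.
data = {
    "5%": [7, 8, 15, 11, 9, 10],
    "10%": [12, 17, 13, 18, 19, 15],
    "15%": [14, 18, 19, 17, 16, 18],
    "20%": [19, 25, 22, 23, 18, 20],
}

def residuals(group):
    """e_ij = y_ij - ybar_i. for one treatment."""
    ybar = sum(group) / len(group)
    return [y - ybar for y in group]

res = {level: residuals(obs) for level, obs in data.items()}
# e.g. the 5% row (treatment mean 10.0) gives [-3.0, -2.0, 5.0, 1.0, -1.0, 0.0]
```

Note that the residuals within each treatment always sum to zero, a direct consequence of subtracting the treatment mean.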
Figure 12-2 Normal probability plot of residuals from the hardwood concentration experiment.

Figure 12-3 Plot of residuals vs. treatment.

Figure 12-4 Plot of residuals vs. $\bar{y}_{i.}$.

12-1.5 An Unbalanced Design

In some single-factor experiments the number of observations taken under each treatment may be different. We then say that the design is unbalanced. The analysis of variance described earlier is still valid, but slight modifications must be made in the sums of squares formulas. Let $n_i$ observations be taken under treatment $i$ ($i = 1, 2, \ldots, a$), and let the total
number of observations $N = \sum_{i=1}^{a} n_i$. The computational formulas for $SS_T$ and $SS_{\text{treatments}}$ become

$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{y_{..}^2}{N}$$

and

$$SS_{\text{treatments}} = \sum_{i=1}^{a}\frac{y_{i.}^2}{n_i} - \frac{y_{..}^2}{N}.$$

In solving the normal equations, the constraint $\sum_{i=1}^{a} n_i\hat{\tau}_i = 0$ is used. No other changes are required in the analysis of variance.

There are two important advantages in choosing a balanced design. First, the test statistic is relatively insensitive to small departures from the assumption of equality of variances if the sample sizes are equal. This is not the case for unequal sample sizes. Second, the power of the test is maximized if the samples are of equal size.
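The unbalanced formulas differ from equations 12-9 and 12-10 only in that each treatment total is divided by its own $n_i$. A sketch using a small made-up data set (two treatments with 3 and 2 observations, purely illustrative and not from the text):

```python
from math import fsum

# Hypothetical unbalanced data: n_1 = 3 and n_2 = 2 observations.
groups = [[3, 5, 4], [6, 8]]

def unbalanced_ss(groups):
    """SS_T and SS_treatments for an unbalanced one-way design."""
    N = sum(len(g) for g in groups)
    grand = fsum(fsum(g) for g in groups)                       # y..
    correction = grand ** 2 / N                                 # y..^2 / N
    ss_total = fsum(y * y for g in groups for y in g) - correction
    ss_treat = fsum(fsum(g) ** 2 / len(g) for g in groups) - correction
    return ss_total, ss_treat, ss_total - ss_treat              # SS_E by subtraction

ss_total, ss_treat, ss_error = unbalanced_ss(groups)
```

With equal group sizes this reduces exactly to the balanced formulas.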
12-2 TESTS ON INDIVIDUAL TREATMENT MEANS

12-2.1 Orthogonal Contrasts

Rejecting the null hypothesis in the fixed-effects-model analysis of variance implies that there are differences between the $a$ treatment means, but the exact nature of the differences is not specified. In this situation, further comparisons between groups of treatment means may be useful. The $i$th treatment mean is defined as $\mu_i = \mu + \tau_i$, and $\mu_i$ is estimated by $\bar{y}_{i.}$. Comparisons between treatment means are usually made in terms of the treatment totals $\{y_{i.}\}$.

Consider the hardwood concentration experiment presented in Section 12-1.1. Since the hypothesis $H_0: \tau_i = 0$ was rejected, we know that some hardwood concentrations produce tensile strengths different from others, but which ones actually cause this difference?
We might suspect at the outset of the experiment that hardwood concentrations 3 and 4 produce the same tensile strength, implying that we would like to test the hypothesis

$$H_0: \mu_3 = \mu_4,$$
$$H_1: \mu_3 \neq \mu_4.$$

This hypothesis could be tested by using a linear combination of treatment totals, say

$$y_{3.} - y_{4.} = 0.$$

If we had suspected that the average of hardwood concentrations 1 and 3 did not differ from the average of hardwood concentrations 2 and 4, then the hypothesis would have been

$$H_0: \mu_1 + \mu_3 = \mu_2 + \mu_4,$$
$$H_1: \mu_1 + \mu_3 \neq \mu_2 + \mu_4,$$

which implies the linear combination of treatment totals

$$y_{1.} + y_{3.} - y_{2.} - y_{4.} = 0.$$

In general, the comparison of treatment means of interest will imply a linear combination of treatment totals such as

$$C = \sum_{i=1}^{a} c_i y_{i.},$$

with the restriction that $\sum_{i=1}^{a} c_i = 0$. These linear combinations are called contrasts. The sum of squares for any contrast is

$$SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{n\sum_{i=1}^{a} c_i^2}$$   (12-18)

and has a single degree of freedom. If the design is unbalanced, then the comparison of treatment means requires that $\sum_{i=1}^{a} n_i c_i = 0$, and equation 12-18 becomes

$$SS_C = \frac{\left(\sum_{i=1}^{a} c_i y_{i.}\right)^2}{\sum_{i=1}^{a} n_i c_i^2}.$$   (12-19)

A contrast is tested by comparing its sum of squares to the mean square error. The resulting statistic would be distributed as $F$, with 1 and $N - a$ degrees of freedom.

A very important special case of the above procedure is that of orthogonal contrasts. Two contrasts with coefficients $\{c_i\}$ and $\{d_i\}$ are orthogonal if

$$\sum_{i=1}^{a} c_i d_i = 0,$$

or, for an unbalanced design, if

$$\sum_{i=1}^{a} n_i c_i d_i = 0.$$
For $a$ treatments a set of $a - 1$ orthogonal contrasts will partition the sum of squares due to treatments into $a - 1$ independent single-degree-of-freedom components. Thus, tests performed on orthogonal contrasts are independent.

There are many ways to choose the orthogonal contrast coefficients for a set of treatments. Usually, something in the nature of the experiment should suggest which comparisons will be of interest. For example, if there are $a = 3$ treatments, with treatment 1 a "control" and treatments 2 and 3 actual levels of the factor of interest to the experimenter, then appropriate orthogonal contrasts might be as follows:

Treatment        Orthogonal Contrasts
1 (control)          -2         0
2 (level 1)           1        -1
3 (level 2)           1         1

Note that contrast 1 with $c_i = -2, 1, 1$ compares the average effect of the factor with the control, while contrast 2 with $d_i = 0, -1, 1$ compares the two levels of the factor of interest.

Contrast coefficients must be chosen prior to running the experiment, for if these comparisons are selected after examining the data, most experimenters would construct tests that compare large observed differences in means. These large differences could be due to the presence of real effects or they could be due to random error. If experimenters always pick the largest differences to compare, they will inflate the type I error of the test, since it is likely that in an unusually high percentage of the comparisons selected the observed differences will be due to error.
Example 12-3
Consider the hardwood concentration experiment. There are four levels of hardwood concentration, and the possible sets of comparisons between these means and the associated orthogonal comparisons are

$$H_0: \mu_1 + \mu_4 = \mu_2 + \mu_3, \qquad C_1 = y_{1.} - y_{2.} - y_{3.} + y_{4.},$$
$$H_0: 3\mu_1 + \mu_2 = \mu_3 + 3\mu_4, \qquad C_2 = -3y_{1.} - y_{2.} + y_{3.} + 3y_{4.},$$
$$H_0: \mu_1 + 3\mu_3 = 3\mu_2 + \mu_4, \qquad C_3 = -y_{1.} + 3y_{2.} - 3y_{3.} + y_{4.}.$$

Notice that the contrast constants are orthogonal. Using the data from Table 12-1, we find the numerical values of the contrasts and the sums of squares as follows:

$$C_1 = 60 - 94 - 102 + 127 = -9, \qquad SS_{C_1} = \frac{(-9)^2}{6(4)} = 3.38,$$

$$C_2 = -3(60) - 94 + 102 + 3(127) = 209, \qquad SS_{C_2} = \frac{(209)^2}{6(20)} = 364.00,$$

$$C_3 = -60 + 3(94) - 3(102) + 127 = 43, \qquad SS_{C_3} = \frac{(43)^2}{6(20)} = 15.41.$$

These contrast sums of squares completely partition the treatment sum of squares; that is, $SS_{\text{treatments}} = SS_{C_1} + SS_{C_2} + SS_{C_3} = 382.79$. These tests on the contrasts are usually incorporated into the analysis of variance, such as shown in Table 12-6. From this analysis, we conclude that there are significant differences between hardwood concentrations 1, 2 vs. 3, 4, but that the average of 1 and 4 does not differ from the average of 2 and 3, nor does the average of 1 and 3 differ from the average of 2 and 4.
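The contrast computations of Example 12-3 (equation 12-18), including an explicit orthogonality check, can be sketched as follows; the helper names are ours:

```python
from itertools import combinations

totals = [60, 94, 102, 127]        # treatment totals y_i. from Table 12-1
n = 6                              # replicates per treatment

contrast_coeffs = [
    [1, -1, -1, 1],                # C1: mu1 + mu4 vs. mu2 + mu3
    [-3, -1, 1, 3],                # C2: 3*mu1 + mu2 vs. mu3 + 3*mu4
    [-1, 3, -3, 1],                # C3: mu1 + 3*mu3 vs. 3*mu2 + mu4
]

# Every pair of contrasts must be orthogonal: sum_i c_i * d_i = 0.
for c, d in combinations(contrast_coeffs, 2):
    assert sum(ci * di for ci, di in zip(c, d)) == 0

def contrast_ss(c, totals, n):
    """Value of a contrast and its sum of squares, equation 12-18."""
    value = sum(ci * yi for ci, yi in zip(c, totals))
    return value, value ** 2 / (n * sum(ci * ci for ci in c))

results = [contrast_ss(c, totals, n) for c in contrast_coeffs]
# The three single-degree-of-freedom sums of squares add back to SS_treatments.
```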
Table 12-6 Analysis of Variance for the Tensile Strength Data

Source of Variation        Sum of Squares    Degrees of Freedom    Mean Square    $F_0$
Hardwood concentration         382.79                3                127.60      19.61
  $C_1$                          3.38                1                  3.38       0.52
  $C_2$                        364.00                1                364.00      55.91
  $C_3$                         15.41                1                 15.41       2.37
Error                          130.17               20                  6.51
Total                          512.96               23

12-2.2 Tukey's Test

Often
analysts do not know in advance how to construct appropriate orthogonal contrasts, or they may wish to test more than $a - 1$ comparisons using the same data. For example, analysts may want to test all possible pairs of means. The null hypotheses would then be $H_0: \mu_i = \mu_j$ for all $i \neq j$. If we test all possible pairs of means using $t$-tests, the probability of committing a type I error for the entire set of comparisons can be greatly increased.
There are several procedures available that avoid this problem. Among the more popular of these procedures are the Newman-Keuls test [Newman (1939); Keuls (1952)], Duncan's multiple range test [Duncan (1955)], and Tukey's test [Tukey (1953)]. Here we describe Tukey's test.

Tukey's procedure makes use of another distribution, called the Studentized range distribution. The Studentized range statistic is

$$q = \frac{\bar{y}_{\max} - \bar{y}_{\min}}{\sqrt{MS_E/n}},$$

where $\bar{y}_{\max}$ is the largest sample mean and $\bar{y}_{\min}$ is the smallest sample mean out of $p$ sample means. Let $q_\alpha(a, f)$ represent the upper $\alpha$ percentage point of $q$, where $a$ is the number of treatments and $f$ is the number of degrees of freedom for error. Two means, $\bar{y}_{i.}$ and $\bar{y}_{j.}$ ($i \neq j$), are considered significantly different if

$$|\bar{y}_{i.} - \bar{y}_{j.}| > T_\alpha,$$

where

$$T_\alpha = q_\alpha(a, f)\sqrt{\frac{MS_E}{n}}.$$   (12-20)

Table XII (Appendix) contains values of $q_\alpha(a, f)$ for $\alpha = 0.05$ and 0.01 and a selection of values for $a$ and $f$. Tukey's procedure has the property that the overall significance level is exactly $\alpha$ for equal sample sizes and at most $\alpha$ for unequal sample sizes.
Example 12-4
We will apply Tukey's test to the hardwood concentration experiment. Recall that there are $a = 4$ means, $n = 6$, and $MS_E = 6.51$. The treatment means are

$$\bar{y}_{1.} = 10.00 \text{ psi}, \qquad \bar{y}_{2.} = 15.67 \text{ psi}, \qquad \bar{y}_{3.} = 17.00 \text{ psi}, \qquad \bar{y}_{4.} = 21.17 \text{ psi}.$$

From Table XII (Appendix), with $\alpha = 0.05$, $a = 4$, and $f = 20$, we find $q_{0.05}(4, 20) = 3.96$.
Using equation 12-20,

$$T_{0.05} = q_{0.05}(4, 20)\sqrt{\frac{MS_E}{n}} = 3.96\sqrt{\frac{6.51}{6}} = 4.12.$$

Therefore, we would conclude that two means are significantly different if

$$|\bar{y}_{i.} - \bar{y}_{j.}| > 4.12.$$

For example, $|\bar{y}_{3.} - \bar{y}_{4.}| = |17.00 - 21.17| = 4.17$ exceeds this threshold. From this analysis, we see significant differences between all pairs of means except 2 and 3. It may be of use to draw a graph of the treatment means, such as Fig. 12-5, with the means that are not different underlined.
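The pairwise screening in Example 12-4 reduces to comparing each mean difference against $T_{0.05}$; a minimal sketch, again with the tabled value $q_{0.05}(4, 20) = 3.96$ hardcoded rather than computed:

```python
from math import sqrt
from itertools import combinations

means = {1: 10.00, 2: 15.67, 3: 17.00, 4: 21.17}   # ybar_i. from Table 12-1
MS_E, n = 6.51, 6
q_crit = 3.96                                       # q_0.05(4, 20), tabled value

T = q_crit * sqrt(MS_E / n)                         # equation 12-20

different, not_different = [], []
for i, j in combinations(sorted(means), 2):
    (different if abs(means[i] - means[j]) > T else not_different).append((i, j))
# Only the pair (2, 3), with |15.67 - 17.00| = 1.33, fails to clear T.
```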
Simultaneous confidence intervals can also be constructed on the differences in pairs of means using the Tukey approach. It can be shown that

$$\bar{y}_{i.} - \bar{y}_{j.} - q_\alpha(a, f)\sqrt{\frac{MS_E}{n}} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + q_\alpha(a, f)\sqrt{\frac{MS_E}{n}}$$

when sample sizes are equal. This expression represents a 100(1 − $\alpha$)% simultaneous confidence interval on all pairs of means $\mu_i - \mu_j$. If the sample sizes are unequal, the 100(1 − $\alpha$)% simultaneous confidence interval on all pairs of means $\mu_i - \mu_j$ is given by

$$\bar{y}_{i.} - \bar{y}_{j.} - \frac{q_\alpha(a, f)}{\sqrt{2}}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} \le \mu_i - \mu_j \le \bar{y}_{i.} - \bar{y}_{j.} + \frac{q_\alpha(a, f)}{\sqrt{2}}\sqrt{MS_E\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}.$$

Interpretation of the confidence intervals is straightforward. If zero is contained in an interval, then there is no significant difference between the two means at the $\alpha$ significance level. It should be noted that the significance level, $\alpha$, in Tukey's multiple comparison procedure represents an experimentwise error rate. With respect to confidence intervals, $\alpha$ represents the probability that one or more of the confidence intervals on the pairwise differences will not contain the true difference for equal sample sizes (when sample sizes are unequal, this probability becomes at most $\alpha$).
Figure 12-5 Results of Tukey's test.
12-3 THE RANDOM-EFFECTS MODEL

In many situations, the factor of interest has a large number of possible levels. The analyst is interested in drawing conclusions about the entire population of factor levels. If the experimenter randomly selects $a$ of these levels from the population of factor levels, then we say that the factor is a random factor. Because the levels of the factor actually used in the experiment were chosen randomly, the conclusions reached will be valid for the entire population of factor levels. We will assume that the population of factor levels is either of infinite size or is large enough to be considered infinite. The linear statistical model is

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n,$$   (12-21)

where $\tau_i$ and $\epsilon_{ij}$ are independent random variables. Note that the model is identical in structure to the fixed-effects case, but the parameters have a different interpretation. If the variance of $\tau_i$ is $\sigma_\tau^2$, then the variance of any observation is

$$V(y_{ij}) = \sigma_\tau^2 + \sigma^2.$$

The variances $\sigma_\tau^2$ and $\sigma^2$ are called variance components, and the model, equation 12-21, is called the components-of-variance or the random-effects model. To test hypotheses using this model, we require that the $\{\epsilon_{ij}\}$ are NID(0, $\sigma^2$), that the $\{\tau_i\}$ are NID(0, $\sigma_\tau^2$), and that $\tau_i$ and $\epsilon_{ij}$ are independent. The assumption that the $\{\tau_i\}$ are independent random variables implies that the usual assumption $\sum_{i=1}^{a}\tau_i = 0$ from the fixed-effects model does not apply to the random-effects model. The sum of squares identity
(12-22)
still holds. That is~ we partition the total 'Vmiability in the observations into a component that measures variation between treatments (SSI.."tlI;mI:1lJ and a component that measures variation within treatments (S5E)' However, instead of testing hypotheses about individual treatment effects) we test the hypotheses
If 0, all treatments are identical. blo1 if U; > 0, t.'len there is variability between treatments, 'The quantity SSE/a' is distributed as chi-square with N - a degrees of freedom, and under the null hypothesis, SS~enl$/cf is distributed as chi-square with a 1 degrees of freedom, Further, the random variables are independent of each other. Thus, under the null hypothesis, the ratio SS"""""",/( a - I) SSE/eN -a)
(12-23)
is distributed as F with a - I and N - a degrees of freedom, By examining the expected mean squares we can deten.nme the critical region for this statistic, Consider
L
338
Chapter 12 The Analysis of Variance
If we square and take the expectation of the quantities in brackets; we see that terms involving -r: are replaced by as E(r;) O. Also, terms involving I7", 1~' L~.. lL;", I€~' and L;.IL;~ I ~ are replaced by nO". 000". and an 0;, respectively. Finally, all cross-product tenns involving ~i and €ijhave zero expectation. This leads to
d!.
or E(MS=Il,:) =
(J1
+ nd;.
(12-24)
A similar approach will show that E(MS,) = (Jr,
(12-25)
From the expected mean squares. we see that if Ho is true, both the numerator and the denominator of the test statistic, equation 12-23, are unbiased estimators of 0-2; whereas if HI is true, the expected value of the numerator is greater than the expected value of the denominator. Therefore, we should rejectHo for values of Fo that are too large. This implies an upper~tail, one-tail critical region, so we reject Ho if Fo > Fa. a_ I,N_,:' The computational procedure and analysis-of-variance table for the random-effects model are identical to the fixed~effects case. The conclusions., however, are quite different because they apply to the entire population of treatments. We usually need to estimate the \'ariance components (0-2 and ~) in the model. The procedure used to estimate 0-2 and a! is called the "analysis..of-variance method," because it uses the lines in the analysis-of-variance table. It does not require the normality assumption on the observations. The procedure consists of equating the expected mean squares to their observed values in the analysis-of-variance table and solving for the variance components. When equating observed and expected mean squares in the one-way-classification random-effects model, we obtain
MS_treatments = σ² + nσ_τ²   and   MS_E = σ².

Therefore, the estimators of the variance components are

σ̂² = MS_E   (12-26)

and

σ̂_τ² = (MS_treatments − MS_E)/n.   (12-27)
For unequal sample sizes, replace n in equation 12-27 with
n₀ = [1/(a − 1)] [ Σ_{i=1}^{a} n_i − ( Σ_{i=1}^{a} n_i² ) / ( Σ_{i=1}^{a} n_i ) ].
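As a quick arithmetic check, the formula for n₀ can be evaluated directly (a small sketch; the sample sizes used below are illustrative, not from the text):

```python
# Average group size n0 used in place of n in equation 12-27 when the
# a treatments have unequal numbers of observations n_i.
def n_zero(sizes):
    a = len(sizes)
    total = sum(sizes)
    return (total - sum(s * s for s in sizes) / total) / (a - 1)

# Hypothetical sample sizes for a = 3 treatments (illustration only):
print(n_zero([4, 5, 6]))   # slightly below the plain average of 5
print(n_zero([5, 5, 5]))   # reduces to n = 5 when sizes are equal
```

Note that when all sample sizes are equal, n₀ reduces to the common sample size n, as it must.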
Sometimes the analysis-of-variance method produces a negative estimate of a variance component. Since variance components are by definition nonnegative, a negative estimate of a variance component is unsettling. One course of action is to accept the estimate and use it as evidence that the true value of the variance component is zero, assuming that sampling variation led to the negative estimate. While this has intuitive appeal, it will disturb the statistical properties of other estimates. Another alternative is to reestimate the negative variance component with a method that always yields nonnegative estimates. Still another possibility is to consider the negative estimate as evidence that the assumed linear model is incorrect, requiring that a study of the model and its assumptions be made to find a more appropriate model.
Example 12-5. In his book Design and Analysis of Experiments (2001), D. C. Montgomery describes a single-factor experiment involving the random-effects model. A textile manufacturing company weaves a fabric on a large number of looms. The company is interested in loom-to-loom variability in tensile strength. To investigate this, a manufacturing engineer selects four looms at random and makes four strength determinations on fabric samples chosen at random from each loom. The data are shown in Table 12-7, and the analysis of variance is summarized in Table 12-8. From the analysis of variance, we conclude that the looms in the plant differ significantly in their ability to produce fabric of uniform strength. The variance components are estimated by σ̂² = 1.90 and

σ̂_τ² = (29.73 − 1.90)/4 = 6.96.

Therefore, the variance of strength in the manufacturing process is estimated by

σ̂_τ² + σ̂² = 6.96 + 1.90 = 8.86.

Most of this variability is attributable to differences between looms.
Table 12-7 Strength Data for Example 12-5

Loom    Observations         Totals    Averages
1       98  97  99  96       390       97.5
2       91  90  93  92       366       91.5
3       96  95  97  95       383       95.8
4       95  96  99  98       388       97.0
                             1527      95.4
Table 12-8 Analysis of Variance for the Strength Data

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F0
Looms                  89.19             3                     29.73          15.68
Error                  22.75             12                    1.90
Total                  111.94            15
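The whole computation for Example 12-5 can be reproduced from the Table 12-7 data in a few lines (a sketch using NumPy; the quantities mirror equations 12-26 and 12-27):

```python
import numpy as np

# Strength data from Table 12-7, one row per loom (a = 4 looms, n = 4 obs each).
y = np.array([
    [98, 97, 99, 96],
    [91, 90, 93, 92],
    [96, 95, 97, 95],
    [95, 96, 99, 98],
], dtype=float)

a, n = y.shape
N = a * n
grand = y.sum()
ss_total = (y ** 2).sum() - grand ** 2 / N
ss_treat = (y.sum(axis=1) ** 2).sum() / n - grand ** 2 / N
ss_error = ss_total - ss_treat

ms_treat = ss_treat / (a - 1)
ms_error = ss_error / (N - a)
f0 = ms_treat / ms_error

# Analysis-of-variance estimators of the variance components
# (equations 12-26 and 12-27):
sigma2_hat = ms_error
sigma2_tau_hat = (ms_treat - ms_error) / n

print(round(ss_treat, 2), round(ss_error, 2), round(f0, 2))  # 89.19 22.75 15.68
print(round(sigma2_hat, 2), round(sigma2_tau_hat, 2))        # 1.9 6.96
```

The printed values agree with Tables 12-7 and 12-8 and with the variance-component estimates given in the example.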
This example illustrates an important application of the analysis of variance: the isolation of different sources of variability in a manufacturing process. Problems of excessive variability in critical functional parameters or properties frequently arise in quality-improvement programs. For example, in the previous fabric-strength example, the process mean is estimated by ȳ.. = 95.45 psi and the process standard deviation is estimated by σ̂_y = √(σ̂_τ² + σ̂²) = √8.86 = 2.98 psi. If strength is approximately normally distributed, this would imply a distribution of strength in the outgoing product that looks like the normal distribution shown in Fig. 12-6a. If the lower specification limit (LSL) on strength is at 90 psi, then a substantial proportion of the process output is fallout; that is, scrap or defective material that must be sold as second quality, and so on. This fallout is directly related to the excess variability resulting from differences between looms. Variability in loom performance could be caused by faulty setup, poor maintenance, inadequate supervision, poorly trained operators, and so forth. The engineer or manager responsible for quality improvement must identify and remove these sources of variability from the process. If he can do this, then strength variability will be greatly reduced, perhaps as low as σ̂_y = √1.90 = 1.38 psi, as shown in Fig. 12-6b. In this improved process, reducing the variability in strength has greatly reduced the fallout. This will result in lower cost, higher quality, a more satisfied customer, and an enhanced competitive position for the company.
Figure 12-6 The distribution of fabric strength. (a) Current process. (b) Improved process.
12-4 THE RANDOMIZED BLOCK DESIGN

12-4.1 Design and Statistical Analysis

In many experimental problems it is necessary to design the experiment so that variability arising from nuisance variables can be controlled. As an example, recall the situation in Example 11-17, where two different procedures were used to predict the shear strength of steel plate girders. Because each girder has potentially different strength, and because this variability in strength was not of direct interest, we designed the experiment by using the two methods on each girder and comparing the difference in average strength readings to zero with the paired t-test. The paired t-test is a procedure for comparing two means when all experimental runs cannot be made under homogeneous conditions. Thus, the paired t-test reduces the noise in the experiment by blocking out the effect of a nuisance variable. The randomized block design is an extension of the paired t-test that is used in situations where the factor of interest has more than two levels.

As an example, suppose that we wish to compare the effect of four different chemicals on the strength of a particular fabric. It is known that the effect of these chemicals varies considerably from one fabric specimen to another. In this example, we have only one factor: chemical type. Therefore, we could select several pieces of fabric and compare all four chemicals within the relatively homogeneous conditions provided by each piece of fabric. This would remove any variation due to the fabric.

The general procedure for a randomized complete block design consists of selecting b blocks and running a complete replicate of the experiment in each block. A randomized complete block design for investigating a single factor with a levels would appear as in Fig. 12-7. There will be a observations (one per factor level) in each block, and the order in which these observations are run is randomly assigned within the block.
We will now describe the statistical analysis for a randomized block design. Suppose that a single factor with a levels is of interest, and the experiment is run in b blocks, as shown in Fig. 12-7. The observations may be represented by the linear statistical model

y_ij = μ + τ_i + β_j + ε_ij,   i = 1, 2, ..., a,   j = 1, 2, ..., b,   (12-28)
where μ is an overall mean, τ_i is the effect of the ith treatment, β_j is the effect of the jth block, and ε_ij is the usual NID(0, σ²) random error term. Treatments and blocks will be considered initially as fixed factors. Furthermore, the treatment and block effects are defined as deviations from the overall mean, so that Σ_{i=1}^{a} τ_i = 0 and Σ_{j=1}^{b} β_j = 0. We are interested in testing the equality of the treatment effects. That is,

H0: τ1 = τ2 = ... = τa = 0,
H1: τi ≠ 0 for at least one i.

Figure 12-7 The randomized complete block design (blocks 1 through b, each containing one observation y_ij at every factor level).
Let y_i. be the total of all observations taken under treatment i, let y_.j be the total of all observations in block j, let y.. be the grand total of all observations, and let N = ab be the total number of observations. Similarly, ȳ_i. is the average of the observations taken under treatment i, ȳ_.j is the average of the observations in block j, and ȳ.. is the grand average of all observations. The total corrected sum of squares is

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{b} (y_ij − ȳ..)².   (12-29)

Expanding the right-hand side of equation 12-29 and simplifying yields

SS_T = b Σ_{i=1}^{a} (ȳ_i. − ȳ..)² + a Σ_{j=1}^{b} (ȳ_.j − ȳ..)² + Σ_{i=1}^{a} Σ_{j=1}^{b} (y_ij − ȳ_i. − ȳ_.j + ȳ..)²,   (12-30)

or

SS_T = SS_treatments + SS_blocks + SS_E.   (12-31)

The degrees-of-freedom breakdown corresponding to equation 12-31 is

ab − 1 = (a − 1) + (b − 1) + (a − 1)(b − 1).   (12-32)
The null hypothesis of no treatment effects (H0: τ_i = 0) is tested by the F ratio F0 = MS_treatments/MS_E. The analysis of variance is summarized in Table 12-9. Computing formulas for the sums of squares are also shown in this table. The same test procedure is used in cases where treatments and/or blocks are random.
Example 12-6. An experiment was performed to determine the effect of four different chemicals on the strength of a fabric. These chemicals are used as part of the permanent-press finishing process. Five fabric samples were selected, and a randomized block design was run by testing each chemical type once in random order on each fabric sample. The data are shown in Table 12-10. The sums of squares for the analysis of variance are computed as follows:
Table 12-9 Analysis of Variance for a Randomized Complete Block Design

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square                F0
Treatments             SS_treatments     a − 1                 SS_treatments/(a − 1)      MS_treatments/MS_E
Blocks                 SS_blocks         b − 1                 SS_blocks/(b − 1)
Error                  SS_E              (a − 1)(b − 1)        SS_E/[(a − 1)(b − 1)]
Total                  SS_T              ab − 1

Computing formulas: SS_T = Σᵢ Σⱼ y_ij² − y..²/N, SS_treatments = (1/b) Σᵢ y_i.² − y..²/N, SS_blocks = (1/a) Σⱼ y_.j² − y..²/N, and SS_E = SS_T − SS_treatments − SS_blocks.
The analysis of variance is summarized in Table 12-11. We would conclude that there is a significant difference in the chemical types as far as their effect on fabric strength is concerned.
Table 12-11 Analysis of Variance for the Randomized Block Experiment

Source of Variation           Sum of Squares    Degrees of Freedom    Mean Square    F0
Chemical type (treatments)    18.04             3                     6.01           75.13
Fabric sample (blocks)        6.69              4                     1.67
Error                         0.96              12                    0.08
Total                         25.69             19
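The entries of Table 12-11 can be checked by recomputing the mean squares and the F ratio from the sums of squares (a sketch; a tiny rounding difference from the tabled F0 = 75.13 is expected because the table was built from unrounded sums of squares):

```python
# Mean squares and test statistic for the randomized block ANOVA,
# computed from the rounded sums of squares in Table 12-11.
ss = {"treatments": 18.04, "blocks": 6.69, "error": 0.96}
df = {"treatments": 3, "blocks": 4, "error": 12}

ms = {k: ss[k] / df[k] for k in ss}
f0 = ms["treatments"] / ms["error"]

print({k: round(v, 2) for k, v in ms.items()})  # {'treatments': 6.01, 'blocks': 1.67, 'error': 0.08}
print(round(f0, 1))  # 75.2
```

An F0 near 75 is far beyond any reasonable critical value of F with 3 and 12 degrees of freedom, which supports the conclusion that the chemical types differ.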
Suppose an experiment is conducted as a randomized block design, and blocking was not really necessary. There are ab observations and (a − 1)(b − 1) degrees of freedom for error. If the experiment had been run as a completely randomized single-factor design with b replicates, we would have had a(b − 1) degrees of freedom for error. So, blocking has cost a(b − 1) − (a − 1)(b − 1) = b − 1 degrees of freedom for error. Thus, since the loss in error degrees of freedom is usually small, if there is a reasonable chance that block effects may be important, the experimenter should use the randomized block design.

For example, consider the experiment described in Example 12-6 as a one-way-classification analysis of variance. We would have 16 degrees of freedom for error. In the randomized block design there are 12 degrees of freedom for error. Therefore, blocking has cost only 4 degrees of freedom, a very small loss considering the possible gain in information that would be achieved if block effects really are important. As a general rule, when in doubt as to the importance of block effects, the experimenter should block and gamble that the block effect does exist. If the experimenter is wrong, the slight loss in the degrees of freedom for error will have a negligible effect, unless the number of degrees of freedom is very small. The reader should compare this discussion to the one at the end of Section 11-3.3.
12-4.2 Tests on Individual Treatment Means

When the analysis of variance indicates that a difference exists between treatment means, we usually need to perform some follow-up tests to isolate the specific differences. Any multiple comparison method, such as Tukey's test, could be used to do this. Tukey's test, presented in Section 12-2.2, can be used to determine differences between treatment means when blocking is involved simply by replacing n with the number of blocks b in equation 12-20. Keep in mind that the degrees of freedom for error have now changed. For the randomized block design, f = (a − 1)(b − 1). To illustrate this procedure, recall that the four chemical type means from Example 12-6 are
ȳ1. = 1.14,   ȳ2. = 1.76,   ȳ3. = 1.38,   ȳ4. = 3.56.

Therefore, we would conclude that two means are significantly different if |ȳi. − ȳj.| > 0.53. The absolute values of the differences in treatment averages are

|ȳ1. − ȳ2.| = 0.62,   |ȳ1. − ȳ3.| = 0.24,   |ȳ1. − ȳ4.| = 2.42,
|ȳ2. − ȳ3.| = 0.38,   |ȳ2. − ȳ4.| = 1.80,   |ȳ3. − ȳ4.| = 2.18.

The results indicate that chemical types 1 and 3 do not differ, and types 2 and 3 do not differ. Figure 12-8 represents the results graphically, where the underlined pairs do not differ.
Figure 12-8 Results of Tukey's test (treatment averages plotted on the strength scale; underlined pairs do not differ).
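The pairwise comparisons can be sketched in a few lines (the treatment averages are taken from Example 12-6; the reading ȳ3. = 1.38 is an assumption consistent with the conclusions stated above, as that value is partly illegible):

```python
from itertools import combinations

# Treatment averages from Example 12-6 and the Tukey critical
# difference T = 0.53 computed in the text.
means = {1: 1.14, 2: 1.76, 3: 1.38, 4: 3.56}
T = 0.53

for i, j in combinations(sorted(means), 2):
    diff = abs(means[i] - means[j])
    verdict = "differ" if diff > T else "do not differ"
    print(f"types {i} and {j}: |difference| = {diff:.2f} -> {verdict}")
```

Only the pairs (1, 3) and (2, 3) fall below the critical difference, matching the underlining in Fig. 12-8.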
12-4.3 Residual Analysis and Model Checking

In any designed experiment it is always important to examine the residuals and to check for violations of basic assumptions that could invalidate the results. The residuals for the randomized block design are just the differences between the observed and fitted values,

e_ij = y_ij − ŷ_ij,

where the fitted values are

ŷ_ij = ȳ_i. + ȳ_.j − ȳ..   (12-33)

The fitted value represents the estimate of the mean response when the ith treatment is run in the jth block. The residuals from the experiment in Example 12-6 are shown in Table 12-12. Figures 12-9, 12-10, 12-11, and 12-12 present the important residual plots for the experiment. There is some indication that fabric sample (block) 3 has greater variability in strength when treated with the four chemicals than the other samples. Also, chemical type
Table 12-12 Residuals from the Randomized Block Design

Chemical    Fabric Sample
Type        1        2        3        4        5
1           -0.18    -0.11     0.44    -0.18     0.02
2            0.10     0.07    -0.27     0.00     0.10
3            0.08    -0.24     0.30    -0.12    -0.02
4            0.00     0.27    -0.48     0.30    -0.10
Figure 12-9 Normal probability plot of residuals from the randomized block design.
4, which provides the greatest strength, also has somewhat more variability in strength. Follow-up experiments may be necessary to confirm these findings if they are potentially important.
Figure 12-10 Residuals by treatment.
Figure 12-11 Residuals by block.
Figure 12-12 Residuals versus ŷ_ij.
12-5 DETERMINING SAMPLE SIZE IN SINGLE-FACTOR EXPERIMENTS

In any experimental design problem the choice of the sample size, or number of replicates, to use is important. Operating characteristic curves can be used to provide guidance in making this selection. Recall that the operating characteristic curve is a plot of the type II (β) error probability for various sample sizes against a measure of the difference in means that it is important to detect. Thus, if the experimenter knows how large a difference in means is of potential importance, the operating characteristic curves can be used to determine how many replicates are required to give adequate sensitivity.

We first consider sample size determination in a fixed-effects model for the case of equal sample sizes in each treatment. The power (1 − β) of the test is

1 − β = P{Reject H0 | H0 is false}
      = P{F0 > F_{α, a−1, N−a} | H0 is false}.   (12-34)

To evaluate this probability statement, we need to know the distribution of the test statistic F0 if the null hypothesis is false. It can be shown that if H0 is false, the statistic F0 = MS_treatments/MS_E is distributed as a noncentral F random variable with a − 1 and N − a degrees of freedom and a noncentrality parameter δ. If δ = 0, the noncentral F distribution becomes the usual central F distribution.

The operating characteristic curves in Chart VII of the Appendix are used to calculate the power of the test for the fixed-effects model. These curves plot the probability of type II error (β) against Φ, where

Φ² = ( n Σ_{i=1}^{a} τ_i² ) / ( a σ² ).   (12-35)

The parameter Φ² is related to the noncentrality parameter δ. Curves are available for α = 0.05 and α = 0.01 and for several values of degrees of freedom for the numerator and denominator. In a completely randomized design, the symbol n in equation 12-35 is the number of replicates. In a randomized block design, replace n by the number of blocks.

In using the operating characteristic curves, we must define the difference in means that we wish to detect in terms of Σ_{i=1}^{a} τ_i². Also, the error variance σ² is usually unknown. In such cases, we must choose ratios of Σ_{i=1}^{a} τ_i²/σ² that we wish to detect. Alternatively, if an estimate of σ² is available, one may replace σ² with this estimate. For example, if we were interested in the sensitivity of an experiment that has already been performed, we might use MS_E as the estimate of σ².
Example 12-7. Suppose that five means are being compared in a completely randomized experiment with α = 0.01. The experimenter would like to know how many replicates to run if it is important to reject H0 with a probability of at least 0.90 if Σ_{i=1}^{5} τ_i²/σ² = 5.0. The parameter Φ² is, in this case,

Φ² = ( n Σ_{i=1}^{5} τ_i² ) / ( a σ² ) = n(5.0)/5 = n,

and the operating characteristic curve for a − 1 = 5 − 1 = 4 and N − a = a(n − 1) = 5(n − 1) error degrees of freedom is shown in Chart VII (Appendix). As a first guess, try n = 4 replicates. This yields Φ² = 4, Φ = 2, and 5(3) = 15 error degrees of freedom.
From Chart VII we find β ≈ 0.38. Therefore, the power of the test is approximately 1 − β = 1 − 0.38 = 0.62, which is less than the required 0.90, and so we conclude that n = 4 replicates are not sufficient. Proceeding in a similar manner, we can construct the following display:

n    Φ²    Φ       a(n − 1)    β       Power (1 − β)
4    4     2.00    15          0.38    0.62
5    5     2.24    20          0.18    0.82
6    6     2.45    25          0.06    0.94

Thus, at least n = 6 replicates must be run in order to obtain a test with the required power.
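The OC-chart iteration of Example 12-7 can also be carried out exactly with the noncentral F distribution (a sketch assuming SciPy is available; the noncentrality parameter is δ = n Σ τ_i²/σ² = aΦ², so chart readings and exact values differ only by chart-reading error):

```python
from scipy.stats import f, ncf

# Exact power of the fixed-effects ANOVA F-test, Example 12-7 settings:
# a = 5 treatments, alpha = 0.01, and sum(tau_i^2)/sigma^2 = 5.0 to detect.
a, alpha = 5, 0.01
target = 5.0   # sum(tau_i^2) / sigma^2

for n in (4, 5, 6):
    dfn, dfd = a - 1, a * (n - 1)
    delta = n * target                      # noncentrality parameter
    f_crit = f.ppf(1 - alpha, dfn, dfd)     # critical value of the test
    power = 1 - ncf.cdf(f_crit, dfn, dfd, delta)
    print(n, round(power, 2))
```

The exact powers land close to the chart values 0.62, 0.82, and 0.94, and confirm that n = 6 is the smallest number of replicates meeting the 0.90 requirement.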
The power of the test for the random-effects model is

1 − β = P{Reject H0 | H0 is false}
      = P{F0 > F_{α, a−1, N−a} | σ_τ² > 0}.   (12-36)

Once again the distribution of the test statistic F0 under the alternative hypothesis is needed. It can be shown that if H1 is true (σ_τ² > 0), the distribution of F0/(1 + nσ_τ²/σ²) is central F with a − 1 and N − a degrees of freedom. Since the power of the random-effects model is based on the central F distribution, we could use the tables of the F distribution in the Appendix to evaluate equation 12-36. However, it is much easier to evaluate the power of the test by using the operating characteristic curves in Chart VIII of the Appendix. These curves plot the probability of the type II error against λ, where

λ = √(1 + nσ_τ²/σ²).   (12-37)

In the randomized block design, replace n with b, the number of blocks. Since σ² is usually unknown, we may either use a prior estimate or define the value of σ_τ² that we are interested in detecting in terms of the ratio σ_τ²/σ².
Example 12-8. Consider a completely randomized design with five treatments selected at random, with six observations per treatment and α = 0.05. We wish to determine the power of the test if σ_τ² is equal to σ². Since a = 5, n = 6, and σ_τ² = σ², we may compute

λ = √(1 + 6(1)) = 2.646.

From the operating characteristic curve with a − 1 = 4, N − a = 25 degrees of freedom, and α = 0.05, we find that β ≈ 0.20. Therefore, the power is approximately 0.80.
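Example 12-8 can also be checked directly with the central F distribution, using the fact that F0/(1 + nσ_τ²/σ²) is central F under H1 (a sketch assuming SciPy is available):

```python
from scipy.stats import f

# Power for the random-effects test of Example 12-8:
# a = 5 treatments, n = 6 observations each, alpha = 0.05,
# detecting sigma_tau^2 equal to sigma^2 (ratio = 1).
a, n, alpha = 5, 6, 0.05
ratio = 1.0   # sigma_tau^2 / sigma^2

dfn, dfd = a - 1, a * (n - 1)
f_crit = f.ppf(1 - alpha, dfn, dfd)
power = 1 - f.cdf(f_crit / (1 + n * ratio), dfn, dfd)
print(round(power, 2))   # close to the 0.80 read from the OC curve
```

No noncentral F distribution is needed here; that is the computational advantage of the random-effects case.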
12-6 SAMPLE COMPUTER OUTPUT

Many computer packages can carry out the analysis of variance for the situations presented in this chapter. In this section, computer output from Minitab® is presented.
Computer Output for Hardwood Concentration Example

Reconsider Example 12-1, which investigates the effect of hardwood concentration on tensile strength. Using ANOVA in Minitab® provides the following output.
Analysis of Variance for TS
Source     DF        SS        MS        F      P
Concen      3    382.79    127.60    19.61  0.000
Error      20    130.17      6.51
Total      23    512.96

                            Individual 95% CIs For Mean
                            Based on Pooled StDev
Level   N     Mean   StDev  ---+---------+---------+---------+---
 5      6   10.000   2.828  (---*---)
10      6   15.667   2.805               (---*---)
15      6   17.000   1.789                    (---*---)
20      6   21.167   2.639                                (---*---)
                            ---+---------+---------+---------+---
Pooled StDev =   2.551        10.0      15.0      20.0      25.0
The analysis of variance results are identical to those presented in Section 12-1.2. Minitab® also provides 95% confidence intervals for the means of each level of hardwood concentration using a pooled estimate of the standard deviation. Interpretation of the confidence intervals is straightforward: factor levels with confidence intervals that do not overlap are said to be significantly different. A better indicator of significant differences is provided by confidence intervals based on Tukey's test on pairwise differences, an option in Minitab®. The output provided is

The (simultaneous) confidence intervals are easily interpreted. For example, the 95% confidence interval for the difference in mean tensile strength between 5% hardwood concentration and 10% hardwood concentration is (−9.791, −1.542). Since this confidence interval does not contain the value 0, we conclude there is a significant difference between the 5% and 10% hardwood concentrations. The remaining confidence intervals are interpreted similarly. The results provided by Minitab® are identical to those found in Section 12-2.2.
12-7 SUMMARY

This chapter has introduced design and analysis methods for experiments with a single factor. The importance of randomization in single-factor experiments was emphasized. In a completely randomized experiment, all runs are made in random order to balance out the effects of unknown nuisance variables. If a known nuisance variable can be controlled, blocking can be used as a design alternative. The fixed-effects and random-effects models of analysis of variance were presented. The primary difference between the two models is the inference space. In the fixed-effects model, inferences are valid only about the factor levels specifically considered in the analysis, while in the random-effects model the conclusions may be extended to the population of factor levels. Orthogonal contrasts and Tukey's test were suggested for making comparisons between factor level means in the fixed-effects experiment. A procedure was also given for estimating the variance components in a random-effects model. Residual analysis was introduced for checking the underlying assumptions of the analysis of variance.
12-8 EXERCISES

12-1. A study is conducted to determine the effect of cutting speed on the life (in hours) of a particular machine tool. Four levels of cutting speed are selected for the study, with the following results:

Cutting Speed    Tool Life
1                41  42  43  36  33  39
2                45  34  34  38  34  34
3                36  37  36  38  36  40
4                40  39  36  35  33  35

(a) Does cutting speed affect tool life? Draw comparative box plots and perform an analysis of variance.
(b) Plot average tool life against cutting speed and interpret the results.
(c) Use Tukey's test to investigate differences between the individual levels of cutting speed. Interpret the results.
(d) Find the residuals and examine them for model inadequacy.

12-2. In "Orthogonal Design for Process Optimization and Its Application to Plasma Etching" (Solid State Technology, May 1987), G. Z. Yin and D. W. Jillie describe an experiment to determine the effect of C2F6 flow rate on the uniformity of the etch on a silicon wafer used in integrated circuit manufacturing. Three flow rates are used in the experiment, and the resulting uniformity (in percent) for six replicates is as follows:

C2F6 Flow    Observations
125          2.7  4.6  2.6  3.0  3.2  3.8
160          4.9  4.6  5.0  4.2  3.6  4.2
200          4.6  3.4  2.9  3.5  4.1  5.1

(a) Does C2F6 flow rate affect etch uniformity? Construct box plots to compare the factor levels and perform the analysis of variance.
(b) Do the residuals indicate any problems with the underlying assumptions?

12-3. The compressive strength of concrete is being studied. Four different mixing techniques are being investigated. The following data have been collected:

Mixing Technique    Compressive Strength (psi)
1                   3129  3000  2865  2890
2                   3200  3300  2975  3150
3                   2800  2900  2985  3050
4                   2600  2700  2600  2765

(a) Test the hypothesis that mixing techniques affect the strength of the concrete. Use α = 0.05.
(b) Use Tukey's test to make comparisons between pairs of means. Estimate the treatment effects.

12-4. A textile mill has a large number of looms. Each loom is supposed to provide the same output of cloth per minute. To investigate this assumption, five looms are chosen at random and their output measured at different times. The following data are obtained:
Loom    Output (lb/min)
1       4.0  4.1  4.2  4.0  4.1
2       3.9  3.8  3.9  4.0  4.0
3       4.1  4.2  4.1  4.0  3.9
4       3.6  3.8  4.0  3.9  3.7
5       3.8  3.6  3.9  3.8  4.0

(a) Is this a fixed- or random-effects experiment? Are the looms similar in output?
(b) Estimate the variability between looms.
(c) Estimate the experimental error variance.
(d) What is the probability of accepting H0 if σ_τ² is four times the experimental error variance?
(e) Analyze the residuals from this experiment and check for model inadequacy.

12-5. An experiment was run to determine whether four specific firing temperatures affect the density of a certain type of brick. The experiment led to the following data:

Temperature    100    125    150

(a) Does the firing temperature affect the density of the bricks?
(b) Estimate the components in the model.
(c) Analyze the residuals from the experiment.

12-6. An electronics engineer is interested in the effect on tube conductivity of five different types of coating for cathode ray tubes used in a telecommunications system display device. The following conductivity data are obtained:

Coating Type    Conductivity
1               143  141  150  146
2               152  149  137  143
3               134  133  132  127
4               129  127  132  129
5               147  148  144  142

(a) Is there any difference in conductivity due to coating type? Use α = 0.05.
(c) Compute a 95% interval estimate of the mean for coating type 1. Compute a 99% interval estimate of the mean difference between coating types 1 and 4.
(d) Test all pairs of means using Tukey's test, with α = 0.05.
(e) Assuming that coating type 4 is currently in use, what are your recommendations to the manufacturer? We wish to minimize conductivity.

12-7. The response time in milliseconds was determined for three different types of circuits used in an electronic calculator. The results are recorded here:

Circuit Type    Response Time
1               19  22  20  18  25
2               20  21  33  27  40
3               16  15  18  26  17

(a) Test the hypothesis that the three circuit types have the same response time.
(b) Use Tukey's test to compare pairs of treatment means.
(c) Construct a set of orthogonal contrasts, assuming that at the outset of the experiment you suspected the response time of circuit type 2 to be different from the other two.
(d) What is the power of this test for detecting Σ_{i=1}^{3} τ_i²/σ² = 3.0?
(e) Analyze the residuals from this experiment.
12-8. In "The Effect of Nozzle Design on the Stability and Performance of Turbulent Water Jets" (Fire Safety Journal, Vol. 4, August 1981), C. Theobald describes an experiment in which a shape factor was determined for several different nozzle designs at different levels of jet efflux velocity. Interest in this experiment focuses primarily on nozzle design, and velocity is a nuisance factor. The data are shown below:

(a) Does nozzle type affect shape factor? Compare the nozzles using box plots and the analysis of variance.
(b) Use Tukey's test to determine specific differences between the nozzles. Does a graph of average (or
standard deviation) of shape factor versus nozzle type assist with the conclusions?
(c) Analyze the residuals from this experiment.

12-9. In his book Design and Analysis of Experiments (2001), D. C. Montgomery describes an experiment to determine the effect of four chemical agents on the strength of a particular type of cloth. Due to possible variability from cloth to cloth, bolts of cloth are considered blocks. Five bolts are selected, and all four chemicals in random order are applied to each bolt. The resulting tensile strengths are

Chemical    Bolt: 1    2    3    4    5
1                 73   68   74   71   67
2                 73   67   75   72   70
3                 75   68   78   73   68
4                 73   71   75   75   69

(a) Is there any difference in tensile strength between the chemicals?
(b) Use Tukey's test to investigate specific differences between the chemicals.
(c) Analyze the residuals from this experiment.

12-10. Suppose that four normal populations have common variance

12-11. Suppose that five normal populations have common variance σ² = 100 and means μ1 = 175, μ2 = 190, μ3 = 160, μ4 = 200, and μ5 = 215. How many observations per population must be taken so that the probability of rejecting the hypothesis of equality of means is at least 0.95? Use α = 0.01.

12-12. Consider testing the equality of the means of two normal populations where the variances are unknown but assumed equal. The appropriate test procedure is the two-sample t-test. Show that the two-sample t-test is equivalent to the one-way-classification analysis of variance.

12-13. Show that the variance of the linear combination Σ_{i=1}^{a} c_i y_i. is σ² Σ_{i=1}^{a} n_i c_i².

12-14. In a fixed-effects model, suppose that there are n observations for each of four treatments. Let Q1², Q2², and Q3² be single-degree-of-freedom components for the orthogonal contrasts. Prove that SS_treatments = Q1² + Q2² + Q3².

12-15. Consider the data shown in Exercise 12-7.
(a) Write out the least squares normal equations for this problem, and solve them for μ̂ and τ̂_i, making the usual constraint (Σ_{i=1}^{3} τ̂_i = 0). Estimate τ1 − τ2.
(b) Solve the equations in (a) using the constraint τ̂3 = 0. Are the estimators τ̂_i and μ̂ the same as you found in (a)? Why? Now estimate τ1 − τ2 and compare your answer with (a). What statement can you make about estimating contrasts in the τ_i?
(c) Estimate μ + τ1, 2τ1 − τ2 − τ3, and μ + τ1 + τ2 using the two solutions to the normal equations. Compare the results obtained in each case.
Chapter 13

Design of Experiments with Several Factors

An experiment is just a test or a series of tests. Experiments are performed in all scientific and engineering disciplines and are a major part of the discovery and learning process. The conclusions that can be drawn from an experiment will depend, in part, on how the experiment was conducted, and so the design of the experiment plays a major role in problem solution. This chapter introduces experimental design techniques useful when several factors are involved.
13-1 EXAMPLES OF EXPERIMENTAL DESIGN APPLICATIONS

Example 13-1. A Characterization Experiment. A development engineer is working on a new process for soldering electronic components to printed circuit boards. Specifically, he is working with a new type of flow solder machine that he hopes will reduce the number of defective solder joints. (A flow solder machine preheats printed circuit boards and then moves them into contact with a wave of liquid solder. This machine makes all the electrical and most of the mechanical connections of the components to the printed circuit board. Solder defects require touchup or rework, which adds cost and often damages the boards.) The flow solder machine has several variables that the engineer can control. They are as follows:

1. Solder temperature
2. Preheat temperature
3. Conveyor speed
4. Flux type
5. Flux specific gravity
6. Solder wave depth
7. Conveyor angle

In addition to these controllable factors, there are several factors that cannot be easily controlled once the machine enters routine manufacturing, including the following:

1. Thickness of the printed circuit board
2. Types of components used on the board
3. Layout of the components on the board
4. Operator
5. Environmental factors
6. Production rate

Sometimes we call the uncontrollable factors noise factors. A schematic representation of the process is shown in Fig. 13-1.
Figure 13-1 The flow solder experiment. Controllable factors and uncontrollable (noise) factors z1, z2, ..., zq act on the process (the flow solder machine), which converts the input (printed circuit boards) into the output (defects, y).
In this situation the engineer is interested in characterizing the flow solder machine; that is, he is interested in determining which factors (both controllable and uncontrollable) affect the occurrence of defects on the printed circuit boards. To accomplish this he can design an experiment that will enable him to estimate the magnitude and direction of the factor effects. Sometimes we call an experiment such as this a screening experiment. The information from this characterization study, or screening experiment, can be used to identify the critical factors, to determine the direction of adjustment for these factors to reduce the number of defects, and to assist in determining which factors should be carefully controlled during manufacturing to prevent high defect levels and erratic process performance.
Example 13-2. An Optimization Experiment. In a characterization experiment, we are interested in determining which factors affect the response. A logical next step is to determine the region of the important factors that leads to an optimum response. For example, if the response is yield, we would look for a region of maximum yield, and if the response is cost, we would look for a region of minimum cost.

As an illustration, suppose that the yield of a chemical process is influenced by the operating temperature and the reaction time. We are currently operating the process at 155°F and 1.7 hours of reaction time and experiencing yields around 75%. Figure 13-2 shows a view of the time-temperature space from above. In this graph we have connected points of constant yield with lines. These lines are called contours, and we have shown the contours at 60%, 70%, 80%, 90%, and 95% yield. To locate the optimum, it is necessary to design an experiment that varies reaction time and temperature together. This design is illustrated in Fig. 13-2. The responses observed at the four points in the experiment (145°F, 1.2 hr), (145°F, 2.2 hr), (165°F, 1.2 hr), and (165°F, 2.2 hr) indicate that we should move in the general direction of increased temperature and lower reaction time to increase yield. A few additional runs could be performed in this direction to locate the region of maximum yield.
These examples illustrate only two potential applications of experimental design methods. In the engineering environment, experimental design applications are numerous. Some potential areas of use are as follows:

1. Process troubleshooting
2. Process development and optimization
3. Evaluation of material alternatives
4. Reliability and life testing
5. Performance testing
[Figure 13-2: Contour plot of yield as a function of reaction time and reaction temperature, illustrating an optimization experiment. The plot shows the current operating conditions and a path leading to the region of higher yield.]
6. Product design configuration
7. Component tolerance determination

Experimental design methods allow these problems to be solved efficiently during the early stages of the product cycle. This has the potential to dramatically lower overall product cost and reduce development lead time.
13-2 FACTORIAL EXPERIMENTS

When there are several factors of interest in an experiment, a factorial design should be used. These are designs in which factors are varied together. Specifically, by a factorial experiment we mean that in each complete trial or replicate of the experiment all possible combinations of the levels of the factors are investigated. Thus, if there are two factors, A and B, with a levels of factor A and b levels of factor B, then each replicate contains all ab treatment combinations.

The effect of a factor is defined as the change in response produced by a change in the level of the factor. This is called a main effect because it refers to the primary factors in the study. For example, consider the data in Table 13-1. The main effect of factor A is the difference between the average response at the first level of A and the average response at the second level of A, or
A = (30 + 40)/2 - (10 + 20)/2 = 20.
Table 13-1  A Factorial Experiment with Two Factors

              Factor B
Factor A      B1     B2
   A1         10     20
   A2         30     40
That is, changing factor A from level 1 to level 2 causes an average response increase of 20 units. Similarly, the main effect of B is

B = (20 + 40)/2 - (10 + 30)/2 = 10.

In some experiments, the difference in response between the levels of one factor is not the same at all levels of the other factors. When this occurs, there is an interaction between the factors. For example, consider the data in Table 13-2. At the first level of factor B, the A effect is

A = 30 - 10 = 20,

and at the second level of factor B, the A effect is

A = 0 - 20 = -20.

Since the effect of A depends on the level chosen for factor B, there is interaction between A and B. When an interaction is large, the corresponding main effects have little meaning. For example, by using the data in Table 13-2, we find the main effect of A to be

A = (30 + 0)/2 - (10 + 20)/2 = 0,

and we would be tempted to conclude that there is no A effect. However, when we examined the effects of A at different levels of factor B, we saw that this was not the case. The effect of factor A depends on the levels of factor B. Thus, knowledge of the AB interaction is more useful than knowledge of the main effects. A significant interaction can mask the significance of main effects.

The concept of interaction can be illustrated graphically. Figure 13-3 plots the data in Table 13-1 against the levels of A for both levels of B. Note that the B1 and B2 lines are roughly parallel, indicating that factors A and B do not interact significantly. Figure 13-4 plots the data in Table 13-2. In this graph, the B1 and B2 lines are not parallel, indicating the interaction between factors A and B. Such graphical displays are often useful in presenting the results of experiments.

An alternative to the factorial design that is (unfortunately) used in practice is to change the factors one at a time rather than to vary them simultaneously. To illustrate this one-factor-at-a-time procedure, consider the optimization experiment described in Example 13-2.

Table 13-2  A Factorial Experiment with Interaction

              Factor B
Factor A      B1     B2
   A1         10     20
   A2         30      0
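The effect calculations above are purely arithmetic, so they are easy to check by machine. The following sketch (Python is our choice here; the book itself gives no code) computes the main effects and the interaction effect for a 2 × 2 factorial with one observation per cell, using the data of Tables 13-1 and 13-2:

```python
# Main effects and interaction for a 2x2 factorial with one observation
# per cell: data[i][j] is the response at level i+1 of A and level j+1 of B.

def effects_2x2(data):
    (y11, y12), (y21, y22) = data          # rows A1, A2; columns B1, B2
    A = (y21 + y22) / 2 - (y11 + y12) / 2  # main effect of A
    B = (y12 + y22) / 2 - (y11 + y21) / 2  # main effect of B
    # Interaction: half the difference between the A effect at B2 and at B1.
    AB = ((y22 - y12) - (y21 - y11)) / 2
    return A, B, AB

no_interaction = [(10, 20), (30, 40)]  # Table 13-1
interaction = [(10, 20), (30, 0)]      # Table 13-2

print(effects_2x2(no_interaction))  # (20.0, 10.0, 0.0)
print(effects_2x2(interaction))     # (0.0, -10.0, -20.0)
```

For Table 13-1 this reproduces A = 20 and B = 10 with a zero interaction; for Table 13-2 the main effect of A comes out 0 even though A clearly matters at each level of B, which is exactly the masking discussed above.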
[Figure 13-3: Factorial experiment, no interaction. The response is plotted against the levels of factor A for both levels of B; the B1 and B2 lines are roughly parallel.]

[Figure 13-4: Factorial experiment, with interaction. The B1 and B2 lines are not parallel.]
engineer is interested in finding the values of temperature and reaction time that maximize yield. Suppose that we fix temperature at 155°F (the current operating level) and perform five runs at different levels of time, say 0.5 hour, 1.0 hour, 1.5 hours, 2.0 hours, and 2.5 hours. The results of this series of runs are shown in Fig. 13-5. This figure indicates that maximum yield is achieved at about 1.7 hours of reaction time. To optimize temperature, the engineer fixes time at 1.7 hours (the apparent optimum) and performs five runs at different temperatures, say 140°F, 150°F, 160°F, 170°F, and 180°F. The results of this set of runs are plotted in Fig. 13-6. Maximum yield occurs at about 155°F. Therefore, we would conclude that running the process at 155°F and 1.7 hours is the best set of operating conditions, resulting in yields around 75%.
[Figure 13-5: Yield versus reaction time with temperature constant at 155°F. Yield peaks at about 1.7 hours.]
[Figure 13-6: Yield versus temperature with reaction time constant at 1.7 hr. Yield peaks at about 155°F.]
Figure 13-7 displays the contour plot of yield as a function of temperature and time with the one-factor-at-a-time experiment shown on the contours. Clearly the one-factor-at-a-time design has failed dramatically here, as the true optimum is at least 20 yield points higher and occurs at much lower reaction times and higher temperatures. The failure to discover the shorter reaction times is particularly important, as this could have significant impact on production volume or capacity, production planning, manufacturing cost, and total productivity.

The one-factor-at-a-time method has failed here because it fails to detect the interaction between temperature and time. Factorial experiments are the only way to detect interactions. Furthermore, the one-factor-at-a-time method is inefficient; it will require more experimentation than a factorial, and as we have just seen, there is no assurance that it will produce the correct results. The experiment shown in Fig. 13-2 that produced the information pointing to the region of the optimum is a simple example of a factorial experiment.
13-3 TWO-FACTOR FACTORIAL EXPERIMENTS

The simplest type of factorial experiment involves only two factors, say A and B. There are a levels of factor A and b levels of factor B, and each replicate of the experiment contains all ab treatment combinations. If the experiment is replicated n times, the observations may be described by the model

y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk,   i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., n,   (13-1)

where μ is the overall mean effect, τ_i is the effect of the ith level of factor A, β_j is the effect of the jth level of factor B, (τβ)_ij is the effect of the interaction between A and B, and ε_ijk is a NID(0, σ²) (normal and independently distributed) random error component. We are interested in testing the hypotheses of no significant factor A effect, no significant factor B effect, and no significant AB interaction. As with the single-factor experiments of Chapter 12, the analysis of variance will be used to test these hypotheses. Since there are two factors under study, the procedure used is called the two-way analysis of variance.
13-3.1 Statistical Analysis of the Fixed-Effects Model

Suppose that factors A and B are fixed. That is, the a levels of factor A and the b levels of factor B are specifically chosen by the experimenter, and inferences are confined to these levels only. In this model, it is customary to define the effects τ_i, β_j, and (τβ)_ij as deviations from the mean, so that Σ_{i=1}^{a} τ_i = 0, Σ_{j=1}^{b} β_j = 0, Σ_{i=1}^{a} (τβ)_ij = 0, and Σ_{j=1}^{b} (τβ)_ij = 0.

Let y_i.. denote the total of the observations under the ith level of factor A, let y_.j. denote the total of the observations under the jth level of factor B, let y_ij. denote the total of the observations in the ijth cell of Table 13-3, and let y_... denote the grand total of all the
Table 13-3  Data Arrangement for a Two-Factor Factorial Design

                               Factor B
Factor A        1                    2                  ...        b
   1     y_111, ..., y_11n    y_121, ..., y_12n        ...   y_1b1, ..., y_1bn
   2     y_211, ..., y_21n    y_221, ..., y_22n        ...   y_2b1, ..., y_2bn
  ...
   a     y_a11, ..., y_a1n    y_a21, ..., y_a2n        ...   y_ab1, ..., y_abn
observations. Define ȳ_i.., ȳ_.j., ȳ_ij., and ȳ_... as the corresponding row, column, cell, and grand averages. That is,

y_i.. = Σ_{j=1}^{b} Σ_{k=1}^{n} y_ijk,   ȳ_i.. = y_i../(bn),   i = 1, 2, ..., a,

y_.j. = Σ_{i=1}^{a} Σ_{k=1}^{n} y_ijk,   ȳ_.j. = y_.j./(an),   j = 1, 2, ..., b,

y_ij. = Σ_{k=1}^{n} y_ijk,   ȳ_ij. = y_ij./n,                                   (13-2)

y_... = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{n} y_ijk,   ȳ_... = y_.../(abn).
The total corrected sum of squares may be written

Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{n} (y_ijk - ȳ_...)² = bn Σ_{i=1}^{a} (ȳ_i.. - ȳ_...)² + an Σ_{j=1}^{b} (ȳ_.j. - ȳ_...)²
    + n Σ_{i=1}^{a} Σ_{j=1}^{b} (ȳ_ij. - ȳ_i.. - ȳ_.j. + ȳ_...)² + Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{n} (y_ijk - ȳ_ij.)².   (13-3)

Thus, the total sum of squares is partitioned into a sum of squares due to "rows," or factor A (SS_A), a sum of squares due to "columns," or factor B (SS_B), a sum of squares due to the interaction between A and B (SS_AB), and a sum of squares due to error (SS_E). Notice that there must be at least two replicates to obtain a nonzero error sum of squares. The sum of squares identity in equation 13-3 may be written symbolically as

SS_T = SS_A + SS_B + SS_AB + SS_E.                                              (13-4)

There are abn - 1 total degrees of freedom. The main effects A and B have a - 1 and b - 1 degrees of freedom, while the interaction effect AB has (a - 1)(b - 1) degrees of freedom. Within each of the ab cells in Table 13-3, there are n - 1 degrees of freedom between the n replicates, and observations in the same cell can differ only due to random error. Therefore, there are ab(n - 1) degrees of freedom for error. The ratio of each sum of squares on the right-hand side of equation 13-4 to its degrees of freedom is a mean square. Assuming that factors A and B are fixed, the expected values of the mean squares are

E(MS_A) = σ² + bn Σ_{i=1}^{a} τ_i²/(a - 1),
E(MS_B) = σ² + an Σ_{j=1}^{b} β_j²/(b - 1),
E(MS_AB) = σ² + n Σ_{i=1}^{a} Σ_{j=1}^{b} (τβ)_ij²/[(a - 1)(b - 1)],
E(MS_E) = σ².
Therefore, to test H0: τ_i = 0 (no row factor effects), H0: β_j = 0 (no column factor effects), and H0: (τβ)_ij = 0 (no interaction effects), we would divide the corresponding mean square by the mean square error. Each of these ratios will follow an F distribution with numerator degrees of freedom equal to the number of degrees of freedom for the numerator mean square and ab(n - 1) denominator degrees of freedom, and the critical region will be located in the upper tail. The test procedure is arranged in an analysis-of-variance table, such as is shown in Table 13-4.

Computational formulas for the sums of squares in equation 13-4 are obtained easily. The total sum of squares is computed from
SS_T = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{n} y_ijk² - y_...²/(abn).               (13-5)
The sums of squares for the main effects are

SS_A = Σ_{i=1}^{a} y_i..²/(bn) - y_...²/(abn)                                   (13-6)

and

SS_B = Σ_{j=1}^{b} y_.j.²/(an) - y_...²/(abn).                                  (13-7)
We usually calculate SS_AB in two steps. First, we compute the sum of squares between the ab cell totals, called the sum of squares due to "subtotals":

SS_subtotals = Σ_{i=1}^{a} Σ_{j=1}^{b} y_ij.²/n - y_...²/(abn).
Table 13-4  The Analysis-of-Variance Table for the Two-Way-Classification Fixed-Effects Model

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                    F0
A treatments          SS_A             a - 1                MS_A = SS_A/(a - 1)            MS_A/MS_E
B treatments          SS_B             b - 1                MS_B = SS_B/(b - 1)            MS_B/MS_E
Interaction           SS_AB            (a - 1)(b - 1)       MS_AB = SS_AB/[(a - 1)(b - 1)] MS_AB/MS_E
Error                 SS_E             ab(n - 1)            MS_E = SS_E/[ab(n - 1)]
Total                 SS_T             abn - 1
This sum of squares also contains SS_A and SS_B. Therefore, the second step is to compute SS_AB as

SS_AB = SS_subtotals - SS_A - SS_B.                                             (13-8)

The error sum of squares is found by subtraction as either

SS_E = SS_T - SS_AB - SS_A - SS_B                                               (13-9a)

or

SS_E = SS_T - SS_subtotals.                                                     (13-9b)
Example 13-3. Aircraft primer paints are applied to aluminum surfaces by two methods: dipping and spraying. The purpose of the primer is to improve paint adhesion. Some parts can be primed using either application method, and engineering is interested in learning whether three different primers differ in their adhesion properties. A factorial experiment is performed to investigate the effect of paint primer type and application method on paint adhesion. Three specimens are painted with each primer using each application method, a finish paint applied, and the adhesion force measured. The data from the experiment are shown in Table 13-5. The circled numbers in the cells are the cell totals y_ij.. The sums of squares required to perform the analysis of variance are computed from equations 13-5 through 13-9; they are SS_T = 10.72, SS_A = 4.58 (primer types), SS_B = 4.91 (application methods), SS_AB = 0.24, and

SS_E = 10.72 - 4.58 - 4.91 - 0.24 = 0.99.

The analysis of variance is summarized in Table 13-6. Since F_{0.05,2,12} = 3.89 and F_{0.05,1,12} = 4.75, we conclude that the main effects of primer type and application method affect adhesion force. Furthermore, since 1.47 < F_{0.05,2,12}, there is no indication of interaction between these factors.
Table 13-5  Adhesion Force Data for Example 13-3

                          Application Method
Primer Type     Dipping               Spraying              y_i..
    1           4.0, 4.5, 4.3 (12.8)  5.4, 4.9, 5.6 (15.9)  28.7
    2           5.6, 4.9, 5.4 (15.9)  5.8, 6.1, 6.3 (18.2)  34.1
    3           3.8, 3.7, 4.0 (11.5)  5.5, 5.0, 5.0 (15.5)  27.0
  y_.j.         40.2                  49.6                  89.8 = y_...

(The quantity in parentheses after each cell is the cell total y_ij., shown circled in the original table.)
Table 13-6  Analysis of Variance for Example 13-3

Source of Variation    Sum of Squares   Degrees of Freedom   Mean Square   F0
Primer types           4.581            2                    2.291         27.86
Application methods    4.909            1                    4.909         59.70
Interaction            0.241            2                    0.121         1.47
Error                  0.987            12                   0.082
Total                  10.718           17
A graph of the cell adhesion force averages ȳ_ij. versus the levels of primer type for each application method is shown in Fig. 13-8. The absence of interaction is evident in the parallelism of the two lines. Furthermore, since a large response indicates greater adhesion force, we conclude that spraying is a superior application method and that primer type 2 is most effective.
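The sums of squares and F ratios of Example 13-3 can be reproduced with a short program. This sketch (Python; the data layout and variable names are our own) applies equations 13-5 through 13-9 to the adhesion force data of Table 13-5:

```python
# Two-way fixed-effects ANOVA (equations 13-5 to 13-9) for Example 13-3.
# data[i][j] is the list of n replicates for level i of A (primer type)
# and level j of B (application method: dipping, spraying).
data = [
    [[4.0, 4.5, 4.3], [5.4, 4.9, 5.6]],  # primer 1
    [[5.6, 4.9, 5.4], [5.8, 6.1, 6.3]],  # primer 2
    [[3.8, 3.7, 4.0], [5.5, 5.0, 5.0]],  # primer 3
]
a, b, n = len(data), len(data[0]), len(data[0][0])
N = a * b * n
grand_total = sum(y for row in data for cell in row for y in cell)
correction = grand_total**2 / N  # y_...^2 / abn

ss_t = sum(y**2 for row in data for cell in row for y in cell) - correction
ss_a = sum(sum(sum(cell) for cell in row)**2 for row in data) / (b * n) - correction
ss_b = sum(sum(sum(data[i][j]) for i in range(a))**2 for j in range(b)) / (a * n) - correction
ss_sub = sum(sum(cell)**2 for row in data for cell in row) / n - correction
ss_ab = ss_sub - ss_a - ss_b           # equation 13-8
ss_e = ss_t - ss_sub                   # equation 13-9b

ms_a, ms_b = ss_a / (a - 1), ss_b / (b - 1)
ms_ab, ms_e = ss_ab / ((a - 1) * (b - 1)), ss_e / (a * b * (n - 1))
print(round(ss_a, 3), round(ss_b, 3), round(ss_ab, 3), round(ss_e, 3))  # 4.581 4.909 0.241 0.987
print(round(ms_a / ms_e, 2), round(ms_b / ms_e, 2), round(ms_ab / ms_e, 2))  # 27.86 59.7 1.47
```

The printed sums of squares and F ratios agree with Table 13-6.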
Tests on Individual Means. When both factors are fixed, comparisons between the individual means of either factor may be made using Tukey's test. When there is no interaction, these comparisons may be made using either the row averages ȳ_i.. or the column averages ȳ_.j.. However, when interaction is significant, comparisons between the means of one factor (say A) may be obscured by the AB interaction. In this case, we may apply Tukey's test to the means of factor A, with factor B set at a particular level.
[Figure 13-8: Graph of average adhesion force ȳ_ij. versus primer type for Example 13-3, with one line for each application method. The spraying line lies above the dipping line, and the two lines are roughly parallel.]
13-3.2 Model Adequacy Checking

Just as in the single-factor experiments discussed in Chapter 12, the residuals from a factorial experiment play an important role in assessing model adequacy. The residuals from a two-factor factorial experiment are

e_ijk = y_ijk - ȳ_ij..

That is, the residuals are just the difference between the observations and the corresponding cell averages. Table 13-7 presents the residuals for the aircraft primer paint data in Example 13-3. The normal probability plot of these residuals is shown in Fig. 13-9. This plot has tails that do not fall exactly along a straight line passing through the center of the plot, indicating some potential problems with the normality assumption, but the deviation from normality does not appear severe. Figures 13-10 and 13-11 plot the residuals versus the levels of primer types and application methods, respectively. There is some indication that primer type 3 results in slightly lower variability in adhesion force than the other two primers. The graph of residuals versus fitted values ŷ_ijk = ȳ_ij. in Fig. 13-12 reveals no unusual or diagnostic pattern.
[Table 13-7: Residuals for the Aircraft Primer Paint Experiment in Example 13-3.]

13-3.3 One Observation per Cell

In some cases involving a two-factor factorial experiment, we may have only one replicate, that is, only one observation per cell. In this situation there are exactly as many parameters
[Figure 13-9: Normal probability plot of the residuals from Example 13-3.]
[Figure 13-10: Plot of residuals versus primer type.]
[Figure 13-11: Plot of residuals versus application method.]
[Figure 13-12: Plot of residuals versus fitted values ŷ_ijk.]
in the analysis-of-variance model as there are observations, and the error degrees of freedom is zero. Thus, it is not possible to test a hypothesis about the main effects and interactions unless some additional assumptions are made. The usual assumption is to ignore the interaction effect and use the interaction mean square as an error mean square. Thus the analysis is equivalent to the analysis used in the randomized block design. This no-interaction assumption can be dangerous, and the experimenter should carefully examine the data and the residuals for indications that there really is interaction present. For more details, see Montgomery (2001).
13-3.4 The Random-Effects Model

So far we have considered the case where A and B are fixed factors. We now consider the situation in which the levels of both factors are selected at random from larger populations of factor levels, and we wish to extend our conclusions to the sampled population of factor levels. The observations are represented by the model

y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk,   i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., n,   (13-10)

where the parameters τ_i, β_j, (τβ)_ij, and ε_ijk are random variables. Specifically, we assume that τ_i is NID(0, σ_τ²), β_j is NID(0, σ_β²), (τβ)_ij is NID(0, σ_τβ²), and ε_ijk is NID(0, σ²). The variance of any observation is

V(y_ijk) = σ_τ² + σ_β² + σ_τβ² + σ²,

and σ_τ², σ_β², σ_τβ², and σ² are called variance components. The hypotheses that we are interested in testing are H0: σ_τ² = 0, H0: σ_β² = 0, and H0: σ_τβ² = 0. Notice the similarity to the one-way classification random-effects model.

The basic analysis of variance remains unchanged; that is, SS_A, SS_B, SS_AB, SS_T, and SS_E are all calculated as in the fixed-effects case. To construct the test statistics, we must examine the expected mean squares. They are

E(MS_A) = σ² + nσ_τβ² + bnσ_τ²,
E(MS_B) = σ² + nσ_τβ² + anσ_β²,
E(MS_AB) = σ² + nσ_τβ²,                                                        (13-11)

and

E(MS_E) = σ².

Note from the expected mean squares that the appropriate statistic for testing H0: σ_τβ² = 0 is

F0 = MS_AB/MS_E,                                                               (13-12)

which is distributed as F_{(a-1)(b-1),ab(n-1)}, since under H0 both the numerator and denominator of F0 have expectation σ². For testing H0: σ_τ² = 0, the statistic is

F0 = MS_A/MS_AB,                                                               (13-13)

which is distributed as F_{a-1,(a-1)(b-1)}, and for testing H0: σ_β² = 0, the statistic is

F0 = MS_B/MS_AB,                                                               (13-14)

which is distributed as F_{b-1,(a-1)(b-1)}. These are all upper-tail, one-tail tests. Notice that these test statistics are not the same as those used if both factors A and B are fixed. The expected mean squares are always used as a guide to test statistic construction.
The variance components may be estimated by equating the observed mean squares to their expected values and solving for the variance components. This yields

σ̂² = MS_E,
σ̂_τβ² = (MS_AB - MS_E)/n,
σ̂_β² = (MS_B - MS_AB)/(an),                                                   (13-15)
σ̂_τ² = (MS_A - MS_AB)/(bn).
Example 13-4. Suppose that in Example 13-3, a large number of primers and several application methods could be used. Three primers, say 1, 2, and 3, were selected at random, as were the two application methods. The analysis of variance assuming the random-effects model is shown in Table 13-8. Notice that the first four columns in the analysis-of-variance table are exactly as in Example 13-3. Now, however, the F ratios are computed according to equations 13-12 through 13-14. Since F_{0.05,2,12} = 3.89, we conclude that interaction is not significant. Also, since F_{0.05,2,2} = 19.0 and F_{0.05,1,2} = 18.5, we conclude that both primer types and application methods significantly affect adhesion force, although primer type is just barely significant at α = 0.05. The variance components may be estimated using equation 13-15 as follows:

σ̂² = 0.08,
σ̂_τβ² = (0.12 - 0.08)/3 = 0.0133,
σ̂_τ² = (2.29 - 0.12)/6 = 0.36,
σ̂_β² = (4.91 - 0.12)/9 = 0.53.

Clearly, the two largest variance components are for primer types (σ̂_τ² = 0.36) and application methods (σ̂_β² = 0.53).
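The variance-component estimates of Example 13-4 follow mechanically from equation 13-15. A minimal sketch (Python; the function and its names are ours, not the book's):

```python
# Variance components for the two-factor random-effects model (equation 13-15),
# using the mean squares of Table 13-8 (a = 3 primers, b = 2 methods, n = 3).
def variance_components(ms_a, ms_b, ms_ab, ms_e, a, b, n):
    return {
        "sigma2": ms_e,                            # error variance
        "sigma2_tb": (ms_ab - ms_e) / n,           # interaction component
        "sigma2_beta": (ms_b - ms_ab) / (a * n),   # factor B component
        "sigma2_tau": (ms_a - ms_ab) / (b * n),    # factor A component
    }

est = variance_components(ms_a=2.29, ms_b=4.91, ms_ab=0.12, ms_e=0.08,
                          a=3, b=2, n=3)
for name, value in est.items():
    print(name, round(value, 4))
# sigma2 0.08, sigma2_tb 0.0133, sigma2_beta 0.5322, sigma2_tau 0.3617
```

Rounded to two decimals, these match the 0.08, 0.01, 0.53, and 0.36 quoted in the example.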
13-3.5 The Mixed Model

Now suppose that one of the factors, A, is fixed and the other, B, is random. This is called the mixed model analysis of variance. The linear model is

y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk,   i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., n.   (13-16)
Table 13-8  Analysis of Variance for Example 13-4

Source of Variation    Sum of Squares   Degrees of Freedom   Mean Square   F0
Primer types           4.58             2                    2.29          19.08
Application methods    4.91             1                    4.91          40.92
Interaction            0.24             2                    0.12          1.5
Error                  0.99             12                   0.08
Total                  10.72            17
In this model, τ_i is a fixed effect defined such that Σ_{i=1}^{a} τ_i = 0, β_j is a random effect, the interaction term (τβ)_ij is a random effect, and ε_ijk is a NID(0, σ²) random error. It is also customary to assume that β_j is NID(0, σ_β²) and that the interaction elements (τβ)_ij are normal random variables with mean zero and variance [(a - 1)/a]σ_τβ². The interaction elements are not all independent. The expected mean squares in this case are

E(MS_A) = σ² + nσ_τβ² + bn Σ_{i=1}^{a} τ_i²/(a - 1),
E(MS_B) = σ² + anσ_β²,                                                         (13-17)
E(MS_AB) = σ² + nσ_τβ²,

and

E(MS_E) = σ².
Therefore, the appropriate test statistic for testing H0: τ_i = 0 is

F0 = MS_A/MS_AB,                                                               (13-18)

which is distributed as F_{a-1,(a-1)(b-1)}. For testing H0: σ_β² = 0, the test statistic is

F0 = MS_B/MS_E,                                                                (13-19)

which is distributed as F_{b-1,ab(n-1)}. Finally, for testing H0: σ_τβ² = 0, we would use

F0 = MS_AB/MS_E,                                                               (13-20)

which is distributed as F_{(a-1)(b-1),ab(n-1)}.

The variance components σ_β², σ_τβ², and σ² may be estimated by eliminating the first equation from equation 13-17, leaving three equations in three unknowns, the solutions of which are

σ̂_β² = (MS_B - MS_E)/(an),
σ̂_τβ² = (MS_AB - MS_E)/n,                                                     (13-21)
σ̂² = MS_E.

This general approach can be used to estimate the variance components in any mixed model. After eliminating the mean squares containing fixed factors, there will always be a set of equations remaining that can be solved for the variance components. Table 13-9 summarizes the analysis of variance for the two-factor mixed model.
Table 13-9  Analysis of Variance for the Two-Factor Mixed Model

Source of     Sum of    Degrees of       Mean     Expected Mean Square                F0
Variation     Squares   Freedom          Square
Rows (A)      SS_A      a - 1            MS_A     σ² + nσ_τβ² + bn Σ τ_i²/(a - 1)    MS_A/MS_AB
Columns (B)   SS_B      b - 1            MS_B     σ² + anσ_β²                         MS_B/MS_E
Interaction   SS_AB     (a - 1)(b - 1)   MS_AB    σ² + nσ_τβ²                         MS_AB/MS_E
Error         SS_E      ab(n - 1)        MS_E     σ²
Total         SS_T      abn - 1
13-4 GENERAL FACTORIAL EXPERIMENTS

Many experiments involve more than two factors. In this section we introduce the case where there are a levels of factor A, b levels of factor B, c levels of factor C, and so on, arranged in a factorial experiment. In general, there will be abc ... n total observations if there are n replicates of the complete experiment. For example, consider the three-factor experiment with underlying model

y_ijkl = μ + τ_i + β_j + γ_k + (τβ)_ij + (τγ)_ik + (βγ)_jk + (τβγ)_ijk + ε_ijkl,
    i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., c; l = 1, 2, ..., n.    (13-22)
Assuming that A, B, and C are fixed, the analysis of variance is shown in Table 13-10. Note that there must be at least two replicates (n ≥ 2) to compute an error sum of squares. The F-tests on main effects and interactions follow directly from the expected mean squares. Computing formulas for the sums of squares in Table 13-10 are easily obtained. The
total sum of squares is, using the obvious "dot" notation,

SS_T = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{c} Σ_{l=1}^{n} y_ijkl² - y_....²/(abcn).   (13-23)

The sums of squares for the main effects are computed from the totals for factors A (y_i...), B (y_.j..), and C (y_..k.) as follows:

SS_A = Σ_{i=1}^{a} y_i...²/(bcn) - y_....²/(abcn),                             (13-24)

SS_B = Σ_{j=1}^{b} y_.j..²/(acn) - y_....²/(abcn),                             (13-25)

SS_C = Σ_{k=1}^{c} y_..k.²/(abn) - y_....²/(abcn).                             (13-26)
To compute the two-factor interaction sums of squares, the totals for the A × B, A × C, and B × C cells are needed. It may be helpful to collapse the original data table into three two-way tables in order to compute these totals. The sums of squares are
SS_AB = Σ_{i=1}^{a} Σ_{j=1}^{b} y_ij..²/(cn) - y_....²/(abcn) - SS_A - SS_B
      = SS_subtotals(AB) - SS_A - SS_B,                                        (13-27)

SS_AC = Σ_{i=1}^{a} Σ_{k=1}^{c} y_i.k.²/(bn) - y_....²/(abcn) - SS_A - SS_C
      = SS_subtotals(AC) - SS_A - SS_C,                                        (13-28)

and

SS_BC = Σ_{j=1}^{b} Σ_{k=1}^{c} y_.jk.²/(an) - y_....²/(abcn) - SS_B - SS_C
      = SS_subtotals(BC) - SS_B - SS_C.                                        (13-29)

The three-factor interaction sum of squares is computed from the three-way cell totals y_ijk. as
Table 13-10  The Analysis-of-Variance Table for the Three-Factor Fixed-Effects Model

Source of   Sum of    Degrees of              Mean     Expected Mean Square                             F0
Variation   Squares   Freedom                 Square
A           SS_A      a - 1                   MS_A     σ² + bcn Σ τ_i²/(a - 1)                          MS_A/MS_E
B           SS_B      b - 1                   MS_B     σ² + acn Σ β_j²/(b - 1)                          MS_B/MS_E
C           SS_C      c - 1                   MS_C     σ² + abn Σ γ_k²/(c - 1)                          MS_C/MS_E
AB          SS_AB     (a - 1)(b - 1)          MS_AB    σ² + cn ΣΣ (τβ)_ij²/[(a - 1)(b - 1)]             MS_AB/MS_E
AC          SS_AC     (a - 1)(c - 1)          MS_AC    σ² + bn ΣΣ (τγ)_ik²/[(a - 1)(c - 1)]             MS_AC/MS_E
BC          SS_BC     (b - 1)(c - 1)          MS_BC    σ² + an ΣΣ (βγ)_jk²/[(b - 1)(c - 1)]             MS_BC/MS_E
ABC         SS_ABC    (a - 1)(b - 1)(c - 1)   MS_ABC   σ² + n ΣΣΣ (τβγ)_ijk²/[(a - 1)(b - 1)(c - 1)]    MS_ABC/MS_E
Error       SS_E      abc(n - 1)              MS_E     σ²
Total       SS_T      abcn - 1
SS_ABC = Σ_{i=1}^{a} Σ_{j=1}^{b} Σ_{k=1}^{c} y_ijk.²/n - y_....²/(abcn) - SS_A - SS_B - SS_C - SS_AB - SS_AC - SS_BC   (13-30a)
       = SS_subtotals(ABC) - SS_A - SS_B - SS_C - SS_AB - SS_AC - SS_BC.       (13-30b)

The error sum of squares may be found by subtracting the sum of squares for each main effect and interaction from the total sum of squares, or by

SS_E = SS_T - SS_subtotals(ABC).                                               (13-31)
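One useful property of equations 13-23 through 13-31 is that the components must always add back up to SS_T. The sketch below (Python; an illustration of ours with synthetic data, not an example from the text) computes every sum of squares for a general a × b × c factorial with n replicates directly from the margin totals and verifies the partition identity:

```python
import itertools
import random

def factorial_ss(y, a, b, c, n):
    """Sums of squares for a three-factor fixed-effects factorial.
    y[(i, j, k, l)] is replicate l of treatment combination (i, j, k)."""
    N = a * b * c * n
    grand = sum(y.values())
    corr = grand**2 / N  # correction term y....^2 / (abcn)

    def ss_of(margin, divisor):
        # (sum of squared margin totals) / divisor, minus the correction term.
        totals = {}
        for key, val in y.items():
            idx = margin(key)
            totals[idx] = totals.get(idx, 0.0) + val
        return sum(t**2 for t in totals.values()) / divisor - corr

    ss_t = sum(v**2 for v in y.values()) - corr
    ss_a = ss_of(lambda k: k[0], b * c * n)
    ss_b = ss_of(lambda k: k[1], a * c * n)
    ss_c = ss_of(lambda k: k[2], a * b * n)
    ss_ab = ss_of(lambda k: (k[0], k[1]), c * n) - ss_a - ss_b
    ss_ac = ss_of(lambda k: (k[0], k[2]), b * n) - ss_a - ss_c
    ss_bc = ss_of(lambda k: (k[1], k[2]), a * n) - ss_b - ss_c
    ss_sub = ss_of(lambda k: k[:3], n)  # SS_subtotals(ABC)
    ss_abc = ss_sub - ss_a - ss_b - ss_c - ss_ab - ss_ac - ss_bc
    ss_e = ss_t - ss_sub                # equation 13-31
    return ss_t, (ss_a, ss_b, ss_c, ss_ab, ss_ac, ss_bc, ss_abc, ss_e)

random.seed(1)
a, b, c, n = 2, 3, 2, 2
y = {(i, j, k, l): random.gauss(10, 2)
     for i, j, k, l in itertools.product(range(a), range(b), range(c), range(n))}
ss_t, parts = factorial_ss(y, a, b, c, n)
print(abs(ss_t - sum(parts)) < 1e-9)  # True: the partition identity holds
```

The identity holds for any data set, which makes this a convenient self-check when doing the computations by hand.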
Example 13-5. A mechanical engineer is studying the surface roughness of a part produced in a metal-cutting operation. Three factors, feed rate (A), depth of cut (B), and tool angle (C), are of interest. All three factors have been assigned two levels, and n = 2 replicates of the 2 × 2 × 2 factorial design are run. The coded data are shown in Table 13-11, together with the two-way cell totals. The sums of squares are computed from these totals using equations 13-23 through 13-31; for example,

SS_B = (82² + 95²)/8 - (177)²/16 = 10.5625,

SS_C = (85² + 92²)/8 - (177)²/16 = 3.0625,

SS_BC = (38² + 44² + 47² + 48²)/4 - (177)²/16 - SS_B - SS_C = 15.1875 - 10.5625 - 3.0625 = 1.5625.
SS_E = SS_T - SS_subtotals(ABC) = 92.9375 - 73.4375 = 19.5000.

The analysis of variance is summarized in Table 13-12. Feed rate has a significant effect on surface finish (α < 0.01), as does the depth of cut (0.05 < α < 0.10). There is some indication of a mild interaction between these factors, as the F-test for the AB interaction is just less than the 10% critical value.

Obviously, factorial experiments with three or more factors are complicated and require many runs, particularly if some of the factors have several (more than two) levels. This leads us to consider a class of factorial designs with all factors at two levels. These designs are extremely easy to set up and analyze, and as we will see, it is possible to greatly reduce the number of experimental runs through the technique of fractional replication.
Table 13-11  Coded Surface Roughness Data for Example 13-5

                               Depth of Cut (B)
                     0.025 inch              0.040 inch
                    Tool Angle (C)          Tool Angle (C)
Feed Rate (A)       15°        25°          15°        25°
20 in./min          9, 7      11, 10        9, 11     10, 8
                    (16)       (21)         (20)       (18)
30 in./min         10, 12     10, 13       12, 15     16, 14
                    (22)       (23)         (27)       (30)

(The circled cell totals of the original table are shown in parentheses.)

A × B totals y_ij..:                 A × C totals y_i.k.:          B × C totals y_.jk.:
              B: 0.025  0.040  y_i...           C: 15°   25°                    C: 15°   25°
A: 20 in./min     37      38     75   A: 20 in./min  36    39      B: 0.025 in.    38     44
A: 30 in./min     45      57    102   A: 30 in./min  49    53      B: 0.040 in.    47     48
   y_.j..         82      95    177      y_..k.      85    92
Table 13-12  Analysis of Variance for Example 13-5

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0
Feed rate (A)         45.5625          1                    45.5625       18.69
Depth of cut (B)      10.5625          1                    10.5625       4.33
Tool angle (C)        3.0625           1                    3.0625        1.26
AB                    7.5625           1                    7.5625        3.10
AC                    0.0625           1                    0.0625        0.03
BC                    1.5625           1                    1.5625        0.64
ABC                   5.0625           1                    5.0625        2.08
Error                 19.5000          8                    2.4375
Total                 92.9375          15
13-5 THE 2^k FACTORIAL DESIGN

There are certain special types of factorial designs that are very useful. One of these is a factorial design with k factors, each at two levels. Because each complete replicate of the design has 2^k runs or treatment combinations, the arrangement is called a 2^k factorial design. These designs have a greatly simplified statistical analysis, and they also form the basis of many other useful designs.
13-5.1 The 2² Design

The simplest type of 2^k design is the 2², that is, two factors, A and B, each at two levels. We usually think of these levels as the "low" and "high" levels of the factor. The 2² design is shown in Fig. 13-13. Note that the design can be represented geometrically as a square, with the 2² = 4 runs forming the corners of the square. A special notation is used to represent the treatment combinations. In general, a treatment combination is represented by a series of lowercase letters. If a letter is present, then the corresponding factor is run at the high level
[Figure 13-13: The 2² factorial design, shown as a square with treatment combinations (1), a, b, and ab at the corners; factor A runs from low (-) to high (+) on the horizontal axis and factor B from low (-) to high (+) on the vertical axis.]
in that treatment combination; if it is absent, the factor is run at its low level. For example, treatment combination a indicates that factor A is at the high level and factor B is at the low level. The treatment combination with both factors at the low level is denoted (1). This notation is used throughout the 2^k design series. For example, the treatment combination in a 2⁴ design with A and C at the high level and B and D at the low level is denoted ac.

The effects of interest in the 2² design are the main effects A and B and the two-factor interaction AB. Let (1), a, b, and ab also represent the totals of all n observations taken at these design points. It is easy to estimate the effects of these factors. To estimate the main effect of A, we would average the observations on the right side of the square, where A is at the high level, and subtract from this the average of the observations on the left side of the square, where A is at the low level, or

A = (a + ab)/(2n) - (b + (1))/(2n) = [a + ab - b - (1)]/(2n).                  (13-32)

Similarly, the main effect of B is found by averaging the observations on the top of the square, where B is at the high level, and subtracting the average of the observations on the bottom of the square, where B is at the low level:

B = (b + ab)/(2n) - (a + (1))/(2n) = [b + ab - a - (1)]/(2n).                  (13-33)

Finally, the AB interaction is estimated by taking the difference in the diagonal averages in Fig. 13-13, or

AB = [ab + (1)]/(2n) - [a + b]/(2n) = [ab + (1) - a - b]/(2n).                 (13-34)
The quantities in brackets in equations 13-32, 13-33, and 13-34 are called contrasts. For example, the A contrast is

Contrast_A = a + ab - b - (1).

In these equations, the contrast coefficients are always either +1 or -1. A table of plus and minus signs, such as Table 13-13, can be used to determine the sign of each treatment combination for a particular contrast. The column headings for Table 13-13 are the main effects A and B, the AB interaction, and I, which represents the total. The row headings are the treatment combinations. Note that the signs in the AB column are the products of signs from
Table 13-13  Signs for Effects in the 2² Design

Treatment                Factorial Effect
Combination       I       A       B       AB
(1)               +       -       -       +
a                 +       +       -       -
b                 +       -       +       -
ab                +       +       +       +
columns A and B. To generate a contrast from this table, multiply the signs in the appropriate column of Table 13-13 by the treatment combinations listed in the rows and add.

To obtain the sums of squares for A, B, and AB, we can use equation 12-18, which expresses the relationship between a single-degree-of-freedom contrast and its sum of squares:

SS = (Contrast)² / [n Σ (contrast coefficients)²].                             (13-35)

Therefore, the sums of squares for A, B, and AB are

SS_A = [a + ab - b - (1)]²/(4n),
SS_B = [b + ab - a - (1)]²/(4n),
SS_AB = [ab + (1) - a - b]²/(4n).

The analysis of variance is completed by computing the total sum of squares SS_T [with 4n - 1 degrees of freedom] as usual and obtaining the error sum of squares SS_E [with 4(n - 1) degrees of freedom] by subtraction.
Example 13-6. An article in the AT&T Technical Journal (March/April 1986, Vol. 65, p. 39) describes the application of two-level experimental designs to integrated circuit manufacturing. A basic processing step in this industry is to grow an epitaxial layer on polished silicon wafers. The wafers are mounted on a susceptor and positioned inside a bell jar. Chemical vapors are introduced through nozzles near the top of the jar. The susceptor is rotated and heat is applied. These conditions are maintained until the epitaxial layer is thick enough. Table 13-14 presents the results of a 2² factorial design with n = 4 replicates using the factors A = deposition time and B = arsenic flow rate. The two levels of deposition time are - = short and + = long, and the two levels of arsenic flow rate are - = 55% and + = 59%. The response variable is epitaxial layer thickness (μm). We may find the estimates of the effects using equations 13-32, 13-33, and 13-34 as follows:

A = (1/(2n))[a + ab - b - (1)] = (1/(2(4)))[59.299 + 59.156 - 55.686 - 56.081] = 0.836,

B = (1/(2n))[b + ab - a - (1)] = (1/(2(4)))[55.686 + 59.156 - 59.299 - 56.081] = -0.067,

AB = (1/(2n))[ab + (1) - a - b] = (1/(2(4)))[59.156 + 56.081 - 59.299 - 55.686] = 0.032.

[Table 13-14: The 2² Design for the Epitaxial Process Experiment; the treatment-combination totals are (1) = 56.081, a = 59.299, b = 55.686, and ab = 59.156.]

The numerical estimates of the effects indicate that the effect of deposition time is large and has a positive direction (increasing deposition time increases thickness), since changing deposition time from low to high changes the mean epitaxial layer thickness by 0.836 μm. The effects of arsenic flow rate (B) and the AB interaction appear small. The magnitude of these effects may be confirmed with the analysis of variance. The sums of
The sums of squares for A, B, and AB are computed using equation 13-35:

SS = (Contrast)^2 / (n * 4),

SS_A = [a + ab - b - (1)]^2 / 16 = [6.688]^2 / 16 = 2.7956,

SS_B = [b + ab - a - (1)]^2 / 16 = [-0.538]^2 / 16 = 0.0181,

SS_AB = [ab + (1) - a - b]^2 / 16 = [0.252]^2 / 16 = 0.0040.
The analysis of variance is summarized in Table 13-15. This confirms our conclusions obtained by examining the magnitude and direction of the effects: deposition time affects epitaxial layer thickness, and from the direction of the effect estimates we know that longer deposition times lead to thicker epitaxial layers.
Table 13-15  Analysis of Variance for the Epitaxial Process Experiment

Source of Variation    Sum of Squares   Degrees of Freedom   Mean Square   F0
A (deposition time)    2.7956           1                    2.7956        134.50
B (arsenic flow)       0.0181           1                    0.0181        0.87
AB                     0.0040           1                    0.0040        0.19
Error                  0.2495           12                   0.0208
Total                  3.0672           15
Residual Analysis  It is easy to obtain the residuals from a 2^k design by fitting a regression model to the data. For the epitaxial process experiment, the regression model is

y = β0 + β1 x1 + ε,

since the only active variable is deposition time, which is represented by x1. The low and high levels of deposition time are assigned the values x1 = -1 and x1 = +1, respectively. The fitted model is

ŷ = 14.389 + (0.836/2) x1,

where the intercept 14.389 is the grand average of all 16 observations and the slope 0.836/2 is one-half the effect estimate for deposition time. The reason the regression coefficient is one-half the effect estimate is that regression coefficients measure the effect of a unit change in x1 on the mean of y, and the effect estimate is based on a two-unit change (from -1 to +1).
This model can be used to obtain the predicted values at the four points in the design. For example, consider the point with low deposition time (x1 = -1) and low arsenic flow rate. The predicted value is

ŷ = 14.389 + (0.836/2)(-1) = 13.971 μm,

and the residuals would be

e1 = 14.037 - 13.971 = 0.066,
e2 = 14.165 - 13.971 = 0.194,
e3 = 13.972 - 13.971 = 0.001,
e4 = 13.907 - 13.971 = -0.064.

It is easy to verify that for low deposition time (x1 = -1) and high arsenic flow rate, ŷ = 14.389 + (0.836/2)(-1) = 13.971 μm, and the corresponding residuals are found in the same way, and that for high deposition time (x1 = +1) and high arsenic flow rate, ŷ = 14.389 + (0.836/2)(+1) = 14.807 μm, they are

e13 = 14.888 - 14.807 = 0.081,
e14 = 14.921 - 14.807 = 0.114,
Chapter 13 Design of Experiments with Several Factors

e15 = 14.415 - 14.807 = -0.392,
e16 = 14.932 - 14.807 = 0.125.

A normal probability plot of these residuals is shown in Fig. 13-14. This plot indicates that one residual, e15 = -0.392, is an outlier. Examining the four runs with high deposition time and high arsenic flow rate reveals that observation y15 = 14.415 is considerably smaller than the other three observations at that treatment combination. This adds some additional evidence to the tentative conclusion that observation 15 is an outlier. Another possibility is that there are some process variables that affect the variability in epitaxial layer thickness, and if we could discover which variables produce this effect, then it might be possible to adjust these variables to levels that would minimize the variability in epitaxial layer thickness. This would have important implications in subsequent manufacturing stages. Figures 13-15 and 13-16 are plots of residuals versus deposition time and arsenic flow rate, respectively. Apart from the unusually large residual associated with y15, there is no strong evidence that either deposition time or arsenic flow rate influences the variability in epitaxial layer thickness. Figure 13-17 shows the estimated standard deviation of epitaxial layer thickness at all four runs in the 2^2 design. These standard deviations were calculated using the data in Table 13-14. Notice that the standard deviation of the four observations with A and B at the high level is considerably larger than the standard deviations at any of the other three design
Figure 13-14 Normal probability plot of residuals for the epitaxial process experiment.
Figure 13-15 Plot of residuals versus deposition time.
Figure 13-16 Plot of residuals versus arsenic flow rate.
Figure 13-17 The estimated standard deviations of epitaxial layer thickness at the four runs in the 2^2 design (s = 0.110 at (1), s = 0.051 at a, s = 0.077 at b, s = 0.250 at ab).
points. Most of this difference is attributable to the unusually low thickness measurement associated with y15. The standard deviation of the four observations with A and B at the low levels is also somewhat larger than the standard deviations at the remaining two runs. This could be an indication that there are other process variables not included in this experiment that affect the variability in epitaxial layer thickness. Another experiment to study this possibility, involving other process variables, could be designed and conducted (indeed, the original paper shows that there are two additional factors, unconsidered in this example, that affect process variability).
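The fitted-model and residual calculations above can be sketched as follows (an illustrative Python fragment, not from the text; only the two runs whose observations are quoted in the text are included).

```python
# Illustrative sketch: residuals for the epitaxial experiment via the fitted
# model y_hat = beta0 + beta1 * x1, with beta0 the grand mean and beta1 = A/2.

beta0, beta1 = 14.389, 0.836 / 2

runs = {
    # x1 level: observations at that run (values quoted in the text)
    -1: [14.037, 14.165, 13.972, 13.907],   # low deposition time, low As flow
    +1: [14.888, 14.921, 14.415, 14.932],   # high deposition time, high As flow
}

for x1, ys in runs.items():
    y_hat = beta0 + beta1 * x1
    residuals = [round(y - y_hat, 3) for y in ys]
    print(f"x1 = {x1:+d}: y_hat = {y_hat:.3f}, residuals = {residuals}")
```

The large negative residual at x1 = +1 corresponds to the outlying observation y15 = 14.415 discussed above.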
13-5.2 The 2^k Design for k ≥ 3 Factors

The methods presented in the previous section for factorial designs with k = 2 factors each at two levels can be easily extended to more than two factors. For example, consider k = 3 factors, each at two levels. This design is a 2^3 factorial design, and it has eight treatment combinations. Geometrically, the design is a cube as shown in Fig. 13-18.
Figure 13-18 The 2^3 design.

The main effects may easily be estimated. For example, the effect of A is found by averaging the four treatment combinations on the right side of the cube in Fig. 13-18, where A is at the high
level, and subtracting from that quantity the average of the four treatment combinations on the left side of the cube, where A is at the low level. This gives

A = (1/4n)[a + ab + ac + abc - b - c - bc - (1)].    (13-36)
In a similar manner, the effect of B is the average difference of the four treatment combinations in the back face of the cube and the four in the front, or

B = (1/4n)[b + ab + bc + abc - a - c - ac - (1)],    (13-37)

and the effect of C is the average difference between the four treatment combinations in the top face of the cube and the four in the bottom, or

C = (1/4n)[c + ac + bc + abc - a - b - ab - (1)].    (13-38)
Now consider the two-factor interaction AB. When C is at the low level, AB is just the average difference in the A effect at the two levels of B, or

AB(C low) = (1/2n)[ab - b] - (1/2n)[a - (1)].

Similarly, when C is at the high level, the AB interaction is

AB(C high) = (1/2n)[abc - bc] - (1/2n)[ac - c].

The AB interaction is just the average of these two components, or

AB = (1/4n)[ab + (1) + abc + c - b - a - bc - ac].    (13-39)

Using a similar approach, we can show that the AC and BC interaction effect estimates are as follows:
AC = (1/4n)[ac + (1) + abc + b - a - c - ab - bc],    (13-40)

BC = (1/4n)[bc + (1) + abc + a - b - c - ab - ac].    (13-41)
The ABC interaction effect is the average difference between the AB interaction at the two levels of C. Thus

ABC = (1/4n){[abc - bc] - [ac - c] - [ab - b] + [a - (1)]}
    = (1/4n)[abc - bc - ac + c - ab + b + a - (1)].    (13-42)
The quantities in brackets in equations 13-36 through 13-42 are contrasts in the eight treatment combinations. These contrasts can be obtained from a table of plus and minus signs for the 2^3 design, shown in Table 13-16. Signs for the main effects (columns A, B, and C) are obtained by associating a plus with the high level of the factor and a minus with the low level. Once the signs for the main effects have been established, the signs for the remaining columns are found by multiplying the appropriate preceding columns row by row. For example, the signs in column AB are the product of the signs in columns A and B. Table 13-16 has several interesting properties.
1. Except for the identity column I, each column has an equal number of plus and minus signs.
2. The sum of products of signs in any two columns is zero; that is, the columns in the table are orthogonal.
3. Multiplying any column by column I leaves the column unchanged; that is, I is an identity element.
4. The product of any two columns yields a column in the table; for example, A x B = AB and AB x ABC = A^2 B^2 C = C, since any column multiplied by itself is the identity column.
The estimate of any main effect or interaction is determined by multiplying the treatment combinations in the first column of the table by the signs in the corresponding main effect or interaction column, adding the result to produce a contrast, and then dividing the contrast by one-half the total number of runs in the experiment. Expressed mathematically,

Effect = Contrast / (n 2^(k-1)).    (13-43)
The sum of squares for any effect is

SS = (Contrast)^2 / (n 2^k).    (13-44)
Table 13-16  Signs for Effects in the 2^3 Design

Treatment                  Factorial Effect
Combination    I    A    B    AB   C    AC   BC   ABC
(1)            +    -    -    +    -    +    +    -
a              +    +    -    -    -    -    +    +
b              +    -    +    -    -    +    -    +
ab             +    +    +    +    -    -    -    -
c              +    -    -    +    +    -    -    +
ac             +    +    -    -    +    +    -    -
bc             +    -    +    -    +    -    +    -
abc            +    +    +    +    +    +    +    +
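To make the construction concrete, here is a short Python sketch (not from the text) that forms the sign of any effect column as the product of main-effect levels and applies equations 13-43 and 13-44. The treatment totals used are those of the surface-roughness example that follows.

```python
from itertools import product

# Illustrative sketch: signs for any effect in a 2^k design (as in Table 13-16),
# plus the effect estimate (eq. 13-43) and sum of squares (eq. 13-44).

k = 3
factors = "ABC"

# Standard-order runs (1), a, b, ab, c, ac, bc, abc: reverse each tuple so
# that factor A varies fastest.
runs = [dict(zip(factors, lv[::-1])) for lv in product([-1, 1], repeat=k)]

def sign(run, effect):
    """Sign of a run in an effect column: the product of its factor levels."""
    s = 1
    for f in effect:
        s *= run[f]
    return s

def effect_and_ss(effect, totals, n):
    contrast = sum(sign(r, effect) * t for r, t in zip(runs, totals))
    return contrast / (n * 2 ** (k - 1)), contrast ** 2 / (n * 2 ** k)

# Treatment totals from the surface-roughness data (Table 13-17), n = 2
totals = [16, 22, 20, 27, 21, 23, 18, 30]
print(effect_and_ss("A", totals, n=2))    # -> (3.375, 45.5625)
print(effect_and_ss("ABC", totals, n=2))  # -> (1.125, 5.0625)
```

Any of the seven effects of the 2^3 design can be obtained the same way by passing its letter string.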
Example 13-7  Consider the surface-roughness experiment described originally in Example 13-5. This is a 2^3 factorial design in the factors feed rate (A), depth of cut (B), and tool angle (C), with n = 2 replicates. Table 13-17 presents the observed surface-roughness data.
The main effects may be estimated using equations 13-36 through 13-42. The effect of A is, for example,

A = (1/4n)[a + ab + ac + abc - b - c - bc - (1)]
  = (1/4(2))[22 + 27 + 23 + 30 - 20 - 21 - 18 - 16]
  = (1/8)[27] = 3.375,

and the sum of squares for A is found using equation 13-44:

SS_A = (Contrast_A)^2 / (n 2^k) = (27)^2 / (2(8)) = 45.5625.

It is easy to verify that the other effects are

B = 1.625,
C = 0.875,
AB = 1.375,
AC = 0.125,
BC = -0.625,
ABC = 1.125.

From examining the magnitude of the effects, clearly feed rate (factor A) is dominant, followed by depth of cut (B) and the AB interaction, although the interaction effect is relatively small. The analysis of variance is summarized in Table 13-18, and it confirms our interpretation of the effect estimates.
Table 13-17  Surface-Roughness Data for Example 13-7

Treatment      Design Factors       Surface
Combination    A     B     C        Roughness    Totals
(1)            -1    -1    -1       9, 7         16
a              +1    -1    -1       10, 12       22
b              -1    +1    -1       9, 11        20
ab             +1    +1    -1       12, 15       27
c              -1    -1    +1       11, 10       21
ac             +1    -1    +1       10, 13       23
bc             -1    +1    +1       10, 8        18
abc            +1    +1    +1       16, 14       30
Table 13-18  Analysis of Variance for the Surface-Finish Experiment

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F0
A                     45.5625          1                    45.5625       18.69
B                     10.5625          1                    10.5625       4.33
C                     3.0625           1                    3.0625        1.26
AB                    7.5625           1                    7.5625        3.10
AC                    0.0625           1                    0.0625        0.03
BC                    1.5625           1                    1.5625        0.64
ABC                   5.0625           1                    5.0625        2.08
Error                 19.5000          8                    2.4375
Total                 92.9375          15
Other Methods for Judging Significance of Effects  The analysis of variance is a formal way to determine which effects are nonzero. There are two other methods that are useful. In the first method, we can calculate the standard errors of the effects and compare the magnitudes of the effects to their standard errors. The second method uses normal probability plots to assess the importance of the effects.

The standard error of an effect is easy to find. If we assume that there are n replicates at each of the 2^k runs in the design, and if y_i1, y_i2, ..., y_in are the observations at the ith run (design point), then

S_i^2 = (1/(n-1)) * sum_{j=1}^{n} (y_ij - ybar_i)^2,    i = 1, 2, ..., 2^k,

is an estimate of the variance at the ith run, where ybar_i = (sum_j y_ij)/n is the sample mean of the n observations. The 2^k variance estimates can be pooled to give an overall variance estimate

S^2 = (1/(2^k (n-1))) * sum_{i=1}^{2^k} sum_{j=1}^{n} (y_ij - ybar_i)^2,    (13-45)

where we have obviously assumed equal variances for each design point. This is also the variance estimate given by the mean square error from the analysis of variance procedure. Each effect estimate has variance given by
V(Effect) = V(Contrast / (n 2^(k-1))) = (1/(n 2^(k-1))^2) V(Contrast).

Each contrast is a linear combination of 2^k treatment totals, and each total consists of n observations. Therefore,

V(Contrast) = n 2^k σ^2,

and the variance of an effect is

V(Effect) = (1/(n 2^(k-1))^2) n 2^k σ^2 = σ^2 / (n 2^(k-2)).    (13-46)
The estimated standard error of an effect would be found by replacing σ^2 with its estimate S^2 and taking the square root of equation 13-46.
To illustrate for the surface-roughness experiment, we find that S^2 = 2.4375, and the standard error of each estimated effect is

s.e.(Effect) = sqrt(S^2 / (n 2^(k-2))) = sqrt(2.4375 / (2 * 2^(3-2))) = 0.78.

Therefore, two standard-deviation limits on the effect estimates are

A: 3.375 ± 1.56,
B: 1.625 ± 1.56,
C: 0.875 ± 1.56,
AB: 1.375 ± 1.56,
AC: 0.125 ± 1.56,
BC: -0.625 ± 1.56,
ABC: 1.125 ± 1.56.

These intervals are approximate 95% confidence intervals. They indicate that the two main effects, A and B, are important, but that the other effects are not, since the intervals for all effects except A and B include zero. Normal probability plots can also be used to judge the significance of effects. We will illustrate that method in the next section.
Projection of 2^k Designs  Any 2^k design will collapse or project into another 2^k design in fewer variables if one or more of the original factors are dropped. Sometimes this can provide additional insight into the remaining factors. For example, consider the surface-roughness experiment. Since factor C and all its interactions are negligible, we could eliminate factor C from the design. The result is to collapse the cube in Fig. 13-18 into a square in the A-B plane; however, each of the four runs in the new design now has four replicates. In general, if we delete h factors so that r = k - h factors remain, the original 2^k design with n replicates will project into a 2^r design with n 2^h replicates.

Residual Analysis  We may obtain the residuals from a 2^k design by using the method demonstrated earlier for the 2^2 design. As an example, consider the surface-roughness experiment. The three largest effects are A, B, and the AB interaction. The regression model used to obtain the predicted values is
y = β0 + β1 x1 + β2 x2 + β12 x1 x2 + ε,

where x1 represents factor A, x2 represents factor B, and x1 x2 represents the AB interaction. The regression coefficients β1, β2, and β12 are estimated by one-half the corresponding effect estimates, and β0 is estimated by the grand average. Thus

ŷ = 11.0625 + (3.375/2) x1 + (1.625/2) x2 + (1.375/2) x1 x2,

and the predicted values would be obtained by substituting the low and high levels of A and B into this equation. To illustrate, at the treatment combination where A, B, and C are all at the low level, the predicted value is

ŷ = 11.0625 + (3.375/2)(-1) + (1.625/2)(-1) + (1.375/2)(-1)(-1) = 9.25.

The observed values at this run are 9 and 7, so the residuals are 9 - 9.25 = -0.25 and 7 - 9.25 = -2.25. Residuals for the other seven runs are obtained similarly. A normal probability plot of the residuals is shown in Fig. 13-19. Since the residuals lie approximately along a straight line, we do not suspect any severe nonnormality in the data. There are no indications of severe outliers. It would also be helpful to plot the residuals versus the predicted values and against each of the factors A, B, and C.

Yates' Algorithm for the 2^k  Instead of using the table of plus and minus signs to obtain the contrasts for the effect estimates and the sums of squares, a simple tabular algorithm devised by Yates can be employed. To use Yates' algorithm, construct a table with the treatment combinations and the corresponding treatment totals recorded in standard order. By standard order, we mean that each factor is introduced one at a time by combining it with all factor levels above it. Thus for a 2^2, the standard order is (1), a, b, ab, while for a 2^3 it is (1), a, b, ab, c, ac, bc, abc, and for a 2^4 it is (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd. Then follow this four-step procedure:
1. Label the adjacent column [1]. Compute the entries in the top half of this column by adding the observations in adjacent pairs. Compute the entries in the bottom half of this column by changing the sign of the first entry in each pair of the original observations and adding the adjacent pairs.
2. Label the adjacent column [2]. Construct column [2] using the entries in column [1]. Follow the same procedure employed to generate column [1]. Continue this process until k columns have been constructed. Column [k] contains the contrasts designated in the rows.
3. Calculate the sums of squares for the effects by squaring the entries in column [k] and dividing by n 2^k.
4. Calculate the effect estimates by dividing the entries in column [k] by n 2^(k-1).
Example 13-8  Consider the surface-roughness experiment in Example 13-7. This is a 2^3 design with n = 2 replicates. The analysis of these data using Yates' algorithm is illustrated in Table 13-19. Note that the sums of squares computed from Yates' algorithm agree with the results obtained in Example 13-7.
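The four-step procedure above can be sketched as a short Python function (illustrative, not from the text); running it on the surface-roughness totals reproduces the effects and sums of squares of Example 13-7.

```python
# Illustrative sketch of Yates' algorithm for a 2^k design. Treatment totals
# must be in standard order: (1), a, b, ab, c, ac, bc, abc, ...

def yates(totals, n):
    k = len(totals).bit_length() - 1
    assert len(totals) == 2 ** k
    col = list(totals)
    for _ in range(k):
        sums = [col[i] + col[i + 1] for i in range(0, len(col), 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, len(col), 2)]
        col = sums + diffs
    # col[0] is the grand total; its "effect" and "SS" entries are not used
    effects = [c / (n * 2 ** (k - 1)) for c in col]
    ss = [c ** 2 / (n * 2 ** k) for c in col]
    return col, effects, ss

# Surface-roughness totals from Table 13-17 (n = 2)
contrasts, effects, ss = yates([16, 22, 20, 27, 21, 23, 18, 30], n=2)
labels = ["Total", "A", "B", "AB", "C", "AC", "BC", "ABC"]
for lab, c, e, s in zip(labels, contrasts, effects, ss):
    print(f"{lab}: contrast={c}, effect={e:.3f}, SS={s:.4f}")
```

The row labels follow the standard order, so the contrast in each row belongs to the effect whose letters match that treatment combination.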
13-5.3 A Single Replicate of the 2^k Design

As the number of factors in a factorial experiment grows, the number of effects that can be estimated grows also. For example, a 2^4 experiment has 4 main effects, 6 two-factor interactions, 4 three-factor interactions, and 1 four-factor interaction, while a 2^6 experiment has 6 main effects, 15 two-factor interactions, 20 three-factor interactions, 15 four-factor interactions, 6 five-factor interactions, and 1 six-factor interaction. In most situations the sparsity of effects principle applies; that is, the system is usually dominated by the main effects and low-order interactions. Three-factor and higher interactions are usually negligible. Therefore, when the number of factors is moderately large, say k ≥ 4 or 5, a common practice is to run only a single replicate of the 2^k design and then pool or combine the higher-order interactions as an estimate of error.
Example 13-9  An article in Solid State Technology ("Orthogonal Design for Process Optimization and its Application in Plasma Etching," May 1987, p. 127) describes the application of factorial designs in developing a nitride etch process on a single-wafer plasma etcher. The process uses C2F6 as the reactant gas. It is possible to vary the gas flow, the power applied to the cathode, the pressure in the reactor chamber, and the spacing between the anode and the cathode (gap). Several response variables would usually be of interest in this process, but in this example we will concentrate on etch rate for silicon nitride. We will use a single replicate of a 2^4 design to investigate this process. Since it is unlikely that the three-factor and four-factor interactions are significant, we will tentatively plan to combine them as an estimate of error. The factor levels used in the design are shown here:

             A           B            C             D
Level        Gap (cm)    Pressure     C2F6 Flow     Power (W)
                         (mTorr)      (SCCM)
Low (-)      0.80        450          125           275
High (+)     1.20        550          200           325
Table 13-20 presents the data from the 16 runs of the 2^4 design. Table 13-21 is the table of plus and minus signs for the 2^4 design. The signs in the columns of this table can be used to estimate the factor effects. To illustrate, the estimate of factor A is

A = (1/8)[a + ab + ac + abc + ad + abd + acd + abcd - (1) - b - c - bc - d - bd - cd - bcd]
  = (1/8)[669 + 650 + 642 + 635 + 749 + 868 + 860 + 729 - 550 - 604 - 633 - 601 - 1037 - 1052 - 1075 - 1063]
  = -101.625.

Thus the effect of increasing the gap between the anode and the cathode from 0.80 cm to 1.20 cm is to decrease the etch rate by 101.625 Å/min. It is easy to verify that the complete set of effect estimates is
A = -101.625,     B = -1.625,       C = 7.375,       D = 306.125,
AB = -7.875,      AC = -24.875,     AD = -153.625,
BC = -43.875,     BD = -0.625,      CD = -2.125,
ABC = -15.625,    ABD = 4.125,      ACD = 5.625,     BCD = -25.375,
ABCD = -40.125.
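As an illustrative check (Python, not part of the original text), the complete set of effect estimates can be generated from sign products applied to the single-replicate data of Table 13-20.

```python
from itertools import product

# Illustrative sketch: effect estimates from a single replicate (n = 1) of a
# 2^4 design, using the plasma etch data of Table 13-20 in standard order.

k = 4
factors = "ABCD"
etch = [550, 669, 604, 650, 633, 642, 601, 635,
        1037, 749, 1052, 868, 1075, 860, 1063, 729]
runs = [dict(zip(factors, lv[::-1])) for lv in product([-1, 1], repeat=k)]

def estimate(effect):
    contrast = 0
    for run, y in zip(runs, etch):
        s = 1
        for f in effect:
            s *= run[f]
        contrast += s * y
    return contrast / 2 ** (k - 1)   # effect = contrast / (n 2^(k-1)), n = 1

for e in ["A", "B", "C", "D", "AB", "AC", "AD", "BC", "BD", "CD",
          "ABC", "ABD", "ACD", "BCD", "ABCD"]:
    print(e, estimate(e))
# A = -101.625 and D = 306.125 dominate, along with AD = -153.625
```

With a single replicate there is no pure-error estimate, which is why the higher-order interactions are pooled as error in the analysis of variance.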
A very helpful method in judging the significance of factors in a 2^k experiment is to construct a normal probability plot of the effect estimates. If none of the effects is
Table 13-20  The 2^4 Design for the Plasma Etch Experiment

A (Gap)   B (Pressure)   C (C2F6 Flow)   D (Power)   Etch Rate (Å/min)
-1        -1             -1              -1          550
+1        -1             -1              -1          669
-1        +1             -1              -1          604
+1        +1             -1              -1          650
-1        -1             +1              -1          633
+1        -1             +1              -1          642
-1        +1             +1              -1          601
+1        +1             +1              -1          635
-1        -1             -1              +1          1037
+1        -1             -1              +1          749
-1        +1             -1              +1          1052
+1        +1             -1              +1          868
-1        -1             +1              +1          1075
+1        -1             +1              +1          860
-1        +1             +1              +1          1063
+1        +1             +1              +1          729
Table 13-21  Contrast Constants for the 2^4 Design

       A  B  AB C  AC BC ABC D  AD BD ABD CD ACD BCD ABCD
(1)    -  -  +  -  +  +  -   -  +  +  -   +  -   -   +
a      +  -  -  -  -  +  +   -  -  +  +   +  +   -   -
b      -  +  -  -  +  -  +   -  +  -  +   +  -   +   -
ab     +  +  +  -  -  -  -   -  -  -  -   +  +   +   +
c      -  -  +  +  -  -  +   -  +  +  -   -  +   +   -
ac     +  -  -  +  +  -  -   -  -  +  +   -  -   +   +
bc     -  +  -  +  -  +  -   -  +  -  +   -  +   -   +
abc    +  +  +  +  +  +  +   -  -  -  -   -  -   -   -
d      -  -  +  -  +  +  -   +  -  -  +   -  +   +   -
ad     +  -  -  -  -  +  +   +  +  -  -   -  -   +   +
bd     -  +  -  -  +  -  +   +  -  +  -   -  +   -   +
abd    +  +  +  -  -  -  -   +  +  +  +   -  -   -   -
cd     -  -  +  +  -  -  +   +  -  -  +   +  -   -   +
acd    +  -  -  +  +  -  -   +  +  -  -   +  +   -   -
bcd    -  +  -  +  -  +  -   +  -  +  -   +  -   +   -
abcd   +  +  +  +  +  +  +   +  +  +  +   +  +   +   +
significant, then the estimates will behave like a random sample drawn from a normal distribution with zero mean, and the plotted effects will lie approximately along a straight line. Those effects that do not plot on the line are significant factors. The normal probability plot of effect estimates from the plasma etch experiment is shown in Fig. 13-20. Clearly the main effects of A and D and the AD interaction are significant, as they fall far from the line passing through the other points. The analysis of variance summarized in Table 13-22 confirms these findings. Notice that in the analysis of variance we have pooled the three- and four-factor interactions to form the error mean square. If the normal probability plot had indicated that any of these interactions were important, they then should not be included in the error term.

Since A = -101.625, the effect of increasing the gap between the cathode and anode is to decrease the etch rate. However, D = 306.125, so applying higher power levels will increase the etch rate. Figure 13-21 is a plot of the AD interaction. This plot indicates that the effect of changing the gap width at low power settings is small, but that increasing the gap width at high power settings produces a large decrease in the average etch rate.
Figure 13-20 Normal probability plot of effects from the plasma etch experiment.

The residuals may be obtained from the regression model containing the significant effects,

ŷ = 776.0625 - (101.625/2) x1 + (306.125/2) x4 - (153.625/2) x1 x4,

where x1 represents factor A, x4 represents factor D, and the intercept 776.0625 is the grand average of all 16 observations.
For example, when A and D are both at the low level, the predicted value is

ŷ = 776.0625 - (101.625/2)(-1) + (306.125/2)(-1) - (153.625/2)(-1)(-1) = 597,

and the four residuals at this treatment combination are

e1 = 550 - 597 = -47,
e2 = 604 - 597 = 7,
Figure 13-21 AD interaction from the plasma etch experiment (etch rate versus gap A at D low = 275 W and D high = 325 W).
Figure 13-22 Normal probability plot of residuals from the plasma etch experiment.
e3 = 633 - 597 = 36,
e4 = 601 - 597 = 4.

The residuals at the other three treatment combinations (A high, D low), (A low, D high), and (A high, D high) are obtained similarly. A normal probability plot of the residuals is shown in Fig. 13-22. The plot is satisfactory.
13-6 CONFOUNDING IN THE 2^k DESIGN

It is often impossible to run a complete replicate of a factorial design under homogeneous experimental conditions. Confounding is a design technique for running a factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one complete replicate. The technique causes certain interaction effects to be indistinguishable from, or confounded with, blocks. We will illustrate confounding in the 2^k factorial design in 2^p blocks, where p < k.

Consider a 2^2 design. Suppose that each of the 2^2 = 4 treatment combinations requires four hours of laboratory analysis. Thus, two days are required to perform the experiment. If days are considered as blocks, then we must assign two of the four treatment combinations to each day. Consider the design shown in Fig. 13-23. Notice that block 1 contains the treatment combinations (1) and ab, and that block 2 contains a and b. The contrasts for estimating the main effects A and B are

Contrast_A = ab + a - b - (1),
Contrast_B = ab + b - a - (1).

Note that these contrasts are unaffected by blocking, since in each contrast there is one plus and one minus treatment combination from each block. That is, any difference between block 1 and block 2 will cancel out. The contrast for the AB interaction is

Contrast_AB = ab + (1) - a - b.
Since the two treatment combinations with the plus sign, ab and (1), are in block 1 and the two with the minus sign, a and b, are in block 2, the block effect and the AB interaction are identical. That is, AB is confounded with blocks.
Figure 13-23 The 2^2 design in two blocks: block 1 contains (1) and ab; block 2 contains a and b.

Figure 13-24 The 2^3 design in two blocks, ABC confounded: block 1 contains (1), ab, ac, bc; block 2 contains a, b, c, abc.
The reason for this is apparent from the table of plus and minus signs for the 2^2 design (Table 13-13). From this table, we see that all treatment combinations that have a plus sign on AB are assigned to block 1, while all treatment combinations that have a minus sign on AB are assigned to block 2. This scheme can be used to confound any 2^k design in two blocks. As a second example, consider a 2^3 design run in two blocks. Suppose we wish to confound the three-factor interaction ABC with blocks. From the table of plus and minus signs for the 2^3 design (Table 13-16), we assign the treatment combinations that are minus on ABC to block 1 and those that are plus on ABC to block 2. The resulting design is shown in Fig. 13-24.

There is a more general method of constructing the blocks. The method employs a defining contrast, say

L = α1 x1 + α2 x2 + ... + αk xk,    (13-47)

where xi is the level of the ith factor appearing in a treatment combination and αi is the exponent appearing on the ith factor in the effect to be confounded. For the 2^k system, we have either αi = 0 or 1, and either xi = 0 (low level) or xi = 1 (high level). Treatment combinations that produce the same value of L (mod 2) will be placed in the same block. Since the only possible values of L (mod 2) are 0 and 1, this will assign the 2^k treatment combinations to exactly two blocks. As an example, consider a 2^3 design with ABC confounded with blocks. Here x1 corresponds to A, x2 to B, x3 to C, and α1 = α2 = α3 = 1. Thus, the defining contrast for ABC is
L = x1 + x2 + x3.

To assign the treatment combinations to the two blocks, we substitute the treatment combinations into the defining contrast as follows:

(1):  L = 1(0) + 1(0) + 1(0) = 0 = 0 (mod 2),
a:    L = 1(1) + 1(0) + 1(0) = 1 = 1 (mod 2),
b:    L = 1(0) + 1(1) + 1(0) = 1 = 1 (mod 2),
ab:   L = 1(1) + 1(1) + 1(0) = 2 = 0 (mod 2),
c:    L = 1(0) + 1(0) + 1(1) = 1 = 1 (mod 2),
ac:   L = 1(1) + 1(0) + 1(1) = 2 = 0 (mod 2),
bc:   L = 1(0) + 1(1) + 1(1) = 2 = 0 (mod 2),
abc:  L = 1(1) + 1(1) + 1(1) = 3 = 1 (mod 2).

Therefore, (1), ab, ac, and bc are run in block 1, and a, b, c, and abc are run in block 2. This is the same design shown in Fig. 13-24.
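The substitution above can be sketched mechanically (an illustrative Python fragment, not from the text; the factor letters are placeholders).

```python
from itertools import product

# Illustrative sketch: assign the 2^k treatment combinations to two blocks
# using the defining contrast L = sum(alpha_i * x_i) (mod 2) for a chosen
# confounded effect.

def two_blocks(k, confounded, factors="ABCDEFG"):
    alpha = [1 if factors[i] in confounded else 0 for i in range(k)]
    block = {0: [], 1: []}
    for x in product([0, 1], repeat=k):           # x_i = 0 (low) or 1 (high)
        L = sum(a * xi for a, xi in zip(alpha, x)) % 2
        label = "".join(f.lower() for f, xi in zip(factors, x) if xi) or "(1)"
        block[L].append(label)
    return block

b = two_blocks(3, "ABC")
print(sorted(b[0]), sorted(b[1]))
# block 0 is the principal block {(1), ab, ac, bc}; block 1 is {a, b, c, abc}
```

The same function confounds any single effect, e.g. two_blocks(2, "AB") reproduces the 2^2 blocking of Fig. 13-23.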
A shortcut method is useful in constructing these designs. The block containing the treatment combination (1) is called the principal block. Any element [except (1)] in the principal block may be generated by multiplying two other elements in the principal block modulus 2. For example, consider the principal block of the 2^3 design with ABC confounded, shown in Fig. 13-24. Note that

ab * ac = a^2 bc = bc,
ab * bc = ab^2 c = ac,
ac * bc = abc^2 = ab.

Treatment combinations in the other block (or blocks) may be generated by multiplying one element in the new block by each element in the principal block modulus 2. For the 2^3 with ABC confounded, since the principal block is (1), ab, ac, and bc, we know that b is in the other block. Thus, the elements of this second block are

b * (1) = b,
b * ab = ab^2 = a,
b * ac = abc,
b * bc = b^2 c = c.
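This multiplication modulo 2 is just a symmetric difference of the factor-letter sets, which the following illustrative sketch (not from the text) makes explicit.

```python
# Illustrative sketch: multiplication of treatment combinations modulus 2.
# A treatment combination is the set of factors at the high level; any
# squared letter cancels, so the product is a symmetric difference of sets.

def mult(t1, t2):
    letters = set(t1.replace("(1)", "")) ^ set(t2.replace("(1)", ""))
    return "".join(sorted(letters)) or "(1)"

# Principal block of the 2^3 design with ABC confounded (Fig. 13-24)
principal = ["(1)", "ab", "ac", "bc"]
assert mult("ab", "ac") == "bc" and mult("ab", "bc") == "ac"

# Generate the second block by multiplying b into each principal-block element
print([mult("b", t) for t in principal])   # -> ['b', 'a', 'abc', 'c']
```

The principal block is closed under this multiplication, which is why it forms a group and the shortcut works.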
Example 13-10  An experiment is performed to investigate the effects of four factors on the terminal miss distance of a shoulder-fired ground-to-air missile. The four factors are target type (A), seeker type (B), target altitude (C), and target range (D). Each factor may be conveniently run at two levels, and the optical tracking system will allow terminal miss distance to be measured to the nearest foot. Two different gunners are used in the flight test, and since there may be differences between individuals, it was decided to conduct the 2^4 design in two blocks with ABCD confounded. Thus, the defining contrast is

L = x1 + x2 + x3 + x4.

The experimental design and the resulting data are

Block 1: (1) = 3, ab = 7, ac = 6, bc = 8, ad = 10, bd = 4, cd = 8, abcd = 9
Block 2: a = 7, b = 5, c = 6, d = 4, abc = 6, bcd = 7, acd = 9, abd = 12

The analysis of the design by Yates' algorithm is shown in Table 13-23. A normal probability plot of the effects would reveal A (target type), D (target range), and AD to have large effects. A confirming analysis of variance, using three-factor interactions as error, is shown in Table 13-24.
It is possible to confound the 2^k design in four blocks of 2^(k-2) observations each. To construct the design, two effects are chosen to confound with blocks and their defining contrasts obtained. A third effect, the generalized interaction of the two initially chosen, is also
Table 13-23  Yates' Algorithm for the 2^4 Design in Example 13-10

confounded with blocks. The generalized interaction of two effects is found by multiplying their respective columns. For example, consider the 2^4 design in four blocks. If AC and BD are confounded with blocks, their generalized interaction is (AC)(BD) = ABCD. The design is constructed by using the defining contrasts for AC and BD:

L1 = x1 + x3,
L2 = x2 + x4.
It is easy to verify that the four blocks are

Block 1 (L1 = 0, L2 = 0): (1), ac, bd, abcd
Block 2 (L1 = 1, L2 = 0): a, c, abd, bcd
Block 3 (L1 = 0, L2 = 1): b, d, abc, acd
Block 4 (L1 = 1, L2 = 1): ab, bc, ad, cd
This general procedure can be extended to confounding the 2^k design in 2^p blocks, where p < k. Select p effects to be confounded, such that no effect chosen is a generalized interaction of the others. The blocks can be constructed from the p defining contrasts L1, L2, ..., Lp associated with these effects. In addition, exactly 2^p - p - 1 other effects are confounded with blocks, these being the generalized interactions of the original p effects chosen. Care should be taken so as not to confound effects of potential interest. For more information on confounding, refer to Montgomery (2001, Chapter 7). That book contains guidelines for selecting factors to confound with blocks so that main effects and low-order interactions are not confounded. In particular, the book contains a table of suggested confounding schemes for designs with up to seven factors and a range of block sizes, some as small as two runs.
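The multi-block construction generalizes the two-block sketch: a run's block is the vector of its p defining-contrast values mod 2. The following illustrative Python fragment (not from the text) reproduces the four blocks above.

```python
from itertools import product

# Illustrative sketch: confound a 2^k design in 2^p blocks from p defining
# contrasts. A run's block is the tuple (L_1 mod 2, ..., L_p mod 2); the
# generalized interactions of the chosen effects are confounded automatically.

def confound_blocks(k, effects, factors="ABCDEFG"):
    assignment = {}
    for x in product([0, 1], repeat=k):
        key = tuple(sum(x[factors.index(f)] for f in e) % 2 for e in effects)
        label = "".join(f.lower() for f, xi in zip(factors, x) if xi) or "(1)"
        assignment.setdefault(key, []).append(label)
    return assignment

# The 2^4 design in four blocks with AC and BD (and hence ABCD) confounded
for key, block in sorted(confound_blocks(4, ["AC", "BD"]).items()):
    print(key, sorted(block))
```

The block keyed (0, 0) is the principal block, containing (1), ac, bd, and abcd.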
13-7 FRACTIONAL REPLICATION OF THE 2^k DESIGN

As the number of factors in a 2^k design increases, the number of runs required increases rapidly. For example, a 2^5 requires 32 runs. In this design, only 5 degrees of freedom correspond to main effects and 10 degrees of freedom correspond to two-factor interactions. If we can assume that certain high-order interactions are negligible, then a fractional factorial design involving fewer than the complete set of 2^k runs can be used to obtain information on the main effects and low-order interactions. In this section, we will introduce fractional replication of the 2^k design. For a more complete treatment, see Montgomery (2001, Chapter 8).

13-7.1 The One-Half Fraction of the 2^k Design

A one-half fraction of the 2^k design contains 2^(k-1) runs and is often called a 2^(k-1) fractional factorial design. As an example, consider the 2^(3-1) design; that is, a one-half fraction of the 2^3. The table of plus and minus signs for the 2^3 design is shown in Table 13-25. Suppose we select the four treatment combinations a, b, c, and abc as our one-half fraction. These

Table 13-25  Plus and Minus Signs for the 2^3 Factorial Design
Treatment                      Factorial Effect
Combinations    I    A    B    C    AB   AC   BC   ABC
a               +    +    -    -    -    -    +    +
b               +    -    +    -    -    +    -    +
c               +    -    -    +    +    -    -    +
abc             +    +    +    +    +    +    +    +
ab              +    +    +    -    +    -    -    -
ac              +    +    -    +    -    +    -    -
bc              +    -    +    +    -    -    +    -
(1)             +    -    -    -    +    +    +    -
treatment combinations are shown in the top half of Table 13-25. We will use both the conventional notation (a, b, c, ...) and the plus and minus notation for the treatment combinations. The equivalence between the two notations is as follows:

Notation 1    Notation 2
a             + - -
b             - + -
c             - - +
abc           + + +
Notice that the 2^(3-1) design is formed by selecting only those treatment combinations that yield a plus on the ABC effect. Thus ABC is called the generator of this particular fraction. Furthermore, the identity element I is also plus for the four runs, so we call

I = ABC

the defining relation for the design. The treatment combinations in the 2^(3-1) design yield three degrees of freedom associated with the main effects. From Table 13-25, we obtain the estimates of the main effects as

A = (1/2)[a - b - c + abc],
B = (1/2)[-a + b - c + abc],
C = (1/2)[-a - b + c + abc].
It is also easy to verify that the estimates of the two-factor interactions are

BC = (1/2)[a - b - c + abc],
AC = (1/2)[-a + b - c + abc],
AB = (1/2)[-a - b + c + abc].
Thus, the linear combination of observations in column A, say l_A, estimates A + BC. Similarly, l_B estimates B + AC, and l_C estimates C + AB. Two or more effects that have this property are called aliases. In our 2^(3-1) design, A and BC are aliases, B and AC are aliases, and C and AB are aliases. Aliasing is the direct result of fractional replication. In many practical situations, it will be possible to select the fraction so that the main effects and low-order interactions of interest will be aliased with high-order interactions (which are probably negligible). The alias structure for this design is found by using the defining relation I = ABC. Multiplying any effect by the defining relation yields the aliases for that effect. In our example, the alias of A is
A = A · ABC = A^2 BC = BC,

since A · I = A and A^2 = I. The aliases of B and C are

B = B · ABC = AB^2 C = AC
Chapter 13 Design of Experiments with Several Factors
and

C = C · ABC = ABC^2 = AB.
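The "multiply by the defining relation and reduce exponents mod 2" rule amounts to taking the symmetric difference of the two effects' letter sets. A small illustrative helper (not from the text) makes this concrete:

```python
# Multiplying two "words" (effects) with exponents reduced mod 2 is the
# symmetric difference of their letter sets: A * ABC = A^2 BC = BC.
# The function name is illustrative, not from the text.

def mult(w1, w2):
    letters = set(w1) ^ set(w2)             # symmetric difference
    return "".join(sorted(letters)) or "I"  # empty product = identity

# Aliases in the 2^(3-1) design with defining relation I = ABC:
pairs = {e: mult(e, "ABC") for e in "ABC"}
# pairs is {"A": "BC", "B": "AC", "C": "AB"}, matching the text.
```

The same helper reproduces any alias in the chapter, e.g. `mult("AB", "ABCD")` gives `"CD"` for the 2^(4-1) design discussed below.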
Now suppose that we had chosen the other one-half fraction, that is, the treatment combinations in Table 13-25 associated with minus on ABC. The defining relation for this design is I = -ABC. The aliases are A = -BC, B = -AC, and C = -AB. Thus the estimates of A, B, and C with this fraction really estimate A - BC, B - AC, and C - AB. In practice, it usually does not matter which one-half fraction we select. The fraction with the plus sign in the defining relation is usually called the principal fraction, and the other fraction is usually called the alternate fraction.

Sometimes we use sequences of fractional factorial designs to estimate effects. For example, suppose we had run the principal fraction of the 2^(3-1) design. From this design we have the following effect estimates:

l_A = A + BC,
l_B = B + AC,
l_C = C + AB.

Suppose that we are willing to assume at this point that the two-factor interactions are negligible. If they are, then the 2^(3-1) design has produced estimates of the three main effects A, B, and C. However, if after running the principal fraction we are uncertain about the interactions, it is possible to estimate them by running the alternate fraction. The alternate fraction produces the following effect estimates:

l'_A = A - BC,
l'_B = B - AC,
l'_C = C - AB.

If we combine the estimates from the two fractions, we obtain the following:
Effect i    (1/2)(l_i + l'_i)                   (1/2)(l_i - l'_i)
i = A       (1/2)(A + BC + A - BC) = A          (1/2)[A + BC - (A - BC)] = BC
i = B       (1/2)(B + AC + B - AC) = B          (1/2)[B + AC - (B - AC)] = AC
i = C       (1/2)(C + AB + C - AB) = C          (1/2)[C + AB - (C - AB)] = AB
Thus by combining a sequence of two fractional factorial designs we can isolate both the main effects and the two-factor interactions. This property makes the fractional factorial design highly useful in experimental problems, as we can run sequences of small, efficient experiments, combine information across several experiments, and take advantage of learning about the process we are experimenting with as we go along.

A 2^(k-1) design may be constructed by writing down the treatment combinations for a full factorial with k - 1 factors and then adding the kth factor by identifying its plus and minus levels with the plus and minus signs of the highest-order interaction ±ABC···(K - 1). Therefore, a 2^(3-1) fractional factorial is obtained by writing down the full 2^2 factorial and then equating factor C to the ±AB interaction. Thus, to obtain the principal fraction, we would use C = +AB as follows:
Full 2^2            2^(3-1), I = ABC
A     B             A     B     C = AB
-     -             -     -     +
+     -             +     -     -
-     +             -     +     -
+     +             +     +     +
To obtain the alternate fraction we would equate the last column to C = -AB.
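The construction just described generates both half fractions directly. A minimal sketch (helper name illustrative, not from the text):

```python
from itertools import product

# Construct the two one-half fractions of the 2^3 design by equating
# C to +AB (principal fraction) or -AB (alternate fraction).

def half_fraction(sign):
    runs = []
    for a, b in product((-1, 1), repeat=2):      # full 2^2 in A and B
        c = sign * a * b                         # C = +AB or C = -AB
        label = "".join(n for n, v in zip("abc", (a, b, c)) if v == 1)
        runs.append(label or "(1)")
    return runs

principal = half_fraction(+1)    # the four runs with ABC = +
alternate = half_fraction(-1)    # the four runs with ABC = -
```

As in Table 13-25, the principal fraction consists of a, b, c, abc, and the alternate fraction of (1), ab, ac, bc.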
Example 13-11

To illustrate the use of a one-half fraction, consider the plasma etch experiment described in Example 13-9. Suppose that we decide to use a 2^(4-1) design with I = ABCD to investigate the four factors gap (A), pressure (B), C2F6 flow (C), and power setting (D). The design is shown in Table 13-26. The alias of A is
A · I = A · ABCD = A^2 BCD = BCD,

and similarly

B = ACD,  C = ABD,  D = ABC.

The two-factor interactions are aliased with each other. For example, the alias of AB is CD:

AB · I = AB · ABCD = A^2 B^2 CD = CD.

The other aliases are
Table 13-26 The 2^(4-1) Design with Defining Relation I = ABCD

A    B    C    D = ABC    Treatment Combination    Etch Rate
-    -    -    -          (1)                      550
+    -    -    +          ad                       749
-    +    -    +          bd                       1052
+    +    -    -          ab                       650
-    -    +    +          cd                       1075
+    -    +    -          ac                       642
-    +    +    -          bc                       601
+    +    +    +          abcd                     729
AC = BD,  AD = BC.
The estimates of the main effects and their aliases are found using the four columns of signs in Table 13-26. For example, from column A we obtain

l_A = A + BCD = (1/4)(-550 + 749 - 1052 + 650 - 1075 + 642 - 601 + 729) = -127.00.

The other columns give l_B = B + ACD = 4.00, l_C = C + ABD = 11.50, and l_D = D + ABC = 290.50. Clearly l_A and l_D are large, and if we believe that the three-factor interactions are negligible, then the main effects A (gap) and D (power setting) significantly affect etch rate.

The interactions are estimated by forming the AB, AC, and AD columns and adding them to the table. The signs in the AB column are +, -, -, +, +, -, -, +, and this column produces the estimate
t" =AB + CD =-j:(550-749-1052 + 650 + 1075 -
642- 601 + 729)
=-10.00. From theAC and AD columns we find e;,c=AC+BD ""- 25.50, £Ao=AD+BC=- :97.50. The tAD estimate is large; the most straightforward interptetacion of the results is that this is the AD inreractio!].. Thus, the results obtained from the 2d ~ I design agree with the full factorial results in
Example 13.-9.
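The contrast arithmetic in this example is easy to reproduce from the etch-rate data in Table 13-26. The sketch below (function name illustrative) computes the sign of any effect column for a run directly from the run's label, using the empty string for treatment combination (1):

```python
# Recompute the alias-pair estimates for the plasma etch 2^(4-1) design
# from the etch rates in Table 13-26.  The key "" stands for run (1).

rates = {"": 550, "ad": 749, "bd": 1052, "ab": 650,
         "cd": 1075, "ac": 642, "bc": 601, "abcd": 729}

def contrast(effect):
    """Estimate for the column of 'effect', e.g. 'a' or 'ad'."""
    total = 0
    for run, y in rates.items():
        sign = 1
        for f in effect:                 # + if the factor is high in this run
            sign *= 1 if f in run else -1
        total += sign * y
    return total / (len(rates) / 2)      # divide by n/2 = 4

l_A = contrast("a")     # estimates A + BCD
l_D = contrast("d")     # estimates D + ABC
l_AD = contrast("ad")   # estimates AD + BC
```

The results match the text: l_A = -127.00, l_D = 290.50, l_AD = -197.50, and `contrast("ab")` gives -10.00.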
Normal Probability Plots and Residuals

The normal probability plot is very useful in assessing the significance of effects from a fractional factorial, particularly when there are many effects to be estimated. Residuals can be obtained from a fractional factorial by the regression model method shown previously. These residuals should be plotted against the predicted values, against the levels of the factors, and on normal probability paper, as we have discussed before, both to assess the validity of the underlying model assumptions and to gain additional insight into the experimental situation.
Projection of the 2^(k-1) Design

If one or more factors from a one-half fraction of a 2^k can be dropped, the design will project into a full factorial design. For example, Fig. 13-25 presents a 2^(3-1) design. Notice that this design will project into a full factorial in any two of the three original factors. Thus, if we think that at most two of the three factors are important, the 2^(3-1) design is an excellent design for identifying the significant factors. Experiments conducted to identify a relatively few significant factors from a larger number of factors are sometimes called screening experiments. This projection property is highly useful in factor screening, as it allows negligible factors to be eliminated, resulting in a stronger experiment in the active factors that remain.
Figure 13-25 Projection of a 2^(3-1) design into three 2^2 designs.
In the 2^(4-1) design used in the plasma etch experiment in Example 13-11, we found that two of the four factors (B and C) could be dropped. If we eliminate these two factors, the remaining columns in Table 13-26 form a 2^2 design in the factors A and D, with two replicates. This design is shown in Fig. 13-26.

Design Resolution

The concept of design resolution is a useful way to catalog fractional factorial designs according to the alias patterns they produce. Designs of resolution III, IV, and V are particularly important. The definitions of these terms and an example of each follow:

1. Resolution III Designs. These are designs in which no main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions, and two-factor interactions may be aliased with each other. The 2^(3-1) design with I = ABC is of resolution III. We usually employ a subscript Roman numeral to indicate design resolution; thus this one-half fraction is a 2_III^(3-1) design.
Figure 13-26 The 2^2 design obtained by dropping factors B and C from the plasma etch experiment. The replicate etch rates are (550, 601) at low A, low D; (650, 642) at high A, low D; (1052, 1075) at low A, high D; and (749, 729) at high A, high D.

2. Resolution IV Designs. These are designs in which no main effect is aliased with any other main effect or two-factor interaction, but two-factor interactions are
aliased "ith each other, The 2'-1 design Mth I =ABCD used io Example 13-11 is of resolution N (21; 1), 3. Resolution V Designs. These are designs in which no main effect Or two-factor interaction is aliased with any other main effect or two-factor interaction, but two~ factor interactions are aliased with three~factor interactions. A 25 -! design with I ~ ABCDEis ofresolution V (2',,1).
Resolution III and IV designs are particularly useful in factor screening experiments. A resolution IV design provides very good information about main effects and will provide some information about two-factor interactions.
13-7.2 Smaller Fractions: The 2^(k-p) Fractional Factorial

Although the 2^(k-1) design is valuable in reducing the number of runs required for an experiment, we frequently find that smaller fractions will provide almost as much useful information at even greater economy. In general, a 2^k design may be run in a 1/2^p fraction called a 2^(k-p) fractional factorial design. Thus, a 1/4 fraction is called a 2^(k-2) fractional factorial design, a 1/8 fraction is called a 2^(k-3) design, and so on.

To illustrate a 1/4 fraction, consider an experiment with six factors and suppose that the engineer is interested primarily in main effects but would also like to get some information about the two-factor interactions. A 2^(6-1) design would require 32 runs and would have 31 degrees of freedom for estimation of effects. Since there are only six main effects and 15 two-factor interactions, the one-half fraction is inefficient; it requires too many runs. Suppose we consider a 1/4 fraction, or a 2^(6-2) design. This design contains 16 runs and, with 15 degrees of freedom, will allow estimation of all six main effects with some capability for examination of the two-factor interactions. To generate this design we would write down a 2^4 design in the factors A, B, C, and D, and then add two columns for E and F. To find the new columns we would select the two design generators I = ABCE and I = ACDF. Thus column E would be found from E = ABC and column F would be F = ACD, and columns ABCE and ACDF are equal to the identity column. However, we know that the product of any two columns in the table of plus and minus signs for a 2^k is just another column in the table; therefore, the product of ABCE and ACDF, or ABCE(ACDF) = A^2 BC^2 DEF = BDEF, is also an identity column. Consequently, the complete defining relation for the 2^(6-2) design is

I = ABCE = ACDF = BDEF.
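The generator-based construction can be verified in a few lines: build the 16 runs from E = ABC and F = ACD, then confirm that the generalized interaction BDEF is indeed an identity column, so that I = ABCE = ACDF = BDEF. A minimal sketch, not from the text:

```python
from itertools import product

# Build the 2^(6-2) design from its generators E = ABC and F = ACD,
# then check that the generalized interaction BDEF is also an
# identity column.

runs = []
for a, b, c, d in product((-1, 1), repeat=4):   # full 2^4 basic design
    e = a * b * c                                # from I = ABCE
    f = a * c * d                                # from I = ACDF
    runs.append((a, b, c, d, e, f))

# Products of the B, D, E, F columns, run by run.
bdef = [b * d * e * f for (a, b, c, d, e, f) in runs]
```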
To find the alias of any effect, simply multiply the effect by each word in the foregoing defining relation. The complete alias structure is

A = BCE = CDF = ABDEF,
B = ACE = DEF = ABCDF,
C = ABE = ADF = BCDEF,
D = ACF = BEF = ABCDE,
E = ABC = BDF = ACDEF,
F = ACD = BDE = ABCEF,
AB = CE = ADEF = BCDF,
AC = BE = DF = ABCDEF,
AD = CF = ABEF = BCDE,
AE = BC = ABDF = CDEF,
AF = CD = ABDE = BCEF,
BD = EF = ABCF = ACDE,
BF = DE = ABCD = ACEF,
ABD = CDE = BCF = AEF,
ABF = CEF = BCD = ADE.

Notice that this is a resolution IV design; main effects are aliased with three-factor and higher interactions, and two-factor interactions are aliased with each other. This design would provide very good information on the main effects and give some idea about the strength of the two-factor interactions. For example, if the AD interaction appears significant, either AD and/or CF could be significant. If A and/or D are significant main effects, but C and F are not, the experimenter may reasonably and tentatively attribute the significance to the AD interaction. The construction of the design is shown in Table 13-27.

The same principles can be applied to obtain even smaller fractions. Suppose we wish to investigate seven factors in 16 runs. This is a 2^(7-3) design (a 1/8 fraction). This design is constructed by writing down a 2^4 design in the factors A, B, C, and D and then adding three new columns. Reasonable choices for the three generators required are I = ABCE, I = BCDF, and I = ACDG. Therefore, the new columns are formed by setting E = ABC, F = BCD, and G = ACD. The complete defining relation is found by multiplying the generators together two at a time and then three at a time, resulting in
I = ABCE = BCDF = ACDG = ADEF = BDEG = ABFG = CEFG.

Notice that every main effect in this design will be aliased with three-factor and higher interactions and that two-factor interactions will be aliased with each other. Thus this is a resolution IV design.

Table 13-27 Construction of the 2^(6-2) Design with Generators I = ABCE and I = ACDF

A    B    C    D    E = ABC    F = ACD    Treatment Combination
-    -    -    -    -          -          (1)
+    -    -    -    +          +          aef
-    +    -    -    +          -          be
+    +    -    -    -          +          abf
-    -    +    -    +          +          cef
+    -    +    -    -          -          ac
-    +    +    -    -          +          bcf
+    +    +    -    +          -          abce
-    -    -    +    -          +          df
+    -    -    +    +          -          ade
-    +    -    +    +          +          bdef
+    +    -    +    -          -          abd
-    -    +    +    +          -          cde
+    -    +    +    -          +          acdf
-    +    +    +    -          -          bcd
+    +    +    +    +          +          abcdef

For seven factors, we can reduce the number of runs even further. The 2^(7-4) design is an eight-run experiment accommodating seven variables. This is a 1/16 fraction and is
obtained by first writing down a 2^3 design in the factors A, B, and C, and then forming the four new columns from I = ABD, I = ACE, I = BCF, and I = ABCG. The design is shown in Table 13-28. The complete defining relation is found by multiplying the generators together two, three, and finally four at a time, producing
I = ABD = ACE = BCF = ABCG = BCDE = ACDF = CDG = ABEF = BEG = AFG = DEF = ADEG = CEFG = BDFG = ABCDEFG.

The alias of any main effect is found by multiplying that effect through each term in the defining relation. For example, the alias of A is

A = BD = CE = ABCF = BCG = ABCDE = CDF = ACDG = BEF = ABEG = FG = ADEF = DEG = ACEFG = ABDFG = BCDEFG.

This design is of resolution III, since the main effect is aliased with two-factor interactions.
If we assume that all three-factor and higher interactions are negligible, the aliases of the seven main effects are

l_A = A + BD + CE + FG,
l_B = B + AD + CF + EG,
l_C = C + AE + BF + DG,
l_D = D + AB + CG + EF,
l_E = E + AC + BG + DF,
l_F = F + BC + AG + DE,
l_G = G + CD + BE + AF.
This 2_III^(7-4) design is called a saturated fractional factorial, because all of the available degrees of freedom are used to estimate main effects. It is possible to combine sequences of these resolution III fractional factorials to separate the main effects from the two-factor interactions. The procedure is illustrated in Montgomery (2001, Chapter 8).

In constructing a fractional factorial design, it is important to select the best set of design generators. Montgomery (2001) presents a table of optimum design generators for designs with up to 10 factors. The generators in this table will produce designs of maximum resolution for any specified combination of k and p. For more than 10 factors, a
Table 13-28 The 2_III^(7-4) Design with Generators I = ABD, I = ACE, I = BCF, and I = ABCG

A    B    C    D = AB    E = AC    F = BC    G = ABC    Treatment Combination
-    -    -    +         +         +         -          def
+    -    -    -         -         +         +          afg
-    +    -    -         +         -         +          beg
+    +    -    +         -         -         -          abd
-    -    +    +         -         -         +          cdg
+    -    +    -         +         -         -          ace
-    +    +    -         -         +         -          bcf
+    +    +    +         +         +         +          abcdefg
resolution III design is recommended. These designs may be constructed by using the same method illustrated earlier for the 2^(7-4) design. For example, to investigate up to 15 factors in 16 runs, write down a 2^4 design in the factors A, B, C, and D, and then generate 11 new columns by taking the products of the original four columns two at a time, three at a time, and four at a time. The resulting design is a 2_III^(15-11) fractional factorial. These designs, along with other useful fractional factorials, are discussed by Montgomery (2001, Chapter 8).
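The 16-run saturated construction just described can be sketched and checked directly; the code below (illustrative, not from the text) builds the 15 columns and verifies that they are pairwise orthogonal, which is what lets each column carry one factor:

```python
from itertools import combinations, product

# Saturated construction for up to 15 factors in 16 runs: start from the
# full 2^4 in A, B, C, D and add the products of the four basic columns
# taken two, three, and four at a time, giving 4 + 6 + 4 + 1 = 15 columns.

base = list(product((-1, 1), repeat=4))          # the 16 runs of the 2^4

cols = []
for r in (1, 2, 3, 4):
    for idx in combinations(range(4), r):
        col = []
        for run in base:
            s = 1
            for i in idx:
                s *= run[i]
            col.append(s)
        cols.append(col)

# Every pair of the 15 columns is orthogonal, so each can carry one
# factor of a 2^(15-11) resolution III design.
orth = all(sum(x * y for x, y in zip(c1, c2)) == 0
           for c1, c2 in combinations(cols, 2))
```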
13-8 SAMPLE COMPUTER OUTPUT

We provide Minitab® output for some of the examples presented in this chapter.

Sample Computer Output for Example 13-3

Reconsider Example 13-3, dealing with aircraft primer paints. The Minitab® results of the 3 × 2 factorial design with three replicates are
Analysis of Variance for Force

Source          DF        SS        MS        F        P
Type             2    4.5811    2.2906    27.86    0.000
Applicat         1    4.9089    4.9089    59.70    0.000
Type*Applicat    2    0.2411    0.1206     1.47    0.269
Error           12    0.9867    0.0822
Total           17   10.7178
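The internal arithmetic of this ANOVA table is easy to verify: each mean square is SS/df, and each F-ratio is the effect mean square divided by the error mean square. A short check, using only the figures printed above:

```python
# Check the arithmetic of the ANOVA table for Example 13-3:
# MS = SS / df for each source, and F = MS / MS_error.

rows = {            # source: (df, SS), taken from the table above
    "Type":          (2, 4.5811),
    "Applicat":      (1, 4.9089),
    "Type*Applicat": (2, 0.2411),
    "Error":         (12, 0.9867),
}

ms = {src: ss / df for src, (df, ss) in rows.items()}
f = {src: round(ms[src] / ms["Error"], 2)
     for src in ("Type", "Applicat", "Type*Applicat")}
# f reproduces the printed F column: 27.86, 59.70, 1.47.
```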
The Minitab® results are in agreement with the results given in Table 13-6.

Sample Output for Example 13-7

Reconsider Example 13-7, dealing with surface roughness. The Minitab® results for the 2^3 design with two replicates are
Term        P
Constant    0.000
A           0.003
B           0.071
C           0.295
A*B         0.116
A*C         0.877
B*C         0.446
A*B*C       0.188

Analysis of Variance

Source              DF    Seq SS    Adj SS    Adj MS       F        P
Main Effects         3    59.187    59.187    19.729    8.09    0.008
2-Way Interactions   3     9.187     9.187     3.062    1.26    0.352
3-Way Interactions   1     5.062     5.062     5.062    2.08    0.188
Residual Error       8    19.500    19.500     2.437
  Pure Error         8    19.500    19.500     2.438
Total               15    92.937
The output from Minitab® is slightly different from the results given in Example 13-7. t-tests on the individual effects are provided in addition to the analysis of variance on the significance of main effects, two-factor interactions, and three-factor interactions. The
ANOVA results indicate that at least one of the main effects is significant, whereas no two-factor or three-factor interaction is significant.
13-9 SUMMARY

This chapter has introduced the design and analysis of experiments with several factors, concentrating on factorial designs and fractional factorials. Fixed, random, and mixed models were considered. The F-tests for main effects and interactions in these designs depend on whether the factors are fixed or random.

The 2^k factorial designs were also introduced. These are very useful designs in which all k factors appear at two levels. They have a greatly simplified method of statistical analysis. In situations where the design cannot be run under homogeneous conditions, the 2^k design can be confounded easily in 2^p blocks. This requires that certain interactions be confounded with blocks. The 2^k design also lends itself to fractional replication, in which only a particular subset of the 2^k treatment combinations are run. In fractional replication, each effect is aliased with one or more other effects. The general idea is to alias main effects and low-order interactions with higher-order interactions. This chapter discussed methods for construction of the 2^(k-p) fractional factorial designs, that is, a 1/2^p fraction of the 2^k design. These designs are particularly useful in industrial experimentation.
13-10 EXERCISES
13-1. An article in the Journal of Materials Processing Technology (2000, p. 113) presents results from an experiment involving tool wear estimation in milling. The objective is to minimize tool wear. Two factors of interest in the study were cutting speed (m/min) and depth of cut (mm). One response of interest is tool flank wear (mm). Three levels of each factor were selected and a factorial experiment with three replicates is run. Analyze the data and draw conclusions.
         Depth of Cut
12       15       18.75
0.170    0.185    0.110
0.198    0.210    0.232
0.217    0.241    0.223
0.178    0.210    0.250
0.212    0.215    0.243
0.292    0.250    0.260
0.289    0.320    0.285
0.238    0.267    0.282
0.321    0.325    0.354

13-2. An engineer suspects that the surface finish of a metal part is influenced by the type of paint used and the drying time. He selects three drying times (20, 25, and 30 minutes) and randomly chooses two types of paint from several that are available. He conducts an experiment and obtains the data shown here. Analyze the data and draw conclusions. Estimate the variance components.

         Drying Time (min)
Paint    20    25    30
1        74    73    61
         44    78    64
         50    92    98
2        66    86    73
         88    45    68
         35    92    85
13-3. Suppose that in Exercise 13-2 paint types were fixed effects. Compute a 95% interval estimate of the mean difference between the responses for paint type 1 and paint type 2.

13-4. The factors that influence the breaking strength of cloth are being studied. Four machines and three operators are chosen at random and an experiment is run using cloth from the same one-yard segment. The results are as follows:
               Machine
Operator    1      2      3      4
A          109    110    108    110
           110    115    109    116
B          111    112    110    111
           111    114    112    109
C          111    112    114    115
           109    109    111    112

Test for interaction and main effects at the 5% level. Estimate the components of variance.

13-5. Suppose that in Exercise 13-4 the operators were chosen at random, but only four machines were available for the test. Does this influence the analysis or your conclusions?

13-6. A company employs two time-study engineers. Their supervisor wishes to determine whether the standards set by them are influenced by any interaction between engineers and operators. She selects three operators at random and conducts an experiment in which the engineers set standard times for the same job. She obtains the data shown here. Analyze the data and draw conclusions.

               Operator
Engineer    1       2       3
1          2.59    2.38    2.78
           2.49    2.40    2.72
2          2.15    2.85    2.86
           2.72    2.66    2.87

13-7. An article in Industrial Quality Control (1956, p. 5) describes an experiment to investigate the effect of two factors (glass type and phosphor type) on the brightness of a television tube. The response variable measured is the current necessary (in microamps) to obtain a specified brightness level. The data are shown here. Analyze the data and draw conclusions, assuming that both factors are fixed.

               Phosphor Type
Glass Type    1      2      3
1            280    300    290
             290    310    285
             285    295    290
2            230    260    220
             235    240    225
             240    235    230

13-8. Consider the tool wear data in Exercise 13-1. Plot the residuals from this experiment against the levels of cutting speed and against the depth of cut. Comment on the graphs obtained. What are the possible consequences of the information conveyed by the residual plots?

13-9. The percentage of hardwood concentration in raw pulp and the freeness and cooking time of pulp are being investigated for their effects on the strength of paper. Analyze the data shown in the following table, assuming that all three factors are fixed.

Percentage of      Cooking Time 1.5 hours     Cooking Time 2.0 hours
Hardwood           Freeness                   Freeness
Concentration      350    500    650          350    500    650
10                 96.6   97.7   99.4         98.4   99.6   100.6
                   96.0   96.0   99.8         98.6   100.4  100.9
15                 98.5   96.0   98.4         97.5   98.7   99.6
                   97.2   96.9   97.6         98.1   98.0   99.0
20                 97.5   95.6   97.4         97.6   97.0   98.5
                   96.6   96.2   98.1         98.4   97.8   99.8
13-10. An article in Quality Engineering (1999, p. 357) presents the results of an experiment conducted to determine the effects of three factors on warpage in an injection-molding process. Warpage is defined as the nonflatness property in the product manufactured. This particular company manufactures plastic molded components for use in television sets, washing machines, and automobiles. The three factors of interest (each at two levels) are A = melt temperature, B = injection speed, and C = injection process. A complete 2^3 factorial design was carried out with replication; the two replicates are provided in the table below. Analyze the data from this experiment.

A     B     C     Replicate I    Replicate II
-1    -1    -1    1.35           1.40
 1    -1    -1    2.15           2.20
-1     1    -1    1.50           1.50
 1     1    -1    1.10           1.20
-1    -1     1    0.70           0.70
 1    -1     1    1.40           1.35
-1     1     1    1.20           1.35
 1     1     1    1.10           1.00
13-11. For the warpage experiment in Exercise 13-10, obtain the residuals and plot them on normal probability paper. Also plot the residuals versus the predicted values. Comment on these plots.
13-12. Four factors are thought to possibly influence the taste of a soft drink beverage: type of sweetener (A), ratio of syrup to water (B), carbonation level (C), and temperature (D). Each factor can be run at two levels, producing a 2^4 design. At each run in the design, samples of the beverage are given to a test panel consisting of 20 people. Each tester assigns a point score from 1 to 10 to the beverage. Total score is the response variable, and the objective is to find a formulation that maximizes total score. Two replicates of this design are run, and the results are shown here. Analyze the data and draw conclusions.

Treatment        Replicate        Treatment        Replicate
Combination      I       II       Combination      I       II
(1)              190     193      d                198     195
a                174     178      ad               172     176
b                181     185      bd               187     183
ab               183     180      abd              185     186
c                177     178      cd               199     190
ac               181     180      acd              179     175
bc               188     182      bcd              187     184
abc              173     170      abcd             180     180

13-13. Consider the experiment in Exercise 13-12. Plot the residuals against the levels of factors A, B, C, and D. Also construct a normal probability plot of the residuals. Comment on these plots.

13-14. Find the standard error of the effects for the experiment in Exercise 13-12. Using the standard errors as a guide, what factors appear significant?

13-15. The data shown here represent a single replicate of a 2^5 design that is used in an experiment to study the compressive strength of concrete. The factors are mix (A), time (B), laboratory (C), temperature (D), and drying time (E). Analyze the data, assuming that three-factor and higher interactions are negligible. Use a normal probability plot to assess the effects.

(1) = 700      d = 1000      e = 800       de = 1900
a = 900        ad = 1100     ae = 1200     ade = 1500
b = 3400       bd = 3000     be = 3500     bde = 4000
ab = 5500      abd = 6100    abe = 6200    abde = 6500
c = 600        cd = 800      ce = 600      cde = 1500
ac = 1000      acd = 1100    ace = 1200    acde = 2000
bc = 3000      bcd = 3300    bce = 3000    bcde = 3400
abc = 5300     abcd = 6000   abce = 5500   abcde = 6300

13-16. An experiment described by M. G. Natrella in the National Bureau of Standards Handbook of Experimental Statistics (No. 91, 1963) involves flame-testing fabrics after applying fire-retardant treatments. There are four factors: type of fabric (A), type of fire-retardant treatment (B), laundering condition (C; the low level is no laundering, the high level is after one laundering), and the method of conducting the flame test (D). All factors are run at two levels, and the response variable is the inches of fabric burned on a standard size test sample. The data are

(1) = 42      d = 40
a = 31        ad = 30
b = 45        bd = 50
ab = 29       abd = 25
c = 39        cd = 40
ac = 28       acd = 25
bc = 46       bcd = 50
abc = 32      abcd = 23

(a) Estimate the effects and prepare a normal probability plot of the effects.
(b) Construct a normal probability plot of the residuals and comment on the results.
(c) Construct an analysis of variance table assuming that three- and four-factor interactions are negligible.

13-17. Consider the data from the first replicate of Exercise 13-10. Suppose that these observations could not all be run under the same conditions. Set up a design to run these observations in two blocks of four observations each, with ABC confounded. Analyze the data.

13-18. Consider the data from the first replicate of Exercise 13-12. Construct a design with two blocks of eight observations each, with ABCD confounded. Analyze the data.
13-19. Repeat Exercise 13-18 assuming that four blocks are required. Confound ABD and ABC (and consequently CD) with blocks.

13-20. Construct a 2^5 design in four blocks. Select the effects to be confounded so that we confound the highest possible interactions with blocks.
13-21. An article in Industrial and Engineering Chemistry ("Factorial Experiments in Pilot Plant Studies," 1951, p. 1300) reports on an experiment to investigate the effects of temperature (A), gas throughput (B), and concentration (C) on the strength of product solution in a recirculation unit. Two blocks were used with ABC confounded, and the experiment was replicated twice. The data are as follows:

[Table of the two replicates, each in two blocks with ABC confounded; only a few entries, including (1) = 46, ab = 47, bc = 67, and abc = 35, are legible in this copy.]

(a) Analyze the data from this experiment.
(b) Plot the residuals on normal probability paper and against the predicted values. Comment on the plots obtained.
(c) Comment on the efficiency of this design. Note that we have replicated the experiment twice, yet we have no information on the ABC interaction.
(d) Suggest a better design; specifically, one that would provide some information on all interactions.

13-22. R. D. Snee ("Experimenting With a Large Number of Variables," in Experiments in Industry: Design, Analysis and Interpretation of Results, R. D. Snee, L. B. Hare, and J. B. Trout, Editors, ASQC, 1985) describes an experiment in which a 2^(5-1) design with I = ABCDE was used to investigate the effects of five factors on the color of a chemical product. The factors are A = solvent/reactant, B = catalyst/reactant, C = temperature, D = reactant purity, and E = reactant pH. The results obtained are as follows:

e = -0.63      d = 6.79
a = 2.51       ade = 6.47
b = -2.68      bde = 3.45
abe = 1.66     abd = 5.68
c = 2.06       cde = 5.22
ace = 1.22     acd = 4.38
bce = -2.09    bcd = 4.30
abc = 1.93     abcde = 4.05

(a) Prepare a normal probability plot of the effects. Which factors are active?
(b) Calculate the residuals. Construct a normal probability plot of the residuals and plot the residuals versus the fitted values. Comment on the plots.
(c) If any factors are negligible, collapse the 2^(5-1) design into a full factorial in the active factors. Comment on the resulting design, and interpret the results.

13-23. An article in the Journal of Quality Technology (Vol. 17, 1985, p. 198) describes the use of a replicated fractional factorial to investigate the effects of five factors on the free height of leaf springs used in an automotive application. The factors are A = furnace temperature, B = heating time, C = transfer time, D = hold down time, and E = quench oil temperature. The data are shown below.

[The data table is not legible in this copy.]

(a) What is the generator for this fraction? Write out the alias structure.
(b) Analyze the data. What factors influence mean free height?
(c) Calculate the range of free height for each run. Is there any indication that any of these factors affects variability in free height?
(d) Analyze the residuals from this experiment and comment on your findings.

13-24. An article in Industrial and Engineering Chemistry ("More on Planning Experiments to Increase Research Efficiency," 1970, p. 60) uses a 2^(5-2) design to investigate the effects of A = condensation temperature, B = amount of material 1, C = solvent volume, D = condensation time, and E = amount of material 2 on yield. The results obtained are as follows:

e = 23.2       bc = 16.2
ab = 15.5      ace = 23.4
ad = 16.9      bde = 16.3
cd = 23.8      abcde = 18.1

(a) Verify that the design generators used were I = ACE and I = BDE.
(b) Write down the complete defining relation and the aliases from this design.
(c) Estimate the main effects.
(d) Prepare an analysis of variance table. Verify that the AB and AD interactions are available to use as error.
(e) Plot the residuals versus the fitted values. Also construct a normal probability plot of the residuals. Comment on the results.
13-25~ An article in Cement and Concrete Research (2001. p.1213) describes an experimenttQ investigate
the effects of four metal oxides on several cement
properties. The four factors are aU run at two levels and one response of i:r.terest is the mean bc.lk density (g!cm~). The four factors and their le'l.'els are Low Level
High Level
(-1)
(+1)
0 0 0 0
30
Factor ll.: %Fe,G,
8: % 7.J10 c: %PbO D: %0:,0,
15
2.5 2.5
Typical results from this type of experiment are given in the following table:

    Run     A     B     C     D    Density
     1     -1    -1    -1     1    2.001
     2      1    -1    -1    -1    2.062
     3     -1     1    -1    -1    2.019
     4      1     1    -1     1    2.059
     5     -1    -1     1    -1    1.990
     6      1    -1     1     1    2.076
     7     -1     1     1     1    2.038
     8      1     1     1    -1    2.118
(a) What is the generator for this fraction?
(b) Analyze the data. What factors influence mean bulk density?
(c) Analyze the residuals from this experiment and comment on your findings.

13-26. Consider the 2^(6-2) design in Table 13-27. Suppose that after analyzing the original data, we find that factors C and E can be dropped. What type of 2^k design is left in the remaining variables?

13-27. Consider the 2^(6-2) design in Table 13-27. Suppose that after the original data analysis, we find that factors D and F can be dropped. What type of 2^k design is left in the remaining variables? Compare the results with Exercise 13-26. Can you explain why the answers are different?

13-28. Suppose that in Exercise 13-12 it was possible to run only a one-half fraction of the design. Construct the design and perform the statistical analysis, using the data from replicate 1.

13-29. Suppose that in Exercise 13-15 only a one-half fraction of the 2^5 design could be run. Construct the design and perform the analysis.

13-30. Consider the data in Exercise 13-15. Suppose that only a one-quarter fraction of the 2^5 design could be run. Construct the design and analyze the data.

13-31. Construct a fractional factorial design. Write down the aliases, assuming that only main effects and two-factor interactions are of interest.
Chapter 14

Simple Linear Regression and Correlation

In many problems there are two or more variables that are inherently related, and it is necessary to explore the nature of this relationship. Regression analysis is a statistical technique for modeling and investigating the relationship between two or more variables. For example, in a chemical process, suppose that the yield of product is related to the process operating temperature. Regression analysis can be used to build a model that expresses yield as a function of temperature. This model can then be used to predict yield at a given temperature level. It could also be used for process optimization or process control purposes.

In general, suppose that there is a single dependent variable, or response, y, that is related to k independent, or regressor, variables, say x₁, x₂, ..., x_k. The response variable y is a random variable, while the regressor variables x₁, x₂, ..., x_k are measured with negligible error. The x_i are called mathematical variables and are frequently controlled by the experimenter. Regression analysis can also be used in situations where y, x₁, x₂, ..., x_k are jointly distributed random variables, such as when the data are collected as different measurements on a common experimental unit. The relationship between these variables is characterized by a mathematical model called a regression equation. More precisely, we speak of the regression of y on x₁, x₂, ..., x_k. This regression model is fitted to a set of data. In some instances, the experimenter will know the exact form of the true functional relationship between y and x₁, x₂, ..., x_k, say y = φ(x₁, x₂, ..., x_k). However, in most cases, the true functional relationship is unknown, and the experimenter will choose an appropriate function to approximate φ. A polynomial model is usually employed as the approximating function.

In this chapter, we discuss the case where only a single regressor variable, x, is of interest. Chapter 15 will present the case involving more than one regressor variable.
14-1 SIMPLE LINEAR REGRESSION

We wish to determine the relationship between a single regressor variable x and a response variable y. The regressor variable x is assumed to be a continuous mathematical variable, controllable by the experimenter. Suppose that the true relationship between y and x is a straight line, and that the observation y at each level of x is a random variable. Now, the expected value of y for each value of x is

    E(y|x) = β₀ + β₁x,                                    (14-1)

where the intercept β₀ and the slope β₁ are unknown constants. We assume that each observation, y, can be described by the model

    y = β₀ + β₁x + ε,                                     (14-2)
where ε is a random error with mean zero and variance σ². The {ε} are also assumed to be uncorrelated random variables. The regression model of equation 14-2 involving only a single regressor variable x is often called the simple linear regression model.

Suppose that we have n pairs of observations, say (y₁, x₁), (y₂, x₂), ..., (y_n, x_n). These data may be used to estimate the unknown parameters β₀ and β₁ in equation 14-2. Our estimation procedure will be the method of least squares. That is, we will estimate β₀ and β₁ so that the sum of squares of the deviations between the observations and the regression line is a minimum. Now using equation 14-2, we may write

    yᵢ = β₀ + β₁xᵢ + εᵢ,    i = 1, 2, ..., n,             (14-3)

and the sum of squares of the deviations of the observations from the true regression line is

    L = Σᵢ₌₁ⁿ εᵢ² = Σᵢ₌₁ⁿ (yᵢ − β₀ − β₁xᵢ)².              (14-4)

The least-squares estimators of β₀ and β₁, say β̂₀ and β̂₁, must satisfy

    ∂L/∂β₀ = −2 Σᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ) = 0,
    ∂L/∂β₁ = −2 Σᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ)xᵢ = 0.            (14-5)

Simplifying these two equations yields

    nβ̂₀ + β̂₁ Σᵢ₌₁ⁿ xᵢ = Σᵢ₌₁ⁿ yᵢ,
    β̂₀ Σᵢ₌₁ⁿ xᵢ + β̂₁ Σᵢ₌₁ⁿ xᵢ² = Σᵢ₌₁ⁿ xᵢyᵢ.            (14-6)

Equations 14-6 are called the least-squares normal equations. The solution to the normal equations is

    β̂₀ = ȳ − β̂₁x̄,                                       (14-7)

    β̂₁ = [Σᵢ₌₁ⁿ xᵢyᵢ − (Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ)/n] / [Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)²/n],    (14-8)

where ȳ = (1/n) Σᵢ₌₁ⁿ yᵢ and x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ. Therefore, equations 14-7 and 14-8 are the least-squares estimators of the intercept and slope, respectively. The fitted simple linear regression model is

    ŷ = β̂₀ + β̂₁x.                                        (14-9)
Notationally, it is convenient to give special symbols to the numerator and denominator of equation 14-8. That is, let

    S_xx = Σᵢ₌₁ⁿ (xᵢ − x̄)² = Σᵢ₌₁ⁿ xᵢ² − (Σᵢ₌₁ⁿ xᵢ)²/n           (14-10)

and

    S_xy = Σᵢ₌₁ⁿ yᵢ(xᵢ − x̄) = Σᵢ₌₁ⁿ xᵢyᵢ − (Σᵢ₌₁ⁿ xᵢ)(Σᵢ₌₁ⁿ yᵢ)/n.    (14-11)

We call S_xx the corrected sum of squares of x and S_xy the corrected sum of cross products of x and y. The extreme right-hand sides of equations 14-10 and 14-11 are the usual computational formulas. Using this new notation, the least-squares estimator of the slope is

    β̂₁ = S_xy / S_xx.                                    (14-12)
Example 14-1
A chemical engineer is investigating the effect of process operating temperature on product yield. The study results in the following data:

    Temperature, °C (x)   100  110  120  130  140  150  160  170  180  190
    Yield, % (y)           45   51   54   61   66   70   74   78   85   89
These pairs of points are plotted in Fig. 14-1. Such a display is called a scatter diagram. Examination of this scatter diagram indicates that there is a strong relationship between yield and temperature, and the tentative assumption of the straight-line model y = β₀ + β₁x + ε appears to be reasonable. The following quantities may be computed:

    n = 10,    Σᵢ₌₁¹⁰ xᵢ = 1450,    Σᵢ₌₁¹⁰ yᵢ = 673,    x̄ = 145,    ȳ = 67.3,
    Σᵢ₌₁¹⁰ xᵢ² = 218,500,    Σᵢ₌₁¹⁰ yᵢ² = 47,225,    Σᵢ₌₁¹⁰ xᵢyᵢ = 101,570.

From equations 14-10 and 14-11, we find

    S_xx = Σᵢ₌₁¹⁰ xᵢ² − (Σᵢ₌₁¹⁰ xᵢ)²/10 = 218,500 − (1450)²/10 = 8250

and

    S_xy = Σᵢ₌₁¹⁰ xᵢyᵢ − (Σᵢ₌₁¹⁰ xᵢ)(Σᵢ₌₁¹⁰ yᵢ)/10 = 101,570 − (1450)(673)/10 = 3985.

Therefore, the least-squares estimates of the slope and intercept are

    β̂₁ = S_xy/S_xx = 3985/8250 = 0.483
Figure 14-1 Scatter diagram of yield versus temperature.
and

    β̂₀ = ȳ − β̂₁x̄ = 67.3 − (0.483)(145) = −2.739.

The fitted simple linear regression model is

    ŷ = −2.739 + 0.483x.
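The computations in Example 14-1 follow directly from equations 14-7 and 14-10 through 14-12. A minimal Python sketch (variable names are our own) verifies the hand calculations:

```python
# Least-squares fit of the simple linear regression model for the
# yield data of Example 14-1.
x = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
y = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Corrected sums of squares and cross products (14-10, 14-11).
Sxx = sum(xi**2 for xi in x) - sum(x)**2 / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

b1 = Sxy / Sxx          # slope estimate, beta1-hat (14-12)
b0 = ybar - b1 * xbar   # intercept estimate, beta0-hat (14-7)
print(Sxx, Sxy)                      # 8250.0 3985.0
print(round(b1, 3), round(b0, 3))    # 0.483 -2.739
```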
Since we have only tentatively assumed the straight-line model to be appropriate, we will want to investigate the adequacy of the model. The statistical properties of the least-squares estimators β̂₀ and β̂₁ are useful in assessing model adequacy. The estimators β̂₀ and β̂₁ are random variables, since they are just linear combinations of the yᵢ, and the yᵢ are random variables. We will investigate the bias and variance properties of these estimators. Consider first β̂₁. The expected value of β̂₁ is E(β̂₁) = β₁,
since Σᵢ₌₁ⁿ (xᵢ − x̄) = 0 and Σᵢ₌₁ⁿ xᵢ(xᵢ − x̄) = S_xx, and by assumption E(εᵢ) = 0. Thus, β̂₁ is an unbiased estimator of the true slope β₁.

Now consider the variance of β̂₁. Since we have assumed that V(εᵢ) = σ², it follows that V(yᵢ) = σ², and we have

    V(β̂₁) = V(Σᵢ₌₁ⁿ yᵢ(xᵢ − x̄)/S_xx) = (1/S_xx²) V(Σᵢ₌₁ⁿ yᵢ(xᵢ − x̄)).    (14-13)

The random variables {yᵢ} are uncorrelated because the {εᵢ} are uncorrelated. Therefore, the variance of the sum in equation 14-13 is just the sum of the variances, and the variance of each term in the sum, say V[yᵢ(xᵢ − x̄)], is σ²(xᵢ − x̄)². Thus,

    V(β̂₁) = (1/S_xx²) σ² Σᵢ₌₁ⁿ (xᵢ − x̄)² = σ²/S_xx.      (14-14)

Using a similar approach, we can show that

    E(β̂₀) = β₀   and   V(β̂₀) = σ²(1/n + x̄²/S_xx).       (14-15)
Note that β̂₀ is an unbiased estimator of β₀. The covariance of β̂₀ and β̂₁ is not zero; in fact, Cov(β̂₀, β̂₁) = −σ²x̄/S_xx.

It is usually necessary to obtain an estimate of σ². The difference between the observation yᵢ and the corresponding predicted value ŷᵢ, say eᵢ = yᵢ − ŷᵢ, is called a residual. The sum of the squares of the residuals, or the error sum of squares, would be

    SS_E = Σᵢ₌₁ⁿ eᵢ² = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)².                  (14-16)

A more convenient computing formula for SS_E may be found by substituting the fitted model ŷᵢ = β̂₀ + β̂₁xᵢ into equation 14-16 and simplifying. The result is

    SS_E = Σᵢ₌₁ⁿ yᵢ² − nȳ² − β̂₁S_xy,

and if we let Σᵢ₌₁ⁿ yᵢ² − nȳ² = Σᵢ₌₁ⁿ (yᵢ − ȳ)² = S_yy, then we may write SS_E as

    SS_E = S_yy − β̂₁S_xy.                                (14-17)
The expected value of the error sum of squares SS_E is E(SS_E) = (n − 2)σ². Therefore,

    σ̂² = SS_E/(n − 2)                                    (14-18)
is an unbiased estimator of σ².

Regression analysis is widely used and frequently misused. There are several common abuses of regression that should be briefly mentioned. Care should be taken in selecting variables with which to construct regression models and in determining the form of the approximating function. It is quite possible to develop statistical relationships among variables that are completely unrelated in a practical sense. For example, one might attempt to relate the shear strength of spot welds to the number of boxes of computer paper used by the data processing department. A straight line may even appear to provide a good fit to the data, but the relationship is an unreasonable one on which to rely. A strong observed association between variables does not necessarily imply that a causal relationship exists between those variables. Designed experiments are the only way to determine causal relationships.

Regression relationships are valid only for values of the independent variable within the range of the original data. The linear relationship that we have tentatively assumed may be valid over the original range of x, but it may be unlikely to remain so as we encounter x values beyond that range. In other words, as we move beyond the range of values of x for which data were collected, we become less certain about the validity of the assumed model. Regression models are not necessarily valid for extrapolation purposes.

Finally, one occasionally feels that the model y = βx + ε is appropriate. The omission of the intercept from this model implies, of course, that y = 0 when x = 0. This is a very strong assumption that often is unjustified. Even when two variables, such as the height and weight of men, would seem to qualify for the use of this model, we would usually obtain a better fit by including the intercept, because of the limited range of data on the independent variable.
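Equations 14-16 and 14-18 can be checked numerically against Example 14-1. A short sketch, using the rounded coefficients from the example, so the results agree with SS_E = 7.23 and σ̂² = 0.90 only up to rounding:

```python
# Residuals and the variance estimate (equations 14-16 and 14-18)
# for the Example 14-1 data.
x = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
y = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]
n = len(x)
b0, b1 = -2.739, 0.483          # fitted coefficients from Example 14-1

yhat = [b0 + b1 * xi for xi in x]            # fitted values
e = [yi - yh for yi, yh in zip(y, yhat)]     # residuals e_i = y_i - yhat_i
SSE = sum(ei**2 for ei in e)                 # error sum of squares (14-16)
sigma2_hat = SSE / (n - 2)                   # estimate of sigma^2 (14-18)
print(round(SSE, 2), round(sigma2_hat, 2))   # 7.22 0.9
```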
14-2 HYPOTHESIS TESTING IN SIMPLE LINEAR REGRESSION

An important part of assessing the adequacy of the simple linear regression model is testing statistical hypotheses about the model parameters and constructing certain confidence intervals. Hypothesis testing is discussed in this section, and Section 14-3 presents methods for constructing confidence intervals.

To test hypotheses about the slope and intercept of the regression model, we must make the additional assumption that the error component ε is normally distributed. Thus, the complete assumptions are that the errors are NID(0, σ²) (normally and independently distributed). Later we will discuss how these assumptions can be checked through residual analysis.

Suppose we wish to test the hypothesis that the slope equals a constant, say β₁,₀. The appropriate hypotheses are

    H₀: β₁ = β₁,₀,
    H₁: β₁ ≠ β₁,₀,                                        (14-19)

where we have assumed a two-sided alternative. Now since the εᵢ are NID(0, σ²), it follows directly that the observations yᵢ are NID(β₀ + β₁xᵢ, σ²). From equation 14-8 we observe that β̂₁ is a linear combination of the observations yᵢ. Thus, β̂₁ is a linear combination of independent normal random variables and, consequently, is N(β₁, σ²/S_xx), using the bias and variance properties of β̂₁ from Section 14-1. Furthermore, β̂₁ is independent of MS_E. Then, as a result of the normality assumption, the statistic

    t₀ = (β̂₁ − β₁,₀) / √(MS_E/S_xx)                      (14-20)
follows the t distribution with n − 2 degrees of freedom under H₀: β₁ = β₁,₀. We would reject H₀: β₁ = β₁,₀ if

    |t₀| > t_{α/2, n−2},                                  (14-21)

where t₀ is computed from equation 14-20. A similar procedure can be used to test hypotheses about the intercept. To test

    H₀: β₀ = β₀,₀,
    H₁: β₀ ≠ β₀,₀,                                        (14-22)

we would use the statistic

    t₀ = (β̂₀ − β₀,₀) / √(MS_E[1/n + x̄²/S_xx])           (14-23)

and reject the null hypothesis if |t₀| > t_{α/2, n−2}.

A very important special case of the hypothesis of equation 14-19 is

    H₀: β₁ = 0,
    H₁: β₁ ≠ 0.                                           (14-24)
This hypothesis relates to the significance of regression. Failing to reject H₀: β₁ = 0 is equivalent to concluding that there is no linear relationship between x and y. This situation is illustrated in Fig. 14-2. Note that this may imply either that x is of little value in explaining the variation in y and that the best estimator of y for any x is ŷ = ȳ (Fig. 14-2a) or that the true relationship between x and y is not linear (Fig. 14-2b). Alternatively, if H₀: β₁ = 0 is rejected, this implies that x is of value in explaining the variability in y. This is illustrated in Fig. 14-3. However, rejecting H₀: β₁ = 0 could mean either that the straight-line model is adequate (Fig. 14-3a) or that even though there is a linear effect of x, better results could be obtained with the addition of higher-order polynomial terms in x (Fig. 14-3b).

The test procedure for H₀: β₁ = 0 may be developed from two approaches. The first approach starts with the following partitioning of the total corrected sum of squares for y:

    S_yy = Σᵢ₌₁ⁿ (yᵢ − ȳ)² = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)² + Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)².    (14-25)

Figure 14-2 The hypothesis H₀: β₁ = 0 is not rejected.
Figure 14-3 The hypothesis H₀: β₁ = 0 is rejected.
The two components of S_yy measure, respectively, the amount of variability in the yᵢ accounted for by the regression line and the residual variation left unexplained by the regression line. We usually call SS_E = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² the error sum of squares and SS_R = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)² the regression sum of squares. Thus, equation 14-25 may be written

    S_yy = SS_R + SS_E.                                   (14-26)

Comparing equation 14-26 with equation 14-17, we note that the regression sum of squares is

    SS_R = β̂₁S_xy.                                       (14-27)

S_yy has n − 1 degrees of freedom, and SS_R and SS_E have 1 and n − 2 degrees of freedom, respectively. We may show that E[SS_E/(n − 2)] = σ² and E(SS_R) = σ² + β₁²S_xx, and that SS_E and SS_R are independent. Thus, if H₀: β₁ = 0 is true, the statistic

    F₀ = (SS_R/1) / (SS_E/(n − 2)) = MS_R/MS_E            (14-28)

follows the F_{1,n−2} distribution, and we would reject H₀ if F₀ > F_{α,1,n−2}. The test procedure is usually arranged in an analysis of variance table (or ANOVA), such as Table 14-1.

The test for significance of regression may also be developed from equation 14-20 with β₁,₀ = 0, say

    t₀ = β̂₁ / √(MS_E/S_xx).                              (14-29)

Squaring both sides of equation 14-29, we obtain

    t₀² = β̂₁²S_xx/MS_E = β̂₁S_xy/MS_E = MS_R/MS_E.       (14-30)
Table 14-1 Analysis of Variance for Testing Significance of Regression

    Source of Variation    Sum of Squares            Degrees of Freedom    Mean Square
    Regression             SS_R = β̂₁S_xy                   1               MS_R
    Error or residual      SS_E = S_yy − β̂₁S_xy          n − 2             MS_E
    Total                  S_yy                           n − 1
Table 14-2 Testing for Significance of Regression, Example 14-2

    Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square      F₀
    Regression                1924.87               1               1924.87      2138.74
    Error                        7.23               8                  0.90
    Total                     1932.10               9
Note that t₀² in equation 14-30 is identical to F₀ in equation 14-28. It is true, in general, that the square of a t random variable with f degrees of freedom is an F random variable with one and f degrees of freedom in the numerator and denominator, respectively. Thus, the test using t₀ is equivalent to the test based on F₀.
Example 14-2
We will test the model developed in Example 14-1 for significance of regression. The fitted model is ŷ = −2.739 + 0.483x, and S_yy is computed as

    S_yy = Σᵢ₌₁¹⁰ yᵢ² − nȳ² = 47,225 − (673)²/10 = 1932.10.

The regression sum of squares is

    SS_R = β̂₁S_xy = (0.483)(3985) = 1924.87,

and the error sum of squares is

    SS_E = S_yy − SS_R = 1932.10 − 1924.87 = 7.23.

The analysis of variance for testing H₀: β₁ = 0 is summarized in Table 14-2. Noting that F₀ = 2138.74 > F₀.₀₁,₁,₈ = 11.26, we reject H₀ and conclude that β₁ ≠ 0.
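The ANOVA of Table 14-2 can be rebuilt from the summary quantities S_yy, S_xy, and S_xx alone. A sketch carrying full precision, so F₀ comes out near 2132 rather than the 2138.74 obtained from the rounded table entries:

```python
# Significance-of-regression F test (equation 14-28) for Example 14-2,
# starting from the summary quantities computed in the text.
Syy, Sxy, Sxx, n = 1932.10, 3985.0, 8250.0, 10

b1 = Sxy / Sxx
SSR = b1 * Sxy                  # regression sum of squares (14-27)
SSE = Syy - SSR                 # error sum of squares (14-17)
MSR, MSE = SSR / 1, SSE / (n - 2)
F0 = MSR / MSE                  # compare with F(alpha; 1, n-2)
print(round(F0, 1))
```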
14-3 INTERVAL ESTIMATION IN SIMPLE LINEAR REGRESSION

In addition to point estimates of the slope and intercept, it is possible to obtain confidence interval estimates of these parameters. If the errors are normally and independently distributed, then

    (β̂₁ − β₁)/√(MS_E/S_xx)   and   (β̂₀ − β₀)/√(MS_E[1/n + x̄²/S_xx])

are both distributed as t random variables with n − 2 degrees of freedom. Therefore, a 100(1 − α)% confidence interval on the slope β₁ is

    β̂₁ − t_{α/2,n−2}√(MS_E/S_xx) ≤ β₁ ≤ β̂₁ + t_{α/2,n−2}√(MS_E/S_xx).    (14-31)
Similarly, a 100(1 − α)% confidence interval on the intercept β₀ is

    β̂₀ − t_{α/2,n−2}√(MS_E[1/n + x̄²/S_xx]) ≤ β₀ ≤ β̂₀ + t_{α/2,n−2}√(MS_E[1/n + x̄²/S_xx]).    (14-32)
Example 14-3
We will find a 95% confidence interval on the slope of the regression line using the data in Example 14-1. Recall that β̂₁ = 0.483, S_xx = 8250, and MS_E = 0.90 (see Table 14-2). Then, from equation 14-31 we find

    β̂₁ − t₀.₀₂₅,₈√(MS_E/S_xx) ≤ β₁ ≤ β̂₁ + t₀.₀₂₅,₈√(MS_E/S_xx),

or

    0.483 − 2.306√(0.90/8250) ≤ β₁ ≤ 0.483 + 2.306√(0.90/8250).

This simplifies to

    0.459 ≤ β₁ ≤ 0.507.
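Equation 14-31 applied to Example 14-3 takes only a few lines; the t value here is taken from a t table rather than computed:

```python
# 95% confidence interval on the slope (equation 14-31), Example 14-3.
from math import sqrt

b1, Sxx, MSE = 0.483, 8250.0, 0.90
t = 2.306                            # t(0.025, 8) from a t table

half_width = t * sqrt(MSE / Sxx)
lo, hi = b1 - half_width, b1 + half_width
print(round(lo, 3), round(hi, 3))    # 0.459 0.507
```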
A confidence interval may be constructed for the mean response at a specified x, say x₀. This is a confidence interval about E(y|x₀) and is often called a confidence interval about the regression line. Since E(y|x₀) = β₀ + β₁x₀, we may obtain a point estimate of E(y|x₀) from the fitted model as

    Ê(y|x₀) = ŷ₀ = β̂₀ + β̂₁x₀.

Now ŷ₀ is an unbiased point estimator of E(y|x₀), since β̂₀ and β̂₁ are unbiased estimators of β₀ and β₁. The variance of ŷ₀ is

    V(ŷ₀) = σ²[1/n + (x₀ − x̄)²/S_xx],

and ŷ₀ is normally distributed, as β̂₀ and β̂₁ are normally distributed. Therefore, a 100(1 − α)% confidence interval about the true regression line at x = x₀ may be computed from

    ŷ₀ − t_{α/2,n−2}√(MS_E[1/n + (x₀ − x̄)²/S_xx]) ≤ E(y|x₀) ≤ ŷ₀ + t_{α/2,n−2}√(MS_E[1/n + (x₀ − x̄)²/S_xx]).    (14-33)
The width of the confidence interval for E(y|x₀) is a function of x₀. The interval width is a minimum for x₀ = x̄ and widens as |x₀ − x̄| increases. This widening is one reason why using regression to extrapolate is ill-advised.
Example 14-4
We will construct a 95% confidence interval about the regression line for the data in Example 14-1. The fitted model is ŷ₀ = −2.739 + 0.483x₀, and the 95% confidence interval on E(y|x₀) is found from equation 14-33 as

    ŷ₀ ± 2.306√(0.90[1/10 + (x₀ − 145)²/8250]).

The fitted values ŷ₀ and the corresponding 95% confidence limits for the points x₀ = xᵢ, i = 1, 2, ..., 10, are displayed in Table 14-3. To illustrate the use of this table, we may find the confidence interval on the true mean process yield at x₀ = 140°C (say) as

    64.88 − 0.71 ≤ E(y|x₀ = 140) ≤ 64.88 + 0.71

or

    64.17 ≤ E(y|x₀ = 140) ≤ 65.59.

The fitted model and the 95% confidence interval about the regression line are shown in Fig. 14-4.
Table 14-3 Confidence Interval about the Regression Line, Example 14-4

Figure 14-4 A 95% confidence interval about the regression line for Example 14-4.
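Equation 14-33 at x₀ = 140 can be evaluated directly; rounding at a different stage than the text, the limits come out as 64.18 and 65.58 rather than 64.17 and 65.59:

```python
# Confidence interval about the regression line (equation 14-33),
# evaluated at x0 = 140 as in Example 14-4.
from math import sqrt

b0, b1 = -2.739, 0.483
n, xbar, Sxx, MSE, t = 10, 145.0, 8250.0, 0.90, 2.306

x0 = 140.0
y0 = b0 + b1 * x0                                    # point estimate of E(y|x0)
half = t * sqrt(MSE * (1 / n + (x0 - xbar) ** 2 / Sxx))
print(round(y0, 2), round(y0 - half, 2), round(y0 + half, 2))
```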
14-4 PREDICTION OF NEW OBSERVATIONS

An important application of regression analysis is predicting new or future observations y corresponding to a specified level of the regressor variable x. If x₀ is the value of the regressor variable of interest, then

    ŷ₀ = β̂₀ + β̂₁x₀                                      (14-34)

is the point estimate of the new or future value of the response y₀.

Now consider obtaining an interval estimate of this future observation y₀. This new observation is independent of the observations used to develop the regression model. Therefore, the confidence interval about the regression line, equation 14-33, is inappropriate, since it is based only on the data used to fit the regression model. The confidence interval about the regression line refers to the true mean response at x = x₀ (that is, a population parameter), not to future observations.

Let y₀ be the future observation at x = x₀, and let ŷ₀ given by equation 14-34 be the estimator of y₀. Note that the random variable

    ψ = y₀ − ŷ₀

is normally distributed with mean zero and variance

    V(ψ) = σ²[1 + 1/n + (x₀ − x̄)²/S_xx],

because y₀ is independent of ŷ₀. Thus, the 100(1 − α)% prediction interval on a future observation at x₀ is

    ŷ₀ − t_{α/2,n−2}√(MS_E[1 + 1/n + (x₀ − x̄)²/S_xx]) ≤ y₀ ≤ ŷ₀ + t_{α/2,n−2}√(MS_E[1 + 1/n + (x₀ − x̄)²/S_xx]).    (14-35)

Notice that the prediction interval is of minimum width at x₀ = x̄ and widens as |x₀ − x̄| increases. By comparing equation 14-35 with equation 14-33, we observe that the prediction interval at x₀ is always wider than the confidence interval at x₀. This results because the prediction interval depends on both the error from the estimated model and the error associated with future observations (σ²).

We may also find a 100(1 − α)% prediction interval on the mean of k future observations on the response at x = x₀. Let ȳ₀ be the mean of k future observations at x = x₀. The 100(1 − α)% prediction interval on ȳ₀ is

    ŷ₀ − t_{α/2,n−2}√(MS_E[1/k + 1/n + (x₀ − x̄)²/S_xx]) ≤ ȳ₀ ≤ ŷ₀ + t_{α/2,n−2}√(MS_E[1/k + 1/n + (x₀ − x̄)²/S_xx]).    (14-36)
To illustrate the construction of a prediction interval, suppose we use the data in Example 14-1 and find a 95% prediction interval on the next observation on the process yield at x₀ = 160°C. Using equation 14-35, we find that the prediction interval is

    74.55 − 2.306√(0.90[1 + 1/10 + (160 − 145)²/8250]) ≤ y₀ ≤ 74.55 + 2.306√(0.90[1 + 1/10 + (160 − 145)²/8250]),

which simplifies to

    72.23 ≤ y₀ ≤ 76.87.
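A sketch of the prediction-interval computation of equation 14-35 at x₀ = 160, carried at full precision (so the limits differ from the text's in the last digit):

```python
# 95% prediction interval on a new observation (equation 14-35) at x0 = 160.
from math import sqrt

b0, b1 = -2.739, 0.483
n, xbar, Sxx, MSE, t = 10, 145.0, 8250.0, 0.90, 2.306

x0 = 160.0
y0 = b0 + b1 * x0               # point prediction (74.55 in the text)
half = t * sqrt(MSE * (1 + 1 / n + (x0 - xbar) ** 2 / Sxx))
print(round(y0 - half, 2), round(y0 + half, 2))
```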
14-5 MEASURING THE ADEQUACY OF THE REGRESSION MODEL

Fitting a regression model requires several assumptions. Estimation of the model parameters requires the assumption that the errors are uncorrelated random variables with mean zero and constant variance. Tests of hypotheses and interval estimation require that the errors be normally distributed. In addition, we assume that the order of the model is correct; that is, if we fit a first-order polynomial, then we are assuming that the phenomenon actually behaves in a first-order manner. The analyst should always consider the validity of these assumptions to be doubtful and conduct analyses to examine the adequacy of the model that has been tentatively entertained. In this section we discuss methods useful in this respect.
14-5.1 Residual Analysis

We define the residuals as eᵢ = yᵢ − ŷᵢ, i = 1, 2, ..., n, where yᵢ is an observation and ŷᵢ is the corresponding estimated value from the regression model. Analysis of the residuals is frequently helpful in checking the assumption that the errors are NID(0, σ²) and in determining whether additional terms in the model would be useful.

As an approximate check of normality, the experimenter can construct a frequency histogram of the residuals or plot them on normal probability paper. It requires judgment to assess the nonnormality of such plots. One may also standardize the residuals by computing dᵢ = eᵢ/√MS_E, i = 1, 2, ..., n. If the errors are NID(0, σ²), then approximately 95% of the standardized residuals should fall in the interval (−2, +2). Residuals far outside this interval may indicate the presence of an outlier, that is, an observation that is atypical of the rest of the data. Various rules have been proposed for discarding outliers. However, sometimes outliers provide important information about unusual circumstances of interest to the experimenter and should not be discarded. Therefore a detected outlier should be investigated first, then discarded if warranted. For further discussion of outliers, see Montgomery, Peck, and Vining (2001).

It is frequently helpful to plot the residuals (1) in time sequence (if known), (2) against the ŷᵢ, and (3) against the independent variable x. These graphs will usually look like one of the four general patterns shown in Fig. 14-5. The pattern in Fig. 14-5a represents normality, while those in Figs. 14-5b, c, and d represent anomalies. If the residuals appear as in Fig. 14-5b, then the variance of the observations may be increasing with time or with the magnitude of the yᵢ or xᵢ. If a plot of the residuals against time has the appearance of Fig. 14-5b, then the variance of the observations is increasing with time. Plots against ŷᵢ and xᵢ that look like Fig. 14-5c also indicate inequality of variance. Residual plots that look like Fig. 14-5d indicate model inadequacy; that is, higher-order terms should be added to the model.

Figure 14-5 Patterns for residual plots. (a) Satisfactory, (b) funnel, (c) double bow, (d) nonlinear. [Adapted from Montgomery, Peck, and Vining (2001).]
Example 14-5
The residuals for the regression model in Example 14-1 are computed as follows:

    e₁ = 45.00 − 45.56 = −0.56,    e₆ = 70.00 − 69.72 = 0.28,
    e₂ = 51.00 − 50.39 = 0.61,     e₇ = 74.00 − 74.55 = −0.55,
    e₃ = 54.00 − 55.22 = −1.22,    e₈ = 78.00 − 79.38 = −1.38,
    e₄ = 61.00 − 60.05 = 0.95,     e₉ = 85.00 − 84.21 = 0.79,
    e₅ = 66.00 − 64.88 = 1.12,     e₁₀ = 89.00 − 89.04 = −0.04.

These residuals are plotted on normal probability paper in Fig. 14-6. Since the residuals fall approximately along a straight line in Fig. 14-6, we conclude that there is no severe departure from normality. The residuals are also plotted against ŷᵢ in Fig. 14-7a and against xᵢ in Fig. 14-7b. These plots indicate no serious model inadequacies.
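The residuals and standardized residuals dᵢ of Section 14-5.1 are easy to compute for Example 14-5; the check at the end confirms that none falls outside (−2, +2):

```python
# Residuals and standardized residuals d_i = e_i / sqrt(MSE), Example 14-5.
from math import sqrt

x = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190]
y = [45, 51, 54, 61, 66, 70, 74, 78, 85, 89]
b0, b1, MSE = -2.739, 0.483, 0.90

e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]   # residuals
d = [ei / sqrt(MSE) for ei in e]                    # standardized residuals
print([round(ei, 2) for ei in e])
print(all(abs(di) < 2 for di in d))                 # True: no apparent outliers
```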
14-5.2 The Lack-of-Fit Test

Regression models are often fit to data when the true functional relationship is unknown. Naturally, we would like to know whether the order of the model tentatively assumed is correct. This section will describe a test for the validity of this assumption.
Figure 14-6 Normal probability plot of residuals.
Figure 14-7 Residual plots for Example 14-5. (a) Plot against ŷᵢ, (b) plot against xᵢ.
The danger of using a regression model that is a poor approximation of the true functional relationship is illustrated in Fig. 14-8. Obviously, a polynomial of degree two or greater should have been used in this situation.
Figure 14-8 A regression model displaying lack of fit.

We present a test for the "goodness of fit" of the regression model. Specifically, the hypotheses we wish to test are
    H₀: The model adequately fits the data,
    H₁: The model does not fit the data.

The test involves partitioning the error or residual sum of squares into the two components

    SS_E = SS_PE + SS_LOF,

where SS_PE is the sum of squares attributable to "pure" error and SS_LOF is the sum of squares attributable to the lack of fit of the model. To compute SS_PE we must have repeated observations on y for at least one level of x. Suppose that we have n total observations such that

    y₁₁, y₁₂, ..., y₁ₙ₁    repeated observations at x₁,
    y₂₁, y₂₂, ..., y₂ₙ₂    repeated observations at x₂,
    ⋮
    y_m1, y_m2, ..., y_mn_m    repeated observations at x_m.

Note that there are m distinct levels of x. The contribution to the pure-error sum of squares at xᵢ (say) would be

    Σᵤ (y_iu − ȳᵢ)²,                                      (14-37)

where the sum runs over the nᵢ repeats at xᵢ. The total sum of squares for pure error would be obtained by summing equation 14-37 over all levels of x as

    SS_PE = Σᵢ₌₁ᵐ Σᵤ (y_iu − ȳᵢ)².                        (14-38)

There are n_e = Σᵢ₌₁ᵐ (nᵢ − 1) = n − m degrees of freedom associated with the pure-error sum of squares. The sum of squares for lack of fit is simply

    SS_LOF = SS_E − SS_PE,                                (14-39)

with n − 2 − n_e = m − 2 degrees of freedom. The test statistic for lack of fit would then be

    F₀ = [SS_LOF/(m − 2)] / [SS_PE/(n − m)],              (14-40)

and we would reject H₀ if F₀ > F_{α,m−2,n−m}.
14-5
Measuring the Adequacy of the Regression ~lodel
425
This test procedure may be easily introduced into the analysis of variance conducted for the significance of regression. If the null hypothesis of model adequacy is rejected, then the model must be abandoned and attempts made to find a more appropriate model. If H₀ is not rejected, then there is no apparent reason to doubt the adequacy of the model, and MS_PE and MS_LOF are often combined to estimate σ².
Example 14-6
Suppose we have the following data:

    x    1.0  1.0  2.0  3.3  3.3  4.0  4.0  4.0  4.7  5.0
    y    2.3  1.8  2.8  1.8  3.7  2.6  2.6  2.2  3.2  2.0

    x    5.6  5.6  5.6  6.0  6.0  6.5  6.9
    y    3.5  2.8  2.1  3.4  3.2  3.4  5.0
We may compute S_yy = 10.97, S_xy = 13.62, S_xx = 52.53, ȳ = 2.847, and x̄ = 4.382. The regression model is ŷ = 1.708 + 0.260x, and the regression sum of squares is SS_R = β̂₁S_xy = (0.260)(13.62) = 3.541. The pure-error sum of squares is computed as follows:

    Level of x    Σ(y_iu − ȳᵢ)²    Degrees of Freedom
       1.0           0.1250                1
       3.3           1.8050                1
       4.0           0.1066                2
       5.6           0.9800                2
       6.0           0.0200                1
    Total            3.0366                7

The analysis of variance is summarized in Table 14-4. Since F₀ = 1.27 < F₀.₂₅,₈,₇ = 1.70, we cannot reject the hypothesis that the tentative model adequately describes the data. We will pool the lack-of-fit and pure-error mean squares to form the denominator mean square in the test for significance of regression. Also, since F₀.₀₅,₁,₁₅ = 4.54 and F₀ = 7.15 > 4.54, we conclude that β₁ ≠ 0.
Table 14-4 Analysis of Variance for Example 14-6

    Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square     F₀
    Regression                  3.541               1                 3.541       7.15
    Residual                    7.429              15                 0.495
      (Lack of fit)             4.392               8                 0.549       1.27
      (Pure error)              3.037               7                 0.434
    Total                      10.970              16

In fitting a regression model to experimental data, a good practice is to use the lowest degree model that adequately describes the data. The lack-of-fit test may be useful in this respect. However, it is always possible to fit a polynomial of degree n − 1 to n data points, and the experimenter should not consider using a model that is "saturated," that is, that has very nearly as many independent variables as observations on y.
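The pure-error/lack-of-fit partition of equations 14-38 through 14-40 can be computed from scratch for the Example 14-6 data. A sketch carrying full precision, so the last digits differ slightly from the rounded values in the text:

```python
# Lack-of-fit test (equations 14-38 through 14-40) for Example 14-6.
from collections import defaultdict

x = [1.0, 1.0, 2.0, 3.3, 3.3, 4.0, 4.0, 4.0, 4.7, 5.0,
     5.6, 5.6, 5.6, 6.0, 6.0, 6.5, 6.9]
y = [2.3, 1.8, 2.8, 1.8, 3.7, 2.6, 2.6, 2.2, 3.2, 2.0,
     3.5, 2.8, 2.1, 3.4, 3.2, 3.4, 5.0]
n = len(x)

# Group the responses by distinct level of x.
groups = defaultdict(list)
for xi, yi in zip(x, y):
    groups[xi].append(yi)
m = len(groups)                                  # number of distinct x levels

# Pure-error sum of squares (14-38): within-group variation about group means.
SSPE = sum(sum((yi - sum(g) / len(g)) ** 2 for yi in g)
           for g in groups.values())

# Fit the line, then split SSE into lack of fit and pure error.
Sxx = sum(xi**2 for xi in x) - sum(x)**2 / n
Sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
Syy = sum(yi**2 for yi in y) - sum(y)**2 / n
b1 = Sxy / Sxx
SSE = Syy - b1 * Sxy                             # (14-17)
SSLOF = SSE - SSPE                               # (14-39)
F0 = (SSLOF / (m - 2)) / (SSPE / (n - m))        # (14-40)
print(m, round(SSPE, 4), round(F0, 2))
```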
14-5.3 The Coefficient of Determination

The quantity

    R² = SS_R/S_yy = 1 − SS_E/S_yy                        (14-41)

is called the coefficient of determination and is often used to judge the adequacy of a regression model. (We will see subsequently that in the case where x and y are jointly distributed random variables, R² is the square of the correlation coefficient between x and y.) Clearly 0 ≤ R² ≤ 1. We often refer loosely to R² as the amount of variability in the data explained or accounted for by the regression model. For the data in Example 14-1, we have R² = SS_R/S_yy = 1924.87/1932.10 = 0.9963; that is, 99.63% of the variability in the data is accounted for by the model.

The statistic R² should be used with caution, since it is always possible to make R² unity simply by adding enough terms to the model. For example, we can obtain a "perfect" fit to n data points with a polynomial of degree n − 1. Also, R² will always increase if we add a variable to the model, but this does not necessarily mean the new model is superior to the old one. Unless the error sum of squares in the new model is reduced by an amount equal to the original error mean square, the new model will have a larger error mean square than the old one, because of the loss of one degree of freedom. Thus the new model will actually be worse than the old one.

There are several misconceptions about R². In general, R² does not measure the magnitude of the slope of the regression line. A large value of R² does not imply a steep slope. Furthermore, R² does not measure the appropriateness of the model, since it can be artificially inflated by adding higher-order polynomial terms. Even if y and x are related in a nonlinear fashion, R² will often be large. For example, R² for the regression equation in Fig. 14-3b will be relatively large, even though the linear approximation is poor. Finally, even though R² is large, this does not necessarily imply that the regression model will provide accurate predictions of future observations.
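Equation 14-41 for Example 14-1 is a one-line computation:

```python
# Coefficient of determination (equation 14-41) for Example 14-1.
SSR, Syy = 1924.87, 1932.10
R2 = SSR / Syy
print(round(R2, 4))   # 0.9963
```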
14-6 TRANSFORMATIONS TO A STRAIGHT LINE

We occasionally find that the straight-line regression model y = β₀ + β₁x + ε is inappropriate because the true regression function is nonlinear. Sometimes this is visually determined from the scatter diagram, and sometimes we know in advance that the model is nonlinear because of prior experience or underlying theory. In some situations a nonlinear function can be expressed as a straight line by using a suitable transformation. Such nonlinear models are called intrinsically linear. As an example of a nonlinear model that is intrinsically linear, consider the exponential function

y = β₀e^(β₁x)ε.

This function is intrinsically linear, since it can be transformed to a straight line by a logarithmic transformation

ln y = ln β₀ + β₁x + ln ε.
This transformation requires that the transformed error terms ln ε be normally and independently distributed with mean 0 and variance σ². Another intrinsically linear function is

y = β₀ + β₁(1/x) + ε.

By using the reciprocal transformation z = 1/x, the model is linearized to

y = β₀ + β₁z + ε.

Sometimes several transformations can be employed jointly to linearize a function. For example, consider the function

y = [exp(β₀ + β₁x + ε)]⁻¹.

Letting y* = 1/y, we have the linearized form

ln y* = β₀ + β₁x + ε.

Several other examples of nonlinear models that are intrinsically linear are given by Daniel and Wood (1980).
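A minimal numerical sketch of the first transformation above, with invented data: simulate y = β₀e^(β₁x)ε, where the multiplicative error satisfies the lognormal requirement stated above, and recover the parameters by a straight-line fit of ln y on x.

```python
# Fitting the intrinsically linear model y = b0 * exp(b1*x) * eps by a
# straight-line fit of ln y on x (invented data; true b0 = 2.0, b1 = 0.3).
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(1.0, 10.0, 30)
# multiplicative error with ln(eps) ~ Normal(0, 0.05), as the text requires
y = 2.0 * np.exp(0.3 * x) * np.exp(rng.normal(scale=0.05, size=x.size))

b1_hat, ln_b0_hat = np.polyfit(x, np.log(y), 1)   # slope and intercept of ln y vs. x
b0_hat = np.exp(ln_b0_hat)                        # back-transform the intercept
print(b0_hat, b1_hat)   # should be close to 2.0 and 0.3
```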
14-7 CORRELATION

Our development of regression analysis thus far has assumed that x is a mathematical variable, measured with negligible error, and that y is a random variable. Many applications of regression analysis involve situations where both x and y are random variables. In these situations, it is usually assumed that the observations (yᵢ, xᵢ), i = 1, 2, ..., n, are jointly distributed random variables obtained from the distribution f(y, x). For example, suppose we wish to develop a regression model relating the shear strength of spot welds to the weld diameter. In this example, weld diameter cannot be controlled. We would randomly select n spot welds and observe a diameter (xᵢ) and a shear strength (yᵢ) for each. Therefore, (yᵢ, xᵢ) are jointly distributed random variables. We usually assume that the joint distribution of yᵢ and xᵢ is the bivariate normal distribution. That is,
f(y, x) = [1/(2πσ₁σ₂√(1 − ρ²))] exp{−[1/(2(1 − ρ²))][((y − μ₁)/σ₁)² − 2ρ((y − μ₁)/σ₁)((x − μ₂)/σ₂) + ((x − μ₂)/σ₂)²]},   (14-42)

where μ₁ and σ₁² are the mean and variance of y, μ₂ and σ₂² are the mean and variance of x, and ρ is the correlation coefficient between y and x. Recall from Chapter 4 that the correlation coefficient is defined as

ρ = σ₁₂/(σ₁σ₂),

where σ₁₂ is the covariance between y and x. The conditional distribution of y for a given value of x is (see Chapter 7)
428
Chapter 14 Simple Linear Regression and Correlation
f(y|x) = [1/(√(2π)σ_y|x)] exp{−(1/2)[(y − β₀ − β₁x)/σ_y|x]²},   (14-43)

where

β₀ = μ₁ − μ₂ρ(σ₁/σ₂)   (14-44a)

and

β₁ = ρ(σ₁/σ₂),   (14-44b)

and

σ²_y|x = σ₁²(1 − ρ²).   (14-44c)

That is, the conditional distribution of y given x is normal with mean

E(y|x) = β₀ + β₁x   (14-45)

and variance σ²_y|x. Note that the mean of the conditional distribution of y given x is a straight-line regression model. Furthermore, there is a relationship between the correlation coefficient ρ and the slope β₁. From equation 14-44b we see that if ρ = 0, then β₁ = 0, which implies that there is no regression of y on x. That is, knowledge of x does not assist us in predicting y. The method of maximum likelihood may be used to estimate the parameters β₀ and β₁. It may be shown that the maximum likelihood estimators of these parameters are
β̂₀ = ȳ − β̂₁x̄   (14-46a)

and

β̂₁ = Σᵢ₌₁ⁿ yᵢ(xᵢ − x̄) / Σᵢ₌₁ⁿ (xᵢ − x̄)².   (14-46b)

We note that the estimators of the intercept and slope in equation 14-46 are identical to those given by the method of least squares in the case where x was assumed to be a mathematical variable. That is, the regression model with y and x jointly normally distributed is equivalent to the model with x considered as a mathematical variable. This follows because the random variables y given x are independently and normally distributed with mean β₀ + β₁x and constant variance σ²_y|x. These results will also hold for any joint distribution of y and x such that the conditional distribution of y given x is normal. It is possible to draw inferences about the correlation coefficient ρ in this model. The estimator of ρ is the sample correlation coefficient

r = S_xy/(S_xx S_yy)^(1/2).   (14-47)
Note that

β̂₁ = (S_yy/S_xx)^(1/2) r,   (14-48)
so the slope β̂₁ is just the sample correlation coefficient r multiplied by a scale factor that is the square root of the "spread" of the y values divided by the "spread" of the x values. Thus β̂₁ and r are closely related, although they provide somewhat different information. The sample correlation coefficient r measures the linear association between y and x, while β̂₁ measures the predicted change in the mean of y for a unit change in x. In the case of a mathematical variable x, r has no meaning because the magnitude of r depends on the choice of spacing for x. We may also write, from equation 14-48,

r² = β̂₁²(S_xx/S_yy) = β̂₁S_xy/S_yy = SS_R/S_yy,

which we recognize from equation 14-41 as the coefficient of determination. That is, the coefficient of determination R² is just the square of the sample correlation coefficient between y and x. It is often useful to test the hypothesis
H₀: ρ = 0,
H₁: ρ ≠ 0.   (14-49)

The appropriate test statistic for this hypothesis is

t₀ = r√(n − 2)/√(1 − r²),   (14-50)
which follows the t distribution with n − 2 degrees of freedom if H₀: ρ = 0 is true. Therefore, we would reject the null hypothesis if |t₀| > t_α/2,n−2. This test is equivalent to the test of the hypothesis H₀: β₁ = 0 given in Section 14-2. This equivalence follows directly from equation 14-48. The test procedure for the hypothesis

H₀: ρ = ρ₀,
H₁: ρ ≠ ρ₀,   (14-51)
where ρ₀ ≠ 0, is somewhat more complicated. For moderately large samples (say n ≥ 25) the statistic

Z = arctanh r = (1/2) ln[(1 + r)/(1 − r)]

is approximately normally distributed with mean

μ_Z = arctanh ρ = (1/2) ln[(1 + ρ)/(1 − ρ)]   (14-52)
and variance

σ²_Z = (n − 3)⁻¹.

To test the hypothesis in equation 14-51, we may compute the statistic

Z₀ = (arctanh r − arctanh ρ₀)(n − 3)^(1/2)   (14-53)

and reject H₀: ρ = ρ₀ if |Z₀| > Z_α/2. It is also possible to construct a 100(1 − α)% confidence interval for ρ using the transformation in equation 14-52. The 100(1 − α)% confidence interval is

tanh(arctanh r − Z_α/2/√(n − 3)) ≤ ρ ≤ tanh(arctanh r + Z_α/2/√(n − 3)),   (14-54)

where tanh u = (e^u − e^−u)/(e^u + e^−u).
Example 14-7

Montgomery, Peck, and Vining (2001) describe an application of regression analysis in which an engineer at a soft-drink bottler is investigating the product distribution and route service operations for vending machines. She suspects that the time required to load and service a machine is related to the number of cases of product delivered. A random sample of 25 retail outlets having vending machines is selected, and the in-outlet delivery time (in minutes) and volume of product delivered (in cases) is observed for each outlet. The data are shown in Table 14-5. We assume that delivery time and volume of product delivered are jointly normally distributed. Using the data in Table 14-5, we may calculate

S_yy = 6105.9447,  S_xx = 698.5600,  S_xy = 2027.7132.
The regression model is

ŷ = 5.1145 + 2.9027x.

The sample correlation coefficient between x and y is computed from equation 14-47 as

r = S_xy/(S_xx S_yy)^(1/2) = 2027.7132/[(698.5600)(6105.9447)]^(1/2) = 0.9818.
Table 14-5 Data for Example 14-7: observation number, delivery time (minutes), and delivery volume (cases) for the 25 outlets.
Note that R² = (0.9818)² = 0.9640, or that approximately 96.40% of the variability in delivery time is explained by the linear relationship with delivery volume. To test the hypothesis

H₀: ρ = 0,
H₁: ρ ≠ 0,

we can compute the test statistic of equation 14-50 as follows:

t₀ = r√(n − 2)/√(1 − r²) = 0.9818√23/√(1 − 0.9640) = 24.80.
Since t₀.₀₂₅,₂₃ = 2.069, we reject H₀ and conclude that the correlation coefficient ρ ≠ 0. Finally, we may construct an approximate 95% confidence interval on ρ from equation 14-54. Since arctanh r = arctanh 0.9818 = 2.3452, equation 14-54 becomes

tanh(2.3452 − 1.96/√22) ≤ ρ ≤ tanh(2.3452 + 1.96/√22),

which reduces to

0.9585 ≤ ρ ≤ 0.9921.
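The example's numbers can be checked directly from the quoted summary statistics. The following sketch reproduces the sample correlation r, the test statistic t₀, and the arctanh-based 95% confidence interval:

```python
# Reproducing the delivery-time computations from the summary statistics
# S_xx, S_yy, S_xy quoted in the example (n = 25 outlets).
import math

s_xx, s_yy, s_xy, n = 698.5600, 6105.9447, 2027.7132, 25

r = s_xy / math.sqrt(s_xx * s_yy)                  # sample correlation, eq. 14-47
t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)  # test statistic, eq. 14-50

z = math.atanh(r)                                  # Fisher (arctanh) transformation
half = 1.96 / math.sqrt(n - 3)                     # Z_{0.025} / sqrt(n - 3)
lo, hi = math.tanh(z - half), math.tanh(z + half)  # 95% interval, eq. 14-54
print(round(r, 4), round(t0, 2), round(lo, 4), round(hi, 4))
```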
14-8 SAMPLE COMPUTER OUTPUT

Many of the procedures presented in this chapter can be implemented using statistical software. In this section, we present the Minitab® output for the data in Example 14-1. Recall that Example 14-1 provides data on the effect of process operating temperature on product yield. The Minitab® output is
The regression equation is
Yield = - 2.74 + 0.483 Temp

Predictor        Coef     SE Coef        T        P
Constant       -2.739       1.546    -1.77    0.114
Temp          0.48303     0.01046    46.17    0.000

S = 0.9503    R-Sq = 99.6%    R-Sq(adj) = 99.6%

Analysis of Variance

Source            DF        SS        MS         F        P
Regression         1    1924.9    1924.9   2131.57    0.000
Residual Error     8       7.2       0.9
Total              9    1932.1
The regression equation is provided along with the results from the t-tests on the individual coefficients. The P-values indicate that the intercept does not appear to be significant (P-value = 0.114), while the regressor variable, temperature, is statistically significant (P-value = 0). The analysis of variance is also testing the hypothesis H₀: β₁ = 0, which can be rejected (P-value = 0). Note also that t = 46.17 for temperature, and t² = (46.17)² ≈ 2131.67 = F. Aside from rounding, the computer results are in agreement with those found earlier in the chapter.
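Two consistency checks on the output above can be done by hand: the reported R-Sq is SS_R/S_yy from the ANOVA table, and the square of the slope's t statistic equals the ANOVA F statistic up to rounding.

```python
# Consistency checks on the quantities printed in the Minitab output above.
ss_r, ss_t = 1924.9, 1932.1      # regression and total sums of squares
r_sq = ss_r / ss_t               # about 0.996, printed as R-Sq = 99.6%

t_temp, f_stat = 46.17, 2131.57  # t for Temp and the ANOVA F
print(round(100 * r_sq, 1), round(t_temp ** 2, 2))  # t^2 matches F up to rounding
```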
14-9 SUMMARY

This chapter has introduced the simple linear regression model and shown how least-squares estimates of the model parameters may be obtained. Hypothesis-testing procedures
and confidence interval estimates of the model parameters have also been developed. Tests of hypotheses and confidence intervals require the assumption that the observations y are normally and independently distributed random variables. Procedures for testing model adequacy, including a lack-of-fit test and residual analysis, were presented. The correlation model was also introduced to deal with the case where x and y are jointly normally distributed. The equivalence of the regression model parameter estimation problem for the case where x and y are jointly normal to the case where x is a mathematical variable was also discussed. Procedures for obtaining point and interval estimates of the correlation coefficient and for testing hypotheses about the correlation coefficient were developed.
14-10 EXERCISES

14-1. Montgomery, Peck, and Vining (2001) present data concerning the performance of the 28 National Football League teams in 1976. It is suspected that the number of games won (y) is related to the number of yards gained rushing by an opponent (x). The data are shown below.

Yards Rushing by
Teams Washington Minnesota New England Oakland Pittsburgh
Baltimore Los Angeles Dallas Atlanta Buffalo Chicago
Cincinnati Cleveland Denver Detroit Green Bay Houston Kansas City Miami New Orleans New York Giants
Games Won (y)
Opponent (x)
10
2205 2096 1847 1903
11 11
13 10
1848
10 1!
1564
4
2577
2 7 10 9 9 6 5 5 5
2476 1984 1917 1761 1709 1901
6 4 3
New York Jets
3
Philadelphia St. Louis San Diego
4
San Francisco Seattle
Tampa Bay
4151
11
10 6 8 2 0
1321
2288 2072 2861 2411
2289 2203 2592 2053 1979 2048 1786 2876 2560
(a) Fit a linear regression model relating games won to yards gained by an opponent.
(b) Test for significance of regression.
(c) Find a 95% confidence interval for the slope.
(d) What percentage of total variability is explained by the model?
(e) Find the residuals and prepare appropriate residual plots.

14-2. Suppose we would like to use the model developed in Exercise 14-1 to predict the number of games a team will win if it can limit the opponents to 1800 yards rushing. Find a point estimate of the number of games won if the opponents gain only 1800 yards rushing. Find a 95% prediction interval on the number of games won.
14-3. Motor Trend magazine frequently presents performance data for automobiles. The table below presents data from the 1975 volume of Motor Trend concerning the gasoline mileage performance and the engine displacement for 15 automobiles.
Automobile       Miles/Gallon (y)
Apollo           18.90
Omega            17.00
Nova             20.00
Monarch          18.25
Duster           20.01
Jensen Conv.
Skyhawk
Monza
Corolla SR-5
Camaro
Eldorado
Trans Am
Charger SE
Cougar
Corvette
(a) Fit a regression model relating mileage performance to engine displacement.
(b) Test for significance of regression.
(c) What percentage of total variability in mileage is explained by the model?
(d) Find a 90% confidence interval on the mean mileage if the engine displacement is 275 cubic inches.

14-4. Suppose that we wish to predict the gasoline mileage from a car with a 275 cubic inch displacement engine. Find a point estimate, using the model developed in Exercise 14-3, and an appropriate 90% interval estimate. Compare this interval to the one obtained in Exercise 14-3d. Which one is wider, and why?

14-5. Find the residuals from the model in Exercise 14-3. Prepare appropriate residual plots and comment on model adequacy.

14-6. An article in Technometrics by S. C. Narula and J. F. Wellington ("Prediction, Linear Regression, and a Minimum Sum of Relative Errors," Vol. 19, 1977) presents data on the selling price and annual taxes for 27 houses. The data are shown below.
(a) Fit a regression model relating sales price to taxes paid.
(b) Test for significance of regression.
(c) What percentage of the variability in selling price is explained by the taxes paid?
(d) Find the residuals for this model. Construct a normal probability plot for the residuals. Plot the residuals versus ŷ and versus x. Does the model seem satisfactory?

14-7. The strength of paper used in the manufacture of cardboard boxes (y) is related to the percentage of hardwood concentration in the original pulp (x). Under controlled conditions, a pilot plant manufactures 16 samples, each from a different batch of pulp, and measures the tensile strength. The data are shown here.
Table for Exercise 14-7: hardwood concentration x (%) and tensile strength y for the 16 batches of pulp.
(a) Fit a simple linear regression model to the data.
(b) Test for lack of fit and significance of regression.
(c) Construct a 90% confidence interval on the slope β₁.
(d) Construct a 90% confidence interval on the intercept β₀.
(e) Construct a 95% confidence interval on the true regression line at x = 2.5.

14-8. Compute the residuals for the regression model in Exercise 14-7. Prepare appropriate residual plots and comment on model adequacy.

Table for Exercise 14-6: sale price/$1000 and taxes (local, school, county)/$1000 for the 27 houses.

14-9. The number of pounds of steam used per month by a chemical plant is thought to be related to the average ambient temperature for that month. The past year's usage and temperatures are shown in the following table.

Month    Temp.    Usage/1000
Jan.     21       185.79
Feb.     24       214.47
Mar.     32       288.03
Apr.     47       424.84
May      50       454.58
June     59       539.03
July     68       621.55
Aug.     74       675.06
Sept.    62       562.03
Oct.     50       452.93
Nov.     41       369.95
Dec.     30       273.98
(a) Fit a simple linear regression model to the data.
(b) Test for significance of regression.
(c) Test the hypothesis that the slope β₁ = 10.
(d) Construct a 99% confidence interval about the true regression line at x = 58.
(e) Construct a 99% prediction interval on the steam usage in the next month having a mean ambient temperature of 58°.

14-10. Compute the residuals for the regression model in Exercise 14-9. Prepare appropriate residual plots and comment on model adequacy.

14-11. The percentage of impurity in oxygen gas produced by a distilling process is thought to be related to the percentage of hydrocarbon in the main condenser of the processor. One month's operating data are available, as shown in the table at the bottom of this page.
(a) Fit a simple linear regression model to the data.
(b) Test for lack of fit and significance of regression.
(c) Calculate R² for this model.
(d) Calculate a 95% confidence interval for the slope β₁.

14-12. Compute the residuals for the data in Exercise 14-11.
(a) Plot the residuals on normal probability paper and draw appropriate conclusions.
(b) Plot the residuals against ŷ and x. Interpret these displays.

14-13. An article in Transportation Research (1999, p. 183) presents a study on world maritime employment. The purpose of the study was to determine a relationship between average manning level and the average size of the fleet. Manning level refers to the ratio of the number of posts that must be manned by a seaman per ship (posts/ship). Data collected for ships of the United Kingdom over a 16-year period are
Average Size    Manning Level
9154            20.27
9277            19.98
9221            20.28
9198            19.65
8705            18.81
(table continues)
(a) Fit a linear regression model relating average manning level to average ship size.
(b) Test for significance of regression.
(c) Find a 95% confidence interval on the slope.
(d) What percentage of total variability is explained by the model?
(e) Find the residuals and construct appropriate residual plots.

14-14. The final averages for 20 randomly selected students taking a course in engineering statistics and a course in operations research at Georgia Tech are shown here. Assume that the final averages are jointly normally distributed.

Table for Exercise 14-14: Statistics and OR final averages for the 20 students.

(a) Find the regression line relating the statistics final average to the OR final average.
(b) Estimate the correlation coefficient.
(c) Test the hypothesis that ρ = 0.
(d) Test the hypothesis that ρ = 0.5.
(e) Construct a 95% confidence interval estimate of the correlation coefficient.

14-15. The weight and systolic blood pressure of 26 randomly selected males in the age group 25-30 are
Table for Exercise 14-11: purity (%) and hydrocarbon (%) observations.
Weight
Systolic BP
165 167 180
130
2 3
155 212
128
4
5 6 7
133 150 lSI
13
175 190 210 200 149 158 169 170
14 15
172 159
153 128
16 17
168 174 183
149
8 9
10 11 12
18 19 20 21 22 23 24
25 26
215 195 180 143
:240 235 192 187
146
150 140 148 125
Is the estimator of the slope in the simple linear regression model unbiased?

14-18. Suppose that we are fitting a straight line and we wish to make the variance of the slope β̂₁ as small as possible. Where should the observations xᵢ, i = 1, 2, ..., n, be taken so as to minimize V(β̂₁)? Discuss the practical implications of this allocation of the xᵢ.
14-19. Weighted Least Squares. Suppose that we are fitting the straight line y = β₀ + β₁x + ε, but that the variance of the y values now depends on the level of x; that is,

V(yᵢ|xᵢ) = σ²/wᵢ,   i = 1, 2, ..., n,

where the wᵢ are unknown constants, often called weights. Show that the resulting least-squares normal equations are

β̂₀ Σᵢ₌₁ⁿ wᵢ + β̂₁ Σᵢ₌₁ⁿ wᵢxᵢ = Σᵢ₌₁ⁿ wᵢyᵢ,

β̂₀ Σᵢ₌₁ⁿ wᵢxᵢ + β̂₁ Σᵢ₌₁ⁿ wᵢxᵢ² = Σᵢ₌₁ⁿ wᵢxᵢyᵢ.
(a) Find a regression line relating systolic blood pressure to weight.
(b) Estimate the correlation coefficient.
(c) Test the hypothesis that ρ = 0.
(d) Test the hypothesis that ρ = 0.6.
(e) Construct a 95% confidence interval estimate of the correlation coefficient.

14-16. Consider the simple linear regression model y = β₀ + β₁x + ε. Show that E(MS_R) = σ² + β₁²S_xx.

14-17. Suppose that we have assumed the straight-line regression model

y = β₀ + β₁x₁ + ε,

but that the response is affected by a second variable, x₂, such that the true regression function is

E(y) = β₀ + β₁x₁ + β₂x₂.
14-20. Consider the data shown below. Suppose that the relationship between y and x is hypothesized to be y = (β₀ + β₁x + ε)⁻¹. Fit an appropriate model to the data. Does the assumed model form seem appropriate?

x | 6
y | 0.24

14-21. Consider the weight and blood pressure data in Exercise 14-15. Fit a no-intercept model to the data, and compare it to the model obtained in Exercise 14-15. Which model is superior?
14-22. The following data, adapted from Montgomery, Peck, and Vining (2001), present the number of certified mental defectives per 10,000 of estimated population in the United Kingdom (y) and the number of radio receiver licenses issued (x) by the BBC (in millions) for the years 1924-1937. Fit a regression model relating y to x. Comment on the model. Specifically, does the existence of a strong correlation imply a cause-and-effect relationship?
Table for Exercise 14-22: year, y, and x for 1924-1937.
Multiple Regression

Many regression problems involve more than one regressor variable. Such models are called multiple regression models. Multiple regression is one of the most widely used statistical techniques. This chapter presents the basic techniques of parameter estimation, confidence interval estimation, and model adequacy checking for multiple regression. We also introduce some of the special problems often encountered in the practical use of multiple regression, including model building and variable selection, autocorrelation in the errors, and multicollinearity or near-linear dependence among the regressors.
15-1 MULTIPLE REGRESSION MODELS

A regression model that involves more than one regressor variable is called a multiple regression model. As an example, suppose that the effective life of a cutting tool depends on the cutting speed and the tool angle. A multiple regression model that might describe this relationship is

y = β₀ + β₁x₁ + β₂x₂ + ε,   (15-1)

where y represents the tool life, x₁ represents the cutting speed, and x₂ represents the tool angle. This is a multiple linear regression model with two regressors. The term "linear" is used because equation 15-1 is a linear function of the unknown parameters β₀, β₁, and β₂. Note that the model describes a plane in the two-dimensional x₁, x₂ space. The parameter β₀ defines the intercept of the plane. We sometimes call β₁ and β₂ partial regression coefficients, because β₁ measures the expected change in y per unit change in x₁ when x₂ is held constant, and β₂ measures the expected change in y per unit change in x₂ when x₁ is held constant. In general, the dependent variable or response y may be related to k independent variables. The model

y = β₀ + β₁x₁ + β₂x₂ + ... + βₖxₖ + ε   (15-2)
is called a multiple linear regression model with k independent variables. The parameters βⱼ, j = 0, 1, ..., k, are called the regression coefficients. This model describes a hyperplane in the k-dimensional space of the regressor variables {xⱼ}. The parameter βⱼ represents the expected change in response y per unit change in xⱼ when all the remaining independent variables xᵢ (i ≠ j) are held constant. The parameters βⱼ, j = 1, 2, ..., k, are often called partial regression coefficients, because they describe the partial effect of one independent variable when the other independent variables in the model are held constant. Multiple linear regression models are often used as approximating functions. That is, the true functional relationship between y and x₁, x₂, ..., xₖ is unknown, but over certain ranges of the independent variables the linear regression model is an adequate approximation.
Models that are more complex in appearance than equation 15-2 may often still be analyzed by multiple linear regression techniques. For example, consider the cubic polynomial model in one independent variable,

y = β₀ + β₁x + β₂x² + β₃x³ + ε.   (15-3)

If we let x₁ = x, x₂ = x², and x₃ = x³, then equation 15-3 can be written

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε,   (15-4)

which is a multiple linear regression model with three regressor variables. Models that include interaction effects may also be analyzed by multiple linear regression methods. For example, suppose that the model is

y = β₀ + β₁x₁ + β₂x₂ + β₁₂x₁x₂ + ε.   (15-5)

If we let x₃ = x₁x₂ and β₃ = β₁₂, then equation 15-5 can be written

y = β₀ + β₁x₁ + β₂x₂ + β₃x₃ + ε,   (15-6)

which is a linear regression model. In general, any regression model that is linear in the parameters (the β's) is a linear regression model, regardless of the shape of the surface that it generates.
15-2 ESTIMATION OF THE PARAMETERS

The method of least squares may be used to estimate the regression coefficients in equation 15-2. Suppose that n > k observations are available, and let xᵢⱼ denote the ith observation or level of variable xⱼ. The data will appear as in Table 15-1. We assume that the error term ε in the model has E(ε) = 0, V(ε) = σ², and that the {εᵢ} are uncorrelated random variables. We may write the model, equation 15-2, in terms of the observations,

yᵢ = β₀ + β₁xᵢ₁ + β₂xᵢ₂ + ... + βₖxᵢₖ + εᵢ
   = β₀ + Σⱼ₌₁ᵏ βⱼxᵢⱼ + εᵢ,   i = 1, 2, ..., n.   (15-7)

The least-squares function is

L = Σᵢ₌₁ⁿ εᵢ² = Σᵢ₌₁ⁿ (yᵢ − β₀ − Σⱼ₌₁ᵏ βⱼxᵢⱼ)².   (15-8)

Table 15-1 Data for Multiple Linear Regression: y, x₁, x₂, ..., xₖ.
The function L is to be minimized with respect to β₀, β₁, ..., βₖ. The least-squares estimators of β₀, β₁, ..., βₖ must satisfy

∂L/∂β₀ = 0   (15-9a)

and

∂L/∂βⱼ = 0,   j = 1, 2, ..., k.   (15-9b)

Simplifying equation 15-9, we obtain the least-squares normal equations

nβ̂₀ + β̂₁ Σᵢ xᵢ₁ + β̂₂ Σᵢ xᵢ₂ + ... + β̂ₖ Σᵢ xᵢₖ = Σᵢ yᵢ,
β̂₀ Σᵢ xᵢ₁ + β̂₁ Σᵢ xᵢ₁² + β̂₂ Σᵢ xᵢ₁xᵢ₂ + ... + β̂ₖ Σᵢ xᵢ₁xᵢₖ = Σᵢ xᵢ₁yᵢ,
  ⋮
β̂₀ Σᵢ xᵢₖ + β̂₁ Σᵢ xᵢₖxᵢ₁ + β̂₂ Σᵢ xᵢₖxᵢ₂ + ... + β̂ₖ Σᵢ xᵢₖ² = Σᵢ xᵢₖyᵢ,   (15-10)

where all sums run over i = 1, 2, ..., n.
Note that there are p = k + 1 normal equations, one for each of the unknown regression coefficients. The solution to the normal equations will be the least-squares estimators of the regression coefficients β̂₀, β̂₁, ..., β̂ₖ. It is simpler to solve the normal equations if they are expressed in matrix notation. We now give a matrix development of the normal equations that parallels the development of equation 15-10. The model in terms of the observations, equation 15-7, may be written in matrix notation,

y = Xβ + ε,

where

y = [y₁, y₂, ..., yₙ]',

X = [ 1  x₁₁  x₁₂  ...  x₁ₖ
      1  x₂₁  x₂₂  ...  x₂ₖ
      ⋮   ⋮    ⋮         ⋮
      1  xₙ₁  xₙ₂  ...  xₙₖ ],

β = [β₀, β₁, ..., βₖ]',  and  ε = [ε₁, ε₂, ..., εₙ]'.
In general, y is an (n × 1) vector of the observations, X is an (n × p) matrix of the levels of the independent variables, β is a (p × 1) vector of the regression coefficients, and ε is an (n × 1) vector of random errors. We wish to find the vector of least-squares estimators, β̂, that minimizes

L = Σᵢ₌₁ⁿ εᵢ² = ε'ε = (y − Xβ)'(y − Xβ).
Note that L may be expressed as
L = y'y − β'X'y − y'Xβ + β'X'Xβ
  = y'y − 2β'X'y + β'X'Xβ,   (15-11)

since β'X'y is a (1 × 1) matrix, hence a scalar, and its transpose (β'X'y)' = y'Xβ is the same scalar. The least-squares estimators must satisfy

∂L/∂β evaluated at β̂: −2X'y + 2X'Xβ̂ = 0,

which simplifies to

X'Xβ̂ = X'y.   (15-12)

Equations 15-12 are the least-squares normal equations. They are identical to equations 15-10. To solve the normal equations, multiply both sides of equation 15-12 by the inverse of X'X. Thus, the least-squares estimator of β is

β̂ = (X'X)⁻¹X'y.   (15-13)
It is easy to see that the matrix form of the normal equations is identical to the scalar form. Writing out equation 15-12 in detail, we obtain

[ n        Σᵢxᵢ₁      Σᵢxᵢ₂      ...  Σᵢxᵢₖ    ] [β̂₀]   [ Σᵢyᵢ    ]
[ Σᵢxᵢ₁    Σᵢxᵢ₁²     Σᵢxᵢ₁xᵢ₂   ...  Σᵢxᵢ₁xᵢₖ ] [β̂₁] = [ Σᵢxᵢ₁yᵢ ]
[  ⋮         ⋮          ⋮               ⋮     ] [ ⋮ ]   [  ⋮     ]
[ Σᵢxᵢₖ    Σᵢxᵢₖxᵢ₁   Σᵢxᵢₖxᵢ₂   ...  Σᵢxᵢₖ²   ] [β̂ₖ]   [ Σᵢxᵢₖyᵢ ]

If the indicated matrix multiplication is performed, the scalar form of the normal equations (that is, equation 15-10) will result. In this form it is easy to see that X'X is a (p × p) symmetric matrix and X'y is a (p × 1) column vector. Note the special structure of the X'X matrix. The diagonal elements of X'X are the sums of squares of the elements in the columns of X, and the off-diagonal elements are the sums of cross products of the elements in the columns of X. Furthermore, note that the elements of X'y are the sums of cross products of the columns of X and the observations {yᵢ}. The fitted regression model is
ŷ = Xβ̂.   (15-14)

In scalar notation, the fitted model is

ŷᵢ = β̂₀ + Σⱼ₌₁ᵏ β̂ⱼxᵢⱼ,   i = 1, 2, ..., n.

The difference between the observation yᵢ and the fitted value ŷᵢ is a residual, say eᵢ = yᵢ − ŷᵢ. The (n × 1) vector of residuals is denoted

e = y − ŷ.   (15-15)
Example 15-1

An article in the Journal of Agricultural Engineering and Research (2001, p. 275) describes the use of a regression model to relate the damage susceptibility of peaches to the height at which they are dropped (drop height, measured in mm) and the density of the fruit (measured in g/cm³). The objective of the analysis is to provide a predictive model for peach damage to serve as a guideline for harvesting and postharvesting operations. Data typical of this type of experiment are given in Table 15-2. We will fit the multiple linear regression model

y = β₀ + β₁x₁ + β₂x₂ + ε

to these data. The X matrix and y vector are formed from the data in Table 15-2, and the fitted regression model is

ŷ = −33.831 + 0.01314x₁ + 34.890x₂.

Table 15-3 shows the fitted values of y and the residuals. The fitted values and residuals are calculated to the same accuracy as the original data.
Table 15-3 Observations, Fitted Values, and Residuals for Example 15-1.
The statistical properties of the least-squares estimator β̂ may be easily demonstrated. Consider first bias:

E(β̂) = E[(X'X)⁻¹X'y] = E[(X'X)⁻¹X'(Xβ + ε)] = E[(X'X)⁻¹X'Xβ + (X'X)⁻¹X'ε] = β,

since E(ε) = 0 and (X'X)⁻¹X'X = I. Thus β̂ is an unbiased estimator of β. The variance property of β̂ is expressed by the covariance matrix

Cov(β̂) = E{[β̂ − E(β̂)][β̂ − E(β̂)]'}.

The covariance matrix of β̂ is a (p × p) symmetric matrix whose jjth element is the variance of β̂ⱼ and whose (i, j)th element is the covariance between β̂ᵢ and β̂ⱼ. The covariance matrix of β̂ is

Cov(β̂) = σ²(X'X)⁻¹.

It is usually necessary to estimate σ². To develop this estimator, consider the sum of squares of the residuals, say

SS_E = Σᵢ₌₁ⁿ eᵢ² = e'e.
Substituting e = y − ŷ = y − Xβ̂, we have

SS_E = (y − Xβ̂)'(y − Xβ̂)
     = y'y − β̂'X'y − y'Xβ̂ + β̂'X'Xβ̂
     = y'y − 2β̂'X'y + β̂'X'Xβ̂.

Since X'Xβ̂ = X'y, this last equation becomes

SS_E = y'y − β̂'X'y.   (15-16)

Equation 15-16 is called the error or residual sum of squares, and it has n − p degrees of freedom associated with it. The mean square for error is

MS_E = SS_E/(n − p).   (15-17)

It can be shown that the expected value of MS_E is σ²; thus an unbiased estimator of σ² is given by

σ̂² = MS_E.   (15-18)
Example 15-2

We will estimate the error variance σ² for the multiple regression problem in Example 15-1. Using the data in Table 15-2, we find

y'y = Σᵢ₌₁²⁰ yᵢ² = 904.60

and

β̂'X'y = [−33.831  0.01314  34.890] [120.79  51129.17  122.70]' = 866.39.

Therefore, the error sum of squares is

SS_E = y'y − β̂'X'y = 904.60 − 866.39 = 38.21.

The estimate of σ² is

σ̂² = SS_E/(n − p) = 38.21/(20 − 3) = 2.247.
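The arithmetic of this example can be verified in a few lines, using the quantities y'y and β̂'X'y quoted above:

```python
# Recomputing SS_E and sigma^2-hat from Example 15-2 (n = 20, p = 3).
yty = 904.60     # y'y
bXy = 866.39     # beta_hat' X'y
n, p = 20, 3

ss_e = yty - bXy                 # error sum of squares, eq. 15-16
sigma2_hat = ss_e / (n - p)      # MS_E, eqs. 15-17 and 15-18
print(ss_e, round(sigma2_hat, 3))
```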
15-3 CONFIDENCE INTERVALS IN MULTIPLE LINEAR REGRESSION

It is often necessary to construct confidence interval estimates for the regression coefficients {βⱼ}. The development of a procedure for obtaining these confidence intervals requires that we assume the errors {εᵢ} to be normally and independently distributed with mean zero and variance σ². Therefore, the observations {yᵢ} are normally and independently distributed with mean β₀ + Σⱼ₌₁ᵏ βⱼxᵢⱼ and variance σ². Since the least-squares estimator β̂ is a linear combination of the observations, it follows that β̂ is normally distributed with mean vector β and covariance matrix σ²(X'X)⁻¹. Then each of the quantities

(β̂ⱼ − βⱼ)/√(σ̂²Cⱼⱼ),   j = 0, 1, ..., k,   (15-19)

is distributed as t with n − p degrees of freedom, where Cⱼⱼ is the jjth element of the (X'X)⁻¹ matrix and σ̂² is the estimate of the error variance, obtained from equation 15-18. Therefore, a 100(1 − α)% confidence interval for the regression coefficient βⱼ, j = 0, 1, ..., k, is

β̂ⱼ − t_α/2,n−p √(σ̂²Cⱼⱼ) ≤ βⱼ ≤ β̂ⱼ + t_α/2,n−p √(σ̂²Cⱼⱼ).   (15-20)
Example 15-3

We will construct a 95% confidence interval on the parameter β₁ in Example 15-1. Note that the point estimate of β₁ is β̂₁ = 0.01314, and the diagonal element of (X'X)⁻¹ corresponding to β₁ is C₁₁ = 0.0000077. The estimate of σ² is σ̂² = 2.247, and t₀.₀₂₅,₁₇ = 2.110. Therefore, the 95% confidence interval on β₁ is

0.01314 − (2.110)√((2.247)(0.0000077)) ≤ β₁ ≤ 0.01314 + (2.110)√((2.247)(0.0000077)),

which reduces to

0.00436 ≤ β₁ ≤ 0.02192.
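The interval can be recomputed directly from the quantities in the example (β̂₁, C₁₁, σ̂², and the t critical value):

```python
# Recomputing the 95% confidence interval on beta_1 from Example 15-3.
import math

b1_hat = 0.01314          # point estimate of beta_1
c11 = 0.0000077           # diagonal element of (X'X)^{-1} for beta_1
sigma2_hat = 2.247        # estimate of the error variance
t_crit = 2.110            # t_{0.025, 17}

half = t_crit * math.sqrt(sigma2_hat * c11)   # half-width of the interval, eq. 15-20
lo, hi = b1_hat - half, b1_hat + half
print(round(lo, 5), round(hi, 5))
```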
We may also obtain a confidence interval on the mean response at a particular point, say (x₀₁, x₀₂, ..., x₀ₖ). To estimate the mean response at this point, define the vector

x₀ = [1, x₀₁, x₀₂, ..., x₀ₖ]'.

The estimated mean response at this point is

ŷ₀ = x₀'β̂.   (15-21)

This estimator is unbiased, since E(ŷ₀) = E(x₀'β̂) = x₀'β = E(y₀), and the variance of ŷ₀ is

V(ŷ₀) = σ²x₀'(X'X)⁻¹x₀.   (15-22)

Therefore, a 100(1 − α)% confidence interval on the mean response at the point (x₀₁, x₀₂, ..., x₀ₖ) is

ŷ₀ − t_α/2,n−p √(σ̂²x₀'(X'X)⁻¹x₀) ≤ E(y₀) ≤ ŷ₀ + t_α/2,n−p √(σ̂²x₀'(X'X)⁻¹x₀).   (15-23)

Equation 15-23 is a confidence interval about the regression hyperplane. It is the multiple regression generalization of equation 14-33.
Chapter 15 Multiple Regression
Example 15-4
The scientists conducting the experiment on damaged peaches in Example 15-1 would like to construct a 95% confidence interval on the mean damage for a peach dropped from a height of x₁ = 325 mm if its density is x₂ = 0.98 g/cm³. Therefore,

x₀ = [1, 325, 0.98]'.

The estimated mean response at this point is found from Equation 15-21 to be

ŷ₀ = x₀'β̂ = [1  325  0.98] [−33.831, 0.01314, 34.890]' = 4.63.
The variance of ŷ₀ is estimated by

σ̂² x₀'(X'X)⁻¹x₀ = 2.247 [1  325  0.98] [  24.63666    0.005321   −26.74679
                                            0.005321   0.0000077   −0.008353
                                          −26.74679   −0.008353    30.096389 ] [1, 325, 0.98]'

= 2.247(0.0718) = 0.1613.

Therefore, a 95% confidence interval on the mean damage at this point is found from Equation 15-23 to be

4.63 − 2.110√0.1613 ≤ E(y₀) ≤ 4.63 + 2.110√0.1613,
which reduces to

3.78 ≤ E(y₀) ≤ 5.48.
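The quadratic form x₀'(X'X)⁻¹x₀ and the resulting interval can be checked with a few lines of numpy. This is a sketch using the (X'X)⁻¹ entries as printed in the text for Example 15-4; because those entries are rounded, the recomputed quadratic form differs slightly from the text's 0.0718.

```python
import numpy as np

# (X'X)^-1 as printed in Example 15-4 (entries rounded in the text).
XtX_inv = np.array([
    [ 24.63666,   0.005321, -26.74679],
    [  0.005321,  0.0000077, -0.008353],
    [-26.74679,  -0.008353,  30.096389],
])
x0 = np.array([1.0, 325.0, 0.98])

q = x0 @ XtX_inv @ x0              # ~0.07 (text: 0.0718)
half = 2.110 * np.sqrt(2.247 * q)  # half-width of the interval in Eq. 15-23
y0_hat = 4.63
print(round(y0_hat - half, 2), round(y0_hat + half, 2))
```

The printed endpoints agree with the text's 3.78 and 5.48 to within the rounding of the matrix entries.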
15-4 PREDICTION OF NEW OBSERVATIONS

The regression model can be used to predict future observations on y corresponding to particular values of the independent variables, say x₀₁, x₀₂, ..., x₀ₖ. If x₀' = [1, x₀₁, x₀₂, ..., x₀ₖ], then a point estimate of the future observation y₀ at the point (x₀₁, x₀₂, ..., x₀ₖ) is

ŷ₀ = x₀'β̂.   (15-24)

A 100(1 − α)% prediction interval for this future observation is

ŷ₀ − t_{α/2,n−p} √(σ̂²(1 + x₀'(X'X)⁻¹x₀)) ≤ y₀ ≤ ŷ₀ + t_{α/2,n−p} √(σ̂²(1 + x₀'(X'X)⁻¹x₀)).   (15-25)

This prediction interval is a generalization of the prediction interval for a future observation in simple linear regression, Equation 14-35. In predicting new observations and in estimating the mean response at a given point (x₀₁, x₀₂, ..., x₀ₖ), one must be careful about extrapolating beyond the region containing the original observations. It is very possible that a model that fits well in the region of the original data will no longer fit well outside that region. In multiple regression it is often easy to inadvertently extrapolate, since the levels of the variables (x_i1, x_i2, ..., x_ik), i = 1, 2, ..., n, jointly define the region containing the data. As an example, consider Fig. 15-1, which illustrates the region containing the observations for a two-variable regression model. Note that the point (x₀₁, x₀₂) lies within the ranges of both independent variables x₁ and x₂, but it is outside the region of the original observations. Thus, either predicting the value of a new observation or estimating the mean response at this point is an extrapolation of the original regression model.

Figure 15-1 An example of extrapolation in multiple regression.
Example 15-5
Suppose that the scientists in Example 15-1 wish to construct a 95% prediction interval on the damage to a peach that is dropped from a height of x₁ = 325 mm and has a density of x₂ = 0.98 g/cm³. Note that x₀' = [1, 325, 0.98], and the point estimate of the damage is ŷ₀ = x₀'β̂ = 4.63 mm. Also, in Example 15-4 we calculated x₀'(X'X)⁻¹x₀ = 0.0718. Therefore, from Equation 15-25 we have

4.63 − 2.110√(2.247(1 + 0.0718)) ≤ y₀ ≤ 4.63 + 2.110√(2.247(1 + 0.0718)),

and the 95% prediction interval is

1.36 ≤ y₀ ≤ 7.90.
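The prediction interval differs from the mean-response interval only by the extra "1 +" under the square root, which accounts for the variance of the new observation itself. A minimal check using the quantities quoted in the text:

```python
import math

y0_hat = 4.63      # point estimate x0' * beta_hat
q = 0.0718         # x0' (X'X)^-1 x0, from Example 15-4
sigma2_hat = 2.247
t_crit = 2.110     # t_{0.025,17}

half = t_crit * math.sqrt(sigma2_hat * (1 + q))  # Equation 15-25
print(round(y0_hat - half, 2), round(y0_hat + half, 2))  # 1.36 7.9
```

Note how much wider this is than the mean-response interval [3.78, 5.48]: predicting an individual observation is harder than estimating a mean.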
15-5 HYPOTHESIS TESTING IN MULTIPLE LINEAR REGRESSION

In multiple linear regression problems, certain tests of hypotheses about the model parameters are useful in measuring model adequacy. In this section, we describe several important hypothesis-testing procedures. We continue to require the normality assumption on the errors, which was introduced in the previous section.

15-5.1 Test for Significance of Regression

The test for significance of regression is a test to determine whether there is a linear relationship between the dependent variable y and a subset of the independent variables x₁, x₂, ..., xₖ. The appropriate hypotheses are
H₀: β₁ = β₂ = ··· = βₖ = 0,
H₁: β_j ≠ 0 for at least one j.   (15-26)

Rejection of H₀ implies that at least one of the independent variables x₁, x₂, ..., xₖ contributes significantly to the model. The test procedure is a generalization of the procedure used in simple linear regression. The total sum of squares S_yy is partitioned into a sum of squares due to regression and a sum of squares due to error, say

S_yy = SS_R + SS_E,

and if H₀ is true, then SS_R/σ² ~ χ²_k, where the number of degrees of freedom for the χ² is equal to the number of regressor variables in the model. Also, we can show that SS_E/σ² ~ χ²_{n−k−1}, and that SS_E and SS_R are independent. The test procedure for H₀ is to compute

F₀ = (SS_R/k)/(SS_E/(n − k − 1)) = MS_R/MS_E   (15-27)

and to reject H₀ if F₀ > F_{α,k,n−k−1}. The procedure is usually summarized in an analysis of variance table such as Table 15-4.

A computational formula for SS_R may be found easily. We have derived a computational formula for SS_E in Equation 15-16; that is,
SS_E = y'y − β̂'X'y.

Now, since S_yy = Σ_{i=1}^n y_i² − (Σ_{i=1}^n y_i)²/n = y'y − (Σ_{i=1}^n y_i)²/n, we may rewrite the foregoing equation as

SS_E = y'y − (Σ_{i=1}^n y_i)²/n − [β̂'X'y − (Σ_{i=1}^n y_i)²/n],

or

SS_E = S_yy − [β̂'X'y − (Σ_{i=1}^n y_i)²/n].

Therefore, the regression sum of squares is

SS_R = β̂'X'y − (Σ_{i=1}^n y_i)²/n,   (15-28)

the error sum of squares is

SS_E = y'y − β̂'X'y,   (15-29)

and the total sum of squares is

S_yy = y'y − (Σ_{i=1}^n y_i)²/n.   (15-30)
Table 15-4 Analysis of Variance for Significance of Regression in Multiple Regression

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square
Regression             SS_R              k                     MS_R
Error or residual      SS_E              n − k − 1             MS_E
Total                  S_yy              n − 1
Example 15-6
We will test for significance of regression using the damaged peaches data from Example 15-1. Some of the numerical quantities required are calculated in Example 15-2. Note that

S_yy = y'y − (Σ_{i=1}^n y_i)²/n = 904.60 − (120.79)²/20 = 175.089,

SS_R = β̂'X'y − (Σ_{i=1}^n y_i)²/n = 866.39 − (120.79)²/20 = 136.88,

SS_E = S_yy − SS_R = y'y − β̂'X'y = 38.21.

The analysis of variance is shown in Table 15-5. To test H₀: β₁ = β₂ = 0, we calculate the statistic

F₀ = MS_R/MS_E = 68.44/2.247 = 30.46.

Since F₀ > F_{0.05,2,17} = 3.59, peach damage is related to drop height, fruit density, or both. However, we note that this does not necessarily imply that the relationship found is an appropriate one for predicting damage as a function of drop height or fruit density. Further tests of model adequacy are required.
Table 15-5 Test for Significance of Regression for Example 15-6

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F₀
Regression             136.88            2                     68.44          30.46
Error                  38.21             17                    2.247
Total                  175.09            19
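The entries of Table 15-5 follow directly from the computational formulas 15-28 through 15-30. A short sketch using the summary quantities from Example 15-6 (y'y = 904.60, β̂'X'y = 866.39, Σy_i = 120.79):

```python
# Summary quantities from Example 15-6.
yty, bXty, sum_y = 904.60, 866.39, 120.79
n, k = 20, 2

correction = sum_y**2 / n
Syy = yty - correction        # Equation 15-30, total sum of squares
SSR = bXty - correction       # Equation 15-28, regression sum of squares
SSE = Syy - SSR               # Equation 15-29, error sum of squares
F0 = (SSR / k) / (SSE / (n - k - 1))  # Equation 15-27
print(round(Syy, 2), round(SSR, 2), round(SSE, 2), round(F0, 2))
```

The F statistic agrees with the table's 30.46 to within the rounding of the intermediate mean squares.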
15-5.2 Tests on Individual Regression Coefficients

We are frequently interested in testing hypotheses on the individual regression coefficients. Such tests would be useful in determining the value of each of the independent variables in the regression model. For example, the model might be more effective with the inclusion of additional variables, or perhaps with the deletion of one or more of the variables already in the model.

Adding a variable to a regression model always causes the sum of squares for regression to increase and the error sum of squares to decrease. We must decide whether the increase in the regression sum of squares is sufficient to warrant using the additional variable in the model. Furthermore, adding an unimportant variable to the model can actually increase the mean square error, thereby decreasing the usefulness of the model. The hypotheses for testing the significance of any individual regression coefficient, say
β_j, are

H₀: β_j = 0,
H₁: β_j ≠ 0.   (15-31)

If H₀: β_j = 0 is not rejected, then this indicates that x_j can possibly be deleted from the model. The test statistic for this hypothesis is

t₀ = β̂_j/√(σ̂² C_jj),   (15-32)

where C_jj is the diagonal element of (X'X)⁻¹ corresponding to β̂_j. The null hypothesis H₀: β_j = 0 is rejected if |t₀| > t_{α/2,n−k−1}. Note that this is really a partial or marginal test, because the regression coefficient β̂_j depends on all the other regressor variables x_i (i ≠ j) that are in the model.

To illustrate the use of this test, consider the data in Example 15-1, and suppose that we want to test

H₀: β₂ = 0,
H₁: β₂ ≠ 0.

The main diagonal element of (X'X)⁻¹ corresponding to β̂₂ is C₂₂ = 30.096, so the t statistic in Equation 15-32 is

t₀ = 34.89/√((2.247)(30.096)) = 4.24.

Since t_{0.025,17} = 2.110, we reject H₀: β₂ = 0 and conclude that the variable x₂ (density) contributes significantly to the model. Note that this test measures the marginal or partial contribution of x₂ given that x₁ is in the model.

We may also examine the contribution to the regression sum of squares of a variable, say x_j, given that other variables x_i (i ≠ j) are included in the model. The procedure used to do this is called the general regression significance test, or the "extra sum of squares" method. This procedure can also be used to investigate the contribution of a subset of the regressor variables to the model. Consider the regression model with k regressor variables

y = Xβ + ε,

where y is (n × 1), X is (n × p), β is (p × 1), ε is (n × 1), and p = k + 1. We would like to determine whether the subset of regressor variables x₁, x₂, ..., x_r (r < k) contributes
significantly to the regression model. Let the vector of regression coefficients be partitioned as follows:

β = [β₁', β₂']',

where β₁ is (r × 1) and β₂ is [(p − r) × 1]. We wish to test the hypotheses

H₀: β₁ = 0,
H₁: β₁ ≠ 0.   (15-33)

The model may be written

y = Xβ + ε = X₁β₁ + X₂β₂ + ε,   (15-34)

where X₁ represents the columns of X associated with β₁ and X₂ represents the columns of X associated with β₂. For the full model (including both β₁ and β₂), we know that β̂ = (X'X)⁻¹X'y. Also, the regression sum of squares for all variables including the intercept is

SS_R(β) = β̂'X'y   (p degrees of freedom),

and

MS_E = (y'y − β̂'X'y)/(n − p).

SS_R(β) is called the regression sum of squares due to β. To find the contribution of the terms in β₁ to the regression, fit the model assuming the null hypothesis H₀: β₁ = 0 to be true. The reduced model is found from Equation 15-34 to be

y = X₂β₂ + ε.   (15-35)

The least-squares estimator of β₂ is β̂₂ = (X₂'X₂)⁻¹X₂'y, and

SS_R(β₂) = β̂₂'X₂'y   (p − r degrees of freedom).   (15-36)

The regression sum of squares due to β₁ given that β₂ is already in the model is

SS_R(β₁|β₂) = SS_R(β) − SS_R(β₂).   (15-37)

This sum of squares has r degrees of freedom. It is sometimes called the "extra sum of squares" due to β₁. Note that SS_R(β₁|β₂) is the increase in the regression sum of squares due to including the variables x₁, x₂, ..., x_r in the model. Now SS_R(β₁|β₂) is independent of MS_E, and the null hypothesis β₁ = 0 may be tested by the statistic

F₀ = [SS_R(β₁|β₂)/r]/MS_E.   (15-38)

If F₀ > F_{α,r,n−p}, we reject H₀, concluding that at least one of the parameters in β₁ is not zero and, consequently, at least one of the variables x₁, x₂, ..., x_r in X₁ contributes significantly to the regression model. Some authors call the test in Equation 15-38 a partial F-test.

The partial F-test is very useful. We can use it to measure the contribution of x_j as if it were the last variable added to the model by computing

SS_R(β_j|β₀, β₁, ..., β_{j−1}, β_{j+1}, ..., βₖ).
This is the increase in the regression sum of squares due to adding x_j to a model that already includes x₁, ..., x_{j−1}, x_{j+1}, ..., xₖ. Note that the partial F-test on a single variable x_j is equivalent to the t-test in Equation 15-32. However, the partial F-test is a more general procedure, in that we can measure the effect of sets of variables. In Section 15-11 we will show how the partial F-test plays a major role in model building, that is, in searching for the best set of regressor variables to use in the model.
Example 15-7
Consider the damaged peaches data in Example 15-1. We will investigate the contribution of the variable x₂ (density) to the model. That is, we wish to test

H₀: β₂ = 0,
H₁: β₂ ≠ 0.

To test this hypothesis, we need the extra sum of squares due to β₂, or

SS_R(β₂|β₁, β₀) = SS_R(β₁, β₂, β₀) − SS_R(β₁, β₀)
                = SS_R(β₁, β₂|β₀) − SS_R(β₁|β₀).

In Example 15-6 we calculated

SS_R(β₁, β₂|β₀) = 136.88   (2 degrees of freedom),

and if the model y = β₀ + β₁x₁ + ε is fit, we have

SS_R(β₁|β₀) = β̂₁S₁y = 96.21   (1 degree of freedom).

Therefore, we have

SS_R(β₂|β₁, β₀) = 136.88 − 96.21 = 40.67   (1 degree of freedom).

This is the increase in the regression sum of squares attributable to adding x₂ to a model already containing x₁. To test H₀: β₂ = 0, form the test statistic

F₀ = [SS_R(β₂|β₁, β₀)/1]/MS_E = 40.67/2.247 = 18.10.

Note that the MS_E from the full model, using both x₁ and x₂, is used in the denominator of the test statistic. Since F_{0.05,1,17} = 4.45, we reject H₀: β₂ = 0 and conclude that density (x₂) contributes significantly to the model. Since this partial F-test involves a single variable, it is equivalent to the t-test. To see this, recall that the t-test on H₀: β₂ = 0 resulted in the test statistic t₀ = 4.24. Furthermore, recall that the square of a t random variable with ν degrees of freedom is an F random variable with one and ν degrees of freedom, and we note that (4.24)² = 17.98 ≈ F₀.
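The extra-sum-of-squares arithmetic in this example, and the F₀ = t₀² relationship, can be verified in a couple of lines (numbers as quoted in the text):

```python
# Extra sum of squares for beta_2 (Example 15-7).
SSR_full = 136.88   # SS_R(beta1, beta2 | beta0), 2 df
SSR_x1 = 96.21      # SS_R(beta1 | beta0), 1 df
MSE = 2.247         # mean square error from the full model

extra = SSR_full - SSR_x1   # SS_R(beta2 | beta1, beta0) = 40.67
F0 = (extra / 1) / MSE      # partial F statistic, Equation 15-38
t0 = 4.24                   # t statistic for beta_2 from the text
print(round(extra, 2), round(F0, 2), round(t0**2, 2))  # 40.67 18.1 17.98
```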
15-6 MEASURES OF MODEL ADEQUACY

A number of techniques can be used to measure the adequacy of a multiple regression model. This section will present several of these techniques. Model validation is an important part of the multiple regression model-building process. A good paper on this subject is Snee (1977) (see also Montgomery, Peck, and Vining, 2001).
15-6.1 The Coefficient of Multiple Determination

The coefficient of multiple determination R² is defined as

R² = SS_R/S_yy = 1 − SS_E/S_yy.   (15-39)

R² is a measure of the amount of reduction in the variability of y obtained by using the regressor variables x₁, x₂, ..., xₖ. As in the simple linear regression case, we must have 0 ≤ R² ≤ 1. However, as before, a large value of R² does not necessarily imply that the regression model is a good one. Adding a variable to the model will always increase R², regardless of whether the additional variable is statistically significant or not. Thus it is possible for models that have large values of R² to yield poor predictions of new observations or estimates of the mean response.

The positive square root of R² is the multiple correlation coefficient between y and the set of regressor variables x₁, x₂, ..., xₖ. That is, R is a measure of the linear association between y and x₁, x₂, ..., xₖ. When k = 1, this becomes the simple correlation between y and x.
Example 15-8
The coefficient of multiple determination for the regression model estimated in Example 15-1 is

R² = SS_R/S_yy = 136.88/175.09 = 0.782.

That is, about 78.2% of the variability in damage y is explained when the two regressor variables, drop height (x₁) and fruit density (x₂), are used. A model relating y to x₁ only was developed earlier; the value of R² for that model turns out to be R² = 0.549. Therefore, adding the variable x₂ to the model has increased R² from 0.549 to 0.782.
Adjusted R²

Some practitioners prefer to use the adjusted coefficient of multiple determination, adjusted R², defined as

R²_adj = 1 − [SS_E/(n − p)] / [S_yy/(n − 1)].   (15-40)

The value S_yy/(n − 1) will be constant regardless of the number of variables in the model. SS_E/(n − p) is the mean square for error, which will change with the addition or removal of terms (new regressor variables, interaction terms, higher-order terms) from the model. Therefore, R²_adj will increase only if the addition of a new term significantly reduces the mean square for error. In other words, R²_adj will penalize the addition of terms to the model that are not significant in modeling the response. Interpretation of the adjusted coefficient of multiple determination is identical to that of R².
Example 15-9
We can calculate R²_adj for the model fit in Example 15-1. From Example 15-6, we found that SS_E = 38.21 and S_yy = 175.09. The estimate R²_adj is then

R²_adj = 1 − [38.21/(20 − 3)] / [175.09/(20 − 1)] = 1 − 2.247/9.215 = 0.756.

The adjusted R² will play a significant role in variable selection and model building later in this chapter.
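Both statistics can be computed from the two sums of squares alone; a sketch with the numbers from Examples 15-6 and 15-9:

```python
# R^2 (Equation 15-39) and adjusted R^2 (Equation 15-40) for Example 15-1.
SSE, Syy = 38.21, 175.09
n, p = 20, 3

R2 = 1 - SSE / Syy
R2_adj = 1 - (SSE / (n - p)) / (Syy / (n - 1))
print(round(R2, 3), round(R2_adj, 3))  # 0.782 0.756
```

As expected, the adjusted value is smaller, since it charges the model for its two regressors.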
15-6.2 Residual Analysis

The residuals from the estimated multiple regression model, defined by e_i = y_i − ŷ_i, play an important role in judging model adequacy, just as they do in simple linear regression. As noted in Section 14-5.1, there are several residual plots that are often useful. These are illustrated in Example 15-10. It is also helpful to plot the residuals against variables not presently in the model that are possible candidates for inclusion. Patterns in these plots, similar to those in Fig. 14-5, indicate that the model may be improved by adding the candidate variable.

Example 15-10
The residuals for the model estimated in Example 15-1 are shown in Table 15-3. These residuals are plotted on a normal probability plot in Fig. 15-2. No severe deviations from normality are obvious, although the smallest residual (−3.17) does not fall near the remaining residuals. The standardized residual, −3.17/√2.247 = −2.11, appears to be large and could indicate an unusual observation. The residuals are plotted against ŷ in Fig. 15-3, and against x₁ and x₂ in Figs. 15-4 and 15-5, respectively. In Fig. 15-4, there is some indication that the assumption of constant variance may not be satisfied. Removal of the unusual observation may improve the model fit, but there is no indication of error in data collection. Therefore, the point will be retained. We will see subsequently (Example 15-16) that two other regressor variables are required to adequately model these data.
Figure 15-2 Normal probability plot of residuals for Example 15-10.
Figure 15-3 Plot of residuals against ŷ for Example 15-10.

Figure 15-4 Plot of residuals against x₁ for Example 15-10.

Figure 15-5 Plot of residuals against x₂ for Example 15-10.
15-7 POLYNOMIAL REGRESSION

The linear model y = Xβ + ε is a general model that can be used to fit any relationship that is linear in the unknown parameters β. This includes the important class of polynomial regression models. For example, the second-degree polynomial in one variable,

y = β₀ + β₁x + β₁₁x² + ε,   (15-41)

and the second-degree polynomial in two variables,

y = β₀ + β₁x₁ + β₂x₂ + β₁₁x₁² + β₂₂x₂² + β₁₂x₁x₂ + ε,   (15-42)

are linear regression models. Polynomial regression models are widely used in cases where the response is curvilinear, because the general principles of multiple regression can be applied. The following example illustrates some of the types of analyses that can be performed.
Example 15-11
Sidewall panels for the interior of an airplane are formed in a 1500-ton press. The unit manufacturing cost varies with the production lot size. The data shown below give the average cost per unit (in hundreds of dollars) for this product (y) and the production lot size (x). The scatter diagram, shown in Fig. 15-6, indicates that a second-order polynomial may be appropriate.

y   1.81  1.70  1.65  1.55  1.48  1.40  1.30  1.26  1.24  1.21  1.20  1.18
x   20    25    30    35    40    50    60    65    70    75    80    90

We will fit the model

y = β₀ + β₁x + β₁₁x² + ε.
Figure 15-6 Data for Example 15-11.
The y vector, X matrix, and β vector are as follows:

y = [1.81, 1.70, 1.65, 1.55, 1.48, 1.40, 1.30, 1.26, 1.24, 1.21, 1.20, 1.18]',

X = [ 1  20   400
      1  25   625
      1  30   900
      1  35  1225
      1  40  1600
      1  50  2500
      1  60  3600
      1  65  4225
      1  70  4900
      1  75  5625
      1  80  6400
      1  90  8100 ],

β = [β₀, β₁, β₁₁]'.
Solving the normal equations X'Xβ̂ = X'y gives the fitted model

ŷ = 2.1983 − 0.0225x + 0.000125x².

The test for significance of regression is shown in Table 15-6. Since F₀ = 2171.07 is significant at 1%, we conclude that at least one of the parameters β₁ and β₁₁ is not zero. Furthermore, the standard tests for model adequacy reveal no unusual behavior.
In fitting polynomials, we generally like to use the lowest-degree model consistent with the data. In this example, it would seem logical to investigate dropping the quadratic term from the model. That is, we would like to test

H₀: β₁₁ = 0,
H₁: β₁₁ ≠ 0.

The general regression significance test can be used to test this hypothesis. We need to determine the "extra sum of squares" due to β₁₁, or

SS_R(β₁₁|β₁, β₀) = SS_R(β₁, β₁₁|β₀) − SS_R(β₁|β₀).
The sum of squares SS_R(β₁, β₁₁|β₀) = 0.5254, from Table 15-6. To find SS_R(β₁|β₀), we fit a simple linear regression model to the original data, yielding

ŷ = 1.9004 − 0.0091x.

It can be easily verified that the regression sum of squares for this model is

SS_R(β₁|β₀) = 0.4942.
Table 15-6 Test for Significance of Regression for the Second-Order Model in Example 15-11

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square    F₀
Regression             0.5254            2                     0.2627         2171.07
Error                  0.0011            9                     0.000121
Total                  0.5265            11
Table 15-7 Analysis of Variance of Example 15-11, Showing the Test for H₀: β₁₁ = 0

Source of Variation    Sum of Squares                Degrees of Freedom    Mean Square    F₀
Regression             SS_R(β₁, β₁₁|β₀) = 0.5254     2                     0.2627         2171.07
  Linear               SS_R(β₁|β₀) = 0.4942          1                     0.4942         4084.30
  Quadratic            SS_R(β₁₁|β₀, β₁) = 0.0312     1                     0.0312          257.85
Error                  0.0011                        9                     0.000121
Total                  0.5265                        11
Therefore, the extra sum of squares due to β₁₁, given that β₁ and β₀ are in the model, is

SS_R(β₁₁|β₁, β₀) = SS_R(β₁, β₁₁|β₀) − SS_R(β₁|β₀) = 0.5254 − 0.4942 = 0.0312.

The analysis of variance, with the test of H₀: β₁₁ = 0 incorporated into the procedure, is displayed in Table 15-7. Note that the quadratic term contributes significantly to the model.
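The fit in this example can be reproduced with an ordinary least-squares solve. The sketch below builds the model matrix for the quadratic and recovers coefficients close to the text's β̂₀ = 2.1983, β̂₁ = −0.0225, β̂₁₁ = 0.000125 (differences reflect the rounding of the printed values).

```python
import numpy as np

# Lot size x and average unit cost y from Example 15-11.
x = np.array([20, 25, 30, 35, 40, 50, 60, 65, 70, 75, 80, 90], dtype=float)
y = np.array([1.81, 1.70, 1.65, 1.55, 1.48, 1.40,
              1.30, 1.26, 1.24, 1.21, 1.20, 1.18])

# Model matrix for y = b0 + b1*x + b11*x^2 + error.
X = np.column_stack([np.ones_like(x), x, x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(round(beta[0], 4), round(beta[1], 4), round(beta[2], 6))
```

This is exactly the sense in which a polynomial model is still a *linear* regression model: it is linear in the coefficients, so the usual least-squares machinery applies unchanged.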
15-8 INDICATOR VARIABLES

The regression models presented in previous sections have been based on quantitative variables, that is, variables that are measured on a numerical scale. For example, variables such as temperature, pressure, distance, and age are quantitative variables. Occasionally, we need to incorporate qualitative variables in a regression model. For example, suppose that one of the variables in a regression model is the operator who is associated with each observation y_i. Assume that only two operators are involved. We may wish to assign different levels to the two operators to account for the possibility that each operator may have a different effect on the response.

The usual method of accounting for the different levels of a qualitative variable is to use indicator variables. For instance, to introduce the effect of two different operators into a regression model, we could define an indicator variable as follows:

x = 0 if the observation is from operator 1,
x = 1 if the observation is from operator 2.

In general, a qualitative variable with t levels is represented by t − 1 indicator variables, which are assigned values of either 0 or 1. Thus, if there were three operators, the different levels would be accounted for by two indicator variables defined as follows:

x₁    x₂
0     0     if the observation is from operator 1,
1     0     if the observation is from operator 2,
0     1     if the observation is from operator 3.

Indicator variables are also referred to as dummy variables. The following example illustrates some of the uses of indicator variables. For other applications, see Montgomery, Peck, and Vining (2001).
Example 15-12
(Adapted from Montgomery, Peck, and Vining, 2001.) A mechanical engineer is investigating the surface finish of metal parts produced on a lathe and its relationship to the speed (in RPM) of the lathe. The data are shown in Table 15-8. Note that the data have been collected using two different types of cutting tools. Since it is likely that the type of cutting tool affects the surface finish, we will fit the model

y = β₀ + β₁x₁ + β₂x₂ + ε,

where y is the surface finish, x₁ is the lathe speed in RPM, and x₂ is an indicator variable denoting the type of cutting tool used; that is,

x₂ = 0 for tool type 302,
x₂ = 1 for tool type 416.

The parameters in this model may be easily interpreted. If x₂ = 0, then the model becomes

y = β₀ + β₁x₁ + ε,

which is a straight-line model with slope β₁ and intercept β₀. However, if x₂ = 1, then the model becomes

y = β₀ + β₁x₁ + β₂(1) + ε = (β₀ + β₂) + β₁x₁ + ε,

which is a straight-line model with slope β₁ and intercept β₀ + β₂. Thus, the model y = β₀ + β₁x₁ + β₂x₂ + ε implies that surface finish is linearly related to lathe speed and that the slope β₁ does not depend on the type of cutting tool used. However, the type of cutting tool does affect the intercept, and β₂ indicates the change in the intercept associated with a change in tool type from 302 to 416.
Table 15-8 Surface Finish Data for Example 15-12 (observation number i, surface finish y_i, lathe speed x₁ in RPM, and tool type, for 20 observations).
The fitted model is

ŷ = 14.2762 + 0.1411x₁ − 13.2802x₂.

The analysis of variance for this model is shown in Table 15-9. Note that the hypothesis H₀: β₁ = β₂ = 0 (significance of regression) is rejected. This table also contains the sums of squares

SS_R = SS_R(β₁, β₂|β₀) = SS_R(β₁|β₀) + SS_R(β₂|β₁, β₀),

so a test of the hypothesis H₀: β₂ = 0 can be made. This hypothesis is also rejected, so we conclude that tool type has an effect on surface finish.
It is also possible to use indicator variables to investigate whether tool type affects both the slope and the intercept. Let the model be

y = β₀ + β₁x₁ + β₂x₂ + β₃x₁x₂ + ε,
Table 15-9 Analysis of Variance of Example 15-12

Source of Variation       Sum of Squares    Degrees of Freedom    Mean Square    F₀
Regression                1012.0595         2                     506.0297       1103.69ᵃ
  SS_R(β₁|β₀)             (130.6091)        (1)                   130.6091        284.87ᵃ
  SS_R(β₂|β₁, β₀)         (881.4504)        (1)                   881.4504       1922.52ᵃ
Error                     7.7943            17                    0.4585
Total                     1019.8538         19

ᵃSignificant at 1%.
where x₂ is the indicator variable. Now if tool type 302 is used, x₂ = 0, and the model is

y = β₀ + β₁x₁ + ε.

If tool type 416 is used, x₂ = 1, and the model becomes

y = β₀ + β₁x₁ + β₂ + β₃x₁ + ε = (β₀ + β₂) + (β₁ + β₃)x₁ + ε.

Note that β₂ is the change in the intercept and β₃ is the change in slope produced by a change in tool type.

Another method of analyzing this data set is to fit separate regression models to the data for each tool type. However, the indicator variable approach has several advantages. First, only one regression model must be estimated. Second, by pooling the data on both tool types, more degrees of freedom for error are obtained. Third, tests of both hypotheses on the parameters β₂ and β₃ are just special cases of the general regression significance test.
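A small sketch of the indicator-variable idea: with x₂ coded 0/1, the two tool types share the slope in x₁ while β₂ shifts the intercept. The coefficients below are the fitted values quoted in the text for Example 15-12; the speed of 500 RPM is an arbitrary illustrative value, not a data point from the example.

```python
import numpy as np

def design_row(speed, tool_type):
    """Row [1, x1, x2] with x2 = 0 for tool type 302 and 1 for type 416."""
    return np.array([1.0, float(speed), 1.0 if tool_type == 416 else 0.0])

beta = np.array([14.2762, 0.1411, -13.2802])  # fitted coefficients from the text

# Same speed, different tools: the predictions differ by exactly -beta_2.
diff = design_row(500, 302) @ beta - design_row(500, 416) @ beta
print(round(diff, 4))  # 13.2802
```

Because the slope is common, this difference is the same at every speed, which is precisely the "parallel lines" structure of the model without the x₁x₂ interaction term.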
15-9 THE CORRELATION MATRIX

Suppose we wish to estimate the parameters in the model

y_i = β₀ + β₁x_i1 + β₂x_i2 + ε_i,  i = 1, 2, ..., n.   (15-43)

We may rewrite this model with a transformed intercept β₀* as

y_i = β₀* + β₁(x_i1 − x̄₁) + β₂(x_i2 − x̄₂) + ε_i,   (15-44)

or, since β₀* = ȳ,

y_i − ȳ = β₁(x_i1 − x̄₁) + β₂(x_i2 − x̄₂) + ε_i.   (15-45)

The X'X matrix for this model is

X'X = [ S₁₁  S₁₂
        S₂₁  S₂₂ ],   (15-46)

where

S_kj = Σ_{i=1}^n (x_ik − x̄_k)(x_ij − x̄_j),  k, j = 1, 2.   (15-47)

It is possible to express this X'X matrix in correlation form. Let

r_kj = S_kj/(S_kk S_jj)^{1/2},  k, j = 1, 2,   (15-48)

and note that r₁₁ = r₂₂ = 1. Then the correlation form of the X'X matrix, Equation 15-46, is

R = [ 1    r₁₂
      r₁₂  1  ].   (15-49)

The quantity r₁₂ is the sample correlation between x₁ and x₂. We may also define the sample correlation between x_j and y as

r_jy = S_jy/(S_jj S_yy)^{1/2},  j = 1, 2,   (15-50)
where

S_jy = Σ_{i=1}^n (x_ij − x̄_j)(y_i − ȳ),  j = 1, 2,   (15-51)

is the corrected sum of cross products between x_j and y, and S_yy is the usual total corrected sum of squares of y.

These transformations result in a new regression model,

y_i* = b₁z_i1 + b₂z_i2 + ε_i*,   (15-52)

where

y_i* = (y_i − ȳ)/S_yy^{1/2},  z_ij = (x_ij − x̄_j)/S_jj^{1/2},  j = 1, 2.

The relationship between the parameters b₁ and b₂ in the new model, Equation 15-52, and the parameters β₀, β₁, and β₂ in the original model, Equation 15-43, is as follows:

b₁ = β₁(S₁₁/S_yy)^{1/2},   (15-53)

b₂ = β₂(S₂₂/S_yy)^{1/2},   (15-54)

β₀ = ȳ − β₁x̄₁ − β₂x̄₂.   (15-55)

The least-squares normal equations for the transformed model, Equation 15-52, are

[ 1    r₁₂ ] [ b̂₁ ]   [ r₁y ]
[ r₁₂  1   ] [ b̂₂ ] = [ r₂y ].   (15-56)

The solution to Equation 15-56 is

b̂₁ = (r₁y − r₁₂r₂y)/(1 − r₁₂²),   (15-57a)

b̂₂ = (r₂y − r₁₂r₁y)/(1 − r₁₂²).   (15-57b)

The regression coefficients in Equations 15-57 are usually called standardized regression coefficients. Many multiple regression computer programs use this transformation to reduce round-off errors in the (X'X)⁻¹ matrix. These round-off errors may be very serious if the
original variables differ considerably in magnitude. Some of these computer programs also display both the original regression coefficients and the standardized coefficients. The standardized regression coefficients are dimensionless, and this may make it easier to compare regression coefficients in situations where the original variables x_j differ considerably in their units of measurement. In interpreting these standardized regression coefficients, however, we must remember that they are still partial regression coefficients (i.e., b_j shows the effect of z_j given that the other z_i, i ≠ j, are in the model). Furthermore, the b_j are affected by the spacing of the levels of the x_j. Consequently, we should not use the magnitude of the b_j as a measure of the importance of the regressor variables.

While we have explicitly treated only the case of two regressor variables, the results generalize. If there are k regressor variables x₁, x₂, ..., xₖ, one may write the X'X matrix in correlation form,
R = [ 1    r₁₂  r₁₃  ...  r₁ₖ
      r₁₂  1    r₂₃  ...  r₂ₖ
      r₁₃  r₂₃  1    ...  r₃ₖ
      ...
      r₁ₖ  r₂ₖ  r₃ₖ  ...  1  ],   (15-58)

where r_ij = S_ij/(S_ii S_jj)^{1/2} is the sample correlation between x_i and x_j, and S_ij = Σ_{u=1}^n (x_ui − x̄_i)(x_uj − x̄_j). The correlations between x_j and y are

g = [ r₁y
      r₂y
      ...
      r_ky ],   (15-59)

where S_jy = Σ_{u=1}^n (x_uj − x̄_j)(y_u − ȳ). The vector of standardized regression coefficients b̂' = [b̂₁, b̂₂, ..., b̂ₖ] is

b̂ = R⁻¹g.   (15-60)

The relationship between the standardized regression coefficients and the original regression coefficients is

β̂_j = b̂_j(S_yy/S_jj)^{1/2},  j = 1, 2, ..., k.   (15-61)
Example 15-13
For the data in Example 15-1, we find

S_yy = 175.089,  S₁₁ = 184710.16,  S₁y = 4215.372,
S₂₂ = 0.047755,  S₂y = 2.33,  S₁₂ = 51.2873.

Therefore,

r₁₂ = S₁₂/(S₁₁S₂₂)^{1/2} = 51.2873/√((184710.16)(0.047755)) = 0.5460,

r₁y = S₁y/(S₁₁S_yy)^{1/2} = 4215.372/√((184710.16)(175.089)) = 0.7412,

r₂y = S₂y/(S₂₂S_yy)^{1/2} = 2.33/√((0.047755)(175.089)) = 0.8060,

and the correlation matrix for this problem is

R = [ 1       0.5460
      0.5460  1      ].

From Equation 15-56, the normal equations in terms of the standardized regression coefficients are

[ 1       0.5460 ] [ b̂₁ ]   [ 0.7412 ]
[ 0.5460  1      ] [ b̂₂ ] = [ 0.8060 ],

so that b̂₁ = 0.4290 and b̂₂ = 0.5718. These standardized regression coefficients could also have been computed directly from either Equation 15-57 or Equation 15-61. Note that although b̂₂ > b̂₁, we should be cautious about concluding that the fruit density (x₂) is more important than drop height (x₁), since b̂₁ and b̂₂ are still partial regression coefficients.
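Solving the 2×2 normal equations of Example 15-13 takes only Equations 15-57a and 15-57b; a quick check with the correlations computed above:

```python
# Standardized regression coefficients for Example 15-13.
r12, r1y, r2y = 0.5460, 0.7412, 0.8060

d = 1 - r12**2
b1 = (r1y - r12 * r2y) / d   # Equation 15-57a
b2 = (r2y - r12 * r1y) / d   # Equation 15-57b
print(round(b1, 4), round(b2, 4))  # 0.429 0.5718
```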
15-10 PROBLEMS IN MULTIPLE REGRESSION There are a number of problems often eneountered in the use of multiple regression. In this section, we briefly discuss three of these problem areas: the effect of multicollinearity on the regression model, the effect of outlying points in the x-space on the regression coefficients,. and autocorrelation in the errors.
15-10.1 Multicollinearity

In most multiple regression problems, the independent or regressor variables x_j are intercorrelated. In situations in which this intercorrelation is very large, we say that multicollinearity exists. Multicollinearity can have serious effects on the estimates of the regression coefficients and on the general applicability of the estimated model.

The effects of multicollinearity may be easily demonstrated. Consider a regression model with two regressor variables x1 and x2, and suppose that x1 and x2 have been "standardized" as in Section 15-9, so that the X'X matrix is in correlation form. The inverse of X'X is then

    C = (X'X)^(-1) = [  1/(1 - r12²)      -r12/(1 - r12²) ]
                     [ -r12/(1 - r12²)     1/(1 - r12²)   ],

and the estimators of the regression coefficients are

    β̂1 = (x1'y - r12 x2'y)/(1 - r12²),    β̂2 = (x2'y - r12 x1'y)/(1 - r12²),

where r12 is the sample correlation between x1 and x2, and x1'y and x2'y are the elements of the X'y vector.

Now, if multicollinearity is present, x1 and x2 are highly correlated, and |r12| → 1. In such a situation, the variances and covariances of the regression coefficients become very large, since V(β̂_j) = C_jj σ² → ∞ as |r12| → 1, and Cov(β̂1, β̂2) = C12 σ² → ±∞ depending on whether r12 → ±1. The large variances for β̂_j imply that the regression coefficients are very poorly estimated. Note that the effect of multicollinearity is to introduce a "near" linear dependency in the columns of the X matrix. As r12 → ±1, this linear dependency becomes exact. Furthermore, if we assume that x1'y → x2'y as |r12| → 1, then the estimates of the regression coefficients become equal in magnitude but opposite in sign; that is, β̂1 = -β̂2, regardless of the true values of β1 and β2.

Similar problems occur when multicollinearity is present and there are more than two regressor variables. In general, the diagonal elements of the matrix C = (X'X)^(-1) can be written

    C_jj = 1/(1 - R_j²),    j = 1, 2, ..., k,    (15-62)
where R_j² is the coefficient of multiple determination resulting from regressing x_j on the other k - 1 regressor variables. Clearly, the stronger the linear dependency of x_j on the remaining regressor variables (and hence the stronger the multicollinearity), the larger the value of C_jj will be. We say that the variance of β̂_j is "inflated" by the quantity (1 - R_j²)^(-1). Consequently, we usually call

    VIF_j = 1/(1 - R_j²),    j = 1, 2, ..., k,    (15-63)
the variance inflation factor for β̂_j. Note that these factors are the main diagonal elements of the inverse of the correlation matrix. They are an important measure of the extent to which multicollinearity is present.

Although the estimates of the regression coefficients are very imprecise when multicollinearity is present, the estimated equation may still be useful. For example, suppose we wish to predict new observations. If these predictions are required in the region of the x-space where the multicollinearity is in effect, then often satisfactory results will be obtained, because while the individual β_j may be poorly estimated, the function Σ_{j=1}^{k} β_j x_ij may be estimated quite well. On the other hand, if the prediction of new observations requires extrapolation, then generally we would expect to obtain poor results. Extrapolation usually requires good estimates of the individual model parameters.
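The variance inflation factors of equation 15-63 are easy to compute directly as the diagonal of the inverse correlation matrix. A minimal numpy sketch, using hypothetical data in which one regressor is deliberately made nearly collinear with another (the variable names and constants are illustrative assumptions, not from the text):

```python
import numpy as np

# Hypothetical regressor data: x2 is nearly a linear function of x1,
# so we expect large variance inflation factors for both.
rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

# Standardize each column to unit length so X'X is in correlation form
# (the "standardization" of Section 15-9).
Xc = X - X.mean(axis=0)
Z = Xc / np.sqrt((Xc ** 2).sum(axis=0))

# Equation 15-63: the VIFs are the main diagonal elements of the
# inverse of the correlation matrix Z'Z.
corr = Z.T @ Z
vif = np.diag(np.linalg.inv(corr))
print(vif)
```

Here the VIFs for x1 and x2 come out very large, while the VIF for the independent x3 stays near 1, matching the interpretation above.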
466
Chapter 15 Multiple Regression

Multicollinearity arises for several reasons. It will occur when the analyst collects the data such that a constraint of the form Σ_{j=1}^{k} a_j x_j = 0 holds among the columns of the X matrix (the a_j are constants, not all zero). For example, if four regressor variables are the components of a mixture, then such a constraint will always exist because the sum of the components is always constant. Usually, these constraints do not hold exactly, and the analyst does not know that they exist.

There are several ways to detect the presence of multicollinearity. Some of the more important of these are briefly discussed.

1. The variance inflation factors, defined in equation 15-63, are very useful measures of multicollinearity. The larger the variance inflation factor, the more severe the multicollinearity. Some authors have suggested that if any variance inflation factor exceeds 10, then multicollinearity is a problem. Other authors consider this value too
liberal and suggest that the variance inflation factors should not exceed 4 or 5.

2. The determinant of the correlation matrix may also be used as a measure of multicollinearity. The value of this determinant can range between 0 and 1. When the value of the determinant is 1, the columns of the X matrix are orthogonal (i.e., there is no intercorrelation between the regressor variables), and when the value is 0, there is an exact linear dependency among the columns of X. The smaller the value of the determinant, the greater the degree of multicollinearity.

3. The eigenvalues or characteristic roots of the correlation matrix provide a measure of multicollinearity. If X'X is in correlation form, then the eigenvalues of X'X are the roots of the equation

    |X'X - λI| = 0.

One or more eigenvalues near zero implies that multicollinearity is present. If λ_max and λ_min denote the largest and smallest eigenvalues of X'X, then the ratio λ_max/λ_min can also be used as a measure of multicollinearity. The larger the value of this ratio, the greater the degree of multicollinearity. Generally, if the ratio λ_max/λ_min is less than 10, there is little problem with multicollinearity.

4. Sometimes inspection of the individual elements of the correlation matrix can be helpful in detecting multicollinearity. If an element |r_ij| is close to 1, then x_i and x_j may be strongly multicollinear. However, when more than two regressor variables are involved in a multicollinear fashion, the individual r_ij are not necessarily large. Thus, this method will not always enable us to detect the presence of multicollinearity.

5. If the F-test for significance of regression is significant but tests on the individual regression coefficients are not significant, then multicollinearity may be present.

Several remedial measures have been proposed for dealing with multicollinearity. One of the most popular is ridge regression, an
15·] 0 Problems in Multiple Regression
467
alternative to ordinary least squares. In ridge regression, the parameter estimates are obtained by solving

    β̂*(l) = (X'X + lI)^(-1) X'y,    (15-64)

where l > 0 is a constant. Generally, values of l in the interval 0 ≤ l ≤ 1 are appropriate. The ridge estimator β̂*(l) is not an unbiased estimator of β, as is the ordinary least-squares estimator β̂, but the mean square error of β̂*(l) will be smaller than the mean square error of β̂. Thus ridge regression seeks to find a set of regression coefficients that is more "stable," in the sense of having a small mean square error. Since multicollinearity usually results in ordinary least-squares estimators that may have extremely large variances, ridge regression is suitable for situations where the multicollinearity problem exists.

To obtain the ridge regression estimator from equation 15-64, one must specify a value for the constant l. Of course, there is an "optimum" l for any problem, but the simplest approach is to solve equation 15-64 for several values of l in the interval 0 ≤ l ≤ 1. Then a plot of the values of β̂*(l) against l is constructed. This display is called the ridge trace. The appropriate value of l is chosen subjectively by inspection of the ridge trace. Typically, a value for l is chosen such that relatively stable parameter estimates are obtained. In general, the variance of β̂*(l) is a decreasing function of l, while the squared bias [β - β̂*(l)]² is an increasing function of l. Choosing the value of l involves trading off these two properties of β̂*(l).

A good discussion of the practical use of ridge regression is in Marquardt and Snee (1975). Also, there are several other biased estimation techniques that have been proposed for dealing with multicollinearity. Several of these are discussed in Montgomery, Peck, and Vining (2001).
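Equation 15-64 and the ridge-trace idea can be sketched in a few lines of numpy. The data below are hypothetical (two nearly collinear regressors), and the grid of l values is an arbitrary illustration; `ell` stands for the ridge constant l:

```python
import numpy as np

# Hypothetical standardized data with induced multicollinearity.
rng = np.random.default_rng(7)
n, k = 30, 3
X = rng.normal(size=(n, k))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)   # make columns 0 and 1 nearly collinear
Xc = X - X.mean(axis=0)
X = Xc / np.sqrt((Xc ** 2).sum(axis=0))          # correlation form
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

def ridge(X, y, ell):
    """Equation 15-64: beta*(l) = (X'X + lI)^(-1) X'y."""
    k = X.shape[1]
    return np.linalg.solve(X.T @ X + ell * np.eye(k), X.T @ y)

# Solve for several values of l; plotting these against l gives the ridge trace.
trace = {ell: ridge(X, y, ell) for ell in [0.0, 0.01, 0.05, 0.1, 0.5, 1.0]}
```

At l = 0 the ridge solution reduces to ordinary least squares, and as l grows the coefficient vector shrinks toward zero, which is the stabilizing behavior the ridge trace displays.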
Example 15-14

(Based on an example in Hald, 1952.) The heat generated in calories per gram for a particular type of cement as a function of the quantities of four additives (z1, z2, z3, and z4) is shown in Table 15-10. We wish to fit a multiple linear regression model to these data.
Table 15-10 Data for Example 15-14
The data will be coded by defining a new set of regressor variables,

    x_ij = (z_ij - z̄_j)/√S_jj,    j = 1, 2, 3, 4,    i = 1, 2, ..., 15,

where S_jj = Σ_{i=1}^{15} (z_ij - z̄_j)² is the corrected sum of squares of the levels of z_j. The coded data are shown in Table 15-11. This transformation makes the intercept orthogonal to the other regression coefficients, since the first column of the X matrix consists of ones. Therefore, the intercept in this model will always be estimated by ȳ. The (4 × 4) X'X matrix for the four coded variables is the correlation matrix
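The coding above can be verified numerically: after the transformation every coded column sums to zero (so the column of ones for the intercept is orthogonal to the regressors) and X'X has ones on its diagonal. A sketch on hypothetical additive quantities (the Z values here are random stand-ins, not the Table 15-10 data):

```python
import numpy as np

# Hypothetical levels of four additives over 15 observations.
rng = np.random.default_rng(3)
Z = rng.uniform(1, 10, size=(15, 4))

# Coding: x_ij = (z_ij - zbar_j) / sqrt(S_jj).
S = ((Z - Z.mean(axis=0)) ** 2).sum(axis=0)   # corrected sums of squares S_jj
Xcoded = (Z - Z.mean(axis=0)) / np.sqrt(S)

print(Xcoded.sum(axis=0))    # each column sums to ~0: orthogonal to the intercept
print(Xcoded.T @ Xcoded)     # correlation matrix: unit diagonal
```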
          [ 1.00000  0.84894  0.91412  0.93367 ]
    X'X = [ 0.84894  1.00000  0.76899  0.97567 ]
          [ 0.91412  0.76899  1.00000  0.86784 ]
          [ 0.93367  0.97567  0.86784  1.00000 ].

This matrix contains several large correlation coefficients, and this may indicate significant multicollinearity. The inverse of X'X is
                 [  20.769    25.813    -0.608   -44.042 ]
    (X'X)^(-1) = [  25.813    74.486    12.597  -107.710 ]
                 [  -0.608    12.597     8.274   -18.903 ]
                 [ -44.042  -107.710   -18.903   163.620 ].
The variance inflation factors are the main diagonal elements of this matrix. Note that three of the variance inflation factors exceed 10, a good indication that multicollinearity is present. The eigenvalues of X'X are λ1 = 3.657, λ2 = 0.2679, λ3 = 0.07127, and λ4 = 0.004014. Two of the eigenvalues, λ3 and λ4, are relatively close to zero. Also, the ratio of the largest to the smallest eigenvalue is
    λ_max/λ_min = 3.657/0.004014 = 911.06,

which is considerably larger than 10. Therefore, since examination of the variance inflation factors and the eigenvalues indicates potential problems with multicollinearity, we will use ridge regression to estimate the model parameters.
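These diagnostics can be reproduced from the printed correlation matrix alone; a sketch (the matrix entries are taken as printed above, so small rounding differences from the book's values are to be expected):

```python
import numpy as np

# X'X correlation matrix for the coded data of Example 15-14, as printed.
XtX = np.array([
    [1.00000, 0.84894, 0.91412, 0.93367],
    [0.84894, 1.00000, 0.76899, 0.97567],
    [0.91412, 0.76899, 1.00000, 0.86784],
    [0.93367, 0.97567, 0.86784, 1.00000],
])

vif = np.diag(np.linalg.inv(XtX))   # variance inflation factors (diagonal of inverse)
eig = np.linalg.eigvalsh(XtX)       # eigenvalues of the correlation matrix
ratio = eig.max() / eig.min()       # lambda_max / lambda_min
print(vif, eig, ratio)
```

The eigenvalues sum to 4 (the trace of a 4 × 4 correlation matrix), and the ratio λ_max/λ_min comes out near the 911.06 quoted above.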
Table 15-11 Coded Data for Example 15-14

    Observation Number      y
            1             28.25
            2             24.80
            3             11.86
            4             36.60
            5             15.80
            6             16.23
            7             29.50
            8             28.75
            9             43.20
We solved equation 15-64 for various values of l, and the results are summarized in Table 15-12. The ridge trace is shown in Fig. 15-7. The instability of the least-squares estimates β̂*(l = 0) is evident from inspection of the ridge trace. It is often difficult to choose a value of l from the ridge trace that simultaneously stabilizes the estimates of all regression coefficients. We will choose l = 0.064, which implies that the regression model is

    ŷ = 25.53 - 18.0566x1 + 17.2202x2 + 30.0743x3 - 4.7242x4,

using β̂*0 = ȳ = 25.53. Converting the model to the original variables z_j, we have

    ŷ = 2.9913 - 0.8920z1 + 0.3483z2 + 3.3209z3 - 0.0623z4.
Table 15-12 Ridge Regression Estimates for Example 15-14
15-10.2 Influential Observations in Regression

When using multiple regression we occasionally find that some small subset of the observations is unusually influential. Sometimes these influential observations are relatively far away from the vicinity where the rest of the data were collected. A hypothetical situation for two variables is depicted in Fig. 15-8, where one observation in x-space is remote from the rest of the data. The disposition of points in the x-space is important in determining the properties of the model. For example, the point (x_i1, x_i2) in Fig. 15-8 may be very influential in determining the estimates of the regression coefficients, the value of R², and the value of MS_E.

We would like to examine the data points used to build a regression model to determine if they control many model properties. If these influential points are "bad" points, or are erroneous in any way, then they should be eliminated. On the other hand, there may be nothing wrong with these points, but at least we would like to determine whether or not they produce results consistent with the rest of the data. In any event, even if an influential point is a valid one, if it controls important model properties we would like to know this, since it could have an impact on the use of the model.

Montgomery, Peck, and Vining (2001) describe several methods for detecting influential observations. An excellent diagnostic is the Cook (1977, 1979) distance measure. This is a measure of the squared distance between the least-squares estimate of β based on all n observations, β̂, and the estimate β̂_(i) based on removal of the ith point. The Cook distance measure is
    D_i = (β̂_(i) - β̂)' X'X (β̂_(i) - β̂) / (p MS_E),    i = 1, 2, ..., n.

Clearly, if the ith point is influential, its removal will result in β̂_(i) changing considerably from the value β̂. Thus a large value of D_i implies that the ith point is influential. The statistic D_i is actually computed using

    D_i = (r_i²/p) · h_ii/(1 - h_ii),    i = 1, 2, ..., n,    (15-65)

where

    r_i = e_i / √(MS_E (1 - h_ii))

and h_ii is the ith diagonal element of the matrix

    H = X(X'X)^(-1) X'.
Figure 15-8 A point that is remote in x-space (a marked region contains all observations except the ith).
The H matrix is sometimes called the "hat" matrix, since

    ŷ = Xβ̂ = X(X'X)^(-1) X'y = Hy.

Thus H is a projection matrix that transforms the observed values of y into a set of fitted values ŷ. From equation 15-65 we note that D_i is made up of a component that reflects how well the model fits the ith observation y_i [the quantity e_i/√(MS_E(1 - h_ii)) is called a Studentized residual, and it is a method of scaling residuals so that they have unit variance] and a component that measures how far that point is from the rest of the data [h_ii/(1 - h_ii) is the distance of the ith point from the centroid of the remaining n - 1 points]. A value of D_i > 1 would indicate that the point is influential. Either component of D_i (or both) may contribute to a large value.
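The computation of equation 15-65 from the hat matrix can be sketched directly. The data here are hypothetical, with one point deliberately planted far out in x-space so that its leverage h_ii is large:

```python
import numpy as np

# Hypothetical simple-regression data with one remote point in x-space.
rng = np.random.default_rng(5)
n = 25
x = rng.normal(size=n)
x[0] = 8.0                                  # remote point in x-space
X = np.column_stack([np.ones(n), x])        # intercept + one regressor: p = 2
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

p = X.shape[1]
H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix H = X(X'X)^-1 X'
h = np.diag(H)                              # leverages h_ii
e = y - H @ y                               # residuals
mse = (e @ e) / (n - p)
r = e / np.sqrt(mse * (1.0 - h))            # Studentized residuals
D = (r ** 2 / p) * h / (1.0 - h)            # Cook's distance, equation 15-65
print(D[0], h[0])
```

The planted point has by far the largest leverage, illustrating the "distance from the centroid" component of D_i even before the residual component is considered.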
Example 15-15

Table 15-13 lists the values of D_i for the damaged peaches data in Example 15-1.

Table 15-13 Influence Diagnostics for the Damaged Peaches Data in Example 15-1: Cook's Distance Measure

The values in Table 15-13 were calculated using Minitab. The Cook distance measure D_i does not identify any potentially influential observations in the data, as no value of D_i exceeds unity.
15-10.3 Autocorrelation

The regression models developed thus far have assumed that the model error components ε_t are uncorrelated random variables. Many applications of regression analysis involve data for which this assumption may be inappropriate. In regression problems where the dependent and independent variables are time oriented or are time-series data, the assumption of uncorrelated errors is often untenable. For example, suppose we regressed the quarterly sales of a product against the quarterly point-of-sale advertising expenditures. Both variables are time series, and if they are positively correlated with other factors, such as disposable income and population, which are not included in the model, then it is likely that the error terms in the regression model are positively correlated over time. Variables that exhibit correlation over time are referred to as autocorrelated variables. Many regression problems in economics, business, and agriculture involve autocorrelated errors.

The occurrence of positively autocorrelated errors has several potentially serious consequences. The ordinary least-squares estimators of the parameters are affected in that they are no longer minimum variance estimators, although they are still unbiased. Furthermore, the mean square error MS_E may underestimate the error variance σ². Also, confidence intervals and tests of hypotheses, which are developed assuming uncorrelated errors, are not valid if autocorrelation is present.

There are several statistical procedures that can be used to determine whether the error terms in the model are uncorrelated. We will describe one of these, the Durbin-Watson test. This test assumes that the data are generated by the first-order autoregressive model

    y_t = β0 + β1 x_t + ε_t,    t = 1, 2, ..., n,    (15-66)

where t is the index of time and the error terms are generated according to the process

    ε_t = ρ ε_{t-1} + δ_t,    (15-67)

where |ρ| < 1 is an unknown parameter and δ_t is a NID(0, σ_δ²) random variable. We wish to test the hypotheses
    H0: ρ = 0,
    H1: ρ > 0.    (15-68)

Note that if H0: ρ = 0 is not rejected, we are implying that there is no autocorrelation in the errors, and the ordinary linear regression model is appropriate.
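The model of equations 15-66 and 15-67 is easy to simulate, which is a useful way to see what positively autocorrelated errors look like. The parameter values below are hypothetical choices for illustration:

```python
import numpy as np

# Simulate y_t = beta0 + beta1*x_t + eps_t with AR(1) errors
# eps_t = rho*eps_{t-1} + delta_t (equations 15-66 and 15-67).
rng = np.random.default_rng(11)
n, rho, beta0, beta1 = 200, 0.8, 1.0, 0.5
x = np.linspace(0, 10, n)

eps = np.zeros(n)
delta = rng.normal(scale=1.0, size=n)       # delta_t ~ NID(0, sigma_delta^2)
for t in range(1, n):
    eps[t] = rho * eps[t - 1] + delta[t]    # equation 15-67

y = beta0 + beta1 * x + eps                 # equation 15-66
```

The sample lag-one correlation of the simulated errors lands near the chosen ρ = 0.8, which is exactly the structure the Durbin-Watson test is designed to detect.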
To test H0: ρ = 0, first fit the regression model by ordinary least squares. Then calculate the Durbin-Watson test statistic

    D = Σ_{t=2}^{n} (e_t - e_{t-1})² / Σ_{t=1}^{n} e_t²,    (15-69)

where e_t is the tth residual. For a suitable value of α, obtain the critical values D_{α,U} and D_{α,L} from Table 15-14. If D > D_{α,U}, do not reject H0: ρ = 0; but if D < D_{α,L}, reject H0: ρ = 0 and conclude that the errors are positively autocorrelated. If D_{α,L} ≤ D ≤ D_{α,U}, the test is
Table 15-14 Critical Values of the Durbin-Watson Statistic

                                  k = Number of Regressors (Excluding the Intercept)
Sample   Lower-Tail            1            2            3            4            5
 Size    (Significance
          Level = α)        DL    DU     DL    DU     DL    DU     DL    DU     DL    DU
  15        0.01           0.81  1.07   0.70  1.25   0.59  1.46   0.49  1.70   0.39  1.96
            0.025          0.95  1.23   0.83  1.40   0.71  1.61   0.59  1.81   0.48  2.09
            0.05           1.08  1.36   0.95  1.54   0.82  1.75   0.69  1.97   0.56  2.21
  20        0.01           0.95  1.15   0.86  1.27   0.77  1.41   0.63  1.57   0.60  1.74
            0.025          1.08  1.28   0.99  1.41   0.89  1.55   0.79  1.70   0.70  1.87
            0.05           1.20  1.41   1.10  1.54   1.00  1.68   0.90  1.83   0.79  1.99
  25        0.01           1.05  1.21   0.98  1.30   0.90  1.41   0.83  1.52   0.75  1.65
            0.025          1.13  1.34   1.10  1.43   1.02  1.54   0.94  1.65   0.86  1.77
            0.05           1.20  1.45   1.21  1.55   1.12  1.66   1.04  1.77   0.95  1.89
  30        0.01           1.13  1.26   1.07  1.34   1.01  1.42   0.94  1.51   0.88  1.61
            0.025          1.25  1.38   1.18  1.46   1.12  1.54   1.05  1.63   0.98  1.73
            0.05           1.35  1.49   1.28  1.57   1.21  1.65   1.14  1.74   1.07  1.83
  40        0.01           1.25  1.34   1.20  1.40   1.15  1.46   1.10  1.52   1.05  1.58
            0.025          1.35  1.45   1.30  1.51   1.25  1.57   1.20  1.63   1.15  1.69
            0.05           1.44  1.54   1.39  1.60   1.34  1.66   1.29  1.72   1.23  1.79
  50        0.01           1.32  1.40   1.28  1.45   1.24  1.49   1.20  1.54   1.16  1.59
            0.025          1.42  1.50   1.38  1.54   1.34  1.59   1.30  1.64   1.26  1.69
            0.05           1.50  1.59   1.46  1.63   1.42  1.67   1.38  1.72   1.34  1.77
  60        0.01           1.38  1.45   1.35  1.48   1.32  1.52   1.28  1.56   1.25  1.60
            0.025          1.47  1.54   1.44  1.57   1.40  1.61   1.37  1.65   1.33  1.69
            0.05           1.55  1.62   1.51  1.65   1.48  1.69   1.44  1.73   1.41  1.77
  80        0.01           1.47  1.52   1.44  1.54   1.42  1.57   1.39  1.60   1.36  1.62
            0.025          1.54  1.59   1.52  1.62   1.49  1.65   1.47  1.67   1.44  1.70
            0.05           1.61  1.66   1.59  1.69   1.56  1.72   1.53  1.74   1.51  1.77
 100        0.01           1.54  1.56   1.50  1.58   1.48  1.60   1.45  1.63   1.44  1.65
            0.025          1.59  1.63   1.57  1.65   1.55  1.67   1.53  1.70   1.51  1.72
            0.05           1.65  1.69   1.63  1.72   1.61  1.74   1.59  1.76   1.57  1.78

Source: Adapted from Econometrics, by R. J. Wonnacott and T. H. Wonnacott, John Wiley & Sons, New York, 1970, with permission of the publisher.
inconclusive. When the test is inconclusive, the implication is that more data must be collected. In many problems this is difficult to do.

To test for negative autocorrelation, that is, if the alternative hypothesis in equation 15-68 is H1: ρ < 0, then use D' = 4 - D as the test statistic, where D is defined in equation 15-69. If a two-sided alternative is specified, then use both of the one-sided procedures, noting that the type I error for the two-sided test is 2α, where α is the type I error for the one-sided tests. The only effective remedial measure when autocorrelation is present is to build a model that accounts explicitly for the autocorrelative structure of the errors. For an introductory treatment of these methods, refer to Montgomery, Peck, and Vining (2001).
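Computing the statistic of equation 15-69 from least-squares residuals takes only a few lines. The data below are hypothetical with independent errors, so D should fall near 2 (the value expected under H0):

```python
import numpy as np

# Hypothetical regression data with uncorrelated errors.
rng = np.random.default_rng(2)
n = 100
x = np.linspace(0, 1, n)
y = 3.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

# Fit by ordinary least squares and take the residuals e_t.
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta

# Durbin-Watson statistic, equation 15-69.
D = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(D)
```

Positively autocorrelated residuals push D toward 0, negatively autocorrelated residuals push it toward 4, which is why the negative-autocorrelation test above uses D' = 4 - D.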
15-11 SELECTION OF VARIABLES IN MULTIPLE REGRESSION

15-11.1 The Model-Building Problem

An important problem in many applications of regression analysis is the selection of the set of independent or regressor variables to be used in the model. Sometimes previous experience or underlying theoretical considerations can help the analyst specify the set of independent variables. Usually, however, the problem consists of selecting an appropriate set of regressors from a set that quite likely includes all the important variables, but we are sure that not all these candidate variables are necessary to adequately model the response y.

In such a situation, we are interested in screening the candidate variables to obtain a regression model that contains the "best" subset of regressor variables. We would like the final model to contain enough regressor variables so that in the intended use of the model (prediction, for example) it will perform satisfactorily. On the other hand, to keep model maintenance costs to a minimum, we would like the model to use as few regressor variables as possible. The compromise between these conflicting objectives is often called finding the "best" regression equation. However, in most problems, there is no single regression model that is "best" in terms of the various evaluation criteria that have been proposed. A great deal of judgment and experience with the system being modeled is usually necessary to select an appropriate set of independent variables for a regression equation.

No algorithm will always produce a good solution to the variable selection problem. Most currently available procedures are search techniques. To perform satisfactorily, they require interaction with and judgment by the analyst. We now briefly discuss some of the more popular variable selection techniques.
15-11.2 Computational Procedures for Variable Selection

We assume that there are k candidate variables, x1, x2, ..., xk, and a single dependent variable y. All models will include an intercept term β0, so that the model with all variables included would have k + 1 terms. Furthermore, the functional form of each candidate variable (for example, x1 = 1/x, x2 = ln x, etc.) is correct.
All Possible Regressions  This approach requires that the analyst fit all the regression equations involving one candidate variable, all regression equations involving two candidate variables, and so on. Then these equations are evaluated according to some suitable criteria to select the "best" regression model. If there are k candidate variables, there are 2^k total equations to be examined. For example, if k = 4, there are 2^4 = 16 possible regression equations, while if k = 10, there are 2^10 = 1024 possible regression equations. Hence, the number of equations to be examined increases rapidly as the number of candidate variables increases.
There are a number of criteria that may be used for evaluating and comparing the different regression models obtained. Perhaps the most commonly used criterion is based on the coefficient of multiple determination. Let R_p² denote the coefficient of determination for a regression model with p terms, that is, p - 1 candidate variables and an intercept term (note that p ≤ k + 1). Computationally, we have

    R_p² = SS_R(p)/S_yy = 1 - SS_E(p)/S_yy,    (15-70)
where SS_R(p) and SS_E(p) denote the regression sum of squares and the error sum of squares, respectively, for the p-term equation. Now R_p² increases as p increases and is a maximum when p = k + 1. Therefore, the analyst uses this criterion by adding variables to the model up to the point where an additional variable is not useful in that it gives only a small increase in R_p². The general approach is illustrated in Fig. 15-9, which gives a hypothetical plot of R_p² against p. Typically, one examines a display such as this and chooses the number of variables in the model as the point at which the "knee" in the curve becomes apparent. Clearly, this requires judgment on the part of the analyst.

A second criterion is to consider the mean square error for the p-variable equation, say MS_E(p) = SS_E(p)/(n - p). Generally, MS_E(p) decreases as p increases, but this is not necessarily so. If the addition of a variable to the model with p - 1 terms does not reduce the error sum of squares in the new p-term model by an amount equal to the error mean square in the old (p - 1)-term model, MS_E(p) will increase, because of the loss of one degree of freedom for error. Therefore, a logical criterion is to select p as the value that minimizes MS_E(p); or, since MS_E(p) is usually relatively flat in the vicinity of the minimum, we could choose p such that adding more variables to the model produces only very small reductions in MS_E(p). The general procedure is illustrated in Fig. 15-10.

A third criterion is the C_p statistic, which is a measure of the total mean square error for the regression model. We define the total standardized mean square error as

    Γ_p = (1/σ²) Σ_{i=1}^{n} E[ŷ_i - E(y_i)]²
        = (1/σ²) [ Σ_{i=1}^{n} {E(y_i) - E(ŷ_i)}² + Σ_{i=1}^{n} V(ŷ_i) ]
        = [(bias)² + variance]/σ².
Figure 15-9 Plot of R_p² against p.
Figure 15-10 Plot of MS_E(p) against p.
We use the mean square error from the full (k + 1)-term model as an estimate of σ²; that is, σ̂² = MS_E(k + 1). An estimator of Γ_p is

    C_p = SS_E(p)/σ̂² - n + 2p.    (15-71)

If the p-term model has negligible bias, then it can be shown that E(C_p | zero bias) = p. Therefore, the values of C_p for each regression model under consideration should be plotted against p. The regression equations that have negligible bias will have values of C_p that fall near the line C_p = p, while those with significant bias will have values of C_p that plot above this line. One then chooses as the "best" regression equation either a model with minimum C_p or a model with a slightly larger C_p that contains less bias than the minimum.

Another criterion is based on a modification of R_p² that accounts for the number of variables in the model. We presented this statistic in Section 15-6.1, the adjusted R² for the model fit in Example 15-1. This statistic is the adjusted R_p², defined as

    R²_adj(p) = 1 - [(n - 1)/(n - p)] (1 - R_p²).    (15-72)
Note that R²_adj(p) may decrease as p increases if the decrease in (n - 1)(1 - R_p²) is not compensated for by the loss of one degree of freedom in n - p. The experimenter would usually select the regression model that has the maximum value of R²_adj(p). However, note that this is equivalent to the model that minimizes MS_E(p), since

    1 - R²_adj(p) = [(n - 1)/(n - p)] (1 - R_p²) = (n - 1) MS_E(p)/S_yy.
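The all-possible-regressions evaluation described above, with the four criteria of equations 15-70 through 15-72, can be sketched compactly. The data here are hypothetical (two truly active regressors out of four candidates), not the peach data of the examples:

```python
import itertools
import numpy as np

# Hypothetical data: only candidates 0 and 1 truly affect y.
rng = np.random.default_rng(4)
n, k = 40, 4
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.8, size=n)

syy = np.sum((y - y.mean()) ** 2)

def fit_sse(cols):
    """Error sum of squares for the model with an intercept and X[:, cols]."""
    Xp = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Xp, y, rcond=None)
    r = y - Xp @ beta
    return r @ r

sigma2_full = fit_sse(range(k)) / (n - (k + 1))   # sigma^2-hat = MS_E(k+1)

results = {}
for size in range(1, k + 1):
    for cols in itertools.combinations(range(k), size):
        p = size + 1                              # number of terms incl. intercept
        sse = fit_sse(cols)
        r2 = 1.0 - sse / syy                      # equation 15-70
        mse = sse / (n - p)                       # MS_E(p)
        r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)   # equation 15-72
        cp = sse / sigma2_full - n + 2 * p        # equation 15-71
        results[cols] = (r2, mse, r2_adj, cp)
```

Note that the full model always has C_p = k + 1 exactly, and the subset chosen by maximum adjusted R² is the same as the one chosen by minimum MS_E(p), in line with the equivalence shown above.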
Example 15-16

The data in Table 15-15 are an expanded set of data for the damaged peach data in Example 15-1. There are now five candidate variables: drop height (x1), fruit density (x2), fruit height at impact point (x3), fruit pulp thickness (x4), and potential energy of the fruit before the impact (x5). Table 15-16 presents the results of running all possible regressions (except the trivial model with only an intercept) on these data. The values of R_p², R²_adj(p), MS_E(p), and C_p are given in the table.

A plot of the maximum R_p² for each subset of size p is shown in Fig. 15-11. Based on this plot, there does not appear to be much gain in adding the fifth variable. The value of R_p² does not seem to increase significantly with the addition of x5 over the four-variable model with the highest R_p² value. A plot of the minimum MS_E(p) for each subset of size p is shown in Fig. 15-12. The best two-variable model is either (x1, x3) or (x2, x3); the best three-variable model is (x1, x2, x3); the best four-variable model is either (x1, x2, x3, x4) or (x1, x2, x3, x5). There are several models with relatively small values of MS_E(p), but either the three-variable model (x1, x2, x3) or the four-variable model (x1, x2, x3, x4) would be superior to the other models based on the MS_E(p) criterion. Further investigation will be necessary.

A C_p plot is shown in Fig. 15-13. Only the five-variable model has C_p ≤ p (specifically, C_p = 6.0), but the C_p value for the four-variable model (x1, x2, x3, x4) is C_p = 6.1732. There appears to be insufficient gain in the C_p value to justify including x5. To illustrate the calculations, for this equation [the model including (x1, x2, x3, x4)] we would find

    C_p = SS_E(p)/σ̂² - n + 2p = 18.29715/1.13132 - 20 + 2(5) = 6.1732,
Table 15-15 Damaged Peach Data for Example 15-16
noting that σ̂² = 1.13132 is obtained from the full equation (x1, x2, x3, x4, x5). Since all other models (with the exclusion of the five-variable model) contain substantial bias, we would conclude on the basis of the C_p criterion that the best subset of the regressor variables is (x1, x2, x3, x4). Since this model also results in a relatively small MS_E(p) and a relatively high R_p², we would select it as the "best" regression equation. The final model is

    ŷ = -19.9 + 0.0123x1 + 27.3x2 - 0.0655x3 - 0.196x4.

Keep in mind, though, that further analysis should be conducted on this model as well as other possible candidate models. With additional investigation, it is possible to discover an even better-fitting model. We will discuss this in more detail later in this chapter.
The all-possible-regressions approach requires considerable computational effort, even when k is moderately small. However, if the analyst is willing to look at something less than the estimated model and all its associated statistics, it is possible to devise algorithms for all possible regressions that produce less information about each model but which are more efficient computationally. For example, suppose that we could efficiently calculate only the MS_E for each model. Since models with large MS_E are not likely to be selected as the best regression equations, we would then have only to examine in detail the models with small values of MS_E. There are several approaches to developing a computationally efficient algorithm for all possible regressions (for example, see Furnival and Wilson, 1974). Both Minitab and SAS computer packages provide the Furnival and Wilson (1974) algorithm as an option. The SAS output is provided in Table 15-16.

Table 15-16 All Possible Regressions for the Data in Example 15-16

Stepwise Regression  This is probably the most widely used variable selection technique. The procedure iteratively constructs a sequence of regression models by adding or
removing variables at each step. The criterion for adding or removing a variable at any step is usually expressed in terms of a partial F-test. Let F_in be the value of the F statistic for adding a variable to the model, and let F_out be the value of the F statistic for removing a variable from the model. We must have F_in ≥ F_out, and usually F_in = F_out.

Stepwise regression begins by forming a one-variable model using the regressor variable that has the highest correlation with the response variable y. This will also be the variable producing the largest F statistic. If no F statistic exceeds F_in, the procedure terminates. For example, suppose that at this step x1 is selected. At the second step the remaining k - 1 candidate variables are examined, and the variable for which the statistic

    F_j = SS_R(β_j | β1, β0) / MS_E(x_j, x1)    (15-73)

is a maximum is added to the equation, provided that F_j > F_in. In equation 15-73, MS_E(x_j, x1) denotes the mean square for error for the model containing both x1 and x_j. Suppose that this procedure now indicates that x2 should be added to the model. Now the stepwise regression algorithm determines whether the variable x1 added at the first step should be removed. This is done by calculating the F statistic

    F_1 = SS_R(β1 | β2, β0) / MS_E(x1, x2).    (15-74)
If F_1 < F_out, the variable x1 is removed.

In general, at each step the set of remaining candidate variables is examined, and the variable with the largest partial F statistic is entered, provided that the observed value of F exceeds F_in. Then the partial F statistic for each variable in the model is calculated, and the variable with the smallest observed value of F is deleted if the observed F < F_out. The procedure continues until no other variables can be added to or removed from the model.

Stepwise regression is usually performed using a computer program. The analyst exercises control over the procedure by the choice of F_in and F_out. Some stepwise regression computer programs require that numerical values be specified for F_in and F_out. Since the number of degrees of freedom on MS_E depends on the number of variables in the model, which changes from step to step, a fixed value of F_in and F_out causes the type I and type II error rates to vary. Some computer programs allow the analyst to specify the type I error levels for F_in and F_out. However, the "advertised" significance level is not the true level, because the variable selected is the one that maximizes the partial F statistic at that stage. Sometimes it is useful to experiment with different values of F_in and F_out (or different advertised type I error rates) in several runs to see if this substantially affects the choice of the final model.
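The entry/removal loop just described can be sketched as follows. The data are hypothetical (two truly active candidates out of five), and the fixed thresholds F_in = F_out = 8.0 are arbitrary illustrative numbers rather than F-table quantiles:

```python
import numpy as np

# Hypothetical data: only candidates 0 and 2 truly affect y.
rng = np.random.default_rng(9)
n, k = 60, 5
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=1.0, size=n)

def sse(cols):
    Xp = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    r = y - Xp @ np.linalg.lstsq(Xp, y, rcond=None)[0]
    return r @ r

def partial_f(cols, j):
    """Partial F for x_j given the other variables in cols (j must be in cols)."""
    reduced = [c for c in cols if c != j]
    full_sse = sse(cols)
    mse = full_sse / (n - len(cols) - 1)
    return (sse(reduced) - full_sse) / mse

F_in, F_out = 8.0, 8.0
model = []
for _ in range(2 * k):                      # bounded loop as a safety guard
    # Entry step: enter the candidate with the largest partial F, if > F_in.
    candidates = [j for j in range(k) if j not in model]
    if not candidates:
        break
    fj = {j: partial_f(model + [j], j) for j in candidates}
    best = max(fj, key=fj.get)
    if fj[best] < F_in:
        break
    model.append(best)
    # Removal step: drop the weakest entered variable if its partial F < F_out.
    fr = {j: partial_f(model, j) for j in model}
    worst = min(fr, key=fr.get)
    if fr[worst] < F_out:
        model.remove(worst)

print(sorted(model))
```

The two truly active variables enter with very large partial F statistics and survive every removal step, which is the behavior the procedure is designed to produce.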
Example 15-17

We will apply stepwise regression to the damaged peaches data in Table 15-15. Minitab output is provided in Fig. 15-14. From this figure, we see that variables x1, x2, and x3 are significant; this is because the last column contains entries for only x1, x2, and x3. Figure 15-15 provides the SAS computer output that will support the computations shown next. Instead of specifying numerical values of F_in and F_out, we use an advertised type I error of α = 0.10. The first step consists of building a simple linear regression model using the variable that gives the largest F statistic. This is x2, and since F = 33.87 > F_in = F_{0.10,1,18} = 3.01, x2 is entered into the model.

Figure 15-14 Minitab output for Example 15-17.
The second step begins by finding the variable x_j that has the largest partial F statistic, given that x2 is in the model. This is x1, and since

    SS_R(β1 | β2, β0) / MS_E(x1, x2) = 22.32656/2.26063 = 9.88 > F_in = F_{0.10,1,17} = 3.03,

x1 is added to the model. Now the procedure evaluates whether or not x2 should be retained, given that x1 is in the model. This involves calculating

    SS_R(β2 | β1, β0) / MS_E(x1, x2) = 40.44627/2.26063 = 17.89 > F_out = F_{0.10,1,17} = 3.03.

Therefore x2 should be retained. Step 2 terminates with both x1 and x2 in the model.
The third step finds the next variable for entry, x3. Since

F = SS_R(β3 | β1, β2, β0) / MSE(x1, x2, x3) = 16.64910 / 1.36135 = 12.23 > F_in = F_0.10,1,16 = 3.05,

x3 is added to the model. Partial F-tests on x2 (given x1 and x3) and x1 (given x2 and x3) indicate that these variables should be retained. Therefore, the third step concludes with the variables x1, x2, and x3 in the model. At the fourth step, neither of the remaining terms, x4 or x5, is significant enough to be included in the model. Therefore, the stepwise procedure terminates. The stepwise regression procedure would conclude that the best model includes x1, x2, and x3. The usual checks of model adequacy, such as residual analysis and Cp plots, should be applied to the equation. These results are similar to those found by all possible regressions, with the exception that x4 was also considered a possible significant variable with all possible regressions.
Chapter 15 Multiple Regression
Forward Selection This variable selection procedure is based on the principle that variables should be added to the model one at a time until no remaining candidate variable produces a significant increase in the regression sum of squares. That is, variables are added one at a time as long as F > F_in. Forward selection is a simplification of stepwise regression that omits the partial F-test for deleting variables from the model that have been added at previous steps. This is a potential weakness of forward selection; the procedure does not explore the effect that adding a variable at the current step has on variables added at earlier steps.
Forward Selection: Step 1   Variable x2 Entered: R-Square = 0.6530 and C(p) = 37.7047

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1        114.32885     114.32885     33.87    <.0001
Error             18         60.75725       3.37540
Corrected Total   19        175.08609

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept             -42.87237          8.41429     87.62858     25.96   <.0001
x2                     49.08366          8.43377    114.32885     33.87   <.0001

Bounds on condition number: 1, 1

Forward Selection: Step 2   Variable x1 Entered: R-Square = 0.7805 and C(p) = 19.9697

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2        136.65541      68.32771     30.23    <.0001
Error             17         38.43068       2.26063
Corrected Total   19        175.08609

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept             -33.83110          7.46286     46.45691     20.55   0.0003
x1                      0.01314          0.00418     22.32656      9.88   0.0059
x2                     34.88963          8.24844     40.44627     17.89   0.0006

Bounds on condition number: 1.4282, 5.7129

Forward Selection: Step 3   Variable x3 Entered: R-Square = 0.8756 and C(p) = 7.2532

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        153.30451      51.10150     37.54    <.0001
Error             16         21.78159       1.36135
Corrected Total   19        175.08609

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept             -27.89190          6.03518     29.07675     21.36   0.0003
x1                      0.01360          0.00325     23.87130     17.54   0.0007
x2                     30.68486          6.50286     30.21859     22.20   0.0002
x3                     -0.06701          0.01916     16.64910     12.23   0.0030

Bounds on condition number: 1.4786, 11.85

No other variable met the 0.1000 significance level for entry into the model.

Summary of Forward Selection

       Variable   Number    Partial    Model
Step   Entered    Vars In   R-Square   R-Square      C(p)   F Value   Pr > F
1      x2            1        0.6530     0.6530   37.7047     33.87   <.0001
2      x1            2        0.1275     0.7805   19.9697      9.88   0.0059
3      x3            3        0.0951     0.8756    7.2532     12.23   0.0030

Figure 15-15 SAS output for stepwise regression in Example 15-17.
15-11 Selection of Variables in Multiple Regression
Example 15-18
Application of the forward selection algorithm to the damaged peach data in Table 15-15 would begin by adding x2 to the model. Then the variable that induces the largest partial F-test, given that x2 is in the model, is added; this is variable x1. The third step enters x3, which produces the largest partial F statistic given that x1 and x2 are in the model. Since the partial F statistics for x4 and x5 are not significant, the procedure terminates. The SAS output for forward selection is given in Fig. 15-16. Note that forward selection leads to the same final model as stepwise regression. This is not always the case.
Backward Elimination This algorithm begins with all k candidate variables in the model. Then the variable with the smallest partial F statistic is deleted if this F statistic is insignificant, that is, if F < F_out. Next, the model with k - 1 variables is estimated, and the next variable for potential elimination is found. The algorithm terminates when no further variables can be deleted.
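One backward elimination decision can be reproduced directly from the Type II sums of squares and full-model mean square error reported in the SAS output of Fig. 15-17 below. The snippet is illustrative; the numbers are taken from the text, and F_out = F_0.10,1,14 = 3.10 is the deletion threshold used there.

```python
# Type II sums of squares from the five-variable fit of the damaged peach
# data (values as reported in the text) and the full-model MSE.
type_ii_ss = {"x1": 13.65360, "x2": 21.73153, "x3": 17.37834,
              "x4": 5.25602, "x5": 2.45862}
mse_full = 1.13132
f_out = 3.10                       # F(0.10; 1, 14), the deletion threshold

partial_f = {v: ss / mse_full for v, ss in type_ii_ss.items()}
weakest = min(partial_f, key=partial_f.get)        # smallest partial F
drop = weakest if partial_f[weakest] < f_out else None
```

Here `drop` comes out as "x5" with F5 of about 2.17, matching the first elimination step worked out in Example 15-19.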
Example 15-19
To apply backward elimination to the data in Table 15-15, we begin by estimating the full model in all five variables. This model is

y = -20.89732 + 0.01102 x1 + 27.37046 x2 - 0.06929 x3 - 0.25695 x4 + 0.01688 x5.

The SAS computer output is given in Fig. 15-17. The partial F-tests for each variable are as follows:

F1 = SS_R(β1 | β2, β3, β4, β5, β0) / MSE = 13.65360 / 1.13132 = 12.07,
F2 = SS_R(β2 | β1, β3, β4, β5, β0) / MSE = 21.73153 / 1.13132 = 19.21,
F3 = SS_R(β3 | β1, β2, β4, β5, β0) / MSE = 17.37834 / 1.13132 = 15.36,
F4 = SS_R(β4 | β1, β2, β3, β5, β0) / MSE = 5.25602 / 1.13132 = 4.65,
F5 = SS_R(β5 | β1, β2, β3, β4, β0) / MSE = 2.45862 / 1.13132 = 2.17.
The variable x5 has the smallest F statistic, F5 = 2.17 < F_out = F_0.10,1,14 = 3.10; therefore, x5 is removed from the model at step 1. The model is now fit with only the four remaining variables. At step 2, the F statistic for x4 (F4 = 2.86) is less than F_out = F_0.10,1,15 = 3.07; therefore, x4 is removed from the model. No remaining variables have F statistics less than the appropriate F_out values, and the procedure is terminated. The three-variable model (x1, x2, x3) has all variables significant according to the partial F-test criterion. Note that backward elimination has resulted in the same model that was found by forward selection and stepwise regression. This may not always happen.
Some Comments on Final Model Selection We have illustrated several different approaches to the selection of variables in multiple linear regression. The final model obtained from any model-building procedure should be subjected to the usual adequacy checks, such as residual analysis and examination of the effects of outlying points. The analyst may also consider augmenting the original set of candidate variables with cross products, polynomial terms, or other transformations of the original variables that might improve the model.
The REG Procedure
Dependent Variable: y

Backward Elimination: Step 0   All Variables Entered: R-Square = 0.9095 and C(p) = 6.0000

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5        159.24760      31.84952     28.15    <.0001
Error             14         15.83850       1.13132
Corrected Total   19        175.08609

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept             -20.89732          7.16035      9.63604      8.52   0.0112
x1                      0.01102          0.00317     13.65360     12.07   0.0037
x2                     27.37046          6.24496     21.73153     19.21   0.0006
x3                     -0.06929          0.01768     17.37834     15.36   0.0015
x4                     -0.25695          0.11921      5.25602      4.65   0.0490
x5                      0.01688          0.01132      2.45862      2.17

Bounds on condition number: 1.6438, 35.628

Backward Elimination: Step 2   Variable x4 Removed: R-Square = 0.8756 and C(p) = 7.2532

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        153.30451      51.10150     37.54    <.0001
Error             16         21.78159       1.36135
Corrected Total   19        175.08609

Variable     Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
Intercept             -27.89190          6.03518     29.07675     21.36   0.0003
x1                      0.01360          0.00326     23.87130     17.54   0.0007
x2                     30.68486          6.51286     30.21859     22.20   0.0002
x3                     -0.06701          0.01916     16.64910     12.23   0.0030

Bounds on condition number: 1.4786, 11.85

All variables left in the model are significant at the 0.1000 level.
No other variable met the 0.1000 significance level for entry into the model.

Summary of Stepwise Selection

       Variable   Variable   Number    Partial    Model
Step   Entered    Removed    Vars In   R-Square   R-Square      C(p)   F Value   Pr > F
1      x2                       1        0.6530     0.6530   37.7047     33.87   <.0001
2      x1                       2        0.1275     0.7805   19.9697      9.88   0.0059
3      x3                       3        0.0951     0.8756    7.2532     12.23   0.0030

Figure 15-17 SAS output for backward elimination in Example 15-19.
A major criticism of variable selection methods, such as stepwise regression, is that the analyst may conclude that there is one "best" regression equation. This generally is not the case, because there are often several equally good regression models that can be used. One
way to avoid this problem is to use several different model-building techniques and see if different models result. For example, we have found the same model for the damaged peach data by using stepwise regression, forward selection, and backward elimination. This is a good indication that the three-variable model is the best regression equation. Furthermore, there are variable selection techniques that are designed to find the best one-variable model, the best two-variable model, and so forth. For a discussion of these methods, and the variable selection problem in general, see Montgomery, Peck, and Vining (2001). If the number of candidate regressors is not too large, the all-possible-regressions method is recommended. It is not distorted by multicollinearity among the regressors, as stepwise-type methods are.
15-12 SUMMARY
This chapter has introduced multiple linear regression, including least-squares estimation of the parameters, interval estimation, prediction of new observations, and methods for
hypothesis testing. Various tests of model adequacy, including residual plots, have been discussed. It was shown that polynomial regression models can be handled by the usual multiple linear regression methods. Indicator variables were introduced for dealing with qualitative variables. It also was observed that the problem of multicollinearity, or intercorrelation between the regressor variables, can greatly complicate the regression problem and often leads to a regression model that may not predict new observations well. Several causes and remedial measures of this problem, including biased estimation techniques, were discussed. Finally, the variable selection problem in multiple regression was introduced. A number of model-building procedures, including all possible regressions, stepwise regression, forward selection, and backward elimination, were illustrated.
15-13 EXERCISES
15-1. Consider the damaged peach data in Table 15-15.
(a) Fit a regression model using x1 (drop height) and x4 (fruit pulp thickness) to these data.
(b) Test for significance of regression.
(c) Compute the residuals from this model. Analyze these residuals using the methods discussed in this chapter.
(d) How does this two-variable model compare with the two-variable model using x1 and x2 from Example 15-1?
15-2. Consider the damaged peach data in Table 15-15.
(a) Fit a regression model using x1 (drop height), x2 (fruit density), and x3 (fruit height at impact point) to these data.
(b) Test for significance of regression.
(c) Compute the residuals from this model. Analyze these residuals using the methods discussed in this chapter.
15-3. Using the results of Exercise 15-1, find a 95% confidence interval on β1.
15-4. Using the results of Exercise 15-2, find a 95% confidence interval on β3.
15-5.
The data in the table at the top of page 487 are the 1976 team performance statistics for the teams in the National Football League (Source: The Sporting News).
(a) Fit a multiple regression model relating the number of games won to the teams' passing yardage (x2), the percentage of rushing plays (x7), and the opponents' yards rushing (x8).
(b) Construct the appropriate residual plots and comment on model adequacy.
(c) Test the significance of each variable to the model, using either the t-test or the partial F-test.
15-6. The table at the top of page 488 presents gasoline mileage performance for 25 automobiles (Source: Motor Trend, 1975).
(a) Fit a multiple regression model relating gasoline mileage to engine displacement (x1) and number of carburetor barrels (x6).
(b) Analyze the residuals and comment on model adequacy.
National Football League 1976 Team Performance

Teams (with y = games won): Washington, Minnesota, New England, Oakland, Pittsburgh, Baltimore, Los Angeles, Dallas, Atlanta, Buffalo, Chicago, Cincinnati, Cleveland, Denver, Detroit, Green Bay, Houston, Kansas City, Miami, New Orleans, New York Giants, New York Jets, Philadelphia, St. Louis, San Diego, San Francisco, Seattle, Tampa Bay.

x2: Passing yards (season).
x3: Punting average (yds / punt).
x4: Field goal percentage (fgs made / fgs attempted).
x5: Turnover differential (turnovers acquired - turnovers lost).
x6: Penalty yards (season).
x7: Percent rushing (rushing plays / total plays).
x8: Opponents' rushing yards (season).
x9: Opponents' passing yards (season).
(c) What is the value of adding x6 to a model that already contains x1?
15-7. The electric power consumed each month by a chemical plant is thought to be related to the average ambient temperature (x1), the number of days in the month (x2), the average product purity (x3), and the tons of product produced (x4). The past year's historical data are available and are presented in the table at the bottom of page 488.
(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Use partial F statistics to test H0: β2 = 0 and H0: β3 = 0.
(d) Compute the residuals from this model. Analyze the residuals using the methods discussed in this chapter.

15-8. Hald (1952) reports data on the heat evolved in calories per gram of cement (y) for various amounts of four ingredients (x1, x2, x3, x4):

Observation
Number        y     x1   x2   x3   x4
 1          78.5     7   26    6   60
 2          74.3     1   29   15   52
 3         104.3    11   56    8   20
 4          87.6    11   31    8   47
 5          95.9     7   52    6   33
 6         109.2    11   55    9   22
 7         102.7     3   71   17    6
 8          72.5     1   31   22   44
 9          93.1     2   54   18   22
10         115.9    21   47    4   26
11          83.8     1   40   23   34
12         113.3    11   66    9   12
13         109.4    10   68    8   12

(a) Fit a multiple regression model to these data.
(b) Test for significance of regression.
(c) Test the hypothesis β4 = 0 using the partial F-test.
(d) Compute the t statistics for each independent variable. What conclusions can you draw?
(e) Test the hypothesis β2 = β3 = β4 = 0 using the partial F-test.
(f) Construct a 95% confidence interval estimate for β1.

15-9. An article entitled "A Method for Improving the Accuracy of Polynomial Regression Analysis" in the Journal of Quality Technology (1971) reported the following data on y = ultimate shear strength of a rubber compound (psi) and x = cure temperature (°F):

y   770   800   840   810   735   640   590   560
x   280   284   292   295   298   305   308   315

(a) Fit a second-order polynomial to these data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and test for model adequacy.

15-10. Consider the following data, which result from an experiment to determine the effect of x = test time in hours at a particular temperature on y = change in oil viscosity.
(a) Fit a second-order polynomial to the data.
(b) Test for significance of regression.
(c) Test the hypothesis that β11 = 0.
(d) Compute the residuals and check for model adequacy.

15-11. For many polynomial regression models we subtract x̄ from each x value to produce a "centered" regressor x' = x - x̄. Using the data from Exercise 15-9, fit the model y = β0' + β1'x' + β11'(x')² + ε. Use the results to estimate the coefficients in the uncentered model y = β0 + β1x + β11x² + ε.

15-12. Suppose that we use a standardized variable x' = (x - x̄)/s_x, where s_x is the standard deviation of x, in constructing a polynomial regression model. Using the data in Exercise 15-9 and the standardized variable approach, fit the model y = β0' + β1'x' + β11'(x')² + ε.
(a) What value of y do you predict when x = 285°F?
(b) Estimate the regression coefficients in the unstandardized model y = β0 + β1x + β11x² + ε.
(c) What can you say about the relationship between SSE and R² for the standardized and unstandardized models?
(d) Suppose that y' = (y - ȳ)/s_y is used in the model along with x'. Fit the model and comment on the relationship between SSE and R² in the standardized model and the unstandardized model.

15-13. The data shown at the bottom of this page were collected during an experiment to determine the change in thrust efficiency (%) (y) as the divergence angle of a rocket nozzle (x) changes.
(a) Fit a second-order model to the data.
(b) Test for significance of regression and lack of fit.
(c) Test the hypothesis that β11 = 0.

15-14. Discuss the hazards inherent in fitting polynomial models.

15-15. Consider the data in Example 15-12. Test the hypothesis that two different regression models (with
different slopes and intercepts) are required to adequately model the data.
15-16. Piecewise Linear Regression (I). Suppose that y is piecewise linearly related to x. That is, different linear relationships are appropriate over the intervals -∞ < x ≤ x' and x' < x < ∞. Show how indicator variables can be used to fit such a piecewise linear regression model, assuming that the point x' is known.
15-17. Piecewise Linear Regression (II). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that at point x' a discontinuity occurs in the regression function. Show how indicator variables can be used to incorporate the discontinuity into the model.
15-18. Piecewise Linear Regression (III). Consider the piecewise linear regression model described in Exercise 15-16. Suppose that point x' is not known with certainty and must be estimated. Develop an approach that could be used to fit the piecewise linear regression model.
15-19. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-1.
15-20. Calculate the standardized regression coefficients for the regression model developed in Exercise 15-2.
15-21. Find the variance inflation factors for the regression model developed in Example 15-1. Do they indicate that multicollinearity is a problem in this model?
15-22. Use the National Football League Team Performance data in Exercise 15-5 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.
15-23. Use the gasoline mileage data in Exercise 15-6 to build regression models using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
(e) Comment on the various models obtained.
15-24. Consider the Hald cement data in Exercise 15-8. Build regression models for the data using the following techniques:
(a) All possible regressions.
(b) Stepwise regression.
(c) Forward selection.
(d) Backward elimination.
15-25. Consider the Hald cement data in Exercise 15-8. Fit a regression model involving all four regressors and find the variance inflation factors. Is multicollinearity a problem in this model? Use ridge regression to estimate the coefficients in this model. Compare the ridge model to the models obtained in Exercise 15-24 using variable selection methods.
Chapter 16

Nonparametric Statistics

16-1 INTRODUCTION
Most of the hypothesis testing and confidence interval procedures in previous chapters are based on the assumption that we are working with random samples from normal populations. Fortunately, most of these procedures are relatively insensitive to slight departures from normality. In general, the t- and F-tests and t confidence intervals will have actual levels of significance or confidence levels that differ from the nominal or advertised levels chosen by the experimenter, although the difference between the actual and advertised levels is usually fairly small when the underlying population is not too different from the normal distribution. Traditionally, we have called these procedures parametric methods because they are based on a particular parametric family of distributions, in this case the normal. Alternatively, sometimes we say that these procedures are not distribution free because they depend on the assumption of normality. In this chapter we describe procedures called nonparametric or distribution-free methods, which usually make no assumptions about the distribution of the underlying population other than that it is continuous. These procedures have actual level of significance α or confidence level 100(1 - α)% for many different types of distributions. These procedures also have considerable appeal. One of their advantages is that the data need not be quantitative; they could be categorical (such as yes or no, defective or nondefective, etc.) or rank data. Another advantage is that nonparametric procedures are usually very quick and easy to perform.
The procedures described in this chapter are competitors of the parametric t- and F-procedures described earlier. Consequently, it is important to compare the performance of both parametric and nonparametric methods under the assumptions of both normal and nonnormal populations. In general, nonparametric procedures do not utilize all the information provided by the sample, and as a result a nonparametric procedure will be less efficient than the corresponding parametric procedure when the underlying population is normal. This loss of efficiency usually is reflected by a requirement for a larger sample size for the nonparametric procedure than would be required by the parametric procedure in order to achieve the same probability of type II error. On the other hand, this loss of efficiency is usually not large, and often the difference in sample size is very small. When the underlying distributions are not normal, nonparametric methods have much to offer. They often provide considerable improvement over the normal-theory parametric methods.
16-2 THE SIGN TEST
16-2.1 A Description of the Sign Test
The sign test is used to test hypotheses about the median μ̃ of a continuous distribution. Recall that the median of a distribution is a value of the random variable such that the probability is 0.5 that an observed value of X is less than or equal to the median, and the
probability is 0.5 that an observed value of X is greater than or equal to the median. That is, P(X ≤ μ̃) = P(X ≥ μ̃) = 0.5.
Since the normal distribution is symmetric, the mean of a normal distribution equals the median. Therefore the sign test can be used to test hypotheses about the mean of a normal distribution. This is the same problem for which we used the t-test in Chapter 11. We will discuss the relative merits of the two procedures in Section 16-2.4. Note that while the t-test was designed for samples from a normal distribution, the sign test is appropriate for samples from any continuous distribution. Thus, the sign test is a nonparametric procedure.
Suppose that the hypotheses are

H0: μ̃ = μ̃0,
H1: μ̃ ≠ μ̃0.    (16-1)

The test procedure is as follows. Suppose that X1, X2, ..., Xn is a random sample of n observations from the population of interest. Form the differences (Xi - μ̃0), i = 1, 2, ..., n. Now if H0: μ̃ = μ̃0 is true, any difference Xi - μ̃0 is equally likely to be positive or negative. Therefore let R+ denote the number of these differences (Xi - μ̃0) that are positive and let R- denote the number of these differences that are negative, where R = min(R+, R-).
When the null hypothesis is true, R has a binomial distribution with parameters n and p = 0.5. Therefore, we would find a critical value, say R*_α, from the binomial distribution that ensures that P(type I error) = P(reject H0 when H0 is true) = α. A table of these critical values R*_α is given in Appendix Table X. If the test statistic R ≤ R*_α, then the null hypothesis H0: μ̃ = μ̃0 should be rejected.
Example 16-1
Montgomery, Peck, and Vining (2001) report on a study in which a rocket motor is formed by binding an igniter propellant and a sustainer propellant together inside a metal housing. The shear strength of the bond between the two propellant types is an important characteristic. Results of testing 20 randomly selected motors are shown in Table 16-1. We would like to test the hypothesis that the median shear strength is 2000 psi.
The formal statement of the hypotheses of interest is

H0: μ̃ = 2000,
H1: μ̃ ≠ 2000.

The last two columns of Table 16-1 show the differences (Xi - 2000) for i = 1, 2, ..., 20 and the corresponding signs. Note that R+ = 14 and R- = 6. Therefore R = min(R+, R-) = min(14, 6) = 6. From Appendix Table X, with n = 20, we find that the critical value for α = 0.05 is R*_0.05 = 5. Therefore, since R = 6 is not less than or equal to the critical value R*_0.05 = 5, we cannot reject the null hypothesis that the median shear strength is 2000 psi.
We note that since R is a binomial random variable, we could test the hypothesis of interest by directly calculating a P-value from the binomial distribution. When H0: μ̃ = 2000 is true, R has a binomial distribution with parameters n = 20 and p = 0.5. Thus the probability of observing six or fewer negative signs in a sample of 20 observations is

P(R ≤ 6) = Σ_{r=0}^{6} (20 choose r) (0.5)^r (0.5)^{20-r} = 0.058.

Since the P-value is not less than the desired level of significance, we cannot reject the null hypothesis of μ̃ = 2000 psi.
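The binomial P-value computation in Example 16-1 is easy to automate. The sketch below is illustrative rather than taken from the text: it sets ties aside, counts signs, and sums the Bin(n, 0.5) lower tail.

```python
from math import comb

def sign_test_pvalue(diffs):
    """One-tailed sign-test P-value: P(R <= r) for R ~ Bin(n, 0.5),
    where r = min(# positive, # negative) and zero differences (ties)
    are set aside."""
    diffs = [d for d in diffs if d != 0]
    n = len(diffs)
    r = min(sum(d > 0 for d in diffs), sum(d < 0 for d in diffs))
    return sum(comb(n, k) for k in range(r + 1)) / 2 ** n

# Example 16-1: n = 20 signs, 6 of them negative
p_value = sum(comb(20, k) for k in range(7)) / 2 ** 20   # P(R <= 6), about 0.058
```

Since p = 0.5 under the null hypothesis, the probability terms (0.5)^r (0.5)^(n-r) collapse to 1/2^n, which is why only the binomial coefficients need to be summed.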
Exact Significance Levels When a test statistic has a discrete distribution, such as R does in the sign test, it may be impossible to choose a critical value that has a level of significance exactly equal to α. The usual approach is to choose R*_α to yield an α as close to the advertised level as possible.
Ties in the Sign Test Since the underlying population is assumed to be continuous, it is theoretically impossible to find a tie, that is, a value of Xi exactly equal to μ̃0. However, this may sometimes happen in practice because of the way the data are collected. When ties occur, they should be set aside and the sign test applied to the remaining data.
One-Sided Alternative Hypotheses We can also use the sign test when a one-sided alternative hypothesis is appropriate. If the alternative is H1: μ̃ > μ̃0, then reject H0: μ̃ = μ̃0 if R- ≤ R*_α; if the alternative is H1: μ̃ < μ̃0, reject H0 if R+ ≤ R*_α. The level of significance of a one-sided test is one-half the value shown in Appendix Table X.
The Normal Approximation When p = 0.5, the binomial distribution is well approximated by a normal distribution when n is at least 10. Thus, since the mean of the binomial is np and the variance is np(1 - p), the distribution of R is approximately normal, with mean 0.5n and variance 0.25n, whenever n is moderately large. Therefore in these cases the null hypothesis can be tested with the statistic

Z0 = (R - 0.5n) / (0.5√n).    (16-2)
The two-sided alternative would be rejected if |Z0| > Z_{α/2}, and the critical regions of the one-sided alternatives would be chosen to reflect the sense of the alternative (if the alternative is H1: μ̃ > μ̃0, reject H0 if Z0 > Z_α, for example).
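Applied to the data of Example 16-1 (R = 6, n = 20), equation 16-2 gives a value consistent with the exact binomial test; the snippet below is just this arithmetic.

```python
from math import sqrt

def sign_test_z(r, n):
    """Normal-approximation statistic of equation 16-2."""
    return (r - 0.5 * n) / (0.5 * sqrt(n))

z0 = sign_test_z(6, 20)   # about -1.79; |z0| < z_{0.025} = 1.96, so do not reject
```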
16-2.2 The Sign Test for Paired Samples
The sign test can also be applied to paired observations drawn from continuous populations. Let (X1j, X2j), j = 1, 2, ..., n, be a collection of paired observations from two continuous populations, and let

Dj = X1j - X2j,   j = 1, 2, ..., n,

be the paired differences. We wish to test the hypothesis that the two populations have a common median, that is, that μ̃1 = μ̃2. This is equivalent to testing that the median of the differences μ̃D = 0. This can be done by applying the sign test to the n differences Dj, as illustrated in the following example.
Example 16-2
An automotive engineer is investigating two different types of metering devices for an electronic fuel injection system to determine if they differ in their fuel mileage performance. The system is installed on 12 different cars, and a test is run with each metering system on each car. The observed fuel mileage performance data, corresponding differences, and their signs are shown in Table 16-2. Note that R+ = 8 and R- = 4. Therefore R = min(R+, R-) = min(8, 4) = 4. From Appendix Table X, with n = 12, we find the critical value for α = 0.05 is R*_0.05 = 2. Since R is not less than or equal to the critical value R*_0.05, we cannot reject the null hypothesis that the two metering devices produce the same fuel mileage performance.
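The sign counts of Example 16-2 can be checked directly from the paired readings as transcribed from Table 16-2 (device 1 minus device 2 for each of the 12 cars):

```python
device1 = [17.6, 19.4, 19.5, 17.1, 15.3, 15.9, 16.3, 18.4, 17.3, 19.1, 17.8, 18.2]
device2 = [16.8, 20.0, 18.2, 16.4, 16.0, 15.4, 16.5, 18.0, 16.4, 20.1, 16.7, 17.9]

diffs = [a - b for a, b in zip(device1, device2)]
r_plus = sum(d > 0 for d in diffs)     # positive differences
r_minus = sum(d < 0 for d in diffs)    # negative differences
r = min(r_plus, r_minus)               # R = 4 > R*_0.05 = 2: do not reject H0
```

This reproduces R+ = 8, R- = 4, and R = 4 from the example.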
Table 16-2 Performance of Flow Metering Devices

        Metering Device
Car        1       2     Difference   Sign
 1       17.6    16.8        0.8       +
 2       19.4    20.0       -0.6       -
 3       19.5    18.2        1.3       +
 4       17.1    16.4        0.7       +
 5       15.3    16.0       -0.7       -
 6       15.9    15.4        0.5       +
 7       16.3    16.5       -0.2       -
 8       18.4    18.0        0.4       +
 9       17.3    16.4        0.9       +
10       19.1    20.1       -1.0       -
11       17.8    16.7        1.1       +
12       18.2    17.9        0.3       +

16-2.3 Type II Error (β) for the Sign Test
The sign test will control the probability of type I error at an advertised level α for testing the null hypothesis H0: μ̃ = μ̃0 for any continuous distribution. As with any hypothesis-testing procedure, it is important to investigate the type II error, β. The test should be able to effectively detect departures from the null hypothesis, and a good measure of this effectiveness is the value of β for departures that are important. A small value of β implies an effective test procedure.
In determining β, it is important to realize that not only must a particular value of μ̃, say μ̃0 + Δ, be used, but also the form of the underlying distribution will affect the calculations. To illustrate, suppose that the underlying distribution is normal with σ = 1 and we are testing the hypothesis that μ̃ = 2 (since μ̃ = μ in the normal distribution, this is equivalent to testing that the mean equals 2). It is important to detect a departure from μ̃ = 2 to μ̃ = 3. The situation is illustrated graphically in Fig. 16-1a. When the alternative hypothesis is true (H1: μ̃ = 3), the probability that the random variable X exceeds the value 2 is

p = P(X > 2) = P(Z > -1) = 1 - Φ(-1) = 0.8413.
Suppose we have taken a sample of size 12. At the α = 0.05 level, Appendix Table X indicates that we would reject H0: μ̃ = 2 if R ≤ R*_0.05 = 2. Therefore, the β error is the probability that we do not reject H0: μ̃ = 2 when in fact μ̃ = 3, or

β = 1 - Σ_{x=0}^{2} (12 choose x) (0.1587)^x (0.8413)^{12-x} = 0.2944.
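The same tail sum serves both distributional cases in this subsection; only p = P(X < 2) under the alternative changes (0.1587 in the normal case above, and 0.3699 in the exponential case computed next). A small illustrative helper:

```python
from math import comb

def sign_test_beta(p_neg, n, r_crit):
    """Type II error for the sign test: the probability that the number of
    negative signs exceeds r_crit (so H0 is not rejected) when each
    observation falls below the hypothesized median with probability p_neg
    under the alternative."""
    reject = sum(comb(n, x) * p_neg ** x * (1 - p_neg) ** (n - x)
                 for x in range(r_crit + 1))
    return 1 - reject

beta_normal = sign_test_beta(0.1587, 12, 2)   # about 0.294 (normal case)
beta_expon = sign_test_beta(0.3699, 12, 2)    # about 0.879 (exponential case)
```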
If the distribution of X had been exponential rather than normal, then the situation would be as shown in Fig. 16-1b, and the probability that the random variable X exceeds the value x = 2 when μ̃ = 3 (note that when the median of an exponential distribution is 3, the mean is 4.33) is

p = P(X > 2) = ∫_2^∞ (1/4.33) e^{-x/4.33} dx = e^{-2/4.33} = 0.6301.
2 Under He:
il =2
3
4
5
6
Under ~;;:;'=3 (a)
J(,=2 J.L-2.89 Under
H;;: 'ji "" 2 (b)
Figure 1&.1 Calcu!ation of {3for the sign test (a) Normal distributions, (b) exponential distributions.
The β error in this case is

β = 1 - Σ_{x=0}^{2} (12 choose x) (0.3699)^x (0.6301)^{12-x} = 0.8794.

Thus, the β error for the sign test depends not only on the alternative value of μ̃ but also on the area to the right of the value specified in the null hypothesis under the population probability distribution. This area is highly dependent on the shape of that particular probability distribution.
16-2.4 Comparison of the Sign Test and the t-Test
If the underlying population is normal, then either the sign test or the t-test could be used to test H0: μ̃ = μ̃0. The t-test is known to have the smallest value of β possible among all tests that have significance level α, so it is superior to the sign test in the normal distribution case. When the population distribution is symmetric and nonnormal (but with finite mean μ = μ̃), then the t-test will have a β error that is smaller than β for the sign test, unless the distribution has very heavy tails compared with the normal. Thus, the sign test is usually considered a test procedure for the median rather than a serious competitor of the t-test. The Wilcoxon signed rank test in the next section is preferable to the sign test and compares well with the t-test for symmetric distributions.
16-3 THE WILCOXON SIGNED RANK TEST

Suppose that we are willing to assume that the population of interest is continuous and symmetric. As in the previous section, our interest focuses on the median μ̃ (or, equivalently, the mean μ, since μ̃ = μ for symmetric distributions). A disadvantage of the sign test in this situation is that it considers only the signs of the deviations Xᵢ − μ₀ and not their magnitudes. The Wilcoxon signed rank test is designed to overcome that disadvantage.
16-3.1 A Description of the Test

We are interested in testing H₀: μ = μ₀ against the usual alternatives. Assume that X₁, X₂, ..., Xₙ is a random sample from a continuous and symmetric distribution with mean (and median) μ. Compute the differences Xᵢ − μ₀, i = 1, 2, ..., n. Rank the absolute differences |Xᵢ − μ₀|, i = 1, 2, ..., n, in ascending order, and then give the ranks the signs of their corresponding differences. Let R⁺ be the sum of the positive ranks and R⁻ be the absolute value of the sum of the negative ranks, and let R = min(R⁺, R⁻). Appendix Table XI contains critical values of R, say R*_α. If the alternative hypothesis is H₁: μ ≠ μ₀, then the null hypothesis H₀: μ = μ₀ is rejected if R ≤ R*_α. For one-sided tests, if the alternative is H₁: μ > μ₀, reject H₀: μ = μ₀ if R⁻ ≤ R*_α; and if the alternative is H₁: μ < μ₀, reject H₀: μ = μ₀ if R⁺ ≤ R*_α. The significance level for one-sided tests is one-half the advertised level in Appendix Table XI.
Example 16-3

To illustrate the Wilcoxon signed rank test, consider the propellant shear strength data presented in Table 16-1. The signed ranks are
..., +336.75, +357.90, +399.55, +414.40, +575.00.
The sum of the positive ranks is R⁺ = (1 + 2 + 3 + 4 + 5 + 6 + 11 + 13 + 15 + 16 + 17 + 18 + 19 + 20) = 150 and the sum of the negative ranks is R⁻ = (7 + 8 + 9 + 10 + 12 + 14) = 60. Therefore, R = min(R⁺, R⁻) = min(150, 60) = 60. From Appendix Table XI, with n = 20 and α = 0.05, we find the critical value R*_0.05 = 52. Since R exceeds R*_0.05, we cannot reject the null hypothesis that the mean (or median, since the populations are assumed to be symmetric) shear strength is 2000 psi.
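The ranking arithmetic in this procedure is straightforward to automate. The helper below is a sketch (the function name is our own, not from the text) built on scipy.stats.rankdata; it returns R⁺, R⁻, and R for any sample and hypothesized mean, and is checked on a small hypothetical data set whose signed ranks (+3, −2, +1, −5, +4) can be verified by hand:

```python
import numpy as np
from scipy.stats import rankdata

def signed_rank(x, mu0):
    """Return (R+, R-, R) for the Wilcoxon signed rank test of H0: mu = mu0."""
    d = np.asarray(x, dtype=float) - mu0
    d = d[d != 0]                    # discard zero differences
    ranks = rankdata(np.abs(d))      # ties receive average ranks
    r_plus = float(ranks[d > 0].sum())
    r_minus = float(ranks[d < 0].sum())
    return r_plus, r_minus, min(r_plus, r_minus)

# Hypothetical check: the differences from mu0 = 4 are 1.1, -0.8, 0.7, -2.2, 2.0,
# whose signed ranks are +3, -2, +1, -5, +4, giving R+ = 8, R- = 7, R = 7.
print(signed_rank([5.1, 3.2, 4.7, 1.8, 6.0], 4))  # (8.0, 7.0, 7.0)
```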
Ties in the Wilcoxon Signed Rank Test

Because the underlying population is continuous, ties are theoretically impossible, although they will sometimes occur in practice. If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.
16-3.2 A Large-Sample Approximation

If the sample size is moderately large, say n > 20, then it can be shown that R has approximately a normal distribution with mean

μ_R = n(n + 1)/4

and variance

σ_R² = n(n + 1)(2n + 1)/24.
Therefore, a test of H₀: μ = μ₀ can be based on the statistic

Z₀ = [R − n(n + 1)/4] / √[n(n + 1)(2n + 1)/24].   (16-3)
An appropriate critical region can be chosen from a table of the standard normal distribution.
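As an illustration (our own, not from the text), equation 16-3 applied to the shear strength example, with n = 20 and R = 60, gives Z₀ ≈ −1.68; note that n = 20 sits right at the boundary of the n > 20 guideline, so this is purely illustrative:

```python
import math

def z0_signed_rank(r, n):
    """Large-sample statistic of equation 16-3 for the Wilcoxon signed rank test."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    return (r - mean) / math.sqrt(var)

z = z0_signed_rank(60, 20)   # R = 60 from the shear strength example
print(round(z, 2))           # -1.68; |Z0| < 1.96, so again not significant at 0.05
```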
16-3.3 Paired Observations

The Wilcoxon signed rank test can be applied to paired data. Let (X₁ⱼ, X₂ⱼ), j = 1, 2, ..., n, be a collection of paired observations from continuous distributions that differ only with respect to their means (it is not necessary that the distributions of X₁ and X₂ be symmetric). This assures that the distribution of the differences Dⱼ = X₁ⱼ − X₂ⱼ is continuous and symmetric. To use the Wilcoxon signed rank test, the differences are first ranked in ascending order of their absolute values, and then the ranks are given the signs of the differences. Ties are assigned average ranks. Let R⁺ be the sum of the positive ranks and R⁻ be the absolute value of the sum of the negative ranks, and let R = min(R⁺, R⁻). We reject the hypothesis of equality of means if R ≤ R*_α, where R*_α is chosen from Appendix Table XI. For one-sided tests, if the alternative is H₁: μ₁ > μ₂ (or H₁: μ_D > 0), reject H₀ if R⁻ ≤ R*_α; and if H₁: μ₁ < μ₂ (or H₁: μ_D < 0), reject H₀ if R⁺ ≤ R*_α.
Example 16-4

Consider the fuel metering device data examined in Example 16-2. Ranking the differences in ascending order of absolute value and attaching signs gives R⁺ = 55.5 and R⁻ = 22.5; therefore, R = min(R⁺, R⁻) = min(55.5, 22.5) = 22.5. From Appendix Table XI, with n = 12 and α = 0.05, we find the critical value R*_0.05 = 13. Since R exceeds R*_0.05, we cannot reject the null hypothesis that the two metering devices produce the same mileage performance.
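scipy.stats.wilcoxon implements this paired procedure directly: for the default two-sided alternative its statistic is min(R⁺, R⁻). A sketch on hypothetical differences Dⱼ whose ranks can be checked by hand:

```python
from scipy.stats import wilcoxon

# Hypothetical paired differences D_j = X1_j - X2_j (no zeros, no tied |D_j|);
# the negative differences -1, -4, -2.5 carry ranks 1, 6, 4, so R- = 11, R+ = 25.
d = [2.0, -1.0, 3.0, -4.0, 5.0, 6.0, -2.5, 1.5]

res = wilcoxon(d)      # two-sided test of H0: median difference = 0
print(res.statistic)   # 11.0 = min(R+, R-)
```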
16-3.4 Comparison with the t-Test

When the underlying population is normal, either the t-test or the Wilcoxon signed rank test can be used to test hypotheses about μ. The t-test is the best test in such situations in the sense that it produces a minimum value of β for all tests with significance level α. However, since it is not always clear that the normal distribution is appropriate, and since there are many situations in which we know it to be inappropriate, it is of interest to compare the two procedures for both normal and nonnormal populations. Unfortunately, such a comparison is not easy. The problem is that β for the Wilcoxon signed rank test is very difficult to obtain, and β for the t-test is difficult to obtain for nonnormal distributions. Because type II error comparisons are difficult, other measures of comparison have been developed. One widely used measure is asymptotic relative efficiency (ARE). The ARE of one test relative to another is the limiting ratio of the sample sizes necessary to obtain identical error probabilities for the two procedures. For example, if the ARE of one test relative to a competitor is 0.5, then when sample sizes are large, the first test will require a sample twice as large as the second one to obtain similar error performance. While this does not tell us anything for small sample sizes, we can say the following:

1. For normal populations, the ARE of the Wilcoxon signed rank test relative to the t-test is approximately 0.95.

2. For nonnormal populations, the ARE is at least 0.86, and in many cases it will exceed unity. When it exceeds unity, the Wilcoxon signed rank test requires a smaller sample size than does the t-test.

Although these are large-sample results, we generally conclude that the Wilcoxon signed rank test will never be much worse than the t-test, and in many cases where the population is nonnormal it may be superior. Thus the Wilcoxon signed rank test is a useful alternative to the t-test.
16-4 THE WILCOXON RANK-SUM TEST

Suppose that we have two independent continuous populations X₁ and X₂ with means μ₁ and μ₂. The distributions of X₁ and X₂ have the same shape and spread and differ only (possibly) in their means. The Wilcoxon rank-sum test can be used to test the hypothesis H₀: μ₁ = μ₂.
Sometimes this procedure is called the Mann-Whitney test, although the Mann-Whitney test statistic is usually expressed in a different form.
16-4.1 A Description of the Test

Let X₁₁, X₁₂, ..., X₁n₁ and X₂₁, X₂₂, ..., X₂n₂ be two independent random samples from the continuous populations X₁ and X₂ described earlier. We assume that n₁ ≤ n₂. Arrange all n₁ + n₂ observations in ascending order of magnitude and assign ranks to them. If two or more observations are tied (identical), then use the mean of the ranks that would have been assigned if the observations differed. Let R₁ be the sum of the ranks in the smaller X₁ sample, and define

R₂ = (n₁ + n₂)(n₁ + n₂ + 1)/2 − R₁.   (16-4)
Now if the two means do not differ, we would expect the sums of the ranks to be nearly equal for both samples. Consequently, if the sums of the ranks differ greatly, we would conclude that the means are not equal.
Appendix Table IX contains the critical values R*_α of the rank sums for α = 0.05 and α = 0.01. Refer to Appendix Table IX with the appropriate sample sizes n₁ and n₂. The null hypothesis H₀: μ₁ = μ₂ is rejected in favor of H₁: μ₁ ≠ μ₂ if either R₁ or R₂ is less than or equal to the tabulated critical value R*_α. The procedure can also be used for one-sided alternatives. If the alternative is H₁: μ₁ < μ₂, then reject H₀ if R₁ ≤ R*_α, while for H₁: μ₁ > μ₂, reject H₀ if R₂ ≤ R*_α. For these one-sided tests the tabulated critical values correspond to levels of significance of α = 0.025 and α = 0.005.
Example 16-5

The mean axial stress in tensile members used in an aircraft structure is being studied. Two alloys are being investigated. Alloy 1 is a traditional material and alloy 2 is a new aluminum-lithium alloy that is much lighter than the standard material. Ten specimens of each alloy type are tested, and the axial stress measured. The sample data are assembled in the following table:
Alloy 1                     Alloy 2
3238 psi    3254 psi       3261 psi    3248 psi
3195        3229           3187        3215
3246        3225           3209        3226
3190        3217           3212        3240
3204        3241           3258        3234
The data are arranged in ascending order and ranked as follows:

Alloy Number    Axial Stress    Rank
2               3187 psi        1
1               3190            2
1               3195            3
1               3204            4
2               3209            5
2               3212            6
2               3215            7
1               3217            8
1               3225            9
2               3226            10
1               3229            11
2               3234            12
1               3238            13
2               3240            14
1               3241            15
1               3246            16
2               3248            17
1               3254            18
2               3258            19
2               3261            20
The sum of the ranks for alloy 1 is R₁ = 2 + 3 + 4 + 8 + 9 + 11 + 13 + 15 + 16 + 18 = 99, so R₂ = 20(21)/2 − 99 = 111. From Appendix Table IX, with n₁ = n₂ = 10 and α = 0.05, we find that R*_0.05 = 78. Since neither R₁ nor R₂ is less than R*_0.05, we cannot reject the hypothesis that both alloys exhibit the same mean axial stress.
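The ranks and rank sums in this example can be verified with scipy (the lists below are the twenty axial stress values from the table; the last lines show the Mann-Whitney U form mentioned at the start of this section, related to the rank sum by U₁ = R₁ − n₁(n₁ + 1)/2):

```python
from scipy.stats import rankdata, mannwhitneyu

alloy1 = [3238, 3195, 3246, 3190, 3204, 3254, 3229, 3225, 3217, 3241]
alloy2 = [3261, 3187, 3209, 3212, 3258, 3248, 3215, 3226, 3240, 3234]

ranks = rankdata(alloy1 + alloy2)   # ranks of all 20 pooled observations
r1 = float(ranks[:10].sum())        # rank sum for alloy 1
r2 = float(ranks[10:].sum())
print(r1, r2)                       # 99.0 111.0; both exceed the critical value 78

# Mann-Whitney U statistic for the same data: U1 = R1 - n1(n1 + 1)/2
u = float(mannwhitneyu(alloy1, alloy2).statistic)
print(u)                            # 44.0 = 99 - 55
```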
16-4.2 A Large-Sample Approximation

When both n₁ and n₂ are moderately large, say greater than 8, the distribution of R₁ can be well approximated by the normal distribution with mean

μ_R₁ = n₁(n₁ + n₂ + 1)/2

and variance

σ_R₁² = n₁n₂(n₁ + n₂ + 1)/12.
Therefore, for n₁ and n₂ > 8 we could use

Z₀ = (R₁ − μ_R₁)/σ_R₁   (16-5)

as a test statistic, with critical region |Z₀| > Z_{α/2}, Z₀ > Z_α, or Z₀ < −Z_α, depending on whether the test is a two-tailed, upper-tail, or lower-tail test.
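For the alloy data of Example 16-5 (n₁ = n₂ = 10, both greater than 8), equation 16-5 can be evaluated directly; a sketch of the calculation:

```python
import math
from scipy.stats import rankdata

alloy1 = [3238, 3195, 3246, 3190, 3204, 3254, 3229, 3225, 3217, 3241]
alloy2 = [3261, 3187, 3209, 3212, 3258, 3248, 3215, 3226, 3240, 3234]
n1 = n2 = 10

r1 = rankdata(alloy1 + alloy2)[:n1].sum()   # 99.0

mean = n1 * (n1 + n2 + 1) / 2               # 105
var = n1 * n2 * (n1 + n2 + 1) / 12          # 175
z0 = (r1 - mean) / math.sqrt(var)
print(round(z0, 3))  # -0.454, well inside the two-sided 0.05 limits of +/-1.96
```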
16-4.3 Comparison with the t-Test

In Section 16-3.4 we discussed the comparison of the t-test with the Wilcoxon signed rank test. The results for the two-sample problem are identical to the one-sample case; that is, when the normality assumption is correct, the Wilcoxon rank-sum test is approximately 95% as efficient as the t-test in large samples. On the other hand, regardless of the form of the distributions, the Wilcoxon rank-sum test will always be at least 86% as efficient. The efficiency of the Wilcoxon test relative to the t-test is usually high if the underlying distribution has heavier tails than the normal, because the behavior of the t-test is very dependent on the sample mean, which is quite unstable in heavy-tailed distributions.
16-5 NONPARAMETRIC METHODS IN THE ANALYSIS OF VARIANCE

16-5.1 The Kruskal-Wallis Test

The single-factor analysis of variance model developed in Chapter 12 for comparing a population means is

Y_ij = μ + τ_i + ε_ij,   i = 1, 2, ..., a,   j = 1, 2, ..., n_i.   (16-6)
In this model the error terms ε_ij are assumed to be normally and independently distributed with mean zero and variance σ². The assumption of normality led directly to the F-test described in Chapter 12. The Kruskal-Wallis test is a nonparametric alternative to the F-test; it requires only that the ε_ij have the same continuous distribution for all treatments i = 1, 2, ..., a. Suppose that N = Σ_{i=1}^{a} n_i is the total number of observations. Rank all N observations from smallest to largest and assign the smallest observation rank 1, the next smallest rank 2, ..., and the largest observation rank N. If the null hypothesis

H₀: μ₁ = μ₂ = ... = μ_a

is true, the N observations come from the same distribution, and all possible assignments of the N ranks to the a samples are equally likely; then we would expect the ranks 1, 2, ..., N to be mixed throughout the a samples. If, however, the null hypothesis H₀ is false, then some samples will consist of observations having predominantly small ranks while other samples will consist of observations having predominantly large ranks. Let R_ij be the rank of observation Y_ij, and let R_i· and R̄_i· denote the total and average of the n_i ranks in the ith treatment. When the null hypothesis is true, then

E(R_ij) = (N + 1)/2

and

E(R̄_i·) = (1/n_i) Σ_{j=1}^{n_i} E(R_ij) = (N + 1)/2.
The Kruskal-Wallis test statistic measures the degree to which the actual observed average ranks R̄_i· differ from their expected value (N + 1)/2. If this difference is large, then the null hypothesis H₀ is rejected. The test statistic is

K = [12/(N(N + 1))] Σ_{i=1}^{a} n_i (R̄_i· − (N + 1)/2)².   (16-7)

An alternate computing formula is

K = [12/(N(N + 1))] Σ_{i=1}^{a} R_i·²/n_i − 3(N + 1).   (16-8)
We would usually prefer equation 16-8 to equation 16-7, as it involves the rank totals rather than the averages. The null hypothesis H₀ should be rejected if the sample data generate a large value for K. The null distribution for K has been obtained by using the fact that under H₀ each possible assignment of ranks to the a treatments is equally likely. Thus we could enumerate all possible assignments and count the number of times each value of K occurs. This has led to tables of the critical values of K, although most tables are restricted to small sample sizes n_i. In practice, we usually employ the following large-sample approximation: whenever H₀ is true and either

a = 3 and n_i ≥ 6 for i = 1, 2, 3

or

a > 3 and n_i ≥ 5 for i = 1, 2, ..., a,
then K has approximately a chi-square distribution with a − 1 degrees of freedom. Since large values of K imply that H₀ is false, we would reject H₀ if

K ≥ χ²_{α, a−1}.

The test has approximate significance level α.

Ties in the Kruskal-Wallis Test

When observations are tied, assign an average rank to each of the tied observations. When there are ties, we should replace the test statistic in equation 16-8 with

K = (1/S²) [Σ_{i=1}^{a} R_i·²/n_i − N(N + 1)²/4],   (16-9)

where n_i is the number of observations in the ith treatment, N is the total number of observations, and

S² = [1/(N − 1)] [Σ_{i=1}^{a} Σ_{j=1}^{n_i} R_ij² − N(N + 1)²/4].   (16-10)
Note that S² is just the variance of the ranks. When the number of ties is moderate, there will be little difference between equations 16-8 and 16-9, and the simpler form (equation 16-8) may be used.
In Design and Analysis of Experiments, 5th Edition (John Wiley & Sons, 2001), D. C. Montgomery presents data from an experiment in which five different levels of cotton content in a synthetic fiber were tested to determine if cotton content has any effect on fiber tensile strength. The sample data and ranks from this experiment are shown in Table 16-3. Since there is a fairly large number of ties, we use equation 16-9 as the test statistic. From equation 16-10 we find
S² = [1/(N − 1)] [Σ_{i=1}^{a} Σ_{j=1}^{n_i} R_ij² − N(N + 1)²/4]
   = (1/24) [5497.79 − 25(26)²/4]
   = 53.03.
Table 16-3 Data and Ranks for the Tensile Testing Experiment

Percentage of Cotton
    15            20            25            30            35
y     rank     y     rank    y     rank    y     rank    y     rank
7     2.0      12    9.5     14    11.0    19    20.5    7     2.0
7     2.0      17    14.0    18    16.5    25    25.0    10    5.0
15    12.5     12    9.5     18    16.5    22    23.0    11    7.0
11    7.0      18    16.5    19    20.5    19    20.5    15    12.5
9     4.0      18    16.5    19    20.5    23    24.0    11    7.0
R_i·  27.5           66.0          85.0          113.0         33.5
and the test statistic is
K = (1/S²) [Σ_{i=1}^{a} R_i·²/n_i − N(N + 1)²/4]
  = (1/53.03) [5245.7 − 25(26)²/4]
  = 19.25.

Since K > χ²_{0.01,4} = 13.28, we would reject the null hypothesis and conclude that the treatments differ. This is the same conclusion given by the usual analysis of variance F-test.
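scipy.stats.kruskal applies the same tie correction in an algebraically equivalent form, so it reproduces this analysis from the raw strengths of Table 16-3 (small differences from the hand-rounded figures above are expected):

```python
from scipy.stats import kruskal

# Tensile strengths at 15, 20, 25, 30, and 35 percent cotton (Table 16-3)
y15 = [7, 7, 15, 11, 9]
y20 = [12, 17, 12, 18, 18]
y25 = [14, 18, 18, 19, 19]
y30 = [19, 25, 22, 19, 23]
y35 = [7, 10, 11, 15, 11]

stat, pvalue = kruskal(y15, y20, y25, y30, y35)
print(round(stat, 2), pvalue < 0.01)  # K near the hand value; H0 is rejected
```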
16-5.2 The Rank Transformation

The procedure used in the previous section of replacing the observations by their ranks is called the rank transformation. It is a very powerful and widely useful technique. If we were to apply the ordinary F-test to the ranks rather than to the original data, we would obtain

F₀ = [K/(a − 1)] / [(N − 1 − K)/(N − a)]

as the test statistic. Note that as the Kruskal-Wallis statistic K increases or decreases, F₀ also increases or decreases, so the Kruskal-Wallis test is nearly equivalent to applying the usual analysis of variance to the ranks.

The rank transformation has wide applicability in experimental design problems for which no nonparametric alternative to the analysis of variance exists. If the data are ranked and the ordinary F-test applied, an approximate procedure results, but one that has good statistical properties. When we are concerned about the normality assumption or the effect of outliers or "wild" values, we recommend that the usual analysis of variance be performed on both the original data and the ranks. When both procedures give similar results, the analysis of variance assumptions are probably satisfied reasonably well, and the standard analysis is satisfactory. When the two procedures differ, the rank transformation should be preferred since it is less likely to be distorted by nonnormality and unusual observations. In such cases, the experimenter may want to investigate the use of transformations for nonnormality and examine the data and the experimental procedure to determine whether outliers are present and, if so, why they have occurred.
16-6
S~1ARY
This chapter has introduced nonparametric or distribution-free statistical methods. These procedures are alternatives to the usual parametric t~ and F-tests when the normality assumption for the underlying population is not satisfied. The sign test can be used to test h)'Potheses about the median of a continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test can be used to test hypotheses about the mean of a symmetric continuous distribution. It can also be applied to paired observations. The Wilcoxon signed rank test is a good alternative to the t~test. The two~sample hypothesistesting problem on means of continuous symmetric distributions is approached using the Wilcoxon rank-sum test. This procedure compares very favorably with the two-sample Hest. The Kruskal-Wallis test is a useful alternative to the F~testin the analysis of variance,
16-7
E,'{ercises
505
16-7 EXERCISES

16-1. Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH determined. The sample pH values are given below. Manufacturing engineering believes that pH has a median value of 7.0. Do the sample data indicate that this statement is correct? Use the sign test to investigate this hypothesis.

16-2. The titanium content in an aircraft-grade alloy is an important determinant of strength. A sample of 20 test coupons reveals the following titanium contents (in percent):

8.32, 8.05, 8.93, 8.65, 8.25, 8.46, 8.52, 8.35, 8.36, 8.41, 8.42, 8.30, 8.71, 8.75, 8.60, 8.83, 8.50, 8.38, 8.29, 8.46.

The median titanium content should be 8.5%. Use the sign test to investigate this hypothesis.
16-3. The distribution of the time between arrivals in a telecommunication system is exponential, and the system manager wishes to test the hypothesis H₀: μ̃ = 3.5 min versus H₁: μ̃ > 3.5 min.
(a) What is the value of the mean of the exponential distribution under H₀: μ̃ = 3.5?
(b) Suppose that we have taken a sample of n = 10 observations and we observe R⁻ = 3. Would the sign test reject H₀ at α = 0.05?
(c) What is the type II error of this test if μ̃ = 4.5?

16-4. Suppose that we take a sample of n = 10 measurements from a normal distribution with σ = 1. We wish to test H₀: μ = 0 against H₁: μ > 0. The normal test statistic is Z₀ = (X̄ − μ₀)/(σ/√n), and we decide to use a critical region of 1.96 (that is, reject H₀ if Z₀ ≥ 1.96).
(a) What is α for this test?
(b) What is β for this test if μ = 1?
(c) If a sign test is used, specify the critical region that gives an α value consistent with α for the normal test.
(d) What is the β value for the sign test if μ = 1? Compare this with the result obtained in part (b).

16-5. Two different types of tips can be used in a Rockwell hardness tester. Eight coupons from test ingots of a nickel-based alloy are selected, and each coupon is tested twice, once with each tip. The Rockwell C-scale hardness readings are shown next. Use
the sign test to determine whether or not the two tips produce equivalent hardness readings.

Coupon    Tip 1    Tip 2
1         63       60
2         52       51
3         58       56
4         60       59
5         55       58
6         57       54
7         53       52
8         59       61
16-6. Testing for Trends. A turbocharger wheel is manufactured using an investment casting process. The shaft fits into the wheel opening, and this wheel opening is a critical dimension. As wheel wax patterns are formed, the hard tool producing the wax patterns wears. This may cause growth in the wheel-opening dimension. Ten wheel-opening measurements, in time order of production, are shown below:

4.00 (mm), 4.02, 4.03, 4.01, 4.00, 4.03, 4.04, 4.02, 4.03, 4.03.

(a) Suppose that p is the probability that observation X_{i+5} exceeds observation X_i. If there is no upward or downward trend, then X_{i+5} is no more or less likely to exceed X_i or lie below X_i. What is the value of p?
(b) Let V be the number of values of i for which X_{i+5} > X_i. If there is no upward or downward trend in the measurements, what is the probability distribution of V?
(c) Use the data above and the results of parts (a) and (b) to test H₀: there is no trend, versus H₁: there is upward trend. Use α = 0.05.

Note that this test is a modification of the sign test. It was developed by Cox and Stuart.
16-7. Consider the Wilcoxon signed rank test, and suppose that n = 5. Assume that H₀: μ = μ₀ is true.
(a) How many different sequences of signed ranks are possible? Enumerate these sequences.
(b) How many different values of R⁺ are there? Find the probability associated with each value of R⁺.
(c) Suppose that we define the critical region of the test to be such that we would reject if R⁺ > R*_α, and R*_α = 13. What is the approximate α level of this test?
(d) Can you see from this exercise how the critical values for the Wilcoxon signed rank test were developed? Explain.
16-8. Consider the data in Exercise 16-1, and assume that the distribution of pH is symmetric and continuous. Use the Wilcoxon signed rank test to test the hypothesis H₀: μ = 7 against H₁: μ ≠ 7.

16-9. Consider the data in Exercise 16-2. Suppose that the distribution of titanium content is symmetric and continuous. Use the Wilcoxon signed rank test to test the hypotheses H₀: μ = 8.5 versus H₁: μ ≠ 8.5.

16-10. Consider the data in Exercise 16-2. Use the large-sample approximation for the Wilcoxon signed rank test to test the hypotheses H₀: μ = 8.5 versus H₁: μ ≠ 8.5. Assume that the distribution of titanium content is continuous and symmetric.
Data for Exercise 16-15:

Unit 1: 25, 27, 29, 31, 30, 26, 24, 32, 33, 38
Unit 2: 31, 33, 32, 35, 34, 29, 38, 35, 37, 30

16-16. In Design and Analysis of Experiments, 5th Edition (John Wiley & Sons, 2001), D. C. Montgomery presents the results of an experiment to compare four different mixing techniques on the tensile strength of portland cement. The results are shown below. Is there any indication that mixing technique affects the strength?
16-11. For the large-sample approximation to the Wilcoxon signed rank test, derive the mean and standard deviation of the test statistic used in the procedure.

16-12. Consider the Rockwell hardness test data in Exercise 16-5. Assume that both distributions are continuous and use the Wilcoxon signed rank test to test that the mean difference in hardness readings between the two tips is zero.

16-13. An electrical engineer must design a circuit to deliver the maximum amount of current to a display tube to achieve sufficient image brightness. Within his allowable design constraints, he has developed two candidate circuits and tests prototypes of each. The resulting data (in microamperes) are shown below:

Circuit 1: 251, 255, 258, 257, 250, 251, 254, 250, 248
Circuit 2: 250, 253, 249, 256, 259, 252, 250, 251

Use the Wilcoxon rank-sum test to test H₀: μ₁ = μ₂ against the alternative H₁: μ₁ > μ₂.
16-14. A consultant frequently travels from Phoenix, Arizona, to Los Angeles, California. He will use one of two airlines, United or Southwest. The number of minutes that his flight arrived late for the last six trips on each airline is shown below. Is there evidence that either airline has superior on-time arrival performance?

United:    19, 4, 8, −2, 8, 0 (minutes late)
Southwest: 20, 2, 8, −3, 5, ... (minutes late)
16-15. The manufacturer of a hot tub is interested in testing two different heating elements for his product. The element that produces the maximum heat gain after 15 minutes would be preferable. He obtains 10 samples of each heating unit and tests each one. The heat gain after 15 minutes (in °F) is shown in the Unit 1 and Unit 2 data above. Is there any reason to suspect that one unit is superior to the other?
Mixing Technique    Tensile Strength (lb/in.²)
1                   3129    3000    2865    2890
2                   3200    3000    2975    3150
3                   2800    2900    2985    3050
4                   2600    2700    2600    2765

16-17.
An article in the Quality Control Handbook, 3rd Edition (McGraw-Hill, 1962), presents the results of an experiment performed to investigate the effect of three different conditioning methods on the breaking strength of cement briquettes. The data are shown below. Is there any indication that conditioning method affects breaking strength?

Conditioning Method    Breaking Strength (lb/in.²)
1                      553    550    568    541    537
2                      553    599    579    545    540
3                      492    530    528    510    571
16-18. In Statistics for Research (John Wiley & Sons, 1983), S. Dowdy and S. Wearden present the results of an experiment to measure stress resulting from operating hand-held chain saws. The experimenters measured the kickback angle through which the saw is deflected when it begins to cut a 3-inch stock synthetic board. Shown below are deflection angles for five saws chosen at random from each of four different manufacturers. Is there any evidence that the manufacturers' products differ with respect to kickback angle?

Manufacturer    Kickback Angle
A               42    17    24    39    43
B               28    50    44    32    61
C               57    45    48    41    54
D               29    40    22    34    30
Chapter 17

Statistical Quality Control and Reliability Engineering

The quality of the products and services used by our society has become a major consumer decision factor in many, if not most, businesses today. Regardless of whether the consumer is an individual, a corporation, a military defense program, or a retail store, the consumer is likely to consider quality of equal importance to cost and schedule. Consequently, quality improvement has become a major concern of many U.S. corporations. This chapter is about statistical quality control and reliability engineering methods, two sets of tools that are essential in quality-improvement activities.
17-1 QUALITY IMPROVEMENT AND STATISTICS

Quality means fitness for use. For example, we may purchase automobiles that we expect to be free of manufacturing defects and that should provide reliable and economical transportation; a retailer buys finished goods with the expectation that they are properly packaged and arranged for easy storage and display; or a manufacturer buys raw material and expects to process it with minimal rework or scrap. In other words, all consumers expect that the products and services they buy will meet their requirements, and those requirements define fitness for use.

Quality, or fitness for use, is determined through the interaction of quality of design and quality of conformance. By quality of design we mean the different grades or levels of performance, reliability, serviceability, and function that are the result of deliberate engineering and management decisions. By quality of conformance, we mean the systematic reduction of variability and elimination of defects until every unit produced is identical and defect free.

There is some confusion in our society about quality improvement; some people still think that it means gold plating a product or spending more money to develop a product or process. This thinking is wrong. Quality improvement means the systematic elimination of waste. Examples of waste include scrap and rework in manufacturing, inspection and test, errors on documents (such as engineering drawings, checks, purchase orders, and plans), customer complaint hotlines, warranty costs, and the time required to do things over again that could have been done right the first time. A successful quality-improvement effort can eliminate much of this waste and lead to lower costs, higher productivity, increased customer satisfaction, increased business reputation, higher market share, and ultimately higher profits for the company.
Statistical methods play a vital role in quality improvement. Some applications include the following:

1. In product design and development, statistical methods, including designed experiments, can be used to compare different materials and different components or ingredients, and to help in both system and component tolerance determination. This can significantly lower development costs and reduce development time.

2. Statistical methods can be used to determine the capability of a manufacturing process. Statistical process control can be used to systematically improve a process by reduction of variability.

3. Experiment design methods can be used to investigate improvements in the process. These improvements can lead to higher yields and lower manufacturing costs.

4. Life testing provides reliability and other performance data about the product. This can lead to new and improved designs and products that have longer useful lives and lower operating and maintenance costs.

Some of these applications have been illustrated in earlier chapters of this book. It is essential that engineers and managers have an in-depth understanding of these statistical tools in any industry or business that wants to be the high-quality, low-cost producer. In this chapter we give an introduction to the basic methods of statistical quality control and reliability engineering that, along with experimental design, form the basis of a successful quality-improvement effort.
17-2 STATISTICAL QUALITY CONTROL

The field of statistical quality control can be broadly defined as consisting of those statistical and engineering methods useful in the measurement, monitoring, control, and improvement of quality. In this chapter, a somewhat more narrow definition is employed. We will define statistical quality control as the statistical and engineering methods for process control.

Statistical quality control is a relatively new field, dating back to the 1920s. Dr. Walter A. Shewhart of the Bell Telephone Laboratories was one of the early pioneers of the field. In 1924, he wrote a memorandum showing a modern control chart, one of the basic tools of statistical process control. Harold F. Dodge and Harry G. Romig, two other Bell System employees, provided much of the leadership in the development of statistically based sampling and inspection methods. The work of these three men forms the basis of the modern field of statistical quality control. World War II saw the widespread introduction of these methods to U.S. industry. Dr. W. Edwards Deming and Dr. Joseph M. Juran have been instrumental in spreading statistical quality-control methods since World War II.

The Japanese have been particularly successful in deploying statistical quality-control methods and have used statistical methods to gain significant advantage relative to their competitors. In the 1970s American industry suffered extensively from Japanese (and other foreign) competition, and that has led, in turn, to renewed interest in statistical quality-control methods in the United States. Much of this interest focuses on statistical process control and experimental design. Many U.S. companies have begun extensive programs to implement these methods into their manufacturing, engineering, and other business organizations.
17-3 STATISTICAL PROCESS CONTROL

It is impossible to inspect quality into a product; the product must be built right the first time. This implies that the manufacturing process must be stable or repeatable and capable
of operating with little variability around the target or nominal dimension. Online statistical process controls are powerful tools useful in achieving process stability and improving capability through the reduction of variability.
It is customary to think of statistical process control (SPC) as a set of problem-solving tools that may be applied to any process. The major tools of SPC are the following:
1. Histogram
2. Pareto chart
3. Cause-and-effect diagram
4. Defect-concentration diagram
5. Control chart
6. Scatter diagram
7. Check sheet
While these tools are an important part of SPC, they really constitute only the technical aspect of the subject. SPC is an attitude, a desire of all individuals in the organization for continuous improvement in quality and productivity by the systematic reduction of variability. The control chart is the most powerful of the SPC tools. We now give an introduction to several basic types of control charts.
17-3.1 Introduction to Control Charts
The basic theory of the control chart was developed by Walter Shewhart in the 1920s. To understand how a control chart works, we must first understand Shewhart's theory of variation. Shewhart theorized that all processes, however good, are characterized by a certain amount of variation if we measure with an instrument of sufficient resolution. When this variability is confined to random or chance variation only, the process is said to be in a state of statistical control. However, another situation may exist in which the process variability is also affected by some assignable cause, such as a faulty machine setting, operator error, unsatisfactory raw material, worn machine components, and so on.¹ These assignable causes of variation usually have an adverse effect on product quality, so it is important to have some systematic technique for detecting serious departures from a state of statistical control as soon after they occur as possible. Control charts are principally used for this purpose.
The power of the control chart lies in its ability to distinguish assignable causes from random variation. It is the job of the individual using the control chart to identify the underlying root cause responsible for the out-of-control condition, develop and implement an appropriate corrective action, and then follow up to ensure that the assignable cause has been eliminated from the process. There are three points to remember.
1. A state of statistical control is not a natural state for most processes.
2. The attentive use of control charts will result in the elimination of assignable causes, yielding an in-control process and reduced process variability.
3. The control chart is ineffective without the system to develop and implement corrective actions that attack the root causes of problems. Management and engineering involvement is usually necessary to accomplish this.
¹Sometimes common cause is used instead of "random" or "chance cause," and special cause is used instead of "assignable cause."
Chapter 17 Statistical Quality Control and Reliability Engineering
We distinguish between control charts for measurements and control charts for attributes, depending on whether the observations on the quality characteristic are measurements or enumeration data. For example, we may choose to measure the diameter of a shaft, say with a micrometer, and utilize these data in conjunction with a control chart for measurements. On the other hand, we may judge each unit of product as either defective or nondefective and use the fraction of defective units found or the total number of defects in conjunction with a control chart for attributes. Obviously, certain products and quality characteristics lend themselves to analysis by either method, and a clear-cut choice between the two methods may be difficult.
A control chart, whether for measurements or attributes, consists of a centerline, corresponding to the average quality at which the process should perform when statistical control is exhibited, and two control limits, called the upper and lower control limits (UCL and LCL). A typical control chart is shown in Fig. 17-1. The control limits are chosen so that values falling between them can be attributed to chance variation, while values falling beyond them can be taken to indicate a lack of statistical control. The general approach consists of periodically taking a random sample from the process, computing some appropriate quantity, and plotting that quantity on the control chart. When a sample value falls outside the control limits, we search for some assignable cause of variation. However, even if a sample value falls between the control limits, a trend or some other systematic pattern may indicate that some action is necessary, usually to avoid more serious trouble. The samples should be selected in such a way that each sample is as homogeneous as possible and at the same time maximizes the opportunity for variation due to an assignable cause to be present. This is usually called the rational subgroup concept.
Order of production and source (if more than one source exists) are commonly used bases for obtaining rational subgroups. The ability to interpret control charts accurately is usually acquired with experience. It is necessary that the user be thoroughly familiar with both the statistical foundation of control charts and the nature of the production process itself.
17-3.2 Control Charts for Measurements
When dealing with a quality characteristic that can be expressed as a measurement, it is customary to exercise control over both the average value of the quality characteristic and its variability. Control over the average quality is exercised by the control chart for means, usually called the X̄ chart. Process variability can be controlled by either a range (R) chart or a
Figure 17-1 A typical control chart (centerline with upper and lower control limits, plotted against sample number).
standard deviation chart, depending on how the population standard deviation is estimated. We will discuss only the R chart.
Suppose that the process mean and standard deviation, say μ and σ, are known, and, furthermore, that we can assume that the quality characteristic follows the normal distribution. Let X̄ be the sample mean based on a random sample of size n from this process. Then the probability is 1 − α that the mean of such random samples will fall between μ + Z_{α/2}(σ/√n) and μ − Z_{α/2}(σ/√n). In practice, μ and σ are usually unknown and must be estimated from preliminary samples taken while the process is thought to be in control. Suppose k such samples are available, each of size n, and let X̄ᵢ denote the mean of the ith sample. Then the grand mean is

X̿ = (1/k) Σᵢ₌₁ᵏ X̄ᵢ.   (17-1)

Thus, we may take X̿ as the centerline on the X̄ control chart. We may estimate σ from either the standard deviations or the ranges of the k samples. Since it is more frequently used in practice, we confine our discussion to the range method. The sample size is relatively small, so there is little loss in efficiency in estimating σ from the sample ranges.
The relationship between the range, R, of a sample from a normal population with known parameters and the standard deviation of that population is needed. Since R is a random variable, the quantity W = R/σ, called the relative range, is also a random variable. The parameters of the distribution of W have been determined for any sample size n. The mean of the distribution of W is called d₂, and a table of d₂ for various n is given in Table XIII of the Appendix. Let Rᵢ be the range of the ith sample, and let
R̄ = (1/k) Σᵢ₌₁ᵏ Rᵢ   (17-2)

be the average range. Then an estimate of σ would be
σ̂ = R̄/d₂.   (17-3)
Therefore, we may use as our upper and lower control limits for the X̄ chart

UCL = X̿ + 3R̄/(d₂√n),
LCL = X̿ − 3R̄/(d₂√n).   (17-4)

We note that the quantity

A₂ = 3/(d₂√n)
is a constant depending on the sample size, so it is possible to rewrite equations 17-4 as

UCL = X̿ + A₂R̄,
LCL = X̿ − A₂R̄.   (17-5)
The constant A₂ is tabulated for various sample sizes in Table XIII of the Appendix.
The parameters of the R chart may also be easily determined. The centerline will obviously be R̄. To determine the control limits, we need an estimate of σ_R, the standard deviation of R. Once again, assuming the process is in control, the distribution of the relative range, W, will be useful. The standard deviation of W, say σ_W, is a function of n, which has been determined. Thus, since

R = Wσ,

we may obtain the standard deviation of R as

σ_R = σ_W σ.

As σ is unknown, we may estimate σ_R as

σ̂_R = σ_W R̄/d₂,

and we would use as the upper and lower control limits on the R chart

UCL = R̄ + 3σ_W R̄/d₂,
LCL = R̄ − 3σ_W R̄/d₂.   (17-6)

Setting D₃ = 1 − 3σ_W/d₂ and D₄ = 1 + 3σ_W/d₂, we may rewrite equation 17-6 as

UCL = D₄R̄,
LCL = D₃R̄,   (17-7)

where D₃ and D₄ are tabulated in Table XIII of the Appendix.
where D, and D4 are tabulated in Table XIII of the Appendix. When preliminary samples are used to construct limits for control charts, it is cus~ tomary to treat these limits as trial values. Therefore, the k sample means and ranges should be plotted on the appropriate charts, and any points that exceed the control limits should be investigated. If assignable causes for these points are discovered, they should be eliminated and new limits for the·contrel charts determined. fn this way, the process may eventually be brought into statistical control and its inherent capabilities assessed. Other changes in process centering and dispersion may then be contemplated,
Example 17-1. A component part for a jet aircraft engine is manufactured by an investment casting process. The vane opening on this casting is an important functional parameter of the part. We will illustrate the use of X̄ and R control charts to assess the statistical stability of this process. Table 17-1 presents 20 samples of five parts each. The values given in the table have been coded by using the last three digits of the dimension; that is, 31.6 should be 0.50316 inch. The quantities X̿ = 33.33 and R̄ = 5.85 are shown at the foot of Table 17-1. Notice that even though X, X̄, R, and R̄ are now realizations of random variables, we have still written them as
Table 17-1 Vane Opening Measurements (sample means and ranges)

Sample Number    X̄       R
1               31.6     4
2               33.2     6
3               35.0     4
4               32.2     4
5               33.8     2
6               38.4     3
7               31.6     4
8               36.8    10
9               35.2    15
10              34.0     7
11              29.5     1
12              34.0     4
13              33.0    10
14              34.8     4
15              35.6     7
16              30.8     6
17              33.0     5
18              31.6     3
19              28.2     9
20              33.8     6
                X̿ = 33.33    R̄ = 5.85
uppercase letters. This is the usual convention in quality control, and it will always be clear from the context what the notation implies.
The trial control limits are, for the X̄ chart,

X̿ ± A₂R̄ = 33.33 ± (0.577)(5.85) = 33.33 ± 3.37,

or

UCL = 36.70,
LCL = 29.96.

For the R chart, the trial control limits are

UCL = D₄R̄ = (2.115)(5.85) = 12.37,
LCL = D₃R̄ = (0)(5.85) = 0.
The X̄ and R control charts with these trial control limits are shown in Fig. 17-2. Notice that samples 6, 8, 11, and 19 are out of control on the X̄ chart, and that sample 9 is out of control on the R chart. Suppose that all of these assignable causes can be traced to a defective tool in the wax-molding area. We should discard these five samples and recompute the limits for the X̄ and R charts. The new revised limits are, for the X̄ chart,

UCL = X̿ + A₂R̄ = 32.90 + (0.577)(5.313) = 35.96,
LCL = X̿ − A₂R̄ = 32.90 − (0.577)(5.313) = 29.84,

and, for the R chart, they are

UCL = D₄R̄ = (2.115)(5.067) = 10.71,
LCL = D₃R̄ = (0)(5.067) = 0.
Figure 17-2 The X̄ and R control charts for vane opening (X̄ chart: UCL = 36.70, mean = 33.33, LCL = 29.96; R chart: UCL = 12.37, LCL = 0).
The revised control charts are shown in Fig. 17-3. Notice that we have treated the first 20 preliminary samples as estimation data with which to establish control limits. These limits can now be used to judge the statistical control of future production. As each new sample becomes available, the values of X̄ and R should be computed and plotted on the control charts. It may be desirable to revise the limits periodically, even if the process remains stable. The limits should always be revised when process improvements are made.
Estimating Process Capability
It is usually necessary to obtain some information about the capability of the process, that is, about the performance of the process when it is operating in control. Two graphical tools, the tolerance chart (or tier chart) and the histogram, are helpful in assessing process capability. The tolerance chart for all 20 samples from the vane manufacturing process is shown in Fig. 17-4. The specifications on vane opening, 0.5030 ± 0.001 inch, are also shown on the chart. In terms of the coded data, the upper specification limit is USL = 40 and the lower specification limit is LSL = 20. The tolerance chart is useful in revealing patterns over time in the individual measurements, or it may show that a particular value of X̄ or R was produced by one or two unusual observations in the sample. For example, note the two unusual observations in sample 9 and the single unusual observation in sample 8. Note also that it
Figure 17-3 X̄ and R control charts for vane opening, revised limits (X̄ chart: UCL = 35.96, LCL = 29.84; R chart: UCL = 10.71, R̄ = 5.067, LCL = 0; points from the discarded samples are not used in computing the control limits).
Figure 17-4 Tolerance diagram of vane openings (USL = 40, LSL = 20, nominal dimension = 30).
is appropriate to plot the specification limits on the tolerance chart, since it is a chart of individual measurements. It is never appropriate to plot specification limits on a control chart, or to use the specifications in determining the control limits. Specification limits and control limits are unrelated. Finally, note from Fig. 17-4 that the process is running off center from the nominal dimension of 0.5030 inch.
The histogram for the vane opening measurements is shown in Fig. 17-5. The observations from samples 6, 8, 9, 11, and 19 have been deleted from this histogram. The general impression from examining this histogram is that the process is capable of meeting the specifications, but that it is running off center.
Another way to express process capability is in terms of the process capability ratio (PCR), defined as

PCR = (USL − LSL)/(6σ).   (17-8)

Notice that the 6σ spread (3σ on either side of the mean) is sometimes called the basic capability of the process. The limits 3σ on either side of the process mean are sometimes called natural tolerance limits, as these represent limits that an in-control process should meet with most of the units produced. For the vane opening, we could estimate σ as σ̂ = R̄/d₂, and the resulting estimate of the PCR is 1.62.
The PCR has a natural interpretation: (1/PCR)100 is just the percentage of the tolerance band used by the process. Thus, the vane opening process uses approximately (1/1.62)100 = 61.7% of the tolerance band.
Figure 17-6a shows a process for which the PCR exceeds unity. Since the process natural tolerance limits lie inside the specifications, very few defective or nonconforming units will be produced. If PCR = 1, as shown in Fig. 17-6b, more nonconforming units result. In fact, for a normally distributed process, if PCR = 1, the fraction nonconforming is 0.27%, or 2700 parts per million. Finally, when the PCR is less than unity, as in Fig. 17-6c, the process is very yield sensitive and a large number of nonconforming units will be produced.
The definition of the PCR given in equation 17-8 implicitly assumes that the process is centered at the nominal dimension. If the process is running off center, its actual capability will be less than that indicated by the PCR. It is convenient to think of PCR as a measure of potential capability, that is, capability with a centered process. If the process is not centered, then a measure of actual capability is given by

PCRₖ = min[(USL − X̿)/(3σ̂), (X̿ − LSL)/(3σ̂)].   (17-9)

Figure 17-6 Process fallout and the process capability ratio (PCR).
In effect, PCRₖ is a one-sided process capability ratio that is calculated relative to the specification limit nearest to the process mean. For the vane opening process, we find that PCRₖ = 1.10.
Note that if PCR = PCRₖ, the process is centered at the nominal dimension. Since PCRₖ = 1.10 for the vane opening process, and PCR = 1.62, the process is obviously running off center, as was first noted in Figs. 17-4 and 17-5. This off-center operation was ultimately traced to an oversized wax tool. Changing the tooling resulted in a substantial improvement in the process.
Montgomery (2001) provides guidelines on appropriate values of the PCR and a table relating the fallout for a normally distributed process in statistical control as a function of the PCR. Many U.S. companies use PCR = 1.33 as a minimum acceptable target and PCR = 1.66 as a minimum target for strength, safety, or critical characteristics. Also, some U.S. companies, particularly in the automobile industry, have adopted the Japanese terminology Cₚ = PCR and Cₚₖ = PCRₖ. As Cₚ has another meaning in statistics (in multiple regression; see Chapter 15), we prefer the traditional notation PCR and PCRₖ.
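The capability ratios in equations 17-8 and 17-9 are simple to compute. The sketch below is ours; the mean and σ values in the usage check are illustrative, not the estimates from the vane opening example.

```python
def pcr(usl, lsl, sigma):
    """Process capability ratio, eq. 17-8."""
    return (usl - lsl) / (6.0 * sigma)

def pcr_k(usl, lsl, mean, sigma):
    """Actual (one-sided) capability ratio, eq. 17-9."""
    return min((usl - mean) / (3.0 * sigma),
               (mean - lsl) / (3.0 * sigma))
```

For a centered process pcr_k equals pcr; the further the mean drifts toward one specification limit, the smaller pcr_k becomes, which is exactly the potential-versus-actual distinction made above.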
17-3.3 Control Charts for Individual Measurements
Many situations exist in which the sample consists of a single observation; that is, n = 1. These situations occur when production is very slow or costly and it is impractical to allow the sample size to be greater than one. Other cases include processes where every observation can be measured due to automated inspection, for example. The Shewhart control chart for individual measurements is appropriate for this type of situation. We will see later in this chapter that the exponentially weighted moving average control chart and the cumulative sum control chart may be more informative than the individuals chart.
The Shewhart control chart uses the moving range, MR, of two successive observations for estimating the process variability. The moving range is defined as

MRᵢ = |xᵢ − xᵢ₋₁|.

For example, for m observations, m − 1 moving ranges are calculated as MR₂ = |x₂ − x₁|, MR₃ = |x₃ − x₂|, ..., MRₘ = |xₘ − xₘ₋₁|. Simultaneous control charts can be established on the individual observations and on the moving range. The control limits for the individuals control chart are calculated as

UCL = x̄ + 3(M̄R/d₂),
Centerline = x̄,
LCL = x̄ − 3(M̄R/d₂),   (17-10)

where M̄R is the sample mean of the MRᵢ.
If a moving range of size n = 2 is used, then d₂ = 1.128 from Table XIII of the Appendix. The control limits for the moving-range control chart are

UCL = D₄M̄R,
Centerline = M̄R,
LCL = D₃M̄R.   (17-11)
Example 17-2. Batches of a particular chemical product are selected from a process and the purity measured on each. Data for 15 successive batches have been collected and are given in Table 17-2. The moving ranges of size n = 2 are also displayed in Table 17-2. To set up the control chart for individuals, we first need the sample average of the 15 purity measurements. This average is found to be x̄ = 0.757. The average of the moving ranges of two observations is M̄R = 0.046. The control limits for the individuals chart with moving ranges of size 2, using the limits in equation 17-10, are

UCL = 0.757 + 3(0.046/1.128) = 0.879,
Centerline = 0.757,
LCL = 0.757 − 3(0.046/1.128) = 0.635.

The control limits for the moving-range chart are found using the limits given in equation 17-11:

UCL = 3.267(0.046) = 0.150,
Centerline = 0.046,
LCL = 0(0.046) = 0.
Table 17-2 Purity of Chemical Product

Batch    x       Moving Range, MR
1       0.77
2       0.76     0.01
3       0.77     0.01
4       0.72     0.05
5       0.73     0.01
6       0.73     0.00
7       0.85     0.12
8       0.70     0.15
9       0.75     0.05
10      0.74     0.01
11      0.75     0.01
12      0.84     0.09
13      0.79     0.05
14      0.72     0.07
15      0.74     0.02
        x̄ = 0.757    M̄R = 0.046
Figure 17-7 Control charts for (a) the individual observations (UCL = 0.879, mean = 0.757, LCL = 0.635) and (b) the moving range (UCL = 0.150, M̄R = 0.046, LCL = 0) of purity.
The control charts for individual observations and for the moving range are provided in Fig. 17-7. Since there are no points beyond the control limits, the process appears to be in statistical control.
The individuals chart can be interpreted much like the X̄ control chart. An out-of-control situation would be indicated by either a point (or points) plotting beyond the control limits or a pattern such as a run on one side of the centerline. The moving-range chart cannot be interpreted in the same way. Although a point (or points) plotting beyond the control limits would likely indicate an out-of-control situation, a pattern or run on one side of the centerline is not necessarily an indication that the process is out of control. This is due to the fact that the moving ranges are correlated, and this correlation may naturally cause patterns or trends on the chart.
17-3.4 Control Charts for Attributes
The p Chart (Fraction Defective or Nonconforming)
Often it is desirable to classify a product as either defective or nondefective on the basis of comparison with a standard. This is usually done to achieve economy and simplicity in the inspection operation. For example, the diameter of a ball bearing may be checked by determining whether it will pass through a gauge consisting of circular holes cut in a template. This would be much simpler than measuring the diameter with a micrometer. Control charts for attributes are used in these situations. However, attribute control charts require a considerably larger sample size than do their measurements counterparts. We will discuss the fraction-defective chart, or p chart, and two charts for defects, the c and u charts. Note that it is possible for a unit to have many defects and be either defective or nondefective. In some applications a unit can have several defects, yet be classified as nondefective.
Suppose D is the number of defective units in a random sample of size n. We assume that D is a binomial random variable with unknown parameter p. The sample fraction defective is an estimator of p, that is,

p̂ = D/n.   (17-12)

Furthermore, the variance of the statistic p̂ is

σ²_p̂ = p(1 − p)/n,

so we may estimate σ²_p̂ as

σ̂²_p̂ = p̂(1 − p̂)/n.   (17-13)

The centerline and control limits for the fraction-defective control chart may now be easily determined. Suppose k preliminary samples are available, each of size n, and Dᵢ is the number of defectives in the ith sample. Then we may take

p̄ = (1/kn) Σᵢ₌₁ᵏ Dᵢ   (17-14)

as the centerline and

UCL = p̄ + 3√(p̄(1 − p̄)/n),
LCL = p̄ − 3√(p̄(1 − p̄)/n)   (17-15)

as the upper and lower control limits, respectively. These control limits are based on the normal approximation to the binomial distribution. When p is small, the normal approximation may not always be adequate. In such cases, it is best to use control limits obtained directly from a table of binomial probabilities or, perhaps, from the Poisson approximation to the binomial distribution. If p is small, the lower control limit may be a negative number. If this should occur, it is customary to consider zero as the lower control limit.
Example 17-3. Suppose we wish to construct a fraction-defective control chart for a ceramic substrate production line. We have 20 preliminary samples, each of size 100; the numbers of defectives in each sample are shown in Table 17-3. Assume that the samples are numbered in the sequence of production. Note that p̄ = 790/2000 = 0.395, and therefore the trial parameters for the control chart are

Centerline = 0.395,
UCL = 0.395 + 3√((0.395)(0.605)/100) = 0.5417,
LCL = 0.395 − 3√((0.395)(0.605)/100) = 0.2483.
Table 17-3 Number of Defectives in Samples of 100 Ceramic Substrates

Sample    No. of Defectives    Sample    No. of Defectives
1         44                   11        36
2         48                   12        52
3         32                   13        35
4         50                   14        41
5         29                   15        42
6         31                   16        30
7         46                   17        46
8         52                   18        38
9         44                   19        26
10        38                   20        30
The control chart is shown in Fig. 17-8. All samples are in control. If they were not, we would search for assignable causes of variation and revise the limits accordingly. Although this process exhibits statistical control, its capability (p̄ = 0.395) is very poor. We should take appropriate steps to investigate the process to determine why such a large number of defective units are being produced. Defective units should be analyzed to determine the specific types of defects present. Once the defect types are known, process changes should be investigated to determine their impact on defect levels. Designed experiments may be useful in this regard.
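The p chart limits of Example 17-3 follow from equations 17-14 and 17-15. In this sketch the defective counts are hypothetical values chosen only so that they reproduce the example's p̄ = 0.395; they are not the Table 17-3 data.

```python
import math

def p_chart_limits(defectives, n):
    """Centerline and trial limits for a fraction-defective (p) chart."""
    k = len(defectives)
    pbar = sum(defectives) / (k * n)                     # eq. 17-14
    half_width = 3 * math.sqrt(pbar * (1 - pbar) / n)    # eq. 17-15
    return max(0.0, pbar - half_width), pbar, pbar + half_width

# Hypothetical counts: 20 samples of n = 100 with 790 total defectives.
lcl, center, ucl = p_chart_limits([40] * 10 + [39] * 10, n=100)
```

The max(0.0, ...) clamp implements the convention above of setting a negative lower control limit to zero.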
Example 17-4. Attributes Versus Measurements Control Charts. The advantage of measurement control charts relative to the p chart with respect to the size of sample may be easily illustrated. Suppose that a normally distributed quality characteristic has a standard deviation of 4 and specification limits of 52 and 68. The process is centered at 60, which results in a fraction defective of 0.0454. Let the process mean shift to 56. Now the fraction defective is 0.1601. If the
Figure 17-8 The p chart for a ceramic substrate (UCL = 0.5417, centerline = 0.395, LCL = 0.2483).
probability of detecting the shift on the first sample following the shift is to be 0.50, then the sample size must be such that the lower 3-sigma limit will be at 56. This implies

60 − 3(4/√n) = 56,

whose solution is n = 9. For a p chart, using the normal approximation to the binomial, we must have

0.0454 + 3√((0.0454)(0.9546)/n) = 0.1601,

whose solution is n = 30. Thus, unless the cost of measurement inspection is more than three times as costly as the attributes inspection, the measurement control chart is cheaper to operate.
The c Chart (Defects)
In some situations it may be necessary to control the number of defects in a unit of product rather than the fraction defective. In these situations we may use the control chart for defects, or the c chart. Suppose that in the production of cloth it is necessary to control the number of defects per yard, or that in assembling an aircraft wing the number of missing rivets must be controlled. Many defects-per-unit situations can be modeled by the Poisson distribution.
Let c be the number of defects in a unit, where c is a Poisson random variable with parameter α. Now the mean and variance of this distribution are both α. Therefore, if k units are available and cᵢ is the number of defects in unit i, the centerline of the control chart is

c̄ = (1/k) Σᵢ₌₁ᵏ cᵢ,   (17-16)

and

UCL = c̄ + 3√c̄,
LCL = c̄ − 3√c̄   (17-17)

are the upper and lower control limits, respectively.
Example 17-5. Printed circuit boards are assembled by a combination of manual assembly and automation. A flow solder machine is used to make the mechanical and electrical connections of the leaded components to the board. The boards are run through the flow solder process almost continuously, and every hour five boards are selected and inspected for process-control purposes. The number of defects in each sample of five boards is noted. Results for 20 samples are shown in Table 17-4. Now c̄ = 160/20 = 8, and therefore

UCL = 8 + 3√8 = 16.49,
LCL = 8 − 3√8 < 0, set to 0.

From the control chart in Fig. 17-9, we see that the process is in control. However, eight defects per group of five printed circuit boards is too many (about 8/5 = 1.6 defects/board), and the process needs improvement. An investigation needs to be made of the specific types of defects found on the printed circuit boards. This will usually suggest potential avenues for process improvement.
Table 17-4 Number of Defects in Samples of Five Printed Circuit Boards

Sample    No. of Defects    Sample    No. of Defects
1         6                 11        9
2         4                 12        15
3         8                 13        8
4         10                14        10
5         9                 15        8
6         12                16        2
7         16                17        7
8         2                 18        1
9         3                 19        7
10        10                20        13
Figure 17-9 The c chart for defects in samples of five printed circuit boards (UCL = 16.49, centerline = 8, LCL = 0).
The u Chart (Defects per Unit)
In some processes it may be preferable to work with the number of defects per unit rather than the total number of defects. Thus, if the sample consists of n units and there are c total defects in the sample, then

u = c/n

is the average number of defects per unit. A u chart may be constructed for such data. If there are k preliminary samples, each with u₁, u₂, ..., uₖ defects per unit, then the centerline on the u chart is

ū = (1/k) Σᵢ₌₁ᵏ uᵢ,   (17-18)

and the control limits are given by

UCL = ū + 3√(ū/n),
LCL = ū − 3√(ū/n).   (17-19)
Example 17-6. A u chart may be constructed for the printed circuit board defect data in Example 17-5. Since each sample contains n = 5 printed circuit boards, the values of u for each sample may be calculated as shown in the following display:

Sample    Number of Defects, c    Sample Size, n    Defects per Unit, u
1         6                       5                 1.2
2         4                       5                 0.8
3         8                       5                 1.6
4         10                      5                 2.0
5         9                       5                 1.8
6         12                      5                 2.4
7         16                      5                 3.2
8         2                       5                 0.4
9         3                       5                 0.6
10        10                      5                 2.0
11        9                       5                 1.8
12        15                      5                 3.0
13        8                       5                 1.6
14        10                      5                 2.0
15        8                       5                 1.6
16        2                       5                 0.4
17        7                       5                 1.4
18        1                       5                 0.2
19        7                       5                 1.4
20        13                      5                 2.6

The centerline for the u chart is

ū = (1/20) Σᵢ₌₁²⁰ uᵢ = 32/20 = 1.6,

and the upper and lower control limits are

UCL = ū + 3√(ū/n) = 1.6 + 3√(1.6/5) = 3.3,
LCL = ū − 3√(ū/n) = 1.6 − 3√(1.6/5) < 0, set to 0.

The control chart is plotted in Fig. 17-10. Notice that the u chart in this example is equivalent to the c chart in Fig. 17-9. In some cases, particularly when the sample size is not constant, the u chart will be preferable to the c chart. For a discussion of variable sample sizes on control charts, see Montgomery (2001).
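Since each sample here has constant size n = 5, the u chart limits of Example 17-6 can be reproduced from the same defect counts; the sketch below is ours.

```python
import math

defects = [6, 4, 8, 10, 9, 12, 16, 2, 3, 10,
           9, 15, 8, 10, 8, 2, 7, 1, 7, 13]     # Table 17-4
n = 5                                           # boards per sample

u = [c / n for c in defects]                    # defects per unit
ubar = sum(u) / len(u)                          # eq. 17-18, = 1.6
ucl = ubar + 3 * math.sqrt(ubar / n)            # eq. 17-19
lcl = max(0.0, ubar - 3 * math.sqrt(ubar / n))  # negative, so set to 0
```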
17-3.5 CUSUM and EWMA Control Charts
Up to this point in Chapter 17 we have presented the most basic of control charts, the Shewhart control charts. A major disadvantage of these control charts is their insensitivity to small shifts in the process (shifts often less than 1.5σ). This disadvantage is due to the fact that the Shewhart charts use information only from the current observation.
Figure 17-10 The u chart of defects per unit on printed circuit boards, Example 17-6 (UCL = 3.3, centerline = 1.6, LCL = 0).
Alternatives to Shewhart control charts include the cumulative sum control chart and the exponentially weighted moving average control chart. These control charts are more sensitive to small shifts in the process because they incorporate information from current and recent past observations.
Tabular CUSUM Control Charts for the Process Mean
The cumulative sum (CUSUM) control chart was first introduced by Page (1954) and incorporates information from a sequence of sample observations. The chart plots the cumulative sums of deviations of the observations from a target value. To illustrate, let x̄ⱼ represent the jth sample mean, let μ₀ represent the target value for the process mean, and say the sample size is n ≥ 1. The CUSUM control chart plots the quantity

Cᵢ = Σⱼ₌₁ⁱ (x̄ⱼ − μ₀)   (17-20)
against the sample i. The quantity Cᵢ is the cumulative sum up to and including the ith sample. As long as the process is in control at the target value μ₀, Cᵢ in equation 17-20 represents a random walk with a mean of zero. On the other hand, if the process shifts away from the target mean, then either an upward or downward drift in Cᵢ will be evident. By incorporating information from a sequence of observations, the CUSUM chart is able to detect a small shift in the process more quickly than a standard Shewhart chart.
CUSUM charts can be easily implemented for both subgroup data and individual observations. We will present the tabular CUSUM for individual observations. The tabular CUSUM involves two statistics, C⁺ᵢ and C⁻ᵢ, which are the accumulations of deviations above and below the target mean, respectively. C⁺ᵢ is called the one-sided upper CUSUM and C⁻ᵢ is called the one-sided lower CUSUM. The statistics are computed as follows:

C⁺ᵢ = max[0, xᵢ − (μ₀ + K) + C⁺ᵢ₋₁],   (17-21)
C⁻ᵢ = max[0, (μ₀ − K) − xᵢ + C⁻ᵢ₋₁],   (17-22)

with initial values C⁺₀ = C⁻₀ = 0. The constant K is referred to as the reference value and is often chosen approximately halfway between the target mean, μ₀, and the out-of-control
mean that we are interested in detecting, denoted μ1. In other words, K is half the magnitude of the shift from μ0 to μ1, or

K = |μ1 − μ0| / 2.

The statistics given in equations 17-21 and 17-22 accumulate the deviations from target that are larger than K and reset to zero when either quantity becomes negative. The CUSUM control chart plots the values of C_i^+ and C_i^− for each sample. If either statistic plots beyond the decision interval, H, the process is considered out of control. We will discuss the choice of H later in this chapter, but a good rule of thumb is often H = 5σ.
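The recursions in equations 17-21 and 17-22 are simple to program. The following sketch is illustrative only: the data, target, and chart parameters below are hypothetical values chosen for demonstration, not the Table 17-5 measurements.

```python
def tabular_cusum(x, mu0, k, h):
    """Tabular CUSUM of equations 17-21 and 17-22.

    x    : individual observations
    mu0  : target mean
    k, h : reference value K and decision interval H, in measurement units
    Returns the (C+, C-) history and the 1-based index of the first
    sample beyond H, or None if the chart never signals.
    """
    c_plus = c_minus = 0.0
    history, signal = [], None
    for i, xi in enumerate(x, start=1):
        c_plus = max(0.0, xi - (mu0 + k) + c_plus)    # deviations above target
        c_minus = max(0.0, (mu0 - k) - xi + c_minus)  # deviations below target
        history.append((c_plus, c_minus))
        if signal is None and (c_plus > h or c_minus > h):
            signal = i
    return history, signal

# Hypothetical batches: in control near 45, then a sustained upward shift.
data = [45.1, 44.8, 45.3, 44.9, 45.9, 46.0, 45.8, 46.1, 45.9, 46.2, 46.0, 45.9]
history, first_signal = tabular_cusum(data, mu0=45.0, k=0.42, h=4.2)
```

With these made-up values the upper CUSUM stays at zero while the process is on target, accumulates steadily after the shift, and first exceeds H = 4.2 at the twelfth batch.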
Example 17-7
A study presented in Food Control (2001, p. 119) gives the results of measuring the dry-matter content in buttercream from a batch process. One goal of the study is to monitor the amount of dry matter from batch to batch. Table 17-5 displays some data that may be typical of this type of process. The reported values, x_i, are percentage of dry-matter content examined after mixing. The target amount of dry-matter content is 45%, and assume that σ = 0.84%. Let us also assume that we are interested in detecting a shift in the process mean of at least 1σ; that is, μ1 = μ0 + 1σ = 45 + 1(0.84) = 45.84%. We will use the tabular CUSUM with K = (μ1 − μ0)/2 = 0.42 and H = 5σ = 4.2.
The CUSUM calculations given in Table 17-5 indicate that the upper-sided CUSUM for batch 17 is C_17^+ = 4.46, which exceeds the decision value of H = 4.2. Therefore, the process appears to have shifted out of control. The CUSUM status chart created using Minitab® with H = 4.2 is given in Fig. 17-11. The out-of-control situation is also evident on this chart at batch 17.
The CUSUM control chart is a powerful quality tool for detecting a process that has shifted from the target process mean. The correct choices of H and K can greatly improve the sensitivity of the control chart while protecting against the occurrence of false alarms (the process is actually in control, but the control chart signals out of control). Design recommendations for the CUSUM will be provided later in this chapter when the concept of average run length is introduced.
Figure 17-11 CUSUM status chart for the dry-matter content data (upper and lower CUSUM plotted against subgroup number).
We have presented the upper and lower CUSUM control charts for situations in which a shift in either direction away from the process target is of interest. There are many instances when we may be interested in a shift in only one direction, either upward or downward. One-sided CUSUM charts can be constructed for these situations. For a thorough development of these charts and more details, see Montgomery (2001).
EWMA Control Charts The exponentially weighted moving average (EWMA) control chart is also a good alternative to the Shewhart control chart when detecting a small shift in the process mean is of interest. We will present the EWMA for individual measurements, although the procedure can also be modified for subgroups of size n > 1. The EWMA control chart was first introduced by Roberts (1959). The EWMA is defined as
z_i = λx_i + (1 − λ)z_{i−1},    (17-23)

where λ is a weight, 0 < λ ≤ 1. The procedure to be presented is initialized with z_0 = μ0, the process target mean. If a target mean is unknown, then the average of preliminary data is used as the initial value of the EWMA. The definition given in equation 17-23 demonstrates that information from past observations is incorporated into the current value of z_i. The value z_i is a weighted average of the current and all previous observations. To illustrate, we can replace z_{i−1} on the right-hand side of equation 17-23 to obtain

z_i = λx_i + (1 − λ)[λx_{i−1} + (1 − λ)z_{i−2}]
    = λx_i + λ(1 − λ)x_{i−1} + (1 − λ)^2 z_{i−2}.

By recursively replacing z_{i−j}, j = 1, 2, ..., i, we find

z_i = λ Σ_{j=0}^{i−1} (1 − λ)^j x_{i−j} + (1 − λ)^i z_0.
The EWMA can be thought of as a weighted average of all past and current observations. Note that the weights decrease geometrically with the age of the observation, giving less weight to observations that occurred early in the process. The EWMA is often used in forecasting, but the EWMA control chart has been used extensively for monitoring many types of processes. If the observations x_i are independent random variables with variance σ^2, then the variance of z_i is

σ_{z_i}^2 = σ^2 (λ/(2 − λ))[1 − (1 − λ)^{2i}].
Given a target mean, μ0, and the variance of the EWMA, the upper control limit, centerline, and lower control limit for the EWMA control chart are

UCL = μ0 + Lσ √((λ/(2 − λ))[1 − (1 − λ)^{2i}]),
Centerline = μ0,
LCL = μ0 − Lσ √((λ/(2 − λ))[1 − (1 − λ)^{2i}]),

where L is the width of the control limits. Note that the term 1 − (1 − λ)^{2i} approaches 1 as i increases. Therefore, as the process continues running, the control limits for the EWMA approach the steady-state values

UCL = μ0 + Lσ √(λ/(2 − λ)),
LCL = μ0 − Lσ √(λ/(2 − λ)).    (17-24)
Although the control limits given in equation 17-24 provide good approximations, it is recommended that the exact limits be used for small values of i.
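The EWMA statistic of equation 17-23 and its exact limits can be sketched in a few lines. In the sketch below, the first data value (46.21) and the chart parameters match Example 17-8, but the remaining observations are made-up illustrative values.

```python
import math

def ewma_chart(x, mu0, sigma, lam, L):
    """EWMA control chart (equation 17-23) with exact, i-dependent limits.

    Returns a list of (z_i, LCL_i, UCL_i); the limits widen with i toward
    the steady-state values of equation 17-24.
    """
    z = mu0  # initialize the EWMA at the target mean
    points = []
    for i, xi in enumerate(x, start=1):
        z = lam * xi + (1 - lam) * z
        half_width = L * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * i))
        )
        points.append((z, mu0 - half_width, mu0 + half_width))
    return points

points = ewma_chart([46.21, 45.30, 44.90], mu0=45.0, sigma=0.84, lam=0.2, L=2.7)
z1, lcl1, ucl1 = points[0]   # z1 = 45.24, limits roughly 44.55 and 45.45
```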
Example 17-8
We will now implement the EWMA control chart with λ = 0.2 and L = 2.7 for the dry-matter content data provided in Table 17-5. Recall that the target mean is μ0 = 45% and the process standard deviation is assumed to be σ = 0.84%. The EWMA calculations are provided in Table 17-6. To demonstrate some of the calculations, consider the first observation with x_1 = 46.21. We find

z_1 = λx_1 + (1 − λ)z_0
    = (0.2)(46.21) + (0.80)(45) = 45.24.

The second EWMA value is then

z_2 = λx_2 + (1 − λ)z_1 = (0.2)x_2 + (0.80)(45.24).

The EWMA values are plotted on a control chart along with the upper and lower control limits given by

UCL = μ0 + Lσ √((λ/(2 − λ))[1 − (1 − λ)^{2i}])
    = 45 + 2.7(0.84) √((0.2/(2 − 0.2))[1 − (1 − 0.2)^{2i}]),

LCL = μ0 − Lσ √((λ/(2 − λ))[1 − (1 − λ)^{2i}])
    = 45 − 2.7(0.84) √((0.2/(2 − 0.2))[1 − (1 − 0.2)^{2i}]).

Therefore, for i = 1,

UCL = 45 + 2.7(0.84) √((0.2/1.8)[1 − (0.8)^2]) = 45.45,
LCL = 45 − 2.7(0.84) √((0.2/1.8)[1 − (0.8)^2]) = 44.55.
Figure 17-12 EWMA chart for Example 17-8 (UCL = 45.76, centerline = 45, LCL = 44.24).
The remaining control limits are calculated similarly and plotted on the control chart given in Fig. 17-12. The control limits tend to increase as i increases, but then tend to the steady-state values given by equations 17-24:

UCL = μ0 + Lσ √(λ/(2 − λ)) = 45 + 2.7(0.84) √(0.2/(2 − 0.2)) = 45.76,
LCL = μ0 − Lσ √(λ/(2 − λ)) = 45 − 2.7(0.84) √(0.2/(2 − 0.2)) = 44.24.

The EWMA control chart signals at observation 17, indicating that the process is out of control.
The sensitivity of the EWMA control chart for a particular process will depend on the choices of L and λ. Various choices of these parameters will be presented later in this chapter, when the concept of the average run length is introduced. For more details and developments regarding the EWMA, see Crowder (1987), Lucas and Saccucci (1990), and Montgomery (2001).
17-3.6 Average Run Length In this chapter we have presented control-charting techniques for a variety of situations and made some recommendations about the design of the control charts. In this section, we will present the average run length (ARL) of a control chart. The ARL can be used to assess the performance of the control chart or to determine the appropriate values of various parameters for the control charts presented in this chapter.
The ARL is the expected number of samples taken before a control chart signals out of control. In general, the ARL is

ARL = 1/p,
where p is the probability of any point exceeding the control limits. If the process is in control and the control chart signals out of control, then we say that a false alarm has occurred. To illustrate, consider the X̄ control chart with the standard 3σ limits. For this situation, p = 0.0027 is the probability that a single point falls outside the limits when the process is in control. The in-control ARL for the X̄ control chart is

ARL = 1/p = 1/0.0027 ≈ 370.
In other words, even if the process remains in control we should expect, on the average, an out-of-control signal (or false alarm) every 370 samples. In general, if the process is actually in control, then we desire a large value of the ARL. More formally, we can define the in-control ARL as

ARL_0 = 1/α,

where α is the probability that a sample point plots beyond the control limits when the process is in control. If, on the other hand, the process is out of control, then a small ARL value is desirable. A small value of the ARL indicates that the control chart will signal out of control soon after the process has shifted. The out-of-control ARL is

ARL_1 = 1/(1 − β),

where β is the probability of not detecting a shift on the first sample after a shift has occurred. To illustrate, consider the X̄ control chart with 3σ limits. Assume the target or in-control mean is μ0 and that the process has shifted to an out-of-control mean given by μ1 = μ0 + kσ. The probability of not detecting this shift is given by

β = P[LCL ≤ X̄ ≤ UCL | μ = μ1].

That is, β is the probability that the next sample mean plots in control, when in fact the process has shifted out of control. Since X̄ ~ N(μ, σ^2/n), LCL = μ0 − Lσ/√n, and UCL = μ0 + Lσ/√n, we can rewrite β as

β = P[−L − k√n ≤ Z ≤ L − k√n],

where Z is a standard normal random variable. If we let Φ denote the standard normal cumulative distribution function, then

β = Φ(L − k√n) − Φ(−L − k√n).
From this, 1 − β is the probability that a shift in the process is detected on the first sample after the shift has occurred; that is, the process has shifted and a point exceeds the control limits, signaling that the process is out of control. Therefore, ARL_1 is the expected number of samples observed before a shift is detected. The ARLs have been used to evaluate and design control charts for variables and for attributes. For more discussion on the use of ARLs for these charts, see Montgomery (2001).

ARLs for the CUSUM and EWMA Control Charts Earlier in this chapter, we presented the CUSUM and EWMA control charts. The ARL can be used to specify some of the parameter values needed to design these control charts. To implement the tabular CUSUM control chart, values of the decision interval, H, and the reference value, K, must be chosen. Recall that H and K are multiples of the process standard deviation, specifically H = hσ and K = kσ, where k = 1/2 is often used as a standard. The proper selection of these values is important. The ARL is one criterion that can be used to determine the values of H and K. As stated previously, a large value of the ARL when the process is in control is desirable. Therefore, we can set ARL_0 to an acceptable level and determine h and k accordingly. In addition, we would want the control chart to quickly detect a shift in the process mean. This would require values of h and k such that the values of ARL_1 are quite small. To illustrate, Montgomery (2001) provides the ARL for a CUSUM control chart with h = 5 and k = 1/2. These values are given in Table 17-7. The in-control average run length, ARL_0, is 465. If a small shift, say, 0.50σ, is important to detect, then with h = 5 and k = 1/2, we would expect to detect this shift within 38 samples (on the average) after the shift has occurred. Hawkins (1993) presents a table of h and k values that will result in an in-control average run length of ARL_0 = 370.
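Before turning to tabled values, note that the β and ARL expressions above are easy to evaluate directly. A minimal sketch for the X̄ chart with L-sigma limits (the function and parameter names here are ours, not from the text):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def arl_xbar(k, n, L=3.0):
    """ARL of an X-bar chart with L-sigma limits when the mean has
    shifted by k process standard deviations (sample size n).
    k = 0 gives the in-control ARL, 1/alpha."""
    beta = phi(L - k * sqrt(n)) - phi(-L - k * sqrt(n))
    return 1.0 / (1.0 - beta)

arl0 = arl_xbar(k=0.0, n=1)   # in-control ARL, about 370
arl1 = arl_xbar(k=2.0, n=1)   # ARL for a 2-sigma shift, individuals
```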
The values are reproduced in Table 17-8. Design of the EWMA control chart can also be based on the ARLs. Recall that the design parameters of the EWMA control chart are the multiple of the standard deviation, L, and the value of the weighting factor, λ. The values of these design parameters can be chosen so that the ARL performance of the control chart is satisfactory. Several authors discuss the ARL performance of the EWMA control chart, including Crowder (1987) and Lucas and Saccucci (1990). Lucas and Saccucci (1990) provide the ARL performance for several combinations of L and λ. The results are reproduced in Table 17-9. Again, it is desirable to have a large value of the in-control ARL and small values of out-of-control ARLs. To illustrate, if L = 2.8 and λ = 0.10 are used, we would expect ARL_0 ≈ 500, while the ARL for detecting a shift of 0.5σ is ARL_1 ≈ 31.3. To detect smaller shifts
Table 17-7 Tabular CUSUM Performance with h = 5 and k = 1/2

Shift in Mean (multiple of σ):  0    0.25  0.50  0.75  1.00  1.50  2.00  2.50  3.00  4.00
ARL:                            465  139   38.0  17.0  10.4  5.75  4.01  3.11  2.57  2.01
Table 17-8 Values of h and k Resulting in ARL_0 = 370 (Hawkins 1993)

k:  0.25  0.50  0.75  1.0   1.25  1.5
h:  8.01  4.77  3.34  2.52  1.99  1.61
Table 17-9 ARLs for Various EWMA Control Schemes (Lucas and Saccucci 1990)

Shift in Mean     L = 3.054   L = 2.998   L = 2.962   L = 2.814   L = 2.615
(multiple of σ)   λ = 0.40    λ = 0.25    λ = 0.20    λ = 0.10    λ = 0.05
0                 500         500         500         500         500
0.25              224         170         150         106         84.1
0.50              71.2        48.2        41.8        31.3        28.8
0.75              28.4        20.1        18.2        15.9        16.4
1.00              14.3        11.1        10.5        10.3        11.4
1.50              5.9         5.5         5.5         6.1         7.1
2.00              3.5         3.6         3.6         4.4         5.2
2.50              2.5         2.7         2.9         3.4         4.2
3.00              2.0         2.3         2.4         2.9         3.5
4.00              1.4         1.7         1.9         2.2         2.7
in the process mean, it is found that small values of λ should be used. Note that for L = 3.0 and λ = 1.0, the EWMA reduces to the standard Shewhart control chart with 3-sigma limits.

Cautions in the Use of ARLs Although the ARL provides valuable information for designing and evaluating control schemes, there are drawbacks to relying on the ARL as a design criterion. It should be noted that the run length follows a geometric distribution, since it represents the number of samples before a "success" occurs (a success being a point falling beyond the control limits). One drawback is that the standard deviation of the run length is quite large. Second, because the distribution of the run length follows a geometric distribution, the mean of the distribution (the ARL) may not be a reliable estimate of the true run length.
17-3.7 Other SPC Problem-Solving Tools While the control chart is a very powerful tool for investigating the causes of variation in a process, it is most effective when used with other SPC problem-solving tools. In this section we illustrate some of these tools, using the printed circuit board defect data in Example 17-5. Figure 17-9 shows a c chart for the number of defects in samples of five printed circuit boards. The chart exhibits statistical control, but the number of defects must be reduced, as the average number of defects per board is 8/5 = 1.6, and this level of defects would require extensive rework.
The first step in solving this problem is to construct a Pareto diagram of the individual defect types. The Pareto diagram, shown in Fig. 17-13, indicates that insufficient solder and solder balls are the most frequently occurring defects, accounting for (109/160)100 = 68% of the observed defects. Furthermore, the first five defect categories on the Pareto chart are all solder-related defects. This points to the flow solder process as a potential opportunity for improvement. To improve the flow solder process, a team consisting of the flow solder operator, the shop supervisor, the manufacturing engineer responsible for the process, and a quality engineer meet to study potential causes of solder defects. They conduct a brainstorming session and produce the cause-and-effect diagram shown in Fig. 17-14. The cause-and-effect
Figure 17-13 Pareto diagram for printed circuit board defects.
diagram is widely used to clearly display the various potential causes of defects in products and their interrelationships. It is useful in summarizing knowledge about the process. As a result of the brainstorming session, the team tentatively identifies the following variables as potentially influential in creating solder defects:

1. Flux specific gravity
2. Solder temperature
3. Conveyor speed
4. Conveyor angle
5. Solder wave height
6. Preheat temperature
7. Pallet loading method
A statistically designed experiment could be used to investigate the effect of these seven variables on solder defects. Also, the team constructed a defect concentration diagram for the product. A defect concentration diagram is just a sketch or drawing of the product, with the most frequently occurring defects shown on the part. This diagram is used to determine whether defects occur in the same location on the part. The defect concentration diagram for the printed circuit board is shown in Fig. 17-15. This diagram indicates that most of the insufficient solder defects are near the front edge of the board, where it makes initial contact with the solder wave. Further investigation showed that one of the pallets used to carry the boards across the wave was bent, causing the front edge of the board to make poor contact with the solder wave.
Figure 17-14 Cause-and-effect diagram for the printed circuit board flow solder process.

Figure 17-15 Defect concentration diagram for a printed circuit board.
When the defective pallet was replaced, a designed experiment was used to investigate the seven variables discussed earlier. The results of this experiment indicated that several of these factors were influential and could be adjusted to reduce solder defects. After the results of the experiment were implemented, the percentage of solder joints requiring rework was reduced from 1% to under 100 parts per million (0.01%).
17-4 RELIABILITY ENGINEERING One of the challenging endeavors of the past three decades has been the design and development of large-scale systems for space exploration, new generations of commercial and military aircraft, and complex electromechanical products such as office copiers and computers. The performance of these systems, and the consequences of their failure, is of vital concern. For example, the military community has historically placed strong emphasis on equipment reliability. This emphasis stems largely from increasing ratios of maintenance cost to procurement cost and the strategic and tactical implications of system failure. In the
area of consumer product manufacture, high reliability has come to be expected as much as conformance to other important quality characteristics. Reliability engineering encompasses several activities, one of which is reliability modeling. Essentially, the system survival probability is expressed as a function of subsystem and component reliabilities (survival probabilities). Usually, these models are time dependent, but there are some situations where this is not the case. A second important activity is that of life testing and reliability estimation.
17-4.1 Basic Reliability Definitions Let us consider a component that has just been manufactured. It is to be operated at a stated "stress level" or within some range of stress such as temperature, shock, and so on. The random variable T will be defined as the time to failure, and the reliability of the component (or subsystem or system) at time t is R(t) = P[T > t]. R is called the reliability function. The failure process is usually complex, consisting of at least three types of failures: initial failures, wear-out failures, and those that occur between these. A hypothetical composite distribution of time to failure is shown in Fig. 17-16. This is a mixed distribution, with density g and a discrete mass

p(0) = P[T = 0]    (17-25)

at time zero. Since for many components (or systems) the initial failures or time-zero failures are removed during testing, the random variable T is conditioned on the event that T > 0, so that the failure density is

f(t) = g(t) / [1 − p(0)],    t > 0,    (17-26)
     = 0,    otherwise.

Thus, in terms of f, the reliability function, R, is

R(t) = 1 − F(t) = ∫_t^∞ f(x) dx.    (17-27)
The term interval failure rate denotes the rate of failure on a particular interval of time [t_1, t_2], and the terms failure rate, instantaneous failure rate, and hazard will be used synonymously as a limiting form of the interval failure rate as t_2 → t_1. The interval failure rate FR(t_1, t_2) is as follows:

FR(t_1, t_2) = [(R(t_1) − R(t_2)) / R(t_1)] · [1 / (t_2 − t_1)].    (17-28)
Figure 17-16 A composite failure distribution.
The first bracketed term is simply

P{failure during [t_1, t_2] | survival to time t_1}.    (17-29)
The second term supplies the dimensional characteristic, so that we may express the conditional probability of equation 17-29 on a per-unit-time basis. We will now develop the instantaneous failure rate (as a function of t). Let h(t) be the hazard function. Then

h(t) = lim_{Δt→0} [(R(t) − R(t + Δt)) / (R(t) · Δt)] = f(t)/R(t),    (17-30)

since R(t) = 1 − F(t) and −R′(t) = f(t). A typical hazard function is shown in Fig. 17-17. Note that h(t) · dt might be thought of as the instantaneous probability of failure at t, given survival to t. A useful result is that the reliability function R may be easily expressed in terms of h as

R(t) = e^{−∫_0^t h(x)dx} = e^{−H(t)},    (17-31)

where H(t) = ∫_0^t h(x) dx. Equation 17-31 results from the definition given in equation 17-30,

h(t) = f(t)/R(t) = −R′(t)/R(t),
Figure 17-17 A typical hazard function, showing early failures, random failures, and wear-out failures.
and the integration of both sides gives

∫_0^t h(x) dx = −∫_0^t [R′(x)/R(x)] dx = −ln R(x) |_0^t,

so that

∫_0^t h(x) dx = −ln R(t) + ln R(0).

Since F(0) = 0, we see that R(0) = 1, ln R(0) = 0, and

R(t) = e^{−∫_0^t h(x)dx}.
The mean time to failure (MTTF) is

MTTF = E[T] = ∫_0^∞ t f(t) dt.

A useful alternative form is

MTTF = ∫_0^∞ R(t) dt.    (17-32)

Most complex system modeling assumes that only random component failures need be considered. This is equivalent to stating that the time-to-failure distribution is exponential, that is,

f(t) = λe^{−λt},    t ≥ 0,
     = 0,    otherwise,

so that

h(t) = f(t)/R(t) = λe^{−λt}/e^{−λt} = λ

is a constant. When all early-age failures have been removed by burn-in, and the time to occurrence of wear-out failures is very great (as with electronic parts), then this assumption is reasonable. The normal distribution is most generally used to model wear-out failure or stress failure (where the random variable under study is stress level). In situations where most failures are due to wear, the normal distribution may very well be appropriate. The lognormal distribution has been found to be applicable in describing time to failure for some types of components, and the literature seems to indicate an increased utilization of this density for this purpose. The Weibull distribution has been extensively used to represent time to failure, and its nature is such that it may be made to approximate closely the observed phenomena. When a system is composed of a number of components and failure is due to the most serious of a large number of defects or possible defects, the Weibull distribution seems to do particularly well as a model. The gamma distribution frequently results from modeling standby redundancy where components have an exponential time-to-failure distribution. We will investigate standby redundancy in Section 17-4.5.
17-4.2 The Exponential Time-to-Failure Model In this section we assume that the time-to-failure distribution is exponential; that is, only "random failures" are considered. The density, reliability function, and hazard function are given in equations 17-33 through 17-35 and are shown in Fig. 17-18:

f(t) = λe^{−λt},    t ≥ 0,    (17-33)
     = 0,    otherwise,

R(t) = e^{−λt},    t ≥ 0,    (17-34)
     = 0,    otherwise,

h(t) = f(t)/R(t) = λ,    t ≥ 0,    (17-35)
     = 0,    otherwise.

The constant hazard function is interpreted to mean that the failure process has no memory; that is,

P[t ≤ T ≤ t + Δt | T ≥ t] = (e^{−λt} − e^{−λ(t+Δt)}) / e^{−λt} = 1 − e^{−λΔt},    (17-36)
Figure 17-18 Density, reliability function, and hazard function for the exponential failure model: (a) density function, (b) reliability function, (c) hazard function.
a quantity that is independent of t. Thus, if a component is functioning at time t, it is as good as new; the remaining life has the same density as f.
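This memoryless property is easy to verify numerically. The sketch below uses an assumed rate of λ = 0.01 failures per hour and assumed ages t and s (arbitrary illustrative values):

```python
import math

lam = 0.01  # assumed failure rate (failures per hour)

def reliability(t):
    """R(t) = P[T > t] for the exponential time-to-failure model."""
    return math.exp(-lam * t)

t, s = 200.0, 50.0
# Survival probability of a unit that has already run t hours:
conditional = reliability(t + s) / reliability(t)
# Survival probability of a brand-new unit over the same span:
fresh = reliability(s)
# Both equal e^{-lam*s}: the working unit is "as good as new".
```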
Emnple 17:9 A diode used on a printed c!.reui~ board has a fated failure rate of 2.3 x lo-e failures per hour. Howeve:, under an increased tempera~e stress, it is felt that the :ate is about :t5 x 10-5 failures per hour. The time to failure is exponentially distributed, so that we have ft.t) "'" (1.5 X 1O-5)e-
t~O,
~O,
otherwise,
RU) ::::: e-< L5:«11)-5)I,
t~O,
othetwise,
=0,
and h(f) ~
L5x 10-',
!~O.
otherwise.
=0,
To determine the reliability at t = 10^4 and t = 10^5, we evaluate R(10^4) = e^{−0.15} ≈ 0.861 and R(10^5) = e^{−1.5} ≈ 0.223.
17-4.3 Simple Serial Systems A simple serial system is shown in Fig. 17-19. In order for the system to function, all components must function, and it is assumed that the components function independently. We let T_j be the time to failure for component C_j for j = 1, 2, ..., n, and let T be the system time to failure. The reliability model is thus

R(t) = P[T > t] = P(T_1 > t) · P(T_2 > t) · ... · P(T_n > t),

or

R(t) = R_1(t) · R_2(t) · ... · R_n(t),    (17-37)

where R_j(t) = P(T_j > t) is the reliability function for component C_j.
Example 17-10
Three components must all function for a simple system to function. The random variables T_1, T_2, and T_3, representing time to failure for the components, are independent with the following distributions:

T_1 ~ N(2 × 10^3, 4 × 10^4),
T_2 ~ Weibull(γ = 0, δ = 1, β = 1/7),
T_3 ~ lognormal(μ = 10, σ^2 = 4).

Figure 17-19 A simple serial system.

It follows that

R_1(t) = 1 − Φ((t − 2000)/200),
R_2(t) = e^{−t^{1/7}},
R_3(t) = 1 − Φ((ln t − 10)/2),

so that

R(t) = R_1(t) · R_2(t) · R_3(t).

For example, if t = 2187 hours, then

R(2187) = [1 − Φ(0.935)][e^{−3}][1 − Φ(−1.154)]
        = [0.175][0.0498][0.876]
        = 0.0076.
For the simple serial system, system reliability may be calculated using the product of the component reliability functions as demonstrated; however, when all components have an exponential distribution, the calculations are greatly simplified, since

R(t) = e^{−λ_1 t} · e^{−λ_2 t} · ... · e^{−λ_n t} = e^{−(λ_1 + λ_2 + ... + λ_n)t},

or

R(t) = e^{−λ_s t},    (17-38)

where λ_s = Σ_{j=1}^{n} λ_j represents the system failure rate. We also note that the system reliability function is of the same form as the component reliability functions. The system failure rate is simply the sum of the component failure rates, and this makes application very easy.
Example 17-11
Consider an electronic circuit with three integrated circuit devices, 12 silicon diodes, 8 ceramic capacitors, and 15 composition resistors. Suppose that, under given stress levels of temperature, shock, and so on, each component has the failure rate shown in the following table, and that the component failures are independent.

Failures per Hour
Integrated circuits    1.3 × 10^−9
Diodes                 1.7 × 10^−7
Capacitors             1.2 × 10^−7
Resistors              6.1 × 10^−8

Therefore,

λ_s = 3(0.013 × 10^−7) + 12(1.7 × 10^−7) + 8(1.2 × 10^−7) + 15(0.61 × 10^−7)
    = 3.9189 × 10^−6,

and

R(t) = e^{−λ_s t}.
The circuit mean time to failure is

MTTF = E[T] = 1/λ_s = (1/3.9189) × 10^6 = 2.55 × 10^5 hours.

If we wish to determine, say, R(10^4), we get

R(10^4) = e^{−0.039189} ≈ 0.96.
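Because the system failure rate in equation 17-38 is just a sum, serial-system bookkeeping like this takes only a few lines. A sketch, using the per-component rates that reproduce the example's total of λ_s = 3.9189 × 10^−6:

```python
import math

# (count, failure rate per hour) for each component type
components = [
    (3, 1.3e-9),    # integrated circuits
    (12, 1.7e-7),   # silicon diodes
    (8, 1.2e-7),    # ceramic capacitors
    (15, 6.1e-8),   # composition resistors
]

lam_s = sum(count * rate for count, rate in components)  # system failure rate
mttf = 1.0 / lam_s                                       # mean time to failure
r_10000 = math.exp(-lam_s * 1.0e4)                       # R(10^4)
```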
17-4.4 Simple Active Redundancy An active redundant configuration is shown in Fig. 17-20. The assembly functions if k or more of the components function (k ≤ n). All components begin operation at time zero; thus the term "active" is used to describe the redundancy. Again, independence is assumed. A general formulation is not convenient to work with, and in most cases it is unnecessary. When all components have the same reliability function, as is the case when the components are of the same type, we let R_j(t) = r(t) for j = 1, 2, ..., n, so that

R(t) = Σ_{j=k}^{n} C(n, j) [r(t)]^j [1 − r(t)]^{n−j}.    (17-39)

Equation 17-39 is derived from the definition of reliability.
Example 17-12
Three identical components are arranged in active redundancy, operating independently. In order for the assembly to function, at least two of the components must function (k = 2). The reliability function for the system is thus

R(t) = Σ_{j=2}^{3} C(3, j)[r(t)]^j [1 − r(t)]^{3−j}
     = 3[r(t)]^2[1 − r(t)] + [r(t)]^3
     = [r(t)]^2[3 − 2r(t)].

It is noted that R is a function of time, t.
Figure 17-20 An active redundant configuration.
When only one of the n components is required, as is often the case, and the components are not identical, we obtain

R(t) = 1 − Π_{j=1}^{n} [1 − R_j(t)].    (17-40)

The product is the probability that all components fail and, obviously, if they do not all fail the system survives. When the components are identical and only one is required, equation 17-40 reduces to

R(t) = 1 − [1 − r(t)]^n,    (17-41)
where r(t) = R_j(t), j = 1, 2, ..., n. When the components have exponential failure laws, we will consider two cases. First, when the components are identical with failure rate λ and at least k components are required for the assembly to operate, equation 17-39 becomes

R(t) = Σ_{j=k}^{n} C(n, j) e^{−jλt}[1 − e^{−λt}]^{n−j}.    (17-42)

The second case is considered for the situation where the components have identical exponential failure densities and where only one component must function for the assembly to function. Using equation 17-41, we get

R(t) = 1 − [1 − e^{−λt}]^n.    (17-43)
Example 17-13
In Example 17-12, where three identical components were arranged in an active redundancy and at least two were required for system operation, we found

R(t) = [r(t)]^2[3 − 2r(t)].

If the component reliability functions are

r(t) = e^{−λt},

then

R(t) = e^{−2λt}[3 − 2e^{−λt}] = 3e^{−2λt} − 2e^{−3λt}.

If two components are arranged in an active redundancy as described, and only one must function for the assembly to function, and, furthermore, if the time-to-failure densities are exponential with failure rate λ, then from equation 17-43 we obtain

R(t) = 1 − [1 − e^{−λt}]^2 = 2e^{−λt} − e^{−2λt}.
17-4.5 Standby Redundancy A common form of redundancy, called standby redundancy, is shown in Fig. 17-21. The unit labeled DS is a decision switch that we will assume has reliability 1 for all t. The operating rules are as follows. Component 1 is initially online, and when this component fails, the decision switch switches in component 2, which remains online until it fails.
Figure 17-21 Standby redundancy.
Standby units are not subject to failure until activated. The time to failure for the assembly is T = T_1 + T_2 + ... + T_n, where T_i is the time to failure for the ith component and T_1, T_2, ..., T_n are independent random variables. The most common value for n in practice is two, so the Central Limit Theorem is of little value. However, we know from the properties of linear combinations that

E[T] = Σ_{i=1}^{n} E[T_i]

and

V[T] = Σ_{i=1}^{n} V[T_i].

We must know the distributions of the random variables T_i in order to find the distribution of T. The most common case occurs when the components are identical and the time-to-failure distributions are assumed to be exponential. In this case, T has a gamma distribution,

f(t) = [λ(λt)^{n−1} / (n − 1)!] e^{−λt},    t > 0,
     = 0,    otherwise,

so that the reliability function is

R(t) = Σ_{k=0}^{n−1} e^{−λt} (λt)^k / k!,    t > 0.    (17-44)

The parameter λ is the component failure rate; that is, E[T_i] = 1/λ. The mean time to failure and variance are

MTTF = E[T] = n/λ    (17-45)

and

V[T] = n/λ^2,    (17-46)

respectively.
Example 17-14
Two identical components are assembled in a standby redundant configuration with perfect switching. The component lives are identically distributed, independent random variables having an exponential distribution with failure rate 1/100. The mean time to failure is

MTTF = E[T] = 2/(1/100) = 200 hours,

and the variance is

V[T] = 2(100)^2 = 20,000.

The reliability function is

R(t) = Σ_{k=0}^{1} e^{−t/100} (t/100)^k / k!,

or

R(t) = e^{−t/100}[1 + t/100].
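Equation 17-44 can be sketched directly; the assertions below reproduce the two-unit numbers of this example:

```python
from math import exp, factorial

def standby_reliability(t, lam, n):
    """Reliability of n identical exponential components in standby
    redundancy with a perfect switch (equation 17-44)."""
    return sum(exp(-lam * t) * (lam * t) ** k / factorial(k) for k in range(n))

lam = 1 / 100                                  # component failure rate
mttf = 2 / lam                                 # n/lambda = 200 hours
r_100 = standby_reliability(100.0, lam, n=2)   # e^{-1}(1 + 1)
```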
17-4.6 Life Testing Sometiroes~ n units are placed on test and aged until all or most uruts ha\'e failed; the purpose is to test a hypothesis about the fann of
Ufe tests are conducted for different purposes.
the tirne-to-failure density with certain parameters. Both forma1 statistical tests and probability plotting are widely used in life testing, A secane objective in life testing is to estimate reliability. Suppose, for example, that a manufacturer is interested in estimating R(IOOO) for a particular component or system. One approach to this problem would be to place n units on test and count the number offail~ ures, r, occurring before 1000 hours of operation. Failed units are not to be replaced in this example. An estimate of unreliability isp::::: rln, and an estimate of reliability is
R̂(1000) = 1 - r/n.   (17-47)

A 100(1 - α)% lower-confidence limit on R(1000) is given by [1 - upper limit on p], where p is the unreliability. This upper limit on p may be determined using a table of the binomial distribution. In the case where n is large, an estimate of the upper limit on p is

p_U ≈ p̂ + z_α √(p̂(1 - p̂)/n).   (17-48)
Example 17-15. One hundred units are placed on life test, and the test is run for 1000 hours. There are two failures during the test, so p̂ = 0.02 and R̂(1000) = 0.98. Using a table of the binomial distribution, a 95% upper-confidence limit on p is 0.06, so that a lower limit on R(1000) is given by 0.94.
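The table lookup in Example 17-15 can be reproduced by solving the exact binomial tail equation with bisection. The helper names are ours; the book reads 0.06 from a binomial table, and the exact solution is slightly above that value.

```python
import math

def binom_cdf(r, n, p):
    """P(X <= r) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(r + 1))

def upper_limit_p(r, n, alpha=0.05):
    """Exact 100(1-alpha)% upper confidence limit on the unreliability p:
    the largest p with P(X <= r | n, p) >= alpha, found by bisection."""
    lo, hi = r / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(r, n, mid) >= alpha:
            lo = mid
        else:
            hi = mid
    return lo

# Example 17-15: n = 100 units, r = 2 failures in 1000 hours.
n, r = 100, 2
p_hat = r / n               # 0.02
R_hat = 1 - p_hat           # 0.98
p_u = upper_limit_p(r, n)   # about 0.06, as read from the binomial table
print(R_hat, p_u, 1 - p_u)  # lower limit on R(1000) is about 0.94
```
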
In recent years, there has been much work on the analysis of failure-time data, including plotting methods for identification of appropriate failure-time models and parameter estimation. For a good summary of this work, refer to Elsayed (1996).
Chapter 17 Statistical Quality Control and Reliability Engineering
17-4.7 Reliability Estimation with a Known Time-to-Failure Distribution

In the case where the form of the reliability function is assumed known and there is only one parameter, the maximum likelihood estimator for R(t) is R̂(t), which is formed by substituting θ̂ for the parameter θ in the expression for R(t), where θ̂ is the maximum likelihood estimator of θ. For more details and results for specific time-to-failure distributions, refer to Elsayed (1996).
17-4.8 Estimation with the Exponential Time-to-Failure Distribution

The most common case for the one-parameter situation is where the time-to-failure distribution is exponential, R(t) = e^{-t/θ}. The parameter θ = E[T] is called the mean time to failure, and the estimator for R(t) is R̂(t), where

R̂(t) = e^{-t/θ̂}

and θ̂ is the maximum likelihood estimator of θ. Epstein (1960) developed the maximum likelihood estimators for θ under a number of different conditions and, furthermore, showed that a 100(1 - α)% confidence interval on R(t) is given by

[e^{-t/θ_L}, e^{-t/θ_U}]   (17-49)

for the two-sided case, or

R(t) ≥ e^{-t/θ_L}   (17-50)

for the lower, one-sided interval. In these cases, the values θ_L and θ_U are the lower- and upper-confidence limits on θ. The following symbols will be used:

n = number of units placed on test at t = 0.
Q = total test time in unit hours.
t* = time at which the test is terminated.
r = number of failures accumulated at time t*.
r* = preassigned number of failures.
1 - α = confidence level.
χ²_{α,k} = the upper α percentage point of the chi-square distribution with k degrees of freedom.
There are four situations to consider, according to whether the test is stopped after a preassigned time or after a preassigned number of failures, and whether failed items are replaced or not replaced during test. For the replacement test, the total test time in unit hours is Q = nt*, and for the nonreplacement test

Q = Σ_{i=1}^{r} t_i + (n - r)t*.   (17-51)

If items are censored (withdrawn items that have not failed), and if failures are replaced while censored items are not replaced, then

Q = Σ_{j=1}^{c} t_j + (n - c)t*,   (17-52)
where c represents the number of censored items and t_j is the time of the jth censorship. If neither censored items nor failed items are replaced, then

Q = Σ_{i=1}^{r} t_i + Σ_{j=1}^{c} t_j + (n - r - c)t*.   (17-53)
The development of the maximum likelihood estimator for θ is rather straightforward. In the case where the test is nonreplacement, and the test is discontinued after a fixed number of items have failed, the likelihood function is

L = Π_{i=1}^{r} f(t_i) · [R(t*)]^{n-r}.   (17-54)

Then

ln L = -r ln θ - (1/θ) Σ_{i=1}^{r} t_i - (n - r)t*/θ,

and solving ∂(ln L)/∂θ = 0 yields the estimator

θ̂ = [Σ_{i=1}^{r} t_i + (n - r)t*] / r = Q/r.   (17-55)
It turns out that

θ̂ = Q/r   (17-56)

is the maximum likelihood estimator of θ for all cases considered for the test design and operation. The quantity 2rθ̂/θ has a chi-square distribution with 2r degrees of freedom in the case where the test is terminated after a fixed number of failures. For a fixed termination time t*, the degrees of freedom becomes 2r + 2. Since the expression 2rθ̂/θ = 2Q/θ, confidence limits on θ may be expressed as indicated in Table 17-10. The results presented in the table may be used directly with equations 17-49 and 17-50 to establish confidence limits on R(t). It should be noted that this testing procedure does not require that the test be run for the time at which a reliability estimate is required. For example, 100 units may be placed on a nonreplacement test for 200 hours, the parameter θ estimated, and R̂(1000) calculated. In the case of the binomial testing mentioned earlier, it would have been necessary to run the test for 1000 hours.
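The bookkeeping for Q in equations 17-51 through 17-53, together with the estimator θ̂ = Q/r of equation 17-56, can be collected into one small helper. The function and its argument names are ours; it assumes censored units are never replaced, which matches the cases treated in the text.

```python
def total_test_time(n, t_star, failure_times=None, censor_times=None,
                    replace_failures=False):
    """Accumulated test time Q in unit-hours (eqs. 17-51 to 17-53).
    A replaced failure position keeps running, so it logs t_star in total;
    censored units log their withdrawal times and are not replaced."""
    ft = failure_times or []
    ct = censor_times or []
    q = 0.0
    survivors = n - len(ct)          # unit positions still on test at t*
    if not replace_failures:
        q += sum(ft)                 # time logged by failed units
        survivors -= len(ft)
    q += sum(ct)                     # time logged by censored units
    q += survivors * t_star          # positions running the full test
    return q

# Replacement test: every unit position runs the full t*, so Q = n t*.
print(total_test_time(20, 80, failure_times=[5, 30], replace_failures=True))

# Nonreplacement test (eq. 17-51), illustrative numbers:
q = total_test_time(10, 200, failure_times=[9, 21, 40, 55, 85])
theta_hat = q / 5                    # MLE theta-hat = Q/r, eq. 17-56
print(q, theta_hat)
```
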
Table 17-10 Confidence Limits on θ

Nature of Limit          Fixed Number of Failures, r*              Fixed Termination Time, t*
Two-sided limits         2Q/χ²_{α/2,2r} ≤ θ ≤ 2Q/χ²_{1-α/2,2r}     2Q/χ²_{α/2,2r+2} ≤ θ ≤ 2Q/χ²_{1-α/2,2r}
Lower, one-sided limit   2Q/χ²_{α,2r} ≤ θ                          2Q/χ²_{α,2r+2} ≤ θ
The results are, however, dependent on the assumption that the distribution is exponential. It is sometimes necessary to estimate the time t_R for which the reliability will be R. For the exponential model, this estimate is

t̂_R = θ̂ ln(1/R),   (17-57)

and confidence limits on t_R are given in Table 17-11.
Example 17-16. Twenty items are placed on a replacement test that is to be operated until 10 failures occur. The tenth failure occurs at 80 hours. The reliability engineer wishes to estimate the mean time to failure and 95% two-sided limits on θ, as well as R̂(100) and 95% two-sided limits on R(100). Finally, she wishes to estimate the time for which the reliability will be 0.8, with point and 95% two-sided confidence interval estimates.

According to equation 17-56 and the results presented in Tables 17-10 and 17-11,

θ̂ = nt*/r = 20(80)/10 = 160 hours,

Q = nt* = 1600 unit hours,

[2Q/χ²_{0.025,2r}, 2Q/χ²_{0.975,2r}] = [3200/34.17, 3200/9.591]
                                     = [93.65, 333.65],

R̂(100) = e^{-100/θ̂} = e^{-100/160} = 0.535.

According to equation 17-49, the confidence interval on R(100) is [e^{-100/93.65}, e^{-100/333.65}] = [0.344, 0.741]. Also, the point estimate of the time for which the reliability will be 0.8 is

t̂_{0.8} = θ̂ ln(1/0.8) = 160(0.223) = 35.7 hours.

The two-sided 95% confidence limits on t_{0.8} are determined from Table 17-11.
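The computations in Example 17-16 can be sketched in Python. In place of the book's chi-square table, the sketch uses the Wilson-Hilferty approximation to the chi-square percentage points, which reproduces the tabled values to about two decimals; the function name and this substitution are ours.

```python
import math

def chi2_upper(k, z):
    """Upper percentage point of chi-square with k degrees of freedom via
    the Wilson-Hilferty approximation; z is the standard normal deviate
    cutting off the same upper-tail area."""
    c = 2.0 / (9.0 * k)
    return k * (1.0 - c + z * math.sqrt(c)) ** 3

# Example 17-16: r = 10 failures, Q = 1600 unit-hours, 95% two-sided limits.
r, q = 10, 1600.0
z975 = 1.95996                                 # normal deviate for 0.025 tail
chi_hi = chi2_upper(2 * r, z975)               # ~34.17 = chi2_{0.025,20}
chi_lo = chi2_upper(2 * r, -z975)              # ~9.59  = chi2_{0.975,20}
theta_hat = q / r                              # 160 hours
theta_l, theta_u = 2 * q / chi_hi, 2 * q / chi_lo   # ~[93.6, 334]
r_hat = math.exp(-100 / theta_hat)                  # ~0.535
ci = (math.exp(-100 / theta_l), math.exp(-100 / theta_u))  # ~(0.344, 0.741)
t_08 = theta_hat * math.log(1 / 0.8)                # ~35.7 hours
print(theta_l, theta_u, r_hat, ci, t_08)
```
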
17-4.9 Demonstration and Acceptance Testing

It is not uncommon for a purchaser to test incoming products to assure that the vendor is conforming to reliability specifications. These tests are destructive tests and, in the case of attribute measurement, the test design follows that of the acceptance sampling discussed earlier in this chapter. A special set of sampling plans that assumes an exponential time-to-failure distribution has been presented in a Department of Defense handbook (DOD H-108), and these plans are in wide use.
17-5 SUMMARY

This chapter has presented several widely used methods for statistical quality control. Control charts were introduced and their use as process surveillance devices discussed. The X̄ and R control charts are used for measurement data. When the quality characteristic is an attribute, either the p chart for fraction defective or the c or u chart for defects may be used. The use of probability as a modeling technique in reliability analysis was also discussed. The exponential distribution is widely used as the distribution of time to failure, although other plausible models include the normal, lognormal, Weibull, and gamma distributions. System reliability analysis methods were presented for serial systems, as well as for systems having active or standby redundancy. Life testing and reliability estimation were also briefly introduced.
17-6 EXERCISES

17-1. An extrusion die is used to produce aluminum rods. The diameter of the rods is a critical quality characteristic. Below are shown X̄ and R values for 20 samples of five rods each. Specifications on the rods are 0.5035 ± 0.0010 inch. The values given are the last three digits of the measurements; that is, 34.2 is read as 0.50342.
Sample    X̄     R      Sample    X̄     R
   1    34.2    3         11    35.4    8
   2    31.6    4         12    34.0    6
   3    31.8    4         13    36.0    4
   4    33.4    5         14    37.2    7
   5    35.0    4         15    35.2    3
   6    32.1    2         16    33.4   10
   7    32.6    7         17    35.0    4
   8    33.8    9         18    34.4    7
   9    34.8   10         19    33.9    8
  10    38.6    4         20    34.0    4
(a) Set up the X̄ and R charts, revising the trial control limits if necessary, assuming assignable causes can be found.
(b) Calculate PCR and PCR_k. Interpret these ratios.
(c) What percentage of defectives is being produced by this process?
17-2. Suppose a process is in control, and 3-sigma control limits are in use on the X̄ chart. Let the mean shift by 1.5σ. What is the probability that this shift will remain undetected for three consecutive samples? What would this probability be if 2-sigma control limits are used? The sample size is 4.

17-3. Suppose that an X̄ chart is used to control a normally distributed process, and that samples of size n are taken every h hours and plotted on the chart, which has k-sigma limits.
(a) Find the expected number of samples that will be taken until a false action signal is generated. This is called the in-control average run length (ARL).
(b) Suppose that the process shifts to an out-of-control state. Find the expected number of samples that will be taken until an action signal is generated. This is the out-of-control ARL.
(c) Evaluate the in-control ARL for k = 3. How does this change if k = 2? What do you think about the use of 2-sigma limits in practice?
(d) Evaluate the out-of-control ARL for a shift of one sigma, given that n = 5.

17-4. Twenty-five samples of size 5 are drawn from a process at regular intervals, and the following data are obtained:
Σ_{i=1}^{25} X̄_i = 362.75.

(a) Compute the control limits for the X̄ and R charts.
(b) Assuming the process is in control and specification limits are 14.50 ± 0.50, what conclusions can you draw about the ability of the process to operate within these limits? Estimate the percentage of defective items that will be produced.
(c) Calculate PCR and PCR_k. Interpret these ratios.

17-5. Suppose an X̄ chart for a process is in control with 3-sigma limits. Samples of size 5 are drawn every 15 minutes, on the quarter hour. Now suppose the process mean shifts out of control by 1.5σ 10 minutes after the hour. If D is the expected number of defectives produced per quarter hour in this out-of-control state, find the expected loss (in terms of defective units) that results from this control procedure.

17-6. The overall length of a cigar lighter body used in an automobile application is controlled using X̄ and R charts. The following table gives the lengths for 20 samples of size 4 (measurements are coded from 5.00 mm; that is, 15 is 5.15 mm).

[Table of the coded length measurements, four observations for each of the 20 samples.]

(a) Set up the X̄ and R charts. Is the process in statistical control?
(b) Specifications are 5.10 ± 0.05 mm. What can you say about process capability?

17-7. Montgomery (2001) presents 30 observations of oxide thickness of individual silicon wafers. The data are

Wafer   Oxide Thickness      Wafer   Oxide Thickness
  1          45.4              16         58.4
  2          48.6              17         51.0
  3          49.5              18         41.2
  4          50.9              19         47.1
  5          55.2              20         45.7
  6          45.5              21         60.6
  7          52.8              22         51.0
  8          45.3              23         53.0
  9          46.3              24         56.0
 10          53.9              25         47.2
 11          49.8              26         48.0
 12          46.9              27         55.9
 13          49.8              28         50.0
 14          45.1              29         47.9
 15          44.0              30         53.4

(a) Construct a normal probability plot of the data. Does the normality assumption seem reasonable?
(b) Set up an individuals control chart for oxide thickness. Interpret the chart.

17-8. A machine is used to fill bottles with a particular brand of vegetable oil. A single bottle is randomly selected every half hour and the weight of the bottle recorded. Experience with the process indicates that the variability is quite stable, with σ = 0.07 oz. The process target is 32 oz. Twenty-four samples have been recorded in a 12-hour time period.

[Table of the 24 recorded bottle weights.]

(a) Construct a normal probability plot of the data. Does the normality assumption appear to be satisfied?
(b) Set up an individuals control chart for the weights. Interpret the results.

17-13. Consider a process where specifications on a quality characteristic are 180 ± 15. We know that the standard deviation of this quality characteristic is 5. Where should we center the process to minimize the fraction defective produced? Now suppose the mean shifts to 185 and we are using a sample size of 4 on an X̄ chart. What is the probability that such a shift will be detected on the first sample following the shift? What sample size would be needed on a p chart to obtain a similar degree of protection?
17-9. The following are the numbers of defective solder joints found during successive samples of 500 solder joints:

Day   No. of Defectives      Day   No. of Defectives
  1         106               12          37
  2         116               13          25
  3         164               14          88
  4          89               15         101
  5          99               16          64
  6          40               17          51
  7         112               18          74
  8          36               19          71
  9          69               20          43
 10          74               21          80
 11          42

Construct a fraction-defective control chart. Is the process in control?

17-10. A process is controlled by a p chart using samples of size 100. The centerline on the chart is 0.05. What is the probability that the control chart detects a shift to 0.08 on the first sample following the shift? What is the probability that the shift is detected by at least the third sample following the shift?

17-11. Suppose a p chart with centerline at p̄ with k sigma units is used to control a process. There is a critical fraction defective p_c that must be detected with probability 0.50 on the first sample following the shift to this state. Derive a general formula for the sample size that should be used on this chart.

17-12. A normally distributed process uses 66.7% of the specification band. It is centered at the nominal dimension, located halfway between the upper and lower specification limits.
(a) What is the process capability ratio PCR?
(b) What fallout level (fraction defective) is produced?
(c) Suppose the mean shifts to a distance exactly 3 standard deviations below the upper specification limit. What is the value of PCR_k? How has PCR changed?
(d) What is the actual fallout experienced after the shift in the mean?

17-14. Suppose the following fractions defective had been found in successive samples of size 100 (read down):

0.09   0.15   0.12
0.03   0.10   0.14
0.10   0.09   0.06
0.13   0.13   0.05
0.05   0.13   0.14
0.13   0.08   0.07
0.08   0.11   0.11
0.14   0.12   0.06
0.09   0.06   0.09
0.10   0.14   0.09

Is the process in control with respect to its fraction defective?

17-15. The following represent the number of solder defects observed on 24 samples of five printed circuit boards: 7, 6, 8, 10, 24, 6, 5, 4, 8, 11, 15, 8, 4, 16, 11, 12, 8, 6, 5, 9, 7, 14, 8, 21. Can we conclude that the process is in control using a c chart? If not, assume assignable causes can be found and revise the control limits.

17-16. The following represent the number of defects per 1000 feet in rubber-covered wire: 1, 1, 3, 7, 8, 10, 5, 13, 0, 19, 24, 6, 9, 11, 15, 8, 3, 6, 7, 4, 9, 20, 11, 7, 18, 10, 6, 4, 0, 9, 7, 3, 1, 8, 12. Do the data come from a controlled process?

17-17. Suppose the number of defects in a unit is known to be 8. If the number of defects in a unit shifts to 16, what is the probability that it will be detected by the c chart on the first sample following the shift?

17-18. Suppose we are inspecting disk drives for defects per unit, and it is known that there is an average of two defects per unit. We decide to make our inspection unit for the c chart five disk drives, and we control the total number of defects per inspection unit. Describe the new control chart.

17-19. Consider the data in Exercise 17-15. Set up a u chart for this process. Compare it to the c chart in Exercise 17-15.
17-20. Consider the oxide thickness data given in Exercise 17-7. Set up an EWMA control chart with λ = 0.20 and L = 2.962. Interpret the chart.

17-21. Consider the oxide thickness data given in Exercise 17-7. Construct a CUSUM control chart with k = 0.75 and h = 3.34 if the target thickness is 50. Interpret the chart.

17-22. Consider the weights provided in Exercise 17-8. Set up an EWMA control chart with λ = 0.10 and L = 2.7. Interpret the chart.

17-23. Consider the weights provided in Exercise 17-8. Set up a CUSUM control chart with k = 0.50 and h = 4.0. Interpret the chart.

17-24. A time-to-failure distribution is given by a uniform distribution:

f(t) = 1/(β - α),  α ≤ t ≤ β,
     = 0,  otherwise.

(a) Determine the reliability function.
(b) Show that ∫₀^∞ R(t) dt = ∫₀^∞ t f(t) dt.
(c) Determine the hazard function.
(d) Show that R(t) = e^{-H(t)}, where H is defined as in equation 17-31.

17-25. Three units that operate and fail independently form a series configuration, as shown in the figure at the bottom of this page. The time-to-failure distribution for each unit is exponential with the failure rates indicated.
(a) Find R(60) for the system.
(b) What is the mean time to failure (MTTF) for this system?

[Figure for Exercise 17-25: three units in series with failure rates λ₁ = 3 × 10⁻², λ₂ = 6 × 10⁻⁶, and λ₃ = 4 × 10⁻².]

17-26. Five identical units are arranged in an active redundancy to form a subsystem. Unit failure is independent, and at least two of the units must survive 1000 hours for the subsystem to perform its mission.
(a) If the units have exponential time-to-failure distributions with failure rate 0.002, what is the subsystem reliability?
(b) What is the reliability if only one unit is required?

17-27. If the units described in the previous exercise are operated in a standby redundancy with a perfect decision switch and only one unit is required for subsystem survival, determine the subsystem reliability.

17-28. One hundred units are placed on test and aged until all units have failed. The following results are obtained, and a mean life of t̄ = 160 hours is calculated from the sample data.

Time Interval      Number of Failures
0-100                      50
100-200                    18
200-300                    17
300-400                     8
400-500                     4
After 500 hours             3

Use the chi-square goodness-of-fit test to determine whether you consider the exponential distribution to represent a reasonable time-to-failure model for these data.

17-29. Fifty units are placed on a life test for 1000 hours. Eight units fail during the period. Estimate R(1000) for these units. Determine a lower 95% confidence interval on R(1000).

17-30. In Section 17-4.7 it was noted that for one-parameter reliability functions R(t; θ), the maximum likelihood estimator is R̂(t) = R(t; θ̂), where θ̂ is the maximum likelihood estimator of θ. Prove this statement for the case

R(t; θ) = e^{-t/θ},  t ≥ 0,
        = 0,  otherwise.

Hint: Express the density function f in terms of R.

17-31. For a nonreplacement test that is terminated after 200 hours of operation, it is noted that failures occur at the following times: 9, 21, 40, 55, and 85 hours. The units are assumed to have an exponential time-to-failure distribution.

17-32. Use the statement in Exercise 17-31.
(a) Estimate R(300) and construct a 95% lower-confidence limit on R(300).
(b) Estimate the time for which the reliability will be 0.9, and construct a 95% lower limit on t₀.₉.
Chapter 18

Stochastic Processes and Queueing

18-1 INTRODUCTION

The term stochastic process is frequently used in connection with observations from a time-oriented, physical process that is controlled by a random mechanism. More precisely, a stochastic process is a sequence of random variables {X_t}, where t ∈ T is a time or sequence index. The range space for X_t may be discrete or continuous; however, in this chapter we will consider only the case where at a particular time t the process is in exactly one of m + 1 mutually exclusive and exhaustive states. The states are labeled 0, 1, 2, 3, ..., m. The variables X_1, X_2, ... might represent the number of customers awaiting service at a ticket booth at times 1 minute, 2 minutes, and so on, after the booth opens. Another example would be daily demands for a certain product on successive days. X_0 represents the initial state of the process. The chapter will introduce a special type of stochastic process called a Markov process. We will also discuss the Chapman-Kolmogorov equations, various special properties of Markov chains, the birth-death equations, and some applications to waiting-line, or queueing, and interference problems. In the study of stochastic processes, certain assumptions are required about the joint probability distribution of the random variables X_1, X_2, .... In the case of Bernoulli trials, presented in Chapter 5, recall that these variables were defined to be independent and that the range space (state space) consisted of two values (0, 1). Here we will first consider discrete-time Markov chains, the case where time is discrete and the independence assumption is relaxed to allow for a one-stage dependence.
18-2 DISCRETE-TIME MARKOV CHAINS

A stochastic process exhibits the Markovian property if

P{X_{t+1} = j | X_t = i} = P{X_{t+1} = j | X_t = i, X_{t-1} = i_1, X_{t-2} = i_2, ..., X_0 = i_t}   (18-1)

for t = 0, 1, 2, ..., and every sequence j, i, i_1, ..., i_t. This is equivalent to stating that the probability of an event at time t + 1 given only the outcome at time t is equal to the probability of the event at time t + 1 given the entire state history of the system. In other words, the probability of the event at t + 1 is not dependent on the state history prior to time t.
The conditional probabilities

P{X_{t+1} = j | X_t = i} = p_ij   (18-2)

are called one-step transition probabilities, and they are said to be stationary if, for t = 0, 1, 2, ...,

P{X_{t+1} = j | X_t = i} = P{X_1 = j | X_0 = i},   (18-3)
so that the transition probabilities remain unchanged through time. These values may be displayed in a matrix P = [p_ij], called the one-step transition matrix. The matrix P has m + 1 rows and m + 1 columns, and

0 ≤ p_ij ≤ 1,

while

Σ_{j=0}^{m} p_ij = 1,  i = 0, 1, 2, ..., m.

That is, each element of the P matrix is a probability, and each row of the matrix sums to one. The existence of the one-step, stationary transition probabilities implies that
p_ij^{(n)} = P{X_{t+n} = j | X_t = i} = P{X_n = j | X_0 = i}   (18-4)

for all t = 0, 1, 2, .... The values p_ij^{(n)} are called n-step transition probabilities, and they may be displayed in an n-step transition matrix P^{(n)} = [p_ij^{(n)}], where

0 ≤ p_ij^{(n)} ≤ 1,  i = 0, 1, 2, ..., m,  j = 0, 1, 2, ..., m,  n = 0, 1, 2, ...,

and

Σ_{j=0}^{m} p_ij^{(n)} = 1,  i = 0, 1, 2, ..., m,  n = 0, 1, 2, ....

The 0-step transition matrix is the identity matrix. A finite-state Markov chain is defined as a stochastic process having a finite number of states, the Markovian property, stationary transition probabilities, and an initial set of probabilities A^{(0)} = [a_0^{(0)}, a_1^{(0)}, ..., a_m^{(0)}], where a_i^{(0)} = P{X_0 = i}.
The Chapman-Kolmogorov equations are useful in computing n-step transition probabilities. These equations are

p_ij^{(n)} = Σ_{l=0}^{m} p_il^{(v)} · p_lj^{(n-v)},  i = 0, 1, 2, ..., m,  j = 0, 1, 2, ..., m,  0 ≤ v ≤ n,   (18-5)

and they indicate that in passing from state i to state j in n steps the process will be in some state, say l, after exactly v steps (v ≤ n). Therefore p_il^{(v)} · p_lj^{(n-v)} is the conditional probability that, given state i as the starting state, the process goes to state l in v steps and from l to j in (n - v) steps. When summed over l, the sum of the products yields p_ij^{(n)}. By setting v = 1 or v = n - 1, we obtain

p_ij^{(n)} = Σ_{l=0}^{m} p_il · p_lj^{(n-1)} = Σ_{l=0}^{m} p_il^{(n-1)} · p_lj,  i = 0, 1, 2, ..., m,  j = 0, 1, 2, ..., m,  n = 1, 2, ....

It follows that the n-step transition probabilities, p_ij^{(n)}, may be obtained from the one-step probabilities, and

P^{(n)} = P^n.   (18-6)
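Equations 18-5 and 18-6 can be checked numerically with plain matrix arithmetic. The helpers below are ours, and the two-state matrix is merely illustrative.

```python
def mat_mult(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(p, n):
    """P^(n) = P^n (eq. 18-6); P^(0) is the identity matrix."""
    size = len(p)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mult(result, p)
    return result

# Any stochastic matrix works; this one is illustrative.
P = [[0.75, 0.25],
     [0.50, 0.50]]
P4 = mat_power(P, 4)
# Chapman-Kolmogorov check (eq. 18-5 with v = 2): P^(4) = P^(2) P^(2)
P2 = mat_power(P, 2)
ck = mat_mult(P2, P2)
print(P4, ck)
```

Each row of P^(n) remains a probability distribution, as required of an n-step transition matrix.
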
The unconditional probability of being in state j at time t = n is

a_j^{(n)} = Σ_{i=0}^{m} a_i^{(0)} · p_ij^{(n)},  j = 0, 1, 2, ..., m,  n = 1, 2, ...,

where A^{(n)} = [a_0^{(n)}, a_1^{(n)}, ..., a_m^{(n)}]. Thus, A^{(n)} = A · P^{(n)}. Further, we note that the rule for matrix multiplication solves the total probability law of Theorem 1-8, so that

A^{(n)} = A^{(n-1)} · P.
Example 18-1. In a computing system, the probability of an error on each cycle depends on whether or not it was preceded by an error. We will define 0 as the error state and 1 as the nonerror state. Suppose the probability of an error if preceded by an error is 0.75, the probability of an error if preceded by a nonerror is 0.50, the probability of a nonerror if preceded by an error is 0.25, and the probability of a nonerror if preceded by a nonerror is 0.50. Thus,

    P = | 0.75  0.25 |
        | 0.50  0.50 |

If we know that initially the system is in the nonerror state, then a_1^{(0)} = 1, a_0^{(0)} = 0, and A^{(n)} = [a_j^{(n)}] = A · P^{(n)}. Thus, for example, A^{(1)} = [0.50, 0.50], and as n increases, A^{(n)} approaches [0.667, 0.333].
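The successive vectors A^{(n)} in Example 18-1 can be generated with one row-vector multiplication per step; this short sketch shows the convergence claimed above.

```python
# State 0 = error, state 1 = nonerror, as in Example 18-1.
P = [[0.75, 0.25],
     [0.50, 0.50]]
a = [0.0, 1.0]          # A(0): the system starts in the nonerror state

for n in range(1, 9):
    # A(n) = A(n-1) . P  -- one vector-matrix multiplication per step
    a = [sum(a[i] * P[i][j] for i in range(2)) for j in range(2)]
    print(n, [round(x, 4) for x in a])

# a has settled near [2/3, 1/3] = [0.667, 0.333] once n exceeds 5
```
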
18-3 CLASSIFICATION OF STATES AND CHAINS

We will first consider the notion of first passage times. The length of time (number of steps in discrete-time systems) for the process to go from state i to state j for the first time is called the first passage time. If i = j, then this is the number of steps needed for the process to return to state i for the first time, and this is termed the first return time or recurrence time for state i. First passage times under certain conditions are random variables with an associated probability distribution. We let f_ij^{(n)} denote the probability that the first passage time from state i to j is equal to n, where it can be shown directly from Theorem 1-5 that

f_ij^{(1)} = p_ij^{(1)} = p_ij,
f_ij^{(2)} = p_ij^{(2)} - f_ij^{(1)} · p_jj,
...
f_ij^{(n)} = p_ij^{(n)} - f_ij^{(1)} · p_jj^{(n-1)} - f_ij^{(2)} · p_jj^{(n-2)} - ... - f_ij^{(n-1)} · p_jj.   (18-8)
Thus, recursive computation from the one-step transition probabilities yields the probability that the first passage time is n for given i, j.

Example 18-2. Using the one-step transition probabilities presented in Example 18-1, the distribution of first passage time for i = 0, j = 1 is determined recursively from equation 18-8; for instance,

f_01^{(1)} = p_01 = 0.25,
f_01^{(2)} = p_01^{(2)} - f_01^{(1)} · p_11 = 0.3125 - (0.25)(0.50) = 0.1875.

There are four such distributions, corresponding to the i, j values (0, 0), (0, 1), (1, 0), and (1, 1).
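The recursion of equation 18-8, applied in Example 18-2, is mechanical enough to automate. The function below is our sketch; it builds the needed n-step matrices first and then unwinds the recursion.

```python
def first_passage_probs(P, i, j, n_max):
    """f_ij^(n) for n = 1..n_max via the recursion in eq. 18-8."""
    size = len(P)
    # n-step transition matrices P^(1) .. P^(n_max)
    powers = [P]
    for _ in range(n_max - 1):
        prev = powers[-1]
        powers.append([[sum(prev[r][l] * P[l][c] for l in range(size))
                        for c in range(size)] for r in range(size)])
    f = []
    for n in range(1, n_max + 1):
        val = powers[n - 1][i][j]
        for k in range(1, n):
            val -= f[k - 1] * powers[n - k - 1][j][j]   # f^(k) * p_jj^(n-k)
        f.append(val)
    return f

P = [[0.75, 0.25],
     [0.50, 0.50]]
f01 = first_passage_probs(P, 0, 1, 30)
print(f01[:3])   # 0.25, 0.1875, ...
print(sum(f01))  # approaches 1: state 1 is eventually reached
```
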
If i and j are fixed, then Σ_{n=1}^{∞} f_ij^{(n)} ≤ 1. When the sum is equal to one, the values f_ij^{(n)}, n = 1, 2, ..., represent the probability distribution of first passage time for the specific i, j. In the case where a process in state i may never reach state j, Σ_{n=1}^{∞} f_ij^{(n)} < 1. Where i = j and Σ_{n=1}^{∞} f_ii^{(n)} = 1, the state i is termed a recurrent state, since given that the process is in state i it will always eventually return to i. If p_ii = 1 for some state i, then that state is called an absorbing state, and the process will never leave it after it is entered. The state i is called a transient state if

Σ_{n=1}^{∞} f_ii^{(n)} < 1,

since there is a positive probability that, given the process is in state i, it will never return to this state. It is not always easy to classify a state as transient or recurrent, since it is sometimes difficult to calculate first passage time probabilities for all n ≥ 1, as was the case in Example 18-2. Nevertheless, the expected first passage time is

μ_ij = ∞,  if Σ_{n=1}^{∞} f_ij^{(n)} < 1,
μ_ij = Σ_{n=1}^{∞} n · f_ij^{(n)},  if Σ_{n=1}^{∞} f_ij^{(n)} = 1.   (18-9)

When Σ_{n=1}^{∞} f_ij^{(n)} = 1, a simple conditioning argument shows that

μ_ij = 1 + Σ_{l≠j} p_il · μ_lj.   (18-10)

If we take i = j, the expected first passage time is called the expected recurrence time. If μ_ii = ∞ for a recurrent state, it is called null; if μ_ii < ∞, it is called nonnull or positive recurrent. There are no null recurrent states in a finite-state Markov chain. All of the states in such chains are either positive recurrent or transient.
A state is called periodic with period τ > 1 if a return is possible only in τ, 2τ, 3τ, ... steps; so p_ii^{(n)} = 0 for all values of n that are not divisible by τ > 1, and τ is the smallest integer having this property. A state j is termed accessible from state i if p_ij^{(n)} > 0 for some n = 1, 2, .... In our example of the computing system, each state, 0 and 1, is accessible from the other, since p_ij^{(n)} > 0 for all i, j and all n. If state j is accessible from i and state i is accessible from j, then the states are said to communicate. This is the case in Example 18-1. We note that any state communicates with itself. If state i communicates with j, then j also communicates with i. Also, if i communicates with l and l communicates with j, then i also communicates with j.

If the state space is partitioned into disjoint sets (called equivalence classes) of states, where communicating states belong to the same class, then the Markov chain may consist of one or more classes. If there is only one class, so that all states communicate, the Markov chain is said to be irreducible. The chain represented by Example 18-1 is thus also irreducible. For finite-state Markov chains, the states of a class are either all positive recurrent or all transient. In many applications, the states will all communicate. This is the case if there is a value of n for which p_ij^{(n)} > 0 for all values of i and j.

If state i in a class is aperiodic (not periodic), and if the state is also positive recurrent, then the state is said to be ergodic. An irreducible Markov chain is ergodic if all of its states are ergodic. In the case of such Markov chains the distribution

A^{(n)} = A · P^n

converges as n → ∞, and the limiting distribution is independent of the initial probabilities, A. In Example 18-1, this was clearly observed to be the case, and after five steps (n > 5), P{X_t = 0} = 0.667 and P{X_t = 1} = 0.333 when three significant figures are used. In general, for irreducible, ergodic Markov chains,

lim_{n→∞} p_ij^{(n)} = lim_{n→∞} a_j^{(n)} = p_j,

and, furthermore, these values p_j are independent of i. These "steady state" probabilities, p_j, satisfy the following state equations:

p_j > 0,  j = 0, 1, 2, ..., m,   (18-11a)

Σ_{j=0}^{m} p_j = 1,   (18-11b)

p_j = Σ_{i=0}^{m} p_i · p_ij,  j = 0, 1, 2, ..., m.   (18-11c)

Since there are m + 2 equations in 18-11b and 18-11c, and since there are m + 1 unknowns, one of the equations is redundant. Therefore, we will use m of the m + 1 equations in equation 18-11c together with equation 18-11b.
Example 18-3. In the case of the computing system presented in Example 18-1, we have, from equations 18-11b and 18-11c,

1 = p_0 + p_1,
p_0 = p_0(0.75) + p_1(0.50),
or

p_0 = 2/3 and p_1 = 1/3,

which agrees with the limiting result for n > 5 in Example 18-1.
The steady state probabilities and the mean recurrence times for irreducible, ergodic Markov chains have a reciprocal relationship,

μ_jj = 1/p_j,  j = 0, 1, 2, ..., m.   (18-12)

In Example 18-3, note that μ_00 = 1/p_0 = 1.5 and μ_11 = 1/p_1 = 3.
Example 18-4. The mood of a corporate president is observed over a period of time by a psychologist in the operations research department. Being inclined toward mathematical modeling, the psychologist classifies mood into three states as follows:

0: Good (cheerful)
1: Fair (so-so)
2: Poor (glum and depressed)

The psychologist observes that mood changes occur only overnight; thus, the data allow estimation of the transition probabilities

    P = | 0.6  0.2  0.2 |
        | 0.3  0.4  0.3 |
        | 0.0  0.3  0.7 |

The equations

p_0 = 0.6p_0 + 0.3p_1 + 0p_2,
p_1 = 0.2p_0 + 0.4p_1 + 0.3p_2,
1 = p_0 + p_1 + p_2

are solved simultaneously for the steady state probabilities

p_0 = 3/13,  p_1 = 4/13,  p_2 = 6/13.

Given that the president is in a bad mood, that is, state 2, the mean time required to return to that state is μ_22, where

μ_22 = 1/p_2 = 13/6 days.
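The simultaneous solution in Example 18-4 extends to any number of states. The sketch below (our helper, not the book's) replaces one redundant balance equation of 18-11c with the normalizing equation 18-11b and solves the resulting system by Gaussian elimination.

```python
def steady_state(P):
    """Solve p = pP with sum(p) = 1, replacing one redundant balance
    equation with the normalizing equation (eqs. 18-11b, 18-11c)."""
    m = len(P)
    # Build (P^T - I) p = 0, then overwrite the last row with sum(p) = 1.
    A = [[P[i][j] - (1.0 if i == j else 0.0) for i in range(m)]
         for j in range(m)]
    b = [0.0] * m
    A[m - 1] = [1.0] * m
    b[m - 1] = 1.0
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    p = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(A[r][c] * p[c] for c in range(r + 1, m))
        p[r] = (b[r] - s) / A[r][r]
    return p

# Example 18-4's mood chain: expect [3/13, 4/13, 6/13].
P = [[0.6, 0.2, 0.2],
     [0.3, 0.4, 0.3],
     [0.0, 0.3, 0.7]]
p = steady_state(P)
print(p)
print(1 / p[2])   # mean recurrence time for state 2: 13/6 days (eq. 18-12)
```
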
As noted earlier, if p_kk = 1, state k is called an absorbing state, and the process remains in state k once that state is reached. In this case, b_ik is called the absorption probability, which is the conditional probability of absorption into state k given state i. Mathematically, we have

b_ik = Σ_{j=0}^{m} p_ij · b_jk,  i = 0, 1, 2, ..., m,   (18-13)

where

b_kk = 1

and

b_ik = 0  for i recurrent, i ≠ k.
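Equation 18-13 yields a linear system in the unknown absorption probabilities. The four-state walk below is our illustrative example, not one from the text; states 0 and 3 absorb, and the transient states move down with probability 0.6 and up with probability 0.4.

```python
# Absorption into k = 0 (eq. 18-13), with b_00 = 1 and b_30 = 0:
#   b_10 = 0.6 * b_00 + 0.4 * b_20
#   b_20 = 0.6 * b_10 + 0.4 * b_30
# Substituting the second equation into the first: b_10 = 0.6 + 0.24 b_10.
b10 = 0.6 / (1 - 0.24)
b20 = 0.6 * b10
print(b10, b20)   # about 0.7895 and 0.4737

# Cross-check by iterating eq. 18-13 to its fixed point.
x, y = 0.0, 0.0
for _ in range(200):
    x, y = 0.6 + 0.4 * y, 0.6 * x
print(x, y)
```
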
18-4 CONTINUOUS-TIME MARKOV CHAINS

If the time parameter is continuous rather than a discrete index, as assumed in the previous sections, the Markov chain is called a continuous-parameter chain. It is customary to use a slightly different notation for continuous-parameter Markov chains, namely X(t) = X_t, where {X(t)}, t ≥ 0, will be considered to have states 0, 1, ..., m. The discrete nature of the state space [range space for X(t)] is thus maintained, and

p_ij(t) = P{X(t + s) = j | X(s) = i},  i = 0, 1, 2, ..., m,  j = 0, 1, 2, ..., m,  s ≥ 0,  t ≥ 0,

is the stationary transition probability function. It is noted that these probabilities are not dependent on s but only on t for a specified i, j pair of states. Furthermore, at time t = 0, the function is continuous with

lim_{t→0} p_ij(t) = 1 if i = j, and 0 if i ≠ j.
There is a direct correspondence between the discrete ..time and continuous-time models, The Chapman-Kolmogorov equations become
Pij(t)
'" 2::Pe(v), p/j(t-v)
(18-14)
1=0
for 0:::;; v S; t, and for the specified state pair i,j and time t. If there are times t1 and '2 such that Plj(t,) " 0 and Pj,{r,) > 0, then states i andj are said to communicate. Once agab states that communicate form an equivalence claSs. and where the chaL, is irreducible (all states form a single class) p,P) > 0,
for eacb state pair iJ We also have the property that
for t > 0,
562
Chapter 18 Stochastic Processes and Queueing where Pi exists and is independent of the initial state probability vector A, The values Pj a..--e again called the steady state probabilities and they satisfy
Pj >0,
j = 0.1.2....,m,
m
PI
= '\' v·· p .. (t). £..,.<1 1)
,
j = O,1.2,.,.,m,
t;;'
O.
f",C
The intensity of transition, given that the state is j, is defined as

u_j = lim_{Δt→0} [1 − p_jj(Δt)] / Δt,          (18-15)

where the limit exists and is finite. Likewise, the intensity of passage from state i to state j, given that the system is in state i, is

u_ij = lim_{Δt→0} p_ij(Δt) / Δt,    i ≠ j,          (18-16)

again where the limit exists and is finite. The interpretation of the intensities is that they represent an instantaneous rate of transition from state i to state j. For a small Δt, p_ij(Δt) ≈ u_ij·Δt + o(Δt), where o(Δt)/Δt → 0 as Δt → 0, so that u_ij is a proportionality constant by which p_ij(Δt) is proportional to Δt as Δt → 0. The transition intensities also satisfy the balance equations

p_j · u_j = Σ_{i≠j} p_i · u_ij,    j = 0, 1, 2, ..., m.          (18-17)

These equations indicate that in steady state, the rate of transition out of state j is equal to the rate of transition into j.
An electronic control mechanism for a chemical process is constructed with two identical modules, operating as a parallel, active redundant pair. The function of at least one module is necessary for the mechanism to operate. The maintenance shop has two identical repair stations for these modules and, furthermore, when a module fails and enters the shop, other work is moved aside and repair work is immediately initiated. The "system" here consists of the mechanism and repair facility, and the states are as follows:

0: Both modules operating
1: One unit operating and one unit in repair
2: Two units in repair (mechanism down)

The random variable representing time to failure for a module has an exponential density, say

f(t) = λe^{−λt},    t ≥ 0,
     = 0,           t < 0,

and the random variable describing repair time at a repair station also has an exponential density, say

r(t) = μe^{−μt},    t ≥ 0,
     = 0,           t < 0.

Interfailure and interrepair times are independent, and {X(t)} can be shown to be a continuous-parameter, irreducible Markov chain with transitions only from a state to its neighbor states: 0 → 1, 1 → 0, 1 → 2, 2 → 1. Of course, there may be no state change. The transition intensities are

u_01 = 2λ,    u_10 = μ,    u_12 = λ,    u_21 = 2μ,

and since p_0 + p_1 + p_2 = 1, some algebra gives

p_0 = μ²/(λ + μ)²,    p_1 = 2λμ/(λ + μ)²,    p_2 = λ²/(λ + μ)².

The system availability (probability that the mechanism is up) in the steady state condition is thus

Availability = 1 − p_2 = 1 − λ²/(λ + μ)².
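A quick numerical cross-check of these closed-form results is possible by solving the balance equations directly. The Python sketch below uses hypothetical rates λ = 0.5 and μ = 2.0, chosen only for illustration; Q is the intensity (generator) matrix whose off-diagonal entries are the u_ij.

```python
import numpy as np

lam, mu = 0.5, 2.0   # hypothetical failure and repair rates

# Intensity matrix for states 0, 1, 2; rows sum to zero.
Q = np.array([[-2*lam,      2*lam,    0.0],
              [    mu, -(lam+mu),    lam],
              [   0.0,      2*mu,  -2*mu]])

# Steady state: p Q = 0 with sum(p) = 1.
A = np.vstack([Q.T[:-1], np.ones(3)])
p = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))

closed_form = [mu**2/(lam+mu)**2, 2*lam*mu/(lam+mu)**2, lam**2/(lam+mu)**2]
print(p, closed_form)   # the two vectors agree
print(1 - p[2])         # steady state availability
```

The agreement of the two printed vectors confirms the algebra above for this choice of rates.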
The matrix of transition probabilities for time increment Δt may be expressed as

                 [1 − u_0Δt   u_01Δt      ...   u_0jΔt   ...   u_0mΔt   ]
                 [u_10Δt      1 − u_1Δt   ...   u_1jΔt   ...   u_1mΔt   ]
P = [p_ij(Δt)] = [  ...                                                 ]          (18-18)
                 [u_i0Δt      u_i1Δt      ...   u_ijΔt   ...   u_imΔt   ]
                 [  ...                                                 ]
                 [u_m0Δt      u_m1Δt      ...   u_mjΔt   ...   1 − u_mΔt]

and

p_j(t + Δt) = Σ_{i=0}^{m} p_i(t) · p_ij(Δt),    j = 0, 1, 2, ..., m,          (18-19)

where p_j(t) = P[X(t) = j].

From the jth equation in the m + 1 equations of equation 18-19,

p_j(t + Δt) = p_0(t)·u_0jΔt + ··· + p_i(t)·u_ijΔt + ··· + p_j(t)·[1 − u_jΔt] + ··· + p_m(t)·u_mjΔt,

which may be rewritten as

(d/dt)p_j(t) = lim_{Δt→0} [p_j(t + Δt) − p_j(t)]/Δt = −u_j·p_j(t) + Σ_{i≠j} u_ij·p_i(t).          (18-20)

The resulting system of differential equations is

p′_j(t) = −u_j·p_j(t) + Σ_{i≠j} u_ij·p_i(t),    j = 0, 1, 2, ..., m,          (18-21)

which may be solved when m is finite, given initial conditions (probabilities) A and using Σ_{j=0}^{m} p_j(t) = 1. The solution

[p_0(t), p_1(t), ..., p_m(t)] = p(t)

presents the state probabilities as a function of time, in the same manner that A·P^n presented state probabilities as a function of the number of transitions, n, given an initial condition vector A in the discrete-time model. The solution to equations 18-21 may be somewhat difficult to obtain, and in general practice, transformation techniques are employed.
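When a closed-form or transform solution is inconvenient, equations 18-21 can be evaluated numerically. The sketch below (illustrative only, with hypothetical rates λ = 0.5 and μ = 2.0 for the two-module example) exploits the same construction as equation 18-18: for a fine step Δt = t/n, the discrete-time matrix I + QΔt is applied n times, so A·(I + QΔt)^n approximates p(t).

```python
import numpy as np

lam, mu = 0.5, 2.0   # hypothetical failure and repair rates
Q = np.array([[-2*lam,      2*lam,    0.0],
              [    mu, -(lam+mu),    lam],
              [   0.0,      2*mu,  -2*mu]])

A = np.array([1.0, 0.0, 0.0])    # initial condition: both modules up

def p_of_t(t, n=200000):
    """Approximate p(t) = A·exp(Qt) by the discrete-time chain with
    step dt = t/n, i.e. A·(I + Q·dt)^n, as in equation 18-18."""
    step = np.eye(3) + Q * (t / n)
    return A @ np.linalg.matrix_power(step, n)

print(p_of_t(0.5))
print(p_of_t(10.0))   # approaches the steady state (0.64, 0.32, 0.04)
```

For large t the transient terms die out, and the printed vector matches the steady state probabilities obtained from the balance equations.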
18-5 THE BIRTH-DEATH PROCESS IN QUEUEING

The major application of the so-called birth-death process that we will study is in queueing or waiting-line theory. Here birth will refer to an arrival and death to a departure from a physical system, as shown in Fig. 18-1.

Queueing theory is the mathematical study of queues or waiting lines, which occur in a variety of problem environments. There is an input process or "calling population," from which arrivals are drawn, and a queueing system, which in Fig. 18-1 consists of the queue and service facility. The calling population may be finite or infinite. Arrivals occur in a probabilistic manner. A common assumption is that the interarrival times are exponentially distributed. The queue is generally classified according to whether its capacity is infinite or finite, and the service discipline refers to the order in which the customers in the queue are served. The service mechanism consists of one or more servers, and the elapsed service time is commonly called the holding time. The following notation will be employed:

X(t) = Number of customers in system at time t
States = 0, 1, 2, ..., j, j + 1, ...
s = Number of servers
p_j(t) = P{X(t) = j | A}
p_j = lim_{t→∞} p_j(t)
λ_n = Arrival rate given that n customers are in the system
μ_n = Service rate given that n customers are in the system

The birth-death process can be used to describe how X(t) changes through time. It will be assumed here that when X(t) = j, the probability distribution of the time to the next birth (arrival) is exponential with parameter λ_j, j = 0, 1, 2, .... Furthermore, given X(t) = j, the remaining time to the next service completion (death) is taken to be exponential with parameter μ_j, j = 1, 2, .... Poisson-type postulates are assumed to hold, so that the probability of more than one birth or death at the same instant is zero.
Figure 18-1 A simple queueing system: arrivals drawn from the input process (calling population) join the queue, are served at the service facility, and then depart.
A transition diagram is shown in Fig. 18-2. The transition matrix corresponding to equation 18-18 is

    [1 − λ_0Δt   λ_0Δt             0                 0                 ...]
    [μ_1Δt       1 − (λ_1+μ_1)Δt   λ_1Δt             0                 ...]
P = [0           μ_2Δt             1 − (λ_2+μ_2)Δt   λ_2Δt             ...]          (18-22)
    [0           0                 μ_3Δt             1 − (λ_3+μ_3)Δt   ...]
    [...                                                                  ]

We note that p_ij(Δt) = 0 for j < i − 1 or j > i + 1. Furthermore, the transition intensities and intensities of passage shown in equation 18-17 are

u_0 = λ_0,
u_j = λ_j + μ_j    for j = 1, 2, ...,

and

u_ij = λ_i    for j = i + 1,
     = μ_i    for j = i − 1,
     = 0      for j < i − 1 or j > i + 1.

The fact that the transition intensities and intensities of passage are constant with time is important in the development of this model. The nature of transition can be viewed to be specified by assumption, or it may be considered as a result of the prior assumption about the distribution of time between occurrences (births and deaths).
Figure 18-2 Transition diagram for the birth-death process.
The assumptions of independent, exponentially distributed service times and independent, exponentially distributed interarrival times yield transition intensities that are constant in time. This was also observed in the development of the Poisson and exponential distributions in Chapters 5 and 6.

The methods used in equations 18-19 through 18-21 may be used to formulate an infinite set of differential state equations from the transition matrix of equation 18-22. Thus, the time-dependent behavior is described in the following equations:

p′_0(t) = −λ_0·p_0(t) + μ_1·p_1(t),
p′_j(t) = −(λ_j + μ_j)·p_j(t) + λ_{j−1}·p_{j−1}(t) + μ_{j+1}·p_{j+1}(t),    j = 1, 2, ...,          (18-23)

with

Σ_{j=0}^{∞} p_j(t) = 1    and    A = [p_0(0), p_1(0), ..., p_j(0), ...].          (18-24)

In the steady state (t → ∞), we have p′_j(t) = 0, so the steady state equations are obtained from equations 18-23 and 18-24:

0 = −λ_0·p_0 + μ_1·p_1,
0 = −(λ_j + μ_j)·p_j + λ_{j−1}·p_{j−1} + μ_{j+1}·p_{j+1},    j = 1, 2, ....          (18-25)

Equations 18-25 could have also been determined by the direct application of equation 18-17, which provides a "rate balance" or "intensity balance." Solving equations 18-25, we obtain

p_1 = (λ_0/μ_1)·p_0,    p_2 = (λ_1λ_0 / μ_2μ_1)·p_0,    ...,    p_j = (λ_{j−1}λ_{j−2}···λ_0 / μ_jμ_{j−1}···μ_1)·p_0,    ....

If we let

C_j = (λ_{j−1}·λ_{j−2}···λ_0) / (μ_j·μ_{j−1}···μ_1),    j = 1, 2, ...,          (18-26)

then

p_j = C_j·p_0,    j = 1, 2, ...,

and since

p_0 + Σ_{j=1}^{∞} p_j = 1,          (18-27)

we obtain

p_0 = 1 / (1 + Σ_{j=1}^{∞} C_j).

These steady state results assume that the λ_j, μ_j values are such that a steady state can be reached. This will be true if λ_j = 0 for j > k, so that there are a finite number of states. It is also true if ρ = λ/(sμ) < 1, where λ and μ are constant and s denotes the number of servers. The steady state will not be reached if Σ_{j=1}^{∞} C_j = ∞.
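The products in equations 18-26 and 18-27 are straightforward to evaluate for any finite set of rates. The sketch below is illustrative; as a check, it is applied to the two-module repair example of Section 18-4, which has birth rates (2λ, λ) and death rates (μ, 2μ), here with the hypothetical values λ = 0.5 and μ = 2.0.

```python
def birth_death_steady_state(lam, mu):
    """Steady state probabilities of a finite birth-death chain.

    lam[j] is the birth rate in state j (j = 0..m-1) and mu[j-1] is
    the death rate in state j (j = 1..m), so that equation 18-26 gives
    C_j = (lam[j-1]*...*lam[0]) / (mu[j-1]*...*mu[0]).
    """
    C = [1.0]
    for l, m in zip(lam, mu):
        C.append(C[-1] * l / m)
    p0 = 1.0 / sum(C)                 # equation 18-27
    return [p0 * c for c in C]

# Two-module example: birth rates (2*0.5, 0.5), death rates (2.0, 2*2.0)
p = birth_death_steady_state([1.0, 0.5], [2.0, 4.0])
print(p)   # approximately [0.64, 0.32, 0.04]
```

The result matches the closed-form probabilities μ²/(λ+μ)², 2λμ/(λ+μ)², λ²/(λ+μ)² for these rates.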
18-6 CONSIDERATIONS IN QUEUEING MODELS

When the arrival rate λ_j is constant for all j, the constant is denoted λ. Similarly, when the service rate per busy server is constant, it will be denoted μ, so that μ_j = sμ if j ≥ s and μ_j = jμ if j < s. The exponential distributions

a(t) = λe^{−λt},    t ≥ 0,
     = 0,           t < 0,

and

r(t) = μe^{−μt},    t ≥ 0,
     = 0,           t < 0,

for interarrival times and service times in a busy channel produce rates λ and μ, which are constant. The mean interarrival time is 1/λ, and the mean time for a busy channel to complete service is 1/μ.

A special set of notation has been widely employed in the steady state analysis of queueing systems. This notation is given in the following list:

L = Σ_{j=0}^{∞} j·p_j = Expected number of customers in the queueing system
L_q = Σ_{j=s}^{∞} (j − s)·p_j = Expected queue length
W = Expected time in the system (including service time)
W_q = Expected waiting time in the queue (excluding service time)

If λ is constant for all j, then it has been shown that

L = λW    and    L_q = λW_q.          (18-28)

(These results are special cases of what is known as Little's law.) If the λ_j are not equal, λ̄ replaces λ, where

λ̄ = Σ_{j=0}^{∞} λ_j·p_j.          (18-29)

The system utilization coefficient ρ = λ/(sμ) is the fraction of time that the servers are busy. In the case where the mean service time is 1/μ for all j ≥ 1,

W = W_q + 1/μ.          (18-30)

The birth-death process rates λ_0, λ_1, ..., λ_j, ... and μ_1, μ_2, ..., μ_j, ... may be assigned any positive values as long as the assignment leads to a steady state solution. This allows considerable flexibility in using the results given in equation 18-27. The specific models subsequently presented will differ in the manner in which λ_j and μ_j vary as a function of j.
18-7 BASIC SINGLE-SERVER MODEL WITH CONSTANT RATES

We will now consider the case where s = 1, that is, a single server. We will also assume an unlimited potential queue length with exponential interarrivals having a constant parameter λ, so that λ_0 = λ_1 = ··· = λ. Furthermore, service times will be assumed to be independent and exponentially distributed with μ_1 = μ_2 = ··· = μ. We will assume λ < μ. As a result of equation 18-26, we have

C_j = (λ/μ)^j = ρ^j,    j = 1, 2, 3, ...,          (18-31)

and from equation 18-27,

p_j = ρ^j·p_0,    j = 1, 2, 3, ...,

p_0 = 1 / (1 + Σ_{j=1}^{∞} ρ^j) = 1 − ρ.          (18-32)

Thus, the steady state equations are

p_j = (1 − ρ)ρ^j,    j = 0, 1, 2, ....          (18-33)

Note that the probability that there are j customers in the system, p_j, is given by a geometric distribution with parameter ρ. The mean number of customers in the system, L, is determined as

L = Σ_{j=0}^{∞} j·(1 − ρ)ρ^j = ρ/(1 − ρ).          (18-34)
And the expected queue length is

L_q = Σ_{j=1}^{∞} (j − 1)·p_j = L − (1 − p_0) = ρ²/(1 − ρ).          (18-35)

Using equations 18-28 and 18-34, we find that the expected waiting time in the system is

W = L/λ = 1/(μ − λ),          (18-36)

and the expected waiting time in the queue is

W_q = L_q/λ = λ/[μ(μ − λ)].          (18-37)
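These four measures can be collected into a few lines of code. The Python sketch below is illustrative; the rates λ = 2 and μ = 3 are arbitrary values chosen for the example, and the printed results can be checked against Little's law, L = λW.

```python
def mm1_metrics(lam, mu):
    """Steady state measures for the single-server constant-rate model."""
    if lam >= mu:
        raise ValueError("no steady state: lam must be less than mu")
    rho = lam / mu
    L = rho / (1 - rho)              # eq. 18-34
    Lq = rho**2 / (1 - rho)          # eq. 18-35
    W = 1 / (mu - lam)               # eq. 18-36
    Wq = lam / (mu * (mu - lam))     # eq. 18-37
    return L, Lq, W, Wq

L, Lq, W, Wq = mm1_metrics(lam=2.0, mu=3.0)
print(L, Lq, W, Wq)   # 2.0, 4/3, 1.0, 2/3
```

Here ρ = 2/3, so L = 2 customers and the mean time in the system is W = 1 time unit, consistent with L = λW.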
These results could have been developed directly from the distributions of time in the system and time in the queue, respectively. Since the exponential distribution reflects a memoryless process, an arrival finding j units in the system will wait through j + 1 services, including its own, and thus its waiting time T_{j+1} is the sum of j + 1 independent, exponentially distributed random variables. This random variable was shown in Chapter 6 to have a gamma distribution. This is a conditional density, given that the arrival finds j units in the system. Thus, if S represents time in the system,

P(S > w) = Σ_{j=0}^{∞} p_j·P(T_{j+1} > w)
         = Σ_{j=0}^{∞} (1 − ρ)ρ^j · ∫_w^∞ [μ^{j+1} t^j e^{−μt} / Γ(j + 1)] dt
         = ∫_w^∞ (1 − ρ)μ e^{−μt} Σ_{j=0}^{∞} [(ρμt)^j / j!] dt
         = e^{−μ(1−ρ)w},    w ≥ 0.          (18-38)

If we let S_q represent time in the queue, excluding service time, then

P(S_q = 0) = p_0 = 1 − ρ.
If we take T_j as the sum of j service times, as in the previous manipulations, T_j will again have a gamma distribution. Then,

P(S_q > w_q) = Σ_{j=1}^{∞} p_j·P(T_j > w_q)
             = Σ_{j=1}^{∞} (1 − ρ)ρ^j·P(T_j > w_q)          (18-39)
             = ρe^{−μ(1−ρ)w_q},    w_q > 0,
             = 0,                  otherwise,

and, upon differentiation, we find the density of time in the queue for w_q > 0 to be λ(1 − ρ)e^{−μ(1−ρ)w_q}. Thus, the probability distribution is

g(w_q) = 1 − ρ,                      w_q = 0,
       = λ(1 − ρ)e^{−μ(1−ρ)w_q},     w_q > 0,          (18-40)

which was noted in Section 2-2 as being for a mixed-type random variable (in equation 2-2, G ≠ 0 and H ≠ 0). The expected waiting time in the queue, W_q, could be determined directly from this distribution as

W_q = (1 − ρ)·0 + ∫_0^∞ w_q·λ(1 − ρ)e^{−μ(1−ρ)w_q} dw_q = λ/[μ(μ − λ)].          (18-41)

When λ ≥ μ, the summation of the terms ρ^j in equation 18-32 diverges. In this case, there is no steady state solution, since the steady state is never reached; that is, the queue would grow without bound.
18-8 SINGLE SERVER WITH LIMITED QUEUE LENGTH

If the queue is limited so that at most N units can be in the system, and if the exponential service times and exponential interarrival times are retained from the prior model, we have

λ_0 = λ_1 = ··· = λ_{N−1} = λ,    λ_j = 0,  j ≥ N,

and

μ_1 = μ_2 = ··· = μ_N = μ.

It follows from equation 18-26 that

C_j = ρ^j,    j ≤ N,
    = 0,      j > N.          (18-42)

Thus,

p_j = ρ^j·p_0,    j = 0, 1, 2, ..., N,

so that

p_0·Σ_{j=0}^{N} ρ^j = 1

and

p_0 = (1 − ρ)/(1 − ρ^{N+1}).          (18-43)

As a result, the steady state equations are given by

p_j = ρ^j·(1 − ρ)/(1 − ρ^{N+1}),    j = 0, 1, 2, ..., N.          (18-44)

The mean number of customers in the system in this case is

L = Σ_{j=0}^{N} j·p_j = ρ[1 − (N + 1)ρ^N + Nρ^{N+1}] / [(1 − ρ)(1 − ρ^{N+1})].          (18-45)

The mean number of customers in the queue is

L_q = Σ_{j=1}^{N} (j − 1)·p_j = Σ_{j=1}^{N} j·p_j − Σ_{j=1}^{N} p_j = L − (1 − p_0).          (18-46)

The mean time in the system is found as

W = L/λ̄,          (18-47)

and the mean time in the queue is

W_q = L_q/λ̄ = W − 1/μ,          (18-48)

where L is given by equation 18-45 and λ̄ = λ(1 − p_N) is the effective arrival rate of equation 18-29, since arrivals finding the system full are turned away.
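A short sketch makes these limited-queue formulas concrete. For illustration it borrows the rates of Exercise 18-10 (15 arrivals and 21 services per hour, N = 3 spaces); the numbers here are only a demonstration of the formulas, not a worked solution of that exercise.

```python
def mm1N_measures(lam, mu, N):
    """Steady state measures for the single-server limited-queue model
    (equations 18-43 through 18-46); assumes rho != 1."""
    rho = lam / mu
    p0 = (1 - rho) / (1 - rho**(N + 1))           # eq. 18-43
    p = [p0 * rho**j for j in range(N + 1)]       # eq. 18-44
    L = sum(j * pj for j, pj in enumerate(p))     # direct form of eq. 18-45
    Lq = L - (1 - p0)                             # eq. 18-46
    lam_bar = lam * (1 - p[N])                    # effective arrival rate
    return p, L, Lq, lam_bar

p, L, Lq, lam_bar = mm1N_measures(lam=15, mu=21, N=3)
print(L, Lq, L / lam_bar)   # mean number in system, in queue, and W = L/lam_bar
```

Summing j·p_j directly and evaluating the closed form of equation 18-45 give the same L, which is a useful self-check when coding these results.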
18-9 MULTIPLE SERVERS WITH AN UNLIMITED QUEUE

We now consider the case where there are multiple servers. We also assume that the queue is unlimited and that exponential assumptions hold for interarrival times and service times. In this case, we have

λ_j = λ,    j = 0, 1, 2, ...,          (18-49)

and

μ_j = jμ    for j ≤ s,
    = sμ    for j > s.

Thus, defining φ = λ/μ, we have

C_j = φ^j / j!,               j ≤ s,
    = φ^j / (s!·s^{j−s}),     j > s.          (18-50)

It follows from equation 18-27 that the state equations are developed as

p_j = (φ^j / j!)·p_0,               j ≤ s,
    = (φ^j / (s!·s^{j−s}))·p_0,     j > s,

p_0 = [Σ_{j=0}^{s−1} φ^j/j! + φ^s/(s!(1 − ρ))]^{−1},          (18-51)

where ρ = λ/(sμ) = φ/s is the utilization coefficient, assuming ρ < 1. The value L_q, representing the mean number of units in the queue, is developed as follows:

L_q = Σ_{j=s}^{∞} (j − s)·p_j = p_0·φ^s·ρ / [s!(1 − ρ)²].          (18-52)

Then

W_q = L_q / λ,          (18-53)

and

W = W_q + 1/μ,          (18-54)

so that

L = λW = L_q + φ.          (18-55)
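The chain of results 18-51 through 18-55 translates directly into code. The sketch below is illustrative; the example call uses the setting of Exercise 18-11 (s = 3 secretaries, 20 jobs per 480-minute day, 40-minute mean service, all rates per minute) purely as sample inputs, leaving the exercise itself to the reader.

```python
from math import factorial

def mms_measures(lam, mu, s):
    """Steady state measures for the multiple-server, unlimited-queue
    model (equations 18-51 through 18-55)."""
    phi = lam / mu
    rho = phi / s
    if rho >= 1:
        raise ValueError("no steady state: need lam < s*mu")
    p0 = 1.0 / (sum(phi**j / factorial(j) for j in range(s))
                + phi**s / (factorial(s) * (1 - rho)))       # eq. 18-51
    Lq = p0 * phi**s * rho / (factorial(s) * (1 - rho)**2)   # eq. 18-52
    Wq = Lq / lam                                            # eq. 18-53
    W = Wq + 1 / mu                                          # eq. 18-54
    L = Lq + phi                                             # eq. 18-55
    return p0, L, Lq, W, Wq

print(mms_measures(lam=20/480, mu=1/40, s=3))
```

Setting s = 1 recovers the single-server results of Section 18-7, which is a convenient consistency check.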
18-10 OTHER QUEUEING MODELS

There are numerous other queueing models that can be developed from the birth-death process. In addition, it is also possible to develop queueing models for situations involving nonexponential distributions. One useful result, given without proof, is for a single-server system having exponential interarrivals and an arbitrary service time distribution with mean 1/μ and variance σ². If ρ = λ/μ < 1, then steady state measures are given by equations 18-56:

p_0 = 1 − ρ,
L_q = (λ²σ² + ρ²) / [2(1 − ρ)],
L = ρ + L_q,          (18-56)

with W_q = L_q/λ and W = W_q + 1/μ following from equations 18-28 and 18-30. In the case where service times are constant at 1/μ, the foregoing relationships yield the measures of system performance by taking the variance σ² = 0.
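The following sketch evaluates the L_q expression of equations 18-56 for two service-time variances (rates λ = 2, μ = 3 are arbitrary illustrative values). With exponential service, σ² = 1/μ², the result reduces to the Section 18-7 value ρ²/(1 − ρ); with constant service, σ² = 0, the queue length is exactly half as large.

```python
def mg1_Lq(lam, mu, var):
    """Mean queue length for exponential arrivals and a general
    service distribution (equations 18-56); requires lam/mu < 1."""
    rho = lam / mu
    return (lam**2 * var + rho**2) / (2 * (1 - rho))

lam, mu = 2.0, 3.0
print(mg1_Lq(lam, mu, var=1/mu**2))   # exponential service: 4/3
print(mg1_Lq(lam, mu, var=0.0))       # constant service: 2/3
```

This comparison illustrates a general point of the formula: reducing service-time variability reduces congestion even when the mean service time is unchanged.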
18-11 SUMMARY

This chapter introduced the notion of discrete-state space stochastic processes for discrete-time and continuous-time orientations. The Markov process was developed, along with the presentation of state properties and characteristics. This was followed by a presentation of the birth-death process and several important applications to queueing models for the description of waiting-time phenomena.
18-12 EXERCISES

18-1. A shoe repair shop in a suburban mall has one shoesmith. Shoes are brought in for repair and arrive according to a Poisson process with a constant arrival rate of two pairs per hour. The repair time distribution is exponential with a mean of 20 minutes, and there is independence between the repair and arrival processes. Consider a pair of shoes to be the unit to be served, and do the following:
(a) In the steady state, find the probability that the number of pairs of shoes in the system exceeds 5.
(b) Find the mean number of pairs in the shop and the mean number of pairs waiting for service.
(c) Find the mean turnaround time for a pair of shoes (time in the shop waiting plus repair, but excluding time waiting to be picked up).

18-2. Weather data are analyzed for a particular locality, and a Markov chain is employed as a model for weather change as follows. The conditional probability of change from rain to clear weather in one day is 0.3. Likewise, the conditional probability of transition from clear to rain in one day is 0.1. The model is to be a discrete-time model, with transitions occurring only between days.
(a) Determine the matrix P of one-step transition probabilities.
(b) Find the steady state probabilities.
(c) If today is clear, find the probability that it will be clear exactly 3 days hence.
(d) Find the probability that the first passage from a clear day to a rainy day occurs in exactly 2 days, given a clear day is the initial state.
(e) What is the mean recurrence time for the rainy day state?

18-3. A communication link transmits binary characters, {0, 1}. There is a probability p that a transmitted character will be received correctly by a receiver, which then transmits to another link, etc. If X_0 is the initial character and X_1 is the character received after the first transmission, X_2 after the second, etc., then with independence {X_n} is a Markov chain. Find the one-step and steady state transition matrices.

18-4. Consider a two-component active redundancy where the components are identical and the time-to-failure distributions are exponential. When both units are operating, each carries load L_1 and each has failure rate λ. However, when one unit fails, the load carried by the other component is L_2, and its failure rate under this load is (1.5)λ. There is only one repair facility available, and repair time is exponentially distributed with mean 1/μ. The system is considered failed when both components are in the failed state. Both components are initially operating. Assume that μ > (1.5)λ. Let the states be as follows:
0: No components are failed.
1: One component is failed and is in repair.
2: Two components are failed, one is in repair, one is waiting, and the system is in the failed condition.
(a) Determine the matrix P of transition probabilities associated with interval Δt.
(b) Determine the steady state probabilities.
(c) Write the system of differential equations that presents the transient or time-dependent relationships for transition.

18-5. A communication satellite is launched via a booster system that has a discrete-time guidance control system. Course correction signals form a sequence {X_n}, where the state space for X is as follows:
0: No correction required.
1: Minor correction required.
2: Major correction required.
3: Abort and system destruct.
If {X_n} can be modeled as a Markov chain with one-step transition matrix as

    [·   0   ·  ·]
P = [·  1/6  ·  ·],
    [·  2/3  ·  ·]
    [·   0   ·  ·]

do the following:
(a) Show that states 0 and 3 are absorbing states.
(b) If the initial state is state 1, compute the steady state probability that the system is in state 0.
(c) If the initial probabilities are A = (0, 1/2, 1/2, 0), compute the steady state probability p_0.
(d) Repeat (c) with A = (1/4, 1/4, 1/4, 1/4).

18-6. A gambler bets $1 on each hand of blackjack. The probability of winning on any hand is p, and the probability of losing is 1 − p = q. The gambler will continue to play until either $Y has been accumulated or he has no money left. Let X_t denote the accumulated winnings on hand t. Note that X_{t+1} = X_t + 1 with probability p, X_{t+1} = X_t − 1 with probability q, and X_{t+1} = X_t if X_t = 0 or X_t = Y. The stochastic process X_t is a Markov chain.
(a) Find the one-step transition matrix P.
(b) For Y = 4 and p = 0.3, find the absorption probabilities b_10, b_14, b_30, and b_34.

18-7. An object moves between four points on a circle, which are labeled 1, 2, 3, and 4. The probability of moving one unit to the right is p, and the probability of moving one unit to the left is 1 − p = q. Assume that the object starts at 1, and let X_n denote the location on the circle after n steps.
(a) Find the one-step transition matrix P.
(b) Find an expression for the steady state probabilities p_j.
(c) Evaluate the probabilities p_j for p = 0.5 and p = 0.8.

18-8. For the single-server queueing model presented in Section 18-7, sketch the graphs of the following quantities as a function of ρ = λ/μ, for 0 < ρ < 1:
(a) Probability of no units in the system.
(b) Mean time in the system.
(c) Mean time in the queue.

18-9. Interarrival times at a telephone booth are exponential, with an average time of 10 minutes. The length of a phone call is assumed to be exponentially distributed with a mean of 3 minutes.
(a) What is the probability that a person arriving at the booth will have to wait?
(b) What is the average queue length?
(c) The telephone company will install a second booth when an arrival would expect to have to wait 3 minutes or more for the phone. By how much must the rate of arrivals be increased in order to justify a second booth?
(d) What is the probability that an arrival will have to wait more than 10 minutes for the phone?
(e) What is the probability that it will take a person more than 10 minutes altogether, for the phone and to complete the call?
(f) Estimate the fraction of a day that the phone will be in use.

18-10. Automobiles arrive at a service station in a random manner at a mean rate of 15 per hour. This station has only one service position, with a mean servicing rate of 21 customers per hour. Service times are exponentially distributed. There is space for only the automobile being served and two waiting. If all three spaces are filled, an arriving automobile will go on to another station.
(a) What is the average number of units in the station?
(b) What fraction of customers will be lost?
(c) Why is L_q > L − 1?

18-11. An engineering school has three secretaries in its general office. Professors with jobs for the secretaries arrive at random, at an average rate of 20 per 8-hour day. The amount of time that a secretary spends on a job has an exponential distribution with a mean of 40 minutes.
(a) What fraction of the time are the secretaries busy?
(b) How much time does it take, on average, for a professor to get his or her jobs completed?
(c) If an economy drive reduced the secretarial force to two secretaries, what will be the new answers to (a) and (b)?

18-12. The mean frequency of arrivals at an airport is 18 planes per hour, and the mean time that a runway is tied up with an arrival is 2 minutes. How many runways will have to be provided so that the probability of a plane having to wait is 0.20? Ignore finite population effects and make the assumption of exponential interarrival and service times.

18-13. A hotel reservations facility uses inward WATS lines to service customer requests. The mean number of calls that arrive per hour is 50, and the mean service time for a call is 3 minutes. Assume that interarrival and service times are exponentially distributed. Calls that arrive when all lines are busy obtain a busy signal and are lost from the system.
(a) Find the steady state equations for this system.
(b) How many WATS lines must be provided to ensure that the probability of a customer obtaining a busy signal is 0.05?
(c) What fraction of the time are all WATS lines busy?
(d) Suppose that during the evening hours call arrivals occur at a mean rate of 10 per hour. How does this affect the WATS line utilization?
(e) Suppose the estimated mean service time (3 minutes) is in error, and the true mean service time is really 5 minutes. What effect will this have on the probability of a customer finding all lines busy if the number of lines in (b) is used?
Chapter 19

Computer Simulation

One of the most widespread applications of probability and statistics lies in the use of computer simulation methods. A simulation is simply an imitation of the operation of a real-world system for purposes of evaluating that system. Over the past 20 years, computer simulation has enjoyed a great deal of popularity in the manufacturing, production, logistics, service, and financial industries, to name just a few areas of application. Simulations are often used to analyze systems that are too complicated to attack via analytic methods such as queueing theory. We are primarily interested in simulations that are:

1. Dynamic: the system state changes over time.
2. Discrete: the system state changes as the result of discrete events, such as customer arrivals or departures.
3. Stochastic (as opposed to deterministic).

The stochastic nature of simulation prompts the ensuing discussion in the text.

This chapter is organized as follows. It begins in Section 19-1 with some simple motivational examples designed to show how one can apply simulation to answer interesting questions about stochastic systems. These examples invariably involve the generation of random variables to drive the simulation, for example, customer interarrival times and service times. The subject of Section 19-2 is the development of techniques to generate random variables. Some of these techniques have already been alluded to in previous chapters, but we will give a more complete and self-contained presentation here. After a simulation run is completed, one must conduct a rigorous analysis of the resulting output, a task made difficult because simulation output, for example customer waiting times, is almost never independent or identically distributed. The problem of output analysis is studied in Section 19-3. A particularly attractive feature of computer simulation is its ability to allow the experimenter to analyze and compare certain scenarios quickly and efficiently. Section 19-4 discusses methods for reducing the variance of estimators arising from a single scenario, thus resulting in more-precise statements about system performance, at no additional cost in simulation run time. We also extend this work by mentioning methods for selecting the best of a number of competing scenarios. We point out here that excellent general references for the topic of stochastic simulation are Banks, Carson, Nelson, and Nicol (2001) and Law and Kelton (2000).

19-1 MOTIVATIONAL EXAMPLES

This section illustrates the use of simulation through a series of simple, motivational examples. The goal is to show how one uses random variables within a simulation to answer questions about the underlying stochastic system.
Example 19-1  Coin Flipping

We are interested in simulating independent flips of a fair coin. Of course, this is a trivial sequence of Bernoulli trials with success probability p = 1/2, but this example serves to show how one can use simulation to analyze such a system. First of all, we need to generate realizations of heads (H) and tails (T), each with probability 1/2. Assuming that the simulation can somehow produce a sequence of independent uniform (0,1) random numbers, U_1, U_2, ..., we will arbitrarily designate flip i as H if we observe U_i < 0.5, and as T if we observe U_i ≥ 0.5. How one generates independent uniforms is the subject of Section 19-2. In any case, suppose that the following uniforms are observed:

0.32  0.41  0.82  0.93  0.06  0.19  0.21  0.77  0.71  0.08.

This sequence of uniforms corresponds to the outcomes HHTTHHHTTH. The reader is asked to study this example in various ways in Exercise 19-1. This type of "static" simulation, in which we simply repeat the same type of trials over and over, has come to be known as Monte Carlo simulation, in honor of the European city-state where gambling is a popular recreational activity.
Example 19-2  Estimate π

In this example, we will estimate π using Monte Carlo simulation in conjunction with a simple geometric relation. Referring to Fig. 19-1, consider a unit square with an inscribed circle, both centered at (1/2, 1/2). If one were to throw darts randomly at the square, the probability that a particular dart will land in the circle is π/4, the ratio of the circle's area to that of the square. How can we use this simple fact to estimate π? We shall use Monte Carlo simulation to throw many darts at the square. Specifically, generate independent pairs of independent uniform (0,1) random variables, (U_11, U_21), (U_12, U_22), .... These pairs will fall randomly on the square. If, for pair i, it happens that

(U_1i − 1/2)² + (U_2i − 1/2)² ≤ 1/4,          (19-1)

then that pair will also fall within the circle. Suppose we run the experiment for n pairs (darts). Let X_i = 1 if pair i satisfies inequality 19-1, that is, if the ith dart falls in the circle; otherwise, let X_i = 0. Now count up the number of darts X = Σ_{i=1}^{n} X_i falling in the circle. Clearly, X has the binomial distribution with parameters n and p = π/4. Then the proportion p̂ = X/n is the maximum likelihood estimate for p = π/4, and so the maximum likelihood estimator for π is just π̂ = 4p̂. If, for instance, we conducted n = 1000 trials and observed X = 753 darts in the circle, our estimate would be π̂ = 3.012. We will encounter this estimation technique again in Exercise 19-2.
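The dart-throwing experiment takes only a few lines to run for any n. The sketch below is illustrative (the seed is arbitrary); increasing n tightens the estimate at the familiar 1/√n rate, since the standard error of π̂ is 4√(p(1 − p)/n).

```python
import random

def estimate_pi(n, seed=2):
    """Throw n random darts at the unit square and return 4X/n,
    where X counts darts landing inside the inscribed circle."""
    rng = random.Random(seed)
    x = sum(1 for _ in range(n)
            if (rng.random() - 0.5)**2 + (rng.random() - 0.5)**2 <= 0.25)
    return 4 * x / n

print(estimate_pi(1000))      # a rough estimate near 3.14
print(estimate_pi(1000000))   # a tighter one
```

A useful exercise is to repeat the run with several seeds and observe the spread of the estimates shrink as n grows.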
Example 19-3  Monte Carlo Integration

Another interesting use of computer simulation involves Monte Carlo integration. Usually, the method becomes efficacious only for high-dimensional integrals, but we will fall back to the basic one-dimensional case for ease of exposition. To this end, consider the integral

I = ∫_a^b f(x) dx.          (19-2)

As described in Fig. 19-2, we shall estimate the value of this integral by summing up n rectangles, each of width 1/n, centered randomly at point U_i on [0,1], and of height f(a + (b − a)U_i). Then an estimate for I is

Î_n = [(b − a)/n]·Σ_{i=1}^{n} f(a + (b − a)U_i).          (19-3)

One can show (see Exercise 19-3) that Î_n is an unbiased estimator for I, that is, E[Î_n] = I for all n. This makes Î_n an intuitive and attractive estimator.

Figure 19-2 Monte Carlo integration.

To illustrate, suppose that we want to estimate the integral

I = ∫_0^1 [1 + cos(πx)] dx,

and the following n = 4 numbers are a uniform (0,1) sample:

0.419  0.109  0.732  0.893.

Plugging into equation 19-3, we obtain

Î_4 = [(1 − 0)/4]·Σ_{i=1}^{4} [1 + cos(π(0 + (1 − 0)U_i))] = 0.896,

which is close to the actual answer of 1. See Exercise 19-4 for additional Monte Carlo integration examples.
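Equation 19-3 can be coded once and reused for any integrand. The sketch below (function name and seed are illustrative choices) first reproduces the n = 4 hand calculation with the fixed uniforms above, then reruns the estimator with a large n.

```python
import math
import random

def mc_integral(f, a, b, n, seed=3):
    """Equation 19-3: (b-a)/n times the sum of f at n uniform points."""
    rng = random.Random(seed)
    return (b - a) / n * sum(f(a + (b - a) * rng.random()) for _ in range(n))

f = lambda x: 1 + math.cos(math.pi * x)

# Reproduce the text's n = 4 illustration with its fixed uniforms:
us = [0.419, 0.109, 0.732, 0.893]
print(round(sum(f(u) for u in us) / 4, 3))   # 0.896

print(mc_integral(f, 0.0, 1.0, 100000))      # close to the true value 1
```

Because the estimator is unbiased, averaging several independent runs (different seeds) is another legitimate way to sharpen the answer.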
A Single.Server Queue Now the goal is to simulate the behavior of a single-server queueing system. Suppose that s:x customers arrive at a bank at the following times, which have been generated from some approp;::ia:c probability distribution:
3
4
6
10
15
20.
UpOn arrival, customers q~eue up in front of a single teller and are processed sequentially, in a fustcome~:first~served manner. The &erVice times corresponding to the a..'1iving customers are
7
4
6
6
2.
For this example, we assume that the bank opens at time 0 and closes its doors at time 20 (just after customer 6 arrives), serving any remaining customers. Table 19-1 and Fig. 19-3 trace the evolution of the system as time progresses. The table keeps track of the times at which customers arrive, begin service, and depart, along with each customer's wait.

Table 19-1  Bank Customers in Single-Server Queueing System

i, customer    A_i, arrival time    B_i, begin service    S_i, service time    D_i, depart time    W_i, wait
1              3                    3                     7                    10                  0
2              4                    10                    6                    16                  6
3              6                    16                    4                    20                  10
4              10                   20                    6                    26                  10
5              15                   26                    1                    27                  11
6              20                   27                    2                    29                  7
[Figure 19-3 Number of customers L(t) in single-server queueing system.]
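The bookkeeping in Table 19-1 follows the recurrences B_i = max(A_i, D_{i-1}), D_i = B_i + S_i, and W_i = B_i - A_i, which are easy to automate. A minimal Python sketch (names mine, not from the text):

```python
def simulate_fifo_queue(arrivals, services):
    """Trace a single-server FIFO queue; returns begin-service, depart, and wait times."""
    begin, depart, wait = [], [], []
    server_free_at = 0
    for a, s in zip(arrivals, services):
        b = max(a, server_free_at)  # start when the customer has arrived AND the server is free
        d = b + s
        begin.append(b)
        depart.append(d)
        wait.append(b - a)
        server_free_at = d
    return begin, depart, wait

# The data of Table 19-1:
begin, depart, wait = simulate_fifo_queue([3, 4, 6, 10, 15, 20], [7, 6, 4, 6, 1, 2])
print(depart)  # [10, 16, 20, 26, 27, 29]
print(wait)    # [0, 6, 10, 10, 11, 7]
```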
Example 19-5 (s, S) Inventory Policy Customer orders for a particular good arrive at a store every day. During a certain one-week period, the quantities ordered are

10    6    11    3    20    6    8.

The store starts the week off with an initial stock of 20. If the stock falls to 5 or below, the owner orders enough from a central warehouse to replenish the stock to 20. Such replenishment orders are placed only at the end of the day and are received before the store opens the next day. There are no customer back orders, so any customer orders that are not filled immediately are lost. This is called an (s, S) inventory system, where the inventory is replenished to S = 20 whenever it hits level s = 5. The following is a history for this system:
Day    Initial Stock    Customer Order    End Stock    Reorder?    Lost Orders
1      20               10                10           No          0
2      10               6                 4            Yes         0
3      20               11                9            No          0
4      9                3                 6            No          0
5      6                20                0            Yes         14
6      20               6                 14           No          0
7      14               8                 6            No          0
We see that at the end of days 2 and 5, replenishment orders were made. In particular, on day 5, the store ran out of stock and lost 14 orders as a result. See Exercise 19-6.
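The (s, S) logic above can be sketched in a few lines of Python (function and variable names are mine, not from the text); it reproduces the table's history, including the 14 lost orders on day 5:

```python
def simulate_sS(daily_orders, S=20, s=5):
    """One sample path of an (s, S) inventory system with lost sales."""
    stock = S
    history = []  # (end stock, reorder?, lost orders) for each day
    for q in daily_orders:
        filled = min(q, stock)  # no back orders: unfilled demand is lost
        lost = q - filled
        stock -= filled
        reorder = stock <= s
        history.append((stock, reorder, lost))
        if reorder:
            stock = S  # replenishment arrives before the store opens next day
    return history

history = simulate_sS([10, 6, 11, 3, 20, 6, 8])
print([h[0] for h in history])  # end stocks:  [10, 4, 9, 6, 0, 14, 6]
print([h[2] for h in history])  # lost orders: [0, 0, 0, 0, 14, 0, 0]
```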
19-2 GENERATION OF RANDOM VARIABLES

All the examples described in Section 19-1 required random variables to drive the simulation. In Examples 19-1 through 19-3, we needed uniform (0,1) random variables; Examples 19-4 and 19-5 used more-complicated random variables to model customer arrivals, service times, and order quantities. This section discusses methods to generate such random variables automatically. The generation of uniform (0,1) random variables is a good place to start, especially since it turns out that uniform (0,1) generation forms the basis for the generation of all other random variables.
19-2.1 Generating Uniform (0,1) Random Variables

There are a variety of methods for generating uniform (0,1) random variables, among them the following:

1. Sampling from certain physical devices, such as an atomic clock.
2. Looking up predetermined random numbers from a table.
3. Generating pseudorandom numbers (PRNs) from a deterministic algorithm.

The most widely used techniques in practice all employ the latter strategy of generating PRNs from a deterministic algorithm. Although, by definition, PRNs are not truly random, there are many algorithms available that produce PRNs that appear to be perfectly random. Further, these algorithms have the advantages of being computationally fast and repeatable. Speed is a good property to have for the obvious reasons, while repeatability is desirable for experimenters who want to be able to replicate their simulation results when the runs are conducted under identical conditions.

Perhaps the most popular method for obtaining PRNs is the linear congruential generator (LCG). Here, we start with a nonnegative "seed" integer, X_0, use the seed to generate a sequence of nonnegative integers, X_1, X_2, ..., and then convert the X_i to PRNs, U_1, U_2, .... The algorithm is simple.

1. Specify a nonnegative seed integer, X_0.
2. For i = 1, 2, ..., let X_i = (aX_{i-1} + c) mod (m), where a, c, and m are appropriately chosen integer constants, and where "mod" denotes the modulus function; for example, 17 mod (5) = 2 and -1 mod (5) = 4.
3. For i = 1, 2, ..., let U_i = X_i/m.
Example 19-6 Consider the "toy" generator X_i = (5X_{i-1} + 1) mod (8), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 6, X_3 = 7, X_4 = 4, X_5 = 5, X_6 = 2, X_7 = 3, X_8 = 0, whereupon things start repeating, or "cycling." The PRNs corresponding to the sequence starting with seed X_0 = 0 are therefore U_1 = 1/8, U_2 = 6/8, U_3 = 7/8, U_4 = 4/8, U_5 = 5/8, U_6 = 2/8, U_7 = 3/8, U_8 = 0. Since any seed eventually produces all integers 0, 1, ..., 7, we say that this is a full-cycle (or full-period) generator.
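A generic LCG takes only a few lines of Python (names mine, not from the text). Running it with a = 5, c = 1, m = 8 reproduces the full cycle of Example 19-6:

```python
def lcg(a, c, m, seed, n):
    """Return the first n values X_1, ..., X_n of the LCG X_i = (a*X_{i-1} + c) mod m."""
    xs, x = [], seed
    for _ in range(n):
        x = (a * x + c) % m
        xs.append(x)
    return xs

xs = lcg(5, 1, 8, 0, 8)
print(xs)                   # [1, 6, 7, 4, 5, 2, 3, 0] -- full period
print([x / 8 for x in xs])  # the corresponding PRNs U_i = X_i / 8
```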
Example 19-7 Now consider the generator X_i = (3X_{i-1} + 1) mod (7), with seed X_0 = 0. This produces the integer sequence X_1 = 1, X_2 = 4, X_3 = 6, X_4 = 5, X_5 = 2, X_6 = 0, whereupon cycling ensues. Further, notice that for this generator, a seed of X_0 = 3 produces the sequence X_1 = 3 = X_2 = X_3 = ..., not very random looking!
The cycle length of the generator from Example 19-7 obviously depends on the seed chosen, which is a disadvantage. Full-period generators, such as that studied in Example 19-6, obviously avoid this problem. A full-period generator with a long cycle length is given in the following example.
Example 19-8 The generator X_i = 16807X_{i-1} mod (2^31 - 1) is full period. Since c = 0, this generator is termed a multiplicative LCG and must be used with a seed X_0 ≠ 0. This generator is used in many real-world applications and passes most statistical tests for uniformity and randomness. In order to avoid integer overflow and real-arithmetic round-off problems, Bratley, Fox, and Schrage (1987) offer the following Fortran implementation scheme for this algorithm.

      FUNCTION UNIF(IX)
      K1 = IX/127773
      IX = 16807*(IX - K1*127773) - K1*2836
      IF (IX.LT.0) IX = IX + 2147483647
      UNIF = IX * 4.656612875E-10
      RETURN
      END

In the above program, we input an integer seed IX and receive a PRN UNIF. The seed IX is automatically updated for the next call. Note that in Fortran, integer division results in truncation; for example, 15/4 = 3. Thus K1 is the integer part of IX/127773.
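For readers working in another language, here is a sketch of the same scheme in Python (function name mine); `//` plays the role of Fortran's truncating integer division:

```python
def unif(ix):
    """Schrage-style evaluation of X = 16807*X mod (2**31 - 1) without overflow.
    Returns the updated seed IX and the PRN UNIF, as in the Fortran version."""
    k1 = ix // 127773  # Fortran's truncating division IX/127773
    ix = 16807 * (ix - k1 * 127773) - k1 * 2836
    if ix < 0:
        ix += 2147483647
    return ix, ix * 4.656612875e-10

seed = 12345
seed, u = unif(seed)
print(seed)  # 207482415, i.e., (16807 * 12345) mod (2**31 - 1)
```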
19-2.2 Generating Nonuniform Random Variables

The goal now is to generate random variables from distributions other than the uniform. The methods we will use to do so always start with a PRN and then apply an appropriate transformation to the PRN that gives the desired nonuniform random variable. Such nonuniform random variables are important in simulation for a number of reasons: for example, customer arrivals to a service facility often follow a Poisson process; service times may be normal; and routing decisions are usually characterized by Bernoulli random variables.

Inverse Transform Method for Random Variate Generation The most basic technique for generating random variables from a uniform PRN relies on the remarkable Inverse Transform Theorem.
Theorem 19-1 If X is a random variable with cumulative distribution function (CDF) F(x), then the random variable Y = F(X) has the uniform (0,1) distribution.
Proof For ease of exposition, suppose that X is a continuous random variable. Then the CDF of Y is

G(y) = P(Y ≤ y) = P(F(X) ≤ y) = P(X ≤ F^{-1}(y))  (the inverse exists since F(x) is continuous)
     = F(F^{-1}(y)) = y.

Since G(y) = y is the CDF of the uniform (0,1) distribution, we are done.

With Theorem 19-1 in hand, it is easy to generate certain random variables. All one has to do is the following:

1. Find the CDF of X, say F(x).
2. Set F(X) = U, where U is a uniform (0,1) PRN.
3. Solve for X = F^{-1}(U).

We illustrate this technique with a series of examples, for both continuous and discrete distributions.
Example 19-9 Here we generate an exponential random variable with rate λ, following the recipe outlined above.

1. The CDF is F(x) = 1 - e^{-λx}.
2. Set F(X) = 1 - e^{-λX} = U.
3. Solving for X, we obtain X = F^{-1}(U) = -[ln(1 - U)]/λ.

Thus, if one supplies a uniform (0,1) PRN U, we see that X = -[ln(1 - U)]/λ is an exponential random variable with parameter λ.
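In code, the recipe of Example 19-9 is one line (Python sketch, names mine):

```python
import math
import random

def exponential(lam, u=None):
    """Inverse transform: X = -ln(1 - U)/lam is exponential with rate lam."""
    if u is None:
        u = random.random()
    return -math.log(1.0 - u) / lam

x = exponential(2.0, u=0.5)
print(round(x, 4))  # -ln(0.5)/2 = 0.3466
```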
Example 19-10 Now we try to generate a standard normal random variable, call it Z. Using the special notation Φ(·) for the standard normal CDF, the recipe tells us to set Φ(Z) = U, so that Z = Φ^{-1}(U). Unfortunately, the inverse CDF does not exist in closed form, so one must resort to the use of standard normal tables (or other approximations). For instance, if we have U = 0.72, then Table II (Appendix) yields Z = Φ^{-1}(0.72) ≈ 0.583.
Example 19-11 We can extend the previous example to generate any normal random variable, that is, one with arbitrary mean and variance. This follows easily, since if Z is standard normal, then X = μ + σZ is normal with mean μ and variance σ². For instance, suppose we are interested in generating a normal variate X with mean μ = 3 and variance σ² = 4. Then if, as in the previous example, U = 0.72, we obtain Z ≈ 0.583 and, as a consequence, X ≈ 3 + 2(0.583) = 4.166.
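In Python, the table lookup of Examples 19-10 and 19-11 can be replaced by the standard library's numerical approximation to Φ^{-1} (a sketch; `statistics.NormalDist` has been in the standard library since Python 3.8):

```python
from statistics import NormalDist

u = 0.72
z = NormalDist().inv_cdf(u)      # standard normal: Phi^{-1}(0.72) ~ 0.583
x = 3 + 2 * z                    # mean 3, standard deviation 2 (variance 4)
print(round(z, 3), round(x, 3))  # 0.583 4.166
```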
Example 19-12 We can also use the ideas from Theorem 19-1 to generate realizations from discrete random variables. Suppose that the discrete random variable X has probability function

p(x) = 0.3  if x = -1,
       0.6  if x = 2.3,
       0.1  if x = 7,
       0    otherwise.

To generate variates from this distribution, we set up the following table, where F(x) is the associated CDF and U denotes the set of uniform (0,1) PRNs corresponding to each x-value:

x      p(x)    F(x)    U
-1     0.3     0.3     [0, 0.3)
2.3    0.6     0.9     [0.3, 0.9)
7      0.1     1.0     [0.9, 1.0)

To generate a realization of X, we first generate a PRN U and then read the corresponding x-value from the table. For instance, if U = 0.43, then X = 2.3.
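For discrete distributions, the table lookup amounts to finding which CDF interval contains U; Python's `bisect` module does this in logarithmic time (sketch, names mine):

```python
import bisect
import random

values = [-1, 2.3, 7]  # support of X, in increasing order
cdf = [0.3, 0.9, 1.0]  # F(x) at each support point

def discrete_inverse_transform(u=None):
    """Return the x whose CDF interval contains U, per the table above."""
    if u is None:
        u = random.random()
    return values[bisect.bisect_right(cdf, u)]

print(discrete_inverse_transform(u=0.43))  # 0.43 is in [0.3, 0.9), so X = 2.3
```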
Other Random Variate Generation Methods Although the inverse transform method is intuitively pleasing to use, it may sometimes be difficult to apply in practice. For instance, a closed-form expression for the inverse CDF, F^{-1}(U), might not exist, as is the case for the normal distribution, or application of the method might be unnecessarily tedious. We now present a small potpourri of interesting methods to generate a variety of random variables.
Box-Muller Method The Box-Muller (1958) method is an exact technique for generating independent and identically distributed (IID) standard normal random variables. The appropriate theorem, stated without proof, is

Theorem 19-2 Suppose that U_1 and U_2 are IID uniform (0,1) random variables. Then

Z_1 = sqrt(-2 ln(U_1)) cos(2πU_2)

and

Z_2 = sqrt(-2 ln(U_1)) sin(2πU_2)

are IID standard normal random variates. Note that the sine and cosine evaluations must be carried out in radians.

Example 19-13 Suppose that U_1 = 0.35 and U_2 = 0.65 are two IID PRNs. Using the Box-Muller method to generate two normal (0,1) random variates, we obtain

Z_1 = sqrt(-2 ln(0.35)) cos(2π(0.65)) = -0.8517,
Z_2 = sqrt(-2 ln(0.35)) sin(2π(0.65)) = -1.172.
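Theorem 19-2 translates directly into code (Python sketch, names mine); it reproduces the numbers of the example above:

```python
import math

def box_muller(u1, u2):
    """Transform two IID uniform(0,1) PRNs into two IID standard normal variates."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

z1, z2 = box_muller(0.35, 0.65)
print(round(z1, 4), round(z2, 4))  # -0.8517 -1.1723
```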
Central Limit Theorem One can also use the Central Limit Theorem (CLT) to generate "quick-and-dirty" random variables that are approximately normal. Suppose that U_1, U_2, ..., U_n are IID PRNs. Then for large enough n, the CLT says that

(Σ_{i=1}^n U_i - E[Σ_{i=1}^n U_i]) / sqrt(Var(Σ_{i=1}^n U_i)) = (Σ_{i=1}^n U_i - n/2) / sqrt(n/12) ≈ N(0,1).

In particular, the choice n = 12 (which turns out to be "large enough") yields the convenient approximation

Z = Σ_{i=1}^{12} U_i - 6,

which is a realization from a distribution that is approximately standard normal.
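A quick empirical check of the n = 12 approximation (Python sketch, names mine): the sample mean and standard deviation of many such realizations should be close to 0 and 1.

```python
import random
import statistics

def quick_normal(rng=random):
    """Approximate N(0,1) variate: sum of 12 uniforms, minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0

sample = [quick_normal() for _ in range(100_000)]
print(round(statistics.mean(sample), 2), round(statistics.stdev(sample), 2))
```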
Convolution Another popular trick involves the generation of random variables via convolution; the name indicates that some sort of sum is involved.

Example 19-14 Suppose that X_1, X_2, ..., X_n are IID exponential random variables with rate λ. Then Y = Σ_{i=1}^n X_i is said to have an Erlang distribution with parameters n and λ. It turns out that this distribution has probability density function

f(y) = λ^n y^{n-1} e^{-λy} / (n - 1)!,  y > 0,    (19-4)

which readers may recognize as a special case of the gamma distribution (see Exercise 19-16). This distribution's CDF is too difficult to invert directly. One way that comes to mind to generate a realization from the Erlang is simply to generate and then add up n IID exponential(λ) random variables. The following scheme is an efficient way to do precisely that. Suppose that U_1, U_2, ..., U_n are IID PRNs. From Example 19-9, we know that X_i = -[ln(1 - U_i)]/λ, i = 1, 2, ..., n, are IID exponential(λ) random variables. Therefore, we can write

Y = Σ_{i=1}^n X_i = -(1/λ) Σ_{i=1}^n ln(1 - U_i) = -(1/λ) ln Π_{i=1}^n (1 - U_i).

This implementation is quite efficient, since it requires only one execution of a natural log operation. In fact, we can even do slightly better from an efficiency point of view: simply note that both U_i and 1 - U_i are uniform (0,1). Then

Y = -(1/λ) ln Π_{i=1}^n U_i

is also Erlang.

To illustrate, suppose that we have three IID PRNs at our disposal, U_1 = 0.23, U_2 = 0.97, and U_3 = 0.48. To generate an Erlang realization with parameters n = 3 and λ = 2, we simply take

Y = -(1/2) ln[(0.23)(0.97)(0.48)] ≈ 1.117.
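The product form is a natural one-liner (Python sketch, names mine); it reproduces the illustration's numbers:

```python
import math
import random

def erlang(n, lam, us=None):
    """Erlang(n, lam) variate via convolution: Y = -(1/lam) ln(U_1 U_2 ... U_n)."""
    if us is None:
        us = [random.random() for _ in range(n)]
    product = 1.0
    for u in us:
        product *= u
    return -math.log(product) / lam

y = erlang(3, 2.0, us=[0.23, 0.97, 0.48])
print(round(y, 3))  # 1.117
```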
Acceptance-Rejection One of the most popular classes of random variate generation procedures proceeds by sampling PRNs until some appropriate "acceptance" criterion is met.

Example 19-15 An easy example of the acceptance-rejection technique involves the generation of a geometric random variable with success probability p. To this end, consider a sequence of PRNs U_1, U_2, .... Our aim is to generate a geometric realization X, that is, one that has probability function

p(x) = (1 - p)^{x-1} p  if x = 1, 2, ...,
       0                otherwise.

In words, X represents the number of Bernoulli trials until the first success is observed. This English characterization immediately suggests an elementary acceptance-rejection algorithm.

1. Initialize i ← 0.
2. Let i ← i + 1.
3. Take a Bernoulli(p) observation,

Y_i = 1  if U_i ≤ p,
      0  if U_i > p.

4. If Y_i = 1, then we have our first success and we stop, in which case we accept X = i. Otherwise, if Y_i = 0, then we reject and go back to step 2.

To illustrate, let us generate a geometric variate having success probability p = 0.3. Suppose we have at our disposal the following PRNs:

0.38    0.67    0.24    0.89    0.10    0.71.

Since U_1 = 0.38 > p, we have Y_1 = 0, and so we reject X = 1. Since U_2 = 0.67 > p, we have Y_2 = 0, and so we reject X = 2. Since U_3 = 0.24 ≤ p, we have Y_3 = 1, and so we accept X = 3.
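The four-step algorithm above, in Python (names mine, not from the text); fed the six PRNs of the illustration, it accepts X = 3:

```python
import random

def geometric_ar(p, us=None):
    """Acceptance-rejection for a geometric(p) variate: count Bernoulli(p)
    trials until the first success."""
    i = 0
    while True:
        i += 1
        u = us[i - 1] if us is not None else random.random()
        if u <= p:  # Y_i = 1: accept X = i
            return i
        # Y_i = 0: reject and take another trial

print(geometric_ar(0.3, us=[0.38, 0.67, 0.24, 0.89, 0.10, 0.71]))  # 3
```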
19-3 OUTPUT ANALYSIS

Simulation output analysis is one of the most important aspects of any proper and complete simulation study. Since the input processes driving a simulation are usually random variables (e.g., interarrival times, service times, and breakdown times), we must also regard the output from the simulation as random. Thus, runs of the simulation only yield estimates of measures of system performance (e.g., the mean customer waiting time). These estimators are themselves random variables and are therefore subject to sampling error, and sampling error must be taken into account to make valid inferences concerning system performance.
The problem is that simulations almost never produce convenient raw output that is IID normal data. For example, consecutive customer waiting times from a queueing system

• are not independent; typically, they are serially correlated. If one customer at the post office waits in line a long time, then the next customer is also likely to wait a long time.
• are not identically distributed; customers showing up early in the morning might have a much shorter wait than those who show up just before closing time.
• are not normally distributed; they are usually skewed to the right (and are certainly never less than zero).

The point is that it is difficult to apply "classical" statistical techniques to the analysis of simulation output. Our purpose here is to give methods to perform statistical analysis of output from discrete-event computer simulations. To facilitate the presentation, we identify two types of simulations with respect to output analysis: terminating and steady state simulations.
1. Terminating (or transient) simulations. Here, the nature of the problem explicitly defines the length of the simulation run. For instance, we might be interested in simulating a bank that closes at a specific time each day.

2. Nonterminating (steady state) simulations. Here, the long-run behavior of the system is studied. Presumably this "steady state" behavior is independent of the simulation's initial conditions. An example is that of a continuously running production line for which the experimenter is interested in some long-run performance measure.

Techniques to analyze output from terminating simulations are based on the method of independent replications, discussed in Section 19-3.1. Additional problems arise for steady state simulations. For instance, we must now worry about the problem of starting the simulation: how should it be initialized at time zero, and how long must it be run before data representative of steady state can be collected? Initialization problems are considered in Section 19-3.2. Finally, Section 19-3.3 deals with point and confidence interval estimation for steady state simulation performance parameters.
19-3.1 Terminating Simulation Analysis

Here we are interested in simulating some system of interest over a finite time horizon. For now assume we obtain discrete simulation output Y_1, Y_2, ..., Y_m, where the number of observations m can be a constant or a random variable. For example, the experimenter can specify the number m of customer waiting times Y_1, Y_2, ..., Y_m to be taken from a queueing simulation, or m could denote the random number of customers observed during a specified time period [0, T]. Alternatively, we might observe continuous simulation output {Y(t) | 0 ≤ t ≤ T} over a specified interval [0, T]. For instance, if we are interested in estimating the time-averaged number of customers waiting in a queue during [0, T], the quantity Y(t) would be the number of customers in the queue at time t.

The easiest goal is to estimate the expected value of the sample mean of the observations,

θ ≡ E[Ȳ_m],

where the sample mean in the discrete case is

Ȳ_m = (1/m) Σ_{j=1}^m Y_j
(with a similar expression for the continuous case). For instance, we might be interested in estimating the expected average waiting time of all customers at a shopping center during the period 10 a.m. to 2 p.m. Although Ȳ_m is an unbiased estimator for θ, a proper statistical analysis requires that we also provide an estimate of Var(Ȳ_m). Since the Y_j are not necessarily IID random variables, it may be that Var(Ȳ_m) ≠ Var(Y_i)/m for any i, a case not covered in elementary statistics courses. For this reason, the familiar sample variance

S² = (1/(m - 1)) Σ_{j=1}^m (Y_j - Ȳ_m)²

is likely to be highly biased as an estimator of m Var(Ȳ_m). Thus, one should not use S²/m to estimate Var(Ȳ_m).
is likely to be highly biased as an estimator of mVar(i'm)' Thus, one sbould not use [{'1m to estimate Var{i'm)' The way around the problem is via the method of independent replications (IR). IR estimates Var(i'm) by conducting b independem simulation runs (replications) of the system under study, where each replication consists of m observations.l! is eru;y to make the replications independent-simply reinltialize each replication with a different pseudorandom number seed. To proceed, denote the sample mean from replication i by
where YiJis observationj from replication i. for i = 1, 2.... , b andj = 1, 2, .. ', m. If each run is started under the same operating conditions (e,g.• all queues empty and idle), then the replication sample means Zlj~' , .. , Zb are ill) random variables. Then the obvious point estimator for Var(l'm) = Var(Z,) is ,
1
b
b-l
;",1
_
2
VR "-2:;(Z, -Z,) , where the grand mean is defined as
Notice how dosely the forms of lin and Slim resemble each other, But si:l.ce the replicate sample means are IID, VR is usually much less biased for Var(Ym) than is S2/m. In light of the above discussion, we see that V"Ib is a reasonable estimator for Var(Z,,).
Further, if the number of observations per replication, m, is large enough, the Central Limit Theorem tells us that the replicate sample means are approximately IID normal. Then basic statistics (Chapter 10) yields an approximate 100(1 - α)% two-sided confidence interval (CI) for θ,

θ ∈ Z̄_b ± t_{α/2, b-1} sqrt(V_R/b),    (19-5)

where t_{α/2, b-1} is the 1 - α/2 percentage point of the t distribution with b - 1 degrees of freedom.
Example 19-16 Suppose we want to estimate the expected average waiting time for the first 5000 customers in a certain queueing system. We will make five independent replications of the system, with each run initialized empty and idle and consisting of 5000 waiting times. The resulting replicate means are

i      1      2      3      4      5
Z_i    3.2    4.3    5.1    4.2    4.6

Then Z̄_5 = 4.28 and V_R = 0.487. For level α = 0.05, we have t_{0.025,4} = 2.78, and equation 19-5 gives [3.41, 5.15] as a 95% CI for the expected average waiting time for the first 5000 customers.
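The computation in this example can be sketched as follows (Python, names mine; the quantile 2.78 = t_{0.025,4} is supplied by hand):

```python
import math
import statistics

def replication_ci(z, t_quantile):
    """Equation 19-5: CI centered at the grand mean of the replicate means z,
    with half-width t * sqrt(V_R / b)."""
    b = len(z)
    z_bar = statistics.mean(z)
    v_r = statistics.variance(z)  # the (b-1)-divisor sample variance V_R
    half = t_quantile * math.sqrt(v_r / b)
    return z_bar - half, z_bar + half

lo, hi = replication_ci([3.2, 4.3, 5.1, 4.2, 4.6], 2.78)
print(round(lo, 2), round(hi, 2))  # 3.41 5.15
```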
Independent replications can be used to calculate variance estimates for statistics other than sample means. The method can then be used to obtain CIs for quantities other than E[Ȳ_m], for example quantiles. See any of the standard simulation texts for additional uses of independent replications.
19-3.2 Initialization Problems

Before a simulation can be run, one must provide initial values for all of the simulation's state variables. Since the experimenter may not know what initial values are appropriate for the state variables, these values might be chosen somewhat arbitrarily. For instance, we might decide that it is "most convenient" to initialize a queue as empty and idle. Such a choice of initial conditions can have a significant but unrecognized impact on the simulation run's outcome. Thus, the initialization bias problem can lead to errors, particularly in steady state output analysis. Some examples of problems concerning simulation initialization are as follows.

1. Visual detection of initialization effects is sometimes difficult, especially in the case of stochastic processes having high intrinsic variance, such as queueing systems.

2. How should the simulation be initialized? Suppose that a machine shop closes at a certain time each day, even if there are jobs waiting to be served. One must therefore be careful to start each day with a demand that depends on the number of jobs remaining from the previous day.

3. Initialization bias can lead to point estimators for steady state parameters having high mean squared error, as well as to CIs having poor coverage.

Since initialization bias raises important concerns, how do we detect and deal with it? We first list methods to detect it.

1. Attempt to detect the bias visually by scanning a realization of the simulated process. This might not be easy, since visual analysis can miss bias. Further, a visual scan can be tedious. To make the visual analysis more efficient, one might transform the data (e.g., take logs or square roots), smooth it, average it across several independent replications, or construct moving average plots.
2. Conduct statistical tests for initialization bias. Kelton and Law (1983) give an intuitively appealing sequential procedure to detect bias. Various other tests check to see whether the initial portion of the simulation output contains more variation than latter portions.

If initialization bias is detected, one may want to do something about it. There are two simple methods for dealing with bias. One is to truncate the output by allowing the simulation to "warm up" before data are retained for analysis. The experimenter would then hope that the remaining data are representative of the steady state system. Output truncation is probably the most popular method for dealing with initialization bias, and all of the major simulation languages have built-in truncation functions. But how can one find a good truncation point? If the output is truncated "too early," significant bias might still exist in the remaining data. If it is truncated "too late," then good observations might be wasted. Unfortunately, simple rules to determine truncation points do not perform well in general. A common practice is to average observations across several replications and then visually choose a truncation point based on the averaged run. See Welch (1983) for a good visual/graphical approach.

The second method is to make a very long run to overwhelm the effects of initialization bias. This method of bias control is conceptually simple to carry out and may yield point estimators having lower mean squared errors than the analogous estimators from truncated data (see, e.g., Fishman 1978). However, a problem with this approach is that it can be wasteful with observations; for some systems, an excessive run length might be required before the initialization effects are rendered negligible.
19-3.3 Steady State Simulation Analysis

Now assume that we have on hand stationary (steady state) simulation output, Y_1, Y_2, ..., Y_n. Our goal is to estimate some parameter of interest, possibly the mean customer waiting time or the expected profit produced by a certain factory configuration. As in the case of terminating simulations, a good statistical analysis must accompany the value of any point estimator with a measure of its variance. A number of methodologies have been proposed in the literature for conducting steady state output analysis: batch means, independent replications, standardized time series, spectral analysis, regeneration, time series modeling, as well as a host of others. We will examine the two most popular: batch means and independent replications. (Recall: As discussed earlier, confidence intervals for terminating simulations usually use independent replications.)
Batch Means The method of batch means is often used to estimate Var(Ȳ_n) or to calculate CIs for the steady state process mean μ. The idea is to divide one long simulation run into a number of contiguous batches, and then to appeal to a Central Limit Theorem to assume that the resulting batch sample means are approximately IID normal.

In particular, suppose that we partition Y_1, Y_2, ..., Y_n into b nonoverlapping, contiguous batches, each consisting of m observations (assume that n = bm). Thus, the ith batch, i = 1, 2, ..., b, consists of the random variables Y_{(i-1)m+1}, Y_{(i-1)m+2}, ..., Y_{im}. The ith batch mean is simply the sample mean of the m observations from batch i, i = 1, 2, ..., b:

Z_i = (1/m) Σ_{j=1}^m Y_{(i-1)m+j}.
Similar to independent replications, we define the batch means estimator for Var(Z_i) as

V_B ≡ (1/(b - 1)) Σ_{i=1}^b (Z_i - Z̄_b)²,

where

Ȳ_n = Z̄_b = (1/b) Σ_{i=1}^b Z_i

is the grand sample mean. If m is large, then the batch means are approximately IID normal, and (as for IR) we obtain an approximate 100(1 - α)% CI for μ,

μ ∈ Z̄_b ± t_{α/2, b-1} sqrt(V_B/b).

This equation is very similar to equation 19-5. Of course, the difference here is that batch means divides one long run into a number of batches, whereas independent replications uses a number of independent shorter runs. Indeed, consider the old IR example from Section 19-3.1 with the understanding that the Z_i must now be regarded as batch means (instead of replicate means); then the same numbers carry through the example.

The technique of batch means is intuitively appealing and easy to understand. But problems can come up if the Y_j are not stationary (e.g., if significant initialization bias is present), if the batch means are not normal, or if the batch means are not independent. If any of these assumption violations exist, poor confidence interval coverage may result, unbeknownst to the analyst. To ameliorate the initialization bias problem, the user can truncate some of the data or make a long run, as discussed in Section 19-3.2. In addition, the lack of independence or normality of the batch means can be countered by increasing the batch size m.
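Batch means differs from the replication computation only in how the Z_i arise: they are means of contiguous segments of one long run. A Python sketch (names mine, not from the text; the t quantile is supplied by hand, and the toy data are purely illustrative):

```python
import math
import statistics

def batch_means_ci(y, b, t_quantile):
    """Split one long run y into b contiguous batches of size m = len(y)//b,
    then form the CI Zbar_b +/- t * sqrt(V_B / b)."""
    m = len(y) // b  # assume len(y) == b * m
    z = [statistics.mean(y[i * m:(i + 1) * m]) for i in range(b)]
    z_bar = statistics.mean(z)
    v_b = statistics.variance(z)
    half = t_quantile * math.sqrt(v_b / b)
    return z_bar - half, z_bar + half

lo, hi = batch_means_ci(list(range(20)), 4, 3.18)  # toy data, t_{0.025,3} ~ 3.18
```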
Independent Replications Of the difficulties encountered when using batch means, the possibility of correlation among the batch means might be the most troublesome. This problem is explicitly avoided by the method of IR, described in the context of terminating simulations in Section 19-3.1. In fact, the replicate means are independent by their construction. Unfortunately, since each of the b replications has to be started properly, initialization bias presents more trouble when using IR than when using batch means. The usual recommendation, in the context of steady state analysis, is to use batch means over IR because of the possible initialization bias in each of the replications.
19-4 COMPARISON OF SYSTEMS

One of the most important uses of simulation output analysis regards the comparison of competing systems or alternative system configurations. For example, suppose we wish to evaluate two different "restart" strategies that an airline can evoke following a major traffic disruption, such as a snowstorm in the Northeast. Which policy minimizes a certain cost function associated with the restart? Simulation is uniquely equipped to help the experimenter conduct this type of comparison analysis. There are many techniques available for comparing systems, among them (i) classical statistical CIs, (ii) common random numbers, (iii) antithetic variates, and (iv) ranking and selection procedures.
19-4.1 Classical Confidence Intervals

With our airline example in mind, let Z_{i,j} be the cost from the jth simulation replication of strategy i, i = 1, 2, j = 1, 2, ..., b_i. Assume that Z_{i,1}, Z_{i,2}, ..., Z_{i,b_i} are IID normal with unknown mean μ_i and unknown variance, i = 1, 2, an assumption that can be justified by arguing that we can do the following:

1. Get independent data by controlling the random numbers between replications.
2. Get identically distributed costs between replications by performing the replications under identical conditions.
3. Get approximately normal data by adding up (or averaging) many subcosts to obtain overall costs for both strategies.

The goal here is to calculate a 100(1 - α)% CI for the difference μ_1 - μ_2. To this end, suppose that the Z_{1,j} are independent of the Z_{2,j}, and define

Z̄_i = (1/b_i) Σ_{j=1}^{b_i} Z_{i,j},  i = 1, 2,

and

S_i² = (1/(b_i - 1)) Σ_{j=1}^{b_i} (Z_{i,j} - Z̄_i)²,  i = 1, 2.

An approximate 100(1 - α)% CI is

μ_1 - μ_2 ∈ Z̄_1 - Z̄_2 ± t_{α/2, ν} sqrt(S_1²/b_1 + S_2²/b_2),
where the approximate degrees of freedom ν (a function of the sample variances) is given in Chapter 10. Suppose (as in the airline example) that small cost is good. If the interval lies entirely to the left [right] of zero, then system 1 [2] is better; if the interval contains zero, then the two systems must be regarded, in a statistical sense, as about the same.

An alternative classical strategy is to use a CI that is analogous to a paired t-test. Here we take b replications from both strategies and set the differences D_j = Z_{1,j} - Z_{2,j} for j = 1, 2, ..., b. Then we calculate the sample mean and variance of the differences:

D̄_b = (1/b) Σ_{j=1}^b D_j  and  S_D² = (1/(b - 1)) Σ_{j=1}^b (D_j - D̄_b)².

The resulting 100(1 - α)% CI is

μ_1 - μ_2 ∈ D̄_b ± t_{α/2, b-1} sqrt(S_D²/b).

These paired-t intervals are very efficient if Corr(Z_{1,j}, Z_{2,j}) > 0, j = 1, 2, ..., b (where we still assume that Z_{1,1}, Z_{1,2}, ..., Z_{1,b} are IID and Z_{2,1}, Z_{2,2}, ..., Z_{2,b} are IID). In that case, it turns out that

Var(D̄_b) < Var(Z_{1,j})/b + Var(Z_{2,j})/b.

If Z_{1,j} and Z_{2,j} had been simulated independently, then we would have equality in the above expression. Thus, the trick may result in a relatively small S_D² and, hence, a small CI length. So how do we evoke the trick?
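The paired-t computation can be sketched as follows (Python, names mine; the replicate costs below are hypothetical numbers chosen for illustration, and the t quantile is supplied by hand):

```python
import math
import statistics

def paired_t_ci(z1, z2, t_quantile):
    """Paired-t CI for mu_1 - mu_2 from replicate costs z1[j], z2[j], paired by j."""
    d = [a - b for a, b in zip(z1, z2)]
    b = len(d)
    d_bar = statistics.mean(d)
    s_d2 = statistics.variance(d)
    half = t_quantile * math.sqrt(s_d2 / b)
    return d_bar - half, d_bar + half

# Hypothetical paired costs for two strategies, five replications each:
lo, hi = paired_t_ci([5.2, 6.0, 7.1, 4.9, 5.8], [4.8, 5.5, 6.9, 4.6, 5.1], 2.78)
```

Since the interval here lies entirely to the right of zero, strategy 2 would be declared the cheaper (better) one.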
19-4.2 ConunOll Random Numbers The idea behind the above trick is to use cOmmon rantlom. numbers, that is, to use the same pseudorandom numbers in exactly the same ways for corresponding runs of each of the competing systems, For example. we might Use precisely the same customer arrival times when simulating different proposed configurations of a job shop. By subjecting the alternative systems to identical experimental conditions, we hope to make it easy to distinguish which systems are best eVen though the respective estimators are subject to sampling error, Consider the case in which we compare two queueing systems, A and B, on the basis of their expected customer transit times, fJA and 88' where the smaller 8-value corresponds to the better system. Suppose we have estimators ~A and ~B for eA and es- respectively. We will declare.4 as the be"er system <~" If and are simulated independently. then the variance of their difference,
ileA
e, e.
could be very large, in which case our declaration might lack conviction. If we could reduce $V(\hat{\theta}_A - \hat{\theta}_B)$, then we could be much more confident about our declaration. CRN sometimes induces a high positive correlation between the point estimators $\hat{\theta}_A$ and $\hat{\theta}_B$. Then we have

$$V(\hat{\theta}_A - \hat{\theta}_B) = V(\hat{\theta}_A) + V(\hat{\theta}_B) - 2\,\mathrm{Cov}(\hat{\theta}_A, \hat{\theta}_B) < V(\hat{\theta}_A) + V(\hat{\theta}_B),$$
and we obtain a savings in variance.
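A small numerical experiment suggests the effect. In the sketch below, the performance measure `avg_service_time` is a made-up stand-in for a real simulation, and the rates 1.0 and 1.2 are arbitrary choices; the point is only that driving both systems with the same uniforms shrinks the variance of the estimated difference.

```python
import math
import random
import statistics

def avg_service_time(rate, uniforms):
    # Hypothetical performance measure: the average of exponential(rate)
    # variates obtained from the given uniforms via the inverse transform.
    return sum(-math.log(u) / rate for u in uniforms) / len(uniforms)

random.seed(42)
diffs_indep, diffs_crn = [], []
for _ in range(2000):
    u1 = [random.random() for _ in range(10)]
    u2 = [random.random() for _ in range(10)]
    # Independent experiments: each system gets its own random numbers.
    diffs_indep.append(avg_service_time(1.0, u1) - avg_service_time(1.2, u2))
    # CRN: both systems are driven by exactly the same uniforms u1.
    diffs_crn.append(avg_service_time(1.0, u1) - avg_service_time(1.2, u1))

var_indep = statistics.variance(diffs_indep)
var_crn = statistics.variance(diffs_crn)
```

Under this toy model the CRN differences exhibit a far smaller variance than the independently simulated ones, which is exactly the Cov term at work.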
19-4.3 Antithetic Random Numbers

Alternatively, if we can induce negative correlation between two unbiased estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$, for some parameter θ, then the unbiased estimator $(\hat{\theta}_1 + \hat{\theta}_2)/2$ might have low variance.

Most simulation texts give advice on how to run the simulations of the competing systems so as to induce positive or negative correlation between them. The consensus is that if conducted properly, common and antithetic random numbers can lead to tremendous variance reductions.
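For instance, when estimating the exponential(1) mean with a fixed budget of 2n variates, the pairing X_i = -ln(U_i), Y_i = -ln(1 - U_i) (the same construction used in Exercise 19-25) gives a visibly smaller variance than 2n independent draws. The replication counts below are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)
n = 1000
plain, antithetic = [], []
for _ in range(500):
    # Plain estimator: the mean of 2n independent exponential(1) variates.
    xs = [-math.log(random.random()) for _ in range(2 * n)]
    plain.append(statistics.mean(xs))
    # Antithetic estimator: n uniforms, each used twice.  U and 1 - U are
    # negatively correlated, hence so are -ln(U) and -ln(1 - U).
    us = [random.random() for _ in range(n)]
    antithetic.append(statistics.mean(
        [(-math.log(u) - math.log(1 - u)) / 2 for u in us]))
```

Both estimators are unbiased for the true mean 1 and consume the same budget of uniforms per pair of draws, yet the antithetic version shows a clearly smaller spread across replications.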
19-4.4 Selecting the Best System

Ranking, selection, and multiple comparisons methods form another class of statistical techniques used to compare alternative systems. Here, the experimenter is interested in selecting the best of a number of competing processes. Typically, one specifies the desired probability of correctly selecting the best process, especially if the best process is significantly better than its competitors. These methods are simple to use, fairly general, and intuitively appealing. See Bechhofer, Santner, and Goldsman (1995) for a synopsis of the most popular procedures.
19-5 SUMMARY

This chapter began with some simple motivational examples illustrating various simulation concepts. After this, the discussion turned to the generation of pseudorandom numbers, that is, numbers that appear to be IID uniform (0,1). PRNs are important because they drive the generation of a number of other important random variables, for example, normal, exponential, and Erlang. We also spent a great deal of discussion on simulation output analysis; simulation output is almost never IID, so special care must be taken if we are to make
statistically valid conclusions about the simulation's results. We concentrated on output analysis for both terminating and steady-state simulations.
19-6 EXERCISES

19-1. Extension of Example 19-1.
(a) Flip a coin 100 times. How many heads do you observe?
(b) How many times do you observe two heads in a row? Three in a row? Four? Five?
(c) Find 10 friends and repeat (a) and (b) based on a total of 1000 flips.
(d) Now simulate coin flips via a spreadsheet program. Flip the simulated coin 10,000 times and answer (a) and (b).

19-2. Extension of Example 19-2. Throw n darts randomly at a unit square containing an inscribed circle. Use the results of your tosses to estimate π. Let n = 2^k for k = 1, 2, ..., 15, and graph your estimates as a function of k.

19-3. Extension of Example 19-3. Show that the estimator defined in equation 19-3 is unbiased for the integral I, defined in equation 19-2.

19-4. Other extensions of Example 19-3.
(a) Use Monte Carlo integration with n = 10 observations to estimate $\int_0^2 \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$. Now use n = 1000. Compare to the answer that you can obtain via normal tables.
(b) What would you do if you had to estimate $\int_0^{10} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$?
(c) Use Monte Carlo integration with n = 10 observations to estimate $\int_0^1 \cos(2\pi x)\,dx$. Now use n = 1000. Compare to the actual answer.

19-5. Extension of Example 19-4. Suppose that 10 customers arrive at a post office at the following times:

3 4 6 7 13 14 20 25 28 30

Upon arrival, customers queue up in front of a single clerk and are processed in a first-come-first-served manner. The service times corresponding to the arriving customers are as follows:

6.0 5.5 4.0 1.0 2.5 2.0 2.0 2.5 4.0 2.5

Assume that the post office opens at time 0 and closes its doors at time 30 (just after customer 10 arrives), serving any remaining customers.
(a) When does the last customer finally leave the system?
(b) What is the average waiting time for the 10 customers?
(c) What is the maximum number of customers in the system? When is this maximum achieved?
(d) What is the average number of customers in line during the first 30 minutes?
(e) Now repeat parts (a)-(d) assuming that the services are performed last-in-first-out.

19-6. Repeat Example 19-5, which deals with an (s, S) inventory policy, except now use order level s = 6.

19-7. Consider the pseudorandom number generator X_i = (5X_{i-1} + 1) mod 16, with seed X_0 = 0.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) Is this a full-period generator?
(c) What is X_150?

19-8. Consider the "recommended" pseudorandom number generator X_i = 16807 X_{i-1} mod (2^31 - 1), with seed X_0 = 1234567.
(a) Calculate X_1 and X_2, along with the corresponding PRNs U_1 and U_2.
(b) What is X_100,000?

19-9. Show how to use the inverse transform method to generate an exponential random variable with rate λ = 2. Demonstrate your technique using the PRN U = 0.75.

19-10. Consider the inverse transform method to generate a standard normal (0, 1) random variable.
(a) Demonstrate your technique using the PRN U = 0.25.
(b) Using your answer in (a), generate a N(3, 9) random variable.

19-11. Suppose that X has probability density function f(x) = |x|/4, -2 ≤ x ≤ 2.
(a) Develop an inverse transform technique to generate a realization of X.
(b) Demonstrate your technique using U = 0.6.
(c) Sketch out f(x) and see if you can come up with another method to generate X.

19-12. Suppose that the discrete random variable X has probability function

p(x) = 0.35 if x = -2.5; 0.25 if x = 1.0; 0.40 if x = 10.5; 0 otherwise.

As in Example 19-12, set up a table to generate realizations from this distribution. Illustrate your technique with the PRN U = 0.86.

19-13. The Weibull (α, β) distribution, popular in reliability theory and other applied statistics disciplines, has CDF

$$F(x) = 1 - e^{-(x/\beta)^{\alpha}} \quad \text{if } x > 0, \qquad 0 \text{ otherwise.}$$

(a) Show how to use the inverse transform method to generate a realization from the Weibull distribution.
(b) Demonstrate your technique for a Weibull (1.5, 2.0) random variable using the PRN U = 0.66.

19-14. Suppose that U_1 = 0.45 and U_2 = 0.12 are two IID PRNs. Use the Box-Müller method to generate two N(0, 1) variates.

19-15. Consider the following PRNs:

0.88 0.87 0.33 0.69 0.20 0.79 0.21 0.96 0.11 0.42 0.91 0.70

Use the Central Limit Theorem method to generate a realization that is approximately standard normal.

19-16. Prove equation 19-4 from the text. This shows that the sum of n IID exponential random variables is Erlang. Hint: Find the moment-generating function of Y, and compare it to that of the gamma distribution.

19-17. Using two PRNs, U_1 = 0.73 and U_2 = 0.11, generate a realization from an Erlang distribution with n = 2 and λ = 3.

19-18. Suppose that U_1, U_2, ..., U_n are PRNs.
(a) Suggest an easy inverse transform method to generate a sequence of IID Bernoulli random variables, each with success parameter p.
(b) Show how to use your answer to (a) to generate a binomial random variate with parameters n and p.

19-19. Use the acceptance-rejection technique to generate a geometric random variable with success probability 0.25. Use as many of the PRNs from Exercise 19-15 as necessary.

19-20. Suppose that $\bar{Z}_1 = 3$, $\bar{Z}_2 = 5$, and $\bar{Z}_3 = 4$ are three batch means resulting from a long simulation run. Find a 90% two-sided confidence interval for the mean.

19-21. Suppose that μ ∈ [-2.5, 3.5] is a 90% confidence interval for the mean cost incurred by a certain inventory policy. Further suppose that this interval was based on five independent replications of the underlying inventory system. Unfortunately, the boss has decided that she wants a 95% confidence interval. Can you supply it?

19-22. The yearly unemployment rates for Andorra during the past 15 years are as follows:

6.9 9.9 8.3 9.2 8.8 11.4 11.8 12.1 10.6 11.0 12.3 13.9 9.2 8.2 8.9

Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean unemployment. Use five batches, each consisting of three years' data.

19-23. Suppose that we are interested in steady-state confidence intervals for the mean of simulation output X_1, X_2, ..., X_10000. (You can pretend that these are waiting times.) We have conveniently divided the run up into five batches, each of size 2000; suppose that the resulting batch means are as follows:

100 80 90 110 120

Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean.

19-24. The yearly total snowfall figures for Siberacuse, NY, during the past 15 years are as follows:

100 103 88 72 98 121 106 110 99 162 123 139 92 142 169

(a) Use the method of batch means on the above data to obtain a two-sided 95% confidence interval for the mean yearly snowfall. Use five batches, each consisting of three years' data.
(b) The corresponding yearly total snowfall figures for Buffoonalo, NY (which is down the road from Siberacuse), are as follows:

90 95 72 68 95 110 112 144 110 123 81 130 145 90 75

How does Buffoonalo's snowfall compare to Siberacuse's? Just give an eyeball answer.
(c) Now find a 95% confidence interval for the difference in means between the two cities. Hint: Think common random numbers.

19-25. Antithetic variates. Suppose that X_1, X_2, ..., X_n are IID with mean μ and variance σ². Further suppose that Y_1, Y_2, ..., Y_n are also IID with mean μ and variance σ². The trick here is that we will also assume that Cov(X_i, Y_i) < 0 for all i. So, in other words, the observations within one of the two sequences are IID, but they are negatively correlated between sequences.
(a) Here is an example showing how we can end up with the above scenario using simulations. Let X_i = -ln(U_i) and Y_i = -ln(1 - U_i), where the U_i are the usual IID uniform (0,1) random variables.
 i. What is the distribution of X_i? Of Y_i?
 ii. What is Cov(U_i, 1 - U_i)?
 iii. Would you expect that Cov(X_i, Y_i) < 0? Answer: Yes.
(b) Let $\bar{X}_n$ and $\bar{Y}_n$ denote the sample means of the X_i and Y_i, respectively, each based on n observations. Without actually calculating Cov($\bar{X}_n$, $\bar{Y}_n$), state how V(($\bar{X}_n$ + $\bar{Y}_n$)/2) compares to V($\bar{X}_{2n}$). In other words, should we do two negatively correlated runs, each consisting of n observations, or just one run consisting of 2n observations?
(c) What if you tried to use this trick when using Monte Carlo simulation to estimate $\int_0^1 \sin(\pi x)\,dx$?

19-26. Another variance reduction technique. Suppose that our goal is to estimate the mean μ of some steady-state simulation output process, X_1, X_2, ..., X_n. Suppose we somehow know the expected value of some other RV Y, and we also know that Cov($\bar{X}$, Y) > 0, where $\bar{X}$ is the sample mean. Consider the estimator

$$C = \bar{X} - k\left(Y - E(Y)\right),$$

where k is some constant.
(a) Show that C is unbiased for μ.
(b) Find an expression for V(C). Comments?
(c) Minimize V(C) with respect to k.

19-27. A miscellaneous computer exercise. Make a histogram of X_i = -ln(U_i), for i = 1, 2, ..., 20,000, where the U_i are IID uniform (0,1). What kind of distribution does it look like?

19-28. Another miscellaneous computer exercise. Let us see if the Central Limit Theorem works. In Exercise 19-27, you generated 20,000 exponential(1) observations. Now form 1000 averages of 20 observations each from the original 20,000. More precisely, let

$$Y_i = \frac{1}{20}\sum_{j=1}^{20} X_{20(i-1)+j}, \qquad i = 1, 2, \ldots, 1000.$$

Make a histogram of the Y_i. Do they look approximately normal?

19-29. Yet another miscellaneous computer exercise. Let us generate some normal observations via the Box-Müller method. To do so, first generate 1000 pairs of IID uniform (0,1) random numbers, (U_{1,1}, U_{2,1}), (U_{1,2}, U_{2,2}), ..., (U_{1,1000}, U_{2,1000}). Set

$$X_i = \sqrt{-2\ln(U_{1,i})}\cos(2\pi U_{2,i}) \qquad \text{and} \qquad Y_i = \sqrt{-2\ln(U_{1,i})}\sin(2\pi U_{2,i}),$$

for i = 1, 2, ..., 1000. Make a histogram of the resulting X_i. [The X_i's are N(0,1).] Now graph X_i vs. Y_i. Any comments?
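The batch-means recipe used in Exercises 19-22 through 19-24 can be sketched in a few lines. The data below are illustrative, not the exercise answers; the critical value t_{0.025,4} = 2.776 for a 95% CI with five batches comes from a t table such as Table IV.

```python
import statistics

def batch_means_ci(data, n_batches, t_crit):
    """Two-sided CI for the mean via the method of batch means.

    t_crit is the upper alpha/2 t critical point with n_batches - 1
    degrees of freedom, looked up from a t table.
    """
    size = len(data) // n_batches
    means = [statistics.mean(data[i * size:(i + 1) * size])
             for i in range(n_batches)]          # one mean per batch
    grand = statistics.mean(means)               # grand mean of batch means
    half = t_crit * (statistics.variance(means) / n_batches) ** 0.5
    return grand - half, grand + half

# Illustrative 15-point series, five batches of three observations each:
lo, hi = batch_means_ci([3, 4, 2, 5, 6, 4, 5, 7, 6, 5, 6, 7, 8, 6, 7],
                        n_batches=5, t_crit=2.776)
```

Batching trades the (typically correlated) raw observations for a small number of approximately independent batch means, to which the ordinary t interval can then be applied.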
Appendix

Table I Cumulative Poisson Distribution
Table II Cumulative Standard Normal Distribution
Table III Percentage Points of the χ² Distribution
Table IV Percentage Points of the t Distribution
Table V Percentage Points of the F Distribution
Chart VI Operating Characteristic Curves
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance
Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance
Table IX Critical Values for the Wilcoxon Two-Sample Test
Table X Critical Values for the Sign Test
Table XI Critical Values for the Wilcoxon Signed Rank Test
Table XII Percentage Points of the Studentized Range Statistic
Table XIII Factors for Quality-Control Charts
Table XIV k Values for One-Sided and Two-Sided Tolerance Intervals
[Chart VI: (a) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.05. (b) OC curves for different values of n for the two-sided normal test for a level of significance α = 0.01.]

Source: Charts VIa, e, f, k, m, and q are reproduced with permission from "Operating Characteristics for the Common Statistical Tests of Significance," by C. L. Ferris, F. E. Grubbs, and C. L. Weaver, Annals of Mathematical Statistics, June 1946. Charts VIb, c, d, g, h, i, j, l, n, o, p, and r are reproduced with permission from Engineering Statistics, 2nd edition, by A. H. Bowker and G. J. Lieberman, Prentice-Hall, Englewood Cliffs, NJ, 1972.
Chart VI Operating Characteristic Curves (continued)

[(q) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.05. (r) OC curves for different values of n for the one-sided F-test for a level of significance α = 0.01.]
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance

Source: Chart VII is adapted with permission from Biometrika Tables for Statisticians, Vol. 2, by E. S. Pearson and H. O. Hartley, Cambridge University Press, Cambridge, 1972. (continues)
Chart VII Operating Characteristic Curves for the Fixed-Effects Model Analysis of Variance (continued)

Chart VIII Operating Characteristic Curves for the Random-Effects Model Analysis of Variance (continued)
Table IX Critical Values for the Wilcoxon Two-Sample Test (continued)

[Table of critical values, indexed by sample sizes n_1 = 2, 3, ..., 15.]
Table X Critical Values for the Sign Test

[Table of critical values for n = 5, 6, ..., 40 at significance levels α = 0.10, 0.05, and 0.01.]

For n > 40, R is approximately normally distributed, with mean n/2 and variance n/4.
Table XI Critical Values for the Wilcoxon Signed Rank Test

[Table of critical values for n = 4, 5, ..., 50 at significance levels α = 0.10, 0.05, 0.02, and 0.01.]

Source: Adapted with permission from "Extended Tables of the Wilcoxon Matched Pair Signed Rank Statistic" by Robert L. McCormack, Journal of the American Statistical Association, Vol. 60, September 1965.

If n > 50, R is approximately normally distributed, with mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24.
Table XII Percentage Points of the Studentized Range Statistic

[Tables of q_{0.01}(p, f) and q_{0.05}(p, f), where f = degrees of freedom.]

From J. M. May, "Extended and Corrected Tables of the Upper Percentage Points of the Studentized Range," Biometrika, Vol. 39, pp. 192-193, 1952. Reproduced by permission of the trustees of Biometrika.
References

Agresti, A., and B. Coull (1998), "Approximate Is Better than 'Exact' for Interval Estimation of Binomial Proportions," The American Statistician, 52(2).
Anderson, V. L., and R. A. McLean (1974), Design of Experiments: A Realistic Approach, Marcel Dekker, New York.
Banks, J., J. S. Carson, B. L. Nelson, and D. M. Nicol (2001), Discrete-Event System Simulation, 3rd edition, Prentice-Hall, Upper Saddle River, NJ.
Bartlett, M. S. (1947), "The Use of Transformations," Biometrics, Vol. 3, pp. 39-52.
Bechhofer, R. E., T. J. Santner, and D. Goldsman (1995), Design and Analysis of Experiments for Statistical Selection, Screening and Multiple Comparisons, John Wiley & Sons, New York.
Belsley, D. A., E. Kuh, and R. E. Welsch (1980), Regression Diagnostics, John Wiley & Sons, New York.
Berrettoni, J. M. (1964), "Practical Applications of the Weibull Distribution," Industrial Quality Control, Vol. 21, No. 2, pp. 71-79.
Box, G. E. P., and D. R. Cox (1964), "An Analysis of Transformations," Journal of the Royal Statistical Society, B, Vol. 26, pp. 211-252.
Box, G. E. P., and M. E. Muller (1958), "A Note on the Generation of Normal Random Deviates," Annals of Mathematical Statistics, Vol. 29, pp. 610-611.
Bratley, P., B. L. Fox, and L. E. Schrage (1987), A Guide to Simulation, 2nd edition, Springer-Verlag, New York.
Cheng, R. C. (1977), "The Generation of Gamma Variables with Nonintegral Shape Parameters," Applied Statistics, Vol. 26, No. 1, pp. 71-75.
Cochran, W. G. (1947), "Some Consequences When the Assumptions for the Analysis of Variance Are Not Satisfied," Biometrics, Vol. 3, pp. 22-38.
Cochran, W. G. (1977), Sampling Techniques, 3rd edition, John Wiley & Sons, New York.
Cochran, W. G., and G. M. Cox (1957), Experimental Designs, John Wiley & Sons, New York.
Cook, R. D. (1979), "Influential Observations in Linear Regression," Journal of the American Statistical Association, Vol. 74, pp. 169-174.
Cook, R. D. (1977), "Detection of Influential Observations in Linear Regression," Technometrics, Vol. 19, pp. 15-18.
Crowder, S. (1987), "A Simple Method for Studying Run-Length Distributions of Exponentially Weighted Moving Average Charts," Technometrics, Vol. 29, pp. 401-407.
Daniel, C., and F. S. Wood (1980), Fitting Equations to Data, 2nd edition, John Wiley & Sons, New York.
Davenport, W. B., and W. L. Root (1958), An Introduction to the Theory of Random Signals and Noise, McGraw-Hill, New York.
Draper, N. R., and W. G. Hunter (1969), "Transformations: Some Examples Revisited," Technometrics, Vol. 11, pp. 23-40.
Draper, N. R., and H. Smith (1998), Applied Regression Analysis, 3rd edition, John Wiley & Sons, New York.
Duncan, A. J. (1986), Quality Control and Industrial Statistics, 5th edition, Richard D. Irwin, Homewood, IL.
Duncan, D. B. (1955), "Multiple Range and Multiple F Tests," Biometrics, Vol. 11, pp. 1-42.
Efron, B., and R. Tibshirani (1993), An Introduction to the Bootstrap, Chapman and Hall, New York.
Elsayed, E. (1996), Reliability Engineering, Addison Wesley Longman, Reading, MA.
Epstein, B. (1960), "Estimation from Life Test Data," IRE Transactions on Reliability, Vol. RQC-9.
Feller, W. (1968), An Introduction to Probability Theory and Its Applications, 3rd edition, John Wiley & Sons, New York.
Fishman, G. S. (1978), Principles of Discrete Event Simulation, John Wiley & Sons, New York.
Furnival, G. M., and R. W. Wilson, Jr. (1974), "Regression by Leaps and Bounds," Technometrics, Vol. 16, pp. 499-512.
Hahn, G., and S. Shapiro (1967), Statistical Models in Engineering, John Wiley & Sons, New York.
Hald, A. (1952), Statistical Theory with Engineering Applications, John Wiley & Sons, New York.
Hawkins, S. (1993), "Cumulative Sum Control Charting: An Underutilized SPC Tool," Quality Engineering, Vol. 5, pp. 463-477.
Hocking, R. R. (1976), "The Analysis and Selection of Variables in Linear Regression," Biometrics, Vol. 32, pp. 1-49.
Hocking, R. R., F. M. Speed, and M. J. Lynn (1976), "A Class of Biased Estimators in Linear Regression," Technometrics, Vol. 18, pp. 425-437.
Hoerl, A. E., and R. W. Kennard (1970a), "Ridge Regression: Biased Estimation for Non-Orthogonal Problems," Technometrics, Vol. 12, pp. 55-67.
Hoerl, A. E., and R. W. Kennard (1970b), "Ridge Regression: Application to Non-Orthogonal Problems," Technometrics, Vol. 12, pp. 69-82.
Kelton, W. D., and A. M. Law (1983), "A New Approach for Dealing with the Startup Problem in Discrete Event Simulation," Naval Research Logistics Quarterly, Vol. 30, pp. 641-658.
Kendall, M. G., and A. Stuart (1963), The Advanced Theory of Statistics, Hafner Publishing Company, New York.
Keuls, M. (1952), "The Use of the Studentized Range in Connection with an Analysis of Variance," Euphytica, Vol. 1, p. 112.
Law, A. M., and W. D. Kelton (2000), Simulation Modeling and Analysis, 3rd edition, McGraw-Hill, New York.
Lloyd, D. K., and M. Lipow (1972), Reliability: Management, Methods, and Mathematics, Prentice-Hall, Englewood Cliffs, NJ.
Lucas, J., and M. Saccucci (1990), "Exponentially Weighted Moving Average Control Schemes: Properties and Enhancements," Technometrics, Vol. 32, pp. 1-12.
Marquardt, D. W., and R. D. Snee (1975), "Ridge Regression in Practice," The American Statistician, Vol. 29, pp. 3-20.
Montgomery, D. C. (2001), Design and Analysis of Experiments, 5th edition, John Wiley & Sons, New York.
Montgomery, D. C. (2001), Introduction to Statistical Quality Control, 4th edition, John Wiley & Sons, New York.
Montgomery, D. C., E. A. Peck, and G. G. Vining (2001), Introduction to Linear Regression Analysis, 3rd edition, John Wiley & Sons, New York.
Montgomery, D. C., and G. C. Runger (2003), Applied Statistics and Probability for Engineers, 3rd edition, John Wiley & Sons, New York.
Mood, A. M., F. A. Graybill, and D. C. Boes (1974), Introduction to the Theory of Statistics, 3rd edition, McGraw-Hill, New York.
Neter, J., M. Kutner, C. Nachtsheim, and W. Wasserman (1996), Applied Linear Statistical Models, 4th edition, Irwin Press, Homewood, IL.
Newman, D. (1939), "The Distribution of the Range in Samples from a Normal Population Expressed in Terms of an Independent Estimate of Standard Deviation," Biometrika, Vol. 31, p. 20.
Odeh, R., and D. Owens (1980), Tables for Normal Tolerance Limits, Sampling Plans, and Screening, Marcel Dekker, New York.
Owen, D. B. (1962), Handbook of Statistical Tables, Addison-Wesley Publishing Company, Reading, MA.
Page, E. S. (1954), "Continuous Inspection Schemes," Biometrika, Vol. 41, pp. 100-115.
Roberts, S. (1959), "Control Chart Tests Based on Geometric Moving Averages," Technometrics, Vol. 1, pp. 239-250.
Scheffé, H. (1953), "A Method for Judging All Contrasts in the Analysis of Variance," Biometrika, Vol. 40, pp. 87-104.
Snee, R. D. (1977), "Validation of Regression Models: Methods and Examples," Technometrics, Vol. 19, No. 4, pp. 415-428.
Tucker, H. G. (1962), An Introduction to Probability and Mathematical Statistics, Academic Press, New York.
Tukey, J. W. (1953), "The Problem of Multiple Comparisons," unpublished notes, Princeton University.
Tukey, J. W. (1977), Exploratory Data Analysis, Addison-Wesley, Reading, MA.
United States Department of Defense (1957), Military Standard Sampling Procedures and Tables for Inspection by Variables for Percent Defective (MIL-STD-414), Government Printing Office, Washington, DC.
Welch, P. D. (1983), "The Statistical Analysis of Simulation Results," in The Computer Performance Modeling Handbook (ed. S. Lavenberg), Academic Press, Orlando, FL.
6-9. Generate realizations u_i ~ uniform [0, 1] as random numbers, as described in Section 6-6; use these in the inverse as Y_i = F_Y^{-1}(u_i), i = 1, 2, ....

10-21. The posterior density for p is a beta distribution with parameters a + n and b + Σx_i - n.
10-29. The posterior density for λ is gamma with parameters r = m + Σx_i + 1 and δ = n + (m + 1)/λ₀.
10-31. 0.967.
10-33. 0.3783.
10-37. a₁ = a₂ = α/2 is shorter.
10-39. (a) 74.03533 ≤ μ ≤ 74.03666. (b) 74.0356 ≤ μ.
10-41. (a) 3232.11 ≤ μ ≤ 3267.89. (b) 1004.80 ≤ μ ≤ 1043.